Age | Commit message (Collapse) | Author |
|
After the simplification of the fast fsync patch done recently by commit
b5e6c3e170b7 ("btrfs: always wait on ordered extents at fsync time") and
commit e7175a692765 ("btrfs: remove the wait ordered logic in the
log_one_extent path"), we got a very short time window where we can get
extents logged without writeback completing first or extents logged
without logging the respective data checksums. Both issues can only happen
when doing a non-full (fast) fsync.
As soon as we enter btrfs_sync_file() we trigger writeback, then lock the
inode and then wait for the writeback to complete before starting to log
the inode. However before we acquire the inode's lock and after we started
writeback, it's possible that more writes happened and dirtied more pages.
If that happened and those pages get writeback triggered while we are
logging the inode (for example, the VM subsystem triggering it due to
memory pressure, or another concurrent fsync), we end up seeing the
respective extent maps in the inode's list of modified extents and will
log matching file extent items without waiting for the respective
ordered extents to complete, meaning that either of the following will
happen:
1) We log an extent after its writeback finishes but before its checksums
are added to the csum tree, leading to -EIO errors when attempting to
read the extent after a log replay.
2) We log an extent before its writeback finishes.
Therefore after the log replay we will have a file extent item pointing
to an unwritten extent (and without the respective data checksums as
well).
This could not happen before the fast fsync patch simplification, because
for any extent we found in the list of modified extents, we would wait for
its respective ordered extent to finish writeback or collect its checksums
for logging if it did not complete yet.
Fix this by triggering writeback again after acquiring the inode's lock
and before waiting for ordered extents to complete.
Fixes: e7175a692765 ("btrfs: remove the wait ordered logic in the log_one_extent path")
Fixes: b5e6c3e170b7 ("btrfs: always wait on ordered extents at fsync time")
CC: stable@vger.kernel.org # 4.19+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Direct returns from within a loop are rude, but it doesn't mean it gets
to avoid releasing the memory acquired beforehand.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20181109023937.96105-3-farman@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
If pfn_array_alloc fails somehow, we need to release the pfn_array_table
that was malloc'd earlier.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20181109023937.96105-2-farman@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
Let's register the mediated device when all the data structures
which could be used are initialized.
Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <1540487720-11634-3-git-send-email-pmorel@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
Fix the following sparse warning:
drivers/s390/cio/vfio_ccw_drv.c:25:19: warning: symbol 'vfio_ccw_io_region'
was not declared. Should it be static?
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Message-Id: <alpine.LFD.2.21.1810151328570.1636@schleppi.aag-de.ibmmobiledemo.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
The video mode for DMT is only populated to support encp.
Signed-off-by: Jorge Ramirez-Ortiz <jorge.ramirez.ortiz@gmail.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1542048069-22603-1-git-send-email-jramirez@baylibre.com
|
|
nft_compat ops do not have static storage duration, unlike all other
expressions.
When nf_tables_expr_destroy() returns, expr->ops might have been
free'd already, so we need to store next address before calling
expression destructor.
For same reason, we can't deref match pointer after nft_xt_put().
This can be easily reproduced by adding msleep() before
nft_match_destroy() returns.
Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables")
Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
xt_rateest_net_exit() was added to check whether rules are flushed
successfully. but ->net_exit() callback is called earlier than
->destroy() callback.
So that ->net_exit() callback can't check that.
test commands:
%ip netns add vm1
%ip netns exec vm1 iptables -t mangle -I PREROUTING -p udp \
--dport 1111 -j RATEEST --rateest-name ap \
--rateest-interval 250ms --rateest-ewma 0.5s
%ip netns del vm1
splat looks like:
[ 668.813518] WARNING: CPU: 0 PID: 87 at net/netfilter/xt_RATEEST.c:210 xt_rateest_net_exit+0x210/0x340 [xt_RATEEST]
[ 668.813518] Modules linked in: xt_RATEEST xt_tcpudp iptable_mangle bpfilter ip_tables x_tables
[ 668.813518] CPU: 0 PID: 87 Comm: kworker/u4:2 Not tainted 4.19.0-rc7+ #21
[ 668.813518] Workqueue: netns cleanup_net
[ 668.813518] RIP: 0010:xt_rateest_net_exit+0x210/0x340 [xt_RATEEST]
[ 668.813518] Code: 00 48 8b 85 30 ff ff ff 4c 8b 23 80 38 00 0f 85 24 01 00 00 48 8b 85 30 ff ff ff 4d 85 e4 4c 89 a5 58 ff ff ff c6 00 f8 74 b2 <0f> 0b 48 83 c3 08 4c 39 f3 75 b0 48 b8 00 00 00 00 00 fc ff df 49
[ 668.813518] RSP: 0018:ffff8801156c73f8 EFLAGS: 00010282
[ 668.813518] RAX: ffffed0022ad8e85 RBX: ffff880118928e98 RCX: 5db8012a00000000
[ 668.813518] RDX: ffff8801156c7428 RSI: 00000000cb1d185f RDI: ffff880115663b74
[ 668.813518] RBP: ffff8801156c74d0 R08: ffff8801156633c0 R09: 1ffff100236440be
[ 668.813518] R10: 0000000000000001 R11: ffffed002367d852 R12: ffff880115142b08
[ 668.813518] R13: 1ffff10022ad8e81 R14: ffff880118928ea8 R15: dffffc0000000000
[ 668.813518] FS: 0000000000000000(0000) GS:ffff88011b200000(0000) knlGS:0000000000000000
[ 668.813518] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 668.813518] CR2: 0000563aa69f4f28 CR3: 0000000105a16000 CR4: 00000000001006f0
[ 668.813518] Call Trace:
[ 668.813518] ? unregister_netdevice_many+0xe0/0xe0
[ 668.813518] ? xt_rateest_net_init+0x2c0/0x2c0 [xt_RATEEST]
[ 668.813518] ? default_device_exit+0x1ca/0x270
[ 668.813518] ? remove_proc_entry+0x1cd/0x390
[ 668.813518] ? dev_change_net_namespace+0xd00/0xd00
[ 668.813518] ? __init_waitqueue_head+0x130/0x130
[ 668.813518] ops_exit_list.isra.10+0x94/0x140
[ 668.813518] cleanup_net+0x45b/0x900
[ 668.813518] ? net_drop_ns+0x110/0x110
[ 668.813518] ? swapgs_restore_regs_and_return_to_usermode+0x3c/0x80
[ 668.813518] ? save_trace+0x300/0x300
[ 668.813518] ? lock_acquire+0x196/0x470
[ 668.813518] ? lock_acquire+0x196/0x470
[ 668.813518] ? process_one_work+0xb60/0x1de0
[ 668.813518] ? _raw_spin_unlock_irq+0x29/0x40
[ 668.813518] ? _raw_spin_unlock_irq+0x29/0x40
[ 668.813518] ? __lock_acquire+0x4500/0x4500
[ 668.813518] ? __lock_is_held+0xb4/0x140
[ 668.813518] process_one_work+0xc13/0x1de0
[ 668.813518] ? pwq_dec_nr_in_flight+0x3c0/0x3c0
[ 668.813518] ? set_load_weight+0x270/0x270
[ ... ]
Fixes: 3427b2ab63fa ("netfilter: make xt_rateest hash table per net")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
_get_optimal_vdd_voltage call provides new_supply_vbb->u_volt
as the reference voltage while it should be really new_supply_vdd->u_volt.
Cc: 4.16+ <stable@vger.kernel.org> # v4.16+
Fixes: 9a835fa6e47 ("PM / OPP: Add ti-opp-supply driver")
Signed-off-by: Keerthy <j-keerthy@ti.com>
Acked-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
The voltage range (min, max) provided in the device tree is from
the data manual and is pretty big, catering to a wide range of devices.
On a i2c read/write failure the regulator_set_voltage_triplet function
falls back to set voltage between min and max. The min value from Device
Tree can be lesser than the optimal value and in that case that can lead
to a hang or crash. Hence set the u_volt_min dynamically to the optimal
voltage value.
Cc: 4.16+ <stable@vger.kernel.org> # v4.16+
Fixes: 9a835fa6e47 ("PM / OPP: Add ti-opp-supply driver")
Signed-off-by: Keerthy <j-keerthy@ti.com>
Acked-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
Fixes:
arch/riscv/kernel/module.c: In function 'apply_r_riscv_32_rela':
./include/linux/kern_levels.h:5:18: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 3 has type 'Elf32_Addr' {aka 'unsigned int'} [-Wformat=]
arch/riscv/kernel/module.c:23:27: note: format string is defined here
arch/riscv/kernel/module.c: In function 'apply_r_riscv_pcrel_hi20_rela':
./include/linux/kern_levels.h:5:18: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 3 has type 'Elf32_Addr' {aka 'unsigned int'} [-Wformat=]
arch/riscv/kernel/module.c:104:23: note: format string is defined here
arch/riscv/kernel/module.c: In function 'apply_r_riscv_hi20_rela':
./include/linux/kern_levels.h:5:18: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 3 has type 'Elf32_Addr' {aka 'unsigned int'} [-Wformat=]
arch/riscv/kernel/module.c:146:23: note: format string is defined here
arch/riscv/kernel/module.c: In function 'apply_r_riscv_got_hi20_rela':
./include/linux/kern_levels.h:5:18: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 3 has type 'Elf32_Addr' {aka 'unsigned int'} [-Wformat=]
arch/riscv/kernel/module.c:190:60: note: format string is defined here
arch/riscv/kernel/module.c: In function 'apply_r_riscv_call_plt_rela':
./include/linux/kern_levels.h:5:18: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 3 has type 'Elf32_Addr' {aka 'unsigned int'} [-Wformat=]
arch/riscv/kernel/module.c:214:24: note: format string is defined here
arch/riscv/kernel/module.c: In function 'apply_r_riscv_call_rela':
./include/linux/kern_levels.h:5:18: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 3 has type 'Elf32_Addr' {aka 'unsigned int'} [-Wformat=]
arch/riscv/kernel/module.c:236:23: note: format string is defined here
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
|
|
Fixes the following build error from tinyconfig:
riscv64-unknown-linux-gnu-ld: kernel/sched/fair.o: in function `.L8':
fair.c:(.text+0x70): undefined reference to `__lshrti3'
riscv64-unknown-linux-gnu-ld: kernel/time/clocksource.o: in function `.L0 ':
clocksource.c:(.text+0x334): undefined reference to `__lshrti3'
Fixes: 7f47c73b355f ("RISC-V: Build tishift only on 64-bit")
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
|
|
Building kernel 4.20 for Fedora as RPM fails, because riscv is missing
vdso_install target in arch/riscv/Makefile.
Signed-off-by: David Abdurachmanov <david.abdurachmanov@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
|
|
Replace 8 spaces with tab to match styling.
Signed-off-by: David Abdurachmanov <david.abdurachmanov@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
|
|
The printk timestamps are very useful information to visually see
where kernel is spending time during boot. It also helps us see
the timing of hotplug events at runtime.
This patch enables printk timestamps in RISC-V defconfig so that
we have it enabled by default (similar to other architectures
such as x86_64, arm64, etc).
Signed-off-by: Anup Patel <anup@brainfault.org>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
|
|
When a metadata read is served the endio routine btree_readpage_end_io_hook
is called which eventually runs the tree-checker. If tree-checker fails
to validate the read eb then it sets EXTENT_BUFFER_CORRUPT flag. This
leads to btree_read_extent_buffer_pages wrongly assuming that all
available copies of this extent buffer are wrong and failing prematurely.
Fix this modify btree_read_extent_buffer_pages to read all copies of
the data.
This failure was exhibitted in xfstests btrfs/124 which would
spuriously fail its balance operations. The reason was that when balance
was run following re-introduction of the missing raid1 disk
__btrfs_map_block would map the read request to stripe 0, which
corresponded to devid 2 (the disk which is being removed in the test):
item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 3553624064) itemoff 15975 itemsize 112
length 1073741824 owner 2 stripe_len 65536 type DATA|RAID1
io_align 65536 io_width 65536 sector_size 4096
num_stripes 2 sub_stripes 1
stripe 0 devid 2 offset 2156920832
dev_uuid 8466c350-ed0c-4c3b-b17d-6379b445d5c8
stripe 1 devid 1 offset 3553624064
dev_uuid 1265d8db-5596-477e-af03-df08eb38d2ca
This caused read requests for a checksum item that to be routed to the
stale disk which triggered the aforementioned logic involving
EXTENT_BUFFER_CORRUPT flag. This then triggered cascading failures of
the balance operation.
Fixes: a826d6dcb32d ("Btrfs: check items for correctness as we search")
CC: stable@vger.kernel.org # 4.4+
Suggested-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Commit 07d02a67b7fa causes a use-after free in the RPCSEC_GSS credential
destroy code, because the call to get_rpccred() in gss_destroying_context()
will now always fail to increment the refcount.
While we could just replace the get_rpccred() with a refcount_set(), that
would have the unfortunate consequence of resurrecting a credential in
the credential cache for which we are in the process of destroying the
RPCSEC_GSS context. Rather than do this, we choose to make a copy that
is never added to the cache and use that to destroy the context.
Fixes: 07d02a67b7fa ("SUNRPC: Simplify lookup code")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
If we exit the NFSv4 state manager due to a umount, then we can end up
leaving the NFS4CLNT_MANAGER_RUNNING flag set. If another mount causes
the nfs4_client to be rereferenced before it is destroyed, then we end
up never being able to recover state.
Fixes: 47c2199b6eb5 ("NFSv4.1: Ensure state manager thread dies on last ...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.15+
|
|
In XGMI configuration, the FB region covers vram region from peer
device, adjust system aperture to cover all of them
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
med_power_with_dipm still causes freezes after updating the firmware to
the latest version (DXT04L5Q).
Set model_rev to NULL and blacklist the device.
Cc: stable@vger.kernel.org
Signed-off-by: Diego Viola <diego.viola@gmail.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
After switched to NO_BOOTMEM, there are several boot failures. Some of
them have been fixed and some of them haven't. I find that many of them
are because of memory allocations are top-down, while the old behavior
is bottom-up. This patch let early memblock_alloc*() allocate memories
bottom-up to avoid some potential problems.
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: bcec54bf3118 ("mips: switch to NO_BOOTMEM")
Patchwork: https://patchwork.linux-mips.org/patch/21069/
References: https://patchwork.linux-mips.org/patch/21031/
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <james.hogan@mips.com>
Cc: Steven J . Hill <Steven.Hill@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
|
|
Re-enable OCTEON USB driver which is needed on older hardware
(e.g. EdgeRouter Lite) for mass storage etc. This got accidentally
deleted when config options were changed for OCTEON2/3 USB.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: f922bc0ad08b ("MIPS: Octeon: cavium_octeon_defconfig: Enable more drivers")
Patchwork: https://patchwork.linux-mips.org/patch/21077/
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # 4.14+
|
|
We need to copy the io priority, too; otherwise the clone will run
with a different priority than the original one.
Fixes: 43b62ce3ff0a ("block: move bio io prio to a new field")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixed up subject, and ordered stores.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Driver assigns DMAE channel 0 for FW as part of START_RAMROD command. FW
uses this channel for DMAE operations (e.g., TIME_SYNC implementation).
Driver also uses the same channel 0 for DMAE operations for some of the PFs
(e.g., PF0 on Port0). This could lead to concurrent access to the DMAE
channel by FW and driver which is not legal. Hence need to assign unique
DMAE id for FW.
Currently following DMAE channels are used by the clients,
MFW - OCBB/OCSD functionality uses DMAE channel 14/15
Driver 0-3 and 8-11 (for PF dmae operations)
4 and 12 (for stats requests)
Assigning unique dmae_id '13' to the FW.
Changes from previous version:
------------------------------
v2: Incorporated the review comments.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Adam reported a record command crash for simple session like:
$ perf record -e cpu-clock ls
with following backtrace:
Program received signal SIGSEGV, Segmentation fault.
3543 ev = event_update_event__new(size + 1, PERF_EVENT_UPDATE__UNIT, evsel->id[0]);
(gdb) bt
#0 perf_event__synthesize_event_update_unit
#1 0x000000000051e469 in perf_event__synthesize_extra_attr
#2 0x00000000004445cb in record__synthesize
#3 0x0000000000444bc5 in __cmd_record
...
We synthesize an update event that needs to touch the evsel id array,
which is not defined at that time. Fix this by forcing the id allocation
for events with their unit defined.
Reflecting possible read_format ID bit in the attr tests.
Reported-by: Yongxin Liu <yongxin.liu@outlook.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adam Lee <leeadamrobert@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201477
Fixes: bfd8f72c2778 ("perf record: Synthesize unit/scale/... in event update")
Link: http://lkml.kernel.org/r/20181112130012.5424-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When copying to the latency type, we should be passing LATENCY_TYPE_LEN,
not DOMAIN_LEN (this isn't a problem in practice because we only pass
"total" or "I/O"). Fix it by changing all of the strlcpy() calls to use
sizeof().
Fixes: 6c3b7af1c975 ("kyber: add tracepoints")
Reported-by: Jordan Glover <Golden_Miller83@protonmail.ch>
Tested-by: Jordan Glover <Golden_Miller83@protonmail.ch>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Its possible to set both HANDLE and POSITION when replacing a rule.
In this case, the rule at POSITION gets replaced using the
userspace-provided handle. Rule handles are supposed to be generated
by the kernel only.
Duplicate handles should be harmless, however better disable this "feature"
by only checking for the POSITION attribute on insert operations.
Fixes: 5e94846686d0 ("netfilter: nf_tables: add insert operation")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Start flood ping for each cpu while loading/flushing rulesets to make
sure we do not access already-free'd rules from nf_tables evaluation loop.
Also add this to TARGETS so 'make run_tests' in selftest dir runs it
automatically.
This would have caught the bug fixed in previous change
("netfilter: nf_tables: do not skip inactive chains during generation update")
sooner.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
There is no synchronization between packet path and the configuration plane.
The packet path uses two arrays with rules, one contains the current (active)
generation. The other either contains the last (obsolete) generation or
the future one.
Consider:
cpu1 cpu2
nft_do_chain(c);
delete c
net->gen++;
genbit = !!net->gen;
rules = c->rg[genbit];
cpu1 ignores c when updating if c is not active anymore in the new
generation.
On cpu2, we now use rules from wrong generation, as c->rg[old]
contains the rules matching 'c' whereas c->rg[new] was not updated and
can even point to rules that have been free'd already, causing a crash.
To fix this, make sure that 'current' to the 'next' generation are
identical for chains that are going away so that c->rg[new] will just
use the matching rules even if genbit was incremented already.
Fixes: 0cbc06b3faba7 ("netfilter: nf_tables: remove synchronize_rcu in commit phase")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
In my haste to remove irq_port[] I accidentally changed the
way we deal with hpd pins that are shared by multiple encoders
(DP and HDMI for pre-DDI platforms). Previously we would only
handle such pins via ->hpd_pulse(), but now we queue up the
hotplug work for the HDMI encoder directly. Worse yet, we now
count each hpd twice and this increment the hpd storm count
twice as fast. This can lead to spurious storms being detected.
Go back to the old way of doing things, ie. delegate to
->hpd_pulse() for any pin which has an encoder with that hook
implemented. I don't really like the idea of adding irq_port[]
back so let's loop through the encoders first to check if we
have an encoder with ->hpd_pulse() for the pin, and then go
through all the pins and decided on the correct course of action
based on the earlier findings.
I have occasionally toyed with the idea of unifying the pre-DDI
HDMI and DP encoders into a single encoder as well. Besides the
hotplug processing it would have the other benefit of preventing
userspace from trying to enable both encoders at the same time.
That is simply illegal as they share the same clock/data pins.
We have some testcases that will attempt that and thus fail on
many older machines. But for now let's stick to fixing just the
hotplug code.
Cc: stable@vger.kernel.org # 4.19+
Cc: Lyude Paul <lyude@redhat.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Fixes: b6ca3eee18ba ("drm/i915: Nuke dev_priv->irq_port[]")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181108200424.28371-1-ville.syrjala@linux.intel.com
Reviewed-by: Lyude Paul <lyude@redhat.com>
(cherry picked from commit 5a3aeca97af1b6b3498d59a7fd4e8bb95814c108)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
Ensure that the writes into the context image are completed prior to the
register mmio to trigger execution. Although previously we were assured
by the SDM that all writes are flushed before an uncached memory
transaction (our mmio write to submit the context to HW for execution),
we have empirical evidence to believe that this is not actually the
case.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108656
References: https://bugs.freedesktop.org/show_bug.cgi?id=108315
References: https://bugs.freedesktop.org/show_bug.cgi?id=106887
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181108081740.25615-1-chris@chris-wilson.co.uk
Cc: stable@vger.kernel.org
(cherry picked from commit 987abd5c62f92ee4970b45aa077f47949974e615)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
To enable DC5/6 power well 2 has to be disabled as for previous
platforms, so fix things up.
Bspec: 4234
Fixes: 67ca07e7ac10 ("drm/i915/icl: Add power well support")
Cc: Animesh Manna <animesh.manna@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181102182200.17219-1-imre.deak@intel.com
(cherry picked from commit a33e1ece777996ddddb1f23a30f8c66422ed0b68)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
Turns out that if you trigger an HPD storm on a system that has an MST
topology connected to it, you'll end up causing the kernel to eventually
hit a NULL deref:
[ 332.339041] BUG: unable to handle kernel NULL pointer dereference at 00000000000000ec
[ 332.340906] PGD 0 P4D 0
[ 332.342750] Oops: 0000 [#1] SMP PTI
[ 332.344579] CPU: 2 PID: 25 Comm: kworker/2:0 Kdump: loaded Tainted: G O 4.18.0-rc3short-hpd-storm+ #2
[ 332.346453] Hardware name: LENOVO 20BWS1KY00/20BWS1KY00, BIOS JBET71WW (1.35 ) 09/14/2018
[ 332.348361] Workqueue: events intel_hpd_irq_storm_reenable_work [i915]
[ 332.350301] RIP: 0010:intel_hpd_irq_storm_reenable_work.cold.3+0x2f/0x86 [i915]
[ 332.352213] Code: 00 00 ba e8 00 00 00 48 c7 c6 c0 aa 5f a0 48 c7 c7 d0 73 62 a0 4c 89 c1 4c 89 04 24 e8 7f f5 af e0 4c 8b 04 24 44 89 f8 29 e8 <41> 39 80 ec 00 00 00 0f 85 43 13 fc ff 41 0f b6 86 b8 04 00 00 41
[ 332.354286] RSP: 0018:ffffc90000147e48 EFLAGS: 00010006
[ 332.356344] RAX: 0000000000000005 RBX: ffff8802c226c9d4 RCX: 0000000000000006
[ 332.358404] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff88032dc95570
[ 332.360466] RBP: 0000000000000005 R08: 0000000000000000 R09: ffff88031b3dc840
[ 332.362528] R10: 0000000000000000 R11: 000000031a069602 R12: ffff8802c226ca20
[ 332.364575] R13: ffff8802c2268000 R14: ffff880310661000 R15: 000000000000000a
[ 332.366615] FS: 0000000000000000(0000) GS:ffff88032dc80000(0000) knlGS:0000000000000000
[ 332.368658] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 332.370690] CR2: 00000000000000ec CR3: 000000000200a003 CR4: 00000000003606e0
[ 332.372724] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 332.374773] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 332.376798] Call Trace:
[ 332.378809] process_one_work+0x1a1/0x350
[ 332.380806] worker_thread+0x30/0x380
[ 332.382777] ? wq_update_unbound_numa+0x10/0x10
[ 332.384772] kthread+0x112/0x130
[ 332.386740] ? kthread_create_worker_on_cpu+0x70/0x70
[ 332.388706] ret_from_fork+0x35/0x40
[ 332.390651] Modules linked in: i915(O) vfat fat joydev btusb btrtl btbcm btintel bluetooth ecdh_generic iTCO_wdt wmi_bmof i2c_algo_bit drm_kms_helper intel_rapl syscopyarea sysfillrect x86_pkg_temp_thermal sysimgblt coretemp fb_sys_fops crc32_pclmul drm psmouse pcspkr mei_me mei i2c_i801 lpc_ich mfd_core i2c_core tpm_tis tpm_tis_core thinkpad_acpi wmi tpm rfkill video crc32c_intel serio_raw ehci_pci xhci_pci ehci_hcd xhci_hcd [last unloaded: i915]
[ 332.394963] CR2: 00000000000000ec
This appears to be due to the fact that with an MST topology, not all
intel_connector structs will have ->encoder set. So, fix this by
skipping connectors without encoders in
intel_hpd_irq_storm_reenable_work().
For those wondering, this bug was found on accident while simulating HPD
storms using a Chamelium connected to a ThinkPad T450s (Broadwell).
Changes since v1:
- Check intel_connector->mst_port instead of intel_connector->encoder
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181106213017.14563-3-lyude@redhat.com
(cherry picked from commit fee61deecb1d850bf34f682a6a452e5ee51b7572)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
This hasn't caused any issues yet that I'm aware of, but as Ville
Syrjälä pointed out - we need to make sure that
intel_connector->mst_port is set before initializing MST connectors,
since in theory we could potentially check intel_connector->mst_port in
i915_hpd_poll_init_work() after registering the connector but before
having written it's value.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20181106213017.14563-2-lyude@redhat.com
(cherry picked from commit 66a5ab1034be801630816d1fa6cfc30db1a2f0b0)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
Exercising the gpu reloc path strenuously revealed an issue where the
updated relocations (from MI_STORE_DWORD_IMM) were not being observed
upon execution. After some experiments with adding pipecontrols (a lot
of pipecontrols (32) as gen4/5 do not have a bit to wait on earlier pipe
controls or even the current on), it was discovered that we merely
needed to delay the EMIT_INVALIDATE by several flushes. It is important
to note that it is the EMIT_INVALIDATE as opposed to the EMIT_FLUSH that
needs the delay as opposed to what one might first expect -- that the
delay is required for the TLB invalidation to take effect (one presumes
to purge any CS buffers) as opposed to a delay after flushing to ensure
the writes have landed before triggering invalidation.
Testcase: igt/gem_tiled_fence_blits
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181105094305.5767-1-chris@chris-wilson.co.uk
(cherry picked from commit 55f99bf2a9c331838c981694bc872cd1ec4070b2)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
When list->count is 0, the list is deleted by GC. But list->count is
never reached 0 because initial count value is 1 and it is increased
when node is inserted. So that initial value of list->count should be 0.
Originally GC always finds zero count list through deleting node and
decreasing count. However, list may be left empty since node insertion
may fail eg. allocaton problem. In order to solve this problem, GC
routine also finds zero count list without deleting node.
Fixes: cb2b36f5a97d ("netfilter: nf_conncount: Switch to plain list")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
nf_conncount_tuple is an element of nft_connlimit and that is deleted by
conn_free(). Elements can be deleted by both GC routine and data path
functions (nf_conncount_lookup, nf_conncount_add) and they call
conn_free() to free elements. But conn_free() only protects lists, not
each element. So that list_del corruption could occurred.
The conn_free() doesn't check whether element is already deleted. In
order to protect elements, dead flag is added. If an element is deleted,
dead flag is set. The only conn_free() can delete elements so that both
list lock and dead flag are enough to protect it.
test commands:
%nft add table ip filter
%nft add chain ip filter input { type filter hook input priority 0\; }
%nft add rule filter input meter test { ip id ct count over 2 } counter
splat looks like:
[ 1779.495778] list_del corruption, ffff8800b6e12008->prev is LIST_POISON2 (dead000000000200)
[ 1779.505453] ------------[ cut here ]------------
[ 1779.506260] kernel BUG at lib/list_debug.c:50!
[ 1779.515831] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 1779.516772] CPU: 0 PID: 33 Comm: kworker/0:2 Not tainted 4.19.0-rc6+ #22
[ 1779.516772] Workqueue: events_power_efficient nft_rhash_gc [nf_tables_set]
[ 1779.516772] RIP: 0010:__list_del_entry_valid+0xd8/0x150
[ 1779.516772] Code: 39 48 83 c4 08 b8 01 00 00 00 5b 5d c3 48 89 ea 48 c7 c7 00 c3 5b 98 e8 0f dc 40 ff 0f 0b 48 c7 c7 60 c3 5b 98 e8 01 dc 40 ff <0f> 0b 48 c7 c7 c0 c3 5b 98 e8 f3 db 40 ff 0f 0b 48 c7 c7 20 c4 5b
[ 1779.516772] RSP: 0018:ffff880119127420 EFLAGS: 00010286
[ 1779.516772] RAX: 000000000000004e RBX: dead000000000200 RCX: 0000000000000000
[ 1779.516772] RDX: 000000000000004e RSI: 0000000000000008 RDI: ffffed0023224e7a
[ 1779.516772] RBP: ffff88011934bc10 R08: ffffed002367cea9 R09: ffffed002367cea9
[ 1779.516772] R10: 0000000000000001 R11: ffffed002367cea8 R12: ffff8800b6e12008
[ 1779.516772] R13: ffff8800b6e12010 R14: ffff88011934bc20 R15: ffff8800b6e12008
[ 1779.516772] FS: 0000000000000000(0000) GS:ffff88011b200000(0000) knlGS:0000000000000000
[ 1779.516772] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1779.516772] CR2: 00007fc876534010 CR3: 000000010da16000 CR4: 00000000001006f0
[ 1779.516772] Call Trace:
[ 1779.516772] conn_free+0x9f/0x2b0 [nf_conncount]
[ 1779.516772] ? nf_ct_tmpl_alloc+0x2a0/0x2a0 [nf_conntrack]
[ 1779.516772] ? nf_conncount_add+0x520/0x520 [nf_conncount]
[ 1779.516772] ? do_raw_spin_trylock+0x1a0/0x1a0
[ 1779.516772] ? do_raw_spin_trylock+0x10/0x1a0
[ 1779.516772] find_or_evict+0xe5/0x150 [nf_conncount]
[ 1779.516772] nf_conncount_gc_list+0x162/0x360 [nf_conncount]
[ 1779.516772] ? nf_conncount_lookup+0xee0/0xee0 [nf_conncount]
[ 1779.516772] ? _raw_spin_unlock_irqrestore+0x45/0x50
[ 1779.516772] ? trace_hardirqs_off+0x6b/0x220
[ 1779.516772] ? trace_hardirqs_on_caller+0x220/0x220
[ 1779.516772] nft_rhash_gc+0x16b/0x540 [nf_tables_set]
[ ... ]
Fixes: 5c789e131cbb ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
conn_free() holds lock with spin_lock() and it is called by both
nf_conncount_lookup() and nf_conncount_gc_list(). nf_conncount_lookup()
is called from bottom-half context and nf_conncount_gc_list() from
process context. So that spin_lock() call is not safe. Hence
conn_free() should use spin_lock_bh() instead of spin_lock().
test commands:
%nft add table ip filter
%nft add chain ip filter input { type filter hook input priority 0\; }
%nft add rule filter input meter test { ip saddr ct count over 2 } \
counter
splat looks like:
[ 461.996507] ================================
[ 461.998999] WARNING: inconsistent lock state
[ 461.998999] 4.19.0-rc6+ #22 Not tainted
[ 461.998999] --------------------------------
[ 461.998999] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
[ 461.998999] kworker/0:2/134 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 461.998999] 00000000a71a559a (&(&list->list_lock)->rlock){+.?.}, at: conn_free+0x69/0x2b0 [nf_conncount]
[ 461.998999] {IN-SOFTIRQ-W} state was registered at:
[ 461.998999] _raw_spin_lock+0x30/0x70
[ 461.998999] nf_conncount_add+0x28a/0x520 [nf_conncount]
[ 461.998999] nft_connlimit_eval+0x401/0x580 [nft_connlimit]
[ 461.998999] nft_dynset_eval+0x32b/0x590 [nf_tables]
[ 461.998999] nft_do_chain+0x497/0x1430 [nf_tables]
[ 461.998999] nft_do_chain_ipv4+0x255/0x330 [nf_tables]
[ 461.998999] nf_hook_slow+0xb1/0x160
[ ... ]
[ 461.998999] other info that might help us debug this:
[ 461.998999] Possible unsafe locking scenario:
[ 461.998999]
[ 461.998999] CPU0
[ 461.998999] ----
[ 461.998999] lock(&(&list->list_lock)->rlock);
[ 461.998999] <Interrupt>
[ 461.998999] lock(&(&list->list_lock)->rlock);
[ 461.998999]
[ 461.998999] *** DEADLOCK ***
[ 461.998999]
[ ... ]
Fixes: 5c789e131cbb ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This register should have been programmed with the physical address
of the memory location containing the shadow tail pointer for
the guest virtual APIC log instead of the base address.
Fixes: 8bda0cfbdc1a ('iommu/amd: Detect and initialize guest vAPIC log')
Signed-off-by: Filippo Sironi <sironi@amazon.de>
Signed-off-by: Wei Wang <wawei@amazon.de>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
We need to call pci_iounmap() instead of iounmap() for the regions
obtained via pci_iomap() call for some archs that need special
treatment.
Fixes: aa31704fd81c ("ALSA: hda/ca0132: Add PCI region2 iomap for SBZ")
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
In big.Little systems, some CPUs require the Spectre workarounds in
paths such as the context switch, but other CPUs do not. In order
to handle these differences, we need per-CPU vtables.
We are unable to use the kernel's per-CPU variables to support this
as per-CPU is not initialised at times when we need access to the
vtables, so we have to use an array indexed by logical CPU number.
We use an array-of-pointers to avoid having function pointers in
the kernel's read/write .data section.
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
In vfp_preserve_user_clear_hwstate, ufp_exc->fpinst2 gets assigned to
itself. It should actually be hwstate->fpinst2 that gets assigned to the
ufp_exc field.
Fixes commit 3aa2df6ec2ca6bc143a65351cca4266d03a8bc41 ("ARM: 8791/1:
vfp: use __copy_to_user() when saving VFP state").
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
Allow the way we access members of the processor vtable to be changed
at compile time. We will need to move to per-CPU vtables to fix the
Spectre variant 2 issues on big.Little systems.
However, we have a couple of calls that do not need the vtable
treatment, and indeed cause a kernel warning due to the (later) use
of smp_processor_id(), so also introduce the PROC_TABLE macro for
these which always use CPU 0's function pointers.
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
Call the per-processor type check_bugs() method in the same way as we
do other per-processor functions - move the "processor." detail into
proc-fns.h.
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
Split out the lookup of the processor type and associated error handling
from the rest of setup_processor() - we will need to use this in the
secondary CPU bringup path for big.Little Spectre variant 2 mitigation.
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
Move lookup_processor_type() out of the __init section so it is callable
from (eg) the secondary startup code during hotplug.
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
Digging through the "phy-qcom-qmp" showed me many inconsistencies
between the bindings and the reality of the driver. Let's fix them
all.
* In commit 2d66eab18375 ("dt-bindings: phy: qmp: Add support for QMP
phy in IPQ8074") we probably should have explicitly listed that
there are no clocks for this PHY and also added the reset names in
alphabetical order. You can see that there are no clocks in the
driver where "clk_list" is NULL.
* In commit 8587b220f05e ("dt-bindings: phy-qcom-qmp: Update bindings
for QMP V3 USB PHY") we probably should have listed the resets for
this new PHY and also removed the "(Optional)" marking for the "cfg"
reset since PHYs that need "cfg" really do need it. It's just that
not all PHYs need it.
* In commit 7f0802074120 ("dt-bindings: phy-qcom-qmp: Update bindings
for sdm845") we forgot to update one instance of the string
"qcom,qmp-v3-usb3-phy" to be "qcom,sdm845-qmp-usb3-phy". Let's fix
that. We should also have added "qcom,sdm845-qmp-usb3-uni-phy" to
the clock-names and reset-names lists.
* In commit 99c7c7364b71 ("dt-bindings: phy-qcom-qmp: Add UFS phy
compatible string for sdm845") we should have added the set of
clocks and resets for "qcom,sdm845-qmp-ufs-phy". These were taken
from the driver.
* Cleanup the wording for what properties child nodes have to make it
more obvious which types of PHYs need clocks and resets. This was
sorta implicit in the "-names" description but I found myself
confused.
* As per the code not all "pcie qmp phys" have resets. Specifically
note that the "has_lane_rst" property in the driver is false for
"ipq8074-qmp-pcie-phy". Thus make it clear exactly which PHYs need
child nodes with resets.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Evan Green <evgreen@chromium.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
|
|
The driver uses devm_ioremap_resource() which is only available when
CONFIG_HAS_IOMEM is set, so the driver depends on this option.
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
|