summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-10-18Merge branch 'Remove unnecessary RCU grace period chaining'Alexei Starovoitov
Hou Tao says: ==================== Now bpf uses RCU grace period chaining to wait for the completion of access from both sleepable and non-sleepable bpf program: calling call_rcu_tasks_trace() firstly to wait for a RCU-tasks-trace grace period, then in its callback calls call_rcu() or kfree_rcu() to wait for a normal RCU grace period. According to the implementation of RCU Tasks Trace, it inovkes ->postscan_func() to wait for one RCU-tasks-trace grace period and rcu_tasks_trace_postscan() inovkes synchronize_rcu() to wait for one normal RCU grace period in turn, so one RCU-tasks-trace grace period will imply one normal RCU grace period. To codify the implication, introduces rcu_trace_implies_rcu_gp() in patch #1. And using it in patch Other two uses of call_rcu_tasks_trace() are unchanged: for __bpf_prog_put_rcu() there is no gp chain and for __bpf_tramp_image_put_rcu_tasks() it chains RCU tasks trace GP and RCU tasks GP. An alternative way to remove these unnecessary RCU grace period chainings is using the RCU polling API to check whether or not a normal RCU grace period has passed (e.g. get_state_synchronize_rcu()). But it needs an unsigned long space for each free element or each call, and it is not affordable for local storage element, so as for now always rcu_trace_implies_rcu_gp(). Comments are always welcome. Change Log: v2: * codify the implication of RCU Tasks Trace grace period instead of assuming for it v1: https://lore.kernel.org/bpf/20221011071128.3470622-1-houtao@huaweicloud.com Hou Tao (3): bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator bpf: Use rcu_trace_implies_rcu_gp() in local storage map bpf: Use rcu_trace_implies_rcu_gp() for program array freeing ==================== Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-18bpf: Use rcu_trace_implies_rcu_gp() for program array freeingHou Tao
To support both sleepable and normal uprobe bpf program, the freeing of trace program array chains a RCU-tasks-trace grace period and a normal RCU grace period one after the other. With the introduction of rcu_trace_implies_rcu_gp(), __bpf_prog_array_free_sleepable_cb() can check whether or not a normal RCU grace period has also passed after a RCU-tasks-trace grace period has passed. If it is true, it is safe to invoke kfree() directly. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20221014113946.965131-5-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-18bpf: Use rcu_trace_implies_rcu_gp() in local storage mapHou Tao
Local storage map is accessible for both sleepable and non-sleepable bpf program, and its memory is freed by using both call_rcu_tasks_trace() and kfree_rcu() to wait for both RCU-tasks-trace grace period and RCU grace period to pass. With the introduction of rcu_trace_implies_rcu_gp(), both bpf_selem_free_rcu() and bpf_local_storage_free_rcu() can check whether or not a normal RCU grace period has also passed after a RCU-tasks-trace grace period has passed. If it is true, it is safe to call kfree() directly. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20221014113946.965131-4-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-18bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocatorHou Tao
The memory free logic in bpf memory allocator chains a RCU Tasks Trace grace period and a normal RCU grace period one after the other, so it can ensure that both sleepable and non-sleepable programs have finished. With the introduction of rcu_trace_implies_rcu_gp(), __free_rcu_tasks_trace() can check whether or not a normal RCU grace period has also passed after a RCU Tasks Trace grace period has passed. If it is true, freeing these elements directly, else freeing through call_rcu(). Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20221014113946.965131-3-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-18rcu-tasks: Provide rcu_trace_implies_rcu_gp()Paul E. McKenney
As an accident of implementation, an RCU Tasks Trace grace period also acts as an RCU grace period. However, this could change at any time. This commit therefore creates an rcu_trace_implies_rcu_gp() that currently returns true to codify this accident. Code relying on this accident must call this function to verify that this accident is still happening. Reported-by: Hou Tao <houtao@huaweicloud.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Link: https://lore.kernel.org/r/20221014113946.965131-2-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-18dm bufio: use the acquire memory barrier when testing for B_READINGMikulas Patocka
The function test_bit doesn't provide any memory barrier. It may be possible that the read requests that follow test_bit(B_READING, &b->state) are reordered before the test, reading invalid data that existed before B_READING was cleared. Fix this bug by changing test_bit to test_bit_acquire. This is particularly important on arches with weak(er) memory ordering (e.g. arm64). Depends-On: 8238b4579866 ("wait_on_bit: add an acquire memory barrier") Depends-On: d6ffe6067a54 ("provide arch_test_bit_acquire for architectures that define test_bit") Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-10-18cifs: Fix memory leak when build ntlmssp negotiate blob failedZhang Xiaoxu
There is a memory leak when mount cifs: unreferenced object 0xffff888166059600 (size 448): comm "mount.cifs", pid 51391, jiffies 4295596373 (age 330.596s) hex dump (first 32 bytes): fe 53 4d 42 40 00 00 00 00 00 00 00 01 00 82 00 .SMB@........... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<0000000060609a61>] mempool_alloc+0xe1/0x260 [<00000000adfa6c63>] cifs_small_buf_get+0x24/0x60 [<00000000ebb404c7>] __smb2_plain_req_init+0x32/0x460 [<00000000bcf875b4>] SMB2_sess_alloc_buffer+0xa4/0x3f0 [<00000000753a2987>] SMB2_sess_auth_rawntlmssp_negotiate+0xf5/0x480 [<00000000f0c1f4f9>] SMB2_sess_setup+0x253/0x410 [<00000000a8b83303>] cifs_setup_session+0x18f/0x4c0 [<00000000854bd16d>] cifs_get_smb_ses+0xae7/0x13c0 [<000000006cbc43d9>] mount_get_conns+0x7a/0x730 [<000000005922d816>] cifs_mount+0x103/0xd10 [<00000000e33def3b>] cifs_smb3_do_mount+0x1dd/0xc90 [<0000000078034979>] smb3_get_tree+0x1d5/0x300 [<000000004371f980>] vfs_get_tree+0x41/0xf0 [<00000000b670d8a7>] path_mount+0x9b3/0xdd0 [<000000005e839a7d>] __x64_sys_mount+0x190/0x1d0 [<000000009404c3b9>] do_syscall_64+0x35/0x80 When build ntlmssp negotiate blob failed, the session setup request should be freed. Fixes: 49bd49f983b5 ("cifs: send workstation name during ntlmssp session setup") Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-18cifs: set rc to -ENOENT if we can not get a dentry for the cached dirRonnie Sahlberg
We already set rc to this return code further down in the function but we can set it earlier in order to suppress a smash warning. Also fix a false positive for Coverity. The reason this is a false positive is that this happens during umount after all files and directories have been closed but mosetting on ->on_list to suppress the warning. Reported-by: Dan carpenter <dan.carpenter@oracle.com> Reported-by: coverity-bot <keescook+coverity-bot@chromium.org> Addresses-Coverity-ID: 1525256 ("Concurrent data access violations") Fixes: a350d6e73f5e ("cifs: enable caching of directories for which a lease is held") Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-18cifs: use LIST_HEAD() and list_move() to simplify codeYang Yingliang
list_head can be initialized automatically with LIST_HEAD() instead of calling INIT_LIST_HEAD(). Using list_move() instead of list_del() and list_add(). Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-18cifs: Fix xid leak in cifs_get_file_info_unix()Zhang Xiaoxu
If stardup the symlink target failed, should free the xid, otherwise the xid will be leaked. Fixes: 76894f3e2f71 ("cifs: improve symlink handling for smb2+") Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-18cifs: Fix xid leak in cifs_ses_add_channel()Zhang Xiaoxu
Before return, should free the xid, otherwise, the xid will be leaked. Fixes: d70e9fa55884 ("cifs: try opening channels after mounting") Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-18cifs: Fix xid leak in cifs_flock()Zhang Xiaoxu
If not flock, before return -ENOLCK, should free the xid, otherwise, the xid will be leaked. Fixes: d0677992d2af ("cifs: add support for flock") Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-18cifs: Fix xid leak in cifs_copy_file_range()Zhang Xiaoxu
If the file is used by swap, before return -EOPNOTSUPP, should free the xid, otherwise, the xid will be leaked. Fixes: 4e8aea30f775 ("smb3: enable swap on SMB3 mounts") Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-18cifs: Fix xid leak in cifs_create()Zhang Xiaoxu
If the cifs already shutdown, we should free the xid before return, otherwise, the xid will be leaked. Fixes: 087f757b0129 ("cifs: add shutdown support") Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-18Merge tag 'cpufreq-arm-fixes-6.1-rc' of ↵Rafael J. Wysocki
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Pull cpufreq ARM fixes / cleanups for 6.1-rc from Viresh Kumar: "- Fix module loading in Tegra124 driver (Jon Hunter). - Fix memory leak and update to read-only region in qcom driver (Fabien Parent). - Miscellaneous minor cleanups to cpufreq drivers (Fabien Parent and Yang Yingliang)." * tag 'cpufreq-arm-fixes-6.1-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: cpufreq: sun50i: Switch to use dev_err_probe() helper cpufreq: qcom-nvmem: Switch to use dev_err_probe() helper cpufreq: imx6q: Switch to use dev_err_probe() helper cpufreq: dt: Switch to use dev_err_probe() helper cpufreq: qcom: remove unused parameter in function definition cpufreq: qcom: fix writes in read-only memory region cpufreq: qcom: fix memory leak in error path cpufreq: tegra194: Fix module loading
2022-10-18HID: lenovo: Make array tp10ubkbd_led static constColin Ian King
Don't populate the read-only array tp10ubkbd_led on the stack but instead make it static const. Also makes the object code a little smaller. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2022-10-18HID: saitek: add madcatz variant of MMO7 mouse device IDSamuel Bailey
The MadCatz variant of the MMO7 mouse has the ID 0738:1713 and the same quirks as the Saitek variant. Signed-off-by: Samuel Bailey <samuel.bailey1@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2022-10-18Documentation: document ublk user recovery featureZiyangZhang
Add documentation for user recovery feature of ublk subsystem. Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20221018045346.99706-2-ZiyangZhang@linux.alibaba.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-10-18cpufreq: sun50i: Switch to use dev_err_probe() helperYang Yingliang
In the probe path, convert pr_err() to dev_err_probe() which will check if error code is -EPROBE_DEFER and prints the error name. It also sets the defer probe reason which can be checked later through debugfs. It's more simple in error path. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-10-18cpufreq: qcom-nvmem: Switch to use dev_err_probe() helperYang Yingliang
In the probe path, dev_err() can be replaced with dev_err_probe() which will check if error code is -EPROBE_DEFER and prints the error name. It also sets the defer probe reason which can be checked later through debugfs. It's more simple in error path. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-10-18cpufreq: imx6q: Switch to use dev_err_probe() helperYang Yingliang
In the probe path, dev_err() can be replaced with dev_err_probe() which will check if error code is -EPROBE_DEFER and prints the error name. It also sets the defer probe reason which can be checked later through debugfs. It's more simple in error path. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-10-18cpufreq: dt: Switch to use dev_err_probe() helperYang Yingliang
In the probe path, dev_err() can be replaced with dev_err_probe() which will check if error code is -EPROBE_DEFER and prints the error name. It also sets the defer probe reason which can be checked later through debugfs. It's more simple in error path. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-10-18cpufreq: qcom: remove unused parameter in function definitionFabien Parent
The speedbin_nvmem parameter is not used for get_krait_bin_format_{a,b}. Let's remove the parameter to make the code cleaner. Signed-off-by: Fabien Parent <fabien.parent@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-10-18cpufreq: qcom: fix writes in read-only memory regionFabien Parent
This commit fixes a kernel oops because of a write in some read-only memory: [ 9.068287] Unable to handle kernel write to read-only memory at virtual address ffff800009240ad8 ..snip.. [ 9.138790] Internal error: Oops: 9600004f [#1] PREEMPT SMP ..snip.. [ 9.269161] Call trace: [ 9.276271] __memcpy+0x5c/0x230 [ 9.278531] snprintf+0x58/0x80 [ 9.282002] qcom_cpufreq_msm8939_name_version+0xb4/0x190 [ 9.284869] qcom_cpufreq_probe+0xc8/0x39c ..snip.. The following line defines a pointer that point to a char buffer stored in read-only memory: char *pvs_name = "speedXX-pvsXX-vXX"; This pointer is meant to hold a template "speedXX-pvsXX-vXX" where the XX values get overridden by the qcom_cpufreq_krait_name_version function. Since the template is actually stored in read-only memory, when the function executes the following call we get an oops: snprintf(*pvs_name, sizeof("speedXX-pvsXX-vXX"), "speed%d-pvs%d-v%d", speed, pvs, pvs_ver); To fix this issue, we instead store the template name onto the stack by using the following syntax: char pvs_name_buffer[] = "speedXX-pvsXX-vXX"; Because the `pvs_name` needs to be able to be assigned to NULL, the template buffer is stored in the pvs_name_buffer and not under the pvs_name variable. Cc: v5.7+ <stable@vger.kernel.org> # v5.7+ Fixes: a8811ec764f9 ("cpufreq: qcom: Add support for krait based socs") Signed-off-by: Fabien Parent <fabien.parent@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-10-18cpufreq: qcom: fix memory leak in error pathFabien Parent
If for some reason the speedbin length is incorrect, then there is a memory leak in the error path because we never free the speedbin buffer. This commit fixes the error path to always free the speedbin buffer. Cc: v5.7+ <stable@vger.kernel.org> # v5.7+ Fixes: a8811ec764f9 ("cpufreq: qcom: Add support for krait based socs") Signed-off-by: Fabien Parent <fabien.parent@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-10-18cpufreq: tegra194: Fix module loadingJon Hunter
When the Tegra194 CPUFREQ driver is built as a module it is not automatically loaded as expected on Tegra194 devices. Populate the MODULE_DEVICE_TABLE to fix this. Cc: v5.9+ <stable@vger.kernel.org> # v5.9+ Fixes: df320f89359c ("cpufreq: Add Tegra194 cpufreq driver") Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-10-18net: ip6_gre: Remove the unused function ip6gre_tnl_addr_conflict()Jiapeng Chong
The function ip6gre_tnl_addr_conflict() is defined in the ip6_gre.c file, but not called elsewhere, so delete this unused function. net/ipv6/ip6_gre.c:887:20: warning: unused function 'ip6gre_tnl_addr_conflict'. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2419 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Link: https://lore.kernel.org/r/20221017093540.26806-1-jiapeng.chong@linux.alibaba.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-10-18ip6mr: fix UAF issue in ip6mr_sk_done() when addrconf_init_net() failedZhengchao Shao
If the initialization fails in calling addrconf_init_net(), devconf_all is the pointer that has been released. Then ip6mr_sk_done() is called to release the net, accessing devconf->mc_forwarding directly causes invalid pointer access. The process is as follows: setup_net() ops_init() addrconf_init_net() all = kmemdup(...) ---> alloc "all" ... net->ipv6.devconf_all = all; __addrconf_sysctl_register() ---> failed ... kfree(all); ---> ipv6.devconf_all invalid ... ops_exit_list() ... ip6mr_sk_done() devconf = net->ipv6.devconf_all; //devconf is invalid pointer if (!devconf || !atomic_read(&devconf->mc_forwarding)) The following is the Call Trace information: BUG: KASAN: use-after-free in ip6mr_sk_done+0x112/0x3a0 Read of size 4 at addr ffff888075508e88 by task ip/14554 Call Trace: <TASK> dump_stack_lvl+0x8e/0xd1 print_report+0x155/0x454 kasan_report+0xba/0x1f0 kasan_check_range+0x35/0x1b0 ip6mr_sk_done+0x112/0x3a0 rawv6_close+0x48/0x70 inet_release+0x109/0x230 inet6_release+0x4c/0x70 sock_release+0x87/0x1b0 igmp6_net_exit+0x6b/0x170 ops_exit_list+0xb0/0x170 setup_net+0x7ac/0xbd0 copy_net_ns+0x2e6/0x6b0 create_new_namespaces+0x382/0xa50 unshare_nsproxy_namespaces+0xa6/0x1c0 ksys_unshare+0x3a4/0x7e0 __x64_sys_unshare+0x2d/0x40 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7f7963322547 </TASK> Allocated by task 14554: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 __kasan_kmalloc+0xa1/0xb0 __kmalloc_node_track_caller+0x4a/0xb0 kmemdup+0x28/0x60 addrconf_init_net+0x1be/0x840 ops_init+0xa5/0x410 setup_net+0x5aa/0xbd0 copy_net_ns+0x2e6/0x6b0 create_new_namespaces+0x382/0xa50 unshare_nsproxy_namespaces+0xa6/0x1c0 ksys_unshare+0x3a4/0x7e0 __x64_sys_unshare+0x2d/0x40 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Freed by task 14554: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 kasan_save_free_info+0x2a/0x40 ____kasan_slab_free+0x155/0x1b0 slab_free_freelist_hook+0x11b/0x220 __kmem_cache_free+0xa4/0x360 addrconf_init_net+0x623/0x840 ops_init+0xa5/0x410 setup_net+0x5aa/0xbd0 copy_net_ns+0x2e6/0x6b0 create_new_namespaces+0x382/0xa50 unshare_nsproxy_namespaces+0xa6/0x1c0 ksys_unshare+0x3a4/0x7e0 __x64_sys_unshare+0x2d/0x40 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Fixes: 7d9b1b578d67 ("ip6mr: fix use-after-free in ip6mr_sk_done()") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20221017080331.16878-1-shaozhengchao@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-10-18x86/microcode/AMD: Apply the patch early on every logical threadBorislav Petkov
Currently, the patch application logic checks whether the revision needs to be applied on each logical CPU (SMT thread). Therefore, on SMT designs where the microcode engine is shared between the two threads, the application happens only on one of them as that is enough to update the shared microcode engine. However, there are microcode patches which do per-thread modification, see Link tag below. Therefore, drop the revision check and try applying on each thread. This is what the BIOS does too so this method is very much tested. Btw, change only the early paths. On the late loading paths, there's no point in doing per-thread modification because if is it some case like in the bugzilla below - removing a CPUID flag - the kernel cannot go and un-use features it has detected are there early. For that, one should use early loading anyway. [ bp: Fixes does not contain the oldest commit which did check for equality but that is good enough. ] Fixes: 8801b3fcb574 ("x86/microcode/AMD: Rework container parsing") Reported-by: Ștefan Talpalaru <stefantalpalaru@yahoo.com> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Ștefan Talpalaru <stefantalpalaru@yahoo.com> Cc: <stable@vger.kernel.org> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216211
2022-10-18pinctrl: ocelot: Fix incorrect trigger of the interrupt.Horatiu Vultur
The interrupt controller can detect only link changes. So in case an external device generated a level based interrupt, then the interrupt controller detected correctly the first edge. But the problem was that the interrupt controller was detecting also the edge when the interrupt was cleared. So it would generate another interrupt. The fix for this is to clear the second interrupt but still check the interrupt line status. Fixes: c297561bc98a ("pinctrl: ocelot: Fix interrupt controller") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Tested-by: Michael Walle <michael@walle.cc> Link: https://lore.kernel.org/r/20221018070959.1322606-1-horatiu.vultur@microchip.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-10-18udp: Update reuse->has_conns under reuseport_lock.Kuniyuki Iwashima
When we call connect() for a UDP socket in a reuseport group, we have to update sk->sk_reuseport_cb->has_conns to 1. Otherwise, the kernel could select a unconnected socket wrongly for packets sent to the connected socket. However, the current way to set has_conns is illegal and possible to trigger that problem. reuseport_has_conns() changes has_conns under rcu_read_lock(), which upgrades the RCU reader to the updater. Then, it must do the update under the updater's lock, reuseport_lock, but it doesn't for now. For this reason, there is a race below where we fail to set has_conns resulting in the wrong socket selection. To avoid the race, let's split the reader and updater with proper locking. cpu1 cpu2 +----+ +----+ __ip[46]_datagram_connect() reuseport_grow() . . |- reuseport_has_conns(sk, true) |- more_reuse = __reuseport_alloc(more_socks_size) | . | | |- rcu_read_lock() | |- reuse = rcu_dereference(sk->sk_reuseport_cb) | | | | | /* reuse->has_conns == 0 here */ | | |- more_reuse->has_conns = reuse->has_conns | |- reuse->has_conns = 1 | /* more_reuse->has_conns SHOULD BE 1 HERE */ | | | | | |- rcu_assign_pointer(reuse->socks[i]->sk_reuseport_cb, | | | more_reuse) | `- rcu_read_unlock() `- kfree_rcu(reuse, rcu) | |- sk->sk_state = TCP_ESTABLISHED Note the likely(reuse) in reuseport_has_conns_set() is always true, but we put the test there for ease of review. [0] For the record, usually, sk_reuseport_cb is changed under lock_sock(). The only exception is reuseport_grow() & TCP reqsk migration case. 1) shutdown() TCP listener, which is moved into the latter part of reuse->socks[] to migrate reqsk. 2) New listen() overflows reuse->socks[] and call reuseport_grow(). 3) reuse->max_socks overflows u16 with the new listener. 4) reuseport_grow() pops the old shutdown()ed listener from the array and update its sk->sk_reuseport_cb as NULL without lock_sock(). shutdown()ed TCP sk->sk_reuseport_cb can be changed without lock_sock(), but, reuseport_has_conns_set() is called only for UDP under lock_sock(), so likely(reuse) never be false in reuseport_has_conns_set(). [0]: https://lore.kernel.org/netdev/CANn89iLja=eQHbsM_Ta2sQF0tOGU8vAGrh_izRuuHjuO1ouUag@mail.gmail.com/ Fixes: acdcecc61285 ("udp: correct reuseport selection with connected sockets") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20221014182625.89913-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-10-18Revert "dt-bindings: pinctrl-zynqmp: Add output-enable configuration"Sai Krishna Potthuri
This reverts commit 133ad0d9af99bdca90705dadd8d31c20bfc9919f. On systems with older PMUFW (Xilinx ZynqMP Platform Management Firmware) using these pinctrl properties can cause system hang because there is missing feature autodetection. When this feature is implemented, support for these two properties should bring back. Cc: stable@vger.kernel.org Signed-off-by: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com> Acked-by: Michal Simek <michal.simek@amd.com> Link: https://lore.kernel.org/r/20221017130303.21746-3-sai.krishna.potthuri@amd.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-10-18Revert "pinctrl: pinctrl-zynqmp: Add support for output-enable and ↵Sai Krishna Potthuri
bias-high-impedance" This reverts commit ad2bea79ef0144043721d4893eef719c907e2e63. On systems with older PMUFW (Xilinx ZynqMP Platform Management Firmware) using these pinctrl properties can cause system hang because there is missing feature autodetection. When this feature is implemented in the PMUFW, support for these two properties should bring back. Cc: stable@vger.kernel.org Signed-off-by: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com> Acked-by: Michal Simek <michal.simek@amd.com> Link: https://lore.kernel.org/r/20221017130303.21746-2-sai.krishna.potthuri@amd.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-10-18scsi: lpfc: Fix memory leak in lpfc_create_port()Rafael Mendonca
Commit 5e633302ace1 ("scsi: lpfc: vmid: Add support for VMID in mailbox command") introduced allocations for the VMID resources in lpfc_create_port() after the call to scsi_host_alloc(). Upon failure on the VMID allocations, the new code would branch to the 'out' label, which returns NULL without unwinding anything, thus skipping the call to scsi_host_put(). Fix the problem by creating a separate label 'out_free_vmid' to unwind the VMID resources and make the 'out_put_shost' label call only scsi_host_put(), as was done before the introduction of allocations for VMID. Fixes: 5e633302ace1 ("scsi: lpfc: vmid: Add support for VMID in mailbox command") Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Link: https://lore.kernel.org/r/20220916035908.712799-1-rafaelmendsr@gmail.com Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18scsi: core: Restrict legal sdev_state transitions via sysfsUday Shankar
Userspace can currently write to sysfs to transition sdev_state to RUNNING or OFFLINE from any source state. This causes issues because proper transitioning out of some states involves steps besides just changing sdev_state, so allowing userspace to change sdev_state regardless of the source state can result in inconsistencies; e.g. with ISCSI we can end up with sdev_state == SDEV_RUNNING while the device queue is quiesced. Any task attempting I/O on the device will then hang, and in more recent kernels, iscsid will hang as well. More detail about this bug is provided in my first attempt: https://groups.google.com/g/open-iscsi/c/PNKca4HgPDs/m/CXaDkntOAQAJ Link: https://lore.kernel.org/r/20220924000241.2967323-1-ushankar@purestorage.com Signed-off-by: Uday Shankar <ushankar@purestorage.com> Suggested-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-17Merge tag 'cgroup-for-6.1-rc1-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: - Fix a recent regression where a sleeping kernfs function is called with css_set_lock (spinlock) held - Revert the commit to enable cgroup1 support for cgroup_get_from_fd/file() Multiple users assume that the lookup only works for cgroup2 and breaks when fed a cgroup1 file. Instead, introduce a separate set of functions to lookup both v1 and v2 and use them where the user explicitly wants to support both versions. - Compat update for tools/perf/util/bpf_skel/bperf_cgroup.bpf.c. - Add Josef Bacik as a blkcg maintainer. * tag 'cgroup-for-6.1-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: blkcg: Update MAINTAINERS entry mm: cgroup: fix comments for get from fd/file helpers perf stat: Support old kernels for bperf cgroup counting bpf: cgroup_iter: support cgroup1 using cgroup fd cgroup: add cgroup_v1v2_get_from_[fd/file]() Revert "cgroup: enable cgroup_get_from_file() on cgroup1" cgroup: Reorganize css_set_lock and kernfs path processing
2022-10-17ARC: mm: fix leakage of memory allocated for PTEPavel Kozlov
Since commit d9820ff ("ARC: mm: switch pgtable_t back to struct page *") a memory leakage problem occurs. Memory allocated for page table entries not released during process termination. This issue can be reproduced by a small program that allocates a large amount of memory. After several runs, you'll see that the amount of free memory has reduced and will continue to reduce after each run. All ARC CPUs are effected by this issue. The issue was introduced since the kernel stable release v5.15-rc1. As described in commit d9820ff after switch pgtable_t back to struct page *, a pointer to "struct page" and appropriate functions are used to allocate and free a memory page for PTEs, but the pmd_pgtable macro hasn't changed and returns the direct virtual address from the PMD (PGD) entry. Than this address used as a parameter in the __pte_free() and as a result this function couldn't release memory page allocated for PTEs. Fix this issue by changing the pmd_pgtable macro and returning pointer to struct page. Fixes: d9820ff76f95 ("ARC: mm: switch pgtable_t back to struct page *") Cc: Mike Rapoport <rppt@kernel.org> Cc: <stable@vger.kernel.org> # 5.15.x Signed-off-by: Pavel Kozlov <pavel.kozlov@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2022-10-17arc: update config filesLukas Bulwahn
Clean up config files by: - removing configs that were deleted in the past - removing configs not in tree and without recently pending patches - adding new configs that are replacements for old configs in the file For some detailed information, see Link. Link: https://lore.kernel.org/kernel-janitors/20220929090645.1389-1-lukas.bulwahn@gmail.com/ Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2022-10-17arc: iounmap() arg is volatileRandy Dunlap
Add 'volatile' to iounmap()'s argument to prevent build warnings. This make it the same as other major architectures. Placates these warnings: (12 such warnings) ../drivers/video/fbdev/riva/fbdev.c: In function 'rivafb_probe': ../drivers/video/fbdev/riva/fbdev.c:2067:42: error: passing argument 1 of 'iounmap' discards 'volatile' qualifier from pointer target type [-Werror=discarded-qualifiers] 2067 | iounmap(default_par->riva.PRAMIN); Fixes: 1162b0701b14b ("ARC: I/O and DMA Mappings") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Vineet Gupta <vgupta@kernel.org> Cc: linux-snps-arc@lists.infradead.org Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2022-10-17arc: dts: Harmonize EHCI/OHCI DT nodes nameSerge Semin
In accordance with the Generic EHCI/OHCI bindings the corresponding node name is suppose to comply with the Generic USB HCD DT schema, which requires the USB nodes to have the name acceptable by the regexp: "^usb(@.*)?" . Make sure the "generic-ehci" and "generic-ohci"-compatible nodes are correctly named. Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Acked-by: Alexey Brodkin <abrodkin@synopsys.com> Acked-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2022-10-17ARC: bitops: Change __fls to return unsigned longAmadeusz Sławiński
As per asm-generic definition and other architectures __fls should return unsigned long. No functional change is expected as return value should fit in unsigned long. Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com> Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2022-10-17ARC: Fix comment typoZhang Jiaming
Change 'seperate' to 'separate'. Signed-off-by: Zhang Jiaming <jiaming@nfschina.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2022-10-17ARC: Fix comment typoJilin Yuan
- Remove one of the repeated 'call' in comment line 396. - Delete the redundant word 'to', 'since' Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2022-10-18ata: ahci_qoriq: Fix compilation warningDamien Le Moal
When compiling with clang and W=1, the following warning is generated: drivers/ata/ahci_qoriq.c:283:22: error: cast to smaller integer type 'enum ahci_qoriq_type' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast] qoriq_priv->type = (enum ahci_qoriq_type)of_id->data; ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix this by using a cast to unsigned long to match the "void *" type size of of_id->data. Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Arnd Bergmann <arnd@arndb.de>
2022-10-18ata: ahci_imx: Fix compilation warningDamien Le Moal
When compiling with clang and W=1, the following warning is generated: drivers/ata/ahci_imx.c:1070:18: error: cast to smaller integer type 'enum ahci_imx_type' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast] imxpriv->type = (enum ahci_imx_type)of_id->data; ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix this by using a cast to unsigned long to match the "void *" type size of of_id->data. Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Arnd Bergmann <arnd@arndb.de>
2022-10-18ata: ahci_xgene: Fix compilation warningDamien Le Moal
When compiling with clang and W=1, the following warning is generated: drivers/ata/ahci_xgene.c:788:14: error: cast to smaller integer type 'enum xgene_ahci_version' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast] version = (enum xgene_ahci_version) of_devid->data; ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix this by using a cast to unsigned long to match the "void *" type size of of_devid->data. Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Arnd Bergmann <arnd@arndb.de>
2022-10-18ata: ahci_brcm: Fix compilation warningDamien Le Moal
When compiling with clang and W=1, the following warning is generated: drivers/ata/ahci_brcm.c:451:18: error: cast to smaller integer type 'enum brcm_ahci_version' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast] priv->version = (enum brcm_ahci_version)of_id->data; ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix this by using a cast to unsigned long to match the "void *" type size of of_id->data. Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Florian Fainelli <f.fainelli@gmail.com>
2022-10-18ata: sata_rcar: Fix compilation warningDamien Le Moal
When compiling with clang and W=1, the following warning is generated: drivers/ata/sata_rcar.c:878:15: error: cast to smaller integer type 'enum sata_rcar_type' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast] priv->type = (enum sata_rcar_type)of_device_get_match_data(dev); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix this by using a cast to unsigned long to match the "void *" type size returned by of_device_get_match_data(). Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
2022-10-17blkcg: Update MAINTAINERS entryTejun Heo
Josef wrote iolatency and iocost is missing from the files list. Let's add Josef as a maintainer and add blk-iocost.c to the files list. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Jens Axboe <axboe@kernel.dk>
2022-10-17x86/topology: Fix duplicated core ID within a packageZhang Rui
Today, core ID is assumed to be unique within each package. But an AlderLake-N platform adds a Module level between core and package, Linux excludes the unknown modules bits from the core ID, resulting in duplicate core ID's. To keep core ID unique within a package, Linux must include all APIC-ID bits for known or unknown levels above the core and below the package in the core ID. It is important to understand that core ID's have always come directly from the APIC-ID encoding, which comes from the BIOS. Thus there is no guarantee that they start at 0, or that they are contiguous. As such, naively using them for array indexes can be problematic. [ dhansen: un-known -> unknown ] Fixes: 7745f03eb395 ("x86/topology: Add CPUID.1F multi-die/package support") Suggested-by: Len Brown <len.brown@intel.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Len Brown <len.brown@intel.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20221014090147.1836-5-rui.zhang@intel.com