summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-11-02net: ethernet: ti: am65-cpsw: keep active if cpts enabledGrygorii Strashko
Some K3 CPSW NUSS instances can lose context after PM runtime ON->OFF->ON transition depending on integration (including all submodules: CPTS, MDIO, etc), like J721E Main CPSW (CPSW9G). In case CPTS is enabled it's initialized during probe and does not expect to be reset. Hence, keep K3 CPSW active by forbidding PM runtime if CPTS is enabled. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02net: ethernet: ti: am65-cpsw: fix vlan offload for multi mac modeGrygorii Strashko
The VLAN offload for AM65x CPSW2G is implemented using existing ALE APIs, which are also used by legacy CPSW drivers. So, now it always adds current Ext. Port and Host as VLAN members when VLAN is added by 8021Q core (.ndo_vlan_rx_add_vid) and forcibly removes VLAN from ALE table in .ndo_vlan_rx_kill_vid(). This works as for AM65x CPSW2G (which has only one Ext. Port) as for legacy CPSW devices (which can't support same VLAN on more then one Port in multi mac (dual-mac) mode). But it doesn't work for the new J721E and AM64x multi port CPSWxG versions doesn't have such restrictions and allow to offload the same VLAN on any number of ports. Now the attempt to add same VLAN on two (or more) K3 CPSWxG Ports will cause: - VLAN members mask overwrite when VLAN is added - VLAN removal from ALE table when any Port removes VLAN This patch fixes an issue by: - switching to use cpsw_ale_vlan_add_modify() instead of cpsw_ale_add_vlan() when VLAN is added to ALE table, so VLAN members mask will not be overwritten; - Updates cpsw_ale_del_vlan() as: if more than one ext. Port is in VLAN member mask then remove only current port from VLAN member mask else remove VLAN ALE entry Example: add: P1 | P0 (Host) -> members mask: P1 | P0 add: P2 | P0 -> members mask: P2 | P1 | P0 rem: P1 | P0 -> members mask: P2 | P0 rem: P2 | P0 -> members mask: - The VLAN is forcibly removed if port_mask=0 passed to cpsw_ale_del_vlan() to preserve existing legacy CPSW drivers functionality. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02net: ethernet: ti: cpsw_ale: add cpsw_ale_vlan_del_modify()Grygorii Strashko
Add/export cpsw_ale_vlan_del_modify() and use it in cpsw_switchdev instead of generic cpsw_ale_del_vlan() to avoid mixing 8021Q and switchdev VLAN offload. This is preparation patch equired by follow up changes. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02net: ethernet: ti: am65-cpsw: use cppi5_desc_is_tdcm()Grygorii Strashko
Use cppi5_desc_is_tdcm() helper for teardown indicator detection instead of hard-coded value. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02net: ethernet: ti: am65-cpsw: move free desc queue mode selection in pdataGrygorii Strashko
In preparation of adding more multi-port K3 CPSW versions move free descriptor queue mode selection in am65_cpsw_pdata, so it can be selected basing on DT compatibility property. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02net: ethernet: ti: am65-cpsw: move ale selection in pdataGrygorii Strashko
In preparation of adding more multi-port K3 CPSW versions move ALE selection in am65_cpsw_pdata, so it can be selected basing on DT compatibility property. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02net: ipv6: For kerneldoc warnings with W=1Xin Long
net/ipv6/addrconf.c:2005: warning: Function parameter or member 'dev' not described in 'ipv6_dev_find' net/ipv6/ip6_vti.c:138: warning: Function parameter or member 'ip6n' not described in 'vti6_tnl_bucket' net/ipv6/ip6_tunnel.c:218: warning: Function parameter or member 'ip6n' not described in 'ip6_tnl_bucket' net/ipv6/ip6_tunnel.c:238: warning: Function parameter or member 'ip6n' not described in 'ip6_tnl_link' net/ipv6/ip6_tunnel.c:254: warning: Function parameter or member 'ip6n' not described in 'ip6_tnl_unlink' net/ipv6/ip6_tunnel.c:427: warning: Function parameter or member 'raw' not described in 'ip6_tnl_parse_tlv_enc_lim' net/ipv6/ip6_tunnel.c:499: warning: Function parameter or member 'skb' not described in 'ip6_tnl_err' net/ipv6/ip6_tunnel.c:499: warning: Function parameter or member 'ipproto' not described in 'ip6_tnl_err' net/ipv6/ip6_tunnel.c:499: warning: Function parameter or member 'opt' not described in 'ip6_tnl_err' net/ipv6/ip6_tunnel.c:499: warning: Function parameter or member 'type' not described in 'ip6_tnl_err' net/ipv6/ip6_tunnel.c:499: warning: Function parameter or member 'code' not described in 'ip6_tnl_err' net/ipv6/ip6_tunnel.c:499: warning: Function parameter or member 'msg' not described in 'ip6_tnl_err' net/ipv6/ip6_tunnel.c:499: warning: Function parameter or member 'info' not described in 'ip6_tnl_err' net/ipv6/ip6_tunnel.c:499: warning: Function parameter or member 'offset' not described in 'ip6_tnl_err' ip6_tnl_err() is an internal function, so remove the kerneldoc. For the others, add the missing parameters. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20201031183044.1082193-1-andrew@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02net: driver: hamradio: Fix potential unterminated stringAndrew Lunn
With W=1 the following error is reported: In function ‘strncpy’, inlined from ‘hdlcdrv_ioctl’ at drivers/net/hamradio/hdlcdrv.c:600:4: ./include/linux/string.h:297:30: warning: ‘__builtin_strncpy’ specified bound 32 equals destination size [-Wstringop-truncation] 297 | #define __underlying_strncpy __builtin_strncpy | ^ ./include/linux/string.h:307:9: note: in expansion of macro ‘__underlying_strncpy’ 307 | return __underlying_strncpy(p, q, size); Replace strncpy with strlcpy to guarantee the string is terminated. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20201031181700.1081693-1-andrew@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02drivers: net: wan: lmc: Fix W=1 set but used variable warningsAndrew Lunn
drivers/net/wan/lmc/lmc_main.c: In function ‘lmc_ioctl’: drivers/net/wan/lmc/lmc_main.c:356:25: warning: variable ‘mii’ set but not used [-Wunused-but-set-variable] 356 | u16 mii; | ^~~ drivers/net/wan/lmc/lmc_main.c:427:25: warning: variable ‘mii’ set but not used [-Wunused-but-set-variable] 427 | u16 mii; | ^~~ drivers/net/wan/lmc/lmc_main.c: In function ‘lmc_interrupt’: drivers/net/wan/lmc/lmc_main.c:1188:9: warning: variable ‘firstcsr’ set but not used [-Wunused-but-set-variable] 1188 | u32 firstcsr; This file has funky indentation, and makes little use of tabs. Keep with this style in the patch, but that makes checkpatch unhappy. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20201031181417.1081511-1-andrew@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02drivers: net: xen-netfront: Fixed W=1 set but unused warningsAndrew Lunn
drivers/net/xen-netfront.c:2416:16: warning: variable ‘target’ set but not used [-Wunused-but-set-variable] 2416 | unsigned long target; Remove target and just discard the return value from simple_strtoul(). This patch does give a checkpatch warning, but the warning was there before anyway, as this file has lots of checkpatch warnings. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20201031180435.1081127-1-andrew@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02RDMA/vmw_pvrdma: Fix the active_speed and phys_state valueAdit Ranadive
The pvrdma_port_attr structure is ABI toward the hypervisor, changing it breaks the ability to report the speed properly. Revert the change to u16. Fixes: 376ceb31ff87 ("RDMA: Fix link active_speed size") Link: https://lore.kernel.org/r/20201102225437.26557-1-aditr@vmware.com Reviewed-by: Vishnu Dasa <vdasa@vmware.com> Signed-off-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02Merge branch 'davicom-w-1-fixes'Jakub Kicinski
Andrew Lunn says: ==================== davicom W=1 fixes Fixup various W=1 warnings, and then add COMPILE_TEST support, which explains why these where missed on the previous pass. ==================== Link: https://lore.kernel.org/r/20201031005833.1060316-1-andrew@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02drivers: net: davicom Add COMPILE_TEST supportAndrew Lunn
Improve the build testing of this davicom driver by enabling it when COMPILE_TEST is selected. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02drivers: net: davicom: Fixed unused but set variable with W=1Andrew Lunn
drivers/net/ethernet/davicom//dm9000.c: In function ‘dm9000_dumpblk_8bit’: drivers/net/ethernet/davicom//dm9000.c:235:6: warning: variable ‘tmp’ set but not used [-Wunused-but-set-variable] The driver needs to read packet data from the device even when the packet is known bad. There is no need to assign the data to a variable during this discard operation. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02drivers: net: tulip: Fix set but not used with W=1Andrew Lunn
When compiled for platforms other than __i386__ or __x86_64__: drivers/net/ethernet/dec/tulip/tulip_core.c: In function ‘tulip_init_one’: drivers/net/ethernet/dec/tulip/tulip_core.c:1296:13: warning: variable ‘last_irq’ set but not used [-Wunused-but-set-variable] 1296 | static int last_irq; Add more #if defined() to totally remove the code when not needed. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20201031005445.1060112-1-andrew@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02selftests: add test script for bareudp tunnelsGuillaume Nault
Test different encapsulation modes of the bareudp module: * Unicast MPLS, * IPv4 only, * IPv4 in multiproto mode (that is, IPv4 and IPv6), * IPv6. Each mode is tested with both an IPv4 and an IPv6 underlay. v2: * Add build dependencies in config file (Willem de Bruijn). * The MPLS test now uses its own IP addresses. This minimises the amount of cleanup between tests and simplifies the script. * Verify that iproute2 supports bareudp tunnels before running the script (and other minor usability improvements). Signed-off-by: Guillaume Nault <gnault@redhat.com> Link: https://lore.kernel.org/r/8abc0e58f8a7eeb404f82466505a73110bc43ab8.1604088587.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02net: dsa: qca8k: Fix port MTU settingJonathan McDowell
The qca8k only supports a switch-wide MTU setting, and the code to take the max of all ports was only looking at the port currently being set. Fix to examine all ports. Reported-by: DENG Qingfang <dqfext@gmail.com> Fixes: f58d2598cf70 ("net: dsa: qca8k: implement the port MTU callbacks") Signed-off-by: Jonathan McDowell <noodles@earth.li> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20201030183315.GA6736@earth.li Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02Merge branch 'add-ast2400-2500-phy-handle-support'Jakub Kicinski
Ivan Mikhaylov says: ==================== add ast2400/2500 phy-handle support This patch introduces ast2400/2500 phy-handle support with an embedded MDIO controller. At the current moment it is not possible to set options with this format on ast2400/2500: mac { phy-handle = <&phy>; phy-mode = "rgmii"; mdio { #address-cells = <1>; #size-cells = <0>; phy: ethernet-phy@0 { compatible = "ethernet-phy-idxxxx.yyyy"; reg = <0>; }; }; }; The patch fixes it and gets possible PHYs and register them with of_mdiobus_register. Changes from v3: 1. add dt-bindings description of MDIO node and phy-handle option with example. Changes from v2: 1. change manual phy interface type check on phy_interface_mode_is_rgmii function. 2. add err_phy_connect label. 3. split ftgmac100_destroy_mdio into ftgmac100_phy_disconnect and ftgmac100_destroy_mdio. 4. remove unneeded mdio_np checks. Changes from v1: 1. split one patch into two. ==================== Link: https://lore.kernel.org/r/20201030133707.12099-1-i.mikhaylov@yadro.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02dt-bindings: net: ftgmac100: describe phy-handle and MDIOIvan Mikhaylov
Add the phy-handle and MDIO description and add the example with PHY and MDIO nodes. Signed-off-by: Ivan Mikhaylov <i.mikhaylov@yadro.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02net: ftgmac100: add handling of mdio/phy nodes for ast2400/2500Ivan Mikhaylov
phy-handle can't be handled well for ast2400/2500 which has an embedded MDIO controller. Add ftgmac100_mdio_setup for ast2400/2500 and initialize PHYs from mdio child node with of_mdiobus_register. Signed-off-by: Ivan Mikhaylov <i.mikhaylov@yadro.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02net: ftgmac100: move phy connect out from ftgmac100_setup_mdioIvan Mikhaylov
Split MDIO registration and PHY connect into ftgmac100_setup_mdio and ftgmac100_mii_probe. Signed-off-by: Ivan Mikhaylov <i.mikhaylov@yadro.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02scsi: mpt3sas: Fix timeouts observed while reenabling IRQSreekanth Reddy
While reenabling the IRQ after irq poll there may be small time window where HBA firmware has posted some replies and raise the interrupts but driver has not received the interrupts. So we may observe I/O timeouts as the driver has not processed the replies as interrupts got missed while reenabling the IRQ. To fix this issue the driver has to go for one more round of processing the reply descriptors from reply descriptor post queue after enabling the IRQ. Link: https://lore.kernel.org/r/20201102072746.27410-1-sreekanth.reddy@broadcom.com Reported-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-02scsi: scsi_dh_alua: Avoid crash during alua_bus_detach()Hannes Reinecke
alua_bus_detach() might be running concurrently with alua_rtpg_work(), so we might trip over h->sdev == NULL and call BUG_ON(). The correct way of handling it is to not set h->sdev to NULL in alua_bus_detach(), and call rcu_synchronize() before the final delete to ensure that all concurrent threads have left the critical section. Then we can get rid of the BUG_ON() and replace it with a simple if condition. Link: https://lore.kernel.org/r/1600167537-12509-1-git-send-email-jitendra.khasdev@oracle.com Link: https://lore.kernel.org/r/20200924104559.26753-1-hare@suse.de Cc: Brian Bunker <brian@purestorage.com> Acked-by: Brian Bunker <brian@purestorage.com> Tested-by: Jitendra Khasdev <jitendra.khasdev@oracle.com> Reviewed-by: Jitendra Khasdev <jitendra.khasdev@oracle.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-02sctp: Fix COMM_LOST/CANT_STR_ASSOC err reporting on big-endian platformsPetr Malat
Commit 978aa0474115 ("sctp: fix some type cast warnings introduced since very beginning")' broke err reading from sctp_arg, because it reads the value as 32-bit integer, although the value is stored as 16-bit integer. Later this value is passed to the userspace in 16-bit variable, thus the user always gets 0 on big-endian platforms. Fix it by reading the __u16 field of sctp_arg union, as reading err field would produce a sparse warning. Fixes: 978aa0474115 ("sctp: fix some type cast warnings introduced since very beginning") Signed-off-by: Petr Malat <oss@malat.biz> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Link: https://lore.kernel.org/r/20201030132633.7045-1-oss@malat.biz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02tty: make FONTX ioctl use the tty pointer they were actually passedLinus Torvalds
Some of the font tty ioctl's always used the current foreground VC for their operations. Don't do that then. This fixes a data race on fg_console. Side note: both Michael Ellerman and Jiri Slaby point out that all these ioctls are deprecated, and should probably have been removed long ago, and everything seems to be using the KDFONTOP ioctl instead. In fact, Michael points out that it looks like busybox's loadfont program seems to have switched over to using KDFONTOP exactly _because_ of this bug (ahem.. 12 years ago ;-). Reported-by: Minh Yuan <yuanmingbuaa@gmail.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Jiri Slaby <jirislaby@kernel.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-02Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "Subsystems affected by this patch series: mm (memremap, memcg, slab-generic, kasan, mempolicy, pagecache, oom-kill, pagemap), kthread, signals, lib, epoll, and core-kernel" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: kernel/hung_task.c: make type annotations consistent epoll: add a selftest for epoll timeout race mm: always have io_remap_pfn_range() set pgprot_decrypted() mm, oom: keep oom_adj under or at upper limit when printing kthread_worker: prevent queuing delayed work from timer_fn when it is being canceled mm/truncate.c: make __invalidate_mapping_pages() static lib/crc32test: remove extra local_irq_disable/enable ptrace: fix task_join_group_stop() for the case when current is traced mm: mempolicy: fix potential pte_unmap_unlock pte error kasan: adopt KUNIT tests to SW_TAGS mode mm: memcg: link page counters to root if use_hierarchy is false mm: memcontrol: correct the NR_ANON_THPS counter of hierarchical memcg hugetlb_cgroup: fix reservation accounting mm/mremap_pages: fix static key devmap_managed_key updates
2020-11-02libbpf, hashmap: Fix undefined behavior in hash_bitsIan Rogers
If bits is 0, the case when the map is empty, then the >> is the size of the register which is undefined behavior - on x86 it is the same as a shift by 0. Fix by handling the 0 case explicitly and guarding calls to hash_bits for empty maps in hashmap__for_each_key_entry and hashmap__for_each_entry_safe. Fixes: e3b924224028 ("libbpf: add resizable non-thread safe internal hashmap") Suggested-by: Andrii Nakryiko <andriin@fb.com>, Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201029223707.494059-1-irogers@google.com
2020-11-02selftests/net: timestamping: add ptp v2 supportGrygorii Strashko
The timestamping tool is supporting now only PTPv1 (IEEE-1588 2002) while modern HW often supports also/only PTPv2. Hence timestamping tool is still useful for sanity testing of PTP drivers HW timestamping capabilities it's reasonable to upstate it to support PTPv2. This patch adds corresponding support which can be enabled by using new parameter "PTPV2". Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://lore.kernel.org/r/20201029190931.30883-1-grygorii.strashko@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02net: ethernet: ti: cpsw: disable PTPv1 hw timestamping advertisementGrygorii Strashko
The TI CPTS does not natively support PTPv1, only PTPv2. But, as it happens, the CPTS can provide HW timestamp for PTPv1 Sync messages, because CPTS HW parser looks for PTP messageType id in PTP message octet 0 which value is 0 for PTPv1. As result, CPTS HW can detect Sync messages for PTPv1 and PTPv2 (Sync messageType = 0 for both), but it fails for any other PTPv1 messages (Delay_req/resp) and will return PTP messageType id 0 for them. The commit e9523a5a32a1 ("net: ethernet: ti: cpsw: enable HWTSTAMP_FILTER_PTP_V1_L4_EVENT filter") added PTPv1 hw timestamping advertisement by mistake, only to make Linux Kernel "timestamping" utility work, and this causes issues with only PTPv1 compatible HW/SW - Sync HW timestamped, but Delay_req/resp are not. Hence, fix it disabling PTPv1 hw timestamping advertisement, so only PTPv1 compatible HW/SW can properly roll back to SW timestamping. Fixes: e9523a5a32a1 ("net: ethernet: ti: cpsw: enable HWTSTAMP_FILTER_PTP_V1_L4_EVENT filter") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://lore.kernel.org/r/20201029190910.30789-1-grygorii.strashko@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02vfio/fsl-mc: return -EFAULT if copy_to_user() failsDan Carpenter
The copy_to_user() function returns the number of bytes remaining to be copied, but this code should return -EFAULT. Fixes: df747bcd5b21 ("vfio/fsl-mc: Implement VFIO_DEVICE_GET_REGION_INFO ioctl call") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Diana Craciun <diana.craciun@oss.nxp.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-02vfio/type1: Use the new helper to find vfio_groupZenghui Yu
When attaching a new group to the container, let's use the new helper vfio_iommu_find_iommu_group() to check if it's already attached. There is no functional change. Also take this chance to add a missing blank line. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-02tracing: Make -ENOMEM the default error for parse_synth_field()Steven Rostedt (VMware)
parse_synth_field() returns a pointer and requires that errors get surrounded by ERR_PTR(). The ret variable is initialized to zero, but should never be used as zero, and if it is, it could cause a false return code and produce a NULL pointer dereference. It makes no sense to set ret to zero. Set ret to -ENOMEM (the most common error case), and have any other errors set it to something else. This removes the need to initialize ret on *every* error branch. Fixes: 761a8c58db6b ("tracing, synthetic events: Replace buggy strcat() with seq_buf operations") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-02ring-buffer: Fix recursion protection transitions between interrupt contextSteven Rostedt (VMware)
The recursion protection of the ring buffer depends on preempt_count() to be correct. But it is possible that the ring buffer gets called after an interrupt comes in but before it updates the preempt_count(). This will trigger a false positive in the recursion code. Use the same trick from the ftrace function callback recursion code which uses a "transition" bit that gets set, to allow for a single recursion for to handle transitions between contexts. Cc: stable@vger.kernel.org Fixes: 567cd4da54ff4 ("ring-buffer: User context bit recursion checking") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-02gfs2: Don't call cancel_delayed_work_sync from within delete work functionAndreas Gruenbacher
Right now, we can end up calling cancel_delayed_work_sync from within delete_work_func via gfs2_lookup_by_inum -> gfs2_inode_lookup -> gfs2_cancel_delete_work. When that happens, it will result in a deadlock. Instead, gfs2_inode_lookup should skip the call to gfs2_cancel_delete_work when called from delete_work_func (blktype == GFS2_BLKST_UNLINKED). Reported-by: Alexander Ahring Oder Aring <aahringo@redhat.com> Fixes: a0e3cc65fa29 ("gfs2: Turn gl_delete into a delayed work") Cc: stable@vger.kernel.org # v5.8+ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2020-11-02net: 9p: Fix kerneldoc warnings of missing parameters etcAndrew Lunn
net/9p/client.c:420: warning: Function parameter or member 'c' not described in 'p9_client_cb' net/9p/client.c:420: warning: Function parameter or member 'req' not described in 'p9_client_cb' net/9p/client.c:420: warning: Function parameter or member 'status' not described in 'p9_client_cb' net/9p/client.c:568: warning: Function parameter or member 'uidata' not described in 'p9_check_zc_errors' net/9p/trans_common.c:23: warning: Function parameter or member 'nr_pages' not described in 'p9_release_pages' net/9p/trans_common.c:23: warning: Function parameter or member 'pages' not described in 'p9_release_pages' net/9p/trans_fd.c:132: warning: Function parameter or member 'rreq' not described in 'p9_conn' net/9p/trans_fd.c:132: warning: Function parameter or member 'wreq' not described in 'p9_conn' net/9p/trans_fd.c:56: warning: Function parameter or member 'privport' not described in 'p9_fd_opts' net/9p/trans_rdma.c:113: warning: Function parameter or member 'cqe' not described in 'p9_rdma_context' net/9p/trans_rdma.c:129: warning: Function parameter or member 'privport' not described in 'p9_rdma_opts' net/9p/trans_virtio.c:215: warning: Function parameter or member 'limit' not described in 'pack_sg_list_p' net/9p/trans_virtio.c:83: warning: Function parameter or member 'chan_list' not described in 'virtio_chan' net/9p/trans_virtio.c:83: warning: Function parameter or member 'p9_max_pages' not described in 'virtio_chan' net/9p/trans_virtio.c:83: warning: Function parameter or member 'ring_bufs_avail' not described in 'virtio_chan' net/9p/trans_virtio.c:83: warning: Function parameter or member 'tag' not described in 'virtio_chan' net/9p/trans_virtio.c:83: warning: Function parameter or member 'vc_wq' not described in 'virtio_chan' Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Dominique Martinet <asmadeus@codewreck.org> Link: https://lore.kernel.org/r/20201031182655.1082065-1-andrew@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02kernel/hung_task.c: make type annotations consistentLukas Bulwahn
Commit 32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler") removed various __user annotations from function signatures as part of its refactoring. It also removed the __user annotation for proc_dohung_task_timeout_secs() at its declaration in sched/sysctl.h, but not at its definition in kernel/hung_task.c. Hence, sparse complains: kernel/hung_task.c:271:5: error: symbol 'proc_dohung_task_timeout_secs' redeclared with different type (incompatible argument 3 (different address spaces)) Adjust the annotation at the definition fitting to that refactoring to make sparse happy again, which also resolves this warning from sparse: kernel/hung_task.c:277:52: warning: incorrect type in argument 3 (different address spaces) kernel/hung_task.c:277:52: expected void * kernel/hung_task.c:277:52: got void [noderef] __user *buffer No functional change. No change in object code. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrey Ignatov <rdna@fb.com> Link: https://lkml.kernel.org/r/20201028130541.20320-1-lukas.bulwahn@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-02epoll: add a selftest for epoll timeout raceSoheil Hassas Yeganeh
Add a test case to ensure an event is observed by at least one poller when an epoll timeout is used. Signed-off-by: Guantao Liu <guantaol@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Khazhismel Kumykov <khazhy@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Davidlohr Bueso <dave@stgolabs.net> Link: https://lkml.kernel.org/r/20201028180202.952079-2-soheil.kdev@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-02mm: always have io_remap_pfn_range() set pgprot_decrypted()Jason Gunthorpe
The purpose of io_remap_pfn_range() is to map IO memory, such as a memory mapped IO exposed through a PCI BAR. IO devices do not understand encryption, so this memory must always be decrypted. Automatically call pgprot_decrypted() as part of the generic implementation. This fixes a bug where enabling AMD SME causes subsystems, such as RDMA, using io_remap_pfn_range() to expose BAR pages to user space to fail. The CPU will encrypt access to those BAR pages instead of passing unencrypted IO directly to the device. Places not mapping IO should use remap_pfn_range(). Fixes: aca20d546214 ("x86/mm: Add support to make use of Secure Memory Encryption") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Dave Young" <dyoung@redhat.com> Cc: Alexander Potapenko <glider@google.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Ingo Molnar <mingo@kernel.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Toshimitsu Kani <toshi.kani@hpe.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/0-v1-025d64bdf6c4+e-amd_sme_fix_jgg@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-02mm, oom: keep oom_adj under or at upper limit when printingCharles Haithcock
For oom_score_adj values in the range [942,999], the current calculations will print 16 for oom_adj. This patch simply limits the output so output is inline with docs. Signed-off-by: Charles Haithcock <chaithco@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Link: https://lkml.kernel.org/r/20201020165130.33927-1-chaithco@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-02kthread_worker: prevent queuing delayed work from timer_fn when it is being ↵Zqiang
canceled There is a small race window when a delayed work is being canceled and the work still might be queued from the timer_fn: CPU0 CPU1 kthread_cancel_delayed_work_sync() __kthread_cancel_work_sync() __kthread_cancel_work() work->canceling++; kthread_delayed_work_timer_fn() kthread_insert_work(); BUG: kthread_insert_work() should not get called when work->canceling is set. Signed-off-by: Zqiang <qiang.zhang@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20201014083030.16895-1-qiang.zhang@windriver.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-02mm/truncate.c: make __invalidate_mapping_pages() staticJason Yan
Fix the following sparse warning: mm/truncate.c:531:15: warning: symbol '__invalidate_mapping_pages' was not declared. Should it be static? Fixes: eb1d7a65f08a ("mm, fadvise: improve the expensive remote LRU cache draining after FADV_DONTNEED") Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lkml.kernel.org/r/20201015054808.2445904-1-yanaijie@huawei.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-02lib/crc32test: remove extra local_irq_disable/enableVasily Gorbik
Commit 4d004099a668 ("lockdep: Fix lockdep recursion") uncovered the following issue in lib/crc32test reported on s390: BUG: using __this_cpu_read() in preemptible [00000000] code: swapper/0/1 caller is lockdep_hardirqs_on_prepare+0x48/0x270 CPU: 6 PID: 1 Comm: swapper/0 Not tainted 5.9.0-next-20201015-15164-g03d992bd2de6 #19 Hardware name: IBM 3906 M04 704 (LPAR) Call Trace: lockdep_hardirqs_on_prepare+0x48/0x270 trace_hardirqs_on+0x9c/0x1b8 crc32_test.isra.0+0x170/0x1c0 crc32test_init+0x1c/0x40 do_one_initcall+0x40/0x130 do_initcalls+0x126/0x150 kernel_init_freeable+0x1f6/0x230 kernel_init+0x22/0x150 ret_from_fork+0x24/0x2c no locks held by swapper/0/1. Remove extra local_irq_disable/local_irq_enable helpers calls. Fixes: 5fb7f87408f1 ("lib: add module support to crc32 tests") Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lkml.kernel.org/r/patch.git-4369da00c06e.your-ad-here.call-01602859837-ext-1679@work.hours Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-02ptrace: fix task_join_group_stop() for the case when current is tracedOleg Nesterov
This testcase #include <stdio.h> #include <unistd.h> #include <signal.h> #include <sys/ptrace.h> #include <sys/wait.h> #include <pthread.h> #include <assert.h> void *tf(void *arg) { return NULL; } int main(void) { int pid = fork(); if (!pid) { kill(getpid(), SIGSTOP); pthread_t th; pthread_create(&th, NULL, tf, NULL); return 0; } waitpid(pid, NULL, WSTOPPED); ptrace(PTRACE_SEIZE, pid, 0, PTRACE_O_TRACECLONE); waitpid(pid, NULL, 0); ptrace(PTRACE_CONT, pid, 0,0); waitpid(pid, NULL, 0); int status; int thread = waitpid(-1, &status, 0); assert(thread > 0 && thread != pid); assert(status == 0x80137f); return 0; } fails and triggers WARN_ON_ONCE(!signr) in do_jobctl_trap(). This is because task_join_group_stop() has 2 problems when current is traced: 1. We can't rely on the "JOBCTL_STOP_PENDING" check, a stopped tracee can be woken up by debugger and it can clone another thread which should join the group-stop. We need to check group_stop_count || SIGNAL_STOP_STOPPED. 2. If SIGNAL_STOP_STOPPED is already set, we should not increment sig->group_stop_count and add JOBCTL_STOP_CONSUME. The new thread should stop without another do_notify_parent_cldstop() report. To clarify, the problem is very old and we should blame ptrace_init_task(). But now that we have task_join_group_stop() it makes more sense to fix this helper to avoid the code duplication. Reported-by: syzbot+3485e3773f7da290eecc@syzkaller.appspotmail.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Christian Brauner <christian@brauner.io> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Zhiqiang Liu <liuzhiqiang26@huawei.com> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20201019134237.GA18810@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-02mm: mempolicy: fix potential pte_unmap_unlock pte errorShijie Luo
When flags in queue_pages_pte_range don't have MPOL_MF_MOVE or MPOL_MF_MOVE_ALL bits, code breaks and passing origin pte - 1 to pte_unmap_unlock seems like not a good idea. queue_pages_pte_range can run in MPOL_MF_MOVE_ALL mode which doesn't migrate misplaced pages but returns with EIO when encountering such a page. Since commit a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified") and early break on the first pte in the range results in pte_unmap_unlock on an underflow pte. This can lead to lockups later on when somebody tries to lock the pte resp. page_table_lock again.. Fixes: a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified") Signed-off-by: Shijie Luo <luoshijie1@huawei.com> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Feilong Lin <linfeilong@huawei.com> Cc: Shijie Luo <luoshijie1@huawei.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20201019074853.50856-1-luoshijie1@huawei.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-02kasan: adopt KUNIT tests to SW_TAGS modeAndrey Konovalov
Now that we have KASAN-KUNIT tests integration, it's easy to see that some KASAN tests are not adopted to the SW_TAGS mode and are failing. Adjust the allocation size for kasan_memchr() and kasan_memcmp() by roung it up to OOB_TAG_OFF so the bad access ends up in a separate memory granule. Add a new kmalloc_uaf_16() tests that relies on UAF, and a new kasan_bitops_tags() test that is tailored to tag-based mode, as it's hard to adopt the existing kmalloc_oob_16() and kasan_bitops_generic() (renamed from kasan_bitops()) without losing the precision. Add new kmalloc_uaf_16() and kasan_bitops_uaf() tests that rely on UAFs, as it's hard to adopt the existing kmalloc_oob_16() and kasan_bitops_oob() (rename from kasan_bitops()) without losing the precision. Disable kasan_global_oob() and kasan_alloca_oob_left/right() as SW_TAGS mode doesn't instrument globals nor dynamic allocas. Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: David Gow <davidgow@google.com> Link: https://lkml.kernel.org/r/76eee17b6531ca8b3ca92b240cb2fd23204aaff7.1603129942.git.andreyknvl@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-02mm: memcg: link page counters to root if use_hierarchy is falseRoman Gushchin
Richard reported a warning which can be reproduced by running the LTP madvise6 test (cgroup v1 in the non-hierarchical mode should be used): WARNING: CPU: 0 PID: 12 at mm/page_counter.c:57 page_counter_uncharge (mm/page_counter.c:57 mm/page_counter.c:50 mm/page_counter.c:156) Modules linked in: CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.9.0-rc7-22-default #77 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-48-gd9c812d-rebuilt.opensuse.org 04/01/2014 Workqueue: events drain_local_stock RIP: 0010:page_counter_uncharge (mm/page_counter.c:57 mm/page_counter.c:50 mm/page_counter.c:156) Call Trace: __memcg_kmem_uncharge (mm/memcontrol.c:3022) drain_obj_stock (./include/linux/rcupdate.h:689 mm/memcontrol.c:3114) drain_local_stock (mm/memcontrol.c:2255) process_one_work (./arch/x86/include/asm/jump_label.h:25 ./include/linux/jump_label.h:200 ./include/trace/events/workqueue.h:108 kernel/workqueue.c:2274) worker_thread (./include/linux/list.h:282 kernel/workqueue.c:2416) kthread (kernel/kthread.c:292) ret_from_fork (arch/x86/entry/entry_64.S:300) The problem occurs because in the non-hierarchical mode non-root page counters are not linked to root page counters, so the charge is not propagated to the root memory cgroup. After the removal of the original memory cgroup and reparenting of the object cgroup, the root cgroup might be uncharged by draining a objcg stock, for example. It leads to an eventual underflow of the charge and triggers a warning. Fix it by linking all page counters to corresponding root page counters in the non-hierarchical mode. Please note, that in the non-hierarchical mode all objcgs are always reparented to the root memory cgroup, even if the hierarchy has more than 1 level. This patch doesn't change it. The patch also doesn't affect how the hierarchical mode is working, which is the only sane and truly supported mode now. Thanks to Richard for reporting, debugging and providing an alternative version of the fix! Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API") Reported-by: <ltp@lists.linux.it> Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Michal Koutný <mkoutny@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20201026231326.3212225-1-guro@fb.com Debugged-by: Richard Palethorpe <rpalethorpe@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-02mm: memcontrol: correct the NR_ANON_THPS counter of hierarchical memcgzhongjiang-ali
memcg_page_state will get the specified number in hierarchical memcg, It should multiply by HPAGE_PMD_NR rather than an page if the item is NR_ANON_THPS. [akpm@linux-foundation.org: fix printk warning] [akpm@linux-foundation.org: use u64 cast, per Michal] Fixes: 468c398233da ("mm: memcontrol: switch to native NR_ANON_THPS counter") Signed-off-by: zhongjiang-ali <zhongjiang-ali@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Link: https://lkml.kernel.org/r/1603722395-72443-1-git-send-email-zhongjiang-ali@linux.alibaba.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-02hugetlb_cgroup: fix reservation accountingMike Kravetz
Michal Privoznik was using "free page reporting" in QEMU/virtio-balloon with hugetlbfs and hit the warning below. QEMU with free page hinting uses fallocate(FALLOC_FL_PUNCH_HOLE) to discard pages that are reported as free by a VM. The reporting granularity is in pageblock granularity. So when the guest reports 2M chunks, we fallocate(FALLOC_FL_PUNCH_HOLE) one huge page in QEMU. WARNING: CPU: 7 PID: 6636 at mm/page_counter.c:57 page_counter_uncharge+0x4b/0x50 Modules linked in: ... CPU: 7 PID: 6636 Comm: qemu-system-x86 Not tainted 5.9.0 #137 Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS PRO/X570 AORUS PRO, BIOS F21 07/31/2020 RIP: 0010:page_counter_uncharge+0x4b/0x50 ... Call Trace: hugetlb_cgroup_uncharge_file_region+0x4b/0x80 region_del+0x1d3/0x300 hugetlb_unreserve_pages+0x39/0xb0 remove_inode_hugepages+0x1a8/0x3d0 hugetlbfs_fallocate+0x3c4/0x5c0 vfs_fallocate+0x146/0x290 __x64_sys_fallocate+0x3e/0x70 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Investigation of the issue uncovered bugs in hugetlb cgroup reservation accounting. This patch addresses the found issues. Fixes: 075a61d07a8e ("hugetlb_cgroup: add accounting for shared mappings") Reported-by: Michal Privoznik <mprivozn@redhat.com> Co-developed-by: David Hildenbrand <david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: <stable@vger.kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Tejun Heo <tj@kernel.org> Link: https://lkml.kernel.org/r/20201021204426.36069-1-mike.kravetz@oracle.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-02mm/mremap_pages: fix static key devmap_managed_key updatesRalph Campbell
commit 6f42193fd86e ("memremap: don't use a separate devm action for devmap_managed_enable_get") changed the static key updates such that we now call devmap_managed_enable_put() without doing the equivalent devmap_managed_enable_get(). devmap_managed_enable_get() is only called for MEMORY_DEVICE_PRIVATE and MEMORY_DEVICE_FS_DAX, But memunmap_pages() get called for other pgmap types too. This results in the below warning when switching between system-ram and devdax mode for devdax namespace. jump label: negative count! WARNING: CPU: 52 PID: 1335 at kernel/jump_label.c:235 static_key_slow_try_dec+0x88/0xa0 Modules linked in: .... NIP static_key_slow_try_dec+0x88/0xa0 LR static_key_slow_try_dec+0x84/0xa0 Call Trace: static_key_slow_try_dec+0x84/0xa0 __static_key_slow_dec_cpuslocked+0x34/0xd0 static_key_slow_dec+0x54/0xf0 memunmap_pages+0x36c/0x500 devm_action_release+0x30/0x50 release_nodes+0x2f4/0x3e0 device_release_driver_internal+0x17c/0x280 bus_remove_device+0x124/0x210 device_del+0x1d4/0x530 unregister_dev_dax+0x48/0xe0 devm_action_release+0x30/0x50 release_nodes+0x2f4/0x3e0 device_release_driver_internal+0x17c/0x280 unbind_store+0x130/0x170 drv_attr_store+0x40/0x60 sysfs_kf_write+0x6c/0xb0 kernfs_fop_write+0x118/0x280 vfs_write+0xe8/0x2a0 ksys_write+0x84/0x140 system_call_exception+0x120/0x270 system_call_common+0xf0/0x27c Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Link: https://lkml.kernel.org/r/20201023183222.13186-1-rcampbell@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-02ARC: [plat-hsdk] Remap CCMs super early in asm boot trampolineVineet Gupta
ARC HSDK platform stopped booting on released v5.10-rc1, getting stuck in startup of non master SMP cores. This was bisected to upstream commit 7fef431be9c9ac25 "(mm/page_alloc: place pages to tail in __free_pages_core())" That commit itself is harmless, it just exposed a subtle assumption in our platform code (hence CC'ing linux-mm just as FYI in case some other arches / platforms trip on it). The upstream commit is semantically disruptive as it reverses the order of page allocations (actually it can be good test for hardware verification to exercise different memory patterns altogether). For ARC HSDK platform that meant a remapped memory region (pertaining to unused Closely Coupled Memory) started getting used early for dynamice allocations, while not effectively remapped on all the cores, triggering memory error exception on those cores. The fix is to move the CCM remapping from early platform code to to early core boot code. And while it is undesirable to riddle common boot code with platform quirks, there is no other way to do this since the faltering code involves setting up stack itself so even function calls are not allowed at that point. If anyone is interested, all the gory details can be found at Link below. Link: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/32 Cc: David Hildenbrand <david@redhat.com> Cc: linux-mm@kvack.org Signed-off-by: Vineet Gupta <vgupta@synopsys.com>