summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2020-01-17Merge branch 'fixup-thunderx-hierarchy' into develLinus Walleij
2020-01-17Merge tag 'v5.5-rc6' into develLinus Walleij
Linux 5.5-rc6
2020-01-17Merge tag 'extcon-next-for-5.6' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into char-misc-next Chanwoo writes: Update extcon for 5.6 Detailed description for this pull request: 1. Remove unneeded 'extern' keyword from extcon.h header file 2. Clean-up the extcon provider - Clean-up the code for readability of extcon-arizona/sm5502.c * tag 'extcon-next-for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon: extcon: Remove unneeded extern keyword from extcon.h extcon: sm5502: Remove unneeded semicolon extcon: arizona: Factor out microphone and button detection extcon: arizona: Factor out microphone impedance into a function extcon: arizona: Invert logic of check in arizona_hpdet_do_id extcon: arizona: Remove excessive WARN_ON extcon: arizona: Remove unnecessary sets of ACCDET_MODE extcon: arizona: Tidy up transition from mic to headphone detect extcon: arizona: Clear jack status regardless of detection type extcon: arizona: Move pdata extraction to probe extcon: arizona: Make rev A register sequences atomic extcon: arizona: Correct clean up if arizona_identify_headphone fails
2020-01-17Merge tag 'phy-for-5.6_v2' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/kishon/linux-phy into usb-next Kishon writes: phy: for 5.6 *) Add support in PHY core to create link between PHY consumer and PHY provider *) Add DisplayPort PHY configuration set to be used for negotiating the configurations to be used between DisplayPort controller and DisplayPort PHY *) Add PHY wrapper driver (configure inputs to Cadence Sierra PHY) for TI's J721E SoC and adapt Cadence Sierra PHY driver to be used for J721E SoC (Supports USB and PCIe) *) Add PHY driver for eMMC PHY in Intel LGM SoC *) Add PHY support for 7216 and 7211 Broadcom SoCs which uses the new Synopsys USB Controller *) Add support for 16nm SATA PHY present in Broadcom 7216 SoC *) Fix lost packet issue, fix MDIO from getting inaccessible, fix occasional transaction failures, fix USB driver from crashing in Broadcom USB PHY driver *) Fix missing PCS SW reset in UFS PHY of Qualcomm SM8150 *) Use "struct phy_configure_opts_mipi_dphy" to pass parameters from display controller to rockchip-inno-dsidphy *) Other cleanups including compile testing for some of the PHY drivers, fixing Kconfig indentation, duplicate writes in drivers etc., Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> * tag 'phy-for-5.6_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/kishon/linux-phy: (54 commits) dt-bindings: phy: Add PHY_TYPE_DP definition phy: ti: j721e-wiz: Fix return value check in wiz_probe() dt-bindings: usb: Convert Allwinner A80 USB PHY controller to a schema phy: intel-lgm-emmc: Fix warning by adding missing MODULE_LICENSE phy: ti: j721e-wiz: Manage typec-gpio-dir dt-bindings: phy: ti,phy-j721e-wiz: Add Type-C dir GPIO phy: cadence: Sierra: add phy_reset hook phy: cadence: Sierra: remove redundant initialization of pointer regmap phy: Add DisplayPort configuration options phy: Enable compile testing for some of drivers phy: mediatek: Fix Kconfig indentation phy: intel-lgm-emmc: Add support for eMMC PHY dt-bindings: phy: intel-emmc-phy: Add YAML schema for LGM eMMC PHY phy: ti: j721e-wiz: Add support for WIZ module present in TI J721E SoC dt-bindings: phy: Document WIZ (SERDES wrapper) bindings phy: cadence: Sierra: Use correct dev pointer in cdns_sierra_phy_remove() phy: cadence: Sierra: Set cmn_refclk_dig_div/cmn_refclk1_dig_div frequency to 25MHz phy: cadence: Sierra: Change MAX_LANES of Sierra to 16 phy: cadence: Sierra: Check for PLL lock during PHY power on phy: cadence: Sierra: Get reset control "array" for each link ...
2020-01-16xdp: Use bulking for non-map XDP_REDIRECT and consolidate code pathsToke Høiland-Jørgensen
Since the bulk queue used by XDP_REDIRECT now lives in struct net_device, we can re-use the bulking for the non-map version of the bpf_redirect() helper. This is a simple matter of having xdp_do_redirect_slow() queue the frame on the bulk queue instead of sending it out with __bpf_tx_xdp(). Unfortunately we can't make the bpf_redirect() helper return an error if the ifindex doesn't exit (as bpf_redirect_map() does), because we don't have a reference to the network namespace of the ingress device at the time the helper is called. So we have to leave it as-is and keep the device lookup in xdp_do_redirect_slow(). Since this leaves less reason to have the non-map redirect code in a separate function, so we get rid of the xdp_do_redirect_slow() function entirely. This does lose us the tracepoint disambiguation, but fortunately the xdp_redirect and xdp_redirect_map tracepoints use the same tracepoint entry structures. This means both can contain a map index, so we can just amend the tracepoint definitions so we always emit the xdp_redirect(_err) tracepoints, but with the map ID only populated if a map is present. This means we retire the xdp_redirect_map(_err) tracepoints entirely, but keep the definitions around in case someone is still listening for them. With this change, the performance of the xdp_redirect sample program goes from 5Mpps to 8.4Mpps (a 68% increase). Since the flush functions are no longer map-specific, rename the flush() functions to drop _map from their names. One of the renamed functions is the xdp_do_flush_map() callback used in all the xdp-enabled drivers. To keep from having to update all drivers, use a #define to keep the old name working, and only update the virtual drivers in this patch. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/157918768505.1458396.17518057312953572912.stgit@toke.dk
2020-01-16xdp: Move devmap bulk queue into struct net_deviceToke Høiland-Jørgensen
Commit 96360004b862 ("xdp: Make devmap flush_list common for all map instances"), changed devmap flushing to be a global operation instead of a per-map operation. However, the queue structure used for bulking was still allocated as part of the containing map. This patch moves the devmap bulk queue into struct net_device. The motivation for this is reusing it for the non-map variant of XDP_REDIRECT, which will be changed in a subsequent commit. To avoid other fields of struct net_device moving to different cache lines, we also move a couple of other members around. We defer the actual allocation of the bulk queue structure until the NETDEV_REGISTER notification devmap.c. This makes it possible to check for ndo_xdp_xmit support before allocating the structure, which is not possible at the time struct net_device is allocated. However, we keep the freeing in free_netdev() to avoid adding another RCU callback on NETDEV_UNREGISTER. Because of this change, we lose the reference back to the map that originated the redirect, so change the tracepoint to always return 0 as the map ID and index. Otherwise no functional change is intended with this patch. After this patch, the relevant part of struct net_device looks like this, according to pahole: /* --- cacheline 14 boundary (896 bytes) --- */ struct netdev_queue * _tx __attribute__((__aligned__(64))); /* 896 8 */ unsigned int num_tx_queues; /* 904 4 */ unsigned int real_num_tx_queues; /* 908 4 */ struct Qdisc * qdisc; /* 912 8 */ unsigned int tx_queue_len; /* 920 4 */ spinlock_t tx_global_lock; /* 924 4 */ struct xdp_dev_bulk_queue * xdp_bulkq; /* 928 8 */ struct xps_dev_maps * xps_cpus_map; /* 936 8 */ struct xps_dev_maps * xps_rxqs_map; /* 944 8 */ struct mini_Qdisc * miniq_egress; /* 952 8 */ /* --- cacheline 15 boundary (960 bytes) --- */ struct hlist_head qdisc_hash[16]; /* 960 128 */ /* --- cacheline 17 boundary (1088 bytes) --- */ struct timer_list watchdog_timer; /* 1088 40 */ /* XXX last struct has 4 bytes of padding */ int watchdog_timeo; /* 1128 4 */ /* XXX 4 bytes hole, try to pack */ struct list_head todo_list; /* 1136 16 */ /* --- cacheline 18 boundary (1152 bytes) --- */ Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Björn Töpel <bjorn.topel@intel.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/157918768397.1458396.12673224324627072349.stgit@toke.dk
2020-01-16Merge tag 'omap-for-v5.6/ti-sysc-signed' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/drivers ti-sysc driver changes for omaps for v5.6 merge window Few changes to implement quirk handling for cases where we need to block clockdomain autoidle, drop old MMU specific quirks, and simplify the return code for sysc_init_resets(). * tag 'omap-for-v5.6/ti-sysc-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: bus: ti-sysc: Use PTR_ERR_OR_ZERO() to simplify code bus: ti-sysc: Drop MMU quirks bus: ti-sysc: Implement quirk handling for CLKDM_NOAUTO Link: https://lore.kernel.org/r/pull-1579200367-372444@atomide.com-3 Signed-off-by: Olof Johansson <olof@lixom.net>
2020-01-16net/mlx5: Allow creating autogroups with reserved entriesPaul Blakey
Exclude the last n entries for an autogrouped flow table. Reserving entries at the end of the FT will ensure that this FG will be the last to be evaluated. This will be used in the next patch to create a miss group enabling custom actions on FT miss. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16net/mlx5: Add ignore level support fwd to table rulesPaul Blakey
If user sets ignore flow level flag on a rule, that rule can point to a flow table of any level, including those with levels equal or less than the level of the flow table it is added on. This with unamanged tables will be used to create a FDB chain/prio hierarchy much larger than currently supported level range. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16net/mlx5: fs_core: Introduce unmanaged flow tablesPaul Blakey
Currently, Most of the steering tree is statically declared ahead of time, with steering prios instances allocated for each fdb chain to assign max number of levels for each of them. This allows fs_core to manage the connections and levels of the flow tables hierarcy to prevent loops, but restricts us with the number of supported chains and priorities. Introduce unmananged flow tables, allowing the user to manage the flow table connections. A unamanged table is detached from the fs_core flow table hierarcy, and is only connected back to the hierarchy by explicit FTEs forward actions. This will be used together with firmware that supports ignoring the flow table levels to increase the number of supported chains and prios. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16Merge branch 'mlx5-next' of ↵Saeed Mahameed
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux This merge syncs with mlx5-next latest HW bits and layout updates for next features, in addition one patch that improves mlx5_create_auto_grouped_flow_table() API across all mlx5 users. * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: Refactor mlx5_create_auto_grouped_flow_table net/mlx5e: Add discard counters per priority net/mlx5e: Expose FEC feilds and related capability bit net/mlx5: Add mlx5_ifc definitions for connection tracking support net/mlx5: Add copy header action struct layout net/mlx5: Expose resource dump register mapping net/mlx5: Add structures and defines for MIRC register net/mlx5: Read MCAM register groups 1 and 2 net/mlx5: Add structures layout for new MCAM access reg groups net/mlx5: Expose vDPA emulation device capabilities net/mlx5: Add Virtio Emulation related device capabilities Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16Merge tag 'qcom-drivers-for-5.6' of ↵Olof Johansson
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/drivers Qualcomm driver updates for v5.6 * SCM major refactoring and cleanup * Properly flag active only power domains as active only * Add SC7180 and SM8150 RPMH power domains * Return EPROBE_DEFER from QMI if packet family is not yet available * tag 'qcom-drivers-for-5.6' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: (27 commits) firmware: qcom_scm: Dynamically support SMCCC and legacy conventions firmware: qcom_scm: Remove thin wrappers firmware: qcom_scm: Order functions, definitions by service/command firmware: qcom_scm-32: Add device argument to atomic calls firmware: qcom_scm-32: Create common legacy atomic call firmware: qcom_scm-32: Move SMCCC register filling to qcom_scm_call firmware: qcom_scm-32: Use qcom_scm_desc in non-atomic calls firmware: qcom_scm-32: Add funcnum IDs firmware: qcom_scm-32: Use SMC arch wrappers firmware: qcom_scm-64: Improve SMC convention detection firmware: qcom_scm-64: Move SMC register filling to qcom_scm_call_smccc firmware: qcom_scm-64: Add SCM results struct firmware: qcom_scm-64: Move svc/cmd/owner into qcom_scm_desc firmware: qcom_scm-64: Make SMC macros less magical firmware: qcom_scm: Remove unused qcom_scm_get_version firmware: qcom_scm: Apply consistent naming scheme to command IDs firmware: qcom_scm: Rename macros and structures soc: qcom: rpmhpd: Set 'active_only' for active only power domains firmware: scm: Add stubs for OCMEM and restore_sec_cfg_available dt-bindings: power: rpmpd: Convert rpmpd bindings to yaml ... Link: https://lore.kernel.org/r/20200113204405.GD3325@yoga Signed-off-by: Olof Johansson <olof@lixom.net>
2020-01-16net/mlx5: Refactor mlx5_create_auto_grouped_flow_tablePaul Blakey
Refactor mlx5_create_auto_grouped_flow_table() to use ft_attr param which already carries the max_fte, prio and flags memebers, and is used the same in similar mlx5_create_flow_table() function. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-17Merge back cpuidle material for v5.6.Rafael J. Wysocki
2020-01-16net/mlx5e: Add discard counters per priorityAharon Landau
Add counters that count (per priority) the number of received packets that dropped due to lack of buffers on a physical port. If this counter is increasing, it implies that the adapter is congested and cannot absorb the traffic coming from the network. Signed-off-by: Aharon Landau <aharonl@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16net/mlx5e: Expose FEC feilds and related capability bitAya Levin
Introduce 50G per lane FEC modes capability bit and newly supported fields in PPLM register which allow this configuration. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16net/mlx5: Add mlx5_ifc definitions for connection tracking supportPaul Blakey
Add the required hardware definitions to mlx5_ifc: ignore_flow_level, registers, copy_header, and fwd_and_modify cap. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Sholomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16net/mlx5: Add copy header action struct layoutHamdan Igbaria
Add definition for copy header action, copy action is used to copy header fields from source to destination. Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com> Signed-off-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16net/mlx5: Expose resource dump register mappingAya Levin
Add new register enumeration for resource dump. Add layout mapping for resource dump: access command and response. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16net/mlx5: Add structures and defines for MIRC registerEran Ben Elisha
Add needed structures, layouts and defines for MIRC (Management Image Re-activation Control) register. This structure will be used for the FSM reactivation flow in the downstream patches. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16net/mlx5: Read MCAM register groups 1 and 2Eran Ben Elisha
On load, Driver caches MCAM (Management Capabilities Mask Register) registers. in addition to the only MCAM register group (0) the driver already reads, here we add support for reading groups 1 and 2. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16net/mlx5: Add structures layout for new MCAM access reg groupsEran Ben Elisha
MCAM has 3 access_reg_groups (0-2). Defines data structures in order to read and parse access_reg_groups #1 and #2. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16PM: suspend: Add sysfs attribute to control the "sync on suspend" behaviorJonas Meurer
The sysfs attribute `/sys/power/sync_on_suspend` controls, whether or not filesystems are synced by the kernel before system suspend. Congruously, the behaviour of build-time switch CONFIG_SUSPEND_SKIP_SYNC is slightly changed: It now defines the run-tim default for the new sysfs attribute `/sys/power/sync_on_suspend`. The run-time attribute is added because the existing corresponding build-time Kconfig flag for (`CONFIG_SUSPEND_SKIP_SYNC`) is not flexible enough. E.g. Linux distributions that provide pre-compiled kernels usually want to stick with the default (sync filesystems before suspend) but under special conditions this needs to be changed. One example for such a special condition is user-space handling of suspending block devices (e.g. using `cryptsetup luksSuspend` or `dmsetup suspend`) before system suspend. The Kernel trying to sync filesystems after the underlying block device already got suspended obviously leads to dead-locks. Be aware that you have to take care of the filesystem sync yourself before suspending the system in those scenarios. Signed-off-by: Jonas Meurer <jonas@freesources.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-01-16Merge branch 'mlx5-next' into rdma.git for-nextJason Gunthorpe
From the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Merged due to dependencies in the next patches. * branch 'mlx5-next': net/mlx5: Expose relaxed ordering bits net/mlx5: Add RoCE accelerator counters
2020-01-16net/mlx5: Expose relaxed ordering bitsMichael Guralnik
Expose relaxed ordering bits in HCA capability and mkey context structs. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16net/mlx5: Add RoCE accelerator countersLeon Romanovsky
Add RoCE accelerator definitions. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16dax: Get rid of fs_dax_get_by_host() helperVivek Goyal
Looks like nobody is using fs_dax_get_by_host() except fs_dax_get_by_bdev() and it can easily use dax_get_by_host() instead. IIUC, fs_dax_get_by_host() was only introduced so that one could compile with CONFIG_FS_DAX=n and CONFIG_DAX=m. fs_dax_get_by_bdev() achieves the same purpose and hence it looks like fs_dax_get_by_host() is not needed anymore. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20200106181117.GA16248@redhat.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-01-16x86/amd_nb: Add Family 19h PCI IDsYazen Ghannam
Add the new PCI Device 18h IDs for AMD Family 19h systems. Note that Family 19h systems will not have a new PCI root device ID. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200110015651.14887-4-Yazen.Ghannam@amd.com
2020-01-16Merge branch 'topic/sdw_intel' into nextVinod Koul
2020-01-16soundwire: intel: report slave_ids for each link to SOF driverBard Liao
The existing link_mask flag is no longer sufficient to detect the hardware and identify which topology file and a machine driver to load. By reporting the slave_ids exposed in ACPI tables, the parent SOF driver will be able to compare against a set of static configurations. This patch only adds the interface change, the functionality is added in future patches. Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20200110220016.30887-1-pierre-louis.bossart@linux.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-16gpio: Fix the no return statement warningKevin Hao
In commit 242587616710 ("gpiolib: Add support for the irqdomain which doesn't use irq_fwspec as arg") we have changed the return type of gpiochip_populate_parent_fwspec_twocell/fourcell() from void to void *, but forgot to add a return statement for these two dummy functions. Add "return NULL" to fix the build warnings. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Kevin Hao <haokexin@gmail.com> Link: https://lore.kernel.org/r/20200116095003.30324-1-haokexin@gmail.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2020-01-16mm: Add a vmf_insert_mixed_prot() functionThomas Hellstrom
The TTM module today uses a hack to be able to set a different page protection than struct vm_area_struct::vm_page_prot. To be able to do this properly, add the needed vm functionality as vmf_insert_mixed_prot(). Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: "Christian König" <christian.koenig@amd.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Andrew Morton <akpm@linux-foundation.org>
2020-01-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2020-01-15 The following pull-request contains BPF updates for your *net* tree. We've added 12 non-merge commits during the last 9 day(s) which contain a total of 13 files changed, 95 insertions(+), 43 deletions(-). The main changes are: 1) Fix refcount leak for TCP time wait and request sockets for socket lookup related BPF helpers, from Lorenz Bauer. 2) Fix wrong verification of ARSH instruction under ALU32, from Daniel Borkmann. 3) Batch of several sockmap and related TLS fixes found while operating more complex BPF programs with Cilium and OpenSSL, from John Fastabend. 4) Fix sockmap to read psock's ingress_msg queue before regular sk_receive_queue() to avoid purging data upon teardown, from Lingpeng Chen. 5) Fix printing incorrect pointer in bpftool's btf_dump_ptr() in order to properly dump a BPF map's value with BTF, from Martin KaFai Lau. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-15block: fix an integer overflow in logical block sizeMikulas Patocka
Logical block size has type unsigned short. That means that it can be at most 32768. However, there are architectures that can run with 64k pages (for example arm64) and on these architectures, it may be possible to create block devices with 64k block size. For exmaple (run this on an architecture with 64k pages): Mount will fail with this error because it tries to read the superblock using 2-sector access: device-mapper: writecache: I/O is not aligned, sector 2, size 1024, block size 65536 EXT4-fs (dm-0): unable to read superblock This patch changes the logical block size from unsigned short to unsigned int to avoid the overflow. Cc: stable@vger.kernel.org Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-15scsi: drivers: base: Propagate errors through the transport componentGabriel Krisman Bertazi
The transport registration may fail. Make sure the errors are propagated to the callers. Link: https://lore.kernel.org/r/20200106185817.640331-3-krisman@collabora.com Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-15scsi: drivers: base: Support atomic version of ↵Gabriel Krisman Bertazi
attribute_container_device_trigger attribute_container_device_trigger invokes callbacks that may fail for one or more classdevs, for instance, the transport_add_class_device callback, called during transport creation, does memory allocation. This information, though, is not propagated to upper layers, and any driver using the attribute_container_device_trigger API will not know whether any, some, or all callbacks succeeded. This patch implements a safe version of this dispatcher, to either succeed all the callbacks or revert to the original state. Link: https://lore.kernel.org/r/20200106185817.640331-2-krisman@collabora.com Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-15bpf: Sockmap/tls, push write_space updates through ulp updatesJohn Fastabend
When sockmap sock with TLS enabled is removed we cleanup bpf/psock state and call tcp_update_ulp() to push updates to TLS ULP on top. However, we don't push the write_space callback up and instead simply overwrite the op with the psock stored previous op. This may or may not be correct so to ensure we don't overwrite the TLS write space hook pass this field to the ULP and have it fixup the ctx. This completes a previous fix that pushed the ops through to the ULP but at the time missed doing this for write_space, presumably because write_space TLS hook was added around the same time. Fixes: 95fa145479fbc ("bpf: sockmap/tls, close can race with map free") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20200111061206.8028-4-john.fastabend@gmail.com
2020-01-15bpf: Sockmap/tls, during free we may call tcp_bpf_unhash() in loopJohn Fastabend
When a sockmap is free'd and a socket in the map is enabled with tls we tear down the bpf context on the socket, the psock struct and state, and then call tcp_update_ulp(). The tcp_update_ulp() call is to inform the tls stack it needs to update its saved sock ops so that when the tls socket is later destroyed it doesn't try to call the now destroyed psock hooks. This is about keeping stacked ULPs in good shape so they always have the right set of stacked ops. However, recently unhash() hook was removed from TLS side. But, the sockmap/bpf side is not doing any extra work to update the unhash op when is torn down instead expecting TLS side to manage it. So both TLS and sockmap believe the other side is managing the op and instead no one updates the hook so it continues to point at tcp_bpf_unhash(). When unhash hook is called we call tcp_bpf_unhash() which detects the psock has already been destroyed and calls sk->sk_prot_unhash() which calls tcp_bpf_unhash() yet again and so on looping and hanging the core. To fix have sockmap tear down logic fixup the stale pointer. Fixes: 5d92e631b8be ("net/tls: partially revert fix transition through disconnect with close") Reported-by: syzbot+83979935eb6304f8cd46@syzkaller.appspotmail.com Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20200111061206.8028-2-john.fastabend@gmail.com
2020-01-15bpf: Add batch ops to all htab bpf mapYonghong Song
htab can't use generic batch support due some problematic behaviours inherent to the data structre, i.e. while iterating the bpf map a concurrent program might delete the next entry that batch was about to use, in that case there's no easy solution to retrieve the next entry, the issue has been discussed multiple times (see [1] and [2]). The only way hmap can be traversed without the problem previously exposed is by making sure that the map is traversing entire buckets. This commit implements those strict requirements for hmap, the implementation follows the same interaction that generic support with some exceptions: - If keys/values buffer are not big enough to traverse a bucket, ENOSPC will be returned. - out_batch contains the value of the next bucket in the iteration, not the next key, but this is transparent for the user since the user should never use out_batch for other than bpf batch syscalls. This commits implements BPF_MAP_LOOKUP_BATCH and adds support for new command BPF_MAP_LOOKUP_AND_DELETE_BATCH. Note that for update/delete batch ops it is possible to use the generic implementations. [1] https://lore.kernel.org/bpf/20190724165803.87470-1-brianvv@google.com/ [2] https://lore.kernel.org/bpf/20190906225434.3635421-1-yhs@fb.com/ Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Brian Vazquez <brianvv@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200115184308.162644-6-brianvv@google.com
2020-01-15bpf: Add generic support for update and delete batch opsBrian Vazquez
This commit adds generic support for update and delete batch ops that can be used for almost all the bpf maps. These commands share the same UAPI attr that lookup and lookup_and_delete batch ops use and the syscall commands are: BPF_MAP_UPDATE_BATCH BPF_MAP_DELETE_BATCH The main difference between update/delete and lookup batch ops is that for update/delete keys/values must be specified for userspace and because of that, neither in_batch nor out_batch are used. Suggested-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Brian Vazquez <brianvv@google.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200115184308.162644-4-brianvv@google.com
2020-01-15bpf: Add generic support for lookup batch opBrian Vazquez
This commit introduces generic support for the bpf_map_lookup_batch. This implementation can be used by almost all the bpf maps since its core implementation is relying on the existing map_get_next_key and map_lookup_elem. The bpf syscall subcommand introduced is: BPF_MAP_LOOKUP_BATCH The UAPI attribute is: struct { /* struct used by BPF_MAP_*_BATCH commands */ __aligned_u64 in_batch; /* start batch, * NULL to start from beginning */ __aligned_u64 out_batch; /* output: next start batch */ __aligned_u64 keys; __aligned_u64 values; __u32 count; /* input/output: * input: # of key/value * elements * output: # of filled elements */ __u32 map_fd; __u64 elem_flags; __u64 flags; } batch; in_batch/out_batch are opaque values use to communicate between user/kernel space, in_batch/out_batch must be of key_size length. To start iterating from the beginning in_batch must be null, count is the # of key/value elements to retrieve. Note that the 'keys' buffer must be a buffer of key_size * count size and the 'values' buffer must be value_size * count, where value_size must be aligned to 8 bytes by userspace if it's dealing with percpu maps. 'count' will contain the number of keys/values successfully retrieved. Note that 'count' is an input/output variable and it can contain a lower value after a call. If there's no more entries to retrieve, ENOENT will be returned. If error is ENOENT, count might be > 0 in case it copied some values but there were no more entries to retrieve. Note that if the return code is an error and not -EFAULT, count indicates the number of elements successfully processed. Suggested-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Brian Vazquez <brianvv@google.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200115184308.162644-3-brianvv@google.com
2020-01-15bpf: Fix incorrect verifier simulation of ARSH under ALU32Daniel Borkmann
Anatoly has been fuzzing with kBdysch harness and reported a hang in one of the outcomes: 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 0: (85) call bpf_get_socket_cookie#46 1: R0_w=invP(id=0) R10=fp0 1: (57) r0 &= 808464432 2: R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0 2: (14) w0 -= 810299440 3: R0_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0 3: (c4) w0 s>>= 1 4: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0 4: (76) if w0 s>= 0x30303030 goto pc+216 221: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0 221: (95) exit processed 6 insns (limit 1000000) [...] Taking a closer look, the program was xlated as follows: # ./bpftool p d x i 12 0: (85) call bpf_get_socket_cookie#7800896 1: (bf) r6 = r0 2: (57) r6 &= 808464432 3: (14) w6 -= 810299440 4: (c4) w6 s>>= 1 5: (76) if w6 s>= 0x30303030 goto pc+216 6: (05) goto pc-1 7: (05) goto pc-1 8: (05) goto pc-1 [...] 220: (05) goto pc-1 221: (05) goto pc-1 222: (95) exit Meaning, the visible effect is very similar to f54c7898ed1c ("bpf: Fix precision tracking for unbounded scalars"), that is, the fall-through branch in the instruction 5 is considered to be never taken given the conclusion from the min/max bounds tracking in w6, and therefore the dead-code sanitation rewrites it as goto pc-1. However, real-life input disagrees with verification analysis since a soft-lockup was observed. The bug sits in the analysis of the ARSH. The definition is that we shift the target register value right by K bits through shifting in copies of its sign bit. In adjust_scalar_min_max_vals(), we do first coerce the register into 32 bit mode, same happens after simulating the operation. However, for the case of simulating the actual ARSH, we don't take the mode into account and act as if it's always 64 bit, but location of sign bit is different: dst_reg->smin_value >>= umin_val; dst_reg->smax_value >>= umin_val; dst_reg->var_off = tnum_arshift(dst_reg->var_off, umin_val); Consider an unknown R0 where bpf_get_socket_cookie() (or others) would for example return 0xffff. With the above ARSH simulation, we'd see the following results: [...] 1: R1=ctx(id=0,off=0,imm=0) R2_w=invP65535 R10=fp0 1: (85) call bpf_get_socket_cookie#46 2: R0_w=invP(id=0) R10=fp0 2: (57) r0 &= 808464432 -> R0_runtime = 0x3030 3: R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0 3: (14) w0 -= 810299440 -> R0_runtime = 0xcfb40000 4: R0_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0 (0xffffffff) 4: (c4) w0 s>>= 1 -> R0_runtime = 0xe7da0000 5: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0 (0x67c00000) (0x7ffbfff8) [...] In insn 3, we have a runtime value of 0xcfb40000, which is '1100 1111 1011 0100 0000 0000 0000 0000', the result after the shift has 0xe7da0000 that is '1110 0111 1101 1010 0000 0000 0000 0000', where the sign bit is correctly retained in 32 bit mode. In insn4, the umax was 0xffffffff, and changed into 0x7ffbfff8 after the shift, that is, '0111 1111 1111 1011 1111 1111 1111 1000' and means here that the simulation didn't retain the sign bit. With above logic, the updates happen on the 64 bit min/max bounds and given we coerced the register, the sign bits of the bounds are cleared as well, meaning, we need to force the simulation into s32 space for 32 bit alu mode. Verification after the fix below. We're first analyzing the fall-through branch on 32 bit signed >= test eventually leading to rejection of the program in this specific case: 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 0: (b7) r2 = 808464432 1: R1=ctx(id=0,off=0,imm=0) R2_w=invP808464432 R10=fp0 1: (85) call bpf_get_socket_cookie#46 2: R0_w=invP(id=0) R10=fp0 2: (bf) r6 = r0 3: R0_w=invP(id=0) R6_w=invP(id=0) R10=fp0 3: (57) r6 &= 808464432 4: R0_w=invP(id=0) R6_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0 4: (14) w6 -= 810299440 5: R0_w=invP(id=0) R6_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0 5: (c4) w6 s>>= 1 6: R0_w=invP(id=0) R6_w=invP(id=0,umin_value=3888119808,umax_value=4294705144,var_off=(0xe7c00000; 0x183bfff8)) R10=fp0 (0x67c00000) (0xfffbfff8) 6: (76) if w6 s>= 0x30303030 goto pc+216 7: R0_w=invP(id=0) R6_w=invP(id=0,umin_value=3888119808,umax_value=4294705144,var_off=(0xe7c00000; 0x183bfff8)) R10=fp0 7: (30) r0 = *(u8 *)skb[808464432] BPF_LD_[ABS|IND] uses reserved fields processed 8 insns (limit 1000000) [...] Fixes: 9cbe1f5a32dc ("bpf/verifier: improve register value range tracking with ARSH") Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200115204733.16648-1-daniel@iogearbox.net
2020-01-15soc: ti: k3: add navss ringacc driverGrygorii Strashko
The Ring Accelerator (RINGACC or RA) provides hardware acceleration to enable straightforward passing of work between a producer and a consumer. There is one RINGACC module per NAVSS on TI AM65x SoCs. The RINGACC converts constant-address read and write accesses to equivalent read or write accesses to a circular data structure in memory. The RINGACC eliminates the need for each DMA controller which needs to access ring elements from having to know the current state of the ring (base address, current offset). The DMA controller performs a read or write access to a specific address range (which maps to the source interface on the RINGACC) and the RINGACC replaces the address for the transaction with a new address which corresponds to the head or tail element of the ring (head for reads, tail for writes). Since the RINGACC maintains the state, multiple DMA controllers or channels are allowed to coherently share the same rings as applicable. The RINGACC is able to place data which is destined towards software into cached memory directly. Supported ring modes: - Ring Mode - Messaging Mode - Credentials Mode - Queue Manager Mode TI-SCI integration: Texas Instrument's System Control Interface (TI-SCI) Message Protocol now has control over Ringacc module resources management (RM) and Rings configuration. The corresponding support of TI-SCI Ringacc module RM protocol introduced as option through DT parameters: - ti,sci: phandle on TI-SCI firmware controller DT node - ti,sci-dev-id: TI-SCI device identifier as per TI-SCI firmware spec if both parameters present - Ringacc driver will configure/free/reset Rings using TI-SCI Message Ringacc RM Protocol. The Ringacc driver manages Rings allocation by itself now and requests TI-SCI firmware to allocate and configure specific Rings only. It's done this way because, Linux driver implements two stage Rings allocation and configuration (allocate ring and configure ring) while TI-SCI Message Protocol supports only one combined operation (allocate+configure). Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Reviewed-by: Tero Kristo <t-kristo@ti.com> Tested-by: Keerthy <j-keerthy@ti.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
2020-01-15Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull vfs fixes from Al Viro: "Fixes for mountpoint_last() bugs (by converting to use of lookup_last()) and an autofs regression fix from this cycle (caused by follow_managed() breakage introduced in barrier fixes series)" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix autofs regression caused by follow_managed() changes reimplement path_mountpoint() with less magic
2020-01-15Merge branch 'i2c/for-current' into i2c/for-5.6Wolfram Sang
2020-01-15PCI/switchtec: Add Gen4 MRPC GAS access permission checkKelvin Cao
Gen4 hardware provides new MRPC commands to read and write directly from any address in the PCI BAR (which Microsemi refers to as GAS). Since accessing BARs can be dangerous and break the driver, we don't want unprivileged users to have this ability. Therefore, require CAP_SYS_ADMIN for the local and remote GAS access MRPC commands. Privileged processes will already have access to the BAR through the sysfs resource file so this doesn't give userspace any capabilities it didn't already have. [logang@deltatee.com: rework commit message] Link: https://lore.kernel.org/r/20200106190337.2428-11-logang@deltatee.com Signed-off-by: Kelvin Cao <kelvin.cao@microchip.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-01-15PCI/switchtec: Add Gen4 flash information interface supportKelvin Cao
Add the new flash_info registers struct and the implementation of ioctl_flash_part_info() for the new Gen4 hardware. [logang@deltatee.com: rewrote commit message] Link: https://lore.kernel.org/r/20200115035648.2578-7-logang@deltatee.com Signed-off-by: Kelvin Cao <kelvin.cao@microchip.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-01-15PCI/switchtec: Add Gen4 system info register supportLogan Gunthorpe
Add the Gen4-specific system info registers and ensure their usage is guarded by a check on the device's generation. Link: https://lore.kernel.org/r/20200115035648.2578-6-logang@deltatee.com Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-01-15PCI/switchtec: Separate Gen3 register structures into unionsLogan Gunthorpe
Since the sys_info and flash_info registers differ significantly in Gen4 hardware, separate out the Gen3 registers into their own structure with a union in the main structure. No functional changes intended. Link: https://lore.kernel.org/r/20200115035648.2578-5-logang@deltatee.com Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-01-15PCI/switchtec: Add 'generation' variableLogan Gunthorpe
Add a generation variable passed through the device ID table and test for Gen3-specific registers. This will allow us to add Gen4 and other devices that extend the programming model. Link: https://lore.kernel.org/r/20200115035648.2578-3-logang@deltatee.com Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>