summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-10-10scsi: mpt3sas: Remove unnecessary parentheses and simplify null checksNathan Chancellor
Clang warns when multiple pairs of parentheses are used for a single conditional statement. drivers/scsi/mpt3sas/mpt3sas_base.c:535:11: warning: equality comparison with extraneous parentheses [-Wparentheses-equality] if ((ioc == NULL)) ~~~~^~~~~~~ drivers/scsi/mpt3sas/mpt3sas_base.c:535:11: note: remove extraneous parentheses around the comparison to silence this warning if ((ioc == NULL)) ~ ^ ~ drivers/scsi/mpt3sas/mpt3sas_base.c:535:11: note: use '=' to turn this equality comparison into an assignment if ((ioc == NULL)) ^~ = drivers/scsi/mpt3sas/mpt3sas_base.c:539:12: warning: equality comparison with extraneous parentheses [-Wparentheses-equality] if ((pdev == NULL)) ~~~~~^~~~~~~ drivers/scsi/mpt3sas/mpt3sas_base.c:539:12: note: remove extraneous parentheses around the comparison to silence this warning if ((pdev == NULL)) ~ ^ ~ drivers/scsi/mpt3sas/mpt3sas_base.c:539:12: note: use '=' to turn this equality comparison into an assignment if ((pdev == NULL)) ^~ = 2 warnings generated. Remove them and while we're at it, simplify the NULL checks as '!var' is used more than 'var == NULL'. Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-10-10scsi: mpt3sas: Use dma_pool_zallocSouptick Joarder
Replaced dma_pool_alloc + memset with dma_pool_zalloc. Signed-off-by: Brajeswar Ghosh <brajeswar.linux@gmail.com> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Acked-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-10-10scsi: mpt3sas: Remove unused macro MPT3SAS_FMTJoe Perches
All the uses have been removed, delete the macro. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-10-10scsi: mpt3sas: Convert logging uses with MPT3SAS_FMT without logging levelsJoe Perches
Convert these uses to ioc_<level> where appropriate. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-10-10scsi: mpt3sas: Remove KERN_WARNING from panic usesJoe Perches
Remove the logging level as panic calls stop the machine and should always be emitted regardless of requested logging level. These existing panic uses are perhaps inappropriate. Miscellanea: o Coalesce formats and convert MPT3SAS_FMT to "%s: " to improve clarity Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-10-10scsi: mpt3sas: Convert logging uses with MPT3SAS_FMT and reply_q_name to %s:Joe Perches
Convert the existing 2 uses to make the format and arguments matching more obvious. Miscellanea: o Move the word "enabled" into the format to trivially reduce object size o Remove unnecessary parentheses Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-10-10scsi: mpt3sas: Convert mlsleading uses of pr_<level> with MPT3SAS_FMTJoe Perches
These have misordered uses of __func__ and ioc->name that could mismatch MPT3SAS_FMT and "%s: ". Convert them to ioc_<level>. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-10-10scsi: mpt3sas: Convert uses of pr_<level> with MPT3SAS_FMT to ioc_<level>Joe Perches
Use a more common logging style. Done using the perl script below and some typing $ git grep --name-only -w MPT3SAS_FMT -- "*.c" | \ xargs perl -i -e 'local $/; while (<>) { s/\bpr_(info|err|notice|warn)\s*\(\s*MPT3SAS_FMT\s*("[^"]+"(?:\s*\\?\s*"[^"]+"\s*){0,5}\s*),\s*ioc->name\s*/ioc_\1(ioc, \2/g; print;}' Miscellanea for these conversions: o Coalesce formats o Realign arguments o Remove unnecessary parentheses o Use casts to u64 instead of unsigned long long where appropriate o Convert broken pr_info uses to pr_cont o Fix broken format string concatenation with line continuations and excess whitespace Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-10-10scsi: mpt3sas: Add ioc_<level> logging macrosJoe Perches
These macros can help identify specific logging uses and eventually perhaps reduce object sizes. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-10-10scsi: MAINTAINERS: Fix typo in cxlflash stanzaMatthew R. Ochs
The uapi header file listed in the cxlflash stanza has a typo. Removed the trailing 's' from the filename. Reported-by: Joe Perches <joe@perches.com> Signed-off-by: Matthew R. Ochs <mrochs@linux.ibm.com> Acked-by: Uma Krishnan <ukrishn@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-10-10signal: Guard against negative signal numbers in copy_siginfo_from_user32Eric W. Biederman
While fixing an out of bounds array access in known_siginfo_layout reported by the kernel test robot it became apparent that the same bug exists in siginfo_layout and affects copy_siginfo_from_user32. The straight forward fix that makes guards against making this mistake in the future and should keep the code size small is to just take an unsigned signal number instead of a signed signal number, as I did to fix known_siginfo_layout. Cc: stable@vger.kernel.org Fixes: cc731525f26a ("signal: Remove kernel interal si_code magic") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-10-10signal: Guard against negative signal numbers in copy_siginfo_from_userEric W. Biederman
The bounds checks in known_siginfo_layout only guards against positive numbers that are too large, large negative can slip through and can cause out of bounds accesses. Ordinarily this is not a concern because early in signal processing the signal number is filtered with valid_signal which ensures it is a small positive signal number, but copy_siginfo_from_user is called before this check is performed. [ 73.031126] BUG: unable to handle kernel paging request at ffffffff6281bcb6 [ 73.032038] PGD 3014067 P4D 3014067 PUD 0 [ 73.032565] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI [ 73.033287] CPU: 0 PID: 732 Comm: trinity-c3 Tainted: G W T 4.19.0-rc1-00077-g4ce5f9c #1 [ 73.034423] RIP: 0010:copy_siginfo_from_user+0x4d/0xd0 [ 73.034908] Code: 00 8b 53 08 81 fa 80 00 00 00 0f 84 90 00 00 00 85 d2 7e 2d 48 63 0b 83 f9 1f 7f 1c 8d 71 ff bf d8 04 01 50 48 0f a3 f7 73 0e <0f> b6 8c 09 20 bb 81 82 39 ca 7f 15 eb 68 31 c0 83 fa 06 7f 0c eb [ 73.036665] RSP: 0018:ffff88001b8f7e20 EFLAGS: 00010297 [ 73.037160] RAX: 0000000000000000 RBX: ffff88001b8f7e90 RCX: fffffffff00000cb [ 73.037865] RDX: 0000000000000001 RSI: 00000000f00000ca RDI: 00000000500104d8 [ 73.038546] RBP: ffff88001b8f7e80 R08: 0000000000000000 R09: 0000000000000000 [ 73.039201] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000008 [ 73.039874] R13: 00000000000002dc R14: 0000000000000000 R15: 0000000000000000 [ 73.040613] FS: 000000000104a880(0000) GS:ffff88001f000000(0000) knlGS:0000000000000000 [ 73.041649] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 73.042405] CR2: ffffffff6281bcb6 CR3: 000000001cb52003 CR4: 00000000001606b0 [ 73.043351] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 73.044286] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 [ 73.045221] Call Trace: [ 73.045556] __x64_sys_rt_tgsigqueueinfo+0x34/0xa0 [ 73.046199] do_syscall_64+0x1a4/0x390 [ 73.046708] ? vtime_user_enter+0x61/0x80 [ 73.047242] ? __context_tracking_enter+0x4e/0x60 [ 73.047714] ? __context_tracking_enter+0x4e/0x60 [ 73.048278] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Therefore fix known_siginfo_layout to take an unsigned signal number instead of a signed signal number. All valid signal numbers are small positive numbers so they will not be affected, but invalid negative signal numbers will now become large positive signal numbers and will not be used as indices into the sig_sicodes array. Making the signal number unsigned makes it difficult for similar mistakes to happen in the future. Fixes: 4ce5f9c9e754 ("signal: Use a smaller struct siginfo in the kernel") Inspired-by: Sean Christopherson <sean.j.christopherson@intel.com> Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-10-10net/mlx5: WQ, fixes for fragmented WQ buffers APITariq Toukan
mlx5e netdevice used to calculate fragment edges by a call to mlx5_wq_cyc_get_frag_size(). This calculation did not give the correct indication for queues smaller than a PAGE_SIZE, (broken by default on PowerPC, where PAGE_SIZE == 64KB). Here it is replaced by the correct new calls/API. Since (TX/RX) Work Queues buffers are fragmented, here we introduce changes to the API in core driver, so that it gets a stride index and returns the index of last stride on same fragment, and an additional wrapping function that returns the number of physically contiguous strides that can be written contiguously to the work queue. This obsoletes the following API functions, and their buggy usage in EN driver: * mlx5_wq_cyc_get_frag_size() * mlx5_wq_cyc_ctr2fragix() The new API improves modularity and hides the details of such calculation for mlx5e netdevice and mlx5_ib rdma drivers. New calculation is also more efficient, and improves performance as follows: Packet rate test: pktgen, UDP / IPv4, 64byte, single ring, 8K ring size. Before: 16,477,619 pps After: 17,085,793 pps 3.7% improvement Fixes: 3a2f70331226 ("net/mlx5: Use order-0 allocations for all WQ types") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10net/mlx5: Take only bit 24-26 of wqe.pftype_wq for page fault typeHuy Nguyen
The HW spec defines only bits 24-26 of pftype_wq as the page fault type, use the required mask to ensure that. Fixes: d9aaed838765 ("{net,IB}/mlx5: Refactor page fault handling") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10net/mlx5: Fix memory leak when setting fpga ipsec capsTalat Batheesh
Allocated memory for context should be freed once finished working with it. Fixes: d6c4f0298cec ("net/mlx5: Refactor accel IPSec code") Signed-off-by: Talat Batheesh <talatb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10net/mlx5e: Do not ignore netdevice TX/RX queues numberFeras Daoud
The current design of mlx5e driver ignores the netdevice TX/RX queues number for netdevices that RDMA IPoIB ULP creates. Instead, the queue number is initialized to the maximum number that mlx5 thinks best for performance. As a result, ULP drivers that choose to create a netdevice with queue number that is less than the maximum channels mlx5 creates, will get a memory corruption. This fix changes the mlx5e netdev logic to respect ULP netdevices TX/RX queue number and use it when creating resources instead of the maximum channel number. Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10net/mlx5e: Use non-delayed work for update statsSaeed Mahameed
Convert mlx5e update stats work to a normal work structure, since it is never used delayed. Add a helper function to queue update stats work on demand which checks for some conditions and reduce code duplication to have a better abstraction. Fixes: ed56c5193ad8 ("net/mlx5e: Update NIC HW stats on demand only") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10net/mlx5e: Initialize all netdev common structures in one placeSaeed Mahameed
Move all mlx5e generic structures initializations to mlx5e_netdev_init. The common structure new initializer function will be used to initialize mlx5 context for netlink created netdevs such as IPoIB mlx5 accelerated child netdevs. Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Feras Daoud <ferasda@mellanox.com>
2018-10-10net/mlx5e: Always initialize update stats delayed workFeras Daoud
mlx5e_detach_netdev cancels update_stats work which was not initialized in ipoib netdevice profile, as a result, the following assert occurs: ODEBUG: assert_init not available (active state 0) object type: timer_list hint:(null) This change moves the update stats work to be initialized for all mlx5e netdevices. Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10net/mlx5e: Gather common netdev init/cleanup functionality in one placeFeras Daoud
Introduce a helper init/cleanup function that initializes mlx5e generic netdev private structure, and use them from all profiles init/cleanup callbacks. This patch will also be helpful to initialize/cleanup netdevs that are not created by mlx5 driver, e.g: accelerated ipoib child netdevs. Fixes: 26e59d8077a3 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10RDMA/netdev: Fix netlink support in IPoIBDenis Drozdov
IPoIB netlink support was broken by the below commit since integrating the rdma_netdev support relies on an allocation flow for netdevs that was controlled by the ipoib driver while netdev's rtnl_newlink implementation assumes that the netdev will be allocated by netlink. Such situation leads to crash in __ipoib_device_add, once trying to reuse netlink device. This patch fixes the kernel oops for both mlx4 and mlx5 devices triggered by the following command: Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks") Signed-off-by: Denis Drozdov <denisd@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10RDMA/netdev: Hoist alloc_netdev_mqs out of the driverDenis Drozdov
netdev has several interfaces that expect to call alloc_netdev_mqs from the core code, with the driver only providing the arguments. This is incompatible with the rdma_netdev interface that returns the netdev directly. Thus re-organize the API used by ipoib so that the verbs core code calls alloc_netdev_mqs for the driver. This is done by allowing the drivers to provide the allocation parameters via a 'get_params' callback and then initializing an allocated netdev as a second step. Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Denis Drozdov <denisd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10PCI: Remove pci_set_dma_max_seg_size()Christoph Hellwig
The few callers can just use dma_set_max_seg_size ()directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-10-10PCI: Remove pci_set_dma_seg_boundary()Christoph Hellwig
The two callers can just use dma_set_seg_boundary() directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-10-10PCI: Remove pci_unmap_addr() wrappers for DMA APIChristoph Hellwig
Only some of these were still used by the cxgb4 driver, and that despite the fact that the driver otherwise uses the generic DMA API. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-10-10Merge tag 'for-4.19/dm-fixes-3' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Mike writes: "device mapper fixes for 4.19 final - Fix a DM cache module init error path bug that doesn't properly cleanup a KMEM_CACHE if target registration fails. - Two stable@ fixes for DM zoned target; 4.20 will have changes that eliminate this code entirely but <= 4.19 needs these changes." * tag 'for-4.19/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm linear: eliminate linear_end_io call if CONFIG_DM_ZONED disabled dm: fix report zone remapping to account for partition offset dm cache: destroy migration_cache if cache target registration failed
2018-10-10ata: remove redundant 'default n' from KconfigBartlomiej Zolnierkiewicz
'default n' is the default value for any bool or tristate Kconfig setting so there is no need to write it explicitly. Also since commit f467c5640c29 ("kconfig: only write '# CONFIG_FOO is not set' for visible symbols") the Kconfig behavior is the same regardless of 'default n' being present or not: ... One side effect of (and the main motivation for) this change is making the following two definitions behave exactly the same: config FOO bool config FOO bool default n With this change, neither of these will generate a '# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied). That might make it clearer to people that a bare 'default n' is redundant. ... Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-10drivers/block: remove redundant 'default n' from Kconfig-sBartlomiej Zolnierkiewicz
'default n' is the default value for any bool or tristate Kconfig setting so there is no need to write it explicitly. Also since commit f467c5640c29 ("kconfig: only write '# CONFIG_FOO is not set' for visible symbols") the Kconfig behavior is the same regardless of 'default n' being present or not: ... One side effect of (and the main motivation for) this change is making the following two definitions behave exactly the same: config FOO bool config FOO bool default n With this change, neither of these will generate a '# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied). That might make it clearer to people that a bare 'default n' is redundant. ... Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-10block: remove redundant 'default n' from Kconfig-sBartlomiej Zolnierkiewicz
'default n' is the default value for any bool or tristate Kconfig setting so there is no need to write it explicitly. Also since commit f467c5640c29 ("kconfig: only write '# CONFIG_FOO is not set' for visible symbols") the Kconfig behavior is the same regardless of 'default n' being present or not: ... One side effect of (and the main motivation for) this change is making the following two definitions behave exactly the same: config FOO bool config FOO bool default n With this change, neither of these will generate a '# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied). That might make it clearer to people that a bare 'default n' is redundant. ... Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-10Merge tag 'trace-v4.19-rc5' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Steven writes: "vsprint fix: It was reported that trace_printk() was not reporting properly values that came after a dereference pointer. trace_printk() utilizes vbin_printf() and bstr_printf() to keep the overhead of tracing down. vbin_printf() does not do any conversions and just stors the string format and the raw arguments into the buffer. bstr_printf() is used to read the buffer and does the conversions to complete the printf() output. This can be troublesome with dereferenced pointers because the reference may be different from the time vbin_printf() is called to the time bstr_printf() is called. To fix this, a prior commit changed vbin_printf() to convert dereferenced pointers into strings and load the converted string into the buffer. But the change to bstr_printf() had an off-by-one error and didn't account for the nul character at the end of the string and this corrupted the rest of the values in the format that came after a dereferenced pointer." * tag 'trace-v4.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: vsprintf: Fix off-by-one bug in bstr_printf() processing dereferenced pointers
2018-10-10Merge tag 'devicetree-fixes-for-4.19-3' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Rob writes: "Devicetree fixes for 4.19, part 3: - Fix DT unittest on Oldworld MAC systems" * tag 'devicetree-fixes-for-4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of: unittest: Disable interrupt node tests for old world MAC systems
2018-10-10pinctrl: gemini: Fix up TVC clock groupLinus Walleij
The previous fix made the TVC clock get muxed in on the D-Link DIR-685 instead of giving nagging warnings of this not working. Not good. We didn't want that, as it breaks video. Create a specific group for the TVC CLK, and break out a specific GPIO group for it on the SL3516 so we can use that line as GPIO if we don't need the TVC CLK. Fixes: d17f477c5bc6 ("pinctrl: gemini: Mask and set properly") Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2018-10-10PCI/P2PDMA: Support peer-to-peer memoryLogan Gunthorpe
Some PCI devices may have memory mapped in a BAR space that's intended for use in peer-to-peer transactions. To enable such transactions the memory must be registered with ZONE_DEVICE pages so it can be used by DMA interfaces in existing drivers. Add an interface for other subsystems to find and allocate chunks of P2P memory as necessary to facilitate transfers between two PCI peers: struct pci_dev *pci_p2pmem_find[_many](); int pci_p2pdma_distance[_many](); void *pci_alloc_p2pmem(); The new interface requires a driver to collect a list of client devices involved in the transaction then call pci_p2pmem_find() to obtain any suitable P2P memory. Alternatively, if the caller knows a device which provides P2P memory, they can use pci_p2pdma_distance() to determine if it is usable. With a suitable p2pmem device, memory can then be allocated with pci_alloc_p2pmem() for use in DMA transactions. Depending on hardware, using peer-to-peer memory may reduce the bandwidth of the transfer but can significantly reduce pressure on system memory. This may be desirable in many cases: for example a system could be designed with a small CPU connected to a PCIe switch by a small number of lanes which would maximize the number of lanes available to connect to NVMe devices. The code is designed to only utilize the p2pmem device if all the devices involved in a transfer are behind the same PCI bridge. This is because we have no way of knowing whether peer-to-peer routing between PCIe Root Ports is supported (PCIe r4.0, sec 1.3.1). Additionally, the benefits of P2P transfers that go through the RC is limited to only reducing DRAM usage and, in some cases, coding convenience. The PCI-SIG may be exploring adding a new capability bit to advertise whether this is possible for future hardware. This commit includes significant rework and feedback from Christoph Hellwig. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> [bhelgaas: fold in fix from Keith Busch <keith.busch@intel.com>: https://lore.kernel.org/linux-pci/20181012155920.15418-1-keith.busch@intel.com, to address comment from Dan Carpenter <dan.carpenter@oracle.com>, fold in https://lore.kernel.org/linux-pci/20181017160510.17926-1-logang@deltatee.com] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-10-10IB/mlx5: Unmap DMA addr from HCA before IOMMUValentine Fatiev
The function that puts back the MR in cache also removes the DMA address from the HCA. Therefore we need to call this function before we remove the DMA mapping from MMU. Otherwise the HCA may access a memory that is no longer DMA mapped. Call trace: NMI: IOCK error (debug interrupt?) for reason 71 on CPU 0. CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0-rc6+ #4 Hardware name: HP ProLiant DL360p Gen8, BIOS P71 08/20/2012 RIP: 0010:intel_idle+0x73/0x120 Code: 80 5c 01 00 0f ae 38 0f ae f0 31 d2 65 48 8b 04 25 80 5c 01 00 48 89 d1 0f 60 02 RSP: 0018:ffffffff9a403e38 EFLAGS: 00000046 RAX: 0000000000000030 RBX: 0000000000000005 RCX: 0000000000000001 RDX: 0000000000000000 RSI: ffffffff9a5790c0 RDI: 0000000000000000 RBP: 0000000000000030 R08: 0000000000000000 R09: 0000000000007cf9 R10: 000000000000030a R11: 0000000000000018 R12: 0000000000000000 R13: ffffffff9a5792b8 R14: ffffffff9a5790c0 R15: 0000002b48471e4d FS: 0000000000000000(0000) GS:ffff9c6caf400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f5737185000 CR3: 0000000590c0a002 CR4: 00000000000606f0 Call Trace: cpuidle_enter_state+0x7e/0x2e0 do_idle+0x1ed/0x290 cpu_startup_entry+0x6f/0x80 start_kernel+0x524/0x544 ? set_init_arg+0x55/0x55 secondary_startup_64+0xa4/0xb0 DMAR: DRHD: handling fault status reg 2 DMAR: [DMA Read] Request device [04:00.0] fault addr b34d2000 [fault reason 06] PTE Read access is not set DMAR: [DMA Read] Request device [01:00.2] fault addr bff8b000 [fault reason 06] PTE Read access is not set Fixes: f3f134f5260a ("RDMA/mlx5: Fix crash while accessing garbage pointer and freed memory") Signed-off-by: Valentine Fatiev <valentinef@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-10net: make skb_partial_csum_set() more robust against overflowsEric Dumazet
syzbot managed to crash in skb_checksum_help() [1] : BUG_ON(offset + sizeof(__sum16) > skb_headlen(skb)); Root cause is the following check in skb_partial_csum_set() if (unlikely(start > skb_headlen(skb)) || unlikely((int)start + off > skb_headlen(skb) - 2)) return false; If skb_headlen(skb) is 1, then (skb_headlen(skb) - 2) becomes 0xffffffff and the check fails to detect that ((int)start + off) is off the limit, since the compare is unsigned. When we fix that, then the first condition (start > skb_headlen(skb)) becomes obsolete. Then we should also check that (skb_headroom(skb) + start) wont overflow 16bit field. [1] kernel BUG at net/core/dev.c:2880! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 7330 Comm: syz-executor4 Not tainted 4.19.0-rc6+ #253 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:skb_checksum_help+0x9e3/0xbb0 net/core/dev.c:2880 Code: 85 00 ff ff ff 48 c1 e8 03 42 80 3c 28 00 0f 84 09 fb ff ff 48 8b bd 00 ff ff ff e8 97 a8 b9 fb e9 f8 fa ff ff e8 2d 09 76 fb <0f> 0b 48 8b bd 28 ff ff ff e8 1f a8 b9 fb e9 b1 f6 ff ff 48 89 cf RSP: 0018:ffff8801d83a6f60 EFLAGS: 00010293 RAX: ffff8801b9834380 RBX: ffff8801b9f8d8c0 RCX: ffffffff8608c6d7 RDX: 0000000000000000 RSI: ffffffff8608cc63 RDI: 0000000000000006 RBP: ffff8801d83a7068 R08: ffff8801b9834380 R09: 0000000000000000 R10: ffff8801d83a76d8 R11: 0000000000000000 R12: 0000000000000001 R13: 0000000000010001 R14: 000000000000ffff R15: 00000000000000a8 FS: 00007f1a66db5700(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f7d77f091b0 CR3: 00000001ba252000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: skb_csum_hwoffload_help+0x8f/0xe0 net/core/dev.c:3269 validate_xmit_skb+0xa2a/0xf30 net/core/dev.c:3312 __dev_queue_xmit+0xc2f/0x3950 net/core/dev.c:3797 dev_queue_xmit+0x17/0x20 net/core/dev.c:3838 packet_snd net/packet/af_packet.c:2928 [inline] packet_sendmsg+0x422d/0x64c0 net/packet/af_packet.c:2953 Fixes: 5ff8dda3035d ("net: Ensure partial checksum offset is inside the skb head") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10Merge branch 'devlink-param-type-string-fixes'David S. Miller
Moshe Shemesh says: ==================== devlink param type string fixes This patchset fixes devlink param infrastructure for string param type. The devlink param infrastructure doesn't handle copying the string data correctly. The first two patches fix it and the third patch adds helper function to safely copy string value without exceeding DEVLINK_PARAM_MAX_STRING_VALUE. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10devlink: Add helper function for safely copy string paramMoshe Shemesh
Devlink string param buffer is allocated at the size of DEVLINK_PARAM_MAX_STRING_VALUE. Add helper function which makes sure this size is not exceeded. Renamed DEVLINK_PARAM_MAX_STRING_VALUE to __DEVLINK_PARAM_MAX_STRING_VALUE to emphasize that it should be used by devlink only. The driver should use the helper function instead to verify it doesn't exceed the allowed length. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10devlink: Fix param cmode driverinit for string typeMoshe Shemesh
Driverinit configuration mode value is held by devlink to enable the driver fetch the value after reload command. In case the param type is string devlink should copy the value from driver string buffer to devlink string buffer on devlink_param_driverinit_value_set() and vice-versa on devlink_param_driverinit_value_get(). Fixes: ec01aeb1803e ("devlink: Add support for get/set driverinit value") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10devlink: Fix param set handling for string typeMoshe Shemesh
In case devlink param type is string, it needs to copy the string value it got from the input to devlink_param_value. Fixes: e3b7ca18ad7b ("devlink: Add param set command") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11samples: disable CONFIG_SAMPLES for UMLMasahiro Yamada
Some samples require headers installation, so commit 3fca1700c4c3 ("kbuild: make samples really depend on headers_install") added such dependency in the top Makefile. However, UML fails to build with CONFIG_SAMPLES=y because UML does not support headers_install. Fixes: 3fca1700c4c3 ("kbuild: make samples really depend on headers_install") Reported-by: Kees Cook <keescook@chromium.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-10-10Merge branch 'octeontx2-af-Add-RVU-Admin-Function-driver'David S. Miller
Sunil Goutham says: ==================== octeontx2-af: Add RVU Admin Function driver Resource virtualization unit (RVU) on Marvell's OcteonTX2 SOC maps HW resources from the network, crypto and other functional blocks into PCI-compatible physical and virtual functions. Each functional block again has multiple local functions (LFs) for provisioning to PCI devices. RVU supports multiple PCIe SRIOV physical functions (PFs) and virtual functions (VFs). PF0 is called the administrative / admin function (AF) and has privileges to provision RVU functional block's LFs to each of the PF/VF. RVU managed networking functional blocks - Network pool allocator (NPA) - Network interface controller (NIX) - Network parser CAM (NPC) - Schedule/Synchronize/Order unit (SSO) RVU managed non-networking functional blocks - Crypto accelerator (CPT) - Scheduled timers unit (TIM) - Schedule/Synchronize/Order unit (SSO) Used for both networking and non networking usecases - Compression (upcoming in future variants of the silicons) Resource provisioning examples - A PF/VF with NIX-LF & NPA-LF resources works as a pure network device - A PF/VF with CPT-LF resource works as a pure cyrpto offload device. This admin function driver neither receives any data nor processes it i.e no I/O, a configuration only driver. PF/VFs communicates with AF via a shared memory region (mailbox). Upon receiving requests from PF/VF, AF does resource provisioning and other HW configuration. AF is always attached to host, but PF/VFs may be used by host kernel itself, or attached to VMs or to userspace applications like DPDK etc. So AF has to handle provisioning/configuration requests sent by any device from any domain. This patch series adds logic for the following - RVU AF driver with functional blocks provisioning support. - Mailbox infrastructure for communication between AF and PFs. - CGX (MAC controller) driver which communicates with firmware for managing physical ethernet interfaces. AF collects info from this driver and forwards the same to the PF/VFs uaing these interfaces. This is the first set of patches out of 80+ patches. Changes from v8: 1 Removed unnecessary typecasts in entire series - Suggested by David Miller 2 Added COMPILE_TEST to AF driver - Suggested by Arnd Bergmann 3 Changed udelay() to usleep_range() in rvu_poll_reg - Suggested by Arnd Bergmann 4 MSIX vector base IOMMU mapping is done using dma_map_resource() API instead of dma_map_single() as it accepts physical address. - Issue pointed by Arnd Bergmann Changes from v7: 1 Removed unnecessary typecasts in mbox infra code. - Suggested by David Miller 2 Fixed MAINTAINERS patch - Suggested by Joe Perches Changes from v6: Fixed ordering of local variables from longest to shortest line. - Suggested by David Miller Changes from v5: Modified bitfield based command structures to bitmasks for communication with firmware, to address endianness issues. - Suggested by Arnd Bergmann Changes from v4: 1 Removed module author/version/description from CGX driver as it's now merged with AF driver module. - Suggested by Arnd Bergmann 2 Added big-endian bitfields for CGX's kernel <=> firmware communication command structures. - Suggested by Arnd Bergmann Changes from v3: Moved driver from drivers/soc to drivers/net/ethernet - Suggested by Arnd Bergmann https://patchwork.kernel.org/cover/10587635/ Changes from v2: No changes, submitted again with netdev mailing list in loop. - Suggested by Arnd Bergmann and Andrew Lunn Changes from v1: 1 Merged RVU admin function and CGX drivers into a single module - Suggested by Arnd Bergmann 2 Pulled mbox communication APIs into a separate module to remove admin function driver dependency in a VM where AF is not attached. - Suggested by Arnd Bergmann ==================== Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10Documentation/arm64: HugeTLB page implementationPunit Agrawal
Arm v8 architecture supports multiple page sizes - 4k, 16k and 64k. Based on the active page size, the Linux port supports corresponding hugepage sizes at PMD and PUD(4k only) levels. In addition, the architecture also supports caching larger sized ranges (composed of multiple entries) at the PTE and PMD level in the TLBs using the contiguous bit. The Linux port makes use of this architectural support to enable additional hugepage sizes. Describe the two different types of hugepages supported by the arm64 kernel and the hugepage sizes enabled by each. Acked-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-10-10MAINTAINERS: Add entry for Marvell OcteonTX2 Admin Function driverSunil Goutham
Added maintainers entry for Marvell OcteonTX2 SOC's RVU admin function driver. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10octeontx2-af: Register for CGX lmac eventsLinu Cherian
Added support in RVU AF driver to register for CGX LMAC link status change events from firmware and managing them. Processing part will be added in followup patches. - Introduced eventqueue for posting events from cgx lmac. Queueing mechanism will ensure that events can be posted and firmware can be acked immediately and hence event reception and processing are decoupled. - Events gets added to the queue by notification callback. Notification callback is expected to be atomic, since it is called from interrupt context. - Events are dequeued and processed in a worker thread. Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10octeontx2-af: Add support for CGX link managementLinu Cherian
CGX LMAC initialization, link status polling etc is done by low level secure firmware. For link management this patch adds a interface or communication mechanism between firmware and this kernel CGX driver. - Firmware interface specification is defined in cgx_fw_if.h. - Support to send/receive commands/events to/form firmware. - events/commands implemented * link up * link down * reading firmware version Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Nithya Mani <nmani@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10octeontx2-af: Set RVU PFs to CGX LMACs mappingLinu Cherian
Each of the enabled CGX LMAC is considered a physical interface and RVU PFs are mapped to these. VFs of these SRIOV PFs will be virtual interfaces and share CGX LMAC along with PF. This mapping info will be used later on for Rx/Tx pkt steering. Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10octeontx2-af: Add Marvell OcteonTX2 CGX driverSunil Goutham
This patch adds basic template for Marvell OcteonTX2's CGX ethernet interface driver. Just the probe. RVU AF driver will use APIs exported by this driver for various things like PF to physical interface mapping, loopback mode, interface stats etc. Hence marged both drivers into a single module. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10octeontx2-af: Reconfig MSIX base with IOVAGeetha sowjanya
HW interprets RVU_AF_MSIXTR_BASE address as an IOVA, hence create a IOMMU mapping for the physcial address configured by firmware and reconfig RVU_AF_MSIXTR_BASE with IOVA. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10octeontx2-af: Configure block LF's MSIX vector offsetSunil Goutham
Firmware configures a certain number of MSIX vectors to each of enabled RVU PF/VF. When a block LF is attached to a PF/VF, number of MSIX vectors needed by that LF are set aside (out of PF/VF's total MSIX vectors) and LF's msix_offset is configured in HW. Also added support for a RVU PF/VF to retrieve that block LF's MSIX vector offset information from AF via mbox. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10octeontx2-af: Add RVU block LF provisioning supportSunil Goutham
Added support for a RVU PF/VF to request AF via mailbox to attach or detach NPA/NIX/SSO/SSOW/TIM/CPT block LFs. Also supports partial detachment and modifying current LF attached count of a certian block type. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>