Age  Commit message  Author
2019-06-13blk-mq: remove WARN_ON(!q->elevator) from blk_mq_sched_free_requestsMing Lei
blk_mq_sched_free_requests() may be called in a failure path in which q->elevator may not be set up yet, so remove WARN_ON(!q->elevator) from blk_mq_sched_free_requests() to avoid the false positive. This function is actually safe to call when !q->elevator because hctx->sched_tags is checked. Cc: Bart Van Assche <bvanassche@acm.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Yi Zhang <yi.zhang@redhat.com> Fixes: c3e2219216c9 ("block: free sched's request pool in blk_cleanup_queue") Reported-by: syzbot+b9d0d56867048c7bcfde@syzkaller.appspotmail.com Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13blkio-controller.txt: Remove references to CFQAndreas Herrmann
CFQ is gone. No need anymore to document its "proportional weight time based division of disk policy". Signed-off-by: Andreas Herrmann <aherrmann@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13block/switching-sched.txt: Update to blk-mq schedulersAndreas Herrmann
Remove references to CFQ and legacy block layer which are gone. Update example with what's available under blk-mq. Signed-off-by: Andreas Herrmann <aherrmann@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13null_blk: remove duplicate check for report zoneChaitanya Kulkarni
This patch removes the check in null_blk_zoned for the report zone command, where it checks dev->zoned before executing the report zone.

The null_zone_report() function is a block_device operation callback which is initialized in null_blk_main.c and gets called as part of blkdev for the report zone IOCTL (BLKREPORTZONE).

    blkdev_ioctl()
     blkdev_report_zones_ioctl()
      blkdev_report_zones()
       blk_report_zones()
        disk->fops->report_zones()
         null_zone_report();

null_zone_report() will never get executed on a non-zoned block device: for a non-zoned block device blk_queue_is_zoned() is always false, and that is the first check in blkdev_report_zones_ioctl() before the actual low-level driver report zone callback is executed.

Here is the detailed scenario:

1. modprobe null_blk
    null_init
     null_alloc_dev
      dev->zoned = 0
     null_add_dev
      dev->zoned == 0 so we don't set q->limits.zoned = BLK_ZONED_HM

2. blkzone report /dev/nullb0
    blkdev_ioctl()
     blkdev_report_zones_ioctl()
      blk_queue_is_zoned()
       blk_queue_is_zoned
        q->limits.zoned == 0
        return false
      if (!blk_queue_is_zoned(q)) <--- true
        return -ENOTTY;

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13blk-mq: no need to check return value of debugfs_create functionsGreg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. When all of these checks are cleaned up, lots of the functions used in the blk-mq-debugfs code can now return void, as no need to check the return value of them either. Overall, this ends up cleaning up the code and making it smaller, always a nice win. Cc: Jens Axboe <axboe@kernel.dk> Cc: linux-block@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13io_uring: fix memory leak of UNIX domain socket inodeEric Biggers
Opening and closing an io_uring instance leaks a UNIX domain socket inode. This is because the ->file of the io_uring instance's internal UNIX domain socket is set to point to the io_uring file, but then sock_release() sees the non-NULL ->file and assumes the inode reference is held by the file so doesn't call iput(). That's not the case here, since the reference is still meant to be held by the socket; the actual inode of the io_uring file is different. Fix this leak by NULL-ing out ->file before releasing the socket. Reported-by: syzbot+111cb28d9f583693aefa@syzkaller.appspotmail.com Fixes: 2b188cc1bb85 ("Add io_uring IO interface") Cc: <stable@vger.kernel.org> # v5.1+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13block: force select mq-deadline for zoned block devicesDamien Le Moal
In most use cases of zoned block devices (aka SMR disks), the mq-deadline scheduler is mandatory as it implements sequential write command processing guarantees with zone write locking. So make sure that this scheduler is always enabled if CONFIG_BLK_DEV_ZONED is selected. Tested-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13binder: fix possible UAF when freeing bufferTodd Kjos
There is a race between the binder driver cleaning up a completed transaction via binder_free_transaction() and a user calling binder_ioctl(BC_FREE_BUFFER) to release a buffer. It doesn't matter which is first but they need to be protected against running concurrently which can result in a UAF. Signed-off-by: Todd Kjos <tkjos@google.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-13crypto: sahara - Use devm_platform_ioremap_resource()Fabio Estevam
Use devm_platform_ioremap_resource() to simplify the code a bit. Signed-off-by: Fabio Estevam <festevam@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-13crypto: mxs-dcp - Use devm_platform_ioremap_resource()Fabio Estevam
Use devm_platform_ioremap_resource() to simplify the code a bit. Signed-off-by: Fabio Estevam <festevam@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-13crypto: hisilicon - Use the correct style for SPDX License IdentifierNishad Kamdar
This patch corrects the SPDX License Identifier style in the header file related to the Crypto Drivers for the Hisilicon SEC Engine in Hip06 and Hip07. For C header files, Documentation/process/license-rules.rst mandates C-like comments (as opposed to C source files, where C++ style should be used). Changes made by using a script provided by Joe Perches here: https://lkml.org/lkml/2019/2/7/46 Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-13crypto: qat - use struct_size() helperGustavo A. R. Silva
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example:

    struct qat_alg_buf_list {
            ...
            struct qat_alg_buf bufers[];
    } __packed __aligned(64);

Make use of the struct_size() helper instead of an open-coded version in order to avoid any potential type mistakes.

So, replace the following form:

    sizeof(struct qat_alg_buf_list) + ((1 + n) * sizeof(struct qat_alg_buf))

with:

    struct_size(bufl, bufers, n + 1)

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
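The size calculation described above can be tried out in user space. The following is only a sketch: struct buf and struct buf_list are hypothetical stand-ins for the QAT structures (their fields are made up), and list_size_checked() is an assumed approximation of what struct_size() provides, namely the same arithmetic with saturation to SIZE_MAX on overflow. It is not the kernel macro itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct qat_alg_buf; the fields are illustrative. */
struct buf {
	uint32_t len;
	uint32_t reserved;
	uint64_t addr;
};

/* Hypothetical stand-in for struct qat_alg_buf_list. */
struct buf_list {
	uint64_t reserved;
	uint32_t num_bufs;
	uint32_t num_mapped_bufs;
	struct buf bufers[];	/* flexible array member at the end */
};

/* Open-coded form from the commit message: no overflow protection. */
static size_t list_size_open_coded(size_t n)
{
	return sizeof(struct buf_list) + ((1 + n) * sizeof(struct buf));
}

/*
 * Assumed user-space approximation of struct_size(p, bufers, count):
 * header size plus count elements, saturating to SIZE_MAX if the
 * multiplication or addition would overflow.
 */
static size_t list_size_checked(size_t count)
{
	size_t base = sizeof(struct buf_list);

	if (count > (SIZE_MAX - base) / sizeof(struct buf))
		return SIZE_MAX;
	return base + count * sizeof(struct buf);
}
```

For ordinary counts the two forms agree, for example list_size_checked(n + 1) equals list_size_open_coded(n). They differ only for huge counts, where the open-coded form silently wraps and the checked form saturates, so a following allocation fails instead of being undersized.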
2019-06-13ARM: dts: imx7ulp: add crypto supportIuliana Prodan
Add crypto node in device tree for CAAM support. Noteworthy is that on 7ulp the interrupt line is shared between the two job rings. Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com> Signed-off-by: Franck LENORMAND <franck.lenormand@nxp.com> Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Acked-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-13crypto: cavium/nitrox - Use the correct style for SPDX License IdentifierNishad Kamdar
This patch corrects the SPDX License Identifier style in the header files related to the Crypto Drivers for the Cavium Nitrox family CNN55XX devices. For C header files, Documentation/process/license-rules.rst mandates C-like comments (as opposed to C source files, where C++ style should be used). Changes made by using a script provided by Joe Perches here: https://lkml.org/lkml/2019/2/7/46 Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-13crypto: bcm - Make some symbols staticYueHaibing
Fix sparse warnings:

drivers/crypto/bcm/cipher.c:99:6: warning: symbol 'BCMHEADER' was not declared. Should it be static?
drivers/crypto/bcm/cipher.c:2096:6: warning: symbol 'spu_no_incr_hash' was not declared. Should it be static?
drivers/crypto/bcm/cipher.c:4823:5: warning: symbol 'bcm_spu_probe' was not declared. Should it be static?
drivers/crypto/bcm/cipher.c:4867:5: warning: symbol 'bcm_spu_remove' was not declared. Should it be static?
drivers/crypto/bcm/spu2.c:52:6: warning: symbol 'spu2_cipher_type_names' was not declared. Should it be static?
drivers/crypto/bcm/spu2.c:56:6: warning: symbol 'spu2_cipher_mode_names' was not declared. Should it be static?
drivers/crypto/bcm/spu2.c:60:6: warning: symbol 'spu2_hash_type_names' was not declared. Should it be static?
drivers/crypto/bcm/spu2.c:66:6: warning: symbol 'spu2_hash_mode_names' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-13crypto: chacha - constify ctx and iv argumentsEric Biggers
Constify the ctx and iv arguments to crypto_chacha_init() and the various chacha*_stream_xor() functions. This makes it clear that they are not modified. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-13crypto: chacha20poly1305 - a few cleanupsEric Biggers
- Use sg_init_one() instead of sg_init_table() then sg_set_buf().
- Remove unneeded calls to sg_init_table() prior to scatterwalk_ffwd().
- Simplify initializing the poly tail block.
- Simplify computing padlen.

This doesn't change any actual behavior.

Cc: Martin Willi <martin@strongswan.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-13crypto: skcipher - make chunksize and walksize accessors internalEric Biggers
The 'chunksize' and 'walksize' properties of skcipher algorithms are implementation details that users of the skcipher API should not be looking at. So move their accessor functions from <crypto/skcipher.h> to <crypto/internal/skcipher.h>. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-13crypto: skcipher - un-inline encrypt and decrypt functionsEric Biggers
crypto_skcipher_encrypt() and crypto_skcipher_decrypt() have grown to be more than a single indirect function call. They now also check whether a key has been set, and with CONFIG_CRYPTO_STATS=y they also update the crypto statistics. That can add up to a lot of bloat at every call site. Moreover, these always involve a function call anyway, which greatly limits the benefits of inlining. So change them to be non-inline. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-13crypto: aead - un-inline encrypt and decrypt functionsEric Biggers
crypto_aead_encrypt() and crypto_aead_decrypt() have grown to be more than a single indirect function call. They now also check whether a key has been set, the decryption side checks whether the input is at least as long as the authentication tag length, and with CONFIG_CRYPTO_STATS=y they also update the crypto statistics. That can add up to a lot of bloat at every call site. Moreover, these always involve a function call anyway, which greatly limits the benefits of inlining. So change them to be non-inline. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-13crypto: x86/aesni - remove unused internal cipher algorithmEric Biggers
Since commit 944585a64f5e ("crypto: x86/aes-ni - remove special handling of AES in PCBC mode"), the "__aes-aesni" internal cipher algorithm is no longer used. So remove it too. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-13crypto: doc - improve the skcipher API example codeEric Biggers
Rewrite the skcipher API example, changing it to encrypt a buffer with AES-256-XTS. This addresses various problems with the previous example:

- It requests a specific driver "cbc-aes-aesni", which is unusual. Normally users ask for "cbc(aes)", not a specific driver.
- It encrypts only a single AES block. For the reader, that doesn't clearly distinguish the "skcipher" API from the "cipher" API.
- Showing how to encrypt something with bare CBC is arguably a poor choice of example, as it doesn't follow modern crypto trends. Now, usually authenticated encryption is recommended, in which case the user would use the AEAD API, not skcipher. Disk encryption is still a legitimate use for skcipher, but for that usually XTS is recommended.
- Many other bugs and poor coding practices, such as not setting CRYPTO_TFM_REQ_MAY_SLEEP, unnecessarily allocating a heap buffer for the IV, unnecessary NULL checks, using a pointless wrapper struct, and forgetting to set an error code in one case.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-13crypto: testmgr - add some more preemption pointsEric Biggers
Call cond_resched() after each fuzz test iteration. This avoids stall warnings if fuzz_iterations is set very high for testing purposes. While we're at it, also call cond_resched() after finishing testing each test vector. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-13crypto: algapi - require cra_name and cra_driver_nameEric Biggers
Now that all algorithms explicitly set cra_driver_name, make it required for algorithm registration and remove the code that generated a default cra_driver_name. Also add an explicit check that cra_name is set too, since that's obviously required too, yet it didn't seem to be checked anywhere. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-13crypto: make all generic algorithms set cra_driver_nameEric Biggers
Most generic crypto algorithms declare a driver name ending in "-generic". The rest don't declare a driver name and instead rely on the crypto API automagically appending "-generic" upon registration. Having multiple conventions is unnecessarily confusing and makes it harder to grep for all generic algorithms in the kernel source tree. But also, allowing NULL driver names is problematic because sometimes people fail to set it, e.g. the case fixed by commit 417980364300 ("crypto: cavium/zip - fix collision with generic cra_driver_name"). Of course, people can also incorrectly name their drivers "-generic". But that's much easier to notice / grep for. Therefore, let's make cra_driver_name mandatory. In preparation for this, this patch makes all generic algorithms set cra_driver_name. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-13Merge branch 'context-id-fix' into fixesMichael Ellerman
This merges a fix for a bug in our context id handling on 64-bit hash CPUs. The fix was written against v5.1 to ease backporting to stable releases. Here we are merging it up to a v5.2-rc2 base, which involves a bit of manual resolution. It also adds a test case for the bug. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-12Merge tag 'selinux-pr-20190612' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux fixes from Paul Moore:
 "Three patches for v5.2.

  One fixes a problem where we weren't correctly logging raw SELinux labels, the other two fix problems where we weren't properly checking calls to kmemdup()"

* tag 'selinux-pr-20190612' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: fix a missing-check bug in selinux_sb_eat_lsm_opts()
  selinux: fix a missing-check bug in selinux_add_mnt_opt()
  selinux: log raw contexts as untrusted strings
2019-06-13selftests/powerpc: Add test of fork with mapping above 512TBMichael Ellerman
This tests that when a process with a mapping above 512TB forks we correctly separate the parent and child address spaces. This exercises the bug in the context id handling fixed in the previous commit. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-12drm/amdgpu: return 0 by default in amdgpu_pm_load_smu_firmwareAlex Deucher
Fixes SI cards running on amdgpu. Fixes: 1929059893022 ("drm/amd/amdgpu: add RLC firmware to support raven1 refresh") Bug: https://bugs.freedesktop.org/show_bug.cgi?id=110883 Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-12drm/amdgpu: Fix bounds checking in amdgpu_ras_is_supported()Dan Carpenter
The "block" variable can be set by the user through debugfs, so it can be quite large which leads to shift wrapping here. This means we report a "block" as supported when it's not, and that leads to array overflows later on. This bug is not really a security issue in real life, because debugfs is generally root only. Fixes: 36ea1bd2d084 ("drm/amdgpu: add debugfs ctrl node") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
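The kind of bounds check this fix calls for can be sketched in plain C. In this sketch BLOCK_LAST and block_supported() are hypothetical names and the value 14 is made up; the real driver uses its own enum and mask. The point is only that the index is range-checked before it is used as a shift count, because shifting a 32-bit value by 32 or more is undefined behavior in C and in practice often wraps around to a small shift.

```c
#include <stdint.h>

/* Hypothetical number of valid block ids; the real enum lives in the driver. */
enum { BLOCK_LAST = 14 };

/*
 * Return 1 if the bit for 'block' is set in 'supported_mask', 0 otherwise.
 * An out-of-range block is rejected before it is used as a shift count.
 */
static int block_supported(uint32_t supported_mask, unsigned int block)
{
	if (block >= BLOCK_LAST)
		return 0;
	return (supported_mask & (UINT32_C(1) << block)) != 0;
}
```

With a mask of 0x1, block 0 is reported as supported while block 32 is rejected. Without the range check, a wrapped shift could make block 32 look like block 0.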
2019-06-12Merge tag 'clk-meson-5.2-1-fixes' of https://github.com/BayLibre/clk-meson ↵Stephen Boyd
into clk-fixes

Pull Meson clk driver fixes from Jerome Brunet:

 - MPLL50M DT bindings typo fix
 - Meson8b VPU typo fixes

* tag 'clk-meson-5.2-1-fixes' of https://github.com/BayLibre/clk-meson:
  clk: meson: meson8b: fix a typo in the VPU parent names array variable
  clk: meson: fix MPLL 50M binding id typo
2019-06-12Input: synaptics - enable SMBus on ThinkPad E480 and E580Alexander Mikhaylenko
They are capable of using intertouch and it works well with psmouse.synaptics_intertouch=1, so add them to the list. Without it, scrolling and gestures are jumpy, three-finger pinch gesture doesn't work and three- or four-finger swipes sometimes get stuck. Signed-off-by: Alexander Mikhaylenko <exalm7659@gmail.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2019-06-12selinux: fix empty write to keycreate fileOndrej Mosnacek
When sid == 0 (we are resetting keycreate_sid to the default value), we should skip the KEY__CREATE check. Before this patch, doing a zero-sized write to /proc/self/keycreate would check if the current task can create unlabeled keys (which would usually fail with -EACCES and generate an AVC). Now it skips the check and correctly sets the task's keycreate_sid to 0. Bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1719067 Tested using the reproducer from the report above. Fixes: 4eb582cf1fbd ("[PATCH] keys: add a way to store the appropriate context for newly-created keys") Reported-by: Kir Kolyshkin <kir@sacred.ru> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
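The decision described above can be modelled in a few lines. This is a sketch only: keycreate_write_allowed() and its may_create_key parameter are hypothetical and stand in for the real permission check, which is not reproduced here.

```c
#include <errno.h>

/*
 * Hypothetical model of the check. 'sid' is the label being written
 * (0 means "reset to default"); 'may_create_key' stands in for the
 * result of the KEY__CREATE permission check for that label.
 * Returns 0 if the write may proceed, -EACCES otherwise.
 */
static int keycreate_write_allowed(unsigned int sid, int may_create_key)
{
	/* Resetting to the default needs no permission check. */
	if (sid == 0)
		return 0;
	return may_create_key ? 0 : -EACCES;
}
```

In this model a reset (sid of 0) always succeeds, while a nonzero sid still depends on the permission check.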
2019-06-12Merge branch 'net-mvpp2-prs-Fixes-for-VID-filtering'David S. Miller
Maxime Chevallier says:

====================
net: mvpp2: prs: Fixes for VID filtering

This series fixes some issues with VID filtering offload, mainly due to the wrong ranges being used in the TCAM header parser.

The first patch fixes a bug where removing a VLAN from a port's whitelist would also remove it from other ports' whitelists, if they are on the same PPv2 instance.

The second patch makes sure that we don't invalidate the wrong TCAM entries when clearing the whole whitelist.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12net: mvpp2: prs: Use the correct helpers when removing all VID filtersMaxime Chevallier
When removing all VID filters, the mvpp2_prs_vid_entry_remove would be called with the TCAM id incorrectly used as a VID, causing the wrong TCAM entries to be invalidated. Fix this by directly invalidating entries in the VID range. Fixes: 56beda3db602 ("net: mvpp2: Add hardware offloading for VLAN filtering") Suggested-by: Yuri Chipchev <yuric@marvell.com> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12net: mvpp2: prs: Fix parser range for VID filteringMaxime Chevallier
VID filtering is implemented in the Header Parser, with one range of 11 VIDs being assigned to each non-loopback port. Make sure we use the per-port range when looking for existing entries in the Parser. Since we used a global range instead of a per-port one, this caused VIDs to be removed from the whitelist of all ports of the same PPv2 instance. Fixes: 56beda3db602 ("net: mvpp2: Add hardware offloading for VLAN filtering") Suggested-by: Yuri Chipchev <yuric@marvell.com> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
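The per-port range idea can be sketched as index arithmetic. Everything named here is an assumption for illustration: VID_FILT_RANGE_START (100) is a made-up base index and the helper names are hypothetical. Only the figure of 11 entries per port comes from the message above.

```c
#include <stdbool.h>

#define VID_FILT_RANGE_START 100	/* hypothetical first index of the VID area */
#define VID_FILT_PER_PORT    11		/* entries reserved for each port */

/* First parser entry index reserved for 'port'. */
static int vid_range_first(int port)
{
	return VID_FILT_RANGE_START + port * VID_FILT_PER_PORT;
}

/* Last parser entry index reserved for 'port' (inclusive). */
static int vid_range_last(int port)
{
	return vid_range_first(port) + VID_FILT_PER_PORT - 1;
}

/* A lookup or removal for 'port' should only consider entries in its own range. */
static bool entry_belongs_to_port(int index, int port)
{
	return index >= vid_range_first(port) && index <= vid_range_last(port);
}
```

Searching from the first index of port 0 to the last index of the last port, instead of within one port's range, is the kind of global lookup that lets one port's removal hit another port's entry.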
2019-06-12Merge branch 'mlxsw-Various-fixes'David S. Miller
Ido Schimmel says:

====================
mlxsw: Various fixes

This patchset contains various fixes for mlxsw.

Patch #1 fixes a hash polarization problem when a nexthop device is a LAG device. This is caused by the fact that the same seed is used for the LAG and ECMP hash functions.

Patch #2 fixes an issue in which the driver fails to refresh a nexthop neighbour after it becomes dead. This prevents the nexthop from ever being written to the adjacency table and used to forward traffic. Patch #3 is a test case.

Patch #4 fixes a wrong extraction of TOS value in flower offload code. Patch #5 is a test case.

Patch #6 works around a buffer issue in Spectrum-2 by reducing the default sizes of the shared buffer pools.

Patch #7 prevents prio-tagged packets from entering the switch when PVID is removed from the bridge port.

Please consider patches #2, #4 and #6 for 5.1.y
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12mlxsw: spectrum: Disallow prio-tagged packets when PVID is removedIdo Schimmel
When PVID is removed from a bridge port, the Linux bridge drops both untagged and prio-tagged packets. Align mlxsw with this behavior. Fixes: 148f472da5db ("mlxsw: reg: Add the Switch Port Acceptable Frame Types register") Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12mlxsw: spectrum_buffers: Reduce pool size on Spectrum-2Petr Machata
Due to an issue on Spectrum-2, in front-panel ports split four ways, 2 out of 32 port buffers cannot be used. To work around this, the next FW release will mark them as unused, and will report correspondingly lower total shared buffer size. mlxsw will pick up the new value through a query to cap_total_buffer_size resource. However the initial size for shared buffer pool 0 is hard-coded and therefore needs to be updated. Thus reduce the pool size by 2.7 MiB (which corresponds to 2/32 of the total size of 42 MiB), and round down to the whole number of cells. Fixes: fe099bf682ab ("mlxsw: spectrum_buffers: Add Spectrum-2 shared buffer configuration") Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
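The arithmetic behind the reduction can be checked with a short sketch. The helper names are hypothetical, and the cell size of 144 bytes used in the usage note is an assumption for illustration only. The 2/32 fraction and the 42 MiB total are taken from the message above.

```c
#include <stdint.h>

#define MIB (UINT64_C(1024) * 1024)

/* Bytes lost when 2 out of 32 port buffers cannot be used. */
static uint64_t unusable_bytes(uint64_t total_bytes)
{
	return total_bytes * 2 / 32;
}

/* Round a byte count down to a whole number of cells. */
static uint64_t round_down_to_cells(uint64_t bytes, uint64_t cell_size)
{
	return bytes / cell_size * cell_size;
}
```

For a 42 MiB total, unusable_bytes(42 * MIB) gives 2,752,512 bytes, which is 2.625 MiB or roughly 2.75 million bytes, and appears to be the figure the message quotes as 2.7. With an assumed 144-byte cell, round_down_to_cells(1000, 144) gives 864, that is six whole cells.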
2019-06-12selftests: tc_flower: Add TOS matching testJiri Pirko
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12mlxsw: spectrum_flower: Fix TOS matchingJiri Pirko
The TOS value was not extracted correctly. Fix it. Fixes: 87996f91f739 ("mlxsw: spectrum_flower: Add support for ip tos") Reported-by: Alexander Petrovskiy <alexpe@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12selftests: mlxsw: Test nexthop offload indicationIdo Schimmel
Test that IPv4 and IPv6 nexthops are correctly marked with offload indication in response to neighbour events. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12mlxsw: spectrum_router: Refresh nexthop neighbour when it becomes deadIdo Schimmel
The driver tries to periodically refresh neighbours that are used to reach nexthops. This is done by periodically calling neigh_event_send(). However, if the neighbour becomes dead, there is nothing we can do to return it to a connected state and the above function call is basically a NOP. This results in the nexthop never being written to the device's adjacency table and therefore never used to forward packets. Fix this by dropping our reference from the dead neighbour and associating the nexthop with a new neighbour which we will try to refresh. Fixes: a7ff87acd995 ("mlxsw: spectrum_router: Implement next-hop routing") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Alex Veber <alexve@mellanox.com> Tested-by: Alex Veber <alexve@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12mlxsw: spectrum: Use different seeds for ECMP and LAG hashIdo Schimmel
The same hash function and seed are used for both ECMP and LAG hash. Therefore, when a LAG device is used as a nexthop device as part of an ECMP group, hash polarization can occur and all the traffic will be hashed to a single LAG slave. Fix this by using a different seed for the LAG hash. Fixes: fa73989f2697 ("mlxsw: spectrum: Use a stable ECMP/LAG seed") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Alex Veber <alexve@mellanox.com> Tested-by: Alex Veber <alexve@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
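Hash polarization is easy to reproduce in a toy model. In the sketch below, mix() is an arbitrary 32-bit mixing function chosen for illustration and is not the hardware's hash. The two-way ECMP group and two-member LAG are likewise assumptions. The model counts how many flows that were steered to the LAG by the ECMP hash end up on LAG member 1.

```c
#include <stdint.h>

/* Arbitrary 32-bit mixing hash, an assumption for illustration only. */
static uint32_t mix(uint32_t flow, uint32_t seed)
{
	uint32_t h = flow ^ seed;

	h ^= h >> 16;
	h *= UINT32_C(0x7feb352d);
	h ^= h >> 15;
	h *= UINT32_C(0x846ca68b);
	h ^= h >> 16;
	return h;
}

/*
 * Model: a 2-way ECMP group whose path 0 is a 2-member LAG.
 * Returns the number of flows (ids 0..nflows-1) that take ECMP
 * path 0 and are then hashed to LAG member 1.
 */
static unsigned int flows_on_lag_member1(uint32_t ecmp_seed, uint32_t lag_seed,
					 unsigned int nflows)
{
	unsigned int count = 0;

	for (uint32_t flow = 0; flow < nflows; flow++) {
		if (mix(flow, ecmp_seed) % 2 != 0)
			continue;	/* flow went to the other nexthop */
		if (mix(flow, lag_seed) % 2 == 1)
			count++;	/* flow landed on LAG member 1 */
	}
	return count;
}
```

When both seeds are equal, every flow that reaches the LAG already has an even hash, so the same hash modulo 2 can never select member 1 and the function returns 0. With a different LAG seed the second hash is no longer tied to the first, which is what lets traffic spread over both members.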
2019-06-12net: tls, correctly account for copied bytes with multiple sk_msgsJohn Fastabend
tls_sw_do_sendpage needs to return the total number of bytes sent regardless of how many sk_msgs are allocated. Unfortunately, copied (the value we return up the stack) is zero'd before each new sk_msg is allocated so we only return the copied size of the last sk_msg used. The caller (splice, etc.) of sendpage will then believe only part of its data was sent and send the missing chunks again. However, because the data actually was sent the receiver will get multiple copies of the same data. To reproduce this do multiple sendfile calls with a length close to the max record size. This will in turn call splice/sendpage, sendpage may use multiple sk_msg in this case and then returns the incorrect number of bytes. This will cause splice to resend creating duplicate data on the receiver. Andre created a C program that can easily generate this case so we will push a similar selftest for this to bpf-next shortly. The fix is to _not_ zero the copied field so that the total sent bytes is returned. Reported-by: Steinar H. Gunderson <steinar+kernel@gunderson.no> Reported-by: Andre Tomt <andre@tomt.net> Tested-by: Andre Tomt <andre@tomt.net> Fixes: d829e9c4112b ("tls: convert to generic sk_msg interface") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
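The accounting bug can be modelled without any TLS code. In this sketch send_all() is a hypothetical function that splits a request into messages of at most max_per_msg bytes (which must be nonzero), and the reset_per_msg flag switches between the buggy and the fixed accounting. The numbers in the usage note are made up.

```c
#include <stddef.h>

/*
 * Hypothetical model: send 'len' bytes in messages of at most
 * 'max_per_msg' bytes (must be > 0) and return the byte count
 * reported to the caller. With 'reset_per_msg' set, the counter
 * is zeroed for each new message, as in the bug described above.
 */
static size_t send_all(size_t len, size_t max_per_msg, int reset_per_msg)
{
	size_t copied = 0;

	while (len > 0) {
		size_t chunk = len < max_per_msg ? len : max_per_msg;

		if (reset_per_msg)
			copied = 0;	/* bug: earlier messages are forgotten */
		copied += chunk;
		len -= chunk;
	}
	return copied;
}
```

Sending 40000 bytes with a 16384-byte limit takes three messages. The fixed accounting reports 40000, while the buggy one reports only the last message, 7232 bytes, so a caller would resend 32768 bytes that were already delivered.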
2019-06-12vrf: Increment Icmp6InMsgs on the original netdevStephen Suryaputra
Get the ingress interface and increment ICMP counters based on that instead of skb->dev when the dev is a VRF device. This is a follow-up to the following message: https://www.spinics.net/lists/netdev/msg560268.html v2: Avoid changing skb->dev since it has an unintended effect on local delivery (David Ahern). Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12cpuset: restore sanity to cpuset_cpus_allowed_fallback()Joel Savitz
In the case that a process is constrained by taskset(1) (i.e. sched_setaffinity(2)) to a subset of available cpus, and all of those are subsequently offlined, the scheduler will set tsk->cpus_allowed to the current value of task_cs(tsk)->effective_cpus.

This is done via a call to do_set_cpus_allowed() in the context of cpuset_cpus_allowed_fallback() made by the scheduler when this case is detected. This is the only call made to cpuset_cpus_allowed_fallback() in the latest mainline kernel.

However, this is not sane behavior.

I will demonstrate this on a system running the latest upstream kernel with the following initial configuration:

    # grep -i cpu /proc/$$/status
    Cpus_allowed:	ffffffff,ffffffff
    Cpus_allowed_list:	0-63

(Where cpus 32-63 are provided via smt.)

If we limit our current shell process to cpu2 only and then offline it and reonline it:

    # taskset -p 4 $$
    pid 2272's current affinity mask: ffffffffffffffff
    pid 2272's new affinity mask: 4

    # echo off > /sys/devices/system/cpu/cpu2/online
    # dmesg | tail -3
    [ 2195.866089] process 2272 (bash) no longer affine to cpu2
    [ 2195.872700] IRQ 114: no longer affine to CPU2
    [ 2195.879128] smpboot: CPU 2 is now offline

    # echo on > /sys/devices/system/cpu/cpu2/online
    # dmesg | tail -1
    [ 2617.043572] smpboot: Booting Node 0 Processor 2 APIC 0x4

We see that our current process now has an affinity mask containing every cpu available on the system _except_ the one we originally constrained it to:

    # grep -i cpu /proc/$$/status
    Cpus_allowed:	ffffffff,fffffffb
    Cpus_allowed_list:	0-1,3-63

This is not sane behavior, as the scheduler can now not only place the process on previously forbidden cpus, it can't even schedule it on the cpu it was originally constrained to!

Other cases result in even more exotic affinity masks. Take for instance a process with an affinity mask containing only cpus provided by smt at the moment that smt is toggled, in a configuration such as the following:

    # taskset -p f000000000 $$
    # grep -i cpu /proc/$$/status
    Cpus_allowed:	000000f0,00000000
    Cpus_allowed_list:	36-39

A double toggle of smt results in the following behavior:

    # echo off > /sys/devices/system/cpu/smt/control
    # echo on > /sys/devices/system/cpu/smt/control
    # grep -i cpus /proc/$$/status
    Cpus_allowed:	ffffff00,ffffffff
    Cpus_allowed_list:	0-31,40-63

This is even less sane than the previous case, as the new affinity mask excludes all smt-provided cpus with ids less than those that were previously in the affinity mask, as well as those that were actually in the mask.

With this patch applied, both of these cases end in the following state:

    # grep -i cpu /proc/$$/status
    Cpus_allowed:	ffffffff,ffffffff
    Cpus_allowed_list:	0-63

The original policy is discarded. Though not ideal, it is the simplest way to restore sanity to this fallback case without reinventing the cpuset wheel that rolls down the kernel just fine in cgroup v2. A user who wishes for the previous affinity mask to be restored in this fallback case can use that mechanism instead.

This patch modifies scheduler behavior by instead resetting the mask to task_cs(tsk)->cpus_allowed by default, and cpu_possible mask in legacy mode. I tested the cases above on both modes.

Note that the scheduler uses this fallback mechanism if and only if _every_ other valid avenue has been traveled, and it is the last resort before calling BUG().

Suggested-by: Waiman Long <longman@redhat.com>
Suggested-by: Phil Auld <pauld@redhat.com>
Signed-off-by: Joel Savitz <jsavitz@redhat.com>
Acked-by: Phil Auld <pauld@redhat.com>
Acked-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
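The hexadecimal masks quoted in this message map directly to cpu lists, which a small helper makes easy to verify. cpu_range_mask() is a hypothetical helper written for this illustration and assumes at most 64 cpus, as on the system described.

```c
#include <stdint.h>

/*
 * Build a 64-bit affinity mask with bits 'first' through 'last'
 * (inclusive) set. Cpu ids of 64 and above are ignored.
 */
static uint64_t cpu_range_mask(unsigned int first, unsigned int last)
{
	uint64_t mask = 0;

	for (unsigned int cpu = first; cpu <= last && cpu < 64; cpu++)
		mask |= UINT64_C(1) << cpu;
	return mask;
}
```

cpu_range_mask(2, 2) is 0x4, the value passed to taskset for cpu2, and cpu_range_mask(36, 39) is 0xf000000000. Clearing bit 2 from the full mask gives 0xfffffffffffffffb, printed by the kernel as ffffffff,fffffffb, and combining ranges 0-31 and 40-63 gives 0xffffff00ffffffff, printed as ffffff00,ffffffff.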
2019-06-12net: ethtool: Allow matching on vlan DEI bitMaxime Chevallier
Using ethtool, users can specify a classification action matching on the full vlan tag, which includes the DEI bit (also previously called CFI). However, when converting the ethtool_rx_flow_spec to a flow_rule, we use dissector keys to represent the matching patterns. Since the vlan dissector key doesn't include the DEI bit, this information was silently discarded when translating the ethtool flow spec into a flow_rule. This commit adds the DEI bit into the vlan dissector key, and allows propagating the information to the driver when parsing the ethtool flow spec. Fixes: eca4205f9ec3 ("ethtool: add ethtool_rx_flow_spec to flow_rule structure translator") Reported-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
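For reference, the 16-bit VLAN tag control field is laid out as 3 bits of priority (PCP), 1 DEI bit and 12 bits of VLAN id, so the DEI bit sits at mask 0x1000. The sketch below decodes the three parts. The macro and function names are chosen for this illustration and are not the kernel's own definitions.

```c
#include <stdint.h>

#define TCI_PCP_SHIFT 13	/* priority: top 3 bits */
#define TCI_DEI_MASK  0x1000u	/* drop eligible indicator: bit 12 */
#define TCI_VID_MASK  0x0fffu	/* VLAN id: low 12 bits */

static unsigned int tci_pcp(uint16_t tci)
{
	return tci >> TCI_PCP_SHIFT;
}

static unsigned int tci_dei(uint16_t tci)
{
	return (tci & TCI_DEI_MASK) != 0;
}

static unsigned int tci_vid(uint16_t tci)
{
	return tci & TCI_VID_MASK;
}
```

For a tag value of 0xB064 this yields priority 5, DEI 1 and VLAN id 100. A key that only carries the priority and the VLAN id cannot tell 0xB064 from 0xA064, which is the information loss described above.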
2019-06-12linux-next: DOC: RDS: Fix a typo in rds.txtMasanari Iida
This patch fixes a spelling typo in rds.txt. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12x86/kgdb: Return 0 from kgdb_arch_set_breakpoint()Matt Mullins
err must be nonzero in order to reach text_poke(), which caused kgdb to fail to set breakpoints:

    (gdb) break __x64_sys_sync
    Breakpoint 1 at 0xffffffff81288910: file ../fs/sync.c, line 124.
    (gdb) c
    Continuing.
    Warning:
    Cannot insert breakpoint 1.
    Cannot access memory at address 0xffffffff81288910

    Command aborted.

Fixes: 86a22057127d ("x86/kgdb: Avoid redundant comparison of patched code")
Signed-off-by: Matt Mullins <mmullins@fb.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Nadav Amit <namit@vmware.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190531194755.6320-1-mmullins@fb.com
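One plausible reading of the problem is that the fallback path is only entered with a nonzero err, and that the same err was then returned even when the fallback succeeded. The sketch below models that control flow under this assumption. set_breakpoint_model() and its parameters are hypothetical and the real function is not reproduced here.

```c
#include <errno.h>

/*
 * Hypothetical model of the control flow. 'direct_write_err' is the
 * result of a first, direct attempt to write the breakpoint; the
 * fallback (text_poke() in the real code) is assumed to succeed.
 * 'apply_fix' selects between returning the stale err and returning 0.
 */
static int set_breakpoint_model(int direct_write_err, int apply_fix)
{
	int err = direct_write_err;

	if (!err)
		return err;	/* direct write worked, nothing more to do */

	/* Fallback path: reached only when err is nonzero. */
	/* ... the fallback write would happen here and succeed ... */

	return apply_fix ? 0 : err;
}
```

In this model the unfixed variant reports -EFAULT to the caller although the breakpoint was installed by the fallback, which would match the "Cannot insert breakpoint" symptom in the gdb session. Returning 0 after the fallback reports success.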