summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-08-30string: improve default out-of-line memcmp() implementationLinus Torvalds
This just does the "if the architecture does efficient unaligned handling, start the memcmp using 'unsigned long' accesses", since Nikolay Borisov found a load that cares. This is basically the minimal patch, and limited to architectures that are known to not have slow unaligned handling. We've had the stupid byte-at-a-time version forever, and nobody has ever even noticed before, so let's keep the fix minimal. A potential further improvement would be to align one of the sources in order to at least minimize unaligned cases, but the only real case of bigger memcmp() users seems to be the FIDEDUPERANGE ioctl(). As David Sterba says, the dedupe ioctl is typically called on ranges spanning many pages so the common case will all be page-aligned anyway. All the relevant architectures select HAVE_EFFICIENT_UNALIGNED_ACCESS, so I'm not going to worry about the combination of a very rare use-case and a rare architecture until somebody actually hits it. Particularly since Nikolay also tested the more complex patch with extra alignment handling code, and it only added overhead. Link: https://lore.kernel.org/lkml/20210721135926.602840-1-nborisov@suse.com/ Reported-by: Nikolay Borisov <nborisov@suse.com> Cc: David Sterba <dsterba@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-30Merge branch 'rework/printk_safe-removal' into for-linusPetr Mladek
2021-08-30Merge branch 'rework/fixup-for-5.15' into for-linusPetr Mladek
2021-08-30io-wq: fix wakeup race when adding new workJens Axboe
When new work is added, io_wqe_enqueue() checks if we need to wake or create a new worker. But that check is done outside the lock that otherwise synchronizes us with a worker going to sleep, so we can end up in the following situation: CPU0 CPU1 lock insert work unlock atomic_read(nr_running) != 0 lock atomic_dec(nr_running) no wakeup needed Hold the wqe lock around the "need to wakeup" check. Then we can also get rid of the temporary work_flags variable, as we know the work will remain valid as long as we hold the lock. Cc: stable@vger.kernel.org Reported-by: Andres Freund <andres@anarazel.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-30io-wq: wqe and worker locks no longer need to be IRQ safeJens Axboe
io_uring no longer queues async work off completion handlers that run in hard or soft interrupt context, and that use case was the only reason that io-wq had to use IRQ safe locks for wqe and worker locks. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-30io-wq: check max_worker limits if a worker transitions bound stateJens Axboe
For the two places where new workers are created, we diligently check if we are allowed to create a new worker. If we're currently at the limit of how many workers of a given type we can have, then we don't create any new ones. If you have a mixed workload with various types of bound and unbounded work, then it can happen that a worker finishes one type of work and is then transitioned to the other type. For this case, we don't check if we are actually allowed to do so. This can cause io-wq to temporarily exceed the allowed number of workers for a given type. When retrieving work, check that the types match. If they don't, check if we are allowed to transition to the other type. If not, then don't handle the new work. Cc: stable@vger.kernel.org Reported-by: Johannes Lundberg <johalun0@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-30perf tests: Fix *probe_vfs_getname.sh test failuresJames Clark
The commit 4d6101f5fd5d9960 ("perf probe: Clarify error message about not finding kernel modules debuginfo") changed the error message "Failed to find the path for kernel" to "Failed to find the path for the kernel". Update the regex so that the tests still skip rather than fail when kernel debug symbols aren't present. Signed-off-by: James Clark <james.clark@arm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Link: http://lore.kernel.org/lkml/20210825164259.833222-1-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-30perf bench inject-buildid: Handle writen() errorsArnaldo Carvalho de Melo
The build on fedora:35 and fedora:rawhide with clang is failing with: 49 41.00 fedora:35 : FAIL clang version 13.0.0 (Fedora 13.0.0~rc1-1.fc35) bench/inject-buildid.c:351:6: error: variable 'len' set but not used [-Werror,-Wunused-but-set-variable] u64 len = 0; ^ 1 error generated. make[3]: *** [/git/perf-5.14.0-rc7/tools/build/Makefile.build:139: bench] Error 2 50 41.11 fedora:rawhide : FAIL clang version 13.0.0 (Fedora 13.0.0~rc1-1.fc35) bench/inject-buildid.c:351:6: error: variable 'len' set but not used [-Werror,-Wunused-but-set-variable] u64 len = 0; ^ 1 error generated. make[3]: *** [/git/perf-5.14.0-rc7/tools/build/Makefile.build:139: bench] Error 2 That 'len' variable is not used at all, so just make sure all the synthesize_RECORD() routines return ssize_t to propagate the writen() return, as it may fail, ditch the 'ret' var and bail out if those routines fail. Fixes: 0bf02a0d80427f26 ("perf bench: Add build-id injection benchmark") Acked-by: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/CAM9d7cgEZNSor+B+7Y2C+QYGme_v5aH0Zn0RLfxoQ+Fy83EHrg@mail.gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-30perf unwind: Do not overwrite FEATURE_CHECK_LDFLAGS-libunwind-{x86,aarch64}Li Huafei
When setting LIBUNWIND_DIR, we first set FEATURE_CHECK_LDFLAGS-libunwind-{aarch64,x86} = -L$(LIBUNWIND_DIR)/lib. <committer note> This happens a bit before, the overwritting, in: libunwind_arch_set_flags = $(eval $(libunwind_arch_set_flags_code)) define libunwind_arch_set_flags_code FEATURE_CHECK_CFLAGS-libunwind-$(1) = -I$(LIBUNWIND_DIR)/include FEATURE_CHECK_LDFLAGS-libunwind-$(1) = -L$(LIBUNWIND_DIR)/lib endef ifdef LIBUNWIND_DIR LIBUNWIND_CFLAGS = -I$(LIBUNWIND_DIR)/include LIBUNWIND_LDFLAGS = -L$(LIBUNWIND_DIR)/lib LIBUNWIND_ARCHS = x86 x86_64 arm aarch64 debug-frame-arm debug-frame-aarch64 $(foreach libunwind_arch,$(LIBUNWIND_ARCHS),$(call libunwind_arch_set_flags,$(libunwind_arch))) endif Look at that 'foreach' on all the LIBUNWIND_ARCHS. </> After commit 5c4d7c82c0dc ("perf unwind: Do not put libunwind-{x86,aarch64} in FEATURE_TESTS_BASIC"), FEATURE_CHECK_LDFLAGS-libunwind-{x86,aarch64} is overwritten. As a result, the remote libunwind libraries cannot be searched from $(LIBUNWIND_DIR)/lib directory during feature check tests. Fix it with variable appending. Before this patch: perf$ make VF=1 LIBUNWIND_DIR=/opt/libunwind_aarch64 BUILD: Doing 'make -j16' parallel build <SNIP> ... ... libopencsd: [ OFF ] ... libunwind-x86: [ OFF ] ... libunwind-x86_64: [ OFF ] ... libunwind-arm: [ OFF ] ... libunwind-aarch64: [ OFF ] ... libunwind-debug-frame: [ OFF ] ... libunwind-debug-frame-arm: [ OFF ] ... libunwind-debug-frame-aarch64: [ OFF ] ... cxx: [ OFF ] <SNIP> perf$ cat ../build/feature/test-libunwind-aarch64.make.output /usr/bin/ld: cannot find -lunwind-aarch64 /usr/bin/ld: cannot find -lunwind-aarch64 collect2: error: ld returned 1 exit status After this patch: perf$ make VF=1 LIBUNWIND_DIR=/opt/libunwind_aarch64 BUILD: Doing 'make -j16' parallel build <SNIP> ... libopencsd: [ OFF ] ... libunwind-x86: [ OFF ] ... libunwind-x86_64: [ OFF ] ... libunwind-arm: [ OFF ] ... libunwind-aarch64: [ on ] ... libunwind-debug-frame: [ OFF ] ... libunwind-debug-frame-arm: [ OFF ] ... libunwind-debug-frame-aarch64: [ OFF ] ... cxx: [ OFF ] <SNIP> perf$ cat ../build/feature/test-libunwind-aarch64.make.output perf$ ldd ./perf linux-vdso.so.1 (0x00007ffdf07da000) libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f30953dc000) librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f30951d4000) libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f3094e36000) libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f3094c32000) libelf.so.1 => /usr/lib/x86_64-linux-gnu/libelf.so.1 (0x00007f3094a18000) libdw.so.1 => /usr/lib/x86_64-linux-gnu/libdw.so.1 (0x00007f30947cc000) libunwind-x86_64.so.8 => /usr/lib/x86_64-linux-gnu/libunwind-x86_64.so.8 (0x00007f30945ad000) libunwind.so.8 => /usr/lib/x86_64-linux-gnu/libunwind.so.8 (0x00007f3094392000) liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f309416c000) libunwind-aarch64.so.8 => not found libslang.so.2 => /lib/x86_64-linux-gnu/libslang.so.2 (0x00007f3093c8a000) libpython2.7.so.1.0 => /usr/local/lib/libpython2.7.so.1.0 (0x00007f309386b000) libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f309364e000) libnuma.so.1 => /usr/lib/x86_64-linux-gnu/libnuma.so.1 (0x00007f3093443000) libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3093052000) /lib64/ld-linux-x86-64.so.2 (0x00007f3096097000) libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007f3092e42000) libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f3092c3f000) Fixes: 5c4d7c82c0dceccf ("perf unwind: Do not put libunwind-{x86,aarch64} in FEATURE_TESTS_BASIC") Signed-off-by: Li Huafei <lihuafei1@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhang Jinhao <zhangjinhao2@huawei.com> Link: http://lore.kernel.org/lkml/20210823134340.60955-1-lihuafei1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-30perf config: Fix caching and memory leak in perf_home_perfconfig()Arnaldo Carvalho de Melo
Acaict, perf_home_perfconfig() is supposed to cache the result of home_perfconfig, which returns the default location of perfconfig for the user, given the HOME environment variable. However, the current implementation calls home_perfconfig every time perf_home_perfconfig() is called (so no caching is actually performed), replacing the previous pointer, thus also causing a memory leak. This patch adds a check of whether either config or failed is set and, in that case, directly returns config without calling home_perfconfig at each invocation. Fixes: f5f03e19ce14fc31 ("perf config: Add perf_home_perfconfig function") Signed-off-by: Riccardo Mancini <rickyman7@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Link: http://lore.kernel.org/lkml/20210820130817.740536-1-rickyman7@gmail.com [ Removed needless double check for the 'failed' variable ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-30perf tools: Fixup get_current_dir_name() compilationAlexey Dobriyan
strdup() prototype doesn't live in stdlib.h . Add limits.h for PATH_MAX definition as well. This fixes the build on Android. Signed-off-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/YRukaQbrgDWhiwGr@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-30Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo
To pick up fixes. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-30Merge tag 'asoc-v5.15' of ↵Takashi Iwai
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Updates for v5.15 Quite a quiet release this time, mostly a combination of cleanups and a good set of new drivers. - Lots of cleanups and improvements to the Intel drivers, including some new systems support. - New support for AMD Vangoh, CUI CMM-4030D-261, Mediatek Mt8195, Renesas RZ/G2L Mediatek Mt8195, RealTek RT101P, Renesas RZ/G2L,, Rockchip RK3568 S/PDIF.
2021-08-30Merge branch 'for-5.15-verbose-console' into for-linusPetr Mladek
2021-08-30Merge branch 'for-5.15-printk-index' into for-linusPetr Mladek
2021-08-30Merge branch 'sg_nents' into rdma.git for-nextJason Gunthorpe
From Maor Gottlieb ==================== Fix the use of nents and orig_nents in the sg table append helpers. The nents should be used by the DMA layer to store the number of DMA mapped sges, the orig_nents is the number of CPU sges. Since the sg append logic doesn't always create a SGL with exactly orig_nents entries store a total_nents as well to allow the table to be properly free'd and reorganize the freeing logic to share across all the use cases. ==================== Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> * 'sg_nents': RDMA: Use the sg_table directly and remove the opencoded version from umem lib/scatterlist: Fix wrong update of orig_nents lib/scatterlist: Provide a dedicated function to support table append
2021-08-30RDMA/mlx5: Relax DCS QP creation checksLior Nahmanson
In order to create DCS QPs, we don't need to rely on both log_max_dci_stream_channels and log_max_dci_errored_streams capabilities. Fixes: 11656f593a86 ("RDMA/mlx5: Add DCS offload support") Link: https://lore.kernel.org/r/3e7b3363fd73686176cc584295e86832a7cf99b2.1630320354.git.leonro@nvidia.com Signed-off-by: Lior Nahmanson <liorna@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-30dt-bindings: Use 'enum' instead of 'oneOf' plus 'const' entriesRob Herring
'enum' is equivalent to 'oneOf' with a list of 'const' entries, but 'enum' is more concise and yields better error messages. Cc: Maxime Ripard <mripard@kernel.org> Cc: Vignesh R <vigneshr@ti.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Lee Jones <lee.jones@linaro.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Kishon Vijay Abraham I <kishon@ti.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: dmaengine@vger.kernel.org Cc: linux-i2c@vger.kernel.org Cc: linux-media@vger.kernel.org Cc: netdev@vger.kernel.org Cc: linux-pci@vger.kernel.org Cc: linux-phy@lists.infradead.org Cc: linux-serial@vger.kernel.org Cc: alsa-devel@alsa-project.org Cc: linux-spi@vger.kernel.org Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> (mipi-ccs) Acked-by: Mark Brown <broonie@kernel.org> Reviewed-by: Vinod Koul <vkoul@kernel.org> Acked-By: Vinod Koul <vkoul@kernel.org> Acked-by: Wolfram Sang <wsa@kernel.org> # for I2C Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210824202014.978922-1-robh@kernel.org
2021-08-30net: ipv4: Fix the warning for dereferenceYajun Deng
Add a if statements to avoid the warning. Dan Carpenter report: The patch faf482ca196a: "net: ipv4: Move ip_options_fragment() out of loop" from Aug 23, 2021, leads to the following Smatch complaint: net/ipv4/ip_output.c:833 ip_do_fragment() warn: variable dereferenced before check 'iter.frag' (see line 828) Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: faf482ca196a ("net: ipv4: Move ip_options_fragment() out of loop") Link: https://lore.kernel.org/netdev/20210830073802.GR7722@kadam/T/#t Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-30Merge remote-tracking branch 'asoc/for-5.15' into asoc-linusMark Brown
2021-08-30Merge remote-tracking branch 'asoc/for-5.14' into asoc-linusMark Brown
2021-08-30net: qrtr: make checks in qrtr_endpoint_post() stricterDan Carpenter
These checks are still not strict enough. The main problem is that if "cb->type == QRTR_TYPE_NEW_SERVER" is true then "len - hdrlen" is guaranteed to be 4 but we need to be at least 16 bytes. In fact, we can reject everything smaller than sizeof(*pkt) which is 20 bytes. Also I don't like the ALIGN(size, 4). It's better to just insist that data is needs to be aligned at the start. Fixes: 0baa99ee353c ("net: qrtr: Allow non-immediate node routing") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-30fix array-index-out-of-bounds in taprio_changeHaimin Zhang
syzbot report an array-index-out-of-bounds in taprio_change index 16 is out of range for type '__u16 [16]' that's because mqprio->num_tc is lager than TC_MAX_QUEUE,so we check the return value of netdev_set_num_tc. Reported-by: syzbot+2b3e5fb6c7ef285a94f6@syzkaller.appspotmail.com Signed-off-by: Haimin Zhang <tcs_kernel@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-30net: fix NULL pointer reference in cipso_v4_doi_free王贇
In netlbl_cipsov4_add_std() when 'doi_def->map.std' alloc failed, we sometime observe panic: BUG: kernel NULL pointer dereference, address: ... RIP: 0010:cipso_v4_doi_free+0x3a/0x80 ... Call Trace: netlbl_cipsov4_add_std+0xf4/0x8c0 netlbl_cipsov4_add+0x13f/0x1b0 genl_family_rcv_msg_doit.isra.15+0x132/0x170 genl_rcv_msg+0x125/0x240 This is because in cipso_v4_doi_free() there is no check on 'doi_def->map.std' when doi_def->type got value 1, which is possibe, since netlbl_cipsov4_add_std() haven't initialize it before alloc 'doi_def->map.std'. This patch just add the check to prevent panic happen in similar cases. Reported-by: Abaci <abaci@linux.alibaba.com> Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-30Merge branch 'inet-exceptions-less-predictable'David S. Miller
Eric Dumazet says: ==================== inet: make exception handling less predictible This second round of patches is addressing Keyu Man recommendations to make linux hosts more robust against a class of brute force attacks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-30ipv4: make exception cache less predictibleEric Dumazet
Even after commit 6457378fe796 ("ipv4: use siphash instead of Jenkins in fnhe_hashfun()"), an attacker can still use brute force to learn some secrets from a victim linux host. One way to defeat these attacks is to make the max depth of the hash table bucket a random value. Before this patch, each bucket of the hash table used to store exceptions could contain 6 items under attack. After the patch, each bucket would contains a random number of items, between 6 and 10. The attacker can no longer infer secrets. This is slightly increasing memory size used by the hash table, by 50% in average, we do not expect this to be a problem. This patch is more complex than the prior one (IPv6 equivalent), because IPv4 was reusing the oldest entry. Since we need to be able to evict more than one entry per update_or_create_fnhe() call, I had to replace fnhe_oldest() with fnhe_remove_oldest(). Also note that we will queue extra kfree_rcu() calls under stress, which hopefully wont be a too big issue. Fixes: 4895c771c7f0 ("ipv4: Add FIB nexthop exceptions.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Keyu Man <kman001@ucr.edu> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Reviewed-by: David Ahern <dsahern@kernel.org> Tested-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-30ipv6: make exception cache less predictibleEric Dumazet
Even after commit 4785305c05b2 ("ipv6: use siphash in rt6_exception_hash()"), an attacker can still use brute force to learn some secrets from a victim linux host. One way to defeat these attacks is to make the max depth of the hash table bucket a random value. Before this patch, each bucket of the hash table used to store exceptions could contain 6 items under attack. After the patch, each bucket would contains a random number of items, between 6 and 10. The attacker can no longer infer secrets. This is slightly increasing memory size used by the hash table, we do not expect this to be a problem. Following patch is dealing with the same issue in IPv4. Fixes: 35732d01fe31 ("ipv6: introduce a hash table to store dst cache") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Keyu Man <kman001@ucr.edu> Cc: Wei Wang <weiwan@google.com> Cc: Martin KaFai Lau <kafai@fb.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-30ASoC: Revert PCM trigger changesMark Brown
These have turned up some issues in further testing. Signed-off-by: Mark Brown <broonie@kernel.org>
2021-08-30s390: remove SCHED_CORE from defconfigsHeiko Carstens
This causes too many problems. Enable it again when everything has been sorted out. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-30x86: xen: platform-pci-unplug: use pr_err() and pr_warn() instead of raw ↵zhaoxiao
printk() Since we have the nice helpers pr_err() and pr_warn(), use them instead of raw printk(). [jgross@suse.com] Move the "#define pr_fmt" above the #includes in order to avoid build warnings due to redefinition. Signed-off-by: zhaoxiao <zhaoxiao@uniontech.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20210825114111.29009-1-zhaoxiao@uniontech.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-08-30drivers/xen/xenbus/xenbus_client.c: fix bugon.cocci warningsJing Yangyang
Use BUG_ON instead of a if condition followed by BUG. Generated by: scripts/coccinelle/misc/bugon.cocci Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Jing Yangyang <jing.yangyang@zte.com.cn> Reviewed-by: SeongJae Park <sjpark@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20210825062451.69998-1-deng.changcheng@zte.com.cn Signed-off-by: Juergen Gross <jgross@suse.com>
2021-08-30xen/blkfront: don't trust the backend response data blindlyJuergen Gross
Today blkfront will trust the backend to send only sane response data. In order to avoid privilege escalations or crashes in case of malicious backends verify the data to be within expected limits. Especially make sure that the response always references an outstanding request. Introduce a new state of the ring BLKIF_STATE_ERROR which will be switched to in case an inconsistency is being detected. Recovering from this state is possible only via removing and adding the virtual device again (e.g. via a suspend/resume cycle). Make all warning messages issued due to valid error responses rate limited in order to avoid message floods being triggered by a malicious backend. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Link: https://lore.kernel.org/r/20210730103854.12681-4-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-08-30xen/blkfront: don't take local copy of a request from the ring pageJuergen Gross
In order to avoid a malicious backend being able to influence the local copy of a request build the request locally first and then copy it to the ring page instead of doing it the other way round as today. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Link: https://lore.kernel.org/r/20210730103854.12681-3-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-08-30xen/blkfront: read response from backend only onceJuergen Gross
In order to avoid problems in case the backend is modifying a response on the ring page while the frontend has already seen it, just read the response into a local buffer in one go and then operate on that buffer only. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Link: https://lore.kernel.org/r/20210730103854.12681-2-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-08-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next: 1) Clean up and consolidate ct ecache infrastructure by merging ct and expect notifiers, from Florian Westphal. 2) Missing counters and timestamp in nfnetlink_queue and _log conntrack information. 3) Missing error check for xt_register_template() in iptables mangle, as a incremental fix for the previous pull request, also from Florian Westphal. 4) Add netfilter hooks for the SRv6 lightweigh tunnel driver, from Ryoga Sato. The hooks are enabled via nf_hooks_lwtunnel sysctl to make sure existing netfilter rulesets do not break. There is a static key to disable the hooks by default. The pktgen_bench_xmit_mode_netif_receive.sh shows no noticeable impact in the seg6_input path for non-netfilter users: similar numbers with and without this patch. This is a sample of the perf report output: 11.67% kpktgend_0 [ipv6] [k] ipv6_get_saddr_eval 7.89% kpktgend_0 [ipv6] [k] __ipv6_addr_label 7.52% kpktgend_0 [ipv6] [k] __ipv6_dev_get_saddr 6.63% kpktgend_0 [kernel.vmlinux] [k] asm_exc_nmi 4.74% kpktgend_0 [ipv6] [k] fib6_node_lookup_1 3.48% kpktgend_0 [kernel.vmlinux] [k] pskb_expand_head 3.33% kpktgend_0 [ipv6] [k] ip6_rcv_core.isra.29 3.33% kpktgend_0 [ipv6] [k] seg6_do_srh_encap 2.53% kpktgend_0 [ipv6] [k] ipv6_dev_get_saddr 2.45% kpktgend_0 [ipv6] [k] fib6_table_lookup 2.24% kpktgend_0 [kernel.vmlinux] [k] ___cache_free 2.16% kpktgend_0 [ipv6] [k] ip6_pol_route 2.11% kpktgend_0 [kernel.vmlinux] [k] __ipv6_addr_type ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-30xen: assume XENFEAT_gnttab_map_avail_bits being set for pv guestsJuergen Gross
XENFEAT_gnttab_map_avail_bits is always set in Xen 4.0 and newer. Remove coding assuming it might be zero. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210730071804.4302-4-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-08-30xen: assume XENFEAT_mmu_pt_update_preserve_ad being set for pv guestsJuergen Gross
XENFEAT_mmu_pt_update_preserve_ad is always set in Xen 4.0 and newer. Remove coding assuming it might be zero. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210730071804.4302-3-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-08-30xen: check required Xen featuresJuergen Gross
Linux kernel is not supported to run on Xen versions older than 4.0. Add tests for required Xen features always being present in Xen 4.0 and newer. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210730071804.4302-2-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-08-30xen: fix setting of max_pfn in shared_infoJuergen Gross
Xen PV guests are specifying the highest used PFN via the max_pfn field in shared_info. This value is used by the Xen tools when saving or migrating the guest. Unfortunately this field is misnamed, as in reality it is specifying the number of pages (including any memory holes) of the guest, so it is the highest used PFN + 1. Renaming isn't possible, as this is a public Xen hypervisor interface which needs to be kept stable. The kernel will set the value correctly initially at boot time, but when adding more pages (e.g. due to memory hotplug or ballooning) a real PFN number is stored in max_pfn. This is done when expanding the p2m array, and the PFN stored there is even possibly wrong, as it should be the last possible PFN of the just added P2M frame, and not one which led to the P2M expansion. Fix that by setting shared_info->max_pfn to the last possible PFN + 1. Fixes: 98dd166ea3a3c3 ("x86/xen/p2m: hint at the last populated P2M entry") Cc: stable@vger.kernel.org Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Link: https://lore.kernel.org/r/20210730092622.9973-2-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-08-30netfilter: refuse insertion if chain has grown too largeFlorian Westphal
Also add a stat counter for this that gets exported both via old /proc interface and ctnetlink. Assuming the old default size of 16536 buckets and max hash occupancy of 64k, this results in 128k insertions (origin+reply), so ~8 entries per chain on average. The revised settings in this series will result in about two entries per bucket on average. This allows a hard-limit ceiling of 64. This is not tunable at the moment, but its possible to either increase nf_conntrack_buckets or decrease nf_conntrack_max to reduce average lengths. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-08-30netfilter: conntrack: switch to siphashFlorian Westphal
Replace jhash in conntrack and nat core with siphash. While at it, use the netns mix value as part of the input key rather than abuse the seed value. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-08-30netfilter: conntrack: sanitize table size default settingsFlorian Westphal
conntrack has two distinct table size settings: nf_conntrack_max and nf_conntrack_buckets. The former limits how many conntrack objects are allowed to exist in each namespace. The second sets the size of the hashtable. As all entries are inserted twice (once for original direction, once for reply), there should be at least twice as many buckets in the table than the maximum number of conntrack objects that can exist at the same time. Change the default multiplier to 1 and increase the chosen bucket sizes. This results in the same nf_conntrack_max settings as before but reduces the average bucket list length. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-08-30Merge branch 'IXP46x-PTP-Timer'David S. Miller
Linus Walleij says: ==================== IXP46x PTP Timer clean-up and DT ChangeLog v2->v3: - Dropped the patch enabling compile tests: we are still dependent on some machine-specific headers. The plan is to get rid of this after device tree conversion. We include one of the compile testing fixes anyway, because it is nice to have fixed. - Rebased on the latest net-next ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-30ixp4xx_eth: Probe the PTP module from the device treeLinus Walleij
This adds device tree probing support for the PTP module adjacent to the ethernet module. It is pretty straight forward, all resources are in the device tree as they come to the platform device. Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-30ixp4xx_eth: Add devicetree bindingsLinus Walleij
This adds device tree bindings for the IXP46x PTP Timer, a companion to the IXP4xx ethernet in newer platforms. Cc: devicetree@vger.kernel.org Cc: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-30ixp4xx_eth: Stop referring to GPIOsLinus Walleij
The driver is being passed interrupts, then looking up the same interrupts as GPIOs a second time to convert them into interrupts and set properties on them. This is pointless: the GPIO and irqchip APIs of a GPIO chip are orthogonal. Just request the interrupts and be done with it, drop reliance on any GPIO functions or definitions. Use devres-managed functions and add a small devress quirk to unregister the clock as well and we can rely on devres to handle all the resources and cut down a bunch of boilerplate in the process. Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-30ixp4xx_eth: fix compile-testingArnd Bergmann
Change the driver to use portable integer types to avoid warnings during compile testing, including: drivers/net/ethernet/xscale/ixp4xx_eth.c:721:21: error: cast to 'u32 *' (aka 'unsigned int *') from smaller integer type 'int' [-Werror,-Wint-to-pointer-cast] memcpy_swab32(mem, (u32 *)((int)skb->data & ~3), bytes / 4); ^ drivers/net/ethernet/xscale/ixp4xx_eth.c:963:12: error: incompatible pointer types passing 'u32 *' (aka 'unsigned int *') to parameter of type 'dma_addr_t *' (aka 'unsigned long long *') [-Werror,-Wincompatible-pointer-types] &port->desc_tab_phys))) ^~~~~~~~~~~~~~~~~~~~ include/linux/dmapool.h:27:20: note: passing argument to parameter 'handle' here dma_addr_t *handle); ^ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-30ixp4xx_eth: make ptp support a platform driverArnd Bergmann
After the recent ixp4xx cleanups, the ptp driver has gained a build failure in some configurations: drivers/net/ethernet/xscale/ptp_ixp46x.c: In function 'ptp_ixp_init': drivers/net/ethernet/xscale/ptp_ixp46x.c:290:51: error: 'IXP4XX_TIMESYNC_BASE_VIRT' undeclared (first use in this function) Avoid the last bit of hardcoded constants from platform headers by turning the ptp driver bit into a platform driver and passing the IRQ and MMIO address as resources. This is a bit tricky: - The interface between the two drivers is now the new ixp46x_ptp_find() function, replacing the global ixp46x_phc_index variable. The call is done as late as possible, in hwtstamp_set(), to ensure that the ptp device is fully probed. - As the ptp driver is now called by the network driver, the link dependency is reversed, which in turn requires a small Makefile hack - The GPIO number is still left hardcoded. This is clearly not great, but it can be addressed later. Note that commit 98ac0cc270b7 ("ARM: ixp4xx: Convert to MULTI_IRQ_HANDLER") changed the IRQ number to something meaningless. Passing the correct IRQ in a resource fixes this. - When the PTP driver is disabled, ethtool .get_ts_info() now correctly lists only software timestamping regardless of the hardware. Signed-off-by: Arnd Bergmann <arnd@arndb.de> [Fix a missing include] Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-30Merge branch 'hns3-cleanups'David S. Miller
Guangbin Huang says: ==================== net: hns3: add some cleanups This series includes some cleanups for the HNS3 ethernet driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-30net: hns3: uniform parameter name of hclge_ptp_clean_tx_hwts()Hao Chen
The parameter name of hclge_ptp_clean_tx_hwts() in declaration is "dev", but the definition of this function is used the common name "hdev" as other functions, so modify it. Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>