summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2019-08-28net: sched: act_sample: fix psample group handling on overwriteVlad Buslov
Action sample doesn't properly handle psample_group pointer in overwrite case. Following issues need to be fixed: - In tcf_sample_init() function RCU_INIT_POINTER() is used to set s->psample_group, even though we neither setting the pointer to NULL, nor preventing concurrent readers from accessing the pointer in some way. Use rcu_swap_protected() instead to safely reset the pointer. - Old value of s->psample_group is not released or deallocated in any way, which results resource leak. Use psample_group_put() on non-NULL value obtained with rcu_swap_protected(). - The function psample_group_put() that released reference to struct psample_group pointed by rcu-pointer s->psample_group doesn't respect rcu grace period when deallocating it. Extend struct psample_group with rcu head and use kfree_rcu when freeing it. Fixes: 5c5670fae430 ("net/sched: Introduce sample tc action") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-28Merge branch 'for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU changes from Paul E. McKenney: - A one-line change that affects only Tiny RCU that is needed by the RISC-V guys, courtesy of Christoph Hellwig. - An update to my email address. The old one still works, at least most of the time. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-08-28asm-generic/div64: Fix documentation of do_div() parameterJonathan Neuschäfer
Contrary to the description, the first parameter (n) should not be passed as a pointer, but directly as an lvalue. This is possible because do_div() is a macro. Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lkml.kernel.org/r/20190808181948.27659-1-j.neuschaefer@gmx.net
2019-08-28hrtimer: Add kernel doc annotation for HRTIMER_MODE_HARDSebastian Andrzej Siewior
Add kernel doc annotation for HRTIMER_MODE_HARD. Fixes: ae6683d815895 ("hrtimer: Introduce HARD expiry mode") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190823113845.12125-2-bigeasy@linutronix.de
2019-08-28regulator: mt6358: Add support for MT6358 regulatorHsin-Hsiung Wang
The MT6358 is a regulator found on boards based on MediaTek MT8183 and probably other SoCs. It is a so called pmic and connects as a slave to SoC using SPI, wrapped inside the pmic-wrapper. Signed-off-by: Hsin-Hsiung Wang <hsin-hsiung.wang@mediatek.com> Link: https://lore.kernel.org/r/1566531931-9772-8-git-send-email-hsin-hsiung.wang@mediatek.com Signed-off-by: Mark Brown <broonie@kernel.org>
2019-08-28posix-cpu-timers: Utilize timerqueue for storageThomas Gleixner
Using a linear O(N) search for timer insertion affects execution time and D-cache footprint badly with a larger number of timers. Switch the storage to a timerqueue which is already used for hrtimers and alarmtimers. It does not affect the size of struct k_itimer as it.alarm is still larger. The extra list head for the expiry list will go away later once the expiry is moved into task work context. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908272129220.1939@nanos.tec.linutronix.de
2019-08-28posix-cpu-timers: Move state tracking to struct posix_cputimersThomas Gleixner
Put it where it belongs and clean up the ifdeffery in fork completely. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190821192922.743229404@linutronix.de
2019-08-28posix-cpu-timers: Get rid of zero checksThomas Gleixner
Deactivation of the expiry cache is done by setting all clock caches to 0. That requires to have a check for zero in all places which update the expiry cache: if (cache == 0 || new < cache) cache = new; Use U64_MAX as the deactivated value, which allows to remove the zero checks when updating the cache and reduces it to the obvious check: if (new < cache) cache = new; This also removes the weird workaround in do_prlimit() which was required to convert a RLIMIT_CPU value of 0 (immediate expiry) to 1 because handing in 0 to the posix CPU timer code would have effectively disarmed it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192922.275086128@linutronix.de
2019-08-28posix-cpu-timers: Switch thread group sampling to arrayThomas Gleixner
That allows more simplifications in various places. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192921.988426956@linutronix.de
2019-08-28posix-cpu-timers: Restructure expiry arrayThomas Gleixner
Now that the abused struct task_cputime is gone, it's more natural to bundle the expiry cache and the list head of each clock into a struct and have an array of those structs. Follow the hrtimer naming convention of 'bases' and rename the expiry cache to 'nextevt' and adapt all usage sites. Generates also better code .text size shrinks by 80 bytes. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908262021140.1939@nanos.tec.linutronix.de
2019-08-28posix-cpu-timers: Remove cputime_expiresThomas Gleixner
The last users of the magic struct cputime based expiry cache are gone. Remove the leftovers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192921.790209622@linutronix.de
2019-08-28posix-cpu-timers: Remove the odd field rename definesThomas Gleixner
The last users of the odd define based renaming of struct task_cputime members are gone. Good riddance. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192921.499058279@linutronix.de
2019-08-28posix-cpu-timers: Provide array based access to expiry cacheThomas Gleixner
Using struct task_cputime for the expiry cache is a pretty odd choice and comes with magic defines to rename the fields for usage in the expiry cache. struct task_cputime is basically a u64 array with 3 members, but it has distinct members. The expiry cache content is different than the content of task_cputime because expiry[PROF] = task_cputime.stime + task_cputime.utime expiry[VIRT] = task_cputime.utime expiry[SCHED] = task_cputime.sum_exec_runtime So there is no direct mapping between task_cputime and the expiry cache and the #define based remapping is just a horrible hack. Having the expiry cache array based allows further simplification of the expiry code. To avoid an all in one cleanup which is hard to review add a temporary anonymous union into struct task_cputime which allows array based access to it. That requires to reorder the members. Add a build time sanity check to validate that the members are at the same place. The union and the build time checks will be removed after conversion. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192921.105793824@linutronix.de
2019-08-28posix-cpu-timers: Move expiry cache into struct posix_cputimersThomas Gleixner
The expiry cache belongs into the posix_cputimers container where the other cpu timers information is. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192921.014444012@linutronix.de
2019-08-28sched: Move struct task_cputime to types.hThomas Gleixner
For upcoming posix-timer changes to avoid include recursion hell. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190821192920.909530418@linutronix.de
2019-08-28posix-cpu-timers: Create a container structThomas Gleixner
Per task/process data of posix CPU timers is all over the place which makes the code hard to follow and requires ifdeffery. Create a container to hold all this information in one place, so data is consolidated and the ifdeffery can be confined to the posix timer header file and removed from places like fork. As a first step, move the cpu_timers list head array into the new struct and clean up the initializers and simplify fork. The remaining #ifdef in fork will be removed later. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192920.819418976@linutronix.de
2019-08-28posix-cpu-timers: Rename thread_group_cputimer() and make it staticThomas Gleixner
thread_group_cputimer() is a complete misnomer. The function does two things: - For arming process wide timers it makes sure that the atomic time storage is up to date. If no cpu timer is armed yet, then the atomic time storage is not updated by the scheduler for performance reasons. In that case a full summing up of all threads needs to be done and the update needs to be enabled. - Samples the current time into the caller supplied storage. Rename it to thread_group_start_cputime(), make it static and fixup the callsite. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192919.869350319@linutronix.de
2019-08-28posix-cpu-timers: Provide quick sample function for itimerThomas Gleixner
get_itimer() needs a sample of the current thread group cputime. It invokes thread_group_cputimer() - which is a misnomer. That function also starts eventually the group cputime accouting which is bogus because the accounting is already active when a timer is armed. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192919.599658199@linutronix.de
2019-08-28perf: Allow normal events to output AUX dataAlexander Shishkin
In some cases, ordinary (non-AUX) events can generate data for AUX events. For example, PEBS events can come out as records in the Intel PT stream instead of their usual DS records, if configured to do so. One requirement for such events is to consistently schedule together, to ensure that the data from the "AUX output" events isn't lost while their corresponding AUX event is not scheduled. We use grouping to provide this guarantee: an "AUX output" event can be added to a group where an AUX event is a group leader, and provided that the former supports writing to the latter. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: kan.liang@linux.intel.com Link: https://lkml.kernel.org/r/20190806084606.4021-2-alexander.shishkin@linux.intel.com
2019-08-27Add genphy_c45_config_aneg() function to phy-c45.cMarco Hartmann
Commit 34786005eca3 ("net: phy: prevent PHYs w/o Clause 22 regs from calling genphy_config_aneg") introduced a check that aborts phy_config_aneg() if the phy is a C45 phy. This causes phy_state_machine() to call phy_error() so that the phy ends up in PHY_HALTED state. Instead of returning -EOPNOTSUPP, call genphy_c45_config_aneg() (analogous to the C22 case) so that the state machine can run correctly. genphy_c45_config_aneg() closely resembles mv3310_config_aneg() in drivers/net/phy/marvell10g.c, excluding vendor specific configurations for 1000BaseT. Fixes: 22b56e827093 ("net: phy: replace genphy_10g_driver with genphy_c45_driver") Signed-off-by: Marco Hartmann <marco.hartmann@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27net_sched: fix a NULL pointer deref in ipt actionCong Wang
The net pointer in struct xt_tgdtor_param is not explicitly initialized therefore is still NULL when dereferencing it. So we have to find a way to pass the correct net pointer to ipt_destroy_target(). The best way I find is just saving the net pointer inside the per netns struct tcf_idrinfo, which could make this patch smaller. Fixes: 0c66dc1ea3f0 ("netfilter: conntrack: register hooks in netns when needed by ruleset") Reported-and-tested-by: itugrok@yahoo.com Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27Merge tag 'nfs-for-5.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Stable fixes: - Fix a page lock leak in nfs_pageio_resend() - Ensure O_DIRECT reports an error if the bytes read/written is 0 - Don't handle errors if the bind/connect succeeded - Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was invalidat ed" Bugfixes: - Don't refresh attributes with mounted-on-file information - Fix return values for nfs4_file_open() and nfs_finish_open() - Fix pnfs layoutstats reporting of I/O errors - Don't use soft RPC calls for pNFS/flexfiles I/O, and don't abort for soft I/O errors when the user specifies a hard mount. - Various fixes to the error handling in sunrpc - Don't report writepage()/writepages() errors twice" * tag 'nfs-for-5.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: remove set but not used variable 'mapping' NFSv2: Fix write regression NFSv2: Fix eof handling NFS: Fix writepage(s) error handling to not report errors twice NFS: Fix spurious EIO read errors pNFS/flexfiles: Don't time out requests on hard mounts SUNRPC: Handle connection breakages correctly in call_status() Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was invalidated" SUNRPC: Handle EADDRINUSE and ENOBUFS correctly pNFS/flexfiles: Turn off soft RPC calls SUNRPC: Don't handle errors if the bind/connect succeeded NFS: On fatal writeback errors, we need to call nfs_inode_remove_request() NFS: Fix initialisation of I/O result struct in nfs_pgio_rpcsetup NFS: Ensure O_DIRECT reports an error if the bytes read/written is 0 NFSv4/pnfs: Fix a page lock leak in nfs_pageio_resend() NFSv4: Fix return value in nfs_finish_open() NFSv4: Fix return values for nfs4_file_open() NFS: Don't refresh attributes with mounted-on-file information
2019-08-27Merge tag 'arc-5.3-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC updates from Vineet Gupta: - support for Edge Triggered IRQs in ARC IDU intc - other fixes here and there * tag 'arc-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: arc: prefer __section from compiler_attributes.h dt-bindings: IDU-intc: Add support for edge-triggered interrupts dt-bindings: IDU-intc: Clean up documentation ARCv2: IDU-intc: Add support for edge-triggered interrupts ARC: unwind: Mark expected switch fall-throughs ARC: [plat-hsdk]: allow to switch between AXI DMAC port configurations ARC: fix typo in setup_dma_ops log message ARCv2: entry: early return from exception need not clear U & DE bits
2019-08-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Use 32-bit index for tails calls in s390 bpf JIT, from Ilya Leoshkevich. 2) Fix missed EPOLLOUT events in TCP, from Eric Dumazet. Same fix for SMC from Jason Baron. 3) ipv6_mc_may_pull() should return 0 for malformed packets, not -EINVAL. From Stefano Brivio. 4) Don't forget to unpin umem xdp pages in error path of xdp_umem_reg(). From Ivan Khoronzhuk. 5) Fix sta object leak in mac80211, from Johannes Berg. 6) Fix regression by not configuring PHYLINK on CPU port of bcm_sf2 switches. From Florian Fainelli. 7) Revert DMA sync removal from r8169 which was causing regressions on some MIPS Loongson platforms. From Heiner Kallweit. 8) Use after free in flow dissector, from Jakub Sitnicki. 9) Fix NULL derefs of net devices during ICMP processing across collect_md tunnels, from Hangbin Liu. 10) proto_register() memory leaks, from Zhang Lin. 11) Set NLM_F_MULTI flag in multipart netlink messages consistently, from John Fastabend. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits) r8152: Set memory to all 0xFFs on failed reg reads openvswitch: Fix conntrack cache with timeout ipv4: mpls: fix mpls_xmit for iptunnel nexthop: Fix nexthop_num_path for blackhole nexthops net: rds: add service level support in rds-info net: route dump netlink NLM_F_MULTI flag missing s390/qeth: reject oversized SNMP requests sock: fix potential memory leak in proto_register() MAINTAINERS: Add phylink keyword to SFF/SFP/SFP+ MODULE SUPPORT xfrm/xfrm_policy: fix dst dev null pointer dereference in collect_md mode ipv4/icmp: fix rt dst dev null pointer dereference openvswitch: Fix log message in ovs conntrack bpf: allow narrow loads of some sk_reuseport_md fields with offset > 0 bpf: fix use after free in prog symbol exposure bpf: fix precision tracking in presence of bpf2bpf calls flow_dissector: Fix potential use-after-free on BPF_PROG_DETACH Revert "r8169: remove not needed call to dma_sync_single_for_device" ipv6: propagate ipv6_add_dev's error returns out of ipv6_find_idev net/ncsi: Fix the payload copying for the request coming from Netlink qed: Add cleanup in qed_slowpath_start() ...
2019-08-27rxrpc: Use skb_unshare() rather than skb_cow_data()David Howells
The in-place decryption routines in AF_RXRPC's rxkad security module currently call skb_cow_data() to make sure the data isn't shared and that the skb can be written over. This has a problem, however, as the softirq handler may be still holding a ref or the Rx ring may be holding multiple refs when skb_cow_data() is called in rxkad_verify_packet() - and so skb_shared() returns true and __pskb_pull_tail() dislikes that. If this occurs, something like the following report will be generated. kernel BUG at net/core/skbuff.c:1463! ... RIP: 0010:pskb_expand_head+0x253/0x2b0 ... Call Trace: __pskb_pull_tail+0x49/0x460 skb_cow_data+0x6f/0x300 rxkad_verify_packet+0x18b/0xb10 [rxrpc] rxrpc_recvmsg_data.isra.11+0x4a8/0xa10 [rxrpc] rxrpc_kernel_recv_data+0x126/0x240 [rxrpc] afs_extract_data+0x51/0x2d0 [kafs] afs_deliver_fs_fetch_data+0x188/0x400 [kafs] afs_deliver_to_call+0xac/0x430 [kafs] afs_wait_for_call_to_complete+0x22f/0x3d0 [kafs] afs_make_call+0x282/0x3f0 [kafs] afs_fs_fetch_data+0x164/0x300 [kafs] afs_fetch_data+0x54/0x130 [kafs] afs_readpages+0x20d/0x340 [kafs] read_pages+0x66/0x180 __do_page_cache_readahead+0x188/0x1a0 ondemand_readahead+0x17d/0x2e0 generic_file_read_iter+0x740/0xc10 __vfs_read+0x145/0x1a0 vfs_read+0x8c/0x140 ksys_read+0x4a/0xb0 do_syscall_64+0x43/0xf0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fix this by using skb_unshare() instead in the input path for DATA packets that have a security index != 0. Non-DATA packets don't need in-place encryption and neither do unencrypted DATA packets. Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Reported-by: Julian Wollrath <jwollrath@web.de> Signed-off-by: David Howells <dhowells@redhat.com>
2019-08-27rxrpc: Use the tx-phase skb flag to simplify tracingDavid Howells
Use the previously-added transmit-phase skbuff private flag to simplify the socket buffer tracing a bit. Which phase the skbuff comes from can now be divined from the skb rather than having to be guessed from the call state. We can also reduce the number of rxrpc_skb_trace values by eliminating the difference between Tx and Rx in the symbols. Signed-off-by: David Howells <dhowells@redhat.com>
2019-08-26rcu: Don't include <linux/ktime.h> in rcutiny.hChristoph Hellwig
The kbuild reported a built failure due to a header loop when RCUTINY is enabled with my pending riscv-nommu port. Switch rcutiny.h to only include the minimal required header to get HZ instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-08-26Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was invalidated"Trond Myklebust
This reverts commit a79f194aa4879e9baad118c3f8bb2ca24dbef765. The mechanism for aborting I/O is racy, since we are not guaranteed that the request is asleep while we're changing both task->tk_status and task->tk_action. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v5.1
2019-08-26ARCv2: IDU-intc: Add support for edge-triggered interruptsMischa Jonker
This adds support for an optional extra interrupt cell to specify edge vs level triggered. It is backward compatible with dts files with only one cell, and will default to level-triggered in such a case. Note that I had to make a change to idu_irq_set_affinity as well, as this function was setting the interrupt type to "level" unconditionally, since this was the only type supported previously. Signed-off-by: Mischa Jonker <mischa.jonker@synopsys.com> Reviewed-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2019-08-26fbdev: da8xx: remove panel_power_ctrl() callback from platform dataBartosz Golaszewski
There are no more users of panel_power_ctrl(). Remove it from the driver. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com>
2019-08-26dt-bindings: clk: meson: add sm1 periph clock controller bindingsNeil Armstrong
Update the documentation to support clock driver for the Amlogic SM1 SoC and expose the GP1, DSU and the CPU 1, 2 & 3 clocks. SM1 clock tree is very close, the main differences are : - each CPU core can achieve a different frequency, albeit a common PLL - a similar tree as the clock tree has been added for the DynamIQ Shared Unit - has a new GP1 PLL used for the DynamIQ Shared Unit - SM1 has additional clocks like for CSI, NanoQ an other components Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
2019-08-25nexthop: Fix nexthop_num_path for blackhole nexthopsDavid Ahern
Donald reported this sequence: ip next add id 1 blackhole ip next add id 2 blackhole ip ro add 1.1.1.1/32 nhid 1 ip ro add 1.1.1.2/32 nhid 2 would cause a crash. Backtrace is: [ 151.302790] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 151.304043] CPU: 1 PID: 277 Comm: ip Not tainted 5.3.0-rc5+ #37 [ 151.305078] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014 [ 151.306526] RIP: 0010:fib_add_nexthop+0x8b/0x2aa [ 151.307343] Code: 35 f7 81 48 8d 14 01 c7 02 f1 f1 f1 f1 c7 42 04 01 f4 f4 f4 48 89 f2 48 c1 ea 03 65 48 8b 0c 25 28 00 00 00 48 89 4d d0 31 c9 <80> 3c 02 00 74 08 48 89 f7 e8 1a e8 53 ff be 08 00 00 00 4c 89 e7 [ 151.310549] RSP: 0018:ffff888116c27340 EFLAGS: 00010246 [ 151.311469] RAX: dffffc0000000000 RBX: ffff8881154ece00 RCX: 0000000000000000 [ 151.312713] RDX: 0000000000000004 RSI: 0000000000000020 RDI: ffff888115649b40 [ 151.313968] RBP: ffff888116c273d8 R08: ffffed10221e3757 R09: ffff888110f1bab8 [ 151.315212] R10: 0000000000000001 R11: ffff888110f1bab3 R12: ffff888115649b40 [ 151.316456] R13: 0000000000000020 R14: ffff888116c273b0 R15: ffff888115649b40 [ 151.317707] FS: 00007f60b4d8d800(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000 [ 151.319113] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 151.320119] CR2: 0000555671ffdc00 CR3: 00000001136ba005 CR4: 0000000000020ee0 [ 151.321367] Call Trace: [ 151.321820] ? fib_nexthop_info+0x635/0x635 [ 151.322572] fib_dump_info+0xaa4/0xde0 [ 151.323247] ? fib_create_info+0x2431/0x2431 [ 151.324008] ? napi_alloc_frag+0x2a/0x2a [ 151.324711] rtmsg_fib+0x2c4/0x3be [ 151.325339] fib_table_insert+0xe2f/0xeee ... fib_dump_info incorrectly has nhs = 0 for blackhole nexthops, so it believes the nexthop object is a multipath group (nhs != 1) and ends up down the nexthop_mpath_fill_node() path which is wrong for a blackhole. The blackhole check in nexthop_num_path is leftover from early days of the blackhole implementation which did not initialize the device. In the end the design was simpler (fewer special case checks) to set the device to loopback in nh_info, so the check in nexthop_num_path should have been removed. Fixes: 430a049190de ("nexthop: Add support for nexthop groups") Reported-by: Donald Sharp <sharpd@cumulusnetworks.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-25Merge tag 'for-linus-5.3-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs Pull UBIFS and JFFS2 fixes from Richard Weinberger: "UBIFS: - Don't block too long in writeback_inodes_sb() - Fix for a possible overrun of the log head - Fix double unlock in orphan_delete() JFFS2: - Remove C++ style from UAPI header and unbreak picky toolchains" * tag 'for-linus-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: ubifs: Limit the number of pages in shrink_liability ubifs: Correctly initialize c->min_log_bytes ubifs: Fix double unlock around orphan_delete() jffs2: Remove C++ style comments from uapi header
2019-08-25Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timekeeping fix from Thomas Gleixner: "A single fix for a regression caused by the generic VDSO implementation where a math overflow causes CLOCK_BOOTTIME to become a random number generator" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timekeeping/vsyscall: Prevent math overflow in BOOTTIME update
2019-08-24Merge tag 'dma-mapping-5.3-5' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping fixes from Christoph Hellwig: "Two fixes for regressions in this merge window: - select the Kconfig symbols for the noncoherent dma arch helpers on arm if swiotlb is selected, not just for LPAE to not break then Xen build, that uses swiotlb indirectly through swiotlb-xen - fix the page allocator fallback in dma_alloc_contiguous if the CMA allocation fails" * tag 'dma-mapping-5.3-5' of git://git.infradead.org/users/hch/dma-mapping: dma-direct: fix zone selection after an unaddressable CMA allocation arm: select the dma-noncoherent symbols for all swiotlb builds
2019-08-24net: rds: add service level support in rds-infoZhu Yanjun
>From IB specific 7.6.5 SERVICE LEVEL, Service Level (SL) is used to identify different flows within an IBA subnet. It is carried in the local route header of the packet. Before this commit, run "rds-info -I". The outputs are as below: " RDS IB Connections: LocalAddr RemoteAddr Tos SL LocalDev RemoteDev 192.2.95.3 192.2.95.1 2 0 fe80::21:28:1a:39 fe80::21:28:10:b9 192.2.95.3 192.2.95.1 1 0 fe80::21:28:1a:39 fe80::21:28:10:b9 192.2.95.3 192.2.95.1 0 0 fe80::21:28:1a:39 fe80::21:28:10:b9 " After this commit, the output is as below: " RDS IB Connections: LocalAddr RemoteAddr Tos SL LocalDev RemoteDev 192.2.95.3 192.2.95.1 2 2 fe80::21:28:1a:39 fe80::21:28:10:b9 192.2.95.3 192.2.95.1 1 1 fe80::21:28:1a:39 fe80::21:28:10:b9 192.2.95.3 192.2.95.1 0 0 fe80::21:28:1a:39 fe80::21:28:10:b9 " The commit fe3475af3bdf ("net: rds: add per rds connection cache statistics") adds cache_allocs in struct rds_info_rdma_connection as below: struct rds_info_rdma_connection { ... __u32 rdma_mr_max; __u32 rdma_mr_size; __u8 tos; __u32 cache_allocs; }; The peer struct in rds-tools of struct rds_info_rdma_connection is as below: struct rds_info_rdma_connection { ... uint32_t rdma_mr_max; uint32_t rdma_mr_size; uint8_t tos; uint8_t sl; uint32_t cache_allocs; }; The difference between userspace and kernel is the member variable sl. In the kernel struct, the member variable sl is missing. This will introduce risks. So it is necessary to use this commit to avoid this risk. Fixes: fe3475af3bdf ("net: rds: add per rds connection cache statistics") CC: Joe Jin <joe.jin@oracle.com> CC: JUNXIAO_BI <junxiao.bi@oracle.com> Suggested-by: Gerd Rausch <gerd.rausch@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-24net: route dump netlink NLM_F_MULTI flag missingJohn Fastabend
An excerpt from netlink(7) man page, In multipart messages (multiple nlmsghdr headers with associated payload in one byte stream) the first and all following headers have the NLM_F_MULTI flag set, except for the last header which has the type NLMSG_DONE. but, after (ee28906) there is a missing NLM_F_MULTI flag in the middle of a FIB dump. The result is user space applications following above man page excerpt may get confused and may stop parsing msg believing something went wrong. In the golang netlink lib [0] the library logic stops parsing believing the message is not a multipart message. Found this running Cilium[1] against net-next while adding a feature to auto-detect routes. I noticed with multiple route tables we no longer could detect the default routes on net tree kernels because the library logic was not returning them. Fix this by handling the fib_dump_info_fnhe() case the same way the fib_dump_info() handles it by passing the flags argument through the call chain and adding a flags argument to rt_fill_info(). Tested with Cilium stack and auto-detection of routes works again. Also annotated libs to dump netlink msgs and inspected NLM_F_MULTI and NLMSG_DONE flags look correct after this. Note: In inet_rtm_getroute() pass rt_fill_info() '0' for flags the same as is done for fib_dump_info() so this looks correct to me. [0] https://github.com/vishvananda/netlink/ [1] https://github.com/cilium/ Fixes: ee28906fd7a14 ("ipv4: Dump route exceptions if requested") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-24Merge tag 'gpio-v5.3-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "Here is a (hopefully last) set of GPIO fixes for the v5.3 kernel cycle. Two are pretty core: - Fix not reporting open drain/source lines to userspace as "input" - Fix a minor build error found in randconfigs - Fix a chip select quirk on the Freescale SPI - Fix the irqchip initialization semantic order to reflect what it was using the old API" * tag 'gpio-v5.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: Fix irqchip initialization order gpio: of: fix Freescale SPI CS quirk handling gpio: Fix build error of function redefinition gpiolib: never report open-drain/source lines as 'input' to user-space
2019-08-23Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Doug Ledford: "No beating around the bush: this is a monster pull request for an -rc5 kernel. Intel hit me with a series of fixes for TID processing. Mellanox hit me with a series for their UMR memory support. And we had one fix for siw that fixes the 32bit build warnings and because of the number of casts that had to be changed to properly silence the warnings, that one patch alone is a full 40% of the LOC of this entire pull request. Given that this is the initial release kernel for siw, I'm trying to fix anything in it that we can, so that adds to the impetus to take fixes for it like this one. I had to do a rebase early in the week. Jason had thought he put a patch on the rc queue that he needed to be there so he could base some work off of it, and it had actually not been placed there. So he asked me (on Tuesday) to fix that up before pushing my wip branch to the official rc branch. I did, and that's why the early patches look like they were all committed at the same time on Tuesday. That bunch had been in my queue prior. The various patches all pass my test for being legitimate fixes and not attempts to slide new features or development into a late rc. Well, they were all fixes with the exception of a couple clean up patches people wrote for making the fixes they also wrote better (like a cleanup patch to move UMR checking into a function so that the remaining UMR fix patches can reference that function), so I left those in place too. My apologies for the LOC count and the number of patches here, it's just how the cards fell this cycle. Summary: - Fix siw buffer mapping issue - Fix siw 32/64 casting issues - Fix a KASAN access issue in bnxt_re - Fix several memory leaks (hfi1, mlx4) - Fix a NULL deref in cma_cleanup - Fixes for UMR memory support in mlx5 (4 patch series) - Fix namespace check for restrack - Fixes for counter support - Fixes for hfi1 TID processing (5 patch series) - Fix potential NULL deref in siw - Fix memory page calculations in mlx5" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (21 commits) RDMA/siw: Fix 64/32bit pointer inconsistency RDMA/siw: Fix SGL mapping issues RDMA/bnxt_re: Fix stack-out-of-bounds in bnxt_qplib_rcfw_send_message infiniband: hfi1: fix memory leaks infiniband: hfi1: fix a memory leak bug IB/mlx4: Fix memory leaks RDMA/cma: fix null-ptr-deref Read in cma_cleanup IB/mlx5: Block MR WR if UMR is not possible IB/mlx5: Fix MR re-registration flow to use UMR properly IB/mlx5: Report and handle ODP support properly IB/mlx5: Consolidate use_umr checks into single function RDMA/restrack: Rewrite PID namespace check to be reliable RDMA/counters: Properly implement PID checks IB/core: Fix NULL pointer dereference when bind QP to counter IB/hfi1: Drop stale TID RDMA packets that cause TIDErr IB/hfi1: Add additional checks when handling TID RDMA WRITE DATA packet IB/hfi1: Add additional checks when handling TID RDMA READ RESP packet IB/hfi1: Unsafe PSN checking for TID RDMA READ Resp packet IB/hfi1: Drop stale TID RDMA packets RDMA/siw: Fix potential NULL de-ref ...
2019-08-23Merge tag 'ceph-for-5.3-rc6' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fixes from Ilya Dryomov: "Three important fixes tagged for stable (an indefinite hang, a crash on an assert and a NULL pointer dereference) plus a small series from Luis fixing instances of vfree() under spinlock" * tag 'ceph-for-5.3-rc6' of git://github.com/ceph/ceph-client: libceph: fix PG split vs OSD (re)connect race ceph: don't try fill file_lock on unsuccessful GETFILELOCK reply ceph: clear page dirty before invalidate page ceph: fix buffer free while holding i_ceph_lock in fill_inode() ceph: fix buffer free while holding i_ceph_lock in __ceph_build_xattrs_blob() ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr() libceph: allow ceph_buffer_put() to receive a NULL ceph_buffer
2019-08-23Merge branch 'for-joerg/arm-smmu/updates' of ↵Joerg Roedel
git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu
2019-08-23fdt: add support for rng-seedHsin-Yi Wang
Introducing a chosen node, rng-seed, which is an entropy that can be passed to kernel called very early to increase initial device randomness. Bootloader should provide this entropy and the value is read from /chosen/rng-seed in DT. Obtain of_fdt_crc32 for CRC check after early_init_dt_scan_nodes(), since early_init_dt_scan_chosen() would modify fdt to erase rng-seed. Add a new interface add_bootloader_randomness() for rng-seed use case. Depends on whether the seed is trustworthy, rng seed would be passed to add_hwgenerator_randomness(). Otherwise it would be passed to add_device_randomness(). Decision is controlled by kernel config RANDOM_TRUST_BOOTLOADER. Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Theodore Ts'o <tytso@mit.edu> # drivers/char/random.c Signed-off-by: Will Deacon <will@kernel.org>
2019-08-23clocksource/drivers/hyperv: Enable TSC page clocksource on 32bitVitaly Kuznetsov
There is no particular reason to not enable TSC page clocksource on 32-bit. mul_u64_u64_shr() is available and despite the increased computational complexity (compared to 64bit) TSC page is still a huge win compared to MSR-based clocksource. In-kernel reads: MSR based clocksource: 3361 cycles TSC page clocksource: 49 cycles Reads from userspace (utilizing vDSO in case of TSC page): MSR based clocksource: 5664 cycles TSC page clocksource: 131 cycles Enabling TSC page on 32bits allows to get rid of CONFIG_HYPERV_TSCPAGE as it is now not any different from CONFIG_HYPERV_TIMER. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lkml.kernel.org/r/20190822083630.17059-1-vkuznets@redhat.com
2019-08-23clocksource/drivers/hyperv: Add Hyper-V specific sched clock functionTianyu Lan
Hyper-V guests use the default native_sched_clock() in pv_ops.time.sched_clock on x86. But native_sched_clock() directly uses the raw TSC value, which can be discontinuous in a Hyper-V VM. Add the generic hv_setup_sched_clock() to set the sched clock function appropriately. On x86, this sets pv_ops.time.sched_clock to read the Hyper-V reference TSC value that is scaled and adjusted to be continuous. Also move the Hyper-V reference TSC initialization much earlier in the boot process so no discontinuity is observed when pv_ops.time.sched_clock calculates its offset. [ tglx: Folded build fix ] Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lkml.kernel.org/r/20190814123216.32245-3-Tianyu.Lan@microsoft.com
2019-08-23soc: mediatek: cmdq: change the type of input parameterBibby Hsieh
According to the cmdq hardware design, the subsys is u8, the offset is u16 and the event id is u16. This patch changes the type of subsys, offset and event id to the correct type. Signed-off-by: Bibby Hsieh <bibby.hsieh@mediatek.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2019-08-23soc: mediatek: cmdq: reorder the parameterBibby Hsieh
The order of gce instructions is [subsys offset value] so reorder the parameter of cmdq_pkt_write_mask and cmdq_pkt_write function. Signed-off-by: Bibby Hsieh <bibby.hsieh@mediatek.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2019-08-23gpio: Move gpiochip_lock/unlock_as_irq to gpio/driver.hYueHaibing
If CONFIG_GPIOLIB is not, gpiochip_lock/unlock_as_irq will conflict as this: In file included from sound/soc/codecs/wm5100.c:18:0: ./include/linux/gpio.h:224:19: error: static declaration of gpiochip_lock_as_irq follows non-static declaration static inline int gpiochip_lock_as_irq(struct gpio_chip *chip, ^~~~~~~~~~~~~~~~~~~~ In file included from sound/soc/codecs/wm5100.c:17:0: ./include/linux/gpio/driver.h:494:5: note: previous declaration of gpiochip_lock_as_irq was here int gpiochip_lock_as_irq(struct gpio_chip *chip, unsigned int offset); ^~~~~~~~~~~~~~~~~~~~ In file included from sound/soc/codecs/wm5100.c:18:0: ./include/linux/gpio.h:231:20: error: static declaration of gpiochip_unlock_as_irq follows non-static declaration static inline void gpiochip_unlock_as_irq(struct gpio_chip *chip, ^~~~~~~~~~~~~~~~~~~~~~ In file included from sound/soc/codecs/wm5100.c:17:0: ./include/linux/gpio/driver.h:495:6: note: previous declaration of gpiochip_unlock_as_irq was here void gpiochip_unlock_as_irq(struct gpio_chip *chip, unsigned int offset); ^~~~~~~~~~~~~~~~~~~~~~ Move them to gpio/driver.h and use CONFIG_GPIOLIB guard this. Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: d74be6dfea1b ("gpio: remove gpiod_lock/unlock_as_irq()") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Link: https://lore.kernel.org/r/20190822031817.32888-1-yuehaibing@huawei.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-08-23iommu: Add helpers to set/get default domain typeJoerg Roedel
Add a couple of functions to allow changing the default domain type from architecture code and a function for iommu drivers to request whether the default domain is passthrough. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-08-23timekeeping/vsyscall: Prevent math overflow in BOOTTIME updateThomas Gleixner
The VDSO update for CLOCK_BOOTTIME has a overflow issue as it shifts the nanoseconds based boot time offset left by the clocksource shift. That overflows once the boot time offset becomes large enough. As a consequence CLOCK_BOOTTIME in the VDSO becomes a random number causing applications to misbehave. Fix it by storing a timespec64 representation of the offset when boot time is adjusted and add that to the MONOTONIC base time value in the vdso data page. Using the timespec64 representation avoids a 64bit division in the update code. Fixes: 44f57d788e7d ("timekeeping: Provide a generic update_vsyscall() implementation") Reported-by: Chris Clayton <chris2553@googlemail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Chris Clayton <chris2553@googlemail.com> Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908221257580.1983@nanos.tec.linutronix.de
2019-08-22Merge branch 'for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU and LKMM changes from Paul E. McKenney: - A few more RCU flavor consolidation cleanups. - Miscellaneous fixes. - Updates to RCU's list-traversal macros improving lockdep usability. - Torture-test updates. - Forward-progress improvements for no-CBs CPUs: Avoid ignoring incoming callbacks during grace-period waits. - Forward-progress improvements for no-CBs CPUs: Use ->cblist structure to take advantage of others' grace periods. - Also added a small commit that avoids needlessly inflicting scheduler-clock ticks on callback-offloaded CPUs. - Forward-progress improvements for no-CBs CPUs: Reduce contention on ->nocb_lock guarding ->cblist. - Forward-progress improvements for no-CBs CPUs: Add ->nocb_bypass list to further reduce contention on ->nocb_lock guarding ->cblist. - LKMM updates. Signed-off-by: Ingo Molnar <mingo@kernel.org>