summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-05-26cls_flower: Support filtering on multiple MPLS Label Stack EntriesGuillaume Nault
With struct flow_dissector_key_mpls now recording the first FLOW_DIS_MPLS_MAX labels, we can extend Flower to filter on any of these LSEs independently. In order to avoid creating new netlink attributes for every possible depth, let's define a new TCA_FLOWER_KEY_MPLS_OPTS nested attribute that contains the list of LSEs to match. Each LSE is represented by another attribute, TCA_FLOWER_KEY_MPLS_OPTS_LSE, which then contains the attributes representing the depth and the MPLS fields to match at this depth (label, TTL, etc.). For each MPLS field, the mask is always set to all-ones, as this is what the original API did. We could allow user configurable masks in the future if there is demand for more flexibility. The new API also allows to only specify an LSE depth. In that case, Flower only verifies that the MPLS label stack depth is greater or equal to the provided depth (that is, an LSE exists at this depth). Filters that only match on one (or more) fields of the first LSE are dumped using the old netlink attributes, to avoid confusing user space programs that don't understand the new API. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26flow_dissector: Parse multiple MPLS Label Stack EntriesGuillaume Nault
The current MPLS dissector only parses the first MPLS Label Stack Entry (second LSE can be parsed too, but only to set a key_id). This patch adds the possibility to parse several LSEs by making __skb_flow_dissect_mpls() return FLOW_DISSECT_RET_PROTO_AGAIN as long as the Bottom Of Stack bit hasn't been seen, up to a maximum of FLOW_DIS_MPLS_MAX entries. FLOW_DIS_MPLS_MAX is arbitrarily set to 7. This should be enough for many practical purposes, without wasting too much space. To record the parsed values, flow_dissector_key_mpls is modified to store an array of stack entries, instead of just the values of the first one. A bit field, "used_lses", is also added to keep track of the LSEs that have been set. The objective is to avoid defining a new FLOW_DISSECTOR_KEY_MPLS_XX for each level of the MPLS stack. TC flower is adapted for the new struct flow_dissector_key_mpls layout. Matching on several MPLS Label Stack Entries will be added in the next patch. The NFP and MLX5 drivers are also adapted: nfp_flower_compile_mac() and mlx5's parse_tunnel() now verify that the rule only uses the first LSE and fail if it doesn't. Finally, the behaviour of the FLOW_DISSECTOR_KEY_MPLS_ENTROPY key is slightly modified. Instead of recording the first Entropy Label, it now records the last one. This shouldn't have any consequences since there doesn't seem to have any user of FLOW_DISSECTOR_KEY_MPLS_ENTROPY in the tree. We'd probably better do a hash of all parsed MPLS labels instead (excluding reserved labels) anyway. That'd give better entropy and would probably also simplify the code. But that's not the purpose of this patch, so I'm keeping that as a future possible improvement. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26Merge tag 'batadv-next-for-davem-20200526' of ↵David S. Miller
git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== This cleanup patchset includes the following patches: - Fix revert dynamic lockdep key changes for batman-adv, by Sven Eckelmann - use rcu_replace_pointer() where appropriate, by Antonio Quartulli - Revert "disable ethtool link speed detection when auto negotiation off", by Sven Eckelmann ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26Merge branch 'tipc-add-some-improvements'David S. Miller
Tuong Lien says: ==================== tipc: add some improvements This series adds some improvements to TIPC. The first patch improves the TIPC broadcast's performance with the 'Gap ACK blocks' mechanism similar to unicast before, while the others give support on tracing & statistics for broadcast links, and an alternative to carry broadcast retransmissions via unicast which might be useful in some cases. Besides, the Nagle algorithm can now automatically 'adjust' itself depending on the specific network condition a stream connection runs by the last patch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26tipc: add test for Nagle algorithm effectivenessTuong Lien
When streaming in Nagle mode, we try to bundle small messages from user as many as possible if there is one outstanding buffer, i.e. not ACK-ed by the receiving side, which helps boost up the overall throughput. So, the algorithm's effectiveness really depends on when Nagle ACK comes or what the specific network latency (RTT) is, compared to the user's message sending rate. In a bad case, the user's sending rate is low or the network latency is small, there will not be many bundles, so making a Nagle ACK or waiting for it is not meaningful. For example: a user sends its messages every 100ms and the RTT is 50ms, then for each messages, we require one Nagle ACK but then there is only one user message sent without any bundles. In a better case, even if we have a few bundles (e.g. the RTT = 300ms), but now the user sends messages in medium size, then there will not be any difference at all, that says 3 x 1000-byte data messages if bundled will still result in 3 bundles with MTU = 1500. When Nagle is ineffective, the delay in user message sending is clearly wasted instead of sending directly. Besides, adding Nagle ACKs will consume some processor load on both the sending and receiving sides. This commit adds a test on the effectiveness of the Nagle algorithm for an individual connection in the network on which it actually runs. Particularly, upon receipt of a Nagle ACK we will compare the number of bundles in the backlog queue to the number of user messages which would be sent directly without Nagle. If the ratio is good (e.g. >= 2), Nagle mode will be kept for further message sending. Otherwise, we will leave Nagle and put a 'penalty' on the connection, so it will have to spend more 'one-way' messages before being able to re-enter Nagle. In addition, the 'ack-required' bit is only set when really needed that the number of Nagle ACKs will be reduced during Nagle mode. Testing with benchmark showed that with the patch, there was not much difference in throughput for small messages since the tool continuously sends messages without a break, so Nagle would still take in effect. Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26tipc: add support for broadcast rcv stats dumpingTuong Lien
This commit enables dumping the statistics of a broadcast-receiver link like the traditional 'broadcast-link' one (which is for broadcast- sender). The link dumping can be triggered via netlink (e.g. the iproute2/tipc tool) by the link flag - 'TIPC_NLA_LINK_BROADCAST' as the indicator. The name of a broadcast-receiver link of a specific peer will be in the format: 'broadcast-link:<peer-id>'. For example: Link <broadcast-link:1001002> Window:50 packets RX packets:7841 fragments:2408/440 bundles:0/0 TX packets:0 fragments:0/0 bundles:0/0 RX naks:0 defs:124 dups:0 TX naks:21 acks:0 retrans:0 Congestion link:0 Send queue max:0 avg:0 In addition, the broadcast-receiver link statistics can be reset in the usual way via netlink by specifying that link name in command. Note: the 'tipc_link_name_ext()' is removed because the link name can now be retrieved simply via the 'l->name'. Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26tipc: enable broadcast retrans via unicastTuong Lien
In some environment, broadcast traffic is suppressed at high rate (i.e. a kind of bandwidth limit setting). When it is applied, TIPC broadcast can still run successfully. However, when it comes to a high load, some packets will be dropped first and TIPC tries to retransmit them but the packet retransmission is intentionally broadcast too, so making things worse and not helpful at all. This commit enables the broadcast retransmission via unicast which only retransmits packets to the specific peer that has really reported a gap i.e. not broadcasting to all nodes in the cluster, so will prevent from being suppressed, and also reduce some overheads on the other peers due to duplicates, finally improve the overall TIPC broadcast performance. Note: the functionality can be turned on/off via the sysctl file: echo 1 > /proc/sys/net/tipc/bc_retruni echo 0 > /proc/sys/net/tipc/bc_retruni Default is '0', i.e. the broadcast retransmission still works as usual. Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26tipc: add back link trace eventsTuong Lien
In the previous commit ("tipc: add Gap ACK blocks support for broadcast link"), we have removed the following link trace events due to the code changes: - tipc_link_bc_ack - tipc_link_retrans This commit adds them back along with some minor changes to adapt to the new code. Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26tipc: introduce Gap ACK blocks for broadcast linkTuong Lien
As achieved through commit 9195948fbf34 ("tipc: improve TIPC throughput by Gap ACK blocks"), we apply the same mechanism for the broadcast link as well. The 'Gap ACK blocks' data field in a 'PROTOCOL/STATE_MSG' will consist of two parts built for both the broadcast and unicast types: 31 16 15 0 +-------------+-------------+-------------+-------------+ | bgack_cnt | ugack_cnt | len | +-------------+-------------+-------------+-------------+ - | gap | ack | | +-------------+-------------+-------------+-------------+ > bc gacks : : : | +-------------+-------------+-------------+-------------+ - | gap | ack | | +-------------+-------------+-------------+-------------+ > uc gacks : : : | +-------------+-------------+-------------+-------------+ - which is "automatically" backward-compatible. We also increase the max number of Gap ACK blocks to 128, allowing upto 64 blocks per type (total buffer size = 516 bytes). Besides, the 'tipc_link_advance_transmq()' function is refactored which is applicable for both the unicast and broadcast cases now, so some old functions can be removed and the code is optimized. With the patch, TIPC broadcast is more robust regardless of packet loss or disorder, latency, ... in the underlying network. Its performance is boost up significantly. For example, experiment with a 5% packet loss rate results: $ time tipc-pipe --mc --rdm --data_size 123 --data_num 1500000 real 0m 42.46s user 0m 1.16s sys 0m 17.67s Without the patch: $ time tipc-pipe --mc --rdm --data_size 123 --data_num 1500000 real 8m 27.94s user 0m 0.55s sys 0m 2.38s Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26qed: Add EDPM mode type for user-fw compatibilityYuval Basson
In older FW versions the completion flag was treated as the ack flag in edpm messages. Expose the FW option of setting which mode the QP is in by adding a flag to the qedr <-> qed API. Flag is added for backward compatibility with libqedr. This flag will be set by qedr after determining whether the libqedr is using the updated version. Fixes: f10939403352 ("qed: Add support for QP verbs") Signed-off-by: Yuval Basson <yuval.bason@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26tcp: tcp_v4_err() icmp skb is named icmp_skbEric Dumazet
I missed the fact that tcp_v4_err() differs from tcp_v6_err(). After commit 4d1a2d9ec1c1 ("Rename skb to icmp_skb in tcp_v4_err()") the skb argument has been renamed to icmp_skb only in one function. I will in a future patch reconciliate these functions to avoid this kind of confusion. Fixes: 45af29ca761c ("tcp: allow traceroute -Mtcp for unpriv users") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26exec: Always set cap_ambient in cap_bprm_set_credsEric W. Biederman
An invariant of cap_bprm_set_creds is that every field in the new cred structure that cap_bprm_set_creds might set, needs to be set every time to ensure the fields does not get a stale value. The field cap_ambient is not set every time cap_bprm_set_creds is called, which means that if there is a suid or sgid script with an interpreter that has neither the suid nor the sgid bits set the interpreter should be able to accept ambient credentials. Unfortuantely because cap_ambient is not reset to it's original value the interpreter can not accept ambient credentials. Given that the ambient capability set is expected to be controlled by the caller, I don't think this is particularly serious. But it is definitely worth fixing so the code works correctly. I have tested to verify my reading of the code is correct and the interpreter of a sgid can receive ambient capabilities with this change and cannot receive ambient capabilities without this change. Cc: stable@vger.kernel.org Cc: Andy Lutomirski <luto@kernel.org> Fixes: 58319057b784 ("capabilities: ambient capabilities") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-26sched/fair: Don't NUMA balance for kthreadsJens Axboe
Stefano reported a crash with using SQPOLL with io_uring: BUG: kernel NULL pointer dereference, address: 00000000000003b0 CPU: 2 PID: 1307 Comm: io_uring-sq Not tainted 5.7.0-rc7 #11 RIP: 0010:task_numa_work+0x4f/0x2c0 Call Trace: task_work_run+0x68/0xa0 io_sq_thread+0x252/0x3d0 kthread+0xf9/0x130 ret_from_fork+0x35/0x40 which is task_numa_work() oopsing on current->mm being NULL. The task work is queued by task_tick_numa(), which checks if current->mm is NULL at the time of the call. But this state isn't necessarily persistent, if the kthread is using use_mm() to temporarily adopt the mm of a task. Change the task_tick_numa() check to exclude kernel threads in general, as it doesn't make sense to attempt ot balance for kthreads anyway. Reported-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/865de121-8190-5d30-ece5-3b097dc74431@kernel.dk
2020-05-26x86/syscalls: Revert "x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long"Andy Lutomirski
Revert 45e29d119e99 ("x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long") and add a comment to discourage someone else from making the same mistake again. It turns out that some user code fails to compile if __X32_SYSCALL_BIT is unsigned long. See, for example [1] below. [ bp: Massage and do the same thing in the respective tools/ header. ] Fixes: 45e29d119e99 ("x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long") Reported-by: Thorsten Glaser <t.glaser@tarent.de> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@kernel.org Link: [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=954294 Link: https://lkml.kernel.org/r/92e55442b744a5951fdc9cfee10badd0a5f7f828.1588983892.git.luto@kernel.org
2020-05-26powerpc/64s: Fix restore of NV GPRs after facility unavailable exceptionMichael Ellerman
Commit 702f09805222 ("powerpc/64s/exception: Remove lite interrupt return") changed the interrupt return path to not restore non-volatile registers by default, and explicitly restore them in paths where it is required. But it missed that the facility unavailable exception can sometimes modify user registers, ie. when it does emulation of move from DSCR. This is seen as a failure of the dscr_sysfs_thread_test: test: dscr_sysfs_thread_test [cpu 0] User DSCR should be 1 but is 0 failure: dscr_sysfs_thread_test So restore non-volatile GPRs after facility unavailable exceptions. Currently the hypervisor facility unavailable exception is also wired up to call facility_unavailable_exception(). In practice we should never take a hypervisor facility unavailable exception for the DSCR. On older bare metal systems we set HFSCR_DSCR unconditionally in __init_HFSCR, or on newer systems it should be enabled via the "data-stream-control-register" device tree CPU feature. Even if it's not, since commit f3c99f97a3cd ("KVM: PPC: Book3S HV: Don't access HFSCR, LPIDR or LPCR when running nested"), the KVM code has unconditionally set HFSCR_DSCR when running guests. So we should only get a hypervisor facility unavailable for the DSCR if skiboot has disabled the "data-stream-control-register" feature, and we are somehow in guest context but not via KVM. Given all that, it should be unnecessary to add a restore of non-volatile GPRs after the hypervisor facility exception, because we never expect to hit that path. But equally we may as well add the restore, because we never expect to hit that path, and if we ever did, at least we would correctly restore the registers to their post emulation state. In future we can split the non-HV and HV facility unavailable handling so that there is no emulation in the HV handler, and then remove the restore for the HV case. Fixes: 702f09805222 ("powerpc/64s/exception: Remove lite interrupt return") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200526061808.2472279-1-mpe@ellerman.id.au
2020-05-26batman-adv: Revert "disable ethtool link speed detection when auto ↵Sven Eckelmann
negotiation off" The commit 8c46fcd78308 ("batman-adv: disable ethtool link speed detection when auto negotiation off") disabled the usage of ethtool's link_ksetting when auto negotation was enabled due to invalid values when used with tun/tap virtual net_devices. According to the patch, automatic measurements should be used for these kind of interfaces. But there are major flaws with this argumentation: * automatic measurements are not implemented * auto negotiation has nothing to do with the validity of the retrieved values The first point has to be fixed by a longer patch series. The "validity" part of the second point must be addressed in the same patch series by dropping the usage of ethtool's link_ksetting (thus always doing automatic measurements over ethernet). Drop the patch again to have more default values for various net_device types/configurations. The user can still overwrite them using the batadv_hardif's BATADV_ATTR_THROUGHPUT_OVERRIDE. Reported-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2020-05-26ALSA: usb-audio: mixer: volume quirk for ESS Technology Asus USB DACChris Chiu
The Asus USB DAC is a USB type-C audio dongle for connecting to the headset and headphone. The volume minimum value -23040 which is 0xa600 in hexadecimal with the resolution value 1 indicates this should be endianness issue caused by the firmware bug. Add a volume quirk to fix the volume control problem. Also fixes this warning: Warning! Unlikely big volume range (=23040), cval->res is probably wrong. [5] FU [Headset Capture Volume] ch = 1, val = -23040/0/1 Warning! Unlikely big volume range (=23040), cval->res is probably wrong. [7] FU [Headset Playback Volume] ch = 1, val = -23040/0/1 Signed-off-by: Chris Chiu <chiu@endlessm.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200526062613.55401-1-chiu@endlessm.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-05-26ALSA: hda/realtek - Add a model for Thinkpad T570 without DAC workaroundTakashi Iwai
We fixed the regression of the speaker volume for some Thinkpad models (e.g. T570) by the commit 54947cd64c1b ("ALSA: hda/realtek - Fix speaker output regression on Thinkpad T570"). Essentially it fixes the DAC / pin pairing by a static table. It was confirmed and merged to stable kernel later. Now, interestingly, we got another regression report for the very same model (T570) about the similar problem, and the commit above was the culprit. That is, by some reason, there are devices that prefer the DAC1, and another device DAC2! Unfortunately those have the same ID and we have no idea what can differentiate, in this patch, a new fixup model "tpt470-dock-fix" is provided, so that users with such a machine can apply it manually. When model=tpt470-dock-fix option is passed to snd-hda-intel module, it avoids the fixed DAC pairing and the DAC1 is assigned to the speaker like the earlier versions. Fixes: 54947cd64c1b ("ALSA: hda/realtek - Fix speaker output regression on Thinkpad T570") BugLink: https://apibugzilla.suse.com/show_bug.cgi?id=1172017 Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200526062406.9799-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-05-26ALSA: hwdep: fix a left shifting 1 by 31 UB bugChangming Liu
The "info.index" variable can be 31 in "1 << info.index". This might trigger an undefined behavior since 1 is signed. Fix this by casting 1 to 1u just to be sure "1u << 31" is defined. Signed-off-by: Changming Liu <liu.changm@northeastern.edu> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/BL0PR06MB4548170B842CB055C9AF695DE5B00@BL0PR06MB4548.namprd06.prod.outlook.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-05-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Set VLAN tag in tcp reset/icmp unreachable packets to reject connections in the bridge family, from Michael Braun. 2) Incorrect subcounter flag update in ipset, from Phil Sutter. 3) Possible buffer overflow in the pptp conntrack helper, based on patch from Dan Carpenter. 4) Restore userspace conntrack helper hook logic that broke after hook consolidation rework. 5) Unbreak userspace conntrack helper registration via nfnetlink_cthelper. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25Merge branch ↵David S. Miller
'r8169-sync-hw-config-for-few-chip-versions-with-r8168-vendor-driver' Heiner Kallweit says: ==================== r8169: sync hw config for few chip versions with r8168 vendor driver Sync hw config for few chip versions with r8168 vendor driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25r8169: sync RTL8168f/RTL8411 hw config with vendor driverHeiner Kallweit
Sync hw config for RTL8168f/RTL8411 with r8168 vendor driver. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25r8169: sync RTL8168evl hw config with vendor driverHeiner Kallweit
Sync hw config for RTL8168evl with r8168 vendor driver. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25r8169: sync RTL8168h hw config with vendor driverHeiner Kallweit
Sync hw config for RTL8168h with r8168 vendor driver. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25r8169: sync RTL8168g hw config with vendor driverHeiner Kallweit
Sync hw config for RTL8168g with r8168 vendor driver. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25Merge tag 'mac80211-for-net-2020-05-25' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== A few changes: * fix a debugfs vs. wiphy rename crash * fix an invalid HE spec definition * fix a mesh timer crash ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25Merge tag 'wireless-drivers-next-2020-05-25' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Kalle Valo says: ==================== wireless-drivers-next patches for v5.8 Second set of patches for v5.8. Lots of new features and new supported hardware for mt76. Also rtw88 got new hardware support. Major changes: rtw88 * add support for Realtek 8723DE PCI adapter * rename rtw88.ko/rtwpci.ko to rtw88_core.ko/rtw88_pci.ko iwlwifi * stop supporting swcrypto and bt_coex_active module parameters on mvm devices * enable A-AMSDU in low latency mt76 * new devices for mt76x0/mt76x2 * support for non-offload firmware on mt7663 * hw/sched scan support for mt7663 * mt7615/mt7663 MSI support * TDLS support * mt7603/mt7615 rate control fixes * new driver for mt7915 * wowlan support for mt7663 * suspend/resume support for mt7663 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25bridge: mrp: Fix out-of-bounds read in br_mrp_parseHoratiu Vultur
The issue was reported by syzbot. When the function br_mrp_parse was called with a valid net_bridge_port, the net_bridge was an invalid pointer. Therefore the check br->stp_enabled could pass/fail depending where it was pointing in memory. The fix consists of setting the net_bridge pointer if the port is a valid pointer. Reported-by: syzbot+9c6f0f1f8e32223df9a4@syzkaller.appspotmail.com Fixes: 6536993371fa ("bridge: mrp: Integrate MRP into the bridge") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25qlcnic: fix missing release in qlcnic_83xx_interrupt_test.Qiushi Wu
In function qlcnic_83xx_interrupt_test(), function qlcnic_83xx_diag_alloc_res() is not handled by function qlcnic_83xx_diag_free_res() after a call of the function qlcnic_alloc_mbx_args() failed. Fix this issue by adding a jump target "fail_mbx_args", and jump to this new target when qlcnic_alloc_mbx_args() failed. Fixes: b6b4316c8b2f ("qlcnic: Handle qlcnic_alloc_mbx_args() failure") Signed-off-by: Qiushi Wu <wu000273@umn.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25drivers: ipa: print dev_err info accuratelyWang Wenhu
Print certain name string instead of hard-coded "memory" for dev_err output, which would be more accurate and helpful for debugging. Signed-off-by: Wang Wenhu <wenhu.wang@vivo.com> Cc: Alex Elder <elder@kernel.org> Reviewed-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25dpaa_eth: fix usage as DSA master, try 3Vladimir Oltean
The dpaa-eth driver probes on compatible string for the MAC node, and the fman/mac.c driver allocates a dpaa-ethernet platform device that triggers the probing of the dpaa-eth net device driver. All of this is fine, but the problem is that the struct device of the dpaa_eth net_device is 2 parents away from the MAC which can be referenced via of_node. So of_find_net_device_by_node can't find it, and DSA switches won't be able to probe on top of FMan ports. It would be a bit silly to modify a core function (of_find_net_device_by_node) to look for dev->parent->parent->of_node just for one driver. We're just 1 step away from implementing full recursion. Actually there have already been at least 2 previous attempts to make this work: - Commit a1a50c8e4c24 ("fsl/man: Inherit parent device and of_node") - One or more of the patches in "[v3,0/6] adapt DPAA drivers for DSA": https://patchwork.ozlabs.org/project/netdev/cover/1508178970-28945-1-git-send-email-madalin.bucur@nxp.com/ (I couldn't really figure out which one was supposed to solve the problem and how). Point being, it looks like this is still pretty much a problem today. On T1040, the /sys/class/net/eth0 symlink currently points to ../../devices/platform/ffe000000.soc/ffe400000.fman/ffe4e6000.ethernet/dpaa-ethernet.0/net/eth0 which pretty much illustrates the problem. The closest of_node we've got is the "fsl,fman-memac" at /soc@ffe000000/fman@400000/ethernet@e6000, which is what we'd like to be able to reference from DSA as host port. For of_find_net_device_by_node to find the eth0 port, we would need the parent of the eth0 net_device to not be the "dpaa-ethernet" platform device, but to point 1 level higher, aka the "fsl,fman-memac" node directly. The new sysfs path would look like this: ../../devices/platform/ffe000000.soc/ffe400000.fman/ffe4e6000.ethernet/net/eth0 And this is exactly what SET_NETDEV_DEV does. It sets the parent of the net_device. The new parent has an of_node associated with it, and of_dev_node_match already checks for the of_node of the device or of its parent. Fixes: a1a50c8e4c24 ("fsl/man: Inherit parent device and of_node") Fixes: c6e26ea8c893 ("dpaa_eth: change device used") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25ptp_clock: Let the ADJ_OFFSET interface respect the ADJ_NANO flag for PHC ↵Richard Cochran
devices. In commit 184ecc9eb260d5a3bcdddc5bebd18f285ac004e9 ("ptp: Add adjphase function to support phase offset control.") the PTP Hardware Clock interface expanded to support the ADJ_OFFSET offset mode. However, the implementation did not respect the traditional yet pedantic distinction between units of microseconds and nanoseconds signaled by the ADJ_NANO flag. This patch fixes the issue by adding logic to handle that flag. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Vincent Cheng <vincent.cheng.xh@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25tcp: allow traceroute -Mtcp for unpriv usersEric Dumazet
Unpriv users can use traceroute over plain UDP sockets, but not TCP ones. $ traceroute -Mtcp 8.8.8.8 You do not have enough privileges to use this traceroute method. $ traceroute -n -Mudp 8.8.8.8 traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 60 byte packets 1 192.168.86.1 3.631 ms 3.512 ms 3.405 ms 2 10.1.10.1 4.183 ms 4.125 ms 4.072 ms 3 96.120.88.125 20.621 ms 19.462 ms 20.553 ms 4 96.110.177.65 24.271 ms 25.351 ms 25.250 ms 5 69.139.199.197 44.492 ms 43.075 ms 44.346 ms 6 68.86.143.93 27.969 ms 25.184 ms 25.092 ms 7 96.112.146.18 25.323 ms 96.112.146.22 25.583 ms 96.112.146.26 24.502 ms 8 72.14.239.204 24.405 ms 74.125.37.224 16.326 ms 17.194 ms 9 209.85.251.9 18.154 ms 209.85.247.55 14.449 ms 209.85.251.9 26.296 ms^C We can easily support traceroute over TCP, by queueing an error message into socket error queue. Note that applications need to set IP_RECVERR/IPV6_RECVERR option to enable this feature, and that the error message is only queued while in SYN_SNT state. socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 3 setsockopt(3, SOL_IPV6, IPV6_RECVERR, [1], 4) = 0 setsockopt(3, SOL_SOCKET, SO_TIMESTAMP_OLD, [1], 4) = 0 setsockopt(3, SOL_IPV6, IPV6_UNICAST_HOPS, [5], 4) = 0 connect(3, {sa_family=AF_INET6, sin6_port=htons(8787), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "2002:a05:6608:297::", &sin6_addr), sin6_scope_id=0}, 28) = -1 EHOSTUNREACH (No route to host) recvmsg(3, {msg_name={sa_family=AF_INET6, sin6_port=htons(8787), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "2002:a05:6608:297::", &sin6_addr), sin6_scope_id=0}, msg_namelen=1024->28, msg_iov=[{iov_base="`\r\337\320\0004\6\1&\7\370\260\200\231\16\27\0\0\0\0\0\0\0\0 \2\n\5f\10\2\227"..., iov_len=1024}], msg_iovlen=1, msg_control=[{cmsg_len=32, cmsg_level=SOL_SOCKET, cmsg_type=SO_TIMESTAMP_OLD, cmsg_data={tv_sec=1590340680, tv_usec=272424}}, {cmsg_len=60, cmsg_level=SOL_IPV6, cmsg_type=IPV6_RECVERR}], msg_controllen=96, msg_flags=MSG_ERRQUEUE}, MSG_ERRQUEUE) = 144 Suggested-by: Maciej Żenczykowski <maze@google.com Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Reviewed-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25bnx2x: allow bnx2x_bsc_read() to scheduleEric Dumazet
bnx2x_warpcore_read_sfp_module_eeprom() can call bnx2x_bsc_read() three times before giving up. This causes latency blips of at least 31 ms (58 ms being reported by our teams) Convert the long lasting loops of udelay() to usleep_range() ones, and breaks the loops on precise time tracking. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ariel Elior <aelior@marvell.com> Cc: Sudarsana Kalluru <skalluru@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25ipv4: potential underflow in compat_ip_setsockopt()Dan Carpenter
The value of "n" is capped at 0x1ffffff but it checked for negative values. I don't think this causes a problem but I'm not certain and it's harmless to prevent it. Fixes: 2e04172875c9 ("ipv4: do compat setsockopt for MCAST_MSFILTER directly") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25mvneta: MVNETA_SKB_HEADROOM set last 3 bits to zeroSven Auhagen
For XDP the MVNETA_SKB_HEADROOM is used as an offset for the received data. The MVNETA manual states that the last 3 bits assumed to be 0. This is currently the case but lets make it explicit in the definition to prevent future problems. Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25net/tls: fix race condition causing kernel panicVinay Kumar Yadav
tls_sw_recvmsg() and tls_decrypt_done() can be run concurrently. // tls_sw_recvmsg() if (atomic_read(&ctx->decrypt_pending)) crypto_wait_req(-EINPROGRESS, &ctx->async_wait); else reinit_completion(&ctx->async_wait.completion); //tls_decrypt_done() pending = atomic_dec_return(&ctx->decrypt_pending); if (!pending && READ_ONCE(ctx->async_notify)) complete(&ctx->async_wait.completion); Consider the scenario tls_decrypt_done() is about to run complete() if (!pending && READ_ONCE(ctx->async_notify)) and tls_sw_recvmsg() reads decrypt_pending == 0, does reinit_completion(), then tls_decrypt_done() runs complete(). This sequence of execution results in wrong completion. Consequently, for next decrypt request, it will not wait for completion, eventually on connection close, crypto resources freed, there is no way to handle pending decrypt response. This race condition can be avoided by having atomic_read() mutually exclusive with atomic_dec_return(),complete().Intoduced spin lock to ensure the mutual exclution. Addressed similar problem in tx direction. v1->v2: - More readable commit message. - Corrected the lock to fix new race scenario. - Removed barrier which is not needed now. Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance") Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: - correct value of decompressor tag size in header - fix DACR value when we have nested exceptions - fix a missing newline on a kernel message - fix mask for ptrace thumb breakpoint hook * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8977/1: ptrace: Fix mask for thumb breakpoint hook ARM: 8973/1: Add missing newline terminator to kernel message ARM: uaccess: fix DACR mismatch with nested exceptions ARM: uaccess: integrate uaccess_save and uaccess_restore ARM: uaccess: consolidate uaccess asm to asm/uaccess-asm.h ARM: 8970/1: decompressor: increase tag size
2020-05-26Merge tag 'omap-for-v5.7/cpsw-fixes-signed' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes Few cpsw related dts fixes for omaps Recent cpsw driver changes exposed few regressions in the cpsw related dts configuration that would be good to fix: - Few more boards still need to be updated to use rgmii-rxid phy caused by the fallout from commit bcf3440c6dd7 ("net: phy: micrel: add phy-mode support for the KSZ9031 PHY" as the rx delay is now disabled unless we use rgmii-rxid. - On dm814x we have been using a wrong clock for mdio that now can produce external abort on some boards * tag 'omap-for-v5.7/cpsw-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: Fix wrong mdio clock for dm814x ARM: dts: am437x: fix networking on boards with ksz9031 phy ARM: dts: am57xx: fix networking on boards with ksz9031 phy Link: https://lore.kernel.org/r/pull-1589472123-367692@atomide.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-05-26xsk: Add overflow check for u64 division, stored into u32Björn Töpel
The npgs member of struct xdp_umem is an u32 entity, and stores the number of pages the UMEM consumes. The calculation of npgs npgs = size / PAGE_SIZE can overflow. To avoid overflow scenarios, the division is now first stored in a u64, and the result is verified to fit into 32b. An alternative would be storing the npgs as a u64, however, this wastes memory and is an unrealisticly large packet area. Fixes: c0c77d8fb787 ("xsk: add user memory registration support sockopt") Reported-by: "Minh Bùi Quang" <minhquangbui99@gmail.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/bpf/CACtPs=GGvV-_Yj6rbpzTVnopgi5nhMoCcTkSkYrJHGQHJWFZMQ@mail.gmail.com/ Link: https://lore.kernel.org/bpf/20200525080400.13195-1-bjorn.topel@gmail.com
2020-05-25netfilter: nfnetlink_cthelper: unbreak userspace helper supportPablo Neira Ayuso
Restore helper data size initialization and fix memcopy of the helper data size. Fixes: 157ffffeb5dc ("netfilter: nfnetlink_cthelper: reject too large userspace allocation requests") Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-25netfilter: conntrack: make conntrack userspace helpers work againPablo Neira Ayuso
Florian Westphal says: "Problem is that after the helper hook was merged back into the confirm one, the queueing itself occurs from the confirm hook, i.e. we queue from the last netfilter callback in the hook-list. Therefore, on return, the packet bypasses the confirm action and the connection is never committed to the main conntrack table. To fix this there are several ways: 1. revert the 'Fixes' commit and have a extra helper hook again. Works, but has the drawback of adding another indirect call for everyone. 2. Special case this: split the hooks only when userspace helper gets added, so queueing occurs at a lower priority again, and normal enqueue reinject would eventually call the last hook. 3. Extend the existing nf_queue ct update hook to allow a forced confirmation (plus run the seqadj code). This goes for 3)." Fixes: 827318feb69cb ("netfilter: conntrack: remove helper hook again") Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-25netfilter: nf_conntrack_pptp: prevent buffer overflows in debug codePablo Neira Ayuso
Dan Carpenter says: "Smatch complains that the value for "cmd" comes from the network and can't be trusted." Add pptp_msg_name() helper function that checks for the array boundary. Fixes: f09943fefe6b ("[NETFILTER]: nf_conntrack/nf_nat: add PPTP helper port") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-25netfilter: ipset: Fix subcounter update skipPhil Sutter
If IPSET_FLAG_SKIP_SUBCOUNTER_UPDATE is set, user requested to not update counters in sub sets. Therefore IPSET_FLAG_SKIP_COUNTER_UPDATE must be set, not unset. Fixes: 6e01781d1c80e ("netfilter: ipset: set match: add support to match the counters") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-25netfilter: nft_reject_bridge: enable reject with bridge vlanMichael Braun
Currently, using the bridge reject target with tagged packets results in untagged packets being sent back. Fix this by mirroring the vlan id as well. Fixes: 85f5b3086a04 ("netfilter: bridge: add reject support") Signed-off-by: Michael Braun <michael-dev@fami-braun.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-25RDMA/pvrdma: Fix missing pci disable in pvrdma_pci_probe()Qiushi Wu
In function pvrdma_pci_probe(), pdev was not disabled in one error path. Thus replace the jump target “err_free_device” by "err_disable_pdev". Fixes: 29c8d9eba550 ("IB: Add vmw_pvrdma driver") Link: https://lore.kernel.org/r/20200523030457.16160-1-wu000273@umn.edu Signed-off-by: Qiushi Wu <wu000273@umn.edu> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-25xfrm: fix a warning in xfrm_policy_insert_listXin Long
This waring can be triggered simply by: # ip xfrm policy update src 192.168.1.1/24 dst 192.168.1.2/24 dir in \ priority 1 mark 0 mask 0x10 #[1] # ip xfrm policy update src 192.168.1.1/24 dst 192.168.1.2/24 dir in \ priority 2 mark 0 mask 0x1 #[2] # ip xfrm policy update src 192.168.1.1/24 dst 192.168.1.2/24 dir in \ priority 2 mark 0 mask 0x10 #[3] Then dmesg shows: [ ] WARNING: CPU: 1 PID: 7265 at net/xfrm/xfrm_policy.c:1548 [ ] RIP: 0010:xfrm_policy_insert_list+0x2f2/0x1030 [ ] Call Trace: [ ] xfrm_policy_inexact_insert+0x85/0xe50 [ ] xfrm_policy_insert+0x4ba/0x680 [ ] xfrm_add_policy+0x246/0x4d0 [ ] xfrm_user_rcv_msg+0x331/0x5c0 [ ] netlink_rcv_skb+0x121/0x350 [ ] xfrm_netlink_rcv+0x66/0x80 [ ] netlink_unicast+0x439/0x630 [ ] netlink_sendmsg+0x714/0xbf0 [ ] sock_sendmsg+0xe2/0x110 The issue was introduced by Commit 7cb8a93968e3 ("xfrm: Allow inserting policies with matching mark and different priorities"). After that, the policies [1] and [2] would be able to be added with different priorities. However, policy [3] will actually match both [1] and [2]. Policy [1] was matched due to the 1st 'return true' in xfrm_policy_mark_match(), and policy [2] was matched due to the 2nd 'return true' in there. It caused WARN_ON() in xfrm_policy_insert_list(). This patch is to fix it by only (the same value and priority) as the same policy in xfrm_policy_mark_match(). Thanks to Yuehaibing, we could make this fix better. v1->v2: - check policy->mark.v == pol->mark.v only without mask. Fixes: 7cb8a93968e3 ("xfrm: Allow inserting policies with matching mark and different priorities") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-05-25cfg80211: fix debugfs rename crashJohannes Berg
Removing the "if (IS_ERR(dir)) dir = NULL;" check only works if we adjust the remaining code to not rely on it being NULL. Check IS_ERR_OR_NULL() before attempting to dereference it. I'm not actually entirely sure this fixes the syzbot crash as the kernel config indicates that they do have DEBUG_FS in the kernel, but this is what I found when looking there. Cc: stable@vger.kernel.org Fixes: d82574a8e5a4 ("cfg80211: no need to check return value of debugfs_create functions") Reported-by: syzbot+fd5332e429401bf42d18@syzkaller.appspotmail.com Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20200525113816.fc4da3ec3d4b.Ica63a110679819eaa9fb3bc1b7437d96b1fd187d@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-05-25Merge tag 'arm-soc/for-5.7/devicetree-fixes-part2-v2' of ↵Arnd Bergmann
https://github.com/Broadcom/stblinux into arm/fixes This pull request contains Broadcom ARM-based SoCs Device Tree fixes for 5.7, please pull the following: - Vincent fixes the polarity of the ACT LED on the Raspberry Pi Zero W board - Hamish fixes the ARM PPI interrupts sensitivy for the Hurricane 2 SoCs * tag 'arm-soc/for-5.7/devicetree-fixes-part2-v2' of https://github.com/Broadcom/stblinux: ARM: dts: bcm: HR2: Fix PPI interrupt types ARM: dts: bcm2835-rpi-zero-w: Fix led polarity Link: https://lore.kernel.org/r/20200524203714.17035-1-f.fainelli@gmail.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-05-25gpio: bcm-kona: Fix return value of bcm_kona_gpio_probe()Tiezhu Yang
Propagate the error code returned by devm_platform_ioremap_resource() out of probe() instead of overwriting it. Fixes: 72d8cb715477 ("drivers: gpio: bcm-kona: use devm_platform_ioremap_resource()") Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> [Bartosz: tweaked the commit message] Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>