summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-02-16netfilter: ipt_CLUSTERIP: make symbol 'cip_netdev_notifier' staticWei Yongjun
Fixes the following sparse warnings: net/ipv4/netfilter/ipt_CLUSTERIP.c:867:23: warning: symbol 'cip_netdev_notifier' was not declared. Should it be static? Fixes: 5a86d68bcf02 ("netfilter: ipt_CLUSTERIP: fix deadlock in netns exit routine") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-16ipvs: fix warning on unused variableAndrea Claudi
When CONFIG_IP_VS_IPV6 is not defined, build produced this warning: net/netfilter/ipvs/ip_vs_ctl.c:899:6: warning: unused variable ‘ret’ [-Wunused-variable] int ret = 0; ^~~ Fix this by moving the declaration of 'ret' in the CONFIG_IP_VS_IPV6 section in the same function. While at it, drop its unneeded initialisation. Fixes: 098e13f5b21d ("ipvs: fix dependency on nf_defrag_ipv6") Reported-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-15Merge branch 'net-dsa-b53-VLAN-and-L2-fixes'David S. Miller
Florian Fainelli says: ==================== net: dsa: b53: VLAN and L2 fixes This patch series contains a collection of fixes to the b53 driver in order to: - consistently program the same default VLAN ID when a port is bridged or not - properly account for VLAN filtering being turned on/off and turning on ingress VID checking accordingly - have SYSTEMPORT properly forward BPDU frames to the network stack (which it did not) - do not assume that WoL is supported by the DSA master network device we are connected to ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15net: dsa: b53: Do not program CPU port's PVIDFlorian Fainelli
The CPU port is special and does not need to obey VLAN restrictions as far as untagged traffic goes, also, having the CPU port be part of a particular PVID is against the idea of keeping it tagged in all VLANs. Fixes: ca8931948344 ("net: dsa: b53: Keep CPU port as tagged in all VLANs") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15net: dsa: bcm_sf2: Do not assume DSA master supports WoLFlorian Fainelli
We assume in the bcm_sf2 driver that the DSA master network device supports ethtool_ops::{get,set}_wol operations, which is not a given. Avoid de-referencing potentially non-existent function pointers and check them as we should. Fixes: 96e65d7f3f88 ("net: dsa: bcm_sf2: add support for Wake-on-LAN") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15net: systemport: Fix reception of BPDUsFlorian Fainelli
SYSTEMPORT has its RXCHK parser block that attempts to validate the packet structures, unfortunately setting the L2 header check bit will cause Bridge PDUs (BPDUs) to be incorrectly rejected because they look like LLC/SNAP packets with a non-IPv4 or non-IPv6 Ethernet Type. Fixes: 4e8aedfe78c7 ("net: systemport: Turn on offloads by default") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15net: dsa: b53: Properly account for VLAN filteringFlorian Fainelli
VLAN filtering can be built into the kernel, and also dynamically turned on/off through the bridge master device. Allow re-configuring the switch appropriately to account for that by deciding whether VLAN table (v_table) misses should lead to a drop or forward. Fixes: a2482d2ce349 ("net: dsa: b53: Plug in VLAN support") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15net: dsa: b53: Fix default VLAN IDFlorian Fainelli
We were not consistent in how the default VID of a given port was defined, b53_br_leave() would make sure the VLAN ID would be either 0/1 depending on the switch generation, but b53_configure_vlan(), which is the default configuration would unconditionally set it to 1. The correct value is 1 for 5325/5365 series and 0 otherwise. To avoid repeating that mistake ever again, introduce a helper function: b53_default_pvid() to factor that out. Fixes: 967dd82ffc52 ("net: dsa: b53: Add support for Broadcom RoboSwitch") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15Merge branch 's390-next'David S. Miller
Julian Wiedmann says: ==================== s390/qeth: updates 2019-02-15 please apply a few more qeth patches to net-next. Along with some smaller improvements, this revamps our code for the SW statistics that are exposed through ETHTOOL_GSTATS. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15s390/qeth: split out OSN netdev opsJulian Wiedmann
Rather than special-casing OSN in a number of places, just give this device type its own netdev_ops structure. When setting up the OSN net_device, also skip the handling of the various HW offloads (eg TSO). The device shouldn't be advertising any of them, and the OSN code paths in qeth don't have support for them. In particular RX VLAN filtering is not supported, so don't hook up those callbacks in the netdev_ops. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15s390/qeth: add support for ETHTOOL_GRINGPARAMJulian Wiedmann
Implement a trivial callback that exposes the queue sizes. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15s390/qeth: overhaul ethtool statisticsJulian Wiedmann
Accumulate per-TX queue statistics, and increase their size to 64 bit. Don't bother with enabling/disabling the statistics, the overhead is negligible. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15s390/qeth: move ethtool code into its own fileJulian Wiedmann
Most of this is self-contained code. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15s390/qeth: reduce ethtool statisticsJulian Wiedmann
Counting the number of function calls and the time spent in functions is best left to proper tracing facilities. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15s390/qeth: use a static Output Queue arrayJulian Wiedmann
qeth dynamically allocates an array for storing pointers to its Output Queue structures. Switch this to a static array - we are currently limited to 4 Output Queues, so shrinking the qeth_qdio_info struct by just a few bytes doesn't justify the additional complexity. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15s390/qeth: allow manual recovery when device is SOFTSETUPJulian Wiedmann
Once a qeth ccwgroup device is set online, it's also armed for internal recovery. So allow for testing that code path via sysfs, regardless of whether the interface is up or down. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15selftests: forwarding: Add some missing configuration symbolsFlorian Fainelli
For the forwarding selftests to work, we need network namespaces when using veth/vrf otherwise ping/ping6 commands like these: ip vrf exec vveth0 /bin/ping 192.0.2.2 -c 10 -i 0.1 -w 5 will fail because network namespaces may not be enabled. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15net: validate untrusted gso packets without csum offloadWillem de Bruijn
Syzkaller again found a path to a kernel crash through bad gso input. By building an excessively large packet to cause an skb field to wrap. If VIRTIO_NET_HDR_F_NEEDS_CSUM was set this would have been dropped in skb_partial_csum_set. GSO packets that do not set checksum offload are suspicious and rare. Most callers of virtio_net_hdr_to_skb already pass them to skb_probe_transport_header. Move that test forward, change it to detect parse failure and drop packets on failure as those cleary are not one of the legitimate VIRTIO_NET_HDR_GSO types. Fixes: bfd5f4a3d605 ("packet: Add GSO/csum offload support.") Fixes: f43798c27684 ("tun: Allow GSO using virtio_net_hdr") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15net/ipv6: prefer rcu_access_pointer() over rcu_dereference()Paolo Abeni
rt6_cache_allowed_for_pmtu() checks for rt->from presence, but it does not access the RCU protected pointer. We can use rcu_access_pointer() and clean-up the code a bit. No functional changes intended. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15net: Fix for_each_netdev_feature on Big endianHauke Mehrtens
The features attribute is of type u64 and stored in the native endianes on the system. The for_each_set_bit() macro takes a pointer to a 32 bit array and goes over the bits in this area. On little Endian systems this also works with an u64 as the most significant bit is on the highest address, but on big endian the words are swapped. When we expect bit 15 here we get bit 47 (15 + 32). This patch converts it more or less to its own for_each_set_bit() implementation which works on 64 bit integers directly. This is then completely in host endianness and should work like expected. Fixes: fd867d51f ("net/core: generic support for disabling netdev features down stack") Signed-off-by: Hauke Mehrtens <hauke.mehrtens@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15net: phy: xgmiitorgmii: Support generic PHY status readPaul Kocialkowski
Some PHY drivers like the generic one do not provide a read_status callback on their own but rely on genphy_read_status being called directly. With the current code, this results in a NULL function pointer call. Call genphy_read_status instead when there is no specific callback. Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15mlxsw: core: fix spelling mistake "temprature" -> "temperature"Colin Ian King
There is a spelling mistake in several dev_err messages, fix these. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15net: ip6_gre: initialize erspan_ver just for erspan tunnelsLorenzo Bianconi
After commit c706863bc890 ("net: ip6_gre: always reports o_key to userspace"), ip6gre and ip6gretap tunnels started reporting TUNNEL_KEY output flag even if it is not configured. ip6gre_fill_info checks erspan_ver value to add TUNNEL_KEY for erspan tunnels, however in commit 84581bdae9587 ("erspan: set erspan_ver to 1 by default when adding an erspan dev") erspan_ver is initialized to 1 even for ip6gre or ip6gretap Fix the issue moving erspan_ver initialization in a dedicated routine Fixes: c706863bc890 ("net: ip6_gre: always reports o_key to userspace") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15Merge tag 'mac80211-for-davem-2019-02-15' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Just a few fixes this time: * mesh rhashtable fixes from Herbert * a small error path fix when starting AP interfaces ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15scsi: core: reset host byte in DID_NEXUS_FAILURE caseMartin Wilck
Up to 4.12, __scsi_error_from_host_byte() would reset the host byte to DID_OK for various cases including DID_NEXUS_FAILURE. Commit 2a842acab109 ("block: introduce new block status code type") replaced this function with scsi_result_to_blk_status() and removed the host-byte resetting code for the DID_NEXUS_FAILURE case. As the line set_host_byte(cmd, DID_OK) was preserved for the other cases, I suppose this was an editing mistake. The fact that the host byte remains set after 4.13 is causing problems with the sg_persist tool, which now returns success rather then exit status 24 when a RESERVATION CONFLICT error is encountered. Fixes: 2a842acab109 "block: introduce new block status code type" Signed-off-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-15scsi: libsas: Fix rphy phy_identifier for PHYs with end devices attachedJohn Garry
The sysfs phy_identifier attribute for a sas_end_device comes from the rphy phy_identifier value. Currently this is not being set for rphys with an end device attached, so we see incorrect symlinks from systemd disk/by-path: root@localhost:~# ls -l /dev/disk/by-path/ total 0 lrwxrwxrwx 1 root root 9 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0 -> ../../sdb lrwxrwxrwx 1 root root 10 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0-part1 -> ../../sdb1 lrwxrwxrwx 1 root root 10 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0-part2 -> ../../sdb2 lrwxrwxrwx 1 root root 10 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0-part3 -> ../../sdc3 Indeed, each sas_end_device phy_identifier value is 0: root@localhost:/# more sys/class/sas_device/end_device-0\:0\:2/phy_identifier 0 root@localhost:/# more sys/class/sas_device/end_device-0\:0\:10/phy_identifier 0 This patch fixes the discovery code to set the phy_identifier. With this, we now get proper symlinks: root@localhost:~# ls -l /dev/disk/by-path/ total 0 lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy10-lun-0 -> ../../sdg lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy11-lun-0 -> ../../sdh lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy2-lun-0 -> ../../sda lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy2-lun-0-part1 -> ../../sda1 lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy3-lun-0 -> ../../sdb lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy3-lun-0-part1 -> ../../sdb1 lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy3-lun-0-part2 -> ../../sdb2 lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0 -> ../../sdc lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0-part1 -> ../../sdc1 lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0-part2 -> ../../sdc2 lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0-part3 -> ../../sdc3 lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy5-lun-0 -> ../../sdd lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0 -> ../../sde lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0-part1 -> ../../sde1 lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0-part2 -> ../../sde2 lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0-part3 -> ../../sde3 lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0 -> ../../sdf lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0-part1 -> ../../sdf1 lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0-part2 -> ../../sdf2 lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0-part3 -> ../../sdf3 Fixes: 2908d778ab3e ("[SCSI] aic94xx: new driver") Reported-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Tested-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-15scsi: sd_zbc: Fix sd_zbc_report_zones() buffer allocationMasato Suzuki
The function sd_zbc_do_report_zones() issues a REPORT ZONES command with a buffer size calculated based on the number of zones requested by the caller. This value should however not exceed the capabilities of the hardware maximum command size, that is, should not exceed the max_hw_sectors limit of the device. This problem leads to failures of report zones commands when re-validating disks with some SAS HBAs. Fix this by limiting a report zone command buffer size to the minimum of the device max_hw_sectors and calculated value based on the requested number of zones. This does not change the semantic of the report_zones file operation as report zones can always return less zone reports than requested. Short reports are handled using a loop execution of the report_zones file operation in the function blk_report_zones(). [Damien] Before patch 'e76239a3748c ("block: add a report_zones method")', report zones buffer allocation was limited to max_sectors when allocated in blk_report_zones(). This however does not consider the actual format of the device reply which is interface dependent. Limiting the allocation based on the size of the expected reply format rather than the size of the array of generic sturct blkzone passed by blk_report_zones() makes more sense. Fixes: e76239a3748c ("block: add a report_zones method") Cc: stable@vger.kernel.org Signed-off-by: Masato Suzuki <masato.suzuki@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-15scsi: libiscsi: Fix race between iscsi_xmit_task and iscsi_complete_taskAnoob Soman
When a target sends Check Condition, whilst initiator is busy xmiting re-queued data, could lead to race between iscsi_complete_task() and iscsi_xmit_task() and eventually crashing with the following kernel backtrace. [3326150.987523] ALERT: BUG: unable to handle kernel NULL pointer dereference at 0000000000000078 [3326150.987549] ALERT: IP: [<ffffffffa05ce70d>] iscsi_xmit_task+0x2d/0xc0 [libiscsi] [3326150.987571] WARN: PGD 569c8067 PUD 569c9067 PMD 0 [3326150.987582] WARN: Oops: 0002 [#1] SMP [3326150.987593] WARN: Modules linked in: tun nfsv3 nfs fscache dm_round_robin [3326150.987762] WARN: CPU: 2 PID: 8399 Comm: kworker/u32:1 Tainted: G O 4.4.0+2 #1 [3326150.987774] WARN: Hardware name: Dell Inc. PowerEdge R720/0W7JN5, BIOS 2.5.4 01/22/2016 [3326150.987790] WARN: Workqueue: iscsi_q_13 iscsi_xmitworker [libiscsi] [3326150.987799] WARN: task: ffff8801d50f3800 ti: ffff8801f5458000 task.ti: ffff8801f5458000 [3326150.987810] WARN: RIP: e030:[<ffffffffa05ce70d>] [<ffffffffa05ce70d>] iscsi_xmit_task+0x2d/0xc0 [libiscsi] [3326150.987825] WARN: RSP: e02b:ffff8801f545bdb0 EFLAGS: 00010246 [3326150.987831] WARN: RAX: 00000000ffffffc3 RBX: ffff880282d2ab20 RCX: ffff88026b6ac480 [3326150.987842] WARN: RDX: 0000000000000000 RSI: 00000000fffffe01 RDI: ffff880282d2ab20 [3326150.987852] WARN: RBP: ffff8801f545bdc8 R08: 0000000000000000 R09: 0000000000000008 [3326150.987862] WARN: R10: 0000000000000000 R11: 000000000000fe88 R12: 0000000000000000 [3326150.987872] WARN: R13: ffff880282d2abe8 R14: ffff880282d2abd8 R15: ffff880282d2ac08 [3326150.987890] WARN: FS: 00007f5a866b4840(0000) GS:ffff88028a640000(0000) knlGS:0000000000000000 [3326150.987900] WARN: CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 [3326150.987907] WARN: CR2: 0000000000000078 CR3: 0000000070244000 CR4: 0000000000042660 [3326150.987918] WARN: Stack: [3326150.987924] WARN: ffff880282d2ad58 ffff880282d2ab20 ffff880282d2abe8 ffff8801f545be18 [3326150.987938] WARN: ffffffffa05cea90 ffff880282d2abf8 ffff88026b59cc80 ffff88026b59cc00 [3326150.987951] WARN: ffff88022acf32c0 ffff880289491800 ffff880255a80800 0000000000000400 [3326150.987964] WARN: Call Trace: [3326150.987975] WARN: [<ffffffffa05cea90>] iscsi_xmitworker+0x2f0/0x360 [libiscsi] [3326150.987988] WARN: [<ffffffff8108862c>] process_one_work+0x1fc/0x3b0 [3326150.987997] WARN: [<ffffffff81088f95>] worker_thread+0x2a5/0x470 [3326150.988006] WARN: [<ffffffff8159cad8>] ? __schedule+0x648/0x870 [3326150.988015] WARN: [<ffffffff81088cf0>] ? rescuer_thread+0x300/0x300 [3326150.988023] WARN: [<ffffffff8108ddf5>] kthread+0xd5/0xe0 [3326150.988031] WARN: [<ffffffff8108dd20>] ? kthread_stop+0x110/0x110 [3326150.988040] WARN: [<ffffffff815a0bcf>] ret_from_fork+0x3f/0x70 [3326150.988048] WARN: [<ffffffff8108dd20>] ? kthread_stop+0x110/0x110 [3326150.988127] ALERT: RIP [<ffffffffa05ce70d>] iscsi_xmit_task+0x2d/0xc0 [libiscsi] [3326150.988138] WARN: RSP <ffff8801f545bdb0> [3326150.988144] WARN: CR2: 0000000000000078 [3326151.020366] WARN: ---[ end trace 1c60974d4678d81b ]--- Commit 6f8830f5bbab ("scsi: libiscsi: add lock around task lists to fix list corruption regression") introduced "taskqueuelock" to fix list corruption during the race, but this wasn't enough. Re-setting of conn->task to NULL, could race with iscsi_xmit_task(). iscsi_complete_task() { .... if (conn->task == task) conn->task = NULL; } conn->task in iscsi_xmit_task() could be NULL and so will be task. __iscsi_get_task(task) will crash (NullPtr de-ref), trying to access refcount. iscsi_xmit_task() { struct iscsi_task *task = conn->task; __iscsi_get_task(task); } This commit will take extra conn->session->back_lock in iscsi_xmit_task() to ensure iscsi_xmit_task() waits for iscsi_complete_task(), if iscsi_complete_task() wins the race. If iscsi_xmit_task() wins the race, iscsi_xmit_task() increments task->refcount (__iscsi_get_task) ensuring iscsi_complete_task() will not iscsi_free_task(). Signed-off-by: Anoob Soman <anoob.soman@citrix.com> Signed-off-by: Bob Liu <bob.liu@oracle.com> Acked-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-15net/mlx5: E-Switch, Allow transition to offloads mode for ECPFBodong Wang
Currently, the e-switch driver requires going to legacy mode before changing to the offloads mode. This makes sense for regular case as the legacy mode is done by creating VFs. However, it's problematic when ECPF is the eswitch manager. In such case, ECPF will control the vports on peer host including the peer PF and VFs. But ECPF doesn't need and shall not create VFs as the VFs are created in the peer PF host. Grant ECPF the ability to change from none to the offloads mode. Note that currently the only way to go back to none mode is by unloading the ECPF driver. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Load/unload VF reps according to event from host PFBodong Wang
When host PF changes the number of VFs, the ECPF esw driver will get a FW event. It should query the number of VFs enabled by host PF and update the VF reps accordingly. Note that host PF can't change the number of VFs dynamically, it has to reset the number of VFs to 0 before changing to a new positive number. The host event is registered when driver is moving to switchdev mode, and it's the last step to do in esw_offloads_init. It's unregistered and the work queue is flushed when driver quits from switchdev mode. In this way, the host event and devlink command are serialized. When driver is enabling switchdev mode, pay attention to the following two facts: 1. Host PF must not have VF initialized as the flow table in ECPF has ENCAP enabled as default. Such flow table can't be created with existing initialized VFs. 2. ECPF doesn't know how many VFs the host PF will enable, ECPF offloads flow steering shall create the flow table/groups based on the max number of VFs possibly supported by host PF. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Consider ECPF vport depends on eswitch ownershipBodong Wang
ECPF connects to the eswitch through vport 0xfffe. ECPF may or may not be the eswitch manager depending on firmware configuration. 1. If ECPF is eswitch manager: ECPF will take over the eswitch manager responsibility. A rep of the host PF shall be created at the ECPF side for the eswitch manager to control. 2. If ECPF is not eswitch manager: host PF will be the eswitch manager, ECPF acts similar as a VF to the host PF. Host PF will be aware of the ECPF vport presence and control it's rep. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Assign a different position for uplink rep and vportBodong Wang
In offloads mode, the current implementation puts the uplink representor at index zero of the vport reps array. It is not "natural" to place it at index 0 since we want to put the representor for vport 0 at index 0 with the introduction of SmartNIC. A separate patch will handle the case whether a rep is needed for vport 0 (PF vport). So, we want to have a different placeholder for uplink vport and representor. It was placed at the end of vport and rep array. Since vport number can no longer act as an index into the vport or representors arrays, use functions to map vport numbers to indices when accessing the vports or representors arrays, and vice versa. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Centralize repersentor reg/unreg to eswitch driverBodong Wang
Eswitch has two users: IB and ETH. They both register repersentors when mlx5 interface is added, and unregister the repersentors when mlx5 interface is removed. Ideally, each driver should only deal with the entities which are unique to itself. However, current IB and ETH drivers have to perform the following eswitch operations: 1. When registering, specify how many vports to register. This number is the same for both drivers which is the total available vport numbers. 2. When unregistering, specify the number of registered vports to do unregister. Also, unload the repersentors which are already loaded. It's unnecessary for eswitch driver to hands out the control of above operations to individual driver users, as they're not unique to each driver. Instead, such operations should be centralized to eswitch driver. This consolidates eswitch control flow, and simplified IB and ETH driver. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Support load/unload reps of specific vport typesBodong Wang
Currently the driver loads and unloads all reps in an unbreakable group. However, with ECPF, the reps of special vports such as uplink and host PF should always be loaded in switchdev mode where the reps for VFs will be loaded on-demand and unloaded on no-demand. This is a pre-step for that change. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Add state to eswitch vport representorsBodong Wang
Currently the eswitch vport reps have a valid indicator, which is set on register and unset on unregister. However, a rep can be loaded or not loaded when doing unregister, current driver checks if the vport of that rep is enabled as a flag to imply the rep is loaded. However, for ECPF, this is not valid as the host PF will enable the vports for its VFs instead. Add three states: {unregistered, registered, loaded}, with the following state changes across different operations: create: (none) -> unregistered reg: unregistered -> registered load: registered -> loaded unload: loaded -> registered unreg: registered -> unregistered Note that the state shall only be updated inside eswitch driver rather than individual drivers such as ETH or IB. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Suggested-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Use getter and iterator to access vport/repBodong Wang
With only PF and VF, it is sufficient to have the vport/rep array index as the vport number. This is because PF and VF vports numbers are consecutive serial numbers. In downstream patches with introducing of ECPF and UPLINK vports, it's not consecutive any more. Use getter to get specific vport/rep, and use iterator to traversal a list of vport/rep. This hides the translation between array index and vport number, and provides flexibility of using different translation mechanism in the future. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Suggested-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Split VF and special vports for offloads modeBodong Wang
When driver is entering offloads mode, there are two major tasks to do: initialize flow steering and create representors. Flow steering should make sure enough flow table/group spaces are reserved for all reps. Representors will be created in a group, all or none. With the introduction of ECPF, flow steering should still reserve the same spaces. But, the representors are not always loaded/unloaded in a single piece. Once ECPF is in offloads mode, it will get the number of VF changing event from host PF. In such scenario, only the VF reps should be loaded/unloaded, not the reps for special vports (such as the uplink vport). Thus, when entering offloads mode, driver should specify the total number of reps, and the number of VF reps separately. When leaving offloads mode, the cleanup should use the information self-contained in eswitch such as number of VFs. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Refactor offloads flow steering init/cleanupBodong Wang
E-switch offloads mode initialize/cleanup multiple steering related entities (flow table/group). Refactor these operations to internal helper functions for better block design. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Properly refer to host PF vport as other vportBodong Wang
Commands referring to vports use the following scheme: 1. When referring to my own vport, put 0 in vport and 0 in other_vport. 2. When referring to another vport, put the vport number of the referred vport and put 1 in other_vport. It was assumed that driver is accessing other vport when vport number is greater than 0. With the above scheme, the case that ECPF eswitch manager is trying to access host PF vport will fall over with scheme 1 as the vport number is 0. This is apparently wrong as driver is trying to refer other vport. As such usage can only happen in the eswitch context, change relevant functions to provide other vport input properly. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Properly refer to the esw manager vportBodong Wang
In SmartNIC mode, the eswitch manager is not necessarily the PF (vport 0). Use a helper function to get the correct eswitch manager vport number and cache on the eswitch instance for fast reference. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: Correctly set LAG mode for ECPFBodong Wang
When bonding is added, driver assumes that it's RoCE LAG if no VF is enabled. This is not enough for ECPF as the VF is enabled in host PF side. LAG should only choose RoCE mode when both slave devices meet conditions below: 1. E-Switch offloads mode is NONE. 2. No VF is enabled. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15Merge branch 'mlx5-next' of ↵Saeed Mahameed
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Merge mlx5-next shared branched into net-next, From Bodong Wang: 1) Introduction of ECPF (Embedded CPU Physical Function), and low level bits for mlx5 SmartNic capabilities support. 2) Vport enumeration refactoring that affect mlx5_ib and mlx5_core From Aya Levin, 3) Add support for 50Gbps per lane link modes in the Port Type and Speed register (PTYS) 4) Refactor low level query functions for PTYS register 5) Add support for 50Gbps per lane link modes to mlx5_ib Note: due to a change in API in mlx5/core and a later patch from net-next, a fixup was squashed with this merge commit that replaces FDB_UPLINK_VPORT with MLX5_VPORT_UPLINK which exists only in upstream net-next. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-16MIPS: eBPF: Remove REG_32BIT_ZERO_EXPaul Burton
REG_32BIT_ZERO_EX and REG_64BIT are always handled in exactly the same way, and reg_val_propagate_range() never actually sets any register to type REG_32BIT_ZERO_EX. Remove the redundant & unused REG_32BIT_ZERO_EX. Signed-off-by: Paul Burton <paul.burton@mips.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-16MIPS: eBPF: Always return sign extended 32b valuesPaul Burton
The function prototype used to call JITed eBPF code (ie. the type of the struct bpf_prog bpf_func field) returns an unsigned int. The MIPS n64 ABI that MIPS64 kernels target defines that 32 bit integers should always be sign extended when passed in registers as either arguments or return values. This means that when returning any value which may not already be sign extended (ie. of type REG_64BIT or REG_32BIT_ZERO_EX) we need to perform that sign extension in order to comply with the n64 ABI. Without this we see strange looking test failures from test_bpf.ko, such as: test_bpf: #65 ALU64_MOV_X: dst = 4294967295 jited:1 ret -1 != -1 FAIL (1 times) Although the return value printed matches the expected value, this is only because printf is only examining the least significant 32 bits of the 64 bit register value we returned. The register holding the expected value is sign extended whilst the v0 register was set to a zero extended value by our JITed code, so when compared by a conditional branch instruction the values are not equal. We already handle this when the return value register is of type REG_32BIT_ZERO_EX, so simply extend this to also cover REG_64BIT. Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: b6bd53f9c4e8 ("MIPS: Add missing file for eBPF JIT.") Cc: stable@vger.kernel.org # v4.13+ Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-16bpf: make LWTUNNEL_BPF dependent on INETPeter Oskolkov
Lightweight tunnels are L3 constructs that are used with IP/IP6. For example, lwtunnel_xmit is called from ip_output.c and ip6_output.c only. Make the dependency explicit at least for LWT-BPF, as now they call into IP routing. V2: added "Reported-by" below. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Peter Oskolkov <posk@google.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-15keys: Timestamp new keysDavid Howells
Set the timestamp on new keys rather than leaving it unset. Fixes: 31d5a79d7f3d ("KEYS: Do LRU discard in full keyrings") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.morris@microsoft.com>
2019-02-15keys: Fix dependency loop between construction record and auth keyDavid Howells
In the request_key() upcall mechanism there's a dependency loop by which if a key type driver overrides the ->request_key hook and the userspace side manages to lose the authorisation key, the auth key and the internal construction record (struct key_construction) can keep each other pinned. Fix this by the following changes: (1) Killing off the construction record and using the auth key instead. (2) Including the operation name in the auth key payload and making the payload available outside of security/keys/. (3) The ->request_key hook is given the authkey instead of the cons record and operation name. Changes (2) and (3) allow the auth key to naturally be cleaned up if the keyring it is in is destroyed or cleared or the auth key is unlinked. Fixes: 7ee02a316600 ("keys: Fix dependency loop between construction record and auth key") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.morris@microsoft.com>
2019-02-15assoc_array: Fix shortcut creationDavid Howells
Fix the creation of shortcuts for which the length of the index key value is an exact multiple of the machine word size. The problem is that the code that blanks off the unused bits of the shortcut value malfunctions if the number of bits in the last word equals machine word size. This is due to the "<<" operator being given a shift of zero in this case, and so the mask that should be all zeros is all ones instead. This causes the subsequent masking operation to clear everything rather than clearing nothing. Ordinarily, the presence of the hash at the beginning of the tree index key makes the issue very hard to test for, but in this case, it was encountered due to a development mistake that caused the hash output to be either 0 (keyring) or 1 (non-keyring) only. This made it susceptible to the keyctl/unlink/valid test in the keyutils package. The fix is simply to skip the blanking if the shift would be 0. For example, an index key that is 64 bits long would produce a 0 shift and thus a 'blank' of all 1s. This would then be inverted and AND'd onto the index_key, incorrectly clearing the entire last word. Fixes: 3cb989501c26 ("Add a generic associative array implementation.") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.morris@microsoft.com>
2019-02-15KEYS: allow reaching the keys quotas exactlyEric Biggers
If the sysctl 'kernel.keys.maxkeys' is set to some number n, then actually users can only add up to 'n - 1' keys. Likewise for 'kernel.keys.maxbytes' and the root_* versions of these sysctls. But these sysctls are apparently supposed to be *maximums*, as per their names and all documentation I could find -- the keyrings(7) man page, Documentation/security/keys/core.rst, and all the mentions of EDQUOT meaning that the key quota was *exceeded* (as opposed to reached). Thus, fix the code to allow reaching the quotas exactly. Fixes: 0b77f5bfb45c ("keys: make the keyring quotas controllable through /proc/sys") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.morris@microsoft.com>
2019-02-15Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two fairly small fixes: the qla one is a panic inducing use after free and the entropy fix may seem minor but it has had huge userspace impact thanks to an unrelated change in openssl that causes sshd to refuse logins until it has enough entropy for the session keys, which causes tens of minutes delay before the affected systems allow logins after reboot" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: qla2xxx: Fix panic from use after free in qla2x00_async_tm_cmd scsi: sd: fix entropy gathering for most rotational disks