summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-02-15s390/qeth: use a static Output Queue arrayJulian Wiedmann
qeth dynamically allocates an array for storing pointers to its Output Queue structures. Switch this to a static array - we are currently limited to 4 Output Queues, so shrinking the qeth_qdio_info struct by just a few bytes doesn't justify the additional complexity. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15s390/qeth: allow manual recovery when device is SOFTSETUPJulian Wiedmann
Once a qeth ccwgroup device is set online, it's also armed for internal recovery. So allow for testing that code path via sysfs, regardless of whether the interface is up or down. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15selftests: forwarding: Add some missing configuration symbolsFlorian Fainelli
For the forwarding selftests to work, we need network namespaces when using veth/vrf otherwise ping/ping6 commands like these: ip vrf exec vveth0 /bin/ping 192.0.2.2 -c 10 -i 0.1 -w 5 will fail because network namespaces may not be enabled. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15net: validate untrusted gso packets without csum offloadWillem de Bruijn
Syzkaller again found a path to a kernel crash through bad gso input. By building an excessively large packet to cause an skb field to wrap. If VIRTIO_NET_HDR_F_NEEDS_CSUM was set this would have been dropped in skb_partial_csum_set. GSO packets that do not set checksum offload are suspicious and rare. Most callers of virtio_net_hdr_to_skb already pass them to skb_probe_transport_header. Move that test forward, change it to detect parse failure and drop packets on failure as those cleary are not one of the legitimate VIRTIO_NET_HDR_GSO types. Fixes: bfd5f4a3d605 ("packet: Add GSO/csum offload support.") Fixes: f43798c27684 ("tun: Allow GSO using virtio_net_hdr") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15net/ipv6: prefer rcu_access_pointer() over rcu_dereference()Paolo Abeni
rt6_cache_allowed_for_pmtu() checks for rt->from presence, but it does not access the RCU protected pointer. We can use rcu_access_pointer() and clean-up the code a bit. No functional changes intended. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15net: Fix for_each_netdev_feature on Big endianHauke Mehrtens
The features attribute is of type u64 and stored in the native endianes on the system. The for_each_set_bit() macro takes a pointer to a 32 bit array and goes over the bits in this area. On little Endian systems this also works with an u64 as the most significant bit is on the highest address, but on big endian the words are swapped. When we expect bit 15 here we get bit 47 (15 + 32). This patch converts it more or less to its own for_each_set_bit() implementation which works on 64 bit integers directly. This is then completely in host endianness and should work like expected. Fixes: fd867d51f ("net/core: generic support for disabling netdev features down stack") Signed-off-by: Hauke Mehrtens <hauke.mehrtens@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15net: phy: xgmiitorgmii: Support generic PHY status readPaul Kocialkowski
Some PHY drivers like the generic one do not provide a read_status callback on their own but rely on genphy_read_status being called directly. With the current code, this results in a NULL function pointer call. Call genphy_read_status instead when there is no specific callback. Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15mlxsw: core: fix spelling mistake "temprature" -> "temperature"Colin Ian King
There is a spelling mistake in several dev_err messages, fix these. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15net: ip6_gre: initialize erspan_ver just for erspan tunnelsLorenzo Bianconi
After commit c706863bc890 ("net: ip6_gre: always reports o_key to userspace"), ip6gre and ip6gretap tunnels started reporting TUNNEL_KEY output flag even if it is not configured. ip6gre_fill_info checks erspan_ver value to add TUNNEL_KEY for erspan tunnels, however in commit 84581bdae9587 ("erspan: set erspan_ver to 1 by default when adding an erspan dev") erspan_ver is initialized to 1 even for ip6gre or ip6gretap Fix the issue moving erspan_ver initialization in a dedicated routine Fixes: c706863bc890 ("net: ip6_gre: always reports o_key to userspace") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15Merge tag 'mac80211-for-davem-2019-02-15' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Just a few fixes this time: * mesh rhashtable fixes from Herbert * a small error path fix when starting AP interfaces ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15scsi: core: reset host byte in DID_NEXUS_FAILURE caseMartin Wilck
Up to 4.12, __scsi_error_from_host_byte() would reset the host byte to DID_OK for various cases including DID_NEXUS_FAILURE. Commit 2a842acab109 ("block: introduce new block status code type") replaced this function with scsi_result_to_blk_status() and removed the host-byte resetting code for the DID_NEXUS_FAILURE case. As the line set_host_byte(cmd, DID_OK) was preserved for the other cases, I suppose this was an editing mistake. The fact that the host byte remains set after 4.13 is causing problems with the sg_persist tool, which now returns success rather then exit status 24 when a RESERVATION CONFLICT error is encountered. Fixes: 2a842acab109 "block: introduce new block status code type" Signed-off-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-15scsi: libsas: Fix rphy phy_identifier for PHYs with end devices attachedJohn Garry
The sysfs phy_identifier attribute for a sas_end_device comes from the rphy phy_identifier value. Currently this is not being set for rphys with an end device attached, so we see incorrect symlinks from systemd disk/by-path: root@localhost:~# ls -l /dev/disk/by-path/ total 0 lrwxrwxrwx 1 root root 9 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0 -> ../../sdb lrwxrwxrwx 1 root root 10 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0-part1 -> ../../sdb1 lrwxrwxrwx 1 root root 10 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0-part2 -> ../../sdb2 lrwxrwxrwx 1 root root 10 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0-part3 -> ../../sdc3 Indeed, each sas_end_device phy_identifier value is 0: root@localhost:/# more sys/class/sas_device/end_device-0\:0\:2/phy_identifier 0 root@localhost:/# more sys/class/sas_device/end_device-0\:0\:10/phy_identifier 0 This patch fixes the discovery code to set the phy_identifier. With this, we now get proper symlinks: root@localhost:~# ls -l /dev/disk/by-path/ total 0 lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy10-lun-0 -> ../../sdg lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy11-lun-0 -> ../../sdh lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy2-lun-0 -> ../../sda lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy2-lun-0-part1 -> ../../sda1 lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy3-lun-0 -> ../../sdb lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy3-lun-0-part1 -> ../../sdb1 lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy3-lun-0-part2 -> ../../sdb2 lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0 -> ../../sdc lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0-part1 -> ../../sdc1 lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0-part2 -> ../../sdc2 lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0-part3 -> ../../sdc3 lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy5-lun-0 -> ../../sdd lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0 -> ../../sde lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0-part1 -> ../../sde1 lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0-part2 -> ../../sde2 lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0-part3 -> ../../sde3 lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0 -> ../../sdf lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0-part1 -> ../../sdf1 lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0-part2 -> ../../sdf2 lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0-part3 -> ../../sdf3 Fixes: 2908d778ab3e ("[SCSI] aic94xx: new driver") Reported-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Tested-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-15scsi: sd_zbc: Fix sd_zbc_report_zones() buffer allocationMasato Suzuki
The function sd_zbc_do_report_zones() issues a REPORT ZONES command with a buffer size calculated based on the number of zones requested by the caller. This value should however not exceed the capabilities of the hardware maximum command size, that is, should not exceed the max_hw_sectors limit of the device. This problem leads to failures of report zones commands when re-validating disks with some SAS HBAs. Fix this by limiting a report zone command buffer size to the minimum of the device max_hw_sectors and calculated value based on the requested number of zones. This does not change the semantic of the report_zones file operation as report zones can always return less zone reports than requested. Short reports are handled using a loop execution of the report_zones file operation in the function blk_report_zones(). [Damien] Before patch 'e76239a3748c ("block: add a report_zones method")', report zones buffer allocation was limited to max_sectors when allocated in blk_report_zones(). This however does not consider the actual format of the device reply which is interface dependent. Limiting the allocation based on the size of the expected reply format rather than the size of the array of generic sturct blkzone passed by blk_report_zones() makes more sense. Fixes: e76239a3748c ("block: add a report_zones method") Cc: stable@vger.kernel.org Signed-off-by: Masato Suzuki <masato.suzuki@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-15scsi: libiscsi: Fix race between iscsi_xmit_task and iscsi_complete_taskAnoob Soman
When a target sends Check Condition, whilst initiator is busy xmiting re-queued data, could lead to race between iscsi_complete_task() and iscsi_xmit_task() and eventually crashing with the following kernel backtrace. [3326150.987523] ALERT: BUG: unable to handle kernel NULL pointer dereference at 0000000000000078 [3326150.987549] ALERT: IP: [<ffffffffa05ce70d>] iscsi_xmit_task+0x2d/0xc0 [libiscsi] [3326150.987571] WARN: PGD 569c8067 PUD 569c9067 PMD 0 [3326150.987582] WARN: Oops: 0002 [#1] SMP [3326150.987593] WARN: Modules linked in: tun nfsv3 nfs fscache dm_round_robin [3326150.987762] WARN: CPU: 2 PID: 8399 Comm: kworker/u32:1 Tainted: G O 4.4.0+2 #1 [3326150.987774] WARN: Hardware name: Dell Inc. PowerEdge R720/0W7JN5, BIOS 2.5.4 01/22/2016 [3326150.987790] WARN: Workqueue: iscsi_q_13 iscsi_xmitworker [libiscsi] [3326150.987799] WARN: task: ffff8801d50f3800 ti: ffff8801f5458000 task.ti: ffff8801f5458000 [3326150.987810] WARN: RIP: e030:[<ffffffffa05ce70d>] [<ffffffffa05ce70d>] iscsi_xmit_task+0x2d/0xc0 [libiscsi] [3326150.987825] WARN: RSP: e02b:ffff8801f545bdb0 EFLAGS: 00010246 [3326150.987831] WARN: RAX: 00000000ffffffc3 RBX: ffff880282d2ab20 RCX: ffff88026b6ac480 [3326150.987842] WARN: RDX: 0000000000000000 RSI: 00000000fffffe01 RDI: ffff880282d2ab20 [3326150.987852] WARN: RBP: ffff8801f545bdc8 R08: 0000000000000000 R09: 0000000000000008 [3326150.987862] WARN: R10: 0000000000000000 R11: 000000000000fe88 R12: 0000000000000000 [3326150.987872] WARN: R13: ffff880282d2abe8 R14: ffff880282d2abd8 R15: ffff880282d2ac08 [3326150.987890] WARN: FS: 00007f5a866b4840(0000) GS:ffff88028a640000(0000) knlGS:0000000000000000 [3326150.987900] WARN: CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 [3326150.987907] WARN: CR2: 0000000000000078 CR3: 0000000070244000 CR4: 0000000000042660 [3326150.987918] WARN: Stack: [3326150.987924] WARN: ffff880282d2ad58 ffff880282d2ab20 ffff880282d2abe8 ffff8801f545be18 [3326150.987938] WARN: ffffffffa05cea90 ffff880282d2abf8 ffff88026b59cc80 ffff88026b59cc00 [3326150.987951] WARN: ffff88022acf32c0 ffff880289491800 ffff880255a80800 0000000000000400 [3326150.987964] WARN: Call Trace: [3326150.987975] WARN: [<ffffffffa05cea90>] iscsi_xmitworker+0x2f0/0x360 [libiscsi] [3326150.987988] WARN: [<ffffffff8108862c>] process_one_work+0x1fc/0x3b0 [3326150.987997] WARN: [<ffffffff81088f95>] worker_thread+0x2a5/0x470 [3326150.988006] WARN: [<ffffffff8159cad8>] ? __schedule+0x648/0x870 [3326150.988015] WARN: [<ffffffff81088cf0>] ? rescuer_thread+0x300/0x300 [3326150.988023] WARN: [<ffffffff8108ddf5>] kthread+0xd5/0xe0 [3326150.988031] WARN: [<ffffffff8108dd20>] ? kthread_stop+0x110/0x110 [3326150.988040] WARN: [<ffffffff815a0bcf>] ret_from_fork+0x3f/0x70 [3326150.988048] WARN: [<ffffffff8108dd20>] ? kthread_stop+0x110/0x110 [3326150.988127] ALERT: RIP [<ffffffffa05ce70d>] iscsi_xmit_task+0x2d/0xc0 [libiscsi] [3326150.988138] WARN: RSP <ffff8801f545bdb0> [3326150.988144] WARN: CR2: 0000000000000078 [3326151.020366] WARN: ---[ end trace 1c60974d4678d81b ]--- Commit 6f8830f5bbab ("scsi: libiscsi: add lock around task lists to fix list corruption regression") introduced "taskqueuelock" to fix list corruption during the race, but this wasn't enough. Re-setting of conn->task to NULL, could race with iscsi_xmit_task(). iscsi_complete_task() { .... if (conn->task == task) conn->task = NULL; } conn->task in iscsi_xmit_task() could be NULL and so will be task. __iscsi_get_task(task) will crash (NullPtr de-ref), trying to access refcount. iscsi_xmit_task() { struct iscsi_task *task = conn->task; __iscsi_get_task(task); } This commit will take extra conn->session->back_lock in iscsi_xmit_task() to ensure iscsi_xmit_task() waits for iscsi_complete_task(), if iscsi_complete_task() wins the race. If iscsi_xmit_task() wins the race, iscsi_xmit_task() increments task->refcount (__iscsi_get_task) ensuring iscsi_complete_task() will not iscsi_free_task(). Signed-off-by: Anoob Soman <anoob.soman@citrix.com> Signed-off-by: Bob Liu <bob.liu@oracle.com> Acked-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-15net/mlx5: E-Switch, Allow transition to offloads mode for ECPFBodong Wang
Currently, the e-switch driver requires going to legacy mode before changing to the offloads mode. This makes sense for regular case as the legacy mode is done by creating VFs. However, it's problematic when ECPF is the eswitch manager. In such case, ECPF will control the vports on peer host including the peer PF and VFs. But ECPF doesn't need and shall not create VFs as the VFs are created in the peer PF host. Grant ECPF the ability to change from none to the offloads mode. Note that currently the only way to go back to none mode is by unloading the ECPF driver. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Load/unload VF reps according to event from host PFBodong Wang
When host PF changes the number of VFs, the ECPF esw driver will get a FW event. It should query the number of VFs enabled by host PF and update the VF reps accordingly. Note that host PF can't change the number of VFs dynamically, it has to reset the number of VFs to 0 before changing to a new positive number. The host event is registered when driver is moving to switchdev mode, and it's the last step to do in esw_offloads_init. It's unregistered and the work queue is flushed when driver quits from switchdev mode. In this way, the host event and devlink command are serialized. When driver is enabling switchdev mode, pay attention to the following two facts: 1. Host PF must not have VF initialized as the flow table in ECPF has ENCAP enabled as default. Such flow table can't be created with existing initialized VFs. 2. ECPF doesn't know how many VFs the host PF will enable, ECPF offloads flow steering shall create the flow table/groups based on the max number of VFs possibly supported by host PF. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Consider ECPF vport depends on eswitch ownershipBodong Wang
ECPF connects to the eswitch through vport 0xfffe. ECPF may or may not be the eswitch manager depending on firmware configuration. 1. If ECPF is eswitch manager: ECPF will take over the eswitch manager responsibility. A rep of the host PF shall be created at the ECPF side for the eswitch manager to control. 2. If ECPF is not eswitch manager: host PF will be the eswitch manager, ECPF acts similar as a VF to the host PF. Host PF will be aware of the ECPF vport presence and control it's rep. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Assign a different position for uplink rep and vportBodong Wang
In offloads mode, the current implementation puts the uplink representor at index zero of the vport reps array. It is not "natural" to place it at index 0 since we want to put the representor for vport 0 at index 0 with the introduction of SmartNIC. A separate patch will handle the case whether a rep is needed for vport 0 (PF vport). So, we want to have a different placeholder for uplink vport and representor. It was placed at the end of vport and rep array. Since vport number can no longer act as an index into the vport or representors arrays, use functions to map vport numbers to indices when accessing the vports or representors arrays, and vice versa. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Centralize repersentor reg/unreg to eswitch driverBodong Wang
Eswitch has two users: IB and ETH. They both register repersentors when mlx5 interface is added, and unregister the repersentors when mlx5 interface is removed. Ideally, each driver should only deal with the entities which are unique to itself. However, current IB and ETH drivers have to perform the following eswitch operations: 1. When registering, specify how many vports to register. This number is the same for both drivers which is the total available vport numbers. 2. When unregistering, specify the number of registered vports to do unregister. Also, unload the repersentors which are already loaded. It's unnecessary for eswitch driver to hands out the control of above operations to individual driver users, as they're not unique to each driver. Instead, such operations should be centralized to eswitch driver. This consolidates eswitch control flow, and simplified IB and ETH driver. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Support load/unload reps of specific vport typesBodong Wang
Currently the driver loads and unloads all reps in an unbreakable group. However, with ECPF, the reps of special vports such as uplink and host PF should always be loaded in switchdev mode where the reps for VFs will be loaded on-demand and unloaded on no-demand. This is a pre-step for that change. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Add state to eswitch vport representorsBodong Wang
Currently the eswitch vport reps have a valid indicator, which is set on register and unset on unregister. However, a rep can be loaded or not loaded when doing unregister, current driver checks if the vport of that rep is enabled as a flag to imply the rep is loaded. However, for ECPF, this is not valid as the host PF will enable the vports for its VFs instead. Add three states: {unregistered, registered, loaded}, with the following state changes across different operations: create: (none) -> unregistered reg: unregistered -> registered load: registered -> loaded unload: loaded -> registered unreg: registered -> unregistered Note that the state shall only be updated inside eswitch driver rather than individual drivers such as ETH or IB. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Suggested-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Use getter and iterator to access vport/repBodong Wang
With only PF and VF, it is sufficient to have the vport/rep array index as the vport number. This is because PF and VF vports numbers are consecutive serial numbers. In downstream patches with introducing of ECPF and UPLINK vports, it's not consecutive any more. Use getter to get specific vport/rep, and use iterator to traversal a list of vport/rep. This hides the translation between array index and vport number, and provides flexibility of using different translation mechanism in the future. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Suggested-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Split VF and special vports for offloads modeBodong Wang
When driver is entering offloads mode, there are two major tasks to do: initialize flow steering and create representors. Flow steering should make sure enough flow table/group spaces are reserved for all reps. Representors will be created in a group, all or none. With the introduction of ECPF, flow steering should still reserve the same spaces. But, the representors are not always loaded/unloaded in a single piece. Once ECPF is in offloads mode, it will get the number of VF changing event from host PF. In such scenario, only the VF reps should be loaded/unloaded, not the reps for special vports (such as the uplink vport). Thus, when entering offloads mode, driver should specify the total number of reps, and the number of VF reps separately. When leaving offloads mode, the cleanup should use the information self-contained in eswitch such as number of VFs. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Refactor offloads flow steering init/cleanupBodong Wang
E-switch offloads mode initialize/cleanup multiple steering related entities (flow table/group). Refactor these operations to internal helper functions for better block design. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Properly refer to host PF vport as other vportBodong Wang
Commands referring to vports use the following scheme: 1. When referring to my own vport, put 0 in vport and 0 in other_vport. 2. When referring to another vport, put the vport number of the referred vport and put 1 in other_vport. It was assumed that driver is accessing other vport when vport number is greater than 0. With the above scheme, the case that ECPF eswitch manager is trying to access host PF vport will fall over with scheme 1 as the vport number is 0. This is apparently wrong as driver is trying to refer other vport. As such usage can only happen in the eswitch context, change relevant functions to provide other vport input properly. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Properly refer to the esw manager vportBodong Wang
In SmartNIC mode, the eswitch manager is not necessarily the PF (vport 0). Use a helper function to get the correct eswitch manager vport number and cache on the eswitch instance for fast reference. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: Correctly set LAG mode for ECPFBodong Wang
When bonding is added, driver assumes that it's RoCE LAG if no VF is enabled. This is not enough for ECPF as the VF is enabled in host PF side. LAG should only choose RoCE mode when both slave devices meet conditions below: 1. E-Switch offloads mode is NONE. 2. No VF is enabled. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15Merge branch 'mlx5-next' of ↵Saeed Mahameed
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Merge mlx5-next shared branched into net-next, From Bodong Wang: 1) Introduction of ECPF (Embedded CPU Physical Function), and low level bits for mlx5 SmartNic capabilities support. 2) Vport enumeration refactoring that affect mlx5_ib and mlx5_core From Aya Levin, 3) Add support for 50Gbps per lane link modes in the Port Type and Speed register (PTYS) 4) Refactor low level query functions for PTYS register 5) Add support for 50Gbps per lane link modes to mlx5_ib Note: due to a change in API in mlx5/core and a later patch from net-next, a fixup was squashed with this merge commit that replaces FDB_UPLINK_VPORT with MLX5_VPORT_UPLINK which exists only in upstream net-next. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-16MIPS: eBPF: Remove REG_32BIT_ZERO_EXPaul Burton
REG_32BIT_ZERO_EX and REG_64BIT are always handled in exactly the same way, and reg_val_propagate_range() never actually sets any register to type REG_32BIT_ZERO_EX. Remove the redundant & unused REG_32BIT_ZERO_EX. Signed-off-by: Paul Burton <paul.burton@mips.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-16MIPS: eBPF: Always return sign extended 32b valuesPaul Burton
The function prototype used to call JITed eBPF code (ie. the type of the struct bpf_prog bpf_func field) returns an unsigned int. The MIPS n64 ABI that MIPS64 kernels target defines that 32 bit integers should always be sign extended when passed in registers as either arguments or return values. This means that when returning any value which may not already be sign extended (ie. of type REG_64BIT or REG_32BIT_ZERO_EX) we need to perform that sign extension in order to comply with the n64 ABI. Without this we see strange looking test failures from test_bpf.ko, such as: test_bpf: #65 ALU64_MOV_X: dst = 4294967295 jited:1 ret -1 != -1 FAIL (1 times) Although the return value printed matches the expected value, this is only because printf is only examining the least significant 32 bits of the 64 bit register value we returned. The register holding the expected value is sign extended whilst the v0 register was set to a zero extended value by our JITed code, so when compared by a conditional branch instruction the values are not equal. We already handle this when the return value register is of type REG_32BIT_ZERO_EX, so simply extend this to also cover REG_64BIT. Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: b6bd53f9c4e8 ("MIPS: Add missing file for eBPF JIT.") Cc: stable@vger.kernel.org # v4.13+ Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-16bpf: make LWTUNNEL_BPF dependent on INETPeter Oskolkov
Lightweight tunnels are L3 constructs that are used with IP/IP6. For example, lwtunnel_xmit is called from ip_output.c and ip6_output.c only. Make the dependency explicit at least for LWT-BPF, as now they call into IP routing. V2: added "Reported-by" below. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Peter Oskolkov <posk@google.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-15keys: Timestamp new keysDavid Howells
Set the timestamp on new keys rather than leaving it unset. Fixes: 31d5a79d7f3d ("KEYS: Do LRU discard in full keyrings") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.morris@microsoft.com>
2019-02-15keys: Fix dependency loop between construction record and auth keyDavid Howells
In the request_key() upcall mechanism there's a dependency loop by which if a key type driver overrides the ->request_key hook and the userspace side manages to lose the authorisation key, the auth key and the internal construction record (struct key_construction) can keep each other pinned. Fix this by the following changes: (1) Killing off the construction record and using the auth key instead. (2) Including the operation name in the auth key payload and making the payload available outside of security/keys/. (3) The ->request_key hook is given the authkey instead of the cons record and operation name. Changes (2) and (3) allow the auth key to naturally be cleaned up if the keyring it is in is destroyed or cleared or the auth key is unlinked. Fixes: 7ee02a316600 ("keys: Fix dependency loop between construction record and auth key") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.morris@microsoft.com>
2019-02-15assoc_array: Fix shortcut creationDavid Howells
Fix the creation of shortcuts for which the length of the index key value is an exact multiple of the machine word size. The problem is that the code that blanks off the unused bits of the shortcut value malfunctions if the number of bits in the last word equals machine word size. This is due to the "<<" operator being given a shift of zero in this case, and so the mask that should be all zeros is all ones instead. This causes the subsequent masking operation to clear everything rather than clearing nothing. Ordinarily, the presence of the hash at the beginning of the tree index key makes the issue very hard to test for, but in this case, it was encountered due to a development mistake that caused the hash output to be either 0 (keyring) or 1 (non-keyring) only. This made it susceptible to the keyctl/unlink/valid test in the keyutils package. The fix is simply to skip the blanking if the shift would be 0. For example, an index key that is 64 bits long would produce a 0 shift and thus a 'blank' of all 1s. This would then be inverted and AND'd onto the index_key, incorrectly clearing the entire last word. Fixes: 3cb989501c26 ("Add a generic associative array implementation.") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.morris@microsoft.com>
2019-02-15KEYS: allow reaching the keys quotas exactlyEric Biggers
If the sysctl 'kernel.keys.maxkeys' is set to some number n, then actually users can only add up to 'n - 1' keys. Likewise for 'kernel.keys.maxbytes' and the root_* versions of these sysctls. But these sysctls are apparently supposed to be *maximums*, as per their names and all documentation I could find -- the keyrings(7) man page, Documentation/security/keys/core.rst, and all the mentions of EDQUOT meaning that the key quota was *exceeded* (as opposed to reached). Thus, fix the code to allow reaching the quotas exactly. Fixes: 0b77f5bfb45c ("keys: make the keyring quotas controllable through /proc/sys") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.morris@microsoft.com>
2019-02-15Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two fairly small fixes: the qla one is a panic inducing use after free and the entropy fix may seem minor but it has had huge userspace impact thanks to an unrelated change in openssl that causes sshd to refuse logins until it has enough entropy for the session keys, which causes tens of minutes delay before the affected systems allow logins after reboot" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: qla2xxx: Fix panic from use after free in qla2x00_async_tm_cmd scsi: sd: fix entropy gathering for most rotational disks
2019-02-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
The netfilter conflicts were rather simple overlapping changes. However, the cls_tcindex.c stuff was a bit more complex. On the 'net' side, Cong is fixing several races and memory leaks. Whilst on the 'net-next' side we have Vlad adding the rtnl-ness support. What I've decided to do, in order to resolve this, is revert the conversion over to using a workqueue that Cong did, bringing us back to pure RCU. I did it this way because I believe that either Cong's races don't apply with have Vlad did things, or Cong will have to implement the race fix slightly differently. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15sunrpc: fix 4 more call sites that were using stack memory with a scatterlistScott Mayhew
While trying to reproduce a reported kernel panic on arm64, I discovered that AUTH_GSS basically doesn't work at all with older enctypes on arm64 systems with CONFIG_VMAP_STACK enabled. It turns out there still a few places using stack memory with scatterlists, causing krb5_encrypt() and krb5_decrypt() to produce incorrect results (or a BUG if CONFIG_DEBUG_SG is enabled). Tested with cthon on v4.0/v4.1/v4.2 with krb5/krb5i/krb5p using des3-cbc-sha1 and arcfour-hmac-md5. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-02-15Merge tag 'omap-for-v5.0/fixes-rc5' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes Fix omap4 and later lost cpu1 interrupts for periodic timer A fix from Russell that took a while to get applied into fixes as I thought Russell is merging this one. * tag 'omap-for-v5.0/fixes-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: OMAP2+: fix lack of timer interrupts on CPU1 after hotplug
2019-02-15include/linux/module.h: copy __init/__exit attrs to init/cleanup_moduleMiguel Ojeda
The upcoming GCC 9 release extends the -Wmissing-attributes warnings (enabled by -Wall) to C and aliases: it warns when particular function attributes are missing in the aliases but not in their target. In particular, it triggers for all the init/cleanup_module aliases in the kernel (defined by the module_init/exit macros), ending up being very noisy. These aliases point to the __init/__exit functions of a module, which are defined as __cold (among other attributes). However, the aliases themselves do not have the __cold attribute. Since the compiler behaves differently when compiling a __cold function as well as when compiling paths leading to calls to __cold functions, the warning is trying to point out the possibly-forgotten attribute in the alias. In order to keep the warning enabled, we decided to silence this case. Ideally, we would mark the aliases directly as __init/__exit. However, there are currently around 132 modules in the kernel which are missing __init/__exit in their init/cleanup functions (either because they are missing, or for other reasons, e.g. the functions being called from somewhere else); and a section mismatch is a hard error. A conservative alternative was to mark the aliases as __cold only. However, since we would like to eventually enforce __init/__exit to be always marked, we chose to use the new __copy function attribute (introduced by GCC 9 as well to deal with this). With it, we copy the attributes used by the target functions into the aliases. This way, functions that were not marked as __init/__exit won't have their aliases marked either, and therefore there won't be a section mismatch. Note that the warning would go away marking either the extern declaration, the definition, or both. However, we only mark the definition of the alias, since we do not want callers (which only see the declaration) to be compiled as if the function was __cold (and therefore the paths leading to those calls would be assumed to be unlikely). Link: https://lore.kernel.org/lkml/20190123173707.GA16603@gmail.com/ Link: https://lore.kernel.org/lkml/20190206175627.GA20399@gmail.com/ Suggested-by: Martin Sebor <msebor@gcc.gnu.org> Acked-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
2019-02-15Compiler Attributes: add support for __copy (gcc >= 9)Miguel Ojeda
From the GCC manual: copy copy(function) The copy attribute applies the set of attributes with which function has been declared to the declaration of the function to which the attribute is applied. The attribute is designed for libraries that define aliases or function resolvers that are expected to specify the same set of attributes as their targets. The copy attribute can be used with functions, variables, or types. However, the kind of symbol to which the attribute is applied (either function or variable) must match the kind of symbol to which the argument refers. The copy attribute copies only syntactic and semantic attributes but not attributes that affect a symbol’s linkage or visibility such as alias, visibility, or weak. The deprecated attribute is also not copied. https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html The upcoming GCC 9 release extends the -Wmissing-attributes warnings (enabled by -Wall) to C and aliases: it warns when particular function attributes are missing in the aliases but not in their target, e.g.: void __cold f(void) {} void __alias("f") g(void); diagnoses: warning: 'g' specifies less restrictive attribute than its target 'f': 'cold' [-Wmissing-attributes] Using __copy(f) we can copy the __cold attribute from f to g: void __cold f(void) {} void __copy(f) __alias("f") g(void); This attribute is most useful to deal with situations where an alias is declared but we don't know the exact attributes the target has. For instance, in the kernel, the widely used module_init/exit macros define the init/cleanup_module aliases, but those cannot be marked always as __init/__exit since some modules do not have their functions marked as such. Suggested-by: Martin Sebor <msebor@gcc.gnu.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
2019-02-15lib/crc32.c: mark crc32_le_base/__crc32c_le_base aliases as __pureMiguel Ojeda
The upcoming GCC 9 release extends the -Wmissing-attributes warnings (enabled by -Wall) to C and aliases: it warns when particular function attributes are missing in the aliases but not in their target. In particular, it triggers here because crc32_le_base/__crc32c_le_base aren't __pure while their target crc32_le/__crc32c_le are. These aliases are used by architectures as a fallback in accelerated versions of CRC32. See commit 9784d82db3eb ("lib/crc32: make core crc32() routines weak so they can be overridden"). Therefore, being fallbacks, it is likely that even if the aliases were called from C, there wouldn't be any optimizations possible. Currently, the only user is arm64, which calls this from asm. Still, marking the aliases as __pure makes sense and is a good idea for documentation purposes and possible future optimizations, which also silences the warning. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
2019-02-15auxdisplay: ht16k33: fix potential user-after-free on module unloadMiguel Ojeda
On module unload/remove, we need to ensure that work does not run after we have freed resources. Concretely, cancel_delayed_work() may return while the callback function is still running. From kernel/workqueue.c: The work callback function may still be running on return, unless it returns true and the work doesn't re-arm itself. Explicitly flush or use cancel_delayed_work_sync() to wait on it. Link: https://lore.kernel.org/lkml/20190204220952.30761-1-TheSven73@googlemail.com/ Reported-by: Sven Van Asbroeck <thesven73@gmail.com> Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Reviewed-by: Sven Van Asbroeck <TheSven73@gmail.com> Acked-by: Robin van der Gracht <robin@protonic.nl> Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
2019-02-15tracing: Fix number of entries in trace headerQuentin Perret
The following commit 441dae8f2f29 ("tracing: Add support for display of tgid in trace output") removed the call to print_event_info() from print_func_help_header_irq() which results in the ftrace header not reporting the number of entries written in the buffer. As this wasn't the original intent of the patch, re-introduce the call to print_event_info() to restore the orginal behaviour. Link: http://lkml.kernel.org/r/20190214152950.4179-1-quentin.perret@arm.com Acked-by: Joel Fernandes <joelaf@google.com> Cc: stable@vger.kernel.org Fixes: 441dae8f2f29 ("tracing: Add support for display of tgid in trace output") Signed-off-by: Quentin Perret <quentin.perret@arm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-02-15kprobe: Do not use uaccess functions to access kernel memory that can faultChangbin Du
The userspace can ask kprobe to intercept strings at any memory address, including invalid kernel address. In this case, fetch_store_strlen() would crash since it uses general usercopy function, and user access functions are no longer allowed to access kernel memory. For example, we can crash the kernel by doing something as below: $ sudo kprobe 'p:do_sys_open +0(+0(%si)):string' [ 103.620391] BUG: GPF in non-whitelisted uaccess (non-canonical address?) [ 103.622104] general protection fault: 0000 [#1] SMP PTI [ 103.623424] CPU: 10 PID: 1046 Comm: cat Not tainted 5.0.0-rc3-00130-gd73aba1-dirty #96 [ 103.625321] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-2-g628b2e6-dirty-20190104_103505-linux 04/01/2014 [ 103.628284] RIP: 0010:process_fetch_insn+0x1ab/0x4b0 [ 103.629518] Code: 10 83 80 28 2e 00 00 01 31 d2 31 ff 48 8b 74 24 28 eb 0c 81 fa ff 0f 00 00 7f 1c 85 c0 75 18 66 66 90 0f ae e8 48 63 ca 89 f8 <8a> 0c 31 66 66 90 83 c2 01 84 c9 75 dc 89 54 24 34 89 44 24 28 48 [ 103.634032] RSP: 0018:ffff88845eb37ce0 EFLAGS: 00010246 [ 103.635312] RAX: 0000000000000000 RBX: ffff888456c4e5a8 RCX: 0000000000000000 [ 103.637057] RDX: 0000000000000000 RSI: 2e646c2f6374652f RDI: 0000000000000000 [ 103.638795] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 103.640556] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 [ 103.642297] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 103.644040] FS: 0000000000000000(0000) GS:ffff88846f000000(0000) knlGS:0000000000000000 [ 103.646019] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 103.647436] CR2: 00007ffc79758038 CR3: 0000000463360006 CR4: 0000000000020ee0 [ 103.649147] Call Trace: [ 103.649781] ? sched_clock_cpu+0xc/0xa0 [ 103.650747] ? do_sys_open+0x5/0x220 [ 103.651635] kprobe_trace_func+0x303/0x380 [ 103.652645] ? do_sys_open+0x5/0x220 [ 103.653528] kprobe_dispatcher+0x45/0x50 [ 103.654682] ? do_sys_open+0x1/0x220 [ 103.655875] kprobe_ftrace_handler+0x90/0xf0 [ 103.657282] ftrace_ops_assist_func+0x54/0xf0 [ 103.658564] ? __call_rcu+0x1dc/0x280 [ 103.659482] 0xffffffffc00000bf [ 103.660384] ? __ia32_sys_open+0x20/0x20 [ 103.661682] ? do_sys_open+0x1/0x220 [ 103.662863] do_sys_open+0x5/0x220 [ 103.663988] do_syscall_64+0x60/0x210 [ 103.665201] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 103.666862] RIP: 0033:0x7fc22fadccdd [ 103.668034] Code: 48 89 54 24 e0 41 83 e2 40 75 32 89 f0 25 00 00 41 00 3d 00 00 41 00 74 24 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 77 33 f3 c3 66 0f 1f 84 00 00 00 00 00 48 8d 44 [ 103.674029] RSP: 002b:00007ffc7972c3a8 EFLAGS: 00000287 ORIG_RAX: 0000000000000101 [ 103.676512] RAX: ffffffffffffffda RBX: 0000562f86147a21 RCX: 00007fc22fadccdd [ 103.678853] RDX: 0000000000080000 RSI: 00007fc22fae1428 RDI: 00000000ffffff9c [ 103.681151] RBP: ffffffffffffffff R08: 0000000000000000 R09: 0000000000000000 [ 103.683489] R10: 0000000000000000 R11: 0000000000000287 R12: 00007fc22fce90a8 [ 103.685774] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000 [ 103.688056] Modules linked in: [ 103.689131] ---[ end trace 43792035c28984a1 ]--- This can be fixed by using probe_mem_read() instead, as it can handle faulting kernel memory addresses, which kprobes can legitimately do. Link: http://lkml.kernel.org/r/20190125151051.7381-1-changbin.du@gmail.com Cc: stable@vger.kernel.org Fixes: 9da3f2b7405 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses") Signed-off-by: Changbin Du <changbin.du@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-02-15Merge tag 'for-linus-20190215' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - Ensure we insert into the hctx dispatch list, if a request is marked as DONTPREP (Jianchao) - NVMe pull request, single missing unlock on error fix (Keith) - MD pull request, single fix for a potentially data corrupting issue (Nate) - Floppy check_events regression fix (Yufen) * tag 'for-linus-20190215' of git://git.kernel.dk/linux-block: md/raid1: don't clear bitmap bits on interrupted recovery. floppy: check_events callback should not return a negative number nvme-pci: add missing unlock for reset error blk-mq: insert rq with DONTPREP to hctx dispatch list when requeue
2019-02-15Merge tag 'for-5.0/dm-fixes-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Fix bug in DM crypt's sizing of its block integrity tag space, resulting in less memory use when DM crypt layers on DM integrity. - Fix a long-standing DM thinp crash consistency bug that was due to improper handling of FUA. This issue is specific to writes that fill an entire thinp block which needs to be allocated. * tag 'for-5.0/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm thin: fix bug where bio that overwrites thin block ignores FUA dm crypt: don't overallocate the integrity tag space
2019-02-15Merge tag 'mmc-v5.0-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: "A couple of MMC fixes intended for v5.0-rc7. MMC core: - Fix deadlock bug for block I/O requests MMC host: - sunxi: Disable broken HS-DDR mode for H5 by default - sunxi: Avoid unsupported speed modes declared via DT - meson-gx: Restore interrupt name" * tag 'mmc-v5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: meson-gx: fix interrupt name mmc: block: handle complete_work on separate workqueue mmc: sunxi: Filter out unsupported modes declared in the device tree mmc: sunxi: Disable HS-DDR mode for H5 eMMC controller by default
2019-02-15iw_cxgb4: cq/qp mask depends on bar2 pages in a host pageRaju Rangoju
Adjust the cq/qp mask based on the number of bar2 pages in a host page. For user-mode rdma, the granularity of the BAR2 memory mapped to a user rdma process during queue allocation must be based on the host page size. The lld attributes udb_density and ucq_density are used to figure out how many sge contexts are in a bar2 page. So the rdev->qpmask and rdev->cqmask in iw_cxgb4 need to now be adjusted based on how many sge bar2 pages are in a host page. Otherwise the device fails to work on non 4k page size systems. Fixes: 2391b0030e24 ("cxgb4: Remove SGE_HOST_PAGE_SIZE dependency on page size") Signed-off-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-15cxgb4: Export sge_host_page_size to uldsRaju Rangoju
Export the sge_host_page_size field to ULDs via cxgb4_lld_info, so that iw_cxgb4 can make use of this in calculating the correct qp/cq mask. Fixes: 2391b0030e24 ("cxgb4: Remove SGE_HOST_PAGE_SIZE dependency on page size") Signed-off-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>