summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-02-19net/tls: Fix to avoid gettig invalid tls recordRohit Maheshwari
Current code doesn't check if tcp sequence number is starting from (/after) 1st record's start sequnce number. It only checks if seq number is before 1st record's end sequnce number. This problem will always be a possibility in re-transmit case. If a record which belongs to a requested seq number is already deleted, tls_get_record will start looking into list and as per the check it will look if seq number is before the end seq of 1st record, which will always be true and will return 1st record always, it should in fact return NULL. As part of the fix, start looking each record only if the sequence number lies in the list else return NULL. There is one more check added, driver look for the start marker record to handle tcp packets which are before the tls offload start sequence number, hence return 1st record if the record is tls start marker and seq number is before the 1st record's starting sequence number. Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure") Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-19bpf: Fix a potential deadlock with bpf_map_do_batchYonghong Song
Commit 057996380a42 ("bpf: Add batch ops to all htab bpf map") added lookup_and_delete batch operation for hash table. The current implementation has bpf_lru_push_free() inside the bucket lock, which may cause a deadlock. syzbot reports: -> #2 (&htab->buckets[i].lock#2){....}: __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 htab_lru_map_delete_node+0xce/0x2f0 kernel/bpf/hashtab.c:593 __bpf_lru_list_shrink_inactive kernel/bpf/bpf_lru_list.c:220 [inline] __bpf_lru_list_shrink+0xf9/0x470 kernel/bpf/bpf_lru_list.c:266 bpf_lru_list_pop_free_to_local kernel/bpf/bpf_lru_list.c:340 [inline] bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:447 [inline] bpf_lru_pop_free+0x87c/0x1670 kernel/bpf/bpf_lru_list.c:499 prealloc_lru_pop+0x2c/0xa0 kernel/bpf/hashtab.c:132 __htab_lru_percpu_map_update_elem+0x67e/0xa90 kernel/bpf/hashtab.c:1069 bpf_percpu_hash_update+0x16e/0x210 kernel/bpf/hashtab.c:1585 bpf_map_update_value.isra.0+0x2d7/0x8e0 kernel/bpf/syscall.c:181 generic_map_update_batch+0x41f/0x610 kernel/bpf/syscall.c:1319 bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 __do_sys_bpf+0x9b7/0x41e0 kernel/bpf/syscall.c:3460 __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #0 (&loc_l->lock){....}: check_prev_add kernel/locking/lockdep.c:2475 [inline] check_prevs_add kernel/locking/lockdep.c:2580 [inline] validate_chain kernel/locking/lockdep.c:2970 [inline] __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3954 lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 bpf_common_lru_push_free kernel/bpf/bpf_lru_list.c:516 [inline] bpf_lru_push_free+0x250/0x5b0 kernel/bpf/bpf_lru_list.c:555 __htab_map_lookup_and_delete_batch+0x8d4/0x1540 kernel/bpf/hashtab.c:1374 htab_lru_map_lookup_and_delete_batch+0x34/0x40 kernel/bpf/hashtab.c:1491 bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 __do_sys_bpf+0x1f7d/0x41e0 kernel/bpf/syscall.c:3456 __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe Possible unsafe locking scenario: CPU0 CPU2 ---- ---- lock(&htab->buckets[i].lock#2); lock(&l->lock); lock(&htab->buckets[i].lock#2); lock(&loc_l->lock); *** DEADLOCK *** To fix the issue, for htab_lru_map_lookup_and_delete_batch() in CPU0, let us do bpf_lru_push_free() out of the htab bucket lock. This can avoid the above deadlock scenario. Fixes: 057996380a42 ("bpf: Add batch ops to all htab bpf map") Reported-by: syzbot+a38ff3d9356388f2fb83@syzkaller.appspotmail.com Reported-by: syzbot+122b5421d14e68f29cd1@syzkaller.appspotmail.com Suggested-by: Hillf Danton <hdanton@sina.com> Suggested-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: Brian Vazquez <brianvv@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200219234757.3544014-1-yhs@fb.com
2020-02-19bpf: Do not grab the bucket spinlock by default on htab batch opsBrian Vazquez
Grabbing the spinlock for every bucket even if it's empty, was causing significant perfomance cost when traversing htab maps that have only a few entries. This patch addresses the issue by checking first the bucket_cnt, if the bucket has some entries then we go and grab the spinlock and proceed with the batching. Tested with a htab of size 50K and different value of populated entries. Before: Benchmark Time(ns) CPU(ns) --------------------------------------------- BM_DumpHashMap/1 2759655 2752033 BM_DumpHashMap/10 2933722 2930825 BM_DumpHashMap/200 3171680 3170265 BM_DumpHashMap/500 3639607 3635511 BM_DumpHashMap/1000 4369008 4364981 BM_DumpHashMap/5k 11171919 11134028 BM_DumpHashMap/20k 69150080 69033496 BM_DumpHashMap/39k 190501036 190226162 After: Benchmark Time(ns) CPU(ns) --------------------------------------------- BM_DumpHashMap/1 202707 200109 BM_DumpHashMap/10 213441 210569 BM_DumpHashMap/200 478641 472350 BM_DumpHashMap/500 980061 967102 BM_DumpHashMap/1000 1863835 1839575 BM_DumpHashMap/5k 8961836 8902540 BM_DumpHashMap/20k 69761497 69322756 BM_DumpHashMap/39k 187437830 186551111 Fixes: 057996380a42 ("bpf: Add batch ops to all htab bpf map") Signed-off-by: Brian Vazquez <brianvv@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200218172552.215077-1-brianvv@google.com
2020-02-19ice: Wait for VF to be reset/ready before configurationBrett Creeley
The configuration/command below is failing when the VF in the xml file is already bound to the host iavf driver. pci_0000_af_0_0.xml: <interface type='hostdev' managed='yes'> <source> <address type='pci' domain='0x0000' bus='0xaf' slot='0x0' function='0x0'/> </source> <mac address='00:de:ad:00:11:01'/> </interface> > virsh attach-device domain_name pci_0000_af_0_0.xml error: Failed to attach device from pci_0000_af_0_0.xml error: Cannot set interface MAC/vlanid to 00:de:ad:00:11:01/0 for ifname ens1f1 vf 0: Device or resource busy This is failing because the VF has not been completely removed/reset after being unbound (via the virsh command above) from the host iavf driver and ice_set_vf_mac() checks if the VF is disabled before waiting for the reset to finish. Fix this by waiting for the VF remove/reset process to happen before checking if the VF is disabled. Also, since many functions for VF administration on the PF were more or less calling the same 3 functions (ice_wait_on_vf_reset(), ice_is_vf_disabled(), and ice_check_vf_init()) move these into the helper function ice_check_vf_ready_for_cfg(). Then call this function in any flow that attempts to configure/query a VF from the PF. Lastly, increase the maximum wait time in ice_wait_on_vf_reset() to 800ms, and modify/add the #define(s) that determine the wait time. This was done for robustness because in rare/stress cases VF removal can take a max of ~800ms and previously the wait was a max of ~300ms. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-19ice: Don't tell the OS that link is going downMichal Swiatkowski
Remove code that tell the OS that link is going down when user change flow control via ethtool. When link is up it isn't certain that link goes down after 0x0605 aq command. If link doesn't go down, OS thinks that link is down, but physical link is up. To reset this state user have to take interface down and up. If link goes down after 0x0605 command, FW send information about that and after that driver tells the OS that the link goes down. So this code in ethtool is unnecessary. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-19ice: Don't reject odd values of usecs set by userBrett Creeley
Currently if a user sets an odd [tx|rx]-usecs value through ethtool, the request is denied because the hardware is set to have an ITR granularity of 2us. This caused poor customer experience. Fix this by aligning to a register allowed value, which results in rounding down. Also, print a once per ring container type message to be clear about our intentions. Also, change the ITR_TO_REG define to be the bitwise and of the ITR setting and the ICE_ITR_MASK. This makes the purpose of ITR_TO_REG more obvious. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-19bridge: br_stp: Use built-in RCU list checkingMadhuparna Bhowmik
list_for_each_entry_rcu() has built-in RCU and lock checking. Pass cond argument to list_for_each_entry_rcu() to silence false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled by default. Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-19nfc: pn544: Fix occasional HW initialization failureDmitry Osipenko
The PN544 driver checks the "enable" polarity during of driver's probe and it's doing that by turning ON and OFF NFC with different polarities until enabling succeeds. It takes some time for the hardware to power-down, and thus, to deassert the IRQ that is raised by turning ON the hardware. Since the delay after last power-down of the polarity-checking process is missed in the code, the interrupt may trigger immediately after installing the IRQ handler (right after the checking is done), which results in IRQ handler trying to touch the disabled HW and ends with marking NFC as 'DEAD' during of the driver's probe: pn544_hci_i2c 1-002a: NFC: nfc_en polarity : active high pn544_hci_i2c 1-002a: NFC: invalid len byte shdlc: llc_shdlc_recv_frame: NULL Frame -> link is dead This patch fixes the occasional NFC initialization failure on Nexus 7 device. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-19net: hsr: Pass lockdep expression to RCU listsAmol Grover
node_db is traversed using list_for_each_entry_rcu outside an RCU read-side critical section but under the protection of hsr->list_lock. Hence, add corresponding lockdep expression to silence false-positive warnings, and harden RCU lists. Signed-off-by: Amol Grover <frextrite@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-19Merge tag 'mlx5-fixes-2020-02-18' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2020-02-18 This series introduces some fixes to mlx5 driver. Please pull and let me know if there is any problem. For -stable v5.3 ('net/mlx5: Fix sleep while atomic in mlx5_eswitch_get_vepa') For -stable v5.4 ('net/mlx5: DR, Fix matching on vport gvmi') ('net/mlx5e: Fix crash in recovery flow without devlink reporter') For -stable v5.5 ('net/mlx5e: Reset RQ doorbell counter before moving RQ state from RST to RDY') ('net/mlx5e: Don't clear the whole vf config when switching modes') ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-19Merge tag 'iommu-fixes-v5.6-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - Compile warning fix for the Intel IOMMU driver - Fix kdump boot with Intel IOMMU enabled and in passthrough mode - Disable AMD IOMMU on a Laptop/Embedded platform because the delay it introduces in DMA transactions causes screen flickering there with 4k monitors - Make domain_free function in QCOM IOMMU driver robust and not leak memory/dereference NULL pointers - Fix ARM-SMMU module parameter prefix names * tag 'iommu-fixes-v5.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/arm-smmu: Restore naming of driver parameter prefix iommu/qcom: Fix bogus detach logic iommu/amd: Disable IOMMU on Stoney Ridge systems iommu/vt-d: Simplify check in identity_mapping() iommu/vt-d: Remove deferred_attach_domain() iommu/vt-d: Do deferred attachment in iommu_need_mapping() iommu/vt-d: Move deferred device attachment into helper function iommu/vt-d: Add attach_deferred() helper iommu/vt-d: Fix compile warning from intel-svm.h
2020-02-19Merge tag 'sound-5.6-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "The only largish change in this pull request is about the revert of the recent max98090 and its relevant patches due to regressions. Other than that, all small fixes for ALSA core (covering KCSAN fuzzer warnings in ALSA sequencer and rawmidi), Intel SOF HD-audio fixes, AMD ACP fixes, usual HD-audio quirks, and various ASoC fixes" * tag 'sound-5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda: Use scnprintf() for printing texts for sysfs/procfs ALSA: hda/realtek - Apply quirk for yet another MSI laptop ASoC: sun8i-codec: Fix setting DAI data format ALSA: hda/realtek - Apply quirk for MSI GP63, too ASoC: amd: ACP needs to be powered off in BIOS. ASoC: hdmi-codec: set plugged_cb to NULL when component removing ASoC: dapm: remove snd_soc_dapm_put_enum_double_locked ASoC: max98090: revert invalid fix for handling SHDN ALSA: rawmidi: Avoid bit fields for state flags ALSA: seq: Fix concurrent access to queue current tick/time ALSA: seq: Avoid concurrent access to queue flags ASoC: codec2codec: avoid invalid/double-free of pcm runtime ASoC: amd: Buffer Size instead of MAX Buffer ASoC: SOF: Intel: hda: move i915 init earlier ASoC: SOF: Intel: hda: fix ordering bug in resume flow ALSA: hda: do not override bus codec_mask in link_get() ASoC: atmel: fix atmel_ssc_set_audio link failure ASoC: fsl_sai: Fix exiting path on probing failure
2020-02-19drm/amdgpu/display: clean up hdcp workqueue handlingAlex Deucher
Use the existence of the workqueue itself to determine when to enable HDCP features rather than sprinkling asic checks all over the code. Also add a check for the existence of the hdcp workqueue in the irq handling on the off chance we get and HPD RX interrupt with the CP bit set. This avoids a crash if the driver doesn't support HDCP for a particular asic. Fixes: 96a3b32e67236f ("drm/amd/display: only enable HDCP for DCN+") Bug: https://bugzilla.kernel.org/show_bug.cgi?id=206519 Reviewed-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-19drm/amdgpu: add is_raven_kicker judgement for raven1changzhu
The rlc version of raven_kicer_rlc is different from the legacy rlc version of raven_rlc. So it needs to add a judgement function for raven_kicer_rlc and avoid disable GFXOFF when loading raven_kicer_rlc. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-19iommu/arm-smmu: Restore naming of driver parameter prefixWill Deacon
Extending the Arm SMMU driver to allow for modular builds changed KBUILD_MODNAME to be "arm_smmu_mod" so that a single module could be built from the multiple existing object files without the need to rename any source files. This inadvertently changed the name of the driver parameters, which may lead to runtime issues if bootloaders are relying on the old names for correctness (e.g. "arm-smmu.disable_bypass=0"). Although MODULE_PARAM_PREFIX can be overridden to restore the old naming for builtin parameters, only the new name is matched by modprobe and so loading the driver as a module would cause parameters specified on the kernel command line to be ignored. Instead, rename "arm_smmu_mod" to "arm_smmu". Whilst it's a bit of a bodge, this allows us to create a single module without renaming any files and makes use of the fact that underscores and hyphens can be used interchangeably in parameter names. Cc: Robin Murphy <robin.murphy@arm.com> Cc: Russell King <linux@armlinux.org.uk> Reported-by: Li Yang <leoyang.li@nxp.com> Fixes: cd221bd24ff5 ("iommu/arm-smmu: Allow building as a module") Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-02-19iommu/qcom: Fix bogus detach logicRobin Murphy
Currently, the implementation of qcom_iommu_domain_free() is guaranteed to do one of two things: WARN() and leak everything, or dereference NULL and crash. That alone is terrible, but in fact the whole idea of trying to track the liveness of a domain via the qcom_domain->iommu pointer as a sanity check is full of fundamentally flawed assumptions. Make things robust and actually functional by not trying to be quite so clever. Reported-by: Brian Masney <masneyb@onstation.org> Tested-by: Brian Masney <masneyb@onstation.org> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Fixes: 0ae349a0f33f ("iommu/qcom: Add qcom_iommu") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Stephan Gerhold <stephan@gerhold.net> Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-02-19iommu/amd: Disable IOMMU on Stoney Ridge systemsKai-Heng Feng
Serious screen flickering when Stoney Ridge outputs to a 4K monitor. Use identity-mapping and PCI ATS doesn't help this issue. According to Alex Deucher, IOMMU isn't enabled on Windows, so let's do the same here to avoid screen flickering on 4K monitor. Cc: Alex Deucher <alexander.deucher@amd.com> Bug: https://gitlab.freedesktop.org/drm/amd/issues/961 Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-02-18net/mlx5: DR, Handle reformat capability over sw-steering tablesErez Shitrit
On flow table creation, send the relevant flags according to what the FW currently supports. When FW doesn't support reformat option over SW-steering managed table, the driver shouldn't pass this. Fixes: 988fd6b32d07 ("net/mlx5: DR, Pass table flags at creation to lower layer") Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5: Fix lowest FDB pool sizePaul Blakey
The pool sizes represent the pool sizes in the fw. when we request a pool size from fw, it will return the next possible group. We track how many pools the fw has left and start requesting groups from the big to the small. When we start request 4k group, which doesn't exists in fw, fw wants to allocate the next possible size, 64k, but will fail since its exhausted. The correct smallest pool size in fw is 128 and not 4k. Fixes: 39ac237ce009 ("net/mlx5: E-Switch, Refactor chains and priorities") Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5e: Don't clear the whole vf config when switching modesDmytro Linkin
There is no need to reset all vf config (except link state) between legacy and switchdev modes changes. Also, set link state to AUTO, when legacy enabled. Fixes: 3b83b6c2e024 ("net/mlx5e: Clear VF config when switching modes") Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5: DR, Fix matching on vport gvmiHamdan Igbaria
Set vport gvmi in the tag, only when source gvmi is set in the bit mask. Fixes: 26d688e3 ("net/mlx5: DR, Add Steering entry (STE) utilities") Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5e: Fix crash in recovery flow without devlink reporterAya Levin
When health reporters are not supported, recovery function is invoked directly, not via devlink health reporters. In this direct flow, the recover function input parameter was passed incorrectly and is causing a kernel oops. This patch is fixing the input parameter. Following call trace is observed on rx error health reporting. Internal error: Oops: 96000007 [#1] PREEMPT SMP Process kworker/u16:4 (pid: 4584, stack limit = 0x00000000c9e45703) Call trace: mlx5e_rx_reporter_err_rq_cqe_recover+0x30/0x164 [mlx5_core] mlx5e_health_report+0x60/0x6c [mlx5_core] mlx5e_reporter_rq_cqe_err+0x6c/0x90 [mlx5_core] mlx5e_rq_err_cqe_work+0x20/0x2c [mlx5_core] process_one_work+0x168/0x3d0 worker_thread+0x58/0x3d0 kthread+0x108/0x134 Fixes: c50de4af1d63 ("net/mlx5e: Generalize tx reporter's functionality") Signed-off-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5e: Reset RQ doorbell counter before moving RQ state from RST to RDYAya Levin
Initialize RQ doorbell counters to zero prior to moving an RQ from RST to RDY state. Per HW spec, when RQ is back to RDY state, the descriptor ID on the completion is reset. The doorbell record must comply. Fixes: 8276ea1353a4 ("net/mlx5e: Report and recover from CQE with error on RQ") Signed-off-by: Aya Levin <ayal@mellanox.com> Reported-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5: Fix sleep while atomic in mlx5_eswitch_get_vepaHuy Nguyen
rtnl_bridge_getlink is protected by rcu lock, so mlx5_eswitch_get_vepa cannot take mutex lock. Two possible issues can happen: 1. User at the same time change vepa mode via RTM_SETLINK command. 2. User at the same time change the switchdev mode via devlink netlink interface. Case 1 cannot happen because rtnl executes one message in order. Case 2 can happen but we do not expect user to change the switchdev mode when changing vepa. Even if a user does it, so he will read a value which is no longer valid. Fixes: 8da202b24913 ("net/mlx5: E-Switch, Add support for VEPA in legacy mode.") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net This batch contains Netfilter fixes for net: 1) Restrict hashlimit size to 1048576, from Cong Wang. 2) Check for offload flags from nf_flow_table_offload_setup(), this fixes a crash in case the hardware offload is disabled. From Florian Westphal. 3) Three preparation patches to extend the conntrack clash resolution, from Florian. 4) Extend clash resolution to deal with DNS packets from the same flow racing to set up the NAT configuration. 5) Small documentation fix in pipapo, from Stefano Brivio. 6) Remove misleading unlikely() from pipapo_refill(), also from Stefano. 7) Reduce hashlimit mutex scope, from Cong Wang. This patch is actually triggering another problem, still under discussion, another patch to fix this will follow up. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-18Merge tag 'dma-mapping-5.6' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping fixes from Christoph Hellwig: - give command line cma= precedence over the CONFIG_ option (Nicolas Saenz Julienne) - always allow 32-bit DMA, even for weirdly placed ZONE_DMA - improve the debug printks when memory is not addressable, to help find problems with swiotlb initialization * tag 'dma-mapping-5.6' of git://git.infradead.org/users/hch/dma-mapping: dma-direct: improve DMA mask overflow reporting dma-direct: improve swiotlb error reporting dma-direct: relax addressability checks in dma_direct_supported dma-contiguous: CMA: give precedence to cmdline
2020-02-18Merge tag 'tpmdd-next-20200217' of git://git.infradead.org/users/jjs/linux-tpmddLinus Torvalds
Pull tpm fixes from Jarkko Sakkinen: "Two bug fixes" * tag 'tpmdd-next-20200217' of git://git.infradead.org/users/jjs/linux-tpmdd: tpm: Initialize crypto_id of allocated_banks to HASH_ALGO__LAST tpm: Revert tpm_tis_spi_mod.ko to tpm_tis_spi.ko.
2020-02-18pipe: make sure to wake up everybody when the last reader/writer closesLinus Torvalds
Andrei Vagin reported that commit 0ddad21d3e99 ("pipe: use exclusive waits when reading or writing") broke one of the CRIU tests. He even has a trivial reproducer: #include <unistd.h> #include <sys/types.h> #include <sys/wait.h> int main() { int p[2]; pid_t p1, p2; int status; if (pipe(p) == -1) return 1; p1 = fork(); if (p1 == 0) { close(p[1]); read(p[0], &status, sizeof(status)); return 0; } p2 = fork(); if (p2 == 0) { close(p[1]); read(p[0], &status, sizeof(status)); return 0; } sleep(1); close(p[1]); wait(&status); wait(&status); return 0; } and the problem - once he points it out - is obvious. We use these nice exclusive waits, but when the last writer goes away, it then needs to wake up _every_ reader (and conversely, the last reader disappearing needs to wake every writer, of course). In fact, when going through this, we had several small oddities around how to wake things. We did in fact wake every reader when we changed the size of the pipe buffers. But that's entirely pointless, since that just acts as a possible source of new space - no new data to read. And when we change the size of the buffer, we don't need to wake all writers even when we add space - that case acts just as if somebody made space by reading, and any writer that finds itself not filling it up entirely will wake the next one. On the other hand, on the exit path, we tried to limit the wakeups with the proper poll keys etc, which is entirely pointless, because at that point we obviously need to wake up everybody. So don't do that: just wake up everybody - but only do that if the counts changed to zero. So fix those non-IO wakeups to be more proper: space change doesn't add any new data, but it might make room for writers, so it wakes up a writer. And the actual changes to reader/writer counts should wake up everybody, since everybody is affected (ie readers will all see EOF if the writers have gone away, and writers will all get EPIPE if all readers have gone away). Fixes: 0ddad21d3e99 ("pipe: use exclusive waits when reading or writing") Reported-and-tested-by: Andrei Vagin <avagin@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-18netfilter: nft_set_pipapo: Don't abuse unlikely() in pipapo_refill()Stefano Brivio
I originally used unlikely() in the if (match_only) clause, which we hit on the mapping table for the last field in a set, to ensure we avoid branching to the rest of for loop body, which is executed more frequently. However, Pablo reports, this is confusing as it gives the impression that this is not a common case, and it's actually not the intended usage of unlikely(). I couldn't observe any statistical difference in matching rates on x864_64 and aarch64 without it, so just drop it. Reported-by: Pablo Neira Ayuso <pablo@netfilter.org> Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-02-18netfilter: nft_set_pipapo: Fix mapping table example in commentsStefano Brivio
In both insertion and lookup examples, the two element pointers of rule mapping tables were swapped. Fix that. Reported-by: Pablo Neira Ayuso <pablo@netfilter.org> Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-02-18flow_table.c: Use built-in RCU list checkingMadhuparna Bhowmik
hlist_for_each_entry_rcu() has built-in RCU and lock checking. Pass cond argument to list_for_each_entry_rcu() to silence false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled by default. Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-18datapath.c: Use built-in RCU list checkingMadhuparna Bhowmik
hlist_for_each_entry_rcu() has built-in RCU and lock checking. Pass cond argument to list_for_each_entry_rcu() to silence false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled by default. Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-18vport.c: Use built-in RCU list checkingMadhuparna Bhowmik
hlist_for_each_entry_rcu() has built-in RCU and lock checking. Pass cond argument to list_for_each_entry_rcu() to silence false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled by default. Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-18meter.c: Use built-in RCU list checkingMadhuparna Bhowmik
hlist_for_each_entry_rcu() has built-in RCU and lock checking. Pass cond argument to list_for_each_entry_rcu() to silence false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled by default. Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-18netlabel_domainhash.c: Use built-in RCU list checkingMadhuparna Bhowmik
list_for_each_entry_rcu() has built-in RCU and lock checking. Pass cond argument to list_for_each_entry_rcu() to silence false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled by default. Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-18net: netlabel: Use built-in RCU list checkingMadhuparna Bhowmik
list_for_each_entry_rcu() has built-in RCU and lock checking. Pass cond argument to list_for_each_entry_rcu() to silence false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled by default. Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-18Revert "net: dev: introduce support for sch BYPASS for lockless qdisc"Paolo Abeni
This reverts commit ba27b4cdaaa66561aaedb2101876e563738d36fe Ahmed reported ouf-of-order issues bisected to commit ba27b4cdaaa6 ("net: dev: introduce support for sch BYPASS for lockless qdisc"). I can't find any working solution other than a plain revert. This will introduce some minor performance regressions for pfifo_fast qdisc. I plan to address them in net-next with more indirect call wrapper boilerplate for qdiscs. Reported-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Fixes: ba27b4cdaaa6 ("net: dev: introduce support for sch BYPASS for lockless qdisc") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-18net: phy: broadcom: Fix a typo ("firsly")Jonathan Neuschäfer
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-18mptcp: fix bogus socket flag valuesFlorian Westphal
Dan Carpenter reports static checker warnings due to bogus BIT() usage: net/mptcp/subflow.c:571 subflow_write_space() warn: test_bit() takes a bit number net/mptcp/subflow.c:694 subflow_state_change() warn: test_bit() takes a bit number net/mptcp/protocol.c:261 ssk_check_wmem() warn: test_bit() takes a bit number [..] This is harmless (we use bits 1 & 2 instead of 0 and 1), but would break eventually when adding BIT(5) (or 6, depends on size of 'long'). Just use 0 and 1, the values are only passed to test/set/clear_bit functions. Fixes: 648ef4b88673 ("mptcp: Implement MPTCP receive path") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-18qede: Fix race between rdma destroy workqueue and link change eventMichal Kalderon
If an event is added while the rdma workqueue is being destroyed it could lead to several races, list corruption, null pointer dereference during queue_work or init_queue. This fixes the race between the two flows which can occur during shutdown. A kref object and a completion object are added to the rdma_dev structure, these are initialized before the workqueue is created. The refcnt is used to indicate work is being added to the workqueue and ensures the cleanup flow won't start while we're in the middle of adding the event. Once the work is added, the refcnt is decreased and the cleanup flow is safe to run. Fixes: cee9fbd8e2e ("qede: Add qedr framework") Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-18Merge tag 'thunderbolt-fix-for-v5.6-rc3' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into usb-linus Mika writes: thunderbolt: Fix for v5.6-rc3 Single fix that orders the THUNDERBOLT MAINTAINERS record according to parse-maintainers.pl. * tag 'thunderbolt-fix-for-v5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt: MAINTAINERS: Sort entries in database for THUNDERBOLT
2020-02-18iommu/vt-d: Simplify check in identity_mapping()Joerg Roedel
The function only has one call-site and there it is never called with dummy or deferred devices. Simplify the check in the function to account for that. Fixes: 1ee0186b9a12 ("iommu/vt-d: Refactor find_domain() helper") Cc: stable@vger.kernel.org # v5.5 Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-02-18iommu/vt-d: Remove deferred_attach_domain()Joerg Roedel
The function is now only a wrapper around find_domain(). Remove the function and call find_domain() directly at the call-sites. Fixes: 1ee0186b9a12 ("iommu/vt-d: Refactor find_domain() helper") Cc: stable@vger.kernel.org # v5.5 Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-02-18iommu/vt-d: Do deferred attachment in iommu_need_mapping()Joerg Roedel
The attachment of deferred devices needs to happen before the check whether the device is identity mapped or not. Otherwise the check will return wrong results, cause warnings boot failures in kdump kernels, like WARNING: CPU: 0 PID: 318 at ../drivers/iommu/intel-iommu.c:592 domain_get_iommu+0x61/0x70 [...] Call Trace: __intel_map_single+0x55/0x190 intel_alloc_coherent+0xac/0x110 dmam_alloc_attrs+0x50/0xa0 ahci_port_start+0xfb/0x1f0 [libahci] ata_host_start.part.39+0x104/0x1e0 [libata] With the earlier check the kdump boot succeeds and a crashdump is written. Fixes: 1ee0186b9a12 ("iommu/vt-d: Refactor find_domain() helper") Cc: stable@vger.kernel.org # v5.5 Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-02-18iommu/vt-d: Move deferred device attachment into helper functionJoerg Roedel
Move the code that does the deferred device attachment into a separate helper function. Fixes: 1ee0186b9a12 ("iommu/vt-d: Refactor find_domain() helper") Cc: stable@vger.kernel.org # v5.5 Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-02-18iommu/vt-d: Add attach_deferred() helperJoerg Roedel
Implement a helper function to check whether a device's attach process is deferred. Fixes: 1ee0186b9a12 ("iommu/vt-d: Refactor find_domain() helper") Cc: stable@vger.kernel.org # v5.5 Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-02-18libbpf: Sanitise internal map names so they are not rejected by the kernelToke Høiland-Jørgensen
The kernel only accepts map names with alphanumeric characters, underscores and periods in their name. However, the auto-generated internal map names used by libbpf takes their prefix from the user-supplied BPF object name, which has no such restriction. This can lead to "Invalid argument" errors when trying to load a BPF program using global variables. Fix this by sanitising the map names, replacing any non-allowed characters with underscores. Fixes: d859900c4c56 ("bpf, libbpf: support global data/bss/rodata sections") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200217171701.215215-1-toke@redhat.com
2020-02-18bpf, uapi: Remove text about bpf_redirect_map() giving higher performanceToke Høiland-Jørgensen
The performance of bpf_redirect() is now roughly the same as that of bpf_redirect_map(). However, David Ahern pointed out that the header file has not been updated to reflect this, and still says that a significant performance increase is possible when using bpf_redirect_map(). Remove this text from the bpf_redirect_map() description, and reword the description in bpf_redirect() slightly. Also fix the 'Return' section of the bpf_redirect_map() documentation. Fixes: 1d233886dd90 ("xdp: Use bulking for non-map XDP_REDIRECT and consolidate code paths") Reported-by: David Ahern <dsahern@gmail.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200218130334.29889-1-toke@redhat.com
2020-02-18ima: add sm3 algorithm to hash algorithm configuration listTianjia Zhang
sm3 has been supported by the ima hash algorithm, but it is not yet in the Kconfig configuration list. After adding, both ima and tpm2 can support sm3 well. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-02-18crypto: rename sm3-256 to sm3 in hash_algo_nameTianjia Zhang
The name sm3-256 is defined in hash_algo_name in hash_info, but the algorithm name implemented in sm3_generic.c is sm3, which will cause the sm3-256 algorithm to be not found in some application scenarios of the hash algorithm, and an ENOENT error will occur. For example, IMA, keys, and other subsystems that reference hash_algo_name all use the hash algorithm of sm3. Fixes: 5ca4c20cfd37 ("keys, trusted: select hash algorithm for TPM2 chips") Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Reviewed-by: Pascal van Leeuwen <pvanleeuwen@rambus.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>