summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-06-22net/mlx5: E-switch, Support querying port function mac addressParav Pandit
Support querying mac address of the eswitch devlink port function. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22net/mlx5: Move helper to eswitch layerParav Pandit
To use port number to port index conversion at eswitch level, move it to eswitch header. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22net/mlx5: E-switch, Introduce and use eswitch support check helperParav Pandit
Introduce an helper routine to get esw from a devlink device and use it at eswitch callbacks and in subsequent patch. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22net/mlx5: Constify mac address pointerParav Pandit
Since none of the functions need to modify the input mac address, constify them. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22net/devlink: Support setting hardware address of port functionParav Pandit
PCI PF and VF devlink port can manage the function represented by a devlink port. Allow users to set port function's hardware address. Example of a PCI VF port which supports a port function: $ devlink port show pci/0000:06:00.0/2 pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1 function: hw_addr 00:00:00:00:00:00 $ devlink port function set pci/0000:06:00.0/2 hw_addr 00:11:22:33:44:55 $ devlink port show pci/0000:06:00.0/2 pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1 function: hw_addr 00:11:22:33:44:55 Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22net/devlink: Support querying hardware address of port functionParav Pandit
PCI PF and VF devlink port can manage the function represented by a devlink port. Enable users to query port function's hardware address. Example of a PCI VF port which supports a port function: $ devlink port show pci/0000:06:00.0/2 pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1 function: hw_addr 00:11:22:33:44:66 $ devlink port show pci/0000:06:00.0/2 -jp { "port": { "pci/0000:06:00.0/2": { "type": "eth", "netdev": "enp6s0pf0vf1", "flavour": "pcivf", "pfnum": 0, "vfnum": 1, "function": { "hw_addr": "00:11:22:33:44:66" } } } } Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22net/devlink: Prepare devlink port functions to fill extackParav Pandit
Prepare devlink port related functions to optionally fill up the extack information which will be used in subsequent patch by port function attribute(s). Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22KVM: nVMX: Plumb L2 GPA through to PML emulationSean Christopherson
Explicitly pass the L2 GPA to kvm_arch_write_log_dirty(), which for all intents and purposes is vmx_write_pml_buffer(), instead of having the latter pull the GPA from vmcs.GUEST_PHYSICAL_ADDRESS. If the dirty bit update is the result of KVM emulation (rare for L2), then the GPA in the VMCS may be stale and/or hold a completely unrelated GPA. Fixes: c5f983f6e8455 ("nVMX: Implement emulated Page Modification Logging") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200622215832.22090-2-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-23libbpf: Add a bunch of attribute getters/setters for map definitionsAndrii Nakryiko
Add a bunch of getter for various aspects of BPF map. Some of these attribute (e.g., key_size, value_size, type, etc) are available right now in struct bpf_map_def, but this patch adds getter allowing to fetch them individually. bpf_map_def approach isn't very scalable, when ABI stability requirements are taken into account. It's much easier to extend libbpf and add support for new features, when each aspect of BPF map has separate getter/setter. Getters follow the common naming convention of not explicitly having "get" in its name: bpf_map__type() returns map type, bpf_map__key_size() returns key_size. Setters, though, explicitly have set in their name: bpf_map__set_type(), bpf_map__set_key_size(). This patch ensures we now have a getter and a setter for the following map attributes: - type; - max_entries; - map_flags; - numa_node; - key_size; - value_size; - ifindex. bpf_map__resize() enforces unnecessary restriction of max_entries > 0. It is unnecessary, because libbpf actually supports zero max_entries for some cases (e.g., for PERF_EVENT_ARRAY map) and treats it specially during map creation time. To allow setting max_entries=0, new bpf_map__set_max_entries() setter is added. bpf_map__resize()'s behavior is preserved for backwards compatibility reasons. Map ifindex getter is added as well. There is a setter already, but no corresponding getter. Fix this assymetry as well. bpf_map__set_ifindex() itself is converted from void function into error-returning one, similar to other setters. The only error returned right now is -EBUSY, if BPF map is already loaded and has corresponding FD. One lacking attribute with no ability to get/set or even specify it declaratively is numa_node. This patch fixes this gap and both adds programmatic getter/setter, as well as adds support for numa_node field in BTF-defined map. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20200621062112.3006313-1-andriin@fb.com
2020-06-22selftests/bpf: Test access to bpf map pointerAndrey Ignatov
Add selftests to test access to map pointers from bpf program for all map types except struct_ops (that one would need additional work). verifier test focuses mostly on scenarios that must be rejected. prog_tests test focuses on accessing multiple fields both scalar and a nested struct from bpf program and verifies that those fields have expected values. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/139a6a17f8016491e39347849b951525335c6eb4.1592600985.git.rdna@fb.com
2020-06-22bpf: Set map_btf_{name, id} for all map typesAndrey Ignatov
Set map_btf_name and map_btf_id for all map types so that map fields can be accessed by bpf programs. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/a825f808f22af52b018dbe82f1c7d29dab5fc978.1592600985.git.rdna@fb.com
2020-06-22bpf: Support access to bpf map fieldsAndrey Ignatov
There are multiple use-cases when it's convenient to have access to bpf map fields, both `struct bpf_map` and map type specific struct-s such as `struct bpf_array`, `struct bpf_htab`, etc. For example while working with sock arrays it can be necessary to calculate the key based on map->max_entries (some_hash % max_entries). Currently this is solved by communicating max_entries via "out-of-band" channel, e.g. via additional map with known key to get info about target map. That works, but is not very convenient and error-prone while working with many maps. In other cases necessary data is dynamic (i.e. unknown at loading time) and it's impossible to get it at all. For example while working with a hash table it can be convenient to know how much capacity is already used (bpf_htab.count.counter for BPF_F_NO_PREALLOC case). At the same time kernel knows this info and can provide it to bpf program. Fill this gap by adding support to access bpf map fields from bpf program for both `struct bpf_map` and map type specific fields. Support is implemented via btf_struct_access() so that a user can define their own `struct bpf_map` or map type specific struct in their program with only necessary fields and preserve_access_index attribute, cast a map to this struct and use a field. For example: struct bpf_map { __u32 max_entries; } __attribute__((preserve_access_index)); struct bpf_array { struct bpf_map map; __u32 elem_size; } __attribute__((preserve_access_index)); struct { __uint(type, BPF_MAP_TYPE_ARRAY); __uint(max_entries, 4); __type(key, __u32); __type(value, __u32); } m_array SEC(".maps"); SEC("cgroup_skb/egress") int cg_skb(void *ctx) { struct bpf_array *array = (struct bpf_array *)&m_array; struct bpf_map *map = (struct bpf_map *)&m_array; /* .. use map->max_entries or array->map.max_entries .. */ } Similarly to other btf_struct_access() use-cases (e.g. struct tcp_sock in net/ipv4/bpf_tcp_ca.c) the patch allows access to any fields of corresponding struct. Only reading from map fields is supported. For btf_struct_access() to work there should be a way to know btf id of a struct that corresponds to a map type. To get btf id there should be a way to get a stringified name of map-specific struct, such as "bpf_array", "bpf_htab", etc for a map type. Two new fields are added to `struct bpf_map_ops` to handle it: * .map_btf_name keeps a btf name of a struct returned by map_alloc(); * .map_btf_id is used to cache btf id of that struct. To make btf ids calculation cheaper they're calculated once while preparing btf_vmlinux and cached same way as it's done for btf_id field of `struct bpf_func_proto` While calculating btf ids, struct names are NOT checked for collision. Collisions will be checked as a part of the work to prepare btf ids used in verifier in compile time that should land soon. The only known collision for `struct bpf_htab` (kernel/bpf/hashtab.c vs net/core/sock_map.c) was fixed earlier. Both new fields .map_btf_name and .map_btf_id must be set for a map type for the feature to work. If neither is set for a map type, verifier will return ENOTSUPP on a try to access map_ptr of corresponding type. If just one of them set, it's verifier misconfiguration. Only `struct bpf_array` for BPF_MAP_TYPE_ARRAY and `struct bpf_htab` for BPF_MAP_TYPE_HASH are supported by this patch. Other map types will be supported separately. The feature is available only for CONFIG_DEBUG_INFO_BTF=y and gated by perfmon_capable() so that unpriv programs won't have access to bpf map fields. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/6479686a0cd1e9067993df57b4c3eef0e276fec9.1592600985.git.rdna@fb.com
2020-06-22bpf: Rename bpf_htab to bpf_shtab in sock_mapAndrey Ignatov
There are two different `struct bpf_htab` in bpf code in the following files: - kernel/bpf/hashtab.c - net/core/sock_map.c It makes it impossible to find proper btf_id by name = "bpf_htab" and kind = BTF_KIND_STRUCT what is needed to support access to map ptr so that bpf program can access `struct bpf_htab` fields. To make it possible one of the struct-s should be renamed, sock_map.c looks like a better candidate for rename since it's specialized version of hashtab. Rename it to bpf_shtab ("sh" stands for Sock Hash). Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/c006a639e03c64ca50fc87c4bb627e0bfba90f4e.1592600985.git.rdna@fb.com
2020-06-22bpf: Switch btf_parse_vmlinux to btf_find_by_name_kindAndrey Ignatov
btf_parse_vmlinux() implements manual search for struct bpf_ctx_convert since at the time of implementing btf_find_by_name_kind() was not available. Later btf_find_by_name_kind() was introduced in 27ae7997a661 ("bpf: Introduce BPF_PROG_TYPE_STRUCT_OPS"). It provides similar search functionality and can be leveraged in btf_parse_vmlinux(). Do it. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/6e12d5c3e8a3d552925913ef73a695dd1bb27800.1592600985.git.rdna@fb.com
2020-06-22MAINTAINERS: update email address for Felix FietkauFelix Fietkau
The old address has been bouncing for a while now Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22IB/mad: Fix use after free when destroying MAD agentShay Drory
Currently, when RMPP MADs are processed while the MAD agent is destroyed, it could result in use after free of rmpp_recv, as decribed below: cpu-0 cpu-1 ----- ----- ib_mad_recv_done() ib_mad_complete_recv() ib_process_rmpp_recv_wc() unregister_mad_agent() ib_cancel_rmpp_recvs() cancel_delayed_work() process_rmpp_data() start_rmpp() queue_delayed_work(rmpp_recv->cleanup_work) destroy_rmpp_recv() free_rmpp_recv() cleanup_work()[1] spin_lock_irqsave(&rmpp_recv->agent->lock) <-- use after free [1] cleanup_work() == recv_cleanup_handler Fix it by waiting for the MAD agent reference count becoming zero before calling to ib_cancel_rmpp_recvs(). Fixes: 9a41e38a467c ("IB/mad: Use IDR for agent IDs") Link: https://lore.kernel.org/r/20200621104738.54850-2-leon@kernel.org Signed-off-by: Shay Drory <shayd@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-06-22RDMA/mlx5: Protect from kernel crash if XRC_TGT doesn't have udataLeon Romanovsky
Don't deref udata if it is NULL BUG: kernel NULL pointer dereference, address: 0000000000000030 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 SMP PTI CPU: 2 PID: 1592 Comm: python3 Not tainted 5.7.0-rc6+ #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 RIP: 0010:create_qp+0x39e/0xae0 [mlx5_ib] Code: c0 0d 00 00 bf 10 01 00 00 e8 be a9 e4 e0 48 85 c0 49 89 c2 0f 84 0c 07 00 00 41 8b 85 74 63 01 00 0f c8 a9 00 00 00 10 74 0a <41> 8b 46 30 0f c8 41 89 42 14 41 8b 52 18 41 0f b6 4a 1c 0f ca 89 RSP: 0018:ffffc9000067f8b0 EFLAGS: 00010206 RAX: 0000000010170000 RBX: ffff888441313000 RCX: 0000000000000000 RDX: 0000000000000200 RSI: 0000000000000000 RDI: ffff88845b1d4400 RBP: ffffc9000067fa60 R08: 0000000000000200 R09: ffff88845b1d4200 R10: ffff88845b1d4200 R11: ffff888441313000 R12: ffffc9000067f950 R13: ffff88846ac00140 R14: 0000000000000000 R15: ffff88846c2bc000 FS: 00007faa1a3c0540(0000) GS:ffff88846fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000030 CR3: 0000000446dca003 CR4: 0000000000760ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: ? __switch_to_asm+0x40/0x70 ? __switch_to_asm+0x34/0x70 mlx5_ib_create_qp+0x897/0xfa0 [mlx5_ib] ib_create_qp+0x9e/0x300 [ib_core] create_qp+0x92d/0xb20 [ib_uverbs] ? ib_uverbs_cq_event_handler+0x30/0x30 [ib_uverbs] ? release_resource+0x30/0x30 ib_uverbs_create_qp+0xc4/0xe0 [ib_uverbs] ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xc8/0xf0 [ib_uverbs] ib_uverbs_run_method+0x223/0x770 [ib_uverbs] ? track_pfn_remap+0xa7/0x100 ? uverbs_disassociate_api+0xd0/0xd0 [ib_uverbs] ? remap_pfn_range+0x358/0x490 ib_uverbs_cmd_verbs.isra.6+0x19b/0x370 [ib_uverbs] ? rdma_umap_priv_init+0x82/0xe0 [ib_core] ? vm_mmap_pgoff+0xec/0x120 ib_uverbs_ioctl+0xc0/0x120 [ib_uverbs] ksys_ioctl+0x92/0xb0 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x48/0x130 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: e383085c2425 ("RDMA/mlx5: Set ECE options during QP create") Link: https://lore.kernel.org/r/20200621115959.60126-1-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-22KVM: x86/mmu: Avoid mixing gpa_t with gfn_t in walk_addr_generic()Vitaly Kuznetsov
translate_gpa() returns a GPA, assigning it to 'real_gfn' seems obviously wrong. There is no real issue because both 'gpa_t' and 'gfn_t' are u64 and we don't use the value in 'real_gfn' as a GFN, we do real_gfn = gpa_to_gfn(real_gfn); instead. 'If you see a "buffalo" sign on an elephant's cage, do not trust your eyes', but let's fix it for good. No functional change intended. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200622151435.752560-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-22KVM: LAPIC: ensure APIC map is up to date on concurrent update requestsPaolo Bonzini
The following race can cause lost map update events: cpu1 cpu2 apic_map_dirty = true ------------------------------------------------------------ kvm_recalculate_apic_map: pass check mutex_lock(&kvm->arch.apic_map_lock); if (!kvm->arch.apic_map_dirty) and in process of updating map ------------------------------------------------------------- other calls to apic_map_dirty = true might be too late for affected cpu ------------------------------------------------------------- apic_map_dirty = false ------------------------------------------------------------- kvm_recalculate_apic_map: bail out on if (!kvm->arch.apic_map_dirty) To fix it, record the beginning of an update of the APIC map in apic_map_dirty. If another APIC map change switches apic_map_dirty back to DIRTY during the update, kvm_recalculate_apic_map should not make it CLEAN, and the other caller will go through the slow path. Reported-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-22RDMA/counter: Query a counter before releaseMark Zhang
Query a dynamically-allocated counter before release it, to update it's hwcounters and log all of them into history data. Otherwise all values of these hwcounters will be lost. Fixes: f34a55e497e8 ("RDMA/core: Get sum value of all counters when perform a sysfs stat read") Link: https://lore.kernel.org/r/20200621110000.56059-1-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-22Merge tag 'spi-fix-v5.8-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "Quite a lot of fixes here for no single reason. There's a collection of the usual sort of device specific fixes and also a bunch of people have been working on spidev and the userspace test program spidev_test so they've got an unusually large collection of small fixes" * tag 'spi-fix-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spidev: fix a potential use-after-free in spidev_release() spi: spidev: fix a race between spidev_release and spidev_remove spi: stm32-qspi: Fix error path in case of -EPROBE_DEFER spi: uapi: spidev: Use TABs for alignment spi: spi-fsl-dspi: Free DMA memory with matching function spi: tools: Add macro definitions to fix build errors spi: tools: Make default_tx/rx and input_tx static spi: dt-bindings: amlogic, meson-gx-spicc: Fix schema for meson-g12a spi: rspi: Use requested instead of maximum bit rate spi: spidev_test: Use %u to format unsigned numbers spi: sprd: switch the sequence of setting WDG_LOAD_LOW and _HIGH
2020-06-22kvm: lapic: fix broken vcpu hotplugIgor Mammedov
Guest fails to online hotplugged CPU with error smpboot: do_boot_cpu failed(-1) to wakeup CPU#4 It's caused by the fact that kvm_apic_set_state(), which used to call recalculate_apic_map() unconditionally and pulled hotplugged CPU into apic map, is updating map conditionally on state changes. In this case the APIC map is not considered dirty and the is not updated. Fix the issue by forcing unconditional update from kvm_apic_set_state(), like it used to be. Fixes: 4abaffce4d25a ("KVM: LAPIC: Recalculate apic map in batch") Cc: stable@vger.kernel.org Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20200622160830.426022-1-imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-22Merge tag 'regulator-fix-v5.8-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "This has a fix for the refactoring out of the pickable ranges functionality, plus the removal of a BROKEN dependency on mt6358 now that the dependencies were merged in -rc1 and a couple of device specific fixes" * tag 'regulator-fix-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: mt6358: Remove BROKEN dependency regualtor: pfuze100: correct sw1a/sw2 on pfuze3000 regulator: Fix pickable ranges mapping regulator: da9063: fix LDO9 suspend and warning.
2020-06-22Merge tag 'regmap-fix-v5.8-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap fixes from Mark Brown: "A few small fixes, none of which are likely to have any substantial impact here - the most substantial one is a fix for a long standing memory leak on devices that use register patching which will only have an impact if the device is removed and re-added" * tag 'regmap-fix-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: Fix memory leak from regmap_register_patch regmap: fix the kerneldoc for regmap_test_bits() regmap: fix alignment issue
2020-06-22tools/virtio: Use tools/include/list.h instead of stubsEugenio Pérez
It should not make any significant difference but reduce stub code. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Link: https://lore.kernel.org/r/20200418102217.32327-9-eperezma@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22tools/virtio: Reset index in virtio_test --reset.Eugenio Pérez
This way behavior for vhost is more like a VM. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Link: https://lore.kernel.org/r/20200418102217.32327-8-eperezma@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22tools/virtio: Extract virtqueue initialization in vq_resetEugenio Pérez
So we can reset after that in the main loop. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Link: https://lore.kernel.org/r/20200418102217.32327-7-eperezma@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22tools/virtio: Use __vring_new_virtqueue in virtio_test.cEugenio Pérez
As updated in ("2a2d1382fe9d virtio: Add improved queue allocation API") Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Link: https://lore.kernel.org/r/20200418102217.32327-6-eperezma@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22tools/virtio: Add --resetEugenio Pérez
Currently, it only removes and add backend, but it will reset vq position in future commits. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Link: https://lore.kernel.org/r/20200418102217.32327-5-eperezma@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22tools/virtio: Add --batch=random optionEugenio Pérez
So we can test with non-deterministic batches in flight. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Link: https://lore.kernel.org/r/20200418102217.32327-4-eperezma@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22tools/virtio: Add --batch optionEugenio Pérez
This allow to test vhost having >1 buffers in flight Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Link: https://lore.kernel.org/r/20200401183118.8334-5-eperezma@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20200418102217.32327-3-eperezma@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22virtio-mem: add memory via add_memory_driver_managed()David Hildenbrand
Virtio-mem managed memory is always detected and added by the virtio-mem driver, never using something like the firmware-provided memory map. This is the case after an ordinary system reboot, and has to be guaranteed after kexec. Especially, virtio-mem added memory resources can contain inaccessible parts ("unblocked memory blocks"), blindly forwarding them to a kexec kernel is dangerous, as unplugged memory will get accessed (esp. written). Let's use the new way of adding special driver-managed memory introduced in commit 7b7b27214bba ("mm/memory_hotplug: introduce add_memory_driver_managed()"). This will result in no entries in /sys/firmware/memmap ("raw firmware- provided memory map"), the memory resource will be flagged IORESOURCE_MEM_DRIVER_MANAGED (esp., kexec_file_load() will not place kexec images on this memory), and it is exposed as "System RAM (virtio_mem)" in /proc/iomem, so esp. kexec-tools can properly handle it. Example /proc/iomem before this change: [...] 140000000-333ffffff : virtio0 140000000-147ffffff : System RAM 334000000-533ffffff : virtio1 338000000-33fffffff : System RAM 340000000-347ffffff : System RAM 348000000-34fffffff : System RAM [...] Example /proc/iomem after this change: [...] 140000000-333ffffff : virtio0 140000000-147ffffff : System RAM (virtio_mem) 334000000-533ffffff : virtio1 338000000-33fffffff : System RAM (virtio_mem) 340000000-347ffffff : System RAM (virtio_mem) 348000000-34fffffff : System RAM (virtio_mem) [...] Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: teawater <teawaterz@linux.alibaba.com> Fixes: 5f1f79bbc9e26 ("virtio-mem: Paravirtualized memory hotplug") Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200611093518.5737-1-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
2020-06-22virtio-mem: silence a static checker warningDan Carpenter
Smatch complains that "rc" can be uninitialized if we hit the "break;" statement on the first iteration through the loop. I suspect that this can't happen in real life, but returning a zero literal is cleaner and silence the static checker warning. Fixes: 5f1f79bbc9e2 ("virtio-mem: Paravirtualized memory hotplug") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20200610085911.GC5439@mwanda Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22vhost_vdpa: Fix potential underflow in vhost_vdpa_mmap()Dan Carpenter
The "vma->vm_pgoff" variable is an unsigned long so if it's larger than INT_MAX then "index" can be negative leading to an underflow. Fix this by changing the type of "index" to "unsigned long". Fixes: ddd89d0a059d ("vhost_vdpa: support doorbell mapping via mmap") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20200610085852.GB5439@mwanda Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22vdpa: fix typos in the comments for __vdpa_alloc_device()Jason Wang
Fix two typos in the comments for __vdpa_alloc_device(). Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20200527060528.9100-1-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22Merge tag 'asoc-fix-v5.8-rc2' of ↵Takashi Iwai
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v5.8 This is a collection of mostly small fixes, mostly fixing fallout from some of the DPCM changes that went in last time around which shook out some issues on i.MX and Qualcomm platforms. The addition of a managed version of snd_soc_register_dai() is to fix resource leaks. There's also a few new device IDs for x86 systems.
2020-06-21Revert "kernel/printk: add kmsg SEEK_CUR handling"Jason A. Donenfeld
This reverts commit 8ece3b3eb576a78d2e67ad4c3a80a39fa6708809. This commit broke userspace. Bash uses ESPIPE to determine whether or not the file should be read using "unbuffered I/O", which means reading 1 byte at a time instead of 128 bytes at a time. I used to use bash to read through kmsg in a really quite nasty way: while read -t 0.1 -r line 2>/dev/null || [[ $? -ne 142 ]]; do echo "SARU $line" done < /dev/kmsg This will show all lines that can fit into the 128 byte buffer, and skip lines that don't. That's pretty awful, but at least it worked. With this change, bash now tries to do 1-byte reads, which means it skips all the lines, which is worse than before. Now, I don't really care very much about this, and I'm already look for a workaround. But I did just spend an hour trying to figure out why my scripts were broken. Either way, it makes no difference to me personally whether this is reverted, but it might be something to consider. If you declare that "trying to read /dev/kmsg with bash is terminally stupid anyway," I might be inclined to agree with you. But do note that bash uses lseek(fd, 0, SEEK_CUR)==>ESPIPE to determine whether or not it's reading from a pipe. Cc: Bruno Meneguele <bmeneg@redhat.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Petr Mladek <pmladek@suse.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-21Linux 5.8-rc2v5.8-rc2Linus Torvalds
2020-06-21Merge tag 'selinux-pr-20200621' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull SELinux fixes from Paul Moore: "Three small patches to fix problems in the SELinux code, all found via clang. Two patches fix potential double-free conditions and one fixes an undefined return value" * tag 'selinux-pr-20200621' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: fix undefined return of cond_evaluate_expr selinux: fix a double free in cond_read_node()/cond_read_list() selinux: fix double free
2020-06-21Merge tag 'pinctrl-v5.8-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: "Some early fixes collected during the first week after the merge window, all pretty self-evident, with the details below. The revert is the crucial thing. - Fix a warning on the Qualcomm SPMI GPIO chip being instatiated twice without a unique irqchip struct - Use the noirq variants of the suspend and resume callbacks in the Tegra driver - Clean up the errorpath on the MCP23s08 driver - Revert the use of devm_of_iomap() in the Freescale driver as it was regressing the platform - Add some missing pins in the Qualcomm IPQ6018 driver - Fix a simple documentation bug in the pinctrl-single driver" * tag 'pinctrl-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: single: fix function name in documentation pinctrl: qcom: ipq6018 Add missing pins in qpic pin group Revert "pinctrl: freescale: imx: Use 'devm_of_iomap()' to avoid a resource leak in case of error in 'imx_pinctrl_probe()'" pinctrl: mcp23s08: Split to three parts: fix ptr_ret.cocci warnings pinctrl: tegra: Use noirq suspend/resume callbacks pinctrl: qcom: spmi-gpio: fix warning about irq chip reusage
2020-06-21Merge tag 'kbuild-fixes-v5.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - fix -gz=zlib compiler option test for CONFIG_DEBUG_INFO_COMPRESSED - improve cc-option in scripts/Kbuild.include to clean up temp files - improve cc-option in scripts/Kconfig.include for more reliable compile option test - do not copy modules.builtin by 'make install' because it would break existing systems - use 'userprogs' syntax for watch_queue sample * tag 'kbuild-fixes-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: samples: watch_queue: build sample program for target architecture Revert "Makefile: install modules.builtin even if CONFIG_MODULES=n" scripts: Fix typo in headers_install.sh kconfig: unify cc-option and as-option kbuild: improve cc-option to clean up all temporary files Makefile: Improve compressed debug info support detection
2020-06-21Merge tag 'powerpc-5.8-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - One fix for the interrupt rework we did last release which broke KVM-PR - Three commits fixing some fallout from the READ_ONCE() changes interacting badly with our 8xx 16K pages support, which uses a pte_t that is a structure of 4 actual PTEs - A cleanup of the 8xx pte_update() to use the newly added pmd_off() - A fix for a crash when handling an oops if CONFIG_DEBUG_VIRTUAL is enabled - A minor fix for the SPU syscall generation Thanks to Aneesh Kumar K.V, Christian Zigotzky, Christophe Leroy, Mike Rapoport, Nicholas Piggin. * tag 'powerpc-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/8xx: Provide ptep_get() with 16k pages mm: Allow arches to provide ptep_get() mm/gup: Use huge_ptep_get() in gup_hugepte() powerpc/syscalls: Use the number when building SPU syscall table powerpc/8xx: use pmd_off() to access a PMD entry in pte_update() powerpc/64s: Fix KVM interrupt using wrong save area powerpc: Fix kernel crash in show_instructions() w/DEBUG_VIRTUAL
2020-06-21Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: - NULL dereference in octeontx - PM reference imbalance in ks-sa - deadlock in crypto manager - memory leak in drbg - missing socket limit check on receive SG list size in algif_skcipher - typos in caam - warnings in ccp and hisilicon * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: drbg - always try to free Jitter RNG instance crypto: marvell/octeontx - Fix a potential NULL dereference crypto: algboss - don't wait during notifier callback crypto: caam - fix typos crypto: ccp - Fix sparse warnings in sev-dev crypto: hisilicon - Cap block size at 2^31 crypto: algif_skcipher - Cap recv SG list at ctx->used hwrng: ks-sa - Fix runtime PM imbalance on error
2020-06-22samples: watch_queue: build sample program for target architectureMasahiro Yamada
This userspace program includes UAPI headers exported to usr/include/. 'make headers' always works for the target architecture (i.e. the same architecture as the kernel), so the sample program should be built for the target as well. Kbuild now supports 'userprogs' for that. I also guarded the CONFIG option by 'depends on CC_CAN_LINK' because $(CC) may not provide libc. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-06-22Revert "Makefile: install modules.builtin even if CONFIG_MODULES=n"Masahiro Yamada
This reverts commit e0b250b57dcf403529081e5898a9de717f96b76b, which broke build systems that need to install files to a certain path, but do not set INSTALL_MOD_PATH when invoking 'make install'. $ make INSTALL_PATH=/tmp/destdir install mkdir: cannot create directory ‘/lib/modules/5.8.0-rc1+/’: Permission denied Makefile:1342: recipe for target '_builtin_inst_' failed make: *** [_builtin_inst_] Error 1 While modules.builtin is useful also for CONFIG_MODULES=n, this change in the behavior is quite unexpected. Maybe "make modules_install" can install modules.builtin irrespective of CONFIG_MODULES as Jonas originally suggested. Anyway, that commit should be reverted ASAP. Reported-by: Douglas Anderson <dianders@chromium.org> Reported-by: Guenter Roeck <linux@roeck-us.net> Cc: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net>
2020-06-20Merge branch 'Marvell-mvpp2-improvements'David S. Miller
Russell King says: ==================== Marvell mvpp2 improvements This series primarily cleans up mvpp2, but also fixes a left-over from 91a208f2185a ("net: phylink: propagate resolved link config via mac_link_up()"). Patch 1 introduces some port helpers: mvpp2_port_supports_xlg() - does the port support the XLG MAC mvpp2_port_supports_rgmii() - does the port support RGMII modes Patch 2 introduces mvpp2_phylink_to_port(), rather than having repeated open coding of container_of(). Patch 3 introduces mvpp2_modify(), which reads-modifies-writes a register - I've converted the phylink specific code to use this helper. Patch 4 moves the hardware control of the pause modes from mvpp2_xlg_config() (which is called via the phylink_config method) to mvpp2_mac_link_up() - a change that was missed in the above referenced commit. v2: remove "inline" in patch 2. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20net: mvpp2: set xlg flow control in mvpp2_mac_link_up()Russell King
Set the flow control settings in mvpp2_mac_link_up() for 10G links just as we do for 1G and slower links. This is now the preferred location. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20net: mvpp2: add register modification helperRussell King
Add a helper to read-modify-write a register, and use it in the phylink helpers. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20net: mvpp2: add mvpp2_phylink_to_port() helperRussell King
Add a helper to convert the struct phylink_config pointer passed in from phylink to the drivers internal struct mvpp2_port. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20net: mvpp2: add port support helpersRussell King
The mvpp2 code has tests scattered amongst the code to determine whether the port supports the XLG, and whether the port supports RGMII mode. Rather than having these tests scattered, provide a couple of helper functions, so that future additions can ensure that they get these tests correct. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>