summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-10-22net: dsa: use ports list to setup default CPU portVivien Didelot
Use the new ports list instead of iterating over switches and their ports when setting up the default CPU port. Unassign it on teardown. Now that we can iterate over multiple CPU ports, remove dst->cpu_dp. At the same time, provide a better error message for CPU-less tree. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22net: dsa: use ports list to find first CPU portVivien Didelot
Use the new ports list instead of iterating over switches and their ports when looking up the first CPU port in the tree. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22net: dsa: use ports list to setup multiple master devicesVivien Didelot
Now that we have a potential list of CPU ports, make use of it instead of only configuring the master device of an unique CPU port. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22net: dsa: use ports list to find a port by nodeVivien Didelot
Use the new ports list instead of iterating over switches and their ports to find a port from a given node. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22net: dsa: use ports list for routing table setupVivien Didelot
Use the new ports list instead of accessing the dsa_switch array of ports when iterating over DSA ports of a switch to set up the routing table. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22net: dsa: use ports list to setup switchesVivien Didelot
Use the new ports list instead of iterating over switches and their ports when setting up the switches and their ports. At the same time, provide setup states and messages for ports and switches as it is done for the trees. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22net: dsa: use ports list to find slaveVivien Didelot
Use the new ports list instead of iterating over switches and their ports when looking for a slave device from a given master interface. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22net: dsa: use ports list in dsa_to_portVivien Didelot
Use the new ports list instead of accessing the dsa_switch array of ports in the dsa_to_port helper. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22net: dsa: add ports list in the switch fabricVivien Didelot
Add a list of switch ports within the switch fabric. This will help the lookup of a port inside the whole fabric, and it is the first step towards supporting multiple CPU ports, before deprecating the usage of the unique dst->cpu_dp pointer. In preparation for a future allocation of the dsa_port structures, return -ENOMEM in case no structure is returned, even though this error cannot be reached yet. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22net: dsa: use dsa_to_port helper everywhereVivien Didelot
Do not let the drivers access the ds->ports static array directly while there is a dsa_to_port helper for this purpose. At the same time, un-const this helper since the SJA1105 driver assigns the priv member of the returned dsa_port structure. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22libbpf: Make DECLARE_LIBBPF_OPTS macro strictly a variable declarationAndrii Nakryiko
LIBBPF_OPTS is implemented as a mix of field declaration and memset + assignment. This makes it neither variable declaration nor purely statements, which is a problem, because you can't mix it with either other variable declarations nor other function statements, because C90 compiler mode emits warning on mixing all that together. This patch changes LIBBPF_OPTS into a strictly declaration of variable and solves this problem, as can be seen in case of bpftool, which previously would emit compiler warning, if done this way (LIBBPF_OPTS as part of function variables declaration block). This patch also renames LIBBPF_OPTS into DECLARE_LIBBPF_OPTS to follow kernel convention for similar macros more closely. v1->v2: - rename LIBBPF_OPTS into DECLARE_LIBBPF_OPTS (Jakub Sitnicki). Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20191022172100.3281465-1-andriin@fb.com
2019-10-22tools/bpf: Turn on llvm alu32 attribute by defaultYonghong Song
LLVM alu32 was introduced in LLVM7: https://reviews.llvm.org/rL325987 https://reviews.llvm.org/rL325989 Experiments showed that in general performance is better with alu32 enabled: https://lwn.net/Articles/775316/ This patch turns on alu32 with no-flavor test_progs which is tested most often. The flavor test at no_alu32/test_progs can be used to test without alu32 enabled. The Makefile check for whether LLVM supports '-mattr=+alu32 -mcpu=v3' is removed as LLVM7 should be available for recent distributions and also latest LLVM is preferred to run BPF selftests. Note that jmp32 is checked by -mcpu=probe and will be enabled if the host kernel supports it. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191022043119.2625263-1-yhs@fb.com
2019-10-22Merge branch 'r8169-remove-fiddling-with-the-pcie-max-read-request-size'Jakub Kicinski
Heiner Kallweit says: ==================== The attempt to improve performance by changing the PCIe max read request size was added in the vendor driver more than 10 years back and copied to r8169 driver. In the vendor driver this has been removed long ago. Obviously it had no effect, also in my tests I didn't see any difference. Typically the max payload size is less than 512 bytes anyway, and the PCI core takes care that the maximum supported value is set. So let's remove fiddling with PCIe max read request size from r8169 too. This change allows to simplify the driver in the subsequent three patches of this series. ==================== Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22r8169: remove rtl_hw_start_8168befHeiner Kallweit
We can remove rtl_hw_start_8168bef() and use rtl_hw_start_8168b() instead because setting register Config4 is done in rtl_jumbo_config(), being called from rtl_hw_start(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22r8169: remove rtl_hw_start_8168dpHeiner Kallweit
We can remove rtl_hw_start_8168dp() because it's the same as rtl_hw_start_8168dp() now. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22r8169: simplify setting PCI_EXP_DEVCTL_NOSNOOP_ENHeiner Kallweit
r8168b_0_hw_jumbo_enable() and r8168b_0_hw_jumbo_disable() both do the same and just set PCI_EXP_DEVCTL_NOSNOOP_EN. We can simplify the code by moving this setting for RTL8168B to rtl_hw_start_8168(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22r8169: remove fiddling with the PCIe max read request sizeHeiner Kallweit
The attempt to improve performance by changing the PCIe max read request size was added in the vendor driver more than 10 years back and copied to r8169 driver. In the vendor driver this has been removed long ago. Obviously it had no effect, also in my tests I didn't see any difference. Typically the max payload size is less than 512 bytes anyway, and the PCI core takes care that the maximum supported value is set. So let's remove fiddling with PCIe max read request size from r8169 too. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22bpf: Fix use after free in subprog's jited symbol removalDaniel Borkmann
syzkaller managed to trigger the following crash: [...] BUG: unable to handle page fault for address: ffffc90001923030 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD aa551067 P4D aa551067 PUD aa552067 PMD a572b067 PTE 80000000a1173163 Oops: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 7982 Comm: syz-executor912 Not tainted 5.4.0-rc3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:bpf_jit_binary_hdr include/linux/filter.h:787 [inline] RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:531 [inline] RIP: 0010:bpf_tree_comp kernel/bpf/core.c:600 [inline] RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline] RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline] RIP: 0010:bpf_prog_kallsyms_find kernel/bpf/core.c:674 [inline] RIP: 0010:is_bpf_text_address+0x184/0x3b0 kernel/bpf/core.c:709 [...] Call Trace: kernel_text_address kernel/extable.c:147 [inline] __kernel_text_address+0x9a/0x110 kernel/extable.c:102 unwind_get_return_address+0x4c/0x90 arch/x86/kernel/unwind_frame.c:19 arch_stack_walk+0x98/0xe0 arch/x86/kernel/stacktrace.c:26 stack_trace_save+0xb6/0x150 kernel/stacktrace.c:123 save_stack mm/kasan/common.c:69 [inline] set_track mm/kasan/common.c:77 [inline] __kasan_kmalloc+0x11c/0x1b0 mm/kasan/common.c:510 kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:518 slab_post_alloc_hook mm/slab.h:584 [inline] slab_alloc mm/slab.c:3319 [inline] kmem_cache_alloc+0x1f5/0x2e0 mm/slab.c:3483 getname_flags+0xba/0x640 fs/namei.c:138 getname+0x19/0x20 fs/namei.c:209 do_sys_open+0x261/0x560 fs/open.c:1091 __do_sys_open fs/open.c:1115 [inline] __se_sys_open fs/open.c:1110 [inline] __x64_sys_open+0x87/0x90 fs/open.c:1110 do_syscall_64+0xf7/0x1c0 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe [...] After further debugging it turns out that we walk kallsyms while in parallel we tear down a BPF program which contains subprograms that have been JITed though the program itself has not been fully exposed and is eventually bailing out with error. The bpf_prog_kallsyms_del_subprogs() in bpf_prog_load()'s error path removes the symbols, however, bpf_prog_free() tears down the JIT memory too early via scheduled work. Instead, it needs to properly respect RCU grace period as the kallsyms walk for BPF is under RCU. Fix it by refactoring __bpf_prog_put()'s tear down and reuse it in our error path where we defer final destruction when we have subprogs in the program. Fixes: 7d1982b4e335 ("bpf: fix panic in prog load calls cleanup") Fixes: 1c2a088a6626 ("bpf: x64: add JIT support for multi-function programs") Reported-by: syzbot+710043c5d1d5b5013bc7@syzkaller.appspotmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: syzbot+710043c5d1d5b5013bc7@syzkaller.appspotmail.com Link: https://lore.kernel.org/bpf/55f6367324c2d7e9583fa9ccf5385dcbba0d7a6e.1571752452.git.daniel@iogearbox.net
2019-10-22Merge branch 'net-smc-improve-termination-handling'Jakub Kicinski
Karsten Graul says: ==================== More patches to address abnormal termination processing of sockets and link groups. ==================== Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22net/smc: remove close abort workerUrsula Braun
With the introduction of the link group termination worker there is no longer a need to postpone smc_close_active_abort() to a worker. To protect socket destruction due to normal and abnormal socket closing, the socket refcount is increased. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22net/smc: introduce link group termination workerUrsula Braun
Use a worker for link group termination to guarantee process context. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22net/smc: improve abnormal termination of link groupsUrsula Braun
If a link group and its connections must be terminated, * wake up socket waiters * do not enable buffer reuse A linkgroup might be terminated while normal connection closing is running. Avoid buffer reuse and its related LLC DELETE RKEY call, if linkgroup termination has started. And use the earliest indication of linkgroup termination possible, namely the removal from the linkgroup list. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22net/smc: tell peers about abnormal link group terminationUrsula Braun
There are lots of link group termination scenarios. Most of them still allow to inform the peer of the terminating sockets about aborting. This patch tries to call smc_close_abort() for terminating sockets. And the internal TCP socket is reset with tcp_abort(). Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22net/smc: improve link group freeingUrsula Braun
Usually link groups are freed delayed to enable quick connection creation for a follow-on SMC socket. Terminated link groups are freed faster. This patch makes sure, fast schedule of link group freeing is not rescheduled by a delayed schedule. And it makes sure link group freeing is not rescheduled, if the real freeing is already running. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22net/smc: improve abnormal termination lockingUrsula Braun
Locking hierarchy requires that the link group conns_lock can be taken if the socket lock is held, but not vice versa. Nevertheless socket termination during abnormal link group termination should be protected by the socket lock. This patch reduces the time segments the link group conns_lock is held to enable usage of lock_sock in smc_lgr_terminate(). Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22net/smc: terminate link group without holding lgr lockUrsula Braun
When a link group is to be terminated, it is sufficient to hold the lgr lock when unlinking the link group from its list. Move the lock-protected link group unlinking into smc_lgr_terminate(). Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22net/smc: cancel send and receive for terminated socketUrsula Braun
The resources for a terminated socket are being cleaned up. This patch makes sure * no more data is received for an actively terminated socket * no more data is sent for an actively or passively terminated socket Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22Merge branch 'mlxsw-core-extend-qsfp-eeprom-size'Jakub Kicinski
Ido Schimmel says: ==================== Vadim says: This patch set extends the size of QSFP EEPROM for the cable types SSF-8436 and SFF-8636 from 256 bytes to 640 bytes. This allows ethtool to show correct information for these cable types (more details below). Patch #1 adds a macro that computes the EEPROM page number from the provided offset specified in the request. Patch #2 teaches the driver to access the information stored in the upper pages of the QSFP memory map. Details and examples: SFF-8436 specification defines pages 0, 1, 2 and 3. Page 0 contains lower memory page offsets (from 0x00 to 0x7f) and upper page offsets (from 0x80 to 0xfe). Upper pages 1, 2 and 3 are optional and can be empty. Page 1 is provided if upper page 0 byte 0xc3 bit 6 is set. Page 2 is provided if upper page 0 byte 0xc3 bit 7 is set. Page 3 is provided if lower page 0 byte 0x02 bit 2 is cleared. Offset 0xc3 for the upper page is provided as 0x43 = 0xc3 - 0x80. As a result of exposing 256 bytes only, ethtool shows wrong information for pages 1, 2 and 3. In the below hex dump from ethtool for a cable compliant to SFF-8636 specification, it can be seen that EEPROM of this device contains optical diagnostic page (lower page 0 byte 0x02 bit 2 is cleared), but it is not exposed, as the length defined for this type is 256 bytes. $ ethtool -m sfp42 hex on Offset Values ------ ------ 0x0000: 11 07 00 ff 00 ff 00 00 00 55 55 00 00 00 00 00 0x0010: 00 00 00 00 00 00 2a 90 00 00 82 ae 00 00 00 00 0x0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 0x0060: 00 00 ff 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0080: 11 8c 0c 80 00 00 00 00 00 00 00 05 ff 00 00 23 0x0090: 00 00 32 00 4d 65 6c 6c 61 6e 6f 78 20 20 20 20 0x00a0: 20 20 20 20 00 00 02 c9 4d 4d 41 31 42 30 30 2d 0x00b0: 53 53 31 20 20 20 20 20 41 32 42 68 0b b8 46 05 0x00c0: 02 07 f5 9e 4d 54 31 38 33 34 46 54 30 33 38 34 0x00d0: 36 20 20 20 31 38 30 37 30 33 00 00 0c 10 67 c2 0x00e0: 38 32 36 46 4d 41 32 32 36 49 30 31 31 35 20 20 0x00f0: 00 00 00 00 00 00 00 00 00 00 01 00 0e 00 00 00 After changing the length returned by get_module_info() callback from 256 bytes to 640 bytes, the upper pages 1, 2 and 3 are exposed by ethtool. In the below hex dump from the same cable it can be seen that the optical diagnostic page (page 3, from offset 0x0200) has non-zero data. $ ethtool -m sfp42 hex on Offset Values ------ ------ 0x0000: 11 07 00 ff 00 ff 00 00 00 55 55 00 00 00 00 00 0x0010: 00 00 00 00 00 00 27 79 00 00 82 c5 00 00 00 00 0x0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 0x0060: 00 00 ff 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0080: 11 8c 0c 80 00 00 00 00 00 00 00 05 ff 00 00 23 0x0090: 00 00 32 00 4d 65 6c 6c 61 6e 6f 78 20 20 20 20 0x00a0: 20 20 20 20 00 00 02 c9 4d 4d 41 31 42 30 30 2d 0x00b0: 53 53 31 20 20 20 20 20 41 32 42 68 0b b8 46 05 0x00c0: 02 07 f5 9e 4d 54 31 38 33 34 46 54 30 33 38 34 0x00d0: 36 20 20 20 31 38 30 37 30 33 00 00 0c 10 67 c2 0x00e0: 38 32 36 46 4d 41 32 32 36 49 30 31 31 35 20 20 0x00f0: 00 00 00 00 00 00 00 00 00 00 01 00 0e 00 00 00 0x0100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x01a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x01b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x01c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x01d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x01e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x01f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0200: 50 00 f6 00 46 00 00 00 00 00 00 00 00 00 00 00 0x0210: 88 b8 79 18 87 5a 7a 76 00 00 00 00 00 00 00 00 0x0220: 00 00 00 00 00 00 00 00 00 00 18 30 0e 61 60 b7 0x0230: 87 71 01 d3 43 e2 03 a5 10 9a 0a ba 0f a0 0b b8 0x0240: 87 71 02 d4 43 e2 05 a5 00 00 00 00 00 00 00 00 0x0250: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0260: a7 03 00 00 00 00 00 00 00 00 44 44 22 22 11 11 0x0270: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 And 'ethtool -m sfp42' shows the real values for the below fields, while before it exposed zeros for these fields: Laser bias current high alarm threshold : 8.500 mA Laser bias current low alarm threshold : 5.492 mA Laser bias current high warning threshold : 8.000 mA Laser bias current low warning threshold : 6.000 mA Laser output power high alarm threshold : 3.4673 mW / 5.40 dBm Laser output power low alarm threshold : 0.0724 mW / -11.40 dBm Laser output power high warning threshold : 1.7378 mW / 2.40 dBm Laser output power low warning threshold : 0.1445 mW / -8.40 dBm Module temperature high alarm threshold : 80.00 degrees C / 176.00 F Module temperature low alarm threshold : -10.00 degrees C / 14.00 F Module temperature high warning threshold : 70.00 degrees C / 158.00 F Module temperature low warning threshold : 0.00 degrees C / 32.00 F Module voltage high alarm threshold : 3.5000 V Module voltage low alarm threshold : 3.1000 V ==================== Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22RDMA/uverbs: Prevent potential underflowDan Carpenter
The issue is in drivers/infiniband/core/uverbs_std_types_cq.c in the UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE) function. We check that: if (attr.comp_vector >= attrs->ufile->device->num_comp_vectors) { But we don't check if "attr.comp_vector" is negative. It could potentially lead to an array underflow. My concern would be where cq->vector is used in the create_cq() function from the cxgb4 driver. And really "attr.comp_vector" is appears as a u32 to user space so that's the right type to use. Fixes: 9ee79fce3642 ("IB/core: Add completion queue (cq) object actions") Link: https://lore.kernel.org/r/20191011133419.GA22905@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-22mlxsw: core: Extend QSFP EEPROM size for ethtoolVadim Pasternak
Extend the size of QSFP EEPROM for the cable types SSF8436 and SFF8636 from 256 to 640 bytes in order to expose all the EEPROM pages by ethtool. For SFF-8636 and SFF-8436 specifications, the driver exposes 256 bytes of data for ethtool's get_module_eeprom() callback. This is because the driver uses the below defines to specify SFF module length in ethtool's get_module_info() callback: 'ETH_MODULE_SFF_8636_LEN' and 'ETH_MODULE_SFF_8436_LEN' (both are 256). As a result of exposing 256 bytes only, ethtool shows wrong "zero" info for pages 1, 2, 3. The patch changes the length returned by callback for get_module_info() to the values from the next defines: 'ETH_MODULE_SFF_8636_MAX_LEN' and 'ETH_MODULE_SFF_8436_MAX_LEN' (both are 640) to allow exposing of upper page 1, 2 and 3. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22mlxsw: reg: Add macro for getting QSFP module EEPROM page numberVadim Pasternak
Provide a macro for getting QSFP module EEPROM page number from the optional upper page number row offset, specified in request. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22KVM: nVMX: Don't leak L1 MMIO regions to L2Jim Mattson
If the "virtualize APIC accesses" VM-execution control is set in the VMCS, the APIC virtualization hardware is triggered when a page walk in VMX non-root mode terminates at a PTE wherein the address of the 4k page frame matches the APIC-access address specified in the VMCS. On hardware, the APIC-access address may be any valid 4k-aligned physical address. KVM's nVMX implementation enforces the additional constraint that the APIC-access address specified in the vmcs12 must be backed by a "struct page" in L1. If not, L0 will simply clear the "virtualize APIC accesses" VM-execution control in the vmcs02. The problem with this approach is that the L1 guest has arranged the vmcs12 EPT tables--or shadow page tables, if the "enable EPT" VM-execution control is clear in the vmcs12--so that the L2 guest physical address(es)--or L2 guest linear address(es)--that reference the L2 APIC map to the APIC-access address specified in the vmcs12. Without the "virtualize APIC accesses" VM-execution control in the vmcs02, the APIC accesses in the L2 guest will directly access the APIC-access page in L1. When there is no mapping whatsoever for the APIC-access address in L1, the L2 VM just loses the intended APIC virtualization. However, when the APIC-access address is mapped to an MMIO region in L1, the L2 guest gets direct access to the L1 MMIO device. For example, if the APIC-access address specified in the vmcs12 is 0xfee00000, then L2 gets direct access to L1's APIC. Since this vmcs12 configuration is something that KVM cannot faithfully emulate, the appropriate response is to exit to userspace with KVM_INTERNAL_ERROR_EMULATION. Fixes: fe3ef05c7572 ("KVM: nVMX: Prepare vmcs02 from vmcs01 and vmcs12") Reported-by: Dan Cross <dcross@google.com> Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Peter Shier <pshier@google.com> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-22ARC: perf: Accommodate big-endian CPUAlexey Brodkin
8-letter strings representing ARC perf events are stores in two 32-bit registers as ASCII characters like that: "IJMP", "IALL", "IJMPTAK" etc. And the same order of bytes in the word is used regardless CPU endianness. Which means in case of big-endian CPU core we need to swap bytes to get the same order as if it was on little-endian CPU. Otherwise we're seeing the following error message on boot: ------------------------->8---------------------- ARC perf : 8 counters (32 bits), 40 conditions, [overflow IRQ support] sysfs: cannot create duplicate filename '/devices/arc_pct/events/pmji' CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3 Stack Trace: arc_unwind_core+0xd4/0xfc dump_stack+0x64/0x80 sysfs_warn_dup+0x46/0x58 sysfs_add_file_mode_ns+0xb2/0x168 create_files+0x70/0x2a0 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1 at kernel/events/core.c:12144 perf_event_sysfs_init+0x70/0xa0 Failed to register pmu: arc_pct, reason -17 Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3 Stack Trace: arc_unwind_core+0xd4/0xfc dump_stack+0x64/0x80 __warn+0x9c/0xd4 warn_slowpath_fmt+0x22/0x2c perf_event_sysfs_init+0x70/0xa0 ---[ end trace a75fb9a9837bd1ec ]--- ------------------------->8---------------------- What happens here we're trying to register more than one raw perf event with the same name "PMJI". Why? Because ARC perf events are 4 to 8 letters and encoded into two 32-bit words. In this particular case we deal with 2 events: * "IJMP____" which counts all jump & branch instructions * "IJMPC___" which counts only conditional jumps & branches Those strings are split in two 32-bit words this way "IJMP" + "____" & "IJMP" + "C___" correspondingly. Now if we read them swapped due to CPU core being big-endian then we read "PMJI" + "____" & "PMJI" + "___C". And since we interpret read array of ASCII letters as a null-terminated string on big-endian CPU we end up with 2 events of the same name "PMJI". Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: stable@vger.kernel.org Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2019-10-22ARC: [plat-hsdk]: Enable on-boardi SPI ADC ICEugeniy Paltsev
HSDK board has adc108s102 SPI ADC IC installed, enable it. Acked-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2019-10-22ARC: [plat-hsdk]: Enable on-board SPI NOR flash ICEugeniy Paltsev
HSDK board has sst26wf016b SPI NOR flash IC installed, enable it. Acked-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2019-10-22xen/netback: cleanup init and deinit codeJuergen Gross
Do some cleanup of the netback init and deinit code: - add an omnipotent queue deinit function usable from xenvif_disconnect_data() and the error path of xenvif_connect_data() - only install the irq handlers after initializing all relevant items (especially the kthreads related to the queue) - there is no need to use get_task_struct() after creating a kthread and using put_task_struct() again after having stopped it. - use kthread_run() instead of kthread_create() to spare the call of wake_up_process(). Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Paul Durrant <pdurrant@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22KVM: SVM: Fix potential wrong physical id in avic_handle_ldr_updateMiaohe Lin
Guest physical APIC ID may not equal to vcpu->vcpu_id in some case. We may set the wrong physical id in avic_handle_ldr_update as we always use vcpu->vcpu_id. Get physical APIC ID from vAPIC page instead. Export and use kvm_xapic_id here and in avic_handle_apic_id_update as suggested by Vitaly. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-22Merge branch 'r8152-phy-firmware'Jakub Kicinski
Hayes Wang says: ==================== Support loading the firmware of the PHY with the type of RTL_FW_PHY_NC. ==================== Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22r8152: support firmware of PHY NC for RTL8153AHayes Wang
Support the firmware of PHY NC which is used to fix the issue found for PHY. Currently, only RTL_VER_04, RTL_VER_05, and RTL_VER_06 need it. The order of loading PHY firmware would be RTL_FW_PHY_START RTL_FW_PHY_NC RTL_FW_PHY_STOP The RTL_FW_PHY_START/RTL_FW_PHY_STOP are used to lock/unlock the PHY, and set/clear the patch key from the firmware file. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22r8152: move r8153_patch_request forwardHayes Wang
Move r8153_patch_request() forward for later patch. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22r8152: add checking fw_offset field of struct fw_macHayes Wang
Make sure @fw_offset field of struct fw_mac is more than the size of struct fw_mac. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22r8152: rename fw_type_1 with fw_macHayes Wang
The struct fw_type_1 is used by MAC only, so rename it to a meaningful one. Besides, adjust two messages. Replace "load xxx fail" with "check xxx fail" Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22Merge branch 'misc' into fixesRussell King
2019-10-22cpufreq: Cancel policy update work scheduled before freeingSudeep Holla
Scheduled policy update work may end up racing with the freeing of the policy and unregistering the driver. One possible race is as below, where the cpufreq_driver is unregistered, but the scheduled work gets executed at later stage when, cpufreq_driver is NULL (i.e. after freeing the policy and driver). Unable to handle kernel NULL pointer dereference at virtual address 0000001c pgd = (ptrval) [0000001c] *pgd=80000080204003, *pmd=00000000 Internal error: Oops: 206 [#1] SMP THUMB2 Modules linked in: CPU: 0 PID: 34 Comm: kworker/0:1 Not tainted 5.4.0-rc3-00006-g67f5a8081a4b #86 Hardware name: ARM-Versatile Express Workqueue: events handle_update PC is at cpufreq_set_policy+0x58/0x228 LR is at dev_pm_qos_read_value+0x77/0xac Control: 70c5387d Table: 80203000 DAC: fffffffd Process kworker/0:1 (pid: 34, stack limit = 0x(ptrval)) (cpufreq_set_policy) from (refresh_frequency_limits.part.24+0x37/0x48) (refresh_frequency_limits.part.24) from (handle_update+0x2f/0x38) (handle_update) from (process_one_work+0x16d/0x3cc) (process_one_work) from (worker_thread+0xff/0x414) (worker_thread) from (kthread+0xff/0x100) (kthread) from (ret_from_fork+0x11/0x28) Fixes: 67d874c3b2c6 ("cpufreq: Register notifiers with the PM QoS framework") Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> [ rjw: Cancel the work before dropping the QoS requests ] Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-10-22s390/kaslr: add support for R_390_GLOB_DAT relocation typeGerald Schaefer
Commit "bpf: Process in-kernel BTF" in linux-next introduced an undefined __weak symbol, which results in an R_390_GLOB_DAT relocation type. That is not yet handled by the KASLR relocation code, and the kernel stops with the message "Unknown relocation type". Add code to detect and handle R_390_GLOB_DAT relocation types and undefined symbols. Fixes: 805bc0bc238f ("s390/kernel: build a relocatable kernel") Cc: <stable@vger.kernel.org> # v5.2+ Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-10-22s390/zcrypt: fix memleak at releaseJohan Hovold
If a process is interrupted while accessing the crypto device and the global ap_perms_mutex is contented, release() could return early and fail to free related resources. Fixes: 00fab2350e6b ("s390/zcrypt: multiple zcrypt device nodes support") Cc: <stable@vger.kernel.org> # 4.19 Cc: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-10-22ALSA: usb-audio: Fix copy&paste error in the validatorTakashi Iwai
The recently introduced USB-audio descriptor validator had a stupid copy&paste error that may lead to an unexpected overlook of too short descriptors for processing and extension units. It's likely the cause of the report triggered by syzkaller fuzzer. Let's fix it. Fixes: 57f8770620e9 ("ALSA: usb-audio: More validations of descriptor units") Reported-by: syzbot+0620f79a1978b1133fd7@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/s5hsgnkdbsl.wl-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-10-22tools: gpio: Use !building_out_of_srctree to determine srctreeShuah Khan
make TARGETS=gpio kselftest fails with: Makefile:23: tools/build/Makefile.include: No such file or directory When the gpio tool make is invoked from tools Makefile, srctree is cleared and the current logic check for srctree equals to empty string to determine srctree location from CURDIR. When the build in invoked from selftests/gpio Makefile, the srctree is set to "." and the same logic used for srctree equals to empty is needed to determine srctree. Check building_out_of_srctree undefined as the condition for both cases to fix "make TARGETS=gpio kselftest" build failure. Cc: stable@vger.kernel.org Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2019-10-22perf/aux: Fix AUX output stoppingAlexander Shishkin
Commit: 8a58ddae2379 ("perf/core: Fix exclusive events' grouping") allows CAP_EXCLUSIVE events to be grouped with other events. Since all of those also happen to be AUX events (which is not the case the other way around, because arch/s390), this changes the rules for stopping the output: the AUX event may not be on its PMU's context any more, if it's grouped with a HW event, in which case it will be on that HW event's context instead. If that's the case, munmap() of the AUX buffer can't find and stop the AUX event, potentially leaving the last reference with the atomic context, which will then end up freeing the AUX buffer. This will then trip warnings: Fix this by using the context's PMU context when looking for events to stop, instead of the event's PMU context. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20191022073940.61814-1-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-22Merge tag 'kvm-ppc-fixes-5.4-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD PPC KVM fix for 5.4 - Fix a bug in the XIVE code which can cause a host crash.