summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-11-03tipc: eliminate unnecessary probingJon Maloy
The neighbor monitor employs a threshold, default set to 32 peer nodes, where it activates the "Overlapping Neighbor Monitoring" algorithm. Below that threshold, monitoring is full-mesh, and no "domain records" are passed between the nodes. Because of this, a node never received a peer's ack that it has received the most recent update of the own domain. Hence, the field 'acked_gen' in struct tipc_monitor_state remains permamently at zero, whereas the own domain generation is incremented for each added or removed peer. This has the effect that the function tipc_mon_get_state() always sets the field 'probing' in struct tipc_monitor_state true, again leading the tipc_link_timeout() of the link in question to always send out a probe, even when link->silent_intv_count is zero. This is functionally harmless, but leads to some unncessary probing, which can easily be eliminated by setting the 'probing' field of the said struct correctly in such cases. At the same time, we explictly invalidate the sent domain records when the algorithm is not activated. This will eliminate any risk that an invalid domain record might be inadverently accepted by the peer. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03net: sched: move block offload unbind after all chains are flushedJiri Pirko
Currently, the offload unbind is done before the chains are flushed. That causes driver to unregister block callback before it can get all the callback calls done during flush, leaving the offloaded tps inside the HW. So fix the order to prevent this situation and restore the original behaviour. Reported-by: Alexander Duyck <alexander.duyck@gmail.com> Reported-by: Jakub Kicinski <kubakici@wp.pl> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03cxgb4vf: define get_fecparam ethtool callbackGanesh Goudar
Add support to new ethtool operation get_fecparam to fetch FEC parameters. Original Work by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03cxgb4: add new T6 pci device id'sGanesh Goudar
Add 0x6086 T6 device id. Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03net/ncsi: Make local function ncsi_get_filter() staticWei Yongjun
Fixes the following sparse warnings: net/ncsi/ncsi-manage.c:41:5: warning: symbol 'ncsi_get_filter' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03net: bridge: Convert timers to use timer_setup()Allen Pais
switch to using the new timer_setup() and from_timer() api's. Signed-off-by: Allen Pais <allen.pais@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03net: bridge: Convert timers to use timer_setup()Allen Pais
switch to using the new timer_setup() and from_timer() api's. Signed-off-by: Allen Pais <allen.pais@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03Merge branch 'mlxsw-Align-multipath-hash-parameters-with-kernels'David S. Miller
Jiri Pirko says: ==================== mlxsw: Align multipath hash parameters with kernel's Ido says: This set makes sure the device is using the same parameters as the kernel when it computes the multipath hash during IP forwarding. First patch adds a new netevent to let interested listeners know that the multipath hash policy has changed. Next two patches do small and non-functional changes in the mlxsw driver. Last patches configure the multipath hash policy upon driver initialization and as a response to netevents. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03mlxsw: spectrum_router: Update multipath hash parameters upon neteventsIdo Schimmel
Make sure the device and the kernel are performing the multipath hash according to the same parameters by updating the device whenever the relevant netevent is generated. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03mlxsw: spectrum_router: Align multipath hash parameters with kernel'sIdo Schimmel
Up until now we used the hardware's defaults for multipath hash computation. This patch aligns the hardware's multipath parameters with the kernel's. For IPv4 packets, the parameters are determined according to the 'fib_multipath_hash_policy' sysctl during module initialization. In case L3-mode is requested, only the source and destination IP addresses are used. There is no special handling of ICMP error packets. In case L4-mode is requested, a 5-tuple is used: source and destination IP addresses, source and destination ports and IP protocol. Note that the layer 4 fields are not considered for fragmented packets. For IPv6 packets, the source and destination IP addresses are used, as well as the flow label and the next header fields. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03mlxsw: reg: Add Router ECMP Configuration Register Version 2Ido Schimmel
The RECRv2 register is used for setting up the router's ECMP hash configuration. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03mlxsw: spectrum_router: Properly name netevent work structIdo Schimmel
The struct containing the work item queued from the netevent handler is named after the only event it is currently used for, which is neighbour updates. Use a more appropriate name for the struct, as we are going to use it for more events. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03mlxsw: spectrum_router: Embed netevent notifier block in router structIdo Schimmel
We are going to need to respond to netevents notifying us about multipath hash updates by configuring the device's hash parameters. Embed the netevent notifier in the router struct so that we could retrieve it upon notifications and use it to configure the device. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03ipv4: Send a netevent whenever multipath hash policy is changedIdo Schimmel
Devices performing IPv4 forwarding need to update their multipath hash policy whenever it is changed. Inform these devices by generating a netevent. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03Documentation: Add Tim Bird to list of enforcement statement endorsersBird, Timothy
Add my name to the list. Signed-off-by: Tim Bird <tim.bird@sony.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-03net: systemport: Correct IPG length settingsFlorian Fainelli
Due to a documentation mistake, the IPG length was set to 0x12 while it should have been 12 (decimal). This would affect short packet (64B typically) performance since the IPG was bigger than necessary. Fixes: 44a4524c54af ("net: systemport: Add support for SYSTEMPORT Lite") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03tcp: do not mangle skb->cb[] in tcp_make_synack()Eric Dumazet
Christoph Paasch sent a patch to address the following issue : tcp_make_synack() is leaving some TCP private info in skb->cb[], then send the packet by other means than tcp_transmit_skb() tcp_transmit_skb() makes sure to clear skb->cb[] to not confuse IPv4/IPV6 stacks, but we have no such cleanup for SYNACK. tcp_make_synack() should not use tcp_init_nondata_skb() : tcp_init_nondata_skb() really should be limited to skbs put in write/rtx queues (the ones that are only sent via tcp_transmit_skb()) This patch fixes the issue and should even save few cpu cycles ;) Fixes: 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Christoph Paasch <cpaasch@apple.com> Reviewed-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03fib: fib_dump_info can no longer use __in_dev_get_rtnlFlorian Westphal
syzbot reported yet another regression added with DOIT_UNLOCKED. When nexthop is marked as dead, fib_dump_info uses __in_dev_get_rtnl(): ./include/linux/inetdevice.h:230 suspicious rcu_dereference_protected() usage! rcu_scheduler_active = 2, debug_locks = 1 1 lock held by syz-executor2/23859: #0: (rcu_read_lock){....}, at: [<ffffffff840283f0>] inet_rtm_getroute+0xaa0/0x2d70 net/ipv4/route.c:2738 [..] lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4665 __in_dev_get_rtnl include/linux/inetdevice.h:230 [inline] fib_dump_info+0x1136/0x13d0 net/ipv4/fib_semantics.c:1377 inet_rtm_getroute+0xf97/0x2d70 net/ipv4/route.c:2785 .. This isn't safe anymore, callers either hold RTNL mutex or rcu read lock, so these spots must use rcu_dereference_rtnl() or plain rcu_derefence() (plus unconditional rcu read lock). This does the latter. Fixes: 394f51abb3d04f ("ipv4: route: set ipv4 RTM_GETROUTE to not use rtnl") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03cxgb4: fix error return code in cxgb4_set_hash_filter()Wei Yongjun
Fix to return a negative error code from thecxgb4_alloc_atid() error handling case instead of 0. Fixes: 12b276fbf6e0 ("cxgb4: add support to create hash filters") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-By: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03bpf: fix out-of-bounds access warning in bpf_checkArnd Bergmann
The bpf_verifer_ops array is generated dynamically and may be empty depending on configuration, which then causes an out of bounds access: kernel/bpf/verifier.c: In function 'bpf_check': kernel/bpf/verifier.c:4320:29: error: array subscript is above array bounds [-Werror=array-bounds] This adds a check to the start of the function as a workaround. I would assume that the function is never called in that configuration, so the warning is probably harmless. Fixes: 00176a34d9e2 ("bpf: remove the verifier ops from program structure") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03bpf: fix link error without CONFIG_NETArnd Bergmann
I ran into this link error with the latest net-next plus linux-next trees when networking is disabled: kernel/bpf/verifier.o:(.rodata+0x2958): undefined reference to `tc_cls_act_analyzer_ops' kernel/bpf/verifier.o:(.rodata+0x2970): undefined reference to `xdp_analyzer_ops' It seems that the code was written to deal with varying contents of the arrray, but the actual #ifdef was missing. Both tc_cls_act_analyzer_ops and xdp_analyzer_ops are defined in the core networking code, so adding a check for CONFIG_NET seems appropriate here, and I've verified this with many randconfig builds Fixes: 4f9218aaf8a4 ("bpf: move knowledge about post-translation offsets out of verifier") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03net: Define eth_stp_addr in linux/etherdevice.hEgil Hjelmeland
The lan9303 driver defines eth_stp_addr as a synonym to eth_reserved_addr_base to get the STP ethernet address 01:80:c2:00:00:00. eth_reserved_addr_base is also used to define the start of Bridge Reserved ethernet address range, which happen to be the STP address. br_dev_setup refer to eth_reserved_addr_base as a definition of STP address. Clean up by: - Move the eth_stp_addr definition to linux/etherdevice.h - Use eth_stp_addr instead of eth_reserved_addr_base in br_dev_setup. Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03stmmac: use of_property_read_u32 instead of read_u8Bhadram Varka
Numbers in DT are stored in “cells” which are 32-bits in size. of_property_read_u8 does not work properly because of endianness problem. This causes it to always return 0 with little-endian architectures. Fix it by using of_property_read_u32() OF API. Signed-off-by: Bhadram Varka <vbhadram@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03liquidio: bump up driver version to 1.7.0 to match newer NIC firmwareFelix Manlunas
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Acked-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03openrisc: fix possible deadlock scenario during timer syncStafford Horne
OpenRISC borrows its timer sync logic from MIPS, Matt helped to review the OpenRISC implementation and noted that we may suffer the same deadlock case that MIPS has faced. The case being: "the MIPS timer synchronization code contained the possibility of deadlock. If you mark a CPU online before it goes into the synchronize loop, then the boot CPU can schedule a different thread and send IPIs to all "online" CPUs. It gets stuck waiting for the secondary to ack it's IPI, since this secondary CPU has not enabled IRQs yet, and is stuck waiting for the master to synchronise with it. The system then deadlocks." Fix this by moving set_cpu_online() to after timer sync. Reported-by: Matt Redfearn <matt.redfearn@mips.com> Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03openrisc: pass endianness info to sparseLuc Van Oostenryck
openrisc is big-endian only but sparse assumes the same endianness as the building machine. This is problematic for code which expect __BYTE_ORDER__ being correctly predefined by the compiler which sparse can then pre-process differently from what gcc would, depending on the building machine endianness. Fix this by letting sparse know about the architecture endianness. To: Jonas Bonn <jonas@southpole.se> To: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> To: Stafford Horne <shorne@gmail.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03openrisc: add tick timer multi-core sync logicStafford Horne
In case timers are not in sync when cpus start (i.e. hot plug / offset resets) we need to synchronize the secondary cpus internal timer with the main cpu. This is needed as in OpenRISC SMP there is only one clocksource registered which reads from the same ttcr register on each cpu. This synchronization routine heavily borrows from mips implementation that does something similar. Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03openrisc: enable LOCKDEP_SUPPORT and irqflags tracingStafford Horne
Lockdep is needed for proving the spinlocks and rwlocks work fine on our platform. It also requires calling the trace_hardirqs_off() and trace_hardirqs_on() pair of routines when entering and exiting an interrupt. For OpenRISC the interrupt stack frame does not support frame pointers, so to call trace_hardirqs_on() and trace_hardirqs_off() here the macro's build up a stack frame each time. There is one necessary small change in _sys_call_handler to move interrupt enabling later so they can get re-enabled during syscall restart. This was done to fix lockdep warnings that are now possible due to this patch. Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03openrisc: support framepointers and STACKTRACE_SUPPORTStafford Horne
For lockdep support a reliable stack trace mechanism is needed. This patch adds support in OpenRISC for the stacktrace framework, implemented by a simple unwinder api. The unwinder api supports both framepointer and basic stack tracing. The unwinder is now used to replace the stack_dump() implementation as well. The new traces are inline with other architectures trace format: Call trace: [<c0004448>] show_stack+0x3c/0x58 [<c031c940>] dump_stack+0xa8/0xe4 [<c0008104>] __cpu_up+0x64/0x130 [<c000d268>] bringup_cpu+0x3c/0x178 [<c000d038>] cpuhp_invoke_callback+0xa8/0x1fc [<c000d680>] cpuhp_up_callbacks+0x44/0x14c [<c000e400>] cpu_up+0x14c/0x1bc [<c041da60>] smp_init+0x104/0x15c [<c033843c>] ? kernel_init+0x0/0x140 [<c0415e04>] kernel_init_freeable+0xbc/0x25c [<c033843c>] ? kernel_init+0x0/0x140 [<c0338458>] kernel_init+0x1c/0x140 [<c003a174>] ? schedule_tail+0x18/0xa0 [<c0006b80>] ret_from_fork+0x1c/0x9c Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03openrisc: add simple_smp dts and defconfig for simulatorsStefan Kristiansson
Simple enough to be compatible with simulation environments, such as verilated systems, QEMU and other targets supporting OpenRISC SMP. This also supports our base FPGA SoC's if the cpu frequency is upped to 50Mhz. Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> [shorne@gmail.com: Added defconfig] Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03openrisc: add cacheflush support to fix icache aliasingJan Henrik Weinstock
On OpenRISC the icache does not snoop data stores. This can cause aliasing as reported by Jan. This patch fixes the issue to ensure icache is properly synchronized when code is written to memory. It supports both SMP and UP flushing. This supports dcache flush as well for architectures that do not support write-through caches; most OpenRISC implementations do implement write-through cache however. Dcache flushes are done only on a single core as OpenRISC dcaches all support snooping of bus stores. Signed-off-by: Jan Henrik Weinstock <jan.weinstock@ice.rwth-aachen.de> [shorne@gmail.com: Squashed patches and wrote commit message] Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03openrisc: sleep instead of spin on secondary waitStafford Horne
Currently we do a spin on secondary cpus when waiting to boot. This theoretically causes issues with power consumption and does cause issues with qemu cycle burning (it starves cpu 0 from actually being able to boot.) This change puts each secondary cpu to sleep if they have a power management unit, then signals them to wake via IPI when its time to boot. If the cpus have no power management unit they will loop as before. Note: The wakeup IPI requires a special interrupt handler as on secondary cpu's the interrupt infrastructure is not yet established. This interrupt handler is set and reset by updating SPR_EVBAR. Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03openrisc: fix initial preempt state for secondary cpu tasksStafford Horne
During SMP testing we were getting the below warning after booting the secondary cpu: [ 0.060000] BUG: scheduling while atomic: swapper/1/0/0x00000000 This change follows similar patterns from other architectures to start the schduler with preempt disabled. Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03openrisc: initial SMP supportStefan Kristiansson
This patch introduces the SMP support for the OpenRISC architecture. The SMP architecture requires cores which have multi-core features which have been introduced a few years back including: - New SPRS SPR_COREID SPR_NUMCORES - Shadow SPRs - Atomic Instructions - Cache Coherency - A wired in IPI controller This patch adds all of the SMP specific changes to core infrastructure, it looks big but it needs to go all together as its hard to split this one up. Boot loader spinning of second cpu is not supported yet, it's assumed that Linux is booted straight after cpu reset. The bulk of these changes are trivial changes to refactor to use per cpu data structures throughout. The addition of the smp.c and changes in time.c are the changes. Some specific notes: MM changes ---------- The reason why this is created as an array, and not with DEFINE_PER_CPU is that doing it this way, we'll save a load in the tlb-miss handler (the load from __per_cpu_offset). TLB Flush --------- The SMP implementation of flush_tlb_* works by sending out a function-call IPI to all the non-local cpus by using the generic on_each_cpu() function. Currently, all flush_tlb_* functions will result in a flush_tlb_all(), which has always been the behaviour in the UP case. CPU INFO -------- This creates a per cpu cpuinfo struct and fills it out accordingly for each activated cpu. show_cpuinfo is also updated to reflect new version information in later versions of the spec. SMP API ------- This imitates the arm64 implementation by having a smp_cross_call callback that can be set by set_smp_cross_call to initiate an IPI and a handle_IPI function that is expected to be called from an IPI irqchip driver. Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> [shorne@gmail.com: added cpu stop, checkpatch fixes, wrote commit message] Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03irqchip: add initial support for ompicStafford Horne
IPI driver for the Open Multi-Processor Interrupt Controller (ompic) as described in the Multi-core support section of the OpenRISC 1.2 architecture specification: https://github.com/openrisc/doc/raw/master/openrisc-arch-1.2-rev0.pdf Each OpenRISC core contains a full interrupt controller which is used in the SMP architecture for interrupt balancing. This IPI device, the ompic, is the only external device required for enabling SMP on OpenRISC. Pending ops are stored in a memory bit mask which can allow multiple pending operations to be set and serviced at a time. This is mostly borrowed from the alpha IPI implementation. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> [shorne@gmail.com: converted ops to bitmask, wrote commit message] Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03dt-bindings: add openrisc to vendor prefixes listStafford Horne
Add OpenRISC.io to vendor prefixes. This is reserved for softcores developed by the OpenRISC community. The OpenRISC community has separated from OpenCores.org requiring a new prefix. Reviewed-by: Andreas Färber <afaerber@suse.de> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03openrisc: use qspinlocks and qrwlocksStafford Horne
Enable OpenRISC to use qspinlocks and qrwlocks for upcoming SMP support. Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03openrisc: add 1 and 2 byte cmpxchg supportStafford Horne
OpenRISC only supports hardware instructions that perform 4 byte atomic operations. For enabling qrwlocks for upcoming SMP support 1 and 2 byte implementations are needed. To do this we leverage the 4 byte atomic operations and shift/mask the 1 and 2 byte areas as needed. This heavily borrows ideas and routines from sh and mips, which do something similar. Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03openrisc: use shadow registers to save regs on exceptionStefan Kristiansson
Previously, the area between 0x0-0x100 have been used as a "scratch" memory area to temporarily store regs during exception entry. In a multi-core environment, this will not work. This change is to use shadow registers for nested context. Currently only the "critical" temp load/stores are covered, the EMERGENCY_PRINT ones are left as is (when they are used, it's game over anyway), they need to be handled as well in the future. Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03dt-bindings: openrisc: Add OpenRISC platform SoCStafford Horne
Add devicetree binding documentation for the OpenRISC platform opencores,or1ksim. This is the main OpenRISC reference platform supporting multiple FPGA SoC's. This format is based on some of the mips binding docs as we have similar requirements. Also, update maintainers so openrisc related binding changes are visible to the openrisc team. Acked-by: Rob Herring <robh@kernel.org> Suggested-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03Merge branch 'net-sched-use-after-free'David S. Miller
Cong Wang says: ==================== net_sched: fix a use-after-free for tc actions This patchset fixes a use-after-free reported by Lucas and closes potential races too. Please see each patch for details. ==================== Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03net_sched: hold netns refcnt for each actionCong Wang
TC actions have been destroyed asynchronously for a long time, previously in a RCU callback and now in a workqueue. If we don't hold a refcnt for its netns, we could use the per netns data structure, struct tcf_idrinfo, after it has been freed by netns workqueue. Hold refcnt to ensure netns destroy happens after all actions are gone. Fixes: ddf97ccdd7cb ("net_sched: add network namespace support for tc actions") Reported-by: Lucas Bates <lucasb@mojatatu.com> Tested-by: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03net_sched: acquire RTNL in tc_action_net_exit()Cong Wang
I forgot to acquire RTNL in tc_action_net_exit() which leads that action ops->cleanup() is not always called with RTNL. This usually is not a big deal because this function is called after all netns refcnt are gone, but given RTNL protects more than just actions, add it for safety and consistency. Also add an assertion to catch other potential bugs. Fixes: ddf97ccdd7cb ("net_sched: add network namespace support for tc actions") Reported-by: Lucas Bates <lucasb@mojatatu.com> Tested-by: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03tcp: add tracepoint trace_tcp_retransmit_synack()Song Liu
This tracepoint can be used to trace synack retransmits. It maintains pointer to struct request_sock. We cannot simply reuse trace_tcp_retransmit_skb() here, because the sk here is the LISTEN socket. The IP addresses and ports should be extracted from struct request_sock. Note that, like many other tracepoints, this patch uses IS_ENABLED in TP_fast_assign macro, which triggers sparse warning like: ./include/trace/events/tcp.h:274:1: error: directive in argument list ./include/trace/events/tcp.h:281:1: error: directive in argument list However, there is no good solution to avoid these warnings. To the best of our knowledge, these warnings are harmless. Signed-off-by: Song Liu <songliubraving@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03ipv6: Implement limits on Hop-by-Hop and Destination optionsTom Herbert
RFC 8200 (IPv6) defines Hop-by-Hop options and Destination options extension headers. Both of these carry a list of TLVs which is only limited by the maximum length of the extension header (2048 bytes). By the spec a host must process all the TLVs in these options, however these could be used as a fairly obvious denial of service attack. I think this could in fact be a significant DOS vector on the Internet, one mitigating factor might be that many FWs drop all packets with EH (and obviously this is only IPv6) so an Internet wide attack might not be so effective (yet!). By my calculation, the worse case packet with TLVs in a standard 1500 byte MTU packet that would be processed by the stack contains 1282 invidual TLVs (including pad TLVS) or 724 two byte TLVs. I wrote a quick test program that floods a whole bunch of these packets to a host and sure enough there is substantial time spent in ip6_parse_tlv. These packets contain nothing but unknown TLVS (that are ignored), TLV padding, and bogus UDP header with zero payload length. 25.38% [kernel] [k] __fib6_clean_all 21.63% [kernel] [k] ip6_parse_tlv 4.21% [kernel] [k] __local_bh_enable_ip 2.18% [kernel] [k] ip6_pol_route.isra.39 1.98% [kernel] [k] fib6_walk_continue 1.88% [kernel] [k] _raw_write_lock_bh 1.65% [kernel] [k] dst_release This patch adds configurable limits to Destination and Hop-by-Hop options. There are three limits that may be set: - Limit the number of options in a Hop-by-Hop or Destination options extension header. - Limit the byte length of a Hop-by-Hop or Destination options extension header. - Disallow unrecognized options in a Hop-by-Hop or Destination options extension header. The limits are set in corresponding sysctls: ipv6.sysctl.max_dst_opts_cnt ipv6.sysctl.max_hbh_opts_cnt ipv6.sysctl.max_dst_opts_len ipv6.sysctl.max_hbh_opts_len If a max_*_opts_cnt is less than zero then unknown TLVs are disallowed. The number of known TLVs that are allowed is the absolute value of this number. If a limit is exceeded when processing an extension header the packet is dropped. Default values are set to 8 for options counts, and set to INT_MAX for maximum length. Note the choice to limit options to 8 is an arbitrary guess (roughly based on the fact that the stack supports three HBH options and just one destination option). These limits have being proposed in draft-ietf-6man-rfc6434-bis. Tested (by Martin Lau) I tested out 1 thread (i.e. one raw_udp process). I changed the net.ipv6.max_dst_(opts|hbh)_number between 8 to 2048. With sysctls setting to 2048, the softirq% is packed to 100%. With 8, the softirq% is almost unnoticable from mpstat. v2; - Code and documention cleanup. - Change references of RFC2460 to be RFC8200. - Add reference to RFC6434-bis where the limits will be in standard. Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03Merge tag 'timers-conversion-next3' of ↵Thomas Gleixner
https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux into timers/core Pull the 3rd batch of timer conversions from Kees Cook: - various per-architecture conversions - several driver conversions not picked up by a specific maintainer - other Acked/Reviewed conversions to go through tip
2017-11-02drivers/sgi-xp: Convert timers to use timer_setup()Kees Cook
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Cliff Whickman <cpw@sgi.com> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Robin Holt <robinmholt@gmail.com>
2017-11-02drivers/pcmcia: Convert timers to use timer_setup()Kees Cook
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: bcm-kernel-feedback-list@broadcom.com Cc: David Howells <dhowells@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-pcmcia@lists.infradead.org Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> # for soc_common.c
2017-11-02drivers/memstick: Convert timers to use timer_setup()Kees Cook
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Maxim Levitsky <maximlevitsky@gmail.com> Cc: Alex Dubov <oakad@yahoo.com> Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-02drivers/macintosh: Convert timers to use timer_setup()Kees Cook
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Kees Cook <keescook@chromium.org>