summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-02-01x86_64, entry: Remove the syscall exit audit and schedule optimizationsAndy Lutomirski
We used to optimize rescheduling and audit on syscall exit. Now that the full slow path is reasonably fast, remove these optimizations. Syscall exit auditing is now handled exclusively by syscall_trace_leave. This adds something like 10ns to the previously optimized paths on my computer, presumably due mostly to SAVE_REST / RESTORE_REST. I think that we should eventually replace both the syscall and non-paranoid interrupt exit slow paths with a pair of C functions along the lines of the syscall entry hooks. Link: http://lkml.kernel.org/r/22f2aa4a0361707a5cfb1de9d45260b39965dead.1421453410.git.luto@amacapital.net Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Andy Lutomirski <luto@amacapital.net>
2015-02-01x86_64, entry: Use sysret to return to userspace when possibleAndy Lutomirski
The x86_64 entry code currently jumps through complex and inconsistent hoops to try to minimize the impact of syscall exit work. For a true fast-path syscall, almost nothing needs to be done, so returning is just a check for exit work and sysret. For a full slow-path return from a syscall, the C exit hook is invoked if needed and we join the iret path. Using iret to return to userspace is very slow, so the entry code has accumulated various special cases to try to do certain forms of exit work without invoking iret. This is error-prone, since it duplicates assembly code paths, and it's dangerous, since sysret can malfunction in interesting ways if used carelessly. It's also inefficient, since a lot of useful cases aren't optimized and therefore force an iret out of a combination of paranoia and the fact that no one has bothered to write even more asm code to avoid it. I would argue that this approach is backwards. Rather than trying to avoid the iret path, we should instead try to make the iret path fast. Under a specific set of conditions, iret is unnecessary. In particular, if RIP==RCX, RFLAGS==R11, RIP is canonical, RF is not set, and both SS and CS are as expected, then movq 32(%rsp),%rsp;sysret does the same thing as iret. This set of conditions is nearly always satisfied on return from syscalls, and it can even occasionally be satisfied on return from an irq. Even with the careful checks for sysret applicability, this cuts nearly 80ns off of the overhead from syscalls with unoptimized exit work. This includes tracing and context tracking, and any return that invokes KVM's user return notifier. For example, the cost of getpid with CONFIG_CONTEXT_TRACKING_FORCE=y drops from ~360ns to ~280ns on my computer. This may allow the removal and even eventual conversion to C of a respectable amount of exit asm. This may require further tweaking to give the full benefit on Xen. It may be worthwhile to adjust signal delivery and exec to try hit the sysret path. This does not optimize returns to 32-bit userspace. Making the same optimization for CS == __USER32_CS is conceptually straightforward, but it will require some tedious code to handle the differences between sysretl and sysexitl. Link: http://lkml.kernel.org/r/71428f63e681e1b4aa1a781e3ef7c27f027d1103.1421453410.git.luto@amacapital.net Signed-off-by: Andy Lutomirski <luto@amacapital.net>
2015-02-01x86, traps: Fix ist_enter from userspaceAndy Lutomirski
context_tracking_user_exit() has no effect if in_interrupt() returns true, so ist_enter() didn't work. Fix it by calling exception_enter(), and thus context_tracking_user_exit(), before incrementing the preempt count. This also adds an assertion that will catch the problem reliably if CONFIG_PROVE_RCU=y to help prevent the bug from being reintroduced. Link: http://lkml.kernel.org/r/261ebee6aee55a4724746d0d7024697013c40a08.1422709102.git.luto@amacapital.net Fixes: 959274753857 x86, traps: Track entry into and exit from IST context Reported-and-tested-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net>
2015-02-01Bluetooth: Fix OOB data present for BR/EDR Secure Connections Only modeMarcel Holtmann
When using Secure Connections Only mode, then only P-256 OOB data is valid and should be provided. In case userspace provides P-192 and P-256 OOB data, then the P-192 values will be set to zero. However the present value of the IO capability exchange still mentioned that both values would be available. Fix this by telling the controller clearly that only the P-256 OOB data is present. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-02-01ALSA: line6: Remove unused line6_midibuf_skip_message()Chris Rorvick
Use of this function ended with commits 3e58c868db1d ("staging: line6: drop midi_mask_receive") and af89d2897a71 ("staging: line6: drop midi_mask_transmit".) [Removed the corresponding line in midibuf.h, too -- tiwai] Signed-off-by: Chris Rorvick <chris@rorvick.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2015-02-01ALSA: line6: Remove unused line6_midibuf_status()Chris Rorvick
This function has not been used since merging the driver into the kernel (and a good while before that.) [Removed the corresponding line in midibuf.h, too -- tiwai] Signed-off-by: Chris Rorvick <chris@rorvick.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2015-02-01Bluetooth: Expose remote OOB information as debugfs entryMarcel Holtmann
For debugging purposes it is good to know which OOB data is actually currently loaded for each controller. So expose that list via debugfs. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-02-01Bluetooth: Expose hardware error code as debugfs entryMarcel Holtmann
When the Hardware Error event is send by the controller, the Bluetooth core stores the error code. Expose it via debugfs so it can be retrieved later on. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-02-01Bluetooth: Expose debug keys usage setting via debugfsMarcel Holtmann
To allow easier debugging when debug keys are generated, provide debugfs entry for checking the setting of debug keys usage. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-02-01Bluetooth: Track changes from HCI Write Simple Pairing Debug Mode commandMarcel Holtmann
When the HCI Write Simple Pairing Debug Mode command has been issued, the result needs to be tracked and stored. The hdev->ssp_debug_mode variable is already present, but was never updated when the mode in the controller was actually changed. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-02-01Bluetooth: Expose Secure Simple Pairing debug mode setting in debugfsMarcel Holtmann
The value of the ssp_debug_mode should be accessible via debugfs to be able to determine if a BR/EDR controller generates debugs keys or not. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-01-31net: sched: fix panic in rate estimatorsEric Dumazet
Doing the following commands on a non idle network device panics the box instantly, because cpu_bstats gets overwritten by stats. tc qdisc add dev eth0 root <your_favorite_qdisc> ... some traffic (one packet is enough) ... tc qdisc replace dev eth0 root est 1sec 4sec <your_favorite_qdisc> [ 325.355596] BUG: unable to handle kernel paging request at ffff8841dc5a074c [ 325.362609] IP: [<ffffffff81541c9e>] __gnet_stats_copy_basic+0x3e/0x90 [ 325.369158] PGD 1fa7067 PUD 0 [ 325.372254] Oops: 0000 [#1] SMP [ 325.375514] Modules linked in: ... [ 325.398346] CPU: 13 PID: 14313 Comm: tc Not tainted 3.19.0-smp-DEV #1163 [ 325.412042] task: ffff8800793ab5d0 ti: ffff881ff2fa4000 task.ti: ffff881ff2fa4000 [ 325.419518] RIP: 0010:[<ffffffff81541c9e>] [<ffffffff81541c9e>] __gnet_stats_copy_basic+0x3e/0x90 [ 325.428506] RSP: 0018:ffff881ff2fa7928 EFLAGS: 00010286 [ 325.433824] RAX: 000000000000000c RBX: ffff881ff2fa796c RCX: 000000000000000c [ 325.440988] RDX: ffff8841dc5a0744 RSI: 0000000000000060 RDI: 0000000000000060 [ 325.448120] RBP: ffff881ff2fa7948 R08: ffffffff81cd4f80 R09: 0000000000000000 [ 325.455268] R10: ffff883ff223e400 R11: 0000000000000000 R12: 000000015cba0744 [ 325.462405] R13: ffffffff81cd4f80 R14: ffff883ff223e460 R15: ffff883feea0722c [ 325.469536] FS: 00007f2ee30fa700(0000) GS:ffff88407fa20000(0000) knlGS:0000000000000000 [ 325.477630] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 325.483380] CR2: ffff8841dc5a074c CR3: 0000003feeae9000 CR4: 00000000001407e0 [ 325.490510] Stack: [ 325.492524] ffff883feea0722c ffff883fef719dc0 ffff883feea0722c ffff883ff223e4a0 [ 325.499990] ffff881ff2fa79a8 ffffffff815424ee ffff883ff223e49c 000000015cba0744 [ 325.507460] 00000000f2fa7978 0000000000000000 ffff881ff2fa79a8 ffff883ff223e4a0 [ 325.514956] Call Trace: [ 325.517412] [<ffffffff815424ee>] gen_new_estimator+0x8e/0x230 [ 325.523250] [<ffffffff815427aa>] gen_replace_estimator+0x4a/0x60 [ 325.529349] [<ffffffff815718ab>] tc_modify_qdisc+0x52b/0x590 [ 325.535117] [<ffffffff8155edd0>] rtnetlink_rcv_msg+0xa0/0x240 [ 325.540963] [<ffffffff8155ed30>] ? __rtnl_unlock+0x20/0x20 [ 325.546532] [<ffffffff8157f811>] netlink_rcv_skb+0xb1/0xc0 [ 325.552145] [<ffffffff8155b355>] rtnetlink_rcv+0x25/0x40 [ 325.557558] [<ffffffff8157f0d8>] netlink_unicast+0x168/0x220 [ 325.563317] [<ffffffff8157f47c>] netlink_sendmsg+0x2ec/0x3e0 Lets play safe and not use an union : percpu 'pointers' are mostly read anyway, and we have typically few qdiscs per host. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: John Fastabend <john.fastabend@gmail.com> Fixes: 22e0f8b9322c ("net: sched: make bstats per cpu and estimator RCU safe") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-31ipv4: icmp: use percpu allocationEric Dumazet
Get rid of nr_cpu_ids and use modern percpu allocation. Note that the sockets themselves are not yet allocated using NUMA affinity. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-31drivers: net: cpsw: make cpsw_ale.c a module to allow re-use on KeystoneKaricheri, Muralidharan
NetCP on Keystone has cpsw ale function similar to other TI SoCs and this driver is re-used. To allow both ti cpsw and keystone netcp to re-use the driver, convert the cpsw ale to a module and configure it through Kconfig option CONFIG_TI_CPSW_ALE. Currently it is statically linked to both TI CPSW and NetCP and this causes issues when the above drivers are built as dynamic modules. This patch addresses this issue While at it, fix the Makefile and code to build both netcp_core and netcp_ethss as dynamic modules. This is needed to support arm allmodconfig. This also requires exporting of API calls provided by netcp_core so that both the above can be dynamic modules. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Lad, Prabhakar <prabhakar.csengg@gmail.com> Acked-by: Mugunthan V N <mugunthanvnm@ti.com> Tested-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-31hyperv: Fix the error processing in netvsc_send()Haiyang Zhang
The existing code frees the skb in EAGAIN case, in which the skb will be retried from upper layer and used again. Also, the existing code doesn't free send buffer slot in error case, because there is no completion message for unsent packets. This patch fixes these problems. (Please also include this patch for stable trees. Thanks!) Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-31tcp: use SACK RTTs for CCKenneth Klette Jonassen
Current behavior only passes RTTs from sequentially acked data to CC. If sender gets a combined ACK for segment 1 and SACK for segment 3, then the computed RTT for CC is the time between sending segment 1 and receiving SACK for segment 3. Pass the minimum computed RTT from any acked data to CC, i.e. time between sending segment 3 and receiving SACK for segment 3. Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-31Bluetooth: Allow remote OOB data to only provide P-192 or P-256 valuesMarcel Holtmann
In case the remote only provided P-192 or P-256 data for OOB pairing, then make sure that the data value pointers are correctly set. That way the core can provide correct information when remote OOB data present information have to be communicated. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-01-31Bluetooth: Fix OOB data present value for SMP pairingMarcel Holtmann
Before setting the OOB data present flag with SMP pairing, check the newly introduced present tracking that actual OOB data values have been provided. The existence of remote OOB data structure does not actually mean that the correct data values are available. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-01-31Bluetooth: Fix OOB data present value for BR/EDR Secure ConnectionsMarcel Holtmann
When BR/EDR Secure Connections has been enabled, the OOB data present value can take 2 additional values. The host has to clearly provide details about if P-192 OOB data, P-256 OOB data or a combination of P-192 and P-256 OOB data is present. In case BR/EDR Secure Connections is not enabled or not supported, then check that P-192 OOB data is actually present and return the correct value based on that. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-01-31Merge branch 'i2c/for-current' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "i2c driver bugfixes (s3c2410, slave-eeprom, sh_mobile), size regression "bugfix" (i2c slave), documentation bugfix (st). Also, one documentation update (da9063), so some devicetrees can now be verified" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: sh_mobile: terminate DMA reads properly i2c: Only include slave support if selected i2c: s3c2410: fix ABBA deadlock by keeping clock prepared i2c: slave-eeprom: fix boundary check when using sysfs i2c: st: Rename clock reference to something that exists DT: i2c: Add devices handled by the da9063 MFD driver
2015-01-31m68k/defconfig: Enable Ethernet bridgingGeert Uytterhoeven
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2015-01-31m68k/defconfig: Enable Atari EtherNAT and EtherNEC Ethernet supportGeert Uytterhoeven
Enable support for Atari EtherNAT (SMC91X) and EtherNEC (NE2000) Ethernet support in the Atari and multiplatform defconfig files. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2015-01-31m68k/defconfig: Enable automounting of devtmpfs at /devGeert Uytterhoeven
Enable CONFIG_DEVTMPFS_MOUNT, as it's useful for initrd-less kernels. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2015-01-31m68k/defconfig: Enable early printk supportGeert Uytterhoeven
Enable CONFIG_EARLY_PRINTK on all platforms where it's available (all but Sun-3) and not yet enabled. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2015-01-31m68k/defconfig: Enable test modulesGeert Uytterhoeven
It doesn't hurt to have CONFIG_TEST_* enabled as modules. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2015-01-31m68k/defconfig: Refresh defconfigs for v3.16-rc1--v3.19-rc2Geert Uytterhoeven
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2015-01-31Bluetooth: Store OOB data present value for each set of remote OOB dataMarcel Holtmann
Instead of doing complex calculation every time the OOB data is used, just calculate the OOB data present value and store it with the OOB data raw values. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-01-31Bluetooth: Set HCI_QUIRK_STRICT_DUPLICATE_FILTER for BTUSB_INTELJakub Pawlowski
The Bluetooth controllers from Intel use a strict scanning filter policy that filters based on Bluetooth device addresses and not on RSSI. So tell the core about this. Signed-off-by: Jakub Pawlowski <jpawlowski@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-01-30Merge tag 'char-misc-3.19-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are two tiny patches, one fixing up the drivers/Kconfig file, and one adding a MAINTAINERS entry for the UIO git tree" * tag 'char-misc-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: drivers/Kconfig: remove duplicate entry for soc MAINTAINERS: add git url entry for UIO
2015-01-30Merge tag 'staging-3.19-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging tree fixes from Greg KH: "Here are two tiny staging tree fixes. One for the nvec driver to resolve a reported problem, and one to add a MAINTAINERS entry for the Android drivers" * tag 'staging-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: MAINTAINERS: add Android driver entries staging: nvec: specify a platform-device base id
2015-01-30Merge tag 'usb-3.19-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small USB fixes and quirk additions for 3.19-rc7. All have been in linux-next for a while with no reported problems" * tag 'usb-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: Add OTG PET device to TPL usb-storage/SCSI: blacklist FUA on JMicron 152d:2566 USB-SATA controller uas: Add no-report-opcodes quirk for Simpletech devices with id 4971:8017 storage: Revise/fix quirk for 04E6:000F SCM USB-SCSI converter usb: phy: never defer probe in non-OF case usb: dwc2: call dwc2_is_controller_alive() under spinlock
2015-01-30drivers: net: xgene: fix: Out of order descriptor bytes readIyappan Subramanian
This patch fixes the following kernel crash, WARNING: CPU: 2 PID: 0 at net/ipv4/tcp_input.c:3079 tcp_clean_rtx_queue+0x658/0x80c() Call trace: [<fffffe0000096b7c>] dump_backtrace+0x0/0x184 [<fffffe0000096d10>] show_stack+0x10/0x1c [<fffffe0000685ea0>] dump_stack+0x74/0x98 [<fffffe00000b44e0>] warn_slowpath_common+0x88/0xb0 [<fffffe00000b461c>] warn_slowpath_null+0x14/0x20 [<fffffe00005b5c1c>] tcp_clean_rtx_queue+0x654/0x80c [<fffffe00005b6228>] tcp_ack+0x454/0x688 [<fffffe00005b6ca8>] tcp_rcv_established+0x4a4/0x62c [<fffffe00005bf4b4>] tcp_v4_do_rcv+0x16c/0x350 [<fffffe00005c225c>] tcp_v4_rcv+0x8e8/0x904 [<fffffe000059d470>] ip_local_deliver_finish+0x100/0x26c [<fffffe000059dad8>] ip_local_deliver+0xac/0xc4 [<fffffe000059d6c4>] ip_rcv_finish+0xe8/0x328 [<fffffe000059dd3c>] ip_rcv+0x24c/0x38c [<fffffe0000563950>] __netif_receive_skb_core+0x29c/0x7c8 [<fffffe0000563ea4>] __netif_receive_skb+0x28/0x7c [<fffffe0000563f54>] netif_receive_skb_internal+0x5c/0xe0 [<fffffe0000564810>] napi_gro_receive+0xb4/0x110 [<fffffe0000482a2c>] xgene_enet_process_ring+0x144/0x338 [<fffffe0000482d18>] xgene_enet_napi+0x1c/0x50 [<fffffe0000565454>] net_rx_action+0x154/0x228 [<fffffe00000b804c>] __do_softirq+0x110/0x28c [<fffffe00000b8424>] irq_exit+0x8c/0xc0 [<fffffe0000093898>] handle_IRQ+0x44/0xa8 [<fffffe000009032c>] gic_handle_irq+0x38/0x7c [...] Software writes poison data into the descriptor bytes[15:8] and upon receiving the interrupt, if those bytes are overwritten by the hardware with the valid data, software also reads bytes[7:0] and executes receive/tx completion logic. If the CPU executes the above two reads in out of order fashion, then the bytes[7:0] will have older data and causing the kernel panic. We have to force the order of the reads and thus this patch introduces read memory barrier between these reads. Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Signed-off-by: Keyur Chudgar <kchudgar@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-30irda: use msecs_to_jiffies for conversionsNicholas Mc Guire
This is only an API consolidation and should make things more readable it replaces var * HZ / 1000 constructs by msecs_to_jiffies(var). Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-30rhashtable: Make selftest modularGeert Uytterhoeven
Allow the selftest on the resizable hash table to be built modular, just like all other tests that do not depend on DEBUG_KERNEL. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-30Merge branch 'vlan_get_protocol'David S. Miller
Toshiaki Makita says: ==================== Fix checksum error when using stacked vlan When I was testing 802.1ad, I found several drivers don't take into account 802.1ad or multiple vlans when retrieving L3 (IP/IPv6) or L4 (TCP/UDP) protocol for checksum offload. It is mainly due to vlan_get_protocol(), which extracts ether type only when it is tagged with single 802.1Q. When 802.1ad is used or there are multiple vlans, it extracts vlan protocol and drivers cannot determine which L3/L4 protocol is used. Those drivers, most of which have IP_CSUM/IPV6_CSUM features, get L3/L4 header-offset by software, so it seems that their checksum offload works with multiple vlans if we can parse protocols correctly. (They know mac header length, and probably don't care about what is in it.) And another thing, some of Intel's drivers seem to use skb->protocol where vlan_get_protocol() is more suitable. I tested that at least igb/igbvf on I350 works with this patch set. Note: We can hand a double tagged packet with CHECKSUM_PARTIAL to a HW driver by creating a vlan device on a bridge device and enabling vlan_filtering of the bridge with 802.1ad protocol. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-30ixgbevf: Fix checksum error when using stacked vlanToshiaki Makita
When a skb has multiple vlans and it is CHECKSUM_PARTIAL, ixgbevf_tx_csum() fails to get the network protocol and checksum related descriptor fields are not configured correctly because skb->protocol doesn't show the L3 protocol in this case. Use first->protocol instead of skb->protocol to get the proper network protocol. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-30ixgbe: Fix checksum error when using stacked vlanToshiaki Makita
When a skb has multiple vlans and it is CHECKSUM_PARTIAL, ixgbe_tx_csum() fails to get the network protocol and checksum related descriptor fields are not configured correctly because skb->protocol doesn't show the L3 protocol in this case. Use vlan_get_protocol() to get the proper network protocol. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-30igbvf: Fix checksum error when using stacked vlanToshiaki Makita
When a skb has multiple vlans and it is CHECKSUM_PARTIAL, igbvf_tx_csum() fails to get the network protocol and checksum related descriptor fields are not configured correctly because skb->protocol doesn't show the L3 protocol in this case. Use vlan_get_protocol() to get the proper network protocol. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-30net: Fix vlan_get_protocol for stacked vlanToshiaki Makita
vlan_get_protocol() could not get network protocol if a skb has a 802.1ad vlan tag or multiple vlans, which caused incorrect checksum calculation in several drivers. Fix vlan_get_protocol() to retrieve network protocol instead of incorrect vlan protocol. As the logic is the same as skb_network_protocol(), create a common helper function __vlan_get_protocol() and call it from existing functions. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-30net: mark some potential candidates __read_mostlyDaniel Borkmann
They are all either written once or extremly rarely (e.g. from init code), so we can move them to the .data..read_mostly section. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-30net: sctp: fix passing wrong parameter header to param_type2af in ↵Saran Maruti Ramanara
sctp_process_param When making use of RFC5061, section 4.2.4. for setting the primary IP address, we're passing a wrong parameter header to param_type2af(), resulting always in NULL being returned. At this point, param.p points to a sctp_addip_param struct, containing a sctp_paramhdr (type = 0xc004, length = var), and crr_id as a correlation id. Followed by that, as also presented in RFC5061 section 4.2.4., comes the actual sctp_addr_param, which also contains a sctp_paramhdr, but this time with the correct type SCTP_PARAM_IPV{4,6}_ADDRESS that param_type2af() can make use of. Since we already hold a pointer to addr_param from previous line, just reuse it for param_type2af(). Fixes: d6de3097592b ("[SCTP]: Add the handling of "Set Primary IP Address" parameter to INIT") Signed-off-by: Saran Maruti Ramanara <saran.neti@telus.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-30netlink: fix wrong subscription bitmask to group mapping inPablo Neira
The subscription bitmask passed via struct sockaddr_nl is converted to the group number when calling the netlink_bind() and netlink_unbind() callbacks. The conversion is however incorrect since bitmask (1 << 0) needs to be mapped to group number 1. Note that you cannot specify the group number 0 (usually known as _NONE) from setsockopt() using NETLINK_ADD_MEMBERSHIP since this is rejected through -EINVAL. This problem became noticeable since 97840cb ("netfilter: nfnetlink: fix insufficient validation in nfnetlink_bind") when binding to bitmask (1 << 0) in ctnetlink. Reported-by: Andre Tomt <andre@tomt.net> Reported-by: Ivan Delalande <colona@arista.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-30Merge branch 'cpsw_macid'David S. Miller
Tony Lindgren says: ==================== Changes to cpsw and davinci_emac for getting MAC address Here are a few patches to add common code for cpsw and davinci_emac for getting the MAC address. Looks like we can also now add code to get the MAC address on 3517 but in a slightly different way. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-30net: davinci_emac: Get device MAC on 3517Tony Lindgren
Looks like on 3517 davinci_emac MAC address registers have a different layout compared to dm816x and am33xx. Let's add a function to get the 3517 MAC address. Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-30net: davinci_emac: Get device dm816x MAC address using the cpsw codeTony Lindgren
At least on dm81xx, we can get the davinci_emac MAC address the same way as on am33xx cpsw. Let's also use ether_addr_copy() for davinci_emac while at it. Cc: Brian Hutchinson <b.hutchman@gmail.com> Cc: Felipe Balbi <balbi@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-30net: cpsw: Add a minimal cpsw-common module for shared codeTony Lindgren
Looks like davinci_emac and cpsw can share some code although the device registers have a different layout. At least the code for getting the MAC address using syscon can be shared by passing the register offset. Let's start with that and set up a minimal shared cpsw-shared.c. Cc: Brian Hutchinson <b.hutchman@gmail.com> Cc: Felipe Balbi <balbi@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-31MIPS: fork: Fix MSA/FPU/DSP context duplication raceJames Hogan
There is a race in the MIPS fork code which allows the child to get a stale copy of parent MSA/FPU/DSP state that is active in hardware registers when the fork() is called. This is because copy_thread() saves the live register state into the child context only if the hardware is currently in use, apparently on the assumption that the hardware state cannot have been saved and disabled since the initial duplication of the task_struct. However preemption is certainly possible during this window. An example sequence of events is as follows: 1) The parent userland process puts important data into saved floating point registers ($f20-$f31), which are then dirty compared to the process' stored context. 2) The parent process calls fork() which does a clone system call. 3) In the kernel, do_fork() -> copy_process() -> dup_task_struct() -> arch_dup_task_struct() (which uses the weakly defined default implementation). This duplicates the parent process' task context, which includes a stale version of its FP context from when it was last saved, probably some time before (1). 4) At some point before copy_process() calls copy_thread(), such as when duplicating the memory map, the process is desceduled. Perhaps it is preempted asynchronously, or perhaps it sleeps while blocked on a mutex. The dirty FP state in the FP registers is saved to the parent process' context and the FPU is disabled. 5) When the process is rescheduled again it continues copying state until it gets to copy_thread(), which checks whether the FPU is in use, so that it can copy that dirty state to the child process' task context. Because of the deschedule however the FPU is not in use, so the child process' context is left with stale FP context from the last time the parent saved it (some time before (1)). 6) When the new child process is scheduled it reads the important data from the saved floating point register, and ends up doing a NULL pointer dereference as a result of the stale data. This use of saved floating point registers across function calls can be triggered fairly easily by explicitly using inline asm with a current (MIPS R2) compiler, but is far more likely to happen unintentionally with a MIPS R6 compiler where the FP registers are more likely to get used as scratch registers for storing non-fp data. It is easily fixed, in the same way that other architectures do it, by overriding the implementation of arch_dup_task_struct() to sync the dirty hardware state to the parent process' task context *prior* to duplicating it, rather than copying straight to the child process' task context in copy_thread(). Note, the FPU hardware is not disabled so the parent process may continue executing with the live register context, but now the child process is guaranteed to have an identical copy of it at that point. Signed-off-by: James Hogan <james.hogan@imgtec.com> Reported-by: Matthew Fortune <matthew.fortune@imgtec.com> Tested-by: Markos Chandras <markos.chandras@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/9075/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-01-31MIPS: Fix C0_Pagegrain[IEC] support.David Daney
The following commits: 5890f70f15c52d (MIPS: Use dedicated exception handler if CPU supports RI/XI exceptions) 6575b1d4173eae (MIPS: kernel: cpu-probe: Detect unique RI/XI exceptions) break the kernel for *all* existing MIPS CPUs that implement the CP0_PageGrain[IEC] bit. They cause the TLB exception handlers to be generated without the legacy execute-inhibit handling, but never set the CP0_PageGrain[IEC] bit to activate the use of dedicated exception vectors for execute-inhibit exceptions. The result is that upon detection of an execute-inhibit violation, we loop forever in the TLB exception handlers instead of sending SIGSEGV to the task. If we are generating TLB exception handlers expecting separate vectors, we must also enable the CP0_PageGrain[IEC] feature. The bug was introduced in kernel version 3.17. Signed-off-by: David Daney <david.daney@cavium.com> Cc: <stable@vger.kernel.org> Cc: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: http://patchwork.linux-mips.org/patch/8880/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-01-30Revert "IPoIB: Consolidate rtnl_lock tasks in workqueue"Roland Dreier
This reverts commit afe1de664ef3cb756e70938d99417dcbc6b1379a. The series of IPoIB bug fixes that went into 3.19-rc1 introduce regressions, and after trying to sort things out, we decided to revert to 3.18's IPoIB driver and get things right for 3.20. Signed-off-by: Roland Dreier <roland@purestorage.com>
2015-01-30Revert "IPoIB: Make the carrier_on_task race aware"Roland Dreier
This reverts commit 67d7209e1f481cbaed37f9a224a328a3f83d0482. The series of IPoIB bug fixes that went into 3.19-rc1 introduce regressions, and after trying to sort things out, we decided to revert to 3.18's IPoIB driver and get things right for 3.20. Signed-off-by: Roland Dreier <roland@purestorage.com>