Age | Commit message | Author
2016-07-20 | mlxsw: spectrum: Expose per-tc counters via ethtool | Ido Schimmel
Expose the transmit queue length of each traffic class and the number of unicast packets discarded due to insufficient room in the shared buffer. The first counter allows us to debug user priority to traffic class mapping, whereas the drop counter is useful when determining shared buffer configuration. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-20 | mlxsw: spectrum: Expose per-priority counters via ethtool | Ido Schimmel
Expose per-priority bytes / packets / PFC packets counters via ethtool. These counters are very useful when debugging QoS functionality and provide a better insight into the device's forwarding plane. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-20 | f2fs: handle error case with f2fs_bug_on | Jaegeuk Kim
It is enough to report the error case with a BUG or WARN via f2fs_bug_on. Then we do not need to leave the filesystem corrupted. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-20 | f2fs: avoid data race when deciding checkpoint in f2fs_sync_file | Jaegeuk Kim
When fs utilization is almost full, f2fs_sync_file should do a checkpoint if there is not enough space for a later roll-forward (i.e. space_for_roll_forward). Currently we take no lock for sbi->alloc_valid_block_count, resulting in a race condition. In rare cases, we can get -ENOSPC when doing roll-forward, which triggers the f2fs_bug_on in the following code in do_recover_data:

	if (is_valid_blkaddr(sbi, dest, META_POR)) {
		if (src == NULL_ADDR) {
			err = reserve_new_block(&dn);
			f2fs_bug_on(sbi, err);
			...
		}
		...
	}

So, this patch avoids that situation in advance.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-20 | f2fs: support an ioctl to move a range of data blocks | Jaegeuk Kim
This patch implements moving a range of data blocks from source file to destination file. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-20 | f2fs: fix to report error number of f2fs_find_entry | Chao Yu
This patch makes f2fs_find_entry report the correct error number to its caller. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-20 | net: cpmac: fix error handling of cpmac_probe() | Wei Yongjun
Add the missing free_netdev() before returning from cpmac_probe() in the error handling case. This patch reverts commit 0465be8f4f1d ("net: cpmac: fix in releasing resources"), which changed the code to call free_netdev() only when register_netdev() failed. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-20 | net/mlx5: Use PTR_ERR_OR_ZERO() to simplify the code | Wei Yongjun
Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR. Generated by: scripts/coccinelle/api/ptr_ret.cocci Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
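The simplification named in the net/mlx5 entry above can be illustrated outside the kernel. The block below is a minimal userspace sketch, assuming a re-creation of the error-pointer helpers (the definitions here are written for this example and are not copied from the kernel headers); `check_long` and `check_short` are hypothetical names for the before and after forms.

```c
#include <errno.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers.
 * Assumption of this sketch: errno values fit in the top 4095 addresses. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }

static inline int IS_ERR(const void *ptr)
{
	/* An error pointer lives in the last MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline int PTR_ERR_OR_ZERO(const void *ptr)
{
	return IS_ERR(ptr) ? (int)PTR_ERR(ptr) : 0;
}

/* Before: the open-coded pattern that the Coccinelle script looks for. */
static inline int check_long(const void *p)
{
	if (IS_ERR(p))
		return (int)PTR_ERR(p);
	return 0;
}

/* After: the same behaviour in a single call. */
static inline int check_short(const void *p)
{
	return PTR_ERR_OR_ZERO(p);
}
```

Both forms return 0 for a valid pointer and the negative errno for an error pointer, so the change is purely a readability improvement.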
2016-07-20 | net: ethernet: nb8800: fix error handling of nb8800_probe() | Wei Yongjun
In the ops->reset() error handling path, clk_disable_unprepare() is missing before returning from this function. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Mans Rullgard <mans@mansr.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-20 | wan/fsl_ucc_hdlc: use module_platform_driver to simplify the code | Wei Yongjun
module_platform_driver() makes the code simpler by eliminating boilerplate code. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-20 | wan/fsl_ucc_hdlc: remove .owner field for driver | Wei Yongjun
Remove .owner field if calls are used which set it automatically. Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-20 | net: axienet: Fix return value check in axienet_probe() | Wei Yongjun
In case of error, the function of_parse_phandle() returns NULL pointer not ERR_PTR(). The IS_ERR() test in the return value check should be replaced with NULL test. Fixes: 46aa27df8853 ('net: axienet: Use devm_* calls') Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-20 | Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next | David S. Miller
Johan Hedberg says:

====================
pull request: bluetooth-next 2016-07-19

Here's likely the last bluetooth-next pull request for the 4.8 kernel:

 - Fix for L2CAP setsockopt
 - Fix for is_suspending flag handling in btmrvl driver
 - Addition of Bluetooth HW & FW info fields to debugfs
 - Fix to use int instead of char for callback status

The last one (from Geert Uytterhoeven) is actually not purely a Bluetooth (or 802.15.4) patch, but it was agreed with other maintainers that we take it through the bluetooth-next tree.

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-20 | bpf, elf: add official ELF machine define for eBPF | Daniel Borkmann
Add the official BPF ELF e_machine value that was assigned recently [1,2] and will be propagated to glibc, et al. LLVM is switching to it in 3.9 release. [1] https://github.com/llvm-mirror/llvm/commit/36b9c09330bfb5e771914cfe307588f30d5510d2 [2] http://lists.iovisor.org/pipermail/iovisor-dev/2016-June/000266.html Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-20 | bpf: fix implicit declaration of bpf_prog_add | Brenden Blanco
For the ifndef case of CONFIG_BPF_SYSCALL, an inline version of bpf_prog_add needs to exist, otherwise the build breaks on some configs.

  drivers/net/ethernet/mellanox/mlx4/en_netdev.c:2544:10: error: implicit declaration of function 'bpf_prog_add'
         prog = bpf_prog_add(prog, priv->rx_ring_num - 1);

The function is introduced in 59d3656d5bf50 ("bpf: add bpf_prog_add api for bulk prog refcnt") and first used in 47f1afdba2b87 ("net/mlx4_en: add support for fast rx drop bpf program").

Fixes: 47f1afdba2b87 ("net/mlx4_en: add support for fast rx drop bpf program")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Reported-by: Tariq Toukan <ttoukan.linux@gmail.com>
Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
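The build fix in the bpf_prog_add entry above relies on a common pattern: when a feature is compiled out, a static inline stub keeps callers building. The following is a minimal sketch of that pattern under stated assumptions: `FEATURE_REFCNT_API`, `struct prog`, `prog_add` and `attach_to_rings` are hypothetical names chosen for this example, and the error code returned by the stub is this sketch's choice rather than a statement about the actual kernel stub.

```c
#include <errno.h>
#include <stddef.h>

struct prog {
	int refcnt;
};

/* FEATURE_REFCNT_API is a hypothetical config symbol for this sketch.
 * Leave it undefined to build the stub path. */
#ifdef FEATURE_REFCNT_API
int prog_add(struct prog *p, int n);	/* real implementation lives elsewhere */
#else
static inline int prog_add(struct prog *p, int n)
{
	/* Feature compiled out: report "not supported" instead of breaking the build. */
	(void)p;
	(void)n;
	return -EOPNOTSUPP;
}
#endif

/* A caller that compiles unchanged in both configurations. */
static inline int attach_to_rings(struct prog *p, int rx_ring_num)
{
	return prog_add(p, rx_ring_num - 1);
}
```

Because the stub has the same signature as the real function, the caller needs no conditional compilation of its own.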
2016-07-20 | Merge branch 'kdave-part1-enospc' into for-linus-4.8 | Chris Mason
2016-07-20 | audit: fix a double fetch in audit_log_single_execve_arg() | Paul Moore
There is a double fetch problem in audit_log_single_execve_arg() where we first check the execve(2) arguments for any "bad" characters which would require hex encoding and then re-fetch the arguments for logging in the audit record[1]. Of course this leaves a window of opportunity for an unsavory application to munge with the data.

This patch reworks things by only fetching the argument data once[2] into a buffer where it is scanned and logged into the audit record(s). In addition to fixing the double fetch, this patch improves on the original code in a few other ways: better handling of large arguments which require encoding, stricter record length checking, and some performance improvements (completely unverified, but we got rid of some strlen() calls, that's got to be a good thing).

As part of the development of this patch, I've also created a basic regression test for the audit-testsuite; the test can be tracked on GitHub at the following link:

 * https://github.com/linux-audit/audit-testsuite/issues/25

[1] If you pay careful attention, there is actually a triple fetch problem due to a strnlen_user() call at the top of the function.

[2] This is a tiny white lie, we do make a call to strnlen_user() prior to fetching the argument data. I don't like it, but due to the way the audit record is structured we really have no choice unless we copy the entire argument at once (which would require a rather wasteful allocation). The good news is that with this patch the kernel no longer relies on this strnlen_user() value for anything beyond recording it in the log; we also update it with a trustworthy value whenever possible.

Reported-by: Pengfei Wang <wpengfeinudt@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
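The single-fetch idea in the audit entry above can be sketched in userspace C. This is an illustrative sketch only, not the kernel code: `fetch_once` stands in for a copy from user memory, the 64-byte buffer is an arbitrary choice, and the "needs hex" rule (a double quote, or any byte outside 0x21..0x7e) is an assumption of this example. The point it shows is that both the scan and the logging operate on the same private copy, so a concurrent writer cannot change the data between the check and the use.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for a copy from untrusted memory: one bounded pass over src. */
static size_t fetch_once(char *dst, size_t dst_len, const volatile char *src)
{
	size_t n = 0;

	while (n + 1 < dst_len && src[n] != '\0') {
		dst[n] = src[n];
		n++;
	}
	dst[n] = '\0';
	return n;
}

/* Sketch rule: quote, control, space and non-ASCII bytes force hex encoding. */
static bool needs_hex(const char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		unsigned char c = (unsigned char)buf[i];

		if (c < 0x21 || c > 0x7e || c == '"')
			return true;
	}
	return false;
}

/* Format one argument into out; the untrusted source is read exactly once. */
static const char *log_arg(char *out, size_t out_len, const volatile char *user_arg)
{
	char copy[64];
	size_t len = fetch_once(copy, sizeof(copy), user_arg);
	size_t pos = 0;

	if (out_len == 0)
		return out;
	out[0] = '\0';

	/* Both the check and the output below use the private copy only. */
	if (!needs_hex(copy, len)) {
		snprintf(out, out_len, "\"%s\"", copy);
		return out;
	}
	for (size_t i = 0; i < len && pos + 2 < out_len; i++)
		pos += (size_t)snprintf(out + pos, out_len - pos, "%02X",
					(unsigned char)copy[i]);
	return out;
}
```

A double-fetch version would call `needs_hex` on one read of `user_arg` and format from a second read, which is exactly the window the commit closes.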
2016-07-20 | wlcore: spi: fix build warning caused by redundant variable | Reizer, Eyal
The ret variable is unused in wlcore_probe_of(). Remove it to fix the build warning. Fixes: 01efe65aba65 ("wlcore: spi: add wl18xx support") Signed-off-by: Eyal Reizer <eyalr@ti.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-07-20 | GPU-DRM-sun4i: Delete an unnecessary check before drm_fbdev_cma_hotplug_event() | Markus Elfring
The drm_fbdev_cma_hotplug_event() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/cd959d92-f7d9-598c-421f-d3f40bedee10@users.sourceforge.net
2016-07-20 | drm/atomic: Delete an unnecessary check before drm_property_unreference_blob() | Markus Elfring
The drm_property_unreference_blob() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Sean Paul <seanpaul@chromium.org> Link: http://patchwork.freedesktop.org/patch/msgid/aa4cd508-38c3-78d7-a9f2-70e3b06a8fb5@users.sourceforge.net
2016-07-20 | Merge remote-tracking branches 'regulator/topic/qcom-spmi', 'regulator/topic/rn5t618', 'regulator/topic/tps65218' and 'regulator/topic/twl' into regulator-next | Mark Brown
2016-07-20 | Merge remote-tracking branches 'regulator/topic/mt6397', 'regulator/topic/of', 'regulator/topic/pfuze100', 'regulator/topic/pwm' and 'regulator/topic/qcom-smd' into regulator-next | Mark Brown
2016-07-20 | Merge remote-tracking branches 'regulator/topic/fixed', 'regulator/topic/headers', 'regulator/topic/lp837x', 'regulator/topic/max8973' and 'regulator/topic/mt6323' into regulator-next | Mark Brown
2016-07-20 | Merge remote-tracking branches 'regulator/topic/act8865', 'regulator/topic/can-change-voltage', 'regulator/topic/da9210' and 'regulator/topic/da9211' into regulator-next | Mark Brown
2016-07-20 | Merge remote-tracking branch 'regulator/topic/axp20x' into regulator-next | Mark Brown
2016-07-20 | Merge remote-tracking branches 'regulator/fix/da9053' and 'regulator/fix/s2mps11' into regulator-linus | Mark Brown
2016-07-20 | arm64: kprobes: Fix overflow when saving stack | Marc Zyngier
The MIN_STACK_SIZE macro tries to evaluate how much stack space needs to be saved in the jprobes_stack array, sized at 128 bytes. When using the IRQ stack, said macro can happily return up to IRQ_STACK_SIZE, which is 16kB. Mayhem follows. This patch fixes things by getting rid of the crazy macro and limiting the copy to be at most the size of the jprobes_stack array, no matter which stack we're on. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
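The fix described in the kprobes entry above amounts to bounding a copy by the destination size rather than by the amount of live source data. Below is a minimal userspace sketch of that bound; `struct saved_stack`, `save_stack` and `demo_copy` are hypothetical names, and only the 128-byte and 16 kB figures come from the commit message.

```c
#include <stddef.h>
#include <string.h>

#define SAVED_STACK_BYTES 128	/* size of the save area, per the commit message */

struct saved_stack {
	unsigned char bytes[SAVED_STACK_BYTES];
};

/* Copy the live stack region [sp, stack_top) but never more than dst can hold.
 * Returns the number of bytes actually copied. */
static size_t save_stack(struct saved_stack *dst, const unsigned char *sp,
			 const unsigned char *stack_top)
{
	size_t live = (size_t)(stack_top - sp);
	size_t n = live < sizeof(dst->bytes) ? live : sizeof(dst->bytes);

	memcpy(dst->bytes, sp, n);
	return n;
}

/* Demo helper: pretend live_bytes (at most 16 kB) of a 16 kB stack are in use. */
static size_t demo_copy(size_t live_bytes)
{
	static unsigned char fake_stack[16 * 1024];
	struct saved_stack saved;
	const unsigned char *top = fake_stack + sizeof(fake_stack);

	return save_stack(&saved, top - live_bytes, top);
}
```

With the clamp in place, a fully used 16 kB stack still results in a 128-byte copy, which is the overflow the commit removes.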
2016-07-20 | ARC: dma: fix address translation in arc_dma_free | Vladimir Kondratiev
The page should be calculated using the physical address. If the platform uses a non-trivial dma-to-phys memory translation, dma_handle should be converted to a physical address before the page is calculated. Failing to do so results in a struct page * pointing to wrong or non-existent memory. Fixes: f2e3d55397ff ("ARC: dma: reintroduce platform specific dma<->phys") Cc: stable@vger.kernel.org #4.6+ Signed-off-by: Vladimir Kondratiev <vladimir.kondratiev@intel.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-07-20 | dm thin: fix a race condition between discarding and provisioning a block | Joe Thornber
The discard passdown was being issued after the block was unmapped, which meant the block could be reprovisioned whilst the passdown discard was still in flight. We can only identify unshared blocks (safe to pass a discard down to) once they're unmapped and their ref count hits zero. Block ref counts are now used to guard against concurrent allocation of these blocks that are being discarded. So now we unmap the block, issue passdown discards, and then immediately increment ref counts for regions that have been discarded via passdown (this is safe because allocation occurs within the same thread). We then decrement ref counts once the passdown discard IO is complete -- signaling these blocks may now be allocated. This fixes the potential for corruption that was reported here: https://www.redhat.com/archives/dm-devel/2016-June/msg00311.html Reported-by: Dennis Yang <dennisyang@qnap.com> Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-20 | dm btree: fix a bug in dm_btree_find_next_single() | Joe Thornber
dm_btree_find_next_single() can short-circuit the search for a block with a return of -ENODATA if all entries are higher than the search key passed to lower_bound(). This hasn't been a problem because of the way the btree has been used by DM thinp. But it must be fixed now in preparation for fixing the race in DM thinp's handling of simultaneous block discard vs allocation. Otherwise, once that fix is in place, some of the blocks in a discard would not be unmapped as expected. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
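The short-circuit described in the dm btree entry above can be illustrated on a single sorted array. This is a simplified sketch, not the btree code: it assumes a `lower_bound` helper that returns the index of the largest key not above the target, or -1 when every key is higher, and `find_next_index` is a hypothetical name for the "first key at or above the target" search.

```c
#include <errno.h>
#include <stdint.h>

/* Index of the largest key <= target, or -1 if every key is higher. */
static int lower_bound(const uint64_t *keys, int nr, uint64_t target)
{
	int lo = -1, hi = nr;

	while (hi - lo > 1) {
		int mid = lo + (hi - lo) / 2;

		if (keys[mid] <= target)
			lo = mid;
		else
			hi = mid;
	}
	return lo;
}

/* Index of the first key >= target, or -ENODATA if there is none. */
static int find_next_index(const uint64_t *keys, int nr, uint64_t target)
{
	int i = lower_bound(keys, nr, target);

	if (i < 0)
		i = 0;		/* all keys are higher: the first one is the answer */
	else if (keys[i] < target)
		i++;		/* step past the key that is below the target */

	/* Returning -ENODATA as soon as lower_bound() yields -1 would be the
	 * short-circuit the commit message describes. */
	if (i >= nr)
		return -ENODATA;
	return i;
}
```

For keys {10, 20, 30} and a target of 5, `lower_bound` yields -1, yet key 10 is a valid "next" entry; only a target above 30 should produce -ENODATA.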
2016-07-20 | spi: rockchip: limit transfers to (64K - 1) bytes | Brian Norris
The Rockchip SPI controller's length register only supports 16 bits, yielding a maximum length of 64KiB (the CTRLR1 register holds "length - 1"). Trying to transfer more than that (e.g., with a large SPI flash read) will cause the driver to hang. Now, it seems that while theoretically we should be able to program CTRLR1 with 0xffff and get a 64KiB transfer, that also seems to cause the core to choke, so stick with a maximum of 64K - 1 bytes -- i.e., 0xffff. Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Mark Brown <broonie@kernel.org>
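To make the limit in the spi: rockchip entry above concrete, here is a small sketch of how a caller might split a large request into pieces no larger than 0xffff bytes. It only illustrates the arithmetic; `next_chunk` and `count_chunks` are hypothetical helpers, and the sketch makes no claim about how the driver or the SPI core actually performs the split.

```c
#include <stddef.h>

#define MAX_XFER_BYTES 0xffffu	/* 64K - 1, the limit named in the commit message */

/* Size of the next piece to send, given how many bytes remain. */
static size_t next_chunk(size_t remaining)
{
	return remaining > MAX_XFER_BYTES ? MAX_XFER_BYTES : remaining;
}

/* Number of transfers needed to move total bytes under the limit. */
static size_t count_chunks(size_t total)
{
	size_t chunks = 0;

	while (total > 0) {
		total -= next_chunk(total);
		chunks++;
	}
	return chunks;
}
```

A 64 KiB read (65536 bytes) therefore takes two transfers under this limit: one of 65535 bytes and one of a single byte.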
2016-07-20 | regulator: da9053/52: Fix incorrectly stated minimum and maximum voltage limits | Steve Twiss
This fix alters the minimum and maximum BUCK voltage limits for DA9052 and DA9053. It does so for the following cases:

DA9052
 - BUCK3 (MEM)   min: 0.925V -> 0.950V   max: 2.500V -> 2.525V
DA9053
 - BUCK3 (MEM)   min: 0.925V -> 0.950V   max: 2.500V -> 2.525V
 - BUCK4 (PERI)  min: 0.925V -> 0.950V   max: 2.500V -> 2.525V

The voltage range remains the same, but the limits are shifted by +0.025V.

This change is provided on DA9052:MEM, DA9053:MEM and DA9053:PERI and is a voltage difference of 0.025V, compared to those measured before this fix is applied. The patch has the effect of decreasing *all* measured voltages on those BUCKs when compared against the previously measured values for the same software voltage request.

For example, with this fix applied for DA9052:MEM, DA9053:MEM and DA9053:PERI, the following is true. Because the previous software defined slot 0 as being 0.925V, if a request for 0.950V was previously sent, the slot 1 voltage would have been used. This would have corresponded to an actual measured voltage of 0.975V. But, with this patch fix, and with slot 0 properly aligned to 0.950V, if a voltage of 0.950V is requested by software, a measured value of 0.950V will be provided.

Tested-by: Steve Twiss <stwiss.opensource@diasemi.com>
Signed-off-by: Steve Twiss <stwiss.opensource@diasemi.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
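The slot arithmetic in the regulator entry above can be checked with a short sketch. It assumes a linear table with a 0.025V step, works in microvolts to stay in integers, and takes the hardware's slot 0 to be 0.950V as the commit message states; `uv_to_selector` and `selector_to_actual_uv` are hypothetical helper names.

```c
#define STEP_UV   25000		/* 0.025V per slot */
#define HW_MIN_UV 950000	/* what slot 0 really outputs, per the commit message */

/* Lowest slot whose table voltage is at least req_uv, for a table whose
 * slot 0 is believed to be table_min_uv. */
static int uv_to_selector(int table_min_uv, int req_uv)
{
	if (req_uv <= table_min_uv)
		return 0;
	return (req_uv - table_min_uv + STEP_UV - 1) / STEP_UV;
}

/* Voltage the hardware actually produces for a given slot. */
static int selector_to_actual_uv(int selector)
{
	return HW_MIN_UV + selector * STEP_UV;
}
```

With the old table minimum of 925000 uV, a request for 950000 uV selects slot 1 and the hardware outputs 975000 uV; with the corrected minimum of 950000 uV the same request selects slot 0 and outputs 950000 uV, matching the example in the commit message.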
2016-07-20 | libata-scsi: better style in ata_msense_*() | Tom Yan
`changeable` is the "version" of mode page requested by the user. It will be less confusing/misleading if we do not check it "together" with the setting bits of the drive. Not to mention that we currently have ata_mselect_*() implemented in a way that each of them will serve exclusively a particular bit on each page. The old style will hence make the condition look even more unnecessarily arcane if the ata_msense_*() is reflecting more than one bit. Signed-off-by: Tom Yan <tom.ty89@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-07-20 | AHCI: Clear GHC.IS to prevent unexpectedly asserting INTx | Pang Raymond
Due to PCI subsystem behaviour, unloading the AHCI driver will disable MSI and enable INTx. When the HBA supports MSIx or Multiple MSI, the driver's irq handler doesn't clear the GHC.IS register. This works well when reading or writing data, and GHC.IS is always non-zero. But when unloading the driver (or during any other operation which disables MSIx and enables INTx), the PCI subsystem uses a config write (Rx04.bit10) to enable INTx. Because GHC.IS is non-zero, the HBA will falsely assume some port needs interrupt service and then asserts INTx. To make things worse, when the AHCI controller shares the same interrupt pin with another PCI device, that PCI device's ISR will be called and nobody de-asserts the previous INTx. This patch clears GHC.IS in ahci_port_stop() even when using MSIx or MMSI to prevent this case. It ensures GHC.IS is zero before the PCI subsystem enables INTx. tj: Minor updates to the comment. Signed-off-by: Raymond Pang <raymond_rule@hotmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-07-20 | crypto: vmx - Fix aes_p8_xts_decrypt build failure | Herbert Xu
We use _GLOBAL so there is no need to do the manual alignment, in fact it causes a build failure. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-07-20 | crypto: vmx - Ignore generated files | Paulo Flabiano Smorigo
Ignore assembly files generated by the perl script. Signed-off-by: Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-07-20 | hwmon: (ftsteutates) Remove unused including <linux/version.h> | Wei Yongjun
Remove the include of <linux/version.h>, which is not needed. Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-07-20 | hwmon: (adt7411) set bit 3 in CFG1 register | Michael Walle
According to the datasheet you should only write 1 to this bit. If it is not set, at least AIN3 will return bad values on newer silicon revisions. Fixes: d84ca5b345c2 ("hwmon: Add driver for ADT7411 voltage and temperature sensor") Signed-off-by: Michael Walle <michael@walle.cc> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-07-20 | hwmon: Add driver for FTS BMC chip "Teutates" | Thilo Cestonaro
This driver implements hardware monitoring and watchdog support for the FTS BMC Chip "Teutates". Signed-off-by: Thilo Cestonaro <thilo@cestona.ro> [groeck: Updated subject and description; fixed dependencies] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-07-20 | x86/insn: perf tools: Fix vcvtph2ps instruction decoding | Adrian Hunter
vcvtph2ps does not have an immediate operand, so remove the erroneous 'Ib' from its opcode map entry. Add vcvtph2ps to the perf tools new instructions test to verify it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dan Williams <dan.j.williams@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: X86 ML <x86@kernel.org> Link: http://lkml.kernel.org/r/1469003437-32706-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-07-20 | cifs: fix crash due to race in hmac(md5) handling | Rabin Vincent
The secmech hmac(md5) structures are present in the TCP_Server_Info struct and can be shared among multiple CIFS sessions. However, the server mutex is not currently held when these structures are allocated and used, which can lead to a kernel crash, as in the scenario below:

 mount.cifs(8) #1                            mount.cifs(8) #2

 Is secmech.sdeschmaccmd5 allocated?
 // false

                                             Is secmech.sdeschmaccmd5 allocated?
                                             // false

 secmech.hmacmd = crypto_alloc_shash..
 secmech.sdeschmaccmd5 = kzalloc..
 sdeschmaccmd5->shash.tfm = &secmech.hmacmd;

                                             secmech.sdeschmaccmd5 = kzalloc
                                             // sdeschmaccmd5->shash.tfm
                                             // not yet assigned

 crypto_shash_update()
  deref NULL sdeschmaccmd5->shash.tfm

 Unable to handle kernel paging request at virtual address 00000030
 epc   : 8027ba34 crypto_shash_update+0x38/0x158
 ra    : 8020f2e8 setup_ntlmv2_rsp+0x4bc/0xa84
 Call Trace:
  crypto_shash_update+0x38/0x158
  setup_ntlmv2_rsp+0x4bc/0xa84
  build_ntlmssp_auth_blob+0xbc/0x34c
  sess_auth_rawntlmssp_authenticate+0xac/0x248
  CIFS_SessSetup+0xf0/0x178
  cifs_setup_session+0x4c/0x84
  cifs_get_smb_ses+0x2c8/0x314
  cifs_mount+0x38c/0x76c
  cifs_do_mount+0x98/0x440
  mount_fs+0x20/0xc0
  vfs_kern_mount+0x58/0x138
  do_mount+0x1e8/0xccc
  SyS_mount+0x88/0xd4
  syscall_common+0x30/0x54

Fix this by locking the srv_mutex around the code which uses these hmac(md5) structures. All the other secmech algos already have similar locking.

Fixes: 95dc8dd14e2e84cc ("Limit allocation of crypto mechanisms to dialect which requires")
Signed-off-by: Rabin Vincent <rabinv@axis.com>
Acked-by: Sachin Prabhu <sprabhu@redhat.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <smfrench@gmail.com>
2016-07-20 | arm/perf: Fix hotplug state machine conversion | Sebastian Andrzej Siewior
Mark Rutland pointed out that this commit is incomplete:

  7d88eb695a1f ("arm/perf: Convert to hotplug state machine")

The problem is that:

> We may have multiple PMUs (e.g. two in big.LITTLE systems), and
> __oprofile_cpu_pmu only contains one of these. So this conversion is not
> correct.
>
> We were relying on the notifier list implicitly containing a list of
> those PMUs. It seems like we need an explicit list here.
>
> We keep __oprofile_cpu_pmu around for legacy 32-bit users of OProfile
> (on non-heterogeneous systems), and that's all that the variable should
> be used for.

Introduce arm_pmu_list to correctly handle multiple PMUs in the system.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-tip-commits@vger.kernel.org
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160719111733.GA22911@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-20 | x86/cpu: Add workaround for MONITOR instruction erratum on Goldmont based CPUs | Peter Zijlstra
Monitored cached line may not wake up from mwait on certain Goldmont based CPUs. This patch will avoid calling current_set_polling_and_test() and thereby not set the TIF_ flag. The result is that we'll always send IPIs for wakeups. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1468867270-18493-1-git-send-email-jacob.jun.pan@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-20 | Merge branch 'linus' into x86/cpu, to pick up fixes | Ingo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-20 | x86, crypto: Restore MODULE_LICENSE() to glue_helper.c so it loads | Paul Gortmaker
In commit:

  eb008eb6f8b6 ("x86: Audit and remove any remaining unnecessary uses of module.h")

... we looked for instances of module.h that were not supporting anything more than exported symbols. To facilitate the exchange of module.h to the much smaller export.h we occasionally remove tags like MODULE_AUTHOR() etc. which in the case of built in files, are no-ops and hence that is fine, assuming the info is already in the comments at the top of the file.

However the error here is that I overlooked that this file was used not as a driver, but as a library of functions, and hence has no explicit modular linkage functions or similar, making it _appear_ non-modular. We can see that in retrospect with:

  arch/x86/crypto/Makefile:obj-$(CONFIG_CRYPTO_GLUE_HELPER_X86) += glue_helper.o
  crypto/Kconfig:config CRYPTO_GLUE_HELPER_X86
  crypto/Kconfig: tristate

Since we removed what was an active MODULE_LICENSE(), the module failed to load and then automated testing showed the missing glue helpers as:

  glue_helper: Unknown symbol blkcipher_walk_done (err 0)
  glue_helper: Unknown symbol blkcipher_walk_virt (err 0)
  glue_helper: Unknown symbol kernel_fpu_end (err 0)
  glue_helper: Unknown symbol kernel_fpu_begin (err 0)
  glue_helper: Unknown symbol blkcipher_walk_virt_block (err 0)

So we do a partial revert of that change to just this one file, and watch for similar MODULE_LICENSE() only cases in future audits.

Reported-by: kernel test robot <xiaolong.ye@intel.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-crypto@vger.kernel.org
Cc: lkp@01.org
Fixes: eb008eb6f8b6 ("x86: Audit and remove any remaining unnecessary uses of module.h")
Link: http://lkml.kernel.org/r/20160719144243.GK21225@windriver.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-19 | Merge branch 'xdp' | David S. Miller
Brenden Blanco says:

====================
Add driver bpf hook for early packet drop and forwarding

This patch set introduces new infrastructure for programmatically processing packets in the earliest stages of rx, as part of an effort others are calling eXpress Data Path (XDP) [1]. Start this effort by introducing a new bpf program type for early packet filtering, before even an skb has been allocated. Extend on this with the ability to modify packet data and send back out on the same port.

Patch 1 adds an API for bulk bpf prog refcnt increment.
Patch 2 introduces the new prog type and helpers for validating the bpf program. A new userspace struct is defined containing only data and data_end as fields, with others to follow in the future.
In patch 3, create a new ndo to pass the fd to supported drivers.
In patch 4, expose a new rtnl option to userspace.
In patch 5, enable support in mlx4 driver.
In patch 6, create a sample drop and count program. With single core, achieved ~20 Mpps drop rate on a 40G ConnectX3-Pro. This includes packet data access, bpf array lookup, and increment.
In patch 7, add a page recycle facility to mlx4 rx, enabled when xdp is active.
In patch 8, add the XDP_TX type to bpf.h
In patch 9, add helper in tx patch for writing tx_desc
In patch 10, add support in mlx4 for packet data write and forwarding
In patch 11, turn on packet write support in the bpf verifier
In patch 12, add a sample program for packet write and forwarding. With single core, achieved ~10 Mpps rewrite and forwarding.

[1] https://github.com/iovisor/bpf-docs/blob/master/Express_Data_Path.pdf

v10:
 1/12: Add bulk refcnt api.
 5/12: Move prog from priv to ring. This attribute is still only set globally, but the path to finer granularity should be clear. No lock is taken, so some rings may operate on older programs for a time (one napi loop). Looked into options such as napi_synchronize, but they were deemed too slow (calls to msleep). Rename prog to xdp_prog.
  Add xdp_ring_num to help with accounting, used more heavily in later patches.
 7/12: Adjust to use per-ring xdp prog. Use priv->xdp_ring_num where before priv->prog was used to determine buffer allocations.
 9/12: Add cpu_to_be16 to vlan_tag in mlx4_en_xmit(). Remove unused variable from mlx4_en_xmit and unused params from build_inline_wqe.

v9:
 4/11: Add missing newline in en_err message.
 6/11: Move page_cache cleanup from mlx4_en_destroy_rx_ring to mlx4_en_deactivate_rx_ring. Move mlx4_en_moderation_update back to static. Remove calls to mlx4_en_alloc/free_resources in mlx4_xdp_set. Adopt instead the approach of mlx4_en_change_mtu to use a watchdog.
 9/11: Use a per-ring function pointer in tx to separate out the code for regular and recycle paths of tx completion handling. Add a helper function to init the recycle ring and callback, called just after activating tx. Remove extra tx ring resource requirement, and instead steal from the upper rings. This helps to avoid needing mlx4_en_alloc_resources. Add some hopefully meaningful error messages for the various error cases. Reverted some of the hard-to-follow logic that was accounting for the extra tx rings.

v8:
 1/11: Reduce WARN_ONCE to single line. Also, change act param of that function to u32 to match return type of bpf_prog_run_xdp.
 2/11: Clarify locking semantics in ndo comment.
 4/11: Add en_err warning in mlx4_xdp_set on num_frags/mtu violation.

v7:
 Addressing two of the major discussion points: return codes and ndo. The rest will be taken as todo items for separate patches.
 Add an XDP_ABORTED type, which explicitly falls through to DROP. The same result must be taken for the default case as well, as it is now well-defined API behavior.
 Merge ndo_xdp_* into a single ndo. The style is similar to ndo_setup_tc, but with less unidirectional naming convention. The IFLA parameter names are unchanged.
 TODOs: Add ethtool per-ring stats for aborted, default cases, maybe even drop and tx as well.
  Avoid duplicate dma sync operation in XDP_PASS case as mentioned by Saeed.
 1/12: Add XDP_ABORTED enum, reword API comment, and update commit message.
 2/12: Rewrite ndo_xdp_*() into single ndo_xdp() with type/union style calling convention.
 3/12: Switch to ndo_xdp callback.
 4/12: Add XDP_ABORTED case as a fall-through to XDP_DROP. Implement ndo_xdp.
 12/12: Dropped, this will need some more work.

v6:
 2/12: drop unnecessary netif_device_present check
 4/12, 6/12, 9/12: Reorder default case statement above drop case to remove some copy/paste.

v5:
 0/12: Rebase and remove previous 1/13 patch
 1/12: Fix nits from Daniel. Left the (void *) cast as-is, to be fixed in future. Add bpf_warn_invalid_xdp_action() helper, to be used when out of bounds action is returned by the program. Add a comment to bpf.h denoting the undefined nature of out of bounds returns.
 2/12: Switch to using bpf_prog_get_type(). Rename ndo_xdp_get() to ndo_xdp_attached().
 3/12: Add IFLA_XDP as a nested type, and add the associated nla_policy for the new subtypes IFLA_XDP_FD and IFLA_XDP_ATTACHED.
 4/12: Fixup the use of READ_ONCE in the ndos. Add a user of bpf_warn_invalid_xdp_action helper.
 5/12: Adjust to using the nested netlink options.
 6/12: kbuild was complaining about overflow of u16 on tile architecture...bump frag_stride to u32. The page_offset member that is computed from this was already u32.

v4:
 2/12: Add inline helper for calling xdp bpf prog under rcu
 3/12: Add detail to ndo comments
 5/12: Remove mlx4_call_xdp and use inline helper instead.
 6/12: Fix checkpatch complaints
 9/12: Introduce new patch 9/12 with common helper for tx_desc write. Refactor to use common tx_desc write helper
 11/12: Fix checkpatch complaints

v3:
 Rewrite from v2 trying to incorporate feedback from multiple sources. Specifically, add ability to forward packets out the same port and allow packet modification. For packet forwarding, the driver reserves a dedicated set of tx rings for exclusive use by xdp.
 Upon completion, the pages on this ring are recycled directly back to a small per-rx-ring page cache without being dma unmapped. Use of the percpu skb is dropped in favor of a lightweight struct xdp_buff. The direct packet access feature is leveraged to remove dependence on the skb. The mlx4 driver implementation allocates a page-per-packet and maps it in PCI_DMA_BIDIRECTIONAL mode when the bpf program is activated. Naming is converted to use "xdp" instead of "phys_dev".

v2:
 1/5: Drop xdp from types, instead consistently use bpf_phys_dev_
      Introduce enum for return values from phys_dev hook
 2/5: Move prog->type check to just before invoking ndo
      Change ndo to take a bpf_prog * instead of fd
      Add ndo_bpf_get rather than keeping a bool in the netdev struct
 3/5: Use ndo_bpf_get to fetch bool
 4/5: Enforce that only 1 frag is ever given to bpf prog by disallowing mtu to increase beyond FRAG_SZ0 when bpf prog is running, or conversely to set a bpf prog when priv->num_frags > 1
      Rename pseudo_skb to bpf_phys_dev_md
      Implement ndo_bpf_get
      Add dma sync just before invoking prog
      Check for explicit bpf return code rather than nonzero
      Remove increment of rx_dropped
 5/5: Use explicit bpf return code in example
      Update commit log with higher pps numbers
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
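The action handling described in the cover letter above (a context exposing only data and data_end, with aborted and unknown return codes treated as a drop) can be sketched in plain C. This is an illustrative userspace model, not kernel or driver code: the enum values, `struct xdp_ctx_sketch`, `toy_prog`, `handle_action` and `run_on` are all names and choices made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the action names used in the cover letter; the numeric
 * values are this sketch's own. */
enum xdp_action_sketch { XDP_ABORTED, XDP_DROP, XDP_PASS, XDP_TX };

/* The program sees only the start and end of the packet data. */
struct xdp_ctx_sketch {
	const uint8_t *data;
	const uint8_t *data_end;
};

enum verdict { V_DROP, V_PASS, V_TX };

/* Toy program: bounds-check before touching the packet, then pass frames
 * whose EtherType (bytes 12..13) is 0x0800 and drop everything else. */
static uint32_t toy_prog(const struct xdp_ctx_sketch *ctx)
{
	if ((size_t)(ctx->data_end - ctx->data) < 14)
		return XDP_DROP;
	if (ctx->data[12] == 0x08 && ctx->data[13] == 0x00)
		return XDP_PASS;
	return XDP_DROP;
}

/* Driver-side dispatch: unknown and aborted results fall through to drop. */
static enum verdict handle_action(uint32_t act)
{
	switch (act) {
	case XDP_PASS:
		return V_PASS;
	case XDP_TX:
		return V_TX;
	default:		/* out-of-range return code */
	case XDP_ABORTED:
	case XDP_DROP:
		return V_DROP;
	}
}

/* Run the toy program over a buffer and return the driver's verdict. */
static enum verdict run_on(const uint8_t *pkt, size_t len)
{
	struct xdp_ctx_sketch ctx = { pkt, pkt + len };

	return handle_action(toy_prog(&ctx));
}
```

Placing the default label above the aborted and drop cases mirrors the "reorder default case statement above drop case" note in the changelog and keeps all three outcomes on one code path.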
2016-07-19 | bpf: add sample for xdp forwarding and rewrite | Brenden Blanco
Add a sample that rewrites and forwards packets out on the same interface. Observed single core forwarding performance of ~10Mpps. Since the mlx4 driver under test recycles every single packet page, the perf output shows almost exclusively just the ring management and bpf program work. Slowdowns are likely occurring due to cache misses. Signed-off-by: Brenden Blanco <bblanco@plumgrid.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-19 | bpf: enable direct packet data write for xdp progs | Brenden Blanco
For forwarding to be effective, XDP programs should be allowed to rewrite packet data. This requires that the drivers supporting XDP must all map the packet memory as TODEVICE or BIDIRECTIONAL before invoking the program. Signed-off-by: Brenden Blanco <bblanco@plumgrid.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-19 | net/mlx4_en: add xdp forwarding and data write support | Brenden Blanco
A user will now be able to loop packets back out of the same port using a bpf program attached to the xdp hook. Updates to the packet contents from the bpf program are also supported. For the packet write feature to work, the rx buffers are now mapped as bidirectional when the page is allocated. This occurs only when the xdp hook is active. When the program returns a TX action, enqueue the packet directly to a dedicated tx ring, so as to avoid completely any locking. This requires the tx ring to be allocated 1:1 for each rx ring, as well as the tx completion running in the same softirq. Upon tx completion, this dedicated tx ring recycles pages without unmapping directly back to the original rx ring. In steady state tx/drop workload, effectively 0 page allocs/frees will occur. In order to separate out the paths between free and recycle, a free_tx_desc func pointer is introduced that is optionally updated whenever recycle_ring is activated. By default the original free function is always initialized. Signed-off-by: Brenden Blanco <bblanco@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-19 | net/mlx4_en: break out tx_desc write into separate function | Brenden Blanco
In preparation for writing the tx descriptor from multiple functions, create a helper for both normal and blueflame access. Signed-off-by: Brenden Blanco <bblanco@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>