summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-08-03netfilter: nf_tables: flow event notifier must use transaction mutexFlorian Westphal
Fixes: f102d66b335a4 ("netfilter: nf_tables: use dedicated mutex to guard transactions") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03netfilter: nfnetlink_osf: rename nf_osf header file to nfnetlink_osfFernando Fernandez Mancera
The first client of the nf_osf.h userspace header is nft_osf, coming in this batch, rename it to nfnetlink_osf.h as there are no userspace clients for this yet, hence this looks consistent with other nfnetlink subsystem. Suggested-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03netfilter: nf_osf: move nf_osf_fingers to non-uapi header fileFernando Fernandez Mancera
All warnings (new ones prefixed by >>): >> ./usr/include/linux/netfilter/nf_osf.h:73: userspace cannot reference function or variable defined in the kernel Fixes: f9324952088f ("netfilter: nfnetlink_osf: extract nfnetlink_subsystem code from xt_osf.c") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03netfilter: use kvmalloc_array to allocate memory for hashtableLi RongQing
nf_ct_alloc_hashtable is used to allocate memory for conntrack, NAT bysrc and expectation hashtable. Assuming 64k bucket size, which means 7th order page allocation, __get_free_pages, called by nf_ct_alloc_hashtable, will trigger the direct memory reclaim and stall for a long time, when system has lots of memory stress so replace combination of __get_free_pages and vzalloc with kvmalloc_array, which provides a overflow check and a fallback if no high order memory is available, and do not retry to reclaim memory, reduce stall and remove nf_ct_free_hashtable, since it is just a kvfree Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Signed-off-by: Wang Li <wangli39@baidu.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03nohz: Fix missing tick reprogram when interrupting an inline softirqFrederic Weisbecker
The full nohz tick is reprogrammed in irq_exit() only if the exit is not in a nesting interrupt. This stands as an optimization: whether a hardirq or a softirq is interrupted, the tick is going to be reprogrammed when necessary at the end of the inner interrupt, with even potential new updates on the timer queue. When soft interrupts are interrupted, it's assumed that they are executing on the tail of an interrupt return. In that case tick_nohz_irq_exit() is called after softirq processing to take care of the tick reprogramming. But the assumption is wrong: softirqs can be processed inline as well, ie: outside of an interrupt, like in a call to local_bh_enable() or from ksoftirqd. Inline softirqs don't reprogram the tick once they are done, as opposed to interrupt tail softirq processing. So if a tick interrupts an inline softirq processing, the next timer will neither be reprogrammed from the interrupting tick's irq_exit() nor after the interrupted softirq processing. This situation may leave the tick unprogrammed while timers are armed. To fix this, simply keep reprogramming the tick even if a softirq has been interrupted. That can be optimized further, but for now correctness is more important. Note that new timers enqueued in nohz_full mode after a softirq gets interrupted will still be handled just fine through self-IPIs triggered by the timer code. Reported-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: stable@vger.kernel.org # 4.14+ Link: https://lkml.kernel.org/r/1533303094-15855-1-git-send-email-frederic@kernel.org
2018-08-03genirq: Make force irq threading setup more robustThomas Gleixner
The support of force threading interrupts which are set up with both a primary and a threaded handler wreckaged the setup of regular requested threaded interrupts (primary handler == NULL). The reason is that it does not check whether the primary handler is set to the default handler which wakes the handler thread. Instead it replaces the thread handler with the primary handler as it would do with force threaded interrupts which have been requested via request_irq(). So both the primary and the thread handler become the same which then triggers the warnon that the thread handler tries to wakeup a not configured secondary thread. Fortunately this only happens when the driver omits the IRQF_ONESHOT flag when requesting the threaded interrupt, which is normaly caught by the sanity checks when force irq threading is disabled. Fix it by skipping the force threading setup when a regular threaded interrupt is requested. As a consequence the interrupt request which lacks the IRQ_ONESHOT flag is rejected correctly instead of silently wreckaging it. Fixes: 2a1d3ab8986d ("genirq: Handle force threading of irqs with primary and thread handler") Reported-by: Kurt Kanzenbach <kurt.kanzenbach@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Kurt Kanzenbach <kurt.kanzenbach@linutronix.de> Cc: stable@vger.kernel.org
2018-08-03Bluetooth: hci_qca: Add support for Qualcomm Bluetooth chip wcn3990Balakrishna Godavarthi
Add support to set voltage/current of various regulators to power up/down Bluetooth chip wcn3990. Signed-off-by: Balakrishna Godavarthi <bgodavar@codeaurora.org> Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-08-03Bluetooth: btqca: Add wcn3990 firmware download support.Balakrishna Godavarthi
This patch enables the RAM and NV patch download for wcn3990. Signed-off-by: Balakrishna Godavarthi <bgodavar@codeaurora.org> Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-08-03Bluetooth: hci_qca: Enable 3.2 Mbps operating speed.Balakrishna Godavarthi
Enable Qualcomm chips to operate at 3.2Mbps. Signed-off-by: Balakrishna Godavarthi <bgodavar@codeaurora.org> Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-08-03Bluetooth: hci_qca: Add wrapper functions for setting UART speedBalakrishna Godavarthi
In function qca_setup, we set initial and operating speeds for Qualcomm Bluetooth SoC's. This block of code is common across different Qualcomm Bluetooth SoC's. Instead of duplicating the code, created a wrapper function to set the speeds. So that future coming SoC's can use these wrapper functions to set speeds. Signed-off-by: Balakrishna Godavarthi <bgodavar@codeaurora.org> Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-08-03Bluetooth: btqca: Redefine qca_uart_setup() to generic function.Balakrishna Godavarthi
Redefinition of qca_uart_setup will help future Qualcomm Bluetooth SoC, to use the same function instead of duplicating the function. Added new arguments soc_type and soc_ver to the functions. These arguments will help to decide type of firmware files to be loaded into Bluetooth chip. soc_type holds the Bluetooth chip connected to APPS processor. soc_ver holds the Bluetooth chip version. Signed-off-by: Balakrishna Godavarthi <bgodavar@codeaurora.org> Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-08-03Bluetooth: btqca: Rename ROME specific functions to generic functionsBalakrishna Godavarthi
Some of the QCA BTSoC ROME functions, are used for different versions or different make of BTSoC's. Instead of duplicating the same functions for new chip, update names of the functions that are used for both chips to keep this generic and would help in future when we would have new BT SoC. To have generic text in logs updated from ROME to QCA where ever possible. This avoids confusion to user, when using the future Qualcomm Bluetooth SoC's. Updated BT_DBG, BT_ERR and BT_INFO with bt_dev_dbg, bt_dev_err and bt_dev_info where ever applicable. Signed-off-by: Balakrishna Godavarthi <bgodavar@codeaurora.org> Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-08-03dt-bindings: net: bluetooth: Add device tree bindings for QTI chip wcn3990Balakrishna Godavarthi
This patch enables regulators for the Qualcomm Bluetooth wcn3990 controller. Signed-off-by: Balakrishna Godavarthi <bgodavar@codeaurora.org> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-08-03Bluetooth: hci_h5: Add support for enable and device-wake GPIOsHans de Goede
Add support for the enable and device-wake GPIOs used on ACPI enumerated RTL8723BS devices. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-08-03Bluetooth: hci_h5: Add support for the RTL8723BSJeremy Cline
Implement support for the RTL8723BS chip. Signed-off-by: Jeremy Cline <jeremy@jcline.org> [hdegoede@redhat.com: Port from bt3wire.c to hci_h5.c, drop broken GPIO code] Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-08-03Bluetooth: hci_h5: Add vendor setup, open, and close callbacksJeremy Cline
Allow vendor-specific setup, open, and close functions to be defined. Signed-off-by: Jeremy Cline <jeremy@jcline.org> [hdegoede@redhat.com: Port from bt3wire.c to hci_h5.c, drop dt support] Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-08-03Bluetooth: hci_h5: Add support for serdev enumerated devicesHans de Goede
Add basic support for serdev enumerated devices, note sine this does not (yet) declare any of / ACPI ids to bind to atm this is a nop. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-08-03Bluetooth: btrtl: Add support for a config filename postfixHans de Goede
The contents of the rtl_bt/rtlXXXX_config.bin file may be board specific allow the caller of btrtl_initialize to specify a postfix identifying the board, which if specified will make btrtl_initialize look for rtl_bt/rtlXXXX_config-<postfix>.bin instead. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-08-03Bluetooth: btrtl: add support for the RTL8723BS and RTL8723DS chipsMartin Blumenstingl
The Realtek RTL8723BS and RTL8723DS chipsets are SDIO wifi chips. They also contain a Bluetooth module which is connected via UART to the host. Realtek's userspace initialization tool (rtk_hciattach) differentiates these two via the HCI version and revision returned by the HCI_OP_READ_LOCAL_VERSION command. Additionally we apply these checks only the for UART devices. Everything else is assumed to be a "RTL8723B" which was originally supported by the driver (communicating via USB). Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Jeremy Cline <jeremy@jcline.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-08-03Bluetooth: btrtl: add support for retrieving the UART settingsMartin Blumenstingl
The UART settings are embedded in the config blob. This has to be parsed to successfully initialize the Bluetooth part of the RTL8723BS (which is an SDIO chip, but the Bluetooth part is connected via UART). The Realtek "rtl8723bs_bt" and "rtl8723ds_bt" userspace Bluetooth UART initialization tools (rtk_hciattach) use the following sequence: - send H5 sync pattern (already supported by hci_h5) - get LMP version (already supported by btrtl) - get ROM version (already supported by btrtl) - load the firmware and config for the current chipset (already supported by btrtl) - read UART settings from the config blob (part of this patch) - send UART settings via a vendor command to the device (which changes the baudrate of the device and enables or disables flow control depending on the config) - change the baudrate and flow control settings on the host - send the firmware and config blob to the device (already supported by btrtl) Sending the last firmware and config blob download command (rtl_download_cmd) fails if the UART settings are not updated beforehand. This is presumably because the device applies the config right after the firmware and config blob download - which means that at this point the host is using different UART settings than the device (which will obviously result in non-working communication). Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Jeremy Cline <jeremy@jcline.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-08-03Bluetooth: btrtl: Use rtl_dev_err and rtl_dev_infoHans de Goede
Consistently use rtl_dev_err and rtl_dev_info everywhere for messages. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-08-03Bluetooth: btrtl: split the device initialization into smaller partsMartin Blumenstingl
This prepares the btrtl code so it can be used to initialize Bluetooth modules connected via UART (these are found for example on the RTL8723BS and RTL8723DS SDIO chips, which come with an embedded UART Bluetooth module). The Realtek "rtl8723bs_bt" and "rtl8723ds_bt" userspace Bluetooth UART initialization tools (rtk_hciattach) use the following sequence: 1) send H5 sync pattern (already supported by hci_h5) 2) get LMP version (already supported by btrtl) 3) get ROM version (already supported by btrtl) 4) load the firmware and config for the current chipset (already supported by btrtl) 5) read UART settings from the config blob (currently not supported) 6) send UART settings via a vendor command to the device (which changes the baudrate of the device and enables or disables flow control depending on the config) 7) change the baudrate and flow control settings on the host 8) send the firmware and config blob to the device (already supported by btrtl) The main reason why the initialization has to be split is step #7. This requires changes to the underlying "bus", which should be kept outside of the "generic" btrtl driver. The idea for this split is borrowed from the btbcm driver but adjusted where needed (the btrtl driver for example needs two blobs: firmware and config, while the btbcm only needs one). This also prepares the code for step #5 (parsing the config blob) by centralizing the code which loads the firmware and config blobs and storing the result in the new struct btrtl_device_info. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Jeremy Cline <jeremy@jcline.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-08-03Bluetooth: btrtl: add MODULE_FIRMWARE declarationsMartin Blumenstingl
This makes the firmware names show up in modinfo. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Jeremy Cline <jeremy@jcline.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-08-03Bluetooth: btusb: Use bt_dev_err for Intel firmware loading errorsMarcel Holtmann
Replace the BT_ERR functions with bt_dev_err to get a consistent error printout that always prefixes the HCI device identifier. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2018-08-03selftests/bpf: update test_lwt_seg6local.sh according to iproute2Mathieu Xhonneux
The shell file for test_lwt_seg6local contains an early iproute2 syntax for installing a seg6local End.BPF route. iproute2 support for this feature has recently been upstreamed, but with an additional keyword required. This patch updates test_lwt_seg6local.sh to the definitive iproute2 syntax Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03Bluetooth: btusb: Release RF resource on BT shutdownAmit K Bag
Issue description: Intel 7265 shares the same RF with Wifi and BT. In the shutdown scenario turn off BT, followed by turn WiFi off and on causing error in RF calibration in WiFi Module Solution: before shutdown BT ensure any RF activity to clear by HCI reset command. Reference Logs: ERR kernel: [ 386.193284] iwlwifi 0000:01:00.0: Failed to run INIT calibrations: -5 ERR kernel: [ 386.193298] iwlwifi 0000:01:00.0: Failed to run INIT ucode: -5 ERR kernel: [ 386.193309] iwlwifi 0000:01:00.0: Failed to start RT ucode: -5 Signed-off-by: Amit K Bag <amit.k.bag@intel.com> Singed-off-by: Chethan T N <chethan.tumkur.narayan@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-08-03selftests/bpf: fix a typo in map in map testRoman Gushchin
Commit fbeb1603bf4e ("bpf: verifier: MOV64 don't mark dst reg unbounded") revealed a typo in commit fb30d4b71214 ("bpf: Add tests for map-in-map"): BPF_MOV64_REG(BPF_REG_0, 0) was used instead of BPF_MOV64_IMM(BPF_REG_0, 0). I've noticed the problem by running bpf kselftests. Fixes: fb30d4b71214 ("bpf: Add tests for map-in-map") Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Arthur Fabre <afabre@cloudflare.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-02tools: bpf: fix BTF code added twice to different treesJakub Kicinski
commit 38d5d3b3d5db ("bpf: Introduce BPF_ANNOTATE_KV_PAIR") added to the bpf and net trees what commit 92b57121ca79 ("bpf: btf: export btf types and name by offset from lib") has already added to bpf-next/net-next, but in slightly different location. Remove the duplicates (to fix build of libbpf). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-02Merge tag 'media/v4.18-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: - a deadlock regression at vsp1 driver - some Remote Controller fixes related to the new BPF filter logic added on it for Kernel 4.18. * tag 'media/v4.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: v4l: vsp1: Fix deadlock in VSPDL DRM pipelines media: rc: read out of bounds if bpf reports high protocol number media: bpf: ensure bpf program is freed on detach media: rc: be less noisy when driver misbehaves
2018-08-02Merge tag 'arc-4.18-final' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: "Another batch of fixes for ARC, this time mainly DMA API rework wreckage: - Fix software managed DMA wreckage after rework in 4.17 [Euginey] * missing cache flush * SMP_CACHE_BYTES vs cache_line_size - Fix allmodconfig build errors [Randy] - Maintainer update for Mellanox (EZChip) NPS platform" * tag 'arc-4.18-final' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: arc: fix type warnings in arc/mm/cache.c arc: fix build errors in arc/include/asm/delay.h arc: [plat-eznps] fix printk warning in arc/plat-eznps/mtm.c arc: [plat-eznps] fix data type errors in platform headers ARC: [plat-eznps] Add missing struct nps_host_reg_aux_dpc ARC: add SMP_CACHE_BYTES value validate ARC: dma [non-IOC] setup SMP_CACHE_BYTES and cache_line_size ARC: dma [non IOC]: fix arc_dma_sync_single_for_(device|cpu) ARC: Add Ofer Levi as plat-eznps maintainer
2018-08-02Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "3 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: userfaultfd: remove uffd flags from vma->vm_flags if UFFD_EVENT_FORK fails ipc/shm.c add ->pagesize function to shm_vm_ops memcg: remove memcg_cgroup::id from IDR on mem_cgroup_css_alloc() failure
2018-08-02userfaultfd: remove uffd flags from vma->vm_flags if UFFD_EVENT_FORK failsMike Rapoport
The fix in commit 0cbb4b4f4c44 ("userfaultfd: clear the vma->vm_userfaultfd_ctx if UFFD_EVENT_FORK fails") cleared the vma->vm_userfaultfd_ctx but kept userfaultfd flags in vma->vm_flags that were copied from the parent process VMA. As the result, there is an inconsistency between the values of vma->vm_userfaultfd_ctx.ctx and vma->vm_flags which triggers BUG_ON in userfaultfd_release(). Clearing the uffd flags from vma->vm_flags in case of UFFD_EVENT_FORK failure resolves the issue. Link: http://lkml.kernel.org/r/1532931975-25473-1-git-send-email-rppt@linux.vnet.ibm.com Fixes: 0cbb4b4f4c44 ("userfaultfd: clear the vma->vm_userfaultfd_ctx if UFFD_EVENT_FORK fails") Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Reported-by: syzbot+121be635a7a35ddb7dcb@syzkaller.appspotmail.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-02ipc/shm.c add ->pagesize function to shm_vm_opsJane Chu
Commit 05ea88608d4e ("mm, hugetlbfs: introduce ->pagesize() to vm_operations_struct") adds a new ->pagesize() function to hugetlb_vm_ops, intended to cover all hugetlbfs backed files. With System V shared memory model, if "huge page" is specified, the "shared memory" is backed by hugetlbfs files, but the mappings initiated via shmget/shmat have their original vm_ops overwritten with shm_vm_ops, so we need to add a ->pagesize function to shm_vm_ops. Otherwise, vma_kernel_pagesize() returns PAGE_SIZE given a hugetlbfs backed vma, result in below BUG: fs/hugetlbfs/inode.c 443 if (unlikely(page_mapped(page))) { 444 BUG_ON(truncate_op); resulting in hugetlbfs: oracle (4592): Using mlock ulimits for SHM_HUGETLB is deprecated ------------[ cut here ]------------ kernel BUG at fs/hugetlbfs/inode.c:444! Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 ... CPU: 35 PID: 5583 Comm: oracle_5583_sbt Not tainted 4.14.35-1829.el7uek.x86_64 #2 RIP: 0010:remove_inode_hugepages+0x3db/0x3e2 .... Call Trace: hugetlbfs_evict_inode+0x1e/0x3e evict+0xdb/0x1af iput+0x1a2/0x1f7 dentry_unlink_inode+0xc6/0xf0 __dentry_kill+0xd8/0x18d dput+0x1b5/0x1ed __fput+0x18b/0x216 ____fput+0xe/0x10 task_work_run+0x90/0xa7 exit_to_usermode_loop+0xdd/0x116 do_syscall_64+0x187/0x1ae entry_SYSCALL_64_after_hwframe+0x150/0x0 [jane.chu@oracle.com: relocate comment] Link: http://lkml.kernel.org/r/20180731044831.26036-1-jane.chu@oracle.com Link: http://lkml.kernel.org/r/20180727211727.5020-1-jane.chu@oracle.com Fixes: 05ea88608d4e13 ("mm, hugetlbfs: introduce ->pagesize() to vm_operations_struct") Signed-off-by: Jane Chu <jane.chu@oracle.com> Suggested-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-02memcg: remove memcg_cgroup::id from IDR on mem_cgroup_css_alloc() failureKirill Tkhai
In case of memcg_online_kmem() failure, memcg_cgroup::id remains hashed in mem_cgroup_idr even after memcg memory is freed. This leads to leak of ID in mem_cgroup_idr. This patch adds removal into mem_cgroup_css_alloc(), which fixes the problem. For better readability, it adds a generic helper which is used in mem_cgroup_alloc() and mem_cgroup_id_put_many() as well. Link: http://lkml.kernel.org/r/152354470916.22460.14397070748001974638.stgit@localhost.localdomain Fixes 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure after many small jobs") Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-03Merge branch 'bpf-cgroup-local-storage'Daniel Borkmann
Roman Gushchin says: ==================== This patchset implements cgroup local storage for bpf programs. The main idea is to provide a fast accessible memory for storing various per-cgroup data, e.g. number of transmitted packets. Cgroup local storage looks as a special type of map for userspace, and is accessible using generic bpf maps API for reading and updating of the data. The (cgroup inode id, attachment type) pair is used as a map key. A user can't create new entries or destroy existing entries; it happens automatically when a user attaches/detaches a bpf program to a cgroup. From a bpf program's point of view, cgroup storage is accessible without lookup using the special get_local_storage() helper function. It takes a map fd as an argument. It always returns a valid pointer to the corresponding memory area. To implement such a lookup-free access a pointer to the cgroup storage is saved for an attachment of a bpf program to a cgroup, if required by the program. Before running the program, it's saved in a special global per-cpu variable, which is accessible from the get_local_storage() helper. This patchset implement only cgroup local storage, however the API is intentionally made extensible to support other local storage types further: e.g. thread local storage, socket local storage, etc. v7->v6: - fixed a use-after-free bug, caused by not clearing prog->aux->cgroup_storage pointer after releasing the map v6->v5: - fixed an error with returning -EINVAL instead of a pointer v5->v4: - fixed an issue in verifier (test that flags == 0 properly) - added a corresponding test - added a note about synchronization, sync docs to tools/uapi/... - switched the cgroup test to use XADD - added a check for attr->max_entries to be 0, and atter->max_flags to be sane - use bpf_uncharge_memlock() in bpf_uncharge_memlock() - rebased to bpf-next v4->v3: - fixed a leak in cgroup attachment code (discovered by Daniel) - cgroup storage map will be released if the corresponding bpf program failed to load by any reason - introduced bpf_uncharge_memlock() helper v3->v2: - fixed more build and sparse issues - rebased to bpf-next v2->v1: - fixed build issues - removed explicit rlimit calls in patch 14 - rebased to bpf-next ==================== Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03samples/bpf: extend test_cgrp2_attach2 test to use cgroup storageRoman Gushchin
The test_cgrp2_attach test covers bpf cgroup attachment code well, so let's re-use it for testing allocation/releasing of cgroup storage. The extension is pretty straightforward: the bpf program will use the cgroup storage to save the number of transmitted bytes. Expected output: $ ./test_cgrp2_attach2 Attached DROP prog. This ping in cgroup /foo should fail... ping: sendmsg: Operation not permitted Attached DROP prog. This ping in cgroup /foo/bar should fail... ping: sendmsg: Operation not permitted Attached PASS prog. This ping in cgroup /foo/bar should pass... Detached PASS from /foo/bar while DROP is attached to /foo. This ping in cgroup /foo/bar should fail... ping: sendmsg: Operation not permitted Attached PASS from /foo/bar and detached DROP from /foo. This ping in cgroup /foo/bar should pass... ### override:PASS ### multi:PASS Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03selftests/bpf: add a cgroup storage testRoman Gushchin
Implement a test to cover the cgroup storage functionality. The test implements a bpf program which drops every second packet by using the cgroup storage as a persistent storage. The test also use the userspace API to check the data in the cgroup storage, alter it, and check that the loaded and attached bpf program sees the update. Expected output: $ ./test_cgroup_storage test_cgroup_storage:PASS Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03selftests/bpf: add verifier cgroup storage testsRoman Gushchin
Add the following verifier tests to cover the cgroup storage functionality: 1) valid access to the cgroup storage 2) invalid access: use regular hashmap instead of cgroup storage map 3) invalid access: use invalid map fd 4) invalid access: try access memory after the cgroup storage 5) invalid access: try access memory before the cgroup storage 6) invalid access: call get_local_storage() with non-zero flags For tests 2)-6) check returned error strings. Expected output: $ ./test_verifier #0/u add+sub+mul OK #0/p add+sub+mul OK #1/u DIV32 by 0, zero check 1 OK ... #280/p valid cgroup storage access OK #281/p invalid cgroup storage access 1 OK #282/p invalid cgroup storage access 2 OK #283/p invalid per-cgroup storage access 3 OK #284/p invalid cgroup storage access 4 OK #285/p invalid cgroup storage access 5 OK ... #649/p pass modified ctx pointer to helper, 2 OK #650/p pass modified ctx pointer to helper, 3 OK Summary: 901 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03bpf/test_run: support cgroup local storageRoman Gushchin
Allocate a temporary cgroup storage to use for bpf program test runs. Because the test program is not actually attached to a cgroup, the storage is allocated manually just for the execution of the bpf program. If the program is executed multiple times, the storage is not zeroed on each run, emulating multiple runs of the program, attached to a real cgroup. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03bpftool: add support for CGROUP_STORAGE mapsRoman Gushchin
Add BPF_MAP_TYPE_CGROUP_STORAGE maps to the list of maps types which bpftool recognizes. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03bpf: sync bpf.h to tools/Roman Gushchin
Sync cgroup storage related changes: 1) new BPF_MAP_TYPE_CGROUP_STORAGE map type 2) struct bpf_cgroup_sotrage_key definition 3) get_local_storage() helper Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03bpf: introduce the bpf_get_local_storage() helper functionRoman Gushchin
The bpf_get_local_storage() helper function is used to get a pointer to the bpf local storage from a bpf program. It takes a pointer to a storage map and flags as arguments. Right now it accepts only cgroup storage maps, and flags argument has to be 0. Further it can be extended to support other types of local storage: e.g. thread local storage etc. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03bpf: don't allow create maps of cgroup local storagesRoman Gushchin
As there is one-to-one relation between a bpf program and cgroup local storage map, there is no sense in creating a map of cgroup local storage maps. Forbid it explicitly to avoid possible side effects. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03bpf/verifier: introduce BPF_PTR_TO_MAP_VALUERoman Gushchin
BPF_MAP_TYPE_CGROUP_STORAGE maps are special in a way that the access from the bpf program side is lookup-free. That means the result is guaranteed to be a valid pointer to the cgroup storage; no NULL-check is required. This patch introduces BPF_PTR_TO_MAP_VALUE return type, which is required to cause the verifier accept programs, which are not checking the map value pointer for being NULL. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03bpf: extend bpf_prog_array to store pointers to the cgroup storageRoman Gushchin
This patch converts bpf_prog_array from an array of prog pointers to the array of struct bpf_prog_array_item elements. This allows to save a cgroup storage pointer for each bpf program efficiently attached to a cgroup. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03bpf: allocate cgroup storage entries on attaching bpf programsRoman Gushchin
If a bpf program is using cgroup local storage, allocate a bpf_cgroup_storage structure automatically on attaching the program to a cgroup and save the pointer into the corresponding bpf_prog_list entry. Analogically, release the cgroup local storage on detaching of the bpf program. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03bpf: pass a pointer to a cgroup storage using pcpu variableRoman Gushchin
This commit introduces the bpf_cgroup_storage_set() helper, which will be used to pass a pointer to a cgroup storage to the bpf helper. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03bpf: introduce cgroup storage mapsRoman Gushchin
This commit introduces BPF_MAP_TYPE_CGROUP_STORAGE maps: a special type of maps which are implementing the cgroup storage. >From the userspace point of view it's almost a generic hash map with the (cgroup inode id, attachment type) pair used as a key. The only difference is that some operations are restricted: 1) a user can't create new entries, 2) a user can't remove existing entries. The lookup from userspace is o(log(n)). Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03bpf: add ability to charge bpf maps memory dynamicallyRoman Gushchin
This commits extends existing bpf maps memory charging API to support dynamic charging/uncharging. This is required to account memory used by maps, if all entries are created dynamically after the map initialization. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-02net/socket: remove duplicated init codeMatthieu Baerts
This refactoring work has been started by David Howells in cdfbabfb2f0c (net: Work around lockdep limitation in sockets that use sockets) but the exact same day in 581319c58600 (net/socket: use per af lockdep classes for sk queues), Paolo Abeni added new classes. This reduces the amount of (nearly) duplicated code and eases the addition of new socket types. Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>