summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2016-01-21libceph: fix ceph_msg_revoke()Ilya Dryomov
There are a number of problems with revoking a "was sending" message: (1) We never make any attempt to revoke data - only kvecs contibute to con->out_skip. However, once the header (envelope) is written to the socket, our peer learns data_len and sets itself to expect at least data_len bytes to follow front or front+middle. If ceph_msg_revoke() is called while the messenger is sending message's data portion, anything we send after that call is counted by the OSD towards the now revoked message's data portion. The effects vary, the most common one is the eventual hang - higher layers get stuck waiting for the reply to the message that was sent out after ceph_msg_revoke() returned and treated by the OSD as a bunch of data bytes. This is what Matt ran into. (2) Flat out zeroing con->out_kvec_bytes worth of bytes to handle kvecs is wrong. If ceph_msg_revoke() is called before the tag is sent out or while the messenger is sending the header, we will get a connection reset, either due to a bad tag (0 is not a valid tag) or a bad header CRC, which kind of defeats the purpose of revoke. Currently the kernel client refuses to work with header CRCs disabled, but that will likely change in the future, making this even worse. (3) con->out_skip is not reset on connection reset, leading to one or more spurious connection resets if we happen to get a real one between con->out_skip is set in ceph_msg_revoke() and before it's cleared in write_partial_skip(). Fixing (1) and (3) is trivial. The idea behind fixing (2) is to never zero the tag or the header, i.e. send out tag+header regardless of when ceph_msg_revoke() is called. That way the header is always correct, no unnecessary resets are induced and revoke stands ready for disabled CRCs. Since ceph_msg_revoke() rips out con->out_msg, introduce a new "message out temp" and copy the header into it before sending. Cc: stable@vger.kernel.org # 4.0+ Reported-by: Matt Conner <matt.conner@keepertech.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Tested-by: Matt Conner <matt.conner@keepertech.com> Reviewed-by: Sage Weil <sage@redhat.com>
2016-01-21ceph: ceph_frag_contains_value can be booleanYaowei Bai
This patch makes ceph_frag_contains_value return bool to improve readability due to this particular function only using either one or zero as its return value. No functional change. Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-01-21ceph: remove unused functions in ceph_frag.hYaowei Bai
These functions were introduced in commit 3d14c5d2b ("ceph: factor out libceph from Ceph file system"). Howover, there's no user of these functions since then, so remove them for simplicity. Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-01-21perf: Collapse and fix event_function_call() usersPeter Zijlstra
There is one common bug left in all the event_function_call() users, between loading ctx->task and getting to the remote_function(), ctx->task can already have been changed. Therefore we need to double check and retry if ctx->task != current. Insert another trampoline specific to event_function_call() that checks for this and further validates state. This also allows getting rid of the active/inactive functions. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-21{IB, net}/mlx5: Move the modify QP operation table to mlx5_ibmajd@mellanox.com
When modifying a QP, the desired operation was determined in the mlx5_core using a transition table that takes the current state, the final state, and returns the desired operation. Since this logic will be used for Raw Packet QP, move the operation table to the mlx5_ib. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21IB/mlx5: Support setting Ethernet priority for Raw Packet QPsmajd@mellanox.com
When the user changes the Address Vector(AV) in the modify QP, he provides an SL. This SL should be translated to Ethernet Priority by taking the 3 LSB bits, and modify the QP's TIS according to this Ethernet priority. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21IB/mlx5: Add Raw Packet QP query functionalitymajd@mellanox.com
Since Raw Packet QP is composed of RQ and SQ, the IB QP's state is derived from the sub-objects. Therefore we need to query each one of the sub-objects, and decide on the IB QP's state. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21net/mlx5_core: Add RQ and SQ event handlingmajd@mellanox.com
RQ/SQ will be used to implement IB verbs QPs, so the IB QP affiliated events are affiliated also with SQs and RQs. Since SQ, RQ and QP resource numbers do not share the same name space, a queue type field was added to the event data to specify the SW object that the event is affiliated with. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21net/mlx5_core: Export transport objectsmajd@mellanox.com
To be used by mlx5_ib in the following patches for implementing RAW PACKET QP. Add mlx5_core_ prefix to alloc and delloc transport_domain since they are exposed now. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-20Merge tag 'pm+acpi-4.5-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management and ACPI updates from Rafael Wysocki: "This includes fixes on top of the previous batch of PM+ACPI updates and some new material as well. From the new material perspective the most significant are the driver core changes that should allow USB devices to stay suspended over system suspend/resume cycles if they have been runtime-suspended already beforehand. Apart from that, ACPICA is updated to upstream revision 20160108 (cosmetic mostly, but including one fixup on top of the previous ACPICA update) and there are some devfreq updates the didn't make it before (due to timing). A few recent regressions are fixed, most importantly in the cpuidle menu governor and in the ACPI backlight driver and some x86 platform drivers depending on it. Some more bugs are fixed and cleanups are made on top of that. Specifics: - Modify the driver core and the USB subsystem to allow USB devices to stay suspended over system suspend/resume cycles if they have been runtime-suspended already beforehand and fix some bugs on top of these changes (Tomeu Vizoso, Rafael Wysocki). - Update ACPICA to upstream revision 20160108, including updates of the ACPICA's copyright notices, a code fixup resulting from a regression fix that was necessary in the upstream code only (the regression fixed by it has never been present in Linux) and a compiler warning fix (Bob Moore, Lv Zheng). - Fix a recent regression in the cpuidle menu governor that broke it on practically all architectures other than x86 and make a couple of optimizations on top of that fix (Rafael Wysocki). - Clean up the selection of cpuidle governors depending on whether or not the kernel is configured for tickless systems (Jean Delvare). - Revert a recent commit that introduced a regression in the ACPI backlight driver, address the problem it attempted to fix in a different way and revert one more cosmetic change depending on the problematic commit (Hans de Goede). - Add two more ACPI backlight quirks (Hans de Goede). - Fix a few minor problems in the core devfreq code, clean it up a bit and update the MAINTAINERS information related to it (Chanwoo Choi, MyungJoo Ham). - Improve an error message in the ACPI fan driver (Andy Lutomirski). - Fix a recent build regression in the cpupower tool (Shreyas Prabhu)" * tag 'pm+acpi-4.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (32 commits) cpuidle: menu: Avoid pointless checks in menu_select() sched / idle: Drop default_idle_call() fallback from call_cpuidle() cpupower: Fix build error in cpufreq-info cpuidle: Don't enable all governors by default cpuidle: Default to ladder governor on ticking systems time: nohz: Expose tick_nohz_enabled ACPICA: Update version to 20160108 ACPICA: Silence a -Wbad-function-cast warning when acpi_uintptr_t is 'uintptr_t' ACPICA: Additional 2016 copyright changes ACPICA: Reduce regression fix divergence from upstream ACPICA ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Satellite R830 ACPI / video: Revert "thinkpad_acpi: Use acpi_video_handles_brightness_key_presses()" ACPI / video: Document acpi_video_handles_brightness_key_presses() a bit ACPI / video: Fix using an uninitialized mutex / list_head in acpi_video_handles_brightness_key_presses() ACPI / video: Revert "ACPI / video: driver must be registered before checking for keypresses" ACPI / fan: Improve acpi_device_update_power error message ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Portege R700 cpuidle: menu: Fix menu_select() for CPUIDLE_DRIVER_STATE_START == 0 MAINTAINERS: Add devfreq-event entry MAINTAINERS: Add missing git repository and directory for devfreq ...
2016-01-20Merge tag 'renesas-sh-drivers-for-v4.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas Pull SH driver updates from Simon Horman: "Clean up the clock API on legacy SH to make it more similar to the Common Clock Framework. This will avoid different behaviour in drivers shared between legacy and CCF-based platforms (e.g. SCIF)" * tag 'renesas-sh-drivers-for-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: drivers: sh: clk: Avoid crashes when passing NULL clocks drivers: sh: clk: Remove obsolete and unused clk_round_parent()
2016-01-20Merge tag 'armsoc-drivers' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC driver updates from Olof Johansson: "Driver updates for ARM SoCs. Some for SoC-family code under drivers/soc, but also some other driver updates that don't belong anywhere else. We also bring in the drivers/reset code through arm-soc. Some of the larger updates: - Qualcomm support for SMEM, SMSM, SMP2P. All used to communicate with other parts of the chip/board on these platforms, all proprietary protocols that don't fit into other subsystems and live in drivers/soc for now. - System bus driver for UniPhier - Driver for the TI Wakeup M3 IPC device - Power management for Raspberry PI + Again a bunch of other smaller updates and patches" * tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (38 commits) bus: uniphier: allow only built-in driver ARM: bcm2835: clarify RASPBERRYPI_FIRMWARE dependency MAINTAINERS: Drop Kumar Gala from QCOM bus: uniphier-system-bus: add UniPhier System Bus driver ARM: bcm2835: add rpi power domain driver dt-bindings: add rpi power domain driver bindings ARM: bcm2835: Define two new packets from the latest firmware. drivers/soc: make mediatek/mtk-scpsys.c explicitly non-modular soc: mediatek: SCPSYS: Add regulator support MAINTAINERS: Change QCOM entries soc: qcom: smd-rpm: Add existing platform support memory/tegra: Add number of TLB lines for Tegra124 reset: hi6220: fix modular build soc: qcom: Introduce WCNSS_CTRL SMD client ARM: qcom: select ARM_CPU_SUSPEND for power management MAINTAINERS: Add rules for Qualcomm dts files soc: qcom: enable smsm/smp2p modular build serial: msm_serial: Make config tristate soc: qcom: smp2p: Qualcomm Shared Memory Point to Point soc: qcom: smsm: Add driver for Qualcomm SMSM ...
2016-01-20Merge tag 'armsoc-soc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC platform updates from Olof Johansson: "Updates for new platform support: - New platform: Tango4 from Sigma Designs. - Broadcom BCM2836 (Raspberry Pi 2 SoC) - Enable cpufreq on Freescale i.MX7D - Rockchip: SMP support for rk3036, general support for rk3228 - SMP support on Broadcom Kona and NSP - Cleanups for OMAP removing legacy IOMMU data + a bunch of misc fixes and tweaks for various platforms" * tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (46 commits) ARM: tango: Fix UP build issues ARM: tango: pass ARM arch level for smc.S ARM: bcm2835: Add Kconfig support for bcm2836 ARM: OMAP2+: Add support for dm814x and dra62x usb ARM: OMAP2+: Add mmc hwmod entries for dm814x ARM: OMAP2+: Update 81xx clock and power domains for default, active and sgx ARM: OMAP2+: Fix SoC detection for dra62x j5-eco ARM: tango4: Initial platform support ARM: bcm2835: Add a compat string for bcm2836 machine probe dt-bindings: Add root properties for Raspberry Pi 2 ARM: imx: select SRC for i.MX7 ARM: uniphier: select PINCTRL ARM: OMAP2+: Remove device creation for omap-pcm-audio ARM: OMAP1: Remove device creation for omap-pcm-audio ARM: rockchip: enable support for RK3228 SoCs ARM: rockchip: use const and __initconst for rk3036 smp_operations ARM: zynq: Select ARCH_HAS_RESET_CONTROLLER ARM: BCM: Add SMP support for Broadcom 4708 ARM: BCM: Add SMP support for Broadcom NSP ARM: BCM: Clean up SMP support for Broadcom Kona ...
2016-01-20Merge tag 'armsoc-multiplatform' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC multiplatform code updates from Arnd Bergmann: "This branch is the culmination of 5 years of effort to bring the ARMv6 and ARMv7 platforms together such that they can all be enabled and boot the same kernel. It has been a tremendous amount of cleanup and refactoring by a huge number of people, and creation of several new (and major) subsystems to better abstract out all the platform details in an appropriate manner. The bulk of this branch is a large patchset from Arnd that brings several of the more minor and older platforms we have closer to multiplatform support. Among these are MMP, S3C64xx, Orion5x, mv78xx0 and realview Much of this is moving around header files from old mach directories, but there are also some cleanup patches of debug_ll (lowlevel debug per-platform options) and other parts. Linus Walleij also has some patchs to clean up the older ARM Realview platforms by finally introducing DT support, and Rob Herring has some for ARM Versatile which is now DT-only. Both of these platforms are now multiplatform. Finally, a couple of patches from Russell for Dove PMU, and a fix from Valentin Rothberg for Exynos ADC, which were rebased on top of the series to avoid conflicts" * tag 'armsoc-multiplatform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (75 commits) ARM: realview: don't select SMP_ON_UP for UP builds ARM: s3c: simplify s3c_irqwake_{e,}intallow definition ARM: s3c64xx: fix pm-debug compilation iio: exynos-adc: fix irqf_oneshot.cocci warnings ARM: realview: build realview-dt SMP support only when used ARM: realview: select apropriate targets ARM: realview: clean up header files ARM: realview: make all header files local ARM: no longer make CPU targets visible separately ARM: integrator: use explicit core module options ARM: realview: enable multiplatform ARM: make default platform work for NOMMU ARM: debug-ll: move DEBUG_LL_UART_EFM32 to correct Kconfig location ARM: defconfig: use correct debug_ll settings ARM: versatile: convert to multi-platform ARM: versatile: merge mach code into a single file ARM: versatile: switch to DT only booting and remove legacy code ARM: versatile: add DT based PCI detection ARM: pxa: mark ezx structures as __maybe_unused ARM: pxa: mark raumfeld init functions as __maybe_unused ...
2016-01-20Merge tag 'armsoc-cleanup' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC cleanups from Olof Johansson: "A smallish number of general cleanup commits this release cycle. Some of these are minor tweaks: - shmobile change of binding for their GIC (using arm,pl390 now) - ARCH_RENESAS introduction - Misc other renesas updates There's also a couple of treewide commits from Masahiro Yamada cleaning up const/__initconst for SMP operation structs and a switch to using "depends on" instead of if-constructs on most of the Kconfig platform targets" * tag 'armsoc-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: staging: board: armadillo800eva: Use "arm,pl390" staging: board: kzm9d: Use "arm,pl390" ARM: shmobile: r8a7778 dtsi: Use "arm,pl390" for GIC ARM: shmobile: emev2 dtsi: Use "arm,pl390" for GIC ARM: shmobile: r8a7740 dtsi: Use "arm,pl390" for GIC ARM: shmobile: r7s72100 dtsi: Use "arm,pl390" for GIC ARM: use "depends on" for SoC configs instead of "if" after prompt ARM/clocksource: use automatic DT probing for ux500 PRCMU ARM: use const and __initconst for smp_operations ARM: hisi: do not export smp_operations structures ARM: mvebu: remove unused mach/gpio.h ARM: shmobile: Remove legacy mach/irqs.h ARM: shmobile: Introduce ARCH_RENESAS MAINTAINERS: Remove link to oss.renesas.com which is closed
2016-01-20Merge tag 'armsoc-fixes-nc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull non-urgent ARM SoC fixes from Olof Johansson: "As usual, we queue up a few fixes that don't seem urgent enough to go in through -rc. - MAINTAINERS updates to add a list for brcmstb and fix a typo - A handful of fixes for OMAP 81xx, a recently resurrected platform so these can't be considered real regressions and thus got queued. - A couple of other small fixes for scoop, sa1100 and davinci" * tag 'armsoc-fixes-nc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: OMAP2+: Fix randconfig build warning for dm814_pllss_data ARM: sa1100/simpad: Be sure to clamp return value ARM: scoop: Be sure to clamp return value ARM: davinci: fix a problematic usage of WARN() ARM: davinci: only select WT cache if cache is enabled ARM: OMAP2+: Remove useless check for legacy booting for dm814x ARM: OMAP2+: Enable GPIO for dm814x ARM: dts: Fix dm814x pinctrl address and mask ARM: dts: Fix dm8148 control modules ranges ARM: OMAP2+: Fix timer entries for dm814x ARM: dts: Fix some mux and divider clocks to get dm814x-evm booting ARM: OMAP2+: Add DPPLS clock manager for dm814x clk: ti: Add few dm814x clock aliases ARM: dts: Fix dm814x entries for pllss and prcm MAINTAINERS: gpio-brcmstb: Remove stray '>' MAINTAINERS: brcmstb: Include Broadcom internal mailing-list
2016-01-20Merge branch 'for-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "The highlights this round include: - Introduce configfs support for unlocked configfs_depend_item() (krzysztof + andrezej) - Conversion of usb-gadget target driver to new function registration interface (andrzej + sebastian) - Enable qla2xxx FC target mode support for Extended Logins (himansu + giridhar) - Enable qla2xxx FC target mode support for Exchange Offload (himansu + giridhar) - Add qla2xxx FC target mode irq affinity notification + selective command queuing. (quinn + himanshu) - Fix iscsi-target deadlock in se_node_acl configfs deletion (sagi + nab) - Convert se_node_acl configfs deletion + se_node_acl->queue_depth to proper se_session->sess_kref + target_get_session() usage. (hch + sagi + nab) - Fix long-standing race between se_node_acl->acl_kref get and get_initiator_node_acl() lookup. (hch + nab) - Fix target/user block-size handling, and make sure netlink reaches all network namespaces (sheng + andy) Note there is an outstanding bug-fix series for remote I_T nexus port TMR LUN_RESET has been posted and still being tested, and will likely become post -rc1 material at this point" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (56 commits) scsi: qla2xxxx: avoid type mismatch in comparison target/user: Make sure netlink would reach all network namespaces target: Obtain se_node_acl->acl_kref during get_initiator_node_acl target: Convert ACL change queue_depth se_session reference usage iscsi-target: Fix potential dead-lock during node acl delete ib_srpt: Convert acl lookup to modern get_initiator_node_acl usage tcm_fc: Convert acl lookup to modern get_initiator_node_acl usage tcm_fc: Wait for command completion before freeing a session target: Fix a memory leak in target_dev_lba_map_store() target: Support aborting tasks with a 64-bit tag usb/gadget: Remove set-but-not-used variables target: Remove an unused variable target: Fix indentation in target_core_configfs.c target/user: Allow user to set block size before enabling device iser-target: Fix non negative ERR_PTR isert_device_get usage target/fcoe: Add tag support to tcm_fc qla2xxx: Check for online flag instead of active reset when transmitting responses qla2xxx: Set all queues to 4k qla2xxx: Disable ZIO at start time. qla2xxx: Move atioq to a different lock to reduce lock contention ...
2016-01-20mm: memcontrol: add "sock" to cgroup2 memory.statJohannes Weiner
Provide statistics on how much of a cgroup's memory footprint is made up of socket buffers from network connections owned by the group. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Vladimir Davydov <vdavydov@virtuozzo.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20mm: free swap cache aggressively if memcg swap is fullVladimir Davydov
Swap cache pages are freed aggressively if swap is nearly full (>50% currently), because otherwise we are likely to stop scanning anonymous when we near the swap limit even if there is plenty of freeable swap cache pages. We should follow the same trend in case of memory cgroup, which has its own swap limit. Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20mm: vmscan: do not scan anon pages if memcg swap limit is hitVladimir Davydov
We don't scan anonymous memory if we ran out of swap, neither should we do it in case memcg swap limit is hit, because swap out is impossible anyway. Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20swap.h: move memcg related stuff to the end of the fileVladimir Davydov
The following patches will add more functions to the memcg section of include/linux/swap.h. Some of them will need values defined below the current location of the section. So let's move the section to the end of the file. No functional changes intended. Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20mm: memcontrol: replace mem_cgroup_lruvec_online with mem_cgroup_onlineVladimir Davydov
mem_cgroup_lruvec_online() takes lruvec, but it only needs memcg. Since get_scan_count(), which is the only user of this function, now possesses pointer to memcg, let's pass memcg directly to mem_cgroup_online() instead of picking it out of lruvec and rename the function accordingly. Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20mm: memcontrol: charge swap to cgroup2Vladimir Davydov
This patchset introduces swap accounting to cgroup2. This patch (of 7): In the legacy hierarchy we charge memsw, which is dubious, because: - memsw.limit must be >= memory.limit, so it is impossible to limit swap usage less than memory usage. Taking into account the fact that the primary limiting mechanism in the unified hierarchy is memory.high while memory.limit is either left unset or set to a very large value, moving memsw.limit knob to the unified hierarchy would effectively make it impossible to limit swap usage according to the user preference. - memsw.usage != memory.usage + swap.usage, because a page occupying both swap entry and a swap cache page is charged only once to memsw counter. As a result, it is possible to effectively eat up to memory.limit of memory pages *and* memsw.limit of swap entries, which looks unexpected. That said, we should provide a different swap limiting mechanism for cgroup2. This patch adds mem_cgroup->swap counter, which charges the actual number of swap entries used by a cgroup. It is only charged in the unified hierarchy, while the legacy hierarchy memsw logic is left intact. The swap usage can be monitored using new memory.swap.current file and limited using memory.swap.max. Note, to charge swap resource properly in the unified hierarchy, we have to make swap_entry_free uncharge swap only when ->usage reaches zero, not just ->count, i.e. when all references to a swap entry, including the one taken by swap cache, are gone. This is necessary, because otherwise swap-in could result in uncharging swap even if the page is still in swap cache and hence still occupies a swap entry. At the same time, this shouldn't break memsw counter logic, where a page is never charged twice for using both memory and swap, because in case of legacy hierarchy we uncharge swap on commit (see mem_cgroup_commit_charge). Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20mm: memcontrol: clean up alloc, online, offline, free functionsJohannes Weiner
The creation and teardown of struct mem_cgroup is fairly messy and that has attracted mistakes and subtle bugs before. The main cause for this is that there is no clear model about what needs to happen when, and that attracts more chaos. So create one: 1. mem_cgroup_alloc() should allocate struct mem_cgroup and its auxiliary members and initialize work items, locks etc. so that the object it returns is fully initialized and in a neutral state. 2. mem_cgroup_css_alloc() will use mem_cgroup_alloc() to obtain a new memcg object and configure it and the system according to the role of the new memory-controlled cgroup in the hierarchy. 3. mem_cgroup_css_online() is no longer needed to synchronize with iterators, but it verifies css->id which isn't available earlier. 4. mem_cgroup_css_offline() implements stuff that needs to happen upon the user-visible destruction of a cgroup, which includes stopping all user interfacing as well as releasing certain structures when continued memory consumption would be unexpected at that point. 5. mem_cgroup_css_free() prepares the system and the memcg object for the object's disappearance, neutralizes its state, and then gives it back to mem_cgroup_free(). 6. mem_cgroup_free() releases struct mem_cgroup and auxiliary memory. [arnd@arndb.de: fix SLOB build regression] Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20mm: memcontrol: flatten struct cg_protoJohannes Weiner
There are no more external users of struct cg_proto, flatten the structure into struct mem_cgroup. Since using those struct members doesn't stand out as much anymore, add cgroup2 static branches to make it clearer which code is legacy. Suggested-by: Vladimir Davydov <vdavydov@virtuozzo.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20mm: memcontrol: rein in the CONFIG space madnessJohannes Weiner
What CONFIG_INET and CONFIG_LEGACY_KMEM guard inside the memory controller code is insignificant, having these conditionals is not worth the complication and fragility that comes with them. [akpm@linux-foundation.org: rework mem_cgroup_css_free() statement ordering] Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20mm: memcontrol: introduce CONFIG_MEMCG_LEGACY_KMEMJohannes Weiner
Let the user know that CONFIG_MEMCG_KMEM does not apply to the cgroup2 interface. This also makes legacy-only code sections stand out better. [arnd@arndb.de: mm: memcontrol: only manage socket pressure for CONFIG_INET] Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Tejun Heo <tj@kernel.org> Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20mm: memcontrol: move kmem accounting code to CONFIG_MEMCGJohannes Weiner
The cgroup2 memory controller will account important in-kernel memory consumers per default. Move all necessary components to CONFIG_MEMCG. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20mm: memcontrol: give the kmem states more descriptive namesJohannes Weiner
On any given memcg, the kmem accounting feature has three separate states: not initialized, structures allocated, and actively accounting slab memory. These are represented through a combination of the kmem_acct_activated and kmem_acct_active flags, which is confusing. Convert to a kmem_state enum with the states NONE, ALLOCATED, and ONLINE. Then rename the functions to modify the state accordingly. This follows the nomenclature of css object states more closely. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Tejun Heo <tj@kernel.org> Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20dma-mapping: use offset_in_page macroGeliang Tang
Use offset_in_page macro instead of (addr & ~PAGE_MASK). Signed-off-by: Geliang Tang <geliangtang@163.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20dma-mapping: remove <asm-generic/dma-coherent.h>Christoph Hellwig
This wasn't an asm-generic header to start with, and can be merged into dma-mapping.h trivially. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Helge Deller <deller@gmx.de> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Steven Miao <realmz6@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20dma-mapping: always provide the dma_map_ops based implementationChristoph Hellwig
Move the generic implementation to <linux/dma-mapping.h> now that all architectures support it and remove the HAVE_DMA_ATTR Kconfig symbol now that everyone supports them. [valentinrothberg@gmail.com: remove leftovers in Kconfig] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Helge Deller <deller@gmx.de> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Steven Miao <realmz6@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20ipc/shm.c: is_file_shm_hugepages() can be booleanYaowei Bai
Make is_file_shm_hugepages() return bool to improve readability due to this particular function only using either one or zero as its return value. No functional change. Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20lz4: fix wrong compress buffer size for 64-bitsBongkyu Kim
The current lz4 compress buffer is 16kb on 32-bits, 32kb on 64-bits system. But, lz4 needs only 16kb on both. On 64-bits, this causes wasted cpu cycles for additional memset during every compression. In case of lz4hc, the current buffer size is (256kb + 8) on 32-bits, (512kb + 16) on 64-bits. But, lz4hc needs only (256kb + 2 * pointer) on both. This patch fixes these wrong compress buffer sizes for 64-bits. Signed-off-by: Bongkyu Kim <bongkyu.kim@lge.com> Cc: Chanho Min <chanho.min@lge.com> Cc: Yann Collet <yann.collet.73@gmail.com> Cc: Kyungsik Lee <kyungsik.lee@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20UBSAN: run-time undefined behavior sanity checkerAndrey Ryabinin
UBSAN uses compile-time instrumentation to catch undefined behavior (UB). Compiler inserts code that perform certain kinds of checks before operations that could cause UB. If check fails (i.e. UB detected) __ubsan_handle_* function called to print error message. So the most of the work is done by compiler. This patch just implements ubsan handlers printing errors. GCC has this capability since 4.9.x [1] (see -fsanitize=undefined option and its suboptions). However GCC 5.x has more checkers implemented [2]. Article [3] has a bit more details about UBSAN in the GCC. [1] - https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Debugging-Options.html [2] - https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html [3] - http://developerblog.redhat.com/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/ Issues which UBSAN has found thus far are: Found bugs: * out-of-bounds access - 97840cb67ff5 ("netfilter: nfnetlink: fix insufficient validation in nfnetlink_bind") undefined shifts: * d48458d4a768 ("jbd2: use a better hash function for the revoke table") * 10632008b9e1 ("clockevents: Prevent shift out of bounds") * 'x << -1' shift in ext4 - http://lkml.kernel.org/r/<5444EF21.8020501@samsung.com> * undefined rol32(0) - http://lkml.kernel.org/r/<1449198241-20654-1-git-send-email-sasha.levin@oracle.com> * undefined dirty_ratelimit calculation - http://lkml.kernel.org/r/<566594E2.3050306@odin.com> * undefined roundown_pow_of_two(0) - http://lkml.kernel.org/r/<1449156616-11474-1-git-send-email-sasha.levin@oracle.com> * [WONTFIX] undefined shift in __bpf_prog_run - http://lkml.kernel.org/r/<CACT4Y+ZxoR3UjLgcNdUm4fECLMx2VdtfrENMtRRCdgHB2n0bJA@mail.gmail.com> WONTFIX here because it should be fixed in bpf program, not in kernel. signed overflows: * 32a8df4e0b33f ("sched: Fix odd values in effective_load() calculations") * mul overflow in ntp - http://lkml.kernel.org/r/<1449175608-1146-1-git-send-email-sasha.levin@oracle.com> * incorrect conversion into rtc_time in rtc_time64_to_tm() - http://lkml.kernel.org/r/<1449187944-11730-1-git-send-email-sasha.levin@oracle.com> * unvalidated timespec in io_getevents() - http://lkml.kernel.org/r/<CACT4Y+bBxVYLQ6LtOKrKtnLthqLHcw-BMp3aqP3mjdAvr9FULQ@mail.gmail.com> * [NOTABUG] signed overflow in ktime_add_safe() - http://lkml.kernel.org/r/<CACT4Y+aJ4muRnWxsUe1CMnA6P8nooO33kwG-c8YZg=0Xc8rJqw@mail.gmail.com> [akpm@linux-foundation.org: fix unused local warning] [akpm@linux-foundation.org: fix __int128 build woes] Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michal Marek <mmarek@suse.cz> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Yury Gribov <y.gribov@samsung.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20rbtree: use READ_ONCE in RB_EMPTY_ROOTDavidlohr Bueso
With commit d72da4a4d97 ("rbtree: Make lockless searches non-fatal") our rbtrees provide weak guarantees that allows us to do lockless (and very speculative) reads of the tree. Such readers cannot see partial stores on nodes, ie left/right as well as root. As such, similar to the WRITE_ONCE semantics when doing rotations, use READ_ONCE when checking the root node in RB_EMPTY_ROOT. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Michel Lespinasse <walken@google.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20kexec: move some memembers and definitions within the scope of CONFIG_KEXEC_FILEXunlei Pang
Move the stuff currently only used by the kexec file code within CONFIG_KEXEC_FILE (and CONFIG_KEXEC_VERIFY_SIG). Also move internal "struct kexec_sha_region" and "struct kexec_buf" into "kexec_internal.h". Signed-off-by: Xunlei Pang <xlpang@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Dave Young <dyoung@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20kernel/cpu.c: make set_cpu_* static inlinesRasmus Villemoes
Almost all callers of the set_cpu_* functions pass an explicit true or false. Making them static inline thus replaces the function calls with a simple set_bit/clear_bit, saving some .text. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20kernel/cpu.c: eliminate cpu_*_maskRasmus Villemoes
Replace the variables cpu_possible_mask, cpu_online_mask, cpu_present_mask and cpu_active_mask with macros expanding to expressions of the same type and value, eliminating some indirection. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20kernel/cpu.c: export __cpu_*_maskRasmus Villemoes
Exporting the cpumasks __cpu_possible_mask and friends will allow us to remove the extra indirection through the cpu_*_mask variables. It will also allow the set_cpu_* functions to become static inlines, which will give a .text reduction. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20ptrace: use fsuid, fsgid, effective creds for fs access checksJann Horn
By checking the effective credentials instead of the real UID / permitted capabilities, ensure that the calling process actually intended to use its credentials. To ensure that all ptrace checks use the correct caller credentials (e.g. in case out-of-tree code or newly added code omits the PTRACE_MODE_*CREDS flag), use two new flags and require one of them to be set. The problem was that when a privileged task had temporarily dropped its privileges, e.g. by calling setreuid(0, user_uid), with the intent to perform following syscalls with the credentials of a user, it still passed ptrace access checks that the user would not be able to pass. While an attacker should not be able to convince the privileged task to perform a ptrace() syscall, this is a problem because the ptrace access check is reused for things in procfs. In particular, the following somewhat interesting procfs entries only rely on ptrace access checks: /proc/$pid/stat - uses the check for determining whether pointers should be visible, useful for bypassing ASLR /proc/$pid/maps - also useful for bypassing ASLR /proc/$pid/cwd - useful for gaining access to restricted directories that contain files with lax permissions, e.g. in this scenario: lrwxrwxrwx root root /proc/13020/cwd -> /root/foobar drwx------ root root /root drwxr-xr-x root root /root/foobar -rw-r--r-- root root /root/foobar/secret Therefore, on a system where a root-owned mode 6755 binary changes its effective credentials as described and then dumps a user-specified file, this could be used by an attacker to reveal the memory layout of root's processes or reveal the contents of files he is not allowed to access (through /proc/$pid/cwd). [akpm@linux-foundation.org: fix warning] Signed-off-by: Jann Horn <jann@thejh.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: Casey Schaufler <casey@schaufler-ca.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Morris <james.l.morris@oracle.com> Cc: "Serge E. Hallyn" <serge.hallyn@ubuntu.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20include/linux/radix-tree.h: fix error in docs about locksAdam Barth
This text refers to the "first 7 functions", which was correct when written but became incorrect when Johannes Weiner added another function to the list in 139e561660fe ("lib: radix_tree: tree node interface"). Change the text to correctly refer to the first 8 functions. Signed-off-by: Adam Barth <aurorean@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20lib/iomap_copy.c: add __ioread32_copy()Stephen Boyd
Some drivers need to read data out of iomem areas 32-bits at a time. Add an API to do this. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Cc: Bjorn Andersson <bjorn.andersson@sonymobile.com> Cc: <zajec5@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Hauke Mehrtens <hauke@hauke-m.de> Cc: Paul Walmsley <paul@pwsan.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-21Merge branch 'pm-cpuidle'Rafael J. Wysocki
* pm-cpuidle: cpuidle: menu: Avoid pointless checks in menu_select() sched / idle: Drop default_idle_call() fallback from call_cpuidle() cpuidle: Don't enable all governors by default cpuidle: Default to ladder governor on ticking systems time: nohz: Expose tick_nohz_enabled cpuidle: menu: Fix menu_select() for CPUIDLE_DRIVER_STATE_START == 0
2016-01-21Merge branch 'pm-devfreq'Rafael J. Wysocki
* pm-devfreq: MAINTAINERS: Add devfreq-event entry MAINTAINERS: Add missing git repository and directory for devfreq PM / devfreq: Do not show statistics if it's not ready. PM / devfreq: Modify the indentation of trans_stat sysfs for readability PM / devfreq: Set the freq_table of devfreq device PM / devfreq: Add show_one macro to delete the duplicate code PM / devfreq: event: Fix the error and warning from script/checkpatch.pl PM / devfreq: event: Remove the error log of devfreq_event_get_edev_by_phandle()
2016-01-21Merge branch 'pm-core'Rafael J. Wysocki
* pm-core: driver core: Avoid NULL pointer dereferences in device_is_bound() platform: Do not detach from PM domains on shutdown USB / PM: Allow USB devices to remain runtime-suspended when sleeping PM / sleep: Go direct_complete if driver has no callbacks PM / Domains: add setter for dev.pm_domain device core: add device_is_bound()
2016-01-20swiotlb: Make linux/swiotlb.h standalone includibleThierry Reding
This header file uses the enum dma_data_direction and struct page types without explicitly including the corresponding header files. This makes it rely on the includer to have included the proper headers before. To fix this, include linux/dma-direction.h and forward-declare struct page. The swiotlb_free() function is also annotated __init, therefore requires linux/init.h to be included as well. Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2016-01-20Merge branch 'pci/trivial' into nextBjorn Helgaas
* pci/trivial: PCI: shpchp: Constify hpc_ops structure PCI: Use kobj_to_dev() instead of open-coding it PCI: Use to_pci_dev() instead of open-coding it PCI: Fix all whitespace issues PCI/MSI: Fix typos in <linux/msi.h>
2016-01-20Merge branches 'pci/iommu' and 'pci/misc' into nextBjorn Helgaas
* pci/iommu: PCI: Add function 1 DMA alias quirk for Lite-On/Plextor M6e/Marvell 88SS9183 * pci/misc: PCI: Limit config space size for Netronome NFP4000 PCI: Add Netronome NFP4000 PF device ID
2016-01-19pipe: limit the per-user amount of pages allocated in pipesWilly Tarreau
On no-so-small systems, it is possible for a single process to cause an OOM condition by filling large pipes with data that are never read. A typical process filling 4000 pipes with 1 MB of data will use 4 GB of memory. On small systems it may be tricky to set the pipe max size to prevent this from happening. This patch makes it possible to enforce a per-user soft limit above which new pipes will be limited to a single page, effectively limiting them to 4 kB each, as well as a hard limit above which no new pipes may be created for this user. This has the effect of protecting the system against memory abuse without hurting other users, and still allowing pipes to work correctly though with less data at once. The limit are controlled by two new sysctls : pipe-user-pages-soft, and pipe-user-pages-hard. Both may be disabled by setting them to zero. The default soft limit allows the default number of FDs per process (1024) to create pipes of the default size (64kB), thus reaching a limit of 64MB before starting to create only smaller pipes. With 256 processes limited to 1024 FDs each, this results in 1024*64kB + (256*1024 - 1024) * 4kB = 1084 MB of memory allocated for a user. The hard limit is disabled by default to avoid breaking existing applications that make intensive use of pipes (eg: for splicing). Reported-by: socketpair@gmail.com Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Mitigates: CVE-2013-4312 (Linux 2.0+) Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>