Age | Commit message (Collapse) | Author |
|
Since the statistics handler is asynchrous, it can very well
be that we will handle the statistics (hence the RSSI
fluctuation) when we already disassociated.
Don't WARN on this case.
This solves: https://bugzilla.redhat.com/show_bug.cgi?id=1071998
Cc: <stable@vger.kernel.org> [3.10+]
Fixes: 2b76ef13086f ("iwlwifi: mvm: implement reduced Tx power")
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
|
|
Pull networking fixes from David Miller:
1) Fix memory leak in ieee80211_prep_connection(), sta_info leaked on
error. From Eytan Lifshitz.
2) Unintentional switch case fallthrough in nft_reject_inet_eval(),
from Patrick McHardy.
3) Must check if payload lenth is a power of 2 in
nft_payload_select_ops(), from Nikolay Aleksandrov.
4) Fix mis-checksumming in xen-netfront driver, ip_hdr() is not in the
correct place when we invoke skb_checksum_setup(). From Wei Liu.
5) TUN driver should not advertise HW vlan offload features in
vlan_features. Fix from Fernando Luis Vazquez Cao.
6) IPV6_VTI needs to select NET_IPV_TUNNEL to avoid build errors, fix
from Steffen Klassert.
7) Add missing locking in xfrm_migrade_state_find(), we must hold the
per-namespace xfrm_state_lock while traversing the lists. Fix from
Steffen Klassert.
8) Missing locking in ath9k driver, access to tid->sched must be done
under ath_txq_lock(). Fix from Stanislaw Gruszka.
9) Fix two bugs in TCP fastopen. First respect the size argument given
to tcp_sendmsg() in the fastopen path, and secondly prevent
tcp_send_syn_data() from potentially using order-5 allocations.
From Eric Dumazet.
10) Fix handling of default neigh garbage collection params, from Jiri
Pirko.
11) Fix cwnd bloat and over-inflation of RTT when transmit segmentation
is in use. From Eric Dumazet.
12) Missing initialization of Realtek r8169 driver's statistics
seqlocks. Fix from Kyle McMartin.
13) Fix RTNL assertion failures in 802.3ad and AB ARP monitor of bonding
driver, from Ding Tianhong.
14) Bonding slave release race can cause divide by zero, fix from
Nikolay Aleksandrov.
15) Overzealous return from neigh_periodic_work() causes reachability
time to not be computed. Fix from Duain Jiong.
16) Fix regression in ipv6_find_hdr(), it should not return -ENOENT when
a specific target is specified and found. From Hans Schillstrom.
17) Fix VLAN tag stripping regression in BNA driver, from Ivan Vecera.
18) Tail loss probe can calculate bogus RTTs due to missing packet
marking on retransmit. Fix from Yuchung Cheng.
19) We cannot do skb_dst_drop() in iptunnel_pull_header() because
multicast loopback detection in later code paths need access to
skb_rtable(). Fix from Xin Long.
20) The macvlan driver regresses in that it propagates lower device
offload support disables into itself, causing severe slowdowns when
running over a bridge. Provide the software offloads always on
macvlan devices to deal with this and the regression is gone. From
Vlad Yasevich.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (103 commits)
macvlan: Add support for 'always_on' offload features
net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable
ip_tunnel:multicast process cause panic due to skb->_skb_refdst NULL pointer
net: cpsw: fix cpdma rx descriptor leak on down interface
be2net: isolate TX workarounds not applicable to Skyhawk-R
be2net: Fix skb double free in be_xmit_wrokarounds() failure path
be2net: clear promiscuous bits in adapter->flags while disabling promiscuous mode
be2net: Fix to reset transparent vlan tagging
qlcnic: dcb: a couple off by one bugs
tcp: fix bogus RTT on special retransmission
hsr: off by one sanity check in hsr_register_frame_in()
can: remove CAN FD compatibility for CAN 2.0 sockets
can: flexcan: factor out soft reset into seperate funtion
can: flexcan: flexcan_remove(): add missing netif_napi_del()
can: flexcan: fix transition from and to freeze mode in chip_{,un}freeze
can: flexcan: factor out transceiver {en,dis}able into seperate functions
can: flexcan: fix transition from and to low power mode in chip_{en,dis}able
can: flexcan: flexcan_open(): fix error path if flexcan_chip_start() fails
can: flexcan: fix shutdown: first disable chip, then all interrupts
USB AX88179/178A: Support D-Link DUB-1312
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"A couple of fixes here which ensure that regulators using the core
support for GPIO enables work in all cases by ensuring that helpers
are used consistently rather than open coding in places and hence not
having GPIO support in some of them"
* tag 'regulator-v3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: core: Replace direct ops->disable usage
regulator: core: Replace direct ops->enable usage
|
|
Merge misc fixes from Andrew Morton.
* emailed patches from Andrew Morton akpm@linux-foundation.org>:
mm: page_alloc: exempt GFP_THISNODE allocations from zone fairness
mm: numa: bugfix for LAST_CPUPID_NOT_IN_PAGE_FLAGS
MAINTAINERS: add and correct types of some "T:" entries
MAINTAINERS: use tab for separator
rapidio/tsi721: fix tasklet termination in dma channel release
hfsplus: fix remount issue
zram: avoid null access when fail to alloc meta
sh: prefix sh-specific "CCR" and "CCR2" by "SH_"
ocfs2: fix quota file corruption
drivers/rtc/rtc-s3c.c: fix incorrect way of save/restore of S3C2410_TICNT for TYPE_S3C64XX
kallsyms: fix absolute addresses for kASLR
scripts/gen_initramfs_list.sh: fix flags for initramfs LZ4 compression
mm: include VM_MIXEDMAP flag in the VM_SPECIAL list to avoid m(un)locking
memcg: reparent charges of children before processing parent
memcg: fix endless loop in __mem_cgroup_iter_next()
lib/radix-tree.c: swapoff tmpfs radix_tree: remember to rcu_read_unlock
dma debug: account for cachelines and read-only mappings in overlap tracking
mm: close PageTail race
MAINTAINERS: EDAC: add Mauro and Borislav as interim patch collectors
|
|
Commit b5330655 ("dm thin: handle metadata failures more consistently")
increased potential for the pool's mode to be changed in response to
metadata operation failures.
When the pool mode is changed it isn't synchronized with the mode in
pool_features stored in the target's context (ti->private) that is used
as the basis for (re)establishing the pool mode during resume via
bind_control_target.
It is important that we synchronize the pool mode when it is changed
otherwise the pool may experience and unexpected mode transition on the
next resume (especially if there was no new table load).
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
|
|
Fix following sparse warnings:
drivers/firmware/efi/efivars.c:230:66: warning:
Using plain integer as NULL pointer
drivers/firmware/efi/efi.c:236:27: warning:
Using plain integer as NULL pointer
Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
|
There's no good reason to keep efi_enabled() under CONFIG_X86 anymore,
since nothing about the implementation is specific to x86.
Set EFI feature flags in the ia64 boot path instead of claiming to
support all features. The old behaviour was actually buggy since
efi.memmap never points to a valid memory map, so we shouldn't be
claiming to support EFI_MEMMAP.
Fortunately, this bug was never triggered because EFI_MEMMAP isn't used
outside of arch/x86 currently, but that may not always be the case.
Reviewed-and-tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
|
... into a kexec flavor for better code readability and simplicity. The
original one was getting ugly with ifdeffery.
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
|
Currently, running SetVirtualAddressMap() and passing the physical
address of the virtual map array was working only by a lucky coincidence
because the memory was present in the EFI page table too. Until Toshi
went and booted this on a big HP box - the krealloc() manner of resizing
the memmap we're doing did allocate from such physical addresses which
were not mapped anymore and boom:
http://lkml.kernel.org/r/1386806463.1791.295.camel@misato.fc.hp.com
One way to take care of that issue is to reimplement the krealloc thing
but with pages. We start with contiguous pages of order 1, i.e. 2 pages,
and when we deplete that memory (shouldn't happen all that often but you
know firmware) we realloc the next power-of-two pages.
Having the pages, it is much more handy and easy to map them into the
EFI page table with the already existing mapping code which we're using
for building the virtual mappings.
Thanks to Toshi Kani and Matt for the great debugging help.
Reported-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
|
We will use it in efi so expose it.
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
|
This is very useful for debugging issues with the recently added
pagetable switching code for EFI virtual mode.
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
|
With reusing the ->trampoline_pgd page table for mapping EFI regions in
order to use them after having switched to EFI virtual mode, it is very
useful to be able to dump aforementioned page table in dmesg. This adds
that functionality through the walk_pgd_level() interface which can be
called from somewhere else.
The original functionality of dumping to debugfs remains untouched.
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
|
Coalesce formats and remove spaces before tabs.
Move __initdata after the variable declaration.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
|
For now we only ensure about 5kb free space for avoiding our board
refusing boot. But the comment lies that we retain 50% space.
Signed-off-by: Madper Xie <cxie@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
|
This reorganization removes useless 'bytes' prior assignment and uses
'memdup_user' instead 'kmalloc' + 'copy_from_user'.
Signed-off-by: Geyslan G. Bem <geyslan@gmail.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
|
It makes more sense to set the feature flag in the success path of the
detection function than it does to rely on the caller doing it. Apart
from it being more logical to group the code and data together, it sets
a much better example for new EFI architectures.
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
|
As we grow support for more EFI architectures they're going to want the
ability to query which EFI features are available on the running system.
Instead of storing this information in an architecture-specific place,
stick it in the global 'struct efi', which is already the central
location for EFI state.
While we're at it, let's change the return value of efi_enabled() to be
bool and replace all references to 'facility' with 'feature', which is
the usual word used to describe the attributes of the running system.
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
|
Jan Stancek reports manual page migration encountering allocation
failures after some pages when there is still plenty of memory free, and
bisected the problem down to commit 81c0a2bb515f ("mm: page_alloc: fair
zone allocator policy").
The problem is that GFP_THISNODE obeys the zone fairness allocation
batches on one hand, but doesn't reset them and wake kswapd on the other
hand. After a few of those allocations, the batches are exhausted and
the allocations fail.
Fixing this means either having GFP_THISNODE wake up kswapd, or
GFP_THISNODE not participating in zone fairness at all. The latter
seems safer as an acute bugfix, we can clean up later.
Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: <stable@kernel.org> [3.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When doing some numa tests on powerpc, I triggered an oops bug. I find
it is caused by using page->_last_cpupid. It should be initialized as
"-1 & LAST_CPUPID_MASK", but not "-1". Otherwise, in task_numa_fault(),
we will miss the checking (last_cpupid == (-1 & LAST_CPUPID_MASK)). And
finally cause an oops bug in task_numa_group(), since the online cpu is
less than possible cpu. This happen with CONFIG_SPARSE_VMEMMAP disabled
Call trace:
SMP NR_CPUS=64 NUMA PowerNV
Modules linked in:
CPU: 24 PID: 804 Comm: systemd-udevd Not tainted3.13.0-rc1+ #32
task: c000001e2746aa80 ti: c000001e32c50000 task.ti:c000001e32c50000
REGS: c000001e32c53510 TRAP: 0300 Not tainted(3.13.0-rc1+)
MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI> CR:28024424 XER: 20000000
CFAR: c000000000009324 DAR: 7265717569726857 DSISR:40000000 SOFTE: 1
NIP .task_numa_fault+0x1470/0x2370
LR .task_numa_fault+0x1468/0x2370
Call Trace:
.task_numa_fault+0x1468/0x2370 (unreliable)
.do_numa_page+0x480/0x4a0
.handle_mm_fault+0x4ec/0xc90
.do_page_fault+0x3a8/0x890
handle_page_fault+0x10/0x30
Instruction dump:
3c82fefb 3884b138 48d9cff1 60000000 48000574 3c62fefb3863af78 3c82fefb
3884b138 48d9cfd5 60000000 e93f0100 <812902e4> 7d2907b45529063e 7d2a07b4
---[ end trace 15f2510da5ae07cf ]---
Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Tree location entries should start with the appropriate type.
Add git to some, hg to another.
Neaten tree type description.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Convert whitespace to single tab for separators.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This patch is a modification of the patch originally proposed by
Xiaotian Feng <xtfeng@gmail.com>: https://lkml.org/lkml/2012/11/5/413
This new version disables DMA channel interrupts and ensures that the
tasklet wil not be scheduled again before calling tasklet_kill().
Unfortunately the updated patch was not released at that time due to
planned rework of Tsi721 mport driver to use threaded interrupts (which
has yet to happen). Recently the issue was reported again:
https://lkml.org/lkml/2014/2/19/762.
Description from the original Xiaotian's patch:
"Some drivers use tasklet_disable in device remove/release process,
tasklet_disable will inc tasklet->count and return. If the tasklet is
not handled yet under some softirq pressure, the tasklet will be
placed on the tasklet_vec, never have a chance to be excuted. This
might lead to a heavy loaded ksoftirqd, wakeup with pending_softirq,
but tasklet is disabled. tasklet_kill should be used in this case."
This patch is applicable to kernel versions starting from v3.5.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Xiaotian Feng <xtfeng@gmail.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: <stable@vger.kernel.org> [3.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Current implementation of HFS+ driver has small issue with remount
option. Namely, for example, you are unable to remount from RO mode
into RW mode by means of command "mount -o remount,rw /dev/loop0
/mnt/hfsplus". Trying to execute sequence of commands results in an
error message:
mount /dev/loop0 /mnt/hfsplus
mount -o remount,ro /dev/loop0 /mnt/hfsplus
mount -o remount,rw /dev/loop0 /mnt/hfsplus
mount: you must specify the filesystem type
mount -t hfsplus -o remount,rw /dev/loop0 /mnt/hfsplus
mount: /mnt/hfsplus not mounted or bad option
The reason of such issue is failure of mount syscall:
mount("/dev/loop0", "/mnt/hfsplus", 0x2282a60, MS_MGC_VAL|MS_REMOUNT, NULL) = -1 EINVAL (Invalid argument)
Namely, hfsplus_parse_options_remount() method receives empty "input"
argument and return false in such case. As a result, hfsplus_remount()
returns -EINVAL error code.
This patch fixes the issue by means of return true for the case of empty
"input" argument in hfsplus_parse_options_remount() method.
Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
zram_meta_alloc could fail so caller should check it. Otherwise, your
system will hang.
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit bcf24e1daa94 ("mmc: omap_hsmmc: use the generic config for
omap2plus devices"), enabled the build for other platforms for compile
testing.
sh-allmodconfig now fails with:
include/linux/omap-dma.h:171:8: error: expected identifier before numeric constant
make[4]: *** [drivers/mmc/host/omap_hsmmc.o] Error 1
This happens because SuperH #defines "CCR", which is one of the enum
values in include/linux/omap-dma.h. There's a similar issue with "CCR2"
on sh2a.
As "CCR" and "CCR2" are too generic names for global #defines, prefix
them with "SH_" to fix this.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Global quota files are accessed from different nodes. Thus we cannot
cache offset of quota structure in the quota file after we drop our node
reference count to it because after that moment quota structure may be
freed and reallocated elsewhere by a different node resulting in
corruption of quota file.
Fix the problem by clearing dq_off when we are releasing dquot structure.
We also remove the DB_READ_B handling because it is useless -
DQ_ACTIVE_B is set iff DQ_READ_B is set.
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Goldwyn Rodrigues <rgoldwyn@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
for TYPE_S3C64XX
On exynos5250, exynos5420 and exynos5260 it was observed that, after 1
cycle of S2R, the rtc-tick occurs at a very fast rate as compared to the
rtc-tick occuring before S2R.
This patch fixes the above issue by correcting the wrong way of
save/restore of S3C2410_TICNT for TYPE_S3C64XX.
Signed-off-by: Vikas Sajjan <vikas.sajjan@samsung.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Currently symbols that are absolute addresses are incorrectly displayed
in /proc/kallsyms if the kernel is loaded with kASLR.
The problem was that the scripts/kallsyms.c file which generates the
array of symbol names and addresses uses an relocatable value for all
symbols, even absolute symbols. This patch fixes that.
Several kallsyms output in different boot states for comparison:
$ egrep '_(stext|_per_cpu_(start|end))' /root/kallsyms.nokaslr
0000000000000000 D __per_cpu_start
0000000000014280 D __per_cpu_end
ffffffff810001c8 T _stext
$ egrep '_(stext|_per_cpu_(start|end))' /root/kallsyms.kaslr1
000000001f200000 D __per_cpu_start
000000001f214280 D __per_cpu_end
ffffffffa02001c8 T _stext
$ egrep '_(stext|_per_cpu_(start|end))' /root/kallsyms.kaslr2
000000000d400000 D __per_cpu_start
000000000d414280 D __per_cpu_end
ffffffff8e4001c8 T _stext
$ egrep '_(stext|_per_cpu_(start|end))' /root/kallsyms.kaslr-fixed
0000000000000000 D __per_cpu_start
0000000000014280 D __per_cpu_end
ffffffffadc001c8 T _stext
Signed-off-by: Andy Honig <ahonig@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
LZ4 as implemented in the kernel differs from the default method now
used by the reference implementation of LZ4. Until the in-kernel method
is updated to support the new default, passing the legacy flag (-l) to
the compressor is necessary. Without this flag the kernel-generated,
LZ4-compressed initramfs is junk.
Kyungsik said:
: It seems that lz4 supports legacy format with the same option as lz4c
: does. Just looking at the first few bytes of lz4 compressed image, we can
: see whether it is new format or not.
:
: It shows new format magic number without this patch. New format magic
: number is 0x184d2204.
:
: $ hexdump -C ./initramfs_data.cpio.lz4 |more
: 00000000 04 22 4d 18 64 70 b9 69 (Little Endian)
: ...
:
: Currently kernel supports legacy format only.
Signed-off-by: Daniel M. Weeks <dan@danweeks.net>
Cc: Michal Marek <mmarek@suse.cz>
Acked-by: Kyungsik Lee <kyungsik.lee@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Daniel Borkmann reported a VM_BUG_ON assertion failing:
------------[ cut here ]------------
kernel BUG at mm/mlock.c:528!
invalid opcode: 0000 [#1] SMP
Modules linked in: ccm arc4 iwldvm [...]
video
CPU: 3 PID: 2266 Comm: netsniff-ng Not tainted 3.14.0-rc2+ #8
Hardware name: LENOVO 2429BP3/2429BP3, BIOS G4ET37WW (1.12 ) 05/29/2012
task: ffff8801f87f9820 ti: ffff88002cb44000 task.ti: ffff88002cb44000
RIP: 0010:[<ffffffff81171ad0>] [<ffffffff81171ad0>] munlock_vma_pages_range+0x2e0/0x2f0
Call Trace:
do_munmap+0x18f/0x3b0
vm_munmap+0x41/0x60
SyS_munmap+0x22/0x30
system_call_fastpath+0x1a/0x1f
RIP munlock_vma_pages_range+0x2e0/0x2f0
---[ end trace a0088dcf07ae10f2 ]---
because munlock_vma_pages_range() thinks it's unexpectedly in the middle
of a THP page. This can be reproduced with default config since 3.11
kernels. A reproducer can be found in the kernel's selftest directory
for networking by running ./psock_tpacket.
The problem is that an order=2 compound page (allocated by
alloc_one_pg_vec_page() is part of the munlocked VM_MIXEDMAP vma (mapped
by packet_mmap()) and mistaken for a THP page and assumed to be order=9.
The checks for THP in munlock came with commit ff6a6da60b89 ("mm:
accelerate munlock() treatment of THP pages"), i.e. since 3.9, but did
not trigger a bug. It just makes munlock_vma_pages_range() skip such
compound pages until the next 512-pages-aligned page, when it encounters
a head page. This is however not a problem for vma's where mlocking has
no effect anyway, but it can distort the accounting.
Since commit 7225522bb429 ("mm: munlock: batch non-THP page isolation
and munlock+putback using pagevec") this can trigger a VM_BUG_ON in
PageTransHuge() check.
This patch fixes the issue by adding VM_MIXEDMAP flag to VM_SPECIAL, a
list of flags that make vma's non-mlockable and non-mergeable. The
reasoning is that VM_MIXEDMAP vma's are similar to VM_PFNMAP, which is
already on the VM_SPECIAL list, and both are intended for non-LRU pages
where mlocking makes no sense anyway. Related Lkml discussion can be
found in [2].
[1] tools/testing/selftests/net/psock_tpacket
[2] https://lkml.org/lkml/2014/1/10/427
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Reported-by: Daniel Borkmann <dborkman@redhat.com>
Tested-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: John David Anglin <dave.anglin@bell.net>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Jared Hulbert <jaredeh@gmail.com>
Tested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org> [3.11.x+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Sometimes the cleanup after memcg hierarchy testing gets stuck in
mem_cgroup_reparent_charges(), unable to bring non-kmem usage down to 0.
There may turn out to be several causes, but a major cause is this: the
workitem to offline parent can get run before workitem to offline child;
parent's mem_cgroup_reparent_charges() circles around waiting for the
child's pages to be reparented to its lrus, but it's holding
cgroup_mutex which prevents the child from reaching its
mem_cgroup_reparent_charges().
Further testing showed that an ordered workqueue for cgroup_destroy_wq
is not always good enough: percpu_ref_kill_and_confirm's call_rcu_sched
stage on the way can mess up the order before reaching the workqueue.
Instead, when offlining a memcg, call mem_cgroup_reparent_charges() on
all its children (and grandchildren, in the correct order) to have their
charges reparented first.
Fixes: e5fca243abae ("cgroup: use a dedicated workqueue for cgroup destruction")
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org> [v3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 0eef615665ed ("memcg: fix css reference leak and endless loop in
mem_cgroup_iter") got the interaction with the commit a few before it
d8ad30559715 ("mm/memcg: iteration skip memcgs not yet fully
initialized") slightly wrong, and we didn't notice at the time.
It's elusive, and harder to get than the original, but for a couple of
days before rc1, I several times saw a endless loop similar to that
supposedly being fixed.
This time it was a tighter loop in __mem_cgroup_iter_next(): because we
can get here when our root has already been offlined, and the ordering
of conditions was such that we then just cycled around forever.
Fixes: 0eef615665ed ("memcg: fix css reference leak and endless loop in mem_cgroup_iter").
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: <stable@vger.kernel.org> [3.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Running fsx on tmpfs with concurrent memhog-swapoff-swapon, lots of
BUG: sleeping function called from invalid context at kernel/fork.c:606
in_atomic(): 0, irqs_disabled(): 0, pid: 1394, name: swapoff
1 lock held by swapoff/1394:
#0: (rcu_read_lock){.+.+.+}, at: [<ffffffff812520a1>] radix_tree_locate_item+0x1f/0x2b6
followed by
================================================
[ BUG: lock held when returning to user space! ]
3.14.0-rc1 #3 Not tainted
------------------------------------------------
swapoff/1394 is leaving the kernel with locks still held!
1 lock held by swapoff/1394:
#0: (rcu_read_lock){.+.+.+}, at: [<ffffffff812520a1>] radix_tree_locate_item+0x1f/0x2b6
after which the system recovered nicely.
Whoops, I long ago forgot the rcu_read_unlock() on one unlikely branch.
Fixes e504f3fdd63d ("tmpfs radix_tree: locate_item to speed up swapoff")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
While debug_dma_assert_idle() checks if a given *page* is actively
undergoing dma the valid granularity of a dma mapping is a *cacheline*.
Sander's testing shows that the warning message "DMA-API: exceeded 7
overlapping mappings of pfn..." is falsely triggering. The test is
simply mapping multiple cachelines in a given page.
Ultimately we want overlap tracking to be valid as it is a real api
violation, so we need to track active mappings by cachelines. Update
the active dma tracking to use the page-frame-relative cacheline of the
mapping as the key, and update debug_dma_assert_idle() to check for all
possible mapped cachelines for a given page.
However, the need to track active mappings is only relevant when the
dma-mapping is writable by the device. In fact it is fairly standard
for read-only mappings to have hundreds or thousands of overlapping
mappings at once. Limiting the overlap tracking to writable
(!DMA_TO_DEVICE) eliminates this class of false-positive overlap
reports.
Note, the radix gang lookup is sub-optimal. It would be best if it
stopped fetching entries once the search passed a page boundary.
Nevertheless, this implementation does not perturb the original net_dma
failing case. That is to say the extra overhead does not show up in
terms of making the failing case pass due to a timing change.
References:
http://marc.info/?l=linux-netdev&m=139232263419315&w=2
http://marc.info/?l=linux-netdev&m=139217088107122&w=2
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Dave Jones <davej@redhat.com>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Francois Romieu <romieu@fr.zoreil.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit bf6bddf1924e ("mm: introduce compaction and migration for
ballooned pages") introduces page_count(page) into memory compaction
which dereferences page->first_page if PageTail(page).
This results in a very rare NULL pointer dereference on the
aforementioned page_count(page). Indeed, anything that does
compound_head(), including page_count() is susceptible to racing with
prep_compound_page() and seeing a NULL or dangling page->first_page
pointer.
This patch uses Andrea's implementation of compound_trans_head() that
deals with such a race and makes it the default compound_head()
implementation. This includes a read memory barrier that ensures that
if PageTail(head) is true that we return a head page that is neither
NULL nor dangling. The patch then adds a store memory barrier to
prep_compound_page() to ensure page->first_page is set.
This is the safest way to ensure we see the head page that we are
expecting, PageTail(page) is already in the unlikely() path and the
memory barriers are unfortunately required.
Hugetlbfs is the exception, we don't enforce a store memory barrier
during init since no race is possible.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Holger Kiehl <Holger.Kiehl@dwd.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We're more or less collecting EDAC patches already anyway so let's hold it
down so that get_maintainer sees it too.
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Cc: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Add basic CPU topology support to arm64, based on the existing pre-v8
code and some work done by Mark Hambleton. This patch does not
implement any topology discovery support since that should be based on
information from firmware, it merely implements the scaffolding for
integration of topology support in the architecture.
No locking of the topology data is done since it is only modified during
CPU bringup with external serialisation from the SMP code.
The goal is to separate the architecture hookup for providing topology
information from the DT parsing in order to ease review and avoid
blocking the architecture code (which will be built on by other work)
with the DT code review by providing something simple and basic.
Following patches will implement support for interpreting topology
information from MPIDR and for parsing the DT topology bindings for ARM,
similar patches will be needed for ACPI.
Signed-off-by: Mark Brown <broonie@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
[catalin.marinas@arm.com: removed CONFIG_CPU_TOPOLOGY, always on if SMP]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
When plugging a headphone or headset, lots of noise is heard from
internal speaker, after changing the automute via amp instead of
pinctl, the noise disappears.
BugLink: https://bugs.launchpad.net/bugs/1268468
Cc: David Henningsson <david.henningsson@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Parameter conversion within the system call wrappers is only needed
for parameters which differ in size and have a size of eight bytes on
64 bit.
For system call parameters with a size of less than eight byte the
called system call itself will perform parameter conversion anyway.
So we can save the double conversion of e.g. int parameters.
The only types which need to be converted are therefore pointer and
(unsigned) long parameters.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
Instead of explicitly changing compat system call parameters from e.g.
unsigned long to compat_ulong_t let the COMPAT_SYSCALL_WRAP macros
automatically detect (unsigned) long parameters and zero and sign
extend them automatically.
The resulting binary is completely identical.
In addition add a sys_[system call name] prototype for each system call
wrapper. This will cause compile errors if the prototype does not match
the prototype in include/linux/syscall.h.
Therefore we should now always get the correct zero and sign extension
of system call parameters. Pointers are handled like before.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
This adds support for advertising the presence of ARMv8 Crypto
Extensions in the Aarch32 execution state to 32-bit ELF binaries
running in 32-bit compat mode under the arm64 kernel.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Add support for the ELF auxv entry AT_HWCAP2 when running 32-bit
ELF binaries in compat mode.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The compat syscall wrappers for sync_file_range and fallocate merged 32 bit
parameters into 64 bit parameters. Therefore they did more than just the
usual zero and/or sign extension of system call parameters.
So convert these two wrappers to full s390 specific compat sytem calls.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|