Age | Commit message (Collapse) | Author |
|
The access beyond the end of device BUG_ON that was introduced to
dm_request_fn via commit 29e4013de7ad950280e4b2208 ("dm: implement
REQ_FLUSH/FUA support for request-based dm") was an overly
drastic (but simple) response to this situation.
I have received a report that this BUG_ON was hit and now think
it would be better to use dm_kill_unmapped_request() to fail the clone
and original request with -EIO.
map_request() will assign the valid target returned by
dm_table_find_target to tio->ti. But when the target
isn't valid tio->ti is never assigned (because map_request isn't
called); so add a check for tio->ti != NULL to dm_done().
Reported-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: stable@vger.kernel.org # v2.6.37+
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
When there are no paths and multipath receives an ioctl, it waits until
a path becomes available. This behaviour is incorrect if the
"queue_if_no_path" setting was not specified, as then the ioctl should
be rejected immediately, which this patch now does.
commit 35991652b ("dm mpath: allow ioctls to trigger pg init") should
have checked if queue_if_no_path was configured before queueing IO.
Checking for the queue_if_no_path feature, like is done in map_io(),
allows the following table load to work without blocking in the
multipath_ioctl retry loop:
echo "0 1024 multipath 0 0 0 0" | dmsetup create mpath_nodevs
Without this fix the multipath_ioctl will block with the following stack
trace:
blkid D 0000000000000002 0 23936 1 0x00000000
ffff8802b89e5cd8 0000000000000082 ffff8802b89e5fd8 0000000000012440
ffff8802b89e4010 0000000000012440 0000000000012440 0000000000012440
ffff8802b89e5fd8 0000000000012440 ffff88030c2aab30 ffff880325794040
Call Trace:
[<ffffffff814ce099>] schedule+0x29/0x70
[<ffffffff814cc312>] schedule_timeout+0x182/0x2e0
[<ffffffff8104dee0>] ? lock_timer_base+0x70/0x70
[<ffffffff814cc48e>] schedule_timeout_uninterruptible+0x1e/0x20
[<ffffffff8104f840>] msleep+0x20/0x30
[<ffffffffa0000839>] multipath_ioctl+0x109/0x170 [dm_multipath]
[<ffffffffa06bfb9c>] dm_blk_ioctl+0xbc/0xd0 [dm_mod]
[<ffffffff8122a408>] __blkdev_driver_ioctl+0x28/0x30
[<ffffffff8122a79e>] blkdev_ioctl+0xce/0x730
[<ffffffff811970ac>] block_ioctl+0x3c/0x40
[<ffffffff8117321c>] do_vfs_ioctl+0x8c/0x340
[<ffffffff81166293>] ? sys_newfstat+0x33/0x40
[<ffffffff81173571>] sys_ioctl+0xa1/0xb0
[<ffffffff814d70a9>] system_call_fastpath+0x16/0x1b
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # 3.5+
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
The dm thin pool target claims to support the zeroing of discarded
data areas. This turns out to be incorrect when processing discards
that do not exactly cover a complete number of blocks, so the target
must always set discard_zeroes_data_unsupported.
The thin pool target will zero blocks when they are allocated if the
skip_block_zeroing feature is not specified. The block layer
may send a discard that only partly covers a block. If a thin pool
block is partially discarded then there is no guarantee that the
discarded data will get zeroed before it is accessed again.
Due to this, thin devices cannot claim discards will always zero data.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Cc: stable@vger.kernel.org # 3.4+
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Pull c6x arch fixes from Mark Salter:
- Add __NR_kcmp to generic syscall list
- C6X: Use generic asm/barrier.h
* tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming:
syscalls: add __NR_kcmp syscall to generic unistd.h
c6x: use asm-generic/barrier.h
|
|
Cc: Lukasz Dorau <lukasz.dorau@intel.com>
Cc: Maciej Patelczyk <maciej.patelczyk@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit d97b46a64 ("syscalls, x86: add __NR_kcmp syscall" ) added a new
syscall to support checkpoint restore. It is currently x86-only, but
that restriction will be removed in a subsequent patch. Unfortunately,
the kernel checksyscalls script had a bug which suppressed any warning
to other architectures that the kcmp syscall was not implemented. A
patch to checksyscalls is being tested in linux-next and other
architectures are seeing warnings about kcmp being unimplemented.
This patch adds __NR_kcmp to <asm-generic/unistd.h> so that kcmp is
wired in for architectures using the generic syscall list.
Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
|
|
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: stable@vger.kernel.org
|
|
Otherwise when X starts we commonly get a black screen scanning
out nothing, its wierd dpms on/off from userspace brings it back,
With this on F18, multi-seat works again with my 1920x1200 monitor
which is above the sku limit for the device I have.
Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexander.deucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
We don't allocate enough data for this struct. As soon as we start
modifying event->event on the next lines, then we're going beyond the
end of the memory we allocated.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@gmail.com>
|
|
git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes
These just silence some printks that we are seeing that we shouldn't
* 'drm-nouveau-fixes' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
drm/nvc0/ltcg: mask off intr 0x10
drm/nouveau: silence a debug message triggered by newer userspace
|
|
NVIDIA do that at startup too on Fermi, so perhaps the heap of 0x10
intrs we receive are normal and we can ignore them.
On Kepler NVIDIA *don't* do this, but the hardware appears to come up
with the bit masked off by default - so that's probably why :)
This should silence some interrupt spam seen on Fermi+ boards.
Backported patch from reworked nouveau kernel tree.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Commit v2.6.19-rc1~1272^2~41 tells us that r->cost != 0 can happen when
a running state is saved to userspace and then reinstated from there.
Make sure that private xt_limit area is initialized with correct values.
Otherwise, random matchings due to use of uninitialized memory.
Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Pull more networking fixes from David Miller:
1) Eric Dumazet discovered and fixed what turned out to be a family of
bugs. These functions were using pskb_may_pull() which might need
to reallocate the linear SKB data buffer, but the callers were not
expecting this possibility. The callers have cached pointers to the
packet header areas, and would need to reload them if we were to
continue using pskb_may_pull().
So they could end up reading garbage.
It's easier to just change these RAW4/RAW6/MIP6 routines to use
skb_header_pointer() instead of pskb_may_pull(), which won't modify
the linear SKB data area.
2) Dave Jone's syscall spammer caught a case where a non-TCP socket can
call down into the TCP keepalive code. The case basically involves
creating a raw socket with sk_protocol == IPPROTO_TCP, then calling
setsockopt(sock_fd, SO_KEEPALIVE, ...)
Fixed by Eric Dumazet.
3) Bluetooth devices do not get configured properly while being powered
on, resulting in always using legacy pairing instead of SSP. Fix
from Andrzej Kaczmarek.
4) Bluetooth cancels delayed work erroneously, put stricter checks in
place. From Andrei Emeltchenko.
5) Fix deadlock between cfg80211_mutex and reg_regdb_search_mutex in
cfg80211, from Luis R. Rodriguez.
6) Fix interrupt double release in iwlwifi, from Emmanuel Grumbach.
7) Missing module license in bcm87xx driver, from Peter Huewe.
8) Team driver can lose port changed events when adding devices to a
team, fix from Jiri Pirko.
9) Fix endless loop when trying ot unregister PPPOE device in zombie
state, from Xiaodong Xu.
10) batman-adv layer needs to set MAC address of software device
earlier, otherwise we call tt_local_add with it uninitialized.
11) Fix handling of KSZ8021 PHYs, it's matched currently by KS8051 but
that doesn't program the device properly. From Marek Vasut.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
ipv6: mip6: fix mip6_mh_filter()
ipv6: raw: fix icmpv6_filter()
net: guard tcp_set_keepalive() to tcp sockets
phy/micrel: Add missing header to micrel_phy.h
phy/micrel: Rename KS80xx to KSZ80xx
phy/micrel: Implement support for KSZ8021
batman-adv: Fix symmetry check / route flapping in multi interface setups
batman-adv: Fix change mac address of soft iface.
pppoe: drop PPPOX_ZOMBIEs in pppoe_release
team: send port changed when added
ipv4: raw: fix icmp_filter()
net/phy/bcm87xx: Add MODULE_LICENSE("GPL") to GPL driver
iwlwifi: don't double free the interrupt in failure path
cfg80211: fix possible circular lock on reg_regdb_search()
Bluetooth: Fix not removing power_off delayed work
Bluetooth: Fix freeing uninitialized delayed works
Bluetooth: mgmt: Fix enabling LE while powered off
Bluetooth: mgmt: Fix enabling SSP while powered off
|
|
mip6_mh_filter() should not modify its input, or else its caller
would need to recompute ipv6_hdr() if skb->head is reallocated.
Use skb_header_pointer() instead of pskb_may_pull()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Included fixes:
- fix the behaviour of batman-adv in case of virtual interface MAC change event
- fix symmetric link check in neighbour selection
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
icmpv6_filter() should not modify its input, or else its caller
would need to recompute ipv6_hdr() if skb->head is reallocated.
Use skb_header_pointer() instead of pskb_may_pull() and
change the prototype to make clear both sk and skb are const.
Also, if icmpv6 header cannot be found, do not deliver the packet,
as we do in IPv4.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull SuperH fix from Paul Mundt:
"One last minute regression fix.."
* tag 'sh-for-linus' of git://github.com/pmundt/linux-sh:
sh: pfc: Fix up GPIO mux type reconfig case.
|
|
Merge misc fixes from Andrew Morton:
"One maintainer change and three bugfixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (4 commits)
c/r: prctl: fix build error for no-MMU case
lib/flex_proportions.c: fix corruption of denominator in flexible proportions
checksyscalls: fix "here document" handling
pwm-backlight: take over maintenance
|
|
Commit 1ad75b9e1628 ("c/r: prctl: add minimal address test to
PR_SET_MM") added some address checking to prctl_set_mm() used by
checkpoint-restore. This causes a build error for no-MMU systems:
kernel/sys.c: In function 'prctl_set_mm':
kernel/sys.c:1868:34: error: 'mmap_min_addr' undeclared (first use in this function)
The test for mmap_min_addr doesn't make a lot of sense for no-MMU code
as noted in commit 6e1415467614 ("NOMMU: Optimise away the
{dac_,}mmap_min_addr tests").
This patch defines mmap_min_addr as 0UL in the no-MMU case so that the
compiler will optimize away tests for "addr < mmap_min_addr".
Signed-off-by: Mark Salter <msalter@redhat.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: <stable@vger.kernel.org> [3.6.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When racing with CPU hotplug, percpu_counter_sum() can return negative
values for the number of observed events.
This confuses fprop_new_period(), which uses unsigned type and as a
result number of events is set to big *positive* number. From that
moment on, things go pear shaped and can result e.g. in division by
zero as denominator is later truncated to 32-bits.
This bug causes a divide-by-zero oops in bdi_dirty_limit() in Borislav's
3.6.0-rc6 based kernel.
Fix the issue by using a signed type in fprop_new_period(). That makes
us bail out from the function without doing anything (mistakenly)
thinking there are no events to age. That makes aging somewhat
inaccurate but getting accurate data would be rather hard.
Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Borislav Petkov <bp@amd64.org>
Reported-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
"echo" doesn't read from stdin, therefore the checksyscalls script didn't
warn about not implemented system calls anymore since 29dc54c6
("checksyscalls: Use arch/x86/syscalls/syscall_32.tbl as source").
Use "cat" instead of "echo" which handles this correctly.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Since the pwm-backlight driver is lacking a proper maintainer and is the
heaviest user of the PWM framework I'm taking over maintenance.
Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
Acked-by: Arun Murthy <arun.murthy@stericsson.com>
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Robert Morell <rmorell@nvidia.com>
Cc: Dilan Lee <dilee@nvidia.com>
Cc: Axel Lin <axel.lin@gmail.com>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Sandy bridge EDAC is calculating the memory size with overflow.
Basically, the size field and the integer calculation is using 32 bits.
More bits are needed, when the DIMM memories have high density.
The net result is that memories are improperly reported there, when
high-density DIMMs are used:
EDAC DEBUG: in drivers/edac/sb_edac.c, line at 591: mc#0: channel 0, dimm 0, -16384 Mb (-4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800
EDAC DEBUG: in drivers/edac/sb_edac.c, line at 591: mc#0: channel 1, dimm 0, -16384 Mb (-4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800
As the number of pages value is handled at the EDAC core as unsigned
ints, the driver shows the 16 GB memories at sysfs interface as 16760832
MB! The fix is simple: calculate the number of pages as unsigned 64-bits
integer.
After the patch, the memory size (16 GB) is properly detected:
EDAC DEBUG: in drivers/edac/sb_edac.c, line at 592: mc#0: channel 0, dimm 0, 16384 Mb (4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800
EDAC DEBUG: in drivers/edac/sb_edac.c, line at 592: mc#0: channel 1, dimm 0, 16384 Mb (4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800
Cc: stable@kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
When 2R memories are found, the memory size should be multiplied
by two, otherwise, it will report half of the memory size:
+-----------------------------------------------+
| mc0 |
| branch0 | branch1 |
| channel0 | channel1 | channel0 | channel1 |
-------+-----------------------------------------------+
slot3: | 0 MB | 0 MB | 0 MB | 0 MB |
slot2: | 0 MB | 0 MB | 0 MB | 0 MB |
-------+-----------------------------------------------+
slot1: | 0 MB | 0 MB | 0 MB | 0 MB |
slot0: | 1024 MB | 1024 MB | 1024 MB | 1024 MB |
-------+-----------------------------------------------+
(the above machine have 4 x 2GB 2R memories)
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
commit a895bf8b1e1ea4c032a8fa8a09475a2ce09fe77a incorrectly
changed the logic that fills the memory bank size. Fix it.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
Some drivers need to switch pin states between GPIO and pin function at
runtime, which was inadvertently broken in the pinctrl driver for GPIOs
being bound to a specific direction.
This fixes up the request path to ensure that previously configured GPIOs
don't cause us to inadvertently error out with an unsupported mux on
reconfig, which in practice is primarily aimed at trapping pull-up/down
users that have yet to be implemented under the new API.
Fixes up regressions in the TPU PWM driver, amongst others.
Reported-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Tested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:
====================
Please pull this last(?) batch of fixes intended for 3.6...
For the Bluetooth bits, Gustavo says this:
"Here goes probably my last update to 3.6. It includes the two patches
you were ok last week(from Andrzej Kaczmarek), those are critical
ones, and two other fixes one for a system crash and the other for
a missing lockdep annotation."
The referenced fixes from Andrzej prevent attempts to configure devices
that are powered-off.
Along with the Bluetooth fixes, there are a couple of 802.11 fixes.
Emmanuel Grumbach gives us an iwlwifi fix to prevent releasing an
interrupt twice. Luis R. Rodriguez provides a fix for a possible
circular lock dependency in the cfg80211 regulatory enforcement code.
All of these have been in linux-next for a few days. I hope they are
not too late to make the 3.6 release!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull tile gxio ABI fix from Chris Metcalf:
"This fixes a last-minute change in the Tilera hypervisor ABI for TRIO
(PCI root complex) support. We've locked in this ABI going forward
and will make sure no further ABI changes like this occur."
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
tile: gxio iorpc numbering change for TRIO interface
|
|
Pull vfio fixes from Alex Williamson:
"VFIO doc update and virqfd race fix"
* tag 'vfio-for-linus' of git://github.com/awilliam/linux-vfio:
vfio: Fix virqfd release race
vfio: Trivial Documentation correction
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull a Xen fix from Konrad Rzeszutek Wilk:
"It is a bug-fix when we run the initial PV guest on a AMD K8 machine
and have CONFIG_AMD_NUMA enabled and detect the NUMA topology from the
Northbridge.
We end up in the situation where the initial domain gets too much
information and gets confused and crashes - the fix is to restrict the
domain to get the information - and we do it by just disabling NUMA on
the PV guest (the hypervisor is still able to do its proper NUMA
allocations of guests).
It is OK to disable the PV guest from accessing NUMA data as right now
we do not inject any NUMA node information to the PV guests. When we
do get to that point, then this patch will have to be reverted."
* Disable PV NUMA support as we do not do anything with it (yet) and it
can cause bootup crashes on certain AMD machines.
* tag 'stable/for-linus-3.6-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/boot: Disable NUMA for PV guests.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull two ceph fixes from Sage Weil:
"The first fixes a leak in the rbd setup error path, and the second
fixes a more serious problem with mismatched kmap/kunmap that surfaced
after the recent refactoring work."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: only kunmap kmapped pages
rbd: drop dev reference on error in rbd_open()
|
|
Its possible to use RAW sockets to get a crash in
tcp_set_keepalive() / sk_reset_timer()
Fix is to make sure socket is a SOCK_STREAM one.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For GPIOs of gpio-lpc32xx, gpio_direction_output() ignores the value argument
(initial value of output). This patch fixes this by setting the level
accordingly.
Cc: stable@kernel.org
Signed-off-by: Roland Stigge <stigge@antcom.de>
Acked-by: Alexandre Pereira da Silva <aletes.xgr@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The license header was missing in micrel_phy.h . This patch adds
one.
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David J. Choi <david.choi@micrel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is no such part as KS8001, KS8041 or KS8051. There are only
KSZ8001, KSZ8041 and KSZ8051. Rename these parts as such to match
the Micrel naming.
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David J. Choi <david.choi@micrel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Cc: Linux ARM kernel <linux-arm-kernel@lists.infradead.org>
Cc: Fabio Estevam <fabio.estevam@freescale.com>
Cc: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The KSZ8021 PHY was previously caught by KS8051, which is not correct.
This PHY needs additional setup if it is strapped for address 0. In such
case an reserved bit must be written in the 0x16, "Operation Mode Strap
Override" register. According to the KS8051 datasheet, that bit means
"PHY Address 0 in non-broadcast" and it indeed behaves as such on KSZ8021.
The issue where the ethernet controller (Freescale FEC) did not communicate
with network is fixed by writing this bit as 1.
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David J. Choi <david.choi@micrel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
An ABI numbering change was made in the hypervisor for Tilera's 4.1
MDE release (just shipped). It's incompatible with the previous 4.0
release ABI numbering, so we track the new numbering going forward.
We plan to avoid modifying ABI numbering for these interfaces again.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
|
|
A recent patch in the linux-next tree caused a build failure on
C6X because C6X didn't define a read_barrier_depends() macro. C6X
does not support SMP and the architecture doesn't provide any
special memory ordering instructions, so it makes sense to just
use the generic barrier.h rather than patching the existing c6x
specific header.
Signed-off-by: Mark Salter <msalter@redhat.com>
|
|
The hypervisor is in charge of allocating the proper "NUMA" memory
and dealing with the CPU scheduler to keep them bound to the proper
NUMA node. The PV guests (and PVHVM) have no inkling of where they
run and do not need to know that right now. In the future we will
need to inject NUMA configuration data (if a guest spans two or more
NUMA nodes) so that the kernel can make the right choices. But those
patches are not yet present.
In the meantime, disable the NUMA capability in the PV guest, which
also fixes a bootup issue. Andre says:
"we see Dom0 crashes due to the kernel detecting the NUMA topology not
by ACPI, but directly from the northbridge (CONFIG_AMD_NUMA).
This will detect the actual NUMA config of the physical machine, but
will crash about the mismatch with Dom0's virtual memory. Variation of
the theme: Dom0 sees what it's not supposed to see.
This happens with the said config option enabled and on a machine where
this scanning is still enabled (K8 and Fam10h, not Bulldozer class)
We have this dump then:
NUMA: Warning: node ids are out of bound, from=-1 to=-1 distance=10
Scanning NUMA topology in Northbridge 24
Number of physical nodes 4
Node 0 MemBase 0000000000000000 Limit 0000000040000000
Node 1 MemBase 0000000040000000 Limit 0000000138000000
Node 2 MemBase 0000000138000000 Limit 00000001f8000000
Node 3 MemBase 00000001f8000000 Limit 0000000238000000
Initmem setup node 0 0000000000000000-0000000040000000
NODE_DATA [000000003ffd9000 - 000000003fffffff]
Initmem setup node 1 0000000040000000-0000000138000000
NODE_DATA [0000000137fd9000 - 0000000137ffffff]
Initmem setup node 2 0000000138000000-00000001f8000000
NODE_DATA [00000001f095e000 - 00000001f0984fff]
Initmem setup node 3 00000001f8000000-0000000238000000
Cannot find 159744 bytes in node 3
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81d220e6>] __alloc_bootmem_node+0x43/0x96
Pid: 0, comm: swapper Not tainted 3.3.6 #1 AMD Dinar/Dinar
RIP: e030:[<ffffffff81d220e6>] [<ffffffff81d220e6>] __alloc_bootmem_node+0x43/0x96
.. snip..
[<ffffffff81d23024>] sparse_early_usemaps_alloc_node+0x64/0x178
[<ffffffff81d23348>] sparse_init+0xe4/0x25a
[<ffffffff81d16840>] paging_init+0x13/0x22
[<ffffffff81d07fbb>] setup_arch+0x9c6/0xa9b
[<ffffffff81683954>] ? printk+0x3c/0x3e
[<ffffffff81d01a38>] start_kernel+0xe5/0x468
[<ffffffff81d012cf>] x86_64_start_reservations+0xba/0xc1
[<ffffffff81007153>] ? xen_setup_runstate_info+0x2c/0x36
[<ffffffff81d050ee>] xen_start_kernel+0x565/0x56c
"
so we just disable NUMA scanning by setting numa_off=1.
CC: stable@vger.kernel.org
Reported-and-Tested-by: Andre Przywara <andre.przywara@amd.com>
Acked-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
If a dirty GFS2 inode was being deleted but was in use by another node, its
metadata was not getting written out before GFS2 checked for dirty buffers in
gfs2_ail_flush(). GFS2 was relying on inode_go_sync() to write out the
metadata when the other node tried to free the file, but it failed the error
check before it got that far. This patch writes out the metadata before calling
gfs2_ail_flush()
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
|
gfs2_ail_empty_gl() contains an "inline version" of gfs2_trans_begin(),
so it needs an explicit sb_start_intwrite() as well, to balance the
sb_end_intwrite() which will be called by gfs2_trans_end().
With this, xfstest 068 passes on lock_nolock local gfs2.
Without it, we reach a writer count of -1 and get stuck.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
|
This patch fixes an infinite loop in gfs2_rbm_find that was introduced
by the previous patch. The problem occurred when the length was less
than 3 but the rbm block was byte-aligned, causing it to improperly
return a extent length of zero, which caused it to spin.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Tested-by: Bob Peterson <rpeterso@redhat.com>
Tested-by: Barry Marson <bmarson@redhat.com>
|
|
With the recently added block reservation code, an additional function
was added to search for free blocks. This had a restriction of only being
able to search for aligned extents of free blocks. As a result the
allocation patterns when reserving blocks were suboptimal when the
existing allocation of blocks for an inode was not aligned to the same
boundary.
This patch resolves that problem by adding the ability for gfs2_rbm_find
to search for extents of a particular minimum size. We can then use
gfs2_rbm_find for both looking for reservations, and also looking for
free blocks on an individual basis when we actually come to do the
allocation later on. As a result we only need a single set of code
to deal with both situations.
The function gfs2_rbm_from_block() is moved up rgrp.c so that it
occurs before all of its callers.
Many thanks are due to Bob for helping track down the final issue in
this patch. That fix to the rb_tree traversal and to not share
block reservations from a dirctory to its children is included here.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
|
GFS2 uses i_mutex on its system quota inode to synchronize writes to
quota file. Since this is an internal inode to GFS2 (not part of directory
hiearchy or visible by user) we are safe to define locking rules for it. So
let's just get it its own locking class to make it clear.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
|
This patch stops multiple block allocations if a nonzero
return code is received from gfs2_rbm_from_block. Without
this patch, if enough pressure is put on the file system,
you get a kernel warning quickly followed by:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffffa04f47e8>] gfs2_alloc_blocks+0x2c8/0x880 [gfs2]
With this patch, things run normally.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
|
When rgd->rd_free_clone is less than rgd->rd_reserved, the
unclaimed_blocks() calculation would wrap and produce
incorrect results. This patch checks for this condition
when this function is called from gfs2_mblk_search()
In addition, the use of this particular function in other
places in the code has been dropped by means of a general
clean up of gfs2_inplace_reserve(). This function is now
much easier to follow.
Also the setting of the rgd->rd_last_alloc field is corrected.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
|
This patch improves the tracing of block reservations by
removing some corner cases and also providing more useful
detail in the traces.
A new field is added to the reservation structure to contain
the inode number. This is used since in certain contexts it is
not possible to access the inode itself to obtain this information.
As a result we can then display the inode number for all tracepoints
and also in case we dump the resource group.
The "del" tracepoint operation has been removed. This could be called
with the reservation rgrp set to NULL. That resulted in not printing
the device number, and thus making the information largely useless
anyway. Also, the conditional on the rgrp being NULL can then be
removed from the tracepoint. After this change, all the block
reservation tracepoint calls will be called with the rgrp information.
The existing ins,clm and tdel calls to the block reservation tracepoint
are sufficient to track the entire life of the block reservation.
In gfs2_block_alloc() the error detection is updated to print out
the inode number of the problematic inode. This can then be compared
against the information in the glock dump,tracepoints, etc.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
|
When we get to the stage of allocating blocks, we know that the
resource group in question must contain enough free blocks, otherwise
gfs2_inplace_reserve() would have failed. So if we are left with only
free blocks which are reserved, then we must use those. This can happen
if another node has sneeked in and use some blocks reserved on this
node, for example. Generally this will happen very rarely and only
when the resouce group is nearly full.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|