Age | Commit message | Author |
|
Currently, pages which are marked as unevictable are protected from
compaction, but not from other types of migration. The POSIX real time
extension explicitly states that mlock() will prevent a major page
fault, but the spirit of this is that mlock() should give a process the
ability to control sources of latency, including minor page faults.
However, the mlock manpage only explicitly says that a locked page will
not be written to swap and this can cause some confusion. The
compaction code today does not give a developer who wants to avoid swap
but wants to have large contiguous areas available any method to achieve
this state. This patch introduces a sysctl for controlling compaction
behavior with respect to the unevictable lru. Users who demand no page
faults after a page is present can set compact_unevictable_allowed to 0
and users who need the large contiguous areas can enable compaction on
locked memory by leaving the default value of 1.
To illustrate this problem I wrote a quick test program that mmaps a
large number of 1MB files filled with random data. These maps are
created locked and read only. Then every other mmap is unmapped and I
attempt to allocate huge pages to the static huge page pool. When the
compact_unevictable_allowed sysctl is 0, I cannot allocate hugepages
after fragmenting memory. When the value is set to 1, allocations
succeed.
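A minimal userspace sketch of toggling the new sysctl follows. It assumes the knob is exposed as /proc/sys/vm/compact_unevictable_allowed and that the caller has permission to write it; the helper names write_sysctl_int() and read_sysctl_int() are hypothetical and are not part of the patch or of the test program described above.

```c
#include <stdio.h>

/* Assumed location of the sysctl introduced by this patch. */
#define COMPACT_UNEVICTABLE_SYSCTL "/proc/sys/vm/compact_unevictable_allowed"

/*
 * Write an integer to a sysctl-style file.
 * Returns 0 on success, -1 on failure (missing file, no permission, ...).
 */
static int write_sysctl_int(const char *path, int value)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fprintf(f, "%d\n", value) < 0) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}

/* Read the integer back. Returns 0 on success, -1 on failure. */
static int read_sysctl_int(const char *path, int *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%d", value) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

A latency-sensitive setup would call write_sysctl_int(COMPACT_UNEVICTABLE_SYSCTL, 0) before locking memory; leaving the default of 1 keeps compaction of locked pages enabled.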
Signed-off-by: Eric B Munson <emunson@akamai.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
With the page flag sanitization patchset, an invalid usage of
ClearPageReclaim() is detected in set_page_dirty(). This can be called
from __unmap_hugepage_range(), so let's check PageReclaim() before trying
to clear it to avoid the misuse.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
With the page flag sanitization patchset, an invalid usage of
ClearPageSwapCache() is detected in migrate_page_copy().
migrate_page_copy() is shared by both normal and hugepage (both thp and
hugetlb) code path, so let's check PageSwapCache() and clear it if it's
set to avoid misuse of the invalid clear operation.
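The test-before-clear pattern can be sketched in userspace as below. This is only a model under stated assumptions: struct model_page, the flag bits and both helpers are hypothetical stand-ins, and the "checked" helper merely imitates a sanitized clear operation that refuses to run on a hugepage.

```c
#include <stdbool.h>

/* Hypothetical flag bits standing in for the kernel's page flags. */
#define MODEL_PG_RECLAIM	(1u << 0)
#define MODEL_PG_SWAPCACHE	(1u << 1)

struct model_page {
	unsigned int flags;
	bool is_hugepage;	/* the modelled policy forbids the clear on these */
};

/* Models a sanitized clear helper: reports misuse instead of clearing. */
static bool model_clear_checked(struct model_page *page, unsigned int bit)
{
	if (page->is_hugepage)
		return false;	/* invalid usage; a real helper would complain */
	page->flags &= ~bit;
	return true;
}

/* The pattern from the patch: only issue the clear when the flag is set. */
static bool model_clear_swapcache_if_set(struct model_page *page)
{
	if (page->flags & MODEL_PG_SWAPCACHE)
		return model_clear_checked(page, MODEL_PG_SWAPCACHE);
	return true;	/* nothing to clear, so no invalid operation */
}
```

Since the flag is not expected to be set on a hugepage, the guarded variant never reaches the invalid clear on the shared code path.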
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
THP uses tail page refcounting to be able to split huge pages at any time.
Tail page refcounting is not needed for other users of compound pages and
it's harmful because of overhead.
We try to exclude non-THP pages from tail page refcounting using
__compound_tail_refcounted() check. It excludes most common non-THP
compound pages: SL*B and hugetlb, but it doesn't catch the rest of __GFP_COMP
users -- drivers.
And it's not only about overhead.
Drivers might want to use compound pages to get refcounting semantics
suitable for mapping high-order pages to userspace. But tail page
refcounting breaks it.
Tail page refcounting uses ->_mapcount in tail pages to store GUP pins on
them. It means GUP pins would affect page_mapcount() for tail pages.
It's not a problem for THP, because it never maps tail pages. But unlike
THP, drivers map parts of compound pages with PTEs and it makes
page_mapcount() be called for tail pages.
In particular, GUP pins would shift PSS up and affect /proc/kpagecount for
such pages. But I'm not aware of anything which can lead to a crash or
other serious misbehaviour.
Since currently all THP pages are anonymous and all drivers pages are not,
we can fix the __compound_tail_refcounted() check by requiring PageAnon()
to enable tail page refcounting.
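The effect of the tightened check can be illustrated with a small userspace model. It assumes the old check excluded only slab and hugetlb pages, as described above; struct model_head and both function names are hypothetical and only mirror the logic, not the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the head page state that the check looks at. */
struct model_head {
	bool anon;	/* models PageAnon() */
	bool slab;	/* models a SL*B page */
	bool hugetlb;	/* models a hugetlb page */
};

/* Old behaviour: everything except SL*B and hugetlb is tail-refcounted. */
static bool model_tail_refcounted_old(const struct model_head *head)
{
	return !head->slab && !head->hugetlb;
}

/* New behaviour: additionally require an anonymous page, i.e. THP. */
static bool model_tail_refcounted_new(const struct model_head *head)
{
	return head->anon && !head->slab && !head->hugetlb;
}
```

A driver's __GFP_COMP page (not anonymous, not slab, not hugetlb) is tail-refcounted under the old check and excluded under the new one, while an anonymous THP page is unaffected.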
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Currently we take a naive approach to page flags on compound pages - we
set the flag on the page without consideration if the flag makes sense
for tail page or for compound page in general. This patchset tries to
sort this out by defining a per-flag policy on what needs to be done if a
page-flag helper operates on a compound page.
The last patch in the patchset also sanitizes usage of page->mapping for
tail pages. We don't define the meaning of page->mapping for tail
pages. Currently it's always NULL, which can be inconsistent with head
page and potentially lead to problems.
For now I caught one case of illegal usage of page flags or ->mapping:
sound subsystem allocates pages with __GFP_COMP and maps them with PTEs.
It leads to setting dirty bit on tail pages and access to tail_page's
->mapping. I don't see any bad behaviour caused by this, but worth
fixing anyway.
This patchset makes more sense if you take my THP refcounting into
account: we will see more compound pages mapped with PTEs and we need to
define behaviour of flags on compound pages to avoid bugs.
This patch (of 16):
We have page-flags helper function declarations/definitions spread over
several header files. Let's consolidate them in <linux/page-flags.h>.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This cleanup patch moves all strings passed to action_result() into a
single array action_page_type so that a reader can easily find which
kind of action results are possible. And this patch also fixes the
odd lines to be printed out, like "unknown page state page" or "free
buddy, 2nd try page".
[akpm@linux-foundation.org: rename messages, per David]
[akpm@linux-foundation.org: s/DIRTY_UNEVICTABLE_LRU/CLEAN_UNEVICTABLE_LRU/, per Andi]
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "Xie XiuQi" <xiexiuqi@huawei.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Chen Gong <gong.chen@linux.intel.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Low and high watermarks, as they are defined in the TODO to the mem_cgroup
struct, have already been implemented by Johannes, so remove the stale
comment.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
mem_cgroup_lookup() is a wrapper around mem_cgroup_from_id(), which
checks that id != 0 before issuing the function call. Today, there is
no point in this additional check apart from optimization, because there
is no css with id <= 0, so that css_from_id, called by
mem_cgroup_from_id, will return NULL for any id <= 0.
Since mem_cgroup_from_id is only called from mem_cgroup_lookup, let us
zap mem_cgroup_lookup, substituting calls to it with mem_cgroup_from_id
and moving the check if id > 0 to css_from_id.
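As a rough illustration of folding the id check into the lookup itself, consider the model below. The table-backed lookup, MODEL_MAX_ID and all names are assumptions made for the sketch; the kernel's css_from_id() uses its own lookup structure.

```c
#include <stddef.h>

#define MODEL_MAX_ID 4

struct model_css {
	int id;
};

/* Slot 0 is never handed out: there is no css with id <= 0. */
static struct model_css model_css_table[MODEL_MAX_ID + 1] = {
	{ 0 }, { 1 }, { 2 }, { 3 }, { 4 },
};

/* Models the lookup after the patch: the id > 0 check lives here. */
static struct model_css *model_css_from_id(int id)
{
	if (id <= 0 || id > MODEL_MAX_ID)
		return NULL;
	return &model_css_table[id];
}
```

With the check inside the lookup, callers no longer need a wrapper that filters out id 0 first.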
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
All callers of zone_movable_is_highmem are under #ifdef CONFIG_HIGHMEM,
so the else branch return 0 is not needed.
Signed-off-by: Zhang Zhen <zhenzhang.zhang@huawei.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Alter 'taks' -> 'task'
Signed-off-by: Yaowei Bai <bywxiaobai@163.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
vfs_readdir() was replaced by iterate_dir() in commit 5c0ba4e0762e
("[readdir] introduce iterate_dir() and dir_context").
Signed-off-by: Zhang Zhen <zhenzhang.zhang@huawei.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
In the f2fs_fill_super function, the variable "retry" is of bool type,
so I think it should be set to false.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds
Pull LED subsystem updates from Bryan Wu:
"In this cycle, we merged some fix and update for LED Flash class
driver. Then the core code of LED Flash class driver is in the kernel
now. Moreover, we also got some bug fixes, code cleanup and new
drivers for LED controllers"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds:
leds: Don't treat the LED name as a format string
leds: Use log level warn instead of info when telling about a name clash
leds/led-class: Handle LEDs with the same name
leds: lp8860: Fix typo in MODULE_DESCRIPTION in leds-lp8860.c
leds: lp8501: Fix typo in MODULE_DESCRIPTION in leds-lp8501.c
DT: leds: Add uniqueness requirement for 'label' property.
dt-binding: leds: Add common LED DT bindings macros
leds: add Qualcomm PM8941 WLED driver
leds: add DT binding for Qualcomm PM8941 WLED block
leds: pca963x: Add missing initialiation of struct led_info.flags
leds: flash: Fix the size of sysfs_groups array
Documentation: leds: Add description of LED Flash class extension
leds: flash: document sysfs interface
leds: flash: Remove synchronized flash strobe feature
leds: Introduce devres helper for led_classdev_register
leds: lp8860: make use of devm_gpiod_get_optional
leds: Let the binding document example for leds-gpio follow the gpio bindings
leds: flash: remove stray include directive
leds: leds-pwm: drop one pwm_get_period() call
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"There have been major modernization with the standard bus: in ALSA
sequencer core and HD-audio. Also, HD-audio receives the regmap
support replacing the in-house register cache code. These
changes shouldn't impact the existing behavior, but rather
refactoring.
In addition, HD-audio got the code split to a core library part and
the "legacy" driver parts. This is a preliminary work for adapting
the upcoming ASoC HD-audio driver, and the whole transition is still
work in progress, likely finished in 4.1.
Along with them, there are many updates in ASoC area as usual, too:
lots of cleanups, Intel code shuffling, etc.
Here are some highlights:
ALSA core:
- PCM: the audio timestamp / wallclock enhancement
- PCM: fixes in DPCM management
- Fixes / cleanups of user-space control element management
- Sequencer: modernization using the standard bus
HD-audio:
- Modernization using the standard bus
- Regmap support
- Use standard runtime PM for codec power saving
- Widget-path based power-saving for IDT, VIA and Realtek codecs
- Reorganized sysfs entries for each codec object
- More Dell headset support
ASoC:
- Move of jack registration to the card level
- Lots of ASoC cleanups, mainly moving things from the CODEC level to
the card level
- Support for DAPM routes specified by both the machine driver and DT
- Continuing improvements to rcar
- pcm512x enhancements
- Intel platforms updates
- rt5670 updates / fixes
- New platforms / devices: some non-DSP Qualcomm platforms, Google's
Storm platform, Maxim MAX98925 CODECs and the Ingenic JZ4780 SoC
Misc:
- ice1724: Improved ESI W192M support
- emu10k1: Emu 1010 fixes/enhancement"
* tag 'sound-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (411 commits)
ALSA: hda - set GET bit when adding a vendor verb to the codec regmap
ALSA: hda/realtek - Enable the ALC292 dock fixup on the Thinkpad T450
ALSA: hda - Fix another race in runtime PM refcounting
ALSA: hda - Expose codec type sysfs
ALSA: ctl: fix to handle several elements added by one operation for userspace element
ASoC: Intel: fix array_size.cocci warnings
ASoC: n810: Automatically disconnect non-connected pins
ASoC: n810: Consistently pass the card DAPM context to n810_ext_control()
ASoC: davinci-evm: Use card DAPM context to access widgets
ASoC: mop500_ab8500: Use card DAPM context to access widgets
ASoC: wm1133-ev1: Use card DAPM context to access widgets
ASoC: atmel: Improve machine driver compile test coverage
ASoC: atmel: Add dependency to SND_SOC_I2C_AND_SPI where necessary
ALSA: control: Fix a typo of SNDRV_CTL_ELEM_ACCESS_TLV_* with SNDRV_CTL_TLV_OP_*
ALSA: usb-audio: Don't attempt to get Microsoft Lifecam Cinema sample rate
ASoC: rnsd: fix build regression without CONFIG_OF
ALSA: emu10k1: add toggles for E-mu 1010 optical ports
ALSA: ctl: fill identical information to return value when adding userspace elements
ALSA: ctl: fix a bug to return no identical information in info operation for userspace controls
ALSA: ctl: confirm to return all identical information in 'activate' event
...
|
|
git://anongit.freedesktop.org/drm-intel into drm-next
Misc i915 fixes.
* tag 'drm-intel-next-fixes-2015-04-15' of git://anongit.freedesktop.org/drm-intel:
drm/i915: Dont enable CS_PARSER_ERROR interrupts at all
drm/i915: Move drm_framebuffer_unreference out of struct_mutex for takeover
drm/i915: Allocate connector state together with the connectors
drm/i915/chv: Remove DPIO force latency causing interpair skew issue
drm/i915: Don't cancel DRRS worker synchronously for flush/invalidate
drm/i915: Fix locking in DRRS flush/invalidate hooks
|
|
git://anongit.freedesktop.org/drm-intel into drm-next
One more drm-misc pull for 4.1 with mostly simple stuff and boring
refactoring. Even the cursor fix from Matt is just to make a really anal
igt happy.
* tag 'topic/drm-misc-2015-04-15' of git://anongit.freedesktop.org/drm-intel:
drm: fix trivial typo mistake
drm: Make integer overflow checking cover universal cursor updates (v2)
drm: make crtc/encoder/connector/plane helper_private a const pointer
drm/armada: constify struct drm_encoder_helper_funcs pointer
drm/radeon: constify more struct drm_*_helper funcs pointers
drm/edid: add #defines for ELD versions
drm/atomic: Add for_each_{connector,crtc,plane}_in_state helper macros
drm: Use kref_put_mutex in drm_gem_object_unreference_unlocked
drm/drm: constify all struct drm_*_helper funcs pointers
drm/qxl: constify all struct drm_*_helper funcs pointers
drm/nouveau: constify all struct drm_*_helper funcs pointers
drm/radeon: constify all struct drm_*_helper funcs pointers
drm/gma500: constify all struct drm_*_helper funcs pointers
drm/mgag200: constify all struct drm_*_helper funcs pointers
drm/exynos: constify all struct drm_*_helper funcs pointers
drm: Fix some typos
|
|
into drm-next
This set of patches adjust the setup of the HDMI CTS/N values for audio
support to be compliant with the work-around given in the iMX6 errata
documentation as part of the preparation for integrating audio support
for this driver, and also update the HDMI phy configuration for Rockchip
devices to improve the HDMI eye pattern.
* 'drm-dwhdmi-devel' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
drm: rockchip/dw_hdmi-rockchip: improve for HDMI electrical test
drm: bridge/dw_hdmi: separate VLEVCTRL settting into platform driver
drm: bridge/dw_hdmi: fixed codec style
drm: bridge/dw_hdmi: adjust n/cts setting order
drm: bridge/dw_hdmi: protect n/cts setting with a mutex
drm: bridge/dw_hdmi: combine hdmi_set_clock_regenerator_n() and hdmi_regenerate_cts()
Conflicts:
drivers/gpu/drm/imx/dw_hdmi-imx.c
|
|
into drm-next
Some final bits for 4.1. Some fixes for userptrs and allow a new
packet for VCE to enable some new features in mesa.
* 'drm-next-4.1' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: allow creating overlapping userptrs
drm/radeon: add userptr config option
drm/radeon: add video usability info support for VCE
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2015-04-14
This series contains updates to i40e and i40evf.
Mitch provides a fix for i40e, where VFs were gone and the associated
VSI's had been removed and the rings were not stopped, which in some
circumstances caused memory corruption or DMAR errors. So stop all the
rings associated with each VF before releasing its resources. Also
cleaned up a poorly indented piece of code. Fixes VF link state, where
VF devices were assuming link is up unless told otherwise, which means
that VFs instantiated on a PF with no link, would report the wrong state.
Anjali adds support to add Flow director Sideband rules for a VF from its
PF. Fixes a recently discovered hardware issue, where after a VFLR
hardware might be indicating to us a reset completion a little too early, so
wait another 10 msec for cache to be cleaned up.
Jesse enables the user to dump the internal hardware state for better
debugging by allowing a bash script to acquire information about the
internal hardware state. The data output to the kernel log is collected
by the script and can then be sent to Intel. Also fixed a possible
failure path to allocate memory that was found by smatch. Cleaned up
unused local variables.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 9a2620c877454 ("bnx2x: prevent WARN during driver unload")
switched the napi/busy_lock locking mechanism from spin_lock() into
spin_lock_bh(), breaking inter-operability with netconsole, as netpoll
disables interrupts prior to calling our napi mechanism.
This switches the driver into using atomic assignments instead of the
spinlock mechanisms previously employed.
Based on initial patch from Yuval Mintz & Ariel Elior
I basically added softirq starvation avoidance, and mixture
of atomic operations, plain writes and barriers.
Note this slightly reduces the overhead for this driver when no
busy_poll sockets are in use.
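A lock-free ownership handoff of this kind can be sketched with C11 atomics. This is not the driver's code: the state names, struct model_fastpath and the helpers are hypothetical, and the softirq starvation avoidance mentioned above is not modelled.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical ownership states for one fastpath. */
enum model_fp_state {
	MODEL_FP_IDLE,
	MODEL_FP_NAPI,	/* owned by the napi handler */
	MODEL_FP_POLL,	/* owned by a busy-polling socket */
};

struct model_fastpath {
	_Atomic int state;
};

static void model_fp_init(struct model_fastpath *fp)
{
	atomic_init(&fp->state, MODEL_FP_IDLE);
}

/* Try to take ownership for napi; fails if someone else holds it. */
static bool model_fp_lock_napi(struct model_fastpath *fp)
{
	int expected = MODEL_FP_IDLE;

	return atomic_compare_exchange_strong(&fp->state, &expected,
					      MODEL_FP_NAPI);
}

/* Try to take ownership for busy polling. */
static bool model_fp_lock_poll(struct model_fastpath *fp)
{
	int expected = MODEL_FP_IDLE;

	return atomic_compare_exchange_strong(&fp->state, &expected,
					      MODEL_FP_POLL);
}

/* Release ownership with a plain atomic store. */
static void model_fp_unlock(struct model_fastpath *fp)
{
	atomic_store(&fp->state, MODEL_FP_IDLE);
}
```

Because neither path takes a spinlock, the scheme does not depend on whether interrupts or bottom halves are enabled in the calling context.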
Fixes: 9a2620c877454 ("bnx2x: prevent WARN during driver unload")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull file locking related changes from Jeff Layton:
"This set is mostly minor cleanups to the overhaul that went in last
cycle. The other noticeable items are the changes to the lm_get_owner
and lm_put_owner prototypes, and the fact that we no longer need to
use the i_lock to protect the i_flctx pointer"
* tag 'locks-v4.1-1' of git://git.samba.org/jlayton/linux:
locks: use cmpxchg to assign i_flctx pointer
locks: get rid of WE_CAN_BREAK_LSLK_NOW dead code
locks: change lm_get_owner and lm_put_owner prototypes
locks: don't allocate a lock context for an F_UNLCK request
locks: Add lockdep assertion for blocked_lock_lock
locks: remove extraneous IS_POSIX and IS_FLOCK tests
locks: Remove unnecessary IS_POSIX test
|
|
The code sets the expiry value of the timer to a relative value and
starts it with hrtimer_start_expires. That's fine, but that only works
once. The timer is started in relative mode, so the expiry value gets
overwritten with the absolute expiry time (now + expiry).
So once the timer expired, a new call to hrtimer_start_expires results
in an immediately expired timer, because the expiry value is
already in the past.
Use the proper mechanisms to (re)start the timer in the intended way.
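The arithmetic behind the failure can be shown with a simplified model. It is only a sketch of the behaviour described above, not the hrtimer implementation; struct model_timer, the helper names and the unitless time values are all assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

struct model_timer {
	int64_t expires;	/* relative before the first start, absolute after */
};

/* First start in relative mode: the stored value becomes now + expiry. */
static void model_start_relative(struct model_timer *t, int64_t now)
{
	t->expires += now;
}

/* Re-arm with an explicit interval measured from the current time. */
static void model_restart(struct model_timer *t, int64_t now, int64_t interval)
{
	t->expires = now + interval;
}

/* A timer whose stored expiry is not in the future counts as expired. */
static bool model_expired(const struct model_timer *t, int64_t now)
{
	return t->expires <= now;
}
```

With an interval of 100 started at time 1000, the stored expiry becomes 1100; at time 1200 that stale value is already in the past, whereas model_restart() with the interval yields 1300.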
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: dingtianhong <dingtianhong@huawei.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Zhangfei Gao <zhangfei.gao@linaro.org>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: netdev@vger.kernel.org
Acked-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The networking updates from David Miller removed the iocb argument from
sendmsg and recvmsg (in commit 1b784140474e: "net: Remove iocb argument
from sendmsg and recvmsg"), but the crypto code had added new instances
of them.
When I pulled the crypto update, it was a silent semantic mis-merge, and
I overlooked the new warning messages in my test-build. I try to fix
those in the merge itself, but that relies on me noticing. Oh well.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc
Pull exec domain removal from Richard Weinberger:
"This series removes execution domain support from Linux.
The idea behind exec domains was to support different ABIs. The
feature was never complete nor stable. Let's rip it out and make the
kernel signal handling code less complicated"
* 'exec_domain_rip_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc: (27 commits)
arm64: Removed unused variable
sparc: Fix execution domain removal
Remove rest of exec domains.
arch: Remove exec_domain from remaining archs
arc: Remove signal translation and exec_domain
xtensa: Remove signal translation and exec_domain
xtensa: Autogenerate offsets in struct thread_info
x86: Remove signal translation and exec_domain
unicore32: Remove signal translation and exec_domain
um: Remove signal translation and exec_domain
tile: Remove signal translation and exec_domain
sparc: Remove signal translation and exec_domain
sh: Remove signal translation and exec_domain
s390: Remove signal translation and exec_domain
mn10300: Remove signal translation and exec_domain
microblaze: Remove signal translation and exec_domain
m68k: Remove signal translation and exec_domain
m32r: Remove signal translation and exec_domain
m32r: Autogenerate offsets in struct thread_info
frv: Remove signal translation and exec_domain
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML updates from Richard Weinberger:
- hostfs saw a face lifting
- old/broken stuff was removed (SMP, HIGHMEM, SKAS3/4)
- random cleanups and bug fixes
* tag 'for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: (26 commits)
um: Print minimum physical memory requirement
um: Move uml_postsetup in the init_thread stack
um: add a kmsg_dumper
x86, UML: fix integer overflow in ELF_ET_DYN_BASE
um: hostfs: Reduce number of syscalls in readdir
um: Remove broken highmem support
um: Remove broken SMP support
um: Remove SKAS3/4 support
um: Remove ppc cruft
um: Remove ia64 cruft
um: Remove dead code from stacktrace
hostfs: No need to box and later unbox the file mode
hostfs: Use page_offset()
hostfs: Set page flags in hostfs_readpage() correctly
hostfs: Remove superfluous initializations in hostfs_open()
hostfs: hostfs_open: Reset open flags upon each retry
hostfs: Remove superfluous test in hostfs_open()
hostfs: Report append flag in ->show_options()
hostfs: Use __getname() in follow_link
hostfs: Remove open coded strcpy()
...
|
|
Pull UBI/UBIFS updates from Richard Weinberger:
"This pull request includes the following UBI/UBIFS changes:
- powercut emulation for UBI
- a huge update to UBI Fastmap
- cleanups and bugfixes all over UBI and UBIFS"
* tag 'upstream-4.1-rc1' of git://git.infradead.org/linux-ubifs: (50 commits)
UBI: power cut emulation for testing
UBIFS: fix output format of INUM_WATERMARK
UBI: Fastmap: Fall back to scanning mode after ECC error
UBI: Fastmap: Remove is_fm_block()
UBI: Fastmap: Add blank line after declarations
UBI: Fastmap: Remove else after return.
UBI: Fastmap: Introduce may_reserve_for_fm()
UBI: Fastmap: Introduce ubi_fastmap_init()
UBI: Fastmap: Wire up WL accessor functions
UBI: Add accessor functions for WL data structures
UBI: Move fastmap specific functions out of wl.c
UBI: Fastmap: Add new module parameter fm_debug
UBI: Fastmap: Make self_check_eba() depend on fastmap self checking
UBI: Fastmap: Add self check to detect absent PEBs
UBI: Fix stale pointers in ubi->lookuptbl
UBI: Fastmap: Enhance fastmap checking
UBI: Add initial support for fastmap self checks
UBI: Fastmap: Rework fastmap error paths
UBI: Fastmap: Prepare for variable sized fastmaps
UBI: Fastmap: Locking updates
...
|
|
into for-4.1
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull second vfs update from Al Viro:
"Now that net-next went in... Here's the next big chunk - killing
->aio_read() and ->aio_write().
There'll be one more pile today (direct_IO changes and
generic_write_checks() cleanups/fixes), but I'd prefer to keep that
one separate"
* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits)
->aio_read and ->aio_write removed
pcm: another weird API abuse
infinibad: weird APIs switched to ->write_iter()
kill do_sync_read/do_sync_write
fuse: use iov_iter_get_pages() for non-splice path
fuse: switch to ->read_iter/->write_iter
switch drivers/char/mem.c to ->read_iter/->write_iter
make new_sync_{read,write}() static
coredump: accept any write method
switch /dev/loop to vfs_iter_write()
serial2002: switch to __vfs_read/__vfs_write
ashmem: use __vfs_read()
export __vfs_read()
autofs: switch to __vfs_write()
new helper: __vfs_write()
switch hugetlbfs to ->read_iter()
coda: switch to ->read_iter/->write_iter
ncpfs: switch to ->read_iter/->write_iter
net/9p: remove (now-)unused helpers
p9_client_attach(): set fid->uid correctly
...
|
|
architectures
If CONFIG_ARCH_DMA_ADDR_T_64BIT is enabled for x86 systems and physical
memory is more than 4GB, dma_map_page may return a valid memory
address which is greater than 0xffffffff. As a result, the mlx5 device page
allocator RB tree will be initialized with valid addresses greater than
0xffffffff.
However, (addr & PAGE_MASK) sets the high four bytes to zeros. So, it's
impossible for the function, free_4k, to release the pages whose
addresses are greater than 4GB, and memory leaks. The mlx5_ib module also
can't release the pages when the user tries to remove the module, and as a
result the system hangs.
[root@rdma05 root]# dmesg | grep addr | head
addr = 3fe384000
addr & PAGE_MASK = fe384000
[root@rdma05 root]# rmmod mlx5_ib <---- hang on
---------------------- console log -----------------
mlx5_ib 0000:04:00.0: irq 138 for MSI/MSI-X
alloc irq_desc for 139 on node -1
alloc kstat_irqs on node -1
mlx5_ib 0000:04:00.0: irq 139 for MSI/MSI-X
0000:04:00.0:free_4k:221:(pid 1519): page not found
0000:04:00.0:free_4k:221:(pid 1519): page not found
0000:04:00.0:free_4k:221:(pid 1519): page not found
0000:04:00.0:free_4k:221:(pid 1519): page not found
---------------------- console log -----------------
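The truncation seen in the dmesg output can be reproduced in userspace. The sketch assumes a 4K page and models a 32-bit wide mask with uint32_t; MODEL_PAGE_SHIFT and both function names are hypothetical.

```c
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12	/* assumed 4K pages */

/* A 32-bit mask is zero-extended before the AND, so bits 32..63 are lost. */
static uint64_t model_mask_32bit(uint64_t addr)
{
	uint32_t page_mask = ~((UINT32_C(1) << MODEL_PAGE_SHIFT) - 1);

	return addr & page_mask;
}

/* A mask as wide as the address keeps the high bits intact. */
static uint64_t model_mask_64bit(uint64_t addr)
{
	uint64_t page_mask = ~((UINT64_C(1) << MODEL_PAGE_SHIFT) - 1);

	return addr & page_mask;
}
```

For the address from the log, model_mask_32bit(0x3fe384000) gives 0xfe384000, matching the dmesg line, while model_mask_64bit() returns the address unchanged.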
Fixes: bf0bf77f6519 ('mlx5: Support communicating arbitrary host page size to firmware')
Signed-off-by: Honggang Li <honli@redhat.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
In some rare cases, IO operations may not be aligned to page
boundaries. This prevents iser from performing fast memory
registration. In order to overcome that, iser uses a bounce
buffer to carry the transaction. We basically allocate a buffer
the size of the transaction and perform a copy.
The buffer allocation using kmalloc is too restrictive since it
requires higher order (atomic) allocations for large transactions
(which may result in memory exhaustion fairly fast for some workloads).
We rewrite the bounce buffer code path to allocate scattered pages
and perform a copy between the transaction sg and the bounce sg.
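Copying between two segment lists whose boundaries do not line up can be sketched as follows. This is a userspace model only: struct model_seg and model_seg_copy() are hypothetical and stand in for the scatterlist handling in the driver.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one scatterlist entry. */
struct model_seg {
	unsigned char *buf;
	size_t len;
};

/*
 * Copy bytes from the source segments into the destination segments,
 * crossing segment boundaries on either side as needed.
 * Returns the number of bytes copied.
 */
static size_t model_seg_copy(struct model_seg *dst, size_t ndst,
			     const struct model_seg *src, size_t nsrc)
{
	size_t di = 0, si = 0, doff = 0, soff = 0, copied = 0;

	while (di < ndst && si < nsrc) {
		size_t dleft = dst[di].len - doff;
		size_t sleft = src[si].len - soff;
		size_t n = dleft < sleft ? dleft : sleft;

		memcpy(dst[di].buf + doff, src[si].buf + soff, n);
		copied += n;
		doff += n;
		soff += n;

		/* Advance whichever side has been exhausted. */
		if (doff == dst[di].len) {
			di++;
			doff = 0;
		}
		if (soff == src[si].len) {
			si++;
			soff = 0;
		}
	}
	return copied;
}
```

The destination here would be a list of individually allocated pages, so no single high-order allocation is needed regardless of the transaction size.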
Reported-by: Alex Lyakas <alex@zadarastorage.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
In singleton scatterlists, DMA memory registration code
is taken both for Fastreg and FMR code paths. Move it to
a function.
This patch does not change any functionality.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Instead of passing ib_sge as output variable, we pass the mem_reg
pointer to have the routines fill the rkey as well. This reduces
code duplication and extra assignments. This is a preparation step
to unify some registration logics together. Also, pass iser_fast_reg_mr
the fastreg descriptor directly.
This patch does not change any functionality.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
No need to keep lkey, va, len variables, we can keep
them as struct ib_sge. This will help when we change the
memory registration logic.
This patch does not change any functionality.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Memory regions are resources that are saved
in the device caches. Increase the probability for
a cache hit by adding the MRU descriptor to pool
head.
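Returning the most recently used descriptor to the head makes the pool behave like a stack. A minimal sketch, assuming a singly linked pool; struct model_desc, struct model_pool and the get/put helpers are hypothetical names.

```c
#include <stddef.h>

struct model_desc {
	int id;
	struct model_desc *next;
};

struct model_pool {
	struct model_desc *head;
};

/* Take the descriptor at the pool head, or NULL if the pool is empty. */
static struct model_desc *model_pool_get(struct model_pool *pool)
{
	struct model_desc *d = pool->head;

	if (d) {
		pool->head = d->next;
		d->next = NULL;
	}
	return d;
}

/* Return a descriptor to the head, so it is the next one handed out. */
static void model_pool_put(struct model_pool *pool, struct model_desc *d)
{
	d->next = pool->head;
	pool->head = d;
}
```

The descriptor that was just released is reused first, which is the one most likely to still be present in the device caches.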
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Make iser_[create|destroy]_fastreg_desc shorter, more
readable and easily extendable.
This patch does not change any functionality.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Instead of open-coding connection fastreg pool get/put,
we introduce iser_reg_desc[get|put] helpers.
We aren't setting these static as this will be a per-device
routine later on. Also, cleanup iser_unreg_rdma_mem_fastreg
a bit.
This patch does not change any functionality.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
No need for these two separate. Keep it in a single routine
like in the fastreg case. This will also make iser_reg_page_vec
closer to iser_fast_reg_mr arguments. This is a preparation
step for registration flow refactor.
This patch does not change any functionality.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This struct's members other than struct iser_mem_reg are unused,
so remove it altogether.
This patch does not change any functionality.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Buffer length was assigned twice, and no reason to set va to
io_addr and then add the offset, just set va to io_addr + offset.
This patch does not change any functionality.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
As memory registration/de-registration methods, let's
move them to their natural location. While we're at it,
make iser_reg_page_vec routine static.
This patch does not change any functionality.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
No need to pass that, we can take it from the task.
In a later stage, this function will be invoked
according to a device capability.
This patch does not change any functionality.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
No need to keep two iser_data_buf structures just in case we use
mem copy. We can avoid that just by adding a pointer to the original
sg. So keep only two iser_data_buf per command (data and protection)
and pass the relevant data_buf to bounce buffer routine.
This patch does not change any functionality.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This code was added before we had protection data length
calculation (in iser_send_command), so we needed to calc
the sg data length from the sg itself. This is not needed
anymore.
This patch does not change any functionality.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This length miscalculation may cause silent data corruption
in the DIX case and cause the device to reference an unmapped area.
Fixes: d77e65350f2d ('libiscsi, iser: Adjust data_length to include protection information')
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Fast registration and local invalidate work requests can
also fail. We should call error completion handler for them.
Reported-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
In case the user unloaded ib_iser while ep_connect is in
progress, we need to destroy the endpoint although ep_disconnect
wasn't invoked (we detect this by the iser conn state != DOWN).
However, if we got a REJECTED/UNREACHABLE CM event we move the
connection state to DOWN which will prevent us from destroying
the endpoint in the module unload stage. Fix this by setting the
connection state to TERMINATING in iser_conn_error so we can still
destroy the endpoint at unload stage.
Reported-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The driver already defined pr_fmt, it just hadn't
been converted to use pr_info, pr_warn, and pr_err instead
of the equivalent printks. Convert so that messages from
the driver are now properly tagged with their driver name
and can be more easily debugged.
In addition, a number of these printk's were not newline
terminated, so fix that at the same time.
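How a format prefix macro tags every message can be modelled in userspace as below. The macro names, the "example_drv" prefix and the snprintf() target are assumptions for the sketch; the kernel's pr_info() and friends print to the kernel log instead.

```c
#include <stdio.h>

/* Hypothetical prefix; a driver would normally use its module name. */
#define model_pr_fmt(fmt) "example_drv: " fmt

/*
 * Userspace stand-in for pr_info(): the prefix is pasted onto the
 * format string at compile time through string literal concatenation.
 */
#define model_pr_info(buf, size, fmt, ...) \
	snprintf(buf, size, model_pr_fmt(fmt), __VA_ARGS__)
```

A call such as model_pr_info(buf, sizeof(buf), "login took %d ms\n", 5) produces a newline-terminated line that starts with the driver prefix, with no prefix repeated at each call site.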
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This change slightly reduces the time needed to log in.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: David Dillow <dave@thedillows.org>
Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Since ib_dma_map_single can fail use ib_dma_mapping_error to check
for errors.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Acked-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|