summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2017-01-12drm/probe-helpers: Drop locking from poll_enableDaniel Vetter
It was only needed to protect the connector_list walking, see commit 8c4ccc4ab6f64e859d4ff8d7c02c2ed2e956e07f Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Thu Jul 9 23:44:26 2015 +0200 drm/probe-helper: Grab mode_config.mutex in poll_init/enable Unfortunately the commit message of that patch fails to mention that the new locking check was for the connector_list. But that requirement disappeared in commit c36a3254f7857f1ad9badbe3578ccc92be541a8e Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Thu Dec 15 16:58:43 2016 +0100 drm: Convert all helpers to drm_connector_list_iter and so we can drop this again. This fixes a locking inversion on nouveau, where the rpm code needs to re-enable. But in other places the rpm_get() calls are nested within the big modeset locks. While at it, also improve the kerneldoc for these two functions a notch. v2: Update the kerneldoc even more to explain that these functions can't be called concurrently, or bad things happen (Chris). Cc: Dave Airlie <airlied@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Lyude <lyude@redhat.com> Reviewed-by: Lyude <lyude@redhat.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170111090117.5134-1-daniel.vetter@ffwll.ch
2017-01-12i2c: do not enable fall back to Host Notify by defaultDmitry Torokhov
Falling back unconditionally to HostNotify as primary client's interrupt breaks some drivers which alter their functionality depending on whether interrupt is present or not, so let's introduce a board flag telling I2C core explicitly if we want wired interrupt or HostNotify-based one: I2C_CLIENT_HOST_NOTIFY. For DT-based systems we introduce "host-notify" property that we convert to I2C_CLIENT_HOST_NOTIFY board flag. Tested-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Pali Rohár <pali.rohar@gmail.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-01-12Merge tag 'rproc-v4.10-fixes' of git://github.com/andersson/remoteprocLinus Torvalds
Pull remoteproc fixes from Bjorn Andersson: "This fixes two regressions that have been reported to be introduced in v4.10-rc1. - correct an incorrect usage of the kref api - revert the change to make the resource table read-only. As the space each vdev resource is used as virtio device config space it must be shared with the remote" * tag 'rproc-v4.10-fixes' of git://github.com/andersson/remoteproc: Revert "remoteproc: Merge table_ptr and cached_table pointers" remoteproc: fix vdev reference management
2017-01-12Merge branch 'scsi-target-for-v4.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/bvanassche/linux Pull scsi target fixes from Bart Van Assche: - a series of bug fixes for the XCOPY implementation from David Disseldorp - one bug fix for the ibmvscsis driver, a driver that is used for communication between partitions on IBM POWER systems. * 'scsi-target-for-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/bvanassche/linux: ibmvscsis: Fix srp_transfer_data fail return code target: support XCOPY requests without parameters target: check for XCOPY parameter truncation target: use XCOPY segment descriptor CSCD IDs target: check XCOPY segment descriptor CSCD IDs target: simplify XCOPY wwn->se_dev lookup helper target: return UNSUPPORTED TARGET/SEGMENT DESC TYPE CODE sense target: bounds check XCOPY total descriptor list length target: bounds check XCOPY segment descriptor list target: use XCOPY TOO MANY TARGET DESCRIPTORS sense target: add XCOPY target/segment desc sense codes
2017-01-12regmap: Fixup the kernel-doc comments on functions/structuresCharles Keepax
Most of the kernel-doc comments in regmap don't actually generate correctly. This patch fixes up a few common issues, corrects some typos and adds some missing argument descriptions. The most common issues being using a : after the function name which causes the short description to not render correctly and not separating the long and short descriptions of the function. There are quite a few instances of arguments not being described or given the wrong name as well. This patch doesn't fixup functions/structures that are currently missing descriptions. Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-01-12block: Rename blk_queue_zone_size and bdev_zone_sizeDamien Le Moal
All block device data fields and functions returning a number of 512B sectors are by convention named xxx_sectors while names in the form xxx_size are generally used for a number of bytes. The blk_queue_zone_size and bdev_zone_size functions were not following this convention so rename them. No functional change is introduced by this patch. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Collapsed the two patches, they were nonsensically split and broke bisection. Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-12jump_labels: API for flushing deferred jump label updatesDavid Matlack
Modules that use static_key_deferred need a way to synchronize with any delayed work that is still pending when the module is unloaded. Introduce static_key_deferred_flush() which flushes any pending jump label updates. Signed-off-by: David Matlack <dmatlack@google.com> Cc: stable@vger.kernel.org Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12locking/spinlocks: Remove the unused spin_lock_bh_nested() APIWaiman Long
The spin_lock_bh_nested() API is defined but is not used anywhere in the kernel. So all spin_lock_bh_nested() and related APIs are now removed. Signed-off-by: Waiman Long <longman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1483975612-16447-1-git-send-email-longman@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-12power: supply: bq27xxx: adds specific support for bq27520-g4 revision.Chris Lapa
This commit adds the BQ27520G4 chip definition to specifically match the bq27520-G4 functionality as described in the datasheet. Signed-off-by: Chris Lapa <chris@lapa.com.au> Acked-by: Pali Rohár <pali.rohar@gmail.com> Reviewed-by: Andrew F. Davis <afd@ti.com> Signed-off-by: Sebastian Reichel <sre@kernel.org>
2017-01-12power: supply: bq27xxx: adds specific support for bq27520-g3 revision.Chris Lapa
This commit adds the BQ27520G3 chip definition to specifically match the bq27520-G3 functionality as described in the datasheet. Signed-off-by: Chris Lapa <chris@lapa.com.au> Acked-by: Pali Rohár <pali.rohar@gmail.com> Reviewed-by: Andrew F. Davis <afd@ti.com> Signed-off-by: Sebastian Reichel <sre@kernel.org>
2017-01-12power: supply: bq27xxx: adds specific support for bq27520-g2 revision.Chris Lapa
This commit adds the BQ27520G2 chip definition to specifically match the bq27520-G2 functionality as described in the datasheet. Signed-off-by: Chris Lapa <chris@lapa.com.au> Acked-by: Pali Rohár <pali.rohar@gmail.com> Reviewed-by: Andrew F. Davis <afd@ti.com> Signed-off-by: Sebastian Reichel <sre@kernel.org>
2017-01-12power: supply: bq27xxx: adds specific support for bq27520-g1 revision.Chris Lapa
This commit adds the BQ27520G1 chip definition to specifically match the bq27520-G1 functionality as described in the datasheet. Signed-off-by: Chris Lapa <chris@lapa.com.au> Acked-by: Pali Rohár <pali.rohar@gmail.com> Reviewed-by: Andrew F. Davis <afd@ti.com> Signed-off-by: Sebastian Reichel <sre@kernel.org>
2017-01-12power: supply: bq27xxx: adds specific support for bq27510-g3 revision.Chris Lapa
This commit adds the BQ27510G3 chip definition to specifically match the bq27510-G3 functionality as described in the datasheet. Signed-off-by: Chris Lapa <chris@lapa.com.au> Acked-by: Pali Rohár <pali.rohar@gmail.com> Reviewed-by: Andrew F. Davis <afd@ti.com> Tested-by: Chris Lapa <chris@lapa.com.au> Signed-off-by: Sebastian Reichel <sre@kernel.org>
2017-01-12power: supply: bq27xxx: adds specific support for bq27510-g2 revision.Chris Lapa
This commit adds the BQ27510G2 chip definition to specifically match the bq27510-G2 functionality as described in the datasheet. Signed-off-by: Chris Lapa <chris@lapa.com.au> Acked-by: Pali Rohár <pali.rohar@gmail.com> Reviewed-by: Andrew F. Davis <afd@ti.com> Signed-off-by: Sebastian Reichel <sre@kernel.org>
2017-01-12power: supply: bq27xxx: adds specific support for bq27510-g1 revision.Chris Lapa
This commit adds the BQ27510G1 chip definition to specifically match the bq27510-G1 functionality as described in the datasheet. Signed-off-by: Chris Lapa <chris@lapa.com.au> Acked-by: Pali Rohár <pali.rohar@gmail.com> Reviewed-by: Andrew F. Davis <afd@ti.com> Signed-off-by: Sebastian Reichel <sre@kernel.org>
2017-01-12power: supply: bq27xxx: adds specific support for bq27500/1 revision.Chris Lapa
This commit adds the BQ27500 chip definition to specifically match the bq27500/1 functionality as described in the datasheet. Signed-off-by: Chris Lapa <chris@lapa.com.au> Acked-by: Pali Rohár <pali.rohar@gmail.com> Reviewed-by: Andrew F. Davis <afd@ti.com> Signed-off-by: Sebastian Reichel <sre@kernel.org>
2017-01-12power: supply: bq27xxx: rename BQ27510 allow for deprecation in future.Chris Lapa
The BQ2751X definition exists only to satisfy backwards compatibility. Signed-off-by: Chris Lapa <chris@lapa.com.au> Signed-off-by: Sebastian Reichel <sre@kernel.org>
2017-01-12power: supply: bq27xxx: rename BQ27500 allow for deprecation in future.Chris Lapa
The BQ2750X definition exists only to satisfy backwards compatibility. Signed-off-by: Chris Lapa <chris@lapa.com.au> Signed-off-by: Sebastian Reichel <sre@kernel.org>
2017-01-12power: supply: axp20x_usb_power: fix warning on 64bitMichal Suchanek
Casting of_device_get_match_data return value to int causes warning on 64bit architectures. ../drivers/power/supply/axp20x_usb_power.c: In function 'axp20x_usb_power_probe': ../drivers/power/supply/axp20x_usb_power.c:297:21: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] Fixes: 0dcc70ca8644 ("power: supply: axp20x_usb_power: use of_device_id data field instead of device_is_compatible") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Sebastian Reichel <sre@kernel.org>
2017-01-11Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge fixes from Andrew Morton: "27 fixes. There are three patches that aren't actually fixes. They're simple function renamings which are nice-to-have in mainline as ongoing net development depends on them." * akpm: (27 commits) timerfd: export defines to userspace mm/hugetlb.c: fix reservation race when freeing surplus pages mm/slab.c: fix SLAB freelist randomization duplicate entries zram: support BDI_CAP_STABLE_WRITES zram: revalidate disk under init_lock mm: support anonymous stable page mm: add documentation for page fragment APIs mm: rename __page_frag functions to __page_frag_cache, drop order from drain mm: rename __alloc_page_frag to page_frag_alloc and __free_page_frag to page_frag_free mm, memcg: fix the active list aging for lowmem requests when memcg is enabled mm: don't dereference struct page fields of invalid pages mailmap: add codeaurora.org names for nameless email commits signal: protect SIGNAL_UNKILLABLE from unintentional clearing. mm: pmd dirty emulation in page fault handler ipc/sem.c: fix incorrect sem_lock pairing lib/Kconfig.debug: fix frv build failure mm: get rid of __GFP_OTHER_NODE mm: fix remote numa hits statistics mm: fix devm_memremap_pages crash, use mem_hotplug_{begin, done} ocfs2: fix crash caused by stale lvb with fsdlm plugin ...
2017-01-11cfg80211: consider VHT opmode on station updateBeni Lev
Currently, this attribute is only fetched on station addition, but not on station change. Since this info is only present in the assoc request, with full station state support in the driver it cannot be present when the station is added. Thus, add support for changing the VHT opmode on station update if done before (or while) the station is marked as associated. After this, ignore it, since it used to be ignored. Signed-off-by: Beni Lev <beni.lev@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-01-10timerfd: export defines to userspaceMike Frysinger
Since userspace is expected to call timerfd syscalls directly with these flags/ioctls, make sure we export them so they don't have to duplicate the values themselves. Link: http://lkml.kernel.org/r/20161219064052.7196-1-vapier@gentoo.org Signed-off-by: Mike Frysinger <vapier@gentoo.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-10mm: support anonymous stable pageMinchan Kim
During developemnt for zram-swap asynchronous writeback, I found strange corruption of compressed page, resulting in: Modules linked in: zram(E) CPU: 3 PID: 1520 Comm: zramd-1 Tainted: G E 4.8.0-mm1-00320-ge0d4894c9c38-dirty #3274 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 task: ffff88007620b840 task.stack: ffff880078090000 RIP: set_freeobj.part.43+0x1c/0x1f RSP: 0018:ffff880078093ca8 EFLAGS: 00010246 RAX: 0000000000000018 RBX: ffff880076798d88 RCX: ffffffff81c408c8 RDX: 0000000000000018 RSI: 0000000000000000 RDI: 0000000000000246 RBP: ffff880078093cb0 R08: 0000000000000000 R09: 0000000000000000 R10: ffff88005bc43030 R11: 0000000000001df3 R12: ffff880076798d88 R13: 000000000005bc43 R14: ffff88007819d1b8 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff88007e380000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fc934048f20 CR3: 0000000077b01000 CR4: 00000000000406e0 Call Trace: obj_malloc+0x22b/0x260 zs_malloc+0x1e4/0x580 zram_bvec_rw+0x4cd/0x830 [zram] page_requests_rw+0x9c/0x130 [zram] zram_thread+0xe6/0x173 [zram] kthread+0xca/0xe0 ret_from_fork+0x25/0x30 With investigation, it reveals currently stable page doesn't support anonymous page. IOW, reuse_swap_page can reuse the page without waiting writeback completion so it can overwrite page zram is compressing. Unfortunately, zram has used per-cpu stream feature from v4.7. It aims for increasing cache hit ratio of scratch buffer for compressing. Downside of that approach is that zram should ask memory space for compressed page in per-cpu context which requires stricted gfp flag which could be failed. If so, it retries to allocate memory space out of per-cpu context so it could get memory this time and compress the data again, copies it to the memory space. In this scenario, zram assumes the data should never be changed but it is not true unless stable page supports. So, If the data is changed under us, zram can make buffer overrun because second compression size could be bigger than one we got in previous trial and blindly, copy bigger size object to smaller buffer which is buffer overrun. The overrun breaks zsmalloc free object chaining so system goes crash like above. I think below is same problem. https://bugzilla.suse.com/show_bug.cgi?id=997574 Unfortunately, reuse_swap_page should be atomic so that we cannot wait on writeback in there so the approach in this patch is simply return false if we found it needs stable page. Although it increases memory footprint temporarily, it happens rarely and it should be reclaimed easily althoug it happened. Also, It would be better than waiting of IO completion, which is critial path for application latency. Fixes: da9556a2367c ("zram: user per-cpu compression streams") Link: http://lkml.kernel.org/r/20161120233015.GA14113@bbox Link: http://lkml.kernel.org/r/1482366980-3782-2-git-send-email-minchan@kernel.org Signed-off-by: Minchan Kim <minchan@kernel.org> Acked-by: Hugh Dickins <hughd@google.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Hyeoncheol Lee <cheol.lee@lge.com> Cc: <yjay.kim@lge.com> Cc: Sangseok Lee <sangseok.lee@lge.com> Cc: <stable@vger.kernel.org> [4.7+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-10mm: rename __page_frag functions to __page_frag_cache, drop order from drainAlexander Duyck
This patch does two things. First it goes through and renames the __page_frag prefixed functions to __page_frag_cache so that we can be clear that we are draining or refilling the cache, not the frags themselves. Second we drop the order parameter from __page_frag_cache_drain since we don't actually need to pass it since all fragments are either order 0 or must be a compound page. Link: http://lkml.kernel.org/r/20170104023954.13451.5678.stgit@localhost.localdomain Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-10mm: rename __alloc_page_frag to page_frag_alloc and __free_page_frag to ↵Alexander Duyck
page_frag_free Patch series "Page fragment updates", v4. This patch series takes care of a few cleanups for the page fragments API. First we do some renames so that things are much more consistent. First we move the page_frag_ portion of the name to the front of the functions names. Secondly we split out the cache specific functions from the other page fragment functions by adding the word "cache" to the name. Finally I added a bit of documentation that will hopefully help to explain some of this. I plan to revisit this later as we get things more ironed out in the near future with the changes planned for the DMA setup to support eXpress Data Path. This patch (of 3): This patch renames the page frag functions to be more consistent with other APIs. Specifically we place the name page_frag first in the name and then have either an alloc or free call name that we append as the suffix. This makes it a bit clearer in terms of naming. In addition we drop the leading double underscores since we are technically no longer a backing interface and instead the front end that is called from the networking APIs. Link: http://lkml.kernel.org/r/20170104023854.13451.67390.stgit@localhost.localdomain Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-10mm, memcg: fix the active list aging for lowmem requests when memcg is enabledMichal Hocko
Nils Holland and Klaus Ethgen have reported unexpected OOM killer invocations with 32b kernel starting with 4.8 kernels kworker/u4:5 invoked oom-killer: gfp_mask=0x2400840(GFP_NOFS|__GFP_NOFAIL), nodemask=0, order=0, oom_score_adj=0 kworker/u4:5 cpuset=/ mems_allowed=0 CPU: 1 PID: 2603 Comm: kworker/u4:5 Not tainted 4.9.0-gentoo #2 [...] Mem-Info: active_anon:58685 inactive_anon:90 isolated_anon:0 active_file:274324 inactive_file:281962 isolated_file:0 unevictable:0 dirty:649 writeback:0 unstable:0 slab_reclaimable:40662 slab_unreclaimable:17754 mapped:7382 shmem:202 pagetables:351 bounce:0 free:206736 free_pcp:332 free_cma:0 Node 0 active_anon:234740kB inactive_anon:360kB active_file:1097296kB inactive_file:1127848kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:29528kB dirty:2596kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 184320kB anon_thp: 808kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no DMA free:3952kB min:788kB low:984kB high:1180kB active_anon:0kB inactive_anon:0kB active_file:7316kB inactive_file:0kB unevictable:0kB writepending:96kB present:15992kB managed:15916kB mlocked:0kB slab_reclaimable:3200kB slab_unreclaimable:1408kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB lowmem_reserve[]: 0 813 3474 3474 Normal free:41332kB min:41368kB low:51708kB high:62048kB active_anon:0kB inactive_anon:0kB active_file:532748kB inactive_file:44kB unevictable:0kB writepending:24kB present:897016kB managed:836248kB mlocked:0kB slab_reclaimable:159448kB slab_unreclaimable:69608kB kernel_stack:1112kB pagetables:1404kB bounce:0kB free_pcp:528kB local_pcp:340kB free_cma:0kB lowmem_reserve[]: 0 0 21292 21292 HighMem free:781660kB min:512kB low:34356kB high:68200kB active_anon:234740kB inactive_anon:360kB active_file:557232kB inactive_file:1127804kB unevictable:0kB writepending:2592kB present:2725384kB managed:2725384kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:800kB local_pcp:608kB free_cma:0kB the oom killer is clearly pre-mature because there there is still a lot of page cache in the zone Normal which should satisfy this lowmem request. Further debugging has shown that the reclaim cannot make any forward progress because the page cache is hidden in the active list which doesn't get rotated because inactive_list_is_low is not memcg aware. The code simply subtracts per-zone highmem counters from the respective memcg's lru sizes which doesn't make any sense. We can simply end up always seeing the resulting active and inactive counts 0 and return false. This issue is not limited to 32b kernels but in practice the effect on systems without CONFIG_HIGHMEM would be much harder to notice because we do not invoke the OOM killer for allocations requests targeting < ZONE_NORMAL. Fix the issue by tracking per zone lru page counts in mem_cgroup_per_node and subtract per-memcg highmem counts when memcg is enabled. Introduce helper lruvec_zone_lru_size which redirects to either zone counters or mem_cgroup_get_zone_lru_size when appropriate. We are losing empty LRU but non-zero lru size detection introduced by ca707239e8a7 ("mm: update_lru_size warn and reset bad lru_size") because of the inherent zone vs. node discrepancy. Fixes: f8d1a31163fc ("mm: consider whether to decivate based on eligible zones inactive ratio") Link: http://lkml.kernel.org/r/20170104100825.3729-1-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Nils Holland <nholland@tisys.org> Tested-by: Nils Holland <nholland@tisys.org> Reported-by: Klaus Ethgen <Klaus@Ethgen.de> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: <stable@vger.kernel.org> [4.8+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-10signal: protect SIGNAL_UNKILLABLE from unintentional clearing.Jamie Iles
Since commit 00cd5c37afd5 ("ptrace: permit ptracing of /sbin/init") we can now trace init processes. init is initially protected with SIGNAL_UNKILLABLE which will prevent fatal signals such as SIGSTOP, but there are a number of paths during tracing where SIGNAL_UNKILLABLE can be implicitly cleared. This can result in init becoming stoppable/killable after tracing. For example, running: while true; do kill -STOP 1; done & strace -p 1 and then stopping strace and the kill loop will result in init being left in state TASK_STOPPED. Sending SIGCONT to init will resume it, but init will now respond to future SIGSTOP signals rather than ignoring them. Make sure that when setting SIGNAL_STOP_CONTINUED/SIGNAL_STOP_STOPPED that we don't clear SIGNAL_UNKILLABLE. Link: http://lkml.kernel.org/r/20170104122017.25047-1-jamie.iles@oracle.com Signed-off-by: Jamie Iles <jamie.iles@oracle.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-10mm: get rid of __GFP_OTHER_NODEMichal Hocko
The flag was introduced by commit 78afd5612deb ("mm: add __GFP_OTHER_NODE flag") to allow proper accounting of remote node allocations done by kernel daemons on behalf of a process - e.g. khugepaged. After "mm: fix remote numa hits statistics" we do not need and actually use the flag so we can safely remove it because all allocations which are satisfied from their "home" node are accounted properly. [mhocko@suse.com: fix build] Link: http://lkml.kernel.org/r/20170106122225.GK5556@dhcp22.suse.cz Link: http://lkml.kernel.org/r/20170102153057.9451-3-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Taku Izumi <izumi.taku@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-10mm, slab: make sure that KMALLOC_MAX_SIZE will fit into MAX_ORDERMichal Hocko
Andrey Konovalov has reported the following warning triggered by the syzkaller fuzzer. WARNING: CPU: 1 PID: 9935 at mm/page_alloc.c:3511 __alloc_pages_nodemask+0x159c/0x1e20 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 9935 Comm: syz-executor0 Not tainted 4.9.0-rc7+ #34 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __alloc_pages_slowpath mm/page_alloc.c:3511 __alloc_pages_nodemask+0x159c/0x1e20 mm/page_alloc.c:3781 alloc_pages_current+0x1c7/0x6b0 mm/mempolicy.c:2072 alloc_pages include/linux/gfp.h:469 kmalloc_order+0x1f/0x70 mm/slab_common.c:1015 kmalloc_order_trace+0x1f/0x160 mm/slab_common.c:1026 kmalloc_large include/linux/slab.h:422 __kmalloc+0x210/0x2d0 mm/slub.c:3723 kmalloc include/linux/slab.h:495 ep_write_iter+0x167/0xb50 drivers/usb/gadget/legacy/inode.c:664 new_sync_write fs/read_write.c:499 __vfs_write+0x483/0x760 fs/read_write.c:512 vfs_write+0x170/0x4e0 fs/read_write.c:560 SYSC_write fs/read_write.c:607 SyS_write+0xfb/0x230 fs/read_write.c:599 entry_SYSCALL_64_fastpath+0x1f/0xc2 The issue is caused by a lack of size check for the request size in ep_write_iter which should be fixed. It, however, points to another problem, that SLUB defines KMALLOC_MAX_SIZE too large because the its KMALLOC_SHIFT_MAX is (MAX_ORDER + PAGE_SHIFT) which means that the resulting page allocator request might be MAX_ORDER which is too large (see __alloc_pages_slowpath). The same applies to the SLOB allocator which allows even larger sizes. Make sure that they are capped properly and never request more than MAX_ORDER order. Link: http://lkml.kernel.org/r/20161220130659.16461-2-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-10dax: wrprotect pmd_t in dax_mapping_entry_mkcleanRoss Zwisler
Currently dax_mapping_entry_mkclean() fails to clean and write protect the pmd_t of a DAX PMD entry during an *sync operation. This can result in data loss in the following sequence: 1) mmap write to DAX PMD, dirtying PMD radix tree entry and making the pmd_t dirty and writeable 2) fsync, flushing out PMD data and cleaning the radix tree entry. We currently fail to mark the pmd_t as clean and write protected. 3) more mmap writes to the PMD. These don't cause any page faults since the pmd_t is dirty and writeable. The radix tree entry remains clean. 4) fsync, which fails to flush the dirty PMD data because the radix tree entry was clean. 5) crash - dirty data that should have been fsync'd as part of 4) could still have been in the processor cache, and is lost. Fix this by marking the pmd_t clean and write protected in dax_mapping_entry_mkclean(), which is called as part of the fsync operation 2). This will cause the writes in step 3) above to generate page faults where we'll re-dirty the PMD radix tree entry, resulting in flushes in the fsync that happens in step 4). Fixes: 4b4bb46d00b3 ("dax: clear dirty entry tags on cache flush") Link: http://lkml.kernel.org/r/1482272586-21177-3-git-send-email-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-10mm: add follow_pte_pmd()Ross Zwisler
Patch series "Write protect DAX PMDs in *sync path". Currently dax_mapping_entry_mkclean() fails to clean and write protect the pmd_t of a DAX PMD entry during an *sync operation. This can result in data loss, as detailed in patch 2. This series is based on Dan's "libnvdimm-pending" branch, which is the current home for Jan's "dax: Page invalidation fixes" series. You can find a working tree here: https://git.kernel.org/cgit/linux/kernel/git/zwisler/linux.git/log/?h=dax_pmd_clean This patch (of 2): Similar to follow_pte(), follow_pte_pmd() allows either a PTE leaf or a huge page PMD leaf to be found and returned. Link: http://lkml.kernel.org/r/1482272586-21177-2-git-send-email-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Suggested-by: Dave Hansen <dave.hansen@intel.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-10gro: Disable frag0 optimization on IPv6 ext headersHerbert Xu
The GRO fast path caches the frag0 address. This address becomes invalid if frag0 is modified by pskb_may_pull or its variants. So whenever that happens we must disable the frag0 optimization. This is usually done through the combination of gro_header_hard and gro_header_slow, however, the IPv6 extension header path did the pulling directly and would continue to use the GRO fast path incorrectly. This patch fixes it by disabling the fast path when we enter the IPv6 extension header path. Fixes: 78a478d0efd9 ("gro: Inline skb_gro_header and cache frag0 virtual address") Reported-by: Slava Shwartsman <slavash@mellanox.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-10target: add XCOPY target/segment desc sense codesDavid Disseldorp
As defined in http://www.t10.org/lists/asc-num.htm. To be used during validation of XCOPY target and segment descriptor lists. Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
2017-01-10Merge remote-tracking branches 'asoc/fix/arizona', 'asoc/fix/dpcm', ↵Mark Brown
'asoc/fix/dwc', 'asoc/fix/fsl-ssi' and 'asoc/fix/hdmi-codec' into asoc-linus
2017-01-10Merge remote-tracking branch 'asoc/fix/component' into asoc-linusMark Brown
2017-01-09btrfs: make tracepoint format strings more compactDavid Sterba
We've recently added the fsid to trace events, this makes the line quite long. To reduce the it again, remove extra spaces around = and remove ",". Signed-off-by: David Sterba <dsterba@suse.com>
2017-01-09Btrfs: add truncated_len for ordered extent tracepointsLiu Bo
This can help us monitor truncated ordered extents. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-01-09Btrfs: add 'inode' for extent map tracepointLiu Bo
'inode' is an important field for btrfs_get_extent, lets trace it. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-01-09btrfs: fix crash when tracepoint arguments are freed by wq callbacksDavid Sterba
Enabling btrfs tracepoints leads to instant crash, as reported. The wq callbacks could free the memory and the tracepoints started to dereference the members to get to fs_info. The proposed fix https://marc.info/?l=linux-btrfs&m=148172436722606&w=2 removed the tracepoints but we could preserve them by passing only the required data in a safe way. Fixes: bc074524e123 ("btrfs: prefix fsid to all trace events") CC: stable@vger.kernel.org # 4.8+ Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-01-08Merge branch 'fscrypt' into dTheodore Ts'o
2017-01-08Merge tag 'usb-4.10-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are a bunch of USB fixes for 4.10-rc3. Yeah, it's a lot, an artifact of the holiday break I think. Lots of gadget and the usual XHCI fixups for reported issues (one day that driver will calm down...) Also included are a bunch of usb-serial driver fixes, and for good measure, a number of much-reported MUSB driver issues have finally been resolved. All of these have been in linux-next with no reported issues" * tag 'usb-4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (72 commits) USB: fix problems with duplicate endpoint addresses usb: ohci-at91: use descriptor-based gpio APIs correctly usb: storage: unusual_uas: Add JMicron JMS56x to unusual device usb: hub: Move hub_port_disable() to fix warning if PM is disabled usb: musb: blackfin: add bfin_fifo_offset in bfin_ops usb: musb: fix compilation warning on unused function usb: musb: Fix trying to free already-free IRQ 4 usb: musb: dsps: implement clear_ep_rxintr() callback usb: musb: core: add clear_ep_rxintr() to musb_platform_ops USB: serial: ti_usb_3410_5052: fix NULL-deref at open USB: serial: spcp8x5: fix NULL-deref at open USB: serial: quatech2: fix sleep-while-atomic in close USB: serial: pl2303: fix NULL-deref at open USB: serial: oti6858: fix NULL-deref at open USB: serial: omninet: fix NULL-derefs at open and disconnect USB: serial: mos7840: fix misleading interrupt-URB comment USB: serial: mos7840: remove unused write URB USB: serial: mos7840: fix NULL-deref at open USB: serial: mos7720: remove obsolete port initialisation USB: serial: mos7720: fix parallel probe ...
2017-01-08Merge tag 'staging-4.10-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging/IIO fixes from Greg KH: "Here are some staging and IIO driver fixes for 4.10-rc3. Most of these are minor IIO fixes of reported issues, along with one network driver fix to resolve an issue. And a MAINTAINERS update with a new mailing list. All of these, except the MAINTAINERS file update, have been in linux-next with no reported issues (the MAINTAINERS patch happened on Friday...)" * tag 'staging-4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: MAINTAINERS: add greybus subsystem mailing list staging: octeon: Call SET_NETDEV_DEV() iio: accel: st_accel: fix LIS3LV02 reading and scaling iio: common: st_sensors: fix channel data parsing iio: max44000: correct value in illuminance_integration_time_available iio: adc: TI_AM335X_ADC should depend on HAS_DMA iio: bmi160: Fix time needed to sleep after command execution iio: 104-quad-8: Fix active level mismatch for the preset enable option iio: 104-quad-8: Fix off-by-one errors when addressing IOR iio: 104-quad-8: Fix index control configuration
2017-01-08fscrypt: make fscrypt_operations.key_prefix a stringEric Biggers
There was an unnecessary amount of complexity around requesting the filesystem-specific key prefix. It was unclear why; perhaps it was envisioned that different instances of the same filesystem type could use different key prefixes, or that key prefixes could be binary. However, neither of those things were implemented or really make sense at all. So simplify the code by making key_prefix a const char *. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Richard Weinberger <richard@nod.at> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-01-08fscrypt: remove unused 'mode' member of fscrypt_ctxEric Biggers
Nothing reads or writes fscrypt_ctx.mode, and it doesn't belong there because a fscrypt_ctx is not tied to a specific encryption mode. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-01-07mm: workingset: fix use-after-free in shadow node shrinkerJohannes Weiner
Several people report seeing warnings about inconsistent radix tree nodes followed by crashes in the workingset code, which all looked like use-after-free access from the shadow node shrinker. Dave Jones managed to reproduce the issue with a debug patch applied, which confirmed that the radix tree shrinking indeed frees shadow nodes while they are still linked to the shadow LRU: WARNING: CPU: 2 PID: 53 at lib/radix-tree.c:643 delete_node+0x1e4/0x200 CPU: 2 PID: 53 Comm: kswapd0 Not tainted 4.10.0-rc2-think+ #3 Call Trace: delete_node+0x1e4/0x200 __radix_tree_delete_node+0xd/0x10 shadow_lru_isolate+0xe6/0x220 __list_lru_walk_one.isra.4+0x9b/0x190 list_lru_walk_one+0x23/0x30 scan_shadow_nodes+0x2e/0x40 shrink_slab.part.44+0x23d/0x5d0 shrink_node+0x22c/0x330 kswapd+0x392/0x8f0 This is the WARN_ON_ONCE(!list_empty(&node->private_list)) placed in the inlined radix_tree_shrink(). The problem is with 14b468791fa9 ("mm: workingset: move shadow entry tracking to radix tree exceptional tracking"), which passes an update callback into the radix tree to link and unlink shadow leaf nodes when tree entries change, but forgot to pass the callback when reclaiming a shadow node. While the reclaimed shadow node itself is unlinked by the shrinker, its deletion from the tree can cause the left-most leaf node in the tree to be shrunk. If that happens to be a shadow node as well, we don't unlink it from the LRU as we should. Consider this tree, where the s are shadow entries: root->rnode | [0 n] | | [s ] [sssss] Now the shadow node shrinker reclaims the rightmost leaf node through the shadow node LRU: root->rnode | [0 ] | [s ] Because the parent of the deleted node is the first level below the root and has only one child in the left-most slot, the intermediate level is shrunk and the node containing the single shadow is put in its place: root->rnode | [s ] The shrinker again sees a single left-most slot in a first level node and thus decides to store the shadow in root->rnode directly and free the node - which is a leaf node on the shadow node LRU. root->rnode | s Without the update callback, the freed node remains on the shadow LRU, where it causes later shrinker runs to crash. Pass the node updater callback into __radix_tree_delete_node() in case the deletion causes the left-most branch in the tree to collapse too. Also add warnings when linked nodes are freed right away, rather than wait for the use-after-free when the list is scanned much later. Fixes: 14b468791fa9 ("mm: workingset: move shadow entry tracking to radix tree exceptional tracking") Reported-by: Dave Chinner <david@fromorbit.com> Reported-by: Hugh Dickins <hughd@google.com> Reported-by: Andrea Arcangeli <aarcange@redhat.com> Reported-and-tested-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Chris Leech <cleech@redhat.com> Cc: Lee Duncan <lduncan@suse.com> Cc: Jan Kara <jack@suse.cz> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Matthew Wilcox <mawilcox@linuxonhyperv.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-07Merge branch 'rc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kbuild fix from Michal Marek: "The asm-prototypes.h file added in the last merge window results in invalid code with CONFIG_KMEMCHECK=y. The net result is that genksyms segfaults. This pull request fixes the header, the genksyms fix is in my kbuild branch for 4.11" * 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: asm-prototypes: Clear any CPP defines before declaring the functions
2017-01-07x86/efi: Don't allocate memmap through memblock after mm_init()Nicolai Stange
With the following commit: 4bc9f92e64c8 ("x86/efi-bgrt: Use efi_mem_reserve() to avoid copying image data") ... efi_bgrt_init() calls into the memblock allocator through efi_mem_reserve() => efi_arch_mem_reserve() *after* mm_init() has been called. Indeed, KASAN reports a bad read access later on in efi_free_boot_services(): BUG: KASAN: use-after-free in efi_free_boot_services+0xae/0x24c at addr ffff88022de12740 Read of size 4 by task swapper/0/0 page:ffffea0008b78480 count:0 mapcount:-127 mapping: (null) index:0x1 flags: 0x5fff8000000000() [...] Call Trace: dump_stack+0x68/0x9f kasan_report_error+0x4c8/0x500 kasan_report+0x58/0x60 __asan_load4+0x61/0x80 efi_free_boot_services+0xae/0x24c start_kernel+0x527/0x562 x86_64_start_reservations+0x24/0x26 x86_64_start_kernel+0x157/0x17a start_cpu+0x5/0x14 The instruction at the given address is the first read from the memmap's memory, i.e. the read of md->type in efi_free_boot_services(). Note that the writes earlier in efi_arch_mem_reserve() don't splat because they're done through early_memremap()ed addresses. So, after memblock is gone, allocations should be done through the "normal" page allocator. Introduce a helper, efi_memmap_alloc() for this. Use it from efi_arch_mem_reserve(), efi_free_boot_services() and, for the sake of consistency, from efi_fake_memmap() as well. Note that for the latter, the memmap allocations cease to be page aligned. This isn't needed though. Tested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Nicolai Stange <nicstange@gmail.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: <stable@vger.kernel.org> # v4.9 Cc: Dave Young <dyoung@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Mika Penttilä <mika.penttila@nextfour.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Fixes: 4bc9f92e64c8 ("x86/efi-bgrt: Use efi_mem_reserve() to avoid copying image data") Link: http://lkml.kernel.org/r/20170105125130.2815-1-nicstange@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-06Merge tag 'vfio-v4.10-rc3' of git://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO fixes from Alex Williamson: - Add mtty sample driver properly into build system (Alex Williamson) - Restore type1 mapping performance after mdev (Alex Williamson) - Fix mdev device race (Alex Williamson) - Cleanups to the mdev ABI used by vendor drivers (Alex Williamson) - Build fix for old compilers (Arnd Bergmann) - Fix sample driver error path (Dan Carpenter) - Handle pci_iomap() error (Arvind Yadav) - Fix mdev ioctl return type (Paul Gortmaker) * tag 'vfio-v4.10-rc3' of git://github.com/awilliam/linux-vfio: vfio-mdev: fix non-standard ioctl return val causing i386 build fail vfio-pci: Handle error from pci_iomap vfio-mdev: fix some error codes in the sample code vfio-pci: use 32-bit comparisons for register address for gcc-4.5 vfio-mdev: Make mdev_device private and abstract interfaces vfio-mdev: Make mdev_parent private vfio-mdev: de-polute the namespace, rename parent_device & parent_ops vfio-mdev: Fix remove race vfio/type1: Restore mapping performance with mdev support vfio-mdev: Fix mtty sample driver building
2017-01-06Merge branch 'stable/for-linus-4.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb Pull swiotlb fixes from Konrad Rzeszutek Wilk: "This has one fix to make i915 work when using Xen SWIOTLB, and a feature from Geert to aid in debugging of devices that can't do DMA outside the 32-bit address space. The feature from Geert is on top of v4.10 merge window commit (specifically you pulling my previous branch), as his changes were dependent on the Documentation/ movement patches. I figured it would just easier than me trying than to cherry-pick the Documentation patches to satisfy git. The patches have been soaking since 12/20, albeit I updated the last patch due to linux-next catching an compiler error and adding an Tested-and-Reported-by tag" * 'stable/for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: swiotlb: Export swiotlb_max_segment to users swiotlb: Add swiotlb=noforce debug option swiotlb: Convert swiotlb_force from int to enum x86, swiotlb: Simplify pci_swiotlb_detect_override()
2017-01-06swiotlb: Export swiotlb_max_segment to usersKonrad Rzeszutek Wilk
So they can figure out what is the optimal number of pages that can be contingously stitched together without fear of bounce buffer. We also expose an mechanism for sub-users of SWIOTLB API, such as Xen-SWIOTLB to set the max segment value. And lastly if swiotlb=force is set (which mandates we bounce buffer everything) we set max_segment so at least we can bounce buffer one 4K page instead of a giant 512KB one for which we may not have space. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reported-and-Tested-by: Juergen Gross <jgross@suse.com>