summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-04-28btrfs: rename btrfs_find_device_by_user_inputDavid Sterba
For clarity how we are going to find the device, let's call it a device specifier, devspec for short. Also rename the arguments that are a leftover from previous function purpose. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: use existing device constraints table btrfs_raid_arrayDavid Sterba
We should avoid duplicating the device constraints, let's use the btrfs_raid_array in btrfs_check_raid_min_devices. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: introduce raid-type to error-code table, for minimum device constraintDavid Sterba
Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: pass number of devices to btrfs_check_raid_min_devicesDavid Sterba
Before this patch, btrfs_check_raid_min_devices would do an off-by-one check of the constraints and not the miminmum check, as its name suggests. This is not a problem if the only caller is device remove, but would be confusing for others. Add an argument with the exact number and let the caller(s) decide if this needs any adjustments, like when device replace is running. Reviewed-by: Anand Jain <anand.jain@oracle.com> Tested-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: rename __check_raid_min_devicesDavid Sterba
Underscores are for special functions, use the full prefix for better stacktrace recognition. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: optimize check for stale deviceAnand Jain
Optimize check for stale device to only be checked when there is device added or changed. If there is no update to the device, there is no need to call btrfs_free_stale_device(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: introduce device delete by devidAnand Jain
This introduces new ioctl BTRFS_IOC_RM_DEV_V2, which uses enhanced struct btrfs_ioctl_vol_args_v2 to carry devid as an user argument. The patch won't delete the old ioctl interface and so kernel remains backward compatible with user land progs. Test case/script: echo "0 $(blockdev --getsz /dev/sdf) linear /dev/sdf 0" | dmsetup create bad_disk mkfs.btrfs -f -d raid1 -m raid1 /dev/sdd /dev/sde /dev/mapper/bad_disk mount /dev/sdd /btrfs dmsetup suspend bad_disk echo "0 $(blockdev --getsz /dev/sdf) error /dev/sdf 0" | dmsetup load bad_disk dmsetup resume bad_disk echo "bad disk failed. now deleting/replacing" btrfs dev del 3 /btrfs echo $? btrfs fi show /btrfs umount /btrfs btrfs-show-super /dev/sdd | egrep num_device dmsetup remove bad_disk wipefs -a /dev/sdf Signed-off-by: Anand Jain <anand.jain@oracle.com> Reported-by: Martin <m_btrfs@ml1.co.uk> [ adjust messages, s/disk/device/ ] Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: make use of btrfs_scratch_superblocks() in btrfs_rm_device()Anand Jain
With the previous patches now the btrfs_scratch_superblocks() is ready to be used in btrfs_rm_device() so use it. Signed-off-by: Anand Jain <anand.jain@oracle.com> [ use GFP_KERNEL ] Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: enhance btrfs_find_device_by_user_input() to check device pathAnand Jain
The operation of device replace and device delete follows same steps upto some depth with in btrfs kernel, however they don't share codes. This enhancement will help replace and delete to share codes. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: make use of btrfs_find_device_by_user_input()Anand Jain
btrfs_rm_device() has a section of the code which can be replaced btrfs_find_device_by_user_input() Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: create helper btrfs_find_device_by_user_input()Anand Jain
The patch renames btrfs_dev_replace_find_srcdev() to btrfs_find_device_by_user_input() and moves it to volumes.c, so that delete device can use it. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: clean up and optimize __check_raid_min_device()Anand Jain
__check_raid_min_device() which was pealed from btrfs_rm_device() maintianed its original code to show the block move. This patch cleans up __check_raid_min_device(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: create helper function __check_raid_min_devices()Anand Jain
move a section of btrfs_rm_device() code to check for min number of the devices into the function __check_raid_min_devices() Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: create a helper function to read the disk superAnand Jain
A part of code from btrfs_scan_one_device() is moved to a new function btrfs_read_disk_super(), so that former function looks cleaner. (In this process it also moves the code which ensures null terminating label). So this creates easy opportunity to merge various duplicate codes on read disk super. Earlier attempt to merge duplicate codes highlighted that there were some issues for which there are duplicate codes (to read disk super), however it was not clear what was the issue. So until we figure that out, its better to keep them in a separate functions. Signed-off-by: Anand Jain <anand.jain@oracle.com> [ use GFP_KERNEL, PAGE_CACHE_ removal related fixups ] Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28Btrfs: do not create empty block group if we have allocated dataLiu Bo
Now we force to create empty block group to keep data profile alive, however, in the below example, we eventually get an empty block group while we're trying to get more space for other types (metadata/system), - Before, block group "A": size=2G, used=1.2G block group "B": size=2G, used=512M - After "btrfs balance start -dusage=50 mount_point", block group "A": size=2G, used=(1.2+0.5)G block group "C": size=2G, used=0 Since there is no data in block group C, it won't be deleted automatically and we have to get the unused 2G until the next mount. Balance itself just moves data and doesn't remove data, so it's safe to not create such a empty block group if we already have data allocated in other block groups. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28Btrfs: __btrfs_buffered_write: Pass valid file offset when releasing ↵Chandan Rajendra
delalloc space The delalloc reserved space is calculated in terms of number of bytes used by an integral number of blocks. This is done by rounding down the value of 'pos' to the nearest multiple of sectorsize. The file offset value held by 'pos' variable may not be aligned to sectorsize and hence when passing it as an argument to btrfs_delalloc_release_space(), we may end up releasing larger delalloc space than we originally had reserved. Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28Btrfs: cleanup error handling in extent_write_cached_pagesLiu Bo
Now that we bail out immediately if ->writepage() returns an error, we don't need an extra error to retain the error code. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28Btrfs: make mapping->writeback_index point to the last written pageLiu Bo
If sequential writer is writing in the middle of the page and it just redirties the last written page by continuing from it. In the above case this can end up with seeking back to that firstly redirtied page after writing all the pages at the end of file because btrfs updates mapping->writeback_index to 1 past the current one. For non-cow filesystems, the cost is only about extra seek, while for cow filesystems such as btrfs, it means unnecessary fragments. To avoid it, we just need to continue writeback from the last written page. This also updates btrfs to behave like what write_cache_pages() does, ie, bail out immediately if there is an error in writepage(). <Ref: https://www.spinics.net/lists/linux-btrfs/msg52628.html> Reported-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: bugfix: handle FS_IOC32_{GETFLAGS,SETFLAGS,GETVERSION} in btrfs_ioctlLuke Dashjr
32-bit ioctl uses these rather than the regular FS_IOC_* versions. They can be handled in btrfs using the same code. Without this, 32-bit {ch,ls}attr fail. Signed-off-by: Luke Dashjr <luke-jr+git@utopios.org> Cc: stable@vger.kernel.org Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: fix typos in commentsLuis de Bethencourt
Correct a typo in the chunk_mutex name to make it grepable. Since it is better to fix several typos at once, fixing the 2 more in the same file. Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28Btrfs: Refactor btrfs_lock_cluster() to kill compiler warningGeert Uytterhoeven
fs/btrfs/extent-tree.c: In function ‘btrfs_lock_cluster’: fs/btrfs/extent-tree.c:6399: warning: ‘used_bg’ may be used uninitialized in this function - Replace "again: ... goto again;" by standard C "while (1) { ... }", - Move block not processed during the first iteration of the loop to the end of the loop, which allows to kill the "locked" variable, Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-and-Tested-by: Miao Xie <miaox@cn.fujitsu.com> [ the compilation warning has been fixed by other patch, now we want to clean up the function ] Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: remove save_error_info()Anand Jain
Actually save_error_info() sets the FS state to error and nothing else. Further the word save doesn't induce caffeine when compared to the word set in what actually it does. So to make it better understandable move save_error_info() code to its only consumer itself. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: Simplify conditions about compress while mapping btrfs flags to inode ↵Satoru Takeuchi
flags Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: move error handling code together in ctree.hAnand Jain
Looks like we added the incompatible defines in between the error handling defines in the file ctree.h. Now group them back. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: remove unused function btrfs_assert()Anand Jain
Apparently looks like ASSERT does the same intended job, as intended btrfs_assert(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: rename btrfs_std_error to btrfs_handle_fs_errorAnand Jain
btrfs_std_error() handles errors, puts FS into readonly mode (as of now). So its good idea to rename it to btrfs_handle_fs_error(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ edit changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28perf/x86/intel: Fix incorrect lbr_sel_mask valueKan Liang
This patch fixes a bug which was introduced by: b16a5b52eb90 ("perf/x86: Add option to disable reading branch flags/cycles") In this patch, lbr_sel_mask is used to mask the lbr_select. But LBR_SEL_MASK doesn't include the bit for LBR_CALL_STACK. So LBR call stack will never be set in lbr_select. This patch corrects the LBR_SEL_MASK by including all valid bits in LBR_SELECT. Also, the LBR_CALL_STACK bit is different as other bit in LBR_SELECT. It does not operate in suppress mode, so it needs to be specially handled in intel_pmu_setup_hw_lbr_filter. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1461231010-4399-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-28perf/x86/intel/pt: Don't die on VMXONAlexander Shishkin
Some versions of Intel PT do not support tracing across VMXON, more specifically, VMXON will clear TraceEn control bit and any attempt to set it before VMXOFF will throw a #GP, which in the current state of things will crash the kernel. Namely: $ perf record -e intel_pt// kvm -nographic on such a machine will kill it. To avoid this, notify the intel_pt driver before VMXON and after VMXOFF so that it knows when not to enable itself. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Gleb Natapov <gleb@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: hpa@zytor.com Link: http://lkml.kernel.org/r/87oa9dwrfk.fsf@ashishki-desk.ger.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-28perf/core: Fix perf_event_open() vs. execve() racePeter Zijlstra
Jann reported that the ptrace_may_access() check in find_lively_task_by_vpid() is racy against exec(). Specifically: perf_event_open() execve() ptrace_may_access() commit_creds() ... if (get_dumpable() != SUID_DUMP_USER) perf_event_exit_task(); perf_install_in_context() would result in installing a counter across the creds boundary. Fix this by wrapping lots of perf_event_open() in cred_guard_mutex. This should be fine as perf_event_exit_task() is already called with cred_guard_mutex held, so all perf locks already nest inside it. Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-28nohz/full, sched/rt: Fix missed tick-reenabling bug in sched_can_stop_tick()Peter Zijlstra
Chris Metcalf reported a that sched_can_stop_tick() sometimes fails to re-enable the tick. His observed problem is that rq->cfs.nr_running can be 1 even though there are multiple runnable CFS tasks. This happens in the cgroup case, in which case cfs.nr_running is the number of runnable entities for that level. If there is a single runnable cgroup (which can have an arbitrary number of runnable child entries itself) rq->cfs.nr_running will be 1. However, looking at that function I think there's more problems with it. It seems to assume that if there's FIFO tasks, those will run. This is incorrect. The FIFO task can have a lower prio than an RR task, in which case the RR task will run. So the whole fifo_nr_running test seems misplaced, it should go after the rr_nr_running tests. That is, only if !rr_nr_running, can we use fifo_nr_running like this. Reported-by: Chris Metcalf <cmetcalf@mellanox.com> Tested-by: Chris Metcalf <cmetcalf@mellanox.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Christoph Lameter <cl@linux.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Wanpeng Li <kernellwp@gmail.com> Fixes: 76d92ac305f2 ("sched: Migrate sched to use new tick dependency mask model") Link: http://lkml.kernel.org/r/20160421160315.GK24771@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-28perf/x86/amd: Set the size of event map array to PERF_COUNT_HW_MAXAdam Borowski
The entry for PERF_COUNT_HW_REF_CPU_CYCLES is not used on AMD, but is referenced by filter_events() which expects undefined events to have a value of 0. Found via KASAN: UBSAN: Undefined behaviour in arch/x86/events/amd/core.c:132:30 index 9 is out of range for type 'u64 [9]' UBSAN: Undefined behaviour in arch/x86/events/amd/core.c:132:9 load of address ffffffff81c021c8 with insufficient space for an object of type 'const u64' Signed-off-by: Adam Borowski <kilobyte@angband.pl> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1461749731-30979-1-git-send-email-kilobyte@angband.pl Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-28rbd: report unsupported features to syslogIlya Dryomov
... instead of just returning an error. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2016-04-28rbd: fix rbd map vs notify racesIlya Dryomov
A while ago, commit 9875201e1049 ("rbd: fix use-after free of rbd_dev->disk") fixed rbd unmap vs notify race by introducing an exported wrapper for flushing notifies and sticking it into do_rbd_remove(). A similar problem exists on the rbd map path, though: the watch is registered in rbd_dev_image_probe(), while the disk is set up quite a few steps later, in rbd_dev_device_setup(). Nothing prevents a notify from coming in and crashing on a NULL rbd_dev->disk: BUG: unable to handle kernel NULL pointer dereference at 0000000000000050 Call Trace: [<ffffffffa0508344>] rbd_watch_cb+0x34/0x180 [rbd] [<ffffffffa04bd290>] do_event_work+0x40/0xb0 [libceph] [<ffffffff8109d5db>] process_one_work+0x17b/0x470 [<ffffffff8109e3ab>] worker_thread+0x11b/0x400 [<ffffffff8109e290>] ? rescuer_thread+0x400/0x400 [<ffffffff810a5acf>] kthread+0xcf/0xe0 [<ffffffff810b41b3>] ? finish_task_switch+0x53/0x170 [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140 [<ffffffff81645dd8>] ret_from_fork+0x58/0x90 [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140 RIP [<ffffffffa050828a>] rbd_dev_refresh+0xfa/0x180 [rbd] If an error occurs during rbd map, we have to error out, potentially tearing down a watch. Just like on rbd unmap, notifies have to be flushed, otherwise rbd_watch_cb() may end up trying to read in the image header after rbd_dev_image_release() has run: Assertion failure in rbd_dev_header_info() at line 4722: rbd_assert(rbd_image_format_valid(rbd_dev->image_format)); Call Trace: [<ffffffff81cccee0>] ? rbd_parent_request_create+0x150/0x150 [<ffffffff81cd4e59>] rbd_dev_refresh+0x59/0x390 [<ffffffff81cd5229>] rbd_watch_cb+0x69/0x290 [<ffffffff81fde9bf>] do_event_work+0x10f/0x1c0 [<ffffffff81107799>] process_one_work+0x689/0x1a80 [<ffffffff811076f7>] ? process_one_work+0x5e7/0x1a80 [<ffffffff81132065>] ? finish_task_switch+0x225/0x640 [<ffffffff81107110>] ? pwq_dec_nr_in_flight+0x2b0/0x2b0 [<ffffffff81108c69>] worker_thread+0xd9/0x1320 [<ffffffff81108b90>] ? process_one_work+0x1a80/0x1a80 [<ffffffff8111b02d>] kthread+0x21d/0x2e0 [<ffffffff8111ae10>] ? kthread_stop+0x550/0x550 [<ffffffff82022802>] ret_from_fork+0x22/0x40 [<ffffffff8111ae10>] ? kthread_stop+0x550/0x550 RIP [<ffffffff81ccd8f9>] rbd_dev_header_info+0xa19/0x1e30 To fix this, a) check if RBD_DEV_FLAG_EXISTS is set before calling revalidate_disk(), b) move ceph_osdc_flush_notifies() call into rbd_dev_header_unwatch_sync() to cover rbd map error paths and c) turn header read-in into a critical section. The latter also happens to take care of rbd map foo@bar vs rbd snap rm foo@bar race. Fixes: http://tracker.ceph.com/issues/15490 Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2016-04-28x86/apic: Handle zero vector gracefully in clear_vector_irq()Keith Busch
If x86_vector_alloc_irq() fails x86_vector_free_irqs() is invoked to cleanup the already allocated vectors. This subsequently calls clear_vector_irq(). The failed irq has no vector assigned, which triggers the BUG_ON(!vector) in clear_vector_irq(). We cannot suppress the call to x86_vector_free_irqs() for the failed interrupt, because the other data related to this irq must be cleaned up as well. So calling clear_vector_irq() with vector == 0 is legitimate. Remove the BUG_ON and return if vector is zero, [ tglx: Massaged changelog ] Fixes: b5dc8e6c21e7 "x86/irq: Use hierarchical irqdomain to manage CPU interrupt vectors" Signed-off-by: Keith Busch <keith.busch@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-04-27thermal: use %d to print S32 parametersLeo Yan
Power allocator's parameters are S32 type, so use %d to print them. Acked-by: Javi Merino <javi.merino@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2016-04-27thermal: hisilicon: increase temperature resolutionLeo Yan
When calculate temperature, old code firstly do division and then convert to "millicelsius" unit. This will lose resolution and only can read back temperature with "Celsius" unit. So firstly scale step value to "millicelsius" and then do division, so finally we can increase resolution for temperature value. Also refine the calculation from temperature value to step value. Signed-off-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2016-04-27misc: mic: Fix for double fetch security bug in VOP driverAshutosh Dixit
The MIC VOP driver does two successive reads from user space to read a variable length data structure. Kernel memory corruption can result if the data structure changes between the two reads. This patch disallows the chance of this happening. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=116651 Reported by: Pengfei Wang <wpengfeinudt@gmail.com> Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-27ARM: SoCFPGA: Fix secondary CPU startup in thumb2 kernelSascha Hauer
The secondary CPU starts up in ARM mode. When the kernel is compiled in thumb2 mode we have to explicitly compile the secondary startup trampoline in ARM mode, otherwise the CPU will go to Nirvana. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reported-by: Steffen Trumtrar <s.trumtrar@pengutronix.de> Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: stable@vger.kernel.org Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com> Signed-off-by: Kevin Hilman <khilman@baylibre.com>
2016-04-27device property: Avoid potential dereferences of invalid pointersHeikki Krogerus
Since fwnode may hold ERR_PTR(-ENODEV) or it may be NULL, the fwnode type checks is_of_node(), is_acpi_node() and is is_pset_node() need to consider it. Using IS_ERR_OR_NULL() to check it. Fixes: 0d67e0fa1664 (device property: fix for a case of use-after-free) Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-27sparc64: Fix bootup regressions on some Kconfig combinations.David S. Miller
The system call tracing bug fix mentioned in the Fixes tag below increased the amount of assembler code in the sequence of assembler files included by head_64.S This caused to total set of code to exceed 0x4000 bytes in size, which overflows the expression in head_64.S that works to place swapper_tsb at address 0x408000. When this is violated, the TSB is not properly aligned, and also the trap table is not aligned properly either. All of this together results in failed boots. So, do two things: 1) Simplify some code by using ba,a instead of ba/nop to get those bytes back. 2) Add a linker script assertion to make sure that if this happens again the build will fail. Fixes: 1a40b95374f6 ("sparc: Fix system call tracing register handling.") Reported-by: Meelis Roos <mroos@linux.ee> Reported-by: Joerg Abraham <joerg.abraham@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-27Merge branch 'bnxt_en-fixes'David S. Miller
Michael Chan says: ==================== bnxt_en: Bug fixes for net. Only use MSIX on VF, and fix rx page buffers on architectures with PAGE_SIZE >= 64K. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-27bnxt_en: Divide a page into 32K buffers for the aggregation ring if necessary.Michael Chan
If PAGE_SIZE is bigger than BNXT_RX_PAGE_SIZE, that means the native CPU page is bigger than the maximum length of the RX BD. Divide the page into multiple 32K buffers for the aggregation ring. Add an offset field in the bnxt_sw_rx_agg_bd struct to keep track of the page offset of each buffer. Since each page can be referenced by multiple buffer entries, call get_page() as needed to get the proper reference count. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-27bnxt_en: Limit RX BD pages to be no bigger than 32K.Michael Chan
The RX BD length field of this device is 16-bit, so the largest buffer size is 65535. For LRO and GRO, we allocate native CPU pages for the aggregation ring buffers. It won't work if the native CPU page size is 64K or bigger. We fix this by defining BNXT_RX_PAGE_SIZE to be native CPU page size up to 32K. Replace PAGE_SIZE with BNXT_RX_PAGE_SIZE in all appropriate places related to the rx aggregation ring logic. The next patch will add additional logic to divide the page into 32K chunks for aggrgation ring buffers if PAGE_SIZE is bigger than BNXT_RX_PAGE_SIZE. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-27bnxt_en: Don't fallback to INTA on VF.Michael Chan
Only MSI-X can be used on a VF. The driver should fail initialization if it cannot successfully enable MSI-X. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-27Merge branch 'for-4.6-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue fix from Tejun Heo: "So, it turns out we had a silly bug in the most fundamental part of workqueue for a very long time. AFAICS, this dates back to pre-git era and has quite likely been there from the time workqueue was first introduced. A work item uses its PENDING bit to synchronize multiple queuers. Anyone who wins the PENDING bit owns the pending state of the work item. Whether a queuer wins or loses the race, one thing should be guaranteed - there will soon be at least one execution of the work item - where "after" means that the execution instance would be able to see all the changes that the queuer has made prior to the queueing attempt. Unfortunately, we were missing a smp_mb() after clearing PENDING for execution, so nothing guaranteed visibility of the changes that a queueing loser has made, which manifested as a reproducible blk-mq stall. Lots of kudos to Roman for debugging the problem. The patch for -stable is the minimal one. For v3.7, Peter is working on a patch to make the code path slightly more efficient and less fragile" * 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: fix ghost PENDING flag while doing MQ IO
2016-04-27Merge branch 'for-4.6-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "Two patches to fix a deadlock which can be easily triggered if memcg charge moving is used. This bug was introduced while converting threadgroup locking to a global percpu_rwsem and is caused by cgroup controller task migration path depending on the ability to create new kthreads. cpuset had a similar issue which was fixed by performing heavy-lifting operations asynchronous to task migration. The two patches fix the same issue in memcg in a similar way. The first patch makes the mechanism generic and the second relocates memcg charge moving outside the migration path. Given that we don't want to perform heavy operations while writelocking threadgroup lock anyway, moving them out of the way is a desirable solution. One thing to note is that the problem was difficult to debug because lockdep couldn't figure out the deadlock condition. Looking into how to improve that" * 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: memcg: relocate charge moving from ->attach to ->post_attach cgroup, cpuset: replace cpuset_post_attach_flush() with cgroup_subsys->post_attach callback
2016-04-27Merge branch 'i2c/for-current' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "I2C has one buildfix, one ABBA deadlock fix, and three simple 'add ID' patches" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: exynos5: Fix possible ABBA deadlock by keeping I2C clock prepared i2c: cpm: Fix build break due to incompatible pointer types i2c: ismt: Add Intel DNV PCI ID i2c: xlp9xx: add support for Broadcom Vulcan i2c: rk3x: add support for rk3228
2016-04-27Merge tag 'arc-4.6-rc6-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: - lockdep now works for ARCv2 builds - enable DT reserved-memory binding (for forthcoming HDMI driver) * tag 'arc-4.6-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: add support for reserved memory defined by device tree ARC: support generic per-device coherent dma mem Documentation: dt: arc: fix spelling mistakes ARCv2: Enable LOCKDEP
2016-04-27Merge tag 'nios2-v4.6-fix' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2 Pull arch/nios2 fix from Ley Foon Tan: "memset: use the right constraint modifier for the %4 output operand" * tag 'nios2-v4.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2: nios2: memset: use the right constraint modifier for the %4 output operand
2016-04-27drm/amdgpu: disable vm interrupts with vm_fault_stop=2Flora Cui
V2: disable all vm interrupts in late_init() Signed-off-by: Flora Cui <Flora.Cui@amd.com> Reviewed-by: Ken Wang <Qingqing.Wang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>