Age | Commit message (Collapse) | Author |
|
'best' is always less or equals to 'pos', so `best - pos' returns
a negative value which is then getting casted to `unsigned int'
and passed to __cpufreq_driver_target()->acpi_cpufreq_target()
for policy->freq_table selection. This results in
BUG: unable to handle kernel paging request at ffff881019b469f8
IP: [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
PGD 267f067
PUD 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 6 PID: 70 Comm: kworker/6:1 Not tainted 4.9.0-rc1-next-20161017-dbg-dirty
Workqueue: events dbs_work_handler
task: ffff88041b808000 task.stack: ffff88041b810000
RIP: 0010:[<ffffffffa00356c1>] [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
RSP: 0018:ffff88041b813c60 EFLAGS: 00010282
RAX: ffff880419b46a00 RBX: ffff88041b848400 RCX: ffff880419b20f80
RDX: 00000000001dff38 RSI: 00000000ffffffff RDI: ffff88041b848400
RBP: ffff88041b813cb0 R08: 0000000000000006 R09: 0000000000000040
R10: ffffffff8207f9e0 R11: ffffffff8173595b R12: 0000000000000000
R13: ffff88041f1dff38 R14: 0000000000262900 R15: 0000000bfffffff4
FS: 0000000000000000(0000) GS:ffff88041f000000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff881019b469f8 CR3: 000000041a2d3000 CR4: 00000000001406e0
Stack:
ffff88041b813cb0 ffffffff813347f9 ffff88041b813ca0 ffffffff81334663
ffff88041f1d4bc0 ffff88041b848400 0000000000000000 0000000000000000
0000000000262900 0000000000000000 ffff88041b813d00 ffffffff813355dc
Call Trace:
[<ffffffff813347f9>] ? cpufreq_freq_transition_begin+0xf1/0xfc
[<ffffffff81334663>] ? get_cpu_idle_time+0x97/0xa6
[<ffffffff813355dc>] __cpufreq_driver_target+0x3b6/0x44e
[<ffffffff81336ca3>] cs_dbs_timer+0x11a/0x135
[<ffffffff81336fda>] dbs_work_handler+0x39/0x62
[<ffffffff81057823>] process_one_work+0x280/0x4a5
[<ffffffff81058719>] worker_thread+0x24f/0x397
[<ffffffff810584ca>] ? rescuer_thread+0x30b/0x30b
[<ffffffff81418380>] ? nl80211_get_key+0x29/0x36a
[<ffffffff8105d2b7>] kthread+0xfc/0x104
[<ffffffff8107ceea>] ? put_lock_stats.isra.9+0xe/0x20
[<ffffffff8105d1bb>] ? kthread_create_on_node+0x3f/0x3f
[<ffffffff814b2092>] ret_from_fork+0x22/0x30
Code: 56 4d 6b ff 0c 41 55 41 54 53 48 83 ec 28 48 8b 15 ad 1e 00 00 44 8b 41
08 48 8b 87 c8 00 00 00 49 89 d5 4e 03 2c c5 80 b2 78 81 <46> 8b 74 38 04 45
3b 75 00 75 11 31 c0 83 39 00 0f 84 1c 01 00
RIP [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
RSP <ffff88041b813c60>
CR2: ffff881019b469f8
---[ end trace 16d9fc7a17897d37 ]---
[ rjw: In some cases this bug may also cause incorrect frequencies to
be selected by cpufreq governors. ]
Fixes: 899bb6642f2a (cpufreq: skip invalid entries when searching the frequency)
Link: http://marc.info/?l=linux-kernel&m=147672030714331&w=2
Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reported-and-tested-by: Jörg Otte <jrg.otte@gmail.com>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 4.8+ <stable@vger.kernel.org> # 4.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The following commit:
c65eacbe290b ("sched/core: Allow putting thread_info into task_struct")
... made 'struct thread_info' a generic struct with only a
single ::flags member, if CONFIG_THREAD_INFO_IN_TASK_STRUCT=y is
selected.
This change however seems to be quite x86 centric, since at least the
generic preemption code (asm-generic/preempt.h) assumes that struct
thread_info also has a preempt_count member, which apparently was not
true for x86.
We could add a bit more #ifdefs to solve this problem too, but it seems
to be much simpler to make struct thread_info arch specific
again. This also makes the conversion to THREAD_INFO_IN_TASK_STRUCT a
bit easier for architectures that have a couple of arch specific stuff
in their thread_info definition.
The arch specific stuff _could_ be moved to thread_struct. However
keeping them in thread_info makes it easier: accessing thread_info
members is simple, since it is at the beginning of the task_struct,
while the thread_struct is at the end. At least on s390 the offsets
needed to access members of the thread_struct (with task_struct as
base) are too large for various asm instructions. This is not a
problem when keeping these members within thread_info.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: keescook@chromium.org
Cc: linux-arch@vger.kernel.org
Link: http://lkml.kernel.org/r/1476901693-8492-2-git-send-email-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Adds support for usages that may appear on the pen or pad interface which
report the state of the tablet battery.
Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
The HID standard defines usages that allow digitizers to report the pen's
height, tilt, and rotation and which are used by Wacom's new "MobileStudio
Pro" devices.
Note that 'hidinput_calc_abs_res' expects ABS_Z (historically used by our
driver to report twist) to have linear units. To ensure it calculates a
resolution with the actually-angular units provided in the HID descriptor
we nedd to lie and tell it we're calculating it for the (rotational) ABS_RZ
axis instead.
Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
Asking for a non-current task's stack can't be done without races
unless the task is frozen in kernel mode. As far as I know,
vm_is_stack_for_task() never had a safe non-current use case.
The __unused annotation is because some KSTK_ESP implementations
ignore their parameter, which IMO is further justification for this
patch.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux API <linux-api@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tycho Andersen <tycho.andersen@canonical.com>
Link: http://lkml.kernel.org/r/4c3f68f426e6c061ca98b4fc7ef85ffbb0a25b0c.1475257877.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The last user of this tunable was removed in 2012 in commit:
82958366cfea ("sched: Replace update_shares weight distribution with per-entity computation")
Delete it since its very existence confuses people.
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161019141059.26408-1-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
This allows the file system to tell a FIEMAP from a read operation, and thus
avoids the need to report flags that aren't actually used in the read path.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
|
|
Add a sysfs entry to turn on priority information being passed
to a ATA device. By default this feature is turned off.
This patch depends on ata: Enabling ATA Command Priorities
tj: Renamed ncq_prio_on to ncq_prio_enable and removed trivial
ata_ncq_prio_on() and open-coded the test.
Signed-off-by: Adam Manzanares <adam.manzanares@hgst.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
This patch checks to see if an ATA device supports NCQ command priorities.
If so and the user has specified an iocontext that indicates
IO_PRIO_CLASS_RT then we build a tf with a high priority command.
This is done to improve the tail latency of commands that are high
priority by passing priority to the device.
tj: Removed trivial ata_ncq_prio_enabled() and open-coded the test.
Signed-off-by: Adam Manzanares <adam.manzanares@hgst.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Patch adds an association between iocontext ioprio and the ioprio of a
request. This is done to enable request based drivers the ability to
act on priority information stored in the request. An example being
ATA devices that support command priorities. If the ATA driver discovers
that the device supports command priorities and the request has valid
priority information indicating the request is high priority, then a high
priority command can be sent to the device. This should improve tail
latencies for high priority IO on any device that queues requests
internally and can make use of the priority information stored in the
request.
The ioprio of the request is set in blk_rq_set_prio which takes the
request and the ioc as arguments. If the ioc is valid in blk_rq_set_prio
then the iopriority of the request is set as the iopriority of the ioc.
In init_request_from_bio a check is made to see if the ioprio of the bio
is valid and if so then the request prio comes from the bio.
Signed-off-by: Adam Manzananares <adam.manzanares@wdc.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The percpu allocator expectedly assumes that the requested alignment
is power of two but hasn't been veryfing the input. If the specified
alignment isn't power of two, the allocator can malfunction. Add the
sanity check.
The following is detailed analysis of the effects of alignments which
aren't power of two.
The alignment must be a even at least since the LSB of a chunk->map
element is used as free/in-use flag of a area; besides, the alignment
must be a power of 2 too since ALIGN() doesn't work well for other
alignment always but is adopted by pcpu_fit_in_area(). IOW, the
current allocator only works well for a power of 2 aligned area
allocation.
See below opposite example for why an odd alignment doesn't work.
Let's assume area [16, 36) is free but its previous one is in-use, we
want to allocate a @size == 8 and @align == 7 area. The larger area
[16, 36) is split to three areas [16, 21), [21, 29), [29, 36)
eventually. However, due to the usage for a chunk->map element, the
actual offset of the aim area [21, 29) is 21 but is recorded in
relevant element as 20; moreover, the residual tail free area [29,
36) is mistook as in-use and is lost silently
Unlike macro roundup(), ALIGN(x, a) doesn't work if @a isn't a power
of 2 for example, roundup(10, 6) == 12 but ALIGN(10, 6) == 10, and
the latter result isn't desired obviously.
tj: Code style and patch description updates.
Signed-off-by: zijun_hu <zijun_hu@htc.com>
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Ported over from nvme-cli.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
This makes life easier for nvme-cli and we don't really need the uuid
type anyway to start with.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Reviewed-by: Jay Freyensee <james_p_freyensee@linux.intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Import a few updates to nvme.h from nvme-cli. This mostly includes a few
new fields and error codes, but also a few renames that so far are only
used in user space. Also one field is moved from an array of two le64
values to one of 16 u8 values so that we can more easily access it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
NVMe 1.2.1 specification adds a tertiary element to the version number.
This updates the macro and its callers to include the final number and
fixup a single place in nvmet where the version was generated manually.
Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
|
|
Merge the gup_flags cleanups from Lorenzo Stoakes:
"This patch series adjusts functions in the get_user_pages* family such
that desired FOLL_* flags are passed as an argument rather than
implied by flags.
The purpose of this change is to make the use of FOLL_FORCE explicit
so it is easier to grep for and clearer to callers that this flag is
being used. The use of FOLL_FORCE is an issue as it overrides missing
VM_READ/VM_WRITE flags for the VMA whose pages we are reading
from/writing to, which can result in surprising behaviour.
The patch series came out of the discussion around commit 38e088546522
("mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing"),
which addressed a BUG_ON() being triggered when a page was faulted in
with PROT_NONE set but having been overridden by FOLL_FORCE.
do_numa_page() was run on the assumption the page _must_ be one marked
for NUMA node migration as an actual PROT_NONE page would have been
dealt with prior to this code path, however FOLL_FORCE introduced a
situation where this assumption did not hold.
See
https://marc.info/?l=linux-mm&m=147585445805166
for the patch proposal"
Additionally, there's a fix for an ancient bug related to FOLL_FORCE and
FOLL_WRITE by me.
[ This branch was rebased recently to add a few more acked-by's and
reviewed-by's ]
* gup_flag-cleanups:
mm: replace access_process_vm() write parameter with gup_flags
mm: replace access_remote_vm() write parameter with gup_flags
mm: replace __access_remote_vm() write parameter with gup_flags
mm: replace get_user_pages_remote() write/force parameters with gup_flags
mm: replace get_user_pages() write/force parameters with gup_flags
mm: replace get_vaddr_frames() write/force parameters with gup_flags
mm: replace get_user_pages_locked() write/force parameters with gup_flags
mm: replace get_user_pages_unlocked() write/force parameters with gup_flags
mm: remove write/force parameters from __get_user_pages_unlocked()
mm: remove write/force parameters from __get_user_pages_locked()
mm: remove gup_flags FOLL_WRITE games from __get_user_pages()
|
|
This removes the 'write' argument from access_process_vm() and replaces
it with 'gup_flags' as use of this function previously silently implied
FOLL_FORCE, whereas after this patch callers explicitly pass this flag.
We make this explicit as use of FOLL_FORCE can result in surprising
behaviour (and hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This removes the 'write' argument from access_remote_vm() and replaces
it with 'gup_flags' as use of this function previously silently implied
FOLL_FORCE, whereas after this patch callers explicitly pass this flag.
We make this explicit as use of FOLL_FORCE can result in surprising
behaviour (and hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This removes the 'write' and 'force' from get_user_pages_remote() and
replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in
callers as use of this flag can result in surprising behaviour (and
hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This removes the 'write' and 'force' from get_user_pages() and replaces
them with 'gup_flags' to make the use of FOLL_FORCE explicit in callers
as use of this flag can result in surprising behaviour (and hence bugs)
within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This removes the 'write' and 'force' from get_vaddr_frames() and
replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in
callers as use of this flag can result in surprising behaviour (and
hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This removes the 'write' and 'force' use from get_user_pages_locked()
and replaces them with 'gup_flags' to make the use of FOLL_FORCE
explicit in callers as use of this flag can result in surprising
behaviour (and hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
A BPF program is required to check the return register of a
map_elem_lookup() call before accessing memory. The verifier keeps
track of this by converting the type of the result register from
PTR_TO_MAP_VALUE_OR_NULL to PTR_TO_MAP_VALUE after a conditional
jump ensures safety. This check is currently exclusively performed
for the result register 0.
In the event the compiler reorders instructions, BPF_MOV64_REG
instructions may be moved before the conditional jump which causes
them to keep their type PTR_TO_MAP_VALUE_OR_NULL to which the
verifier objects when the register is accessed:
0: (b7) r1 = 10
1: (7b) *(u64 *)(r10 -8) = r1
2: (bf) r2 = r10
3: (07) r2 += -8
4: (18) r1 = 0x59c00000
6: (85) call 1
7: (bf) r4 = r0
8: (15) if r0 == 0x0 goto pc+1
R0=map_value(ks=8,vs=8) R4=map_value_or_null(ks=8,vs=8) R10=fp
9: (7a) *(u64 *)(r4 +0) = 0
R4 invalid mem access 'map_value_or_null'
This commit extends the verifier to keep track of all identical
PTR_TO_MAP_VALUE_OR_NULL registers after a map_elem_lookup() by
assigning them an ID and then marking them all when the conditional
jump is observed.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Tamir reported the following trace when processing ARP requests received
via a vlan device on top of a VLAN-aware bridge:
NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [swapper/1:0]
[...]
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W 4.8.0-rc7 #1
Hardware name: Mellanox Technologies Ltd. "MSN2100-CB2F"/"SA001017", BIOS 5.6.5 06/07/2016
task: ffff88017edfea40 task.stack: ffff88017ee10000
RIP: 0010:[<ffffffff815dcc73>] [<ffffffff815dcc73>] netdev_all_lower_get_next_rcu+0x33/0x60
[...]
Call Trace:
<IRQ>
[<ffffffffa015de0a>] mlxsw_sp_port_lower_dev_hold+0x5a/0xa0 [mlxsw_spectrum]
[<ffffffffa016f1b0>] mlxsw_sp_router_netevent_event+0x80/0x150 [mlxsw_spectrum]
[<ffffffff810ad07a>] notifier_call_chain+0x4a/0x70
[<ffffffff810ad13a>] atomic_notifier_call_chain+0x1a/0x20
[<ffffffff815ee77b>] call_netevent_notifiers+0x1b/0x20
[<ffffffff815f2eb6>] neigh_update+0x306/0x740
[<ffffffff815f38ce>] neigh_event_ns+0x4e/0xb0
[<ffffffff8165ea3f>] arp_process+0x66f/0x700
[<ffffffff8170214c>] ? common_interrupt+0x8c/0x8c
[<ffffffff8165ec29>] arp_rcv+0x139/0x1d0
[<ffffffff816e505a>] ? vlan_do_receive+0xda/0x320
[<ffffffff815e3794>] __netif_receive_skb_core+0x524/0xab0
[<ffffffff815e6830>] ? dev_queue_xmit+0x10/0x20
[<ffffffffa06d612d>] ? br_forward_finish+0x3d/0xc0 [bridge]
[<ffffffffa06e5796>] ? br_handle_vlan+0xf6/0x1b0 [bridge]
[<ffffffff815e3d38>] __netif_receive_skb+0x18/0x60
[<ffffffff815e3dc0>] netif_receive_skb_internal+0x40/0xb0
[<ffffffff815e3e4c>] netif_receive_skb+0x1c/0x70
[<ffffffffa06d7856>] br_pass_frame_up+0xc6/0x160 [bridge]
[<ffffffffa06d63d7>] ? deliver_clone+0x37/0x50 [bridge]
[<ffffffffa06d656c>] ? br_flood+0xcc/0x160 [bridge]
[<ffffffffa06d7b14>] br_handle_frame_finish+0x224/0x4f0 [bridge]
[<ffffffffa06d7f94>] br_handle_frame+0x174/0x300 [bridge]
[<ffffffff815e3599>] __netif_receive_skb_core+0x329/0xab0
[<ffffffff81374815>] ? find_next_bit+0x15/0x20
[<ffffffff8135e802>] ? cpumask_next_and+0x32/0x50
[<ffffffff810c9968>] ? load_balance+0x178/0x9b0
[<ffffffff815e3d38>] __netif_receive_skb+0x18/0x60
[<ffffffff815e3dc0>] netif_receive_skb_internal+0x40/0xb0
[<ffffffff815e3e4c>] netif_receive_skb+0x1c/0x70
[<ffffffffa01544a1>] mlxsw_sp_rx_listener_func+0x61/0xb0 [mlxsw_spectrum]
[<ffffffffa005c9f7>] mlxsw_core_skb_receive+0x187/0x200 [mlxsw_core]
[<ffffffffa007332a>] mlxsw_pci_cq_tasklet+0x63a/0x9b0 [mlxsw_pci]
[<ffffffff81091986>] tasklet_action+0xf6/0x110
[<ffffffff81704556>] __do_softirq+0xf6/0x280
[<ffffffff8109213f>] irq_exit+0xdf/0xf0
[<ffffffff817042b4>] do_IRQ+0x54/0xd0
[<ffffffff8170214c>] common_interrupt+0x8c/0x8c
The problem is that netdev_all_lower_get_next_rcu() never advances the
iterator, thereby causing the loop over the lower adjacency list to run
forever.
Fix this by advancing the iterator and avoid the infinite loop.
Fixes: 7ce856aaaf13 ("mlxsw: spectrum: Add couple of lower device helper functions")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Tamir Winetroub <tamirw@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Storage of the firmware name was inconsistent, either storing a pointer
to a name stored with unknown ownership, or a variable length tacked
onto the end of the struct proc allocated in rproc_alloc.
In preparation for allowing the firmware of an already allocated struct
rproc to be changed, instead always keep a locally maintained copy of
the firmware name.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
|
|
This removes the 'write' and 'force' use from get_user_pages_unlocked()
and replaces them with 'gup_flags' to make the use of FOLL_FORCE
explicit in callers as use of this flag can result in surprising
behaviour (and hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This removes the redundant 'write' and 'force' parameters from
__get_user_pages_unlocked() to make the use of FOLL_FORCE explicit in
callers as use of this flag can result in surprising behaviour (and
hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is an ancient bug that was actually attempted to be fixed once
(badly) by me eleven years ago in commit 4ceb5db9757a ("Fix
get_user_pages() race for write access") but that was then undone due to
problems on s390 by commit f33ea7f404e5 ("fix get_user_pages bug").
In the meantime, the s390 situation has long been fixed, and we can now
fix it by checking the pte_dirty() bit properly (and do it better). The
s390 dirty bit was implemented in abf09bed3cce ("s390/mm: implement
software dirty bits") which made it into v3.9. Earlier kernels will
have to look at the page state itself.
Also, the VM has become more scalable, and what used a purely
theoretical race back then has become easier to trigger.
To fix it, we introduce a new internal FOLL_COW flag to mark the "yes,
we already did a COW" rather than play racy games with FOLL_WRITE that
is very fundamental, and then use the pte dirty flag to validate that
the FOLL_COW flag is still valid.
Reported-and-tested-by: Phil "not Paul" Oester <kernel@linuxace.com>
Acked-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Michal Hocko <mhocko@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Instead of having the smc91x driver relying on machine_is_*() calls,
provide this data through platform data, ie. idp, mainstone and
stargate.
This way, the driver doesn't need anymore machine_is_*() calls, which
wouldn't work anymore with a device-tree build.
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Four tooling fixes, two kprobes KASAN related fixes and an x86 PMU
driver fix/cleanup"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf jit: Fix build issue on Ubuntu
perf jevents: Handle events including .c and .o
perf/x86/intel: Remove an inconsistent NULL check
kprobes: Unpoison stack in jprobe_return() for KASAN
kprobes: Avoid false KASAN reports during stack copy
perf header: Set nr_numa_nodes only when we parsed all the data
perf top: Fix refreshing hierarchy entries on TUI
|
|
Adds the new BLKREPORTZONE and BLKRESETZONE ioctls for respectively
obtaining the zone configuration of a zoned block device and resetting
the write pointer of sequential zones of a zoned block device.
The BLKREPORTZONE ioctl maps directly to a single call of the function
blkdev_report_zones. The zone information result is passed as an array
of struct blk_zone identical to the structure used internally for
processing the REQ_OP_ZONE_REPORT operation. The BLKRESETZONE ioctl
maps to a single call of the blkdev_reset_zones function.
Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Implement zoned block device zone information reporting and reset.
Zone information are reported as struct blk_zone. This implementation
does not differentiate between host-aware and host-managed device
models and is valid for both. Two functions are provided:
blkdev_report_zones for discovering the zone configuration of a
zoned block device, and blkdev_reset_zones for resetting the write
pointer of sequential zones. The helper function blk_queue_zone_size
and bdev_zone_size are also provided for, as the name suggest,
obtaining the zone size (in 512B sectors) of the zones of the device.
Signed-off-by: Hannes Reinecke <hare@suse.de>
[Damien: * Removed the zone cache
* Implement report zones operation based on earlier proposal
by Shaun Tancheff <shaun.tancheff@seagate.com>]
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Tested-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Define REQ_OP_ZONE_REPORT and REQ_OP_ZONE_RESET for handling zones of
host-managed and host-aware zoned block devices. With with these two
new operations, the total number of operations defined reaches 8 and
still fits with the 3 bits definition of REQ_OP_BITS.
Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Add the zoned queue limit to indicate the zoning model of a block device.
Defined values are 0 (BLK_ZONED_NONE) for regular block devices,
1 (BLK_ZONED_HA) for host-aware zone block devices and 2 (BLK_ZONED_HM)
for host-managed zone block devices. The standards defined drive managed
model is not defined here since these block devices do not provide any
command for accessing zone information. Drive managed model devices will
be reported as BLK_ZONED_NONE.
The helper functions blk_queue_zoned_model and bdev_zoned_model return
the zoned limit and the functions blk_queue_is_zoned and bdev_is_zoned
return a boolean for callers to test if a block device is zoned.
The zoned attribute is also exported as a string to applications via
sysfs. BLK_ZONED_NONE shows as "none", BLK_ZONED_HA as "host-aware" and
BLK_ZONED_HM as "host-managed".
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Tested-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Create an option CONFIG_LED_TRIGGER_PHY (default n), which will create a
set of led triggers for each instantiated PHY device. There is one LED
trigger per link-speed, per-phy.
The triggers are registered during phy_attach and unregistered during
phy_detach.
This allows for a user to configure their system to allow a set of LEDs
not controlled by the phy to represent link state changes on the phy.
LEDS controlled by the phy are unaffected.
For example, we have a board where some of the leds in the
RJ45 socket are controlled by the phy, but others are not. Using the
triggers provided by this patch the leds not controlled by the phy can
be configured to show the current speed of the ethernet connection. The
leds controlled by the phy are unaffected.
Signed-off-by: Josh Cartwright <josh.cartwright@ni.com>
Signed-off-by: Nathan Sullivan <nathan.sullivan@ni.com>
Signed-off-by: Zach Brown <zach.brown@ni.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
supported by a phydevice
phy_supported_speeds provides a means to get a list of all the speeds a
phy device currently supports.
Signed-off-by: Zach Brown <zach.brown@ni.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Only direct adjacencies are maintained. All upper or lower devices can
be learned via the new walk API which recursively walks the adj_list for
upper devices or lower devices.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch introduces netdev_walk_all_upper_dev_rcu,
netdev_walk_all_lower_dev and netdev_walk_all_lower_dev_rcu. These
functions recursively walk the adj_list of devices to determine all upper
and lower devices.
The functions take a callback function that is invoked for each device
in the list. If the callback returns non-0, the walk is terminated and
the functions return that code back to callers.
v3
- simplified netdev_has_upper_dev_all_rcu and __netdev_has_upper_dev and
removed typecast as suggested by Stephen
v2
- fixed definition of netdev_next_lower_dev_rcu to mirror the upper_dev
version.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
slim core is used as a basis for many IPs in the STi
chipsets such as fdma and demux. To avoid duplicating
the elf loading code in each device driver a slim
rproc driver has been created.
This driver is designed to be used by other device drivers
such as fdma, or demux whose IP is based around a slim core.
The device driver can call slim_rproc_alloc() to allocate
a slim rproc and slim_rproc_put() when finished.
This driver takes care of ioremapping the slim
registers (dmem, imem, slimcore, peripherals), whose offsets
and sizes can change between IP's. It also obtains and enables
any clocks used by the device. This approach avoids having
a double mapping of the registers as slim_rproc does not register
its own platform device. It also maps well to device tree
abstraction as it allows us to have one dt node for the whole
device.
All of the generic rproc elf loading code can be reused, and
we provide start() stop() hooks to start and stop the slim
core once the firmware has been loaded. This has been tested
successfully with fdma driver.
Signed-off-by: Peter Griffin <peter.griffin@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
|
|
The new introduced macro CLK_OF_DECLARE_DRIVER is usually used to
declare clock driver init functions, which are mostly decorated with
__init. Add __init decoration for CLK_OF_DECLARE_DRIVER function to
avoid causing section mismatch warnings on client clock drivers.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Fixes: c7296c51ce5d ("clk: core: New macro CLK_OF_DECLARE_DRIVER")
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
|
|
pkey_set() and pkey_get() were syscalls present in older versions
of the protection keys patches. They were fully excised from the
x86 code, but some cruft was left in the generic syscall code. The
C++ comments were intended to help to make it more glaring to me to
fix them before actually submitting them. That technique worked,
but later than I would have liked.
I test-compiled this for arm64.
Fixes: a60f7b69d92c0 ("generic syscalls: Wire up memory protection keys syscalls")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Cc: linux-arch@vger.kernel.org
Cc: mgorman@techsingularity.net
Cc: linux-api@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: luto@kernel.org
Cc: akpm@linux-foundation.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Entry Size in GITS_BASER<n> occupies 5 bits [52:48], but we mask out 8
bits.
Fixes: cc2d3216f53c ("irqchip: GICv3: ITS command queue")
Cc: stable@vger.kernel.org
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
The PHY interrupts are now handled in a threaded interrupt handler,
which can sleep. The work queue is no longer needed, phy_change() can
be called directly. phy_mac_interrupt() still needs to be safe to call
in interrupt context, so keep the work queue, and use a helper to call
phy_change().
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, socket lookups for l3mdev (vrf) use cases can match a socket
that is bound to a port but not a device (ie., a global socket). If the
sysctl tcp_l3mdev_accept is not set this leads to ack packets going out
based on the main table even though the packet came in from an L3 domain.
The end result is that the connection does not establish creating
confusion for users since the service is running and a socket shows in
ss output. Fix by requiring an exact dif to sk_bound_dev_if match if the
skb came through an interface enslaved to an l3mdev device and the
tcp_l3mdev_accept is not set.
skb's through an l3mdev interface are marked by setting a flag in
inet{6}_skb_parm. The IPv6 variant is already set; this patch adds the
flag for IPv4. Using an skb flag avoids a device lookup on the dif. The
flag is set in the VRF driver using the IP{6}CB macros. For IPv4, the
inet_skb_parm struct is moved in the cb per commit 971f10eca186, so the
match function in the TCP stack needs to use TCP_SKB_CB. For IPv6, the
move is done after the socket lookup, so IP6CB is used.
The flags field in inet_skb_parm struct needs to be increased to add
another flag. There is currently a 1-byte hole following the flags,
so it can be expanded to u16 without increasing the size of the struct.
Fixes: 193125dbd8eb ("net: Introduce VRF device driver")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
HDMI 2.0/CEA-861-F introduces two new aspect ratios:
- 64:27
- 256:135
This patch adds enumeration for the new aspect ratios
in the existing aspect ratio list.
V2: rebase
V3: rebase
V4: Added r-b from Jose, Ack by Tomi
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Jose Abreu <Jose.Abreu@synopsys.com>
Acked-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1476705880-15600-4-git-send-email-shashank.sharma@intel.com
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
I observed false KSAN positives in the sctp code, when
sctp uses jprobe_return() in jsctp_sf_eat_sack().
The stray 0xf4 in shadow memory are stack redzones:
[ ] ==================================================================
[ ] BUG: KASAN: stack-out-of-bounds in memcmp+0xe9/0x150 at addr ffff88005e48f480
[ ] Read of size 1 by task syz-executor/18535
[ ] page:ffffea00017923c0 count:0 mapcount:0 mapping: (null) index:0x0
[ ] flags: 0x1fffc0000000000()
[ ] page dumped because: kasan: bad access detected
[ ] CPU: 1 PID: 18535 Comm: syz-executor Not tainted 4.8.0+ #28
[ ] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[ ] ffff88005e48f2d0 ffffffff82d2b849 ffffffff0bc91e90 fffffbfff10971e8
[ ] ffffed000bc91e90 ffffed000bc91e90 0000000000000001 0000000000000000
[ ] ffff88005e48f480 ffff88005e48f350 ffffffff817d3169 ffff88005e48f370
[ ] Call Trace:
[ ] [<ffffffff82d2b849>] dump_stack+0x12e/0x185
[ ] [<ffffffff817d3169>] kasan_report+0x489/0x4b0
[ ] [<ffffffff817d31a9>] __asan_report_load1_noabort+0x19/0x20
[ ] [<ffffffff82d49529>] memcmp+0xe9/0x150
[ ] [<ffffffff82df7486>] depot_save_stack+0x176/0x5c0
[ ] [<ffffffff817d2031>] save_stack+0xb1/0xd0
[ ] [<ffffffff817d27f2>] kasan_slab_free+0x72/0xc0
[ ] [<ffffffff817d05b8>] kfree+0xc8/0x2a0
[ ] [<ffffffff85b03f19>] skb_free_head+0x79/0xb0
[ ] [<ffffffff85b0900a>] skb_release_data+0x37a/0x420
[ ] [<ffffffff85b090ff>] skb_release_all+0x4f/0x60
[ ] [<ffffffff85b11348>] consume_skb+0x138/0x370
[ ] [<ffffffff8676ad7b>] sctp_chunk_put+0xcb/0x180
[ ] [<ffffffff8676ae88>] sctp_chunk_free+0x58/0x70
[ ] [<ffffffff8677fa5f>] sctp_inq_pop+0x68f/0xef0
[ ] [<ffffffff8675ee36>] sctp_assoc_bh_rcv+0xd6/0x4b0
[ ] [<ffffffff8677f2c1>] sctp_inq_push+0x131/0x190
[ ] [<ffffffff867bad69>] sctp_backlog_rcv+0xe9/0xa20
[ ... ]
[ ] Memory state around the buggy address:
[ ] ffff88005e48f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ ] ffff88005e48f400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ ] >ffff88005e48f480: f4 f4 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ ] ^
[ ] ffff88005e48f500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ ] ffff88005e48f580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ ] ==================================================================
KASAN stack instrumentation poisons stack redzones on function entry
and unpoisons them on function exit. If a function exits abnormally
(e.g. with a longjmp like jprobe_return()), stack redzones are left
poisoned. Later this leads to random KASAN false reports.
Unpoison stack redzones in the frames we are going to jump over
before doing actual longjmp in jprobe_return().
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: kasan-dev@googlegroups.com
Cc: surovegin@google.com
Cc: rostedt@goodmis.org
Link: http://lkml.kernel.org/r/1476454043-101898-1-git-send-email-dvyukov@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull gcc plugins update from Kees Cook:
"This adds a new gcc plugin named "latent_entropy". It is designed to
extract as much possible uncertainty from a running system at boot
time as possible, hoping to capitalize on any possible variation in
CPU operation (due to runtime data differences, hardware differences,
SMP ordering, thermal timing variation, cache behavior, etc).
At the very least, this plugin is a much more comprehensive example
for how to manipulate kernel code using the gcc plugin internals"
* tag 'gcc-plugins-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
latent_entropy: Mark functions with __latent_entropy
gcc-plugins: Add latent_entropy plugin
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs updates from Miklos Szeredi:
"This update contains fixes to the "use mounter's permission to access
underlying layers" area, and miscellaneous other fixes and cleanups.
No new features this time"
* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: use vfs_get_link()
vfs: add vfs_get_link() helper
ovl: use generic_readlink
ovl: explain error values when removing acl from workdir
ovl: Fix info leak in ovl_lookup_temp()
ovl: during copy up, switch to mounter's creds early
ovl: lookup: do getxattr with mounter's permission
ovl: copy_up_xattr(): use strnlen
|