Age | Commit message (Collapse) | Author |
|
Adding perf_msr_probe function to provide interface for
checking up on MSR register and set the related attribute
group visibility.
User defines following struct for each MSR register:
struct perf_msr {
u64 msr;
struct attribute_group *grp;
bool (*test)(int idx, void *data);
bool no_check;
};
Where:
msr - is the MSR address
attrs - is attribute groups array to add if the check passed
test - is test function pointer
no_check - is bool that bypass the check and adds the
attribute without any test
The array of struct perf_msr is passed into:
perf_msr_probe(struct perf_msr *msr, int cnt, bool zero, void *data)
Together with:
cnt - which is the number of struct msr array elements
data - which is user pointer passed to the test function
zero - allow counters that returns zero on rdmsr
The perf_msr_probe will executed test code, read the MSR and
check the value is != 0. If all these tests pass, related
attribute group is kept visible.
Also adding PMU_EVENT_GROUP macro helper to define attribute
group for single attribute. It will be used in following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan <kan.liang@linux.intel.com>
Cc: Liang
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/20190616140358.27799-2-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Use the macro defined in kernel ABI header to replace the local name.
No functional change.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/1559081314-9714-5-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
We don't need pmu->pebs_no_xmm_regs anymore, the capabilities
PERF_PMU_CAP_EXTENDED_REGS can be used to check if XMM registers
collection is supported.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/1559081314-9714-4-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Use generic macro PERF_REG_EXTENDED_MASK to replace PEBS_XMM_REGS to
avoid duplication.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/1559081314-9714-3-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The perf fuzzer triggers a warning which map to:
if (WARN_ON_ONCE(idx >= ARRAY_SIZE(pt_regs_offset)))
return 0;
The bits between XMM registers and generic registers are reserved.
But perf_reg_validate() doesn't check these bits.
Add PERF_REG_X86_RESERVED for reserved bits on X86.
Check the reserved bits in perf_reg_validate().
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 878068ea270e ("perf/x86: Support outputting XMM registers")
Link: https://lkml.kernel.org/r/1559081314-9714-2-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The perf fuzzer caused Skylake machine to crash:
[ 9680.085831] Call Trace:
[ 9680.088301] <IRQ>
[ 9680.090363] perf_output_sample_regs+0x43/0xa0
[ 9680.094928] perf_output_sample+0x3aa/0x7a0
[ 9680.099181] perf_event_output_forward+0x53/0x80
[ 9680.103917] __perf_event_overflow+0x52/0xf0
[ 9680.108266] ? perf_trace_run_bpf_submit+0xc0/0xc0
[ 9680.113108] perf_swevent_hrtimer+0xe2/0x150
[ 9680.117475] ? check_preempt_wakeup+0x181/0x230
[ 9680.122091] ? check_preempt_curr+0x62/0x90
[ 9680.126361] ? ttwu_do_wakeup+0x19/0x140
[ 9680.130355] ? try_to_wake_up+0x54/0x460
[ 9680.134366] ? reweight_entity+0x15b/0x1a0
[ 9680.138559] ? __queue_work+0x103/0x3f0
[ 9680.142472] ? update_dl_rq_load_avg+0x1cd/0x270
[ 9680.147194] ? timerqueue_del+0x1e/0x40
[ 9680.151092] ? __remove_hrtimer+0x35/0x70
[ 9680.155191] __hrtimer_run_queues+0x100/0x280
[ 9680.159658] hrtimer_interrupt+0x100/0x220
[ 9680.163835] smp_apic_timer_interrupt+0x6a/0x140
[ 9680.168555] apic_timer_interrupt+0xf/0x20
[ 9680.172756] </IRQ>
The XMM registers can only be collected by PEBS hardware events on the
platforms with PEBS baseline support, e.g. Icelake, not software/probe
events.
Add capabilities flag PERF_PMU_CAP_EXTENDED_REGS to indicate the PMU
which support extended registers. For X86, the extended registers are
XMM registers.
Add has_extended_regs() to check if extended registers are applied.
The generic code define the mask of extended registers as 0 if arch
headers haven't overridden it.
Originally-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 878068ea270e ("perf/x86: Support outputting XMM registers")
Link: https://lkml.kernel.org/r/1559081314-9714-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
perf_event_open() limits the sample_period to 63 bits. See:
0819b2e30ccb ("perf: Limit perf_event_attr::sample_period to 63 bits")
Make ioctl() consistent with it.
Also on PowerPC, negative sample_period could cause a recursive
PMIs leading to a hang (reported when running perf-fuzzer).
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: maddy@linux.vnet.ibm.com
Cc: mpe@ellerman.id.au
Fixes: 0819b2e30ccb ("perf: Limit perf_event_attr::sample_period to 63 bits")
Link: https://lkml.kernel.org/r/20190604042953.914-1-ravi.bangoria@linux.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The syzbot reported
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0xca/0x13e lib/dump_stack.c:113
print_address_description+0x67/0x231 mm/kasan/report.c:188
__kasan_report.cold+0x1a/0x32 mm/kasan/report.c:317
kasan_report+0xe/0x20 mm/kasan/common.c:614
qmi_wwan_probe+0x342/0x360 drivers/net/usb/qmi_wwan.c:1417
usb_probe_interface+0x305/0x7a0 drivers/usb/core/driver.c:361
really_probe+0x281/0x660 drivers/base/dd.c:509
driver_probe_device+0x104/0x210 drivers/base/dd.c:670
__device_attach_driver+0x1c2/0x220 drivers/base/dd.c:777
bus_for_each_drv+0x15c/0x1e0 drivers/base/bus.c:454
Caused by too many confusing indirections and casts.
id->driver_info is a pointer stored in a long. We want the
pointer here, not the address of it.
Thanks-to: Hillf Danton <hdanton@sina.com>
Reported-by: syzbot+b68605d7fadd21510de1@syzkaller.appspotmail.com
Cc: Kristian Evensen <kristian.evensen@gmail.com>
Fixes: e4bf63482c30 ("qmi_wwan: Add quirk for Quectel dynamic config")
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch is to fix an uninit-value issue, reported by syzbot:
BUG: KMSAN: uninit-value in memchr+0xce/0x110 lib/string.c:981
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x191/0x1f0 lib/dump_stack.c:113
kmsan_report+0x130/0x2a0 mm/kmsan/kmsan.c:622
__msan_warning+0x75/0xe0 mm/kmsan/kmsan_instr.c:310
memchr+0xce/0x110 lib/string.c:981
string_is_valid net/tipc/netlink_compat.c:176 [inline]
tipc_nl_compat_bearer_disable+0x2a1/0x480 net/tipc/netlink_compat.c:449
__tipc_nl_compat_doit net/tipc/netlink_compat.c:327 [inline]
tipc_nl_compat_doit+0x3ac/0xb00 net/tipc/netlink_compat.c:360
tipc_nl_compat_handle net/tipc/netlink_compat.c:1178 [inline]
tipc_nl_compat_recv+0x1b1b/0x27b0 net/tipc/netlink_compat.c:1281
TLV_GET_DATA_LEN() may return a negtive int value, which will be
used as size_t (becoming a big unsigned long) passed into memchr,
cause this issue.
Similar to what it does in tipc_nl_compat_bearer_enable(), this
fix is to return -EINVAL when TLV_GET_DATA_LEN() is negtive in
tipc_nl_compat_bearer_disable(), as well as in
tipc_nl_compat_link_stat_dump() and tipc_nl_compat_link_reset_stats().
v1->v2:
- add the missing Fixes tags per Eric's request.
Fixes: 0762216c0ad2 ("tipc: fix uninit-value in tipc_nl_compat_bearer_enable")
Fixes: 8b66fee7f8ee ("tipc: fix uninit-value in tipc_nl_compat_link_reset_stats")
Reported-by: syzbot+30eaa8bf392f7fafffaf@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes the MAC address setup in the probe. The MAC address
retrieved using of_get_mac_address was checked for not containing an
error, but it may also be NULL which wasn't tested. Fix it by replacing
IS_ERR with IS_ERR_OR_NULL.
Fixes: 541ddc66d665 ("net: macb: support of_get_mac_address new ERR_PTR error")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
syzbot found we can leak memory in packet_set_ring(), if user application
provides buggy parameters.
Fixes: 7f953ab2ba46 ("af_packet: TX_RING support for TPACKET_V3")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With commit 94850257cf0f ("tls: Fix tls_device handling of partial records")
a new path was introduced to cleanup partial records during sk_proto_close.
This path does not handle the SW KTLS tx_list cleanup.
This is unnecessary though since the free_resources calls for both
SW and offload paths will cleanup a partial record.
The visible effect is the following warning, but this bug also causes
a page double free.
WARNING: CPU: 7 PID: 4000 at net/core/stream.c:206 sk_stream_kill_queues+0x103/0x110
RIP: 0010:sk_stream_kill_queues+0x103/0x110
RSP: 0018:ffffb6df87e07bd0 EFLAGS: 00010206
RAX: 0000000000000000 RBX: ffff8c21db4971c0 RCX: 0000000000000007
RDX: ffffffffffffffa0 RSI: 000000000000001d RDI: ffff8c21db497270
RBP: ffff8c21db497270 R08: ffff8c29f4748600 R09: 000000010020001a
R10: ffffb6df87e07aa0 R11: ffffffff9a445600 R12: 0000000000000007
R13: 0000000000000000 R14: ffff8c21f03f2900 R15: ffff8c21f03b8df0
Call Trace:
inet_csk_destroy_sock+0x55/0x100
tcp_close+0x25d/0x400
? tcp_check_oom+0x120/0x120
tls_sk_proto_close+0x127/0x1c0
inet_release+0x3c/0x60
__sock_release+0x3d/0xb0
sock_close+0x11/0x20
__fput+0xd8/0x210
task_work_run+0x84/0xa0
do_exit+0x2dc/0xb90
? release_sock+0x43/0x90
do_group_exit+0x3a/0xa0
get_signal+0x295/0x720
do_signal+0x36/0x610
? SYSC_recvfrom+0x11d/0x130
exit_to_usermode_loop+0x69/0xb0
do_syscall_64+0x173/0x180
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x7fe9b9abc10d
RSP: 002b:00007fe9b19a1d48 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 0000000000000006 RCX: 00007fe9b9abc10d
RDX: 0000000000000002 RSI: 0000000000000080 RDI: 00007fe948003430
RBP: 00007fe948003410 R08: 00007fe948003430 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00005603739d9080
R13: 00007fe9b9ab9f90 R14: 00007fe948003430 R15: 0000000000000000
Fixes: 94850257cf0f ("tls: Fix tls_device handling of partial records")
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It's not okay to cast a "u32 *" to "unsigned long *" when you are
doing a for_each_set_bit() loop because that will break on big
endian systems.
Fixes: 386145601b82 ("mfd: stmfx: Uninitialized variable in stmfx_irq_handler()")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Tested-by: Amelie Delaunay <amelie.delaunay@st.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|
|
Initialize pidfd to an invalid descriptor, to fail gracefully on
those kernels that do not implement CLONE_PIDFD and leave pidfd
unchanged.
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Christian Brauner <christian@brauner.io>
|
|
Give userspace a cheap and reliable way to tell whether CLONE_PIDFD is
supported by the kernel or not. The easiest way is to pass an invalid
file descriptor value in parent_tidptr, perform the syscall and verify
that parent_tidptr has been changed to a valid file descriptor value.
CLONE_PIDFD uses parent_tidptr to return pidfds. CLONE_PARENT_SETTID
will use parent_tidptr to return the tid of the parent. The two flags
cannot be used together. Old kernels that only support
CLONE_PARENT_SETTID will not verify the value pointed to by
parent_tidptr. This behavior is unchanged even with the introduction of
CLONE_PIDFD.
However, if CLONE_PIDFD is specified the kernel will currently check the
value pointed to by parent_tidptr before placing the pidfd in the memory
pointed to. EINVAL will be returned if the value in parent_tidptr is not
0.
If CLONE_PIDFD is supported and fd 0 is closed, then the returned pidfd
can and likely will be 0 and parent_tidptr will be unchanged. This means
userspace must either check CLONE_PIDFD support beforehand or check that
fd 0 is not closed when invoking CLONE_PIDFD.
The check for pidfd == 0 was introduced during the v5.2 merge window by
commit b3e583825266 ("clone: add CLONE_PIDFD") to ensure that
CLONE_PIDFD could be potentially extended by passing in flags through
the return argument.
However, that extension would look horrible, and with the upcoming
introduction of the clone3 syscall in v5.3 there is no need to extend
legacy clone syscall this way. (Even if it would need to be extended,
CLONE_DETACHED can be reused with CLONE_PIDFD.)
So remove the pidfd == 0 check. Userspace that needs to be portable to
kernels without CLONE_PIDFD support can then be advised to initialize
pidfd to -1 and check the pidfd value returned by CLONE_PIDFD.
Fixes: b3e583825266 ("clone: add CLONE_PIDFD")
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Christian Brauner <christian@brauner.io>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull mtd fixes from Miquel Raynal:
- Set the raw NAND number of targets to the right value
- Fix a bug uncovered by a recent patch on Spansion SPI-NOR flashes
* tag 'mtd/fixes-for-5.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
mtd: spi-nor: use 16-bit WRR command when QE is set on spansion flashes
mtd: rawnand: initialize ntargets with maxchips
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"One fix for a bug in our context id handling on 64-bit hash CPUs,
which can lead to unrelated processes being able to read/write to each
other's virtual memory. See the commit for full details.
That is the fix for CVE-2019-12817.
This also adds a kernel selftest for the bug"
* tag 'powerpc-5.2-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
selftests/powerpc: Add test of fork with mapping above 512TB
powerpc/mm/64s/hash: Reallocate context ids on fork
|
|
According to the i.MX6UL/L RM, table 3.1 "ARM Cortex A7 domain interrupt
summary", the interrupts for the PWM[1-4] go from 83 to 86.
Fixes: b9901fe84f02 ("ARM: dts: imx6ul: add pwm[1-4] nodes")
Signed-off-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
Pull auxdisplay cleanup from Miguel Ojeda:
"A cleanup for two drivers in auxdisplay: convert them to use
vm_map_pages_zero() (Souptick Joarder)"
* tag 'auxdisplay-for-linus-v5.2-rc7' of git://github.com/ojeda/linux:
auxdisplay/ht16k33.c: Convert to use vm_map_pages_zero()
auxdisplay/cfag12864bfb.c: Convert to use vm_map_pages_zero()
|
|
Since two new sysfs interface files are created for umwait control, add
an ABI document entry for the files:
/sys/devices/system/cpu/umwait_control/enable_c02
/sys/devices/system/cpu/umwait_control/max_time
[ tglx: Made the write value instructions readable ]
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Cc: "Borislav Petkov" <bp@alien8.de>
Cc: "H Peter Anvin" <hpa@zytor.com>
Cc: "Andy Lutomirski" <luto@kernel.org>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Tony Luck" <tony.luck@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/1560994438-235698-6-git-send-email-fenghua.yu@intel.com
|
|
IA32_UMWAIT_CONTROL[31:2] determines the maximum time in TSC-quanta
that processor can stay in C0.1 or C0.2. A zero value means no maximum
time.
Each instruction sets its own deadline in the instruction's implicit
input EDX:EAX value. The instruction wakes up if the time-stamp counter
reaches or exceeds the specified deadline, or the umwait maximum time
expires, or a store happens in the monitored address range in umwait.
The administrator can write an unsigned 32-bit number to
/sys/devices/system/cpu/umwait_control/max_time to change the default
value. Note that a value of zero means there is no limit. The lower two
bits of the value must be zero.
[ tglx: Simplify the write function. Massage changelog ]
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: "Borislav Petkov" <bp@alien8.de>
Cc: "H Peter Anvin" <hpa@zytor.com>
Cc: "Andy Lutomirski" <luto@kernel.org>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/1560994438-235698-5-git-send-email-fenghua.yu@intel.com
|
|
C0.2 state in umwait and tpause instructions can be enabled or disabled
on a processor through IA32_UMWAIT_CONTROL MSR register.
By default, C0.2 is enabled and the user wait instructions results in
lower power consumption with slower wakeup time.
But in real time systems which require faster wakeup time although power
savings could be smaller, the administrator needs to disable C0.2 and all
umwait invocations from user applications use C0.1.
Create a sysfs interface which allows the administrator to control C0.2
state during run time.
Andy Lutomirski suggested to turn off local irqs before writing the MSR to
ensure the cached control value is not changed by a concurrent sysfs write
from a different CPU via IPI.
[ tglx: Simplified the update logic in the write function and got rid of
all the convoluted type casts. Added a shared update function and
made the namespace consistent. Moved the sysfs create invocation.
Massaged changelog ]
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: "Borislav Petkov" <bp@alien8.de>
Cc: "H Peter Anvin" <hpa@zytor.com>
Cc: "Andy Lutomirski" <luto@kernel.org>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/1560994438-235698-4-git-send-email-fenghua.yu@intel.com
|
|
umwait or tpause allows the processor to enter a light-weight
power/performance optimized state (C0.1 state) or an improved
power/performance optimized state (C0.2 state) for a period specified by
the instruction or until the system time limit or until a store to the
monitored address range in umwait.
IA32_UMWAIT_CONTROL MSR register allows the OS to enable/disable C0.2 on
the processor and to set the maximum time the processor can reside in C0.1
or C0.2.
By default C0.2 is enabled so the user wait instructions can enter the
C0.2 state to save more power with slower wakeup time.
Andy Lutomirski proposed to set the maximum umwait time to 100000 cycles by
default. A quote from Andy:
"What I want to avoid is the case where it works dramatically differently
on NO_HZ_FULL systems as compared to everything else. Also, UMWAIT may
behave a bit differently if the max timeout is hit, and I'd like that
path to get exercised widely by making it happen even on default
configs."
A sysfs interface to adjust the time and the C0.2 enablement is provided in
a follow up change.
[ tglx: Renamed MSR_IA32_UMWAIT_CONTROL_MAX_TIME to
MSR_IA32_UMWAIT_CONTROL_TIME_MASK because the constant is used as
mask throughout the code.
Massaged comments and changelog ]
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: "Borislav Petkov" <bp@alien8.de>
Cc: "H Peter Anvin" <hpa@zytor.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Tony Luck" <tony.luck@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/1560994438-235698-3-git-send-email-fenghua.yu@intel.com
|
|
umonitor, umwait, and tpause are a set of user wait instructions.
umonitor arms address monitoring hardware using an address. The
address range is determined by using CPUID.0x5. A store to
an address within the specified address range triggers the
monitoring hardware to wake up the processor waiting in umwait.
umwait instructs the processor to enter an implementation-dependent
optimized state while monitoring a range of addresses. The optimized
state may be either a light-weight power/performance optimized state
(C0.1 state) or an improved power/performance optimized state
(C0.2 state).
tpause instructs the processor to enter an implementation-dependent
optimized state C0.1 or C0.2 state and wake up when time-stamp counter
reaches specified timeout.
The three instructions may be executed at any privilege level.
The instructions provide power saving method while waiting in
user space. Additionally, they can allow a sibling hyperthread to
make faster progress while this thread is waiting. One example of an
application usage of umwait is when waiting for input data from another
application, such as a user level multi-threaded packet processing
engine.
Availability of the user wait instructions is indicated by the presence
of the CPUID feature flag WAITPKG CPUID.0x07.0x0:ECX[5].
Detailed information on the instructions and CPUID feature WAITPKG flag
can be found in the latest Intel Architecture Instruction Set Extensions
and Future Features Programming Reference and Intel 64 and IA-32
Architectures Software Developer's Manual.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: "Borislav Petkov" <bp@alien8.de>
Cc: "H Peter Anvin" <hpa@zytor.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Tony Luck" <tony.luck@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/1560994438-235698-2-git-send-email-fenghua.yu@intel.com
|
|
If register_qdisc fails, we should unregister
netdevice notifier.
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: e0a7683d30e9 ("net/sched: cbs: fix port_rate miscalculation")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
SPI memory devices from different manufacturers have widely
different configurations for Status, Control and Configuration
registers. JEDEC 216C defines a new map for these common register
bits and their functions, and describes how the individual bits may
be accessed for a specific device. For the JEDEC 216B compliant
flashes, we can partially deduce Status and Configuration registers
functions by inspecting the 16th DWORD of BFPT. Older flashes that
don't declare the SFDP tables (SPANSION FL512SAIFG1 311QQ063 A ©11
SPANSION) let the software decide how to interact with these registers.
The commit dcb4b22eeaf4 ("spi-nor: s25fl512s supports region locking")
uncovered a probe error for s25fl512s, when the Quad Enable bit CR[1]
was set to one in the bootloader. When this bit is one, only the Write
Status (01h) command with two data byts may be used, the 01h command with
one data byte is not recognized and hence the error when trying to clear
the block protection bits.
Fix the above by using the Write Status (01h) command with two data bytes
when the Quad Enable bit is one.
Backward compatibility should be fine. The newly introduced
spi_nor_spansion_clear_sr_bp() is tightly coupled with the
spansion_quad_enable() function. Both assume that the Write Register
with 16 bits, together with the Read Configuration Register (35h)
instructions are supported.
Fixes: dcb4b22eeaf44f91 ("spi-nor: s25fl512s supports region locking")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Tested-by: Jonas Bonn <jonas@norrbonn.se>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Vignesh Raghavendra <vigneshr@ti.com>
Tested-by: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
|
|
This patch is to fix a dst defcnt leak, which can be reproduced by doing:
# ip net a c; ip net a s; modprobe tipc
# ip net e s ip l a n eth1 type veth peer n eth1 netns c
# ip net e c ip l s lo up; ip net e c ip l s eth1 up
# ip net e s ip l s lo up; ip net e s ip l s eth1 up
# ip net e c ip a a 1.1.1.2/8 dev eth1
# ip net e s ip a a 1.1.1.1/8 dev eth1
# ip net e c tipc b e m udp n u1 localip 1.1.1.2
# ip net e s tipc b e m udp n u1 localip 1.1.1.1
# ip net d c; ip net d s; rmmod tipc
and it will get stuck and keep logging the error:
unregister_netdevice: waiting for lo to become free. Usage count = 1
The cause is that a dst is held by the udp sock's sk_rx_dst set on udp rx
path with udp_early_demux == 1, and this dst (eventually holding lo dev)
can't be released as bearer's removal in tipc pernet .exit happens after
lo dev's removal, default_device pernet .exit.
"There are two distinct types of pernet_operations recognized: subsys and
device. At creation all subsys init functions are called before device
init functions, and at destruction all device exit functions are called
before subsys exit function."
So by calling register_pernet_device instead to register tipc_net_ops, the
pernet .exit() will be invoked earlier than loopback dev's removal when a
netns is being destroyed, as fou/gue does.
Note that vxlan and geneve udp tunnels don't have this issue, as the udp
sock is released in their device ndo_stop().
This fix is also necessary for tipc dst_cache, which will hold dsts on tx
path and I will introduce in my next patch.
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since commit 605ad7f184b60cfaacbc038aa6c55ee68dee3c89 "tcp: refine TSO autosizing",
outbound throughput is dramatically reduced for some connections, as sis900
is doing TX completion within idle states only.
Make TX completion happen after every transmitted packet.
Test:
netperf
before patch:
> netperf -H remote -l -2000000 -- -s 1000000
MIGRATED TCP STREAM TEST from 0.0.0.0 () port 0 AF_INET to 95.223.112.76 () port 0 AF_INET : demo
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
87380 327680 327680 253.44 0.06
after patch:
> netperf -H remote -l -10000000 -- -s 1000000
MIGRATED TCP STREAM TEST from 0.0.0.0 () port 0 AF_INET to 95.223.112.76 () port 0 AF_INET : demo
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
87380 327680 327680 5.38 14.89
Thx to Dave Miller and Eric Dumazet for helpful hints
Signed-off-by: Sergej Benilov <sergej.benilov@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When transmitting certain PTP frames, e.g. SYNC and DELAY_REQ, the
PTP daemon, e.g. ptp4l, is polling the driver for the frame transmit
hardware timestamp. The polling will most likely timeout if the tx
coalesce is enabled due to the Interrupt-on-Completion (IC) bit is
not set in tx descriptor for those frames.
This patch will ignore the tx coalesce parameter and set the IC bit
when transmitting PTP frames which need to report out the frame
transmit hardware timestamp to user space.
Fixes: f748be531d70 ("net: stmmac: Rework coalesce timer and fix multi-queue races")
Signed-off-by: Roland Hii <roland.king.guan.hii@intel.com>
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When ADDSUB bit is set, the system time seconds field is calculated as
the complement of the seconds part of the update value.
For example, if 3.000000001 seconds need to be subtracted from the
system time, this field is calculated as
2^32 - 3 = 4294967296 - 3 = 0x100000000 - 3 = 0xFFFFFFFD
Previously, the 0x100000000 is mistakenly written as 100000000.
This is further simplified from
sec = (0x100000000ULL - sec);
to
sec = -sec;
Fixes: ba1ffd74df74 ("stmmac: fix PTP support for GMAC4")
Signed-off-by: Roland Hii <roland.king.guan.hii@intel.com>
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fix from Joerg Roedel:
"Revert a commit from the previous pile of fixes which causes new
lockdep splats. It is better to revert it for now and work on a better
and more well tested fix"
* tag 'iommu-fix-v5.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
Revert "iommu/vt-d: Fix lock inversion between iommu->lock and device_domain_lock"
|
|
device_domain_lock"
This reverts commit 7560cc3ca7d9d11555f80c830544e463fcdb28b8.
With 5.2.0-rc5 I can easily trigger this with lockdep and iommu=pt:
======================================================
WARNING: possible circular locking dependency detected
5.2.0-rc5 #78 Not tainted
------------------------------------------------------
swapper/0/1 is trying to acquire lock:
00000000ea2b3beb (&(&iommu->lock)->rlock){+.+.}, at: domain_context_mapping_one+0xa5/0x4e0
but task is already holding lock:
00000000a681907b (device_domain_lock){....}, at: domain_context_mapping_one+0x8d/0x4e0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (device_domain_lock){....}:
_raw_spin_lock_irqsave+0x3c/0x50
dmar_insert_one_dev_info+0xbb/0x510
domain_add_dev_info+0x50/0x90
dev_prepare_static_identity_mapping+0x30/0x68
intel_iommu_init+0xddd/0x1422
pci_iommu_init+0x16/0x3f
do_one_initcall+0x5d/0x2b4
kernel_init_freeable+0x218/0x2c1
kernel_init+0xa/0x100
ret_from_fork+0x3a/0x50
-> #0 (&(&iommu->lock)->rlock){+.+.}:
lock_acquire+0x9e/0x170
_raw_spin_lock+0x25/0x30
domain_context_mapping_one+0xa5/0x4e0
pci_for_each_dma_alias+0x30/0x140
dmar_insert_one_dev_info+0x3b2/0x510
domain_add_dev_info+0x50/0x90
dev_prepare_static_identity_mapping+0x30/0x68
intel_iommu_init+0xddd/0x1422
pci_iommu_init+0x16/0x3f
do_one_initcall+0x5d/0x2b4
kernel_init_freeable+0x218/0x2c1
kernel_init+0xa/0x100
ret_from_fork+0x3a/0x50
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(device_domain_lock);
lock(&(&iommu->lock)->rlock);
lock(device_domain_lock);
lock(&(&iommu->lock)->rlock);
*** DEADLOCK ***
2 locks held by swapper/0/1:
#0: 00000000033eb13d (dmar_global_lock){++++}, at: intel_iommu_init+0x1e0/0x1422
#1: 00000000a681907b (device_domain_lock){....}, at: domain_context_mapping_one+0x8d/0x4e0
stack backtrace:
CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc5 #78
Hardware name: LENOVO 20KGS35G01/20KGS35G01, BIOS N23ET50W (1.25 ) 06/25/2018
Call Trace:
dump_stack+0x85/0xc0
print_circular_bug.cold.57+0x15c/0x195
__lock_acquire+0x152a/0x1710
lock_acquire+0x9e/0x170
? domain_context_mapping_one+0xa5/0x4e0
_raw_spin_lock+0x25/0x30
? domain_context_mapping_one+0xa5/0x4e0
domain_context_mapping_one+0xa5/0x4e0
? domain_context_mapping_one+0x4e0/0x4e0
pci_for_each_dma_alias+0x30/0x140
dmar_insert_one_dev_info+0x3b2/0x510
domain_add_dev_info+0x50/0x90
dev_prepare_static_identity_mapping+0x30/0x68
intel_iommu_init+0xddd/0x1422
? printk+0x58/0x6f
? lockdep_hardirqs_on+0xf0/0x180
? do_early_param+0x8e/0x8e
? e820__memblock_setup+0x63/0x63
pci_iommu_init+0x16/0x3f
do_one_initcall+0x5d/0x2b4
? do_early_param+0x8e/0x8e
? rcu_read_lock_sched_held+0x55/0x60
? do_early_param+0x8e/0x8e
kernel_init_freeable+0x218/0x2c1
? rest_init+0x230/0x230
kernel_init+0xa/0x100
ret_from_fork+0x3a/0x50
domain_context_mapping_one() is taking device_domain_lock first then
iommu lock, while dmar_insert_one_dev_info() is doing the reverse.
That should be introduced by commit:
7560cc3ca7d9 ("iommu/vt-d: Fix lock inversion between iommu->lock and
device_domain_lock", 2019-05-27)
So far I still cannot figure out how the previous deadlock was
triggered (I cannot find iommu lock taken before calling of
iommu_flush_dev_iotlb()), however I'm pretty sure that that change
should be incomplete at least because it does not fix all the places
so we're still taking the locks in different orders, while reverting
that commit is very clean to me so far that we should always take
device_domain_lock first then the iommu lock.
We can continue to try to find the real culprit mentioned in
7560cc3ca7d9, but for now I think we should revert it to fix current
breakage.
CC: Joerg Roedel <joro@8bytes.org>
CC: Lu Baolu <baolu.lu@linux.intel.com>
CC: dave.jiang@intel.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fix from Bjorn Helgaas:
"If an IOMMU is present, ignore the P2PDMA whitelist we added for v5.2
because we don't yet know how to support P2PDMA in that case (Logan
Gunthorpe)"
* tag 'pci-v5.2-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI/P2PDMA: Ignore root complex whitelist when an IOMMU is present
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Three driver fixes (and one version number update): a suspend hang in
ufs, a qla hard lock on module removal and a qedi panic during
discovery"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qla2xxx: Fix hardlockup in abort command during driver remove
scsi: ufs: Avoid runtime suspend possibly being blocked forever
scsi: qedi: update driver version to 8.37.0.20
scsi: qedi: Check targetname while finding boot target information
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"This is a frustratingly large batch at rc5. Some of these were sent
earlier but were missed by me due to being distracted by other things,
and some took a while to track down due to needing manual bisection on
old hardware. But still we clearly need to improve our testing of KVM,
and of 32-bit, so that we catch these earlier.
Summary: seven fixes, all for bugs introduced this cycle.
- The commit to add KASAN support broke booting on 32-bit SMP
machines, due to a refactoring that moved some setup out of the
secondary CPU path.
- A fix for another 32-bit SMP bug introduced by the fast syscall
entry implementation for 32-bit BOOKE. And a build fix for the same
commit.
- Our change to allow the DAWR to be force enabled on Power9
introduced a bug in KVM, where we clobber r3 leading to a host
crash.
- The same commit also exposed a previously unreachable bug in the
nested KVM handling of DAWR, which could lead to an oops in a
nested host.
- One of the DMA reworks broke the b43legacy WiFi driver on some
people's powermacs, fix it by enabling a 30-bit ZONE_DMA on 32-bit.
- A fix for TLB flushing in KVM introduced a new bug, as it neglected
to also flush the ERAT, this could lead to memory corruption in the
guest.
Thanks to: Aaro Koskinen, Christoph Hellwig, Christophe Leroy, Larry
Finger, Michael Neuling, Suraj Jitindar Singh"
* tag 'powerpc-5.2-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
KVM: PPC: Book3S HV: Invalidate ERAT when flushing guest TLB entries
powerpc: enable a 30-bit ZONE_DMA for 32-bit pmac
KVM: PPC: Book3S HV: Only write DAWR[X] when handling h_set_dawr in real mode
KVM: PPC: Book3S HV: Fix r3 corruption in h_set_dabr()
powerpc/32: fix build failure on book3e with KVM
powerpc/booke: fix fast syscall entry on SMP
powerpc/32s: fix initial setup of segment registers on secondary CPU
|
|
When trying to align the minimum encryption key size requirement for
Bluetooth connections, it turns out doing this in a central location in
the HCI connection handling code is not possible.
Original Bluetooth version up to 2.0 used a security model where the
L2CAP service would enforce authentication and encryption. Starting
with Bluetooth 2.1 and Secure Simple Pairing that model has changed into
that the connection initiator is responsible for providing an encrypted
ACL link before any L2CAP communication can happen.
Now connecting Bluetooth 2.1 or later devices with Bluetooth 2.0 and
before devices are causing a regression. The encryption key size check
needs to be moved out of the HCI connection handling into the L2CAP
channel setup.
To achieve this, the current check inside hci_conn_security() has been
moved into l2cap_check_enc_key_size() helper function and then called
from four decisions point inside L2CAP to cover all combinations of
Secure Simple Pairing enabled devices and device using legacy pairing
and legacy service security model.
Fixes: d5bb334a8e17 ("Bluetooth: Align minimum encryption key size for LE and BR/EDR connections")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203643
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Since commit 7d5905dc14a8 ("x86 / CPU: Always show current CPU frequency
in /proc/cpuinfo") open and read of /proc/cpuinfo sends IPI to all CPUs.
Many applications read /proc/cpuinfo at the start for trivial reasons like
counting cores or detecting cpu features. While sensitive workloads like
DPDK network polling don't like any interrupts.
Integrates this feature with cpu isolation and do not send IPIs to CPUs
without housekeeping flag HK_FLAG_MISC (set by nohz_full).
Code that requests cpu frequency like show_cpuinfo() falls back to the last
frequency set by the cpufreq driver if this method returns 0.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Link: https://lkml.kernel.org/r/155790354043.1104.15333317408370209.stgit@buzz
|
|
The arc4 crypto is mandatory at ppp_mppe probe time, so let's put a
softdep line, so that the corresponding module gets prepared
gracefully. Without this, a simple inclusion to initrd via dracut
failed due to the missing dependency, for example.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Certain cards in conjunction with certain switches need a little more
time for link setup that results in ethtool link test failure after
offline test. Patch adds a loop that waits for a link setup finish.
Changes in v2:
- added fixes header
Fixes: 4276e47e2d1c ("be2net: Add link test to list of ethtool self tests.")
Signed-off-by: Petr Oros <poros@redhat.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The left shift of unsigned int cpu_khz will overflow for large values of
cpu_khz, so cast it to a long long before shifting it to avoid overvlow.
For example, this can happen when cpu_khz is 4194305, i.e. ~4.2 GHz.
Addresses-Coverity: ("Unintentional integer overflow")
Fixes: 8c3ba8d04924 ("x86, apic: ack all pending irqs when crashed/on kexec")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: kernel-janitors@vger.kernel.org
Link: https://lkml.kernel.org/r/20190619181446.13635-1-colin.king@canonical.com
|
|
Same as Intel, Zhaoxin MP CPUs support C3 share cache and on all
recent Zhaoxin platforms ARB_DISABLE is a nop. So set related
flags correctly in the same way as Intel does.
Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "hpa@zytor.com" <hpa@zytor.com>
Cc: "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>
Cc: "rjw@rjwysocki.net" <rjw@rjwysocki.net>
Cc: "lenb@kernel.org" <lenb@kernel.org>
Cc: David Wang <DavidWang@zhaoxin.com>
Cc: "Cooper Yan(BJ-RD)" <CooperYan@zhaoxin.com>
Cc: "Qiyuan Wang(BJ-RD)" <QiyuanWang@zhaoxin.com>
Cc: "Herry Yang(BJ-RD)" <HerryYang@zhaoxin.com>
Link: https://lkml.kernel.org/r/a370503660994669991a7f7cda7c5e98@zhaoxin.com
|
|
Zhaoxin CPUs have NONSTOP TSC feature, so enable the ACPI
driver support for it.
Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "hpa@zytor.com" <hpa@zytor.com>
Cc: "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>
Cc: "rjw@rjwysocki.net" <rjw@rjwysocki.net>
Cc: "lenb@kernel.org" <lenb@kernel.org>
Cc: David Wang <DavidWang@zhaoxin.com>
Cc: "Cooper Yan(BJ-RD)" <CooperYan@zhaoxin.com>
Cc: "Qiyuan Wang(BJ-RD)" <QiyuanWang@zhaoxin.com>
Cc: "Herry Yang(BJ-RD)" <HerryYang@zhaoxin.com>
Link: https://lkml.kernel.org/r/d1cfd937dabc44518d42038b55522c53@zhaoxin.com
|
|
Add x86 architecture support for new Zhaoxin processors.
Carve out initialization code needed by Zhaoxin processors into
a separate compilation unit.
To identify Zhaoxin CPU, add a new vendor type X86_VENDOR_ZHAOXIN
for system recognition.
Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "hpa@zytor.com" <hpa@zytor.com>
Cc: "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>
Cc: "rjw@rjwysocki.net" <rjw@rjwysocki.net>
Cc: "lenb@kernel.org" <lenb@kernel.org>
Cc: David Wang <DavidWang@zhaoxin.com>
Cc: "Cooper Yan(BJ-RD)" <CooperYan@zhaoxin.com>
Cc: "Qiyuan Wang(BJ-RD)" <QiyuanWang@zhaoxin.com>
Cc: "Herry Yang(BJ-RD)" <HerryYang@zhaoxin.com>
Link: https://lkml.kernel.org/r/01042674b2f741b2aed1f797359bdffb@zhaoxin.com
|
|
Split Tremont based Atoms from the rest to keep logical grouping.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/20190617115537.33309-1-andriy.shevchenko@linux.intel.com
|
|
Explain how the GS/FS based addressing can be utilized in user space
applications along with the differences between the generic prctl() based
GS/FS base control and the FSGSBASE version available on newer CPUs.
Originally-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Bae, Chang Seok" <chang.seok.bae@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>,
Cc: H . Peter Anvin <hpa@zytor.com>
Cc: "Shankar, Ravi V" <ravi.v.shankar@intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1906132246310.1791@nanos.tec.linutronix.de
|
|
The kernel needs to explicitly enable FSGSBASE. So, the application needs
to know if it can safely use these instructions. Just looking at the CPUID
bit is not enough because it may be running in a kernel that does not
enable the instructions.
One way for the application would be to just try and catch the SIGILL.
But that is difficult to do in libraries which may not want to overwrite
the signal handlers of the main application.
Enumerate the enabled FSGSBASE capability in bit 1 of AT_HWCAP2 in the ELF
aux vector. AT_HWCAP2 is already used by PPC for similar purposes.
The application can access it open coded or by using the getauxval()
function in newer versions of glibc.
[ tglx: Massaged changelog ]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Link: https://lkml.kernel.org/r/1557309753-24073-18-git-send-email-chang.seok.bae@intel.com
|
|
Now that FSGSBASE is fully supported, remove unsafe_fsgsbase, enable
FSGSBASE by default, and add nofsgsbase to disable it.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Link: https://lkml.kernel.org/r/1557309753-24073-17-git-send-email-chang.seok.bae@intel.com
|