Age | Commit message (Collapse) | Author |
|
cgwb_bdi_destroy()
a20135ffbc44 ("writeback: don't drain bdi_writeback_congested on bdi
destruction") added rbtree_postorder_for_each_entry_safe() which is
used to remove all entries; however, according to Cody, the iterator
isn't safe against operations which may rebalance the tree. Fix it by
switching to repeatedly removing rb_first() until empty.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Cody P Schafer <dev@codyps.com>
Fixes: a20135ffbc44 ("writeback: don't drain bdi_writeback_congested on bdi destruction")
Link: http://lkml.kernel.org/g/1443997973-1700-1-git-send-email-dev@codyps.com
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Yuchung Cheng says:
====================
RACK loss detection
RACK (Recent ACK) loss recovery uses the notion of time instead of
packet sequence (FACK) or counts (dupthresh).
It's inspired by the FACK heuristic in tcp_mark_lost_retrans(): when a
limited transmit (new data packet) is sacked in recovery, then any
retransmission sent before that newly sacked packet was sent must have
been lost, since at least one round trip time has elapsed.
But that existing heuristic from tcp_mark_lost_retrans()
has several limitations:
1) it can't detect tail drops since it depends on limited transmit
2) it's disabled upon reordering (assumes no reordering)
3) it's only enabled in fast recovery but not timeout recovery
RACK addresses these limitations with a core idea: an unacknowledged
packet P1 is deemed lost if a packet P2 that was sent later is is
s/acked, since at least one round trip has passed.
Since RACK cares about the time sequence instead of the data sequence
of packets, it can detect tail drops when a later retransmission is
s/acked, while FACK or dupthresh can't. For reordering RACK uses a
dynamically adjusted reordering window ("reo_wnd") to reduce false
positives on ever (small) degree of reordering, similar to the delayed
Early Retransmit.
In the current patch set RACK is only a supplemental loss detection
and does not trigger fast recovery. However we are developing RACK
to replace or consolidate FACK/dupthresh, early retransmit, and
thin-dupack. These heuristics all implicitly bear the time notion.
For example, the delayed Early Retransmit is simply applying RACK
to trigger the fast recovery with small inflight.
RACK requires measuring the minimum RTT. Tracking a global min is less
robust due to traffic engineering pathing changes. Therefore it uses a
windowed filter by Kathleen Nichols. The min RTT can also be useful
for various other purposes like congestion control or stat monitoring.
This patch has been used on Google servers for well over 1 year. RACK
has also been implemented in the QUIC protocol. We are submitting an
IETF draft as well.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch implements the second half of RACK that uses the the most
recent transmit time among all delivered packets to detect losses.
tcp_rack_mark_lost() is called upon receiving a dubious ACK.
It then checks if an not-yet-sacked packet was sent at least
"reo_wnd" prior to the sent time of the most recently delivered.
If so the packet is deemed lost.
The "reo_wnd" reordering window starts with 1msec for fast loss
detection and changes to min-RTT/4 when reordering is observed.
We found 1msec accommodates well on tiny degree of reordering
(<3 pkts) on faster links. We use min-RTT instead of SRTT because
reordering is more of a path property but SRTT can be inflated by
self-inflicated congestion. The factor of 4 is borrowed from the
delayed early retransmit and seems to work reasonably well.
Since RACK is still experimental, it is now used as a supplemental
loss detection on top of existing algorithms. It is only effective
after the fast recovery starts or after the timeout occurs. The
fast recovery is still triggered by FACK and/or dupack threshold
instead of RACK.
We introduce a new sysctl net.ipv4.tcp_recovery for future
experiments of loss recoveries. For now RACK can be disabled by
setting it to 0.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch is the first half of the RACK loss recovery.
RACK loss recovery uses the notion of time instead
of packet sequence (FACK) or counts (dupthresh). It's inspired by the
previous FACK heuristic in tcp_mark_lost_retrans(): when a limited
transmit (new data packet) is sacked, then current retransmitted
sequence below the newly sacked sequence must been lost,
since at least one round trip time has elapsed.
But it has several limitations:
1) can't detect tail drops since it depends on limited transmit
2) is disabled upon reordering (assumes no reordering)
3) only enabled in fast recovery ut not timeout recovery
RACK (Recently ACK) addresses these limitations with the notion
of time instead: a packet P1 is lost if a later packet P2 is s/acked,
as at least one round trip has passed.
Since RACK cares about the time sequence instead of the data sequence
of packets, it can detect tail drops when later retransmission is
s/acked while FACK or dupthresh can't. For reordering RACK uses a
dynamically adjusted reordering window ("reo_wnd") to reduce false
positives on ever (small) degree of reordering.
This patch implements tcp_advanced_rack() which tracks the
most recent transmission time among the packets that have been
delivered (ACKed or SACKed) in tp->rack.mstamp. This timestamp
is the key to determine which packet has been lost.
Consider an example that the sender sends six packets:
T1: P1 (lost)
T2: P2
T3: P3
T4: P4
T100: sack of P2. rack.mstamp = T2
T101: retransmit P1
T102: sack of P2,P3,P4. rack.mstamp = T4
T205: ACK of P4 since the hole is repaired. rack.mstamp = T101
We need to be careful about spurious retransmission because it may
falsely advance tp->rack.mstamp by an RTT or an RTO, causing RACK
to falsely mark all packets lost, just like a spurious timeout.
We identify spurious retransmission by the ACK's TS echo value.
If TS option is not applicable but the retransmission is acknowledged
less than min-RTT ago, it is likely to be spurious. We refrain from
using the transmission time of these spurious retransmissions.
The second half is implemented in the next patch that marks packet
lost using RACK timestamp.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
a helper to prepare the first main RACK patch.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
a helper to prepare the main RACK patch
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Remove the existing lost retransmit detection because RACK subsumes
it completely. This also stops the overloading the ack_seq field of
the skb control block.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Kathleen Nichols' algorithm for tracking the minimum RTT of a
data stream over some measurement window. It uses constant space
and constant time per update. Yet it almost always delivers
the same minimum as an implementation that has to keep all
the data in the window. The measurement window is tunable via
sysctl.net.ipv4.tcp_min_rtt_wlen with a default value of 5 minutes.
The algorithm keeps track of the best, 2nd best & 3rd best min
values, maintaining an invariant that the measurement time of
the n'th best >= n-1'th best. It also makes sure that the three
values are widely separated in the time window since that bounds
the worse case error when that data is monotonically increasing
over the window.
Upon getting a new min, we can forget everything earlier because
it has no value - the new min is less than everything else in the
window by definition and it's the most recent. So we restart fresh
on every new min and overwrites the 2nd & 3rd choices. The same
property holds for the 2nd & 3rd best.
Therefore we have to maintain two invariants to maximize the
information in the samples, one on values (1st.v <= 2nd.v <=
3rd.v) and the other on times (now-win <=1st.t <= 2nd.t <= 3rd.t <=
now). These invariants determine the structure of the code
The RTT input to the windowed filter is the minimum RTT measured
from ACK or SACK, or as the last resort from TCP timestamps.
The accessor tcp_min_rtt() returns the minimum RTT seen in the
window. ~0U indicates it is not available. The minimum is 1usec
even if the true RTT is below that.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently ca_seq_rtt_us does not use Kern's check. Fix that by
checking if any packet acked is a retransmit, for both RTT used
for RTT estimation and congestion control.
Fixes: 5b08e47ca ("tcp: prefer packet timing to TS-ECR for RTT")
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Heiko Schocher says:
====================
net, phy, smsc: add posibility to disable energy detect mode
On some boards the energy enable detect mode leads in
trouble with some switches, so make the enabling of
this mode configurable through DT.
Therefore the property "smsc,disable-energy-detect" is
introduced.
Patch 1 introduces phy-handle support for the ti,cpsw
driver. This is needed now for the smsc phy.
Patch 2 adds the disable energy mode functionality
to the smsc phy
Changes in v2:
- add comments from Florian Fainelli
- I did not change disable property name into enable
because I fear to break existing behaviour
- add smsc vendor prefix
- remove CONFIG_OF and use __maybe_unused
- introduce "phy-handle" ability into ti,cpsw
driver, so I can remove bogus:
if (!of_node && dev->parent->of_node)
of_node = dev->parent->of_node;
construct. Therefore new patch for the ti,cpsw
driver is necessary.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
On some boards the energy enable detect mode leads in
trouble with some switches, so make the enabling of
this mode configurable through DT.
Signed-off-by: Heiko Schocher <hs@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
add the ability to parse "phy-handle". This
is needed for phys, which have a DT node, and
need to parse DT properties.
Signed-off-by: Heiko Schocher <hs@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If a gigabit ethernet PHY is connected to a fast ethernet MAC,
then it can detect 1000 support from the partner but not use it.
This results in a forced speed of 1000 and RX/TX failure.
Check for 1000BASE-T support and then check the advertisement
configuration before setting the MAC speed to 1000mbit.
Signed-off-by: Simon Arlott <simon@fire.lp0.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2015-10-19
This series contains updates to i40e and i40evf only.
Kiran adds a spinlock around code accessing VSI MAC filter list to
ensure that we are synchronizing access to the filter list, otherwise
we can end up with multiple accesses at the same time which can cause
the VSI MAC filter list to get in an unstable or corrupted state.
Jesse fixes overlong BIT defines, where the RSS enabling call were
mistakenly missed. Also fixes a bug where the enable function was
enabling the interrupt twice while trying to update the two interrupt
throttle rate thresholds for Rx and Tx, while refactoring the IRQ
enable function to simplify reading the flow. Addressed the high
CPU utilization of some small streaming workloads that the driver should
reduce CPU in.
Anjali fixes two X722 issues with respect to EEPROM checksum verify and
reading NVM version info. Fixed where a mask value was accidentally
replaced with a bit mask causing Flow Director sideband to be broken.
Alex Duyck fixes areas of the drivers which run from hard interrupt
context or with interrupts already disabled in netpoll, so use
napi_schedule_irqoff() instead of napi_schedule().
Mitch fixes the VF drivers to not easily give up when it is not able
to communicate with the PF driver.
Carolyn fixes a problem where our tools MAC loopback test, after driver
unbind would fail because the hardware was configured for multiqueue and
unbind operation did not clear this configuration. Also fixed a issue
where the NVMUpdate tool gets bad data from the PHY when using the PHY
NVM feature because of contention on the MDIO interface from getting
PHY capability calls from the driver during regular operations.
Catherine fixed an issue where we were checking if autoneg was allowed
to change before checking if autoneg was changing, these checks need to
be in the reverse order.
Jean Sacren fixes up an function header comment to align the kernel-docs
with the actual code.
v2: Cleaned up the use of spin_is_locked() in patch 1 based on feedback
from David Miller, since it always evaluates to zero on uni-processor
builds
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The hci_conn objects don't have a dedicated lock themselves but rely
on the caller to hold the hci_dev lock for most types of access. The
hci_conn_timeout() function has so far sent certain HCI commands based
on the hci_conn state which has been possible without holding the
hci_dev lock.
The recent changes to do LE scanning before connect attempts added
even more operations to hci_conn and hci_dev from hci_conn_timeout,
thereby exposing potential race conditions with the hci_dev and
hci_conn states.
As an example of such a race, here there's a timeout but an
l2cap_sock_connect() call manages to race with the cleanup routine:
[Oct21 08:14] l2cap_chan_timeout: chan ee4b12c0 state BT_CONNECT
[ +0.000004] l2cap_chan_close: chan ee4b12c0 state BT_CONNECT
[ +0.000002] l2cap_chan_del: chan ee4b12c0, conn f3141580, err 111, state BT_CONNECT
[ +0.000002] l2cap_sock_teardown_cb: chan ee4b12c0 state BT_CONNECT
[ +0.000005] l2cap_chan_put: chan ee4b12c0 orig refcnt 4
[ +0.000010] hci_conn_drop: hcon f53d56e0 orig refcnt 1
[ +0.000013] l2cap_chan_put: chan ee4b12c0 orig refcnt 3
[ +0.000063] hci_conn_timeout: hcon f53d56e0 state BT_CONNECT
[ +0.000049] hci_conn_params_del: addr ee:0d:30:09:53:1f (type 1)
[ +0.000002] hci_chan_list_flush: hcon f53d56e0
[ +0.000001] hci_chan_del: hci0 hcon f53d56e0 chan f4e7ccc0
[ +0.004528] l2cap_sock_create: sock e708fc00
[ +0.000023] l2cap_chan_create: chan ee4b1770
[ +0.000001] l2cap_chan_hold: chan ee4b1770 orig refcnt 1
[ +0.000002] l2cap_sock_init: sk ee4b3390
[ +0.000029] l2cap_sock_bind: sk ee4b3390
[ +0.000010] l2cap_sock_setsockopt: sk ee4b3390
[ +0.000037] l2cap_sock_connect: sk ee4b3390
[ +0.000002] l2cap_chan_connect: 00:02:72:d9:e5:8b -> ee:0d:30:09:53:1f (type 2) psm 0x00
[ +0.000002] hci_get_route: 00:02:72:d9:e5:8b -> ee:0d:30:09:53:1f
[ +0.000001] hci_dev_hold: hci0 orig refcnt 8
[ +0.000003] hci_conn_hold: hcon f53d56e0 orig refcnt 0
Above the l2cap_chan_connect() shouldn't have been able to reach the
hci_conn f53d56e0 anymore but since hci_conn_timeout didn't do proper
locking that's not the case. The end result is a reference to hci_conn
that's not in the conn_hash list, resulting in list corruption when
trying to remove it later:
[Oct21 08:15] l2cap_chan_timeout: chan ee4b1770 state BT_CONNECT
[ +0.000004] l2cap_chan_close: chan ee4b1770 state BT_CONNECT
[ +0.000003] l2cap_chan_del: chan ee4b1770, conn f3141580, err 111, state BT_CONNECT
[ +0.000001] l2cap_sock_teardown_cb: chan ee4b1770 state BT_CONNECT
[ +0.000005] l2cap_chan_put: chan ee4b1770 orig refcnt 4
[ +0.000002] hci_conn_drop: hcon f53d56e0 orig refcnt 1
[ +0.000015] l2cap_chan_put: chan ee4b1770 orig refcnt 3
[ +0.000038] hci_conn_timeout: hcon f53d56e0 state BT_CONNECT
[ +0.000003] hci_chan_list_flush: hcon f53d56e0
[ +0.000002] hci_conn_hash_del: hci0 hcon f53d56e0
[ +0.000001] ------------[ cut here ]------------
[ +0.000461] WARNING: CPU: 0 PID: 1782 at lib/list_debug.c:56 __list_del_entry+0x3f/0x71()
[ +0.000839] list_del corruption, f53d56e0->prev is LIST_POISON2 (00000200)
The necessary fix is unfortunately more complicated than just adding
hci_dev_lock/unlock calls to the hci_conn_timeout() call path.
Particularly, the hci_conn_del() API, which expects the hci_dev lock to
be held, performs a cancel_delayed_work_sync(&hcon->disc_work) which
would lead to a deadlock if the hci_conn_timeout() call path tries to
acquire the same lock.
This patch solves the problem by deferring the cleanup work to a
separate work callback. To protect against the hci_dev or hci_conn
going away meanwhile temporary references are taken with the help of
hci_dev_hold() and hci_conn_get().
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: stable@vger.kernel.org # 4.3
|
|
The register value to enable gpio2 was incorrect. So fix it.
Signed-off-by: Bard Liao <bardliao@realtek.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
This fixes a bug where it is possible for an off-line CPU to fail to go
into a low-power state (nap/sleep/winkle), and to become unresponsive to
requests from the KVM subsystem to wake up and run a VCPU. What can
happen is that a maskable interrupt of some kind (external, decrementer,
hypervisor doorbell, or HMI) after we have called local_irq_disable() at
the beginning of pnv_smp_cpu_kill_self() and before interrupts are
hard-disabled inside power7_nap/sleep/winkle(). In this situation, the
pending event is marked in the irq_happened flag in the PACA. This
pending event prevents power7_nap/sleep/winkle from going to the
requested low-power state; instead they return immediately. We don't
deal with any of these pending event flags in the off-line loop in
pnv_smp_cpu_kill_self() because power7_nap et al. return 0 in this case,
so we will have srr1 == 0, and none of the processing to clear
interrupts or doorbells will be done.
Usually, the most obvious symptom of this is that a KVM guest will fail
with a console message saying "KVM: couldn't grab cpu N".
This fixes the problem by making sure we handle the irq_happened flags
properly. First, we hard-disable before the off-line loop. Once we have
hard-disabled, the irq_happened flags can't change underneath us. We
unconditionally clear the DEC and HMI flags: there is no processing of
timer interrupts while off-line, and the necessary HMI processing is all
done in lower-level code. We leave the EE and DBELL flags alone for the
first iteration of the loop, so that we won't fail to respond to a
split-core request that came in just before hard-disabling. Within the
loop, we handle external interrupts if the EE bit is set in irq_happened
as well as if the low-power state was interrupted by an external
interrupt. (We don't need to do the msgclr for a pending doorbell in
irq_happened, because doorbells are edge-triggered and don't remain
pending in hardware.) Then we clear both the EE and DBELL flags, and
once clear, they cannot be set again (until this CPU comes online again,
that is).
This also fixes the debug check to not be done when we just ran a KVM
guest or when the sleep didn't happen because of a pending event in
irq_happened.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
POWER8"
This reverts commit 9678cdaae939 ("Use the POWER8 Micro Partition
Prefetch Engine in KVM HV on POWER8") because the original commit had
multiple, partly self-cancelling bugs, that could cause occasional
memory corruption.
In fact the logmpp instruction was incorrectly using register r0 as the
source of the buffer address and operation code, and depending on what
was in r0, it would either do nothing or corrupt the 64k page pointed to
by r0.
The logmpp instruction encoding and the operation code definitions could
be corrected, but then there is the problem that there is no clearly
defined way to know when the hardware has finished writing to the
buffer.
The original commit attempted to work around this by aborting the
write-out before starting the prefetch, but this is ineffective in the
case where the virtual core is now executing on a different physical
core from the one where the write-out was initiated.
These problems plus advice from the hardware designers not to use the
function (since the measured performance improvement from using the
feature was actually mostly negative), mean that reverting the code is
the best option.
Fixes: 9678cdaae939 ("Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8")
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
During device assignment/deassignment the flags in the DTE
get lost, which might cause spurious faults, for example
when the device tries to access the system management range.
Fix this by not clearing the flags with the rest of the DTE.
Reported-by: G. Richard Bellamy <rbellamy@pteradigm.com>
Tested-by: G. Richard Bellamy <rbellamy@pteradigm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Merge the AMD and Intel sections and generalize the file patterns.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1445334889-300-7-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
... and save us the stub.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1445334889-300-6-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
We have the MAINTAINERS file for that. Also, Andreas doesn't
have the time for this work anymore.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andreas Herrmann <herrmann.der.user@googlemail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1445334889-300-5-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Remove the remaining module functionality leftovers. Make
"dis_ucode_ldr" an early_param and make it static again. Drop
module aliases, autoloading table, description, etc.
Bump version number, while at it.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1445334889-300-4-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Merge the early loader functionality into the driver proper. The
diff is huge but logically, it is simply moving code from the
_early.c files into the main driver.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1445334889-300-3-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Make CONFIG_MICROCODE a bool. It was practically a bool already anyway,
since early loader was forcing it to =y.
Regardless, there's no real reason to have something be a module which
gets built-in on the majority of installations out there. And its not
like there's noticeable change in functionality - we still can load late
microcode - just the module glue disappears.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1445334889-300-2-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Standardize on bool instead of an inconsistent mixture of u8 and plain 'int'.
Also use u32 or 'unsigned int' instead of 'unsigned long' when a 32-bit type
suffices, generating slightly better code on x86-64.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/5624E3A002000078000AC49A@prv-mh.provo.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Move the generic help text explaining each CPU type and what to
select under the "Processor Family" prompt and not under the
M486 option. Also, amend it with the missing options.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445244077-25120-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The per CPU thermal vector init code checks if the thermal
vector is already installed and complains and bails out if it
is.
This happens after kexec, as kernel shut down does not clear the
thermal vector APIC register.
This causes two problems:
1. So we always do not fully initialize thermal reports after
kexec. The CPU is still likely initialized, as the previous
kernel should have done it. But we don't set up the software
pointer to the thermal vector, so reporting may end up with a
unknown thermal interrupt message.
2. Also it complains for every logical CPU, even though the
value is actually derived from BP only.
The problem is that we end up with one message per CPU, so on
larger systems it becomes very noisy and messes up the otherwise
nicely formatted CPU bootup numbers in the kernel log.
Just remove the check. I checked the code and there's no valid
code paths where the thermal init code for a CPU could be called
multiple times.
Why the kernel does not clean up this value on shutdown:
The thermal monitoring is controlled per logical CPU thread.
Normal shutdown code is just running on one CPU. To disable it
we would need a broadcast NMI to all CPUs on shut down. That's
overkill for this. So we just ignore it after kexec.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1445246268-26285-9-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
It is helpful when the crashkernel cmdline parsing routines
actually say which character is the unrecognized one. Make them
do so.
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Young <dyoung@redhat.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Baoquan He <bhe@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: jerry_hoemann@hp.com
Link: http://lkml.kernel.org/r/1445246268-26285-8-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
memblock_reserve() can fail but the crashkernel reservation code
doesn't check that and this can lead the user into believing
that the crashkernel region was actually reserved. Make sure we
check that return value and we exit early with a failure message
in the error case.
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Young <dyoung@redhat.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: WANG Chao <chaowang@redhat.com>
Cc: jerry_hoemann@hp.com
Link: http://lkml.kernel.org/r/1445246268-26285-7-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
* Remove unused auto_set variable
* Cleanup local function variable declarations
* Reformat printk string and use pr_info()
No functionality change.
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Young <dyoung@redhat.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: WANG Chao <chaowang@redhat.com>
Cc: jerry_hoemann@hp.com
Link: http://lkml.kernel.org/r/1445246268-26285-6-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Use a macro instead. No functionality change.
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Young <dyoung@redhat.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: WANG Chao <chaowang@redhat.com>
Cc: jerry_hoemann@hp.com
Link: http://lkml.kernel.org/r/1445246268-26285-5-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
* Shorten variable names
* Realign code, space out for better readability
No code changed:
# arch/x86/kernel/setup.o:
text data bss dec hex filename
4543 3096 69904 77543 12ee7 setup.o.before
4543 3096 69904 77543 12ee7 setup.o.after
md5:
8a1b7c6738a553ca207b56bd84a8f359 setup.o.before.asm
8a1b7c6738a553ca207b56bd84a8f359 setup.o.after.asm
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Young <dyoung@redhat.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: WANG Chao <chaowang@redhat.com>
Cc: jerry_hoemann@hp.com
Link: http://lkml.kernel.org/r/1445246268-26285-4-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
This function doesn't give us the "Node ID" as the function name
suggests. Rather, it receives a PCI device as argument, checks
the available F3 PCI device IDs in the system and returns the
index of the matching Bus/Device IDs.
Rename it to amd_pci_dev_to_node_id().
No functional change is introduced.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1445246268-26285-3-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
People reported that when allocating crashkernel memory using
the ",high" and ",low" syntax, there were cases where the
reservation of the high portion succeeds but the reservation of
the low portion fails.
Then kexec can load the kdump kernel successfully, but booting
the kdump kernel fails as there's no low memory.
The low memory allocation for the kdump kernel can fail on large
systems for a couple of reasons. For example, the manually
specified crashkernel low memory can be too large and thus no
adequate memblock region would be found.
Therefore, we try to reserve low memory for the crash kernel
*after* the high memory portion has been allocated. If that
fails, we free crashkernel high memory too and return. The user
can then take measures accordingly.
Tested-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Baoquan He <bhe@redhat.com>
[ Massage text. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Young <dyoung@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: WANG Chao <chaowang@redhat.com>
Cc: jerry_hoemann@hp.com
Cc: yinghai@kernel.org
Link: http://lkml.kernel.org/r/1445246268-26285-2-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Suppose that we got a data crc error, and it triggers the mmc_reset.
mmc_reset will call mmc_send_status to see if HW reset was supported.
before issue CMD13, it will do retune, and if EMMC was in HS400 mode,
it will reduce frequency to 52Mhz firstly, then results in card init
was doing at 52Mhz.
The mmc_send_status was originally only done for mmc_test, should drop
it. And, rename the "eMMC hardware reset" to "Reset test", as we would
also be able to use the test for SD-cards.
Signed-off-by: Chaotian Jing <chaotian.jing@mediatek.com>
Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: bd11e8bd03ca ("mmc: core: Flag re-tuning is needed on CRC errors")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Group station statistics by where they're (mostly) updated
(TX, RX and TX-status) and group them into sub-structs of
the struct sta_info.
Also rename the variables since the grouping now makes it
obvious where they belong.
This makes it easier to identify where the statistics are
updated in the code, and thus easier to think about them.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
There's little point in keeping (and even sending to userspace)
the beacon_loss_count value per station, since it can only apply
to the AP on a managed-mode connection. Move the value to ifmgd,
advertise it only in managed mode, and remove it from ethtool as
it's available through better interfaces.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This file only feeds a debugfs file that isn't very useful, so remove
it. If necessary, we can add other ways to get this information, for
example in the NL80211_CMD_PROBE_CLIENT response.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Major changes:
ath10k
* add board 2 API support for automatically choosing correct board file
* data path optimisations
* disable PCI power save for qca988x and QCA99x0 due to interop reasons
wil6210
* BlockAckReq support
* firmware crashdump using devcoredump
* capture all frames with sniffer
|
|
This is a new Broadcom chip and we should be able to recognize it.
Signed-off-by: Eric Caruso <ejcaruso@google.com>
Acked-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|
|
Use the netif_carrier api to correctly set carrier state on the
different modes.
Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Signed-off-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|
|
Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|
|
Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|
|
With this feature it becomes possible to request a station
assoc list.
Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|
|
Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|
|
Txrate and rxrate in get_station got assigned first with value
in kbps and then divided by 100 to get it in 100kbps unit. The
problem with that is that type of rate is u16 which resulted
in incorrect values for high data rate values.
Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|
|
When module param p2pon is used a p2p device is created at init.
This patch reworks how this is done by using the same method as
for a dynamically (by user space) created p2p device.
Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Signed-off-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|
|
Removing a comment that was only useful during the review of
the change that introduced it and which should never have been
submitted.
Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|
|
In brcmfmac the module parameter "firmware_path" is used as an
alternative relative path under the search path used by firmware_class
or ueventhelper. Rename the parameter to alternative_fw_path to avoid
confusion.
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Franky Lin <frankyl@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|