Age | Commit message (Collapse) | Author |
|
Back in linux-3.13 (commit b0983d3c9b13 ("tcp: fix dynamic right sizing"))
I addressed the pressing issues we had with receiver autotuning.
But DRS suffers from extra latencies caused by rcv_rtt_est.rtt_us
drifts. One common problem happens during slow start, since the
apparent RTT measured by the receiver can be inflated by ~50%,
at the end of one packet train.
Also, a single drop can delay read() calls by one RTT, meaning
tcp_rcv_space_adjust() can be called one RTT too late.
By replacing the tri-modal heuristic with a continuous function,
we can offset the effects of not growing 'at the optimal time'.
The curve of the function matches prior behavior if the space
increased by 25% and 50% exactly.
Cost of added multiply/divide is small, considering a TCP flow
typically would run this part of the code few times in its life.
I tested this patch with 100 ms RTT / 1% loss link, 100 runs
of (netperf -l 5), and got an average throughput of 4600 Mbit
instead of 1700 Mbit.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Wei Wang <weiwan@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When using large tcp_rmem[2] values (I did tests with 500 MB),
I noticed overflows while computing rcvwin.
Lets fix this before the following patch.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Wei Wang <weiwan@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While rcvbuf is properly clamped by tcp_rmem[2], rcvwin
is left to a potentially too big value.
It has no serious effect, since :
1) tcp_grow_window() has very strict checks.
2) window_clamp can be mangled by user space to any value anyway.
tcp_init_buffer_space() and companions use tcp_full_space(),
we use tcp_win_from_space() to avoid reloading sk->sk_rcvbuf
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Wei Wang <weiwan@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We need to put the policies when re-using the pcpu xdst entry, else
this leaks the reference.
Fixes: ec30d78c14a813db39a647b6a348b428 ("xfrm: add xdst pcpu cache")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Now in sctp_setsockopt_reset_streams, it only does the check
optlen < sizeof(*params) for optlen. But it's not enough, as
params->srs_number_streams should also match optlen.
If the streams in params->srs_stream_list are less than stream
nums in params->srs_number_streams, later when dereferencing
the stream list, it could cause a slab-out-of-bounds crash, as
reported by syzbot.
This patch is to fix it by also checking the stream numbers in
sctp_setsockopt_reset_streams to make sure at least it's not
greater than the streams in the list.
Fixes: 7f9d68ac944e ("sctp: implement sender-side procedures for SSN Reset Request Parameter")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
inet->hdrincl is racy, and could lead to uninitialized stack pointer
usage, so its value should be read only once.
Fixes: c008ba5bdc9f ("ipv4: Avoid reading user iov twice after raw_probe_proto_opt")
Signed-off-by: Mohamed Ghannam <simo.ghannam@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
'fixes.2017.12.11a', 'srbd.2017.12.05a' and 'torture.2017.12.11a' into HEAD
cond_resched.2017.12.04a: Convert cond_resched_rcu_qs() to cond_resched()
dyntick.2017.11.28a: Make RCU dynticks handle interrupts from NMI
fixes.2017.12.11a: Miscellaneous fixes
srbd.2017.12.05a: Remove now-redundant smp_read_barrier_depends()
torture.2017.12.11a: Torture-testing update
|
|
Currently, a nlmon link inside a child namespace can observe systemwide
netlink activity. Filter the traffic so that nlmon can only sniff
netlink messages from its own netns.
Test case:
vpnns -- bash -c "ip link add nlmon0 type nlmon; \
ip link set nlmon0 up; \
tcpdump -i nlmon0 -q -w /tmp/nlmon.pcap -U" &
sudo ip xfrm state add src 10.1.1.1 dst 10.1.1.2 proto esp \
spi 0x1 mode transport \
auth sha1 0x6162633132330000000000000000000000000000 \
enc aes 0x00000000000000000000000000000000
grep --binary abc123 /tmp/nlmon.pcap
Signed-off-by: Kevin Cernekee <cernekee@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Unordered idata process is more complicated than unordered data:
- It has to add mid into sctp_stream_out to save the next mid value,
which is separated from ordered idata's.
- To support pd for unordered idata, another mid and pd_mode need to
be added to save the message id and pd state in sctp_stream_in.
- To make unordered idata reasm easier, it adds a new event queue
to save frags for idata.
The patch mostly adds the samilar reasm functions for unordered idata
as ordered idata's, and also adjusts some other codes on assign_mid,
abort_pd and ulpevent_data for idata.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
abort_pd is added as a member of sctp_stream_interleave, used to abort
partial delivery for data or idata, called in sctp_cmd_assoc_failed.
Since stream interleave allows to do partial delivery for each stream
at the same time, sctp_intl_abort_pd for idata would be very different
from the old function sctp_ulpq_abort_pd for data.
Note that sctp_ulpevent_make_pdapi will support per stream in this
patch by adding pdapi_stream and pdapi_seq in sctp_pdapi_event, as
described in section 6.1.7 of RFC6458.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
start_pd is added as a member of sctp_stream_interleave, used to
do partial_delivery for data or idata when datalen >= asoc->rwnd
in sctp_eat_data. The codes have been done in last patches, but
they need to be extracted into start_pd, so that it could be used
for SCTP_CMD_PART_DELIVER cmd as well.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
renege_events is added as a member of sctp_stream_interleave, used to
renege some old data or idata in reasm or lobby queue properly to free
some memory for the new data when there's memory stress.
It defines sctp_renege_events for idata, and leaves sctp_ulpq_renege
as it is for data.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
enqueue_event is added as a member of sctp_stream_interleave, used to
enqueue either data, idata or notification events into user socket rx
queue.
It replaces sctp_ulpq_tail_event used in the other places with
enqueue_event.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ulpevent_data is added as a member of sctp_stream_interleave, used to
do the most process in ulpq, including to convert data or idata chunk
to event, reasm them in reasm queue and put them in lobby queue in
right order, and deliver them up to user sk rx queue.
This procedure is described in section 2.2.3 of RFC8260.
It adds most functions for idata here to do the similar process as
the old functions for data. But since the details are very different
between them, the old functions can not be reused for idata.
event->ssn and event->ppid settings are moved to ulpevent_data from
sctp_ulpevent_make_rcvmsg, so that sctp_ulpevent_make_rcvmsg could
work for both data and idata.
Note that mid is added in sctp_ulpevent for idata, __packed has to
be used for defining sctp_ulpevent, or it would exceeds the skb cb
that saves a sctp_ulpevent variable for ulp layer process.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
validate_data is added as a member of sctp_stream_interleave, used
to validate ssn/chunk type for data or mid (message id)/chunk type
for idata, called in sctp_eat_data.
If this check fails, an abort packet will be sent, as said in
section 2.2.3 of RFC8260.
It also adds the process for idata in rx path. As Marcelo pointed
out, there's no need to add event table for idata, but just share
chunk_event_table with data's. It would drop data chunk for idata
and drop idata chunk for data by calling validate_data in
sctp_eat_data.
As last patch did, it also replaces sizeof(struct sctp_data_chunk)
with sctp_datachk_len for rx path.
After this patch, the idata can be accepted and delivered to ulp
layer.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
assign_number is added as a member of sctp_stream_interleave, used
to assign ssn for data or mid (message id) for idata, called in
sctp_packet_append_data. sctp_chunk_assign_ssn is left as it is,
and sctp_chunk_assign_mid is added for sctp_stream_interleave_1.
This procedure is described in section 2.2.2 of RFC8260.
All sizeof(struct sctp_data_chunk) in tx path is replaced with
sctp_datachk_len, to make it right for idata as well. And also
adjust sctp_chunk_is_data for SCTP_CID_I_DATA.
After this patch, idata can be built and sent in tx path.
Note that if sp strm_interleave is set, it has to wait_connect in
sctp_sendmsg, as asoc intl_enable need to be known after 4 shake-
hands, to decide if it should use data or idata later. data and
idata can't be mixed to send in one asoc.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
To avoid hundreds of checks for the different process on I-DATA chunk,
struct sctp_stream_interleave is defined as a group of functions used
to replace the codes in some place where it needs to do different job
according to if the asoc intl_enabled is set.
With these ops, it only needs to initialize asoc->stream.si with
sctp_stream_interleave_0 for normal data if asoc intl_enable is 0,
or sctp_stream_interleave_1 for idata if asoc intl_enable is set in
sctp_stream_init.
After that, the members in asoc->stream.si can be used directly in
some special places without checking asoc intl_enable.
make_datafrag is the first member for sctp_stream_interleave, it's
used to make data or idata frags, called in sctp_datamsg_from_user.
The old function sctp_make_datafrag_empty needs to be adjust some
to fit in this ops.
Note that as idata and data chunks have different length, it also
defines data_chunk_len for sctp_stream_interleave to describe the
chunk size.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sctp_idatahdr and sctp_idata_chunk are used to define and parse
I-DATA chunk format, and sctp_make_idata is a function to build
the chunk.
The I-DATA Chunk Format is defined in section 2.1 of RFC8260.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
asoc intl_enable will be set when local sp strm_interleave is set
and there's I-DATA chunk in init and init_ack extensions, as said
in section 2.2.1 of RFC8260.
asoc intl_enable indicates all data will be sent as I-DATA chunks.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds intl_enable in asoc and netns, and strm_interleave in
sctp_sock to indicate if stream interleave is enabled and supported.
netns intl_enable would be set via procfs, but that is not added yet
until all stream interleave codes are completely implemented; asoc
intl_enable will be set when doing 4-shakehands.
sp strm_interleave can be set by sockopt SCTP_INTERLEAVING_SUPPORTED
which is also added in this patch. This socket option is defined in
section 4.3.1 of RFC8260.
Note that strm_interleave can only be set by sockopt when both netns
intl_enable and sp frag_interleave are set.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After commit 4d3a57f23dec ("netfilter: conntrack: do not enable connection
tracking unless needed") conntrack is disabled by default unless some
module explicitly declares dependency in particular network namespace.
Fixes: a357b3f80bc8 ("netfilter: nat: add dependencies on conntrack module")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Both netlink_add_tap() and netlink_remove_tap() are
called in process context, no need to bother spinlock.
Note, in fact, currently we always hold RTNL when calling
these two functions, so we don't need any other lock at
all, but keeping this lock doesn't harm anything.
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
nlmon device is not supposed to capture netlink events from
other netns, so instead of filtering events, we can simply
make netlink tap itself per netns.
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Kevin Cernekee <cernekee@chromium.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Most callers of rhashtable_walk_start don't care about a resize event
which is indicated by a return value of -EAGAIN. So calls to
rhashtable_walk_start are wrapped wih code to ignore -EAGAIN. Something
like this is common:
ret = rhashtable_walk_start(rhiter);
if (ret && ret != -EAGAIN)
goto out;
Since zero and -EAGAIN are the only possible return values from the
function this check is pointless. The condition never evaluates to true.
This patch changes rhashtable_walk_start to return void. This simplifies
code for the callers that ignore -EAGAIN. For the few cases where the
caller cares about the resize event, particularly where the table can be
walked in mulitple parts for netlink or seq file dump, the function
rhashtable_walk_start_check has been added that returns -EAGAIN on a
resize event.
Signed-off-by: Tom Herbert <tom@quantonium.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fixes: 46e6b992c250 ("rtnetlink: allow GSO maximums to be set on device creation")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Three fixes:
* for certificate C file generation, don't use hexdump as it's
not always installed by default, use pure posix instead (od/sed)
* for certificate C file generation, don't write the file if
anything fails, so the build abort will not cause a bad build
upon a second attempt
* fix locking in ieee80211_sta_tear_down_BA_sessions() which had
been causing lots of locking warnings
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add missing netlink attribute policy.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This adds airtime accounting and scheduling to the mac80211 TXQ
scheduler. A new hardware flag, AIRTIME_ACCOUNTING, is added that
drivers can set if they support reporting airtime usage of
transmissions. When this flag is set, mac80211 will expect the actual
airtime usage to be reported in the tx_time and rx_time fields of the
respective status structs.
When airtime information is present, mac80211 will schedule TXQs
(through ieee80211_next_txq()) in a way that enforces airtime fairness
between active stations. This scheduling works the same way as the ath9k
in-driver airtime fairness scheduling.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This adds an API to mac80211 to handle scheduling of TXQs and changes the
interface between driver and mac80211 for TXQ handling as follows:
- The wake_tx_queue callback interface no longer includes the TXQ. Instead,
the driver is expected to retrieve that from ieee80211_next_txq()
- Two new mac80211 functions are added: ieee80211_next_txq() and
ieee80211_schedule_txq(). The former returns the next TXQ that should be
scheduled, and is how the driver gets a queue to pull packets from. The
latter is called internally by mac80211 to start scheduling a queue, and
the driver is supposed to call it to re-schedule the TXQ after it is
finished pulling packets from it (unless the queue emptied).
The ath9k and ath10k drivers are changed to use the new API.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Evidently I introduced a locking bug in my change here,
the nla_put_failure sometimes needs to unlock. Fix it.
Fixes: 44905265bc15 ("nl80211: don't expose wdev->ssid for most interfaces")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Add a key flag to indicates that the device only needs
MIC space and not a real MIC.
In such cases, keep the MIC zeroed for ease of debug.
Signed-off-by: David Spinadel <david.spinadel@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
While the change between 802.11-2012 and 802.11-2016 to move from
requiring APs to set the two top bits to now requiring them to be
cleared was apparently unintentional and will be fixed, clients
should either way assume that the top five bits are reserved and
ignore them.
Implement that in mac80211.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This mostly serves as an example for how to add error strings
and erroneous attribute pointers.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Both cfg80211_rx_mgmt and cfg80211_report_obss_beacon functions send
reports to userspace using NL80211_ATTR_RX_SIGNAL_DBM attribute w/o
any processing of their input signal values. Which means that in
order to match userspace tools expectations, input signal values
for those functions are supposed to be in dBm units.
This patch cleans up comments, variable names, and trace reports
for those functions, replacing confusing 'mBm' by 'dBm'.
Signed-off-by: Sergey Matyukevich <sergey.matyukevich.os@quantenna.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Add support for drivers that implement static WEP internally for IBSS.
Add the WEP keys to the IBSS params struct, that will allow the driver
to use the keys in the join flow, and not only after the connection.
Signed-off-by: Tova Mussai <tova.mussai@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
In the ieee80211_setup_sdata() we check if the interface type is valid
and, if not, call BUG(). This should never happen, but if there is
something wrong with the code, it will not be caught until the bug
happens when an interface is being set up. Calling BUG() is too
extreme for this and a WARN_ON() would be better used instead. Change
that.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Currently the restart flow enables RX back, and then proceeds
to tear down RX and TX aggregations.
The TX aggregation tear down calls synchronize_net(), which
waits for packet receiving to be done.
This is done for every session, while RX processing is already
active, and in some reproductions it takes up to 3 seconds.
Add a call once in the restart_work, before we have traffic
active again, and remove the subsequent calls when tearing
down the aggregation.
This requires to move down the code that turns off the
reconfig flag in order to be able to test it in
_ieee80211_stop_tx_ba_session().
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The 2016 version of the spec is more generic about when the
AP should update the power management state of the peer:
the AP shall update the state based on any management or
data frames. This means that even non-bufferable management
frames should be looked at to update to maintain the power
management state of the peer.
This can avoid problematic cases for example if a station
disappears while being asleep and then re-appears. The AP
would remember it as in power save, but the Authentication
frame couldn't be used to set the peer as awake again.
Note that this issues wasn't really critical since at some
point (after the association) we would have removed the
station and created another one with all the states cleared.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The function is only used with the file, so make it static.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Allow drivers to set the buffer station extended capability
for TDLS links, with a new hardware flag indicating this.
Signed-off-by: Yingying Tang <yintang@qti.qualcomm.com>
[change commit log/documentation wording]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
There's no need to re-lookup the data structures now that
we actually get them immediately with from_timer(), just
avoid that. The struct has to be valid anyway, otherwise
the timer object itself would no longer be valid, and we
can't have a different version of the struct since only a
single session per TID is permitted.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Notice that in some cases I replaced "fall through on else" and
"otherwise fall through" comments with just a "fall through" comment,
which is what GCC is expecting to find.
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Due to overlap between
commit 1281103770e9 ("mac80211: Simplify locking in ieee80211_sta_tear_down_BA_sessions()")
and the way that Luca modified
commit 72e2c3438ba3 ("mac80211: tear down RX aggregations first")
when sending it upstream from Intel's internal tree, we get
the following warning:
WARNING: CPU: 0 PID: 5472 at net/mac80211/agg-tx.c:315 ___ieee80211_stop_tx_ba_session+0x158/0x1f0
since there's no appropriate locking around the call to
___ieee80211_stop_tx_ba_session; Sara's original just had
a call to the locked __ieee80211_stop_tx_ba_session (one
less underscore) but it looks like Luca modified both of
the calls when fixing it up for upstream, leading to the
problem at hand.
Move the locking appropriately to fix this problem.
Reported-by: Kalle Valo <kvalo@codeaurora.org>
Reported-by: Pavel Machek <pavel@ucw.cz>
Tested-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Conflict was two parallel additions of include files to sch_generic.c,
no biggie.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Netlink device already allows changing GSO sizes with
ip set command. The part that is missing is allowing overriding
GSO settings on device creation.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RACK skips an ACK unless it advances the most recently delivered
TX timestamp (rack.mstamp). Since RACK also uses the most recent
RTT to decide if a packet is lost, RACK should still run the
loss detection whenever the most recent RTT changes. For example,
an ACK that does not advance the timestamp but triggers the cwnd
undo due to reordering, would then use the most recent (higher)
RTT measurement to detect further losses.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Priyaranjan Jha <priyarjha@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RACK should mark a packet lost when remaining wait time is zero.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Priyaranjan Jha <priyarjha@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When sender detects spurious retransmission, all packets
marked lost are remarked to be in-flight. However some may
be considered lost based on its timestamps in RACK. This patch
forces RACK to re-evaluate, which may be skipped previously if
the ACK does not advance RACK timestamp.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Priyaranjan Jha <priyarjha@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RACK does not test the loss recovery state correctly to compute
the reordering window. It assumes if lost_out is zero then TCP is
not in loss recovery. But it can be zero during recovery before
calling tcp_rack_detect_loss(): when an ACK acknowledges all
packets marked lost before receiving this ACK, but has not yet
to discover new ones by tcp_rack_detect_loss(). The fix is to
simply test the congestion state directly.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Priyaranjan Jha <priyarjha@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since the block is freed with last chain being put, once we reach the
end of iteration of list_for_each_entry_safe, the block may be
already freed. I'm hitting this only by creating and deleting clsact:
[ 202.171952] ==================================================================
[ 202.180182] BUG: KASAN: use-after-free in tcf_block_put_ext+0x240/0x390
[ 202.187590] Read of size 8 at addr ffff880225539a80 by task tc/796
[ 202.194508]
[ 202.196185] CPU: 0 PID: 796 Comm: tc Not tainted 4.15.0-rc2jiri+ #5
[ 202.203200] Hardware name: Mellanox Technologies Ltd. "MSN2100-CB2F"/"SA001017", BIOS 5.6.5 06/07/2016
[ 202.213613] Call Trace:
[ 202.216369] dump_stack+0xda/0x169
[ 202.220192] ? dma_virt_map_sg+0x147/0x147
[ 202.224790] ? show_regs_print_info+0x54/0x54
[ 202.229691] ? tcf_chain_destroy+0x1dc/0x250
[ 202.234494] print_address_description+0x83/0x3d0
[ 202.239781] ? tcf_block_put_ext+0x240/0x390
[ 202.244575] kasan_report+0x1ba/0x460
[ 202.248707] ? tcf_block_put_ext+0x240/0x390
[ 202.253518] tcf_block_put_ext+0x240/0x390
[ 202.258117] ? tcf_chain_flush+0x290/0x290
[ 202.262708] ? qdisc_hash_del+0x82/0x1a0
[ 202.267111] ? qdisc_hash_add+0x50/0x50
[ 202.271411] ? __lock_is_held+0x5f/0x1a0
[ 202.275843] clsact_destroy+0x3d/0x80 [sch_ingress]
[ 202.281323] qdisc_destroy+0xcb/0x240
[ 202.285445] qdisc_graft+0x216/0x7b0
[ 202.289497] tc_get_qdisc+0x260/0x560
Fix this by holding the block also by chain 0 and put chain 0
explicitly, out of the list_for_each_entry_safe loop at the very
end of tcf_block_put_ext.
Fixes: efbf78973978 ("net_sched: get rid of rcu_barrier() in tcf_block_put_ext()")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|