Age | Commit message (Collapse) | Author |
|
There is now a fallback to a choose_arg index of -1 if there isn't
a pool-specific choose_arg set. If you create a per-pool weight-set,
that works for that pool. Otherwise we try the compat/default one. If
that doesn't exist either, then we use the normal CRUSH weights.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
|
|
'asoc/fix/msm8916', 'asoc/fix/multi-pcm', 'asoc/fix/of-graph' and 'asoc/fix/pxa' into asoc-linus
|
|
Marcelo added this i_size based optimization with a patch in 2004
(commitid is from the linux-history tree):
commit 765dad09b4ac101a32d87af2bb793c3060497d3c
Author: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Date: Tue Sep 7 17:51:17 2004 -0700
small wait_on_page_writeback_range() optimization
filemap_fdatawait() calls wait_on_page_writeback_range() with -1
as "end" parameter. This is not needed since we know the EOF
from the inode. Use that instead.
There may be races here, particularly with clustered or network
filesystems. It also seems like a bit of a layering violation since
we're operating on an address_space here, not an inode.
Finally, it's also questionable whether this optimization really helps
on workloads that we care about. Should we be optimizing for writeback
vs. truncate races in a codepath where we expect to wait anyway? It
doesn't seem worth the risk.
Remove this optimization from the filemap_fdatawait codepaths. This
means that filemap_fdatawait becomes a trivial wrapper around
filemap_fdatawait_range.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
|
|
This makes it possible to preserve basic futex support and compile out the
PI support when RT mutexes are not available.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Darren Hart <dvhart@infradead.org>
Link: http://lkml.kernel.org/r/alpine.LFD.2.20.1708010024190.5981@knanqh.ubzr
|
|
On many platforms, CPUs can do DVFS across cpufreq policies. i.e CPU
from policy-A can change frequency of CPUs belonging to policy-B.
This is quite common in case of ARM platforms where we don't
configure any per-cpu register.
Add a flag to identify such platforms and update
cpufreq_can_do_remote_dvfs() to allow remote callbacks if this flag is
set.
Also enable the flag for cpufreq-dt driver which is used only on ARM
platforms currently.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Saravana Kannan <skannan@codeaurora.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
With Android UI and benchmarks the latency of cpufreq response to
certain scheduling events can become very critical. Currently, callbacks
into cpufreq governors are only made from the scheduler if the target
CPU of the event is the same as the current CPU. This means there are
certain situations where a target CPU may not run the cpufreq governor
for some time.
One testcase to show this behavior is where a task starts running on
CPU0, then a new task is also spawned on CPU0 by a task on CPU1. If the
system is configured such that the new tasks should receive maximum
demand initially, this should result in CPU0 increasing frequency
immediately. But because of the above mentioned limitation though, this
does not occur.
This patch updates the scheduler core to call the cpufreq callbacks for
remote CPUs as well.
The schedutil, ondemand and conservative governors are updated to
process cpufreq utilization update hooks called for remote CPUs where
the remote CPU is managed by the cpufreq policy of the local CPU.
The intel_pstate driver is updated to always reject remote callbacks.
This is tested with couple of usecases (Android: hackbench, recentfling,
galleryfling, vellamo, Ubuntu: hackbench) on ARM hikey board (64 bit
octa-core, single policy). Only galleryfling showed minor improvements,
while others didn't had much deviation.
The reason being that this patch only targets a corner case, where
following are required to be true to improve performance and that
doesn't happen too often with these tests:
- Task is migrated to another CPU.
- The task has high demand, and should take the target CPU to higher
OPPs.
- And the target CPU doesn't call into the cpufreq governor until the
next tick.
Based on initial work from Steve Muckle.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Saravana Kannan <skannan@codeaurora.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The acpi_pci_propagate_wakeup() routine is there to handle cases in
which PCI bridges (or PCIe ports) are expected to signal wakeup
for devices below them, but currently it doesn't do that correctly.
The problem is that acpi_pci_propagate_wakeup() uses
acpi_pm_set_device_wakeup() for bridges and if that routine is
called for multiple times to disable wakeup for the same device,
it will disable it on the first invocation and the next calls
will have no effect (it works analogously when called to enable
wakeup, but that is not a problem).
Now, say acpi_pci_propagate_wakeup() has been called for two
different devices under the same bridge and it has called
acpi_pm_set_device_wakeup() for that bridge each time. The
bridge is now enabled to generate wakeup signals. Next,
suppose that one of the devices below it resumes and
acpi_pci_propagate_wakeup() is called to disable wakeup for that
device. It will then call acpi_pm_set_device_wakeup() for the bridge
and that will effectively disable remote wakeup for all devices under
it even though some of them may still be suspended and remote wakeup
may be expected to work for them.
To address this (arguably theoretical) issue, allow
wakeup.enable_count under struct acpi_device to grow beyond 1 in
certain situations. In particular, allow that to happen in
acpi_pci_propagate_wakeup() when wakeup is enabled or disabled
for PCI bridges, so that wakeup is actually disabled for the
bridge when all devices under it resume and not when just one
of them does that.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
To prepare for a subsequent change and make the code somewhat easier
to follow, do the following in the ACPI device wakeup handling code:
* Replace wakeup.flags.enabled under struct acpi_device with
wakeup.enable_count as that will be necessary going forward.
For now, wakeup.enable_count is not allowed to grow beyond 1,
so the current behavior is retained.
* Split acpi_device_wakeup() into acpi_device_wakeup_enable()
and acpi_device_wakeup_disable() and modify the callers of
it accordingly.
* Introduce a new acpi_wakeup_lock mutex to protect the wakeup
enabling/disabling code from races in case it is executed
more than once in parallel for the same device (which may
happen for bridges theoretically).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
|
|
The semaphore 'driver_lock' is used as a simple mutex, and also unnecessary as
suggested by Arnd. Hence removing it, as the concurrency between the probe and
remove is already handled in the driver core.
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Binoy Jayan <binoy.jayan@linaro.org>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
Pull networking fixes from David Miller:
1) Handle notifier registry failures properly in tun/tap driver, from
Tonghao Zhang.
2) Fix bpf verifier handling of subtraction bounds and add a testcase
for this, from Edward Cree.
3) Increase reset timeout in ftgmac100 driver, from Ben Herrenschmidt.
4) Fix use after free in prd_retire_rx_blk_timer_exired() in AF_PACKET,
from Cong Wang.
5) Fix SElinux regression due to recent UDP optimizations, from Paolo
Abeni.
6) We accidently increment IPSTATS_MIB_FRAGFAILS in the ipv6 code
paths, fix from Stefano Brivio.
7) Fix some mem leaks in dccp, from Xin Long.
8) Adjust MDIO_BUS kconfig deps to avoid build errors, from Arnd
Bergmann.
9) Mac address length check and buffer size fixes from Cong Wang.
10) Don't leak sockets in ipv6 udp early demux, from Paolo Abeni.
11) Fix return value when copy_from_user() fails in
bpf_prog_get_info_by_fd(), from Daniel Borkmann.
12) Handle PHY_HALTED properly in phy library state machine, from
Florian Fainelli.
13) Fix OOPS in fib_sync_down_dev(), from Ido Schimmel.
14) Fix truesize calculation in virtio_net which led to performance
regressions, from Michael S Tsirkin.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits)
samples/bpf: fix bpf tunnel cleanup
udp6: fix jumbogram reception
ppp: Fix a scheduling-while-atomic bug in del_chan
Revert "net: bcmgenet: Remove init parameter from bcmgenet_mii_config"
virtio_net: fix truesize for mergeable buffers
mv643xx_eth: fix of_irq_to_resource() error check
MAINTAINERS: Add more files to the PHY LIBRARY section
ipv4: fib: Fix NULL pointer deref during fib_sync_down_dev()
net: phy: Correctly process PHY_HALTED in phy_stop_machine()
sunhme: fix up GREG_STAT and GREG_IMASK register offsets
bpf: fix bpf_prog_get_info_by_fd to dump correct xlated_prog_len
tcp: avoid bogus gcc-7 array-bounds warning
net: tc35815: fix spelling mistake: "Intterrupt" -> "Interrupt"
bpf: don't indicate success when copy_from_user fails
udp6: fix socket leak on early demux
net: thunderx: Fix BGX transmit stall due to underflow
Revert "vhost: cache used event for better performance"
team: use a larger struct for mac address
net: check dev->addr_len for dev_set_mac_address()
phy: bcm-ns-usb3: fix MDIO_BUS dependency
...
|
|
Since commit 67a51780aebb ("ipv6: udp: leverage scratch area
helpers") udp6_recvmsg() read the skb len from the scratch area,
to avoid a cache miss.
But the UDP6 rx path support RFC 2675 UDPv6 jumbograms, and their
length exceeds the 16 bits available in the scratch area. As a side
effect the length returned by recvmsg() is:
<ingress datagram len> % (1<<16)
This commit addresses the issue allocating one more bit in the
IP6CB flags field and setting it for incoming jumbograms.
Such field is still in the first cacheline, so at recvmsg()
time we can check it and fallback to access skb->len if
required, without a measurable overhead.
Fixes: 67a51780aebb ("ipv6: udp: leverage scratch area helpers")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add the following stats into SCM_TIMESTAMPING_OPT_STATS control msg:
TCP_NLA_PACING_RATE
TCP_NLA_DELIVERY_RATE
TCP_NLA_SND_CWND
TCP_NLA_REORDERING
TCP_NLA_MIN_RTT
TCP_NLA_RECUR_RETRANS
TCP_NLA_DELIVERY_RATE_APP_LMT
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Necessary now for gfs2_fsync and sync_file_range, but there will
eventually be other callers.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
|
|
In preparation for having the bcmgenet driver migrate over the
mdio-bcm-unimac driver, add a platform data structure which allows
passing integrating specific details like bus name, wait function to
complete MDIO operations and PHY mask.
We also define what the platform device name contract is by defining
UNIMAC_MDIO_DRV_NAME and moving it to the platform_data header.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
was used by tcp prequeue and header prediction.
TCPFORWARDRETRANS use was removed in january.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
re-indent tcp_ack, and remove CA_ACK_SLOWPATH; it is always set now.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Like prequeue, I am not sure this is overly useful nowadays.
If we receive a train of packets, GRO will aggregate them if the
headers are the same (HP predates GRO by several years) so we don't
get a per-packet benefit, only a per-aggregated-packet one.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Was only checked by the removed prequeue code.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
prequeue is a tcp receive optimization that moves part of rx processing
from bh to process context.
This only works if the socket being processed belongs to a process that
is blocked in recv on that socket.
In practice, this doesn't happen anymore that often because nowadays
servers tend to use an event driven (epoll) model.
Even normal client applications (web browsers) commonly use many tcp
connections in parallel.
This has measureable impact only in netperf (which uses plain recv and
thus allows prequeue use) from host to locally running vm (~4%), however,
there were no changes when using netperf between two physical hosts with
ixgbe interfaces.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fixes from Tejun Heo:
"Two notable fixes.
- While adding NUMA affinity support to unbound workqueues, the
assumption that an unbound workqueue with max_active == 1 is
ordered was broken.
The plan was to use explicit alloc_ordered_workqueue() for those
cases. Unfortunately, I forgot to update the documentation properly
and we grew a handful of use cases which depend on that assumption.
While we want to convert them to alloc_ordered_workqueue(), we
don't really lose anything by enforcing ordered execution on
unbound max_active == 1 workqueues and it doesn't make sense to
risk subtle bugs. Restore the assumption.
- Workqueue assumes that CPU <-> NUMA node mapping remains static.
This is a general assumption - we don't have any synchronization
mechanism around CPU <-> node mapping. Unfortunately, powerpc may
change the mapping dynamically leading to crashes. Michael added a
workaround so that we at least don't crash while powerpc hotplug
code gets updated"
* 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: Work around edge cases for calc of pool's cpumask
workqueue: implicit ordered attribute should be overridable
workqueue: restore WQ_UNBOUND/max_active==1 to be ordered
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
"Dan found a really old bug where libata hotplug code wasn't sanitizing
index value from userland and may end up indexing with a negative
number. It is scary but fortunately can only be triggered by root.
Other than that, minor fixes"
* 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
libata: fix a couple of doc build warnings
libata: array underflow in ata_find_dev()
ata: sata_rcar: add gen[23] fallback compatibility strings
libata: remove unused rc in ata_eh_handle_port_resume
libata: Cleanup ata_read_log_page()
ata: fix gemini Kconfig dependencies
|
|
A trap should be sent to the FM until the FM sends a repress message.
This is in line with the IBTA 13.4.9.
Add the ability to resend traps until a repress message is received.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Michael N. Henry <michael.n.henry@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Discussion during NFWS 2017 in Faro has shown that the current
conntrack behaviour is unreasonable.
Even if conntrack module is loaded on behalf of a single net namespace,
its turned on for all namespaces, which is expensive. Commit
481fa373476 ("netfilter: conntrack: add nf_conntrack_default_on sysctl")
attempted to provide an alternative to the 'default on' behaviour by
adding a sysctl to change it.
However, as Eric points out, the sysctl only becomes available
once the module is loaded, and then its too late.
So we either have to move the sysctl to the core, or, alternatively,
change conntrack to become active only once the rule set requires this.
This does the latter, conntrack is only enabled when a rule needs it.
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Same conversion as for table names, use NFT_NAME_MAXLEN as upper
boundary as well.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Same conversion as for table names, use NFT_NAME_MAXLEN as upper
boundary as well.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Same conversion as for table names, use NFT_NAME_MAXLEN as upper
boundary as well.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Allocate all table names dynamically to allow for arbitrary lengths but
introduce NFT_NAME_MAXLEN as an upper sanity boundary. It's value was
chosen to allow using a domain name as per RFC 1035.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This is similar to strdup() for netlink string attributes.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This also removes __nf_ct_unconfirmed_destroy() call from
nf_ct_iterate_cleanup_net, so that function can be used only
when missing conntracks from unconfirmed list isn't a problem.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
We have several spots that open-code a expect walk, add a helper
that is similar to nf_ct_iterate_destroy/nf_ct_iterate_cleanup.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Hopefully making clear that it is not needed for new drivers.
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Up until recently sync_file were create to export a single dma-fence to
userspace, and so we could canabalise a bit insie dma-fence to mark
whether or not we had enable polling for the sync_file itself. However,
with the advent of syncobj, we do allow userspace to create multiple
sync_files for a single dma-fence. (Similarly, that the sw-sync
validation framework also started returning multiple sync-files wrapping
a single dma-fence for a syncpt also triggering the problem.)
This patch reverts my suggestion in commit e24165537312
("dma-buf/sync_file: only enable fence signalling on poll()") to use a
single bit in the shared dma-fence and restores the sync_file->flags for
tracking the bits individually.
Reported-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Fixes: f1e8c67123cf ("dma-buf/sw-sync: Use an rbtree to sort fences in the timeline")
Fixes: e9083420bbac ("drm: introduce sync objects (v4)")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: dri-devel@lists.freedesktop.org
Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.13-rc1+
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170728212951.7818-1-chris@chris-wilson.co.uk
(cherry picked from commit db1fc97ca0c0d3fdeeadf314d99a26188438940a)
|
|
Fixes the following warnings when building docs:
../include/drm/drm_drv.h:553: warning: No description found for parameter 'debugfs_init'
../include/drm/drm_drv.h:553: warning: No description found for parameter 'gem_open_object'
../include/drm/drm_drv.h:553: warning: No description found for parameter 'gem_close_object'
../include/drm/drm_drv.h:553: warning: No description found for parameter 'prime_handle_to_fd'
../include/drm/drm_drv.h:553: warning: No description found for parameter 'prime_fd_to_handle'
../include/drm/drm_drv.h:553: warning: No description found for parameter 'gem_prime_export'
../include/drm/drm_drv.h:553: warning: No description found for parameter 'gem_prime_import'
../include/drm/drm_drv.h:553: warning: No description found for parameter 'gem_vm_ops'
../include/drm/drm_drv.h:553: warning: No description found for parameter 'major'
../include/drm/drm_drv.h:553: warning: No description found for parameter 'minor'
../include/drm/drm_drv.h:553: warning: No description found for parameter 'patchlevel'
../include/drm/drm_drv.h:553: warning: No description found for parameter 'name'
../include/drm/drm_drv.h:553: warning: No description found for parameter 'desc'
../include/drm/drm_drv.h:553: warning: No description found for parameter 'date'
../include/drm/drm_drv.h:553: warning: No description found for parameter 'driver_features'
There are still a couple more warnings for prime helpers that are
documented elsewhere.
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170720174746.29100-5-seanpaul@chromium.org
|
|
Fixes:
../drivers/gpu/drm/drm_scdc_helper.c:203: ERROR: Unexpected indentation.
../drivers/gpu/drm/drm_scdc_helper.c:204: WARNING: Block quote ends without a blank line; unexpected unindent.
Changes in v2:
- Property blockquote TMDS calculations so they look pretty (Daniel)
- Remove duplicate documentation from the header file
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170720200921.36897-1-seanpaul@chromium.org
|
|
The enumeration of FIXMEs wasn't indented properly.
Fixes: fef9df8b5945 ("drm/atomic: initial support for asynchronous plane update")
Cc: Gustavo Padovan <gustavo.padovan@collabora.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170731111733.10507-1-daniel.vetter@ffwll.ch
|
|
The kerneldoc for drm_atomic_crtc_needs_modeset() is outdated and no
longer reflects the actual code. Fix that up to remove confusion.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170731091343.21363-1-thierry.reding@gmail.com
|
|
Two variables in ext4_inode_info, i_reserved_meta_blocks and
i_allocated_meta_blocks, are unused. Removing them saves a little
memory per in-memory inode and cleans up clutter in several tracepoints.
Adjust tracepoint output from ext4_alloc_da_blocks() for consistency
and fix a typo and whitespace near these changes.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
|
|
This patch adds support for filtering based on time since last used.
When we are dumping a large number of actions it is useful to
have the option of filtering based on when the action was last
used to reduce the amount of data crossing to user space.
With this patch the user space app sets the TCA_ROOT_TIME_DELTA
attribute with the value in milliseconds with "time of interest
since now". The kernel converts this to jiffies and does the
filtering comparison matching entries that have seen activity
since then and returns them to user space.
Old kernels and old tc continue to work in legacy mode since
they dont specify this attribute.
Some example (we have 400 actions bound to 400 filters); at
installation time. Using updated when tc setting the time of
interest to 120 seconds earlier (we see 400 actions):
prompt$ hackedtc actions ls action gact since 120000| grep index | wc -l
400
go get some coffee and wait for > 120 seconds and try again:
prompt$ hackedtc actions ls action gact since 120000 | grep index | wc -l
0
Lets see a filter bound to one of these actions:
....
filter pref 10 u32
filter pref 10 u32 fh 800: ht divisor 1
filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10 (rule hit 2 success 1)
match 7f000002/ffffffff at 12 (success 1 )
action order 1: gact action pass
random type none pass val 0
index 23 ref 2 bind 1 installed 1145 sec used 802 sec
Action statistics:
Sent 84 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
....
that coffee took long, no? It was good.
Now lets ping -c 1 127.0.0.2, then run the actions again:
prompt$ hackedtc actions ls action gact since 120 | grep index | wc -l
1
More details please:
prompt$ hackedtc -s actions ls action gact since 120000
action order 0: gact action pass
random type none pass val 0
index 23 ref 2 bind 1 installed 1270 sec used 30 sec
Action statistics:
Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
And the filter?
filter pref 10 u32
filter pref 10 u32 fh 800: ht divisor 1
filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10 (rule hit 4 success 2)
match 7f000002/ffffffff at 12 (success 2 )
action order 1: gact action pass
random type none pass val 0
index 23 ref 2 bind 1 installed 1324 sec used 84 sec
Action statistics:
Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When you dump hundreds of thousands of actions, getting only 32 per
dump batch even when the socket buffer and memory allocations allow
is inefficient.
With this change, the user will get as many as possibly fitting
within the given constraints available to the kernel.
The top level action TLV space is extended. An attribute
TCA_ROOT_FLAGS is used to carry flags; flag TCA_FLAG_LARGE_DUMP_ON
is set by the user indicating the user is capable of processing
these large dumps. Older user space which doesnt set this flag
doesnt get the large (than 32) batches.
The kernel uses the TCA_ROOT_COUNT attribute to tell the user how many
actions are put in a single batch. As such user space app knows how long
to iterate (independent of the type of action being dumped)
instead of hardcoded maximum of 32 thus maintaining backward compat.
Some results dumping 1.5M actions below:
first an unpatched tc which doesnt understand these features...
prompt$ time -p tc actions ls action gact | grep index | wc -l
1500000
real 1388.43
user 2.07
sys 1386.79
Now lets see a patched tc which sets the correct flags when requesting
a dump:
prompt$ time -p updatedtc actions ls action gact | grep index | wc -l
1500000
real 178.13
user 2.02
sys 176.96
That is about 8x performance improvement for tc app which sets its
receive buffer to about 32K.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Generic bitflags attribute content sent to the kernel by user.
With this netlink attr type the user can either set or unset a
flag in the kernel.
The value is a bitmap that defines the bit values being set
The selector is a bitmask that defines which value bit is to be
considered.
A check is made to ensure the rules that a kernel subsystem always
conforms to bitflags the kernel already knows about. i.e
if the user tries to set a bit flag that is not understood then
the _it will be rejected_.
In the most basic form, the user specifies the attribute policy as:
[ATTR_GOO] = { .type = NLA_BITFIELD32, .validation_data = &myvalidflags },
where myvalidflags is the bit mask of the flags the kernel understands.
If the user _does not_ provide myvalidflags then the attribute will
also be rejected.
Examples:
value = 0x0, and selector = 0x1
implies we are selecting bit 1 and we want to set its value to 0.
value = 0x2, and selector = 0x2
implies we are selecting bit 2 and we want to set its value to 1.
Suggested-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
"Two patches addressing build warnings caused by inconsistent kernel
doc comments"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/wait: Clean up some documentation warnings
sched/core: Fix some documentation build warnings
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Thomas Gleixner:
"Fix for a regression caused by the conversion of x86 to the generic
hotplug code.
Instead of doing a plain single line revert, this adds a pile of
comments so the semantics of the force argument are clear"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/cpuhotplug: Revert "Set force affinity flag on hotplug migration"
|
|
The mc-bus specific field, fsl_mc in struct msi_desc is missing its
comment so add it.
Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The work-around for Qualcomm Technologies QDF2400 Erratum 44 hinges on a
global variable defined in the pl011 driver. The ACPI SPCR parsing code
determines whether the work-around is needed, and if so, it changes the
console name from "pl011" to "qdf2400_e44". The expectation is that
the pl011 driver will implement the work-around when it sees the console
name. The global variable qdf2400_e44_present is set when that happens.
The problem is that work-around needs to be enabled when the pl011
driver probes, not when the console name is queried. However, sbsa_probe()
is called before pl011_console_match(). The work-around appeared to work
previously because the default console on QDF2400 platforms was always
ttyAMA1. The first time sbsa_probe() is called (for ttyAMA0),
qdf2400_e44_present is still false. Then pl011_console_match() is called,
and it sets qdf2400_e44_present to true. All subsequent calls to
sbsa_probe() enable the work-around.
The solution is to move the global variable into spcr.c and let the
pl011 driver query it during probe time. This works because all QDF2400
platforms require SPCR, so parse_spcr() will always be called.
pl011_console_match still checks for the "qdf2400_e44" console name,
but it doesn't do anything else special.
Fixes: 5a0722b898f8 ("tty: pl011: use "qdf2400_e44" as the earlycon name for QDF2400 E44")
Tested-by: Jeffrey Hugo <jhugo@codeaurora.org>
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
kernelci.org reports a crazy stack usage for the VT code when CONFIG_KASAN
is enabled:
drivers/tty/vt/keyboard.c: In function 'kbd_keycode':
drivers/tty/vt/keyboard.c:1452:1: error: the frame size of 2240 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
The problem is that tty_insert_flip_char() gets inlined many times into
kbd_keycode(), and also into other functions, and each copy requires 128
bytes for stack redzone to check for a possible out-of-bounds access on
the 'ch' and 'flags' arguments that are passed into
tty_insert_flip_string_flags as a variable-length string.
This introduces a new __tty_insert_flip_char() function for the slow
path, which receives the two arguments by value. This completely avoids
the problem and the stack usage goes back down to around 100 bytes.
Without KASAN, this is also slightly better, as we don't have to
spill the arguments to the stack but can simply pass 'ch' and 'flag'
in registers, saving a few bytes in .text for each call site.
This should be backported to linux-4.0 or later, which first introduced
the stack sanitizer in the kernel.
Cc: stable@vger.kernel.org
Fixes: c420f167db8c ("kasan: enable stack instrumentation")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
First 16 bits in the flags field are user-visible except
UPF_NO_TXEN_TEST. To keep it clean we introduce internal quirks and move
UPF_NO_TXEN_TEST to them. Rename the constant to UPQ_NO_TXEN_TEST to
distinguish with port flags. Users are converted accordingly.
The quirks field might be extended later to hold the additional ones.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
refresh the tree
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Forward Error Correction (FEC) modes i.e Base-R
and Reed-Solomon modes are introduced in 25G/40G/100G standards
for providing good BER at high speeds. Various networking devices
which support 25G/40G/100G provides ability to manage supported FEC
modes and the lack of FEC encoding control and reporting today is a
source for interoperability issues for many vendors.
FEC capability as well as specific FEC mode i.e. Base-R
or RS modes can be requested or advertised through bits D44:47 of
base link codeword.
This patch set intends to provide option under ethtool to manage
and report FEC encoding settings for networking devices as per
IEEE 802.3 bj, bm and by specs.
set-fec/show-fec option(s) are designed to provide control and
report the FEC encoding on the link.
SET FEC option:
root@tor: ethtool --set-fec swp1 encoding [off | RS | BaseR | auto]
Encoding: Types of encoding
Off : Turning off any encoding
RS : enforcing RS-FEC encoding on supported speeds
BaseR : enforcing Base R encoding on supported speeds
Auto : IEEE defaults for the speed/medium combination
Here are a few examples of what we would expect if encoding=auto:
- if autoneg is on, we are expecting FEC to be negotiated as on or off
as long as protocol supports it
- if the hardware is capable of detecting the FEC encoding on it's
receiver it will reconfigure its encoder to match
- in absence of the above, the configuration would be set to IEEE
defaults.
>From our understanding , this is essentially what most hardware/driver
combinations are doing today in the absence of a way for users to
control the behavior.
SHOW FEC option:
root@tor: ethtool --show-fec swp1
FEC parameters for swp1:
Active FEC encodings: RS
Configured FEC encodings: RS | BaseR
ETHTOOL DEVNAME output modification:
ethtool devname output:
root@tor:~# ethtool swp1
Settings for swp1:
root@hpe-7712-03:~# ethtool swp18
Settings for swp18:
Supported ports: [ FIBRE ]
Supported link modes: 40000baseCR4/Full
40000baseSR4/Full
40000baseLR4/Full
100000baseSR4/Full
100000baseCR4/Full
100000baseLR4_ER4/Full
Supported pause frame use: No
Supports auto-negotiation: Yes
Supported FEC modes: [RS | BaseR | None | Not reported]
Advertised link modes: Not reported
Advertised pause frame use: No
Advertised auto-negotiation: No
Advertised FEC modes: [RS | BaseR | None | Not reported]
<<<< One or more FEC modes
Speed: 100000Mb/s
Duplex: Full
Port: FIBRE
PHYAD: 106
Transceiver: internal
Auto-negotiation: off
Link detected: yes
This patch includes following changes
a) New ETHTOOL_SFECPARAM/SFECPARAM API, handled by
the new get_fecparam/set_fecparam callbacks, provides support
for configuration of forward error correction modes.
b) Link mode bits for FEC modes i.e. None (No FEC mode), RS, BaseR/FC
are defined so that users can configure these fec modes for supported
and advertising fields as part of link autonegotiation.
Signed-off-by: Vidya Sagar Ravipati <vidya.chowdary@gmail.com>
Signed-off-by: Dustin Byford <dustin@cumulusnetworks.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When an early demuxed packet reaches __udp6_lib_lookup_skb(), the
sk reference is retrieved and used, but the relevant reference
count is leaked and the socket destructor is never called.
Beyond leaking the sk memory, if there are pending UDP packets
in the receive queue, even the related accounted memory is leaked.
In the long run, this will cause persistent forward allocation errors
and no UDP skbs (both ipv4 and ipv6) will be able to reach the
user-space.
Fix this by explicitly accessing the early demux reference before
the lookup, and properly decreasing the socket reference count
after usage.
Also drop the skb_steal_sock() in __udp6_lib_lookup_skb(), and
the now obsoleted comment about "socket cache".
The newly added code is derived from the current ipv4 code for the
similar path.
v1 -> v2:
fixed the __udp6_lib_rcv() return code for resubmission,
as suggested by Eric
Reported-by: Sam Edwards <CFSworks@gmail.com>
Reported-by: Marc Haber <mh+netdev@zugschlus.de>
Fixes: 5425077d73e0 ("net: ipv6: Add early demux handler for UDP unicast")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In the H3 ES2.0 SoC the VSP2-DL instance has two connections to DU
channels that need to be configured independently. Extend the VSP-DU API
with a pipeline index to identify which pipeline the caller wants to
operate on.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Acked-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
|