Age | Commit message (Collapse) | Author |
|
Macvlan devices do not initialize vlan_features. As a result,
any vlan devices configured on top of macvlans perform very poorly.
Initialize vlan_features based on the vlan features of the lower-level
device.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add functions to support ethtool EEE manipulating, and the EEE
is disabled in default setting to enhance the compatibility
with certain switch.
Signed-off-by: Freddy Xin <freddy@asix.com.tw>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use PAGE_ALIGNED(...) instead of IS_ALIGNED(..., PAGE_SIZE).
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When dealing with ICMPv[46] Error Message, function icmp_socket_deliver()
and icmpv6_notify() do some valid checks on packet's length, but then some
protocols check packet's length redaudantly. So remove those duplicated
statements, and increase counter ICMP_MIB_INERRORS/ICMP6_MIB_INERRORS in
function icmp_socket_deliver() and icmpv6_notify() respectively.
In addition, add missed counter in udp6/udplite6 when socket is NULL.
Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The SCTP socket extensions API document describes the v4mapping option as
follows:
8.1.15. Set/Clear IPv4 Mapped Addresses (SCTP_I_WANT_MAPPED_V4_ADDR)
This socket option is a Boolean flag which turns on or off the
mapping of IPv4 addresses. If this option is turned on, then IPv4
addresses will be mapped to V6 representation. If this option is
turned off, then no mapping will be done of V4 addresses and a user
will receive both PF_INET6 and PF_INET type addresses on the socket.
See [RFC3542] for more details on mapped V6 addresses.
This description isn't really in line with what the code does though.
Introduce addr_to_user (renamed addr_v4map), which should be called
before any sockaddr is passed back to user space. The new function
places the sockaddr into the correct format depending on the
SCTP_I_WANT_MAPPED_V4_ADDR option.
Audit all places that touched v4mapped and either sanely construct
a v4 or v6 address then call addr_to_user, or drop the
unnecessary v4mapped check entirely.
Audit all places that call addr_to_user and verify they are on a sycall
return path.
Add a custom getname that formats the address properly.
Several bugs are addressed:
- SCTP_I_WANT_MAPPED_V4_ADDR=0 often returned garbage for
addresses to user space
- The addr_len returned from recvmsg was not correct when
returning AF_INET on a v6 socket
- flowlabel and scope_id were not zerod when promoting
a v4 to v6
- Some syscalls like bind and connect behaved differently
depending on v4mapped
Tested bind, getpeername, getsockname, connect, and recvmsg for proper
behaviour in v4mapped = 1 and 0 cases.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Net_device is a vast and important structure, but it has no kernel-doc
compliant documentation. This patch extracts the comments from the structure
to clean it up, and let the scripts extract documentation from it. I know that
the patch is big, but it's just reordering of comments into the appropriate
form, and adding a few more, for the missing members.
Signed-off-by: Karoly Kemeny <karoly.kemeny@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"One commit that fixes a problem causing PNP devices to be associated
with wrong ACPI device objects sometimes during device enumeration due
to an incorrect check in a matching function.
That problem was uncovered by the ACPI device enumeration rework in
3.14"
* tag 'pm+acpi-3.16-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / PNP: Fix acpi_pnp_match()
|
|
Vince Bridgers says:
====================
net: stmmac: Improve mcast/ucast filter for snps
This patch series adds Synopsys specific bindings for the Synopsys EMAC
filter characteristics since those are implementation dependent. The
multicast and unicast filtering code was improved to handle different
configuration variations based on device tree settings.
I verified the operation of the multicast and unicast filters through
Synopsys support as requested during the V1 review, and tested the GMAC
configuration on an Altera Cyclone 5 SOC (which supports 256 multicast
bins and 128 Unicast addresses). The 10/100 variant of this driver
modification was not tested, although it was compile tested. I shared
the email thread results of the investigation through Synopsys with the
stmmac maintainer.
V4: Remove patch from series that addressed a sparse issue from a
down rev'd version of sparse that does not show up in the
latest version of sparse.
V3: Break up the patch into interface and functional change patches
per review comments
V2: Confirm with Synopsys methods to determine number of Multicast bins
and Unicast address filter entries per first round review comments.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds and modifies code to support multiple Multicast and Unicast
Synopsys MAC filter configurations. The default configuration is defined to
support legacy driver behavior, which is 64 Multicast bins. The Unicast
filter code previously assumed all controllers support 32 or 16 Unicast
addresses based on controller version number, but this has been corrected
to support a default of 1 Unicast address. The filter configuration may
be specified through the devicetree using a Synopsys specific device tree
entry. This information was verified with Synopsys through
Synopsys Support Case #8000684337 and shared with the maintainer.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds socfpga Ethernet filter attributes for multicast
and unicast filters per Synopsys Ethernet IP configuration chosen
by Altera for the Cyclone 5 and Arria SOC FPGAs.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This change adds bindings for the number of multicast hash bins and perfect
filter entries supported by the Synopsys EMAC. The Synopsys EMAC core is
configurable at device creation time, and can be configured for a different
number of multicast hash bins and a different number of perfect filter
entries. The device does not provide a way to query these parameters,
therefore parameters are required. The Altera Cyclone V SOC has support for
256 multicast hash bins and 128 perfect filter entries, and is different
than what's currently provided in the stmmac driver.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch removes the check for the number of mulitcast addresses
when using hash based filtering since it's not necessary. If the number
of multicast addresses in the list exceeds the number of multicast hash
bins, the bins will "fold" over into one of the bins configured and
enabled for the particular component instance.
The default number of maximum unicast addresses was changed from 32 to 1
since this number is not dependent on the component revision. The maximum
number of multicast and unicast addresses is dependent on the configuration
of the Synopsys EMAC configured by the SOC architect at the time the
features were selected and configured for a particular component. Sadly,
Synopsys does not provide a way to query the precise number supported
by a particular component, so we must fall back on a devicetree entry.
This configuration could vary from vendor to vendor (such as STMicro,
Altera, etc).
The multicast bins are set for every possible filtering case (including
no entries) - previously the bits were set only if multicast filter entries
were present.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The synopsys EMAC can be configured for different numbers of multicast hash
bins and perfect filter entries at device creation time and there's no way
to query this configuration information at runtime. As a result, a devicetree
parameter is required in order for the driver to program these filters
correctly for a particular device instance. This patch modifies the
10/100/1000 MAC software interface such that these configuration parameters
can be set at initialization time.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains netfilter updates for net-next, they are:
1) Add the reject expression for the nf_tables bridge family, this
allows us to send explicit reject (TCP RST / ICMP dest unrech) to
the packets matching a rule.
2) Simplify and consolidate the nf_tables set dumping logic. This uses
netlink control->data to filter out depending on the request.
3) Perform garbage collection in xt_hashlimit using a workqueue instead
of a timer, which is problematic when many entries are in place in
the tables, from Eric Dumazet.
4) Remove leftover code from the removed ulog target support, from
Paul Bolle.
5) Dump unmodified flags in the netfilter packet accounting when resetting
counters, so userspace knows that a counter was in overquota situation,
from Alexey Perevalov.
6) Fix wrong usage of the bitwise functions in nfnetlink_acct, also from
Alexey.
7) Fix a crash when adding new set element with an empty NFTA_SET_ELEM_LIST
attribute.
This patchset also includes a couple of cleanups for xt_LED from
Duan Jiong and for nf_conntrack_ipv4 (using coccinelle) from
Himangi Saraogi.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit d23ff7016 (tcp: add generic netlink support for tcp_metrics) introduced
netlink support for the new tcp_metrics, however it restricted getting of
tcp_metrics to root user only. This is a change from how these values could
have been fetched when in the old route cache. Unless there's a legitimate
reason to restrict the reading of these values it would be better if normal
users could fetch them.
Cc: Julian Anastasov <ja@ssi.bg>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 34c5bd66e5ed introduced the possibility that an
uninitialized pointer on the stack (orig_fp) can call into
sk_unattached_filter_destroy() when its value is non NULL.
Before that commit orig_fp was only destroyed in the same
block where it was assigned a valid BPF prog before. Fix it
up by initializing it to NULL.
Fixes: 34c5bd66e5ed ("net: filter: don't release unattached filter through call_rcu()")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Pablo Neira <pablo@netfilter.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, the driver defines SPI_MAX_CHIPSELECT as 2 and
use per device array bytes_per_word based on this. This breaks
if num_chipselect per device is greater than 2. This patch
fix this and allocate memory for this array based on
num_chipselect.
It's preparation patch to enable GPIO CS feature for
Davinci SPI.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
|
|
According to the tps65090 data manual [0], the DCDC1 and DCDC2
step-down converters and the LDO's have a fixed output voltage.
Add this information to the driver since these fixed regulators
can be used as parent input supply for switches that don't have
an output voltage defined. So the regulator core needs to fetch
the FET parent output voltage if the child voltage is queried.
[0]: http://www.ti.com/lit/gpn/tps65090
Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Mark Brown <broonie@linaro.org>
|
|
An FDB entry with vlan_id 0 doesn't mean it is used in vlan 0, but used when
vlan_filtering is disabled.
There is inconsistency around NDA_VLAN whose payload is 0 - even if we add
an entry by RTM_NEWNEIGH without any NDA_VLAN, and even though adding an
entry with NDA_VLAN 0 is prohibited, we get an entry with NDA_VLAN 0 by
RTM_GETNEIGH.
Dumping an FDB entry with vlan_id 0 shouldn't include NDA_VLAN.
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Otherwise, the kernel oopses in nla_for_each_nested when iterating over
the unset attribute NFTA_SET_ELEM_LIST_ELEMENTS in the
nf_tables_{new,del}setelem() path.
netlink: 65524 bytes leftover after parsing attributes in process `nft'.
[...]
Oops: 0000 [#1] SMP
[...]
CPU: 2 PID: 6287 Comm: nft Not tainted 3.16.0-rc2+ #169
RIP: 0010:[<ffffffffa0526e61>] [<ffffffffa0526e61>] nf_tables_newsetelem+0x82/0xec [nf_tables]
[...]
Call Trace:
[<ffffffffa05178c4>] nfnetlink_rcv+0x2e7/0x3d7 [nfnetlink]
[<ffffffffa0517939>] ? nfnetlink_rcv+0x35c/0x3d7 [nfnetlink]
[<ffffffff8137d300>] netlink_unicast+0xf8/0x17a
[<ffffffff8137d6a5>] netlink_sendmsg+0x323/0x351
[...]
Fix this by returning -EINVAL if this attribute is not set, which
doesn't make sense at all since those commands are there to add and to
delete elements from the set.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This reverts commit 4fae4e7624653ef498d0e2a38f00620b9701ab04.
Undo because it breaks working systems.
Requested-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
This reverts commit d709f7bcbb3ab01704fa7b37a2e4b981cf3783c1.
Undo, because it might break exisiting functionality.
Requested-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
'arm/exynos' and 'core' into next
|
|
This commit introduces a generic device tree binding for
IOMMU devices. Only a very minimal subset is described
here, but it is enough to cover the requirements of both the
Exynos System MMU and Tegra SMMU as discussed here:
https://lkml.org/lkml/2014/4/27/346
Acked-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Olof Johansson <olof@lixom.net>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Otherwise the name of the kobject isn't getting freed and other stuff from
kobject_cleanup() isn't getting called. kobject_put() will call
kobject_del() on its own in kobject_cleanup().
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
functions
Bit helper functions were used for manipulation with NFACCT_F_OVERQUOTA,
but they are accepting pit position, but not a bit mask. As a result
not a third bit for NFACCT_F_OVERQUOTA was set, but forth. Such
behaviour was dangarous and could lead to unexpected overquota report
result.
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
It's cleaner and we don't have quite identical names like
BCMA_CORE_PCIEG2 and BCMA_CORE_PCIE2.
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
When switching from one 5 GHz channel to another 5 GHz channel we need
to make sure BPHY is still in a reset. However to access BPHY register
we have to switch to 2 GHz mode for a moment. Otherwise this may result
in "Data bus error" (noticed by Hauke with BCM43224 connected to the
SoC).
Reported-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
Amitkumar and Avinash are taking care of mwifiex driver as the
maintainers now due to organizational change.
Signed-off-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: Avinash Patil <patila@marvell.com>
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
Add lists of chipsets, so people can enable support for their device
easier (at least after checking lspci).
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
TDLS connections require dedicated flowrings. This patches adds
TDLS event handling and flowring creation/deletion based on these
events.
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Daniel (Deognyoun) Kim <dekim@broadcom.com>
Signed-off-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
Msgbuf flow control was using a function to flow off and on which
was not supported without proptx enabled. Also flow control needs
to be handled per ifidx.
Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Daniel (Deognyoun) Kim <dekim@broadcom.com>
Signed-off-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
When a pcie device gets reset then the low power modes l1 and l2
should be temporarily disabled.
Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Daniel (Deognyoun) Kim <dekim@broadcom.com>
Signed-off-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
This patch will add PCIe support. With this patch the PCIe chipsets
43602, 4354, 4356, 43567, and 43570 will be supported.
Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Daniel (Deognyoun) Kim <dekim@broadcom.com>
Signed-off-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
This patch will add the msgbuf protocol. This protocol is used by
the soon to be added new bus interface PCIe. Msgbuf is a protocol
where most data is and remains located on the host (driver) side
and transferred by DMA from and to device. Msgbuf is the protocol
which takes care of the signalling of the buffers between host and
device which identifies this DMA-able data.
Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Daniel (Deognyoun) Kim <dekim@broadcom.com>
Signed-off-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
The soon to be added protocol msgbuf requires information about
interface addressing mode and peer deletion. This patch adds the
necessary APIs to proto, implements dummy functions in BCDC and
adds calls to proto wherever necessary by wl_cfg80211.
Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Daniel (Deognyoun) Kim <dekim@broadcom.com>
Signed-off-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Daniel (Deognyoun) Kim <dekim@broadcom.com>
Signed-off-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
Commit "c1b2053 brcmfmac: Make firmware path a module parameter"
introduced use of strcpy and strcat. The strcpy and strcat require
using null terminated strings and can cause out-of-bounds memory
access and subsequent corruption. This patch replaces these by
strncpy and strncat respectively to assure array boundaries are
not crossed.
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Signed-off-by: Daniel Kim <dekim@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
Correct typo in the name of the type given to sizeof. Because it is the
size of a pointer that is wanted, the typo has no impact on compilation or
execution.
This problem was found using Coccinelle (http://coccinelle.lip6.fr/). The
semantic patch used can be found in message 0 of this patch series.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
git://git.linaro.org/people/mike.turquette/linux
Pull clock driver fix from Mike Turquette:
"A single patch to re-enable audio which is broken on all DRA7
SoC-based platforms. Missed this one from the last set of fixes"
* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux:
clk: ti: clk-7xx: Correct ABE DPLL configuration
|
|
Pull crypto fix from Herbert Xu:
"This adds missing SELinux labeling to AF_ALG sockets which apparently
causes SELinux (or at least the SELinux people) to misbehave :)"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: af_alg - properly label AF_ALG socket
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI barrier fix from James Bottomley:
"This is a potential data corruption fix: If we get an error sending
down a barrier, we simply ignore it meaning the barrier semantics get
violated without anyone being any the wiser. If the system crashes at
this point, the filesystem potentially becomes corrupt. Fix is to
report errors on failed barriers"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: handle flush errors properly
|
|
On platforms with sizeof(int) < sizeof(long), writing a temperature
limit larger than MAXINT will result in unpredictable limit values
written to the chip.
Clamp the input values to the supported limits first to fix the problem.
For set_temp_hyst:
As Guenter pointed out that the temperature is read as unsigned and stored in
an unsigned long. This is wrong; nothing in the datasheet suggests that the
value (the absolute temperature) must be positive.
So change it to signed.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tfiga/samsung-clk into clk-next-samsung
Samsung clock patches for 3.17
1) non-critical fixes (without need to push to stable):
d5e136a clk: samsung: Register clk provider only after registering its all clocks
305cfab clk: samsung: Make of_device_id array const
e9d5295 clk: samsung: exynos5420: Setup clocks before system suspend
f65d518 clk: samsung: trivial: Correct typo in author's name
2) Exynos CLKOUT driver:
800c979 clk: samsung: exynos4: Add missing CPU/DMC clock hierarchy
01f7ec2 clk: samsung: exynos4: Add CLKOUT clock hierarchy
1e832e5 clk: samsung: Add driver to control CLKOUT line on Exynos SoCs
d19bb39 ARM: dts: exynos: Update PMU node with CLKOUT related data
3) Clock hierarchy extensions:
17d3f1d clk: exynos4: Add PPMU IP block source clocks.
ca5b402 clk: samsung: register exynos5420 apll/kpll configuration data
4) ARM CLKDOWN functionality enablement for Exynos4 and 3250:
42773b2 clk: samsung: exynos4: Enable ARMCLK down feature
45c5b0a clk: samsung: exynos3250: Enable ARMCLK down feature
|
|
This has been run through Intel's LKP tests across a wide range
of modern sytems and workloads and it wasn't shown to make a
measurable performance difference positive or negative.
Now that we have some shiny new tracepoints, we can actually
figure out what the heck is going on.
During a kernel compile, 60% of the flush_tlb_mm_range() calls
are for a single page. It breaks down like this:
size percent percent<=
V V V
GLOBAL: 2.20% 2.20% avg cycles: 2283
1: 56.92% 59.12% avg cycles: 1276
2: 13.78% 72.90% avg cycles: 1505
3: 8.26% 81.16% avg cycles: 1880
4: 7.41% 88.58% avg cycles: 2447
5: 1.73% 90.31% avg cycles: 2358
6: 1.32% 91.63% avg cycles: 2563
7: 1.14% 92.77% avg cycles: 2862
8: 0.62% 93.39% avg cycles: 3542
9: 0.08% 93.47% avg cycles: 3289
10: 0.43% 93.90% avg cycles: 3570
11: 0.20% 94.10% avg cycles: 3767
12: 0.08% 94.18% avg cycles: 3996
13: 0.03% 94.20% avg cycles: 4077
14: 0.02% 94.23% avg cycles: 4836
15: 0.04% 94.26% avg cycles: 5699
16: 0.06% 94.32% avg cycles: 5041
17: 0.57% 94.89% avg cycles: 5473
18: 0.02% 94.91% avg cycles: 5396
19: 0.03% 94.95% avg cycles: 5296
20: 0.02% 94.96% avg cycles: 6749
21: 0.18% 95.14% avg cycles: 6225
22: 0.01% 95.15% avg cycles: 6393
23: 0.01% 95.16% avg cycles: 6861
24: 0.12% 95.28% avg cycles: 6912
25: 0.05% 95.32% avg cycles: 7190
26: 0.01% 95.33% avg cycles: 7793
27: 0.01% 95.34% avg cycles: 7833
28: 0.01% 95.35% avg cycles: 8253
29: 0.08% 95.42% avg cycles: 8024
30: 0.03% 95.45% avg cycles: 9670
31: 0.01% 95.46% avg cycles: 8949
32: 0.01% 95.46% avg cycles: 9350
33: 3.11% 98.57% avg cycles: 8534
34: 0.02% 98.60% avg cycles: 10977
35: 0.02% 98.62% avg cycles: 11400
We get in to dimishing returns pretty quickly. On pre-IvyBridge
CPUs, we used to set the limit at 8 pages, and it was set at 128
on IvyBrige. That 128 number looks pretty silly considering that
less than 0.5% of the flushes are that large.
The previous code tried to size this number based on the size of
the TLB. Good idea, but it's error-prone, needs maintenance
(which it didn't get up to now), and probably would not matter in
practice much.
Settting it to 33 means that we cover the mallopt
M_TRIM_THRESHOLD, which is the most universally common size to do
flushes.
That's the short version. Here's the long one for why I chose 33:
1. These numbers have a constant bias in the timestamps from the
tracing. Probably counts for a couple hundred cycles in each of
these tests, but it should be fairly _even_ across all of them.
The smallest delta between the tracepoints I have ever seen is
335 cycles. This is one reason the cycles/page cost goes down in
general as the flushes get larger. The true cost is nearer to
100 cycles.
2. A full flush is more expensive than a single invlpg, but not
by much (single percentages).
3. A dtlb miss is 17.1ns (~45 cycles) and a itlb miss is 13.0ns
(~34 cycles). At those rates, refilling the 512-entry dTLB takes
22,000 cycles.
4. 22,000 cycles is approximately the equivalent of doing 85
invlpg operations. But, the odds are that the TLB can
actually be filled up faster than that because TLB misses that
are close in time also tend to leverage the same caches.
6. ~98% of flushes are <=33 pages. There are a lot of flushes of
33 pages, probably because libc's M_TRIM_THRESHOLD is set to
128k (32 pages)
7. I've found no consistent data to support changing the IvyBridge
vs. SandyBridge tunable by a factor of 16
I used the performance counters on this hardware (IvyBridge i5-3320M)
to figure out the tlb miss costs:
ocperf.py stat -e dtlb_load_misses.walk_duration,dtlb_load_misses.walk_completed,dtlb_store_misses.walk_duration,dtlb_store_misses.walk_completed,itlb_misses.walk_duration,itlb_misses.walk_completed,itlb.itlb_flush
7,720,030,970 dtlb_load_misses_walk_duration [57.13%]
169,856,353 dtlb_load_misses_walk_completed [57.15%]
708,832,859 dtlb_store_misses_walk_duration [57.17%]
19,346,823 dtlb_store_misses_walk_completed [57.17%]
2,779,687,402 itlb_misses_walk_duration [57.15%]
82,241,148 itlb_misses_walk_completed [57.13%]
770,717 itlb_itlb_flush [57.11%]
Show that a dtlb miss is 17.1ns (~45 cycles) and a itlb miss is 13.0ns
(~34 cycles). At those rates, refilling the 512-entry dTLB takes
22,000 cycles. On a SandyBridge system with more cores and larger
caches, those are dtlb=13.4ns and itlb=9.5ns.
cat perf.stat.txt | perl -pe 's/,//g'
| awk '/itlb_misses_walk_duration/ { icyc+=$1 }
/itlb_misses_walk_completed/ { imiss+=$1 }
/dtlb_.*_walk_duration/ { dcyc+=$1 }
/dtlb_.*.*completed/ { dmiss+=$1 }
END {print "itlb cyc/miss: ", icyc/imiss, " dtlb cyc/miss: ", dcyc/dmiss, " ----- ", icyc,imiss, dcyc,dmiss }
On Westmere CPUs, the counters to use are: itlb_flush,itlb_misses.walk_cycles,itlb_misses.any,dtlb_misses.walk_cycles,dtlb_misses.any
The assumptions that this code went in under:
https://lkml.org/lkml/2012/6/12/119 say that a flush and a refill are
about 100ns. Being generous, that is over by a factor of 6 on the
refill side, although it is fairly close on the cost of an invlpg.
An increase of a single invlpg operation seems to lengthen the flush
range operation by about 200 cycles. Here is one example of the data
collected for flushing 10 and 11 pages (full data are below):
10: 0.43% 93.90% avg cycles: 3570 cycles/page: 357 samples: 4714
11: 0.20% 94.10% avg cycles: 3767 cycles/page: 342 samples: 2145
How to generate this table:
echo 10000 > /sys/kernel/debug/tracing/buffer_size_kb
echo x86-tsc > /sys/kernel/debug/tracing/trace_clock
echo 'reason != 0' > /sys/kernel/debug/tracing/events/tlb/tlb_flush/filter
echo 1 > /sys/kernel/debug/tracing/events/tlb/tlb_flush/enable
Pipe the trace output in to this script:
http://sr71.net/~dave/intel/201402-tlb/trace-time-diff-process.pl.txt
Note that these data were gathered with the invlpg threshold set to
150 pages. Only data points with >=50 of samples were printed:
Flush % of %<=
in flush this
pages es size
------------------------------------------------------------------------------
-1: 2.20% 2.20% avg cycles: 2283 cycles/page: xxxx samples: 23960
1: 56.92% 59.12% avg cycles: 1276 cycles/page: 1276 samples: 620895
2: 13.78% 72.90% avg cycles: 1505 cycles/page: 752 samples: 150335
3: 8.26% 81.16% avg cycles: 1880 cycles/page: 626 samples: 90131
4: 7.41% 88.58% avg cycles: 2447 cycles/page: 611 samples: 80877
5: 1.73% 90.31% avg cycles: 2358 cycles/page: 471 samples: 18885
6: 1.32% 91.63% avg cycles: 2563 cycles/page: 427 samples: 14397
7: 1.14% 92.77% avg cycles: 2862 cycles/page: 408 samples: 12441
8: 0.62% 93.39% avg cycles: 3542 cycles/page: 442 samples: 6721
9: 0.08% 93.47% avg cycles: 3289 cycles/page: 365 samples: 917
10: 0.43% 93.90% avg cycles: 3570 cycles/page: 357 samples: 4714
11: 0.20% 94.10% avg cycles: 3767 cycles/page: 342 samples: 2145
12: 0.08% 94.18% avg cycles: 3996 cycles/page: 333 samples: 864
13: 0.03% 94.20% avg cycles: 4077 cycles/page: 313 samples: 289
14: 0.02% 94.23% avg cycles: 4836 cycles/page: 345 samples: 236
15: 0.04% 94.26% avg cycles: 5699 cycles/page: 379 samples: 390
16: 0.06% 94.32% avg cycles: 5041 cycles/page: 315 samples: 643
17: 0.57% 94.89% avg cycles: 5473 cycles/page: 321 samples: 6229
18: 0.02% 94.91% avg cycles: 5396 cycles/page: 299 samples: 224
19: 0.03% 94.95% avg cycles: 5296 cycles/page: 278 samples: 367
20: 0.02% 94.96% avg cycles: 6749 cycles/page: 337 samples: 185
21: 0.18% 95.14% avg cycles: 6225 cycles/page: 296 samples: 1964
22: 0.01% 95.15% avg cycles: 6393 cycles/page: 290 samples: 83
23: 0.01% 95.16% avg cycles: 6861 cycles/page: 298 samples: 61
24: 0.12% 95.28% avg cycles: 6912 cycles/page: 288 samples: 1307
25: 0.05% 95.32% avg cycles: 7190 cycles/page: 287 samples: 533
26: 0.01% 95.33% avg cycles: 7793 cycles/page: 299 samples: 94
27: 0.01% 95.34% avg cycles: 7833 cycles/page: 290 samples: 66
28: 0.01% 95.35% avg cycles: 8253 cycles/page: 294 samples: 73
29: 0.08% 95.42% avg cycles: 8024 cycles/page: 276 samples: 846
30: 0.03% 95.45% avg cycles: 9670 cycles/page: 322 samples: 296
31: 0.01% 95.46% avg cycles: 8949 cycles/page: 288 samples: 79
32: 0.01% 95.46% avg cycles: 9350 cycles/page: 292 samples: 60
33: 3.11% 98.57% avg cycles: 8534 cycles/page: 258 samples: 33936
34: 0.02% 98.60% avg cycles: 10977 cycles/page: 322 samples: 268
35: 0.02% 98.62% avg cycles: 11400 cycles/page: 325 samples: 177
36: 0.01% 98.63% avg cycles: 11504 cycles/page: 319 samples: 161
37: 0.02% 98.65% avg cycles: 11596 cycles/page: 313 samples: 182
38: 0.02% 98.66% avg cycles: 11850 cycles/page: 311 samples: 195
39: 0.01% 98.68% avg cycles: 12158 cycles/page: 311 samples: 128
40: 0.01% 98.68% avg cycles: 11626 cycles/page: 290 samples: 78
41: 0.04% 98.73% avg cycles: 11435 cycles/page: 278 samples: 477
42: 0.01% 98.73% avg cycles: 12571 cycles/page: 299 samples: 74
43: 0.01% 98.74% avg cycles: 12562 cycles/page: 292 samples: 78
44: 0.01% 98.75% avg cycles: 12991 cycles/page: 295 samples: 108
45: 0.01% 98.76% avg cycles: 13169 cycles/page: 292 samples: 78
46: 0.02% 98.78% avg cycles: 12891 cycles/page: 280 samples: 261
47: 0.01% 98.79% avg cycles: 13099 cycles/page: 278 samples: 67
48: 0.01% 98.80% avg cycles: 13851 cycles/page: 288 samples: 77
49: 0.01% 98.80% avg cycles: 13749 cycles/page: 280 samples: 66
50: 0.01% 98.81% avg cycles: 13949 cycles/page: 278 samples: 73
52: 0.00% 98.82% avg cycles: 14243 cycles/page: 273 samples: 52
54: 0.01% 98.83% avg cycles: 15312 cycles/page: 283 samples: 87
55: 0.01% 98.84% avg cycles: 15197 cycles/page: 276 samples: 109
56: 0.02% 98.86% avg cycles: 15234 cycles/page: 272 samples: 208
57: 0.00% 98.86% avg cycles: 14888 cycles/page: 261 samples: 53
58: 0.01% 98.87% avg cycles: 15037 cycles/page: 259 samples: 59
59: 0.01% 98.87% avg cycles: 15752 cycles/page: 266 samples: 63
62: 0.00% 98.89% avg cycles: 16222 cycles/page: 261 samples: 54
64: 0.02% 98.91% avg cycles: 17179 cycles/page: 268 samples: 248
65: 0.12% 99.03% avg cycles: 18762 cycles/page: 288 samples: 1324
85: 0.00% 99.10% avg cycles: 21649 cycles/page: 254 samples: 50
127: 0.01% 99.18% avg cycles: 32397 cycles/page: 255 samples: 75
128: 0.13% 99.31% avg cycles: 31711 cycles/page: 247 samples: 1466
129: 0.18% 99.49% avg cycles: 33017 cycles/page: 255 samples: 1927
181: 0.33% 99.84% avg cycles: 2489 cycles/page: 13 samples: 3547
256: 0.05% 99.91% avg cycles: 2305 cycles/page: 9 samples: 550
512: 0.03% 99.95% avg cycles: 2133 cycles/page: 4 samples: 304
1512: 0.01% 99.99% avg cycles: 3038 cycles/page: 2 samples: 65
Here are the tlb counters during a 10-second slice of a kernel compile
for a SandyBridge system. It's better than IvyBridge, but probably
due to the larger caches since this was one of the 'X' extreme parts.
10,873,007,282 dtlb_load_misses_walk_duration
250,711,333 dtlb_load_misses_walk_completed
1,212,395,865 dtlb_store_misses_walk_duration
31,615,772 dtlb_store_misses_walk_completed
5,091,010,274 itlb_misses_walk_duration
163,193,511 itlb_misses_walk_completed
1,321,980 itlb_itlb_flush
10.008045158 seconds time elapsed
# cat perf.stat.1392743721.txt | perl -pe 's/,//g' | awk '/itlb_misses_walk_duration/ { icyc+=$1 } /itlb_misses_walk_completed/ { imiss+=$1 } /dtlb_.*_walk_duration/ { dcyc+=$1 } /dtlb_.*.*completed/ { dmiss+=$1 } END {print "itlb cyc/miss: ", icyc/imiss/3.3, " dtlb cyc/miss: ", dcyc/dmiss/3.3, " ----- ", icyc,imiss, dcyc,dmiss }'
itlb ns/miss: 9.45338 dtlb ns/miss: 12.9716
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20140731154103.10C1115E@viggo.jf.intel.com
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
|
Most of the logic here is in the documentation file. Please take
a look at it.
I know we've come full-circle here back to a tunable, but this
new one is *WAY* simpler. I challenge anyone to describe in one
sentence how the old one worked. Here's the way the new one
works:
If we are flushing more pages than the ceiling, we use
the full flush, otherwise we use per-page flushes.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20140731154101.12B52CAF@viggo.jf.intel.com
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
|
We don't have any good way to figure out what kinds of flushes
are being attempted. Right now, we can try to use the vm
counters, but those only tell us what we actually did with the
hardware (one-by-one vs full) and don't tell us what was actually
_requested_.
This allows us to select out "interesting" TLB flushes that we
might want to optimize (like the ranged ones) and ignore the ones
that we have very little control over (the ones at context
switch).
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20140731154059.4C96CBA5@viggo.jf.intel.com
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
|
There are currently three paths through the remote flush code:
1. full invalidation
2. single page invalidation using invlpg
3. ranged invalidation using invlpg
This takes 2 and 3 and combines them in to a single path by
making the single-page one just be the start and end be start
plus a single page. This makes placement of our tracepoint easier.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20140731154058.E0F90408@viggo.jf.intel.com
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
|
If we take the
if (end == TLB_FLUSH_ALL || vmflag & VM_HUGETLB) {
local_flush_tlb();
goto out;
}
path out of flush_tlb_mm_range(), we will have flushed the tlb,
but not incremented NR_TLB_LOCAL_FLUSH_ALL. This unifies the
way out of the function so that we always take a single path when
doing a full tlb flush.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20140731154056.FF763B76@viggo.jf.intel.com
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
|
I think the flush_tlb_mm_range() code that tries to tune the
flush sizes based on the CPU needs to get ripped out for
several reasons:
1. It is obviously buggy. It uses mm->total_vm to judge the
task's footprint in the TLB. It should certainly be using
some measure of RSS, *NOT* ->total_vm since only resident
memory can populate the TLB.
2. Haswell, and several other CPUs are missing from the
intel_tlb_flushall_shift_set() function. Thus, it has been
demonstrated to bitrot quickly in practice.
3. It is plain wrong in my vm:
[ 0.037444] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
[ 0.037444] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0
[ 0.037444] tlb_flushall_shift: 6
Which leads to it to never use invlpg.
4. The assumptions about TLB refill costs are wrong:
http://lkml.kernel.org/r/1337782555-8088-3-git-send-email-alex.shi@intel.com
(more on this in later patches)
5. I can not reproduce the original data: https://lkml.org/lkml/2012/5/17/59
I believe the sample times were too short. Running the
benchmark in a loop yields times that vary quite a bit.
Note that this leaves us with a static ceiling of 1 page. This
is a conservative, dumb setting, and will be revised in a later
patch.
This also removes the code which attempts to predict whether we
are flushing data or instructions. We expect instruction flushes
to be relatively rare and not worth tuning for explicitly.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20140731154055.ABC88E89@viggo.jf.intel.com
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|