Age | Commit message (Collapse) | Author |
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fix from Arnaldo Carvalho de Melo:
- Use index, not CPU id, to find core/pkg id in 'perf stat' (Kan Liang)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
This reverts commit 5e22ec019823b0204720e1ad9a5866c638332b3a.
|
|
net/ipv4/af_inet.c: In function 'snmp_get_cpu_field64':
>> net/ipv4/af_inet.c:1486:26: error: 'offt' undeclared (first use in this function)
v = *(((u64 *)bhptr) + offt);
^
net/ipv4/af_inet.c:1486:26: note: each undeclared identifier is reported only once for each function it appears in
net/ipv4/af_inet.c: In function 'snmp_fold_field64':
>> net/ipv4/af_inet.c:1499:39: error: 'offct' undeclared (first use in this function)
res += snmp_get_cpu_field(mib, cpu, offct, syncp_offset);
^
>> net/ipv4/af_inet.c:1499:10: error: too many arguments to function 'snmp_get_cpu_field'
res += snmp_get_cpu_field(mib, cpu, offct, syncp_offset);
^
net/ipv4/af_inet.c:1455:5: note: declared here
u64 snmp_get_cpu_field(void __percpu *mib, int cpu, int offt)
^
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Poll() returns immediately after setting the kernel current frame
(ring->head) to SKIP from user space even though there is no new
frame. And in a case of all frames is VALID, user space program
unintensionally sets (only) kernel current frame to UNUSED, then
calls poll(), it will not return immediately even though there are
VALID frames.
To avoid situations like above, I think we need to scan all frames
to find VALID frames at poll() like netlink_alloc_skb(),
netlink_forward_ring() finding an UNUSED frame at skb allocation.
Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Aleksey Makarov says:
====================
net: thunderx: New features and fixes
v2:
- The unused affinity_mask field of the structure cmp_queue
has been deleted. (thanks to David Miller)
- The unneeded initializers have been dropped. (thanks to Alexey Klimov)
- The commit message "net: thunderx: Rework interrupt handling"
has been fixed. (thanks to Alexey Klimov)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Support for setting VF's corresponding BGX LMAC in internal
loopback mode. This mode can be used for verifying basic HW
functionality such as packet I/O, RX checksum validation,
CQ/RBDR interrupts, stats e.t.c. Useful when DUT has no external
network connectivity.
'loopback' mode can be enabled or disabled via ethtool.
Note: This feature is not supported when no of VFs enabled are
morethan no of physical interfaces i.e active BGX LMACs
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds support for handling multiple qsets assigned to a
single VF. There by increasing no of queues from earlier 8 to max
no of CPUs in the system i.e 48 queues on a single node and 96 on
dual node system. User doesn't have option to assign which Qsets/VFs
to be merged. Upon request from VF, PF assigns next free Qsets as
secondary qsets. To maintain current behavior no of queues is kept
to 8 by default which can be increased via ethtool.
If user wants to unbind NICVF driver from a secondary Qset then it
should be done after tearing down primary VF's interface.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Rework interrupt handler to avoid checking IRQ affinity of
CQ interrupts. Now separate handlers are registered for each IRQ
including RBDR. Register interrupt handlers for only those
which are being used. Add nicvf_dump_intr_status() and use it
in irq handlers.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch configures HW to strip 802.1Q header if found in a
receiving packet. The stripped VLAN ID and TCI information is
passed on to software via CQE_RX. Also sets netdev's 'vlan_features'
so that other HW offload features can be used for tagged packets.
This offload feature can be enabled or disabled via ethtool.
Network stack normally ignores RPS for 802.1Q packets and hence low
throughput. With this offload enabled throughput for tagged packets
will be almost same as normal packets.
Note: This patch doesn't enable HW VLAN insertion for transmit packets.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Adding support for receive hashing HW offload by using RSS_ALG
and RSS_TAG fields of CQE_RX descriptor. Also removed dependency
on minimum receive queue count to configure RSS so that hash is
always generated.
This hash is used by RPS logic to distribute flows across multiple
CPUs. Offload can be disabled via ethtool.
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use the nicvf_send_msg_to_pf() function in the mailbox code.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Added ethtool support to dump receive packet error statistics reported
in CQE. Also made some small fixes
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The liquidio and thunder drivers have different maintainers.
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Raghavendra K T says:
====================
Optimize the snmp stat aggregation for large cpus
While creating 1000 containers, perf is showing lot of time spent in
snmp_fold_field on a large cpu system.
The current patch tries to improve by reordering the statistics gathering.
Please note that similar overhead was also reported while creating
veth pairs https://lkml.org/lkml/2013/3/19/556
Changes in V4:
- remove 'item' variable and use IPSTATS_MIB_MAX to avoid sparse
warning (Eric) also remove 'item' parameter (Joe)
- add missing memset of padding.
Changes in V3:
- use memset to initialize temp buffer in leaf function. (David)
- use memcpy to copy the buffer data to stat instead of unalign_pu (Joe)
- Move buffer definition to leaf function __snmp6_fill_stats64() (Eric)
-
Changes in V2:
- Allocate the stat calculation buffer in stack. (Eric)
Setup:
160 cpu (20 core) baremetal powerpc system with 1TB memory
1000 docker containers was created with command
docker run -itd ubuntu:15.04 /bin/bash in loop
observation:
Docker container creation linearly increased from around 1.6 sec to 7.5 sec
(at 1000 containers) perf data showed, creating veth interfaces resulting in
the below code path was taking more time.
rtnl_fill_ifinfo
-> inet6_fill_link_af
-> inet6_fill_ifla6_attrs
-> snmp_fold_field
proposed idea:
currently __snmp6_fill_stats64 calls snmp_fold_field that walks
through per cpu data to of an item (iteratively for around 36 items).
The patch tries to aggregate the statistics by going through
all the items of each cpu sequentially which is reducing cache
misses.
Performance of docker creation improved by around more than 2x
after the patch.
before the patch:
================
3f45ba571a42e925c4ec4aaee0e48d7610a9ed82a4c931f83324d41822cf6617
real 0m6.836s
user 0m0.095s
sys 0m0.011s
perf record -a docker run -itd ubuntu:15.04 /bin/bash
=======================================================
50.73% docker [kernel.kallsyms] [k] snmp_fold_field
9.07% swapper [kernel.kallsyms] [k] snooze_loop
3.49% docker [kernel.kallsyms] [k] veth_stats_one
2.85% swapper [kernel.kallsyms] [k] _raw_spin_lock
1.37% docker docker [.] backtrace_qsort
1.31% docker docker [.] strings.FieldsFunc
cache-misses: 2.7%
after the patch:
=============
9178273e9df399c8290b6c196e4aef9273be2876225f63b14a60cf97eacfafb5
real 0m3.249s
user 0m0.088s
sys 0m0.020s
perf record -a docker run -itd ubuntu:15.04 /bin/bash
=======================================================
10.57% docker docker [.] scanblock
8.37% swapper [kernel.kallsyms] [k] snooze_loop
6.91% docker [kernel.kallsyms] [k] snmp_get_cpu_field
6.67% docker [kernel.kallsyms] [k] veth_stats_one
3.96% docker docker [.] runtime_MSpan_Sweep
2.47% docker docker [.] strings.FieldsFunc
cache-misses: 1.41 %
Please let me know if you have suggestions/comments.
Thanks Eric, Joe and David for the comments.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Docker container creation linearly increased from around 1.6 sec to 7.5 sec
(at 1000 containers) and perf data showed 50% ovehead in snmp_fold_field.
reason: currently __snmp6_fill_stats64 calls snmp_fold_field that walks
through per cpu data of an item (iteratively for around 36 items).
idea: This patch tries to aggregate the statistics by going through
all the items of each cpu sequentially which is reducing cache
misses.
Docker creation got faster by more than 2x after the patch.
Result:
Before After
Docker creation time 6.836s 3.25s
cache miss 2.7% 1.41%
perf before:
50.73% docker [kernel.kallsyms] [k] snmp_fold_field
9.07% swapper [kernel.kallsyms] [k] snooze_loop
3.49% docker [kernel.kallsyms] [k] veth_stats_one
2.85% swapper [kernel.kallsyms] [k] _raw_spin_lock
perf after:
10.57% docker docker [.] scanblock
8.37% swapper [kernel.kallsyms] [k] snooze_loop
6.91% docker [kernel.kallsyms] [k] snmp_get_cpu_field
6.67% docker [kernel.kallsyms] [k] veth_stats_one
changes/ideas suggested:
Using buffer in stack (Eric), Usage of memset (David), Using memcpy in
place of unaligned_put (Joe).
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
On Alpha we have spinlocks that are 32b in size and an efficient
cmpxchg64 implementation, so we qualify to make use of cmpxchg backed
lockrefs. Select the ARCH_USE_CMPXCHG_LOCKREF Kconfig symbol and provide
a trivial implementation of arch_spin_value_unlocked to satisfy the
lockref code.
Using Linus' simple testcase from
http://article.gmane.org/gmane.linux.file-systems/77466 on a dual CPU
ES47 system I see around an 8% gain:
N Min Max Median Avg Stddev
x 30 6194580 6295654 6272504 6272514 17694.232
+ 30 6731164 6786334 6767982 6764274 13738.863
Difference at 95.0% confidence
491760 +/- 8188.17
7.83992% +/- 0.130541%
(Student's t, pooled s = 15840.5)
Signed-off-by: Matt Turner <mattst88@gmail.com>
|
|
Not all gpio banks are necessarily enabled, in the current code this can
lead to null pointer dereferences.
[ 51.130000] Unable to handle kernel NULL pointer dereference at virtual address 00000058
[ 51.130000] pgd = dee04000
[ 51.130000] [00000058] *pgd=3f66d831, *pte=00000000, *ppte=00000000
[ 51.140000] Internal error: Oops: 17 [#1] ARM
[ 51.140000] Modules linked in:
[ 51.140000] CPU: 0 PID: 1664 Comm: cat Not tainted 4.1.1+ #6
[ 51.140000] Hardware name: Atmel SAMA5
[ 51.140000] task: df6dd880 ti: dec60000 task.ti: dec60000
[ 51.140000] PC is at at91_pinconf_get+0xb4/0x200
[ 51.140000] LR is at at91_pinconf_get+0xb4/0x200
[ 51.140000] pc : [<c01e71a0>] lr : [<c01e71a0>] psr: 600f0013
sp : dec61e48 ip : 600f0013 fp : df522538
[ 51.140000] r10: df52250c r9 : 00000058 r8 : 00000068
[ 51.140000] r7 : 00000000 r6 : df53c910 r5 : 00000000 r4 : dec61e7c
[ 51.140000] r3 : 00000000 r2 : c06746d4 r1 : 00000000 r0 : 00000003
[ 51.140000] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
[ 51.140000] Control: 10c53c7d Table: 3ee04059 DAC: 00000015
[ 51.140000] Process cat (pid: 1664, stack limit = 0xdec60208)
[ 51.140000] Stack: (0xdec61e48 to 0xdec62000)
[ 51.140000] 1e40: 00000358 00000000 df522500 ded15f80 c05a9d08 ded15f80
[ 51.140000] 1e60: 0000048c 00000061 df522500 ded15f80 c05a9d08 c01e7304 ded15f80 00000000
[ 51.140000] 1e80: c01e6008 00000060 0000048c c01e6034 c01e5f6c ded15f80 dec61ec0 00000000
[ 51.140000] 1ea0: 00020000 ded6f280 dec61f80 00000001 00000001 c00ae0b8 b6e80000 ded15fb0
[ 51.140000] 1ec0: 00000000 00000000 df4bc974 00000055 00000800 ded6f280 b6e80000 ded6f280
[ 51.140000] 1ee0: ded6f280 00020000 b6e80000 00000000 00020000 c0090dec c0671e1c dec61fb0
[ 51.140000] 1f00: b6f8b510 00000001 00004201 c000924c 00000000 00000003 00000003 00000000
[ 51.140000] 1f20: df4bc940 00022000 00000022 c066e188 b6e7f000 c00836f4 000b6e7f ded6f280
[ 51.140000] 1f40: ded6f280 b6e80000 dec61f80 ded6f280 00020000 c0091508 00000000 00000003
[ 51.140000] 1f60: 00022000 00000000 00000000 ded6f280 ded6f280 00020000 b6e80000 c0091d9c
[ 51.140000] 1f80: 00000000 00000000 ffffffff 00020000 00020000 b6e80000 00000003 c000f124
[ 51.140000] 1fa0: dec60000 c000efa0 00020000 00020000 00000003 b6e80000 00020000 000271c4
[ 51.140000] 1fc0: 00020000 00020000 b6e80000 00000003 7fffe000 00000000 00000000 00020000
[ 51.140000] 1fe0: 00000000 bef50b64 00013835 b6f29c76 400f0030 00000003 00000000 00000000
[ 51.140000] [<c01e71a0>] (at91_pinconf_get) from [<c01e7304>] (at91_pinconf_dbg_show+0x18/0x2c0)
[ 51.140000] [<c01e7304>] (at91_pinconf_dbg_show) from [<c01e6034>] (pinconf_pins_show+0xc8/0xf8)
[ 51.140000] [<c01e6034>] (pinconf_pins_show) from [<c00ae0b8>] (seq_read+0x1a0/0x464)
[ 51.140000] [<c00ae0b8>] (seq_read) from [<c0090dec>] (__vfs_read+0x20/0xd0)
[ 51.140000] [<c0090dec>] (__vfs_read) from [<c0091508>] (vfs_read+0x7c/0x108)
[ 51.140000] [<c0091508>] (vfs_read) from [<c0091d9c>] (SyS_read+0x40/0x94)
[ 51.140000] [<c0091d9c>] (SyS_read) from [<c000efa0>] (ret_fast_syscall+0x0/0x3c)
[ 51.140000] Code: eb010ec2 e30a0d08 e34c005a eb0ae5a7 (e5993000)
[ 51.150000] ---[ end trace fb3c370da3ea4794 ]---
Fixes: a0b957f306fa ("pinctrl: at91: allow to have disabled gpio bank")
Cc: stable@vger.kernel.org # 3.18
Signed-off-by: David Dueck <davidcdueck@googlemail.com>
Acked-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Boris Brezillon <boris.brezillon@free-electrons.com>
Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
|
|
'asoc/topic/zx296702' into asoc-next
|
|
'asoc/topic/wm8904', 'asoc/topic/wm8960' and 'asoc/topic/wm8983' into asoc-next
|
|
'asoc/topic/wm5110', 'asoc/topic/wm8004' and 'asoc/topic/wm8731' into asoc-next
|
|
'asoc/topic/ux500' and 'asoc/topic/width' into asoc-next
|
|
'asoc/topic/tegra', 'asoc/topic/tlv' and 'asoc/topic/topology' into asoc-next
|
|
'asoc/topic/sti' and 'asoc/topic/sti-sas' into asoc-next
|
|
'asoc/topic/simple', 'asoc/topic/sirf-codec' and 'asoc/topic/spear' into asoc-next
|
|
'asoc/topic/rt5651' and 'asoc/topic/rt5670' into asoc-next
|
|
'asoc/topic/rl6231', 'asoc/topic/rockchip' and 'asoc/topic/rt286' into asoc-next
|
|
'asoc/topic/qcom' into asoc-next
|
|
'asoc/topic/nuc900', 'asoc/topic/of-name' and 'asoc/topic/omap' into asoc-next
|
|
'asoc/topic/max98357a', 'asoc/topic/max9877' and 'asoc/topic/max98925' into asoc-next
|
|
'asoc/topic/lm49453', 'asoc/topic/max9768' and 'asoc/topic/max98088' into asoc-next
|
|
'asoc/topic/gtm601', 'asoc/topic/ics43432' and 'asoc/topic/ids' into asoc-next
|
|
'asoc/topic/fsl-asrc', 'asoc/topic/fsl-card' and 'asoc/topic/fsl-sai' into asoc-next
|
|
'asoc/topic/davinci-vcif', 'asoc/topic/doc' and 'asoc/topic/dpcm' into asoc-next
|
|
'asoc/topic/cs4349' and 'asoc/topic/da732x' into asoc-next
|
|
'asoc/topic/cs4265' and 'asoc/topic/cs42l52' into asoc-next
|
|
'asoc/topic/blackfin' and 'asoc/topic/card' into asoc-next
|
|
'asoc/topic/ak4542', 'asoc/topic/arizona' and 'asoc/topic/atmel' into asoc-next
|
|
|
|
|
|
|
|
|
|
|
|
'asoc/fix/max98090', 'asoc/fix/rt5640', 'asoc/fix/samsung' and 'asoc/fix/wm8994' into asoc-linus
|
|
|
|
Use devm_* API to fix leaks in current code.
1. Use devm_kzalloc to fix memory leak for zx_i2s when unload the module.
2. Use devm_snd_soc_register_component to ensure component is unregistered
when unload the module.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Reviewed-by: Jun Nie <jun.nie@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
|