Age | Commit message (Collapse) | Author |
|
Timestamps in pktgen are currently retrieved using the deprecated
do_gettimeofday() function that wraps its signed 32-bit seconds in 2038
(on 32-bit architectures) and requires a division operation to calculate
microseconds.
The pktgen header is also defined with the same limitations, hardcoding
to a 32-bit seconds field that can be interpreted as unsigned to produce
times that only wrap in 2106. Whatever code reads the timestamps should
be aware of that problem in general, but probably doesn't care too
much as we are mostly interested in the time passing between packets,
and that is correctly represented.
Using 64-bit nanoseconds would be cheaper and good for 584 years. Using
monotonic times would also make this unambiguous by avoiding the overflow,
but would make it harder to correlate to the times with those on remote
machines. Either approach would require adding a new runtime flag and
implementing the same thing on the remote side, which we probably don't
want to do unless someone sees it as a real problem. Also, this should
be coordinated with other pktgen implementations and might need a new
magic number.
For the moment, I'm documenting the overflow in the source code, and
changing the implementation over to an open-coded ktime_get_real_ts64()
plus division, so we don't have to look at it again while scanning for
deprecated time interfaces.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Define for reserved register 31 had the incorrect address. Specify
the correct address.
Reported-by: Jens-Peter Oswald <oswald@lre.de>
Signed-off-by: Akshay Bhat <akshay.bhat@timesys.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
|
|
Remove duplicate define for RX8010_YEAR
Signed-off-by: Akshay Bhat <akshay.bhat@timesys.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
|
|
m41t80_sqw_set_rate will be called with the result from
m41t80_sqw_round_rate, so might as well make
m41t80_sqw_set_rate(n) same as
m41t80_sqw_set_rate(m41t80_sqw_round_rate(n))
As Russell King wrote[1],
"clk_round_rate() is supposed to tell you what you end up with if you
ask clk_set_rate() to set the exact same value you passed in - but
clk_round_rate() won't modify the hardware."
[1]
http://lists.infradead.org/pipermail/linux-arm-kernel/2012-January/080175.html
Signed-off-by: Troy Kisky <troy.kisky@boundarydevices.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
|
|
This is a little more efficient and avoids the warning
WARNING: possible circular locking dependency detected
4.14.0-rc7-00010 #16 Not tainted
------------------------------------------------------
kworker/2:1/70 is trying to acquire lock:
(prepare_lock){+.+.}, at: [<c049300c>] clk_prepare_lock+0x80/0xf4
but task is already holding lock:
(i2c_register_adapter){+.+.}, at: [<c0690b04>]
i2c_adapter_lock_bus+0x14/0x18
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (i2c_register_adapter){+.+.}:
rt_mutex_lock+0x44/0x5c
i2c_adapter_lock_bus+0x14/0x18
i2c_transfer+0xa8/0xbc
i2c_smbus_xfer+0x20c/0x5d8
i2c_smbus_read_byte_data+0x38/0x48
m41t80_sqw_is_prepared+0x18/0x28
Signed-off-by: Troy Kisky <troy.kisky@boundarydevices.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
|
|
This is a little more efficient, and avoids the warning
WARNING: possible circular locking dependency detected
4.14.0-rc7-00007 #14 Not tainted
------------------------------------------------------
alsactl/330 is trying to acquire lock:
(prepare_lock){+.+.}, at: [<c049300c>] clk_prepare_lock+0x80/0xf4
but task is already holding lock:
(i2c_register_adapter){+.+.}, at: [<c0690ae0>]
i2c_adapter_lock_bus+0x14/0x18
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (i2c_register_adapter){+.+.}:
rt_mutex_lock+0x44/0x5c
i2c_adapter_lock_bus+0x14/0x18
i2c_transfer+0xa8/0xbc
i2c_smbus_xfer+0x20c/0x5d8
i2c_smbus_read_byte_data+0x38/0x48
m41t80_sqw_recalc_rate+0x24/0x58
Signed-off-by: Troy Kisky <troy.kisky@boundarydevices.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
|
|
Previously it was returning the best of
32768, 8192, 1024, 64, 2, 0
Now, best of
32768, 8192, 4096, 2048, 1024, 512, 256, 128,
64, 32, 16, 8, 4, 2, 1, 0
Signed-off-by: Troy Kisky <troy.kisky@boundarydevices.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
|
|
Previously it was returning -EINVAL upon success.
Signed-off-by: Troy Kisky <troy.kisky@boundarydevices.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
|
|
Note that alarms are not currently implemented.
64 bytes of nvmem is supported and exposed in
sysfs (# is the instance number, starting with 0):
/sys/bus/nvmem/devices/pcf85363-#/nvmem
Signed-off-by: Eric Nelson <eric@nelint.com>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Tested-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
|
|
Register an nvmem device to expose the 3 scratch registers (total of 12
bytes) to both userspace and kernel space.
Reviewed-by: Sekhar Nori <nsekhar@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Reviewed-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
|
|
Commit adfa543e7314 ("dmatest: don't use set_freezable_with_signal()")
introduced a bug (that is in fact documented by the patch commit text)
that leaves behind a dangling pointer. Since the done_wait structure is
allocated on the stack, future invocations to the DMATEST can produce
undesirable results (e.g., corrupted spinlocks). Ideally, this would be
cleaned up in the thread handler, but at the very least, the kernel
is left in a very precarious scenario that can lead to some long debug
sessions when the crash comes later.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=197605
Signed-off-by: Adam Wallis <awallis@codeaurora.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
|
|
This reverts commit 847449f23dcb: ("dmaengine: rcar-dmac: use TCRB instead
of TCR for residue") as it breaks small serial console.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
|
|
Registering qrtr with module_init makes the ability of typical platform
code to create AF_QIPCRTR socket during probe a matter of link order
luck. Moving qrtr to postcore_initcall() avoids this.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
The following patchset contains Netfilter/IPVS updates for your net-next
tree, they are:
1) Speed up table replacement on busy systems with large tables
(and many cores) in x_tables. Now xt_replace_table() synchronizes by
itself by waiting until all cpus had an even seqcount and we use no
use seqlock when fetching old counters, from Florian Westphal.
2) Add nf_l4proto_log_invalid() and nf_ct_l4proto_log_invalid() to speed
up packet processing in the fast path when logging is not enabled, from
Florian Westphal.
3) Precompute masked address from configuration plane in xt_connlimit,
from Florian.
4) Don't use explicit size for set selection if performance set policy
is selected.
5) Allow to get elements from an existing set in nf_tables.
6) Fix incorrect check in nft_hash_deactivate(), from Florian.
7) Cache netlink attribute size result in l4proto->nla_size, from
Florian.
8) Handle NFPROTO_INET in nf_ct_netns_get() from conntrack core.
9) Use power efficient workqueue in conntrack garbage collector, from
Vincent Guittot.
10) Remove unnecessary parameter, in conntrack l4proto functions, also
from Florian.
11) Constify struct nf_conntrack_l3proto definitions, from Florian.
12) Remove all typedefs in nf_conntrack_h323 via coccinelle semantic
patch, from Harsha Sharma.
13) Don't store address in the rbtree nodes in xt_connlimit, they are
never used, from Florian.
14) Fix out of bound access in the conntrack h323 helper, patch from
Eric Sesterhenn.
15) Print symbols for the address returned with %pS in IPVS, from
Helge Deller.
16) Proc output should only display its own netns in IPVS, from
KUWAZAWA Takuya.
17) Small clean up in size_entry_mwt(), from Colin Ian King.
18) Use test_and_clear_bit from nf_nat_proto_clean() instead of separated
non-atomic test and then clear bit, from Florian Westphal.
19) Consolidate prefix length maps in ipset, from Aaron Conole.
20) Fix sparse warnings in ipset, from Jozsef Kadlecsik.
21) Simplify list_set_memsize(), from simran singhal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If source and destination bus width differs pack/unpack MDMA
feature has to be activated for alignment.
This pack/unpack feature implies to have both source/destination address
and buffer length aligned on bus width.
Fixes: a4ffb13c8946 ("dmaengine: Add STM32 MDMA driver")
Signed-off-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
|
|
Since commit 3cab1e711297 ("lib/vsprintf: refactor duplicate code
to special_hex_number()") %pad doesn't need 0x prefix so drop that.
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
|
|
Since commit 3cab1e711297 ("lib/vsprintf: refactor duplicate code
to special_hex_number()") %pad doesn't need 0x prefix so drop that.
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
|
|
Add ethtool statistics support by reading the GOP statistics from the
hardware counters. Also implement a workqueue to gather the statistics
every second or some 32-bit counters could overflow.
Suggested-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: Miquel Raynal <miquel.raynal@free-electrons.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Christophe JAILLET says:
====================
fsl/fman: Fix some error handling code in mac_probe
Commit c6e26ea8c893 ("dpaa_eth: change device used") generated some
conflicts in my patches waiting for submission. So I took a closer look at
it.
So here is a serie of 4 patches.
The 1st one is just about a spurious call to 'dev_set_drvdata()', which is
done in only 1 error handling path in the function.
The 2nd one removes some devm_iounmap/release/kfree functions which look
useless to me.
The 3rd one fixes a missing of_node_put.
The 4th one is just cosmetic and removes a useless message.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Memory allocation functions already display some informaton in case of
memory allocation failure. There is no need to add an extra 'dev_err' here.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If 'of_phy_find_device()' fails, we must undo the previous 'of_node_get()'
call, as done the the following error handling code.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is no need to release explicitly some devm_ allocated resources.
If the 'mac_probe()' probe function fails, they will be released
automatically, as already done in the other error handling paths of
this function.
Also goto '_return_of_get_parent' as in the other error handling paths.
This is useless (priv->fixed_link is NULL at this point), but at least
it is consistent.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit c6e26ea8c893 ("dpaa_eth: change device used") has removed usage of
'dev_set_drvdata()' in the 'mac_probe() function.
This call should also be axed.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The size for IFLA_IF_NETNSID is missing from the size calculation
because the proceeding semicolon was not removed. Fix this by removing
the semicolon.
Detected by CoverityScan, CID#1461135 ("Structurally dead code")
Fixes: 79e1ad148c84 ("rtnetlink: use netnsid to query interface")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A CDC Ethernet functional descriptor with wMaxSegmentSize = 0 will
cause a divide error in usbnet_probe:
divide error: 0000 [#1] PREEMPT SMP KASAN
Modules linked in:
CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc8-44453-g1fdc1a82c34f #56
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
task: ffff88006bef5c00 task.stack: ffff88006bf60000
RIP: 0010:usbnet_update_max_qlen+0x24d/0x390 drivers/net/usb/usbnet.c:355
RSP: 0018:ffff88006bf67508 EFLAGS: 00010246
RAX: 00000000000163c8 RBX: ffff8800621fce40 RCX: ffff8800621fcf34
RDX: 0000000000000000 RSI: ffffffff837ecb7a RDI: ffff8800621fcf34
RBP: ffff88006bf67520 R08: ffff88006bef5c00 R09: ffffed000c43f881
R10: ffffed000c43f880 R11: ffff8800621fc406 R12: 0000000000000003
R13: ffffffff85c71de0 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88006ca00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffe9c0d6dac CR3: 00000000614f4000 CR4: 00000000000006f0
Call Trace:
usbnet_probe+0x18b5/0x2790 drivers/net/usb/usbnet.c:1783
qmi_wwan_probe+0x133/0x220 drivers/net/usb/qmi_wwan.c:1338
usb_probe_interface+0x324/0x940 drivers/usb/core/driver.c:361
really_probe drivers/base/dd.c:413
driver_probe_device+0x522/0x740 drivers/base/dd.c:557
Fix by simply ignoring the bogus descriptor, as it is optional
for QMI devices anyway.
Fixes: 423ce8caab7e ("net: usb: qmi_wwan: New driver for Huawei QMI based WWAN devices")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Setting dev->hard_mtu to 0 will cause a divide error in
usbnet_probe. Protect against devices with bogus CDC Ethernet
functional descriptors by ignoring a zero wMaxSegmentSize.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now that ds->num_ports is 3, there is no need to check range of "port"
parameter.
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
On 32-bit architectures, rtc_time_to_tm() returns incorrect results
in 2038 or later, and do_gettimeofday() is broken for the same reason.
This changes the code to use ktime_get_real_seconds() and time64_to_tm()
instead, both of them are 2038-safe, and we can also get rid of the
CONFIG_RTC_LIB dependency that way.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit a67e9472da42 ("of: Add array read functions with min/max size
limits") added a new interface for reading variable-length arrays from
DT properties. One user was added in dsa recently and this causes a
build error because that code can be built with CONFIG_OF disabled:
net/dsa/dsa2.c: In function 'dsa_switch_parse_member_of':
net/dsa/dsa2.c:678:7: error: implicit declaration of function 'of_property_read_variable_u32_array'; did you mean 'of_property_read_u32_array'? [-Werror=implicit-function-declaration]
This adds a dummy functions for of_property_read_variable_u32_array()
and a few others that had been missing here. I decided to move
of_property_read_string() and of_property_read_string_helper() in the
process to make it easier to compare the two sets of function prototypes
to make sure they match.
Fixes: 975e6e32215e ("net: dsa: rework switch parsing")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We return on the previous line so this "return 0;" statement should just
be deleted.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Egil Hjelmeland says:
====================
net: dsa: lan9303: Linting
This series is non-functional.
- Correct some errors in comments and documentation.
Remove scripts/checkpatch.pl WARNINGs and most CHECKs:
- Replace msleep(1) with usleep_range()
- Adjust indenting
Changes v1 -> v2:
- Removed patch 4 "Remove unnecessary parentheses", to be addressed later
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Remove scripts/checkpatch.pl CHECKs by adjusting indenting.
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Remove scripts/checkpatch.pl WARNING by replacing msleep(1) with usleep_range()
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Two comments refer to registers, but lack the LAN9303_ prefix.
Fix that.
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix to return a negative error code from the VID create error handling
case instead of 0, as done elsewhere in this function.
Fixes: c57529e1d5d8 ("mlxsw: spectrum: Replace vPorts with Port-VLAN")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix to return a negative error code from the dpaa_bp_alloc() error
handling case instead of 0, as done elsewhere in this function.
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 5e9859699aba ("KVM: PPC: Book3S HV: Outline of KVM-HV HPT resizing
implementation", 2016-12-20) added code that tries to exclude any use
or update of the hashed page table (HPT) while the HPT resizing code
is iterating through all the entries in the HPT. It does this by
taking the kvm->lock mutex, clearing the kvm->arch.hpte_setup_done
flag and then sending an IPI to all CPUs in the host. The idea is
that any VCPU task that tries to enter the guest will see that the
hpte_setup_done flag is clear and therefore call kvmppc_hv_setup_htab_rma,
which also takes the kvm->lock mutex and will therefore block until
we release kvm->lock.
However, any VCPU that is already in the guest, or is handling a
hypervisor page fault or hypercall, can re-enter the guest without
rechecking the hpte_setup_done flag. The IPI will cause a guest exit
of any VCPUs that are currently in the guest, but does not prevent
those VCPU tasks from immediately re-entering the guest.
The result is that after resize_hpt_rehash_hpte() has made a HPTE
absent, a hypervisor page fault can occur and make that HPTE present
again. This includes updating the rmap array for the guest real page,
meaning that we now have a pointer in the rmap array which connects
with pointers in the old rev array but not the new rev array. In
fact, if the HPT is being reduced in size, the pointer in the rmap
array could point outside the bounds of the new rev array. If that
happens, we can get a host crash later on such as this one:
[91652.628516] Unable to handle kernel paging request for data at address 0xd0000000157fb10c
[91652.628668] Faulting instruction address: 0xc0000000000e2640
[91652.628736] Oops: Kernel access of bad area, sig: 11 [#1]
[91652.628789] LE SMP NR_CPUS=1024 NUMA PowerNV
[91652.628847] Modules linked in: binfmt_misc vhost_net vhost tap xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables ses enclosure scsi_transport_sas i2c_opal ipmi_powernv ipmi_devintf i2c_core ipmi_msghandler powernv_op_panel nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc kvm_hv kvm_pr kvm scsi_dh_alua dm_service_time dm_multipath tg3 ptp pps_core [last unloaded: stap_552b612747aec2da355051e464fa72a1_14259]
[91652.629566] CPU: 136 PID: 41315 Comm: CPU 21/KVM Tainted: G O 4.14.0-1.rc4.dev.gitb27fc5c.el7.centos.ppc64le #1
[91652.629684] task: c0000007a419e400 task.stack: c0000000028d8000
[91652.629750] NIP: c0000000000e2640 LR: d00000000c36e498 CTR: c0000000000e25f0
[91652.629829] REGS: c0000000028db5d0 TRAP: 0300 Tainted: G O (4.14.0-1.rc4.dev.gitb27fc5c.el7.centos.ppc64le)
[91652.629932] MSR: 900000010280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 44022422 XER: 00000000
[91652.630034] CFAR: d00000000c373f84 DAR: d0000000157fb10c DSISR: 40000000 SOFTE: 1
[91652.630034] GPR00: d00000000c36e498 c0000000028db850 c000000001403900 c0000007b7960000
[91652.630034] GPR04: d0000000117fb100 d000000007ab00d8 000000000033bb10 0000000000000000
[91652.630034] GPR08: fffffffffffffe7f 801001810073bb10 d00000000e440000 d00000000c373f70
[91652.630034] GPR12: c0000000000e25f0 c00000000fdb9400 f000000003b24680 0000000000000000
[91652.630034] GPR16: 00000000000004fb 00007ff7081a0000 00000000000ec91a 000000000033bb10
[91652.630034] GPR20: 0000000000010000 00000000001b1190 0000000000000001 0000000000010000
[91652.630034] GPR24: c0000007b7ab8038 d0000000117fb100 0000000ec91a1190 c000001e6a000000
[91652.630034] GPR28: 00000000033bb100 000000000073bb10 c0000007b7960000 d0000000157fb100
[91652.630735] NIP [c0000000000e2640] kvmppc_add_revmap_chain+0x50/0x120
[91652.630806] LR [d00000000c36e498] kvmppc_book3s_hv_page_fault+0xbb8/0xc40 [kvm_hv]
[91652.630884] Call Trace:
[91652.630913] [c0000000028db850] [c0000000028db8b0] 0xc0000000028db8b0 (unreliable)
[91652.630996] [c0000000028db8b0] [d00000000c36e498] kvmppc_book3s_hv_page_fault+0xbb8/0xc40 [kvm_hv]
[91652.631091] [c0000000028db9e0] [d00000000c36a078] kvmppc_vcpu_run_hv+0xdf8/0x1300 [kvm_hv]
[91652.631179] [c0000000028dbb30] [d00000000c2248c4] kvmppc_vcpu_run+0x34/0x50 [kvm]
[91652.631266] [c0000000028dbb50] [d00000000c220d54] kvm_arch_vcpu_ioctl_run+0x114/0x2a0 [kvm]
[91652.631351] [c0000000028dbbd0] [d00000000c2139d8] kvm_vcpu_ioctl+0x598/0x7a0 [kvm]
[91652.631433] [c0000000028dbd40] [c0000000003832e0] do_vfs_ioctl+0xd0/0x8c0
[91652.631501] [c0000000028dbde0] [c000000000383ba4] SyS_ioctl+0xd4/0x130
[91652.631569] [c0000000028dbe30] [c00000000000b8e0] system_call+0x58/0x6c
[91652.631635] Instruction dump:
[91652.631676] fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ffa1 2fa70000 793d0020 e9432110
[91652.631814] 7bbf26e4 7c7e1b78 7feafa14 409e0094 <807f000c> 786326e4 7c6a1a14 93a40008
[91652.631959] ---[ end trace ac85ba6db72e5b2e ]---
To fix this, we tighten up the way that the hpte_setup_done flag is
checked to ensure that it does provide the guarantee that the resizing
code needs. In kvmppc_run_core(), we check the hpte_setup_done flag
after disabling interrupts and refuse to enter the guest if it is
clear (for a HPT guest). The code that checks hpte_setup_done and
calls kvmppc_hv_setup_htab_rma() is moved from kvmppc_vcpu_run_hv()
to a point inside the main loop in kvmppc_run_vcpu(), ensuring that
we don't just spin endlessly calling kvmppc_run_core() while
hpte_setup_done is clear, but instead have a chance to block on the
kvm->lock mutex.
Finally we also check hpte_setup_done inside the region in
kvmppc_book3s_hv_page_fault() where the HPTE is locked and we are about
to update the HPTE, and bail out if it is clear. If another CPU is
inside kvm_vm_ioctl_resize_hpt_commit) and has cleared hpte_setup_done,
then we know that either we are looking at a HPTE
that resize_hpt_rehash_hpte() has not yet processed, which is OK,
or else we will see hpte_setup_done clear and refuse to update it,
because of the full barrier formed by the unlock of the HPTE in
resize_hpt_rehash_hpte() combined with the locking of the HPTE
in kvmppc_book3s_hv_page_fault().
Fixes: 5e9859699aba ("KVM: PPC: Book3S HV: Outline of KVM-HV HPT resizing implementation")
Cc: stable@vger.kernel.org # v4.10+
Reported-by: Satheesh Rajendran <satheera@in.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
|
Jiri Pirko says:
====================
qdisc RED offload
Nogah says:
Add an offload support for RED qdisc for mlxsw driver.
The first patch adds the ability to offload RED qdisc by using
ndo_setup_tc. It gives RED three commands, to offload, change or delete
the qdisc, to get the qdisc generic stats and to get it's RED xstats.
There is no enforcement on a driver to offload or not offload the qdisc and
it is up to the driver to decide.
RED qdisc is first being created and only later graft to a parent (unless
it is a root qdisc). For that reason the return value of the offload
replace command that is called in the init process doesn't reflect actual
offload state. The offload state is determined in the dump function so it
can be reflected to the user. This function is also responsible for stats
update.
The patchses 2-3 change the name of TC_SETUP_MQPRIO & TC_SETUP_CBS to match
with the new convention of QDISC prefix.
The rest of the patchset is driver support for the qdisc. Currently only
as root qdisc that is being set on the default traffic class. It supports
only the following parameters of RED: min, max, probability and ECN mode.
Limit and burst size related params are being ignored at this moment.
---
v7->v8 internal: (external RFC->v1)
- patch 1/9:
- unite the offload and un-offload functions
- clean the OFFLOAD flag when the qdisc in not offloaded
- patch 2/9:
- minor change to avoid a conflict
- patch 5/9:
- check for bad min/max values
- clean the offloaded qdisc after a bad config call
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for ndo_setup_tc with enum tc_setup_type value of
TC_SETUP_QDISC_STATS. This call updates the generic qdisc stats from the
cache if the handle ID that is asked for matching the root qdisc ID and
fails otherwise.
Currently doesn't support qlen and rqueues.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for ndo_setup_tc with enum tc_setup_type value of
TC_SETUP_RED_XSTATS. This call returns the RED qdisc xstats from the cache
if the handle ID that is asked for matching the root qdisc ID and fails
otherwise.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add more statistics to be collected from the HW periodically. These stats
are tclass based (beside ECN marked packet, that exist only port based).
They are needed to expose RED qdisc stats and xstats correctly.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This adds the counter group definitions for 2 new counter groups
which are necessary for gaining ECN & wred counters.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for ndo_setup_tc with enum tc_setup_type value of TC_SETUP_RED.
This call sets RED qdisc on a traffic class.
This patch supports RED qdisc only as a root qdisc and set in on the
default tclass. It can be set with or without ECN.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds 2 new registers:
- Congestion WRED ECN TClass Profile Register [CWTP]
- Congestion WRED ECN TClass and Pool Mapping Register [CWTPM]
These registers would later be needed to offload RED-related
functionality to the HW.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Change TC_SETUP_CBS to TC_SETUP_QDISC_CBS to match the new convention..
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Change TC_SETUP_MQPRIO to TC_SETUP_QDISC_MQPRIO to match the new
convention.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add the ability to offload RED qdisc by using ndo_setup_tc.
There are four commands for RED offloading:
* TC_RED_SET: handles set and change.
* TC_RED_DESTROY: handle qdisc destroy.
* TC_RED_STATS: update the qdiscs counters (given as reference)
* TC_RED_XSTAT: returns red xstats.
Whether RED is being offloaded is being determined every time dump action
is being called because parent change of this qdisc could change its
offload state but doesn't require any RED function to be called.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The original patch had the wrong filename.
Fixes: bfdf75693875 ("bpf: create samples/bpf/tcp_bpf.readme")
Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After commit 07f4c90062f8 ("tcp/dccp: try to not exhaust ip_local_port_range
in connect()"), we will try to use even ports for connect(). Then if an
application (seen clearly with iperf) opens multiple streams to the same
destination IP and port, each stream will be given an even source port.
So the bonding driver's simple xmit_hash_policy based on layer3+4 addressing
will always hash all these streams to the same interface. And the total
throughput will limited to a single slave.
Change the tcp code will impact the whole tcp behavior, only for bonding
usage. Paolo Abeni suggested fix this by changing the bonding code only,
which should be more reasonable, and less impact.
Fix this by discarding the lowest hash bit because it contains little entropy.
After the fix we can re-balance between slaves.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|