Age | Commit message (Collapse) | Author |
|
Tom Herbert says:
====================
ila: make identifier format optional and other fixes
The identifier type and checksum neutral mapping bits are optional
in identifier formats. This patch set fixes the implementation to
make them optional and configurable.
Specific items:
- Clean up checksum diff code in ILA
- Add checksum neutral mapping auto so that checksum neutral
mapping can be configured without requiring use of the C-bit
- Add identifier type configuration and allow identifier
type to be configured so that the identifier type field does
not need to be present
- Added ILA documention: ila.txt
I have fixes for ILA in iproute2 that will be poseted separately.
Tested: Ran netperf TCP_RR on various combinations of checksum
mode and the two supported identifier types.
v2:
- Add proper sign off
- In ILA LWT, only check prefix length includes identifier type
if identifier type is enabled (ILA_ATYPE_USE_FORMAT).
- Add a hook type so that it can be specified whether ILA
translation is done on input or output route funciton in
LWT.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add documenation for kernel ILA. This describes ILA, features,
configuration gives some examples.
Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In LWT tunnels both an input and output route method is defined.
If both of these are executed in the same path then double translation
happens and the effect is not correct.
This patch adds a new attribute that indicates the hook type. Two
values are defined for route output and route output. ILA
translation is only done for the one that is set. The default is
to enable ILA on route output.
Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Allow identifier to be explicitly configured for a mapping.
This can either be one of the identifier types specified in the
ILA draft or a value of ILA_ATYPE_USE_FORMAT which means the
identifier type is inferred from the identifier type field.
If a value other than ILA_ATYPE_USE_FORMAT is set for a
mapping then it is assumed that the identifier type field is
not present in an identifier.
Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add checksum neutral auto that performs checksum neutral mapping
without using the C-bit. This is enabled by configuration of
a mapping.
The checksum neutral function has been split into
ila_csum_do_neutral_fmt and ila_csum_do_neutral_nofmt. The former
handles the C-bit and includes it in the adjustment value. The latter
just sets the adjustment value on the locator diff only.
Added configuration for checksum neutral map aut in ila_lwt
and ila_xlat.
Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Consolidate computing checksum diff into one function.
Add get_csum_diff_iaddr that computes the checksum diff between
an address argument and locator being written. get_csum_diff
calls this using the destination address in the IP header as
the argument.
Also moved ila_init_saved_csum to be close to the checksum
diff functions.
Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 1fa59bda21c7fa36 ("ARM: shmobile: Remove legacy board code for
Armadillo-800 EVA"), removed the last user of st1232_pdata and the
"st1232-ts" platform device. All remaining users use DT.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Merge with mainline to bring in SPDX markings to avoid annoying merge
problems when some header files get deleted.
|
|
Some Synaptics devices, such as LEN0073, use SMBUS version 3.
Signed-off-by: Yiannis Marangos <yiannis.marangos@gmail.com>
Acked-by: Benjamin Tissoires <benjamion.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
If INPUT_PROP_DIRECT is set, userspace doesn't have to fall back to old
ways of identifying touchscreen devices.
In order to identify a touchscreen device, Android for example, seems to
already depend on INPUT_PROP_DIRECT to be present in drivers. udev still
checks for either BTN_TOUCH or INPUT_PROP_DIRECT. Checking for BTN_TOUCH
however can quite easily lead to false positives; it's a code that not only
touchscreen device drivers use.
According to the documentation, touchscreen drivers should have this
property set and in order to make life easy for userspace, let's set it.
Signed-off-by: Martin Kepplinger <martink@posteo.de>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
ELAN060C touchpad uses elan_i2c as its driver. It can be
found on Lenovo ideapad 320-14AST.
BugLink: https://bugs.launchpad.net/bugs/1727544
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
mlx5e_execute_l2_action
hn is being kfree'd in mlx5e_del_l2_from_hash and then dereferenced
by accessing hn->ai.addr
Fix this by copying the MAC address into a local variable for its safe use
in all possible execution paths within function mlx5e_execute_l2_action.
Addresses-Coverity-ID: 1417789
Fixes: eeb66cdb6826 ("net/mlx5: Separate between E-Switch and MPFS")
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implements port to port forwarding with route table and arp table
lookup for ipv4 packets using bpf_redirect helper function and
lpm_trie map.
Signed-off-by: Christina Jacob <Christina.Jacob@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is better for code locality and should slightly
speed up normal interrupts.
This also allows PPS clock output to start working for
i.mx7. This is because i.mx7 was already using the limit
of 3 interrupts, and needed another.
Signed-off-by: Troy Kisky <troy.kisky@boundarydevices.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The mvpp2 driver can't cope at all with the TX affinities being
changed from userspace, and spit an endless stream of
[ 91.779920] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
[ 91.779930] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
[ 91.780402] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
[ 91.780406] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
[ 91.780415] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
[ 91.780418] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
rendering the box completely useless (I've measured around 600k
interrupts/s on a 8040 box) once irqbalance kicks in and start
doing its job.
Obviously, the driver was never designed with this in mind. So let's
work around the problem by preventing userspace from interacting
with these interrupts altogether.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Vitaly Kuznetsov says:
====================
hv_netvsc: fix a hang on channel/mtu changes
It was found that netvsc driver doesn't survive e.g.
test. I was able to identify a hang in guest/host communication, it is
fixed by PATCH1 of this series. PATCH2 is a cosmetic change masking
unneeded messages.
Changes since v1:
- Throw away patches 2 and 3 of the original series as one is unneeded and
the other is not justified [Eric Dumazet, Stephen Hemminger] so I'm only
fixing the hang now, the crash doesn't reproduce. Will keep an eye on it.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Hyper-V hosts are known to send RNDIS messages even after we halt the
device in rndis_filter_halt_device(). Remove user visible messages
as they are not really useful.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It was found that in some cases host refuses to teardown GPADL for send/
receive buffers (probably when some work with these buffere is scheduled or
ongoing). Change the teardown logic to be:
1) Send NVSP_MSG1_TYPE_REVOKE_* messages
2) Close the channel
3) Teardown GPADLs.
This seems to work reliably.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, we create a LED trigger for any link speed known to a PHY.
These triggers only fire when their exact link speed had been negotiated
(they aren't cumulative, that is, they don't fire for "their or any higher"
link speed).
What we are missing, however, is a trigger which will fire on any link
speed known to the PHY. Such trigger can then be used for implementing a
poor man's substitute of the "link" LED on boards that lack it.
Let's add it.
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, phy_led_trigger_change_speed() is handling a "no link" condition
like it was some kind of an error (using "goto" to a code at the function
end).
However, having no link at PHY is an ordinary operational state, so let's
handle it in an appropriately named separate function so it is more obvious
what the code is doing.
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Localize PCI_QUIRKS in the PCI bus menu.
Move PCI_QUIRKS to the PCI bus menu instead of the (often broken) General
Setup EXPERT menu. The prompt still depends on EXPERT.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
pdev_save_srm_config() and struct pdev_srm_saved_conf are only used in
arch/alpha/kernel/pci.c, so make them static there.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
|
|
Remove these unused declarations:
pcibios_config_init() # never defined anywhere
pcibios_scan_root() # only defined by x86
pcibios_get_irq_routing_table() # only defined by x86
pcibios_set_irq_routing() # only defined by x86
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
|
|
<linux/pci.h> defines struct pci_bus and struct pci_dev and includes the
struct resource definition before including <asm/pci.h>. Nobody includes
<asm/pci.h> directly, so they don't need their own declarations.
Remove the redundant struct pci_dev, pci_bus, resource declarations.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> # CRIS
Acked-by: Ralf Baechle <ralf@linux-mips.org> # MIPS
|
|
All users of pcibios_set_master() include <linux/pci.h>, which already has
a declaration. Remove the unnecessary declarations from the <asm/pci.h>
files.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> # CRIS
Acked-by: Ralf Baechle <ralf@linux-mips.org> # MIPS
|
|
PCIe PME and native hotplug share the same interrupt number, so hotplug
interrupts are also processed by PME. In some cases, e.g., a Link Down
interrupt, a device may be present but unreachable, so when we try to
read its Root Status register, the read fails and we get all ones data
(0xffffffff).
Previously, we interpreted that data as PCI_EXP_RTSTA_PME being set, i.e.,
"some device has asserted PME," so we scheduled pcie_pme_work_fn(). This
caused an infinite loop because pcie_pme_work_fn() tried to handle PME
requests until PCI_EXP_RTSTA_PME is cleared, but with the link down,
PCI_EXP_RTSTA_PME can't be cleared.
Check for the invalid 0xffffffff data everywhere we read the Root Status
register.
1469d17dd341 ("PCI: pciehp: Handle invalid data when reading from
non-existent devices") added similar checks in the hotplug driver.
Signed-off-by: Qiang Zheng <zhengqiang10@huawei.com>
[bhelgaas: changelog, also check in pcie_pme_work_fn(), use "~0" to follow
other similar checks]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Gabriele is now moving to a different role, so remove him as HiSilicon PCI
maintainer.
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
[bhelgaas: Thanks for all your help, Gabriele, and best wishes!]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Zhou Wang <wangzhou1@hisilicon.com>
|
|
Just sent an email there and received an autoreply because he no longer
works there.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The effective_affinity_mask is always set when an interrupt is assigned in
__assign_irq_vector() -> apic->cpu_mask_to_apicid(), e.g. for struct apic
apic_physflat: -> default_cpu_mask_to_apicid() ->
irq_data_update_effective_affinity(), but it looks d->common->affinity
remains all-1's before the user space or the kernel changes it later.
In the early allocation/initialization phase of an IRQ, we should use the
effective_affinity_mask, otherwise Hyper-V may not deliver the interrupt to
the expected CPU. Without the patch, if we assign 7 Mellanox ConnectX-3
VFs to a 32-vCPU VM, one of the VFs may fail to receive interrupts.
Tested-by: Adrian Suhov <v-adsuho@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jake Oshins <jakeo@microsoft.com>
Cc: stable@vger.kernel.org
Cc: Jork Loeser <jloeser@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
|
|
There are no more users left of the gpd_dev_ops.active_wakeup()
callback. All have been converted to GENPD_FLAG_ACTIVE_WAKEUP.
Hence remove the callback.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Set the newly introduced GENPD_FLAG_ACTIVE_WAKEUP, which allows to
remove the driver's own flag-based callback.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Set the newly introduced GENPD_FLAG_ACTIVE_WAKEUP, which allows to
remove the driver's own flag-based callback.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Set the newly introduced GENPD_FLAG_ACTIVE_WAKEUP, which allows to
remove the driver's own "always true" callback.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
It is quite common for PM Domains to require slave devices to be kept
active during system suspend if they are to be used as wakeup sources.
To enable this, currently each PM Domain or driver has to provide its
own gpd_dev_ops.active_wakeup() callback.
Introduce a new flag GENPD_FLAG_ACTIVE_WAKEUP to consolidate this.
If specified, all slave devices configured as wakeup sources will be
kept active during system suspend.
PM Domains that need more fine-grained controls, based on the slave
device, can still provide their own callbacks, as before.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The WLAN LED on the Linksys WRT54GSv1 is active low, but the software
treats it as active high. Fix the inverted logic.
Fixes: 7bb26b169116 ("MIPS: BCM47xx: Fix LEDs on WRT54GS V1.0")
Signed-off-by: Mirko Parthey <mirko.parthey@web.de>
Looks-ok-by: Rafał Miłecki <zajec5@gmail.com>
Cc: Hauke Mehrtens <hauke@hauke-m.de>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.17+
Patchwork: https://patchwork.linux-mips.org/patch/16071/
Signed-off-by: James Hogan <jhogan@kernel.org>
|
|
ASC1 is available on every Lantiq SoC (also AmazonSE) and should be
enabled like the other generic xway clocks instead of ASC0, which is
only available for AR9 and Danube.
Signed-off-by: Martin Schiller <ms@dev.tdt.de>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: John Crispin <john@phrozen.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Martin Schiller <ms@dev.tdt.de>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16145/
[jhogan@kernel.org: Drop braces]
Signed-off-by: James Hogan <jhogan@kernel.org>
|
|
Since it can take a while before a specific thread gets scheduled, it
is better to just implement a first come first served queue mechanism.
That way, if a thread is already scheduled and is idle, it can pick up
the work to do from the queue.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
If a delegation has been revoked by the server, operations using that
delegation should error out with NFS4ERR_DELEG_REVOKED in the >4.1
case, and NFS4ERR_BAD_STATEID otherwise.
The server needs NFSv4.1 clients to explicitly free revoked delegations.
If the server returns NFS4ERR_DELEG_REVOKED, the client will do that;
otherwise it may just forget about the delegation and be unable to
recover when it later sees SEQ4_STATUS_RECALLABLE_STATE_REVOKED set on a
SEQUENCE reply. That can cause the Linux 4.1 client to loop in its
stage manager.
Signed-off-by: Andrew Elble <aweits@rit.edu>
Reviewed-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
I noticed the server was sometimes not closing the connection after
a flushed Send. For example, if the client responds with an RNR NAK
to a Reply from the server, that client might be deadlocked, and
thus wouldn't send any more traffic. Thus the server wouldn't have
any opportunity to notice the XPT_CLOSE bit has been set.
Enqueue the transport so that svcxprt notices the bit even if there
is no more transport activity after a flushed completion, QP access
error, or device removal event.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-By: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Publishing of net pointer is not safe,
let's use nfs->ns.inum instead
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
It would be kinder to WARN() and recover in several spots here instead
of BUG()ing.
Also, it looks like the read_u32_from_xdr_buf() call could actually
fail, though it might require a broken (or malicious) client, so convert
that to just an error return.
Reported-by: Weston Andros Adamson <dros@monkey.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
During each NFSv4 callback Call, an RDMA Send completion frees the
page that contains the RPC Call message. If the upper layer
determines that a retransmit is necessary, this is too soon.
One possible symptom: after a GARBAGE_ARGS response an NFSv4.1
callback request, the following BUG fires on the NFS server:
kernel: BUG: Bad page state in process kworker/0:2H pfn:7d3ce2
kernel: page:ffffea001f4f3880 count:-2 mapcount:0 mapping: (null) index:0x0
kernel: flags: 0x2fffff80000000()
kernel: raw: 002fffff80000000 0000000000000000 0000000000000000 fffffffeffffffff
kernel: raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000
kernel: page dumped because: nonzero _refcount
kernel: Modules linked in: cts rpcsec_gss_krb5 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm
ocfs2_nodemanager ocfs2_stackglue rpcrdm a ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad
rdma_cm ib_cm iw_cm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel
kvm irqbypass crct10dif_pc lmul crc32_pclmul ghash_clmulni_intel pcbc iTCO_wdt
iTCO_vendor_support aesni_intel crypto_simd glue_helper cryptd pcspkr lpc_ich i2c_i801
mei_me mf d_core mei raid0 sg wmi ioatdma ipmi_si ipmi_devintf ipmi_msghandler shpchp
acpi_power_meter acpi_pad nfsd nfs_acl lockd auth_rpcgss grace sunrpc ip_tables xfs
libcrc32c mlx4_en mlx4_ib mlx5_ib ib_core sd_mod sr_mod cdrom ast drm_kms_helper
syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci crc32c_intel libahci drm
mlx5_core igb libata mlx4_core dca i2c_algo_bit i2c_core nvme
kernel: ptp nvme_core pps_core dm_mirror dm_region_hash dm_log dm_mod dax
kernel: CPU: 0 PID: 11495 Comm: kworker/0:2H Not tainted 4.14.0-rc3-00001-g577ce48 #811
kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
kernel: Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
kernel: Call Trace:
kernel: dump_stack+0x62/0x80
kernel: bad_page+0xfe/0x11a
kernel: free_pages_check_bad+0x76/0x78
kernel: free_pcppages_bulk+0x364/0x441
kernel: ? ttwu_do_activate.isra.61+0x71/0x78
kernel: free_hot_cold_page+0x1c5/0x202
kernel: __put_page+0x2c/0x36
kernel: svc_rdma_put_context+0xd9/0xe4 [rpcrdma]
kernel: svc_rdma_wc_send+0x50/0x98 [rpcrdma]
This issue exists all the way back to v4.5, but refactoring and code
re-organization prevents this simple patch from applying to kernels
older than v4.12. The fix is the same, however, if someone needs to
backport it.
Reported-by: Ben Coddington <bcodding@redhat.com>
BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=314
Fixes: 5d252f90a800 ('svcrdma: Add class for RDMA backwards ... ')
Cc: stable@vger.kernel.org # v4.12
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
do_gettimeofday() is deprecated and we should generally use time64_t
based functions instead.
In case of nfsd, all three users of nfssvc_boot only use the initial
time as a unique token, and are not affected by it overflowing, so they
are not affected by the y2038 overflow.
This converts the structure to timespec64 anyway and adds comments
to all uses, to document that we have thought about it and avoid
having to look at it again.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
atomic_t variables are currently used to implement reference
counters with the following properties:
- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)
Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.
The variable nfs4_file.fi_ref is used as pure reference counter.
Convert it to refcount_t and fix up the operations.
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
atomic_t variables are currently used to implement reference
counters with the following properties:
- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)
Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.
The variable nfs4_cntl_odstate.co_odcount is used as pure reference counter.
Convert it to refcount_t and fix up the operations.
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
atomic_t variables are currently used to implement reference
counters with the following properties:
- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)
Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.
The variable nfs4_stid.sc_count is used as pure reference counter.
Convert it to refcount_t and fix up the operations.
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
lockd_up() can call lockd_unregister_notifiers twice:
inside lockd_start_svc() when it calls lockd_svc_exit_thread()
and then in error path of lockd_up()
Patch forces lockd_start_svc() to unregister notifiers in all error cases
and removes extra unregister in error path of lockd_up().
Fixes: cb7d224f82e4 "lockd: unregister notifier blocks if the service ..."
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
The spec allows us to return NFS4ERR_SEQ_FALSE_RETRY if we notice that
the client is making a call that matches a previous (slot, seqid) pair
but that *isn't* actually a replay, because some detail of the call
doesn't actually match the previous one.
Catching every such case is difficult, but we may as well catch a few
easy ones. This also handles the case described in the previous patch,
in a different way.
The spec does however require us to catch the case where the difference
is in the rpc credentials. This prevents somebody from snooping another
user's replies by fabricating retries.
(But the practical value of the attack is limited by the fact that the
replies with the most sensitive data are READ replies, which are not
normally cached.)
Tested-by: Olga Kornievskaia <aglo@umich.edu>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Currently our handling of 4.1+ requests without "cachethis" set is
confusing and not quite correct.
Suppose a client sends a compound consisting of only a single SEQUENCE
op, and it matches the seqid in a session slot (so it's a retry), but
the previous request with that seqid did not have "cachethis" set.
The obvious thing to do might be to return NFS4ERR_RETRY_UNCACHED_REP,
but the protocol only allows that to be returned on the op following the
SEQUENCE, and there is no such op in this case.
The protocol permits us to cache replies even if the client didn't ask
us to. And it's easy to do so in the case of solo SEQUENCE compounds.
So, when we get a solo SEQUENCE, we can either return the previously
cached reply or NFSERR_SEQ_FALSE_RETRY if we notice it differs in some
way from the original call.
Currently, we're returning a corrupt reply in the case a solo SEQUENCE
matches a previous compound with more ops. This actually matters
because the Linux client recently started doing this as a way to recover
from lost replies to idempotent operations in the case the process doing
the original reply was killed: in that case it's difficult to keep the
original arguments around to do a real retry, and the client no longer
cares what the result is anyway, but it would like to make sure that the
slot's sequence id has been incremented, and the solo SEQUENCE assures
that: if the server never got the original reply, it will increment the
sequence id. If it did get the original reply, it won't increment, and
nothing else that about the reply really matters much. But we can at
least attempt to return valid xdr!
Tested-by: Olga Kornievskaia <aglo@umich.edu>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
The function _svc_create_xprt is local to the source and
does not need to be in global scope, so make it static.
Cleans up sparse warning:
symbol '_svc_create_xprt' was not declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Jeff Layton <jlayton@poochiereds.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|