Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull thread_info flag accessor helper updates from Borislav Petkov:
"Add a set of thread_info.flags accessors which snapshot it before
accesing it in order to prevent any potential data races, and convert
all users to those new accessors"
* tag 'core_entry_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
powerpc: Snapshot thread flags
powerpc: Avoid discarding flags in system_call_exception()
openrisc: Snapshot thread flags
microblaze: Snapshot thread flags
arm64: Snapshot thread flags
ARM: Snapshot thread flags
alpha: Snapshot thread flags
sched: Snapshot thread flags
entry: Snapshot thread flags
x86: Snapshot thread flags
thread_info: Add helpers to snapshot thread flags
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SGX updates from Borislav Petkov:
- Add support for handling hw errors in SGX pages: poisoning,
recovering from poison memory and error injection into SGX pages
- A bunch of changes to the SGX selftests to simplify and allow of SGX
features testing without the need of a whole SGX software stack
- Add a sysfs attribute which is supposed to show the amount of SGX
memory in a NUMA node, similar to what /proc/meminfo is to normal
memory
- The usual bunch of fixes and cleanups too
* tag 'x86_sgx_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
x86/sgx: Fix NULL pointer dereference on non-SGX systems
selftests/sgx: Fix corrupted cpuid macro invocation
x86/sgx: Add an attribute for the amount of SGX memory in a NUMA node
x86/sgx: Fix minor documentation issues
selftests/sgx: Add test for multiple TCS entry
selftests/sgx: Enable multiple thread support
selftests/sgx: Add page permission and exception test
selftests/sgx: Rename test properties in preparation for more enclave tests
selftests/sgx: Provide per-op parameter structs for the test enclave
selftests/sgx: Add a new kselftest: Unclobbered_vdso_oversubscribed
selftests/sgx: Move setup_test_encl() to each TEST_F()
selftests/sgx: Encpsulate the test enclave creation
selftests/sgx: Dump segments and /proc/self/maps only on failure
selftests/sgx: Create a heap for the test enclave
selftests/sgx: Make data measurement for an enclave segment optional
selftests/sgx: Assign source for each segment
selftests/sgx: Fix a benign linker warning
x86/sgx: Add check for SGX pages to ghes_do_memory_failure()
x86/sgx: Add hook to error injection address validation
x86/sgx: Hook arch_memory_failure() into mainline code
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SEV updates from Borislav Petkov:
"The accumulated pile of x86/sev generalizations and cleanups:
- Share the SEV string unrolling logic with TDX as TDX guests need it
too
- Cleanups and generalzation of code shared by SEV and TDX"
* tag 'x86_sev_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sev: Move common memory encryption code to mem_encrypt.c
x86/sev: Rename mem_encrypt.c to mem_encrypt_amd.c
x86/sev: Use CC_ATTR attribute to generalize string I/O unroll
x86/sev: Remove do_early_exception() forward declarations
x86/head64: Carve out the guest encryption postprocessing into a helper
x86/sev: Get rid of excessive use of defines
x86/sev: Shorten GHCB terminate macro names
|
|
Merge an operating performance points (OPP) update, devfreq updates
and power capping updates for 5.17-rc1:
- Update outdated OPP documentation (Tang Yizhou).
- Reduce log severity for informative message regarding frequency
transition failures in devfreq (Tzung-Bi Shih).
- Add DRAM frequency controller devfreq driver for Allwinner sunXi
SoCs (Samuel Holland).
- Add missing COMMON_CLK dependency to the sun8i devfreq driver (Arnd
Bergmann).
- Add support for new layout of Psys PowerLimit Register on SPR to
the Intel RAPL power capping driver (Zhang Rui).
- Fix typo in a comment in idle_inject.c (Jason Wang).
- Remove unused function definition from the DTPM (Dynamit Thermal
Power Management) power capping framework (Daniel Lezcano).
- Reduce DTPM trace verbosity (Daniel Lezcano).
* pm-opp:
Documentation: power: Update outdated contents in opp.rst
* pm-devfreq:
PM / devfreq: Reduce log severity for informative message
PM / devfreq: sun8i: addd COMMON_CLK dependency
PM / devfreq: Add a driver for the sun8i/sun50i MBUS
* powercap:
powercap/drivers/dtpm: Reduce trace verbosity
powercap/drivers/dtpm: Remove unused function definition
powercap: fix typo in a comment in idle_inject.c
powercap: intel_rapl: support new layout of Psys PowerLimit Register on SPR
|
|
Merge cpuidle updates, PM core updates and one hiberation-related
update for 5.17-rc1:
- Make cpuidle use default_groups in kobj_type (Greg Kroah-Hartman).
- Fix two comments in cpuidle code (Jason Wang, Yang Li).
- Simplify locking in pm_runtime_put_suppliers() (Rafael Wysocki).
- Add safety net to supplier device release in the runtime PM core
code (Rafael Wysocki).
- Capture device status before disabling runtime PM for it (Rafael
Wysocki).
- Add new macros for declaring PM operations to allow drivers to
avoid guarding them with CONFIG_PM #ifdefs or __maybe_unused and
update some drivers to use these macros (Paul Cercueil).
- Allow ACPI hardware signature to be honoured during restore from
hibernation (David Woodhouse).
* pm-cpuidle:
cpuidle: use default_groups in kobj_type
cpuidle: Fix cpuidle_remove_state_sysfs() kerneldoc comment
cpuidle: menu: Fix typo in a comment
* pm-core:
PM: runtime: Simplify locking in pm_runtime_put_suppliers()
mmc: mxc: Use the new PM macros
mmc: jz4740: Use the new PM macros
PM: runtime: Add safety net to supplier device release
PM: runtime: Capture device status before disabling runtime PM
PM: core: Add new *_PM_OPS macros, deprecate old ones
PM: core: Redefine pm_ptr() macro
r8169: Avoid misuse of pm_ptr() macro
* pm-sleep:
PM: hibernate: Allow ACPI hardware signature to be honoured
|
|
Merge cpufreq updates for 5.17-rc1:
- Add new P-state driver for AMD processors (Huang Rui).
- Fix initialization of min and max frequency QoS requests in the
cpufreq core (Rafael Wysocki).
- Fix EPP handling on Alder Lake in intel_pstate (Srinivas Pandruvada).
- Make intel_pstate update cpuinfo.max_freq when notified of HWP
capabilities changes and drop a redundant function call from that
driver (Rafael Wysocki).
- Improve IRQ support in the Qcom cpufreq driver (Ard Biesheuvel,
Stephen Boyd, Vladimir Zapolskiy).
- Fix double devm_remap() in the Mediatek cpufreq driver (Hector Yuan).
- Introduce thermal pressure helpers for cpufreq CPU cooling (Lukasz
Luba).
- Make cpufreq use default_groups in kobj_type (Greg Kroah-Hartman).
* pm-cpufreq: (32 commits)
x86, sched: Fix undefined reference to init_freq_invariance_cppc() build error
cpufreq: amd-pstate: Fix Kconfig dependencies for AMD P-State
cpufreq: amd-pstate: Fix struct amd_cpudata kernel-doc comment
MAINTAINERS: Add AMD P-State driver maintainer entry
Documentation: amd-pstate: Add AMD P-State driver introduction
cpufreq: amd-pstate: Add AMD P-State performance attributes
cpufreq: amd-pstate: Add AMD P-State frequencies attributes
cpufreq: amd-pstate: Add boost mode support for AMD P-State
cpufreq: amd-pstate: Add trace for AMD P-State module
cpufreq: amd-pstate: Introduce the support for the processors with shared memory solution
cpufreq: amd-pstate: Add fast switch function for AMD P-State
cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future processors
ACPI: CPPC: Add CPPC enable register function
ACPI: CPPC: Check present CPUs for determining _CPC is valid
ACPI: CPPC: Implement support for SystemIO registers
x86/msr: Add AMD CPPC MSR definitions
x86/cpufeatures: Add AMD Collaborative Processor Performance Control feature flag
cpufreq: use default_groups in kobj_type
cpufreq: mediatek-hw: Fix double devm_remap in hotplug case
cpufreq: intel_pstate: Update cpuinfo.max_freq on HWP_CAP changes
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
- KCSAN enabled for arm64.
- Additional kselftests to exercise the syscall ABI w.r.t. SVE/FPSIMD.
- Some more SVE clean-ups and refactoring in preparation for SME
support (scalable matrix extensions).
- BTI clean-ups (SYM_FUNC macros etc.)
- arm64 atomics clean-up and codegen improvements.
- HWCAPs for FEAT_AFP (alternate floating point behaviour) and
FEAT_RPRESS (increased precision of reciprocal estimate and
reciprocal square root estimate).
- Use SHA3 instructions to speed-up XOR.
- arm64 unwind code refactoring/unification.
- Avoid DC (data cache maintenance) instructions when DCZID_EL0.DZP ==
1 (potentially set by a hypervisor; user-space already does this).
- Perf updates for arm64: support for CI-700, HiSilicon PCIe PMU,
Marvell CN10K LLC-TAD PMU, miscellaneous clean-ups.
- Other fixes and clean-ups; highlights: fix the handling of erratum
1418040, correct the calculation of the nomap region boundaries,
introduce io_stop_wc() mapped to the new DGH instruction (data
gathering hint).
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (81 commits)
arm64: Use correct method to calculate nomap region boundaries
arm64: Drop outdated links in comments
arm64: perf: Don't register user access sysctl handler multiple times
drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL check
perf/smmuv3: Fix unused variable warning when CONFIG_OF=n
arm64: errata: Fix exec handling in erratum 1418040 workaround
arm64: Unhash early pointer print plus improve comment
asm-generic: introduce io_stop_wc() and add implementation for ARM64
arm64: Ensure that the 'bti' macro is defined where linkage.h is included
arm64: remove __dma_*_area() aliases
docs/arm64: delete a space from tagged-address-abi
arm64: Enable KCSAN
kselftest/arm64: Add pidbench for floating point syscall cases
arm64/fp: Add comments documenting the usage of state restore functions
kselftest/arm64: Add a test program to exercise the syscall ABI
kselftest/arm64: Allow signal tests to trigger from a function
kselftest/arm64: Parameterise ptrace vector length information
arm64/sve: Minor clarification of ABI documentation
arm64/sve: Generalise vector length configuration prctl() for SME
arm64/sve: Make sysctl interface for SVE reusable by SME
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC driver updates from Arnd Bergmann:
"There are cleanups and minor bugfixes across several SoC specific
drivers, for Qualcomm, Samsung, NXP i.MX, AT91, Tegra, Keystone,
Renesas, ZynqMP
Noteworthy new features are:
- The op-tee firmware driver gains support for asynchronous
notifications from secure-world firmware.
- Qualcomm platforms gain support for new SoC types in various
drivers: power domain, cache controller, RPM sleep, soc-info
- Samsung SoC drivers gain support for new SoCs in ChipID and PMU, as
well as a new USIv2 driver that handles various types of serial
communiction (uart, i2c, spi)
- Renesas adds support for R-Car S4-8 (R8A779F0) in multiple drivers,
as well as memory controller support for RZ/G2L (R9A07G044).
- Apple M1 gains support for the PMGR power management driver"
* tag 'drivers-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (94 commits)
soc: qcom: rpmh-rsc: Fix typo in a comment
soc: qcom: socinfo: Add SM6350 and SM7225
dt-bindings: arm: msm: Don't mark LLCC interrupt as required
dt-bindings: firmware: scm: Add SM6350 compatible
dt-bindings: arm: msm: Add LLCC for SM6350
soc: qcom: rpmhpd: Sort power-domain definitions and lists
soc: qcom: rpmhpd: Remove mx/cx relationship on sc7280
soc: qcom: rpmhpd: Rename rpmhpd struct names
soc: qcom: rpmhpd: sm8450: Add the missing .peer for sm8450_cx_ao
soc: qcom: socinfo: add SM8450 ID
soc: qcom: rpmhpd: Add SM8450 power domains
dt-bindings: power: rpmpd: Add SM8450 to rpmpd binding
soc: qcom: smem: Update max processor count
dt-bindings: arm: qcom: Document SM8450 SoC and boards
dt-bindings: firmware: scm: Add SM8450 compatible
dt-bindings: arm: cpus: Add kryo780 compatible
soc: qcom: rpmpd: Add support for sm6125
dt-bindings: qcom-rpmpd: Add sm6125 power domains
soc: qcom: aoss: constify static struct thermal_cooling_device_ops
PM: AVS: qcom-cpr: Use div64_ul instead of do_div
...
|
|
Pull ARM SoC updates from Arnd Bergmann:
"These are all minor bug fixes and cleanups to code in arch/arm and
arch/arm64 that is specific to one SoC, updating Kconfig symbols, the
MAINTAINERS file, and removing some dead code"
* tag 'soc-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
arm64: exynos: Enable Exynos Multi-Core Timer driver
ARM: ixp4xx: remove unused header file pata_ixp4xx_cf.h
ARM: ixp4xx: remove dead configs CPU_IXP43X and CPU_IXP46X
MAINTAINERS: Add Florian as BCM5301X and BCM53573 maintainer
ARM: samsung: Remove HAVE_S3C2410_I2C and use direct dependencies
ARM: imx: rename DEBUG_IMX21_IMX27_UART to DEBUG_IMX27_UART
ARM: imx: remove dead left-over from i.MX{27,31,35} removal
ARM: s3c: add one more "fallthrough" statement in Jive
ARM: s3c: include header for prototype of s3c2410_modify_misccr
ARM: shmobile: rcar-gen2: Add missing of_node_put()
|
|
Move NFS to using fscache DIO API instead of the old upstream I/O API as
that has been removed. This is a stopgap solution as the intention is that
at sometime in the future, the cache will move to using larger blocks and
won't be able to store individual pages in order to deal with the potential
for data corruption due to the backing filesystem being able insert/remove
bridging blocks of zeros into its extent list[1].
NFS then reads and writes cache pages synchronously and one page at a time.
The preferred change would be to use the netfs lib, but the new I/O API can
be used directly. It's just that as the cache now needs to track data for
itself, caching blocks may exceed page size...
This code is somewhat borrowed from my "fallback I/O" patchset[2].
Changes
=======
ver #3:
- Restore lost =n fallback for nfs_fscache_release_page()[2].
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Dave Wysochanski <dwysocha@redhat.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
cc: Trond Myklebust <trond.myklebust@hammerspace.com>
cc: Anna Schumaker <anna.schumaker@netapp.com>
cc: linux-nfs@vger.kernel.org
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/YO17ZNOcq+9PajfQ@mit.edu [1]
Link: https://lore.kernel.org/r/202112100957.2oEDT20W-lkp@intel.com/ [2]
Link: https://lore.kernel.org/r/163189108292.2509237.12615909591150927232.stgit@warthog.procyon.org.uk/ [2]
Link: https://lore.kernel.org/r/163906981318.143852.17220018647843475985.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967184451.1823006.6450645559828329590.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021577632.640689.11069627070150063812.stgit@warthog.procyon.org.uk/ # v4
|
|
Change the nfs filesystem to support fscache's indexing rewrite and
reenable caching in nfs.
The following changes have been made:
(1) The fscache_netfs struct is no more, and there's no need to register
the filesystem as a whole.
(2) The session cookie is now an fscache_volume cookie, allocated with
fscache_acquire_volume(). That takes three parameters: a string
representing the "volume" in the index, a string naming the cache to
use (or NULL) and a u64 that conveys coherency metadata for the
volume.
For nfs, I've made it render the volume name string as:
"nfs,<ver>,<family>,<address>,<port>,<fsidH>,<fsidL>*<,param>[,<uniq>]"
(3) The fscache_cookie_def is no more and needed information is passed
directly to fscache_acquire_cookie(). The cache no longer calls back
into the filesystem, but rather metadata changes are indicated at
other times.
fscache_acquire_cookie() is passed the same keying and coherency
information as before.
(4) fscache_enable/disable_cookie() have been removed.
Call fscache_use_cookie() and fscache_unuse_cookie() when a file is
opened or closed to prevent a cache file from being culled and to keep
resources to hand that are needed to do I/O.
If a file is opened for writing, we invalidate it with
FSCACHE_INVAL_DIO_WRITE in lieu of doing writeback to the cache,
thereby making it cease caching until all currently open files are
closed. This should give the same behaviour as the uptream code.
Making the cache store local modifications isn't straightforward for
NFS, so that's left for future patches.
(5) fscache_invalidate() now needs to be given uptodate auxiliary data and
a file size. It also takes a flag to indicate if this was due to a
DIO write.
(6) Call nfs_fscache_invalidate() with FSCACHE_INVAL_DIO_WRITE on a file
to which a DIO write is made.
(7) Call fscache_note_page_release() from nfs_release_page().
(8) Use a killable wait in nfs_vm_page_mkwrite() when waiting for
PG_fscache to be cleared.
(9) The functions to read and write data to/from the cache are stubbed out
pending a conversion to use netfslib.
Changes
=======
ver #3:
- Added missing =n fallback for nfs_fscache_release_file()[1][2].
ver #2:
- Use gfpflags_allow_blocking() rather than using flag directly.
- fscache_acquire_volume() now returns errors.
- Remove NFS_INO_FSCACHE as it's no longer used.
- Need to unuse a cookie on file-release, not inode-clear.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Co-developed-by: David Howells <dhowells@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Dave Wysochanski <dwysocha@redhat.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
cc: Trond Myklebust <trond.myklebust@hammerspace.com>
cc: Anna Schumaker <anna.schumaker@netapp.com>
cc: linux-nfs@vger.kernel.org
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/202112100804.nksO8K4u-lkp@intel.com/ [1]
Link: https://lore.kernel.org/r/202112100957.2oEDT20W-lkp@intel.com/ [2]
Link: https://lore.kernel.org/r/163819668938.215744.14448852181937731615.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163906979003.143852.2601189243864854724.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967182112.1823006.7791504655391213379.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021575950.640689.12069642327533368467.stgit@warthog.procyon.org.uk/ # v4
|
|
- new driver to support for LetSketch device (Hans de Goede)
|
|
- locking performance improvement for hidraw code (André Almeida)
|
|
- support for USI style pens (Tero Kristo, Mika Westerberg)
- quirk for devices that need inverted X/Y axes (Alistair Francis)
- small core code cleanups and deduplication (Benjamin Tissoires)
|
|
Merge in fixes directly in prep for the 5.17 merge window.
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Replace kfree_skb() with kfree_skb_reason() in __udp4_lib_rcv.
New drop reason 'SKB_DROP_REASON_UDP_CSUM' is added for udp csum
error.
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Replace kfree_skb() with kfree_skb_reason() in tcp_v4_rcv(). Following
drop reasons are added:
SKB_DROP_REASON_NO_SOCKET
SKB_DROP_REASON_PKT_TOO_SMALL
SKB_DROP_REASON_TCP_CSUM
SKB_DROP_REASON_TCP_FILTER
After this patch, 'kfree_skb' event will print message like this:
$ TASK-PID CPU# ||||| TIMESTAMP FUNCTION
$ | | | ||||| | |
<idle>-0 [000] ..s1. 36.113438: kfree_skb: skbaddr=(____ptrval____) protocol=2048 location=(____ptrval____) reason: NO_SOCKET
The reason of skb drop is printed too.
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Introduce the interface kfree_skb_reason(), which is able to pass
the reason why the skb is dropped to 'kfree_skb' tracepoint.
Add the 'reason' field to 'trace_kfree_skb', therefor user can get
more detail information about abnormal skb with 'drop_monitor' or
eBPF.
All drop reasons are defined in the enum 'skb_drop_reason', and
they will be print as string in 'kfree_skb' tracepoint in format
of 'reason: XXX'.
( Maybe the reasons should be defined in a uapi header file, so that
user space can use them? )
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Netfilter conntrack maintains NAT flags per connection indicating
whether NAT was configured for the connection. Openvswitch maintains
NAT flags on the per packet flow key ct_state field, indicating
whether NAT was actually executed on the packet.
When a packet misses from tc to ovs the conntrack NAT flags are set.
However, NAT was not necessarily executed on the packet because the
connection's state might still be in NEW state. As such, openvswitch
wrongly assumes that NAT was executed and sets an incorrect flow key
NAT flags.
Fix this, by flagging to openvswitch which NAT was actually done in
act_ct via tc_skb_ext and tc_skb_cb to the openvswitch module, so
the packet flow key NAT flags will be correctly set.
Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20220106153804.26451-1-paulb@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next. This
includes one patch to update ovs and act_ct to use nf_ct_put() instead
of nf_conntrack_put().
1) Add netns_tracker to nfnetlink_log and masquerade, from Eric Dumazet.
2) Remove redundant rcu read-size lock in nf_tables packet path.
3) Replace BUG() by WARN_ON_ONCE() in nft_payload.
4) Consolidate rule verdict tracing.
5) Replace WARN_ON() by WARN_ON_ONCE() in nf_tables core.
6) Make counter support built-in in nf_tables.
7) Add new field to conntrack object to identify locally generated
traffic, from Florian Westphal.
8) Prevent NAT from shadowing well-known ports, from Florian Westphal.
9) Merge nf_flow_table_{ipv4,ipv6} into nf_flow_table_inet, also from
Florian.
10) Remove redundant pointer in nft_pipapo AVX2 support, from Colin Ian King.
11) Replace opencoded max() in conntrack, from Jiapeng Chong.
12) Update conntrack to use refcount_t API, from Florian Westphal.
13) Move ip_ct_attach indirection into the nf_ct_hook structure.
14) Constify several pointer object in the netfilter codebase,
from Florian Westphal.
15) Tree-wide replacement of nf_conntrack_put() by nf_ct_put(), also
from Florian.
16) Fix egress splat due to incorrect rcu notation, from Florian.
17) Move stateful fields of connlimit, last, quota, numgen and limit
out of the expression data area.
18) Build a blob to represent the ruleset in nf_tables, this is a
requirement of the new register tracking infrastructure.
19) Add NFT_REG32_NUM to define the maximum number of 32-bit registers.
20) Add register tracking infrastructure to skip redundant
store-to-register operations, this includes support for payload,
meta and bitwise expresssions.
* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next: (32 commits)
netfilter: nft_meta: cancel register tracking after meta update
netfilter: nft_payload: cancel register tracking after payload update
netfilter: nft_bitwise: track register operations
netfilter: nft_meta: track register operations
netfilter: nft_payload: track register operations
netfilter: nf_tables: add register tracking infrastructure
netfilter: nf_tables: add NFT_REG32_NUM
netfilter: nf_tables: add rule blob layout
netfilter: nft_limit: move stateful fields out of expression data
netfilter: nft_limit: rename stateful structure
netfilter: nft_numgen: move stateful fields out of expression data
netfilter: nft_quota: move stateful fields out of expression data
netfilter: nft_last: move stateful fields out of expression data
netfilter: nft_connlimit: move stateful fields out of expression data
netfilter: egress: avoid a lockdep splat
net: prefer nf_ct_put instead of nf_conntrack_put
netfilter: conntrack: avoid useless indirection during conntrack destruction
netfilter: make function op structures const
netfilter: core: move ip_ct_attach indirection to struct nf_ct_hook
netfilter: conntrack: convert to refcount_t api
...
====================
Link: https://lore.kernel.org/r/20220109231640.104123-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
include/linux/netfilter_netdev.h:97 suspicious rcu_dereference_check() usage!
2 locks held by sd-resolve/1100:
0: ..(rcu_read_lock_bh){1:3}, at: ip_finish_output2
1: ..(rcu_read_lock_bh){1:3}, at: __dev_queue_xmit
__dev_queue_xmit+0 ..
The helper has two callers, one uses rcu_read_lock, the other
rcu_read_lock_bh(). Annotate the dereference to reflect this.
Fixes: 42df6e1d221dd ("netfilter: Introduce egress hook")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
nf_ct_put() results in a usesless indirection:
nf_ct_put -> nf_conntrack_put -> nf_conntrack_destroy -> rcu readlock +
indirect call of ct_hooks->destroy().
There are two _put helpers:
nf_ct_put and nf_conntrack_put. The latter is what should be used in
code that MUST NOT cause a linker dependency on the conntrack module
(e.g. calls from core network stack).
Everyone else should call nf_ct_put() instead.
A followup patch will convert a few nf_conntrack_put() calls to
nf_ct_put(), in particular from modules that already have a conntrack
dependency such as act_ct or even nf_conntrack itself.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
No functional changes, these structures should be const.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
ip_ct_attach predates struct nf_ct_hook, we can place it there and
remove the exported symbol.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Convert nf_conn reference counting from atomic_t to refcount_t based api.
refcount_t api provides more runtime sanity checks and will warn on
certain constructs, e.g. refcount_inc() on a zero reference count, which
usually indicates use-after-free.
For this reason template allocation is changed to init the refcount to
1, the subsequenct add operations are removed.
Likewise, init_conntrack() is changed to set the initial refcount to 1
instead refcount_inc().
This is safe because the new entry is not (yet) visible to other cpus.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Move the 'inline' keyword to the front of 'void'.
Remove a warning found by clang(make W=1 LLVM=1)
./include/linux/blk-mq.h:259:1: warning: ‘inline’ is not at beginning of
declaration
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220107005228.103927-1-yang.lee@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If something went wrong during the TPM firmware upgrade, like power
failure or the firmware image file get corrupted, the TPM might end
up in Upgrade or Failure mode upon the next start. The state is
persistent between the TPM power cycle/restart.
According to TPM specification:
* If the TPM is in Upgrade mode, it will answer with TPM2_RC_UPGRADE
to all commands except TPM2_FieldUpgradeData(). It may also accept
other commands if it is able to complete them using the previously
installed firmware.
* If the TPM is in Failure mode, it will allow performing TPM
initialization but will not provide any crypto operations.
Will happily respond to Field Upgrade calls.
Change the behavior of the tpm2_auto_startup(), so it detects the active
running mode of the TPM by adding the following checks. If
tpm2_do_selftest() call returns TPM2_RC_UPGRADE, the TPM is in Upgrade
mode.
If the TPM is in Failure mode, it will successfully respond to both
tpm2_do_selftest() and tpm2_startup() calls. Although, will fail to
answer to tpm2_get_cc_attrs_tbl(). Use this fact to conclude that TPM is
in Failure mode.
If detected that the TPM is in the Upgrade or Failure mode, the function
sets TPM_CHIP_FLAG_FIRMWARE_UPGRADE_MODE flag.
The TPM_CHIP_FLAG_FIRMWARE_UPGRADE_MODE flag is used later during driver
initialization/deinitialization to disable functionality which makes no
sense or will fail in the current TPM state. Following functionality is
affected:
* Do not register TPM as a hwrng
* Do not register sysfs entries which provide information impossible to
obtain in limited mode
* Do not register resource managed character device
Signed-off-by: axelj <axelj@axis.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
When CONFIG_SYSFS is not enabled provide stubs for the helper functions
to not break their callers.
Fixes: 539b9c94ac83 ("power: supply: add helpers for charge_behaviour sysfs")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20220108153158.189489-1-linux@weissschuh.net
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
We currently store large folios as 2^N consecutive entries. While this
consumes rather more memory than necessary, it also turns out to be buggy.
A writeback operation which starts within a tail page of a dirty folio will
not write back the folio as the xarray's dirty bit is only set on the
head index. With multi-index entries, the dirty bit will be found no
matter where in the folio the operation starts.
This does end up simplifying the page cache slightly, although not as
much as I had hoped.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
|
|
Add a new helper function to help iterate over multi-index entries.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
|
|
All of its callers now call folio_batch_remove_exceptionals().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
|
|
find_lock_entries() already only returned the head page of folios, so
convert it to return a folio_batch instead of a pagevec. That cascades
through converting truncate_inode_pages_range() to
delete_from_page_cache_batch() and page_cache_delete_batch().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
|
|
The callers have all been converted to work on folios, so convert
find_get_entries() to return a batch of folios instead of pages.
We also now return multiple large folios in a single call.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Convert all callers of truncate_inode_page() to call
truncate_inode_folio() instead, and move the declaration to mm/internal.h.
Move the assertion that the caller is not passing in a tail page to
generic_error_remove_page(). We can't entirely remove the struct page
from the callers yet because the page pointer in the pvec might be a
shadow/dax/swap entry instead of actually a page.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
|
|
Convert both callers of unmap_mapping_page() to call unmap_mapping_folio()
instead. Also move zap_details from linux/mm.h to mm/memory.c
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
|
|
Pull drm fixes from Dave Airlie:
"There is only the amdgpu runtime pm regression fix in here:
amdgpu:
- suspend/resume fix
- fix runtime PM regression"
* tag 'drm-fixes-2022-01-07' of git://anongit.freedesktop.org/drm/drm:
drm/amdgpu: disable runpm if we are the primary adapter
fbdev: fbmem: add a helper to determine if an aperture is used by a fw fb
drm/amd/pm: keep the BACO feature enabled for suspend
|
|
Commit b42faeee718c ("spi: Add a PTP system timestamp
to the transfer structure") added an include of ptp_clock_kernel.h
to spi.h for struct ptp_system_timestamp but a forward declaration
is enough. Let's use that to limit the number of objects we have
to rebuild every time we touch networking headers.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20210904013140.2377609-1-kuba@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Add a stat counter of culling events whereby the cache backend culls a file
to make space (when asked by cachefilesd in this case) and display in
/proc/fs/fscache/stats.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/163819654165.215744.3797804661644212436.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163906961387.143852.9291157239960289090.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967168266.1823006.14436200166581605746.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021567619.640689.4339228906248763197.stgit@warthog.procyon.org.uk/ # v4
|
|
Add stat counters of no-space events that caused caching not to happen and
display in /proc/fs/fscache/stats.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/163819653216.215744.17210522251617386509.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163906958369.143852.7257100711818401748.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967166917.1823006.14842444049198947892.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021566184.640689.4417328329632709265.stgit@warthog.procyon.org.uk/ # v4
|
|
Store the volume coherency data in an xattr and check it when we rebind the
volume. If it doesn't match the cache volume is moved to the graveyard and
rebuilt anew.
Changes
=======
ver #4:
- Remove a couple of debugging prints.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/163967164397.1823006.2950539849831291830.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021563138.640689.15851092065380543119.stgit@warthog.procyon.org.uk/ # v4
|
|
Use an inode flag, S_KERNEL_FILE, to mark that a backing file is in use by
the kernel to prevent cachefiles or other kernel services from interfering
with that file.
Alter rmdir to reject attempts to remove a directory marked with this flag.
This is used by cachefiles to prevent cachefilesd from removing them.
Using S_SWAPFILE instead isn't really viable as that has other effects in
the I/O paths.
Changes
=======
ver #3:
- Check for the object pointer being NULL in the tracepoints rather than
the caller.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/163819630256.215744.4815885535039369574.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163906931596.143852.8642051223094013028.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967141000.1823006.12920680657559677789.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021541207.640689.564689725898537127.stgit@warthog.procyon.org.uk/ # v4
|
|
Provide a function to change the size of the storage attached to a cookie,
to match the size of the file being cached when it's changed by truncate or
fallocate:
void fscache_resize_cookie(struct fscache_cookie *cookie,
loff_t new_size);
This acts synchronously and is expected to run under the inode lock of the
caller.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/163819621839.215744.7895597119803515402.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163906922387.143852.16394459879816147793.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967128998.1823006.10740669081985775576.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021527861.640689.3466382085497236267.stgit@warthog.procyon.org.uk/ # v4
|
|
Provide a function to be called from a network filesystem's releasepage
method to indicate that a page has been released that might have been a
reflection of data upon the server - and now that data must be reloaded
from the server or the cache.
This is used to end an optimisation for empty files, in particular files
that have just been created locally, whereby we know there cannot yet be
any data that we would need to read from the server or the cache.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/163819617128.215744.4725572296135656508.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163906920354.143852.7511819614661372008.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967128061.1823006.611781655060034988.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021525963.640689.9264556596205140044.stgit@warthog.procyon.org.uk/ # v4
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2022-01-06
1) Expose FEC per lane block counters via ethtool
2) Trivial fixes/updates/cleanup to mlx5e netdev driver
3) Fix htmldoc build warning
4) Spread mlx5 SFs (sub-functions) to all available CPU cores: Commits 1..5
Shay Drory Says:
================
Before this patchset, mlx5 subfunction shared the same IRQs (MSI-X) with
their peers subfunctions, causing them to use same CPU cores.
In large scale, this is very undesirable, SFs use small number of cpu
cores and all of them will be packed on the same CPU cores, not
utilizing all CPU cores in the system.
In this patchset we want to achieve two things.
a) Spread IRQs used by SFs to all cpu cores
b) Pack less SFs in the same IRQ, will result in multiple IRQs per core.
In this patchset, we spread SFs over all online cpus available to mlx5
irqs in Round-Robin manner. e.g.: Whenever a SF is created, pick the next
CPU core with least number of SF IRQs bound to it, SFs will share IRQs on
the same core until a certain limit, when such limit is reached, we
request a new IRQ and add it to that CPU core IRQ pool, when out of IRQs,
pick any IRQ with least number of SF users.
This enhancement is done in order to achieve a better distribution of
the SFs over all the available CPUs, which reduces application latency,
as shown bellow.
Machine details:
Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz with 56 cores.
PCI Express 3 with BW of 126 Gb/s.
ConnectX-5 Ex; EDR IB (100Gb/s) and 100GbE; dual-port QSFP28; PCIe4.0
x16.
Base line test description:
Single SF on the system. One instance of netperf is running on-top the
SF.
Numbers: latency = 15.136 usec, CPU Util = 35%
Test description:
There are 250 SFs on the system. There are 3 instances of netperf
running, on-top three different SFs, in parallel.
Perf numbers:
# netperf SFs latency(usec) latency CPU utilization
affinity affinity (lower is better) increase %
1 cpu=0 cpu={0} ~23 (app 1-3) 35% 75%
2 cpu=0,2,4 cpu={0} app 1: 21.625 30% 68% (CPU 0)
app 2-3: 16.5 9% 15% (CPU 2,4)
3 cpu=0 cpu={0,2,4} app 1: ~16 7% 84% (CPU 0)
app 2-3: ~17.9 14% 22% (CPU 2,4)
4 cpu=0,2,4 cpu={0,2,4} 15.2 (app 1-3) 0% 33% (CPU 0,2,4)
- The first two entries (#1 and #2) show current state. e.g.: SFs are
using the same CPU. The last two entries (#3 and #4) shows the latency
reduction improvement of this patch. e.g.: SFs are on different CPUs.
- Whenever we use several CPUs, in case there is a different CPU
utilization, write the utilization of each CPU separately.
- Whenever the latency result of the netperf instances were different,
write the latency of each netperf instances separately.
Commands:
- for netperf CPU=0:
$ for i in {1..3}; do taskset -c 0 netperf -H 1${i}.1.1.1 -t TCP_RR -- \
-o RT_LATENCY -r8 & done
- for netperf CPU=0,2,4
$ for i in {1..3}; do taskset -c $(( ($i - 1) * 2 )) netperf -H \
1${i}.1.1.1 -t TCP_RR -- -o RT_LATENCY -r8 & done
================
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Cachefiles has a problem in that it needs to keep the backing file for a
cookie open whilst there are local modifications pending that need to be
written to it. However, we don't want to keep the file open indefinitely,
as that causes EMFILE/ENFILE/ENOMEM problems.
Reopening the cache file, however, is a problem if this is being done due
to writeback triggered by exit(). Some filesystems will oops if we try to
open a file in that context because they want to access current->fs or
other resources that have already been dismantled.
To get around this, I added the following:
(1) An inode flag, I_PINNING_FSCACHE_WB, to be set on a network filesystem
inode to indicate that we have a usage count on the cookie caching
that inode.
(2) A flag in struct writeback_control, unpinned_fscache_wb, that is set
when __writeback_single_inode() clears the last dirty page from
i_pages - at which point it clears I_PINNING_FSCACHE_WB and sets this
flag.
This has to be done here so that clearing I_PINNING_FSCACHE_WB can be
done atomically with the check of PAGECACHE_TAG_DIRTY that clears
I_DIRTY_PAGES.
(3) A function, fscache_set_page_dirty(), which if it is not set, sets
I_PINNING_FSCACHE_WB and calls fscache_use_cookie() to pin the cache
resources.
(4) A function, fscache_unpin_writeback(), to be called by ->write_inode()
to unuse the cookie.
(5) A function, fscache_clear_inode_writeback(), to be called when the
inode is evicted, before clear_inode() is called. This cleans up any
lingering I_PINNING_FSCACHE_WB.
The network filesystem can then use these tools to make sure that
fscache_write_to_cache() can write locally modified data to the cache as
well as to the server.
For the future, I'm working on write helpers for netfs lib that should
allow this facility to be removed by keeping track of the dirty regions
separately - but that's incomplete at the moment and is also going to be
affected by folios, one way or another, since it deals with pages
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/163819615157.215744.17623791756928043114.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163906917856.143852.8224898306177154573.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967124567.1823006.14188359004568060298.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021524705.640689.17824932021727663017.stgit@warthog.procyon.org.uk/ # v4
|
|
Provide a higher-level function than fscache_write() to perform a write
from an inode's pagecache to the cache, whilst fending off concurrent
writes by means of the PG_fscache mark on a page:
void fscache_write_to_cache(struct fscache_cookie *cookie,
struct address_space *mapping,
loff_t start,
size_t len,
loff_t i_size,
netfs_io_terminated_t term_func,
void *term_func_priv,
bool caching);
If caching is false, this function does nothing except call (*term_func)()
if given. It assumes that, in such a case, PG_fscache will not have been
set on the pages.
Otherwise, if caching is true, this function requires the source pages to
have had PG_fscache set on them before calling. start and len define the
region of the file to be modified and i_size indicates the new file size.
The source pages are extracted from the mapping.
term_func and term_func_priv work as for fscache_write(). The PG_fscache
marks will be cleared at the end of the operation, before term_func is
called or the function otherwise returns.
There is an additonal helper function to clear the PG_fscache bits from a
range of pages:
void fscache_clear_page_bits(struct fscache_cookie *cookie,
struct address_space *mapping,
loff_t start, size_t len,
bool caching);
If caching is true, the pages to be managed are expected to be located on
mapping in the range defined by start and len. If caching is false, it
does nothing.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/163819614155.215744.5528123235123721230.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163906916346.143852.15632773570362489926.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967123599.1823006.12946816026724657428.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021522672.640689.4381958316198807813.stgit@warthog.procyon.org.uk/ # v4
|
|
Provide a pair of functions to perform raw I/O on the cache. The first
function allows an arbitrary asynchronous direct-IO read to be made against
a cache object, though the read should be aligned and sized appropriately
for the backing device:
int fscache_read(struct netfs_cache_resources *cres,
loff_t start_pos,
struct iov_iter *iter,
enum netfs_read_from_hole read_hole,
netfs_io_terminated_t term_func,
void *term_func_priv);
The cache resources must have been previously initialised by
fscache_begin_read_operation(). A read operation is sent to the backing
filesystem, starting at start_pos within the file. The size of the read is
specified by the iterator, as is the location of the output buffer.
If there is a hole in the data it can be ignored and left to the backing
filesystem to deal with (NETFS_READ_HOLE_IGNORE), a hole at the beginning
can be skipped over and the buffer padded with zeros
(NETFS_READ_HOLE_CLEAR) or -ENODATA can be given (NETFS_READ_HOLE_FAIL).
If term_func is not NULL, the operation may be performed asynchronously.
Upon completion, successful or otherwise, (*term_func)() will be called and
passed term_func_priv, along with an error or the amount of data
transferred. If the op is run asynchronously, fscache_read() will return
-EIOCBQUEUED.
The second function allows an arbitrary asynchronous direct-IO write to be
made against a cache object, though the write should be aligned and sized
appropriately for the backing device:
int fscache_write(struct netfs_cache_resources *cres,
loff_t start_pos,
struct iov_iter *iter,
netfs_io_terminated_t term_func,
void *term_func_priv);
This works in very similar way to fscache_read(), except that there's no
need to deal with holes (they're just overwritten).
The caller is responsible for preventing concurrent overlapping writes.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/163819613224.215744.7877577215582621254.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163906915386.143852.16936177636106480724.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967122632.1823006.7487049517698562172.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021521420.640689.12747258780542678309.stgit@warthog.procyon.org.uk/ # v4
|
|
Pass more information to the cache on how to deal with a hole if it
encounters one when trying to read from the cache. Three options are
provided:
(1) NETFS_READ_HOLE_IGNORE. Read the hole along with the data, assuming
it to be a punched-out extent by the backing filesystem.
(2) NETFS_READ_HOLE_CLEAR. If there's a hole, erase the requested region
of the cache and clear the read buffer.
(3) NETFS_READ_HOLE_FAIL. Fail the read if a hole is detected.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/163819612321.215744.9738308885948264476.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163906914460.143852.6284247083607910189.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967119923.1823006.15637375885194297582.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021519762.640689.16994364383313159319.stgit@warthog.procyon.org.uk/ # v4
|
|
Provide a function to let the netfs update its coherency data:
void fscache_update_cookie(struct fscache_cookie *cookie,
const void *aux_data,
const loff_t *object_size);
This will update the auxiliary data and/or the size of the object attached
to a cookie if either pointer is not-NULL and flag that the disk needs to
be updated.
Note that fscache_unuse_cookie() also allows this to be done.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/163819610438.215744.4223265964131424954.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163906913530.143852.18150303220217653820.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967117795.1823006.7493373142653442595.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021518440.640689.6369952464473039268.stgit@warthog.procyon.org.uk/ # v4
|
|
Provide read/write stat counters for the cache backend to use.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/163819609532.215744.10821082637727410554.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163906912598.143852.12960327989649429069.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967113830.1823006.3222957649202368162.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021517502.640689.6077928311710357342.stgit@warthog.procyon.org.uk/ # v4
|