Age | Commit message (Collapse) | Author |
|
Splice the moved list to a local one to avoid taking the lock over and
over again.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use list_for_each_entry_safe here.
v2: Drop the optimization, it doesn't work as expected.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Only the moved state needs a separate spin lock protection. All other
states are protected by reserving the VM anyway.
v2: fix some more incorrect cases
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Enable support for dynamically powering up/down VCN on demand.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Enable static VCN powergating by default on Raven.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Implement static powergating suport on VCN.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Enable VCN clockgating by default on Raven.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Implement proper static clockgating support for VCN.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This feature/cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
- Disable batman-adv debugfs by default, by Sven Eckelmann
- Improve handling mesh nodes with multicast optimizations disabled,
by Linus Luessing
- Avoid bool in structs, by Sven Eckelmann
- Allocate less memory when debugfs is disabled, by Sven Eckelmann
- Fix batadv_interface_tx return data type, by Luc Van Oostenryck
- improve link speed handling for virtual interfaces, by Marek Lindner
- Enable BATMAN V algorithm by default, by Marek Lindner
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since 4.10, commit 8003c9ae204e (KVM: LAPIC: add APIC Timer
periodic/oneshot mode VMX preemption timer support), guests using
periodic LAPIC timers (such as FreeBSD 8.4) would see their timers
drift significantly over time.
Differences in the underlying clocks and numerical errors means the
periods of the two timers (hv and sw) are not the same. This
difference will accumulate with every expiry resulting in a large
error between the hv and sw timer.
This means the sw timer may be running slow when compared to the hv
timer. When the timer is switched from hv to sw, the now active sw
timer will expire late. The guest VCPU is reentered and it switches to
using the hv timer. This timer catches up, injecting multiple IRQs
into the guest (of which the guest only sees one as it does not get to
run until the hv timer has caught up) and thus the guest's timer rate
is low (and becomes increasing slower over time as the sw timer lags
further and further behind).
I believe a similar problem would occur if the hv timer is the slower
one, but I have not observed this.
Fix this by synchronizing the deadlines for both timers to the same
time source on every tick. This prevents the errors from accumulating.
Fixes: 8003c9ae204e21204e49816c5ea629357e283b06
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: David Vrabel <david.vrabel@nutanix.com>
Cc: stable@vger.kernel.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
Fixes for PPC KVM:
- Close a hole which could possibly lead to the host timebase getting
out of sync.
- Three fixes relating to PTEs and TLB entries for radix guests.
- Fix a bug which could lead to an interrupt never getting delivered
to the guest, if it is pending for a guest vCPU when the vCPU gets
offlined.
|
|
According to section 59.2.4 MSIOF Receive Mode Register 1 (SIRMDR1) in
the R-Car Gen3 datasheet Rev.1.00, the value of the SIRMDR1.SYNCAC bit
must match the value of the SITMDR1.SYNCAC bit. However,
sh_msiof_spi_setup() changes only the latter.
Fix this by updating the SIRMDR1 register like the SITMDR1 register,
taking into account register bits that exist in SITMDR1 only.
Reported-by: Renesas BSP team via Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Fixes: 7ff0b53c4051145d ("spi: sh-msiof: Avoid writing to registers from spi_master.setup()")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
When single-stepping kernel code from xmon without a debug hook
enabled the kernel crashes. This can happen when kernel starts with
xmon on crash disabled but xmon is entered using sysrq.
Call force_enable_xmon when single-stepping in xmon to install the
xmon debug hooks.
Fixes: e1368d0c9edb ("powerpc/xmon: Setup debugger hooks when first break-point is set")
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
New binutils generate the following warning
AS arch/powerpc/kernel/head_8xx.o
arch/powerpc/kernel/head_8xx.S: Assembler messages:
arch/powerpc/kernel/head_8xx.S:916: Warning: invalid register expression
This patch fixes it.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This test the ptrace hw breakpoints via PTRACE_SET_DEBUGREG and
PPC_PTRACE_SETHWDEBUG. This test was use to find the bugs fixed by
these recent commits:
4f7c06e26e powerpc/ptrace: Fix setting 512B aligned breakpoints with PTRACE_SET_DEBUGREG
cd6ef7eebf powerpc/ptrace: Fix enforcement of DAWR constraints
Signed-off-by: Michael Neuling <mikey@neuling.org>
[mpe: Add SPDX tag, clang format it]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Commit a7a9dcd882a67 ("powerpc: Avoid taking a data miss on every
userspace instruction miss") has shown that limiting the read of
faulting instruction to likely cases improves performance.
This patch goes further into this direction by limiting the read
of the faulting instruction to the only cases where it is likely
needed.
On an MPC885, with the same benchmark app as in the commit referred
above, we see a reduction of about 3900 dTLB misses (approx 3%):
Before the patch:
Performance counter stats for './fault 500' (10 runs):
683033312 cpu-cycles ( +- 0.03% )
134538 dTLB-load-misses ( +- 0.03% )
46099 iTLB-load-misses ( +- 0.02% )
19681 faults ( +- 0.02% )
5.389747878 seconds time elapsed ( +- 0.06% )
With the patch:
Performance counter stats for './fault 500' (10 runs):
682112862 cpu-cycles ( +- 0.03% )
130619 dTLB-load-misses ( +- 0.03% )
46073 iTLB-load-misses ( +- 0.05% )
19681 faults ( +- 0.01% )
5.381342641 seconds time elapsed ( +- 0.07% )
The proper work of the huge stack expansion was tested with the
following app:
int main(int argc, char **argv)
{
char buf[1024 * 1025];
sprintf(buf, "Hello world !\n");
printf(buf);
exit(0);
}
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Add include of pagemap.h to fix build errors]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Use symbolic names defined in asm/ppc-opcode.h
instead of hardcoded values.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This one should be using the default LPM policy for mobile chipsets so
add the PCI ID to the driver list of supported revices.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
|
|
Entry corresponding to 220 us setup time was missing. I am not aware of
any specific bug this fixes, but this could potentially result in enabling
PSR on a panel with a higher setup time requirement than supported by the
hardware.
I verified the value is present in eDP spec versions 1.3, 1.4 and 1.4a.
Fixes: 6608804b3d7f ("drm/dp: Add drm_dp_psr_setup_time()")
Cc: stable@vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Jose Roberto de Souza <jose.souza@intel.com>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Tarun Vyas <tarun.vyas@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180511195145.3829-3-dhinakaran.pandiyan@intel.com
|
|
Passing O_CREAT (00000100) to open means we should also pass file
mode as the third parameter. Creating /dev/console as a regular
file may not be helpful anyway, so simply drop the flag when
opening debug_fd.
Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
BPFILTER could have been enabled without INET causing this build error:
ERROR: "bpfilter_process_sockopt" [net/bpfilter/bpfilter.ko] undefined!
Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module")
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use 64-bit accesses for 64-bit floating-point general registers with
PTRACE_PEEKUSR, removing the truncation of their upper halves in the
FR=1 mode, caused by commit bbd426f542cb ("MIPS: Simplify FP context
access"), which inadvertently switched them to using 32-bit accesses.
The PTRACE_POKEUSR side is fine as it's never been broken and continues
using 64-bit accesses.
Fixes: bbd426f542cb ("MIPS: Simplify FP context access")
Signed-off-by: Maciej W. Rozycki <macro@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.15+
Patchwork: https://patchwork.linux-mips.org/patch/19334/
Signed-off-by: James Hogan <jhogan@kernel.org>
|
|
Having PR_FP_MODE_FRE (i.e. Config5.FRE) set without PR_FP_MODE_FR (i.e.
Status.FR) is not supported as the lone purpose of Config5.FRE is to
emulate Status.FR=0 handling on FPU hardware that has Status.FR=1
hardwired[1][2]. Also we do not handle this case elsewhere, and assume
throughout our code that TIF_HYBRID_FPREGS and TIF_32BIT_FPREGS cannot
be set both at once for a task, leading to inconsistent behaviour if
this does happen.
Return unsuccessfully then from prctl(2) PR_SET_FP_MODE calls requesting
PR_FP_MODE_FRE to be set with PR_FP_MODE_FR clear. This corresponds to
modes allowed by `mips_set_personality_fp'.
References:
[1] "MIPS Architecture For Programmers, Vol. III: MIPS32 / microMIPS32
Privileged Resource Architecture", Imagination Technologies,
Document Number: MD00090, Revision 6.02, July 10, 2015, Table 9.69
"Config5 Register Field Descriptions", p. 262
[2] "MIPS Architecture For Programmers, Volume III: MIPS64 / microMIPS64
Privileged Resource Architecture", Imagination Technologies,
Document Number: MD00091, Revision 6.03, December 22, 2015, Table
9.72 "Config5 Register Field Descriptions", p. 288
Fixes: 9791554b45a2 ("MIPS,prctl: add PR_[GS]ET_FP_MODE prctl options for MIPS")
Signed-off-by: Maciej W. Rozycki <macro@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.0+
Patchwork: https://patchwork.linux-mips.org/patch/19327/
Signed-off-by: James Hogan <jhogan@kernel.org>
|
|
While doing a global software reset, these bits are not cleared and let
some bootloader fail to initialise the GPHYs. The bootloader don't
expect the GPHYs in reset, as they aren't during power on.
The asserts were a workaround for a wrong syscon-reboot mask. With a
mask set which includes the GPHY resets, these resets aren't required
any more.
Fixes: 126534141b45 ("MIPS: lantiq: Add a GPHY driver which uses the RCU syscon-mfd")
Signed-off-by: Mathias Kresin <dev@kresin.me>
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: John Crispin <john@phrozen.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.14+
Patchwork: https://patchwork.linux-mips.org/patch/19003/
[jhogan@kernel.org: Fix build warnings]
Signed-off-by: James Hogan <jhogan@kernel.org>
|
|
This patch adds support of external interrupt for
gpio[a..k], gpioz
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
This patch adds external interrupt (exti) support
on stm32mp157c SoC.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
-Parent domain of stm32gpio evolves to hierarchy domain
and could have a handle_fasteoi_irq. So an irq_eoi parent callback
is needed for children.
-Replace space by tabulation.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
This patch adds suspend/resume feature for exti hierarchy domain.
-suspend function sets wake_active into imr of each banks
-resume function restores the mask_cache interrupt into
imr of each banks
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
Exti controller has been differently integrated on stm32mp1 SoC.
A parent irq has only one external interrupt. A hierachy domain could
be used. Handlers are call by parent, each parent interrupt could be
masked and unmasked according to the needs.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
This patch prepares functions which could be reused by
next variant of stm32 exti controller.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
This patch adds host and driver data structures to support
different stm32 exti controllers with variants.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
This patch adds suspend feature.
-Use default irq_set_wake function to store wakeup request.
-Suspend function set wake_active into imr of each bank
and save rising/falling trigger registers.
-Resume function restore the mask_cache interrupt into
imr of each bank and restore rising/falling trigger registers.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
This patch adds support of rising/falling pending registers.
Falling pending register (fpr) is needed for next revision.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
-WARNING: struct irq_domain_ops should normally be const
-CHECK: Alignment should match open parenthesis
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
- In stm32_exti_alloc function, discards irq_domain_set_info
with handle_simple_irq. This overwrite the setting defined while init
of generic chips. Exti controller manages edge irq type.
- Removes acking in chained irq handler as this is done by
irq_chip itself inside handle_edge_irq
- removes unneeded irq_domain_ops.xlate callback
Acked-by: Ludovic Barre <ludovic.barre@st.com>
Tested-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Radoslaw Pietrzyk <radoslaw.pietrzyk@gmail.com>
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
The Meson-AXG SoC uses the same GPIO interrupt controller IP block as the other
Meson SoCs. A total of 100 pins can be spied on, which is the sum of:
- 255:100 Undefined(no interrupt)
- 99:84, 16 pins on bank GPIOY
- 83:61, 23 pins on bank GPIOX
- 60:40, 21 pins on bank GPIOA
- 39:25, 15 pins on bank BOOT
- 24:14, 11 pins on bank GPIOZ
- 13:0 , 14 pins in the AO domain
Signed-off-by: Yixun Lan <yixun.lan@amlogic.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
Update the dt-binding documentation to support new compatible string
for the GPIO interrupt controller which found in Amlogic's Meson-AXG SoC.
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Yixun Lan <yixun.lan@amlogic.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
The double quotes seems not ASCII type, fix it here.
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Yixun Lan <yixun.lan@amlogic.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
Commit 15122ee2c515 ("arm64: Enforce BBM for huge IO/VMAP mappings")
disallowed block mappings for ioremap since that code does not honor
break-before-make. The same APIs are also used for permission updating
though and the extra checks prevent the permission updates from happening,
even though this should be permitted. This results in read-only permissions
not being fully applied. Visibly, this can occasionaly be seen as a failure
on the built in rodata test when the test data ends up in a section or
as an odd RW gap on the page table dump. Fix this by using
pgattr_change_is_safe instead of p*d_present for determining if the
change is permitted.
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Peter Robinson <pbrobinson@gmail.com>
Reported-by: Peter Robinson <pbrobinson@gmail.com>
Fixes: 15122ee2c515 ("arm64: Enforce BBM for huge IO/VMAP mappings")
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
If 020/030 support is enabled, get_io_area() leaves an IO_SIZE gap
between mappings which is added to the vm_struct representing the
mapping. __ioremap() uses the actual requested size (after alignment),
while __iounmap() is passed the size from the vm_struct.
On 020/030, early termination descriptors are used to set up mappings of
extent 'size', which are validated on unmapping. The unmapped gap of
size IO_SIZE defeats the sanity check of the pmd tables, causing
__iounmap() to loop forever on 030.
On 040/060, unmapping of page table entries does not check for a valid
mapping, so the umapping loop always completes there.
Adjust size to be unmapped by the gap that had been added in the
vm_struct prior.
This fixes the hang in atari_platform_init() reported a long time ago,
and a similar one reported by Finn recently (addressed by removing
ioremap() use from the SWIM driver.
Tested on my Falcon in 030 mode - untested but should work the same on
040/060 (the extra page tables cleared there would never have been set
up anyway).
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
[geert: Minor commit description improvements]
[geert: This was fixed in 2.4.23, but not in 2.5.x]
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org
|
|
Mathieu Xhonneux says:
====================
As of Linux 4.14, it is possible to define advanced local processing for
IPv6 packets with a Segment Routing Header through the seg6local LWT
infrastructure. This LWT implements the network programming principles
defined in the IETF "SRv6 Network Programming" draft.
The implemented operations are generic, and it would be very interesting to
be able to implement user-specific seg6local actions, without having to
modify the kernel directly. To do so, this patchset adds an End.BPF action
to seg6local, powered by some specific Segment Routing-related helpers,
which provide SR functionalities that can be applied on the packet. This
BPF hook would then allow to implement specific actions at native kernel
speed such as OAM features, advanced SR SDN policies, SRv6 actions like
Segment Routing Header (SRH) encapsulation depending on the content of
the packet, etc.
This patchset is divided in 6 patches, whose main features are :
- A new seg6local action End.BPF with the corresponding new BPF program
type BPF_PROG_TYPE_LWT_SEG6LOCAL. Such attached BPF program can be
passed to the LWT seg6local through netlink, the same way as the LWT
BPF hook operates.
- 3 new BPF helpers for the seg6local BPF hook, allowing to edit/grow/
shrink a SRH and apply on a packet some of the generic SRv6 actions.
- 1 new BPF helper for the LWT BPF IN hook, allowing to add a SRH through
encapsulation (via IPv6 encapsulation or inlining if the packet contains
already an IPv6 header).
As this patchset adds a new LWT BPF hook, I took into account the result
of the discussions when the LWT BPF infrastructure got merged. Hence, the
seg6local BPF hook doesn't allow write access to skb->data directly, only
the SRH can be modified through specific helpers, which ensures that the
integrity of the packet is maintained. More details are available in the
related patches messages.
The performances of this BPF hook have been assessed with the BPF JIT
enabled on an Intel Xeon X3440 processors with 4 cores and 8 threads
clocked at 2.53 GHz. No throughput losses are noted with the seg6local
BPF hook when the BPF program does nothing (440kpps). Adding a 8-bytes
TLV (1 call each to bpf_lwt_seg6_adjust_srh and bpf_lwt_seg6_store_bytes)
drops the throughput to 410kpps, and inlining a SRH via bpf_lwt_seg6_action
drops the throughput to 420kpps. All throughputs are stable.
Changelog:
v2: move the SRH integrity state from skb->cb to a per-cpu buffer
v3: - document helpers in man-page style
- fix kbuild bugs
- un-break BPF LWT out hook
- bpf_push_seg6_encap is now static
- preempt_enable is now called when the packet is dropped in
input_action_end_bpf
v4: fix kbuild bugs when CONFIG_IPV6=m
v5: fix kbuild sparse warnings when CONFIG_IPV6=m
v6: fix skb pointers-related bugs in helpers
v7: - fix memory leak in error path of End.BPF setup
- add freeing of BPF data in seg6_local_destroy_state
- new enums SEG6_LOCAL_BPF_* instead of re-using ones of lwt bpf for
netlink nested bpf attributes
- SEG6_LOCAL_BPF_PROG attr now contains prog->aux->id when dumping
state
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Add a new test for the seg6local End.BPF action. The following helpers
are also tested:
- bpf_lwt_push_encap within the LWT BPF IN hook
- bpf_lwt_seg6_action
- bpf_lwt_seg6_adjust_srh
- bpf_lwt_seg6_store_bytes
A chain of End.BPF actions is built. The SRH is injected through a LWT
BPF IN hook before entering this chain. Each End.BPF action validates
the previous one, otherwise the packet is dropped. The test succeeds
if the last node in the chain receives the packet and the UDP datagram
contained can be retrieved from userspace.
Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
This patch adds the End.BPF action to the LWT seg6local infrastructure.
This action works like any other seg6local End action, meaning that an IPv6
header with SRH is needed, whose DA has to be equal to the SID of the
action. It will also advance the SRH to the next segment, the BPF program
does not have to take care of this.
Since the BPF program may not be a source of instability in the kernel, it
is important to ensure that the integrity of the packet is maintained
before yielding it back to the IPv6 layer. The hook hence keeps track if
the SRH has been altered through the helpers, and re-validates its
content if needed with seg6_validate_srh. The state kept for validation is
stored in a per-CPU buffer. The BPF program is not allowed to directly
write into the packet, and only some fields of the SRH can be altered
through the helper bpf_lwt_seg6_store_bytes.
Performances profiling has shown that the SRH re-validation does not induce
a significant overhead. If the altered SRH is deemed as invalid, the packet
is dropped.
This validation is also done before executing any action through
bpf_lwt_seg6_action, and will not be performed again if the SRH is not
modified after calling the action.
The BPF program may return 3 types of return codes:
- BPF_OK: the End.BPF action will look up the next destination through
seg6_lookup_nexthop.
- BPF_REDIRECT: if an action has been executed through the
bpf_lwt_seg6_action helper, the BPF program should return this
value, as the skb's destination is already set and the default
lookup should not be performed.
- BPF_DROP : the packet will be dropped.
Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com>
Acked-by: David Lebrun <dlebrun@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
The new bpf_lwt_push_encap helper should only be accessible within the
LWT BPF IN hook, and not the OUT one, as this may lead to a skb under
panic.
At the moment, both LWT BPF IN and OUT share the same list of helpers,
whose calls are authorized by the verifier. This patch separates the
verifier ops for the IN and OUT hooks, and allows the IN hook to call the
bpf_lwt_push_encap helper.
This patch is also the occasion to put all lwt_*_func_proto functions
together for clarity. At the moment, socks_op_func_proto is in the middle
of lwt_inout_func_proto and lwt_xmit_func_proto.
Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com>
Acked-by: David Lebrun <dlebrun@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
The BPF seg6local hook should be powerful enough to enable users to
implement most of the use-cases one could think of. After some thinking,
we figured out that the following actions should be possible on a SRv6
packet, requiring 3 specific helpers :
- bpf_lwt_seg6_store_bytes: Modify non-sensitive fields of the SRH
- bpf_lwt_seg6_adjust_srh: Allow to grow or shrink a SRH
(to add/delete TLVs)
- bpf_lwt_seg6_action: Apply some SRv6 network programming actions
(specifically End.X, End.T, End.B6 and
End.B6.Encap)
The specifications of these helpers are provided in the patch (see
include/uapi/linux/bpf.h).
The non-sensitive fields of the SRH are the following : flags, tag and
TLVs. The other fields can not be modified, to maintain the SRH
integrity. Flags, tag and TLVs can easily be modified as their validity
can be checked afterwards via seg6_validate_srh. It is not allowed to
modify the segments directly. If one wants to add segments on the path,
he should stack a new SRH using the End.B6 action via
bpf_lwt_seg6_action.
Growing, shrinking or editing TLVs via the helpers will flag the SRH as
invalid, and it will have to be re-validated before re-entering the IPv6
layer. This flag is stored in a per-CPU buffer, along with the current
header length in bytes.
Storing the SRH len in bytes in the control block is mandatory when using
bpf_lwt_seg6_adjust_srh. The Header Ext. Length field contains the SRH
len rounded to 8 bytes (a padding TLV can be inserted to ensure the 8-bytes
boundary). When adding/deleting TLVs within the BPF program, the SRH may
temporary be in an invalid state where its length cannot be rounded to 8
bytes without remainder, hence the need to store the length in bytes
separately. The caller of the BPF program can then ensure that the SRH's
final length is valid using this value. Again, a final SRH modified by a
BPF program which doesn’t respect the 8-bytes boundary will be discarded
as it will be considered as invalid.
Finally, a fourth helper is provided, bpf_lwt_push_encap, which is
available from the LWT BPF IN hook, but not from the seg6local BPF one.
This helper allows to encapsulate a Segment Routing Header (either with
a new outer IPv6 header, or by inlining it directly in the existing IPv6
header) into a non-SRv6 packet. This helper is required if we want to
offer the possibility to dynamically encapsulate a SRH for non-SRv6 packet,
as the BPF seg6local hook only works on traffic already containing a SRH.
This is the BPF equivalent of the seg6 LWT infrastructure, which achieves
the same purpose but with a static SRH per route.
These helpers require CONFIG_IPV6=y (and not =m).
Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com>
Acked-by: David Lebrun <dlebrun@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
The function lookup_nexthop is essential to implement most of the seg6local
actions. As we want to provide a BPF helper allowing to apply some of these
actions on the packet being processed, the helper should be able to call
this function, hence the need to make it public.
Moreover, if one argument is incorrect or if the next hop can not be found,
an error should be returned by the BPF helper so the BPF program can adapt
its processing of the packet (return an error, properly force the drop,
...). This patch hence makes this function return dst->error to indicate a
possible error.
Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com>
Acked-by: David Lebrun <dlebrun@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
include/net/seg6.h cannot be included in a source file if CONFIG_IPV6 is
not enabled:
include/net/seg6.h: In function 'seg6_pernet':
>> include/net/seg6.h:52:14: error: 'struct net' has no member named
'ipv6'; did you mean 'ipv4'?
return net->ipv6.seg6_data;
^~~~
ipv4
This commit makes seg6_pernet return NULL if IPv6 is not compiled, hence
allowing seg6.h to be included regardless of the configuration.
Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Jun Wu at Facebook reported that an internal service was seeing a return
value of 1 from ftruncate() on Btrfs in some cases. This is coming from
the NEED_TRUNCATE_BLOCK return value from btrfs_truncate_inode_items().
btrfs_truncate() uses two variables for error handling, ret and err.
When btrfs_truncate_inode_items() returns non-zero, we set err to the
return value. However, NEED_TRUNCATE_BLOCK is not an error. Make sure we
only set err if ret is an error (i.e., negative).
To reproduce the issue: mount a filesystem with -o compress-force=zstd
and the following program will encounter return value of 1 from
ftruncate:
int main(void) {
char buf[256] = { 0 };
int ret;
int fd;
fd = open("test", O_CREAT | O_WRONLY | O_TRUNC, 0666);
if (fd == -1) {
perror("open");
return EXIT_FAILURE;
}
if (write(fd, buf, sizeof(buf)) != sizeof(buf)) {
perror("write");
close(fd);
return EXIT_FAILURE;
}
if (fsync(fd) == -1) {
perror("fsync");
close(fd);
return EXIT_FAILURE;
}
ret = ftruncate(fd, 128);
if (ret) {
printf("ftruncate() returned %d\n", ret);
close(fd);
return EXIT_FAILURE;
}
close(fd);
return EXIT_SUCCESS;
}
Fixes: ddfae63cc8e0 ("btrfs: move btrfs_truncate_block out of trans handle")
CC: stable@vger.kernel.org # 4.15+
Reported-by: Jun Wu <quark@fb.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The default zpos property for all planes in Exynos DRM was fixed as zero.
Fix this by providing proper value provided by hardware drivers, which
typically matches hardware window number.
Reported-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Fixes: e47726a11e11 ("drm/exynos: use generic code for managing zpos plane property")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
|