Age | Commit message (Collapse) | Author |
|
for (clk = 0x80000080; new_clock >= (clock << 1); clk >>= 1)
clock <<= 1;
... is too tricky, hence I replaced with
roundup_pow_of_two(divisor) >> 2
'(clk >> 22) & 0x1' is the bit test for the 1/1 divisor, but
it is not clear. 'divisor <= 1' is easier to understand.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
renesas_sdhi_clk_start() and renesas_sdhi_clk_stop() are now only
called from renesas_sdhi_set_clock(). Merge them.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
When using DMA, if the DMA addr spans 128MB boundary, we have to split
the DMA transfer into two so that each one doesn't exceed the boundary.
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Add this hook so that it can be overridden with driver specific
implementations. We also let the original sdhci_adma_write_desc()
accept &desc so that the function can set its new value. Then export
the function so that it could be reused by driver's specific
implementations.
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
This patch adds adma_table_cnt member to struct sdhci_host to give more
flexibility to drivers to control the ADMA table count.
Default value of adma_table_cnt is set to (SDHCI_MAX_SEGS * 2 + 1).
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
In preparation to remove the node name pointer from struct device_node,
convert printf users to use the %pOFn format specifier.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hu Ziji <huziji@marvell.com>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: linux-mmc@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Allow SDHCI drivers to hook code before and after sdhci_request() by
making it externally visible.
Signed-off-by: Aapo Vienamo <avienamo@nvidia.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
After host requests RESET_FOR_ALL action, the hardware output an
interrupt for OS and waiting for the OS to approve.
Before writing this fix, ACPI GED has handled the interrupt. But
the ACPI GED belongs to a slow process, and sometimes the handling
process time is more than 100ms(Mutex wait more than 100ms). So
drop the GED solution and add this quirk fix.
Signed-off-by: Wang Dongsheng <dongsheng.wang@hxt-semitech.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
The device specific resource can be free in free_slot after
removing host controller.
Signed-off-by: Wang Dongsheng <dongsheng.wang@hxt-semitech.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
In tuning mode of operation, when TBCTL[TB_EN] is set, eSDHC may report
one of the following errors :
1)Tuning error while running tuning operation where SYSCTL2[SAMPCLKSEL]
will not get set even when SYSCTL2[EXTN] is reset. OR
2)Data transaction error (e.g. IRQSTAT[DCE], IRQSTAT[DEBE]) during data
transaction errors.
This issue occurs when the data window sampled within eSDHC is in full
cycle. So, in that case, eSDHC is not able to find out the start and
end points of the data window and sets the sampling pointer at default
location (which is middle of the internal SD clock). If this sampling
point coincides with the data eye boundary, then it can result in the
above mentioned errors. Impact: Tuning mode of operation for SDR50,
SDR104 or HS200 speed modes may not work properly
Workaround: In case eSDHC reports tuning error or data errors in tuning
mode of operation, by add the erratum A008171 support to fix the issue.
Signed-off-by: Yinbo Zhu <yinbo.zhu@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
This patch is to add tuning error codes to
judge tuning state
Signed-off-by: Yinbo Zhu <yinbo.zhu@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Here is another TMIO MMC variant found in Socionext UniPhier SoCs.
As commit b6147490e6aa ("mmc: tmio: split core functionality, DMA and
MFD glue") said, these MMC controllers use the IP from Panasonic.
However, the MMC controller in the TMIO (Toshiba Mobile IO) MFD chip
was the first upstreamed user of this IP. The common driver code
for this IP is now called 'tmio-mmc-core' in Linux although it is a
historical misnomer.
Anyway, this driver select's MMC_TMIO_CORE to borrow the common code
from tmio-mmc-core.c
Older UniPhier SoCs (LD4, Pro4, sLD8) support the external DMA engine
like renesas_sdhi_sys_dmac.c. The difference is UniPhier SoCs use a
single DMA channel whereas Renesas chips request separate channels for
RX and TX.
Newer UniPhier SoCs (Pro5 and later) support the internal DMA engine
like renesas_sdhi_internal_dmac.c The register map is almost the same,
so I guess Renesas and Socionext use the same internal DMA hardware.
The main difference is, the register offsets are doubled for Renesas.
Renesas Socionext
SDHI UniPhier
DM_CM_DTRAN_MODE 0x820 0x410
DM_CM_DTRAN_CTRL 0x828 0x414
DM_CM_RST 0x830 0x418
DM_CM_INFO1 0x840 0x420
DM_CM_INFO1_MASK 0x848 0x424
DM_CM_INFO2 0x850 0x428
DM_CM_INFO2_MASK 0x858 0x42c
DM_DTRAN_ADDR 0x880 0x440
DM_DTRAN_ADDREX --- 0x444
This comes from the difference of host->bus_shift; 2 for Renesas SoCs,
and 1 for UniPhier SoCs. Also, the datasheet for UniPhier SoCs defines
DM_DTRAN_ADDR and DM_DTRAN_ADDREX as two separate registers.
It could be possible to factor out the DMA common code by introducing
some hooks to cope with platform quirks, but this patch does not touch
that for now.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
This SD/eMMC controller is used for UniPhier SoC family.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
tmio_mmc_set_clock() is full of quirks because different SoC vendors
extended this in different ways.
The original IP defines the divisor range 1/2 ... 1/512.
bit 7 is set: 1/512
bit 6 is set: 1/256
...
bit 0 is set: 1/4
all bits clear: 1/2
It is platform-dependent how to achieve the 1/1 clock.
I guess the TMIO-MFD variant uses the clock selector outside of this IP,
as far as I see tmio_core_mmc_clk_div() in drivers/mfd/tmio_core.c
I guess bit[7:0]=0xff is Renesas-specific extension.
Socionext (and Panasonic) uses bit 10 (CLKSEL) for 1/1. Also, newer
versions of UniPhier SoC variants use bit 16 for 1/1024.
host->clk_update() is only used by the Renesas variants, whereas
host->set_clk_div() is only used by the TMIO-MFD variants.
To cope with this mess, promote tmio_mmc_set_clock() to a new
platform hook ->set_clock(), and melt the old two hooks into it.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
tmio_mmc_clk_stop(host) is equivalent to tmio_mmc_set_clock(host, 0).
This replacement is needed for the next commit.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
The JZ4725B is the first JZ SoC version that introduced a 32-bit IMASK
register, not the JZ4750.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Document the R-Car V3M (R8A77970) SoC in the R-Car SDHI bindings -- it's
the usual R-Car gen3 compatible controller with the internal DMA engine.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
|
|
I've successfully tested eMMC on the V3H Starter Kit board and since the
R8A77970 SoC has a single SDHI core, it can't be a subject to the known RX
DMA errata.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
|
|
Remove the stray underscore in the DM_CM_DTRAN_MODE.BUS_WIDTH register
field name and fix the typo in the comment of the #define
DTRAN_MODE_CH_NUM_CH1.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Depending on MACH_JZ4740 | MACH_JZ4780 prevent us from creating a generic
kernel that works on more than one MIPS board. Instead, we just depend on
MIPS being set.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Enable access to the RPMB on the on-board eMMC of the
Poplar board.
Signed-off-by: Igor Opaniuk <igor.opaniuk@linaro.org>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Document RZ/G2M (R8A774A1) SoC bindings.
Signed-off-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com>
Reviewed-by: Biju Das <biju.das@bp.renesas.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
We need r8a774a1 to be whitelisted for SDHI to work on the RZ/G2M,
but we don't care about the revision of the SoC, so just whitelist
the generic part number.
Signed-off-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com>
Reviewed-by: Biju Das <biju.das@bp.renesas.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
When mmc-pwrseq property is passed mmc_pwrseq_alloc() can return
-EPROBE_DEFER because driver for power sequence provider is not probed
yet. Do not show error message when this situation happens.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Add ACPI support to all IPROC SDHCI variants.
Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Reviewed-by: Vladimir Olovyannikov <vladimir.olovyannikov@broadcom.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Convert DT properties to generic device properties
so that drivers can get properties from DT or ACPI.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Srinath Mannam <srinath.mannam@broadcom.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
So:
- use 'extern' consistently for APIs
- fix weird header guard
- clarify code comments
- reorder APIs by type
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Markus T Metzger <markus.t.metzger@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1537312139-5580-2-git-send-email-chang.seok.bae@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
limit CPU/node NR trick
We have a special segment descriptor entry in the GDT, whose sole purpose is to
encode the CPU and node numbers in its limit (size) field. There are user-space
instructions that allow the reading of the limit field, which gives us a really
fast way to read the CPU and node IDs from the vDSO for example.
But the naming of related functionality does not make this clear, at all:
VDSO_CPU_SIZE
VDSO_CPU_MASK
__CPU_NUMBER_SEG
GDT_ENTRY_CPU_NUMBER
vdso_encode_cpu_node
vdso_read_cpu_node
There's a number of problems:
- The 'VDSO_CPU_SIZE' doesn't really make it clear that these are number
of bits, nor does it make it clear which 'CPU' this refers to, i.e.
that this is about a GDT entry whose limit encodes the CPU and node number.
- Furthermore, the 'CPU_NUMBER' naming is actively misleading as well,
because the segment limit encodes not just the CPU number but the
node ID as well ...
So use a better nomenclature all around: name everything related to this trick
as 'CPUNODE', to make it clear that this is something special, and add
_BITS to make it clear that these are number of bits, and propagate this to
every affected name:
VDSO_CPU_SIZE => VDSO_CPUNODE_BITS
VDSO_CPU_MASK => VDSO_CPUNODE_MASK
__CPU_NUMBER_SEG => __CPUNODE_SEG
GDT_ENTRY_CPU_NUMBER => GDT_ENTRY_CPUNODE
vdso_encode_cpu_node => vdso_encode_cpunode
vdso_read_cpu_node => vdso_read_cpunode
This, beyond being less confusing, also makes it easier to grep for all related
functionality:
$ git grep -i cpunode arch/x86
Also, while at it, fix "return is not a function" style sloppiness in vdso_encode_cpunode().
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Markus T Metzger <markus.t.metzger@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1537312139-5580-2-git-send-email-chang.seok.bae@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Currently the CPU/node NR segment descriptor (GDT_ENTRY_CPU_NUMBER) is
initialized relatively late during CPU init, from the vCPU code, which
has a number of disadvantages, such as hotplug CPU notifiers and SMP
cross-calls.
Instead just initialize it much earlier, directly in cpu_init().
This reduces complexity and increases robustness.
[ mingo: Wrote new changelog. ]
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Markus T Metzger <markus.t.metzger@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1537312139-5580-9-git-send-email-chang.seok.bae@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Clean up the CPU/node number related code a bit, to make it more apparent
how we are encoding/extracting the CPU and node fields from the
segment limit.
No change in functionality intended.
[ mingo: Wrote new changelog. ]
Suggested-by: Andy Lutomirski <luto@kernel.org>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Markus T Metzger <markus.t.metzger@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Rik van Riel <riel@surriel.com>
Link: http://lkml.kernel.org/r/1537312139-5580-8-git-send-email-chang.seok.bae@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The old 'per CPU' naming was misleading: 64-bit kernels don't use this
GDT entry for per CPU data, but to store the CPU (and node) ID.
[ mingo: Wrote new changelog. ]
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Markus T Metzger <markus.t.metzger@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Rik van Riel <riel@surriel.com>
Link: http://lkml.kernel.org/r/1537312139-5580-7-git-send-email-chang.seok.bae@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Instead of open coding the calls to load_seg_legacy(), introduce
x86_fsgsbase_load() to load FS/GS segments.
This makes it more explicit that this is part of FSGSBASE functionality,
and the new helper can be updated when FSGSBASE instructions are enabled.
[ mingo: Wrote new changelog. ]
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Markus T Metzger <markus.t.metzger@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Rik van Riel <riel@surriel.com>
Link: http://lkml.kernel.org/r/1537312139-5580-6-git-send-email-chang.seok.bae@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Replace open-coded rdmsr()'s with their <asm/fsgsbase.h> API
counterparts.
No change in functionality intended.
[ mingo: Wrote new changelog. ]
Based-on-code-from: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Markus T Metzger <markus.t.metzger@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Rik van Riel <riel@surriel.com>
Link: http://lkml.kernel.org/r/1537312139-5580-5-git-send-email-chang.seok.bae@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Use the new FS/GS base helper functions in <asm/fsgsbase.h> in the platform
specific ptrace implementation of the following APIs:
PTRACE_ARCH_PRCTL,
PTRACE_SETREG,
PTRACE_GETREG,
etc.
The fsgsbase code is more abstracted out this way and the FS/GS-update
mechanism will be easier to change this way.
[ mingo: Wrote new changelog. ]
Based-on-code-from: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Markus T Metzger <markus.t.metzger@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1537312139-5580-4-git-send-email-chang.seok.bae@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Introduce FS/GS base access functionality via <asm/fsgsbase.h>,
not yet used by anything directly.
Factor out task_seg_base() from x86/ptrace.c and rename it to
x86_fsgsbase_read_task() to make it part of the new helpers.
This will allow us to enhance FSGSBASE support and eventually enable
the FSBASE/GSBASE instructions.
An "inactive" GS base refers to a base saved at kernel entry
and being part of an inactive, non-running/stopped user-task.
(The typical ptrace model.)
Here are the new functions:
x86_fsbase_read_task()
x86_gsbase_read_task()
x86_fsbase_write_task()
x86_gsbase_write_task()
x86_fsbase_read_cpu()
x86_fsbase_write_cpu()
x86_gsbase_read_cpu_inactive()
x86_gsbase_write_cpu_inactive()
As an advantage of the unified namespace we can now see all FS/GSBASE
API use in the kernel via the following 'git grep' pattern:
$ git grep x86_.*sbase
[ mingo: Wrote new changelog. ]
Based-on-code-from: Andy Lutomirski <luto@kernel.org>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Markus T Metzger <markus.t.metzger@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1537312139-5580-3-git-send-email-chang.seok.bae@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
On 64-bit kernels ptrace can read the FS/GS base using the register access
APIs (PTRACE_PEEKUSER, etc.) or PTRACE_ARCH_PRCTL.
Make both of these mechanisms return the actual FS/GS base.
This will improve debuggability by providing the correct information
to ptracer such as GDB.
[ chang: Rebased and revised patch description. ]
[ mingo: Revised the changelog some more. ]
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Markus T Metzger <markus.t.metzger@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1537312139-5580-2-git-send-email-chang.seok.bae@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
If the current process has unlimited RLIMIT_MEMLOCK,
we should should leave it as is.
Fixes: 941ff6f11c02 ("bpf: fix rlimit in reuseport net selftest")
Signed-off-by: John Sperbeck <jsperbeck@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Quentin Monnet says:
====================
This patch series adds support for hardware offload of programs containing
BPF-to-BPF function calls. First, a new callback is added to the kernel
verifier, to collect information after the main part of the verification
has been performed. Then support for BPF-to-BPF calls is incrementally
added to the nfp driver, before offloading programs containing such calls
is eventually allowed by lifting the restriction in the kernel verifier, in
the last patch. Please refer to individual patches for details.
Many thanks to Jiong and Jakub for their precious help and contribution on
the main patches for the JIT-compiler, and everything related to stack
accesses.
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Now that there is at least one driver supporting BPF-to-BPF function
calls, lift the restriction, in the verifier, on hardware offload of
eBPF programs containing such calls. But prevent jit_subprogs(), still
in the verifier, from being run for offloaded programs.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Mark instructions that use pointers to areas in the stack outside of the
current stack frame, and process them accordingly in mem_op_stack().
This way, we also support BPF-to-BPF calls where the caller passes a
pointer to data in its own stack frame to the callee (typically, when
the caller passes an address to one of its local variables located in
the stack, as an argument).
Thanks to Jakub and Jiong for figuring out how to deal with this case,
I just had to turn their email discussion into this patch.
Suggested-by: Jiong Wang <jiong.wang@netronome.com>
Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
When pre-processing the instructions, it is trivial to detect what
subprograms are using R6, R7, R8 or R9 as destination registers. If a
subprogram uses none of those, then we do not need to jump to the
subroutines dedicated to saving and restoring callee-saved registers in
its prologue and epilogue.
This patch introduces detection of callee-saved registers in subprograms
and prevents the JIT from adding calls to those subroutines whenever we
can: we save some instructions in the translated program, and some time
at runtime on BPF-to-BPF calls and returns.
If no subprogram needs to save those registers, we can avoid appending
the subroutines at the end of the program.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
On performing a BPF-to-BPF call, we first jump to a subroutine that
pushes callee-saved registers (R6~R9) to the stack, and from there we
goes to the start of the callee next. In order to do so, the caller must
pass to the subroutine the address of the NFP instruction to jump to at
the end of that subroutine. This cannot be reliably implemented when
translated the caller, as we do not always know the start offset of the
callee yet.
This patch implement the required fixup step for passing the start
offset in the callee via the register used by the subroutine to hold its
return address.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Relocation for targets of BPF-to-BPF calls are required at the end of
translation. Update the nfp_fixup_branches() function in that regard.
When checking that the last instruction of each bloc is a branch, we
must account for the length of the instructions required to pop the
return address from the stack.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Offloaded programs using BPF-to-BPF calls use the stack to store the
return address when calling into a subprogram. Callees also need some
space to save eBPF registers R6 to R9. And contrarily to kernel
verifier, we align stack frames on 64 bytes (and not 32). Account for
all this when checking the stack size limit before JIT-ing the program.
This means we have to recompute maximum stack usage for the program, we
cannot get the value from the kernel.
In addition to adapting the checks on stack usage, move them to the
finalize() callback, now that we have it and because such checks are
part of the verification step rather than translation.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
This is the main patch for the logics of BPF-to-BPF calls in the nfp
driver.
The functions called on BPF_JUMP | BPF_CALL and BPF_JUMP | BPF_EXIT were
used to call helpers and exit from the program, respectively; make them
usable for calling into, or returning from, a BPF subprogram as well.
For all calls, push the return address as well as the callee-saved
registers (R6 to R9) to the stack, and pop them upon returning from the
calls. In order to limit the overhead in terms of instruction number,
this is done through dedicated subroutines. Jumping to the callee
actually consists in jumping to the subroutine, that "returns" to the
callee: this will require some fixup for passing the address in a later
patch. Similarly, returning consists in jumping to the subroutine, which
pops registers and then return directly to the caller (but no fixup is
needed here).
Return to the caller is performed with the RTN instruction newly added
to the JIT.
For the few steps where we need to know what subprogram an instruction
belongs to, the struct nfp_insn_meta is extended with a new subprog_idx
field.
Note that checks on the available stack size, to take into account the
additional requirements associated to BPF-to-BPF calls (storing R6-R9
and return addresses), are added in a later patch.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Similarly to "exit" or "helper call" instructions, BPF-to-BPF calls will
require additional processing before translation starts, in order to
record and mark jump destinations.
We also mark the instructions where each subprogram begins. This will be
used in a following commit to determine where to add prologues for
subprograms.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
The checks related to eBPF helper calls are performed each time the nfp
driver meets a BPF_JUMP | BPF_CALL instruction. However, these checks
are not relevant for BPF-to-BPF call (same instruction code, different
value in source register), so just skip the checks for such calls.
While at it, rename the function that runs those checks to make it clear
they apply to _helper_ calls only.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
In order to support BPF-to-BPF calls in offloaded programs, the nfp
driver must collect information about the distinct subprograms: namely,
the number of subprograms composing the complete program and the stack
depth of those subprograms. The latter in particular is non-trivial to
collect, so we copy those elements from the kernel verifier via the
newly added post-verification hook. The struct nfp_prog is extended to
store this information. Stack depths are stored in an array of dedicated
structs.
Subprogram start indexes are not collected. Instead, meta instructions
associated to the start of a subprogram will be marked with a flag in a
later patch.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|