Age | Commit message (Collapse) | Author |
|
All of the users are now in ipc/shm.c so make the definition local to
that file to make code maintenance easier. AKA to prevent rebuilding
the entire kernel when struct shmid_kernel changes.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
Send nm complaints about broken pipe (when sed exits early) to /dev/null.
All errors should be printed to stderr.
Don't trap on normal exit so the trap can return an error code.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
Define vdso_start and vdso_end as arrays to avoid a compile-time analysis
error when the kernel is built with CONFIG_FORTIFY_SOURCE.
Also, since vdso_start and vdso_end are used only in vdso.c,
move the extern declarations from vdso.h to vdso.c.
If the kernel is built with CONFIG_FORTIFY_SOURCE,
a compile-time error is reported at this code:
- if (memcmp(&vdso_start, "\177ELF", 4))
The object at "&vdso_start" is recognized as 1 byte in size, but n is 4,
so a compile-time error is reported.
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jinbum Park <jinb.park7@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
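The array-versus-scalar point above can be sketched in plain C. This is only an illustration under assumptions: `example_blob` and `example_has_elf_magic` are hypothetical names, and the blob is an ordinary array, not a linker-provided symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an embedded image. Declaring it as an array
 * tells the compiler that more than one byte is addressable from its start. */
const char example_blob[] = { '\177', 'E', 'L', 'F', 2, 1, 1, 0 };

/* Returns 1 when the first four bytes are the ELF magic "\177ELF". */
int example_has_elf_magic(const char *start, size_t len)
{
	assert(start != NULL);
	if (len < 4)
		return 0;	/* too short to hold the magic */
	return memcmp(start, "\177ELF", 4) == 0;
}
```

Because the symbol is an array, the 4-byte comparison stays within the object size the compiler can see.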
|
|
Without CONFIG_MMU, this results in a build failure:
./arch/arm/include/asm/memory.h:92:23: error: initializer element is not constant
#define VECTORS_BASE vectors_base
arch/arm/mm/dump.c:32:4: note: in expansion of macro 'VECTORS_BASE'
{ VECTORS_BASE, "Vectors" },
arch/arm/mm/dump.c:71:11: error: 'L_PTE_USER' undeclared here (not in a function); did you mean 'VTIME_USER'?
.mask = L_PTE_USER,
^~~~~~~~~~
Obviously the feature only makes sense with an MMU, so let's add the
dependency here.
Fixes: a8e53c151fe7 ("ARM: 8737/1: mm: dump: add checking for writable and executable")
Acked-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
Commit 384b38b66947 ("ARM: 7873/1: vfp: clear vfp_current_hw_state
for dying cpu") fixed the cpu dying notifier by clearing
vfp_current_hw_state[]. However commit e5b61bafe704 ("arm: Convert VFP
hotplug notifiers to state machine") incorrectly used the original
vfp_force_reload() function in the cpu dying notifier.
Fix it by going back to clearing vfp_current_hw_state[].
Fixes: e5b61bafe704 ("arm: Convert VFP hotplug notifiers to state machine")
Cc: linux-stable <stable@vger.kernel.org>
Reported-by: Kohji Okuno <okuno.kohji@jp.panasonic.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
After previous refactoring, there is only one user in the same file
left. Make the function static now.
[wsa: added 'int' to bare 'unsigned']
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
According to documentation, Bit 7 of ICMSR is unused and 0 should be
written to it. Fix the mask accordingly.
Signed-off-by: Hiromitsu Yamasaki <hiromitsu.yamasaki.ym@renesas.com>
[wsa: edited commit message]
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
R-Car M3-N (R8A77965) SoC has a R-Car Gen3-compatible I2C controller.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into i2c/for-4.17
"three new special cases for device tree compatible strings"
|
|
Check that the returned setup structure is not NULL before assigning it.
Fixes: 463a9215f3ca7600b5ff ("i2c: stm32f7: fix setup structure")
Signed-off-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com>
Acked-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
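A minimal sketch of the check-before-assign pattern described above. All names here (`example_get_setup`, `example_probe`, the structures) are hypothetical and do not come from the stm32f7 driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct example_setup { unsigned int speed_hz; };
struct example_dev { const struct example_setup *setup; };

/* Hypothetical lookup that returns NULL when nothing matches. */
const struct example_setup *example_get_setup(int have_match)
{
	static const struct example_setup match = { .speed_hz = 100000 };

	return have_match ? &match : NULL;
}

/* Assigns the setup only after confirming the lookup succeeded. */
int example_probe(struct example_dev *dev, int have_match)
{
	const struct example_setup *setup;

	assert(dev != NULL);
	setup = example_get_setup(have_match);
	if (!setup)
		return -ENODEV;	/* leave dev->setup untouched on failure */
	dev->setup = setup;
	return 0;
}
```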
|
|
Now that the i2c-pca-platform driver is using the device managed API for
gpios there is no need for the reset gpio to be specified via
i2c_pca9564_pf_platform_data.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Use device_property_read_u32 instead of of_property_read_u32_index to
look up the "clock-frequency" property.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Allow for the reset-gpios property to be defined in the device tree
or via a GPIO lookup table.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Define the GPIO connected to the PCA9564 using a GPIO lookup table. This
will allow the i2c-pca-platform driver to use the device managed APIs to
look up the GPIO instead of using platform_data.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
The following are the major issues in the current driver code:
1. The current driver simply assumes transfer completion
whenever it gets any non-error interrupt and then simply polls
the available/free bytes in the FIFO.
2. Block mode is not working properly since no handling is
being done for OUT_BLOCK_WRITE_REQ and IN_BLOCK_READ_REQ.
3. An i2c transfer can contain multiple messages, and QUP v2
supports reconfiguration during run, in which the mode should be the
same for all the sub-transfers. Currently the mode is being programmed
before every sub-transfer, which is functionally wrong. If one message
is shorter than the FIFO length and another message is longer than the
FIFO length, then transfers will fail.
Because of the above, i2c v2 transfers of size greater than 64 are failing
with the following error message:
i2c_qup 78b6000.i2c: timeout for fifo out full
To make block mode work properly and to move to interrupts
instead of polling, major code reorganization is required. The following
are the major changes done in this patch:
1. Remove the polling of TX FIFO free space and RX FIFO available
bytes and move to interrupts completely. QUP has QUP_MX_OUTPUT_DONE,
QUP_MX_INPUT_DONE, OUT_BLOCK_WRITE_REQ and IN_BLOCK_READ_REQ
interrupts to handle the FIFOs properly, so check all these interrupts.
2. Determine the mode for the transfer before starting by checking
all the tx/rx data lengths in each message. The complete message can be
transferred either in DMA mode or in Programmed IO by FIFO/Block mode.
In DMA mode, both tx and rx use the same mode, but in PIO mode, TX and
RX can be in different modes.
3. During write, for FIFO mode, the TX FIFO can be written directly
without checking for FIFO space. For block mode, the QUP will generate
an OUT_BLOCK_WRITE_REQ interrupt whenever it has a block size of
available space.
4. During read, both the TX and RX FIFOs will be used. TX will be used
for writing tags and RX will be used for receiving the data. In QUP,
TX and RX can operate in separate modes, so configure the modes accordingly.
5. For read FIFO mode, wait for the QUP_MX_INPUT_DONE interrupt, which
will be generated after all the bytes have been copied into the RX FIFO.
For read Block mode, QUP will generate IN_BLOCK_READ_REQ interrupts
whenever it has a block size of available data.
6. Split the transfer into chunks of one QUP block size (256 bytes)
and schedule each block separately. QUP v2 supports reconfiguration
during run, in which QUP can transfer multiple blocks without issuing a
stop event.
7. Port the SMBus block read support to the new code.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
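The chunking in item 6 above can be sketched as simple arithmetic. This is an illustration only: the function names are hypothetical, and the only value taken from the text is the 256-byte block size.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_QUP_BLOCK_LEN 256u	/* block size quoted in the text above */

/* Number of chunks needed for a message of total_len bytes. */
size_t example_chunk_count(size_t total_len)
{
	return (total_len + EXAMPLE_QUP_BLOCK_LEN - 1) / EXAMPLE_QUP_BLOCK_LEN;
}

/* Length of chunk number idx (0-based); only the last chunk may be short. */
size_t example_chunk_len(size_t total_len, size_t idx)
{
	size_t done = idx * EXAMPLE_QUP_BLOCK_LEN;
	size_t left;

	assert(done < total_len);	/* idx must name an existing chunk */
	left = total_len - done;
	return left < EXAMPLE_QUP_BLOCK_LEN ? left : EXAMPLE_QUP_BLOCK_LEN;
}
```

For a 600-byte message this yields three chunks of 256, 256 and 88 bytes.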
|
|
The following are the major issues in the current driver code:
1. The current driver simply assumes transfer completion
whenever it gets any non-error interrupt and then simply polls
the available/free bytes in the FIFO.
2. Block mode is not working properly since no handling is
being done for OUT_BLOCK_WRITE_REQ and IN_BLOCK_READ_REQ.
Because of the above, i2c v1 transfers of size greater than 32 are failing
with the following error message:
i2c_qup 78b6000.i2c: timeout for fifo out full
To make block mode work properly and to move to interrupts
instead of polling, major code reorganization is required. The following
are the major changes done in this patch:
1. Remove the polling of TX FIFO free space and RX FIFO available
bytes and move to interrupts completely. QUP has QUP_MX_OUTPUT_DONE,
QUP_MX_INPUT_DONE, OUT_BLOCK_WRITE_REQ and IN_BLOCK_READ_REQ
interrupts to handle the FIFOs properly, so check all these interrupts.
2. During write, for FIFO mode, the TX FIFO can be written directly
without checking for FIFO space. For block mode, the QUP will generate
an OUT_BLOCK_WRITE_REQ interrupt whenever it has a block size of
available space.
3. During read, both the TX and RX FIFOs will be used. TX will be used
for writing tags and RX will be used for receiving the data. In QUP,
TX and RX can operate in separate modes, so configure the modes accordingly.
4. For read FIFO mode, wait for the QUP_MX_INPUT_DONE interrupt, which
will be generated after all the bytes have been copied into the RX FIFO.
For read Block mode, QUP will generate IN_BLOCK_READ_REQ interrupts
whenever it has a block size of available data.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
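The independent TX/RX mode selection for reads can be sketched as follows. The rule "FIFO mode when the data fits in the FIFO, block mode otherwise" is an assumption made for illustration; the names and the FIFO size parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum example_io_mode { EXAMPLE_MODE_FIFO, EXAMPLE_MODE_BLOCK };

struct example_read_modes {
	enum example_io_mode tx;	/* TX side carries the tags */
	enum example_io_mode rx;	/* RX side carries the received data */
};

/* Assumed rule: use FIFO mode only when everything fits in the FIFO. */
enum example_io_mode example_pick_mode(size_t len, size_t fifo_size)
{
	assert(fifo_size > 0);
	return len <= fifo_size ? EXAMPLE_MODE_FIFO : EXAMPLE_MODE_BLOCK;
}

/* For a read, TX and RX modes are chosen separately. */
struct example_read_modes example_read_setup(size_t tag_len, size_t data_len,
					     size_t fifo_size)
{
	struct example_read_modes m;

	m.tx = example_pick_mode(tag_len, fifo_size);
	m.rx = example_pick_mode(data_len, fifo_size);
	return m;
}
```

With a few tag bytes and a long read, TX can stay in FIFO mode while RX uses block mode.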
|
|
According to the I2C specification, “If a master-receiver sends a
repeated START condition, it sends a not-acknowledge (A) just
before the repeated START condition”. QUP v2 supports sending
a NACK without stop using QUP_TAG_V2_DATARD_NACK, so add
support for it.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Austin Christ <austinwc@codeaurora.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
The BAM mode requires buffers for start tag data and the tx, rx SG
lists. Currently, these are sized for the maximum transfer length
(65K). But an I2C transfer can have multiple messages and each
message can be of this maximum length, so a buffer overflow will
happen in this case. Increasing the buffer length is not
feasible since an I2C transfer can contain any number of messages,
so this patch makes the following changes to make i2c transfers work
in the multiple-message case.
1. Calculate the required buffers for 2 maximum length messages
(65K * 2).
2. Split the descriptor formation and descriptor scheduling.
The idea is to fit as many messages as possible in one DMA transfer
within the 65K threshold value (max_xfer_sg_len). Whenever sg_cnt
crosses this, schedule the BAM transfer; the subsequent
transfer will again start from zero.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
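The accumulate-then-schedule idea above can be sketched with plain lengths in place of SG entries. This is a simplified model under assumptions: the threshold constant and function name are hypothetical, and the real driver counts scatter-gather entries, not bytes.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_MAX_XFER_LEN 65535u	/* hypothetical per-batch threshold */

/* Counts how many DMA batches are scheduled for a list of messages.
 * A batch is scheduled as soon as the queued length crosses the threshold. */
size_t example_count_dma_batches(const size_t *msg_len, size_t nmsgs)
{
	size_t batches = 0, pending = 0, i;

	for (i = 0; i < nmsgs; i++) {
		assert(msg_len[i] <= EXAMPLE_MAX_XFER_LEN);
		pending += msg_len[i];
		/* Queued data never exceeds two maximum-length messages. */
		assert(pending <= 2u * EXAMPLE_MAX_XFER_LEN);
		if (pending > EXAMPLE_MAX_XFER_LEN) {
			batches++;	/* schedule what has been queued */
			pending = 0;	/* the next batch starts from zero */
		}
	}
	if (pending)
		batches++;		/* schedule the final partial batch */
	return batches;
}
```

The inner assertion shows why sizing the buffers for two maximum-length messages is enough in this model: the queue is at most one threshold long before a message is added, and each message is at most one threshold long.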
|
|
Currently the completion timeout is derived from the
maximum transfer length, which is too high if SCL is operating at
a high frequency. This patch calculates the timeout on the basis of
the one-byte transfer time and uses it for the completion timeout.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
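A sketch of a timeout derived from the one-byte transfer time. The formula is an assumption for illustration (9 clock cycles per byte: 8 data bits plus the ACK/NACK bit, plus a caller-chosen margin); it is not taken from the patch, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_BITS_PER_BYTE_ON_WIRE 9u	/* 8 data bits + 1 ACK/NACK bit */
#define EXAMPLE_USEC_PER_SEC 1000000u

/* Time to clock one byte on the bus, in microseconds, rounded up. */
uint32_t example_one_byte_time_us(uint32_t scl_hz)
{
	assert(scl_hz > 0);
	return (EXAMPLE_BITS_PER_BYTE_ON_WIRE * EXAMPLE_USEC_PER_SEC +
		scl_hz - 1) / scl_hz;
}

/* Completion timeout for len bytes plus a fixed safety margin. */
uint64_t example_xfer_timeout_us(uint32_t len, uint32_t scl_hz,
				 uint32_t margin_us)
{
	return (uint64_t)len * example_one_byte_time_us(scl_hz) + margin_us;
}
```

At 100 kHz one byte takes 90 us in this model; at 400 kHz it takes 23 us, so the same transfer length gets a proportionally shorter timeout.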
|
|
Currently each message length in the complete transfer is
checked to determine DMA mode, and if any message length
is less than the FIFO length then non-DMA mode is used, which
increases overhead. DMA can be used for any length and the choice
should be made from the complete transfer length. This
patch selects DMA mode if the total length is greater than the FIFO
length.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Austin Christ <austinwc@codeaurora.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
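The total-length rule above can be sketched as follows. The function name and parameters are hypothetical, and any other conditions the driver may apply are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when the summed length of all messages exceeds the FIFO length. */
int example_use_dma(const size_t *msg_len, size_t nmsgs, size_t fifo_len)
{
	size_t total = 0, i;

	assert(msg_len != NULL || nmsgs == 0);
	for (i = 0; i < nmsgs; i++)
		total += msg_len[i];
	return total > fifo_len;
}
```

Two 40-byte messages with a 64-byte FIFO select DMA under this rule, although each message alone is shorter than the FIFO.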
|
|
Currently the i2c error handling in BAM mode does not work
properly under stress conditions.
1. After an error, the FIFOs are written with FLUSH and
EOT tags, which should not be required since these tags
have already been written in the BAM descriptor itself.
2. The QUP state is moved to RESET in the IRQ handler in case
of error. When the QUP HW encounters an error in BAM mode, it
moves the QUP STATE to the PAUSE state. In this case, the I2C_FLUSH
command needs to be executed while moving to RUN_STATE by writing
to the QUP_STATE register with the I2C_FLUSH bit set to 1.
3. In the error case, QUP sometimes generates more than one
interrupt, which will trigger the completion again. After an error,
the flush operation is scheduled after calling
reinit_completion, and the completion should be triggered by the BAM
IRQ callback. If the second QUP IRQ arrives during this time, it will
call complete and the transfer function will assume that all the
BAM HW descriptors have been completed.
4. The DMA release is called after each error, which
frees the DMA tx and rx channels. Errors like NACK are very
common in I2C transfers, and this adds overhead every time. Now that
the error handling is correct, this channel release can be
avoided completely.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Sricharan R <sricharan@codeaurora.org>
Reviewed-by: Austin Christ <austinwc@codeaurora.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
In case of FLUSH operation, BAM copies INPUT EOT FLUSH (0x94)
instead of normal EOT (0x93) tag in input data stream when an
input EOT tag is received during flush operation. So only one tag
will be written instead of 2 separate tags.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
The role of the FLUSH and EOT tags is to flush already scheduled
descriptors in BAM HW in case of error. EOT is required only
when descriptors are scheduled in the RX FIFO. If all the messages
are WRITE, then only the FLUSH tag will be used.
A single BAM transfer can have multiple read and write messages.
The EOT and FLUSH tags should be scheduled at the end of the BAM HW
descriptors. Since READ and WRITE can be present in any order,
in some cases these tags are not being written
correctly.
The following is one example:
READ, READ, READ, READ
Currently EOT and FLUSH tags are written after each READ.
If QUP gets a NACK for the first READ itself, then a flush will be
triggered. It will look for the first FLUSH tag in the TX FIFO and will
stop there, so only the descriptors for the first READ will be
flushed. All the scheduled descriptors should be cleared to
generate the BAM DMA completion.
Now this patch schedules FLUSH and EOT only once, after all the
descriptors. So, the flush will clear all the scheduled descriptors and
BAM will generate the completion interrupt.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
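The rule described above (one FLUSH at the end, plus EOT only when at least one message is a READ) can be sketched as follows. The flag bits and names are hypothetical and are not the hardware tag values.

```c
#include <assert.h>
#include <stddef.h>

enum example_dir { EXAMPLE_WRITE, EXAMPLE_READ };

/* Illustrative flag bits, not the tag bytes used by the hardware. */
#define EXAMPLE_TAG_FLUSH 0x1u
#define EXAMPLE_TAG_EOT   0x2u

/* Decides which tags are appended once, after all descriptors. */
unsigned int example_trailing_tags(const enum example_dir *msgs, size_t nmsgs)
{
	unsigned int tags = EXAMPLE_TAG_FLUSH;	/* FLUSH is always scheduled */
	size_t i;

	assert(msgs != NULL && nmsgs > 0);
	for (i = 0; i < nmsgs; i++) {
		if (msgs[i] == EXAMPLE_READ) {
			tags |= EXAMPLE_TAG_EOT;	/* RX descriptors exist */
			break;
		}
	}
	return tags;
}
```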
|
|
The rx_nents and tx_nents are redundant. rx_buf and tx_buf can
be used for the total number of SG entries. Since the names rx_buf and
tx_buf give the impression of a buffer instead of a count, rename
them to rx_cnt and tx_cnt to make the variable names more
meaningful.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Austin Christ <austinwc@codeaurora.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
1. Assigns use_dma in qup_dev structure itself which will
help in subsequent patches to determine the mode in IRQ handler.
2. Does minor code reorganization for loops to reduce the
unnecessary comparison and assignment.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Austin Christ <austinwc@codeaurora.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
The QUP BSLP BAM sometimes generates the following error if the
current I2C DMA transfer fails and the flush operation has been
scheduled:
“bam-dma-engine 7884000.dma: Cannot free busy channel”
If any I2C error occurs during a BAM DMA transfer, the QUP I2C
interrupt will be generated and the flush operation will be
carried out to make I2C consume all scheduled DMA transfers.
Currently, the same completion structure is used for the BAM
transfer, and it has already completed, without a reinit. This makes
the flush operation's wait_for_completion_timeout complete immediately,
and the code proceeds to free the DMA resources while the
descriptors are still in process.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Acked-by: Sricharan R <sricharan@codeaurora.org>
Reviewed-by: Austin Christ <austinwc@codeaurora.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
The file has been updated from 2016 to 2018 so fixed the
copyright years.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
DHCP connectivity issues can currently occur if the following conditions
are met:
1) A DHCP packet from a client to a server
2) This packet has a multicast destination
3) This destination has a matching entry in the translation table
(FF:FF:FF:FF:FF:FF for IPv4, 33:33:00:01:00:02/33:33:00:01:00:03
for IPv6)
4) The orig-node determined by TT for the multicast destination
does not match the orig-node determined by best-gateway-selection
In this case the DHCP packet will be dropped.
The "gateway-out-of-range" check is supposed to only be applied to
unicasted DHCP packets to a specific DHCP server.
In that case dropping the unicasted frame forces the client to
retry via a broadcasted one, but now directed to the new best
gateway.
A DHCP packet with broadcast/multicast destination is already ensured to
always be delivered to the best gateway. Dropping a multicasted
DHCP packet here will only prevent completing DHCP as there is no
other fallback.
So far, it seems the unicast check was implicitly performed by
expecting the batadv_transtable_search() to return NULL for multicast
destinations. However, a multicast address could have always ended up in
the translation table and in fact is now common.
To fix this potential loss of a DHCP client-to-server packet to a
multicast address this patch adds an explicit multicast destination
check to reliably bail out of the gateway-out-of-range check for such
destinations.
The issue and fix were tested in the following three node setup:
- Line topology, A-B-C
- A: gateway client, DHCP client
- B: gateway server, hop-penalty increased: 30->60, DHCP server
- C: gateway server, code modifications to announce FF:FF:FF:FF:FF:FF
Without this patch, A would never transmit its DHCP Discover packet
due to an always "out-of-range" condition. With this patch,
a full DHCP handshake between A and B was possible again.
Fixes: be7af5cf9cae ("batman-adv: refactoring gateway handling code")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
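The explicit multicast test described above comes down to the group bit of the destination MAC address. A minimal sketch with hypothetical function names (this is not the batman-adv code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A MAC address is multicast (or broadcast) when the least significant
 * bit of its first octet is set. */
int example_is_multicast_mac(const uint8_t addr[6])
{
	assert(addr != NULL);
	return (addr[0] & 0x01) != 0;
}

/* The gateway-out-of-range check is meant for unicast destinations only. */
int example_gw_range_check_applies(const uint8_t dst[6])
{
	return !example_is_multicast_mac(dst);
}
```

Both FF:FF:FF:FF:FF:FF and 33:33:00:01:00:02 have the group bit set, so the range check is skipped for them in this sketch.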
|
|
For multicast frames AP isolation is only supposed to be checked on
the receiving nodes and never on the originating one.
Furthermore, the isolation or wifi flag bits should only be interpreted
as such for unicast and never multicast TT entries.
By injecting flags to the multicast TT entry claimed by a single
target node it was verified in tests that this multicast address
becomes unreachable, leading to packet loss.
Omitting the "src" parameter to the batadv_transtable_search() call
successfully skipped the AP isolation check and made the target
reachable again.
Fixes: 1d8ab8d3c176 ("batman-adv: Modified forwarding behaviour for multicast packets")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
|
|
Make the "clock valid" control a global control instead of a mixer
so that it doesn't appear in mixer applications.
Additionally, remove the check for writeability prohibited by spec, and
use common code to read the control value.
Tested with a UAC2 Audio device that presents a clock validity
control. The control still shows up in /proc usbmixer but not
in alsamixer.
Signed-off-by: Andrew Chant <achant@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
This implements UAC2 jack detection support, presenting
jack status as a boolean read-only mono mixer.
The presence of any channel in the UAC2_TE_CONNECTOR
control for a terminal will result in the mixer saying
the jack is connected.
Mixer naming follows the convention in sound/core/ctljack.c,
terminating the mixer with " Jack".
For additional clues as to which jack is being presented,
the name is suffixed with " - Input Jack" or " - Output Jack"
depending on if it's an input or output terminal.
This is required because terminal names are ambiguous
between inputs and outputs and often duplicated -
Bidirectional terminal types (0x400 -> 0x4FF)
"... may be used separately for input only or output only.
These types require two Terminal descriptors. Both have the same type."
(quote from "USB Device Class Definition for Terminal Types")
Since bidirectional terminal types are common for headphone adapters,
this distinguishes between two otherwise identically-named
jack controls.
Tested with a UAC2 audio device with connector control capability.
Signed-off-by: Andrew Chant <achant@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
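The status and naming rules above can be sketched as two small helpers. The names are hypothetical, the channel count parameter stands in for whatever the connector control reports, and placing the direction string after the terminal name is an assumption based on the " Jack" terminator convention mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Any reported channel means the jack is considered connected. */
int example_jack_connected(unsigned int nr_channels)
{
	return nr_channels != 0;
}

/* Builds "<terminal> - Input Jack" or "<terminal> - Output Jack".
 * Returns the length snprintf would have written, as snprintf does. */
int example_jack_name(char *out, size_t out_len, const char *terminal,
		      int is_input)
{
	assert(out != NULL && terminal != NULL && out_len > 0);
	return snprintf(out, out_len, "%s - %s Jack", terminal,
			is_input ? "Input" : "Output");
}
```

For a bidirectional terminal named "Headset" this gives two distinct control names, one per direction.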
|
|
Conflicts:
arch/x86/mm/init_64.c
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
With the cherry-picked perf/urgent commit merged separately we can now
merge all the fixes without conflicts.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Pick up a cherry-picked commit.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Use u16 in place of __be16 to suppress the following sparse warnings:
net/sched/act_vlan.c:150:26: warning: incorrect type in assignment (different base types)
net/sched/act_vlan.c:150:26: expected restricted __be16 [usertype] push_vid
net/sched/act_vlan.c:150:26: got unsigned short
net/sched/act_vlan.c:151:21: warning: restricted __be16 degrades to integer
net/sched/act_vlan.c:208:26: warning: incorrect type in assignment (different base types)
net/sched/act_vlan.c:208:26: expected unsigned short [unsigned] [usertype] tcfv_push_vid
net/sched/act_vlan.c:208:26: got restricted __be16 [usertype] push_vid
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tcf_idr_cleanup() is no longer used, so remove it.
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In net commit 8175f7c4736f ("mlxsw: spectrum: Prevent duplicate
mirrors") we prevented the user from mirroring more than once from a
single binding point (port-direction pair).
The fix was essentially reverted in a merge conflict resolution when net
was merged into net-next. Restore it.
Fixes: 03fe2debbb27 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We can only get into the branch if CRCs are enabled, so there's no
need to check inside the branch for CRCs being enabled....
Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
We recently came across a V4 filesystem causing memory corruption
due to a newly allocated inode being setup twice and being added to
the superblock inode list twice. From code inspection, the only way
this could happen is if a newly allocated inode was not marked as
free on disk (i.e. di_mode wasn't zero).
Running the metadump on an upstream debug kernel fails during inode
allocation like so:
XFS: Assertion failed: ip->i_d.di_nblocks == 0, file: fs/xfs/xfs_inode.c, line: 838
------------[ cut here ]------------
kernel BUG at fs/xfs/xfs_message.c:114!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU: 11 PID: 3496 Comm: mkdir Not tainted 4.16.0-rc5-dgc #442
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
RIP: 0010:assfail+0x28/0x30
RSP: 0018:ffffc9000236fc80 EFLAGS: 00010202
RAX: 00000000ffffffea RBX: 0000000000004000 RCX: 0000000000000000
RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffff8227211b
RBP: ffffc9000236fce8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000bec R11: f000000000000000 R12: ffffc9000236fd30
R13: ffff8805c76bab80 R14: ffff8805c77ac800 R15: ffff88083fb12e10
FS: 00007fac8cbff040(0000) GS:ffff88083fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffa6783ff8 CR3: 00000005c6e2b003 CR4: 00000000000606e0
Call Trace:
xfs_ialloc+0x383/0x570
xfs_dir_ialloc+0x6a/0x2a0
xfs_create+0x412/0x670
xfs_generic_create+0x1f7/0x2c0
? capable_wrt_inode_uidgid+0x3f/0x50
vfs_mkdir+0xfb/0x1b0
SyS_mkdir+0xcf/0xf0
do_syscall_64+0x73/0x1a0
entry_SYSCALL_64_after_hwframe+0x42/0xb7
Extracting the inode number we crashed on from an event trace and
looking at it with xfs_db:
xfs_db> inode 184452204
xfs_db> p
core.magic = 0x494e
core.mode = 0100644
core.version = 2
core.format = 2 (extents)
core.nlinkv2 = 1
core.onlink = 0
.....
Confirms that it is not a free inode on disk. xfs_repair
also trips over this inode:
.....
zero length extent (off = 0, fsbno = 0) in ino 184452204
correcting nextents for inode 184452204
bad attribute fork in inode 184452204, would clear attr fork
bad nblocks 1 for inode 184452204, would reset to 0
bad anextents 1 for inode 184452204, would reset to 0
imap claims in-use inode 184452204 is free, would correct imap
would have cleared inode 184452204
.....
disconnected inode 184452204, would move to lost+found
And so we have a situation where the directory structure and the
inobt think the inode is free, but the inode on disk thinks it is
still in use. Where this corruption came from is not possible to
diagnose, but we can detect it and prevent the kernel from oopsing
on lookup. The reproducer now results in:
$ sudo mkdir /mnt/scratch/{0,1,2,3,4,5}{0,1,2,3,4,5}
mkdir: cannot create directory ‘/mnt/scratch/00’: File exists
mkdir: cannot create directory ‘/mnt/scratch/01’: File exists
mkdir: cannot create directory ‘/mnt/scratch/03’: Structure needs cleaning
mkdir: cannot create directory ‘/mnt/scratch/04’: Input/output error
mkdir: cannot create directory ‘/mnt/scratch/05’: Input/output error
....
And this corruption shutdown:
[ 54.843517] XFS (loop0): Corruption detected! Free inode 0xafe846c not marked free on disk
[ 54.845885] XFS (loop0): Internal error xfs_trans_cancel at line 1023 of file fs/xfs/xfs_trans.c. Caller xfs_create+0x425/0x670
[ 54.848994] CPU: 10 PID: 3541 Comm: mkdir Not tainted 4.16.0-rc5-dgc #443
[ 54.850753] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 54.852859] Call Trace:
[ 54.853531] dump_stack+0x85/0xc5
[ 54.854385] xfs_trans_cancel+0x197/0x1c0
[ 54.855421] xfs_create+0x425/0x670
[ 54.856314] xfs_generic_create+0x1f7/0x2c0
[ 54.857390] ? capable_wrt_inode_uidgid+0x3f/0x50
[ 54.858586] vfs_mkdir+0xfb/0x1b0
[ 54.859458] SyS_mkdir+0xcf/0xf0
[ 54.860254] do_syscall_64+0x73/0x1a0
[ 54.861193] entry_SYSCALL_64_after_hwframe+0x42/0xb7
[ 54.862492] RIP: 0033:0x7fb73bddf547
[ 54.863358] RSP: 002b:00007ffdaa553338 EFLAGS: 00000246 ORIG_RAX: 0000000000000053
[ 54.865133] RAX: ffffffffffffffda RBX: 00007ffdaa55449a RCX: 00007fb73bddf547
[ 54.866766] RDX: 0000000000000001 RSI: 00000000000001ff RDI: 00007ffdaa55449a
[ 54.868432] RBP: 00007ffdaa55449a R08: 00000000000001ff R09: 00005623a8670dd0
[ 54.870110] R10: 00007fb73be72d5b R11: 0000000000000246 R12: 00000000000001ff
[ 54.871752] R13: 00007ffdaa5534b0 R14: 0000000000000000 R15: 00007ffdaa553500
[ 54.873429] XFS (loop0): xfs_do_force_shutdown(0x8) called from line 1024 of file fs/xfs/xfs_trans.c. Return address = ffffffff814cd050
[ 54.882790] XFS (loop0): Corruption of in-memory data detected. Shutting down filesystem
[ 54.884597] XFS (loop0): Please umount the filesystem and rectify the problem(s)
Note that this crash is only possible on v4 filesystems or v5
filesystems mounted with the ikeep mount option. For all other V5
filesystems, this problem cannot occur because we don't read inodes
we are allocating from disk - we simply overwrite them with the new
inode information.
Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Tested-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
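The detection described above can be sketched as a check on the on-disk fields of an inode that the allocator believes is free. The structure, names and return codes are hypothetical; only the two conditions (mode and block count must both be zero) come from the text and the quoted assertion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of an on-disk inode. */
struct example_disk_inode {
	uint16_t mode;		/* zero means "free on disk" */
	uint64_t nblocks;	/* blocks owned by the inode */
};

enum { EXAMPLE_OK = 0, EXAMPLE_CORRUPT = -1 };

/* Called for an inode the allocator is about to hand out as free. */
int example_check_free_inode(const struct example_disk_inode *ip)
{
	assert(ip != NULL);
	if (ip->mode != 0)
		return EXAMPLE_CORRUPT;	/* still marked in use on disk */
	if (ip->nblocks != 0)
		return EXAMPLE_CORRUPT;	/* a free inode owns no blocks */
	return EXAMPLE_OK;
}
```

Returning an error here lets the caller fail the allocation cleanly where it would otherwise set up the same inode twice.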
|
|
In xfs_scrub_iallocbt_xref_rmap_inodes we're checking inodes against
rmap records, so we should use xfs_scrub_btree_xref_set_corrupt if we
encounter discrepancies here so that we know that it's a cross
referencing error, not necessarily a corruption in the inobt itself.
The userspace xfs_scrub program will try to repair outright corruptions
in the agi/inobt prior to phase 3 so that the inode scan will proceed.
If only a cross-referencing error is noted, the repair program defers
the repair attempt until it can check the other space metadata at least
once.
It is therefore essential that the inobt scrubber can correctly
distinguish between corruptions and "unable to cross-reference something
else with this inobt". The same reasoning applies to "xfs: record inode
buf errors as a xref error in inobt scrubber".
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
If a directory's parent inode pointer doesn't point to an inode, the
directory should be flagged as corrupt. Enable IGET_UNTRUSTED here so
that _iget will return -EINVAL if the inobt does not confirm that the
inode is present and allocated and we can flag the directory corruption.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
When we're verifying inode buffers, sanity-check the unlinked pointer.
We don't want to run the risk of trying to purge something that's
obviously broken.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
Extent size hint validation is used by scrub to decide if there's an
error, and it will be used by repair to decide to remove the hint.
Since these use the same validation functions, move them to libxfs.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
During the inode btree scrubs we try to confirm the freemask bits
against the inode records. If the inode buffer read fails, this is a
cross-referencing error, not a corruption of the inode btree itself.
Use the xref_process_error call here. Found via core.version middlebit
fuzz in xfs/415.
The userspace xfs_scrub program will try to repair outright corruptions
in the agi/inobt prior to phase 3 so that the inode scan will proceed.
If only a cross-referencing error is noted, the repair program defers
the repair attempt until it can check the other space metadata at least
once.
It is therefore essential that the inobt scrubber can correctly
distinguish between corruptions and "unable to cross-reference something
else with this inobt".
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
Now that we no longer do raw inode buffer scrubbing, the bp parameter is
no longer used anywhere we're dealing with an inode, so remove it and
all the useless NULL parameters that go with it.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
The inode scrubber tries to _iget the inode prior to running checks.
If that _iget call fails with corruption errors that's an automatic
fail, regardless of whether it was the inode buffer read verifier,
the ifork verifier, or the ifork formatter that errored out.
Therefore, get rid of the raw mode scrub code because it's not needed.
Found by trying to fix some test failures in xfs/379 and xfs/415.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
When we're scanning an extent mapping inode fork, ensure that every rmap
record for this ifork has a corresponding bmbt record too. This
(mostly) provides the ability to cross-reference rmap records with bmap
data. The rmap scrubber cannot do the xref on its own because that
requires taking an ilock with the agf lock held, which violates our
locking order rules (inode, then agf).
Note that we only do this for forks that are in btree format due to the
increased complexity; or forks that should have data but suspiciously
have zero extents because the inode could have just had its iforks
zapped by the inode repair code and now we need to reclaim the old
extents.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
When the inode buffer verifier encounters an error, it's much more
helpful to print a buffer from the offending inode instead of just the
start of the inode chunk buffer.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
Refactor some of the inode verifier failure logging call sites to use
the new xfs_inode_verifier_error method which dumps the offending buffer
as well as the code location of the failed check. This trims the
output, makes it clearer to the admin that repair must be run, and gives
the developers more details to work from.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
Refactor the bmap validator into a more complete helper that looks for
extents that run off the end of the device, overflow into the next AG,
or have invalid flag states.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
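The first two checks named above can be sketched with overflow-safe arithmetic. This is a simplified model under assumptions: blocks are numbered linearly across the device and every allocation group has the same size, which is not how XFS encodes block numbers. The flag-state check is omitted and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 when the extent [start, start + len) is non-empty, lies within
 * the device, and stays inside a single allocation group. */
int example_extent_valid(uint64_t start, uint64_t len,
			 uint64_t dev_blocks, uint64_t ag_blocks)
{
	assert(ag_blocks > 0);
	if (len == 0)
		return 0;
	if (start >= dev_blocks)
		return 0;		/* starts past the end of the device */
	if (len > dev_blocks - start)
		return 0;		/* runs off the end, no overflow here */
	/* start + len cannot overflow after the check above. */
	if (start / ag_blocks != (start + len - 1) / ag_blocks)
		return 0;		/* spills into the next AG */
	return 1;
}
```

Comparing `len` against `dev_blocks - start` avoids computing `start + len` before it is known to be in range.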
|