Age | Commit message | Author
2021-08-25 | KVM: s390: generate kvm hypercall functions | Heiko Carstens
Generate kvm hypercall functions with a macro instead of duplicating more or less identical code seven times. This also reduces the number of lines of code. However, the main purpose is to get rid of as many open-coded, error-prone register asm constructs in s390 architecture code as possible. For the only user of kvm_hypercall, identical code is generated before and after this patch (drivers/s390/virtio/virtio_ccw.c). Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Link: https://lore.kernel.org/r/20210713145713.2815167-1-hca@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
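The general technique of generating a family of near-identical functions from one macro can be sketched in portable C. This is only an illustration: the names `backend`, `demo_call0` to `demo_call2`, `PARM_n`, `ARGS_n` and `GENERATE_CALL` are hypothetical, and the stand-in backend simply sums its inputs instead of issuing a real hypercall.

```c
/* Hypothetical stand-in for the real hypercall instruction: sums its inputs. */
unsigned long backend(const unsigned long *args, int count)
{
	unsigned long sum = 0;

	for (int i = 0; i < count; i++)
		sum += args[i];
	return sum;
}

/* Parameter lists and argument lists for each arity, built incrementally. */
#define PARM_0
#define PARM_1 , unsigned long p1
#define PARM_2 PARM_1, unsigned long p2
#define ARGS_0
#define ARGS_1 , p1
#define ARGS_2 ARGS_1, p2

/* One macro expands to a complete function definition for arity n. */
#define GENERATE_CALL(n)						\
unsigned long demo_call##n(unsigned long nr PARM_##n)			\
{									\
	unsigned long args[] = { nr ARGS_##n };				\
	return backend(args, (int)(sizeof(args) / sizeof(args[0])));	\
}

GENERATE_CALL(0)
GENERATE_CALL(1)
GENERATE_CALL(2)
```

Each `GENERATE_CALL(n)` line replaces what would otherwise be a hand-written copy of the same function body, so a fix to the body only has to be made once.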
2021-08-25 | s390/sclp: add tracing of SCLP interactions | Peter Oberparleiter
Add tracing of interactions between the SCLP base driver, firmware and other drivers to support problem determination in case of SCLP-related issues. For that purpose this patch introduces two new s390dbf debug areas:
- sclp: An abbreviated log of all common interactions
- sclp_err: A full log of failed or abnormal interactions

Tracing of full SCCB contents can be enabled for the sclp area by setting its debug level to maximum (6).

Overview of added trace events:
* Firmware interaction:
  - SRV1: Service call about to be issued
  - SRV2: Service call was issued
  - INT: Interrupt received
* Driver interaction:
  - RQAD: Request was added
  - RQOK: Request success
  - RQAB: Request aborted
  - RQTM: Request timed out
  - REG: Event listener registered
  - UREG: Event listener unregistered
  - EVNT: Event callback
  - STCG: State-change callback
* Abnormal events:
  - TMO: A timeout occurred
  - UNEX: Unexpected SCCB completion
* Other (not traced at default level):
  - SYN1: Synchronous wait start
  - SYN2: Synchronous wait end

Since the SCLP interface is used by console drivers this patch also moves s390dbf printks outside the critical section protected by debug area locks to prevent a potential deadlock that would otherwise be introduced between console_owner --> sclp_lock --> sclp_debug.lock.
Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 | s390/debug: add early tracing support | Peter Oberparleiter
Debug areas can currently only be used after s390dbf initialization which occurs as a postcore_initcall. This is too late for tracing earlier code such as that related to console_init(). This patch introduces a macro for defining a statically initialized debug area that can be used to trace very early code. The macro is made available for built-in code only because modules are never running during early boot.

Example usage:
1. Define static debug area:
     DEFINE_STATIC_DEBUG_INFO(my_debug, "my_debug", 4, 1, 16, &debug_hex_ascii_view);
2. Add trace entry:
     debug_event(&my_debug, 0, "DATA", 4);

Note: The debug area is automatically registered in debugfs during boot. A driver must not call any of the debug_register()/_unregister() functions on a static debug_info_t!
Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 | s390/debug: fix debug area life cycle | Peter Oberparleiter
Currently allocation and registration of s390dbf debug areas are tied together. As a result, a debug area cannot be unregistered and re-registered while any process has an associated debugfs file open. Fix this by splitting alloc/release from register/unregister. Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 | s390/debug: keep debug data on resize | Peter Oberparleiter
Any previously recorded s390dbf debug data is reset when a debug area is resized using the 'pages' sysfs attribute. This can make live-debugging unnecessarily complex. Fix this by copying existing debug data to the newly allocated debug area when resizing. Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 | s390/diag: make restart_part2 a local label | Heiko Carstens
Avoid that the "restart_part2" label, which is in the middle of a function, appears in /proc/kallsyms. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 | s390/mm,pageattr: fix walk_pte_level() early exit | Heiko Carstens
In case of splitting to 4k mapping the early exit in walk_pte_level() must only be taken iff flags is equal to SET_MEMORY_4K. Currently the early exit is taken if the flag is set, and also others might be set. This may lead to the situation that a mapping is split but other changes are not done, like e.g. setting pages to R/W. There is currently no such caller, but there might be in the future. Fixes: b3e1a00c8fa4 ("s390/mm: implement set_memory_4k()") Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
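The difference between the two early-exit conditions can be sketched in plain C. The flag values below are assumptions chosen for illustration, not the kernel's definitions, and `early_exit_old`/`early_exit_new` are hypothetical helper names; only the comparison itself mirrors the description above.

```c
#include <stdbool.h>

/* Hypothetical flag values, for illustration only. */
#define SET_MEMORY_RO (1UL << 0)
#define SET_MEMORY_RW (1UL << 1)
#define SET_MEMORY_4K (1UL << 4)

/* Old check: exits early whenever the 4K flag is set, even with other flags. */
bool early_exit_old(unsigned long flags)
{
	return (flags & SET_MEMORY_4K) != 0;
}

/* New check: exits early only if splitting to 4K is the sole request. */
bool early_exit_new(unsigned long flags)
{
	return flags == SET_MEMORY_4K;
}
```

With `SET_MEMORY_4K | SET_MEMORY_RW`, the old check would return early and skip the R/W change, while the new check lets the walk continue.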
2021-08-25 | s390: fix typo in linker script | Heiko Carstens
Rename amod31 to amode31 like it was supposed to be. Fixes: c78d0c7484f0 ("s390: rename dma section to amode31") Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 | s390: remove do_signal() prototype and do_notify_resume() function | Sven Schnelle
Both are no longer used since the conversion to generic entry, therefore remove them. Fixes: 56e62a737028 ("s390: convert to generic entry") Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 | s390/crypto: fix all kernel-doc warnings in vfio_ap_ops.c | Randy Dunlap
The 0day bot reported some kernel-doc warnings in this file so clean up all of the kernel-doc and use proper kernel-doc formatting. There are no more kernel-doc errors or warnings reported in this file. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Tony Krowiak <akrowiak@linux.ibm.com> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Jason Herne <jjherne@linux.ibm.com> Cc: Harald Freudenberger <freude@linux.ibm.com> Cc: linux-s390@vger.kernel.org Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Link: https://lore.kernel.org/r/20210806050149.9614-1-rdunlap@infradead.org Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 | s390/pci: improve DMA translation init and exit | Niklas Schnelle
Currently zpci_dma_init_device()/zpci_dma_exit_device() is called as part of zpci_enable_device()/zpci_disable_device() and errors for zpci_dma_exit_device() are always ignored even if we could abort. Improve upon this by moving zpci_dma_exit_device() out of zpci_disable_device() and check for errors whenever we have a way to abort the current operation. Note that for example in zpci_event_hard_deconfigured() the device is expected to be gone so we really can't abort and proceed even in case of error. Similarly move the cc == 3 special case out of zpci_unregister_ioat() and into the callers, allowing them to abort when finding an already disabled device precludes proceeding with the operation. While we are at it log IOAT register/unregister errors in the s390 debugfs log. Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 | s390/pci: simplify CLP List PCI handling | Niklas Schnelle
Currently clp_get_state() and clp_refresh_fh() awkwardly use the clp_list_pci() callback mechanism to find the entry for a specific FID and update its zdev, respectively return its state. This is both needlessly complex and means we are always going through the entire PCI function list even if the FID has already been found. Instead let's introduce a clp_find_pci() function to find a specific entry and share the CLP List PCI request handling code with clp_list_pci(). With that in place we can also easily make the function handle a simple out parameter instead of directly altering the zdev, allowing easier access to the updated function handle by the caller. Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 | s390/pci: handle FH state mismatch only on disable | Niklas Schnelle
Instead of always treating CLP_RC_SETPCIFN_ALRDY as success and blindly updating the function handle restrict this special handling to the disable case by moving it into zpci_disable_device() and still treating it as an error while also updating the function handle such that a subsequent zpci_disable_device() succeeds or the caller can ignore the error when aborting is not an option such as for zPCI event 0x304. Also print this occurrence to the log such that an admin can tell why a disable operation returned an error. A mismatch between the state of the underlying device and our view of it can naturally happen when the device suddenly enters the error state but we haven't gotten the error notification yet, it must not happen on enable though. Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 | s390/pci: fix misleading rc in clp_set_pci_fn() | Niklas Schnelle
Currently clp_set_pci_fn() always returns 0 as long as the CLP request itself succeeds even if the operation itself returns a response code other than CLP_RC_OK or CLP_RC_SETPCIFN_ALRDY. This is highly misleading because calling code assumes that a zero rc means that the operation was successful. Fix this by returning the response code or cc on failure with the exception of the special handling for CLP_RC_SETPCIFN_ALRDY. Also let's not assume that the returned function handle for CLP_RC_SETPCIFN_ALRDY is 0, we don't need it anyway. Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 | s390/boot: factor out offset_vmlinux_info() function | Alexander Gordeev
Move offsetting all of vmlinux_info fields to a separate function for better readability. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 | s390/kasan: fix large PMD pages address alignment check | Alexander Gordeev
It is currently possible to initialize a large PMD page when the address is not aligned on page boundary. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 | s390/zcrypt: remove gratuitious NULL check in .remove() callbacks | Julian Wiedmann
As .remove() is only called after a successful .probe() call, we can trust that the drvdata is valid. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 | s390/ap: use the common driver-data pointer | Julian Wiedmann
The device struct provides a pointer for driver-private data. Use this in the zcrypt drivers (as vfio_ap already does), and then remove the custom pointer from the AP device structs. As really_probe() will always clear the drvdata pointer on error, we no longer have to do so ourselves. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 | s390/ap: use the common device_driver pointer | Julian Wiedmann
The device struct itself already contains a pointer to its driver. Use this consistently, instead of duplicating it. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 | s390/pci: reset zdev->zbus on registration failure | Niklas Schnelle
On failure to register a struct zpci_dev with a struct zpci_bus we left a dangling pointer in zdev->zbus. As zpci_create_device() bails if zpci_bus_device_register() fails this is of no consequence but still bad practice. Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 | s390/pci: cleanup resources only if necessary | Niklas Schnelle
It's currently safe to call zpci_cleanup_bus_resources() even if the resources were never created but it makes no sense so check zdev->has_resources before we call zpci_cleanup_bus_resources() in zpci_release_device(). Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Acked-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 | Revert "USB: serial: ch341: fix character loss at high transfer rates" | Johan Hovold
This reverts commit 3c18e9baee0ef97510dcda78c82285f52626764b. These devices do not appear to send a zero-length packet when the transfer size is a multiple of the bulk-endpoint max-packet size. This means that incoming data may not be processed by the driver until a short packet is received or the receive buffer is full. Revert back to using endpoint-sized receive buffers to avoid stalled reads. Reported-by: Paul Größel <pb.g@gmx.de> Link: https://bugzilla.kernel.org/show_bug.cgi?id=214131 Fixes: 3c18e9baee0e ("USB: serial: ch341: fix character loss at high transfer rates") Cc: stable@vger.kernel.org Cc: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20210824121926.19311-1-johan@kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>
2021-08-25 | can: mscan: mpc5xxx_can: mpc5xxx_can_probe(): remove useless BUG_ON() | Tang Bin
In the function mpc5xxx_can_probe(), the variable 'data' has already been determined in the above code, so the BUG_ON() in this place is useless, remove it. Link: https://lore.kernel.org/r/20210823141033.17876-1-tangbin@cmss.chinamobile.com Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-08-25 | can: mscan: mpc5xxx_can: mpc5xxx_can_probe(): use of_device_get_match_data to simplify code | Tang Bin
To retrieve OF match data, it's better and cleaner to use 'of_device_get_match_data' over 'of_match_device'. Link: https://lore.kernel.org/r/20210823113338.3568-4-tangbin@cmss.chinamobile.com Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-08-25 | can: rcar_canfd: rcar_canfd_handle_channel_tx(): fix redundant assignment | Lad Prabhakar
Fix redundant assignment of 'priv' to itself in rcar_canfd_handle_channel_tx(). Fixes: 76e9353a80e9 ("can: rcar_canfd: Add support for RZ/G2L family") Link: https://lore.kernel.org/r/20210820161449.18169-1-prabhakar.mahadev-lad.rj@bp.renesas.com Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-08-25 | can: rcar: Kconfig: Add helper dependency on COMPILE_TEST | Cai Huoqing
This helps with compile testing on other platforms (e.g. x86). Link: https://lore.kernel.org/r/20210825062341.2332-1-caihuoqing@baidu.com Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-08-24 | Merge pull request #69 from namjaejeon/cifsd-for-next | Steve French
ksmbd-fixes
2021-08-25 | MAINTAINERS: ksmbd: add cifs_common directory to ksmbd entry | Namjae Jeon
The code that is shared between cifs and ksmbd will move into the cifs_common directory. This patch adds it to the ksmbd entry in the MAINTAINERS file. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-08-25 | MAINTAINERS: ksmbd: update my email address | Namjae Jeon
My email address in the ksmbd entry will no longer be available in a few days. Update it to my own kernel.org address. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-08-24 | riscv: dts: microchip: Add ethernet0 to the aliases node | Bin Meng
U-Boot expects this alias to be in place in order to fix up the mac address of the ethernet node. Note on the Icicle Kit board, currently only emac1 is enabled so it becomes the 'ethernet0'. Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-24 | riscv: dts: microchip: Use 'local-mac-address' for emac1 | Bin Meng
Per the DT spec, 'local-mac-address' is used to specify MAC address that was assigned to the network device, while 'mac-address' is used to specify the MAC address that was last used by the boot program, and shall be used only if the value differs from 'local-mac-address' property value. Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: conor dooley <conor.dooley@microchip.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-24 | riscv: Ensure the value of FP registers in the core dump file is up to date | Vincent Chen
The value of FP registers in the core dump file comes from thread.fstate. However, the kernel saves the FP registers to thread.fstate only before scheduling out the process. If no process switch happens during exception handling, the kernel has no chance to save the latest FP register values to thread.fstate, so the FP register values in the core dump file may be incorrect. To solve this problem, this patch forces the kernel to save the FP registers into thread.fstate if the target task_struct is the current task. Signed-off-by: Vincent Chen <vincent.chen@sifive.com> Reviewed-by: Jisheng Zhang <jszhang@kernel.org> Fixes: b8c8a9590e4f ("RISC-V: Add FP register ptrace support for gdb.") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-24 | scsi: core: Fix hang of freezing queue between blocking and running device | Li Jinlin
We found a hang. The steps to reproduce are as follows:
1. blocking device via scsi_device_set_state()
2. dd if=/dev/sda of=/mnt/t.log bs=1M count=10
3. echo none > /sys/block/sda/queue/scheduler
4. echo "running" >/sys/block/sda/device/state
Step 3 and 4 should complete after step 4, but they hang.

Sequence of events across CPU#0, CPU#1 and CPU#2:

Step 1: blocking device

Step 2: dd xxxx
  ^^^^^^ get request q_usage_counter++

Step 3: switching scheduler
  elv_iosched_store
    elevator_switch
      blk_mq_freeze_queue
        blk_freeze_queue
          > blk_freeze_queue_start
            ^^^^^^ mq_freeze_depth++
          > blk_mq_run_hw_queues
            ^^^^^^ can't run queue when dev blocked
          > blk_mq_freeze_queue_wait
            ^^^^^^ Hang here!!! wait q_usage_counter==0

Step 4: running device
  store_state_field
    scsi_rescan_device
      scsi_attach_vpd
        scsi_vpd_inquiry
          __scsi_execute
            blk_get_request
              blk_mq_alloc_request
                blk_queue_enter
                  ^^^^^^ Hang here!!! wait mq_freeze_depth==0
    blk_mq_run_hw_queues
      ^^^^^^ dispatch IO, q_usage_counter will reduce to zero

blk_mq_unfreeze_queue
  ^^^^^ mq_freeze_depth--

To fix this, we need to run queue before rescanning device when the device state changes to SDEV_RUNNING.
Link: https://lore.kernel.org/r/20210824025921.3277629-1-lijinlin3@huawei.com Fixes: f0f82e2476f6 ("scsi: core: Fix capacity set to zero after offlinining device") Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Li Jinlin <lijinlin3@huawei.com> Signed-off-by: Qiu Laibin <qiulaibin@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24 | xfs: fix I_DONTCACHE | Dave Chinner
Yup, the VFS hoist broke it, and nobody noticed. Bulkstat workloads make it clear that it doesn't work as it should. Fixes: dae2f8ed7992 ("fs: Lift XFS_IDONTCACHE to the VFS layer") Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-24 | net: phy: mediatek: add the missing suspend/resume callbacks | DENG Qingfang
Without suspend/resume callbacks, the PHY cannot be powered down/up administratively. Fixes: e40d2cca0189 ("net: phy: add MediaTek Gigabit Ethernet PHY driver") Signed-off-by: DENG Qingfang <dqfext@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20210823044422.164184-1-dqfext@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-24 | net: bridge: change return type of br_handle_ingress_vlan_tunnel | Kangmin Park
br_handle_ingress_vlan_tunnel() is only referenced in br_handle_frame(), which jumps to drop if the call returns a non-zero value. However, br_handle_ingress_vlan_tunnel() always returns 0, so checking the return value and jumping to drop has no meaning. Therefore, change the return type of br_handle_ingress_vlan_tunnel() to void and remove the if statement in br_handle_frame(). Signed-off-by: Kangmin Park <l4stpr0gr4m@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Link: https://lore.kernel.org/r/20210823102118.17966-1-l4stpr0gr4m@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-24 | selftests/net: Use kselftest skip code for skipped tests | Po-Hsu Lin
Several test cases in the net directory still use exit 0 or exit 1 when they need to be skipped. Use the kselftest framework skip code instead so that the return status can be distinguished.

Criterion to filter out what should be fixed in the net directory:
  grep -r "exit [01]" -B1 | grep -i skip

This change might cause some false positives if people run these test scripts directly and only check their return codes, which will change from 0 to 4. However, I think the impact should be small as most of our scripts here already use this skip code. There will be no such issue when running them with the kselftest framework.
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20210823085854.40216-1-po-hsu.lin@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
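The exit-code convention can be sketched in C. The patch itself touches shell scripts, so this is only an analogy: `test_exit_code` is a hypothetical helper, and the constants mirror the kselftest values, with 4 being the skip code mentioned above.

```c
#include <stdbool.h>

/* kselftest-style exit codes; 4 is the skip code referred to in the message. */
#define KSFT_PASS 0
#define KSFT_FAIL 1
#define KSFT_SKIP 4

/* Hypothetical helper: pick the exit status for a test run. */
int test_exit_code(bool prerequisites_met, bool passed)
{
	if (!prerequisites_met)
		return KSFT_SKIP;	/* formerly indistinguishable from pass/fail */
	return passed ? KSFT_PASS : KSFT_FAIL;
}
```

A caller that only checks for a zero status will now see a skipped test as non-zero, which is the behavior change the message warns about.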
2021-08-24 | audit: move put_tree() to avoid trim_trees refcount underflow and UAF | Richard Guy Briggs
AUDIT_TRIM is expected to be idempotent, but multiple executions resulted in a refcount underflow and use-after-free. git bisect fingered commit fb041bb7c0a9 ("locking/refcount: Consolidate implementations of refcount_t") but this patch with its more thorough checking that wasn't in the x86 assembly code merely exposed a previously existing tree refcount imbalance in the case of tree trimming code that was refactored with prune_one() to remove a tree introduced in commit 8432c7006297 ("audit: Simplify locking around untag_chunk()") Move the put_tree() to cover only the prune_one() case. Passes audit-testsuite and 3 passes of "auditctl -t" with at least one directory watch. Cc: Jan Kara <jack@suse.cz> Cc: Will Deacon <will@kernel.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Seiji Nishikawa <snishika@redhat.com> Cc: stable@vger.kernel.org Fixes: 8432c7006297 ("audit: Simplify locking around untag_chunk()") Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> [PM: reformatted/cleaned-up the commit description] Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-08-24 | mq-deadline: Fix request accounting | Bart Van Assche
The block layer may call the I/O scheduler .finish_request() callback without having called the .insert_requests() callback. Make sure that the mq-deadline I/O statistics are correct if the block layer inserts an I/O request that bypasses the I/O scheduler. This patch prevents that lower priority I/O is delayed longer than necessary for mixed I/O priority workloads. Cc: Niklas Cassel <Niklas.Cassel@wdc.com> Cc: Damien Le Moal <damien.lemoal@wdc.com> Cc: Hannes Reinecke <hare@suse.de> Reported-by: Niklas Cassel <Niklas.Cassel@wdc.com> Fixes: 08a9ad8bf607 ("block/mq-deadline: Add cgroup support") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20210824170520.1659173-1-bvanassche@acm.org Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Tested-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-24 | arm64: kdump: Remove custom linux,usable-memory-range handling | Geert Uytterhoeven
Remove the architecture-specific code for handling the "linux,usable-memory-range" property under the "/chosen" node in DT, as the platform-agnostic FDT core code already takes care of this. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/7356c531c49a24b4a55577bf8e46d93f4d8ae460.1628670468.git.geert+renesas@glider.be
2021-08-24 | arm64: kdump: Remove custom linux,elfcorehdr handling | Geert Uytterhoeven
Remove the architecture-specific code for handling the "linux,elfcorehdr" property under the "/chosen" node in DT, as the platform-agnostic handling in the FDT core code already takes care of this. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/3b8f801f9b92066855e87f3079fafc153ab20f69.1628670468.git.geert+renesas@glider.be
2021-08-24 | riscv: Remove non-standard linux,elfcorehdr handling | Geert Uytterhoeven
RISC-V uses platform-specific code to locate the elf core header in memory. However, this does not conform to the standard "linux,elfcorehdr" DT bindings, as it relies on a reserved memory node with the "linux,elfcorehdr" compatible value, instead of on a "linux,elfcorehdr" property under the "/chosen" node. The non-compliant code can just be removed, as the standard behavior is already implemented by platform-agnostic handling in the FDT core code. Fixes: 5640975003d0234d ("RISC-V: Add crash kernel support") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/41c75d6ee3114ae6304f8afe0051895af91200ee.1628670468.git.geert+renesas@glider.be
2021-08-24 | of: fdt: Use IS_ENABLED(CONFIG_BLK_DEV_INITRD) instead of #ifdef | Geert Uytterhoeven
Replace the conditional compilation using "#ifdef CONFIG_BLK_DEV_INITRD" by a check for "IS_ENABLED(CONFIG_BLK_DEV_INITRD)", to increase compile coverage and to simplify the code. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/604c13747f09d800da6a7c12f661e1ec146f1dfd.1628670468.git.geert+renesas@glider.be
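Why an IS_ENABLED() check increases compile coverage can be sketched in user-space C. The macro below is a simplified stand-in written for this illustration, not the kernel's definition (the real one also accounts for options built as modules); the `SKETCH_*` helpers and the two functions are hypothetical names.

```c
/* Simplified stand-in for IS_ENABLED(): yields 1 if the option is defined
 * to 1, and 0 if it is not defined at all. */
#define SKETCH_PLACEHOLDER_1 0,
#define SKETCH_SECOND(ignored, val, ...) val
#define SKETCH_IS_DEFINED(x) SKETCH_IS_DEFINED_(x)
#define SKETCH_IS_DEFINED_(val) SKETCH_IS_DEFINED__(SKETCH_PLACEHOLDER_##val)
#define SKETCH_IS_DEFINED__(arg1_or_junk) SKETCH_SECOND(arg1_or_junk 1, 0, 0)
#define IS_ENABLED(option) SKETCH_IS_DEFINED(option)

#define CONFIG_BLK_DEV_INITRD 1	/* pretend this option is enabled */
/* CONFIG_CRASH_DUMP is intentionally left undefined. */

/* Both branches are always parsed and type-checked by the compiler;
 * the dead one is removed by constant folding, not by the preprocessor. */
unsigned long initrd_size_or_zero(unsigned long size)
{
	if (!IS_ENABLED(CONFIG_BLK_DEV_INITRD))
		return 0;
	return size;
}

int crash_dump_enabled(void)
{
	return IS_ENABLED(CONFIG_CRASH_DUMP);
}
```

With `#ifdef`, code inside a disabled block is never seen by the compiler, so build breakage there goes unnoticed until someone enables the option.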
2021-08-24 | of: fdt: Add generic support for handling usable memory range property | Geert Uytterhoeven
Add support for handling the "linux,usable-memory-range" property in the "/chosen" node to the FDT core code. This can co-exist safely with the architecture-specific handling, until the latter has been removed. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/3bd69bada93ee59b7d23c38b3527fc1654e19343.1628670468.git.geert+renesas@glider.be
2021-08-24 | of: fdt: Add generic support for handling elf core headers property | Geert Uytterhoeven
There are two methods to specify the location of the elf core headers: using the "elfcorehdr=" kernel parameter, as handled by generic code in kernel/crash_dump.c, or using the "linux,elfcorehdr" property under the "/chosen" node in the Device Tree, as handled by architecture-specific code in arch/arm64/mm/init.c. Extend support for "linux,elfcorehdr" to all platforms supporting DT by adding platform-agnostic handling for handling this property to the FDT core code. This can co-exist safely with the architecture-specific handling, until the latter has been removed. This requires moving the call to of_scan_flat_dt() up, as the code scanning the "/chosen" node now needs to be aware of the values of "#address-cells" and "#size-cells". Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/c7e46e50aaf87ef49bdaa61358d25b122f32b7df.1628670468.git.geert+renesas@glider.be
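The dependency on "#address-cells" and "#size-cells" can be illustrated with a small decoder. This is a sketch under assumptions: it treats the property as one address followed by one size, each made of big-endian 32-bit cells, and the function names are hypothetical rather than taken from the FDT core code.

```c
#include <stddef.h>
#include <stdint.h>

/* Read one big-endian 32-bit cell from raw property bytes. */
uint32_t read_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Combine ncells consecutive cells (1 or 2) into one 64-bit number. */
uint64_t read_cells(const uint8_t *p, int ncells)
{
	uint64_t value = 0;

	for (int i = 0; i < ncells; i++)
		value = (value << 32) | read_be32(p + 4 * i);
	return value;
}

/* Decode an <address size> pair; returns 0 on success, -1 on bad input. */
int parse_addr_size(const uint8_t *prop, size_t len, int addr_cells,
		    int size_cells, uint64_t *addr, uint64_t *size)
{
	if (addr_cells < 1 || addr_cells > 2 || size_cells < 1 || size_cells > 2)
		return -1;
	if (len < 4 * (size_t)(addr_cells + size_cells))
		return -1;
	*addr = read_cells(prop, addr_cells);
	*size = read_cells(prop + 4 * addr_cells, size_cells);
	return 0;
}
```

Without knowing the two cell counts, the same bytes could be split at the wrong offset, which is why the scan that provides them has to run before the "/chosen" node is parsed.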
2021-08-24 | crash_dump: Make elfcorehdr address/size symbols always visible | Geert Uytterhoeven
Make the forward declarations of elfcorehdr_addr and elfcorehdr_size, and the definitions of ELFCORE_ADDR_MAX and ELFCORE_ADDR_ERR always available, like is done for phys_initrd_start and phys_initrd_size. Code referring to these symbols can then just check for IS_ENABLED(CONFIG_CRASH_DUMP), instead of requiring conditional compilation using an #ifdef, thus preparing to increase compile coverage. Suggested-by: Rob Herring <robh+dt@kernel.org> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/ba965ca613c0cc82c1ec2fe353ee34fb13b36474.1628670468.git.geert+renesas@glider.be
2021-08-24 | dt-bindings: memory: convert Samsung Exynos DMC to dtschema | Krzysztof Kozlowski
Convert Samsung Exynos5422 SoC frequency and voltage scaling for Dynamic Memory Controller to DT schema format using json-schema. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Acked-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20210820150353.161161-3-krzysztof.kozlowski@canonical.com Signed-off-by: Rob Herring <robh@kernel.org>
2021-08-24 | dt-bindings: devfreq: event: convert Samsung Exynos PPMU to dtschema | Krzysztof Kozlowski
Convert Samsung Exynos PPMU bindings to DT schema format using json-schema. The example is quite different due to the nature of dtschema examples parsing (no overriding via-label allowed). New bindings contain copied description from previous bindings document, therefore the license is set as GPL-2.0-only. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20210820150353.161161-2-krzysztof.kozlowski@canonical.com Signed-off-by: Rob Herring <robh@kernel.org>
2021-08-24 | Merge branch 'Improve XDP samples usability and output' | Alexei Starovoitov
Kumar Kartikeya says:

====================
This set revamps XDP samples related to redirection to show better output and implement missing features, consolidating all their differences and giving them a consistent look and feel by implementing common features and command line options. Some of the TODO items like reporting redirect error numbers (ENETDOWN, EINVAL, ENOSPC, etc.) have also been implemented.

Some of the features are:
* Received packet statistics
* xdp_redirect/xdp_redirect_map tracepoint statistics
* xdp_redirect_err/xdp_redirect_map_err tracepoint statistics (with support for showing exact errno)
* xdp_cpumap_enqueue/xdp_cpumap_kthread tracepoint statistics
* xdp_devmap_xmit tracepoint statistics
* xdp_exception tracepoint statistics
* Per ifindex pair devmap_xmit stats shown dynamically (for xdp_monitor) to decompose the total.
* Use of BPF skeleton and BPF static linking to share BPF programs.
* Use of vmlinux.h and tp_btf for raw_tracepoint support.
* Removal of redundant -N/--native-mode option (enforced by default now)
* ... and massive cleanups all over the place.

All tracepoints also use raw_tp now, and tracepoints like xdp_redirect are only enabled when requested explicitly to capture successful redirection statistics.

The set of programs converted as part of this series are:
* xdp_redirect_cpu
* xdp_redirect_map_multi
* xdp_redirect_map
* xdp_redirect
* xdp_monitor

Explanation of the output:

There is now a concise output mode by default that shows primarily four fields:
  rx/s        Number of packets received per second
  redir/s     Number of packets successfully redirected per second
  err,drop/s  Aggregated count of errors per second (including dropped packets)
  xmit/s      Number of packets transmitted on the output device per second

Some examples:
  ; sudo ./xdp_redirect_map veth0 veth1 -s
  Redirecting from veth0 (ifindex 15; driver veth) to veth1 (ifindex 14; driver veth)
  veth0->veth1  0 rx/s  0 redir/s  0 err,drop/s  0 xmit/s
  veth0->veth1  9,998,660 rx/s  9,998,658 redir/s  0 err,drop/s  9,998,654 xmit/s
  ...

There is also a verbose mode, which can be enabled by default using -v (--verbose). The output mode can be switched dynamically at runtime using Ctrl + \ (SIGQUIT).

To make the concise output more useful, the errors that occur are expanded inline (as if verbose mode was enabled) to let the user pin down the source of the problem without having to clutter output (or possibly miss it) or always use verbose mode.

For instance, let's consider a case where the output device link state is set to down while redirection is happening:
  [...]
  veth0->veth1  24,503,376 rx/s  0 err,drop/s  24,503,372 xmit/s
  veth0->veth1  25,044,775 rx/s  0 err,drop/s  25,044,783 xmit/s
  veth0->veth1  25,263,046 rx/s  4 err,drop/s  25,263,028 xmit/s
    redirect_err  4 error/s
      ENETDOWN  4 error/s
  [...]

The same holds for xdp_exception actions.

An example of how a complete xdp_redirect_map session would look:
  ; sudo ./xdp_redirect_map veth0 veth1
  Redirecting from veth0 (ifindex 5; driver veth) to veth1 (ifindex 4; driver veth)
  veth0->veth1  7,411,506 rx/s  0 err,drop/s  7,411,470 xmit/s
  veth0->veth1  8,931,770 rx/s  0 err,drop/s  8,931,771 xmit/s
  ^\
  veth0->veth1  8,787,295 rx/s  0 err,drop/s  8,787,325 xmit/s
    receive total  8,787,295 pkt/s  0 drop/s  0 error/s
      cpu:7        8,787,295 pkt/s  0 drop/s  0 error/s
    redirect_err   0 error/s
    xdp_exception  0 hit/s
    xmit veth0->veth1  8,787,325 xmit/s  0 drop/s  0 drv_err/s  2.00 bulk-avg
      cpu:7            8,787,325 xmit/s  0 drop/s  0 drv_err/s  2.00 bulk-avg
  veth0->veth1  8,842,610 rx/s  0 err,drop/s  8,842,606 xmit/s
    receive total  8,842,610 pkt/s  0 drop/s  0 error/s
      cpu:7        8,842,610 pkt/s  0 drop/s  0 error/s
    redirect_err   0 error/s
    xdp_exception  0 hit/s
    xmit veth0->veth1  8,842,606 xmit/s  0 drop/s  0 drv_err/s  2.00 bulk-avg
      cpu:7            8,842,606 xmit/s  0 drop/s  0 drv_err/s  2.00 bulk-avg
  ^C
    Packets received    : 33,973,181
    Average packets/s   : 4,246,648
    Packets transmitted : 33,973,172
    Average transmit/s  : 4,246,647

The xdp_redirect tracepoint (for success stats) needs to be enabled explicitly using --stats/-s. Documentation for entire output and options is provided when user specifies --help/-h with a sample.

Changelog:
----------
v3 -> v4:
v3: https://lore.kernel.org/bpf/20210728165552.435050-1-memxor@gmail.com
* Address all feedback from Daniel
* Use READ_ONCE/WRITE_ONCE from linux/compiler.h (cannot directly include due to conflicts with vmlinux.h)
* Fix MAX_CPUS hardcoding by switching to mmapable array maps, that are resized based on the value of libbpf_num_possible_cpus
* s/ELEMENTS_OF/ARRAY_SIZE/g
* Use tools/include/linux/hashtable.h
* Coding style fixes
* Remove hyperlinks for tracepoints
* Split into smaller reviewable changes
* Restore support for specifying custom xdp_redirect_cpu cpumap prog with some enhancements, including built-in programs for common actions (pass, drop, redirect). By default, cpumap prog is now disabled.
* Misc bug fixes all over the place

The printing stuff is a lot more basic without hyperlink support, hence it has not been exported into a more general facility.

v2 -> v3
v2: https://lore.kernel.org/bpf/20210721212833.701342-1-memxor@gmail.com
* Address all feedback from Andrii
* Replace usage of libbpf hashmap (internal API) with custom one
* Rename ATOMIC_* macros to NO_TEAR_* to better reflect their use
* Use size_t as a portable word sized data type
* Set libbpf_set_strict_mode
* Invert conditions in BPF programs to exit early and reduce nesting
* Use canonical SEC("xdp") naming for all XDP BPF programs
* Add missing help description for cpumap enqueue and kthread tracepoints
* Move private struct declarations from xdp_sample_user.h to .c file
* Improve help output for cpumap enqueue and cpumap kthread tracepoints
* Fix a bug where keys array for BPF_MAP_LOOKUP_BATCH is overallocated
* Fix some conditions for printing stats (earlier only checked pps, now pps, drop, err and print if any is greater than zero)
* Fix alloc_stats_record to properly return and cleanup allocated memory on allocation failure instead of calling exit(3)
* Bump bpf_map_lookup_batch count to 32 to reduce lookup time with multiple devices in map
* Fix a bug where devmap_xmit_multi stats are not printed when previous record is missing (i.e. the first time stats are printed), by simply using a dummy record that is zeroed out
* Also print per-CPU counts for devmap_xmit_multi which we collect already
* Change mac_map to be BPF_MAP_TYPE_HASH instead of array to prevent resizing to a large size when max_ifindex is high, in xdp_redirect_map_multi
* Fix instance of strerror(errno) in sample_install_xdp to use saved errno
* Provide a usage function from samples helper
* Provide a fix where incorrect stats are shown for parallel sessions of xdp_redirect_* samples by introducing matching support for input device(s), output device(s) and cpumap map id for enqueue and kthread stats. Only xdp_monitor doesn't filter stats, all others do.

RFC (v1) -> v2
RFC (v1): https://lore.kernel.org/bpf/20210528235250.2635167-1-memxor@gmail.com
* Address all feedback from Andrii
* Use BPF static linking
* Use vmlinux.h
* Use BPF_PROG macro
* Use global variables instead of maps
* Use of tp_btf for raw_tracepoint progs
* Switch to timerfd for polling
* Use libbpf hashmap for maintaining device sets for per ifindex pair devmap_xmit stats
* Fix Makefile to specify object dependencies properly
* Use in-tree bpftool
* ... misc fixes and cleanups all over the place
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
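
The v2 -> v3 item about using a saved errno with strerror() reflects a common C pitfall: any call made between the failing operation and the error report may overwrite errno. The sketch below illustrates the pattern only; `failing_attach`, `cleanup_that_clobbers_errno` and `install_and_report` are hypothetical stand-ins and are not taken from sample_install_xdp.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical operation that always fails with ENETDOWN. */
static int failing_attach(void)
{
	errno = ENETDOWN;
	return -1;
}

/* Hypothetical cleanup step that overwrites errno as a side effect. */
static void cleanup_that_clobbers_errno(void)
{
	errno = 0;
}

/* Report the failure using the errno value captured right after it
 * happened, so later calls cannot change the message.
 * Returns 0 on success or a negative errno value on failure. */
static int install_and_report(char *buf, size_t len)
{
	int saved_errno;

	if (failing_attach() < 0) {
		saved_errno = errno;	/* capture before anything else runs */
		cleanup_that_clobbers_errno();
		snprintf(buf, len, "attach failed: %s", strerror(saved_errno));
		return -saved_errno;
	}
	return 0;
}
```

Calling strerror(errno) after the cleanup step would describe whatever value the cleanup left behind instead of the original failure.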
2021-08-24samples: bpf: Convert xdp_redirect_map_multi to XDP samples helperKumar Kartikeya Dwivedi
Use the libbpf skeleton facility and other utilities provided by XDP samples helper. Also adapt to change of type of mac address map, so that no resizing is required. Add a new flag for sample mask that skips printing the from_device->to_device heading for each line, as xdp_redirect_map_multi may have two devices but the flow of data may be bidirectional, so the output would be confusing. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210821002010.845777-23-memxor@gmail.com
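
The effect of a mask flag that suppresses the from_device->to_device heading can be sketched as below. The flag names `DEMO_REDIRECT_CNT` and `DEMO_SKIP_HEADING` and the function `format_line` are assumptions made for this illustration; they are not the identifiers used by the samples helper.

```c
#include <stdio.h>

/* Hypothetical sample mask bits, for illustration only. */
enum demo_sample_flag {
	DEMO_REDIRECT_CNT = 1U << 0,
	DEMO_SKIP_HEADING = 1U << 1,
};

/* Format one concise output line into buf. The "from->to" heading is
 * emitted unless DEMO_SKIP_HEADING is set in mask.
 * Returns the number of characters written, or -1 if buf is too small. */
static int format_line(char *buf, size_t len, unsigned int mask,
		       const char *from, const char *to, unsigned long rx_pps)
{
	int off = 0, n;

	if (!(mask & DEMO_SKIP_HEADING)) {
		off = snprintf(buf, len, "%s->%s ", from, to);
		if (off < 0 || (size_t)off >= len)
			return -1;
	}

	n = snprintf(buf + off, len - off, "%lu rx/s", rx_pps);
	if (n < 0 || (size_t)n >= len - off)
		return -1;

	return off + n;
}
```

With the flag set, a sample whose traffic flows in both directions prints only the counters and avoids a heading that suggests a single direction.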