summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-05-28ptp: rework gianfar_ptp as QorIQ common PTP driverYangbo Lu
gianfar_ptp was the PTP clock driver for 1588 timer module of Freescale QorIQ eTSEC (Enhanced Three-Speed Ethernet Controllers) platforms. Actually QorIQ DPAA (Data Path Acceleration Architecture) platforms is also using the same 1588 timer module in hardware. This patch is to rework gianfar_ptp as QorIQ common PTP driver to support both DPAA and eTSEC. Moved gianfar_ptp.c to drivers/ptp/, renamed it as ptp_qoriq.c, and renamed many variables. There were not any function changes. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28ifb: fix packets checksumJon Maxwell
Fixup the checksum for CHECKSUM_COMPLETE when pulling skbs on RX path. Otherwise we get splats when tc mirred is used to redirect packets to ifb. Before fix: nic: hw csum failure Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28net: phy: realtek: add suspend/resume callbacks for RTL8211BHeiner Kallweit
Add RTL8211B suspend / resume callbacks. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28Merge branch 'Enable-virtio_net-to-act-as-a-standby-for-a-passthru-device'David S. Miller
Sridhar Samudrala says: ==================== Enable virtio_net to act as a standby for a passthru device The main motivation for this patch is to enable cloud service providers to provide an accelerated datapath to virtio-net enabled VMs in a transparent manner with no/minimal guest userspace changes. This also enables hypervisor controlled live migration to be supported with VMs that have direct attached SR-IOV VF devices. Patch 1 introduces a failover module that provides a generic interface for paravirtual drivers to listen for netdev register/unregister/link change events from pci ethernet devices with the same MAC and takeover their datapath. The notifier and event handling code is based on the existing netvsc implementation. Patch 2 refactors netvsc to use the registration/notification framework introduced by failover module. Patch 3 introduces a net_failover driver that provides an automated failover mechanism to paravirtual drivers via APIs to create and destroy a failover master netdev and mananges a primary and standby slave netdevs that get registered via the generic failover infrastructure. Patch 4 introduces a new feature bit VIRTIO_NET_F_STANDBY to virtio-net that can be used by hypervisor to indicate that virtio_net interface should act as a standby for another device with the same MAC address. Patch 5 extends virtio_net to use alternate datapath when available and registered. When STANDBY feature is enabled, virtio_net driver uese the net_failover API to create an additional 'failover' netdev that acts as a master device and controls 2 slave devices. The original virtio_net netdev is registered as 'standby' netdev and a passthru/vf device with the same MAC gets registered as 'primary' netdev. Both 'standby' and 'failover' netdevs are associated with the same 'pci' device. The user accesses the network interface via 'failover' netdev. The 'failover' netdev chooses 'primary' netdev as default for transmits when it is available with link up and running. As this patch series is initially focusing on usecases where hypervisor fully controls the VM networking and the guest is not expected to directly configure any hardware settings, it doesn't expose all the ndo/ethtool ops that are supported by virtio_net at this time. To support additional usecases, it should be possible to enable additional ops later by caching the state in failover netdev and replaying when the 'primary' netdev gets registered. At the time of live migration, the hypervisor needs to unplug the VF device from the guest on the source host and reset the MAC filter of the VF to initiate failover of datapath to virtio before starting the migration. After the migration is completed, the destination hypervisor sets the MAC filter on the VF and plugs it back to the guest to switch over to VF datapath. This patch is based on the discussion initiated by Jesse on this thread. https://marc.info/?l=linux-virtualization&m=151189725224231&w=2 v12: - Tested live migration with virtio-net/AVF(i40evf) configured in failover mode while running iperf in background. Tried static ip and dhcp configurations using 'network' scripts and Network Manager. - Build tested netvsc module. Updates: - Extended generic failover module to do common functions like setting FAILOVER_SLAVE flag, registering rx-handler and linking to upper dev in the generic register/unregister handlers. This required adding 3 additional failover ops pre_register, pre_unregister and handle_frame. netvsc and net_failover drivers are updated to support these ops. v11: - Split net_failover module into 2 components. 1. 'failover' module that provides generic failover infrastructure to register a failover instance and listen for slave events. 2. 'net_failover' driver that provides APIs to create/destroy upper netdev and supports 3-netdev model used by virtio-net. - Added documentation v10: - fix net_failover_open() to update failover CARRIER correctly based on standby and primary states. - fix net_failover_handle_frame() to handle frames received on standby when primary is present. - replace netdev_upper_dev_link with netdev_master_upper_dev_link and handle lower dev state changes. - fix net_failver_create() and net_failover_register() interfaces to use ERR_PTR and avoid arg ** - disable setting mac address when virtio-net in STANDBY mode - document exported symbols - added entry to MAINTAINERS file v9: Select NET_FAILOVER automatically when VIRTIO_NET/HYPERV_NET are enabled. (stephen) v8: - Made the failover managment routines more robust by updating the feature bits/other fields in the failover netdev when slave netdevs are registered/unregistered. (mst) - added support for handling vlans. - Limited the changes in netvsc to only use the notifier/event/lookups from the failover module. The slave register/unregister/link-change handlers are only updated to use the getbymac routine to get the upper netdev. There is no change in their functionality. (stephen) - renamed structs/function/file names to use net_failover prefix. (mst) v7 - Rename 'bypass/active/backup' terminology with 'failover/primary/standy' (jiri, mst) - re-arranged dev_open() and dev_set_mtu() calls in the register routines so that they don't get called for 2-netdev model. (stephen) - fixed select_queue() routine to do queue selection based on VF if it is registered as primary. (stephen) - minor bugfixes v6 RFC: Simplified virtio_net changes by moving all the ndo_ops of the bypass_netdev and create/destroy of bypass_netdev to 'bypass' module. avoided 2 phase registration(driver + instances). introduced IFF_BYPASS/IFF_BYPASS_SLAVE dev->priv_flags replaced mutex with a spinlock v5 RFC: Based on Jiri's comments, moved the common functionality to a 'bypass' module so that the same notifier and event handlers to handle child register/unregister/link change events can be shared between virtio_net and netvsc. Improved error handling based on Siwei's comments. v4: - Based on the review comments on the v3 version of the RFC patch and Jakub's suggestion for the naming issue with 3 netdev solution, proposed 3 netdev in-driver bonding solution for virtio-net. v3 RFC: - Introduced 3 netdev model and pointed out a couple of issues with that model and proposed 2 netdev model to avoid these issues. - Removed broadcast/multicast optimization and only use virtio as backup path when VF is unplugged. v2 RFC: - Changed VIRTIO_NET_F_MASTER to VIRTIO_NET_F_BACKUP (mst) - made a small change to the virtio-net xmit path to only use VF datapath for unicasts. Broadcasts/multicasts use virtio datapath. This avoids east-west broadcasts to go over the PCI link. - added suppport for the feature bit in qemu ==================== Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28virtio_net: Extend virtio to use VF datapath when availableSridhar Samudrala
This patch enables virtio_net to switch over to a VF datapath when STANDBY feature is enabled and a VF netdev is present with the same MAC address. It allows live migration of a VM with a direct attached VF without the need to setup a bond/team between a VF and virtio net device in the guest. It uses the API that is exported by the net_failover driver to create and and destroy a master failover netdev. When STANDBY feature is enabled, an additional netdev(failover netdev) is created that acts as a master device and tracks the state of the 2 lower netdevs. The original virtio_net netdev is marked as 'standby' netdev and a passthru device with the same MAC is registered as 'primary' netdev. The hypervisor needs to unplug the VF device from the guest on the source host and reset the MAC filter of the VF to initiate failover of datapath to virtio before starting the migration. After the migration is completed, the destination hypervisor sets the MAC filter on the VF and plugs it back to the guest to switch over to VF datapath. This patch is based on the discussion initiated by Jesse on this thread. https://marc.info/?l=linux-virtualization&m=151189725224231&w=2 Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28virtio_net: Introduce VIRTIO_NET_F_STANDBY feature bitSridhar Samudrala
This feature bit can be used by hypervisor to indicate virtio_net device to act as a standby for another device with the same MAC address. VIRTIO_NET_F_STANDBY is defined as bit 62 as it is a device feature bit. Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28net: Introduce net_failover driverSridhar Samudrala
The net_failover driver provides an automated failover mechanism via APIs to create and destroy a failover master netdev and manages a primary and standby slave netdevs that get registered via the generic failover infrastructure. The failover netdev acts a master device and controls 2 slave devices. The original paravirtual interface gets registered as 'standby' slave netdev and a passthru/vf device with the same MAC gets registered as 'primary' slave netdev. Both 'standby' and 'failover' netdevs are associated with the same 'pci' device. The user accesses the network interface via 'failover' netdev. The 'failover' netdev chooses 'primary' netdev as default for transmits when it is available with link up and running. This can be used by paravirtual drivers to enable an alternate low latency datapath. It also enables hypervisor controlled live migration of a VM with direct attached VF by failing over to the paravirtual datapath when the VF is unplugged. Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28netvsc: refactor notifier/event handling code to use the failover frameworkSridhar Samudrala
Use the registration/notification framework supported by the generic failover infrastructure. Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28net: Introduce generic failover moduleSridhar Samudrala
The failover module provides a generic interface for paravirtual drivers to register a netdev and a set of ops with a failover instance. The ops are used as event handlers that get called to handle netdev register/ unregister/link change/name change events on slave pci ethernet devices with the same mac address as the failover netdev. This enables paravirtual drivers to use a VF as an accelerated low latency datapath. It also allows migration of VMs with direct attached VFs by failing over to the paravirtual datapath when the VF is unplugged. Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28vrf: add CRC32c offload to device featuresDavide Caratti
SCTP sockets originated in a VRF can improve their performance if CRC32c computation is delegated to underlying devices: update device features, setting NETIF_F_SCTP_CRC. Iterating the following command in the topology proposed with [1], # ip vrf exec vrf-h2 netperf -H 192.0.2.1 -t SCTP_STREAM -- -m 10K the measured throughput in Mbit/s improved from 2395 ± 1% to 2720 ± 1%. [1] https://www.spinics.net/lists/netdev/msg486007.html Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28net: stmmac: Use mutex instead of spinlockThierry Reding
Some drivers, such as DWC EQOS on Tegra, need to perform operations that can sleep under this lock (clk_set_rate() in tegra_eqos_fix_speed()) for proper operation. Since there is no need for this lock to be a spinlock, convert it to a mutex instead. Fixes: e6ea2d16fc61 ("net: stmmac: dwc-qos: Add Tegra186 support") Reported-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Tested-by: Bhadram Varka <vbhadram@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28bnx2x: Collect the device debug information during Tx timeout.Sudarsana Reddy Kalluru
Tx-timeout mostly happens due to some issue in the device. In such cases, debug dump would be helpful for identifying the cause of the issue. This patch adds support to spill debug data during the Tx timeout. Here bnx2x_panic_dump() API is used instead of bnx2x_panic(), since we still want to allow the Tx-timeout recovery a chance to succeed. Changes from previous version: ------------------------------- v2: Fixed a coding error. Please consider applying this to "net-next". Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for your net tree: 1) Null pointer dereference when dumping conntrack helper configuration, from Taehee Yoo. 2) Missing sanitization in ebtables extension name through compat, from Paolo Abeni. 3) Broken fetch of tracing value, from Taehee Yoo. 4) Incorrect arithmetics in packet ratelimiting. 5) Buffer overflow in IPVS sync daemon, from Julian Anastasov. 6) Wrong argument to nla_strlcpy() in nfnetlink_{acct,cthelper}, from Eric Dumazet. 7) Fix splat in nft_update_chain_stats(). 8) Null pointer dereference from object netlink dump path, from Taehee Yoo. 9) Missing static_branch_inc() when enabling counters in existing chain, from Taehee Yoo. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28scsi: scsi_transport_srp: Fix shost to rport translationBart Van Assche
Since an SRP remote port is attached as a child to shost->shost_gendev and as the only child, the translation from the shost pointer into an rport pointer must happen by looking up the shost child that is an rport. This patch fixes the following KASAN complaint: BUG: KASAN: slab-out-of-bounds in srp_timed_out+0x57/0x110 [scsi_transport_srp] Read of size 4 at addr ffff880035d3fcc0 by task kworker/1:0H/19 CPU: 1 PID: 19 Comm: kworker/1:0H Not tainted 4.16.0-rc3-dbg+ #1 Workqueue: kblockd blk_mq_timeout_work Call Trace: dump_stack+0x85/0xc7 print_address_description+0x65/0x270 kasan_report+0x231/0x350 srp_timed_out+0x57/0x110 [scsi_transport_srp] scsi_times_out+0xc7/0x3f0 [scsi_mod] blk_mq_terminate_expired+0xc2/0x140 bt_iter+0xbc/0xd0 blk_mq_queue_tag_busy_iter+0x1c7/0x350 blk_mq_timeout_work+0x325/0x3f0 process_one_work+0x441/0xa50 worker_thread+0x76/0x6c0 kthread+0x1b2/0x1d0 ret_from_fork+0x24/0x30 Fixes: e68ca75200fe ("scsi_transport_srp: Reduce failover time") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Laurence Oberman <loberman@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-29powerpc/livepatch: Fix build error with kprobes disabled.Aneesh Kumar K.V
arch/powerpc/kernel/stacktrace.c: In function ‘save_stack_trace_tsk_reliable’: arch/powerpc/kernel/stacktrace.c:176:28: error: ‘kretprobe_trampoline’ undeclared if (ip == (unsigned long)kretprobe_trampoline) ^~~~~~~~~~~~~~~~~~~~ Fixes: df78d3f61480 ("powerpc/livepatch: Implement reliable stack tracing for the consistency model") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-05-29netfilter: nfnetlink: allow commit to failFlorian Westphal
->commit() cannot fail at the moment. Followup-patch adds kmalloc calls in the commit phase, so we'll need to be able to handle errors. Make it so that -EGAIN causes a full replay, and make other errors cause the transaction to fail. Failing is ok from a consistency point of view as long as we perform all actions that could return an error before we increment the generation counter and the base seq. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-05-29netfilter: nat: merge nf_nat_redirect into nf_natFlorian Westphal
Similar to previous patch, this time, merge redirect+nat. The redirect module is just 2k in size, get rid of it and make redirect part available from the nat core. before: text data bss dec hex filename 19461 1484 4138 25083 61fb net/netfilter/nf_nat.ko 1236 792 0 2028 7ec net/netfilter/nf_nat_redirect.ko after: 20340 1508 4138 25986 6582 net/netfilter/nf_nat.ko Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-05-29netfilter: nat: merge ipv4/ipv6 masquerade code into main nat moduleFlorian Westphal
Instead of using extra modules for these, turn the config options into an implicit dependency that adds masq feature to the protocol specific nf_nat module. before: text data bss dec hex filename 2001 860 4 2865 b31 net/ipv4/netfilter/nf_nat_masquerade_ipv4.ko 5579 780 2 6361 18d9 net/ipv4/netfilter/nf_nat_ipv4.ko 2860 836 8 3704 e78 net/ipv6/netfilter/nf_nat_masquerade_ipv6.ko 6648 780 2 7430 1d06 net/ipv6/netfilter/nf_nat_ipv6.ko after: text data bss dec hex filename 7245 872 8 8125 1fbd net/ipv4/netfilter/nf_nat_ipv4.ko 9165 848 12 10025 2729 net/ipv6/netfilter/nf_nat_ipv6.ko Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-05-29netfilter: add includes to nf_socket.hMáté Eckl
These have to be included always when nf_socket.h is included. Signed-off-by: Máté Eckl <ecklm94@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-05-29netfilter: nf_tables: increase nft_counters_enabled in nft_chain_stats_replace()Taehee Yoo
When a chain is updated, a counter can be attached. if so, the nft_counters_enabled should be increased. test commands: %nft add table ip filter %nft add chain ip filter input { type filter hook input priority 4\; } %iptables-compat -Z input %nft delete chain ip filter input we can see below messages. [ 286.443720] jump label: negative count! [ 286.448278] WARNING: CPU: 0 PID: 1459 at kernel/jump_label.c:197 __static_key_slow_dec_cpuslocked+0x6f/0xf0 [ 286.449144] Modules linked in: nf_tables nfnetlink ip_tables x_tables [ 286.449144] CPU: 0 PID: 1459 Comm: nft Tainted: G W 4.17.0-rc2+ #12 [ 286.449144] RIP: 0010:__static_key_slow_dec_cpuslocked+0x6f/0xf0 [ 286.449144] RSP: 0018:ffff88010e5176f0 EFLAGS: 00010286 [ 286.449144] RAX: 000000000000001b RBX: ffffffffc0179500 RCX: ffffffffb8a82522 [ 286.449144] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff88011b7e5eac [ 286.449144] RBP: 0000000000000000 R08: ffffed00236fce5c R09: ffffed00236fce5b [ 286.449144] R10: ffffffffc0179503 R11: ffffed00236fce5c R12: 0000000000000000 [ 286.449144] R13: ffff88011a28e448 R14: ffff88011a28e470 R15: dffffc0000000000 [ 286.449144] FS: 00007f0384328700(0000) GS:ffff88011b600000(0000) knlGS:0000000000000000 [ 286.449144] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 286.449144] CR2: 00007f038394bf10 CR3: 0000000104a86000 CR4: 00000000001006f0 [ 286.449144] Call Trace: [ 286.449144] static_key_slow_dec+0x6a/0x70 [ 286.449144] nf_tables_chain_destroy+0x19d/0x210 [nf_tables] [ 286.449144] nf_tables_commit+0x1891/0x1c50 [nf_tables] [ 286.449144] nfnetlink_rcv+0x1148/0x13d0 [nfnetlink] [ ... ] Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-05-29netfilter: nf_tables: fix NULL-ptr in nf_tables_dump_obj()Taehee Yoo
The table field in nft_obj_filter is not an array. In order to check tablename, we should check if the pointer is set. Test commands: %nft add table ip filter %nft add counter ip filter ct1 %nft reset counters Splat looks like: [ 306.510504] kasan: CONFIG_KASAN_INLINE enabled [ 306.516184] kasan: GPF could be caused by NULL-ptr deref or user memory access [ 306.524775] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 306.528284] Modules linked in: nft_objref nft_counter nf_tables nfnetlink ip_tables x_tables [ 306.528284] CPU: 0 PID: 1488 Comm: nft Not tainted 4.17.0-rc4+ #17 [ 306.528284] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015 [ 306.528284] RIP: 0010:nf_tables_dump_obj+0x52c/0xa70 [nf_tables] [ 306.528284] RSP: 0018:ffff8800b6cb7520 EFLAGS: 00010246 [ 306.528284] RAX: 0000000000000000 RBX: ffff8800b6c49820 RCX: 0000000000000000 [ 306.528284] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffffed0016d96e9a [ 306.528284] RBP: ffff8800b6cb75c0 R08: ffffed00236fce7c R09: ffffed00236fce7b [ 306.528284] R10: ffffffff9f6241e8 R11: ffffed00236fce7c R12: ffff880111365108 [ 306.528284] R13: 0000000000000000 R14: ffff8800b6c49860 R15: ffff8800b6c49860 [ 306.528284] FS: 00007f838b007700(0000) GS:ffff88011b600000(0000) knlGS:0000000000000000 [ 306.528284] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 306.528284] CR2: 00007ffeafabcf78 CR3: 00000000b6cbe000 CR4: 00000000001006f0 [ 306.528284] Call Trace: [ 306.528284] netlink_dump+0x470/0xa20 [ 306.528284] __netlink_dump_start+0x5ae/0x690 [ 306.528284] ? nf_tables_getobj+0x1b3/0x740 [nf_tables] [ 306.528284] nf_tables_getobj+0x2f5/0x740 [nf_tables] [ 306.528284] ? nft_obj_notify+0x100/0x100 [nf_tables] [ 306.528284] ? nf_tables_getobj+0x740/0x740 [nf_tables] [ 306.528284] ? nf_tables_dump_flowtable_done+0x70/0x70 [nf_tables] [ 306.528284] ? nft_obj_notify+0x100/0x100 [nf_tables] [ 306.528284] nfnetlink_rcv_msg+0x8ff/0x932 [nfnetlink] [ 306.528284] ? nfnetlink_rcv_msg+0x216/0x932 [nfnetlink] [ 306.528284] netlink_rcv_skb+0x1c9/0x2f0 [ 306.528284] ? nfnetlink_bind+0x1d0/0x1d0 [nfnetlink] [ 306.528284] ? debug_check_no_locks_freed+0x270/0x270 [ 306.528284] ? netlink_ack+0x7a0/0x7a0 [ 306.528284] ? ns_capable_common+0x6e/0x110 [ ... ] Fixes: e46abbcc05aa8 ("netfilter: nf_tables: Allow table names of up to 255 chars") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-05-29netfilter: nf_tables: disable preemption in nft_update_chain_stats()Pablo Neira Ayuso
This patch fixes the following splat. [118709.054937] BUG: using smp_processor_id() in preemptible [00000000] code: test/1571 [118709.054970] caller is nft_update_chain_stats.isra.4+0x53/0x97 [nf_tables] [118709.054980] CPU: 2 PID: 1571 Comm: test Not tainted 4.17.0-rc6+ #335 [...] [118709.054992] Call Trace: [118709.055011] dump_stack+0x5f/0x86 [118709.055026] check_preemption_disabled+0xd4/0xe4 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-05-28PM / QoS: Drop redundant declaration of pm_qos_get_value()Rafael J. Wysocki
The extra forward declaration of pm_qos_get_value() is redundant, so drop it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-28bcache: Replace bch_read_string_list() by __sysfs_match_string()Andy Shevchenko
Kernel library has a common function to match user input from sysfs against an array of strings. Thus, replace bch_read_string_list() by __sysfs_match_string(). Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-28bcache: Move couple of functions to sysfs.cAndy Shevchenko
There is couple of functions that are used exclusively in sysfs.c. Move it to there and make them static. Besides above, it will allow further clean up. No functional change intended. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-28bcache: Move couple of string arrays to sysfs.cAndy Shevchenko
There is couple of string arrays that are used exclusively in sysfs.c. Move it to there and make them static. Besides above, it will allow further clean up. No functional change intended. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-28bcache: stop bcache device when backing device is offlineColy Li
Currently bcache does not handle backing device failure, if backing device is offline and disconnected from system, its bcache device can still be accessible. If the bcache device is in writeback mode, I/O requests even can success if the requests hit on cache device. That is to say, when and how bcache handles offline backing device is undefined. This patch tries to handle backing device offline in a rather simple way, - Add cached_dev->status_update_thread kernel thread to update backing device status in every 1 second. - Add cached_dev->offline_seconds to record how many seconds the backing device is observed to be offline. If the backing device is offline for BACKING_DEV_OFFLINE_TIMEOUT (30) seconds, set dc->io_disable to 1 and call bcache_device_stop() to stop the bache device which linked to the offline backing device. Now if a backing device is offline for BACKING_DEV_OFFLINE_TIMEOUT seconds, its bcache device will be removed, then user space application writing on it will get error immediately, and handler the device failure in time. This patch is quite simple, does not handle more complicated situations. Once the bcache device is stopped, users need to recovery the backing device, register and attach it manually. Changelog: v3: call wait_for_kthread_stop() before exits kernel thread. v2: remove "bcache: " prefix when calling pr_warn(). v1: initial version. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Cc: Michael Lyle <mlyle@lyle.org> Cc: Junhui Tang <tang.junhui@zte.com.cn> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-29kconfig: add basic helper macros to scripts/Kconfig.includeMasahiro Yamada
Kconfig got text processing tools like we see in Make. Add Kconfig helper macros to scripts/Kconfig.include like we collect Makefile macros in scripts/Kbuild.include. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-05-29kconfig: show compiler version text in the top commentMasahiro Yamada
The kernel configuration phase is now tightly coupled with the compiler in use. It will be nice to show the compiler information in Kconfig. The compiler information will be displayed like this: $ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- config scripts/kconfig/conf --oldaskconfig Kconfig * * Linux/arm64 4.16.0-rc1 Kernel Configuration * * * Compiler: aarch64-linux-gnu-gcc (Linaro GCC 7.2-2017.11) 7.2.1 20171011 * * * General setup * Compile also drivers which will not load (COMPILE_TEST) [N/y/?] If you use GUI methods such as menuconfig, it will be displayed in the top menu. This is simply implemented by using the 'comment' statement. So, it will be saved into the .config file as well. This commit has a very important meaning. If the compiler is upgraded, Kconfig must be re-run since different compilers have different sets of supported options. All referenced environments are written to include/config/auto.conf.cmd so that any environment change triggers syncconfig, and prompt the user to input new values if needed. With this commit, something like follows will be added to include/config/auto.conf.cmd ifneq "$(CC_VERSION_TEXT)" "aarch64-linux-gnu-gcc (Linaro GCC 7.2-2017.11) 7.2.1 20171011" include/config/auto.conf: FORCE endif Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org>
2018-05-29kconfig: test: add Kconfig macro language testsMasahiro Yamada
Here are the test cases I used for developing the text expansion feature. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-05-29Documentation: kconfig: document a new Kconfig macro languageMasahiro Yamada
Add a document for the macro language introduced to Kconfig. The motivation of this work is to move the compiler option tests to Kconfig from Makefile. A number of kernel features require the compiler support. Enabling such features blindly in Kconfig ends up with a lot of nasty build-time testing in Makefiles. If a chosen feature turns out unsupported by the compiler, what the build system can do is either to disable it (silently!) or to forcibly break the build, despite Kconfig has let the user to enable it. By moving the compiler capability tests to Kconfig, features unsupported by the compiler will be hidden automatically. This change was strongly prompted by Linus Torvalds. You can find his suggestions [1] [2] in ML. The original idea was to add a new attribute with 'option shell=...', but I found more generalized text expansion would make Kconfig more powerful and lovely. The basic ideas are from Make, but there are some differences. [1]: https://lkml.org/lkml/2016/12/9/577 [2]: https://lkml.org/lkml/2018/2/7/527 Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
2018-05-29kconfig: error out if a recursive variable references itselfMasahiro Yamada
When using a recursively expanded variable, it is a common mistake to make circular reference. For example, Make terminates the following code: X = $(X) Y := $(X) Let's detect the circular expansion in Kconfig, too. On the other hand, a function that recurses itself is a commonly-used programming technique. So, Make does not check recursion in the reference with 'call'. For example, the following code continues running eternally: X = $(call X) Y := $(X) Kconfig allows circular expansion if one or more arguments are given, but terminates when the same function is recursively invoked 1000 times, assuming it is a programming mistake. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-05-29kconfig: add 'filename' and 'lineno' built-in variablesMasahiro Yamada
The special variables, $(filename) and $(lineno), are expanded to a file name and its line number being parsed, respectively. Suggested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org>
2018-05-29kconfig: add 'info', 'warning-if', and 'error-if' built-in functionsMasahiro Yamada
Syntax: $(info,<text>) $(warning-if,<condition>,<text>) $(error-if,<condition>,<text) The 'info' function prints a message to stdout as in Make. The 'warning-if' and 'error-if' are similar to 'warning' and 'error' in Make, but take the condition parameter. They are effective only when the <condition> part is y. Kconfig does not implement the lazy expansion as used in the 'if' 'and, 'or' functions in Make. In other words, Kconfig does not support conditional expansion. The unconditional 'error' function would always terminate the parsing, hence would be useless in Kconfig. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org>
2018-05-29kconfig: expand lefthand side of assignment statementMasahiro Yamada
Make expands the lefthand side of assignment statements. In fact, Kbuild relies on it since kernel makefiles mostly look like this: obj-$(CONFIG_FOO) += foo.o Do likewise in Kconfig. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-05-29kconfig: support append assignment operatorMasahiro Yamada
Support += operator. This appends a space and the text on the righthand side to a variable. The timing of the evaluation of the righthand side depends on the flavor of the variable. If the lefthand side was originally defined as a simple variable, the righthand side is expanded immediately. Otherwise, the expansion is deferred. Appending something to an undefined variable results in a recursive variable. To implement this, we need to remember the flavor of variables. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-05-29kconfig: support simply expanded variableMasahiro Yamada
The previous commit added variable and user-defined function. They work similarly in the sense that the evaluation is deferred until they are used. This commit adds another type of variable, simply expanded variable, as we see in Make. The := operator defines a simply expanded variable, expanding the righthand side immediately. This works like traditional programming language variables. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-05-29kconfig: support user-defined function and recursively expanded variableMasahiro Yamada
Now, we got a basic ability to test compiler capability in Kconfig. config CC_HAS_STACKPROTECTOR def_bool $(shell,($(CC) -Werror -fstack-protector -E -x c /dev/null -o /dev/null 2>/dev/null) && echo y || echo n) This works, but it is ugly to repeat this long boilerplate. We want to describe like this: config CC_HAS_STACKPROTECTOR bool default $(cc-option,-fstack-protector) It is straight-forward to add a new function, but I do not like to hard-code specialized functions like that. Hence, here is another feature, user-defined function. This works as a textual shorthand with parameterization. A user-defined function is defined by using the = operator, and can be referenced in the same way as built-in functions. A user-defined function in Make is referenced like $(call my-func,arg1,arg2), but I omitted the 'call' to make the syntax shorter. The definition of a user-defined function contains $(1), $(2), etc. in its body to reference the parameters. It is grammatically valid to pass more or fewer arguments when calling it. We already exploit this feature in our makefiles; scripts/Kbuild.include defines cc-option which takes two arguments at most, but most of the callers pass only one argument. By the way, a variable is supported as a subset of this feature since a variable is "a user-defined function with zero argument". In this context, I mean "variable" as recursively expanded variable. I will add a different flavored variable in the next commit. The code above can be written as follows: [Example Code] success = $(shell,($(1)) >/dev/null 2>&1 && echo y || echo n) cc-option = $(success,$(CC) -Werror $(1) -E -x c /dev/null -o /dev/null) config CC_HAS_STACKPROTECTOR def_bool $(cc-option,-fstack-protector) [Result] $ make -s alldefconfig && tail -n 1 .config CONFIG_CC_HAS_STACKPROTECTOR=y Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-05-29kconfig: begin PARAM state only when seeing a command keywordMasahiro Yamada
Currently, any statement line starts with a keyword with TF_COMMAND flag. So, the following three lines are dead code. alloc_string(yytext, yyleng); zconflval.string = text; return T_WORD; If a T_WORD token is returned in this context, it will cause syntax error in the parser anyway. The next commit will support the assignment statement where a line starts with an arbitrary identifier. So, I want the lexer to switch to the PARAM state only when it sees a command keyword. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-05-29kconfig: replace $(UNAME_RELEASE) with function callMasahiro Yamada
Now that 'shell' function is supported, this can be self-contained in Kconfig. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-05-29kconfig: add 'shell' built-in functionMasahiro Yamada
This accepts a single command to execute. It returns the standard output from it. [Example code] config HELLO string default "$(shell,echo hello world)" config Y def_bool $(shell,echo y) [Result] $ make -s alldefconfig && tail -n 2 .config CONFIG_HELLO="hello world" CONFIG_Y=y Caveat: Like environments, functions are expanded in the lexer. You cannot pass symbols to function arguments. This is a limitation to simplify the implementation. I want to avoid the dynamic function evaluation, which would introduce much more complexity. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-05-29kconfig: add built-in function supportMasahiro Yamada
This commit adds a new concept 'function' to do more text processing in Kconfig. A function call looks like this: $(function,arg1,arg2,arg3,...) This commit adds the basic infrastructure to expand functions. Change the text expansion helpers to take arguments. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-05-29kconfig: make default prompt of mainmenu less specificMasahiro Yamada
If "mainmenu" is not specified, "Linux Kernel Configuration" is used as a default prompt. Given that Kconfig is used in other projects than Linux, let's use a more generic prompt, "Main menu". Suggested-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-05-29kconfig: remove sym_expand_string_value()Masahiro Yamada
There is no more caller of sym_expand_string_value(). Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org>
2018-05-29kconfig: remove string expansion for mainmenu after yyparse()Masahiro Yamada
Now that environments are expanded in the lexer, conf_parse() does not need to expand them explicitly. The hack introduced by commit 0724a7c32a54 ("kconfig: Don't leak main menus during parsing") can go away. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-05-29kconfig: remove string expansion in file_lookup()Masahiro Yamada
There are two callers of file_lookup(), but there is no more reason to expand the given path. [1] zconf_initscan() This is used to open the first Kconfig. sym_expand_string_value() has never been used in a useful way here; before opening the first Kconfig file, obviously there is no symbol to expand. If you use expand_string_value() instead, environments in KBUILD_KCONFIG would be expanded, but I do not see practical benefits for that. [2] zconf_nextfile() This is used to open the next file from 'source' statement. Symbols in the path like "arch/$SRCARCH/Kconfig" needed expanding, but it was replaced with the direct environment expansion. The environment has already been expanded before the token is passed to the parser. By the way, file_lookup() was already buggy; it expanded a given path, but it used the path before expansion for look-up: if (!strcmp(name, file->name)) { Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-05-29kconfig: reference environment variables directly and remove 'option env='Masahiro Yamada
To get access to environment variables, Kconfig needs to define a symbol using "option env=" syntax. It is tedious to add a symbol entry for each environment variable given that we need to define much more such as 'CC', 'AS', 'srctree' etc. to evaluate the compiler capability in Kconfig. Adding '$' for symbol references is grammatically inconsistent. Looking at the code, the symbols prefixed with 'S' are expanded by: - conf_expand_value() This is used to expand 'arch/$ARCH/defconfig' and 'defconfig_list' - sym_expand_string_value() This is used to expand strings in 'source' and 'mainmenu' All of them are fixed values independent of user configuration. So, they can be changed into the direct expansion instead of symbols. This change makes the code much cleaner. The bounce symbols 'SRCARCH', 'ARCH', 'SUBARCH', 'KERNELVERSION' are gone. sym_init() hard-coding 'UNAME_RELEASE' is also gone. 'UNAME_RELEASE' should be replaced with an environment variable. ARCH_DEFCONFIG is a normal symbol, so it should be simply referenced without '$' prefix. The new syntax is addicted by Make. The variable reference needs parentheses, like $(FOO), but you can omit them for single-letter variables, like $F. Yet, in Makefiles, people tend to use the parenthetical form for consistency / clarification. At this moment, only the environment variable is supported, but I will extend the concept of 'variable' later on. The variables are expanded in the lexer so we can simplify the token handling on the parser side. For example, the following code works. [Example code] config MY_TOOLCHAIN_LIST string default "My tools: CC=$(CC), AS=$(AS), CPP=$(CPP)" [Result] $ make -s alldefconfig && tail -n 1 .config CONFIG_MY_TOOLCHAIN_LIST="My tools: CC=gcc, AS=as, CPP=gcc -E" Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org>
2018-05-29kbuild: remove CONFIG_CROSS_COMPILE supportMasahiro Yamada
Kbuild provides a couple of ways to specify CROSS_COMPILE: [1] Command line [2] Environment [3] arch/*/Makefile (only some architectures) [4] CONFIG_CROSS_COMPILE [4] is problematic for the compiler capability tests in Kconfig. CONFIG_CROSS_COMPILE allows users to change the compiler prefix from 'make menuconfig', etc. It means, the compiler options would have to be all re-calculated everytime CONFIG_CROSS_COMPILE is changed. To avoid complexity and performance issues, I'd like to evaluate the shell commands statically, i.e. only parsing Kconfig files. I guess the majority is [1] or [2]. Currently, there are only 5 defconfig files that specify CONFIG_CROSS_COMPILE. arch/arm/configs/lpc18xx_defconfig arch/hexagon/configs/comet_defconfig arch/nds32/configs/defconfig arch/openrisc/configs/or1ksim_defconfig arch/openrisc/configs/simple_smp_defconfig Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org>
2018-05-29kbuild: remove kbuild cacheMasahiro Yamada
The kbuild cache was introduced to remember the result of shell commands, some of which are expensive to compute, such as $(call cc-option,...). However, this turned out not so clever as I had first expected. Actually, it is problematic. For example, "$(CC) -print-file-name" is cached. If the compiler is updated, the stale search path causes build error, which is difficult to figure out. Another problem scenario is cache files could be touched while install targets are running under the root permission. We can patch them if desired, but the build infrastructure is getting uglier and uglier. Now, we are going to move compiler flag tests to the configuration phase. If this is completed, the result of compiler tests will be naturally cached in the .config file. We will not have performance issues of incremental building since this testing only happens at Kconfig time. To start this work with a cleaner code base, remove the kbuild cache first. Revert the following commits: Commit 9a234a2e3843 ("kbuild: create directory for make cache only when necessary") Commit e17c400ae194 ("kbuild: shrink .cache.mk when it exceeds 1000 lines") Commit 4e56207130ed ("kbuild: Cache a few more calls to the compiler") Commit 3298b690b21c ("kbuild: Add a cache for generated variables") Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org>
2018-05-28aio: add missing break for the IOCB_CMD_FDSYNC caseChristoph Hellwig
Looks like this got lost in a merge. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>