summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2018-08-09nfsd: fix leaked file lock with nfs exported overlayfsAmir Goldstein
nfsd and lockd call vfs_lock_file() to lock/unlock the inode returned by locks_inode(file). Many places in nfsd/lockd code use the inode returned by file_inode(file) for lock manipulation. With Overlayfs, file_inode() (the underlying inode) is not the same object as locks_inode() (the overlay inode). This can result in "Leaked POSIX lock" messages and eventually to a kernel crash as reported by Eddie Horng: https://marc.info/?l=linux-unionfs&m=153086643202072&w=2 Fix all the call sites in nfsd/lockd that should use locks_inode(). This is a correctness bug that manifested when overlayfs gained NFS export support in v4.16. Reported-by: Eddie Horng <eddiehorng.tw@gmail.com> Tested-by: Eddie Horng <eddiehorng.tw@gmail.com> Cc: Jeff Layton <jlayton@kernel.org> Fixes: 8383f1748829 ("ovl: wire up NFS export operations") Cc: stable@vger.kernel.org Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-08-09Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Overlapping changes in RXRPC, changing to ktime_get_seconds() whilst adding some tracepoints. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09net: stmmac: Add XGMAC 2.10 HWIF entryJose Abreu
Add a new entry to HWIF table for XGMAC 2.10. For now we fill it with empty callbacks which will be added in posterior patches. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Joao Pinto <jpinto@synopsys.com> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09phylink: add helper for configuring 2500BaseX modesRussell King
Add a helper for MAC drivers to use in their validate callback to deal with 2500BaseX vs 1000BaseX modes, where the hardware supports both but it is not possible to automatically select between them. This helper defaults to 1000BaseX, as that is the 802.3 standard, and will allow users to select 2500BaseX either by forcing the speed if AN is disabled, or by changing the advertising mask if AN is enabled. Disabling AN is not recommended as it is only the speed that we're interested in controlling, not the duplex or pause mode parameters. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09signal: Don't restart fork when signals come in.Eric W. Biederman
Wen Yang <wen.yang99@zte.com.cn> and majiang <ma.jiang@zte.com.cn> report that a periodic signal received during fork can cause fork to continually restart preventing an application from making progress. The code was being overly pessimistic. Fork needs to guarantee that a signal sent to multiple processes is logically delivered before the fork and just to the forking process or logically delivered after the fork to both the forking process and it's newly spawned child. For signals like periodic timers that are always delivered to a single process fork can safely complete and let them appear to logically delivered after the fork(). While examining this issue I also discovered that fork today will miss signals delivered to multiple processes during the fork and handled by another thread. Similarly the current code will also miss blocked signals that are delivered to multiple process, as those signals will not appear pending during fork. Add a list of each thread that is currently forking, and keep on that list a signal set that records all of the signals sent to multiple processes. When fork completes initialize the new processes shared_pending signal set with it. The calculate_sigpending function will see those signals and set TIF_SIGPENDING causing the new task to take the slow path to userspace to handle those signals. Making it appear as if those signals were received immediately after the fork. It is not possible to send real time signals to multiple processes and exceptions don't go to multiple processes, which means that that are no signals sent to multiple processes that require siginfo. This means it is safe to not bother collecting siginfo on signals sent during fork. The sigaction of a child of fork is initially the same as the sigaction of the parent process. So a signal the parent ignores the child will also initially ignore. Therefore it is safe to ignore signals sent to multiple processes and ignored by the forking process. Signals sent to only a single process or only a single thread and delivered during fork are treated as if they are received after the fork, and generally not dealt with. They won't cause any problems. V2: Added removal from the multiprocess list on failure. V3: Use -ERESTARTNOINTR directly V4: - Don't queue both SIGCONT and SIGSTOP - Initialize signal_struct.multiprocess in init_task - Move setting of shared_pending to before the new task is visible to signals. This prevents signals from comming in before shared_pending.signal is set to delayed.signal and being lost. V5: - rework list add and delete to account for idle threads v6: - Use sigdelsetmask when removing stop signals Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200447 Reported-by: Wen Yang <wen.yang99@zte.com.cn> and Reported-by: majiang <ma.jiang@zte.com.cn> Fixes: 4a2c7a7837da ("[PATCH] make fork() atomic wrt pgrp/session signals") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-08-09NFS handle COPY reply CB_OFFLOAD call raceOlga Kornievskaia
It's possible that server replies back with CB_OFFLOAD call and COPY reply at the same time such that client will process CB_OFFLOAD before reply to COPY. For that keep a list of pending callback stateids received and then before waiting on completion check the pending list. Cleanup any pending copies on the client shutdown. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-08-09NFS add support for asynchronous COPYOlga Kornievskaia
Change xdr to always send COPY asynchronously. Keep the list copies send in a list under a server structure. Once copy is sent, it waits on a completion structure that will be signalled by the callback thread that receives CB_OFFLOAD. If CB_OFFLOAD returned an error and even if it returned partial bytes, ignore them (as we can't commit without a verifier to match) and return an error. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-08-09NFS COPY xdr handle async replyOlga Kornievskaia
If server returns async reply, it must include a callback stateid, wr_callback_id in the write_response4. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-08-09NFS OFFLOAD_CANCEL xdrOlga Kornievskaia
Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-08-09ssb: Remove SSB_WARN_ON, SSB_BUG_ON and SSB_DEBUGMichael Büsch
Use the standard WARN_ON instead. If a small kernel is desired, WARN_ON can be disabled globally. Also remove SSB_DEBUG. Besides WARN_ON it only adds a tiny debug check. Include this check unconditionally. Signed-off-by: Michael Buesch <m@bues.ch> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-08-09blkcg: Introduce blkg_root_lookup()Bart Van Assche
This new function will be used in a later patch to verify whether a queue has been dissociated from the cgroup controller before being released. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Omar Sandoval <osandov@fb.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: Alexandru Moise <00moses.alexander00@gmail.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: <stable@vger.kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09block: Remove two superfluous #include directivesBart Van Assche
Commit 12f5b9314545 ("blk-mq: Remove generation seqeunce") removed the only seqcount_t and u64_stats_sync instances from <linux/blkdev.h> but did not remove the corresponding #include directives. Since these include directives are no longer needed, remove them. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Keith Busch <keith.busch@intel.com> Cc: Ming Lei <ming.lei@redhat.com> Cc: Jianchao Wang <jianchao.w.wang@oracle.com> Cc: Hannes Reinecke <hare@suse.com>, Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09Merge branch 'asoc-4.19' into asoc-nextMark Brown
2018-08-09Merge branch 'regmap-4.19' into regmap-nextMark Brown
2018-08-09Merge tag 'regmap-noinc-read' into regmap-4.19Mark Brown
regmap: Support non-incrementing registers Some devices have individual registers that don't autoincrement the register address during bulk reads but instead repeatedly read the same value, for example for monitoring GPIOs or ADCs. Add support for these.
2018-08-09regmap: Add regmap_noinc_read APICrestez Dan Leonard
The regmap API usually assumes that bulk read operations will read a range of registers but some I2C/SPI devices have certain registers for which a such a read operation will return data from an internal FIFO instead. Add an explicit API to support bulk read without range semantics. Some linux drivers use regmap_bulk_read or regmap_raw_read for such registers, for example mpu6050 or bmi150 from IIO. This only happens to work because when caching is disabled a single regmap read op will map to a single bus read op (as desired). This breaks if caching is enabled and reg+1 happens to be a cacheable register. Without regmap support refactoring a driver to enable regmap caching requires separate I2C and SPI paths. This is exactly what regmap is supposed to help avoid. Suggested-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Crestez Dan Leonard <leonard.crestez@intel.com> Signed-off-by: Stefan Popa <stefan.popa@analog.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-08-09arm64 / ACPI: clean the additional checks before calling ghes_notify_sea()Dongjiu Geng
In order to remove the additional check before calling the ghes_notify_sea(), make stub definition when !CONFIG_ACPI_APEI_SEA. After this cleanup, we can simply call the ghes_notify_sea() to let APEI driver handle the SEA notification. Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-08-08net/mlx5: Remove unused mlx5_query_vport_admin_stateEli Cohen
mlx5_query_vport_admin_state() is not used anywhere. Remove it. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-08net/mlx5: Rename modify/query_vport state related enumsEran Ben Elisha
Modify and query vport state commands share the same admin_state and op_mod values, rename the enums to fit them both. In addition, remove the esw prefix from the admin state enum as this also applied for vnic. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-08net/mlx5: Use max_num_eqs for calculation of required MSIX vectorsDenis Drozdov
New firmware has defined new HCA capability field called "max_num_eqs", that is the number of available EQs after subtracting reserved FW EQs. Before this capability the FW reported the EQ number in "log_max_eqs", the reported value also contained FW reserved EQs, but the driver might be failing to load on 320 cpus systems due to the fact that FW reserved EQs were not available to the driver. Now the driver has to obtain max_num_eqs value from new FW to get real number of EQs available. Signed-off-by: Denis Drozdov <denisd@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-08i2c: core: Parse SDA hold time from firmwareAndy Shevchenko
There are two drivers already using the SDA hold time setting. It might be more in the future, thus, make I2C core to parse the setting for us if provided by firmware. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-08-08netfilter: nfnetlink_osf: add missing enum in nfnetlink_osf uapi headerFernando Fernandez Mancera
xt_osf_window_size_options was originally part of include/uapi/linux/netfilter/xt_osf.h, restore it. Fixes: bfb15f2a95cb ("netfilter: extract Passive OS fingerprint infrastructure from xt_osf") Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-08overflow.h: Add arithmetic shift helperJason Gunthorpe
Add shift_overflow() helper to assist driver authors in ensuring that shift operations don't cause overflows or other odd conditions. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> [kees: tweaked comments and commit log, dropped unneeded assignment] Signed-off-by: Kees Cook <keescook@chromium.org>
2018-08-08Merge branches 'arm/shmobile', 'arm/renesas', 'arm/msm', 'arm/smmu', ↵Joerg Roedel
'arm/omap', 'x86/amd', 'x86/vt-d' and 'core' into next
2018-08-08nvme.h: add support for ns write protect definitionsChaitanya Kulkarni
Add various definitions from NVMe 1.3 TP 4005. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-08-08nvme.h: fixup ANA group descriptor formatHannes Reinecke
ANA Phase 3 draft had the 'reserved' field in the group descriptor format set to '23:17' (so that the first namespace identifier started at byte 24), but that got move with the approved TP to '31:17' (so that the first namespace identifier started at byte 32). Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-08-08regulator: maxim: Add SPDX license identifiersKrzysztof Kozlowski
Replace GPL v2.0 and v2.0+ license statements with SPDX license identifiers. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-08-08iommu: Remove the ->map_sg indirectionChristoph Hellwig
All iommu drivers use the default_iommu_map_sg implementation, and there is no good reason to ever override it. Just expose it as iommu_map_sg directly and remove the indirection, specially in our post-spectre world where indirect calls are horribly expensive. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-08-07llc: use refcount_inc_not_zero() for llc_sap_find()Cong Wang
llc_sap_put() decreases the refcnt before deleting sap from the global list. Therefore, there is a chance llc_sap_find() could find a sap with zero refcnt in this global list. Close this race condition by checking if refcnt is zero or not in llc_sap_find(), if it is zero then it is being removed so we can just treat it as gone. Reported-by: <syzbot+278893f3f7803871f7ce@syzkaller.appspotmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07net: phy: Add support for Broadcom Omega internal Combo GPHYArun Parameswaran
Add support for the Broadcom Omega SoC internal Combo Ethernet GPHY to the bcm7xxx phy driver. Signed-off-by: Arun Parameswaran <arun.parameswaran@broadcom.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-08Merge branch 'drm-next-4.19' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie
into drm-next Fixes for 4.19: - Fix UVD 7.2 instance handling - Fix UVD 7.2 harvesting - GPU scheduler fix for when a process is killed - TTM cleanups - amdgpu CS bo_list fixes - Powerplay fixes for polaris12 and CZ/ST - DC fixes for link training certain HMDs - DC fix for vega10 blank screen in certain cases From: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180801222906.1016-1-alexander.deucher@amd.com
2018-08-07vsock: split dwork to avoid reinitializationsCong Wang
syzbot reported that we reinitialize an active delayed work in vsock_stream_connect(): ODEBUG: init active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x90 kernel/workqueue.c:1414 WARNING: CPU: 1 PID: 11518 at lib/debugobjects.c:329 debug_print_object+0x16a/0x210 lib/debugobjects.c:326 The pattern is apparently wrong, we should only initialize the dealyed work once and could repeatly schedule it. So we have to move out the initializations to allocation side. And to avoid confusion, we can split the shared dwork into two, instead of re-using the same one. Fixes: d021c344051a ("VSOCK: Introduce VM Sockets") Reported-by: <syzbot+8a9b1bd330476a4f3db6@syzkaller.appspotmail.com> Cc: Andy king <acking@vmware.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07net/sched: allow flower to match tunnel optionsPieter Jansen van Vuuren
Allow matching on options in Geneve tunnel headers. This makes use of existing tunnel metadata support. The options can be described in the form CLASS:TYPE:DATA/CLASS_MASK:TYPE_MASK:DATA_MASK, where CLASS is represented as a 16bit hexadecimal value, TYPE as an 8bit hexadecimal value and DATA as a variable length hexadecimal value. e.g. # ip link add name geneve0 type geneve dstport 0 external # tc qdisc add dev geneve0 ingress # tc filter add dev geneve0 protocol ip parent ffff: \ flower \ enc_src_ip 10.0.99.192 \ enc_dst_ip 10.0.99.193 \ enc_key_id 11 \ geneve_opts 0102:80:1122334421314151/ffff:ff:ffffffffffffffff \ ip_proto udp \ action mirred egress redirect dev eth1 This patch adds support for matching Geneve options in the order supplied by the user. This leads to an efficient implementation in the software datapath (and in our opinion hardware datapaths that offload this feature). It is also compatible with Geneve options matching provided by the Open vSwitch kernel datapath which is relevant here as the Flower classifier may be used as a mechanism to program flows into hardware as a form of Open vSwitch datapath offload (sometimes referred to as OVS-TC). The netlink Kernel/Userspace API may be extended, for example by adding a flag, if other matching options are desired, for example matching given options in any order. This would require an implementation in the TC software datapath. And be done in a way that drivers that facilitate offload of the Flower classifier can reject or accept such flows based on hardware datapath capabilities. This approach was discussed and agreed on at Netconf 2017 in Seoul. Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07flow_dissector: allow dissection of tunnel options from metadataSimon Horman
Allow the existing 'dissection' of tunnel metadata to 'dissect' options already present in tunnel metadata. This dissection is controlled by a new dissector key, FLOW_DISSECTOR_KEY_ENC_OPTS. This dissection only occurs when skb_flow_dissect_tunnel_info() is called, currently only the Flower classifier makes that call. So there should be no impact on other users of the flow dissector. This is in preparation for allowing the flower classifier to match on Geneve options. Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07ethtool: Add WAKE_FILTER and RX_CLS_FLOW_WAKEFlorian Fainelli
Add the ability to specify through ethtool::rxnfc that a rule location is special and will be used to participate in Wake-on-LAN, by e.g: having a specific pattern be matched. When this is the case, fs->ring_cookie must be set to the special value RX_CLS_FLOW_WAKE. We also define an additional ethtool::wolinfo flag: WAKE_FILTER which can be used to configure an Ethernet adapter to allow Wake-on-LAN using previously programmed filters. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf-next 2018-08-07 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Add cgroup local storage for BPF programs, which provides a fast accessible memory for storing various per-cgroup data like number of transmitted packets, etc, from Roman. 2) Support bpf_get_socket_cookie() BPF helper in several more program types that have a full socket available, from Andrey. 3) Significantly improve the performance of perf events which are reported from BPF offload. Also convert a couple of BPF AF_XDP samples overto use libbpf, both from Jakub. 4) seg6local LWT provides the End.DT6 action, which allows to decapsulate an outer IPv6 header containing a Segment Routing Header. Adds this action now to the seg6local BPF interface, from Mathieu. 5) Do not mark dst register as unbounded in MOV64 instruction when both src and dst register are the same, from Arthur. 6) Define u_smp_rmb() and u_smp_wmb() to their respective barrier instructions on arm64 for the AF_XDP sample code, from Brian. 7) Convert the tcp_client.py and tcp_server.py BPF selftest scripts over from Python 2 to Python 3, from Jeremy. 8) Enable BTF build flags to the BPF sample code Makefile, from Taeung. 9) Remove an unnecessary rcu_read_lock() in run_lwt_bpf(), from Taehee. 10) Several improvements to the README.rst from the BPF documentation to make it more consistent with RST format, from Tobin. 11) Replace all occurrences of strerror() by calls to strerror_r() in libbpf and fix a FORTIFY_SOURCE build error along with it, from Thomas. 12) Fix a bug in bpftool's get_btf() function to correctly propagate an error via PTR_ERR(), from Yue. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07netfilter: nft_ct: add ct timeout supportHarsha Sharma
This patch allows to add, list and delete connection tracking timeout policies via nft objref infrastructure and assigning these timeout via nft rule. %./libnftnl/examples/nft-ct-timeout-add ip raw cttime tcp Ruleset: table ip raw { ct timeout cttime { protocol tcp; policy = {established: 111, close: 13 } } chain output { type filter hook output priority -300; policy accept; ct timeout set "cttime" } } %./libnftnl/examples/nft-rule-ct-timeout-add ip raw output cttime %conntrack -E [NEW] tcp 6 111 ESTABLISHED src=172.16.19.128 dst=172.16.19.1 sport=22 dport=41360 [UNREPLIED] src=172.16.19.1 dst=172.16.19.128 sport=41360 dport=22 %nft delete rule ip raw output handle <handle> %./libnftnl/examples/nft-ct-timeout-del ip raw cttime Joint work with Pablo Neira. Signed-off-by: Harsha Sharma <harshasharmaiitr@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-07netfilter: remove ifdef around cttimeout in struct nf_conntrack_l4protoPablo Neira Ayuso
Simplify this, include it inconditionally in this structure layout as we do with ctnetlink. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-07netfilter: cttimeout: decouple timeout policy from nfnetlink_cttimeout objectPablo Neira Ayuso
The timeout policy is currently embedded into the nfnetlink_cttimeout object, move the policy into an independent object. This allows us to reuse part of the existing conntrack timeout extension from nf_tables without adding dependencies with the nfnetlink_cttimeout object layout. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-07netfilter: cttimeout: move ctnl_untimeout to nf_conntrackHarsha Sharma
As, ctnl_untimeout is required by nft_ct, so move ctnl_timeout from nfnetlink_cttimeout to nf_conntrack_timeout and rename as nf_ct_timeout. Signed-off-by: Harsha Sharma <harshasharmaiitr@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-07netfilter: nft_osf: use NFT_OSF_MAXGENRELEN instead of IFNAMSIZFernando Fernandez Mancera
As no "genre" on pf.os exceed 16 bytes of length, we reduce NFT_OSF_MAXGENRELEN parameter to 16 bytes and use it instead of IFNAMSIZ. Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-08powerpc/64s: Fix page table fragment refcount race vs speculative referencesNicholas Piggin
The page table fragment allocator uses the main page refcount racily with respect to speculative references. A customer observed a BUG due to page table page refcount underflow in the fragment allocator. This can be caused by the fragment allocator set_page_count stomping on a speculative reference, and then the speculative failure handler decrements the new reference, and the underflow eventually pops when the page tables are freed. Fix this by using a dedicated field in the struct page for the page table fragment allocator. Fixes: 5c1f6ee9a31c ("powerpc: Reduce PTE table memory wastage") Cc: stable@vger.kernel.org # v3.10+ Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-07cpu/hotplug: Fix SMT supported evaluationThomas Gleixner
Josh reported that the late SMT evaluation in cpu_smt_state_init() sets cpu_smt_control to CPU_SMT_NOT_SUPPORTED in case that 'nosmt' was supplied on the kernel command line as it cannot differentiate between SMT disabled by BIOS and SMT soft disable via 'nosmt'. That wreckages the state and makes the sysfs interface unusable. Rework this so that during bringup of the non boot CPUs the availability of SMT is determined in cpu_smt_allowed(). If a newly booted CPU is not a 'primary' thread then set the local cpu_smt_available marker and evaluate this explicitely right after the initial SMP bringup has finished. SMT evaulation on x86 is a trainwreck as the firmware has all the information _before_ booting the kernel, but there is no interface to query it. Fixes: 73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS") Reported-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2018-08-06Merge branch 'ieee802154-for-davem-2018-08-06' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan-next Stefan Schmidt says: ==================== pull-request: ieee802154-next 2018-08-06 An update from ieee802154 for *net-next* Romuald added a socket option to get the LQI value of the received datagram. Alexander added a new hardware simulation driver modelled after hwsim of the wireless people. It allows runtime configuration for new nodes and edges over a netlink interface (a config utlity is making its way into wpan-tools). We also have three fixes in here. One from Colin which is more of a cleanup and two from Alex fixing tailroom and single frame space problems. I would normally put the last two into my fixes tree, but given we are already in -rc8 I simply put them here and added a cc: stable to them. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-06Merge branch 'x86/pti-urgent' into x86/ptiThomas Gleixner
Integrate the PTI Global bit fixes which conflict with the 32bit PTI support. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2018-08-06Bluetooth: btqca: Introduce HCI_EV_VENDOR and use itMarcel Holtmann
Using HCI_VENDOR_PKT for vendor specific events does work since it has also the value 0xff, but it is actually the packet type indicator constant and not the event constant. So introduce HCI_EV_VENDOR and use it. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2018-08-06vhost: switch to use new message formatJason Wang
We use to have message like: struct vhost_msg { int type; union { struct vhost_iotlb_msg iotlb; __u8 padding[64]; }; }; Unfortunately, there will be a hole of 32bit in 64bit machine because of the alignment. This leads a different formats between 32bit API and 64bit API. What's more it will break 32bit program running on 64bit machine. So fixing this by introducing a new message type with an explicit 32bit reserved field after type like: struct vhost_msg_v2 { __u32 type; __u32 reserved; union { struct vhost_iotlb_msg iotlb; __u8 padding[64]; }; }; We will have a consistent ABI after switching to use this. To enable this capability, introduce a new ioctl (VHOST_SET_BAKCEND_FEATURE) for userspace to enable this feature (VHOST_BACKEND_F_IOTLB_V2). Fixes: 6b1e6cc7855b ("vhost: new device IOTLB API") Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-06Revert "ata: ahci_platform: allow disabling of hotplug to save power"Tejun Heo
This reverts commit aece27a2f01be4bb7683790f69cd1bed3a0929a2. Causes boot failure on some devices. http://lore.kernel.org/r/CA+G9fYuKW_jCFZPqG4tz=QY9ROfHO38KiCp9XTA+KaDOFVtcqQ@mail.gmail.com Signed-off-by: Tejun Heo <tj@kernel.org>
2018-08-06locks: add tracepoint in flock codepathJeff Layton
Signed-off-by: Jeff Layton <jlayton@kernel.org>
2018-08-06KVM: X86: Implement PV IPIs in linux guestWanpeng Li
Implement paravirtual apic hooks to enable PV IPIs for KVM if the "send IPI" hypercall is available. The hypercall lets a guest send IPIs, with at most 128 destinations per hypercall in 64-bit mode and 64 vCPUs per hypercall in 32-bit mode. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>