linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2017-12-11	dt-bindings: fec: Make the phy-reset-gpio polarity explicit	Fabio Estevam
	The GPIO polarity passed to phy-reset-gpio is ignored by the FEC driver and it is assumed to be active low. It can be active high only when the 'phy-reset-active-high' property is present. The current examples pass active high polarity and work fine, but in order to improve the documentation make it explicit what the real polarity is. Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Acked-by: Fugang Duan <fugang.duan@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	Merge branch 'sctp-stream-interleave-part-1'	David S. Miller
	Xin Long says: ==================== sctp: Implement Stream Interleave: The I-DATA Chunk Supporting User Message Interleaving Stream Interleave would be Implemented in two Parts: 1. The I-DATA Chunk Supporting User Message Interleaving 2. Interaction with Other SCTP Extensions Overview in section 1.1 of RFC8260 for Part 1: This document describes a new chunk carrying payload data called I-DATA. This chunk incorporates the properties of the current SCTP DATA chunk, all the flags and fields except the Stream Sequence Number (SSN), and also adds two new fields in its chunk header -- the Fragment Sequence Number (FSN) and the Message Identifier (MID). The FSN is only used for reassembling all fragments that have the same MID and the same ordering property. The TSN is only used for the reliable transfer in combination with Selective Acknowledgment (SACK) chunks. In addition, the MID is also used for ensuring ordered delivery instead of using the stream sequence number (the I-DATA chunk omits an SSN). As the 1st part of Stream Interleave Implementation, this patchset adds an ops framework named sctp_stream_interleave with a bunch of stuff that does lots of things needed somewhere. Then it defines sctp_stream_interleave_0 to work for normal DATA chunks and sctp_stream_interleave_1 for I-DATA chunks. With these functions, hundreds of if-else checks for the different process on I-DATA chunks would be avoided. Besides, very few codes could be shared in these two function sets. In this patchset, it adds some basic variables, structures and socket options firstly, then implement these functions one by one to add the procedures for ordered idata gradually, at last adjusts some codes to make them work for unordered idata. To make it safe to be implemented and also not break the normal data chunk process, this feature can't be enabled to use until all stream interleave codes are completely accomplished. v1 -> v2: - fixed a checkpatch warning that a blank line was missed. - avoided a kbuild warning reported from gcc-4.9. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	sctp: add support for the process of unordered idata	Xin Long
	Unordered idata process is more complicated than unordered data: - It has to add mid into sctp_stream_out to save the next mid value, which is separated from ordered idata's. - To support pd for unordered idata, another mid and pd_mode need to be added to save the message id and pd state in sctp_stream_in. - To make unordered idata reasm easier, it adds a new event queue to save frags for idata. The patch mostly adds the samilar reasm functions for unordered idata as ordered idata's, and also adjusts some other codes on assign_mid, abort_pd and ulpevent_data for idata. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	sctp: implement abort_pd for sctp_stream_interleave	Xin Long
	abort_pd is added as a member of sctp_stream_interleave, used to abort partial delivery for data or idata, called in sctp_cmd_assoc_failed. Since stream interleave allows to do partial delivery for each stream at the same time, sctp_intl_abort_pd for idata would be very different from the old function sctp_ulpq_abort_pd for data. Note that sctp_ulpevent_make_pdapi will support per stream in this patch by adding pdapi_stream and pdapi_seq in sctp_pdapi_event, as described in section 6.1.7 of RFC6458. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	sctp: implement start_pd for sctp_stream_interleave	Xin Long
	start_pd is added as a member of sctp_stream_interleave, used to do partial_delivery for data or idata when datalen >= asoc->rwnd in sctp_eat_data. The codes have been done in last patches, but they need to be extracted into start_pd, so that it could be used for SCTP_CMD_PART_DELIVER cmd as well. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	sctp: implement renege_events for sctp_stream_interleave	Xin Long
	renege_events is added as a member of sctp_stream_interleave, used to renege some old data or idata in reasm or lobby queue properly to free some memory for the new data when there's memory stress. It defines sctp_renege_events for idata, and leaves sctp_ulpq_renege as it is for data. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	sctp: implement enqueue_event for sctp_stream_interleave	Xin Long
	enqueue_event is added as a member of sctp_stream_interleave, used to enqueue either data, idata or notification events into user socket rx queue. It replaces sctp_ulpq_tail_event used in the other places with enqueue_event. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	sctp: implement ulpevent_data for sctp_stream_interleave	Xin Long
	ulpevent_data is added as a member of sctp_stream_interleave, used to do the most process in ulpq, including to convert data or idata chunk to event, reasm them in reasm queue and put them in lobby queue in right order, and deliver them up to user sk rx queue. This procedure is described in section 2.2.3 of RFC8260. It adds most functions for idata here to do the similar process as the old functions for data. But since the details are very different between them, the old functions can not be reused for idata. event->ssn and event->ppid settings are moved to ulpevent_data from sctp_ulpevent_make_rcvmsg, so that sctp_ulpevent_make_rcvmsg could work for both data and idata. Note that mid is added in sctp_ulpevent for idata, __packed has to be used for defining sctp_ulpevent, or it would exceeds the skb cb that saves a sctp_ulpevent variable for ulp layer process. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	sctp: implement validate_data for sctp_stream_interleave	Xin Long
	validate_data is added as a member of sctp_stream_interleave, used to validate ssn/chunk type for data or mid (message id)/chunk type for idata, called in sctp_eat_data. If this check fails, an abort packet will be sent, as said in section 2.2.3 of RFC8260. It also adds the process for idata in rx path. As Marcelo pointed out, there's no need to add event table for idata, but just share chunk_event_table with data's. It would drop data chunk for idata and drop idata chunk for data by calling validate_data in sctp_eat_data. As last patch did, it also replaces sizeof(struct sctp_data_chunk) with sctp_datachk_len for rx path. After this patch, the idata can be accepted and delivered to ulp layer. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	sctp: implement assign_number for sctp_stream_interleave	Xin Long
	assign_number is added as a member of sctp_stream_interleave, used to assign ssn for data or mid (message id) for idata, called in sctp_packet_append_data. sctp_chunk_assign_ssn is left as it is, and sctp_chunk_assign_mid is added for sctp_stream_interleave_1. This procedure is described in section 2.2.2 of RFC8260. All sizeof(struct sctp_data_chunk) in tx path is replaced with sctp_datachk_len, to make it right for idata as well. And also adjust sctp_chunk_is_data for SCTP_CID_I_DATA. After this patch, idata can be built and sent in tx path. Note that if sp strm_interleave is set, it has to wait_connect in sctp_sendmsg, as asoc intl_enable need to be known after 4 shake- hands, to decide if it should use data or idata later. data and idata can't be mixed to send in one asoc. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	sctp: implement make_datafrag for sctp_stream_interleave	Xin Long
	To avoid hundreds of checks for the different process on I-DATA chunk, struct sctp_stream_interleave is defined as a group of functions used to replace the codes in some place where it needs to do different job according to if the asoc intl_enabled is set. With these ops, it only needs to initialize asoc->stream.si with sctp_stream_interleave_0 for normal data if asoc intl_enable is 0, or sctp_stream_interleave_1 for idata if asoc intl_enable is set in sctp_stream_init. After that, the members in asoc->stream.si can be used directly in some special places without checking asoc intl_enable. make_datafrag is the first member for sctp_stream_interleave, it's used to make data or idata frags, called in sctp_datamsg_from_user. The old function sctp_make_datafrag_empty needs to be adjust some to fit in this ops. Note that as idata and data chunks have different length, it also defines data_chunk_len for sctp_stream_interleave to describe the chunk size. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	sctp: add basic structures and make chunk function for idata	Xin Long
	sctp_idatahdr and sctp_idata_chunk are used to define and parse I-DATA chunk format, and sctp_make_idata is a function to build the chunk. The I-DATA Chunk Format is defined in section 2.1 of RFC8260. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	sctp: add asoc intl_enable negotiation during 4 shakehands	Xin Long
	asoc intl_enable will be set when local sp strm_interleave is set and there's I-DATA chunk in init and init_ack extensions, as said in section 2.2.1 of RFC8260. asoc intl_enable indicates all data will be sent as I-DATA chunks. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	sctp: add stream interleave enable members and sockopt	Xin Long
	This patch adds intl_enable in asoc and netns, and strm_interleave in sctp_sock to indicate if stream interleave is enabled and supported. netns intl_enable would be set via procfs, but that is not added yet until all stream interleave codes are completely implemented; asoc intl_enable will be set when doing 4-shakehands. sp strm_interleave can be set by sockopt SCTP_INTERLEAVING_SUPPORTED which is also added in this patch. This socket option is defined in section 4.3.1 of RFC8260. Note that strm_interleave can only be set by sockopt when both netns intl_enable and sp frag_interleave are set. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	net: phy: meson-gxl: detect LPA corruption	Jerome Brunet
	The purpose of this change is to fix the incorrect detection of the link partner (LP) advertised capabilities which sometimes happens with this PHY (roughly 1 time in a dozen) This issue may cause the link to be negotiated at 10Mbps/Full or 10Mbps/Half when 100MBps/Full is actually possible. In some case, the link is even completely broken and no communication is possible. To detect the corruption, we must look for a magic undocumented bit in the WOL bank (hint given by the SoC vendor kernel) but this is not enough to cover all cases. We also have to look at the LPA ack. If the LP supports Aneg but did not ack our base code when aneg is completed, we assume something went wrong. The detection of a corrupted LPA triggers a restart of the aneg process. This solves the problem but may take up to 6 retries to complete. Fixes: 7334b3e47aee ("net: phy: Add Meson GXL Internal PHY driver") Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	ipvlan: add L2 check for packets arriving via virtual devices	Mahesh Bandewar
	Packets that don't have dest mac as the mac of the master device should not be entertained by the IPvlan rx-handler. This is mostly true as the packet path mostly takes care of that, except when the master device is a virtual device. As demonstrated in the following case - ip netns add ns1 ip link add ve1 type veth peer name ve2 ip link add link ve2 name iv1 type ipvlan mode l2 ip link set dev iv1 netns ns1 ip link set ve1 up ip link set ve2 up ip -n ns1 link set iv1 up ip addr add 192.168.10.1/24 dev ve1 ip -n ns1 addr 192.168.10.2/24 dev iv1 ping -c2 192.168.10.2 <Works!> ip neigh show dev ve1 ip neigh show 192.168.10.2 lladdr <random> dev ve1 ping -c2 192.168.10.2 <Still works! Wrong!!> This patch adds that missing check in the IPvlan rx-handler. Reported-by: Amit Sikka <amit.sikka@ericsson.com> Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	arm64: mm: Fix pte_mkclean, pte_mkdirty semantics	Steve Capper
	On systems with hardware dirty bit management, the ltp madvise09 unit test fails due to dirty bit information being lost and pages being incorrectly freed. This was bisected to: arm64: Ignore hardware dirty bit updates in ptep_set_wrprotect() Reverting this commit leads to a separate problem, that the unit test retains pages that should have been dropped due to the function madvise_free_pte_range(.) not cleaning pte's properly. Currently pte_mkclean only clears the software dirty bit, thus the following code sequence can appear: pte = pte_mkclean(pte); if (pte_dirty(pte)) // this condition can return true with HW DBM! This patch also adjusts pte_mkclean to set PTE_RDONLY thus effectively clearing both the SW and HW dirty information. In order for this to function on systems without HW DBM, we need to also adjust pte_mkdirty to remove the read only bit from writable pte's to avoid infinite fault loops. Cc: <stable@vger.kernel.org> Fixes: 64c26841b349 ("arm64: Ignore hardware dirty bit updates in ptep_set_wrprotect()") Reported-by: Bhupinder Thakur <bhupinder.thakur@linaro.org> Tested-by: Bhupinder Thakur <bhupinder.thakur@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-11	arm64: Initialise high_memory global variable earlier	Steve Capper
	The high_memory global variable is used by cma_declare_contiguous(.) before it is defined. We don't notice this as we compute __pa(high_memory - 1), and it looks like we're processing a VA from the direct linear map. This problem becomes apparent when we flip the kernel virtual address space and the linear map is moved to the bottom of the kernel VA space. This patch moves the initialisation of high_memory before it used. Cc: <stable@vger.kernel.org> Fixes: f7426b983a6a ("mm: cma: adjust address limit to avoid hitting low/high memory boundary") Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-11	netfilter: ip6t_MASQUERADE: add dependency on conntrack module	Konstantin Khlebnikov
	After commit 4d3a57f23dec ("netfilter: conntrack: do not enable connection tracking unless needed") conntrack is disabled by default unless some module explicitly declares dependency in particular network namespace. Fixes: a357b3f80bc8 ("netfilter: nat: add dependencies on conntrack module") Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-12-11	netlink: convert netlink tap spinlock to mutex	Cong Wang
	Both netlink_add_tap() and netlink_remove_tap() are called in process context, no need to bother spinlock. Note, in fact, currently we always hold RTNL when calling these two functions, so we don't need any other lock at all, but keeping this lock doesn't harm anything. Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	netlink: make netlink tap per netns	Cong Wang
	nlmon device is not supposed to capture netlink events from other netns, so instead of filtering events, we can simply make netlink tap itself per netns. Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Kevin Cernekee <cernekee@chromium.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	ptr_ring: add barriers	Michael S. Tsirkin
	Users of ptr_ring expect that it's safe to give the data structure a pointer and have it be available to consumers, but that actually requires an smb_wmb or a stronger barrier. In absence of such barriers and on architectures that reorder writes, consumer might read an un=initialized value from an skb pointer stored in the skb array. This was observed causing crashes. To fix, add memory barriers. The barrier we use is a wmb, the assumption being that producers do not need to read the value so we do not need to order these reads. Reported-by: George Cherian <george.cherian@cavium.com> Suggested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	RISC-V: Remove unused CONFIG_HVC_RISCV_SBI code	Palmer Dabbelt
	This is code that probably should never have made it into the kernel in the first place: it depends on a driver that hadn't been reviewed yet. During the HVC_SBI_RISCV review process a better way of doing this was suggested, but that means this code is defunct. It's compile-time disabled in 4.15 because the driver isn't in, so I think it's safe to just remove this for now. CC: Greg KH <gregkh@linuxfoundation.org> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2017-12-11	RISC-V: Resurrect smp_mb__after_spinlock()	Palmer Dabbelt
	I removed this last week because of an incorrect comment: smp_mb__after_spinlock() is actually still used, and is necessary on RISC-V. It's been resurrected, with a comment that describes what it actually does this time. Thanks to Andrea for finding the bug! Fixes: 3343eb6806f3 ("RISC-V: Remove smb_mb__{before,after}_spinlock()") CC: Andrea Parri <parri.andrea@gmail.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2017-12-11	RISC-V: Logical vs Bitwise typo	Dan Carpenter
	In the current code, there is a ! logical NOT where a bitwise ~ NOT was intended. It means that we never return -EINVAL. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2017-12-11	workqueue: remove unneeded kallsyms include	Sergey Senozhatsky
	The filw was converted from print_symbol() to %pf some time ago (044c782ce3a901fb "workqueue: fix checkpatch issues"). kallsyms does not seem to be needed anymore. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2017-12-11	sched/core: Fix kernel-doc warnings after code movement	Randy Dunlap
	Fix the following kernel-doc warnings after code restructuring: ../kernel/sched/core.c:5113: warning: No description found for parameter 't' ../kernel/sched/core.c:5113: warning: Excess function parameter 'interval' description in 'sched_rr_get_interval' get rid of set_fs()") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: abca5fc535a3e ("sched_rr_get_interval(): move compat to native, Link: http://lkml.kernel.org/r/995c6ded-b32e-bbe4-d9f5-4d42d121aff1@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-11	Merge branch 'rhashtable-New-features-in-walk-and-bucket'	David S. Miller
	Tom Herbert says: ==================== rhashtable: New features in walk and bucket This patch contains some changes to related rhashtable: - Above allow rhashtable_walk_start to return void - Add a functon to peek at the next entry during a walk - Abstract out function to compute a has for a table - A library function to alloc a spinlocks bucket array - Call the above function for rhashtable locks allocation Tested: Exercised using various operations on an ILA xlat table. v2: - Apply feedback from Herbert. Don't change semantics of resize event reporting and -EAGAIN, just simplify API for callers that ignore those. - Add end_of_table in iter to reliably tell when the iterator has reached to the eno. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	rhashtable: Call library function alloc_bucket_locks	Tom Herbert
	To allocate the array of bucket locks for the hash table we now call library function alloc_bucket_spinlocks. This function is based on the old alloc_bucket_locks in rhashtable and should produce the same effect. Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	spinlock: Add library function to allocate spinlock buckets array	Tom Herbert
	Add two new library functions: alloc_bucket_spinlocks and free_bucket_spinlocks. These are used to allocate and free an array of spinlocks that are useful as locks for hash buckets. The interface specifies the maximum number of spinlocks in the array as well as a CPU multiplier to derive the number of spinlocks to allocate. The number allocated is rounded up to a power of two to make the array amenable to hash lookup. Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	rhashtable: abstract out function to get hash	Tom Herbert
	Split out most of rht_key_hashfn which is calculating the hash into its own function. This way the hash function can be called separately to get the hash value. Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	rhashtable: Add rhastable_walk_peek	Tom Herbert
	This function is like rhashtable_walk_next except that it only returns the current element in the inter and does not advance the iter. This patch also creates __rhashtable_walk_find_next. It finds the next element in the table when the entry cached in iter is NULL or at the end of a slot. __rhashtable_walk_find_next is called from rhashtable_walk_next and rhastable_walk_peek. end_of_table is an added field to the iter structure. This indicates that the end of table was reached (walker.tbl being NULL is not a sufficient condition for end of table). Signed-off-by: Tom Herbert <tom@quantonium.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	rhashtable: Change rhashtable_walk_start to return void	Tom Herbert
	Most callers of rhashtable_walk_start don't care about a resize event which is indicated by a return value of -EAGAIN. So calls to rhashtable_walk_start are wrapped wih code to ignore -EAGAIN. Something like this is common: ret = rhashtable_walk_start(rhiter); if (ret && ret != -EAGAIN) goto out; Since zero and -EAGAIN are the only possible return values from the function this check is pointless. The condition never evaluates to true. This patch changes rhashtable_walk_start to return void. This simplifies code for the callers that ignore -EAGAIN. For the few cases where the caller cares about the resize event, particularly where the table can be walked in mulitple parts for netlink or seq file dump, the function rhashtable_walk_start_check has been added that returns -EAGAIN on a resize event. Signed-off-by: Tom Herbert <tom@quantonium.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	rtnetlink: fix typo in GSO max segments	Stephen Hemminger
	Fixes: 46e6b992c250 ("rtnetlink: allow GSO maximums to be set on device creation") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	Merge tag 'mac80211-for-davem-2017-12-11' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Three fixes: * for certificate C file generation, don't use hexdump as it's not always installed by default, use pure posix instead (od/sed) * for certificate C file generation, don't write the file if anything fails, so the build abort will not cause a bad build upon a second attempt * fix locking in ieee80211_sta_tear_down_BA_sessions() which had been causing lots of locking warnings ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11	x86/mm/kmmio: Fix mmiotrace for page unaligned addresses	Karol Herbst
	If something calls ioremap() with an address not aligned to PAGE_SIZE, the returned address might be not aligned as well. This led to a probe registered on exactly the returned address, but the entire page was armed for mmiotracing. On calling iounmap() the address passed to unregister_kmmio_probe() was PAGE_SIZE aligned by the caller leading to a complete freeze of the machine. We should always page align addresses while (un)registerung mappings, because the mmiotracer works on top of pages, not mappings. We still keep track of the probes based on their real addresses and lengths though, because the mmiotrace still needs to know what are mapped memory regions. Also move the call to mmiotrace_iounmap() prior page aligning the address, so that all probes are unregistered properly, otherwise the kernel ends up failing memory allocations randomly after disabling the mmiotracer. Tested-by: Lyude <lyude@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Pekka Paalanen <ppaalanen@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: nouveau@lists.freedesktop.org Link: http://lkml.kernel.org/r/20171127075139.4928-1-kherbst@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-11	mm/early_ioremap: Fix boot hang with earlyprintk=efi,keep	Dave Young
	earlyprintk=efi,keep does not work any more with a warning in mm/early_ioremap.c: WARN_ON(system_state != SYSTEM_BOOTING): Boot just hangs because of the earlyprintk within the earlyprintk implementation code itself. This is caused by a new introduced middle state in: 69a78ff226fe ("init: Introduce SYSTEM_SCHEDULING state") early_ioremap() is fine in both SYSTEM_BOOTING and SYSTEM_SCHEDULING states, original condition should be updated accordingly. Signed-off-by: Dave Young <dyoung@redhat.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bp@suse.de Cc: linux-efi@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20171209041610.GA3249@dhcp-128-65.nay.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-11	ipmi_si: fix crash on parisc	Mikulas Patocka
	This patch fixes ipmi crash on parisc introduced in the kernel 4.15-rc. The pointer io.io_setup is not initialized and thus it causes crash in try_smi_init when attempting to call new_smi->io.io_setup. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Corey Minyard <cminyard@mvista.com>
2017-12-11	ipmi_si: Fix oops with PCI devices	Corey Minyard
	When the IPMI PCI code was split out, some code was consolidated for setting the io_setup field in the io structure. The PCI code needed this set before registration to probe register spacing, though, so restore the old code for that function. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=197999 Signed-off-by: Corey Minyard <cminyard@mvista.com> Tested-by: Meelis Roos <mroos@linux.ee>
2017-12-11	PM / sleep: Avoid excess pm_runtime_enable() calls in device_resume()	Rafael J. Wysocki
	Middle-layer code doing suspend-time optimizations for devices with the DPM_FLAG_SMART_SUSPEND flag set (currently, the PCI bus type and the ACPI PM domain) needs to make the core skip ->thaw_early and ->thaw callbacks for those devices in some cases and it sets the power.direct_complete flag for them for this purpose. However, it turns out that setting power.direct_complete outside of the PM core is a bad idea as it triggers an excess invocation of pm_runtime_enable() in device_resume(). For this reason, provide a helper to clear power.is_late_suspended and power.is_suspended to be invoked by the middle-layer code in question instead of setting power.direct_complete and make that code call the new helper. Fixes: c4b65157aeef (PCI / PM: Take SMART_SUSPEND driver flag into account) Fixes: 05087360fd7a (ACPI / PM: Take SMART_SUSPEND driver flag into account) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-12-11	netfilter: exthdr: add missign attributes to policy	Florian Westphal
	Add missing netlink attribute policy. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-12-11	mmc: core: apply NO_CMD23 quirk to some specific cards	Christoph Fritz
	To get an usdhc Apacer and some ATP SD cards work reliable, CMD23 needs to be disabled. This has been tested on i.MX6 (sdhci-esdhc) and rk3288 (dw_mmc-rockchip). Without this patch on i.MX6 (sdhci-esdhc): $ dd if=/dev/urandom of=/mnt/test bs=1M count=10 conv=fsync \| <mmc0: starting CMD23 arg 00000400 flags 00000015> \| mmc0: starting CMD25 arg 00a71f00 flags 000000b5 \| mmc0: blksz 512 blocks 1024 flags 00000100 tsac 3000 ms nsac 0 \| mmc0: CMD12 arg 00000000 flags 0000049d \| sdhci [sdhci_irq()]: *** mmc0 got interrupt: 0x00000001 \| mmc0: Timeout waiting for hardware interrupt. Without this patch on rk3288 (dw_mmc-rockchip): \| mmc1: Card stuck in programming state! mmcblk1 card_busy_detect \| dwmmc_rockchip ff0c0000.dwmmc: Busy; trying anyway \| mmc_host mmc1: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, \| actual 400000HZ div = 0) \| mmc1: card never left busy state \| mmc1: tried to reset card, got error -110 \| blk_update_request: I/O error, dev mmcblk1, sector 139778 \| Buffer I/O error on dev mmcblk1p1, logical block 131586, lost async \| page write Signed-off-by: Christoph Fritz <chf.fritz@googlemail.com> Cc: <stable@vger.kernel.org> # v4.14+ Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-12-11	mac80211: Add airtime account and scheduling to TXQs	Toke Høiland-Jørgensen
	This adds airtime accounting and scheduling to the mac80211 TXQ scheduler. A new hardware flag, AIRTIME_ACCOUNTING, is added that drivers can set if they support reporting airtime usage of transmissions. When this flag is set, mac80211 will expect the actual airtime usage to be reported in the tx_time and rx_time fields of the respective status structs. When airtime information is present, mac80211 will schedule TXQs (through ieee80211_next_txq()) in a way that enforces airtime fairness between active stations. This scheduling works the same way as the ath9k in-driver airtime fairness scheduling. Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-12-11	mac80211: Add TXQ scheduling API	Toke Høiland-Jørgensen
	This adds an API to mac80211 to handle scheduling of TXQs and changes the interface between driver and mac80211 for TXQ handling as follows: - The wake_tx_queue callback interface no longer includes the TXQ. Instead, the driver is expected to retrieve that from ieee80211_next_txq() - Two new mac80211 functions are added: ieee80211_next_txq() and ieee80211_schedule_txq(). The former returns the next TXQ that should be scheduled, and is how the driver gets a queue to pull packets from. The latter is called internally by mac80211 to start scheduling a queue, and the driver is supposed to call it to re-schedule the TXQ after it is finished pulling packets from it (unless the queue emptied). The ath9k and ath10k drivers are changed to use the new API. Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-12-11	nl80211: fix nl80211_send_iface() error paths	Johannes Berg
	Evidently I introduced a locking bug in my change here, the nla_put_failure sometimes needs to unlock. Fix it. Fixes: 44905265bc15 ("nl80211: don't expose wdev->ssid for most interfaces") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-12-11	crypto: af_alg - fix race accessing cipher request	Stephan Mueller
	When invoking an asynchronous cipher operation, the invocation of the callback may be performed before the subsequent operations in the initial code path are invoked. The callback deletes the cipher request data structure which implies that after the invocation of the asynchronous cipher operation, this data structure must not be accessed any more. The setting of the return code size with the request data structure must therefore be moved before the invocation of the asynchronous cipher operation. Fixes: e870456d8e7c ("crypto: algif_skcipher - overhaul memory management") Fixes: d887c52d6ae4 ("crypto: algif_aead - overhaul memory management") Reported-by: syzbot <syzkaller@googlegroups.com> Cc: <stable@vger.kernel.org> # v4.14+ Signed-off-by: Stephan Mueller <smueller@chronox.de> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-12-11	crypto: mcryptd - protect the per-CPU queue with a lock	Sebastian Andrzej Siewior
	mcryptd_enqueue_request() grabs the per-CPU queue struct and protects access to it with disabled preemption. Then it schedules a worker on the same CPU. The worker in mcryptd_queue_worker() guards access to the same per-CPU variable with disabled preemption. If we take CPU-hotplug into account then it is possible that between queue_work_on() and the actual invocation of the worker the CPU goes down and the worker will be scheduled on _another_ CPU. And here the preempt_disable() protection does not work anymore. The easiest thing is to add a spin_lock() to guard access to the list. Another detail: mcryptd_queue_worker() is not processing more than MCRYPTD_BATCH invocation in a row. If there are still items left, then it will invoke queue_work() to proceed with more later. I would suggest to simply drop that check because it does not use a system workqueue and the workqueue is already marked as "CPU_INTENSIVE". And if preemption is required then the scheduler should do it. However if queue_work() is used then the work item is marked as CPU unbound. That means it will try to run on the local CPU but it may run on another CPU as well. Especially with CONFIG_DEBUG_WQ_FORCE_RR_CPU=y. Again, the preempt_disable() won't work here but lock which was introduced will help. In order to keep work-item on the local CPU (and avoid RR) I changed it to queue_work_on(). Cc: stable@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-12-11	crypto: af_alg - wait for data at beginning of recvmsg	Stephan Mueller
	The wait for data is a non-atomic operation that can sleep and therefore potentially release the socket lock. The release of the socket lock allows another thread to modify the context data structure. The waiting operation for new data therefore must be called at the beginning of recvmsg. This prevents a race condition where checks of the members of the context data structure are performed by recvmsg while there is a potential for modification of these values. Fixes: e870456d8e7c ("crypto: algif_skcipher - overhaul memory management") Fixes: d887c52d6ae4 ("crypto: algif_aead - overhaul memory management") Reported-by: syzbot <syzkaller@googlegroups.com> Cc: <stable@vger.kernel.org> # v4.14+ Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-12-11	crypto: skcipher - set walk.iv for zero-length inputs	Eric Biggers
	All the ChaCha20 algorithms as well as the ARM bit-sliced AES-XTS algorithms call skcipher_walk_virt(), then access the IV (walk.iv) before checking whether any bytes need to be processed (walk.nbytes). But if the input is empty, then skcipher_walk_virt() doesn't set the IV, and the algorithms crash trying to use the uninitialized IV pointer. Fix it by setting the IV earlier in skcipher_walk_virt(). Also fix it for the AEAD walk functions. This isn't a perfect solution because we can't actually align the IV to ->cra_alignmask unless there are bytes to process, for one because the temporary buffer for the aligned IV is freed by skcipher_walk_done(), which is only called when there are bytes to process. Thus, algorithms that require aligned IVs will still need to avoid accessing the IV when walk.nbytes == 0. Still, many algorithms/architectures are fine with IVs having any alignment, and even for those that aren't, a misaligned pointer bug is much less severe than an uninitialized pointer bug. This change also matches the behavior of the older blkcipher_walk API. Fixes: 0cabf2af6f5a ("crypto: skcipher - Fix crash on zero-length input") Reported-by: syzbot <syzkaller@googlegroups.com> Cc: <stable@vger.kernel.org> # v4.14+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-12-11	mac80211: Add MIC space only for TX key option	David Spinadel
	Add a key flag to indicates that the device only needs MIC space and not a real MIC. In such cases, keep the MIC zeroed for ease of debug. Signed-off-by: David Spinadel <david.spinadel@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>