linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2019-04-15	hwmon: Add convience macro to define simple static sensors	Charles Keepax
	It takes a fair amount of boiler plate code to add new sensors, add a macro that can be used to specify simple static sensors. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2019-04-15	scsi: libsas: Inject revalidate event for root port event	John Garry
	According to the SAS spec, an expander device shall transmit BROADCAST (CHANGE) from at least one phy in each expander port other than the expander port that is the cause for transmitting BROADCAST (CHANGE). As such, for when the link is lost for a root PHY attached to an expander PHY, we get no broadcast event. This causes an issue for libsas, in that we will not revalidate the domain for these events. As a solution, for when a root PHY is formed or deformed from a root port, insert a broadcast event to trigger a domain revalidation. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-15	scsi: libsas: Stop hardcoding SAS address length	John Garry
	Many times we use 8 for SAS address length, while we already have a macro for this - SAS_ADDR_SIZE. Replace instances of this with the macro. However, don't touch the SAS address array sizes sas.h, as these are defined according to the SAS spec. Some missing whitespaces are also added, and whitespace indentation in sas_hash_addr() is also fixed (see sas_hash_addr()). Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-15	Merge branch 'reset/meson-g12a' of git://git.pengutronix.de/pza/linux into ↵	Kevin Hilman
	v5.2/dt64 * 'reset/meson-g12a' of git://git.pengutronix.de/pza/linux: dt-bindings: reset: meson-g12a: Add missing USB2 PHY resets
2019-04-15	ntp: Audit NTP parameters adjustment	Ondrej Mosnacek
	Emit an audit record every time selected NTP parameters are modified from userspace (via adjtimex(2) or clock_adjtime(2)). These parameters may be used to indirectly change system clock, and thus their modifications should be audited. Such events will now generate records of type AUDIT_TIME_ADJNTPVAL containing the following fields: - op -- which value was adjusted: - offset -- corresponding to the time_offset variable - freq -- corresponding to the time_freq variable - status -- corresponding to the time_status variable - adjust -- corresponding to the time_adjust variable - tick -- corresponding to the tick_usec variable - tai -- corresponding to the timekeeping's TAI offset - old -- the old value - new -- the new value Example records: type=TIME_ADJNTPVAL msg=audit(1530616044.507:7): op=status old=64 new=8256 type=TIME_ADJNTPVAL msg=audit(1530616044.511:11): op=freq old=0 new=49180377088000 The records of this type will be associated with the corresponding syscall records. An overview of parameter changes that can be done via do_adjtimex() (based on information from Miroslav Lichvar) and whether they are audited: __timekeeping_set_tai_offset() -- sets the offset from the International Atomic Time (AUDITED) NTP variables: time_offset -- can adjust the clock by up to 0.5 seconds per call and also speed it up or slow down by up to about 0.05% (43 seconds per day) (AUDITED) time_freq -- can speed up or slow down by up to about 0.05% (AUDITED) time_status -- can insert/delete leap seconds and it also enables/ disables synchronization of the hardware real-time clock (AUDITED) time_maxerror, time_esterror -- change error estimates used to inform userspace applications (NOT AUDITED) time_constant -- controls the speed of the clock adjustments that are made when time_offset is set (NOT AUDITED) time_adjust -- can temporarily speed up or slow down the clock by up to 0.05% (AUDITED) tick_usec -- a more extreme version of time_freq; can speed up or slow down the clock by up to 10% (AUDITED) Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Reviewed-by: Richard Guy Briggs <rgb@redhat.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-04-15	timekeeping: Audit clock adjustments	Ondrej Mosnacek
	Emit an audit record whenever the system clock is changed (i.e. shifted by a non-zero offset) by a syscall from userspace. The syscalls than can (at the time of writing) trigger such record are: - settimeofday(2), stime(2), clock_settime(2) -- via do_settimeofday64() - adjtimex(2), clock_adjtime(2) -- via do_adjtimex() The new records have type AUDIT_TIME_INJOFFSET and contain the following fields: - sec -- the 'seconds' part of the offset - nsec -- the 'nanoseconds' part of the offset Example record (time was shifted backwards by ~15.875 seconds): type=TIME_INJOFFSET msg=audit(1530616049.652:13): sec=-16 nsec=124887145 The records of this type will be associated with the corresponding syscall records. Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Reviewed-by: Richard Guy Briggs <rgb@redhat.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> [PM: fixed a line width problem in __audit_tk_injoffset()] Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-04-15	sctp: implement memory accounting on rx path	Xin Long
	sk_forward_alloc's updating is also done on rx path, but to be consistent we change to use sk_mem_charge() in sctp_skb_set_owner_r(). In sctp_eat_data(), it's not enough to check sctp_memory_pressure only, which doesn't work for mem_cgroup_sockets_enabled, so we change to use sk_under_memory_pressure(). When it's under memory pressure, sk_mem_reclaim() and sk_rmem_schedule() should be called on both RENEGE or CHUNK DELIVERY path exit the memory pressure status as soon as possible. Note that sk_rmem_schedule() is using datalen to make things easy there. Reported-by: Matteo Croce <mcroce@redhat.com> Tested-by: Matteo Croce <mcroce@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-15	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next	David S. Miller
	Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter updates for net-next: 1) Remove the broute pseudo hook, implement this from the bridge prerouting hook instead. Now broute becomes real table in ebtables, from Florian Westphal. This also includes a size reduction patch for the bridge control buffer area via squashing boolean into bitfields and a selftest. 2) Add OS passive fingerprint version matching, from Fernando Fernandez. 3) Support for gue encapsulation for IPVS, from Jacky Hu. 4) Add support for NAT to the inet family, from Florian Westphal. This includes support for masquerade, redirect and nat extensions. 5) Skip interface lookup in flowtable, use device in the dst object. 6) Add jiffies64_to_msecs() and use it, from Li RongQing. 7) Remove unused parameter in nf_tables_set_desc_parse(), from Colin Ian King. 8) Statify several functions, patches from YueHaibing and Florian Westphal. 9) Add an optimized version of nf_inet_addr_cmp(), from Li RongQing. 10) Merge route extension to core, also from Florian. 11) Use IS_ENABLED(CONFIG_NF_NAT) instead of NF_NAT_NEEDED, from Florian. 12) Merge ip/ip6 masquerade extensions, from Florian. This includes netdevice notifier unification. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-15	platform/chrome: wilco_ec: Standardize mailbox interface	Nick Crews
	The current API for the wilco EC mailbox interface is bad. It assumes that most messages sent to the EC follow a similar structure, with a command byte in MBOX[0], followed by a junk byte, followed by actual data. This doesn't happen in several cases, such as setting the RTC time, using the raw debugfs interface, and reading or writing properties such as the Peak Shift policy (this last to be submitted soon). Similarly for the response message from the EC, the current interface assumes that the first byte of data is always 0, and the second byte is unused. However, in both setting and getting the RTC time, in the debugfs interface, and for reading and writing properties, this isn't true. The current way to resolve this is to use WILCO_EC_FLAG_RAW* flags to specify when and when not to skip these initial bytes in the sent and received message. They are confusing and used so much that they are normal, and not exceptions. In addition, the first byte of response in the debugfs interface is still always skipped, which is weird, since this raw interface should be giving the entire result. Additionally, sent messages assume the first byte is a command, and so struct wilco_ec_message contains the "command" field. In setting or getting properties however, the first byte is not a command, and so this field has to be filled with a byte that isn't actually a command. This is again inconsistent. wilco_ec_message contains a result field as well, copied from wilco_ec_response->result. The message result field should be removed: if the message fails, the cause is already logged, and the callers are alerted. They will never care about the actual state of the result flag. These flags and different cases make the wilco_ec_transfer() function, used in wilco_ec_mailbox(), really gross, dealing with a bunch of different cases. It's difficult to figure out what it is doing. Finally, making these assumptions about the structure of a message make it so that the messages do not correspond well with the specification for the EC's mailbox interface. For instance, this interface specification may say that MBOX[9] in the received message contains some information, but the calling code needs to remember that the first byte of response is always skipped, and because it didn't set the RESPONSE_RAW flag, the next byte is also skipped, so this information is actually contained within wilco_ec_message->response_data[7]. This makes it difficult to maintain this code in the future. To fix these problems this patch standardizes the mailbox interface by: - Removing the WILCO_EC_FLAG_RAW* flags - Removing the command and reserved_raw bytes from wilco_ec_request - Removing the mbox0 byte from wilco_ec_response - Simplifying wilco_ec_transfer() because of these changes - Gives the callers of wilco_ec_mailbox() the responsibility of exactly and consistently defining the structure of the mailbox request and response - Removing command and result from wilco_ec_message. This results in the reduction of total code, and makes it much more maintainable and understandable. Signed-off-by: Nick Crews <ncrews@chromium.org> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2019-04-15	PCI: endpoint: Add support to specify alignment for buffers allocated to BARs	Kishon Vijay Abraham I
	The address that is allocated using pci_epf_alloc_space() is directly written to the target address of the Inbound Address Translation unit (ie the HW component implementing inbound address decoding) on endpoint controllers. Designware IP [1] has a configuration parameter (CX_ATU_MIN_REGION_SIZE [2]) which has 64KB as default value and the lower 16 bits of the Base, Limit and Target registers of the Inbound ATU are fixed to zero. If the programmed memory address is not aligned to 64 KB boundary this causes memory corruption. Modify pci_epf_alloc_space() API to take alignment size as argument in order to allocate buffers to be mapped to BARs with an alignment that suits the platform where they are used. Add an 'align' parameter to epc_features which can be used by platform drivers to specify the BAR allocation alignment requirements and use this while invoking pci_epf_alloc_space(). [1] "I/O and MEM Match Modes" section in DesignWare Cores PCI Express Controller Databook version 4.90a [2] http://www.ti.com/lit/ug/spruid7c/spruid7c.pdf Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2019-04-15	printk: Tie printk_once / printk_deferred_once into .data.once for reset	Paul Gortmaker
	In commit b1fca27d384e ("kernel debug: support resetting WARN_ONCE") we got the opportunity to reset state on the one shot messages, without having to reboot. However printk_once (printk_deferred_once) live in a different file and didn't get the same kind of update/conversion, so they remain unconditionally one shot, until the system is rebooted. For example, we currently have: sched/rt.c: printk_deferred_once("sched: RT throttling activated\n"); ..which could reasonably be tripped as someone is testing and tuning a new system/workload and their task placements. For consistency, and to avoid reboots in the same vein as the original commit, we make these two instances of _once the same as the WARN_ONCE instances are. Link: http://lkml.kernel.org/r/1555121491-31213-1-git-send-email-paul.gortmaker@windriver.com Cc: Andi Kleen <ak@linux.intel.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com>
2019-04-15	firmware: xilinx: Add fpga API's	Nava kishore Manne
	This Patch Adds fpga API's to support the Bitstream loading by using firmware interface. Signed-off-by: Nava kishore Manne <nava.manne@xilinx.com> Reviewed-by: Moritz Fischer <mdf@kernel.org> Signed-off-by: Michal Simek <michal.simek@xilinx.com>
2019-04-15	BackMerge v5.1-rc5 into drm-next	Dave Airlie
	Need rc5 for udl fix to add udl cleanups on top. Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-04-15	netfilter: ctnetlink: don't use conntrack/expect object addresses as id	Florian Westphal
	else, we leak the addresses to userspace via ctnetlink events and dumps. Compute an ID on demand based on the immutable parts of nf_conn struct. Another advantage compared to using an address is that there is no immediate re-use of the same ID in case the conntrack entry is freed and reallocated again immediately. Fixes: 3583240249ef ("[NETFILTER]: nf_conntrack_expect: kill unique ID") Fixes: 7f85f914721f ("[NETFILTER]: nf_conntrack: kill unique ID") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-04-14	Merge branch 'page-refs' (page ref overflow)	Linus Torvalds
	Merge page ref overflow branch. Jann Horn reported that he can overflow the page ref count with sufficient memory (and a filesystem that is intentionally extremely slow). Admittedly it's not exactly easy. To have more than four billion references to a page requires a minimum of 32GB of kernel memory just for the pointers to the pages, much less any metadata to keep track of those pointers. Jann needed a total of 140GB of memory and a specially crafted filesystem that leaves all reads pending (in order to not ever free the page references and just keep adding more). Still, we have a fairly straightforward way to limit the two obvious user-controllable sources of page references: direct-IO like page references gotten through get_user_pages(), and the splice pipe page duplication. So let's just do that. * branch page-refs: fs: prevent page refcount overflow in pipe_buf_get mm: prevent get_user_pages() from overflowing page refcount mm: add 'try_get_page()' helper function mm: make page ref count overflow check tighter and more explicit
2019-04-14	fs: prevent page refcount overflow in pipe_buf_get	Matthew Wilcox
	Change pipe_buf_get() to return a bool indicating whether it succeeded in raising the refcount of the page (if the thing in the pipe is a page). This removes another mechanism for overflowing the page refcount. All callers converted to handle a failure. Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Matthew Wilcox <willy@infradead.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-14	mm: add 'try_get_page()' helper function	Linus Torvalds
	This is the same as the traditional 'get_page()' function, but instead of unconditionally incrementing the reference count of the page, it only does so if the count was "safe". It returns whether the reference count was incremented (and is marked __must_check, since the caller obviously has to be aware of it). Also like 'get_page()', you can't use this function unless you already had a reference to the page. The intent is that you can use this exactly like get_page(), but in situations where you want to limit the maximum reference count. The code currently does an unconditional WARN_ON_ONCE() if we ever hit the reference count issues (either zero or negative), as a notification that the conditional non-increment actually happened. NOTE! The count access for the "safety" check is inherently racy, but that doesn't matter since the buffer we use is basically half the range of the reference count (ie we look at the sign of the count). Acked-by: Matthew Wilcox <willy@infradead.org> Cc: Jann Horn <jannh@google.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-14	mm: make page ref count overflow check tighter and more explicit	Linus Torvalds
	We have a VM_BUG_ON() to check that the page reference count doesn't underflow (or get close to overflow) by checking the sign of the count. That's all fine, but we actually want to allow people to use a "get page ref unless it's already very high" helper function, and we want that one to use the sign of the page ref (without triggering this VM_BUG_ON). Change the VM_BUG_ON to only check for small underflows (or _very_ close to overflowing), and ignore overflows which have strayed into negative territory. Acked-by: Matthew Wilcox <willy@infradead.org> Cc: Jann Horn <jannh@google.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-14	iio: inkern: API for reading available iio channel attribute values	Artur Rojek
	Extend the inkern API with a function for reading available attribute values of iio channels. Signed-off-by: Artur Rojek <contact@artur-rojek.eu> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2019-04-13	bfq: update internal depth state when queue depth changes	Jens Axboe
	A previous commit moved the shallow depth and BFQ depth map calculations to be done at init time, moving it outside of the hotter IO path. This potentially causes hangs if the users changes the depth of the scheduler map, by writing to the 'nr_requests' sysfs file for that device. Add a blk-mq-sched hook that allows blk-mq to inform the scheduler if the depth changes, so that the scheduler can update its internal state. Tested-by: Kai Krakow <kai@kaishome.de> Reported-by: Paolo Valente <paolo.valente@linaro.org> Fixes: f0635b8a416e ("bfq: calculate shallow depths at init time") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-13	Merge tag 'for-linus-20190412' of git://git.kernel.dk/linux-block	Linus Torvalds
	Pull block fixes from Jens Axboe: "Set of fixes that should go into this round. This pull is larger than I'd like at this time, but there's really no specific reason for that. Some are fixes for issues that went into this merge window, others are not. Anyway, this contains: - Hardware queue limiting for virtio-blk/scsi (Dongli) - Multi-page bvec fixes for lightnvm pblk - Multi-bio dio error fix (Jason) - Remove the cache hint from the io_uring tool side, since we didn't move forward with that (me) - Make io_uring SETUP_SQPOLL root restricted (me) - Fix leak of page in error handling for pc requests (Jérôme) - Fix BFQ regression introduced in this merge window (Paolo) - Fix break logic for bio segment iteration (Ming) - Fix NVMe cancel request error handling (Ming) - NVMe pull request with two fixes (Christoph): - fix the initial CSN for nvme-fc (James) - handle log page offsets properly in the target (Keith)" * tag 'for-linus-20190412' of git://git.kernel.dk/linux-block: block: fix the return errno for direct IO nvmet: fix discover log page when offsets are used nvme-fc: correct csn initialization and increments on error block: do not leak memory in bio_copy_user_iov() lightnvm: pblk: fix crash in pblk_end_partial_read due to multipage bvecs nvme: cancel request synchronously blk-mq: introduce blk_mq_complete_request_sync() scsi: virtio_scsi: limit number of hw queues by nr_cpu_ids virtio-blk: limit number of hw queues by nr_cpu_ids block, bfq: fix use after free in bfq_bfqq_expire io_uring: restrict IORING_SETUP_SQPOLL to root tools/io_uring: remove IOCQE_FLAG_CACHEHIT block: don't use for-inside-for in bio_for_each_segment_all
2019-04-13	Merge tag 'nfs-for-5.1-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs	Linus Torvalds
	Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Stable fix: - Fix a deadlock in close() due to incorrect draining of RDMA queues Bugfixes: - Revert "SUNRPC: Micro-optimise when the task is known not to be sleeping" as it is causing stack overflows - Fix a regression where NFSv4 getacl and fs_locations stopped working - Forbid setting AF_INET6 to "struct sockaddr_in"->sin_family. - Fix xfstests failures due to incorrect copy_file_range() return values" * tag 'nfs-for-5.1-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: Revert "SUNRPC: Micro-optimise when the task is known not to be sleeping" NFSv4.1 fix incorrect return value in copy_file_range xprtrdma: Fix helper that drains the transport NFS: Fix handling of reply page vector NFS: Forbid setting AF_INET6 to "struct sockaddr_in"->sin_family.
2019-04-13	Merge tag 'clk-fixes-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "Here's more than a handful of clk driver fixes for changes that came in during the merge window: - Fix the AT91 sama5d2 programmable clk prescaler formula - A bunch of Amlogic meson clk driver fixes for the VPU clks - A DMI quirk for Intel's Bay Trail SoC's driver to properly mark pmc clks as critical only when really needed - Stop overwriting CLK_SET_RATE_PARENT flag in mediatek's clk gate implementation - Use the right structure to test for a frequency table in i.MX's PLL_1416x driver" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: imx: Fix PLL_1416X not rounding rates clk: mediatek: fix clk-gate flag setting platform/x86: pmc_atom: Drop __initconst on dmi table clk: x86: Add system specific quirk to mark clocks as critical clk: meson: vid-pll-div: remove warning and return 0 on invalid config clk: meson: pll: fix rounding and setting a rate that matches precisely clk: meson-g12a: fix VPU clock parents clk: meson: g12a: fix VPU clock muxes mask clk: meson-gxbb: round the vdec dividers to closest clk: at91: fix programmable clock for sama5d2
2019-04-13	CPER: Remove unnecessary use of user-space types	Bjorn Helgaas
	"__u32" and similar types are intended for things exported to user-space, including structs used in ioctls; see include/uapi/asm-generic/int-l64.h. They are not needed for the CPER struct definitions, which not exported to user-space and not used in ioctls. Replace them with the typical "u32" and similar types. No functional change intended. The reason for changing this is to remove the question of "why do we use __u32 here instead of u32?" We should use __u32 when there's a reason for it; otherwise, we should prefer u32 for consistency. Reference: Documentation/process/coding-style.rst Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Masahiro Yamada <yamada.masahiro@socionext.com> CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> CC: Andrew Morton <akpm@linux-foundation.org>
2019-04-13	CPER: Add UEFI spec references	Bjorn Helgaas
	Add UEFI spec references for CPER UUIDs and structures, fix a few typos, and remove some useless comments. No functional change intended. Link: http://www.uefi.org/specifications Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-04-13	netfilter: conntrack: don't set related state for different outer address	Florian Westphal
	Luca Moro says: ------ The issue lies in the filtering of ICMP and ICMPv6 errors that include an inner IP datagram. For these packets, icmp_error_message() extract the ICMP error and inner layer to search of a known state. If a state is found the packet is tagged as related (IP_CT_RELATED). The problem is that there is no correlation check between the inner and outer layer of the packet. So one can encapsulate an error with an inner layer matching a known state, while its outer layer is directed to a filtered host. In this case the whole packet will be tagged as related. This has various implications from a rule bypass (if a rule to related trafic is allow), to a known state oracle. Unfortunately, we could not find a real statement in a RFC on how this case should be filtered. The closest we found is RFC5927 (Section 4.3) but it is not very clear. A possible fix would be to check that the inner IP source is the same than the outer destination. We believed this kind of attack was not documented yet, so we started to write a blog post about it. You can find it attached to this mail (sorry for the extract quality). It contains more technical details, PoC and discussion about the identified behavior. We discovered later that https://www.gont.com.ar/papers/filtering-of-icmp-error-messages.pdf described a similar attack concept in 2004 but without the stateful filtering in mind. ----- This implements above suggested fix: In icmp(v6) error handler, take outer destination address, then pass that into the common function that does the "related" association. After obtaining the nf_conn of the matching inner-headers connection, check that the destination address of the opposite direction tuple is the same as the outer address and only set RELATED if thats the case. Reported-by: Luca Moro <luca.moro@synacktiv.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-04-13	ALSA: hda: Initialize ext-bus-specific fields in snd_hdac_bus_init(), too	Takashi Iwai
	Some fields in snd_hdac_bus are ext-bus specific, but they still should be initialized in snd_hdac_bus_init() for consistency, at least, for the ones that do need the explicit initialization like the list head. Also move the lock field to the more appropriate place and correct the comment to reflect the recent change where it serves for both the display power and the link management. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-04-13	Merge branch 'for-linus' into for-next	Takashi Iwai
	Back-merge the 5.1 devel branch for the further HD-audio development. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-04-12	Merge branch 'core-urgent-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core fixes from Ingo Molnar: "Fix an objtool warning plus fix a u64_to_user_ptr() macro expansion bug" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Add rewind_stack_do_exit() to the noreturn list linux/kernel.h: Use parentheses around argument in u64_to_user_ptr()
2019-04-12	fs: drop unused fput_atomic definition	Lukas Bulwahn
	commit d7065da03822 ("get rid of the magic around f_count in aio") added fput_atomic to include/linux/fs.h, motivated by its use in __aio_put_req() in fs/aio.c. Later, commit 3ffa3c0e3f6e ("aio: now fput() is OK from interrupt context; get rid of manual delayed __fput()") removed the only use of fput_atomic in __aio_put_req(), but did not remove the since then unused fput_atomic definition in include/linux/fs.h. We curate this now and finally remove the unused definition. This issue was identified during a code review due to a coccinelle warning from the atomic_as_refcounter.cocci rule pointing to the use of atomic_t in fput_atomic. Suggested-by: Krystian Radlak <kradlak@exida.com> Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-04-12	rhashtable: use BIT(0) for locking.	NeilBrown
	As reported by Guenter Roeck, the new bit-locking using BIT(1) doesn't work on the m68k architecture. m68k only requires 2-byte alignment for words and longwords, so there is only one unused bit in pointers to structs - We current use two, one for the NULLS marker at the end of the linked list, and one for the bit-lock in the head of the list. The two uses don't need to conflict as we never need the head of the list to be a NULLS marker - the marker is only needed to check if an object has moved to a different table, and the bucket head cannot move. The NULLS marker is only needed in a ->next pointer. As we already have different types for the bucket head pointer (struct rhash_lock_head) and the ->next pointers (struct rhash_head), it is fairly easy to treat the lsb differently in each. So: Initialize buckets heads to NULL, and use the lsb for locking. When loading the pointer from the bucket head, if it is NULL (ignoring the lock big), report as being the expected NULLS marker. When storing a value into a bucket head, if it is a NULLS marker, store NULL instead. And convert all places that used bit 1 for locking, to use bit 0. Fixes: 8f0db018006a ("rhashtable: use bit_spin_locks to protect hash bucket.") Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12	rhashtable: replace rht_ptr_locked() with rht_assign_locked()	NeilBrown
	The only times rht_ptr_locked() is used, it is to store a new value in a bucket-head. This is the only time it makes sense to use it too. So replace it by a function which does the whole task: Sets the lock bit and assigns to a bucket head. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12	rhashtable: move dereference inside rht_ptr()	NeilBrown
	Rather than dereferencing a pointer to a bucket and then passing the result to rht_ptr(), we now pass in the pointer and do the dereference in rht_ptr(). This requires that we pass in the tbl and hash as well to support RCU checks, and means that the various rht_for_each functions can expect a pointer that can be dereferenced without further care. There are two places where we dereference a bucket pointer where there is no testable protection - in each case we know that we much have exclusive access without having taken a lock. The previous code used rht_dereference() to pretend that holding the mutex provided protects, but holding the mutex never provides protection for accessing buckets. So instead introduce rht_ptr_exclusive() that can be used when there is known to be exclusive access without holding any locks. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12	rhashtable: reorder some inline functions and macros.	NeilBrown
	This patch only moves some code around, it doesn't change the code at all. A subsequent patch will benefit from this as it needs to add calls to functions which are now defined before the call-site, but weren't before. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12	rhashtable: fix some __rcu annotation errors	NeilBrown
	With these annotations, the rhashtable now gets no warnings when compiled with "C=1" for sparse checking. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12	scsi: target/iscsi: Handle too large immediate data buffers correctly	Bart Van Assche
	Since target_alloc_sgl() and iscsit_allocate_iovecs() allocate buffer space for se_cmd.data_length bytes and since that number can be smaller than the iSCSI Expected Data Transfer Length (EDTL), ensure that the iSCSI target driver does not attempt to receive more bytes than what fits in the receive buffer. Always receive the full immediate data buffer such that the iSCSI target driver does not attempt to parse immediate data as an iSCSI PDU. Note: the current code base only calls iscsit_get_dataout() if the size of the immediate data buffer does not exceed the buffer size derived from the SCSI CDB. See also target_cmd_size_check(). Cc: Mike Christie <mchristi@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12	scsi: target/core: Rework the SPC-2 reservation handling code	Bart Van Assche
	Instead of tracking the initiator that established an SPC-2 reservation, track the session through which the SPC-2 reservation has been established. This patch does not change any functionality. Cc: Mike Christie <mchristi@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12	scsi: scsi_transport_fc: nvme: display FC-NVMe port roles	Hannes Reinecke
	Currently the FC-NVMe driver is leverating the SCSI FC transport class to access the remote ports. Which means that all FC-NVMe remote ports will be visible to the fc transport layer, but due to missing definitions the port roles will always be 'unknown'. This patch adds the missing definitions to the fc transport class to that the port roles are correctly displayed. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: James Smart <james.smart@broadcom.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Giridhar Malavali <gmalavali@marvell.com> Reviewed-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12	rxrpc: Make rxrpc_kernel_check_life() indicate if call completed	Marc Dionne
	Make rxrpc_kernel_check_life() pass back the life counter through the argument list and return true if the call has not yet completed. Suggested-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12	Merge branch 'next-integrity-for-james' of ↵	James Morris
	git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into next-integrity From Mimi: "This pull request contains just three patches, the remainder are either included in other pull requests (eg. audit, lockdown) or will be upstreamed via other subsystems (eg. kselftests, Power). Included in this pull request is one bug fix, one documentation update, and extending the x86 IMA arch policy rules to coordinate the different kernel module signature verification methods."
2019-04-12	bpf: Introduce bpf_strtol and bpf_strtoul helpers	Andrey Ignatov
	Add bpf_strtol and bpf_strtoul to convert a string to long and unsigned long correspondingly. It's similar to user space strtol(3) and strtoul(3) with a few changes to the API: * instead of NUL-terminated C string the helpers expect buffer and buffer length; * resulting long or unsigned long is returned in a separate result-argument; * return value is used to indicate success or failure, on success number of consumed bytes is returned that can be used to identify position to read next if the buffer is expected to contain multiple integers; * instead of base argument, flags is used that provides base in 5 LSB, other bits are reserved for future use; * number of supported bases is limited. Documentation for the new helpers is provided in bpf.h UAPI. The helpers are made available to BPF_PROG_TYPE_CGROUP_SYSCTL programs to be able to convert string input to e.g. "ulongvec" output. E.g. "net/ipv4/tcp_mem" consists of three ulong integers. They can be parsed by calling to bpf_strtoul three times. Implementation notes: Implementation includes "../../lib/kstrtox.h" to reuse integer parsing functions. It's done exactly same way as fs/proc/base.c already does. Unfortunately existing kstrtoX function can't be used directly since they fail if any invalid character is present right after integer in the string. Existing simple_strtoX functions can't be used either since they're obsolete and don't handle overflow properly. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12	bpf: Introduce ARG_PTR_TO_{INT,LONG} arg types	Andrey Ignatov
	Currently the way to pass result from BPF helper to BPF program is to provide memory area defined by pointer and size: func(void , size_t). It works great for generic use-case, but for simple types, such as int, it's overkill and consumes two arguments when it could use just one. Introduce new argument types ARG_PTR_TO_INT and ARG_PTR_TO_LONG to be able to pass result from helper to program via pointer to int and long correspondingly: func(int ) or func(long ). New argument types are similar to ARG_PTR_TO_MEM with the following differences: they don't require corresponding ARG_CONST_SIZE argument, predefined access sizes are used instead (32bit for int, 64bit for long); * it's possible to use more than one such an argument in a helper; * provided pointers have to be aligned. It's easy to introduce similar ARG_PTR_TO_CHAR and ARG_PTR_TO_SHORT argument types. It's not done due to lack of use-case though. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12	bpf: Add file_pos field to bpf_sysctl ctx	Andrey Ignatov
	Add file_pos field to bpf_sysctl context to read and write sysctl file position at which sysctl is being accessed (read or written). The field can be used to e.g. override whole sysctl value on write to sysctl even when sys_write is called by user space with file_pos > 0. Or BPF program may reject such accesses. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12	bpf: Introduce bpf_sysctl_{get,set}_new_value helpers	Andrey Ignatov
	Add helpers to work with new value being written to sysctl by user space. bpf_sysctl_get_new_value() copies value being written to sysctl into provided buffer. bpf_sysctl_set_new_value() overrides new value being written by user space with a one from provided buffer. Buffer should contain string representation of the value, similar to what can be seen in /proc/sys/. Both helpers can be used only on sysctl write. File position matters and can be managed by an interface that will be introduced separately. E.g. if user space calls sys_write to a file in /proc/sys/ at file position = X, where X > 0, then the value set by bpf_sysctl_set_new_value() will be written starting from X. If program wants to override whole value with specified buffer, file position has to be set to zero. Documentation for the new helpers is provided in bpf.h UAPI. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12	bpf: Introduce bpf_sysctl_get_current_value helper	Andrey Ignatov
	Add bpf_sysctl_get_current_value() helper to copy current sysctl value into provided by BPF_PROG_TYPE_CGROUP_SYSCTL program buffer. It provides same string as user space can see by reading corresponding file in /proc/sys/, including new line, etc. Documentation for the new helper is provided in bpf.h UAPI. Since current value is kept in ctl_table->data in a parsed form, ctl_table->proc_handler() with write=0 is called to read that data and convert it to a string. Such a string can later be parsed by a program using helpers that will be introduced separately. Unfortunately it's not trivial to provide API to access parsed data due to variety of data representations (string, intvec, uintvec, ulongvec, custom structures, even NULL, etc). Instead it's assumed that user know how to handle specific sysctl they're interested in and appropriate helpers can be used. Since ctl_table->proc_handler() expects __user buffer, conversion to __user happens for kernel allocated one where the value is stored. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12	bpf: Introduce bpf_sysctl_get_name helper	Andrey Ignatov
	Add bpf_sysctl_get_name() helper to copy sysctl name (/proc/sys/ entry) into provided by BPF_PROG_TYPE_CGROUP_SYSCTL program buffer. By default full name (w/o /proc/sys/) is copied, e.g. "net/ipv4/tcp_mem". If BPF_F_SYSCTL_BASE_NAME flag is set, only base name will be copied, e.g. "tcp_mem". Documentation for the new helper is provided in bpf.h UAPI. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12	bpf: Sysctl hook	Andrey Ignatov
	Containerized applications may run as root and it may create problems for whole host. Specifically such applications may change a sysctl and affect applications in other containers. Furthermore in existing infrastructure it may not be possible to just completely disable writing to sysctl, instead such a process should be gradual with ability to log what sysctl are being changed by a container, investigate, limit the set of writable sysctl to currently used ones (so that new ones can not be changed) and eventually reduce this set to zero. The patch introduces new program type BPF_PROG_TYPE_CGROUP_SYSCTL and attach type BPF_CGROUP_SYSCTL to solve these problems on cgroup basis. New program type has access to following minimal context: struct bpf_sysctl { __u32 write; }; Where @write indicates whether sysctl is being read (= 0) or written (= 1). Helpers to access sysctl name and value will be introduced separately. BPF_CGROUP_SYSCTL attach point is added to sysctl code right before passing control to ctl_table->proc_handler so that BPF program can either allow or deny access to sysctl. Suggested-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12	block: disk_events: introduce event flags	Martin Wilck
	Currently, an empty disk->events field tells the block layer not to forward media change events to user space. This was done in commit 7c88a168da80 ("block: don't propagate unlisted DISK_EVENTs to userland") in order to avoid events from "fringe" drivers to be forwarded to user space. By doing so, the block layer lost the information which events were supported by a particular block device, and most importantly, whether or not a given device supports media change events at all. Prepare for not interpreting the "events" field this way in the future any more. This is done by adding an additional field "event_flags" to struct gendisk, and two flag bits that can be set to have the device treated like one that had the "events" field set to a non-zero value before. This applies only to the sd and sr drivers, which are changed to set the new flags. The new flags are DISK_EVENT_FLAG_POLL to enforce polling of the device for synchronous events, and DISK_EVENT_FLAG_UEVENT to tell the blocklayer to generate udev events from kernel events. In order to add the event_flags field to struct gendisk, the events field is converted to an "unsigned short"; it doesn't need to hold values bigger than 2 anyway. This patch doesn't change behavior. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-12	block: genhd: remove async_events field	Martin Wilck
	The async_events field, intended to be used for drivers that support asynchronous notifications about disk events (aka media change events), isn't currently used by any driver, and apparently that has been that way for a long time (if not forever). Remove it. Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-12	drm/panfrost: Add initial panfrost driver	Rob Herring
	This adds the initial driver for panfrost which supports Arm Mali Midgard and Bifrost family of GPUs. Currently, only the T860 and T760 Midgard GPUs have been tested. v2: - Add GPU reset on job hangs (Tomeu) - Add RuntimePM and devfreq support (Tomeu) - Fix T760 support (Tomeu) - Add a TODO file (Rob, Tomeu) - Support multiple in fences (Tomeu) - Drop support for shared fences (Tomeu) - Fill in MMU de-init (Rob) - Move register definitions back to single header (Rob) - Clean-up hardcoded job submit todos (Rob) - Implement feature setup based on features/issues (Rob) - Add remaining Midgard DT compatible strings (Rob) v3: - Add support for reset lines (Neil) - Add a MAINTAINERS entry (Rob) - Call dma_set_mask_and_coherent (Rob) - Do MMU invalidate on map and unmap. Restructure to do a single operation per map/unmap call. (Rob) - Add a missing explicit padding to struct drm_panfrost_create_bo (Rob) - Fix 0-day error: "panfrost_devfreq.c:151:9-16: ERROR: PTR_ERR applied after initialization to constant on line 150" - Drop HW_FEATURE_AARCH64_MMU conditional (Rob) - s/DRM_PANFROST_PARAM_GPU_ID/DRM_PANFROST_PARAM_GPU_PROD_ID/ (Rob) - Check drm_gem_shmem_prime_import_sg_table() error code (Rob) - Re-order power on sequence (Rob) - Move panfrost_acquire_object_fences() before scheduling job (Rob) - Add NULL checks on array pointers in job clean-up (Rob) - Rework devfreq (Tomeu) - Fix devfreq init with no regulator (Rob) - Various WS and comments clean-up (Rob) Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <maxime.ripard@bootlin.com> Cc: Sean Paul <sean@poorly.run> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Lyude Paul <lyude@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Marty E. Plummer <hanetzer@startmail.com> Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190409205427.6943-4-robh@kernel.org