summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2018-07-23net: sched: cls_flower: propagate chain teplate creation and destruction to ↵Jiri Pirko
drivers Introduce a couple of flower offload commands in order to propagate template creation/destruction events down to device drivers. Drivers may use this information to prepare HW in an optimal way for future filter insertions. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23net: sched: introduce chain templatesJiri Pirko
Allow user to set a template for newly created chains. Template lock down the chain for particular classifier type/options combinations. The classifier needs to support templates, otherwise kernel would reply with error. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23net: sched: introduce chain object to uapiJiri Pirko
Allow user to create, destroy, get and dump chain objects. Do that by extending rtnl commands by the chain-specific ones. User will now be able to explicitly create or destroy chains (so far this was done only automatically according the filter/act needs and refcounting). Also, the user will receive notification about any chain creation or destuction. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23net: sched: Avoid implicit chain 0 creationJiri Pirko
Currently, chain 0 is implicitly created during block creation. However that does not align with chain object exposure, creation and destruction api introduced later on. So make the chain 0 behave the same way as any other chain and only create it when it is needed. Since chain 0 is somehow special as the qdiscs need to hold pointer to the first chain tp, this requires to move the chain head change callback infra to the block structure. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23PCI: hotplug: Demidlayer registration with the coreLukas Wunner
When a hotplug driver calls pci_hp_register(), all steps necessary for registration are carried out in one go, including creation of a kobject and addition to sysfs. That's a problem for pciehp once it's converted to enable/disable the slot exclusively from the IRQ thread: The thread needs to be spawned after creation of the kobject (because it uses the kobject's name), but before addition to sysfs (because it will handle enable/disable requests submitted via sysfs). pci_hp_deregister() does offer a ->release callback that's invoked after deletion from sysfs and before destruction of the kobject. But because pci_hp_register() doesn't offer a counterpart, hotplug drivers' ->probe and ->remove code becomes asymmetric, which is error prone as recently discovered use-after-free bugs in pciehp's ->remove hook have shown. In a sense, this appears to be a case of the midlayer antipattern: "The core thesis of the "midlayer mistake" is that midlayers are bad and should not exist. That common functionality which it is so tempting to put in a midlayer should instead be provided as library routines which can [be] used, augmented, or ignored by each bottom level driver independently. Thus every subsystem that supports multiple implementations (or drivers) should provide a very thin top layer which calls directly into the bottom layer drivers, and a rich library of support code that eases the implementation of those drivers. This library is available to, but not forced upon, those drivers." -- Neil Brown (2009), https://lwn.net/Articles/336262/ The presence of midlayer traits in the PCI hotplug core might be ascribed to its age: When it was introduced in February 2002, the blessings of a library approach might not have been well known: https://git.kernel.org/tglx/history/c/a8a2069f432c For comparison, the driver core does offer split functions for creating a kobject (device_initialize()) and addition to sysfs (device_add()) as an alternative to carrying out everything at once (device_register()). This was introduced in October 2002: https://git.kernel.org/tglx/history/c/8b290eb19962 The odd ->release callback in the PCI hotplug core was added in 2003: https://git.kernel.org/tglx/history/c/69f8d663b595 Clearly, a library approach would not force every hotplug driver to implement a ->release callback, but rather allow the driver to remove the sysfs files, release its data structures and finally destroy the kobject. Alternatively, a driver may choose to remove everything with pci_hp_deregister(), then release its data structures. To this end, offer drivers pci_hp_initialize() and pci_hp_add() as a split-up version of pci_hp_register(). Likewise, offer pci_hp_del() and pci_hp_destroy() as a split-up version of pci_hp_deregister(). Eliminate the ->release callback and move its code into each driver's teardown routine. Declare pci_hp_deregister() void, in keeping with the usual kernel pattern that enablement can fail, but disablement cannot. It only returned an error if the caller passed in a NULL pointer or a slot which has never or is no longer registered or is sharing its name with another slot. Those would be bugs, so WARN about them. Few hotplug drivers actually checked the return value and those that did only printed a useless error message to dmesg. Remove that. For most drivers the conversion was straightforward since it doesn't matter whether the code in the ->release callback is executed before or after destruction of the kobject. But in the case of ibmphp, it was unclear to me whether setting slot_cur->ctrl and slot_cur->bus_on to NULL needs to happen before the kobject is destroyed, so I erred on the side of caution and ensured that the order stays the same. Another nontrivial case is pnv_php, I've found the list and kref logic difficult to understand, however my impression was that it is safe to delete the list element and drop the references until after the kobject is destroyed. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> # drivers/platform/x86 Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Scott Murray <scott@spiteful.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Gavin Shan <gwshan@linux.vnet.ibm.com> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Corentin Chary <corentin.chary@gmail.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Andy Shevchenko <andy@infradead.org>
2018-07-23net/mlx5: FW tracer, events handlingFeras Daoud
The tracer has one event, event 0x26, with two subtypes: - Subtype 0: Ownership change - Subtype 1: Traces available An ownership change occurs in the following cases: 1- Owner releases his ownership, in this case, an event will be sent to inform others to reattempt acquire ownership. 2- Ownership was taken by a higher priority tool, in this case the owner should understand that it lost ownership, and go through tear down flow. The second subtype indicates that there are traces in the trace buffer, in this case, the driver polls the tracer buffer for new traces, parse them and prepares the messages for printing. The HW starts tracing from the first address in the tracer buffer. Driver receives an event notifying that new trace block exists. HW posts a timestamp event at the last 8B of every 256B block. Comparing the timestamp to the last handled timestamp would indicate that this is a new trace block. Once the new timestamp is detected, the entire block is considered valid. Block validation and parsing, should be done after copying the current block to a different location, in order to avoid block overwritten during processing. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23net/mlx5: FW tracer, implement tracer logicFeras Daoud
Implement FW tracer logic and registers access, initialization and cleanup flows. Initializing the tracer will be part of load one flow, as multiple PFs will try to acquire ownership but only one will succeed and will be the tracer owner. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23Merge branch 'mlx5-next' of ↵Saeed Mahameed
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux mlx5 core infrastructure updates and fixes. From Eran: - Add MPEGC (Management PCIe General Configuration) registers and btis - Fix tristate and description for MLX5 module rom Feras: - Add hardware structures for the firmware tracer From Jainbo: - Core support for double vlan push/pop steering action From Max: - Add XRQ commands definitions From Noa: - Add missing SET_DRIVER_VERSION command translation From Roi: - Use ERR_CAST() instead of coding it From Tariq: - Better return types for CQE API Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23IB/uverbs: Move ib_access_flags and ib_read_counters_flags to uapiJason Gunthorpe
These constants are used in the ioctl interface so they are part of the uapi, place them in the correct header for clarity. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
2018-07-23ipv6: use fib6_info_hold_safe() when necessaryWei Wang
In the code path where only rcu read lock is held, e.g. in the route lookup code path, it is not safe to directly call fib6_info_hold() because the fib6_info may already have been deleted but still exists in the rcu grace period. Holding reference to it could cause double free and crash the kernel. This patch adds a new function fib6_info_hold_safe() and replace fib6_info_hold() in all necessary places. Syzbot reported 3 crash traces because of this. One of them is: 8021q: adding VLAN 0 to HW filter on device team0 IPv6: ADDRCONF(NETDEV_CHANGE): team0: link becomes ready dst_release: dst:(____ptrval____) refcnt:-1 dst_release: dst:(____ptrval____) refcnt:-2 WARNING: CPU: 1 PID: 4845 at include/net/dst.h:239 dst_hold include/net/dst.h:239 [inline] WARNING: CPU: 1 PID: 4845 at include/net/dst.h:239 ip6_setup_cork+0xd66/0x1830 net/ipv6/ip6_output.c:1204 dst_release: dst:(____ptrval____) refcnt:-1 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 4845 Comm: syz-executor493 Not tainted 4.18.0-rc3+ #10 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113 panic+0x238/0x4e7 kernel/panic.c:184 dst_release: dst:(____ptrval____) refcnt:-2 dst_release: dst:(____ptrval____) refcnt:-3 __warn.cold.8+0x163/0x1ba kernel/panic.c:536 dst_release: dst:(____ptrval____) refcnt:-4 report_bug+0x252/0x2d0 lib/bug.c:186 fixup_bug arch/x86/kernel/traps.c:178 [inline] do_error_trap+0x1fc/0x4d0 arch/x86/kernel/traps.c:296 dst_release: dst:(____ptrval____) refcnt:-5 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:316 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:992 RIP: 0010:dst_hold include/net/dst.h:239 [inline] RIP: 0010:ip6_setup_cork+0xd66/0x1830 net/ipv6/ip6_output.c:1204 Code: c1 ed 03 89 9d 18 ff ff ff 48 b8 00 00 00 00 00 fc ff df 41 c6 44 05 00 f8 e9 2d 01 00 00 4c 8b a5 c8 fe ff ff e8 1a f6 e6 fa <0f> 0b e9 6a fc ff ff e8 0e f6 e6 fa 48 8b 85 d0 fe ff ff 48 8d 78 RSP: 0018:ffff8801a8fcf178 EFLAGS: 00010293 RAX: ffff8801a8eba5c0 RBX: 0000000000000000 RCX: ffffffff869511e6 RDX: 0000000000000000 RSI: ffffffff869515b6 RDI: 0000000000000005 RBP: ffff8801a8fcf2c8 R08: ffff8801a8eba5c0 R09: ffffed0035ac8338 R10: ffffed0035ac8338 R11: ffff8801ad6419c3 R12: ffff8801a8fcf720 R13: ffff8801a8fcf6a0 R14: ffff8801ad6419c0 R15: ffff8801ad641980 ip6_make_skb+0x2c8/0x600 net/ipv6/ip6_output.c:1768 udpv6_sendmsg+0x2c90/0x35f0 net/ipv6/udp.c:1376 inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798 sock_sendmsg_nosec net/socket.c:641 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:651 ___sys_sendmsg+0x51d/0x930 net/socket.c:2125 __sys_sendmmsg+0x240/0x6f0 net/socket.c:2220 __do_sys_sendmmsg net/socket.c:2249 [inline] __se_sys_sendmmsg net/socket.c:2246 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2246 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x446ba9 Code: e8 cc bb 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fb39a469da8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133 RAX: ffffffffffffffda RBX: 00000000006dcc54 RCX: 0000000000446ba9 RDX: 00000000000000b8 RSI: 0000000020001b00 RDI: 0000000000000003 RBP: 00000000006dcc50 R08: 00007fb39a46a700 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 45c828efc7a64843 R13: e6eeb815b9d8a477 R14: 5068caf6f713c6fc R15: 0000000000000001 Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: disabled Rebooting in 86400 seconds.. Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes") Reported-by: syzbot+902e2a1bcd4f7808cef5@syzkaller.appspotmail.com Reported-by: syzbot+8ae62d67f647abeeceb9@syzkaller.appspotmail.com Reported-by: syzbot+3f08feb14086930677d0@syzkaller.appspotmail.com Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23iio: Add modifier for DUV lightMaxime Roussin-Bélanger
Signed-off-by: Maxime Roussin-Bélanger <maxime.roussinbelanger@gmail.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2018-07-23net/smc: provide smc mode in smc_diag.cKarsten Graul
Rename field diag_fallback into diag_mode and set the smc mode of a connection explicitly. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23mm, memory_failure: Teach memory_failure() about dev_pagemap pagesDan Williams
mce: Uncorrected hardware memory error in user-access at af34214200 {1}[Hardware Error]: It has been corrected by h/w and requires no further action mce: [Hardware Error]: Machine check events logged {1}[Hardware Error]: event severity: corrected Memory failure: 0xaf34214: reserved kernel page still referenced by 1 users [..] Memory failure: 0xaf34214: recovery action for reserved kernel page: Failed mce: Memory error not recovered In contrast to typical memory, dev_pagemap pages may be dax mapped. With dax there is no possibility to map in another page dynamically since dax establishes 1:1 physical address to file offset associations. Also dev_pagemap pages associated with NVDIMM / persistent memory devices can internal remap/repair addresses with poison. While memory_failure() assumes that it can discard typical poisoned pages and keep them unmapped indefinitely, dev_pagemap pages may be returned to service after the error is cleared. Teach memory_failure() to detect and handle MEMORY_DEVICE_HOST dev_pagemap pages that have poison consumed by userspace. Mark the memory as UC instead of unmapping it completely to allow ongoing access via the device driver (nd_pmem). Later, nd_pmem will grow support for marking the page back to WB when the error is cleared. Cc: Jan Kara <jack@suse.cz> Cc: Christoph Hellwig <hch@lst.de> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2018-07-23filesystem-dax: Introduce dax_lock_mapping_entry()Dan Williams
In preparation for implementing support for memory poison (media error) handling via dax mappings, implement a lock_page() equivalent. Poison error handling requires rmap and needs guarantees that the page->mapping association is maintained / valid (inode not freed) for the duration of the lookup. In the device-dax case it is sufficient to simply hold a dev_pagemap reference. In the filesystem-dax case we need to use the entry lock. Export the entry lock via dax_lock_mapping_entry() that uses rcu_read_lock() to protect against the inode being freed, and revalidates the page->mapping association under xa_lock(). Cc: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2018-07-23Merge branch 'i2c/smbus_xfer_unlock-immutable' of ↵Mark Brown
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux into regmap-4.19 for sccb dependency
2018-07-23net: bridge: add support for backup portNikolay Aleksandrov
This patch adds a new port attribute - IFLA_BRPORT_BACKUP_PORT, which allows to set a backup port to be used for known unicast traffic if the port has gone carrier down. The backup pointer is rcu protected and set only under RTNL, a counter is maintained so when deleting a port we know how many other ports reference it as a backup and we remove it from all. Also the pointer is in the first cache line which is hot at the time of the check and thus in the common case we only add one more test. The backup port will be used only for the non-flooding case since it's a part of the bridge and the flooded packets will be forwarded to it anyway. To remove the forwarding just send a 0/non-existing backup port. This is used to avoid numerous scalability problems when using MLAG most notably if we have thousands of fdbs one would need to change all of them on port carrier going down which takes too long and causes a storm of fdb notifications (and again when the port comes back up). In a Multi-chassis Link Aggregation setup usually hosts are connected to two different switches which act as a single logical switch. Those switches usually have a control and backup link between them called peerlink which might be used for communication in case a host loses connectivity to one of them. We need a fast way to failover in case a host port goes down and currently none of the solutions (like bond) cannot fulfill the requirements because the participating ports are actually the "master" devices and must have the same peerlink as their backup interface and at the same time all of them must participate in the bridge device. As Roopa noted it's normal practice in routing called fast re-route where a precalculated backup path is used when the main one is down. Another use case of this is with EVPN, having a single vxlan device which is backup of every port. Due to the nature of master devices it's not currently possible to use one device as a backup for many and still have all of them participate in the bridge (which is master itself). More detailed information about MLAG is available at the link below. https://docs.cumulusnetworks.com/display/DOCS/Multi-Chassis+Link+Aggregation+-+MLAG Further explanation and a diagram by Roopa: Two switches acting in a MLAG pair are connected by the peerlink interface which is a bridge port. the config on one of the switches looks like the below. The other switch also has a similar config. eth0 is connected to one port on the server. And the server is connected to both switches. br0 -- team0---eth0 | -- switch-peerlink Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23Documentation: document ktime_get_*() APIsArnd Bergmann
As Dave Chinner points out, we don't have a proper documentation for the ktime_get() family of interfaces, making it rather unclear which of the over 30 (!) interfaces one should actually use in a driver or elsewhere in the kernel. I wrote up an explanation from how I personally see the interfaces, documenting what each of the functions do and hopefully making it a bit clearer which should be used where. This is the first time I tried writing .rst format documentation, so in addition to any mistakes in the content, I probably also introduce nonstandard formatting ;-) I first tried to add an extra section to Documentation/timers/timekeeping.txt, but this is currently not included in the generated API, and it seems useful to have the API docs as part of what gets generated in https://www.kernel.org/doc/html/latest/core-api/index.html#core-utilities instead, so I started a new file there. I also considered adding the documentation inline in the include/linux/timekeeping.h header, but couldn't figure out how to do that in a way that would result both in helpful inline comments as well as readable html output, so I settled for the latter, with a small note pointing to it from the header. Cc: Dave Chinner <david@fromorbit.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Stephen Boyd <sboyd@kernel.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2018-07-23ACPI: property: Make the ACPI graph API privateSakari Ailus
The fwnode graph API is preferred over the ACPI graph API. Therefore make the ACPI graph API private, and use it as a back-end for the fwnode graph API only. Unused functionality is removed while the functionality actually used remains the same. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-07-23ACPI: Convert ACPI reference args to generic fwnode reference argsSakari Ailus
Convert all users of struct acpi_reference_args to more generic fwnode_reference_args. This will 1) avoid an ACPI specific references to device nodes with integer arguments as well as 2) allow making references to nodes other than device nodes in ACPI. As a by-product, convert the fwnode interger arguments to u64. The arguments were 64-bit integers on ACPI but the fwnode arguments were just 32-bit. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-07-23HID: cougar: make compare_device_paths reusableDaniel M. Lambea
The function compare_device_paths from wacom_sys.c is generic and useful for other drivers. Move the function to hid-core and rename it as hid_compare_device_paths. Signed-off-by: Daniel M. Lambea <dmlambea@gmail.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-07-23nvme.h: resync with nvme-cliRevanth Rajashekar
Added some feature ids present in nvme-cli but not kernel. Signed-off-by: Revanth Rajashekar <revanth.rajashekar@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-23fsi: master-ast-cf: Add new FSI master using Aspeed ColdFireBenjamin Herrenschmidt
The Aspeed AST2x00 can contain a ColdFire v1 coprocessor which is currently unused on OpenPower systems. This adds an alternative to the fsi-master-gpio driver that uses that coprocessor instead of bit banging from the ARM core itself. The end result is about 4 times faster. The firmware for the coprocessor and its source code can be found at https://github.com/ozbenh/cf-fsi and is system specific. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2018-07-23devres: Add devm_of_iomap()Benjamin Herrenschmidt
There are still quite a few cases where a device might want to get to a different node of the device-tree, obtain the resources and map them. We have of_iomap() and of_io_request_and_map() but they both have shortcomings, such as not returning the size of the resource found (which can be useful) and not being "managed". This adds a devm_of_iomap() that provides all of these and should probably replace uses of the above in most drivers. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Joel Stanley <joel@jms.id.au>
2018-07-23Merge remote-tracking branch 'gpio/ib-aspeed' into upstream-readyBenjamin Herrenschmidt
Merge the GPIO tree "ib-aspeed" topic branch which contains pre-requisites for subsequent changes. This branch is also in gpio "next".
2018-07-22Merge tag 'ds2760-for-v4.19-signed' into psy-nextSebastian Reichel
Immutable branch for moving ds2760 driver from w1 to power supply Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
2018-07-22Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull vfs fixes from Al Viro: "Fix several places that screw up cleanups after failures halfway through opening a file (one open-coding filp_clone_open() and getting it wrong, two misusing alloc_file()). That part is -stable fodder from the 'work.open' branch. And Christoph's regression fix for uapi breakage in aio series; include/uapi/linux/aio_abi.h shouldn't be pulling in the kernel definition of sigset_t, the reason for doing so in the first place had been bogus - there's no need to expose struct __aio_sigset in aio_abi.h at all" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: aio: don't expose __aio_sigset in uapi ocxlflash_getfile(): fix double-iput() on alloc_file() failures cxl_getfile(): fix double-iput() on alloc_file() failures drm_mode_create_lease_ioctl(): fix open-coded filp_clone_open()
2018-07-22alpha: fix osf_wait4() breakageAl Viro
kernel_wait4() expects a userland address for status - it's only rusage that goes as a kernel one (and needs a copyout afterwards) [ Also, fix the prototype of kernel_wait4() to have that __user annotation - Linus ] Fixes: 92ebce5ac55d ("osf_wait4: switch to kernel_wait4()") Cc: stable@kernel.org # v4.13+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-22nfp: bring back support for offloading shared blocksJakub Kicinski
Now that we have offload replay infrastructure added by commit 326367427cc0 ("net: sched: call reoffload op on block callback reg") and flows are guaranteed to be removed correctly, we can revert commit 951a8ee6def3 ("nfp: reject binding to shared blocks"). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-22efi: Deduplicate efi_open_volume()Lukas Wunner
There's one ARM, one x86_32 and one x86_64 version of efi_open_volume() which can be folded into a single shared version by masking their differences with the efi_call_proto() macro introduced by commit: 3552fdf29f01 ("efi: Allow bitness-agnostic protocol calls"). To be able to dereference the device_handle attribute from the efi_loaded_image_t table in an arch- and bitness-agnostic manner, introduce the efi_table_attr() macro (which already exists for x86) to arm and arm64. No functional change intended. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20180720014726.24031-7-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-07-21net/ipv6: Fix linklocal to global address with VRFDavid Ahern
Example setup: host: ip -6 addr add dev eth1 2001:db8:104::4 where eth1 is enslaved to a VRF switch: ip -6 ro add 2001:db8:104::4/128 dev br1 where br1 only has an LLA ping6 2001:db8:104::4 ssh 2001:db8:104::4 (NOTE: UDP works fine if the PKTINFO has the address set to the global address and ifindex is set to the index of eth1 with a destination an LLA). For ICMP, icmp6_iif needs to be updated to check if skb->dev is an L3 master. If it is then return the ifindex from rt6i_idev similar to what is done for loopback. For TCP, restore the original tcp_v6_iif definition which is needed in most places and add a new tcp_v6_iif_l3_slave that considers the l3_slave variability. This latter check is only needed for socket lookups. Fixes: 9ff74384600a ("net: vrf: Handle ipv6 multicast and link-local addresses") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21bpfilter: Fix mismatch in function argument typesYueHaibing
Fix following warning: net/ipv4/bpfilter/sockopt.c:28:5: error: symbol 'bpfilter_ip_set_sockopt' redeclared with different type net/ipv4/bpfilter/sockopt.c:34:5: error: symbol 'bpfilter_ip_get_sockopt' redeclared with different type Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21mm: make vm_area_alloc() initialize core fieldsLinus Torvalds
Like vm_area_dup(), it initializes the anon_vma_chain head, and the basic mm pointer. The rest of the fields end up being different for different users, although the plan is to also initialize the 'vm_ops' field to a dummy entry. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-21Merge tag 'reset-for-4.19' of git://git.pengutronix.de/git/pza/linux into ↵Olof Johansson
next/drivers Reset controller changes for v4.19 This adds new drivers and bindings for the SDM845 AOSS (always on subsystem) reset controller and for the Uniphier USB3 core reset. SPI controller resets are added to the Uniphier reset driver. * tag 'reset-for-4.19' of git://git.pengutronix.de/git/pza/linux: reset: uniphier: add reset control support for SPI reset: uniphier: add USB3 core reset control dt-bindings: reset: uniphier: add USB3 core reset support reset: simple: export reset_simple_ops to be referred from modules reset: qcom: AOSS (always on subsystem) reset controller dt-bindings: reset: Add AOSS reset bindings for SDM845 SoCs Signed-off-by: Olof Johansson <olof@lixom.net>
2018-07-21Merge tag 'imx-dt-clkdep-4.19' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into next/dt i.MX device tree changes with clock dependency: - Add clock for i.MX6UL GPIO blocks * tag 'imx-dt-clkdep-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: ARM: dts: imx6ul: add GPIO clocks clk: imx6ul: remove clks_init_on array clk: imx6ul: add GPIO clock gates dt-bindings: clock: imx6ul: Do not change the clock definition order Signed-off-by: Olof Johansson <olof@lixom.net>
2018-07-21Merge tag 'imx-soc-4.19' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into next/soc i.MX SoC update for 4.19: - A series from Anson Huang to add power management for i.MX6SLL, including standby and mem mode suspend, cpuidle support, and bus clock auto gating function, etc. - A couple of fix-ups on i.MX6SLL cpuidle random build issues. - A couple of cleanups on stale EPIT timer initialization and RNGA platform device registration function. - Configure i.MX51 SoC M4IF to avoid visual artifacts during video playback. - Set up i.MX51 and i.MX53 DBGEN bit of ARM_GPC register, so that clocks within the debug system can be activated. - Add a Cortex-M4 platform support which will be useful for running a Linux instance on Cortex-M4 core integrated in i.MX7D SoC. - Flag of_iomap failure in imx_aips_allow_unprivileged_access() function by giving a warning in there. * tag 'imx-soc-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: ARM: mx5: Set the DBGEN bit in ARM_GPC register ARM: imx51: Configure M4IF to avoid visual artifacts ARM: imx: call imx6sx_cpuidle_init() conditionally for 6sll ARM: imx: fix i.MX6SLL build ARM: imx: flag failure of of_iomap ARM: i.MX31: remove rnga registration as a platform device ARM: imx: Provide support for NXP i.MX7D Cortex-M4 ARM: imx: enable bus auto clock gating function for i.mx6sll ARM: imx: remove i.MX6SLL support in i.MX6SL cpu idle driver ARM: imx: add cpu idle support for i.MX6SLL ARM: imx: add L2 page power control for GPC ARM: imx: add mem mode suspend for i.MX6SLL ARM: imx: add standby mode suspend for i.MX6SLL ARM: imx: remove inexistant EPIT timer init Signed-off-by: Olof Johansson <olof@lixom.net>
2018-07-21Merge tag 'at91-ab-4.19-soc' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux into next/soc AT91 SoC for 4.19: - New low power mode for sama5d2: ULP1 * tag 'at91-ab-4.19-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: ARM: at91: pm: configure wakeup sources for ULP1 mode ARM: at91: pm: add PMC fast startup registers defines ARM: at91: pm: Add ULP1 mode support ARM: at91: pm: Use ULP0 naming instead of slow clock MAINTAINERS: Remove the AT91 clk driver entry Signed-off-by: Olof Johansson <olof@lixom.net>
2018-07-21mm: use helper functions for allocating and freeing vm_area structsLinus Torvalds
The vm_area_struct is one of the most fundamental memory management objects, but the management of it is entirely open-coded evertwhere, ranging from allocation and freeing (using kmem_cache_[z]alloc and kmem_cache_free) to initializing all the fields. We want to unify this in order to end up having some unified initialization of the vmas, and the first step to this is to at least have basic allocation functions. Right now those functions are literally just wrappers around the kmem_cache_*() calls. This is a purely mechanical conversion: # new vma: kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL) -> vm_area_alloc() # copy old vma kmem_cache_alloc(vm_area_cachep, GFP_KERNEL) -> vm_area_dup(old) # free vma kmem_cache_free(vm_area_cachep, vma) -> vm_area_free(vma) to the point where the old vma passed in to the vm_area_dup() function isn't even used yet (because I've left all the old manual initialization alone). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-21firmware: qcom: scm: add a dummy qcom_scm_assign_mem()Niklas Cassel
Add a dummy qcom_scm_assign_mem() to enable building drivers when CONFIG_COMPILE_TEST=y && CONFIG_QCOM_SCM=n. All other qcom_scm_* functions already have a dummy version. Signed-off-by: Niklas Cassel <niklas.cassel@linaro.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Andy Gross <andy.gross@linaro.org>
2018-07-21drivers: qcom: rpmh: add support for batch RPMH requestLina Iyer
Platform drivers need make a lot of resource state requests at the same time, say, at the start or end of an usecase. It can be quite inefficient to send each request separately. Instead they can give the RPMH library a batch of requests to be sent and wait on the whole transaction to be complete. rpmh_write_batch() is a blocking call that can be used to send multiple RPMH command sets. Each RPMH command set is set asynchronously and the API blocks until all the command sets are complete and receive their tx_done callbacks. Signed-off-by: Lina Iyer <ilina@codeaurora.org> Signed-off-by: Raju P.L.S.S.S.N <rplsssn@codeaurora.org> Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Andy Gross <andy.gross@linaro.org>
2018-07-21drivers: qcom: rpmh: allow requests to be sent asynchronouslyLina Iyer
Platform drivers that want to send a request but do not want to block until the RPMH request completes have now a new API - rpmh_write_async(). The API allocates memory and send the requests and returns the control back to the platform driver. The tx_done callback from the controller is handled in the context of the controller's thread and frees the allocated memory. This API allows RPMH requests from atomic contexts as well. Signed-off-by: Lina Iyer <ilina@codeaurora.org> Signed-off-by: Raju P.L.S.S.S.N <rplsssn@codeaurora.org> Signed-off-by: Andy Gross <andy.gross@linaro.org>
2018-07-21drivers: qcom: rpmh: cache sleep/wake state requestsLina Iyer
Active state requests are sent immediately to the RSC controller, while sleep and wake state requests are cached in this driver to avoid taxing the RSC controller repeatedly. The cached values will be sent to the controller when the rpmh_flush() is called. Generally, flushing is a system PM activity and may be called from the system PM drivers when the system is entering suspend or deeper sleep modes during cpuidle. Also allow invalidating the cached requests, so they may be re-populated again. Signed-off-by: Lina Iyer <ilina@codeaurora.org> [rplsssn: remove unneeded semicolon, address line over 80chars error] Signed-off-by: Raju P.L.S.S.S.N <rplsssn@codeaurora.org> Reviewed-by: Evan Green <evgreen@chromium.org> Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Andy Gross <andy.gross@linaro.org>
2018-07-21drivers: qcom: rpmh: add RPMH helper functionsLina Iyer
Sending RPMH requests and waiting for response from the controller through a callback is common functionality across all platform drivers. To simplify drivers, add a library functions to create RPMH client and send resource state requests. rpmh_write() is a synchronous blocking call that can be used to send active state requests. Signed-off-by: Lina Iyer <ilina@codeaurora.org> Signed-off-by: Raju P.L.S.S.S.N <rplsssn@codeaurora.org> Signed-off-by: Andy Gross <andy.gross@linaro.org>
2018-07-21drivers: qcom: rpmh-rsc: add RPMH controller for QCOM SoCsLina Iyer
Add controller driver for QCOM SoCs that have hardware based shared resource management. The hardware IP known as RSC (Resource State Coordinator) houses multiple Direct Resource Voter (DRV) for different execution levels. A DRV is a unique voter on the state of a shared resource. A Trigger Control Set (TCS) is a bunch of slots that can house multiple resource state requests, that when triggered will issue those requests through an internal bus to the Resource Power Manager Hardened (RPMH) blocks. These hardware blocks are capable of adjusting clocks, voltages, etc. The resource state request from a DRV are aggregated along with state requests from other processors in the SoC and the aggregate value is applied on the resource. Some important aspects of the RPMH communication - - Requests are <addr, value> with some header information - Multiple requests (upto 16) may be sent through a TCS, at a time - Requests in a TCS are sent in sequence - Requests may be fire-n-forget or completion (response expected) - Multiple TCS from the same DRV may be triggered simultaneously - Cannot send a request if another request for the same addr is in progress from the same DRV - When all the requests from a TCS are complete, an IRQ is raised - The IRQ handler needs to clear the TCS before it is available for reuse - TCS configuration is specific to a DRV - Platform drivers may use DRV from different RSCs to make requests Resource state requests made when CPUs are active are called 'active' state requests. Requests made when all the CPUs are powered down (idle state) are called 'sleep' state requests. They are matched by a corresponding 'wake' state requests which puts the resources back in to previously requested active state before resuming any CPU. TCSes are dedicated for each type of requests. Active mode TCSes (AMC) are used to send requests immediately to the resource, while control TCS are used to provide specific information to the controller. Sleep and Wake TCS send sleep and wake requests, after and before the system halt respectively. Signed-off-by: Lina Iyer <ilina@codeaurora.org> Signed-off-by: Raju P.L.S.S.S.N <rplsssn@codeaurora.org> Signed-off-by: Andy Gross <andy.gross@linaro.org>
2018-07-21drivers: soc: Add LLCC driverRishabh Bhatnagar
LLCC (Last Level Cache Controller) provides additional cache memory in the system. LLCC is partitioned into multiple slices and each slice gets its own priority, size, ID and other config parameters. LLCC driver programs these parameters for each slice. Clients that are assigned to use LLCC need to get information such size & ID of the slice they get and activate or deactivate the slice as needed. LLCC driver provides API for the clients to perform these operations. Signed-off-by: Channagoud Kadabi <ckadabi@codeaurora.org> Signed-off-by: Rishabh Bhatnagar <rishabhb@codeaurora.org> Reviewed-by: Evan Green <evgreen@chromium.org> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Andy Gross <andy.gross@linaro.org>
2018-07-21signal: Pass pid type into do_send_sig_infoEric W. Biederman
This passes the information we already have at the call sight into do_send_sig_info. Ultimately allowing for better handling of signals sent to a group of processes during fork. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-07-21signal: Pass pid type into group_send_sig_infoEric W. Biederman
This passes the information we already have at the call sight into group_send_sig_info. Ultimatelly allowing for to better handle signals sent to a group of processes. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-07-21Merge tag 'mlx5-fixes-2018-07-18' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2018-07-18 The following series provides fixes to mlx5 core and net device driver. Please pull and let me know if there's any problem. For -stable v4.7 net/mlx5e: Don't allow aRFS for encapsulated packets net/mlx5e: Fix quota counting in aRFS expire flow For -stable v4.15 net/mlx5e: Only allow offloading decap egress (egdev) flows net/mlx5e: Refine ets validation function net/mlx5: Adjust clock overflow work period For -stable v4.17 net/mlx5: E-Switch, UBSAN fix undefined behavior in mlx5_eswitch_mode ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21signal: Pass pid and pid type into send_sigqueueEric W. Biederman
Make the code more maintainable by performing more of the signal related work in send_sigqueue. A quick inspection of do_timer_create will show that this code path does not lookup a thread group by a thread's pid. Making it safe to find the task pointed to by it_pid with "pid_task(it_pid, type)"; This supports the changes needed in fork to tell if a signal was sent to a single process or a group of processes. Having the pid to task transition in signal.c will also make it easier to sort out races with de_thread and and the thread group leader exiting when it comes time to address that. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-07-21pid: Implement PIDTYPE_TGIDEric W. Biederman
Everywhere except in the pid array we distinguish between a tasks pid and a tasks tgid (thread group id). Even in the enumeration we want that distinction sometimes so we have added __PIDTYPE_TGID. With leader_pid we almost have an implementation of PIDTYPE_TGID in struct signal_struct. Add PIDTYPE_TGID as a first class member of the pid_type enumeration and into the pids array. Then remove the __PIDTYPE_TGID special case and the leader_pid in signal_struct. The net size increase is just an extra pointer added to struct pid and an extra pair of pointers of an hlist_node added to task_struct. The effect on code maintenance is the removal of a number of special cases today and the potential to remove many more special cases as PIDTYPE_TGID gets used to it's fullest. The long term potential is allowing zombie thread group leaders to exit, which will remove a lot more special cases in the code. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-07-21pids: Move the pgrp and session pid pointers from task_struct to signal_structEric W. Biederman
To access these fields the code always has to go to group leader so going to signal struct is no loss and is actually a fundamental simplification. This saves a little bit of memory by only allocating the pid pointer array once instead of once for every thread, and even better this removes a few potential races caused by the fact that group_leader can be changed by de_thread, while signal_struct can not. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>