summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2018-10-17vxlan: Export address checking functionsIdo Schimmel
Drivers that support VxLAN offload need to be able to sanitize the configuration of the VxLAN device and accept / reject its offload. For example, mlxsw requires that the local IP of the VxLAN device be set and that packets be flooded to unicast IP(s) and not to a multicast group. Expose the functions that perform such checks. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-17net/mlx5: Add a no-append flow insertion modePaul Blakey
If no-append flag is set, we will add a new FTE, instead of appending the actions of the inserted rule when the same match already exists. While here, move the has_flow_tag boolean indicator to be a flag too. This patch doesn't change any functionality. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanmox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-17net/mlx5: Split FDB fast path prio to multiple namespacesPaul Blakey
Towards supporting multi-chains and priorities, split the FDB fast path to multiple namespaces (sub namespaces), each with multiple priorities. This patch adds a new flow steering type, FS_TYPE_PRIO_CHAINS, which is like current FS_TYPE_PRIO, but may contain only namespaces, and those will be in parallel to one another in terms of managing of the flow tables connections inside them. Meaning, while searching for the next or previous flow table to connect for a new table inside such namespace we skip the parallel namespaces in the same level under the FS_TYPE_PRIO_CHAINS prio we originated from. We use this new type for splitting the fast path prio into multiple parallel namespaces, each containing normal prios. The prios inside them (and their tables) will be connected to one another, but not from one parallel namespace to another, instead the last prio in each namespace will be connected to the next prio in the containing FDB namespace, which is the slow path prio. Signed-off-by: Paul Blakey <paulb@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-17net/mlx5: Add cap bits for multi fdb encapPaul Blakey
If set, the firmware supports creating of flow tables with encap enabled while VFs are configured, if we already created one (restriction still applies on the first creation). Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-17net/mlx5: Use flow counter IDs and not the wrapping cache objectMark Bloch
Currently, when a flow rule is created using the FS core layer, the caller has to pass the entire flow counter object and not just the counter HW handle (ID). This requires both the FS core and the caller to have knowledge about the inner implementation of the FS layer flow counters cache and limits the possible users. Move to use the counter ID across the place when dealing with flows. Doing this decoupling, now can we privatize the inner implementation of the flow counters. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-17net/mlx5: E-Switch, Get counters for offloaded flows from callersMark Bloch
There's no real reason for the e-switch logic to manage the creation of counters for offloaded flows. The API already has the directive for the caller to denote they want to attach a counter to the created flow. As such, we go and move the management of flow counters to the mlx5e tc offload logic. This also lets us remove an inelegant interface where the FS layer had to provide a way to retrieve a counter from a flow rule. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-17Merge branch 'mlx5-next' of ↵Saeed Mahameed
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux into net-next mlx5 updates for both net-next and rdma-next * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: (21 commits) net/mlx5: Expose DC scatter to CQE capability bit net/mlx5: Update mlx5_ifc with DEVX UID bits net/mlx5: Set uid as part of DCT commands net/mlx5: Set uid as part of SRQ commands net/mlx5: Set uid as part of SQ commands net/mlx5: Set uid as part of RQ commands net/mlx5: Set uid as part of QP commands net/mlx5: Set uid as part of CQ commands net/mlx5: Rename incorrect naming in IFC file net/mlx5: Export packet reformat alloc/dealloc functions net/mlx5: Pass a namespace for packet reformat ID allocation net/mlx5: Expose new packet reformat capabilities {net, RDMA}/mlx5: Rename encap to reformat packet net/mlx5: Move header encap type to IFC header file net/mlx5: Break encap/decap into two separated flow table creation flags net/mlx5: Add support for more namespaces when allocating modify header net/mlx5: Export modify header alloc/dealloc functions net/mlx5: Add proper NIC TX steering flow tables support net/mlx5: Cleanup flow namespace getter switch logic net/mlx5: Add memic command opcode to command checker ... Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-17UAPI: ndctl: Remove use of PAGE_SIZEDavid Howells
The macro PAGE_SIZE isn't valid outside of the kernel, so it should not appear in UAPI headers. Furthermore, the actual machine page size could theoretically change from an application's point of view if it's running in a container that gets migrated to another machine (say 4K/ppc64 to 64K/ppc64). Fixes: f2ba5a5baecf ("libnvdimm, namespace: make min namespace size 4K") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-10-17UAPI: ndctl: Fix g++-unsupported initialisation in headersDavid Howells
The following code in the linux/ndctl header file: static inline const char *nvdimm_bus_cmd_name(unsigned cmd) { static const char * const names[] = { [ND_CMD_ARS_CAP] = "ars_cap", [ND_CMD_ARS_START] = "ars_start", [ND_CMD_ARS_STATUS] = "ars_status", [ND_CMD_CLEAR_ERROR] = "clear_error", [ND_CMD_CALL] = "cmd_call", }; if (cmd < ARRAY_SIZE(names) && names[cmd]) return names[cmd]; return "unknown"; } is broken in a number of ways: (1) ARRAY_SIZE() is not generally defined. (2) g++ does not support "non-trivial" array initialisers fully yet. (3) Every file that calls this function will acquire a copy of names[]. The same goes for nvdimm_cmd_name(). Fix all three by converting to a switch statement where each case returns a string. That way if cmd is a constant, the compiler can trivially reduce it and, if not, the compiler can use a shared lookup table if it thinks that is more efficient. A better way would be to remove these functions and their arrays from the header entirely. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-10-17clk: qcom: Add MSM8960/APQ8064's HFPLLsStephen Boyd
Describe the HFPLLs present on MSM8960 and APQ8064 devices. Acked-by: Rob Herring <robh@kernel.org> (bindings) Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Sricharan R <sricharan@codeaurora.org> Tested-by: Craig Tatlor <ctatlor97@gmail.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-10-17tracepoint: Fix tracepoint array element size mismatchMathieu Desnoyers
commit 46e0c9be206f ("kernel: tracepoints: add support for relative references") changes the layout of the __tracepoint_ptrs section on architectures supporting relative references. However, it does so without turning struct tracepoint * const into const int elsewhere in the tracepoint code, which has the following side-effect: Setting mod->num_tracepoints is done in by module.c: mod->tracepoints_ptrs = section_objs(info, "__tracepoints_ptrs", sizeof(*mod->tracepoints_ptrs), &mod->num_tracepoints); Basically, since sizeof(*mod->tracepoints_ptrs) is a pointer size (rather than sizeof(int)), num_tracepoints is erroneously set to half the size it should be on 64-bit arch. So a module with an odd number of tracepoints misses the last tracepoint due to effect of integer division. So in the module going notifier: for_each_tracepoint_range(mod->tracepoints_ptrs, mod->tracepoints_ptrs + mod->num_tracepoints, tp_module_going_check_quiescent, NULL); the expression (mod->tracepoints_ptrs + mod->num_tracepoints) actually evaluates to something within the bounds of the array, but miss the last tracepoint if the number of tracepoints is odd on 64-bit arch. Fix this by introducing a new typedef: tracepoint_ptr_t, which is either "const int" on architectures that have PREL32 relocations, or "struct tracepoint * const" on architectures that does not have this feature. Also provide a new tracepoint_ptr_defer() static inline to encapsulate deferencing this type rather than duplicate code and ugly idefs within the for_each_tracepoint_range() implementation. This issue appears in 4.19-rc kernels, and should ideally be fixed before the end of the rc cycle. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Jessica Yu <jeyu@kernel.org> Link: http://lkml.kernel.org/r/20181013191050.22389-1-mathieu.desnoyers@efficios.com Link: http://lkml.kernel.org/r/20180704083651.24360-7-ard.biesheuvel@linaro.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Ingo Molnar <mingo@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morris <james.morris@microsoft.com> Cc: James Morris <jmorris@namei.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Nicolas Pitre <nico@linaro.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Russell King <linux@armlinux.org.uk> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-10-17clk: imx6q: add mmdc0 ipg clockAnson Huang
i.MX6Q has MMDC0 ipg clock in CCM CCGR, add it into clock tree for clock management. Signed-off-by: Anson Huang <Anson.Huang@nxp.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-10-17clk: imx6sl: add mmdc ipg clocksAnson Huang
i.MX6SL has MMDC0 and MMDC1 ipg clock in CCM CCGR, add them into clock tree for clock management. Signed-off-by: Anson Huang <Anson.Huang@nxp.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-10-17clk: imx6sll: add mmdc1 ipg clockAnson Huang
i.MX6SLL has MMDC1 ipg clock in CCM CCGR, add it into clock tree for clock management. Signed-off-by: Anson Huang <Anson.Huang@nxp.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-10-17clk: imx6sx: add mmdc1 ipg clockAnson Huang
i.MX6SX has MMDC1 ipg clock in CCM CCGR, add it into clock tree for clock management. Signed-off-by: Anson Huang <Anson.Huang@nxp.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-10-17clk: imx6ul: add mmdc1 ipg clockAnson Huang
i.MX6UL has MMDC1 ipg clock in CCM CCGR, add it into clock tree for clock management. Signed-off-by: Anson Huang <Anson.Huang@nxp.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-10-17drm/atomic_helper: Stop modesets on unregistered connectors harderLyude Paul
Unfortunately, it appears our fix in: commit b5d29843d8ef ("drm/atomic_helper: Allow DPMS On<->Off changes for unregistered connectors") Which attempted to work around the problems introduced by: commit 4d80273976bf ("drm/atomic_helper: Disallow new modesets on unregistered connectors") Is still not the right solution, as modesets can still be triggered outside of drm_atomic_set_crtc_for_connector(). So in order to fix this, while still being careful that we don't break modesets that a driver may perform before being registered with userspace, we replace connector->registered with a tristate member, connector->registration_state. This allows us to keep track of whether or not a connector is still initializing and hasn't been exposed to userspace, is currently registered and exposed to userspace, or has been legitimately removed from the system after having once been present. Using this info, we can prevent userspace from performing new modesets on unregistered connectors while still allowing the driver to perform modesets on unregistered connectors before the driver has finished being registered. Changes since v1: - Fix WARN_ON() in drm_connector_cleanup() that CI caught with this patchset in igt@drv_module_reload@basic-reload-inject and igt@drv_module_reload@basic-reload by checking if the connector is registered instead of unregistered, as calling drm_connector_cleanup() on a connector that hasn't been registered with userspace yet should stay valid. - Remove unregistered_connector_check(), and just go back to what we were doing before in commit 4d80273976bf ("drm/atomic_helper: Disallow new modesets on unregistered connectors") except replacing READ_ONCE(connector->registered) with drm_connector_is_unregistered(). This gets rid of the behavior of allowing DPMS On<->Off, but that should be fine as it's more consistent with the UAPI we had before - danvet - s/drm_connector_unregistered/drm_connector_is_unregistered/ - danvet - Update documentation, fix some typos. Fixes: b5d29843d8ef ("drm/atomic_helper: Allow DPMS On<->Off changes for unregistered connectors") Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: stable@vger.kernel.org Cc: David Airlie <airlied@linux.ie> Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20181016203946.9601-1-lyude@redhat.com
2018-10-17clk: at91: add new DT lookup functionAlexandre Belloni
Add a new DT lookup function to lookup for PMC clocks. Note that the #ifndef AT91_PMC_MOSCS section will be removed once all the platforms are converted. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-10-17block: Add PCI P2P flag for request queueLogan Gunthorpe
Add QUEUE_FLAG_PCI_P2P, meaning a driver's request queue supports targeting P2P memory. This will be used by P2P providers and orchestrators (in subsequent patches) to ensure block devices can support P2P memory before submitting P2P-backed pages to submit_bio(). Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Jens Axboe <axboe@kernel.dk>
2018-10-17PCI/P2PDMA: Introduce configfs/sysfs enable attribute helpersLogan Gunthorpe
Users of the P2PDMA infrastructure will typically need a way for the user to tell the kernel to use P2P resources. Typically this will be a simple on/off boolean operation but sometimes it may be desirable for the user to specify the exact device to use for the P2P operation. Add new helpers for attributes which take a boolean or a PCI device. Any boolean as accepted by strtobool() turn P2P on or off (such as 'y', 'n', '1', '0', etc). Specifying a full PCI device name/BDF will select the specific device. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-10-17PCI/P2PDMA: Add PCI p2pmem DMA mappings to adjust the bus offsetLogan Gunthorpe
The DMA address used when mapping PCI P2P memory must be the PCI bus address. Thus, introduce pci_p2pmem_map_sg() to map the correct addresses when using P2P memory. Memory mapped in this way does not need to be unmapped and thus if we provided pci_p2pmem_unmap_sg() it would be empty. This breaks the expected balance between map/unmap but was left out as an empty function doesn't really provide any benefit. In the future, if this call becomes necessary it can be added without much difficulty. For this, we assume that an SGL passed to these functions contain all P2P memory or no P2P memory. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-10-17kvm: x86: Introduce KVM_CAP_EXCEPTION_PAYLOADJim Mattson
This is a per-VM capability which can be enabled by userspace so that the faulting linear address will be included with the information about a pending #PF in L2, and the "new DR6 bits" will be included with the information about a pending #DB in L2. With this capability enabled, the L1 hypervisor can now intercept #PF before CR2 is modified. Under VMX, the L1 hypervisor can now intercept #DB before DR6 and DR7 are modified. When userspace has enabled KVM_CAP_EXCEPTION_PAYLOAD, it should generally provide an appropriate payload when injecting a #PF or #DB exception via KVM_SET_VCPU_EVENTS. However, to support restoring old checkpoints, this payload is not required. Note that bit 16 of the "new DR6 bits" is set to indicate that a debug exception (#DB) or a breakpoint exception (#BP) occurred inside an RTM region while advanced debugging of RTM transactional regions was enabled. This is the reverse of DR6.RTM, which is cleared in this scenario. This capability also enables exception.pending in struct kvm_vcpu_events, which allows userspace to distinguish between pending and injected exceptions. Reported-by: Jim Mattson <jmattson@google.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17IB/mlx5: Add support for extended atomic operationsYonatan Cohen
Extended atomic operations cmp&swp and fetch&add is a Mellanox feature extending the standard atomic operation to use, varied operand sizes, as apposed to normal atomic operation that use an 8 byte operand only. Extended atomics allows masking the results and arguments. This patch configures QP to support extended atomic operation with the maximum size possible, as exposed by HCA capabilities. Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-17IB/mlx5: Allow scatter to CQE without global signaled WRsYonatan Cohen
Requester scatter to CQE is restricted to QPs configured to signal all WRs. This patch adds ability to enable scatter to cqe (force enable) in the requester without sig_all, for users who do not want all WRs signaled but rather just the ones whose data found in the CQE. Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-17Merge remote-tracking branch 'mlx5-next' into for-nextDoug Ledford
Pick up changes to mlx5_ifc.h needed for direct Scatter to CQE support series to come next. Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-17RDMA/core: Allow existing drivers to set one sysfs group per deviceParav Pandit
Currently many rdma drivers are creating device attribute files using device_create_file() with device specific attributes. Device specific attributes should be exposed via well defined netlink device attributes in future. Introduce an API rdma_set_device_sysfs_group() for existing drivers to set a group for sysfs attributes for legacy. This API is only for exposing legacy attributes which existed for sometime now. New drivers should not be using this API and rather follow netlink path. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-17bpf: sockmap, support for msg_peek in sk_msg with redirect ingressJohn Fastabend
This adds support for the MSG_PEEK flag when doing redirect to ingress and receiving on the sk_msg psock queue. Previously the flag was being ignored which could confuse applications if they expected the flag to work as normal. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-17bpf: skmsg, improve sk_msg_used_element to work in cork contextJohn Fastabend
Currently sk_msg_used_element is only called in zerocopy context where cork is not possible and if this case happens we fallback to copy mode. However the helper is more useful if it works in all contexts. This patch resolved the case where if end == head indicating a full or empty ring the helper always reports an empty ring. To fix this add a test for the full ring case to avoid reporting a full ring has 0 elements. This additional functionality will be used in the next patches from recvmsg context where end = head with a full ring is a valid case. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-17bpf: sockmap, fix skmsg recvmsg handler to track size correctlyJohn Fastabend
When converting sockmap to new skmsg generic data structures we missed that the recvmsg handler did not correctly use sg.size and instead was using individual elements length. The result is if a sock is closed with outstanding data we omit the call to sk_mem_uncharge() and can get the warning below. [ 66.728282] WARNING: CPU: 6 PID: 5783 at net/core/stream.c:206 sk_stream_kill_queues+0x1fa/0x210 To fix this correct the redirect handler to xfer the size along with the scatterlist and also decrement the size from the recvmsg handler. Now when a sock is closed the remaining 'size' will be decremented with sk_mem_uncharge(). Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-16clk: add managed version of clk_bulk_get_allDong Aisheng
This patch introduces the managed version of clk_bulk_get_all. Cc: Michael Turquette <mturquette@baylibre.com> Cc: Stephen Boyd <sboyd@codeaurora.org> Tested-by: Thor Thayer <thor.thayer@linux.intel.com> Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-10-16clk: add new APIs to operate on all available clocksDong Aisheng
This patch introduces of_clk_bulk_get_all and clk_bulk_x_all APIs to users who just want to handle all available clocks from device tree without need to know the detailed clock information likes clock numbers and names. This is useful in writing some generic drivers to handle clock part. Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Tested-by: Thor Thayer <thor.thayer@linux.intel.com> Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-10-17KVM: nVMX: add KVM_CAP_HYPERV_ENLIGHTENED_VMCS capabilityVitaly Kuznetsov
Enlightened VMCS is opt-in. The current version does not contain all fields supported by nested VMX so we must not advertise the corresponding VMX features if enlightened VMCS is enabled. Userspace is given the enlightened VMCS version supported by KVM as part of enabling KVM_CAP_HYPERV_ENLIGHTENED_VMCS. The version is to be advertised to the nested hypervisor, currently done via a cpuid leaf for Hyper-V. Suggested-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17kvm/x86 : add coalesced pio supportPeng Hao
Coalesced pio is based on coalesced mmio and can be used for some port like rtc port, pci-host config port and so on. Specially in case of rtc as coalesced pio, some versions of windows guest access rtc frequently because of rtc as system tick. guest access rtc like this: write register index to 0x70, then write or read data from 0x71. writing 0x70 port is just as index and do nothing else. So we can use coalesced pio to handle this scene to reduce VM-EXIT time. When starting and closing a virtual machine, it will access pci-host config port frequently. So setting these port as coalesced pio can reduce startup and shutdown time. without my patch, get the vm-exit time of accessing rtc 0x70 and piix 0xcf8 using perf tools: (guest OS : windows 7 64bit) IO Port Access Samples Samples% Time% Min Time Max Time Avg time 0x70:POUT 86 30.99% 74.59% 9us 29us 10.75us (+- 3.41%) 0xcf8:POUT 1119 2.60% 2.12% 2.79us 56.83us 3.41us (+- 2.23%) with my patch IO Port Access Samples Samples% Time% Min Time Max Time Avg time 0x70:POUT 106 32.02% 29.47% 0us 10us 1.57us (+- 7.38%) 0xcf8:POUT 1065 1.67% 0.28% 0.41us 65.44us 0.66us (+- 10.55%) Signed-off-by: Peng Hao <peng.hao2@zte.com.cn> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17KVM: x86: hyperv: implement PV IPI send hypercallsVitaly Kuznetsov
Using hypercall for sending IPIs is faster because this allows to specify any number of vCPUs (even > 64 with sparse CPU set), the whole procedure will take only one VMEXIT. Current Hyper-V TLFS (v5.0b) claims that HvCallSendSyntheticClusterIpi hypercall can't be 'fast' (passing parameters through registers) but apparently this is not true, Windows always uses it as 'fast' so we need to support that. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-16dt-bindings: clock: Add jz4725b-cgu.h headerPaul Cercueil
This will be used from the devicetree bindings to specify the clocks that should be obtained from the jz4725b-cgu driver. Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-10-16clk: qcom: gcc: Add global clock controller driver for QCS404Shefali Jain
Add the clocks supported in global clock controller which clock the peripherals like BLSPs, SDCC, USB, MDSS etc. Register all the clocks to the clock framework for the clients to be able to request for them. Signed-off-by: Shefali Jain <shefjain@codeaurora.org> Signed-off-by: Taniya Das <tdas@codeaurora.org> Co-developed-by: Taniya Das <tdas@codeaurora.org> Signed-off-by: Anu Ramanathan <anur@codeaurora.org> [bamse, vkoul: rebase and tidyup for upstream] Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Vinod Koul <vkoul@kernel.org> Acked-by: Rob Herring <robh@kernel.org> [sboyd@kernel.org: Lowercase hex] Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-10-16clk: qcom: Add Global Clock controller (GCC) driver for SDM660Taniya Das
Add support for the global clock controller found on SDM660 based devices. This should allow most non-multimedia device drivers to probe and control their clocks. Based on CAF implementation. Signed-off-by: Taniya Das <tdas@codeaurora.org> [craig: rename parents to fit upstream, and other cleanups] Signed-off-by: Craig Tatlor <ctatlor97@gmail.com> Acked-by: Rob Herring <robh@kernel.org> [sboyd@kernel.org: Rename gcc_660 to gcc_sdm660 and fix numbering of defines to avoid duplicates] Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-10-16dt-bindings: clk: hisilicon: Add bindings for Hi3670 clkManivannan Sadhasivam
Add devicetree bindings for HiSilicon Hi3670 clock controller. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-10-16dt-bindings: reset: Add binding constants for Actions Semi S900 RMUManivannan Sadhasivam
Add device tree binding constants for Actions Semi S900 SoC Reset Management Unit (RMU). Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-10-16dt-bindings: reset: Add binding constants for Actions Semi S700 RMUManivannan Sadhasivam
Add device tree binding constants for Actions Semi S700 SoC Reset Management Unit (RMU). Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-10-16net/mlx5: Expose DC scatter to CQE capability bitYonatan Cohen
dc_req_scat_data_cqe capability bit determines if requester scatter to cqe is available for 64 bytes CQE over DC transport type. Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-10-16RDMA/core: Rename ports_parent to ports_kobjParav Pandit
Normally kobj objects have kobj suffix to reflect it. Rename ports_parent to ports_kobj. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-16RDMA/nldev: Allow IB device rename through RDMA netlinkLeon Romanovsky
Provide an option to rename IB device name through RDMA netlink and limit it to users with ADMIN capability only. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-16RDMA/core: Annotate timeout as unsigned longLeon Romanovsky
The ucma users supply timeout in u32 format, it means that any number with most significant bit set will be converted to negative value by various rdma_*, cma_* and sa_query functions, which treat timeout as int. In the lowest level, the timeout is converted back to be unsigned long. Remove this ambiguous conversion by updating all function signatures to receive unsigned long. Reported-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-16RDMA/core: Align multiple functions to kernel coding styleLeon Romanovsky
This patch changes the small number of functions to be aligned to kernel coding style. It is needed to minimize the diffstat of the following patch. It doesn't change any functionality. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-16qed: Add supported link and advertise link to display in ethtool.Rahul Verma
Added transceiver type, speed capability and board types in HSI, are utilizing to display the accurate link information in ethtool. Signed-off-by: Rahul Verma <Rahul.Verma@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-16sctp: get pr_assoc and pr_stream all status with SCTP_PR_SCTP_ALL insteadXin Long
According to rfc7496 section 4.3 or 4.4: sprstat_policy: This parameter indicates for which PR-SCTP policy the user wants the information. It is an error to use SCTP_PR_SCTP_NONE in sprstat_policy. If SCTP_PR_SCTP_ALL is used, the counters provided are aggregated over all supported policies. We change to dump pr_assoc and pr_stream all status by SCTP_PR_SCTP_ALL instead, and return error for SCTP_PR_SCTP_NONE, as it also said "It is an error to use SCTP_PR_SCTP_NONE in sprstat_policy. " Fixes: 826d253d57b1 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt") Fixes: d229d48d183f ("sctp: add SCTP_PR_STREAM_STATUS sockopt for prsctp") Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-16f2fs: checkpoint disablingDaniel Rosenberg
Note that, it requires "f2fs: return correct errno in f2fs_gc". This adds a lightweight non-persistent snapshotting scheme to f2fs. To use, mount with the option checkpoint=disable, and to return to normal operation, remount with checkpoint=enable. If the filesystem is shut down before remounting with checkpoint=enable, it will revert back to its apparent state when it was first mounted with checkpoint=disable. This is useful for situations where you wish to be able to roll back the state of the disk in case of some critical failure. Signed-off-by: Daniel Rosenberg <drosen@google.com> [Jaegeuk Kim: use SB_RDONLY instead of MS_RDONLY] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-10-16amiflop: fold headers into C fileOmar Sandoval
amifd.h and amifdreg.h are only used from amiflop.c, and they're pretty small, so move the contents to amiflop.c and get rid of the .h files. This is preparation for adding a struct blk_mq_tag_set to struct amiga_floppy_struct. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16Merge branch 'x86/build' into locking/core, to pick up dependent patches and ↵Ingo Molnar
unify jump-label work Signed-off-by: Ingo Molnar <mingo@kernel.org>