summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-01-05net/mlx5e: Do not modify the TX SKBAchiad Shochat
If the SKB is cloned, or has an elevated users count, someone else can be looking at it at the same time. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05Merge remote-tracking branches 'regmap/topic/mmio', 'regmap/topic/rbtree' ↵Mark Brown
and 'regmap/topic/seq' into regmap-next
2016-01-05Merge remote-tracking branches 'regmap/topic/64bit' and ↵Mark Brown
'regmap/topic/irq-type' into regmap-next
2016-01-05Merge remote-tracking branch 'regmap/topic/core' into regmap-nextMark Brown
2016-01-05Merge remote-tracking branch 'regmap/topic/cache' into regmap-nextMark Brown
2016-01-05spi: loopback-test: rename method spi_test_fill_tx to spi_test_fill_patternMartin Sperl
Rename method spi_test_fill_tx to spi_test_fill_pattern to better describe what it does. Signed-off-by: Martin Sperl <kernel@martin.sperl.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-01-05spi: loopback-test: write rx pattern also when running without tx_bufMartin Sperl
Currently the rx_buf does not get set with the SPI_TEST_PATTERN_UNWRITTEN when tx_buf == NULL in the transfer. Reorder code so that it gets done also under this specific condition. Signed-off-by: Martin Sperl <kernel@martin.sperl.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-01-05regmap: debugfs: Use seq_file for the access mapMark Brown
Unlike the registers file we don't have any substantial performance concerns rendering the entire file (it involves no device accesses) so just use seq_printf() to simplify the code. Signed-off-by: Mark Brown <broonie@kernel.org>
2016-01-05regmap: irq: add support for configuration of trigger typeLaxman Dewangan
Some of devices supports the trigger level for interrupt like rising/falling edge specially for GPIOs. The interrupt support of such devices may have uses the generic regmap irq framework for implementation. Add support to configure the trigger type device interrupt register via regmap-irq framework. The regmap-irq framework configures the trigger register only if the details of trigger type registers are provided. [Fixed use of terery operator for legibility -- broonie] Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-01-05Merge branch 'sctp-transport-rhashtable'David S. Miller
Xin Long says: ==================== sctp: use transport hashtable to replace association's with rhashtable for telecom center, the usual case is that a server is connected by thousands of clients. but if the server with only one enpoint(udp style) use the same sport and dport to communicate with every clients, and every assoc in server will be hashed in the same chain of global assoc hashtable due to currently we choose dport and sport as the hash key. when a packet is received, sctp_rcv try to find the assoc with sport and dport, since that chain is too long to find it fast, it make the performance turn to very low, some test data is as follow: in server: $./ss [start a udp style server there] in client: $./cc [start 2500 sockets to connect server with same port and different ip, and use one of them to send data to server] ===== test on net-next -- perf top server: 55.73% [kernel] [k] sctp_assoc_is_match 6.80% [kernel] [k] sctp_assoc_lookup_paddr 4.81% [kernel] [k] sctp_v4_cmp_addr 3.12% [kernel] [k] _raw_spin_unlock_irqrestore 1.94% [kernel] [k] sctp_cmp_addr_exact client: 46.01% [kernel] [k] sctp_endpoint_lookup_assoc 5.55% libc-2.17.so [.] __libc_calloc 5.39% libc-2.17.so [.] _int_free 3.92% libc-2.17.so [.] _int_malloc 3.23% [kernel] [k] __memset -- spent time time is 487s, send pkt is 10000000 we need to change the way to calculate the hash key, to use lport + rport + paddr as the hash key can avoid this issue. besides, this patchset will use transport hashtable to replace association hashtable to lookup with rhashtable api. get transport first then get association by t->asoc. and also it will make tcp style work better. ===== test with this patchset: -- perf top server: 15.98% [kernel] [k] _raw_spin_unlock_irqrestore 9.92% [kernel] [k] __pv_queued_spin_lock_slowpath 7.22% [kernel] [k] copy_user_generic_string 2.38% libpthread-2.17.so [.] __recvmsg_nocancel 1.88% [kernel] [k] sctp_recvmsg client: 11.90% [kernel] [k] sctp_hash_cmp 8.52% [kernel] [k] rht_deferred_worker 4.94% [kernel] [k] __pv_queued_spin_lock_slowpath 3.95% [kernel] [k] sctp_bind_addr_match 2.49% [kernel] [k] __memset -- spent time time is 22s, send pkt is 10000000 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05sctp: remove the local_bh_disable/enable in sctp_endpoint_lookup_assocXin Long
sctp_endpoint_lookup_assoc is called in the protection of sock lock there is no need to call local_bh_disable in this function. so remove them. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05sctp: drop the old assoc hashtable of sctpXin Long
transport hashtable will replace the association hashtable, so association hashtable is not used in sctp any more, so drop the codes about that. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05sctp: apply rhashtable api to sctp procfsXin Long
Traversal the transport rhashtable, get the association only once through the condition assoc->peer.primary_path != transport. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05sctp: apply rhashtable api to send/recv pathXin Long
apply lookup apis to two functions, for __sctp_endpoint_lookup_assoc and __sctp_lookup_association, it's invoked in the protection of sock lock, it will be safe, but sctp_lookup_association need to call rcu_read_lock() and to detect the t->dead to protect it. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05sctp: add the rhashtable apis for sctp global transport hashtableXin Long
tranport hashtbale will replace the association hashtable to do the lookup for transport, and then get association by t->assoc, rhashtable apis will be used because of it's resizable, scalable and using rcu. lport + rport + paddr will be the base hashkey to locate the chain, with net to protect one netns from another, then plus the laddr to compare to get the target. this patch will provider the lookup functions: - sctp_epaddr_lookup_transport - sctp_addrs_lookup_transport hash/unhash functions: - sctp_hash_transport - sctp_unhash_transport init/destroy functions: - sctp_transport_hashtable_init - sctp_transport_hashtable_destroy Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05mmc: dw_mmc: remove the unused quirksJaehoon Chung
Removed the unused quirks. These quirks don't used anywhere. Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-01-05mmc: sdhci-pci: use to_pci_dev()Geliang Tang
Use to_pci_dev() instead of open-coding it. Signed-off-by: Geliang Tang <geliangtang@163.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-01-05mmc: cb710: use to_platform_device()Geliang Tang
Use to_platform_device() instead of open-coding it. Signed-off-by: Geliang Tang <geliangtang@163.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-01-05mmc: tegra: use correct accessor for misc ctrl registerLucas Stach
The misc control register is 32bit wide, the used readw/writew accessors only mainipulate the low 16bit of this register. It currently doesn't matter as all the bit changed are located in the lower half, but together with the u32 variable used to hold the contents of the register it is seriously confusing. Switch to 32bit accessors to avoid any future breakage. Signed-off-by: Lucas Stach <dev@lynxeye.de> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-01-05mmc: tegra: enable UHS-I modesLucas Stach
Keep the quirk bits, as Tegra30 and Tegra114 host have different levels of support for UHS-I modes and so need different spare bits to be set, but change the logic to be positive. Tegra210 needs a different tuning sequence than Tegra30+. Disable UHS modes until support for this is properly added. Signed-off-by: Lucas Stach <dev@lynxeye.de> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-01-05spi: fsl-espi: expose maximum transfer size limitMichal Suchanek
The fsl-espi hardware can trasfer at most 64K data so report teh limitation. Based on patch by Heiner Kallweit <hkallweit1@gmail.com> CC: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Michal Suchanek <hramrach@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-01-05spi: expose master transfer size limitation.Michal Suchanek
On some SPI controllers it is not feasible to transfer arbitrary amount of data at once. When the limit on transfer size is a few kilobytes at least it makes sense to use the SPI hardware rather than reverting to gpio driver. The protocol drivers need a way to check that they do not sent overly long messages, though. Signed-off-by: Michal Suchanek <hramrach@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-01-05Bluetooth: Add support for Start Limited Discovery commandJohan Hedberg
This patch implements the mgmt Start Limited Discovery command. Most of existing Start Discovery code is reused since the only difference is the presence of a 'limited' flag as part of the discovery state. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-01-05Bluetooth: Change eir_has_data_type() to more generic eir_get_data()Johan Hedberg
To make the EIR parsing helper more general purpose, make it return the found data and its length rather than just saying whether the data was present or not. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-01-05arm64: mm: move pgd_cache initialisation to pgtable_cache_initWill Deacon
Initialising the suppport for EFI runtime services requires us to allocate a pgd off the back of an early_initcall. On systems where the PGD_SIZE is smaller than PAGE_SIZE (e.g. 64k pages and 48-bit VA), the pgd_cache isn't initialised at this stage, and we panic with a NULL dereference during boot: Unable to handle kernel NULL pointer dereference at virtual address 00000000 __create_mapping.isra.5+0x84/0x350 create_pgd_mapping+0x20/0x28 efi_create_mapping+0x5c/0x6c arm_enable_runtime_services+0x154/0x1e4 do_one_initcall+0x8c/0x190 kernel_init_freeable+0x84/0x1ec kernel_init+0x10/0xe0 ret_from_fork+0x10/0x50 This patch fixes the problem by initialising the pgd_cache earlier, in the pgtable_cache_init callback, which sounds suspiciously like what it was intended for. Reported-by: Dennis Chen <dennis.chen@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-01-05cgroup: Remove resource_counter.txt in Documentation/cgroup-legacy/00-INDEX.Yuan Sun
Signed-off-by: Yuan Sun <sunyuan3@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-01-05Fix documentation for adp1653 DTPali Rohár
Property names do not match real names needed by driver itself. This patch fix this problem. Signed-off-by: Pali Rohár <pali.rohar@gmail.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rob Herring <robh@kernel.org>
2016-01-05ARM: psci: Fix indentation in DT bindingsGeert Uytterhoeven
Fix bogus indentation of the PSCI compatible values, reformat. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rob Herring <robh@kernel.org>
2016-01-05of/platform: export of_default_bus_match_tableMasahiro Yamada
Currently, drivers/bus/uniphier-system-bus.c is kept from being a module due to the unresolved reference to of_default_bus_match_table. Refer to commit 326ea45aa827 ("bus: uniphier: allow only built-in driver"). Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Rob Herring <robh@kernel.org>
2016-01-05of/unittest: Show broken behaviour in the platform busGrant Likely
Add a single resource to the test bus device to exercise the platform bus code a little more. This isn't strictly a devicetree test, but it is a corner case that the devicetree runs into. Until we've got platform device unittests, it can live here. It doesn't need to be an explicit text because the kernel will oops when it is wrong. Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com> Cc: Rob Herring <robh@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com> Signed-off-by: Grant Likely <grant.likely@linaro.org> [wsa: added the comment provided by Grant, rebased, and tested] Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Rob Herring <robh@kernel.org>
2016-01-05tile: provide CONFIG_PAGE_SIZE_64KB etc for tileproChris Metcalf
This allows the build system to know that it can't attempt to configure the Lustre virtual block device, for example, when tilepro is using 64KB pages (as it does by default). The tilegx build already provided those symbols. Previously we required that the tilepro hypervisor be rebuilt with a different hardcoded page size in its headers, and then Linux be rebuilt using the updated hypervisor header. Now we allow each of the hypervisor and Linux to be built independently. We still check at boot time to ensure that the page size provided by the hypervisor matches what Linux expects. Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> Cc: stable@vger.kernel.org [3.19+]
2016-01-05pinctrl: lantiq: 2 pins have the wrong mux listJohn Crispin
The latest vendor SDK contained this patch. Signed-off-by: John Crispin <blogic@openwrt.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-01-05Documentation: cpufreq: intel_pstate: enhance documentationSrinivas Pandruvada
This is an attempt to make documentation more user friendly. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Reviewed-by: Doug Smythies <dsmythies@telus.net> Reviewed-by: Chen, Yu C <yu.c.chen@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-01-05ACPI, PCI, irq: remove redundant check for null string pointerColin Ian King
source is decleared as a 4 byte char array in struct acpi_pci_routing_table so !prt->source is a redundant null string pointer check. Detected with smatch: drivers/acpi/pci_irq.c:134 do_prt_fixups() warn: this array is probably non-NULL. 'prt->source' Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-01-05ACPI / video: driver must be registered before checking for keypressesAdrien Schildknecht
acpi_video_handles_brightness_key_presses() may use an uninitialized mutex. The error has been reported by lockdep: DEBUG_LOCKS_WARN_ON(l->magic != l). The function assumes that the video driver has been registered before being called. As explained in the comment of acpi_video_init(), the registration of the video class may be defered and thus may not take place in the init function of the module. Use completion mechanisms to make sure that acpi_video_handles_brightness_key_presses() wait for the completion of acpi_video_register() before using the mutex. Also get rid of register_count since task completion can replace it. Signed-off-by: Adrien Schildknecht <adrien+dev@schischi.me> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-01-05ASoC: Intel: Skylake: Fix the memory leakVinod Koul
This provide the fix for firmware memory by freeing the pointer in driver remove where it is safe to do so Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-01-05ASoC: Intel: Skylake: Revert previous broken fix memory leak fixVinod Koul
This reverts commit 87b5ed8ecb9fe05a696e1c0b53c7a49ea66432c1 ("ASoC: Intel: Skylake: fix memory leak") as it causes regression on Skylake devices The SKL drivers can be deferred probe. The topology file based widgets can have references to topology file so this can't be freed until card is fully created, so revert this patch for now [ 66.682767] BUG: unable to handle kernel paging request at ffffc900001363fc [ 66.690735] IP: [<ffffffff806c94dd>] strnlen+0xd/0x40 [ 66.696509] PGD 16e035067 PUD 16e036067 PMD 16e038067 PTE 0 [ 66.702925] Oops: 0000 [#1] PREEMPT SMP [ 66.768390] CPU: 3 PID: 57 Comm: kworker/u16:3 Tainted: G O 4.4.0-rc7-skl #62 [ 66.778869] Hardware name: Intel Corporation Skylake Client platform [ 66.793201] Workqueue: deferwq deferred_probe_work_func [ 66.799173] task: ffff88008b700f40 ti: ffff88008b704000 task.ti: ffff88008b704000 [ 66.807692] RIP: 0010:[<ffffffff806c94dd>] [<ffffffff806c94dd>] strnlen+0xd/0x40 [ 66.816243] RSP: 0018:ffff88008b707878 EFLAGS: 00010286 [ 66.822293] RAX: ffffffff80e60a82 RBX: 000000000000000e RCX: fffffffffffffffe [ 66.830406] RDX: ffffc900001363fc RSI: ffffffffffffffff RDI: ffffc900001363fc [ 66.838520] RBP: ffff88008b707878 R08: 000000000000ffff R09: 000000000000ffff [ 66.846649] R10: 0000000000000001 R11: ffffffffa01c6368 R12: ffffc900001363fc [ 66.854765] R13: 0000000000000000 R14: 00000000ffffffff R15: 0000000000000000 [ 66.862910] FS: 0000000000000000(0000) GS:ffff88016ecc0000(0000) knlGS:0000000000000000 [ 66.872150] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 66.878696] CR2: ffffc900001363fc CR3: 0000000002c09000 CR4: 00000000003406e0 [ 66.886820] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 66.894938] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 66.903052] Stack: [ 66.905346] ffff88008b7078b0 ffffffff806cb1db 000000000000000e 0000000000000000 [ 66.913854] ffff88008b707928 ffffffffa00d1050 ffffffffa00d104e ffff88008b707918 [ 66.922353] ffffffff806ccbd6 ffff88008b707948 0000000000000046 ffff88008b707940 [ 66.930855] Call Trace: [ 66.933646] [<ffffffff806cb1db>] string.isra.4+0x3b/0xd0 [ 66.939793] [<ffffffff806ccbd6>] vsnprintf+0x116/0x540 [ 66.945742] [<ffffffff806d02f0>] kvasprintf+0x40/0x80 [ 66.951591] [<ffffffff806d0370>] kasprintf+0x40/0x50 [ 66.957359] [<ffffffffa00c085f>] dapm_create_or_share_kcontrol+0x1cf/0x300 [snd_soc_core] [ 66.966771] [<ffffffff8057dd1e>] ? __kmalloc+0x16e/0x2a0 [ 66.972931] [<ffffffffa00c0dab>] snd_soc_dapm_new_widgets+0x41b/0x4b0 [snd_soc_core] [ 66.981857] [<ffffffffa00be8c0>] ? snd_soc_dapm_add_routes+0xb0/0xd0 [snd_soc_core] [ 67.007828] [<ffffffffa00b92ed>] soc_probe_component+0x23d/0x360 [snd_soc_core] [ 67.016244] [<ffffffff80b14e69>] ? mutex_unlock+0x9/0x10 [ 67.022405] [<ffffffffa00ba02f>] snd_soc_instantiate_card+0x47f/0xd10 [snd_soc_core] [ 67.031329] [<ffffffff8049eeb2>] ? debug_mutex_init+0x32/0x40 [ 67.037973] [<ffffffffa00baa92>] snd_soc_register_card+0x1d2/0x2b0 [snd_soc_core] [ 67.046619] [<ffffffffa00c8b54>] devm_snd_soc_register_card+0x44/0x80 [snd_soc_core] [ 67.055539] [<ffffffffa01c303b>] skylake_audio_probe+0x1b/0x20 [snd_soc_skl_rt286] [ 67.064292] [<ffffffff808aa887>] platform_drv_probe+0x37/0x90 Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-01-05spi: zynq: use to_platform_device()Geliang Tang
Use to_platform_device() instead of open-coding it. Signed-off-by: Geliang Tang <geliangtang@163.com> Reviewed-by: Moritz Fischer <moritz.fischer@ettus.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-01-05spi: cadence: use to_platform_device()Geliang Tang
Use to_platform_device() instead of open-coding it. Signed-off-by: Geliang Tang <geliangtang@163.com> Reviewed-by: Moritz Fischer <moritz.fischer@ettus.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-01-05arm64: module: avoid undefined shift behavior in reloc_data()Ard Biesheuvel
Compilers may engage the improbability drive when encountering shifts by a distance that is a multiple of the size of the operand type. Since the required bounds check is very simple here, we can get rid of all the fuzzy masking, shifting and comparing, and use the documented bounds directly. Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-01-05arm64: module: fix relocation of movz instruction with negative immediateArd Biesheuvel
The test whether a movz instruction with a signed immediate should be turned into a movn instruction (i.e., when the immediate is negative) is flawed, since the value of imm is always positive. Also, the subsequent bounds check is incorrect since the limit update never executes, due to the fact that the imm_type comparison will always be false for negative signed immediates. Let's fix this by performing the sign test on sval directly, and replacing the bounds check with a simple comparison against U16_MAX. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [will: tidied up use of sval, renamed MOVK enum value to MOVKZ] Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-01-05Merge branches 'misc' and 'misc-rc6' into for-linusRussell King
2016-01-05configfs: add myself as co-maintainer, updated git treeChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Joel Becker <jlbec@evilplan.org>
2016-01-05x86/mm/pat: Change free_memtype() to support shrinking caseToshi Kani
Using mremap() to shrink the map size of a VM_PFNMAP range causes the following error message, and leaves the pfn range allocated. x86/PAT: test:3493 freeing invalid memtype [mem 0x483200000-0x4863fffff] This is because rbt_memtype_erase(), called from free_memtype() with spin_lock held, only supports to free a whole memtype node in memtype_rbroot. Therefore, this patch changes rbt_memtype_erase() to support a request that shrinks the size of a memtype node for mremap(). memtype_rb_exact_match() is renamed to memtype_rb_match(), and is enhanced to support EXACT_MATCH and END_MATCH in @match_type. Since the memtype_rbroot tree allows overlapping ranges, rbt_memtype_erase() checks with EXACT_MATCH first, i.e. free a whole node for the munmap case. If no such entry is found, it then checks with END_MATCH, i.e. shrink the size of a node from the end for the mremap case. On the mremap case, rbt_memtype_erase() proceeds in two steps, 1) remove the node, and then 2) insert the updated node. This allows proper update of augmented values, subtree_max_end, in the tree. Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> Cc: stsp@list.ru Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/1450832064-10093-3-git-send-email-toshi.kani@hpe.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-01-05x86/mm/pat: Add untrack_pfn_moved for mremapToshi Kani
mremap() with MREMAP_FIXED on a VM_PFNMAP range causes the following WARN_ON_ONCE() message in untrack_pfn(). WARNING: CPU: 1 PID: 3493 at arch/x86/mm/pat.c:985 untrack_pfn+0xbd/0xd0() Call Trace: [<ffffffff817729ea>] dump_stack+0x45/0x57 [<ffffffff8109e4b6>] warn_slowpath_common+0x86/0xc0 [<ffffffff8109e5ea>] warn_slowpath_null+0x1a/0x20 [<ffffffff8106a88d>] untrack_pfn+0xbd/0xd0 [<ffffffff811d2d5e>] unmap_single_vma+0x80e/0x860 [<ffffffff811d3725>] unmap_vmas+0x55/0xb0 [<ffffffff811d916c>] unmap_region+0xac/0x120 [<ffffffff811db86a>] do_munmap+0x28a/0x460 [<ffffffff811dec33>] move_vma+0x1b3/0x2e0 [<ffffffff811df113>] SyS_mremap+0x3b3/0x510 [<ffffffff817793ee>] entry_SYSCALL_64_fastpath+0x12/0x71 MREMAP_FIXED moves a pfnmap from old vma to new vma. untrack_pfn() is called with the old vma after its pfnmap page table has been removed, which causes follow_phys() to fail. The new vma has a new pfnmap to the same pfn & cache type with VM_PAT set. Therefore, we only need to clear VM_PAT from the old vma in this case. Add untrack_pfn_moved(), which clears VM_PAT from a given old vma. move_vma() is changed to call this function with the old vma when VM_PFNMAP is set. move_vma() then calls do_munmap(), and untrack_pfn() is a no-op since VM_PAT is cleared. Reported-by: Stas Sergeev <stsp@list.ru> Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/1450832064-10093-2-git-send-email-toshi.kani@hpe.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-01-04af_unix: Fix splice-bind deadlockRainer Weikusat
On 2015/11/06, Dmitry Vyukov reported a deadlock involving the splice system call and AF_UNIX sockets, http://lists.openwall.net/netdev/2015/11/06/24 The situation was analyzed as (a while ago) A: socketpair() B: splice() from a pipe to /mnt/regular_file does sb_start_write() on /mnt C: try to freeze /mnt wait for B to finish with /mnt A: bind() try to bind our socket to /mnt/new_socket_name lock our socket, see it not bound yet decide that it needs to create something in /mnt try to do sb_start_write() on /mnt, block (it's waiting for C). D: splice() from the same pipe to our socket lock the pipe, see that socket is connected try to lock the socket, block waiting for A B: get around to actually feeding a chunk from pipe to file, try to lock the pipe. Deadlock. on 2015/11/10 by Al Viro, http://lists.openwall.net/netdev/2015/11/10/4 The patch fixes this by removing the kern_path_create related code from unix_mknod and executing it as part of unix_bind prior acquiring the readlock of the socket in question. This means that A (as used above) will sb_start_write on /mnt before it acquires the readlock, hence, it won't indirectly block B which first did a sb_start_write and then waited for a thread trying to acquire the readlock. Consequently, A being blocked by C waiting for B won't cause a deadlock anymore (effectively, both A and B acquire two locks in opposite order in the situation described above). Dmitry Vyukov(<dvyukov@google.com>) tested the original patch. Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04net: Propagate lookup failure in l3mdev_get_saddr to callerDavid Ahern
Commands run in a vrf context are not failing as expected on a route lookup: root@kenny:~# ip ro ls table vrf-red unreachable default root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254 ping: Warning: source address might be selected on device other than vrf-red. PING 10.100.1.254 (10.100.1.254) from 0.0.0.0 vrf-red: 56(84) bytes of data. --- 10.100.1.254 ping statistics --- 2 packets transmitted, 0 received, 100% packet loss, time 999ms Since the vrf table does not have a route for 10.100.1.254 the ping should have failed. The saddr lookup causes a full VRF table lookup. Propogating a lookup failure to the user allows the command to fail as expected: root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254 connect: No route to host Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04Merge branch 'faster-soreuseport'David S. Miller
Craig Gallek says: ==================== Faster SO_REUSEPORT This series contains two optimizations for the SO_REUSEPORT feature: Faster lookup when selecting a socket for an incoming packet and the ability to select the socket from the group using a BPF program. This series only includes the UDP path. I plan to submit a follow-up including the TCP path if the implementation in this series is acceptable. Changes in v4: - pskb_may_pull is unnecessary with pskb_pull (per Alexei Starovoitov) Changes in v3: - skb_pull_inline -> pskb_pull (per Alexei Starovoitov) - reuseport_attach* -> sk_reuseport_attach* and simple return statement syntax change (per Daniel Borkmann) Changes in v2: - Fix ARM build; remove unnecessary include. - Handle case where protocol header is not in linear section (per Alexei Starovoitov). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04soreuseport: BPF selection functional testCraig Gallek
This program will build classic and extended BPF programs and validate the socket selection logic when used with SO_ATTACH_REUSEPORT_CBPF and SO_ATTACH_REUSEPORT_EBPF. It also validates the re-programing flow and several edge cases. Signed-off-by: Craig Gallek <kraig@google.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPFCraig Gallek
Expose socket options for setting a classic or extended BPF program for use when selecting sockets in an SO_REUSEPORT group. These options can be used on the first socket to belong to a group before bind or on any socket in the group after bind. This change includes refactoring of the existing sk_filter code to allow reuse of the existing BPF filter validation checks. Signed-off-by: Craig Gallek <kraig@google.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>