linux.git - Linus' kernel tree

Age	Commit message	Author
2015-03-13	phy: core: Fixup return value of phy_exit when !pm_runtime_enabled	Axel Lin
	When phy_pm_runtime_get_sync() returns -ENOTSUPP, phy_exit() also returns -ENOTSUPP if !phy->ops->exit. Fix it. Also move the code to override ret close to the code we got ret. I think it is less error prone this way. Signed-off-by: Axel Lin <axel.lin@ingics.com> Acked-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
2015-03-13	phy: miphy28lp: Convert to devm_kcalloc and fix wrong sizof	Axel Lin
	Prefer devm_kcalloc over devm_kzalloc with multiply. In addition, using sizeof(phy) is incorrect; fix it. Signed-off-by: Axel Lin <axel.lin@ingics.com> Acked-by: Gabriel Fernandez <gabriel.fernandez@linaro.org> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
2015-03-13	phy: miphy365x: Convert to devm_kcalloc and fix wrong sizeof	Axel Lin
	Prefer devm_kcalloc over devm_kzalloc with multiply. In addition, using sizeof(phy) is incorrect; fix it. Signed-off-by: Axel Lin <axel.lin@ingics.com> Acked-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
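The allocation idiom these two commits move towards can be sketched in user space. This is only an illustration: `demo_calloc_array`, `demo_alloc_phy_table` and `struct demo_phy` are hypothetical names, and plain `malloc` stands in for the device-managed kernel allocator. The point is that the element count and element size stay separate arguments (so the multiplication can be overflow-checked) and that `sizeof` is taken from the array being allocated rather than from an unrelated variable.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-PHY state; not the layout of any real driver struct. */
struct demo_phy {
	int id;
	void *base;
};

/*
 * User-space stand-in for a kcalloc-style helper: refuse the request when
 * count * size would overflow, otherwise return zeroed memory.
 */
void *demo_calloc_array(size_t count, size_t size)
{
	void *mem;

	if (size != 0 && count > SIZE_MAX / size)
		return NULL;
	mem = malloc(count * size);
	if (mem)
		memset(mem, 0, count * size);
	return mem;
}

/* Allocate a zeroed table of `count` pointers to struct demo_phy. */
struct demo_phy **demo_alloc_phy_table(size_t count)
{
	struct demo_phy **table;

	assert(count > 0);
	/* sizeof(*table) always matches the element type of the table. */
	table = demo_calloc_array(count, sizeof(*table));
	return table;
}
```

Writing `sizeof(*table)` means the size follows the declaration automatically if the element type ever changes.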
2015-03-13	phy: twl4030-usb: Remove redundant assignment for twl->linkstat	Axel Lin
	It's pointless to set twl->linkstat twice. Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
2015-03-13	phy: exynos5-usbdrd: Fix off-by-one valid value checking for args->args[0]	Axel Lin
	Current code uses args->args[0] as array subscript of phy_drd->phys[]. So the valid value range for args->args[0] is 0 ... EXYNOS5_DRDPHYS_NUM - 1. Signed-off-by: Axel Lin <axel.lin@ingics.com> Reviewed by: Vivek Gautam <gautam.vivek@samsung.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
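A minimal sketch of the bounds check described above, assuming a hypothetical two-entry table (`DEMO_PHYS_NUM` stands in for EXYNOS5_DRDPHYS_NUM, and the function names are invented for illustration). Because the index is used as an array subscript, the largest valid value is the count minus one, so the rejection test has to be `>=` rather than `>`.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PHYS_NUM 2	/* hypothetical stand-in for EXYNOS5_DRDPHYS_NUM */

static int demo_phys[DEMO_PHYS_NUM] = { 10, 20 };

/* Internal accessor: callers must already have validated the index. */
static const int *demo_phy_at(unsigned int index)
{
	assert(index < DEMO_PHYS_NUM);
	return &demo_phys[index];
}

/* Translate an untrusted index into a table entry, or NULL if invalid. */
const int *demo_xlate(unsigned int index)
{
	/* '>' here would wrongly accept index == DEMO_PHYS_NUM. */
	if (index >= DEMO_PHYS_NUM)
		return NULL;
	return demo_phy_at(index);
}
```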
2015-03-13	x86/fpu: Drop_fpu() should not assume that tsk equals current	Oleg Nesterov
	drop_fpu() does clear_used_math() and usually this is correct because tsk == current. However switch_fpu_finish()->restore_fpu_checking() is called before __switch_to() updates the "current_task" variable. If it fails, we will wrongly clear the PF_USED_MATH flag of the previous task. So use clear_stopped_child_used_math() instead. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150309171041.GB11388@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-13	x86/fpu: Avoid math_state_restore() without used_math() in __restore_xstate_sig()	Oleg Nesterov
	math_state_restore() assumes it is called with irqs disabled, but this is not true if the caller is __restore_xstate_sig(). This means that if ia32_fxstate == T and __copy_from_user() fails, __restore_xstate_sig() returns with irqs disabled too. This triggers: BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:41 dump_stack ___might_sleep ? _raw_spin_unlock_irqrestore __might_sleep down_read ? _raw_spin_unlock_irqrestore print_vma_addr signal_fault sys32_rt_sigreturn Change __restore_xstate_sig() to call set_used_math() unconditionally. This avoids enabling and disabling interrupts in math_state_restore(). If copy_from_user() fails, we can simply do fpu_finit() by hand. [ Note: this is only the first step. math_state_restore() should not check used_math(), it should set this flag. While init_fpu() should simply die. ] Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150307153844.GB25954@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-13	phy: Find the right match in devm_phy_destroy()	Thierry Reding
	devm_phy_create() stores the pointer to the new PHY at the address returned by devres_alloc(). The res parameter passed to devm_phy_match() is therefore the location where the pointer to the PHY is stored, hence it needs to be dereferenced before comparing to the match data in order to find the correct match. Cc: <stable@vger.kernel.org> # v3.13+ Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
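The matching rule described above can be shown with a small user-space sketch. The names `demo_phy` and `demo_phy_match` are hypothetical and no devres machinery is modelled; the only point is that `res` is the address of a slot holding the pointer, so it has to be dereferenced once before it is compared with the match data.

```c
#include <assert.h>
#include <stddef.h>

struct demo_phy {
	int id;
};

/*
 * `res` is the location where a struct demo_phy pointer was stored,
 * not the PHY itself. Comparing `res` directly with `match_data`
 * would never find the entry.
 */
int demo_phy_match(void *res, void *match_data)
{
	struct demo_phy **slot = res;

	assert(slot != NULL);
	return *slot == match_data;
}
```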
2015-03-13	s390/mm: limit STACK_RND_MASK for compat tasks	Martin Schwidefsky
	For compat tasks the mmap randomization does not use the maximum randomization value from mmap_rnd_mask but the fixed value of 0x7ff. This needs to be respected in the definition of STACK_RND_MASK as well. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-03-13	s390/ftrace: fix compile error if CONFIG_KPROBES is disabled	Heiko Carstens
	Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-03-13	s390/cpum_sf: add diagnostic sampling event only if it is authorized	Hendrik Brueckner
	The SF_CYCLES_BASIC_DIAG is always registered even if it is turned off in the current hardware configuration. Because diagnostic-sampling is typically not turned on in the hardware configuration, do not register this perf event by default. Enable it only if the diagnostic-sampling function is authorized. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-03-13	netfilter: Fix potential crash in nft_hash walker	Herbert Xu
	When we get back an EAGAIN from rhashtable_walk_next we were treating it as a valid object which obviously doesn't work too well. Luckily this is hard to trigger so it seems nobody has run into it yet. This patch fixes it by redoing the next call when we get an EAGAIN. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
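The retry behaviour can be sketched with a toy iterator. Everything here is hypothetical (`demo_walker`, `demo_walk_next`, `demo_sum_all`); it is not the rhashtable walker API, which reports the condition differently. The sketch only shows the control flow: an -EAGAIN result is not an object and the call is simply repeated.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_walker {
	const int *items;
	size_t count;
	size_t pos;
	int pending_eagain;	/* how many times to report -EAGAIN first */
};

/* Returns 0 and sets *out (NULL at the end), or -EAGAIN to ask for a retry. */
int demo_walk_next(struct demo_walker *w, const int **out)
{
	assert(w != NULL && out != NULL);
	if (w->pending_eagain > 0) {
		w->pending_eagain--;
		return -EAGAIN;	/* *out is deliberately left untouched */
	}
	*out = (w->pos < w->count) ? &w->items[w->pos++] : NULL;
	return 0;
}

long demo_sum_all(struct demo_walker *w)
{
	const int *item = NULL;
	long sum = 0;

	for (;;) {
		int err = demo_walk_next(w, &item);

		if (err == -EAGAIN)
			continue;	/* not an object: redo the call */
		if (err || item == NULL)
			break;		/* hard error or end of walk */
		sum += *item;
	}
	return sum;
}
```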
2015-03-13	Bluetooth: btusb: Add helper for READ_LOCAL_VERSION command	Daniel Drake
	Multiple codepaths duplicate some simple code to read and sanity-check local version information. Before I add a couple more such codepaths, add a helper to reduce duplication. Signed-off-by: Daniel Drake <drake@endlessm.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-03-13	arm/arm64: KVM: fix missing unlock on error in kvm_vgic_create()	Wei Yongjun
	Add the missing unlock before return from function kvm_vgic_create() in the error handling case. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
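A common way to avoid this class of bug is to route every exit through a single unlock label. The sketch below assumes a toy lock (`struct demo_lock`) and a hypothetical `demo_create()`; it does not reproduce kvm_vgic_create() and only illustrates the error-path shape.

```c
#include <assert.h>
#include <errno.h>

/* Toy lock that just records whether it is held. */
struct demo_lock {
	int held;
};

static void demo_lock_acquire(struct demo_lock *l)
{
	assert(!l->held);
	l->held = 1;
}

static void demo_lock_release(struct demo_lock *l)
{
	assert(l->held);
	l->held = 0;
}

int demo_create(struct demo_lock *l, int already_created)
{
	int ret = 0;

	demo_lock_acquire(l);
	if (already_created) {
		ret = -EEXIST;
		goto out;	/* a bare `return` here would leave the lock held */
	}
	/* ... normal setup work would go here ... */
out:
	demo_lock_release(l);
	return ret;
}
```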
2015-03-13	crypto: aesni - fix memory usage in GCM decryption	Stephan Mueller
	The kernel crypto API logic requires the caller to provide the length of (ciphertext || authentication tag) as cryptlen for the AEAD decryption operation. Thus, the cipher implementation must calculate the size of the plaintext output itself and cannot simply use cryptlen. The RFC4106 GCM decryption operation tries to overwrite cryptlen memory in req->dst. As the destination buffer for decryption only needs to hold the plaintext memory but cryptlen references the input buffer holding (ciphertext || authentication tag), the assumption of the destination buffer length in RFC4106 GCM operation leads to a too large size. This patch simply uses the already calculated plaintext size. In addition, this patch fixes the offset calculation of the AAD buffer pointer: as mentioned before, cryptlen already includes the size of the tag. Thus, the tag does not need to be added. With the addition, the AAD will be written beyond the already allocated buffer. Note, this fixes a kernel crash that can be triggered from user space via AF_ALG(aead) -- simply use the libkcapi test application from [1] and update it to use rfc4106-gcm-aes. Using [1], the changes were tested using CAVS vectors to demonstrate that the crypto operation still delivers the right results. [1] http://www.chronox.de/libkcapi.html CC: Tadeusz Struk <tadeusz.struk@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
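The length arithmetic in this description is small enough to write down. The helper below is a hypothetical sketch (`demo_plaintext_len` is not a kernel function): for decryption the input length covers ciphertext plus tag, so the number of bytes written to the destination is that length minus the tag length, and an input shorter than the tag is rejected.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * cryptlen: length of (ciphertext || authentication tag)
 * taglen:   length of the authentication tag
 * On success stores the plaintext length in *out and returns 0.
 */
int demo_plaintext_len(size_t cryptlen, size_t taglen, size_t *out)
{
	assert(out != NULL);
	if (cryptlen < taglen)
		return -EINVAL;	/* input cannot even hold the tag */
	*out = cryptlen - taglen;
	return 0;
}
```

With a 16-byte tag and 48 bytes of input, only 32 bytes of destination buffer are needed; sizing or writing the destination by the full 48 is the overrun the commit describes.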
2015-03-13	Bluetooth: Introduce hci_dev_test_and_set_flag helper macro	Marcel Holtmann
	Instead of manually coding test_and_set_bit on hdev->dev_flags all the time, use hci_dev_test_and_set_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-03-13	Bluetooth: Introduce hci_dev_test_and_clear_flag helper macro	Marcel Holtmann
	Instead of manually coding test_and_clear_bit on hdev->dev_flags all the time, use hci_dev_test_and_clear_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-03-13	Bluetooth: Introduce hci_dev_test_and_change_flag helper macro	Marcel Holtmann
	Instead of manually coding test_and_change_bit on hdev->dev_flags all the time, use hci_dev_test_and_change_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-03-13	Bluetooth: Introduce hci_dev_change_flag helper macro	Marcel Holtmann
	Instead of manually coding change_bit on hdev->dev_flags all the time, use hci_dev_change_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-03-13	Bluetooth: Introduce hci_dev_clear_flag helper macro	Marcel Holtmann
	Instead of manually coding clear_bit on hdev->dev_flags all the time, use hci_dev_clear_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-03-13	Bluetooth: Introduce hci_dev_set_flag helper macro	Marcel Holtmann
	Instead of manually coding set_bit on hdev->dev_flags all the time, use hci_dev_set_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-03-13	Bluetooth: Introduce hci_dev_test_flag helper macro	Marcel Holtmann
	Instead of manually coding test_bit on hdev->dev_flags all the time, use hci_dev_test_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
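The shape of this family of helper macros can be sketched in user space. The names below (`demo_dev`, `demo_dev_*_flag`, `demo_*_bit`) are hypothetical, and the bit helpers are plain, non-atomic C, unlike the kernel bitops they stand in for. The sketch only shows how thin wrappers hide the `->dev_flags` member from call sites.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

struct demo_dev {
	unsigned long dev_flags[2];
};

/* Non-atomic stand-ins for test_bit/set_bit/clear_bit and friends. */
static inline int demo_test_bit(unsigned int nr, const unsigned long *addr)
{
	assert(nr < 2 * DEMO_BITS_PER_LONG);
	return (int)((addr[nr / DEMO_BITS_PER_LONG] >>
		      (nr % DEMO_BITS_PER_LONG)) & 1UL);
}

static inline void demo_set_bit(unsigned int nr, unsigned long *addr)
{
	assert(nr < 2 * DEMO_BITS_PER_LONG);
	addr[nr / DEMO_BITS_PER_LONG] |= 1UL << (nr % DEMO_BITS_PER_LONG);
}

static inline void demo_clear_bit(unsigned int nr, unsigned long *addr)
{
	assert(nr < 2 * DEMO_BITS_PER_LONG);
	addr[nr / DEMO_BITS_PER_LONG] &= ~(1UL << (nr % DEMO_BITS_PER_LONG));
}

static inline int demo_test_and_set_bit(unsigned int nr, unsigned long *addr)
{
	int old = demo_test_bit(nr, addr);

	demo_set_bit(nr, addr);
	return old;
}

static inline int demo_test_and_clear_bit(unsigned int nr, unsigned long *addr)
{
	int old = demo_test_bit(nr, addr);

	demo_clear_bit(nr, addr);
	return old;
}

/* Wrappers in the style of the series: callers never name dev_flags. */
#define demo_dev_test_flag(d, nr)	demo_test_bit((nr), (d)->dev_flags)
#define demo_dev_set_flag(d, nr)	demo_set_bit((nr), (d)->dev_flags)
#define demo_dev_clear_flag(d, nr)	demo_clear_bit((nr), (d)->dev_flags)
#define demo_dev_test_and_set_flag(d, nr) \
	demo_test_and_set_bit((nr), (d)->dev_flags)
#define demo_dev_test_and_clear_flag(d, nr) \
	demo_test_and_clear_bit((nr), (d)->dev_flags)
```

One benefit of such wrappers is that the storage behind the flags can later change without touching every call site.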
2015-03-13	Bluetooth: Add support connectable advertising setting	Marcel Holtmann
	The patch adds a second advertising setting that allows switching of the controller into connectable mode independent of the global connectable setting. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-03-13	dmaengine: at_hdmac: Fix calculation of the residual bytes	Torsten Fleischer
	This patch fixes the following issues regarding the calculation of the residue: 1. The residue is always calculated for the current transfer even if the cookie is associated with a pending transfer. 2. For scatter/gather DMA the calculation of the residue for the current transfer doesn't include the bytes of the child descriptors that are already transferred. It only calculates the difference between the transfer's total length and the number of bytes that are already transferred for the current child descriptor. For example: There is a scatter/gather DMA transfer with a total length of 1 MByte. Getting the residue several times while the transfer is running shows something like this: 1: residue = 975584 2: residue = 1002766 3: residue = 992627 4: residue = 983767 5: residue = 985694 6: residue = 1008094 7: residue = 1009741 8: residue = 1011195 3. The driver stores the residue but never resets it when starting a new transfer. For example: Suppose there are two consecutive DMA transfers, the first with a total length of 1 MByte and the second with a total length of 1 kByte. Getting the residue for both transfers shows something like this: transfer 1: residue = 975584 transfer 2: residue = 1048380 Changes from V1: * Fixed coding style of the multi-line comments. * Improved accuracy of the residue calculation when the transfer for the first descriptor is active. Changes from V2: * Member 'tx_width' of 'struct at_desc' restored, because the transfer width can't be derived from the source width when using "slave_sg". The transfer width is needed for the calculation of the residue if either the transfer of the first or the last descriptor is in progress. In the case of a "memory_to_memory_sg" transfer (part of this patch series) the transfer width of both descriptors may differ. Thus it is required to additionally set 'tx_width' of the last descriptor. * Added functions for calculations that are used in multiple places. Signed-off-by: Torsten Fleischer <torfl6749@gmail.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
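Issue 2 above comes down to which bytes are subtracted from the total. A minimal sketch, assuming a transfer is modelled as an array of child descriptor lengths (`demo_residue` and its parameters are hypothetical, and no hardware registers are involved): the residue is what remains in the current descriptor plus everything in the descriptors after it, so already completed descriptors no longer count.

```c
#include <assert.h>
#include <stddef.h>

/*
 * desc_len:        byte length of each child descriptor
 * ndesc:           number of child descriptors
 * current:         index of the descriptor in progress (ndesc when finished)
 * done_in_current: bytes already moved for the current descriptor
 */
size_t demo_residue(const size_t *desc_len, size_t ndesc,
		    size_t current, size_t done_in_current)
{
	size_t residue = 0;

	/* Descriptors before `current` are complete and contribute nothing. */
	for (size_t i = current; i < ndesc; i++)
		residue += desc_len[i];

	if (current < ndesc) {
		assert(done_in_current <= desc_len[current]);
		residue -= done_in_current;
	}
	return residue;
}
```

Computed this way the residue can only shrink as the transfer advances, in contrast to the fluctuating values quoted in the commit message.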
2015-03-13	perf: Fix context leak in put_event()	Leon Yu
	Commit: a83fe28e2e45 ("perf: Fix put_event() ctx lock") changed the locking logic in put_event() by replacing mutex_lock_nested() with perf_event_ctx_lock_nested(), but didn't fix the subsequent mutex_unlock() with a correct counterpart, perf_event_ctx_unlock(). Contexts are thus leaked as a result of incremented refcount in perf_event_ctx_lock_nested(). Signed-off-by: Leon Yu <chianglungyu@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Fixes: a83fe28e2e45 ("perf: Fix put_event() ctx lock") Link: http://lkml.kernel.org/r/1424954613-5034-1-git-send-email-chianglungyu@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
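The leak follows from an asymmetric lock/unlock pair, which a toy model makes visible. The types and functions below are hypothetical (`demo_ctx`, `demo_ctx_lock`, `demo_ctx_unlock`, `demo_raw_unlock`) and only mimic the described behaviour: the locking helper also takes a reference, so only the matching unlock helper gives it back.

```c
#include <assert.h>
#include <stddef.h>

struct demo_ctx {
	int refcount;
	int locked;
};

/* Takes a reference and then the lock, like the helper described above. */
struct demo_ctx *demo_ctx_lock(struct demo_ctx *ctx)
{
	assert(ctx != NULL && !ctx->locked);
	ctx->refcount++;
	ctx->locked = 1;
	return ctx;
}

/* Correct counterpart: drops the lock and the reference. */
void demo_ctx_unlock(struct demo_ctx *ctx)
{
	assert(ctx->locked);
	ctx->locked = 0;
	ctx->refcount--;
}

/* Mismatched counterpart: drops only the lock, leaking one reference. */
void demo_raw_unlock(struct demo_ctx *ctx)
{
	assert(ctx->locked);
	ctx->locked = 0;
}
```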
2015-03-13	ALSA: hda - Don't access stereo amps for mono channel widgets	Takashi Iwai
	The current HDA generic parser always initializes / modifies the amp values in stereo, but this seems to cause a problem on the ALC3229 codec, which has a few mono channel widgets: namely, these mono widgets react to actions for both channels equally. In the driver code, we do take care of the mono channel and create a control only for the left channel (as defined in the HD-audio spec) for such a node. When the control is updated, only the left channel value is changed. However, on resume, the right channel value is also restored from the initial value we took as stereo, and this overwrites the left channel value. This ends up as silent output, because the right channel has never been touched and remains muted. This patch covers the places where unconditional stereo amp accesses are done and converts them to conditional accesses. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=94581 Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2015-03-13	Merge branch 'tcp_metrics_netns_debloat'	David S. Miller
	Eric W. Biederman says: ==================== tcp_metrics: Network namespace bloat reduction v3 This is a small pile of patches that convert tcp_metrics from using a hash table per network namespace to using a single hash table for all network namespaces. This is broken up into several patches so that each small step along the way could be carefully scrutinized as I wrote it, and equally so that each small step can be reviewed. There are several cleanups included in this series. The addition of panic calls during boot where we can not handle failure, and not trying simplifies the code. The removal of the return code from tcp_metrics_flush_all. The motivation for this change is that the tcp_metrics hash table at 128KiB is one of the largest components of a freshly allocated network namespace. I am resending because the previous version I sent has suffered bitrot, so I have respun the patches so that they apply. I believe I have addressed all of the review concerns except optimal behavior on little machines with 32-byte cache lines, which is beyond me as even the current code has bad behavior in that case. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-13	tcp_metrics: Use a single hash table for all network namespaces.	Eric W. Biederman
	Now that all of the operations are safe on a single hash table across network namespaces, allocate a single global hash table and update the code to use it. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-13	tcp_metrics: Rewrite tcp_metrics_flush_all	Eric W. Biederman
	Rewrite tcp_metrics_flush_all so that it can cope with entries from different network namespaces on its hash chain. This is based on the logic in tcp_metrics_nl_cmd_del for deleting a selection of entries from a tcp metrics hash chain. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-13	tcp_metrics: Remove the unused return code from tcp_metrics_flush_all	Eric W. Biederman
	tcp_metrics_flush_all always returns 0. Remove the unnecessary return code. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-13	tcp_metrics: Add a field tcpm_net and verify it matches on lookup	Eric W. Biederman
	In preparation for using one tcp metrics hash table for all network namespaces add a field tcpm_net to struct tcp_metrics_block, and verify that field on all hash table lookups. Make the field tcpm_net of type possible_net_t so it takes no space when network namespaces are disabled. Further add a function tm_net to read that field so we can be efficient when network namespaces are disabled and concise the rest of the time. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-13	tcp_metrics: Mix the network namespace into the hash function.	Eric W. Biederman
	In preparation for using one hash table for all network namespaces mix the network namespace into the hash value. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
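The two preparatory steps in this series (store the namespace in each entry and check it on lookup, then mix the namespace into the hash) can be sketched together. All names are hypothetical (`demo_net`, `demo_entry`, `demo_table`), the hash is an arbitrary multiplicative mix chosen for the example, and a small integer id stands in for whatever namespace identity the real code hashes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BUCKET_BITS 3
#define DEMO_BUCKETS (1u << DEMO_BUCKET_BITS)

struct demo_net {
	uint32_t id;	/* stand-in for a network namespace identity */
};

struct demo_entry {
	const struct demo_net *net;	/* owner, checked on every lookup */
	uint32_t addr;
	int value;
	struct demo_entry *next;
};

struct demo_table {
	struct demo_entry *bucket[DEMO_BUCKETS];
};

/* Mix the namespace into the hash so namespaces spread across buckets. */
static uint32_t demo_hash(const struct demo_net *net, uint32_t addr)
{
	uint32_t h = (addr ^ net->id) * 0x9E3779B1u;

	return h >> (32 - DEMO_BUCKET_BITS);
}

void demo_insert(struct demo_table *t, struct demo_entry *e)
{
	uint32_t b = demo_hash(e->net, e->addr);

	e->next = t->bucket[b];
	t->bucket[b] = e;
}

struct demo_entry *demo_lookup(struct demo_table *t,
			       const struct demo_net *net, uint32_t addr)
{
	struct demo_entry *e;

	assert(t != NULL && net != NULL);
	for (e = t->bucket[demo_hash(net, addr)]; e; e = e->next) {
		/* Same address in another namespace must not match. */
		if (e->addr == addr && e->net == net)
			return e;
	}
	return NULL;
}
```

The owner check is what makes sharing safe. Mixing the namespace into the hash is only about distribution, since entries from different namespaces can still collide in one bucket.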
2015-03-13	tcp_metrics: panic when tcp_metrics_init fails.	Eric W. Biederman
	There is not a practical way to cleanup during boot so just panic if there is a problem initializing tcp_metrics. That will at least give us a clear place to start debugging if something does go wrong. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-13	uapi/virtio_scsi: allow overriding CDB/SENSE size	Michael S. Tsirkin
	QEMU wants to use virtio scsi structures with a different VIRTIO_SCSI_CDB_SIZE/VIRTIO_SCSI_SENSE_SIZE, let's add ifdefs to allow overriding them. Keep the old defines under new names: VIRTIO_SCSI_CDB_DEFAULT_SIZE/VIRTIO_SCSI_SENSE_DEFAULT_SIZE, since that's what these values really are: defaults for cdb/sense size fields. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
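The override mechanism described above is the usual `#ifndef` guard around a default. The sketch below uses hypothetical `DEMO_` names and illustrative values rather than the real header: the defaults keep their own names, and the size macros fall back to them only when the including code has not defined its own.

```c
#include <assert.h>

/* Defaults keep distinct names so they remain available after an override. */
#define DEMO_CDB_DEFAULT_SIZE   32
#define DEMO_SENSE_DEFAULT_SIZE 96

/* A user of the header may define these before including it. */
#ifndef DEMO_CDB_SIZE
#define DEMO_CDB_SIZE DEMO_CDB_DEFAULT_SIZE
#endif
#ifndef DEMO_SENSE_SIZE
#define DEMO_SENSE_SIZE DEMO_SENSE_DEFAULT_SIZE
#endif

static_assert(DEMO_CDB_SIZE > 0 && DEMO_SENSE_SIZE > 0,
	      "overridden sizes must be positive");

/* Hypothetical request layout whose array sizes follow the macros. */
struct demo_cmd {
	unsigned char cdb[DEMO_CDB_SIZE];
	unsigned char sense[DEMO_SENSE_SIZE];
};
```

A consumer that needs different sizes would define `DEMO_CDB_SIZE` before the include, for example with `-DDEMO_CDB_SIZE=16` on the compiler command line.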
2015-03-13	virtio_mmio: generation support	Michael S. Tsirkin
	virtio_mmio currently lacks generation support which makes multi-byte field access racy. Fix by getting the value at offset 0xfc for version 2 devices. Nothing we can do for version 1, so return generation id 0. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-03-13	virtio_rpmsg: set DRIVER_OK before using device	Michael S. Tsirkin
	virtio spec requires that all drivers set DRIVER_OK before using devices. While rpmsg isn't yet included in the virtio 1 spec, previous spec versions also required this. virtio rpmsg violates this rule: it calls kick before setting DRIVER_OK. The fix isn't trivial since simply calling virtio_device_ready earlier would mean we might get an interrupt in parallel with adding buffers. Instead, split kick out to prepare+notify calls. prepare before virtio_device_ready - when we know we won't get interrupts. notify right afterwards. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Ohad Ben-Cohen <ohad@wizery.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-03-13	9p/trans_virtio: fix hot-unplug	Michael S. Tsirkin
	On device hot-unplug, 9p/virtio currently will kfree channel while it might still be in use. Of course, it might stay used forever, so it's an extremely ugly hack, but it seems better than use-after-free that we have now. [ Unused variable removed, whitespace cleanup, msg single-lined --RR ] Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-03-12	vxlan: Don't set s_addr in vxlan_create_sock	Simon Horman
	In the case of AF_INET s_addr was set to INADDR_ANY (0), which was both asymmetric with the AF_INET6 case, where s_addr is not set, and unnecessary as udp_conf is zeroed out earlier in the same function. I suspect this change does not have any run-time effect due to compiler optimisations. But it does make the code a little easier on the/my eyes. Cc: Tom Herbert <therbert@google.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-12	mpls: In mpls_egress verify the packet length.	Eric W. Biederman
	Robert Shearman noticed that mpls_egress is failing to verify that the bytes to be examined are in fact present in the packet before mpls_egress reads those bytes. As suggested by David Miller reduce this to a single pskb_may_pull call so that we don't do unnecessary work in the fast path. Reported-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
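The idea of one up-front length check can be sketched on a plain byte buffer. This is an assumption-laden illustration, not the mpls_egress code: `demo_read_ttl` is a hypothetical helper, and the 9-byte bound is chosen for this sketch because it covers both the IPv4 TTL (byte offset 8) and the IPv6 hop limit (byte offset 7).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Read the TTL / hop limit of the IP header at the start of `pkt`.
 * Returns 0 on success, -1 if the bytes are not present or the
 * version nibble is unknown.
 */
int demo_read_ttl(const uint8_t *pkt, size_t len, uint8_t *ttl)
{
	assert(pkt != NULL && ttl != NULL);

	/* One check covering every byte examined below. */
	if (len < 9)
		return -1;

	switch (pkt[0] >> 4) {
	case 4:
		*ttl = pkt[8];	/* IPv4 TTL */
		return 0;
	case 6:
		*ttl = pkt[7];	/* IPv6 hop limit */
		return 0;
	default:
		return -1;
	}
}
```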
2015-03-12	net/macb: Only adjust tx_clk on link change	Jaeden Amero
	The PHY state machine (in drivers/net/phy/phy.c) will unconditionally call phydev->adjust_link (macb_handle_link_change) when polling in the PHY_CHANGELINK state. As currently written, macb always ends up requesting a new tx_clk frequency in macb_handle_link_change. It is a waste of time to request a new tx_clk frequency if the link state hasn't changed, as the tx_clk will already be configured properly. Let's only request a new tx_clk clock frequency when necessary. Signed-off-by: Jaeden Amero <jaeden.amero@ni.com> Cc: Josh Cartwright <joshc@ni.com> Cc: Soren Brinkmann <soren.brinkmann@xilinx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
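The change-detection pattern described above can be sketched with a small state struct. The names are hypothetical (`demo_mac`, `demo_handle_link_change`) and a counter stands in for the actual clock request. The callback may run on every poll, but the expensive step happens only when the cached link state differs from the reported one.

```c
#include <assert.h>
#include <stddef.h>

struct demo_mac {
	int link;		/* cached link state, 0 = down */
	int speed;		/* cached speed in Mbit/s */
	unsigned int clk_requests;	/* counts simulated tx_clk requests */
};

/* Called unconditionally by a poller; acts only on real changes. */
void demo_handle_link_change(struct demo_mac *m, int link, int speed)
{
	int changed = 0;

	assert(m != NULL);
	if (link != m->link) {
		m->link = link;
		changed = 1;
	}
	if (link && speed != m->speed) {
		m->speed = speed;
		changed = 1;
	}
	/* Only a changed, up link needs a new clock rate. */
	if (changed && link)
		m->clk_requests++;
}
```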
2015-03-12	rhashtable: Fix read-side crash during rehash	Herbert Xu
	This patch fixes a typo in rhashtable_lookup_compare where we fail to recompute the hash when looking up the new table. This causes elements to be missed and potentially a crash during a resize. Reported-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-12	rhashtable: kill ht->shift atomic operations	Daniel Borkmann
	Commit c0c09bfdc415 ("rhashtable: avoid unnecessary wakeup for worker queue") changed ht->shift to be atomic, which is actually unnecessary. Instead of leaving the current shift in the core rhashtable structure, it can be cached inside the individual bucket tables. There, it will only be initialized once during a new table allocation in the shrink/expansion slow path, and from then onward it stays immutable for the rest of the bucket table lifetime. That allows shift to be non-atomic. The patch also moves hash_rnd management into the table setup. The rhashtable structure now consumes 3 instead of 4 cachelines. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Ying Xue <ying.xue@windriver.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-12	rhashtable: Fix reader/rehash race	Herbert Xu
	There is a potential race condition between readers and the rehasher. In particular, the rehasher could have started a rehash while the reader finishes a scan of the old table but fails to see the new table pointer. This patch closes this window by adding smp_wmb/smp_rmb. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-12	Merge branch 'listener_refactor'	David S. Miller
	Eric Dumazet says: ==================== inet: tcp listener refactoring, part 8 These patches prepare request socks being hashed into general ehash table : We declare 3 aliases (ireq_state, ireq_refcnt, ireq_family) Note that refcnt is not yet handled, this will be done later. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-12	inet: introduce ireq_family	Eric Dumazet
	Before inserting request socks into general hash table, fill their socket family. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-12	inet: get_openreq4() & get_openreq6() do not need listener	Eric Dumazet
	ireq->ir_num contains the local port, use it. Also, get_openreq4() dumping listen_sk->refcnt makes little sense. inet_diag_fill_req() can also use ireq->ir_num. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-12	inet: prepare sock_edemux() & sock_gen_put() for new SYN_RECV state	Eric Dumazet
	sock_edemux() & sock_gen_put() should be ready to cope with request socks. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-12	net: add req_prot_cleanup() & req_prot_init() helpers	Eric Dumazet
	Make proto_register() & proto_unregister() a bit nicer. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-12	inet: add rsk_refcnt/ireq_refcnt to request socks	Eric Dumazet
	When request socks are in ehash, they'll need to be refcounted. This patch adds the rsk_refcnt/ireq_refcnt macros and the reqsk_put() function, but nothing uses them yet. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
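A reference-counted "put" helper of the kind this commit introduces usually has the shape below. This is a user-space sketch with hypothetical names (`demo_req`, `demo_req_hold`, `demo_req_put`) using C11 atomics; it is not the reqsk_put() implementation. The object is freed by whichever caller drops the last reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct demo_req {
	atomic_int refcnt;
};

static int demo_released;	/* counts frees, for illustration only */

struct demo_req *demo_req_alloc(void)
{
	struct demo_req *req = malloc(sizeof(*req));

	if (req)
		atomic_init(&req->refcnt, 1);	/* caller owns one reference */
	return req;
}

void demo_req_hold(struct demo_req *req)
{
	atomic_fetch_add(&req->refcnt, 1);
}

void demo_req_put(struct demo_req *req)
{
	/* atomic_fetch_sub returns the value held before the decrement. */
	int old = atomic_fetch_sub(&req->refcnt, 1);

	assert(old > 0);
	if (old == 1) {
		free(req);
		demo_released++;
	}
}
```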
2015-03-12	inet: add ireq_state field to inet_request_sock	Eric Dumazet
	We need to identify request socks once they are visible in the global ehash table. ireq_state is an alias to req.__req_common.skc_state. Its value is set to TCP_NEW_SYN_RECV. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>