linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2013-10-29	net: mvmdio: make orion_mdio_wait_ready consistent	Leigh Brown
	Amend orion_mdio_wait_ready so that the same timeout is used when polling or using wait_event_timeout. Set the timeout to 1ms. Replace udelay with usleep_range to avoid a busy loop, and set the polling interval range as 45us to 55us, so that the first sleep will be enough in almost all cases. Generate the same log message at timeout when polling or using wait_event_timeout. Signed-off-by: Leigh Brown <leigh@solinno.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29	net/benet: Make lancer_wait_ready() static	Gavin Shan
	The function needn't to be public, so to make it as static. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29	net/benet: Remove interface type	Gavin Shan
	The interface type, which is being traced by "struct be_adapter:: if_type", isn't used currently. So we can remove that safely according to Sathya's comments. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29	netconsole: Convert to pr_<level>	Joe Perches
	Use a more current logging style. Convert printks to pr_<level>. Consolidate multiple printks into a single printk to avoid any possible dmesg interleaving. Add a default "event" msg in case the listed types are ever expanded. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29	bridge: pass correct vlan id to multicast code	Vlad Yasevich
	Currently multicast code attempts to extrace the vlan id from the skb even when vlan filtering is disabled. This can lead to mdb entries being created with the wrong vlan id. Pass the already extracted vlan id to the multicast filtering code to make the correct id is used in creation as well as lookup. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29	Merge branch 'fixes' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch Jesse Gross says: ==================== One patch for net/3.12 fixing an issue where devices could be in an invalid state they are removed while still attached to OVS. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29	net: x25: Fix dead URLs in Kconfig	Michael Drüing
	Update the URLs in the Kconfig file to the new pages at sangoma.com and cisco.com Signed-off-by: Michael Drüing <michael@drueing.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29	net: sched: cls_bpf: add BPF-based classifier	Daniel Borkmann
	This work contains a lightweight BPF-based traffic classifier that can serve as a flexible alternative to ematch-based tree classification, i.e. now that BPF filter engine can also be JITed in the kernel. Naturally, tc actions and policies are supported as well with cls_bpf. Multiple BPF programs/filter can be attached for a class, or they can just as well be written within a single BPF program, that's really up to the user how he wishes to run/optimize the code, e.g. also for inversion of verdicts etc. The notion of a BPF program's return/exit codes is being kept as follows: 0: No match -1: Select classid given in "tc filter ..." command else: flowid, overwrite the default one As a minimal usage example with iproute2, we use a 3 band prio root qdisc on a router with sfq each as leave, and assign ssh and icmp bpf-based filters to band 1, http traffic to band 2 and the rest to band 3. For the first two bands we load the bytecode from a file, in the 2nd we load it inline as an example: echo 1 > /proc/sys/net/core/bpf_jit_enable tc qdisc del dev em1 root tc qdisc add dev em1 root handle 1: prio bands 3 priomap 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 tc qdisc add dev em1 parent 1:1 sfq perturb 16 tc qdisc add dev em1 parent 1:2 sfq perturb 16 tc qdisc add dev em1 parent 1:3 sfq perturb 16 tc filter add dev em1 parent 1: bpf run bytecode-file /etc/tc/ssh.bpf flowid 1:1 tc filter add dev em1 parent 1: bpf run bytecode-file /etc/tc/icmp.bpf flowid 1:1 tc filter add dev em1 parent 1: bpf run bytecode-file /etc/tc/http.bpf flowid 1:2 tc filter add dev em1 parent 1: bpf run bytecode "`bpfc -f tc -i misc.ops`" flowid 1:3 BPF programs can be easily created and passed to tc, either as inline 'bytecode' or 'bytecode-file'. There are a couple of front-ends that can compile opcodes, for example: 1) People familiar with tcpdump-like filters: tcpdump -iem1 -ddd port 22 \| tr '\n' ',' > /etc/tc/ssh.bpf 2) People that want to low-level program their filters or use BPF extensions that lack support by libpcap's compiler: bpfc -f tc -i ssh.ops > /etc/tc/ssh.bpf ssh.ops example code: ldh [12] jne #0x800, drop ldb [23] jneq #6, drop ldh [20] jset #0x1fff, drop ldxb 4 * ([14] & 0xf) ldh [%x + 14] jeq #0x16, pass ldh [%x + 16] jne #0x16, drop pass: ret #-1 drop: ret #0 It was chosen to load bytecode into tc, since the reverse operation, tc filter list dev em1, is then able to show the exact commands again. Possible follow-up work could also include a small expression compiler for iproute2. Tested with the help of bmon. This idea came up during the Netfilter Workshop 2013 in Copenhagen. Also thanks to feedback from Eric Dumazet! Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29	bgmac: separate RX descriptor setup code into a new function	Rafał Miłecki
	This cleans code a bit and will be useful when allocating buffers in other places (like RX path, to avoid skb_copy_from_linear_data_offset). Signed-off-by: Rafał Miłecki <zajec5@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf	David S. Miller
	Pablo Neira Ayuso says: ==================== This pull request contains the following netfilter fix: * fix --queue-bypass in xt_NFQUEUE revision 3. While adding the revision 3 of this target, the bypass flags were not correctly handled anymore, thus, breaking packet bypassing if no application is listening from userspace, patch from Holger Eitzenberger, reported by Florian Westphal. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29	MIPS: Perf: Fix 74K cache map	Deng-Cheng Zhu
	According to Software User's Manual, the event of last-level-cache read/write misses is mapped to even counters. Odd counters of that event number count miss cycles. Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com> Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/6036/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-10-29	iwlwifi: remove duplicate includes	Michael Opdenacker
	Reported by "make includecheck" Tested that the corresponding sources still compile well on x86 Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2013-10-29	Fix a few incorrectly checked [io_]remap_pfn_range() calls	Linus Torvalds
	Nico Golde reports a few straggling uses of [io_]remap_pfn_range() that really should use the vm_iomap_memory() helper. This trivially converts two of them to the helper, and comments about why the third one really needs to continue to use remap_pfn_range(), and adds the missing size check. Reported-by: Nico Golde <nico@ngolde.de> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org.
2013-10-29	Merge branch 'perf-urgent-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf tooling fixes from Ingo Molnar: "This contains five tooling fixes: - fix a remaining mmap2 assumption which resulted in perf top output breakage - fix mmap ring-buffer processing bug that corrupts data - fix for a severe python scripting memory leak - fix broken (and user-visible) -g option handling - fix stdio output The diffstat size is larger than what we'd like to see this late :-/" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf tools: Fixup mmap event consumption perf top: Split -G and --call-graph perf record: Split -g and --call-graph perf hists: Add color overhead for stdio output buffer perf tools: Fix up /proc/PID/maps parsing perf script python: Fix mem leak due to missing Py_DECREFs on dict entries
2013-10-29	Kconfig: make KOBJECT_RELEASE debugging require timer debugging	Linus Torvalds
	Without the timer debugging, the delayed kobject release will just result in undebuggable oopses if it triggers any latent bugs. That doesn't actually help debugging at all. So make DEBUG_KOBJECT_RELEASE depend on DEBUG_OBJECTS_TIMERS to avoid having people enable one without the other. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-10-29	iwlwifi: warn if firmware image doesn't exist	Johannes Berg
	If the firmware image that we attempt to load doesn't actually exist we have a broken firmware file or other code not checking things correctly, so warn in such a case. Also avoid assigning cur_ucode/ucode_loaded then. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2013-10-29	iwlwifi: mvm: add missing break in debugfs	Johannes Berg
	When writing the disable_power_off value, the LPRX enable value also gets written unintentionally, so fix that by adding the missing break statement. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2013-10-29	iwlwifi: mvm: capture the FCS in monitor mode	Johannes Berg
	This can be useful when using the device as a sniffer. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2013-10-29	iwlwifi: pcie: move warning message into warning	Johannes Berg
	Having a WARN_ON() followed by a printed message is less useful than having the message in the warning so move the message. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2013-10-29	iwlwifi: mvm: BT Coex fix NULL pointer dereference	Emmanuel Grumbach
	When we disassociate, mac80211 removes the station and then, it sets the bss it unsets the assoc bool in bss_info. Since the firwmware wants it the opposite (first set the MAC context as unassoc, and only then, remove the STA of the API), we have a small period of time in which the STA in firmware doesn't have a valid ieee80211_sta pointer. During that time, iwl_mvm_vif->ap_sta_id, is still set to the STA in firmware that represent the AP. This avoids: [ 4481.476246] BUG: unable to handle kernel NULL pointer dereference at 00000045 [ 4481.479521] IP: [<f8416a6a>] iwl_mvm_bt_coex_reduced_txp+0x7a/0x190 [iwlmvm] [ 4481.482023] *pde = 00000000 [ 4481.484332] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [ 4481.486897] Modules linked in: netconsole configfs autofs4 rfcomm(O) bnep(O) nfsd nfs_acl auth_rpcgss exportfs nfs lockd binfmt_misc sunrpc fscache arc4 iwlmvm(O) mac80211(O) btusb(O) iwlwifi(O) bluetooth(O) cfg80211(O) snd_hda_codec_hdmi coretemp dell_wmi snd_hda_codec_idt compat(O) dell_laptop aesni_intel i915 sparse_keymap dcdbas cryptd psmouse serio_raw aes_i586 microcode snd_hda_intel drm_kms_helper snd_hda_codec drm snd_pcm snd_timer i2c_algo_bit video intel_agp intel_gtt snd soundcore snd_page_alloc crc32c_intel ahci sdhci_pci libahci sdhci mmc_core e1000e xhci_hcd [last unloaded: configfs] [ 4481.502983] [ 4481.505599] Pid: 6507, comm: kworker/0:1 Tainted: G O 3.4.43-dev #1 Dell Inc. Latitude E6430/0CMDYV [ 4481.508575] EIP: 0060:[<f8416a6a>] EFLAGS: 00010246 CPU: 0 [ 4481.511248] EIP is at iwl_mvm_bt_coex_reduced_txp+0x7a/0x190 [iwlmvm] [ 4481.513947] EAX: ffffffea EBX: 00000002 ECX: 00000001 EDX: 00000001 [ 4481.516710] ESI: ec6f0f28 EDI: 00000000 EBP: e8175dfc ESP: e8175d9c [ 4481.519445] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 [ 4481.522185] CR0: 8005003b CR2: 00000045 CR3: 01a5e000 CR4: 001407d0 [ 4481.524950] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 [ 4481.527768] DR6: ffff0ff0 DR7: 00000400 [ 4481.530565] Process kworker/0:1 (pid: 6507, ti=e8174000 task=e8032b20 task.ti=e8174000) [ 4481.533447] Stack: [ 4481.536379] e472439f 00003a12 e8032b20 e8033048 00000001 e8175ddc 00000246 e8033040 [ 4481.540132] 00000002 01814990 ec4d1ddc e8175dcc 00000000 00000000 00000000 00000000 [ 4481.543867] 00000000 00000000 00000001 000001c8 009b0002 ec4d1ddc ec6f0f28 00000000 [ 4481.547633] Call Trace: [ 4481.550578] [<f8418027>] iwl_mvm_bt_rssi_event+0x197/0x220 [iwlmvm] [ 4481.553537] [<f840919c>] iwl_mvm_stat_iterator+0xdc/0x240 [iwlmvm] [ 4481.556582] [<f8d129c2>] __iterate_active_interfaces+0xe2/0x1f0 [mac80211] [ 4481.559544] [<f84090c0>] ? iwl_mvm_update_smps+0x90/0x90 [iwlmvm] [ 4481.562519] [<f84090c0>] ? iwl_mvm_update_smps+0x90/0x90 [iwlmvm] [ 4481.565498] [<f8d12b0c>] ieee80211_iterate_active_interfaces+0x3c/0x50 [mac80211] [ 4481.568421] [<f8409b43>] iwl_mvm_rx_statistics+0xb3/0x130 [iwlmvm] [ 4481.571349] [<f8405431>] iwl_mvm_async_handlers_wk+0xc1/0xf0 [iwlmvm] [ 4481.574251] [<c1052915>] ? process_one_work+0x105/0x5c0 [ 4481.577162] [<c1052991>] process_one_work+0x181/0x5c0 [ 4481.580025] [<c1052915>] ? process_one_work+0x105/0x5c0 [ 4481.582861] [<f8405370>] ? iwl_mvm_rx_fw_logs+0x20/0x20 [iwlmvm] [ 4481.585722] [<c10530f1>] worker_thread+0x121/0x2c0 [ 4481.588536] [<c1052fd0>] ? rescuer_thread+0x1d0/0x1d0 [ 4481.591323] [<c105af0d>] kthread+0x7d/0x90 [ 4481.594059] [<c105ae90>] ? flush_kthread_worker+0x120/0x120 [ 4481.596868] [<c15b7cc2>] kernel_thread_helper+0x6/0x10 [ 4481.599605] Code: 9d de c3 c8 85 c0 74 0d 80 3d f8 ae 42 f8 00 0f 84 dc 00 00 00 8b 45 c8 0f b6 d3 31 ff 89 55 c0 8b 84 90 d8 03 00 00 0f b6 55 c7 <38> 50 5b 89 45 bc 0f 84 a8 00 00 00 a1 e4 d2 04 c2 85 c0 0f 84 [ 4481.611782] EIP: [<f8416a6a>] iwl_mvm_bt_coex_reduced_txp+0x7a/0x190 [iwlmvm] SS:ESP 0068:e8175d9c [ 4481.614985] CR2: 0000000000000045 [ 4481.687441] ---[ end trace b11bc915fbac4412 ]--- Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2013-10-29	iwlwifi: transport config n_no_reclaim_cmds should be unsigned	Johannes Berg
	The number of commands can never be negative, so it should be using an unsigned type. This also shuts up an smatch warning elsewhere in the code. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2013-10-29	iwlwifi: mvm: update UAPSD support TLV bits	Alexander Bondar
	Change old UAPSD bit to PM_CMD_SUPPORT, and add a new bit to indicate real UAPSD support. Don't use UAPSD when the firmware doesn't support it. Signed-off-by: David Spinadel <david.spinadel@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2013-10-29	drm/i915: Fix the PPT fdi lane bifurcate state handling on ivb	Daniel Vetter
	Originally I've thought that this is leftover hw state dirt from the BIOS. But after way too much helpless flailing around on my part I've noticed that the actual bug is when we change the state of an already active pipe. For example when we change the fdi lines from 2 to 3 without switching off outputs in-between we'll never see the crucial on->off transition in the ->modeset_global_resources hook the current logic relies on. Patch version 2 got this right by instead also checking whether the pipe is indeed active. But that in turn broke things when pipes have been turned off through dpms since the bifurcate enabling is done in the ->crtc_mode_set callback. To address this issues discussed with Ville in the patch review move the setting of the bifurcate bit into the ->crtc_enable hook. That way we won't wreak havoc with this state when userspace puts all other outputs into dpms off state. This also moves us forward with our overall goal to unify the modeset and dpms on paths (which we need to have to allow runtime pm in the dpms off state). Unfortunately this requires us to move the bifurcate helpers around a bit. Also update the commit message, I've misanalyzed the bug rather badly. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70507 Tested-by: Jan-Michael Brummer <jan.brummer@tabos.org> Cc: stable@vger.kernel.org Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-29	netfilter: xt_NFQUEUE: fix --queue-bypass regression	Holger Eitzenberger
	V3 of the NFQUEUE target ignores the --queue-bypass flag, causing packets to be dropped when the userspace listener isn't running. Regression is in since 8746ddcf12bb26 ("netfilter: xt_NFQUEUE: introduce CPU fanout"). Reported-by: Florian Westphal <fw@strlen.de> Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-10-29	i40e: fix error return code in i40e_probe()	Wei Yongjun
	Fix to return -ENOMEM in the memory alloc error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-10-29	ixgbevf: Add zero_base handler to network statistics	Don Skidmore
	This patch removes the need to keep a zero_base variable in the adapter structure. Now we just use two different macros to set the non-zero and zero base. This adds to readability and shortens some of the structure initialization under 80 columns. The gathering of status for ethtool was slightly modified to again better fit into 80 columns and become a bit more readable. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-10-29	ixgbevf: add BP_EXTENDED_STATS for CONFIG_NET_RX_BUSY_POLL	Jacob Keller
	This patch adds the extended statistics similar to the ixgbe driver. These statistics keep track of how often the busy polling yields, as well as how many packets are cleaned or missed by the polling routine. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-10-29	ixgbevf: implement CONFIG_NET_RX_BUSY_POLL	Jacob Keller
	This patch enables CONFIG_NET_RX_BUSY_POLL support in the VF code. This enables sockets which have enabled the SO_BUSY_POLL socket option to use the ndo_busy_poll_recv operation which could result in lower latency, at the cost of higher CPU utilization, and increased power usage. This support is similar to how the ixgbe driver works. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-10-29	perf/x86: Fix NMI measurements	Peter Zijlstra
	OK, so what I'm actually seeing on my WSM is that sched/clock.c is 'broken' for the purpose we're using it for. What triggered it is that my WSM-EP is broken :-( [ 0.001000] tsc: Fast TSC calibration using PIT [ 0.002000] tsc: Detected 2533.715 MHz processor [ 0.500180] TSC synchronization [CPU#0 -> CPU#6]: [ 0.505197] Measured 3 cycles TSC warp between CPUs, turning off TSC clock. [ 0.004000] tsc: Marking TSC unstable due to check_tsc_sync_source failed For some reason it consistently detects TSC skew, even though NHM+ should have a single clock domain for 'reasonable' systems. This marks sched_clock_stable=0, which means that we do fancy stuff to try and get a 'sane' clock. Part of this fancy stuff relies on the tick, clearly that's gone when NOHZ=y. So for idle cpus time gets stuck, until it either wakes up or gets kicked by another cpu. While this is perfectly fine for the scheduler -- it only cares about actually running stuff, and when we're running stuff we're obviously not idle. This does somewhat break down for perf which can trigger events just fine on an otherwise idle cpu. So I've got NMIs get get 'measured' as taking ~1ms, which actually don't last nearly that long: <idle>-0 [013] d.h. 886.311970: rcu_nmi_enter <-do_nmi ... <idle>-0 [013] d.h. 886.311997: perf_sample_event_took: HERE!!! : 1040990 So ftrace (which uses sched_clock(), not the fancy bits) only sees ~27us, but we measure ~1ms !! Now since all this measurement stuff lives in x86 code, we can actually fix it. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: mingo@kernel.org Cc: dave.hansen@linux.intel.com Cc: eranian@google.com Cc: Don Zickus <dzickus@redhat.com> Cc: jmario@redhat.com Cc: acme@infradead.org Link: http://lkml.kernel.org/r/20131017133350.GG3364@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29	perf: Fix perf ring buffer memory ordering	Peter Zijlstra
	The PPC64 people noticed a missing memory barrier and crufty old comments in the perf ring buffer code. So update all the comments and add the missing barrier. When the architecture implements local_t using atomic_long_t there will be double barriers issued; but short of introducing more conditional barrier primitives this is the best we can do. Reported-by: Victor Kaplansky <victork@il.ibm.com> Tested-by: Victor Kaplansky <victork@il.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: michael@ellerman.id.au Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: anton@samba.org Cc: benh@kernel.crashing.org Link: http://lkml.kernel.org/r/20131025173749.GG19466@laptop.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29	ixgbevf: have clean_rx_irq return total_rx_packets cleaned	Jacob Keller
	Rather than return true/false indicating whether there was budget left, return the total packets cleaned. This currently has no use, but will be used in a following patch which enables CONFIG_NET_RX_BUSY_POLL support in order to track how many packets were cleaned during the busy poll as part of the extended statistics. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-10-29	ixgbevf: add ixgbevf_rx_skb	Jacob Keller
	This patch adds ixgbevf_rx_skb in line with how ixgbe handles the variations on how packets can be received. It will be extended in a following patch for CONFIG_NET_RX_BUSY_POLL support. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-10-29	ixgbe: remove unnecessary duplication of PCIe bandwidth display	Jacob Keller
	This patch removes the unnecessary display of PCIe bandwidth twice. Since the ixgbe_check_minimum_link does a better job, and ensures accurate detection on even complex chains, this older check is no longer necessary. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-10-29	ixgbe: show <2% for encoding loss on PCIe Gen3	Jacob Keller
	This patch updates the ixgbe_check_minimum_link function to correctly show that there is some minor loss of encoding, even though we don't calculate it in the max GT/s equation. It is small enough to not bother, but is better to report it than not. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-10-29	mm: Account for a THP NUMA hinting update as one PTE update	Mel Gorman
	A THP PMD update is accounted for as 512 pages updated in vmstat. This is large difference when estimating the cost of automatic NUMA balancing and can be misleading when comparing results that had collapsed versus split THP. This patch addresses the accounting issue. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-10-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29	mm: Close races between THP migration and PMD numa clearing	Mel Gorman
	THP migration uses the page lock to guard against parallel allocations but there are cases like this still open Task A Task B --------------------- --------------------- do_huge_pmd_numa_page do_huge_pmd_numa_page lock_page mpol_misplaced == -1 unlock_page goto clear_pmdnuma lock_page mpol_misplaced == 2 migrate_misplaced_transhuge pmd = pmd_mknonnuma set_pmd_at During hours of testing, one crashed with weird errors and while I have no direct evidence, I suspect something like the race above happened. This patch extends the page lock to being held until the pmd_numa is cleared to prevent migration starting in parallel while the pmd_numa is being cleared. It also flushes the old pmd entry and orders pagetable insertion before rmap insertion. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-9-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29	mm: numa: Sanitize task_numa_fault() callsites	Mel Gorman
	There are three callers of task_numa_fault(): - do_huge_pmd_numa_page(): Accounts against the current node, not the node where the page resides, unless we migrated, in which case it accounts against the node we migrated to. - do_numa_page(): Accounts against the current node, not the node where the page resides, unless we migrated, in which case it accounts against the node we migrated to. - do_pmd_numa_page(): Accounts not at all when the page isn't migrated, otherwise accounts against the node we migrated towards. This seems wrong to me; all three sites should have the same sementaics, furthermore we should accounts against where the page really is, we already know where the task is. So modify all three sites to always account; we did after all receive the fault; and always account to where the page is after migration, regardless of success. They all still differ on when they clear the PTE/PMD; ideally that would get sorted too. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-8-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29	mm: Prevent parallel splits during THP migration	Mel Gorman
	THP migrations are serialised by the page lock but on its own that does not prevent THP splits. If the page is split during THP migration then the pmd_same checks will prevent page table corruption but the unlock page and other fix-ups potentially will cause corruption. This patch takes the anon_vma lock to prevent parallel splits during migration. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-7-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29	mm: Wait for THP migrations to complete during NUMA hinting faults	Mel Gorman
	The locking for migrating THP is unusual. While normal page migration prevents parallel accesses using a migration PTE, THP migration relies on a combination of the page_table_lock, the page lock and the existance of the NUMA hinting PTE to guarantee safety but there is a bug in the scheme. If a THP page is currently being migrated and another thread traps a fault on the same page it checks if the page is misplaced. If it is not, then pmd_numa is cleared. The problem is that it checks if the page is misplaced without holding the page lock meaning that the racing thread can be migrating the THP when the second thread clears the NUMA bit and faults a stale page. This patch checks if the page is potentially being migrated and stalls using the lock_page if it is potentially being migrated before checking if the page is misplaced or not. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-6-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29	mm: numa: Do not account for a hinting fault if we raced	Mel Gorman
	If another task handled a hinting fault in parallel then do not double account for it. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-5-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29	ixgbe: fix qv_lock_napi call in ixgbe_napi_disable_all	Jacob Keller
	ixgbe_napi_disable_all calls napi_disable on each queue, however the busy polling code introduced a local_bh_disable()d context around the napi_disable. The original author did not realize that napi_disable might sleep, which would cause a sleep while atomic BUG. In addition, on a single processor system, the ixgbe_qv_lock_napi loop shouldn't have to mdelay. This patch adds an ixgbe_qv_disable along with a new IXGBE_QV_STATE_DISABLED bit, which it uses to indicate to the poll and napi routines that the q_vector has been disabled. Now the ixgbe_napi_disable_all function will wait until all pending work has been finished and prevent any future work from being started. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com> Cc: Alexander Duyck <alexander.duyck@intel.com> Cc: Hyong-Youb Kim <hykim@myri.com> Cc: Amir Vadai <amirv@mellanox.com> Cc: Dmitry Kravkov <dmitry@broadcom.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-10-29	net: add might_sleep() call to napi_disable	Jacob Keller
	napi_disable uses an msleep() call to wait for outstanding napi work to be finished after setting the disable bit. It does not always sleep incase there was no outstanding work. This resulted in a rare bug in ixgbe_down operation where a napi_disable call took place inside of a local_bh_disable()d context. In order to enable easier detection of future sleep while atomic BUGs, this patch adds a might_sleep() call, so that every use of napi_disable during atomic context will be visible. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com> Cc: Alexander Duyck <alexander.duyck@intel.com> Cc: Hyong-Youb Kim <hykim@myri.com> Cc: Amir Vadai <amirv@mellanox.com> Cc: Dmitry Kravkov <dmitry@broadcom.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-10-29	vxlan: Have the NIC drivers do less work for offloads	Joseph Gasparakis
	This patch removes the burden from the NIC drivers to check if the vxlan driver is enabled in the kernel and also makes available the vxlan headrooms to them. Signed-off-by: Joseph Gasparakis <joseph.gasparakis@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-10-29	Merge tag 'perf-urgent-for-mingo' of ↵	Ingo Molnar
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: * Add color overhead for stdio output buffer, which fixes --stdio output being chopped up on the hot (red) entries, fix from Jiri Olsa. * Get 'perf record -g -a sleep 1' working again, removing the need for -- separating perf options from the workload, restoring ages old behaviour, fix from Jiri Olsa. More patches allowing ~/.perfconfig setting up of default callchain collecting method ("fp" or "dwarf") left for next merge window. * Fixup mmap event consumption, where we were acking the consumption by writing the tail before actually accessing the event, which could lead to using overwritten records in things like 'perf record --call-graph'. From Zhouyi Zhou. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-29	net: esp{4,6}: get rid of struct esp_data	Mathias Krause
	struct esp_data consists of a single pointer, vanishing the need for it to be a structure. Fold the pointer into 'data' direcly, removing one level of pointer indirection. Signed-off-by: Mathias Krause <mathias.krause@secunet.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-10-29	net: esp{4,6}: remove padlen from struct esp_data	Mathias Krause
	The padlen member of struct esp_data is always zero. Get rid of it. Signed-off-by: Mathias Krause <mathias.krause@secunet.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-10-29	xen-netback: use jiffies_64 value to calculate credit timeout	Wei Liu
	time_after_eq() only works if the delta is < MAX_ULONG/2. For a 32bit Dom0, if netfront sends packets at a very low rate, the time between subsequent calls to tx_credit_exceeded() may exceed MAX_ULONG/2 and the test for timer_after_eq() will be incorrect. Credit will not be replenished and the guest may become unable to send packets (e.g., if prior to the long gap, all credit was exhausted). Use jiffies_64 variant to mitigate this problem for 32bit Dom0. Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Jason Luan <jianhai.luan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29	net, mc: fix the incorrect comments in two mc-related functions	Zhi Yong Wu
	Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29	net, iovec: fix the incorrect comment in memcpy_fromiovecend()	Zhi Yong Wu
	Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29	net, datagram: fix the incorrect comment in zerocopy_sg_from_iovec()	Zhi Yong Wu
	Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>