linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2013-11-05	ALSA: hda - Name Haswell HDMI controllers better	Takashi Iwai
	"HDA Intel MID" is no correct name for Haswell HDMI controllers. Give them a better name, "HDA Intel HDMI". Signed-off-by: Takashi Iwai <tiwai@suse.de>
2013-11-05	ALSA: hda - Force buffer alignment for Haswell HDMI controllers	Takashi Iwai
	Haswell HDMI audio controllers seem to get stuck when unaligned buffer size is used. Let's enable the buffer alignment for the corresponding entries. Since AZX_DCAPS_INTEL_PCH contains AZX_DCAPS_BUFSIZE that disables the buffer alignment forcibly, define AZX_DCAPS_INTEL_HASWELL and put the necessary AZX_DCAPS bits there. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=60769 Reported-by: Alexander E. Patrakov <patrakov@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2013-11-05	tools/perf/build: Fix detection of non-core features	David Ahern
	feature_check needs to be invoked through call, and LDFLAGS may not be set so quotes are needed. Thanks to Jiri for spotting the quotes around LDFLAGS; that one was driving me nuts with the upcoming timerfd feature detection. Signed-off-by: David Ahern <dsahern@gmail.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1383064996-20933-1-git-send-email-dsahern@gmail.com [ Fixed conflict with 8a0c4c2843d3 ("perf tools: Fix libunwind build and feature detection for 32-bit build") ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-05	perf kvm: Disable live command if timerfd is not supported	David Ahern
	If the OS does not have timerfd support (e.g., older OS'es like RHEL5) disable perf kvm stat live. Signed-off-by: David Ahern <dsahern@gmail.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1383064996-20933-2-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-05	ALSA: atmel: remove dependency on <mach/gpio.h>	Linus Walleij
	This include is completely unused since the AT91 sound driver actually uses gpiolib properly. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2013-11-05	ALSA: hda - Enable Thinkpad mute/micmute LEDs for Realtek	David Henningsson
	Same as we already have for Conexant. Right now it's only enabled for one machine. Tested-by: Hui Wang <hui.wang@canonical.com> Signed-off-by: David Henningsson <david.henningsson@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2013-11-05	ext2: Fix fs corruption in ext2_get_xip_mem()	Jan Kara
	Commit 8e3dffc651cb "Ext2: mark inode dirty after the function dquot_free_block_nodirty is called" unveiled a bug in __ext2_get_block() called from ext2_get_xip_mem(). That function called ext2_get_block() mistakenly asking it to map 0 blocks while 1 was intended. Before the above mentioned commit things worked out fine by luck but after that commit we started returning that we allocated 0 blocks while we in fact allocated 1 block and thus allocation was looping until all blocks in the filesystem were exhausted. Fix the problem by properly asking for one block and also add assertion in ext2_get_blocks() to catch similar problems. Reported-and-tested-by: Andiry Xu <andiry.xu@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2013-11-05	arm64: module: ensure instruction is little-endian before manipulation	Will Deacon
	Relocations that require an instruction immediate to be re-encoded must ensure that the instruction pattern is represented in a little-endian format for the manipulation code to work correctly. This patch converts the loaded instruction into native-endianess prior to encoding and then converts back to little-endian byteorder before updating memory. Signed-off-by: Will Deacon <will.deacon@arm.com> Tested-by: Matthew Leach <matthew.leach@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-11-05	arm64: defconfig: Enable CONFIG_PREEMPT by default	Catalin Marinas
	This way we can spot early bugs when just testing with the default config. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-11-05	pinctrl: at91: copy define to driver	Linus Walleij
	The #define for the maximum number of GPIO blocks was retrieved into pinctrl-at91.c by implicit inclusion of <mach/gpio.h> from <linux/gpio.h> creating a dependency on machine-local <mach/gpio.h>. Break the depenency by copying this single define into the driver. Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2013-11-05	arm64: fix access to preempt_count from assembly code	Marc Zyngier
	preempt_count is defined as an int. Oddly enough, we access it as a 64bit value. Things become interesting when running a BE kernel, and looking at the current CPU number, which is stored as an int next to preempt_count. Like in a per-cpu interrupt handler, for example... Using a 32bit access fixes the issue for good. Cc: Matthew Leach <matthew.leach@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-11-05	ALSA: hda: add device IDs for AMD Evergreen/Northern Islands HDMI	Clemens Ladisch
	The device IDs of the AMD Cypress/Juniper/Redwood/Cedar/Cayman/Antilles/ Barts/Turks/Caicos HDMI HDA controllers weren't added explicitly because the generic entry works, but it made the device appearing as "Generic", and people are confused as if it's no proper HDMI controller. Add them so that the name shows up properly as "ATI HDMI" instead of "Generic". According to Takashi's tests and the lack of complaints, these devices work fine without disabling snooping. Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2013-11-05	fuse: writepages: protect secondary requests from fuse file release	Maxim Patlasov
	All async fuse requests must be supplied with extra reference to a fuse file. This is necessary to ensure that the fuse file is not released until all in-flight requests are completed. Fuse secondary writeback requests must obey this rule as well. Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-11-05	fuse: writepages: update bdi writeout when deleting secondary request	Maxim Patlasov
	BDI_WRITTEN counter is used to estimate bdi bandwidth. It must be incremented every time as bdi ends page writeback. No matter whether it was fulfilled by actual write or by discarding the request (e.g. due to shrunk i_size). Note that even before writepages patches, the case "Got truncated off completely" was handled in fuse_send_writepage() by calling fuse_writepage_finish() which updated BDI_WRITTEN unconditionally. Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-11-05	fuse: writepages: crop secondary requests	Maxim Patlasov
	If writeback happens while fuse is in FUSE_NOWRITE condition, the request will be queued but not processed immediately (see fuse_flush_writepages()). Until FUSE_NOWRITE becomes relaxed, more writebacks can happen. They will be queued as "secondary" requests to that first ("primary") request. Existing implementation crops only primary request. This is not correct because a subsequent extending write(2) may increase i_size and then secondary requests won't be cropped properly. The result would be stale data written to the server to a file offset where zeros must be. Similar problem may happen if secondary requests are attached to an in-flight request that was already cropped. The patch solves the issue by cropping all secondary requests in fuse_writepage_end(). Thanks to Miklos for idea. Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-11-05	fuse: writepages: roll back changes if request not found	Maxim Patlasov
	fuse_writepage_in_flight() returns false if it fails to find request with given index in fi->writepages. Then the caller proceeds with populating data->orig_pages[] and incrementing req->num_pages. Hence, fuse_writepage_in_flight() must revert changes it made in request before returning false. Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-11-05	Merge branch 'for-davem' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== Please accept the following pull request intended for the 3.13 tree... I had intended to pass most of these to you as much as two weeks ago. Unfortunately, I failed to account for the effects of bad Internet connections and my own fatique/laziness while traveling. On the bright side, at least these have been baking in linux-next for some time! For the mac80211 bits, Johannes says: "This time I have two fixes for P2P (which requires not using CCK rates) and a workaround for APs with broken WMM information." For the iwlwifi bits, Johannes says: "I have a few fixes for warnings/issues: one from Alex, fixing scan timings, one from Emmanuel fixing a WARN_ON in the DVM driver, one from Stanislaw removing a trigger-happy WARN_ON in the MVM driver and a change from myself to try to recover when the device isn't processing commands quickly." And: "For this round, I have a lot of changes: * power management improvements * BT coexistence improvements/updates * new device support * VHT support * IBSS support (though due to a small bug it requires new firmware) * various other fixes/improvements." For the Bluetooth bits, Gustavo says: "More patches for 3.12, busy times for Bluetooth. More than a 100 commits since the last pull. The bulk of work comes from Johan and Marcel, they are doing fixes and improvements all over the Bluetooth subsystem, as the diffstat can show." For the ath10k and ath6kl bits, Kalle says: "Bartosz added support to ath10k for our 10.x AP firmware branch, which gives us AP specific features and fixes. We still support the main firmware branch as well just like before, ath10k detects runtime what firmware is used. Unfortunately the firmware interface in 10.x branch is somewhat different so there was quite a lot of changes in ath10k for this. Michal and Sujith did some performance improvements in ath10k. Vladimir fixed a compiler warning and Fengguang removed an extra semicolon." For the NFC bits, Samuel says: "It's a fairly big one, with the following highlights: - NFC digital layer implementation: Most NFC chipsets implement the NFC digital layer in firmware, but others have more basic functionalities and expect the host to implement the digital layer. This layer sits below the NFC core. - Sony's port100 support: This is "soft" NFC USB dongle that expects the digital layer to be implemented on the host. This is the first user of our NFC digital stack implementation. - Secure element API: We now provide a netlink API for enabling, disabling and discovering NFC attached (embedded or UICC ones) secure elements. With some userspace help, this allows us to support NFC payments. Only the pn544 driver currently supports that API. - NCI SPI fixes and improvements: In order to support NCI devices over SPI, we fixed and improved our NCI/SPI implementation. The currently most deployed NFC NCI chipset, Broadcom's bcm2079x, supports that mode and we're planning to use our NCI/SPI framework to implement a driver for it. - pn533 fragmentation support in target mode: This was the only missing feature from our pn533 impementation. We now support fragmentation in both Tx and Rx modes, in target mode." On top of all that, brcmfmac and rt2x00 both get the usual flurry of updates. A few other drivers get hit here or there as well. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-05	ALSA: hda - Introduce the bitmask for excluding output volume	Takashi Iwai
	Add a bitmask to hda_gen_spec indicating NIDs to exclude from the possible volume controls. That is, when the bit is set, the NID corresponding to the bit won't be picked as an output volume control any longer. Basically this is just a band-aid for working around the issue found with CS4208 codec, where only the headphone pin has a volume AMP with different dB steps. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=60811 Cc: <stable@vger.kernel.org> [v3.12+] Signed-off-by: Takashi Iwai <tiwai@suse.de>
2013-11-05	ALSA: hda - Add sanity check of vmaster slave dB steps	Takashi Iwai
	Check whether all vmaster slaves have the same dB steps. Otherwise the behavior would become inconsistent. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2013-11-05	ALSA: hda - Fix possible zero-division	Takashi Iwai
	Check the TLV db scale result before actually dividing in vmaster slave init code. Also mask TLV_DB_SCALE_MUTE bit so that the right value is obtained even if this bit is set by the codec driver. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2013-11-05	ALSA: usb - Don't trust the channel config if the channel count changed	David Henningsson
	In case the channel count of the input terminal is not the same as the channel count of the streaming descriptor, the channel config of the input terminal can not be trusted. Instead fall back to a default (guessed) channel map. This was found on a Logitech USB Headset. Signed-off-by: David Henningsson <david.henningsson@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2013-11-05	ALSA: usb - For class 2 devices, use channel map from altsettings	David Henningsson
	The channel config from the streaming descriptor is probably a better indicator of the channel map than the input terminal. Use the input terminal's channel map as fallback only. Signed-off-by: David Henningsson <david.henningsson@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2013-11-05	ALSA: usb: supply channel maps even when wChannelConfig is unspecified	David Henningsson
	If wChannelconfig is given for some formats but not others, userspace might not be able to set the channel map. This is RFC because I'm not sure what the best behaviour is - to guess the channel map from the given number of channels (it's quite likely that one channel is MONO and two channels is FL FR), or just to supply UNKNOWN for all channels. But the complete lack of channel map for a format leads userspace to believe that the format is not available at all. Or am I misunderstanding how this should be used? Signed-off-by: David Henningsson <david.henningsson@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2013-11-04	x86, defconfig: Add DEVTMPFS and DEVTMPFS_MOUNT to 86_defconfig	Chen Gang
	The defconfig kernel can not run under neither fedora16 x86_64 laptop nor fedora17 x86_64 pc. After enable DEVTMPFS* in x86_64_defconfig, it will be OK. DEVTMPFS* is only related with software, so for i386_defconfig may also need them (at least, it has no negative effect for defconfig). Signed-off-by: Chen Gang <gang.chen@asianux.com> Link: http://lkml.kernel.org/r/52784DFF.8040004@asianux.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-11-04	virtio-net: coalesce rx frags when possible during rx	Jason Wang
	Commit 2613af0ed18a11d5c566a81f9a6510b73180660a (virtio_net: migrate mergeable rx buffers to page frag allocators) try to increase the payload/truesize for MTU-sized traffic. But this will introduce the extra overhead for GSO packets received because of the frag list. This commit tries to reduce this issue by coalesce the possible rx frags when possible during rx. Test result shows the about 15% improvement on full size GSO packet receiving (and even better than before commit 2613af0ed18a11d5c566a81f9a6510b73180660a). Before this commit: ./netperf -H 192.168.100.4 MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.100.4 () port 0 AF_INET : demo Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 10.00 20303.87 After this commit: ./netperf -H 192.168.100.4 MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.100.4 () port 0 AF_INET : demo Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 10.00 23841.26 Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Michael Dalton <mwdalton@google.com> Cc: Eric Dumazet <edumazet@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04	net: introduce skb_coalesce_rx_frag()	Jason Wang
	Sometimes we need to coalesce the rx frags to avoid frag list. One example is virtio-net driver which tries to use small frags for both MTU sized packet and GSO packet. So this patch introduce skb_coalesce_rx_frag() to do this. Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Michael Dalton <mwdalton@google.com> Cc: Eric Dumazet <edumazet@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04	vxlan: Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...))	Duan Jiong
	trivial patch converting ERR_PTR(PTR_ERR()) into ERR_CAST(). No functional changes. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04	net: codel: Avoid undefined behavior from signed overflow	Jesper Dangaard Brouer
	As described in commit 5a581b367 (jiffies: Avoid undefined behavior from signed overflow), according to the C standard 3.4.3p3, overflow of a signed integer results in undefined behavior. To fix this, do as the above commit, and do an unsigned subtraction, and interpreting the result as a signed two's-complement number. This is based on the theory from RFC 1982 and is nicely described in wikipedia here: https://en.wikipedia.org/wiki/Serial_number_arithmetic#General_Solution A side-note, I have seen practical issues with the previous logic when dealing with 16-bit, on a 64-bit machine (gcc version 4.4.5). This were 32-bit, which I have not observed issues with. Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jesper Dangaard Brouer <netoptimizer@brouer.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04	Merge branch 'for-davem' of git://gitorious.org/linux-can/linux-can-next	David S. Miller
	Marc Kleine-Budde says: ==================== here's a pull request for net-next. It includes a patch by Oliver Hartkopp et al. that adds documentation for the broadcast manager to Documentation/networking/can.txt. Three patches by me that clean up the netlink handling code in the CAN core. And another patch that removes a not needed function from the ti_hecc driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04	tcp: properly handle stretch acks in slow start	Yuchung Cheng
	Slow start now increases cwnd by 1 if an ACK acknowledges some packets, regardless the number of packets. Consequently slow start performance is highly dependent on the degree of the stretch ACKs caused by receiver or network ACK compression mechanisms (e.g., delayed-ACK, GRO, etc). But slow start algorithm is to send twice the amount of packets of packets left so it should process a stretch ACK of degree N as if N ACKs of degree 1, then exits when cwnd exceeds ssthresh. A follow up patch will use the remainder of the N (if greater than 1) to adjust cwnd in the congestion avoidance phase. In addition this patch retires the experimental limited slow start (LSS) feature. LSS has multiple drawbacks but questionable benefit. The fractional cwnd increase in LSS requires a loop in slow start even though it's rarely used. Configuring such an increase step via a global sysctl on different BDPS seems hard. Finally and most importantly the slow start overshoot concern is now better covered by the Hybrid slow start (hystart) enabled by default. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04	tcp: enable sockets to use MSG_FASTOPEN by default	Yuchung Cheng
	Applications have started to use Fast Open (e.g., Chrome browser has such an optional flag) and the feature has gone through several generations of kernels since 3.7 with many real network tests. It's time to enable this flag by default for applications to test more conveniently and extensively. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04	Merge branch 'master' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nftables Pablo Neira Ayuso says: ==================== This batch contains fives nf_tables patches for your net-next tree, they are: * Fix possible use after free in the module removal path of the x_tables compatibility layer, from Dan Carpenter. * Add filter chain type for the bridge family, from myself. * Fix Kconfig dependencies of the nf_tables bridge family with the core, from myself. * Fix sparse warnings in nft_nat, from Tomasz Bursztyka. * Remove duplicated include in the IPv4 family support for nf_tables, from Wei Yongjun. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04	Merge branch 'master' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== This is another batch containing Netfilter/IPVS updates for your net-next tree, they are: * Six patches to make the ipt_CLUSTERIP target support netnamespace, from Gao feng. * Two cleanups for the nf_conntrack_acct infrastructure, introducing a new structure to encapsulate conntrack counters, from Holger Eitzenberger. * Fix missing verdict in SCTP support for IPVS, from Daniel Borkmann. * Skip checksum recalculation in SCTP support for IPVS, also from Daniel Borkmann. * Fix behavioural change in xt_socket after IP early demux, from Florian Westphal. * Fix bogus large memory allocation in the bitmap port set type in ipset, from Jozsef Kadlecsik. * Fix possible compilation issues in the hash netnet set type in ipset, also from Jozsef Kadlecsik. * Define constants to identify netlink callback data in ipset dumps, again from Jozsef Kadlecsik. * Use sock_gen_put() in xt_socket to replace xt_socket_put_sk, from Eric Dumazet. * Improvements for the SH scheduler in IPVS, from Alexander Frolkin. * Remove extra delay due to unneeded rcu barrier in IPVS net namespace cleanup path, from Julian Anastasov. * Save some cycles in ip6t_REJECT by skipping checksum validation in packets leaving from our stack, from Stanislav Fomichev. * Fix IPVS_CMD_ATTR_MAX definition in IPVS, larger that required, from Julian Anastasov. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04	perf hists: Consolidate __hists__add_*entry()	Namhyung Kim
	The __hists__add_{branch,mem}_entry() does almost the same thing that __hists__add_entry() does. Consolidate them into one. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Rodrigo Campos <rodrigo@sdfg.com.ar> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1383202576-28141-2-git-send-email-namhyung@kernel.org [ Fixup clash with new COMM infrastructure ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-05	powerpc/boot: Properly handle the base "of" boot wrapper	Benjamin Herrenschmidt
	The wrapper script needs an explicit rule for the "of" boot wrapper (generic wrapper, similar to pseries). Before 0c9fa29149d3726e14262aeb0c8461a948cc9d56 it was hanlded implicitly by the statement: platformo=$object/"$platform".o But now that epapr.o needs to be added, that doesn't work and an explicit rule must be added. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-11-04	netfilter: nft_compat: use _safe version of list_for_each	Dan Carpenter
	We need to use the _safe version of list_for_each_entry() here otherwise we have a use after free bug. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-11-04	NFSv4.2: Remove redundant checks in nfs_setsecurity+nfs4_label_init_security	Trond Myklebust
	We already check for nfs_server_capable(inode, NFS_CAP_SECURITY_LABEL) in nfs4_label_alloc() We check the minor version in _nfs4_server_capabilities before setting NFS_CAP_SECURITY_LABEL. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-11-04	NFSv4: Sanity check the server reply in _nfs4_server_capabilities	Trond Myklebust
	We don't want to be setting capabilities and/or requesting attributes that are not appropriate for the NFSv4 minor version. - Ensure that we clear the NFS_CAP_SECURITY_LABEL capability when appropriate - Ensure that we limit the attribute bitmasks to the mounted_on_fileid attribute and less for NFSv4.0 - Ensure that we limit the attribute bitmasks to suppattr_exclcreat and less for NFSv4.1 - Ensure that we limit it to change_sec_label or less for NFSv4.2 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-11-04	NFSv4.2: encode_readdir - only ask for labels when doing readdirplus	Trond Myklebust
	Currently, if the server is doing NFSv4.2 and supports labeled NFS, then our on-the-wire READDIR request ends up asking for the label information, which is then ignored unless we're doing readdirplus. This patch ensures that READDIR doesn't ask the server for label information at all unless the readdir->bitmask contains the FATTR4_WORD2_SECURITY_LABEL attribute, and the readdir->plus flag is set. While we're at it, optimise away the 3rd bitmap field if it is zero. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-11-04	nfs: set security label when revalidating inode	Jeff Layton
	Currently, we fetch the security label when revalidating an inode's attributes, but don't apply it. This is in contrast to the readdir() codepath where we do apply label changes. Cc: Dave Quigley <dpquigl@davequigley.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-11-04	Merge branch 'master' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch Jesse Gross says: ==================== Open vSwitch A set of updates for net-next/3.13. Major changes are: * Restructure flow handling code to be more logically organized and easier to read. * Rehashing of the flow table is moved from a workqueue to flow installation time. Before, heavy load could block the workqueue for excessive periods of time. * Additional debugging information is provided to help diagnose megaflows. * It's now possible to match on TCP flags. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04	Merge branch 'mlx4'	David S. Miller
	Or Gerlitz says: ==================== Mellanox driver updates This patch set from Jack Morgenstein does the following: 1. Fix MAC/VLAN SRIOV implementation, and add wrapper functions for VLAN allocation and de-allocation (patches 1-6). 2. Implements resource quotas when running under SRIOV (patches 7-10). Patch 7 is a small bug fix, and patches 8-10 implement the quotas. Quotas are implemented per resource type for VFs and the PF, to prevent any entity from simply grabbing all the resources for itself and leaving the other entities unable to obtain such resources. The series is against net-next commit ba48650 "ipv6: remove the unnecessary statement in find_match()" changes from V0: - dropped the 1st patch which needs to go to -stable and hence through net, not net-next ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04	net/mlx4_core: Implement resource quota enforcement	Jack Morgenstein
	Implements resource quota grant decision when resources are requested, for the following resources: QPs, CQs, SRQs, MPTs, MTTs, vlans, MACs, and Counters. When granting a resource, the quota system increases the allocated-count for that slave. When the slave later frees the resource, its allocated-count is reduced. A spinlock is used to protect the integrity of each resource's free-pool counter. (One slave may be in the process of being granted a resource while another slave has crashed, initiating cleanup of that slave's resource quotas). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04	net/mlx4_core: Fix quota handling in the QUERY_FUNC_CAP wrapper	Jack Morgenstein
	In current kernels, the mlx4 driver running on a VM does not differentiate between max resource numbers for the HCA and max quotas -- it simply takes the quota values passed to it as max-resource values. However, the driver actually requires the VFs to be aware of the actual number of resources that the HCA was initialized with, for QPs, CQs, SRQs and MPTs. For QPs, CQs and SRQs, the reason is that in completion handling the driver must know which of the 24 bits are the actual resource number, and which are "padding" bits. For MPTs, also, the driver assumes knowledge of the number of MPTs in the system. The previous commit fixes the quota logic on the VM for the quota values passed to it by QUERY_FUNC_CAPS. For QPs, CQs, SRQs, and MPTs, it takes the max resource numbers from QUERY_HCA (and not QUERY_FUNC_CAPS). The quotas passed in QUERY_FUNC_CAPS are used to report max resource number values in the response to ib_query_device. However, the Hypervisor driver must consider that VMs may be running previous kernels, and compatibility must be preserved. To resolve the incompatibility with previous kernels running on VMs, we deprecated the quota fields in mlx4_QUERY_FUNC_CAP. In the deprecated fields, we pass the max-resource values from INIT_HCA The quota fields are moved to a new location, and the current kernel driver takes the proper values from that location. There is also a new flag in dword 0, bit 28 of the mlx4_QUERY_FUNC_CAP mailbox; if this flag is set, the (VM) driver takes the quota values from the new location. VMs running previous kernels will work properly, except that the max resource numbers reported in ib_query_device for these resources will be too high. The Hypervisor driver will, however, enforce the quotas for these VMs. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04	mlx4: Structures and init/teardown for VF resource quotas	Jack Morgenstein
	This is step #1 for implementing SRIOV resource quotas for VFs. Quotas are implemented per resource type for VFs and the PF, to prevent any entity from simply grabbing all the resources for itself and leaving the other entities unable to obtain such resources. Resources which are allocated using quotas: QPs, CQs, SRQs, MPTs, MTTs, MAC, VLAN, and Counters. The quota system works as follows: Each entity (VF or PF) is given a max number of a given resource (its quota), and a guaranteed minimum number for each resource (starvation prevention). For QPs, CQs, SRQs, MPTs and MTTs: 50% of the available quantity for the resource is divided equally among the PF and all the active VFs (i.e., the number of VFs in the mlx4_core module parameter "num_vfs"). This 50% represents the "guaranteed minimum" pool. The other 50% is the "free pool", allocated on a first-come-first-serve basis. For each VF/PF, resources are first allocated from its "guaranteed-minimum" pool. When that pool is exhausted, the driver attempts to allocate from the resource "free-pool". The quota (i.e., max) for the VFs and the PF is: The free-pool amount (50% of the real max) + the guaranteed minimum For MACs: Guarantee 2 MACs per VF/PF per port. As a result, since we have only 128 MACs per port, reduce the allowable number of VFs from 64 to 63. Any remaining MACs are put into a free pool. For VLANs: For the PF, the per-port quota is 128 and guarantee is 64 (to allow the PF to register at least a VLAN per VF in VST mode). For the VFs, the per-port quota is 64 and the guarantee is 0. We assume that VGT VFs are trusted not to abuse the VLAN resource. For Counters: For all functions (PF and VFs), the quota is 128 and the guarantee is 0. In this patch, we define the needed structures, which are added to the resource-tracker struct. In addition, we do initialization for the resource quota, and adjust the query_device response to use quotas rather than resource maxima. As part of the implementation, we introduce a new field in mlx4_dev: quotas. This field holds the resource quotas used to report maxima to the upper layers (ib_core, via query_device). The HCA maxima of these values are passed to the VFs (via QUERY_HCA) so that they may continue to use these in handling QPs, CQs, SRQs and MPTs. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04	net/mlx4_core: Fix checking order in MR table init	Jack Morgenstein
	In procedure mlx4_init_mr_table(), slaves should do no processing, but should return success. This initialization is hypervisor-only. However, the check for num_mpts being a power-of-2 was performed before the check to return immediately if the driver is for a slave. This resulted in spurious failures. The order of performing the checks is reversed, so that if the driver is for a slave, no processing is done and success is returned. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04	net/mlx4_core: Don't fail reg/unreg vlan for older guests	Jack Morgenstein
	In upstream kernels under SRIOV, the vlan register/unregister calls were NOPs (doing nothing and returning OK). We detect these old calls from guests (via the comm channel), since previously the port number in mlx4_register_vlan was passed (improperly) in the out_param. This has been corrected so that the port number is now passed in bits 8..15 of the in_modifier field. For old calls, these bits will be zero, so if the passed port number is zero, we can still look at the out_param field to see if it contains a valid port number. If yes, the VM is running an old driver. Since for old drivers, the register/unregister_vlan wrappers were NOPs, we continue this policy -- the reason being that upstream had an additional bug in eth driver running on guests (where procedure mlx4_en_vlan_rx_kill_vid() had the following code: if (!mlx4_find_cached_vlan(mdev->dev, priv->port, vid, &idx)) mlx4_unregister_vlan(mdev->dev, priv->port, idx); else en_err(priv, "could not find vid %d in cache\n", vid); On a VM, mlx4_find_cached_vlan() will always fail, since the vlan cache is located on the Hypervisor; on guests it is empty. Therefore, if we allow upstream guests to register vlans, we will have vlan leakage since the unregister will never be performed. Leaving vlan reg/unreg for old guest drivers as a NOP is not a feature regression, since in upstream the register/unregister vlan wrapper is a NOP. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04	net/mlx4_core: Resource tracker for reg/unreg vlans	Jack Morgenstein
	Add resource tracker support for reg/unreg vlans calls done by VFs. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04	net/mlx4_en: Use vlan id instead of vlan index for unregistration	Jack Morgenstein
	Use of vlan_index created problems unregistering vlans on guests. In addition, tools delete vlan by tag, not by index, lets follow that. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04	net/mlx4_core: Fix reg/unreg vlan/mac to conform to the firmware spec	Jack Morgenstein
	The functions mlx4_register_vlan, mlx4_unregister_vlan, mlx4_register_mac, mlx4_unregister_mac all made illegal use of the out_param in multifunc mode to pass the port number. The firmware spec specifies that the port number should be passed in bits 8..15 of the input-modifier field for ALLOC_RES and FREE_RES (sections 20.15.1 and 20.15.2). For MAC register/unregister, this patch contains workarounds so that guests running previous kernels continue to work on a new Hypervisor, and guests running the new kernel will continue to work on old hypervisors. Vlan registeration capability is still not operational in multifunction mode, since the vlan wrapper functions are not implemented in this patch. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>