linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2016-09-30	mnt: Add a per mount namespace limit on the number of mounts	Eric W. Biederman
	CAI Qian <caiqian@redhat.com> pointed out that the semantics of shared subtrees make it possible to create an exponentially increasing number of mounts in a mount namespace. mkdir /tmp/1 /tmp/2 mount --make-rshared / for i in $(seq 1 20) ; do mount --bind /tmp/1 /tmp/2 ; done Will create create 2^20 or 1048576 mounts, which is a practical problem as some people have managed to hit this by accident. As such CVE-2016-6213 was assigned. Ian Kent <raven@themaw.net> described the situation for autofs users as follows: > The number of mounts for direct mount maps is usually not very large because of > the way they are implemented, large direct mount maps can have performance > problems. There can be anywhere from a few (likely case a few hundred) to less > than 10000, plus mounts that have been triggered and not yet expired. > > Indirect mounts have one autofs mount at the root plus the number of mounts that > have been triggered and not yet expired. > > The number of autofs indirect map entries can range from a few to the common > case of several thousand and in rare cases up to between 30000 and 50000. I've > not heard of people with maps larger than 50000 entries. > > The larger the number of map entries the greater the possibility for a large > number of active mounts so it's not hard to expect cases of a 1000 or somewhat > more active mounts. So I am setting the default number of mounts allowed per mount namespace at 100,000. This is more than enough for any use case I know of, but small enough to quickly stop an exponential increase in mounts. Which should be perfect to catch misconfigurations and malfunctioning programs. For anyone who needs a higher limit this can be changed by writing to the new /proc/sys/fs/mount-max sysctl. Tested-by: CAI Qian <caiqian@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-09-30	dmaengine: edma: avoid uninitialized variable use	Arnd Bergmann
	If edma_read_slot() gets an invalid argument, it does not set a result, as found by "gcc -Wmaybe-uninitialized" drivers/dma/edma.c: In function 'dma_ccerr_handler': drivers/dma/edma.c:1499:21: error: 'p.a_b_cnt' may be used uninitialized in this function [-Werror=maybe-uninitialized] drivers/dma/edma.c:1499:21: error: 'p.ccnt' may be used uninitialized in this function [-Werror=maybe-uninitialized] if (p.a_b_cnt == 0 && p.ccnt == 0) { If we change the function to return an error in this case, we can handle the failure more gracefully and treat this the same way as a null slot that we already catch. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-09-30	MAINTAINERS: Switch to kernel.org email address for Javi Merino	Javi Merino
	Change my email address to my kernel.org account instead of the ARM one. Signed-off-by: Javi Merino <javi.merino@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-30	f2fs: fix to avoid race condition when updating sbi flag	Chao Yu
	Making updating of sbi flag atomic by using {test,set,clear}_bit, otherwise in concurrency scenario, the flag could be updated incorrectly. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-30	f2fs: put directory inodes before checkpoint in roll-forward recovery	Jaegeuk Kim
	Before checkpoint, we'd be better drop any inodes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-30	f2fs: use crc and cp version to determine roll-forward recovery	Jaegeuk Kim
	Previously, we used cp_version only to detect recoverable dnodes. In order to avoid same garbage cp_version, we needed to truncate the next dnode during checkpoint, resulting in additional discard or data write. If we can distinguish this by using crc in addition to cp_version, we can remove this overhead. There is backward compatibility concern where it changes node_footer layout. So, this patch introduces a new checkpoint flag, CP_CRC_RECOVERY_FLAG, to detect new layout. New layout will be activated only when this flag is set. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-30	Merge tag 'asoc-v4.9' of ↵	Takashi Iwai
	git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-next ASoC: Updates for v4.9 Apart from the cleanups done by Morimoto-san this has very much been a driver focused release with very little generic change: - A big factoring out of the simple-card code to allow it to be shared more with the rcar generic card from Kuninori Morimoto. - Removal of some operations duplicated on the CODEC level, again by Kuninori Morimoto. - Lots more machine support for x86 systems. - New drivers for Nuvoton NAU88C10, Realtek RT5660 and RT5663.
2016-09-30	Merge remote-tracking branches 'spi/topic/ti-qspi', 'spi/topic/tools', ↵	Mark Brown
	'spi/topic/txx9' and 'spi/topic/xlp' into spi-next
2016-09-30	Merge remote-tracking branches 'spi/topic/rspi', 'spi/topic/sc18is602', ↵	Mark Brown
	'spi/topic/sh-msiof', 'spi/topic/spidev-test' and 'spi/topic/st-ssc4' into spi-next
2016-09-30	Merge remote-tracking branches 'spi/topic/octeon', 'spi/topic/pic32-sqi', ↵	Mark Brown
	'spi/topic/pxa2xx' and 'spi/topic/qup' into spi-next
2016-09-30	Merge remote-tracking branches 'spi/topic/fsl-espi', 'spi/topic/imx', ↵	Mark Brown
	'spi/topic/jcore', 'spi/topic/loopback' and 'spi/topic/meson' into spi-next
2016-09-30	Merge remote-tracking branches 'spi/topic/bcm', 'spi/topic/dw' and ↵	Mark Brown
	'spi/topic/fsl-dspi' into spi-next
2016-09-30	Merge remote-tracking branch 'spi/topic/dma' into spi-next	Mark Brown

2016-09-30	Merge remote-tracking branch 'spi/topic/core' into spi-next	Mark Brown

2016-09-30	Merge remote-tracking branch 'spi/fix/spidev' into spi-linus	Mark Brown

2016-09-30	Merge remote-tracking branch 'regulator/topic/tps80031' into regulator-next	Mark Brown

2016-09-30	Merge remote-tracking branches 'regulator/topic/of', ↵	Mark Brown
	'regulator/topic/pv88080', 'regulator/topic/rk808', 'regulator/topic/set-voltage' and 'regulator/topic/tps65218' into regulator-next
2016-09-30	Merge remote-tracking branches 'regulator/topic/bulk', ↵	Mark Brown
	'regulator/topic/dbx500', 'regulator/topic/hi6421', 'regulator/topic/load' and 'regulator/topic/ltc3676' into regulator-next
2016-09-30	Merge remote-tracking branch 'regulator/fix/tps65910' into regulator-linus	Mark Brown

2016-09-30	gpio: stmpe: use BIT() macro	Linus Walleij
	Avoid custom (1 << bits) shifting by consequently using the BIT() macro from <linux/bitops.h>. Reviewed-by: Patrice Chotard <patrice.chotard@st.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-09-30	gpio: stmpe: forbid unused lines to be mapped as IRQs	Linus Walleij
	Exploit the new mechanism for masking off disallowed IRQs added by Mika Westerberg to properly manage the STMPE "norequest mask" to disallow also mapping said lines as IRQs. Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Patrice Chotard <patrice.chotard@st.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-09-30	rxrpc: Fix the call timer handling	David Howells
	The call timer's concept of a call timeout (of which there are three) that is inactive is that it is the timeout has the same expiration time as the call expiration timeout (the expiration timer is never inactive). However, I'm not resetting the timeouts when they expire, leading to repeated processing of expired timeouts when other timeout events occur. Fix this by: (1) Move the timer expiry detection into rxrpc_set_timer() inside the locked section. This means that if a timeout is set that will expire immediately, we deal with it immediately. (2) If a timeout is at or before now then it has expired. When an expiry is detected, an event is raised, the timeout is automatically inactivated and the event processor is queued. (3) If a timeout is at or after the expiry timeout then it is inactive. Inactive timeouts do not contribute to the timer setting. (4) The call timer callback can now just call rxrpc_set_timer() to handle things. (5) The call processor work function now checks the event flags rather than checking the timeouts directly. Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-30	rxrpc: Keep the call timeouts as ktimes rather than jiffies	David Howells
	Keep that call timeouts as ktimes rather than jiffies so that they can be expressed as functions of RTT. Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-30	rxrpc: Remove error from struct rxrpc_skb_priv as it is unused	David Howells
	Remove error from struct rxrpc_skb_priv as it is no longer used. Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-30	rxrpc: The offset field in struct rxrpc_skb_priv is unnecessary	David Howells
	The offset field in struct rxrpc_skb_priv is unnecessary as the value can always be calculated. Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-30	rxrpc: Reduce ssthresh to peer's receive window	David Howells
	When we receive an ACK from the peer that tells us what the peer's receive window (rwind) is, we should reduce ssthresh to rwind if rwind is smaller than ssthresh. Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-30	rxrpc: Switch to Congestion Avoidance mode at cwnd==ssthresh	David Howells
	Switch to Congestion Avoidance mode at cwnd == ssthresh rather than relying on cwnd getting incremented beyond ssthresh and the window size, the mode being shifted and then cwnd being corrected. We need to make sure we switch into CA mode so that we stop marking every packet for ACK. Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-30	mac80211: Move reorder-sensitive TX handlers to after TXQ dequeue	Toke Høiland-Jørgensen
	The TXQ intermediate queues can cause packet reordering when more than one flow is active to a single station. Since some of the wifi-specific packet handling (notably sequence number and encryption handling) is sensitive to re-ordering, things break if they are applied before the TXQ. This splits up the TX handlers and fast_xmit logic into two parts: An early part and a late part. The former is applied before TXQ enqueue, and the latter after dequeue. The non-TXQ path just applies both parts at once. Because fragments shouldn't be split up or reordered, the fragmentation handler is run after dequeue. Any fragments are then kept in the TXQ and on subsequent dequeues they take precedence over dequeueing from the FQ structure. This approach avoids having to scatter special cases all over the place for when TXQ is enabled, at the cost of making the fast_xmit and TX handler code slightly more complex. Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> [fix a few code-style nits, make ieee80211_xmit_fast_finish void, remove a useless txq->sta check] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-30	x86/entry/64: Fix context tracking state warning when load_gs_index fails	Wanpeng Li
	This warning: WARNING: CPU: 0 PID: 3331 at arch/x86/entry/common.c:45 enter_from_user_mode+0x32/0x50 CPU: 0 PID: 3331 Comm: ldt_gdt_64 Not tainted 4.8.0-rc7+ #13 Call Trace: dump_stack+0x99/0xd0 __warn+0xd1/0xf0 warn_slowpath_null+0x1d/0x20 enter_from_user_mode+0x32/0x50 error_entry+0x6d/0xc0 ? general_protection+0x12/0x30 ? native_load_gs_index+0xd/0x20 ? do_set_thread_area+0x19c/0x1f0 SyS_set_thread_area+0x24/0x30 do_int80_syscall_32+0x7c/0x220 entry_INT80_compat+0x38/0x50 ... can be reproduced by running the GS testcase of the ldt_gdt test unit in the x86 selftests. do_int80_syscall_32() will call enter_form_user_mode() to convert context tracking state from user state to kernel state. The load_gs_index() call can fail with user gsbase, gsbase will be fixed up and proceed if this happen. However, enter_from_user_mode() will be called again in the fixed up path though it is context tracking kernel state currently. This patch fixes it by just fixing up gsbase and telling lockdep that IRQs are off once load_gs_index() failed with user gsbase. Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1475197266-3440-1-git-send-email-wanpeng.li@hotmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30	x86/boot: Initialize FPU and X86_FEATURE_ALWAYS even if we don't have CPUID	Andy Lutomirski
	Otherwise arch_task_struct_size == 0 and we die. While we're at it, set X86_FEATURE_ALWAYS, too. Reported-by: David Saggiorato <david@saggiorato.net> Tested-by: David Saggiorato <david@saggiorato.net> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Fixes: aaeb5c01c5b ("x86/fpu, sched: Introduce CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and use it on x86") Link: http://lkml.kernel.org/r/8de723afbf0811071185039f9088733188b606c9.1475103911.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30	mac80211: mesh: decrease max drift	Pedersen, Thomas
	The old value was 30ms, which means mesh sync will treat any value below as merely TSF drift. This isn't really reasonable (typical drift is < 10us/s) since people probably want to adjust TSF in smaller increments (for ie. beacon collision avoidance) without mesh sync fighting back. Change max drift adjustment to 0.8ms, so manual TSF adjustments can be made in 1ms increments, with some margin. Signed-off-by: Thomas Pedersen <twp@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-30	mac80211: add offset_tsf driver op and use it for mesh	Pedersen, Thomas
	This allows the mesh sync (and debugfs) code to make incremental TSF adjustments, avoiding any uncertainty introduced by delay in programming absolute TSF. Signed-off-by: Thomas Pedersen <twp@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-30	mac80211: Set lower memory limit for non-VHT devices	Toke Høiland-Jørgensen
	Small devices can run out of memory from queueing too many packets. If VHT is not supported by the PHY, having more than 4 MBytes of total queue in the TXQ intermediate queues is not needed, and so we can safely limit the memory usage in these cases and avoid OOM. Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-30	mac80211: Export fq memory limit information in debugfs	Toke Høiland-Jørgensen
	Add memory limit, usage and overlimit counter to per-PHY 'aqm' debugfs file. Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-30	fq.h: Port memory limit mechanism from fq_codel	Toke Høiland-Jørgensen
	The reusable fairness queueing implementation (fq.h) lacks the memory usage limit that the fq_codel qdisc has. This means that small devices (e.g. WiFi routers) can run out of memory when flooded with a large number of packets. This ports the memory limit feature from fq_codel to fq.h. Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-30	mac80211: Add API to report NAN function match	Ayala Beker
	Provide an API to report NAN function match. Mac80211 will lookup the corresponding cookie and report the match to cfg80211. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-30	mac80211: Implement add_nan_func and rm_nan_func	Ayala Beker
	Implement add/rm_nan_func functions and handle NAN function termination notifications. Handle instance_id allocation for NAN functions and implement the reconfig flow. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-30	mac80211: implement nan_change_conf	Ayala Beker
	Implement nan_change_conf callback which allows to change current NAN configuration (master preference and dual band operation). Store the current NAN configuration in sdata, so it can be used both to provide the driver the updated configuration with changes and also it will be used in hw reconfig flows in next patches. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-30	cfg80211: Provide an API to report NAN function termination	Ayala Beker
	Provide a function that reports NAN DE function termination. The function may be terminated due to one of the following reasons: user request, ttl expiration or failure. If the NAN instance is tied to the owner, the notification will be sent to the socket that started the NAN interface only Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-30	cfg80211: provide a function to report a match for NAN	Ayala Beker
	Provide a function the driver can call to report a match. This will send the event to the user space. If the NAN instance is tied to the owner, the notifications will be sent to the socket that started the NAN interface only. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-30	cfg80211: allow the user space to change current NAN configuration	Ayala Beker
	Some NAN configuration paramaters may change during the operation of the NAN device. For example, a user may want to update master preference value when the device gets plugged/unplugged to the power. Add API that allows to do so. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-30	cfg80211: add add_nan_func / del_nan_func	Ayala Beker
	A NAN function can be either publish, subscribe or follow up. Make all the necessary verifications and just pass the request to the driver. Allow the user space application that starts NAN to forbid any other socket to add or remove functions. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Ayala Beker <ayala.beker@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-30	mac80211: add boilerplate code for start / stop NAN	Ayala Beker
	This code doesn't do much besides allowing to start and stop the vif. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Ayala Beker <ayala.beker@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-30	cfg80211: add start / stop NAN commands	Ayala Beker
	This allows user space to start/stop NAN interface. A NAN interface is like P2P device in a few aspects: it doesn't have a netdev associated to it. Add the new interface type and prevent operations that can't be executed on NAN interface like scan. Define several attributes that may be configured by user space when starting NAN functionality (master preference and dual band operation) Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-30	cfg80211: Add support for static WEP in the driver	David Spinadel
	Add support for drivers that implement static WEP internally, i.e. expose connection keys to the driver in connect flow and don't upload the keys after the connection. Signed-off-by: David Spinadel <david.spinadel@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-30	mac80211: Move ieee802111_tx_dequeue() to later in tx.c	Toke Høiland-Jørgensen
	The TXQ path restructure requires ieee80211_tx_dequeue() to call TX handlers and parts of the xmit_fast path. Move the function to later in tx.c in preparation for this. Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-30	xen/pciback: support driver_override	Juergen Gross
	Support the driver_override scheme introduced with commit 782a985d7af2 ("PCI: Introduce new device binding path using pci_dev.driver_override") As pcistub_probe() is called for all devices (it has to check for a match based on the slot address rather than device type) it has to check for driver_override set to "pciback" itself. Up to now for assigning a pci device to pciback you need something like: echo 0000:07:10.0 > /sys/bus/pci/devices/0000\:07\:10.0/driver/unbind echo 0000:07:10.0 > /sys/bus/pci/drivers/pciback/new_slot echo 0000:07:10.0 > /sys/bus/pci/drivers_probe while with the patch you can use the same mechanism as for similar drivers like pci-stub and vfio-pci: echo pciback > /sys/bus/pci/devices/0000\:07\:10.0/driver_override echo 0000:07:10.0 > /sys/bus/pci/devices/0000\:07\:10.0/driver/unbind echo 0000:07:10.0 > /sys/bus/pci/drivers_probe So e.g. libvirt doesn't need special handling for pciback. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-09-30	xen/pciback: avoid multiple entries in slot list	Juergen Gross
	The Xen pciback driver has a list of all pci devices it is ready to seize. There is no check whether a to be added entry already exists. While this might be no problem in the common case it might confuse those which consume the list via sysfs. Modify the handling of this list by not adding an entry which already exists. As this will be needed later split out the list handling into a separate function. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-09-30	xen/pciback: simplify pcistub device handling	Juergen Gross
	The Xen pciback driver maintains a list of all its seized devices. There are two functions searching the list for a specific device with basically the same semantics just returning different structures in case of a match. Split out the search function. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-09-30	xen: Remove event channel notification through Xen PCI platform device	KarimAllah Ahmed
	Ever since commit 254d1a3f02eb ("xen/pv-on-hvm kexec: shutdown watches from old kernel") using the INTx interrupt from Xen PCI platform device for event channel notification would just lockup the guest during bootup. postcore_initcall now calls xs_reset_watches which will eventually try to read a value from XenStore and will get stuck on read_reply at XenBus forever since the platform driver is not probed yet and its INTx interrupt handler is not registered yet. That means that the guest can not be notified at this moment of any pending event channels and none of the per-event handlers will ever be invoked (including the XenStore one) and the reply will never be picked up by the kernel. The exact stack where things get stuck during xenbus_init: -xenbus_init -xs_init -xs_reset_watches -xenbus_scanf -xenbus_read -xs_single -xs_single -xs_talkv Vector callbacks have always been the favourite event notification mechanism since their introduction in commit 38e20b07efd5 ("x86/xen: event channels delivery on HVM.") and the vector callback feature has always been advertised for quite some time by Xen that's why INTx was broken for several years now without impacting anyone. Luckily this also means that event channel notification through INTx is basically dead-code which can be safely removed without impacting anybody since it has been effectively disabled for more than 4 years with nobody complaining about it (at least as far as I'm aware of). This commit removes event channel notification through Xen PCI platform device. Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Juergen Gross <jgross@suse.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Julien Grall <julien.grall@citrix.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Ross Lagerwall <ross.lagerwall@citrix.com> Cc: xen-devel@lists.xenproject.org Cc: linux-kernel@vger.kernel.org Cc: linux-pci@vger.kernel.org Cc: Anthony Liguori <aliguori@amazon.com> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>