linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2015-11-03	ring_buffer: Do no not complete benchmark reader too early	Petr Mladek
	It seems that complete(&read_done) might be called too early in some situations. 1st scenario: ------------- CPU0 CPU1 ring_buffer_producer_thread() wake_up_process(consumer); wait_for_completion(&read_start); ring_buffer_consumer_thread() complete(&read_start); ring_buffer_producer() # producing data in # the do-while cycle ring_buffer_consumer(); # reading data # got error # set kill_test = 1; set_current_state( TASK_INTERRUPTIBLE); if (reader_finish) # false schedule(); # producer still in the middle of # do-while cycle if (consumer && !(cnt % wakeup_interval)) wake_up_process(consumer); # spurious wakeup while (!reader_finish && !kill_test) # leaving because # kill_test == 1 reader_finish = 0; complete(&read_done); 1st BANG: We might access uninitialized "read_done" if this is the the first round. # producer finally leaving # the do-while cycle because kill_test == 1; if (consumer) { reader_finish = 1; wake_up_process(consumer); wait_for_completion(&read_done); 2nd BANG: This will never complete because consumer already did the completion. 2nd scenario: ------------- CPU0 CPU1 ring_buffer_producer_thread() wake_up_process(consumer); wait_for_completion(&read_start); ring_buffer_consumer_thread() complete(&read_start); ring_buffer_producer() # CPU3 removes the module <--- difference from # and stops producer <--- the 1st scenario if (kthread_should_stop()) kill_test = 1; ring_buffer_consumer(); while (!reader_finish && !kill_test) # kill_test == 1 => we never go # into the top level while() reader_finish = 0; complete(&read_done); # producer still in the middle of # do-while cycle if (consumer && !(cnt % wakeup_interval)) wake_up_process(consumer); # spurious wakeup while (!reader_finish && !kill_test) # leaving because kill_test == 1 reader_finish = 0; complete(&read_done); BANG: We are in the same "bang" situations as in the 1st scenario. Root of the problem: -------------------- ring_buffer_consumer() must complete "read_done" only when "reader_finish" variable is set. It must not be skipped due to other conditions. Note that we still must keep the check for "reader_finish" in a loop because there might be spurious wakeups as described in the above scenarios. Solution: ---------- The top level cycle in ring_buffer_consumer() will finish only when "reader_finish" is set. The data will be read in "while-do" cycle so that they are not read after an error (kill_test == 1) or a spurious wake up. In addition, "reader_finish" is manipulated by the producer thread. Therefore we add READ_ONCE() to make sure that the fresh value is read in each cycle. Also we add the corresponding barrier to synchronize the sleep check. Next we set the state back to TASK_RUNNING for the situation where we did not sleep. Just from paranoid reasons, we initialize both completions statically. This is safer, in case there are other races that we are unaware of. As a side effect we could remove the memory barrier from ring_buffer_producer_thread(). IMHO, this was the reason for the barrier. ring_buffer_reset() uses spin locks that should provide the needed memory barrier for using the buffer. Link: http://lkml.kernel.org/r/1441629518-32712-2-git-send-email-pmladek@suse.com Signed-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-11-03	Merge branch 'for-next' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k update from Geert Uytterhoeven. * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k/sun3: Use %pM format specifier to print ethernet address
2015-11-03	Merge tag 'leds_for_4.4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds Pull LED updates from Jacek Anaszewski: - Move the out-of-LED-tree led-sead3 driver to the LED subsystem. - Add 'invert' sysfs attribute to the heartbeat trigger. - Add Device Tree support to the leds-netxbig driver and add related DT nodes to the kirkwood-netxbig.dtsi and kirkwood-net5big.dts files. Remove static LED setup from the related board files. - Remove redundant brightness conversion operation from leds-netxbig. - Improve leds-bcm6328 driver: improve default-state handling, add more init configuration options, print invalid LED instead of warning only about maximum LED value. - Add a shutdown function for setting gpio-leds into off state when shutting down. - Fix DT flash timeout property naming in leds-aat1290.txt. - Switch to using devm prefixed version of led_classdev_register() (leds-cobalt-qube, leds-hp6xx, leds-ot200, leds-ipaq-micro, leds-netxbig, leds-locomo, leds-menf21bmc, leds-net48xx, leds-wrap). - Add missing of_node_put (leds-powernv, leds-bcm6358, leds-bcm6328, leds-88pm860x). - Coding style fixes and cleanups: led-class/led-core, leds-ipaq-micro. * tag 'leds_for_4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds: (27 commits) leds: 88pm860x: add missing of_node_put leds: bcm6328: add missing of_node_put leds: bcm6358: add missing of_node_put powerpc/powernv: add missing of_node_put leds: leds-wrap.c: Use devm_led_classdev_register leds: aat1290: Fix property naming of flash-timeout-us leds: leds-net48xx: Use devm_led_classdev_register leds: leds-menf21bmc.c: Use devm_led_class_register leds: leds-locomo.c: Use devm_led_classdev_register leds: leds-gpio: add shutdown function Documentation: leds: update DT bindings for leds-bcm6328 leds-bcm6328: add more init configuration options leds-bcm6328: simplify and improve default-state handling leds-bcm6328: print invalid LED leds: netxbig: set led_classdev max_brightness leds: netxbig: convert to use the devm_ functions ARM: mvebu: remove static LED setup for netxbig boards ARM: Kirkwood: add LED DT entries for netxbig boards leds: netxbig: add device tree binding leds: triggers: add invert to heartbeat ...
2015-11-03	tracing: Remove redundant TP_ARGS redefining	Dmitry Safonov
	TP_ARGS is not used anywhere in trace.h nor trace_entries.h Firstly, I left just #undef TP_ARGS and had no errors - remove it. Link: http://lkml.kernel.org/r/1446576560-14085-1-git-send-email-0x7f454c46@gmail.com Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-11-03	sh_eth: use DMA barriers	Sergei Shtylyov
	Commit 7d7355f58ba4 ("sh_eth: Ensure proper ordering of descriptor active bit write/read") did the right thing but used too "heavy" barriers while there were already "lighter" DMA barriers exactly for this case... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	NVMe: Precedence error in nvme_pr_clear()	Dan Carpenter
	The original code is equivalent to: u32 cdw10 = (1 \| key) ? 1 << 3 : 0; But we want: u32 cdw10 = 1 \| (key ? 1 << 3 : 0); Fixes: 1d277a637a71: ('NVMe: Add persistent reservation ops') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-03	tracing: Rename max_stack_lock to stack_trace_max_lock	Steven Rostedt (Red Hat)
	Now that max_stack_lock is a global variable, it requires a naming convention that is unlikely to collide. Rename it to the same naming convention that the other stack_trace variables have. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-11-03	tracing: Allow arch-specific stack tracer	AKASHI Takahiro
	A stack frame may be used in a different way depending on cpu architecture. Thus it is not always appropriate to slurp the stack contents, as current check_stack() does, in order to calcurate a stack index (height) at a given function call. At least not on arm64. In addition, there is a possibility that we will mistakenly detect a stale stack frame which has not been overwritten. This patch makes check_stack() a weak function so as to later implement arch-specific version. Link: http://lkml.kernel.org/r/1446182741-31019-5-git-send-email-takahiro.akashi@linaro.org Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-11-03	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller
	Minor overlapping changes in net/ipv4/ipmr.c, in 'net' we were fixing the "BH-ness" of the counter bumps whilst in 'net-next' the functions were modified to take an explicit 'net' parameter. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	switchdev: respect SKIP_EOPNOTSUPP flag in case there is no recursion	Jiri Pirko
	Caller passing down the SKIP_EOPNOTSUPP switchdev flag expects that -EOPNOTSUPP cannot be returned. But in case of direct op call without recurtion, this may happen. So fix this by checking it always on the end of __switchdev_port_attr_set function. Fixes: 464314ea6c11 ("switchdev: skip over ports returning -EOPNOTSUPP when recursing ports") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	net: sched: kill dead code in sch_choke.c	Phil Sutter
	It looks like this has never been used at all. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	irda: Delete an unnecessary check before the function call ↵	Markus Elfring
	"irlmp_unregister_service" The irlmp_unregister_service() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	lightnvm: refactor phys addrs type to u64	Matias Bjørling
	For cases where CONFIG_LBDAF is not set. The struct ppa_addr exceeds its type on 32 bit architectures. ppa_addr requires a 64bit integer to hold the generic ppa format. We therefore refactor it to u64 and replaces the sector_t usages with u64 for physical addresses. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-03	ide: pdc202xx_new: Replace timeval with ktime_t	Amitoj Kaur Chawla
	This driver uses 'struct timeval' which we are trying to remove since 32 bit time types will break in the year 2038 by replacing it with ktime_t. This patch changes do_gettimeofday() to ktime_get() because ktime_get() returns a ktime_t while do_gettimeofday() returns struct timeval. This patch also uses ktime_us_delta() to get the elapsed time. Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	dlm: make posix locks interruptible	Eric Ren
	Replace wait_event_killable with wait_event_interruptible so that a program waiting for a posix lock can be interrupted by a signal. With the killable version, a program was not interruptible by a signal if it had a signal handler set for it, overriding the default action of terminating the process. Signed-off-by: Eric Ren <zren@suse.com> Signed-off-by: David Teigland <teigland@redhat.com>
2015-11-03	Merge tag 'mac80211-for-davem-2015-11-03' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Another set of fixes: * remove a warning on a check that can trigger without any errors having happened (Andrei) * correctly handle deauth request while in the process of associating (Andrei) * fix TDLS HT operation (Arik) * allow changing AID/listen interval during client setup (Ayala) * be more forgiving with WMM parameters to get HT/VHT in case of broken APs with bad WMM settings (Emmanuel, myself) * a number of other fixes (some in documentation) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	net: dsa: mv88e6xxx: include DSA ports in VLANs	Vivien Didelot
	DSA ports must be members of a VLAN in order to ensure frame bridging between chained switch chips. Thus tag them in addition to the CPU port when adding a VLAN, and skip them when deleting a VLAN and reporting VLAN members. Also use the UNMODIFIED egress policy, so that frames egress on these ports as they ingress, tagged or untagged. Fixes: 0d3b33e60206 ("net: dsa: mv88e6xxx: add VLAN Load support") Reported-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	net: dsa: mv88e6xxx: disable SA learning for DSA and CPU ports	Andrew Lunn
	Frames with DSA headers passing to/from the CPU were taking place in the MAC learning on these ports, resulting in incorrect ATU entries. Disable learning on these ports. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	net/core: fix for_each_netdev_feature	Jarod Wilson
	As pointed out by Nikolay and further explained by Geert, the initial for_each_netdev_feature macro was broken, as feature would get set outside of the block of code it was intended to run in, thus only ever working for the first feature bit in the mask. While less pretty this way, this is tested and confirmed functional with multiple feature bits set in NETIF_F_UPPER_DISABLES. [root@dell-per730-01 ~]# ethtool -K bond0 lro off ... [ 242.761394] bond0: Disabling feature 0x0000000000008000 on lower dev p5p2. [ 243.552178] bnx2x 0000:06:00.1 p5p2: using MSI-X IRQs: sp 74 fp[0] 76 ... fp[7] 83 [ 244.353978] bond0: Disabling feature 0x0000000000008000 on lower dev p5p1. [ 245.147420] bnx2x 0000:06:00.0 p5p1: using MSI-X IRQs: sp 62 fp[0] 64 ... fp[7] 71 [root@dell-per730-01 ~]# ethtool -K bond0 gro off ... [ 251.925645] bond0: Disabling feature 0x0000000000004000 on lower dev p5p2. [ 252.713693] bnx2x 0000:06:00.1 p5p2: using MSI-X IRQs: sp 74 fp[0] 76 ... fp[7] 83 [ 253.499085] bond0: Disabling feature 0x0000000000004000 on lower dev p5p1. [ 254.290922] bnx2x 0000:06:00.0 p5p1: using MSI-X IRQs: sp 62 fp[0] 64 ... fp[7] 71 Fixes: fd867d51f ("net/core: generic support for disabling netdev features down stack") CC: "David S. Miller" <davem@davemloft.net> CC: Eric Dumazet <edumazet@google.com> CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <gospo@cumulusnetworks.com> CC: Jiri Pirko <jiri@resnulli.us> CC: Nikolay Aleksandrov <razor@blackwall.org> CC: Michal Kubecek <mkubecek@suse.cz> CC: Alexander Duyck <alexander.duyck@gmail.com> CC: Geert Uytterhoeven <geert@linux-m68k.org> CC: netdev@vger.kernel.org Signed-off-by: Jarod Wilson <jarod@redhat.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	vlan: Invoke driver vlan hooks only if device is present	Padmanabh Ratnakar
	NIC drivers mark device as detached during error recovery. It expects no manangement hooks to be invoked in this state. Invoke driver vlan hooks only if device is present. Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	arcnet/com20020: add LEDS_CLASS dependency	Arnd Bergmann
	The newly added led trigger support in the com20020-pci driver causes build errors when CONFIG_LEDS_CLASS is disabled: drivers/built-in.o: In function `com20020pci_probe': (.text+0x185dc4): undefined reference to `devm_led_classdev_register' (.text+0x185dd8): undefined reference to `devm_led_classdev_register' This adds a Kconfig dependency to prevent the invalid configurations. Other drivers appear to be split 50:50 between 'select' and 'depends on' for this symbol, I picked 'depends on' as I could not find a common policy and it generally causes fewer problems. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 8890624a4e8c ("arcnet: com20020-pci: add led trigger support") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	bpf, verifier: annotate verbose printer with __printf	Daniel Borkmann
	The verbose() printer dumps the verifier state to user space, so let gcc take care to check calls to verbose() for (future) errors. make with W=1 correctly suggests: function might be possible candidate for 'gnu_printf' format attribute [-Wsuggest-attribute=format]. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	Merge branch 'dp83640-fixes'	David S. Miller
	Stefan Sørensen says: ==================== dp83640 driver fixes This series fixes a number of minor bugs in the dp83640 driver. ==================== Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	dp83640: Only wait for timestamps for packets with timestamping enabled.	Stefan Sørensen
	In the packet timestamping function, check that the ptp version and protocol of the packet matches what we have configured the hardware to actually generate timestamps for, before looking/waiting for a timestamp. Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	ptp: Change ptp_class to a proper bitmask	Stefan Sørensen
	Change the definition of PTP_CLASS_L2 to not have any bits overlapping with the other defined protocol values, allowing the PTP_CLASS_* definitions to be for simple filtering on packet type. Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	dp83640: Prune rx timestamp list before reading from it	Stefan Sørensen
	The list of rx timestamps are currently only pruned of old entries when a new entry is inserted. If no new entries are added, old timestamps may survive beyond their lifetime, possible causing them to be attached to packets with the same sequence number after a rollover. Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	dp83640: Delay scheduled work.	Stefan Sørensen
	Currently rx_timestamp_work reschedules itself as a regular workqueue item, effectively causing it run constantly as long as there are packets left in the queue. Fix by using delayed workqueue items, limiting it to run only every two jiffies. Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	dp83640: Include hash in timestamp/packet matching	Stefan Sørensen
	Only using the message type and sequence id for matching timestamps with packets is error prone, as multiple clients may very well be sending packets with the same messagetype and timestamp at the same time. Fix by extending the check to include the hash of bytes 20-29 (source id in PTPv2) that is provided with the timestamps. Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	ipv6: fix tunnel error handling	Michal Kubeček
	Both tunnel6_protocol and tunnel46_protocol share the same error handler, tunnel6_err(), which traverses through tunnel6_handlers list. For ipip6 tunnels, we need to traverse tunnel46_handlers as we do e.g. in tunnel46_rcv(). Current code can generate an ICMPv6 error message with an IPv4 packet embedded in it. Fixes: 73d605d1abbd ("[IPSEC]: changing API of xfrm6_tunnel_register") Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	recordmcount: arm64: Replace the ignored mcount call into nop	Li Bin
	By now, the recordmcount only records the function that in following sections: .text/.ref.text/.sched.text/.spinlock.text/.irqentry.text/ .kprobes.text/.text.unlikely For the function that not in these sections, the call mcount will be in place and not be replaced when kernel boot up. And it will bring performance overhead, such as do_mem_abort (in .exception.text section). This patch make the call mcount to nop for this case in recordmcount. Link: http://lkml.kernel.org/r/1446019445-14421-1-git-send-email-huawei.libin@huawei.com Link: http://lkml.kernel.org/r/1446193864-24593-4-git-send-email-huawei.libin@huawei.com Cc: <lkp@intel.com> Cc: <catalin.marinas@arm.com> Cc: <takahiro.akashi@linaro.org> Cc: <stable@vger.kernel.org> # 3.18+ Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Li Bin <huawei.libin@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-11-03	recordmcount: Fix endianness handling bug for nop_mcount	libin
	In nop_mcount, shdr->sh_offset and welp->r_offset should handle endianness properly, otherwise it will trigger Segmentation fault if the recordmcount main and file.o have different endianness. Link: http://lkml.kernel.org/r/563806C7.7070606@huawei.com Cc: <stable@vger.kernel.org> # 3.0+ Signed-off-by: Li Bin <huawei.libin@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-11-03	Btrfs: fix hole punching when using the no-holes feature	Filipe Manana
	When we are using the no-holes feature, if we punch a hole into a file range that already contains a hole which overlaps the range we are passing to fallocate(), we end up removing the extent map that represents the existing hole without adding a new one. This happens because with the no-holes feature we do not have explicit extent items to represent holes and therefore the call to __btrfs_drop_extents(), made from btrfs_punch_hole(), returns an end offset to the variable drop_end that is smaller than the end of the range passed to fallocate(), while it drops all existing extent maps in that range. Normally having a missing extent map is not a problem, for example for a readpages() operation we just end up building the extent map by looking at the fs/subvol tree for a matching extent item (or a lack of one for implicit holes). However for an fsync that uses the fast path, which needs to look at the list of modified extent maps, this means the fsync will not record information about the complete hole we had before the fallocate() call into the log tree, resulting in a file with content/layout that does not match what we had neither before nor after the hole punch operation. The following test case for fstests reproduces the issue. It fails without this change because we get a file with a different digest after the fsync log replay and also with a different extent/hole layout. seq=`basename $0` seqres=$RESULT_DIR/$seq echo "QA output created by $seq" tmp=/tmp/$$ status=1 # failure is the default! trap "_cleanup; exit \$status" 0 1 2 3 15 _cleanup() { _cleanup_flakey rm -f $tmp.* } # get standard environment, filters and checks . ./common/rc . ./common/filter . ./common/punch . ./common/dmflakey # real QA test starts here _need_to_be_root _supported_fs generic _supported_os Linux _require_scratch _require_xfs_io_command "fpunch" _require_xfs_io_command "fiemap" _require_dm_target flakey _require_metadata_journaling $SCRATCH_DEV # This test was motivated by an issue found in btrfs when the btrfs # no-holes feature is enabled (introduced in kernel 3.14). So enable # the feature if the fs being tested is btrfs. if [ $FSTYP == "btrfs" ]; then _require_btrfs_fs_feature "no_holes" _require_btrfs_mkfs_feature "no-holes" MKFS_OPTIONS="$MKFS_OPTIONS -O no-holes" fi rm -f $seqres.full _scratch_mkfs >>$seqres.full 2>&1 _init_flakey _mount_flakey # Create out test file with some data and then fsync it. # We do the fsync only to make sure the last fsync we do in this test # triggers the fast code path of btrfs' fsync implementation, a # condition necessary to trigger the bug btrfs had. $XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 128K" \ -c "fsync" \ $SCRATCH_MNT/foobar \| _filter_xfs_io # Now punch a hole against the range [96K, 128K[. $XFS_IO_PROG -c "fpunch 96K 32K" $SCRATCH_MNT/foobar # Punch another hole against a range that overlaps the previous range # and ends beyond eof. $XFS_IO_PROG -c "fpunch 64K 128K" $SCRATCH_MNT/foobar # Punch another hole against a range that overlaps the first range # ([96K, 128K[) and ends at eof. $XFS_IO_PROG -c "fpunch 32K 96K" $SCRATCH_MNT/foobar # Fsync our file. We want to verify that, after a power failure and # mounting the filesystem again, the file content reflects all the hole # punch operations. $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar echo "File digest before power failure:" md5sum $SCRATCH_MNT/foobar \| _filter_scratch echo "Fiemap before power failure:" $XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/foobar \| _filter_fiemap # Silently drop all writes and umount to simulate a crash/power failure. _load_flakey_table $FLAKEY_DROP_WRITES _unmount_flakey # Allow writes again, mount to trigger log replay and validate file # contents. _load_flakey_table $FLAKEY_ALLOW_WRITES _mount_flakey echo "File digest after log replay:" # Must match the same digest we got before the power failure. md5sum $SCRATCH_MNT/foobar \| _filter_scratch echo "Fiemap after log replay:" # Must match the same extent listing we got before the power failure. $XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/foobar \| _filter_fiemap _unmount_flakey status=0 exit Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-11-03	Btrfs: find_free_extent: Do not erroneously skip LOOP_CACHING_WAIT state	chandan
	When executing generic/001 in a loop on a ppc64 machine (with both sectorsize and nodesize set to 64k), the following call trace is observed, WARNING: at /root/repos/linux/fs/btrfs/locking.c:253 Modules linked in: CPU: 2 PID: 8353 Comm: umount Not tainted 4.3.0-rc5-13676-ga5e681d #54 task: c0000000f2b1f560 ti: c0000000f6008000 task.ti: c0000000f6008000 NIP: c000000000520c88 LR: c0000000004a3b34 CTR: 0000000000000000 REGS: c0000000f600a820 TRAP: 0700 Not tainted (4.3.0-rc5-13676-ga5e681d) MSR: 8000000102029032 <SF,VEC,EE,ME,IR,DR,RI> CR: 24444884 XER: 00000000 CFAR: c0000000004a3b30 SOFTE: 1 GPR00: c0000000004a3b34 c0000000f600aaa0 c00000000108ac00 c0000000f5a808c0 GPR04: 0000000000000000 c0000000f600ae60 0000000000000000 0000000000000005 GPR08: 00000000000020a1 0000000000000001 c0000000f2b1f560 0000000000000030 GPR12: 0000000084842882 c00000000fdc0900 c0000000f600ae60 c0000000f070b800 GPR16: 0000000000000000 c0000000f3c8a000 0000000000000000 0000000000000049 GPR20: 0000000000000001 0000000000000001 c0000000f5aa01f8 0000000000000000 GPR24: 0f83e0f83e0f83e1 c0000000f5a808c0 c0000000f3c8d000 c000000000000000 GPR28: c0000000f600ae74 0000000000000001 c0000000f3c8d000 c0000000f5a808c0 NIP [c000000000520c88] .btrfs_tree_lock+0x48/0x2a0 LR [c0000000004a3b34] .btrfs_lock_root_node+0x44/0x80 Call Trace: [c0000000f600aaa0] [c0000000f600ab80] 0xc0000000f600ab80 (unreliable) [c0000000f600ab80] [c0000000004a3b34] .btrfs_lock_root_node+0x44/0x80 [c0000000f600ac00] [c0000000004a99dc] .btrfs_search_slot+0xa8c/0xc00 [c0000000f600ad40] [c0000000004ab878] .btrfs_insert_empty_items+0x98/0x120 [c0000000f600adf0] [c00000000050da44] .btrfs_finish_chunk_alloc+0x1d4/0x620 [c0000000f600af20] [c0000000004be854] .btrfs_create_pending_block_groups+0x1d4/0x2c0 [c0000000f600b020] [c0000000004bf188] .do_chunk_alloc+0x3c8/0x420 [c0000000f600b100] [c0000000004c27cc] .find_free_extent+0xbfc/0x1030 [c0000000f600b260] [c0000000004c2ce8] .btrfs_reserve_extent+0xe8/0x250 [c0000000f600b330] [c0000000004c2f90] .btrfs_alloc_tree_block+0x140/0x590 [c0000000f600b440] [c0000000004a47b4] .__btrfs_cow_block+0x124/0x780 [c0000000f600b530] [c0000000004a4fc0] .btrfs_cow_block+0xf0/0x250 [c0000000f600b5e0] [c0000000004a917c] .btrfs_search_slot+0x22c/0xc00 [c0000000f600b720] [c00000000050aa40] .btrfs_remove_chunk+0x1b0/0x9f0 [c0000000f600b850] [c0000000004c4e04] .btrfs_delete_unused_bgs+0x434/0x570 [c0000000f600b950] [c0000000004d3cb8] .close_ctree+0x2e8/0x3b0 [c0000000f600ba20] [c00000000049d178] .btrfs_put_super+0x18/0x30 [c0000000f600ba90] [c000000000243cd4] .generic_shutdown_super+0xa4/0x1a0 [c0000000f600bb10] [c0000000002441d8] .kill_anon_super+0x18/0x30 [c0000000f600bb90] [c00000000049c898] .btrfs_kill_super+0x18/0xc0 [c0000000f600bc10] [c0000000002444f8] .deactivate_locked_super+0x98/0xe0 [c0000000f600bc90] [c000000000269f94] .cleanup_mnt+0x54/0xa0 [c0000000f600bd10] [c0000000000bd744] .task_work_run+0xc4/0x100 [c0000000f600bdb0] [c000000000016334] .do_notify_resume+0x74/0x80 [c0000000f600be30] [c0000000000098b8] .ret_from_except_lite+0x64/0x68 Instruction dump: fba1ffe8 fbc1fff0 fbe1fff8 7c791b78 f8010010 f821ff21 e94d0290 81030040 812a04e8 7d094a78 7d290034 5529d97e <0b090000> 3b400000 3be30050 3bc3004c The above call trace is seen even on x86_64; albeit very rarely and that too with nodesize set to 64k and with nospace_cache mount option being used. The reason for the above call trace is, btrfs_remove_chunk check_system_chunk Allocate chunk if required For each physical stripe on underlying device, btrfs_free_dev_extent ... Take lock on Device tree's root node btrfs_cow_block("dev tree's root node"); btrfs_reserve_extent find_free_extent index = BTRFS_RAID_DUP; have_caching_bg = false; When in LOOP_CACHING_NOWAIT state, Assume we find a block group which is being cached; Hence have_caching_bg is set to true When repeating the search for the next RAID index, we set have_caching_bg to false. Hence right after completing the LOOP_CACHING_NOWAIT state, we incorrectly skip LOOP_CACHING_WAIT state and move to LOOP_ALLOC_CHUNK state where we allocate a chunk and try to add entries corresponding to the chunk's physical stripe into the device tree. When doing so the task deadlocks itself waiting for the blocking lock on the root node of the device tree. This commit fixes the issue by introducing a new local variable to help indicate as to whether a block group of any RAID type is being cached. Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-11-03	btrfs: Fix a data space underflow warning	Qu Wenruo
	Even with quota disabled, generic/127 will trigger a kernel warning by underflow data space info. The bug is caused by buffered write, which in case of short copy, the start parameter for btrfs_delalloc_release_space() is wrong, and round_up/down() in btrfs_delalloc_release() extents the range to page aligned, decreasing one more page than expected. This patch will fix it by passing correct start. Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-11-03	blk-mq: avoid excessive boot delays with large lun counts	Jeff Moyer
	Hi, Zhangqing Luo reported long boot times on a system with thousands of LUNs when scsi-mq was enabled. He narrowed the problem down to blk_mq_add_queue_tag_set, where every queue is frozen in order to set the BLK_MQ_F_TAG_SHARED flag. Each added device will freeze all queues added before it in sequence, which involves waiting for an RCU grace period for each one. We don't need to do this. After the second queue is added, only new queues need to be initialized with the shared tag. We can do that by percolating the flag up to the blk_mq_tag_set, and updating the newly added queue's hctxs if the flag is set. This problem was introduced by commit 0d2602ca30e41 (blk-mq: improve support for shared tags maps). Reported-and-tested-by: Jason Luo <zhangqing.luo@oracle.com> Reviewed-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-03	Merge branch 'mlx5-fixes'	David S. Miller
	Or Gerlitz says: ==================== Mellanox mlx5e driver update, Nov 3 2015 This series contains bunch of small fixes to the mlx5e driver from Achiad. Changes from V0: - removed the driver patch that dealt with IRQ affinity changes during NAPI poll, as this is a generic problem which needs generic solution. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	net/mlx5e: Fix LSO vlan insertion	Achiad Shochat
	Consider vlan insertion impact on headers copy size also for LSO packets. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	net/mlx5e: Re-eanble client vlan TX acceleration	Achiad Shochat
	This reverts commit cd58c714acb9 "net/mlx5e: Disable client vlan TX acceleration". Bring back client vlan insertion offload, the original performance issue was found and fixed in the next patch. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	net/mlx5e: Return error in case mlx5e_set_features() fails	Achiad Shochat
	In case mlx5e_set_features() fails, return the failure status rather than 0. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	net/mlx5e: Don't allow more than max supported channels	Achiad Shochat
	Consider MLX5E_MAX_NUM_CHANNELS @ethtool set/get_channels Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	net/mlx5_core: Use the the real irqn in eq->irqn	Achiad Shochat
	Instead of storing the msix array index in eq->irqn (vecidx), store the real irq number. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	net/mlx5e: Wait for RX buffers initialization in a more proper manner	Achiad Shochat
	Use jiffies rather than wait loop with msleep(). The wait loop didn't take into consideration time when the process was not executing. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	net/mlx5e: Avoid NULL pointer access in case of configuration failure	Achiad Shochat
	In case a configuration operation that involves closing and re-opening resources (e.g RX/TX queue size change) fails at the re-opening stage these resources will remain closed. So when executing (following) configuration operations (e.g ifconfig down) we cannot assume that these resources are available. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03	Merge branch 'pci/host-layerscape' into next	Bjorn Helgaas
	* pci/host-layerscape: PCI: layerscape: Add ls_pcie_msi_host_init() PCI: layerscape: Add support for LS1043a and LS2080a PCI: layerscape: Remove unused fields from struct ls_pcie PCI: layerscape: Update ls_add_pcie_port() PCI: layerscape: Factor out SCFG related function PCI: layerscape: Ignore PCIe controllers in Endpoint mode PCI: layerscape: Remove ls_pcie_establish_link()
2015-11-03	Merge branch 'pci/host-hisi' into next	Bjorn Helgaas
	* pci/host-hisi: PCI: hisi: Add HiSilicon SoC Hip05 PCIe driver
2015-11-03	Merge branches 'pci/host-altera', 'pci/host-designware', 'pci/host-generic', ↵	Bjorn Helgaas
	'pci/host-imx6', 'pci/host-iproc', 'pci/host-mvebu', 'pci/host-rcar', 'pci/host-tegra' and 'pci/host-xgene' into next * pci/host-altera: PCI: altera: Add Altera PCIe MSI driver PCI: altera: Add Altera PCIe host controller driver ARM: Add msi.h to Kbuild * pci/host-designware: PCI: designware: Make "clocks" and "clock-names" optional DT properties PCI: designware: Make driver arch-agnostic ARM/PCI: Replace pci_sys_data->align_resource with global function pointer PCI: designware: Use of_pci_get_host_bridge_resources() to parse DT Revert "PCI: designware: Program ATU with untranslated address" PCI: designware: Move calculation of bus addresses to DRA7xx PCI: designware: Make "num-lanes" an optional DT property PCI: designware: Require config accesses to be naturally aligned PCI: designware: Simplify dw_pcie_cfg_read/write() interfaces PCI: designware: Use exact access size in dw_pcie_cfg_read() PCI: spear: Fix dw_pcie_cfg_read/write() usage PCI: designware: Set up high part of MSI target address PCI: designware: Make get_msi_addr() return phys_addr_t, not u32 PCI: designware: Implement multivector MSI IRQ setup PCI: designware: Factor out MSI msg setup PCI: Add msi_controller setup_irqs() method for special multivector setup PCI: designware: Fix PORT_LOGIC_LINK_WIDTH_MASK * pci/host-generic: PCI: generic: Fix address window calculation for non-zero starting bus PCI: generic: Pass starting bus number to pci_scan_root_bus() PCI: generic: Allow multiple hosts with different map_bus() methods arm64: dts: Drop linux,pci-probe-only from the Seattle DTS powerpc/PCI: Fix lookup of linux,pci-probe-only property PCI: generic: Fix lookup of linux,pci-probe-only property of/pci: Add of_pci_check_probe_only to parse "linux,pci-probe-only" * pci/host-imx6: PCI: imx6: Add PCIE_PHY_RX_ASIC_OUT_VALID definition PCI: imx6: Return real error code from imx6_add_pcie_port() * pci/host-iproc: PCI: iproc: Fix header comment "Corporation" misspelling PCI: iproc: Add outbound mapping support PCI: iproc: Update PCIe device tree bindings PCI: iproc: Improve link detection logic PCI: iproc: Fix PCIe reset logic PCI: iproc: Call pci_fixup_irqs() for ARM64 as well as ARM PCI: iproc: Remove unused struct iproc_pcie.irqs[] PCI: iproc: Fix code comment to match code * pci/host-mvebu: PCI: mvebu: Remove code restricting accesses to slot 0 PCI: mvebu: Add PCI Express root complex capability block PCI: mvebu: Improve clock/reset handling PCI: mvebu: Use gpio_desc to carry around gpio PCI: mvebu: Use devm_kcalloc() to allocate an array PCI: mvebu: Use gpio_set_value_cansleep() PCI: mvebu: Split port parsing and resource claiming from port setup PCI: mvebu: Fix memory leaks and refcount leaks PCI: mvebu: Move port parsing and resource claiming to separate function PCI: mvebu: Use port->name rather than "PCIe%d.%d" PCI: mvebu: Report full node name when reporting a DT error PCI: mvebu: Use for_each_available_child_of_node() to walk child nodes PCI: mvebu: Use of_get_available_child_count() PCI: mvebu: Use exact config access size; don't read/modify/write PCI: mvebu: Return zero for reserved or unimplemented config space * pci/host-rcar: PCI: rcar: Fix I/O offset for multiple host bridges PCI: rcar: Set root bus nr to that provided in DT PCI: rcar: Remove dependency on ARM-specific struct hw_pci PCI: rcar: Make PCI aware of the I/O resources PCI: rcar: Build pcie-rcar.c only on ARM PCI: rcar: Build pci-rcar-gen2.c only on ARM * pci/host-tegra: PCI: tegra: Wrap static pgprot_t initializer with __pgprot() * pci/host-xgene: PCI/MSI: xgene: Remove msi_controller assignment
2015-11-03	PCI: altera: Add Altera PCIe MSI driver	Ley Foon Tan
	Add Altera PCIe MSI driver. This soft IP supports a configurable number of vectors, which is a DTS parameter. [bhelgaas: Kconfig depend on PCIE_ALTERA, typos, whitespace] Signed-off-by: Ley Foon Tan <lftan@altera.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Rob Herring <robh@kernel.org>
2015-11-03	s390: remove runtime instrumentation interrupts	Martin Schwidefsky
	The external interrupts for runtime instrumentation buffer-full and runtime instrumentation halted are unused and have no current user. Remove the support and ignore the second parameter of the s390_runtime_instr system call from now on. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-11-03	s390/cio: de-duplicate subchannel validation	Pierre Morel
	cio_validate_io_subchannel() and cio_validate_msg_subchannel() are identical, as the called functions already take care about the differences between subchannel types. Just inline the code into the only user, cio_validate_subchannel(), instead. Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-11-03	s390/css: unneeded initialization in for_each_subchannel	Pierre Morel
	The ret variable is always set by the fn function. There is no need to initialize it. Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-By: Sascha Silbe <silbe@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>