summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-07-02Btrfs: cleanup the code of copy_nocow_pages_for_inode()Miao Xie
- It make no sense that we continue to do something after the error happened, just go back with this patch. - remove some check of copy_nocow_pages_for_inode(), such as page check after write, inode check in the end of the function, because we are sure they exist. - remove the unnecessary goto in the return value check of the write Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-02Btrfs: fix oops when recovering the file data by scrub functionMiao Xie
We get oops while running btrfs replace start test, ------------[ cut here ]------------ kernel BUG at mm/filemap.c:608! [SNIP] Call Trace: [<ffffffffa04b36c7>] copy_nocow_pages_for_inode+0x217/0x3f0 [btrfs] [<ffffffffa04b34b0>] ? scrub_print_warning_inode+0x230/0x230 [btrfs] [<ffffffffa04b34b0>] ? scrub_print_warning_inode+0x230/0x230 [btrfs] [<ffffffffa04bb8ce>] iterate_extent_inodes+0x1ae/0x300 [btrfs] [<ffffffffa04bbab2>] iterate_inodes_from_logical+0x92/0xb0 [btrfs] [<ffffffffa04b34b0>] ? scrub_print_warning_inode+0x230/0x230 [btrfs] [<ffffffffa04b3b07>] copy_nocow_pages_worker+0x97/0x150 [btrfs] [<ffffffffa048eed4>] worker_loop+0x134/0x540 [btrfs] [<ffffffff816274ea>] ? __schedule+0x3ca/0x7f0 [<ffffffffa048eda0>] ? btrfs_queue_worker+0x300/0x300 [btrfs] [<ffffffff8106f2f0>] kthread+0xc0/0xd0 [<ffffffff8106f230>] ? flush_kthread_worker+0x80/0x80 [<ffffffff8163181c>] ret_from_fork+0x7c/0xb0 [<ffffffff8106f230>] ? flush_kthread_worker+0x80/0x80 [SNIP] RIP [<ffffffff8111f4c5>] unlock_page+0x35/0x40 RSP <ffff88010316bb98> ---[ end trace 421e79ad0dd72c7d ]--- it is because we forgot to lock the page again after we read data to the page. Fix it. Signed-off-by: Lin Feng <linfeng@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-02Btrfs: make the chunk allocator completely tree locklessJosef Bacik
When adjusting the enospc rules for relocation I ran into a deadlock because we were relocating the only system chunk and that forced us to try and allocate a new system chunk while holding locks in the chunk tree, which caused us to deadlock. To fix this I've moved all of the dev extent addition and chunk addition out to the delayed chunk completion stuff. We still keep the in-memory stuff which makes sure everything is consistent. One change I had to make was to search the commit root of the device tree to find a free dev extent, and hold onto any chunk em's that we allocated in that transaction so we do not allocate the same dev extent twice. This has the side effect of fixing a bug with balance that has been there ever since balance existed. Basically you can free a block group and it's dev extent and then immediately allocate that dev extent for a new block group and write stuff to that dev extent, all within the same transaction. So if you happen to crash during a balance you could come back to a completely broken file system. This patch should keep these sort of things from happening in the future since we won't be able to allocate free'd dev extents until after the transaction commits. This has passed all of the xfstests and my super annoying stress test followed by a balance. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-02Btrfs: cleanup orphaned root orphan itemJosef Bacik
I hit a weird problem were my root item had been deleted but the orphan item had not. This isn't necessarily a problem, but it keeps the file system from being mounted. To fix this we just need to axe the orphan item if we can't find the fs root when we're putting them altogether. With this patch I was able to successfully mount my file system. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-02Btrfs: fix wrong mirror number tuningMiao Xie
Now reading the data from the target device of the replace operation is allowed, so the mirror number that is greater than the stripes number of a chunk is valid, we will tune it when we find there is no target device later. Fix it. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-02Btrfs: cleanup redundant code in btrfs_submit_direct()Miao Xie
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-02Btrfs: remove btrfs_sector_sum structureMiao Xie
Using the structure btrfs_sector_sum to keep the checksum value is unnecessary, because the extents that btrfs_sector_sum points to are continuous, we can find out the expected checksums by btrfs_ordered_sum's bytenr and the offset, so we can remove btrfs_sector_sum's bytenr. After removing bytenr, there is only one member in the structure, so it makes no sense to keep the structure, just remove it, and use a u32 array to store the checksum value. By this change, we don't use the while loop to get the checksums one by one. Now, we can get several checksum value at one time, it improved the performance by ~74% on my SSD (31MB/s -> 54MB/s). test command: # dd if=/dev/zero of=/mnt/btrfs/file0 bs=1M count=1024 oflag=sync Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-02Btrfs: check if we can nocow if we don't have data spaceJosef Bacik
We always just try and reserve data space when we write, but if we are out of space but have prealloc'ed extents we should still successfully write. This patch will try and see if we can write to prealloc'ed space and if we can go ahead and allow the write to continue. With this patch we now pass xfstests generic/274. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-02Btrfs: stop using try_to_writeback_inodes_sb_nr to flush delallocJosef Bacik
try_to_writeback_inodes_sb_nr returns 1 if writeback is already underway, which is completely fraking useless for us as we need to make sure pages are actually written before we go and check if there are ordered extents. So replace this with an open coding of try_to_writeback_inodes_sb_nr minus the writeback underway check so that we are sure to actually have flushed some dirty pages out and will have ordered extents to use. With this patch xfstests generic/273 now passes. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-02Btrfs: use a percpu to keep track of possibly pinned bytesJosef Bacik
There are all of these checks in the ENOSPC code to see if committing the transaction would free up enough space to make the allocation. This is because early on we just committed the transaction and hoped and prayed, which resulted in cases where it took _forever_ to get an ENOSPC when we really were out of space. So we check space_info->bytes_pinned, except this isn't completely true because it doesn't account for space we may free but are stuck in delayed refs. So tests like xfstests 226 would fail because we wouldn't commit the transaction to free up the data space. So instead add a percpu counter that will be a little fuzzier, it will add bytes as soon as we try to free up the space, and remove any space it doesn't actually free up when we get around to doing the actual free. We then 0 out this counter every transaction period so we have a better idea of how much space we will actually free up by committing this transaction. With this patch we now pass xfstests 226. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-02Btrfs: check for actual acls rather than just xattrs when caching no aclJosef Bacik
We have an optimization that will go ahead and cache no acls on an inode if there are no xattrs on the inode. This saves us a lookup later to check the acls for writes or any other access. The problem is I use selinux so I always have an xattr on inodes, so make this test a little smarter and check for the actual acl hash on the key and if it isn't there then we still get to cache no acl which makes everybody who uses selinux a little happier. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-02leds: mc13783: Fix "uninitialized variable" warningAlexander Shiyan
drivers/leds/leds-mc13783.c: In function 'mc13xxx_led_probe': drivers/leds/leds-mc13783.c:195:2: warning: 'ret' may be used uninitialized in this function [-Wmaybe-uninitialized] Signed-off-by: Alexander Shiyan <shc_work@mail.ru> Signed-off-by: Bryan Wu <cooloney@gmail.com>
2013-07-02tracing: Get trace_array ref counts when accessing trace filesSteven Rostedt (Red Hat)
When a trace file is opened that may access a trace array, it must increment its ref count to prevent it from being deleted. Cc: stable@vger.kernel.org # 3.10 Reported-by: Alexander Lam <azl@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-02tracing: Add trace_array_get/put() to handle instance refs betterSteven Rostedt (Red Hat)
Commit a695cb58162 "tracing: Prevent deleting instances when they are being read" tried to fix a race between deleting a trace instance and reading contents of a trace file. But it wasn't good enough. The following could crash the kernel: # cd /sys/kernel/debug/tracing/instances # ( while :; do mkdir foo; rmdir foo; done ) & # ( while :; do cat foo/trace &> /dev/null; done ) & Luckily this can only be done by root user, but it should be fixed regardless. The problem is that a delete of the file can happen after the reader starts to open the file but before it grabs the trace_types_mutex. The solution is to validate the trace array before using it. If the trace array does not exist in the list of trace arrays, then it returns -ENODEV. There's a possibility that a trace_array could be deleted and a new one created and the open would open its file instead. But that is very minor as it will just return the data of the new trace array, it may confuse the user but it will not crash the system. As this can only be done by root anyway, the race will only occur if root is deleting what its trying to read at the same time. Cc: stable@vger.kernel.org # 3.10 Reported-by: Alexander Lam <azl@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-02drm/radeon/dpm: clarify debugfs warningAlex Deucher
For chips without debugfs dpm support say that it's not implemented rather than not supported to avoid confusion about DPM support in general. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-02metag: cpu hotplug: route_irq: preserve irq maskJames Hogan
The route_irq() function needs to preserve the irq mask by using the _irqsave/irqrestore variants of raw spin lock functions instead of the _irq variants. This is because it is called from __cpu_disable() (via migrate_irqs()), which is called with IRQs disabled, so using the _irq variants re-enables IRQs. This appears to have been causing occasional hits of the BUG_ON(!irqs_disabled()) in __irq_work_run() during CPU hotplug soak testing: BUG: failure at kernel/irq_work.c:122/__irq_work_run()! Signed-off-by: James Hogan <james.hogan@imgtec.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
2013-07-02metag: kick: add missing irq_enter/exit to kick_handler()James Hogan
kick_handler() doesn't have an irq_enter/exit pair, but it's used for handling SMP IPIs which require work to be done in softirqs, which are invoked from irq_exit() when the hard irq nest count reaches 0. The scheduler_ipi() callback in the IPI handler calls irq_enter/exit itself, but this is inside kick_handler()'s spin lock critical section, so if an invoked softirq issues an IPI the kick_handler() will be re-entered on the same CPU and will deadlock. This is easily fixed by adding the missing irq_enter/exit to kick_handler() so that the hard irq nest count doesn't reach 0 until after the spin lock has been released. Ideally the spin lock protected handler list will also be replaced by a lockless RCU protected list since it is certainly mostly read. That can be done in a later change though. Signed-off-by: James Hogan <james.hogan@imgtec.com> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2013-07-02tick: Sanitize broadcast control logicThomas Gleixner
The recent implementation of a generic dummy timer resulted in a different registration order of per cpu local timers which made the broadcast control logic go belly up. If the dummy timer is the first clock event device which is registered for a CPU, then it is installed, the broadcast timer is initialized and the CPU is marked as broadcast target. If a real clock event device is installed after that, we can fail to take the CPU out of the broadcast mask. In the worst case we end up with two periodic timer events firing for the same CPU. One from the per cpu hardware device and one from the broadcast. Now the problem is that we have no way to distinguish whether the system is in a state which makes broadcasting necessary or the broadcast bit was set due to the nonfunctional dummy timer installment. To solve this we need to keep track of the system state seperately and provide a more detailed decision logic whether we keep the CPU in broadcast mode or not. The old decision logic only clears the broadcast mode, if the newly installed clock event device is not affected by power states. The new logic clears the broadcast mode if one of the following is true: - The new device is not affected by power states. - The system is not in a power state affected mode - The system has switched to oneshot mode. The oneshot broadcast is controlled from the deep idle state. The CPU is not in idle at this point, so it's safe to remove it from the mask. If we clear the broadcast bit for the CPU when a new device is installed, we also shutdown the broadcast device when this was the last CPU in the broadcast mask. If the broadcast bit is kept, then we leave the new device in shutdown state and rely on the broadcast to deliver the timer interrupts via the broadcast ipis. Reported-and-tested-by: Stehle Vincent-B46079 <B46079@freescale.com> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Cc: John Stultz <john.stultz@linaro.org>, Cc: Mark Rutland <mark.rutland@arm.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1307012153060.4013@ionos.tec.linutronix.de Cc: stable@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-07-02tick: Prevent uncontrolled switch to oneshot modeThomas Gleixner
When the system switches from periodic to oneshot mode, the broadcast logic causes a possibility that a CPU which has not yet switched to oneshot mode puts its own clock event device into oneshot mode without updating the state and the timer handler. CPU0 CPU1 per cpu tickdev is in periodic mode and switched to broadcast Switch to oneshot mode tick_broadcast_switch_to_oneshot() cpumask_copy(tick_oneshot_broacast_mask, tick_broadcast_mask); broadcast device mode = oneshot Timer interrupt irq_enter() tick_check_oneshot_broadcast() dev->set_mode(ONESHOT); tick_handle_periodic() if (dev->mode == ONESHOT) dev->next_event += period; FAIL. We fail, because dev->next_event contains KTIME_MAX, if the device was in periodic mode before the uncontrolled switch to oneshot happened. We must copy the broadcast bits over to the oneshot mask, because otherwise a CPU which relies on the broadcast would not been woken up anymore after the broadcast device switched to oneshot mode. So we need to verify in tick_check_oneshot_broadcast() whether the CPU has already switched to oneshot mode. If not, leave the device untouched and let the CPU switch controlled into oneshot mode. This is a long standing bug, which was never noticed, because the main user of the broadcast x86 cannot run into that scenario, AFAICT. The nonarchitected timer mess of ARM creates a gazillion of differently broken abominations which trigger the shortcomings of that broadcast code, which better had never been necessary in the first place. Reported-and-tested-by: Stehle Vincent-B46079 <B46079@freescale.com> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Cc: John Stultz <john.stultz@linaro.org>, Cc: Mark Rutland <mark.rutland@arm.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1307012153060.4013@ionos.tec.linutronix.de Cc: stable@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-07-02tick: Make oneshot broadcast robust vs. CPU offliningThomas Gleixner
In periodic mode we remove offline cpus from the broadcast propagation mask. In oneshot mode we fail to do so. This was not a problem so far, but the recent changes to the broadcast propagation introduced a constellation which can result in a NULL pointer dereference. What happens is: CPU0 CPU1 idle() arch_idle() tick_broadcast_oneshot_control(OFF); set cpu1 in tick_broadcast_force_mask if (cpu_offline()) arch_cpu_dead() cpu_dead_cleanup(cpu1) cpu1 tickdevice pointer = NULL broadcast interrupt dereference cpu1 tickdevice pointer -> OOPS We dereference the pointer because cpu1 is still set in tick_broadcast_force_mask and tick_do_broadcast() expects a valid cpumask and therefor lacks any further checks. Remove the cpu from the tick_broadcast_force_mask before we set the tick device pointer to NULL. Also add a sanity check to the oneshot broadcast function, so we can detect such issues w/o crashing the machine. Reported-by: Prarit Bhargava <prarit@redhat.com> Cc: athorlton@sgi.com Cc: CAI Qian <caiqian@redhat.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1306261303260.4013@ionos.tec.linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-07-02metag: smp: don't spin waiting for CPU to startJames Hogan
Use a completion to block until a secondary CPU has started up, like ARM do, instead of a loop of udelays. On Meta, SMP is really SMT, with each "CPU" being a different hardware thread on the same Meta processor core, so as well as being more efficient and latency friendly, using a completion prevents the bogomips of the secondary CPU from being drastically skewed every time by the execution of the tight in-cache udelay loop on the other CPU. Signed-off-by: James Hogan <james.hogan@imgtec.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
2013-07-02metag: smp: enable irqs after set_cpu_onlineJames Hogan
In secondary_start_kernel() interrupts should be enabled with local_irq_enable() after the cpu is marked as online with set_cpu_online(). Otherwise it's possible for a timer interrupt to trigger a softirq, which if the cpu is marked as offline may have it's affinity altered. Reported-by: Kirill Tkhai <tkhai@yandex.ru> Signed-off-by: James Hogan <james.hogan@imgtec.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> Cc: Kirill Tkhai <tkhai@yandex.ru>
2013-07-02metag: use clear_tasks_mm_cpumask()James Hogan
Checking for process->mm is not enough because process' main thread may exit or detach its mm via use_mm(), but other threads may still have a valid mm. To fix this we would need to use find_lock_task_mm(), which would walk up all threads and returns an appropriate task (with task lock held). clear_tasks_mm_cpumask() was introduced in v3.5-rc1 to fix this issue, so let's use it for metag too. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
2013-07-02drm/i915: Don't try to tear down the stolen drm_mm if it's not thereDaniel Vetter
Every other place properly checks whether we've managed to set up the stolen allocator at boot-up properly, with the exception of the cleanup code. Which results in an ugly *ERROR* Memory manager not clean. Delaying takedown at module unload time since the drm_mm isn't initialized at all. v2: While at it check whether the stolen drm_mm is initialized instead of the more obscure stolen_base == 0 check. v3: Fix up the logic. Also we need to keep the stolen_base check in i915_gem_object_create_stolen_for_preallocated since that can be called before stolen memory is fully set up. Spotted by Chris Wilson. v4: Readd the conversion in i915_gem_object_create_stolen_for_preallocated, the check is for the dev_priv->mm.gtt_space drm_mm, the stolen allocatot must already be initialized when calling that function (if we indeed have stolen memory). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65953 Cc: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: lu hua <huax.lu@intel.com> (v3) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-02net: cdc_ether: allow combined control and data interfaceBjørn Mork
Some Icera based Huawei modems handled by this driver are not completely CDC ECM compliant, using the same USB interface for both control and data. The CDC functional descriptors include a Union naming this interface as both master and slave, so it is supportable by relaxing the descriptor parsing in case these interfaces are identical. This has been tested on a Huawei K3806 and verified to add support for that device. Reported-and-tested-by: Enrico Mioso <mrkiko.rs@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Acked-by: Oliver Neukum <oliver@neukum.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02net: fec: Fix Transmitted bytes counterJim Baxter
The tx_bytes field was not being updated so the network card statistics showed 0.0B transmitted. Signed-off-by: Jim Baxter <jim_baxter@mentor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02pstore: Add hsize argument in write_buf call of pstore_ftrace_callAruna Balakrishnaiah
Incorporate the addition of hsize argument in write_buf callback of pstore. This was forgotten in 6bbbca735936e15b9431882eceddcf6dff76e03c pstore: Pass header size in the pstore write callback Causing a build failure when ftrace and pstore are enabled. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-02Merge remote-tracking branch 'agust/next' into nextBenjamin Herrenschmidt
From Anatolij: "There are small cleanups and fixes for mpc512x common code, mpc512x_defconfig updates and soft reboot support for mpc5125 based boards."
2013-07-02ipip: fix a regression in ioctlCong Wang
This is a regression introduced by commit fd58156e456d9f68fe0448 (IPIP: Use ip-tunneling code.) Similar to GRE tunnel, previously we only check the parameters for SIOCADDTUNNEL and SIOCCHGTUNNEL, after that commit, the check is moved for all commands. So, just check for SIOCADDTUNNEL and SIOCCHGTUNNEL. Also, the check for i_key, o_key etc. is suspicious too, which did not exist before, reset them before passing to ip_tunnel_ioctl(). Cc: Pravin B Shelar <pshelar@nicira.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02l2tp: add missing .owner to struct pppox_protoWei Yongjun
Add missing .owner of struct pppox_proto. This prevents the module from being removed from underneath its users. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02net: ethernet: davinci_emac: remove redundant dev_err call in ↵Wei Yongjun
davinci_emac_probe() There is a error message within devm_ioremap_resource already, so remove the dev_err call to avoid redundant error message. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Lad, Prabhakar <prabhakar.csengg@gmail.com> Acked-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02xen-netback: xenbus.c: use more current logging stylesWei Liu
Convert one printk to pr_<level>. Add a missing newline in several places to avoid message interleaving, coalesce formats, reflow modified lines to 80 columns. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02x86/tracing: Add irq_enter/exit() in smp_trace_reschedule_interrupt()Seiji Aguchi
Reschedule vector tracepoints may be called in cpu idle state. This causes lockdep check warning below. The tracepoint requires rcu but for accuracy it also requires irq_enter() (tracepoints record the irq context), thus, the tracepoint interrupt handler should be calling irq_enter() and not rcu_irq_enter() (irq_enter() calls rcu_irq_enter()). So, add irq_enter/exit() to smp_trace_reschedule_interrupt() with common pre/post processing functions, smp_entering_irq() and exiting_irq() (exiting_irq() calls just irq_exit() in arch/x86/include/asm/apic.h), because these can be shared among reschedule, call_function, and call_function_single vectors. [ 50.720557] Testing event reschedule_exit: [ 50.721349] [ 50.721502] =============================== [ 50.721835] [ INFO: suspicious RCU usage. ] [ 50.722169] 3.10.0-rc6-00004-gcf910e8 #190 Not tainted [ 50.722582] ------------------------------- [ 50.722915] /c/kernel-tests/src/linux/arch/x86/include/asm/trace/irq_vectors.h:50 suspicious rcu_dereference_check() usage! [ 50.723770] [ 50.723770] other info that might help us debug this: [ 50.723770] [ 50.724385] [ 50.724385] RCU used illegally from idle CPU! [ 50.724385] rcu_scheduler_active = 1, debug_locks = 0 [ 50.725232] RCU used illegally from extended quiescent state! [ 50.725690] no locks held by swapper/0/0. [ 50.726010] [ 50.726010] stack backtrace: [...] Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/51CDCFA3.9080101@hds.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-07-02Merge remote-tracking branch 'scott/next' into nextBenjamin Herrenschmidt
Merge Freescale updates
2013-07-02ipv4: remove fib_update_nh_saddrs() declaration.Rami Rosen
This patch removes the fib_update_nh_saddrs() declaration from include/net/ip_fib.h, as the fib_update_nh_saddrs() method was removed in coomit 436c3b6 ("ipv4: Invalidate nexthop cache nh_saddr more correctly"). Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02net: ipheth: Add USB ID for iPad miniAaron Marburg
Adds the USB device ID (0x12ab) to the ipheth network-over-USB-tethering driver for iOS devices. Applied and tested against mainline tag v3.10 (as well as 3.8.x and 3.6.y kernel for Raspbian on Raspberry pi) Signed-off-by: Aaron Marburg <amarburg@notetofutureself.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02net: sctp: prevent checksum.h from double inclusionDaniel Borkmann
The header file checksum.h is missing proper defines that prevents it from double inclusion. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02tools: selftests: psock_tpacket: get rid of macro wrappersDaniel Borkmann
The TPACKET_V3 test code consists of a lot of unecessary macro wrappers that rather obfuscate what members are accessed in what way. So get rid of them and make the code more readable. Also credit Chetan for providing tpacket_v3 example code. Furthermore, get rid of private offset usage, as we do not need it here. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02ethtool: make .get_dump_data() harder to misuse by driversMichal Schmidt
As the patch "bnx2x: remove zeroing of dump data buffer" showed, it is too easy implement .get_dump_data incorrectly in a driver. Let's make sure drivers cannot get confused by userspace requesting a too big dump. Also WARN if the driver sets dump->len to something weird and make sure the length reported to userspace is the actual length of data copied to userspace. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02bnx2x: fill in sane dump flag informationMichal Schmidt
bnx2x did not set dump->version in its .get_dump_flag() method. Let's set it to BNX2X_DUMP_VERSION. It's not a particularly nice number (0x50acff01 currently), but at least it's something deterministic. dump->flag should report the value previously set using ethtool -W. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02bnx2x: fix dump flag handlingMichal Schmidt
bnx2x interprets the dump flag as an index of a register preset. It is important to validate the index to avoid out of bounds memory accesses. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02bnx2x: remove zeroing of dump data bufferMichal Schmidt
There is no need to initialize the dump data with zeros. data is allocated with vzalloc, so it's already zero-filled. More importantly, the memset is harmful, because dump->len (the length requested by userspace) can be bigger than the allocated buffer (whose size is determined by asking the driver's .get_dump_flag method). Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02net: sctp: get rid of SCTP_DBG_TSNS entirelyDaniel Borkmann
After having reworked the debugging framework, Neil and Vlad agreed to get rid of the leftover SCTP_DBG_TSNS code for a couple of reasons: We can use systemtap scripts to investigate these things, we now have pr_debug() helpers that make life easier, and if we really need anything else besides those tools, we will be forced to come up with something better than we have there. Therefore, get rid of this ifdef debugging code entirely for now. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> CC: Vlad Yasevich <vyasevich@gmail.com> CC: Neil Horman <nhorman@tuxdriver.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01cassini: Make missing firmware non-fatalBen Hutchings
The firmware patch for the Saturn PHY fixes a bug, but is not absolutely essential. And its licence is unclear, so it is not included in all distributions. Just log an error message and continue if it is missing or invalid. References: http://bugs.debian.org/712674 Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Tested-by: Jose Andres Arias Velichko <Andres.Arias@PaisLinux.net> (against 3.2) Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01qmi_wwan: add ONDA MT689DC device ID (fwd)Enrico Mioso
Another QMI-speaking device by ZTE, re-branded by ONDA! I'm connected ovr this device's QMI interface right now, so I can say I tested it! :) Note: a follow-up patch was posted to the linux-usb mailing list, to prevent the option driver from binding to the device's QMI interface, making it unusable. Signed-off-by: Enrico Mioso <mrkiko.rs@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01ipv6,mcast: always hold idev->lock before mca_lockAmerigo Wang
dingtianhong reported the following deadlock detected by lockdep: ====================================================== [ INFO: possible circular locking dependency detected ] 3.4.24.05-0.1-default #1 Not tainted ------------------------------------------------------- ksoftirqd/0/3 is trying to acquire lock: (&ndev->lock){+.+...}, at: [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120 but task is already holding lock: (&mc->mca_lock){+.+...}, at: [<ffffffff8149d130>] mld_send_report+0x40/0x150 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&mc->mca_lock){+.+...}: [<ffffffff810a8027>] validate_chain+0x637/0x730 [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500 [<ffffffff810a8734>] lock_acquire+0x114/0x150 [<ffffffff814f691a>] rt_spin_lock+0x4a/0x60 [<ffffffff8149e4bb>] igmp6_group_added+0x3b/0x120 [<ffffffff8149e5d8>] ipv6_mc_up+0x38/0x60 [<ffffffff81480a4d>] ipv6_find_idev+0x3d/0x80 [<ffffffff81483175>] addrconf_notify+0x3d5/0x4b0 [<ffffffff814fae3f>] notifier_call_chain+0x3f/0x80 [<ffffffff81073471>] raw_notifier_call_chain+0x11/0x20 [<ffffffff813d8722>] call_netdevice_notifiers+0x32/0x60 [<ffffffff813d92d4>] __dev_notify_flags+0x34/0x80 [<ffffffff813d9360>] dev_change_flags+0x40/0x70 [<ffffffff813ea627>] do_setlink+0x237/0x8a0 [<ffffffff813ebb6c>] rtnl_newlink+0x3ec/0x600 [<ffffffff813eb4d0>] rtnetlink_rcv_msg+0x160/0x310 [<ffffffff814040b9>] netlink_rcv_skb+0x89/0xb0 [<ffffffff813eb357>] rtnetlink_rcv+0x27/0x40 [<ffffffff81403e20>] netlink_unicast+0x140/0x180 [<ffffffff81404a9e>] netlink_sendmsg+0x33e/0x380 [<ffffffff813c4252>] sock_sendmsg+0x112/0x130 [<ffffffff813c537e>] __sys_sendmsg+0x44e/0x460 [<ffffffff813c5544>] sys_sendmsg+0x44/0x70 [<ffffffff814feab9>] system_call_fastpath+0x16/0x1b -> #0 (&ndev->lock){+.+...}: [<ffffffff810a798e>] check_prev_add+0x3de/0x440 [<ffffffff810a8027>] validate_chain+0x637/0x730 [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500 [<ffffffff810a8734>] lock_acquire+0x114/0x150 [<ffffffff814f6c82>] rt_read_lock+0x42/0x60 [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120 [<ffffffff8149b036>] mld_newpack+0xb6/0x160 [<ffffffff8149b18b>] add_grhead+0xab/0xc0 [<ffffffff8149d03b>] add_grec+0x3ab/0x460 [<ffffffff8149d14a>] mld_send_report+0x5a/0x150 [<ffffffff8149f99e>] igmp6_timer_handler+0x4e/0xb0 [<ffffffff8105705a>] call_timer_fn+0xca/0x1d0 [<ffffffff81057b9f>] run_timer_softirq+0x1df/0x2e0 [<ffffffff8104e8c7>] handle_pending_softirqs+0xf7/0x1f0 [<ffffffff8104ea3b>] __do_softirq_common+0x7b/0xf0 [<ffffffff8104f07f>] __thread_do_softirq+0x1af/0x210 [<ffffffff8104f1c1>] run_ksoftirqd+0xe1/0x1f0 [<ffffffff8106c7de>] kthread+0xae/0xc0 [<ffffffff814fff74>] kernel_thread_helper+0x4/0x10 actually we can just hold idev->lock before taking pmc->mca_lock, and avoid taking idev->lock again when iterating idev->addr_list, since the upper callers of mld_newpack() already take read_lock_bh(&idev->lock). Reported-by: dingtianhong <dingtianhong@huawei.com> Cc: dingtianhong <dingtianhong@huawei.com> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: David S. Miller <davem@davemloft.net> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Tested-by: Ding Tianhong <dingtianhong@huawei.com> Tested-by: Chen Weilong <chenweilong@huawei.com> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01vti: remove duplicated code to fix a memory leakCong Wang
vti module allocates dev->tstats twice: in vti_fb_tunnel_init() and in vti_tunnel_init(), this lead to a memory leak of dev->tstats. Just remove the duplicated operations in vti_fb_tunnel_init(). (candidate for -stable) Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Saurabh Mohan <saurabh.mohan@vyatta.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01gre: fix a regression in ioctlCong Wang
When testing GRE tunnel, I got: # ip tunnel show get tunnel gre0 failed: Invalid argument get tunnel gre1 failed: Invalid argument This is a regression introduced by commit c54419321455631079c7d ("GRE: Refactor GRE tunneling code.") because previously we only check the parameters for SIOCADDTUNNEL and SIOCCHGTUNNEL, after that commit, the check is moved for all commands. So, just check for SIOCADDTUNNEL and SIOCCHGTUNNEL. After this patch I got: # ip tunnel show gre0: gre/ip remote any local any ttl inherit nopmtudisc gre1: gre/ip remote 192.168.122.101 local 192.168.122.45 ttl inherit Cc: Pravin B Shelar <pshelar@nicira.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01net: sctp: rework debugging framework to use pr_debug and friendsDaniel Borkmann
We should get rid of all own SCTP debug printk macros and use the ones that the kernel offers anyway instead. This makes the code more readable and conform to the kernel code, and offers all the features of dynamic debbuging that pr_debug() et al has, such as only turning on/off portions of debug messages at runtime through debugfs. The runtime cost of having CONFIG_DYNAMIC_DEBUG enabled, but none of the debug statements printing, is negligible [1]. If kernel debugging is completly turned off, then these statements will also compile into "empty" functions. While we're at it, we also need to change the Kconfig option as it /now/ only refers to the ifdef'ed code portions in outqueue.c that enable further debugging/tracing of SCTP transaction fields. Also, since SCTP_ASSERT code was enabled with this Kconfig option and has now been removed, we transform those code parts into WARNs resp. where appropriate BUG_ONs so that those bugs can be more easily detected as probably not many people have SCTP debugging permanently turned on. To turn on all SCTP debugging, the following steps are needed: # mount -t debugfs none /sys/kernel/debug # echo -n 'module sctp +p' > /sys/kernel/debug/dynamic_debug/control This can be done more fine-grained on a per file, per line basis and others as described in [2]. [1] https://www.kernel.org/doc/ols/2009/ols2009-pages-39-46.pdf [2] Documentation/dynamic-debug-howto.txt Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01lib: vsprintf: add IPv4/v6 generic %p[Ii]S[pfs] format specifierDaniel Borkmann
In order to avoid making code that deals with printing both, IPv4 and IPv6 addresses, unnecessary complicated as for example ... if (sa.sa_family == AF_INET6) printk("... %pI6 ...", ..sin6_addr); else printk("... %pI4 ...", ..sin_addr.s_addr); ... it would be better to introduce a format specifier that can deal with those kind of situations internally; just as we have a "struct sockaddr" for generic mapping into "struct sockaddr_in" or "struct sockaddr_in6" as e.g. done in "union sctp_addr". Then, we could reduce the above statement into something like: printk("... %pIS ..", &sockaddr); In case our pointer is NULL, pointer() then deals with that already at an earlier point in time internally. While we're at it, support for both %piS/%pIS, where 'S' stands for sockaddr, comes (almost) for free. Additionally to that, postfix specifiers 'p', 'f' and 's' are supported as suggested and initially implemented in 2009 by Joe Perches [1]. Handling of those additional specifiers orientate on the initial RFC that was proposed. Also we support IPv6 compressed format specified by 'c' and various other IPv4 extensions as stated in the documentation part. Likely, there are many other areas than just SCTP in the kernel to make use of this extension as well. [1] http://patchwork.ozlabs.org/patch/31480/ Signed-off-by: Daniel Borkmann <dborkman@redhat.com> CC: Joe Perches <joe@perches.com> CC: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>