summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-04-01cpufreq: ondemand: Don't update sample_type if we don't evaluate load againViresh Kumar
Because we have per cpu timer now, we check if we need to evaluate load again or not (In case it is recently evaluated). Here the 2nd cpu which got timer interrupt updates core_dbs_info->sample_type irrespective of load evaluation is required or not. Which is wrong as the first cpu is dependent on this variable set to an older value. Moreover it would be best in this case to schedule 2nd cpu's timer to sampling_rate instead of freq_lo or hi as that must be managed by the other cpu. In case the other cpu idles in between then also we wouldn't loose much power. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01cpufreq: governor: Set MIN_LATENCY_MULTIPLIER to 20Viresh Kumar
Currently MIN_LATENCY_MULTIPLIER is set defined as 100 and so on a system with transition latency of 1 ms, the minimum sampling time comes to be around 100 ms. That is quite big if you want to get better performance for your system. Redefine MIN_LATENCY_MULTIPLIER to 20 so that we can support 20ms sampling rate for such platforms. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01cpufreq: governor: Implement per policy instances of governorsViresh Kumar
Currently, there can't be multiple instances of single governor_type. If we have a multi-package system, where we have multiple instances of struct policy (per package), we can't have multiple instances of same governor. i.e. We can't have multiple instances of ondemand governor for multiple packages. Governors directory in sysfs is created at /sys/devices/system/cpu/cpufreq/ governor-name/. Which again reflects that there can be only one instance of a governor_type in the system. This is a bottleneck for multicluster system, where we want different packages to use same governor type, but with different tunables. This patch uses the infrastructure provided by earlier patch and implements init/exit routines for ondemand and conservative governors. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01cpufreq: Add per policy governor-init/exit infrastructureViresh Kumar
Currently, there can't be multiple instances of single governor_type. If we have a multi-package system, where we have multiple instances of struct policy (per package), we can't have multiple instances of same governor. i.e. We can't have multiple instances of ondemand governor for multiple packages. Governors directory in sysfs is created at /sys/devices/system/cpu/cpufreq/ governor-name/. Which again reflects that there can be only one instance of a governor_type in the system. This is a bottleneck for multicluster system, where we want different packages to use same governor type, but with different tunables. This patch is inclined towards providing this infrastructure. Because we are required to allocate governor's resources dynamically now, we must do it at policy creation and end. And so got CPUFREQ_GOV_POLICY_INIT/EXIT. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01cpufreq: Convert the cpufreq_driver_lock to a rwlockNathan Zimmer
This eliminates the contention I am seeing in __cpufreq_cpu_get. It also nicely stages the lock to be replaced by the rcu. Signed-off-by: Nathan Zimmer <nzimmer@sgi.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01cpuidle: imx6: remove timer broadcast initializationDaniel Lezcano
The initialization is done from the cpuidle framework. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01cpuidle: OMAP4: remove timer broadcast initializationDaniel Lezcano
The initialization is done from the cpuidle framework. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01cpuidle: ux500: remove timer broadcast initializationDaniel Lezcano
The initialization is done from the cpuidle framework. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01cpuidle: initialize the broadcast timer frameworkDaniel Lezcano
The commit 89878baa73f0f1c679355006bd8632e5d78f96c2 introduced the CPUIDLE_FLAG_TIMER_STOP flag where we specify a specific idle state stops the local timer. Now use this flag to check at init time if one state will need the broadcast timer and, in this case, setup the broadcast timer framework. That prevents multiple code duplication in the drivers. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01timer: move enum definition out of ifdef sectionDaniel Lezcano
The next patch will setup automatically the broadcast timer for the different cpuidle driver when one idle state stops its timer. This will be part of the generic code. But some ARM boards, like s3c64xx, uses cpuidle but without the CONFIG_GENERIC_CLOCKEVENTS_BUILD set. Hence the cpuidle framework will be compiled with the code supposed to be generic, that is with clockevents_notify and the different enum. Also the function clockevents_notify is a noop macro, this is fine except the usual code is: int cpu = smp_processor_id(); clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ON, &cpu); and that raises a warning for the variable cpu which is not used. Move the clock_event_nofitiers enum definition out of the CONFIG_GENERIC_CLOCKEVENTS_BUILD section to prevent a compilation error when these are used in the code. Change the clockevents_notify macro to a static inline noop function to prevent a compilation warning. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01cpuidle: kirkwood: fix coccicheck warningsSilviu-Mihai Popescu
Convert all uses of devm_request_and_ioremap() to the newly introduced devm_ioremap_resource() which provides more consistent error handling. devm_ioremap_resource() provides its own error messages so all explicit error messages can be removed from the failure code paths. Signed-off-by: Silviu-Mihai Popescu <silviupopescu1990@gmail.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01cpuidle / kirkwood: remove redundant Kconfig optionDaniel Lezcano
When the CPU_IDLE and the ARCH_KIRKWOOD options are set it is pointless to define a new option CPU_IDLE_KIRKWOOD because it is redundant. The Makefile drivers directory contains a condition to compile the cpuidle drivers: obj-$(CONFIG_CPU_IDLE) += cpuidle/ Hence, if CPU_IDLE is not set we won't enter this directory. This patch removes the useless Kconfig option and replaces the condition in the Makefile by CONFIG_ARCH_KIRKWOOD. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Jason Cooper <jason@lakedaemon.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01cpuidle / ux500 : use CPUIDLE_FLAG_TIMER_STOP flagDaniel Lezcano
Use the CPUIDLE_FLAG_TIMER_STOP and let the cpuidle framework to handle the CLOCK_EVT_NOTIFY_BROADCAST_ENTER/EXIT when entering this state. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01cpuidle / imx6 : use CPUIDLE_FLAG_TIMER_STOP flagDaniel Lezcano
Use the CPUIDLE_FLAG_TIMER_STOP and let the cpuidle framework to handle the CLOCK_EVT_NOTIFY_BROADCAST_ENTER/EXIT when entering this state. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01cpuidle / omap4 : use CPUIDLE_FLAG_TIMER_STOP flagDaniel Lezcano
Use the CPUIDLE_FLAG_TIMER_STOP and let the cpuidle framework to handle the CLOCK_EVT_NOTIFY_BROADCAST_ENTER/EXIT when entering this state. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01cpuidle : handle clockevent notify from the cpuidle frameworkDaniel Lezcano
When a cpu enters a deep idle state, the local timers are stopped and the time framework falls back to the timer device used as a broadcast timer. The different cpuidle drivers are calling clockevents_notify ENTER/EXIT when the idle state stops the local timer. Add a new flag CPUIDLE_FLAG_TIMER_STOP which can be set by the cpuidle drivers. If the flag is set, the cpuidle core code takes care of the notification on behalf of the driver to avoid pointless code duplication. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01NFC: microread: Fix build failure due to a new MEI bus APISamuel Ortiz
uuid device_id field is removed and mei_device is renamed mei_cl_device. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-03-31Merge remote-tracking branch 'regmap/fix/async' into tmpMark Brown
2013-03-31Linux 3.9-rc5v3.9-rc5Linus Torvalds
2013-03-31Merge remote-tracking branch 'regmap/fix/core' into tmpMark Brown
2013-03-31Merge remote-tracking branch 'regmap/fix/cache' into tmpMark Brown
2013-03-31Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dmaLinus Torvalds
Pull slave-dmaengine fixes from Vinod Koul: "Two fixes for slave-dmaengine. The first one is for making slave_id value correct for dw_dmac and the other one fixes the endieness in DT parsing" * 'fixes' of git://git.infradead.org/users/vkoul/slave-dma: dw_dmac: adjust slave_id accordingly to request line base dmaengine: dw_dma: fix endianess for DT xlate function
2013-03-31Merge branch 'v4l_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "For a some fixes for Kernel 3.9: - subsystem build fix when VIDEO_DEV=y, VIDEO_V4L2=m and I2C=m - compilation fix for arm multiarch preventing IR_RX51 to be selected - regression fix at bttv crop logic - s5p-mfc/m5mols/exynos: a few fixes for cameras on exynos hardware" * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] [REGRESSION] bt8xx: Fix too large height in cropcap [media] fix compilation with both V4L2 and I2C as 'm' [media] m5mols: Fix bug in stream on handler [media] s5p-fimc: Do not attempt to disable not enabled media pipeline [media] s5p-mfc: Fix encoder control 15 issue [media] s5p-mfc: Fix frame skip bug [media] s5p-fimc: send valid m2m ctx to fimc_m2m_job_finish [media] exynos-gsc: send valid m2m ctx to gsc_m2m_job_finish [media] fimc-lite: Fix the variable type to avoid possible crash [media] fimc-lite: Initialize 'step' field in fimc_lite_ctrl structure [media] ir: IR_RX51 only works on OMAP2
2013-03-31Merge tag 'for-linus-20130331' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "Alright, this time from 10K up in the air. Collection of fixes that have been queued up since the merge window opened, hence postponed until later in the cycle. The pull request contains: - A bunch of fixes for the xen blk front/back driver. - A round of fixes for the new IBM RamSan driver, fixing various nasty issues. - Fixes for multiple drives from Wei Yongjun, bad handling of return values and wrong pointer math. - A fix for loop properly killing partitions when being detached." * tag 'for-linus-20130331' of git://git.kernel.dk/linux-block: (25 commits) mg_disk: fix error return code in mg_probe() rsxx: remove unused variable rsxx: enable error return of rsxx_eeh_save_issued_dmas() block: removes dynamic allocation on stack Block: blk-flush: Fixed indent code style cciss: fix invalid use of sizeof in cciss_find_cfgtables() loop: cleanup partitions when detaching loop device loop: fix error return code in loop_add() mtip32xx: fix error return code in mtip_pci_probe() xen-blkfront: remove frame list from blk_shadow xen-blkfront: pre-allocate pages for requests xen-blkback: don't store dev_bus_addr xen-blkfront: switch from llist to list xen-blkback: fix foreach_grant_safe to handle empty lists xen-blkfront: replace kmalloc and then memcpy with kmemdup xen-blkback: fix dispatch_rw_block_io() error path rsxx: fix missing unlock on error return in rsxx_eeh_remap_dmas() Adding in EEH support to the IBM FlashSystem 70/80 device driver block: IBM RamSan 70/80 error message bug fix. block: IBM RamSan 70/80 branding changes. ...
2013-03-31Revert "lockdep: check that no locks held at freeze time"Paul Walmsley
This reverts commit 6aa9707099c4b25700940eb3d016f16c4434360d. Commit 6aa9707099c4 ("lockdep: check that no locks held at freeze time") causes problems with NFS root filesystems. The failures were noticed on OMAP2 and 3 boards during kernel init: [ BUG: swapper/0/1 still has locks held! ] 3.9.0-rc3-00344-ga937536 #1 Not tainted ------------------------------------- 1 lock held by swapper/0/1: #0: (&type->s_umount_key#13/1){+.+.+.}, at: [<c011e84c>] sget+0x248/0x574 stack backtrace: rpc_wait_bit_killable __wait_on_bit out_of_line_wait_on_bit __rpc_execute rpc_run_task rpc_call_sync nfs_proc_get_root nfs_get_root nfs_fs_mount_common nfs_try_mount nfs_fs_mount mount_fs vfs_kern_mount do_mount sys_mount do_mount_root mount_root prepare_namespace kernel_init_freeable kernel_init Although the rootfs mounts, the system is unstable. Here's a transcript from a PM test: http://www.pwsan.com/omap/testlogs/test_v3.9-rc3/20130317194234/pm/37xxevm/37xxevm_log.txt Here's what the test log should look like: http://www.pwsan.com/omap/testlogs/test_v3.8/20130218214403/pm/37xxevm/37xxevm_log.txt Mailing list discussion is here: http://lkml.org/lkml/2013/3/4/221 Deal with this for v3.9 by reverting the problem commit, until folks can figure out the right long-term course of action. Signed-off-by: Paul Walmsley <paul@pwsan.com> Cc: Mandeep Singh Baines <msb@chromium.org> Cc: Jeff Layton <jlayton@redhat.com> Cc: Shawn Guo <shawn.guo@linaro.org> Cc: <maciej.rutecki@gmail.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ben Chan <benchan@chromium.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-30cfg80211: sched_scan_mtx lock in cfg80211_conn_work()Artem Savkov
Introduced in f9f475292dbb0e7035fb6661d1524761ea0888d9 ("cfg80211: always check for scan end on P2P device") cfg80211_conn_scan() which requires sched_scan_mtx to be held can be called from cfg80211_conn_work(). Without this we are hitting multiple warnings like the following: WARNING: at net/wireless/sme.c:88 cfg80211_conn_scan+0x1dc/0x3a0 [cfg80211]() Hardware name: 0578A21 Modules linked in: ... Pid: 620, comm: kworker/3:1 Not tainted 3.9.0-rc4-next-20130328+ #326 Call Trace: [<c1036992>] warn_slowpath_common+0x72/0xa0 [<c10369e2>] warn_slowpath_null+0x22/0x30 [<faa4b0ec>] cfg80211_conn_scan+0x1dc/0x3a0 [cfg80211] [<faa4b344>] cfg80211_conn_do_work+0x94/0x380 [cfg80211] [<faa4c3b2>] cfg80211_conn_work+0xa2/0x130 [cfg80211] [<c1051858>] process_one_work+0x198/0x450 Signed-off-by: Artem Savkov <artem.savkov@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pendingLinus Torvalds
Pull SCSI target fixes from Nicholas Bellinger: "This includes the bug-fix for a >= v3.8-rc1 regression specific to iscsi-target persistent reservation conflict handling (CC'ed to stable), and a tcm_vhost patch to drop VIRTIO_RING_F_EVENT_IDX usage so that in-flight qemu vhost-scsi-pci device code can detect the proper vhost feature bits. Also, there are two more tcm_vhost patches still being discussed by MST and Asias for v3.9 that will be required for the in-flight qemu vhost-scsi-pci device patch to function properly, and that should (hopefully) be the last target fixes for this round." * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: target: Fix RESERVATION_CONFLICT status regression for iscsi-target special case tcm_vhost: Avoid VIRTIO_RING_F_EVENT_IDX feature bit
2013-03-30ARM: cns3xxx: fix mapping of private memory regionMac Lin
Since commit 0536bdf33faf (ARM: move iotable mappings within the vmalloc region), the Cavium CNS3xxx cannot boot anymore. This is caused by the pre-defined iotable mappings is not in the vmalloc region. This patch move the iotable mappings into the vmalloc region, and merge the MPCore private memory region (containing the SCU, the GIC and the TWD) as a single region. Signed-off-by: Mac Lin <mkl0301@gmail.com> Signed-off-by: Anton Vorontsov <anton@enomsg.org> Cc: stable@vger.kernel.org [v3.3+]
2013-03-30virtio: console: add locking around c_ovq operationsAmit Shah
When multiple ovq operations are being performed (lots of open/close operations on virtio_console fds), the __send_control_msg() function can get confused without locking. A simple recipe to cause badness is: * create a QEMU VM with two virtio-serial ports * in the guest, do while true;do echo abc >/dev/vport0p1;done while true;do echo edf >/dev/vport0p2;done In one run, this caused a panic in __send_control_msg(). In another, I got virtio_console virtio0: control-o:id 0 is not a head! This also results repeated messages similar to these on the host: qemu-kvm: virtio-serial-bus: Unexpected port id 478762112 for device virtio-serial-bus.0 qemu-kvm: virtio-serial-bus: Unexpected port id 478762368 for device virtio-serial-bus.0 Reported-by: FuXiangChun <xfu@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com> Reviewed-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> Reviewed-by: Asias He <asias@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: stable@kernel.org
2013-03-30virtio: console: rename cvq_lock to c_ivq_lockAmit Shah
The cvq_lock was taken for the c_ivq. Rename the lock to make that obvious. We'll also add a lock around the c_ovq in the next commit, so there's no ambiguity. Signed-off-by: Amit Shah <amit.shah@redhat.com> Reviewed-by: Asias He <asias@redhat.com> Reviewed-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: stable@kernel.org
2013-03-30PCI / ACPI: Don't query OSC support with all possible controlsYinghai Lu
Found problem on system that firmware that could handle pci aer. Firmware get error reporting after pci injecting error, before os boots. But after os boots, firmware can not get report anymore, even pci=noaer is passed. Root cause: BIOS _OSC has problem with query bit checking. It turns out that BIOS vendor is copying example code from ACPI Spec. In ACPI Spec 5.0, page 290: If (Not(And(CDW1,1))) // Query flag clear? { // Disable GPEs for features granted native control. If (And(CTRL,0x01)) // Hot plug control granted? { Store(0,HPCE) // clear the hot plug SCI enable bit Store(1,HPCS) // clear the hot plug SCI status bit } ... } When Query flag is set, And(CDW1,1) will be 1, Not(1) will return 0xfffffffe. So it will get into code path that should be for control set only. BIOS acpi code should be changed to "If (LEqual(And(CDW1,1), 0)))" Current kernel code is using _OSC query to notify firmware about support from OS and then use _OSC to set control bits. During query support, current code is using all possible controls. So will execute code that should be only for control set stage. That will have problem when pci=noaer or aer firmware_first is used. As firmware have that control set for os aer already in query support stage, but later will not os aer handling. We should avoid passing all possible controls, just use osc_control_set instead. That should workaround BIOS bugs with affected systems on the field as more bios vendors are copying sample code from ACPI spec. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-03-30dw_dmac: adjust slave_id accordingly to request line baseAndy Shevchenko
On some hardware configurations we have got the request line with the offset. The patch introduces convert_slave_id() helper for that cases. The request line base is came from the driver data provided by the platform_device_id table. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-03-30dmaengine: dw_dma: fix endianess for DT xlate functionArnd Bergmann
As reported by Wu Fengguang's build robot tracking sparse warnings, the dma_spec arguments in the dw_dma_xlate are already byte swapped on little-endian platforms and must not get swapped again. This code is currently not used anywhere, but will be used in Linux 3.10 when the ARM SPEAr platform starts using the generic DMA DT binding. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-03-29PNP: List Rafael Wysocki as a maintainerRafael J. Wysocki
The Adam Belay's e-mail address in MAINTAINERS under PNP SUPPORT is not valid any more and I started to maintain that code in the meantime as a matter of fact, so list myself as a maintainer of it along with Bjorn and remove the Adam's entry from it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-29ks8851: Fix interpretation of rxlen field.Max.Nekludov@us.elster.com
According to the Datasheet (page 52): 15-12 Reserved 11-0 RXBC Receive Byte Count This field indicates the present received frame byte size. The code has a bug: rxh = ks8851_rdreg32(ks, KS_RXFHSR); rxstat = rxh & 0xffff; rxlen = rxh >> 16; // BUG!!! 0xFFF mask should be applied Signed-off-by: Max Nekludov <Max.Nekludov@us.elster.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-29net: add a synchronize_net() in netdev_rx_handler_unregister()Eric Dumazet
commit 35d48903e97819 (bonding: fix rx_handler locking) added a race in bonding driver, reported by Steven Rostedt who did a very good diagnosis : <quoting Steven> I'm currently debugging a crash in an old 3.0-rt kernel that one of our customers is seeing. The bug happens with a stress test that loads and unloads the bonding module in a loop (I don't know all the details as I'm not the one that is directly interacting with the customer). But the bug looks to be something that may still be present and possibly present in mainline too. It will just be much harder to trigger it in mainline. In -rt, interrupts are threads, and can schedule in and out just like any other thread. Note, mainline now supports interrupt threads so this may be easily reproducible in mainline as well. I don't have the ability to tell the customer to try mainline or other kernels, so my hands are somewhat tied to what I can do. But according to a core dump, I tracked down that the eth irq thread crashed in bond_handle_frame() here: slave = bond_slave_get_rcu(skb->dev); bond = slave->bond; <--- BUG the slave returned was NULL and accessing slave->bond caused a NULL pointer dereference. Looking at the code that unregisters the handler: void netdev_rx_handler_unregister(struct net_device *dev) { ASSERT_RTNL(); RCU_INIT_POINTER(dev->rx_handler, NULL); RCU_INIT_POINTER(dev->rx_handler_data, NULL); } Which is basically: dev->rx_handler = NULL; dev->rx_handler_data = NULL; And looking at __netif_receive_skb() we have: rx_handler = rcu_dereference(skb->dev->rx_handler); if (rx_handler) { if (pt_prev) { ret = deliver_skb(skb, pt_prev, orig_dev); pt_prev = NULL; } switch (rx_handler(&skb)) { My question to all of you is, what stops this interrupt from happening while the bonding module is unloading? What happens if the interrupt triggers and we have this: CPU0 CPU1 ---- ---- rx_handler = skb->dev->rx_handler netdev_rx_handler_unregister() { dev->rx_handler = NULL; dev->rx_handler_data = NULL; rx_handler() bond_handle_frame() { slave = skb->dev->rx_handler; bond = slave->bond; <-- NULL pointer dereference!!! What protection am I missing in the bond release handler that would prevent the above from happening? </quoting Steven> We can fix bug this in two ways. First is adding a test in bond_handle_frame() and others to check if rx_handler_data is NULL. A second way is adding a synchronize_net() in netdev_rx_handler_unregister() to make sure that a rcu protected reader has the guarantee to see a non NULL rx_handler_data. The second way is better as it avoids an extra test in fast path. Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jiri Pirko <jpirko@redhat.com> Cc: Paul E. McKenney <paulmck@us.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-29MAINTAINERS: Update netxen_nic maintainers listManish Chopra
o Add myself to netxen_nic maintainers list Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-29atl1e: drop pci-msi support because of packet corruptionHannes Frederic Sowa
Usage of pci-msi results in corrupted dma packet transfers to the host. Reported-by: rebelyouth <rebelyouth.hacklab@gmail.com> Cc: Huang, Xiong <xiong@qca.qualcomm.com> Tested-by: Christian Sünkenberg <christian.suenkenberg@student.kit.edu> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-29net: fq_codel: Fix off-by-one errorVijay Subramanian
Currently, we hold a max of sch->limit -1 number of packets instead of sch->limit packets. Fix this off-by-one error. Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-29net: calxedaxgmac: Wake-on-LAN fixesRob Herring
WOL is broken because the magic packet status bit is getting set rather than the enable bit. The PMT interrupt is not getting serviced because the PMT interrupt is also enabled on the global interrupt, but not cleared by the global interrupt and the global interrupt is higher priority. This fixes both of these issues to get WOL working. There's still a problem with receive after resume, but at least now we can wake-up. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-29net: calxedaxgmac: fix rx ring handling when OOMRob Herring
If skb allocation for the rx ring fails repeatedly, we can reach a point were the ring is empty. In this condition, the driver is out of sync with the h/w. While this has always been possible, the removal of the skb recycling seems to have made triggering this problem easier. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-29net: core: Remove redundant call to 'nf_reset' in 'dev_forward_skb'Shmulik Ladkani
'nf_reset' is called just prior calling 'netif_rx'. No need to call it twice. Reported-by: Igor Michailov <rgohita@gmail.com> Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-29smsc75xx: fix jumbo frame supportSteve Glendinning
This patch enables RX of jumbo frames for LAN7500. Previously the driver would transmit jumbo frames succesfully but would drop received jumbo frames (incrementing the interface errors count). With this patch applied the device can succesfully receive jumbo frames up to MTU 9000 (9014 bytes on the wire including ethernet header). Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-29net: fix the use of this_cpu_ptrLi RongQing
flush_tasklet is not percpu var, and percpu is percpu var, and this_cpu_ptr(&info->cache->percpu->flush_tasklet) is not equal to &this_cpu_ptr(info->cache->percpu)->flush_tasklet 1f743b076(use this_cpu_ptr per-cpu helper) introduced this bug. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-29tile: expect new initramfs name from hypervisor file systemChris Metcalf
The current Tilera boot infrastructure now provides the initramfs to Linux as a Tilera-hypervisor file named "initramfs", rather than "initramfs.cpio.gz", as before. (This makes it reasonable to use other compression techniques than gzip on the file without having to worry about the name causing confusion.) Adapt to use the new name, but also fall back to checking for the old name. Cc'ing to stable so that older kernels will remain compatible with newer Tilera boot infrastructure. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Cc: stable@vger.kernel.org
2013-03-29bonding: fix disabling of arp_interval and miimonnikolay@redhat.com
Currently if either arp_interval or miimon is disabled, they both get disabled, and upon disabling they get executed once more which is not the proper behaviour. Also when doing a no-op and disabling an already disabled one, the other again gets disabled. Also fix the error messages with the proper valid ranges, and a small typo fix in the up delay error message (outputting "down delay", instead of "up delay"). Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-29ipv6: don't accept node local multicast traffic from the wireHannes Frederic Sowa
Erik Hugne's errata proposal (Errata ID: 3480) to RFC4291 has been verified: http://www.rfc-editor.org/errata_search.php?eid=3480 We have to check for pkt_type and loopback flag because either the packets are allowed to travel over the loopback interface (in which case pkt_type is PACKET_HOST and IFF_LOOPBACK flag is set) or they travel over a non-loopback interface back to us (in which case PACKET_TYPE is PACKET_LOOPBACK and IFF_LOOPBACK flag is not set). Cc: Erik Hugne <erik.hugne@ericsson.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-29sky2: Threshold for Pause Packet is set wrongMirko Lindner
The sky2 driver sets the Rx Upper Threshold for Pause Packet generation to a wrong value which leads to only 2kB of RAM remaining space. This can lead to Rx overflow errors even with activated flow-control. Fix: We should increase the value to 8192/8 Signed-off-by: Mirko Lindner <mlindner@marvell.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-29sky2: Receive Overflows not countedMirko Lindner
The sky2 driver doesn't count the Receive Overflows because the MAC interrupt for this event is not set in the MAC's interrupt mask. The MAC's interrupt mask is set only for Transmit FIFO Underruns. Fix: The correct setting should be (GM_IS_TX_FF_UR | GM_IS_RX_FF_OR) Otherwise the Receive Overflow event will not generate any interrupt. The Receive Overflow interrupt is handled correctly Signed-off-by: Mirko Lindner <mlindner@marvell.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-29Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull ceph fix from Sage Weil: "This fixes a regression introduced during the last merge window when mapping non-existent images." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: rbd: don't zero-fill non-image object requests