summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2014-01-16Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller
Included change: - properly compute the batman-adv header overhead. Such result is later used to initialize the hard_header_len member of the soft-interface netdev object Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16team: block mtu change before it happens via NETDEV_PRECHANGEMTUVeaceslav Falico
Now it catches the NETDEV_CHANGEMTU notification, which is signaled after the actual change happened on the device, and returns NOTIFY_BAD, so that the change on the device is reverted. This might be quite costly and messy, so use the new NETDEV_PRECHANGEMTU to catch the MTU change before the actual change happens and signal that it's forbidden to do it. CC: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16net: add NETDEV_PRECHANGEMTU to notify before mtu change happensVeaceslav Falico
Currently, if a device changes its mtu, first the change happens (invloving all the side effects), and after that the NETDEV_CHANGEMTU is sent so that other devices can catch up with the new mtu. However, if they return NOTIFY_BAD, then the change is reverted and error returned. This is a really long and costy operation (sometimes). To fix this, add NETDEV_PRECHANGEMTU notification which is called prior to any change actually happening, and if any callee returns NOTIFY_BAD - the change is aborted. This way we're skipping all the playing with apply/revert the mtu. CC: "David S. Miller" <davem@davemloft.net> CC: Jiri Pirko <jiri@resnulli.us> CC: Eric Dumazet <edumazet@google.com> CC: Nicolas Dichtel <nicolas.dichtel@6wind.com> CC: Cong Wang <amwang@redhat.com> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-17Merge branch 'pm-cpufreq'Rafael J. Wysocki
* pm-cpufreq: (40 commits) thermal: exynos: boost: Automatic enable/disable of BOOST feature (at Exynos4412) cpufreq: exynos4x12: Change L0 driver data to CPUFREQ_BOOST_FREQ Documentation: cpufreq / boost: Update BOOST documentation cpufreq: exynos: Extend Exynos cpufreq driver to support boost cpufreq / boost: Kconfig: Support for software-managed BOOST acpi-cpufreq: Adjust the code to use the common boost attribute cpufreq: Add boost frequency support in core intel_pstate: Add trace point to report internal state. cpufreq: introduce cpufreq_generic_get() routine ARM: SA1100: Create dummy clk_get_rate() to avoid build failures cpufreq: stats: create sysfs entries when cpufreq_stats is a module cpufreq: stats: free table and remove sysfs entry in a single routine cpufreq: stats: remove hotplug notifiers cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properly cpufreq: speedstep: remove unused speedstep_get_state powernow-k6: reorder frequencies powernow-k6: correctly initialize default parameters powernow-k6: disable cache when changing frequency Documentation: add ABI entry for intel_pstate cpufreq: exynos: Convert exynos-cpufreq to platform driver ...
2014-01-17thermal: exynos: boost: Automatic enable/disable of BOOST feature (at ↵Lukasz Majewski
Exynos4412) This patch provides auto disable/enable operation for boost. It uses already present thermal infrastructure to provide BOOST hysteresis. The TMU data is modified to work properly with or without BOOST. Hence, the two first trip points with corresponding clip frequencies are adjusted. The first one is reduced from 85 to 70 degrees and the clip frequency is increased to 1.4 GHz from 800 MHz. This trip point is in fact responsible for providing BOOST hysteresis. When temperature exceeds 70 deg, the maximal non BOOST frequency for Exynos4412 is imposed. Since the first trigger level has been "stolen" for BOOST, the second one needs to be a compromise for the previously used two for non BOOST configuration. The 95 deg with modified clip freq (to 400 MHz) should provide a good balance between cooling down the overheated device and throughput on an acceptable level. Two last trigger levels are not modified since, they cause platform shutdown on emergency overheat to happen. The third trip point passage results in SW managed shut down of the system. If the last trip point is crossed, the PMU HW generates the power off signal. Signed-off-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com> Acked-by: Eduardo Valentin <eduardo.valentin@ti.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17cpufreq: exynos4x12: Change L0 driver data to CPUFREQ_BOOST_FREQLukasz Majewski
Add a special driver data flag (CPUFREQ_BOOST_FREQ) to indicate a frequency that can be only enabled for BOOST mode. This frequency will be used only for limited time, since running with it for too long may cause the target device to overheat. Signed-off-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17Documentation: cpufreq / boost: Update BOOST documentationLukasz Majewski
Since the support for software and hardware controlled boosting has been added, update the corresponding documentation. Signed-off-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17cpufreq: exynos: Extend Exynos cpufreq driver to support boostLukasz Majewski
The cpufreq_driver's boost_supported flag is true only when boost support is explicitly enabled. Boost related attributes are exported only under the same condition. Signed-off-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17cpufreq / boost: Kconfig: Support for software-managed BOOSTLukasz Majewski
Add CONFIG_CPU_FREQ_BOOST_SW Kconfig option such that software-managed boost is enabled only after selecting "EXYNOS Frequency Overclocking - Software". It also depends on the thermal subsystem to be compiled in, which is necessary for disabling boost and cooling down the device when overheating is detected. Software-managed boost _MUST_ _NOT_ be enabled without thermal subsystem with properly defined overheating temperature thresholds. This option doesn't affect the x86's hardware-driven boost support in the acpi-cpufreq driver. Signed-off-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Subject and changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17acpi-cpufreq: Adjust the code to use the common boost attributeLukasz Majewski
Modify acpi-cpufreq's hardware-based boost solution to work with the common cpufreq boost framework. Signed-off-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Subject and changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17cpufreq: Add boost frequency support in coreLukasz Majewski
This commit adds boost frequency support in cpufreq core (Hardware & Software). Some SoCs (like Exynos4 - e.g. 4x12) allow setting frequency above its normal operation limits. Such mode shall be only used for a short time. Overclocking (boost) support is essentially provided by platform dependent cpufreq driver. This commit unifies support for SW and HW (Intel) overclocking solutions in the core cpufreq driver. Previously the "boost" sysfs attribute was defined in the ACPI processor driver code. By default boost is disabled. One global attribute is available at: /sys/devices/system/cpu/cpufreq/boost. It only shows up when cpufreq driver supports overclocking. Under the hood frequencies dedicated for boosting are marked with a special flag (CPUFREQ_BOOST_FREQ) at driver's frequency table. It is the user's concern to enable/disable overclocking with a proper call to sysfs. The cpufreq_boost_trigger_state() function is defined non static on purpose. It is used later with thermal subsystem to provide automatic enable/disable of the BOOST feature. Signed-off-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17intel_pstate: Add trace point to report internal state.Dirk Brandewie
Add perf trace event "power:pstate_sample" to report driver state to aid in diagnosing issues reported against intel_pstate. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17cpufreq: introduce cpufreq_generic_get() routineViresh Kumar
CPUFreq drivers that use clock frameworks interface,i.e. clk_get_rate(), to get CPUs clk rate, have similar sort of code used in most of them. This patch adds a generic ->get() which will do the same thing for them. All those drivers are required to now is to set .get to cpufreq_generic_get() and set their clk pointer in policy->clk during ->init(). Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no> Acked-by: Shawn Guo <shawn.guo@linaro.org> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Shawn Guo <shawn.guo@linaro.org> Acked-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17ARM: SA1100: Create dummy clk_get_rate() to avoid build failuresViresh Kumar
There are some parts of common kernel which would be using routines like clk_get_rate() on some platforms. Currently, they wouldn't be called for SA1100 boards, but they are needed for successful kernel compilation. Create a dummy clk_get_rate() routine for SA1100 which can be called by the cpufreq core. More dummy routines might be added later if necessary. Cc: Russell King <linux@arm.linux.org.uk> Cc: Arnd Bergmann <arnd.bergmann@linaro.org> Reported-by: Olof Johansson <olof@lixom.net> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17cpufreq: stats: create sysfs entries when cpufreq_stats is a moduleViresh Kumar
When cpufreq_stats is compiled in as a module, cpufreq driver would have already been registered. And so the CPUFREQ_CREATE_POLICY notifiers wouldn't be called for it. Hence no sysfs entries for stats. :( This patch calls cpufreq_stats_create_table() for each online CPU from cpufreq_stats_init() and so if policy is already created for CPUx then we will register sysfs stats for it. When its not compiled as module, we will return early as policy wouldn't be found for any of the CPUs. Acked-by: Nicolas Pitre <nico@linaro.org> Tested-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17cpufreq: stats: free table and remove sysfs entry in a single routineViresh Kumar
We don't have code paths now where we need to do these two things separately, so it is better do them in a single routine. Just as they are allocated in a single routine. Acked-by: Nicolas Pitre <nico@linaro.org> Tested-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17cpufreq: stats: remove hotplug notifiersViresh Kumar
Either CPUs are hot-unplugged or suspend/resume occurs, cpufreq core will send notifications to cpufreq-stats and stats structure and sysfs entries would be correctly handled.. And so we don't actually need hotcpu notifiers in cpufreq-stats anymore. We were only handling cpu hot-unplug events here and that are already taken care of by POLICY notifiers. Acked-by: Nicolas Pitre <nico@linaro.org> Tested-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properlyViresh Kumar
There are several problems with cpufreq stats in the way it handles cpufreq_unregister_driver() and suspend/resume.. - We must not lose data collected so far when suspend/resume happens and so stats directories must not be removed/allocated during these operations, which is done currently. - cpufreq_stat has registered notifiers with both cpufreq and hotplug. It adds sysfs stats directory with a cpufreq notifier: CPUFREQ_NOTIFY and removes this directory with a notifier from hotplug core. In case cpufreq_unregister_driver() is called (on rmmod cpufreq driver), stats directories per cpu aren't removed as CPUs are still online. The only call cpufreq_stats gets is cpufreq_stats_update_policy_cpu() for all CPUs except the last of each policy. And pointer to stat information is stored in the entry for last CPU in the per-cpu cpufreq_stats_table. But policy structure would be freed inside cpufreq core and so that will result in memory leak inside cpufreq stats (as we are never freeing memory for stats). Now if we again insert the module cpufreq_register_driver() will be called and we will again allocate stats data and put it on for first CPU of every policy. In case we only have a single CPU per policy, we will return with a error from cpufreq_stats_create_table() due to this code: if (per_cpu(cpufreq_stats_table, cpu)) return -EBUSY; And so probably cpufreq stats directory would not show up anymore (as it was added inside last policies->kobj which doesn't exist anymore). I haven't tested it, though. Also the values in stats files wouldn't be refreshed as we are using the earlier stats structure. - CPUFREQ_NOTIFY is called from cpufreq_set_policy() which is called for scenarios where we don't really want cpufreq_stat_notifier_policy() to get called. For example whenever we are changing anything related to a policy: min/max/current freq, etc. cpufreq_set_policy() is called and so cpufreq stats is notified. Where we don't do any useful stuff other than simply returning with -EBUSY from cpufreq_stats_create_table(). And so this isn't the right notifier that cpufreq stats.. Due to all above reasons this patch does following changes: - Add new notifiers CPUFREQ_CREATE_POLICY and CPUFREQ_REMOVE_POLICY, which are only called when policy is created/destroyed. They aren't called for suspend/resume paths.. - Use these notifiers in cpufreq_stat_notifier_policy() to create/destory stats sysfs entries. And so cpufreq_unregister_driver() or suspend/resume shouldn't be a problem for cpufreq_stats. - Return early from cpufreq_stat_cpu_callback() for suspend/resume sequence, so that we don't free stats structure. Acked-by: Nicolas Pitre <nico@linaro.org> Tested-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17cpufreq: speedstep: remove unused speedstep_get_statePaul Bolle
The only caller of speedstep_get_state() was removed in commit d4019f0a92ab ("cpufreq: move freq change notifications to cpufreq core"). So building speedstep-smi.o now triggers a GCC warning: drivers/cpufreq/speedstep-smi.c:148:12: warning: 'speedstep_get_state' defined but not used [-Wunused-function] Remove this unused function. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17Merge branches 'acpi-tools' and 'pm-tools'Rafael J. Wysocki
* acpi-tools: ACPICA: acpidump: Update MAINTAINERS file to include tools folder for ACPI/ACPICA. ACPICA: acpidump: Enable tools Makefile to include acpi tools. ACPICA: acpidump: Cleanup tools/power/acpi makefiles. * pm-tools: PM / tools: new tool for suspend/resume performance optimization cpupower: Fix sscanf robustness in cpufreq-set
2014-01-17Merge branch 'acpi-modules'Rafael J. Wysocki
* acpi-modules: platform: introduce OF style 'modalias' support for platform bus ACPI: fix module autoloading for ACPI enumerated devices ACPI: add module autoloading support for ACPI enumerated devices ACPI: fix create_modalias() return value handling
2014-01-17platform: introduce OF style 'modalias' support for platform busZhang Rui
Fix a problem that, the platform bus supports the OF style modalias in .uevent() call, but not in its device 'modalias' sysfs attribute. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17Merge branches 'acpi-init' and 'acpi-hotplug'Rafael J. Wysocki
* acpi-init: ACPI / init: Run acpi_early_init() before timekeeping_init() * acpi-hotplug: ACPI / memhotplug: add parameter to disable memory hotplug
2014-01-17Merge branch 'pm-clk'Rafael J. Wysocki
* pm-clk: PM / clock_ops: report clock errors from clk_enable() PM / clock_ops: check return of clk_enable() in pm_clk_resume() PM / clock_ops: fix up clk prepare/unprepare count
2014-01-17Merge branch 'pm-cpuidle'Rafael J. Wysocki
* pm-cpuidle: intel_idle: remove superfluous dev->state_count initialization intel_idle: do C1E promotion disable quirk for hotplugged CPUs ACPI / cpuidle: remove dev->state_count setting ACPI / cpuidle: fix max idle state handling with hotplug CPU support POWERPC: pseries: cpuidle: use the common cpuidle_[un]register() routines POWERPC: pseries: cpuidle: remove superfluous dev->state_count initialization ARM: EXYNOS: cpuidle: fix AFTR mode check
2014-01-17PM / tools: new tool for suspend/resume performance optimizationTodd E Brandt
This tool is designed to assist kernel and OS developers in optimizing their linux stack's suspend/resume time. Using a kernel image built with a few extra options enabled, the tool will execute a suspend and will capture dmesg and ftrace data until resume is complete. This data is transformed into a device timeline and a callgraph to give a quick and detailed view of which devices and callbacks are taking the most time in suspend/resume. The output is a single html file which can be viewed in firefox or chrome. References: https://01.org/suspendresume Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fix from Catalin Marinas: "Revert "arm64: Fix memory shareability attribute for ioremap_wc/cache" We noticed that it breaks ioremap (and earlyprintk) with 64K page configuration" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: Revert "arm64: Fix memory shareability attribute for ioremap_wc/cache"
2014-01-17percpu_counter: unbreak __percpu_counter_add()Hugh Dickins
Commit 74e72f894d56 ("lib/percpu_counter.c: fix __percpu_counter_add()") looked very plausible, but its arithmetic was badly wrong: obvious once you see the fix, but maddening to get there from the weird tmpfs ENOSPCs Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Ming Lei <tom.leiming@gmail.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Shaohua Li <shli@fusionio.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Fan Du <fan.du@windriver.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-16r6040: use ETH_ZLEN instead of MISR for SKB length checkingFlorian Fainelli
Ever since this driver was merged the following code was included: if (skb->len < MISR) skb->len = MISR; MISR is defined to 0x3C which is also equivalent to ETH_ZLEN, but use ETH_ZLEN directly which is exactly what we want to be checking for. Reported-by: Marc Volovic <marcv@ezchip.com> Signed-off-by: Florian Fainelli <florian@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16r6040: add delays in MDIO read/write polling loopsFlorian Fainelli
On newer and faster machines (Vortex X86DX) using the r6040 driver, it was noticed that the driver was returning an error during probing traced down to being the MDIO bus probing and the inability to complete a MDIO read operation in time. It turns out that the MDIO operations on these faster machines usually complete after ~2140 iterations which is bigger than 2048 (MAC_DEF_TIMEOUT) and results in spurious timeouts depending on the system load. Update r6040_phy_read() and r6040_phy_write() to include a 1 micro second delay in each busy-looping iteration of the loop which is a much safer operation than incrementing MAC_DEF_TIMEOUT. Reported-by: Nils Koehler <nils.koehler@ibt-interfaces.de> Reported-by: Daniel Goertzen <daniel.goertzen@gmail.com> Signed-off-by: Florian Fainelli <florian@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16xen-netfront: add support for IPv6 offloadsPaul Durrant
This patch adds support for IPv6 checksum offload and GSO when those features are available in the backend. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16net: Check skb->rxhash in gro_receiveTom Herbert
When initializing a gro_list for a packet, first check the rxhash of the incoming skb against that of the skb's in the list. This should be a very strong inidicator of whether the flow is going to be matched, and potentially allows a lot of other checks to be short circuited. Use skb_hash_raw so that we don't force the hash to be calculated. Tested by running netperf 200 TCP_STREAMs between two machines with GRO, HW rxhash, and 1G. Saw no performance degration, slight reduction of time in dev_gro_receive. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16net: Add skb_get_hash_rawTom Herbert
Function to just return skb->rxhash without checking to see if it needs to be recomputed. Signed-off-by: Tom Herbert <therbert@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16vxge: make local functions staticstephen hemminger
Remove unused function vxge_hw_vpath_vid_get Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16bnad: code cleanupstephen hemminger
Use 'make namespacecheck' to code that could be declared static. After that remove code that is not being used. Compile tested only. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16dm9000: fix a lot of checkpatch issuesBarry Song
recently, dm9000 codes have many checkpatch errors and warnings: WARNING: please, no space before tabs 3: FILE: dm9000.c:3: + * ^ICopyright (C) 1997 Sten Wang$ WARNING: please, no space before tabs 5: FILE: dm9000.c:5: + * ^IThis program is free software; you can redistribute it and/or$ WARNING: please, no space before tabs 6: FILE: dm9000.c:6: + * ^Imodify it under the terms of the GNU General Public License$ WARNING: please, no space before tabs 7: FILE: dm9000.c:7: + * ^Ias published by the Free Software Foundation; either version 2$ WARNING: please, no space before tabs 8: FILE: dm9000.c:8: + * ^Iof the License, or (at your option) any later version.$ WARNING: please, no space before tabs 10: FILE: dm9000.c:10: + * ^IThis program is distributed in the hope that it will be useful,$ WARNING: please, no space before tabs 11: FILE: dm9000.c:11: + * ^Ibut WITHOUT ANY WARRANTY; without even the implied warranty of$ WARNING: please, no space before tabs 12: FILE: dm9000.c:12: + * ^IMERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the$ WARNING: please, no space before tabs 13: FILE: dm9000.c:13: + * ^IGNU General Public License for more details.$ WARNING: do not add new typedefs 97: FILE: dm9000.c:97: +typedef struct board_info { ERROR: spaces prohibited around that ':' (ctx:WxV) 113: FILE: dm9000.c:113: + unsigned int in_suspend :1; ^ ERROR: spaces prohibited around that ':' (ctx:WxV) 114: FILE: dm9000.c:114: + unsigned int wake_supported :1; ^ This patch fixes important errors in it. Signed-off-by: Barry Song <Baohua.Song@csr.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16packet: use percpu mmap tx frame pending refcountDaniel Borkmann
In PF_PACKET's packet mmap(), we can avoid using one atomic_inc() and one atomic_dec() call in skb destructor and use a percpu reference count instead in order to determine if packets are still pending to be sent out. Micro-benchmark with [1] that has been slightly modified (that is, protcol = 0 in socket(2) and bind(2)), example on a rather crappy testing machine; I expect it to scale and have even better results on bigger machines: ./packet_mm_tx -s7000 -m7200 -z700000 em1, avg over 2500 runs: With patch: 4,022,015 cyc Without patch: 4,812,994 cyc time ./packet_mm_tx -s64 -c10000000 em1 > /dev/null, stable: With patch: real 1m32.241s user 0m0.287s sys 1m29.316s Without patch: real 1m38.386s user 0m0.265s sys 1m35.572s In function tpacket_snd(), it is okay to use packet_read_pending() since in fast-path we short-circuit the condition already with ph != NULL, since we have next frames to process. In case we have MSG_DONTWAIT, we also do not execute this path as need_wait is false here anyway, and in case of _no_ MSG_DONTWAIT flag, it is okay to call a packet_read_pending(), because when we ever reach that path, we're done processing outgoing frames anyway and only look if there are skbs still outstanding to be orphaned. We can stay lockless in this percpu counter since it's acceptable when we reach this path for the sum to be imprecise first, but we'll level out at 0 after all pending frames have reached the skb destructor eventually through tx reclaim. When people pin a tx process to particular CPUs, we expect overflows to happen in the reference counter as on one CPU we expect heavy increase; and distributed through ksoftirqd on all CPUs a decrease, for example. As David Laight points out, since the C language doesn't define the result of signed int overflow (i.e. rather than wrap, it is allowed to saturate as a possible outcome), we have to use unsigned int as reference count. The sum over all CPUs when tx is complete will result in 0 again. The BUG_ON() in tpacket_destruct_skb() we can remove as well. It can _only_ be set from inside tpacket_snd() path and we made sure to increase tx_ring.pending in any case before we called po->xmit(skb). So testing for tx_ring.pending == 0 is not too useful. Instead, it would rather have been useful to test if lower layers didn't orphan the skb so that we're missing ring slots being put back to TP_STATUS_AVAILABLE. But such a bug will be caught in user space already as we end up realizing that we do not have any TP_STATUS_AVAILABLE slots left anymore. Therefore, we're all set. Btw, in case of RX_RING path, we do not make use of the pending member, therefore we also don't need to use up any percpu memory here. Also note that __alloc_percpu() already returns a zero-filled percpu area, so initialization is done already. [1] http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16packet: don't unconditionally schedule() in case of MSG_DONTWAITDaniel Borkmann
In tpacket_snd(), when we've discovered a first frame that is not in status TP_STATUS_SEND_REQUEST, and return a NULL buffer, we exit the send routine in case of MSG_DONTWAIT, since we've finished traversing the mmaped send ring buffer and don't care about pending frames. While doing so, we still unconditionally call an expensive schedule() in the packet_current_frame() "error" path, which is unnecessary in this case since it's enough to just quit the function. Also, in case MSG_DONTWAIT is not set, we should rather test for need_resched() first and do schedule() only if necessary since meanwhile pending frames could already have finished processing and called skb destructor. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16packet: improve socket create/bind latency in some casesDaniel Borkmann
Most people acquire PF_PACKET sockets with a protocol argument in the socket call, e.g. libpcap does so with htons(ETH_P_ALL) for all its sockets. Most likely, at some point in time a subsequent bind() call will follow, e.g. in libpcap with ... memset(&sll, 0, sizeof(sll)); sll.sll_family = AF_PACKET; sll.sll_ifindex = ifindex; sll.sll_protocol = htons(ETH_P_ALL); ... as arguments. What happens in the kernel is that already in socket() syscall, we install a proto hook via register_prot_hook() if our protocol argument is != 0. Yet, in bind() we're almost doing the same work by doing a unregister_prot_hook() with an expensive synchronize_net() call in case during socket() the proto was != 0, plus follow-up register_prot_hook() with a bound device to it this time, in order to limit traffic we get. In the case when the protocol and user supplied device index (== 0) does not change from socket() to bind(), we can spare us doing the same work twice. Similarly for re-binding to the same device and protocol. For these scenarios, we can decrease create/bind latency from ~7447us (sock-bind-2 case) to ~89us (sock-bind-1 case) with this patch. Alternatively, for the first case, if people care, they should simply create their sockets with proto == 0 argument and define the protocol during bind() as this saves a call to synchronize_net() as well (sock-bind-3 case). In all other cases, we're tied to user space behaviour we must not change, also since a bind() is not strictly required. Thus, we need the synchronize_net() to make sure no asynchronous packet processing paths still refer to the previous elements of po->prot_hook. In case of mmap()ed sockets, the workflow that includes bind() is socket() -> setsockopt(<ring>) -> bind(). In that case, a pair of {__unregister, register}_prot_hook is being called from setsockopt() in order to install the new protocol receive handler. Thus, when we call bind and can skip a re-hook, we have already previously installed the new handler. For fanout, this is handled different entirely, so we should be good. Timings on an i7-3520M machine: * sock-bind-1: 89 us * sock-bind-2: 7447 us * sock-bind-3: 75 us sock-bind-1: socket(PF_PACKET, SOCK_RAW, htons(ETH_P_IP)) = 3 bind(3, {sa_family=AF_PACKET, proto=htons(ETH_P_IP), if=all(0), pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0 sock-bind-2: socket(PF_PACKET, SOCK_RAW, htons(ETH_P_IP)) = 3 bind(3, {sa_family=AF_PACKET, proto=htons(ETH_P_IP), if=lo(1), pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0 sock-bind-3: socket(PF_PACKET, SOCK_RAW, 0) = 3 bind(3, {sa_family=AF_PACKET, proto=htons(ETH_P_IP), if=lo(1), pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0 Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16i40e: Remove autogenerated Module.symvers file.David S. Miller
Fixes: 9d8bf54 ("i40e: associate VMDq queue with VM type") Reported-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16x86, intel_mid: Replace memcpy with struct assignmentFengguang Wu
This is a cleanup proposed by coccinelle. It replaces memcpy with struct assignment on intel-mid's sfi layer. Generated by: coccinelle/misc/memcpy-assign.cocci Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Link: http://lkml.kernel.org/r/1389917588-9785-1-git-send-email-david.a.cohen@linux.intel.com Signed-off-by: David Cohen <david.a.cohen@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-01-16net/ipv4: don't use module_init in non-modular gre_offloadPaul Gortmaker
Recent commit 438e38fadca2f6e57eeecc08326c8a95758594d4 ("gre_offload: statically build GRE offloading support") added new module_init/module_exit calls to the gre_offload.c file. The file is obj-y and can't be anything other than built-in. Currently it can never be built modular, so using module_init as an alias for __initcall can be somewhat misleading. Fix this up now, so that we can relocate module_init from init.h into module.h in the future. If we don't do this, we'd have to add module.h to obviously non-modular code, and that would be a worse thing. We also make the inclusion explicit. Note that direct use of __initcall is discouraged, vs. one of the priority categorized subgroups. As __initcall gets mapped onto device_initcall, our use of device_initcall directly in this change means that the runtime impact is zero -- it will remain at level 6 in initcall ordering. As for the module_exit, rather than replace it with __exitcall, we simply remove it, since it appears only UML does anything with those, and even for UML, there is no relevant cleanup to be done here. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16net/mlx4_core: clean up srq_res_start_move_to()Paul Bolle
Building resource_tracker.o triggers a GCC warning: drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'mlx4_HW2SW_SRQ_wrapper': drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:3202:17: warning: 'srq' may be used uninitialized in this function [-Wmaybe-uninitialized] atomic_dec(&srq->mtt->ref_count); ^ This is a false positive. But a cleanup of srq_res_start_move_to() can help GCC here. The code currently uses a switch statement where a plain if/else would do, since only two of the switch's four cases can ever occur. Dropping that switch makes the warning go away. While we're at it, add some missing braces, and convert state to the correct type. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16net/mlx4_core: clean up cq_res_start_move_to()Paul Bolle
Building resource_tracker.o triggers a GCC warning: drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'mlx4_HW2SW_CQ_wrapper': drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:3019:16: warning: 'cq' may be used uninitialized in this function [-Wmaybe-uninitialized] atomic_dec(&cq->mtt->ref_count); ^ This is a false positive. But a cleanup of cq_res_start_move_to() can help GCC here. The code currently uses a switch statement where an if/else construct would do too, since only two of the switch's four cases can ever occur. Dropping that switch makes the warning go away. While we're at it, add some missing braces. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16ceph: trivial comment fixJ. Bruce Fields
"disconnected" is too easily confused with "DCACHE_DISCONNECTED". I think "unhashed" is the more precise term here. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-01-16e1000e: Fix compilation warning when !CONFIG_PM_SLEEPMika Westerberg
Commit 7509963c703b (e1000e: Fix a compile flag mis-match for suspend/resume) moved suspend and resume hooks to be available when CONFIG_PM is set. However, it can be set even if CONFIG_PM_SLEEP is not set causing following warnings to be emitted: drivers/net/ethernet/intel/e1000e/netdev.c:6178:12: warning: ‘e1000_suspend’ defined but not used [-Wunused-function] drivers/net/ethernet/intel/e1000e/netdev.c:6185:12: warning: ‘e1000_resume’ defined but not used [-Wunused-function] To fix this make the hooks to be available only when CONFIG_PM_SLEEP is set and remove CONFIG_PM wrapping from driver ops because this is already handled by SET_SYSTEM_SLEEP_PM_OPS() and SET_RUNTIME_PM_OPS(). Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Dave Ertman <davidx.m.ertman@intel.com> Cc: Aaron Brown <aaron.f.brown@intel.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16Merge branch 'ixgbe-next'David S. Miller
Aaron Brown says: ==================== Intel Wired LAN Driver Updates This series contains updates to ixgbe and ixgbevf. John adds rtnl lock / unlock semantics for ixgbe_reinit_locked() which was being called without the rtnl lock being held. Jacob corrects an issue where ixgbevf_qv_disable function does not set the disabled bit correctly. From the community, Wei uses a type of struct for pci driver-specific data in ixgbevf_suspend() Don changes the way we store ring arrays in a manner that allows support of multiple queues on multiple nodes and creates new ring initialization functions for work previously done across multiple functions - making the code closer to ixgbe and hopefully more readable. He also fixes incorrect fiber eeprom write logic. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16ixgbe: Fix incorrect logic for fixed fiber eeprom writeDon Skidmore
In this code we wanted to set the bit in IXGBE_SFF_SOFT_RS_SELECT_MASK to the value in rs. So we really needed a logical or rather than an and, this patch makes that change. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16ixgbevf: create function for all of ring initDon Skidmore
This patch creates new functions for ring initialization, ixgbevf_configure_tx_ring() and ixgbevf_configure_rx_ring(). The work done in these function previously was spread between several other functions and this change should hopefully lead to greater readability and make the code more like ixgbe. This patch also moves the placement of some older functions to avoid having to write prototypes. It also promotes a couple of debug messages to errors. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16ixgbevf: Convert ring storage form pointer to an array to array of pointersDon Skidmore
This will change how we store rings arrays in the adapter sturct. We use to have a pointer to an array now we will be using an array of pointers. This will allow us to support multiple queues on muliple nodes at some point we would be able to reallocate the rings so that each is on a local node if needed. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>