summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-06-27Revert "sched: Introduce temporary FAIR_GROUP_SCHED dependency for ↵Alex Shi
load-tracking" Remove CONFIG_FAIR_GROUP_SCHED that covers the runnable info, then we can use runnable load variables. Also remove 2 CONFIG_FAIR_GROUP_SCHED setting which is not in reverted patch(introduced in 9ee474f), but also need to revert. Signed-off-by: Alex Shi <alex.shi@intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/51CA76A3.3050207@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26Merge tag 'fcoe1' into fixesJames Bottomley
This patch fixes a critical bug that was introduced in 3.9 related to VLAN tagging FCoE frames.
2013-06-26Merge tag 'fcoe' into fixesJames Bottomley
3.10 fixes
2013-06-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Found via trinity: If you connect up an ipv6 socket to an ipv4 mapped address then an ipv6 one, sendmsg() can croak because ip6_sk_dst_check() assumes the route cached in the socket is an ipv6 one. In this case there is an ipv4 route attached, so it gets stomped on. Reported by Dave Jones and Hannes Frederic Sowa, fixed by Eric Dumazet. 2) AF_KEY notifications leak some kernel memory to userspace, fix from Mathias Krause. 3) DLCI calls __dev_get_by_name() without proper locking, and dlci_del doesn't validate that the device being deleted is actually a DLCI one. Fixes from Li Zefan. 4) Length check on bluetooth l2cap information responses is wrong, each response type has a different lenth, so we should make sure it's in a given range rather than enforce one single valid length. From Jaganath Kanakkassery. 5) Receive FIFO overflow is really easy to trigger in stress scenerios in the sh_eth driver, but the event isn't being handled properly at all. Specifically, the mask of error interrupts doesn't include the event so we never clear it, resulting in the driver becomming wedged processing an interrupt that never gets cleared. Fix from Sergei Shtylyov. 6) qlcnic sleeps while holding a spinlock, use mdelay() instead of msleep(). From Shahed Shaikh. 7) Missing curly braces causes SIP netfilter NAT module to always drop packets. Fix from Balazs Peter Odor. 8) ipt_ULOG in netfilter passes the wrong value to timer setup, causing the timer to dereference crap when it fires. Fix from Gao Feng. 9) Missing RCU protection around txq->axq_acq traversal in ath_txq_schedule(). Fix from Felix Fietkau. 10) Idle state transition test in ath9k_htc_config() is reversed, fix from Sujith Manoharan. 11) IPV6 forwarding handles unicast Router Alert packets incorrectly. It tests the wrong option state. Previously opt->ra being non-zero indicated a router alert marking in the SKB, but now it's indicated by a bit in opt->flags. Fix from YOSHIFUJI Hideaki. 12) SKB leak in GRE tunnel GSO handling, from Eric Dumazet. 13) get_user_pages_fast() error handling in TUN and MACVTAP use the same local variable for the base index and the loop iterator for page traversal, oops! Fix from Michael S Tsirkin. 14) ipv6_get_lladdr() can fail, and we must therefore check it's return value in inet6_set_iftoken(). For from Hannes Frederic Sowa. 15) If you change an interface name and meanwhile can sneak in something that looks up the name (like SO_BINDTODEVICE or SIOCGIFNAME) we can deadlock with CONFIG_PREEMPT=n. Fix this by providing a helper function that properly uses raw_seqcount_begin(). From Nicolas Schichan. 16) Chain noise calibration test is inverted in iwlwifi, fix from Nikolay Martynov. 17) Properly set TX iwlwifi descriptor flags for back requests. Fix from Emmanuel Grumbach. 18) We can't assume skb_transport_header() is set in xt_TCPOPTSTRAP module, fix from Pablo Neira Ayuso. 19) Some crummy APs don't provide the proper High Throughput info in association response frames. Add a workaround by assume we'll use whatever is in the beacon/probe. Fix from Johannes Berg. 20) mac80211 call to rate_idx_match_mask() swaps two arguments (mask and channel width). Fix from Simon Wunderlich. 21) xt_TCPMSS (like xt_TCPOPTSTRAP) must not try to handle fragmented frames. Fix from Phil Oester. 22) Fix rate control regression causing iwlwifi/iwlegacy chips to use 1Mbit/s on pre-11n networks. From Moshe Benji and Stanslaw Gruszka. 23) Disable brcmsmac power-save functions, they cause regressions. From Arend van Spriel. 24) Enforce a sane minimum MTU in l2cap_build_cmd() otherwise we can easily crash. Fix from Anderson Lizardo. 25) If a learning packet arrives during vxlan_stop() we crash, easily fixed by checking netif_running(). From Stephen Hemminger. 26) Static vxlan FDB entries should not be migrated, also from Stephen. 27) skb_clone() failures not handled in vxlan_xmit(), oops. Also from Stephen. 28) Add minimal driver for AR816x/AR817x ethernet chips, from Johannes Berg. 29) Fix regression in userspace VLAN acceleration control, added by the 802.1ad support changes. Fix from Fernando Luis Vazquez Cao. 30) Interval selection for MLD queries in the bridging code was reversed. Fix from Linus Lüssing. 31) ipv6's ndisc_send_redirect() erroneously writes to the packet we received not the packet we are building to send out. Fix from Matthias Schiffer. 32) Don't free netdev before unregistering it, in usb_8dev can driver. From Marc Kleine-Budde. 33) Fix nl80211 attribute buffer races, from Johannes Berg. 34) Although netlink_diag.h is under uapi/ it isn't present in Kbuild. From Stephen Hemminger. 35) Wrong address and family passed to MD5 key lookups in TCP, from Aydin Arik. 36) phy_type attribute created by SFC driver should not be writable. From Ben Hutchings. 37) Receive/Transmit queue allocations in pxa168_eth and mv643xx_eth should use kzalloc(). Otherwise if setup fails half-way, we'll dereference garbage when trying to teardown the rings. From Lubomir Rintel. 38) Fix double-allocation of dst (resulting in unfreeable net device) in ipv6's init_loopback(). From Gao Feng. 39) Fix fragmentation handling SKB leak in netfilter conntrack, we were freeing the wrong skb pointer. From Phil Oester. 40) Don't report "-1" (SPEED_UNKNOWN) in bond_miimon_commit(), from Nikolay Aleksandrov. 41) davinci_cpdma doesn't check for DMA mapping errors, letting the device scribble to random addresses. From Sebastian Siewior. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (69 commits) dlci: validate the net device in dlci_del() dlci: acquire rtnl_lock before calling __dev_get_by_name() af_key: fix info leaks in notify messages ipv6: ip6_sk_dst_check() must not assume ipv6 dst net: fix kernel deadlock with interface rename and netdev name retrieval. net/tg3: Avoid delay during MMIO access ipv6: check return value of ipv6_get_lladdr macvtap: fix recovery from gup errors tun: fix recovery from gup errors gre: fix a possible skb leak ipv6: Process unicast packet with Router Alert by checking flag in skb. ath9k_htc: Handle IDLE state transition properly ath9k: fix an RCU issue in calling ieee80211_get_tx_rates netfilter: ipt_ULOG: fix incorrect setting of ulog timer netfilter: ctnetlink: send event when conntrack label was modified netfilter: nf_nat_sip: fix mangling qlcnic: Do not sleep while holding spinlock drivers: net: cpsw: fix compilation error with cpsw driver tcp: doc : fix the syncookies default value sh_eth: fix misreporting of transmit abort ...
2013-06-26Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull i915 drm fixes from Dave Airlie: "These should be the last two fixes for i915, one is for a fence leak killing X on some older GPUs, and one is a late regression partial revert for an swiotlb/xen/i915 interaction, Konrad has promised to figure out the proper answer, and this patch is the best thing to do at this stage to avoid regressing" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/i915: make compact dma scatter lists creation work with SWIOTLB backend. drm/i915: Restore fences after resume and GPU resets
2013-06-26dlci: validate the net device in dlci_del()Zefan Li
We triggered an oops while running trinity with 3.4 kernel: BUG: unable to handle kernel paging request at 0000000100000d07 IP: [<ffffffffa0109738>] dlci_ioctl+0xd8/0x2d4 [dlci] PGD 640c0d067 PUD 0 Oops: 0000 [#1] PREEMPT SMP CPU 3 ... Pid: 7302, comm: trinity-child3 Not tainted 3.4.24.09+ 40 Huawei Technologies Co., Ltd. Tecal RH2285 /BC11BTSA RIP: 0010:[<ffffffffa0109738>] [<ffffffffa0109738>] dlci_ioctl+0xd8/0x2d4 [dlci] ... Call Trace: [<ffffffff8137c5c3>] sock_ioctl+0x153/0x280 [<ffffffff81195494>] do_vfs_ioctl+0xa4/0x5e0 [<ffffffff8118354a>] ? fget_light+0x3ea/0x490 [<ffffffff81195a1f>] sys_ioctl+0x4f/0x80 [<ffffffff81478b69>] system_call_fastpath+0x16/0x1b ... It's because the net device is not a dlci device. Reported-by: Li Jinyue <lijinyue@huawei.com> Signed-off-by: Li Zefan <lizefan@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-26dlci: acquire rtnl_lock before calling __dev_get_by_name()Zefan Li
Otherwise the net device returned can be freed at anytime. Signed-off-by: Li Zefan <lizefan@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-26af_key: fix info leaks in notify messagesMathias Krause
key_notify_sa_flush() and key_notify_policy_flush() miss to initialize the sadb_msg_reserved member of the broadcasted message and thereby leak 2 bytes of heap memory to listeners. Fix that. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-26ipv6: ip6_sk_dst_check() must not assume ipv6 dstEric Dumazet
It's possible to use AF_INET6 sockets and to connect to an IPv4 destination. After this, socket dst cache is a pointer to a rtable, not rt6_info. ip6_sk_dst_check() should check the socket dst cache is IPv6, or else various corruptions/crashes can happen. Dave Jones can reproduce immediate crash with trinity -q -l off -n -c sendmsg -c connect With help from Hannes Frederic Sowa Reported-by: Dave Jones <davej@redhat.com> Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-26x86, microcode, amd: Another early loading fixupJacob Shin
commit cd1c32ca969ebfd65e61312c988223bb14f09c2e is an early premature rendition of the patch. Augment it with this delta patch to: * correctly mark offset and size of the matching bin file * use __pa instead of __pa_nodebug during AP load * check for !initrd_start before using it Signed-off-by: Jacob Shin <jacob.shin@amd.com> Link: http://lkml.kernel.org/r/20130620152414.GA6676@jshin-Toonie Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-06-26net: fix kernel deadlock with interface rename and netdev name retrieval.Nicolas Schichan
When the kernel (compiled with CONFIG_PREEMPT=n) is performing the rename of a network interface, it can end up waiting for a workqueue to complete. If userland is able to invoke a SIOCGIFNAME ioctl or a SO_BINDTODEVICE getsockopt in between, the kernel will deadlock due to the fact that read_secklock_begin() will spin forever waiting for the writer process (the one doing the interface rename) to update the devnet_rename_seq sequence. This patch fixes the problem by adding a helper (netdev_get_name()) and using it in the code handling the SIOCGIFNAME ioctl and SO_BINDTODEVICE setsockopt. The netdev_get_name() helper uses raw_seqcount_begin() to avoid spinning forever, waiting for devnet_rename_seq->sequence to become even. cond_resched() is used in the contended case, before retrying the access to give the writer process a chance to finish. The use of raw_seqcount_begin() will incur some unneeded work in the reader process in the contended case, but this is better than deadlocking the system. Signed-off-by: Nicolas Schichan <nschichan@freebox.fr> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-26perf/x86: Disable PEBS-LL in intel_pmu_pebs_disable()Stephane Eranian
Make sure intel_pmu_pebs_disable() and intel_pmu_pebs_enable() are symmetrical w.r.t. PEBS-LL and precise store. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1371824448-7306-2-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26perf/x86: Fix shared register mutual exclusion enforcementStephane Eranian
This patch fixes a problem with the shared registers mutual exclusion code and incremental event scheduling by the generic perf_event code. There was a bug whereby the mutual exclusion on the shared registers was not enforced because of incremental scheduling abort due to event constraints. As an example on Intel Nehalem, consider the following events: group1= L1D_CACHE_LD:E_STATE,OFFCORE_RESPONSE_0:PF_RFO,L1D_CACHE_LD:I_STATE group2= L1D_CACHE_LD:I_STATE The L1D_CACHE_LD event can only be measured by 2 counters. Yet, there are 3 instances here. The first group can be scheduled and is committed. Then, the generic code tries to schedule group2 and this fails (because there is no more counter to support the 3rd instance of L1D_CACHE_LD). But in x86_schedule_events() error path, put_event_contraints() is invoked on ALL the events and not just the ones that just failed. That causes the "lock" on the shared offcore_response MSR to be released. Yet the first group is actually scheduled and is exposed to reprogramming of that shared msr by the sibling HT thread. In other words, there is no guarantee on what is measured. This patch fixes the problem by tagging committed events with the PERF_X86_EVENT_COMMITTED tag. In the error path of x86_schedule_events(), only the events NOT tagged have their constraint released. The tag is eventually removed when the event in descheduled. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20130620164254.GA3556@quad Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26Merge tag 'regulator-v3.10-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fix from Mark Brown: "Fix module loading for tps6586x. A simple one liner fix to make module loading work for distros (product specific kernels tend to have things built in)" * tag 'regulator-v3.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: mfd: tps6586x: correct device name of the regulator cell
2013-06-26Merge tag 'gpio-for-linus' of git://git.secretlab.ca/git/linuxLinus Torvalds
Pull GPIO regression fix from Grant Likely: "It took a while to work out the correct solution to this regression. It is sorted now. This branch was constructed and tested by Tony. I've verified that it builds and signed the tag" * tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux: gpio/omap: don't use linear domain mapping for OMAP1
2013-06-26Merge tag 'pm+acpi-3.10-late' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull late power management and ACPI fixes from Rafael Wysocki: "Sorry about the timing of this, but ACPI-based docking stations with PCI devices on them and ATA bays would be hardly usable with 3.10 without it. We've been working on these fixes for the last couple of weeks and everyone involved appears to be reasonably comfortable with them now. The PM part is one fix for a cpufreq regression introduced recently - Fix for an ACPI dock regression introduced by the recent rework of the ACPI-based PCI hotplug code (acpiphp) that caused it to be initialized before the ACPI dock driver, which is incorrect (ACPI dock has to be initialized before acpiphp so that acpiphp can register PCI devices on docking stations with it for PCI hotplug on re-dock to work). From Jiang Liu. - Fix for PCI resources allocation in the ACPI-based PCI hotplug code (acpiphp) that makes it use the same PCI resources assignment rules during runtime hotplug that are used during boot (the BIOS' choices are now respected in both cases). This prevents PCI resource allocation failures during hotplug from happening in some cases. From Jiang Liu. - Fix for ordering and synchronization issues during hot-removal of PCI devices on docking stations. It makes the ACPI dock code carry out the PCI devices removal synchronously during undock instead of spawning a separate asynchronous work item to remove each of them without even bothering to wait for all those work items to complete. The hot-addition part is changed analogously. - Fix for a regression (introduced a few releases ago) that removed the code to register a hotplug notificaion handler for for ATA ports/devices inadvertently which prevented ATA bays hotplug from working. The missing code is added back with some improvements. From Aaron Lu. - Fix for a recent cpufreq regression causing a NULL pointer dereference to trigger in od_set_powersave_bias() in some situations from Jacob Shin" * tag 'pm+acpi-3.10-late' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: fix NULL pointer deference at od_set_powersave_bias() libata-acpi: add back ACPI based hotplug functionality ACPI / dock / PCI: Synchronous handling of dock events for PCI devices PCI / ACPI: Use boot-time resource allocation rules during hotplug ACPI / dock: Initialize ACPI dock subsystem upfront
2013-06-26Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Three small fixlets" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: hw_breakpoint: Use cpu_possible_mask in {reserve,release}_bp_slot() hw_breakpoint: Fix cpu check in task_bp_pinned(cpu) kprobes: Fix arch_prepare_kprobe to handle copy insn failures
2013-06-26Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "Another round of ARM fixes. Largest one is the second half of the PJ4B fix which was pushed in the previous -rc - this one was delayed because its original caused a build regression while trying to fix a regression! As ever, noMMU gets forgotten when fixing problems on MMU, so we have a noMMU fix for a previous fix included in this set. A couple of fixes from Lorenzo for problems with the ARM DT CPU code, and a one liner to remove the buggy 'wait for interrupt' with FA526 cores" * 'fixes' of git://git.linaro.org/people/rmk/linux-arm: ARM: 7773/1: PJ4B: Add support for errata 4742 ARM: 7772/1: Fix missing flush_kernel_dcache_page() for noMMU ARM: 7763/1: kernel: fix __cpu_logical_map default initialization ARM: 7762/1: kernel: fix arm_dt_init_cpu_maps() to skip non-cpu nodes ARM: 7760/1: cpu_fa526_do_idle: remove WFI
2013-06-26Merge tag 'critical_fix_for_3.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rwlove/fcoe Pull FCoE fix from Robert W Love: "This patch fixes a critical bug that was introduced in 3.9 related to VLAN tagging FCoE frames" * tag 'critical_fix_for_3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rwlove/fcoe: fcoe: Use correct API to set vlan tag for FCoE Ethertype skbs
2013-06-26Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fix from Sage Weil: "This fixes another problem with using v2 images on 3.10 due to the order in which fields are read from the image header. Hopefully this is the last one" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: rbd: fetch object order before using it
2013-06-26ARM: davinci: da850: adopt to pinctrl-single change for configuring multiple ↵Manjunathappa, Prakash
pins function-mask DT property is now a mask for a pin at each pin offset inside a given pincontrol register. Fix DA850 DT data to reflect this change. Signed-off-by: Manjunathappa, Prakash <prakash.pm@ti.com> [nsekhar@ti.com: reword commit message for clarity] Signed-off-by: Sekhar Nori <nsekhar@ti.com>
2013-06-26ARM: dts: Add pcie controller node for exynos5440-ssdk5440Jingoo Han
This patch adds pcie controller node for exynos5440-ssdk5440, and also adds a phandle for pin controller node. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Acked-by: Kukjin Kim <kgene.kim@samsung.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2013-06-26ARM: dts: Add pcie controller node for Samsung EXYNOS5440 SoCJingoo Han
Exynos5440 has two PCIe controllers which can be used as root complex for PCIe interface. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Acked-by: Kukjin Kim <kgene.kim@samsung.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2013-06-26ARM: EXYNOS: Enable PCIe support for Exynos5440Jingoo Han
Enable PCIe support for Exynos5440 which has two PCIe controllers. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Acked-by: Kukjin Kim <kgene.kim@samsung.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2013-06-26pci: Add PCIe driver for Samsung ExynosJingoo Han
Exynos5440 has a PCIe controller which can be used as Root Complex. This driver supports a PCIe controller as Root Complex mode. Signed-off-by: Surendranath Gurivireddy Balla <suren.reddy@samsung.com> Signed-off-by: Siva Reddy Kallam <siva.kallam@samsung.com> Signed-off-by: Jingoo Han <jg1.han@samsung.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Kukjin Kim <kgene.kim@samsung.com> Cc: Pratyush Anand <pratyush.anand@st.com> Cc: Mohit KUMAR <Mohit.KUMAR@st.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2013-06-26cgroup: always use RCU accessors for protected accessesTejun Heo
kernel/cgroup.c still has places where a RCU pointer is set and accessed directly without going through RCU_INIT_POINTER() or rcu_dereference_protected(). They're all properly protected accesses so nothing is broken but it leads to spurious sparse RCU address space warnings. Substitute direct accesses with RCU_INIT_POINTER() and rcu_dereference_protected(). Note that %true is specified as the extra condition for all derference updates. This isn't ideal as all it does is suppressing warning without actually policing synchronization rules; however, most are scheduled to be removed pretty soon along with css_id itself, so no reason to be more elaborate. Combined with the previous changes, this removes all RCU related sparse warnings from cgroup. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Acked-by; Li Zefan <lizefan@huawei.com>
2013-06-26cgroup: fix RCU accesses around task->cgroupsTejun Heo
There are several places in kernel/cgroup.c where task->cgroups is accessed and modified without going through proper RCU accessors. None is broken as they're all lock protected accesses; however, this still triggers sparse RCU address space warnings. * Consistently use task_css_set() for task->cgroups dereferencing. * Use RCU_INIT_POINTER() to clear task->cgroups to &init_css_set on exit. * Remove unnecessary rcu_dereference_raw() from cset->subsys[] dereference in cgroup_exit(). Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Acked-by: Li Zefan <lizefan@huawei.com>
2013-06-26cgroup: fix RCU accesses to task->cgroupsTejun Heo
task->cgroups is a RCU pointer pointing to struct css_set. A task switches to a different css_set on cgroup migration but a css_set doesn't change once created and its pointers to cgroup_subsys_states aren't RCU protected. task_subsys_state[_check]() is the macro to acquire css given a task and subsys_id pair. It RCU-dereferences task->cgroups->subsys[] not task->cgroups, so the RCU pointer task->cgroups ends up being dereferenced without read_barrier_depends() after it. It's broken. Fix it by introducing task_css_set[_check]() which does RCU-dereference on task->cgroups. task_subsys_state[_check]() is reimplemented to directly dereference ->subsys[] of the css_set returned from task_css_set[_check](). This removes some of sparse RCU warnings in cgroup. v2: Fixed unbalanced parenthsis and there's no need to use rcu_dereference_raw() when !CONFIG_PROVE_RCU. Both spotted by Li. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Acked-by: Li Zefan <lizefan@huawei.com> Cc: stable@vger.kernel.org
2013-06-26cgroup: grab cgroup_mutex in drop_parsed_module_refcounts()Tejun Heo
This isn't strictly necessary as all subsystems specified in @subsys_mask are guaranteed to be pinned; however, it does spuriously trigger lockdep warning. Let's grab cgroup_mutex around it. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com>
2013-06-26cgroup: fix cgroupfs_root early destruction pathTejun Heo
cgroupfs_root used to have ->actual_subsys_mask in addition to ->subsys_mask. a8a648c4ac ("cgroup: remove cgroup->actual_subsys_mask") removed it noting that the subsys_mask is essentially temporary and doesn't belong in cgroupfs_root; however, the patch made it impossible to tell whether a cgroupfs_root actually has the subsystems bound or just have the bits set leading to the following BUG when trying to mount with subsystems which are already mounted elsewhere. kernel BUG at kernel/cgroup.c:1038! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC ... CPU: 1 PID: 7973 Comm: mount Tainted: G W 3.10.0-rc7-next-20130625-sasha-00011-g1c1dc0e #1105 task: ffff880fc0ae8000 ti: ffff880fc0b9a000 task.ti: ffff880fc0b9a000 RIP: 0010:[<ffffffff81249b29>] [<ffffffff81249b29>] rebind_subsystems+0x409/0x5f0 ... Call Trace: [<ffffffff8124bd4f>] cgroup_kill_sb+0xff/0x210 [<ffffffff813d21af>] deactivate_locked_super+0x4f/0x90 [<ffffffff8124f3b3>] cgroup_mount+0x673/0x6e0 [<ffffffff81257169>] cpuset_mount+0xd9/0x110 [<ffffffff813d2580>] mount_fs+0xb0/0x2d0 [<ffffffff81404afd>] vfs_kern_mount+0xbd/0x180 [<ffffffff814070b5>] do_new_mount+0x145/0x2c0 [<ffffffff814085d6>] do_mount+0x356/0x3c0 [<ffffffff8140873d>] SyS_mount+0xfd/0x140 [<ffffffff854eb600>] tracesys+0xdd/0xe2 We still want rebind_subsystems() to take added/removed masks, so let's fix it by marking whether a cgroupfs_root has finished binding or not. Also, document what's going on around ->subsys_mask initialization so that similar mistakes aren't repeated. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Sasha Levin <sasha.levin@oracle.com> Acked-by: Li Zefan <lizefan@huawei.com>
2013-06-26Revert "char: misc: assign file->private_data in all cases"Greg Kroah-Hartman
This reverts commit 585d98e00ba7a5e2abe65f7a1eff631cb612289b, as it breaks the FUSE misc driver. Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-26dlm: Avoid LVB truncationBart Van Assche
For lockspaces with an LVB length above 64 bytes, avoid truncating the LVB while exchanging it with another node in the cluster. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: David Teigland <teigland@redhat.com>
2013-06-26Merge tag 'at91-dt' of git://github.com/at91linux/linux-at91 into next/dtArnd Bergmann
From Nicolas Ferre: - more SPI DT activation for rm9200 - SPI DMA for at91sam9n12/sama5d3 And one little fix for SPI compatibility string * tag 'at91-dt' of git://github.com/at91linux/linux-at91: ARM: at91: dt: rm9200ek: add spi support ARM: at91: dt: rm9200: add spi support ARM: at91/DT: at91sam9n12: add SPI DMA client infos ARM: at91/DT: sama5d3: add SPI DMA client infos ARM: at91/DT: fix SPI compatibility string Conflicts: arch/arm/boot/dts/sama5d3.dtsi Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2013-06-26Merge tag 'omap-pm-v3.11/fixes/omap5-voltdm' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-omap-pm into next/soc From Kevin Hilman: OMAP5: PM: fix boot by removing unneeded dummy voltage domain data * tag 'omap-pm-v3.11/fixes/omap5-voltdm' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-omap-pm: ARM: OMAP5: voltagedomain data: remove temporary OMAP4 voltage data Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2013-06-26ARM: at91/PMC: use at91_usb_rate() for UTMI PLLNicolas Ferre
We are using this function, now that we have introduced the support for UTMI clock for computing the USB host rate. Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Tested-by: Bo Shen <voice.shen@atmel.com>
2013-06-26ARM: at91/PMC: fix at91sam9n12 USB FS initNicolas Ferre
at91sam9n12 has Full-speed only USB. So we should add it to the list in at91_pllb_usbfs_clock_init() function. Moreover, at91sam9n12 has an unusual PMC in the sense that it has a PLLB but also has a USB clock register. Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Tested-by: Bo Shen <voice.shen@atmel.com>
2013-06-26ARM: at91/PMC: at91sam9n12 family has a PLLBNicolas Ferre
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Acked-by: Bo Shen <voice.shen@atmel.com>
2013-06-26ARM: at91/PMC: sama5d3 family doesn't have a PLLBNicolas Ferre
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Acked-by: Ludovic Desroches <ludovic.desroches@atmel.com>
2013-06-26x86/platform: Make X86_GOLDFISH depend on X86_EXTENDED_PLATFORMBen Hutchings
All non-PC platforms are supposed to be dependent on this option. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Jun Nakajima <jnakajim@gmail.com> Link: http://lkml.kernel.org/n/tip-Bcihhqhstm67fchjnkxoiJbu@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26locking-selftests: Handle unexpected failures more strictlyMaarten Lankhorst
When CONFIG_PROVE_LOCKING is not enabled, more tests are expected to pass unexpectedly, but there no tests that should start to fail that pass with CONFIG_PROVE_LOCKING enabled. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: rostedt@goodmis.org Cc: daniel@ffwll.ch Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20130620113151.4001.77963.stgit@patser Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26mutex: Add more w/w tests to test EDEADLK path handlingMaarten Lankhorst
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: rostedt@goodmis.org Cc: daniel@ffwll.ch Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20130620113141.4001.54331.stgit@patser Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26mutex: Add more tests to lib/locking-selftest.cMaarten Lankhorst
None of the ww_mutex codepaths should be taken in the 'normal' mutex calls. The easiest way to verify this is by using the normal mutex calls, and making sure o.ctx is unmodified. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: robclark@gmail.com Cc: rostedt@goodmis.org Cc: daniel@ffwll.ch Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20130620113130.4001.45423.stgit@patser Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26mutex: Add w/w tests to lib/locking-selftest.cMaarten Lankhorst
This stresses the lockdep code in some ways specifically useful to ww_mutexes. It adds checks for most of the common locking errors. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: robclark@gmail.com Cc: rostedt@goodmis.org Cc: daniel@ffwll.ch Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20130620113124.4001.23186.stgit@patser Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26mutex: Add w/w mutex slowpath debuggingDaniel Vetter
Injects EDEADLK conditions at pseudo-random interval, with exponential backoff up to UINT_MAX (to ensure that every lock operation still completes in a reasonable time). This way we can test the wound slowpath even for ww mutex users where contention is never expected, and the ww deadlock avoidance algorithm is only needed for correctness against malicious userspace. An example would be protecting kernel modesetting properties, which thanks to single-threaded X isn't really expected to contend, ever. I've looked into using the CONFIG_FAULT_INJECTION infrastructure, but decided against it for two reasons: - EDEADLK handling is mandatory for ww mutex users and should never affect the outcome of a syscall. This is in contrast to -ENOMEM injection. So fine configurability isn't required. - The fault injection framework only allows to set a simple probability for failure. Now the probability that a ww mutex acquire stage with N locks will never complete (due to too many injected EDEADLK backoffs) is zero. But the expected number of ww_mutex_lock operations for the completely uncontended case would be O(exp(N)). The per-acuiqire ctx exponential backoff solution choosen here only results in O(log N) overhead due to injection and so O(log N * N) lock operations. This way we can fail with high probability (and so have good test coverage even for fancy backoff and lock acquisition paths) without running into patalogical cases. Note that EDEADLK will only ever be injected when we managed to acquire the lock. This prevents any behaviour changes for users which rely on the EALREADY semantics. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: rostedt@goodmis.org Cc: daniel@ffwll.ch Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20130620113117.4001.21681.stgit@patser Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26mutex: Add support for wound/wait style locksMaarten Lankhorst
Wound/wait mutexes are used when other multiple lock acquisitions of a similar type can be done in an arbitrary order. The deadlock handling used here is called wait/wound in the RDBMS literature: The older tasks waits until it can acquire the contended lock. The younger tasks needs to back off and drop all the locks it is currently holding, i.e. the younger task is wounded. For full documentation please read Documentation/ww-mutex-design.txt. References: https://lwn.net/Articles/548909/ Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: rostedt@goodmis.org Cc: daniel@ffwll.ch Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/51C8038C.9000106@canonical.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26arch: Make __mutex_fastpath_lock_retval return whether fastpath succeeded or notMaarten Lankhorst
This will allow me to call functions that have multiple arguments if fastpath fails. This is required to support ticket mutexes, because they need to be able to pass an extra argument to the fail function. Originally I duplicated the functions, by adding __mutex_fastpath_lock_retval_arg. This ended up being just a duplication of the existing function, so a way to test if fastpath was called ended up being better. This also cleaned up the reservation mutex patch some by being able to call an atomic_set instead of atomic_xchg, and making it easier to detect if the wrong unlock function was previously used. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: robclark@gmail.com Cc: rostedt@goodmis.org Cc: daniel@ffwll.ch Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20130620113105.4001.83929.stgit@patser Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26perf/x86/intel: Support full width countingAndi Kleen
Recent Intel CPUs like Haswell and IvyBridge have a new alternative MSR range for perfctrs that allows writing the full counter width. Enable this range if the hardware reports it using a new capability bit. Currently the perf code queries CPUID to get the counter width, and sign extends the counter values as needed. The traditional PERFCTR MSRs always limit to 32bit, even though the counter internally is larger (usually 48 bits on recent CPUs) When the new capability is set use the alternative range which do not have these restrictions. This lowers the overhead of perf stat slightly because it has to do less interrupts to accumulate the counter value. On Haswell it also avoids some problems with TSX aborting when the end of the counter range is reached. ( See the patch "perf/x86/intel: Avoid checkpointed counters causing excessive TSX aborts" for more details. ) Signed-off-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Stephane Eranian <eranian@google.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1372173153-20215-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26perf: Disable monitoring on setuid processes for regular usersStephane Eranian
There was a a bug in setup_new_exec(), whereby the test to disabled perf monitoring was not correct because the new credentials for the process were not yet committed and therefore the get_dumpable() test was never firing. The patch fixes the problem by moving the perf_event test until after the credentials are committed. Signed-off-by: Stephane Eranian <eranian@google.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26irqchip: Add support for ARMv7-M NVICUwe Kleine-König
This interrupt controller is integrated in all Cortex-M3 and Cortex-M4 machines. Support for this controller appeared in Catalin's Cortex tree based on 2.6.33 but was nearly completely rewritten. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Grant Likely <grant.likely@secretlab.ca> Cc: linux-arm-kernel@lists.infradead.org Cc: Jonathan Austin <jonathan.austin@arm.com> Cc: kernel@pengutronix.de Link: http://lkml.kernel.org/r/1372231128-11802-1-git-send-email-u.kleine-koenig@pengutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-06-26Merge tag 'please-pull-mce-bitmap-comment' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/ras Pull MCE updates from Tony Luck: "Better comments so we understand our existing machine check bank bitmaps - prelude to adding another bitmap soon." Signed-off-by: Ingo Molnar <mingo@kernel.org>