summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-04-28thp: keep huge zero page pinned until tlb flushKirill A. Shutemov
Andrea has found[1] a race condition on MMU-gather based TLB flush vs split_huge_page() or shrinker which frees huge zero under us (patch 1/2 and 2/2 respectively). With new THP refcounting, we don't need patch 1/2: mmu_gather keeps the page pinned until flush is complete and the pin prevents the page from being split under us. We still need patch 2/2. This is simplified version of Andrea's patch. We don't need fancy encoding. [1] http://lkml.kernel.org/r/1447938052-22165-1-git-send-email-aarcange@redhat.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Andrea Arcangeli <aarcange@redhat.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28mm: exclude HugeTLB pages from THP page_mapped() logicSteve Capper
HugeTLB pages cannot be split, so we use the compound_mapcount to track rmaps. Currently page_mapped() will check the compound_mapcount, but will also go through the constituent pages of a THP compound page and query the individual _mapcount's too. Unfortunately, page_mapped() does not distinguish between HugeTLB and THP compound pages and assumes that a compound page always needs to have HPAGE_PMD_NR pages querying. For most cases when dealing with HugeTLB this is just inefficient, but for scenarios where the HugeTLB page size is less than the pmd block size (e.g. when using contiguous bit on ARM) this can lead to crashes. This patch adjusts the page_mapped function such that we skip the unnecessary THP reference checks for HugeTLB pages. Fixes: e1534ae95004 ("mm: differentiate page_mapped() from page_mapcount() for compound pages") Signed-off-by: Steve Capper <steve.capper@arm.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28kexec: export OFFSET(page.compound_head) to find out compound tail pageAtsushi Kumagai
PageAnon() always look at head page to check PAGE_MAPPING_ANON and tail page's page->mapping has just a poisoned data since commit 1c290f642101 ("mm: sanitize page->mapping for tail pages"). If makedumpfile checks page->mapping of a compound tail page to distinguish anonymous page as usual, it must fail in newer kernel. So it's necessary to export OFFSET(page.compound_head) to avoid checking compound tail pages. The problem is that unnecessary hugepages won't be removed from a dump file in kernels 4.5.x and later. This means that extra disk space would be consumed. It's a problem, but not critical. Signed-off-by: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28kexec: update VMCOREINFO for compound_order/dtorAtsushi Kumagai
makedumpfile refers page.lru.next to get the order of compound pages for page filtering. However, now the order is stored in page.compound_order, hence VMCOREINFO should be updated to export the offset of page.compound_order. The fact is, page.compound_order was introduced already in kernel 4.0, but the offset of it was the same as page.lru.next until kernel 4.3, so this was not actual problem. The above can be said also for page.lru.prev and page.compound_dtor, it's necessary to detect hugetlbfs pages. Further, the content was changed from direct address to the ID which means dtor. The problem is that unnecessary hugepages won't be removed from a dump file in kernels 4.4.x and later. This means that extra disk space would be consumed. It's a problem, but not critical. Signed-off-by: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fixes from Sage Weil: "There is a lifecycle fix in the auth code, a fix for a narrow race condition on map, and a helpful message in the log when there is a feature mismatch (which happens frequently now that the default server-side options have changed)" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: rbd: report unsupported features to syslog rbd: fix rbd map vs notify races libceph: make authorizer destruction independent of ceph_auth_client
2016-04-28Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "Three more bug fixes for 4.6 - Due to a race in the dynamic page table code a multi-threaded program can cause a translation specification exception. With panic_on_oops a user space program can crash the system. - An information leak with the /dev/sclp device. - A use after free in the s390 PCI code" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/sclp_ctl: fix potential information leak with /dev/sclp s390/mm: fix asce_bits handling with dynamic pagetable levels s390/pci: fix use after free in dma_init
2016-04-28RDMA/nes: don't leak skb if carrier downFlorian Westphal
Alternatively one could free the skb, OTOH I don't think this test is useful so just remove it. Cc: <linux-rdma@vger.kernel.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-28Merge branch 'bpf-fixes'David S. Miller
Alexei Starovoitov says: ==================== bpf: fix several bugs First two patches address bugs found by Jann Horn. Last patch is a minor samples fix spotted during the testing. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28samples/bpf: fix trace_output exampleAlexei Starovoitov
llvm cannot always recognize memset as builtin function and optimize it away, so just delete it. It was a leftover from testing of bpf_perf_event_output() with large data structures. Fixes: 39111695b1b8 ("samples: bpf: add bpf_perf_event_output example") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28bpf: fix check_map_func_compatibility logicAlexei Starovoitov
The commit 35578d798400 ("bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU conuter") introduced clever way to check bpf_helper<->map_type compatibility. Later on commit a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") adjusted the logic and inadvertently broke it. Get rid of the clever bool compare and go back to two-way check from map and from helper perspective. Fixes: a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28bpf: fix refcnt overflowAlexei Starovoitov
On a system with >32Gbyte of phyiscal memory and infinite RLIMIT_MEMLOCK, the malicious application may overflow 32-bit bpf program refcnt. It's also possible to overflow map refcnt on 1Tb system. Impose 32k hard limit which means that the same bpf program or map cannot be shared by more than 32k processes. Fixes: 1be7f75d1668 ("bpf: enable non-root eBPF programs") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28Merge branch 'cpsw-phy-handle-fixes'David S. Miller
David Rivshin says: ==================== drivers: net: cpsw: phy-handle fixes This series fixes a number of related issues around using phy-handle properties in cpsw emac nodes. Patch 1 fixes a bug if more than one slave is used, and either slave uses the phy-handle property in the devicetree. Patch 2 fixes a NULL pointer dereference which can occur if a phy-handle property is used and of_phy_connect() return NULL, such as with a bad devicetree. Patch 3 fixes an issue where the phy-mode property would be ignored if a phy-handle property was used. This also fixes a bogus error message that would be emitted. Patch 4 fixes makes the binding documentation more explicit that exactly one PHY property should be used, and also marks phy_id as deprecated. Patch 5 cleans up the fixed-link case to work like the now-fixed phy-handle case. I have tested on the following hardware configurations: - (EVMSK) dual emac, phy_id property in both slaves - (EVMSK) dual emac, phy-handle property in both slaves - (EVMSK) a bad phy-handle property pointing to &mmc1 - (EVMSK) phy_id property with incorrect PHY address - (BeagleBoneBlack) single emac, phy_id property - (custom) single emac, fixed-link subnode Andrew Goodbody reported testing v2 on a board that doesn't use dual_emac mode, but with 2 PHYs using phy-handle properties [1]. Nicolas Chauvet reported testing v2 on an HP t410 (dm8148). Markus Brunner reported testing v1 on the following [2]: - emac0 with phy_id and emac1 with fixed phy - emac0 with phy-handle and emac1 with fixed phy - emac0 with fixed phy and emac1 with fixed phy [1] https://lkml.org/lkml/2016/4/22/537 [2] http://www.spinics.net/lists/netdev/msg357890.html ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28drivers: net: cpsw: use of_phy_connect() in fixed-link caseDavid Rivshin
If a fixed-link DT subnode is used, the phy_device was looked up so that a PHY ID string could be constructed and passed to phy_connect(). This is not necessary, as the device_node can be passed directly to of_phy_connect() instead. This reuses the same codepath as if the phy-handle DT property was used. Signed-off-by: David Rivshin <drivshin@allworx.com> Tested-by: Nicolas Chauvet <kwizart@gmail.com> Tested-by: Andrew Goodbody <andrew.goodbody@cambrionix.com> Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28dt: cpsw: phy-handle, phy_id, and fixed-link are mutually exclusiveDavid Rivshin
The phy-handle, phy_id, and fixed-link properties are mutually exclusive, and only one need be specified. Make this clear in the binding doc. Also mark the phy_id property as deprecated, as phy-handle should be used instead. Signed-off-by: David Rivshin <drivshin@allworx.com> Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28drivers: net: cpsw: don't ignore phy-mode if phy-handle is usedDavid Rivshin
The phy-mode emac property was only being processed in the phy_id or fixed-link cases. However if phy-handle was specified instead, an error message would complain about the lack of phy_id or fixed-link, and then jump past the of_get_phy_mode(). This would result in the PHY mode defaulting to MII, regardless of what the devicetree specified. Fixes: 9e42f715264f ("drivers: net: cpsw: add phy-handle parsing") Signed-off-by: David Rivshin <drivshin@allworx.com> Tested-by: Nicolas Chauvet <kwizart@gmail.com> Tested-by: Andrew Goodbody <andrew.goodbody@cambrionix.com> Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28drivers: net: cpsw: fix segfault in case of bad phy-handleDavid Rivshin
If an emac node has a phy-handle property that points to something which is not a phy, then a segmentation fault will occur when the interface is brought up. This is because while phy_connect() will return ERR_PTR() on failure, of_phy_connect() will return NULL. The common error check uses IS_ERR(), and so missed when of_phy_connect() fails. The NULL pointer is then dereferenced. Also, the common error message referenced slave->data->phy_id, which would be empty in the case of phy-handle. Instead, use the name of the device_node as a useful identifier. And in the phy_id case add the error code for completeness. Fixes: 9e42f715264f ("drivers: net: cpsw: add phy-handle parsing") Signed-off-by: David Rivshin <drivshin@allworx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28drivers: net: cpsw: fix parsing of phy-handle DT property in dual_emac configDavid Rivshin
Commit 9e42f715264ff158478fa30eaed847f6e131366b ("drivers: net: cpsw: add phy-handle parsing") saved the "phy-handle" phandle into a new cpsw_priv field. However, phy connections are per-slave, so the phy_node field should be in cpsw_slave_data rather than cpsw_priv. This would go unnoticed in a single emac configuration. But in dual_emac mode, the last "phy-handle" property parsed for either slave would be used by both of them, causing them both to refer to the same phy_device. Fixes: 9e42f715264f ("drivers: net: cpsw: add phy-handle parsing") Signed-off-by: David Rivshin <drivshin@allworx.com> Tested-by: Nicolas Chauvet <kwizart@gmail.com> Tested-by: Andrew Goodbody <andrew.goodbody@cambrionix.com> Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28MAINTAINERS: net: Change maintainer for GRETH 10/100/1G Ethernet MAC device ↵Andreas Larsson
driver Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28gre: reject GUE and FOU in collect metadata modeJiri Benc
The collect metadata mode does not support GUE nor FOU. This might be implemented later; until then, we should reject such config. I think this is okay to be changed. It's unlikely anyone has such configuration (as it doesn't work anyway) and we may need a way to distinguish whether it's supported or not by the kernel later. For backwards compatibility with iproute2, it's not possible to just check the attribute presence (iproute2 always includes the attribute), the actual value has to be checked, too. Fixes: 2e15ea390e6f4 ("ip_gre: Add support to collect tunnel metadata.") Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28Merge branch 'pegasus-sizes'David S. Miller
Petko Manolov says: ==================== pegasus: correct buffer & packet sizes As noticed by Lincoln Ramsay <a1291762@gmail.com> some old (usb 1.1) Pegasus based devices may actually return more bytes than the specified in the datasheet amount. That would not be a problem if the allocated space for the SKB was equal to the parameter passed to usb_fill_bulk_urb(). Some poor bugger (i really hope it was not me, but 'git blame' is useless in this case, so anyway) decided to add '+ 8' to the buffer length parameter. Sometimes the usb transfer overflows and corrupts the socket structure, leading to kernel panic. The above doesn't seem to happen for newer (Pegasus2 based) devices which did help this bug to hide for so long. The new default is to not include the CRC at the end of each received package. So far CRC has been ignored which makes no sense to do it in a first place. The patch is against v4.6-rc5 and was tested on ADM8515 device by transferring multiple gigabytes of data over a couple of days without any complaints from the kernel. Please apply it to whatever net tree you deem fit. Changes since v1: - split the patch in two parts; - corrected the subject lines; Changes since v2: - do not append CRC by default (based on a discussion with Johannes Berg); ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28pegasus: fixes reported packet lengthPetko Manolov
The default Pegasus setup was to append the status and CRC at the end of each received packet. The status bits are used to update various stats, but CRC has been ignored. The new default is to not append CRC at the end of RX packets. Signed-off-by: Petko Manolov <petkan@mip-labs.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28pegasus: fixes URB buffer allocation size;Petko Manolov
usb_fill_bulk_urb() receives buffer length parameter 8 bytes larger than what's allocated by alloc_skb(); This seems to be a problem with older (pegasus usb-1.1) devices, which may silently return more data than the maximal packet length. Reported-by: Lincoln Ramsay <a1291762@gmail.com> Signed-off-by: Petko Manolov <petkan@mip-labs.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28Merge branch 'gre-lwt-fixes'David S. Miller
Jiri Benc says: ==================== gre: fix lwtunnel support This patchset fixes a few bugs in ipgre metadata mode implementation. As an example, in this setup: ip a a 192.168.1.1/24 dev eth0 ip l a gre1 type gre external ip l s gre1 up ip a a 192.168.99.1/24 dev gre1 ip r a 192.168.99.2/32 encap ip dst 192.168.1.2 ttl 10 dev gre1 ping 192.168.99.2 the traffic does not go through before this patchset and does as expected with it applied. v3: Back to v1 in order not to break existing users. Dropped patch 3, will be fixed in iproute2 instead. v2: Rejecting invalid configuration, added patch 3, dropped patch for ETH_P_TEB (will target net-next). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28gre: build header correctly for collect metadata tunnelsJiri Benc
In ipgre (i.e. not gretap) + collect metadata mode, the skb was assumed to contain Ethernet header and was encapsulated as ETH_P_TEB. This is not the case, the interface is ARPHRD_IPGRE and the protocol to be used for encapsulation is skb->protocol. Fixes: 2e15ea390e6f4 ("ip_gre: Add support to collect tunnel metadata.") Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28gre: do not assign header_ops in collect metadata modeJiri Benc
In ipgre mode (i.e. not gretap) with collect metadata flag set, the tunnel is incorrectly assumed to be mGRE in NBMA mode (see commit 6a5f44d7a048c). This is not the case, we're controlling the encapsulation addresses by lwtunnel metadata. And anyway, assigning dev->header_ops in collect metadata mode does not make sense. Although it would be more user firendly to reject requests that specify both the collect metadata flag and a remote/local IP address, this would break current users of gretap or introduce ugly code and differences in handling ipgre and gretap configuration. Keep the current behavior of remote/local IP address being ignored in such case. v3: Back to v1, added explanation paragraph. v2: Reject configuration specifying both remote/local address and collect metadata flag. Fixes: 2e15ea390e6f4 ("ip_gre: Add support to collect tunnel metadata.") Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28Merge tag 'mac80211-for-davem-2016-04-27' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Just a single fix, for a per-CPU memory leak in a (root user triggerable) error case. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28net: phy: at803x: only the AT8030 needs a hardware reset on link changeTimur Tabi
Commit 13a56b44 ("at803x: Add support for hardware reset") added a work-around for a hardware bug on the AT8030. However, the work-around was being called for all 803x PHYs, even those that don't need it. Function at803x_link_change_notify() checks to make sure that it only resets the PHY on the 8030, but it makes more sense to not call that function at all if it isn't needed. Signed-off-by: Timur Tabi <timur@codeaurora.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller
Antonio Quartulli says: ==================== In this patchset you can find the following fixes: 1) check skb size to avoid reading beyond its border when delivering payloads, by Sven Eckelmann 2) initialize last_seen time in neigh_node object to prevent cleanup routine from accidentally purge it, by Marek Lindner 3) release "recently added" slave interfaces upon virtual/batman interface shutdown, by Sven Eckelmann 4) properly decrease router object reference counter upon routing table update, by Sven Eckelmann 5) release queue slots when purging OGM packets of deactivating slave interface, by Linus Lüssing Patch 2 and 3 have no "Fixes:" tag because the offending commits date back to when batman-adv was not yet officially in the net tree. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28ps3_gelic: fix memcpy parameterChristophe Jaillet
The size allocated for target->hwinfo and the number of bytes copied in it should be consistent. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28lan78xx: workaround of forced 100 Full/Half duplex mode errorWoojung Huh
At forced 100 Full & Half duplex mode, chip may fail to set mode correctly when cable is switched between long(~50+m) and short one. As workaround, set to 10 before setting to 100 at forced 100 F/H mode. Signed-off-by: Woojung Huh <woojung.huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28lan78xx: fix statistics counter errorWoojung Huh
Fix rx_bytes, tx_bytes and tx_frames error in netdev.stats. - rx_bytes counted bytes excluding size of struct ethhdr. - tx_packets didn't count multiple packets in a single urb - tx_bytes included 8 bytes of extra commands. Signed-off-by: Woojung Huh <woojung.huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28net: dsa: mv88e6xxx: fix uninitialized error returnColin Ian King
The error return err is not initialized and there is a possibility that err is not assigned causing mv88e6xxx_port_bridge_join to return a garbage error return status. Fix this by initializing err to 0. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28net: fix net_gso_ok for new GSO types.Marcelo Ricardo Leitner
Fix casting in net_gso_ok. Otherwise the shift on gso_type << NETIF_F_GSO_SHIFT may hit the 32th bit and make it look like a INT_MIN, which is then promoted from signed to uint64 which is 0xffffffff80000000, resulting in wrong behavior when it is and'ed with the feature itself, as in: This test app: #include <stdio.h> #include <stdint.h> int main(int argc, char **argv) { uint64_t feature1; uint64_t feature2; int gso_type = 1 << 15; feature1 = gso_type << 16; feature2 = (uint64_t)gso_type << 16; printf("%lx %lx\n", feature1, feature2); return 0; } Gives: ffffffff80000000 80000000 So that this: return (features & feature) == feature; Actually works on more bits than expected and invalid ones. Fix is to promote it earlier. Issue noted while rebasing SCTP GSO patch but posting separetely as someone else may experience this meanwhile. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28net: ethernet: davinci_emac: Fix devioctl while in fixed linkNeil Armstrong
When configured in fixed link, the DaVinci emac driver sets the priv->phydev to NULL and further ioctl calls to the phy_mii_ioctl() causes the kernel to crash. Cc: Brian Hutchinson <b.hutchman@gmail.com> Fixes: 1bb6aa56bb38 ("net: davinci_emac: Add support for fixed-link PHY") Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28Merge tag 'wireless-drivers-for-davem-2016-04-25' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for 4.6 ath9k * fix a couple release old throughput regression on ar9281 iwlwifi * add new device IDs for 8265 * fix a NULL pointer dereference when paging firmware asserts * remove a WARNING on gscan capabilities * fix MODULE_FIRMWARE for 8260 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28MAINTAINERS: net: update sfc maintainersBert Kenward
Add myself and Edward Cree as maintainers. Remove Shradha Shah, who is on extended leave. Cc: David S. Miller <davem@davemloft.net> Cc: Edward Cree <ecree@solarflare.com> Cc: Shradha Shah <sshah@solarflare.com> Signed-off-by: Bert Kenward <bkenward@solarflare.com> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28sfc: disable RSS when unsupportedJon Cooper
When certain firmware variants are selected (via the sfboot utility) the SFC7000 and SFC8000 series NICs don't support RSS. The driver still tries (and fails) to insert filters with the RSS flag, and the NIC fails to pass traffic. When the firmware reports RSS_LIMITED suppress allocating a default RSS context. The absence of an RSS context is picked up in filter insertion and RSS flags are discarded. Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28myri10ge: fix sleeping with bh disabledStanislaw Gruszka
napi_disable() can not be called with bh disabled, move locking just around myri10ge_ss_lock_napi() . Patches fixes following bug: [ 114.278378] BUG: sleeping function called from invalid context at net/core/dev.c:4383 <snip> [ 114.313712] Call Trace: [ 114.314943] [<ffffffff817010ce>] dump_stack+0x19/0x1b [ 114.317673] [<ffffffff810ce7f3>] __might_sleep+0x173/0x230 [ 114.320566] [<ffffffff815b3117>] napi_disable+0x27/0x90 [ 114.323254] [<ffffffffa01e437f>] myri10ge_close+0xbf/0x3f0 [myri10ge] Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Hyong-Youb Kim <hykim@myri.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28Documentation: networking: fix spelling mistakesEric Engestrom
Signed-off-by: Eric Engestrom <eric@engestrom.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28drm/vmwgfx: Fix order of operationSinclair Yeh
mode->hdisplay * (var->bits_per_pixel + 7) gets evaluated before the division, potentially making the pitch larger than it should be. Since the original intention is to do a div-round-up, just use the macro instead. Signed-off-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
2016-04-28drm/vmwgfx: use vmw_cmd_dx_cid_check for query commands.Charmaine Lee
Instead of calling vmw_cmd_ok, call vmw_cmd_dx_cid_check to validate the context id for query commands. Signed-off-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2016-04-28drm/vmwgfx: Enable SVGA_3D_CMD_DX_SET_PREDICATIONCharmaine Lee
Fixes piglit tests nv_conditional_render-* crashes. Signed-off-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2016-04-28IB/security: Restrict use of the write() interfaceJason Gunthorpe
The drivers/infiniband stack uses write() as a replacement for bi-directional ioctl(). This is not safe. There are ways to trigger write calls that result in the return structure that is normally written to user space being shunted off to user specified kernel memory instead. For the immediate repair, detect and deny suspicious accesses to the write API. For long term, update the user space libraries and the kernel API to something that doesn't present the same security vulnerabilities (likely a structured ioctl() interface). The impacted uAPI interfaces are generally only available if hardware from drivers/infiniband is installed in the system. Reported-by: Jann Horn <jann@thejh.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> [ Expanded check to all known write() entry points ] Cc: stable@vger.kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-28IB/hfi1: Use kernel default llseek for ui deviceDean Luick
The ui device llseek had a mistake with SEEK_END and did not fully follow seek semantics. Correct all this by using a kernel supplied function for fixed size devices. Cc: Al Viro <viro@ZenIV.linux.org.uk> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-28IB/hfi1: Don't attempt to free resources if initialization failedMitko Haralanov
Attempting to free resources which have not been allocated and initialized properly led to the following kernel backtrace: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffffa09658fe>] unlock_exp_tids.isra.8+0x2e/0x120 [hfi1] PGD 852a43067 PUD 85d4a6067 PMD 0 Oops: 0000 [#1] SMP CPU: 0 PID: 2831 Comm: osu_bw Tainted: G IO 3.12.18-wfr+ #1 task: ffff88085b15b540 ti: ffff8808588fe000 task.ti: ffff8808588fe000 RIP: 0010:[<ffffffffa09658fe>] [<ffffffffa09658fe>] unlock_exp_tids.isra.8+0x2e/0x120 [hfi1] RSP: 0018:ffff8808588ffde0 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff880858a31800 RCX: 0000000000000000 RDX: ffff88085d971bc0 RSI: ffff880858a318f8 RDI: ffff880858a318c0 RBP: ffff8808588ffe20 R08: 0000000000000000 R09: 0000000000000000 R10: ffff88087ffd6f40 R11: 0000000001100348 R12: ffff880852900000 R13: ffff880858a318c0 R14: 0000000000000000 R15: ffff88085d971be8 FS: 00007f4674e83740(0000) GS:ffff88087f400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000085c377000 CR4: 00000000001407f0 Stack: ffffffffa0941a71 ffff880858a318f8 ffff88085d971bc0 ffff880858a31800 ffff880852900000 ffff880858a31800 00000000003ffff7 ffff88085d971bc0 ffff8808588ffe60 ffffffffa09663fc ffff8808588ffe60 ffff880858a31800 Call Trace: [<ffffffffa0941a71>] ? find_mmu_handler+0x51/0x70 [hfi1] [<ffffffffa09663fc>] hfi1_user_exp_rcv_free+0x6c/0x120 [hfi1] [<ffffffffa0932809>] hfi1_file_close+0x1a9/0x340 [hfi1] [<ffffffff8116c189>] __fput+0xe9/0x270 [<ffffffff8116c35e>] ____fput+0xe/0x10 [<ffffffff81065707>] task_work_run+0xa7/0xe0 [<ffffffff81002969>] do_notify_resume+0x59/0x80 [<ffffffff814ffc1a>] int_signal+0x12/0x17 This commit re-arranges the context initialization code in a way that would allow for context event flags to be used to determine whether the context has been successfully initialized. In turn, this can be used to skip the resource de-allocation if they were never allocated in the first place. Fixes: 3abb33ac6521 ("staging/hfi1: Add TID cache receive init and free funcs") Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com. Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-28IB/hfi1: Fix missing lock/unlock in verbs drain callbackMike Marciniszyn
The iowait_sdma_drained() callback lacked locking to protect the qp s_flags field. This causes the s_flags to be out of sync on multiple CPUs, potentially corrupting the s_flags. Fixes: a545f5308b6c ("staging/rdma/hfi: fix CQ completion order issue") Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-28IB/rdmavt: Fix send schedulingJubin John
call_send is used to determine whether to send immediately or schedule a send for later. The current logic in rdmavt is inverted and has a negative impact on the latency of the hfi1 and qib drivers. Fix this regression by correctly calling send immediately when call_send is set. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-28IB/hfi1: Prevent unpinning of wrong pagesMitko Haralanov
The routine used by the SDMA cache to handle already cached nodes can extend an already existing node. In its error handling code, the routine will unpin pages when not all pages of the buffer extension were pinned. There was a bug in that part of the routine, which would mistakenly unpin pages from the original set rather than the newly pinned pages. This commit fixes that bug by offsetting the page array to the proper place pointing at the beginning of the newly pinned pages. Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-28IB/hfi1: Fix deadlock caused by locking with wrong scopeMitko Haralanov
The locking around the interval RB tree is designed to prevent access to the tree while it's being modified. The locking in its current form is too overzealous, which is causing a deadlock in certain cases with the following backtrace: Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 0 CPU: 0 PID: 5836 Comm: IMB-MPI1 Tainted: G O 3.12.18-wfr+ #1 0000000000000000 ffff88087f206c50 ffffffff814f1caa ffffffff817b53f0 ffff88087f206cc8 ffffffff814ecd56 0000000000000010 ffff88087f206cd8 ffff88087f206c78 0000000000000000 0000000000000000 0000000000001662 Call Trace: <NMI> [<ffffffff814f1caa>] dump_stack+0x45/0x56 [<ffffffff814ecd56>] panic+0xc2/0x1cb [<ffffffff810d4370>] ? restart_watchdog_hrtimer+0x50/0x50 [<ffffffff810d4432>] watchdog_overflow_callback+0xc2/0xd0 [<ffffffff81109b4e>] __perf_event_overflow+0x8e/0x2b0 [<ffffffff8110a714>] perf_event_overflow+0x14/0x20 [<ffffffff8101c906>] intel_pmu_handle_irq+0x1b6/0x390 [<ffffffff814f927b>] perf_event_nmi_handler+0x2b/0x50 [<ffffffff814f8ad8>] nmi_handle.isra.3+0x88/0x180 [<ffffffff814f8d39>] do_nmi+0x169/0x310 [<ffffffff814f8177>] end_repeat_nmi+0x1e/0x2e [<ffffffff81272600>] ? unmap_single+0x30/0x30 [<ffffffff814f780d>] ? _raw_spin_lock_irqsave+0x2d/0x40 [<ffffffff814f780d>] ? _raw_spin_lock_irqsave+0x2d/0x40 [<ffffffff814f780d>] ? _raw_spin_lock_irqsave+0x2d/0x40 <<EOE>> <IRQ> [<ffffffffa056c4a8>] hfi1_mmu_rb_search+0x38/0x70 [hfi1] [<ffffffffa05919cb>] user_sdma_free_request+0xcb/0x120 [hfi1] [<ffffffffa0593393>] user_sdma_txreq_cb+0x263/0x350 [hfi1] [<ffffffffa057fad7>] ? sdma_txclean+0x27/0x1c0 [hfi1] [<ffffffffa0593130>] ? user_sdma_send_pkts+0x1710/0x1710 [hfi1] [<ffffffffa057fdd6>] sdma_make_progress+0x166/0x480 [hfi1] [<ffffffff810762c9>] ? ttwu_do_wakeup+0x19/0xd0 [<ffffffffa0581c7e>] sdma_engine_interrupt+0x8e/0x100 [hfi1] [<ffffffffa0546bdd>] sdma_interrupt+0x5d/0xa0 [hfi1] [<ffffffff81097e57>] handle_irq_event_percpu+0x47/0x1d0 [<ffffffff81098017>] handle_irq_event+0x37/0x60 [<ffffffff8109aa5f>] handle_edge_irq+0x6f/0x120 [<ffffffff810044af>] handle_irq+0xbf/0x150 [<ffffffff8104c9b7>] ? irq_enter+0x17/0x80 [<ffffffff8150168d>] do_IRQ+0x4d/0xc0 [<ffffffff814f7c6a>] common_interrupt+0x6a/0x6a <EOI> [<ffffffff81073524>] ? finish_task_switch+0x54/0xe0 [<ffffffff814f56c6>] __schedule+0x3b6/0x7e0 [<ffffffff810763a6>] __cond_resched+0x26/0x30 [<ffffffff814f5eda>] _cond_resched+0x3a/0x50 [<ffffffff814f4f82>] down_write+0x12/0x30 [<ffffffffa0591619>] hfi1_release_user_pages+0x69/0x90 [hfi1] [<ffffffffa059173a>] sdma_rb_remove+0x9a/0xc0 [hfi1] [<ffffffffa056c00d>] __mmu_rb_remove.isra.5+0x5d/0x70 [hfi1] [<ffffffffa056c536>] hfi1_mmu_rb_remove+0x56/0x70 [hfi1] [<ffffffffa059427b>] hfi1_user_sdma_process_request+0x74b/0x1160 [hfi1] [<ffffffffa055c763>] hfi1_aio_write+0xc3/0x100 [hfi1] [<ffffffff8116a14c>] do_sync_readv_writev+0x4c/0x80 [<ffffffff8116b58b>] do_readv_writev+0xbb/0x230 [<ffffffff811a9da1>] ? fsnotify+0x241/0x320 [<ffffffff81073524>] ? finish_task_switch+0x54/0xe0 [<ffffffff8116b795>] vfs_writev+0x35/0x60 [<ffffffff8116b8c9>] SyS_writev+0x49/0xc0 [<ffffffff810cd876>] ? __audit_syscall_exit+0x1f6/0x2a0 [<ffffffff814ff992>] system_call_fastpath+0x16/0x1b As evident from the backtrace above, the process was being put to sleep while holding the lock. Limiting the scope of the lock only to the RB tree operation fixes the above error allowing for proper locking and the process being put to sleep when needed. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-28IB/hfi1: Prevent NULL pointer deferences in caching codeMitko Haralanov
There is a potential kernel crash when the MMU notifier calls the invalidation routines in the hfi1 pinned page caching code for sdma. The invalidation routine could call the remove callback for the node, which in turn ends up dereferencing the current task_struct to get a pointer to the mm_struct. However, the mm_struct pointer could be NULL resulting in the following backtrace: BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 IP: [<ffffffffa041f75a>] sdma_rb_remove+0xaa/0x100 [hfi1] 15 task: ffff88085e66e080 ti: ffff88085c244000 task.ti: ffff88085c244000 RIP: 0010:[<ffffffffa041f75a>] [<ffffffffa041f75a>] sdma_rb_remove+0xaa/0x100 [hfi1] RSP: 0000:ffff88085c245878 EFLAGS: 00010002 RAX: 0000000000000000 RBX: ffff88105b9bbd40 RCX: ffffea003931a830 RDX: 0000000000000004 RSI: ffff88105754a9c0 RDI: ffff88105754a9c0 RBP: ffff88085c245890 R08: ffff88105b9bbd70 R09: 00000000fffffffb R10: ffff88105b9bbd58 R11: 0000000000000013 R12: ffff88105754a9c0 R13: 0000000000000001 R14: 0000000000000001 R15: ffff88105b9bbd40 FS: 0000000000000000(0000) GS:ffff88107ef40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000a8 CR3: 0000000001a0b000 CR4: 00000000001407e0 Stack: ffff88105b9bbd40 ffff88080ec481a8 ffff88080ec481b8 ffff88085c2458c0 ffffffffa03fa00e ffff88080ec48190 ffff88080ed9cd00 0000000001024000 0000000000000000 ffff88085c245920 ffffffffa03fa0e7 0000000000000282 Call Trace: [<ffffffffa03fa00e>] __mmu_rb_remove.isra.5+0x5e/0x70 [hfi1] [<ffffffffa03fa0e7>] mmu_notifier_mem_invalidate+0xc7/0xf0 [hfi1] [<ffffffffa03fa143>] mmu_notifier_page+0x13/0x20 [hfi1] [<ffffffff81156dd0>] __mmu_notifier_invalidate_page+0x50/0x70 [<ffffffff81140bbb>] try_to_unmap_one+0x20b/0x470 [<ffffffff81141ee7>] try_to_unmap_anon+0xa7/0x120 [<ffffffff81141fad>] try_to_unmap+0x4d/0x60 [<ffffffff8111fd7b>] shrink_page_list+0x2eb/0x9d0 [<ffffffff81120ab3>] shrink_inactive_list+0x243/0x490 [<ffffffff81121491>] shrink_lruvec+0x4c1/0x640 [<ffffffff81121641>] shrink_zone+0x31/0x100 [<ffffffff81121b0f>] kswapd_shrink_zone.constprop.62+0xef/0x1c0 [<ffffffff811229e3>] kswapd+0x403/0x7e0 [<ffffffff811225e0>] ? shrink_all_memory+0xf0/0xf0 [<ffffffff81068ac0>] kthread+0xc0/0xd0 [<ffffffff81068a00>] ? insert_kthread_work+0x40/0x40 [<ffffffff814ff8ec>] ret_from_fork+0x7c/0xb0 [<ffffffff81068a00>] ? insert_kthread_work+0x40/0x40 To correct this, the mm_struct passed to us by the MMU notifier is used (which is what should have been done to begin with). This avoids the broken derefences and ensures that the correct mm_struct is used. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>