summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-07-07libceph: always populate t->target_{oid,oloc} in calc_target()Ilya Dryomov
need_check_tiering logic doesn't make a whole lot of sense. Drop it and apply tiering unconditionally on every calc_target() call instead. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07libceph: make sure need_resend targets reflect latest mapIlya Dryomov
Otherwise we may miss events like PG splits, pool deletions, etc when we get multiple incremental maps at once. Because check_pool_dne() can now be fed an unlinked request, finish_request() needed to be taught to handle unlinked requests. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07libceph: delete from need_resend_linger before check_linger_pool_dne()Ilya Dryomov
When processing a map update consisting of multiple incrementals, we may end up running check_linger_pool_dne() on a lingering request that was previously added to need_resend_linger list. If it is concluded that the target pool doesn't exist, the request is killed off while still on need_resend_linger list, which leads to a crash on a NULL lreq->osd in kick_requests(): libceph: linger_id 18446462598732840961 pool does not exist BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: ceph_osdc_handle_map+0x4ae/0x870 Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07libceph: resend on PG splits if OSD has RESEND_ON_SPLITIlya Dryomov
Note that ceph_osd_request_target fields are updated regardless of RESEND_ON_SPLIT. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07libceph: drop need_resend from calc_target()Ilya Dryomov
Replace it with more fine-grained bools to separate updating ceph_osd_request_target fields and the decision to resend. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07libceph: MOSDOp v8 encoding (actual spgid + full hash)Ilya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07libceph: ceph_connection_operations::reencode_message() methodIlya Dryomov
Give upper layers a chance to reencode the message after the connection is negotiated and ->peer_features is set. OSD client will use this to support both luminous and pre-luminous OSDs (in a single cluster): the former need MOSDOp v8; the latter will continue to be sent MOSDOp v4. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07libceph: encode_{pgid,oloc}() helpersIlya Dryomov
Factor out encode_{pgid,oloc}() and use ceph_encode_string() for oid. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07libceph: introduce ceph_spg, ceph_pg_to_primary_shard()Ilya Dryomov
Store both raw pgid and actual spgid in ceph_osd_request_target. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07libceph: new pi->last_force_request_resendIlya Dryomov
The old (v15) pi->last_force_request_resend has been repurposed to make pre-RESEND_ON_SPLIT clients that don't check for PG splits but do obey pi->last_force_request_resend resend on splits. See ceph.git commit 189ca7ec6420 ("mon/OSDMonitor: make pre-luminous clients resend ops on split"). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07libceph: fold [l]req->last_force_resend into ceph_osd_request_targetIlya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07libceph: support SERVER_JEWEL feature bitsIlya Dryomov
Only MON_STATEFUL_SUB, really. MON_ROUTE_OSDMAP and OSDSUBOP_NO_SNAPCONTEXT are irrelevant. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07libceph: advertise support for OSD_POOLRESENDIlya Dryomov
The code has been in place since commit 63244fa123a7 ("libceph: introduce ceph_osd_request_target, calc_target()"), and, with the ceph_{oloc,oid}_copy() issue fixed in the previous commit, is now in working order. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07libceph: handle non-empty dest in ceph_{oloc,oid}_copy()Ilya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07libceph: new features macrosIlya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07libceph: remove ceph_sanitize_features() workaroundIlya Dryomov
Reflects ceph.git commit ff1959282826ae6acd7134e1b1ede74ffd1cc04a. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07ceph: update ceph_dentry_info::lease_session when necessaryYan, Zheng
Current code does not update ceph_dentry_info::lease_session once it is set. If auth mds of corresponding dentry changes, dentry lease keeps in an invalid state. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07ceph: new mount option that specifies fscache uniquifierYan, Zheng
Current ceph uses FSID as primary index key of fscache data. This allows ceph to retain cached data across remount. But this causes problem (kernel opps, fscache does not support sharing data) when a filesystem get mounted several times (with fscache enabled, with different mount options). The fix is adding a new mount option, which specifies uniquifier for fscache. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07ceph: avoid accessing freeing inode in ceph_check_delayed_caps()Yan, Zheng
Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07ceph: avoid invalid memory dereference in the middle of umountYan, Zheng
extra_mon_dispatch() and debugfs' foo_show functions dereference fsc->mdsc. we should clean up fsc->client->extra_mon_dispatch and debugfs before destroying fsc->mds. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07ceph: getattr before read on ceph.* xattrsYan, Zheng
Previously we were returning values for quota, layout xattrs without any kind of update -- the user just got whatever happened to be in our cache. Clearly this extra round trip has a cost, but reads of these xattrs are fairly rare, happening on admin intervention rather than in normal operation. Link: http://tracker.ceph.com/issues/17939 Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07ceph: don't re-send interrupted flock requestYan, Zheng
Don't re-send interrupted flock request in cases of mds failover and receiving request forward. Because corresponding 'lock intr' request may have been finished, it won't get re-sent. Link: http://tracker.ceph.com/issues/20170 Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07ceph: cleanup writepage_nounlock()Yan, Zheng
Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07ceph: redirty page when writepage_nounlock() skips unwritable pageYan, Zheng
Ceph needs to flush dirty page in the order in which in which snap context they belong to. Dirty pages belong to older snap context should be flushed earlier. if writepage_nounlock() can not flush a page, it should redirty the page. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07ceph: remove useless page->mapping check in writepage_nounlock()Yan, Zheng
Callers of writepage_nounlock() have already ensured non-null page->mapping. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07ceph: update the 'approaching max_size' codeYan, Zheng
The old 'approaching max_size' code expects MDS set max_size to '2 * reported_size'. This is no longer true. The new code reports file size when half of previous max_size increment has been used. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07ceph: re-request max size after importing capsYan, Zheng
The 'wanted max size' could be sent to inode's old auth mds, re-send it to inode's new auth mds if necessary. Otherwise write syscall may hang. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-07drm/radeon: Fix eDP for single-display iMac10,1 (v2)Mario Kleiner
The late 2009, 27 inch Apple iMac10,1 has an internal eDP display and an external Mini- Displayport output, driven by a DCE-3.2, RV730 Radeon Mobility HD-4670. The machine worked fine in a dual-display setup with eDP panel + externally connected HDMI or DVI-D digital display sink, connected via MiniDP to DVI or HDMI adapter. However, booting the machine single-display with only eDP panel results in a completely black display - even backlight powering off, as soon as the radeon modesetting driver loads. This patch fixes the single dispay eDP case by assigning encoders based on dig->linkb, similar to DCE-4+. While this should not be generally necessary (Alex: "...atom on normal boards should be able to handle any mapping."), Apple seems to use some special routing here. One remaining problem not solved by this patch is that an external Minidisplayport->DP sink does still not work on iMac10,1, whereas external DVI and HDMI sinks continue to work. The problem affects at least all tested kernels since Linux 3.13 - didn't test earlier kernels, so backporting to stable probably makes sense. v2: With the original patch from 2016, Alex was worried it will break other DCE3.2 systems. Use dmi_match() to apply this special encoder assignment only for the Apple iMac 10,1 from late 2009. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Michel Dänzer <michel.daenzer@amd.com> Cc: <stable@vger.kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-07-07rtc: ds1307: remove ds1307_removeAlexandre Belloni
ds1307_remove() is now empty, remove it Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-07-07rtc: ds1307: use generic nvmemAlexandre Belloni
Instead of adding a binary sysfs attribute from the driver (which suffers from a race condition as the attribute appears after the device), use the core to register an nvmem device. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-07-07rtc: ds1307: switch to rtc_register_deviceAlexandre Belloni
This removes a possible race condition and crash and allows for further improvement of the driver. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-07-07rtc: rv8803: remove rv8803_removeAlexandre Belloni
rv8803_remove() is now empty, remove it. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-07-07rtc: rv8803: use generic nvmem supportAlexandre Belloni
Instead of adding a binary sysfs attribute from the driver (which suffers from a race condition as the attribute appears after the device), use the core to register an nvmem device. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-07-07rtc: rv8803: switch to rtc_register_deviceAlexandre Belloni
This removes a possible race condition and allows for further improvement of the driver. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-07-07rtc: add generic nvmem supportAlexandre Belloni
Many RTCs have an on board non volatile storage. It can be battery backed RAM or an EEPROM. Use the nvmem subsystem to export it to both userspace and in-kernel consumers. This stays compatible with the previous (non documented) ABI that was using /sys/class/rtc/rtcx/device/nvram to export that memory. But will warn about the deprecation. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-07-07rtc: at91rm9200: remove race conditionAlexandre Belloni
While highly unlikely, it is possible to get an interrupt as soon as it is requested. In that case, at91_rtc_interrupt() will be called with rtc == NULL. Solve that by using devm_rtc_allocate_device/rtc_register_device. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-07-07rtc: introduce new registration methodAlexandre Belloni
Introduce rtc_register_device() to register an already allocated and initialized struct rtc_device. It automatically sets up the owner and the two steps allocation/registration will allow to remove race conditions in the IRQ handling of some driver. It also allows to properly extend the core without adding more arguments to rtc_device_register(). Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-07-07rtc: class separate id allocation from registrationAlexandre Belloni
Create rtc_device_get_id to allocate the id for an RTC. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-07-07rtc: class separate device allocation from registrationAlexandre Belloni
Create rtc_allocate_device to allocate memory for a struct rtc_device and initialize it. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-07-07Merge tag 'mac80211-for-davem-2017-07-07' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== pull-request: mac80211 2017-07-07 Just got a set of fixes in from Jouni/QCA, all netlink validation fixes. I assume they ran some kind of checker, but I don't know what kind :) Please pull and let me know if there's any problem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-07irqdomain: Allow ACPI device nodes to be used as irqdomain identifiersMarc Zyngier
A number of irqchip implementations are (ab)using the irqdomain allocator by passing a fwnode that is neither a FWNODE_OF or a FWNODE_IRQCHIP. This is pretty bad, but it also feels pretty crap to force these drivers to allocate their own irqchip_fwid when they already have a proper fwnode. Instead, let's teach the irqdomain allocator about ACPI device nodes, and add some lovely name generation code... Tested on an arm64 D05 system. Reported-and-tested-by: John Garry <john.garry@huawei.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Agustin Vega-Frias <agustinv@codeaurora.org> Cc: Ma Jun <majun258@huawei.com> Cc: Hanjun Guo <hanjun.guo@linaro.org> Link: http://lkml.kernel.org/r/20170707083959.10349-1-marc.zyngier@arm.com
2017-07-07cfg80211: Validate frequencies nested in NL80211_ATTR_SCAN_FREQUENCIESSrinivas Dasari
validate_scan_freqs() retrieves frequencies from attributes nested in the attribute NL80211_ATTR_SCAN_FREQUENCIES with nla_get_u32(), which reads 4 bytes from each attribute without validating the size of data received. Attributes nested in NL80211_ATTR_SCAN_FREQUENCIES don't have an nla policy. Validate size of each attribute before parsing to avoid potential buffer overread. Fixes: 2a519311926 ("cfg80211/nl80211: scanning (and mac80211 update to use it)") Cc: stable@vger.kernel.org Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-07-07cfg80211: Define nla_policy for NL80211_ATTR_LOCAL_MESH_POWER_MODESrinivas Dasari
Buffer overread may happen as nl80211_set_station() reads 4 bytes from the attribute NL80211_ATTR_LOCAL_MESH_POWER_MODE without validating the size of data received when userspace sends less than 4 bytes of data with NL80211_ATTR_LOCAL_MESH_POWER_MODE. Define nla_policy for NL80211_ATTR_LOCAL_MESH_POWER_MODE to avoid the buffer overread. Fixes: 3b1c5a5307f ("{cfg,nl}80211: mesh power mode primitives and userspace access") Cc: stable@vger.kernel.org Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-07-07cfg80211: Check if NAN service ID is of expected sizeSrinivas Dasari
nla policy checks for only maximum length of the attribute data when the attribute type is NLA_BINARY. If userspace sends less data than specified, cfg80211 may access illegal memory. When type is NLA_UNSPEC, nla policy check ensures that userspace sends minimum specified length number of bytes. Remove type assignment to NLA_BINARY from nla_policy of NL80211_NAN_FUNC_SERVICE_ID to make these NLA_UNSPEC and to make sure minimum NL80211_NAN_FUNC_SERVICE_ID_LEN bytes are received from userspace with NL80211_NAN_FUNC_SERVICE_ID. Fixes: a442b761b24 ("cfg80211: add add_nan_func / del_nan_func") Cc: stable@vger.kernel.org Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-07-07cfg80211: Check if PMKID attribute is of expected sizeSrinivas Dasari
nla policy checks for only maximum length of the attribute data when the attribute type is NLA_BINARY. If userspace sends less data than specified, the wireless drivers may access illegal memory. When type is NLA_UNSPEC, nla policy check ensures that userspace sends minimum specified length number of bytes. Remove type assignment to NLA_BINARY from nla_policy of NL80211_ATTR_PMKID to make this NLA_UNSPEC and to make sure minimum WLAN_PMKID_LEN bytes are received from userspace with NL80211_ATTR_PMKID. Fixes: 67fbb16be69d ("nl80211: PMKSA caching support") Cc: stable@vger.kernel.org Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-07-07iov_iter: saner checks on copyin/copyoutAl Viro
* might_fault() is better checked in caller (and e.g. fault-in + kmap_atomic codepath also needs might_fault() coverage) * we have already done object size checks * we have *NOT* done access_ok() recently enough; we rely upon the iovec array having passed sanity checks back when it had been created and not nothing having buggered it since. However, that's very much non-local, so we'd better recheck that. So the thing we want does not match anything in uaccess - we need access_ok + kasan checks + raw copy without any zeroing. Just define such helpers and use them here. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-07-07arcnet: com20020-pci: Fix an error handling path in 'com20020pci_probe()'Christophe Jaillet
If this memory allocation fails, we should go through the error handling path as done everywhere else in this function before returning. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-07nfp: flower: add missing clean up call to avoid memory leaksJakub Kicinski
nfp_flower_metadata_cleanup() is defined but never invoked, not calling it will cause us to leak mask and statistics queue memory on the host. Fixes: 43f84b72c50d ("nfp: add metadata to each flow offload") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-07genirq/debugfs: Remove redundant NULL pointer checkThomas Gleixner
debugfs_remove() can be called with a NULL pointer. Fixes: 087cdfb662ae5 ("genirq/debugfs: Add proper debugfs interface") Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-07-06Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc updates from Andrew Morton: - a few hotfixes - various misc updates - ocfs2 updates - most of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (108 commits) mm, memory_hotplug: move movable_node to the hotplug proper mm, memory_hotplug: drop CONFIG_MOVABLE_NODE mm, memory_hotplug: drop artificial restriction on online/offline mm: memcontrol: account slab stats per lruvec mm: memcontrol: per-lruvec stats infrastructure mm: memcontrol: use generic mod_memcg_page_state for kmem pages mm: memcontrol: use the node-native slab memory counters mm: vmstat: move slab statistics from zone to node counters mm/zswap.c: delete an error message for a failed memory allocation in zswap_dstmem_prepare() mm/zswap.c: improve a size determination in zswap_frontswap_init() mm/zswap.c: delete an error message for a failed memory allocation in zswap_pool_create() mm/swapfile.c: sort swap entries before free mm/oom_kill: count global and memory cgroup oom kills mm: per-cgroup memory reclaim stats mm: kmemleak: treat vm_struct as alternative reference to vmalloc'ed objects mm: kmemleak: factor object reference updating out of scan_block() mm: kmemleak: slightly reduce the size of some structures on 64-bit architectures mm, mempolicy: don't check cpuset seqlock where it doesn't matter mm, cpuset: always use seqlock when changing task's nodemask mm, mempolicy: simplify rebinding mempolicies when updating cpusets ...