git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2017-08-02	mtd: nand: Declare tBERS, tR and tPROG as u64 to avoid integer overflow	Boris Brezillon
	All timings in nand_sdr_timings are expressed in picoseconds but some of them may not fit in an u32. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Fixes: 204e7ecd47e2 ("mtd: nand: Add a few more timings to nand_sdr_timings") Reported-by: Alexander Dahl <ada@thorsis.com> Cc: <stable@vger.kernel.org> Reviewed-by: Alexander Dahl <ada@thorsis.com> Tested-by: Alexander Dahl <ada@thorsis.com> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2017-08-01	net: dsa: rename switch EEE ops	Vivien Didelot
	To avoid confusion with the PHY EEE settings, rename the .set_eee and .get_eee ops to respectively .set_mac_eee and .get_mac_eee. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-01	net: dsa: remove PHY device argument from .set_eee	Vivien Didelot
	The DSA switch operations for EEE are only meant to configure a port's MAC EEE settings. The port's PHY EEE settings are accessed by the DSA layer and must be made available via a proper PHY driver. In order to reduce this confusion, remove the phy_device argument from the .set_eee operation. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-02	Merge branch 'drm-next-4.14' of git://people.freedesktop.org/~agd5f/linux ↵	Dave Airlie
	into drm-next - Stop reprogramming the MC, the vbios already does this in asic_init - Reduce internal gart to 256M (this does not affect the ttm GTT pool size) - Initial support for huge pages - Rework bo migration logic - Lots of improvements for vega10 - Powerplay fixes - Additional Raven enablement - SR-IOV improvements - Bug fixes - Code cleanup * 'drm-next-4.14' of git://people.freedesktop.org/~agd5f/linux: (138 commits) drm/amdgpu: fix header on gfx9 clear state drm/amdgpu: reduce the time of reading VBIOS drm/amdgpu/virtual_dce: Remove the rmmod error message drm/amdgpu/gmc9: disable legacy vga features in gmc init drm/amdgpu/gmc8: disable legacy vga features in gmc init drm/amdgpu/gmc7: disable legacy vga features in gmc init drm/amdgpu/gmc6: disable legacy vga features in gmc init (v2) drm/radeon: Set depth on low mem to 16 bpp instead of 8 bpp drm/amdgpu: fix the incorrect scratch reg number on gfx v6 drm/amdgpu: fix the incorrect scratch reg number on gfx v7 drm/amdgpu: fix the incorrect scratch reg number on gfx v8 drm/amdgpu: fix the incorrect scratch reg number on gfx v9 drm/amd/powerplay: add support for 3DP 4K@120Hz on vega10. drm/amdgpu: enable huge page handling in the VM v5 drm/amdgpu: increase fragmentation size for Vega10 v2 drm/amdgpu: ttm_bind only when user needs gpu_addr in bo pin drm/amdgpu: correct clock info for SRIOV drm/amdgpu/gmc8: SRIOV need to program fb location drm/amdgpu: disable firmware loading for psp v10 drm/amdgpu:fix gfx fence allocate size ...
2017-08-01	PCI: Add pci_reset_function_locked()	Marc Zyngier
	The implementation of PCI workarounds may require that the device is reset from its probe function. This implies that the PCI device lock is already held, and makes calling pci_reset_function() impossible (since it will itself try to take that lock). Add pci_reset_function_locked(), which is the equivalent of pci_reset_function(), except that it requires the PCI device lock to be already held by the caller. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> [bhelgaas: folded in fix for conflict with 52354b9d1f46 ("PCI: Remove __pci_dev_reset() and pci_dev_reset()")] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # 4.11: 52354b9d1f46: PCI: Remove __pci_dev_reset() and pci_dev_reset() Cc: stable@vger.kernel.org # 4.11
2017-08-01	drm/msm: Remove __user from __u64 data types	Jordan Crouse
	__user should be used to identify user pointers and not __u64 variables containing pointers. Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-08-01	net: add skb_frag_foreach_page and use with kmap_atomic	Willem de Bruijn
	Skb frags may contain compound pages. Various operations map frags temporarily using kmap_atomic, but this function works on single pages, not whole compound pages. The distinction is only relevant for high mem pages that require temporary mappings. Introduce a looping mechanism that for compound highmem pages maps one page at a time, does not change behavior on other pages. Use the loop in the kmap_atomic callers in net/core/skbuff.c. Verified by triggering skb_copy_bits with tcpdump -n -c 100 -i ${DEV} -w /dev/null & netperf -t TCP_STREAM -H ${HOST} and by triggering __skb_checksum with ethtool -K ${DEV} tx off repeated the tests with looping on a non-highmem platform (x86_64) by making skb_frag_must_loop always return true. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-01	strparser: Generalize strparser	Tom Herbert
	Generalize strparser from more than just being used in conjunction with read_sock. strparser will also be used in the send path with zero proxy. The primary change is to create strp_process function that performs the critical processing on skbs. The documentation is also updated to reflect the new uses. Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-01	skbuff: Function to send an skbuf on a socket	Tom Herbert
	Add skb_send_sock to send an skbuff on a socket within the kernel. Arguments include an offset so that an skbuf might be sent in mulitple calls (e.g. send buffer limit is hit). Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-01	proto_ops: Add locked held versions of sendmsg and sendpage	Tom Herbert
	Add new proto_ops sendmsg_locked and sendpage_locked that can be called when the socket lock is already held. Correspondingly, add kernel_sendmsg_locked and kernel_sendpage_locked as front end functions. These functions will be used in zero proxy so that we can take the socket lock in a ULP sendmsg/sendpage and then directly call the backend transport proto_ops functions. Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-01	ptp: introduce ptp auxiliary worker	Grygorii Strashko
	Many PTP drivers required to perform some asynchronous or periodic work, like periodically handling PHC counter overflow or handle delayed timestamp for RX/TX network packets. In most of the cases, such work is implemented using workqueues. Unfortunately, Kernel workqueues might introduce significant delay in work scheduling under high system load and on -RT, which could cause misbehavior of PTP drivers due to internal counter overflow, for example, and there is no way to tune its execution policy and priority manuallly. Hence, The kthread_worker can be used insted of workqueues, as it create separte named kthread for each worker and its its execution policy and priority can be configured using chrt tool. This prblem was reported for two drivers TI CPSW CPTS and dp83640, so instead of modifying each of these driver it was proposed to add PTP auxiliary worker to the PHC subsystem. The patch adds PTP auxiliary worker in PHC subsystem using kthread_worker and kthread_delayed_work and introduces two new PHC subsystem APIs: - long (do_aux_work)(struct ptp_clock_info ptp) callback in ptp_clock_info structure, which driver should assign if it require to perform asynchronous or periodic work. Driver should return the delay of the PTP next auxiliary work scheduling time (>=0) or negative value in case further scheduling is not required. - int ptp_schedule_worker(struct ptp_clock *ptp, unsigned long delay) which allows schedule PTP auxiliary work. The name of kthread_worker thread corresponds PTP PHC device name "ptp%d". Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-01	x86/intel_rdt/cqm: Add tasks file support	Vikas Shivappa
	The root directory, ctrl_mon and monitor groups are populated with a read/write file named "tasks". When read, it shows all the task IDs assigned to the resource group. Tasks can be added to groups by writing the PID to the file. A task can be present in one "ctrl_mon" group "and" one "monitor" group. IOW a PID_x can be seen in a ctrl_mon group and a monitor group at the same time. When a task is added to a ctrl_mon group, it is automatically removed from the previous ctrl_mon group where it belonged. Similarly if a task is moved to a monitor group it is removed from the previous monitor group . Also since the monitor groups can only have subset of tasks of parent ctrl_mon group, a task can be moved to a monitor group only if its already present in the parent ctrl_mon group. Task membership is indicated by a new field in the task_struct "u32 rmid" which holds the RMID for the task. RMID=0 is reserved for the default root group where the tasks belong to at mount. [tony: zero the rmid if rdtgroup was deleted when task was being moved] Signed-off-by: Tony Luck <tony.luck@linux.intel.com> Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: davidcc@google.com Cc: reinette.chatre@intel.com Link: http://lkml.kernel.org/r/1501017287-28083-16-git-send-email-vikas.shivappa@linux.intel.com
2017-08-01	x86/intel_rdt: Change closid type from int to u32	Vikas Shivappa
	OS associates a CLOSid(Class of service id) to a task by writing the high 32 bits of per CPU IA32_PQR_ASSOC MSR when a task is scheduled in. CPUID.(EAX=10H, ECX=1):EDX[15:0] enumerates the max CLOSID supported and it is zero indexed. Hence change the type to u32 from int. Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: davidcc@google.com Cc: reinette.chatre@intel.com Link: http://lkml.kernel.org/r/1501017287-28083-15-git-send-email-vikas.shivappa@linux.intel.com
2017-08-01	x86/intel_rdt: Introduce a common compile option for RDT	Vikas Shivappa
	We currently have a CONFIG_RDT_A which is for RDT(Resource directory technology) allocation based resctrl filesystem interface. As a preparation to add support for RDT monitoring as well into the same resctrl filesystem, change the config option to be CONFIG_RDT which would include both RDT allocation and monitoring code. No functional change. Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: davidcc@google.com Cc: reinette.chatre@intel.com Link: http://lkml.kernel.org/r/1501017287-28083-4-git-send-email-vikas.shivappa@linux.intel.com
2017-08-01	x86/perf/cqm: Wipe out perf based cqm	Vikas Shivappa
	'perf cqm' never worked due to the incompatibility between perf infrastructure and cqm hardware support. The hardware uses RMIDs to track the llc occupancy of tasks and these RMIDs are per package. This makes monitoring a hierarchy like cgroup along with monitoring of tasks separately difficult and several patches sent to lkml to fix them were NACKed. Further more, the following issues in the current perf cqm make it almost unusable: 1. No support to monitor the same group of tasks for which we do allocation using resctrl. 2. It gives random and inaccurate data (mostly 0s) once we run out of RMIDs due to issues in Recycling. 3. Recycling results in inaccuracy of data because we cannot guarantee that the RMID was stolen from a task when it was not pulling data into cache or even when it pulled the least data. Also for monitoring llc_occupancy, if we stop using an RMID_x and then start using an RMID_y after we reclaim an RMID from an other event, we miss accounting all the occupancy that was tagged to RMID_x at a later perf_count. 2. Recycling code makes the monitoring code complex including scheduling because the event can lose RMID any time. Since MBM counters count bandwidth for a period of time by taking snap shot of total bytes at two different times, recycling complicates the way we count MBM in a hierarchy. Also we need a spin lock while we do the processing to account for MBM counter overflow. We also currently use a spin lock in scheduling to prevent the RMID from being taken away. 4. Lack of support when we run different kind of event like task, system-wide and cgroup events together. Data mostly prints 0s. This is also because we can have only one RMID tied to a cpu as defined by the cqm hardware but a perf can at the same time tie multiple events during one sched_in. 5. No support of monitoring a group of tasks. There is partial support for cgroup but it does not work once there is a hierarchy of cgroups or if we want to monitor a task in a cgroup and the cgroup itself. 6. No support for monitoring tasks for the lifetime without perf overhead. 7. It reported the aggregate cache occupancy or memory bandwidth over all sockets. But most cloud and VMM based use cases want to know the individual per-socket usage. Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: davidcc@google.com Cc: reinette.chatre@intel.com Link: http://lkml.kernel.org/r/1501017287-28083-2-git-send-email-vikas.shivappa@linux.intel.com
2017-08-01	NFSv4: Fix EXCHANGE_ID corrupt verifier issue	Trond Myklebust
	The verifier is allocated on the stack, but the EXCHANGE_ID RPC call was changed to be asynchronous by commit 8d89bd70bc939. If we interrrupt the call to rpc_wait_for_completion_task(), we can therefore end up transmitting random stack contents in lieu of the verifier. Fixes: 8d89bd70bc939 ("NFS setup async exchange_id") Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-08-01	sunrpc: Const-ify all instances of struct rpc_xprt_ops	Chuck Lever
	After transport instance creation, these function pointers never change. Mark them as constant to prevent their use as an attack vector for code injections. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-08-01	mtd: spi-nor: Recover from Spansion/Cypress errors	Alexander Sverdlin
	S25FL{128\|256\|512}S datasheets say: "When P_ERR or E_ERR bits are set to one, the WIP bit will remain set to one indicating the device remains busy and unable to receive new operation commands. A Clear Status Register (CLSR) command must be received to return the device to standby mode." Current spi-nor code works until first error occurs, but write/erase errors are not just rare hardware failures, they also occur if user tries to flash write-protected areas. After such attempt no SPI command can be executed any more and even read fails. This patch adds support for P_ERR and E_ERR bits in Status Register 1 (so that operation fails immediately and not after a long timeout) and proper recovery from the error condition. Tested on Spansion S25FS128S, which is supported by S25FL129P entry. Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Signed-off-by: Cyrille Pitchen <cyrille.pitchen@wedev4u.fr>
2017-08-01	LSM: drop bprm_secureexec hook	Kees Cook
	This removes the bprm_secureexec hook since the logic has been folded into the bprm_set_creds hook for all LSMs now. Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: John Johansen <john.johansen@canonical.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Serge Hallyn <serge@hallyn.com>
2017-08-01	commoncap: Move cap_elevated calculation into bprm_set_creds	Kees Cook
	Instead of a separate function, open-code the cap_elevated test, which lets us entirely remove bprm->cap_effective (to use the local "effective" variable instead), and more accurately examine euid/egid changes via the existing local "is_setid". The following LTP tests were run to validate the changes: # ./runltp -f syscalls -s cap # ./runltp -f securebits # ./runltp -f cap_bounds # ./runltp -f filecaps All kernel selftests for capabilities and exec continue to pass as well. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: James Morris <james.l.morris@oracle.com> Acked-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Andy Lutomirski <luto@kernel.org>
2017-08-01	commoncap: Refactor to remove bprm_secureexec hook	Kees Cook
	The commoncap implementation of the bprm_secureexec hook is the only LSM that depends on the final call to its bprm_set_creds hook (since it may be called for multiple files, it ignores bprm->called_set_creds). As a result, it cannot safely _clear_ bprm->secureexec since other LSMs may have set it. Instead, remove the bprm_secureexec hook by introducing a new flag to bprm specific to commoncap: cap_elevated. This is similar to cap_effective, but that is used for a specific subset of elevated privileges, and exists solely to track state from bprm_set_creds to bprm_secureexec. As such, it will be removed in the next patch. Here, set the new bprm->cap_elevated flag when setuid/setgid has happened from bprm_fill_uid() or fscapabilities have been prepared. This temporarily moves the bprm_secureexec hook to a static inline. The helper will be removed in the next patch; this makes the step easier to review and bisect, since this does not introduce any changes to inputs nor outputs to the "elevated privileges" calculation. The new flag is merged with the bprm->secureexec flag in setup_new_exec() since this marks the end of any further prepare_binprm() calls. Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Andy Lutomirski <luto@kernel.org> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Serge Hallyn <serge@hallyn.com>
2017-08-01	binfmt: Introduce secureexec flag	Kees Cook
	The bprm_secureexec hook can be moved earlier. Right now, it is called during create_elf_tables(), via load_binary(), via search_binary_handler(), via exec_binprm(). Nearly all (see exception below) state used by bprm_secureexec is created during the bprm_set_creds hook, called from prepare_binprm(). For all LSMs (except commoncaps described next), only the first execution of bprm_set_creds takes any effect (they all check bprm->called_set_creds which prepare_binprm() sets after the first call to the bprm_set_creds hook). However, all these LSMs also only do anything with bprm_secureexec when they detected a secure state during their first run of bprm_set_creds. Therefore, it is functionally identical to move the detection into bprm_set_creds, since the results from secureexec here only need to be based on the first call to the LSM's bprm_set_creds hook. The single exception is that the commoncaps secureexec hook also examines euid/uid and egid/gid differences which are controlled by bprm_fill_uid(), via prepare_binprm(), which can be called multiple times (e.g. binfmt_script, binfmt_misc), and may clear the euid/egid for the final load (i.e. the script interpreter). However, while commoncaps specifically ignores bprm->cred_prepared, and runs its bprm_set_creds hook each time prepare_binprm() may get called, it needs to base the secureexec decision on the final call to bprm_set_creds. As a result, it will need special handling. To begin this refactoring, this adds the secureexec flag to the bprm struct, and calls the secureexec hook during setup_new_exec(). This is safe since all the cred work is finished (and past the point of no return). This explicit call will be removed in later patches once the hook has been removed. Cc: David Howells <dhowells@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: John Johansen <john.johansen@canonical.com> Acked-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: James Morris <james.l.morris@oracle.com>
2017-08-01	exec: Rename bprm->cred_prepared to called_set_creds	Kees Cook
	The cred_prepared bprm flag has a misleading name. It has nothing to do with the bprm_prepare_cred hook, and actually tracks if bprm_set_creds has been called. Rename this flag and improve its comment. Cc: David Howells <dhowells@redhat.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: John Johansen <john.johansen@canonical.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Paul Moore <paul@paul-moore.com> Acked-by: Serge Hallyn <serge@hallyn.com>
2017-08-01	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller
	Two minor conflicts in virtio_net driver (bug fix overlapping addition of a helper) and MAINTAINERS (new driver edit overlapping revamp of PHY entry). Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-01	drm: Create a format/modifier blob	Ben Widawsky
	Updated blob layout (Rob, Daniel, Kristian, xerpi) v2: * Removed __packed, and alignment (.+) * Fix indent in drm_format_modifier fields (Liviu) * Remove duplicated modifier > 64 check (Liviu) * Change comment about modifier (Liviu) * Remove arguments to blob creation, use plane instead (Liviu) * Fix data types (Ben) * Make the blob part of uapi (Daniel) v3: Remove unused ret field. Change i, and j to unsigned int (Emil) v4: Use plane->modifier_count instead of recounting (Daniel) v5: Rename modifiers to modifiers_property (Ville) Use sizeof(__u32) instead to reflect UAPI nature (Ville) Make BUILD_BUG_ON for blob header size Cc: Rob Clark <robdclark@gmail.com> Cc: Kristian H. Kristensen <hoegsberg@gmail.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Daniel Stone <daniels@collabora.com> (v2) Reviewed-by: Liviu Dudau <liviu@dudau.co.uk> (v2) Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> (v3) Signed-off-by: Daniel Stone <daniels@collabora.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170724034641.13369-2-ben@bwidawsk.net
2017-08-01	drm: Plumb modifiers through plane init	Ben Widawsky
	This is the plumbing for supporting fb modifiers on planes. Modifiers have already been introduced to some extent, but this series will extend this to allow querying modifiers per plane. Based on this, the client to enable optimal modifications for framebuffers. This patch simply allows the DRM drivers to initialize their list of supported modifiers upon initializing the plane. v2: A minor addition from Daniel v3: * Updated commit message * s/INVALID/DRM_FORMAT_MOD_INVALID (Liviu) * Remove some excess newlines (Liviu) * Update comment for > 64 modifiers (Liviu) v4: Minor comment adjustments (Liviu) v5: Some new platforms added due to rebase v6: Add some missed plane inits (or maybe they're new - who knows at this point) (Daniel) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Daniel Stone <daniels@collabora.com> (v2) Reviewed-by: Liviu Dudau <Liviu.Dudau@arm.com> Signed-off-by: Daniel Stone <daniels@collabora.com>
2017-08-01	fbdev: Nuke FBINFO_MODULE	Daniel Vetter
	Instead check info->ops->owner, which amounts to the same. Spotted because I want to remove the pile of broken and cargo-culted fb_info->flags assignments in drm drivers. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Sean Paul <seanpaul@chromium.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
2017-08-01	fbcon: Make fbcon a built-time depency for fbdev	Daniel Vetter
	There's a bunch of folks who're trying to make printk less contended and faster, but there's a problem: printk uses the console_lock, and the console lock has become the BKL for all things fbdev/fbcon, which in turn pulled in half the drm subsystem under that lock. That's awkward. There reasons for that is probably just a historical accident: - fbcon is a runtime option of fbdev, i.e. at runtime you can pick whether your fbdev driver instances are used as kernel consoles. Unfortunately this wasn't implemented with some module option, but through some module loading magic: As long as you don't load fbcon.ko, there's no fbdev console support, but loading it (in any order wrt fbdev drivers) will create console instances for all fbdev drivers. - This was implemented through a notifier chain. fbcon.ko enumerates all fbdev instances at load time and also registers itself as listener in the fbdev notifier. The fbdev core tries to register new fbdev instances with fbcon using the notifier. - On top of that the modifier chain is also used at runtime by the fbdev subsystem to e.g. control backlights for panels. - The problem is that the notifier puts a mutex locking context between fbdev and fbcon, which mixes up the locking contexts for both the runtime usage and the register time usage to notify fbcon. And at runtime fbcon (through the fbdev core) might call into the notifier from a printk critical section while console_lock is held. - This means console_lock must be an outer lock for the entire fbdev subsystem, which also means it must be acquired when registering a new framebuffer driver as the outermost lock since we might call into fbcon (through the notifier) which would result in a locking inversion if fbcon would acquire the console_lock from its notifier callback (which it needs to register the console). - console_lock can be held anywhere, since printk can be called anywhere, and through the above story, plus drm/kms being an fbdev driver, we pull in a shocking amount of locking hiercharchy underneath the console_lock. Which makes cleaning up printk really hard (not even splitting console_lock into an rwsem is all that useful due to this). There's various ways to address this, but the cleanest would be to make fbcon a compile-time option, where fbdev directly calls the fbcon register functions from register_framebuffer, or dummy static inline versions if fbcon is disabled. Maybe augmented with a runtime knob to disable fbcon, if that's needed (for debugging perhaps). But this could break some users who rely on the magic "loading fbcon.ko enables/disables fbdev framebuffers at runtime" thing, even if that's unlikely. Hence we must be careful: 1. Create a compile-time dependency between fbcon and fbdev in the least minimal way. This is what this patch does. 2. Wait at least 1 year to give possible users time to scream about how we broke their setup. Unlikely, since all distros make fbcon compile-in, and embedded platforms only compile stuff they know they need anyway. But still. 3. Convert the notifier to direct functions calls, with dummy static inlines if fbcon is disabled. We'll still need the fb notifier for the other uses (like backlights), but we can probably move it into the fb core (atm it must be built-into vmlinux). 4. Push console_lock down the call-chain, until it is down in console_register again. 5. Finally start to clean up and rework the printk/console locking. For context of this saga see commit 50e244cc793d511b86adea24972f3a7264cae114 Author: Alan Cox <alan@linux.intel.com> Date: Fri Jan 25 10:28:15 2013 +1000 fb: rework locking to fix lock ordering on takeover plus the pile of commits on top that tried to make this all work without terminally upsetting lockdep. We've uncovered all this when console_lock lockdep annotations where added in commit daee779718a319ff9f83e1ba3339334ac650bb22 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Sat Sep 22 19:52:11 2012 +0200 console: implement lockdep support for console_lock On the patch itself: - Switch CONFIG_FRAMEBUFFER_CONSOLE to be a boolean, using the overall CONFIG_FB tristate to decided whether it should be a module or built-in. - At first I thought I could force the build depency with just a dummy symbol that fbcon.ko exports and fb.ko uses. But that leads to a module depency cycle (it works fine when built-in). Since this tight binding is the entire goal the simplest solution is to move all the fbcon modules (and there's a bunch of optinal source-files which are each modules of their own, for no good reason) into the overall fb.ko core module. That's a bit more than what I would have liked to do in this patch, but oh well. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Sean Paul <seanpaul@chromium.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
2017-08-01	libceph: make RECOVERY_DELETES feature create a new interval	Ilya Dryomov
	This is needed so that the OSDs can regenerate the missing set at the start of a new interval where support for recovery deletes changed. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2017-08-01	libceph: fallback for when there isn't a pool-specific choose_arg	Ilya Dryomov
	There is now a fallback to a choose_arg index of -1 if there isn't a pool-specific choose_arg set. If you create a per-pool weight-set, that works for that pool. Otherwise we try the compat/default one. If that doesn't exist either, then we use the normal CRUSH weights. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2017-08-01	Merge remote-tracking branches 'asoc/fix/dpcm', 'asoc/fix/imx', ↵	Mark Brown
	'asoc/fix/msm8916', 'asoc/fix/multi-pcm', 'asoc/fix/of-graph' and 'asoc/fix/pxa' into asoc-linus
2017-08-01	mm: remove optimizations based on i_size in mapping writeback waits	Jeff Layton
	Marcelo added this i_size based optimization with a patch in 2004 (commitid is from the linux-history tree): commit 765dad09b4ac101a32d87af2bb793c3060497d3c Author: Marcelo Tosatti <marcelo.tosatti@cyclades.com> Date: Tue Sep 7 17:51:17 2004 -0700 small wait_on_page_writeback_range() optimization filemap_fdatawait() calls wait_on_page_writeback_range() with -1 as "end" parameter. This is not needed since we know the EOF from the inode. Use that instead. There may be races here, particularly with clustered or network filesystems. It also seems like a bit of a layering violation since we're operating on an address_space here, not an inode. Finally, it's also questionable whether this optimization really helps on workloads that we care about. Should we be optimizing for writeback vs. truncate races in a codepath where we expect to wait anyway? It doesn't seem worth the risk. Remove this optimization from the filemap_fdatawait codepaths. This means that filemap_fdatawait becomes a trivial wrapper around filemap_fdatawait_range. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jeff Layton <jlayton@redhat.com>
2017-08-01	futex: Allow for compiling out PI support	Nicolas Pitre
	This makes it possible to preserve basic futex support and compile out the PI support when RT mutexes are not available. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Darren Hart <dvhart@infradead.org> Link: http://lkml.kernel.org/r/alpine.LFD.2.20.1708010024190.5981@knanqh.ubzr
2017-08-01	cpufreq: Process remote callbacks from any CPU if the platform permits	Viresh Kumar
	On many platforms, CPUs can do DVFS across cpufreq policies. i.e CPU from policy-A can change frequency of CPUs belonging to policy-B. This is quite common in case of ARM platforms where we don't configure any per-cpu register. Add a flag to identify such platforms and update cpufreq_can_do_remote_dvfs() to allow remote callbacks if this flag is set. Also enable the flag for cpufreq-dt driver which is used only on ARM platforms currently. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-08-01	sched: cpufreq: Allow remote cpufreq callbacks	Viresh Kumar
	With Android UI and benchmarks the latency of cpufreq response to certain scheduling events can become very critical. Currently, callbacks into cpufreq governors are only made from the scheduler if the target CPU of the event is the same as the current CPU. This means there are certain situations where a target CPU may not run the cpufreq governor for some time. One testcase to show this behavior is where a task starts running on CPU0, then a new task is also spawned on CPU0 by a task on CPU1. If the system is configured such that the new tasks should receive maximum demand initially, this should result in CPU0 increasing frequency immediately. But because of the above mentioned limitation though, this does not occur. This patch updates the scheduler core to call the cpufreq callbacks for remote CPUs as well. The schedutil, ondemand and conservative governors are updated to process cpufreq utilization update hooks called for remote CPUs where the remote CPU is managed by the cpufreq policy of the local CPU. The intel_pstate driver is updated to always reject remote callbacks. This is tested with couple of usecases (Android: hackbench, recentfling, galleryfling, vellamo, Ubuntu: hackbench) on ARM hikey board (64 bit octa-core, single policy). Only galleryfling showed minor improvements, while others didn't had much deviation. The reason being that this patch only targets a corner case, where following are required to be true to improve performance and that doesn't happen too often with these tests: - Task is migrated to another CPU. - The task has high demand, and should take the target CPU to higher OPPs. - And the target CPU doesn't call into the cpufreq governor until the next tick. Based on initial work from Steve Muckle. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-08-01	ACPI / PCI / PM: Rework acpi_pci_propagate_wakeup()	Rafael J. Wysocki
	The acpi_pci_propagate_wakeup() routine is there to handle cases in which PCI bridges (or PCIe ports) are expected to signal wakeup for devices below them, but currently it doesn't do that correctly. The problem is that acpi_pci_propagate_wakeup() uses acpi_pm_set_device_wakeup() for bridges and if that routine is called for multiple times to disable wakeup for the same device, it will disable it on the first invocation and the next calls will have no effect (it works analogously when called to enable wakeup, but that is not a problem). Now, say acpi_pci_propagate_wakeup() has been called for two different devices under the same bridge and it has called acpi_pm_set_device_wakeup() for that bridge each time. The bridge is now enabled to generate wakeup signals. Next, suppose that one of the devices below it resumes and acpi_pci_propagate_wakeup() is called to disable wakeup for that device. It will then call acpi_pm_set_device_wakeup() for the bridge and that will effectively disable remote wakeup for all devices under it even though some of them may still be suspended and remote wakeup may be expected to work for them. To address this (arguably theoretical) issue, allow wakeup.enable_count under struct acpi_device to grow beyond 1 in certain situations. In particular, allow that to happen in acpi_pci_propagate_wakeup() when wakeup is enabled or disabled for PCI bridges, so that wakeup is actually disabled for the bridge when all devices under it resume and not when just one of them does that. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-08-01	ACPI / PM: Split acpi_device_wakeup()	Rafael J. Wysocki
	To prepare for a subsequent change and make the code somewhat easier to follow, do the following in the ACPI device wakeup handling code: * Replace wakeup.flags.enabled under struct acpi_device with wakeup.enable_count as that will be necessary going forward. For now, wakeup.enable_count is not allowed to grow beyond 1, so the current behavior is retained. * Split acpi_device_wakeup() into acpi_device_wakeup_enable() and acpi_device_wakeup_disable() and modify the callers of it accordingly. * Introduce a new acpi_wakeup_lock mutex to protect the wakeup enabling/disabling code from races in case it is executed more than once in parallel for the same device (which may happen for bridges theoretically). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2017-08-01	HID: Remove the semaphore driver_lock	Binoy Jayan
	The semaphore 'driver_lock' is used as a simple mutex, and also unnecessary as suggested by Arnd. Hence removing it, as the concurrency between the probe and remove is already handled in the driver core. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Binoy Jayan <binoy.jayan@linaro.org> Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-07-31	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds
	Pull networking fixes from David Miller: 1) Handle notifier registry failures properly in tun/tap driver, from Tonghao Zhang. 2) Fix bpf verifier handling of subtraction bounds and add a testcase for this, from Edward Cree. 3) Increase reset timeout in ftgmac100 driver, from Ben Herrenschmidt. 4) Fix use after free in prd_retire_rx_blk_timer_exired() in AF_PACKET, from Cong Wang. 5) Fix SElinux regression due to recent UDP optimizations, from Paolo Abeni. 6) We accidently increment IPSTATS_MIB_FRAGFAILS in the ipv6 code paths, fix from Stefano Brivio. 7) Fix some mem leaks in dccp, from Xin Long. 8) Adjust MDIO_BUS kconfig deps to avoid build errors, from Arnd Bergmann. 9) Mac address length check and buffer size fixes from Cong Wang. 10) Don't leak sockets in ipv6 udp early demux, from Paolo Abeni. 11) Fix return value when copy_from_user() fails in bpf_prog_get_info_by_fd(), from Daniel Borkmann. 12) Handle PHY_HALTED properly in phy library state machine, from Florian Fainelli. 13) Fix OOPS in fib_sync_down_dev(), from Ido Schimmel. 14) Fix truesize calculation in virtio_net which led to performance regressions, from Michael S Tsirkin. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits) samples/bpf: fix bpf tunnel cleanup udp6: fix jumbogram reception ppp: Fix a scheduling-while-atomic bug in del_chan Revert "net: bcmgenet: Remove init parameter from bcmgenet_mii_config" virtio_net: fix truesize for mergeable buffers mv643xx_eth: fix of_irq_to_resource() error check MAINTAINERS: Add more files to the PHY LIBRARY section ipv4: fib: Fix NULL pointer deref during fib_sync_down_dev() net: phy: Correctly process PHY_HALTED in phy_stop_machine() sunhme: fix up GREG_STAT and GREG_IMASK register offsets bpf: fix bpf_prog_get_info_by_fd to dump correct xlated_prog_len tcp: avoid bogus gcc-7 array-bounds warning net: tc35815: fix spelling mistake: "Intterrupt" -> "Interrupt" bpf: don't indicate success when copy_from_user fails udp6: fix socket leak on early demux net: thunderx: Fix BGX transmit stall due to underflow Revert "vhost: cache used event for better performance" team: use a larger struct for mac address net: check dev->addr_len for dev_set_mac_address() phy: bcm-ns-usb3: fix MDIO_BUS dependency ...
2017-07-31	udp6: fix jumbogram reception	Paolo Abeni
	Since commit 67a51780aebb ("ipv6: udp: leverage scratch area helpers") udp6_recvmsg() read the skb len from the scratch area, to avoid a cache miss. But the UDP6 rx path support RFC 2675 UDPv6 jumbograms, and their length exceeds the 16 bits available in the scratch area. As a side effect the length returned by recvmsg() is: <ingress datagram len> % (1<<16) This commit addresses the issue allocating one more bit in the IP6CB flags field and setting it for incoming jumbograms. Such field is still in the first cacheline, so at recvmsg() time we can check it and fallback to access skb->len if required, without a measurable overhead. Fixes: 67a51780aebb ("ipv6: udp: leverage scratch area helpers") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-31	tcp: add related fields into SCM_TIMESTAMPING_OPT_STATS	Wei Wang
	Add the following stats into SCM_TIMESTAMPING_OPT_STATS control msg: TCP_NLA_PACING_RATE TCP_NLA_DELIVERY_RATE TCP_NLA_SND_CWND TCP_NLA_REORDERING TCP_NLA_MIN_RTT TCP_NLA_RECUR_RETRANS TCP_NLA_DELIVERY_RATE_APP_LMT Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-31	f2fs: support project quota	Chao Yu
	This patch adds to support plain project quota. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-31	f2fs: enhance on-disk inode structure scalability	Chao Yu
	This patch add new flag F2FS_EXTRA_ATTR storing in inode.i_inline to indicate that on-disk structure of current inode is extended. In order to extend, we changed the inode structure a bit: Original one: struct f2fs_inode { ... struct f2fs_extent i_ext; __le32 i_addr[DEF_ADDRS_PER_INODE]; __le32 i_nid[DEF_NIDS_PER_INODE]; } Extended one: struct f2fs_inode { ... struct f2fs_extent i_ext; union { struct { __le16 i_extra_isize; __le16 i_padding; __le32 i_extra_end[0]; }; __le32 i_addr[DEF_ADDRS_PER_INODE]; }; __le32 i_nid[DEF_NIDS_PER_INODE]; } Once F2FS_EXTRA_ATTR is set, we will steal four bytes in the head of i_addr field for storing i_extra_isize and i_padding. with i_extra_isize, we can calculate actual size of reserved space in i_addr, available attribute fields included in total extra attribute fields for current inode can be described as below: +--------------------+ \| .i_mode \| \| ... \| \| .i_ext \| +--------------------+ \| .i_extra_isize \|-----+ \| .i_padding \| \| \| .i_prjid \| \| \| .i_atime_extra \| \| \| .i_ctime_extra \| \| \| .i_mtime_extra \|<----+ \| .i_inode_cs \|<----- store blkaddr/inline from here \| .i_xattr_cs \| \| ... \| +--------------------+ \| \| \| block address \| \| \| +--------------------+ \| .i_nid \| +--------------------+ \| node_footer \| \| (nid, ino, offset) \| +--------------------+ Hence, with this patch, we would enhance scalability of f2fs inode for storing more newly added attribute. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-31	f2fs: make max inline size changeable	Chao Yu
	This patch tries to make below macros calculating max inline size, inline dentry field size considerring reserving size-changeable space: - MAX_INLINE_DATA - NR_INLINE_DENTRY - INLINE_DENTRY_BITMAP_SIZE - INLINE_RESERVED_SIZE Then, when inline_{data,dentry} options is enabled, it allows us to reserve inline space with different size flexibly for adding newly introduced inode attribute. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-31	mm: add file_fdatawait_range and file_write_and_wait	Jeff Layton
	Necessary now for gfs2_fsync and sync_file_range, but there will eventually be other callers. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jeff Layton <jlayton@redhat.com>
2017-07-31	net: phy: mdio-bcm-unimac: Allow specifying platform data	Florian Fainelli
	In preparation for having the bcmgenet driver migrate over the mdio-bcm-unimac driver, add a platform data structure which allows passing integrating specific details like bus name, wait function to complete MDIO operations and PHY mask. We also define what the platform device name contract is by defining UNIMAC_MDIO_DRV_NAME and moving it to the platform_data header. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-31	tcp: remove unused mib counters	Florian Westphal
	was used by tcp prequeue and header prediction. TCPFORWARDRETRANS use was removed in january. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-31	tcp: remove CA_ACK_SLOWPATH	Florian Westphal
	re-indent tcp_ack, and remove CA_ACK_SLOWPATH; it is always set now. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-31	tcp: remove header prediction	Florian Westphal
	Like prequeue, I am not sure this is overly useful nowadays. If we receive a train of packets, GRO will aggregate them if the headers are the same (HP predates GRO by several years) so we don't get a per-packet benefit, only a per-aggregated-packet one. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-31	tcp: remove low_latency sysctl	Florian Westphal
	Was only checked by the removed prequeue code. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>