summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2016-02-23scsi_dh_alua: Use separate alua_port_group structureHannes Reinecke
The port group needs to be a separate structure as several LUNs might belong to the same group. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-02-23scsi: Export function scsi_scan.c:sanitize_inquiry_stringDon Brace
The hpsa driver uses this function to cleanup inquiry data. Our new pqi driver will also use this function. This function was copied into both drivers. This patch exports sanitize_inquiry_string so the hpsa and the pqi drivers can use this function directly. Suggested-by: Hannes Reinecke <hare@suse.de> Suggested-by: Matthew R. Ochs mrochs@linux.vnet.ibm.com Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com> Reviewed-by: Justin Lindley <justin.lindley@pmcs.com> Reviewed-by: Scott Teel <scott.teel@pmcs.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Don Brace <don.brace@pmcs.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-02-23scsi_transport_iscsi: Add 25G and 40G speed definitionJitendra Bhivare
iscsi_port_speed and iscsi_port_speed_names have new entries for 25Gbps and 40Gbps link speeds. Signed-off-by: Jitendra Bhivare <jitendra.bhivare@avagotech.com> Reviewed-by: Hannes Reinicke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-02-23nfit: update address range scrub commands to the acpi 6.1 formatDan Williams
The original format of these commands from the "NVDIMM DSM Interface Example" [1] are superseded by the ACPI 6.1 definition of the "NVDIMM Root Device _DSMs" [2]. [1]: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf [2]: http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf "9.20.7 NVDIMM Root Device _DSMs" Changes include: 1/ New 'restart' fields in ars_status, unfortunately these are implemented in the middle of the existing definition so this change is not backwards compatible. The expectation is that shipping platforms will only ever support the ACPI 6.1 definition. 2/ New status values for ars_start ('busy') and ars_status ('overflow'). Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Linda Knippers <linda.knippers@hpe.com> Cc: <stable@vger.kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-02-23Merge tag 'nfs-for-4.5-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client bugfixes from Trond Myklebust: "Stable bugfixes: - Fix nfs_size_to_loff_t - NFSv4: Fix a dentry leak on alias use Other bugfixes: - Don't schedule a layoutreturn if the layout segment can be freed immediately. - Always set NFS_LAYOUT_RETURN_REQUESTED with lo->plh_return_iomode - rpcrdma_bc_receive_call() should init rq_private_buf.len - fix stateid handling for the NFS v4.2 operations - pnfs/blocklayout: fix a memeory leak when using,vmalloc_to_page - fix panic in gss_pipe_downcall() in fips mode - Fix a race between layoutget and pnfs_destroy_layout - Fix a race between layoutget and bulk recalls" * tag 'nfs-for-4.5-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4.x/pnfs: Fix a race between layoutget and bulk recalls NFSv4.x/pnfs: Fix a race between layoutget and pnfs_destroy_layout auth_gss: fix panic in gss_pipe_downcall() in fips mode pnfs/blocklayout: fix a memeory leak when using,vmalloc_to_page nfs4: fix stateid handling for the NFS v4.2 operations NFSv4: Fix a dentry leak on alias use xprtrdma: rpcrdma_bc_receive_call() should init rq_private_buf.len pNFS: Always set NFS_LAYOUT_RETURN_REQUESTED with lo->plh_return_iomode pNFS: Fix pnfs_mark_matching_lsegs_return() nfs: fix nfs_size_to_loff_t
2016-02-23net: dsa: pass bridge down to driversVivien Didelot
Some DSA drivers may or may not support multiple software bridges on top of an hardware switch. It is more convenient for them to access the bridge's net_device for finer configuration. Removing the need to craft and access a bitmask also simplifies the code. This patch changes the signature of bridge related functions, update DSA drivers, and removes dsa_slave_br_port_mask. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-236lowpan: iphc: add support for stateful compressionAlexander Aring
This patch introduce support for IPHC stateful address compression. It will offer the context table via one debugfs entry. This debugfs has and directory for each cid entry for the context table. Inside each cid directory there exists the following files: - "active": If the entry is added or deleted. The context table is original a list implementation, this flag will indicate if the context is part of list or not. - "prefix": The ipv6 prefix. - "prefix_length": The prefix length for the prefix. - "compression": The compression flag according RFC6775. This part should be moved into sysfs after some testing time. Also the debugfs entry contains a "show" file which is a pretty-printout for the current context table information. Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: Alexander Aring <aar@pengutronix.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-02-23mac802154: fix mac header length checkAlexander Aring
I got report about that sometimes the WARN_ON occurs there which should never happen. I came to the conclusion that the mac header is there but inside the headroom of skb. The skb->len information doesn't contain the information about the headroom length and skb->len is lesser than two. We check now if the skb_mac_header pointer is set and the room between mac header pointer and tail pointer. Signed-off-by: Alexander Aring <aar@pengutronix.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-02-23Bluetooth: add LED trigger for indicating HCI is powered upHeiner Kallweit
Add support for LED triggers to the Bluetooth subsystem and add kernel config symbol BT_LEDS for it. For now one trigger for indicating "HCI is powered up" is supported. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-02-23cgroup: convert cgroup_subsys flag fields to bool bitfieldsTejun Heo
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org>
2016-02-23gpio: Add devm_ apis for gpiochip_add_data and gpiochip_removeLaxman Dewangan
Add device managed APIs devm_gpiochip_add_data() and devm_gpiochip_remove() for the APIs gpiochip_add_data() and gpiochip_remove(). This helps in reducing code in error path and sometimes removal of .remove callback for driver unbind. Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
2016-02-23clk: samsung: exynos5433: Fix typos in *_ISP_MPWM clock namesSylwester Nawrocki
This fixes "MPWM" -> "WPWM" typo in 3 *ISP_MWPM clock definitions. Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
2016-02-23[media] media_device: move allocation out of media_device_*_initMauro Carvalho Chehab
Right now, media_device_pci_init and media_device_usb_init does media_device allocation internaly. That preents its usage when the media_device struct is embedded on some other structure. Move memory allocation outside it, to make it more generic. Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2016-02-23[media] media-device: move PCI/USB helper functions from v4l2-mcMauro Carvalho Chehab
Those ancillary functions could be called even when compiled without V4L2 support, as warned by ktest build robot: All errors (new ones prefixed by >>): >> ERROR: "__v4l2_mc_usb_media_device_init" [drivers/media/usb/dvb-usb/dvb-usb.ko] undefined! >> ERROR: "__v4l2_mc_usb_media_device_init" [drivers/media/usb/dvb-usb-v2/dvb_usb_v2.ko] undefined! >> ERROR: "__v4l2_mc_usb_media_device_init" [drivers/media/usb/au0828/au0828.ko] undefined! Also, there's nothing there that are specific to V4L2. So, move those ancillary functions to MC core. No functional changes. Just function rename. Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2016-02-23ALSA: jack: Allow building the jack layer without input deviceTakashi Iwai
Since the recent integration of kctl jack and input jack layers, we can basically build the jack layer even without input devices. That is, the jack layer itself can be built with conditional to enable the input device support or not, while the users may enable always CONFIG_SND_JACK unconditionally. For achieving it, this patch changes the following: - A new Kconfig, CONFIG_SND_JACK_INPUT_DEV, was introduced to indicate whether the jack layer supports the input device, - A few items in snd_jack struct and relevant codes are conditionally built upon CONFIG_SND_JACK_INPUT_DEV, - The users of CONFIG_SND_JACK drop the messy dependency on CONFIG_INPUT. This change also automagically fixes a potential bug in HD-audio driver Arnd reported, where the NULL or uninitialized jack instance is dereferenced. Reported-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2016-02-22f2fs: trace old block address for CoWed pageChao Yu
This patch enables to trace old block address of CoWed page for better debugging. f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4f0, oldaddr = 0xfe8ab, newaddr = 0xfee90 rw = WRITE_SYNC, type = NODE f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4f8, oldaddr = 0xfe8b0, newaddr = 0xfee91 rw = WRITE_SYNC, type = NODE f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4fa, oldaddr = 0xfe8ae, newaddr = 0xfee92 rw = WRITE_SYNC, type = NODE f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x96, oldaddr = 0xf049b, newaddr = 0x2bbe rw = WRITE, type = DATA f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x97, oldaddr = 0xf049c, newaddr = 0x2bbf rw = WRITE, type = DATA f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x98, oldaddr = 0xf049d, newaddr = 0x2bc0 rw = WRITE, type = DATA f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x47, oldaddr = 0xffffffff, newaddr = 0xf2631 rw = WRITE, type = DATA f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x48, oldaddr = 0xffffffff, newaddr = 0xf2632 rw = WRITE, type = DATA f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x49, oldaddr = 0xffffffff, newaddr = 0xf2633 rw = WRITE, type = DATA Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-22f2fs: introduce f2fs_journal struct to wrap journal infoChao Yu
Introduce a new structure f2fs_journal to wrap journal info in struct f2fs_summary_block for readability. struct f2fs_journal { union { __le16 n_nats; __le16 n_sits; }; union { struct nat_journal nat_j; struct sit_journal sit_j; struct f2fs_extra_info info; }; } __packed; struct f2fs_summary_block { struct f2fs_summary entries[ENTRIES_IN_SUM]; struct f2fs_journal journal; struct summary_footer footer; } __packed; Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/phy/bcm7xxx.c drivers/net/phy/marvell.c drivers/net/vxlan.c All three conflicts were cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-22jbd2: unify revoke and tag block checksum handlingJan Kara
Revoke and tag descriptor blocks are just different kinds of descriptor blocks and thus have checksum in the same place. Unify computation and checking of checksums for these. Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-02-22jbd2: factor out common descriptor block initializationJan Kara
Descriptor block header is initialized in several places. Factor out the common code into jbd2_journal_get_descriptor_buffer(). Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-02-22jbd2: remove unnecessary arguments of jbd2_journal_write_revoke_recordsJan Kara
jbd2_journal_write_revoke_records() takes journal pointer and write_op, although journal can be obtained from the passed transaction and write_op is always WRITE_SYNC. Remove these superfluous arguments. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-02-22mbcache: add reusable flag to cache entriesAndreas Gruenbacher
To reduce amount of damage caused by single bad block, we limit number of inodes sharing an xattr block to 1024. Thus there can be more xattr blocks with the same contents when there are lots of files with the same extended attributes. These xattr blocks naturally result in hash collisions and can form long hash chains and we unnecessarily check each such block only to find out we cannot use it because it is already shared by too many inodes. Add a reusable flag to cache entries which is cleared when a cache entry has reached its maximum refcount. Cache entries which are not marked reusable are skipped by mb_cache_entry_find_{first,next}. This significantly speeds up mbcache when there are many same xattr blocks. For example for xattr-bench with 5 values and each process handling 20000 files, the run for 64 processes is 25x faster with this patch. Even for 8 processes the speedup is almost 3x. We have also verified that for situations where there is only one xattr block of each kind, the patch doesn't have a measurable cost. [JK: Remove handling of setting the same value since it is not needed anymore, check for races in e_reusable setting, improve changelog, add measurements] Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-02-22mbcache: get rid of _e_hash_list_headAndreas Gruenbacher
Get rid of field _e_hash_list_head in cache entries and add bit field e_referenced instead. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-02-22mbcache2: rename to mbcacheJan Kara
Since old mbcache code is gone, let's rename new code to mbcache since number 2 is now meaningless. This is just a mechanical replacement. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-02-22dm: rename target's per_bio_data_size to per_io_data_sizeMike Snitzer
Request-based DM will also make use of per_bio_data_size. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-02-22cgroup: make cgroup subsystem masks u16Tejun Heo
After the recent do_each_subsys_mask() conversion, there's no reason to use ulong for subsystem masks. We'll be adding more subsystem masks to persistent data structures, let's reduce its size to u16 which should be enough for now and the foreseeable future. This doesn't create any noticeable behavior differences. v2: Johannes spotted that the initial patch missed cgroup_no_v1_mask. Converted. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
2016-02-22cgroup: s/child_subsys_mask/subtree_ss_mask/Tejun Heo
For consistency with cgroup->subtree_control. * cgroup->child_subsys_mask -> cgroup->subtree_ss_mask * cgroup_calc_child_subsys_mask() -> cgroup_calc_subtree_ss_mask() * cgroup_refresh_child_subsys_mask() -> cgroup_refresh_subtree_ss_mask() No functional changes. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
2016-02-22Revert "cgroup: add cgroup_subsys->css_e_css_changed()"Tejun Heo
This reverts commit 56c807ba4e91f0980567b6a69de239677879b17f. cgroup_subsys->css_e_css_changed() was supposed to be used by cgroup writeback support; however, the change to per-inode cgroup association made it unnecessary and the callback doesn't have any user. Remove it. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
2016-02-22f2fs: support revoking atomic written pagesChao Yu
f2fs support atomic write with following semantics: 1. open db file 2. ioctl start atomic write 3. (write db file) * n 4. ioctl commit atomic write 5. close db file With this flow we can avoid file becoming corrupted when abnormal power cut, because we hold data of transaction in referenced pages linked in inmem_pages list of inode, but without setting them dirty, so these data won't be persisted unless we commit them in step 4. But we should still hold journal db file in memory by using volatile write, because our semantics of 'atomic write support' is incomplete, in step 4, we could fail to submit all dirty data of transaction, once partial dirty data was committed in storage, then after a checkpoint & abnormal power-cut, db file will be corrupted forever. So this patch tries to improve atomic write flow by adding a revoking flow, once inner error occurs in committing, this gives another chance to try to revoke these partial submitted data of current transaction, it makes committing operation more like aotmical one. If we're not lucky, once revoking operation was failed, EAGAIN will be reported to user for suggesting doing the recovery with held journal file, or retrying current transaction again. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-22f2fs: preallocate blocks for buffered aio writesJaegeuk Kim
This patch preallocates data blocks for buffered aio writes. With this patch, we can avoid redundant locking and unlocking of node pages given consecutive aio request. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-22f2fs: fix endianness of on-disk summary_footerSheng Yong
Signed-off-by: Sheng Yong <shengyong1@huawei.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-22f2fs: remove unneeded pointer conversionChao Yu
There are redundant pointer conversion in following call stack: - at position a, inode was been converted to f2fs_file_info. - at position b, f2fs_file_info was been converted to inode again. - truncate_blocks(inode,..) - fi = F2FS_I(inode) ---a - ADDRS_PER_PAGE(node_page, fi) - addrs_per_inode(fi) - inode = &fi->vfs_inode ---b - f2fs_has_inline_xattr(inode) - fi = F2FS_I(inode) - is_inode_flag_set(fi,..) In order to avoid unneeded conversion, alter ADDRS_PER_PAGE and addrs_per_inode to acept parameter with type of inode pointer. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-22f2fs: introduce lifetime write IO statisticsShuoran Liu
This patch introduces lifetime IO write statistics exposed to the sysfs interface. The write IO amount is obtained from block layer, accumulated in the file system and stored in the hot node summary of checkpoint. Signed-off-by: Shuoran Liu <liushuoran@huawei.com> Signed-off-by: Pengyang Hou <houpengyang@huawei.com> [Jaegeuk Kim: add sysfs documentation] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-22mbcache2: Use referenced bit instead of LRUJan Kara
Currently we maintain perfect LRU list by moving entry to the tail of the list when it gets used. However these operations on cache-global list are relatively expensive. In this patch we switch to lazy updates of LRU list. Whenever entry gets used, we set a referenced bit in it. When reclaiming entries, we give referenced entries another round in the LRU. Since the list is not a real LRU anymore, rename it to just 'list'. In my testing this logic gives about 30% boost to workloads with mostly unique xattr blocks (e.g. xattr-bench with 10 files and 10000 unique xattr values). Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-02-22vfio/pci: Intel IGD host and LCP bridge config space accessAlex Williamson
Provide read-only access to PCI config space of the PCI host bridge and LPC bridge through device specific regions. This may be used to configure a VM with matching register contents to satisfy driver requirements. Providing this through the vfio file descriptor removes an additional userspace requirement for access through pci-sysfs and removes the CAP_SYS_ADMIN requirement that doesn't appear to apply to the specific devices we're accessing. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-22vfio/pci: Intel IGD OpRegion supportAlex Williamson
This is the first consumer of vfio device specific resource support, providing read-only access to the OpRegion for Intel graphics devices. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-22vfio: Define device specific region type capabilityAlex Williamson
To this point vfio has only provided an interface to the user that allows them to determine the number of regions and specifics about each region. What the region represents is left to the vfio bus driver. vfio-pci chooses to use fixed indexes for fixed resources, index 0 is BAR0, 1 is BAR1,... 7 is config space, etc. This works pretty well since all PCI devices have these regions, even if they don't necessarily populate all of them. Then we start to add things like VGA, which only certain device even support. We added this the same way, but now we've wasted a region index, and due to our offset implementation the corresponding address space, for all devices. Rather than continuing that process, let's try to make regions self describing by including a capability that defines their type. For vfio-pci we'll make the current VFIO_PCI_NUM_REGIONS fixed, defining the end of the static indexes and the beginning of self describing regions. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-22vfio: Define sparse mmap capability for regionsAlex Williamson
We can't always support mmap across an entire device region, for example we deny mmaps covering the MSI-X table of PCI devices, but we don't really have a way to report it. We expect the user to implicitly know this restriction. We also can't split the region because vfio-pci defines an API with fixed region index to BAR number mapping. We therefore define a new capability which lists areas within the region that may be mmap'd. In addition to the MSI-X case, this potentially enables in-kernel emulation and extensions to devices. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-22vfio: Add capability chain helpersAlex Williamson
Allow sub-modules to easily reallocate a buffer for managing capability chains for info ioctls. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-22vfio: Define capability chainsAlex Williamson
We have a few cases where we need to extend the data returned from the INFO ioctls in VFIO. For instance we already have devices exposed through vfio-pci where VFIO_DEVICE_GET_REGION_INFO reports the region as mmap-capable, but really only supports sparse mmaps, avoiding the MSI-X table. If we wanted to provide in-kernel emulation or extended functionality for devices, we'd also want the ability to tell the user not to mmap various regions, rather than forcing them to figure it out on their own. Another example is VFIO_IOMMU_GET_INFO. We'd really like to expose the actual IOVA capabilities of the IOMMU rather than letting the user assume the address space they have available to them. We could add IOVA base and size fields to struct vfio_iommu_type1_info, but what if we have multiple IOVA ranges. For instance x86 uses a range of addresses at 0xfee00000 for MSI vectors. These typically are not available for standard DMA IOVA mappings and splits our available IOVA space into two ranges. POWER systems have both an IOVA window below 4G as well as dynamic data window which they can use to remap all of guest memory. Representing variable sized arrays within a fixed structure makes it very difficult to parse, we'd therefore like to put this data beyond fixed fields within the data structures. One way to do this is to emulate capabilities in PCI configuration space. A new flag indciates whether capabilties are supported and a new fixed field reports the offset of the first entry. Users can then walk the chain to find capabilities, adding capabilities does not require additional fields in the fixed structure, and parsing variable sized data becomes trivial. This patch outlines the theory and base header structure, which should be shared by all future users. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-22Merge branch 'multipath'Trond Myklebust
* multipath: NFS add callback_ops to nfs4_proc_bind_conn_to_session_callback pnfs/NFSv4.1: Add multipath capabilities to pNFS flexfiles servers over NFSv3 SUNRPC: Allow addition of new transports to a struct rpc_clnt NFSv4.1: nfs4_proc_bind_conn_to_session must iterate over all connections SUNRPC: Make NFS swap work with multipath SUNRPC: Add a helper to apply a function to all the rpc_clnt's transports SUNRPC: Allow caller to specify the transport to use SUNRPC: Use the multipath iterator to assign a transport to each task SUNRPC: Make rpc_clnt store the multipath iterators SUNRPC: Add a structure to track multiple transports SUNRPC: Make freeing of struct xprt rcu-safe SUNRPC: Uninline xprt_get(); It isn't performance critical. SUNRPC: Reorder rpc_task to put waitqueue related info in same cachelines SUNRPC: Remove unused function rpc_task_reset_client
2016-02-22Merge char-misc-next into staging-nextGreg Kroah-Hartman
This resolves the merge issues and confusions people were having with the goldfish drivers due to changes for them showing up in two different trees. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-22clk: ti: dpll: convert DPLL support code to use clk_hw instead of clk ptrsTero Kristo
Convert DPLL support code to use clk_hw pointers for reference and bypass clocks. This allows us to use clk_hw_* APIs for accessing any required parameters for these clocks, avoiding some locking problems at least with DPLL enable code; this used clk_get_rate which uses mutex but isn't good under clk_enable / clk_disable. Signed-off-by: Tero Kristo <t-kristo@ti.com> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2016-02-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: "Looks like a lot, but mostly driver fixes scattered all over as usual. Of note: 1) Add conditional sched in nf conntrack in cleanup to avoid NMI watchdogs. From Florian Westphal. 2) Fix deadlock in nfnetlink cttimeout, also from Floarian. 3) Fix handling of slaves in bonding ARP monitor validation, from Jay Vosburgh. 4) Callers of ip_cmsg_send() are responsible for freeing IP options, some were not doing so. Fix from Eric Dumazet. 5) Fix per-cpu bugs in mvneta driver, from Gregory CLEMENT. 6) Fix vlan handling in mv88e6xxx DSA driver, from Vivien Didelot. 7) bcm7xxx PHY driver bug fixes from Florian Fainelli. 8) Avoid unaligned accesses to protocol headers wrt. GRE, from Alexander Duyck. 9) SKB leaks and other problems in arc_emac driver, from Alexander Kochetkov. 10) tcp_v4_inbound_md5_hash() releases listener socket instead of request socket on error path, oops. Fix from Eric Dumazet. 11) Missing socket release in pppoe_rcv_core() that seems to have existed basically forever. From Guillaume Nault. 12) Missing slave_dev unregister in dsa_slave_create() error path, from Florian Fainelli. 13) crypto_alloc_hash() never returns NULL, fix return value check in __tcp_alloc_md5sig_pool. From Insu Yun. 14) Properly expire exception route entries in ipv4, from Xin Long. 15) Fix races in tcp/dccp listener socket dismantle, from Eric Dumazet. 16) Don't set IFF_TX_SKB_SHARING in vxlan, geneve, or GRE, it's not legal. These drivers modify the SKB on transmit. From Jiri Benc. 17) Fix regression in the initialziation of netdev->tx_queue_len. From Phil Sutter. 18) Missing unlock in tipc_nl_add_bc_link() error path, from Insu Yun. 19) SCTP port hash sizing does not properly ensure that table is a power of two in size. From Neil Horman. 20) Fix initializing of software copy of MAC address in fmvj18x_cs driver, from Ken Kawasaki" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (129 commits) bnx2x: Fix 84833 phy command handler bnx2x: Fix led setting for 84858 phy. bnx2x: Correct 84858 PHY fw version bnx2x: Fix 84833 RX CRC bnx2x: Fix link-forcing for KR2 net: ethernet: davicom: fix devicetree irq resource fmvj18x_cs: fix incorrect indexing of dev->dev_addr[] when copying the MAC address Driver: Vmxnet3: Update Rx ring 2 max size net: netcp: rework the code for get/set sw_data in dma desc soc: ti: knav_dma: rename pad in struct knav_dma_desc to sw_data net: ti: netcp: restore get/set_pad_info() functionality MAINTAINERS: Drop myself as xen netback maintainer sctp: Fix port hash table size computation can: ems_usb: Fix possible tx overflow Bluetooth: hci_core: Avoid mixing up req_complete and req_complete_skb net: bcmgenet: Fix internal PHY link state af_unix: Don't use continue to re-execute unix_stream_read_generic loop unix_diag: fix incorrect sign extension in unix_lookup_by_ino bnxt_en: Failure to update PHY is not fatal condition. bnxt_en: Remove unnecessary call to update PHY settings. ...
2016-02-22mbcache: remove mbcacheJan Kara
Both ext2 and ext4 are now converted to mbcache2. Remove the old mbcache code. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-02-22mbcache2: reimplement mbcacheJan Kara
Original mbcache was designed to have more features than what ext? filesystems ended up using. It supported entry being in more hashes, it had a home-grown rwlocking of each entry, and one cache could cache entries from multiple filesystems. This genericity also resulted in more complex locking, larger cache entries, and generally more code complexity. This is reimplementation of the mbcache functionality to exactly fit the purpose ext? filesystems use it for. Cache entries are now considerably smaller (7 instead of 13 longs), the code is considerably smaller as well (414 vs 913 lines of code), and IMO also simpler. The new code is also much more lightweight. I have measured the speed using artificial xattr-bench benchmark, which spawns P processes, each process sets xattr for F different files, and the value of xattr is randomly chosen from a pool of V values. Averages of runtimes for 5 runs for various combinations of parameters are below. The first value in each cell is old mbache, the second value is the new mbcache. V=10 F\P 1 2 4 8 16 32 64 10 0.158,0.157 0.208,0.196 0.500,0.277 0.798,0.400 3.258,0.584 13.807,1.047 61.339,2.803 100 0.172,0.167 0.279,0.222 0.520,0.275 0.825,0.341 2.981,0.505 12.022,1.202 44.641,2.943 1000 0.185,0.174 0.297,0.239 0.445,0.283 0.767,0.340 2.329,0.480 6.342,1.198 16.440,3.888 V=100 F\P 1 2 4 8 16 32 64 10 0.162,0.153 0.200,0.186 0.362,0.257 0.671,0.496 1.433,0.943 3.801,1.345 7.938,2.501 100 0.153,0.160 0.221,0.199 0.404,0.264 0.945,0.379 1.556,0.485 3.761,1.156 7.901,2.484 1000 0.215,0.191 0.303,0.246 0.471,0.288 0.960,0.347 1.647,0.479 3.916,1.176 8.058,3.160 V=1000 F\P 1 2 4 8 16 32 64 10 0.151,0.129 0.210,0.163 0.326,0.245 0.685,0.521 1.284,0.859 3.087,2.251 6.451,4.801 100 0.154,0.153 0.211,0.191 0.276,0.282 0.687,0.506 1.202,0.877 3.259,1.954 8.738,2.887 1000 0.145,0.179 0.202,0.222 0.449,0.319 0.899,0.333 1.577,0.524 4.221,1.240 9.782,3.579 V=10000 F\P 1 2 4 8 16 32 64 10 0.161,0.154 0.198,0.190 0.296,0.256 0.662,0.480 1.192,0.818 2.989,2.200 6.362,4.746 100 0.176,0.174 0.236,0.203 0.326,0.255 0.696,0.511 1.183,0.855 4.205,3.444 19.510,17.760 1000 0.199,0.183 0.240,0.227 1.159,1.014 2.286,2.154 6.023,6.039 ---,10.933 ---,36.620 V=100000 F\P 1 2 4 8 16 32 64 10 0.171,0.162 0.204,0.198 0.285,0.230 0.692,0.500 1.225,0.881 2.990,2.243 6.379,4.771 100 0.151,0.171 0.220,0.210 0.295,0.255 0.720,0.518 1.226,0.844 3.423,2.831 19.234,17.544 1000 0.192,0.189 0.249,0.225 1.162,1.043 2.257,2.093 5.853,4.997 ---,10.399 ---,32.198 We see that the new code is faster in pretty much all the cases and starting from 4 processes there are significant gains with the new code resulting in upto 20-times shorter runtimes. Also for large numbers of cached entries all values for the old code could not be measured as the kernel started hitting softlockups and died before the test completed. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-02-22dm: set DM_TARGET_WILDCARD feature on "error" targetMike Snitzer
The DM_TARGET_WILDCARD feature indicates that the "error" target may replace any target; even immutable targets. This feature will be useful to preserve the ability to replace the "multipath" target even once it is formally converted over to having the DM_TARGET_IMMUTABLE feature. Also, implicit in the DM_TARGET_WILDCARD feature flag being set is that .map, .map_rq, .clone_and_map_rq and .release_clone_rq are all defined in the target_type. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-02-22dmaengine: make slave address physicalVinod Koul
The slave dmaengine semantics required the client to map dma addresses and pass DMA address to dmaengine drivers. This was a convenient notion coming from generic dma offload cases where dmaengines are interchangeable and client is not aware of which engine to map to. But in case of slave, we know the dmaengine and always use a specific one. Further the IOMMU cases can lead to failure of this notion, so make this as physical address and now dmaengine driver will do the required mapping. Original-patch-by: Linus Walleij <linus.walleij@linaro.org> Original-patch-Acked-by: Lee Jones <lee.jones@linaro.org> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Tested-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-02-22arch: Introduce post-init read-only memoryKees Cook
One of the easiest ways to protect the kernel from attack is to reduce the internal attack surface exposed when a "write" flaw is available. By making as much of the kernel read-only as possible, we reduce the attack surface. Many things are written to only during __init, and never changed again. These cannot be made "const" since the compiler will do the wrong thing (we do actually need to write to them). Instead, move these items into a memory region that will be made read-only during mark_rodata_ro() which happens after all kernel __init code has finished. This introduces __ro_after_init as a way to mark such memory, and adds some documentation about the existing __read_mostly marking. This improves the security of the Linux kernel by marking formerly read-write memory regions as read-only on a fully booted up system. Based on work by PaX Team and Brad Spengler. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brad Spengler <spender@grsecurity.net> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Brown <david.brown@linaro.org> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Emese Revfy <re.emese@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathias Krause <minipli@googlemail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: PaX Team <pageexec@freemail.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-hardening@lists.openwall.com Cc: linux-arch <linux-arch@vger.kernel.org> Link: http://lkml.kernel.org/r/1455748879-21872-5-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-22asm-generic: Consolidate mark_rodata_ro()Kees Cook
Instead of defining mark_rodata_ro() in each architecture, consolidate it. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Gross <agross@codeaurora.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Ashok Kumar <ashoks@broadcom.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Brown <david.brown@linaro.org> Cc: David Hildenbrand <dahi@linux.vnet.ibm.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Emese Revfy <re.emese@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Helge Deller <deller@gmx.de> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathias Krause <minipli@googlemail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: PaX Team <pageexec@freemail.hu> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: kernel-hardening@lists.openwall.com Cc: linux-arch <linux-arch@vger.kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: linux-parisc@vger.kernel.org Link: http://lkml.kernel.org/r/1455748879-21872-2-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>