summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-09-04ocfs2: fix a tiny case that inode can not removedYiwen Jiang
When running dirop_fileop_racer we found a case that inode can not removed. Two nodes, say Node A and Node B, mount the same ocfs2 volume. Create two dirs /race/1/ and /race/2/ in the filesystem. Node A Node B rm -r /race/2/ mv /race/1/ /race/2/ call ocfs2_unlink(), get the EX mode of /race/2/ wait for B unlock /race/2/ decrease i_nlink of /race/2/ to 0, and add inode of /race/2/ into orphan dir, unlock /race/2/ got EX mode of /race/2/. because /race/1/ is dir, so inc i_nlink of /race/2/ and update into disk, unlock /race/2/ because i_nlink of /race/2/ is not zero, this inode will always remain in orphan dir This patch fixes this case by test whether i_nlink of new dir is zero. Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com> Reviewed-by: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <joseph.qi@huawei.com> Cc: Xue jiufei <xuejiufei@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ocfs2: add ip_alloc_sem in direct IO to protect allocation changesWeiWei Wang
In ocfs2, ip_alloc_sem is used to protect allocation changes on the node. In direct IO, we add ip_alloc_sem to protect date consistent between direct-io and ocfs2_truncate_file race (buffer io use ip_alloc_sem already). Although inode->i_mutex lock is used to avoid concurrency of above situation, i think ip_alloc_sem is still needed because protect allocation changes is significant. Other filesystem like ext4 also uses rw_semaphore to protect data consistent between get_block-vs-truncate race by other means, So ip_alloc_sem in ocfs2 direct io is needed. Signed-off-by: Weiwei Wang <wangww631@huawei.com> Signed-off-by: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ocfs2: clear the rest of the buffers on errorGoldwyn Rodrigues
In case a validation fails, clear the rest of the buffers and return the error to the calling function. This also facilitates bubbling up the error originating from ocfs2_error to calling functions. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Reviewed-by: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ocfs2: acknowledge return value of ocfs2_error()Goldwyn Rodrigues
Caveat: This may return -EROFS for a read case, which seems wrong. This is happening even without this patch series though. Should we convert EROFS to EIO? Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ocfs2: add errors=continueGoldwyn Rodrigues
OCFS2 is often used in high-availaibility systems. However, ocfs2 converts the filesystem to read-only at the drop of the hat. This may not be necessary, since turning the filesystem read-only would affect other running processes as well, decreasing availability. This attempt is to add errors=continue, which would return the EIO to the calling process and terminate furhter processing so that the filesystem is not corrupted further. However, the filesystem is not converted to read-only. As a future plan, I intend to create a small utility or extend fsck.ocfs2 to fix small errors such as in the inode. The input to the utility such as the inode can come from the kernel logs so we don't have to schedule a downtime for fixing small-enough errors. The patch changes the ocfs2_error to return an error. The error returned depends on the mount option set. If none is set, the default is to turn the filesystem read-only. Perhaps errors=continue is not the best option name. Historically it is used for making an attempt to progress in the current process itself. Should we call it errors=eio? or errors=killproc? Suggestions/Comments welcome. Sources are available at: https://github.com/goldwynr/linux/tree/error-cont Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ocfs2: flush inode data to disk and free inode when i_count becomes zeroXue jiufei
Disk inode deletion may be heavily delayed when one node unlink a file after the same dentry is freed on another node(say N1) because of memory shrink but inode is left in memory. This inode can only be freed while N1 doing the orphan scan work. However, N1 may skip orphan scan for several times because other nodes may do the work earlier. In our tests, it may take 1 hour on 4 nodes cluster and it hurts the user experience. So we think the inode should be freed after the data flushed to disk when i_count becomes zero to avoid such circumstances. Signed-off-by: Joyce.xue <xuejiufei@huawei.com> Cc: Joel Becker <jlbec@evilplan.org> Reviewed-by: Mark Fasheh <mfasheh@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ocfs2: trusted xattr missing CAP_SYS_ADMIN checkSanidhya Kashyap
The trusted extended attributes are only visible to the process which hvae CAP_SYS_ADMIN capability but the check is missing in ocfs2 xattr_handler trusted list. The check is important because this will be used for implementing mechanisms in the userspace for which other ordinary processes should not have access to. Signed-off-by: Sanidhya Kashyap <sanidhya.gatech@gmail.com> Reviewed-by: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Taesoo kim <taesoo@gatech.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ocfs2: set filesytem read-only when ocfs2_delete_entry failed.jiangyiwen
In ocfs2_rename, it will lead to an inode with two entried(old and new) if ocfs2_delete_entry(old) failed. Thus, filesystem will be inconsistent. The case is described below: ocfs2_rename -> ocfs2_start_trans -> ocfs2_add_entry(new) -> ocfs2_delete_entry(old) -> __ocfs2_journal_access *failed* because of -ENOMEM -> ocfs2_commit_trans So filesystem should be set to read-only at the moment. Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com> Cc: Joseph Qi <joseph.qi@huawei.com> Cc: Joel Becker <jlbec@evilplan.org> Reviewed-by: Mark Fasheh <mfasheh@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ocfs2/dlm: use list_for_each_entry instead of list_for_eachJoseph Qi
Use list_for_each_entry instead of list_for_each to simplify code. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ocfs2: remove unneeded code in dlm_register_domain_handlersJoseph Qi
The last goto statement is unneeded, so remove it. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ocfs2: fix BUG when o2hb_register_callback failsJoseph Qi
In dlm_register_domain_handlers, if o2hb_register_callback fails, it will call dlm_unregister_domain_handlers to unregister. This will trigger the BUG_ON in o2hb_unregister_callback because hc_magic is 0. So we should call o2hb_setup_callback to initialize hc first. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ocfs2: remove unneeded code in ocfs2_dlm_initJoseph Qi
status is already initialized and it will only be 0 or negatives in the code flow. So remove the unneeded assignment after the lable 'local'. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ocfs2: adjust code to match locking/unlocking orderJoseph Qi
Unlocking order in ocfs2_unlink and ocfs2_rename mismatches the corresponding locking order, although it won't cause issues, adjust the code so that it looks more reasonable. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ocfs2: clean up unused local variables in ocfs2_file_write_iterJoseph Qi
Since commit 86b9c6f3f891 ("ocfs2: remove filesize checks for sync I/O journal commit") removes filesize checks for sync I/O journal commit, variables old_size and old_clusters are not actually used any more. So clean them up. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ocfs2: do not log twice error messagesChristophe JAILLET
'o2hb_map_slot_data' and 'o2hb_populate_slot_data' are called from only one place, in 'o2hb_region_dev_write'. Return value is checked and 'mlog_errno' is called to log a message if it is not 0. So there is no need to call 'mlog_errno' directly within these functions. This would result on logging the message twice. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ocfs2: do not BUG if buffer not uptodate in __ocfs2_journal_accessJoseph Qi
When storage network is unstable, it may trigger the BUG in __ocfs2_journal_access because of buffer not uptodate. We can retry the write in this case or return error instead of BUG. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Reported-by: Zhangguanghui <zhang.guanghui@h3c.com> Tested-by: Zhangguanghui <zhang.guanghui@h3c.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ocfs2: fix several issues of append dioJoseph Qi
1) Take rw EX lock in case of append dio. 2) Explicitly treat the error code -EIOCBQUEUED as normal. 3) Set di_bh to NULL after brelse if it may be used again later. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Yiwen Jiang <jiangyiwen@huawei.com> Cc: Weiwei Wang <wangww631@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ocfs2: fix race between dio and recover orphanJoseph Qi
During direct io the inode will be added to orphan first and then deleted from orphan. There is a race window that the orphan entry will be deleted twice and thus trigger the BUG when validating OCFS2_DIO_ORPHANED_FL in ocfs2_del_inode_from_orphan. ocfs2_direct_IO_write ... ocfs2_add_inode_to_orphan >>>>>>>> race window. 1) another node may rm the file and then down, this node take care of orphan recovery and clear flag OCFS2_DIO_ORPHANED_FL. 2) since rw lock is unlocked, it may race with another orphan recovery and append dio. ocfs2_del_inode_from_orphan So take inode mutex lock when recovering orphans and make rw unlock at the end of aio write in case of append dio. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Reported-by: Yiwen Jiang <jiangyiwen@huawei.com> Cc: Weiwei Wang <wangww631@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04sh: use PFN_DOWN macroAlexander Kuleshov
Replace ((x) >> PAGE_SHIFT) with the predefined PFN_DOWN macro. Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ntfs: delete unnecessary checks before calling iput()SF Markus Elfring
iput() tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Cc: Julia Lawall <julia.lawall@lip6.fr> Reviewed-by: Anton Altaparmakov <anton@tuxera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04scripts/spelling.txt: add some typo-wordsZhao Lei
I wrote a small script to show word-pair from all linux spelling-typo commits, and get following result by sort | uniq -c: 181 occured -> occurred 78 transfered -> transferred 67 recieved -> received 65 dependant -> dependent 58 wether -> whether 56 accomodate -> accommodate 54 occured -> occurred 51 recieve -> receive 47 cant -> can't 40 sucessfully -> successfully ... Some of them are not in spelling.txt, this patch adds the most common word-pairs into spelling.txt. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04scripts: decode_stacktrace: fix ARM architecture decodingRobert Jarzmik
Fix the stack decoder for the ARM architecture. An ARM stack is designed as : [ 81.547704] [<c023eb04>] (bucket_find_contain) from [<c023ec88>] (check_sync+0x40/0x4f8) [ 81.559668] [<c023ec88>] (check_sync) from [<c023f8c4>] (debug_dma_sync_sg_for_cpu+0x128/0x194) [ 81.571583] [<c023f8c4>] (debug_dma_sync_sg_for_cpu) from [<c0327dec>] (__videobuf_s The current script doesn't expect the symbols to be bound by parenthesis, and triggers the following errors : awk: cmd. line:1: error: Unmatched ( or \(: / (check_sync$/ [ 81.547704] (bucket_find_contain) from (check_sync+0x40/0x4f8) Fix it by chopping starting and ending parenthesis from the each symbol name. As a side note, this probably comes from the function dump_backtrace_entry(), which is implemented differently for each architecture. That makes a single decoding script a bit a challenge. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Michal Marek <mmarek@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04scripts/Lindent: handle missing indent gracefullyJean Delvare
If indent is not found, bail out immediately instead of spitting random shell script error messages. Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04kerneldoc: Convert error messages to GNU error message formatBart Van Assche
Editors like emacs and vi recognize a number of error message formats. The format used by the kerneldoc tool is not recognized by emacs. Change the kerneldoc error message format to the GNU style such that the emacs prev-error and next-error commands can be used to navigate through kerneldoc error messages. For more information about the GNU error message format, see also https://www.gnu.org/prep/standards/html_node/Errors.html. This patch has been generated via the following sed command: sed -i.orig 's/Error(\${file}:\$.):/\${file}:\$.: error:/g;s/Warning(\${file}:\$.):/\${file}:\$.: warning:/g;s/Warning(\${file}):/\${file}:1: warning:/g;s/Info(\${file}:\$.):/\${file}:\$.: info:/g' scripts/kernel-doc Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Johannes Berg <johannes.berg@intel.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04scripts/spelling.txt: spelling of uninitializedSudip Mukherjee
I just did a spelling mistake of uninitialized and wrote that as unintialized. Fortunately I noticed it in my final review. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04scripts/spelling.txt: add misspelled words for checkManinder Singh
misspelled words for check:- chcek chck cehck I myself did these spell mistakes in changelog for patches, Thus suggesting to add in spelling.txt, so that checkpatch.pl warns it earlier. References:- ./arch/powerpc/kernel/exceptions-64e.S:456: . . . make sure you chcek https://lkml.org/lkml/2015/6/25/289 ./arch/x86/mm/pageattr.c:1368: * No need to cehck in that case [akpm@linux-foundation.org: add whcih->which, whcih I always get wrong] Signed-off-by: Maninder Singh <maninder1.s@samsung.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04fsnotify: get rid of fsnotify_destroy_mark_locked()Jan Kara
fsnotify_destroy_mark_locked() is subtle to use because it temporarily releases group->mark_mutex. To avoid future problems with this function, split it into two. fsnotify_detach_mark() is the part that needs group->mark_mutex and fsnotify_free_mark() is the part that must be called outside of group->mark_mutex. This way it's much clearer what's going on and we also avoid some pointless acquisitions of group->mark_mutex. Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04fsnotify: remove mark->free_listJan Kara
Free list is used when all marks on given inode / mount should be destroyed when inode / mount is going away. However we can free all of the marks without using a special list with some care. Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04fsnotify: document mark lockingJan Kara
Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04fsnotify: fix check in inotify fdinfo printingJan Kara
A check in inotify_fdinfo() checking whether mark is valid was always true due to a bug. Luckily we can never get to invalidated marks since we hold mark_mutex and invalidated marks get removed from the group list when they are invalidated under that mutex. Anyway fix the check to make code more future proof. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04fs/notify: optimize inotify/fsnotify code for unwatched filesDave Hansen
I have a _tiny_ microbenchmark that sits in a loop and writes single bytes to a file. Writing one byte to a tmpfs file is around 2x slower than reading one byte from a file, which is a _bit_ more than I expecte. This is a dumb benchmark, but I think it's hard to deny that write() is a hot path and we should avoid unnecessary overhead there. I did a 'perf record' of 30-second samples of read and write. The top item in a diffprofile is srcu_read_lock() from fsnotify(). There are active inotify fd's from systemd, but nothing is actually listening to the file or its part of the filesystem. I *think* we can avoid taking the srcu_read_lock() for the common case where there are no actual marks on the file. This means that there will both be nothing to notify for *and* implies that there is no need for clearing the ignore mask. This patch gave a 13.1% speedup in writes/second on my test, which is an improvement from the 10.8% that I saw with the last version. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Cc: John McCutchan <john@johnmccutchan.com> Cc: Robert Love <rlove@rlove.org> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04drivers/video/concole: add negative dependency for VGA_CONSOLE on ARCYuriy Kolerov
Architectures which support VGA console must define screen_info structurture from "uapi/linux/screen_info.h". Otherwise undefined symbol error occurs. Usually it's defined in "setup.c" for each architecture. If an architecture does not support VGA console (ARC's case) there are 2 ways: define a dummy instance of screen_info or add a negative dependency for VGA_CONSOLE in to prevent selecting this option. I've implemented the second way. However the best solution is to add HAVE_VGA_CONSOLE option for targets which support VGA console. Then turn off VGA_CONSOLE by default and add dependency to HAVE_VGA_CONSOLE. But right now it's better to just add a negative dependency for ARC and then consider how to collaborate about this issue with maintainers of other architectures. Signed-off-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Jaya Kumar <jayalk@intworks.biz> Cc: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04capabilities: add a securebit to disable PR_CAP_AMBIENT_RAISEAndy Lutomirski
Per Andrew Morgan's request, add a securebit to allow admins to disable PR_CAP_AMBIENT_RAISE. This securebit will prevent processes from adding capabilities to their ambient set. For simplicity, this disables PR_CAP_AMBIENT_RAISE entirely rather than just disabling setting previously cleared bits. Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Andrew G. Morgan <morgan@kernel.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Cc: Kees Cook <keescook@chromium.org> Cc: Christoph Lameter <cl@linux.com> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Aaron Jones <aaronmdjones@gmail.com> Cc: Ted Ts'o <tytso@mit.edu> Cc: Andrew G. Morgan <morgan@kernel.org> Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: Austin S Hemmelgarn <ahferroin7@gmail.com> Cc: Markku Savela <msa@moth.iki.fi> Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: James Morris <james.l.morris@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04selftests/capabilities: Add tests for capability evolutionAndy Lutomirski
This test focuses on ambient capabilities. It requires either root or the ability to create user namespaces. Some of the test cases will be skipped for nonroot users. Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Cc: Christoph Lameter <cl@linux.com> # Original author Cc: Serge E. Hallyn <serge.hallyn@ubuntu.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04capabilities: ambient capabilitiesAndy Lutomirski
Credit where credit is due: this idea comes from Christoph Lameter with a lot of valuable input from Serge Hallyn. This patch is heavily based on Christoph's patch. ===== The status quo ===== On Linux, there are a number of capabilities defined by the kernel. To perform various privileged tasks, processes can wield capabilities that they hold. Each task has four capability masks: effective (pE), permitted (pP), inheritable (pI), and a bounding set (X). When the kernel checks for a capability, it checks pE. The other capability masks serve to modify what capabilities can be in pE. Any task can remove capabilities from pE, pP, or pI at any time. If a task has a capability in pP, it can add that capability to pE and/or pI. If a task has CAP_SETPCAP, then it can add any capability to pI, and it can remove capabilities from X. Tasks are not the only things that can have capabilities; files can also have capabilities. A file can have no capabilty information at all [1]. If a file has capability information, then it has a permitted mask (fP) and an inheritable mask (fI) as well as a single effective bit (fE) [2]. File capabilities modify the capabilities of tasks that execve(2) them. A task that successfully calls execve has its capabilities modified for the file ultimately being excecuted (i.e. the binary itself if that binary is ELF or for the interpreter if the binary is a script.) [3] In the capability evolution rules, for each mask Z, pZ represents the old value and pZ' represents the new value. The rules are: pP' = (X & fP) | (pI & fI) pI' = pI pE' = (fE ? pP' : 0) X is unchanged For setuid binaries, fP, fI, and fE are modified by a moderately complicated set of rules that emulate POSIX behavior. Similarly, if euid == 0 or ruid == 0, then fP, fI, and fE are modified differently (primary, fP and fI usually end up being the full set). For nonroot users executing binaries with neither setuid nor file caps, fI and fP are empty and fE is false. As an extra complication, if you execute a process as nonroot and fE is set, then the "secure exec" rules are in effect: AT_SECURE gets set, LD_PRELOAD doesn't work, etc. This is rather messy. We've learned that making any changes is dangerous, though: if a new kernel version allows an unprivileged program to change its security state in a way that persists cross execution of a setuid program or a program with file caps, this persistent state is surprisingly likely to allow setuid or file-capped programs to be exploited for privilege escalation. ===== The problem ===== Capability inheritance is basically useless. If you aren't root and you execute an ordinary binary, fI is zero, so your capabilities have no effect whatsoever on pP'. This means that you can't usefully execute a helper process or a shell command with elevated capabilities if you aren't root. On current kernels, you can sort of work around this by setting fI to the full set for most or all non-setuid executable files. This causes pP' = pI for nonroot, and inheritance works. No one does this because it's a PITA and it isn't even supported on most filesystems. If you try this, you'll discover that every nonroot program ends up with secure exec rules, breaking many things. This is a problem that has bitten many people who have tried to use capabilities for anything useful. ===== The proposed change ===== This patch adds a fifth capability mask called the ambient mask (pA). pA does what most people expect pI to do. pA obeys the invariant that no bit can ever be set in pA if it is not set in both pP and pI. Dropping a bit from pP or pI drops that bit from pA. This ensures that existing programs that try to drop capabilities still do so, with a complication. Because capability inheritance is so broken, setting KEEPCAPS, using setresuid to switch to nonroot uids, and then calling execve effectively drops capabilities. Therefore, setresuid from root to nonroot conditionally clears pA unless SECBIT_NO_SETUID_FIXUP is set. Processes that don't like this can re-add bits to pA afterwards. The capability evolution rules are changed: pA' = (file caps or setuid or setgid ? 0 : pA) pP' = (X & fP) | (pI & fI) | pA' pI' = pI pE' = (fE ? pP' : pA') X is unchanged If you are nonroot but you have a capability, you can add it to pA. If you do so, your children get that capability in pA, pP, and pE. For example, you can set pA = CAP_NET_BIND_SERVICE, and your children can automatically bind low-numbered ports. Hallelujah! Unprivileged users can create user namespaces, map themselves to a nonzero uid, and create both privileged (relative to their namespace) and unprivileged process trees. This is currently more or less impossible. Hallelujah! You cannot use pA to try to subvert a setuid, setgid, or file-capped program: if you execute any such program, pA gets cleared and the resulting evolution rules are unchanged by this patch. Users with nonzero pA are unlikely to unintentionally leak that capability. If they run programs that try to drop privileges, dropping privileges will still work. It's worth noting that the degree of paranoia in this patch could possibly be reduced without causing serious problems. Specifically, if we allowed pA to persist across executing non-pA-aware setuid binaries and across setresuid, then, naively, the only capabilities that could leak as a result would be the capabilities in pA, and any attacker *already* has those capabilities. This would make me nervous, though -- setuid binaries that tried to privilege-separate might fail to do so, and putting CAP_DAC_READ_SEARCH or CAP_DAC_OVERRIDE into pA could have unexpected side effects. (Whether these unexpected side effects would be exploitable is an open question.) I've therefore taken the more paranoid route. We can revisit this later. An alternative would be to require PR_SET_NO_NEW_PRIVS before setting ambient capabilities. I think that this would be annoying and would make granting otherwise unprivileged users minor ambient capabilities (CAP_NET_BIND_SERVICE or CAP_NET_RAW for example) much less useful than it is with this patch. ===== Footnotes ===== [1] Files that are missing the "security.capability" xattr or that have unrecognized values for that xattr end up with has_cap set to false. The code that does that appears to be complicated for no good reason. [2] The libcap capability mask parsers and formatters are dangerously misleading and the documentation is flat-out wrong. fE is *not* a mask; it's a single bit. This has probably confused every single person who has tried to use file capabilities. [3] Linux very confusingly processes both the script and the interpreter if applicable, for reasons that elude me. The results from thinking about a script's file capabilities and/or setuid bits are mostly discarded. Preliminary userspace code is here, but it needs updating: https://git.kernel.org/cgit/linux/kernel/git/luto/util-linux-playground.git/commit/?h=cap_ambient&id=7f5afbd175d2 Here is a test program that can be used to verify the functionality (from Christoph): /* * Test program for the ambient capabilities. This program spawns a shell * that allows running processes with a defined set of capabilities. * * (C) 2015 Christoph Lameter <cl@linux.com> * Released under: GPL v3 or later. * * * Compile using: * * gcc -o ambient_test ambient_test.o -lcap-ng * * This program must have the following capabilities to run properly: * Permissions for CAP_NET_RAW, CAP_NET_ADMIN, CAP_SYS_NICE * * A command to equip the binary with the right caps is: * * setcap cap_net_raw,cap_net_admin,cap_sys_nice+p ambient_test * * * To get a shell with additional caps that can be inherited by other processes: * * ./ambient_test /bin/bash * * * Verifying that it works: * * From the bash spawed by ambient_test run * * cat /proc/$$/status * * and have a look at the capabilities. */ #include <stdlib.h> #include <stdio.h> #include <errno.h> #include <cap-ng.h> #include <sys/prctl.h> #include <linux/capability.h> /* * Definitions from the kernel header files. These are going to be removed * when the /usr/include files have these defined. */ #define PR_CAP_AMBIENT 47 #define PR_CAP_AMBIENT_IS_SET 1 #define PR_CAP_AMBIENT_RAISE 2 #define PR_CAP_AMBIENT_LOWER 3 #define PR_CAP_AMBIENT_CLEAR_ALL 4 static void set_ambient_cap(int cap) { int rc; capng_get_caps_process(); rc = capng_update(CAPNG_ADD, CAPNG_INHERITABLE, cap); if (rc) { printf("Cannot add inheritable cap\n"); exit(2); } capng_apply(CAPNG_SELECT_CAPS); /* Note the two 0s at the end. Kernel checks for these */ if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, cap, 0, 0)) { perror("Cannot set cap"); exit(1); } } int main(int argc, char **argv) { int rc; set_ambient_cap(CAP_NET_RAW); set_ambient_cap(CAP_NET_ADMIN); set_ambient_cap(CAP_SYS_NICE); printf("Ambient_test forking shell\n"); if (execv(argv[1], argv + 1)) perror("Cannot exec"); return 0; } Signed-off-by: Christoph Lameter <cl@linux.com> # Original author Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Aaron Jones <aaronmdjones@gmail.com> Cc: Ted Ts'o <tytso@mit.edu> Cc: Andrew G. Morgan <morgan@kernel.org> Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: Austin S Hemmelgarn <ahferroin7@gmail.com> Cc: Markku Savela <msa@moth.iki.fi> Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: James Morris <james.l.morris@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04kernel/kthread.c:kthread_create_on_node(): clarify documentationAndrew Morton
- Make it clear that the `node' arg refers to memory allocations only: kthread_create_on_node() does not pin the new thread to that node's CPUs. - Encourage the use of NUMA_NO_NODE. [nzimmer@sgi.com: use NUMA_NO_NODE in kthread_create() also] Cc: Nathan Zimmer <nzimmer@sgi.com> Cc: Tejun Heo <tj@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04mm: check if section present during memory block registeringYinghai Lu
Tony Luck found on his setup, if memory block size 512M will cause crash during booting. BUG: unable to handle kernel paging request at ffffea0074000020 IP: get_nid_for_pfn+0x17/0x40 PGD 128ffcb067 PUD 128ffc9067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc8 #1 ... Call Trace: ? register_mem_sect_under_node+0x66/0xe0 register_one_node+0x17b/0x240 ? pci_iommu_alloc+0x6e/0x6e topology_init+0x3c/0x95 do_one_initcall+0xcd/0x1f0 The system has non continuous RAM address: BIOS-e820: [mem 0x0000001300000000-0x0000001cffffffff] usable BIOS-e820: [mem 0x0000001d70000000-0x0000001ec7ffefff] usable BIOS-e820: [mem 0x0000001f00000000-0x0000002bffffffff] usable BIOS-e820: [mem 0x0000002c18000000-0x0000002d6fffefff] usable BIOS-e820: [mem 0x0000002e00000000-0x00000039ffffffff] usable So there are start sections in memory block not present. For example: memory block : [0x2c18000000, 0x2c20000000) 512M first three sections are not present. The current register_mem_sect_under_node() assume first section is present, but memory block section number range [start_section_nr, end_section_nr] would include not present section. For arch that support vmemmap, we don't setup memmap for struct page area within not present sections area. So skip the pfn range that belong to absent section. [akpm@linux-foundation.org: simplification] [rientjes@google.com: more simplification] Fixes: bdee237c0343 ("x86: mm: Use 2GB memory block size on large memory x86-64 systems") Fixes: 982792c782ef ("x86, mm: probe memory block size for generic x86 64bit") Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Tony Luck <tony.luck@intel.com> Tested-by: Tony Luck <tony.luck@intel.com> Cc: Greg KH <greg@kroah.com> Cc: Ingo Molnar <mingo@elte.hu> Tested-by: David Rientjes <rientjes@google.com> Cc: <stable@vger.kernel.org> [3.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ocfs2: direct write will call ocfs2_rw_unlock() twice when doing aio+dioRyan Ding
ocfs2_file_write_iter() is usng the wrong return value ('written'). This will cause ocfs2_rw_unlock() be called both in write_iter & end_io, triggering a BUG_ON. This issue was introduced by commit 7da839c47589 ("ocfs2: use __generic_file_write_iter()"). Orabug: 21612107 Fixes: 7da839c47589 ("ocfs2: use __generic_file_write_iter()") Signed-off-by: Ryan Ding <ryan.ding@oracle.com> Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04memory-hotplug: add hot-added memory ranges to memblock before allocate ↵Tang Chen
node_data for a node. Commit f9126ab9241f ("memory-hotplug: fix wrong edge when hot add a new node") hot-added memory range to memblock, after creating pgdat for new node. But there is a problem: add_memory() |--> hotadd_new_pgdat() |--> free_area_init_node() |--> get_pfn_range_for_nid() |--> find start_pfn and end_pfn in memblock |--> ...... |--> memblock_add_node(start, size, nid) -------- Here, just too late. get_pfn_range_for_nid() will find that start_pfn and end_pfn are both 0. As a result, when adding memory, dmesg will give the following wrong message. Initmem setup node 5 [mem 0x0000000000000000-0xffffffffffffffff] On node 5 totalpages: 0 Built 5 zonelists in Node order, mobility grouping on. Total pages: 32588823 Policy zone: Normal init_memory_mapping: [mem 0x60000000000-0x607ffffffff] The solution is simple, just add the memory range to memblock a little earlier, before hotadd_new_pgdat(). [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Cc: Xishi Qiu <qiuxishi@huawei.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Taku Izumi <izumi.taku@jp.fujitsu.com> Cc: Gu Zheng <guz.fnst@cn.fujitsu.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: David Rientjes <rientjes@google.com> Cc: <stable@vger.kernel.org> [4.2.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-05CRISv10: delete unused lib/dmacopy.cRabin Vincent
This file is never built. Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Jesper Nilsson <jespern@axis.com>
2015-09-05CRISv10: delete unused lib/old_checksum.cRabin Vincent
This file is never built. Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
2015-09-05CRIS: fix switch_mm() lockdep splatRabin Vincent
With lockdep support implemented on CRISv32, we get the following splat. switch_mm() can be called both from the scheduler() (with interrupts disabled) and from flush_old_exec (via activate_mm()), with interrupts enabled. Fix it by disabling interrupts in activate_mm(), similar to powerpc and hexagon. t====================================================== [ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ] 3.19.0-08802-g20bc9f1-dirty #323 Not tainted ------------------------------------------------------ init/1 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: (mmu_context_lock){+.+...}, at: [<c0009290>] switch_mm+0x22/0xc6 and this task is already holding: (&rq->lock){-.-.-.}, at: [<c01a0756>] __schedule+0x5e/0x648 which would create a new lock dependency: (&rq->lock){-.-.-.} -> (mmu_context_lock){+.+...} but this new dependency connects a HARDIRQ-irq-safe lock: (&rq->lock){-.-.-.} ... which became HARDIRQ-irq-safe at: [<c002b03c>] scheduler_tick+0x28/0x5e [<c0007c6c>] timer_interrupt+0x4e/0x6a [<c0043ac4>] handle_irq_event_percpu+0x54/0x13c [<c004343c>] generic_handle_irq+0x2a/0x36 to a HARDIRQ-irq-unsafe lock: (mmu_context_lock){+.+...} ... which became HARDIRQ-irq-unsafe at: ... [<c0039e60>] __lock_acquire+0x8f8/0x1d9c [<c0009290>] switch_mm+0x22/0xc6 [<c009c260>] flush_old_exec+0x500/0x5d4 [<c00da4c6>] load_elf_phdrs+0x7a/0x84 [<c00dbdb0>] load_elf_binary+0x21c/0x13b4 [<c009cdb6>] do_execve+0x22/0x2c [<c001dcf2>] ____call_usermodehelper+0x0/0x154 [<c000581e>] ret_from_kernel_thread+0xe/0x14 other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(mmu_context_lock); local_irq_disable(); lock(&rq->lock); lock(mmu_context_lock); <Interrupt> lock(&rq->lock); *** DEADLOCK *** 1 lock held by init/1: #0: (&rq->lock){-.-.-.}, at: [<c01a0756>] __schedule+0x5e/0x648 Call Trace: [<c019fe9e>] printk+0x0/0x4e [<c00368f8>] print_shortest_lock_dependencies+0x0/0x15c [<c0048628>] print_stack_trace+0x0/0x88 [<c0038912>] __lock_is_held+0x3e/0x5e [<c003b894>] lock_acquire+0x8a/0xcc [<c01a50c4>] _raw_spin_lock+0x44/0x7a [<c0009290>] switch_mm+0x22/0xc6 [<c01a06f8>] __schedule+0x0/0x648 [<c01a0d76>] schedule+0x36/0x7c [<c0037d04>] trace_hardirqs_on+0x0/0x1e [<c0004e18>] do_work_pending+0x30/0xd4 [<c000591a>] _work_pending+0xe/0x12 Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
2015-09-05CRISv32: enable LOCKDEP_SUPPORTRabin Vincent
Now that we have stack tracing and irq flags tracing support, we can also enable lockdep support Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
2015-09-05CRIS: add STACKTRACE_SUPPORTRabin Vincent
Add stacktrace support, which is required for lockdep and tracing. The stack tracing simply looks at all kernel text symbols found on the stack, similar to the trap stack dumping code, which can also be converted to use this. Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
2015-09-05CRISv32: annotate irq enable in idle loopRabin Vincent
Use a call to local_irq_enable() instead of incline asm so that the irqsoff latency tracer knows that interrupts are enabled here. Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
2015-09-05CRISv32: add support for irqflags tracingRabin Vincent
Add support irqflags tracing, which is required for things like lockdep and ftrace. Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
2015-09-05CRIS: UAPI: use generic types.hRabin Vincent
CRIS' types.h is functionally identical to the asm-generic version. Effective diff: +#ifndef _ASM_GENERIC_TYPES_H +#define _ASM_GENERIC_TYPES_H + #include <asm-generic/int-ll64.h> + +#endif Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
2015-09-05CRIS: UAPI: use generic shmbuf.hRabin Vincent
CRIS' shmbuf.h is equivalent to the asm-generic verison. Effective diff: -#ifndef _CRIS_SHMBUF_H -#define _CRIS_SHMBUF_H +#ifndef __ASM_GENERIC_SHMBUF_H +#define __ASM_GENERIC_SHMBUF_H + +#include <asm/bitsperlong.h> struct ipc64_perm shm_perm; size_t shm_segsz; __kernel_time_t shm_atime; +#if __BITS_PER_LONG != 64 unsigned long __unused1; +#endif __kernel_time_t shm_dtime; +#if __BITS_PER_LONG != 64 unsigned long __unused2; +#endif __kernel_time_t shm_ctime; +#if __BITS_PER_LONG != 64 unsigned long __unused3; +#endif __kernel_pid_t shm_cpid; __kernel_pid_t shm_lpid; - unsigned long shm_nattch; - unsigned long __unused4; - unsigned long __unused5; + __kernel_ulong_t shm_nattch; + __kernel_ulong_t __unused4; + __kernel_ulong_t __unused5; }; struct shminfo64 { - unsigned long shmmax; - unsigned long shmmin; - unsigned long shmmni; - unsigned long shmseg; - unsigned long shmall; - unsigned long __unused1; - unsigned long __unused2; - unsigned long __unused3; - unsigned long __unused4; + __kernel_ulong_t shmmax; + __kernel_ulong_t shmmin; + __kernel_ulong_t shmmni; + __kernel_ulong_t shmseg; + __kernel_ulong_t shmall; + __kernel_ulong_t __unused1; + __kernel_ulong_t __unused2; + __kernel_ulong_t __unused3; + __kernel_ulong_t __unused4; }; #endif Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
2015-09-05CRIS: UAPI: use generic msgbuf.hRabin Vincent
CRIS' msgbuf.h is equivalent to the asm-generic version. Effective diff: -#ifndef _CRIS_MSGBUF_H -#define _CRIS_MSGBUF_H - - +#ifndef __ASM_GENERIC_MSGBUF_H +#define __ASM_GENERIC_MSGBUF_H +#include <asm/bitsperlong.h> struct msqid64_ds { struct ipc64_perm msg_perm; __kernel_time_t msg_stime; +#if __BITS_PER_LONG != 64 unsigned long __unused1; +#endif __kernel_time_t msg_rtime; +#if __BITS_PER_LONG != 64 unsigned long __unused2; +#endif __kernel_time_t msg_ctime; +#if __BITS_PER_LONG != 64 unsigned long __unused3; - unsigned long msg_cbytes; - unsigned long msg_qnum; - unsigned long msg_qbytes; +#endif + __kernel_ulong_t msg_cbytes; + __kernel_ulong_t msg_qnum; + __kernel_ulong_t msg_qbytes; __kernel_pid_t msg_lspid; __kernel_pid_t msg_lrpid; - unsigned long __unused4; - unsigned long __unused5; + __kernel_ulong_t __unused4; + __kernel_ulong_t __unused5; }; #endif Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
2015-09-05CRIS: UAPI: use generic socket.hRabin Vincent
CRIS' socket.h is equivalent to the asm-generic version. Effective diff: -#ifndef _ASM_SOCKET_H -#define _ASM_SOCKET_H - - +#ifndef __ASM_GENERIC_SOCKET_H +#define __ASM_GENERIC_SOCKET_H #include <asm/sockios.h> #define SO_LINGER 13 #define SO_BSDCOMPAT 14 #define SO_REUSEPORT 15 +#ifndef SO_PASSCRED #define SO_PASSCRED 16 #define SO_PEERCRED 17 #define SO_RCVLOWAT 18 #define SO_SNDLOWAT 19 #define SO_RCVTIMEO 20 #define SO_SNDTIMEO 21 +#endif #define SO_SECURITY_AUTHENTICATION 22 Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>