summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-04-13Merge tag 'apparmor-pr-2018-04-10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor Pull apparmor updates from John Johansen: "Features: - add base infrastructure for socket mediation. ABI bump and additional checks to ensure only v8 compliant policy uses socket af mediation. - improve and cleanup dfa verification - improve profile attachment logic - improve overlapping expression handling - add the xattr matching to the attachment logic - improve signal mediation handling with stacked labels - improve handling of no_new_privs in a label stack Cleanups and changes: - use dfa to parse string split - bounded version of label_parse - proper line wrap nulldfa.in - split context out into task and cred naming to better match usage - simplify code in aafs Bug fixes: - fix display of .ns_name for containers - fix resource audit messages when auditing peer - fix logging of the existence test for signals - fix resource audit messages when auditing peer - fix display of .ns_name for containers - fix an error code in verify_table_headers() - fix memory leak on buffer on error exit path - fix error returns checks by making size a ssize_t" * tag 'apparmor-pr-2018-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: (36 commits) apparmor: fix memory leak on buffer on error exit path apparmor: fix dangling symlinks to policy rawdata after replacement apparmor: Fix an error code in verify_table_headers() apparmor: fix error returns checks by making size a ssize_t apparmor: update MAINTAINERS file git and wiki locations apparmor: remove POLICY_MEDIATES_SAFE apparmor: add base infastructure for socket mediation apparmor: improve overlapping domain attachment resolution apparmor: convert attaching profiles via xattrs to use dfa matching apparmor: Add support for attaching profiles via xattr, presence and value apparmor: cleanup: simplify code to get ns symlink name apparmor: cleanup create_aafs() error path apparmor: dfa split verification of table headers apparmor: dfa add support for state differential encoding apparmor: dfa move character match into a macro apparmor: update domain transitions that are subsets of confinement at nnp apparmor: move context.h to cred.h apparmor: move task related defines and fns to task.X files apparmor: cleanup, drop unused fn __aa_task_is_confined() apparmor: cleanup fixup description of aa_replace_profiles ...
2018-04-13Merge tag 'for-linus-20180413' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "Followup fixes for this merge window. This contains: - Series from Ming, fixing corner cases in our CPU <-> queue mapping. This triggered repeated warnings on especially s390, but I also hit it in cpu hot plug/unplug testing while doing IO on NVMe on x86-64. - Another fix from Ming, ensuring that we always order budget and driver tag identically, avoiding a deadlock on QD=1 devices. - Loop locking regression fix from this merge window, from Omar. - Another loop locking fix, this time missing an unlock, from Tetsuo Handa. - Fix for racing IO submission with device removal from Bart. - sr reference fix from me, fixing a case where disk change or getevents can race with device removal. - Set of nvme fixes by way of Keith, from various contributors" * tag 'for-linus-20180413' of git://git.kernel.dk/linux-block: (28 commits) nvme: expand nvmf_check_if_ready checks nvme: Use admin command effects for admin commands nvmet: fix space padding in serial number nvme: check return value of init_srcu_struct function nvmet: Fix nvmet_execute_write_zeroes sector count nvme-pci: Separate IO and admin queue IRQ vectors nvme-pci: Remove unused queue parameter nvme-pci: Skip queue deletion if there are no queues nvme: target: fix buffer overflow nvme: don't send keep-alives to the discovery controller nvme: unexport nvme_start_keep_alive nvme-loop: fix kernel oops in case of unhandled command nvme: enforce 64bit offset for nvme_get_log_ext fn sr: get/drop reference to device in revalidate and check_events blk-mq: Revert "blk-mq: reimplement blk_mq_hw_queue_mapped" blk-mq: Avoid that submitting a bio concurrently with device removal triggers a crash backing: silence compiler warning using __printf blk-mq: remove code for dealing with remapping queue blk-mq: reimplement blk_mq_hw_queue_mapped blk-mq: don't check queue mapped in __blk_mq_delay_run_hw_queue() ...
2018-04-13Merge branch 'i2c/for-current' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull more i2c updates from Wolfram Sang: - hot bugfix for i801 to make laptops with strange BIOS reboot again when using SMBUS Host notify - change to MAINTAINERS creating a specific fallback entry for I2C host drivers and settings its status to "Odd fixes" - a long overdue param checking for the I2C core * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: add param sanity check to i2c_transfer() MAINTAINERS: add maintainer for Renesas I2C related drivers MAINTAINERS: remove me as maintainer for I2C host drivers i2c: i801: Restore configuration at shutdown i2c: i801: Save register SMBSLVCMD value only once
2018-04-13Merge tag 'sh-for-4.17' of git://git.libc.org/linux-shLinus Torvalds
Pull arch/sh updates from Rich Felker: "Fixes for bugs in futex, device tree, and userspace breakpoint traps, and for PCI issues on SH7786" * tag 'sh-for-4.17' of git://git.libc.org/linux-sh: arch/sh: pcie-sh7786: handle non-zero DMA offset arch/sh: pcie-sh7786: adjust the memory mapping arch/sh: pcie-sh7786: adjust PCI MEM and IO regions arch/sh: pcie-sh7786: exclude unusable PCI MEM areas arch/sh: pcie-sh7786: mark unavailable PCI resource as disabled arch/sh: pci: don't use disabled resources arch/sh: make the DMA mapping operations observe dev->dma_pfn_offset arch/sh: add sh7786_mm_sel() function sh: fix debug trap failure to process signals before return to user sh: fix memory corruption of unflattened device tree sh: fix futex FUTEX_OP_SET op on userspace addresses
2018-04-13Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull more arm64 updates from Will Deacon: "A few late updates to address some issues arising from conflicts with other trees: - Removal of Qualcomm-specific Spectre-v2 mitigation in favour of the generic SMCCC-based firmware call - Fix EL2 hardening capability checking, which was bodged to reduce conflicts with the KVM tree - Add some currently unused assembler macros for managing SIMD registers which will be used by some crypto code in the next merge window" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: assembler: add macros to conditionally yield the NEON under PREEMPT arm64: assembler: add utility macros to push/pop stack frames arm64: Move the content of bpi.S to hyp-entry.S arm64: Get rid of __smccc_workaround_1_hvc_* arm64: capabilities: Rework EL2 vector hardening entry arm64: KVM: Use SMCCC_ARCH_WORKAROUND_1 for Falkor BP hardening
2018-04-13Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Martin Schwidefsky: "Three notable larger changes next to the usual bug fixing: - update the email addresses in MAINTAINERS for the s390 folks to use the simpler linux.ibm.com domain instead of the old linux.vnet.ibm.com - an update for the zcrypt device driver that removes some old and obsolete interfaces and add support for up to 256 crypto adapters - a rework of the IPL aka boot code" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (23 commits) s390: correct nospec auto detection init order s390/zcrypt: Support up to 256 crypto adapters. s390/zcrypt: Remove deprecated zcrypt proc interface. s390/zcrypt: Remove deprecated ioctls. s390/zcrypt: Make ap init functions static. MAINTAINERS: update s390 maintainers email addresses s390/ipl: remove reipl_method and dump_method s390/ipl: correct kdump reipl block checksum calculation s390/ipl: remove non-existing functions declaration s390: assume diag308 set always works s390/ipl: avoid adding scpdata to cmdline during ftp/dvd boot s390/ipl: correct ipl parmblock valid checks s390/ipl: rely on diag308 store to get ipl info s390/ipl: move ipl_flags to ipl.c s390/ipl: get rid of ipl_ssid and ipl_devno s390/ipl: unite diag308 and scsi boot ipl blocks s390/ipl: ensure loadparm valid flag is set s390/qdio: lock device while installing IRQ handler s390/qdio: clear intparm during shutdown s390/ccwgroup: require at least one ccw device ...
2018-04-13Merge branch 'l2tp-remove-unsafe-calls-to-l2tp_tunnel_find_nth'David S. Miller
Guillaume Nault says: ==================== l2tp: remove unsafe calls to l2tp_tunnel_find_nth() Using l2tp_tunnel_find_nth() is racy, because the returned tunnel can go away as soon as this function returns. This series introduce l2tp_tunnel_get_nth() as a safe replacement to fixes these races. With this series, all unsafe tunnel/session lookups are finally gone. ==================== Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-13l2tp: hold reference on tunnels printed in l2tp/tunnels debugfs fileGuillaume Nault
Use l2tp_tunnel_get_nth() instead of l2tp_tunnel_find_nth(), to be safe against concurrent tunnel deletion. Use the same mechanism as in l2tp_ppp.c for dropping the reference taken by l2tp_tunnel_get_nth(). That is, drop the reference just before looking up the next tunnel. In case of error, drop the last accessed tunnel in l2tp_dfs_seq_stop(). That was the last use of l2tp_tunnel_find_nth(). Fixes: 0ad6614048cf ("l2tp: Add debugfs files for dumping l2tp debug info") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-13l2tp: hold reference on tunnels printed in pppol2tp proc fileGuillaume Nault
Use l2tp_tunnel_get_nth() instead of l2tp_tunnel_find_nth(), to be safe against concurrent tunnel deletion. Unlike sessions, we can't drop the reference held on tunnels in pppol2tp_seq_show(). Tunnels are reused across several calls to pppol2tp_seq_start() when iterating over sessions. These iterations need the tunnel for accessing the next session. Therefore the only safe moment for dropping the reference is just before searching for the next tunnel. Normally, the last invocation of pppol2tp_next_tunnel() doesn't find any new tunnel, so it drops the last tunnel without taking any new reference. However, in case of error, pppol2tp_seq_stop() is called directly, so we have to drop the reference there. Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-13l2tp: hold reference on tunnels in netlink dumpsGuillaume Nault
l2tp_tunnel_find_nth() is unsafe: no reference is held on the returned tunnel, therefore it can be freed whenever the caller uses it. This patch defines l2tp_tunnel_get_nth() which works similarly, but also takes a reference on the returned tunnel. The caller then has to drop it after it stops using the tunnel. Convert netlink dumps to make them safe against concurrent tunnel deletion. Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-13virtio-net: add missing virtqueue kick when flushing packetsJason Wang
We tends to batch submitting packets during XDP_TX. This requires to kick virtqueue after a batch, we tried to do it through xdp_do_flush_map() which only makes sense for devmap not XDP_TX. So explicitly kick the virtqueue in this case. Reported-by: Kimitoshi Takahashi <ktaka@nii.ac.jp> Tested-by: Kimitoshi Takahashi <ktaka@nii.ac.jp> Cc: Daniel Borkmann <daniel@iogearbox.net> Fixes: 186b3c998c50 ("virtio-net: support XDP_REDIRECT") Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-13kconfig: extend output of 'listnewconfig'Don Zickus
We at Red Hat/Fedora have generally tried to have a per file breakdown of every config option we set. This makes it easy for us to add new options when they are exposed and keep a changelog of why they were set. A Fedora example is here: https://src.fedoraproject.org/cgit/rpms/kernel.git/tree/configs/fedora/generic Using various merge scripts, we build up a config file and run it through 'make listnewconfig' and 'make oldnoconfig'. The idea is to print out new config options that haven't been manually set and use the default until a patch is posted to set it properly. To speed things up, it would be nice to make it easier to generate a patch to post the default setting. The output of 'make listnewconfig' has two issues that limit us: - it doesn't provide the default value - it doesn't provide the new 'choice' options that get flagged in 'oldconfig' This patch extends 'listnewconfig' to address the above two issues. This allows us to run a script make listnewconfig | rhconfig-tool -o patches; git send-email patches/ The output of 'make listnewconfig': CONFIG_NET_EMATCH_IPT CONFIG_IPVLAN CONFIG_ICE CONFIG_NET_VENDOR_NI CONFIG_IEEE802154_MCR20A CONFIG_IR_IMON_DECODER CONFIG_IR_IMON_RAW The new output of 'make listnewconfig': CONFIG_KERNEL_XZ=n CONFIG_KERNEL_LZO=n CONFIG_NET_EMATCH_IPT=n CONFIG_IPVLAN=n CONFIG_ICE=n CONFIG_NET_VENDOR_NI=y CONFIG_IEEE802154_MCR20A=n CONFIG_IR_IMON_DECODER=n CONFIG_IR_IMON_RAW=n Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-04-13kbuild: rpm-pkg: use kernel-install as a fallback for new-kernel-pkgJavier Martinez Canillas
The new-kernel-pkg script is only present when grubby is installed, but it may not always be the case. So if the script isn't present, attempt to use the kernel-install script as a fallback instead. Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-04-13btrfs: Only check first key for committed tree blocksQu Wenruo
When looping btrfs/074 with many cpus (>= 8), it's possible to trigger kernel warning due to first key verification: [ 4239.523446] WARNING: CPU: 5 PID: 2381 at fs/btrfs/disk-io.c:460 btree_read_extent_buffer_pages+0x1ad/0x210 [ 4239.523830] Modules linked in: [ 4239.524630] RIP: 0010:btree_read_extent_buffer_pages+0x1ad/0x210 [ 4239.527101] Call Trace: [ 4239.527251] read_tree_block+0x42/0x70 [ 4239.527434] read_node_slot+0xd2/0x110 [ 4239.527632] push_leaf_right+0xad/0x1b0 [ 4239.527809] split_leaf+0x4ea/0x700 [ 4239.527988] ? leaf_space_used+0xbc/0xe0 [ 4239.528192] ? btrfs_set_lock_blocking_rw+0x99/0xb0 [ 4239.528416] btrfs_search_slot+0x8cc/0xa40 [ 4239.528605] btrfs_insert_empty_items+0x71/0xc0 [ 4239.528798] __btrfs_run_delayed_refs+0xa98/0x1680 [ 4239.529013] btrfs_run_delayed_refs+0x10b/0x1b0 [ 4239.529205] btrfs_commit_transaction+0x33/0xaf0 [ 4239.529445] ? start_transaction+0xa8/0x4f0 [ 4239.529630] btrfs_alloc_data_chunk_ondemand+0x1b0/0x4e0 [ 4239.529833] btrfs_check_data_free_space+0x54/0xa0 [ 4239.530045] btrfs_delalloc_reserve_space+0x25/0x70 [ 4239.531907] btrfs_direct_IO+0x233/0x3d0 [ 4239.532098] generic_file_direct_write+0xcb/0x170 [ 4239.532296] btrfs_file_write_iter+0x2bb/0x5f4 [ 4239.532491] aio_write+0xe2/0x180 [ 4239.532669] ? lock_acquire+0xac/0x1e0 [ 4239.532839] ? __might_fault+0x3e/0x90 [ 4239.533032] do_io_submit+0x594/0x860 [ 4239.533223] ? do_io_submit+0x594/0x860 [ 4239.533398] SyS_io_submit+0x10/0x20 [ 4239.533560] ? SyS_io_submit+0x10/0x20 [ 4239.533729] do_syscall_64+0x75/0x1d0 [ 4239.533979] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 4239.534182] RIP: 0033:0x7f8519741697 The problem here is, at btree_read_extent_buffer_pages() we don't have acquired read/write lock on that extent buffer, only basic info like level/bytenr is reliable. So race condition leads to such false alert. However in current call site, it's impossible to acquire proper lock without race window. To fix the problem, we only verify first key for committed tree blocks (whose generation is no larger than fs_info->last_trans_committed), so the content of such tree blocks will not change and there is no need to get read/write lock. Reported-by: Nikolay Borisov <nborisov@suse.com> Fixes: 581c1760415c ("btrfs: Validate child tree block's level and first key") Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-04-13MAINTAINERS: add an entry for FSNOTIFY infrastructureAmir Goldstein
There is alreay an entry for all the backends, but those entries do not cover all the fsnotify files. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2018-04-13fsnotify: fix typo in a comment about mark->g_listAmir Goldstein
Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2018-04-13fsnotify: fix ignore mask logic in send_to_group()Amir Goldstein
The ignore mask logic in send_to_group() does not match the logic in fanotify_should_send_event(). In the latter, a vfsmount mark ignore mask precedes an inode mark mask and in the former, it does not. That difference may cause events to be sent to fanotify backend for no reason. Fix the logic in send_to_group() to match that of fanotify_should_send_event(). Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2018-04-13powerpc/64s: Fix CPU_FTRS_ALWAYS vs DT CPU featuresMichael Ellerman
The cpu_has_feature() mechanism has an optimisation where at build time we construct a mask of the CPU feature bits that will always be true for the given .config, based on the platform/bitness/etc. that we are building for. That is incompatible with DT CPU features, where the set of CPU features is dependent on feature flags that are given to us by firmware. The result is that some feature bits can not be *disabled* by DT CPU features. Or more accurately, they can be disabled but they will still appear in the ALWAYS mask, meaning cpu_has_feature() will always return true for them. In the past this hasn't really been a problem because on Book3S 64 (where we support DT CPU features), the set of ALWAYS bits has been very small. That was because we always built for POWER4 and later, meaning the set of common bits was small. The only bit that could be cleared by DT CPU features that was also in the ALWAYS mask was CPU_FTR_NODSISRALIGN, and that was only used in the alignment handler to create a fake DSISR. That code was itself deleted in 31bfdb036f12 ("powerpc: Use instruction emulation infrastructure to handle alignment faults") (Sep 2017). However the set of ALWAYS features changed with the recent commit db5ae1c155af ("powerpc/64s: Refine feature sets for little endian builds") which restricted the set of feature flags when building little endian to Power7 or later. That caused the ALWAYS mask to become much larger for little endian builds. The result is that the following feature bits can currently not be *disabled* by DT CPU features: CPU_FTR_REAL_LE, CPU_FTR_MMCRA, CPU_FTR_CTRL, CPU_FTR_SMT, CPU_FTR_PURR, CPU_FTR_SPURR, CPU_FTR_DSCR, CPU_FTR_PKEY, CPU_FTR_VMX_COPY, CPU_FTR_CFAR, CPU_FTR_HAS_PPR. To fix it we need to mask the set of ALWAYS features with the base set of DT CPU features, ie. the features that are always enabled by DT CPU features. That way there are no bits in the ALWAYS mask that are not also always set by DT CPU features. Fixes: db5ae1c155af ("powerpc/64s: Refine feature sets for little endian builds") Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-13firmware: dmi_scan: Use lowercase letters for UUIDJean Delvare
RFC 4122 asks for letters a-f in UUID to be lowercase. Follow this recommendation. Suggested by Paul Dagnelie at: https://savannah.nongnu.org/bugs/index.php?53569 Signed-off-by: Jean Delvare <jdelvare@suse.de>
2018-04-13firmware: dmi_scan: Add DMI_OEM_STRING support to dmi_matchesAlex Hung
OEM strings are defined by each OEM and they contain customized and useful OEM information. Supporting it provides more flexible uses of the dmi_matches function. Signed-off-by: Alex Hung <alex.hung@canonical.com> Signed-off-by: Jean Delvare <jdelvare@suse.de>
2018-04-13firmware: dmi_scan: Fix UUID length safety checkJean Delvare
The test which ensures that the DMI type 1 structure is long enough to hold the UUID is off by one. It would fail if the structure is exactly 24 bytes long, while that's sufficient to hold the UUID. I don't expect this bug to cause problem in practice because all implementations I have seen had length 8, 25 or 27 bytes, in line with the SMBIOS specifications. But let's fix it still. Signed-off-by: Jean Delvare <jdelvare@suse.de> Fixes: a814c3597a6b ("firmware: dmi_scan: Check DMI structure length") Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2018-04-13perf annotate: Handle variables in 'sub', 'or' and many other instructionsArnaldo Carvalho de Melo
Just like is done for 'mov' and others that can have as source or targets variables resolved by objdump, to make them more compact: - orb $0x4,0x224d71(%rip) # 226ca4 <_rtld_global+0xca4> + orb $0x4,_rtld_global+0xca4 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-efex7746id4w4wa03nqxvh3m@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-13perf annotate: Allow setting the offset level in .perfconfigArnaldo Carvalho de Melo
The default is 1 (jump_target): # perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave Samples: 3K of event 'cycles:ppp', 3000 Hz, Event count (approx.): 2766398574 _raw_spin_lock_irqsave() /proc/kcore 0.26 nop 4.61 push %rbx 19.33 pushfq 7.97 pop %rax 0.32 nop 0.06 mov %rax,%rbx 14.63 cli 0.06 nop xor %eax,%eax mov $0x1,%edx 49.94 lock cmpxchg %edx,(%rdi) 0.16 test %eax,%eax ↓ jne 2b 2.66 mov %rbx,%rax pop %rbx ← retq 2b: mov %eax,%esi → callq *ffffffffb30eaed0 mov %rbx,%rax pop %rbx ← retq # But one can ask for showing offsets for call instructions by setting this: # perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave Samples: 3K of event 'cycles:ppp', 3000 Hz, Event count (approx.): 2766398574 _raw_spin_lock_irqsave() /proc/kcore 0.26 nop 4.61 push %rbx 19.33 pushfq 7.97 pop %rax 0.32 nop 0.06 mov %rax,%rbx 14.63 cli 0.06 nop xor %eax,%eax mov $0x1,%edx 49.94 lock cmpxchg %edx,(%rdi) 0.16 test %eax,%eax ↓ jne 2b 2.66 mov %rbx,%rax pop %rbx ← retq 2b: mov %eax,%esi 2d: → callq *ffffffffb30eaed0 mov %rbx,%rax pop %rbx ← retq # Or using a big value to ask for all offsets to be shown: # cat ~/.perfconfig [annotate] offset_level = 100 hide_src_code = true # perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave Samples: 3K of event 'cycles:ppp', 3000 Hz, Event count (approx.): 2766398574 _raw_spin_lock_irqsave() /proc/kcore 0.26 0: nop 4.61 5: push %rbx 19.33 6: pushfq 7.97 7: pop %rax 0.32 8: nop 0.06 d: mov %rax,%rbx 14.63 10: cli 0.06 11: nop 17: xor %eax,%eax 19: mov $0x1,%edx 49.94 1e: lock cmpxchg %edx,(%rdi) 0.16 22: test %eax,%eax 24: ↓ jne 2b 2.66 26: mov %rbx,%rax 29: pop %rbx 2a: ← retq 2b: mov %eax,%esi 2d: → callq *ffffffffb30eaed0 32: mov %rbx,%rax 35: pop %rbx 36: ← retq # This also affects the TUI, i.e. the default 'perf annotate' and 'perf top/report' -> A hotkey -> annotate interfaces, when slang-devel is present in the build, i.e.: # perf version --build-options | grep slang libslang: [ on ] # HAVE_SLANG_SUPPORT # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-venm6x5zrt40eu8hxdsmqxz6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-13perf report: Fix switching to another perf.data fileArnaldo Carvalho de Melo
In the TUI the 's' hotkey can be used to switch to another perf.data file in the current directory, but that got broken in Fixes: b01141f4f59c ("perf annotate: Initialize the priv are in symbol__new()"), that would show this once another file was chosen: ┌─Fatal Error─────────────────────────────────────┐ │Annotation needs to be init before symbol__init()│ │ │ │ │ │Press any key... │ └─────────────────────────────────────────────────┘ Fix it by just silently bailing out if symbol__annotation_init() was already called, just like is done with symbol__init(), i.e. they are done just once at session start, not when switching to a new perf.data file. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: b01141f4f59c ("perf annotate: Initialize the priv are in symbol__new()") Link: https://lkml.kernel.org/n/tip-ogppdtpzfax7y1h6gjdv5s6u@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-13perf record: Change warning for missing sysfs entry to debugThomas Richter
Using perf on 4.16.0 kernel on s390 shows this warning: failed: can't open node sysfs data each time I run command perf record ... for example: [root@s35lp76 perf]# ./perf record -e rB0000 -- sleep 1 [ perf record: Woken up 1 times to write data ] failed: can't open node sysfs data [ perf record: Captured and wrote 0.001 MB perf.data (4 samples) ] [root@s35lp76 perf]# It turns out commit e2091cedd51bf ("perf tools: Add MEM_TOPOLOGY feature to perf data file") tries to open directory named /sys/devices/system/node/ which does not exist on s390. This is the call stack: __cmd_record +---> perf_session__write_header +---> perf_header__adds_write +---> do_write_feat +---> write_mem_topology +---> build_mem_topology prints warning The issue starts in do_write_feat() which unconditionally loops over all features and now includes HEADER_MEM_TOPOLOGY and calls write_mem_topology(). Function record__init_features() at the beginning of __cmd_record() sets all features and then turns off some of them. Fix this by changing the warning to a level 2 debug output statement. So it is only shown when debug level 2 or higher is set. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180412133246.92801-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-13Merge branches 'thermal-core' and 'thermal-soc' into nextZhang Rui
2018-04-12Merge tag 'drm-fixes-for-v4.17-rc1' of ↵Linus Torvalds
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "One omap, and one alsa pm fix (we merged the breaking patch via drm tree). Otherwise it's two bunches of amdgpu fixes, removing an unneeded file, some DC fixes, HDMI audio regression fix, and some vega12 fixes" * tag 'drm-fixes-for-v4.17-rc1' of git://people.freedesktop.org/~airlied/linux: (27 commits) Revert "drm/amd/display: disable CRTCs with NULL FB on their primary plane (V2)" Revert "drm/amd/display: fix dereferencing possible ERR_PTR()" drm/amd/display: Fix regamma not affecting full-intensity color values drm/amd/display: Fix FBC text console corruption drm/amd/display: Only register backlight device if embedded panel connected drm/amd/display: fix brightness level after resume from suspend drm/amd/display: HDMI has no sound after Panel power off/on drm/amdgpu: add MP1 and THM hw ip base reg offset drm/amdgpu: fix null pointer panic with direct fw loading on gpu reset drm/radeon: add PX quirk for Asus K73TK drm/omap: fix crash if there's no video PLL drm/amdgpu: Fix memory leaks at amdgpu_init() error path drm/amdgpu: Fix PCIe lane width calculation drm/radeon: Fix PCIe lane width calculation drm/amdgpu/si: implement get/set pcie_lanes asic callback drm/amdgpu: Add support for SRBM selection v3 Revert "drm/amdgpu: Don't change preferred domian when fallback GTT v5" drm/amd/powerply: fix power reading on Fiji drm/amd/powerplay: Enable ACG SS feature drm/amdgpu/sdma: fix mask in emit_pipeline_sync ...
2018-04-12Merge tag 'trace-v4.17-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "A few clean ups and bug fixes: - replace open coded "ARRAY_SIZE()" with macro - updates to uprobes - bug fix for perf event filter on error path" * tag 'trace-v4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Enforce passing in filter=NULL to create_filter() trace_uprobe: Simplify probes_seq_show() trace_uprobe: Use %lx to display offset tracing/uprobe: Add support for overlayfs tracing: Use ARRAY_SIZE() macro instead of open coding it
2018-04-12proc: fixup copyright signAlexey Dobriyan
Add copyright in two files before they get autorubberstamped. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-12Merge tag 'pci-v4.17-changes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - mark Extended Tags as broken on Broadcom HT1100 and HT2000 Root Ports to fix drm/Xorg hangs and unresponsive keyboards (Sinan Kaya) - remove useless messages during resource reassignment (Desnes A. Nunes do Rosario) * tag 'pci-v4.17-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Remove messages about reassigning resources PCI: Mark Broadcom HT1100 and HT2000 Root Port Extended Tags as broken
2018-04-12net: dsa: mv88e6xxx: Fix receive time stamp race condition.Richard Cochran
The DSA stack passes received PTP frames to this driver via mv88e6xxx_port_rxtstamp() for deferred delivery. The driver then queues the frame and kicks the worker thread. The work callback reads out the latched receive time stamp and then works through the queue, delivering any non-matching frames without a time stamp. If a new frame arrives after the worker thread has read out the time stamp register but enters the queue before the worker finishes processing the queue, that frame will be delivered without a time stamp. This patch fixes the race by moving the queue onto a list on the stack before reading out the latched time stamp value. Fixes: c6fe0ad2c3499 ("net: dsa: mv88e6xxx: add rx/tx timestamping support") Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-12net: fix deadlock while clearing neighbor proxy tableWolfgang Bumiller
When coming from ndisc_netdev_event() in net/ipv6/ndisc.c, neigh_ifdown() is called with &nd_tbl, locking this while clearing the proxy neighbor entries when eg. deleting an interface. Calling the table's pndisc_destructor() with the lock still held, however, can cause a deadlock: When a multicast listener is available an IGMP packet of type ICMPV6_MGM_REDUCTION may be sent out. When reaching ip6_finish_output2(), if no neighbor entry for the target address is found, __neigh_create() is called with &nd_tbl, which it'll want to lock. Move the elements into their own list, then unlock the table and perform the destruction. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199289 Fixes: 6fd6ce2056de ("ipv6: Do not depend on rt->n in ip6_finish_output2().") Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-12sctp: do not check port in sctp_inet6_cmp_addrXin Long
pf->cmp_addr() is called before binding a v6 address to the sock. It should not check ports, like in sctp_inet_cmp_addr. But sctp_inet6_cmp_addr checks the addr by invoking af(6)->cmp_addr, sctp_v6_cmp_addr where it also compares the ports. This would cause that setsockopt(SCTP_SOCKOPT_BINDX_ADD) could bind multiple duplicated IPv6 addresses after Commit 40b4f0fd74e4 ("sctp: lack the check for ports in sctp_v6_cmp_addr"). This patch is to remove af->cmp_addr called in sctp_inet6_cmp_addr, but do the proper check for both v6 addrs and v4mapped addrs. v1->v2: - define __sctp_v6_cmp_addr to do the common address comparison used for both pf and af v6 cmp_addr. Fixes: 40b4f0fd74e4 ("sctp: lack the check for ports in sctp_v6_cmp_addr") Reported-by: Jianwen Ji <jiji@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-12Merge branch ↵David S. Miller
'nfp-improve-signal-handing-on-FW-waits-and-flower-control-message-Jakub Kicinski says: ==================== nfp: improve signal handing on FW waits and flower control message processing The first part of this set aims to improve handling of interrupted waits. Patch 1 makes waiting for management FW responses uninterruptible while patch 2 adds a message when signal arrives while waiting for an NFP mutex. We can't interrupt execution of FW commands so uninterruptible sleep seems reasonable there. Exiting a wait for a mutex should be clean and have no side affects so we are allowing to abort it. Note that both waits have rather large timeouts (tens of seconds). Patches 3 and 4 improve flower offload operation under heavy load. Currently there is no cap on the number of queued FW notifications. Some of the notifications have to be processed from a workqueue which may lead to very large number of messages getting queued if workqueue never gets a chance to run. Pieter puts a limit on number of queued messages, tries to drop some messages we ignore without queuing and process more important messages first. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> processing'
2018-04-12nfp: flower: split and limit cmsg skb listsPieter Jansen van Vuuren
Introduce a second skb list for handling control messages and limit the number of allowed messages. Some control messages are considered more crucial than others, resulting in the need for a second skb list. By splitting the list into a separate high and low priority list we can ensure that messages on the high list get added to the head of the list that gets processed, this however has no functional impact. Previously there was no limit on the number of messages allowed on the queue, this could result in the queue growing boundlessly and eventually the host running out of memory. Fixes: b985f870a5f0 ("nfp: process control messages in workqueue in flower app") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-12nfp: flower: move route ack control messages out of the workqueuePieter Jansen van Vuuren
Previously we processed the route ack control messages in the workqueue, this unnecessarily loads the workqueue. We can deal with these messages sooner as we know we are going to drop them. Fixes: 8e6a9046b66a ("nfp: flower vxlan neighbour offload") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-12nfp: print a message when mutex wait is interruptedJakub Kicinski
When waiting for an NFP mutex is interrupted print a message to make root causing later error messages easier. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-12nfp: ignore signals when communicating with management FWJakub Kicinski
We currently allow signals to interrupt the wait for management FW commands. Exiting the wait should not cause trouble, the FW will just finish executing the command in the background and new commands will wait for the old one to finish. However, this may not be what users expect (Ctrl-C not actually stopping the command). Moreover some systems routinely request link information with signals pending (Ubuntu 14.04 runs a landscape-sysinfo python tool from MOTD) worrying users with errors like these: nfp 0000:04:00.0: nfp_nsp: Error -512 waiting for code 0x0007 to start nfp 0000:04:00.0: nfp: reading port table failed -512 Make the wait for management FW responses non-interruptible. Fixes: 1a64821c6af7 ("nfp: add support for service processor access") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-12tipc: fix missing initializer in tipc_sendmsg()Jon Maloy
The stack variable 'dnode' in __tipc_sendmsg() may theoretically end up tipc_node_get_mtu() as an unitilalized variable. We fix this by intializing the variable at declaration. We also add a default else clause to the two conditional ones already there, so that we never end up in the named function if the given address type is illegal. Reported-by: syzbot+b0975ce9355b347c1546@syzkaller.appspotmail.com Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-12strparser: Fix incorrect strp->need_bytes value.Doron Roberts-Kedes
strp_data_ready resets strp->need_bytes to 0 if strp_peek_len indicates that the remainder of the message has been received. However, do_strp_work does not reset strp->need_bytes to 0. If do_strp_work completes a partial message, the value of strp->need_bytes will continue to reflect the needed bytes of the previous message, causing future invocations of strp_data_ready to return early if strp->need_bytes is less than strp_peek_len. Resetting strp->need_bytes to 0 in __strp_recv on handing a full message to the upper layer solves this problem. __strp_recv also calculates strp->need_bytes using stm->accum_len before stm->accum_len has been incremented by cand_len. This can cause strp->need_bytes to be equal to the full length of the message instead of the full length minus the accumulated length. This, in turn, causes strp_data_ready to return early, even when there is sufficient data to complete the partial message. Incrementing stm->accum_len before using it to calculate strp->need_bytes solves this problem. Found while testing net/tls_sw recv path. Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages") Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-12selftests: net: add in_netns.sh to TEST_PROGSAnders Roxell
Script in_netns.sh isn't installed. -------------------- running psock_fanout test -------------------- ./run_afpackettests: line 12: ./in_netns.sh: No such file or directory [FAIL] -------------------- running psock_tpacket test -------------------- ./run_afpackettests: line 22: ./in_netns.sh: No such file or directory [FAIL] In current code added in_netns.sh to be installed. Fixes: cc30c93fa020 ("selftests/net: ignore background traffic in psock_fanout") Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-12Merge branch 'ibmvnic-Fix-parameter-change-request-handling'David S. Miller
Nathan Fontenot says: ==================== ibmvnic: Fix parameter change request handling When updating parameters for the ibmvnic driver there is a possibility of entering an infinite loop if a return value other that a partial success is received from sending the login CRQ. Also, a deadlock can occur on the rtnl lock if netdev_notify_peers() is called during driver reset for a parameter change reset. This patch set corrects both of these issues by updating the return code handling in ibmvnic_login() nand gaurding against calling netdev_notify_peers() for parameter change requests. Updates for V2: Correct spelling mistakes in commit messages. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-12ibmvnic: Do not notify peers on parameter change resetsNathan Fontenot
When attempting to change the driver parameters, such as the MTU value or number of queues, do not call netdev_notify_peers(). Doing so will deadlock on the rtnl_lock. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-12ibmvnic: Handle all login error conditionsNathan Fontenot
There is a bug in handling the possible return codes from sending the login CRQ. The current code treats any non-success return value, minus failure to send the crq and a timeout waiting for a login response, as a need to re-send the login CRQ. This can put the drive in an infinite loop of trying to login when getting return values other that a partial success such as a return code of aborted. For these scenarios the login will not ever succeed at this point and the driver would need to be reset again. To resolve this loop trying to login is updated to only retry the login if the driver gets a return code of a partial success. Other return codes are treated as an error and the driver returns an error from ibmvnic_login(). To avoid infinite looping in the partial success return cases, the number of retries is capped at the maximum number of supported queues. This value was chosen because the driver does a renegotiation of capabilities which sets the number of queues possible and allows the driver to attempt a login for possible value for the number of queues supported. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-12net: validate attribute sizes in neigh_dump_table()Eric Dumazet
Since neigh_dump_table() calls nlmsg_parse() without giving policy constraints, attributes can have arbirary size that we must validate Reported by syzbot/KMSAN : BUG: KMSAN: uninit-value in neigh_master_filtered net/core/neighbour.c:2292 [inline] BUG: KMSAN: uninit-value in neigh_dump_table net/core/neighbour.c:2348 [inline] BUG: KMSAN: uninit-value in neigh_dump_info+0x1af0/0x2250 net/core/neighbour.c:2438 CPU: 1 PID: 3575 Comm: syzkaller268891 Not tainted 4.16.0+ #83 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:53 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676 neigh_master_filtered net/core/neighbour.c:2292 [inline] neigh_dump_table net/core/neighbour.c:2348 [inline] neigh_dump_info+0x1af0/0x2250 net/core/neighbour.c:2438 netlink_dump+0x9ad/0x1540 net/netlink/af_netlink.c:2225 __netlink_dump_start+0x1167/0x12a0 net/netlink/af_netlink.c:2322 netlink_dump_start include/linux/netlink.h:214 [inline] rtnetlink_rcv_msg+0x1435/0x1560 net/core/rtnetlink.c:4598 netlink_rcv_skb+0x355/0x5f0 net/netlink/af_netlink.c:2447 rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4653 netlink_unicast_kernel net/netlink/af_netlink.c:1311 [inline] netlink_unicast+0x1672/0x1750 net/netlink/af_netlink.c:1337 netlink_sendmsg+0x1048/0x1310 net/netlink/af_netlink.c:1900 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg net/socket.c:640 [inline] ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046 __sys_sendmsg net/socket.c:2080 [inline] SYSC_sendmsg+0x2a3/0x3d0 net/socket.c:2091 SyS_sendmsg+0x54/0x80 net/socket.c:2087 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x43fed9 RSP: 002b:00007ffddbee2798 EFLAGS: 00000213 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fed9 RDX: 0000000000000000 RSI: 0000000020005000 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 00000000004002c8 R09: 00000000004002c8 R10: 00000000004002c8 R11: 0000000000000213 R12: 0000000000401800 R13: 0000000000401890 R14: 0000000000000000 R15: 0000000000000000 Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline] kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188 kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314 kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321 slab_post_alloc_hook mm/slab.h:445 [inline] slab_alloc_node mm/slub.c:2737 [inline] __kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369 __kmalloc_reserve net/core/skbuff.c:138 [inline] __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206 alloc_skb include/linux/skbuff.h:984 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1183 [inline] netlink_sendmsg+0x9a6/0x1310 net/netlink/af_netlink.c:1875 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg net/socket.c:640 [inline] ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046 __sys_sendmsg net/socket.c:2080 [inline] SYSC_sendmsg+0x2a3/0x3d0 net/socket.c:2091 SyS_sendmsg+0x54/0x80 net/socket.c:2087 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 Fixes: 21fdd092acc7 ("net: Add support for filtering neigh dump by master device") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Ahern <dsa@cumulusnetworks.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-12tcp: md5: reject TCP_MD5SIG or TCP_MD5SIG_EXT on established socketsEric Dumazet
syzbot/KMSAN reported an uninit-value in tcp_parse_options() [1] I believe this was caused by a TCP_MD5SIG being set on live flow. This is highly unexpected, since TCP option space is limited. For instance, presence of TCP MD5 option automatically disables TCP TimeStamp option at SYN/SYNACK time, which we can not do once flow has been established. Really, adding/deleting an MD5 key only makes sense on sockets in CLOSE or LISTEN state. [1] BUG: KMSAN: uninit-value in tcp_parse_options+0xd74/0x1a30 net/ipv4/tcp_input.c:3720 CPU: 1 PID: 6177 Comm: syzkaller192004 Not tainted 4.16.0+ #83 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:53 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676 tcp_parse_options+0xd74/0x1a30 net/ipv4/tcp_input.c:3720 tcp_fast_parse_options net/ipv4/tcp_input.c:3858 [inline] tcp_validate_incoming+0x4f1/0x2790 net/ipv4/tcp_input.c:5184 tcp_rcv_established+0xf60/0x2bb0 net/ipv4/tcp_input.c:5453 tcp_v4_do_rcv+0x6cd/0xd90 net/ipv4/tcp_ipv4.c:1469 sk_backlog_rcv include/net/sock.h:908 [inline] __release_sock+0x2d6/0x680 net/core/sock.c:2271 release_sock+0x97/0x2a0 net/core/sock.c:2786 tcp_sendmsg+0xd6/0x100 net/ipv4/tcp.c:1464 inet_sendmsg+0x48d/0x740 net/ipv4/af_inet.c:764 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg net/socket.c:640 [inline] SYSC_sendto+0x6c3/0x7e0 net/socket.c:1747 SyS_sendto+0x8a/0xb0 net/socket.c:1715 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x448fe9 RSP: 002b:00007fd472c64d38 EFLAGS: 00000216 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00000000006e5a30 RCX: 0000000000448fe9 RDX: 000000000000029f RSI: 0000000020a88f88 RDI: 0000000000000004 RBP: 00000000006e5a34 R08: 0000000020e68000 R09: 0000000000000010 R10: 00000000200007fd R11: 0000000000000216 R12: 0000000000000000 R13: 00007fff074899ef R14: 00007fd472c659c0 R15: 0000000000000009 Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline] kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188 kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314 kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321 slab_post_alloc_hook mm/slab.h:445 [inline] slab_alloc_node mm/slub.c:2737 [inline] __kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369 __kmalloc_reserve net/core/skbuff.c:138 [inline] __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206 alloc_skb include/linux/skbuff.h:984 [inline] tcp_send_ack+0x18c/0x910 net/ipv4/tcp_output.c:3624 __tcp_ack_snd_check net/ipv4/tcp_input.c:5040 [inline] tcp_ack_snd_check net/ipv4/tcp_input.c:5053 [inline] tcp_rcv_established+0x2103/0x2bb0 net/ipv4/tcp_input.c:5469 tcp_v4_do_rcv+0x6cd/0xd90 net/ipv4/tcp_ipv4.c:1469 sk_backlog_rcv include/net/sock.h:908 [inline] __release_sock+0x2d6/0x680 net/core/sock.c:2271 release_sock+0x97/0x2a0 net/core/sock.c:2786 tcp_sendmsg+0xd6/0x100 net/ipv4/tcp.c:1464 inet_sendmsg+0x48d/0x740 net/ipv4/af_inet.c:764 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg net/socket.c:640 [inline] SYSC_sendto+0x6c3/0x7e0 net/socket.c:1747 SyS_sendto+0x8a/0xb0 net/socket.c:1715 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 Fixes: cfb6eeb4c860 ("[TCP]: MD5 Signature Option (RFC2385) support.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-12tipc: fix unbalanced reference counterJon Maloy
When a topology subscription is created, we may encounter (or KASAN may provoke) a failure to create a corresponding service instance in the binding table. Instead of letting the tipc_nametbl_subscribe() report the failure back to the caller, the function just makes a warning printout and returns, without incrementing the subscription reference counter as expected by the caller. This makes the caller believe that the subscription was successful, so it will at a later moment try to unsubscribe the item. This involves a sub_put() call. Since the reference counter never was incremented in the first place, we get a premature delete of the subscription item, followed by a "use-after-free" warning. We fix this by adding a return value to tipc_nametbl_subscribe() and make the caller aware of the failure to subscribe. This bug seems to always have been around, but this fix only applies back to the commit shown below. Given the low risk of this happening we believe this to be sufficient. Fixes: commit 218527fe27ad ("tipc: replace name table service range array with rb tree") Reported-by: syzbot+aa245f26d42b8305d157@syzkaller.appspotmail.com Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-12lan78xx: PHY DSP registers initialization to address EEE link drop issues ↵Raghuram Chary J
with long cables The patch is to configure DSP registers of PHY device to handle Gbe-EEE failures with >40m cable length. Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver") Signed-off-by: Raghuram Chary J <raghuramchary.jallipalli@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-12mISDN: Remove VLAsLaura Abbott
There's an ongoing effort to remove VLAs[1] from the kernel to eventually turn on -Wvla. Remove the VLAs from the mISDN code by switching to using kstrdup in one place and using an upper bound in another. Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-12net/tls: Remove VLA usageKees Cook
In the quest to remove VLAs from the kernel[1], this replaces the VLA size with the only possible size used in the code, and adds a mechanism to double-check future IV sizes. [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>