summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2016-12-14Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge more updates from Andrew Morton: - a few misc things - kexec updates - DMA-mapping updates to better support networking DMA operations - IPC updates - various MM changes to improve DAX fault handling - lots of radix-tree changes, mainly to the test suite. All leading up to reimplementing the IDA/IDR code to be a wrapper layer over the radix-tree. However the final trigger-pulling patch is held off for 4.11. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (114 commits) radix tree test suite: delete unused rcupdate.c radix tree test suite: add new tag check radix-tree: ensure counts are initialised radix tree test suite: cache recently freed objects radix tree test suite: add some more functionality idr: reduce the number of bits per level from 8 to 6 rxrpc: abstract away knowledge of IDR internals tpm: use idr_find(), not idr_find_slowpath() idr: add ida_is_empty radix tree test suite: check multiorder iteration radix-tree: fix replacement for multiorder entries radix-tree: add radix_tree_split_preload() radix-tree: add radix_tree_split radix-tree: add radix_tree_join radix-tree: delete radix_tree_range_tag_if_tagged() radix-tree: delete radix_tree_locate_item() radix-tree: improve multiorder iterators btrfs: fix race in btrfs_free_dummy_fs_info() radix-tree: improve dump output radix-tree: make radix_tree_find_next_bit more useful ...
2016-12-14rxrpc: abstract away knowledge of IDR internalsMatthew Wilcox
Add idr_get_cursor() / idr_set_cursor() APIs, and remove the reference to IDR_SIZE. Link: http://lkml.kernel.org/r/1480369871-5271-65-git-send-email-mawilcox@linuxonhyperv.com Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: David Howells <dhowells@redhat.com> Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14netfilter: nft_payload: mangle ckecksum if NFT_PAYLOAD_L4CSUM_PSEUDOHDR is setPablo Neira Ayuso
If the NFT_PAYLOAD_L4CSUM_PSEUDOHDR flag is set, then mangle layer 4 checksum. This should not depend on csum_type NFT_PAYLOAD_CSUM_INET since IPv6 header has no checksum field, but still an update of any of the pseudoheader fields may trigger a layer 4 checksum update. Fixes: 1814096980bb ("netfilter: nft_payload: layer 4 checksum adjustment for pseudoheader fields") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-14netfilter: nf_tables: fix oob accessFlorian Westphal
BUG: KASAN: slab-out-of-bounds in nf_tables_rule_destroy+0xf1/0x130 at addr ffff88006a4c35c8 Read of size 8 by task nft/1607 When we've destroyed last valid expr, nft_expr_next() returns an invalid expr. We must not dereference it unless it passes != nft_expr_last() check. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-14netfilter: nft_queue: use raw_smp_processor_id()Pablo Neira Ayuso
Using smp_processor_id() causes splats with PREEMPT_RCU: [19379.552780] BUG: using smp_processor_id() in preemptible [00000000] code: ping/32389 [19379.552793] caller is debug_smp_processor_id+0x17/0x19 [...] [19379.552823] Call Trace: [19379.552832] [<ffffffff81274e9e>] dump_stack+0x67/0x90 [19379.552837] [<ffffffff8129a4d4>] check_preemption_disabled+0xe5/0xf5 [19379.552842] [<ffffffff8129a4fb>] debug_smp_processor_id+0x17/0x19 [19379.552849] [<ffffffffa07c42dd>] nft_queue_eval+0x35/0x20c [nft_queue] No need to disable preemption since we only fetch the numeric value, so let's use raw_smp_processor_id() instead. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-14netfilter: nft_quota: reset quota after dumpPablo Neira Ayuso
Dumping of netlink attributes may fail due to insufficient room in the skbuff, so let's reset consumed quota if we succeed to put netlink attributes into the skbuff. Fixes: 43da04a593d8 ("netfilter: nf_tables: atomic dump and reset for stateful objects") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-14Merge branch 'stable-4.10' of git://git.infradead.org/users/pcmoore/auditLinus Torvalds
Pull audit updates from Paul Moore: "After the small number of patches for v4.9, we've got a much bigger pile for v4.10. The bulk of these patches involve a rework of the audit backlog queue to enable us to move the netlink multicasting out of the task/thread that generates the audit record and into the kernel thread that emits the record (just like we do for the audit unicast to auditd). While we were playing with the backlog queue(s) we fixed a number of other little problems with the code, and from all the testing so far things look to be in much better shape now. Doing this also allowed us to re-enable disabling IRQs for some netns operations ("netns: avoid disabling irq for netns id"). The remaining patches fix some small problems that are well documented in the commit descriptions, as well as adding session ID filtering support" * 'stable-4.10' of git://git.infradead.org/users/pcmoore/audit: audit: use proper refcount locking on audit_sock netns: avoid disabling irq for netns id audit: don't ever sleep on a command record/message audit: handle a clean auditd shutdown with grace audit: wake up kauditd_thread after auditd registers audit: rework audit_log_start() audit: rework the audit queue handling audit: rename the queues and kauditd related functions audit: queue netlink multicast sends just like we do for unicast sends audit: fixup audit_init() audit: move kaudit thread start from auditd registration to kaudit init (#2) audit: add support for session ID user filter audit: fix formatting of AUDIT_CONFIG_CHANGE events audit: skip sessionid sentinel value when auto-incrementing audit: tame initialization warning len_abuf in audit_log_execve_info audit: less stack usage for /proc/*/loginuid
2016-12-14libceph: remove now unused finish_request() wrapperIlya Dryomov
Kill the wrapper and rename __finish_request() to finish_request(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-12-14libceph: always signal completion when doneIlya Dryomov
r_safe_completion is currently, and has always been, signaled only if on-disk ack was requested. It's there for fsync and syncfs, which wait for in-flight writes to flush - all data write requests set ONDISK. However, the pool perm check code introduced in 4.2 sends a write request with only ACK set. An unfortunately timed syncfs can then hang forever: r_safe_completion won't be signaled because only an unsafe reply was requested. We could patch ceph_osdc_sync() to skip !ONDISK write requests, but that is somewhat incomplete and yet another special case. Instead, rename this completion to r_done_completion and always signal it when the OSD client is done with the request, whether unsafe, safe, or error. This is a bit cleaner and helps with the cancellation code. Reported-by: Yan, Zheng <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-12-14netns: avoid disabling irq for netns idPaul Moore
Bring back commit bc51dddf98c9 ("netns: avoid disabling irq for netns id") now that we've fixed some audit multicast issues that caused problems with original attempt. Additional information, and history, can be found in the links below: * https://github.com/linux-audit/audit-kernel/issues/22 * https://github.com/linux-audit/audit-kernel/issues/23 Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-12-14rds_rdma: log the connection reject messageSteve Wise
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13mac80211: Use appropriate name for functions and messagesMasashi Honma
These functions drifts TSF timers, not TBTT. Signed-off-by: Masashi Honma <masashi.honma@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-13mac80211: Remove invalid flag operations in mesh TSF synchronizationMasashi Honma
mesh_sync_offset_adjust_tbtt() implements Extensible synchronization framework ([1] 13.13.2 Extensible synchronization framework). It shall not operate the flag "TBTT Adjusting subfield" ([1] 8.4.2.100.8 Mesh Capability), since it is used only for MBCA ([1] 13.13.4 Mesh beacon collision avoidance, see 13.13.4.4.3 TBTT scanning and adjustment procedures for detail). So this patch remove the flag operations. [1] IEEE Std 802.11 2012 Signed-off-by: Masashi Honma <masashi.honma@gmail.com> [remove adjusting_tbtt entirely, since it's now unused] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-13rfkill: Add rfkill-any LED triggerMichał Kępień
Add a new "global" (i.e. not per-rfkill device) LED trigger, rfkill-any, which may be useful on laptops with a single "radio LED" and multiple radio transmitters. The trigger is meant to turn a LED on whenever there is at least one radio transmitter active and turn it off otherwise. This requires taking rfkill_global_mutex before calling rfkill_set_block() in rfkill_resume(): since __rfkill_any_led_trigger_event() is called from rfkill_set_block() unconditionally, each caller of the latter needs to take care of locking rfkill_global_mutex. Signed-off-by: Michał Kępień <kernel@kempniu.pl> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-13rfkill: Cleanup error handling in rfkill_init()Michał Kępień
Use a separate label per error condition in rfkill_init() to make it a bit cleaner and easier to extend. Signed-off-by: Michał Kępień <kernel@kempniu.pl> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-13mac80211: Ensure enough headroom when forwarding mesh pktCedric Izoard
When a buffer is duplicated during MESH packet forwarding, this patch ensures that the new buffer has enough headroom. Signed-off-by: Cedric Izoard <cedric.izoard@ceva-dsp.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-13mac80211: Show pending txqlen in debugfs.Ben Greear
Could be useful for debugging memory consumption issues, and perhaps power-save as well. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-13mac80211: Suppress NEW_PEER_CANDIDATE event if no roomMasashi Honma
Previously, kernel sends NEW_PEER_CANDIDATE event to user land even if the found peer does not have any room to accept other peer. This causes continuous connection trials. Signed-off-by: Masashi Honma <masashi.honma@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-13mac80211: multicast to unicast conversionMichael Braun
Add the ability for an AP (and associated VLANs) to perform multicast-to-unicast conversion for ARP, IPv4 and IPv6 frames (possibly within 802.1Q). If enabled, such frames are to be sent to each station separately, with the DA replaced by their own MAC address rather than the group address. Note that this may break certain expectations of the receiver, such as the ability to drop unicast IP packets received within multicast L2 frames, or the ability to not send ICMP destination unreachable messages for packets received in L2 multicast (which is required, but the receiver can't tell the difference if this new option is enabled.) This also doesn't implement the 802.11 DMS (directed multicast service). Signed-off-by: Michael Braun <michael-dev@fami-braun.de> [use true/false, rename label to the correct "multicast", use __be16 for ethertype and network order for constants] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-13mac80211: Remove unused 'beaconint_us' variableKirtika Ruchandani
Commit 4a733ef1bea7 (mac80211: remove PM-QoS listener) removed all use of 'beaconint_us' from ieee80211_recalc_ps() but left the variable intact. Compiling with W=1 gives the following warning, fix it. net/mac80211/mlme.c: In function ‘ieee80211_recalc_ps’: net/mac80211/mlme.c:1481:7: warning: variable ‘beaconint_us’ set but not used [-Wunused-but-set-variable] iee80211_tu_to_usec has no side-effects and is safe to remove. Fixes: 4a733ef1bea7 ("mac80211: remove PM-QoS listener") Cc: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Kirtika Ruchandani <kirtika@chromium.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-13mac80211: Remove unused 'sband' and 'local' variablesKirtika Ruchandani
Commit b1bce14a7954 (mac80211: update opmode when adding new station) refactored ieee80211_vht_handle_opmode into __ieee80211_vht_handle_opmode and ieee80211_vht_handle_opmode leaving a set but unused variable (sband) in the former. Compiling with W=1 gives the following warning, fix it. net/mac80211/vht.c: In function ‘__ieee80211_vht_handle_opmode’: net/mac80211/vht.c:424:35: warning: variable ‘sband’ set but not used [-Wunused-but-set-variable] Remove 'struct ieee80211_local* local' as well, it was only used to set sband. This is a harmless warning, and is only being fixed to reduce the noise with W=1 in the kernel. Fixes: b1bce14a7954 ("mac80211: update opmode when adding new station") Cc: Marek Kwaczynski <marek.kwaczynski@tieto.com> Cc: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Kirtika Ruchandani <kirtika@chromium.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-13mac80211: Remove unused 'len' variableKirtika Ruchandani
Commit 633e27132625 (mac80211: split sched scan IEs) introduced the len variable to keep track of the return value of ieee80211_build_preq_ies() but did not use it. Compiling with W=1 gives the following warning, fix it. net/mac80211/scan.c: In function ‘__ieee80211_request_sched_scan_start’: net/mac80211/scan.c:1123:9: warning: variable ‘len’ set but not used [-Wunused-but-set-variable] This is a harmless warning and is only being fixed to reduce the noise with W=1 in the kernel. Fixes: 633e27132625 ("mac80211: split sched scan IEs") Cc: David Spinadel <david.spinadel@intel.com> Cc: Alexander Bondar <alexander.bondar@intel.com> Cc: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Kirtika Ruchandani <kirtika@chromium.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-13mac80211: Remove unused 'i' variableKirtika Ruchandani
Commit 5bcae31d9 (mac80211: implement multi-vif in-place reservations) introduced ieee80211_vif_use_reserved_switch() with a counter variable 'i' that is set but not used. Compiling with W=1 gives the following warning, fix it. net/mac80211/chan.c: In function ‘ieee80211_vif_use_reserved_switch’: net/mac80211/chan.c:1273:6: warning: variable ‘i’ set but not used [-Wunused-but-set-variable] This is a harmless warning, and is only being fixed to reduce the noise obtained with W=1 in the kernel. Fixes: 5bcae31d9 ("mac80211: implement multi-vif in-place reservations") Cc: Michal Kazior <michal.kazior@tieto.com> Cc: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Kirtika Ruchandani <kirtika@chromium.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-13mac80211: Remove unused 'struct rate_control_ref' variableKirtika Ruchandani
Commit 3b17fbf87d5d introduced sta_get_expected_throughput() leaving variable 'struct rate_control_ref* ref' set but unused. Compiling with W=1 gives the following warning, fix it. net/mac80211/sta_info.c: In function ‘sta_set_sinfo’: net/mac80211/sta_info.c:2052:27: warning: variable ‘ref’ set but not used [-Wunused-but-set-variable] Fixes: 3b17fbf87d5d ("mac80211: mesh: Add support for HW RC implementation") Cc: Johannes Berg <johannes.berg@intel.com> Cc: Maxim Altshul <maxim.altshul@ti.com> Signed-off-by: Kirtika Ruchandani <kirtika@google.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-13mac80211: Remove unused 'rates_idx' variableKirtika Ruchandani
Commit f027c2aca0cf introduced 'rates_idx' in ieee80211_tx_status_noskb but did not use it. Compiling with W=1 gives the following warning, fix it. mac80211/status.c: In function ‘ieee80211_tx_status_noskb’: mac80211/status.c:636:6: warning: variable ‘rates_idx’ set but not used [-Wunused-but-set-variable] This is a harmless warning, and is only being fixed to reduce the noise generated with W=1. Fixes: f027c2aca0cf ("mac80211: add ieee80211_tx_status_noskb") Cc: Johannes Berg <johannes.berg@intel.com> Cc: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Kirtika Ruchandani <kirtika@google.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-13mac80211: Remove unused 'struct ieee80211_rx_status' ptrKirtika Ruchandani
Commit 554891e63a29 introduced 'struct ieee80211_rx_status' in ieee80211_rx_h_defragment but did not use it. Compiling with W=1 gives the following warning, fix it. net/mac80211/rx.c: In function ‘ieee80211_rx_h_defragment’: net/mac80211/rx.c:1911:30: warning: variable ‘status’ set but not used [-Wunused-but-set-variable] Fixes: 554891e63a29 ("mac80211: move packet flags into packet") Cc: Johannes Berg <johannes.berg@intel.com> Cc: John W. Linville <linville@tuxdriver.com> Signed-off-by: Kirtika Ruchandani <kirtika@google.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-13nl80211: multicast_to_unicast can be changed while IFF_UPMichael Braun
There is no need to prevent toggling multicast_to_unicast while interface is already up. This change simplifies reconfiguration from hostapd. Signed-off-by: Michael Braun <michael-dev@fami-braun.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-13nl80211: check NL80211_ATTR_SCHED_SCAN_INTERVAL only onceArend Van Spriel
The presence of the NL80211_ATTR_SCHED_SCAN_INTERVAL attribute was checked in nl80211_parse_sched_scan() and nl80211_parse_sched_scan_plans() which might be a bit redundant so removing one. Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-13cfg80211: get rid of name indirection trick for ieee80211_get_channel()Arend Van Spriel
The comment on the name indirection suggested an issue but turned out to be untrue. Digging in older kernel version showed issue with ipw2x00 but that is no longer true so get rid on the name indirection. Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-13rfkill: simplify rfkill_set_hw_state() slightlyJohannes Berg
Simplify the two conditions gating the schedule_work() into a single one and get rid of the additional exit point from the function in doing so. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-13mac80211: don't call drv_set_default_unicast_key() for VLANsJohannes Berg
Since drivers know nothing about AP_VLAN interfaces, trying to call drv_set_default_unicast_key() just results in a warning and no call to the driver. Avoid the warning by not calling the driver for this on AP_VLAN interfaces. This means that drivers that somehow need this call for AP mode will fail to work properly in the presence of VLAN interfaces, but the current drivers don't seem to use it, and mac80211 will select and indicate the key - so drivers should be OK now. Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-12Merge branch 'smp-hotplug-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull smp hotplug updates from Thomas Gleixner: "This is the final round of converting the notifier mess to the state machine. The removal of the notifiers and the related infrastructure will happen around rc1, as there are conversions outstanding in other trees. The whole exercise removed about 2000 lines of code in total and in course of the conversion several dozen bugs got fixed. The new mechanism allows to test almost every hotplug step standalone, so usage sites can exercise all transitions extensively. There is more room for improvement, like integrating all the pointlessly different architecture mechanisms of synchronizing, setting cpus online etc into the core code" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) tracing/rb: Init the CPU mask on allocation soc/fsl/qbman: Convert to hotplug state machine soc/fsl/qbman: Convert to hotplug state machine zram: Convert to hotplug state machine KVM/PPC/Book3S HV: Convert to hotplug state machine arm64/cpuinfo: Convert to hotplug state machine arm64/cpuinfo: Make hotplug notifier symmetric mm/compaction: Convert to hotplug state machine iommu/vt-d: Convert to hotplug state machine mm/zswap: Convert pool to hotplug state machine mm/zswap: Convert dst-mem to hotplug state machine mm/zsmalloc: Convert to hotplug state machine mm/vmstat: Convert to hotplug state machine mm/vmstat: Avoid on each online CPU loops mm/vmstat: Drop get_online_cpus() from init_cpu_node_state/vmstat_cpu_dead() tracing/rb: Convert to hotplug state machine oprofile/nmi timer: Convert to hotplug state machine net/iucv: Use explicit clean up labels in iucv_init() x86/pci/amd-bus: Convert to hotplug state machine x86/oprofile/nmi: Convert to hotplug state machine ...
2016-12-12crush: include mapper.h in mapper.cTobias Klauser
Include linux/crush/mapper.h in crush/mapper.c to get the prototypes of crush_find_rule and crush_do_rule which are defined there. This fixes the following GCC warnings when building with 'W=1': net/ceph/crush/mapper.c:40:5: warning: no previous prototype for ‘crush_find_rule’ [-Wmissing-prototypes] net/ceph/crush/mapper.c:793:5: warning: no previous prototype for ‘crush_do_rule’ [-Wmissing-prototypes] Signed-off-by: Tobias Klauser <tklauser@distanz.ch> [idryomov@gmail.com: corresponding !__KERNEL__ include] Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-12-12libceph: no need to drop con->mutex for ->get_authorizer()Ilya Dryomov
->get_authorizer(), ->verify_authorizer_reply(), ->sign_message() and ->check_message_signature() shouldn't be doing anything with or on the connection (like closing it or sending messages). Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2016-12-12libceph: drop len argument of *verify_authorizer_reply()Ilya Dryomov
The length of the reply is protocol-dependent - for cephx it's ceph_x_authorize_reply. Nothing sensible can be passed from the messenger layer anyway. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2016-12-12libceph: verify authorize reply on connectIlya Dryomov
After sending an authorizer (ceph_x_authorize_a + ceph_x_authorize_b), the client gets back a ceph_x_authorize_reply, which it is supposed to verify to ensure the authenticity and protect against replay attacks. The code for doing this is there (ceph_x_verify_authorizer_reply(), ceph_auth_verify_authorizer_reply() + plumbing), but it is never invoked by the the messenger. AFAICT this goes back to 2009, when ceph authentication protocols support was added to the kernel client in 4e7a5dcd1bba ("ceph: negotiate authentication protocol; implement AUTH_NONE protocol"). The second param of ceph_connection_operations::verify_authorizer_reply is unused all the way down. Pass 0 to facilitate backporting, and kill it in the next commit. Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2016-12-12libceph: no need for GFP_NOFS in ceph_monc_init()Ilya Dryomov
It's called during inital setup, when everything should be allocated with GFP_KERNEL. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2016-12-12libceph: stop allocating a new cipher on every crypto requestIlya Dryomov
This is useless and more importantly not allowed on the writeback path, because crypto_alloc_skcipher() allocates memory with GFP_KERNEL, which can recurse back into the filesystem: kworker/9:3 D ffff92303f318180 0 20732 2 0x00000080 Workqueue: ceph-msgr ceph_con_workfn [libceph] ffff923035dd4480 ffff923038f8a0c0 0000000000000001 000000009eb27318 ffff92269eb28000 ffff92269eb27338 ffff923036b145ac ffff923035dd4480 00000000ffffffff ffff923036b145b0 ffffffff951eb4e1 ffff923036b145a8 Call Trace: [<ffffffff951eb4e1>] ? schedule+0x31/0x80 [<ffffffff951eb77a>] ? schedule_preempt_disabled+0xa/0x10 [<ffffffff951ed1f4>] ? __mutex_lock_slowpath+0xb4/0x130 [<ffffffff951ed28b>] ? mutex_lock+0x1b/0x30 [<ffffffffc0a974b3>] ? xfs_reclaim_inodes_ag+0x233/0x2d0 [xfs] [<ffffffff94d92ba5>] ? move_active_pages_to_lru+0x125/0x270 [<ffffffff94f2b985>] ? radix_tree_gang_lookup_tag+0xc5/0x1c0 [<ffffffff94dad0f3>] ? __list_lru_walk_one.isra.3+0x33/0x120 [<ffffffffc0a98331>] ? xfs_reclaim_inodes_nr+0x31/0x40 [xfs] [<ffffffff94e05bfe>] ? super_cache_scan+0x17e/0x190 [<ffffffff94d919f3>] ? shrink_slab.part.38+0x1e3/0x3d0 [<ffffffff94d9616a>] ? shrink_node+0x10a/0x320 [<ffffffff94d96474>] ? do_try_to_free_pages+0xf4/0x350 [<ffffffff94d967ba>] ? try_to_free_pages+0xea/0x1b0 [<ffffffff94d863bd>] ? __alloc_pages_nodemask+0x61d/0xe60 [<ffffffff94ddf42d>] ? cache_grow_begin+0x9d/0x560 [<ffffffff94ddfb88>] ? fallback_alloc+0x148/0x1c0 [<ffffffff94ed84e7>] ? __crypto_alloc_tfm+0x37/0x130 [<ffffffff94de09db>] ? __kmalloc+0x1eb/0x580 [<ffffffffc09fe2db>] ? crush_choose_firstn+0x3eb/0x470 [libceph] [<ffffffff94ed84e7>] ? __crypto_alloc_tfm+0x37/0x130 [<ffffffff94ed9c19>] ? crypto_spawn_tfm+0x39/0x60 [<ffffffffc08b30a3>] ? crypto_cbc_init_tfm+0x23/0x40 [cbc] [<ffffffff94ed857c>] ? __crypto_alloc_tfm+0xcc/0x130 [<ffffffff94edcc23>] ? crypto_skcipher_init_tfm+0x113/0x180 [<ffffffff94ed7cc3>] ? crypto_create_tfm+0x43/0xb0 [<ffffffff94ed83b0>] ? crypto_larval_lookup+0x150/0x150 [<ffffffff94ed7da2>] ? crypto_alloc_tfm+0x72/0x120 [<ffffffffc0a01dd7>] ? ceph_aes_encrypt2+0x67/0x400 [libceph] [<ffffffffc09fd264>] ? ceph_pg_to_up_acting_osds+0x84/0x5b0 [libceph] [<ffffffff950d40a0>] ? release_sock+0x40/0x90 [<ffffffff95139f94>] ? tcp_recvmsg+0x4b4/0xae0 [<ffffffffc0a02714>] ? ceph_encrypt2+0x54/0xc0 [libceph] [<ffffffffc0a02b4d>] ? ceph_x_encrypt+0x5d/0x90 [libceph] [<ffffffffc0a02bdf>] ? calcu_signature+0x5f/0x90 [libceph] [<ffffffffc0a02ef5>] ? ceph_x_sign_message+0x35/0x50 [libceph] [<ffffffffc09e948c>] ? prepare_write_message_footer+0x5c/0xa0 [libceph] [<ffffffffc09ecd18>] ? ceph_con_workfn+0x2258/0x2dd0 [libceph] [<ffffffffc09e9903>] ? queue_con_delay+0x33/0xd0 [libceph] [<ffffffffc09f68ed>] ? __submit_request+0x20d/0x2f0 [libceph] [<ffffffffc09f6ef8>] ? ceph_osdc_start_request+0x28/0x30 [libceph] [<ffffffffc0b52603>] ? rbd_queue_workfn+0x2f3/0x350 [rbd] [<ffffffff94c94ec0>] ? process_one_work+0x160/0x410 [<ffffffff94c951bd>] ? worker_thread+0x4d/0x480 [<ffffffff94c95170>] ? process_one_work+0x410/0x410 [<ffffffff94c9af8d>] ? kthread+0xcd/0xf0 [<ffffffff951efb2f>] ? ret_from_fork+0x1f/0x40 [<ffffffff94c9aec0>] ? kthread_create_on_node+0x190/0x190 Allocating the cipher along with the key fixes the issue - as long the key doesn't change, a single cipher context can be used concurrently in multiple requests. We still can't take that GFP_KERNEL allocation though. Both ceph_crypto_key_clone() and ceph_crypto_key_decode() are called from GFP_NOFS context, so resort to memalloc_noio_{save,restore}() here. Reported-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2016-12-12libceph: uninline ceph_crypto_key_destroy()Ilya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2016-12-12libceph: remove now unused ceph_*{en,de}crypt*() functionsIlya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2016-12-12libceph: switch ceph_x_decrypt() to ceph_crypt()Ilya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2016-12-12libceph: switch ceph_x_encrypt() to ceph_crypt()Ilya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2016-12-12libceph: tweak calcu_signature() a littleIlya Dryomov
- replace an ad-hoc array with a struct - rename to calc_signature() for consistency Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2016-12-12libceph: rename and align ceph_x_authorizer::reply_bufIlya Dryomov
It's going to be used as a temporary buffer for in-place en/decryption with ceph_crypt() instead of on-stack buffers, so rename to enc_buf. Ensure alignment to avoid GFP_ATOMIC allocations in the crypto stack. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2016-12-12libceph: introduce ceph_crypt() for in-place en/decryptionIlya Dryomov
Starting with 4.9, kernel stacks may be vmalloced and therefore not guaranteed to be physically contiguous; the new CONFIG_VMAP_STACK option is enabled by default on x86. This makes it invalid to use on-stack buffers with the crypto scatterlist API, as sg_set_buf() expects a logical address and won't work with vmalloced addresses. There isn't a different (e.g. kvec-based) crypto API we could switch net/ceph/crypto.c to and the current scatterlist.h API isn't getting updated to accommodate this use case. Allocating a new header and padding for each operation is a non-starter, so do the en/decryption in-place on a single pre-assembled (header + data + padding) heap buffer. This is explicitly supported by the crypto API: "... the caller may provide the same scatter/gather list for the plaintext and cipher text. After the completion of the cipher operation, the plaintext data is replaced with the ciphertext data in case of an encryption and vice versa for a decryption." Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2016-12-12libceph: introduce ceph_x_encrypt_offset()Ilya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2016-12-12libceph: old_key in process_one_ticket() is redundantIlya Dryomov
Since commit 0a990e709356 ("ceph: clean up service ticket decoding"), th->session_key isn't assigned until everything is decoded. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2016-12-12libceph: ceph_x_encrypt_buflen() takes in_lenIlya Dryomov
Pass what's going to be encrypted - that's msg_b, not ticket_blob. ceph_x_encrypt_buflen() returns the upper bound, so this doesn't change the maxlen calculation, but makes it a bit clearer. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2016-12-12Merge branch 'locking-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The tree got pretty big in this development cycle, but the net effect is pretty good: 115 files changed, 673 insertions(+), 1522 deletions(-) The main changes were: - Rework and generalize the mutex code to remove per arch mutex primitives. (Peter Zijlstra) - Add vCPU preemption support: add an interface to query the preemption status of vCPUs and use it in locking primitives - this optimizes paravirt performance. (Pan Xinhui, Juergen Gross, Christian Borntraeger) - Introduce cpu_relax_yield() and remov cpu_relax_lowlatency() to clean up and improve the s390 lock yielding machinery and its core kernel impact. (Christian Borntraeger) - Micro-optimize mutexes some more. (Waiman Long) - Reluctantly add the to-be-deprecated mutex_trylock_recursive() interface on a temporary basis, to give the DRM code more time to get rid of its locking hacks. Any other users will be NAK-ed on sight. (We turned off the deprecation warning for the time being to not pollute the build log.) (Peter Zijlstra) - Improve the rtmutex code a bit, in light of recent long lived bugs/races. (Thomas Gleixner) - Misc fixes, cleanups" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) x86/paravirt: Fix bool return type for PVOP_CALL() x86/paravirt: Fix native_patch() locking/ww_mutex: Use relaxed atomics locking/rtmutex: Explain locking rules for rt_mutex_proxy_unlock()/init_proxy_locked() locking/rtmutex: Get rid of RT_MUTEX_OWNER_MASKALL x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted() locking/mutex: Break out of expensive busy-loop on {mutex,rwsem}_spin_on_owner() when owner vCPU is preempted locking/osq: Break out of spin-wait busy waiting loop for a preempted vCPU in osq_lock() Documentation/virtual/kvm: Support the vCPU preemption check x86/xen: Support the vCPU preemption check x86/kvm: Support the vCPU preemption check x86/kvm: Support the vCPU preemption check kvm: Introduce kvm_write_guest_offset_cached() locking/core, x86/paravirt: Implement vcpu_is_preempted(cpu) for KVM and Xen guests locking/spinlocks, s390: Implement vcpu_is_preempted(cpu) locking/core, powerpc: Implement vcpu_is_preempted(cpu) sched/core: Introduce the vcpu_is_preempted(cpu) interface sched/wake_q: Rename WAKE_Q to DEFINE_WAKE_Q locking/core: Provide common cpu_relax_yield() definition locking/mutex: Don't mark mutex_trylock_recursive() as deprecated, temporarily ...
2016-12-11netfilter: nft_counter: rework atomic dump and resetPablo Neira
Dump and reset doesn't work unless cmpxchg64() is used both from packet and control plane paths. This approach is going to be slow though. Instead, use a percpu seqcount to fetch counters consistently, then subtract bytes and packets in case a reset was requested. The cpu that running over the reset code is guaranteed to own this stats exclusively, we have to turn counters into signed 64bit though so stats update on reset don't get wrong on underflow. This patch is based on original sketch from Eric Dumazet. Fixes: 43da04a593d8 ("netfilter: nf_tables: atomic dump and reset for stateful objects") Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>