summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-06-02block: don't use blocking queue entered for recursive bio submitsJens Axboe
If we end up splitting a bio and the queue goes away between the initial submission and the later split submission, then we can block forever in blk_queue_enter() waiting for the reference to drop to zero. This will never happen, since we already hold a reference. Mark a split bio as already having entered the queue, so we can just use the live non-blocking queue enter variant. Thanks to Tetsuo Handa for the analysis. Reported-by: syzbot+c4f9cebf9d651f6e54de@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-02dm-crypt: fix warning in shutdown pathKent Overstreet
The counter for the number of allocated pages includes pages in the mempool's reserve, so checking that the number of allocated pages is 0 needs to happen after we exit the mempool. Fixes: 6f1c819c219f ("dm: convert to bioset_init()/mempool_init()") Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Reported-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Mike Snitzer <snitzer@redhat.com> Fixed to always just use percpu_counter_sum() Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Infinite loop in _decode_session6(), from Eric Dumazet. 2) Pass correct argument to nla_strlcpy() in netfilter, also from Eric Dumazet. 3) Out of bounds memory access in ipv6 srh code, from Mathieu Xhonneux. 4) NULL deref in XDP_REDIRECT handling of tun driver, from Toshiaki Makita. 5) Incorrect idr release in cls_flower, from Paul Blakey. 6) Probe error handling fix in davinci_emac, from Dan Carpenter. 7) Memory leak in XPS configuration, from Alexander Duyck. 8) Use after free with cloned sockets in kcm, from Kirill Tkhai. 9) MTU handling fixes fo ip_tunnel and ip6_tunnel, from Nicolas Dichtel. 10) Fix UAPI hole in bpf data structure for 32-bit compat applications, from Daniel Borkmann. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits) bpf: fix uapi hole for 32 bit compat applications net: usb: cdc_mbim: add flag FLAG_SEND_ZLP ip6_tunnel: remove magic mtu value 0xFFF8 ip_tunnel: restore binding to ifaces with a large mtu net: dsa: b53: Add BCM5389 support kcm: Fix use-after-free caused by clonned sockets net-sysfs: Fix memory leak in XPS configuration ixgbe: fix parsing of TC actions for HW offload net: ethernet: davinci_emac: fix error handling in probe() net/ncsi: Fix array size in dumpit handler cls_flower: Fix incorrect idr release when failing to modify rule net/sonic: Use dma_mapping_error() xfrm Fix potential error pointer dereference in xfrm_bundle_create. vhost_net: flush batched heads before trying to busy polling tun: Fix NULL pointer dereference in XDP redirect be2net: Fix error detection logic for BE3 net: qmi_wwan: Add Netgear Aircard 779S mlxsw: spectrum: Forbid creation of VLAN 1 over port/LAG atm: zatm: fix memcmp casting iwlwifi: pcie: compare with number of IRQs requested for, not number of CPUs ...
2018-06-02CIFS: Add support for direct pages in wdataLong Li
Add a function to allocate wdata without allocating pages for data transfer. This gives the caller an option to pass a number of pages that point to the data buffer to write to. wdata is reponsible for free those pages after it's done. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com>
2018-06-02CIFS: Use offset when reading pagesLong Li
With offset defined in rdata, transport functions need to look at this offset when reading data into the correct places in pages. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com>
2018-06-02CIFS: Add support for direct pages in rdataLong Li
Add a function to allocate rdata without allocating pages for data transfer. This gives the caller an option to pass a number of pages that point to the data buffer. rdata is still reponsible for free those pages after it's done. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com>
2018-06-02cifs: update multiplex loop to handle compounded responsesRonnie Sahlberg
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
2018-06-02Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fix from James Bottomley: "Eve of merge window fix: The original code was so bogus as to be casting the wrong generic device to an rport and proceeding to take actions based on the bogus values it found. Fortunately it seems the location that is dereferenced always exists, so the code hasn't oopsed yet, but it certainly annoys the memory checkers" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: scsi_transport_srp: Fix shost to rport translation
2018-06-02Merge tag 'drm-fixes-for-v4.17-rc8' of ↵Linus Torvalds
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "A few final fixes: i915: - fix for potential Spectre vector in the new query uAPI - fix NULL pointer deref (FDO #106559) - DMI fix to hide LVDS for Radiant P845 (FDO #105468) amdgpu: - suspend/resume DC regression fix - underscan flicker fix on fiji - gamma setting fix after dpms omap: - fix oops regression core: - fix PSR timing dw-hdmi: - fix oops regression" * tag 'drm-fixes-for-v4.17-rc8' of git://people.freedesktop.org/~airlied/linux: drm/amd/display: Update color props when modeset is required drm/amd/display: Make atomic-check validate underscan changes drm/bridge/synopsys: dw-hdmi: fix dw_hdmi_setup_rx_sense drm/amd/display: Fix BUG_ON during CRTC atomic check update drm/i915/query: nospec expects no more than an unsigned long drm/i915/query: Protect tainted function pointer lookup drm/i915/lvds: Move acpi lid notification registration to registration phase drm/i915: Disable LVDS on Radiant P845 drm/omap: fix NULL deref crash with SDI displays drm/psr: Fix missed entry in PSR setup time table.
2018-06-03Merge branch 'drm-fixes-4.17' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie
into drm-fixes Two last minute DC fixes for 4.17. A fix for underscan on fiji and a fix for gamma settings getting after dpms. * 'drm-fixes-4.17' of git://people.freedesktop.org/~agd5f/linux: drm/amd/display: Update color props when modeset is required drm/amd/display: Make atomic-check validate underscan changes
2018-06-02vmw_balloon: fixing double free when batching mode is offGil Kupfer
The balloon.page field is used for two different purposes if batching is on or off. If batching is on, the field point to the page which is used to communicate with with the hypervisor. If it is off, balloon.page points to the page that is about to be (un)locked. Unfortunately, this dual-purpose of the field introduced a bug: when the balloon is popped (e.g., when the machine is reset or the balloon driver is explicitly removed), the balloon driver frees, unconditionally, the page that is held in balloon.page. As a result, if batching is disabled, this leads to double freeing the last page that is sent to the hypervisor. The following error occurs during rmmod when kernel checkers are on, and the balloon is not empty: [ 42.307653] ------------[ cut here ]------------ [ 42.307657] Kernel BUG at ffffffffba1e4b28 [verbose debug info unavailable] [ 42.307720] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC [ 42.312512] Modules linked in: vmw_vsock_vmci_transport vsock ppdev joydev vmw_balloon(-) input_leds serio_raw vmw_vmci parport_pc shpchp parport i2c_piix4 nfit mac_hid autofs4 vmwgfx drm_kms_helper hid_generic syscopyarea sysfillrect usbhid sysimgblt fb_sys_fops hid ttm mptspi scsi_transport_spi ahci mptscsih drm psmouse vmxnet3 libahci mptbase pata_acpi [ 42.312766] CPU: 10 PID: 1527 Comm: rmmod Not tainted 4.12.0+ #5 [ 42.312803] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2016 [ 42.313042] task: ffff9bf9680f8000 task.stack: ffffbfefc1638000 [ 42.313290] RIP: 0010:__free_pages+0x38/0x40 [ 42.313510] RSP: 0018:ffffbfefc163be98 EFLAGS: 00010246 [ 42.313731] RAX: 000000000000003e RBX: ffffffffc02b9720 RCX: 0000000000000006 [ 42.313972] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9bf97e08e0a0 [ 42.314201] RBP: ffffbfefc163be98 R08: 0000000000000000 R09: 0000000000000000 [ 42.314435] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffc02b97e4 [ 42.314505] R13: ffffffffc02b9748 R14: ffffffffc02b9728 R15: 0000000000000200 [ 42.314550] FS: 00007f3af5fec700(0000) GS:ffff9bf97e080000(0000) knlGS:0000000000000000 [ 42.314599] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 42.314635] CR2: 00007f44f6f4ab24 CR3: 00000003a7d12000 CR4: 00000000000006e0 [ 42.314864] Call Trace: [ 42.315774] vmballoon_pop+0x102/0x130 [vmw_balloon] [ 42.315816] vmballoon_exit+0x42/0xd64 [vmw_balloon] [ 42.315853] SyS_delete_module+0x1e2/0x250 [ 42.315891] entry_SYSCALL_64_fastpath+0x23/0xc2 [ 42.315924] RIP: 0033:0x7f3af5b0e8e7 [ 42.315949] RSP: 002b:00007fffe6ce0148 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 [ 42.315996] RAX: ffffffffffffffda RBX: 000055be676401e0 RCX: 00007f3af5b0e8e7 [ 42.316951] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055be67640248 [ 42.317887] RBP: 0000000000000003 R08: 0000000000000000 R09: 1999999999999999 [ 42.318845] R10: 0000000000000883 R11: 0000000000000206 R12: 00007fffe6cdf130 [ 42.319755] R13: 0000000000000000 R14: 0000000000000000 R15: 000055be676401e0 [ 42.320606] Code: c0 74 1c f0 ff 4f 1c 74 02 5d c3 85 f6 74 07 e8 0f d8 ff ff 5d c3 31 f6 e8 c6 fb ff ff 5d c3 48 c7 c6 c8 0f c5 ba e8 58 be 02 00 <0f> 0b 66 0f 1f 44 00 00 66 66 66 66 90 48 85 ff 75 01 c3 55 48 [ 42.323462] RIP: __free_pages+0x38/0x40 RSP: ffffbfefc163be98 [ 42.325735] ---[ end trace 872e008e33f81508 ]--- To solve the bug, we eliminate the dual purpose of balloon.page. Fixes: f220a80f0c2e ("VMware balloon: add batching to the vmw_balloon.") Cc: stable@vger.kernel.org Reported-by: Oleksandr Natalenko <onatalen@redhat.com> Signed-off-by: Gil Kupfer <gilkup@gmail.com> Signed-off-by: Nadav Amit <namit@vmware.com> Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Tested-by: Oleksandr Natalenko <oleksandr@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-02Merge tag 'mips_fixes_4.17_3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from James Hogan: "A final few MIPS fixes for 4.17: - drop Lantiq gphy reboot/remove reset (4.14) - prctl(PR_SET_FP_MODE): Disallow PRE without FR (4.0) - ptrace(PTRACE_PEEKUSR): Fix 64-bit FGRs (3.15)" * tag 'mips_fixes_4.17_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: ptrace: Fix PTRACE_PEEKUSR requests for 64-bit FGRs MIPS: prctl: Disallow FRE without FR with PR_SET_FP_MODE requests MIPS: lantiq: gphy: Drop reboot/remove reset asserts
2018-06-02Merge tag 'vfio-v4.17' of git://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO fix from Alex Williamson: "Revert a pfn page mapping optimization identified as introducing a bad page state regression (Alex Williamson)" * tag 'vfio-v4.17' of git://github.com/awilliam/linux-vfio: Revert "vfio/type1: Improve memory pinning process for raw PFN mapping"
2018-06-02Merge tag 'char-misc-4.17-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are four small bugfixes for some char/misc drivers. Well, really three fixes and one fix for one of those fixes due to problems found by 0-day. This resolves some reported issues with the hwtracing drivers, and a reported regression for the thunderbolt subsystem. All of these have been in linux-next for a while now with no reported problems" * tag 'char-misc-4.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: hwtracing: stm: fix build error on some arches intel_th: Use correct device when freeing buffers stm class: Use vmalloc for the master map thunderbolt: Handle NULL boot ACL entries properly
2018-06-02Merge tag 'staging-4.17-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull IIO driver fixes from Greg KH: "Here are some old IIO driver fixes that were sitting in my tree for a few weeks. Sorry about not getting them to you sooner. They fix a number of small IIO driver issues that have been reported. All of these have been in linux-next for a while with no reported problems" * tag 'staging-4.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: iio: adc: select buffer for at91-sama5d2_adc iio: hid-sensor-trigger: Fix sometimes not powering up the sensor after resume iio: adc: at91-sama5d2_adc: fix channel configuration for differential channels iio:kfifo_buf: check for uint overflow iio:buffer: make length types match kfifo types iio: adc: stm32-dfsdm: fix sample rate for div2 spi clock iio: adc: stm32-dfsdm: fix successive oversampling settings iio: ad7793: implement IIO_CHAN_INFO_SAMP_FREQ
2018-06-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: "Just three small last minute regressions that were found in the last week. The Broadcom fix is a bit big for rc7, but since it is fixing driver crash regressions that were merged via netdev into rc1, I am sending it. - bnxt netdev changes merged this cycle caused the bnxt RDMA driver to crash under certain situations - Arnd found (several, unfortunately) kconfig problems with the patches adding INFINIBAND_ADDR_TRANS. Reverting this last part, will fix it more fully outside -rc. - Subtle change in error code for a uapi function caused breakage in userspace. This was bug was subtly introduced cycle" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: IB/core: Fix error code for invalid GID entry IB: Revert "remove redundant INFINIBAND kconfig dependencies" RDMA/bnxt_re: Fix broken RoCE driver due to recent L2 driver changes
2018-06-02Merge branch 'i2c/for-current' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "A documentation bugfix and a MAINTAINERS addition" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: ocores: update HDL sources URL i2c: xlp9xx: Add MAINTAINERS entry
2018-06-02Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge two fixes from Andrew Morton. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: fix the NULL mapping case in __isolate_lru_page() mm/huge_memory.c: __split_huge_page() use atomic ClearPageDirty()
2018-06-02mm: fix the NULL mapping case in __isolate_lru_page()Hugh Dickins
George Boole would have noticed a slight error in 4.16 commit 69d763fc6d3a ("mm: pin address_space before dereferencing it while isolating an LRU page"). Fix it, to match both the comment above it, and the original behaviour. Although anonymous pages are not marked PageDirty at first, we have an old habit of calling SetPageDirty when a page is removed from swap cache: so there's a category of ex-swap pages that are easily migratable, but were inadvertently excluded from compaction's async migration in 4.16. Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1805302014001.12558@eggly.anvils Fixes: 69d763fc6d3a ("mm: pin address_space before dereferencing it while isolating an LRU page") Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Ivan Kalvachev <ikalvachev@gmail.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-02mm/huge_memory.c: __split_huge_page() use atomic ClearPageDirty()Hugh Dickins
Swapping load on huge=always tmpfs (with khugepaged tuned up to be very eager, but I'm not sure that is relevant) soon hung uninterruptibly, waiting for page lock in shmem_getpage_gfp()'s find_lock_entry(), most often when "cp -a" was trying to write to a smallish file. Debug showed that the page in question was not locked, and page->mapping NULL by now, but page->index consistent with having been in a huge page before. Reproduced in minutes on a 4.15 kernel, even with 4.17's 605ca5ede764 ("mm/huge_memory.c: reorder operations in __split_huge_page_tail()") added in; but took hours to reproduce on a 4.17 kernel (no idea why). The culprit proved to be the __ClearPageDirty() on tails beyond i_size in __split_huge_page(): the non-atomic __bitoperation may have been safe when 4.8's baa355fd3314 ("thp: file pages support for split_huge_page()") introduced it, but liable to erase PageWaiters after 4.10's 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit"). Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1805291841070.3197@eggly.anvils Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit") Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-02Revert "vfio/type1: Improve memory pinning process for raw PFN mapping"Alex Williamson
Bisection by Amadeusz Sławiński implicates this commit leading to bad page state issues after VM shutdown, likely due to unbalanced page references. The original commit was intended only as a performance improvement, therefore revert for offline rework. Link: https://lkml.org/lkml/2018/6/2/97 Fixes: 356e88ebe447 ("vfio/type1: Improve memory pinning process for raw PFN mapping") Cc: Jason Cai (Xiang Feng) <jason.cai@linux.alibaba.com> Reported-by: Amadeusz Sławiński <amade@asmblr.net> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-06-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2018-06-02 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) BPF uapi fix in struct bpf_prog_info and struct bpf_map_info in order to fix offsets on 32 bit archs. This will have a minor merge conflict with net-next which has the __u32 gpl_compatible:1 bitfield in struct bpf_prog_info at this location. Resolution is to use the gpl_compatible member. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01bpf: fix uapi hole for 32 bit compat applicationsDaniel Borkmann
In 64 bit, we have a 4 byte hole between ifindex and netns_dev in the case of struct bpf_map_info but also struct bpf_prog_info. In net-next commit b85fab0e67b ("bpf: Add gpl_compatible flag to struct bpf_prog_info") added a bitfield into it to expose some flags related to programs. Thus, add an unnamed __u32 bitfield for both so that alignment keeps the same in both 32 and 64 bit cases, and can be naturally extended from there as in b85fab0e67b. Before: # file test.o test.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped # pahole test.o struct bpf_map_info { __u32 type; /* 0 4 */ __u32 id; /* 4 4 */ __u32 key_size; /* 8 4 */ __u32 value_size; /* 12 4 */ __u32 max_entries; /* 16 4 */ __u32 map_flags; /* 20 4 */ char name[16]; /* 24 16 */ __u32 ifindex; /* 40 4 */ __u64 netns_dev; /* 44 8 */ __u64 netns_ino; /* 52 8 */ /* size: 64, cachelines: 1, members: 10 */ /* padding: 4 */ }; After (same as on 64 bit): # file test.o test.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped # pahole test.o struct bpf_map_info { __u32 type; /* 0 4 */ __u32 id; /* 4 4 */ __u32 key_size; /* 8 4 */ __u32 value_size; /* 12 4 */ __u32 max_entries; /* 16 4 */ __u32 map_flags; /* 20 4 */ char name[16]; /* 24 16 */ __u32 ifindex; /* 40 4 */ /* XXX 4 bytes hole, try to pack */ __u64 netns_dev; /* 48 8 */ __u64 netns_ino; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ /* size: 64, cachelines: 1, members: 10 */ /* sum members: 60, holes: 1, sum holes: 4 */ }; Reported-by: Dmitry V. Levin <ldv@altlinux.org> Reported-by: Eugene Syromiatnikov <esyr@redhat.com> Fixes: 52775b33bb507 ("bpf: offload: report device information about offloaded maps") Fixes: 675fc275a3a2d ("bpf: offload: report device information for offloaded programs") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-01tools/power turbostat: update version numberLen Brown
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: Add Node in outputPrarit Bhargava
Output a Node column if there is more than one node/socket. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: add node information into turbostat calculationsPrarit Bhargava
The previous patches have added node information to turbostat, but the counters code does not take it into account. Add node information from cpu_topology calculations to turbostat counters. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: remove num_ from cpu_topology structPrarit Bhargava
Cleanup, remove num_ from num_nodes_per_pkg, num_cores_per_node, and num_threads_per_node. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: rename num_cores_per_pkg to num_cores_per_nodePrarit Bhargava
turbostat incorrectly assumes that there is one node per package. As a result num_cores_per_pkg is not correctly named and is actually num_cores_per_node. Rename num_cores_per_pkg to num_cores_per_node. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: track thread ID in cpu_topologyPrarit Bhargava
The code can be simplified if the cpu_topology *cpus tracks the thread IDs. This removes an additional file lookup and simplifies the counter initialization code. Add thread ID to cpu_topology information and cleanup the counter initialization code. v2: prevent thread_id from being overwritten Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: Calculate additional node information for a packagePrarit Bhargava
The code currently assumes each package has exactly one node. This is not the case for AMD systems and Intel systems with COD. AMD systems also may re-enumerate each node's core IDs starting at 0 (for example, an AMD processor may have two nodes, each with core IDs from 0 to 7). In order to properly enumerate the cores we need to track both the physical and logical node IDs. Add physical_node_id to track the node ID assigned by the kernel, and logical_node_id used by turbostat to track the nodes per package ie) a 0-based count within the package. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: Fix node and siblings lookup dataLen Brown
The turbostat code only looks at thread_siblings_list to determine if processing units/threads are on the same the core. This works well on Intel systems which have a shared L1 instruction and data cache. This does not work on AMD systems which have shared L1 instruction cache but separate L1 data caches. Other utilities also check sibling's core ID to determine if the processing unit shares the same core. Additionally, the cpu_topology *cpus list used in topology_probe() can be used elsewhere in the code to simplify things. Export *cpus to the entire turbostat code, and add Processing Unit/Thread IDs information to each cpu_topology struct. Confirm that the thread is on the same core as indicated by thread_siblings_list. [v2]: Fixup CPU_* usage that caused gcc malloc error. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: set max_num_cpus equal to the cpumask lengthPrarit Bhargava
Future fixes will use sysfs files that contain cpumask output. The code needs to know the length of the cpumask in order to determine which cpus are set in a cpumask. Currently topo.max_cpu_num is the maximum cpu number. It can be increased the the maximum value of cpus represented in cpumasks. Set max_num_cpus to the length of a cpumask. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: if --num_iterations, print for specific number of ↵Chen Yu
iterations There's a use case during test to only print specific round of iterations if --num_iterations is specified, for example, with this patch applied: turbostat -i 5 -n 4 will capture 4 samples with 5 seconds interval. [lenb: renamed to --num_iterations from --iterations] Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Reviewed-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: Add Cannon Lake supportSrinivas Pandruvada
All MSRs related to turbostat are same as Kabylake. Even though SDM claims that core C3 residency can be read from MSR 0x662, the read on this MSR fails on CNL platform. Hence disabled C3 MSR read and display. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: delete duplicate #definesLen Brown
The SNB_C1_AUTO_UNDEMOTE definition should have been deleted once it was copied into msr-index.h. One copy of the truth is better -- particularly when Matt needs to fix it:-) Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01x86: msr-index.h: Correct SNB_C1/C3_AUTO_UNDEMOTE definesMatt Turner
According to the Intel Software Developers' Manual, Vol. 4, Order No. 335592, these macros have been reversed since they were added in the initial turbostat commit. The reversed definitions were presumably copied from turbostat.c to this file. Fixes: 9c63a650bb10 ("tools/power/x86/turbostat: share kernel MSR #defines") Signed-off-by: Matt Turner <mattst88@gmail.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: Correct SNB_C1/C3_AUTO_UNDEMOTE definesMatt Turner
According to the Intel Software Developers' Manual, Vol. 4, Order No. 335592, these macros have been reversed since they were added. Fixes: 889facbee3e6 ("tools/power turbostat: v3.0: monitor Watts and Temperature") Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: add POLL and POLL% columnLen Brown
Like the "C1" and "C1%" column, the new POLL and POLL% columns show invocations and residency% during the measurement interval. While it didn't seem important to track in the past, we've recently found some Linux cpuidle bugs related to POLL%. Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: Fix --hide Pk%pc10Len Brown
The column header for PC10 residency is "Pk%pc10" This is missing the 'g' that others have, eg Pkg%pc6, to allow tab-delimited columns to fit into 8-columns. However, --hide Pk%pc10 did not work, it was still looking for the 'g'. This was confusing, because --list shows the correct "Pk%pc10" Reported-by: Wendy Wang <wendy.wang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: Build-in "Low Power Idle" counters supportLen Brown
Linux 4.15 exports the ACPI Low Power Idle Table's counters in /sys/devices/system/cpu/cpuidle/ low_power_idle_cpu_residency_us Show this in the "CPU%LPI" column. Today this reflects the "North Complex" residency in PC10, so expect it to closely follow "Pk%pc10". low_power_idle_system_residency_us Show this in the "SYS%LPI" column. Today, this reflects the North is in PC10, plus the PCH is sufficiently quiescent to save additional power via the "S0ix" system state, as measured by the PCH SLP_S0 counter. Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01fs: use ->is_partially_uptodate in page_cache_seek_hole_dataChristoph Hellwig
This way the implementation doesn't depend on buffer_head internals. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-01fs: remove the buffer_unwritten check in page_seek_hole_dataChristoph Hellwig
We only call into this function through the iomap iterators, so we already know the buffer is unwritten. In addition to that we always require the uptodate flag that is ORed with the result anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-01fs: move page_cache_seek_hole_data to iomap.cChristoph Hellwig
This function is only used by the iomap code, depends on being called from it, and will soon stop poking into buffer head internals. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-01xfs: use iomap_bmapChristoph Hellwig
Switch to the iomap based bmap implementation to get rid of one of the last users of xfs_get_blocks. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-01iomap: add an iomap-based bmap implementationChristoph Hellwig
This adds a simple iomap-based implementation of the legacy ->bmap interface. Note that we can't easily add checks for rt or reflink files, so these will have to remain in the callers. This interface just needs to die.. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-01iomap: add a iomap_sector helperChristoph Hellwig
Factor the repeated calculation of the on-disk sector for a given logical block into a littler helper. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-01iomap: use __bio_add_page in iomap_dio_zeroChristoph Hellwig
We don't need any merging logic, and this also replaces a BUG_ON with a WARN_ON_ONCE inside __bio_add_page for the impossible overflow condition. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-01iomap: move IOMAP_F_BOUNDARY to gfs2Christoph Hellwig
Just define a range of fs specific flags and use that in gfs2 instead of exposing this internal flag globally. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-01iomap: fix the comment describing IOMAP_NOWAITChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-01iomap: inline data should be an iomap type, not a flagChristoph Hellwig
Inline data is fundamentally different from our normal mapped case in that it doesn't even have a block address. So instead of having a flag for it it should be an entirely separate iomap range type. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>