summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-11-08net: cdc_ether: fix divide by 0 on bad descriptorsBjørn Mork
Setting dev->hard_mtu to 0 will cause a divide error in usbnet_probe. Protect against devices with bogus CDC Ethernet functional descriptors by ignoring a zero wMaxSegmentSize. Signed-off-by: Bjørn Mork <bjorn@mork.no> Acked-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-08KVM: PPC: Book3S HV: Fix exclusion between HPT resizing and other HPT updatesPaul Mackerras
Commit 5e9859699aba ("KVM: PPC: Book3S HV: Outline of KVM-HV HPT resizing implementation", 2016-12-20) added code that tries to exclude any use or update of the hashed page table (HPT) while the HPT resizing code is iterating through all the entries in the HPT. It does this by taking the kvm->lock mutex, clearing the kvm->arch.hpte_setup_done flag and then sending an IPI to all CPUs in the host. The idea is that any VCPU task that tries to enter the guest will see that the hpte_setup_done flag is clear and therefore call kvmppc_hv_setup_htab_rma, which also takes the kvm->lock mutex and will therefore block until we release kvm->lock. However, any VCPU that is already in the guest, or is handling a hypervisor page fault or hypercall, can re-enter the guest without rechecking the hpte_setup_done flag. The IPI will cause a guest exit of any VCPUs that are currently in the guest, but does not prevent those VCPU tasks from immediately re-entering the guest. The result is that after resize_hpt_rehash_hpte() has made a HPTE absent, a hypervisor page fault can occur and make that HPTE present again. This includes updating the rmap array for the guest real page, meaning that we now have a pointer in the rmap array which connects with pointers in the old rev array but not the new rev array. In fact, if the HPT is being reduced in size, the pointer in the rmap array could point outside the bounds of the new rev array. If that happens, we can get a host crash later on such as this one: [91652.628516] Unable to handle kernel paging request for data at address 0xd0000000157fb10c [91652.628668] Faulting instruction address: 0xc0000000000e2640 [91652.628736] Oops: Kernel access of bad area, sig: 11 [#1] [91652.628789] LE SMP NR_CPUS=1024 NUMA PowerNV [91652.628847] Modules linked in: binfmt_misc vhost_net vhost tap xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables ses enclosure scsi_transport_sas i2c_opal ipmi_powernv ipmi_devintf i2c_core ipmi_msghandler powernv_op_panel nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc kvm_hv kvm_pr kvm scsi_dh_alua dm_service_time dm_multipath tg3 ptp pps_core [last unloaded: stap_552b612747aec2da355051e464fa72a1_14259] [91652.629566] CPU: 136 PID: 41315 Comm: CPU 21/KVM Tainted: G O 4.14.0-1.rc4.dev.gitb27fc5c.el7.centos.ppc64le #1 [91652.629684] task: c0000007a419e400 task.stack: c0000000028d8000 [91652.629750] NIP: c0000000000e2640 LR: d00000000c36e498 CTR: c0000000000e25f0 [91652.629829] REGS: c0000000028db5d0 TRAP: 0300 Tainted: G O (4.14.0-1.rc4.dev.gitb27fc5c.el7.centos.ppc64le) [91652.629932] MSR: 900000010280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 44022422 XER: 00000000 [91652.630034] CFAR: d00000000c373f84 DAR: d0000000157fb10c DSISR: 40000000 SOFTE: 1 [91652.630034] GPR00: d00000000c36e498 c0000000028db850 c000000001403900 c0000007b7960000 [91652.630034] GPR04: d0000000117fb100 d000000007ab00d8 000000000033bb10 0000000000000000 [91652.630034] GPR08: fffffffffffffe7f 801001810073bb10 d00000000e440000 d00000000c373f70 [91652.630034] GPR12: c0000000000e25f0 c00000000fdb9400 f000000003b24680 0000000000000000 [91652.630034] GPR16: 00000000000004fb 00007ff7081a0000 00000000000ec91a 000000000033bb10 [91652.630034] GPR20: 0000000000010000 00000000001b1190 0000000000000001 0000000000010000 [91652.630034] GPR24: c0000007b7ab8038 d0000000117fb100 0000000ec91a1190 c000001e6a000000 [91652.630034] GPR28: 00000000033bb100 000000000073bb10 c0000007b7960000 d0000000157fb100 [91652.630735] NIP [c0000000000e2640] kvmppc_add_revmap_chain+0x50/0x120 [91652.630806] LR [d00000000c36e498] kvmppc_book3s_hv_page_fault+0xbb8/0xc40 [kvm_hv] [91652.630884] Call Trace: [91652.630913] [c0000000028db850] [c0000000028db8b0] 0xc0000000028db8b0 (unreliable) [91652.630996] [c0000000028db8b0] [d00000000c36e498] kvmppc_book3s_hv_page_fault+0xbb8/0xc40 [kvm_hv] [91652.631091] [c0000000028db9e0] [d00000000c36a078] kvmppc_vcpu_run_hv+0xdf8/0x1300 [kvm_hv] [91652.631179] [c0000000028dbb30] [d00000000c2248c4] kvmppc_vcpu_run+0x34/0x50 [kvm] [91652.631266] [c0000000028dbb50] [d00000000c220d54] kvm_arch_vcpu_ioctl_run+0x114/0x2a0 [kvm] [91652.631351] [c0000000028dbbd0] [d00000000c2139d8] kvm_vcpu_ioctl+0x598/0x7a0 [kvm] [91652.631433] [c0000000028dbd40] [c0000000003832e0] do_vfs_ioctl+0xd0/0x8c0 [91652.631501] [c0000000028dbde0] [c000000000383ba4] SyS_ioctl+0xd4/0x130 [91652.631569] [c0000000028dbe30] [c00000000000b8e0] system_call+0x58/0x6c [91652.631635] Instruction dump: [91652.631676] fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ffa1 2fa70000 793d0020 e9432110 [91652.631814] 7bbf26e4 7c7e1b78 7feafa14 409e0094 <807f000c> 786326e4 7c6a1a14 93a40008 [91652.631959] ---[ end trace ac85ba6db72e5b2e ]--- To fix this, we tighten up the way that the hpte_setup_done flag is checked to ensure that it does provide the guarantee that the resizing code needs. In kvmppc_run_core(), we check the hpte_setup_done flag after disabling interrupts and refuse to enter the guest if it is clear (for a HPT guest). The code that checks hpte_setup_done and calls kvmppc_hv_setup_htab_rma() is moved from kvmppc_vcpu_run_hv() to a point inside the main loop in kvmppc_run_vcpu(), ensuring that we don't just spin endlessly calling kvmppc_run_core() while hpte_setup_done is clear, but instead have a chance to block on the kvm->lock mutex. Finally we also check hpte_setup_done inside the region in kvmppc_book3s_hv_page_fault() where the HPTE is locked and we are about to update the HPTE, and bail out if it is clear. If another CPU is inside kvm_vm_ioctl_resize_hpt_commit) and has cleared hpte_setup_done, then we know that either we are looking at a HPTE that resize_hpt_rehash_hpte() has not yet processed, which is OK, or else we will see hpte_setup_done clear and refuse to update it, because of the full barrier formed by the unlock of the HPTE in resize_hpt_rehash_hpte() combined with the locking of the HPTE in kvmppc_book3s_hv_page_fault(). Fixes: 5e9859699aba ("KVM: PPC: Book3S HV: Outline of KVM-HV HPT resizing implementation") Cc: stable@vger.kernel.org # v4.10+ Reported-by: Satheesh Rajendran <satheera@in.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-11-08bonding: discard lowest hash bit for 802.3ad layer3+4Hangbin Liu
After commit 07f4c90062f8 ("tcp/dccp: try to not exhaust ip_local_port_range in connect()"), we will try to use even ports for connect(). Then if an application (seen clearly with iperf) opens multiple streams to the same destination IP and port, each stream will be given an even source port. So the bonding driver's simple xmit_hash_policy based on layer3+4 addressing will always hash all these streams to the same interface. And the total throughput will limited to a single slave. Change the tcp code will impact the whole tcp behavior, only for bonding usage. Paolo Abeni suggested fix this by changing the bonding code only, which should be more reasonable, and less impact. Fix this by discarding the lowest hash bit because it contains little entropy. After the fix we can re-balance between slaves. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-07Input: synaptics-rmi4 - RMI4 can also use SMBUS version 3Yiannis Marangos
Some Synaptics devices, such as LEN0073, use SMBUS version 3. Signed-off-by: Yiannis Marangos <yiannis.marangos@gmail.com> Acked-by: Benjamin Tissoires <benjamion.tissoires@redhat.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-11-07Input: tsc200x-core - set INPUT_PROP_DIRECTMartin Kepplinger
If INPUT_PROP_DIRECT is set, userspace doesn't have to fall back to old ways of identifying touchscreen devices. In order to identify a touchscreen device, Android for example, seems to already depend on INPUT_PROP_DIRECT to be present in drivers. udev still checks for either BTN_TOUCH or INPUT_PROP_DIRECT. Checking for BTN_TOUCH however can quite easily lead to false positives; it's a code that not only touchscreen device drivers use. According to the documentation, touchscreen drivers should have this property set and in order to make life easy for userspace, let's set it. Signed-off-by: Martin Kepplinger <martink@posteo.de> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-11-07Input: elan_i2c - add ELAN060C to the ACPI tableKai-Heng Feng
ELAN060C touchpad uses elan_i2c as its driver. It can be found on Lenovo ideapad 320-14AST. BugLink: https://bugs.launchpad.net/bugs/1727544 Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-11-08net/mlx5e/core/en_fs: fix pointer dereference after free in ↵Gustavo A. R. Silva
mlx5e_execute_l2_action hn is being kfree'd in mlx5e_del_l2_from_hash and then dereferenced by accessing hn->ai.addr Fix this by copying the MAC address into a local variable for its safe use in all possible execution paths within function mlx5e_execute_l2_action. Addresses-Coverity-ID: 1417789 Fixes: eeb66cdb6826 ("net/mlx5: Separate between E-Switch and MPFS") Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-08net: mvpp2: Prevent userspace from changing TX affinitiesMarc Zyngier
The mvpp2 driver can't cope at all with the TX affinities being changed from userspace, and spit an endless stream of [ 91.779920] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing [ 91.779930] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing [ 91.780402] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing [ 91.780406] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing [ 91.780415] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing [ 91.780418] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing rendering the box completely useless (I've measured around 600k interrupts/s on a 8040 box) once irqbalance kicks in and start doing its job. Obviously, the driver was never designed with this in mind. So let's work around the problem by preventing userspace from interacting with these interrupts altogether. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-07MAINTAINERS: Remove Gabriele Paoloni as HiSilicon PCI maintainerGabriele Paoloni
Gabriele is now moving to a different role, so remove him as HiSilicon PCI maintainer. Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com> [bhelgaas: Thanks for all your help, Gabriele, and best wishes!] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Zhou Wang <wangzhou1@hisilicon.com>
2017-11-07MAINTAINERS: Remove Stephen Bates as Microsemi Switchtec maintainerSebastian Andrzej Siewior
Just sent an email there and received an autoreply because he no longer works there. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-11-07MIPS: BMIPS: Fix missing cbr addressJaedon Shin
Fix NULL pointer access in BMIPS3300 RAC flush. Fixes: 738a3f79027b ("MIPS: BMIPS: Add early CPU initialization code") Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Cc: Kevin Cernekee <cernekee@gmail.com> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.7+ Patchwork: https://patchwork.linux-mips.org/patch/16423/ Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-07resource: Fix resource_size.cocci warningskbuild test robot
arch/x86/kernel/crash.c:627:34-37: ERROR: Missing resource_size with res arch/x86/kernel/crash.c:528:16-19: ERROR: Missing resource_size with res Use resource_size function on resource object instead of explicit computation. Generated by: scripts/coccinelle/api/resource_size.cocci Fixes: 1d2e733b13b4 ("resource: Provide resource struct in resource walk callback") Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Juergen Gross <jgross@suse.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Kees Cook <keescook@chromium.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: kbuild-all@01.org Cc: tipbuild@zytor.com Cc: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20171107191801.GA91887@lkp-snb01
2017-11-07drivers/ide-cd: Handle missing driver data during status check gracefullyBorislav Petkov
The 0day bot reports the below failure which happens occasionally, with their randconfig testing (once every ~100 boots). The Code points at the private pointer ->driver_data being NULL, which hints at a race of sorts where the private driver_data descriptor has disappeared by the time we get to run the workqueue. So let's check that pointer before we continue with issuing the command to the drive. This fix is of the brown paper bag nature but considering that IDE is long deprecated, let's do that so that random testing which happens to enable CONFIG_IDE during randconfig builds, doesn't fail because of this. Besides, failing the TEST_UNIT_READY command because the drive private data is gone is something which we could simply do anyway, to denote that there was a problem communicating with the device. BUG: unable to handle kernel NULL pointer dereference at 000001c0 IP: cdrom_check_status *pde = 00000000 Oops: 0000 [#1] SMP CPU: 1 PID: 155 Comm: kworker/1:2 Not tainted 4.14.0-rc8 #127 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 Workqueue: events_freezable_power_ disk_events_workfn task: 4fe90980 task.stack: 507ac000 EIP: cdrom_check_status+0x2c/0x90 EFLAGS: 00210246 CPU: 1 EAX: 00000000 EBX: 4fefec00 ECX: 00000000 EDX: 00000000 ESI: 00000003 EDI: ffffffff EBP: 467a9340 ESP: 507aded0 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 CR0: 80050033 CR2: 000001c0 CR3: 06e0f000 CR4: 00000690 Call Trace: ? ide_cdrom_check_events_real ? cdrom_check_events ? disk_check_events ? process_one_work ? process_one_work ? worker_thread ? kthread ? process_one_work ? __kthread_create_on_node ? ret_from_fork Code: 53 83 ec 14 89 c3 89 d1 be 03 00 00 00 65 a1 14 00 00 00 89 44 24 10 31 c0 8b 43 18 c7 44 24 04 00 00 00 00 c7 04 24 00 00 00 00 <8a> 80 c0 01 00 00 c7 44 24 08 00 00 00 00 83 e0 03 c7 44 24 0c EIP: cdrom_check_status+0x2c/0x90 SS:ESP: 0068:507aded0 CR2: 00000000000001c0 ---[ end trace 2410e586dd8f88b2 ]--- Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jens Axboe <axboe@kernel.dk> Cc: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-07Revert "scsi: make 'state' device attribute pollable"Linus Torvalds
This reverts commit 8a97712e5314aefe16b3ffb4583a34deaa49de04. This commit added a call to sysfs_notify() from within scsi_device_set_state(), which in turn turns out to make libata very unhappy, because ata_eh_detach_dev() does spin_lock_irqsave(ap->lock, flags); .. if (ata_scsi_offline_dev(dev)) { dev->flags |= ATA_DFLAG_DETACHED; ap->pflags |= ATA_PFLAG_SCSI_HOTPLUG; } and ata_scsi_offline_dev() then does that scsi_device_set_state() to set it offline. So now we called sysfs_notify() from within a spinlocked region, which really doesn't work. The 0day robot reported this as: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:238 because sysfs_notify() ends up calling kernfs_find_and_get_ns() which then does mutex_lock(&kernfs_mutex).. The pollability of the device state isn't critical, so revert this all for now, and maybe we'll do it differently in the future. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-07ALSA: seq: Fix OSS sysex delivery in OSS emulationTakashi Iwai
The SYSEX event delivery in OSS sequencer emulation assumed that the event is encoded in the variable-length data with the straight buffering. This was the normal behavior in the past, but during the development, the chained buffers were introduced for carrying more data, while the OSS code was left intact. As a result, when a SYSEX event with the chained buffer data is passed to OSS sequencer port, it may end up with the wrong memory access, as if it were having a too large buffer. This patch addresses the bug, by applying the buffer data expansion by the generic snd_seq_dump_var_event() helper function. Reported-by: syzbot <syzkaller@googlegroups.com> Reported-by: Mark Salyzyn <salyzyn@android.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-11-07x86/smpboot: Make optimization of delay calibration work correctlyPavel Tatashin
If the TSC has constant frequency then the delay calibration can be skipped when it has been calibrated for a package already. This is checked in calibrate_delay_is_known(), but that function is buggy in two aspects: It returns 'false' if (!tsc_disabled && !cpu_has(&cpu_data(cpu), X86_FEATURE_CONSTANT_TSC) which is obviously the reverse of the intended check and the check for the sibling mask cannot work either because the topology links have not been set up yet. Correct the condition and move the call to set_cpu_sibling_map() before invoking calibrate_delay() so the sibling check works correctly. [ tglx: Rewrote changelong ] Fixes: c25323c07345 ("x86/tsc: Use topology functions") Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: peterz@infradead.org Cc: bob.picco@oracle.com Cc: steven.sistare@oracle.com Cc: daniel.m.jordan@oracle.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20171028001100.26603-1-pasha.tatashin@oracle.com
2017-11-07X86/KVM: Clear encryption attribute when SEV is activeBrijesh Singh
The guest physical memory area holding the struct pvclock_wall_clock and struct pvclock_vcpu_time_info are shared with the hypervisor. It periodically updates the contents of the memory. When SEV is active, the encryption attributes from the shared memory pages must be cleared so that both hypervisor and guest can access the data. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lkml.kernel.org/r/20171020143059.3291-18-brijesh.singh@amd.com
2017-11-07X86/KVM: Decrypt shared per-cpu variables when SEV is activeBrijesh Singh
When SEV is active, guest memory is encrypted with a guest-specific key, a guest memory region shared with the hypervisor must be mapped as decrypted before it can be shared. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lkml.kernel.org/r/20171020143059.3291-17-brijesh.singh@amd.com
2017-11-07percpu: Introduce DEFINE_PER_CPU_DECRYPTEDBrijesh Singh
KVM guest defines three per-CPU variables (steal-time, apf_reason, and kvm_pic_eoi) which are shared between a guest and a hypervisor. When SEV is active, memory is encrypted with a guest-specific key, and if the guest OS wants to share the memory region with the hypervisor then it must clear the C-bit (i.e set decrypted) before sharing it. DEFINE_PER_CPU_DECRYPTED can be used to define the per-CPU variables which will be shared between a guest and a hypervisor. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Borislav Petkov <bp@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: linux-arch@vger.kernel.org Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: kvm@vger.kernel.org Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Lameter <cl@linux.com> Link: https://lkml.kernel.org/r/20171020143059.3291-16-brijesh.singh@amd.com
2017-11-07x86: Add support for changing memory encryption attribute in early bootBrijesh Singh
Some KVM-specific custom MSRs share the guest physical address with the hypervisor in early boot. When SEV is active, the shared physical address must be mapped with memory encryption attribute cleared so that both hypervisor and guest can access the data. Add APIs to change the memory encryption attribute in early boot code. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: kvm@vger.kernel.org Cc: Borislav Petkov <bp@alien8.de> Link: https://lkml.kernel.org/r/20171020143059.3291-15-brijesh.singh@amd.com
2017-11-07x86/io: Unroll string I/O when SEV is activeTom Lendacky
Secure Encrypted Virtualization (SEV) does not support string I/O, so unroll the string I/O operation into a loop operating on one element at a time. [ tglx: Gave the static key a real name instead of the obscure __sev ] Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: kvm@vger.kernel.org Cc: David Laight <David.Laight@ACULAB.COM> Cc: Borislav Petkov <bp@alien8.de> Link: https://lkml.kernel.org/r/20171020143059.3291-14-brijesh.singh@amd.com
2017-11-07x86/boot: Add early boot support when running with SEV activeTom Lendacky
Early in the boot process, add checks to determine if the kernel is running with Secure Encrypted Virtualization (SEV) active. Checking for SEV requires checking that the kernel is running under a hypervisor (CPUID 0x00000001, bit 31), that the SEV feature is available (CPUID 0x8000001f, bit 1) and then checking a non-interceptable SEV MSR (0xc0010131, bit 0). This check is required so that during early compressed kernel booting the pagetables (both the boot pagetables and KASLR pagetables (if enabled) are updated to include the encryption mask so that when the kernel is decompressed into encrypted memory, it can boot properly. After the kernel is decompressed and continues booting the same logic is used to check if SEV is active and set a flag indicating so. This allows to distinguish between SME and SEV, each of which have unique differences in how certain things are handled: e.g. DMA (always bounce buffered with SEV) or EFI tables (always access decrypted with SME). Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: Laura Abbott <labbott@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: kvm@vger.kernel.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: https://lkml.kernel.org/r/20171020143059.3291-13-brijesh.singh@amd.com
2017-11-07x86/mm: Add DMA support for SEV memory encryptionTom Lendacky
DMA access to encrypted memory cannot be performed when SEV is active. In order for DMA to properly work when SEV is active, the SWIOTLB bounce buffers must be used. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de>C Tested-by: Borislav Petkov <bp@suse.de> Cc: kvm@vger.kernel.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Link: https://lkml.kernel.org/r/20171020143059.3291-12-brijesh.singh@amd.com
2017-11-07x86/mm, resource: Use PAGE_KERNEL protection for ioremap of memory pagesTom Lendacky
In order for memory pages to be properly mapped when SEV is active, it's necessary to use the PAGE_KERNEL protection attribute as the base protection. This ensures that memory mapping of, e.g. ACPI tables, receives the proper mapping attributes. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: Laura Abbott <labbott@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: kvm@vger.kernel.org Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: https://lkml.kernel.org/r/20171020143059.3291-11-brijesh.singh@amd.com
2017-11-07resource: Provide resource struct in resource walk callbackTom Lendacky
In preperation for a new function that will need additional resource information during the resource walk, update the resource walk callback to pass the resource structure. Since the current callback start and end arguments are pulled from the resource structure, the callback functions can obtain them from the resource structure directly. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: kvm@vger.kernel.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lkml.kernel.org/r/20171020143059.3291-10-brijesh.singh@amd.com
2017-11-07resource: Consolidate resource walking codeTom Lendacky
The walk_iomem_res_desc(), walk_system_ram_res() and walk_system_ram_range() functions each have much of the same code. Create a new function that consolidates the common code from these functions in one place to reduce the amount of duplicated code. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: kvm@vger.kernel.org Cc: Borislav Petkov <bp@alien8.de> Link: https://lkml.kernel.org/r/20171020143059.3291-9-brijesh.singh@amd.com
2017-11-07x86/efi: Access EFI data as encrypted when SEV is activeTom Lendacky
EFI data is encrypted when the kernel is run under SEV. Update the page table references to be sure the EFI memory areas are accessed encrypted. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: linux-efi@vger.kernel.org Cc: kvm@vger.kernel.org Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20171020143059.3291-8-brijesh.singh@amd.com
2017-11-07x86/mm: Include SEV for encryption memory attribute changesTom Lendacky
The current code checks only for sme_active() when determining whether to perform the encryption attribute change. Include sev_active() in this check so that memory attribute changes can occur under SME and SEV. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: Laura Abbott <labbott@redhat.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: kvm@vger.kernel.org Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: https://lkml.kernel.org/r/20171020143059.3291-7-brijesh.singh@amd.com
2017-11-07x86/mm: Use encrypted access of boot related data with SEVTom Lendacky
When Secure Encrypted Virtualization (SEV) is active, boot data (such as EFI related data, setup data) is encrypted and needs to be accessed as such when mapped. Update the architecture override in early_memremap to keep the encryption attribute when mapping this data. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: Laura Abbott <labbott@redhat.com> Cc: kvm@vger.kernel.org Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: https://lkml.kernel.org/r/20171020143059.3291-6-brijesh.singh@amd.com
2017-11-07x86/realmode: Don't decrypt trampoline area under SEVTom Lendacky
When SEV is active the trampoline area will need to be in encrypted memory so only mark the area decrypted if SME is active. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: Laura Abbott <labbott@redhat.com> Cc: kvm@vger.kernel.org Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: https://lkml.kernel.org/r/20171020143059.3291-5-brijesh.singh@amd.com
2017-11-07x86/mm: Don't attempt to encrypt initrd under SEVTom Lendacky
When SEV is active the initrd/initramfs will already have already been placed in memory encrypted so do not try to encrypt it. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: kvm@vger.kernel.org Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20171020143059.3291-4-brijesh.singh@amd.com
2017-11-07x86/mm: Add Secure Encrypted Virtualization (SEV) supportTom Lendacky
Provide support for Secure Encrypted Virtualization (SEV). This initial support defines a flag that is used by the kernel to determine if it is running with SEV active. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: kvm@vger.kernel.org Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20171020143059.3291-3-brijesh.singh@amd.com
2017-11-07Documentation/x86: Add AMD Secure Encrypted Virtualization (SEV) descriptionBrijesh Singh
Update the AMD memory encryption document describing the Secure Encrypted Virtualization (SEV) feature. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: kvm@vger.kernel.org Cc: Jonathan Corbet <corbet@lwn.net> Cc: Borislav Petkov <bp@alien8.de> Link: https://lkml.kernel.org/r/20171020143059.3291-2-brijesh.singh@amd.com
2017-11-07sdhci-fujitsu: add support for setting the CMD_DAT_DELAY attributeArd Biesheuvel
The Socionext SynQuacer SoC inherits this IP from Fujitsu, but requires the F_SDH30_CMD_DAT_DELAY bit to be set in the F_SDH30_ESD_CONTROL control register. So set this bit if the DT node has the 'fujitsu,cmd-dat-delay-select' property. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-11-07dt-bindings: sdhci-fujitsu: document cmd-dat-delay propertyArd Biesheuvel
Document a new boolean property of the sdhci-fujitsu binding that indicates whether the CMD_DAT_DELAY bit needs to be set in the F_SDH30_ESD_CONTROL register. Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-11-07mmc: tmio: Replace msleep() of 20ms or less with usleep_range()Masaharu Hayakawa
As documented in Documentation/timers/timers-howto.txt as follows, replace msleep() with usleep_range(). msleep(1~20) may not do what the caller intends, and will often sleep longer (~20 ms actual sleep for any value given in the 1~20ms range). In many cases this is not the desired behavior. Signed-off-by: Masaharu Hayakawa <masaharu.hayakawa.ry@renesas.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Acked-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-11-07irqchip/stm32: Move the wakeup on interrupt maskLudovic Barre
Move irq_set_wake on interrupt mask, needed to wake up from low power mode as the event mask is not able to do so. Signed-off-by: Ludovic Barre <ludovic.barre@st.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-11-07irqchip/stm32: Fix initial valuesLudovic Barre
-After cold boot, imr default value depends on hardware configuration. -After hot reboot the registers must be cleared to avoid residue. Signed-off-by: Ludovic Barre <ludovic.barre@st.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-11-07irqchip/stm32: Add stm32h7 supportLudovic Barre
stm32h7 has up to 96 inputs (3 banks of 32 inputs max). Signed-off-by: Ludovic Barre <ludovic.barre@st.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-11-07dt-bindings/interrupt-controllers: Add compatible string for stm32h7Ludovic Barre
This patch updates stm32-exti documentation with stm32h7-exti compatible string. Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Ludovic Barre <ludovic.barre@st.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-11-07irqchip/stm32: Add multi-bank managementLudovic Barre
-Prepare to manage multi-bank of external interrupts (N banks of 32 inputs). -Prepare to manage registers offsets by compatible (registers offsets could be different follow per stm32 platform). Signed-off-by: Ludovic Barre <ludovic.barre@st.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-11-07irqchip/stm32: Select GENERIC_IRQ_CHIPLudovic Barre
This patch adds GENERIC_IRQ_CHIP to stm32 exti config. Signed-off-by: Ludovic Barre <ludovic.barre@st.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-11-07locking/rwlocks: Fix commentsCheng Jian
- fix the list of locking API headers in kernel/locking/spinlock.c - fix an #endif comment Signed-off-by: Cheng Jian <cj.chengjian@huawei.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: huawei.libin@huawei.com Cc: xiexiuqi@huawei.com Link: http://lkml.kernel.org/r/1509706788-152547-1-git-send-email-cj.chengjian@huawei.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07kprobes, x86/alternatives: Use text_mutex to protect smp_alt_modulesZhou Chengming
We use alternatives_text_reserved() to check if the address is in the fixed pieces of alternative reserved, but the problem is that we don't hold the smp_alt mutex when call this function. So the list traversal may encounter a deleted list_head if another path is doing alternatives_smp_module_del(). One solution is that we can hold smp_alt mutex before call this function, but the difficult point is that the callers of this functions, arch_prepare_kprobe() and arch_prepare_optimized_kprobe(), are called inside the text_mutex. So we must hold smp_alt mutex before we go into these arch dependent code. But we can't now, the smp_alt mutex is the arch dependent part, only x86 has it. Maybe we can export another arch dependent callback to solve this. But there is a simpler way to handle this problem. We can reuse the text_mutex to protect smp_alt_modules instead of using another mutex. And all the arch dependent checks of kprobes are inside the text_mutex, so it's safe now. Signed-off-by: Zhou Chengming <zhouchengming1@huawei.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bp@suse.de Fixes: 2cfa197 "ftrace/alternatives: Introducing *_text_reserved functions" Link: http://lkml.kernel.org/r/1509585501-79466-1-git-send-email-zhouchengming1@huawei.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07irqchip/exiu: Add support for Socionext Synquacer EXIU controllerArd Biesheuvel
The Socionext Synquacer SoC has an external interrupt unit (EXIU) that forwards a block of 32 configurable input lines to 32 adjacent level-high type GICv3 SPIs. The EXIU has per-interrupt level/edge and polarity controls, and mask bits that keep the outgoing lines de-asserted, even though the controller may still latch interrupt conditions that occur while the line is masked. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-11-07dt-bindings: Add description of Socionext EXIU interrupt controllerArd Biesheuvel
Add a description of the External Interrupt Unit (EXIU) interrupt controller as found on the Socionext SynQuacer SoC. Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-11-07x86/mm: Remove unnecessary TLB flush for SME in-place encryptionTom Lendacky
A TLB flush is not required when doing in-place encryption or decryption since the area's pagetable attributes are not being altered. To avoid confusion between what the routine is doing and what is documented in the AMD APM, delete the local_flush_tlb() call. Suggested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20171101165426.1388.24866.stgit@tlendack-t1.amdoffice.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07irqchip/gic-v3-its: Fix VPE activate callback return valueMarc Zyngier
its_vpe_irq_domain_activate should always return 0. Really. There is not a single case why it wouldn't. So this "return true;" is really a copy/paste issue that got revealed now that we actually check the return value of the activate method. Brown paper bag day. Fixes: 2247e1bf7063 ("irqchip/gic-v3-its: Limit scope of VPE mapping to be per ITS") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-11-07Merge tag 'timers-conversion-next4' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux into timers/core Pull the 4th timer conversion batch from Kees Cook - A couple fixes for less common build configurations - More stragglers that have either been reviewed or gone long enough on list
2017-11-07arm/kprobes: Remove jprobe test caseMasami Hiramatsu
Remove the jprobes test case because jprobes is a deprecated feature. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jon Medhurst <tixy@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Brown <broonie@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wang Nan <wangnan0@huawei.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/150976988105.2012.13618117383683725047.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>