summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-08-30Merge branch 'drm-fixes-4.19' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie
into drm-fixes Fixes for 4.19: - SR-IOV fixes - Kasan and page fault fix on device removal - S3 stability fix for CZ/ST - VCE regression fixes for CIK parts - Avoid holding the mn_lock when allocating memory - DC memory leak fix - BO eviction fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180829202555.2653-1-alexander.deucher@amd.com
2018-08-30Merge branch 'mediatek-drm-fixes-4.19' of ↵Dave Airlie
https://github.com/ckhu-mediatek/linux.git-tags into drm-fixes "Here are some fixes for mediatek drm driver." Mostly fixes around the RDMA and Overlay Signed-off-by: Dave Airlie <airlied@redhat.com> From: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.freedesktop.org/patch/msgid/1535346194.27648.5.camel@mtksdaap41
2018-08-29net/sched: act_pedit: fix dump of extended layered opDavide Caratti
in the (rare) case of failure in nla_nest_start(), missing NULL checks in tcf_pedit_key_ex_dump() can make the following command # tc action add action pedit ex munge ip ttl set 64 dereference a NULL pointer: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 PGD 800000007d1cd067 P4D 800000007d1cd067 PUD 7acd3067 PMD 0 Oops: 0002 [#1] SMP PTI CPU: 0 PID: 3336 Comm: tc Tainted: G E 4.18.0.pedit+ #425 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:tcf_pedit_dump+0x19d/0x358 [act_pedit] Code: be 02 00 00 00 48 89 df 66 89 44 24 20 e8 9b b1 fd e0 85 c0 75 46 8b 83 c8 00 00 00 49 83 c5 08 48 03 83 d0 00 00 00 4d 39 f5 <66> 89 04 25 00 00 00 00 0f 84 81 01 00 00 41 8b 45 00 48 8d 4c 24 RSP: 0018:ffffb5d4004478a8 EFLAGS: 00010246 RAX: ffff8880fcda2070 RBX: ffff8880fadd2900 RCX: 0000000000000000 RDX: 0000000000000002 RSI: ffffb5d4004478ca RDI: ffff8880fcda206e RBP: ffff8880fb9cb900 R08: 0000000000000008 R09: ffff8880fcda206e R10: ffff8880fadd2900 R11: 0000000000000000 R12: ffff8880fd26cf40 R13: ffff8880fc957430 R14: ffff8880fc957430 R15: ffff8880fb9cb988 FS: 00007f75a537a740(0000) GS:ffff8880fda00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000007a2fa005 CR4: 00000000001606f0 Call Trace: ? __nla_reserve+0x38/0x50 tcf_action_dump_1+0xd2/0x130 tcf_action_dump+0x6a/0xf0 tca_get_fill.constprop.31+0xa3/0x120 tcf_action_add+0xd1/0x170 tc_ctl_action+0x137/0x150 rtnetlink_rcv_msg+0x263/0x2d0 ? _cond_resched+0x15/0x40 ? rtnl_calcit.isra.30+0x110/0x110 netlink_rcv_skb+0x4d/0x130 netlink_unicast+0x1a3/0x250 netlink_sendmsg+0x2ae/0x3a0 sock_sendmsg+0x36/0x40 ___sys_sendmsg+0x26f/0x2d0 ? do_wp_page+0x8e/0x5f0 ? handle_pte_fault+0x6c3/0xf50 ? __handle_mm_fault+0x38e/0x520 ? __sys_sendmsg+0x5e/0xa0 __sys_sendmsg+0x5e/0xa0 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f75a4583ba0 Code: c3 48 8b 05 f2 62 2c 00 f7 db 64 89 18 48 83 cb ff eb dd 0f 1f 80 00 00 00 00 83 3d fd c3 2c 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ae cc 00 00 48 89 04 24 RSP: 002b:00007fff60ee7418 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007fff60ee7540 RCX: 00007f75a4583ba0 RDX: 0000000000000000 RSI: 00007fff60ee7490 RDI: 0000000000000003 RBP: 000000005b842d3e R08: 0000000000000002 R09: 0000000000000000 R10: 00007fff60ee6ea0 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fff60ee7554 R14: 0000000000000001 R15: 000000000066c100 Modules linked in: act_pedit(E) ip6table_filter ip6_tables iptable_filter binfmt_misc crct10dif_pclmul ext4 crc32_pclmul mbcache ghash_clmulni_intel jbd2 pcbc snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd snd_timer cryptd glue_helper snd joydev pcspkr soundcore virtio_balloon i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi virtio_net net_failover virtio_blk virtio_console failover qxl crc32c_intel drm_kms_helper syscopyarea serio_raw sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix virtio_pci libata virtio_ring i2c_core virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_pedit] CR2: 0000000000000000 Like it's done for other TC actions, give up dumping pedit rules and return an error if nla_nest_start() returns NULL. Fixes: 71d0ed7079df ("net/act_pedit: Support using offset relative to the conventional network headers") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-29sh_eth: Add R7S9210 supportChris Brandt
Add support for the R7S9210 which is part of the RZ/A2 series. Signed-off-by: Chris Brandt <chris.brandt@renesas.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-29Merge branch 'hns-fixes'David S. Miller
Peng Li says: ==================== net: hns: fix some bugs about speed and duplex change If there are packets in hardware when changing the spped or duplex, it may cause hardware hang up. This patchset adds the code for waiting chip to clean the all pkts(TX & RX) in chip when the driver uses the function named "adjust link". This patchset cleans the pkts as follows: 1) close rx of chip, close tx of protocol stack. 2) wait rcb, ppe, mac to clean. 3) adjust link 4) open rx of chip, open tx of protocol stack. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-29net: hns: add netif_carrier_off before change speed and duplexPeng Li
If there are packets in hardware when changing the speed or duplex, it may cause hardware hang up. This patch adds netif_carrier_off before change speed and duplex in ethtool_ops.set_link_ksettings, and adds netif_carrier_on after complete the change. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-29net: hns: add the code for cleaning pkt in chipPeng Li
If there are packets in hardware when changing the speed or duplex, it may cause hardware hang up. This patch adds the code for waiting chip to clean the all pkts(TX & RX) in chip when the driver uses the function named "adjust link". This patch cleans the pkts as follows: 1) close rx of chip, close tx of protocol stack. 2) wait rcb, ppe, mac to clean. 3) adjust link 4) open rx of chip, open tx of protocol stack. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-29r8169: set RxConfig after tx/rx is enabled for RTL8169sb/8110sb devicesAzat Khuzhin
I have two Ethernet adapters: r8169 0000:03:01.0 eth0: RTL8169sb/8110sb, 00:14:d1:14:2d:49, XID 10000000, IRQ 18 r8169 0000:01:00.0 eth0: RTL8168e/8111e, 64:66:b3:11:14:5d, XID 2c200000, IRQ 30 And after upgrading from linux 4.15 [1] to linux 4.18+ [2] RTL8169sb failed to receive any packets. tcpdump shows a lot of checksum mismatch. [1]: a0f79386a4968b4925da6db2d1daffd0605a4402 [2]: 0519359784328bfa92bf0931bf0cff3b58c16932 (4.19 merge window opened) I started bisecting and the found that [3] breaks it. According to [4]: "For 8110S, 8110SB, and 8110SC series, the initial value of RxConfig needs to be set after the tx/rx is enabled." So I moved rtl_init_rxcfg() after enabling tx/rs and now my adapter works (RTL8168e works too). [3]: 3559d81e76bfe3803e89f2e04cf6ef7ab4f3aace [4]: e542a2269f232d61270ceddd42b73a4348dee2bb ("r8169: adjust the RxConfig settings.") Also drop "rx" from rtl_set_rx_tx_config_registers(), since it does nothing with it already. Fixes: 3559d81e76bfe3803e89f2e04cf6ef7ab4f3aace ("r8169: simplify rtl_hw_start_8169") Cc: Heiner Kallweit <hkallweit1@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Cc: Realtek linux nic maintainers <nic_swsd@realtek.com> Signed-off-by: Azat Khuzhin <a3at.mail@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-29tipc: switch to rhashtable iteratorCong Wang
syzbot reported a use-after-free in tipc_group_fill_sock_diag(), where tipc_group_fill_sock_diag() still reads tsk->group meanwhile tipc_group_delete() just deletes it in tipc_release(). tipc_nl_sk_walk() aims to lock this sock when walking each sock in the hash table to close race conditions with sock changes like this one, by acquiring tsk->sk.sk_lock.slock spinlock, unfortunately this doesn't work at all. All non-BH call path should take lock_sock() instead to make it work. tipc_nl_sk_walk() brutally iterates with raw rht_for_each_entry_rcu() where RCU read lock is required, this is the reason why lock_sock() can't be taken on this path. This could be resolved by switching to rhashtable iterator API's, where taking a sleepable lock is possible. Also, the iterator API's are friendly for restartable calls like diag dump, the last position is remembered behind the scence, all we need to do here is saving the iterator into cb->args[]. I tested this with parallel tipc diag dump and thousands of tipc socket creation and release, no crash or memory leak. Reported-by: syzbot+b9c8f3ab2994b7cd1625@syzkaller.appspotmail.com Cc: Jon Maloy <jon.maloy@ericsson.com> Cc: Ying Xue <ying.xue@windriver.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-29Revert "net: stmmac: Do not keep rearming the coalesce timer in stmmac_xmit"Jerome Brunet
This reverts commit 4ae0169fd1b3c792b66be58995b7e6b629919ecf. This change in the handling of the coalesce timer is causing regression on (at least) amlogic platforms. Network will break down very quickly (a few seconds) after starting a download. This can easily be reproduced using iperf3 for example. The problem has been reported on the S805, S905, S912 and A113 SoCs (Realtek and Micrel PHYs) and it is likely impacting all Amlogics platforms using Gbit ethernet No problem was seen with the platform using 10/100 only PHYs (GXL internal) Reverting change brings things back to normal and allows to use network again until we better understand the problem with the coalesce timer. Cc: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: Vitor Soares <soares@synopsys.com> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Cc: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-29tipc: fix a missing rhashtable_walk_exit()Cong Wang
rhashtable_walk_exit() must be paired with rhashtable_walk_enter(). Fixes: 40f9f4397060 ("tipc: Fix tipc_sk_reinit race conditions") Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ying Xue <ying.xue@windriver.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-29vti6: remove !skb->ignore_df check from vti6_xmit()Alexey Kodanev
Before the commit d6990976af7c ("vti6: fix PMTU caching and reporting on xmit") '!skb->ignore_df' check was always true because the function skb_scrub_packet() was called before it, resetting ignore_df to zero. In the commit, skb_scrub_packet() was moved below, and now this check can be false for the packet, e.g. when sending it in the two fragments, this prevents successful PMTU updates in such case. The next attempts to send the packet lead to the same tx error. Moreover, vti6 initial MTU value relies on PMTU adjustments. This issue can be reproduced with the following LTP test script: udp_ipsec_vti.sh -6 -p ah -m tunnel -s 2000 Fixes: ccd740cbc6e0 ("vti6: Add pmtu handling to vti6_xmit.") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2018-08-29 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix a build error in sk_reuseport_convert_ctx_access() when compiling with clang which cannot resolve hweight_long() at build time inside the BUILD_BUG_ON() assertion, from Stefan. 2) Several fixes for BPF sockmap, four of them in getting the bpf_msg_pull_data() helper to work, one use after free case in bpf_tcp_close() and one refcount leak in bpf_tcp_recvmsg(), from Daniel. 3) Another fix for BPF sockmap where we misaccount sk_mem_uncharge() in the socket redirect error case from unwinding scatterlist twice, from John. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-29powerpc: disable support for relative ksymtab referencesArd Biesheuvel
The newly added code that emits ksymtab entries as pairs of 32-bit relative references interacts poorly with the way powerpc lays out its address space: when a module exports a per-CPU variable, the primary module region covering the ksymtab entry -and thus the 32-bit relative reference- is too far away from the actual per-CPU variable's base address (to which the per-CPU offsets are applied to obtain the respective address of each CPU's copy), resulting in corruption when the module loader attempts to resolve symbol references of modules that are loaded on top and link to the exported per-CPU symbol. So let's disable this feature on powerpc. Even though it implements CONFIG_RELOCATABLE, it does not implement CONFIG_RANDOMIZE_BASE and so KASLR kernels (which are the main target of the feature) do not exist on powerpc anyway. Reported-by: Andreas Schwab <schwab@linux-m68k.org> Suggested-by: Nicholas Piggin <nicholas.piggin@gmail.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-29Merge tag 'hwmon-for-linus-v4.19-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - Fix potential Spectre v1 in nct6775 - Add error checking to adt7475 driver - Fix reading shunt resistor value in ina2xx driver * tag 'hwmon-for-linus-v4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (nct6775) Fix potential Spectre v1 hwmon: (adt7475) Make adt7475_read_word() return errors hwmon: (adt7475) Potential error pointer dereferences hwmon: (ina2xx) fix sysfs shunt resistor read access
2018-08-29Merge tag 'for_v4.19-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull misc fs fixes from Jan Kara: - make UDF to properly mount media created by Win7 - make isofs to properly refuse devices with large physical block size - fix a Spectre gadget in quotactl(2) - fix a warning in fsnotify code hit by syzkaller * tag 'for_v4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: udf: Fix mounting of Win7 created UDF filesystems udf: Remove dead code from udf_find_fileset() fs/quota: Fix spectre gadget in do_quotactl fs/quota: Replace XQM_MAXQUOTAS usage with MAXQUOTAS isofs: reject hardware sector size > 2048 bytes fsnotify: fix false positive warning on inode delete
2018-08-29Merge tag 'nios2-v4.19-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2 Pull nios2 fix from Ley Foon Tan: "remove duplicate DEBUG_STACK_USAGE symbol defintions" * tag 'nios2-v4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2: nios2: kconfig: remove duplicate DEBUG_STACK_USAGE symbol defintions
2018-08-29drm/i915/audio: Hook up component bindings even if displays are disabledChris Wilson
If the display has been disabled by modparam, we still want to connect together the HW bits and bobs with the associated drivers so that we can continue to manage their runtime power gating. Fixes: 108109444ff6 ("drm/i915: Check num_pipes before initializing audio component") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Elaine Wang <elaine.wang@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180817100241.4628-1-chris@chris-wilson.co.uk (cherry picked from commit 35a5fd9ebfa93758ca579e30f337b6c9126d995b) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2018-08-29drm/i915: Increase LSPCON timeoutFredrik Schön
100 ms is not enough time for the LSPCON adapter on Intel NUC devices to settle. This causes dropped display modes at boot or screen reconfiguration. Empirical testing can reproduce the error up to a timeout of 190 ms. Basic boot and stress testing at 200 ms has not (yet) failed. Increase timeout to 400 ms to get some margin of error. Changes from v1: The initial suggestion of 1000 ms was lowered due to concerns about delaying valid timeout cases. Update patch metadata. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107503 Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1570392 Fixes: 357c0ae9198a ("drm/i915/lspcon: Wait for expected LSPCON mode to settle") Cc: Shashank Sharma <shashank.sharma@intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: <stable@vger.kernel.org> # v4.11+ Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Shashank Sharma <shashank.sharma@intel.com> Signed-off-by: Fredrik Schön <fredrik.schon@gmail.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180817200728.8154-1-fredrik.schon@gmail.com (cherry picked from commit 59f1c8ab30d6f9042562949f42cbd3f3cf69de94) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2018-08-29drm/i915: Stop holding a ref to the ppgtt from each vmaChris Wilson
The context owns both the ppgtt and the vma within it, and our activity tracking on the context ensures that we do not release active ppgtt. As the context fulfils our obligations for active memory tracking, we can relinquish the reference from the vma. This fixes a silly transient refleak from closed vma being kept alive until the entire system was idle, keeping all vm alive as well. Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Testcase: igt/gem_ctx_create/files Fixes: 3365e2268b6b ("drm/i915: Lazily unbind vma on close") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Tested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180816073448.19396-1-chris@chris-wilson.co.uk (cherry picked from commit a4417b7b419a68540ad7945ac4efbb39d19afa63) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2018-08-29Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: - Check for the right CPU feature bit in sm4-ce on arm64. - Fix scatterwalk WARN_ON in aes-gcm-ce on arm64. - Fix unaligned fault in aesni on x86. - Fix potential NULL pointer dereference on exit in chtls. - Fix DMA mapping direction for RSA in caam. - Fix error path return value for xts setkey in caam. - Fix address endianness when DMA unmapping in caam. - Fix sleep-in-atomic in vmx. - Fix command corruption when queue is full in cavium/nitrox. * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: cavium/nitrox - fix for command corruption in queue full case with backlog submissions. crypto: vmx - Fix sleep-in-atomic bugs crypto: arm64/aes-gcm-ce - fix scatterwalk API violation crypto: aesni - Use unaligned loads from gcm_context_data crypto: chtls - fix null dereference chtls_free_uld() crypto: arm64/sm4-ce - check for the right CPU feature bit crypto: caam - fix DMA mapping direction for RSA forms 2 & 3 crypto: caam/qi - fix error path in xts setkey crypto: caam/jr - fix descriptor DMA unmapping
2018-08-29Merge branch 'AF_XDP-zerocopy-for-i40e'Alexei Starovoitov
Björn Töpel says: ==================== This patch set introduces zero-copy AF_XDP support for Intel's i40e driver. In the first preparatory patch we also add support for XDP_REDIRECT for zero-copy allocated frames so that XDP programs can redirect them. This was a ToDo from the first AF_XDP zero-copy patch set from early June. Special thanks to Alex Duyck and Jesper Dangaard Brouer for reviewing earlier versions of this patch set. The i40e zero-copy code is located in its own file i40e_xsk.[ch]. Note that in the interest of time, to get an AF_XDP zero-copy implementation out there for people to try, some code paths have been copied from the XDP path to the zero-copy path. It is out goal to merge the two paths in later patch sets. In contrast to the implementation from beginning of June, this patch set does not require any extra HW queues for AF_XDP zero-copy TX. Instead, the XDP TX HW queue is used for both XDP_REDIRECT and AF_XDP zero-copy TX. Jeff, given that most of changes are in i40e, it is up to you how you would like to route these patches. The set is tagged bpf-next, but if taking it via the Intel driver tree is easier, let us know. We have run some benchmarks on a dual socket system with two Broadwell E5 2660 @ 2.0 GHz with hyperthreading turned off. Each socket has 14 cores which gives a total of 28, but only two cores are used in these experiments. One for TR/RX and one for the user space application. The memory is DDR4 @ 2133 MT/s (1067 MHz) and the size of each DIMM is 8192MB and with 8 of those DIMMs in the system we have 64 GB of total memory. The compiler used is gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0. The NIC is Intel I40E 40Gbit/s using the i40e driver. Below are the results in Mpps of the I40E NIC benchmark runs for 64 and 1500 byte packets, generated by a commercial packet generator HW outputing packets at full 40 Gbit/s line rate. The results are with retpoline and all other spectre and meltdown fixes, so these results are not comparable to the ones from the zero-copy patch set in June. AF_XDP performance 64 byte packets. Benchmark XDP_SKB XDP_DRV XDP_DRV with zerocopy rxdrop 2.6 8.2 15.0 txpush 2.2 - 21.9 l2fwd 1.7 2.3 11.3 AF_XDP performance 1500 byte packets: Benchmark XDP_SKB XDP_DRV XDP_DRV with zerocopy rxdrop 2.0 3.3 3.3 l2fwd 1.3 1.7 3.1 XDP performance on our system as a base line: 64 byte packets: XDP stats CPU pps issue-pps XDP-RX CPU 16 18.4M 0 1500 byte packets: XDP stats CPU pps issue-pps XDP-RX CPU 16 3.3M 0 The structure of the patch set is as follows: Patch 1: Add support for XDP_REDIRECT of zero-copy allocated frames Patches 2-4: Preparatory patches to common xsk and net code Patches 5-7: Preparatory patches to i40e driver code for RX Patch 8: i40e zero-copy support for RX Patch 9: Preparatory patch to i40e driver code for TX Patch 10: i40e zero-copy support for TX Patch 11: Add flags to sample application to force zero-copy/copy mode ==================== Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29samples/bpf: add -c/--copy -z/--zero-copy flags to xdpsockBjörn Töpel
The -c/--copy -z/--zero-copy flags enforces either copy or zero-copy mode. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29i40e: add AF_XDP zero-copy Tx supportMagnus Karlsson
This patch adds zero-copy Tx support for AF_XDP sockets. It implements the ndo_xsk_async_xmit netdev ndo and performs all the Tx logic from a NAPI context. This means pulling egress packets from the Tx ring, placing the frames on the NIC HW descriptor ring and completing sent frames back to the application via the completion ring. The regular XDP Tx ring is used for AF_XDP as well. This rationale for this is as follows: XDP_REDIRECT guarantees mutual exclusion between different NAPI contexts based on CPU id. In other words, a netdev can XDP_REDIRECT to another netdev with a different NAPI context, since the operation is bound to a specific core and each core has its own hardware ring. As the AF_XDP Tx action is running in the same NAPI context and using the same ring, it will also be protected from XDP_REDIRECT actions with the exact same mechanism. As with AF_XDP Rx, all AF_XDP Tx specific functions are added to i40e_xsk.c. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29i40e: move common Tx functions to i40e_txrx_common.hMagnus Karlsson
This patch prepares for the upcoming zero-copy Tx functionality, by moving common functions and refactor chunks of code into re-usable functions, used both by the regular path and zero-copy path. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29i40e: add AF_XDP zero-copy Rx supportBjörn Töpel
This patch adds zero-copy Rx support for AF_XDP sockets. Instead of allocating buffers of type MEM_TYPE_PAGE_SHARED, the Rx frames are allocated as MEM_TYPE_ZERO_COPY when AF_XDP is enabled for a certain queue. All AF_XDP specific functions are added to a new file, i40e_xsk.c. Note that when AF_XDP zero-copy is enabled, the XDP action XDP_PASS will allocate a new buffer and copy the zero-copy frame prior passing it to the kernel stack. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29i40e: move common Rx functions to i40e_txrx_common.hBjörn Töpel
This patch prepares for the upcoming zero-copy Rx functionality, by moving/changing linkage of common functions, used both by the regular path and zero-copy path. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29i40e: refactor Rx path for re-useBjörn Töpel
In this commit, the Rx path is refactored some, as a step torwards the introduction AF_XDP Rx zero-copy. The page re-use counter is moved into the i40e_reuse_rx_page, instead of bumping the counter in many places. The Rx buffer page clearing is moved for better readability. Lastely, functions to update statistics and bump the XDP Tx ring are introduced. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29i40e: added queue pair disable/enable functionsBjörn Töpel
Add functions for queue pair enable/disable. Instead of resetting the whole device, only the affected queue pair is disabled or enabled. This plumbing is used in a later commit, when zero-copy AF_XDP support is introduced. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29net: add napi_if_scheduled_mark_missedMagnus Karlsson
The function napi_if_scheduled_mark_missed is used to check if the NAPI context is scheduled, if so set NAPIF_STATE_MISSED and return true. Used by the AF_XDP zero-copy i40e Tx code implementation in order to make sure that irq affinity is honored by the napi context. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29xsk: expose xdp_umem_get_{data,dma} to driversBjörn Töpel
Move the xdp_umem_get_{data,dma} functions to include/net/xdp_sock.h, so that the upcoming zero-copy implementation in the Ethernet drivers can utilize them. Also, supply some dummy function implementations for CONFIG_XDP_SOCKETS=n configs. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29xdp: export xdp_rxq_info_unreg_mem_modelBjörn Töpel
Export __xdp_rxq_info_unreg_mem_model as xdp_rxq_info_unreg_mem_model, so it can be used from netdev drivers. Also, add additional checks for the memory type. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29xdp: implement convert_to_xdp_frame for MEM_TYPE_ZERO_COPYBjörn Töpel
This commit adds proper MEM_TYPE_ZERO_COPY support for convert_to_xdp_frame. Converting a MEM_TYPE_ZERO_COPY xdp_buff to an xdp_frame is done by transforming the MEM_TYPE_ZERO_COPY buffer into a MEM_TYPE_PAGE_ORDER0 frame. This is costly, and in the future it might make sense to implement a more sophisticated thread-safe alloc/free scheme for MEM_TYPE_ZERO_COPY, so that no allocation and copy is required in the fast-path. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29arm64: defconfig: Enable TI's AM6 SoC platformNishanth Menon
Enable K3 SoC platform for TI's AM6 SoC. Signed-off-by: Nishanth Menon <nm@ti.com> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Olof Johansson <olof@lixom.net>
2018-08-29ARM: defconfig: Update the ARM Versatile defconfigLinus Walleij
This updates the ARM Versatile defconfig to the latest Kconfig structural changes and adds the DUMB VGA bridge driver so that VGA works out of the box, e.g. with QEMU. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>
2018-08-29Merge tag 'imx-fixes-4.19' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes i.MX fixes for 4.19: - i.MX display folks decided to switch MXS display driver from legacy FB to DRM during 4.19 merge window. It leads to a fallout on some Freescale/NXP development boards with Seiko 43WVF1G panel, because this DRM panel driver is not enabled in i.MX defconfig. Here is a series from Fabio to convert i.MX23/28 EVK DT to Seiko 43WVF1G panel bindings and enable the panel driver in i.MX defconfig, so that users can still get functional LCD on these boards by default. - A fix from Leonard to revert incorrect legacy PCI irq mapping in i.MX7 device tree, that was caused by document errors. * tag 'imx-fixes-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: ARM: imx_v6_v7_defconfig: Select CONFIG_DRM_PANEL_SEIKO_43WVF1G ARM: mxs_defconfig: Select CONFIG_DRM_PANEL_SEIKO_43WVF1G ARM: dts: imx23-evk: Convert to the new display bindings ARM: dts: imx23-evk: Move regulators outside simple-bus ARM: dts: imx28-evk: Convert to the new display bindings ARM: dts: imx28-evk: Move regulators outside simple-bus Revert "ARM: dts: imx7d: Invert legacy PCI irq mapping" Signed-off-by: Olof Johansson <olof@lixom.net>
2018-08-29dt-bindings: watchdog: renesas-wdt: Document r8a774a1 supportFabrizio Castro
RZ/G2M (R8A774A1) watchdog implementation is compatible with R-Car Gen3, therefore add relevant documentation. Signed-off-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com> Reviewed-by: Biju Das <biju.das@bp.renesas.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2018-08-29media: camss: add missing includesArnd Bergmann
Multiple files in this driver fail to build because of missing header inclusions: drivers/media/platform/qcom/camss/camss-csiphy-2ph-1-0.c: In function 'csiphy_hw_version_read': drivers/media/platform/qcom/camss/camss-csiphy-2ph-1-0.c:31:18: error: implicit declaration of function 'readl_relaxed'; did you mean 'xchg_relaxed'? [-Werror=implicit-function-declaration] drivers/media/platform/qcom/camss/camss-csiphy-3ph-1-0.c: In function 'csiphy_hw_version_read': drivers/media/platform/qcom/camss/camss-csiphy-3ph-1-0.c:52:2: error: implicit declaration of function 'writel' [-Werror=implicit-function-declaration] Add linux/io.h there and in all other files that call readl/writel and related interfaces. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Todor Tomov <todor.tomov@linaro.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-29media: camss: Use managed memory allocationsTodor Tomov
Use managed memory allocations for structs which are used until the driver is removed. Signed-off-by: Todor Tomov <todor.tomov@linaro.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-29media: camss: mark PM functions as __maybe_unusedArnd Bergmann
The empty suspend/resume functions cause a build warning when they are unused: drivers/media/platform/qcom/camss/camss.c:1001:12: error: 'camss_runtime_resume' defined but not used [-Werror=unused-function] drivers/media/platform/qcom/camss/camss.c:996:12: error: 'camss_runtime_suspend' defined but not used [-Werror=unused-function] Mark them as __maybe_unused so the compiler can silently drop them. Fixes: 02afa816dbbf ("media: camss: Add basic runtime PM support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Todor Tomov <todor.tomov@linaro.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-29media: af9035: prevent buffer overflow on writeJozef Balga
When less than 3 bytes are written to the device, memcpy is called with negative array size which leads to buffer overflow and kernel panic. This patch adds a condition and returns -EOPNOTSUPP instead. Fixes bugzilla issue 64871 [mchehab+samsung@kernel.org: fix a merge conflict and changed the condition to match the patch's comment, e. g. len == 3 could also be valid] Signed-off-by: Jozef Balga <jozef.balga@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-29Merge branch 'bpf_msg_pull_data-fixes'Alexei Starovoitov
Daniel Borkmann says: ==================== This set contains three more fixes for the bpf_msg_pull_data() mainly for correcting scatterlist ring wrap-arounds as well as fixing up data pointers. For details please see individual patches. Thanks! ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29bpf: fix sg shift repair start offset in bpf_msg_pull_dataDaniel Borkmann
When we perform the sg shift repair for the scatterlist ring, we currently start out at i = first_sg + 1. However, this is not correct since the first_sg could point to the sge sitting at slot MAX_SKB_FRAGS - 1, and a subsequent i = MAX_SKB_FRAGS will access the scatterlist ring (sg) out of bounds. Add the sk_msg_iter_var() helper for iterating through the ring, and apply the same rule for advancing to the next ring element as we do elsewhere. Later work will use this helper also in other places. Fixes: 015632bb30da ("bpf: sk_msg program helper bpf_sk_msg_pull_data") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29bpf: fix shift upon scatterlist ring wrap-around in bpf_msg_pull_dataDaniel Borkmann
If first_sg and last_sg wraps around in the scatterlist ring, then we need to account for that in the shift as well. E.g. crafting such msgs where this is the case leads to a hang as shift becomes negative. E.g. consider the following scenario: first_sg := 14 |=> shift := -12 msg->sg_start := 10 last_sg := 3 | msg->sg_end := 5 round 1: i := 15, move_from := 3, sg[15] := sg[ 3] round 2: i := 0, move_from := -12, sg[ 0] := sg[-12] round 3: i := 1, move_from := -11, sg[ 1] := sg[-11] round 4: i := 2, move_from := -10, sg[ 2] := sg[-10] [...] round 13: i := 11, move_from := -1, sg[ 2] := sg[ -1] round 14: i := 12, move_from := 0, sg[ 2] := sg[ 0] round 15: i := 13, move_from := 1, sg[ 2] := sg[ 1] round 16: i := 14, move_from := 2, sg[ 2] := sg[ 2] round 17: i := 15, move_from := 3, sg[ 2] := sg[ 3] [...] This means we will loop forever and never hit the msg->sg_end condition to break out of the loop. When we see that the ring wraps around, then the shift should be MAX_SKB_FRAGS - first_sg + last_sg - 1. Meaning, the remainder slots from the tail of the ring and the head until last_sg combined. Fixes: 015632bb30da ("bpf: sk_msg program helper bpf_sk_msg_pull_data") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29bpf: fix msg->data/data_end after sg shift repair in bpf_msg_pull_dataDaniel Borkmann
In the current code, msg->data is set as sg_virt(&sg[i]) + start - offset and msg->data_end relative to it as msg->data + bytes. Using iterator i to point to the updated starting scatterlist element holds true for some cases, however not for all where we'd end up pointing out of bounds. It is /correct/ for these ones: 1) When first finding the starting scatterlist element (sge) where we find that the page is already privately owned by the msg and where the requested bytes and headroom fit into the sge's length. However, it's /incorrect/ for the following ones: 2) After we made the requested area private and updated the newly allocated page into first_sg slot of the scatterlist ring; when we find that no shift repair of the ring is needed where we bail out updating msg->data and msg->data_end. At that point i will point to last_sg, which in this case is the next elem of first_sg in the ring. The sge at that point might as well be invalid (e.g. i == msg->sg_end), which we use for setting the range of sg_virt(&sg[i]). The correct one would have been first_sg. 3) Similar as in 2) but when we find that a shift repair of the ring is needed. In this case we fix up all sges and stop once we've reached the end. In this case i will point to will point to the new msg->sg_end, and the sge at that point will be invalid. Again here the requested range sits in first_sg. Fixes: 015632bb30da ("bpf: sk_msg program helper bpf_sk_msg_pull_data") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29media: video_function_calls.rst: drop obsolete video-set-attributes referenceHans Verkuil
This fixes this warning: Documentation/media/uapi/dvb/video_function_calls.rst:9: WARNING: toctree contains reference to nonexisting document 'uapi/dvb/video-set-attributes' Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-29Merge branch 'nvme-4.19' of git://git.infradead.org/nvme into for-linusJens Axboe
Pull NVMe fixes from Christoph. * 'nvme-4.19' of git://git.infradead.org/nvme: nvmet: free workqueue object if module init fails nvme-fcloop: Fix dropped LS's to removed target port nvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event
2018-08-29bpf: use --cgroup in test_suite if suppliedJohn Fastabend
If the user supplies a --cgroup value in the arguments when running the test_suite go ahaead and run the self tests there. I use this to test with multiple cgroup users. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-29bpf: sockmap test remove shutdown() callsJohn Fastabend
Currently, we do a shutdown(sk, SHUT_RDWR) on both peer sockets and a shutdown on the sender as well. However, this is incorrect and can occasionally cause issues if you happen to have bad timing. First peer1 or peer2 may still be in use depending on the test and timing. Second we really should only be closing the read side and/or write side depending on if the test is receiving or sending. But, really none of this is needed just remove the shutdown calls. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-29bpf: remove duplicated include from syscall.cYueHaibing
Remove duplicated include. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>