summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-11-13powerpc/64s: mm_context.addr_limit is only used on hashNicholas Piggin
Radix keeps no meaningful state in addr_limit, so remove it from radix code and rename to slb_addr_limit to make it clear it applies to hash only. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13powerpc/64s/radix: Fix 128TB-512TB virtual address boundary case allocationNicholas Piggin
Radix VA space allocations test addresses against mm->task_size which is 512TB, even in cases where the intention is to limit allocation to below 128TB. This results in mmap with a hint address below 128TB but address + length above 128TB succeeding when it should fail (as hash does after the previous patch). Set the high address limit to be considered up front, and base subsequent allocation checks on that consistently. Fixes: f4ea6dcb08ea ("powerpc/mm: Enable mappings above 128TB") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13powerpc/64s/hash: Allow MAP_FIXED allocations to cross 128TB boundaryNicholas Piggin
While mapping hints with a length that cross 128TB are disallowed, MAP_FIXED allocations that cross 128TB are allowed. These are failing on hash (on radix they succeed). Add an additional case for fixed mappings to expand the addr_limit when crossing 128TB. Fixes: f4ea6dcb08ea ("powerpc/mm: Enable mappings above 128TB") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13powerpc/64s/hash: Fix fork() with 512TB process address spaceNicholas Piggin
Hash unconditionally resets the addr_limit to default (128TB) when the mm context is initialised. If a process has > 128TB mappings when it forks, the child will not get the 512TB addr_limit, so accesses to valid > 128TB mappings will fail in the child. Fix this by only resetting the addr_limit to default if it was 0. Non zero indicates it was duplicated from the parent (0 means exec()). Fixes: f4ea6dcb08ea ("powerpc/mm: Enable mappings above 128TB") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13powerpc/64s/hash: Fix 128TB-512TB virtual address boundary case allocationNicholas Piggin
When allocating VA space with a hint that crosses 128TB, the SLB addr_limit variable is not expanded if addr is not > 128TB, but the slice allocation looks at task_size, which is 512TB. This results in slice_check_fit() incorrectly succeeding because the slice_count truncates off bit 128 of the requested mask, so the comparison to the available mask succeeds. Fix this by using mm->context.addr_limit instead of mm->task_size for testing allocation limits. This causes such allocations to fail. Fixes: f4ea6dcb08ea ("powerpc/mm: Enable mappings above 128TB") Cc: stable@vger.kernel.org # v4.12+ Reported-by: Florian Weimer <fweimer@redhat.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13powerpc/64s/hash: Fix 512T hint detection to use >= 128TMichael Ellerman
Currently userspace is able to request mmap() search between 128T-512T by specifying a hint address that is greater than 128T. But that means a hint of 128T exactly will return an address below 128T, which is confusing and wrong. So fix the logic to check the hint is greater than *or equal* to 128T. Fixes: f4ea6dcb08ea ("powerpc/mm: Enable mappings above 128TB") Cc: stable@vger.kernel.org # v4.12+ Suggested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Suggested-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Split out of Nick's bigger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13MIPS: ralink: Fix typo in mt7628 pinmux functionMathias Kresin
There is a typo inside the pinmux setup code. The function is called refclk and not reclk. Fixes: 53263a1c6852 ("MIPS: ralink: add mt7628an support") Signed-off-by: Mathias Kresin <dev@kresin.me> Acked-by: John Crispin <john@phrozen.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 3.19+ Patchwork: https://patchwork.linux-mips.org/patch/16047/ Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-13MIPS: ralink: Fix MT7628 pinmuxMathias Kresin
According to the datasheet the REFCLK pin is shared with GPIO#37 and the PERST pin is shared with GPIO#36. Fixes: 53263a1c6852 ("MIPS: ralink: add mt7628an support") Signed-off-by: Mathias Kresin <dev@kresin.me> Acked-by: John Crispin <john@phrozen.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 3.19+ Patchwork: https://patchwork.linux-mips.org/patch/16046/ Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-13powerpc: Fix DABR match on hash based systemsBenjamin Herrenschmidt
Commit 398a719d34a1 ("powerpc/mm: Update bits used to skip hash_page") mistakenly dropped the DSISR_DABRMATCH bit from the mask of bit tested to skip trying to hash a page. As a result, the DABR matches would no longer be detected. This adds it back. We open code it in the 2 places where it matters rather than fold it into DSISR_BAD_FAULT_32S/64S because this isn't technically a bad fault and while we would never hit it with the current code, I prefer if page_fault_is_bad() didn't trigger on these. Fixes: 398a719d34a1 ("powerpc/mm: Update bits used to skip hash_page") Cc: stable@vger.kernel.org # v4.14 Tested-by: Pedro Miraglia Franco de Carvalho <pedromfc@br.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2017-11-13libceph: don't WARN() if user tries to add invalid keyEric Biggers
The WARN_ON(!key->len) in set_secret() in net/ceph/crypto.c is hit if a user tries to add a key of type "ceph" with an invalid payload as follows (assuming CONFIG_CEPH_LIB=y): echo -e -n '\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' \ | keyctl padd ceph desc @s This can be hit by fuzzers. As this is merely bad input and not a kernel bug, replace the WARN_ON() with return -EINVAL. Fixes: 7af3ea189a9a ("libceph: stop allocating a new cipher on every crypto request") Cc: <stable@vger.kernel.org> # v4.10+ Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13rbd: set discard_alignment to zeroDavid Disseldorp
RBD devices are currently incorrectly initialised with the block queue discard_alignment set to the underlying RADOS object size. As per Documentation/ABI/testing/sysfs-block: The discard_alignment parameter indicates how many bytes the beginning of the device is offset from the internal allocation unit's natural alignment. Correcting the discard_alignment parameter from the RADOS object size to zero (the blk_set_default_limits() default) has no effect on how discard requests are propagated through the block layer - @alignment in __blkdev_issue_discard() remains zero. However, it does fix the UNMAP granularity alignment value advertised to SCSI initiators via the Block Limits VPD. Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13ceph: silence sparse endianness warning in encode_caps_cbJeff Layton
sparse warns: fs/ceph/mds_client.c:2887:34: warning: incorrect type in assignment (different base types) fs/ceph/mds_client.c:2887:34: expected restricted __le32 [assigned] [usertype] flock_len fs/ceph/mds_client.c:2887:34: got int At this point, it's just being used as a flag. It gets overwritten later if the rest of the encoding succeeds. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13ceph: remove the bump of i_versionJeff Layton
Eventually, we'll want to wire cephfs up to use the change attribute that the cluster tracks instead, but for now this is unneeded. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13ceph: present consistent fsid, regardless of arch endiannessJeff Layton
Since its inception, ceph has presented the fsid as an opaque value without any sort of endianness conversion. This means that the value presented is different on architectures of different endianness. While the value that should be stuffed into f_fsid is poorly-defined, I think it would be best to strive for consistency here between architectures, and clients (we need to present this properly to the userland client as well). Change ceph_statfs to convert the opaque words to host-endian before doing the xor. On an upgrade, a big-endian box may see a different fsid than it did before, but little-endian arches should see no change with this patch. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13ceph: clean up spinlocking and list handling around cleanup_cap_releases()Jeff Layton
Functions that release a lock taken in a parent frame are notoriously hard to follow. Split cleanup_cap_releases into two functions, one to detach the cap releases from the session (which should be called with the spinlock held), and another to dispose of those caps. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13rbd: get rid of rbd_mapping::read_onlyIlya Dryomov
It is redundant -- rw/ro state is stored in hd_struct and managed by the block layer. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13rbd: fix and simplify rbd_ioctl_set_ro()Ilya Dryomov
->open_count/-EBUSY check is bogus and wrong: when an open device is set read-only, blkdev_write_iter() refuses further writes with -EPERM. This is standard behaviour and all other block devices allow this. set_disk_ro() call is also problematic: we affect the entire device when called on a single partition. All rbd_ioctl_set_ro() needs to do is refuse ro -> rw transition for mapped snapshots. Everything else can be handled by generic code. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13ceph: remove unused and redundant variable droppingColin Ian King
Variable dropping is set but never read and hence is redundant and can be removed. Cleans up clang warning: fs/ceph/caps.c:1170:2: warning: Value stored to 'dropping' is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13ceph: mark expected switch fall-throughsGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> [idryomov@gmail.com: amended "Older OSDs" comment] Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13ceph: -EINVAL on decoding failure in ceph_mdsc_handle_fsmap()Ilya Dryomov
Don't set ->mdsmap_err to -ENOENT unconditionally, and drop unneeded return statement while at it. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13ceph: disable cached readdir after dropping positive dentryYan, Zheng
Ideally CEPH_CAP_FILE_SHARED should have been revoked before postive dentry get dropped. But if something goes wrong, later cached readdir may dereference the dropped dentry. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13ceph: fix bool initialization/comparisonThomas Meyer
Bool initializations should use true and false. Bool tests don't need comparisons. Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13ceph: handle 'session get evicted while there are file locks'Yan, Zheng
When session get evicted, all file locks associated with the session get released remotely by mds. File locks tracked by kernel become stale. In this situation, set an error flag on inode. The flag makes further file locks return -EIO. Another option to handle this situation is cleanup file locks tracked kernel. I do not choose it because it is inconvenient to notify user program about the error. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13ceph: optimize flock encoding during reconnectYan, Zheng
Don't malloc if there is no flock. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13ceph: make lock_to_ceph_filelock() staticYan, Zheng
Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13ceph: keep auth cap when inode has flocks or posix locksYan, Zheng
file locks are tracked by inode's auth mds. dropping auth caps is equivalent to releasing all file locks. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13power: supply: cpcap-charger: fix incorrect return value checkPan Bian
Function platform_get_irq_byname() returns a negative error code on failure, and a zero or positive number on success. However, in function cpcap_usb_init_irq(), positive IRQ numbers are also taken as error cases. Use "if (irq < 0)" instead of "if (!irq)" to validate the return value of platform_get_irq_byname(). Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
2017-11-13gpio: tegra186: Remove tegra186_gpio_lock_classAxel Lin
This is no longer required after commit 959bc7b22bd2 ("gpio: Automatically add lockdep keys") Signed-off-by: Axel Lin <axel.lin@ingics.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-11-13quota: be aware of error from dquot_initializeChao Yu
Commit 6184fc0b8dd7 ("quota: Propagate error from ->acquire_dquot()") missed to handle error from dquot_initialize in dquot_file_open, fix it. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz>
2017-11-13cpu/hotplug: Get rid of CPU hotplug notifier leftoversThomas Gleixner
The CPU hotplug notifiers are history. Remove the last reminders. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-11-13kprobes: Don't spam the build log with deprecation warningsIngo Molnar
The jprobes APIs are deprecated - but are still in occasional use for code that few people seem to care about, so stop generating deprecation warnings. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-12/proc/module: use the same logic as /proc/kallsyms for address exposureLinus Torvalds
The (alleged) users of the module addresses are the same: kernel profiling. So just expose the same helper and format macros, and unify the logic. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-13drm/rockchip: analogix_dp: Use mutex rather than spinlockEmil Renner Berthing
On the Samsung Chromebook Plus I get this error with 4.14-rc3: BUG: scheduling while atomic: kworker/3:1/50/0x00000002 Modules linked in: CPU: 3 PID: 50 Comm: kworker/3:1 Not tainted 4.14.0-0.rc3-kevin #2 Hardware name: Google Kevin (DT) Workqueue: events analogix_dp_psr_work Call trace: [<ffffff80080873b0>] dump_backtrace+0x0/0x320 [<ffffff80080876e4>] show_stack+0x14/0x20 [<ffffff8008606d38>] dump_stack+0x9c/0xbc [<ffffff80080c6b5c>] __schedule_bug+0x4c/0x70 [<ffffff80086188c0>] __schedule+0x3f0/0x458 [<ffffff8008618960>] schedule+0x38/0xa0 [<ffffff800861c20c>] schedule_hrtimeout_range_clock+0x84/0xe8 [<ffffff800861c2a0>] schedule_hrtimeout_range+0x10/0x18 [<ffffff800861bcec>] usleep_range+0x64/0x78 [<ffffff8008415a6c>] analogix_dp_transfer+0x16c/0x340 [<ffffff8008412550>] analogix_dpaux_transfer+0x10/0x18 [<ffffff80083ceb14>] drm_dp_dpcd_access+0x4c/0xf0 [<ffffff80083cf614>] drm_dp_dpcd_write+0x1c/0x28 [<ffffff8008413b98>] analogix_dp_disable_psr+0x60/0xa8 [<ffffff800840da3c>] analogix_dp_psr_work+0x4c/0x90 [<ffffff80080bb09c>] process_one_work+0x1d4/0x348 [<ffffff80080bb258>] worker_thread+0x48/0x478 [<ffffff80080c11fc>] kthread+0x12c/0x130 [<ffffff8008084290>] ret_from_fork+0x10/0x18 Changing rockchip_dp_device::psr_lock to a mutex rather than spinlock seems to fix the issue. Signed-off-by: Emil Renner Berthing <kernel@esmil.dk> Tested-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Signed-off-by: Mark Yao <mark.yao@rock-chips.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171004175346.11956-1-kernel@esmil.dk
2017-11-13Merge branch 'net-improve-the-process-of-redirect-and-toobig-for-ipv6-tunnels'David S. Miller
Xin Long says: ==================== net: improve the process of redirect and toobig for ipv6 tunnels Now let's say there are 3 kinds of icmp packets to process for tunnels, toobig(needfrag), redirect, others, their process should be: - toobig(needfrag) update the lower dst's pmtu by route cache, also update sk dst's pmtu if possible, or it will be fine if sk dst pmtu will get updated on tx path. - redirect update the lower dst's gw by route cache and return, no need to send this redirect packet to user sk. - others send the packet to user's sk, or it will also be fine to use err_count to count it and report fail link on tx path. All ipv4 tunnels basically follow this while some of ipv6 tunnels are doing in different ways, like ip6gre and ip6_tunnels update tnl dev's mtu instead of updating lower dst pmtu, no redirect process on their err_handlers, which doesn't make any sense and even causes performance problems. This patchset is to improve the process of redirect and toobig for ip6gre ip4ip6, ip6ip6 tunnels, as in ipv4 tunnels. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13ip6_tunnel: clean up ip4ip6 and ip6ip6's err_handlersXin Long
This patch is to remove some useless codes of redirect and fix some indents on ip4ip6 and ip6ip6's err_handlers. Note that redirect icmp packet is already processed in ip6_tnl_err, the old redirect codes in ip4ip6_err actually never worked even before this patch. Besides, there's no need to send redirect to user's sk, it's for lower dst, so just remove it in this patch. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13ip6_tunnel: process toobig in a better wayXin Long
The same improvement in "ip6_gre: process toobig in a better way" is needed by ip4ip6 and ip6ip6 as well. Note that ip4ip6 and ip6ip6 will also update sk dst pmtu in their err_handlers. Like I said before, gre6 could not do this as it's inner proto is not certain. But for all of them, sk dst pmtu will be updated in tx path if in need. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13ip6_tunnel: add the process for redirect in ip6_tnl_errXin Long
The same process for redirect in "ip6_gre: add the process for redirect in ip6gre_err" is needed by ip4ip6 and ip6ip6 as well. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13ip6_gre: process toobig in a better wayXin Long
Now ip6gre processes toobig icmp packet by setting gre dev's mtu in ip6gre_err, which would cause few things not good: - It couldn't set mtu with dev_set_mtu due to it's not in user context, which causes route cache and idev->cnf.mtu6 not to be updated. - It has to update sk dst pmtu in tx path according to gredev->mtu for ip6gre, while it updates pmtu again according to lower dst pmtu in ip6_tnl_xmit. - To change dev->mtu by toobig icmp packet is not a good idea, it should only work on pmtu. This patch is to process toobig by updating the lower dst's pmtu, as later sk dst pmtu will be updated in ip6_tnl_xmit, the same way as in ip4gre. Note that gre dev's mtu will not be updated any more, it doesn't make any sense to change dev's mtu after receiving a toobig packet. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13ip6_gre: add the process for redirect in ip6gre_errXin Long
This patch is to add redirect icmp packet process for ip6gre by calling ip6_redirect() in ip6gre_err(), as in vti6_err. Prior to this patch, there's even no route cache generated after receiving redirect. Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13forcedeth: remove redudant assignments in xmitZhu Yanjun
In xmit process, the variables are set many times. In fact, it is enough for these variables to be set once. After a long time test, the throughput performance is better than before. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13Merge tag 'nfc-next-4.15-1' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next Samuel Ortiz says: ==================== NFC 4.15 pull request This is the NFC pull request for 4.15. We have: - A new netlink command for explicitly deactivating NFC targets - i2c constification for all NFC drivers - One NFC device allocation error path fix ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13Merge branch 'Openvswitch-meter-action'David S. Miller
Andy Zhou says: ==================== Openvswitch meter action This patch series is the first attempt to add openvswitch meter support. We have previously experimented with adding metering support in nftables. However 1) It was not clear how to expose a named nftables object cleanly, and 2) the logic that implements metering is quite small, < 100 lines of code. With those two observations, it seems cleaner to add meter support in the openvswitch module directly. --- v1(RFC)->v2: remove unused code improve locking and other review comments v2 -> v3: rebase v3 -> v4: fix undefined "__udivdi3" references on 32 bit builds. use div_u64() instead. v4 -> v5: rebase ==================== Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13openvswitch: Add meter action supportAndy Zhou
Implements OVS kernel meter action support. Signed-off-by: Andy Zhou <azhou@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13openvswitch: Add meter infrastructureAndy Zhou
OVS kernel datapath so far does not support Openflow meter action. This is the first stab at adding kernel datapath meter support. This implementation supports only drop band type. Signed-off-by: Andy Zhou <azhou@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13openvswitch: export get_dp() API.Andy Zhou
Later patches will invoke get_dp() outside of datapath.c. Export it. Signed-off-by: Andy Zhou <azhou@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13openvswitch: Add meter netlink definitionsAndy Zhou
Meter has its own netlink family. Define netlink messages and attributes for communicating with the user space programs. Signed-off-by: Andy Zhou <azhou@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13Merge branch 'dsa-b53-Support-prepended-Broadcom-tags'David S. Miller
Florian Fainelli says: ==================== net: dsa: b53: Support prepended Broadcom tags This patch series adds support for prepended 4-bytes Broadcom tags that we already support. This type of tag will typically be used when interfaced to a SoC like BCM58xx (NorthStar Plus) which supports a Flow Accelerator (WIP). In that case, we need to support a slightly different tagging format. The first patch does a bit of re-factoring and passes a port index to the get_tag_protocol() function since at least two different drivers need that type of information (mt7530, b53) to support tagging or not. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13net: dsa: b53: Support prepended Broadcom tagsFlorian Fainelli
On BCM58xx devices (Northstar Plus), there is an accelerator attached to port 8 which would only work if we use prepended Broadcom tags. Resolve that difference in our get_tag_protocol() function by setting the appropriate tagging protocol in that case. We need to change b53_brcm_hdr_setup() a little bit now since we can deal with two types of Broadcom tags. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13net: dsa: Support prepended Broadcom tagFlorian Fainelli
Add a new type: DSA_TAG_PROTO_PREPEND which allows us to support for the 4-bytes Broadcom tag that we already support, but in a format where it is pre-pended to the packet instead of located between the MAC SA and the Ethertyper (DSA_TAG_PROTO_BRCM). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13net: dsa: tag_brcm: Prepare for supporting prepended tagFlorian Fainelli
In preparation for supporting the same Broadcom tag format, but instead of inserted between the MAC SA and EtherType, prepended to the Ethernet frame, restructure the code a little bit to make that possible and take an offset parameter. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>