linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2017-09-04	Merge branch 'intel_pstate'	Rafael J. Wysocki
	* intel_pstate: cpufreq: intel_pstate: Shorten a couple of long names cpufreq: intel_pstate: Simplify intel_pstate_adjust_pstate() cpufreq: intel_pstate: Improve IO performance with per-core P-states cpufreq: intel_pstate: Drop INTEL_PSTATE_HWP_SAMPLING_INTERVAL cpufreq: intel_pstate: Drop ->update_util from pstate_funcs cpufreq: intel_pstate: Do not use PID-based P-state selection
2017-09-04	Merge branch 'pm-cpufreq-sched'	Rafael J. Wysocki
	* pm-cpufreq-sched: cpufreq: schedutil: Always process remote callback with slow switching cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily cpufreq: Return 0 from ->fast_switch() on errors cpufreq: Simplify cpufreq_can_do_remote_dvfs() cpufreq: Process remote callbacks from any CPU if the platform permits sched: cpufreq: Allow remote cpufreq callbacks cpufreq: schedutil: Use unsigned int for iowait boost cpufreq: schedutil: Make iowait boost more energy efficient
2017-09-04	Merge branch 'pm-cpufreq'	Rafael J. Wysocki
	* pm-cpufreq: (33 commits) cpufreq: imx6q: Fix imx6sx low frequency support cpufreq: speedstep-lib: make several arrays static, makes code smaller cpufreq: ti: Fix 'of_node_put' being called twice in error handling path cpufreq: dt-platdev: Drop few entries from whitelist cpufreq: dt-platdev: Automatically create cpufreq device with OPP v2 ARM: ux500: don't select CPUFREQ_DT cpufreq: Convert to using %pOF instead of full_name cpufreq: Cap the default transition delay value to 10 ms cpufreq: dbx500: Delete obsolete driver mfd: db8500-prcmu: Get rid of cpufreq dependency cpufreq: enable the DT cpufreq driver on the Ux500 cpufreq: Loongson2: constify platform_device_id cpufreq: dt: Add r8a7796 support to to use generic cpufreq driver cpufreq: remove setting of policy->cpu in policy->cpus during init cpufreq: mediatek: add support of cpufreq to MT7622 SoC cpufreq: mediatek: add cleanups with the more generic naming cpufreq: rcar: Add support for R8A7795 SoC cpufreq: dt: Add rk3328 compatible to use generic cpufreq driver cpufreq: s5pv210: add missing of_node_put() cpufreq: Allow dynamic switching with CPUFREQ_ETERNAL latency ...
2017-09-04	Merge branches 'pm-core', 'pm-opp', 'pm-domains', 'pm-cpu' and 'pm-avs'	Rafael J. Wysocki
	* pm-core: PM / wakeup: Set power.can_wakeup if wakeup_sysfs_add() fails * pm-opp: PM / OPP: Fix get sharing CPUs when hotplug is used PM / OPP: OF: Use pr_debug() instead of pr_err() while adding OPP table * pm-domains: PM / Domains: Convert to using %pOF instead of full_name PM / Domains: Extend generic power domain debugfs PM / Domains: Add time accounting to various genpd states * pm-cpu: PM / CPU: replace raw_notifier with atomic_notifier * pm-avs: PM / AVS: rockchip-io: add io selectors and supplies for RV1108
2017-09-03	Merge branches 'acpi-sysfs', 'acpi-apei' and 'acpi-blacklist'	Rafael J. Wysocki
	* acpi-sysfs: ACPI / sysfs: Extend ACPI sysfs to provide access to boot error region * acpi-apei: ACPI / APEI: Suppress message if HEST not present ACPI, APEI, EINJ: Subtract any matching Register Region from Trigger resources ACPI: APEI: fix the wrong iteration of generic error status block ACPI: APEI: Enable APEI multiple GHES source to share a single external IRQ * acpi-blacklist: intel_pstate: convert to use acpi_match_platform_list() ACPI / blacklist: add acpi_match_platform_list()
2017-09-03	Merge branches 'acpi-video', 'acpi-battery', 'acpi-spcr' and 'acpi-misc'	Rafael J. Wysocki
	* acpi-video: ACPI / video: Add force_none quirk for Dell OptiPlex 9020M * acpi-battery: ACPI: make device_attribute const * acpi-spcr: ACPI: SPCR: work around clock issue on xgene UART ACPI: SPCR: extend XGENE 8250 workaround to m400 * acpi-misc: ACPI / dock: constify attribute_group structure MAINTAINERS: Add Tony and Boris as ACPI/APEI reviewers ACPI / lpat: Fix typos in comments and kerneldoc style MAINTAINERS: device property: ACPI: add fwnode.h
2017-09-03	Merge branches 'acpi-x86', 'acpi-soc', 'acpi-pmic' and 'acpi-apple'	Rafael J. Wysocki
	* acpi-x86: ACPI / boot: Add number of legacy IRQs to debug output ACPI / boot: Correct address space of __acpi_map_table() ACPI / boot: Don't define unused variables * acpi-soc: ACPI / LPSS: Don't abort ACPI scan on missing mem resource * acpi-pmic: ACPI / PMIC: xpower: Do pinswitch magic when reading GPADC * acpi-apple: spi: Use Apple device properties in absence of ACPI resources ACPI / scan: Recognize Apple SPI and I2C slaves ACPI / property: Support Apple _DSM properties ACPI / property: Don't evaluate objects for devices w/o handle treewide: Consolidate Apple DMI checks
2017-09-03	Merge branches 'acpi-ec', 'acpi-dma', 'acpi-processor' and 'acpi-cppc'	Rafael J. Wysocki
	* acpi-ec: ACPI / EC: Clean up EC GPE mask flag ACPI: EC: Fix possible issues related to EC initialization order * acpi-dma: ACPI/IORT: Add IORT named component memory address limits ACPI: Make acpi_dma_configure() DMA regions aware ACPI: Introduce DMA ranges parsing ACPI: Make acpi_dev_get_resources() method agnostic * acpi-processor: ACPI / processor: make function acpi_processor_check_duplicates() static ACPI: processor: use dev_dbg() instead of dev_warn() when CPPC probe failed * acpi-cppc: mailbox: pcc: Drop uninformative output during boot
2017-09-03	Merge branches 'acpi-scan' and 'acpi-pm'	Rafael J. Wysocki
	* acpi-scan: ACPI / scan: Enable GPEs before scanning the namespace ACPICA: Make it possible to enable runtime GPEs earlier ACPICA: Dispatch active GPEs at init time * acpi-pm: ACPI / PM: Add debug statements to acpi_pm_notify_handler() ACPI: Add debug statements to acpi_global_event_handler() ACPI / sleep: Make acpi_sleep_syscore_init() static ACPI / PCI / PM: Rework acpi_pci_propagate_wakeup() ACPI / PM: Split acpi_device_wakeup() PCI / PM: Skip bridges in pci_enable_wake()
2017-09-03	Merge branch 'acpica'	Rafael J. Wysocki
	* acpica: (32 commits) ACPICA: Update version to 20170728 ACPICA: Revert "Update resource descriptor handling" ACPICA: Resources: Allow _DMA method in walk resources ACPICA: Ensure all instances of AE_AML_INTERNAL have error messages ACPICA: Implement deferred resolution of reference package elements ACPICA: Debugger: Improve support for Alias objects ACPICA: Interpreter: Update handling for Alias operator ACPICA: EFI/EDK2: Cleanup to enable /WX for MSVC builds ACPICA: acpidump: Add DSDT/FACS instance support for Linux and EFI ACPICA: CLib: Add short multiply/shift support ACPICA: EFI/EDK2: Sort acpi.h inclusion order ACPICA: Add a comment, no functional change ACPICA: Namespace: Update/fix an error message ACPICA: iASL: Add support for the SDEI table ACPICA: Divergences: reduce access size definitions ACPICA: Update version to 20170629 ACPICA: Update resource descriptor handling ACPICA: iasl: Update to IORT SMMUv3 disassembling ACPICA: Disassembler: skip parsing of incorrect external declarations ACPICA: iASL: Ensure that the target node is valid in acpi_ex_create_alias ...
2017-09-03	Linux 4.13	Linus Torvalds

2017-09-03	Merge branch 'l2tp-session-creation-fixes'	David S. Miller
	Guillaume Nault says: ==================== l2tp: session creation fixes The session creation process has a few issues wrt. concurrent tunnel deletion. Patch #1 avoids creating sessions in tunnels that are getting removed. This prevents races where sessions could try to take tunnel resources that were already released. Patch #2 removes some racy l2tp_tunnel_find() calls in session creation callbacks. Together with path #1 it ensures that sessions can only access tunnel resources that are guaranteed to remain valid during the session creation process. There are other problems with how sessions are created: pseudo-wire specific data are set after the session is added to the tunnel. So the session can be used, or deleted, before it has been completely initialised. Separating session allocation from session registration would be necessary, but we'd still have circular dependencies preventing race-free registration. I'll consider this issue in future series. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-03	l2tp: pass tunnel pointer to ->session_create()	Guillaume Nault
	Using l2tp_tunnel_find() in pppol2tp_session_create() and l2tp_eth_create() is racy, because no reference is held on the returned session. These functions are only used to implement the ->session_create callback which is run by l2tp_nl_cmd_session_create(). Therefore searching for the parent tunnel isn't necessary because l2tp_nl_cmd_session_create() already has a pointer to it and holds a reference. This patch modifies ->session_create()'s prototype to directly pass the the parent tunnel as parameter, thus avoiding searching for it in pppol2tp_session_create() and l2tp_eth_create(). Since we have to touch the ->session_create() call in l2tp_nl_cmd_session_create(), let's also remove the useless conditional: we know that ->session_create isn't NULL at this point because it's already been checked earlier in this same function. Finally, one might be tempted to think that the removed l2tp_tunnel_find() calls were harmless because they would return the same tunnel as the one held by l2tp_nl_cmd_session_create() anyway. But that tunnel might be removed and a new one created with same tunnel Id before the l2tp_tunnel_find() call. In this case l2tp_tunnel_find() would return the new tunnel which wouldn't be protected by the reference held by l2tp_nl_cmd_session_create(). Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP") Fixes: d9e31d17ceba ("l2tp: Add L2TP ethernet pseudowire support") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-03	l2tp: prevent creation of sessions on terminated tunnels	Guillaume Nault
	l2tp_tunnel_destruct() sets tunnel->sock to NULL, then removes the tunnel from the pernet list and finally closes all its sessions. Therefore, it's possible to add a session to a tunnel that is still reachable, but for which tunnel->sock has already been reset. This can make l2tp_session_create() dereference a NULL pointer when calling sock_hold(tunnel->sock). This patch adds the .acpt_newsess field to struct l2tp_tunnel, which is used by l2tp_tunnel_closeall() to prevent addition of new sessions to tunnels. Resetting tunnel->sock is done after l2tp_tunnel_closeall() returned, so that l2tp_session_add_to_tunnel() can safely take a reference on it when .acpt_newsess is true. The .acpt_newsess field is modified in l2tp_tunnel_closeall(), rather than in l2tp_tunnel_destruct(), so that it benefits all tunnel removal mechanisms. E.g. on UDP tunnels, a session could be added to a tunnel after l2tp_udp_encap_destroy() proceeded. This would prevent the tunnel from being removed because of the references held by this new session on the tunnel and its socket. Even though the session could be removed manually later on, this defeats the purpose of commit 9980d001cec8 ("l2tp: add udp encap socket destroy handler"). Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-03	Merge branch ↵	David S. Miller
	'net-revert-lib-percpu_counter-API-for-fragmentation-mem-accounting' Jesper Dangaard Brouer says: ==================== net: revert lib/percpu_counter API for fragmentation mem accounting There is a bug in fragmentation codes use of the percpu_counter API, that can cause issues on systems with many CPUs, above 24 CPUs. After much consideration and different attempts at solving the API usage. The conclusion is to revert to the simple atomic_t API instead. The ratio between batch size and threshold size make it a bad use-case for the lib/percpu_counter API. As using the correct API calls will unfortunately cause systems with many CPUs to always execute an expensive sum across all CPUs. Plus the added complexity is not worth it. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-03	Revert "net: fix percpu memory leaks"	Jesper Dangaard Brouer
	This reverts commit 1d6119baf0610f813eb9d9580eb4fd16de5b4ceb. After reverting commit 6d7b857d541e ("net: use lib/percpu_counter API for fragmentation mem accounting") then here is no need for this fix-up patch. As percpu_counter is no longer used, it cannot memory leak it any-longer. Fixes: 6d7b857d541e ("net: use lib/percpu_counter API for fragmentation mem accounting") Fixes: 1d6119baf061 ("net: fix percpu memory leaks") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-03	Revert "net: use lib/percpu_counter API for fragmentation mem accounting"	Jesper Dangaard Brouer
	This reverts commit 6d7b857d541ecd1d9bd997c97242d4ef94b19de2. There is a bug in fragmentation codes use of the percpu_counter API, that can cause issues on systems with many CPUs. The frag_mem_limit() just reads the global counter (fbc->count), without considering other CPUs can have upto batch size (130K) that haven't been subtracted yet. Due to the 3MBytes lower thresh limit, this become dangerous at >=24 CPUs (310241024/130000=24). The correct API usage would be to use __percpu_counter_compare() which does the right thing, and takes into account the number of (online) CPUs and batch size, to account for this and call __percpu_counter_sum() when needed. We choose to revert the use of the lib/percpu_counter API for frag memory accounting for several reasons: 1) On systems with CPUs > 24, the heavier fully locked __percpu_counter_sum() is always invoked, which will be more expensive than the atomic_t that is reverted to. Given systems with more than 24 CPUs are becoming common this doesn't seem like a good option. To mitigate this, the batch size could be decreased and thresh be increased. 2) The add_frag_mem_limit+sub_frag_mem_limit pairs happen on the RX CPU, before SKBs are pushed into sockets on remote CPUs. Given NICs can only hash on L2 part of the IP-header, the NIC-RXq's will likely be limited. Thus, a fair chance that atomic add+dec happen on the same CPU. Revert note that commit 1d6119baf061 ("net: fix percpu memory leaks") removed init_frag_mem_limit() and instead use inet_frags_init_net(). After this revert, inet_frags_uninit_net() becomes empty. Fixes: 6d7b857d541e ("net: use lib/percpu_counter API for fragmentation mem accounting") Fixes: 1d6119baf061 ("net: fix percpu memory leaks") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-03	net/mlx4_core: fix incorrect size allocation for dev->caps.spec_qps	Colin Ian King
	The current allocation for dev->caps.spec_qps is for the size of the pointer and not the size of the actual mlx4_spec_qps structure. Fix this by using the correct size. Also splint allocation over a few lines to make it cppcheck clean on overly wide lines. Detected by CoverityScan, CID#1455222 ("Wrong sizeof argument") Fixes: c73c8b1e47ca ("net/mlx4_core: Dynamically allocate structs at mlx4_slave_cap") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-03	net/mlx4_core: fix memory leaks on error exit path	Colin Ian King
	The structures hca_param and func_cap are not being kfree'd on an error exit path causing two memory leaks. Fix this by jumping to the existing free memory error exit path. Detected by CoverityScan, CID#1455219, CID#1455224 ("Resource Leak") Fixes: c73c8b1e47ca ("net/mlx4_core: Dynamically allocate structs at mlx4_slave_cap") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-03	ipv4: Don't override return code from ip_route_input_noref()	Stefano Brivio
	After ip_route_input() calls ip_route_input_noref(), another check on skb_dst() is done, but if this fails, we shouldn't override the return code from ip_route_input_noref(), as it could have been more specific (i.e. -EHOSTUNREACH). This also saves one call to skb_dst_force_safe() and one to skb_dst() in case the ip_route_input_noref() check fails. Reported-by: Sabrina Dubroca <sdubroca@redhat.com> Fixes: 9df16efadd2a ("ipv4: call dst_hold_safe() properly") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Acked-by: Wei Wang <weiwan@google.com> Acked-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-03	xfs: use kmem_free to free return value of kmem_zalloc	Pan Bian
	In function xfs_test_remount_options(), kfree() is used to free memory allocated by kmem_zalloc(). But it is better to use kmem_free(). Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-09-03	xfs: open code end_buffer_async_write in xfs_finish_page_writeback	Christoph Hellwig
	Our loop in xfs_finish_page_writeback, which iterates over all buffer heads in a page and then calls end_buffer_async_write, which also iterates over all buffers in the page to check if any I/O is in flight is not only inefficient, but also potentially dangerous as end_buffer_async_write can cause the page and all buffers to be freed. Replace it with a single loop that does the work of end_buffer_async_write on a per-page basis. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-09-03	Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus	Linus Torvalds
	Pull MIPS fixes from Ralf Baechle: "The two indirect syscall fixes have sat in linux-next for a few days. I did check back with a hardware designer to ensure a SYNC is really what's required for the GIC fix and so the GIC fix didn't make it into to linux-next in time for this final pull request. It builds in local build tests and passes Imagination's test system" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: irqchip: mips-gic: SYNC after enabling GIC region MIPS: Remove pt_regs adjustments in indirect syscall handler MIPS: seccomp: Fix indirect syscall args
2017-09-03	Merge branch 'x86-urgent-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: - Expand the space for uncompressing as the LZ4 worst case does not fit into the currently reserved space - Validate boot parameters more strictly to prevent out of bound access in the decompressor/boot code - Fix off by one errors in get_segment_base() * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Prevent faulty bootparams.screeninfo from causing harm x86/boot: Provide more slack space during decompression x86/ldt: Fix off by one in get_segment_base()
2017-09-03	Merge branch 'timers-urgent-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "A single fix for a thinko in the raw timekeeper update which causes clock MONOTONIC_RAW to run with erratically increased frequency" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: time: Fix ktime_get_raw() incorrect base accumulation
2017-09-03	Merge branch 'perf-urgent-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: - Prevent a potential inconistency in the perf user space access which might lead to evading sanity checks. - Prevent perf recording function trace entries twice * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/ftrace: Fix double traces of perf on ftrace:function perf/core: Fix potential double-fetch bug
2017-09-03	ALSA: hda: Fix regression of hdmi eld control created based on invalid pcm	Wang YanQing
	Commit fb087eaaef72 ("ALSA: hda - hdmi eld control created based on pcm") forget to filter out invalid pcm numbers, if there is only one invalid pcm number, then this issue causes we create eld control for invalid pcm silently, but when there are more than one invalid pcm numbers, then this issue bring probe error looks like below dmesg: " kernel: [ 1.647283] snd_hda_intel 0000:00:03.0: bound 0000:00:02.0 (ops 0xc2967540) kernel: [ 1.651192] snd_hda_intel 0000:00:03.0: Too many HDMI devices kernel: [ 1.651195] snd_hda_intel 0000:00:03.0: Consider building the kernel with CONFIG_SND_DYNAMIC_MINORS=y kernel: [ 1.651197] snd_hda_intel 0000:00:03.0: Too many HDMI devices kernel: [ 1.651199] snd_hda_intel 0000:00:03.0: Consider building the kernel with CONFIG_SND_DYNAMIC_MINORS=y kernel: [ 1.651201] snd_hda_intel 0000:00:03.0: Too many HDMI devices kernel: [ 1.651203] snd_hda_intel 0000:00:03.0: Consider building the kernel with CONFIG_SND_DYNAMIC_MINORS=y kernel: [ 1.651676] snd_hda_intel 0000:00:03.0: control 3:0:0:ELD:0 is already present kernel: [ 1.651787] snd_hda_codec_hdmi: probe of hdaudioC0D0 failed with error -16 " This patch add invalid pcm number filter before calling hdmi_create_eld_ctl. Fixes: fb087eaaef72 ("ALSA: hda - hdmi eld control created based on pcm") Signed-off-by: Wang YanQing <udknight@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-09-03	net/mlx5e: Distribute RSS table among all RX rings	Tariq Toukan
	In default, uniformly distribute the RSS indirection table entries among all RX rings, rather than restricting this only to the rings on the close NUMA node. irqbalancer would anyway dynamically override the default affinities set to the RX rings. This gives better multi-stream performance and CPU util. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-03	net/mlx5e: Stop NAPI when irq balancer changes affinity	Tariq Toukan
	NAPI context keeps rescheduling on same CPU as long as it's busy. This doesn't give the oppurtunity for changes in irq affinities to take effect. Fix that by calling napi_complete_done() upon a change in affinity. This would stop the NAPI and reschedule it on the new CPU. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-03	net/mlx5e: Use kernel's mechanism to avoid missing NAPIs	Tariq Toukan
	We used a channel state bit MLX5E_CHANNEL_NAPI_SCHED to make sure no NAPI is missed when a channel's napi_schedule() is called for completion events of the different channel's resources/rings while NAPI is currently running. Now, as similar mechanism is implemented in kernel, ("39e6c8208d7b net: solve a NAPI race"), we obsolete our own implementation and rely on the return value of napi_complete_done(). This patch removes a redundant overhead of atomic bit operations. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-03	net/mlx5e: Slightly increase RX page-cache size	Tariq Toukan
	In XDP_TX flow, we now get back quicker to each page in page-cache, and on some occasions refcount does not get back to 1 on time, causing some costly page allocations. Slightly increase the size of RX page-cache to significantly decrease the chances for this to happen. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-03	net/mlx5e: Don't recycle page if moved to far NUMA	Tariq Toukan
	Avoid recycling an RX page if it moved to another NUMA node. Add an ethtool counter to count such events. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-03	net/mlx5e: Remove unnecessary fields in ICO SQ	Tariq Toukan
	As of current design, in each NAPI, only a single UMR WQE completion could be available in the completion queue of the the internal control operations (ICO) send queue, in addition to nop operations that require no actions upon completion. This renders the consume index obsolete, as the wqe_counter field in CQE is sufficient. This helps removing a memory barrier, and obsoletes the need for tracking the num_wqebbs to update the consumer counter. In addition, remove other unused fields in icosq struct: pdev, dma_fifo_pc, and prev_cc. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-03	net/mlx5e: Type-specific optimizations for RX post WQEs function	Tariq Toukan
	Separate the RX post WQEs function of the different RQ types. This enables RQ type-specific optimizations in data-path. Poll the ICOSQ completion queue only for Striding RQ, and only when a UMR post completion could be possibly available. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-03	net/mlx5e: Non-atomic RQ state indicator for UMR WQE in progress	Tariq Toukan
	The indication for a UMR WQE in progress is needed only within the NAPI context, and hence no races possible and no need for the use of atomic operations. The only place the flag is read outside of NAPI context is in closure flow, after RQ is disabled flag is no more accessed in NAPI. Use a boolean instead of a bit in ring state, so that its non-atomic set operations do not race with the atomic sets of the other bits. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-03	net/mlx5e: Non-atomic indicator for ring enabled state	Tariq Toukan
	Rings enabled state change occurs in control path only, and is always followed by a napi_sychronize(), so that following NAPIs read the new value. This read does not need to be atomic. The RQ auto-moderation bit is not set/cleared in data-path. No need for atomic read, a regular read operation is sufficient. In RQ creation time as well, there's no multiple threads trying to access it yet, hence a regular read can be used. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-03	net/mlx5e: Refactor data-path lro header function	Tariq Toukan
	Refactor function mlx5e_lro_update_hdr() to reduce number of branches. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-03	net/mlx5e: Early-return on empty completion queues	Tariq Toukan
	NAPI context handles different kinds of completion queues (RX, TX, and others). Hence, upon a poll trial, some of them might be empty. Here we early-return upon empty completion queues, as well as full rx buffer, and save unnecessary logic and memory barriers. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-03	net/mlx5e: NAPI busy-poll when UMR post is in progress	Tariq Toukan
	If a UMR post is in progress, it means that there's a missing WQE in RQ, and that a completion will be shortly available in ICO SQ completion queue. Prefer busy-poll to handle it as soon as possible. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-03	net/mlx5e: Small enhancements for RX MPWQE allocation and free	Tariq Toukan
	The dma offset of a MPWQE (Multi-Packet WQE) in memory region is fixed for all rounds. Calculate it once on creation time, instead of in runtime. This also obsoletes the wqe argument in the function. In addition, optimize dma_info iterator calculation. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-03	net/mlx5e: Use memset to init skbs_frags array to zeros	Tariq Toukan
	In RX data-path, use memset() instead of loop assignment to init the whole skbs_frags array. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-03	net/mlx5e: Remove unnecessary wqe_sz field from RQ buffer	Tariq Toukan
	Field is used only locally within the RQ create function. The use of a local variable is sufficient. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-03	net/mlx5e: Replace multiplication by stride size with a shift	Tariq Toukan
	In RX data-path, use shift operations instead of a regular multiplication by stride size, as it is a power of two. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-03	net/mlx5e: Reorganize struct mlx5e_rq	Tariq Toukan
	Bring fast-path fields together, and combine RX WQE mutual exclusive fields into a union. Page-reuse and XDP are mutually exclusive and cannot be used at the same time. Use a union to combine their footprints. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-02	xfs: don't set v3 xflags for v2 inodes	Christoph Hellwig
	Reject attempts to set XFLAGS that correspond to di_flags2 inode flags if the inode isn't a v3 inode, because di_flags2 only exists on v3. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-09-02	xfs: fix compiler warnings	Darrick J. Wong
	Fix up all the compiler warnings that have crept in. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-09-02	powerpc/xive: improve debugging macros	Cédric Le Goater
	Having the CPU identifier in the debug logs is helpful when tracking issues. Also add some more logging and fix a compile issue in xive_do_source_eoi(). Signed-off-by: Cédric Le Goater <clg@kaod.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-02	powerpc/xive: add XIVE Exploitation Mode to CAS	Cédric Le Goater
	On POWER9, the Client Architecture Support (CAS) negotiation process determines whether the guest operates in XIVE Legacy compatibility or in XIVE exploitation mode. Now that we have initial guest support for the XIVE interrupt controller, let's inform the hypervisor what we can do. The platform advertises the XIVE Exploitation Mode support using the property "ibm,arch-vec-5-platform-support-vec-5", byte 23 bits 0-1 : - 0b00 XIVE legacy mode Only - 0b01 XIVE exploitation mode Only - 0b10 XIVE legacy or exploitation mode The OS asks for XIVE Exploitation Mode support using the property "ibm,architecture-vec-5", byte 23 bits 0-1: - 0b00 XIVE legacy mode Only - 0b01 XIVE exploitation mode Only Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-02	powerpc/xive: introduce H_INT_ESB hcall	Cédric Le Goater
	The H_INT_ESB hcall() is used to issue a load or store to the ESB page instead of using the MMIO pages. This can be used as a workaround on some HW issues. The OS knows that this hcall should be used on an interrupt source when the ESB hcall flag is set to 1 in the hcall H_INT_GET_SOURCE_INFO. To maintain the frontier between the xive frontend and backend, we introduce a new xive operation 'esb_rw' to be used in the routines doing memory accesses on the ESBs. Signed-off-by: Cédric Le Goater <clg@kaod.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-02	powerpc/xive: add the HW IRQ number under xive_irq_data	Cédric Le Goater
	It will be required later by the H_INT_ESB hcall. Signed-off-by: Cédric Le Goater <clg@kaod.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>