Age | Commit message (Collapse) | Author |
|
Use actual CPU number instead of hardcoded value to decide the size
of 'cpu_used_mask' in 'struct sw_flow'. Below is the reason.
'struct cpumask cpu_used_mask' is embedded in struct sw_flow.
Its size is hardcoded to CONFIG_NR_CPUS bits, which can be
8192 by default, it costs memory and slows down ovs_flow_alloc.
To address this:
Redefine cpu_used_mask to pointer.
Append cpumask_size() bytes after 'stat' to hold cpumask.
Initialization cpu_used_mask right after stats_last_writer.
APIs like cpumask_next and cpumask_set_cpu never access bits
beyond cpu count, cpumask_size() bytes of memory is enough.
Signed-off-by: Eddy Tao <taoyuan_eddy@hotmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Link: https://lore.kernel.org/r/OS3P286MB229570CCED618B20355D227AF5D59@OS3P286MB2295.JPNP286.PROD.OUTLOOK.COM
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The forward declaration was introduced with a prototype that does
not match the function definition:
drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c:2166:13: error: conflicting types for 'xgbe_phy_perform_ratechange' due to enum/integer mismatch; have 'void(struct xgbe_prv_data *, enum xgbe_mb_cmd, enum xgbe_mb_subcmd)' [-Werror=enum-int-mismatch]
2166 | static void xgbe_phy_perform_ratechange(struct xgbe_prv_data *pdata,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c:391:13: note: previous declaration of 'xgbe_phy_perform_ratechange' with type 'void(struct xgbe_prv_data *, unsigned int, unsigned int)'
391 | static void xgbe_phy_perform_ratechange(struct xgbe_prv_data *pdata,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
Ideally there should not be any forward declarations here, which
would make it easier to show that there is no unbounded recursion.
I tried fixing this but could not figure out how to avoid the
recursive call.
As a hotfix, address only the broken prototype to fix the build
problem instead.
Fixes: 4f3b20bfbb75 ("amd-xgbe: add support for rx-adaptation")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Link: https://lore.kernel.org/r/20230203121553.2871598-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There are no external users of the vsc7514_*_regmap[] symbols or
vsc7514_vcap_* functions. They were exported in commit 32ecd22ba60b ("net:
mscc: ocelot: split register definitions to a separate file") with the
intention of being used, but the actual structure used in commit
2efaca411c96 ("net: mscc: ocelot: expose vsc7514_regmap definition") ended
up being all that was needed.
Bury these unnecessary symbols.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20230204182056.25502-1-colin.foster@in-advantage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Ajit Khaparde says:
====================
bnxt: Add Auxiliary driver support
Add auxiliary device driver for Broadcom devices.
The bnxt_en driver will register and initialize an aux device
if RDMA is enabled in the underlying device.
The bnxt_re driver will then probe and initialize the
RoCE interfaces with the infiniband stack.
We got rid of the bnxt_en_ops which the bnxt_re driver used to
communicate with bnxt_en.
Similarly We have tried to clean up most of the bnxt_ulp_ops.
In most of the cases we used the functions and entry points provided
by the auxiliary bus driver framework.
And now these are the minimal functions needed to support the functionality.
We will try to work on getting rid of the remaining if we find any
other viable option in future.
* 'aux-bus-v11' of https://github.com/ajitkhaparde1/linux:
bnxt_en: Remove runtime interrupt vector allocation
RDMA/bnxt_re: Remove the sriov config callback
bnxt_en: Remove struct bnxt access from RoCE driver
bnxt_en: Use auxiliary bus calls over proprietary calls
bnxt_en: Use direct API instead of indirection
bnxt_en: Remove usage of ulp_id
RDMA/bnxt_re: Use auxiliary driver interface
bnxt_en: Add auxiliary driver support
====================
Link: https://lore.kernel.org/r/20230202033809.3989-1-ajit.khaparde@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When the network status is unstable, use-after-free may occur when
read data from the server.
BUG: KASAN: use-after-free in readpages_fill_pages+0x14c/0x7e0
Call Trace:
<TASK>
dump_stack_lvl+0x38/0x4c
print_report+0x16f/0x4a6
kasan_report+0xb7/0x130
readpages_fill_pages+0x14c/0x7e0
cifs_readv_receive+0x46d/0xa40
cifs_demultiplex_thread+0x121c/0x1490
kthread+0x16b/0x1a0
ret_from_fork+0x2c/0x50
</TASK>
Allocated by task 2535:
kasan_save_stack+0x22/0x50
kasan_set_track+0x25/0x30
__kasan_kmalloc+0x82/0x90
cifs_readdata_direct_alloc+0x2c/0x110
cifs_readdata_alloc+0x2d/0x60
cifs_readahead+0x393/0xfe0
read_pages+0x12f/0x470
page_cache_ra_unbounded+0x1b1/0x240
filemap_get_pages+0x1c8/0x9a0
filemap_read+0x1c0/0x540
cifs_strict_readv+0x21b/0x240
vfs_read+0x395/0x4b0
ksys_read+0xb8/0x150
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
Freed by task 79:
kasan_save_stack+0x22/0x50
kasan_set_track+0x25/0x30
kasan_save_free_info+0x2e/0x50
__kasan_slab_free+0x10e/0x1a0
__kmem_cache_free+0x7a/0x1a0
cifs_readdata_release+0x49/0x60
process_one_work+0x46c/0x760
worker_thread+0x2a4/0x6f0
kthread+0x16b/0x1a0
ret_from_fork+0x2c/0x50
Last potentially related work creation:
kasan_save_stack+0x22/0x50
__kasan_record_aux_stack+0x95/0xb0
insert_work+0x2b/0x130
__queue_work+0x1fe/0x660
queue_work_on+0x4b/0x60
smb2_readv_callback+0x396/0x800
cifs_abort_connection+0x474/0x6a0
cifs_reconnect+0x5cb/0xa50
cifs_readv_from_socket.cold+0x22/0x6c
cifs_read_page_from_socket+0xc1/0x100
readpages_fill_pages.cold+0x2f/0x46
cifs_readv_receive+0x46d/0xa40
cifs_demultiplex_thread+0x121c/0x1490
kthread+0x16b/0x1a0
ret_from_fork+0x2c/0x50
The following function calls will cause UAF of the rdata pointer.
readpages_fill_pages
cifs_read_page_from_socket
cifs_readv_from_socket
cifs_reconnect
__cifs_reconnect
cifs_abort_connection
mid->callback() --> smb2_readv_callback
queue_work(&rdata->work) # if the worker completes first,
# the rdata is freed
cifs_readv_complete
kref_put
cifs_readdata_release
kfree(rdata)
return rdata->... # UAF in readpages_fill_pages()
Similarly, this problem also occurs in the uncache_fill_pages().
Fix this by adjusts the order of condition judgment in the return
statement.
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Cc: stable@vger.kernel.org
Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
When ice_add_special_words() fails, the 'rm' is not released, which will
lead to a memory leak. Fix this up by going to 'err_unroll' label.
Compile tested only.
Fixes: 8b032a55c1bd ("ice: low level support for tunnels")
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
|
|
The > comparison should be >= to prevent reading one element beyond
the end of the array.
The "vsi->num_rxq" is not strictly speaking the number of elements in
the vsi->rxq_map[] array. The array has "vsi->alloc_rxq" elements and
"vsi->num_rxq" is less than or equal to the number of elements in the
array. The array is allocated in ice_vsi_alloc_arrays(). It's still
an off by one but it might not access outside the end of the array.
Fixes: 143b86f346c7 ("ice: Enable RX queue selection using skbedit action")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Amritha Nambiar <amritha.nambiar@intel.com>
Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
|
|
If the user turns on the vf-true-promiscuous-support flag, then Rx VLAN
filtering will be disabled if the VF requests to enable promiscuous
mode. When the VF is in a port VLAN, this is the incorrect behavior
because it will allow the VF to receive traffic outside of its port VLAN
domain. Fortunately this only resulted in the VF(s) receiving broadcast
traffic outside of the VLAN domain because all of the VLAN promiscuous
rules are based on the port VLAN ID. Fix this by setting the
.disable_rx_filtering VLAN op to a no-op when a port VLAN is enabled on
the VF.
Also, make sure to make this fix for both Single VLAN Mode and Double
VLAN Mode enabled devices.
Fixes: c31af68a1b94 ("ice: Add outer_vlan_ops and VSI specific VLAN ops implementations")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Karen Ostrowska <karen.ostrowska@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
KASAN reported:
[ 9793.708867] BUG: KASAN: global-out-of-bounds in ice_get_link_speed+0x16/0x30 [ice]
[ 9793.709205] Read of size 4 at addr ffffffffc1271b1c by task kworker/6:1/402
[ 9793.709222] CPU: 6 PID: 402 Comm: kworker/6:1 Kdump: loaded Tainted: G B OE 6.1.0+ #3
[ 9793.709235] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.00.01.0014.070920180847 07/09/2018
[ 9793.709245] Workqueue: ice ice_service_task [ice]
[ 9793.709575] Call Trace:
[ 9793.709582] <TASK>
[ 9793.709588] dump_stack_lvl+0x44/0x5c
[ 9793.709613] print_report+0x17f/0x47b
[ 9793.709632] ? __cpuidle_text_end+0x5/0x5
[ 9793.709653] ? ice_get_link_speed+0x16/0x30 [ice]
[ 9793.709986] ? ice_get_link_speed+0x16/0x30 [ice]
[ 9793.710317] kasan_report+0xb7/0x140
[ 9793.710335] ? ice_get_link_speed+0x16/0x30 [ice]
[ 9793.710673] ice_get_link_speed+0x16/0x30 [ice]
[ 9793.711006] ice_vc_notify_vf_link_state+0x14c/0x160 [ice]
[ 9793.711351] ? ice_vc_repr_cfg_promiscuous_mode+0x120/0x120 [ice]
[ 9793.711698] ice_vc_process_vf_msg+0x7a7/0xc00 [ice]
[ 9793.712074] __ice_clean_ctrlq+0x98f/0xd20 [ice]
[ 9793.712534] ? ice_bridge_setlink+0x410/0x410 [ice]
[ 9793.712979] ? __request_module+0x320/0x520
[ 9793.713014] ? ice_process_vflr_event+0x27/0x130 [ice]
[ 9793.713489] ice_service_task+0x11cf/0x1950 [ice]
[ 9793.713948] ? io_schedule_timeout+0xb0/0xb0
[ 9793.713972] process_one_work+0x3d0/0x6a0
[ 9793.714003] worker_thread+0x8a/0x610
[ 9793.714031] ? process_one_work+0x6a0/0x6a0
[ 9793.714049] kthread+0x164/0x1a0
[ 9793.714071] ? kthread_complete_and_exit+0x20/0x20
[ 9793.714100] ret_from_fork+0x1f/0x30
[ 9793.714137] </TASK>
[ 9793.714151] The buggy address belongs to the variable:
[ 9793.714158] ice_aq_to_link_speed+0x3c/0xffffffffffff3520 [ice]
[ 9793.714632] Memory state around the buggy address:
[ 9793.714642] ffffffffc1271a00: f9 f9 f9 f9 00 00 05 f9 f9 f9 f9 f9 00 00 02 f9
[ 9793.714656] ffffffffc1271a80: f9 f9 f9 f9 00 00 04 f9 f9 f9 f9 f9 00 00 00 00
[ 9793.714670] >ffffffffc1271b00: 00 00 00 04 f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9
[ 9793.714680] ^
[ 9793.714690] ffffffffc1271b80: 00 00 00 00 00 04 f9 f9 f9 f9 f9 f9 00 00 00 00
[ 9793.714704] ffffffffc1271c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
The ICE_AQ_LINK_SPEED_UNKNOWN define is BIT(15). The value is bigger
than both legacy and normal link speed tables. Add one element (0 -
unknown) to both tables. There is no need to explicitly set table size,
leave it empty.
Fixes: 1d0e28a9be1f ("ice: Remove and replace ice speed defines with ethtool.h versions")
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
|
|
When both ice and the irdma driver are loaded, a warning in
check_flush_dependency is being triggered. This is due to ice driver
workqueue being allocated with the WQ_MEM_RECLAIM flag and the irdma one
is not.
According to kernel documentation, this flag should be set if the
workqueue will be involved in the kernel's memory reclamation flow.
Since it is not, there is no need for the ice driver's WQ to have this
flag set so remove it.
Example trace:
[ +0.000004] workqueue: WQ_MEM_RECLAIM ice:ice_service_task [ice] is flushing !WQ_MEM_RECLAIM infiniband:0x0
[ +0.000139] WARNING: CPU: 0 PID: 728 at kernel/workqueue.c:2632 check_flush_dependency+0x178/0x1a0
[ +0.000011] Modules linked in: bonding tls xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_cha
in_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink bridge stp llc rfkill vfat fat intel_rapl_msr intel
_rapl_common isst_if_common skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct1
0dif_pclmul crc32_pclmul ghash_clmulni_intel rapl intel_cstate rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod target_
core_mod ib_iser libiscsi scsi_transport_iscsi rdma_cm ib_cm iw_cm iTCO_wdt iTCO_vendor_support ipmi_ssif irdma mei_me ib_uverbs
ib_core intel_uncore joydev pcspkr i2c_i801 acpi_ipmi mei lpc_ich i2c_smbus intel_pch_thermal ioatdma ipmi_si acpi_power_meter
acpi_pad xfs libcrc32c sd_mod t10_pi crc64_rocksoft crc64 sg ahci ixgbe libahci ice i40e igb crc32c_intel mdio i2c_algo_bit liba
ta dca wmi dm_mirror dm_region_hash dm_log dm_mod ipmi_devintf ipmi_msghandler fuse
[ +0.000161] [last unloaded: bonding]
[ +0.000006] CPU: 0 PID: 728 Comm: kworker/0:2 Tainted: G S 6.2.0-rc2_next-queue-13jan-00458-gc20aabd57164 #1
[ +0.000006] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0010.010620200716 01/06/2020
[ +0.000003] Workqueue: ice ice_service_task [ice]
[ +0.000127] RIP: 0010:check_flush_dependency+0x178/0x1a0
[ +0.000005] Code: 89 8e 02 01 e8 49 3d 40 00 49 8b 55 18 48 8d 8d d0 00 00 00 48 8d b3 d0 00 00 00 4d 89 e0 48 c7 c7 e0 3b 08
9f e8 bb d3 07 01 <0f> 0b e9 be fe ff ff 80 3d 24 89 8e 02 00 0f 85 6b ff ff ff e9 06
[ +0.000004] RSP: 0018:ffff88810a39f990 EFLAGS: 00010282
[ +0.000005] RAX: 0000000000000000 RBX: ffff888141bc2400 RCX: 0000000000000000
[ +0.000004] RDX: 0000000000000001 RSI: dffffc0000000000 RDI: ffffffffa1213a80
[ +0.000003] RBP: ffff888194bf3400 R08: ffffed117b306112 R09: ffffed117b306112
[ +0.000003] R10: ffff888bd983088b R11: ffffed117b306111 R12: 0000000000000000
[ +0.000003] R13: ffff888111f84d00 R14: ffff88810a3943ac R15: ffff888194bf3400
[ +0.000004] FS: 0000000000000000(0000) GS:ffff888bd9800000(0000) knlGS:0000000000000000
[ +0.000003] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.000003] CR2: 000056035b208b60 CR3: 000000017795e005 CR4: 00000000007706f0
[ +0.000003] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ +0.000003] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ +0.000002] PKRU: 55555554
[ +0.000003] Call Trace:
[ +0.000002] <TASK>
[ +0.000003] __flush_workqueue+0x203/0x840
[ +0.000006] ? mutex_unlock+0x84/0xd0
[ +0.000008] ? __pfx_mutex_unlock+0x10/0x10
[ +0.000004] ? __pfx___flush_workqueue+0x10/0x10
[ +0.000006] ? mutex_lock+0xa3/0xf0
[ +0.000005] ib_cache_cleanup_one+0x39/0x190 [ib_core]
[ +0.000174] __ib_unregister_device+0x84/0xf0 [ib_core]
[ +0.000094] ib_unregister_device+0x25/0x30 [ib_core]
[ +0.000093] irdma_ib_unregister_device+0x97/0xc0 [irdma]
[ +0.000064] ? __pfx_irdma_ib_unregister_device+0x10/0x10 [irdma]
[ +0.000059] ? up_write+0x5c/0x90
[ +0.000005] irdma_remove+0x36/0x90 [irdma]
[ +0.000062] auxiliary_bus_remove+0x32/0x50
[ +0.000007] device_release_driver_internal+0xfa/0x1c0
[ +0.000005] bus_remove_device+0x18a/0x260
[ +0.000007] device_del+0x2e5/0x650
[ +0.000005] ? __pfx_device_del+0x10/0x10
[ +0.000003] ? mutex_unlock+0x84/0xd0
[ +0.000004] ? __pfx_mutex_unlock+0x10/0x10
[ +0.000004] ? _raw_spin_unlock+0x18/0x40
[ +0.000005] ice_unplug_aux_dev+0x52/0x70 [ice]
[ +0.000160] ice_service_task+0x1309/0x14f0 [ice]
[ +0.000134] ? __pfx___schedule+0x10/0x10
[ +0.000006] process_one_work+0x3b1/0x6c0
[ +0.000008] worker_thread+0x69/0x670
[ +0.000005] ? __kthread_parkme+0xec/0x110
[ +0.000007] ? __pfx_worker_thread+0x10/0x10
[ +0.000005] kthread+0x17f/0x1b0
[ +0.000005] ? __pfx_kthread+0x10/0x10
[ +0.000004] ret_from_fork+0x29/0x50
[ +0.000009] </TASK>
Fixes: 940b61af02f4 ("ice: Initialize PF and setup miscellaneous interrupt")
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Tested-by: Jakub Andrysiak <jakub.andrysiak@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
|
|
There is a spelling mistake in a literal string. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230206092229.46416-1-colin.i.king@gmail.com
|
|
In a previous commit, Ubuntu kernel code version is correctly set
by retrieving the information from /proc/version_signature.
commit<5b3d72987701d51bf31823b39db49d10970f5c2d>
(libbpf: Improve LINUX_VERSION_CODE detection)
The /proc/version_signature file doesn't present in at least the
older versions of Debian distributions (eg, Debian 9, 10). The Debian
kernel has a similar issue where the release information from uname()
syscall doesn't give the kernel code version that matches what the
kernel actually expects. Below is an example content from Debian 10.
release: 4.19.0-23-amd64
version: #1 SMP Debian 4.19.269-1 (2022-12-20) x86_64
Debian reports incorrect kernel version in utsname::release returned
by uname() syscall, which in older kernels (Debian 9, 10) leads to
kprobe BPF programs failing to load due to the version check mismatch.
Fortunately, the correct kernel code version presents in the
utsname::version returned by uname() syscall in Debian kernels. This
change adds another get kernel version function to handle Debian in
addition to the previously added get kernel version function to handle
Ubuntu. Some minor refactoring work is also done to make the code more
readable.
Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
Signed-off-by: Ho-Ren (Jack) Chuang <horenchuang@bytedance.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230203234842.2933903-1-hao.xiang@bytedance.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
"During the v6.2 cycle, there were a series of changes to task cpu
affinity handling which fixed cpuset inadvertently clobbering
user-configured affinity masks. Unfortunately, they broke the affinity
handling on hybrid heterogeneous CPUs which have cores that can
execute both 64 and 32bit along with cores that can only execute 32bit
code.
This contains two fix patches for the above issue. While reverting the
changes that caused the regression is definitely an option, the
origial patches do improve how cpuset behave signficantly in some
cases and the fixes seem fairly safe, so I think it'd be better to try
to fix them first"
* tag 'cgroup-for-6.2-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cpuset: Call set_cpus_allowed_ptr() with appropriate mask for task
cgroup/cpuset: Don't filter offline CPUs in cpuset_cpus_allowed() for top cpuset tasks
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- explicitly initialize zlib work memory to fix a KCSAN warning
- limit number of send clones by maximum memory allocated
- limit device size extent in case it device shrink races with chunk
allocation
- raid56 fixes:
- fix copy&paste error in RAID6 stripe recovery
- make error bitmap update atomic
* tag 'for-6.2-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: raid56: make error_bitmap update atomic
btrfs: send: limit number of clones and allocated memory size
btrfs: zlib: zero-initialize zlib workspace
btrfs: limit device extents to the device size
btrfs: raid56: fix stripes if vertical errors are found
|
|
set_cpus_allowed_ptr() will fail with -EINVAL if the requested
affinity mask is not a subset of the task_cpu_possible_mask() for the
task being updated. Consequently, on a heterogeneous system with cpusets
spanning the different CPU types, updates to the cgroup hierarchy can
silently fail to update task affinities when the effective affinity
mask for the cpuset is expanded.
For example, consider an arm64 system with 4 CPUs, where CPUs 2-3 are
the only cores capable of executing 32-bit tasks. Attaching a 32-bit
task to a cpuset containing CPUs 0-2 will correctly affine the task to
CPU 2. Extending the cpuset to CPUs 0-3, however, will fail to extend
the affinity mask of the 32-bit task because update_tasks_cpumask() will
pass the full 0-3 mask to set_cpus_allowed_ptr().
Extend update_tasks_cpumask() to take a temporary 'cpumask' paramater
and use it to mask the 'effective_cpus' mask with the possible mask for
each task being updated.
Fixes: 431c69fac05b ("cpuset: Honour task_cpu_possible_mask() in guarantee_online_cpus()")
Signed-off-by: Will Deacon <will@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
cpuset tasks
Since commit 8f9ea86fdf99 ("sched: Always preserve the user
requested cpumask"), relax_compatible_cpus_allowed_ptr() is calling
__sched_setaffinity() unconditionally. This helps to expose a bug in
the current cpuset hotplug code where the cpumasks of the tasks in
the top cpuset are not updated at all when some CPUs become online or
offline. It is likely caused by the fact that some of the tasks in the
top cpuset, like percpu kthreads, cannot have their cpu affinity changed.
One way to reproduce this as suggested by Peter is:
- boot machine
- offline all CPUs except one
- taskset -p ffffffff $$
- online all CPUs
Fix this by allowing cpuset_cpus_allowed() to return a wider mask that
includes offline CPUs for those tasks that are in the top cpuset. For
tasks not in the top cpuset, the old rule applies and only online CPUs
will be returned in the mask since hotplug events will update their
cpumasks accordingly.
Fixes: 8f9ea86fdf99 ("sched: Always preserve the user requested cpumask")
Reported-by: Will Deacon <will@kernel.org>
Originally-from: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Will Deacon <will@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The dev_lan_addr and hw_lan_addr members of ice_vf are used only to store
the MAC address for the VF. They are defined using virtchnl_ether_addr, but
only the .addr sub-member is actually used. Drop the use of
virtchnl_ether_addr and just use a u8 array of length [ETH_ALEN].
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The Scalable IOV implementation will require notifying the VDCM driver when
an IRQ must be closed. This allows the VDCM to handle releasing stale IRQ
context values and properly reconfigure.
To handle this, introduce a new optional .irq_close callback to the VF
operations structure. This will be implemented by Scalable IOV to handle
the shutdown of the IRQ context.
Since the SR-IOV implementation does not need this, we must check that its
non-NULL before calling it.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
When hardware is reset, the VF relies on the VFGEN_RSTAT register to detect
when the VF is finished resetting. This is a tri-state register where 0
indicates a reset is in progress, 1 indicates the hardware is done
resetting, and 2 indicates that the software is done resetting.
Currently the PF driver relies on the device hardware resetting VFGEN_RSTAT
when a global reset occurs. This works ok, but it does mean that the VF
might not immediately notice a reset when the driver first detects that the
global reset is occurring.
This is also problematic for Scalable IOV, because there is no read/write
equivalent VFGEN_RSTAT register for the Scalable VSI type. Instead, the
Scalable IOV VFs will need to emulate this register.
To support this, introduce a new VF operation, clear_reset_state, which is
called when the PF driver first detects a global reset. The Single Root IOV
implementation can just write to VFGEN_RSTAT to ensure it's cleared
immediately, without waiting for the actual hardware reset to begin. The
Scalable IOV implementation will use this as part of its tracking of the
reset status to allow properly reporting the emulated VFGEN_RSTAT to the VF
driver.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The .vsi_rebuild function exists for ice_reset_vf. It is used to release
and re-create the VSI during a single-VF reset.
This function is only called when we need to re-create the VSI, and not
when rebuilding an existing VSI. This makes the single-VF reset process
different from the process used to restore functionality after a
hardware reset such as the PF reset or EMP reset.
When we add support for Scalable IOV VFs, the implementation will be very
similar. The primary difference will be in the fact that each VF type uses
a different underlying VSI type in hardware.
Move the common functionality into a new ice_vf_recreate VSI function. This
will allow the two IOV paths to share this functionality. Rework the
.vsi_rebuild vf_op into .create_vsi, only performing the task of creating a
new VSI.
This creates a nice dichotomy between the ice_vf_rebuild_vsi and
ice_vf_recreate_vsi, and should make it more clear why the two flows atre
distinct.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Introduce a new generic helper ice_vf_init_host_cfg which performs common
host configuration initialization tasks that will need to be done for both
Single Root IOV and the new Scalable IOV implementation.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Some of the initialization code for Single Root IOV VFs will need to be
reused when we introduce Scalable IOV. Pull this code out into a new
ice_initialize_vf_entry helper function.
Co-developed-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The Single Root IOV implementation of .post_vsi_rebuild performs some tasks
that will ultimately need to be shared with the Scalable IOV implementation
such as rebuilding the host configuration.
Refactor by introducing a new wrapper function, ice_vf_post_vsi_rebuild
which performs the tasks that will be shared between SR-IOV and Scalable
IOV. Move the ice_vf_rebuild_host_cfg and ice_vf_set_initialized calls into
this wrapper. Then call the implementation specific post_vsi_rebuild
handler afterwards.
This ensures that we will properly re-initialize filters and expected
settings for both SR-IOV and Scalable IOV.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The ice_vf_vsi_release function will be used in a future change to
refactor the .vsi_rebuild function. Move this over to ice_vf_lib.c so
that it can be used there.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The ice_vsi_alloc and ice_vsi_cfg functions are used together to allocate
and configure a new VSI, called as part of the ice_vsi_setup function.
In the future with the addition of the subfunction code the ice driver
will want to be able to allocate a VSI while delaying the configuration to
a later point of the port activation.
Currently this requires that the port code know what type of VSI should
be allocated. This is required because ice_vsi_alloc assigns the VSI type.
Refactor the ice_vsi_alloc and ice_vsi_cfg functions so that VSI type
assignment isn't done until the configuration stage. This will allow the
devlink port addition logic to reserve a VSI as early as possible before
the type of the port is known. In this way, the port add can fail in the
event that all hardware VSI resources are exhausted.
Since the ice_vsi_cfg function already takes the ice_vsi_cfg_params
structure, this is relatively straight forward.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The ice_vsi_setup function, ice_vsi_alloc, and ice_vsi_cfg functions have
grown a large number of parameters. These parameters are used to initialize
a new VSI, as well as re-configure an existing VSI
Any time we want to add a new parameter to this function chain, even if it
will usually be unset, we have to change many call sites due to changing
the function signature.
A future change is going to refactor ice_vsi_alloc and ice_vsi_cfg to move
the VSI configuration and initialization all into ice_vsi_cfg.
Before this, refactor the VSI setup flow to use a new ice_vsi_cfg_params
structure. This will contain the configuration (mainly pointers) used to
initialize a VSI.
Pass this from ice_vsi_setup into the related functions such as
ice_vsi_alloc, ice_vsi_cfg, and ice_vsi_cfg_def.
Introduce a helper, ice_vsi_to_params to convert an existing VSI to the
parameters used to initialize it. This will aid in the flows where we
rebuild an existing VSI.
Since we also pass the ICE_VSI_FLAG_INIT to more functions which do not
need (or cannot yet have) the VSI parameters, lets make this clear by
renaming the function parameter to vsi_flags and using a u32 instead of a
signed integer. The name vsi_flags also makes it clear that we may extend
the flags in the future.
This change will make it easier to refactor the setup flow in the future,
and will reduce the complexity required to add a new parameter for
configuration in the future.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The vsi->vf pointer gets assigned early on during ice_vsi_alloc. Several
functions currently take a VF pointer, but they can just use the existing
vsi->vf pointer as needed. Modify these functions to drop the unnecessary
VF parameter.
Note that ice_vsi_cfg is not changed as a following change will refactor so
that the VF pointer is assigned during ice_vsi_cfg rather than
ice_vsi_alloc.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Since commit 1d2e32275de7 ("ice: split ice_vsi_setup into smaller
functions") ice_vsi_alloc has not been responsible for all of the behavior
implied by the comment for ice_vsi_setup_vector_base.
Fix the comment to refer to the new function ice_vsi_alloc_def().
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Extend the usage of function ice_get_vf_vsi(vf) in multiple places
instead of VF's VSI by using a long string of dereferences
(i.e. vf->pf->vsi[vf->lan_vsi_idx]).
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Kalyan Kodamagula <kalyan.kodamagula@intel.com>
Tested-by: Piotr Tyda <piotr.tyda@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The touchscreen reports a battery status of 0% and jumps to 1% when a
stylus is used. The device ID was added and the battery ignore quirk was
enabled for it.
Signed-off-by: Luka Guzenko <l.guzenko@web.de>
Link: https://lore.kernel.org/r/20230120223741.3007-1-l.guzenko@web.de
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
|
|
Marc Kleine-Budde <mkl@pengutronix.de> says:
several people noticed that on modern CAN controllers with wide bit
timing registers the default SJW of 1 can result in unstable or no
synchronization to the CAN network. See Patch 14/17 for details.
During review of v1 Vincent pointed out that the original code and the
series doesn't always check user provided bit timing parameters,
sometimes silently limits them and the return error values are not
consistent.
This series first cleans up some code in bittiming.c, replacing
open-coded variants by macros or functions (Patches 1, 2).
Patch 3 adds the missing assignment of the effective TQ if the
interface is configured with low level timing parameters.
Patch 4 is another code cleanup.
Patches 5, 6 check the bit timing parameter during interface
registration.
Patch 7 adds a validation of the sample point.
The patches 8-13 convert the error messages from netdev_err() to
NL_SET_ERR_MSG_FMT, factor out the SJW handling from
can_fixup_bittiming(), add checking and error messages for the
individual limits and harmonize the error return values.
Patch 14 changes the default SJW value from 1 to min(Phase Seg1, Phase
Seg2 / 2).
Patch 15 switches can_calc_bittiming() to use the new SJW handling.
Patch 16 converts can_calc_bittiming() to NL_SET_ERR_MSG_FMT().
And patch 16 adds a NL_SET_ERR_MSG_FMT() error message to
can_validate_bitrate().
v1: https://lore.kernel.org/all/20220907103845.3929288-1-mkl@pengutronix.de
Link: https://lore.kernel.org/all/20230202110854.2318594-1-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Report an error to user space via netlink if the requested bit rate is
not supported by the device.
Link: https://lore.kernel.org/all/20230202110854.2318594-18-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
NL_SET_ERR_MSG_FMT()
Replace the netdev_err() by NL_SET_ERR_MSG_FMT() to better inform the
user about the problem. While there, use %u to print unsigned values
and improve error message a bit.
In case of an error, return -EINVAL instead of -EDOM, this corresponds
better to the actual meaning of the error value.
Link: https://lore.kernel.org/all/20230202110854.2318594-17-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
In the current code, if the user configures a bitrate, a default SJW
value of 1 is used. If the user configures both a bitrate and a SJW
value, can_calc_bittiming() silently limits the SJW value to SJW max
and TSEG2.
We came to the conclusion that if the user provided an invalid SJW
value, it's best to bail out and inform the user [1].
[1] https://lore.kernel.org/all/CAMZ6RqKqhmTgUZiwe5uqUjBDnhhC2iOjZ791+Y845btJYwVDKg@mail.gmail.com
Further the ISO 11898-1:2015 standard mandates that "SJW shall be less
than or equal to the minimum of these two items: Phase_Seg1 and
Phase_Seg2." [2] The current code is missing that check.
[2] https://lore.kernel.org/all/BL3PR11MB64844E3FC13C55433CDD0B3DFB449@BL3PR11MB6484.namprd11.prod.outlook.com
The previous patches introduced
1) can_sjw_set_default() - sets a default value for SJW if unset
2) can_sjw_check() - implements a SJW check against SJW max, Phase
Seg1 and Phase Seg2. In the error case this function reports the error
to user space via netlink.
Replace both the open-coded SJW default setting and the open-coded and
insufficient checks of SJW with the helper functions
can_sjw_set_default() and can_sjw_check().
Link: https://lore.kernel.org/all/20230202110854.2318594-16-mkl@pengutronix.de
Link: https://lore.kernel.org/all/CAMZ6RqKqhmTgUZiwe5uqUjBDnhhC2iOjZ791+Y845btJYwVDKg@mail.gmail.com
Link: https://lore.kernel.org/all/BL3PR11MB64844E3FC13C55433CDD0B3DFB449@BL3PR11MB6484.namprd11.prod.outlook.com
Suggested-by: Thomas Kopp <Thomas.Kopp@microchip.com>
Suggested-by: Vincent Mailhol <vincent.mailhol@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
"The (Re-)Synchronization Jump Width (SJW) defines how far a
resynchronization may move the Sample Point inside the limits defined
by the Phase Buffer Segments to compensate for edge phase errors." [1]
In other words, this means that the SJW parameter controls the
tolerance of the CAN controller to frequency errors compared to other
CAN controllers.
If the user space does not provide an SJW parameter, the kernel
chooses a default value of 1. This has proven to be a good default
value for classic CAN controllers, but no longer for modern CAN-FD
controllers.
In the past there were CAN controllers like the sja1000 with a rather
limited range of bit timing parameters. For the standard bit rates
this results in the following bit timing parameters:
| Bit timing parameters for sja1000 with 8.000000 MHz ref clock
| _----+--------------=> tseg1: 1 … 16
| / / _---------=> tseg2: 1 … 8
| | | / _-----=> sjw: 1 … 4
| | | | / _-=> brp: 1 … 64 (inc: 1)
| | | | | /
| nominal | | | | | real Bitrt nom real SampP
| Bitrate TQ[ns] PrS PhS1 PhS2 SJW BRP Bitrate Error SampP SampP Error BTR0 BTR1
| 1000000 125 2 3 2 1 1 1000000 0.0% 75.0% 75.0% 0.0% 0x00 0x14
| 800000 125 3 4 2 1 1 800000 0.0% 80.0% 80.0% 0.0% 0x00 0x16
| 666666 125 4 4 3 1 1 666666 0.0% 80.0% 75.0% 6.2% 0x00 0x27
| 500000 125 6 7 2 1 1 500000 0.0% 87.5% 87.5% 0.0% 0x00 0x1c
| 250000 250 6 7 2 1 2 250000 0.0% 87.5% 87.5% 0.0% 0x01 0x1c
| 125000 500 6 7 2 1 4 125000 0.0% 87.5% 87.5% 0.0% 0x03 0x1c
| 100000 625 6 7 2 1 5 100000 0.0% 87.5% 87.5% 0.0% 0x04 0x1c
| 83333 750 6 7 2 1 6 83333 0.0% 87.5% 87.5% 0.0% 0x05 0x1c
| 50000 1250 6 7 2 1 10 50000 0.0% 87.5% 87.5% 0.0% 0x09 0x1c
| 33333 1875 6 7 2 1 15 33333 0.0% 87.5% 87.5% 0.0% 0x0e 0x1c
| 20000 3125 6 7 2 1 25 20000 0.0% 87.5% 87.5% 0.0% 0x18 0x1c
| 10000 6250 6 7 2 1 50 10000 0.0% 87.5% 87.5% 0.0% 0x31 0x1c
The attentive reader will notice that the SJW is 1 in most cases,
while the Seg2 phase is 2. Both values are given in TQ units, which in
turn is a duration in nanoseconds.
For example the 500 kbit/s configuration:
| nominal real Bitrt nom real SampP
| Bitrate TQ[ns] PrS PhS1 PhS2 SJW BRP Bitrate Error SampP SampP Error BTR0 BTR1
| 500000 125 6 7 2 1 1 500000 0.0% 87.5% 87.5% 0.0% 0x00 0x1c
the TQ is 125ns, the Phase Seg2 is "2" (== 250ns), the SJW is "1" (==
125 ns).
Looking at a more modern CAN controller like a mcp2518fd, it has wider
bit timing registers.
| Bit timing parameters for mcp251xfd with 40.000000 MHz ref clock
| _----+--------------=> tseg1: 2 … 256
| / / _---------=> tseg2: 1 … 128
| | | / _-----=> sjw: 1 … 128
| | | | / _-=> brp: 1 … 256 (inc: 1)
| | | | | /
| nominal | | | | | real Bitrt nom real SampP
| Bitrate TQ[ns] PrS PhS1 PhS2 SJW BRP Bitrate Error SampP SampP Error NBTCFG
| 500000 25 34 35 10 1 1 500000 0.0% 87.5% 87.5% 0.0% 0x00440900
The TQ is 25ns, the Phase Seg 2 is "10" (== 250ns), the SJW is "1" (==
25ns).
Since the kernel chooses a default SJW of 1 regardless of the TQ, this
leads to a much smaller SJW and thus much smaller tolerances to
frequency errors.
To maintain the same oscillator tolerances on controllers with wide
bit timing registers, select a default SJW value of Phase Seg2 / 2
unless Phase Seg 1 is less. This results in the following bit timing
parameters:
| Bit timing parameters for mcp251xfd with 40.000000 MHz ref clock
| _----+--------------=> tseg1: 2 … 256
| / / _---------=> tseg2: 1 … 128
| | | / _-----=> sjw: 1 … 128
| | | | / _-=> brp: 1 … 256 (inc: 1)
| | | | | /
| nominal | | | | | real Bitrt nom real SampP
| Bitrate TQ[ns] PrS PhS1 PhS2 SJW BRP Bitrate Error SampP SampP Error NBTCFG
| 500000 25 34 35 10 5 1 500000 0.0% 87.5% 87.5% 0.0% 0x00440904
The TQ is 25ns, the Phase Seg 2 is "10" (== 250ns), the SJW is "5" (==
125ns). Which is the same as on the sja1000 controller.
[1] http://web.archive.org/http://www.oertel-halle.de/files/cia99paper.pdf
Link: https://lore.kernel.org/all/20230202110854.2318594-15-mkl@pengutronix.de
Cc: Mark Bath <mark@baggywrinkle.co.uk>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Phase Buffer Segment
According to "The Configuration of the CAN Bit Timing" [1] the SJW
"may not be longer than either Phase Buffer Segment".
Check SJW against length of both Phase buffers. In case the SJW is
greater, report an error via netlink to user space and bail out.
[1] http://web.archive.org/http://www.oertel-halle.de/files/cia99paper.pdf
Link: https://lore.kernel.org/all/20230202110854.2318594-14-mkl@pengutronix.de
Suggested-by: Vincent Mailhol <vincent.mailhol@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
error value
If the user space has supplied an invalid SJW value (greater than the
maximum SJW value), report -EINVAL instead of -ERANGE, this better
matches the actual meaning of the error value.
Additionally report an error message via netlink to the user space.
Link: https://lore.kernel.org/all/20230202110854.2318594-13-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
harmonize error value
Check each bit timing parameter first individually against their
limits and report a meaningful error message via netlink to the user
space.
In case of an error, return -EINVAL instead of -ERANGE, this
corresponds better to the actual meaning of the error value.
Link: https://lore.kernel.org/all/20230202110854.2318594-12-mkl@pengutronix.de
Suggested-by: Vincent Mailhol <vincent.mailhol@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Factor out the functionality of assigning a SJW default value into
can_sjw_set_default() and the checking the SJW limits into
can_sjw_check().
This functions will be improved and called from a different function
in the following patches.
Link: https://lore.kernel.org/all/20230202110854.2318594-11-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
This is a preparation patch.
In order to pass warning/error messages during netlink calls back to
user space, pass the extack struct down the callstack of
can_changelink(), the actual error messages will be added in the
following ptaches.
Link: https://lore.kernel.org/all/20230202110854.2318594-10-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
NL_SET_ERR_MSG_FMT()
Since commit 51c352bdbcd2 ("netlink: add support for formatted extack
messages") formatted extack messages are supported to inform the user
space or warnings/errors during netlink calls.
Replace the netdev_err() by NL_SET_ERR_MSG_FMT() to better inform the
user about the problem. While there, use %u to print unsigned values
and improve error message a bit.
Link: https://lore.kernel.org/all/20230202110854.2318594-9-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
The sample point is a value in tenths of a percent. Meaningful values
are between 0 and 1000. Invalid values are rejected and an error
message is returned to user space via netlink.
Link: https://lore.kernel.org/all/20230202110854.2318594-8-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
constants are provided
The CAN driver framework supports either fixed bit rates or bit timing
constants. Bail out during driver registration if both are given.
Link: https://lore.kernel.org/all/20230202110854.2318594-7-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Implement the function can_bittiming_const_valid() to check the
validity of the specified bit timing constant. Call this function from
register_candev() to check the bit timing constants during the
registration of the CAN interface.
Link: https://lore.kernel.org/all/20230202110854.2318594-6-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Clean up the code flow a bit, don't assign err variable but directly
return. Remove the unneeded else, too.
Link: https://lore.kernel.org/all/20230202110854.2318594-5-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
The can_fixup_bittiming() function is used to validate the
user-supplied low-level bit timing parameters and calculate the
bitrate prescaler (brp) from the requested time quanta (tq) and the
CAN clock of the controller.
can_fixup_bittiming() selects the best matching integer bit rate
prescaler, which may result in a different time quantum than the value
specified by the user.
Calculate the resulting time quantum and assign it so that the user
sees the effective time quantum.
Link: https://lore.kernel.org/all/20230202110854.2318594-4-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Commit 1c47fa6b31c2 ("can: dev: add a helper function to calculate the
duration of one bit") made the constant CAN_SYNC_SEG available in a
header file.
The magic number 1 in can_fixup_bittiming() represents the width of
the sync segment, replace it by CAN_SYNC_SEG to make the code more
readable.
Link: https://lore.kernel.org/all/20230202110854.2318594-3-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Commit 1c47fa6b31c2 ("can: dev: add a helper function to calculate the
duration of one bit") added the helper function can_bit_time().
Replace open coded variants of can_bit_time() by the helper function.
Link: https://lore.kernel.org/all/20230202110854.2318594-2-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Pietro Borrello says:
====================
tuntap: correctly initialize socket uid
sock_init_data() assumes that the `struct socket` passed in input is
contained in a `struct socket_alloc` allocated with sock_alloc().
However, tap_open() and tun_chr_open() pass a `struct socket` embedded
in a `struct tap_queue` and `struct tun_file` respectively, both
allocated with sk_alloc().
This causes a type confusion when issuing a container_of() with
SOCK_INODE() in sock_init_data() which results in assigning a wrong
sk_uid to the `struct sock` in input.
Due to the type confusion, both sockets happen to have their uid set
to 0, i.e. root.
While it will be often correct, as tuntap devices require
CAP_NET_ADMIN, it may not always be the case.
Not sure how widespread is the impact of this, it seems the socket uid
may be used for network filtering and routing, thus tuntap sockets may
be incorrectly managed.
Additionally, it seems a socket with an incorrect uid may be returned
to the vhost driver when issuing a get_socket() on a tuntap device in
vhost_net_set_backend().
Fix the bugs by adding and using sock_init_data_uid(), which
explicitly takes a uid as argument.
Signed-off-by: Pietro Borrello <borrello@diag.uniroma1.it>
---
Changes in v3:
- Fix the bug by defining and using sock_init_data_uid()
- Link to v2: https://lore.kernel.org/r/20230131-tuntap-sk-uid-v2-0-29ec15592813@diag.uniroma1.it
Changes in v2:
- Shorten and format comments
- Link to v1: https://lore.kernel.org/r/20230131-tuntap-sk-uid-v1-0-af4f9f40979d@diag.uniroma1.it
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sock_init_data() assumes that the `struct socket` passed in input is
contained in a `struct socket_alloc` allocated with sock_alloc().
However, tap_open() passes a `struct socket` embedded in a `struct
tap_queue` allocated with sk_alloc().
This causes a type confusion when issuing a container_of() with
SOCK_INODE() in sock_init_data() which results in assigning a wrong
sk_uid to the `struct sock` in input.
On default configuration, the type confused field overlaps with
padding bytes between `int vnet_hdr_sz` and `struct tap_dev __rcu
*tap` in `struct tap_queue`, which makes the uid of all tap sockets 0,
i.e., the root one.
Fix the assignment by using sock_init_data_uid().
Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.")
Signed-off-by: Pietro Borrello <borrello@diag.uniroma1.it>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|