Age | Commit message | Author
2023-01-30 | platform/x86/amd/pmf: Fix to update SPS thermals when power supply change | Shyam Sundar S K
Every power mode of static power slider has its own AC and DC power settings. When the power source changes from AC to DC, corresponding DC thermals were not updated from PMF config store and this leads the system to always run on AC power settings. Fix it by registering with power_supply notifier and apply DC settings upon getting notified by the power_supply handler. Fixes: da5ce22df5fe ("platform/x86/amd/pmf: Add support for PMF core layer") Suggested-by: Patil Rajesh Reddy <Patil.Reddy@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20230125095936.3292883-6-Shyam-sundar.S-k@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
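The idea behind this fix can be pictured with a small userspace sketch. It is not the driver's code: the limit values and the names `sps_limits`, `sps_update_thermals` and `sps_on_power_change` are hypothetical, and a plain function call stands in for the power_supply notifier callback.

```c
#include <stdbool.h>

/* Hypothetical slider modes; each one has separate AC and DC settings. */
enum sps_mode { SPS_PERFORMANCE, SPS_BALANCED, SPS_POWER_SAVER, SPS_MODE_COUNT };

struct sps_limits {
    unsigned int ac_mw; /* limit used on AC power (illustrative value) */
    unsigned int dc_mw; /* limit used on battery (illustrative value) */
};

static const struct sps_limits sps_table[SPS_MODE_COUNT] = {
    [SPS_PERFORMANCE] = { 30000, 25000 },
    [SPS_BALANCED]    = { 25000, 15000 },
    [SPS_POWER_SAVER] = { 15000, 10000 },
};

struct sps_state {
    enum sps_mode mode;
    unsigned int applied_mw; /* what was last "sent to firmware" */
};

/* Pick the limit for the current mode and the current power source. */
static void sps_update_thermals(struct sps_state *st, bool on_ac)
{
    const struct sps_limits *l = &sps_table[st->mode];

    st->applied_mw = on_ac ? l->ac_mw : l->dc_mw;
}

/* Stand-in for a notifier callback: re-apply limits on every source change. */
static void sps_on_power_change(struct sps_state *st, bool on_ac)
{
    sps_update_thermals(st, on_ac);
}
```

Without the callback, `applied_mw` would keep whatever value was chosen at init time, which is the "always run on AC power settings" symptom described above.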
2023-01-30 | platform/x86/amd/pmf: Fix to update SPS default pprof thermals | Shyam Sundar S K
By design PMF static slider will be set to BALANCED during init, but updating to corresponding thermal values from the PMF config store was missed, leading to improper settings getting propagated to PMFW. Fixes: 4c71ae414474 ("platform/x86/amd/pmf: Add support SPS PMF feature") Suggested-by: Patil Rajesh Reddy <Patil.Reddy@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20230125095936.3292883-5-Shyam-sundar.S-k@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-01-30 | platform/x86/amd/pmf: update to auto-mode limits only after AMT event | Shyam Sundar S K
Auto-mode thermal limits should be updated only after receiving the AMT event. But due to a bug in the older commit, these settings were getting applied during the auto-mode init. Fix this by removing amd_pmf_set_automode() during auto-mode initialization. Fixes: 3f5571d99524 ("platform/x86/amd/pmf: Add support for Auto mode feature") Suggested-by: Patil Rajesh Reddy <Patil.Reddy@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20230125095936.3292883-4-Shyam-sundar.S-k@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-01-30 | platform/x86/amd/pmf: Add helper routine to check pprof is balanced | Shyam Sundar S K
Add helper routine to check if the current platform profile is balanced mode and remove duplicate code occurrences. Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20230125095936.3292883-3-Shyam-sundar.S-k@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-01-30 | platform/x86/amd/pmf: Add helper routine to update SPS thermals | Shyam Sundar S K
Add helper routine to update the static slider information and remove the duplicate code occurrences after this change. Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20230125095936.3292883-2-Shyam-sundar.S-k@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-01-30 | fscache: Use clear_and_wake_up_bit() in fscache_create_volume_work() | Hou Tao
fscache_create_volume_work() uses wake_up_bit() to wake up the processes which are waiting for the completion of volume creation. According to comments in wake_up_bit() and waitqueue_active(), an extra smp_mb() is needed to guarantee the memory order between the FSCACHE_VOLUME_CREATING flag and waitqueue_active() before invoking wake_up_bit(). Fix it by using clear_and_wake_up_bit() to add the missing memory barrier. Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20230113115211.2895845-3-houtao@huaweicloud.com/ # v3
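The ordering requirement can be sketched in userspace with C11 atomics. This is an analogy under stated assumptions, not the kernel helper: `volume_sketch`, `CREATING_BIT` and `clear_and_wake` are hypothetical names, a waiter counter stands in for the wait queue, and a sequentially consistent fence plays the role of the missing barrier.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define CREATING_BIT 0UL /* hypothetical flag bit */

struct volume_sketch {
    atomic_ulong flags;   /* bit 0: "creation in progress" */
    atomic_int   waiters; /* stands in for "is the wait queue non-empty?" */
    int          wakeups; /* counts how often a wake-up was issued */
};

/* Clear the flag, then decide whether anybody needs waking. */
static bool clear_and_wake(struct volume_sketch *v)
{
    /* Clear the bit with release ordering. */
    atomic_fetch_and_explicit(&v->flags, ~(1UL << CREATING_BIT),
                              memory_order_release);

    /*
     * Full barrier: the cleared flag must be visible before we read the
     * waiter state. Without it, a waiter could see the old flag while we
     * see "no waiters", and the wake-up would be lost.
     */
    atomic_thread_fence(memory_order_seq_cst);

    if (atomic_load_explicit(&v->waiters, memory_order_relaxed) > 0) {
        v->wakeups++;
        return true;
    }
    return false;
}
```

The waiter side needs the mirror-image ordering (announce itself as a waiter, full barrier, then re-check the flag) for the pairing to be complete.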
2023-01-30 | fscache: Use wait_on_bit() to wait for the freeing of relinquished volume | Hou Tao
The freeing of a relinquished volume wakes up the pending volume acquisition by using wake_up_bit(). However, this is mismatched with the wait_var_event() used in fscache_wait_on_volume_collision(), so it never wakes up the waiter in the wait-queue, because the two functions operate on different wait-queues.

According to the implementation in fscache_wait_on_volume_collision(), if the wake-up of the pending acquisition is delayed longer than 20 seconds (e.g., due to the delay of on-demand fd closing), the first wait_var_event_timeout() will time out and the following wait_var_event() will hang forever, as shown below:

 FS-Cache: Potential volume collision new=00000024 old=00000022
 ......
 INFO: task mount:1148 blocked for more than 122 seconds.
       Not tainted 6.1.0-rc6+ #1
 task:mount state:D stack:0 pid:1148 ppid:1
 Call Trace:
  <TASK>
  __schedule+0x2f6/0xb80
  schedule+0x67/0xe0
  fscache_wait_on_volume_collision.cold+0x80/0x82
  __fscache_acquire_volume+0x40d/0x4e0
  erofs_fscache_register_volume+0x51/0xe0 [erofs]
  erofs_fscache_register_fs+0x19c/0x240 [erofs]
  erofs_fc_fill_super+0x746/0xaf0 [erofs]
  vfs_get_super+0x7d/0x100
  get_tree_nodev+0x16/0x20
  erofs_fc_get_tree+0x20/0x30 [erofs]
  vfs_get_tree+0x24/0xb0
  path_mount+0x2fa/0xa90
  do_mount+0x7c/0xa0
  __x64_sys_mount+0x8b/0xe0
  do_syscall_64+0x30/0x60
  entry_SYSCALL_64_after_hwframe+0x46/0xb0

Since wake_up_bit() is more selective, fix it by using wait_on_bit() instead of wait_var_event() to wait for the freeing of the relinquished volume. In addition, because waitqueue_active() is used in wake_up_bit() and clear_bit() doesn't imply any memory barrier, use clear_and_wake_up_bit() to add the missing memory barrier between cursor->flags and waitqueue_active().

Fixes: 62ab63352350 ("fscache: Implement volume registration")
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20230113115211.2895845-2-houtao@huaweicloud.com/ # v3
2023-01-30 | wifi: iwlwifi: mei: fix compilation errors in rfkill() | Gregory Greenman
The rfkill() callback was invoked with wrong parameters. It was missed since MEI is defined now as depending on BROKEN. Fix that. Fixes: d288067ede4b ("wifi: iwlwifi: mei: avoid blocking sap messages handling due to rtnl lock") Fixes: 5aa7ce31bd84 ("wifi: iwlwifi: mei: make sure ownership confirmed message is sent") Fixes: 95170a46b7dd ("wifi: iwlwifi: mei: don't send SAP commands if AMT is disabled") Link: https://lore.kernel.org/r/20230126222821.305122-2-gregory.greenman@intel.com Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
2023-01-30 | wifi: iwlwifi: mvm: Support STEP equalizer settings from BIOS. | Ayala Barazani
Read the STEP equalizer parameters from the BIOS during init and transfer it to the firmware. This table provides values to configure an equalizer at the transmitter that can be used to compensate for PCB channel attenuation. Signed-off-by: Ayala Barazani <ayala.barazani@intel.com> Link: https://lore.kernel.org/r/20230127002430.f25f871c5e17.I8390ab916c8f681229433ebc576ed37a594c6d30@changeid Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
2023-01-30 | wifi: iwlwifi: bump FW API to 74 for AX devices | Golan Ben Ami
Start supporting API version 74 for AX devices. Signed-off-by: Golan Ben Ami <golan.ben.ami@intel.com> Link: https://lore.kernel.org/r/20230127002430.80012ee4c5d6.I45ba1f8bf923d242ef2ffeb160d736120c8add65@changeid Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
2023-01-30 | wifi: iwlwifi: mvm: Reset rate index if rate is wrong | Mukesh Sisodiya
Setting rate index should not depend on net_ratelimit(). Fix that for the case of invalid rate. Signed-off-by: Mukesh Sisodiya <mukesh.sisodiya@intel.com> Link: https://lore.kernel.org/r/20230127002430.8eede67758bb.I585ab389e27d61153540b7cb5ebed66e21f765f0@changeid Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
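A minimal sketch of the pattern being fixed, assuming a log rate limiter that sometimes says "no": the reset of the index must sit outside the rate-limited branch. `SKETCH_RATE_COUNT`, `sketch_ratelimit` and `sanitize_rate_idx` are hypothetical names, not the driver's.

```c
#include <stdbool.h>
#include <stdio.h>

#define SKETCH_RATE_COUNT 12 /* hypothetical size of the rate table */

/* Stand-in for a log rate limiter: permits only the first *budget messages. */
static bool sketch_ratelimit(int *budget)
{
    if (*budget > 0) {
        (*budget)--;
        return true;
    }
    return false;
}

/* Return a usable rate index; only the warning is rate limited. */
static int sanitize_rate_idx(int idx, int *log_budget)
{
    if (idx < 0 || idx >= SKETCH_RATE_COUNT) {
        if (sketch_ratelimit(log_budget))
            fprintf(stderr, "invalid rate index %d, using 0\n", idx);
        idx = 0; /* reset happens whether or not the message was printed */
    }
    return idx;
}
```

Had `idx = 0` been placed inside the `if (sketch_ratelimit(...))` body, an invalid index would leak through whenever the limiter suppressed the message.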
2023-01-30 | wifi: iwlwifi: mvm: simplify by using SKB MAC header pointer | Mordechay Goodstein
Instead of calculating the offset to the 802.11 header based on radiotap bits and length, shorten the code path by always setting the MAC header in the skb and using skb_mac_header(). Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com> Link: https://lore.kernel.org/r/20230127002430.3ec5493934a4.I1d41a2af28588b5899fcd2402f8c4bd8cc29a12e@changeid Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
2023-01-30 | wifi: iwlwifi: mvm: add sniffer meta data APIs | Mordechay Goodstein
We add TSF overwrite for EHT MU/TB high and low, and add definitions for EHT Data 5 meta data. Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com> Link: https://lore.kernel.org/r/20230127002430.6729c0be66aa.I95ad94d5e137ec80010facd8ee57cd40461a0721@changeid Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
2023-01-30 | wifi: iwlwifi: rx: add sniffer support for EHT mode | Mordechay Goodstein
Start by adding a parsing option for all the new fields coming from FW and checking that we have the right version for parsing EHT. Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com> Link: https://lore.kernel.org/r/20230127002430.ba9b364fbacf.I469af2a07b3ff51cbd8d67e572478f4c56ce22ba@changeid Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
2023-01-30 | wifi: iwlwifi: mvm: always send nullfunc frames on MGMT queue | Gregory Greenman
Non-QOS nullfunc frames should be sent on MGMT queue similarly to the QOS nullfunc frames. It means that the corresponding TID should remain IWL_MAX_TID_COUNT. Make the condition more strict, so the TID won't be changed to IWL_TID_NON_QOS. Link: https://lore.kernel.org/r/20230127002430.a05bf77c9e29.I06262424878232b46fecd58743c889e4c3216bbf@changeid Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
2023-01-30 | wifi: iwlwifi: mvm: remove h from printk format specifier | Tom Rix
This change fixes the checkpatch warning described in commit cbacb5ab0aa0 ("docs: printk-formats: Stop encouraging use of unnecessary %h[xudi] and %hh[xudi]"). Standard integer promotion is already done, so %hx and %hhx are useless; do not encourage the use of %hh[xudi] or %h[xudi]. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20230127002430.a25158d58fd7.Ibfe217f12a63c1d5349218e74c4b802c70c13c7c@changeid Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
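The promotion argument can be checked in userspace with the standard `snprintf()`, as a sketch: arguments narrower than `int` are promoted before a variadic call, so for values that already fit the narrow type the `h`/`hh` modifiers change nothing. `same_output` is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Format the same narrow values with and without the h/hh modifiers. */
static bool same_output(unsigned char byte, unsigned short half)
{
    char with_h[32];
    char without_h[32];

    snprintf(with_h, sizeof(with_h), "%hhx %hx", byte, half);
    snprintf(without_h, sizeof(without_h), "%x %x", byte, half);

    return strcmp(with_h, without_h) == 0;
}
```

The modifiers only matter when a wider value is passed and truncation at print time is wanted, which is a different situation from the one this commit cleans up.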
2023-01-30 | wifi: iwlwifi: improve tag handling in iwl_request_firmware | Heiner Kallweit
We can remove the intermediary string conversion and use drv->fw_index in the final snprintf directly. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20230127002430.175bfffdf2f5.I7ec7a29b2d93a977cb0a39dbcc7c875032eb14b7@changeid Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
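As a rough illustration of the simplification, the two variants below build the same name, one via a temporary string and one by formatting the integer directly. The `"%s%d.ucode"` layout, the prefix used in the usage note, and the function names are assumptions for this sketch, not taken from the driver.

```c
#include <stdio.h>

/* Before (sketch): convert the index to a string first, then format again. */
static int fw_name_two_step(char *buf, size_t len, const char *prefix, int index)
{
    char tag[16]; /* large enough for any int plus the terminator */

    snprintf(tag, sizeof(tag), "%d", index);
    return snprintf(buf, len, "%s%s.ucode", prefix, tag);
}

/* After (sketch): let the final snprintf() format the integer itself. */
static int fw_name_direct(char *buf, size_t len, const char *prefix, int index)
{
    return snprintf(buf, len, "%s%d.ucode", prefix, index);
}
```

Called with a hypothetical prefix such as `"iwlwifi-example-"` and index 74, both variants write `iwlwifi-example-74.ucode`; the direct form just drops one buffer and one call.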
2023-01-30 | wifi: iwlwifi: mention the response structure in the kerneldoc | Emmanuel Grumbach
Add a comment to mention the structure used for the response for the flush command. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Link: https://lore.kernel.org/r/20230127002430.422c9fbac12c.I2da0954d1c62007b5f01faf06df3e4081e52204f@changeid Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
2023-01-30 | wifi: iwlwifi: mvm: add minimal EHT rate reporting | Johannes Berg
Now with all the prework, this is fairly simple, just report the new bandwidth and RX_ENC_EHT type in RX, and for now just do a minimal report of the EHT TLC rate in iwl_mvm_set_sta_rate(). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20230109130329.5f34d73d1f74.Ib27ae7bd23bc152d61021fd73aabdc76679b9fe4@changeid Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
2023-01-30 | xfrm: fix bug with DSCP copy to v6 from v4 tunnel | Christian Hopps
When copying the DSCP bits for decap-dscp into IPv6 don't assume the outer encap is always IPv6. Instead, as with the inner IPv4 case, copy the DSCP bits from the correctly saved "tos" value in the control block. Fixes: 227620e29509 ("[IPSEC]: Separate inner/outer mode processing on input") Signed-off-by: Christian Hopps <chopps@chopps.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
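The bit manipulation involved can be sketched as follows, assuming the usual layout in which the upper six bits of the IPv4 TOS byte (and of the IPv6 traffic class) carry the DSCP and the lower two carry ECN. `copy_dscp` and the mask names are hypothetical; the point is that the source of the DSCP bits is a saved byte, independent of whether the outer header was IPv4 or IPv6.

```c
#include <stdint.h>

#define SKETCH_DSCP_MASK 0xfcu /* upper six bits */
#define SKETCH_ECN_MASK  0x03u /* lower two bits */

/*
 * Build the inner traffic class: DSCP comes from the saved outer "tos"
 * value, ECN bits of the inner header are left untouched.
 */
static uint8_t copy_dscp(uint8_t saved_outer_tos, uint8_t inner_tclass)
{
    return (uint8_t)((saved_outer_tos & SKETCH_DSCP_MASK) |
                     (inner_tclass & SKETCH_ECN_MASK));
}
```

For example, an outer TOS of 0xb8 (DSCP 46) combined with an inner traffic class whose ECN bits are 01 yields 0xb9.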
2023-01-30 | net: phy: fix null dereference in phy_attach_direct | Colin Foster
Commit bc66fa87d4fd ("net: phy: Add link between phy dev and mac dev") introduced a link between net devices and phy devices. It fails to check whether dev is NULL, leading to a NULL dereference error. Fixes: bc66fa87d4fd ("net: phy: Add link between phy dev and mac dev") Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
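The shape of such a fix, sketched with hypothetical types (`netdev_sketch`, `phydev_sketch`, `attach_sketch`): the attach path must tolerate a missing net device and only touch it when it is present.

```c
#include <stddef.h>

struct netdev_sketch {
    int links; /* counts links created to this device */
};

struct phydev_sketch {
    struct netdev_sketch *attached; /* may legitimately be NULL */
};

/* Attach a PHY; create the device link only if there is a device. */
static int attach_sketch(struct netdev_sketch *dev, struct phydev_sketch *phy)
{
    phy->attached = dev;

    if (dev)            /* the check whose absence caused the crash */
        dev->links++;

    return 0;
}
```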
2023-01-30 | powerpc/64: Fix perf profiling asynchronous interrupt handlers | Nicholas Piggin
Interrupt entry sets the soft mask to IRQS_ALL_DISABLED to match the hard irq disabled state. So when should_hard_irq_enable() returns true because we want PMI interrupts in irq handlers, MSR[EE] is enabled but PMIs just get soft-masked. Fix this by clearing IRQS_PMI_DISABLED before enabling MSR[EE]. This also tidies some of the warnings, no need to duplicate them in both should_hard_irq_enable() and do_hard_irq_enable(). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230121100156.2824054-1-npiggin@gmail.com
2023-01-30 | powerpc/64s: Fix local irq disable when PMIs are disabled | Nicholas Piggin
When PMI interrupts are soft-masked, local_irq_save() will clear the PMI mask bit, allowing PMIs in and causing a race condition. This causes a deadlock in native_hpte_insert via hash_preload, which depends on PMIs being disabled since commit 8b91cee5eadd ("powerpc/64s/hash: Make hash faults work in NMI context"). native_hpte_insert calls local_irq_save(). It's possible the lpar hash code is also affected when tracing is enabled because __trace_hcall_entry() calls local_irq_save(). Fix this by making arch_local_irq_save() _or_ the IRQS_DISABLED bit into the mask. This was found with the stress_hpt option with a kbuild workload running together with `perf record -g`. Fixes: f442d004806e ("powerpc/64s: Add support to mask perf interrupts and replay them") Fixes: 8b91cee5eadd ("powerpc/64s/hash: Make hash faults work in NMI context") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Just take the fix without the new warning] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230121095352.2823517-1-npiggin@gmail.com
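The difference between overwriting the soft mask and OR-ing into it can be shown with plain integers. This is a sketch only: the two bit values are assumed for illustration, and `irq_save_overwrite` and `irq_save_or` are hypothetical stand-ins for the old and new behaviour of the save operation.

```c
/* Assumed bit assignments for the sketch. */
#define SK_IRQS_DISABLED     1u
#define SK_IRQS_PMI_DISABLED 2u

/* Old behaviour (sketch): set the mask to exactly "disabled". */
static unsigned int irq_save_overwrite(unsigned int *mask)
{
    unsigned int old = *mask;

    *mask = SK_IRQS_DISABLED; /* silently drops the PMI bit if it was set */
    return old;
}

/* New behaviour (sketch): add the "disabled" bit, keep everything else. */
static unsigned int irq_save_or(unsigned int *mask)
{
    unsigned int old = *mask;

    *mask = old | SK_IRQS_DISABLED;
    return old;
}
```

Starting from a mask with both bits set, the first variant leaves only the plain "disabled" bit, which is the window in which a PMI can arrive; the second variant leaves the mask unchanged.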
2023-01-30 | Merge branch 'devlink-next' | David S. Miller
Jakub Kicinski says: ==================== devlink: fix reload notifications and remove features First two patches adjust notifications during devlink reload. The last patch removes no longer needed devlink features. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> Reviewed-by: Jakub Kicinski <kuba@kernel.org>
2023-01-30 | devlink: remove devlink features | Jiri Pirko
Devlink features were introduced to disallow devlink reload calls of userspace before the devlink was fully initialized. The reason for this workaround was the fact that devlink reload was originally called without devlink instance lock held. However, with recent changes that converted devlink reload to be performed under devlink instance lock, this is redundant so remove devlink features entirely. Note that mlx5 used this to enable devlink reload conditionally only when device didn't act as multi port slave. Move the multi port check into mlx5_devlink_reload_down() callback alongside with the other checks preventing the device from reload in certain states. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-30 | devlink: send objects notifications during devlink reload | Jiri Pirko
Currently, the notifications are only sent for params. People who introduced other objects forgot to add the reload notifications here. To make sure all notifications happen according to existing comment, benefit from existence of devlink_notify_register/unregister() helpers and use them in reload code. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-30 | devlink: move devlink reload notifications back in between _down() and _up() calls | Jiri Pirko
This effectively reverts commit 05a7f4a8dff1 ("devlink: Break parameter notification sequence to be before/after unload/load driver"). The cited commit resolved a problem in the mlx5 params implementation, where param notification code accessed memory previously freed during reload. Now that params can be registered and unregistered while the devlink instance is registered, mlx5 code unregisters the problematic param during devlink reload. The fix is therefore no longer needed. Current behavior is a bit problematic, as it sends DEL notifications even in the potential case when the reload_down() call fails, which might confuse a userspace notifications listener. So move the reload notifications back where they were originally, in between the reload_down() and reload_up() calls. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
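The ordering described above can be sketched with an event log. The types and names (`reload_event`, `reload_log`, `reload_sketch`) are hypothetical, and a single notification step stands in for whatever notifications the real code sends between the two calls.

```c
#include <stdbool.h>

enum reload_event { EV_RELOAD_DOWN, EV_NOTIFY, EV_RELOAD_UP };

struct reload_log {
    int count;
    enum reload_event events[4];
};

static void log_event(struct reload_log *log, enum reload_event ev)
{
    log->events[log->count++] = ev;
}

/* Reload with notifications placed between the down and up steps. */
static int reload_sketch(struct reload_log *log, bool down_fails)
{
    log_event(log, EV_RELOAD_DOWN);
    if (down_fails)
        return -1; /* nothing was torn down, so nothing is announced */

    log_event(log, EV_NOTIFY);
    log_event(log, EV_RELOAD_UP);
    return 0;
}
```

With this placement a failed down step produces no notification at all, so a listener never sees objects "deleted" that are in fact still there.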
2023-01-30 | Merge branch 'sparx5-ES2-VCAP-support' | David S. Miller
Steen Hegelund says:

====================
Adding Sparx5 ES2 VCAP support

This provides the Egress Stage 2 (ES2) VCAP (Versatile Content-Aware Processor) support for the Sparx5 platform.

The ES2 VCAP is an Egress Access Control VCAP that uses frame keyfields and previously classified keyfields to apply e.g. policing, trapping or mirroring to frames.

The ES2 VCAP has 2 lookups and they are accessible with a TC chain id:

- chain 20000000: ES2 Lookup 0
- chain 20100000: ES2 Lookup 1

Like the other Sparx5 VCAPs, the ES2 VCAP has its own lookup/port keyset configuration that decides which keys will be used for matching on which traffic type.

The ES2 VCAP has these traffic classifications:

- IPv4 frames
- IPv6 frames
- Other frames

The ES2 VCAP can match on an ISDX key (Ingress Service Index) as one of the frame metadata keyfields. The IS0 VCAP can update this key using its actions, and this allows an IS0 VCAP rule to be linked to an ES2 rule.

This is similar to how the IS0 VCAP and the IS2 VCAP use the PAG (Policy Association Group) keyfield to link rules.

From user space this is exposed via "chain offsets", so an IS0 rule with a "goto chain 20000015" action will use an ISDX key value of 15 to link to a rule in the ES2 VCAP attached to the same chain id.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
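The chain id arithmetic in the cover letter can be worked through in a few lines. This sketch infers the spacing between lookups (100000) from the two chain ids listed above; the macro and function names are hypothetical and the real driver may structure this differently.

```c
#define SK_ES2_CHAIN_BASE 20000000 /* chain id of ES2 Lookup 0, per the text */
#define SK_LOOKUP_STRIDE  100000   /* inferred: 20100000 - 20000000 */
#define SK_ES2_LOOKUPS    2

/*
 * Split an ES2 chain id into its lookup number and chain offset.
 * Returns 0 on success, -1 if the id is outside the ES2 range.
 */
static int es2_decode_chain(int chain, int *lookup, int *offset)
{
    int rel = chain - SK_ES2_CHAIN_BASE;

    if (rel < 0 || rel >= SK_ES2_LOOKUPS * SK_LOOKUP_STRIDE)
        return -1;

    *lookup = rel / SK_LOOKUP_STRIDE;
    *offset = rel % SK_LOOKUP_STRIDE; /* used as the ISDX value when linking */
    return 0;
}
```

For the example in the text, chain 20000015 decodes to lookup 0 with offset 15, matching the ISDX key value of 15.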
2023-01-30 | net: microchip: sparx5: Add KUNIT tests for enabling/disabling chains | Steen Hegelund
This enhances the KUNIT test of the VCAP API with tests of the chaining functionality. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-30 | net: microchip: sparx5: Add TC support for the ES2 VCAP | Steen Hegelund
This enables the TC command to use the Sparx5 ES2 VCAP, and provides a new ES2 ethertype table and handling of rule links between IS0 and ES2. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-30 | net: microchip: sparx5: Add ingress information to VCAP instance | Steen Hegelund
This allows the check of the goto action to be specific to the ingress and egress VCAP instances. The debugfs support is also updated to show this information. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-30 | net: microchip: sparx5: Add ES2 VCAP keyset configuration for Sparx5 | Steen Hegelund
This adds the ES2 VCAP port keyset configuration for Sparx5 and also updates the debugFS support to show the keyset configuration and the egress port mask. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-30 | net: microchip: sparx5: Add ES2 VCAP model and updated KUNIT VCAP model | Steen Hegelund
This provides the VCAP model for the Sparx5 ES2 (Egress Stage 2) VCAP. This VCAP provides tagging and remarking functionality. This also renames a VCAP keyfield: VCAP_KF_MIRROR_ENA becomes VCAP_KF_MIRROR_PROBE, as the first name was caused by a mistake in the model transformation. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-30 | net: microchip: sparx5: Improve error message when parsing CVLAN filter | Steen Hegelund
This improves the error message when a TC filter with CVLAN tag is used and the selected VCAP instance does not support this. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-30 | net: microchip: sparx5: Improve the IP frame key match for IPv6 frames | Steen Hegelund
This ensures that it will be possible for a VCAP rule to distinguish IPv6 frames from non-IP frames, as the IS0 keyset usually selected for the IPv6 traffic class (7TUPLE) does not offer a key that specifies IPv6 directly: only non-IPv4. The IP_SNAP key ensures that we select (at least) IP frames. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-30 | net: microchip: sparx5: Add support for getting keysets without a type id | Steen Hegelund
When there is only one keyset available for a certain VCAP rule size, the particular keyset does not need a type id when encoded in the VCAP Hardware. This provides support for getting a keyset from a rule, when this is the case: only one keyset fits this rule size. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-30 | Merge tag 'batadv-next-pullrequest-20230127' of git://git.open-mesh.org/linux-merge | David S. Miller
Simon Wunderlich says:

====================
This feature/cleanup patchset includes the following patches:

 - bump version strings, by Simon Wunderlich
 - drop prandom.h includes, by Sven Eckelmann
 - fix mailing list address, by Sven Eckelmann
 - multicast feature preparation, by Linus Lüssing (2 patches)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-30 | netrom: Fix use-after-free caused by accept on already connected socket | Hyunwoo Kim
If you call listen() and accept() on an already connect()ed AF_NETROM socket, accept() can successfully connect. This is because when the peer socket sends data to sendmsg, the skb with its own sk stored in the connected socket's sk->sk_receive_queue is connected, and nr_accept() dequeues the skb waiting in the sk->sk_receive_queue.

As a result, nr_accept() allocates and returns a sock with the sk of the parent AF_NETROM socket.

And here use-after-free can happen through complex race conditions:
```
cpu0                                    cpu1
                                        1. socket_2 = socket(AF_NETROM)
                                                .
                                                .
                                           listen(socket_2)
                                           accepted_socket = accept(socket_2)
2. socket_1 = socket(AF_NETROM)
     nr_create()    // sk refcount : 1
   connect(socket_1)
                                        3. write(accepted_socket)
                                             nr_sendmsg()
                                             nr_output()
                                             nr_kick()
                                             nr_send_iframe()
                                             nr_transmit_buffer()
                                             nr_route_frame()
                                             nr_loopback_queue()
                                             nr_loopback_timer()
                                             nr_rx_frame()
                                             nr_process_rx_frame(sk, skb);    // sk : socket_1's sk
                                             nr_state3_machine()
                                             nr_queue_rx_frame()
                                             sock_queue_rcv_skb()
                                             sock_queue_rcv_skb_reason()
                                             __sock_queue_rcv_skb()
                                             __skb_queue_tail(list, skb);    // list : socket_1's sk->sk_receive_queue
4. listen(socket_1)
     nr_listen()
   uaf_socket = accept(socket_1)
     nr_accept()
     skb_dequeue(&sk->sk_receive_queue);
                                        5. close(accepted_socket)
                                             nr_release()
                                             nr_write_internal(sk, NR_DISCREQ)
                                             nr_transmit_buffer()    // NR_DISCREQ
                                             nr_route_frame()
                                             nr_loopback_queue()
                                             nr_loopback_timer()
                                             nr_rx_frame()    // sk : socket_1's sk
                                             nr_process_rx_frame()    // NR_STATE_3
                                             nr_state3_machine()    // NR_DISCREQ
                                             nr_disconnect()
                                             nr_sk(sk)->state = NR_STATE_0;
6. close(socket_1)    // sk refcount : 3
     nr_release()    // NR_STATE_0
     sock_put(sk);    // sk refcount : 0
     sk_free(sk);
                                           close(uaf_socket)
                                             nr_release()
                                             sock_hold(sk);    // UAF
```

KASAN report by syzbot:
```
BUG: KASAN: use-after-free in nr_release+0x66/0x460 net/netrom/af_netrom.c:520
Write of size 4 at addr ffff8880235d8080 by task syz-executor564/5128

Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106
 print_address_description mm/kasan/report.c:306 [inline]
 print_report+0x15e/0x461 mm/kasan/report.c:417
 kasan_report+0xbf/0x1f0 mm/kasan/report.c:517
 check_region_inline mm/kasan/generic.c:183 [inline]
 kasan_check_range+0x141/0x190 mm/kasan/generic.c:189
 instrument_atomic_read_write include/linux/instrumented.h:102 [inline]
 atomic_fetch_add_relaxed include/linux/atomic/atomic-instrumented.h:116 [inline]
 __refcount_add include/linux/refcount.h:193 [inline]
 __refcount_inc include/linux/refcount.h:250 [inline]
 refcount_inc include/linux/refcount.h:267 [inline]
 sock_hold include/net/sock.h:775 [inline]
 nr_release+0x66/0x460 net/netrom/af_netrom.c:520
 __sock_release+0xcd/0x280 net/socket.c:650
 sock_close+0x1c/0x20 net/socket.c:1365
 __fput+0x27c/0xa90 fs/file_table.c:320
 task_work_run+0x16f/0x270 kernel/task_work.c:179
 exit_task_work include/linux/task_work.h:38 [inline]
 do_exit+0xaa8/0x2950 kernel/exit.c:867
 do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
 get_signal+0x21c3/0x2450 kernel/signal.c:2859
 arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
 exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
 exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
 syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
 do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f6c19e3c9b9
Code: Unable to access opcode bytes at 0x7f6c19e3c98f.
RSP: 002b:00007fffd4ba2ce8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
RAX: 0000000000000116 RBX: 0000000000000003 RCX: 00007f6c19e3c9b9
RDX: 0000000000000318 RSI: 00000000200bd000 RDI: 0000000000000006
RBP: 0000000000000003 R08: 000000000000000d R09: 000000000000000d
R10: 0000000000000000 R11: 0000000000000246 R12: 000055555566a2c0
R13: 0000000000000011 R14: 0000000000000000 R15: 0000000000000000
 </TASK>

Allocated by task 5128:
 kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
 kasan_set_track+0x25/0x30 mm/kasan/common.c:52
 ____kasan_kmalloc mm/kasan/common.c:371 [inline]
 ____kasan_kmalloc mm/kasan/common.c:330 [inline]
 __kasan_kmalloc+0xa3/0xb0 mm/kasan/common.c:380
 kasan_kmalloc include/linux/kasan.h:211 [inline]
 __do_kmalloc_node mm/slab_common.c:968 [inline]
 __kmalloc+0x5a/0xd0 mm/slab_common.c:981
 kmalloc include/linux/slab.h:584 [inline]
 sk_prot_alloc+0x140/0x290 net/core/sock.c:2038
 sk_alloc+0x3a/0x7a0 net/core/sock.c:2091
 nr_create+0xb6/0x5f0 net/netrom/af_netrom.c:433
 __sock_create+0x359/0x790 net/socket.c:1515
 sock_create net/socket.c:1566 [inline]
 __sys_socket_create net/socket.c:1603 [inline]
 __sys_socket_create net/socket.c:1588 [inline]
 __sys_socket+0x133/0x250 net/socket.c:1636
 __do_sys_socket net/socket.c:1649 [inline]
 __se_sys_socket net/socket.c:1647 [inline]
 __x64_sys_socket+0x73/0xb0 net/socket.c:1647
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 5128:
 kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
 kasan_set_track+0x25/0x30 mm/kasan/common.c:52
 kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:518
 ____kasan_slab_free mm/kasan/common.c:236 [inline]
 ____kasan_slab_free+0x13b/0x1a0 mm/kasan/common.c:200
 kasan_slab_free include/linux/kasan.h:177 [inline]
 __cache_free mm/slab.c:3394 [inline]
 __do_kmem_cache_free mm/slab.c:3580 [inline]
 __kmem_cache_free+0xcd/0x3b0 mm/slab.c:3587
 sk_prot_free net/core/sock.c:2074 [inline]
 __sk_destruct+0x5df/0x750 net/core/sock.c:2166
 sk_destruct net/core/sock.c:2181 [inline]
 __sk_free+0x175/0x460 net/core/sock.c:2192
 sk_free+0x7c/0xa0 net/core/sock.c:2203
 sock_put include/net/sock.h:1991 [inline]
 nr_release+0x39e/0x460 net/netrom/af_netrom.c:554
 __sock_release+0xcd/0x280 net/socket.c:650
 sock_close+0x1c/0x20 net/socket.c:1365
 __fput+0x27c/0xa90 fs/file_table.c:320
 task_work_run+0x16f/0x270 kernel/task_work.c:179
 exit_task_work include/linux/task_work.h:38 [inline]
 do_exit+0xaa8/0x2950 kernel/exit.c:867
 do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
 get_signal+0x21c3/0x2450 kernel/signal.c:2859
 arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
 exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
 exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
 syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
 do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
```

To fix this issue, nr_listen() returns -EINVAL for sockets that successfully nr_connect().

Reported-by: syzbot+caa188bdfc1eeafeb418@syzkaller.appspotmail.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
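The guard described in the last paragraph can be sketched as a state check at the top of the listen path. The types and names here (`sock_sketch`, `listen_sketch`, the two-state enum) are hypothetical simplifications, not the netrom code.

```c
#include <errno.h>

enum sock_state_sketch { SKETCH_UNCONNECTED, SKETCH_CONNECTED };

struct sock_sketch {
    enum sock_state_sketch state;
    int listening;
};

/* Refuse to turn an already connected socket into a listener. */
static int listen_sketch(struct sock_sketch *s)
{
    if (s->state != SKETCH_UNCONNECTED)
        return -EINVAL;

    s->listening = 1;
    return 0;
}
```

With listen rejected on a connected socket, step 4 of the race above can no longer reach the accept path, so the skb queued in step 3 is never mistaken for an incoming connection.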
2023-01-30 | net: bcmgenet: Add a check for oversized packets | Florian Fainelli
Occasionally we may get oversized packets from the hardware which exceed the nominal 2KiB buffer size we allocate SKBs with. Add an early check which drops the packet to avoid invoking skb_over_panic() and move on to processing the next packet. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
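An early length check of this kind might look like the sketch below. The 2048-byte limit mirrors the 2KiB buffer size mentioned above; `rx_stats_sketch`, `rx_accept` and the counter names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define SK_RX_BUF_LENGTH 2048 /* 2KiB receive buffer, per the text */

struct rx_stats_sketch {
    unsigned long received;
    unsigned long dropped;
};

/*
 * Decide whether a received frame fits the preallocated buffer.
 * Oversized frames are counted and dropped before any copy happens.
 */
static bool rx_accept(size_t len, struct rx_stats_sketch *st)
{
    if (len > SK_RX_BUF_LENGTH) {
        st->dropped++;
        return false; /* caller moves on to the next packet */
    }

    st->received++;
    return true;
}
```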
2023-01-30 | net: stmmac: do not stop RX_CLK in Rx LPI state for qcs404 SoC | Andrey Konovalov
Currently in phy_init_eee() the driver unconditionally configures the PHY to stop RX_CLK after entering Rx LPI state. This causes an LPI interrupt storm on my qcs404-base board. Change the PHY initialization so that for "qcom,qcs404-ethqos" compatible device RX_CLK continues to run even in Rx LPI state. Signed-off-by: Andrey Konovalov <andrey.konovalov@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-30 | fec: convert to gpio descriptor | Arnd Bergmann
The driver can be trivially converted, as it only triggers the gpio pin briefly to do a reset, and it already only supports DT. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-30 | net: mdio: mux-meson-g12a: use __clk_is_enabled to simplify the code | Heiner Kallweit
By using __clk_is_enabled() we can avoid defining a separate variable for tracking whether the enable counter is zero. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-30 | powerpc/kvm: Fix unannotated intra-function call warning | Sathvika Vasireddy
objtool throws the following warning: arch/powerpc/kvm/booke.o: warning: objtool: kvmppc_fill_pt_regs+0x30: unannotated intra-function call Fix the warning by setting the value of 'nip' using the _THIS_IP_ macro, without using an assembly bl/mflr sequence to save the instruction pointer. Reported-by: kernel test robot <lkp@intel.com> Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230128124158.1066251-1-sv@linux.ibm.com
2023-01-30powerpc/85xx: Fix unannotated intra-function call warningSathvika Vasireddy
objtool throws the following warning: arch/powerpc/kernel/head_85xx.o: warning: objtool: .head.text+0x1a6c: unannotated intra-function call Fix the warning by annotating KernelSPE symbol with SYM_FUNC_START_LOCAL and SYM_FUNC_END macros. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230128124138.1066176-1-sv@linux.ibm.com
2023-01-29Merge branch 'Support bpf trampoline for s390x'Alexei Starovoitov
Ilya Leoshkevich says: ==================== v2: https://lore.kernel.org/bpf/20230128000650.1516334-1-iii@linux.ibm.com/#t v2 -> v3: - Make __arch_prepare_bpf_trampoline static. (Reported-by: kernel test robot <lkp@intel.com>) - Support both old- and new- style map definitions in sk_assign. (Alexei) - Trim DENYLIST.s390x. (Alexei) - Adjust s390x vmlinux path in vmtest.sh. - Drop merged fixes. v1: https://lore.kernel.org/bpf/20230125213817.1424447-1-iii@linux.ibm.com/#t v1 -> v2: - Fix core_read_macros, sk_assign, test_profiler, test_bpffs (24/31; I'm not quite happy with the fix, but don't have better ideas), and xdp_synproxy. (Andrii) - Prettify liburandom_read and verify_pkcs7_sig fixes. (Andrii) - Fix bpf_usdt_arg using barrier_var(); prettify barrier_var(). (Andrii) - Change BPF_MAX_TRAMP_LINKS to enum and query it using BTF. (Andrii) - Improve bpf_jit_supports_kfunc_call() description. (Alexei) - Always check sign_extend() return value. - Cc: Alexander Gordeev. Hi, This series implements poke, trampoline, kfunc, and mixing subprogs and tailcalls on s390x. The following failures still remain: #82 get_stack_raw_tp:FAIL get_stack_print_output:FAIL:user_stack corrupted user stack Known issue: We cannot reliably unwind userspace on s390x without DWARF. #101 ksyms_module:FAIL address of kernel function bpf_testmod_test_mod_kfunc is out of range Known issue: Kernel and modules are too far away from each other on s390x. #190 stacktrace_build_id:FAIL Known issue: We cannot reliably unwind userspace on s390x without DWARF. #281 xdp_metadata:FAIL See patch 6. None of these seem to be due to the new changes. Best regards, Ilya ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-29selftests/bpf: Trim DENYLIST.s390xIlya Leoshkevich
Now that trampoline is implemented, enable a number of tests on s390x. 18 of the remaining failures have to do with either lack of rethook (fixed by [1]) or syscall symbols missing from BTF (fixed by [2]). Do not re-classify the remaining failures for now; wait until the s390/for-next fixes are merged and re-classify only the remaining few. [1] https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/commit/?h=for-next&id=1a280f48c0e403903cf0b4231c95b948e664f25a [2] https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/commit/?h=for-next&id=2213d44e140f979f4b60c3c0f8dd56d151cc8692 Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230129190501.1624747-9-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-29selftests/bpf: Fix s390x vmlinux pathIlya Leoshkevich
After commit edd4a8667355 ("s390/boot: get rid of startup archive") there is no more compressed/ subdirectory. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230129190501.1624747-8-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-29s390/bpf: Implement bpf_jit_supports_kfunc_call()Ilya Leoshkevich
Implement calling kernel functions from eBPF. In general, the eBPF ABI is fairly close to that of s390x, with one important difference: on s390x callers should sign-extend signed arguments. Handle that by using information returned by bpf_jit_find_kfunc_model(). Here is an example of how sign extension works. Suppose we need to call the following function from BPF: ; long noinline bpf_kfunc_call_test4(signed char a, short b, int c, long d) 0000000000936a78 <bpf_kfunc_call_test4>: 936a78: c0 04 00 00 00 00 jgnop bpf_kfunc_call_test4 ; return (long)a + (long)b + (long)c + d; 936a7e: b9 08 00 45 agr %r4,%r5 936a82: b9 08 00 43 agr %r4,%r3 936a86: b9 08 00 24 agr %r2,%r4 936a8a: c0 f4 00 1e 3b 27 jg <__s390_indirect_jump_r14> As per the s390x ABI, bpf_kfunc_call_test4() has the right to assume that a, b and c are sign-extended by the caller, which results in using 64-bit additions (agr) without any additional conversions. Without sign extension we would have the following on the JITed code side: ; tmp = bpf_kfunc_call_test4(-3, -30, -200, -1000); ; 5: b4 10 00 00 ff ff ff fd w1 = -3 0x3ff7fdcdad4: llilf %r2,0xfffffffd ; 6: b4 20 00 00 ff ff ff e2 w2 = -30 0x3ff7fdcdada: llilf %r3,0xffffffe2 ; 7: b4 30 00 00 ff ff ff 38 w3 = -200 0x3ff7fdcdae0: llilf %r4,0xffffff38 ; 8: b7 40 00 00 ff ff fc 18 r4 = -1000 0x3ff7fdcdae6: lgfi %r5,-1000 0x3ff7fdcdaec: mvc 64(4,%r15),160(%r15) 0x3ff7fdcdaf2: lgrl %r1,bpf_kfunc_call_test4@GOT 0x3ff7fdcdaf8: brasl %r14,__s390_indirect_jump_r1 The first 3 llilfs are 32-bit loads that need to be sign-extended to 64 bits. Note: at the moment bpf_jit_find_kfunc_model() does not seem to play nicely with XDP metadata functions: add_kfunc_call() adds an "abstract" bpf_*() version to kfunc_btf_tab, but then fixup_kfunc_call() puts the concrete version into insn->imm, which bpf_jit_find_kfunc_model() cannot find. But this seems to be a common code problem. 
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230129190501.1624747-7-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
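The effect of the missing sign extension can be reproduced in portable C. The sketch below is a model, not the JIT code: add_regs() imitates a callee that adds full 64-bit registers the way the agr sequence does, while zero_extend32() and sign_extend32() are hypothetical helpers standing in for a zero-extending 32-bit load (as with llilf) and a sign-extending one (as with lgfi).

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extending 32-bit load: upper 32 bits of the register become 0. */
static int64_t zero_extend32(int32_t v)
{
    return (int64_t)(uint32_t)v;
}

/* Sign-extending 32-bit load: upper 32 bits copy the sign bit. */
static int64_t sign_extend32(int32_t v)
{
    return (int64_t)v;
}

/*
 * Model of the callee: it adds whole 64-bit registers and trusts the
 * caller to have sign-extended the narrow arguments beforehand.
 */
static int64_t add_regs(int64_t r2, int64_t r3, int64_t r4, int64_t r5)
{
    return r2 + r3 + r4 + r5;
}

/* Caller that forgets to sign-extend the three 32-bit arguments. */
static int64_t call_without_extension(void)
{
    return add_regs(zero_extend32(-3), zero_extend32(-30),
                    zero_extend32(-200), -1000);
}

/* Caller that follows the ABI and sign-extends them. */
static int64_t call_with_extension(void)
{
    return add_regs(sign_extend32(-3), sign_extend32(-30),
                    sign_extend32(-200), -1000);
}
```

With sign extension the sum is -1233, as expected. Without it, each of the three narrow arguments contributes an extra 2^32, so the callee computes 3 * 2^32 - 1233 = 12884900655 instead.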
2023-01-29s390/bpf: Implement bpf_jit_supports_subprog_tailcalls()Ilya Leoshkevich
Allow mixing subprogs and tail calls by passing the current tail call count to subprogs. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230129190501.1624747-6-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-29s390/bpf: Implement arch_prepare_bpf_trampoline()Ilya Leoshkevich
arch_prepare_bpf_trampoline() is used for direct attachment of eBPF programs to various places, bypassing kprobes. It's responsible for calling a number of eBPF programs before, instead of, and/or after whatever they are attached to. Add an s390x implementation, paying attention to the following: - Reuse the existing JIT infrastructure, where possible. - Like the existing JIT, prefer making multiple passes instead of backpatching. Currently 2 passes are enough. If a literal pool is introduced, this needs to be raised to 3. However, at the moment adding a literal pool only makes the code larger. If branch shortening is introduced, the number of passes needs to be increased even further. - Support both regular and ftrace calling conventions, depending on the trampoline flags. - Use expolines for indirect calls. - Handle the mismatch between the eBPF and the s390x ABIs. - Sign-extend fmod_ret return values. invoke_bpf_prog() produces about 120 bytes; it might be possible to slightly optimize this, but reaching 50 bytes, like on x86_64, looks unrealistic: just loading cookie, __bpf_prog_enter, bpf_func, insnsi and __bpf_prog_exit as literals already takes at least 5 * 12 = 60 bytes, and we can't use relative addressing for most of them. Therefore, lower BPF_MAX_TRAMP_LINKS on s390x. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230129190501.1624747-5-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
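The "multiple passes instead of backpatching" point can be illustrated with a toy emitter. This is a minimal sketch under stated assumptions, not the s390x trampoline code: struct emitter, emit_byte(), emit_all(), generate() and the opcode bytes are all made up for the example. The first pass runs with no output buffer and only learns sizes and label offsets; the second pass emits the same sequence with those offsets already known, so no emitted byte ever has to be patched afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct emitter {
    uint8_t *buf;      /* NULL during the sizing pass */
    size_t   pos;      /* current offset into the output */
    size_t   exit_off; /* offset of the exit label, learned in pass 1 */
};

/* Write one byte if a buffer is attached; always advance the offset. */
static void emit_byte(struct emitter *e, uint8_t b)
{
    if (e->buf)
        e->buf[e->pos] = b;
    e->pos++;
}

/* Toy program: a forward jump to the exit label, a filler, the exit. */
static void emit_all(struct emitter *e)
{
    e->pos = 0;
    emit_byte(e, 0xA7);                 /* hypothetical jump opcode */
    emit_byte(e, (uint8_t)e->exit_off); /* stale in pass 1, final in pass 2 */
    emit_byte(e, 0x07);                 /* hypothetical filler instruction */
    e->exit_off = e->pos;               /* record where the label lands */
    emit_byte(e, 0xFE);                 /* hypothetical exit opcode */
}

/* Run both passes; returns the number of bytes written to `out`. */
static size_t generate(uint8_t *out, size_t cap)
{
    struct emitter e = { NULL, 0, 0 };

    emit_all(&e);           /* pass 1: compute size and label offsets */
    assert(e.pos <= cap);   /* the size is known before any byte is written */

    e.buf = out;
    emit_all(&e);           /* pass 2: emit with the final offsets */
    return e.pos;
}
```

Two passes suffice here because no instruction's size depends on a label offset. If it did, as it would with branch shortening, the offsets could shift between passes and more iterations would be needed, which is the trade-off the commit message describes.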