Age | Commit message (Collapse) | Author |
|
Jann Horn points out that epoll is decrementing the ep refcount and then
doing a
mutex_unlock(&ep->mtx);
afterwards. That's very wrong, because it can lead to a use-after-free.
That pattern is actually fine for the very last reference, because the
code in question will delay the actual call to "ep_free(ep)" until after
it has unlocked the mutex.
But it's wrong for the much subtler "next to last" case when somebody
*else* may also be dropping their reference and free the ep while we're
still using the mutex.
Note that this is true even if that other user is also using the same ep
mutex: mutexes, unlike spinlocks, can not be used for object ownership,
even if they guarantee mutual exclusion.
A mutex "unlock" operation is not atomic, and as one user is still
accessing the mutex as part of unlocking it, another user can come in
and get the now released mutex and free the data structure while the
first user is still cleaning up.
See our mutex documentation in Documentation/locking/mutex-design.rst,
in particular the section [1] about semantics:
"mutex_unlock() may access the mutex structure even after it has
internally released the lock already - so it's not safe for
another context to acquire the mutex and assume that the
mutex_unlock() context is not using the structure anymore"
So if we drop our ep ref before the mutex unlock, but we weren't the
last one, we may then unlock the mutex, another user comes in, drops
_their_ reference and releases the 'ep' as it now has no users - all
while the mutex_unlock() is still accessing it.
Fix this by simply moving the ep refcount dropping to outside the mutex:
the refcount itself is atomic, and doesn't need mutex protection (that's
the whole _point_ of refcounts: unlike mutexes, they are inherently
about object lifetimes).
Reported-by: Jann Horn <jannh@google.com>
Link: https://docs.kernel.org/locking/mutex-design.html#semantics [1]
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
- Fix bogus KASAN splat on EFI runtime stack
- Select JUMP_LABEL unconditionally to avoid boot failure with pKVM and
the legacy implementation of static keys
- Avoid touching GCS registers when 'arm64.nogcs' has been passed on
the command-line
- Move a 'cpumask_t' off the stack in smp_send_stop()
- Don't advertise SME-related hwcaps to userspace when ID_AA64PFR1_EL1
indicates that SME is not implemented
- Always check the VMA when handling an Overlay fault
- Avoid corrupting TCR2_EL1 during boot
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/mm: Drop wrong writes into TCR2_EL1
arm64: poe: Handle spurious Overlay faults
arm64: Filter out SME hwcaps when FEAT_SME isn't implemented
arm64: move smp_send_stop() cpu mask off stack
arm64/gcs: Don't try to access GCS registers if arm64.nogcs is enabled
arm64: Unconditionally select CONFIG_JUMP_LABEL
arm64: efi: Fix KASAN false positive for EFI runtime stack
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
- Mark som pins as invalid for IRQ use in the Qualcomm driver
- Fix up the use of device properties on the MA35DX Nuvoton, apparently
something went sidewise
- Clear the GPIO debounce settings when going down for suspend in the
AMD driver. Very good for some AMD laptops that now wake up from
suspend again!
- Add the compulsory .can_sleep bool flag in the AW9523 driver, should
have been there from the beginning, now there are users finding the
bug
- Drop some bouncing email address from MAINTAINERS
* tag 'pinctrl-v6.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: aw9523: fix can_sleep flag for GPIO chip
pinctrl: amd: Clear GPIO debounce for suspend
pinctrl: nuvoton: Fix boot on ma35dx platforms
MAINTAINERS: drop bouncing Lakshmi Sowjanya D
pinctrl: qcom: msm: mark certain pins as invalid for interrupts
|
|
When SAE commit is sent and received in response, there's no
ordering for the SAE confirm messages. As such, don't call
drivers to stop listening on the channel when the confirm
message is still expected.
This fixes an issue if the local confirm is transmitted later
than the AP's confirm, for iwlwifi (and possibly mt76) the
AP's confirm would then get lost since the device isn't on
the channel at the time the AP transmit the confirm.
For iwlwifi at least, this also improves the overall timing
of the authentication handshake (by about 15ms according to
the report), likely since the session protection won't be
aborted and rescheduled.
Note that even before this, mgd_complete_tx() wasn't always
called for each call to mgd_prepare_tx() (e.g. in the case
of WEP key shared authentication), and the current drivers
that have the complete callback don't seem to mind. Document
this as well though.
Reported-by: Jan Hendrik Farr <kernel@jfarr.cc>
Closes: https://lore.kernel.org/all/aB30Ea2kRG24LINR@archlinux/
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250609213232.12691580e140.I3f1d3127acabcd58348a110ab11044213cf147d3@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Implement dot11ExtendedRegInfoSupport to advertise non-AP station
regulatory power capability as part of regulatory connectivity
element in (Re)Association request frames so that AP can achieve
maximum client connectivity. Control field which was interpreted
using value of 3-bits B5 to B3, now uses value of 4-bits B6 to B3 to
interpret the type of AP. Hence update IEEE80211_HE_6GHZ_OPER_CTRL_REG_INFO
to parse 4-bits control field. If older AP still updates only 3-bits
value of control field, station can still interpret the value as per
section E.2.7 of IEEE 802.11 REVme D7.0 and support the appropriate
AP type.
Also update IEEE80211_6GHZ_CTRL_REG_INDOOR_SP_AP as the value of
standard power AP is changed to 8 instead of 4 so that AP can support both
LPI AP and SP AP to maximize the connectivity with stations. For backward
compatibility, keeping value 4 as old AP by limiting it to SP AP only.
Signed-off-by: Somashekhar Puttagangaiah <somashekhar.puttagangaiah@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250609213232.90cdef116aad.I85da390fbee59355e3855691933e6a5e55c47ac4@changeid
[fix kernel-doc]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Currently the code only sends extended MLD capa/ops in
strict mode, but if the AP has it then it should also
be able to parse it. There could be cases where the AP
doesn't have it but we would want to advertise it (e.g.
if the AP supports nothing but we want to have BTM.),
but given the broken deployed APs out there right now
this is the best we can do.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250609213232.c9b8b3a6ca77.I1153d4283d1fbb9e5db60e7b939cc133a6345db5@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
cfg80211 now reports whether this is the first part of a scan. Copy
that information into the driver request.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250609213231.63f6078bd7be.Ia6e5cee945e6d9617c2f427552d89d23c92eee83@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
When there are no non-6 GHz channels, then the 6 GHz scan is the first
part of a split scan. Add a boolean denoting whether the scan is the
first part of a scan as it might be useful to drivers for internal
bookkeeping. This flag is also set if the scan is not split.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250609213231.07e5a8a452ec.Ibf18f513e507422078fb31b28947e582a20df87a@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Since iwlwifi was the only driver using this and no
longer does, we can remove all this code.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250609213231.4dff5fb8890f.Ie531f912b252a0042c18c0734db50c3afe1adfb5@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
We verify that the Extended MLD Capabilities are matching between links.
However, some bits are reserved and in particular the Recommended Max
Links subfield may not necessarily match. So only verify the known
subfields that can reliably be expected to be the same. More information
can be found in Table 9-417o, in IEEE P802.11be/D7.0.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250609213231.a2fad48dd3e6.Iae1740cd2ac833bc4a64fd2af718e1485158fd42@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The cast from void * here coupled with the boolean argument
on what to cast to is confusing and really not needed, just
split the code and make a type-safe interface. It seems to
even reduce the code size slightly, at least on x86-64.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250609213231.bdb3c96570b0.Ia153e6ce06dc9a636ff5bcc1d52468a1afd06e13@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Hide the internal scan fields from mac80211 and drivers, the
'notified' variable is for internal tracking, and the 'info'
is output that's passed to cfg80211_scan_done() and stored
only for delayed userspace notification.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250609213231.6a62e41858e2.I004f66e9c087cc6e6ae4a24951cf470961ee9466@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
If the link is deactivated and the CSA completes, then that
needs to update the link station's bandwidth (only the AP STA
can exist at this point, no TDLS on inactive links) and set
the CSA to no longer be active. Fix this.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250609213231.07f120cf687d.I5a868c501ee73fcc2355d61c2ee06e5f444b350f@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
When a new station is added, ensure that mandatory bit-rates
are enabled for 6 GHz band.
Signed-off-by: Somashekhar Puttagangaiah <somashekhar.puttagangaiah@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250609213231.4aecd7f3b85b.I33a54872a3267c9f6155ce537d6c9c2a31c3f117@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
ieee80211_process_ml_reconf_resp() has a blank line between
an if statement and the covered code, remove it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250609213231.a1f4ceae700d.I1d7aae17cc466c1648f31c42b935165db85d2809@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
ieee80211_prep_connection is supposed to be called when both bitmaps
(valid_links and active_links) are cleared.
Make sure of it and WARN if this is not the case, to avoid weird issues.
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Link: https://patch.msgid.link/20250609213231.f616c7b693df.Ie983155627ad0d2e7c19c30ce642915246d0ed9d@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
If we get to the error path of ieee80211_prep_connection, for example
because of a FW issue, then ieee80211_vif_set_links is called
with 0.
But the call to drv_change_vif_links from ieee80211_vif_update_links
will probably fail as well, for the same reason.
In this case, the valid_links and active_links bitmaps will be reverted
to the value of the failing connection.
Then, in the next connection, due to the logic of
ieee80211_set_vif_links_bitmaps, valid_links will be set to the ID of
the new connection assoc link, but the active_links will remain with the
ID of the old connection's assoc link.
If those IDs are different, we get into a weird state of valid_links and
active_links being different. One of the consequences of this state is
to call drv_change_vif_links with new_links as 0, since the & operation
between the bitmaps will be 0.
Since a removal of a link should always succeed, ignore the return value
of drv_change_vif_links if it was called to only remove links, which is
the case for the ieee80211_prep_connection's error path.
That way, the bitmaps will not be reverted to have the value from the
failing connection and will have 0, so the next connection will have a
good state.
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Link: https://patch.msgid.link/20250609213231.ba2011fb435f.Id87ff6dab5e1cf757b54094ac2d714c656165059@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next
Miri Korenblit says:
====================
iwlwifi features, notably:
- PNVM integrated in the ucode image
- more cleanups in the transport layer
====================
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
FWs with this version are no longer supported on any device.
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709081300.1668a7430521.I488d69251aed62f0b11a2553f972a1730bc8b6cf@changeid
|
|
Stop supporting older FWs
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709081300.71404c289481.Iea4f3d36e18029a817ec5d6641d08ac5ee025678@changeid
|
|
iwl_mvm_set_key_rx_seq is called only once when the installed argument
is false. Remove this argument.
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709081300.2586112afd70.Iddf9a2b24546cb3a1506d68ca41ed215f88cff5c@changeid
|
|
iwlmvm driver does not support MLO. Remove this code
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709081300.4957e50dee91.I2a432256dbc3069e0300e1f833e10a93d203f538@changeid
|
|
Using 'trans' for the mac config is confusing, rename the
argument to 'mac_cfg'.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709081300.72d87406f8d7.I8b39f01e06ad7791efe718c267cbf367233920a3@changeid
|
|
Some device names were wrong because our internal data suggested
that discrete Ga devices have B-step RF, when they actually have
C-step. However, matching the step for them is bad anyway.
Change the code to be able to find the devinfo depending on the
device being integrated or discrete. This is only for the names,
since the RF config cannot be different for the same RF because
it's discrete or integrated, so add a kunit test that ensures
both (a) the RF config is the same and (b) the name is different
(the latter really only because that's the whole point of having
a match on the discrete/integrated bit.)
Remove the RF step matching since it's no longer needed now.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709081300.e048a94659f1.Ie5919c70e9d8e3a28152aaf3cdffd19ed3d4f5c7@changeid
|
|
There's really not much value in printing something just
because the driver loaded, remove that message.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709081300.fe33c279a45d.I16a9cbcfce92a1d1b8b26a20ea9911e8a5a0b1cc@changeid
|
|
These have no real value for normal users, print them as
debug messages instead.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709081300.bd2df0705d89.Ic6f042588ef17719653c077ff89a8b9149c22f92@changeid
|
|
We needed it for setting up trans parameters that could change later in
the probe flow.
This is no longer true, now we know all the parameters before we allocate the
trans, so we can just send the right parameters to iwl_trans_alloc and have all
initializations done there.
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
.../net/wireless/intel/iwlwifi/iwl-trans.c | 25 ++--------
.../net/wireless/intel/iwlwifi/iwl-trans.h | 8 +--
.../intel/iwlwifi/pcie/gen1_2/trans.c | 50 ++++++++++++-------
3 files changed, 41 insertions(+), 42 deletions(-)
Link: https://patch.msgid.link/20250709081300.9602fde079de.Iaede14c91095560852f9b441f1e16546b0a06bdd@changeid
|
|
Move the parts of the probe that are not gen specific to the common
probe function.
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709081300.91aee0874e79.Ib762365933d4dd4fc0bf07833226cd7118dee0a1@changeid
|
|
Start supporting FW API version 102 on those devices.
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709081300.da98a7b6be42.I77150bbf55eb160dbe0ef75c3e28afc053f27ec3@changeid
|
|
Refactor iwl_trans_init to accept txcmd_size and txcmd_align as parameters
instead of calculating them internally.
Signed-off-by: Yedidya Benshimol <yedidya.ben.shimol@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709081300.237285d81461.I3552860dd062a523606c8a5c85c9a6f0d4f04262@changeid
|
|
In the process of splitting the transport's different generations,
move gen1_2's probe flow and relevant helper functions to the gen1_2
subfolder
Signed-off-by: Yedidya Benshimol <yedidya.ben.shimol@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709081300.29b909144e1a.Idaa77eddd6650cf6f113833d2fbc8d3ef08cfd8f@changeid
|
|
Given compatibility issues with external PNVM data that doesn't match
the firmware it was designed with/for, future firmware releases will
include the PNVM data in the firmware files directly, avoiding those
mismatch issues. Make the driver load and use that embedded PNVM data
in preference of external files, falling back to the external file if
it isn't present.
Co-developed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709081300.c843f77aa2d3.I7200f8dd40ef82aff1f5574fdd3966913cda592c@changeid
|
|
Add iwl_poll_bits helper to simplify calls to iwl_poll_bit
for the case when the bits and mask arguments are equal.
Signed-off-by: Rotem Kerem <rotem.kerem@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709081300.6bbc4bccc597.Ic7a10a7f8a9a32a9a9feecaf6e3a48fa37479f2d@changeid
|
|
Use DEFINE_MUTEX() to initialize udp_tunnel_gro_type_lock.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250707091634.311974-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add missing 'minItems: 1' to iommus property of the Altera SOCFPGA SoC
implementation of the Synopsys DWMAC.
Fixes: 6d359cf464f4 ("dt-bindings: net: Convert socfpga-dwmac bindings to yaml")
Signed-off-by: Matthew Gerlach <matthew.gerlach@altera.com>
Reviewed-by: Yanteng Si <siyanteng@cqsoftware.com.cn>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20250707154409.15527-1-matthew.gerlach@altera.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This ethernet controller is using fixed links for DSA switches
in two already existing device trees, so make sure the checker
does not complain like this:
intel-ixp42x-linksys-wrv54g.dtb: ethernet@c8009000 (intel,ixp4xx-ethernet):
'fixed-link' does not match any of the regexes: '^pinctrl-[0-9]+$'
from schema $id: http://devicetree.org/schemas/net/intel,ixp4xx-ethernet.yaml#
intel-ixp42x-usrobotics-usr8200.dtb: ethernet@c800a000 (intel,ixp4xx-ethernet):
'fixed-link' does not match any of the regexes: '^pinctrl-[0-9]+$'
from schema $id: http://devicetree.org/schemas/net/intel,ixp4xx-ethernet.yaml#
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202507040609.K9KytWBA-lkp@intel.com/
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20250704-ixp4xx-ethernet-binding-fix-v1-1-8ac360d5bc9b@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Kuniyuki Iwashima says:
====================
ipv6: Drop RTNL from mcast.c and anycast.c
This is a prep series for RCU conversion of RTM_NEWNEIGH, which needs
RTNL during neigh_table.{pconstructor,pdestructor}() touching IPv6
multicast code.
Currently, IPv6 multicast code is protected by lock_sock() and
inet6_dev->mc_lock, and RTNL is not actually needed.
In addition, anycast code is also in the same situation and does not
need RTNL at all.
This series removes RTNL from net/ipv6/{mcast.c,anycast.c} and finally
removes setsockopt_needs_rtnl() from do_ipv6_setsockopt().
v2: https://lore.kernel.org/20250624202616.526600-1-kuni1840@gmail.com
v1: https://lore.kernel.org/20250616233417.1153427-1-kuni1840@gmail.com
====================
Link: https://patch.msgid.link/20250702230210.3115355-1-kuni1840@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We no longer need to hold RTNL for IPv6 socket options.
Let's remove setsockopt_needs_rtnl().
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250702230210.3115355-16-kuni1840@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
inet6_sk(sk)->ipv6_ac_list is protected by lock_sock().
In ipv6_sock_ac_join(), only __dev_get_by_index(), __dev_get_by_flags(),
and __in6_dev_get() require RTNL.
__dev_get_by_flags() is only used by ipv6_sock_ac_join() and can be
converted to RCU version.
Let's replace RCU version helper and drop RTNL from IPV6_JOIN_ANYCAST.
setsockopt_needs_rtnl() will be removed in the next patch.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250702230210.3115355-15-kuni1840@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The next patch will replace __dev_get_by_index() and __dev_get_by_flags()
to RCU + refcount version.
Then, we will need to call dev_put() in some error paths.
Let's unify two error paths to make the next patch cleaner.
Note that we add READ_ONCE() for net->ipv6.devconf_all->forwarding
and idev->conf.forwarding as we will drop RTNL that protects them.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250702230210.3115355-14-kuni1840@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
inet6_sk(sk)->ipv6_ac_list is protected by lock_sock().
In ipv6_sock_ac_drop() and ipv6_sock_ac_close(),
only __dev_get_by_index() and __in6_dev_get() requrie RTNL.
Let's replace them with dev_get_by_index() and in6_dev_get()
and drop RTNL from IPV6_LEAVE_ANYCAST and IPV6_ADDRFORM.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250702230210.3115355-13-kuni1840@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
inet6_dev->ac_list is protected by inet6_dev->lock, so rtnl_dereference()
is a bit rough annotation.
As done in mcast.c, we can use ac_dereference() that checks if
inet6_dev->lock is held.
Let's replace rtnl_dereference() with a new helper ac_dereference().
Note that now addrconf_join_solict() / addrconf_leave_solict() in
__ipv6_dev_ac_inc() / __ipv6_dev_ac_dec() does not need RTNL, so we
can remove ASSERT_RTNL() there.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250702230210.3115355-12-kuni1840@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Now, RTNL is not needed for mcast code, and what's commented in
ip6_mc_msfget() is apparent by for_each_pmc_socklock(), which has
lockdep annotation for lock_sock().
Let's remove the comment and ASSERT_RTNL() in ipv6_mc_rejoin_groups().
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250702230210.3115355-11-kuni1840@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In ip6_mc_source() and ip6_mc_msfilter(), per-socket mld data is
protected by lock_sock() and inet6_dev->mc_lock is also held for
some per-interface functions.
ip6_mc_find_dev_rtnl() only depends on RTNL. If we want to remove
it, we need to check inet6_dev->dead under mc_lock to close the race
with addrconf_ifdown(), as mentioned earlier.
Let's do that and drop RTNL for the rest of MCAST_ socket options.
Note that ip6_mc_msfilter() has unnecessary lock dances and they
are integrated into one to avoid the last-minute error and simplify
the error handling.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250702230210.3115355-10-kuni1840@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In __ipv6_sock_mc_close(), per-socket mld data is protected by lock_sock(),
and only __dev_get_by_index() and __in6_dev_get() require RTNL.
Let's call __ipv6_sock_mc_drop() and drop RTNL in ipv6_sock_mc_close().
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250702230210.3115355-9-kuni1840@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In __ipv6_sock_mc_drop(), per-socket mld data is protected by lock_sock(),
and only __dev_get_by_index() and __in6_dev_get() require RTNL.
Let's use dev_get_by_index() and in6_dev_get() and drop RTNL for
IPV6_ADD_MEMBERSHIP and MCAST_JOIN_GROUP.
Note that __ipv6_sock_mc_drop() is factorised to reuse in the next patch.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250702230210.3115355-8-kuni1840@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In __ipv6_sock_mc_join(), per-socket mld data is protected by lock_sock(),
and only __dev_get_by_index() requires RTNL.
Let's use dev_get_by_index() and drop RTNL for IPV6_ADD_MEMBERSHIP and
MCAST_JOIN_GROUP.
Note that we must call rt6_lookup() and dev_hold() under RCU.
If rt6_lookup() returns an entry from the exception table, dst_dev_put()
could change rt->dev.dst to loopback concurrently, and the original device
could lose the refcount before dev_hold() and unblock device registration.
dst_dev_put() is called from NETDEV_UNREGISTER and synchronize_net() follows
it, so as long as rt6_lookup() and dev_hold() are called within the same
RCU critical section, the dev is alive.
Even if the race happens, they are synchronised by idev->dead and mcast
addresses are cleaned up.
For the racy access to rt->dst.dev, we use dst_dev().
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250702230210.3115355-7-kuni1840@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As well as __ipv6_dev_mc_inc(), all code in __ipv6_dev_mc_dec() are
protected by inet6_dev->mc_lock, and RTNL is not needed.
Let's use in6_dev_get() in ipv6_dev_mc_dec() and remove ASSERT_RTNL()
in __ipv6_dev_mc_dec().
Now, we can remove the RTNL comment above addrconf_leave_solict() too.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250702230210.3115355-6-kuni1840@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since commit 63ed8de4be81 ("mld: add mc_lock for protecting per-interface
mld data"), the newly allocated struct ifmcaddr6 cannot be removed until
inet6_dev->mc_lock is released, so mca_get() and mc_put() are unnecessary.
Let's remove the extra refcounting.
Note that mca_get() was only used in __ipv6_dev_mc_inc().
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250702230210.3115355-5-kuni1840@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since commit 63ed8de4be81 ("mld: add mc_lock for protecting
per-interface mld data"), every multicast resource is protected
by inet6_dev->mc_lock.
RTNL is unnecessary in terms of protection but still needed for
synchronisation between addrconf_ifdown() and __ipv6_dev_mc_inc().
Once we removed RTNL, there would be a race below, where we could
add a multicast address to a dead inet6_dev.
CPU1 CPU2
==== ====
addrconf_ifdown() __ipv6_dev_mc_inc()
if (idev->dead) <-- false
dead = true return -ENODEV;
ipv6_mc_destroy_dev() / ipv6_mc_down()
mutex_lock(&idev->mc_lock)
...
mutex_unlock(&idev->mc_lock)
mutex_lock(&idev->mc_lock)
...
mutex_unlock(&idev->mc_lock)
The race window can be easily closed by checking inet6_dev->dead
under inet6_dev->mc_lock in __ipv6_dev_mc_inc() as addrconf_ifdown()
will acquire it after marking inet6_dev dead.
Let's check inet6_dev->dead under mc_lock in __ipv6_dev_mc_inc().
Note that now __ipv6_dev_mc_inc() no longer depends on RTNL and
we can remove ASSERT_RTNL() there and the RTNL comment above
addrconf_join_solict().
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250702230210.3115355-4-kuni1840@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|