git.armlinux.org.uk/linux.git - Linus' kernel tree

author	Paolo Abeni <pabeni@redhat.com>	2025-09-23 10:12:17 +0200
committer	Paolo Abeni <pabeni@redhat.com>	2025-09-23 10:12:17 +0200
commit	6e2f1484b94435d7491dcc380ce2959fcf6caba9 (patch)
tree	3ebeb02bca04008cb2b88127207fe5ba8fd07f20 /rust/helpers/xarray.c
parent	3afb106f3f9aa81c512ec5c7e2f7e1c01a2a6e6b (diff)
parent	8a8241cdaa343c4bdb5ae11fa6cef09a2476d73b (diff)

Merge branch 'tcp-update-bind-bucket-state-on-port-release'

Jakub Sitnicki says:

====================
tcp: Update bind bucket state on port release

TL;DR
-----

This is another take on addressing the issue we already raised earlier [1].

This time around, instead of trying to relax the bind-conflict checks in
connect(), we make an attempt to fix the tcp bind bucket state accounting.

The goal of this patch set is to make the bind buckets return to "port
reusable by ephemeral connections" state when all sockets blocking the port
from reuse get unhashed.

Changelog
---------
Changes in v5:
- Fix initial port-addr bucket state on saddr update with ip_dynaddr=1
- Add Kuniyuki's tag for tests
- Link to v4: https://lore.kernel.org/r/20250913-update-bind-bucket-state-on-unhash-v4-0-33a567594df7@cloudflare.com

Changes in v4:
- Drop redundant sk_is_connect_bind helper doc comment
- Link to v3: https://lore.kernel.org/r/20250910-update-bind-bucket-state-on-unhash-v3-0-023caaf4ae3c@cloudflare.com

Changes in v3:
- Move the flag from inet_flags to sk_userlocks (Kuniyuki)
- Rename the flag from AUTOBIND to CONNECT_BIND to avoid a name clash (Kuniyuki)
- Drop unreachable code for sk_state == TCP_NEW_SYN_RECV (Kuniyuki)
- Move the helper to inet_hashtables where it's used
- Reword patch 1 description for conciseness
- Link to v2: https://lore.kernel.org/r/20250821-update-bind-bucket-state-on-unhash-v2-0-0c204543a522@cloudflare.com

Changes in v2:
- Rename the inet_sock flag from LAZY_BIND to AUTOBIND (Eric)
- Clear the AUTOBIND flag on disconnect path (Eric)
- Add a test to cover the disconnect case (Eric)
- Link to RFC v1: https://lore.kernel.org/r/20250808-update-bind-bucket-state-on-unhash-v1-0-faf85099d61b@cloudflare.com

Situation
---------

We observe the following scenario in production:

                                                  inet_bind_bucket
                                                state for port 54321
                                                --------------------

                                                (bucket doesn't exist)

// Process A opens a long-lived connection:
s1 = socket(AF_INET, SOCK_STREAM)
s1.setsockopt(IP_BIND_ADDRESS_NO_PORT)
s1.setsockopt(IP_LOCAL_PORT_RANGE, 54000..54500)
s1.bind(192.0.2.10, 0)
s1.connect(192.51.100.1, 443)
                                                tb->fastreuse = -1
                                                tb->fastreuseport = -1
s1.getsockname() -> 192.0.2.10:54321
s1.send()
s1.recv()
// ... s1 stays open.

// Process B opens a short-lived connection:
s2 = socket(AF_INET, SOCK_STREAM)
s2.setsockopt(SO_REUSEADDR)
s2.bind(192.0.2.20, 0)
                                                tb->fastreuse = 0
                                                tb->fastreuseport = 0
s2.connect(192.51.100.2, 53)
s2.getsockname() -> 192.0.2.20:54321
s2.send()
s2.recv()
s2.close()

                                                // bucket remains in this
                                                // state even though port
                                                // was released by s2
                                                tb->fastreuse = 0
                                                tb->fastreuseport = 0

// Process A attempts to open another connection
// when there is connection pressure from
// 192.0.2.30:54000..54500 to 192.51.100.1:443.
// Assume only port 54321 is still available.

s3 = socket(AF_INET, SOCK_STREAM)
s3.setsockopt(IP_BIND_ADDRESS_NO_PORT)
s3.setsockopt(IP_LOCAL_PORT_RANGE, 54000..54500)
s3.bind(192.0.2.30, 0)
s3.connect(192.51.100.1, 443) -> EADDRNOTAVAIL (99)

Problem
-------

We end up in a state where Process A can't reuse ephemeral port 54321 for
as long as there are sockets, like s1, that keep the bind bucket alive. The
bucket does not return to "reusable" state even when all sockets which
blocked it from reuse, like s2, are gone.

The ephemeral port becomes available for use again only after all sockets
bound to it are gone and the bind bucket is destroyed.

Programs which behave like Process B in this scenario - that is, binding to
an IP address without setting IP_BIND_ADDRESS_NO_PORT - might be considered
poorly written. However, the reality is that such implementation is not
actually uncommon. Trying to fix each and every such program is like
playing whack-a-mole.

For instance, it could be any software using Golang's net.Dialer with
LocalAddr provided:

        dialer := &net.Dialer{
                LocalAddr: &net.TCPAddr{IP: srcIP},
        }
        conn, err := dialer.Dial("tcp4", dialTarget)

Or even a ubiquitous tool like dig when using a specific local address:

        $ dig -b 127.1.1.1 +tcp +short example.com

Hence, we are proposing a systematic fix in the network stack itself.

Solution
--------

Please see the description in patch 1.

[1] https://lore.kernel.org/r/20250714-connect-port-search-harder-v3-0-b1a41f249865@cloudflare.com

Reported-by: Lee Valentine <lvalentine@cloudflare.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
====================

Link: https://patch.msgid.link/20250917-update-bind-bucket-state-on-unhash-v5-0-57168b661b47@cloudflare.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Diffstat (limited to 'rust/helpers/xarray.c')

0 files changed, 0 insertions, 0 deletions

