From f555f3d76aaade29c7e221a37ee64fe722955c09 Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Fri, 16 Jan 2015 11:37:12 +0100 Subject: genetlink: document parallel_ops The kernel-doc for the parallel_ops family struct member is missing, add it. Signed-off-by: Johannes Berg Signed-off-by: David S. Miller --- include/net/genetlink.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include/net/genetlink.h') diff --git a/include/net/genetlink.h b/include/net/genetlink.h index 84125088c309..2ea2c55bdc87 100644 --- a/include/net/genetlink.h +++ b/include/net/genetlink.h @@ -27,6 +27,8 @@ struct genl_info; * @maxattr: maximum number of attributes supported * @netnsok: set to true if the family can handle network * namespaces and should be presented in all of them + * @parallel_ops: operations can be called in parallel and aren't + * synchronized by the core genetlink code * @pre_doit: called before an operation's doit callback, it may * do additional, common, filtering and return an error * @post_doit: called after an operation's doit callback, it may -- cgit From ee1c244219fd652964710a6cc3e4f922e86aa492 Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Fri, 16 Jan 2015 11:37:14 +0100 Subject: genetlink: synchronize socket closing and family removal In addition to the problem Jeff Layton reported, I looked at the code and reproduced the same warning by subscribing and removing the genl family with a socket still open. This is a fairly tricky race which originates in the fact that generic netlink allows the family to go away while sockets are still open - unlike regular netlink which has a module refcount for every open socket so in general this cannot be triggered. Trying to resolve this issue by the obvious locking isn't possible as it will result in deadlocks between unregistration and group unbind notification (which incidentally lockdep doesn't find due to the home grown locking in the netlink table.) To really resolve this, introduce a "closing socket" reference counter (for generic netlink only, as it's the only affected family) in the core netlink code and use that in generic netlink to wait for all the sockets that are being closed at the same time as a generic netlink family is removed. This fixes the race that when a socket is closed, it will should call the unbind, but if the family is removed at the same time the unbind will not find it, leading to the warning. The real problem though is that in this case the unbind could actually find a new family that is registered to have a multicast group with the same ID, and call its mcast_unbind() leading to confusing. Also remove the warning since it would still trigger, but is now no longer a problem. This also moves the code in af_netlink.c to before unreferencing the module to avoid having the same problem in the normal non-genl case. Signed-off-by: Johannes Berg Signed-off-by: David S. Miller --- include/net/genetlink.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) (limited to 'include/net/genetlink.h') diff --git a/include/net/genetlink.h b/include/net/genetlink.h index 2ea2c55bdc87..6c92415311ca 100644 --- a/include/net/genetlink.h +++ b/include/net/genetlink.h @@ -35,7 +35,10 @@ struct genl_info; * undo operations done by pre_doit, for example release locks * @mcast_bind: a socket bound to the given multicast group (which * is given as the offset into the groups array) - * @mcast_unbind: a socket was unbound from the given multicast group + * @mcast_unbind: a socket was unbound from the given multicast group. + * Note that unbind() will not be called symmetrically if the + * generic netlink family is removed while there are still open + * sockets. * @attrbuf: buffer to store parsed attributes * @family_list: family list * @mcgrps: multicast groups used by this family (private) -- cgit