summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2016-08-09batman-adv: netlink: add gateway table queriesSven Eckelmann
Add BATADV_CMD_GET_GATEWAYS commands, using handlers bat_gw_dump in batadv_algo_ops. Will always return -EOPNOTSUPP for now, as no implementations exist yet. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-08-09batman-adv: add B.A.T.M.A.N. V bat_{orig, neigh}_dump implementationsMatthias Schiffer
Dump the algo V originators and neighbours. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Andrew Lunn <andrew@lunn.ch> [sven@narfation.org: Fix includes, fix algo_ops integration] Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2016-08-09batman-adv: add B.A.T.M.A.N. IV bat_{orig, neigh}_dump implementationsMatthias Schiffer
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Andrew Lunn <andrew@lunn.ch> [sven.eckelmann@open-mesh.com: Fix function parameter alignments, add policy for attributes, fix includes, fix algo_ops integration] Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2016-08-09batman-adv: netlink: add originator and neighbor table queriesMatthias Schiffer
Add BATADV_CMD_GET_ORIGINATORS and BATADV_CMD_GET_NEIGHBORS commands, using handlers bat_orig_dump and bat_neigh_dump in batadv_algo_ops. Will always return -EOPNOTSUPP for now, as no implementations exist yet. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Andrew Lunn <andrew@lunn.ch> [sven@narfation.org: Rewrite based on new algo_ops structures] Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2016-08-09batman-adv: Provide TTVN in the mesh_info netlink msgSven Eckelmann
The TTVN is the main information for the debugging of translation table problems. It is therefore necessary when comparing the global translation tables. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2016-08-09batman-adv: netlink: add translation table queryMatthias Schiffer
This adds the commands BATADV_CMD_GET_TRANSTABLE_LOCAL and BATADV_CMD_GET_TRANSTABLE_GLOBAL, which correspond to the transtable_local and transtable_global debugfs files. The batadv_tt_client_flags enum is moved to the UAPI to expose it as part of the netlink API. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Andrew Lunn <andrew@lunn.ch> [sven.eckelmann@open-mesh.com: add policy for attributes, fix includes] Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com> [sw@simonwunderlich.de: fix VID attributes content] Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2016-08-09batman-adv: netlink: hardif queryMatthias Schiffer
BATADV_CMD_GET_HARDIFS will return the list of hardifs (including index, name and MAC address) of all hardifs for a given softif. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Andrew Lunn <andrew@lunn.ch> [sven.eckelmann@open-mesh.com: Reduce the number of changes to BATADV_CMD_GET_HARDIFS, add policy for attributes] Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2016-08-09batman-adv: netlink: add routing_algo queryMatthias Schiffer
BATADV_CMD_GET_ROUTING_ALGOS is used to get the list of supported routing algorithms. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Andrew Lunn <andrew@lunn.ch> [sven.eckelmann@open-mesh.com: Reduce the number of changes to BATADV_CMD_GET_ROUTING_ALGOS, fix includes] Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2016-08-09batman-adv: Suppress debugfs entries for netns'sAndrew Lunn
Debugfs is not netns aware. It thus has problems when the same interface name exists in multiple network name spaces. Work around this by not creating entries for interfaces in name spaces other than the default name space. This means meshes in network namespaces cannot be managed via debugfs, but there will soon be a netlink interface which is netns aware. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2016-08-09batman-adv: Handle parent interfaces in a different netnsAndrew Lunn
batman-adv tries to prevent the user from placing a batX soft interface into another batman mesh as a hard interface. It does this by walking up the devices list of parents and ensures they are all none batX interfaces. iflink can point to an interface in a different namespace, so also retrieve the parents name space when finding the parent and use it when doing the comparison. Signed-off-by: Andrew Lunn <andrew@lunn.ch> [sven@narfation.org: Fix alignments, simplify parent netns retrieval] Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2016-08-09batman-adv: Fix consistency of update route messagesSven Eckelmann
The debug messages of _batadv_update_route were printed before the actual route change is done. At this point it is not really known which curr_router will be replaced. Thus the messages could print the wrong operation. Printing the debug messages after the operation was done avoids this problem. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-08-09batman-adv: Use bitwise instead of arithmetic operator for flagsLinus Lüssing
This silences the following coccinelle warning: "WARNING: sum of probable bitmasks, consider |" Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-08-09batman-adv: Remove orig_node reference handling from send_skb_unicastSven Eckelmann
The function batadv_send_skb_unicast is not acquiring a reference for an orig_node nor removing it from any datastructure. It still reduces the reference counter for an object which is still in the hands of the caller. This is confusing and can lead in the future to problems in the reference handling of the caller function. Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-08-09batman-adv: use kmem_cache for translation tableSven Eckelmann
The translation table (global, local) is usually the part of batman-adv which has the most dynamical allocated objects. Most of them (tt_local_entry, tt_global_entry, tt_orig_list_entry, tt_change_node, tt_req_node, tt_roam_node) are equally sized. So it makes sense to have them allocated from a kmem_cache for each type. This approach allowed a small wireless router (TP-Link TL-841NDv8; SLUB allocator) to store 34% more translation table entries compared to the current implementation. [1] https://open-mesh.org/projects/batman-adv/wiki/Kmalloc-kmem-cache-tests Reported-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-08-09batman-adv: Introduce forward packet creation helperLinus Lüssing
This patch abstracts the forward packet creation into the new function batadv_forw_packet_alloc(). The queue counting and interface reference counters are now handled internally within batadv_forw_packet_alloc() and its batadv_forw_packet_free() counterpart. This should reduce the risk of having reference/queue counting bugs again and should increase code readibility. Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-08-09batman-adv: fix boolreturn.cocci warningskbuild test robot
net/batman-adv/bridge_loop_avoidance.c:1105:9-10: WARNING: return of 0/1 in function 'batadv_bla_process_claim' with return type bool Return statements in functions returning bool should use true/false instead of 1/0. Generated by: scripts/coccinelle/misc/boolreturn.cocci Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-08-09batman-adv: iv_ogm, Reduce code duplicationMarkus Pargmann
The difference between tq1 and tq2 are calculated the same way in two separate functions. This patch moves the common code to a separate function 'batadv_iv_ogm_neigh_diff' which handles everything necessary. The other two functions can then handle errors and use the difference directly. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> [sven@narfation.org: rebased on current version, initialize return variable in batadv_iv_ogm_neigh_diff, add kerneldoc, convert to bool return type] Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-08-09batman-adv: disable sysfs knobs when GW-mode is not implementedAntonio Quartulli
Now that the GW-mode code is algorithm specific, batman-adv expects the routing algorithm to implement some APIs to make it work. However, such APIs are not mandatory, therefore we might have algorithms not providing them. In this case all the sysfs knobs related to GW-mode should be deactivated to make sure that settings injected by the user for this feature are rejected. Signed-off-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-08-09batman-adv: B.A.T.M.A.N. V - implement GW selection logicAntonio Quartulli
Since the GW selection logic has been made routing protocol specific it is now possible for B.A.T.M.A.N V to have its own mechanism by providing the API implementation. Implement the GW specific API in the B.A.T.M.A.N. V protocol in order to provide a working GW selection mechanism. Signed-off-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-08-09batman-adv: make GW election code protocol specificAntonio Quartulli
Each routing protocol may have its own specific logic about gateway election which is potentially based on the metric being used. Create two GW specific API functions and move the current election logic in the B.A.T.M.A.N. IV specific code. Signed-off-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-08-09batman-adv: make the GW selection class algorithm specificAntonio Quartulli
The B.A.T.M.A.N. V algorithm uses a different metric compared to its predecessor and for this reason the logic used to compute the best Gateway is also changed. This means that the GW selection class fed to this logic has a semantics that depends on the algorithm being used. Make the parsing and printing routine of the GW selection class routing algorithm specific. Each algorithm can now parse (and print) this value independently. If no API is provided by any algorithm, the default is to use the current mechanism of considering such value like an integer between 1 and 255. Signed-off-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-08-09batman-adv: Remove unused primary_if and bat_priv variablesLinus Lüssing
Fixes: ef0a937f7a14 ("batman-adv: consider outgoing interface in OGM sending") Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-08-09batman-adv: Avoid sysfs name collision for netns movesSven Eckelmann
The kobject_put is only removing the sysfs entry and corresponding entries when its reference counter becomes zero. This tends to lead to collisions when a device is moved between two different network namespaces because some of the sysfs files have to be removed first and then added again to the already moved sysfs entry. WARNING: CPU: 0 PID: 290 at lib/kobject.c:240 kobject_add_internal+0x5ec/0x8a0 kobject_add_internal failed for batman_adv with -EEXIST, don't try to register things with the same name in the same directory. But the caller of kobject_put can already remove the sysfs entry before it does the kobject_put. This removal is done even when the reference counter is not yet zero and thus avoids the problem. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-08-09batman-adv: Revert "postpone sysfs removal when unregistering"Sven Eckelmann
Postponing the removal of the interface breaks the expected behavior of NETDEV_UNREGISTER and NETDEV_PRE_TYPE_CHANGE. This is especially problematic when an interface is removed and added in quick succession. This reverts commit 5bc44dc8458c ("batman-adv: postpone sysfs removal when unregistering"). Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-08-09batman-adv: Modify mesh_iface outside sysfs contextSven Eckelmann
The legacy sysfs interface to modify interfaces belonging to batman-adv is run inside a region holding s_lock. And to add a net_device, it has to also get the rtnl_lock. This is exactly the other way around than in other virtual net_devices and conflicts with netdevice notifier which executes inside rtnl_lock. The inverted lock situation is currently solved by executing the removal of netdevices via workqueue. The workqueue isn't executed inside rtnl_lock and thus can independently get the s_lock and the rtnl_lock. But this workaround fails when the netdevice notifier creates events in quick succession and the earlier triggered removal of a net_device isn't processed in the workqueue before the adding of the new netdevice (with same name) event is issued. Instead the legacy sysfs interface store events have to be enqueued in a workqueue to loose the s_lock. The worker is then free to get the required locks and the deadlock is avoided. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-08-09batman-adv: Define module rtnl link nameSven Eckelmann
The batman-adv module can automatically be loaded when operations over the rtnl link are triggered. This requires only the correct rtnl link name in the module header. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-08-09batman-adv: Document optional batadv_algo_opsSven Eckelmann
Some operations in batadv_algo_ops are optional and marked as such in the kerneldoc. But some of them miss the "(optional)" in their kerneldoc. These have to also be marked to give an implementor of an algorithm the correct background information without looking in the code calling these function pointers. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-08-09batman-adv: Start new development cycleSimon Wunderlich
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Sven Eckelmann <sven@narfation.org>
2016-08-08RDS: add __printf format attribute to error reporting functionsNicolas Iooss
This is helpful to detect at compile-time errors related to format strings. Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08net/sched/sch_hfsc.c: remove unused cl_myfadjMichal Soltys
The code using this variable has been commented out in the past as it was causing issues in upperlimited link-sharing scenarios. Signed-off-by: Michal Soltys <soltys@ziu.info> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08net/sched/sch_hfsc.c: keep fsc and virtual times in sync; fix an old bugMichal Soltys
This patch simplifies how we update fsc and calculate vt from it - while keeping the expected functionality identical with how hfsc behaves curently. It also fixes a certain issue introduced with a very old patch. The idea is, that instead of correcting cl_vt before fsc curve update (rtsc_min) and correcting cl_vt after calculation (rtsc_y2x) to keep cl_vt local to the current period - we can simply rely on virtual times and curve values always being in sync - analogously to how rsc and usc function, except that we use virtual time here. Why hasn't it been done since the beginning this way ? The likely scenario (basing on the code trying to correct curves whenever possible) was to keep the virtual times as small as possible - as they have tendency to "gallop" forward whenever their siblings and other fair sharing subtrees are idling. On top of that, current code is subtly bugged, so cumulative time (without any corrections) is always kept and used in init_vf() when a new backlog period begins (using cl_cvtoff). Is cumulative value safe ? Generally yes, though corner cases are easy to create. For example consider: 1gbit interface some 100kbit leaf, everything else idle With current tick (64ns) 1s is 15625000 ticks, but the leaf is alone and it's virtual time, so in reality it's 10000 times more. ITOW 38 bits are needed to hold 1 second. 54 - 1 day, 59 - 1 month, 63 - 1 year (all logarithms rounded up). It's getting somewhat dangerous, but also requires setup excusing this kind of values not mentioning permanently backlogged class for a year. In near most extreme case (10gbit, 10kbit leaf), we have "enough" to hold ~13.6 days in 64 bits. Well, the issue remains mostly theoretical and cl_cvtoff has been working fine for all those years. Sensible configuration are de-facto immune to this issue, and not so sensible can solve it with a cronjob and its period inversely proportional to the insanity of such setup =) Now let's explain the subtle bug mentioned earlier. The issue is related to how offsets are kept and how we calculate virtual times and update fair service curve(s). The issue itself is subtle, but easy to observe with long m1 segments. It was introduced in rather old patch: Commit 99296150c7: "[NET_SCHED]: O(1) children vtoff adjustment in HFSC scheduler" (available in git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git) Originally when a new backlog period was started, cl_vtoff of each sibling was updated with cl_cvtmax from past period - naturally moving all cl_vt to proper starting point. That patch adjusted it so cumulative offset is kept in the parent, and there is no need for traversing the list (as any subsequent child activation derives new vt from already active sibling(s)). But with this change, cl_vtoff (of each sibling) is no longer persistent across the inactivity periods, as it's calculated from parent's cl_cvtoff on a new backlog period, conflicting with the following curve correction from the previous period: if (cl->cl_virtual.x == vt) { cl->cl_virtual.x -= cl->cl_vtoff; cl->cl_vtoff = 0; } This essentially tries to keep curve as if it was local to the period and resets cl_vtoff (cumulative vt offset of the class) to 0 when possible (read: when we have an intersection or if a new curve is below the old one). But then it's recalculated from cl_cvtoff on next active period. Then rtsc_min() call preceding the above if() doesn't really do what we expect it to do in such scenario - as it calculates the minimum of corrected curve (from the previous backlog period) and the new uncorrected curve (with offset derived from cl_cvtoff). Example: tc class add dev $ife parent 1:0 classid 1:1 hfsc ls m2 100mbit ul m2 100mbit tc class add dev $ife parent 1:1 classid 1:10 hfsc ls m1 80mbit d 10s m2 20mbit tc class add dev $ife parent 1:1 classid 1:11 hfsc ls m2 20mbit start B, keep it backlogged, let it run 6s (30s worth of vt as A is idle) pause B briefly to force cl_cvtoff update in parent (whole 1:1 going idle) start A, let it run 10s pause A briefly to force rtsc_min() At this point we would expect A to continue at 20mbit after a brief moment of 80mbit. But instead A will use 80mbit for full 10s again. It's the effect of first correcting A (during 'start A'), and then - after unpausing - calculating rtsc_min() from old corrected and new uncorrected curve. The patch fixes this bug and keepis vt and fsc in sync (virtual times are cumulative, not local to the backlog period). Signed-off-by: Michal Soltys <soltys@ziu.info> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08net/multicast: should not send source list records when have filter mode changeHangbin Liu
Based on RFC3376 5.1 and RFC3810 6.1 If the per-interface listening change that triggers the new report is a filter mode change, then the next [Robustness Variable] State Change Reports will include a Filter Mode Change Record. This applies even if any number of source list changes occur in that period. Old State New State State Change Record Sent --------- --------- ------------------------ INCLUDE (A) EXCLUDE (B) TO_EX (B) EXCLUDE (A) INCLUDE (B) TO_IN (B) So we should not send source-list change if there is a filter-mode change. Here are two scenarios: 1. Group deleted and filter mode is EXCLUDE, which means we need send a TO_IN { }. 2. Not group deleted, but has pcm->crcount, which means we need send a normal filter-mode-change. At the same time, if the type is ALLOW or BLOCK, and have psf->sf_crcount, we stop add records and decrease sf_crcount directly Reference: https://www.ietf.org/mail-archive/web/magma/current/msg01274.html Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08net: ipconfig: drop inter-device timeoutUwe Kleine-König
Now that ipconfig learned to handle "delayed replies" in the previous commit, there is no reason any more to delay sending a first request per device. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08net: ipconfig: Support using "delayed" DHCP repliesUwe Kleine-König
The dhcp code only waits 1s between sending DHCP requests on different devices and only accepts an answer for the device that sent out the last request. Only the timeout at the end of a loop is increased iteratively which favours only the last device. This makes it impossible to work with a dhcp server that takes little more than 1s connected to a device that is not the last one. Instead of also increasing the inter-device timeout, teach the code to handle delayed replies. To accomplish that, make *ic_dev track the current ic_device instead of the current net_device and adapt all users accordingly. The relevant change then is to reset d to ic_dev on a reply to assert that the followup request goes through the right device. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08net: ipconfig: Add device name to debug messagesUwe Kleine-König
This simplifies understanding what happens when there is more than one device. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08neigh: allow admin to set NUD_STALEJulian Anastasov
Admin should be able to set any state. Currently, this fails when lladdr is not changed and state is changed from NUD_CONNECTED to NUD_STALE: ip neigh add 192.168.8.1 lladdr 00:11:22:33:44:55 nud perm dev wlan0 ip neigh show to 192.168.8.1 192.168.8.1 dev wlan0 lladdr 00:11:22:33:44:55 PERMANENT ip neigh change 192.168.8.1 lladdr 00:11:22:33:44:55 nud stale dev wlan0 ip neigh show to 192.168.8.1 192.168.8.1 dev wlan0 lladdr 00:11:22:33:44:55 PERMANENT Problem may be from 2.1.X days. Signed-off-by: Julian Anastasov <ja@ssi.bg> Reviewed-by: Chunhui He <hchunhui@mail.ustc.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08sctp: use event->chunk when it's validXin Long
Commit 52253db924d1 ("sctp: also point GSO head_skb to the sk when it's available") used event->chunk->head_skb to get the head_skb in sctp_ulpevent_set_owner(). But at that moment, the event->chunk was NULL, as it cloned the skb in sctp_ulpevent_make_rcvmsg(). Therefore, that patch didn't really work. This patch is to move the event->chunk initialization before calling sctp_ulpevent_receive_data() so that it uses event->chunk when it's valid. Fixes: 52253db924d1 ("sctp: also point GSO head_skb to the sk when it's available") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08bpf: fix checksum for vlan push/pop helperDaniel Borkmann
When having skbs on ingress with CHECKSUM_COMPLETE, tc BPF programs don't push rcsum of mac header back in and after BPF run back pull out again as opposed to some other subsystems (ovs, for example). For cases like q-in-q, meaning when a vlan tag for offloading is already present and we're about to push another one, then skb_vlan_push() pushes the inner one into the skb, increasing mac header and skb_postpush_rcsum()'ing the 4 bytes vlan header diff. Likewise, for the reverse operation in skb_vlan_pop() for the case where vlan header needs to be pulled out of the skb, we're decreasing the mac header and skb_postpull_rcsum()'ing the 4 bytes rcsum of the vlan header that was removed. However mangling the rcsum here will lead to hw csum failure for BPF case, since we're pulling or pushing data that was not part of the current rcsum. Changing tc BPF programs in general to push/pull rcsum around BPF_PROG_RUN() is also not really an option since current behaviour is ABI by now, but apart from that would also mean to do quite a bit of useless work in the sense that usually 12 bytes need to be rcsum pushed/pulled also when we don't need to touch this vlan related corner case. One way to fix it would be to push the necessary rcsum fixup down into vlan helpers that are (mostly) slow-path anyway. Fixes: 4e10df9a60d9 ("bpf: introduce bpf_skb_vlan_push/pop() helpers") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08bpf: fix checksum fixups on bpf_skb_store_bytesDaniel Borkmann
bpf_skb_store_bytes() invocations above L2 header need BPF_F_RECOMPUTE_CSUM flag for updates, so that CHECKSUM_COMPLETE will be fixed up along the way. Where we ran into an issue with bpf_skb_store_bytes() is when we did a single-byte update on the IPv6 hoplimit despite using BPF_F_RECOMPUTE_CSUM flag; simple ping via ICMPv6 triggered a hw csum failure as a result. The underlying issue has been tracked down to a buffer alignment issue. Meaning, that csum_partial() computations via skb_postpull_rcsum() and skb_postpush_rcsum() pair invoked had a wrong result since they operated on an odd address for the hoplimit, while other computations were done on an even address. This mix doesn't work as-is with skb_postpull_rcsum(), skb_postpush_rcsum() pair as it always expects at least half-word alignment of input buffers, which is normally the case. Thus, instead of these helpers using csum_sub() and (implicitly) csum_add(), we need to use csum_block_sub(), csum_block_add(), respectively. For unaligned offsets, they rotate the sum to align it to a half-word boundary again, otherwise they work the same as csum_sub() and csum_add(). Adding __skb_postpull_rcsum(), __skb_postpush_rcsum() variants that take the offset as an input and adapting bpf_skb_store_bytes() to them fixes the hw csum failures again. The skb_postpull_rcsum(), skb_postpush_rcsum() helpers use a 0 constant for offset so that the compiler optimizes the offset & 1 test away and generates the same code as with csum_sub()/_add(). Fixes: 608cd71a9c7c ("tc: bpf: generalize pedit action") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08bpf: also call skb_postpush_rcsum on xmit occasionsDaniel Borkmann
Follow-up to commit f8ffad69c9f8 ("bpf: add skb_postpush_rcsum and fix dev_forward_skb occasions") to fix an issue for dev_queue_xmit() redirect locations which need CHECKSUM_COMPLETE fixups on ingress. For the same reasons as described in f8ffad69c9f8 already, we of course also need this here, since dev_queue_xmit() on a veth device will let us end up in the dev_forward_skb() helper again to cross namespaces. Latter then calls into skb_postpull_rcsum() to pull out L2 header, so that netif_rx_internal() sees CHECKSUM_COMPLETE as it is expected. That is, CHECKSUM_COMPLETE on ingress covering L2 _payload_, not L2 headers. Also here we have to address bpf_redirect() and bpf_clone_redirect(). Fixes: 3896d655f4d4 ("bpf: introduce bpf_clone_redirect() helper") Fixes: 27b29f63058d ("bpf: add bpf_redirect() helper") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08sctp_diag: Respect ss adding TCPF_CLOSE to idiag_statesPhil Sutter
Since 'ss' always adds TCPF_CLOSE to idiag_states flags, sctp_diag can't rely upon TCPF_LISTEN flag solely being present when listening sockets are requested. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08sctp_diag: Fix T3_rtx timer exportPhil Sutter
The asoc's timer value is not kept in asoc->timeouts array but in it's primary transport instead. Furthermore, we must export the timer only if it is pending, otherwise the value will underrun when stored in an unsigned variable and user space will only see a very large timeout value. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08libceph: using kfree_rcu() to simplify the codeWei Yongjun
The callback function of call_rcu() just calls a kfree(), so we can use kfree_rcu() instead of call_rcu() + callback function. Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-08-08libceph: make cancel_generic_request() staticWei Yongjun
Fixes the following sparse warning: net/ceph/mon_client.c:577:6: warning: symbol 'cancel_generic_request' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-08-08libceph: fix return value check in alloc_msg_with_page_vector()Wei Yongjun
In case of error, the function ceph_alloc_page_vector() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: 1907920324f1 ('libceph: support for sending notifies') Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-08-08netfilter: nf_conntrack_sip: CSeq 0 is a valid CSeqChristophe Leroy
Do not drop packet when CSeq is 0 as 0 is also a valid value for CSeq. simple_strtoul() will return 0 either when all digits are 0 or if there are no digits at all. Therefore when simple_strtoul() returns 0 we check if first character is digit 0 or not. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-08-08netfilter: nft_rbtree: ignore inactive matching element with no descendantsPablo Neira Ayuso
If we find a matching element that is inactive with no descendants, we jump to the found label, then crash because of nul-dereference on the left branch. Fix this by checking that the element is active and not an interval end and skipping the logic that only applies to the tree iteration. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Tested-by: Anders K. Pedersen <akp@akp.dk>
2016-08-08netfilter: nf_ct_h323: do not re-activate already expired timerLiping Zhang
Commit 96d1327ac2e3 ("netfilter: h323: Use mod_timer instead of set_expect_timeout") just simplify the source codes if (!del_timer(&exp->timeout)) return 0; add_timer(&exp->timeout); to mod_timer(&exp->timeout, jiffies + info->timeout * HZ); This is not correct, and introduce a race codition: CPU0 CPU1 - timer expire process_rcf expectation_timed_out lock(exp_lock) - find_exp waiting exp_lock... re-activate timer!! waiting exp_lock... unlock(exp_lock) lock(exp_lock) - unlink expect - free(expect) - unlock(exp_lock) So when the timer expires again, we will access the memory that was already freed. Replace mod_timer with mod_timer_pending here to fix this problem. Fixes: 96d1327ac2e3 ("netfilter: h323: Use mod_timer instead of set_expect_timeout") Cc: Gao Feng <fgao@ikuai8.com> Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-08-08Revert "wext: Fix 32 bit iwpriv compatibility issue with 64 bit Kernel"Johannes Berg
This reverts commit 3d5fdff46c4b2b9534fa2f9fc78e90a48e0ff724. Ben Hutchings pointed out that the commit isn't safe since it assumes that the structure used by the driver is iw_point, when in fact there's no way to know about that. Fortunately, the only driver in the tree that ever runs this code path is the wilc1000 staging driver, so it doesn't really matter. Clearly I should have investigated this better before applying, sorry. Reported-by: Ben Hutchings <ben@decadent.org.uk> Cc: stable@vger.kernel.org [though I guess it doesn't matter much] Fixes: 3d5fdff46c4b ("wext: Fix 32 bit iwpriv compatibility issue with 64 bit Kernel") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-08-06Merge tag 'mac80211-for-davem-2016-08-05' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== First set of fixes for the current cycle: * fix 80+80 bandwidth warning * fix powersave with mac80211 TXQ implementation * use correct way to free SKBs from multicast buffering * mesh: fix operation ordering to work with all drivers * mesh: end service period even when peer goes away * mesh: correct HT opmode validity checks * pass hw pointer from mac80211 to driver in TPT method, fixing a bug (in a bit the wrong way, but that's what we have right now) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>