summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2012-07-16net: cgroup: fix access the unallocated memory in netprio cgroupGao feng
there are some out of bound accesses in netprio cgroup. now before accessing the dev->priomap.priomap array,we only check if the dev->priomap exist.and because we don't want to see additional bound checkings in fast path, so we should make sure that dev->priomap is null or array size of dev->priomap.priomap is equal to max_prioidx + 1; so in write_priomap logic,we should call extend_netdev_table when dev->priomap is null and dev->priomap.priomap_len < max_len. and in cgrp_create->update_netdev_tables logic,we should call extend_netdev_table only when dev->priomap exist and dev->priomap.priomap_len < max_len. and it's not needed to call update_netdev_tables in write_priomap, we can only allocate the net device's priomap which we change through net_prio.ifpriomap. this patch also add a return value for update_netdev_tables & extend_netdev_table, so when new_priomap is allocated failed, write_priomap will stop to access the priomap,and return -ENOMEM back to the userspace to tell the user what happend. Change From v3: 1. add rtnl protect when reading max_prioidx in write_priomap. 2. only call extend_netdev_table when map->priomap_len < max_len, this will make sure array size of dev->map->priomap always bigger than any prioidx. 3. add a function write_update_netdev_table to make codes clear. Change From v2: 1. protect extend_netdev_table by RTNL. 2. when extend_netdev_table failed,call dev_put to reduce device's refcount. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Eric Dumazet <edumazet@google.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16bridge: Fix enforcement of multicast hash_max limitThomas Graf
The hash size is doubled when it needs to grow and compared against hash_max. The >= comparison will limit the hash table size to half of what is expected i.e. the default 512 hash_max will not allow the hash table to grow larger than 256. Also print the hash table limit instead of the desirable size when the limit is reached. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16net/mlx4_en: dereferencing freed memoryDan Carpenter
We dereferenced "mclist" after the kfree(). Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16net/mlx4: off by one in parse_trans_rule()Dan Carpenter
This should be ">=" here instead of ">". MLX4_NET_TRANS_RULE_NUM is 6. We use "spec->id" as an array offset into the __rule_hw_sz[] and __sw_id_hw[] arrays which have 6 elements. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16ipv6: fix RTPROT_RA markup of RA routes w/nexthopsDenis Ovsienko
Userspace implementations of network routing protocols sometimes need to tell RA-originated IPv6 routes from other kernel routes to make proper routing decisions. This makes most sense for RA routes with nexthops, namely, default routes and Route Information routes. The intended mean of preserving RA route origin in a netlink message is through indicating RTPROT_RA as protocol code. Function rt6_fill_node() tried to do that for default routes, but its test condition was taken wrong. This change is modeled after the original mailing list posting by Jeff Haran. It fixes the test condition for default route case and sets the same behaviour for Route Information case (both types use nexthops). Handling of the 3rd RA route type, Prefix Information, is left unchanged, as it stands for interface connected routes (without nexthops). Signed-off-by: Denis Ovsienko <infrastation@yandex.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16hyperv: Add support for setting MAC from within guestsHaiyang Zhang
This adds support for setting synthetic NIC MAC address from within Linux guests. Before using this feature, the option "spoofing of MAC address" should be enabled at the Hyper-V manager / Settings of the synthetic NIC. Thanks to Kin Cho <kcho@infoblox.com> for the initial implementation and tests. And, thanks to Long Li <longli@microsoft.com> for the debugging works. Reported-and-tested-by: Kin Cho <kcho@infoblox.com> Reported-by: Long Li <longli@microsoft.com> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-166lowpan: Change byte order when storing/accessing to len fieldTony Cheneau
Lenght field should be encoded using big endian byte order, such as intend in the specs. As it is currently written, the len field would not be decoded properly on an implementation using the correct byte ordering. Hence, it could lead to interroperability issues. Also, I rewrote the code so that iphc0 argument of lowpan_alloc_new_frame could be removed. Signed-off-by: Tony Cheneau <tony.cheneau@amnesiak.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-166lowpan: Change byte order when storing/accessing u16 tagTony Cheneau
The tag field should be stored and accessed using big endian byte order (as intended in the specs). Or else, when displayed with a trafic analyser, such a Wireshark, the field not properly displayed (e.g. 0x01 00 instead of 0x00 01, and so on). Signed-off-by: Tony Cheneau <tony.cheneau@amnesiak.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-166lowpan: Fix null pointer dereference in UDP uncompression functionTony Cheneau
When a UDP packet gets fragmented, a crash will occur at reassembly time. This is because skb->transport_header is not set during earlier period of fragment reassembly. As a consequence, call to udp_hdr() return NULL and uh (which is NULL) gets dereferenced without much test. Signed-off-by: Tony Cheneau <tony.cheneau@amnesiak.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16ixgbevf: Prevent RX/TX statistics getting reset to zeroNarendra K
The commit 4197aa7bb81877ebb06e4f2cc1b5fea2da23a7bd implements 64 bit per ring statistics. But the driver resets the 'total_bytes' and 'total_packets' from RX and TX rings in the RX and TX interrupt handlers to zero. This results in statistics being lost and user space reporting RX and TX statistics as zero. This patch addresses the issue by preventing the resetting of RX and TX ring statistics to zero. Signed-off-by: Narendra K <narendra_k@dell.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16arch: Use eth_random_addrJoe Perches
Convert the existing uses of random_ether_addr to the new eth_random_addr. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16usb: Use eth_random_addrJoe Perches
Convert the existing uses of random_ether_addr to the new eth_random_addr. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Felipe Balbi <balbi@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16s390: Use eth_random_addrJoe Perches
Convert the existing uses of random_ether_addr to the new eth_random_addr. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16drivers/net: Use eth_random_addrJoe Perches
Convert the existing uses of random_ether_addr to the new eth_random_addr. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16wireless: Use eth_random_addrJoe Perches
Convert the existing uses of random_ether_addr to the new eth_random_addr. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Hin-Tak Leung <htl10@users.sourceforge.net> Acked-by: Gertjan van Wingerde <gwingerde@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16net: usb: Use eth_random_addrJoe Perches
Convert the existing uses of random_ether_addr to the new eth_random_addr. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16ethernet: Use eth_random_addrJoe Perches
Convert the existing uses of random_ether_addr to the new eth_random_addr. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16etherdevice: Rename random_ether_addr to eth_random_addrJoe Perches
Add some API symmetry to eth_broadcast_addr and add a #define to the old name for backward compatibility. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16Merge branch 'tipc_net-next' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux Paul Gortmaker says: ==================== This is the same eight commits as sent for review last week[1], with just the incorporation of the pr_fmt change as suggested by JoeP. There was no additional change requests, so unless you can see something else you'd like me to change, please pull. ... Erik Hugne (5): tipc: use standard printk shortcut macros (pr_err etc.) tipc: remove TIPC packet debugging functions and macros tipc: simplify print buffer handling in tipc_printf tipc: phase out most of the struct print_buf usage tipc: remove print_buf and deprecated log buffer code Paul Gortmaker (3): tipc: factor stats struct out of the larger link struct tipc: limit error messages relating to memory leak to one line tipc: simplify link_print by divorcing it from using tipc_printf ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16sctp: Fix list corruption resulting from freeing an association on a listNeil Horman
A few days ago Dave Jones reported this oops: [22766.294255] general protection fault: 0000 [#1] PREEMPT SMP [22766.295376] CPU 0 [22766.295384] Modules linked in: [22766.387137] ffffffffa169f292 6b6b6b6b6b6b6b6b ffff880147c03a90 ffff880147c03a74 [22766.387135] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 00000000000 [22766.387136] Process trinity-watchdo (pid: 10896, threadinfo ffff88013e7d2000, [22766.387137] Stack: [22766.387140] ffff880147c03a10 [22766.387140] ffffffffa169f2b6 [22766.387140] ffff88013ed95728 [22766.387143] 0000000000000002 [22766.387143] 0000000000000000 [22766.387143] ffff880003fad062 [22766.387144] ffff88013c120000 [22766.387144] [22766.387145] Call Trace: [22766.387145] <IRQ> [22766.387150] [<ffffffffa169f292>] ? __sctp_lookup_association+0x62/0xd0 [sctp] [22766.387154] [<ffffffffa169f2b6>] __sctp_lookup_association+0x86/0xd0 [sctp] [22766.387157] [<ffffffffa169f597>] sctp_rcv+0x207/0xbb0 [sctp] [22766.387161] [<ffffffff810d4da8>] ? trace_hardirqs_off_caller+0x28/0xd0 [22766.387163] [<ffffffff815827e3>] ? nf_hook_slow+0x133/0x210 [22766.387166] [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0 [22766.387168] [<ffffffff8159043d>] ip_local_deliver_finish+0x18d/0x4c0 [22766.387169] [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0 [22766.387171] [<ffffffff81590a07>] ip_local_deliver+0x47/0x80 [22766.387172] [<ffffffff8158fd80>] ip_rcv_finish+0x150/0x680 [22766.387174] [<ffffffff81590c54>] ip_rcv+0x214/0x320 [22766.387176] [<ffffffff81558c07>] __netif_receive_skb+0x7b7/0x910 [22766.387178] [<ffffffff8155856c>] ? __netif_receive_skb+0x11c/0x910 [22766.387180] [<ffffffff810d423e>] ? put_lock_stats.isra.25+0xe/0x40 [22766.387182] [<ffffffff81558f83>] netif_receive_skb+0x23/0x1f0 [22766.387183] [<ffffffff815596a9>] ? dev_gro_receive+0x139/0x440 [22766.387185] [<ffffffff81559280>] napi_skb_finish+0x70/0xa0 [22766.387187] [<ffffffff81559cb5>] napi_gro_receive+0xf5/0x130 [22766.387218] [<ffffffffa01c4679>] e1000_receive_skb+0x59/0x70 [e1000e] [22766.387242] [<ffffffffa01c5aab>] e1000_clean_rx_irq+0x28b/0x460 [e1000e] [22766.387266] [<ffffffffa01c9c18>] e1000e_poll+0x78/0x430 [e1000e] [22766.387268] [<ffffffff81559fea>] net_rx_action+0x1aa/0x3d0 [22766.387270] [<ffffffff810a495f>] ? account_system_vtime+0x10f/0x130 [22766.387273] [<ffffffff810734d0>] __do_softirq+0xe0/0x420 [22766.387275] [<ffffffff8169826c>] call_softirq+0x1c/0x30 [22766.387278] [<ffffffff8101db15>] do_softirq+0xd5/0x110 [22766.387279] [<ffffffff81073bc5>] irq_exit+0xd5/0xe0 [22766.387281] [<ffffffff81698b03>] do_IRQ+0x63/0xd0 [22766.387283] [<ffffffff8168ee2f>] common_interrupt+0x6f/0x6f [22766.387283] <EOI> [22766.387284] [22766.387285] [<ffffffff8168eed9>] ? retint_swapgs+0x13/0x1b [22766.387285] Code: c0 90 5d c3 66 0f 1f 44 00 00 4c 89 c8 5d c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 66 66 90 <0f> b7 87 98 00 00 00 48 89 fb 49 89 f5 66 c1 c0 08 66 39 46 02 [22766.387307] [22766.387307] RIP [22766.387311] [<ffffffffa168a2c9>] sctp_assoc_is_match+0x19/0x90 [sctp] [22766.387311] RSP <ffff880147c039b0> [22766.387142] ffffffffa16ab120 [22766.599537] ---[ end trace 3f6dae82e37b17f5 ]--- [22766.601221] Kernel panic - not syncing: Fatal exception in interrupt It appears from his analysis and some staring at the code that this is likely occuring because an association is getting freed while still on the sctp_assoc_hashtable. As a result, we get a gpf when traversing the hashtable while a freed node corrupts part of the list. Nominally I would think that an mibalanced refcount was responsible for this, but I can't seem to find any obvious imbalance. What I did note however was that the two places where we create an association using sctp_primitive_ASSOCIATE (__sctp_connect and sctp_sendmsg), have failure paths which free a newly created association after calling sctp_primitive_ASSOCIATE. sctp_primitive_ASSOCIATE brings us into the sctp_sf_do_prm_asoc path, which issues a SCTP_CMD_NEW_ASOC side effect, which in turn adds a new association to the aforementioned hash table. the sctp command interpreter that process side effects has not way to unwind previously processed commands, so freeing the association from the __sctp_connect or sctp_sendmsg error path would lead to a freed association remaining on this hash table. I've fixed this but modifying sctp_[un]hash_established to use hlist_del_init, which allows us to proerly use hlist_unhashed to check if the node is on a hashlist safely during a delete. That in turn alows us to safely call sctp_unhash_established in the __sctp_connect and sctp_sendmsg error paths before freeing them, regardles of what the associations state is on the hash list. I noted, while I was doing this, that the __sctp_unhash_endpoint was using hlist_unhsashed in a simmilar fashion, but never nullified any removed nodes pointers to make that function work properly, so I fixed that up in a simmilar fashion. I attempted to test this using a virtual guest running the SCTP_RR test from netperf in a loop while running the trinity fuzzer, both in a loop. I wasn't able to recreate the problem prior to this fix, nor was I able to trigger the failure after (neither of which I suppose is suprising). Given the trace above however, I think its likely that this is what we hit. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: davej@redhat.com CC: davej@redhat.com CC: "David S. Miller" <davem@davemloft.net> CC: Vlad Yasevich <vyasevich@gmail.com> CC: Sridhar Samudrala <sri@us.ibm.com> CC: linux-sctp@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16net: make sock diag per-namespaceAndrey Vagin
Before this patch sock_diag works for init_net only and dumps information about sockets from all namespaces. This patch expands sock_diag for all name-spaces. It creates a netlink kernel socket for each netns and filters data during dumping. v2: filter accoding with netns in all places remove an unused variable. Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Pavel Emelyanov <xemul@parallels.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Andrew Vagin <avagin@openvz.org> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16lpc_eth: remove duplicated includeDuan Jiong
Remove duplicated #include <linux/delay.h> in drivers/net/ethernet/nxp/lpc_eth.c Signed-off-by: Duan Jiong<djduanjiong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16tcp: add OFO snmp countersEric Dumazet
Add three SNMP TCP counters, to better track TCP behavior at global stage (netstat -s), when packets are received Out Of Order (OFO) TCPOFOQueue : Number of packets queued in OFO queue TCPOFODrop : Number of packets meant to be queued in OFO but dropped because socket rcvbuf limit hit. TCPOFOMerge : Number of packets in OFO that were merged with other packets. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16cifs: always update the inode cache with the results from a FIND_*Jeff Layton
When we get back a FIND_FIRST/NEXT result, we have some info about the dentry that we use to instantiate a new inode. We were ignoring and discarding that info when we had an existing dentry in the cache. Fix this by updating the inode in place when we find an existing dentry and the uniqueid is the same. Cc: <stable@vger.kernel.org> # .31.x Reported-and-Tested-by: Andrew Bartlett <abartlet@samba.org> Reported-by: Bill Robertson <bill_robertson@debortoli.com.au> Reported-by: Dion Edwards <dion_edwards@debortoli.com.au> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
2012-07-16cifs: when CONFIG_HIGHMEM is set, serialize the read/write kmapsJeff Layton
Jian found that when he ran fsx on a 32 bit arch with a large wsize the process and one of the bdi writeback kthreads would sometimes deadlock with a stack trace like this: crash> bt PID: 2789 TASK: f02edaa0 CPU: 3 COMMAND: "fsx" #0 [eed63cbc] schedule at c083c5b3 #1 [eed63d80] kmap_high at c0500ec8 #2 [eed63db0] cifs_async_writev at f7fabcd7 [cifs] #3 [eed63df0] cifs_writepages at f7fb7f5c [cifs] #4 [eed63e50] do_writepages at c04f3e32 #5 [eed63e54] __filemap_fdatawrite_range at c04e152a #6 [eed63ea4] filemap_fdatawrite at c04e1b3e #7 [eed63eb4] cifs_file_aio_write at f7fa111a [cifs] #8 [eed63ecc] do_sync_write at c052d202 #9 [eed63f74] vfs_write at c052d4ee #10 [eed63f94] sys_write at c052df4c #11 [eed63fb0] ia32_sysenter_target at c0409a98 EAX: 00000004 EBX: 00000003 ECX: abd73b73 EDX: 012a65c6 DS: 007b ESI: 012a65c6 ES: 007b EDI: 00000000 SS: 007b ESP: bf8db178 EBP: bf8db1f8 GS: 0033 CS: 0073 EIP: 40000424 ERR: 00000004 EFLAGS: 00000246 Each task would kmap part of its address array before getting stuck, but not enough to actually issue the write. This patch fixes this by serializing the marshal_iov operations for async reads and writes. The idea here is to ensure that cifs aggressively tries to populate a request before attempting to fulfill another one. As soon as all of the pages are kmapped for a request, then we can unlock and allow another one to proceed. There's no need to do this serialization on non-CONFIG_HIGHMEM arches however, so optimize all of this out when CONFIG_HIGHMEM isn't set. Cc: <stable@vger.kernel.org> Reported-by: Jian Li <jiali@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
2012-07-16cifs: on CONFIG_HIGHMEM machines, limit the rsize/wsize to the kmap spaceJeff Layton
We currently rely on being able to kmap all of the pages in an async read or write request. If you're on a machine that has CONFIG_HIGHMEM set then that kmap space is limited, sometimes to as low as 512 slots. With 512 slots, we can only support up to a 2M r/wsize, and that's assuming that we can get our greedy little hands on all of them. There are other users however, so it's possible we'll end up stuck with a size that large. Since we can't handle a rsize or wsize larger than that currently, cap those options at the number of kmap slots we have. We could consider capping it even lower, but we currently default to a max of 1M. Might as well allow those luddites on 32 bit arches enough rope to hang themselves. A more robust fix would be to teach the send and receive routines how to contend with an array of pages so we don't need to marshal up a kvec array at all. That's a fairly significant overhaul though, so we'll need this limit in place until that's ready. Cc: <stable@vger.kernel.org> Reported-by: Jian Li <jiali@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
2012-07-16Initialise mid_q_entry before putting it on the pending queueSachin Prabhu
A user reported a crash in cifs_demultiplex_thread() caused by an incorrectly set mid_q_entry->callback() function. It appears that the callback assignment made in cifs_call_async() was not flushed back to memory suggesting that a memory barrier was required here. Changing the code to make sure that the mid_q_entry structure was completely initialised before it was added to the pending queue fixes the problem. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Signed-off-by: Steve French <smfrench@gmail.com>
2012-07-16target: Check number of unmap descriptors against our limitRoland Dreier
Fail UNMAP commands that have more than our reported limit on unmap descriptors. Signed-off-by: Roland Dreier <roland@purestorage.com> Cc: stable@vger.kernel.org Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16target: Fix possible integer underflow in UNMAP emulationRoland Dreier
It's possible for an initiator to send us an UNMAP command with a descriptor that is less than 8 bytes; in that case it's really bad for us to set an unsigned int to that value, subtract 8 from it, and then use that as a limit for our loop (since the value will wrap around to a huge positive value). Fix this by making size be signed and only looping if size >= 16 (ie if we have at least a full descriptor available). Also remove offset as an obfuscated name for the constant 8. Signed-off-by: Roland Dreier <roland@purestorage.com> Cc: stable@vger.kernel.org Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16target: Fix reading of data length fields for UNMAP commandsRoland Dreier
The UNMAP DATA LENGTH and UNMAP BLOCK DESCRIPTOR DATA LENGTH fields are in the unmap descriptor (the payload transferred to our data out buffer), not in the CDB itself. Read them from the correct place in target_emulated_unmap. Signed-off-by: Roland Dreier <roland@purestorage.com> Cc: stable@vger.kernel.org Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16target: Add range checking to UNMAP emulationRoland Dreier
When processing an UNMAP command, we need to make sure that the number of blocks we're asked to UNMAP does not exceed our reported maximum number of blocks per UNMAP, and that the range of blocks we're unmapping doesn't go past the end of the device. Signed-off-by: Roland Dreier <roland@purestorage.com> Cc: stable@vger.kernel.org Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16target: Add generation of LOGICAL BLOCK ADDRESS OUT OF RANGERoland Dreier
Many SCSI commands are defined to return a CHECK CONDITION / ILLEGAL REQUEST with ASC set to LOGICAL BLOCK ADDRESS OUT OF RANGE if the initiator sends a command that accesses a too-big LBA. Add an enum value and case entries so that target code can return this status. Signed-off-by: Roland Dreier <roland@purestorage.com> Cc: stable@vger.kernel.org Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16target: Make unnecessarily global se_dev_align_max_sectors() staticRoland Dreier
Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16target: Remove se_session.sess_wait_listRoland Dreier
Since we set se_session.sess_tearing_down and stop new commands from being added to se_session.sess_cmd_list before we wait for commands to finish when freeing a session, there's no need for a separate sess_wait_list -- if we let new commands be added to sess_cmd_list after setting sess_tearing_down, that would be a bug that breaks the logic of waiting in-flight commands. Also rename target_splice_sess_cmd_list() to target_sess_cmd_list_set_waiting(), since we are no longer splicing onto a separate list. Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16qla2xxx: Remove racy, now-redundant check of sess_tearing_downRoland Dreier
Now that target_submit_cmd() / target_get_sess_cmd() check sess_tearing_down before adding commands to the list, we no longer need the check in qlt_do_work(). In fact this check is racy anyway (and that race is what inspired the change to add the check of sess_tearing_down to the target core). Cc: Chad Dupuis <chad.dupuis@qlogic.com> Cc: Arun Easi <arun.easi@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16target: Check sess_tearing_down in target_get_sess_cmd()Roland Dreier
Target core code assumes that target_splice_sess_cmd_list() has set sess_tearing_down and moved the list of pending commands to sess_wait_list, no more commands will be added to the session; if any are added, nothing keeps the se_session from being freed while the command is still in flight, which e.g. leads to use-after-free of se_cmd->se_sess in target_release_cmd_kref(). To enforce this invariant, put a check of sess_tearing_down inside of sess_cmd_lock in target_get_sess_cmd(); any checks before this are racy and can lead to the use-after-free described above. For example, the qla_target check in qlt_do_work() checks sess_tearing_down from work thread context but then drops all locks before calling target_submit_cmd() (as it must, since that is a sleeping function). However, since no locks are held, anything can happen with respect to the session it has looked up -- although it does correctly get sess_kref within its lock, so the memory won't be freed while target_submit_cmd() is actually running, nothing stops eg an ACL from being dropped and calling ->shutdown_session() (which calls into target_splice_sess_cmd_list()) before we get to target_get_sess_cmd(). Once this happens, the se_session memory can be freed as soon as target_submit_cmd() returns and qlt_do_work() drops its reference, even though we've just added a command to sess_cmd_list. To prevent this use-after-free, check sess_tearing_down inside of sess_cmd_lock right before target_get_sess_cmd() adds a command to sess_cmd_list; this is synchronized with target_splice_sess_cmd_list() so that every command is either waited for or not added to the queue. (nab: Keep target_submit_cmd() returning void for now..) Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16sbp-target: Consolidate duplicated error path code in sbp_handle_command()Roland Dreier
Cc: Chris Boot <bootc@bootc.net> Cc: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-16target: Un-export target_get_sess_cmd()Roland Dreier
There are no in-tree users of target_get_sess_cmd() outside of target_core_transport.c. Any new code should use the higher-level target_submit_cmd() interface. So let's un-export target_get_sess_cmd() and make it static to the one file where it's actually used. (nab: Fix up minor fuzz to for-next) Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16qla2xxx: Get rid of redundant qla_tgt_sess.tearing_downRoland Dreier
The only place that sets qla_tgt_sess.tearing_down calls target_splice_sess_cmd_list() immediately afterwards, without dropping the lock it holds. That function sets se_session.sess_tearing_down, so we can get rid of the qla_target-specific flag, and in the one place that looks at the qla_tgt_sess.tearing_down flag just test se_session.sess_tearing_down instead. Cc: Chad Dupuis <chad.dupuis@qlogic.com> Cc: Arun Easi <arun.easi@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16target: Make core_disable_device_list_for_node use pre-refactoring lock orderingNicholas Bellinger
So after kicking around commit 547ac4c9c90 around a bit more, a tcm_qla2xxx LUN unlink OP has generated the following warning: [ 50.386625] qla2xxx [0000:07:00.0]-00af:0: Performing ISP error recovery - ha=ffff880263774000. [ 70.572988] qla2xxx [0000:07:00.0]-8038:0: Cable is unplugged... [ 126.527531] ------------[ cut here ]------------ [ 126.532677] WARNING: at kernel/softirq.c:159 local_bh_enable_ip+0x41/0x8c() [ 126.540433] Hardware name: S5520HC [ 126.544248] Modules linked in: tcm_vhost ib_srpt ib_cm ib_sa ib_mad ib_core tcm_qla2xxx tcm_loop tcm_fc libfc iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod configfs ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_round_robin dm_multipath scsi_dh loop i2c_i801 kvm_intel kvm crc32c_intel i2c_core microcode joydev button iomemory_vsl(O) pcspkr ext3 jbd uhci_hcd lpfc ata_piix libata ehci_hcd qla2xxx mlx4_core scsi_transport_fc scsi_tgt igb [last unloaded: scsi_wait_scan] [ 126.595567] Pid: 3283, comm: unlink Tainted: G O 3.5.0-rc2+ #33 [ 126.603128] Call Trace: [ 126.605853] [<ffffffff81026b91>] ? warn_slowpath_common+0x78/0x8c [ 126.612737] [<ffffffff8102c342>] ? local_bh_enable_ip+0x41/0x8c [ 126.619433] [<ffffffffa03582a2>] ? core_disable_device_list_for_node+0x70/0xe3 [target_core_mod] [ 126.629323] [<ffffffffa035849f>] ? core_clear_lun_from_tpg+0x88/0xeb [target_core_mod] [ 126.638244] [<ffffffffa0362ec1>] ? core_tpg_post_dellun+0x17/0x48 [target_core_mod] [ 126.646873] [<ffffffffa03575ee>] ? core_dev_del_lun+0x26/0x8c [target_core_mod] [ 126.655114] [<ffffffff810bcbd1>] ? dput+0x27/0x154 [ 126.660549] [<ffffffffa0359aa0>] ? target_fabric_port_unlink+0x3b/0x41 [target_core_mod] [ 126.669661] [<ffffffffa034a698>] ? configfs_unlink+0xfc/0x14a [configfs] [ 126.677224] [<ffffffff810b5979>] ? vfs_unlink+0x58/0xb7 [ 126.683141] [<ffffffff810b6ef3>] ? do_unlinkat+0xbb/0x142 [ 126.689253] [<ffffffff81330c75>] ? page_fault+0x25/0x30 [ 126.695170] [<ffffffff81335df9>] ? system_call_fastpath+0x16/0x1b [ 126.702053] ---[ end trace 2f8e5b0a9ec797ef ]--- [ 126.756336] qla2xxx [0000:07:00.0]-00af:0: Performing ISP error recovery - ha=ffff880263774000. [ 146.942414] qla2xxx [0000:07:00.0]-8038:0: Cable is unplugged... So this warning triggered because device_list disable logic is now holding nacl->device_list_lock w/ spin_lock_irqsave before obtaining port->sep_alua_lock with only spin_lock_bh.. The original disable logic obtains *deve ahead of dropping the entry from deve->alua_port_list and then obtains ->device_list_lock to do the remaining work. Also, I'm pretty sure this particular warning is being generated by a demo-mode session in tcm_qla2xxx, and not by explicit NodeACL MappedLUNs. The Initiator MappedLUNs are already protected by a seperate configfs symlink reference back se_lun->lun_group, and the demo-mode se_node_acl (and associated ->device_list[]) is released during se_portal_group->tpg_group shutdown. The following patch drops the extra functional change to disable logic in commit 547ac4c9c90 Cc: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16target: refactor core_update_device_list_for_node()Andy Grover
Code was almost entirely divided based on value of bool param "enable". Split it into two functions. Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16target: Eliminate else using boolean logicAndy Grover
Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16target: Misc retval cleanupsAndy Grover
Bubble-up retval from iscsi_update_param_value() and iscsit_ta_authentication(). Other very small retval cleanups. Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16target: Remove hba param from core_dev_add_lunAndy Grover
Only used in a debugprint, and function signature is cleaner now. Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16target: Remove unneeded double parenthesesAndy Grover
Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16target: replace the processing thread with a TMR work queueChristoph Hellwig
The last functionality of the target processing thread is offloading possibly long running task management requests from the submitter context. To keep TMR semantics the same we need a single threaded ordered queue, which can be provided by a per-device workqueue with the right flags. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16target: remove transport_generic_handle_cdb_mapChristoph Hellwig
Remove this command submission path which is not used by any in-tree driver. This also removes the now unused new_cmd_map fabtric method, which a few drivers implemented despite never calling transport_generic_handle_cdb_map. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16target: simply fabric driver queue full processingChristoph Hellwig
There is no need to schedule the delayed processing in a workqueue that offloads it to the target processing thread. Instead execute it directly from the workqueue. There will be a lot of future work in this area, which I'd likfe to defer for now as it is not nessecary for getting rid of the target processing thread. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16target: remove transport_generic_handle_dataChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16tcm_fc: Offload WRITE I/O backend submission to tpg workqueueChristoph Hellwig
Defer the write processing to the internal to be able to use target_execute_cmd. I'm not even entirely sure the calling code requires this due to the convoluted structure in libfc, but let's be safe for now. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Mark Rustad <mark.d.rustad@intel.com> Cc: Kiran Patil <Kiran.patil@intel.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>