summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2011-05-24x86, ioapic: Restore ioapic entries during resume properlySuresh Siddha
In mask/restore_ioapic_entries() we should be restoring ioapic entries when ioapics[apic].saved_registers is not NULL. Fix the typo and address the resume hang regression reported by Linus. This was not found sooner because the systems where these changes were tested on kept the IO-APIC entries intact over resume. Reported-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Daniel J Blueman <daniel.blueman@gmail.com> Link: http://lkml.kernel.org/r/1306259131.7171.7.camel@sbsiddha-MOBL3.sc.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-24dst: catch uninitialized metricsStephen Hemminger
Catch cases where dst_metric_set() and other functions are called but _metrics is NULL. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-24be2net: hash key for rss-config cmd not setSathya Perla
A non-zero, non-descript value is needed as the hash key. The hash variable was left un-initialized; but sometimes it gets a zero value and hashing is not effective. The constant value used now (not of any significance) seems to work fine. Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-24bridge: initialize fake_rtable metricsEric Dumazet
bridge netfilter code uses a fake_rtable, and we must init its _metric field or risk NULL dereference later. Ref: https://bugzilla.kernel.org/show_bug.cgi?id=35672 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-24net: fix __dst_destroy_metrics_generic()Eric Dumazet
dst_default_metrics is readonly, we dont want to kfree() it later. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-24CDC NCM: release interfaces fix in unbind()Alexey Orishko
Changes: - claim slave/data interface during bind() and release interfaces in unbind() unconditionally - in case of error during bind(), release claimed data interface in the same function - remove obsolited "*_claimed" entries from driver context Signed-off-by: Alexey Orishko <alexey.orishko@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-24bnx2x: fix inverted conditionDmitry Kravkov
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-24igmp: call ip_mc_clear_src() only when we have no users of ip_mc_listVeaceslav Falico
In igmp_group_dropped() we call ip_mc_clear_src(), which resets the number of source filters per mulitcast. However, igmp_group_dropped() is also called on NETDEV_DOWN, NETDEV_PRE_TYPE_CHANGE and NETDEV_UNREGISTER, which means that the group might get added back on NETDEV_UP, NETDEV_REGISTER and NETDEV_POST_TYPE_CHANGE respectively, leaving us with broken source filters. To fix that, we must clear the source filters only when there are no users in the ip_mc_list, i.e. in ip_mc_dec_group() and on device destroy. Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-24net: use synchronize_rcu_expedited()Eric Dumazet
synchronize_rcu() is very slow in various situations (HZ=100, CONFIG_NO_HZ=y, CONFIG_PREEMPT=n) Extract from my (mostly idle) 8 core machine : synchronize_rcu() in 99985 us synchronize_rcu() in 79982 us synchronize_rcu() in 87612 us synchronize_rcu() in 79827 us synchronize_rcu() in 109860 us synchronize_rcu() in 98039 us synchronize_rcu() in 89841 us synchronize_rcu() in 79842 us synchronize_rcu() in 80151 us synchronize_rcu() in 119833 us synchronize_rcu() in 99858 us synchronize_rcu() in 73999 us synchronize_rcu() in 79855 us synchronize_rcu() in 79853 us When we hold RTNL mutex, we would like to spend some cpu cycles but not block too long other processes waiting for this mutex. We also want to setup/dismantle network features as fast as possible at boot/shutdown time. This patch makes synchronize_net() call the expedited version if RTNL is locked. synchronize_rcu_expedited() typical delay is about 20 us on my machine. synchronize_rcu_expedited() in 18 us synchronize_rcu_expedited() in 18 us synchronize_rcu_expedited() in 18 us synchronize_rcu_expedited() in 18 us synchronize_rcu_expedited() in 20 us synchronize_rcu_expedited() in 16 us synchronize_rcu_expedited() in 20 us synchronize_rcu_expedited() in 18 us synchronize_rcu_expedited() in 18 us Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com> CC: Ben Greear <greearb@candelatech.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-24kbuild: Create a kernel-headers RPMArun Sharma
To compile binaries which depend on new kernel interfaces, we need a kernel-headers RPM Signed-off-by: Arun Sharma <asharma@fb.com> Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-05-24rpm-pkg: Fix when current directory is a symlinkMichal Marek
The better fix would be to stop using the parent directory (principle of least surprise), but as long as we use it, use it consistently. Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-05-24loop: handle on-demand devices correctlyNamhyung Kim
When finding or allocating a loop device, loop_probe() did not take partition numbers into account so that it can result to a different device. Consider following example: $ sudo modprobe loop max_part=15 $ ls -l /dev/loop* brw-rw---- 1 root disk 7, 0 2011-05-24 22:16 /dev/loop0 brw-rw---- 1 root disk 7, 16 2011-05-24 22:16 /dev/loop1 brw-rw---- 1 root disk 7, 32 2011-05-24 22:16 /dev/loop2 brw-rw---- 1 root disk 7, 48 2011-05-24 22:16 /dev/loop3 brw-rw---- 1 root disk 7, 64 2011-05-24 22:16 /dev/loop4 brw-rw---- 1 root disk 7, 80 2011-05-24 22:16 /dev/loop5 brw-rw---- 1 root disk 7, 96 2011-05-24 22:16 /dev/loop6 brw-rw---- 1 root disk 7, 112 2011-05-24 22:16 /dev/loop7 $ sudo mknod /dev/loop8 b 7 128 $ sudo losetup /dev/loop8 ~/temp/disk-with-3-parts.img $ sudo losetup -a /dev/loop128: [0805]:278201 (/home/namhyung/temp/disk-with-3-parts.img) $ ls -l /dev/loop* brw-rw---- 1 root disk 7, 0 2011-05-24 22:16 /dev/loop0 brw-rw---- 1 root disk 7, 16 2011-05-24 22:16 /dev/loop1 brw-rw---- 1 root disk 7, 2048 2011-05-24 22:18 /dev/loop128 brw-rw---- 1 root disk 7, 2049 2011-05-24 22:18 /dev/loop128p1 brw-rw---- 1 root disk 7, 2050 2011-05-24 22:18 /dev/loop128p2 brw-rw---- 1 root disk 7, 2051 2011-05-24 22:18 /dev/loop128p3 brw-rw---- 1 root disk 7, 32 2011-05-24 22:16 /dev/loop2 brw-rw---- 1 root disk 7, 48 2011-05-24 22:16 /dev/loop3 brw-rw---- 1 root disk 7, 64 2011-05-24 22:16 /dev/loop4 brw-rw---- 1 root disk 7, 80 2011-05-24 22:16 /dev/loop5 brw-rw---- 1 root disk 7, 96 2011-05-24 22:16 /dev/loop6 brw-rw---- 1 root disk 7, 112 2011-05-24 22:16 /dev/loop7 brw-r--r-- 1 root root 7, 128 2011-05-24 22:17 /dev/loop8 After this patch, /dev/loop8 - instead of /dev/loop128 - was accessed correctly. In addition, 'range' passed to blk_register_region() should include all range of dev_t that LOOP_MAJOR can address. It does not need to be limited by partition numbers unless 'max_loop' param was specified. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Laurent Vivier <Laurent.Vivier@bull.net> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-24loop: limit 'max_part' module param to DISK_MAX_PARTSNamhyung Kim
The 'max_part' parameter controls the number of maximum partition a loop block device can have. However if a user specifies very large value it would exceed the limitation of device minor number and can cause a kernel panic (or, at least, produce invalid device nodes in some cases). On my desktop system, following command kills the kernel. On qemu, it triggers similar oops but the kernel was alive: $ sudo modprobe loop max_part0000 ------------[ cut here ]------------ kernel BUG at /media/Linux_Data/project/linux/fs/sysfs/group.c:65! invalid opcode: 0000 [#1] SMP last sysfs file: CPU 0 Modules linked in: loop(+) Pid: 43, comm: insmod Tainted: G W 2.6.39-qemu+ #155 Bochs Bochs RIP: 0010:[<ffffffff8113ce61>] [<ffffffff8113ce61>] internal_create_group= +0x2a/0x170 RSP: 0018:ffff880007b3fde8 EFLAGS: 00000246 RAX: 00000000ffffffef RBX: ffff880007b3d878 RCX: 00000000000007b4 RDX: ffffffff8152da50 RSI: 0000000000000000 RDI: ffff880007b3d878 RBP: ffff880007b3fe38 R08: ffff880007b3fde8 R09: 0000000000000000 R10: ffff88000783b4a8 R11: ffff880007b3d878 R12: ffffffff8152da50 R13: ffff880007b3d868 R14: 0000000000000000 R15: ffff880007b3d800 FS: 0000000002137880(0063) GS:ffff880007c00000(0000) knlGS:00000000000000= 00 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000422680 CR3: 0000000007b50000 CR4: 00000000000006b0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000 Process insmod (pid: 43, threadinfo ffff880007b3e000, task ffff880007afb9c= 0) Stack: ffff880007b3fe58 ffffffff811e66dd ffff880007b3fe58 ffffffff811e570b 0000000000000010 ffff880007b3d800 ffff880007a7b390 ffff880007b3d868 0000000000400920 ffff880007b3d800 ffff880007b3fe48 ffffffff8113cfc8 Call Trace: [<ffffffff811e66dd>] ? device_add+0x4bc/0x5af [<ffffffff811e570b>] ? dev_set_name+0x3c/0x3e [<ffffffff8113cfc8>] sysfs_create_group+0xe/0x12 [<ffffffff810b420e>] blk_trace_init_sysfs+0x14/0x16 [<ffffffff8116a090>] blk_register_queue+0x47/0xf7 [<ffffffff8116f527>] add_disk+0xdf/0x290 [<ffffffffa00060eb>] loop_init+0xeb/0x1b8 [loop] [<ffffffffa0006000>] ? 0xffffffffa0005fff [<ffffffff8100020a>] do_one_initcall+0x7a/0x12e [<ffffffff81096804>] sys_init_module+0x9c/0x1e0 [<ffffffff813329bb>] system_call_fastpath+0x16/0x1b Code: c3 55 48 89 e5 41 57 41 56 41 89 f6 41 55 41 54 49 89 d4 53 48 89 fb= 48 83 ec 28 48 85 ff 74 0b 85 f6 75 0b 48 83 7f 30 00 75 14 <0f> 0b eb fe = 48 83 7f 30 00 b9 ea ff ff ff 0f 84 18 01 00 00 49 RIP [<ffffffff8113ce61>] internal_create_group+0x2a/0x170 RSP <ffff880007b3fde8> ---[ end trace a123eb592043acad ]--- Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Laurent Vivier <Laurent.Vivier@bull.net> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-24video: mb862xxfb: add support for L1 displayingAnatolij Gustschin
Allow displaying L1 video data on top of the primary L0 layer. The L1 layer position and dimensions can be configured and the layer enabled/disabled by using the appropriate L1 controls added by this patch. Signed-off-by: Anatolij Gustschin <agust@denx.de>
2011-05-24video: mb862xx: add support for controller's I2C bus adapterAnatolij Gustschin
Add adapter driver for I2C adapter in Coral-P(A)/Lime GDCs. So we can easily access devices on controller's I2C bus using i2c-dev interface. Signed-off-by: Anatolij Gustschin <agust@denx.de>
2011-05-24video: mb862xxfb: relocate register space to get contiguous vramAnatolij Gustschin
By default the GDC registers are located in the middle of the 64MiB area for video RAM and registers. When 32MiB VRAM or more is used, relocate the register space to the top of the 64MiB space so that we get the contiguous VRAM for GDC frame buffer layers, drawing frames, capture and cursor buffers. Signed-off-by: Anatolij Gustschin <agust@denx.de>
2011-05-24video: mb862xxfb: use pre-initialized configuration for PCI GDCsAnatolij Gustschin
If the bootloader has already initialized the display controller, do not re-initialize it in the driver. Take over the bootloader's configuration instead. This is already supported for non PCI GDCs Lime and Mint. Add this functionality for PCI GDCs Coral-P and Coral-PA. It is useful to avoid flicker and also avoids unneeded init delays while booting. Signed-off-by: Anatolij Gustschin <agust@denx.de>
2011-05-24video: mb862xxfb: correct fix.smem_len field initializationAnatolij Gustschin
Initialize smem_len field to the actual frame buffer size and not to the whole video RAM size. This prevents overwriting other video memory (which could be used by other layers, cursors or accelerated drivers) by frame buffer applications relying on fix.smem_len. Signed-off-by: Anatolij Gustschin <agust@denx.de>
2011-05-24kconfig: do not record timestamp in .configArnaud Lacombe
Signed-off-by: Arnaud Lacombe <lacombar@gmail.com> Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-05-24export_report: use warn() to issue WARNING, so they go to stderrJim Cromie
Also count CONFIG_MODVERSIONS warnings, and print a NOTE at start of SECTION 2 if any were issued. Section 2 will be empty if the build is lacking this CONFIG_ item, and user may have missed the warnings, as they're off screen. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-05-24export_report: sort SECTION 2 outputJim Cromie
Sort SECTION 2 modules by name. Within those module listings, sort the symbol providers by name, and remove the count, as it is misleading; its the kernel-wide count of uses of that symbol, not the count pertaining to the module being outlined. (this can be seen by grepping the output for a single symbol). The count is still used to sort the symbols. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-05-24export_report: do collectcfiles work in perl itselfJim Cromie
Avoid spawning a shell pipeline doing cat, grep, sed, and do it all inside perl. The <*.c> globbing construct works at least as far back as 5.8.9 Note that this is not just an optimization; the sed command in the pipeline was unterminated, due to lack of escape on the end-of-line (\$) in the regex, resulting in this: $ perl ../linux-2.6/scripts/export_report.pl > /dev/null sed: -e expression #1, char 5: unterminated `s' command sh: .mod.c/: not found Comments on an earlier patch sought an all-perl implementation. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> cc: Michal Marek <mmarek@suse.cz>, cc: linux-kbuild@vger.kernel.org cc: Arnaud Lacombe lacombar@gmail.com cc: Stephen Hemminger shemminger@vyatta.com Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-05-24gconfig: Hide unused left treeview when start up the interfaceEduardo Silva
When the gconfig program starts in full mode view, it shows the left treeview which belongs to the 'split mode view'. The patch fix this visual issue. Signed-off-by: Eduardo Silva <edsiper@gmail.com> Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-05-24gconfig: enable rules hint for main treeviewsEduardo Silva
Due to the large amount of rows in the treeviews, is difficult to match columns with rows, setting the rules hint to 'true' allows the treeview to alternate background colors in the rows making the data more readable. Signed-off-by: Eduardo Silva <edsiper@gmail.com> Tested-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-05-24Merge branch 'master' of git://git.infradead.org/users/eparis/selinux into ↵James Morris
for-linus Conflicts: lib/flex_array.c security/selinux/avc.c security/selinux/hooks.c security/selinux/ss/policydb.c security/smack/smack_lsm.c Manually resolve conflicts. Signed-off-by: James Morris <jmorris@namei.org>
2011-05-24Merge branch 'next' into for-linusJames Morris
2011-05-24compat: include aio_abi.h for aio_context_tStephen Rothwell
fixes this build error on sparc64 (at least): In file included from arch/sparc/include/asm/siginfo.h:19, from include/linux/signal.h:5, from include/linux/sched.h:73, from arch/sparc/kernel/asm-offsets.c:13: include/linux/compat.h:401: error: expected ')' before 'ctx_id' include/linux/compat.h:406: error: expected ')' before 'ctx_id' Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2011-05-24x86: Get rid of asmregparmRichard Weinberger
As UML does no longer need asmregparm we can remove it. Signed-off-by: Richard Weinberger <richard@nod.at> Cc: namhyung@gmail.com Cc: davem@davemloft.net Cc: fweisbec@gmail.com Cc: dhowells@redhat.com Link: http://lkml.kernel.org/r/%3C1306189085-29896-1-git-send-email-richard%40nod.at%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-05-24um: Use RWSEM_GENERIC_SPINLOCK on x86Richard Weinberger
Commit d12337 (rwsem: Remove redundant asmregparm annotation) broke rwsem on UML. As we cannot compile UML with -mregparm=3 and keeping asmregparm only for UML is inadequate the easiest solution is using RWSEM_GENERIC_SPINLOCK. Thanks to Thomas Gleixner for the idea. Reported-by: Toralf Förster <toralf.foerster@gmx.de> Tested-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: Richard Weinberger <richard@nod.at> Cc: user-mode-linux-devel@lists.sourceforge.net Cc: <stable@kernel.org> # .39.x Link: http://lkml.kernel.org/r/%3C1306183893-26655-1-git-send-email-richard%40nod.at%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-05-24posix-timers: RCU conversionEric Dumazet
Ben Nagy reported a scalability problem with KVM/QEMU that hit very hard a single spinlock (idr_lock) in posix-timers code, on its 48 core machine. Even on a 16 cpu machine (2x4x2), a single test can show 98% of cpu time used in ticket_spin_lock, from lock_timer Ref: http://www.spinics.net/lists/kvm/msg51526.html Switching to RCU is quite easy, IDR being already RCU ready. idr_lock should be locked only for an insert/delete, not a lookup. Benchmark on a 2x4x2 machine, 16 processes calling timer_gettime(). Before : real 1m18.669s user 0m1.346s sys 1m17.180s After : real 0m3.296s user 0m1.366s sys 0m1.926s Reported-by: Ben Nagy <ben@iagu.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Ben Nagy <ben@iagu.net> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: Richard Cochran <richard.cochran@omicron.at> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-05-24video: s3c-fb: correct transparency checking in 32bppJingoo Han
32bpp means ARGB 8888 in the driver, therfore the transparency length and offset should be 8 and 24 respectively. However, the transparency length and offset were previously 0, which means that the driver supports RGB 888 without alpha blending when 32bpp is used. So, the transparency checking in 32bpp is corrected so that the transparency length and offset are 8 and 24 respectively. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2011-05-24video: s3c-fb: add gpio setup function to resume functionJingoo Han
This patch adds gpio setup function to resume function to ensure gpio used by FIMD IP and LCD panel during a resume. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2011-05-24drbd: fix warningAndrew Morton
In file included from drivers/block/drbd/drbd_main.c:54: drivers/block/drbd/drbd_int.h:1190: warning: parameter has incomplete type Forward declarations of enums do not work. Fix it unpleasantly by moving the prototype. Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Lars Ellenberg <drbd-dev@lists.linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2011-05-24drbd: fix warningPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2011-05-24cfq-iosched: free cic_index if cfqd allocation failsNamhyung Kim
When struct cfq_data allocation fails, cic_index need to be freed. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-24cfq-iosched: remove unused 'group_changed' in cfq_service_tree_add()Namhyung Kim
The 'group_changed' variable is initialized to 0 and never changed, so checking the variable is meaningless. It is a leftover from 0bbfeb832042 ("cfq-iosched: Always provide group iosolation."). Let's get rid of it. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Justin TerAvest <teravest@google.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-24cfq-iosched: reduce bit operations in cfq_choose_req()Namhyung Kim
Reduce the number of bit operations in cfq_choose_req() on average (and worst) cases. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-24cfq-iosched: algebraic simplification in cfq_prio_to_maxrq()Namhyung Kim
Simplify the calculation in cfq_prio_to_maxrq(), plus replace CFQ_PRIO_LISTS to IOPRIO_BE_NR since they are the same and IOPRIO_BE_NR looks more reasonable in this context IMHO. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-24drbd: Fix spellingBart Van Assche
Found these with the help of ispell -l. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2011-05-24drbd: fix schedule in atomicLars Ellenberg
An administrative detach used to request a state change directly to D_DISKLESS, first suspending IO to avoid the last put_ldev() occuring from an endio handler, potentially in irq context. This is not enough on the receiving side (typically secondary), we may miss some peer_req on the way to local disk, which then may do the last put_ldev() from their drbd_peer_request_endio(). This patch makes the detach always go through the intermediate D_FAILED state. We may consider to rename it D_DETACHING. Alternative approach would be to create yet an other work item to be scheduled on the worker, do the destructor work from there, and get the timing right. manually picked commit 564040f from the drbd 8.4 branch. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: Take a more conservative approach when deciding max_bio_sizePhilipp Reisner
The old (optimistic) implementation could shrink the bio size on an primary device. Shrinking the bio size on a primary device is bad. Since there we might get BIOs with the old (bigger) size shortly after we published the new size. The new implementation is more conservative, and eventually increases the max_bio_size on a primary device (which is valid). It does so, when it knows the local limit AND the remote limit. We cache the last seen max_bio_size of the peer in the meta data, and rely on that, to make the operation of single nodes more efficient. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: Fixed state transitions after async outdate-peer-handler returnedPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: Disallow the peer_disk_state to be D_OUTDATED while connectedPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: Fix for the connection problems on high latency linksPhilipp Reisner
It seems that the real cause of all the issues where that we did not noticed in drbd_try_connect() when the other guy closes one socket if the round trip time gets higher than 100ms. There were that 100ms hard coded! Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: fix potential activity log refcount imbalance in error pathLars Ellenberg
It is no longer sufficient to trigger on local WRITE, we need to check on (rq_state & RQ_IN_ACT_LOG) before calling drbd_al_complete_io also in the error path. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: Only downgrade the disk state in case of disk failuresPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: fix disconnect/reconnect loop, if ping-timeout == ping-intLars Ellenberg
If there is no replication traffic within the idle timeout (ping-int seconds), DRBD will send a P_PING, and adjust the timeout to ping-timeout. If there is no P_PING_ACK received within this ping-timeout, DRBD finally drops the connection, and tries to re-establish it. To decide which timeout was active, we compared the current timeout with the ping-timeout, and dropped the connection, if that was the case. By default, ping-int is 10 seconds, ping-timeout is 500 ms. Unfortunately, if you configure ping-timeout to be the same as ping-int, expiry of the idle-timeout had been mistaken for a missing ping ack, and caused an immediate reconnection attempt. Fix: Allow both timeouts to be equal, use a local variable to store which timeout is active. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: fix potential distributed deadlockLars Ellenberg
We limit ourselves to a configurable maximum number of pages used as temporary bio pages. If the configured "max_buffers" is not big enough to match the bandwidth of the respective deployment, a distributed deadlock could be triggered by e.g. fast online verify and heavy application IO. TCP connections would block on congestion, because both receivers would wait on pages to become available. Fortunately the respective senders in this case would be able to give back some pages already. So do that. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24lru_cache.h: fix comments referring to ts_ instead of lc_Lars Ellenberg
For some time we contemplated calling the "struct lru_cache" a "struct tracked_set", and some comments kept the ts_ prefix. Fix those to match the member field names. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: Fix for application IO with the on-io-error=pass-on policyPhilipp Reisner
In case a write failes on the local disk, go into D_INCONSISTENT disk state. That causes future reads of that block to be shipped to the peer. Read retry remote was already in place. Actually the documentation needs to get fixed now. Since the application is still shielded from the error. (as long as we have only a single disk failing) The difference to detach is that we keep the disk. And therefore might keep all the other, still working sectors up to date. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>