OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Martin K. Petersen	80ddf247c8	block: Set max_sectors correctly for stacking devices The topology changes unintentionally caused SAFE_MAX_SECTORS to be set for stacking devices. Set the default limit to BLK_DEF_MAX_SECTORS and provide SAFE_MAX_SECTORS in blk_queue_make_request() for legacy hw drivers that depend on the old behavior. Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-01 21:15:45 +02:00
Kay Sievers	e454cea20b	Driver-Core: extend devnode callbacks to provide permissions This allows subsytems to provide devtmpfs with non-default permissions for the device node. Instead of the default mode of 0600, null, zero, random, urandom, full, tty, ptmx now have a mode of 0666, which allows non-privileged processes to access standard device nodes in case no other userspace process applies the expected permissions. This also fixes a wrong assignment in pktcdvd and a checkpatch.pl complain. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-09-19 12:50:38 -07:00
Linus Torvalds	ab86e5765d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: Driver Core: devtmpfs - kernel-maintained tmpfs-based /dev debugfs: Modify default debugfs directory for debugging pktcdvd. debugfs: Modified default dir of debugfs for debugging UHCI. debugfs: Change debugfs directory of IWMC3200 debugfs: Change debuhgfs directory of trace-events-sample.h debugfs: Fix mount directory of debugfs by default in events.txt hpilo: add poll f_op hpilo: add interrupt handler hpilo: staging for interrupt handling driver core: platform_device_add_data(): use kmemdup() Driver core: Add support for compatibility classes uio: add generic driver for PCI 2.3 devices driver-core: move dma-coherent.c from kernel to driver/base mem_class: fix bug mem_class: use minor as index instead of searching the array driver model: constify attribute groups UIO: remove 'default n' from Kconfig Driver core: Add accessor for device platform data Driver core: move dev_get/set_drvdata to drivers/base/dd.c Driver core: add new device to bus's list before probing	2009-09-16 08:27:10 -07:00
David Brownell	a4dbd6740d	driver model: constify attribute groups Let attribute group vectors be declared "const". We'd like to let most attribute metadata live in read-only sections... this is a start. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-09-15 09:50:47 -07:00
Linus Torvalds	ada3fa1505	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (46 commits) powerpc64: convert to dynamic percpu allocator sparc64: use embedding percpu first chunk allocator percpu: kill lpage first chunk allocator x86,percpu: use embedding for 64bit NUMA and page for 32bit NUMA percpu: update embedding first chunk allocator to handle sparse units percpu: use group information to allocate vmap areas sparsely vmalloc: implement pcpu_get_vm_areas() vmalloc: separate out insert_vmalloc_vm() percpu: add chunk->base_addr percpu: add pcpu_unit_offsets[] percpu: introduce pcpu_alloc_info and pcpu_group_info percpu: move pcpu_lpage_build_unit_map() and pcpul_lpage_dump_cfg() upward percpu: add @align to pcpu_fc_alloc_fn_t percpu: make @dyn_size mandatory for pcpu_setup_first_chunk() percpu: drop @static_size from first chunk allocators percpu: generalize first chunk allocator selection percpu: build first chunk allocators selectively percpu: rename 4k first chunk allocator to page percpu: improve boot messages percpu: fix pcpu_reclaim() locking ... Fix trivial conflict as by Tejun Heo in kernel/sched.c	2009-09-15 09:39:44 -07:00
Linus Torvalds	355bbd8cb8	Merge branch 'for-2.6.32' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.32' of git://git.kernel.dk/linux-2.6-block: (29 commits) block: use blkdev_issue_discard in blk_ioctl_discard Make DISCARD_BARRIER and DISCARD_NOBARRIER writes instead of reads block: don't assume device has a request list backing in nr_requests store block: Optimal I/O limit wrapper cfq: choose a new next_req when a request is dispatched Seperate read and write statistics of in_flight requests aoe: end barrier bios with EOPNOTSUPP block: trace bio queueing trial only when it occurs block: enable rq CPU completion affinity by default cfq: fix the log message after dispatched a request block: use printk_once cciss: memory leak in cciss_init_one() splice: update mtime and atime on files block: make blk_iopoll_prep_sched() follow normal 0/1 return convention cfq-iosched: get rid of must_alloc flag block: use interrupts disabled version of raise_softirq_irqoff() block: fix comment in blk-iopoll.c block: adjust default budget for blk-iopoll block: fix long lines in block/blk-iopoll.c block: add blk-iopoll, a NAPI like approach for block devices ...	2009-09-14 17:55:15 -07:00
Christoph Hellwig	746cd1e7e4	block: use blkdev_issue_discard in blk_ioctl_discard blk_ioctl_discard duplicates large amounts of code from blkdev_issue_discard, the only difference between the two is that blkdev_issue_discard needs to send a barrier discard request and blk_ioctl_discard a non-barrier one, and blk_ioctl_discard needs to wait on the request. To facilitates this add a flags argument to blkdev_issue_discard to control both aspects of the behaviour. This will be very useful later on for using the waiting funcitonality for other callers. Based on an earlier patch from Matthew Wilcox <matthew@wil.cx>. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-14 08:24:53 +02:00
Jens Axboe	b8a9ae779f	block: don't assume device has a request list backing in nr_requests store Stacked devices do not. For now, just error out with -EINVAL. Later we could make the limit apply on stacked devices too, for throttling reasons. This fixes 5a54cd13353bb3b88887604e2c980aa01e314309 and should go into 2.6.31 stable as well. Cc: stable@kernel.org Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-14 08:24:53 +02:00
Martin K. Petersen	3c5820c743	block: Optimal I/O limit wrapper Implement blk_limits_io_opt() and make blk_queue_io_opt() a wrapper around it. DM needs this to avoid poking at the queue_limits directly. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-14 08:24:52 +02:00
Jeff Moyer	06d2188644	cfq: choose a new next_req when a request is dispatched This patch addresses http://bugzilla.kernel.org/show_bug.cgi?id=13401, a regression introduced in 2.6.30. From the bug report: Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-14 08:24:52 +02:00
Nikanth Karthikesan	a9327cac44	Seperate read and write statistics of in_flight requests Currently, there is a single in_flight counter measuring the number of requests in the request_queue. But some monitoring tools would like to know how many read requests and write requests are in progress. Split the current in_flight counter into two seperate counters for read and write. This information is exported as a sysfs attribute, as changing the currently available stat files would break the existing tools. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-14 08:24:52 +02:00
Minchan Kim	01edede41e	block: trace bio queueing trial only when it occurs If BIO is discarded or cross over end of device, BIO queueing trial doesn't occur. Actually the trace was called just before make_request at first: [PATCH] Block queue IO tracing support (blktrace) as of 2006-03-23 2056a782f8e7e65fd4bfd027506b4ce1c5e9ccd4 And then 2 patches added some checks between them: [PATCH] md: check bio address after mapping through partitions 5ddfe9691c91a244e8d1be597b6428fcefd58103, [BLOCK] Don't allow empty barriers to be passed down to queues that don't grok them 51fd77bd9f512ab6cc9df0733ba1caaab89eb957 It breaks original goal. Let's trace it only when it happens. Signed-off-by: Minchan Kim <minchan.kim@gmail.com> Acked-by: Wu Fengguang <fengguang.wu@intel.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:34:34 +02:00
Shan Wei	b217a903ab	cfq: fix the log message after dispatched a request The blktrace tools can show process id when cfq dispatched a request, using cfq_log_cfqq() instead of cfq_log(). Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:34:33 +02:00
Jens Axboe	1b379d8daf	cfq-iosched: get rid of must_alloc flag It's not currently used, as pointed out by Gui Jianfeng <guijianfeng@cn.fujitsu.com>. We already check the wait_request flag to allow an idling queue priority allocation access, so we don't need this extra flag. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:33:32 +02:00
Jens Axboe	a33dac26d4	block: use interrupts disabled version of raise_softirq_irqoff() We already have interrupts disabled at that point, so use the __raise_softirq_irqoff() variant. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:33:32 +02:00
Jens Axboe	fca51d64c5	block: fix comment in blk-iopoll.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:33:32 +02:00
Jens Axboe	37867ae7c5	block: adjust default budget for blk-iopoll It's not exported, I doubt we'll have a reason to change this... Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:33:31 +02:00
Jens Axboe	1badcfbd7f	block: fix long lines in block/blk-iopoll.c Note sure why they happened in the first place, probably some bad terminal setting. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:33:31 +02:00
Jens Axboe	5e605b64a1	block: add blk-iopoll, a NAPI like approach for block devices This borrows some code from NAPI and implements a polled completion mode for block devices. The idea is the same as NAPI - instead of doing the command completion when the irq occurs, schedule a dedicated softirq in the hopes that we will complete more IO when the iopoll handler is invoked. Devices have a budget of commands assigned, and will stay in polled mode as long as they continue to consume their budget from the iopoll softirq handler. If they do not, the device is set back to interrupt completion mode. This patch holds the core bits for blk-iopoll, device driver support sold separately. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:33:31 +02:00
Jens Axboe	fb1e75389b	block: improve queue_should_plug() by looking at IO depths Instead of just checking whether this device uses block layer tagging, we can improve the detection by looking at the maximum queue depth it has reached. If that crosses 4, then deem it a queuing device. This is important on high IOPS devices, since plugging hurts the performance there (it can be as much as 10-15% of the sys time). Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:33:31 +02:00
Jens Axboe	1f98a13f62	bio: first step in sanitizing the bio->bi_rw flag testing Get rid of any functions that test for these bits and make callers use bio_rw_flagged() directly. Then it is at least directly apparent what variable and flag they check. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:33:31 +02:00
Hannes Reinecke	e3264a4d7d	Send uevents for write_protect changes Whenever a block device changes it's read-only attribute notify the userspace about it. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:33:31 +02:00
Vivek Goyal	d58b85e1e8	cfq-iosched: no need to keep track of busy_rt_queues o Get rid of busy_rt_queues infrastructure. Looks like it is redundant. o Once an RT queue gets request it will preempt any of the BE or IDLE queues immediately. Otherwise this queue will be put on service tree and scheduler will anyway select this queue before any of the BE or IDLE queue. Hence looks like there is no need to keep track of how many busy RT queues are currently on service tree. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:33:30 +02:00
Jens Axboe	5ad531db6e	cfq-iosched: drain device queue before switching to a sync queue To lessen the impact of async IO on sync IO, let the device drain of any async IO in progress when switching to a sync cfqq that has idling enabled. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:33:30 +02:00
Tejun Heo	da6c5c720c	scsi,block: update SCSI to handle mixed merge failures Update scsi_io_completion() such that it only fails requests till the next error boundary and retry the leftover. This enables block layer to merge requests with different failfast settings and still behave correctly on errors. Allow merge of requests of different failfast settings. As SCSI is currently the only subsystem which follows failfast status, there's no need to worry about other block drivers for now. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Niel Lambrechts <niel.lambrechts@gmail.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:33:30 +02:00
Tejun Heo	80a761fd33	block: implement mixed merge of different failfast requests Failfast has characteristics from other attributes. When issuing, executing and successuflly completing requests, failfast doesn't make any difference. It only affects how a request is handled on failure. Allowing requests with different failfast settings to be merged cause normal IOs to fail prematurely while not allowing has performance penalties as failfast is used for read aheads which are likely to be located near in-flight or to-be-issued normal IOs. This patch introduces the concept of 'mixed merge'. A request is a mixed merge if it is merge of segments which require different handling on failure. Currently the only mixable attributes are failfast ones (or lack thereof). When a bio with different failfast settings is added to an existing request or requests of different failfast settings are merged, the merged request is marked mixed. Each bio carries failfast settings and the request always tracks failfast state of the first bio. When the request fails, blk_rq_err_bytes() can be used to determine how many bytes can be safely failed without crossing into an area which requires further retrials. This allows request merging regardless of failfast settings while keeping the failure handling correct. This patch only implements mixed merge but doesn't enable it. The next one will update SCSI to make use of mixed merge. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Niel Lambrechts <niel.lambrechts@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:33:30 +02:00
Tejun Heo	a82afdfcb8	block: use the same failfast bits for bio and request bio and request use the same set of failfast bits. This patch makes the following changes to simplify things. * enumify BIO_RW* bits and reorder bits such that BIOS_RW_FAILFAST_* bits coincide with __REQ_FAILFAST_* bits. * The above pushes BIO_RW_AHEAD out of sync with __REQ_FAILFAST_DEV but the matching is useless anyway. init_request_from_bio() is responsible for setting FAILFAST bits on FS requests and non-FS requests never use BIO_RW_AHEAD. Drop the code and comment from blk_rq_bio_prep(). * Define REQ_FAILFAST_MASK which is OR of all FAILFAST bits and simplify FAILFAST flags handling in init_request_from_bio(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:33:27 +02:00
Jens Axboe	d993831fa7	writeback: add name to backing_dev_info This enables us to track who does what and print info. Its main use is catching dirty inodes on the default_backing_dev_info, so we can fix that up. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 09:20:26 +02:00
Nikanth Karthikesan	c295fc0578	block: Allow changing max_sectors_kb above the default 512 The patch "block: Use accessor functions for queue limits" (`ae03bf639a`) changed queue_max_sectors_store() to use blk_queue_max_sectors() instead of directly assigning the value. But blk_queue_max_sectors() differs a bit 1. It sets both max_sectors_kb, and max_hw_sectors_kb 2. Never allows one to change max_sectors_kb above BLK_DEF_MAX_SECTORS. If one specifies a value greater then max_hw_sectors is set to that value but max_sectors is set to BLK_DEF_MAX_SECTORS I am not sure whether blk_queue_max_sectors() should be changed, as it seems to be that way for a long time. And there may be callers dependent on that behaviour. This patch simply reverts to the older way of directly assigning the value to max_sectors as it was before. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-01 22:40:15 +02:00
Tejun Heo	384be2b18a	Merge branch 'percpu-for-linus' into percpu-for-next Conflicts: arch/sparc/kernel/smp_64.c arch/x86/kernel/cpu/perf_counter.c arch/x86/kernel/setup_percpu.c drivers/cpufreq/cpufreq_ondemand.c mm/percpu.c Conflicts in core and arch percpu codes are mostly from commit ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many num_possible_cpus() with nr_cpu_ids. As for-next branch has moved all the first chunk allocators into mm/percpu.c, the changes are moved from arch code to mm/percpu.c. Signed-off-by: Tejun Heo <tj@kernel.org>	2009-08-14 14:45:31 +09:00
John Stoffel	14d9fa3525	Make SCSI SG v4 driver enabled by default and remove EXPERIMENTAL dependency, since udev depends on BSG Make Block Layer SG support v4 the default, since recent udev versions depend on this to access serial numbers and other low level info properly. This should be backported to older kernels as well, since most distros have enabled this for a long time. Signed-off-by: John Stoffel <john@stoffel.org> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-08-04 22:10:17 +02:00
Martin K. Petersen	7e5f5fb09e	block: Update topology documentation Update topology comments and sysfs documentation based upon discussions with Neil Brown. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-08-01 10:24:35 +02:00
Martin K. Petersen	70dd5bf3b9	block: Stack optimal I/O size When stacking block devices ensure that optimal I/O size is scaled accordingly. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-08-01 10:24:35 +02:00
Martin K. Petersen	7c958e3264	block: Add a wrapper for setting minimum request size without a queue Introduce blk_limits_io_min() and make blk_queue_io_min() call it. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-08-01 10:24:35 +02:00
Martin K. Petersen	fef246672b	block: Make blk_queue_stack_limits use the new stacking interface blk_queue_stack_limits() has been superceded by blk_stack_limits() and disk_stack_limits(). Wrap the function call for now, we'll deprecate it later. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-08-01 10:24:35 +02:00
Jens Axboe	56ad1740d9	block: make the end_io functions be non-GPL exports Prior to the change for more sane end_io functions, we exported the helpers with the normal EXPORT_SYMBOL(). That got changed to _GPL() for the new interface. Revert that particular change, on the basis that this is basic functionality and doesn't dip into internal structures. If these exports can't be non-GPL, then we may as well make EXPORT_SYMBOL() imply GPL for everything. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-07-28 22:11:24 +02:00
Xiaotian Feng	3839e4b29b	block: fix improper kobject release in blk_integrity_unregister blk_integrity_unregister should use kobject_put to release the kobject, otherwise after bi is freed, memory of bi->kobj->name is leaked. Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-07-28 09:11:14 +02:00
Jens Axboe	a4e7d46407	block: always assign default lock to queues Move the assignment of a default lock below blk_init_queue() to blk_queue_make_request(), so we also get to set the default lock for ->make_request_fn() based drivers. This is important since the queue flag locking requires a lock to be in place. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-07-28 09:07:29 +02:00
Xiaotian Feng	9cb308ce8d	block: sysfs fix mismatched queue_var_{store,show} in 64bit kernel In blk-sysfs.c, queue_var_store uses unsigned long to store data, but queue_var_show uses unsigned int to show data. This causes, # echo 70000000000 > /sys/block/<dev>/queue/read_ahead_kb # cat /sys/block/<dev>/queue/read_ahead_kb => get wrong value Fix it by using unsigned long. While at it, convert queue_rq_affinity_show() such that it uses bool variable instead of explicit != 0 testing. Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2009-07-17 16:54:15 +09:00
Tejun Heo	0a09f4319c	block: fix failfast merge testing in elv_rq_merge_ok() Commit `ab0fd1debe` tries to prevent merge of requests with different failfast settings. In elv_rq_merge_ok(), it compares new bio's failfast flags against the merge target request's. However, the flag testing accessors for bio and blk don't return boolean but the tested bit value directly and FAILFAST on bio and blk don't match, so directly comparing them with == results in false negative unnecessary preventing merge of readahead requests. This patch convert the results to boolean by negating them before comparison. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Boaz Harrosh <bharrosh@panasas.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jeff Garzik <jeff@garzik.org>	2009-07-17 14:50:43 +09:00
Vivek Goyal	32f2e807a3	cfq-iosched: reset oom_cfqq in cfq_set_request() In case memory is scarce, we now default to oom_cfqq. Once memory is available again, we should allocate a new cfqq and stop using oom_cfqq for a particular io context. Once a new request comes in, check if we are using oom_cfqq, and if yes, try to allocate a new cfqq. Tested the patch by forcing the use of oom_cfqq and upon next request thread realized that it was using oom_cfqq and it allocated a new cfqq. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-07-10 20:31:54 +02:00
FUJITA Tomonori	76da03467a	block: call blk_scsi_ioctl_init() Currently, blk_scsi_ioctl_init() is not called since it lacks an initcall marking. This causes the command table to be unitialized, hence somce commands are block when they should not have been. This fixes a regression introduced by commit `018e044689` Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-07-10 20:31:53 +02:00
Tejun Heo	c43768cbb7	Merge branch 'master' into for-next Pull linus#master to merge PER_CPU_DEF_ATTRIBUTES and alpha build fix changes. As alpha in percpu tree uses 'weak' attribute instead of inline assembly, there's no need for __used attribute. Conflicts: arch/alpha/include/asm/percpu.h arch/mn10300/kernel/vmlinux.lds.S include/linux/percpu-defs.h	2009-07-04 07:13:18 +09:00
Tejun Heo	ab0fd1debe	block: don't merge requests of different failfast settings Block layer used to merge requests and bios with different failfast settings. This caused regular IOs to fail prematurely when they were merged into failfast requests for readahead. Niel Lambrechts could trigger the problem semi-reliably on ext4 when resuming from STR. ext4 uses readahead when reading inodes and combined with the deterministic extra SATA PHY exception cycle during resume on the specific configuration, non-readahead inode read would fail causing ext4 errors. Please read the following thread for details. http://lkml.org/lkml/2009/5/23/21 This patch makes block layer reject merging if the failfast settings don't match. This is correct but likely to lower IO performance by preventing regular IOs from mingling into surrounding readahead requests. Changes to allow such mixed merges and handle errors correctly will be added later. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Niel Lambrechts <niel.lambrechts@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Signed-off-by: Jens Axboe <axboe@carl.(none)>	2009-07-03 21:06:45 +02:00
Shan Wei	b706f64281	cfq-iosched: remove redundant check for NULL cfqq in cfq_set_request() With the changes for falling back to an oom_cfqq, we never fail to find/allocate a queue in cfq_get_queue(). So remove the check. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-07-01 12:41:14 +02:00
NeilBrown	db64f680ba	blocK: Restore barrier support for md and probably other virtual devices. The next_ordered flag is only meaningful for devices that use __make_request. So move the test against next_ordered out of generic code and in to __make_request Since this test was added, barriers have not worked on md or any devices that don't use __make_request and so don't bother to set next_ordered. (dm explicitly sets something other than QUEUE_ORDERED_NONE since commit `99360b4c18` but notes in the comments that it is otherwise meaningless). Cc: Ken Milmore <ken.milmore@googlemail.com> Cc: stable@kernel.org Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-07-01 10:56:26 +02:00
Jens Axboe	018e044689	block: get rid of queue-private command filter The initial patches to support this through sysfs export were broken and have been if 0'ed out in any release. So lets just kill the code and reclaim some space in struct request_queue, if anyone would later like to fixup the sysfs bits, the git history can easily restore the removed bits. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-07-01 10:56:26 +02:00
Martin K. Petersen	7878cba9f0	block: Create bip slabs with embedded integrity vectors This patch restores stacking ability to the block layer integrity infrastructure by creating a set of dedicated bip slabs. Each bip slab has an embedded bio_vec array at the end. This cuts down on memory allocations and also simplifies the code compared to the original bvec version. Only the largest bip slab is backed by a mempool. The pool is contained in the bio_set so stacking drivers can ensure forward progress. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@carl.(none)>	2009-07-01 10:56:25 +02:00
Jens Axboe	6118b70b3a	cfq-iosched: get rid of the need for __GFP_NOFAIL in cfq_find_alloc_queue() Setup an emergency fallback cfqq that we allocate at IO scheduler init time. If the slab allocation fails in cfq_find_alloc_queue(), we'll just punt IO to that cfqq instead. This ensures that cfq_find_alloc_queue() never fails without having to ensure free memory. On cfqq lookup, always try to allocate a new cfqq if the given cfq io context has the oom_cfqq assigned. This ensures that we only temporarily punt to this shared queue. Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-07-01 10:56:25 +02:00
Jens Axboe	d5036d770f	cfq-iosched: move cfqq initialization out of cfq_find_alloc_queue() We're going to be needing that init code outside of that function to get rid of the __GFP_NOFAIL in cfqq allocation. Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-07-01 10:56:25 +02:00

1 2 3 4 5 ...

988 Commits