android_kernel_xiaomi_sm8350

Author	SHA1	Message	Date
Jens Axboe	9a11b4ed0e	cfq-iosched: properly protect ioc_gone and ioc count If we have multiple tasks freeing cfq_io_contexts when cfq-iosched is being unloaded, we could complete() ioc_gone twice. Fix that by protecting ioc_gone complete() and clearing with a spinlock for just that purpose. Doesn't matter from a performance perspective, since it'll only enter that path when ioc_gone != NULL (when cfq-iosched is being rmmod'ed). Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-07-03 13:21:12 +02:00
Divyesh Shah	d585d0b9d7	block: Fix the starving writes bug in the anticipatory IO scheduler AS scheduler alternates between issuing read and write batches. It does the batch switch only after all requests from the previous batch are completed. When switching to a write batch, if there is an on-going read request, it waits for its completion and indicates its intention of switching by setting ad->changed_batch and the new direction but does not update the batch_expire_time for the new write batch which it does in the case of no previous pending requests. On completion of the read request, it sees that we were waiting for the switch and schedules work for kblockd right away and resets the ad->changed_data flag. Now when kblockd enters dispatch_request where it is expected to pick up a write request, it in turn ends the write batch because the batch_expire_timer was not updated and shows the expire timestamp for the previous batch. This results in the write starvation for all the cases where there is the intention for switching to a write batch, but there is a previous in-flight read request and the batch gets reverted to a read_batch right away. This also holds true in the reverse case (switching from a write batch to a read batch with an in-flight write request). I've checked that this bug exists on 2.6.11, 2.6.18, 2.6.24 and linux-2.6-block git HEAD. I've tested the fix on x86 platforms with SCSI drives where the driver asks for the next request while a current request is in-flight. This patch is based off linux-2.6-block git HEAD. Bug reproduction: A simple scenario which reproduces this bug is: - dd if=/dev/hda3 of=/dev/null & - lilo The lilo takes forever to complete. This can also be reproduced fairly easily with the earlier dd and another test program doing msync(). The example test program below should print out a message after every iteration but it simply hangs forever. With this bugfix it makes forward progress. ==== Example test program using msync() (thanks to suleiman AT google DOT com) inline uint64_t rdtsc(void) { int64_t tsc; __asm __volatile("rdtsc" : "=A" (tsc)); return (tsc); } int main(int argc, char *argv) { struct stat st; uint64_t e, s, t; char p, q; long i; int fd; if (argc < 2) { printf("Usage: %s <file>\n", argv[0]); return (1); } if ((fd = open(argv[1], O_RDWR \| O_NOATIME)) < 0) err(1, "open"); if (fstat(fd, &st) < 0) err(1, "fstat"); p = mmap(NULL, st.st_size, PROT_READ \| PROT_WRITE, MAP_SHARED, fd, 0); t = 0; for (i = 0; i < 1000; i++) { p = 0; msync(p, 4096, MS_SYNC); s = rdtsc(); p = 0; __asm __volatile(""::: "memory"); e = rdtsc(); if (argc > 2) printf("%d: %lld cycles %jd %jd\n", i, e - s, (intmax_t)s, (intmax_t)e); t += e - s; } printf("average time: %lld cycles\n", t / 1000); return (0); } Cc: <stable@kernel.org> Acked-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-07-01 09:06:42 +02:00
Ingo Molnar	9583f3d9c0	Merge branch 'linus' into core/softirq	2008-06-16 11:24:17 +02:00
Carl Henrik Lunde	14a73f5479	block: disable IRQs until data is written to relay channel As we may run relay_reserve from interrupt context we must always disable IRQs. This is because a call to relay_reserve may expose previously written data to use space. Updated new message code and an old but related comment. Signed-off-by: Carl Henrik Lunde <chlunde@ping.uio.no> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-06-12 11:20:57 -07:00
Linus Torvalds	d5791d13b1	Fix invalid access errors in blk_lookup_devt Commit `30f2f0eb4b` ("block: do_mounts - accept root=<non-existant partition>") extended blk_lookup_devt() to be able to look up partitions that had not yet been registered, but in the process made the assumption that the '&block_class.devices' list only contains disk devices and that you can do 'dev_to_disk(dev)' on them. That isn't actually true. The block_class device list also contains the partitions we've discovered so far, and you can't just do a 'dev_to_disk()' on those. So make sure to only work on devices that block/genhd.c has registered itself, something we can test by checking the 'dev->type' member. This makes the loop in blk_lookup_devt() match the other such loops in this file. [ We may want to do an alternate version that knows to handle _either_ whole-disk devices or partitions, but for now this is the minimal fix for a series of crashes reported by Mariusz Kozlowski in http://lkml.org/lkml/2008/5/25/25 and Ingo in http://lkml.org/lkml/2008/6/9/39 ] Reported-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl> Reported-by: Ingo Molnar <mingo@elte.hu> Cc: Neil Brown <neilb@suse.de> Cc: Joao Luis Meloni Assirati <assirati@nonada.if.usp.br> Acked-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-06-09 10:06:24 -07:00
Jens Axboe	d6de8be711	cfq-iosched: fix RCU problem in cfq_cic_lookup() cfq_cic_lookup() needs to properly protect ioc->ioc_data before dereferencing it and also exclude updaters of ioc->ioc_data as well. Also add a number of comments documenting why the existing RCU usage is OK. Thanks a lot to "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> for review and comments! Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-05-28 14:49:28 +02:00
Jens Axboe	64565911cd	block: make blktrace use per-cpu buffers for message notes Currently it uses a single static char array, but that risks being corrupted when multiple users issue message notes at the same time. Make the buffers dynamically allocated when the trace is setup and make them per-cpu instead. The default max message size of 1k is also very large, the interface is mainly for small text notes. So shrink it to 128 bytes. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-05-28 14:49:27 +02:00
Alan D. Brunelle	4722dc52a8	Added in elevator switch message to blktrace stream Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-05-28 14:49:27 +02:00
Alan D. Brunelle	9d5f09a424	Added in MESSAGE notes for blktraces Allows messages to be inserted into blktrace streams. Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-05-28 14:49:27 +02:00
Richard Kennedy	be754d2c21	block: reorder cfq_queue to save space on 64bit builds saves 8 bytes of padding & increases objects/slab from 30 to 32 on my AMD64 config Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-05-28 14:49:27 +02:00
Zhang, Yanmin	05caf8dbc1	block: Move the second call to get_request to the end of the loop In function get_request_wait, the second call to get_request could be moved to the end of the while loop, because if the first call to get_request fails, the second call will fail without sleep. Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-05-28 14:49:27 +02:00
Carlos R. Mafra	962cf36c5b	Remove argument from open_softirq which is always NULL As git-grep shows, open_softirq() is always called with the last argument being NULL block/blk-core.c: open_softirq(BLOCK_SOFTIRQ, blk_done_softirq, NULL); kernel/hrtimer.c: open_softirq(HRTIMER_SOFTIRQ, run_hrtimer_softirq, NULL); kernel/rcuclassic.c: open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL); kernel/rcupreempt.c: open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL); kernel/sched.c: open_softirq(SCHED_SOFTIRQ, run_rebalance_domains, NULL); kernel/softirq.c: open_softirq(TASKLET_SOFTIRQ, tasklet_action, NULL); kernel/softirq.c: open_softirq(HI_SOFTIRQ, tasklet_hi_action, NULL); kernel/timer.c: open_softirq(TIMER_SOFTIRQ, run_timer_softirq, NULL); net/core/dev.c: open_softirq(NET_TX_SOFTIRQ, net_tx_action, NULL); net/core/dev.c: open_softirq(NET_RX_SOFTIRQ, net_rx_action, NULL); This observation has already been made by Matthew Wilcox in June 2002 (http://www.cs.helsinki.fi/linux/linux-kernel/2002-25/0687.html) "I notice that none of the current softirq routines use the data element passed to them." and the situation hasn't changed since them. So it appears we can safely remove that extra argument to save 128 (54) bytes of kernel data (text). Signed-off-by: Carlos R. Mafra <crmafra@ift.unesp.br> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-25 07:43:15 +02:00
Jonathan Corbet	75bd2ef145	bsg: cdev lock_kernel() pushdown Push the cdev lock_kernel call into bsg_open(). Signed-off-by: Jonathan Corbet <corbet@lwn.net>	2008-05-18 15:43:40 -06:00
Neil Brown	e7e72bf641	Remove blkdev warning triggered by using md As setting and clearing queue flags now requires that we hold a spinlock on the queue, and as blk_queue_stack_limits is called without that lock, get the lock inside blk_queue_stack_limits. For blk_queue_stack_limits to be able to find the right lock, each md personality needs to set q->queue_lock to point to the appropriate lock. Those personalities which didn't previously use a spin_lock, us q->__queue_lock. So always initialise that lock when allocated. With this in place, setting/clearing of the QUEUE_FLAG_PLUGGED bit will no longer cause warnings as it will be clear that the proper lock is held. Thanks to Dan Williams for review and fixing the silly bugs. Signed-off-by: NeilBrown <neilb@suse.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Alistair John Strachan <alistair@devzero.co.uk> Cc: Nick Piggin <npiggin@suse.de> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Jacek Luczak <difrost.kernel@gmail.com> Cc: Prakash Punnoor <prakash@punnoor.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-14 19:11:15 -07:00
Kay Sievers	30f2f0eb4b	block: do_mounts - accept root=<non-existant partition> Some devices, like md, may create partitions only at first access, so allow root= to be set to a valid non-existant partition of an existing disk. This applies only to non-initramfs root mounting. This fixes a regression from 2.6.24 which did allow this to happen and broke some users machines :( Acked-by: Neil Brown <neilb@suse.de> Tested-by: Joao Luis Meloni Assirati <assirati@nonada.if.usp.br> Cc: stable <stable@kernel.org> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-05-14 10:37:57 -07:00
Jean Delvare	f36f21ecca	Fix misuses of bdevname() bdevname() fills the buffer that it is given as a parameter, so calling strcpy() or snprintf() on the returned value is redundant (and probably not guaranteed to work - I don't think strcpy and snprintf support overlapping buffers.) Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Stephen Tweedie <sct@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-13 08:02:26 -07:00
Jens Axboe	28f13702f0	block: avoid duplicate calls to get_part() in disk stat code get_part() is fairly expensive, as it O(N) loops over partitions to find the right one. In lots of normal IO paths we end up looking up the partition twice, to make matters even worse. Change the stat add code to accept a passed in partition instead. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-05-07 10:15:46 +02:00
Jens Axboe	6d63c27557	cfq-iosched: make io priorities inherit CPU scheduling class as well as nice We currently set all processes to the best-effort scheduling class, regardless of what CPU scheduling class they belong to. Improve that so that we correctly track idle and rt scheduling classes as well. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-05-07 09:51:23 +02:00
Jens Axboe	dbaf2c003e	block: optimize generic_unplug_device() Original patch from Mikulas Patocka <mpatocka@redhat.com> Mike Anderson was doing an OLTP benchmark on a computer with 48 physical disks mapped to one logical device via device mapper. He found that there was a slowdown on request_queue->lock in function generic_unplug_device. The slowdown is caused by the fact that when some code calls unplug on the device mapper, device mapper calls unplug on all physical disks. These unplug calls take the lock, find that the queue is already unplugged, release the lock and exit. With the below patch, performance of the benchmark was increased by 18% (the whole OLTP application, not just block layer microbenchmarks). So I'm submitting this patch for upstream. I think the patch is correct, because when more threads call simultaneously plug and unplug, it is unspecified, if the queue is or isn't plugged (so the patch can't make this worse). And the caller that plugged the queue should unplug it anyway. (if it doesn't, there's 3ms timeout). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-05-07 09:48:17 +02:00
Jens Axboe	2cdf79cafb	block: get rid of likely/unlikely predictions in merge logic They tend to depend a lot on the workload, so not a clear-cut likely or unlikely fit. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-05-07 09:33:55 +02:00
Jens Axboe	07416d29bc	cfq-iosched: fix RCU race in the cfq io_context destructor handling put_io_context() drops the RCU read lock before calling into cfq_dtor(), however we need to hold off freeing there before grabbing and dereferencing the first object on the list. So extend the rcu_read_lock() scope to cover the calling of cfq_dtor(), and optimize cfq_free_io_context() to use a new variant for call_for_each_cic() that assumes the RCU read lock is already held. Hit in the wild by Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-05-07 09:28:57 +02:00
Jens Axboe	aa94b5371f	block: adjust tagging function queue bit locking For most initialization purposes, calling blk_queue_init_tags() without the queue lock held is OK. Only if called for resizing an existing map must the lock be held. Ditto for tag cleanup, the maps are reference counted. So switch the general queue flag setting to the unlocked variant, but retain the locked variant for resizing. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-05-07 09:27:43 +02:00
Jens Axboe	bf0f97025c	block: sysfs store function needs to grab queue_lock and use queue_flag_*() Concurrency isn't a big deal here since we have requests in flight at this point, but do the locked variant to set a better example. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-05-07 09:09:39 +02:00
Linus Torvalds	d626e3bf72	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: [SCSI] aic94xx: fix section mismatch [SCSI] u14-34f: Fix 32bit only problem [SCSI] dpt_i2o: sysfs code [SCSI] dpt_i2o: 64 bit support [SCSI] dpt_i2o: move from virt_to_bus/bus_to_virt to dma_alloc_coherent [SCSI] dpt_i2o: use standard __init / __exit code [SCSI] megaraid_sas: fix suspend/resume sections [SCSI] aacraid: Add Power Management support [SCSI] aacraid: Fix jbod operations scan issues [SCSI] aacraid: Fix warning about macro side-effects [SCSI] add support for variable length extended commands [SCSI] Let scsi_cmnd->cmnd use request->cmd buffer [SCSI] bsg: add large command support [SCSI] aacraid: Fix down_interruptible() to check the return value correctly [SCSI] megaraid_sas; Update the Version and Changelog [SCSI] ibmvscsi: Handle non SCSI error status [SCSI] bug fix for free list handling [SCSI] ipr: Rename ipr's state scsi host attribute to prevent collisions [SCSI] megaraid_mbox: fix Dell CERC firmware problem	2008-05-02 13:52:35 -07:00
Boaz Harrosh	db4742dd8f	[SCSI] add support for variable length extended commands Add support for variable-length, extended, and vendor specific CDBs to scsi-ml. It is now possible for initiators and ULD's to issue these types of commands. LLDs need not change much. All they need is to raise the .max_cmd_len to the longest command they support (see iscsi patch). - clean-up some code paths that did not expect commands to be larger than 16, and change cmd_len members' type to short as char is not enough. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-05-02 11:33:25 -05:00
FUJITA Tomonori	9f5de6b105	[SCSI] bsg: add large command support This enables bsg to handle the request length larger than BLK_MAX_CDB (mainly for the variable length CDB format). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-05-02 10:17:35 -05:00
Harvey Harrison	24c03d47d0	block: remove remaining __FUNCTION__ occurrences __FUNCTION__ is gcc specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-01 08:04:02 -07:00
Peter Zijlstra	cf0ca9fe5d	mm: bdi: export BDI attributes in sysfs Provide a place in sysfs (/sys/class/bdi) for the backing_dev_info object. This allows us to see and set the various BDI specific variables. In particular this properly exposes the read-ahead window for all relevant users and /sys/block/<block>/queue/read_ahead_kb should be deprecated. With patient help from Kay Sievers and Greg KH [mszeredi@suse.cz] - split off NFS and FUSE changes into separate patches - document new sysfs attributes under Documentation/ABI - do bdi_class_init as a core_initcall, otherwise the "default" BDI won't be initialized - remove bdi_init_fmt macro, it's not used very much [akpm@linux-foundation.org: fix ia64 warning] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Kay Sievers <kay.sievers@vrfy.org> Acked-by: Greg KH <greg@kroah.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:49 -07:00
Alan D. Brunelle	ac9fafa124	block: Skip I/O merges when disabled The block I/O + elevator + I/O scheduler code spend a lot of time trying to merge I/Os -- rightfully so under "normal" circumstances. However, if one were to know that the incoming I/O stream was /very/ random in nature, the cycles are wasted. This patch adds a per-request_queue tunable that (when set) disables merge attempts (beyond the simple one-hit cache check), thus freeing up a non-trivial amount of CPU cycles. Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-29 14:48:55 +02:00
FUJITA Tomonori	d7e3c3249e	block: add large command support This patch changes rq->cmd from the static array to a pointer to support large commands. We rarely handle large commands. So for optimization, a struct request still has a static array for a command. rq_init sets rq->cmd pointer to the static array. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-29 14:48:55 +02:00
FUJITA Tomonori	d34c87e4ba	block: replace sizeof(rq->cmd) with BLK_MAX_CDB This is a preparation for changing rq->cmd from the static array to a pointer. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Boaz Harrosh <bharrosh@panasas.com> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-29 14:48:55 +02:00
FUJITA Tomonori	2a4aa30c5f	block: rename and export rq_init() This rename rq_init() blk_rq_init() and export it. Any path that hands the request to the block layer needs to call it to initialize the request. This is a preparation for large command support, which needs to initialize the request in a proper way (that is, just doing a memset() will not work). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-29 14:48:55 +02:00
FUJITA Tomonori	992b5bceee	block: no need to initialize rq->cmd with blk_get_request blk_get_request initializes rq->cmd (rq_init does) so the users don't need to do that. The purpose of this patch is to remove sizeof(rq->cmd) and &rq->cmd, as a preparation for large command support, which changes rq->cmd from the static array to a pointer. sizeof(rq->cmd) will not make sense and &rq->cmd won't work. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Alasdair G Kergon <agk@redhat.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-29 14:48:55 +02:00
Adrian Bunk	6f6a036e6e	block/blk-barrier.c:blk_ordered_cur_seq() mustn't be inline This patch fixes the following build error with UML and gcc 4.3: <-- snip --> ... CC block/blk-barrier.o /home/bunk/linux/kernel-2.6/git/linux-2.6/block/blk-barrier.c: In function ‘blk_do_ordered’: /home/bunk/linux/kernel-2.6/git/linux-2.6/block/blk-barrier.c:57: sorry, unimplemented: inlining failed in call to ‘blk_ordered_cur_seq’: function body not available /home/bunk/linux/kernel-2.6/git/linux-2.6/block/blk-barrier.c:252: sorry, unimplemented: called from here /home/bunk/linux/kernel-2.6/git/linux-2.6/block/blk-barrier.c:57: sorry, unimplemented: inlining failed in call to ‘blk_ordered_cur_seq’: function body not available /home/bunk/linux/kernel-2.6/git/linux-2.6/block/blk-barrier.c:253: sorry, unimplemented: called from here make[2]: *** [block/blk-barrier.o] Error 1 <-- snip --> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-29 14:48:54 +02:00
Adrian Bunk	72ed0bf60a	block/elevator.c:elv_rq_merge_ok() mustn't be inline This patch fixes the following build error with UML and gcc 4.3: <-- snip --> ... CC block/elevator.o /home/bunk/linux/kernel-2.6/git/linux-2.6/block/elevator.c: In function ‘elv_merge’: /home/bunk/linux/kernel-2.6/git/linux-2.6/block/elevator.c:73: sorry, unimplemented: inlining failed in call to ‘elv_rq_merge_ok’: function body not available /home/bunk/linux/kernel-2.6/git/linux-2.6/block/elevator.c:103: sorry, unimplemented: called from here /home/bunk/linux/kernel-2.6/git/linux-2.6/block/elevator.c:73: sorry, unimplemented: inlining failed in call to ‘elv_rq_merge_ok’: function body not available /home/bunk/linux/kernel-2.6/git/linux-2.6/block/elevator.c:495: sorry, unimplemented: called from here make[2]: * [block/elevator.o] Error 1 make[1]: * [block] Error 2 <-- snip --> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-29 14:48:54 +02:00
Nick Piggin	75ad23bc0f	block: make queue flags non-atomic We can save some atomic ops in the IO path, if we clearly define the rules of how to modify the queue flags. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-29 14:48:33 +02:00
FUJITA Tomonori	68154e90c9	block: add dma alignment and padding support to blk_rq_map_kern This patch adds bio_copy_kern similar to bio_copy_user. blk_rq_map_kern uses bio_copy_kern instead of bio_map_kern if necessary. bio_copy_kern uses temporary pages and the bi_end_io callback frees these pages. bio_copy_kern saves the original kernel buffer at bio->bi_private it doesn't use something like struct bio_map_data to store the information about the caller. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Tejun Heo <htejun@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-29 09:50:34 +02:00
Adrian Bunk	657e93be35	unexport blk_max_pfn blk_max_pfn can now be unexported. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-29 09:50:34 +02:00
FUJITA Tomonori	1afb20f301	block: make rq_init() do a full memset() This requires moving rq_init() from get_request() to blk_alloc_request(). The upside is that we can now require an rq_init() from any path that wishes to hand the request to the block layer. rq_init() will be exported for the code that uses struct request without blk_get_request. This is a preparation for large command support, which needs to initialize struct request in a proper way (that is, just doing a memset() will not work). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-29 09:50:34 +02:00
FUJITA Tomonori	97f46ae45c	[SCSI] bsg: add release callback support This patch adds release callback support, which is called when a bsg device goes away. bsg_register_queue() takes a pointer to a callback function. This feature is useful for stuff like sas_host that can't use the release callback in struct device. If a caller doesn't need bsg's release callback, it can call bsg_register_queue() with NULL pointer (e.g. scsi devices can use release callback in struct device so they don't need bsg's callback). With this patch, bsg uses kref for refcounts on bsg devices instead of get/put_device in fops->open/release. bsg calls put_device and the caller's release callback (if it was registered) in kref_put's release. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-04-22 15:16:32 -05:00
Linus Torvalds	548453fd10	Merge branch 'for-2.6.26' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.26' of git://git.kernel.dk/linux-2.6-block: block: fix blk_register_queue() return value block: fix memory hotplug and bouncing in block layer block: replace remaining __FUNCTION__ occurrences Kconfig: clean up block/Kconfig help descriptions cciss: fix warning oops on rmmod of driver cciss: Fix race between disk-adding code and interrupt handler block: move the padding adjustment to blk_rq_map_sg block: add bio_copy_user_iov support to blk_rq_map_user_iov block: convert bio_copy_user to bio_copy_user_iov loop: manage partitions in disk image cdrom: use kmalloced buffers instead of buffers on stack cdrom: make unregister_cdrom() return void cdrom: use list_head for cdrom_device_info list cdrom: protect cdrom_device_info list by mutex cdrom: cleanup hardcoded error-code cdrom: remove ifdef CONFIG_SYSCTL	2008-04-21 16:03:40 -07:00
Akinobu Mita	fb19974630	block: fix blk_register_queue() return value blk_register_queue() returns -ENXIO when queue->request_fn is NULL. But there are some block drivers that call blk_register_queue() via add_disk() with queue->request_fn == NULL. (For example, brd, loop) Although no one checks return value of blk_register_queue(), this patch makes it return 0 instead of -ENXIO when queue->request_fn is NULL, Also this patch adds warning when blk_register_queue() and blk_unregister_queue() are called with queue == NULL rather than ignore invalid usage silently. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-21 09:51:06 +02:00
Nick Andrew	ee86418d39	Kconfig: clean up block/Kconfig help descriptions Modify the help descriptions of block/Kconfig for clarity, accuracy and consistency. Refactor the BLOCK description a bit. The wording "This permits ... to be removed" isn't quite right; the block layer is removed when the option is disabled, whereas most descriptions talk about what happens when the option is enabled. Reformat the list of what is affected by disabling the block layer. Add more examples of large block devices to LBD and strive for technical accuracy; block devices of size _exactly_ 2TB require CONFIG_LBD, not only "bigger than 2TB". Also try to say (perhaps not very clearly) that the config option is only needed when you want to have individual block devices of size >= 2TB, for example if you had 3 x 1TB disks in your computer you'd have a total storage size of 3TB but you wouldn't need the option unless you want to aggregate those disks into a RAID or LVM. Improve terminology and grammar on BLK_DEV_IO_TRACE. I also added the boilerplate "If unsure, say N" to most options. Precisely say "2TB and larger" for LSF. Indent the help text for BLK_DEV_BSG by 2 spaces in accordance with the standard. Signed-off-by: Nick Andrew <nick@nick-andrew.net> Cc: "Randy.Dunlap" <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-21 09:51:04 +02:00
FUJITA Tomonori	f18573abcc	block: move the padding adjustment to blk_rq_map_sg blk_rq_map_user adjusts bi_size of the last bio. It breaks the rule that req->data_len (the true data length) is equal to sum(bio). It broke the scsi command completion code. commit `e97a294ef6` was introduced to fix the above issue. However, the partial completion code doesn't work with it. The commit is also a layer violation (scsi mid-layer should not know about the block layer's padding). This patch moves the padding adjustment to blk_rq_map_sg (suggested by James). The padding works like the drain buffer. This patch breaks the rule that req->data_len is equal to sum(sg), however, the drain buffer already broke it. So this patch just restores the rule that req->data_len is equal to sub(bio) without breaking anything new. Now when a low level driver needs padding, blk_rq_map_user and blk_rq_map_user_iov guarantee there's enough room for padding. blk_rq_map_sg can safely extend the last entry of a scatter list. blk_rq_map_sg must extend the last entry of a scatter list only for a request that got through bio_copy_user_iov. This patches introduces new REQ_COPY_USER flag. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Tejun Heo <htejun@gmail.com> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-21 09:50:08 +02:00
FUJITA Tomonori	afdc1a780e	block: add bio_copy_user_iov support to blk_rq_map_user_iov With this patch, blk_rq_map_user_iov uses bio_copy_user_iov when a low level driver needs padding or a buffer in sg_iovec isn't aligned. That is, it uses temporary kernel buffers instead of mapping user pages directly. When a LLD needs padding, later blk_rq_map_sg needs to extend the last entry of a scatter list. bio_copy_user_iov guarantees that there is enough space for padding by using temporary kernel buffers instead of user pages. blk_rq_map_user_iov needs buffers in sg_iovec to be aligned. The comment in blk_rq_map_user_iov indicates that drivers/scsi/sg.c also needs buffers in sg_iovec to be aligned. Actually, drivers/scsi/sg.c works with unaligned buffers in sg_iovec (it always uses temporary kernel buffers). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Tejun Heo <htejun@gmail.com> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-21 09:50:08 +02:00
Tony Jones	ee959b00c3	SCSI: convert struct class_device to struct device It's big, but there doesn't seem to be a way to split it up smaller... Signed-off-by: Tony Jones <tonyj@suse.de> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-04-19 19:10:33 -07:00
Linus Torvalds	2cca775bae	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (137 commits) [SCSI] iscsi: bidi support for iscsi_tcp [SCSI] iscsi: bidi support at the generic libiscsi level [SCSI] iscsi: extended cdb support [SCSI] zfcp: Fix error handling for blocked unit for send FCP command [SCSI] zfcp: Remove zfcp_erp_wait from slave destory handler to fix deadlock [SCSI] zfcp: fix 31 bit compile warnings [SCSI] bsg: no need to set BSG_F_BLOCK bit in bsg_complete_all_commands [SCSI] bsg: remove minor in struct bsg_device [SCSI] bsg: use better helper list functions [SCSI] bsg: replace kobject_get with blk_get_queue [SCSI] bsg: takes a ref to struct device in fops->open [SCSI] qla1280: remove version check [SCSI] libsas: fix endianness bug in sas_ata [SCSI] zfcp: fix compiler warning caused by poking inside new semaphore (linux-next) [SCSI] aacraid: Do not describe check_reset parameter with its value [SCSI] aacraid: Fix down_interruptible() to check the return value [SCSI] sun3_scsi_vme: add MODULE_LICENSE [SCSI] st: rename flush_write_buffer() [SCSI] tgt: use KMEM_CACHE macro [SCSI] initio: fix big endian problems for auto request sense ...	2008-04-18 11:25:31 -07:00
FUJITA Tomonori	99773aab03	[SCSI] bsg: no need to set BSG_F_BLOCK bit in bsg_complete_all_commands Before bsg_complete_all_commands is called, BSG_F_BLOCK bit is always set. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-04-18 11:48:43 -05:00
FUJITA Tomonori	842ea771c3	[SCSI] bsg: remove minor in struct bsg_device minor in struct bsg_device is used as identifier to find the corresponding struct bsg_device_class. However, request_queuse can be used as identifier for that and the minor in struct bsg_device is unnecessary. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-04-18 11:48:26 -05:00
FUJITA Tomonori	43ac9e62c4	[SCSI] bsg: use better helper list functions This replace hlist_for_each and list_entry with hlist_for_each_entry and list_first_entry respectively. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-04-18 11:48:08 -05:00
FUJITA Tomonori	c3ff1b90d8	[SCSI] bsg: replace kobject_get with blk_get_queue Both takes a ref to a queue. But blk_get_queue checks QUEUE_FLAG_DEAD and is more appropriate interface here. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-04-18 11:47:49 -05:00
FUJITA Tomonori	d45ac4fa8f	[SCSI] bsg: takes a ref to struct device in fops->open bsg_register_queue() takes a ref to struct device that a caller passes. For example, bsg takes a ref to the sdev_gendev for scsi devices. However, bsg doesn't inrease the refcount in fops->open. So while an application opens a bsg device, the scsi device that the bsg device holds can go away (bsg also takes a ref to a queue, but it doesn't prevent the device from going away). With this patch, bsg increases the refcount of struct device in fops->open and decreases it in fops->release. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-04-18 11:47:19 -05:00
Bartlomiej Zolnierkiewicz	93de00fd1c	ide: remove broken/dangerous HDIO_[UNREGISTER,SCAN]_HWIF ioctls (take 3) hdparm explicitely marks HDIO_[UNREGISTER,SCAN]_HWIF ioctls as DANGEROUS and given the number of bugs we can assume that there are no real users: * DMA has no chance of working because DMA resources are released by ide_unregister() and they are never allocated again. * Since ide_init_hwif_ports() is used for ->io_ports[] setup the ioctls don't work for almost all hosts with "non-standard" (== non ISA-like) layout of IDE taskfile registers (there is a lot of such host drivers). * ide_port_init_devices() is not called when probing IDE devices so: - drive->autotune is never set and IDE host/devices are not programmed for the correct PIO/DMA transfer modes (=> possible data corruption) - host specific I/O 32-bit and IRQ unmasking settings are not applied (=> possible data corruption) - host specific ->port_init_devs method is not called (=> no luck with ht6560b, qd65xx and opti621 host drivers) * ->rw_disk method is not preserved (=> no HPT3xxN chipsets support). * ->serialized flag is not preserved (=> possible data corruption when using icside, aec62xx (ATP850UF chipset), cmd640, cs5530, hpt366 (HPT3xxN chipsets), rz1000, sc1200, dtc2278 and ht6560b host drivers). * ->ack_intr method is not preserved (=> needed by ide-cris, buddha, gayle and macide host drivers). * ->sata_scr[] and sata_misc[] is cleared by ide_unregister() and it isn't initialized again (SiI3112 support needs them). * To issue an ioctl() there need to be at least one IDE device present in the system. * ->cable_detect method is not preserved + it is not called when probing IDE devices so cable detection is broken (however since DMA support is also broken it doesn't really matter ;-). * Some objects which may have already been freed in ide_unregister() are restored by ide_hwif_restore() (i.e. ->hwgroup). * ide_register_hw() may unregister unrelated IDE ports if free ide_hwifs[] slot cannot be found. * When IDE host drivers are modular unregistered port may be re-used by different host driver that owned it first causing subtle bugs. Since we now have a proper warm-plug support remove these ioctls, then remove no longer needed: - ide_register_hw() and ide_hwif_restore() functions - 'init_default' and 'restore' arguments of ide_unregister() - zeroeing of hwif->{dma,extra}_* fields in ide_unregister() As an added bonus IDE core code size shrinks by ~3kB (x86-32). v2: * fix ide_unregister() arguments in cleanup_module() (Andrew Morton). v3: * fix ide_unregister() arguments in palm_bk3710.c. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-04-18 00:46:24 +02:00
Jens Axboe	75ce6faccd	block: update git url for blktrace Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-15 10:23:35 +02:00
Fabio Checconi	4faa3c8150	cfq-iosched: do not leak ioc_data across iosched switches When switching scheduler from cfq, cfq_exit_queue() does not clear ioc->ioc_data, leaving a dangling pointer that can deceive the following lookups when the iosched is switched back to cfq. The pattern that can trigger that is the following: - elevator switch from cfq to something else; - module unloading, with elv_unregister() that calls cfq_free_io_context() on ioc freeing the cic (via the .trim op); - module gets reloaded and the elevator switches back to cfq; - reallocation of a cic at the same address as before (with a valid key). To fix it just assign NULL to ioc_data in __cfq_exit_single_io_context(), that is called from the regular exit path and from the elevator switching code. The only path that frees a cic and is not covered is the error handling one, but cic's freed in this way are never cached in ioc_data. Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-10 08:28:01 +02:00
Fabio Checconi	34e6bbf23c	cfq-iosched: fix rcu freeing of cfq io contexts SLAB_DESTROY_BY_RCU is not a direct substitute for normal call_rcu() freeing, since it'll page freeing but NOT object freeing. So change cfq to do the freeing on its own. Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-02 15:42:20 +02:00
Andrea Arcangeli	00d61e3e8c	Fix bounce setting for 64-bit Looking a bit closer into this regression the reason this can't be right is that dma_addr common default is BLK_BOUNCE_HIGH and most machines have less than 4G. So if you do: if (b_pfn <= (min_t(u64, 0xffffffff, BLK_BOUNCE_HIGH) >> PAGE_SHIFT)) dma = 1 that will translate to: if (BLK_BOUNCE_HIGH <= BLK_BOUNCE_HIGH) dma = 1 So for 99% of hardware this will trigger unnecessary GFP_DMA allocations and isa pooling operations. Also note how the 32bit code still does b_pfn < blk_max_low_pfn. I guess this is what you were looking after. I didn't verify but as far as I can tell, this will stop the regression with isa dma operations at boot for 99% of blkdev/memory combinations out there and I guess this fixes the setups with >4G of ram and 32bit pci cards as well (this also retains symmetry with the 32bit code). Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-02 09:06:44 +02:00
Roland McGrath	ee27a558ae	genhd must_check warning fix Fixes: block/genhd.c:361: warning: ignoring return value of ‘class_register’, declared with attribute warn_unused_result Signed-off-by: Roland McGrath <roland@redhat.com> Acked-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-12 12:34:37 -07:00
Jens Axboe	cc66b4512c	block: fix blkdev_issue_flush() not detecting and passing EOPNOTSUPP back This is important to eg dm, that tries to decide whether to stop using barriers or not. Tested as working by Anders Henke <anders.henke@1und1.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-03-04 11:47:46 +01:00
Harvey Harrison	56d94a37f6	block: fix shadowed variable warning in blk-map.c Introduced between 2.6.25-rc2 and -rc3 block/blk-map.c:154:14: warning: symbol 'bio' shadows an earlier one block/blk-map.c:110:13: originally declared here Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-03-04 11:31:22 +01:00
Harvey Harrison	448da4d262	block: remove extern on function definition Intoduced between 2.6.25-rc2 and -rc3 block/blk-settings.c:319:12: warning: function 'blk_queue_dma_drain' with external linkage has definition Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-03-04 11:30:18 +01:00
Adrian Bunk	bec419404a	unexport blk_rq_map_user_iov This patch removes the unused export of blk_rq_map_user_iov. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-03-04 11:28:34 +01:00
Adrian Bunk	9d7f1e6b9b	unexport blk_{get,put}_queue This patch removes the unused exports of blk_{get,put}_queue. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-03-04 11:28:32 +01:00
Adrian Bunk	1826eadfc4	block/genhd.c: cleanups This patch contains the following cleanups: - make the needlessly global struct disk_type static - #if 0 the unused genhd_media_change_notify() Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-03-04 11:28:31 +01:00
Adrian Bunk	ff88972c85	proper prototype for blk_dev_init() This patch adds a proper prototye for blk_dev_init() in block/blk.h Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-03-04 11:28:29 +01:00
Adrian Bunk	278caf0120	block/blk-tag.c should #include "blk.h" Every file should include the headers containing the externs for its global functions (in this case for __blk_queue_free_tags()). Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-03-04 11:28:24 +01:00
Yang Shi	419c434c35	Fix DMA access of block device in 64-bit kernel on some non-x86 systems with 4GB or upper 4GB memory For some non-x86 systems with 4GB or upper 4GB memory, we need increase the range of addresses that can be used for direct DMA in 64-bit kernel. Signed-off-by: Yang Shi <yang.shi@windriver.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-03-04 11:20:51 +01:00
Tejun Heo	e3790c7d42	block: separate out padding from alignment Block layer alignment was used for two different purposes - memory alignment and padding. This causes problems in lower layers because drivers which only require memory alignment ends up with adjusted rq->data_len. Separate out padding such that padding occurs iff driver explicitly requests it. Tomo: restorethe code to update bio in blk_rq_map_user introduced by the commit `40b01b9bbd` according to padding alignment. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-03-04 11:18:17 +01:00
FUJITA Tomonori	7a85f8896f	block: restore the meaning of rq->data_len to the true data length The meaning of rq->data_len was changed to the length of an allocated buffer from the true data length. It breaks SG_IO friends and bsg. This patch restores the meaning of rq->data_len to the true data length and adds rq->extra_len to store an extended length (due to drain buffer and padding). This patch also removes the code to update bio in blk_rq_map_user introduced by the commit `40b01b9bbd`. The commit adjusts bio according to memory alignment (queue_dma_alignment). However, memory alignment is NOT padding alignment. This adjustment also breaks SG_IO friends and bsg. Padding alignment needs to be fixed in a proper way (by a separate patch). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>	2008-03-04 11:17:11 +01:00
Randy Dunlap	5d87a052c7	block: fix kernel-docbook parameters and files kernel-doc for block/: - add missing parameters - fix one function's parameter list (remove blank line) - add 2 source files to docbook for non-exported kernel-doc functions Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-03-04 11:14:39 +01:00
Tejun Heo	db0a2e0099	block: clear drain buffer if draining for write command Clear drain buffer before chaining if the command in question is a write. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-02-19 11:36:55 +01:00
Tejun Heo	2fb98e8414	block: implement request_queue->dma_drain_needed Draining shouldn't be done for commands where overflow may indicate data integrity issues. Add dma_drain_needed callback to request_queue. Drain buffer is appened iff this function returns non-zero. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-02-19 11:36:53 +01:00
Tejun Heo	6b00769fe1	block: add request->raw_data_len With padding and draining moved into it, block layer now may extend requests as directed by queue parameters, so now a request has two sizes - the original request size and the extended size which matches the size of area pointed to by bios and later by sgs. The latter size is what lower layers are primarily interested in when allocating, filling up DMA tables and setting up the controller. Both padding and draining extend the data area to accomodate controller characteristics. As any controller which speaks SCSI can handle underflows, feeding larger data area is safe. So, this patch makes the primary data length field, request->data_len, indicate the size of full data area and add a separate length field, request->raw_data_len, for the unmodified request size. The latter is used to report to higher layer (userland) and where the original request size should be fed to the controller or device. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-02-19 11:36:35 +01:00
Tejun Heo	40b01b9bbd	block: update bio according to DMA alignment padding DMA start address and transfer size alignment for PC requests are achieved using bio_copy_user() instead of bio_map_user(). This works because bio_copy_user() always uses full pages and block DMA alignment isn't allowed to go over PAGE_SIZE. However, the implementation didn't update the last bio of the request to make this padding visible to lower layers. This patch makes blk_rq_map_user() extend the last bio such that it includes the padding area and the size of area pointed to by the request is properly aligned. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-02-19 11:35:38 +01:00
Jens Axboe	e164094964	elevator: make elevator_get() attempt to load the appropriate module Currently we fail if someone requests a valid io scheduler, but it's modular and not currently loaded. That can happen from a driver init asking for a different scheduler, or online switching through sysfs as requested by a user. This patch makes elevator_get() request_module() to attempt to load the appropriate module, instead of requiring that done manually. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-02-19 10:20:37 +01:00
Jens Axboe	ffc4e75957	cfq-iosched: add hlist for browsing parallel to the radix tree It's cumbersome to browse a radix tree from start to finish, especially since we modify keys when a process exits. So add a hlist for the single purpose of browsing over all known cfq_io_contexts, used for exit, io prio change, etc. This fixes http://bugzilla.kernel.org/show_bug.cgi?id=9948 Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-02-19 10:04:00 +01:00
Jens Axboe	84e9e03c55	block: make blk_rq_map_user() clear ->bio if it unmaps it That way the interface is symmetric, and calling blk_rq_unmap_user() on the request wont oops. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-02-19 10:04:00 +01:00
Adrian Bunk	52ff4cae65	make blk_settings_init() static blk_settings_init() can become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>	2008-02-19 10:04:00 +01:00
Adrian Bunk	1334159826	make blk_ioc_init() static blk_ioc_init() can become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>	2008-02-19 10:04:00 +01:00
Adrian Bunk	5ece6c52ea	make blk-core.c:request_cachep static again request_cachep needlessly became global. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>	2008-02-19 10:04:00 +01:00
Jerome Marchand	c3c930d933	Enhanced partition statistics: remove old partition statistics Removes the now unused old partition statistic code. Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-02-08 12:42:01 +01:00
Jerome Marchand	28f39d553e	Enhanced partition statistics: procfs Reports enhanced partition statistics in /proc/diskstats. Signed-off-by: Jerome Marchand <jmarchan@redhat.com>	2008-02-08 12:42:00 +01:00
Jerome Marchand	6f2576af5b	Enhanced partition statistics: update partition statitics Updates the enhanced partition statistics in generic block layer besides the disk statistics. Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-02-08 12:41:56 +01:00
Jens Axboe	63a7138671	block: fixup rq_init() a bit Rearrange fields in cache order and initialize some fields that we didn't previously init. Remove init of ->completion_data, it's part of a union with ->hash. Luckily clearing the rb node is the same as setting it to null! Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-02-08 12:41:03 +01:00
Jens Axboe	3bc217ffe6	block: kill swap_io_context() It blindly copies everything in the io_context, including the lock. That doesn't work so well for either lock ordering or lockdep. There seems zero point in swapping io contexts on a request to request merge, so the best point of action is to just remove it. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-02-01 11:34:49 +01:00
Jens Axboe	8bdd3f8a69	as-iosched: fix inconsistent ioc->lock context Since it's acquired from irq context, all locking must be of the irq safe variant. Most are already inside the queue lock (which already disables interrupts), but the io scheduler rmmod path always has irqs enabled and the put_io_context() path may legally be called with irqs enabled (even if it isn't usually). So fixup those two. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-02-01 09:44:28 +01:00
Jens Axboe	4eb166d987	block: make elevator lib checkpatch compliant Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-02-01 09:26:33 +01:00
Jens Axboe	fe094d98e7	cfq-iosched: make checkpatch compliant Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-02-01 09:26:33 +01:00
Jens Axboe	6728cb0e63	block: make core bits checkpatch compliant Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-02-01 09:26:33 +01:00
Jens Axboe	22b132102f	block: new end request handling interface should take unsigned byte counts No point in passing signed integers as the byte count, they can never be negative. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-02-01 09:26:33 +01:00
James Bottomley	40f620286d	[SCSI] bsg: copy the cmd_type field to the subordinate request for bidi This fixes a problem in SCSI where we use the (previously uninitialised) cmd_type via blk_pc_request() to set up the transfer in scsi_init_sgtable(). Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-01-30 13:14:26 -06:00
Jens Axboe	149a051f82	as-iosched: fix double locking bug in as_merged_requests() If the two requests belong to the same io context, we will attempt to lock the same lock twice. But swapping contexts is pointless in that case, so just check for rioc == nioc before doing the double lock and copy. Tested-by: Olof Johansson <olof@lixom.net> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-30 09:11:10 +01:00
Jan Engelhardt	12f32bb317	block: constify function pointer tables Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-29 21:55:19 +01:00
Martin K. Petersen	e68b903c6b	Expose hardware sector size Expose hardware sector size in sysfs queue directory. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-29 21:55:17 +01:00
Jens Axboe	d6d4819696	block: ll_rw_blk.c split, add blk-merge.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-29 21:55:12 +01:00
Jens Axboe	db1d08c646	block: remove dated (and wrong) comment in blk-core.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-29 21:55:11 +01:00
Jens Axboe	26b8256e2b	block: get rid of unnecessary forward declarations in blk-core.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-29 21:55:09 +01:00
Jens Axboe	86db1e2977	block: continue ll_rw_blk.c splitup Adds files for barrier handling, rq execution, io context handling, mapping data to requests, and queue settings. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-29 21:55:08 +01:00
Jens Axboe	8324aa91d1	block: split tag and sysfs handling from blk-core.c Seperates the tag and sysfs handling from ll_rw_blk. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-29 21:55:07 +01:00
Jens Axboe	a168ee84c9	block: first step of splitting ll_rw_blk, rename it Then we retain history in blk-core.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-29 21:55:05 +01:00

1 2 3 4 5 ...

621 Commits