Powerpc already has a directed yield for CONFIG_PREEMPT="n". To make it
work with CONFIG_PREEMPT="y" as well the _raw_{spin,read,write}_relax
primitives need to be defined to call __spin_yield() for spinlocks and
__rw_yield() for rw-locks.
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
On systems running with virtual cpus there is optimization potential in
regard to spinlocks and rw-locks. If the virtual cpu that has taken a lock
is known to a cpu that wants to acquire the same lock it is beneficial to
yield the timeslice of the virtual cpu in favour of the cpu that has the
lock (directed yield).
With CONFIG_PREEMPT="n" this can be implemented by the architecture without
common code changes. Powerpc already does this.
With CONFIG_PREEMPT="y" the lock loops are coded with _raw_spin_trylock,
_raw_read_trylock and _raw_write_trylock in kernel/spinlock.c. If the lock
could not be taken cpu_relax is called. A directed yield is not possible
because cpu_relax doesn't know anything about the lock. To be able to
yield the lock in favour of the current lock holder variants of cpu_relax
for spinlocks and rw-locks are needed. The new _raw_spin_relax,
_raw_read_relax and _raw_write_relax primitives differ from cpu_relax
insofar that they have an argument: a pointer to the lock structure.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
One of idiomatic ways to duplicate a region of memory is
dst = kmalloc(len, GFP_KERNEL);
if (!dst)
return -ENOMEM;
memcpy(dst, src, len);
which is neat code except a programmer needs to write size twice. Which
sometimes leads to mistakes. If len passed to kmalloc is smaller that len
passed to memcpy, it's straight overwrite-beyond-end. If len passed to
memcpy is smaller than len passed to kmalloc, it's either a) legit
behaviour ;-), or b) cloned buffer will contain garbage in second half.
Slight trolling of commit lists shows several duplications bugs
done exactly because of diverged lenghts:
Linux:
[CRYPTO]: Fix memcpy/memset args.
[PATCH] memcpy/memset fixes
OpenBSD:
kerberosV/src/lib/asn1: der_copy.c:1.4
If programmer is given only one place to play with lengths, I believe, such
mistakes could be avoided.
With kmemdup, the snippet above will be rewritten as:
dst = kmemdup(src, len, GFP_KERNEL);
if (!dst)
return -ENOMEM;
This also leads to smaller code (kzalloc effect). Quick grep shows
200+ places where kmemdup() can be used.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add maximum latency tracking to the ALSA subsystem for PCM playback. In
ALSA, the playback application controls the buffer size and thus indirectly
the period of latency that it can deal with. This patch uses 75% of the
total available latency as threshold to announce to the latency subsystem;
While 75% is a crude heuristic it's a quite reasonable one; the remaining
25% can be used for all driver processing for the next samples which is
also proportional to the size of the buffer.
With ogg123 a latency setting of about 4msec was seen (at 44Khz), while
with the "play" command a much longer maximum tolerable latency was seen.
Other, more multimedia oriented players as well as games, will have a lot
smaller buffers to allow better synchronization and those will actually get
into the latency domains where there is impact on the power management
rules.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add infrastructure to track "maximum allowable latency" for power saving
policies.
The reason for adding this infrastructure is that power management in the
idle loop needs to make a tradeoff between latency and power savings
(deeper power save modes have a longer latency to running code again). The
code that today makes this tradeoff just does a rather simple algorithm;
however this is not good enough: There are devices and use cases where a
lower latency is required than that the higher power saving states provide.
An example would be audio playback, but another example is the ipw2100
wireless driver that right now has a very direct and ugly acpi hook to
disable some higher power states randomly when it gets certain types of
error.
The proposed solution is to have an interface where drivers can
* announce the maximum latency (in microseconds) that they can deal with
* modify this latency
* give up their constraint
and a function where the code that decides on power saving strategy can
query the current global desired maximum.
This patch has a user of each side: on the consumer side, ACPI is patched
to use this, on the producer side the ipw2100 driver is patched.
A generic maximum latency is also registered of 2 timer ticks (more and you
lose accurate time tracking after all).
While the existing users of the patch are x86 specific, the infrastructure
is not. I'd like to ask the arch maintainers of other architectures if the
infrastructure is generic enough for their use (assuming the architecture
has such a tradeoff as concept at all), and the sound/multimedia driver
owners to look at the driver facing API to see if this is something they
can use.
[akpm@osdl.org: cleanups]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jesse.barnes@intel.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch defines:
* a generic boolean-type, named 'bool'
* aliases to 0 and 1, named 'false' and 'true'
Removing colliding definitions of 'bool', 'false' and 'true'.
Signed-off-by: Richard Knutsson <ricknu-0@student.ltu.se>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This fixes two things
Firstly someone mistakenly used "errata" for the singular. This causes
Dave Woodhouse to emit diagnostics whenever the string is read, and so
should be fixed.
Secondly the AMD AGP tunnel has an erratum which causes hangs if you try
and do direct PCI to AGP transfers in some cases. We have a flag for
PCI/PCI failures but we need a different flag for this really as in this
case we don't want to stop PCI/PCI transfers using things like IOAT and the
new RAID offload work.
I'll post some updates to make proper use of the PCIAGP flag in the
media/video drivers to Mauro.
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
device_reprobe() should return an error code. When it does so,
scsi_device_reprobe() should propagate it back.
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Don't just do nothing: it'll cause busywaits all over writeback and page
reclaim.
For now, take a fixed-length nap. Will improve when NFS starts waking up
throttled processes.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Make it possible to disable the block layer. Not all embedded devices require
it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require
the block layer to be present.
This patch does the following:
(*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev
support.
(*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls
an item that uses the block layer. This includes:
(*) Block I/O tracing.
(*) Disk partition code.
(*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS.
(*) The SCSI layer. As far as I can tell, even SCSI chardevs use the
block layer to do scheduling. Some drivers that use SCSI facilities -
such as USB storage - end up disabled indirectly from this.
(*) Various block-based device drivers, such as IDE and the old CDROM
drivers.
(*) MTD blockdev handling and FTL.
(*) JFFS - which uses set_bdev_super(), something it could avoid doing by
taking a leaf out of JFFS2's book.
(*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and
linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is,
however, still used in places, and so is still available.
(*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and
parts of linux/fs.h.
(*) Makes a number of files in fs/ contingent on CONFIG_BLOCK.
(*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK.
(*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK
is not enabled.
(*) fs/no-block.c is created to hold out-of-line stubs and things that are
required when CONFIG_BLOCK is not set:
(*) Default blockdev file operations (to give error ENODEV on opening).
(*) Makes some /proc changes:
(*) /proc/devices does not list any blockdevs.
(*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK.
(*) Makes some compat ioctl handling contingent on CONFIG_BLOCK.
(*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if
given command other than Q_SYNC or if a special device is specified.
(*) In init/do_mounts.c, no reference is made to the blockdev routines if
CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2.
(*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return
error ENOSYS by way of cond_syscall if so).
(*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if
CONFIG_BLOCK is not set, since they can't then happen.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Move the Ext3 device ioctl compat stuff from fs/compat_ioctl.c to the Ext3
driver so that the Ext3 header file doesn't need to be included.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Move the ReiserFS device ioctl compat stuff from fs/compat_ioctl.c to the
ReiserFS driver so that the ReiserFS header file doesn't need to be included.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Move common FS-specific ioctls from linux/ext2_fs.h to linux/fs.h as FS_IOC_*
and FS_IOC32_* and have the users of them use those as a base.
Also move the GETFLAGS/SETFLAGS flags to linux/fs.h as FS_*_FL macros, and then
have the other users use them as a base.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Move the loop device ioctl compat stuff from fs/compat_ioctl.c to the loop
driver so that the loop header file doesn't need to be included.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Dissociate the generic_writepages() function from the mpage stuff, moving its
declaration to linux/mm.h and actually emitting a full implementation into
mm/page-writeback.c.
The implementation is a partial duplicate of mpage_writepages() with all BIO
references removed.
It is used by NFS to do writeback.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Create a new header file, fs/internal.h, for common definitions local to the
sources in the fs/ directory.
Move extern definitions that should be in header files from fs/*.c to
fs/internal.h or other main header files where they span directories.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Remove the duplicate declaration of exit_io_context() from linux/sched.h.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Move some functions out of the buffering code that aren't strictly buffering
specific. This is a precursor to being able to disable the block layer.
(*) Moved some stuff out of fs/buffer.c:
(*) The file sync and general sync stuff moved to fs/sync.c.
(*) The superblock sync stuff moved to fs/super.c.
(*) do_invalidatepage() moved to mm/truncate.c.
(*) try_to_release_page() moved to mm/filemap.c.
(*) Moved some related declarations between header files:
(*) declarations for do_invalidatepage() and try_to_release_page() moved
to linux/mm.h.
(*) __set_page_dirty_buffers() moved to linux/buffer_head.h.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We can use this information for making more intelligent priority
decisions, and it will also be useful for blktrace.
Signed-off-by: Jens Axboe <axboe@suse.de>
CFQ implements this on its own now, but it's really block layer
knowledge. Tells a device queue to start dispatching requests to
the driver, taking care to unplug if needed. Also fixes the issue
where as/cfq will invoke a stopped queue, which we really don't
want.
Signed-off-by: Jens Axboe <axboe@suse.de>
None of the in-kernel primitives for handling "atomic" counting seem
to be a good fit. We need something that is essentially free for
incrementing/decrementing, while the read side may be more expensive
as we only ever need to do that when a device is removed from the
kernel.
Use a per-cpu variable for maintaining a per-cpu ioc count and define
a reading mechanism that just sums up the values.
Signed-off-by: Jens Axboe <axboe@suse.de>
cfq_exit_lock is protecting two things now:
- The per-ioc rbtree of cfq_io_contexts
- The per-cfqd linked list of cfq_io_contexts
The per-cfqd linked list can be protected by the queue lock, as it is (by
definition) per cfqd as the queue lock is.
The per-ioc rbtree is mainly used and updated by the process itself only.
The only outside use is the io priority changing. If we move the
priority changing to not browsing the rbtree, we can remove any locking
from the rbtree updates and lookup completely. Let the sys_ioprio syscall
just mark processes as having the iopriority changed and lazily update
the private cfq io contexts the next time io is queued, and we can
remove this locking as well.
Signed-off-by: Jens Axboe <axboe@suse.de>
Move some members around and unionize completion_data and rb_node since
they cannot ever be used at the same time.
Signed-off-by: Jens Axboe <axboe@suse.de>
After Christophs SCSI change, the only usage left is RQ_ACTIVE
and RQ_INACTIVE. The block layer sets RQ_INACTIVE right before freeing
the request, so any check for RQ_INACTIVE in a driver is a bug and
indicates use-after-free.
So kill/clean the remaining users, straight forward.
Signed-off-by: Jens Axboe <axboe@suse.de>
It is always identical to &q->rq, and we only use it for detecting
whether this request came out of our mempool or not. So replace it
with an additional ->flags bit flag.
Signed-off-by: Jens Axboe <axboe@suse.de>
As the comments indicates in blkdev.h, we can fold it into ->end_io_data
usage as that is really what ->waiting is. Fixup the users of
blk_end_sync_rq().
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The rbtree sort/lookup/reposition logic is mostly duplicated in
cfq/deadline/as, so move it to the elevator core. The io schedulers
still provide the actual rb root, as we don't want to impose any sort
of specific handling on the schedulers.
Introduce the helpers and rb_node in struct request to help migrate the
IO schedulers.
Signed-off-by: Jens Axboe <axboe@suse.de>
Right now, every IO scheduler implements its own backmerging (except for
noop, which does no merging). That results in duplicated code for
essentially the same operation, which is never a good thing. This patch
moves the backmerging out of the io schedulers and into the elevator
core. We save 1.6kb of text and as a bonus get backmerging for noop as
well. Win-win!
Signed-off-by: Jens Axboe <axboe@suse.de>
Right now ->flags is a bit of a mess: some are request types, and
others are just modifiers. Clean this up by splitting it into
->cmd_type and ->cmd_flags. This allows introduction of generic
Linux block message types, useful for sending generic Linux commands
to block devices.
Signed-off-by: Jens Axboe <axboe@suse.de>
* master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (180 commits)
V4L/DVB (4641): Trivial: use lowercase letters in hex subsystem ids
V4L/DVB (4639): Cx88: add autodetection for alternate revision of Leadtek PVR
V4L/DVB (4638): Basic DVB-T and analog TV support for the HVR1300.
V4L/DVB (4637): Add a default method for VIDIOC_G_PARM
V4L/DVB (4635): Extend bttv and saa7134 to check for both AGP and PCI PCI failure case
V4L/DVB (4634): Zr36120: implement pcipci checks
V4L/DVB (4632): Zoran: Implement pcipci failure check
V4L/DVB (4631): Av7110: remove V4L2_CAP_VBI_CAPTURE flag
V4L/DVB (4630): Av7110: FW_LOADER depemdency fixed
V4L/DVB (4629): Saa7134: add card support for Proteus Pro 2309
V4L/DVB (4628): Fix VIDIOC_ENUMSTD ioctl in videodev.c
V4L/DVB (4627): Vivi crashes with mplayer
V4L/DVB (4626): On saa7111/7113, LUMA_CTRL need a different value
V4L/DVB (4624): Tvaudio: Replaced kernel_thread() with kthread_run()
V4L/DVB (4622): Copy-paste bug in videodev.c
V4L/DVB (4620): Fix AGC configuration for MOD3000P-based boards
V4L/DVB (4619): Fixes some I2C dependencies on V4L devices
V4L/DVB (4617): Problem with dibusb-mb.c USB IDs
V4L/DVB (4616): [PATCH] Nebula DigiTV USB RC support
V4L/DVB (4614): Export symbol saa7134_tvaudio_setmute from saa7134 for saa7134-alsa
...
* 'intelfb-patches' of master.kernel.org:/pub/scm/linux/kernel/git/airlied/intelfb-2.6:
intelfbhw.c: intelfbhw_get_p1p2 defined but not used
intelfb: fix mtrr_reg signedness
intelfb: update doc and Kconfig (supported devices)
intelfb: add preliminary i2c support
intelfb: add preliminary i2c support
intelfb: add preliminary i2c support
intelfb: add preliminary i2c support
intelfb: add preliminary i2c support
intelfb: add preliminary i2c support
intelfb: add preliminary i2c support
intelfb: add preliminary i2c support
intelfb: add vsync interrupt support
intelfb: add vsync interrupt support
intelfb: add vsync interrupt support
intelfb: add vsync interrupt support
intelfb: add vsync interrupt support
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6:
[PATCH] Use early clobber in semaphores
[PATCH] Define vsyscall cache as blob to make clearer that user space shouldn't use it
[PATCH] Re-positioning the bss segment
[PATCH] Use ARRAY_SIZE in setup.c
[PATCH] i386: replace intermediate array-size definitions with ARRAY_SIZE()
[PATCH] x86: Clean up x86 NMI sysctls
[PATCH] Refactor some duplicated code in mpparse.c
[PATCH] Document iommu=panic
[PATCH] Fix broken indentation in iommu_setup
[PATCH] Allow disabling DAC using command line options
[PATCH] Add proper sparse __user casts to __copy_to_user_inatomic
[PATCH] i386: Update defconfig
[PATCH] Update defconfig
MSI is defined to be 32-bit write. The 5706 does 64-bit MSI writes
with byte enables disabled on the unused 32-bit word. This is legal
but causes problems on the AMD 8132 which will eventually stop
responding after a while.
Without this patch, the MSI test done by the driver during open will
pass, but MSI will eventually stop working after a few MSIs are
written by the device.
AMD believes this incompatibility is unique to the 5706, and
prefers to locally disable MSI rather than globally disabling it
using pci_msi_quirk.
Update version to 1.4.45.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix some issues Steve Grubb had with the way NetLabel was using the audit
subsystem. This should make NetLabel more consistent with other kernel
generated audit messages specifying configuration changes.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>