Commit Graph

2812 Commits

Author SHA1 Message Date
Linus Torvalds
d69636157a Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] splice: fix page stealing LRU handling.
  [PATCH] splice: page stealing needs to wait_on_page_writeback()
  [PATCH] splice: export generic_splice_sendpage
  [PATCH] splice: add a SPLICE_F_MORE flag
  [PATCH] splice: add comments documenting more of the code
  [PATCH] splice: improve writeback and clean up page stealing
  [PATCH] splice: fix shadow[] filling logic
2006-04-02 14:22:06 -07:00
Jens Axboe
3e7ee3e7b3 [PATCH] splice: fix page stealing LRU handling.
Originally from Nick Piggin, just adapted to the newer branch.

You can't check PageLRU without holding zone->lru_lock.  The page
release code can get away with it only because the page refcount is 0 at
that point. Also, you can't reliably remove pages from the LRU unless
the refcount is 0. Ever.

Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:11:04 +02:00
Jens Axboe
b2b39fa478 [PATCH] splice: add a SPLICE_F_MORE flag
This lets userspace indicate whether more data will be coming in a
subsequent splice call.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:05:41 +02:00
Jens Axboe
4f6f0bd2ff [PATCH] splice: improve writeback and clean up page stealing
By cleaning up the writeback logic (killing write_one_page() and the manual
set_page_dirty()), we can get rid of ->stolen inside the pipe_buffer and
just keep it local in pipe_to_file().

This also adds dirty page balancing logic and O_SYNC handling.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:04:46 +02:00
Linus Torvalds
63589ed078 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (48 commits)
  Documentation: fix minor kernel-doc warnings
  BUG_ON() Conversion in drivers/net/
  BUG_ON() Conversion in drivers/s390/net/lcs.c
  BUG_ON() Conversion in mm/slab.c
  BUG_ON() Conversion in mm/highmem.c
  BUG_ON() Conversion in kernel/signal.c
  BUG_ON() Conversion in kernel/signal.c
  BUG_ON() Conversion in kernel/ptrace.c
  BUG_ON() Conversion in ipc/shm.c
  BUG_ON() Conversion in fs/freevxfs/
  BUG_ON() Conversion in fs/udf/
  BUG_ON() Conversion in fs/sysv/
  BUG_ON() Conversion in fs/inode.c
  BUG_ON() Conversion in fs/fcntl.c
  BUG_ON() Conversion in fs/dquot.c
  BUG_ON() Conversion in md/raid10.c
  BUG_ON() Conversion in md/raid6main.c
  BUG_ON() Conversion in md/raid5.c
  Fix minor documentation typo
  BFP->BPF in Documentation/networking/tuntap.txt
  ...
2006-04-02 12:58:45 -07:00
Linus Torvalds
b043b673dc Merge master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb
* master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (49 commits)
  V4L/DVB (3667b): cpia2: fix function prototype
  V4L/DVB (3702): Make msp3400 routing defines more consistent
  V4L/DVB (3700): Remove obsolete commands from tvp5150.c
  V4L/DVB (3697): More msp3400 and bttv fixes
  V4L/DVB (3696): Previous change for cx2341X boards broke the remote support
  V4L/DVB (3693): Fix msp3400c and bttv stereo/mono/bilingual detection/handling
  V4L/DVB (3692): Keep experimental SLICED_VBI defines under an #if 0
  V4L/DVB (3689): Kconfig: fix VP-3054 Secondary I2C Bus build configuration menu dependencies
  V4L/DVB (3673): Fix budget-av CAM reset
  V4L/DVB (3672): Fix memory leak in dvr open
  V4L/DVB (3671): New module parameter 'tv_standard' (dvb-ttpci driver)
  V4L/DVB (3670): Fix typo in comment
  V4L/DVB (3669): Configurable dma buffer size for saa7146-based budget dvb cards
  V4L/DVB (3653h): Move usb v4l docs into Documentation/video4linux
  V4L/DVB (3667a): Fix SAP + stereo mode at msp3400
  V4L/DVB (3666): Remove trailing newlines
  V4L/DVB (3665): Add new NEC uPD64031A and uPD64083 i2c drivers
  V4L/DVB (3663): Fix msp3400c wait time and better audio mode fallbacks
  V4L/DVB (3662): Don't set msp3400c-non-existent register
  V4L/DVB (3661): Add wm8739 stereo audio ADC i2c driver
  ...
2006-04-02 12:53:57 -07:00
Linus Torvalds
9c8680e2cf Merge master.kernel.org:/pub/scm/linux/kernel/git/dtor/input
* master.kernel.org:/pub/scm/linux/kernel/git/dtor/input: (26 commits)
  Input: add support for Braille devices
  Input: synaptics - limit rate to 40pps on Toshiba Protege M300
  Input: gamecon - add SNES mouse support
  Input: make modalias code respect allowed buffer size
  Input: convert /proc handling to seq_file
  Input: limit attributes' output to PAGE_SIZE
  Input: gameport - fix memory leak
  Input: serio - fix memory leak
  Input: zaurus keyboard driver updates
  Input: i8042 - fix logic around pnp_register_driver()
  Input: ns558 - fix logic around pnp_register_driver()
  Input: pcspkr - separate device and driver registration
  Input: atkbd - allow disabling on X86_PC (if EMBEDDED)
  Input: atkbd - disable softrepeat for dumb keyboards
  Input: atkbd - fix complaints about 'releasing unknown key 0x7f'
  Input: HID - fix duplicate key mapping for Logitech UltraX remote
  Input: use kzalloc() throughout the code
  Input: fix input_free_device() implementation
  Input: initialize serio and gameport at subsystem level
  Input: uinput - semaphore to mutex conversion
  ...
2006-04-02 12:49:19 -07:00
Linus Torvalds
bacd3add08 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NET]: Fully fix the memory leaks in sys_accept().
  [NETFILTER]: iptables 32bit compat layer
  [NETFILTER]: {ip,nf}_conntrack_netlink: fix expectation notifier unregistration
  [NETFILTER]: fix ifdef for connmark support in nf_conntrack_netlink
  [NETFILTER]: x_tables: unify IPv4/IPv6 multiport match
  [NETFILTER]: x_tables: unify IPv4/IPv6 esp match
  [NET]: Fix dentry leak in sys_accept().
  [IPSEC]: Kill unused decap state structure
  [IPSEC]: Kill unused decap state argument
  [NET]: com90xx kmalloc fix
  [TG3]: Update driver version and reldate.
  [TG3]: Revert "Speed up SRAM access"
2006-04-02 12:47:12 -07:00
Linus Torvalds
29e350944f splice: add SPLICE_F_NONBLOCK flag
It doesn't make the splice itself necessarily nonblocking (because the
actual file descriptors that are spliced from/to may block unless they
have the O_NONBLOCK flag set), but it makes the splice pipe operations
nonblocking.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-02 12:46:35 -07:00
Martin Waitz
a580290c3e Documentation: fix minor kernel-doc warnings
This patch updates the comments to match the actual code.

Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02 13:59:55 +02:00
Hans Verkuil
9bc7400a9d V4L/DVB (3692): Keep experimental SLICED_VBI defines under an #if 0
The sliced VBI defines added in videodev2.h are removed since requires
more discussion.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-04-02 04:56:04 -03:00
Samuel Thibault
b9ec4e109d Input: add support for Braille devices
- Add KEY_BRL_* input keys and K_BRL_* keycodes;
- Add emulation of how braille keyboards usually combine braille dots
  to the console keyboard driver;
- Add handling of unicode U+28xy diacritics.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2006-04-02 00:10:28 -05:00
Dmitry Torokhov
95d465fd75 Manual merge with Linus.
Conflicts:
	arch/powerpc/kernel/setup-common.c
	drivers/input/keyboard/hil_kbd.c
	drivers/input/mouse/hil_ptr.c
2006-04-02 00:08:05 -05:00
Dmitry Mishin
2722971cbe [NETFILTER]: iptables 32bit compat layer
This patch extends current iptables compatibility layer in order to get
32bit iptables to work on 64bit kernel. Current layer is insufficient due
to alignment checks both in kernel and user space tools.

Patch is for current net-2.6.17 with addition of move of ipt_entry_{match|
target} definitions to xt_entry_{match|target}.

Signed-off-by: Dmitry Mishin <dim@openvz.org>
Acked-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-01 02:25:19 -08:00
Yasuyuki Kozakai
a89ecb6a2e [NETFILTER]: x_tables: unify IPv4/IPv6 multiport match
This unifies ipt_multiport and ip6t_multiport to xt_multiport.
As a result, this addes support for inversion and port range match
to IPv6 packets.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-01 02:22:54 -08:00
Yasuyuki Kozakai
dc5ab2faec [NETFILTER]: x_tables: unify IPv4/IPv6 esp match
This unifies ipt_esp and ip6t_esp to xt_esp. Please note that now
a user program needs to specify IPPROTO_ESP as protocol to use esp match
with IPv6. This means that ip6tables requires '-p esp' like iptables.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-01 02:22:30 -08:00
Kalin KOZHUHAROV
8ba8e95ed1 Fix comments: s/granuality/granularity/
I was grepping through the code and some `grep ganularity -R .` didn't
catch what I thought. Then looking closer I saw the term "granuality"
used in only four places (in comments) and granularity in many more
places describing the same idea. Some other facts:

dictionary.com does not know such a word
define:granuality on google is not found (and pages for granuality are
mostly related to patches to the kernel)
it has not been discussed as a term on LKML, AFAICS (=Can Search)

To be consistent, I think granularity should be used everywhere.

Signed-off-by: Kalin KOZHUHAROV <kalin@thinrope.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01 01:41:22 +02:00
Linus Torvalds
4b75679f60 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NET]: Allow skb headroom to be overridden
  [TCP]: Kill unused extern decl for tcp_v4_hash_connecting()
  [NET]: add SO_RCVBUF comment
  [NET]: Deinline some larger functions from netdevice.h
  [DCCP]: Use NULL for pointers, comfort sparse.
  [DECNET]: Fix refcount
2006-03-31 12:52:30 -08:00
Adrian Bunk
a244e1698a [PATCH] fs/namei.c: make lookup_hash() static
As announced, lookup_hash() can now become static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:19:01 -08:00
Antonino A. Daplas
a536093a2f [PATCH] fbcon: Fix big-endian bogosity in slow_imageblit()
The monochrome->color expansion routine that handles bitmaps which have
(widths % 8) != 0 (slow_imageblit) produces corrupt characters in big-endian.
This is caused by a bogus bit test in slow_imageblit().

Fix.

This patch may deserve to go to the stable tree.  The code has already been
well tested in little-endian machines.  It's only in big-endian where there is
uncertainty and Herbert confirmed that this is the correct way to go.

It should not introduce regressions.

Signed-off-by: Antonino Daplas <adaplas@pol.net>
Acked-by: Herbert Poetzl <herbert@13thfloor.at>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:19:00 -08:00
Richard Purdie
6ca017658b [PATCH] backlight: Backlight Class Improvements
Backlight class attributes are currently easy to implement incorrectly.
Moving certain handling into the backlight core prevents this whilst at the
same time makes the drivers simpler and consistent.  The following changes are
included:

The brightness attribute only sets and reads the brightness variable in the
backlight_properties structure.

The power attribute only sets and reads the power variable in the
backlight_properties structure.

Any framebuffer blanking events change a variable fb_blank in the
backlight_properties structure.

The backlight driver has only two functions to implement.  One function is
called when any of the above properties change (to update the backlight
brightness), the second is called to return the current backlight brightness
value.  A new attribute "actual_brightness" is added to return this brightness
as determined by the driver having combined all the above factors (and any
driver/device specific factors).

Additionally, the backlight core takes care of checking the maximum brightness
is not exceeded and of turning off the backlight before device removal.

The corgi backlight driver is updated to reflect these changes.

Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:19:00 -08:00
Eric W. Biederman
3e7e241f8c [PATCH] dcache: Add helper d_hash_and_lookup
It is very common to hash a dentry and then to call lookup.  If we take fs
specific hash functions into account the full hash logic can get ugly.
Further full_name_hash as an inline function is almost 100 bytes on x86 so
having a non-inline choice in some cases can measurably decrease code size.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:19:00 -08:00
Eric W. Biederman
92476d7fc0 [PATCH] pidhash: Refactor the pid hash table
Simplifies the code, reduces the need for 4 pid hash tables, and makes the
code more capable.

In the discussions I had with Oleg it was felt that to a large extent the
cleanup itself justified the work.  With struct pid being dynamically
allocated meant we could create the hash table entry when the pid was
allocated and free the hash table entry when the pid was freed.  Instead of
playing with the hash lists when ever a process would attach or detach to a
process.

For myself the fact that it gave what my previous task_ref patch gave for free
with simpler code was a big win.  The problem is that if you hold a reference
to struct task_struct you lock in 10K of low memory.  If you do that in a user
controllable way like /proc does, with an unprivileged but hostile user space
application with typical resource limits of 1000 fds and 100 processes I can
trigger the OOM killer by consuming all of low memory with task structs, on a
machine wight 1GB of low memory.

If I instead hold a reference to struct pid which holds a pointer to my
task_struct, I don't suffer from that problem because struct pid is 2 orders
of magnitude smaller.  In fact struct pid is small enough that most other
kernel data structures dwarf it, so simply limiting the number of referring
data structures is enough to prevent exhaustion of low memory.

This splits the current struct pid into two structures, struct pid and struct
pid_link, and reduces our number of hash tables from PIDTYPE_MAX to just one.
struct pid_link is the per process linkage into the hash tables and lives in
struct task_struct.  struct pid is given an indepedent lifetime, and holds
pointers to each of the pid types.

The independent life of struct pid simplifies attach_pid, and detach_pid,
because we are always manipulating the list of pids and not the hash table.
In addition in giving struct pid an indpendent life it makes the concept much
more powerful.

Kernel data structures can now embed a struct pid * instead of a pid_t and
not suffer from pid wrap around problems or from keeping unnecessarily
large amounts of memory allocated.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:19:00 -08:00
Eric W. Biederman
8c7904a00b [PATCH] task: RCU protect task->usage
A big problem with rcu protected data structures that are also reference
counted is that you must jump through several hoops to increase the reference
count.  I think someone finally implemented atomic_inc_not_zero(&count) to
automate the common case.  Unfortunately this means you must special case the
rcu access case.

When data structures are only visible via rcu in a manner that is not
determined by the reference count on the object (i.e.  tasks are visible until
their zombies are reaped) there is a much simpler technique we can employ.
Simply delaying the decrement of the reference count until the rcu interval is
over.

What that means is that the proc code that looks up a task and later
wants to sleep can now do:

rcu_read_lock();
task = find_task_by_pid(some_pid);
if (task) {
	get_task_struct(task);
}
rcu_read_unlock();

The effect on the rest of the kernel is that put_task_struct becomes cheaper
and immediate, and in the case where the task has been reaped it frees the
task immediate instead of unnecessarily waiting an until the rcu interval is
over.

Cleanup of task_struct does not happen when its reference count drops to
zero, instead cleanup happens when release_task is called.  Tasks can only
be looked up via rcu before release_task is called.  All rcu protected
members of task_struct are freed by release_task.

Therefore we can move call_rcu from put_task_struct into release_task.  And
we can modify release_task to not immediately release the reference count
but instead have it call put_task_struct from the function it gives to
call_rcu.

The end result:

- get_task_struct is safe in an rcu context where we have just looked
  up the task.

- put_task_struct() simplifies into its old pre rcu self.

This reorganization also makes put_task_struct uncallable from modules as
it is not exported but it does not appear to be called from any modules so
this should not be an issue, and is trivially fixed.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:59 -08:00
Andrew Morton
158d9ebd19 [PATCH] resurrect __put_task_struct
This just got nuked in mainline.  Bring it back because Eric's patches use it.

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:59 -08:00
Con Kolivas
d425b274ba [PATCH] sched: activate SCHED BATCH expired
To increase the strength of SCHED_BATCH as a scheduling hint we can
activate batch tasks on the expired array since by definition they are
latency insensitive tasks.

Signed-off-by: Con Kolivas <kernel@kolivas.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:59 -08:00
Con Kolivas
3dee386e14 [PATCH] sched: cleanup task_activated()
The activated flag in task_struct is used to track different sleep types and
its usage is somewhat obfuscated.  Convert the variable to an enum with more
descriptive names without altering the function.

Signed-off-by: Con Kolivas <kernel@kolivas.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:58 -08:00
Jack Steiner
db1b1fefc2 [PATCH] sched: reduce overhead of calc_load
Currently, count_active_tasks() calls both nr_running() &
nr_interruptible().  Each of these functions does a "for_each_cpu" & reads
values from the runqueue of each cpu.  Although this is not a lot of
instructions, each runqueue may be located on different node.  Depending on
the architecture, a unique TLB entry may be required to access each
runqueue.

Since there may be more runqueues than cpu TLB entries, a scan of all
runqueues can trash the TLB.  Each memory reference incurs a TLB miss &
refill.

In addition, the runqueue cacheline that contains nr_running &
nr_uninterruptible may be evicted from the cache between the two passes.
This causes unnecessary cache misses.

Combining nr_running() & nr_interruptible() into a single function
substantially reduces the TLB & cache misses on large systems.  This should
have no measureable effect on smaller systems.

On a 128p IA64 system running a memory stress workload, the new function
reduced the overhead of calc_load() from 605 usec/call to 324 usec/call.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:58 -08:00
Thomas Gleixner
00362e33f6 [PATCH] hrtimer: create generic sleeper
The removal of the data field in the hrtimer structure enforces the
embedding of the timer into another data structure.  nanosleep now uses a
private implementation of the most common used timer callback function
(simple task wakeup).

In order to avoid the reimplentation of such functionality all over the
place a generic hrtimer_sleeper functionality is created.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:58 -08:00
Richard Purdie
2bfb646cdf [PATCH] LED: Add IDE disk activity LED trigger
Add an LED trigger for IDE disk activity to the ide-disk driver.

Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Acked-by: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:57 -08:00
Richard Purdie
c3bc9956ec [PATCH] LED: add LED trigger tupport
Add support for LED triggers to the LED subsystem.  "Triggers" are events
which change the state of an LED.  Two kinds of trigger are available, simple
ones which can be added to exising code with minimum disruption and complex
ones for implementing new or more complex functionality.

Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:56 -08:00
Richard Purdie
c72a1d608d [PATCH] LED: add LED class
Add the foundations of a new LEDs subsystem.  This patch adds a class which
presents LED devices within sysfs and allows their brightness to be
controlled.

Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:56 -08:00
Rafael J. Wysocki
0ca07731e4 [PATCH] vt: add TIOCL_GETKMSGREDIRECT
Add TIOCL_GETKMSGREDIRECT needed by the userland suspend tool to get the
current value of kmsg_redirect from the kernel so that it can save it and
restore it after resume.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:56 -08:00
Andrew Morton
f79e2abb9b [PATCH] sys_sync_file_range()
Remove the recently-added LINUX_FADV_ASYNC_WRITE and LINUX_FADV_WRITE_WAIT
fadvise() additions, do it in a new sys_sync_file_range() syscall instead.
Reasons:

- It's more flexible.  Things which would require two or three syscalls with
  fadvise() can be done in a single syscall.

- Using fadvise() in this manner is something not covered by POSIX.

The patch wires up the syscall for x86.

The sycall is implemented in the new fs/sync.c.  The intention is that we can
move sys_fsync(), sys_fdatasync() and perhaps sys_sync() into there later.

Documentation for the syscall is in fs/sync.c.

A test app (sync_file_range.c) is in
http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz.

The available-to-GPL-modules do_sync_file_range() is for knfsd: "A COMMIT can
say NFS_DATA_SYNC or NFS_FILE_SYNC.  I can skip the ->fsync call for
NFS_DATA_SYNC which is hopefully the more common."

Note: the `async' writeout mode SYNC_FILE_RANGE_WRITE will turn synchronous if
the queue is congested.  This is trivial to fix: add a new flag bit, set
wbc->nonblocking.  But I'm not sure that we want to expose implementation
details down to that level.

Note: it's notable that we can sync an fd which wasn't opened for writing.
Same with fsync() and fdatasync()).

Note: the code takes some care to handle attempts to sync file contents
outside the 16TB offset on 32-bit machines.  It makes such attempts appear to
succeed, for best 32-bit/64-bit compatibility.  Perhaps it should make such
requests fail...

Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:54 -08:00
Corey Minyard
453823ba08 [PATCH] IPMI: fix startup race condition
Matt Domsch noticed a startup race with the IPMI kernel thread, it was
possible (though extraordinarly unlikely) that a message could come in
before the upper layer was ready to handle it.  This patch splits the
startup processing of an IPMI interface into two parts, one to get ready
and one to actually start the processes to receive messages from the
interface.

[akpm@osdl.org: cleanups]
Signed-off-by: Corey Minyard <minyard@acm.org>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:54 -08:00
Joe Korty
68eef3b479 [PATCH] Simplify proc/devices and fix early termination regression
Make baby-simple the code for /proc/devices.  Based on the proven design
for /proc/interrupts.

This also fixes the early-termination regression 2.6.16 introduced, as
demonstrated by:

    # dd if=/proc/devices bs=1
    Character devices:
      1 mem
    27+0 records in
    27+0 records out

This should also work (but is untested) when /proc/devices >4096 bytes,
which I believe is what the original 2.6.16 rewrite fixed.

[akpm@osdl.org: cleanups, simplifications]
Signed-off-by: Joe Korty <joe.korty@ccur.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:53 -08:00
Oleg Nesterov
3691c5199e [PATCH] kill __init_timer_base in favor of boot_tvec_bases
Commit a4a6198b80:
	[PATCH] tvec_bases too large for per-cpu data

introduced "struct tvec_t_base_s boot_tvec_bases" which is visible at
compile time.  This means we can kill __init_timer_base and move
timer_base_s's content into tvec_t_base_s.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:52 -08:00
Nick Piggin
93fac7041f [PATCH] mm: schedule find_trylock_page() removal
find_trylock_page() is an odd interface in that it doesn't take a reference
like the others.  Now that XFS no longer uses it, and its last remaining
caller actually wants an elevated refcount, opencode that callsite and
schedule find_trylock_page() for removal.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:49 -08:00
Christoph Lameter
9bf9e89c3d [PATCH] migrate_pages_to() must be defined for the no swap case
Fix migrate_pages_to() definition.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:49 -08:00
Adrian Bunk
0500abf521 [PATCH] drivers/mtd/: small cleanups
- chips/sharp.c: make two needlessly global functions static

- move some declarations to a header file where they belong to

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:48 -08:00
Ingo Molnar
48b192686d [PATCH] sem2mutex: drivers/mtd/
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:48 -08:00
Anton Blanchard
025be81e83 [NET]: Allow skb headroom to be overridden
Previously we added NET_IP_ALIGN so an architecture can override the
padding done to align headers. The next step is to allow the skb
headroom to be overridden.

We currently always reserve 16 bytes to grow into, meaning all DMAs
start 16 bytes into a cacheline. On ppc64 we really want DMA writes to
start on a cacheline boundary, so we increase that headroom to one
cacheline.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-31 02:27:06 -08:00
Linus Torvalds
256414dee4 Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  [PATCH] sata_mv: three bug fixes
  [PATCH] libata: ata_dev_init_params() fixes
  [PATCH] libata: Fix interesting use of "extern" and also some bracketing
  [PATCH] libata: Simplex and other mode filtering logic
  [PATCH] libata - ATA is both ATA and CFA
  [PATCH] libata: Add ->set_mode hook for odd drivers
  [PATCH] libata: BMDMA handling updates
  [PATCH] libata: kill trailing whitespace
  [PATCH] libata: add FIXME above ata_dev_xfermask()
  [PATCH] libata: cosmetic changes in ata_bus_softreset()
  [PATCH] libata: kill E.D.D.
2006-03-30 14:29:20 -08:00
Jens Axboe
5abc97aa25 [PATCH] splice: add support for SPLICE_F_MOVE flag
This enables the caller to migrate pages from one address space page
cache to another.  In buzz word marketing, you can do zero-copy file
copies!

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-30 12:28:18 -08:00
Jens Axboe
5274f052e7 [PATCH] Introduce sys_splice() system call
This adds support for the sys_splice system call. Using a pipe as a
transport, it can connect to files or sockets (latter as output only).

From the splice.c comments:

   "splice": joining two ropes together by interweaving their strands.

   This is the "extended pipe" functionality, where a pipe is used as
   an arbitrary in-memory buffer. Think of a pipe as a small kernel
   buffer that you can use to transfer data from one end to the other.

   The traditional unix read/write is extended with a "splice()" operation
   that transfers data buffers to or from a pipe buffer.

   Named by Larry McVoy, original implementation from Linus, extended by
   Jens to support splicing to files and fixing the initial implementation
   bugs.

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-30 12:28:18 -08:00
Alan Cox
5444a6f405 [PATCH] libata: Simplex and other mode filtering logic
Add a field to the host_set called 'flags' (was host_set_flags changed
to suit Jeff)
Add a simplex_claimed field so we can remember who owns the DMA channel
Add a ->mode_filter() hook to allow drivers to filter modes
Add docs for mode_filter and set_mode
Filter according to simplex state
Filter cable in core

This provides the needed framework to support all the mode rules found
in the PATA world. The simplex filter deals with 'to spec' simplex DMA
systems found in older chips. The cable filter avoids duplicating the
same rules in each chip driver with PATA. Finally the mode filter is
neccessary because drive/chip combinations have errata that forbid
certain modes with some drives or types of ATA object.

Drive speed setup remains per channel for now and the filters now use
the framework Tejun put into place which cleans them up a lot from the
older libata-pata patches.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-03-29 19:30:28 -05:00
Alan Cox
e35a9e01f2 [PATCH] libata: Add ->set_mode hook for odd drivers
Some hardware doesn't want the usual mode setup logic running. This
allows the hardware driver to replace it for special cases in the least
invasive way possible.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-03-29 19:30:27 -05:00
Alan Cox
4e5ec5dba2 [PATCH] libata: BMDMA handling updates
This is the minimal patch set to enable the current code to be used with
a controller following SFF (ie any PATA and early SATA controllers)
safely without crashes if there is no BMDMA area or if BMDMA is not
assigned by the BIOS for some reason.

Simplex status is recorded but not acted upon in this change, this isn't
a problem with the current drivers as none of them are for simplex
hardware. A following diff will deal with that.

The flags in the probe structure remain ->host_set_flags although Jeff
asked me to rename them, simply because the rename would break the usual
Linux rules that old code should break when there are changes. not
compile and run and then blow up/eat your computer/etc. Renaming this
later is a trivial exercise once a better name is chosen.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-03-29 19:30:27 -05:00
Denis Vlasenko
56079431b6 [NET]: Deinline some larger functions from netdevice.h
On a allyesconfig'ured kernel:

Size  Uses Wasted Name and definition
===== ==== ====== ================================================
   95  162  12075 netif_wake_queue      include/linux/netdevice.h
  129   86   9265 dev_kfree_skb_any     include/linux/netdevice.h
  127   56   5885 netif_device_attach   include/linux/netdevice.h
   73   86   4505 dev_kfree_skb_irq     include/linux/netdevice.h
   46   60   1534 netif_device_detach   include/linux/netdevice.h
  119   16   1485 __netif_rx_schedule   include/linux/netdevice.h
  143    5    492 netif_rx_schedule     include/linux/netdevice.h
   81    7    366 netif_schedule        include/linux/netdevice.h

netif_wake_queue is big because __netif_schedule is a big inline:

static inline void __netif_schedule(struct net_device *dev)
{
        if (!test_and_set_bit(__LINK_STATE_SCHED, &dev->state)) {
                unsigned long flags;
                struct softnet_data *sd;

                local_irq_save(flags);
                sd = &__get_cpu_var(softnet_data);
                dev->next_sched = sd->output_queue;
                sd->output_queue = dev;
                raise_softirq_irqoff(NET_TX_SOFTIRQ);
                local_irq_restore(flags);
        }
}

static inline void netif_wake_queue(struct net_device *dev)
{
#ifdef CONFIG_NETPOLL_TRAP
        if (netpoll_trap())
                return;
#endif
        if (test_and_clear_bit(__LINK_STATE_XOFF, &dev->state))
                __netif_schedule(dev);
}

By de-inlining __netif_schedule we are saving a lot of text
at each callsite of netif_wake_queue and netif_schedule.
__netif_rx_schedule is also big, and it makes more sense to keep
both of them out of line.

Patch also deinlines dev_kfree_skb_any. We can deinline dev_kfree_skb_irq
instead... oh well.

netif_device_attach/detach are not hot paths, we can deinline them too.

Signed-off-by: Denis Vlasenko <vda@ilport.com.ua>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-29 15:57:29 -08:00
Jeff Garzik
e02a4cabfc Merge branch 'master' 2006-03-29 17:18:49 -05:00