Commit Graph

539 Commits

Author SHA1 Message Date
Linus Torvalds
63589ed078 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (48 commits)
  Documentation: fix minor kernel-doc warnings
  BUG_ON() Conversion in drivers/net/
  BUG_ON() Conversion in drivers/s390/net/lcs.c
  BUG_ON() Conversion in mm/slab.c
  BUG_ON() Conversion in mm/highmem.c
  BUG_ON() Conversion in kernel/signal.c
  BUG_ON() Conversion in kernel/signal.c
  BUG_ON() Conversion in kernel/ptrace.c
  BUG_ON() Conversion in ipc/shm.c
  BUG_ON() Conversion in fs/freevxfs/
  BUG_ON() Conversion in fs/udf/
  BUG_ON() Conversion in fs/sysv/
  BUG_ON() Conversion in fs/inode.c
  BUG_ON() Conversion in fs/fcntl.c
  BUG_ON() Conversion in fs/dquot.c
  BUG_ON() Conversion in md/raid10.c
  BUG_ON() Conversion in md/raid6main.c
  BUG_ON() Conversion in md/raid5.c
  Fix minor documentation typo
  BFP->BPF in Documentation/networking/tuntap.txt
  ...
2006-04-02 12:58:45 -07:00
Dmitry Torokhov
95d465fd75 Manual merge with Linus.
Conflicts:
	arch/powerpc/kernel/setup-common.c
	drivers/input/keyboard/hil_kbd.c
	drivers/input/mouse/hil_ptr.c
2006-04-02 00:08:05 -05:00
Horms
36a891b67f kexec: grammar fix for crash_save_this_cpu()
kexec: grammar fix for crash_save_this_cpu()

Signed-Off-By: Horms <horms@verge.net.au>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01 01:39:17 +02:00
Adrian Bunk
0cb3463f04 [PATCH] unexport get_wchan
The only user of get_wchan is the proc fs - and proc can't be built modular.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:19:01 -08:00
Andrew Morton
f79e2abb9b [PATCH] sys_sync_file_range()
Remove the recently-added LINUX_FADV_ASYNC_WRITE and LINUX_FADV_WRITE_WAIT
fadvise() additions, do it in a new sys_sync_file_range() syscall instead.
Reasons:

- It's more flexible.  Things which would require two or three syscalls with
  fadvise() can be done in a single syscall.

- Using fadvise() in this manner is something not covered by POSIX.

The patch wires up the syscall for x86.

The sycall is implemented in the new fs/sync.c.  The intention is that we can
move sys_fsync(), sys_fdatasync() and perhaps sys_sync() into there later.

Documentation for the syscall is in fs/sync.c.

A test app (sync_file_range.c) is in
http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz.

The available-to-GPL-modules do_sync_file_range() is for knfsd: "A COMMIT can
say NFS_DATA_SYNC or NFS_FILE_SYNC.  I can skip the ->fsync call for
NFS_DATA_SYNC which is hopefully the more common."

Note: the `async' writeout mode SYNC_FILE_RANGE_WRITE will turn synchronous if
the queue is congested.  This is trivial to fix: add a new flag bit, set
wbc->nonblocking.  But I'm not sure that we want to expose implementation
details down to that level.

Note: it's notable that we can sync an fd which wasn't opened for writing.
Same with fsync() and fdatasync()).

Note: the code takes some care to handle attempts to sync file contents
outside the 16TB offset on 32-bit machines.  It makes such attempts appear to
succeed, for best 32-bit/64-bit compatibility.  Perhaps it should make such
requests fail...

Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:54 -08:00
OGAWA Hirofumi
9b41046cd0 [PATCH] Don't pass boot parameters to argv_init[]
The boot cmdline is parsed in parse_early_param() and
parse_args(,unknown_bootoption).

And __setup() is used in obsolete_checksetup().

	start_kernel()
		-> parse_args()
			-> unknown_bootoption()
				-> obsolete_checksetup()

If __setup()'s callback (->setup_func()) returns 1 in
obsolete_checksetup(), obsolete_checksetup() thinks a parameter was
handled.

If ->setup_func() returns 0, obsolete_checksetup() tries other
->setup_func().  If all ->setup_func() that matched a parameter returns 0,
a parameter is seted to argv_init[].

Then, when runing /sbin/init or init=app, argv_init[] is passed to the app.
If the app doesn't ignore those arguments, it will warning and exit.

This patch fixes a wrong usage of it, however fixes obvious one only.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:53 -08:00
Jakub Jelinek
da2e9e1ff4 [PATCH] Mark unwind info for signal trampolines in vDSOs
Mark unwind info for signal trampolines using the new S augmentation flag
introduced in: http://gcc.gnu.org/PR26208.

GCC 4.2 (or patched earlier GCC) will be able to special case unwinding
through frames right above signal trampolines.  As the augmentations start
with z flag and S is at the very end of the augmentation string, older GCCs
will just skip the S flag as unknown (that's why an augmentation flag was
chosen over say a new CFA opcode).

Signed-off-by: Jakub Jelinek <jakub@redhat.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:52 -08:00
Vivek Goyal
1a75a3f068 [PATCH] i386 kdump timer vector lockup fix
Porting the patch I posted for x86_64 to i386.

http://marc.theaimsgroup.com/?l=linux-kernel&m=114178139610707&w=2

o While using kdump, after a system crash when second kernel boots, timer
  vector gets (0x31) locked and CPU does not see timer interrupts
  travelling from IOAPIC to APIC.  Currently it does not lead to boot
  failure in second kernel as timer interrupts continues to come as ExtInt
  through LAPIC directly, but fixing it is good in case some boards do not
  support the other mode.

o After a system crash, it is not safe to service interrupts any more,
  hence interrupts are disabled.  This leads to pending interrupts at
  LAPIC.  LAPIC sends these interrupts to the CPU during early boot of
  second kernel.  Other pending interrupts are discarded saying unexpected
  trap but timer interrupt is serviced and CPU does not issue an LAPIC EOI
  because it think this interrupt came from i8259 and sends ack to 8259.
  This leads to vector 0x31 locking as LAPIC does not clear respective ISR
  and keeps on waiting for EOI.

o This patch issues extra EOI for the pending interrupts who have ISR set.

o Though today only timer seems to be the special case because in early
  boot it thinks interrupts are coming from i8259 and uses
  mask_and_ack_8259A() as ack handler and does not issue LAPIC EOI.  But
  probably doing it in generic manner for all vectors makes sense.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:50 -08:00
Jens Axboe
5274f052e7 [PATCH] Introduce sys_splice() system call
This adds support for the sys_splice system call. Using a pipe as a
transport, it can connect to files or sockets (latter as output only).

From the splice.c comments:

   "splice": joining two ropes together by interweaving their strands.

   This is the "extended pipe" functionality, where a pipe is used as
   an arbitrary in-memory buffer. Think of a pipe as a small kernel
   buffer that you can use to transfer data from one end to the other.

   The traditional unix read/write is extended with a "splice()" operation
   that transfers data buffers to or from a pipe buffer.

   Named by Larry McVoy, original implementation from Linus, extended by
   Jens to support splicing to files and fixing the initial implementation
   bugs.

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-30 12:28:18 -08:00
Linus Torvalds
9561b03dc3 Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] cpufreq_conservative: keep ignore_nice_load and freq_step values when reselected
  [CPUFREQ] powernow: remove private for_each_cpu_mask()
  [CPUFREQ] hotplug cpu fix for powernow-k8
  [PATCH] cpufreq_ondemand: add range check
  [PATCH] cpufreq_ondemand: keep ignore_nice_load value when it is reselected
  [PATCH] cpufreq_ondemand: Warn if it cannot run due to too long transition latency
  [PATCH] cpufreq_conservative: alternative initialise approach
  [PATCH] cpufreq_conservative: make for_each_cpu() safe
  [PATCH] cpufreq_conservative: alter default responsiveness
  [PATCH] cpufreq_conservative: aligning of codebase with ondemand
2006-03-28 09:48:32 -08:00
Jesper Juhl
b791ccef21 [PATCH] fix signed vs unsigned in nmi watchdog
Fix "signed vs unsigned" in nmi_watchdog_tick.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:08 -08:00
Adrian Bunk
f45e4656ac [PATCH] arch/i386/kernel/microcode.c: remove the obsolete microcode_ioctl
Nowadays, even Debian stable ships a microcode_ctl utility recent enough to no
longer use this ioctl.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Tigran Aivazian <tigran_aivazian@symantec.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:06 -08:00
KAMEZAWA Hiroyuki
c8912599c6 [PATCH] for_each_possible_cpu: i386
This patch replaces for_each_cpu with for_each_possible_cpu.

under arch/i386.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:05 -08:00
Andrew Morton
64840e2722 [CPUFREQ] powernow: remove private for_each_cpu_mask()
It is unneeded and wrong.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-03-27 15:06:08 -05:00
shin, jacob
eef5167e50 [CPUFREQ] hotplug cpu fix for powernow-k8
Andi's previous fix to initialise powernow_data on all siblings
will not work properly with CPU Hotplug.

Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-03-27 15:01:28 -05:00
Alan Stern
e041c68341 [PATCH] Notifier chain update: API changes
The kernel's implementation of notifier chains is unsafe.  There is no
protection against entries being added to or removed from a chain while the
chain is in use.  The issues were discussed in this thread:

    http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2

We noticed that notifier chains in the kernel fall into two basic usage
classes:

	"Blocking" chains are always called from a process context
	and the callout routines are allowed to sleep;

	"Atomic" chains can be called from an atomic context and
	the callout routines are not allowed to sleep.

We decided to codify this distinction and make it part of the API.  Therefore
this set of patches introduces three new, parallel APIs: one for blocking
notifiers, one for atomic notifiers, and one for "raw" notifiers (which is
really just the old API under a new name).  New kinds of data structures are
used for the heads of the chains, and new routines are defined for
registration, unregistration, and calling a chain.  The three APIs are
explained in include/linux/notifier.h and their implementation is in
kernel/sys.c.

With atomic and blocking chains, the implementation guarantees that the chain
links will not be corrupted and that chain callers will not get messed up by
entries being added or removed.  For raw chains the implementation provides no
guarantees at all; users of this API must provide their own protections.  (The
idea was that situations may come up where the assumptions of the atomic and
blocking APIs are not appropriate, so it should be possible for users to
handle these things in their own way.)

There are some limitations, which should not be too hard to live with.  For
atomic/blocking chains, registration and unregistration must always be done in
a process context since the chain is protected by a mutex/rwsem.  Also, a
callout routine for a non-raw chain must not try to register or unregister
entries on its own chain.  (This did happen in a couple of places and the code
had to be changed to avoid it.)

Since atomic chains may be called from within an NMI handler, they cannot use
spinlocks for synchronization.  Instead we use RCU.  The overhead falls almost
entirely in the unregister routine, which is okay since unregistration is much
less frequent that calling a chain.

Here is the list of chains that we adjusted and their classifications.  None
of them use the raw API, so for the moment it is only a placeholder.

  ATOMIC CHAINS
  -------------
arch/i386/kernel/traps.c:		i386die_chain
arch/ia64/kernel/traps.c:		ia64die_chain
arch/powerpc/kernel/traps.c:		powerpc_die_chain
arch/sparc64/kernel/traps.c:		sparc64die_chain
arch/x86_64/kernel/traps.c:		die_chain
drivers/char/ipmi/ipmi_si_intf.c:	xaction_notifier_list
kernel/panic.c:				panic_notifier_list
kernel/profile.c:			task_free_notifier
net/bluetooth/hci_core.c:		hci_notifier
net/ipv4/netfilter/ip_conntrack_core.c:	ip_conntrack_chain
net/ipv4/netfilter/ip_conntrack_core.c:	ip_conntrack_expect_chain
net/ipv6/addrconf.c:			inet6addr_chain
net/netfilter/nf_conntrack_core.c:	nf_conntrack_chain
net/netfilter/nf_conntrack_core.c:	nf_conntrack_expect_chain
net/netlink/af_netlink.c:		netlink_chain

  BLOCKING CHAINS
  ---------------
arch/powerpc/platforms/pseries/reconfig.c:	pSeries_reconfig_chain
arch/s390/kernel/process.c:		idle_chain
arch/x86_64/kernel/process.c		idle_notifier
drivers/base/memory.c:			memory_chain
drivers/cpufreq/cpufreq.c		cpufreq_policy_notifier_list
drivers/cpufreq/cpufreq.c		cpufreq_transition_notifier_list
drivers/macintosh/adb.c:		adb_client_list
drivers/macintosh/via-pmu.c		sleep_notifier_list
drivers/macintosh/via-pmu68k.c		sleep_notifier_list
drivers/macintosh/windfarm_core.c	wf_client_list
drivers/usb/core/notify.c		usb_notifier_list
drivers/video/fbmem.c			fb_notifier_list
kernel/cpu.c				cpu_chain
kernel/module.c				module_notify_list
kernel/profile.c			munmap_notifier
kernel/profile.c			task_exit_notifier
kernel/sys.c				reboot_notifier_list
net/core/dev.c				netdev_chain
net/decnet/dn_dev.c:			dnaddr_chain
net/ipv4/devinet.c:			inetaddr_chain

It's possible that some of these classifications are wrong.  If they are,
please let us know or submit a patch to fix them.  Note that any chain that
gets called very frequently should be atomic, because the rwsem read-locking
used for blocking chains is very likely to incur cache misses on SMP systems.
(However, if the chain's callout routines may sleep then the chain cannot be
atomic.)

The patch set was written by Alan Stern and Chandra Seetharaman, incorporating
material written by Keith Owens and suggestions from Paul McKenney and Andrew
Morton.

[jes@sgi.com: restructure the notifier chain initialization macros]
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:50 -08:00
Ingo Molnar
dfd4e3ec24 [PATCH] lightweight robust futexes: i386
i386: add the futex_atomic_cmpxchg_inuser() assembly implementation, and wire
up the new syscalls.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Acked-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:49 -08:00
Dave Hansen
22a9835c35 [PATCH] unify PFN_* macros
Just about every architecture defines some macros to do operations on pfns.
 They're all virtually identical.  This patch consolidates all of them.

One minor glitch is that at least i386 uses them in a very skeletal header
file.  To keep away from #include dependency hell, I stuck the new
definitions in a new, isolated header.

Of all of the implementations, sh64 is the only one that varied by a bit.
It used some masks to ensure that any sign-extension got ripped away before
the arithmetic is done.  This has been posted to that sh64 maintainers and
the development list.

Compiles on x86, x86_64, ia64 and ppc64.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:48 -08:00
Shaohua Li
b06be912a3 [PATCH] x86: don't use cpuid.2 to determine cache info if cpuid.4 is supported
Don't use cpuid.2 to determine cache info if cpuid.4 is supported.  The
exception is P4 trace cache.  We always use cpuid.2 to get trace cache
under P4.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:44 -08:00
Siddha, Suresh B
1e9f28fa1e [PATCH] sched: new sched domain for representing multi-core
Add a new sched domain for representing multi-core with shared caches
between cores.  Consider a dual package system, each package containing two
cores and with last level cache shared between cores with in a package.  If
there are two runnable processes, with this appended patch those two
processes will be scheduled on different packages.

On such systems, with this patch we have observed 8% perf improvement with
specJBB(2 warehouse) benchmark and 35% improvement with CFP2000 rate(with 2
users).

This new domain will come into play only on multi-core systems with shared
caches.  On other systems, this sched domain will be removed by domain
degeneration code.  This new domain can be also used for implementing power
savings policy (see OLS 2005 CMP kernel scheduler paper for more details..
I will post another patch for power savings policy soon)

Most of the arch/* file changes are for cpu_coregroup_map() implementation.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:43 -08:00
OGAWA Hirofumi
dbffa47161 [PATCH] PM-Timer: don't use workaround if chipset is not buggy
Current timer_pm.c reads I/O port triple times, in order to avoid the bug
of chipset.  But I/O port is slow.

2.6.16 (pmtmr)
Simple gettimeofday: 3.6532 microseconds

2.6.16+patch (pmtmr)
Simple gettimeofday: 1.4582 microseconds

[if chip is buggy, probably it will be 7us or more in 4.2% of probability.]

This patch adds blacklist of buggy chip, and if chip is not buggy, this
uses fast normal version instead of slow workaround version.

If chip is buggy, warnings "pmtmr is slow".  But sounds like there is gray
zone.  I found the PIIX4 errata, but I couldn't find the ICH4 errata.  But
some motherboard seems to have problem.

So, if we found a ICH4, generate warnings, and use a workaround version.
If user's ICH4 is good, the user can specify the "pmtmr_good" boot
parameter to use fast version.

Acked-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:37 -08:00
Prasanna S Panchamukhi
b4026513b8 [PATCH] kprobes: fix broken fault handling for i386
Provide proper kprobes fault handling, if a user-specified pre/post handlers
tries to access user address space, through copy_from_user(), get_user() etc.

The user-specified fault handler gets called only if the fault occurs while
executing user-specified handlers.  In such a case user-specified handler is
allowed to fix it first, later if the user-specifed fault handler does not fix
it, we try to fix it by calling fix_exception().

The user-specified handler will not be called if the fault happens when single
stepping the original instruction, instead we reset the current probe and
allow the system page fault handler to fix it up.

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:04 -08:00
bibo,mao
2326c77017 [PATCH] kprobe handler: discard user space trap
Currently kprobe handler traps only happen in kernel space, so function
kprobe_exceptions_notify should skip traps which happen in user space.
This patch modifies this, and it is based on 2.6.16-rc4.

Signed-off-by: bibo mao <bibo.mao@intel.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: "Keshavamurthy, Anil S" <anil.s.keshavamurthy@intel.com>
Cc: <hiramatu@sdl.hitachi.co.jp>
Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:04 -08:00
bibo mao
c6fd91f0bd [PATCH] kretprobe instance recycled by parent process
When kretprobe probes the schedule() function, if the probed process exits
then schedule() will never return, so some kretprobe instances will never
be recycled.

In this patch the parent process will recycle retprobe instances of the
probed function and there will be no memory leak of kretprobe instances.

Signed-off-by: bibo mao <bibo.mao@intel.com>
Cc: Masami Hiramatsu <hiramatu@sdl.hitachi.co.jp>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:04 -08:00
Masami Hiramatsu
c9becf58d9 [PATCH] kretprobe: kretprobe-booster
In normal operation, kretprobe makes a target function return to trampoline
code.  A kprobe (called trampoline_probe) has been inserted in the trampoline
code.  When the kernel hits this kprobe, it calls kretprobe's handler and it
returns to the original return address.

Kretprobe-booster removes the trampoline_probe.  It allows the trampoline code
to call kretprobe's handler directly instead of invoking kprobe.  The
trampoline code returns to the original return address.

(changelog from Chuck Ebbert <76306.1226@compuserve.com> - thanks ;))

Signed-off-by: Masami Hiramatsu <hiramatu@sdl.hitachi.co.jp>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:04 -08:00
Masami Hiramatsu
311ac88fd2 [PATCH] x86: kprobes-booster
Current kprobe copies the original instruction at the probe point and replaces
it with a breakpoint instruction (int3).  When the kernel hits the probe
point, kprobe handler is invoked.  And the copied instruction is single-step
executed on the copied buffer (not on the original address) by kprobe.  After
that, the kprobe checks registers and modify it (if need) as if the
instructions was executed on the original address.

My proposal is based on the fact there are many instructions which do NOT
require the register modification after the single-step execution.  When the
copied instruction is a kind of them, kprobe just jumps back to the next
instruction after single-step execution.  If so, why don't we execute those
instructions directly?

With kprobe-booster patch, kprobes will execute a copied instruction directly
and (if need) jump back to original code.  This direct execution is executed
when the kprobe don't have both post_handler and break_handler, and the copied
instruction can be executed directly.

I sorted instructions which can be executed directly or not;

- Call instructions are NG(can not be executed directly).
  We should correct the return address pushed into top of stack.
- Indirect instructions except for absolute indirect-jumps
  are NG. Those instructions changes EIP randomly. We should
  check EIP and correct it.
- Instructions that change EIP beyond the range of the
  instruction buffer are NG.
- Instructions that change EIP to tail 5 bytes of the
  instruction buffer (it is the size of a jump instruction).
  We must write a jump instruction which backs to original
  kernel code in the instruction buffer.
- Break point instruction is NG. We should not touch EIP and
  pass to other handlers.
- Absolute direct/indirect jumps are OK.- Conditional Jumps are NG.
- Halt and software-interruptions are NG. Because it will stay on
  the instruction buffer of kprobes.
- Prefixes are NG.
- Unknown/reserved opcode is NG.
- Other 1 byte instructions are OK. But those instructions need a
  jump back code.
- 2 bytes instructions are mapped sparsely. So, in this release,
  this patch don't boost those instructions.

>From Intel's IA-32 opcode map described in IA-32 Intel Architecture Software
Developer's Manual Vol.2 B, I determined that following opcodes are not
boostable.

- 0FH (2byte escape)
- 70H - 7FH (Jump on condition)
- 9AH (Call) and 9CH (Pushf)
- C0H-C1H (Grp 2: includes reserved opcode)
- C6H-C7H (Grp11: includes reserved opcode)
- CCH-CEH (Software-interrupt)
- D0H-D3H (Grp2: includes reserved opcode)
- D6H (Reserved)
- D8H-DFH (Coprocessor)
- E0H-E3H (loop/conditional jump)
- E8H (Call)
- F0H-F3H (Prefixes and reserved)
- F4H (Halt)
- F6H-F7H (Grp3: includes reserved opcode)
- FEH-FFH(Grp4,5: includes reserved opcode)

Kprobe-booster checks whether target instruction can be boosted (can be
executed directly) at arch_copy_kprobe() function.  If the target instruction
can be boosted, it clears "boostable" flag.  If not, it sets "boostable" flag
-1.  This is disabled status.  In resume_execution() function, If "boostable"
flag is cleared, kprobe-booster measures the size of the target instruction
and sets "boostable" flag 1.

In kprobe_handler(), kprobe checks the "boostable" flag.  If the flag is 1, it
resets current kprobe and executes instruction buffer directly instead of
single stepping.

When unregistering a boosted kprobe, it calls synchronize_sched()
after "int3" is removed. So we can ensure followings after
the synchronize_sched() called.
- interrupt handlers are finished on all CPUs.
- instruction buffer is not executed on all CPUs.
And we can release the boosted kprobe safely.

And also, on preemptible kernel, the booster is not enabled where the kernel
preemption is enabled.  So, there are no preempted threads on the instruction
buffer.

The description of kretprobe-booster:
====================================

In the normal operation, kretprobe make a target function return to trampoline
code.  And a kprobe (called trampoline_probe) have been inserted at the
trampoline code.  When the kernel hits this kprobe, it calls kretprobe's
handler and it returns to original return address.

Kretprobe-booster patch removes the trampoline_probe.  It allows the
trampoline code to call kretprobe's handler directly instead of invoking
kprobe.  And tranpoline code returns to original return address.

This new trampoline code stores and restores registers, so the kretprobe
handler is still able to access those registers.

Current kprobe has about 1.3 usec/probe(*) overhead, and kprobe-booster patch
reduces it to 0.6 usec/probe(*).  Also current kretprobe has about 2.0
usec/probe(*) overhead.  Kprobe-booster patch reduces it to 1.3 usec/probe(*),
and the combination of both kprobe-booster patch and kretprobe-booster patch
reduces it to 0.9 usec/probe(*).

I expect the combination of both patches can reduce half of a probing
overhead.

Performance numbers strongly depend on the processor model.

Andrew Morton wrote:
> These preempt tricks look rather nasty.  Can you please describe what the
> problem is, precisely?  And how this code avoids it?  Perhaps we can find
> something cleaner.

The problem is how to remove the copied instructions of the
kprobe *safely* on the preemptable kernel (CONFIG_PREEMPT=y).

Kprobes basically executes the following actions;

(1)int3
(2)preempt_disable()
(3)kprobe_prehandler()
(4)copied instructioin(single step)
(5)kprobe_posthandler()
(6)preempt_enable()
(7)return to the original code

During the execution of copied instruction, preemption is
disabled (from step (2) to (6)).
When unregistering the probes, Kprobe waits for RCU
quiescent state by using synchronize_sched() after removing
int3 instruction.
Thus we can ensure the copied instruction is not executed.

On the other hand, kprobe-booster executes the following actions;

(1)int3
(2)preempt_disable()
(3)kprobe_prehandler()
(4)preempt_enable()             <-- this one is added by my patch
(5)copied instruction(direct execution)
(6)jmp back to the original code

The problem is that we have no way to prevent preemption on
step (5) or (6). We cannot call preempt_disable() after step (6),
because there are no rooms to do that. Thus, some other
processes may be preempted at step(5) or (6) on preemptable kernel.
And I couldn't find the easy way to ensure that other processes'
stack do *not* have the address of them. (I thought some way
to do that, but those are very costly.)

So currently, I simply boost the kprobe only when the probe
point is already preemption disabled.

> Also, the patch adds a preempt_enable() but I don't see a corresponding
> preempt_disable().  Am I missing something?

It is corresponding to the preempt_disable() in the top of
kprobe_handler().
I copied the code of kprobe_handler() here:

static int __kprobes kprobe_handler(struct pt_regs *regs)
{
        struct kprobe *p;
        int ret = 0;
        kprobe_opcode_t *addr = NULL;
        unsigned long *lp;
        struct kprobe_ctlblk *kcb;

        /*
         * We don't want to be preempted for the entire
         * duration of kprobe processing
         */
        preempt_disable();             <-- HERE
        kcb = get_kprobe_ctlblk();

Signed-off-by: Masami Hiramatsu <hiramatu@sdl.hitachi.co.jp>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:04 -08:00
Masami Hiramatsu
b50ea74c7b [PATCH] kprobes: clean up resume_execute()
Clean up kprobe's resume_execute() for i386 arch.

Signed-off-by: Masami Hiramatsu <hiramatu@sdl.hitachi.co.jp>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:03 -08:00
Darren Jenkins
d6d21dfdd3 [PATCH] fix array overrun in efi.c
Coverity found an over-run @ line 364 of efi.c

This is due to the loop checking the size correctly, then adding a '\0'
after possibly hitting the end of the array.

Ensure the loop exits with one space left in the array.

Signed-off-by: Darren Jenkins <darrenrjenkins@gmail.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:57 -08:00
Ingo Molnar
14cc3e2b63 [PATCH] sem2mutex: misc static one-file mutexes
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jens Axboe <axboe@suse.de>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Cc: Greg KH <greg@kroah.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Adam Belay <ambx1@neo.rr.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:55 -08:00
Tolentino, Matthew E
23dd842c00 [PATCH] EFI fixes
Here's a patch that fixes EFI boot for x86 on 2.6.16-rc5-mm3.  The
off-by-one is admittedly my fault, but the other two fix up the rest.

Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Cc: "Tolentino, Matthew E" <matthew.e.tolentino@intel.com>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:54 -08:00
Bjorn Helgaas
b2c99e3c70 [PATCH] EFI: keep physical table addresses in efi structure
Almost all users of the table addresses from the EFI system table want
physical addresses.  So rather than doing the pa->va->pa conversion, just keep
physical addresses in struct efi.

This fixes a DMI bug: the efi structure contained the physical SMBIOS address
on x86 but the virtual address on ia64, so dmi_scan_machine() used ioremap()
on a virtual address on ia64.

This is essentially the same as an earlier patch by Matt Tolentino:
	http://marc.theaimsgroup.com/?l=linux-kernel&m=112130292316281&w=2
except that this changes all table addresses, not just ACPI addresses.

Matt's original patch was backed out because it caused MCAs on HP sx1000
systems.  That problem is resolved by the ioremap() attribute checking added
for ia64.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Cc: "Tolentino, Matthew E" <matthew.e.tolentino@intel.com>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:54 -08:00
Bjorn Helgaas
27d8e3d15b [PATCH] DMI: only ioremap stuff we actually need
dmi_scan_machine() tries to ioremap 0x10000 (64K) bytes, even though it only
looks at the first 32 bytes or so.  If the SMBIOS table is near the end of a
memory region, the ioremap() may fail when it shouldn't.

This is in the efi_enabled path, so it really only affects ia64 at the moment.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Cc: "Tolentino, Matthew E" <matthew.e.tolentino@intel.com>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:54 -08:00
Matt Domsch
3ed3bce846 [PATCH] ia64: use i386 dmi_scan.c
Enable DMI table parsing on ia64.

Andi Kleen has a patch in his x86_64 tree which enables the use of i386
dmi_scan.c on x86_64.  dmi_scan.c functions are being used by the
drivers/char/ipmi/ipmi_si_intf.c driver for autodetecting the ports or
memory spaces where the IPMI controllers may be found.

This patch adds equivalent changes for ia64 as to what is in the x86_64
tree.  In addition, I reworked the DMI detection, such that on EFI-capable
systems, it uses the efi.smbios pointer to find the table, rather than
brute-force searching from 0xF0000.  On non-EFI systems, it continues the
brute-force search.

My test system, an Intel S870BN4 'Tiger4', aka Dell PowerEdge 7250, with
latest BIOS, does not list the IPMI controller in the ACPI namespace, nor
does it have an ACPI SPMI table.  Also note, currently shipping Dell x8xx
EM64T servers don't have these either, so DMI is the only method for
obtaining the address of the IPMI controller.

Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:54 -08:00
Vivek Goyal
10dbe196a8 [PATCH] i386: export: memory more than 4G through /proc/iomem
Currently /proc/iomem exports physical memory also apart from io device
memory.  But on i386, it truncates any memory more than 4GB.  This leads to
problems for kexec/kdump.

Kexec reads /proc/iomem to determine the system memory layout and prepares a
memory map based on that and passes it to the kernel being kexeced.  Given the
fact that memory more than 4GB has been truncated, new kernel never gets to
see and use that memory.

Kdump also reads /proc/iomem to determine the physical memory layout of the
system and encodes this informaiton in ELF headers.  After a crash new kernel
parses these ELF headers being used by previous kernel and vmcore is prepared
accordingly.  As memory more than 4GB has been truncated, kdump never sees
that memory and never prepares ELF headers for it.  Hence vmcore is truncated
and limited to 4GB even if there is more physical memory in the system.

This patch exports memory more than 4GB through /proc/iomem on i386.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:54 -08:00
Jan Beulich
20c0d2d440 [PATCH] i386: pass proper trap numbers to die chain handlers
Pass the trap number causing the call to notify_die() to the die
notification handler chain in a number of instances.  Also, honor the
return value from the handler chain invocation in die() as, through a
debugger, the fault may have been fixed.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-By: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:53 -08:00
Linus Torvalds
1b9a391736 Merge branch 'audit.b3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: (22 commits)
  [PATCH] fix audit_init failure path
  [PATCH] EXPORT_SYMBOL patch for audit_log, audit_log_start, audit_log_end and audit_format
  [PATCH] sem2mutex: audit_netlink_sem
  [PATCH] simplify audit_free() locking
  [PATCH] Fix audit operators
  [PATCH] promiscuous mode
  [PATCH] Add tty to syscall audit records
  [PATCH] add/remove rule update
  [PATCH] audit string fields interface + consumer
  [PATCH] SE Linux audit events
  [PATCH] Minor cosmetic cleanups to the code moved into auditfilter.c
  [PATCH] Fix audit record filtering with !CONFIG_AUDITSYSCALL
  [PATCH] Fix IA64 success/failure indication in syscall auditing.
  [PATCH] Miscellaneous bug and warning fixes
  [PATCH] Capture selinux subject/object context information.
  [PATCH] Exclude messages by message type
  [PATCH] Collect more inode information during syscall processing.
  [PATCH] Pass dentry, not just name, in fsnotify creation hooks.
  [PATCH] Define new range of userspace messages.
  [PATCH] Filter rule comparators
  ...

Fixed trivial conflict in security/selinux/hooks.c
2006-03-25 09:24:53 -08:00
Andi Kleen
ad90573f93 [PATCH] x86_64: Initialize powernow_data[] for all siblings
I got an oops on a dual core system because the lost tick handler
called cpufreq_get() on core 1 and powernow tried to follow
a NULL powernow_data[] pointer there.

Initialize powernow_data for all cores of a CPU.

Cc: Jacob Shin <jacob.shin@amd.com>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:14:39 -08:00
Andi Kleen
9d95dd849c [PATCH] i386/x86-64: List Intel LaGrange AKA SMX in /proc/cpuinfo
Spec just got published so we know the CPUID bit.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:57 -08:00
Andi Kleen
2ab7f1833b [PATCH] x86_64: Quieten down microcode update driver
Only log data in microcode driver when something is changed Otherwise it
was far too noisy on large systems.

Also remove the printk when it is unloaded.

Cc: tigran@veritas.com

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:56 -08:00
Andi Kleen
f2d3efedbe [PATCH] x86_64: Implement early DMI scanning
There are more and more cases where we need to know DMI information
early to work around bugs.  i386 already had early DMI scanning, but
x86-64 didn't.  Implement this now.

This required some cleanup in the i386 code.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:55 -08:00
Andi Kleen
f083a329e6 [PATCH] x86_64: Clean up and tweak ACPI blacklist year code
- Move the core parser into dmi_scan.c.  It can be useful for other
   subsystems too.
 - Differentiate between field doesn't exist and field is 0 or
   unparseable.  The first case is likely an old BIOS with broken ACPI,
   the later is likely a slightly buggy BIOS where someone forget to
   edit the date.  Don't blacklist in the later case.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:54 -08:00
Linus Torvalds
be9bf30c73 Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] kzalloc conversion for gx-suspmod
  [CPUFREQ] Whitespace cleanup
  [CPUFREQ] Mark longhaul driver as broken.
  [PATCH] cpufreq: fix section mismatch warnings
  [CPUFREQ] Fix the p4-clockmod N60 errata workaround.
  [CPUFREQ] Fix handling for CPU hotplug
  [CPUFREQ] powernow-k8: Let cpufreq driver handle affected CPUs
  [CPUFREQ] Lots of whitespace & CodingStyle cleanup.
  [CPUFREQ] Remove duplicate cpuinfo struct
  [CPUFREQ] Silence powernow-k8 warning on k7's.
2006-03-25 08:52:23 -08:00
Linus Torvalds
2e1ca21d46 Merge master.kernel.org:/pub/scm/linux/kernel/git/sam/kbuild
* master.kernel.org:/pub/scm/linux/kernel/git/sam/kbuild: (46 commits)
  kbuild: remove obsoleted scripts/reference_* files
  kbuild: fix make help & make *pkg
  kconfig: fix time ordering of writes to .kconfig.d and include/linux/autoconf.h
  Kconfig: remove the CONFIG_CC_ALIGN_* options
  kbuild: add -fverbose-asm to i386 Makefile
  kbuild: clean-up genksyms
  kbuild: Lindent genksyms.c
  kbuild: fix genksyms build error
  kbuild: in makefile.txt note that Makefile is preferred name for kbuild files
  kbuild: replace PHONY with FORCE
  kbuild: Fix bug in crc symbol generating of kernel and modules
  kbuild: change kbuild to not rely on incorrect GNU make behavior
  kbuild: when warning symbols exported twice now tell user this is the problem
  kbuild: fix make dir/file.xx when asm symlink is missing
  kbuild: in the section mismatch check try harder to find symbols
  kbuild: fix section mismatch check for unwind on IA64
  kbuild: kill false positives from section mismatch warnings for powerpc
  kbuild: kill trailing whitespace in modpost & friends
  kbuild: small update of allnoconfig description
  kbuild: make namespace.pl CROSS_COMPILE happy
  ...

Trivial conflict in arch/ppc/boot/Makefile manually fixed up
2006-03-25 08:48:48 -08:00
Andrew Morton
f081a529f8 [PATCH] cpufreq: speedstep-smi asm fix
Fix bug identified by Linus Torvalds <torvalds@osdl.org>: the `out'
instruction depends upon the state of memory_data[], so we need to tell gcc
that before executing it. (The opcode, not gcc).

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=5553

Thanks to Antonio Ospite <ospite@studenti.unina.it> for testing.

Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:42:45 -08:00
Ashok Raj
34f361ade2 [PATCH] Check if cpu can be onlined before calling smp_prepare_cpu()
- Moved check for online cpu out of smp_prepare_cpu()

- Moved default declaration of smp_prepare_cpu() to kernel/cpu.c

- Removed lock_cpu_hotplug() from smp_prepare_cpu() to around it, since
  its called from cpu_up() as well now.

- Removed clearing from cpu_present_map during cpu_offline as it breaks
  using cpu_up() directly during a subsequent online operation.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>
Cc: "Li, Shaohua" <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:23:01 -08:00
Andrey Panin
bc83455bc8 [PATCH] fix DMI onboard device discovery
Attached patch fixes invalid pointer arithmetic in DMI code to make onboard
device discovery working again.

akpm: bug has been present since dmi_find_device() was added in 2.6.14.
Affects ipmi only (I think) - the symptoms weren't described.

akpm: changed to use pointer arithmetic rather than open-coded sizeof.

Signed-off-by: Andrey Panin <pazke@donpac.ru>
Cc: Corey Minyard <minyard@acm.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:48 -08:00
Adrian Bunk
cdb0452789 [PATCH] kill include/linux/platform.h, default_idle() cleanup
include/linux/platform.h contained nothing that was actually used except
the default_idle() prototype, and is therefore removed by this patch.

This patch does the following with the platform specific default_idle()
functions on different architectures:
- remove the unused function:
  - parisc
  - sparc64
- make the needlessly global function static:
  - arm
  - h8300
  - m68k
  - m68knommu
  - s390
  - v850
  - x86_64
- add a prototype in asm/system.h:
  - cris
  - i386
  - ia64

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Patrick Mochel <mochel@digitalimplant.org>
Acked-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:21 -08:00
Andrew Morton
a720115678 [PATCH] more-for_each_cpu-conversions fix
I screwed up this conversion - we should be iterating across online CPUs, not
possible ones.

Spotted by Joe Perches <joe@perches.com>

Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:15 -08:00
Linus Torvalds
b408cbc704 [PATCH] PCI: resource address mismatch
On Tue, 21 Feb 2006, Ivan Kokshaysky wrote:
> There are two bogus entries in the BIOS memory map table which are
> conflicting with a prefetchable memory range of the AGP bridge:
>
>  BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
>  BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
>
> 0000:00:02.0 PCI bridge: Silicon Integrated Systems [SiS] Virtual PCI-to-PCI bridge (AGP) (prog-if 00 [Normal decode])
> 	Flags: bus master, fast devsel, latency 0
> 	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
> 	I/O behind bridge: 0000c000-0000cfff
> 	Memory behind bridge: e7e00000-e7efffff
> 	Prefetchable memory behind bridge: fec00000-ffcfffff
> 					   ^^^^^^^^^^^^^^^^^

Yes. However, it's pretty clear that the e820 entries are there for a
reason. Probably they are a hack by the BIOS maintainers to keep Windows
from stomping/moving that region, exactly because they want to keep the
bridge where it is (or, it's actually for the BIOS itself - the BIOS
tables are a horrid mess, and BIOS engineers are pretty hacky people:
they'll add random entries to make their own broken algorithms do the
"right thing").

> Starting from 2.6.13, kernel tries to resolve that sort of conflicts,
> so that prefetch window of the bridge and the framebuffer memory behind
> it get moved to 0x10000000.

I think we could (and probably should) solve this another way: consider
the ACPI "reserved regions" from the e820 map exactly the same way that we
do other ACPI hints - they should restrict _new_ allocations, but not
impact stuff we figure out on our own.

Basically, right now we assign _unassigned_ resources at "fs_initcall"
time. If we were to add in the e820 "reserved region" stuff before that
(but after we've done PCI discovery), we'd probably do the right thing.

Right now we do the e820 reserved regions very early indeed: we call
"register_memory()" from setup_arch(). We could move at least part of it
(the part that registers the resources) down a bit.

Here's a test-patch. I'm not saying we should absolutely do this, but it
might be interesting to try...

Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: <bjk@luxsci.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-03-23 14:35:14 -08:00
Andrew Morton
394e3902c5 [PATCH] more for_each_cpu() conversions
When we stop allocating percpu memory for not-possible CPUs we must not touch
the percpu data for not-possible CPUs at all.  The correct way of doing this
is to test cpu_possible() or to use for_each_cpu().

This patch is a kernel-wide sweep of all instances of NR_CPUS.  I found very
few instances of this bug, if any.  But the patch converts lots of open-coded
test to use the preferred helper macros.

Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Acked-by: Kyle McMartin <kyle@parisc-linux.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Christian Zankel <chris@zankel.net>
Cc: Philippe Elie <phil.el@wanadoo.fr>
Cc: Nathan Scott <nathans@sgi.com>
Cc: Jens Axboe <axboe@suse.de>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:17 -08:00