For architecture like ia64, the switch stack structure is fairly large
(currently 528 bytes). For context switch intensive application, we found
that significant amount of cache misses occurs in switch_to() function.
The following patch adds a hook in the schedule() function to prefetch
switch stack structure as soon as 'next' task is determined. This allows
maximum overlap in prefetch cache lines for that structure.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There were three changes necessary in order to allow
sparc64 to use setup-res.c:
1) Sparc64 roots the PCI I/O and MEM address space using
parent resources contained in the PCI controller structure.
I'm actually surprised no other platforms do this, especially
ones like Alpha and PPC{,64}. These resources get linked into the
iomem/ioport tree when PCI controllers are probed.
So the hierarchy looks like this:
iomem --|
PCI controller 1 MEM space --|
device 1
device 2
etc.
PCI controller 2 MEM space --|
...
ioport --|
PCI controller 1 IO space --|
...
PCI controller 2 IO space --|
...
You get the idea. The drivers/pci/setup-res.c code allocates
using plain iomem_space and ioport_space as the root, so that
wouldn't work with the above setup.
So I added a pcibios_select_root() that is used to handle this.
It uses the PCI controller struct's io_space and mem_space on
sparc64, and io{port,mem}_resource on every other platform to
keep current behavior.
2) quirk_io_region() is buggy. It takes in raw BUS view addresses
and tries to use them as a PCI resource.
pci_claim_resource() expects the resource to be fully formed when
it gets called. The sparc64 implementation would do the translation
but that's absolutely wrong, because if the same resource gets
released then re-claimed we'll adjust things twice.
So I fixed up quirk_io_region() to do the proper pcibios_bus_to_resource()
conversion before passing it on to pci_claim_resource().
3) I was mistakedly __init'ing the function methods the PCI controller
drivers provide on sparc64 to implement some parts of these
routines. This was, of course, easy to fix.
So we end up with the following, and that nasty SPARC64 makefile
ifdef in drivers/pci/Makefile is finally zapped.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
New version leaves the original memory map unmodified.
Also saves any granule trimmings for use by the uncached
memory allocator.
Inspired by Khalid Aziz (various traces of his patch still
remain). Fixes to uncached_build_memmap() and sn2 testing
by Martin Hicks.
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch fixes a race condition where in system used to hang or sometime
crash within minutes when kprobes are inserted on ISR routine and a task
routine.
The fix has been stress tested on i386, ia64, pp64 and on x86_64. To
reproduce the problem insert kprobes on schedule() and do_IRQ() functions
and you should see hang or system crash.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch just gathers together all the struct flock definitions except
xtensa into asm-generic/fcntl.h.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch puts the most popular of each fcntl operation/flag into
asm-generic/fcntl.h and cleans up the arch files.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch puts the most popular of each open flag into asm-generic/fcntl.h
and cleans up the arch files.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This set of patches creates asm-generic/fcntl.h and consolidates as much as
possible from the asm-*/fcntl.h files into it.
This patch just gathers all the identical bits of the asm-*/fcntl.h files into
asm-generic/fcntl.h.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
unused and useless..
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
IRQ_PER_CPU is not used by all architectures. This patch introduces the
macros ARCH_HAS_IRQ_PER_CPU and CHECK_IRQ_PER_CPU() to avoid the generation
of dead code in __do_IRQ().
ARCH_HAS_IRQ_PER_CPU is defined by architectures using IRQ_PER_CPU in their
include/asm_ARCH/irq.h file.
Through grepping the tree I found the following architectures currently use
IRQ_PER_CPU:
cris, ia64, ppc, ppc64 and parisc.
Signed-off-by: Karsten Wiese <annabellesgarden@yahoo.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The size of auxiliary vector is fixed at 42 in linux/sched.h. But it isn't
very obvious when looking at linux/elf.h. This patch adds AT_VECTOR_SIZE
so that we can change it if necessary when a new vector is added.
Because of include file ordering problems, doing this necessitated the
extraction of the AT_* symbols into a standalone header file.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When I first wrote the compat layer patches, I was somewhat cavalier about
the definition of compat_uid_t and compat_gid_t (or maybe I just
misunderstood :-)). This patch makes the compat types much more consistent
with the types we are being compatible with and hopefully will fix a few
bugs along the way.
compat type type in compat arch
__compat_[ug]id_t __kernel_[ug]id_t
__compat_[ug]id32_t __kernel_[ug]id32_t
compat_[ug]id_t [ug]id_t
The difference is that compat_uid_t is always 32 bits (for the archs we
care about) but __compat_uid_t may be 16 bits on some.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
ATM pthread_cond_signal is unnecessarily slow, because it wakes one waiter
(which at least on UP usually means an immediate context switch to one of
the waiter threads). This waiter wakes up and after a few instructions it
attempts to acquire the cv internal lock, but that lock is still held by
the thread calling pthread_cond_signal. So it goes to sleep and eventually
the signalling thread is scheduled in, unlocks the internal lock and wakes
the waiter again.
Now, before 2003-09-21 NPTL was using FUTEX_REQUEUE in pthread_cond_signal
to avoid this performance issue, but it was removed when locks were
redesigned to the 3 state scheme (unlocked, locked uncontended, locked
contended).
Following scenario shows why simply using FUTEX_REQUEUE in
pthread_cond_signal together with using lll_mutex_unlock_force in place of
lll_mutex_unlock is not enough and probably why it has been disabled at
that time:
The number is value in cv->__data.__lock.
thr1 thr2 thr3
0 pthread_cond_wait
1 lll_mutex_lock (cv->__data.__lock)
0 lll_mutex_unlock (cv->__data.__lock)
0 lll_futex_wait (&cv->__data.__futex, futexval)
0 pthread_cond_signal
1 lll_mutex_lock (cv->__data.__lock)
1 pthread_cond_signal
2 lll_mutex_lock (cv->__data.__lock)
2 lll_futex_wait (&cv->__data.__lock, 2)
2 lll_futex_requeue (&cv->__data.__futex, 0, 1, &cv->__data.__lock)
# FUTEX_REQUEUE, not FUTEX_CMP_REQUEUE
2 lll_mutex_unlock_force (cv->__data.__lock)
0 cv->__data.__lock = 0
0 lll_futex_wake (&cv->__data.__lock, 1)
1 lll_mutex_lock (cv->__data.__lock)
0 lll_mutex_unlock (cv->__data.__lock)
# Here, lll_mutex_unlock doesn't know there are threads waiting
# on the internal cv's lock
Now, I believe it is possible to use FUTEX_REQUEUE in pthread_cond_signal,
but it will cost us not one, but 2 extra syscalls and, what's worse, one of
these extra syscalls will be done for every single waiting loop in
pthread_cond_*wait.
We would need to use lll_mutex_unlock_force in pthread_cond_signal after
requeue and lll_mutex_cond_lock in pthread_cond_*wait after lll_futex_wait.
Another alternative is to do the unlocking pthread_cond_signal needs to do
(the lock can't be unlocked before lll_futex_wake, as that is racy) in the
kernel.
I have implemented both variants, futex-requeue-glibc.patch is the first
one and futex-wake_op{,-glibc}.patch is the unlocking inside of the kernel.
The kernel interface allows userland to specify how exactly an unlocking
operation should look like (some atomic arithmetic operation with optional
constant argument and comparison of the previous futex value with another
constant).
It has been implemented just for ppc*, x86_64 and i?86, for other
architectures I'm including just a stub header which can be used as a
starting point by maintainers to write support for their arches and ATM
will just return -ENOSYS for FUTEX_WAKE_OP. The requeue patch has been
(lightly) tested just on x86_64, the wake_op patch on ppc64 kernel running
32-bit and 64-bit NPTL and x86_64 kernel running 32-bit and 64-bit NPTL.
With the following benchmark on UP x86-64 I get:
for i in nptl-orig nptl-requeue nptl-wake_op; do echo time elf/ld.so --library-path .:$i /tmp/bench; \
for j in 1 2; do echo ( time elf/ld.so --library-path .:$i /tmp/bench ) 2>&1; done; done
time elf/ld.so --library-path .:nptl-orig /tmp/bench
real 0m0.655s user 0m0.253s sys 0m0.403s
real 0m0.657s user 0m0.269s sys 0m0.388s
time elf/ld.so --library-path .:nptl-requeue /tmp/bench
real 0m0.496s user 0m0.225s sys 0m0.271s
real 0m0.531s user 0m0.242s sys 0m0.288s
time elf/ld.so --library-path .:nptl-wake_op /tmp/bench
real 0m0.380s user 0m0.176s sys 0m0.204s
real 0m0.382s user 0m0.175s sys 0m0.207s
The benchmark is at:
http://sourceware.org/ml/libc-alpha/2005-03/txt00001.txt
Older futex-requeue-glibc.patch version is at:
http://sourceware.org/ml/libc-alpha/2005-03/txt00002.txt
Older futex-wake_op-glibc.patch version is at:
http://sourceware.org/ml/libc-alpha/2005-03/txt00003.txt
Will post a new version (just x86-64 fixes so that the patch
applies against pthread_cond_signal.S) to libc-hacker ml soon.
Attached is the kernel FUTEX_WAKE_OP patch as well as a simple-minded
testcase that will not test the atomicity of the operation, but at least
check if the threads that should have been woken up are woken up and
whether the arithmetic operation in the kernel gave the expected results.
Acked-by: Ingo Molnar <mingo@redhat.com>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Jamie Lokier <jamie@shareable.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When handling writes to /proc/irq, current code is re-programming rte
entries directly. This is not recommended and could potentially cause
chipset's to lockup, or cause missing interrupts.
CONFIG_IRQ_BALANCE does this correctly, where it re-programs only when the
interrupt is pending. The same needs to be done for /proc/irq handling as well.
Otherwise user space irq balancers are really not doing the right thing.
- Changed pending_irq_balance_cpumask to pending_irq_migrate_cpumask for
lack of a generic name.
- added move_irq out of IRQ_BALANCE, and added this same to X86_64
- Added new proc handler for write, so we can do deferred write at irq
handling time.
- Display of /proc/irq/XX/smp_affinity used to display CPU_MASKALL, instead
it now shows only active cpu masks, or exactly what was set.
- Provided a common move_irq implementation, instead of duplicating
when using generic irq framework.
Tested on i386/x86_64 and ia64 with CONFIG_PCI_MSI turned on and off.
Tested UP builds as well.
MSI testing: tbd: I have cards, need to look for a x-over cable, although I
did test an earlier version of this patch. Will test in a couple days.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Zwane Mwaikambo <zwane@holomorphy.com>
Grudgingly-acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Coywolf Qi Hunt <coywolf@lovecn.org>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Resend using accessors instead of volatile qualifiers per hch comments, and
easier to understand convenience macros per rja comments.
Patch to apply volatile semantics when accessing MMR's in various SN files.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The function prototypes for iosapic_enable_intr() and
iosapic_pci_fixup() in include/asm-ia64/iosapic.h are no longer
needed. This patch removes them. The original patch has been posted by
Satoru Takeuchi.
Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The config option 'CONFIG_ACPI_DEALLOCATE_IRQ' is no longer
needed. This patch removes it.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The function prototype for handl_IRQ_event() in include/asm-ia64/irq.h
is no longer needed. This patch removes it.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
When XPC is being shutdown (i.e., rmmod, reboot) it doesn't ensure that
other partitions with whom it was connected have completely disengaged
from any attempt at cross-partition memory references. This can lead to
MCAs in any of these other partitions when the partition is reset.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
1) workaround a h/w reset issue
2) to improve the determination of FPGA-based h/w in
the arch/ia64/sn/kernel/tiocx code.
Signed-off-by: Bruce Losure <blosure@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This is used only in slab.c and each architecture gets to define whcih
underlying type is to be used.
Seems a bit silly - move it to slab.c and use the same type for all
architectures: unsigned int.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- notifying the PROM of specific features that are supported by the OS.
This is used to enable PROM feature if and only if the corresponding
feature is implemented in the OS
- fetch feature sets that are supported by the current PROM. This allows
the OS to selectively enable features when the PROM support is available.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Peter Staubach pointed out that it is not correct to check
current->personality & PER_LINUX32 (this will have false
hits on several other personality values).
Signed-off-by: Tony Luck <tony.luck@intel.com>
Some shub2 changes were not in the tree when Greg cleaned up the sn2
region definitions in 1b66776da7, so this
one didn't get fixed.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Patch to support P-state transitions on ia64. This driver is based on ACPI,
and uses the ACPI processor driver interface to find out the P-state support
information for the processor. This driver plugs into generic cpufreq
infrastructure.
Once this driver is loaded successfully, ondemand/userspace governor can be
used to change the CPU frequency dynamically based on load or on request from
userspace process.
Refer :
ACPI specification -
http://www.acpi.info
P-state related PAL calls -
http://developer.intel.com/design/itanium/downloads/24869909.pdf
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Altix patch to abstract irq_affinity down to the pci provider level since
different SGI hardware implements this in different ways.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Clean up of SGI SN partitioning related code.
The SN_SAL_GET_SN_INFO SAL call returns the partition ID, making
the SN_SAL_SYSCTL_PARTITION_GET SAL call redundant. Remove sn_partid
and use sn_partition_id.
Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add a new exported function for determining the nearest node
with CPUs for I/O nodes and fix a bug where the hwperf dynamic
misc device was being registered before misc_init().
Signed-off-by: Mark Goodwin <markgw@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Bugfix to export PCI topology information in /proc/sgi_sn/sn_topology.
Signed-off-by: Mark Goodwin <markgw@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Currently, region numbers are defined in several files, with several
names. For example, we have REGION_KERNEL in asm/page.h and
RGN_KERNEL in pgtable.h
We also have address definitions that should depend on the
RGN_XXX macros, but are currently just long constants.
The following patch reorganises all the definitions so that they have
the same form (RGN_XXX), are in one place, and that addresses that
depend on RGN_XXX are derived from them.
(This is a necessary but not sufficient patch to allow UML-like
operation on IA64).
Thanks to David Mosberger for catching the change I missed in mmu_context.h.
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Parenthesize "p" to avoid ambiguity. No callers have a problem today;
this is just to clean up the bad form.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
We ran into the limit with the maximum number of waiters at one of our sites.
This patch increases the number of possible waiters from 2^15 to 2^31 by using
a long for the counter in struct rw_semaphore. S390 and alpha already do this.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Kenneth Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Not only was this unused, but its somewhat eccentric declaration
of "static inline const unsigned long" gives gcc4 heartburn.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Shub2 provides a much improved mechanism for issuing internode
TLB purges. Add code to support the newer mechanism. There is also
some debug code (disabled) that is useful for testing.
Collect statistics on the number, type & duration of TLB purges.
This data will be useful for making future improvements in the algorithms.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Update the SN address macros so that they work on both shub1
and shub2. Most of the code to support shub2 was added last year
but this patch fixes a few bugs and adds macros to help generate
both processor-specific physical addresses & numalink physical
addresses. More cleanup & optimization will be done later.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The patch below should fix a race which could cause stale TLB entries.
Specifically, when 2 CPUs ended up racing for entrance to
wrap_mmu_context(). The losing CPU would find that by the time it
acquired ctx.lock, mm->context already had a valid value, but then it
failed to (re-)check the delayed TLB flushing logic and hence could
end up using a context number when there were still stale entries in
its TLB. The fix is to check for delayed TLB flushes only after
mm->context is valid (non-zero). The patch also makes GCC v4.x
happier by defining a non-volatile variant of mm_context_t called
nv_mm_context_t.
Signed-off-by: David Mosberger-Tang <David.Mosberger@acm.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Altix patch to add an SN pci provider for TIOCE, which is SGI's
PCI Express implementation.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Altix patch to add TIO "huge-window" address support to sn_dma_flush().
Update copyright in affected files.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Altix patch to abstract the force_interrupt() mechanism away from the
pcibr provider.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cosmetic altix patch to rename SGI_PCIBR_ERROR to something more generic and
remove a duplicate #define.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
1. Nontemporal store for spin unlock.
A nontemporal store will not update the LRU setting for the cacheline. The
cacheline with the lock may therefore be evicted faster from the cpu
caches. Doing so may be useful since it increases the chance that the
exclusive cache line has been evicted when another cpu is trying to
acquire the lock.
The time between dropping and reacquiring a lock on the same cpu is
typically very small so the danger of the cacheline being
evicted is negligible.
2. Avoid semaphore operation in write_unlock and use nontemporal store
write_lock uses a cmpxchg like the regular spin_lock but write_unlock uses
clear_bit which requires a load and then a loop over a cmpxchg. The
following patch makes write_unlock simply use a nontemporal store to clear
the highest 8 bits. We will then still have the lower 3 bytes (24 bits)
left to count the readers.
Doing the byte store will reduce the number of possible readers from 2^31
to 2^24 = 16 million.
These patches were discussed already:
http://marc.theaimsgroup.com/?t=111472054400001&r=1&w=2http://marc.theaimsgroup.com/?l=linux-ia64&m=111401837707849&w=2
The nontemporal stores will only work using GCC. If a compiler is used
that does not support inline asm then fallback C code is used. This
will preserve the byte store but not be able to do the nontemporal stores.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch removes the following stupid compile error that happens
when CONFIG_HOTPLUG is not defined on ia64.
arch/ia64/kernel/built-in.o(.text+0x712): In function `acpi_unregister_ioapic':
: undefined reference to `iosapic_remove'
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
When the kernel is working well and we want to restart cleanly
kernel_restart is the function to use. But in many instances
the kernel wants to reboot when thing are expected to be working
very badly such as from panic or a software watchdog handler.
This patch adds the function emergency_restart() so that
callers can be clear what semantics they expect when calling
restart. emergency_restart() is expected to be callable
from interrupt context and possibly reliable in even more
trying circumstances.
This is an initial generic implementation for all architectures.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The Altix subarch does not provide node information via ACPI. Instead hooks
are used to fixup pci structures. This patch determines the nodes for Altix
PCI busses.
Remote Bridges:
---------------
Altix supports remote I/O nodes without memory or processors but with bridges.
The TIOCA type of bridge is an AGP bridge and the PROM provides information
about the closest node. That information will be returned by pcibus_to_node.
The TIOCP remote bridge type is a PCI bridge but the PROM does not provide a
closest node id. pcibus_to_node will return -1 for devices on those bridges
meaning that device control structures may be allocated on any node.
Safeguard:
----------
Should the fixups result in invalid node information for a pci controller then
a warning will be printed and pcibus_to_node will return -1.
This patch also fixes the "FIXME" in sn_dma_alloc_coherent. This means that
dma_alloc_coherent will now use alloc_pages_node to allocate memory local to
the node that the PCI device is connected to.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch removes the CONFIG_IA64_SGI_SN_SIM option entirely, allowing
any kernel bootable on sn2 to also be booted in the simulator.
Boot tested on Altix and HP rx2600.
Signed-off-by: Greg Edwards <edwardsg@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
pcibus_to_node provides a way for the Linux kernel to identify to which
node a certain pcibus connects to. Allocations of control structures
for devices can then be made on the node where the pci bus is located
to allow local access during interrupt and other device manipulation.
This patch provides a new "node" field in the the pci_controller
structure. The node field will be set based on ACPI information (thanks
to Alex Williamson <alex.williamson@hp.com for that piece).
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
http://bugme.osdl.org/show_bug.cgi?id=4016
Written-by: David Shaohua Li <shaohua.li@intel.com>
Acked-by: Adam Belay <abelay@novell.com>
Signed-off-by: Len Brown <len.brown@intel.com>
ACPI 3.0 added a Correctable Platform Error Interrupt (CPEI)
Processor Overide flag to MADT.Platform_Interrupt_Source.
Record the processor that was provided as hint from ACPI.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Current assign_irq_vector() will panic if interrupt vectors is running
out. But I think how to handle the case of lack of interrupt vectors
should be handled by the caller of this function. For example, some
PCI devices can raise the interrupt signal via both MSI and I/O
APIC. So even if the driver for these device fails to allocate a
vector for MSI, the driver still has a chance to use I/O APIC based
interrupt. But currently there is no chance for these driver to use
I/O APIC based interrupt because kernel will panic when
assign_irq_vector() fails to allocate interrupt vector.
The following patch changes assign_irq_vector() for ia64 to return
-ENOSPC on error instead of panic (as i386 and x86_64 versions do).
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
changing CONFIG_LOCALVERSION rebuilds too much, for no appearent reason.
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Jesse Barnes provided the original version of this patch months ago, but
other changes kept conflicting with it, so it got deferred. Greg Edwards
dug it out of obscurity just over a week ago, and almost immediately
another conflicting patch appeared (Bob Picco's memory-less nodes).
I've resolved the conflicts and got it running again. CONFIG_SGI_TIOCX
is set to "y" in defconfig, which causes a Tiger to not boot (oops in
tiocx_init). But that can be resolved later ... get this in now before it
gets stale again.
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch fixes an issue with the PROM and a kernel running with
CONFIG_PREEMPT enabled. When CONFIG_PREEMPT is enabled, the size of a
spinlock_t changes -- resulting in the PROM writing to an incorrect location.
Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch is the SGI hotplug driver and additional changes required for
the driver. These modifications include changes to the SN io_init.c code
for memory management, the inclusion of new SAL calls to enable and disable
PCI slots, and a hotplug-style driver.
Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch is a rewrite of the code to check the PROM version. The current
code has some deficiences in the way PROM comparisons were made. The minimum
value of PROM that will boot has also been changed to 4.04.
Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch moves header files out of the arch/ia64/sn directories and into
include/asm-ia64/sn. These files were being included by other subsystems
and should be under include/asm-ia64/sn.
Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch fixes the SN IRQ code such that cpu affinity and
Hotplug can modify IRQ values. The sn_irq_info structures are now locked
using a RCU lock mechanism to avoid lock contention in the lost interrupt
WAR code.
Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
There's another problem shown up by Ingo's recent patch to make
smp_processor_id() complain if it's called with preemption enabled.
local_finish_flush_tlb_mm() calls activate_context() in a situation
where it could be rescheduled to another processor. This patch
disables preemption around the call.
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch fixes an ordering issue between the init code for the
tiocx bus driver and tiocx-related device drivers. Also adds
a new brick to the list of known FPGA bricks.
Signed-off-by: Bruce Losure <blosure@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch is a sparse compile cleanup of tioca_provider.c, sn_hwperf.h, and
tioca_provider.h. Each of these files had sparse warnings when
compiled.
Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patches provides support on Shub2 for the separate TIO IOSPACE MMR. This
patch is SN specific.
Signed-off-by: Colin Ngam <cngam@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch changes some macros that are used when running kernel on the
SGI simulator.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch is a sparse compile cleanup of shub_mmr.h using both the defconfig
and the sn2_defconfig config files.
The issue with this file was the missing usage of __IA64_UL_CONST wrapper.
This wrapper is defined in include/asm-ia64/types.h and wraps a long
constant definition with UL or with nothing depending on its usage in the
kernel. The missing wrapper caused many sparse compile errors like
warning: constant 0x0x0000000010000380 so big it is long
Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Resend 2 with changes per Bjorn Helgaas comments. Changes from original:
+ Change globals to vga_console_iobase/vga_console_membase and make them
unconditional.
+ Address style-related comments.
Patch to extend the PCDP vga setup code to support PCI io/mem translations
for the legacy vga ioport and ram spaces on architectures (e.g. altix) which
need them.
Summary of the changes:
drivers/firmware/pcdp.c
drivers/firmware/pcdp.h
-----------------------
+ add declaration for the spec-defined PCI interface struct (pcdp_if_pci)
as well as support macros.
+ extend setup_vga_console() to know about pcdp_if_pci and add a couple of
globals to hold the io and mem translation offsets if present.
arch/ia64/kernel/setup.c
------------------------
+ tweek early_console_setup() to allow multiple early console setup routines
to be called.
include/asm-ia64/vga.h
----------------------
+ make VGA_MAP_MEM vga_console_membase aware
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
With CONFIG_PCI=n:
In file included from include/linux/pci.h:917,
from lib/iomap.c:6:
include/asm/pci.h:104: warning: `enum pci_dma_burst_strategy' declared inside parameter list
include/asm/pci.h:104: warning: its scope is only this definition or declaration, which is probably not what you want.
include/asm/pci.h: In function `pci_dma_burst_advice':
include/asm/pci.h:106: dereferencing pointer to incomplete type
include/asm/pci.h:106: `PCI_DMA_BURST_INFINITY' undeclared (first use in this function)
include/asm/pci.h:106: (Each undeclared identifier is reported only once
include/asm/pci.h:106: for each function it appears in.)
make[1]: *** [lib/iomap.o] Error 1
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
After seeing, at best, "guesses" as to the following kind
of information in several drivers, I decided that we really
need a way for platforms to specifically give advice in this
area for what works best with their PCI controller implementation.
Basically, this new interface gives DMA bursting advice on
PCI. There are three forms of the advice:
1) Burst as much as possible, it is not necessary to end bursts
on some particular boundary for best performance.
2) Burst on some byte count multiple. A DMA burst to some multiple of
number of bytes may be done, but it is important to end the burst
on an exact multiple for best performance.
The best example of this I am aware of are the PPC64 PCI
controllers, where if you end a burst mid-cacheline then
chip has to refetch the data and the IOMMU translations
which hurts performance a lot.
3) Burst on a single byte count multiple. Bursts shall end
exactly on the next multiple boundary for best performance.
Sparc64 and Alpha's PCI controllers operate this way. They
disconnect any device which tries to burst across a cacheline
boundary.
Actually, newer sparc64 PCI controllers do not have this behavior.
That is why the "pdev" is passed into the interface, so I can
add code later to check which PCI controller the system is using
and give advice accordingly.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is an ia64 implementation of acpi_register_ioapic() and
acpi_unregister_ioapic() interfaces.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Not safe to insert kprobes on IVT code.
This patch checks to see if the address on which Kprobes is being inserted is
in ivt code and if it is in ivt code then refuse to register kprobe.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Acked-by: David Mosberger <davidm@napali.hpl.hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The following patch implements function return probes for ia64 using
the revised design. With this new design we no longer need to do some
of the odd hacks previous required on the last ia64 return probe port
that I sent out for comments.
Note that this new implementation still does not resolve the problem noted
by Keith Owens where backtrace data is lost after a return probe is hit.
Changes include:
* Addition of kretprobe_trampoline to act as a dummy function for instrumented
functions to return to, and for the return probe infrastructure to place
a kprobe on on, gaining control so that the return probe handler
can be called, and so that the instruction pointer can be moved back
to the original return address.
* Addition of arch_init(), allowing a kprobe to be registered on
kretprobe_trampoline
* Addition of trampoline_probe_handler() which is used as the pre_handler
for the kprobe inserted on kretprobe_implementation. This is the function
that handles the details for calling the return probe handler function
and returning control back at the original return address
* Addition of arch_prepare_kretprobe() which is setup as the pre_handler
for a kprobe registered at the beginning of the target function by
kernel/kprobes.c so that a return probe instance can be setup when
a caller enters the target function. (A return probe instance contains
all the needed information for trampoline_probe_handler to do it's job.)
* Hooks added to the exit path of a task so that we can cleanup any left-over
return probe instances (i.e. if a task dies while inside a targeted function
then the return probe instance was reserved at the beginning of the function
but the function never returns so we need to mark the instance as unused.)
Signed-off-by: Rusty Lynch <rusty.lynch@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Now that PPC64 has no-execute support, here is a second try to fix the
single step out of line during kprobe execution. Kprobes on x86_64 already
solved this problem by allocating an executable page and using it as the
scratch area for stepping out of line. Reuse that.
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This updates the CFQ io scheduler to the new time sliced design (cfq
v3). It provides full process fairness, while giving excellent
aggregate system throughput even for many competing processes. It
supports io priorities, either inherited from the cpu nice value or set
directly with the ioprio_get/set syscalls. The latter closely mimic
set/getpriority.
This import is based on my latest from -mm.
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Instead of requiring architecture code to interact with the scheduler's
locking implementation, provide a couple of defines that can be used by the
architecture to request runqueue unlocked context switches, and ask for
interrupts to be enabled over the context switch.
Also replaces the "switch_lock" used by these architectures with an oncpu
flag (note, not a potentially slow bitflag). This eliminates one bus
locked memory operation when context switching, and simplifies the
task_running function.
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Do some basic initial tuning.
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>