arch/ia64/sn and include/asm-ia64/sn changes required to support Tollhouse
system PCI hotplug, fixes the ia64_sn_sysctl_ioboard_get call, and introduces
the PRF_HOTPLUG_SUPPORT feature bit.
Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
dig_irq_init is equivalent to machvec_noop, no need to define
another empty function.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Memory errors encountered by user applications may surface
when the CPU is running in kernel context. The current code
will not attempt recovery if the MCA surfaces in kernel
context (privilage mode 0). This patch adds a check for cases
where the user initiated the load that surfaces in kernel
interrupt code.
An example is a user process lauching a load from memory
and the data in memory had bad ECC. Before the bad data
gets to the CPU register, and interrupt comes in. The
code jumps to the IVT interrupt entry point and begins
execution in kernel context. The process of saving the
user registers (SAVE_REST) causes the bad data to be loaded
into a CPU register, triggering the MCA. The MCA surfaces in
kernel context, even though the load was initiated from
user context.
As suggested by David and Tony, this patch uses an exception
table like approach, puting the tagged recovery addresses in
a searchable table. One difference from the exception table
is that MCAs do not surface in precise places (such as with
a TLB miss), so instead of tagging specific instructions,
address ranges are registers. A single macro is used to do
the tagging, with the input parameter being the label
of the starting address and the macro being the ending
address. This limits clutter in the code.
This patch only tags one spot, the interrupt ivt entry.
Testing showed that spot to be a "heavy hitter" with
MCAs surfacing while saving user registers. Other spots
can be added as needed by adding a single macro.
Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
include/linux/platform.h contained nothing that was actually used except
the default_idle() prototype, and is therefore removed by this patch.
This patch does the following with the platform specific default_idle()
functions on different architectures:
- remove the unused function:
- parisc
- sparc64
- make the needlessly global function static:
- arm
- h8300
- m68k
- m68knommu
- s390
- v850
- x86_64
- add a prototype in asm/system.h:
- cris
- i386
- ia64
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Patrick Mochel <mochel@digitalimplant.org>
Acked-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I'm not sure of the worthiness of this idea, so please consider it an RFC.
Its key merits are:
* Reuse existing infrastructure
* Greatly tightens up the parsing of nomca
* Greatly simplifies the parsing of machvec
Addition cleanup (moving setup_mvec() to machvec.c) by Ken Chen.
Signed-Off-By: Horms <horms@verge.net.au>
Signed-Off-By: Tony Luck <tony.luck@intel.com>
This patch removes all occurances of _INLINE_ in the kernel.
With the exception of tty_flip.h, I've simply removed the inline's since
gcc should know best which functions to be inlined.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
ia64_mv is initialized based on platform detected or specified.
However, there is one instantiation of each platform type. We
don't expect to switch platform vector during run time. Move
those platform specific type into init section since a copy is
made into global ia64_mv at initialization.
Also move instruction patch list into init section as well.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add init declaration to bunch of patch functions and gate
page setup function.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add init declaration to variables/functions used for memory
initialization. I don't think they would clash with memory
hotplug. If they do, please yell.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add init declaration to cpu initialization functions.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Mark init related variable and functions with appropriate
__init* declaration to mca functions.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
According to the ACPI spec, the OSPM must ignore the contents of the
Processor Local APIC/SAPIC Affinity Structure in System Resource
Affinity Table (SRAT), if its enable flag is cleared. However, ia64
linux refers all of the Processor Local APIC/SAPIC Affinity Structures
in SRAT regardless of the enable flag. This is obviously against the
ACPI spec. This patch fixes this bug.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Use the recently-added ia64_get_irr() rather than duplicating the code.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Acked-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
fix is_hugepage_only_range() definition to be "overlaps"
instead of "within architectural restricted hugetlb address
range". Simplify the ia64 specific code that used to use
is_hugepage_only_range() to just check which region the
address is in.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Quite a long time back, prepare_hugepage_range() replaced
is_aligned_hugepage_range() as the callback from mm/mmap.c to arch code to
verify if an address range is suitable for a hugepage mapping.
is_aligned_hugepage_range() stuck around, but only to implement
prepare_hugepage_range() on archs which didn't implement their own.
Most archs (everything except ia64 and powerpc) used the same
implementation of is_aligned_hugepage_range(). On powerpc, which
implements its own prepare_hugepage_range(), the custom version was never
used.
In addition, "is_aligned_hugepage_range()" was a bad name, because it
suggests it returns true iff the given range is a good hugepage range,
whereas in fact it returns 0-or-error (so the sense is reversed).
This patch cleans up by abolishing is_aligned_hugepage_range(). Instead
prepare_hugepage_range() is defined directly. Most archs use the default
version, which simply checks the given region is aligned to the size of a
hugepage. ia64 and powerpc define custom versions. The ia64 one simply
checks that the range is in the correct address space region in addition to
being suitably aligned. The powerpc version (just as previously) checks
for suitable addresses, and if necessary performs low-level MMU frobbing to
set up new areas for use by hugepages.
No libhugetlbfs testsuite regressions on ppc64 (POWER5 LPAR).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
set_page_count usage outside mm/ is limited to setting the refcount to 1.
Remove set_page_count from outside mm/, and replace those users with
init_page_count() and set_page_refcounted().
This allows more debug checking, and tighter control on how code is allowed
to play around with page->_count.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Original 2.6.9 patch and explanation from somewhere within HP via
bugzilla...
ia64 stores a success/failure code in r10, and the return value (normal
return, or *positive* errno) in r8. The patch also sets the exit code to
negative errno if it's a failure result for consistency with other
architectures.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
A pte may be zapped by the swapper, exiting process, unmapping or page
migration while the accessed or dirty bit handers are about to run. In that
case the accessed bit or dirty is set on an zeroed pte which leads the VM to
conclude that this is a swap pte. This may lead to
- Messages from the vm like
swap_free: Bad swap file entry 4000000000000000
- Processes being aborted
swap_dup: Bad swap file entry 4000000000000000
VM: killing process ....
Page migration is particular suitable for the creation of this race since
it needs to remove and restore page table entries.
The fix here is to check for the present bit and simply not update
the pte if the page is not present anymore. If the page is not present
then the fault handler should run next which will take care of the problem
by bringing the page back and then mark the page dirty or move it onto the
active list.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
When there is no bus check, the return code should be failure, not success.
Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
This stuff is all in the generic ia64 kernel, and the new initcall error
reporting complains about them.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Turn on CONFIG_PNPACPI. I recently removed 8250_acpi.c. All devices
previously claimed by 8250_acpi.c should now be claimed by 8250_pnp.c.
This depends on having CONFIG_PNPACPI so ACPI devices show up as PNP
devices.
All other ia64 defconfigs either have CONFIG_PNPACPI already, or
don't have 8250 support turned on at all.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The MCA recovery messages are currently KERN_DEBUG,
so they don't show up in /var/log/messages (by default).
Increase the severity to KERN_ERR, for the initial
message (and also add the physical address to this
message). Leave the successful isolation message as
KERN_DEBUG, but increase the severity when isolation
fails to KERN_CRIT.
[Russ' patch made these all KERN_CRIT]
Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
The kbuild system takes advantage of an incorrect behavior in GNU make.
Once this behavior is fixed, all files in the kernel rebuild every time,
even if nothing has changed. This patch ensures kbuild works with both
the incorrect and correct behaviors of GNU make.
For more details on the incorrect behavior, see:
http://lists.gnu.org/archive/html/bug-make/2006-03/msg00003.html
Changes in this patch:
- Keep all targets that are to be marked .PHONY in a variable, PHONY.
- Add .PHONY: $(PHONY) to mark them properly.
- Remove any $(PHONY) files from the $? list when determining whether
targets are up-to-date or not.
Signed-off-by: Paul Smith <psmith@gnu.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Allow sysadmin to disable all warnings about userland apps
making unaligned accesses by using:
# echo 1 > /proc/sys/kernel/ignore-unaligned-usertrap
Rather than having to use prctl on a process by process basis.
Default behaivour leaves the warnings enabled.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
beautify coding style for zeroing end of fsyscall_table entries.
Remove misleading __NR_syscall_last and add more comments.
Drop (now unneeded) "guard against failure to increase NR_syscalls"
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
arch/ia64/kernel/unaligned.c erroneously marked die_if_kernel()
with a "noreturn" attribute ... which is silly (it returns whenever
the argument regs say that the fault happened in user mode, as one
might expect given the "if_kernel" part of its name!). Thanks to
Alan and Gareth for pointing this out.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Christoph Hellwig <hch@infradead.org> pointed that there are no
in-tree uses of this. So revert 9c65cb9be6
Signed-off-by: Tony Luck <tony.luck@intel.com>
pcibios_setup() should return NULL if it handled a parameter. Since ia64
handles no parameters, it should return the string that was passed in,
not NULL. This brings ia64 into line with all other architectures that
handle no parameters.
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Take advantage of kzalloc() as well as reduce the size of code generated
for the error returns in xpc_setup_infrastructure().
Signed-off-by: Jes Sorensen <jes@sgi.com>
Acked-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
unaligned_access does fetch cr.ipsr, then calls
dispatch_unaligned_handler, but dispatch_unaligned_handler fetches
cr.ipsr again, so delete the first one.
Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Commit 9ec4b1f356 made kprobes not compile
without module support, so just make that clear in the Kconfig file.
Also, since it's marked EXPERIMENTAL, make that dependency explicit too.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Minor updates to earlier patch.
- Added to documentation to add ia64 as well.
- Minor clarification on how to use disabled cpus
- used plain max instead of max_t per Andew Morton.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
It appears that if auditing is enabled, the kernel fails to
check for pending signals before returning to user mode.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Acked-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Trivial port of this feature from i386
As it stands, panic_on_oops but does nothing on ia64
Signed-Off-By: Horms <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The original ia64 udelay() was simple, but flawed for platforms without
synchronized ITCs: a preemption and migration to another CPU during the
while-loop likely resulted in too-early termination or very, very
lengthy looping.
The first fix (now in 2.6.15) broke the delay loop into smaller,
non-preemptible chunks, reenabling preemption between the chunks. This
fix is flawed in that the total udelay is computed to be the sum of just
the non-premptible while-loop pieces, i.e., not counting the time spent
in the interim preemptible periods. If an interrupt or a migration
occurs during one of these interim periods, then that time is invisible
and only serves to lengthen the effective udelay().
This new fix backs out the current flawed fix and returns to a simple
udelay(), fully preemptible and interruptible. It implements two simple
alternative udelay() routines: one a default generic version that uses
ia64_get_itc(), and the other an sn-specific version that uses that
platform's RTC.
Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Fix XPC so that it does not deliver any messages until the connected
callout has returned, as well as, prevent the disconnected callout to
occur before the disconnecting callout has returned.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The __sn_cnodeid_to_nasid array was incorrectly sized at MAX_NUMNODES.
On a large system, this array could overflow. The following patch
corrects this by defining it to MAX_COMPACT_NODES.
Signed-off-by: Dean Roe <roe@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This one falls into the "present for Andrew Morton" category to address
his wishlist for a compiler warning free build ;-)
Signed-off-by: Tony Luck <tony.luck@intel.com>
General SN2 code cleanup:
- Do not initialize global variables to zero
- Use kzalloc instead of kmalloc+memset
- Check kmalloc return values
- Do not obfuscate spin lock calls
- Remove some unused code
- Various formatting cleanups
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Remove symbol exports from ia64_ksyms.c that are already exported in
lib/string.c.
Signed-off-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Have a facility to account for potentially hot-pluggable CPUs. ACPI doesnt
give a determinstic method to find hot-pluggable CPUs. Hence we use 2 methods
to assist.
- BIOS can mark potentially hot-pluggable CPUs as disabled in the MADT tables.
- User can specify the number of hot-pluggable CPUs via parameter
additional_cpus=X
The option is enabled only if ACPI_CONFIG_HOTPLUG_CPU=y which enables the
physical hotplug option. Without which user can still use logical onlining
and offlining of CPUs by enabling CONFIG_HOTPLUG_CPU=y
Adds more bits to cpu_possible_map for potentially hot-pluggable cpus.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Do not set cpu_possible_map for NR_CPUS when ACPI_CONFIG_HOTPLUG_CPU is set.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
MCA driver can cause panic if kernel gets a state info with no minstate.
This patch adds minstate validation before handling it.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Remove an erroneous kfree, and unlink the pcidev_info struct from the
pcidev_info list prior to free'ing the pcidev_info struct.
Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Registers system call for the ia64 architecture.
Reserves space for ppoll and pselect, and adds unshare at system
call number 1296.
Signed-off-by: Janak Desai <janak@us.ibm.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
No platform in the community tree uses PLATFORM_MCA_HANDLERS, remove
the references.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Update the comm field on the MCA handler for user tasks as well as for
verified kernel tasks. This helps to identify the task that was
running when the MCA occurred.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Print a message identifying the monarch MCA handler. Print a summary
of the status of the slave MCA cpus.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Prevent SN2 specific code to be executed on non SN2 platforms when
running a generic kernel.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
There were two problems with enabling the PRINTK_TIME config
option:
1) The first calls to printk() occur before per-cpu data virtual
address is pinned into the TLB, so sched_clock() can fault.
2) sched_clock() is based on ar.itc, which may not be synchronized
across cpus.
Ken Chen started this patch, Tony Luck tinkered with it, and Jes
Sorensen perfected it.
Signed-off-by: Tony Luck <tony.luck@intel.com>
The check of (end != cp) after memparse in efi.c looks wrong to me.
The result is that we can't use mem= and max_addr= kernel parameter at
the same time.
The following patch removed the check just like other arches do.
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Rewrite the SN pio_phys_xxx macros in assembly language. This
avoids issues with the Intel icc compiler. Function call
overhead is not an issue - the functions reference PIOs
and take 100's nsec to complete.
In addition, the functions should likely be in assembly
language anyway - they reference memory using physical
addressing mode. One function executes with psr.ic disabled.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
After converting the cpu physical address to shub2 physical
addressing, the address was run through TO_PHYS() which
clobbered a high node offset bit causing the BTE to fail
on shub2 nodes with large memory. This fix corrects
that problem.
Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
The patch implements cpu topology exportation by sysfs.
Items (attributes) are similar to /proc/cpuinfo.
1) /sys/devices/system/cpu/cpuX/topology/physical_package_id:
represent the physical package id of cpu X;
2) /sys/devices/system/cpu/cpuX/topology/core_id:
represent the cpu core id to cpu X;
3) /sys/devices/system/cpu/cpuX/topology/thread_siblings:
represent the thread siblings to cpu X in the same core;
4) /sys/devices/system/cpu/cpuX/topology/core_siblings:
represent the thread siblings to cpu X in the same physical package;
To implement it in an architecture-neutral way, a new source file,
driver/base/topology.c, is to export the 5 attributes.
If one architecture wants to support this feature, it just needs to
implement 4 defines, typically in file include/asm-XXX/topology.h.
The 4 defines are:
#define topology_physical_package_id(cpu)
#define topology_core_id(cpu)
#define topology_thread_siblings(cpu)
#define topology_core_siblings(cpu)
The type of **_id is int.
The type of siblings is cpumask_t.
To be consistent on all architectures, the 4 attributes should have
deafult values if their values are unavailable. Below is the rule.
1) physical_package_id: If cpu has no physical package id, -1 is the
default value.
2) core_id: If cpu doesn't support multi-core, its core id is 0.
3) thread_siblings: Just include itself, if the cpu doesn't support
HT/multi-thread.
4) core_siblings: Just include itself, if the cpu doesn't support
multi-core and HT/Multi-thread.
So be careful when declaring the 4 defines in include/asm-XXX/topology.h.
If an attribute isn't defined on an architecture, it won't be exported.
Thank Nathan, Greg, Andi, Paul and Venki.
The patch provides defines for i386/x86_64/ia64.
Signed-off-by: Zhang, Yanmin <yanmin.zhang@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
During some testing, we got a warning about trying to allocate
memory while holding a lock. This fixes that problem.
Signed-off-by: Robin Holt <holt@sgi.com>
Acked-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Maintenance patch:
- Add missing __init calls
- Do not zero initialize global variables
- No need to typecast function call returns to void
- Some formatting
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
If SAL_CACHE_FLUSH drops interrupts, complain about it and fall back to
using PAL_CACHE_FLUSH instead.
This is to work around a defect in HP rx5670 firmware: when an interrupt
occurs during SAL_CACHE_FLUSH, SAL drops the interrupt but leaves it marked
"in-service", which leaves the interrupt (and others of equal or lower
priority) masked.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Somehow I doubt this comment is meant to be here anymore... It's
been floating after the L1_CACHE_SHIFT entry since before Linux
moved to bitkeeper.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Temporary patch to make pci_enable_msi() fail gracefully on altix. Will be
removed after 2.6.16 releases and the msi abstraction patches start flowing.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Redirecting interrupts using smp_affinity on altix does not work on kernels
built with CONFIG_PCI_MSI. The problem is that move_irq() turns into a noop
if MSI is built in. This patch calls move_native_irq() instead of move_irq()
to get around that.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
On SN2, MMIO writes which are issued from separate processors are not
guaranteed to arrive in any particular order at the IO hardware. When
performing such writes from the kernel this is not a problem, as a
kernel thread will not migrate to another CPU during execution, and
mmiowb() calls can guarantee write ordering when control of the IO
resource is allowed to move between threads.
However, when MMIO writes can be performed from user space (e.g. DRM)
there are no such guarantees and mechanisms, as the process may
context-switch at any time, and may migrate to a different CPU as part
of the switch. For such programs/hardware to operate correctly, it is
required that the MMIO writes from the old CPU be accepted by the IO
hardware before subsequent writes from the new CPU can be issued.
The following patch implements this behavior on SN2 by waiting for a
Shub register to indicate that these writes have been accepted. This
is placed in the context switch-in path, and only performs the wait
when the newly scheduled task changes CPUs.
Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Brent Casavant <bcasavan@sgi.com>
This patch finishes support for SHUB2 (the new chipset). Most of the
changes are performance related. A few changes are workarounds for
"interesting" chipset features.
Some temporary debugging code has also been deleted.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Various bugfixes and hardware bug workarounds necessary for the rev 1.0 version
of the altix TIO CE asic.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The only user of the MCA/INIT sigdelayed code (SGI's I/O probing) has
moved from the kernel into SAL. Delete the MCA/INIT sigdelayed code.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Actually I think this is more appropriate so we don't end up with 17
cases that add drivers/sn to the build lib.
Include drivers/sn when CONFIG_IA64_SGI_SN2 or CONFIG_IA64_GENERIC
is enabled.
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
arch/ia64/sn/Makefile sets CPPFLAGS, expecting that setting to
propogate to all the subdirectories. For a normal build with its
recursive descent it does work, but doing a selective build like
'make arch/ia64/sn/kernel/io_init.i' does not do a recursive descent,
it goes directly to arch/ia64/sn/kernel/Makefile so the flags do not
get set.
To support selective builds, set the flags in all the subordinate Makefiles.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Remove the GFP_DMA flag from XPC kmalloc() calls.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Acked-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Eliminate a hot shared cacheline that occurs if multiple cpus are
taking unaligned exceptions.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Takashi helped us track down a bad page state bug we thought was coming
from alsa. It turns out we weren't paying attention to the gfp flags
that were passed in to sn_dma_alloc_coherent().
From: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Edwards <edwardsg@sgi.com>
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Jes Sorensen <jes@sgi.com>
sos->os_status is set to a default value of IA64_MCA_COLD_BOOT for an
MCA, but then is incorrectly overwritten with IA64_MCA_SAME_CONTEXT (0).
This makes SAL think that all MCAs have been recovered.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Fix an unnecessary softlockup watchdog warning in the ia64
uncached_build_memmap() that occurs occasionally at 256p and always at
512p. The problem occurs at boot time.
Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Migrate perfmon from using an old semaphore to a completion handler.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Migrate arch/ia64/ia32/sys_ia32 to using a mutex for mmap protection.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Adds the ability to disability packet split at compile time and use the legacy receive path on PCI express hardware. Made this a CONFIG option and modified the Kconfig, to reflect the new option.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Migrate sn2 code to use mutex and completion events rather than
semaphores.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Acked-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Bugfix... the altix SN_SAL_IOIF_SLOT_ENABLE & SN_SAL_IOIF_SLOT_DISABLE
SAL calls need to pass the segment# down
Signed-off-by: Mike Habeck <habeck@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Work-around to temporarily support older PROMs with new flush device code.
This code allows systems running older PROMs to continue to run on the new
kernel base until a new official PROM is released.
Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Acked-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch fixes the bug that pci_claim_resource() is called multiple
times for the same P2P bridge's resource structures if P2P bridges
require their own PCI I/O resources.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
First step to memory hotplug for ia64 (add only,
all new memory is added to node 0, does not use
ZONE_EASY_RECLAIM yet).
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add driver support for a 2 port PCI IOC3-based serial card on Altix boxes:
This is a re-submission. On the original submission I was asked to
organize the code so that the MIPS ioc3 ethernet and serial parts could be
used with this driver. Stanislaw Skowronek was kind enough to provide the
shim layer for this - thanks Stanislaw. This patch includes the shim layer
and the Altix PCI ioc3 serial driver. The MIPS merged ioc3 ethernet and
serial support is forthcoming.
Signed-off-by: Patrick Gefre <pfg@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
tmp_buf_sem sems to be a common name for something completely unused...
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de> ("usb portion")
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
TTY layer buffering revamp broke ia64 in commit
33f0f88f1c
CC arch/ia64/hp/sim/simserial.o
arch/ia64/hp/sim/simserial.c: In function `receive_chars':
arch/ia64/hp/sim/simserial.c:170: error: structure has no member named `flip'
... and so on ...
make[1]: *** [arch/ia64/hp/sim/simserial.o] Error 1
Patch from Andreas Schwab.
Signed-off-by: Tony Luck <tony.luck@intel.com>
When jprobe is hit, the function parameters of the original function
should be saved before jprobe handler is executed, and restored it after
jprobe handler is executed, because jprobe handler might change the
register values due to tail call optimization by the gcc.
Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add hotplug cpu support to salinfo.c.
The cpu_event field is a cpumask so use the cpu_* macros consistently,
replacing the existing mixture of cpu_* and *_bit macros.
Instead of counting the number of outstanding events in a semaphore and
trying to track that count over user space context, interrupt context,
non-maskable interrupt context and cpu hotplug, replace the semaphore
with a test for "any bits set" combined with a mutex.
Modify the locking to make the test for "work to do" an atomic
operation.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
We need to handle debug traps in fsys mode non-fatally. They can
happen now that we have fsyscalls which contain probe instructions.
Signed-off-by: Jason Uhlenkott <jasonuhl@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch separates the sn_flush_device_list struct into kernel and
common (both kernel and PROM accessible) structures. As it was, if the
size of a spinlock_t changed (due to additional CONFIG options, etc.) the
sal call which populated the sn_flush_device_list structs would erroneously
write data (and cause memory corruption and/or a panic).
This patch does the following:
1. Removes sn_flush_device_list and adds sn_flush_device_common and
sn_flush_device_kernel.
2. Adds a new SAL call to populate a sn_flush_device_common struct per
device, not per widget as previously done.
3. Correctly initializes each device's sn_flush_device_kernel spinlock_t
struct (before it was only doing each widget's first device).
Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
I originally thought this was an bug only in the SN code, but I think I
also see a hole in the generic IA64 tlb code. (Separate patch was sent
for the SN problem).
It looks like there is a bug in the TLB flushing code. During context switch,
kernel threads (kswapd, for example) inherit the mm of the task that was
previously running on the cpu. Normally, this is ok because the previous context
is still loaded into the RR registers. However, if the owner of the mm
migrates to another cpu, changes it's context number, and references a
page before kswapd issues a tlb_purge for that same page, the purge will be
done with a stale context number (& RR registers).
Signed-off-by: Tony Luck <tony.luck@intel.com>
Altix (shub2) pushes the BTE clean-up into SAL.
This patch correctly interfaces with the now implemented SAL call.
It also fixes a bug when delaying clean-up to allow busy BTEs to
complete (or error out).
Signed-off-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
On return from INIT handler we must convert the address of the
minstate area from a kernel virtual uncached address (0xC...)
to physical uncached (0x8...). A typo (or thinko?) in the code
converted to physical cached.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cleanup a few items after moving xpc.h from arch/ia64/sn/kernel to
include/asm-ia64/sn.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Move xpc.h from arch/ia64/sn/kernel to include/asm-ia64/sn without change.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Move xpc_system_reboot() to be closer to the file it calls for readability
reasons (which are indeed subjective).
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Allow for the loss of heartbeat while in kdebug to be ignored by remote
partitions.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cleanup the XPC disengage related messages that are printed to the log.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch fixes a problem in XPC disengage processing whereby it was not
seeing the request to disengage from a remote partition, so the disengage
wasn't happening. The disengagement is suppose to transpire during the time
a XPC channel is disconnecting, and should be completed before the channel
is declared to be disconnected.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
When this new syscall was added to ia64 in commit
39743889aa
fsys.S was forgotten. Add a ".data8 0" there to keep
it in step. [Reported by Stephane Eranian]
Signed-off-by: Tony Luck <tony.luck@intel.com>
on ia64 thread_info is at the constant offset from task_struct and stack
is embedded into the same beast. Set __HAVE_THREAD_FUNCTIONS, made
task_thread_info() just add a constant.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
)
From: Ingo Molnar <mingo@elte.hu>
This is the latest version of the scheduler cache-hot-auto-tune patch.
The first problem was that detection time scaled with O(N^2), which is
unacceptable on larger SMP and NUMA systems. To solve this:
- I've added a 'domain distance' function, which is used to cache
measurement results. Each distance is only measured once. This means
that e.g. on NUMA distances of 0, 1 and 2 might be measured, on HT
distances 0 and 1, and on SMP distance 0 is measured. The code walks
the domain tree to determine the distance, so it automatically follows
whatever hierarchy an architecture sets up. This cuts down on the boot
time significantly and removes the O(N^2) limit. The only assumption
is that migration costs can be expressed as a function of domain
distance - this covers the overwhelming majority of existing systems,
and is a good guess even for more assymetric systems.
[ People hacking systems that have assymetries that break this
assumption (e.g. different CPU speeds) should experiment a bit with
the cpu_distance() function. Adding a ->migration_distance factor to
the domain structure would be one possible solution - but lets first
see the problem systems, if they exist at all. Lets not overdesign. ]
Another problem was that only a single cache-size was used for measuring
the cost of migration, and most architectures didnt set that variable
up. Furthermore, a single cache-size does not fit NUMA hierarchies with
L3 caches and does not fit HT setups, where different CPUs will often
have different 'effective cache sizes'. To solve this problem:
- Instead of relying on a single cache-size provided by the platform and
sticking to it, the code now auto-detects the 'effective migration
cost' between two measured CPUs, via iterating through a wide range of
cachesizes. The code searches for the maximum migration cost, which
occurs when the working set of the test-workload falls just below the
'effective cache size'. I.e. real-life optimized search is done for
the maximum migration cost, between two real CPUs.
This, amongst other things, has the positive effect hat if e.g. two
CPUs share a L2/L3 cache, a different (and accurate) migration cost
will be found than between two CPUs on the same system that dont share
any caches.
(The reliable measurement of migration costs is tricky - see the source
for details.)
Furthermore i've added various boot-time options to override/tune
migration behavior.
Firstly, there's a blanket override for autodetection:
migration_cost=1000,2000,3000
will override the depth 0/1/2 values with 1msec/2msec/3msec values.
Secondly, there's a global factor that can be used to increase (or
decrease) the autodetected values:
migration_factor=120
will increase the autodetected values by 20%. This option is useful to
tune things in a workload-dependent way - e.g. if a workload is
cache-insensitive then CPU utilization can be maximized by specifying
migration_factor=0.
I've tested the autodetection code quite extensively on x86, on 3
P3/Xeon/2MB, and the autodetected values look pretty good:
Dual Celeron (128K L2 cache):
---------------------
migration cost matrix (max_cache_size: 131072, cpu: 467 MHz):
---------------------
[00] [01]
[00]: - 1.7(1)
[01]: 1.7(1) -
---------------------
cacheflush times [2]: 0.0 (0) 1.7 (1784008)
---------------------
Here the slow memory subsystem dominates system performance, and even
though caches are small, the migration cost is 1.7 msecs.
Dual HT P4 (512K L2 cache):
---------------------
migration cost matrix (max_cache_size: 524288, cpu: 2379 MHz):
---------------------
[00] [01] [02] [03]
[00]: - 0.4(1) 0.0(0) 0.4(1)
[01]: 0.4(1) - 0.4(1) 0.0(0)
[02]: 0.0(0) 0.4(1) - 0.4(1)
[03]: 0.4(1) 0.0(0) 0.4(1) -
---------------------
cacheflush times [2]: 0.0 (33900) 0.4 (448514)
---------------------
Here it can be seen that there is no migration cost between two HT
siblings (CPU#0/2 and CPU#1/3 are separate physical CPUs). A fast memory
system makes inter-physical-CPU migration pretty cheap: 0.4 msecs.
8-way P3/Xeon [2MB L2 cache]:
---------------------
migration cost matrix (max_cache_size: 2097152, cpu: 700 MHz):
---------------------
[00] [01] [02] [03] [04] [05] [06] [07]
[00]: - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
[01]: 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
[02]: 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
[03]: 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1)
[04]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1)
[05]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1)
[06]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1)
[07]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) -
---------------------
cacheflush times [2]: 0.0 (0) 19.2 (19281756)
---------------------
This one has huge caches and a relatively slow memory subsystem - so the
migration cost is 19 msecs.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Cc: <wilder@us.ibm.com>
Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add per-arch sched_cacheflush() which is a write-back cacheflush used by
the migration-cost calibration code at bootup time.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
arch: Use <linux/capability.h> where capable() is used.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There is a window where a probe gets removed right after the probe is hit
on some different cpu. In this case probe handlers can't find a matching
probe instance related to break address. In this case we need to read the
original instruction at break address to see if that is not a break/int3
instruction and recover safely.
Previous code had a bug where we were not checking for the above race in
case of reentrant probes and the below patch fixes this race.
Tested on IA64, Powerpc, x86_64.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently arch_remove_kprobes() is only implemented/required for x86_64 and
powerpc. All other architecture like IA64, i386 and sparc64 implementes a
dummy function which is being called from arch independent kprobes.c file.
This patch removes the dummy functions and replaces it with
#define arch_remove_kprobe(p, s) do { } while(0)
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Now that all these entries in the arch ioctl32.c files are gone [1], we can
build fs/compat_ioctl.c as a normal object and kill tons of cruft. We need a
special do_ioctl32_pointer handler for s390 so the compat_ptr call is done.
This is not needed but harmless on all other architectures. Also remove some
superflous includes in fs/compat_ioctl.c
Tested on ppc64.
[1] parisc still had it's PPP handler left, which is not fully correct
for ppp and besides that ppp uses the generic SIOCPRIV ioctl so it'd
kick in for all netdevice users. We can introduce a proper handler
in one of the next patch series by adding a compat_ioctl method to
struct net_device but for now let's just kill it - parisc doesn't
compile in mainline anyway and I don't want this to block this
patchset.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@debian.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The comment in compat.c is wrong, every architecture provides a
get_compat_sigevent() for the IPC compat code already.
This basically moves the x86_64 version to common code and removes all the
others.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add a hook so architectures can validate /dev/mem mmap requests.
This is analogous to validation we already perform in the read/write
paths.
The identity mapping scheme used on ia64 requires that each 16MB or
64MB granule be accessed with exactly one attribute (write-back or
uncacheable). This avoids "attribute aliasing", which can cause a
machine check.
Sample problem scenario:
- Machine supports VGA, so it has uncacheable (UC) MMIO at 640K-768K
- efi_memmap_init() discards any write-back (WB) memory in the first granule
- Application (e.g., "hwinfo") mmaps /dev/mem, offset 0
- hwinfo receives UC mapping (the default, since memmap says "no WB here")
- Machine check abort (on chipsets that don't support UC access to WB
memory, e.g., sx1000)
In the scenario above, the only choices are
- Use WB for hwinfo mmap. Can't do this because it causes attribute
aliasing with the UC mapping for the VGA MMIO space.
- Use UC for hwinfo mmap. Can't do this because the chipset may not
support UC for that region.
- Disallow the hwinfo mmap with -EINVAL. That's what this patch does.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove various things which were checking for gcc-1.x and gcc-2.x compilers.
From: Adrian Bunk <bunk@stusta.de>
Some documentation updates and removes some code paths for gcc < 3.2.
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The ptrace_get_task_struct() helper that I added as part of the ptrace
consolidation is useful in variety of places that currently opencode it.
Switch them to the common helpers.
Add a ptrace_traceme() helper that needs to be explicitly called, and simplify
the ptrace_get_task_struct() interface. We don't need the request argument
now, and we return the task_struct directly, using ERR_PTR() for error
returns. It's a bit more code in the callers, but we have two sane routines
that do one thing well now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
sys_migrate_pages implementation using swap based page migration
This is the original API proposed by Ray Bryant in his posts during the first
half of 2005 on linux-mm@kvack.org and linux-kernel@vger.kernel.org.
The intent of sys_migrate is to migrate memory of a process. A process may
have migrated to another node. Memory was allocated optimally for the prior
context. sys_migrate_pages allows to shift the memory to the new node.
sys_migrate_pages is also useful if the processes available memory nodes have
changed through cpuset operations to manually move the processes memory. Paul
Jackson is working on an automated mechanism that will allow an automatic
migration if the cpuset of a process is changed. However, a user may decide
to manually control the migration.
This implementation is put into the policy layer since it uses concepts and
functions that are also needed for mbind and friends. The patch also provides
a do_migrate_pages function that may be useful for cpusets to automatically
move memory. sys_migrate_pages does not modify policies in contrast to Ray's
implementation.
The current code here is based on the swap based page migration capability and
thus is not able to preserve the physical layout relative to it containing
nodeset (which may be a cpuset). When direct page migration becomes available
then the implementation needs to be changed to do a isomorphic move of pages
between different nodesets. The current implementation simply evicts all
pages in source nodeset that are not in the target nodeset.
Patch supports ia64, i386 and x86_64.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This was causing some ordering problems. Remove the up-front evaluation
and just revaluate the compiler version each time we need it.
(The up-front evaluation was problematic because some architectures modify
the value of $(CC)).
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
arch/ia64/kernel/setup.c: In function `show_cpuinfo':
arch/ia64/kernel/setup.c:576: warning: long unsigned int format, different type arg (arg 12)
arch/ia64/kernel/setup.c:576: warning: long unsigned int format, different type arg (arg 13)
Introduced by 95235ca2c2
Signed-off-by: Tony Luck <tony.luck@intel.com>
here is the BSP removal support for IA64. Its pretty much the same thing that
was released a while back, but has your feedback incorporated.
- Removed CONFIG_BSP_REMOVE_WORKAROUND and associated cmdline param
- Fixed compile issue with sn2/zx1 due to a undefined fix_b0_for_bsp
- some formatting nits (whitespace etc)
This has been tested on tiger and long back by alex on hp systems as well.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>