Take some assignments out of vio_register_device_common and
rename it to vio_register_device.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Formatting changes to vio.c to bring it closer to the
kernel coding standard.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
gcc 3.4 (at least the build we are using) puts the gcc generated .ident
string into a .note section at the end of the files it compiles (gcc
3.3.3-hammer and gcc 4.0.2 Debian puts it in the .text section). This
means that the lparmap.s file we produce in the iSeries build may end with
a .note section. When we include it into head.S, the assembler can no
longer resolve some of the conditional branches since the target label
ends up too far away. This patch just forces us back to the .text section
after including lparmap.s.
The breakage was caused by my patch "iSeries build with newer assemblers
and compilers" (sha1-id: 2ad5649662).
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
A mistake rebasing the series of ppc64 head.S cleanup patches meant
the #include of lparmap.s, needed for iSeries was lost. This patch
puts it back again.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
It has been reported that the way Linux handles NODEFER for signals is
not consistent with the way other Unix boxes handle it. I've written a
program to test the behavior of how this flag affects signals and had
several reports from people who ran this on various Unix boxes,
confirming that Linux seems to be unique on the way this is handled.
The way NODEFER affects signals on other Unix boxes is as follows:
1) If NODEFER is set, other signals in sa_mask are still blocked.
2) If NODEFER is set and the signal is in sa_mask, then the signal is
still blocked. (Note: this is the behavior of all tested but Linux _and_
NetBSD 2.0 *).
The way NODEFER affects signals on Linux:
1) If NODEFER is set, other signals are _not_ blocked regardless of
sa_mask (Even NetBSD doesn't do this).
2) If NODEFER is set and the signal is in sa_mask, then the signal being
handled is not blocked.
The patch converts signal handling in all current Linux architectures to
the way most Unix boxes work.
Unix boxes that were tested: DU4, AIX 5.2, Irix 6.5, NetBSD 2.0, SFU
3.5 on WinXP, AIX 5.3, Mac OSX, and of course Linux 2.6.13-rcX.
* NetBSD was the only other Unix to behave like Linux on point #2. The
main concern was brought up by point #1 which even NetBSD isn't like
Linux. So with this patch, we leave NetBSD as the lonely one that
behaves differently here with #2.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paulus, I think this is now a reasonable candidate for the post-2.6.13
queue.
Relax address restrictions for hugepages on ppc64
Presently, 64-bit applications on ppc64 may only use hugepages in the
address region from 1-1.5T. Furthermore, if hugepages are enabled in
the kernel config, they may only use hugepages and never normal pages
in this area. This patch relaxes this restriction, allowing any
address to be used with hugepages, but with a 1TB granularity. That
is if you map a hugepage anywhere in the region 1TB-2TB, that entire
area will be reserved exclusively for hugepages for the remainder of
the process's lifetime. This works analagously to hugepages in 32-bit
applications, where hugepages can be mapped anywhere, but with 256MB
(mmu segment) granularity.
This patch applies on top of the four level pagetable patch
(http://patchwork.ozlabs.org/linuxppc64/patch?id=1936).
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
You can't call get_property() on a NULL node, so check if of_chosen is set
in check_for_initrd().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
arch/ppc64/kernel/setup.c | 20 ++++++++++++--------
1 files changed, 12 insertions(+), 8 deletions(-)
Signed-off-by: Paul Mackerras <paulus@samba.org>
unflatten_device_tree() doesn't check if lmb_alloc() succeeds or not, it
should. All it can do is panic, but at least there's an error message
(assuming you have some sort of console at that point).
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
arch/ppc64/kernel/prom.c | 9 +++++++--
1 files changed, 7 insertions(+), 2 deletions(-)
Signed-off-by: Paul Mackerras <paulus@samba.org>
When unflatten_dt_node() fails to find an OF_DT_END_NODE tag it prints
"Weird tag at start of node", this should be "Weird tag at end of node".
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
arch/ppc64/kernel/prom.c | 2 +-
1 files changed, 1 insertion(+), 1 deletion(-)
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch moves power4_enable_pmcs() to arch/ppc64/kernel/pmc.c.
I've tested it on P5 LPAR and P4. It does what it used to.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
If both CONFIG_XMON and CONFIG_XMON_DEFAULT is enabled in the .config,
there is no way to disable xmon again. setup_system calls first xmon_init,
later parse_early_param. So a new 'xmon=off' cmdline option will do the right
thing.
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We can now remove CONFIG_MSCHUNKS as it doesn't do anything interesting
anymore.
The only macro in abs_addr.h which is called by non-iSeries code is
phys_to_abs(), so remove the other dummy implementations, and we add a
firmware feature check to phys_to_abs().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
lmb_phys_mem_size() can always return lmb.memory.size, as long as it's called
after lmb_analyze(), which it is. There's no need to recalculate the size on
every call.
lmb_analyze() was calculating a few things we then threw away, so just don't
calculate them to start with.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We no longer need the lmb code to know about abs and phys addresses, so
remove the physbase variable from the lmb_property struct.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
abs_to_phys() is a macro that turns out to do nothing, and also has the
unfortunate property that it's not the inverse of phys_to_abs() on iSeries.
The following is for my benefit as much as everyone else.
With CONFIG_MSCHUNKS enabled, the lmb code is changed such that it keeps
a physbase variable for each lmb region. This is used to take the possibly
discontiguous lmb regions and present them as a contiguous address space
beginning from zero.
In this context each lmb region's base address is its "absolute" base
address, and its physbase is it's "physical" address (from Linux's point of
view). The abs_to_phys() macro does the mapping from "absolute" to "physical".
Note: This is not related to the iSeries mapping of physical to absolute
(ie. Hypervisor) addresses which is maintained with the msChunks structure.
And the msChunks structure is not controlled via CONFIG_MSCHUNKS.
Once upon a time you could compile for non-iSeries with CONFIG_MSCHUNKS
enabled. But these days CONFIG_MSCHUNKS depends on CONFIG_PPC_ISERIES, so
for non-iSeries code abs_to_phys() is a no-op.
On iSeries we always have one lmb region which spans from 0 to
systemcfg->physicalMemorySize (arch/ppc64/kernel/iSeries_setup.c line 383).
This region has a base (ie. absolute) address of 0, and a physbase address
of 0 (as calculated in lmb_analyze() (arch/ppc64/kernel/lmb.c line 144)).
On iSeries, abs_to_phys(aa) is defined as lmb_abs_to_phys(aa), which finds
the lmb region containing aa (and there's only one, ie. 0), and then does:
return lmb.memory.region[0].physbase + (aa - lmb.memory.region[0].base)
physbase == base == 0, so you're left with "return aa".
So remove abs_to_phys(), and lmb_abs_to_phys() which is the implementation
of abs_to_phys() for iSeries.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The lmb code is all written to use a pointer to an lmb struct. But it's always
the same lmb struct, called "lmb". So we take the address of lmb, call it
_lmb and then start using _lmb->foo everywhere, which is silly.
This patch removes the _lmb pointers and replaces them with direct references
to the one "lmb" struct. We do the same for some _mem and _rsv pointers which
point to lmb.memory and lmb.reserved respectively.
This patch looks quite busy, but it's basically just:
s/_lmb->/lmb./g
s/_mem->/lmb.memory./g
s/_rsv->/lmb.reserved./g
s/_rsv/&lmb.reserved/g
s/mem->/lmb.memory./g
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
physRpn_to_absRpn is a no-op on non-iSeries platforms, remove the two
redundant calls.
There's only one caller on iSeries so fold the logic in there so we can get
rid of it completely.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Rename the msChunks struct to get rid of the StUdlY caps and make it a bit
clearer what it's for.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Chunks are 256KB, so use constants for the size/shift/mask, rather than
getting them from the msChunks struct. The iSeries debugger (??) might still
need access to the values in the msChunks struct, so we keep them around
for now, but set them from the constant values.
Replace msChunks_entry typedef with regular u32.
Simplify msChunks_alloc() to manipulate klimit directly, rather than via
a parameter.
Move msChunks_alloc() and msChunks into iSeries_setup.c, as that's where
they're used.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The msChunks code was written to work on pSeries, but now it's only used on
iSeries. This means there's no need to do PTRRELOC anymore, so remove it all.
A few places were getting "extern reloc_offset()" from abs_addr.h, move it
into system.h instead.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Make firmware_has_feature() evaluate at compile time for the non pSeries
case and tidy up code where possible.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Create the firmware_has_feature() inline and move the firmware feature
stuff into its own header file.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The firmware_features field of struct cpu_spec should really be a separate
variable as the firmware features do not depend on the chip and the
bitmask is constructed independently. By removing it, we save 112 bytes
from the cpu_specs array and we access the bitmask directly instead of via
the cur_cpu_spec pointer.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The ppc64 head.S defines several zero-initialized structures, such as
the empty_zero_page and the kernel top-level pagetable. Currently
they are defined to be in the data section. However, they're not used
until after the bss is cleared, so this patch moves them to the bss,
saving two and a half pages from the vmlinux.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch adjust some comments in head.S for accuracy, clarity, and
spelling.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
arch/ppc64/kernel/head.S #defines SECONDARY_PROCESSORS then has some
#ifdefs based on it. Whatever purpose this had is long lost, this
patch removes it.
Likewise, head.S defines H_SET_ASR, which is now defined, along with
other hypervisor call numbers in hvcall.h. This patch deletes it, as
well, from head.S.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
An #if/#else construct near the top of ppc64's head.S appears to
create overlapping sections of code for iSeries and pSeries (i.e. one
thing on iSeries and something different in the same place on
pSeries). In fact, checking the various absolute offsets, it doesn't.
This patch unravels the #ifdefs to make it more obvious what's going
on. This accomplishes another microstep towards a single kernel image
which can boot both iSeries and pSeries.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
As well as the interrupt vectors and initialization code, head.S
contains several asm functions which are used during runtime. This
patch moves these to misc.S, a more sensible location for random asm
support code. A couple The functions moved are:
disable_kernel_fp
giveup_fpu
disable_kernel_altivec
giveup_altivec
__setup_cpu_power3 (empty function)
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
On ppc64 machines with segment tables, CPU0's segment table is at a
fixed address, currently 0x9000. This patch moves it to the free
space at 0x6000, just below the fwnmi data area. This saves 8k of
space in vmlinux and the runtime kernel image.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
In the ppc64 kernel head.S there is currently quite a lot of unused
space between the naca (at fixed address 0x4000) and the fwnmi data
area (at fixed address 0x7000). This patch moves various exception
vectors and support code into this region to use the wasted space.
The functions load_up_fpu and load_up_altivec are moved down as well,
since they are essentially continuations of the fp_unavailable_common
and altivec_unavailable_common vectors, respectively.
Likewise, the fwnmi vectors themselves are moved down into this area,
because while the location of the fwnmi data area is fixed by the RPA,
the vectors themselves can be anywhere sufficiently low.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Comments in head.S suggest that the iSeries naca has a fixed address,
because tools expect to find it there. The only tool which appears to
access the naca is addRamDisk, but both the in-kernel version and the
version used in RHEL and SuSE in fact locate the NACA the same way as
the hypervisor does, by following the pointer in the hvReleaseData
structure.
Since the requirement for a fixed address seems to be obsolete, this
patch removes the naca from head.S and replaces it with a normal C
initializer.
For good measure, it removes an old version of addRamDisk.c which was
sitting, unused, in the ppc32 tree.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch just splits out the pSeries specific parts of vio.c.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch allows us to have a different bus if matching function for
each platform.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since the iSeries vio iommu tables cannot be used until after the vio bus has
been initialised, move the initialisation of the tables to there.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch splits the iSeries specific parts out of vio.c.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
arch/ppc64/Kconfig defines a "General setup" menu, but also sources
init/Kconfig which also defines a "General setup" menu. Both of these
menus appear at the top level of make menuconfig. Having two menus with
the same name is confusing. This patch renames the ppc64/Kconfig menu to
be "Bus Options" and moves options in this menu which are not bus related
to the end of the "Platform support" menu.
There are many variations among architectures on the exact naming of the
"Bus Options" menu. I chose to use the simplest one, which is also used
in arch/ppc/Kconfig.
Signed-off-by: Frank Rowand <frowand@mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
OpenFirmware marks devices as failed in the device-tree when a hardware
problem is detected. The kernel needs to fail config reads/writes to
prevent a kernel crash when incorrect data is read.
This patch validates that the device-node is not marked "fail" when
config space reads/writes are attempted.
Signed-off-by: Jake Moilanen <moilanen@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch updates the format of the flattened device-tree passed
between the boot trampoline and the kernel to support a more compact
representation, for use by embedded systems mostly.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Implement 4-level pagetables for ppc64
This patch implements full four-level page tables for ppc64, thereby
extending the usable user address range to 44 bits (16T).
The patch uses a full page for the tables at the bottom and top level,
and a quarter page for the intermediate levels. It uses full 64-bit
pointers at every level, thus also increasing the addressable range of
physical memory. This patch also tweaks the VSID allocation to allow
matching range for user addresses (this halves the number of available
contexts) and adds some #if and BUILD_BUG sanity checks.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Make the bootheader for ppc64 independent from kernel and libc headers.
* add -nostdinc -isystem $gccincludes to not include libc headers
* declare all functions in header files, also the stuff from string.S
* declare some functions static
* use stddef.h to get size_t (hopefully ok)
* remove ppc32-types.h, only elf.h used the __NN types
With further modifications by Paul Mackerras and Stephen Rothwell.
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When we leave sigsuspend() directly into a signal handler, we don't want
to go via the normal syscall exit path -- it'll corrupt r4 and r5 which
are supposed to be giving information to the signal handler, and it'll
give us one more single-step SIGTRAP than we need if single-stepping is
in operation.
However, we _should_ be calling audit_syscall_exit(), which would
normally get invoked in that patch. It's not wonderfully pretty, but I
suspect the best answer is just to call it directly...
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
This patch puts back the export of machine_power_off() that was removed
by some janitor as it's used for emergency shutdown by the G5 thermal
control driver. Wether that driver should use kernel_power_off() instead
is debatable and a post-2.6.13 decision. In the meantime, please commit
that patch that fixes the driver for now.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This fixes a bug in the PPC64 iommu vmerge code which results in the
potential for iommu_unmap_sg to go off unmapping more than it should.
This was found on a test system which resulted in PCI bus errors due to
PCI memory being unmapped while DMAs were still in progress.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paulus suggested that we put xLparMap in its own .c file so that we can
generate a .s file to be included into head.S. This doesn't get around
the problem of having it at a fixed address, but it makes it more
palatable.
It would be good if this could be included in 2.6.13 as it solves our
build problems with various versions of binutils and gcc. In
particular, it allows us to build an iSeries kernel on Debian unstable
using their biarch compiler.
This has been built and booted on iSeries and built for pSeries and g5.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The workaround for broken device-tree that prevents fan control from
working on recent G5 models need to be "enabled" for machines with
revision 0x37 of the bridge in addition to machines with revision 0x35.
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This adds a bpa_defconfig file and make target. The config settings
are made for the current version of the Cell Processor Based Blade,
so there are not too many drivers enabled. A few more drivers might
get added in the future though.
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In yenta_socket, we default to using the resource setting of the CardBus
bridge. However, this is a PCI-bus-centric view of resources and thus needs
to be converted to generic resources first. Therefore, add a call to
pcibios_bus_to_resource() call in between. This function is a mere wrapper on
x86 and friends, however on some others it already exists, is added in this
patch (alpha, arm, ppc, ppc64) or still needs to be provided (parisc -- where
is its pcibios_resource_to_bus() ?).
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The kexec boot is not successful on some power machines since all CPUs are
getting removed from global interrupt queue (GIQ) before kexec boot. Some
systems always expect at least one CPU in GIQ. Hence, this patch will make
sure that only secondary CPUs are removed from GIQ.
Signed-off-by: Haren Myneni <hbabu@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
CONFIG_KEXEC breaks UP builds because of a misspelled smp_release_cpus().
Also, the function isn't defined unless built with CONFIG_SMP but it is
needed if we are to go from a UP to SMP kernel. Enable it and document it.
Thanks to Steven Winiecki for reporting this and to Milton for remembering
how it's supposed to work and why.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
For soft reset during system hang, got an error "CPU did not take
control" for some CPUs even though they responded to soft-reset (called
SystemReset, die and called debugger - xmon). First these CPUs entered
into xmon by IPI callback and then got a soft-reset exception and
re-entered into xmon again. The first CPU which re-entered into xmon got
the output lock and made into xmon successfully without unlocking.
Hence, the next CPU(s) which re-entered into xmon try to acquire a lock
(get_output_lock). Therefore, we can not view state of those CPU(s).
[This is a simple, very low risk, obvious fix for an obvious bug, and
should go into 2.6.13. -- paulus]
Signed-off-by: Haren Myneni <hbabu@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If CONFIG_NUMA is set, some POWER 4 systems will fail to boot. This is
because of special processing needed to handle invalid node IDs (0xffff) on
POWER 4. My previous patch to handle memory 'holes' within nodes forgot to
add this special case for POWER 4 in one place.
In reality, I'm not sure that configuring the kernel for NUMA on POWER 4 makes
much sense. Are there POWER 4 based systems with NUMA characteristics that
are presented by the firmware? But, distros want one kernel for all systems
so NUMA is on by default in their kernels. The patch handles those cases.
Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The code that sets the altivec capability of the CPU based on firmware
informations can enable altivec when the kernel has CONFIG_ALTIVEC
disabled. This results in "interesting" crashes.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
inotify system call support for PPC64
[ I don't think we need sys32 compatibility versions--and if we do, I
failed in life. ]
Signed-off-by: Robert Love <rml@novell.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
turn many #if $undefined_string into #ifdef $undefined_string to fix some
warnings after -Wno-def was added to global CFLAGS
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Presently the LparMap, one of the structures the kernel shares with the
legacy iSeries hypervisor has a fixed offset address in head.S. This patch
changes this so the LparMap is a normally initialized structure, without
fixed address. This allows us to use macros to compute some of the values
in the structure, which wasn't previously possible because the assembler
always uses signed-% which gets the wrong answers for the computations in
question.
Unfortunately, a gcc bug means that doing this requires another structure
(hvReleaseData) to be initialized in asm instead of C, but on the whole the
result is cleaner than before.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
PPC64 machines before Power4 need a segment table page allocated for each
CPU. Currently these are allocated statically in a big array in head.S for
all CPUs. The segment tables need to be in the first segment (so
do_stab_bolted doesn't take a recursive fault on the stab itself), but
other than that there are no constraints which require the stabs for the
secondary CPUs to be statically allocated.
This patch allocates segment tables dynamically during boot, using
lmb_alloc() to ensure they are within the first 256M segment. This reduces
the kernel image size by 192k...
Tested on RS64 iSeries, POWER3 pSeries, and POWER5.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
update defconfig, use new CONFIG_HZ and set it to 100 just for the kicks.
Signed-off-by: Olaf Hering <olh@suse.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
use new Kconfig.hz on ppc/ppc64, use also Kconfig.preempt for ppc
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
machine_restart, machine_halt and machine_power_off are machine
specific hooks deep into the reboot logic, that modules
have no business messing with. Usually code should be calling
kernel_restart, kernel_halt, kernel_power_off, or
emergency_restart. So don't export machine_restart,
machine_halt, and machine_power_off so we can catch buggy users.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add PVR value and tests for 970MP. Also switch to a simpler (but slightly
longer) check at init time for simplicity.
Signed-off-by: Olof Johansson <olof@austin.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch removes the use of bitfield types from the ppc64 hash table
manipulation code.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Create a new top-level menu named "Networking" thus moving
net related options and protocol selection way from the drivers
menu and up on the top-level where they belong.
To implement this all architectures has to source "net/Kconfig" before
drivers/*/Kconfig in their Kconfig file. This change has been
implemented for all architectures.
Device drivers for ordinary NIC's are still to be found
in the Device Drivers section, but Bluetooth, IrDA and ax25
are located with their corresponding menu entries under the new
networking menu item.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We dont need to use the PERFMON exception on POWER5, in fact the firmware
returns an error. Due to this just remove the warning.
Also now that we have proper runlatch support we can remove the bootup
hack.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Not sure if we really need this, but it was handy to know which iSeries loop I
was testing.
Be consistent about printing which idle loop we're using, with this patch we
cover all cases.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix a compile warning introduced by the previous patches.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- remove some unnecessary includes
- add runlatch support
- no need to use raw_smp_processor_id any more, current preempt debug
logic checks for processes that are bound to one cpu.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- separate out sleep logic in dedicated_idle, it was so far indented
that it got squashed against the right side of the screen.
- add runlatch support, looping on runlatch disable.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- remove min/max yield time, we dont use the values anywhere
- separate shared and dedicated idle loops
- check need_resched again with irqs off to avoid sleeping with pending work
- continually set runlatch off in idle loop, this means we dont need to
turn the runlatch off on exception exit and suffer that associated
cost for all exceptions. (A future patch will turn the runlatch on at
exception entry)
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Now that the idle loop is configured by each platform we don't need
idle_setup() anymore.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch fixes up iSeries, pSeries, pmac and maple to set the correct idle
function for each platform.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
dedicated_idle() and shared_idle() are only used by pSeries, so move them into
pSeries_setup.c
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Move iSeries_idle() into iSeries_setup.c, no one else needs to know about it.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds an idle member to the ppc_md structure and calls it from
cpu_idle(). If a platform leaves ppc_md.idle as null it will get the default
idle loop default_idle().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Separate the NUL character filtering from get_hvc_chars.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove all the vio device driver code from hvc_console.c
This will allow us to separate hvsi, hvc, and allow hvc_console to be used
without the ppc64 vio layer.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Separate the console setup routines of the hvc_console and the vio layer.
Remove the call to find_init_vty from hvc_console.c.
Fail the setup routine if the console doesn't exist, but register the console
again when the specified channel is instantiated. This scheme maintains the
print buffer semantics while eliminating callout and call back for the console
code.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove some unnecessary includes, an out of date comment and a prototype for
sys_timer_create (which is now in syscalls.h)
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Enable the runlatch at the start of each exception. Unfortunately we are out
of space in the 0x300 handler, so I added it a bit later.
The SPR write is fairly expensive, perhaps we should cache the runlatch state
in the paca and avoid the write when possible.
We don't need to turn the runlatch off, we do that in the idle loop. Better
to take the hit in the idle loop than for each exception exit.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Not all ppc64 CPUs have the CTRL SPR, so we need a cputable feature for it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use c99 initialisers in the cputable code.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from <amodra@bigpond.net.au>,
http://sources.redhat.com/bugzilla/show_bug.cgi?id=1042
/usr/bin/ld: arch/ppc64/kernel/vdso32/vdso32.so: The first section in the
PT_DYNAMIC segment is not the .dynamic section
Signed-off-by: Olaf Hering <olh@suse.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This converts the usage of struct of_match to struct of_device_id,
similar to pci_device_id. This allows a device table to be generated,
which can be parsed by depmod(8) to generate a map file for module
loading.
In order for hotplug to work with macio devices, patches to
module-init-tools and hotplug must be applied. Those patches are
available at:
ftp://ftp.suse.com/pub/people/jeffm/linux/macio-hotplug/
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The following renames arch_init, a kprobes function for performing any
architecture specific initialization, to arch_init_kprobes in order to
cleanup the namespace.
Also, this patch adds arch_init_kprobes to sparc64 to fix the sparc64 kprobes
build from the last return probe patch.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The hvlpevent_queue (formally ItLpQueue) has a member called xInUseWord
which is used for serialising access to the queue. Because it's a word
(ie. 32 bit) there's a custom 32-bit version of test_and_set_bit() or
thereabouts in ItLpQueue.c.
The xInUseWord is not shared with they hypervisor, so we can replace it
with a spinlock and remove the custom code.
There is also another locking mechanism (ItLpQueueInProcess). This is
redundant because it's only manipulated while the lock's held. Remove it.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Just formatting cleanups:
* rename some "nextLpEvent" variables to just "event"
* make code fit in 80 columns
* use brackets around if/else
* use a temporary to make hvlpevent_clear_valid clearer
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Just cleanup white space.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The code that prints event counts by type uses a hand-coded number of tabs
to get the alignment right. Instead use a printf alignment which will allow
allow us to use the event_type strings elsewhere in the future.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently there's a per-cpu count of lpevents processed, a per-queue (ie.
global) total count, and a count by event type.
Replace all that with a count by event for each cpu. We only need to add
it up int the proc code.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently we count the number of lpevents processed in 3 seperate places.
One of these counters is never read, so just remove it. This means
hvlpevent_queue_process() no longer needs to return the number of events
processed.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Now that we've renamed the xItLpQueue structure, rename the functions that
operate on it also.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The xItLpQueue is a queue of HvLpEvents that we're given by the Hypervisor.
Rename xItLpQueue to hvlpevent_queue and make the type struct hvlpevent_queue.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The xItLpQueue is declared in LparData.c, move it into ItLpQueue.c.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
External parties don't need to use ItLpQueue_getNextLpEvent() or
ItLpQueue_clearValid(), they're internal to ItLpQueue.c
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Move the code that displays xItLpQueue values in /proc into
ItLpQueue.c.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The xItLpQueue is initalised manually in iSeries_setup_arch(). Move
this code into ItLpQueue.c for a cleaner separation.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Because there's only one ItLpQueue and we know where it is, ie. xItLpQueue,
there's no point passing pointers to it it around all over the place.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch updates the macros that initialise the paca to remove the lpq
parameter. It also rearranges them a bit with the hope of making them a
bit clearer.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The only code outside ItLpQueue.c that refers to spread_lpevents is in
set_apread_lpevents(), so move it inside ItLpQueue.c and make spread_lpevents
static.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
With the previous patch in place, spreading lpevents by default becomes
a one liner.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The iSeries code keeps a pointer to the ItLpQueue in its paca struct. But
all these pointers end up pointing to the one place, ie. xItLpQueue.
So remove the pointer from the paca struct and just refer to xItLpQueue
directly where needed.
The only complication is that the spread_lpevents logic was implemented by
having a NULL lpqueue pointer in the paca on CPUs that weren't supposed to
process events. Instead we just compare the spread_lpevents value to the
processor id to get the same behaviour.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Anyone reporting a stuck IRQ should try these options. Its effectiveness
varies we've found in the Fedora case. Quite a few systems with misdescribed
IRQ routing just work when you use irqpoll. It also fixes up the VIA systems
although thats now fixed with the VIA quirk (which we could just make default
as its what Redmond OS does but Linus didn't like it historically).
A small number of systems have jammed IRQ sources or misdescribes that cause
an IRQ that we have no handler registered anywhere for. In those cases it
doesn't help.
Signed-off-by: Alan Cox <number6@the-village.bc.nu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
initrd size is printed as hex, add a missing 0x
remove a duplicate printf when initrd is used.
remove use of kernel type to access the first bytes of the initrd memarea.
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
remove the printk usage in the zImage. we are not there, yet.
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
On partitioned systems we can wind up creating spurious symlinks in
/sys/devices/system/node/node0 to non-present cpus. The symlinks are
not broken; the problem is that we're potentially misinforming
userspace that there is a relationship between node0 and cpus which
are to be added later. There's no guarantee at all that a cpu which
is added later will belong to node 0.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Convert nvram_create_os_partition to use list_for_each_entry
instead of list_for_each, as this reduces the code size by
two lines.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This is an updated version of Ben's fix-pci-mmap-on-ppc-and-ppc64.patch
which is in 2.6.12-rc4-mm1.
It fixes the patch to work on PPC iSeries, removes some debug printks
at Ben's request, and incorporates your
fix-pci-mmap-on-ppc-and-ppc64-fix.patch also.
Originally from Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch was discussed at length on linux-pci and so far, the last
iteration of it didn't raise any comment. It's effect is a nop on
architecture that don't define the new pci_resource_to_user() callback
anyway. It allows architecture like ppc who put weird things inside of
PCI resource structures to convert to some different value for user
visible ones. It also fixes mmap'ing of IO space on those archs.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The following is a patch provided by Ananth Mavinakayanahalli that implements
the new PPC64 specific parts of the new function return probe design.
NOTE: Since getting Ananth's patch, I changed trampoline_probe_handler()
to consume each of the outstanding return probem instances (feedback
on my original RFC after Ananth cut a patch), and also added the
arch_init() function (adding arch specific initialization.) I have
cross compiled but have not testing this on a PPC64 machine.
Changes include:
* Addition of kretprobe_trampoline to act as a dummy function for instrumented
functions to return to, and for the return probe infrastructure to place
a kprobe on on, gaining control so that the return probe handler
can be called, and so that the instruction pointer can be moved back
to the original return address.
* Addition of arch_init(), allowing a kprobe to be registered on
kretprobe_trampoline
* Addition of trampoline_probe_handler() which is used as the pre_handler
for the kprobe inserted on kretprobe_implementation. This is the function
that handles the details for calling the return probe handler function
and returning control back at the original return address
* Addition of arch_prepare_kretprobe() which is setup as the pre_handler
for a kprobe registered at the beginning of the target function by
kernel/kprobes.c so that a return probe instance can be setup when
a caller enters the target function. (A return probe instance contains
all the needed information for trampoline_probe_handler to do it's job.)
* Hooks added to the exit path of a task so that we can cleanup any left-over
return probe instances (i.e. if a task dies while inside a targeted function
then the return probe instance was reserved at the beginning of the function
but the function never returns so we need to mark the instance as unused.)
Signed-off-by: Rusty Lynch <rusty.lynch@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Now that PPC64 has no-execute support, here is a second try to fix the
single step out of line during kprobe execution. Kprobes on x86_64 already
solved this problem by allocating an executable page and using it as the
scratch area for stepping out of line. Reuse that.
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds a couple of missing symbol exports. flush_dcache_page is
used by the AGP driver and rtc_lock by the RTC driver.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
o Following patch provides purely cosmetic changes and corrects CodingStyle
guide lines related certain issues like below in kexec related files
o braces for one line "if" statements, "for" loops,
o more than 80 column wide lines,
o No space after "while", "for" and "switch" key words
o Changes:
o take-2: Removed the extra tab before "case" key words.
o take-3: Put operator at the end of line and space before "*/"
Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Makes kexec_crashdump() take a pt_regs * as an argument. This allows to
get exact register state at the point of the crash. If we come from direct
panic assertion NULL will be passed and the current registers saved before
crashdump.
This hooks into two places:
die(): check the conditions under which we will panic when calling
do_exit and go there directly with the pt_regs that caused the fatal
fault.
die_nmi(): If we receive an NMI lockup while in the kernel use the
pt_regs and go directly to crash_kexec(). We're probably nested up badly
at this point so this might be the only chance to escape with proper
information.
Signed-off-by: Alexander Nyberg <alexn@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch implements the kexec support for ppc64 platforms.
A couple of notes:
1) We copy the pages in virtual mode, using the full base kernel
and a statically allocated stack. At kexec_prepare time we
scan the pages and if any overlap our (0, _end[]) range we
return -ETXTBSY.
On PowerPC 64 systems running in LPAR (logical partitioning)
mode, only a small region of memory, referred to as the RMO,
can be accessed in real mode. Since Linux runs with only one
zone of memory in the memory allocator, and it can be orders of
magnitude more memory than the RMO, looping until we allocate
pages in the source region is not feasible. Copying in virtual
means we don't have to write a hash table generation and call
hypervisor to insert translations, instead we rely on the pinned
kernel linear mapping. The kernel already has move to linked
location built in, so there is no requirement to load it at 0.
If we want to load something other than a kernel, then a stub
can be written to copy a linear chunk in real mode.
2) The start entry point gets passed parameters from the kernel.
Slaves are started at a fixed address after copying code from
the entry point.
All CPUs get passed their firmware assigned physical id in r3
(most calling conventions use this register for the first
argument).
This is used to distinguish each CPU from all other CPUs.
Since firmware is not around, there is no other way to obtain
this information other than to pass it somewhere.
A single CPU, referred to here as the master and the one executing
the kexec call, branches to start with the address of start in r4.
While this can be calculated, we have to load it through a gpr to
branch to this point so defining the register this is contained
in is free. A stack of unspecified size is available at r1
(also common calling convention).
All remaining running CPUs are sent to start at absolute address
0x60 after copying the first 0x100 bytes from start to address 0.
This convention was chosen because it matches what the kernel
has been doing itself. (only gpr3 is defined).
Note: This is not quite the convention of the kexec bootblock v2
in the kernel. A stub has been written to convert between them,
and we may adjust the kernel in the future to allow this directly
without any stub.
3) Destination pages can be placed anywhere, even where they
would not be accessible in real mode. This will allow us to
place ram disks above the RMO if we choose.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: R Sharada <sharada@in.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add code to clear the hash table and invalidate the tlb for native (SMP,
non-LPAR) mode. Supports 16M and 4k pages.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: R Sharada <sharada@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch consolidates the CONFIG_PREEMPT and CONFIG_PREEMPT_BKL
preemption options into kernel/Kconfig.preempt. This, besides reducing
source-code, also enables more centralized tweaking of preemption related
options.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
(The i386 CPU hotplug patch provides infrastructure for some work which Pavel
is doing as well as for ACPI S3 (suspend-to-RAM) work which Li Shaohua
<shaohua.li@intel.com> is doing)
The following provides i386 architecture support for safely unregistering and
registering processors during runtime, updated for the current -mm tree. In
order to avoid dumping cpu hotplug code into kernel/irq/* i dropped the
cpu_online check in do_IRQ() by modifying fixup_irqs(). The difference being
that on cpu offline, fixup_irqs() is called before we clear the cpu from
cpu_online_map and a long delay in order to ensure that we never have any
queued external interrupts on the APICs. There are additional changes to s390
and ppc64 to account for this change.
1) Add CONFIG_HOTPLUG_CPU
2) disable local APIC timer on dead cpus.
3) Disable preempt around irq balancing to prevent CPUs going down.
4) Print irq stats for all possible cpus.
5) Debugging check for interrupts on offline cpus.
6) Hacky fixup_irqs() to redirect irqs when cpus go off/online.
7) play_dead() for offline cpus to spin inside.
8) Handle offline cpus set in flush_tlb_others().
9) Grab lock earlier in smp_call_function() to prevent CPUs going down.
10) Implement __cpu_disable() and __cpu_die().
11) Enable local interrupts in cpu_enable() after fixup_irqs()
12) Don't fiddle with NMI on dead cpu, but leave intact on other cpus.
13) Program IRQ affinity whilst cpu is still in cpu_online_map on offline.
Signed-off-by: Zwane Mwaikambo <zwane@linuxpower.ca>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Stephen's patch to remove LparData.h missed an include in lparcfg.c This
fixes a few compile warnings.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The seccomp check has to happen when entering the syscall and not when
exiting it or regs->gpr[0] contains garabge during signal handling in
ppc64_rt_sigreturn (this actually might be a bug too, but an orthogonal
one, since we really have to run the check before invoking the syscall and
not after it).
Signed-off-by: Andrea Arcangeli <andrea@cpushare.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch includes ppc64 architecture specific changes to support temporary
disarming on reentrancy of probes.
Signed-of-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The architecture independent code of the current kprobes implementation is
arming and disarming kprobes at registration time. The problem is that the
code is assuming that arming and disarming is a just done by a simple write
of some magic value to an address. This is problematic for ia64 where our
instructions look more like structures, and we can not insert break points
by just doing something like:
*p->addr = BREAKPOINT_INSTRUCTION;
The following patch to 2.6.12-rc4-mm2 adds two new architecture dependent
functions:
* void arch_arm_kprobe(struct kprobe *p)
* void arch_disarm_kprobe(struct kprobe *p)
and then adds the new functions for each of the architectures that already
implement kprobes (spar64/ppc64/i386/x86_64).
I thought arch_[dis]arm_kprobe was the most descriptive of what was really
happening, but each of the architectures already had a disarm_kprobe()
function that was really a "disarm and do some other clean-up items as
needed when you stumble across a recursive kprobe." So... I took the
liberty of changing the code that was calling disarm_kprobe() to call
arch_disarm_kprobe(), and then do the cleanup in the block of code dealing
with the recursive kprobe case.
So far this patch as been tested on i386, x86_64, and ppc64, but still
needs to be tested in sparc64.
Signed-off-by: Rusty Lynch <rusty.lynch@intel.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The attached patch causes the various arch specific install.sh scripts to
look for ${CROSS_COMPILE}installkernel rather than just installkernel (in
both /sbin/ and ~/bin/ where the script already did this). This allows you
to have e.g. arm-linux-installkernel as a handy way to install on your
cross target. It also prevents the script picking up on the host
/sbin/installkernel which causes the script to fall through and do the
install itself (which is what I actually use myself, with $INSTALL_PATH
set).
I don't believe it causes back-compatibility problems since calling the
host installkernel was never likely to work or be what you wanted when
cross compiling anyway. If $CROSS_COMPILE isn't set then nothing changes.
I only use ARM and i386 myself but I figured it couldn't hurt to do the
whole lot. I've cc'd those who I hope are the arch maintainers for files
that I've touched.
Signed-off-by: Ian Campbell <icampbell@arcom.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Provide the architecture specific implementation for SPARSEMEM for PPC64
systems.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Mike Kravetz <kravetz@us.ibm.com> (in part)
Signed-off-by: Martin Bligh <mbligh@aracnet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Provide hooks for PPC64 to allow memory models to be informed of installed
memory areas. This allows SPARSEMEM to instantiate mem_map for the populated
areas.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Martin Bligh <mbligh@aracnet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Provide an implementation of early_pfn_to_nid for PPC64. This is used by
memory models to determine the node from which to take allocations before the
memory allocators are fully initialised.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Martin Bligh <mbligh@aracnet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The part of the sparsemem patch which modifies memmap_init_zone() has recently
become a problem. It changes behavior so that there is a call to
pfn_to_page() for each individual page inside of a node's range:
node_start_pfn through node_end_pfn. It used to simply do this once, at the
beginning of the node, but having sparsemem's non-contiguous mem_map[]s inside
of a node made it necessary to change.
Mike Kravetz recently wrote a patch which made the NUMA code accept some new
kinds of layouts. The system's memory was laid out like this, with node 0's
memory in two pieces: one before and one after node 1's memory:
Node 0: +++++ +++++
Node 1: +++++
Previous behavior before Mike's patch was to assign nodes like this:
Node 0: 00000 XXXXX
Node 1: 11111
Where the 'X' areas were simply thrown away. The new behavior was to make the
pg_data_t span node 0 across all of its areas, including areas that are really
node 1's: Node 0: 000000000000000 Node 1: 11111
This wastes a little bit of mem_map space, but ends up being OK, and more
fully utilizes the system's memory. memmap_init_zone() initializes all of the
"struct page"s for node 0, even for the "hole", but those never get used,
because there is no pfn_to_page() that resolves to those pages. However, only
calling pfn_to_page() once, memmap_init_zone() always uses the pages that were
allocated for node0->node_mem_map because:
struct page *start = pfn_to_page(start_pfn);
// effectively start = &node->node_mem_map[0]
for (page = start; page < (start + size); page++) {
init_page_here();...
page++;
}
Slow, and wasteful, but generally harmless.
But, modify that to call pfn_to_page() for each loop iteration (like sparsemem
does):
for (pfn = start_pfn; pfn < < (start_pfn + size); pfn++++) {
page = pfn_to_page(pfn);
}
And you end up trying to initialize node 1's pages too early, along with bogus
data from node 0. This patch checks for those weird layouts and declines to
touch the pages, making the more frequent pfn_to_page() calls OK to do.
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch changes some of the default behavior in the ppc64 Kconfig file
that was recently changed/added to 2.6.12-rc2-mm1 by Dave Hansen in
preparation for SPARSEMEM. Patch allows the display of both FLAT and
DISCONTIG models on pseries. As before, default is DISCONTIG for SMP and
PSERIES and FLAT for others.
Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This will at least suppress one prompt that users would have received the
first time they compile with the new DISCONTIG arch option. They'll still
get the "Memory Model" prompt, but 99% of them will have the default work
there.
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
For all architectures, this just means that you'll see a "Memory Model"
choice in your architecture menu. For those that implement DISCONTIGMEM,
you may eventually want to make your ARCH_DISCONTIGMEM_ENABLE a "def_bool
y" and make your users select DISCONTIGMEM right out of the new choice
menu. The only disadvantage might be if you have some specific things that
you need in your help option to explain something about DISCONTIGMEM.
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch effectively eliminates direct use of pgdat->node_mem_map outside
of the DISCONTIG code. On a flat memory system, these fields aren't
currently used, neither are they on a sparsemem system.
There was also a node_mem_map(nid) macro on many architectures. Its use
along with the use of ->node_mem_map itself was not consistent. It has
been removed in favor of two new, more explicit, arch-independent macros:
pgdat_page_nr(pgdat, pagenr)
nid_page_nr(nid, pagenr)
I called them "pgdat" and "nid" because we overload the term "node" to mean
"NUMA node", "DISCONTIG node" or "pg_data_t" in very confusing ways. I
believe the newer names are much clearer.
These macros can be overridden in the sparsemem case with a theoretically
slower operation using node_start_pfn and pfn_to_page(), instead. We could
make this the only behavior if people want, but I don't want to change too
much at once. One thing at a time.
This patch removes more code than it adds.
Compile tested on alpha, alpha discontig, arm, arm-discontig, i386, i386
generic, NUMAQ, Summit, ppc64, ppc64 discontig, and x86_64. Full list
here: http://sr71.net/patches/2.6.12/2.6.12-rc1-mhp2/configs/
Boot tested on NUMAQ, x86 SMP and ppc64 power4/5 LPARs.
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Martin J. Bligh <mbligh@aracnet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently reset and powerdown are not implemented on the Maple board,
and attempting to do so will (incorrectly return). This implements
the proper communication with the service processor, allowing correct
reset and powerdown on the Maple board, by communicating with the
service processor. If somehow it's unable to communicate with the
service processor it will loop forever instead.
Note that powerdown on the Maple will power down the CPUs, but not the
fans or other board components due to hardware and firmware
limitations.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Frank Rowand <frowand@mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
For I/O DLPAR to work properly, the kernel needs to allow for dynamic
assignment of the irq field of the pci_dev structure upon dynamic bus
addition. This patch moves the assignment of that field from
pSeries_final_fixup() to pcibios_fixup_bus(), which enables dynamic
assignment for the children of a newly added bus.
Currently, pci_devs receive their irq numbers in one of two ways. The
irq line is either read at boot for all pci_devs, or read by the rpaphp
module at slot enable time. The latter is no longer sufficient for
DLPAR addition of slots that don't qualify as PCI-hotplug capable.
This solution handles the cases of boot and dynamic add.
Signed-off-by: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch corrects the printing of progress indicators to the op
panel on p/iSeries ppc64 systems. Each discrete reference code should
begin with a form feed char to clear the op panel, and the first and
second lines should be separated with a CR/LF sequence. Padding with
spaces is not necessary.
Also, capitalize the hex value printed on the first line, to be
consistent with the values printed by firmware, service processor,
etc.
It turns out that there's an ibm,form-feed property; this patch uses
it in the pSeries-specific progress routine. This patch also checks
the number of rows and the specific width of each row (the second row
on power5 systems can actually hold 80 characters). If the displayed
text is too wide for the physical display, it can be viewed in the ASM
menus, or by selecting option 14 on the op panel.
Signed-off-by: Mike Strosaker <strosake@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Implementation of software load support for the BE iommu. This is very
different from other iommu code on ppc64, since we only do a static mapping.
The mapping is currently hardcoded but should really be read from the
firmware, but they don't set up the device nodes yet. There is a single
512MB DMA window for PCI, USB and ethernet at 0x20000000 for our RAM.
The Cell processor can put the I/O page table either in memory like
the hashed page table (hardware load) or have the operating system
write the entries into memory mapped CPU registers (software load).
I use the software load mechanism because I know that all I/O page
table entries for the amount of installed physical memory fit into
the IO TLB cache. At the point when we get machines with more than
4GB of installed memory, we can either use hardware I/O page table
access like the other platforms do or dynamically update the I/O
TLB entries when a page fault occurs in the I/O subsystem.
The software load can then use the macros that I have implemented
for the static mapping in order to do the TLB cache updates.
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add support for the integrated interrupt controller on BPA
CPUs. There is one of those for each SMT thread.
The mapping of interrupt numbers to HW interrupt sources
is described in arch/ppc64/kernel/bpa_iic.h.
This version hardcodes the 'Spider' chip as the secondary
interrupt controller. That is not really generic for the
architecture, but at the moment it is the only secondary
PIC that exists.
A little more work will be needed on this as soon as
we have boards with multiple external interrupt controllers.
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds the basic support for running on BPA machines.
So far, this is only the IBM workstation, and it will
not run on others without a little more generalization.
It should be possible to configure a kernel for any
combination of CONFIG_PPC_BPA with any of the other
multiplatform targets.
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The firmware provides the location and size of the nvram
in the device tree, so it does not really contain any
hardware specific bits and could be used on other
machines as well.
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The pSeries_progress function is called from some places in the rtas code,
which may also be used by non-pSeries platforms.
Though pSeries is currently the only platform type that implements
display-character, the code is actually generic enough to be part of
the rtas subsystem.
I hit a bug here because the generic rtas code tried calling ppc_md.progress,
which points to an __init function on most platforms.
We could also clear the ppc_md.progress pointer when freeing the init memory
to make it more explicit that ppc_md.progress must not be called after
bootup.
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
BPA is using rtas for PCI but should not be confused by
pSeries code. This also avoids some #ifdefs. Other
platforms that want to use rtas_pci.c could create
their own platform_pci.c with platform specific fixups.
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The rtc rtas functions are not pSeries specific but can
also be used by BPA and other SLOF based platforms
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>