The system call entry code will clear the high bits of argument
registers before invoking the system call; don't report whatever noise
happens to be in the high bits of the register before that happens.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
apply_relocate_add() does not support R_PPC_ADDR16_HI relocations, which
prevents some non gcc-built modules to be loaded.
Signed-off-by: Simon Vallet <svallet@gmail.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
- Drivers will not rely on the PCI config space value, as they've
already been conditioned to rely on the irq field in "struct pci_dev".
- The virq value may not be < 256 as it has been remapped.
- The PCI config space should reflect the hardware configuration, which
is not being changed. We are only creating a virtual irq mapping that
exists in the kernel only. One would never expect the PCI hardware to
generate the "virq" interrupt.
Signed-off-by: Michal Ostrowski <mostrows@watson.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Use is_init() rather than hard coded pid comparison.
Cc: Anton Blanchard <anton@samba.org>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The new implementation of pci_device_to_OF_node() on ppc32 has a bogus
sanity check in it that can cause oopses at boot when no device node is
present, and might hit correct cases with older/weird apple device-trees
where they have the type "vci" for the chaos bridge.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Compiling the kernel with CONFIG_HOTPLUG = y and CONFIG_HOTPLUG_CPU = n
with CONFIG_RELOCATABLE = y generates the following modpost warnings
WARNING: vmlinux - Section mismatch: reference to .init.data: from
.text between '_cpu_up' (at offset 0xc0141b7d) and 'cpu_up'
WARNING: vmlinux - Section mismatch: reference to .init.data: from
.text between '_cpu_up' (at offset 0xc0141b9c) and 'cpu_up'
WARNING: vmlinux - Section mismatch: reference to .init.text:__cpu_up
from .text between '_cpu_up' (at offset 0xc0141bd8) and 'cpu_up'
WARNING: vmlinux - Section mismatch: reference to .init.data: from
.text between '_cpu_up' (at offset 0xc0141c05) and 'cpu_up'
WARNING: vmlinux - Section mismatch: reference to .init.data: from
.text between '_cpu_up' (at offset 0xc0141c26) and 'cpu_up'
WARNING: vmlinux - Section mismatch: reference to .init.data: from
.text between '_cpu_up' (at offset 0xc0141c37) and 'cpu_up'
This is because cpu_up, _cpu_up and __cpu_up (in some architectures) are
defined as __devinit
AND
__cpu_up calls some __cpuinit functions.
Since __cpuinit would map to __init with this kind of a configuration,
we get a .text refering .init.data warning.
This patch solves the problem by converting all of __cpu_up, _cpu_up
and cpu_up from __devinit to __cpuinit. The approach is justified since
the callers of cpu_up are either dependent on CONFIG_HOTPLUG_CPU or
are of __init type.
Thus when CONFIG_HOTPLUG_CPU=y, all these cpu up functions would land up
in .text section, and when CONFIG_HOTPLUG_CPU=n, all these functions would
land up in .init section.
Tested on a i386 SMP machine running linux-2.6.20-rc3-mm1.
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When we switched over to the generic BUG mechanism we forgot to change
the assembly code which open-codes a WARN_ON() in enter_rtas(), so the
bug table got corrupted.
This patch provides an EMIT_BUG_ENTRY macro for use in assembly code,
and uses it in entry_64.S. Tested with CONFIG_DEBUG_BUGVERBOSE on ppc64
but not without -- I tried to turn it off but it wouldn't go away; I
suspect Aunt Tillie probably needed it.
This version gets __FILE__ and __LINE__ right in the assembly version --
rather than saying include/asm-powerpc/bug.h line 21 every time which is
a little suboptimal.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
For 32-bit processes, the getcontext side of the swapcontext system
call (i.e. the saving of the context when the first argument is
non-NULL) has to set the ctx->uc_mcontext.uc_regs pointer to the place
where it saves the registers. Which it does, but it doesn't ensure
that the pointer is 16-byte aligned. 16-byte alignment is needed
because the Altivec/VMX registers are saved in there, and they need to
be on a 16-byte boundary.
This fixes it by ensuring the appropriate alignment of the pointer.
This issue was pointed out by Jakub Jelinek.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Initialize the pci device pci channel state. This is critical
for having the pci_channel_offline() routine (in pci.h) to
function correctly.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
On some oldworld PowerMacs, OF doesn't assign interrupts properly
beyond P2P bridges. Fortunately, the fix is easy as all those machines
just wire all IRQ lines together to one IRQ which is assigned to the
bridge itself. We already have a special function for parsing Apple
OldWorld interrupts which are special, so let's add to it the ability
to walk up the PCI tree to find interrupts.
This fixes irqs on the lower slots of s900 clones among others.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch add scanning of ebc bus to of_platform, which is needed
to recognize devices located on that bus.
Signed-off-by: Christian Krafft <krafft@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Run this:
#!/bin/sh
for f in $(grep -Erl "\([^\)]*\) *k[cmz]alloc" *) ; do
echo "De-casting $f..."
perl -pi -e "s/ ?= ?\([^\)]*\) *(k[cmz]alloc) *\(/ = \1\(/" $f
done
And then go through and reinstate those cases where code is casting pointers
to non-pointers.
And then drop a few hunks which conflicted with outstanding work.
Cc: Russell King <rmk@arm.linux.org.uk>, Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Greg KH <greg@kroah.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Paul Fulghum <paulkf@microgate.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Karsten Keil <kkeil@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Steven French <sfrench@us.ibm.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Jaroslav Kysela <perex@suse.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (36 commits)
[POWERPC] Generic BUG for powerpc
[PPC] Fix compile failure do to introduction of PHY_POLL
[POWERPC] Only export __mtdcr/__mfdcr if CONFIG_PPC_DCR is set
[POWERPC] Remove old dcr.S
[POWERPC] Fix SPU coredump code for max_fdset removal
[POWERPC] Fix irq routing on some 32-bit PowerMacs
[POWERPC] ps3: Add vuart support
[POWERPC] Support ibm,dynamic-reconfiguration-memory nodes
[POWERPC] dont allow pSeries_probe to succeed without initialising MMU
[POWERPC] micro optimise pSeries_probe
[POWERPC] Add SPURR SPR to sysfs
[POWERPC] Add DSCR SPR to sysfs
[POWERPC] Fix 440SPe CPU table entry
[POWERPC] Add support for FP emulation for the e300c2 core
[POWERPC] of_device_register: propagate device_create_file return code
[POWERPC] Fix mmap of PCI resource with hack for X
[POWERPC] iSeries: head_64.o needs to depend on lparmap.s
[POWERPC] cbe_thermal: Fix initialization of sysfs attribute_group
[POWERPC] Remove QE header files from lite5200.c
[POWERPC] of_platform_make_bus_id(): make `magic' int
...
This makes powerpc use the generic BUG machinery. The biggest reports the
function name, since it is redundant with kallsyms, and not needed in general.
There is an overall reduction of code, since module_32/64 duplicated several
functions.
Unfortunately there's no way to tell gcc that BUG won't return, so the BUG
macro includes a goto loop. This will generate a real jmp instruction, which
is never used.
[akpm@osdl.org: build fix]
[paulus@samba.org: remove infinite loop in BUG_ON]
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Andi Kleen <ak@muc.de>
Cc: Hugh Dickens <hugh@veritas.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
On 85xx we don't build in dcr support because the core doesn't implement the
instructions. This caused problems when building an 85xx kernel. Additionally
made it so we only build __mtdcr/__mfdcr if we are CONFIG_PPC_DCR_NATIVE.
The 85xx build issue wasPointed out by Dai Haruki.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
The changes to use pci_read_irq_line() broke interrupt parsing
on some 32-bit powermacs (oops). The reason is a bit obscure.
The code to parse interrupts happens earlier now, during
pcibios_fixup() as the PCI bus is being probed. However, the
current implementation pci_device_to_OF_node() for 32-bit
powerpc relies, on machines like PowerMac which renumber PCI buses,
on a table called pci_OF_bus_map containing a map of bus numbers
between the kernel and the firmware which is setup only later.
Thus, it fails to match the device node. In addition, some of
Apple internal PCI devices lack a proper PCI_INTERRUPT_PIN, thus
preventing the fallback mapping code to work.
This patch fixes it by making pci_device_to_OF_node() 32-bit
implementation use a different algorithm that works without
using the pci_OF_bus_map thing (which I intend to deprecate
anyway). It's a bit slower but that function isn't called in
any hot path hopefully.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
For PAPR partitions with large amounts of memory, the firmware has an
alternative, more compact representation for the information about the
memory in the partition and its NUMA associativity information. This
adds the code to the kernel to parse this alternative representation.
The other part of this patch is telling the firmware that we can
handle the alternative representation. There is however a subtlety
here, because the firmware will invoke a reboot if the memory
representation we request is different from the representation that
firmware is currently using. This is because firmware can't change
the representation on the fly. Further, some firmware versions used
on POWER5+ machines have a bug where this reboot leaves the machine
with an altered value of load-base, which will prevent any kernel
booting until it is reset to the normal value (0x4000). Because of
this bug, we do NOT set fake_elf.rpanote.new_mem_def = 1, and thus we
do not request the new representation on POWER5+ and earlier machines.
We do request the new representation on POWER6, which uses the
ibm,client-architecture-support call.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Now we have a SPURR cpu feature bit, we can export it to userspace in
sysfs.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
POWER6 adds a new SPR, the data stream control register (DSCR). It can
be used to adjust how agressive the prefetch mechanisms are.
Its possible we may want to context switch this, but for now just export
it to userspace via sysfs so we can adjust it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The 440SPe CPU table entry was missing the CPU_FTR_NODSISRALIGN and
really should have been CPU_FTRS_44X.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
The e300c2 has no FPU. Its MSR[FP] is grounded to zero. If an attempt
is made to execute a floating point instruction (including floating-point
load, store, or move instructions), the e300c2 takes a floating-point
unavailable interrupt.
This patch adds support for FP emulation on the e300c2 by declaring a
new CPU_FTR_FP_TAKES_FPUNAVAIL, where FP unavail interrupts are
intercepted and redirected to the ProgramCheck exception path for
correct emulation handling.
(If we run out of CPU_FTR bits we could look to reclaim this bit by adding
support to test the cpu_user_features for PPC_FEATURE_HAS_FPU instead)
It adds a nop to the exception path for 32-bit processors with a FPU.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Removed compiler warning about ignoring the return code of device_create_file
in of_device_register.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
The powerpc version of pci_resource_to_user() and associated hooks
used by /proc/bus/pci and /sys/bus/pci mmap have been broken for some
time on machines that don't have a 1:1 mapping of devices (basically
on non-PowerMacs) and have PCI devices above 32 bits.
This attempts to fix it as well as possible.
The rule is supposed to be that pci_resource_to_user() always converts
the resources back into a BAR values since that's what the /proc
interface was supposed to deal with. However, for X to work on
platforms where PCI MMIO is not mapped 1:1, it became a habit of
platforms like powerpc to pass "fixed up" values there since X expects
to be able to use values from /proc/bus/pci/devices as offsets to mmap
of /dev/mem...
So we keep that contraption here, causing also /sys/*/resource to
expose fully absolute MMIO addresses instead of BAR values, which is
ugly, but should still work as long as those are only used to calculate
alignment within a page.
X is still broken when built 32 bits on machines where PCI MMIO can be
above 32-bit space unfortunately.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This dependency was inadvertantly removed in a previous patch
(e73aedba56).
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
of_platform_make_bus_id(): Kill a compiler warning which is a real
bug on PPC64 by changing `magic' to `int'.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
To test for the existence of an RTAS function, we typically do:
foo_token = rtas_token("foo");
if (foo_token == RTAS_UNKNOWN_SERVICE)
return;
Add a rtas_service_present method, which provides a more conventional
boolean interface for testing the existence of an RTAS method.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
As the first step in consolidating the pseries hotplug cpu code,
create platforms/pseries/hotplug-cpu.c and move rtas_stop_self()
into it. Do the rtas token initialisation in a new initcall, rather
than rtas_initialize().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The elf note saving code is currently duplicated over several
architectures. This cleanup patch simply adds code to a common file and
then replaces the arch-specific code with calls to the newly added code.
The only drawback with this approach is that s390 doesn't fully support
kexec-on-panic which for that arch leads to introduction of unused code.
Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When we are unregistering a kprobe-booster, we can't release its
instruction buffer immediately on the preemptive kernel, because some
processes might be preempted on the buffer. The freeze_processes() and
thaw_processes() functions can clean most of processes up from the buffer.
There are still some non-frozen threads who have the PF_NOFREEZE flag. If
those threads are sleeping (not preempted) at the known place outside the
buffer, we can ensure safety of freeing.
However, the processing of this check routine takes a long time. So, this
patch introduces the garbage collection mechanism of insn_slot. It also
introduces the "dirty" flag to free_insn_slot because of efficiency.
The "clean" instruction slots (dirty flag is cleared) are released
immediately. But the "dirty" slots which are used by boosted kprobes, are
marked as garbages. collect_garbage_slots() will be invoked to release
"dirty" slots if there are more than INSNS_PER_PAGE garbage slots or if
there are no unused slots.
Cc: "Keshavamurthy, Anil S" <anil.s.keshavamurthy@intel.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: "bibo,mao" <bibo.mao@intel.com>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Yumiko Sugita <yumiko.sugita.yf@hitachi.com>
Cc: Satoshi Oshima <soshima@redhat.com>
Cc: Hideo Aoki <haoki@redhat.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Move process freezing functions from include/linux/sched.h to freezer.h, so
that modifications to the freezer or the kernel configuration don't require
recompiling just about everything.
[akpm@osdl.org: fix ueagle driver]
Signed-off-by: Nigel Cunningham <nigel@suspend2.net>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Replace all uses of kmem_cache_t with struct kmem_cache.
The patch was generated using the following script:
#!/bin/sh
#
# Replace one string by another in all the kernel sources.
#
set -e
for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
quilt add $file
sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
mv /tmp/$$ $file
quilt refresh
done
The script was run like this
sh replace kmem_cache_t "struct kmem_cache"
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
SLAB_KERNEL is an alias of GFP_KERNEL.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Change the 'no_control' field in the cpu struct to a more positive
and better term 'hotpluggable'. And change(/cleanup) the logic accordingly.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: "Li, Shaohua" <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Print the addresses of non-absolute symbols relative to _text
so that ld will generate relocations. Allowing a relocatable
kernel to relocate them. We can't actually use the symbol names
because kallsyms includes static symbols that are not exported
from their object files.
Add the _text symbol definitions to the architectures which don't
define it otherwise linker will fail.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cure the damage done by the former PCI debug printks fix for the case
of 64-bit resources (commit 685143ac1f)
which broke it for the plain vanilla 32-bit resources...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds the "logical" PVR value used by POWER6 in "compatible" mode
to the list of PVR values that the kernel tells firmware it is able to
handle.
Signed-off-by: Paul Mackerras <paulus@samba.org>
For PCI devices with only io ports, of_bus_pci_get_flags() will fall
through and still mark the resource as IORESOURCE_MEM.
Signed-off-by: Olof Johansson <olof@lixom.net>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Commit 3ccfc65c50 missed the same fixes for
legacy iSeries specific code, so make some more symbols no longer global.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Change the oprofile_cpu_type in cputables.c to be ppc64/970MP. Oprofile
needs to distinquish the MP from other 970 processors so it can add some
new counters specific to the 970MP.
Signed-off-by: Mike Wolf <mjw@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Devices with no "reg" nor "dcr-reg" property are given a bus_id which
is the node name alone. This means that if more than one such device
with the same names are present in the system, sysfs will have
collisions when creating the symlinks and will fail registering the
devices.
This works around that problem by assigning successive numbers to such
devices.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds code to look at the properties firmware puts in the device
tree to determine what compatibility mode the partition is in on
POWER6 machines, and set the ELF aux vector AT_HWCAP and AT_PLATFORM
entries appropriately.
Specifically, we look at the cpu-version property in the cpu node(s).
If that contains a "logical" PVR value (of the form 0x0f00000x), we
call identify_cpu again with this PVR value. A value of 0x0f000001
indicates the partition is in POWER5+ compatibility mode, and a value
of 0x0f000002 indicates "POWER6 architected" mode, with various
extensions disabled. We also look for various other properties:
ibm,dfp, ibm,purr and ibm,spurr.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add PPU event-based and cycle-based profiling support to Oprofile for Cell.
Oprofile is expected to collect data on all CPUs simultaneously.
However, there is one set of performance counters per node. There are
two hardware threads or virtual CPUs on each node. Hence, OProfile must
multiplex in time the performance counter collection on the two virtual
CPUs.
The multiplexing of the performance counters is done by a virtual
counter routine. Initially, the counters are configured to collect data
on the even CPUs in the system, one CPU per node. In order to capture
the PC for the virtual CPU when the performance counter interrupt occurs
(the specified number of events between samples has occurred), the even
processors are configured to handle the performance counter interrupts
for their node. The virtual counter routine is called via a kernel
timer after the virtual sample time. The routine stops the counters,
saves the current counts, loads the last counts for the other virtual
CPU on the node, sets interrupts to be handled by the other virtual CPU
and restarts the counters, the virtual timer routine is scheduled to run
again. The virtual sample time is kept relatively small to make sure
sampling occurs on both CPUs on the node with a relatively small
granularity. Whenever the counters overflow, the performance counter
interrupt is called to collect the PC for the CPU where data is being
collected.
The oprofile driver relies on a firmware RTAS call to setup the debug bus
to route the desired signals to the performance counter hardware to be
counted. The RTAS call must set the routing registers appropriately in
each of the islands to pass the signals down the debug bus as well as
routing the signals from a particular island onto the bus. There is a
second firmware RTAS call to reset the debug bus to the non pass thru
state when the counters are not in use.
Signed-off-by: Carl Love <carll@us.ibm.com>
Signed-off-by: Maynard Johnson <mpjohn@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The patch below fixes an arithmetic wrap-around issue on 32bit machines
using smp-tbsync. Without this patch a timebase value over
0x000000007fffffff will hang the boot process while bringing up
secondary CPUs.
Signed-off-by: Adrian Cox <adrian@humboldt.co.uk>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Per email discussion, it appears that rtas_stop_self()
and pSeries_mach_cpu_die() should not be compiled if
CONFIG_HOTPLUG_CPU is not defined. This patch adds
#ifdefs around these bits of code.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Rewrite local_get_flags and local_irq_disable to use r13 explicitly,
to avoid the risk that gcc will split get_paca()->soft_enabled into a
sequence unsafe against preemption. Similar care in local_irq_restore.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
powerpc: Merge 32 and 64 bits asm-powerpc/io.h
The rework on io.h done for the new hookable accessors made it easier,
so I just finished the work and merged 32 and 64 bits io.h for arch/powerpc.
arch/ppc still uses the old version in asm-ppc, there is just too much gunk
in there that I really can't be bothered trying to cleanup.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch adds full cell iommu support (and iommu disabled mode).
It implements mapping/unmapping of iommu pages on demand using the
standard powerpc iommu framework. It also supports running with
iommu disabled for machines with less than 2GB of memory. (The
default is off in that case, though it can be forced on with the
kernel command line option iommu=force).
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch makes dma_alloc_coherent() use node local allocation when
using the direct DMA ops. The node is obtained from the new device
extension. If no such extension is present, the current node is used.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch adds an optional global offset that can be added to DMA addresses
when using the direct DMA operations.
That brings it a step closer to the 32 bits direct DMA operations, and makes
it useable on Cell when the MMU is disabled and we are using a spider
southbridge.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch reworks the way iSeries hooks on PCI IO operations (both MMIO
and PIO) and provides a generic way for other platforms to do so (we
have need to do that for various other platforms).
While reworking the IO ops, I ended up doing some spring cleaning in
io.h and eeh.h which I might want to split into 2 or 3 patches (among
others, eeh.h had a lot of useless stuff in it).
A side effect is that EEH for PIO should work now (it used to pass IO
ports down to the eeh address check functions which is bogus).
Also, new are MMIO "repeat" ops, which other archs like ARM already had,
and that we have too now: readsb, readsw, readsl, writesb, writesw,
writesl.
In the long run, I might also make EEH use the hooks instead
of wrapping at the toplevel, which would make things even cleaner and
relegate EEH completely in platforms/iseries, but we have to measure the
performance impact there (though it's really only on MMIO reads)
Since I also need to hook on ioremap, I shuffled the functions a bit
there. I introduced ioremap_flags() to use by drivers who want to pass
explicit flags to ioremap (and it can be hooked). The old __ioremap() is
still there as a low level and cannot be hooked, thus drivers who use it
should migrate unless they know they want the low level version.
The patch "arch provides generic iomap missing accessors" (should be
number 4 in this series) is a pre-requisite to provide full iomap
API support with this patch.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When enabled in Kconfig, it will pick up any of_platform_device
matching it's match list (currently type "pci", "pcix", "pcie",
or "ht" and setup a PHB for it.
Platform must provide a ppc_md.pci_setup_phb() for it to work
(for doing the necessary initialisations specific to a given PHB
like setting up the config space ops).
It's currently only available on 64 bits as the 32 bits PCI code
can't quite cope with it in it's current form. I will fix that
later.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add a "parent" struct device to our PCI host bridge data structure so that
PCI can be rooted off another device in sysfs.
Note that arch/ppc doesn't use it, only arch/powerpc, though it's available
for both 32 and 64 bits.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The BUID is the first entry of a PCI host bridge "reg" property.
Now that PCI busses can be anywhere in the device-tree, we need to
fully translate the value there to a CPU physical address before
we can use it with RTAS.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When parsing the OF "ranges" properties of PCI host busses to determine
the mapping of a PCI bus, we need to translate the "parent" address using
the prom_parse.c routines in order to obtain a CPU physical address.
This wasn't necessary while PCI busses were always at the root of the
device-tree but this is no longer the case on Cell where they can be
anywhere in the tree.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch completely refactors DMA operations for 64 bits powerpc. 32 bits
is untouched for now.
We use the new dev_archdata structure to add the dma operations pointer
and associated data to struct device. While at it, we also add the OF node
pointer and numa node. In the future, we might want to look into merging
that with pci_dn as well.
The old vio, pci-iommu and pci-direct DMA ops are gone. They are now replaced
by a set of generic iommu and direct DMA ops (non PCI specific) that can be
used by bus types. The toplevel implementation is now inline.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch first splits of_device.c and of_platform.c, the later containing
the bits relative to of_platform_device's. On the "breaks" side of things,
drivers uisng of_platform_device(s) need to include asm/of_platform.h now
and of_(un)register_driver is now of_(un)register_platform_driver.
In addition to a few utility functions to locate of_platform_device(s),
the main new addition is of_platform_bus_probe() which allows the platform
code to trigger an automatic creation of of_platform_devices for a whole
tree of devices.
The function acts based on the type of the various "parent" devices encountered
from a provided root, using either a default known list of bus types that can be
"probed" or a passed-in list. It will only register devices on busses matching
that list, which mean that typically, it will not register PCI devices, as
expected (since they will be picked up by the PCI layer).
This will be used by Cell platforms using 4xx-type IOs in the Axon bridge
and can be used by any embedded-type device as well.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch adds new dcr_map/dcr_read/dcr_write accessors for DCRs that
can be used by drivers to transparently address either native DCRs or
memory mapped DCRs. The implementation for memory mapped DCRs is done
after the binding being currently worked on for SLOF and the Axon
chipset. This patch enables it for the cell native platform
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
These were inherited from ARCH=ppc, but are not needed since parsing of interrupts
should be done via the of_* functions (who can do swizzling). If we ever need to
do non-standard swizzling on bridges without a device-node, then we might add
back a slightly different version of ppc_md.pci_swizzle but for now, that is not
the case.
I removed the couple of calls for these in 83xx. If that breaks something, then
there is a problem with the device-tree on these.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch reworks the way IRQs are fixed up on PCI for arch powerpc.
It makes pci_read_irq_line() called by default in the PCI code for
devices that are probed, and add an optional per-device fixup in
ppc_md for platforms that really need to correct what they obtain
from pci_read_irq_line().
It also removes ppc_md.irq_bus_setup which was only used by pSeries
and should not be needed anymore.
I've also removed the pSeries s7a workaround as it can't work with
the current interrupt code anyway. I'm trying to get one of these
machines working so I can test a proper fix for that problem.
I also haven't updated the old-style fixup code from 85xx_cds.c
because it's actually buggy :) It assigns pci_dev->irq hard coded
numbers which is no good with the new IRQ mapping code. It should
at least use irq_create_mapping(NULL, hard_coded_number); and possibly
also set_irq_type() to set them as level low.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add a check for a null ppc_md.init_early to allow platforms that
don't require an init_early routine to just set this member to null.
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Make nvram_64.o dependent on 64bit, not on MULTIPLATFORM.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
dev_t boot_dev is declared in arch/powerpc/kernel/setup_32.c
and in arch/powerpc/kernel/setup_64.c but not used in these files.
It is only used in arch/powerpc/platforms/powermac/setup.c, so make
it static in this file.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since iSeries is merged to MULTIPLATFORM, there is no way to build a 64bit
kernel without MULTIPLATFORM, so PPC_MULTIPLATFORM can be removed in
64bit-only files.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since iSeries is merged to MULTIPLATFORM, there is no way to build a 64bit
kernel without MULTIPLATFORM, so PPC_MULTIPLATFORM can be removed in
64bit-only files.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The enablement of 64k pages on pseries platforms exposed a bug in
the RTAS mechanism for updating firmware. RTAS assumes 4k for flash
block and list sizes, and use of any other sizes results in a failure,
even though PAPR does not specify any such requirement.
This patch changes the rtas_flash module to force the use of 4k memory
block and list sizes when preparing and sending a firmware image to
RTAS. The rtas_flash function now uses a slab cache of 4k blocks with
4k alignment, rather than get_zeroed_page(), to allocate the memory for
the flash blocks and lists. The 4k alignment requirement is specified
in PAPR.
Signed-off-by: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
It turns out that the linker warnings on 64-bit powerpc about "section
blah exceeds stub group size" were being triggered by conditional
branches in head_64.S branching to global symbols, whether in
head_64.S or in other files. This eliminates the warnings by making
some global symbols in head_64.S no longer global, and by rearranging
some branches.
Signed-off-by: Paul Mackerras <paulus@samba.org>
[ Yee-haa. Maybe I'll notice newly introduced real warnings now - Linus ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The alignment exception used to only check the exception table for
-EFAULT, not for other errors. That opens an oops window if we can
coerce the kernel into getting an alignment exception for other reasons
in what would normally be a user-protected accessor, which can be done
via some of the futex ops. This fixes it by always checking the
exception tables.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The 10Gigabit ethernet device drivers appear to be able to chew
up all 256MB of TCE mappings on pSeries systems, as evidenced by
numerous error messages:
iommu_alloc failed, tbl c0000000010d5c48 vaddr c0000000d875eff0 npages 1
Some experimentation indicates that this is essentially because
one 1500 byte ethernet MTU gets mapped as a 64K DMA region when
the large 64K pages are enabled. Thus, it doesn't take much to
exhaust all of the available DMA mappings for a high-speed card.
This patch changes the iommu allocator to work with its own
unique, distinct page size. Although the patch is long, its
actually quite simple: it just #defines a distinct IOMMU_PAGE_SIZE
and then uses this in all the places that matter.
As a side effect, it also dramatically improves network performance
on platforms with H-calls on iommu translation inserts/removes (since
we no longer call it 16 times for a 1500 bytes packet when the iommu HW
is still 4k).
In the future, we might want to make the IOMMU_PAGE_SIZE a variable
in the iommu_table instance, thus allowing support for different HW
page sizes in the iommu itself.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Fixed a compile error in building the 85xx support with oprofile, and in
the process cleaned up some issues with the fsl_booke performance monitor
code.
* Reorganized FSL Book-E performance monitoring code so that the 7450
wouldn't be built if the e500 was, and cleaned it up so it was more
self-contained.
* Added a cpu_setup function for FSL Book-E. The original
cpu_setup function prototype had no arguments, assuming that
the reg_setup function would copy the required information into
variables which represented the registers. This was silly for
e500, since it has 1 register per counter (rather than 3 for
all counters), so the code has been restructured to have
cpu_setup take the current counter config array as an argument,
with op_powerpc_setup() invoking op_powerpc_cpu_setup() through
on_each_cpu(), and op_powerpc_cpu_setup() invoking the
model-specific cpu_setup function with an argument. The
argument is ignored on all other platforms at present.
* Fixed a confusing line where a trinary operator only had two
arguments
Signed-off-by: Andrew Fleming <afleming@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch fixes a few issues in offb:
- A test was inverted causing the palette hack to never work
(no device node was passed down to the init function)
- Some cards seem to have their assigned-addresses property in a random
order, thus we need to try using of_get_pci_address() first, which will
fail if it's not a PCI device, and fallback to of_get_address() in that
case. of_get_pci_address() properly parsees assigned-addresses to test
the BAR number and thus will get it right whatever the order is.
- Some cards (like GXT4500) provide a linebytes of 0xffffffff in the
device-tree which does no good. This patch handles that by using the
screen width when that happens. (Also fixes btext.c while at it).
- Add detection of the GXT4500 in addition to the GXT2000 for the
palette hacks (we use the same hack, palette is linear in register space
at offset 0x6000).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add a vmlinux.lds.h helper macro for defining the eight-level initcall table,
teach all the architectures to use it.
This is a prerequisite for a patch which performs initcall synchronisation for
multithreaded-probing.
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
[ Added AVR32 as well ]
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds a cpufreq backend driver to enable frequency scaling on cell.
Signed-off-by: Christian Krafft <krafft@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This moves the cell idle function to use the default cpu_idle
with a special power_save callback, like all other platforms
except iSeries already do.
It also makes it possible to disable this power_save function
with a new powerpc-specific boot option "powersave=off".
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch adds two functions to create and remove sysfs attributes and
attribute_group to all cpus. That allows to register sysfs attributes in
a subdirectory like: /sys/devices/system/cpu/cpuX/group_name/what_ever
This will be used by cbe_thermal to group all attributes dealing with
thermal support in one directory.
Signed-of-by: Christian Krafft <krafft@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
It makes for a friendlier API if irq_dispose_mapping(NO_IRQ) is a
nop, rather than triggering a WARN_ON.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Fix a const'ification related warning with device_is_compatible()
and friends related to get_property() not properly having const
on it's input device node argument.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The Cell CPU timebase has an erratum. When reading the entire 64 bits
of the timebase with one mftb instruction, there is a handful of cycles
window during which one might read a value with the low order 32 bits
already reset to 0x00000000 but the high order bits not yet incremeted
by one. This fixes it by reading the timebase again until the low order
32 bits is no longer 0. That might introduce occasional latencies if
hitting mftb just at the wrong time, but no more than 70ns on a cell
blade, and that was considered acceptable.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch adds support for feature fixups in modules. This involves
adding support for R_PPC64_REL64 relocs to the 64 bits module loader.
It also modifies modpost.c to ignore the powerpc fixup sections (or it
would warn when used in .init.text).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch reworks the feature fixup mecanism so vdso's can be fixed up.
The main issue was that the construct:
.long label (or .llong on 64 bits)
will not work in the case of a shared library like the vdso. It will
generate an empty placeholder in the fixup table along with a reloc,
which is not something we can deal with in the vdso.
The idea here (thanks Alan Modra !) is to instead use something like:
1:
.long label - 1b
That is, the feature fixup tables no longer contain addresses of bits of
code to patch, but offsets of such code from the fixup table entry
itself. That is properly resolved by ld when building the .so's. I've
modified the fixup mecanism generically to use that method for the rest
of the kernel as well.
Another trick is that the 32 bits vDSO included in the 64 bits kernel
need to have a table in the 64 bits format. However, gas does not
support 32 bits code with a statement of the form:
.llong label - 1b (Or even just .llong label)
That is, it cannot emit the right fixup/relocation for the linker to use
to assign a 32 bits address to an .llong field. Thus, in the specific
case of the 32 bits vdso built as part of the 64 bits kernel, we are
using a modified macro that generates:
.long 0xffffffff
.llong label - 1b
Note that is assumes that the value is negative which is enforced by
the .lds (those offsets are always negative as the .text is always
before the fixup table and gas doesn't support emiting the reloc the
other way around).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
There are currently two versions of the functions for applying the
feature fixups, one for CPU features and one for firmware features. In
addition, they are both in assembly and with separate implementations
for 32 and 64 bits. identify_cpu() is also implemented in assembly and
separately for 32 and 64 bits.
This patch replaces them with a pair of C functions. The call sites are
slightly moved on ppc64 as well to be called from C instead of from
assembly, though it's a very small change, and thus shouldn't cause any
problem.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The original problem that inspired this patch was solved quite some time
ago (Turning off PCI didn't work), but this patch neatens things up a
little (I think), by putting all the PCI stuff inside a single CONFIG_PCI
block. It also removes the OF PCI bus matching entries if CONFIG_PCI is
off.
Signed-off-by: Paul Mackerras <paulus@samba.org>