This makes the kernel use 1TB segments for all kernel mappings and for
user addresses of 1TB and above, on machines which support them
(currently POWER5+, POWER6 and PA6T).
We detect that the machine supports 1TB segments by looking at the
ibm,processor-segment-sizes property in the device tree.
We don't currently use 1TB segments for user addresses < 1T, since
that would effectively prevent 32-bit processes from using huge pages
unless we also had a way to revert to using 256MB segments. That
would be possible but would involve extra complications (such as
keeping track of which segment size was used when HPTEs were inserted)
and is not addressed here.
Parts of this patch were originally written by Ben Herrenschmidt.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This way we only have entries in the device tree for disks that actually
exist. A slight complication is that disks may be attached to LPARs
at runtime.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Now we will only have entries in the device tree for the actual existing
devices (including their OS/400 properties). This way viotape.c gets
all the information about the devices from the device tree.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Now we will only have entries in the device tree for the actual existing
devices (including their OS/400 properties). This way viocd.c gets all
the information about the devices from the device tree.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
It was only being used to carry around dma_iommu_ops and vio_iommu_table
which we can use directly instead. This also means that vio_bus_device
doesn't need to refer to them either.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently if you don't specify a match callback for your irq_host it's
assumed you match everything. This is a kind of opt-out approach, and
turns out to be the exception rather than the rule.
So change the semantics to be opt-in, ie. you don't match anything unless
you provide a match callback. This in itself isn't very useful, but will
allow us to provide a default match implementation in a subsequent patch.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The majority of irq_host implementations (3 out of 4) are associated
with a device_node, and need to stash it somewhere. Rather than having
it somewhere different for each host, add an optional device_node pointer
to the irq_host structure.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
out of head_64.S and into platforms/iseries/exception.S
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This fixes some interrrupt -> interrupt typos.
Signed-off-by: Gabriel Craciunescu <nix.or.die@googlemail.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This removes several duplicate includes from arch/powerpc/.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
WARNING: vmlinux.o(.text+0x4f568): Section mismatch: reference to .init.text:.__alloc_bootmem (between '.setup_hvlpevent_queue' and '.process_hvlpevents')
setup_hvlpevent_queue is only called from __init code so make it __init
as well.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently iSeries will recalibrate the cputime_factors in the first
settimeofday() call.
It seems the reason for doing this is to ensure a resaonable time delta after
time_init(). On current kernels (with udev), this call is made 40-60 seconds
into the boot process, by moving it to a late initcall it is called
approximately 5 seconds after time_init() is called. This is sufficient to
recalibrate the timebase.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Using typedefs to rename structure types if frowned on by CodingStyle.
However, we do so for the hash PTE structure on both ppc32 (where it's
called "PTE") and ppc64 (where it's called "hpte_t"). On ppc32 we
also have such a typedef for the BATs ("BAT").
This removes this unhelpful use of typedefs, in the process
bringing ppc32 and ppc64 closer together, by using the name "struct
hash_pte" in both cases.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This rewrites pretty much from scratch the handling of MMIO and PIO
space allocations on powerpc64. The main goals are:
- Get rid of imalloc and use more common code where possible
- Simplify the current mess so that PIO space is allocated and
mapped in a single place for PCI bridges
- Handle allocation constraints of PIO for all bridges including
hot plugged ones within the 2GB space reserved for IO ports,
so that devices on hotplugged busses will now work with drivers
that assume IO ports fit in an int.
- Cleanup and separate tracking of the ISA space in the reserved
low 64K of IO space. No ISA -> Nothing mapped there.
I booted a cell blade with IDE on PIO and MMIO and a dual G5 so
far, that's it :-)
With this patch, all allocations are done using the code in
mm/vmalloc.c, though we use the low level __get_vm_area with
explicit start/stop constraints in order to manage separate
areas for vmalloc/vmap, ioremap, and PCI IOs.
This greatly simplifies a lot of things, as you can see in the
diffstat of that patch :-)
A new pair of functions pcibios_map/unmap_io_space() now replace
all of the previous code that used to manipulate PCI IOs space.
The allocation is done at mapping time, which is now called from
scan_phb's, just before the devices are probed (instead of after,
which is by itself a bug fix). The only other caller is the PCI
hotplug code for hot adding PCI-PCI bridges (slots).
imalloc is gone, as is the "sub-allocation" thing, but I do beleive
that hotplug should still work in the sense that the space allocation
is always done by the PHB, but if you unmap a child bus of this PHB
(which seems to be possible), then the code should properly tear
down all the HPTE mappings for that area of the PHB allocated IO space.
I now always reserve the first 64K of IO space for the bridge with
the ISA bus on it. I have moved the code for tracking ISA in a separate
file which should also make it smarter if we ever are capable of
hot unplugging or re-plugging an ISA bridge.
This should have a side effect on platforms like powermac where VGA IOs
will no longer work. This is done on purpose though as they would have
worked semi-randomly before. The idea at this point is to isolate drivers
that might need to access those and fix them by providing a proper
function to obtain an offset to the legacy IOs of a given bus.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Use a completion instead of abusing a semaphore for hypervisor event
completion in viopath.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This makes the new iSeries virtual console drivers (nvc_iseries) the
default and prevents viocons being built unless explicitly selected.
Also it makes no sense to have the console as a module.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.
Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
check_legacy_ioport makes only sense on PREP, CHRP and pSeries.
They may have an isa node with PS/2, parport, floppy and serial ports.
Remove the check_legacy_ioport call from ppc_md, it's not needed
anymore. Hardware capabilities come from the device-tree.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Remove unneeded inclusion of linux/ide.h
It does not compile with CONFIG_BLOCK=n.
Remove asm/ide.h from ksyms file, it gets included earlier via
linux/ide.h.
Compile tested with all defconfig files.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Moved pseries, iseries, chrp, prep, maple and pasemi into their respective
arch/powerpc/platform/*/Kconfig files out of arch/powerpc/Kconfig
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
This will allow us to build without PCI easier.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data. In addition it'll catch accidental writes at compile time to
these shared resources.
[akpm@osdl.org: sparc64 fix]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: pnx8550 code creates directory but resets ->nlink to 1.
create_proc_entry() et al will correctly set ->nlink for you.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Corey Minyard <minyard@acm.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Greg KH <greg@kroah.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Clearing the progress indicator should only be done if we are running
on legacy iSeries.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
/proc/iSeries/config should only be created if we are running on legacy
iSeries.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
/proc/iSeries/lpevents should only be created if we are running
on legacy iSeries.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
These proc files should only be created if we are running on legacy
iSeries.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This proc file should only be created if we are running on legacy
iSeries. Since we can now run the same kernel on legacy iSeries and
other machines, we currently get the /proc/iSeries directory and the
files in it on non-iSeries machines, and accessing them causes an oops
in some cases. This and the following patches make sure that these
files are not created on non-iSeries machines, thus avoiding the oops.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Remove CPU_FTR_16M_PAGE from the cupfeatures mask at runtime on iSeries.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds code to look at the properties firmware puts in the device
tree to determine what compatibility mode the partition is in on
POWER6 machines, and set the ELF aux vector AT_HWCAP and AT_PLATFORM
entries appropriately.
Specifically, we look at the cpu-version property in the cpu node(s).
If that contains a "logical" PVR value (of the form 0x0f00000x), we
call identify_cpu again with this PVR value. A value of 0x0f000001
indicates the partition is in POWER5+ compatibility mode, and a value
of 0x0f000002 indicates "POWER6 architected" mode, with various
extensions disabled. We also look for various other properties:
ibm,dfp, ibm,purr and ibm,spurr.
Signed-off-by: Paul Mackerras <paulus@samba.org>
powerpc: Merge 32 and 64 bits asm-powerpc/io.h
The rework on io.h done for the new hookable accessors made it easier,
so I just finished the work and merged 32 and 64 bits io.h for arch/powerpc.
arch/ppc still uses the old version in asm-ppc, there is just too much gunk
in there that I really can't be bothered trying to cleanup.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch reworks the way iSeries hooks on PCI IO operations (both MMIO
and PIO) and provides a generic way for other platforms to do so (we
have need to do that for various other platforms).
While reworking the IO ops, I ended up doing some spring cleaning in
io.h and eeh.h which I might want to split into 2 or 3 patches (among
others, eeh.h had a lot of useless stuff in it).
A side effect is that EEH for PIO should work now (it used to pass IO
ports down to the eeh address check functions which is bogus).
Also, new are MMIO "repeat" ops, which other archs like ARM already had,
and that we have too now: readsb, readsw, readsl, writesb, writesw,
writesl.
In the long run, I might also make EEH use the hooks instead
of wrapping at the toplevel, which would make things even cleaner and
relegate EEH completely in platforms/iseries, but we have to measure the
performance impact there (though it's really only on MMIO reads)
Since I also need to hook on ioremap, I shuffled the functions a bit
there. I introduced ioremap_flags() to use by drivers who want to pass
explicit flags to ioremap (and it can be hooked). The old __ioremap() is
still there as a low level and cannot be hooked, thus drivers who use it
should migrate unless they know they want the low level version.
The patch "arch provides generic iomap missing accessors" (should be
number 4 in this series) is a pre-requisite to provide full iomap
API support with this patch.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch completely refactors DMA operations for 64 bits powerpc. 32 bits
is untouched for now.
We use the new dev_archdata structure to add the dma operations pointer
and associated data to struct device. While at it, we also add the OF node
pointer and numa node. In the future, we might want to look into merging
that with pci_dn as well.
The old vio, pci-iommu and pci-direct DMA ops are gone. They are now replaced
by a set of generic iommu and direct DMA ops (non PCI specific) that can be
used by bus types. The toplevel implementation is now inline.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The 10Gigabit ethernet device drivers appear to be able to chew
up all 256MB of TCE mappings on pSeries systems, as evidenced by
numerous error messages:
iommu_alloc failed, tbl c0000000010d5c48 vaddr c0000000d875eff0 npages 1
Some experimentation indicates that this is essentially because
one 1500 byte ethernet MTU gets mapped as a 64K DMA region when
the large 64K pages are enabled. Thus, it doesn't take much to
exhaust all of the available DMA mappings for a high-speed card.
This patch changes the iommu allocator to work with its own
unique, distinct page size. Although the patch is long, its
actually quite simple: it just #defines a distinct IOMMU_PAGE_SIZE
and then uses this in all the places that matter.
As a side effect, it also dramatically improves network performance
on platforms with H-calls on iommu translation inserts/removes (since
we no longer call it 16 times for a 1500 bytes packet when the iommu HW
is still 4k).
In the future, we might want to make the IOMMU_PAGE_SIZE a variable
in the iommu_table instance, thus allowing support for different HW
page sizes in the iommu itself.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
There are currently two versions of the functions for applying the
feature fixups, one for CPU features and one for firmware features. In
addition, they are both in assembly and with separate implementations
for 32 and 64 bits. identify_cpu() is also implemented in assembly and
separately for 32 and 64 bits.
This patch replaces them with a pair of C functions. The call sites are
slightly moved on ppc64 as well to be called from C instead of from
assembly, though it's a very small change, and thus shouldn't cause any
problem.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>