Commit Graph

272 Commits

Author SHA1 Message Date
Paul Mackerras
bf72aeba2f powerpc: Use 64k pages without needing cache-inhibited large pages
Some POWER5+ machines can do 64k hardware pages for normal memory but
not for cache-inhibited pages.  This patch lets us use 64k hardware
pages for most user processes on such machines (assuming the kernel
has been configured with CONFIG_PPC_64K_PAGES=y).  User processes
start out using 64k pages and get switched to 4k pages if they use any
non-cacheable mappings.

With this, we use 64k pages for the vmalloc region and 4k pages for
the imalloc region.  If anything creates a non-cacheable mapping in
the vmalloc region, the vmalloc region will get switched to 4k pages.
I don't know of any driver other than the DRM that would do this,
though, and these machines don't have AGP.

When a region gets switched from 64k pages to 4k pages, we do not have
to clear out all the 64k HPTEs from the hash table immediately.  We
use the _PAGE_COMBO bit in the Linux PTE to indicate whether the page
was hashed in as a 64k page or a set of 4k pages.  If hash_page is
trying to insert a 4k page for a Linux PTE and it sees that it has
already been inserted as a 64k page, it first invalidates the 64k HPTE
before inserting the 4k HPTE.  The hash invalidation routines also use
the _PAGE_COMBO bit, to determine whether to look for a 64k HPTE or a
set of 4k HPTEs to remove.  With those two changes, we can tolerate a
mix of 4k and 64k HPTEs in the hash table, and they will all get
removed when the address space is torn down.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-15 10:45:18 +10:00
Paul Mackerras
4306443128 powerpc: Remove unused paca->pgdir field
The pgdir field in the paca was a leftover from the dynamic VSIDs
patch, and is not used in the current kernel code.  This removes it.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-12 18:38:21 +10:00
Paul Mackerras
6218a761bb powerpc: add context.vdso_base for 32-bit too
This adds a vdso_base element to the mm_context_t for 32-bit compiles
(both for ARCH=powerpc and ARCH=ppc).  This fixes the compile errors
that have been reported in arch/powerpc/kernel/signal_32.c.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-11 14:15:17 +10:00
Benjamin Herrenschmidt
c5cf0e30bf [PATCH] powerpc: Fix buglet with MMU hash management
Our MMU hash management code would not set the "C" bit (changed bit) in
the hardware PTE when updating a RO PTE into a RW PTE. That would cause
the hardware to possibly to a write back to the hash table to set it on
the first store access, which in addition to being a performance issue,
might also hit a bug when running with native hash management (non-HV)
as our code is specifically optimized for the case where no write back
happens.

Thus there is a very small therocial window were a hash PTE can become
corrupted if that HPTE has just been upgraded to read write, a store
access happens on it, and that races with another processor evicting
that same slot. Since eviction (caused by an almost full hash) is
extremely rare, the bug is very unlikely to happen fortunately.

This fixes by allowing the updating of the protection bits in the native
hash handling to also set (but not clear) the "C" bit, and, in order to
also improve performances in the general case, by always setting that
bit on newly inserted hash PTE so that writeback really never happens.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-09 21:20:59 +10:00
Michael Ellerman
2babf5c2ec [PATCH] powerpc: Unify mem= handling
We currently do mem= handling in three seperate places. And as benh pointed out
I wrote two of them. Now that we parse command line parameters earlier we can
clean this mess up.

Moving the parsing out of prom_init means the device tree might be allocated
above the memory limit. If that happens we'd have to move it. As it happens
we already have logic to do that for kdump, so just genericise it.

This also means we might have reserved regions above the memory limit, if we
do the bootmem allocator will blow up, so we have to modify
lmb_enforce_memory_limit() to truncate the reserves as well.

Tested on P5 LPAR, iSeries, F50, 44p. Tested moving device tree on P5 and
44p and F50.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-05-19 15:02:15 +10:00
Paul Mackerras
f18fc729cd Merge ../linux-2.6 2006-05-05 15:45:48 +10:00
Jeremy Kerr
953039c8df [PATCH] powerpc: Allow devices to register with numa topology
Change of_node_to_nid() to traverse the device tree, looking for a numa id.
Cell uses this to assign ids to SPUs, which are children of the CPU node.
Existing users of of_node_to_nid() are altered to use of_node_to_nid_single(),
which doesn't do the traversal.

Export an attach_sysdev_to_node() function, allowing system devices (eg.
SPUs) to link themselves into the numa topology in sysfs.

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-01 18:17:46 -07:00
Paul Mackerras
29f147d746 Merge branch 'merge' 2006-04-29 16:15:57 +10:00
David Gibson
f10a04c034 [PATCH] powerpc: Fix pagetable bloat for hugepages
At present, ARCH=powerpc kernels can waste considerable space in
pagetables when making large hugepage mappings.  Hugepage PTEs go in
PMD pages, but each PMD page maps 256M and so contains only 16
hugepage PTEs (128 bytes of data), but takes up a 1024 byte
allocation.  With CONFIG_PPC_64K_PAGES enabled (64k base page size),
the situation is worse.  Now hugepage PTEs are at the PTE page level
(also mapping 256M), so we store 16 hugepage PTEs in a 64k allocation.

The PowerPC MMU already means that any 256M region is either all
hugepage, or all normal pages.  Thus, with some care, we can use a
different allocation for the hugepage PTE tables and only allocate the
128 bytes necessary.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-04-28 15:02:51 +10:00
Olof Johansson
e110b281dc [PATCH] powerpc: Less verbose mem configuration output
Quieten some of the debug ram config output. we already print out available
memory at KERN_INFO level.

Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-04-22 18:45:12 +10:00
Olof Johansson
f430c02b13 [PATCH] powerpc: Quiet page order output
No need to always print page orders.

Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-04-22 18:45:09 +10:00
Anton Blanchard
fc5266ea52 [PATCH] powerpc: trivial spelling fixes in fault.c
This comment exceeded my bad spelling threshold :)

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-04-01 22:37:13 +11:00
KAMEZAWA Hiroyuki
0e5519548f [PATCH] for_each_possible_cpu: powerpc
for_each_cpu() actually iterates across all possible CPUs.  We've had mistakes
in the past where people were using for_each_cpu() where they should have been
iterating across only online or present CPUs.  This is inefficient and
possibly buggy.

We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the
future.

This patch replaces for_each_cpu with for_each_possible_cpu.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-29 13:44:15 +11:00
Eugene Surovegin
bab70a4af7 [PATCH] lock PTE before updating it in 440/BookE page fault handler
Fix 44x and BookE page fault handler to correctly lock PTE before
trying to pte_update() it, otherwise this PTE might be swapped out
after pte_present() check but before pte_uptdate() call, resulting in
corrupted PTE. This can happen with enabled preemption and low memory
condition.

Signed-off-by: Eugene Surovegin <ebs@ebshome.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-29 13:44:15 +11:00
Paul Mackerras
bac30d1a78 Merge ../linux-2.6 2006-03-29 13:24:50 +11:00
Benjamin Herrenschmidt
e8222502ee [PATCH] powerpc: Kill _machine and hard-coded platform numbers
This removes statically assigned platform numbers and reworks the
powerpc platform probe code to use a better mechanism.  With this,
board support files can simply declare a new machine type with a
macro, and implement a probe() function that uses the flattened
device-tree to detect if they apply for a given machine.

We now have a machine_is() macro that replaces the comparisons of
_machine with the various PLATFORM_* constants.  This commit also
changes various drivers to use the new macro instead of looking at
_machine.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-28 23:15:54 +11:00
KAMEZAWA Hiroyuki
ec936fc563 [PATCH] for_each_online_pgdat: renaming for_each_pgdat
Replace for_each_pgdat() with for_each_online_pgdat().

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:48 -08:00
Anton Blanchard
c258dd40ab [PATCH] powerpc: Consistent printing of node id
We were printing node ids in hex in one spot. Lets be consistent and
always print them in decimal.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-27 14:48:50 +11:00
Andrew Morton
069007ae07 [PATCH] powerpc: hot_add_scn_to_nid() build fix
The return statement is to prevent `warning: 'nid' might be used uninitialized
in this function'.

Cc: Nathan Lynch <nathanl@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-27 14:48:34 +11:00
Ingo Molnar
14cc3e2b63 [PATCH] sem2mutex: misc static one-file mutexes
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jens Axboe <axboe@suse.de>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Cc: Greg KH <greg@kroah.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Adam Belay <ambx1@neo.rr.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:55 -08:00
Linus Torvalds
2e6e33bab6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (78 commits)
  [PATCH] powerpc: Add FSL SEC node to documentation
  [PATCH] macintosh: tidy-up driver_register() return values
  [PATCH] powerpc: tidy-up of_register_driver()/driver_register() return values
  [PATCH] powerpc: via-pmu warning fix
  [PATCH] macintosh: cleanup the use of i2c headers
  [PATCH] powerpc: dont allow old RTC to be selected
  [PATCH] powerpc: make powerbook_sleep_grackle static
  [PATCH] powerpc: Fix warning in add_memory
  [PATCH] powerpc: update mailing list addresses
  [PATCH] powerpc: Remove calculation of io hole
  [PATCH] powerpc: iseries: Add bootargs to /chosen
  [PATCH] powerpc: iseries: Add /system-id, /model and /compatible
  [PATCH] powerpc: Add strne2a() to convert a string from EBCDIC to ASCII
  [PATCH] powerpc: iseries: Make more stuff static in platforms/iseries/mf.c
  [PATCH] powerpc: iseries: Remove pointless iSeries_(restart|power_off|halt)
  [PATCH] powerpc: iseries: mf related cleanups
  [PATCH] powerpc: Replace platform_is_lpar() with a firmware feature
  [PATCH] powerpc: trivial: Cleanup whitespace in cputable.h
  [PATCH] powerpc: Remove unused iommu_off logic from pSeries_init_early()
  [PATCH] powerpc: Unconfuse htab_bolt_mapping() callers
  ...
2006-03-22 22:20:46 -08:00
Andrew Morton
2d0eee14b2 [PATCH] powerpc: Fix warning in add_memory
arch/powerpc/mm/mem.c: In function `add_memory':
arch/powerpc/mm/mem.c:128: warning: assignment makes integer from pointer without a cast

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-23 14:39:51 +11:00
David Gibson
42b88befd6 [PATCH] hugepage: is_aligned_hugepage_range() cleanup
Quite a long time back, prepare_hugepage_range() replaced
is_aligned_hugepage_range() as the callback from mm/mmap.c to arch code to
verify if an address range is suitable for a hugepage mapping.
is_aligned_hugepage_range() stuck around, but only to implement
prepare_hugepage_range() on archs which didn't implement their own.

Most archs (everything except ia64 and powerpc) used the same
implementation of is_aligned_hugepage_range().  On powerpc, which
implements its own prepare_hugepage_range(), the custom version was never
used.

In addition, "is_aligned_hugepage_range()" was a bad name, because it
suggests it returns true iff the given range is a good hugepage range,
whereas in fact it returns 0-or-error (so the sense is reversed).

This patch cleans up by abolishing is_aligned_hugepage_range().  Instead
prepare_hugepage_range() is defined directly.  Most archs use the default
version, which simply checks the given region is aligned to the size of a
hugepage.  ia64 and powerpc define custom versions.  The ia64 one simply
checks that the range is in the correct address space region in addition to
being suitably aligned.  The powerpc version (just as previously) checks
for suitable addresses, and if necessary performs low-level MMU frobbing to
set up new areas for use by hugepages.

No libhugetlbfs testsuite regressions on ppc64 (POWER5 LPAR).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22 07:54:04 -08:00
Nick Piggin
7835e98b2e [PATCH] remove set_page_count() outside mm/
set_page_count usage outside mm/ is limited to setting the refcount to 1.
Remove set_page_count from outside mm/, and replace those users with
init_page_count() and set_page_refcounted().

This allows more debug checking, and tighter control on how code is allowed
to play around with page->_count.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22 07:54:02 -08:00
Nick Piggin
70dc991d66 [PATCH] remove set_page_count(page, 0) users (outside mm)
A couple of places set_page_count(page, 1) that don't need to.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22 07:54:01 -08:00
Michael Ellerman
f8642ebee8 [PATCH] powerpc: Remove calculation of io hole
In mm_init_ppc64() we calculate the location of the "IO hole", but then
no one ever looks at the value. So don't bother.

That's actually all mm_init_ppc64() does, so get rid of it too.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-22 15:04:30 +11:00
Michael Ellerman
57cfb814f6 [PATCH] powerpc: Replace platform_is_lpar() with a firmware feature
It has been decreed that platform numbers are evil, so as a step in that
direction, replace platform_is_lpar() with a FW_FEATURE_LPAR bit.

Currently FW_FEATURE_LPAR really means i/pSeries LPAR, in the future we might
have to clean that up if we need to be more specific about what LPAR actually
means. But that's another patch ...

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-22 15:04:17 +11:00
Michael Ellerman
caf80e579b [PATCH] powerpc: Unconfuse htab_bolt_mapping() callers
htab_bolt_mapping() takes a vstart and pstart parameter, but all but one of
its callers actually pass it vstart and vstart. Luckily before it passes
paddr (calculated from paddr) to the hpte_insert routines it calls
virt_to_abs() (aka. __pa()) on the address, so there isn't actually a bug.

map_io_page() however does pass pstart properly, so currently it's broken
AFAICT because we're calling __pa(paddr) which will get us something very
large. Presumably no one's calling map_io_page() in the right context.

Anyway, change htab_bolt_mapping() callers to properly pass pstart, and then
use it properly in htab_bolt_mapping(), ie. don't call __pa() on it again.

Booted on p5 LPAR, iSeries and Power3.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-22 15:04:09 +11:00
Nathan Lynch
2b2612272c [PATCH] powerpc numa: Consolidate assignment of cpus to nodes
We can plug the boot cpu into its node independently of whether numa
topology is detected.  And numa_setup_cpu does the right thing for all
cases now, so remove special-casing for non-numa from the cpu hotplug
callback.

Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-22 15:04:03 +11:00
Nathan Lynch
482ec7c403 [PATCH] powerpc numa: Support sparse online node map
The powerpc numa code unconditionally onlines all nodes from 0 to the
highest node id found, regardless of whether cpus or memory are
present in the nodes.  This wastes 8K per node and complicates some
cpu and memory hotplug situations, such as adding a resource that
doesn't map to one of the nodes discovered at boot.

Set nodes online as resources are scanned.  Fall back to node 0 only
when we're sure this isn't a NUMA machine.

Instead of defaulting to node 0 for cases of hot-adding a resource
which doesn't belong to any initialized node, assign it to the first
online node.

Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-22 15:04:01 +11:00
Nathan Lynch
bc16a75926 [PATCH] powerpc numa: Consolidate handling of Power4 special case
Code to handle Power4's invalid node id (0xffff) is duplicated for cpu
and memory.  Better to handle this case in one place --
of_node_to_nid.  Overall behavior should be unchanged.

Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-22 15:03:57 +11:00
Nathan Lynch
cf950b7af0 [PATCH] powerpc numa: Get rid of "numa domain" terminology
Since we effectively treat the domain ids given to us by firmare as
logical node ids, make this explicit (basically s/numa_domain/nid/).

No functional changes, only variable and function names are modified.

Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-22 15:03:52 +11:00
Nathan Lynch
2e5ce39d67 [PATCH] powerpc numa: Minor cpu hotplug-related cleanups
map_cpu_to_node does not need to be inline, it is never called in a
hot path.

map_cpu_to_node, numa_setup_cpu, and find_cpu_node can be marked
__cpuinit, as they are never used after boot if CONFIG_HOTPLUG_CPU=n.

Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-22 15:03:48 +11:00
Nathan Lynch
bf4b85b0e4 [PATCH] powerpc numa: Minor debugging code changes
Add debug statement for map_cpu_to_node; it's useful for cpu hotplug.

Clarify debug statement about not finding the numa reference points
property.

Don't print a meaningless associativity depth (-1) on non-numa systems.

Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-22 15:03:45 +11:00
Nathan Lynch
c08888cf3c [PATCH] powerpc numa: fix boot_cpuid always assigned to node 0
At boot, the numa code is assigning boot_cpuid to node 0
unconditionally.  Basically, numa_setup_cpu is being stupid about it,
but this is the minimal fix -- just call numa_setup_cpu(boot_cpuid)
later, after all nodes have been set online.

Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-22 15:03:40 +11:00
Michael Ellerman
2c276603c3 [PATCH] powerpc: Fix bug in bug fix for bug in lmb_alloc()
My patch (d7a5b2ffa1) to always panic if
lmb_alloc() fails is broken because it checks alloc < 0, but should be
checking alloc == 0.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-17 13:28:24 +11:00
Paul Mackerras
23dd640112 Merge ../linux-2.6 2006-03-17 12:01:19 +11:00
Olaf Hering
920573bd03 [PATCH] powerpc: remove duplicate EXPORT_SYMBOLS
remove warnings when building a 64bit kernel.
smp_call_function triggers also with 32bit kernel.

WARNING: vmlinux: duplicate symbol 'smp_call_function' previous definition was in vmlinux
arch/powerpc/kernel/ppc_ksyms.c:164:EXPORT_SYMBOL(smp_call_function);
arch/powerpc/kernel/smp.c:300:EXPORT_SYMBOL(smp_call_function);

WARNING: vmlinux: duplicate symbol 'ioremap' previous definition was in vmlinux
arch/powerpc/kernel/ppc_ksyms.c:113:EXPORT_SYMBOL(ioremap);
arch/powerpc/mm/pgtable_64.c:321:EXPORT_SYMBOL(ioremap);

WARNING: vmlinux: duplicate symbol '__ioremap' previous definition was in vmlinux
arch/powerpc/kernel/ppc_ksyms.c:117:EXPORT_SYMBOL(__ioremap);
arch/powerpc/mm/pgtable_64.c:322:EXPORT_SYMBOL(__ioremap);

WARNING: vmlinux: duplicate symbol 'iounmap' previous definition was in vmlinux
arch/powerpc/kernel/ppc_ksyms.c:118:EXPORT_SYMBOL(iounmap);
arch/powerpc/mm/pgtable_64.c:323:EXPORT_SYMBOL(iounmap);

Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-16 16:55:05 +11:00
Paul Mackerras
6749c55073 Merge ../powerpc-merge 2006-02-28 16:35:24 +11:00
Michael Ellerman
56ec6462af [PATCH] powerpc/iseries: Fix double phys_to_abs bug in htab_bolt_mapping
Before the merge I updated create_pte_mapping() to work for iSeries, by
calling iSeries_hpte_bolt_or_insert. (4c55130b2a)

Later we changed iSeries_hpte_insert to cope with the bolting case, and called
that instead from create_pte_mapping() (which was renamed to htab_bolt_mapping)
(3c726f8dee).

Unfortunately that change introduced a subtle bug, where we pass an absolute
address to iSeries_hpte_insert() where it expects a physical address. This
leads to us calling phys_to_abs() twice on the physical address, which is
seriously bogus.

This only causes a problem if the absolute address from the first translation
can be looked up again in the chunk_map, which depends on the size and layout
of memory. I've seen it fail on one box, but not others.

The minimal fix is to pass the physical address to iSeries_hpte_insert(). For
2.6.17 we should make phys_to_abs() BUG if we try to double-translate an
address.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-28 16:25:55 +11:00
Paul Mackerras
a00428f5b1 Merge ../powerpc-merge 2006-02-24 14:05:47 +11:00
R Sharada
47f78a4920 [PATCH] powerpc64: fix spinlock recursion in native_hpte_clear
native_hpte_clear has a spinlock recursion problem with the native_tlbie_lock
being called twice, once in native_hpte_clear() and once within tlbie().
Fix the problem by changing the call to tlbie() in native_hpte_clear() to
__tlbie(). It still supports only 4k pages for now.

Signed-off-by: R Sharada <sharada@in.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-24 11:36:29 +11:00
Michael Ellerman
337a7128db [PATCH] powerpc: Only calculate htab_size in one place for kexec
For kexec we need to know the size of the MMU hash table.

Currently we calculate the size once in the htab code, and then twice more in
the kexec code, once using htab_hash_mask and once using ppc64_pft_size.
On some machines the ppc64_pft_size calculation is broken because
ppc64_pft_size is not set.

So we need to fix the second calculation, but better still we should just
calculate the size once and use it everywhere else.

Tested on Power5 LPAR, Power4 non-LPAR and Power3.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-24 11:36:18 +11:00
Jon Mason
2ef9481e66 [PATCH] powerpc: trivial: modify comments to refer to new location of files
This patch removes all self references and fixes references to files
in the now defunct arch/ppc64 tree.  I think this accomplises
everything wanted, though there might be a few references I missed.

Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-10 16:53:51 +11:00
Michael Ellerman
3b9331dac1 [PATCH] powerpc: Move LMB_ALLOC_ANYWHERE out of lmb.h
LMB_ALLOC_ANYWHERE doesn't need to be part of the API, it's only used in
lmb.c - so move it out of the header file.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 22:38:36 +11:00
Michael Ellerman
d7a5b2ffa1 [PATCH] powerpc: Always panic if lmb_alloc() fails
Currently most callers of lmb_alloc() don't check if it worked or not, if it
ever does weird bad things will probably happen. The few callers who do check
just panic or BUG_ON.

So make lmb_alloc() panic internally, to catch bugs at the source. The few
callers who did check the result no longer need to.

The only caller that did anything interesting with the return result was
careful_allocation(). For it we create __lmb_alloc_base() which _doesn't_ panic
automatically, a little messy, but passable.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 22:38:34 +11:00
David Gibson
09f5dc44ae [PATCH] powerpc: Cleanup, consolidating icache dirtying logic
The code to mark a page as icache dirty (so that it will later be
icache-dcache flushed when we try to execute from it) is duplicated in
three places: flush_dcache_page() does this marking and nothing else,
but clear_user_page() and copy_user_page() duplicate it, since those
functions make the page icache dirty themselves.

This patch makes those other functions call flush_dcache_page()
instead, so the logic's all in one place.  This will make life less
confusing if we ever need to tweak the details of the the lazy icache
flush mechanism.

 arch/powerpc/mm/mem.c |   14 ++------------
 1 file changed, 2 insertions(+), 12 deletions(-)

Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:51:53 +11:00
Michael Ellerman
8c20fafa85 [PATCH] powerpc: Make sure we don't create empty lmb regions
To prevent problems later in boot, make sure we don't create zero-size lmb
regions.

I've checked all the callers, and at the moment no one should ever hit this.
All callers use a constant size, or they check the computed size before they
call us.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:28:38 +11:00
Linas Vepstas
d177c207ba [PATCH] powerpc: IOMMU: don't ioremap null addresses
240-ioremap-null-ptr-test.patch

Under highly unusual circumstances, a buggy driver will ask a null ptr
to be ioremapped, an operation that curently succeeds but leads to
later trouble.  Instead, refuse to remap the null pointer.

Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
(cherry picked from e71d9e598533c1889e7162f5f4647e5d378c102c commit)
2006-01-10 15:30:31 +11:00
Adrian Bunk
943ffb587c spelling: s/retreive/retrieve/
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-01-10 00:10:13 +01:00
Michael Ellerman
e0fa93d6e6 [PATCH] powerpc: Don't use KERNELBASE in add_memory()
In add_memory() we should be using __va() to get a virtual address.
Spotted by Mike Kravetz.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 20:18:19 +11:00
Anton Blanchard
bce6c5fd8c [PATCH] powerpc: DABR exceptions should report the address not the PC
When taking a DABR exception we were reporting the PC. It makes more
sense to report the address that caused the exception, and the gdb guys
would like it that way.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 16:03:33 +11:00
Mike Kravetz
b226e46212 [PATCH] powerpc: don't add memory to empty node/zone
The system will oops if an attempt is made to add memory to an
empty node/zone.  This patch prevents adding memory to an empty
node.  The code to dynamically add a node/zone is non-trivial.
This patch is temporary and will be removed when the ability
to dynamically add a node/zone is complete.

Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 15:14:22 +11:00
Arnd Bergmann
021c733549 [PATCH] powerpc: fix two build warnings
Building the arch/powerpc tree currently gives me
two warnings with gcc-4.0:

arch/powerpc/mm/imalloc.c: In function '__im_get_area':
arch/powerpc/mm/imalloc.c:225: warning: 'tmp' may be used uninitialized in this function
arch/powerpc/mm/hugetlbpage.c: In function 'hugetlb_get_unmapped_area':
arch/powerpc/mm/hugetlbpage.c:608: warning: unused variable 'vma'

both fixes are trivial.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 15:14:14 +11:00
David Gibson
14c89e7fc8 [PATCH] powerpc: Replace VMALLOCBASE with VMALLOC_START
On ppc64, we independently define VMALLOCBASE and VMALLOC_START to be
the same thing: the start of the vmalloc() area at 0xd000000000000000.
VMALLOC_START is used much more widely, including in generic code, so
this patch gets rid of the extraneous VMALLOCBASE.

This does require moving the definitions of region IDs from page_64.h
to pgtable.h, but they don't clearly belong in the former rather than
the latter, anyway.  While we're moving them, clean up the definitions
of the REGION_IDs:
	- Abolish REGION_SIZE, it was only used once, to define
REGION_MASK anyway
	- Define the specific region ids in terms of the REGION_ID()
macro.
	- Define KERNEL_REGION_ID in terms of PAGE_OFFSET rather than
KERNELBASE.  It amounts to the same thing, but conceptually this is
about the region of the linear mapping (which starts at PAGE_OFFSET)
rather than of the kernel text itself (which is at KERNELBASE).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 15:05:47 +11:00
Benjamin Herrenschmidt
cc5d0189b9 [PATCH] powerpc: Remove device_node addrs/n_addr
The pre-parsed addrs/n_addrs fields in struct device_node are finally
gone. Remove the dodgy heuristics that did that parsing at boot and
remove the fields themselves since we now have a good replacement with
the new OF parsing code. This patch also fixes a bunch of drivers to use
the new code instead, so that at least pmac32, pseries, iseries and g5
defconfigs build.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 14:53:55 +11:00
Anton Blanchard
4b703a2317 [PATCH] ppc64: Add NUMA cpu summary at boot
We used to print a NUMA cpu summary at boot before the hotplug cpu code
was added. This has been useful for catching machine configuration as
well as firmware bugs in the past.

This patch restores that functionality. An example of the output is:

Node 0 CPUs: 0-7
Node 1 CPUs: 8-15

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 14:53:37 +11:00
Michael Ellerman
ba7594852f [PATCH] powerpc: Add support for "linux,usable-memory" on memory nodes
Milton has proposed that we should support a "linux,usable-memory" property
on memory nodes which describes, in preference to "reg", the regions of memory
Linux should use.

This facility is required for kdump to inform the second kernel which memory
it should use.

Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 14:52:38 +11:00
Mike Kravetz
237a0989e2 [PATCH] powerpc: numa placement for dynamically added memory
This places dynamically added memory within the appropriate
numa node.  A new routine hot_add_scn_to_nid() replicates most of
the memory scanning code in parse_numa_properties().

Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 14:51:57 +11:00
Michael Ellerman
b5666f7039 [PATCH] powerpc: Separate usage of KERNELBASE and PAGE_OFFSET
This patch separates usage of KERNELBASE and PAGE_OFFSET. I haven't
looked at any of the PPC32 code, if we ever want to support Kdump on
PPC we'll have to do another audit, ditto for iSeries.

This patch makes PAGE_OFFSET the constant, it'll always be 0xC * 1
gazillion for 64-bit.

To get a physical address from a virtual one you subtract PAGE_OFFSET,
_not_ KERNELBASE.

KERNELBASE is the virtual address of the start of the kernel, it's
often the same as PAGE_OFFSET, but _might not be_.

If you want to know something's offset from the start of the kernel
you should subtract KERNELBASE.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 14:51:54 +11:00
Michael Ellerman
51fae6de24 [PATCH] powerpc: Add a is_kernel_addr() macro
There's a bunch of code that compares an address with KERNELBASE to see if
it's a "kernel address", ie. >= KERNELBASE. The proper test is actually to
compare with PAGE_OFFSET, since we're going to change KERNELBASE soon.

So replace all of them with an is_kernel_addr() macro that does that.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 14:51:50 +11:00
Mike Kravetz
84c9fdd11e [PATCH] powerpc: Minor numa memory code cleanup
Here is an updated version of the patch that panics if no memory is
found as Nathan suggested.  I'm still concerned that panic strings
(not just the one added here) at this stage of booting do not show
up on my system.  But, that is an issue separate from this patch.

Combine get_mem_*_cells() routines to avoid multiple memory node
lookups.  Added missing of_node_put() call.  Changed variable names
to help with some confusion as to meaning.

Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 14:51:34 +11:00
Paul Mackerras
54c233102f Revert "[PATCH] powerpc: Minor numa memory code cleanup"
This reverts f1fdc0117004d343698b9830e141491d5ae320d1 commit.
2006-01-09 14:51:30 +11:00
Mike Kravetz
74761bb53d [PATCH] powerpc: Minor numa memory code cleanup
I started to add missing of_node_put() calls to the routines that
determine the number of cells for memory.  Decided to combine the
routines instead of making separate node lookups.  Changed variable
names to help with some confusion as to meaning.

Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 14:51:02 +11:00
David Gibson
456752f750 [PATCH] powerpc: Make hugepage mappings respect hint addresses
Currently, the powerpc version of hugetlb_get_unmapped_area() entirely
ignores the hint address.  The only way to get a hugepage mapping at a
specified address is with MAP_FIXED, in which case there's no way
(short of parsing /proc/self/maps) for userspace to tell if it will
clobber an existing mapping.  This is inconvenient, so the patch below
makes hugepage mappings use the given hint address if possible.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 14:50:28 +11:00
Benjamin Herrenschmidt
51d3082fe6 [PATCH] powerpc: Unify udbg (#2)
This patch unifies udbg for both ppc32 and ppc64 when building the
merged achitecture. xmon now has a single "back end". The powermac udbg
stuff gets enriched with some ADB capabilities and btext output. In
addition, the early_init callback is now called on ppc32 as well,
approx. in the same order as ppc64 regarding device-tree manipulations.
The init sequences of ppc32 and ppc64 are getting closer, I'll unify
them in a later patch.

For now, you can force udbg to the scc using "sccdbg" or to btext using
"btextdbg" on powermacs. I'll implement a cleaner way of forcing udbg
output to something else than the autodetected OF output device in a
later patch.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 14:49:54 +11:00
Arnd Bergmann
67207b9664 [PATCH] spufs: The SPU file system, base
This is the current version of the spu file system, used
for driving SPEs on the Cell Broadband Engine.

This release is almost identical to the version for the
2.6.14 kernel posted earlier, which is available as part
of the Cell BE Linux distribution from
http://www.bsc.es/projects/deepcomputing/linuxoncell/.

The first patch provides all the interfaces for running
spu application, but does not have any support for
debugging SPU tasks or for scheduling. Both these
functionalities are added in the subsequent patches.

See Documentation/filesystems/spufs.txt on how to use
spufs.

Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 14:49:12 +11:00
Anton Blanchard
e597cb32e9 [PATCH] ppc64: htab_initialize_secondary cannot be marked __init
Sonny has noticed hotplug CPU on ppc64 is broken in 2.6.15-*. One of the
problems is that htab_initialize_secondary is called when a cpu is being
brought up, but it is marked __init.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-12-29 10:26:36 -08:00
David Gibson
23ed6cb9a2 [PATCH] powerpc: Fix SLB flushing path in hugepage
On ppc64, when opening a new hugepage region, we need to make sure any
old normal-page SLBs for the area are flushed on all CPUs.  There was
a bug in this logic - after putting the new hugepage area masks into
the thread structure, we copied it into the paca (read by the SLB miss
handler) only on one CPU, not on all.  This could cause incorrect SLB
entries to be loaded when a multithreaded program was running
simultaneously on several CPUs.  This patch corrects the error,
copying the context information into the PACA on all CPUs using the mm
in question before flushing any existing SLB entries.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-12-09 16:57:35 +11:00
David Gibson
cbf52afdc0 [PATCH] powerpc: Add missing icache flushes for hugepages
On most powerpc CPUs, the dcache and icache are not coherent so
between writing and executing a page, the caches must be flushed.
Userspace programs assume pages given to them by the kernel are icache
clean, so we must do this flush between the kernel clearing a page and
it being mapped into userspace for execute.  We were not doing this
for hugepages, this patch corrects the situation.

We use the same lazy mechanism as we use for normal pages, delaying
the flush until userspace actually attempts to execute from the page
in question.

Tested on G5.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-12-09 16:30:48 +11:00
Benjamin Herrenschmidt
325c82a029 [PATCH] powerpc: Fix a huge page bug
The 64k pages patch changed the meaning of one argument passed to the
low level hash functions (from "large" it became "psize" or page size
index), but one of the call sites wasn't properly updated, causing
potential random weird problems with huge pages. This fixes it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-12-08 16:57:41 +11:00
Mike Kravetz
6d91bb93e4 [PATCH] powerpc/pseries: boot failures on numa if no memory on node
This bug exists in the current code and prevents machines from booting
with numa enabled if there is a node that does not contain memory.
Workaround is to boot with 'numa=off'.  Looks like a simple typo.

Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-12-08 16:54:39 +11:00
Olof Johansson
ca507eaf32 [PATCH] powerpc: remove redundant code in stab init
There's never been a hardware platform that has both pSeries/RPA LPAR
hypervisor and stab (pre-POWER4 segment management). This removes
the redundant code in stab_initalize().

Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-12-05 15:41:35 +11:00
David Gibson
9a94c5793a [PATCH] powerpc: More hugepage boundary case fixes
Blah.  The patch [0] I recently sent fixing errors with
in_hugepage_area() and prepare_hugepage_range() for powerpc itself has
an off-by-one bug.  Furthermore, the related functions
touches_hugepage_*_range() and within_hugepage_*_range() are also
buggy.  Some of the bugs, like those addressed in [0] originated with
commit 7d24f0b8a5 where we tweaked the
semantics of where hugepages are allowed.  Other bugs have been there
essentially forever, and are due to the undefined behaviour of '<<'
with shift counts greater than the type width (LOW_ESID_MASK could
return non-zero for high ranges with the right congruences).

The good news is that I now have a testsuite which should pick up
things like this if they creep in again.

[0] "powerpc-fix-for-hugepage-areas-straddling-4gb-boundary"

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-25 22:12:45 +11:00
David Gibson
5e391dc9e3 [PATCH] powerpc: fix for hugepage areas straddling 4GB boundary
Commit 7d24f0b8a5 fixed bugs in the ppc64 SLB
miss handler with respect to hugepage handling, and in the process tweaked
the semantics of the hugepage address masks in mm_context_t.

Unfortunately, it left out a couple of necessary changes to go with that
change.  First, the in_hugepage_area() macro was not updated to match,
second prepare_hugepage_range() was not updated to correctly handle
hugepages regions which straddled the 4GB point.

The latter appears only to cause process-hangs when attempting to map such
a region, but the former can cause oopses if a get_user_pages() is
triggered at the wrong point.  This patch addresses both bugs.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-23 16:08:39 -08:00
Hugh Dickins
7ce774b480 [PATCH] mm: powerpc init_mm without ptlock
Restore an earlier mod which went missing in the powerpc reshuffle: the 4xx
mmu_mapin_ram does not need to take init_mm.page_table_lock.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-23 16:08:38 -08:00
Hugh Dickins
01edcd891c [PATCH] mm: powerpc ptlock comments
Update comments (only) on page_table_lock and mmap_sem in arch/powerpc.
Removed the comment on page_table_lock from hash_huge_page: since it's no
longer taking page_table_lock itself, it's irrelevant whether others are; but
how it is safe (even against huge file truncation?) I can't say.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-23 16:08:38 -08:00
David Gibson
800fc3eeb0 [PATCH] powerpc: Remove imalloc.h
asm-ppc64/imalloc.h is only included from files in arch/powerpc/mm.
We already have a header for mm local definitions,
arch/powerpc/mm/mmu_decl.h.  Thus, this patch moves the contents of
imalloc.h into mmu_decl.h.  The only exception are the definitions of
PHBS_IO_BASE, IMALLOC_BASE and IMALLOC_END.  Those are moved into
pgtable.h, next to similar definitions of VMALLOC_START and
VMALLOC_SIZE.

Built for multiplatform 32bit and 64bit (ARCH=powerpc).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-19 14:46:02 +11:00
Linus Torvalds
d58a75ef75 Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc-merge 2005-11-16 07:58:48 -08:00
Michael Ellerman
eb481899aa [PATCH] powerpc: Fixup debugging in lmb.c
Somewhere we lost the include of udbg.h in lmb.c. While we're there, add a DBG
macro like every other file has and use it in lmb_dump_all().

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-16 13:28:40 +11:00
Paul Mackerras
fb6d73d301 [PATCH] powerpc: Fix sparsemem with memory holes [was Re: ppc64 oops..]
This patch should fix the crashes we have been seeing on 64-bit
powerpc systems with a memory hole when sparsemem is enabled.
I'd appreciate it if people who know more about NUMA and sparsemem
than me could look over it.

There were two bugs.  The first was that if NUMA was enabled but there
was no NUMA information for the machine, the setup_nonnuma() function
was adding a single region, assuming memory was contiguous.  The
second was that the loops in mem_init() and show_mem() assumed that
all pages within the span of a pgdat were valid (had a valid struct
page).

I also fixed the incorrect setting of num_physpages that Mike Kravetz
pointed out.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-15 16:57:12 -08:00
Kumar Gala
4c8d3d997e [PATCH] Update email address for Kumar
Changed jobs and the Freescale address is no longer valid.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-13 18:14:10 -08:00
Benjamin Herrenschmidt
a7f290dad3 [PATCH] powerpc: Merge vdso's and add vdso support to 32 bits kernel
This patch moves the vdso's to arch/powerpc, adds support for the 32
bits vdso to the 32 bits kernel, rename systemcfg (finally !), and adds
some new (still untested) routines to both vdso's: clock_gettime() with
support for CLOCK_REALTIME and CLOCK_MONOTONIC, clock_getres() (same
clocks) and get_tbfreq() for glibc to retreive the timebase frequency.

Tom,Steve: The implementation of get_tbfreq() I've done for 32 bits
returns a long long (r3, r4) not a long. This is such that if we ever
add support for >4Ghz timebases on ppc32, the userland interface won't
have to change.

I have tested gettimeofday() using some glibc patches in both ppc32 and
ppc64 kernels using 32 bits userland (I haven't had a chance to test a
64 bits userland yet, but the implementation didn't change and was
tested earlier). I haven't tested yet the new functions.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-11 22:25:39 +11:00
Anton Blanchard
45fb6cea09 [PATCH] ppc64: Convert NUMA to sparsemem (3)
Convert to sparsemem and remove all the discontigmem code in the
process. This has a few advantages:

- The old numa_memory_lookup_table can go away
- All the arch specific discontigmem magic can go away

We also remove the triple pass of memory properties and instead create a
list of per node extents that we iterate through. A final cleanup would
be to change our lmb code to store extents per node, then we can reuse
that information in the numa code.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-11 22:21:11 +11:00
Anton Blanchard
3e66c4def1 [PATCH] ppc64: prep for NUMA sparsemem rework 2
Remove ppc64 specific version of nr_cpus_node and use the generic one
provided.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-11 22:20:57 +11:00
Paul Mackerras
49b09853df powerpc: Move some extern declarations from C code into headers
This also make klimit have the same type on 32-bit as on 64-bit,
namely unsigned long, and defines and initializes it in one place.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-10 15:53:40 +11:00
Benjamin Herrenschmidt
87655ff268 [PATCH] powerpc: 64k pages pmd alloc fix
This patch makes the kernel use a different kmem cache for PMD pages
as they are smaller than PTE pages. Avoids waste of memory.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-10 15:03:29 +11:00
Paul Mackerras
799d6046d3 [PATCH] powerpc: merge code values for identifying platforms
This patch merges platform codes.  systemcfg->platform is no longer used,
systemcfg use in general is deprecated as much as possible (and renamed
_systemcfg before it gets completely moved elsewhere in a future patch),
_machine is now used on ppc64 along as ppc32.  Platform codes aren't gone
yet but we are getting a step closer. A bunch of asm code in head[_64].S
is also turned into C code.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-10 13:37:51 +11:00
Benjamin Herrenschmidt
77ac166fba [PATCH] ppc64: Don't panic when early __ioremap fails
Early calls to __ioremap() will panic if the hash insertion fails. This
patch makes them return NULL instead. It happens with some pSeries users
who enabled CONFIG_BOOTX_TEXT. The later is getting an incorrect address
for the fame buffer and the hash insertion fails. With this patch, it
will display an error instead of crashing at boot.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-10 11:26:06 +11:00
Mike Kravetz
dd7ccbd3ee [PATCH] Memory Add Fixes for ppc64
memmap_init_zone() sets page count to 1.  Before 'freeing' the
page, we need to clear the count.  This is the same that is done
on free_all_bootmem_core() for memory discovered at boot time.

Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-08 15:17:19 +11:00
Mike Kravetz
54b79248b2 [PATCH] revised Memory Add Fixes for ppc64
Add the create_section_mapping() routine to create hptes for memory
sections dynamically added after system boot.

Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-08 15:17:13 +11:00
Benjamin Herrenschmidt
76c8e25b90 [PATCH] ppc64: Fix the lazy icache/dcache code for non-RAM pages
For some stupid reason I can't explain (brown paper bag is at hand), I
removed the check pfn_valid() in the code that does the icache/dcache
coherency on POWER4 and later. That causes us to eventually try to
access non existing struct page when hashing in IO pages.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-08 13:07:50 +11:00
Anton Blanchard
bcb3557694 [PATCH] ppc64: fix Memory: summary line
On ppc64 we end up with a negative value for the data size in the memory
boot message:

Memory: 2035560k/2097152k available (5792k kernel code, 89564k reserved,
18014398509481632k data, 870k bss, 352k init)

It turns out the section ordering of the linker script is different on
ppc32 and ppc64, so just count data as _edata - _sdata which should work
on both.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-08 11:19:53 +11:00
Paul Mackerras
24bfb00123 Merge ../linux-2.6 2005-11-08 11:14:20 +11:00
Benjamin Herrenschmidt
863c84b97c [PATCH] ppc: Fix ppc32 build after 64K pages
Oops, some last minute changes caused the 64K pages patch to break ppc32
build, this fixes it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:23 -08:00
David Gibson
7d24f0b8a5 [PATCH] ppc64: Fix bug in SLB miss handler for hugepages
This patch, however, should be applied on top of the 64k-page-size patch to
fix some problems with hugepage (some pre-existing, another introduced by
this patch).

The patch fixes a bug in the SLB miss handler for hugepages on ppc64
introduced by the dynamic hugepage patch (commit id
c594adad56) due to a misunderstanding of the
srd instruction's behaviour (mea culpa).  The problem arises when a 64-bit
process maps some hugepages in the low 4GB of the address space (unusual).
In this case, as well as the 256M segment in question being marked for
hugepages, other segments at 32G intervals will be incorrectly marked for
hugepages.

In the process, this patch tweaks the semantics of the hugepage bitmaps to
be more sensible.  Previously, an address below 4G was marked for hugepages
if the appropriate segment bit in the "low areas" bitmask was set *or* if
the low bit in the "high areas" bitmap was set (which would mark all
addresses below 1TB for hugepage).  With this patch, any given address is
governed by a single bitmap.  Addresses below 4GB are marked for hugepage
if and only if their bit is set in the "low areas" bitmap (256M
granularity).  Addresses between 4GB and 1TB are marked for hugepage iff
the low bit in the "high areas" bitmap is set.  Higher addresses are marked
for hugepage iff their bit in the "high areas" bitmap is set (1TB
granularity).

To avoid conflicts, this patch must be applied on top of BenH's pending
patch for 64k base page size [0].  As such, this patch also addresses a
hugepage problem introduced by that patch.  That patch allows hugepages of
1MB in size on hardware which supports it, however, that won't work when
using 4k pages (4 level pagetable), because in that case hugepage PTEs are
stored at the PMD level, and each PMD entry maps 2MB.  This patch simply
disallows hugepages in that case (we can do something cleverer to re-enable
them some other day).

Built, booted, and a handful of hugepage related tests passed on POWER5
LPAR (both ARCH=powerpc and ARCH=ppc64).

[0] http://gate.crashing.org/~benh/ppc64-64k-pages.diff

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:23 -08:00
Paul Mackerras
c613523455 Merge ../linux-2.6 2005-11-07 14:42:09 +11:00
Paul Mackerras
2249ca9d60 powerpc: Various UP build fixes
Mostly this involves adding #include <asm/smp.h>, since that defines
things like boot_cpuid[_phys] and [gs]et_hard_smp_processor_id, which
are SMP-related but still needed on UP.  This incorporates fixes
posted by Olof Johansson and Heikki Lindholm.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-07 13:18:13 +11:00
David Gibson
dcad47fc42 [PATCH] powerpc: Kill ppcdebug
The ancient ppcdebug/PPCDBG mechanism is now only used in two places.
First, in the hash setup code, one of the bits allows the size of the
hash table to be reduced by a factor of 8 - which would be better
accomplished with a command line option for that purpose.  The other
was a bunch of bus walking related messages in the iSeries code, which
would seem to be insufficient reason to keep the mechanism.

This patch removes the last traces of this mechanism.

Built and booted on iSeries and pSeries POWER5 LPAR (ARCH=powerpc).

Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-07 12:37:45 +11:00
Olof Johansson
723925b7b1 [PATCH] powerpc: Nicer printing of address at oops
Add nicer printing of faulting address on unresolvable kernel faults.

Makes life a little easier for those who don't know how to decode our
register contents at oops time.

Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-07 12:37:28 +11:00
Benjamin Herrenschmidt
3c726f8dee [PATCH] ppc64: support 64k pages
Adds a new CONFIG_PPC_64K_PAGES which, when enabled, changes the kernel
base page size to 64K.  The resulting kernel still boots on any
hardware.  On current machines with 4K pages support only, the kernel
will maintain 16 "subpages" for each 64K page transparently.

Note that while real 64K capable HW has been tested, the current patch
will not enable it yet as such hardware is not released yet, and I'm
still verifying with the firmware architects the proper to get the
information from the newer hypervisors.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-06 16:56:47 -08:00
Paul Mackerras
e2f2e58e79 powerpc: import a fix from arch/ppc/mm/pgtable.c
... namely, the change to the 2-argument pte_alloc_kernel.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-31 14:40:03 +11:00
Paul Mackerras
734d652480 powerpc: apply recent changes to merged code
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-31 13:57:01 +11:00
Paul Mackerras
23fd07750a Merge ../linux-2.6 by hand 2005-10-31 13:37:12 +11:00
Paul Mackerras
cf00a8d18b powerpc: Fix bug arising from having multiple memory_limit variables
We had a static memory_limit in prom.c, and then another one defined
in setup_64.c and used in numa.c, which resulted in the kernel crashing
when mem=xxx was given on the command line.  This puts the declaration
in system.h and the definition in mem.c.  This also moves the
definition of tce_alloc_start/end out of setup_64.c.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-31 13:07:02 +11:00
Paul Mackerras
3ee1fcac33 powerpc: import a gfp_t fix to arch/powerpc/mm/pgtable_32.c
This applies the same fix as Al Viro recently made to
arch/ppc/mm/pgtable.c.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-29 22:10:38 +10:00
Roland Dreier
8b150478ae [PATCH] ppc: make phys_mem_access_prot() work with pfns instead of addresses
Change the phys_mem_access_prot() function to take a pfn instead of an
address.  This allows mmap64() to work on /dev/mem for addresses above 4G
on 32-bit architectures.  We start with a pfn in mmap_mem(), so there's no
need to convert to an address; in fact, it's actively bad, since the
conversion can overflow when the address is above 4G.

Similarly fix the ppc32 page_is_ram() function to avoid a conversion to an
address by directly comparing to max_pfn.  Working with max_pfn instead of
high_memory fixes page_is_ram() to give the right answer for highmem pages.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-29 14:25:49 +10:00
Kumar Gala
cffb09ce6b [PATCH] powerpc: Fix warning related to do_dabr
do_dabr() is not relevant on 40x or Book-E processors so dont build it

Signed-off-by: Kumar K. Gala <kumar.gala@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-27 20:51:19 +10:00
David Gibson
e37bc5df8e [PATCH] powerpc: Purge bootinfo.h
With ARCH=powerpc we assume the presence of a device tree, so we don't
require any support for the old bi_recs method of passing boot
parameters.  Likewise, we've never needed it for ppc64, but we still
had an include/asm-ppc64/bootinfo.h from which nothing was used.  This
patch removes that file, and all references to it in arch/ppc64 and
arch/powerpc.  A related, unused variable 'boot_mem_size' is also
removed from setup_32.c.  The bootinfo stuff remains in ARCH=ppc for
the time being.

Built and booted on Power5 (ARCH=ppc64 and ARCH=powerpc), built for
32-bit powermac (ARCH=powerpc and ARCH=ppc).

Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-27 20:50:46 +10:00
Paul Mackerras
fa39dc437a powerpc32: Limit memory to lowmem if !CONFIG_HIGHMEM.
This trims off the extra unusable memory from the lmb structure,
so we don't try to use it.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-26 21:54:21 +10:00
Paul Mackerras
985990137e Merge changes from linux-2.6 by hand 2005-10-22 16:51:34 +10:00
Paul Mackerras
5c8c56ebdf powerpc: Move agp_special_page export to where it is defined
... instead of exporting it in arch/*/kernel/ppc_ksyms.c.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-22 14:42:51 +10:00
Kumar Gala
b15125fa81 [PATCH] powerpc: Some more fixes to allow building for a Book-E processor
Some minor fixes that are needed if we are building for a book-e
processor.

Signed-off-by: Kumar K. Gala <kumar.gala@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-20 09:43:32 +10:00
Paul Mackerras
30cd4a4e9c powerpc: Initialize btext subsystem later, after prom_init
We were initializing the btext stuff from prom_init(), thus breaking
the rule that all communication between prom_init() and the rest of
the kernel has to be via the flattened device tree.  This removes
the btext initialization calls from prom_init() and initializes it
instead after the device tree is unflattened.  It would be nice to
do it earlier, but that needs some more infrastructure to find the
properties we need in the flattened device tree.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-17 19:20:46 +10:00
Paul Mackerras
3eac8c69d1 powerpc: Move default hash table size calculation to hash_utils_64.c
We weren't computing the size of the hash table correctly on iSeries
because the relevant code in prom.c was #ifdef CONFIG_PPC_PSERIES.
This moves the code to hash_utils_64.c, makes it unconditional, and
cleans it up a bit.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-12 16:58:53 +10:00
Stephen Rothwell
3a5f8c5f78 powerpc: make iSeries boot again
On ARCH=ppc64 we were getting htab_hash_mask recalculated
to the correct value for our particular machine by accident.
In the merge tree, that code was commented out, so htab_hash_mask
was being corrupted.

We now set ppc64_pft_size instead which gets htab_has_mask
calculated correctly for us later.  We should put an
ibm,pft-size property in the device tree at some point.

Also set -mno-minimal-toc in some makefiles.
Allow iSeries to configure PROC_DEVICETREE.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
2005-10-12 10:58:00 +10:00
Paul Mackerras
b3b8dc6c07 powerpc: Use reg.h instead of processor.h when we just want reg names
Now that the register names and bit definitions are all in reg.h,
use that instead of processor.h in assembly code in a few places.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-10 22:20:10 +10:00
Paul Mackerras
ab1f9dac6e powerpc: Merge arch/ppc64/mm to arch/powerpc/mm
This moves the remaining files in arch/ppc64/mm to arch/powerpc/mm,
and arranges that we use them when compiling with ARCH=ppc64.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-10 21:58:35 +10:00
Paul Mackerras
70d64ceaa1 powerpc: Rename files to have consistent _32/_64 suffixes
This doesn't change any code, just renames things so we consistently
have foo_32.c and foo_64.c where we have separate 32- and 64-bit
versions.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-10 21:52:43 +10:00
Paul Mackerras
7c8c6b9776 powerpc: Merge lmb.c and make MM initialization use it.
This also creates merged versions of do_init_bootmem, paging_init
and mem_init and moves them to arch/powerpc/mm/mem.c.  It gets rid
of the mem_pieces stuff.

I made memory_limit a parameter to lmb_enforce_memory_limit rather
than a global referenced by that function.  This will require some
small changes to ppc64 if we want to continue building ARCH=ppc64
using the merged lmb.c.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-06 12:23:33 +10:00
Paul Mackerras
20c8c21063 powerpc: Fixes to get the merged kernel to boot on powermac.
This merges ppc_ksyms.c, puts back the actual do_execve call in
sys_execve, makes init_MMU call find_end_of_memory rather than
ppc_md.find_end_of_memory (every platform has a device tree
with a /memory node now, right?) and fixes some problems with the
mpic initialization on newworld powermacs.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-09-28 20:28:14 +10:00
Paul Mackerras
14cf11af6c powerpc: Merge enough to start building in arch/powerpc.
This creates the directory structure under arch/powerpc and a bunch
of Kconfig files.  It does a first-cut merge of arch/powerpc/mm,
arch/powerpc/lib and arch/powerpc/platforms/powermac.  This is enough
to build a 32-bit powermac kernel with ARCH=powerpc.

For now we are getting some unmerged files from arch/ppc/kernel and
arch/ppc/syslib, or arch/ppc64/kernel.  This makes some minor changes
to files in those directories and files outside arch/powerpc.

The boot directory is still not merged.  That's going to be interesting.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-09-26 16:04:21 +10:00