Christoph Lameter demonstrated very poor scalability on the SGI 512-way, with
a many-threaded application which concurrently initializes different parts of
a large anonymous area.
This patch corrects that, by using a separate spinlock per page table page, to
guard the page table entries in that page, instead of using the mm's single
page_table_lock. (But even then, page_table_lock is still used to guard page
table allocation, and anon_vma allocation.)
In this implementation, the spinlock is tucked inside the struct page of the
page table page: with a BUILD_BUG_ON in case it overflows - which it would in
the case of 32-bit PA-RISC with spinlock debugging enabled.
Splitting the lock is not quite for free: another cacheline access. Ideally,
I suppose we would use split ptlock only for multi-threaded processes on
multi-cpu machines; but deciding that dynamically would have its own costs.
So for now enable it by config, at some number of cpus - since the Kconfig
language doesn't support inequalities, let preprocessor compare that with
NR_CPUS. But I don't think it's worth being user-configurable: for good
testing of both split and unsplit configs, split now at 4 cpus, and perhaps
change that to 8 later.
There is a benefit even for singly threaded processes: kswapd can be attacking
one part of the mm while another part is busy faulting.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Convert those few architectures which are calling pud_alloc, pmd_alloc,
pte_alloc_map on a user mm, not to take the page_table_lock first, nor drop it
after. Each of these can continue to use pte_alloc_map, no need to change
over to pte_alloc_map_lock, they're neither racy nor swappable.
In the sparc64 io_remap_pfn_range, flush_tlb_range then falls outside of the
page_table_lock: that's okay, on sparc64 it's like flush_tlb_mm, and that has
always been called from outside of page_table_lock in dup_mmap.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Deepak Saxena
This patch adds support for 36-bit static mapped I/O. While there
are no platforms in the tree ATM that use it, it has been tested
tested on the IXP2350 NPU and I would like to get the support for
that chipset upstream one piece at a time. There are also other
Intel chipset ports in development that are waiting on this to go
upstream.
The patch replaces the print formats for physical addresses with
%016llx which will create a bit extraneous output on 32-bit systems,
but I think that is cleaner than having #ifdefs, specially since
users will only see the output in error cases.
Depends on 3016/1.
Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Deepak Saxena
Convert map_desc.physical to map_desc.pfn. This allows us to add
support for 36-bit addressed physical devices in the static maps
without having to resort to u64 variables.
Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Make ARM independent of the way bootmem operates internally. We
now map each node as we initialise it, and place the bootmem bitmap
inside each node, rather than all in the first node.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We weren't explicitly setting the page table bits we desired
in user_prot in the protection table, which resulted in the
user mappings for v6 CPUs being marked global.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
No point checking what CPU architecture level we have each time
within the loop, so precompute the base PMD flags outside the
loop.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Deepak Saxena
Working on adding support for 36-bit static mappings for ARMv6 and
Intel's XSC3 core and noticed that alloc_init_supersection currently
increments the phys addr by 1MB on each of the 16 iterations and then
forces alignment to supersection size (16MB). This is really uneeded
b/c we have already forced the phys address to be 16MB aligned in
create_mapping(). Furthermore, this breaks 36-bit addressing b/c bits
[23:20] of the PMD contain bits [35:32] of the physical address and
the masking causes us to loose those bits thus ending up with an
incorrect virt -> phys translation. The other option is to have an
alloc_init_supersection36.
Tested on Intel IXP2350 CPU with 36-bit static I/O mappings.
Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Unfortunately, we can't use the "user" bit in the page tables to
control whether a page table entry is "global" or "asid" specific,
since the vector page is mapped as "user" accessible but is not
process specific.
Therefore, give direct control of the ARMv6 "nG" (not global)
bit to the mm layers.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
ARMv6 introduces memory types into the page tables. Mark devices
mappings with the "shared device" memory type.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Deepak Saxena
The code in mm-armv.c checks for the condition (cpu_architecture()<= ARMv5)
in a few places but should be checking for ARMv5TEJ as the MMU is shared
across all v5 variations.
Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
It doesn't make sense for this to be in mm-armv.c now that 26-bit
ARM support is no longer integrated into arch/arm.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
It doesn't make sense to have the PGD kernel pointers initialisation
separate from the PGD user pointers, especially when we clean the
data cache over the whole range.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add pmd_off() and pmd_off_k() to obtain the pmd pointer for a
virtual address, and use them throughout the mm initialisation.
Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Patch from George G. Davis
This patch is required for kernel XIP support on ARMv6 machines. It ensures that the access permission bits for kernel XIP section descriptors are APX=1 and AP[1:0]=01, which is Kernel read-only/User no access permissions. Prior to this change, kernel XIP section descriptor access permissions were set to Kernel no access/User no access on ARMv6 machines and the kernel would therefore hang upon entry to userspace when set_fs(USER_DS) was executed.
Signed-off-by: Steve Longerbeam
Signed-off-by: George G. Davis
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!