The flat loader uses an architecture's flat_stack_align() to align the
stack but assumes word-alignment is enough for the data sections.
However, on the Xtensa S6000 we have registers up to 128bit width
which can be used from userspace and therefor need userspace stack and
data-section alignment of at least this size.
This patch drops flat_stack_align() and uses the same alignment that
is required for slab caches, ARCH_SLAB_MINALIGN, or wordsize if it's
not defined by the architecture.
It also fixes m32r which was obviously kaput, aligning an
uninitialized stack entry instead of the stack pointer.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Oskar Schirmer <os@emlix.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Bryan Wu <cooloney@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Cc: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Johannes Weiner <jw@emlix.com>
Acked-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The ARM SMP code wasn't properly updated for the cpumask changes, which
results in smp_timer_broadcast() broadcasting ticks to non-online CPUs.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
From: Bruce Ashfield <bruce.ashfield@windriver.com>
To fully support the armv7-a instruction set/optimizations, support
for the R_ARM_MOVW_ABS_NC and R_ARM_MOVT_ABS relocation types is
required.
The MOVW and MOVT are both load-immediate instructions, MOVW loads 16
bits into the bottom half of a register, and MOVT loads 16 bits into the
top half of a register.
The relocation information for these instructions has a full 32 bit
value, plus an addend which is stored in the 16 immediate bits in the
instruction itself. The immediate bits in the instruction are not
contiguous (the register # splits it into a 4 bit and 12 bit value),
so the addend has to be extracted accordingly and added to the value.
The value is then split and put into the instruction; a MOVW uses the
bottom 16 bits of the value, and a MOVT uses the top 16 bits.
Signed-off-by: David Borman <david.borman@windriver.com>
Signed-off-by: Bruce Ashfield <bruce.ashfield@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Kernel 2.6.30-rc1 added sys_preadv and sys_pwritev to most archs
but not ARM, resulting in
<stdin>:1421:2: warning: #warning syscall preadv not implemented
<stdin>:1425:2: warning: #warning syscall pwritev not implemented
This patch adds sys_preadv and sys_pwritev to ARM.
These syscalls simply take five long-sized parameters, so they
should have no calling-convention/ABI issues in the kernel.
Tested on armv5tel eabi using a preadv/pwritev test program posted
on linuxppc-dev earlier this month.
It would be nice to get this into the kernel before 2.6.30 final,
so that glibc's kernel version feature test for these syscalls
doesn't have to special-case ARM.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When unmapping N pages (e.g. shared memory) the amount of TLB flushes
done can be (N*PAGE_SIZE/ZAP_BLOCK_SIZE)*N although it should be N at
maximum. With PREEMPT kernel ZAP_BLOCK_SIZE is 8 pages, so there is a
noticeable performance penalty when unmapping a large VMA and the system
is spending its time in flush_tlb_range().
The problem is that tlb_end_vma() is always flushing the full VMA
range. The subrange that needs to be flushed can be calculated by
tlb_remove_tlb_entry(). This approach was suggested by Hugh Dickins,
and is also used by other arches.
The speed increase is roughly 3x for 8M mappings and for larger mappings
even more.
Signed-off-by: Aaro Koskinen <Aaro.Koskinen@nokia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This adds a SZ_32K define to the available sizes. I need it for an
upcoming platform support.
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pass the original flags to rwlock arch-code, so that it can re-enable
interrupts if implemented for that architecture.
Initially, make __raw_read_lock_flags and __raw_write_lock_flags stubs
which just do the same thing as non-flags variants.
Signed-off-by: Petr Tesarik <ptesarik@suse.cz>
Signed-off-by: Robin Holt <holt@sgi.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-arch@vger.kernel.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adds support for Faraday FA526 core. This core is used at least by:
Cortina Systems Gemini and Centroid family
Cavium Networks ECONA family
Grain Media GM8120
Pixelplus ImageARM
Prolific PL-1029
Faraday IP evaluation boards
v2:
- move TLB_BTB to separate patch
- update copyrights
Signed-off-by: Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
Now, as all places that use Scoop GPIO have been converted to use
GPIO API, drop old-style accessors completely.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
"""The Marvell® PXA168 processor is the first in a family of application
processors targeted at mass market opportunities in computing and consumer
devices. It balances high computing and multimedia performance with low
power consumption to support extended battery life, and includes a wealth
of integrated peripherals to reduce overall BOM cost .... """
See http://www.marvell.com/featured/pxa168.jsp for more information.
1. Marvell Mohawk core is a hybrid of xscale3 and its own ARM core,
there are many enhancements like instructions for flushing the
whole D-cache, and so on
2. Clock reuses Russell's common clkdev, and added the basic support
for UART1/2.
3. Devices are a bit different from the 'mach-pxa' way, the platform
devices are now dynamically allocated only when necessary (i.e.
when pxa_register_device() is called). Description for each device
are stored in an array of 'struct pxa_device_desc'. Now that:
a. this array of device description is marked with __initdata and
can be freed up system is fully up
b. which means board code has to add all needed devices early in
his initializing function
c. platform specific data can now be marked as __initdata since
they are allocated and copied by platform_device_add_data()
4. only the basic UART1/2/3 are added, more devices will come later.
Signed-off-by: Jason Chagas <chagas@marvell.com>
Signed-off-by: Eric Miao <eric.miao@marvell.com>
It would seem when building kernel modules with modern binutils
(required by modern GCC) for ARM v4T targets (specifically observed
with the Samsung 24xx SoC which is an 920T) R_ARM_V4BX relocations
are emitted for function epilogues.
This manifests at module load time with an "unknown relocation: 40"
error message.
The following patch adds the R_ARM_V4BX relocation to the ARM kernel
module loader. The relocation operation is taken from that within the
binutils bfd library.
Signed-off-by: Simtec Linux Team <linux@simtec.co.uk>
Signed-off-by: Vincent Sanders <vince@simtec.co.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
OMAP wishes to pass state to the boot loader upon reboot in order to
instruct it whether to wait for USB-based reflashing or not. There is
already a facility to do this via the reboot() syscall, except we ignore
the string passed to machine_restart().
This patch fixes things to pass this string to arch_reset(). This means
that we keep the reboot mode limited to telling the kernel _how_ to
perform the reboot which should be independent of what we request the
boot loader to do.
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Map unused registers at the end of DMA region at 64 MB to allow PCI masters
to cross the boundary when prefetching data from SDRAM.
Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
The choice is between looping over the physical range and performing
single cache line operations, or to map highmem pages somewhere, as
cache range ops are possible only on virtual addresses.
Because L2 range ops are much faster, we go with the later by factoring
the physical-to-virtual address conversion and use a fixmap entry for it
in the HIGHMEM case.
Possible future optimizations to avoid the pte setup cost:
- do the pte setup for highmem pages only
- determine a threshold for doing a line-by-line processing on physical
addresses when the range is small
Signed-off-by: Nicolas Pitre <nico@marvell.com>
If a machine class has a custom __virt_to_bus() implementation then it
must provide a __arch_page_to_dma() implementation as well which is
_not_ based on page_address() to support highmem.
This patch fixes existing __arch_page_to_dma() and provide a default
implementation otherwise. The default implementation for highmem is
based on __pfn_to_bus() which is defined only when no custom
__virt_to_bus() is provided by the machine class.
That leaves only ebsa110 and footbridge which cannot support highmem
until they provide their own __arch_page_to_dma() implementation.
But highmem support on those legacy platforms with limited memory is
certainly not a priority.
Signed-off-by: Nicolas Pitre <nico@marvell.com>
This is a helper to be used by the DMA mapping API to handle cache
maintenance for memory identified by a page structure instead of a
virtual address. Those pages may or may not be highmem pages, and
when they're highmem pages, they may or may not be virtually mapped.
When they're not mapped then there is no L1 cache to worry about. But
even in that case the L2 cache must be processed since unmapped highmem
pages can still be L2 cached.
Signed-off-by: Nicolas Pitre <nico@marvell.com>
The kmap virtual area borrows a 2MB range at the top of the 16MB area
below PAGE_OFFSET currently reserved for kernel modules and/or the
XIP kernel. This 2MB corresponds to the range covered by 2 consecutive
second-level page tables, or a single pmd entry as seen by the Linux
page table abstraction. Because XIP kernels are unlikely to be seen
on systems needing highmem support, there shouldn't be any shortage of
VM space for modules (14 MB for modules is still way more than twice the
typical usage).
Because the virtual mapping of highmem pages can go away at any moment
after kunmap() is called on them, we need to bypass the delayed cache
flushing provided by flush_dcache_page() in that case.
The atomic kmap versions are based on fixmaps, and
__cpuc_flush_dcache_page() is used directly in that case.
Signed-off-by: Nicolas Pitre <nico@marvell.com>
This is the minimum fixmap interface expected to be implemented by
architectures supporting highmem.
We have a second level page table already allocated and covering
0xfff00000-0xffffffff because the exception vector page is located
at 0xffff0000, and various cache tricks already use some entries above
0xffff0000. Therefore the PTEs covering 0xfff00000-0xfffeffff are free
to be used.
However the XScale cache flushing code already uses virtual addresses
between 0xfffe0000 and 0xfffeffff.
So this reserves the 0xfff00000-0xfffdffff range for fixmap stuff.
The Documentation/arm/memory.txt information is updated accordingly,
including the information about the actual top of DMA memory mapping
region which didn't match the code.
Signed-off-by: Nicolas Pitre <nico@marvell.com>
This patch adds a Non-cacheable Normal ARM executable memory type,
MT_MEMORY_NONCACHED.
On OMAP3, this is used for rapid dynamic voltage/frequency scaling in
the VDD2 voltage domain. OMAP3's SDRAM controller (SDRC) is in the
VDD2 voltage domain, and its clock frequency must change along with
voltage. The SDRC clock change code cannot run from SDRAM itself,
since SDRAM accesses are paused during the clock change. So the
current implementation of the DVFS code executes from OMAP on-chip
SRAM, aka "OCM RAM."
If the OCM RAM pages are marked as Cacheable, the ARM cache controller
will attempt to flush dirty cache lines to the SDRC, so it can fill
those lines with OCM RAM instruction code. The problem is that the
SDRC is paused during DVFS, and so any SDRAM access causes the ARM MPU
subsystem to hang.
TI's original solution to this problem was to mark the OCM RAM
sections as Strongly Ordered memory, thus preventing caching. This is
overkill: since the memory is marked as non-bufferable, OCM RAM writes
become needlessly slow. The idea of "Strongly Ordered SRAM" is also
conceptually disturbing. Previous LAKML list discussion is here:
http://www.spinics.net/lists/arm-kernel/msg54312.html
This memory type MT_MEMORY_NONCACHED is used for OCM RAM by a future
patch.
Cc: Richard Woodruff <r-woodruff2@ti.com>
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
There's no point these being in a generic include file when they're
only used in arch/arm/mach-rpc/dma.c.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch adds ELF section parsing for the unwinding tables in loadable
modules together with the PREL31 relocation symbol resolving.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch adds the main functionality for parsing the stack unwinding
information generated by the ARM EABI toolchains. The unwinding
information consists of an index with a pair of words per function and a
table with unwinding instructions. For more information, see "Exception
Handling ABI for the ARM Architecture" at:
http://infocenter.arm.com/help/topic/com.arm.doc.subset.swdev.abi/index.html
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
User space can request hardware and/or software time stamping.
Reporting of the result(s) via a new control message is enabled
separately for each field in the message because some of the
fields may require additional computation and thus cause overhead.
User space can tell the different kinds of time stamps apart
and choose what suits its needs.
When a TX timestamp operation is requested, the TX skb will be cloned
and the clone will be time stamped (in hardware or software) and added
to the socket error queue of the skb, if the skb has a socket
associated with it.
The actual TX timestamp will reach userspace as a RX timestamp on the
cloned packet. If timestamping is requested and no timestamping is
done in the device driver (potentially this may use hardware
timestamping), it will be done in software after the device's
start_hard_xmit routine.
Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch changes the walk_stacktrace and its callers for easier
integration of stack unwinding. The arch/arm/kernel/stacktrace.h file is
also moved to arch/arm/include/asm/stacktrace.h.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch moves code around in the arch/arm/kernel/traps.c file for
easier integration of the stack unwinding support.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The VFPv3D16 is a VFPv3 CPU configuration where only 16 double registers
are present, as the VFPv2 configuration. This patch adds the
corresponding hwcap bits so that applications or debuggers have more
information about the supported features.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch adds ptrace support for setting and getting the VFP registers
using PTRACE_SETVFPREGS and PTRACE_GETVFPREGS. The user_vfp structure
defined in asm/user.h contains 32 double registers (to cover VFPv3 and
Neon hardware) and the FPSCR register.
Cc: Paul Brook <paul@codesourcery.com>
Cc: Daniel Jacobowitz <dan@codesourcery.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
fix the following 'make headers_check' warnings:
usr/include/asm-arm/swab.h:19: include of <linux/types.h> is preferred over <asm/types.h>
usr/include/asm-arm/swab.h:25: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warnings:
usr/include/asm-arm/setup.h:17: include of <linux/types.h> is preferred over <asm/types.h>
usr/include/asm-arm/setup.h:25: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warnings:
usr/include/asm-arm/a.out.h:5: include of <linux/types.h> is preferred over <asm/types.h>
usr/include/asm-arm/a.out.h:9: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Add swab.h to kbuild.asm and remove the individual entries from
each arch, mark as unifdef as some arches have some kernel-only
bits inside.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make VMAs per mm_struct as for MMU-mode linux. This solves two problems:
(1) In SYSV SHM where nattch for a segment does not reflect the number of
shmat's (and forks) done.
(2) In mmap() where the VMA's vm_mm is set to point to the parent mm by an
exec'ing process when VM_EXECUTABLE is specified, regardless of the fact
that a VMA might be shared and already have its vm_mm assigned to another
process or a dead process.
A new struct (vm_region) is introduced to track a mapped region and to remember
the circumstances under which it may be shared and the vm_list_struct structure
is discarded as it's no longer required.
This patch makes the following additional changes:
(1) Regions are now allocated with alloc_pages() rather than kmalloc() and
with no recourse to __GFP_COMP, so the pages are not composite. Instead,
each page has a reference on it held by the region. Anything else that is
interested in such a page will have to get a reference on it to retain it.
When the pages are released due to unmapping, each page is passed to
put_page() and will be freed when the page usage count reaches zero.
(2) Excess pages are trimmed after an allocation as the allocation must be
made as a power-of-2 quantity of pages.
(3) VMAs are added to the parent MM's R/B tree and mmap lists. As an MM may
end up with overlapping VMAs within the tree, the VMA struct address is
appended to the sort key.
(4) Non-anonymous VMAs are now added to the backing inode's prio list.
(5) Holes may be punched in anonymous VMAs with munmap(), releasing parts of
the backing region. The VMA and region structs will be split if
necessary.
(6) sys_shmdt() only releases one attachment to a SYSV IPC shared memory
segment instead of all the attachments at that addresss. Multiple
shmat()'s return the same address under NOMMU-mode instead of different
virtual addresses as under MMU-mode.
(7) Core dumping for ELF-FDPIC requires fewer exceptions for NOMMU-mode.
(8) /proc/maps is now the global list of mapped regions, and may list bits
that aren't actually mapped anywhere.
(9) /proc/meminfo gains a line (tagged "MmapCopy") that indicates the amount
of RAM currently allocated by mmap to hold mappable regions that can't be
mapped directly. These are copies of the backing device or file if not
anonymous.
These changes make NOMMU mode more similar to MMU mode. The downside is that
NOMMU mode requires some extra memory to track things over NOMMU without this
patch (VMAs are no longer shared, and there are now region structs).
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Mike Frysinger <vapier.adi@gmail.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
The atomic_t type cannot currently be used in some header files because it
would create an include loop with asm/atomic.h. Move the type definition
to linux/types.h to break the loop.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The original arch/arm/include/asm/hardware/vic.h was
written for the PL190 ARM VIC implementation, and as
such does not have any information about the PL192
version.
Add details about the PL192 and PL190 specific registers
and any changes between the two units.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>