on ia64 thread_info is at the constant offset from task_struct and stack
is embedded into the same beast. Set __HAVE_THREAD_FUNCTIONS, made
task_thread_info() just add a constant.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
cris KSTK_EIP looked for pt_regs at the right offset but from the wrong
place - forgotten ->thread_info
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
)
From: Al Viro <viro@ftp.linux.org.uk>
task_pt_regs() needs the same offset-by-8 to match copy_thread()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
)
From: Al Viro <viro@ftp.linux.org.uk>
rename alpha_task_regs() to task_pt_regs(), switch open-coded instances
to use of the helper.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
use task_stack_page() for accesses to stack page of task in alpha-specific
parts of tree
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
use task_thread_info() for accesses to thread_info of task in arch/alpha
and include/asm-alpha
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patchset annotates arch/* uses of ->thread_info. Ones that really are about
access of thread_info of given process are simply switched to
task_thread_info(task); ones that deal with access to objects on stack are
switched to new helper - task_stack_page(). A _lot_ of the latter are
actually open-coded instances of "find where pt_regs are"; those are
consolidated into task_pt_regs(task) (many architectures actually have such
helper already).
Note that these annotations are not mandatory - any code not converted to
these helpers still works. However, they clean up a lot of places and have
actually caught a number of bugs, so converting out of tree ports would be a
good idea...
As an example of breakage caught by that stuff, see i386 pt_regs mess - we
used to have it open-coded in a bunch of places and when back in April Stas
had fixed a bug in copy_thread(), the rest had been left out of sync. That
required two followup patches (the latest - just before 2.6.15) _and_ still
had left /proc/*/stat eip field broken. Try ps -eo eip on i386 and watch the
junk...
This patch:
new helper - task_stack_page(task). Returns pointer to the memory object
containing task stack; usually thread_info of task sits in the beginning
of that object.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
)
From: Nick Piggin <nickpiggin@yahoo.com.au>
Track the last waker CPU, and only consider wakeup-balancing if there's a
match between current waker CPU and the previous waker CPU. This ensures
that there is some correlation between two subsequent wakeup events before
we move the task. Should help random-wakeup workloads on large SMP
systems, by reducing the migration attempts by a factor of nr_cpus.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
)
From: Ingo Molnar <mingo@elte.hu>
This is the latest version of the scheduler cache-hot-auto-tune patch.
The first problem was that detection time scaled with O(N^2), which is
unacceptable on larger SMP and NUMA systems. To solve this:
- I've added a 'domain distance' function, which is used to cache
measurement results. Each distance is only measured once. This means
that e.g. on NUMA distances of 0, 1 and 2 might be measured, on HT
distances 0 and 1, and on SMP distance 0 is measured. The code walks
the domain tree to determine the distance, so it automatically follows
whatever hierarchy an architecture sets up. This cuts down on the boot
time significantly and removes the O(N^2) limit. The only assumption
is that migration costs can be expressed as a function of domain
distance - this covers the overwhelming majority of existing systems,
and is a good guess even for more assymetric systems.
[ People hacking systems that have assymetries that break this
assumption (e.g. different CPU speeds) should experiment a bit with
the cpu_distance() function. Adding a ->migration_distance factor to
the domain structure would be one possible solution - but lets first
see the problem systems, if they exist at all. Lets not overdesign. ]
Another problem was that only a single cache-size was used for measuring
the cost of migration, and most architectures didnt set that variable
up. Furthermore, a single cache-size does not fit NUMA hierarchies with
L3 caches and does not fit HT setups, where different CPUs will often
have different 'effective cache sizes'. To solve this problem:
- Instead of relying on a single cache-size provided by the platform and
sticking to it, the code now auto-detects the 'effective migration
cost' between two measured CPUs, via iterating through a wide range of
cachesizes. The code searches for the maximum migration cost, which
occurs when the working set of the test-workload falls just below the
'effective cache size'. I.e. real-life optimized search is done for
the maximum migration cost, between two real CPUs.
This, amongst other things, has the positive effect hat if e.g. two
CPUs share a L2/L3 cache, a different (and accurate) migration cost
will be found than between two CPUs on the same system that dont share
any caches.
(The reliable measurement of migration costs is tricky - see the source
for details.)
Furthermore i've added various boot-time options to override/tune
migration behavior.
Firstly, there's a blanket override for autodetection:
migration_cost=1000,2000,3000
will override the depth 0/1/2 values with 1msec/2msec/3msec values.
Secondly, there's a global factor that can be used to increase (or
decrease) the autodetected values:
migration_factor=120
will increase the autodetected values by 20%. This option is useful to
tune things in a workload-dependent way - e.g. if a workload is
cache-insensitive then CPU utilization can be maximized by specifying
migration_factor=0.
I've tested the autodetection code quite extensively on x86, on 3
P3/Xeon/2MB, and the autodetected values look pretty good:
Dual Celeron (128K L2 cache):
---------------------
migration cost matrix (max_cache_size: 131072, cpu: 467 MHz):
---------------------
[00] [01]
[00]: - 1.7(1)
[01]: 1.7(1) -
---------------------
cacheflush times [2]: 0.0 (0) 1.7 (1784008)
---------------------
Here the slow memory subsystem dominates system performance, and even
though caches are small, the migration cost is 1.7 msecs.
Dual HT P4 (512K L2 cache):
---------------------
migration cost matrix (max_cache_size: 524288, cpu: 2379 MHz):
---------------------
[00] [01] [02] [03]
[00]: - 0.4(1) 0.0(0) 0.4(1)
[01]: 0.4(1) - 0.4(1) 0.0(0)
[02]: 0.0(0) 0.4(1) - 0.4(1)
[03]: 0.4(1) 0.0(0) 0.4(1) -
---------------------
cacheflush times [2]: 0.0 (33900) 0.4 (448514)
---------------------
Here it can be seen that there is no migration cost between two HT
siblings (CPU#0/2 and CPU#1/3 are separate physical CPUs). A fast memory
system makes inter-physical-CPU migration pretty cheap: 0.4 msecs.
8-way P3/Xeon [2MB L2 cache]:
---------------------
migration cost matrix (max_cache_size: 2097152, cpu: 700 MHz):
---------------------
[00] [01] [02] [03] [04] [05] [06] [07]
[00]: - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
[01]: 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
[02]: 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
[03]: 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1)
[04]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1)
[05]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1)
[06]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1)
[07]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) -
---------------------
cacheflush times [2]: 0.0 (0) 19.2 (19281756)
---------------------
This one has huge caches and a relatively slow memory subsystem - so the
migration cost is 19 msecs.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Cc: <wilder@us.ibm.com>
Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add per-arch sched_cacheflush() which is a write-back cacheflush used by
the migration-cost calibration code at bootup time.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Kevin Hilman
This patch increase available DMA-consistent memory allocated by dma_coherent_alloc(). The default remains at 2M (defined in asm/memory.h) and each platform has the ability to override in asm/arch-foo/memory.h.
Signed-off-by: Kevin Hilman <kevin@hilman.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Roman Zippel pointed out that the missing lower limit of intervals
leads to an accounting error in the overrun count. Enforce the lower
limit of intervals to resolution in the timer forwarding code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Change the storage format of the per base resolution to ktime_t to
make it easier accessible in the hrtimers code.
Change the resolution from (NSEC_PER_SEC/HZ) to TICK_NSEC as Roman
pointed out. TICK_NSEC is closer to the real resolution.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The list_head in the hrtimer structure was introduced for easy access
to the first timer with the further extensions of real high resolution
timers in mind, but it turned out in the course of development that
it is not necessary for the standard use case. Remove the list head
and access the first expiry timer by a datafield in the timer base.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Commit 5388fb1025 made signal_32.c
use discard_lazy_cpu_state, which broke ARCH=ppc because that
uses the common signal_32.c but has its own process.c. Make ARCH=ppc
use the common process.c to fix this and to reduce the amount
of duplicated code.
Signed-off-by: Paul Mackerras <paulus@samba.org>
pcibios_claim_one_bus is not needed on iSeries and phbs_remap_io can be
mode static.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
There was a function declared for CONFIG_PSERIES which no longer exists
and the two function declarations for CONFIG_ISERIES have been moved
into an include file in platforms/iseries since they are defined and
used only there.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This reverts part of "ppc64 iSeries: allow build with no PCI"
(145d01e428) which affected generic code
and applies a fix in the arch specific code.
Commit "partly merge iseries do_IRQ"
(5fee9b3b39eb55c7e3619a3b36ceeabffeb8f144) introduced iSeries_get_irq
which was only available if CONFIG_PCI is set.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Mainly just removing file names from the comments.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Also does some comment cleanups and removal of unnecessary
variables.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Heikki Lindholm pointed out that there was a potential race with the
lazy CPU state (FP, VR, EVR) stuff if preempt is enabled. The race
is that in the process of restoring FP state on sigreturn, the task
gets preempted by a user task that wants to use the FPU. It will take
an FP unavailable exception, which will write the current FPU state
to the thread_struct, overwriting the values which sigreturn has
stored. Note that this can only happen on UP since we don't implement
lazy CPU state on SMP.
The fix is to flush the lazy CPU state before updating the
thread_struct. To do this we re-use the flush_lazy_cpu_state()
function from process.c.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Remove support for obsolete hardware and cleanup.
- Remove checks for non integrated APICs
- Replace apic_write_around with apic_write.
- Remove apic_read_around
- Remove APIC version reads used by old workarounds
- Remove old workaround for Simics
- Fix indentation
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When building in a separate objtree, file names produced by BUG() & Co. can
get fairly long; printing only the first 50 characters may thus result in
(almost) no useful information. The following change makes it so that rather
the last 50 characters of the filename get printed.
Signed-Off-By: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
X86_FEATURE_K8_C was a synthetic Linux CPUID flag that was used for some
code optimizations in Opteron C stepping or later. But support for pre C
stepping optimizations has been removed, so this isn't needed anymore.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Saves about ~18K .text in defconfig
There would be more optimization potential, but that's for later.
Suggestion originally from Bill Irwin.
Fix from Andy Whitcroft.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
They used to be used by the reboot code, but not anymore.
Noticed by Jan Beulich
Cc: JBeulich@novell.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Introduce vSMP arch to the kernel.
This patch:
1. Adds CONFIG_X86_VSMP
2. Adds machine specific macros for local_irq_disabled, local_irq_enabled
and irqs_disabled
3. Writes to the vSMP CTL device to indicate kernel compiled with CONFIG_VSMP
Signed-off-by: Ravikiran Thirumalai <kiran@scalemp.com>
Signed-off-by: Shai Fultheim <shai@scalemp.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
vSMP specific alignment patch to
1. Define INTERNODE_CACHE_SHIFT for vSMP
2. Use this for alignment of critical structures
3. Use INTERNODE_CACHE_SHIFT for ARCH_MIN_TASKALIGN,
and let the slab align task_struct allocations to the internode cacheline size
4. Introduce and use ARCH_MIN_MMSTRUCT_ALIGN for mm_struct slab allocations.
Signed-off-by: Ravikiran Thirumalai <kiran@scalemp.com>
Signed-off-by: Shai Fultheim <shai@scalemp.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fixes
CC fs/nfsctl.o
In file included from include2/asm/atomic.h:427,
from /home/lsrc/quilt/linux/include/linux/file.h:8,
from /home/lsrc/quilt/linux/fs/nfsctl.c:8:
/home/lsrc/quilt/linux/include/asm-generic/atomic.h:20:5: warning: "BITS_PER_LONG" is not defined
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Move the #ifdef into the function body.
Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It was set as an NMI, but the NMI bit always forces an interrupt
to end up at vector 2. So it was never used. Remove.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It looks like the new scalable TLB flush code for x86_64 is claiming
one more IRQ vector than it actually uses.
Signed-off-by: Jason Uhlenkott <jasonuhl@sgi.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch uses a static PDA array early at boot and reallocates processor PDA
with node local memory when kmalloc is ready, just before pda_init.
The boot_cpu_pda is needed since the cpu_pda is used even before pda_init for
that cpu is called (to set the static per-cpu areas offset table etc)
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch enables early intialization of cpu_to_node.
apicid_to_node is built by reading the SRAT table, from acpi_numa_init with
ACPI_NUMA and k8_scan_nodes with K8_NUMA.
x86_cpu_to_apicid is built by parsing the ACPI MADT table, from acpi_boot_init.
We combine these two tables and setup cpu_to_node.
Early intialization helps the static per_cpu_areas in getting pages from
correct node.
Change since last release:
Do not initialize early init_cpu_to_node for faking node cases.
Patch tested on TYAN dual core 4P board with K8 only, ACPI_NUMA.
Tested on EM64T NUMA. Also tested with numa=off, numa=fake, and running
a kernel compiled with NUMA on a regular EM64 2 way SMP.
Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Passing random input values in eax to cpuid is not a good idea
because the CPU will GPF for unknown ones.
Use the correct x86-64 version that exists for a longer time too.
This also adds a memory barrier to prevent the optimizer from
reordering.
Cc: tigran@veritas.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
They already do this in hardware and the Linux algorithm
actually adds errors.
Cc: mingo@elte.hu
Cc: rohit.seth@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
cpumask.h wasn't included implicitely into proto.h in this case.
Just move it over to smp.h
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
o Apic id is in most significant 8 bits of APIC_ID register. Current code
is trying to write apic id to least significant 8 bits. This patch fixes
it.
o This fix enables booting uni kdump capture kernel on a cpu with non-zero
apic id.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This function is never used for x86_64.
Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As suggested by Linus.
This catches driver bugs that could cause corruption on IOMMU architectures.
Also I converted the BUGs to out_of_line_bug()s to save a bit
of text space.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
AK: I hacked Muli's original patch a lot and there were a lot
of changes - all bugs are probably to blame on me now.
There were also some changes in the fall back behaviour
for swiotlb - in particular it doesn't try to use GFP_DMA
now anymore. Also all DMA mapping operations use the
same core dma_alloc_coherent code with proper fallbacks now.
And various other changes and cleanups.
Known problems: iommu=force swiotlb=force together breaks
needs more testing.
This patch cleans up x86_64's DMA mapping dispatching code. Right now
we have three possible IOMMU types: AGP GART, swiotlb and nommu, and
in the future we will also have Xen's x86_64 swiotlb and other HW
IOMMUs for x86_64. In order to support all of them cleanly, this
patch:
- introduces a struct dma_mapping_ops with function pointers for each
of the DMA mapping operations of gart (AMD HW IOMMU), swiotlb
(software IOMMU) and nommu (no IOMMU).
- gets rid of:
if (swiotlb)
return swiotlb_xxx();
- PCI_DMA_BUS_IS_PHYS is now checked against the dma_ops being set
This makes swiotlb faster by avoiding double copying in some cases.
Signed-Off-By: Muli Ben-Yehuda <mulix@mulix.org>
Signed-Off-By: Jon D. Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This adds a new notifier chain that is called with IDLE_START
when a CPU goes idle and IDLE_END when it goes out of idle.
The context can be idle thread or interrupt context.
Since we cannot rely on MONITOR/MWAIT existing the idle
end check currently has to be done in all interrupt
handlers.
They were originally inspired by the similar s390 implementation.
They have a variety of applications:
- They will be needed for CONFIG_NO_IDLE_HZ
- They can be used for oprofile to fix up the missing time
in idle when performance counters don't tick.
- They can be used for better C state management in ACPI
- They could be used for microstate accounting.
This is just infrastructure so far, no users.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Whenever we see that a CPU is capable of C3 (during ACPI cstate init), we
disable local APIC timer and switch to using a broadcast from external timer
interrupt (IRQ 0).
Patch below adds the code for x86_64.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Whenever we see that a CPU is capable of C3 (during ACPI cstate init), we
disable local APIC timer and switch to using a broadcast from external timer
interrupt (IRQ 0). This is needed because Intel CPUs stop the local
APIC timer in C3. This is currently only enabled for Intel CPUs.
Patch below adds the code for i386 and also the ACPI hunk.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch is on the same lines as Zachary Amsden's i386 GDT page alignemnt
patch in -mm, but for x86_64.
Patch to align and pad x86_64 GDT on page boundries.
[AK: some minor cleanups and fixed incorrect TLS initialization
in CPU init.]
Signed-off-by: Nippun Goel <nippung@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Following kmalloc_node.
Needed for another patch to return -1 for unknown nodes in x86-64.
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: kiran@scalex86.org
Signed-off-by: Andi Kleen <ak@suse.de>
[ Changed 0 to numa_node_id() on suggestion by Christoph Lameter ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The separation of the rex64 prefix (on fxsave/fxrstor) by way of using
a semicolon resulted in the prefix not always taking effect (because
when extended registers are needed for addressing, another rex prefix
would have been generated by the compiler), thus (depending on the
build) resulting in eventually getting 32-bit saves and/or restores.
Signed-Off-By: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some people need it now on 64bit so reuse the i386 code for
x86-64. This will be also useful for future bug workarounds.
It is a bit simplified there because there is no need
to do it very early on x86-64. This means it doesn't need
early ioremap et.al. We run it as a core initcall right now.
I hope it's not needed for early setup.
I added a general CONFIG_DMI symbol in case IA64 or someone
else wants to reuse the code later too.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use single instruction for find largest set bit on x86_64.
[Updated by Jan Beulich to fix wrong asm constraints in original
patch -AK]
Cc: jbeulich@novell.com
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As discussed, the flags register on x86-64 is saved and restored by the
assembly code which sets up struct pt_regs, so we do not need to save
and restore it in the inline assembler which already informs gcc that
we're clobbering the flags. This patch has been sanity booted and works
okay here.
Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I noticed that some lowlevel send_IPI_mask helpers had a hotplug/preempt
race whereupon the cpu_online_map was read before disabling preemption;
...
cpumask_t mask = cpu_online_map;
int cpu = get_cpu();
cpu_clear(cpu, mask);
...
But then i realised that there is no need for these lowlevel functions to
be going through all this trouble when all the callers are already made
hotplug/preempt safe.
Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This
- switches the INT3 handler to run on an IST stack (to cope with
breakpoints set by a kernel debugger on places where the kernel's
%gs base hasn't been set up, yet); the IST stack used is shared with
the INT1 handler's
[AK: this also allows setting a kprobe on the interrupt/exception entry
points]
- allows nesting of INT1/INT3 handlers so that one can, with a kernel
debugger, debug (at least) the user-mode portions of the INT1/INT3
handling; the nesting isn't actively enabled here since a kernel-
debugger-free kernel doesn't need it
Signed-Off-By: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Print bits for RDTSCP, SVM, CR8-LEGACY.
Also now print power flags on i386 like x86-64 always did.
This will add a new line in the 386 cpuinfo, but that shouldn't
be an issue - did that in the past too and I haven't heard
of any breakage.
I shrunk some of the fields in the i386 cpuinfo_x86 to chars
to make up for the new int "x86_power" field. Overall it's
smaller than before.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Define it for i386 too.
This is a synthetic flag that signifies that the CPU's TSC runs
at a constant P state invariant frequency.
Fix up the logic on x86-64/i386 to set it on all known CPUs.
Use the AMD defined bit to set it on future AMD CPUs.
Cc: venkatesh.pallipadi@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Was only used by the floppy driver to work around some ancient
hardware bug that should never occur on any 64bit system.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This adjusts things so that handlers of the die() notifier will have
sufficient information about the trap currently being handled. It also
adjusts the notify_die() prototype to (again) match that of i386.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As a follow-up to the introduction of CONFIG_UNWIND_INFO, this
separates the generation of frame unwind information for x86-64 from
that of full debug information.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Frame unwind information was still incorrect for ia32_ptregs_common
(sorry, my fault), and could be improved for some of the other entry
points.
Signed-Off-By: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- Move capable() from sched.h to capability.h;
- Use <linux/capability.h> where capable() is used
(in include/, block/, ipc/, kernel/, a few drivers/,
mm/, security/, & sound/;
many more drivers/ to go)
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Uninline capable(). Saves 2K of kernel text on a generic .config, and 1K on a
tiny config. In addition it makes the use of capable more consistent between
CONFIG_SECURITY and !CONFIG_SECURITY
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When a kprobes modules is written in such a way that probes are inserted on
itself, then unload of that moudle was not possible due to reference
couning on the same module.
The below patch makes a check and incrementes the module refcount only if
it is not a self probed module.
We need to allow modules to probe themself for kprobes performance
measurements
This patch has been tested on several x86_64, ppc64 and IA64 architectures.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Clarify in comments that GFP_ATOMIC means both "don't sleep" and "use
emergency pools", hence both ALLOC_HARDER and ALLOC_HIGH.
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Noticed by Arjan originally on x86-64, then Ingo on x86, and finally me
grepping for it in the generic version.
Bad parenthesis nesting.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- Debug global var is already used inside kernel, so renamed
debug to tuner_debug for the tuner module
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
- Debug global var is already used inside kernel.
- v4l_dbg now expects the debug var
- global vars inside msp34xx renamed to msp_*
Signed-off-by: Mauro Carvalho Chehab <mchehab@brturbo.com.br>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Let's switch mutex_debug_check_no_locks_freed() to take (addr, len) as
arguments instead, since all its callers were just calculating the 'to'
address for themselves anyway... (and sometimes doing so badly).
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
PPC32 is still using asm-generic/4level-fixup.h, but asm-powerpc/page.h
was defining pud_t and pgd_t. Depending on the order in which files
got included, this could result in a compilation error. Tweak the ifdef
so that page.h doesn't try to define pud_t on ppc32 (which uses 2-level
page tables).
Signed-off-by: Paul Mackerras <paulus@samba.org>
The current ppc64 per cpu data implementation is quite slow. eg:
lhz 11,18(13) /* smp_processor_id() */
ld 9,.LC63-.LCTOC1(30) /* per_cpu__variable_name */
ld 8,.LC61-.LCTOC1(30) /* __per_cpu_offset */
sldi 11,11,3 /* form index into __per_cpu_offset */
mr 10,9
ldx 9,11,8 /* __per_cpu_offset[smp_processor_id()] */
ldx 0,10,9 /* load per cpu data */
5 loads for something that is supposed to be fast, pretty awful. One
reason for the large number of loads is that we have to synthesize 2
64bit constants (per_cpu__variable_name and __per_cpu_offset).
By putting __per_cpu_offset into the paca we can avoid the 2 loads
associated with it:
ld 11,56(13) /* paca->data_offset */
ld 9,.LC59-.LCTOC1(30) /* per_cpu__variable_name */
ldx 0,9,11 /* load per cpu data
Longer term we can should be able to do even better than 3 loads.
If per_cpu__variable_name wasnt a 64bit constant and paca->data_offset
was in a register we could cut it down to one load. A suggestion from
Rusty is to use gcc's __thread extension here. In order to do this we
would need to free up r13 (the __thread register and where the paca
currently is). So far Ive had a few unsuccessful attempts at doing that :)
The patch also allocates per cpu memory node local on NUMA machines.
This patch from Rusty has been sitting in my queue _forever_ but stalled
when I hit the compiler bug. Sorry about that.
Finally I also only allocate per cpu data for possible cpus, which comes
straight out of the x86-64 port. On a pseries kernel (with NR_CPUS == 128)
and 4 possible cpus we see some nice gains:
total used free shared buffers cached
Mem: 4012228 212860 3799368 0 0 162424
total used free shared buffers cached
Mem: 4016200 212984 3803216 0 0 162424
A saving of 3.75MB. Quite nice for smaller machines. Note: we now have
to be careful of per cpu users that touch data for !possible cpus.
At this stage it might be worth making the NUMA and possible cpu
optimisations generic, but per cpu init is done so early we have to be
careful that all architectures have their possible map setup correctly.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This stops parport from accessing nonexistent parallel ports.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch adds Kconfig entries to control the early debugging options,
currently in setup_64.c.
Doing this via Kconfig rather than #defines means you can have one source tree,
which is buildable for multiple platforms - and you can enable the correct
early debug option for each platform via .config.
I made udbg_early_init() a static inline because otherwise GCC is to daft to
optimise it away when debugging is off.
Now that we have udbg_init_rtas() we can make call_rtas_display_status* static.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cleanup asm-parisc/processor.h to use C99 initializers in
INIT_THREAD().
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Change to asm-parisc/pci.h makes the define of PCI_HOST_ADDR symmetrical
with PCI_BUS_ADDR. Also add a comment about PA_VIEW and LMMIO/ELMMIO/GMMIO.
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Define some constants for HugeTLB pages, not that parisc-linux supports
it yet.
Signed-off-by: Helge Deller <deller@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Make flush_data_cache_local, flush_instruction_cache_local and
flush_tlb_all_local take a void * so they don't have to be cast
when using on_each_cpu(). This becomes a problem when on_each_cpu
is a macro (as it is in current -mm).
Also move the prototype of flush_tlb_all_local into tlbflush.h and
remove its declaration from .c files.
Signed-off-by: Matthew Wilcox <willy@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Flag a whole bunch of things as __read_mostly on parisc. Also flag a few
branches as unlikely() and cleanup a bit of code.
Signed-off-by: Helge Deller <deller@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Call the mutex slowpath more conservatively - e.g. FRAME_POINTERS can
change the calling convention, in which case a direct branch to the
slowpath becomes illegal. Bug found by Hugh Dickins.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
->print and ->print_range are not used (and apparently never were).
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't wrap entire file in #ifdef CONFIG_NETFILTER, remove a few
unneccessary includes.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The congestion ops and af_ops in the inet_connection_sock
can be const.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch from Nicolas Pitre
Depending on your gcc version, the current C-only implementation would
produce suboptimal code, ranging from a bad register selection forcing
an additional mov instruction to a failure to merge the eor and the ror
in a single instruction. With a little help gcc always produces the
best code.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Fix a5 register corruption when processing user space signals handlers.
We need to save a5 through each contenxt change.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We're starting a number of big applications (memory footprint app.
1MByte) on our Arcturus uC5272. Therefore memory fragmentation is a
real pain for us. We've switched to uClinux-2.4.27-uc1 and found that
page_alloc2 fragments the memory heavily.
Digging into it we found a bug in the find_next_zero_bit function in the
m68knommu/bitops.h file. if the size isn't a multiple of 32 than the
upper bits of the last word to be searched should be masked. But the
functions masks the lower bits of the last word because it uses a right
shift instead of a left shift operator.
Patch submitted by Sascha Smejkal <s.smejkal@centersystems.at>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Andrew Victor
This patch adds support to the 2.6 kernel series for the Atmel
AT91RM9200 processor.
This patch is the Serial driver.
This version uses the newly re-written GPL'ed hardware headers.
Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch moves rcu_state into the rcu_ctrlblk. I think there
are no reasons why we should have 2 different variables to control
rcu state. Every user of rcu_state has also "rcu_ctrlblk *rcp" in
the parameter list.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In the process of optimising our per cpu data code, I found a ppc64
compiler bug that has been around forever. Basically the current
RELOC_HIDE can end up trashing r30. Details of the bug can be found at
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25572
This bug is present in all compilers before 4.1. It is masked by the
fact that our current per cpu data code is inefficient and causes
other loads that end up marking r30 as used.
A workaround identified by Alan Modra is to use the =r asm constraint
instead of =g.
Signed-off-by: Anton Blanchard <anton@samba.org>
[ Verified that this makes no real difference on x86[-64] */
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There's no need to guard the normalize_rt_tasks() prototype with an #ifdef
CONFIG_MAGIC_SYSRQ.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
"extern inline" doesn't make much sense.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Wrap all the code to 80 chars on a line.
`}\nelse' changed to `} else'.
Clean whitespaces in header file.
Signed-off-by: Jiri Slaby <xslaby@fi.muni.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Move some code from one place to another. Get rid of ugly ifdefs in code in
next p[patches, so here create functions and macros to enable it. Rename some
functions and align some code to 80 chars.
Signed-off-by: Jiri Slaby <xslaby@fi.muni.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The API and code have been through various bits of initial review by
serial driver people but they definitely need to live somewhere for a
while so the unconverted drivers can get knocked into shape, existing
drivers that have been updated can be better tuned and bugs whacked out.
This replaces the tty flip buffers with kmalloc objects in rings. In the
normal situation for an IRQ driven serial port at typical speeds the
behaviour is pretty much the same, two buffers end up allocated and the
kernel cycles between them as before.
When there are delays or at high speed we now behave far better as the
buffer pool can grow a bit rather than lose characters. This also means
that we can operate at higher speeds reliably.
For drivers that receive characters in blocks (DMA based, USB and
especially virtualisation) the layer allows a lot of driver specific
code that works around the tty layer with private secondary queues to be
removed. The IBM folks need this sort of layer, the smart serial port
people do, the virtualisers do (because a virtualised tty typically
operates at infinite speed rather than emulating 9600 baud).
Finally many drivers had invalid and unsafe attempts to avoid buffer
overflows by directly invoking tty methods extracted out of the innards
of work queue structs. These are no longer needed and all go away. That
fixes various random hangs with serial ports on overflow.
The other change in here is to optimise the receive_room path that is
used by some callers. It turns out that only one ldisc uses receive room
except asa constant and it updates it far far less than the value is
read. We thus make it a variable not a function call.
I expect the code to contain bugs due to the size alone but I'll be
watching and squashing them and feeding out new patches as it goes.
Because the buffers now dynamically expand you should only run out of
buffering when the kernel runs out of memory for real. That means a lot of
the horrible hacks high performance drivers used to do just aren't needed any
more.
Description:
tty_insert_flip_char is an old API and continues to work as before, as does
tty_flip_buffer_push() [this is why many drivers dont need modification]. It
does now also return the number of chars inserted
There are also
tty_buffer_request_room(tty, len)
which asks for a buffer block of the length requested and returns the space
found. This improves efficiency with hardware that knows how much to
transfer.
and tty_insert_flip_string_flags(tty, str, flags, len)
to insert a string of characters and flags
For a smart interface the usual code is
len = tty_request_buffer_room(tty, amount_hardware_says);
tty_insert_flip_string(tty, buffer_from_card, len);
More description!
At the moment tty buffers are attached directly to the tty. This is causing a
lot of the problems related to tty layer locking, also problems at high speed
and also with bursty data (such as occurs in virtualised environments)
I'm working on ripping out the flip buffers and replacing them with a pool of
dynamically allocated buffers. This allows both for old style "byte I/O"
devices and also helps virtualisation and smart devices where large blocks of
data suddenely materialise and need storing.
So far so good. Lots of drivers reference tty->flip.*. Several of them also
call directly and unsafely into function pointers it provides. This will all
break. Most drivers can use tty_insert_flip_char which can be kept as an API
but others need more.
At the moment I've added the following interfaces, if people think more will
be needed now is a good time to say
int tty_buffer_request_room(tty, size)
Try and ensure at least size bytes are available, returns actual room (may be
zero). At the moment it just uses the flipbuf space but that will change.
Repeated calls without characters being added are not cumulative. (ie if you
call it with 1, 1, 1, and then 4 you'll have four characters of space. The
other functions will also try and grow buffers in future but this will be a
more efficient way when you know block sizes.
int tty_insert_flip_char(tty, ch, flag)
As before insert a character if there is room. Now returns 1 for success, 0
for failure.
int tty_insert_flip_string(tty, str, len)
Insert a block of non error characters. Returns the number inserted.
int tty_prepare_flip_string(tty, strptr, len)
Adjust the buffer to allow len characters to be added. Returns a buffer
pointer in strptr and the length available. This allows for hardware that
needs to use functions like insl or mencpy_fromio.
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Paul Fulghum <paulkf@microgate.com>
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix typos in comments to remove kernel-doc warnings.
Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
"extern inline" doesn't make much sense.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Chipsets with PCI device ids & 0xf0 == 0x00f0 has their actual chipset type in
offset 0x1800 of the mmio space. Add support for this.
Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- remove unneeded casts
- make setcolreg return success if regno > 15, but don't do anything
- use framebuffer_alloc/framebuffer_release to allocate/free memory
Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- remove unneeded casts
- use framebuffer_alloc/framebuffer_release to allocate/free memory
- the pseudo_palette is always u32 regardless of bpp if using generic
drawing functions
Signed-off-by: Antonino Daplas <adaplas@pol.net>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
From: Bugzilla Bug 5351
"After resuming from S3 (suspended while in X), the LCD panel stays black .
However, the laptop is up again, and I can SSH into it from another
machine.
I can get the panel working again, when I first direct video output to the
CRT output of the laptop, and then back to LCD (done by repeatedly hitting
Fn+F5 buttons on the Toshiba, which directs output to either LCD, CRT or
TV) None of this ever happened with older kernels."
This bug is due to the recently added vesafb_blank() method in vesafb. It
works with CRT displays, but has a high incidence of problems in laptop
users. Since CRT users don't really get that much benefit from hardware
blanking, drop support for this.
Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The following patch (against 2.6.15-rc5-mm3) fixes a kprobes build break
due to changes introduced in the kprobe locking in 2.6.15-rc5-mm3. In
addition, the patch reverts back the open-coding of kprobe_mutex.
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently arch_remove_kprobes() is only implemented/required for x86_64 and
powerpc. All other architecture like IA64, i386 and sparc64 implementes a
dummy function which is being called from arch independent kprobes.c file.
This patch removes the dummy functions and replaces it with
#define arch_remove_kprobe(p, s) do { } while(0)
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Since Kprobes runtime exception handlers is now lock free as this code path is
now using RCU to walk through the list, there is no need for the
register/unregister{_kprobe} to use spin_{lock/unlock}_isr{save/restore}. The
serialization during registration/unregistration is now possible using just a
mutex.
In the above process, this patch also fixes a minor memory leak for x86_64 and
powerpc.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The arch specific kprobes.h files never gets included when CONFIG_KPROBES is
turned off. Hence check for CONFIG_KPROBES is not appropriate here in this
arch specific kprobes.h files.
Also the below defined function kprobes_exception_notify() is not needed when
CONFIG_KPROBES is off.
Compile tested for both CONFIG_KPROBES=y and N.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Kernel/kprobes.c defines get_insn_slot() and free_insn_slot() which are
currently required _only_ for x86_64 and powerpc (which has no-exec support).
FYI, get{free}_insn_slot() functions manages the memory page which is mapped
as executable, required for instruction emulation.
This patch moves those two functions under __ARCH_WANT_KPROBES_INSN_SLOT and
defines __ARCH_WANT_KPROBES_INSN_SLOT in arch specific kprobes.h file.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Switch clock_nanosleep to use the new nanosleep functions in hrtimer.c
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
introduce the hrtimer_nanosleep() and hrtimer_nanosleep_real() APIs. Not yet
used by any code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
hrtimer subsystem core. It is initialized at bootup and expired by the timer
interrupt, but is otherwise not utilized by any other subsystem yet.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- introduce ktime_t: nanosecond-resolution time format.
- eliminate the plain s64 scalar type, and always use the union.
This simplifies the arithmetics. Idea from Roman Zippel.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
add timespec_valid(ts) [returns false if the timespec is denorm]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
add const arguments to the posix-timers.h API functions
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
style and whitespace cleanup of the rest of time.h.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
clean up the CLOCK_ portions of time.h
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
add 'const' to mktime arguments, and clean it up a bit
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
mktime() and set_normalized_timespec() are large inline functions used in many
places: deinline them.
From: George Anzinger, off-by-1 bugfix
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
move div_long_long_rem() from jiffies.h into a new calc64.h include file, as
it is a general math function useful for other things than the jiffy code.
Convert it to an inline function
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Most arches copied the i386 ioctl.h. Combine them into a generic header.
Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Turn noatime and nodiratime into per-mount instead of per-sb flags.
After all the preparations this is a rather trivial patch. The mount code
needs to treat the two options as per-mount instead of per-superblock, and
touch_atime needs to be changed to check the new MNT_ flags in addition to
the MS_ flags that are kept for filesystems that are always
noatime/nodiratime but not user settable anymore. Besides that core code
only nfs needed an update because it's leaving atime updates to the server
and thus sets the S_NOATIME flag on every inode, but needs to know whether
it's a real noatime mount for an getattr optimization.
While we're at it I've killed the IS_NOATIME/IS_NODIRATIME macros that were
only used by touch_atime.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Now that all these entries in the arch ioctl32.c files are gone [1], we can
build fs/compat_ioctl.c as a normal object and kill tons of cruft. We need a
special do_ioctl32_pointer handler for s390 so the compat_ptr call is done.
This is not needed but harmless on all other architectures. Also remove some
superflous includes in fs/compat_ioctl.c
Tested on ppc64.
[1] parisc still had it's PPP handler left, which is not fully correct
for ppp and besides that ppp uses the generic SIOCPRIV ioctl so it'd
kick in for all netdevice users. We can introduce a proper handler
in one of the next patch series by adding a compat_ioctl method to
struct net_device but for now let's just kill it - parisc doesn't
compile in mainline anyway and I don't want this to block this
patchset.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@debian.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch __deprecated_for_modules the lookup_hash() prototype.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
All callers use touch_atime now which takes a vfsmount and allows us to
implement per-mount noatime.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
To allow various options to work per-mount instead of per-sb we need a
struct vfsmount when updating ctime and mtime. This preparation patch
replaces the inode_update_time routine with a file_update_atime routine so
we can easily get at the vfsmount. (and the file makes more sense in this
context anyway). Also get rid of the unused second argument - we always
want to update the ctime when calling this routine.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
)
From: Christoph Hellwig <hch@lst.de>
The xattr code has rather complex permission checks because the rules are very
different for different attribute namespaces. This patch moves as much as we
can into the generic code. Currently all the major disk based filesystems
duplicate these checks, while many minor filesystems or network filesystems
lack some or all of them.
To do this we need defines for the extended attribute names in common code, I
moved them up from JFS which had the nicest defintions.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add vfs_getxattr, vfs_setxattr and vfs_removexattr helpers for common checks
around invocation of the xattr methods. NFSD already was missing some of the
checks and there will be more soon.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: James Morris <jmorris@namei.org>
(James, I haven't touched selinux yet because it's doing various odd things
and I'm not sure how it would interact with the security attribute fallbacks
you added. Could you investigate whether it could use vfs_getxattr or if not
add a __vfs_getxattr helper to share the bits it is fine with?)
For NFSv4: instead of just converting it add an nfsd_getxattr helper for the
code shared by NFSv2/3 and NFSv4 ACLs. In fact that code isn't even
NFS-specific, but I'll wait for more users to pop up first before moving it to
common code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
)
From: Vivek Goyal <vgoyal@in.ibm.com>
- In some cases, the number of segments, on a kexec load, exceeds the
existing cap of 8. This patch increases the KEXEC_SEGMENT_MAX limit from 8
to 16.
Signed-off-by: Rachita Kothiyal <rachita@in.ibm.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- Saving the cpu registers of all cpus before booting in to the crash
kernel.
- crash_setup_regs will save the registers of the cpu on which panic has
occured. One of the concerns ppc64 folks raised is that after capturing the
register states, one should not pop the current call frame and push new one.
Hence it has been inlined. More call frames later get pushed on to stack
(machine_crash_shutdown() and machine_kexec()), but one will not want to
backtrace those.
- Not very sure about the CFI annotations. With this patch I am getting
decent backtrace with gdb. Assuming, compiler has generated enough
debugging information for crash_kexec(). Coding crash_setup_regs() in pure
assembly makes it tricky because then it can not be inlined and we don't
want to return back after capturing register states we don't want to pop
this call frame.
- Saving the non-panicing cpus registers will be done in the NMI handler
while shooting down them in machine_crash_shutdown.
- Introducing CRASH_DUMP option in Kconfig for x86_64.
Signed-off-by: Murali M Chakravarthy <muralim@in.ibm.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
)
From: Vivek Goyal <vgoyal@in.ibm.com>
- This patch introduces the memmap option for x86_64 similar to i386.
- memmap=exactmap enables setting of an exact E820 memory map, as specified
by the user.
Changes in this version:
- Used e820_end_of_ram() to find the max_pfn as suggested by Andi kleen.
- removed PFN_UP & PFN_DOWN macros
- Printing the user defined map also.
Signed-off-by: Murali M Chakravarthy <muralim@in.ibm.com>
Signed-off-by: Hariprasad Nellitheertha <nharipra@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
)
From: Vivek Goyal <vgoyal@in.ibm.com>
crash_setup_regs() is an architecture dependent function which is called in
architecture independent section. So every architecture supporting kexec
should at least provide a dummy definition of crash_setup_regs() even if
crash dumping is not implemented yet, to avoid build failures.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- If system panics then cpu register states are captured through funciton
crash_get_current_regs(). This is not a inline function hence a stack frame
is pushed on to the stack and then cpu register state is captured. Later
this frame is popped and new frames are pushed (machine_kexec).
- In theory this is not very right as we are capturing register states for a
frame and that frame is no more valid. This seems to have created back
trace problems for ppc64.
- This patch fixes it up. The very first thing it does after entering
crash_kexec() is to capture the register states. Anyway we don't want the
back trace beyond crash_kexec(). crash_get_current_regs() has been made
inline
- crash_setup_regs() is the top architecture dependent function which should
be responsible for capturing the register states as well as to do some
architecture dependent tricks. For ex. fixing up ss and esp for i386.
crash_setup_regs() has also been made inline to ensure no new call frame is
pushed onto stack.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- In case of system crash, current state of cpu registers is saved in memory
in elf note format. So far memory for storing elf notes was being allocated
statically for NR_CPUS.
- This patch introduces dynamic allocation of memory for storing elf notes.
It uses alloc_percpu() interface. This should lead to better memory usage.
- Introduced based on Andi Kleen's and Eric W. Biederman's suggestions.
- This patch also moves memory allocation for elf notes from architecture
dependent portion to architecture independent portion. Now crash_notes is
architecture independent. The whole idea is that size of memory to be
allocated per cpu (MAX_NOTE_BYTES) can be architecture dependent and
allocation of this memory can be architecture independent.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
)
From: Adrian Bunk <bunk@stusta.de>
- create one common dump_thread() prototype in kernel.h
- dump_thread() is only used in fs/binfmt_aout.c and can therefore be
removed on all architectures where CONFIG_BINFMT_AOUT is not
available
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add list_for_each_entry_safe_reverse() to linux/list.h
This is needed by unmerged cachefs and be an as-yet-unreviewed
device_shutdown() fix.
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Patrick Mochel <mochel@digitalimplant.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A simple driver for the CS5535 and CS5536 that allows a user-space program
to manipulate GPIO pins. The CS5535/CS5536 chips are Geode processor
companion devices.
Signed-off-by: Ben Gardner <bgardner@wabtec.com>
Signed-off-by: Richard Knutsson <ricknu-0@student.ltu.se>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add a node_guid field to struct ib_device. It is the responsibility
of the low-level driver to initialize this field before registering a
device with the midlayer. Convert everyone to looking at this field
instead of calling ib_query_device() when all they want is the node
GUID, and remove the node_guid field from struct ib_device_attr.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Gcc has a tradition of misscompiling the previous construct using the
address of a label as argument to inline assembler. Gas otoh has the
annoying difference between la and dla which are only usable for 32-bit
rsp. 64-bit code, so can't be used without conditional compilation.
The alterantive is switching the assembler to 64-bit code which happens
to work right even for 32-bit code ...
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
local_irq_restore uses di which saves the whole status content, not
just the IE bit resulting in local_irq_restore() to fail. This only
happens if both CONFIG_CPU_MIPSR2 and CONFIG_IRQ_CPU are enabled.
Signed-off-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
dump_regs() is used by a bunch of drivers for their internal stuff;
renamed mips instance (one that is seen in system-wide headers)
to elf_dump_regs()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
USB OpenHCI host controller on Au1550 only decodes memory addresses from
0x14020000 to 0x1407FFFF according to the databook, which gives 0x60000
(on the prior Au1x00 chips the map size was 1MB).
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Acked-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
mdelay(1) (i.e. udelay(1000)) does not work correctly due to overflow.
1000 * 0x004189374BC6A7f0 = 0x10000000000000180 (>= 2**64)
0x004189374BC6A7ef (0x004189374BC6A7f0 - 1) is OK and it is exactly
same as catchall case (0x8000000000000000UL / (500000 / HZ)).
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
used_dsp was meant to be used like used_math - but since the FPU context
is small and lazy context switching is a stupid idea on multiprocessors
this idea only got halfway implemented and those bits are were now
breaking ptrace.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The previous change by Kumar Gala in this area led to legacy_serial.c
and udbg_16550.c being built as modules when CONFIG_SERIAL_8250=m.
Fix this by introducing a new symbol, CONFIG_PPC_UDBG_16550, to
control whether these files get built, and arrange for it to be selected
for those platforms that need it.
Signed-off-by: Paul Mackerras <paulus@samba.org>