Kprobe inserts breakpoint instruction in probepoint and then jumps to
instruction slot when breakpoint is hit, the instruction slot icache must
be consistent with dcache. Here is the patch which invalidates instruction
slot icache area.
Without this patch, in some machines there will be fault when executing
instruction slot where icache content is inconsistent with dcache.
Signed-off-by: bibo,mao <bibo.mao@intel.com>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Acked-by: Keshavamurthy Anil S <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
set_wmb should not be used in the kernel because it just confuses the
code more and has no benefit. Since it is not currently used in the
kernel this patch removes it so that new code does not include it.
All archs define set_wmb(var, value) to do { var = value; wmb(); }
while(0) except ia64 and sparc which use a mb() instead. But this is
still moot since it is not used anyway.
Hasn't been tested on any archs but x86 and x86_64 (and only compiled
tested)
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch fixes several problems:
- pmac_backlight_key() is called under interrupt context, and therefore
can't use mutexes or semaphores, so defer the backlight level for
later, as it's not critical (original code by Aristeu S. Rozanski F.
<aris@valeta.org>).
- Add exports for functions that might be called from modules
- Fix Kconfig depdencies on PMAC_BACKLIGHT.
- Fix locking issues on calls from inside the driver (reported by
Aristeu S. Rozanski F., too)
- Fix wrong calculation of backlight values in some of the drivers
- Replace pmac_backlight_key_up/down by inline functions
[akpm@osdl.org: fix function prototypes]
Signed-off-by: Michael Hanselmann <linux-kernel@hansmi.ch>
Acked-by: Aristeu S. Rozanski F. <aris@valeta.org>
Acked-by: Rene Nussbaumer <linux-kernel@killerfox.forkbomb.ch>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch slightly reworks the new irq code to fix a small design error. I
removed the passing of the trigger to the map() calls entirely, it was not a
good idea to have one call do two different things. It also fixes a couple of
corner cases.
Mapping a linux virtual irq to a physical irq now does only that. Setting the
trigger is a different action which has a different call.
The main changes are:
- I no longer call host->ops->map() for an already mapped irq, I just return
the virtual number that was already mapped. It was called before to give an
opportunity to change the trigger, but that was causing issues as that could
happen while the interrupt was in use by a device, and because of the
trigger change, map would potentially muck around with things in a racy way.
That was causing much burden on a given's controller implementation of
map() to get it right. This is much simpler now. map() is only called on
the initial mapping of an irq, meaning that you know that this irq is _not_
being used. You can initialize the hardware if you want (though you don't
have to).
- Controllers that can handle different type of triggers (level/edge/etc...)
now implement the standard irq_chip->set_type() call as defined by the
generic code. That means that you can use the standard set_irq_type() to
configure an irq line manually if you wish or (though I don't like that
interface), pass explicit trigger flags to request_irq() as defined by the
generic kernel interfaces. Also, using those interfaces guarantees that
your controller set_type callback is called with the descriptor lock held,
thus providing locking against activity on the same interrupt (including
mask/unmask/etc...) automatically. A result is that, for example, MPIC's
own map() implementation calls irq_set_type(NONE) to configure the hardware
to the default triggers.
- To allow the above, the irq_map array entry for the new mapped interrupt
is now set before map() callback is called for the controller.
- The irq_create_of_mapping() (also used by irq_of_parse_and_map()) function
for mapping interrupts from the device-tree now also call the separate
set_irq_type(), and only does so if there is a change in the trigger type.
- While I was at it, I changed pci_read_irq_line() (which is the helper I
would expect most archs to use in their pcibios_fixup() to get the PCI
interrupt routing from the device tree) to also handle a fallback when the
DT mapping fails consisting of reading the PCI_INTERRUPT_PIN to know wether
the device has an interrupt at all, and the the PCI_INTERRUPT_LINE to get an
interrupt number from the device. That number is then mapped using the
default controller, and the trigger is set to level low. That default
behaviour works for several platforms that don't have a proper interrupt
tree like Pegasos. If it doesn't work for your platform, then either
provide a proper interrupt tree from the firmware so that fallback isn't
needed, or don't call pci_read_irq_line()
- Add back a bit that got dropped by my main rework patch for properly
clearing pending IPIs on pSeries when using a kexec
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
asm-powerpc/cputime.h doesn't declare jiffies64_to_cputime64() or
cputime64_sub(), and due to CONFIG_VIRT_CPU_ACCOUNTING it's not picking
up the definition from asm-generic like x86-64 & friends do.
Cc: Dave Jones <davej@redhat.com>
Cc: Andrew Morton <akpm@osdl.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
powerpc: add defconfig for Freescale MPC8349E-mITX board
powerpc: Add base support for the Freescale MPC8349E-mITX eval board
Documentation: correct values in MPC8548E SEC example node
[POWERPC] Actually copy over i8259.c to arch/ppc/syslib this time
[POWERPC] Add new interrupt mapping core and change platforms to use it
[POWERPC] Copy i8259 code back to arch/ppc
[POWERPC] New device-tree interrupt parsing code
[POWERPC] Use the genirq framework
[PATCH] genirq: Allow fasteoi handler to retrigger disabled interrupts
[POWERPC] Update the SWIM3 (powermac) floppy driver
[POWERPC] Fix error handling in detecting legacy serial ports
[POWERPC] Fix booting on Momentum "Apache" board (a Maple derivative)
[POWERPC] Fix various offb and BootX-related issues
[POWERPC] Add a default config for 32-bit CHRP machines
[POWERPC] fix implicit declaration on cell.
[POWERPC] change get_property to return void *
Accurate hard-IRQ-flags and softirq-flags state tracing.
This allows us to attach extra functionality to IRQ flags on/off
events (such as trace-on/off).
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
RWSEM_DEBUG used to be a printk based 'tracing' facility, probably used for
very early prototypes of the rwsem code. Remove it.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add the per_cpu_offset() generic method. (used by the lock validator)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This adds the new irq remapper core and removes the old one. Because
there are some fundamental conflicts with the old code, like the value
of NO_IRQ which I'm now setting to 0 (as per discussions with Linus),
etc..., this commit also changes the relevant platform and driver code
over to use the new remapper (so as not to cause difficulties later
in bisecting).
This patch removes the old pre-parsing of the open firmware interrupt
tree along with all the bogus assumptions it made to try to renumber
interrupts according to the platform. This is all to be handled by the
new code now.
For the pSeries XICS interrupt controller, a single remapper host is
created for the whole machine regardless of how many interrupt
presentation and source controllers are found, and it's set to match
any device node that isn't a 8259. That works fine on pSeries and
avoids having to deal with some of the complexities of split source
controllers vs. presentation controllers in the pSeries device trees.
The powerpc i8259 PIC driver now always requests the legacy interrupt
range. It also has the feature of being able to match any device node
(including NULL) if passed no device node as an input. That will help
porting over platforms with broken device-trees like Pegasos who don't
have a proper interrupt tree.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Adds new routines to prom_parse to walk the device-tree for interrupt
information. This includes both direct mapping of interrupts and low
level parsing functions for use with partial trees.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adapts the generic powerpc interrupt handling code, and all of
the platforms except for the embedded 6xx machines, to use the new
genirq framework.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Change the get_property() function to return a void *. This allows us
to later remove the cast done in the majority of callers.
Built for pseries, iseries, pmac32, cell, cbesim, g5, systemsim, maple,
and mpc* defconfigs
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Use the new IRQF_ constants and remove the SA_INTERRUPT define
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch implements an API whereby an application can determine the
label of its peer's Unix datagram sockets via the auxiliary data mechanism of
recvmsg.
Patch purpose:
This patch enables a security-aware application to retrieve the
security context of the peer of a Unix datagram socket. The application
can then use this security context to determine the security context for
processing on behalf of the peer who sent the packet.
Patch design and implementation:
The design and implementation is very similar to the UDP case for INET
sockets. Basically we build upon the existing Unix domain socket API for
retrieving user credentials. Linux offers the API for obtaining user
credentials via ancillary messages (i.e., out of band/control messages
that are bundled together with a normal message). To retrieve the security
context, the application first indicates to the kernel such desire by
setting the SO_PASSSEC option via getsockopt. Then the application
retrieves the security context using the auxiliary data mechanism.
An example server application for Unix datagram socket should look like this:
toggle = 1;
toggle_len = sizeof(toggle);
setsockopt(sockfd, SOL_SOCKET, SO_PASSSEC, &toggle, &toggle_len);
recvmsg(sockfd, &msg_hdr, 0);
if (msg_hdr.msg_controllen > sizeof(struct cmsghdr)) {
cmsg_hdr = CMSG_FIRSTHDR(&msg_hdr);
if (cmsg_hdr->cmsg_len <= CMSG_LEN(sizeof(scontext)) &&
cmsg_hdr->cmsg_level == SOL_SOCKET &&
cmsg_hdr->cmsg_type == SCM_SECURITY) {
memcpy(&scontext, CMSG_DATA(cmsg_hdr), sizeof(scontext));
}
}
sock_setsockopt is enhanced with a new socket option SOCK_PASSSEC to allow
a server socket to receive security context of the peer.
Testing:
We have tested the patch by setting up Unix datagram client and server
applications. We verified that the server can retrieve the security context
using the auxiliary data mechanism of recvmsg.
Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com>
Acked-by: Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (43 commits)
[POWERPC] Use little-endian bit from firmware ibm,pa-features property
[POWERPC] Make sure smp_processor_id works very early in boot
[POWERPC] U4 DART improvements
[POWERPC] todc: add support for Time-Of-Day-Clock
[POWERPC] Make lparcfg.c work when both iseries and pseries are selected
[POWERPC] Fix idr locking in init_new_context
[POWERPC] mpc7448hpc2 (taiga) board config file
[POWERPC] Add tsi108 pci and platform device data register function
[POWERPC] Add general support for mpc7448hpc2 (Taiga) platform
[POWERPC] Correct the MAX_CONTEXT definition
powerpc: minor cleanups for mpc86xx
[POWERPC] Make sure we select CONFIG_NEW_LEDS if ADB_PMU_LED is set
[POWERPC] Simplify the code defining the 64-bit CPU features
[POWERPC] powerpc: kconfig warning fix
[POWERPC] Consolidate some of kernel/misc*.S
[POWERPC] Remove unused function call_with_mmu_off
[POWERPC] update asm-powerpc/time.h
[POWERPC] Clean up it_lp_queue.h
[POWERPC] Skip the "copy down" of the kernel if it is already at zero.
[POWERPC] Add the use of the firmware soft-reset-nmi to kdump.
...
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6:
[PATCH] i386: export memory more than 4G through /proc/iomem
[PATCH] 64bit Resource: finally enable 64bit resource sizes
[PATCH] 64bit Resource: convert a few remaining drivers to use resource_size_t where needed
[PATCH] 64bit resource: change pnp core to use resource_size_t
[PATCH] 64bit resource: change pci core and arch code to use resource_size_t
[PATCH] 64bit resource: change resource core to use resource_size_t
[PATCH] 64bit resource: introduce resource_size_t for the start and end of struct resource
[PATCH] 64bit resource: fix up printks for resources in misc drivers
[PATCH] 64bit resource: fix up printks for resources in arch and core code
[PATCH] 64bit resource: fix up printks for resources in pcmcia drivers
[PATCH] 64bit resource: fix up printks for resources in video drivers
[PATCH] 64bit resource: fix up printks for resources in ide drivers
[PATCH] 64bit resource: fix up printks for resources in mtd drivers
[PATCH] 64bit resource: fix up printks for resources in pci core and hotplug drivers
[PATCH] 64bit resource: fix up printks for resources in networks drivers
[PATCH] 64bit resource: fix up printks for resources in sound drivers
[PATCH] 64bit resource: C99 changes for struct resource declarations
Fixed up trivial conflict in drivers/ide/pci/cmd64x.c (the printk that
was changed by the 64-bit resources had been deleted in the meantime ;)
Add ->retrigger() irq op to consolidate hw_irq_resend() implementations.
(Most architectures had it defined to NOP anyway.)
NOTE: ia64 needs testing. i386 and x86_64 tested.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch-queue improves the generic IRQ layer to be truly generic, by adding
various abstractions and features to it, without impacting existing
functionality.
While the queue can be best described as "fix and improve everything in the
generic IRQ layer that we could think of", and thus it consists of many
smaller features and lots of cleanups, the one feature that stands out most is
the new 'irq chip' abstraction.
The irq-chip abstraction is about describing and coding and IRQ controller
driver by mapping its raw hardware capabilities [and quirks, if needed] in a
straightforward way, without having to think about "IRQ flow"
(level/edge/etc.) type of details.
This stands in contrast with the current 'irq-type' model of genirq
architectures, which 'mixes' raw hardware capabilities with 'flow' details.
The patchset supports both types of irq controller designs at once, and
converts i386 and x86_64 to the new irq-chip design.
As a bonus side-effect of the irq-chip approach, chained interrupt controllers
(master/slave PIC constructs, etc.) are now supported by design as well.
The end result of this patchset intends to be simpler architecture-level code
and more consolidation between architectures.
We reused many bits of code and many concepts from Russell King's ARM IRQ
layer, the merging of which was one of the motivations for this patchset.
This patch:
rename desc->handler to desc->chip.
Originally i did not want to do this, because it's a big patch. But having
both "desc->handler", "desc->handle_irq" and "action->handler" caused a
large degree of confusion and made the code appear alot less clean than it
truly is.
I have also attempted a dual approach as well by introducing a
desc->chip alias - but that just wasnt robust enough and broke
frequently.
So lets get over with this quickly. The conversion was done automatically
via scripts and converts all the code in the kernel.
This renaming patch is the first one amongst the patches, so that the
remaining patches can stay flexible and can be merged and split up
without having some big monolithic patch act as a merge barrier.
[akpm@osdl.org: build fix]
[akpm@osdl.org: another build fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is a resubmit with a proper subject and with all comments addressed.
Applies cleanly to powerpc.git 649e857972
Mark
--
The todc code from arch/ppc supports many todc/rtc chips and is needed
in arch/powerpc. This patch adds the todc code to arch/powerpc.
Signed-off-by: Mark A. Greer <mgreer@mvista.com>
--
arch/powerpc/Kconfig | 7
arch/powerpc/sysdev/Makefile | 1
arch/powerpc/sysdev/todc.c | 392 ++++++++++++++++++++++++++++++++++
include/asm-powerpc/todc.h | 487 +++++++++++++++++++++++++++++++++++++++++++
4 files changed, 887 insertions(+)
--
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add Tundra Semiconductor tsi108 pci and platform device data register
function support.
Signed-off-by: Alexandre Bounine <alexandreb@tundra.com>
Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
---
Signed-off-by: Paul Mackerras <paulus@samba.org>
When we increased the address space per process to 2^44 bytes, the
number of contexts that we could actually use reduced, but we forgot
to decrease the MAX_CONTEXT definition. (Fortunately this would only
cause problems if we actually had more than 512k user processes
running.) This patch corrects the definition.
Signed-off-by: Paul Mackerras <paulus@samba.org>
* Remove duplicated cputable entry for 8641 (matches w/7448)
* Removed __init from function prototypes in mpc86xx.h
* Moved pci fixups into board specific code
* Moved mpc86xx_exclude_device to generic mpc86xx pci code
* Fixed sparse warnings in mpc86xx_smp.c
* Removed board specific header include from asm-powerpc/mpc86xx.h
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
If we ever build a combined kernel including iSeries, then this will
be needed.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
No more StudlyCaps.
Remove from a couple of places it is no longer needed.
Use C style comments.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
With this patch, kdump uses the firmware soft-reset NMI for two purposes:
1) Initiate the kdump (take a crash dump) by issuing a soft-reset.
2) Break a CPU out of a deadlock condition that is detected during kdump
processing.
When a soft-reset is initiated each CPU will enter
system_reset_exception() and set its corresponding bit in the global
bit-array cpus_in_sr then call die(). When die() finds the CPU's bit set
in cpu_in_sr crash_kexec() is called to initiate a crash dump. The first
CPU to enter crash_kexec() is called the "crashing CPU". All other CPUs
are "secondary CPUs". The secondary CPU's pass through to
crash_kexec_secondary() and sleep. The crashing CPU waits for all CPUs
to enter via soft-reset then boots the kdump kernel (see
crash_soft_reset_check())
When the system crashes due to a panic or exception, crash_kexec() is
called by panic() or die(). The crashing CPU sends an IPI to all other
CPUs to notify them of the pending shutdown. If a CPU is in a deadlock
or hung state with interrupts disabled, the IPI will not be delivered.
The result being, that the kdump kernel is not booted. This problem is
solved with the use of a firmware generated soft-reset. After the
crashing_cpu has issued the IPI, it waits for 10 sec for all CPUs to
enter crash_ipi_callback(). A CPU signifies its entry to
crash_ipi_callback() by setting its corresponding bit in the
cpus_in_crash bit array. After 10 sec, if one or more CPUs have not set
their bit in cpus_in_crash we assume that the CPU(s) is deadlocked. The
operator is then prompted to generate a soft-reset to break the
deadlock. Each CPU enters the soft reset handler as described above.
Two conditions must be handled at this point:
1) The system crashed because the operator generated a soft-reset. See
2) The system had crashed before the soft-reset was generated ( in the
case of a Panic or oops).
The first CPU to enter crash_kexec() uses the state of the kexec_lock to
determine this state. If kexec_lock is already held then condition 2 is
true and crash_kexec_secondary() is called, else; this CPU is flagged as
the crashing CPU, the kexec_lock is acquired and crash_kexec() proceeds
as described above.
Each additional CPUs responding to the soft-reset will pass through
crash_kexec() to kexec_secondary(). All secondary CPUs call
crash_ipi_callback() readying them self's for the shutdown. When ready
they clear their bit in cpus_in_sr. The crashing CPU waits in
kexec_secondary() until all other CPUs have cleared their bits in
cpus_in_sr. The kexec kernel boot is then started.
Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: David Wilder <dwilder@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add udbg hooks for the RTAS console, based on the RTAS put-term-char
and get-term-char calls. Along with my previous patches, this should
enable debugging as soon as early_init_dt_scan_rtas() is called.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Althought RTAS is instantiated when we enter the kernel, we can't actually
call into it until we know its entry point address. Currently we grab that
in rtas_initialize(), however that's quite late in the boot sequence.
To enable rtas_call() earlier, we can grab the RTAS entry etc. values while
we're scanning the flattened device tree. There's existing code to retrieve
the values from /chosen, however we don't store them there anymore, so remove
that code.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
There's no reason kexec_setup() needs to be called explicitly from
setup_system(), it can just be a regular initcall.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Initialise the ppc_md htab callbacks earlier, in the probe routines. This
allows us to call htab_finish_init() from htab_initialize(), and makes it
private to hash_utils_64.c. Move htab_finish_init() and make_bl() above
htab_initialize() to avoid forward declarations.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
During kdump boot, noticed some machines checkstop on dma protection
fault for ongoing DMA left in the first kernel. Instead of initializing
TCE entries in iommu_init() for the kdump boot, this patch fixes this
issue by walking through the each TCE table and checks whether the
entries are in use by the first kernel. If so, reserve those entries by
setting the corresponding bit in tbl->it_map such that these entries
will not be available for the kdump boot.
However it could be possible that all TCE entries might be used up due
to the driver bug that does continuous mapping. My observation is around
1700 TCE entries are used on some systems (Ex: P4) at some point of
time during kdump boot and saving dump (either write into the disk or
sending to remote machine). Hence, this patch will make sure that
minimum of 2048 entries will be available such that kdump boot could be
successful in some cases.
Signed-off-by: Haren Myneni <haren@us.ibm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
sysfs entries 'sched_mc_power_savings' and 'sched_smt_power_savings' in
/sys/devices/system/cpu/ control the MC/SMT power savings policy for the
scheduler.
Based on the values (1-enable, 0-disable) for these controls, sched groups
cpu power will be determined for different domains. When power savings
policy is enabled and under light load conditions, scheduler will minimize
the physical packages/cpu cores carrying the load and thus conserving
power(with a perf impact based on the workload characteristics... see OLS
2005 CMP kernel scheduler paper for more details..)
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Con Kolivas <kernel@kolivas.org>
Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Based on a patch series originally from Vivek Goyal <vgoyal@in.ibm.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
With this patch Kprobes now registers for page fault notifications only when
their is an active probe registered. Once all the active probes are
unregistered their is no need to be notified of page faults and kprobes
unregisters itself from the page fault notifications. Hence we will have ZERO
side effects when no probes are active.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Overloading of page fault notification with the notify_die() has performance
issues(since the only interested components for page fault is kprobes and/or
kdb) and hence this patch introduces the new notifier call chain exclusively
for page fault notifications their by avoiding notifying unnecessary
components in the do_page_fault() code path.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There are several instances of per_cpu(foo, raw_smp_processor_id()), which
is semantically equivalent to __get_cpu_var(foo) but without the warning
that smp_processor_id() can give if CONFIG_DEBUG_PREEMPT is enabled. For
those architectures with optimized per-cpu implementations, namely ia64,
powerpc, s390, sparc64 and x86_64, per_cpu() turns into more and slower
code than __get_cpu_var(), so it would be preferable to use __get_cpu_var
on those platforms.
This defines a __raw_get_cpu_var(x) macro which turns into per_cpu(x,
raw_smp_processor_id()) on architectures that use the generic per-cpu
implementation, and turns into __get_cpu_var(x) on the architectures that
have an optimized per-cpu implementation.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The floppy driver is already calling add_disk_randomness as it should, so this
was redundant.
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch contains a total rewrite of the backlight infrastructure for
portable Apple computers. Backward compatibility is retained. A sysfs
interface allows userland to control the brightness with more steps than
before. Userland is allowed to upload a brightness curve for different
monitors, similar to Mac OS X.
[akpm@osdl.org: add needed exports]
Signed-off-by: Michael Hanselmann <linux-kernel@hansmi.ch>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (139 commits)
[POWERPC] re-enable OProfile for iSeries, using timer interrupt
[POWERPC] support ibm,extended-*-frequency properties
[POWERPC] Extra sanity check in EEH code
[POWERPC] Dont look for class-code in pci children
[POWERPC] Fix mdelay badness on shared processor partitions
[POWERPC] disable floating point exceptions for init
[POWERPC] Unify ppc syscall tables
[POWERPC] mpic: add support for serial mode interrupts
[POWERPC] pseries: Print PCI slot location code on failure
[POWERPC] spufs: one more fix for 64k pages
[POWERPC] spufs: fail spu_create with invalid flags
[POWERPC] spufs: clear class2 interrupt status before wakeup
[POWERPC] spufs: fix Makefile for "make clean"
[POWERPC] spufs: remove stop_code from struct spu
[POWERPC] spufs: fix spu irq affinity setting
[POWERPC] spufs: further abstract priv1 register access
[POWERPC] spufs: split the Cell BE support into generic and platform dependant parts
[POWERPC] spufs: dont try to access SPE channel 1 count
[POWERPC] spufs: use kzalloc in create_spu
[POWERPC] spufs: fix initial state of wbox file
...
Manually resolved conflicts in:
drivers/net/phy/Makefile
include/asm-powerpc/spu.h
VGA_MAP_MEM translates to ioremap() on some architectures. It makes sense
to do this to vga_vram_base, because we're going to access memory between
vga_vram_base and vga_vram_end.
But it doesn't really make sense to map starting at vga_vram_end, because
we aren't going to access memory starting there. On ia64, which always has
to be different, ioremapping vga_vram_end gives you something completely
incompatible with ioremapped vga_vram_start, so vga_vram_size ends up being
nonsense.
As a bonus, we often know the size up front, so we can use ioremap()
correctly, rather than giving it a zero size.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
On partitioned PPC64 systems where a partition is given 1/10 of a
processor, we have seen mdelay() delaying for 10 times longer than it
should. The reason is that the generic mdelay(n) does n delays of 1
millisecond each. However, with 1/10 of a processor, we only get a
one-millisecond timeslice every 10ms. Thus each 1 millisecond delay
loop ends up taking 10ms elapsed time.
The solution is just to use the PPC64 udelay function, which uses the
timebase to ensure that the delay is based on elapsed time rather than
how much processing time the partition has been given. (Yes, the
generic mdelay uses the PPC64 udelay, but the problem is that the
start time gets reset every millisecond, and each time it gets reset
we lose another 9ms.)
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Andrew Morton <akpm@osdl.org>
Floating point exceptions should not be enabled by default,
as this setting impacts the performance on some CPUs, in
particular the Cell BE. Since the bits are inherited from
parent processes, the place to change the default is the
thread struct used for init.
glibc sets this up correctly per thread in its fesetenv
function, so user space should not be impacted by this
setting. None of the other common libc implementations
(uClibc, dietlibc, newlib, klibc) has support for fp
exceptions, so they are unlikely to be hit by this either.
There is a small risk that somebody wrote their own
application that manually sets the fpscr bits instead
of calling fesetenv, without changing the MSR bits as well.
Those programs will break with this change.
It probably makes sense to change glibc in the future
to be more clever about FE bits, so that when running
on a CPU where this is expensive, it disables exceptions
ASAP, while it keeps them enabled on CPUs where running
with exceptions on is cheaper than changing the state
often.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Avoid duplication of the syscall table for the cell platform. Based on an
idea from David Woodhouse.
Signed-off-by: Andreas Schwab <schwab@suse.de>
Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>