Commit Graph

1358 Commits

Author SHA1 Message Date
Matt Helsley
dc52ddc0e6 container freezer: implement freezer cgroup subsystem
This patch implements a new freezer subsystem in the control groups
framework.  It provides a way to stop and resume execution of all tasks in
a cgroup by writing in the cgroup filesystem.

The freezer subsystem in the container filesystem defines a file named
freezer.state.  Writing "FROZEN" to the state file will freeze all tasks
in the cgroup.  Subsequently writing "RUNNING" will unfreeze the tasks in
the cgroup.  Reading will return the current state.

* Examples of usage :

   # mkdir /containers/freezer
   # mount -t cgroup -ofreezer freezer  /containers
   # mkdir /containers/0
   # echo $some_pid > /containers/0/tasks

to get status of the freezer subsystem :

   # cat /containers/0/freezer.state
   RUNNING

to freeze all tasks in the container :

   # echo FROZEN > /containers/0/freezer.state
   # cat /containers/0/freezer.state
   FREEZING
   # cat /containers/0/freezer.state
   FROZEN

to unfreeze all tasks in the container :

   # echo RUNNING > /containers/0/freezer.state
   # cat /containers/0/freezer.state
   RUNNING

This is the basic mechanism which should do the right thing for user space
task in a simple scenario.

It's important to note that freezing can be incomplete.  In that case we
return EBUSY.  This means that some tasks in the cgroup are busy doing
something that prevents us from completely freezing the cgroup at this
time.  After EBUSY, the cgroup will remain partially frozen -- reflected
by freezer.state reporting "FREEZING" when read.  The state will remain
"FREEZING" until one of these things happens:

	1) Userspace cancels the freezing operation by writing "RUNNING" to
		the freezer.state file
	2) Userspace retries the freezing operation by writing "FROZEN" to
		the freezer.state file (writing "FREEZING" is not legal
		and returns EIO)
	3) The tasks that blocked the cgroup from entering the "FROZEN"
		state disappear from the cgroup's set of tasks.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: export thaw_process]
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Acked-by: Serge E. Hallyn <serue@us.ibm.com>
Tested-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20 08:52:34 -07:00
Joerg Roedel
0fcff28f47 sparc64: use iommu_num_pages function in IOMMU code
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:33 -07:00
Joerg Roedel
a7375762a5 sparc64: rename iommu_num_pages function to iommu_nr_pages
This is a preparation patch for introducing a generic iommu_num_pages function.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:33 -07:00
Christoph Hellwig
b418da16dd compat: generic compat get/settimeofday
Nothing arch specific in get/settimeofday.  The details of the timeval
conversion varied a little from arch to arch, but all with the same
results.

Also add an extern declaration for sys_tz to linux/time.h because externs
in .c files are fowned upon.  I'll kill the externs in various other files
in a sparate patch.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net> [ sparc bits ]
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:33 -07:00
Christoph Hellwig
f7a5000f7a compat: move cp_compat_stat to common code
struct stat / compat_stat is the same on all architectures, so
cp_compat_stat should be, too.

Turns out it is, except that various architectures have slightly and some
high2lowuid/high2lowgid or the direct assignment instead of the
SET_UID/SET_GID that expands to the correct one anyway.

This patch replaces the arch-specific cp_compat_stat implementations with
a common one based on the x86-64 one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net> [ sparc bits ]
Acked-by: Kyle McMartin <kyle@mcmartin.ca> [ parisc bits ]
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:33 -07:00
Dominik Brodowski
459fc208ab cpufreq: remove policy->governor setting in drivers initialization
As policy->governor is already set to CPUFREQ_DEFAULT_GOVERNOR in the
(always built-in) cpufreq core, we do not need to set it in the drivers.
This fixes the sparc64 allmodconfig build failure.

Also, remove a totally useles setting of ->policy in cpufreq-pxa3xx.c.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-15 16:42:47 -07:00
David S. Miller
615c9136b3 chmc: Mark %ver register inline asm with __volatile__
Otherwise GCC can try to do the register read before the guarding test
on us3mc_platform() being true.

If that happens we can take an exception, because %ver register reads
are not allowed in privileged more on hypervisor platforms.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-12 23:56:12 -07:00
David S. Miller
82960b8543 sparc64: Add missing notify_cpu_starting() call.
Commit e545a6140b ("kernel/cpu.c: create
a CPU_STARTING cpu_chain notifier") added a notify_cpu_starting()
notifier event, and hit every arch except sparc64.

Fix that missed case.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-12 23:55:47 -07:00
David S. Miller
56c5d900db Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	sound/core/memalloc.c
2008-10-11 12:39:35 -07:00
David S. Miller
44b50e5a1a sparc64: Fix missing devices due to PCI bridge test in of_create_pci_dev().
Just like in the arch/sparc64/kernel/of_device.c code fix commit
071d7f4c3b411beae08d27656e958070c43b78b4 ("sparc64: Fix SMP bootup
with CONFIG_STACK_DEBUG or ftrace.") we have to check the OF device
node name for "pci" instead of relying upon the 'device_type' property
being there on all PCI bridges.

Tested by Meelis Roos, and confirmed to make the PCI QFE devices
reappear on the E3500 system.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-22 15:51:54 -07:00
David S. Miller
7ee766d8fb sparc64: Fix disappearing PCI devices on e3500.
Based upon a bug report by Meelis Roos.

The OF device layer builds properties by matching bus types and
applying 'range' properties as appropriate, up to the root.

The match for "PCI" busses is looking at the 'device_type' property,
and this does work %99 of the time.

But on an E3500 system with a PCI QFE card, the DEC 21153 bridge
sitting above the QFE network interface devices has a 'name' of "pci",
but it completely lacks a 'device_type' property.  So we don't match
it as a PCI bus, and subsequently we end up with no resource values at
all for the devices sitting under that DEC bridge.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-20 22:00:40 -07:00
David S. Miller
2e57572a50 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
Conflicts:

	arch/sparc64/kernel/pci_psycho.c
2008-09-16 14:11:43 -07:00
David S. Miller
9843099ff4 sparc64: Fix SMP bootup with CONFIG_STACK_DEBUG or ftrace.
Based upon a report by Meelis Roos.

Any function call can try to access the current
thread register via the _mcount hooks when the kernel
is built with -pg (via ftrace or STACK_DEBUG).

That can't be setup properly very early on during
the bootup of other cpus for sun4u and some early
sun4v systems.

So add notrace markers to these specific functions, so
that _mcount doesn't get invoked too early.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-16 11:44:00 -07:00
David S. Miller
f948cc6ab9 sparc64: Fix OOPS in psycho_pcierr_intr_other().
We no longer put the top-level PCI controller device into the
PCI layer device list.  So pbm->pci_bus->self is always NULL.

Therefore, use direct PCI config space accesses to get at
the PCI controller's PCI_STATUS register.

Tested by Meelis Roos.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-16 09:53:42 -07:00
David S. Miller
7d4ee289d1 sparc: Fix user_regset 'n' field values.
As noticed by Russell King, we were not setting this properly
to the number of entries, but rather the total size.

This results in the core dumping code allocating waayyyy too
much memory.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-12 15:55:44 -07:00
David S. Miller
80a56ab626 sparc64: Fix PCI error interrupt registry on PSYCHO.
We need to pass IRQF_SHARED, otherwise we get things like:

IRQ handler type mismatch for IRQ 33
current handler: PSYCHO_UE
Call Trace:
 [000000000048394c] request_irq+0xac/0x120
 [00000000007c5f6c] psycho_scan_bus+0x98/0x158
 [00000000007c2bc0] pcibios_init+0xdc/0x12c
 [0000000000426a5c] do_one_initcall+0x1c/0x160
 [00000000007c0180] kernel_init+0x9c/0xfc
 [0000000000427050] kernel_thread+0x30/0x60
 [00000000006ae1d0] rest_init+0x10/0x60

on e3500 and similar systems.

On a single board, the UE interrupts of two Psycho nodes
are funneled through the same interrupt, from of_debug=3
dump:

/pci@b,4000: direct translate 2ee --> 21
 ...
/pci@b,2000: direct translate 2ee --> 21

Decimal "33" mentioned above is the hex "21" mentioned here.

Thanks to Meelis Roos for dumps and testing.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-12 15:13:15 -07:00
David S. Miller
3c50370103 sparc: Fix user_regset 'n' field values.
As noticed by Russell King, we were not setting this properly
to the number of entries, but rather the total size.

This results in the core dumping code allocating waayyyy too
much memory.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-12 15:01:31 -07:00
David S. Miller
3ab5827eb0 sparc64: Fix sparse warnings in chmc.c
Several constants are larger than 32-bit and need "UL" markers.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-12 00:22:42 -07:00
David S. Miller
af1ee569d3 sparc64: Kill sparse warnings in mm/init.h
1) Several exported symbols need extern decls, they are exported
   not for C code but for assembler routines.

2) PAGE_EXEC isn't used, delete

3) Several larger than 32-bit constants need "UL" markers

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-12 00:19:21 -07:00
David S. Miller
b539c46766 sparc64: Fix sparse warnings in fault.c
1) set_brkpt() is referenced by nothing and hasn't been used by anyone
   to my knowledge for many many years.  So just delete it.

2) add extern decl for do_sparc64_fault() in asm/pgtable_64.h

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-12 00:10:32 -07:00
David S. Miller
a9e7bb0410 sparc64: Remove explicit initialization of mmu_gathers
This was just needed to work around an ancient gcc bug that
we don't care about any more.

It was also causing a sparse warnings:

arch/sparc64/mm/tlb.c:22:52: warning: Using plain integer as NULL pointer

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-12 00:06:58 -07:00
David S. Miller
7694b024f1 sparc64: Fix sparse warnings in vio.c
Several variables should be marked static.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-12 00:04:33 -07:00
David S. Miller
8d2aec5123 sparc64: Fix sparse warnings in pci_sun4v.c
'err' variable shadowing in pci_sun4v_probe()

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-12 00:01:03 -07:00
David S. Miller
77d10d0e63 sparc64: Fix sparse warnings in pci.c
1) Declare pci_poke_* in pci_impl.h
2) of_create_pci_dev() should be static
3) ->setup_msi_irq() wants an unsigned int pointer not a plain
   int one
4) void value expression return in arch_teardown_msi_irq()

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-11 23:57:40 -07:00
David S. Miller
21cd883393 sparc64: Fix sparse warnings in of_device.c
Passing unsigned int pointer where plain int pointer is
expected.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-11 23:53:41 -07:00
David S. Miller
c91e2ecad0 sparc64: Fix sparse warnings in prom.c
1) Testing null with '0'
2) returning void-valued expression

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-11 23:52:35 -07:00
David S. Miller
7e0b1e6186 sparc64: Fix sparse warnings in visemul.c
1) edge8 tables should be static
2) add vis_emul() extern decl. to asm/visasm.h

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-11 23:46:40 -07:00
David S. Miller
d8ada0a2cd sparc64: Fix sparse warnings in kernel/time.c
1) Using "clock" as a local variable shadows a global variable of
   the same name declared in linux/clocksource.h

2) rtc_cmos_resource should be static

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-11 23:39:11 -07:00
David S. Miller
17f04fbb0f sysctl: Use header file for sysctl knob declarations on sparc.
This also takes care of a sparse warning as scons_pwroff's definition
point.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-11 23:33:53 -07:00
David S. Miller
8f20b20de7 sparc64: Fix sparse warnings in global reg snapshotting.
Lots of shadowed local variables and global_reg_snapshot[] needs
an extern declaration in asm/ptrace_64.h.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-11 23:19:22 -07:00
David S. Miller
4845afac95 sparc64: Add __arch64__ to CHECKFLAGS
Otherwise sparse doesn't work.  The 32 vs. 64 header ifdef
used under arch/sparc/include/asm/ is:

#if defined(__sparc__) && defined(__arch64__)

And that doesn't work for sparse unless we give it __arch64__

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-11 23:14:52 -07:00
David S. Miller
87395fc678 sparc64: Kill hand-crafted I/O accessors in PCI controller drivers.
Use existing upa_{read,write}q() interfaces instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-10 23:14:46 -07:00
David S. Miller
e6e003720f sparc64: Commonize large portions of PSYCHO error handling.
The IOMMU and streaming cache error interrogation is moved here
as well as the PCI error interrupt handler.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-10 23:11:58 -07:00
David S. Miller
1c03a55cdf sparc64: Create and use psycho_pbm_init_common().
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-10 23:11:57 -07:00
David S. Miller
a21cff3e5e sparc64: Start commonizing code common between SABRE and PSYCHO.
These are very similar chips, in fact they are identical in some
macro blocks.

So start commonizing code which they can share.  We begin with
the IOMMU initialization sequence.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-10 23:11:56 -07:00
David S. Miller
22fecbae44 sparc64: Record OF device instead of device node pointer in pci_pbm_info.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-10 23:07:59 -07:00
David S. Miller
d3ae4b5bc7 sparc64: Get rid of pci_controller_info.
It is just used as a parent to encapsulate two PBM objects.

But that layout is only really relevant and necessary for
psycho PCI controllers, which unlike all the others share
a single IOMMU instance between sibling PCI busses.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-10 23:07:41 -07:00
David S. Miller
ebfb2c6340 sparc64: Fix interrupt register calculations on Psycho and Sabre.
Use the IMAP offset calculation for OBIO devices as documented in the
programmer's manual.  Which is "0x10000 + ((ino & 0x1f) << 3)"

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-10 14:08:27 -07:00
David S. Miller
90158d84eb sparc64: Fix return value in update_persistent_clock().
Noticed by Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-10 13:35:08 -07:00
David S. Miller
088a396236 sparc64: Add missing rtc_close() in update_persistent_clock()
Noticed by David Brownell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-08 19:53:28 -07:00
David S. Miller
2eb2f77900 sparc64: Disable timer interrupts in fixup_irqs().
When a CPU is offlined, we leave the timer interrupts disabled
because fixup_irqs() does not explicitly take care of that case.

Fix this by invoking tick_ops->disable_irq().

Based upon analysis done by Paul E. McKenney.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-08 17:21:07 -07:00
David S. Miller
98d86c0915 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
Conflicts:

	arch/sparc/kernel/of_device.c
2008-09-08 15:39:30 -07:00
Krzysztof Helt
3baca76f56 sparc64: fix wrong m48t59 RTC year
Correctly calculate offset to the year register for
Mostek RTC.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-07 18:12:59 -07:00
Paul E. McKenney
4d084617fb sparc64: Prevent sparc64 from invoking irq handlers on offline CPUs
Make sparc64 refrain from clearing a given to-be-offlined CPU's bit in the
cpu_online_mask until it has processed pending irqs.  This change
prevents other CPUs from being blindsided by an apparently offline CPU
nevertheless changing globally visible state.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-03 02:15:30 -07:00
David S. Miller
e5bd1c3fdd sparc64: Fix IPI call locking.
When I switched sparc64 over to the generic helpers for
smp_call_function(), I didn't convert the dinky call_lock
we had.

Use ipi_call_lock() and ipi_call_unlock().

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-03 02:14:39 -07:00
David S. Miller
5280267c1d sparc: Fix handling of LANCE and ESP parent nodes in of_device.c
The device nodes that sit above 'esp' and 'le' on SBUS lack a 'ranges'
property, but we should pass the translation up to the parent node so
that the SBUS level ranges get applied.

Based upon a bug report from Robert Reif.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-03 02:05:19 -07:00
David S. Miller
8aef727861 pci_sun4v: Use of_get_property().
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-02 00:52:56 -07:00
David S. Miller
463801b3ae pci_schizo: Use of_get_property() and delete spurious local vars.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-02 00:52:55 -07:00
David S. Miller
0f73d1bbe6 pci_psycho: Use of_getintprop_default().
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-02 00:52:54 -07:00
David S. Miller
446139a8f7 sparc64: Implement SSTATE purely using notifiers and initcalls.
Don't clutter up the tree with sstate_blah() scattered all over the
place.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-02 00:49:38 -07:00