umount is done under the protection of the namespace semaphore. This
can lead to intresting deadlocks when the last reference to a mount is
released, if filesystem code is in sufficiently nasty state.
This collects all the to-be-released-mounts and releases them after
releasing the namespace semaphore. That both reduces the time we are
holding namespace semaphore and gets the things more robust.
Idea proposed by Al Viro.
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Old semantics: graft_tree() grabs a reference on the vfsmount before
returning success.
New one: graft_tree() leaves that to caller.
All the callers of graft_tree() immediately dropped that reference
anyway. Changing the interface takes care of this unnecessary overhead.
Idea proposed by Al Viro.
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Allow caller of seq_open() to kmalloc() seq_file + whatever else they
want and set ->private_data to it. seq_open() will then abstain from
doing allocation itself.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- check_mnt() on the source of binding should've been unconditional
from the very beginning. My fault - as far I could've trace it,
that's an old thinko made back in 2001. Kudos to Miklos for spotting
it...
Fixed.
- code cleaned up.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The way we currently deal with quota and process accounting that might
keep vfsmount busy at umount time is inherently broken; we try to turn
them off just in case (not quite correctly, at that) and
a) pray umount doesn't fail (otherwise they'll stay turned off)
b) pray nobody doesn anything funny just as we turn quota off
Moreover, LSM provides hooks for doing the same sort of broken logics.
The proper way to deal with that is to introduce the second kind of
reference to vfsmount. Semantics:
- when the last normal reference is dropped, all special ones are
converted to normal ones and if there had been any, cleanup is done.
- normal reference can be cloned into a special one
- special reference can be converted to normal one; that's a no-op if
we'd already passed the point of no return (i.e. mntput() had
converted special references to normal and started cleanup).
The way it works: e.g. starting process accounting converts the vfsmount
reference pinned by the opened file into special one and turns it back
to normal when it gets shut down; acct_auto_close() is done when no
normal references are left. That way it does *not* obstruct umount(2)
and it silently gets turned off when the last normal reference to
vfsmount is gone. Which is exactly what we want...
The same should be done by LSM module that holds some internal
references to vfsmount and wants to shut them down on umount - it should
make them special and security_sb_umount_close() will be called exactly
when the last normal reference to vfsmount is gone.
quota handling is even simpler - we don't use normal file IO anymore, so
there's no need to hold vfsmounts at all. DQUOT_OFF() is done from
deactivate_super(), where it really belongs.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
For some stupid reason I can't explain (brown paper bag is at hand), I
removed the check pfn_valid() in the code that does the icache/dcache
coherency on POWER4 and later. That causes us to eventually try to
access non existing struct page when hashing in IO pages.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This is very simple with it being almost all ppc32 with just a couple
of common defines.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
On Tue, Nov 08, 2005 at 08:12:56AM +1100, Benjamin Herrenschmidt wrote:
> Yes, the MAX_ORDER should be different indeed. But can Kconfig do that ?
> That is have the default value be different based on a Kconfig option ?
> I don't see that ... We may have to do things differently here...
This seems to be done in other parts of the Kconfig file. Using those
as an example, this should keep the MAX_ORDER block size at 16MB.
Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Even though we can enable and disable xmon at runtime now, there are a
few places in the merge tree that call xmon and xmon_printf directly.
In the case below we call die() which will call xmon if it is enabled.
Also remove an unnecessary include of xmon.h in smp.c.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Oprofile was hardwiring the MMCRA sample bit to 1 but on newer cpus
(eg POWER5) we want to vary it based on the group being sampled.
Add a temporary workaround until people update their oprofile userspace.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
On ppc64 we end up with a negative value for the data size in the memory
boot message:
Memory: 2035560k/2097152k available (5792k kernel code, 89564k reserved,
18014398509481632k data, 870k bss, 352k init)
It turns out the section ordering of the linker script is different on
ppc32 and ppc64, so just count data as _edata - _sdata which should work
on both.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The PowerBook HD led code uses obsoletes device-tree accessors which do
not work anymore for getting the root of the tree.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
xmon() prototype is inconsistent between ARCH=ppc and ARCH=powerpc,
thus causing ARCH=ppc build breakage.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Building a PowerMac kernel with ARCH=powerpc causes a bunch of warnings,
this fixes some of them
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds a new thermal control framework for PowerMac, along with the
implementation for PowerMac8,1, PowerMac8,2 (iMac G5 rev 1 and 2), and
PowerMac9,1 (latest single CPU desktop). In the future, I expect to move
the older G5 thermal control to the new framework as well.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Some more U3 revisions have the missing "interrupts" property in U3,
this adds them to the fixup code in prom_init.c
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch updates g5_defconfig for ARCH=powerpc in order to add the SMU
support & thermal drivers to it, the pmac sound driver (works on some
G5s) and replaces rivafb with nvidiafb which works better for the cards
found in G5 based machines.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch adds the ability to the SMU driver to recover missing
calibration partitions from the SMU chip itself. It also adds some
dynamic mecanism to /proc/device-tree so that new properties are visible
to userland.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
CPU freq support using 970FX powertune facility for iMac G5 and SMU
based single CPU desktop.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
OK, the Fedora ppc32 and ppc64 kernels should both be arch/powerpc by
tomorrow. They're booting on G5, POWER5, and my powerbook. I'll test
pmac SMP and Pegasos later -- but pmac smp is known broken in arch/ppc
anyway, and I'll live with a potential Pegasos regression for now; it
wasn't supported officially in FC4 either.
I needed to fix ppc32 initrd -- we were never setting initrd_start.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
drivers/drm/ now implements proper ->compat_ioctl methods, so this isn't
needed anymore.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
all ioctls are 32bit compat clean, so the driver can use ->compat_ioctl
and ->unlocked_ioctl easily.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
implement a compat_ioctl handle in the driver instead of having table
entries in sparc64 ioctl32.c (I plan to get rid of the arch ioctl32.c
file eventually)
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
all the ioctls in the driver are 32bit compat clean and don't need BKL,
so we can switch it to ->unlocked_ioctl and ->compat_ioctl trivially.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Would you mind applying the following patch that kills those two + the
m68k and Documentation/ references?
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Minor simplification to the sparc64 tlb_flush_mmu: tlb_remove_page
set need_flush only after handling the tlb_fast_mode case, then
tlb_flush_mmu need not consider whether it's tlb_fast_mode.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
all these are handled by fs/compat_ioctls.c already.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
I don't know if we ever implemented this, but the only user in any 2.6
tree are the compat ioctls.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The old keyboard driver is gone in 2.6, so the only user left are the
compat ioctls.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The old sound drivers are gone in 2.6, so the only user left are the
compat ioctls.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
this inline routine in arch/sparc64/kernel/ioctl32.c is completely
unused and superceeded by compat_alloc_user_space()
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TIOCPKT_ macros are defined by all other architectures in asm/ioctls.h
and so does sparc and sparc64, so reomve the duplicates in asm/termios.h.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
It only serves to generate false-positive buildcheck warnings.
Just set it initially to tick_operations which uses the v9
%tick register which every sparc64 processor has.
Signed-off-by: David S. Miller <davem@davemloft.net>
It isn't needed any longer, as noted by Hugh Dickins.
We still need the flush routines, due to the one remaining
call site in hugetlb_prefault_arch_hook(). That can be
eliminated at some later point, however.
Signed-off-by: David S. Miller <davem@davemloft.net>
sparc64 is unique among architectures in taking the page_table_lock in
its context switch (well, cris does too, but erroneously, and it's not
yet SMP anyway).
This seems to be a private affair between switch_mm and activate_mm,
using page_table_lock as a per-mm lock, without any relation to its uses
elsewhere. That's fine, but comment it as such; and unlock sooner in
switch_mm, more like in activate_mm (preemption is disabled here).
There is a block of "if (0)"ed code in smp_flush_tlb_pending which would
have liked to rely on the page_table_lock, in switch_mm and elsewhere;
but its comment explains how dup_mmap's flush_tlb_mm defeated it. And
though that could have been changed at any time over the past few years,
now the chance vanishes as we push the page_table_lock downwards, and
perhaps split it per page table page. Just delete that block of code.
Which leaves the mysterious spin_unlock_wait(&oldmm->page_table_lock)
in kernel/fork.c copy_mm. Textual analysis (supported by Nick Piggin)
suggests that the comment was written by DaveM, and that it relates to
the defeated approach in the sparc64 smp_flush_tlb_pending. Just delete
this block too.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sparc64 prom_callback and new_setup_frame32 each operates on a user page
table without holding lock, and no doubt they've good reason. But I'd
feel more confident if they were to do a "pte = *ptep" and then operate
on pte, rather than re-evaluating *ptep.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
From: Georg Chini <georg.chini@triaton-webhosting.com>
Introduce some sbus_dma routines similar to the
ebus_dma stuff to make the code look nearly the same
for both cases.
Thanks to Christopher for testing.
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a forward port of a 2.4.x sun4m LED driver written by Lars
Kotthoff.
Signed-off-by: Lars Kotthoff <metalhead@metalhead.ws>
Signed-off-by: David S. Miller <davem@davemloft.net>