TASK_UNMAPPED_BASE is not used with the new top down mmap layout. We can
reuse this preload slot by loading in the segment at 0x10000000, where almost
all PowerPC binaries are linked at.
On a microbenchmark that bounces a token between two 64bit processes over pipes
and calls gettimeofday each iteration (to access the VDSO), both the 32bit and
64bit context switch rate improves (tested on a 4GHz POWER6):
32bit: 273k/sec -> 283k/sec
64bit: 277k/sec -> 284k/sec
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
With the new top down layout it is likely that the pc and stack will be in the
same segment, because the pc is most likely in a library allocated via a top
down mmap. Right now we bail out early if these segments match.
Rearrange the SLB preload code to sanity check all SLB preload addresses
are not in the kernel, then check all addresses for conflicts.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On 64bit applications the VDSO is the only thing in segment 0. Since the VDSO
is position independent we can remove the hint and let get_unmapped_area pick
an area. This will mean the vdso will be near other mmaps and will share
an SLB entry:
10000000-10001000 r-xp 00000000 08:06 5778459 /root/context_switch_64
10010000-10011000 r--p 00000000 08:06 5778459 /root/context_switch_64
10011000-10012000 rw-p 00001000 08:06 5778459 /root/context_switch_64
fffa92ae000-fffa92b0000 rw-p 00000000 00:00 0
fffa92b0000-fffa9453000 r-xp 00000000 08:06 4334051 /lib64/power6/libc-2.9.so
fffa9453000-fffa9462000 ---p 001a3000 08:06 4334051 /lib64/power6/libc-2.9.so
fffa9462000-fffa9466000 r--p 001a2000 08:06 4334051 /lib64/power6/libc-2.9.so
fffa9466000-fffa947c000 rw-p 001a6000 08:06 4334051 /lib64/power6/libc-2.9.so
fffa947c000-fffa9480000 rw-p 00000000 00:00 0
fffa9480000-fffa94a8000 r-xp 00000000 08:06 4333852 /lib64/ld-2.9.so
fffa94b3000-fffa94b4000 rw-p 00000000 00:00 0
fffa94b4000-fffa94b7000 r-xp 00000000 00:00 0 [vdso] <----- here I am
fffa94b7000-fffa94b8000 r--p 00027000 08:06 4333852 /lib64/ld-2.9.so
fffa94b8000-fffa94bb000 rw-p 00028000 08:06 4333852 /lib64/ld-2.9.so
fffa94bb000-fffa94bc000 rw-p 00000000 00:00 0
fffe4c10000-fffe4c25000 rw-p 00000000 00:00 0 [stack]
On a microbenchmark that bounces a token between two 64bit processes over pipes
and calls gettimeofday each iteration (to access the VDSO), our context switch
rate goes from 268k to 277k ctx switches/sec (tested on a 4GHz POWER6).
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The bitops.h functions that operate on a single bit in a bitfield are
implemented by operating on the corresponding word location. In all
cases the inner logic is valid if the mask being applied has more than
one bit set, so this patch exposes those inner operations. Indeed,
set_bits() was already available, but it duplicated code from
set_bit() (rather than making the latter a wrapper) - it was also
missing the PPC405_ERR77() workaround and the "volatile" address
qualifier present in other APIs. This corrects that, and exposes the
other multi-bit equivalents.
One advantage of these multi-bit forms is that they allow word-sized
variables to essentially be their own spinlocks, eg. very useful for
state machines where an atomic "flags" variable can obviate the need
for any additional locking.
Signed-off-by: Geoff Thorpe <geoff@geoffthorpe.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The workaround enabled by CONFIG_MPIC_BROKEN_REGREAD does not work
on non-broken MPICs. The symptom is no interrupts being received.
The fix is twofold. Firstly the code was broken for multiple isus,
we need to index into the shadow array with the src_no, not the idx.
Secondly, we always do the read, but only use the VECPRI_MASK and
VECPRI_ACTIVITY bits from the hardware, the rest of "val" comes
from the shadow.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This allows to remove the ppc_md.init() hook in the setup code.
Signed-off-by: Gerhard Pircher <gerhard_pircher@gmx.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Make sure the stack-protector segment registers are properly set up
before calling any functions which may have stack-protection compiled
into them.
[ Impact: prevent Xen early-boot crash when stack-protector is enabled ]
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
load_percpu_segment() is used to set up the per-cpu segment registers,
which are also used for -fstack-protector. Make sure that the
load_percpu_segment() function doesn't have stackprotector enabled.
[ Impact: allow percpu setup before calling stack-protected functions ]
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Fix build errors due to missing Kconfig select of CRC32:
ks8851.c:(.text+0x7d2ee): undefined reference to `crc32_le'
ks8851.c:(.text+0x7d2f5): undefined reference to `bitrev32'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are many variants of Toshiba laptops with ALC268 codec, and
it seems that a few of them don't work with model=toshiba preset
since they have the secondary ALC268 codec just for HDMI output.
This is a regression due to the previous clean-up work to merge all
Toshiba quirk entries into a single check.
This patch adds the identification of such laptops to apply the
standard BIOS-probing method. Unfortunately, Toshiba laptops have
all the same PCI SSID, so we need to check the codec SSID to identify
each device.
Tested-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
nilfs2: fix oopses with doubly mounted snapshots
nilfs2: missing a read lock for segment writer in nilfs_attach_checkpoint()
Fix some issues with the AFS documentation, found when testing AFS on ppc64:
- Update AFS features: reading/writing, local caching
- Typo in kafs sysfs debug file
- Use modprobe instead of insmod in example
- Update IPs for grand.central.org
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/kms: teardown crtc correctly when fb is destroyed.
drm/kms/radeon: cleanup combios TV table like DDX.
drm/radeon/kms: memset the allocated framebuffer before using it.
drm/radeon/kms: although LVDS might be possible on crtc 1 don't do it.
drm/radeon/kms: implement bo busy check + current domain
drm/radeon/kms: cut down indirects in register accesses.
drm/radeon/kms: Fix up vertical blank interrupt support.
drm/radeon/kms: add rv530 R300_SU_REG_DEST + reloc for ZPASS_ADDR
drm/edid: fixup detailed timings like the X server.
drm/radeon/kms: Add specific rs690 authorized register table
* 'next' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: Update Microblaze defconfigs
microblaze: Use klimit instead of _end for memory init
microblaze: Enable ppoll syscall
microblaze: Sane handling of missing timer/intc in device tree
microblaze: use the generic ack_bad_irq implementation
Currently clockevents_notify() is called with interrupts enabled at
some places and interrupts disabled at some other places.
This results in a deadlock in this scenario.
cpu A holds clockevents_lock in clockevents_notify() with irqs enabled
cpu B waits for clockevents_lock in clockevents_notify() with irqs disabled
cpu C doing set_mtrr() which will try to rendezvous of all the cpus.
This will result in C and A come to the rendezvous point and waiting
for B. B is stuck forever waiting for the spinlock and thus not
reaching the rendezvous point.
Fix the clockevents code so that clockevents_lock is taken with
interrupts disabled and thus avoid the above deadlock.
Also call lapic_timer_propagate_broadcast() on the destination cpu so
that we avoid calling smp_call_function() in the clockevents notifier
chain.
This issue left us wondering if we need to change the MTRR rendezvous
logic to use stop machine logic (instead of smp_call_function) or add
a check in spinlock debug code to see if there are other spinlocks
which gets taken under both interrupts enabled/disabled conditions.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: "Pallipadi Venkatesh" <venkatesh.pallipadi@intel.com>
Cc: "Brown Len" <len.brown@intel.com>
LKML-Reference: <1250544899.2709.210.camel@sbs-t61.sc.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Add an owner check to opening perf.data files and a switch to
silence it.
Because perf-report/perf-annotate are binary parsers reading
another users' perf.data file could be a security risk if the
file were explicitly engineered to trigger bugs in the parser
(we hope of course there are non such bugs, but you never
know).
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20090819092023.896648538@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The BIOS pin configs are in fact correct and shall not be overwritten.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
HP Compaq 6530s and 6531s internal speaker is silence or becomes silence
within 1 minute after fresh boot. It is found that pin 0x1c must be set to
PIN_OUT mode to make the speaker work. This is weird - line-in pin 0x1c and
speaker pin 0x16 seem to be unrelated.
The codec differences before/after patch are:
@@ Node 0x17 [Pin Complex] wcaps 0x40020b:
Pin Default 0x41a6e130: [N/A] Mic at Ext Rear
Conn = Digital, Color = White
DefAssociation = 0x3, Sequence = 0x0
Misc = NO_PRESENCE
- Pin-ctls: 0x24: IN
+ Pin-ctls: 0x40: OUT
@@ Node 0x1c [Pin Complex] wcaps 0x40018d:
Pin Default 0x41813021: [N/A] Line In at Ext Rear
Conn = 1/8, Color = Blue
DefAssociation = 0x2, Sequence = 0x1
- Pin-ctls: 0x24: IN VREF_80
+ Pin-ctls: 0x40: OUT VREF_HIZ
Unsolicited: tag=00, enabled=0
Connection: 1
0x24
Tests show that it won't impact (external) Mic recording.
Reported-by: "Lin, Ming M" <ming.m.lin@intel.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Normally, srmmu uses different trap table register values to allow
determination of the cpu we're on. All of the trap tables have
identical content, they just sit at different offsets from the first
trap table, and the offset shifted down and masked out determines
the cpu we are on.
The code tries to free them up when they aren't actually used
(don't have all 4 cpus, we're on sun4d, etc.) but that causes
problems.
For one thing it triggers false positives in the DMA debugging
code. And fixing that up while preserving this relative offset
thing isn't trivial.
So just kill the freeing code, it costs us at most 3 pages, big
deal...
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to I modified the corresponding platform device name,
so I make the patch to rename MAC platform driver
for w90p910 platform.
Signed-off-by: Wan ZongShun <mcuos.com@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If userspace destroys a framebuffer that is in use on a crtc,
don't just null it out, tear down the crtc properly so the
hw gets turned off.
Signed-off-by: Dave Airlie <airlied@redhat.com>
The fallback case wasn't getting executed properly if there
was no TV table, which my T42 M7 hasn't got.
Signed-off-by: Dave Airlie <airlied@redhat.com>
LVDS always requests RMX_FULL, we need to fix it so that doesn't happen
before we can enable LVDS on crtc 1.
Signed-off-by: Dave Airlie <airlied@redhat.com>
yellowfin_init_ring() needs to clean up if dev_alloc_skb() fails and
should pass an error status up to the caller. This also prevents an
buffer underrun if failure occurred in the first iteration.
yellowfin_open() which calls yellowfin_init_ring() should free its
requested irq upon failure.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I think arch/sparc/kernel/sys32.S has an incorrect splice definition:
SIGN2(sys32_splice, sys_splice, %o0, %o1)
The splice() prototype looks like :
long splice(int fd_in, loff_t *off_in, int fd_out,
loff_t *off_out, size_t len, unsigned int flags);
So I think we should have :
SIGN2(sys32_splice, sys_splice, %o0, %o2)
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
security: Fix prompt for LSM_MMAP_MIN_ADDR
security: Make LSM_MMAP_MIN_ADDR default match its help text.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: use the right flag for get_vm_area()
percpu, sparc64: fix sparse possible cpu map handling
init: set nr_cpu_ids before setup_per_cpu_areas()
If node_load[] is cleared everytime build_zonelists() is
called,node_load[] will have no help to find the next node that should
appear in the given node's fallback list.
Because of the bug, zonelist's node_order is not calculated as expected.
This bug affects on big machine, which has asynmetric node distance.
[synmetric NUMA's node distance]
0 1 2
0 10 12 12
1 12 10 12
2 12 12 10
[asynmetric NUMA's node distance]
0 1 2
0 10 12 20
1 12 10 14
2 20 14 10
This (my bug) is very old but no one has reported this for a long time.
Maybe because the number of asynmetric NUMA is very small and they use
cpuset for customizing node memory allocation fallback.
[akpm@linux-foundation.org: fix CONFIG_NUMA=n build]
Signed-off-by: Bo Liu <bo-liu@hotmail.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
According to the POSIX (1003.1-2008), the file descriptor shall have been
opened with read permission, regardless of the protection options specified to
mmap(). The ltp test cases mmap06/07 need this.
Signed-off-by: Graff Yang <graff.yang@gmail.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since the changes to the bitbang driver, there is the possibility we will
be called with either the speed_hz or bpw values zero. We take these to
mean that the default values (8 bits per word, or maximum bus speed).
Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently the clock rate calculation may round as pleased, which means
that it is possible that we will round down and end up with a faster clock
rate than intended.
Change the calculation to use DIV_ROUND_UP() to ensure that we end up with
a clock rate either the same as or lower than the user requested one.
Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are a number of individual MMC drivers listed in MAINTAINERS. I
didn't modify those records. Perhaps I should have.
Cc: <linux-mmc@vger.kernel.org>
Cc: Manuel Lauss <manuel.lauss@gmail.com>
Cc: Nicolas Pitre <nico@cam.org>
Cc: Pierre Ossman <drzeus@drzeus.cx>
Cc: Pavel Pisa <ppisa@pikron.com>
Cc: Jarkko Lavinen <jarkko.lavinen@nokia.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: Sascha Sommer <saschasommer@freenet.de>
Cc: Ian Molton <ian@mnementh.co.uk>
Cc: Joseph Chan <JosephChan@via.com.tw>
Cc: Harald Welte <HaraldWelte@viatech.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The commit 2ff05b2b (oom: move oom_adj value) moveed the oom_adj value to
the mm_struct. It was a very good first step for sanitize OOM.
However Paul Menage reported the commit makes regression to his job
scheduler. Current OOM logic can kill OOM_DISABLED process.
Why? His program has the code of similar to the following.
...
set_oom_adj(OOM_DISABLE); /* The job scheduler never killed by oom */
...
if (vfork() == 0) {
set_oom_adj(0); /* Invoked child can be killed */
execve("foo-bar-cmd");
}
....
vfork() parent and child are shared the same mm_struct. then above
set_oom_adj(0) doesn't only change oom_adj for vfork() child, it's also
change oom_adj for vfork() parent. Then, vfork() parent (job scheduler)
lost OOM immune and it was killed.
Actually, fork-setting-exec idiom is very frequently used in userland program.
We must not break this assumption.
Then, this patch revert commit 2ff05b2b and related commit.
Reverted commit list
---------------------
- commit 2ff05b2b4e (oom: move oom_adj value from task_struct to mm_struct)
- commit 4d8b9135c3 (oom: avoid unnecessary mm locking and scanning for OOM_DISABLE)
- commit 8123681022 (oom: only oom kill exiting tasks with attached memory)
- commit 933b787b57 (mm: copy over oom_adj value at fork time)
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
get_sb_pseudo sets s_maxbytes to ~0ULL which becomes negative when cast
to a signed value. Fix it to use MAX_LFS_FILESIZE which casts properly
to a positive signed value.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Steve French <smfrench@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Robert Love <rlove@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix prompt for LSM_MMAP_MIN_ADDR.
(Verbs are cool!)
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Commit 788084aba2 added the LSM_MMAP_MIN_ADDR
option, whose help text states "For most ia64, ppc64 and x86 users with lots
of address space a value of 65536 is reasonable and should cause no problems."
Which implies that it's default setting was typoed.
Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (60 commits)
net: restore gnet_stats_basic to previous definition
NETROM: Fix use of static buffer
e1000e: fix use of pci_enable_pcie_error_reporting
e1000e: WoL does not work on 82577/82578 with manageability enabled
cnic: Fix locking in init/exit calls.
cnic: Fix locking in start/stop calls.
bnx2: Use mutex on slow path cnic calls.
cnic: Refine registration with bnx2.
cnic: Fix symbol_put_addr() panic on ia64.
gre: Fix MTU calculation for bound GRE tunnels
pegasus: Add new device ID.
drivers/net: fixed drivers that support netpoll use ndo_start_xmit()
via-velocity: Fix test of mii_status bit VELOCITY_DUPLEX_FULL
rt2x00: fix memory corruption in rf cache, add a sanity check
ixgbe: Fix receive on real device when VLANs are configured
ixgbe: Do not return 0 in ixgbe_fcoe_ddp() upon FCP_RSP in DDP completion
netxen: free napi resources during detach
netxen: remove netxen workqueue
ixgbe: fix issues setting rx-usecs with legacy interrupts
can: fix oops caused by wrong rtnl newlink usage
...
will fix kernel oopses like the following:
# mount -t nilfs2 -r -o cp=20 /dev/sdb1 /test1
# mount -t nilfs2 -r -o cp=20 /dev/sdb1 /test2
# umount /test1
# umount /test2
BUG: sleeping function called from invalid context at arch/x86/mm/fault.c:1069
in_atomic(): 0, irqs_disabled(): 1, pid: 3886, name: umount.nilfs2
1 lock held by umount.nilfs2/3886:
#0: (&type->s_umount_key#31){+.+...}, at: [<c10b398a>] deactivate_super+0x52/0x6c
irq event stamp: 1219
hardirqs last enabled at (1219): [<c135c774>] __mutex_unlock_slowpath+0xf8/0x119
hardirqs last disabled at (1218): [<c135c6d5>] __mutex_unlock_slowpath+0x59/0x119
softirqs last enabled at (1214): [<c1033316>] __do_softirq+0x1a5/0x1ad
softirqs last disabled at (1205): [<c1033354>] do_softirq+0x36/0x5a
Pid: 3886, comm: umount.nilfs2 Not tainted 2.6.31-rc6 #55
Call Trace:
[<c1023549>] __might_sleep+0x107/0x10e
[<c13603c0>] do_page_fault+0x246/0x397
[<c136017a>] ? do_page_fault+0x0/0x397
[<c135e753>] error_code+0x6b/0x70
[<c136017a>] ? do_page_fault+0x0/0x397
[<c104f805>] ? __lock_acquire+0x91/0x12fd
[<c1050a62>] ? __lock_acquire+0x12ee/0x12fd
[<c1050a62>] ? __lock_acquire+0x12ee/0x12fd
[<c1050b2b>] lock_acquire+0xba/0xdd
[<d0d17d3f>] ? nilfs_detach_segment_constructor+0x2f/0x2fa [nilfs2]
[<c135d4fe>] down_write+0x2a/0x46
[<d0d17d3f>] ? nilfs_detach_segment_constructor+0x2f/0x2fa [nilfs2]
[<d0d17d3f>] nilfs_detach_segment_constructor+0x2f/0x2fa [nilfs2]
[<c104ea2c>] ? mark_held_locks+0x43/0x5b
[<c104ecb1>] ? trace_hardirqs_on_caller+0x10b/0x133
[<c104ece4>] ? trace_hardirqs_on+0xb/0xd
[<d0d09ac1>] nilfs_put_super+0x2f/0xca [nilfs2]
[<c10b3352>] generic_shutdown_super+0x49/0xb8
[<c10b33de>] kill_block_super+0x1d/0x31
[<c10e6599>] ? vfs_quota_off+0x0/0x12
[<c10b398f>] deactivate_super+0x57/0x6c
[<c10c4bc3>] mntput_no_expire+0x8c/0xb4
[<c10c5094>] sys_umount+0x27f/0x2a4
[<c10c50c6>] sys_oldumount+0xd/0xf
[<c10031a4>] sysenter_do_call+0x12/0x38
...
This turns out to be a bug brought by an -rc1 patch ("nilfs2: simplify
remaining sget() use").
In the patch, a new "put resource" function, nilfs_put_sbinfo()
was introduced to delay freeing nilfs_sb_info struct.
But the nilfs_put_sbinfo() mistakenly used atomic_dec_and_test()
function to check the reference count, and it caused the nilfs_sb_info
was freed when user mounted a snapshot twice.
This bug also suggests there was unseen memory leak in usual mount
/umount operations for nilfs.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>