This starts bringing the PowerPC and Sparc64 implemetations back closer
together.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
It should be set to the total number of pages that the
system will really have available after things like
initmem, the bootmem map, and initrd are freed up.
Signed-off-by: David S. Miller <davem@davemloft.net>
While useful in odd circumstances to debug something, they are
normally totally unused and anyone can fetch this code out of the
history if they really need it.
And in any event, the person who needs this kind of code is usually me
:-)
Signed-off-by: David S. Miller <davem@davemloft.net>
__get_phys is only called from init.c as is prom_virt_to_phys(),
__get_iospace() is not called at all, and sun4u_get_pte() is largely
misnamed.
Privatize the implementation and helper functions of
sun4u_get_phys() to mm/init.c, and rename to
kvaddr_to_paddr().
The only used of this thing is flush_icache_range(), and thus
things can be considerably further simplified. For example,
we should only see module or PAGE_OFFSET kernel addresses here,
so we don't need the OBP firmware range handling at all.
Signed-off-by: David S. Miller <davem@davemloft.net>
Kick out empty entries as soon as we spot them, and use memmove()
instead of a silly loop to make the operation more clear.
Signed-off-by: David S. Miller <davem@davemloft.net>
Decrease the SECTION_SIZE_BITS --> MAX_PHYSADDR_BITS
range a little bit.
The cost of going to SPARSEMEM_STATIC becomes 8K of BSS space, and in
return we save a pointer dereferences on every page struct lookup.
Even better we hit the main kernel image for the base address which is
in a hugepage locked TLB entry.
Signed-off-by: David S. Miller <davem@davemloft.net>
This helps deal with the invisible bridge that sits between
the host controller and the top-most visisble PCI devices
on hypervisor systems.
For example, on T1000 the bus-range property says 2 --> 4
and so there is a PCI express bridge at bus 2, devfn 0, etc.
So if we don't force the dummy host controller to bus zero,
we'll try to create two devices with the same domain/bus/devfn
triplet.
Also, add some more log diagnostics to make debugging stuff like this
easyer.
Signed-off-by: David S. Miller <davem@davemloft.net>
We fake up a dummy one in all cases because that is the simplest
thing to do and it happens to be necessary for hypervisor systems.
Signed-off-by: David S. Miller <davem@davemloft.net>
We don't do the "Simba APB is a PBM" bogosity for Sabre
controllers any longer, so this pbms_same_domain thing
is no longer necessary.
Signed-off-by: David S. Miller <davem@davemloft.net>
The SIMBA APB bridge is strange, it is a PCI bridge but it lacks
some standard OF properties, in particular it lacks a 'ranges'
property.
What you have to do is read the IO and MEM range registers in
the APB bridge to determine the ranges handled by each bridge.
So fill in the bus resources by doing that.
Since we now handle this quirk in the generic PCI and OF device
probing layers, we can flat out eliminate all of that code from
the sabre pci controller driver.
In fact we can thus eliminate completely another quirk of the sabre
driver. It tried to make the two APB bridges look like PBMs but that
makes zero sense now (and it's questionable whether it ever made sense).
So now just use pbm_A and probe the whole PCI hierarchy using that as
the root.
This simplification allows many future cleanups to occur.
Also, I've found yet another quirk that needs to be worked around
while testing this. You can't use the 'class-code' OF firmware
property, especially for IDE controllers. We have to read the value
out of PCI config space or else we'll see the value the device was
showing before it was programmed into native mode.
I'm starting to think it might be wise to just read all of the values
out of PCI config space instead of using the OF properties. :-/
Signed-off-by: David S. Miller <davem@davemloft.net>
Need to traverse recursively down child busses else we only
get the file created under devices at the top-level.
Signed-off-by: David S. Miller <davem@davemloft.net>
The only user was bus_dvma_to_mem() which is no longer used
by any driver, so kill that, and the export of pci_memspace_mask.
The only user now is the PCI mmap support code.
Signed-off-by: David S. Miller <davem@davemloft.net>
Almost entirely taken from the 64-bit PowerPC PCI code.
This allowed to eliminate a ton of cruft from the sparc64
PCI layer.
Signed-off-by: David S. Miller <davem@davemloft.net>
Also, do not try to compute resources by hand, instead use
the pre-computed ones in the of_device.
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows us to simplify sharing code with powerpc which
has properties that have various forms of capitalization
when on the sparc64 side the property is all lower-case.
Signed-off-by: David S. Miller <davem@davemloft.net>
Finally, we actually change the functions themselves.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Removes days_in_mo[], as it's almost identical to month_days[]
- Use the leapyear() macro
- Line length wrapping.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I'd like to thank John Stul and others for helping
me along the way.
A lot of cleanups fell out of this. For example, the get_compare()
tick_op was totally unused, so was deleted. And the most often used
tick_op members were grouped together for cache-friendlyness.
The sparc64 TSC is given to the kernel as a one-shot timer.
tick_ops->init_timer() simply turns off the privileged bit in
the tick register (when possible), and disables the interrupt
by setting bit 63 in the compare register. The ->disable_irq()
op also sets this bit.
tick_ops->add_compare() is changed to:
1) Add the given delta to "tick" not to "compare"
2) Return a boolean which, if true, means that the tick
value read after writing the compare value was found
to have incremented past the initial tick value. This
mirrors logic used in the HPET driver's ->next_event()
method.
Each tick_ops implementation also now provides a name string.
And we feed this into the clocksource and clockevents layers.
Signed-off-by: David S. Miller <davem@davemloft.net>
Things were scattered all over the place, split between
SMP and non-SMP.
Unify it all so that dyntick support is easier to add.
Signed-off-by: David S. Miller <davem@davemloft.net>
While building a test kernel for the new esp driver (against
git-current), I hit this bug. Trivial fix, put the inline declaration
in the right place. :)
Signed-off-by: Tom "spot" Callaway <tcallawa@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not sign extend args using the sys32_ipc stub, that is
buggy and unnecessary.
Based upon an excellent report by Mikael Pettersson.
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix section mismatch in arch/sparc/kernel/pcic.c and
arch/sparc64/kernel/pci.c.
Signed-off-by: Robert Reif <reif@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
I don't figure anyone really cares about SunOS syscall emulation, and I
certainly don't. But I'm getting rid of uses of the OPEN_MAX and CHILD_MAX
compile-time constant, and these are almost the only ones. OPEN_MAX is a
bogus constant with no meaning about anything. The RLIMIT_NOFILE resource
limit is what sysconf (_SC_OPEN_MAX) actually wants to return.
The CHILD_MAX cases weren't actually using anything I want to get rid of,
but I noticed that they are there and are wrong too. The CHILD_MAX value
is not really unlimited as a -1 return from sysconf indicates. The
RLIMIT_NPROC resource limit is what sysconf (_SC_CHILD_MAX) wants to return.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are several IOMMU allocator bugs. Instead of trying to fix this
overly complicated code, just mirror the PCI IOMMU arena allocator
which is very stable and well stress tested.
I tried to make the code as identical as possible so we can switch
sun4u PCI and SBUS over to a common piece of IOMMU code. All that
will be need are two callbacks, one to do a full IOMMU flush and one
to do a streaming buffer flush.
This patch gets rid of a lot of hangs and mysterious crashes on SBUS
sparc64 systems, at least for me.
Signed-off-by: David S. Miller <davem@davemloft.net>
The manual says that it is required and we actually have crash reports
where loads see stale data due to not having membars here.
In one case the networking does:
memset(skb, 0, offsetof(struct sk_buff, truesize));
and then some code later checks skb->nohdr for zero, but it's still
the value that was there before the memset().
Note that arch/sparc64/lib/xor.S already got this right.
Signed-off-by: David S. Miller <davem@davemloft.net>
We have to make sure to use base-pagesize TLB entries even during the
early transition period where we need TLB miss handling but don't have
the kernel page tables setup yet for the linear region.
Also, it is necessary therefore to not use the 4MB TSB for these
translations, and instead use the normal kernel TSB. This allows us
to also get rid of the 4MB tsb for debug builds which shrinks the
kernel a little bit.
Signed-off-by: David S. Miller <davem@davemloft.net>
These pte loops all assume the passed in address is HPAGE
aligned, make sure that is actually true.
Signed-off-by: David S. Miller <davem@davemloft.net>
sys_mbind
sys_get_mempolicy
sys_set_mempolicy
sys_kexec_load
sys_move_pages
sys_getcpu
sys_epoll_pwait
This work is largely a result of David Woodhouse's most
excellent missing syscalls patch.
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix atomicity of TIF update in flush_thread() for sparc64
Fixes correctly the race by using *_ti_thread_flag.
Race :
parent process executing :
sys_ptrace()
(lock_kernel())
(ptrace_get_task_struct(pid))
arch_ptrace()
ptrace_detach()
ptrace_disable(child);
clear_singlestep(child);
clear_tsk_thread_flag(child, TIF_SINGLESTEP);
(which clears the TIF_SINGLESTEP flag atomically from a different
process)
(put_task_struct(child))
(unlock_kernel())
And at the same time, in the child process :
sys_execve()
do_execve()
search_binary_handler()
load_elf_binary()
flush_old_exec()
flush_thread()
doing a non-atomic thread flag update
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
We mistakedly modify 'bus' in the innermost loop. What
should happen is that at each register index iteration,
we start with the same 'bus'.
So preserve it's value at the top level, and use a loop
local variable 'dbus' for iteration.
This bug causes registers other than the first to be
decoded improperly.
Signed-off-by: David S. Miller <davem@davemloft.net>
When the PCI controller OBP node lacks an interrupt-map
and interrupt-map-mask property, we need to form the
INO by hand. The PCI swizzle logic was not doing that
properly.
This was a regression added by the of_device code.
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts some bogosity from the dynamic command-line
changes made on sparc32 and sparc64.
Drivers such as drivers/sbus/char/openprom.c reference
saved_command_line, and can be modular.
The boot_command_line is __initdata, yet the dynamic command-line
changes add modular exports of that symbol, obviously wrong.
Signed-off-by: David S. Miller <davem@davemloft.net>
Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data. In addition it'll catch accidental writes at compile time to
these shared resources.
[akpm@osdl.org: sparc64 fix]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I am slowly moving to a model where all process killing is struct pid based
instead of pid_t based. The sunos compatibility code is one of the last users
of the old pid_t based kill_pg in the kernel. By being complete I allow for
the future removal of kill_pg from the kernel, which will ensure I don't miss
something.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I noticed that almost all architectures implemented exactly the same
sys32_sysinfo... except parisc, where a bug was to be found in handling of
the uptime. So let's remove a whole whack of code for fun and profit.
Cribbed compat_sys_sysinfo from x86_64's implementation, since I figured it
would be the best tested.
This patch incorporates Arnd's suggestion of not using set_fs/get_fs, but
instead extracting out the common code from sys_sysinfo.
Cc: Christoph Hellwig <hch@infradead.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The line discipline numbers N_* are currently defined for each architecture
individually, but (except for a seeming mistake) identically, in
asm/termios.h. There is no obvious reason why these numbers should be
architecture specific, nor any apparent relationship with the termios
structure. The total number of these, NR_LDISCS, is defined in linux/tty.h
anyway. So I propose the following patch which moves the definitions of
the individual line disciplines to linux/tty.h too.
Three of these numbers (N_MASC, N_PROFIBUS_FDL, and N_SMSBLOCK) are unused
in the current kernel, but the patch still keeps the complete set in case
there are plans to use them yet.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Update all arch/*/kernel/vmlinux.lds.S to not include space for initramfs
when CONFIG_BLK_DEV_INITRAMFS is not selected. This saves another 4 kbytes
on most platfoms (some reserve PAGE_SIZE for initramfs).
Signed-off-by: Jean-Paul Saman <jean-paul.saman@nxp.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As Andi pointed out: CONFIG_GENERIC_ISA_DMA only disables the ISA DMA
channel management. Other functionality may still expect GFP_DMA to
provide memory below 16M. So we need to make sure that CONFIG_ZONE_DMA is
set independent of CONFIG_GENERIC_ISA_DMA. Undo the modifications to
mm/Kconfig where we made ZONE_DMA dependent on GENERIC_ISA_DMA and set
theses explicitly in each arches Kconfig.
Reviews must occur for each arch in order to determine if ZONE_DMA can be
switched off. It can only be switched off if we know that all devices
supported by a platform are capable of performing DMA transfers to all of
memory (Some arches already support this: uml, avr32, sh sh64, parisc and
IA64/Altix).
In order to switch ZONE_DMA off conditionally, one would have to establish
a scheme by which one can assure that no drivers are enabled that are only
capable of doing I/O to a part of memory, or one needs to provide an
alternate means of performing an allocation from a specific range of memory
(like provided by alloc_pages_range()) and insure that all drivers use that
call. In that case the arches alloc_dma_coherent() may need to be modified
to call alloc_pages_range() instead of relying on GFP_DMA.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
nr_free_pages is now a simple access to a global variable. Make it a macro
instead of a function.
The nr_free_pages now requires vmstat.h to be included. There is one
occurrence in power management where we need to add the include. Directly
refrer to global_page_state() there to clarify why the #include was added.
[akpm@osdl.org: arm build fix]
[akpm@osdl.org: sparc64 build fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is kind of hokey, we could use the hardware provided facilities
much better.
MSIs are assosciated with MSI Queues. MSI Queues generate interrupts
when any MSI assosciated with it is signalled. This suggests a
two-tiered IRQ dispatch scheme:
MSI Queue interrupt --> queue interrupt handler
MSI dispatch --> driver interrupt handler
But we just get one-level under Linux currently. What I'd like to do
is possibly stick the IRQ actions into a per-MSI-Queue data structure,
and dispatch them form there, but the generic IRQ layer doesn't
provide a way to do that right now.
So, the current kludge is to "ACK" the interrupt by processing the
MSI Queue data structures and ACK'ing them, then we run the actual
handler like normal.
We are wasting a lot of useful information, for example the MSI data
and address are provided with ever MSI, as well as a system tick if
available. If we could pass this into the IRQ handler it could help
with certain things, in particular for PCI-Express error messages.
The MSI entries on sparc64 also tell you exactly which bus/device/fn
sent the MSI, which would be great for error handling when no
registered IRQ handler can service the interrupt.
We override the disable/enable IRQ chip methods in sun4v_msi, so we
have to call {mask,unmask}_msi_irq() directly from there. This is
another ugly wart.
Signed-off-by: David S. Miller <davem@davemloft.net>
Otherwise we can't use the generic MSI code.
Furthermore, properly use the {get,set}_irq_foo() abstracted
interfaces instead of direct accesses to irq_desc[]->foo.
Signed-off-by: David S. Miller <davem@davemloft.net>
Mirror the logic in the sun4u handler, we have to update
both registers even when we branch out to window fault
fixup handling.
The way it works is that if we are in etrap processing a
fault already, g4/g5 holds the original fault information.
If we take a window spill fault while doing etrap, then
we put the window spill fault info into g4/g5 and this is
what the top-level fault handler ends up processing first.
Then we retry the originally faulting instruction, and
process the original fault at that time.
This is all necessary because of how constrained the trap
registers are in these code paths. These cases trigger
very rarely, so even if there is some performance implication
it's doesn't happen very often. In fact the rarity is why
it took so long to trigger and find this particular bug.
Signed-off-by: David S. Miller <davem@davemloft.net>
Compiling the kernel with CONFIG_HOTPLUG = y and CONFIG_HOTPLUG_CPU = n
with CONFIG_RELOCATABLE = y generates the following modpost warnings
WARNING: vmlinux - Section mismatch: reference to .init.data: from
.text between '_cpu_up' (at offset 0xc0141b7d) and 'cpu_up'
WARNING: vmlinux - Section mismatch: reference to .init.data: from
.text between '_cpu_up' (at offset 0xc0141b9c) and 'cpu_up'
WARNING: vmlinux - Section mismatch: reference to .init.text:__cpu_up
from .text between '_cpu_up' (at offset 0xc0141bd8) and 'cpu_up'
WARNING: vmlinux - Section mismatch: reference to .init.data: from
.text between '_cpu_up' (at offset 0xc0141c05) and 'cpu_up'
WARNING: vmlinux - Section mismatch: reference to .init.data: from
.text between '_cpu_up' (at offset 0xc0141c26) and 'cpu_up'
WARNING: vmlinux - Section mismatch: reference to .init.data: from
.text between '_cpu_up' (at offset 0xc0141c37) and 'cpu_up'
This is because cpu_up, _cpu_up and __cpu_up (in some architectures) are
defined as __devinit
AND
__cpu_up calls some __cpuinit functions.
Since __cpuinit would map to __init with this kind of a configuration,
we get a .text refering .init.data warning.
This patch solves the problem by converting all of __cpu_up, _cpu_up
and cpu_up from __devinit to __cpuinit. The approach is justified since
the callers of cpu_up are either dependent on CONFIG_HOTPLUG_CPU or
are of __init type.
Thus when CONFIG_HOTPLUG_CPU=y, all these cpu up functions would land up
in .text section, and when CONFIG_HOTPLUG_CPU=n, all these functions would
land up in .init section.
Tested on a i386 SMP machine running linux-2.6.20-rc3-mm1.
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
And this points out that the return value from
isa_dev_get_resource() and the 'pregs' arg to
isa_dev_get_irq() are totally unused.
Based upon a patch from Richard Mortimer <richm@oldelvet.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to pass in the resource otherwise we cannot
release the region properly. We must know whether it is
an I/O or MEM resource.
Spotted by Eric Brower.
Signed-off-by: David S. Miller <davem@davemloft.net>
We were not being careful enough. When we trim the physical
memory areas, we have to make sure we don't remove the kernel
image or initial ramdisk image ranges.
Signed-off-by: David S. Miller <davem@davemloft.net>
The apple fn keys don't work anymore with 2.6.20-rc1.
The reason is that USB_HID_POWERBOOK appears in several files although
USB_HIDINPUT_POWERBOOK is the thing to be used.
The patch fixes this.
Cc: Greg KH <greg@kroah.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It branches around some necessary prom calls, which we would
need to do even if we are mapped at the correct location already.
So it doesn't work.
The idea was that this sort of thing could be used for the eventual
kexec implementation, but it is clear that this will need to be
done differently.
Signed-off-by: David S. Miller <davem@davemloft.net>
Run this:
#!/bin/sh
for f in $(grep -Erl "\([^\)]*\) *k[cmz]alloc" *) ; do
echo "De-casting $f..."
perl -pi -e "s/ ?= ?\([^\)]*\) *(k[cmz]alloc) *\(/ = \1\(/" $f
done
And then go through and reinstate those cases where code is casting pointers
to non-pointers.
And then drop a few hunks which conflicted with outstanding work.
Cc: Russell King <rmk@arm.linux.org.uk>, Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Greg KH <greg@kroah.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Paul Fulghum <paulkf@microgate.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Karsten Keil <kkeil@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Steven French <sfrench@us.ibm.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Jaroslav Kysela <perex@suse.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- relbranch_fixup(), for non-branches, would end up setting
regs->tnpc incorrectly, in fact it would set it equal to
regs->tpc which would cause that instruction to execute twice
Also, if this is not a PC-relative branch, we should just
leave regs->tnpc as-is. This covers cases like 'jmpl' which
branch to absolute values.
- To be absolutely %100 safe, we need to flush the instruction
cache for all assignments to kprobe->ainsn.insn[], including
cases like add_aggr_kprobe()
- prev_kprobe's status field needs to be 'unsigned long' to match
the type of the value it is saving
- jprobes were totally broken:
= jprobe_return() can run in the stack frame of the jprobe handler,
or in an even deeper stack frame, thus we'll be in the wrong
register window than the one from the original probe state.
So unwind using 'restore' instructions, if necessary, right
before we do the jprobe_return() breakpoint trap.
= There is no reason to save/restore the register window saved
at %sp at jprobe trigger time. Those registers cannot be
modified by the jprobe handler. Also, this code was saving
and restoring "sizeof (struct sparc_stackf)" bytes. Depending
upon the caller, this could clobber unrelated stack frame
pieces if there is only a basic 128-byte register window
stored on the stack, without the argument save area.
So just saving and restoring struct pt_regs is sufficient.
= Kill the "jprobe_saved_esp", totally unused.
Also, delete "jprobe_saved_regs_location", with the stack frame
unwind now done explicitly by jprobe_return(), this check is
superfluous.
Signed-off-by: David S. Miller <davem@davemloft.net>
ptrace_traceme() consolidation made
ret = ptrace_traceme();
dead write.
Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Userspace is forbidden from making unaligned loads and
stores. So if we get an unaligned trap due to a
{get,put}_user(), signal a fault and run the exception
handler.
Signed-off-by: David S. Miller <davem@davemloft.net>
To add this logic, put the VIS instruction check at the
vis_emul() call site instead of inside of vis_emul().
Signed-off-by: David S. Miller <davem@davemloft.net>
This facility provides three entry points:
ilog2() Log base 2 of unsigned long
ilog2_u32() Log base 2 of u32
ilog2_u64() Log base 2 of u64
These facilities can either be used inside functions on dynamic data:
int do_something(long q)
{
...;
y = ilog2(x)
...;
}
Or can be used to statically initialise global variables with constant values:
unsigned n = ilog2(27);
When performing static initialisation, the compiler will report "error:
initializer element is not constant" if asked to take a log of zero or of
something not reducible to a constant. They treat negative numbers as
unsigned.
When not dealing with a constant, they fall back to using fls() which permits
them to use arch-specific log calculation instructions - such as BSR on
x86/x86_64 or SCAN on FRV - if available.
[akpm@osdl.org: MMC fix]
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David Howells <dhowells@redhat.com>
Cc: Wojtek Kaniewski <wojtekka@toxygen.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>