Use the %p format string which already accounts for the padding you need
with a pointer type on a particular architecture.
Also replace the macro with a static inline function to match the rest of
the file.
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a WARN() macro that acts like WARN_ON(), with the added feature that it
takes a printk like argument that is printed as part of the warning message.
[akpm@linux-foundation.org: fix printk arguments]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Greg KH <greg@kroah.com>
Cc: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We want to use WARN() as a variant of WARN_ON(), however a few drivers are
using WARN() internally. This patch renames these to WARNING() to avoid the
namespace clash. A few cases were defining but not using the thing, for those
cases I just deleted the definition.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Greg KH <greg@kroah.com>
Cc: Karsten Keil <kkeil@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove apparently obsolete content from init.h referring to gcc 2.9x
and to "no_module_init".
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We duplicate alloc/free_thread_info defines on many platforms (the
majority uses __get_free_pages/free_pages). This patch defines common
defines and removes these duplicated defines.
__HAVE_ARCH_THREAD_INFO_ALLOCATOR is introduced for platforms that do
something different.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Presently call_usermodehelper_setup() uses GFP_ATOMIC. but it can return
NULL _very_ easily.
GFP_ATOMIC is needed only when we can't sleep. and, GFP_KERNEL is robust
and better.
thus, I add gfp_mask argument to call_usermodehelper_setup().
So, its callers pass the gfp_t as below:
call_usermodehelper() and call_usermodehelper_keys():
depend on 'wait' argument.
call_usermodehelper_pipe():
always GFP_KERNEL because always run under process context.
orderly_poweroff():
pass to GFP_ATOMIC because may run under interrupt context.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: "Paul Menage" <menage@google.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Several compilers offer "long long" without claiming to support C99.
Considering how frequent __s64/__u64 are used our userspace headers are
anyway unusable without __s64/__u64 available.
Always offer __s64/__u64 to non-gcc non-C99 compilers - if they provide
"long long" that makes the headers compiling and if they don't they are
anyway screwed.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Build kernel/profile.o only if CONFIG_PROFILING is enabled.
This makes CONFIG_PROFILING=n kernels smaller.
As a bonus, some profile_tick() calls and one branch from schedule() are
now eliminated with CONFIG_PROFILING=n (but I doubt these are
measurable effects).
This patch changes the effects of CONFIG_PROFILING=n, but I don't think
having more than two choices would be the better choice.
This patch also adds the name of the first parameter to the prototypes
of profile_{hits,tick}() since I anyway had to add them for the dummy
functions.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the conditional surrounding the definition of list_add() from list.h
since, if you define CONFIG_DEBUG_LIST, the definition you will subsequently
pick up from lib/list_debug.c will be absolutely identical, at which point you
can remove that redundant definition from list_debug.c as well.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This header file has been unused for quite some time, and the
corresponding source files appear to have been removed back in commit
99eb8a550d ("Remove the arm26 port")
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Cc: Adrian Bunk <bunk@stusta.de>
Cc: Ian Molton <spyro@f2s.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There haave been several areas in the kernel where an int has been used for
flags in local_irq_save() and friends instead of a long. This can cause some
hard to debug problems on some architectures.
This patch adds a typecheck inside the irqsave and restore functions to flag
these cases.
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Needed to fix up a recursive include snafu in
locking-add-typecheck-on-irqsave-and-friends-for-correct-flags.patch
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add fields for VLAN tag and insert VLAN tag flag to the control
section struct. These fields will be used for sending ethernet
packets.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Changeset 7fa897b91a ("ide: trivial sparse
annotations") created an IDE bootup regression on big-endian systems.
In drivers/ide/ide-iops.c, function ide_fixstring() we now have the
loop:
for (p = end ; p != s;)
be16_to_cpus((u16 *)(p -= 2));
which will never terminate on big-endian because in such
a configuration be16_to_cpus() evaluates to "do { } while (0)"
Therefore, always evaluate the arguments to nop endian transformation
operations.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch enables NAND subpage read functionality.
If upper layer drivers are requesting to read non page aligned data NAND
subpage-read functionality reads the only whose ECC regions which include
requested data when original code reads whole page.
This significantly improves performance in many cases.
Here are some digits :
UBI volume mount time
No subpage reads: 5.75 seconds
Subpage read patch: 2.42 seconds
Open/stat time for files on JFFS2 volume:
No subpage read 0m 5.36s
Subpage read 0m 2.88s
Signed-off-by Alexey Korolev <akorolev@infradead.org>
Acked-by: Artem Bityutskiy <dedekind@infradead.org>
Acked-by: Jörn Engel <joern@logfs.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Isochronous reception in dualbuffer mode is reportedly broken with
TI TSB43AB22A on x86-64. Descriptor addresses above 2G have been
determined as the trigger:
https://bugzilla.redhat.com/show_bug.cgi?id=435550
Two fixes are possible:
- pci_set_consistent_dma_mask(pdev, DMA_31BIT_MASK);
at least when IR descriptors are allocated, or
- simply don't use dualbuffer.
This fix implements the latter workaround.
But we keep using dualbuffer on x86-32 which won't give us highmen (and
thus physical addresses outside the 31bit range) in coherent DMA memory
allocations. Right now we could for example also whitelist PPC32, but
DMA mapping implementation details are expected to change there.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
This patch merges the IPv4/IPv6 IPComp implementations since most
of the code is identical. As a result future enhancements will no
longer need to be duplicated.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
signalfd4, eventfd2, epoll_create1, dup3, pipe2 and inotify_init1
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This is a large patch but the normal code path is not affected. For
non-pSeries platforms the code is ifdef'ed out and for non-CMO enabled
pSeries systems this does not affect the normal code path. Devices that
do not perform DMA operations do not need modification with this patch.
The function get_desired_dma was renamed from get_io_entitlement for
clarity.
Overview
Cooperative Memory Overcommitment (CMO) allows for a set of OS partitions
to be run with less RAM than the aggregate needs of the group of
partitions. The firmware will balance memory between the partitions
and page in/out memory as needed. Based on the number and type of IO
adpaters preset each partition is allocated an amount of memory for
DMA operations and this allocation will be guaranteed to the partition;
this is referred to as the partition's 'entitlement'.
Partitions running in a CMO environment can only have virtual IO devices
present. The VIO bus layer will manage the IO entitlement for the system.
Accounting, at a system and per-device level, is tracked in the VIO bus
code and exposed via sysfs. A set of dma_ops functions are added to
the bus to allow for this accounting.
Bus initialization
At initialization, the bus will calculate the minimum needs of the system
based on providing each device present with a standard minimum entitlement
along with a spare allocation for the bus to handle hotplug events.
If the minimum needs can not be met the system boot will be halted.
Device changes
The significant changes for devices while running under CMO are that the
devices must specify how much dedicated IO entitlement they desire and
must also handle DMA mapping errors that can occur due to constrained
IO memory. The virtual IO drivers are modified to silence errors when
DMA mappings fail for CMO and handle these failures gracefully.
Each devices will be guaranteed a minimum entitlement that can always
be mapped. Devices will specify how much entitlement they desire and
the VIO bus will attempt to provide for this. Devices can change their
desired entitlement level at any point in time to address particular needs
(via vio_cmo_set_dev_desired()), not just at device probe time.
VIO bus changes
The system will have a particular entitlement level available from which
it can provide memory to the devices. The bus defines two pools of memory
within this entitlement, the reserved and excess pools. Each device is
provided with it's own entitlement no less than a system defined minimum
entitlement and no greater than what the device has specified as it's
desired entitlement. The entitlement provided to devices comes from the
reserve pool. The reserve pool can also contain a spare allocation as
large as the system defined minimum entitlement which is used for device
hotplug events. Any entitlement not needed to fulfill the needs of a
reserve pool is placed in the excess pool. Each device is guaranteed
that it can map up to it's entitled level; additional mapping are possible
as long as there is unmapped memory in the excess pool.
Bus probe
As the system starts, each device is given an entitlement equal only
to the system defined minimum entitlement. The reserve pool is equal
to the sum of these entitlements, plus a spare allocation. The VIO bus
also tracks the aggregate desired entitlement of all the devices. If the
system desired entitlement is greater than the size of the reserve pool,
when devices unmap IO memory it will be reserved and a balance operation
will be scheduled for some time in the future.
Entitlement balancing
The balance function tries to fairly distribute entitlement between the
devices in the system with the goal of providing each device with it's
desired amount of entitlement. Devices using more than what would be
ideal will have their entitled set-point adjusted; this will effectively
set a goal for lower IO memory usage as future mappings can fail and
deallocations will trigger a balance operation to distribute the newly
unmapped memory. A fair distribution of entitlement can take several
balance operations to achieve. Entitlement changes and device DLPAR
events will alter the state of CMO and will trigger balance operations.
Hotplug events
The VIO bus allows for changes in system entitlement at run-time via
'vio_cmo_entitlement_update()'. When devices are added the hotplug
device event will be preceded by a system entitlement increase and this
is reversed when devices are removed.
The following changes are made that the VIO bus layer for CMO:
* add IO memory accounting per device structure.
* add IO memory entitlement query function to driver structure.
* during vio bus probe, if CMO is enabled, check that driver has
memory entitlement query function defined. Fail if function not defined.
* fail to register driver if io entitlement function not defined.
* create set of dma_ops at vio level for CMO that will track allocations
and return DMA failures once entitlement is reached. Entitlement will
limited by overall system entitlement. Devices will have a reserved
quantity of memory that is guaranteed, the rest can be used as available.
* expose entitlement, current allocation, desired allocation, and the
allocation error counter for devices to the user through sysfs
* provide mechanism for changing a device's desired entitlement at run time
for devices as an exported function and sysfs tunable
* track any DMA failures for entitled IO memory for each vio device.
* check entitlement against available system entitlement on device add
* track entitlement metrics (high water mark, current usage)
* provide function to reset high water mark
* provide minimum and desired entitlement numbers at a bus level
* provide drivers with a minimum guaranteed entitlement
* balance available entitlement between devices to satisfy their needs
* handle system entitlement changes and device hotplug
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To support Cooperative Memory Overcommitment (CMO), we need to check
for failure from some of the tce hcalls.
These changes for the pseries platform affect the powerpc architecture;
patches for the other affected platforms are included in this patch.
pSeries platform IOMMU code changes:
* platform TCE functions must handle H_NOT_ENOUGH_RESOURCES errors and
return an error.
Architecture IOMMU code changes:
* Calls to ppc_md.tce_build need to check return values and return
DMA_MAPPING_ERROR for transient errors.
Architecture changes:
* struct machdep_calls for tce_build*_pSeriesLP functions need to change
to indicate failure.
* all other platforms will need updates to iommu functions to match the new
calling semantics; they will return 0 on success. The other platforms
default configs have been built, but no further testing was performed.
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
With the addition of Cooperative Memory Overcommitment (CMO) support
for IBM Power Systems, two fields have been added to the VPA to report
paging statistics. Add support in lparcfg to report them to userspace.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Newer versions of firmware support page states, which are used by the
collaborative memory manager (future patch) to "loan" pages to the
hypervisor for use by other partitions.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
For Cooperative Memory Overcommitment (CMO), set the FW_FEATURE_CMO
flag in powerpc_firmware_features from the rtas ibm,get-system-parameters
table prior to calling iommu_init_early_pSeries.
With this, any CMO specific functionality can be controlled by checking:
firmware_has_feature(FW_FEATURE_CMO)
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Update /proc/ppc64/lparcfg to display Cooperative Memory
Overcommitment statistics as reported by the H_GET_MPP hcall. This
also updates the lparcfg interface to allow setting memory entitlement
and weight.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch implements support for HW based watchpoint via the
DBSR_DAC (Data Address Compare) facility of the BookE processors.
It does so by interfacing with the existing DABR breakpoint code
and adding the necessary bits and pieces for the new bits to
be properly set or cleared
Signed-off-by: Luis Machado <luisgpm@br.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Stash the first platform string matched by identify_cpu() in
powerpc_base_platform, and supply that to the ELF loader for the value
of AT_BASE_PLATFORM.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Some IBM POWER-based platforms have the ability to run in a
mode which mostly appears to the OS as a different processor from the
actual hardware. For example, a Power6 system may appear to be a
Power5+, which makes the AT_PLATFORM value "power5+". This means that
programs are restricted to the ISA supported by Power5+;
Power6-specific instructions are treated as illegal.
However, some applications (virtual machines, optimized libraries) can
benefit from knowledge of the underlying CPU model. A new aux vector
entry, AT_BASE_PLATFORM, will denote the actual hardware. For
example, on a Power6 system in Power5+ compatibility mode, AT_PLATFORM
will be "power5+" and AT_BASE_PLATFORM will be "power6". The idea is
that AT_PLATFORM indicates the instruction set supported, while
AT_BASE_PLATFORM indicates the underlying microarchitecture.
If the architecture has defined ELF_BASE_PLATFORM, copy that value to
the user stack in the same manner as ELF_PLATFORM.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
virtio: Add transport feature handling stub for virtio_ring.
virtio: Rename set_features to finalize_features
virtio: Formally reserve bits 28-31 to be 'transport' features.
s390: use virtio_console for KVM on s390
virtio: console as a config option
virtio_console: use virtqueue notification for hvc_console
hvc_console: rework setup to replace irq functions with callbacks
virtio_blk: check for hardsector size from host
virtio: Use bus_type probe and remove methods
virtio: don't always force a notification when ring is full
virtio: clarify that ABI is usable by any implementations
virtio: Recycle unused recv buffer pages for large skbs in net driver
virtio net: Allow receiving SG packets
virtio net: Add ethtool ops for SG/GSO
virtio: fix virtio_net xmit of freed skb bug
Obvious misc patch been in my queue (& linux-next) for over a cycle.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
To prepare for virtio_ring transport feature bits, hook in a call in
all the users to manipulate them. This currently just clears all the
bits, since it doesn't understand any features.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rather than explicitly handing the features to the lower-level, we just
hand the virtio_device and have it set the features. This make it clear
that it has the chance to manipulate the features of the device at this
point (and that all feature negotiation is already done).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We assign feature bits as required, but it makes sense to reserve some
for the particular transport, rather than the particular device.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This patch enables virtio_console as the default console on kvm for
s390. We currently use the same notify hack as lguest for early
console output. I will try to address this for lguest and s390 later.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently virtio_blk assumes a 512 byte hard sector size. This can cause
trouble / performance issues if the backing has a different block size
(like a file on an ext3 file system formatted with 4k block size or a dasd).
Lets add a feature flag that tells the guest to use a different hard sector
size than 512 byte.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We want others to implement and use virtio, so it makes sense to BSD
license the non-__KERNEL__ parts of the headers to make this crystal
clear.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Mark McLoughlin <markmc@redhat.com>
Acked-by: Ryan Harper <ryanh@us.ibm.com>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
Acked-by: Anthony Liguori <aliguori@us.ibm.com>
Suresh Siddha wants to fix a possible FPU leakage in error conditions,
but the fact that save/restore_i387() are inlines in a header file makes
that harder to do than necessary. So start off with an obvious cleanup.
This just moves the x86-64 version of save/restore_i387() out of the
header file, and moves it to the only file that it is actually used in:
arch/x86/kernel/signal_64.c. So exposing it in a header file was wrong
to begin with.
[ Side note: I'd like to fix up some of the games we play with the
32-bit version of these functions too, but that's a separate
matter. The 32-bit versions are shared - under different names
at that! - by both the native x86-32 code and the x86-64 32-bit
compatibility code ]
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (76 commits)
ide: use proper printk() KERN_* levels in ide-probe.c
ide: fix for EATA SCSI HBA in ATA emulating mode
ide: remove stale comments from drivers/ide/Makefile
ide: enable local IRQs in all handlers for TASKFILE_NO_DATA data phase
ide-scsi: remove kmalloced struct request
ht6560b: remove old history
ht6560b: update email address
ide-cd: fix oops when using growisofs
gayle: release resources on ide_host_add() failure
palm_bk3710: add UltraDMA/100 support
ide: trivial sparse annotations
ide: ide-tape.c sparse annotations and unaligned access removal
ide: drop 'name' parameter from ->init_chipset method
ide: prefix messages from IDE PCI host drivers by driver name
it821x: remove DECLARE_ITE_DEV() macro
it8213: remove DECLARE_ITE_DEV() macro
ide: include PCI device name in messages from IDE PCI host drivers
ide: remove <asm/ide.h> for some archs
ide-generic: remove ide_default_{io_base,irq}() inlines (take 3)
ide-generic: is no longer needed on ppc32
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: fixup sparse endianness warnings in proc.c
PCI PM: make more PCI PM core functionality available to drivers
PCI/DMAR: don't assume presence of RMRRs
PCI hotplug: fix error path in pci_slot's register_slot
* Remove <linux/irq.h> include from <asm-ia64.h> (<linux/ide.h> includes
<linux/interrupt.h> which is enough).
* Remove <asm/ide.h> for alpha/blackfin/h8300/ia64/m32r/sh/x86/xtensa
(this leaves us with arm/frv/m68k/mips/mn10300/parisc/powerpc/sparc[64]).
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Replace ide_default_{io_base,irq}() inlines by legacy_{bases,irqs}[].
v2:
Add missing zero-ing of hws[] (caught during testing by Borislav Petkov).
v3:
Fix zero-oing of hws[] for _real_ this time.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
PPC_PREP has been depending on BROKEN for some time now.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Now that ide_hwif_t instances are allocated dynamically
the difference between MAX_HWIFS == 2 and MAX_HWIFS == 10
is ~100 bytes (x86-32) so use MAX_HWIFS == 10 on all archs
except these ones that use MAX_HWIFS == 1.
* Define MAX_HWIFS in <linux/ide.h> instead of <asm/ide.h>.
[ Please note that avr32/cris/v850 have no <asm/ide.h>
and alpha/ia64/sh always define CONFIG_IDE_MAX_HWIFS. ]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Remove <asm-cris/arch-v{10,32}/ide.h> and <asm-cris/ide.h>.
This has been a broken code for some time now and needs rewrite
to match IDE core code / host driver model anyway.
Cc: Jesper Nilsson <Jesper.Nilsson@axis.com>
Cc: Mikael Starvik <mikael.starvik@axis.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Since the decision to probe for ISA ide2-6 is now left to the user
"no_pci_devices()" quirk is no longer needed and may be removed.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Move ide_probe_legacy() call to ide_generic_init() so it fails
early if necessary and returns the proper error value (nowadays
ide_default_io_base() is used only by ide-generic).
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add 'unsigned long host_flags' field to struct ide_host.
* Set ->host_flags in ide_host_alloc_all().
* Always set PCI dev's ->driver_data in ide_pci_init_{one,two}().
* Add ide_pci_remove() helper (the default implementation for
struct pci_driver's ->remove method).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add 'struct ide_host *host' field to ide_hwif_t and set it
in ide_host_alloc_all().
* Add ide_device_{get,put}() helpers loosely based on SCSI's
scsi_device_{get,put}() ones.
* Convert IDE device drivers to use ide_device_{get,put}().
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add 'struct device *dev[2]' and 'void *host_priv' fields
to struct ide_host.
* Set ->dev[] in ide_host_alloc_all()/ide_setup_pci_device[s]().
* Pass 'void *priv' argument to ide_setup_pci_device[s]()
and use it to set ->host_priv.
* Set PCI dev's ->driver_data to point to the struct ide_host
instance if PCI host driver wants to use ->host_priv.
* Rename ide_setup_pci_device[s]() to ide_pci_init_{one,two}().
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
MAINTAINERS: Remove Glenn Streiff from NetEffect entry
mlx4_core: Improve error message when not enough UAR pages are available
IB/mlx4: Add support for memory management extensions and local DMA L_Key
IB/mthca: Keep free count for MTT buddy allocator
mlx4_core: Keep free count for MTT buddy allocator
mlx4_code: Add missing FW status return code
IB/mlx4: Rename struct mlx4_lso_seg to mlx4_wqe_lso_seg
mlx4_core: Add module parameter to enable QoS support
RDMA/iwcm: Remove IB_ACCESS_LOCAL_WRITE from remote QP attributes
IPoIB: Include err code in trace message for ib_sa_path_rec_get() failures
IB/sa_query: Check if sm_ah is NULL in ib_sa_remove_one()
IB/ehca: Release mutex in error path of alloc_small_queue_page()
IB/ehca: Use default value for Local CA ACK Delay if FW returns 0
IB/ehca: Filter PATH_MIG events if QP was never armed
IB/iser: Add support for RDMA_CM_EVENT_ADDR_CHANGE event
RDMA/cma: Add RDMA_CM_EVENT_TIMEWAIT_EXIT event
RDMA/cma: Add RDMA_CM_EVENT_ADDR_CHANGE event
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
nohz: adjust tick_nohz_stop_sched_tick() call of s390 as well
nohz: prevent tick stop outside of the idle loop
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: fix header export, asm-x86/processor-flags.h, CONFIG_* leaks
x86: BUILD_IRQ say .text to avoid .data.percpu
xen: don't use sysret for sysexit32
x86: call early_cpu_init at the same point
* 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc:
Remove __DECLARE_SEMAPHORE_GENERIC
Remove asm/semaphore.h
Remove use of asm/semaphore.h
Add missing semaphore.h includes
Remove mention of semaphores from kernel-locking
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68knommu: put ColdFire head code into .text.head section
m68knommu: remove last use of CONFIG_FADS and CONFIG_RPXCLASSIC
m68knommu: remove RPXCLASSIC from the m68k tree
m68knommu: fec: remove FADS
m68knommu: MCF5307 PIT GENERIC_CLOCKEVENTS support
m68knommu: add read_barrier_depends() and irqs_disabled_flags()
m68knommu: add byteswap assembly opcode for ISA A+
m68knommu: add ffs and __ffs plattform which support ISA A+ or ISA C
m68knommu: add sched_clock() for the DMA timer
m68knommu: complete generic time
m68knommu: move code within time.c
m68knommu: m68knommu: add old stack trace method
m68knommu: Add Coldfire DMA Timer support
m68knommu: defconfig for M5407C3 board
m68knommu: defconfig for M5307C3 board
m68knommu: defconfig for M5275EVB board
m68knommu: defconfig for M5249EVB board
m68knommu: change to a configs directory for board configurations
* 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds:
leds: Ensure led->trigger is set earlier
leds: Add support for Philips PCA955x I2C LED drivers
leds: Fix sparse warnings in leds-h1940 driver
leds: mark led_classdev.default_trigger as const
leds: fix unsigned value overflow in atmel pwm driver
leds: Add pca9532 platform data for Thecus N2100
leds: Add pca9532 led driver
This patch adds a "const" to the parser token table. I've done an
allmodconfig build to see if this produces any warnings/failures and the
patch includes a fix for the only warning that was produced.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Acked-by: Alexander Viro <aviro@redhat.com>
Acked-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, linux/major.h defines a GRAPHDEV_MAJOR (29) that nobody uses,
and linux/fb.h defines the real FB_MAJOR (also 29), that only fbmem.c
needs. Drop GRAPHDEV_MAJOR from major.h, move FB_MAJOR definition from
fb.h to major.h, and fix fbmem.c to use major.h's definition.
Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds a platform driver using the ATMEL PWM driver to control a
backlight which requires a PWM signal and optional GPIO signal for discrete
on/off signal. It has been tested on Favr-32 board from EarthLCD.
The driver is configurable by supplying a struct with the platform data. See
the include/linux/atmel-pwm-bl.h for details.
The board code for Favr-32 will be submitted to the AVR32 kernel list.
Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the xtimings structure which only stored some values to be used
later (mostly once). Calculate and use these values in places they are
needed.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a platform_lcd driver to allow boards with simple lcd power controls
to register themselves easily.
[akpm@linux-foundation.org: build fix]
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add the lcd_device being checked to the check_fb entry of lcd_ops. This
ensures that any driver using this to check against it's own state can do
so, and also makes all the calls in lcd_ops more orthogonal in their
arguments.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Provide support for the ILI9320 display controller chip which is found in
many LCD displays. Included with this is support for an example LCD using
this chip, the VGG2432A4.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add flags to allow the driver to invert the sense of both VBIASEN and FPEN
signals comming from the SM501.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the SuperH Mobile LCDC frame buffer driver V2, adding support for
the LCDC block found in SuperH Mobile processors. The hardware supports
up to two LCD panels per LCDC block, and both RGB and SYS interfaces can
be used to hook up LCD panels/modules.
The device driver is a regular platform driver, so LCD configuration and
board specific hooks are passed to the driver using platform data. LCD
modules using SYS interface often require special configuration using the
SYS bus, and to solve this cleanly the driver provides SYS interface
operations to the board code.
Tested on sh7723 and sh7722 processors with a SYS16A QVGA panel and WVGA
panels using RGB16 and RGB18 interfaces.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Reviewed-by: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Manage atmel_lcdfb FIFO underflow
Resetting the LCD and DMA allows to fix screen shifting after a FIFO
underflow. It follows reset sequence from errata "LCD Screen Shifting
After a Reset".
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Andrew Victor <linux@maxim.org.za>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add imageblit acceleration for the Blade3D family of cores. The code is
based on code from the cyblafb driver.
It is a step toward assimilating back the cyblafb driver into the
tridentfb driver. The cyblafb driver handles a subfamily of the Trident
Blade3d cores.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch contains general source code improvments:
- more simple functions are inline
- removes some meaningless output and the VERSION
string as it is no use
- eng_par is moved into the tridentfb_par
- removed small section of code for CyberBladeXPAi1
which is maybe right for only one resolution
and refresh rate and is probably redundant now
- other minor improvements
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch replaces deprecated constant FB_ACCELF_TEXT with
FBINFO_HWACCEL_DISABLED and adds constants for Trident families of
accelerators.
The FBINFO_HWACCEL_DISABLED is correctly used so noaccel parameter works
now.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch brings various acceleration improvements:
- set copyarea/fillrect for non-accelerated framebuffer (fix)
- remove 15 bpp depth handling to simplify code as it hardly
works (15 bpp handling was obviously missing in some switches)
- add fb_sync call and move waiting before accelerated function
to make acceleration more asynchronous to cpu (few % of speed
improvement)
- add cpu_relax() call in waiting loops
- make longer register names and name more registers
- move registers' definition to header
- general code improvements (shortening, simplifying)
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for TGUI 9440 chip.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Improved values for some registers after Xorg Trident driver. The main
problem was that values set by BIOS have been ignored.
This patch completely remove random pixels ("snow") on the TGUI 9680 and
9440 (not supported yet by the driver). It does not help with the "snow"
on 3DImage and Blade3D cards.
There is also small improvement in timing calculations (hblank start and
vblank start)
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make use of functions and constants from the vga.h header to compact the code
and make it more readable.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch converts the is_blade() and is_xp() macros into local functions.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch moves flat panel indicator into tridentfb_par structure and removes
related global variables and macros.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This updates <linux/bcd.h> to define the key routines as constant
functions, which the macros will then call. Newer code can now call
bcd2bin() instead of SCREAMING BCD2BIN() TO THE FOUR WINDS.
This lets each driver shrink their codespace by using N function calls to
a single (global) copy of those routines, instead of N inlined copies of
these functions per driver.
These routines aren't used in speed-critical code. Almost all callers are
in the RTC framework. Typical per-driver savings is near 300 bytes.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Adrian Bunk <bunk@kernel.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Support the Dallas/Maxim DS1305 and DS1306 RTC chips. These use SPI, and
support alarms, NVRAM, and a trickle charger for use when their backup
power supply is a supercap or rechargeable cell.
This basic driver doesn't yet support suspend/resume or wakealarms.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove implicit use of BKL in ioctl() from the RTC framework.
Instead, the rtc->ops_lock is used. That's the same lock that already
protects the RTC operations when they're issued through the exported
rtc_*() calls in drivers/rtc/interface.c ... making this a bugfix, not
just a cleanup, since both ioctl calls and set_alarm() need to update IRQ
enable flags and that implies a common lock (which RTC drivers as a rule
do not provide on their own).
A new comment at the declaration of "struct rtc_class_ops" summarizes
current locking rules. It's not clear to me that the exceptions listed
there should exist ... if not, those are pre-existing problems which can
be fixed in a patch that doesn't relate to BKL removal.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Jonathan Corbet <corbet@lwn.net>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The ioctls AUTOFS_IOC_TOGGLEREGHOST and AUTOFS_IOC_ASKREGHOST were added
several years ago but what they were intended for has never been
implemented (as far as I'm aware noone uses them) so remove them.
Signed-off-by: Ian Kent <raven@themaw.net>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the Au1550 resource table and instead extract MMIO/IRQ/DMA
resources from platform resource information like any well-behaved
platform driver.
Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Signed-off-by: Jan Nikitenko <jan.nikitenko@gmail.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, 'modalias' in the spi_device structure is a 'const char *'.
The spi_new_device() function fills in the modalias value from a passed in
spi_board_info data block. Since it is a pointer copy, the new spi_device
remains dependent on the spi_board_info structure after the new spi_device
is registered (no other fields in spi_device directly depend on the
spi_board_info structure; all of the other data is copied).
This causes a problem when dynamically propulating the list of attached
SPI devices. For example, in arch/powerpc, the list of SPI devices can be
populated from data in the device tree. With the current code, the device
tree adapter must kmalloc() a new spi_board_info structure for each new
SPI device it finds in the device tree, and there is no simple mechanism
in place for keeping track of these allocations.
This patch changes modalias from a 'const char *' to a fixed char array.
By copying the modalias string instead of referencing it, the dependency
on the spi_board_info structure is eliminated and an outside caller does
not need to maintain a separate spi_board_info allocation for each device.
If searched through the code to the best of my ability for any references
to modalias which may be affected by this change and haven't found
anything. It has been tested with the lite5200b platform in arch/powerpc.
[dbrownell@users.sourceforge.net: cope with linux-next changes: KOBJ_NAME_LEN obliterated, etc]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds O_NONBLOCK support to pipe2. It is minimally more involved
than the patches for eventfd et.al but still trivial. The interfaces of the
create_write_pipe and create_read_pipe helper functions were changed and the
one other caller as well.
The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#ifndef __NR_pipe2
# ifdef __x86_64__
# define __NR_pipe2 293
# elif defined __i386__
# define __NR_pipe2 331
# else
# error "need __NR_pipe2"
# endif
#endif
int
main (void)
{
int fds[2];
if (syscall (__NR_pipe2, fds, 0) == -1)
{
puts ("pipe2(0) failed");
return 1;
}
for (int i = 0; i < 2; ++i)
{
int fl = fcntl (fds[i], F_GETFL);
if (fl == -1)
{
puts ("fcntl failed");
return 1;
}
if (fl & O_NONBLOCK)
{
printf ("pipe2(0) set non-blocking mode for fds[%d]\n", i);
return 1;
}
close (fds[i]);
}
if (syscall (__NR_pipe2, fds, O_NONBLOCK) == -1)
{
puts ("pipe2(O_NONBLOCK) failed");
return 1;
}
for (int i = 0; i < 2; ++i)
{
int fl = fcntl (fds[i], F_GETFL);
if (fl == -1)
{
puts ("fcntl failed");
return 1;
}
if ((fl & O_NONBLOCK) == 0)
{
printf ("pipe2(O_NONBLOCK) does not set non-blocking mode for fds[%d]\n", i);
return 1;
}
close (fds[i]);
}
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch introduces the new syscall inotify_init1 (note: the 1 stands for
the one parameter the syscall takes, as opposed to no parameter before). The
values accepted for this parameter are function-specific and defined in the
inotify.h header. Here the values must match the O_* flags, though. In this
patch CLOEXEC support is introduced.
The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#ifndef __NR_inotify_init1
# ifdef __x86_64__
# define __NR_inotify_init1 294
# elif defined __i386__
# define __NR_inotify_init1 332
# else
# error "need __NR_inotify_init1"
# endif
#endif
#define IN_CLOEXEC O_CLOEXEC
int
main (void)
{
int fd;
fd = syscall (__NR_inotify_init1, 0);
if (fd == -1)
{
puts ("inotify_init1(0) failed");
return 1;
}
int coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if (coe & FD_CLOEXEC)
{
puts ("inotify_init1(0) set close-on-exit");
return 1;
}
close (fd);
fd = syscall (__NR_inotify_init1, IN_CLOEXEC);
if (fd == -1)
{
puts ("inotify_init1(IN_CLOEXEC) failed");
return 1;
}
coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if ((coe & FD_CLOEXEC) == 0)
{
puts ("inotify_init1(O_CLOEXEC) does not set close-on-exit");
return 1;
}
close (fd);
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[akpm@linux-foundation.org: add sys_ni stub]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch introduces the new syscall pipe2 which is like pipe but it also
takes an additional parameter which takes a flag value. This patch implements
the handling of O_CLOEXEC for the flag. I did not add support for the new
syscall for the architectures which have a special sys_pipe implementation. I
think the maintainers of those archs have the chance to go with the unified
implementation but that's up to them.
The implementation introduces do_pipe_flags. I did that instead of changing
all callers of do_pipe because some of the callers are written in assembler.
I would probably screw up changing the assembly code. To avoid breaking code
do_pipe is now a small wrapper around do_pipe_flags. Once all callers are
changed over to do_pipe_flags the old do_pipe function can be removed.
The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#ifndef __NR_pipe2
# ifdef __x86_64__
# define __NR_pipe2 293
# elif defined __i386__
# define __NR_pipe2 331
# else
# error "need __NR_pipe2"
# endif
#endif
int
main (void)
{
int fd[2];
if (syscall (__NR_pipe2, fd, 0) != 0)
{
puts ("pipe2(0) failed");
return 1;
}
for (int i = 0; i < 2; ++i)
{
int coe = fcntl (fd[i], F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if (coe & FD_CLOEXEC)
{
printf ("pipe2(0) set close-on-exit for fd[%d]\n", i);
return 1;
}
}
close (fd[0]);
close (fd[1]);
if (syscall (__NR_pipe2, fd, O_CLOEXEC) != 0)
{
puts ("pipe2(O_CLOEXEC) failed");
return 1;
}
for (int i = 0; i < 2; ++i)
{
int coe = fcntl (fd[i], F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if ((coe & FD_CLOEXEC) == 0)
{
printf ("pipe2(O_CLOEXEC) does not set close-on-exit for fd[%d]\n", i);
return 1;
}
}
close (fd[0]);
close (fd[1]);
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds the new dup3 syscall. It extends the old dup2 syscall by one
parameter which is meant to hold a flag value. Support for the O_CLOEXEC flag
is added in this patch.
The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>
#ifndef __NR_dup3
# ifdef __x86_64__
# define __NR_dup3 292
# elif defined __i386__
# define __NR_dup3 330
# else
# error "need __NR_dup3"
# endif
#endif
int
main (void)
{
int fd = syscall (__NR_dup3, 1, 4, 0);
if (fd == -1)
{
puts ("dup3(0) failed");
return 1;
}
int coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if (coe & FD_CLOEXEC)
{
puts ("dup3(0) set close-on-exec flag");
return 1;
}
close (fd);
fd = syscall (__NR_dup3, 1, 4, O_CLOEXEC);
if (fd == -1)
{
puts ("dup3(O_CLOEXEC) failed");
return 1;
}
coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if ((coe & FD_CLOEXEC) == 0)
{
puts ("dup3(O_CLOEXEC) set close-on-exec flag");
return 1;
}
close (fd);
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds the new epoll_create2 syscall. It extends the old epoll_create
syscall by one parameter which is meant to hold a flag value. In this
patch the only flag support is EPOLL_CLOEXEC which causes the close-on-exec
flag for the returned file descriptor to be set.
A new name EPOLL_CLOEXEC is introduced which in this implementation must
have the same value as O_CLOEXEC.
The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>
#ifndef __NR_epoll_create2
# ifdef __x86_64__
# define __NR_epoll_create2 291
# elif defined __i386__
# define __NR_epoll_create2 329
# else
# error "need __NR_epoll_create2"
# endif
#endif
#define EPOLL_CLOEXEC O_CLOEXEC
int
main (void)
{
int fd = syscall (__NR_epoll_create2, 1, 0);
if (fd == -1)
{
puts ("epoll_create2(0) failed");
return 1;
}
int coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if (coe & FD_CLOEXEC)
{
puts ("epoll_create2(0) set close-on-exec flag");
return 1;
}
close (fd);
fd = syscall (__NR_epoll_create2, 1, EPOLL_CLOEXEC);
if (fd == -1)
{
puts ("epoll_create2(EPOLL_CLOEXEC) failed");
return 1;
}
coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if ((coe & FD_CLOEXEC) == 0)
{
puts ("epoll_create2(EPOLL_CLOEXEC) set close-on-exec flag");
return 1;
}
close (fd);
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The timerfd_create syscall already has a flags parameter. It just is
unused so far. This patch changes this by introducing the TFD_CLOEXEC
flag to set the close-on-exec flag for the returned file descriptor.
A new name TFD_CLOEXEC is introduced which in this implementation must
have the same value as O_CLOEXEC.
The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>
#ifndef __NR_timerfd_create
# ifdef __x86_64__
# define __NR_timerfd_create 283
# elif defined __i386__
# define __NR_timerfd_create 322
# else
# error "need __NR_timerfd_create"
# endif
#endif
#define TFD_CLOEXEC O_CLOEXEC
int
main (void)
{
int fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, 0);
if (fd == -1)
{
puts ("timerfd_create(0) failed");
return 1;
}
int coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if (coe & FD_CLOEXEC)
{
puts ("timerfd_create(0) set close-on-exec flag");
return 1;
}
close (fd);
fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, TFD_CLOEXEC);
if (fd == -1)
{
puts ("timerfd_create(TFD_CLOEXEC) failed");
return 1;
}
coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if ((coe & FD_CLOEXEC) == 0)
{
puts ("timerfd_create(TFD_CLOEXEC) set close-on-exec flag");
return 1;
}
close (fd);
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds the new eventfd2 syscall. It extends the old eventfd
syscall by one parameter which is meant to hold a flag value. In this
patch the only flag support is EFD_CLOEXEC which causes the close-on-exec
flag for the returned file descriptor to be set.
A new name EFD_CLOEXEC is introduced which in this implementation must
have the same value as O_CLOEXEC.
The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#ifndef __NR_eventfd2
# ifdef __x86_64__
# define __NR_eventfd2 290
# elif defined __i386__
# define __NR_eventfd2 328
# else
# error "need __NR_eventfd2"
# endif
#endif
#define EFD_CLOEXEC O_CLOEXEC
int
main (void)
{
int fd = syscall (__NR_eventfd2, 1, 0);
if (fd == -1)
{
puts ("eventfd2(0) failed");
return 1;
}
int coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if (coe & FD_CLOEXEC)
{
puts ("eventfd2(0) sets close-on-exec flag");
return 1;
}
close (fd);
fd = syscall (__NR_eventfd2, 1, EFD_CLOEXEC);
if (fd == -1)
{
puts ("eventfd2(EFD_CLOEXEC) failed");
return 1;
}
coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if ((coe & FD_CLOEXEC) == 0)
{
puts ("eventfd2(EFD_CLOEXEC) does not set close-on-exec flag");
return 1;
}
close (fd);
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[akpm@linux-foundation.org: add sys_ni stub]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds the new signalfd4 syscall. It extends the old signalfd
syscall by one parameter which is meant to hold a flag value. In this
patch the only flag support is SFD_CLOEXEC which causes the close-on-exec
flag for the returned file descriptor to be set.
A new name SFD_CLOEXEC is introduced which in this implementation must
have the same value as O_CLOEXEC.
The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#ifndef __NR_signalfd4
# ifdef __x86_64__
# define __NR_signalfd4 289
# elif defined __i386__
# define __NR_signalfd4 327
# else
# error "need __NR_signalfd4"
# endif
#endif
#define SFD_CLOEXEC O_CLOEXEC
int
main (void)
{
sigset_t ss;
sigemptyset (&ss);
sigaddset (&ss, SIGUSR1);
int fd = syscall (__NR_signalfd4, -1, &ss, 8, 0);
if (fd == -1)
{
puts ("signalfd4(0) failed");
return 1;
}
int coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if (coe & FD_CLOEXEC)
{
puts ("signalfd4(0) set close-on-exec flag");
return 1;
}
close (fd);
fd = syscall (__NR_signalfd4, -1, &ss, 8, SFD_CLOEXEC);
if (fd == -1)
{
puts ("signalfd4(SFD_CLOEXEC) failed");
return 1;
}
coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if ((coe & FD_CLOEXEC) == 0)
{
puts ("signalfd4(SFD_CLOEXEC) does not set close-on-exec flag");
return 1;
}
close (fd);
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[akpm@linux-foundation.org: add sys_ni stub]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch just extends the anon_inode_getfd interface to take an additional
parameter with a flag value. The flag value is passed on to
get_unused_fd_flags in anticipation for a use with the O_CLOEXEC flag.
No actual semantic changes here, the changed callers all pass 0 for now.
[akpm@linux-foundation.org: KVM fix]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some platforms do not have support to restore the signal mask in the
return path from a syscall. For those platforms syscalls like pselect are
not defined at all. This is, I think, not a good choice for paccept()
since paccept() adds more value on top of accept() than just the signal
mask handling.
Therefore this patch defines a scaled down version of the sys_paccept
function for those platforms. It returns -EINVAL in case the signal mask
is non-NULL but behaves the same otherwise.
Note that I explicitly included <linux/thread_info.h>. I saw that it is
currently included but indirectly two levels down. There is too much risk
in relying on this. The header might change and then suddenly the
function definition would change without anyone immediately noticing.
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch is by far the most complex in the series. It adds a new syscall
paccept. This syscall differs from accept in that it adds (at the userlevel)
two additional parameters:
- a signal mask
- a flags value
The flags parameter can be used to set flag like SOCK_CLOEXEC. This is
imlpemented here as well. Some people argued that this is a property which
should be inherited from the file desriptor for the server but this is against
POSIX. Additionally, we really want the signal mask parameter as well
(similar to pselect, ppoll, etc). So an interface change in inevitable.
The flag value is the same as for socket and socketpair. I think diverging
here will only create confusion. Similar to the filesystem interfaces where
the use of the O_* constants differs, it is acceptable here.
The signal mask is handled as for pselect etc. The mask is temporarily
installed for the thread and removed before the call returns. I modeled the
code after pselect. If there is a problem it's likely also in pselect.
For architectures which use socketcall I maintained this interface instead of
adding a system call. The symmetry shouldn't be broken.
The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#ifndef __NR_paccept
# ifdef __x86_64__
# define __NR_paccept 288
# elif defined __i386__
# define SYS_PACCEPT 18
# define USE_SOCKETCALL 1
# else
# error "need __NR_paccept"
# endif
#endif
#ifdef USE_SOCKETCALL
# define paccept(fd, addr, addrlen, mask, flags) \
({ long args[6] = { \
(long) fd, (long) addr, (long) addrlen, (long) mask, 8, (long) flags }; \
syscall (__NR_socketcall, SYS_PACCEPT, args); })
#else
# define paccept(fd, addr, addrlen, mask, flags) \
syscall (__NR_paccept, fd, addr, addrlen, mask, 8, flags)
#endif
#define PORT 57392
#define SOCK_CLOEXEC O_CLOEXEC
static pthread_barrier_t b;
static void *
tf (void *arg)
{
pthread_barrier_wait (&b);
int s = socket (AF_INET, SOCK_STREAM, 0);
struct sockaddr_in sin;
sin.sin_family = AF_INET;
sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
sin.sin_port = htons (PORT);
connect (s, (const struct sockaddr *) &sin, sizeof (sin));
close (s);
pthread_barrier_wait (&b);
s = socket (AF_INET, SOCK_STREAM, 0);
sin.sin_port = htons (PORT);
connect (s, (const struct sockaddr *) &sin, sizeof (sin));
close (s);
pthread_barrier_wait (&b);
pthread_barrier_wait (&b);
sleep (2);
pthread_kill ((pthread_t) arg, SIGUSR1);
return NULL;
}
static void
handler (int s)
{
}
int
main (void)
{
pthread_barrier_init (&b, NULL, 2);
struct sockaddr_in sin;
pthread_t th;
if (pthread_create (&th, NULL, tf, (void *) pthread_self ()) != 0)
{
puts ("pthread_create failed");
return 1;
}
int s = socket (AF_INET, SOCK_STREAM, 0);
int reuse = 1;
setsockopt (s, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof (reuse));
sin.sin_family = AF_INET;
sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
sin.sin_port = htons (PORT);
bind (s, (struct sockaddr *) &sin, sizeof (sin));
listen (s, SOMAXCONN);
pthread_barrier_wait (&b);
int s2 = paccept (s, NULL, 0, NULL, 0);
if (s2 < 0)
{
puts ("paccept(0) failed");
return 1;
}
int coe = fcntl (s2, F_GETFD);
if (coe & FD_CLOEXEC)
{
puts ("paccept(0) set close-on-exec-flag");
return 1;
}
close (s2);
pthread_barrier_wait (&b);
s2 = paccept (s, NULL, 0, NULL, SOCK_CLOEXEC);
if (s2 < 0)
{
puts ("paccept(SOCK_CLOEXEC) failed");
return 1;
}
coe = fcntl (s2, F_GETFD);
if ((coe & FD_CLOEXEC) == 0)
{
puts ("paccept(SOCK_CLOEXEC) does not set close-on-exec flag");
return 1;
}
close (s2);
pthread_barrier_wait (&b);
struct sigaction sa;
sa.sa_handler = handler;
sa.sa_flags = 0;
sigemptyset (&sa.sa_mask);
sigaction (SIGUSR1, &sa, NULL);
sigset_t ss;
pthread_sigmask (SIG_SETMASK, NULL, &ss);
sigaddset (&ss, SIGUSR1);
pthread_sigmask (SIG_SETMASK, &ss, NULL);
sigdelset (&ss, SIGUSR1);
alarm (4);
pthread_barrier_wait (&b);
errno = 0 ;
s2 = paccept (s, NULL, 0, &ss, 0);
if (s2 != -1 || errno != EINTR)
{
puts ("paccept did not fail with EINTR");
return 1;
}
close (s);
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[akpm@linux-foundation.org: make it compile]
[akpm@linux-foundation.org: add sys_ni stub]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Roland McGrath <roland@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds support for flag values which are ORed to the type passwd
to socket and socketpair. The additional code is minimal. The flag
values in this implementation can and must match the O_* flags. This
avoids overhead in the conversion.
The internal functions sock_alloc_fd and sock_map_fd get a new parameters
and all callers are changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>
#define PORT 57392
/* For Linux these must be the same. */
#define SOCK_CLOEXEC O_CLOEXEC
int
main (void)
{
int fd;
fd = socket (PF_INET, SOCK_STREAM, 0);
if (fd == -1)
{
puts ("socket(0) failed");
return 1;
}
int coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if (coe & FD_CLOEXEC)
{
puts ("socket(0) set close-on-exec flag");
return 1;
}
close (fd);
fd = socket (PF_INET, SOCK_STREAM|SOCK_CLOEXEC, 0);
if (fd == -1)
{
puts ("socket(SOCK_CLOEXEC) failed");
return 1;
}
coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if ((coe & FD_CLOEXEC) == 0)
{
puts ("socket(SOCK_CLOEXEC) does not set close-on-exec flag");
return 1;
}
close (fd);
int fds[2];
if (socketpair (PF_UNIX, SOCK_STREAM, 0, fds) == -1)
{
puts ("socketpair(0) failed");
return 1;
}
for (int i = 0; i < 2; ++i)
{
coe = fcntl (fds[i], F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if (coe & FD_CLOEXEC)
{
printf ("socketpair(0) set close-on-exec flag for fds[%d]\n", i);
return 1;
}
close (fds[i]);
}
if (socketpair (PF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0, fds) == -1)
{
puts ("socketpair(SOCK_CLOEXEC) failed");
return 1;
}
for (int i = 0; i < 2; ++i)
{
coe = fcntl (fds[i], F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if ((coe & FD_CLOEXEC) == 0)
{
printf ("socketpair(SOCK_CLOEXEC) does not set close-on-exec flag for fds[%d]\n", i);
return 1;
}
close (fds[i]);
}
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Trying to compile the v850 port brings many compile errors, one of them exists
since at least kernel 2.6.19.
There also seems to be noone willing to bring this port back into a usable
state.
This patch therefore removes the v850 port.
If anyone ever decides to revive the v850 port the code will still be
available from older kernels, and it wouldn't be impossible for the port to
reenter the kernel if it would become actively maintained again.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Make some variables and functions static, since they don't need to be
global.
- Remove an unused function - arch/um/kernel/time.c::sched_clock().
- Clean the style a bit as complained by checkpatch.pl.
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: WANG Cong <wangcong@zeuux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Remove arch_validate(), because no one uses it.
- Remove useless macro HAVE_ARCH_VALIDATE.
- Make the variable 'empty_bad_page' static.
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: WANG Cong <wangcong@zeuux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
global_flush_tlb is declared but never used.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Cc: Mikael Starvik <starvik@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mn10300 was the only architecture where sg_dma_{address,len}() were not
in asm/scatterlist.h, and it's not a big surprise that this caused a
compile error somewhere:
/home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/media/video/videobuf-dma-sg.c: In function `videobuf_dma_map':
/home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/media/video/videobuf-dma-sg.c:238: error: implicit declaration of function 'sg_dma_address'
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ACPI defines a hardware signature. BIOS calculates the signature according to
hardware configure and if hardware changes while hibernated, the signature
will change. In that case, S4 resume should fail.
Still, there may be systems on which this mechanism does not work correctly,
so it is better to provide a workaround for them. For this reason, add a new
switch to the acpi_sleep= command line argument allowing one to disable
hardware signature checking.
[shaohua.li@intel.com: build fix]
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Len Brown <lenb@kernel.org>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: <Valdis.Kletnieks@vt.edu>
Cc: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This interface allows adding a job on a specific cpu.
Although a work struct on a cpu will be scheduled to other cpu if the cpu
dies, there is a recursion if a work task tries to offline the cpu it's
running on. we need to schedule the task to a specific cpu in this case.
http://bugzilla.kernel.org/show_bug.cgi?id=10897
[oleg@tv-sign.ru: cleanups]
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Rus <harbour@sfinx.od.ua>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch (as1112) adds some new PM_EVENT_* codes for use by kernel
subsystems. They describe runtime power-state transitions of the sort already
implemented by the USB subsystem.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Drop unnecessary includes from include/linux/pm.h .
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the remaining obsolete definitions from include/linux/pm.h and move
the definitions of PM_SUSPEND and PM_RESUME to the header of h3600 which
is the only user of them.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the definition of 'struct pm_dev', which is not used any more,
along with some related stuff from include/linux/pm.h .
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the obsolete and no longer used include/linux/pm_legacy.h
Reviewed-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Pavel Machek <pavel@suse.cz>
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch removes the unused include/asm-h8300/keyboard.h
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Why would linux/security.h need forward declarations for nfsctl_arg and
swap_info_struct? It's hard to imagine: remove them.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When cap_bset suppresses some of the forced (fP) capabilities of a file,
it is generally only safe to execute the program if it understands how to
recognize it doesn't have enough privilege to work correctly. For legacy
applications (fE!=0), which have no non-destructive way to determine that
they are missing privilege, we fail to execute (EPERM) any executable that
requires fP capabilities, but would otherwise get pP' < fP. This is a
fail-safe permission check.
For some discussion of why it is problematic for (legacy) privileged
applications to run with less than the set of capabilities requested for
them, see:
http://userweb.kernel.org/~morgan/sendmail-capabilities-war-story.html
With this iteration of this support, we do not include setuid-0 based
privilege protection from the bounding set. That is, the admin can still
(ab)use the bounding set to suppress the privileges of a setuid-0 program.
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We'd like to support CONFIG_MEMORY_HOTREMOVE on s390, which depends on
CONFIG_MIGRATION. So far, CONFIG_MIGRATION is only available with NUMA
support.
This patch makes CONFIG_MIGRATION selectable for architectures that define
ARCH_ENABLE_MEMORY_HOTREMOVE. When MIGRATION is enabled w/o NUMA, the
kernel won't compile because migrate_vmas() does not know about
vm_ops->migrate() and vma_migratable() does not know about policy_zone.
To fix this, those two functions can be restricted to '#ifdef CONFIG_NUMA'
because they are not being used w/o NUMA. vma_migratable() is moved over
from migrate.h to mempolicy.h.
[kosaki.motohiro@jp.fujitsu.com: build fix]
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: KOSAKI Motorhiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While in all cases in the kernel we know the size of the elements to be
created, we don't always know the count of elements. By commuting the size
and count in the overflow check, the compiler can reduce the runtime division
of size_t with a compare to a (unique) constant in these cases.
Signed-off-by: Milton Miller <miltonm@bga.com>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Memory may be hot-removed on a per-memory-block basis, particularly on
POWER where the SPARSEMEM section size often matches the memory-block
size. A user-level agent must be able to identify which sections of
memory are likely to be removable before attempting the potentially
expensive operation. This patch adds a file called "removable" to the
memory directory in sysfs to help such an agent. In this patch, a memory
block is considered removable if;
o It contains only MOVABLE pageblocks
o It contains only pageblocks with free pages regardless of pageblock type
On the other hand, a memory block starting with a PageReserved() page will
never be considered removable. Without this patch, the user-agent is
forced to choose a memory block to remove randomly.
Sample output of the sysfs files:
./memory/memory0/removable: 0
./memory/memory1/removable: 0
./memory/memory2/removable: 0
./memory/memory3/removable: 0
./memory/memory4/removable: 0
./memory/memory5/removable: 0
./memory/memory6/removable: 0
./memory/memory7/removable: 1
./memory/memory8/removable: 0
./memory/memory9/removable: 0
./memory/memory10/removable: 0
./memory/memory11/removable: 0
./memory/memory12/removable: 0
./memory/memory13/removable: 0
./memory/memory14/removable: 0
./memory/memory15/removable: 0
./memory/memory16/removable: 0
./memory/memory17/removable: 1
./memory/memory18/removable: 1
./memory/memory19/removable: 1
./memory/memory20/removable: 1
./memory/memory21/removable: 1
./memory/memory22/removable: 1
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On 32-bit architectures PAGE_ALIGN() truncates 64-bit values to the 32-bit
boundary. For example:
u64 val = PAGE_ALIGN(size);
always returns a value < 4GB even if size is greater than 4GB.
The problem resides in PAGE_MASK definition (from include/asm-x86/page.h for
example):
#define PAGE_SHIFT 12
#define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT)
#define PAGE_MASK (~(PAGE_SIZE-1))
...
#define PAGE_ALIGN(addr) (((addr)+PAGE_SIZE-1)&PAGE_MASK)
The "~" is performed on a 32-bit value, so everything in "and" with
PAGE_MASK greater than 4GB will be truncated to the 32-bit boundary.
Using the ALIGN() macro seems to be the right way, because it uses
typeof(addr) for the mask.
Also move the PAGE_ALIGN() definitions out of include/asm-*/page.h in
include/linux/mm.h.
See also lkml discussion: http://lkml.org/lkml/2008/6/11/237
[akpm@linux-foundation.org: fix drivers/media/video/uvc/uvc_queue.c]
[akpm@linux-foundation.org: fix v850]
[akpm@linux-foundation.org: fix powerpc]
[akpm@linux-foundation.org: fix arm]
[akpm@linux-foundation.org: fix mips]
[akpm@linux-foundation.org: fix drivers/media/video/pvrusb2/pvrusb2-dvb.c]
[akpm@linux-foundation.org: fix drivers/mtd/maps/uclinux.c]
[akpm@linux-foundation.org: fix powerpc]
Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
alloc_pages_exact() is similar to alloc_pages(), except that it allocates
the minimum number of pages to fulfill the request. This is useful if you
want to allocate a very large buffer that is slightly larger than an even
power-of-two number of pages. In that case, alloc_pages() will waste a
lot of memory.
I have a video driver that wants to allocate a 5MB buffer. alloc_pages()
wiill waste 3MB of physically-contiguous memory.
Signed-off-by: Timur Tabi <timur@freescale.com>
Cc: Andi Kleen <andi@firstfloor.org>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Almost all users of this field need a PFN instead of a physical address,
so replace node_boot_start with node_min_pfn.
[Lee.Schermerhorn@hp.com: fix spurious BUG_ON() in mark_bootmem()]
Signed-off-by: Johannes Weiner <hannes@saeureba.de>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
alloc_bootmem_core has become quite nasty to read over time. This is a
clean rewrite that keeps the semantics.
bdata->last_pos has been dropped.
bdata->last_success has been renamed to hint_idx and it is now an index
relative to the node's range. Since further block searching might start
at this index, it is now set to the end of a succeeded allocation rather
than its beginning.
bdata->last_offset has been renamed to last_end_off to be more clear that
it represents the ending address of the last allocation relative to the
node.
[y-goto@jp.fujitsu.com: fix new alloc_bootmem_core()]
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This only reorders functions so that further patches will be easier to
read. No code changed.
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Instead of using the variable mmu_huge_psize to keep track of the huge
page size we use an array of MMU_PAGE_* values. For each supported huge
page size we need to know the hugepte_shift value and have a
pgtable_cache. The hstate or an mmu_huge_psizes index is passed to
functions so that they know which huge page size they should use.
The hugepage sizes 16M and 64K are setup(if available on the hardware) so
that they don't have to be set on the boot cmd line in order to use them.
The number of 16G pages have to be specified at boot-time though (e.g.
hugepagesz=16G hugepages=5).
Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The huge page size is defined for 16G pages. If a hugepagesz of 16G is
specified at boot-time then it becomes the huge page size instead of the
default 16M.
The change in pgtable-64K.h is to the macro pte_iterate_hashed_subpages to
make the increment to va (the 1 being shifted) be a long so that it is not
shifted to 0. Otherwise it would create an infinite loop when the shift
value is for a 16G page (when base page size is 64K).
Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The 16G huge pages have to be reserved in the HMC prior to boot. The
location of the pages are placed in the device tree. This patch adds code
to scan the device tree during very early boot and save these page
locations until hugetlbfs is ready for them.
Acked-by: Adam Litke <agl@us.ibm.com>
Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow alloc_bootmem_huge_page() to be overridden by architectures that
can't always use bootmem. This requires huge_boot_pages to be available
for use by this function.
This is required for powerpc 16G pages, which have to be reserved prior to
boot-time. The location of these pages are indicated in the device tree.
Acked-by: Adam Litke <agl@us.ibm.com>
Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add an hugepagesz=... option similar to IA64, PPC etc. to x86-64.
This finally allows to select GB pages for hugetlbfs in x86 now that all
the infrastructure is in place.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Straight forward extensions for huge pages located in the PUD instead of
PMDs.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Straight forward variant of the existing __alloc_bootmem_node, only
subsequent patch when allocating giant hugepages at boot -- don't want to
panic if we can't allocate as many as the user asked for.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Provide new hugepages user APIs that are more suited to multiple hstates
in sysfs. There is a new directory, /sys/kernel/hugepages. Underneath
that directory there will be a directory per-supported hugepage size,
e.g.:
/sys/kernel/hugepages/hugepages-64kB
/sys/kernel/hugepages/hugepages-16384kB
/sys/kernel/hugepages/hugepages-16777216kB
corresponding to 64k, 16m and 16g respectively. Within each
hugepages-size directory there are a number of files, corresponding to the
tracked counters in the hstate, e.g.:
/sys/kernel/hugepages/hugepages-64/nr_hugepages
/sys/kernel/hugepages/hugepages-64/nr_overcommit_hugepages
/sys/kernel/hugepages/hugepages-64/free_hugepages
/sys/kernel/hugepages/hugepages-64/resv_hugepages
/sys/kernel/hugepages/hugepages-64/surplus_hugepages
Of these files, the first two are read-write and the latter three are
read-only. The size of the hugepage being manipulated is trivially
deducible from the enclosing directory and is always expressed in kB (to
match meminfo).
[dave@linux.vnet.ibm.com: fix build]
[nacc@us.ibm.com: hugetlb: hang off of /sys/kernel/mm rather than /sys/kernel]
[nacc@us.ibm.com: hugetlb: remove CONFIG_SYSFS dependency]
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add the ability to configure the hugetlb hstate used on a per mount basis.
- Add a new pagesize= option to the hugetlbfs mount that allows setting
the page size
- This option causes the mount code to find the hstate corresponding to the
specified size, and sets up a pointer to the hstate in the mount's
superblock.
- Change the hstate accessors to use this information rather than the
global_hstate they were using (requires a slight change in mm/memory.c
so we don't NULL deref in the error-unmap path -- see comments).
[np: take hstate out of hugetlbfs inode and vma->vm_private_data]
Acked-by: Adam Litke <agl@us.ibm.com>
Acked-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add basic support for more than one hstate in hugetlbfs. This is the key
to supporting multiple hugetlbfs page sizes at once.
- Rather than a single hstate, we now have an array, with an iterator
- default_hstate continues to be the struct hstate which we use by default
- Add functions for architectures to register new hstates
[akpm@linux-foundation.org: coding-style fixes]
Acked-by: Adam Litke <agl@us.ibm.com>
Acked-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The goal of this patchset is to support multiple hugetlb page sizes. This
is achieved by introducing a new struct hstate structure, which
encapsulates the important hugetlb state and constants (eg. huge page
size, number of huge pages currently allocated, etc).
The hstate structure is then passed around the code which requires these
fields, they will do the right thing regardless of the exact hstate they
are operating on.
This patch adds the hstate structure, with a single global instance of it
(default_hstate), and does the basic work of converting hugetlb to use the
hstate.
Future patches will add more hstate structures to allow for different
hugetlbfs mounts to have different page sizes.
[akpm@linux-foundation.org: coding-style fixes]
Acked-by: Adam Litke <agl@us.ibm.com>
Acked-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a kobject to create /sys/kernel/mm when sysfs is mounted. The kobject
will exist regardless. This will allow for the hugepage related sysfs
directories to exist under the mm "subsystem" directory. Add an ABI file
appropriately.
[kosaki.motohiro@jp.fujitsu.com: fix build]
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With Mel's hugetlb private reservation support patches applied, strict
overcommit semantics are applied to both shared and private huge page
mappings. This can be a problem if an application relied on unlimited
overcommit semantics for private mappings. An example of this would be an
application which maps a huge area with the intention of using it very
sparsely. These application would benefit from being able to opt-out of
the strict overcommit. It should be noted that prior to hugetlb
supporting demand faulting all mappings were fully populated and so
applications of this type should be rare.
This patch stack implements the MAP_NORESERVE mmap() flag for huge page
mappings. This flag has the same meaning as for small page mappings,
suppressing reservations for that mapping.
Thanks to Mel Gorman for reviewing a number of early versions of these
patches.
This patch:
When a small page mapping is created with mmap() reservations are created
by default for any memory pages required. When the region is read/write
the reservation is increased for every page, no reservation is needed for
read-only regions (as they implicitly share the zero page). Reservations
are tracked via the VM_ACCOUNT vma flag which is present when the region
has reservation backing it. When we convert a region from read-only to
read-write new reservations are aquired and VM_ACCOUNT is set. However,
when a read-only map is created with MAP_NORESERVE it is indistinguishable
from a normal mapping. When we then convert that to read/write we are
forced to incorrectly create reservations for it as we have no record of
the original MAP_NORESERVE.
This patch introduces a new vma flag VM_NORESERVE which records the
presence of the original MAP_NORESERVE flag. This allows us to
distinguish these two circumstances and correctly account the reserve.
As well as fixing this FIXME in the code, this makes it much easier to
introduce MAP_NORESERVE support for huge pages as this flag is available
consistantly for the life of the mapping. VM_ACCOUNT on the other hand is
heavily used at the generic level in association with small pages.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Johannes Weiner <hannes@saeurebad.de>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After patch 2 in this series, a process that successfully calls mmap() for
a MAP_PRIVATE mapping will be guaranteed to successfully fault until a
process calls fork(). At that point, the next write fault from the parent
could fail due to COW if the child still has a reference.
We only reserve pages for the parent but a copy must be made to avoid
leaking data from the parent to the child after fork(). Reserves could be
taken for both parent and child at fork time to guarantee faults but if
the mapping is large it is highly likely we will not have sufficient pages
for the reservation, and it is common to fork only to exec() immediatly
after. A failure here would be very undesirable.
Note that the current behaviour of mainline with MAP_PRIVATE pages is
pretty bad. The following situation is allowed to occur today.
1. Process calls mmap(MAP_PRIVATE)
2. Process calls mlock() to fault all pages and makes sure it succeeds
3. Process forks()
4. Process writes to MAP_PRIVATE mapping while child still exists
5. If the COW fails at this point, the process gets SIGKILLed even though it
had taken care to ensure the pages existed
This patch improves the situation by guaranteeing the reliability of the
process that successfully calls mmap(). When the parent performs COW, it
will try to satisfy the allocation without using reserves. If that fails
the parent will steal the page leaving any children without a page.
Faults from the child after that point will result in failure. If the
child COW happens first, an attempt will be made to allocate the page
without reserves and the child will get SIGKILLed on failure.
To summarise the new behaviour:
1. If the original mapper performs COW on a private mapping with multiple
references, it will attempt to allocate a hugepage from the pool or
the buddy allocator without using the existing reserves. On fail, VMAs
mapping the same area are traversed and the page being COW'd is unmapped
where found. It will then steal the original page as the last mapper in
the normal way.
2. The VMAs the pages were unmapped from are flagged to note that pages
with data no longer exist. Future no-page faults on those VMAs will
terminate the process as otherwise it would appear that data was corrupted.
A warning is printed to the console that this situation occured.
2. If the child performs COW first, it will attempt to satisfy the COW
from the pool if there are enough pages or via the buddy allocator if
overcommit is allowed and the buddy allocator can satisfy the request. If
it fails, the child will be killed.
If the pool is large enough, existing applications will not notice that
the reserves were a factor. Existing applications depending on the
no-reserves been set are unlikely to exist as for much of the history of
hugetlbfs, pages were prefaulted at mmap(), allocating the pages at that
point or failing the mmap().
[npiggin@suse.de: fix CONFIG_HUGETLB=n build]
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch reserves huge pages at mmap() time for MAP_PRIVATE mappings in
a similar manner to the reservations taken for MAP_SHARED mappings. The
reserve count is accounted both globally and on a per-VMA basis for
private mappings. This guarantees that a process that successfully calls
mmap() will successfully fault all pages in the future unless fork() is
called.
The characteristics of private mappings of hugetlbfs files behaviour after
this patch are;
1. The process calling mmap() is guaranteed to succeed all future faults until
it forks().
2. On fork(), the parent may die due to SIGKILL on writes to the private
mapping if enough pages are not available for the COW. For reasonably
reliable behaviour in the face of a small huge page pool, children of
hugepage-aware processes should not reference the mappings; such as
might occur when fork()ing to exec().
3. On fork(), the child VMAs inherit no reserves. Reads on pages already
faulted by the parent will succeed. Successful writes will depend on enough
huge pages being free in the pool.
4. Quotas of the hugetlbfs mount are checked at reserve time for the mapper
and at fault time otherwise.
Before this patch, all reads or writes in the child potentially needs page
allocations that can later lead to the death of the parent. This applies
to reads and writes of uninstantiated pages as well as COW. After the
patch it is only a write to an instantiated page that causes problems.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
free_area_init_node() gets passed in the node id as well as the node
descriptor. This is redundant as the function can trivially get the node
descriptor itself by means of NODE_DATA() and the node's id.
I checked all the users and NODE_DATA() seems to be usable everywhere
from where this function is called.
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is called on a per-page basis and in the vast majority of cases
`error' is zero.
Cc: Guillaume Chazarain <guichaz@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SLOB reuses two page bits for internal purposes, it overlays PG_active and
PG_private. This is hidden away in slob.c. Document these overlays
explicitly in the main page-flags enum along with all the others.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SLUB reuses two page bits for internal purposes, it overlays PG_active and
PG_error. This is hidden away in slub.c. Document these overlays
explicitly in the main page-flags enum along with all the others.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With the recent page flag reorganisation we have a single enum which
defines the valid page flags and their values, nice and clear. However
there are a number of bits which are overloaded by different subsystems.
Firstly there is PG_owner_priv_1 which is used by filesystems and by XEN.
Secondly both SLOB and SLUB use a couple of extra page bits to manage
internal state for pages they own; both overlay other bits. All of these
"aliases" are scattered about the source making it very hard for a reader
to know if the bits are safe to rely on in all contexts; confusion here is
bad.
As we now have a single place where the bits are clearly assigned it makes
sense to clarify the reuse of bits by making the aliases explicit and
visible with the original bit assignments. This patch creates explicit
aliases within the enum itself for the overloaded bits, creates standard
bit accessors PageFoo etc. and uses those throughout.
This version pulls the bit manipulation out to standard named page bit
accessors as suggested by Christoph, it retains the explicit mapping to
the overlayed bits. A fusion of both ideas. This has been SLUB and SLOB
have been compile tested on x86_64 only, and SLUB boot tested. If people
feel this is worth doing then I can run a fuller set of testing.
This patch:
Some page flags are used for more than one purpose, for example
PG_owner_priv_1. Currently there are individual accessors for each user,
each built using the common flag name far away from the bit definitions.
This makes it hard to see all possible uses of these bits.
Now that we have a single enum to generate the bit orders it makes sense
to express overlays in the same place. So create per use aliases for this
bit in the main page-flags enum and use those in the accessors.
[akpm@linux-foundation.org: fix xen]
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[Summary]
Split LRU-list of unused dentries to one per superblock to avoid soft
lock up during NFS mounts and remounting of any filesystem.
Previously I posted here:
http://lkml.org/lkml/2008/3/5/590
[Descriptions]
- background
dentry_unused is a list of dentries which are not referenced.
dentry_unused grows up when references on directories or files are
released. This list can be very long if there is huge free memory.
- the problem
When shrink_dcache_sb() is called, it scans all dentry_unused linearly
under spin_lock(), and if dentry->d_sb is differnt from given
superblock, scan next dentry. This scan costs very much if there are
many entries, and very ineffective if there are many superblocks.
IOW, When we need to shrink unused dentries on one dentry, but scans
unused dentries on all superblocks in the system. For example, we scan
500 dentries to unmount a filesystem, but scans 1,000,000 or more unused
dentries on other superblocks.
In our case , At mounting NFS*, shrink_dcache_sb() is called to shrink
unused dentries on NFS, but scans 100,000,000 unused dentries on
superblocks in the system such as local ext3 filesystems. I hear NFS
mounting took 1 min on some system in use.
* : NFS uses virtual filesystem in rpc layer, so NFS is affected by
this problem.
100,000,000 is possible number on large systems.
Per-superblock LRU of unused dentried can reduce the cost in
reasonable manner.
- How to fix
I found this problem is solved by David Chinner's "Per-superblock
unused dentry LRU lists V3"(1), so I rebase it and add some fix to
reclaim with fairness, which is in Andrew Morton's comments(2).
1) http://lkml.org/lkml/2006/5/25/318
2) http://lkml.org/lkml/2006/5/25/320
Split LRU-list of unused dentries to each superblocks. Then, NFS
mounting will check dentries under a superblock instead of all. But
this spliting will break LRU of dentry-unused. So, I've attempted to
make reclaim unused dentrins with fairness by calculate number of
dentries to scan on this sb based on following way
number of dentries to scan on this sb =
count * (number of dentries on this sb / number of dentries in the machine)
- ToDo
- I have to measuring performance number and do stress tests.
- When unmount occurs during prune_dcache(), scanning on same
superblock, It is unable to reach next superblock because it is gone
away. We restart scannig superblock from first one, it causes
unfairness of reclaim unused dentries on first superblock. But I think
this happens very rarely.
- Test Results
Result on 6GB boxes with excessive unused dentries.
Without patch:
$ cat /proc/sys/fs/dentry-state
10181835 10180203 45 0 0 0
# mount -t nfs 10.124.60.70:/work/kernel-src nfs
real 0m1.830s
user 0m0.001s
sys 0m1.653s
With this patch:
$ cat /proc/sys/fs/dentry-state
10236610 10234751 45 0 0 0
# mount -t nfs 10.124.60.70:/work/kernel-src nfs
real 0m0.106s
user 0m0.002s
sys 0m0.032s
[akpm@linux-foundation.org: fix comments]
Signed-off-by: Kentaro Makita <k-makita@np.css.fujitsu.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: David Chinner <dgc@sgi.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The double indirection here is not needed anywhere and hence (at least)
confusing.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds ioremap_prot and pte_pgprot() so that one can extract protection
bits from a PTE and use them to ioremap_prot() (in order to support ptrace
of VM_IO | VM_PFNMAP as per Rik's patch).
This moves a couple of flag checks around in the ioremap implementations
of arch/powerpc. There's a side effect of allowing non-cacheable and
non-guarded mappings on ppc32 which before would always have _PAGE_GUARDED
set whenever _PAGE_NO_CACHE is.
(standard ioremap will still set _PAGE_GUARDED, but ioremap_prot will be
capable of setting such a non guarded mapping).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Rik van Riel <riel@redhat.com>
Cc: Dave Airlie <airlied@linux.ie>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In order to be able to debug things like the X server and programs using
the PPC Cell SPUs, the debugger needs to be able to access device memory
through ptrace and /proc/pid/mem.
This patch:
Add the generic_access_phys access function and put the hooks in place
to allow access_process_vm to access device or PPC Cell SPU memory.
[riel@redhat.com: Add documentation for the vm_ops->access function]
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Benjamin Herrensmidt <benh@kernel.crashing.org>
Cc: Dave Airlie <airlied@linux.ie>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are no users of nopfn in the tree. Remove it.
[hugh@veritas.com: fix build error]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds proper extern declarations for five variables in
include/linux/vmstat.h
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Two zonelist patch series rewrote __page_alloc() largely. Now, it is just
a wrapper function. Inlining them will save a function call.
[akpm@linux-foundation.org: export __alloc_pages_internal]
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This function has no external callers, so unexport it. Also fix its naming
inconsistency.
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are a lot of places that define either a single bootmem descriptor or an
array of them. Use only one central array with MAX_NUMNODES items instead.
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
lib/debugobjects.c has a function to test if an object is on the stack.
The block layer and ide needs it (they need to avoid DMA from/to stack
buffers). This patch moves the function to include/linux/sched.h so that
everyone can use it.
lib/debugobjects.c uses current->stack but this patch uses a
task_stack_page() accessor, which is a preferable way to access the stack.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
James Bottomley warns that inclusion of linux/fs.h in a low level
driver was always a danger signal. This patch moves
memory_read_from_buffer() from fs.h to string.h and fixes includes in
existing memory_read_from_buffer() users.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Bob Moore <robert.moore@intel.com>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All users have now been converted to linux/semaphore.h and we don't need
to keep these files around any longer.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
This patch adds platform data to the AC97C platform device. This will
let the board add a GPIO line which is connected to the external codecs
reset line.
The platform data, ac97c_platform_data, must also contain the DMA
controller ID, RX channel ID and TX channel ID.
Tested with Wolfson WM9712 and AP7000.
Signed-off-by: Hans-Christian Egtvedt <hcegtvedt@atmel.com>
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Apparently,
commit 6330a30a76
Author: Vegard Nossum <vegard.nossum@gmail.com>
Date: Wed May 28 09:46:19 2008 +0200
x86: break mutual header inclusion
introduced some CONFIG names to processor-flags.h, which was exported in
commit 6093015db2
Author: Ingo Molnar <mingo@elte.hu>
Date: Sun Mar 30 11:45:23 2008 +0200
x86: cleanup replace most vm86 flags with flags from processor-flags.h, fix
Fix it by wrapping the CONFIG parts in __KERNEL__.
Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Adrian Bunk <bunk@kernel.org>
Cc: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Just out or curiousity ran checkpatch.pl for whole UBI,
and discovered there are quite a few of stylistic issues.
Fix them.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Quite useful ioctl which allows to make atomic system upgrades.
The idea belongs to Richard Titmuss <richard_titmuss@logitech.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Hch asked not to use "unit" for sub-systems, let it be so.
Also some other commentaries modifications.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
* 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: hrtick_enabled() should use cpu_active()
sched, x86: clean up hrtick implementation
sched: fix build error, provide partition_sched_domains() unconditionally
sched: fix warning in inc_rt_tasks() to not declare variable 'rq' if it's not needed
cpu hotplug: Make cpu_active_map synchronization dependency clear
cpu hotplug, sched: Introduce cpu_active_map and redo sched domain managment (take 2)
sched: rework of "prioritize non-migratable tasks over migratable ones"
sched: reduce stack size in isolated_cpu_setup()
Revert parts of "ftrace: do not trace scheduler functions"
Fixed up conflicts in include/asm-x86/thread_info.h (due to the
TIF_SINGLESTEP unification vs TIF_HRTICK_RESCHED removal) and
kernel/sched_fair.c (due to cpu_active_map vs for_each_cpu_mask_nr()
introduction).
* 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits)
NR_CPUS: Replace NR_CPUS in speedstep-centrino.c
cpumask: Provide a generic set of CPUMASK_ALLOC macros, FIXUP
NR_CPUS: Replace NR_CPUS in cpufreq userspace routines
NR_CPUS: Replace per_cpu(..., smp_processor_id()) with __get_cpu_var
NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genapic_flat_64.c
NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genx2apic_uv_x.c
NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/proc.c
NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/mcheck/mce_64.c
cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c, fix
cpumask: Use optimized CPUMASK_ALLOC macros in the centrino_target
cpumask: Provide a generic set of CPUMASK_ALLOC macros
cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c
cpumask: Optimize cpumask_of_cpu in kernel/time/tick-common.c
cpumask: Optimize cpumask_of_cpu in drivers/misc/sgi-xp/xpc_main.c
cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/ldt.c
cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/io_apic_64.c
cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr
Revert "cpumask: introduce new APIs"
cpumask: make for_each_cpu_mask a bit smaller
net: Pass reference to cpumask variable in net/sunrpc/svc.c
...
Fix up trivial conflicts in drivers/cpufreq/cpufreq.c manually
* 'core/softlockup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
softlockup: fix invalid proc_handler for softlockup_panic
softlockup: fix watchdog task wakeup frequency
softlockup: fix watchdog task wakeup frequency
softlockup: show irqtrace
softlockup: print a module list on being stuck
softlockup: fix NMI hangs due to lock race - 2.6.26-rc regression
softlockup: fix false positives on nohz if CPU is 100% idle for more than 60 seconds
softlockup: fix softlockup_thresh fix
softlockup: fix softlockup_thresh unaligned access and disable detection at runtime
softlockup: allow panic on lockup
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (85 commits)
[ARM] pxa: add base support for PXA930 Handheld Platform (aka SAAR)
[ARM] pxa: add base support for PXA930 Evaluation Board (aka TavorEVB)
[ARM] pxa: add base support for PXA930 (aka Tavor-P)
[ARM] Update mach-types
[ARM] pxa: make littleton to use the new smc91x platform data
[ARM] pxa: make zylonite to use the new smc91x platform data
[ARM] pxa: make mainstone to use the new smc91x platform data
[ARM] pxa: make lubbock to use new smc91x platform data
[NET] smc91x: prepare SMC_USE_PXA_DMA to be specified in platform data
[NET] smc91x: prepare for SMC_IO_SHIFT to be a platform configurable variable
[NET] smc91x: add SMC91X_NOWAIT flag to platform data
[NET] smc91x: favor the use of SMC91X_USE_* instead of SMC_CAN_USE_*
[NET] smc91x: remove "irq_flags" from "struct smc91x_platdata"
[ARM] 5146/1: pxa2xx: convert all boards to call pxa2xx_transceiver_mode helper
Support for LCD on e740 e750 e400 and e800 e-series PDAs
E-series UDC support
PXA UDC - allow use of inverted GPIO for pullup
Add e350 support
Fix broken e-series build
E-series GPIO / IRQ definitions.
...
The functions in a header should not belong to another module. The prio functions
belong to v4l2-common.c, so move them to v4l2-common.h.
The ioctl functions belong to v4l2-ioctl.c, so create a new v4l2-ioctl.h header
and move those functions to it.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The class_dev field is a normal device, not a class device. This is very
confusing and now that the old 'dev' field has been renamed to 'parent'
we can rename 'class_dev' to just 'dev'.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The field 'dev' is not the video device, but the parent of the video device.
Rename accordingly.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
sdhci: highmem capable PIO routines
sg: reimplement sg mapping iterator
mmc_test: print message when attaching to card
mmc: Remove Russell as primecell mci maintainer
mmc_block: bounce buffer highmem support
sdhci: fix bad warning from commit c8b3e02
sdhci: add warnings for bad buffers in ADMA path
mmc_test: test oversized sg lists
mmc_test: highmem tests
s3cmci: ensure host stopped on machine shutdown
au1xmmc: suspend/resume implementation
s3cmci: fixes for section mismatch warnings
pxamci: trivial fix of DMA alignment register bit clearing
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (24 commits)
I/OAT: I/OAT version 3.0 support
I/OAT: tcp_dma_copybreak default value dependent on I/OAT version
I/OAT: Add watchdog/reset functionality to ioatdma
iop_adma: cleanup iop_chan_xor_slot_count
iop_adma: document how to calculate the minimum descriptor pool size
iop_adma: directly reclaim descriptors on allocation failure
async_tx: make async_tx_test_ack a boolean routine
async_tx: remove depend_tx from async_tx_sync_epilog
async_tx: export async_tx_quiesce
async_tx: fix handling of the "out of descriptor" condition in async_xor
async_tx: ensure the xor destination buffer remains dma-mapped
async_tx: list_for_each_entry_rcu() cleanup
dmaengine: Driver for the Synopsys DesignWare DMA controller
dmaengine: Add slave DMA interface
dmaengine: add DMA_COMPL_SKIP_{SRC,DEST}_UNMAP flags to control dma unmap
dmaengine: Add dma_client parameter to device_alloc_chan_resources
dmatest: Simple DMA memcpy test client
dmaengine: DMA engine driver for Marvell XOR engine
iop-adma: fix platform driver hotplug/coldplug
dmaengine: track the number of clients using a channel
...
Fixed up conflict in drivers/dca/dca-sysfs.c manually
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (60 commits)
ide: small whitespace fixes
ide: ide-cd_ioctl.c fix sparse integer as NULL pointer warnings
ide: ide-cd.c fix sparse endianness warnings
ide-cd: convert to using the new atapi_flags
ide: remove unused PC_FLAG_DRQ_INTERRUPT
ide-scsi: convert to using the new atapi_flags
ide-tape: convert to using the new atapi_flags
ide-floppy: convert to using the new atapi_flags (take 2)
ide: add per-device flags
ide: use rq->cmd instead of pc->c in atapi common code
ide-scsi: pass packet command in rq->cmd
ide-tape: pass packet command in rq->cmd
ide-tape: make room for packet command ids in rq->cmd
ide-floppy: pass packet command in rq->cmd
ide: remove pc->callback member from ide_atapi_pc
ide-scsi: use drive->pc_callback instead of pc->callback
ide-tape: use drive->pc_callback instead of pc->callback
ide-floppy: use drive->pc_callback instead of pc->callback
ide: push pc callback pointer into the ide_drive_t structure
drivers/ide/ide-tape.c: remove double kfree
...
There should be no functionality change resulting from this patch.
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
while at it, remove PC_FLAG_ZIP_DRIVE from the packed command flags altogether
and query the drive type through drive->atapi_flags.
v2:
ide-floppy fix.
There should be no functionality change resulting from this patch.
[bart: IDE_FLAG_* -> IDE_AFLAG_*, dev_flags -> atapi_flags]
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Push device flags up into ide_drive_t.
There should be no functionality change resulting from this patch.
[bart: IDE_FLAG_* -> IDE_AFLAG_*, dev_flags -> atapi_flags]
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
There should be no functionality change resulting from this patch.
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Refrain from carrying the callback ptr with every packet command since the
callback function is only one anyways. ide_drive_t is probably not the most
suitable place for it right now but is the more sane solution. Besides, these
structs are going to be reorganized anyways during the generic ide rewrite.
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add ide_host_free() helper and convert ide_host_remove() to use it.
* Fix handling of ide_host_register() failure in ide_host_add(),
icside.c, ide-generic.c, falconide.c and sgiioc4.c.
While at it:
* Fix handling of ide_host_alloc_all() failure in ide-generic.c.
* Fix handling of ide_host_alloc() failure in falconide.c
(also return the correct error value if no device is found).
v2:
* falconide build fix. (From Stephen Rothwell)
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add ide_host_add() helper which does ide_host_alloc()+ide_host_register(),
then convert ide_setup_pci_device[s](), ide_legacy_device_add() and some
host drivers to use it.
While at it:
* Fix ide_setup_pci_device[s](), ide_arm.c, gayle.c, ide-4drives.c,
macide.c, q40ide.c, cmd640.c and cs5520.c to return correct error value.
* -ENOENT -> -ENOMEM in rapide.c, ide-h8300.c, ide-generic.c, au1xxx-ide.c
and pmac.c
* -ENODEV -> -ENOMEM in palm_bk3710.c, ide_platform.c and delkin_cb.c
* -1 -> -ENOMEM in ide-pnp.c
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add struct ide_host which keeps pointers to host's ports.
* Add ide_host_alloc[_all]() and ide_host_remove() helpers.
* Pass 'struct ide_host *host' instead of 'u8 *idx' to
ide_device_add[_all]() and rename it to ide_host_register[_all]().
* Convert host drivers and core code to use struct ide_host.
* Remove no longer needed ide_find_port().
* Make ide_find_port_slot() static.
* Unexport ide_unregister().
v2:
* Add missing 'struct ide_host *host' to macide.c.
v3:
* Fix build problem in pmac.c (s/ide_alloc_host/ide_host_alloc/)
(Noticed by Stephen Rothwell).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add struct ide_tp_ops for transport methods.
* Add 'const struct ide_tp_ops *tp_ops' to struct ide_port_info
and ide_hwif_t.
* Set the default hwif->tp_ops in ide_init_port_data().
* Set host driver specific hwif->tp_ops in ide_init_port().
* Export ide_exec_command(), ide_read_status(), ide_read_altstatus(),
ide_read_sff_dma_status(), ide_set_irq(), ide_tf_{load,read}()
and ata_{in,out}put_data().
* Convert host drivers and core code to use struct ide_tp_ops.
* Remove no longer needed default_hwif_transport().
* Cleanup ide_hwif_t from methods that are now in struct ide_tp_ops.
While at it:
* Use struct ide_port_info in falconide.c and q40ide.c.
* Rename ata_{in,out}put_data() to ide_{in,out}put_data().
v2:
* Fix missing convertion in ns87415.c.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add 'config' field to hw_regs_t and use it to set hwif->config_data in
ide_init_port_hw(), then convert ide_legacy_init_one() to use hw->config.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Filter out "default" transfer mode values (0x00 - default PIO mode,
0x01 - default PIO mode w/ IORDY disabled) in write handler for obsoleted
/proc/ide/hd?/settings:current_speed setting.
Allowing "default" transfer mode values is a dangerous thing to do as
we don't support programming controller to the "default" transfer mode
and devices often use different values for the default and maximum PIO
mode (i.e. PIO2 default and PIO4 maximum) so the controller will stay
programmed for higher PIO mode while device will use the lower PIO mode.
There is no functionality loss as by using special IOCTLs device can
still be programmed to "default" transfer modes (it is only useful for
debugging/testing purposes anyway).
* Remove no longer needed IDE_HFLAG_ABUSE_SET_DMA_MODE host flag, it was
previously used by few host drivers to program the controller to PIO0
timings for "default" transfer mode == 0x01 (although some host drivers
would program invalid PIO timings instead).
* Cleanup ide_set_xfer_rate() and add BUG_ON().
Suggested-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Lets remove dead Virtual DMA support for now so it doesn't clutter
core IDE code (it can be bring back when there is a need for it):
* Remove IDE_HFLAG_VDMA host flag.
* Remove ide_drive_t.vdma flag.
* cs5520.c: remove stale FIXMEs, cs5520_dma_host_set() and cs5520_dma_ops
(also there is no longer a need to set IDE_HFLAG_NO_ATAPI_DMA).
There should be no functional changes caused by this patch.
Cc: TAKADA Yoshihito <takada@mbf.nifty.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Remove no longer needed ->INB, ->OUTB and ->OUTBSYNC methods.
Then:
* Remove no longer used default_hwif_[mm]iops() and ide_[mm_]outbsync().
* Cleanup SuperIO handling in ns87415.c.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add ide_read_bcount_and_ireason() helper and use it instead of ->INB
in {cdrom_newpc,ide_pc}_intr().
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add IDE_TFLAG_IN_FEATURE taskfile flag for reading Feature
register and handle it in ->tf_read.
* Convert ide_read_error() to use ->tf_read instead of ->INB,
then uninline and export it.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add ->set_irq method for setting nIEN bit of ATA Device Control
register and use it instead of ide_set_irq().
While at it:
* Use ->set_irq in init_irq() and do_reset1().
* Don't use HWIF() macro in ide_check_pm_state().
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Remove ide_read_altstatus() inline helper.
* Add ->read_altstatus method for reading ATA Alternate Status
register and use it instead of ->INB.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Remove ide_read_status() inline helper.
* Add ->read_status method for reading ATA Status register
and use it instead of ->INB.
While at it:
* Don't use HWGROUP() macro.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add ->exec_command method for writing ATA Command register
and use it instead of ->OUTBSYNC.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Factor out simplex handling from ide_pci_dma_base() to
ide_pci_check_simplex().
* Set hwif->dma_base early in ->init_dma method / ide_hwif_setup_dma()
and reset it in ide_init_port() if DMA initialization fails.
* Use ->read_sff_dma_status instead of ->INB in ide_pci_dma_base().
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Export sff_dma_ops and then remove ide_setup_dma().
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Use ->dma_base + offset instead of ->dma_{status,command}
and remove no longer needed ->dma_{status,command}.
While at it:
* Use ATA_DMA_* defines.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add ->read_sff_dma_status method for reading DMA Status register
and use it instead of ->INB.
While at it:
* Use inb() directly in ns87415.c::ns87415_dma_end().
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add 'hw_regs_t **hws' argument to ide_device_add[_all]() and convert
host drivers + ide_legacy_init_one() + ide_setup_pci_device[s]() to use
it instead of calling ide_init_port_hw() directly.
[ However if host has > 1 port we must still set hwif->chipset to hint
consecutive ide_find_port() call that the previous slot is occupied. ]
* Unexport ide_init_port_hw().
v2:
* Use defines instead of hard-coded values in buddha.c, gayle.c and q40ide.c.
(Suggested by Geert Uytterhoeven)
* Better patch description.
v3:
* Fix build problem in ide-cs.c. (Noticed by Stephen Rothwell)
There should be no functional changes caused by this patch.
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
This patch removes the old kgdb reminants from ARCH=powerpc and
implements the new style arch specific stub for the common kgdb core
interface.
It is possible to have xmon and kgdb in the same kernel, but you
cannot use both at the same time because there is only one set of
debug hooks.
The arch specific kgdb implementation saves the previous state of the
debug hooks and restores them if you unconfigure the kgdb I/O driver.
Kgdb should have no impact on a kernel that has no kgdb I/O driver
configured.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
This patch adds the ARCH=arm specific a kgdb backend, originally
written by Deepak Saxena <dsaxena@plexity.net> and George Davis
<gdavis@mvista.com>. Geoff Levand <geoffrey.levand@am.sony.com>,
Nicolas Pitre, Manish Lachwani, and Jason Wessel have contributed
various fixups here as well.
The KGDB patch makes one change to the core ARM architecture such that
the traps are initialized early for use with the debugger or other
subsystems.
[ mingo@elte.hu: small cleanups. ]
[ ben-linux@fluff.org: fixed early_trap_init ]
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Deepak Saxena <dsaxena@plexity.net>
Add support for the following operations to mlx4 when device firmware
supports them:
- Send with invalidate and local invalidate send queue work requests;
- Allocate/free fast register MRs;
- Allocate/free fast register MR page lists;
- Fast register MR send queue work requests;
- Local DMA L_Key.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This adds a hid usage that is reported by the N-Trig digitizer in the Dell
Latitude XT screen.
Signed-off-by: Rafi Rubin <rafi@seas.upenn.edu>
Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This is alternative implementation of sg content iterator introduced
by commit 83e7d317... from Pierre Ossman in next-20080716. As there's
already an sg iterator which iterates over sg entries themselves, name
this sg_mapping_iterator.
Slightly edited description from the original implementation follows.
Iteration over a sg list is not that trivial when you take into
account that memory pages might have to be mapped before being used.
Unfortunately, that means that some parts of the kernel restrict
themselves to directly accesible memory just to not have to deal with
the mess.
This patch adds a simple iterator system that allows any code to
easily traverse an sg list and not have to deal with all the details.
The user can decide to consume part of the iteration. Also, iteration
can be stopped and resumed later if releasing the kmap between
iteration steps is necessary. These features are useful to implement
piecemeal sg copying for interrupt drive PIO for example.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
This driver supports the PCA9550, PCA9551, PCA9552, and PCA9553
LED driver chips.
Signed-off-by: Nate Case <ncase@xes-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
LED classdev core doesn't modify memory pointed by the default_trigger,
so mark it as const and we'll able to pass const char *s without getting
compiler warnings.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
NXP pca9532 is a LED dimmer/controller attached to i2c bus. It allows
attaching upto 16 leds which can either be on, off or dimmed and/or blinked
with the two PWM modulators available.
This driver is a "new-style" i2c driver that adheres to the driver model and
implements the led framework api. Since the leds connected to the driver are
platform specific, it is only useful when platform data is passed to the
driver to define what leds are connected to which pins.
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
This ifdefs are leftovers from the time as the driver was running
on a ppc.
Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
/home/bigeasy/git/linux-2.6-ftrace/kernel/trace/trace.c: In function 'tracing_generic_entry_update':
/home/bigeasy/git/linux-2.6-ftrace/kernel/trace/trace.c:802: error: implicit declaration of function 'irqs_disabled_flags'
make[3]: *** [kernel/trace/trace.o] Error 1
/home/bigeasy/git/linux-2.6-ftrace/kernel/trace/ftrace.c: In function 'ftrace_list_func':
/home/bigeasy/git/linux-2.6-ftrace/kernel/trace/ftrace.c:61: error: implicit declaration of function 'read_barrier_depends'
Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
the ff1 and bitrev opcode appears in ISA C and ISA A+ what isn't
supported by all plattforms. The assembly optimization is automaticly
enabled if the compiler understand the required cpu keyword.
My m5235 seems to boot and run fine so far.
Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (82 commits)
ipw2200: Call netif_*_queue() interfaces properly.
netxen: Needs to include linux/vmalloc.h
[netdrvr] atl1d: fix !CONFIG_PM build
r6040: rework init_one error handling
r6040: bump release number to 0.18
r6040: handle RX fifo full and no descriptor interrupts
r6040: change the default waiting time
r6040: use definitions for magic values in descriptor status
r6040: completely rework the RX path
r6040: call napi_disable when puting down the interface and set lp->dev accordingly.
mv643xx_eth: fix NETPOLL build
r6040: rework the RX buffers allocation routine
r6040: fix scheduling while atomic in r6040_tx_timeout
r6040: fix null pointer access and tx timeouts
r6040: prefix all functions with r6040
rndis_host: support WM6 devices as modems
at91_ether: use netstats in net_device structure
sfc: Create one RX queue and interrupt per CPU package by default
sfc: Use a separate workqueue for resets
sfc: I2C adapter initialisation fixes
...
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc32: pass -m32 when building vmlinux.lds
sparc: Fixes the DRM layer build on sparc.
ide: merge <asm-sparc/ide_64.h> with <asm-sparc/ide_32.h>
ide: <asm-sparc/ide_64.h>: use __raw_{read,write}w()
ide: <asm-sparc/ide_32.h>: use __raw_{read,write}w()
ide: <asm-sparc/ide_64.h>: use %r0 for outw_be()
sparc64: Do not define BIO_VMERGE_BOUNDARY.
This patch adds to ioatdma and dca modules
support for Intel I/OAT DMA engine ver.3 (aka CB3 device).
The main features of I/OAT ver.3 are:
* 8 single channel DMA devices (8 channels total)
* 8 DCA providers, each can accept 2 requesters
* 8-bit TAG values and 32-bit extended APIC IDs
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Change icmp6_dst_gc to return the one value the caller cares about rather
than using call by reference.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FIB timer list is a trivial size structure, avoid indirection and just
put it in existing ns.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make more PCI PM core functionality available to drivers
* Export pci_pme_capable() so that it can be called directly by
drivers (for example, tg3 needs that).
* Move the state choosing part of pci_prepare_to_sleep() to a
separate function, pci_target_state(), that can be called directly
by drivers (for example, tg3 needs that).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
sctp_outq_flush() can now become static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
get_proc_net() can now become static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Consumers that want to re-use their QPs in new connections need to
know when the QP has exited the timewait state. Report the timewait
event through the rdma_cm.
Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add an RDMA_CM_EVENT_ADDR_CHANGE event can be used by rdma-cm
consumers that wish to have their RDMA sessions always use the same
links (eg <hca/port>) as the IP stack does. In the current code, this
does not happen when bonding is used and fail-over happened but the IB
link used by an already existing session is operating fine.
Use the netevent notification for sensing that a change has happened
in the IP stack, then scan the rdma-cm ID list to see if there is an
ID that is "misaligned" with respect to the IP stack, and deliver
RDMA_CM_EVENT_ADDR_CHANGE for this ID. The consumer can act on the
event or just ignore it.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The function header comments have to go with the functions
they are documenting, or things go horribly wrong when we
try to process them with the docbook tools.
Warning(include/linux/netdevice.h:1006): No description found for parameter 'dev_queue'
Warning(include/linux/netdevice.h:1033): No description found for parameter 'dev_queue'
Warning(include/linux/netdevice.h:1067): No description found for parameter 'dev_queue'
Warning(include/linux/netdevice.h:1093): No description found for parameter 'dev_queue'
Warning(include/linux/netdevice.h:1474): No description found for parameter 'txq'
Error(net/core/dev.c:1674): cannot understand prototype: 'u32 simple_tx_hashrnd; '
Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: fix crash due to missing debugctlmsr on AMD K6-3
x86: add PTE_FLAGS_MASK
x86: rename PTE_MASK to PTE_PFN_MASK
x86: fix pte_flags() to only return flags, fix lguest (updated)
x86: use setup_clear_cpu_cap with disable_apic, fix
x86: move the last Dprintk instance to pr_debug()
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
remove CONFIG_KMOD from core kernel code
remove CONFIG_KMOD from lib
remove CONFIG_KMOD from sparc64
rework try_then_request_module to do less in non-modular kernels
remove mention of CONFIG_KMOD from documentation
make CONFIG_KMOD invisible
modules: Take a shortcut for checking if an address is in a module
module: turn longs into ints for module sizes
Shrink struct module: CONFIG_UNUSED_SYMBOLS ifdefs
module: reorder struct module to save space on 64 bit builds
module: generic each_symbol iterator function
module: don't use stop_machine for waiting rmmod
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (49 commits)
powerpc: Fix build bug with binutils < 2.18 and GCC < 4.2
powerpc/eeh: Don't panic when EEH_MAX_FAILS is exceeded
fbdev: Teaches offb about palette on radeon r5xx/r6xx
powerpc/cell/edac: Log a syndrome code in case of correctable error
powerpc/cell: Add DMA_ATTR_WEAK_ORDERING dma attribute and use in Cell IOMMU code
powerpc: Indicate which oprofile counters to use while in compat mode
powerpc/boot: Change spaces to tabs
powerpc: Remove duplicate 6xx option in Kconfig
powerpc: Use PPC_LONG and PPC_LONG_ALIGN in lib/string.S
powerpc: Use PPC_LONG_ALIGN in uaccess.h
powerpc: Add a #define for aligning to a long-sized boundary
powerpc: Fix OF parsing of 64 bits PCI addresses
powerpc: Use WARN_ON(1) instead of __WARN()
powerpc: Fix support for latencytop
powerpc/ps3: Update ps3_defconfig
powerpc/ps3: Add a sub-match id to ps3_system_bus
powerpc: Add a 6xx defconfig
powerpc/dma: Use the struct dma_attrs in iommu code
powerpc/cell: Add support for power button of future IBM cell blades
powerpc/cell: Cleanup sysreset_hack for IBM cell blades
...
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (79 commits)
arm: bus_id -> dev_name() and dev_set_name() conversions
sparc64: fix up bus_id changes in sparc core code
3c59x: handle pci_name() being const
MTD: handle pci_name() being const
HP iLO driver
sysdev: Convert the x86 mce tolerant sysdev attribute to generic attribute
sysdev: Add utility functions for simple int/ulong variable sysdev attributes
sysdev: Pass the attribute to the low level sysdev show/store function
driver core: Suppress sysfs warnings for device_rename().
kobject: Transmit return value of call_usermodehelper() to caller
sysfs-rules.txt: reword API stability statement
debugfs: Implement debugfs_remove_recursive()
HOWTO: change email addresses of James in HOWTO
always enable FW_LOADER unless EMBEDDED=y
uio-howto.tmpl: use unique output names
uio-howto.tmpl: use standard copyright/legal markings
sysfs: don't call notify_change
sysdev: fix debugging statements in registration code.
kobject: should use kobject_put() in kset-example
kobject: reorder kobject to save space on 64 bit builds
...
The mdio_port, mdio_bit, mdc_port and mdc_bit fields in the
fs_mii_bb_platform_info structure are left-overs from the move to the Phy
Abstraction Layer subsystem. They are not used anymore and can be safely
removed.
Signed-off-by: Laurent Pinchart <laurentp@cse-semaphore.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Add control of hardware serial bit order between LSB first
(default/standard) and MSB first.
Signed-off-by: Paul Fulghum <paulkf@microgate.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some hardware needs to do break handling itself and may have partial
support only. Make break_ctl return an error code. Add a tty driver flag
so you can indicate driver hardware side break support.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
USB serial likes to use port->tty back pointers for the real work it does and
to do so without any actual locking. Unfortunately when you consider hangup
events, hangup/parallel reopen or even worse hangup followed by parallel close
events the tty->port and port->tty pointers are not guaranteed to be the same
as port->tty is the active tty while tty->port is the port the tty may or
may not still be attached to.
So rework the entire API to pass the tty struct. For console cases we need
to pass both for now. This shows up multiple drivers that immediately crash
with USB console some of which have been fixed in the process.
Longer term we need a proper tty as console abstraction
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Maxim's MAX7301 is an SPI GPIO expander with 28 GPIOs. Note: MAX7301's
interrupt feature is not supported yet.
[akpm@linux-foundation.org: coding-style fixes]
[g.liakhovetski@pengutronix.de: Fix inaccuracies in comments, check spi_setup()
return code, mask off high byte in max7301_read()]
Signed-off-by: Juergen Beisert <j.beisert@pengutronix.de>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The Linux kernel puts the filename argument of execve() into the new
address space. Many developers are surprised to learn this. Those who
know and could use it, object "But it's not documented."
Those who want to use it dislike the expression
(char *)(1+ strlen(env[-1+ n_env]) + env[-1+ n_env])
because it requires locating the last original environment variable,
and assumes that the filename follows the characters.
This patch documents the insertion of the filename, and makes it easier
to find by adding a new tag AT_EXECFN in the ElfXX_auxv_t; see <elf.h>.
In many cases readlink("/proc/self/exe",) gives the same answer. But if
all the original pages get unmapped, then the kernel erases the symlink
for /proc/self/exe. This can happen when a program decompressor does a
good job of cleaning up after uncompressing directly to memory, so that
the address space of the target program looks the same as if compression
had never happened. One example is http://upx.sourceforge.net .
One notable use of the underlying concept (what path containED the
executable) is glibc expanding $ORIGIN in DT_RUNPATH. In practice for
the near term, it may be a good idea for user-mode code to use both
/proc/self/exe and AT_EXECFN as fall-back methods for each other.
/proc/self/exe can fail due to unmapping, AT_EXECFN can fail because it
won't be present on non-new systems. The auxvec or {AT_EXECFN}.d_val
also can get overwritten, although in nearly all cases this would be the
result of a bug.
The runtime cost is one NEW_AUX_ENT using two words of stack space. The
underlying value is maintained already as bprm->exec; setup_arg_pages()
in fs/exec.c slides it for stack_shift, etc.
Signed-off-by: John Reiser <jreiser@BitWagon.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Always compile request_module when the kernel allows modules.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This reworks try_then_request_module to only invoke the "lookup"
function "x" once when the kernel is not modular.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This shrinks module.o and each *.ko file.
And finally, structure members which hold length of module
code (four such members there) and count of symbols
are converted from longs to ints.
We cannot possibly have a module where 32 bits won't
be enough to hold such counts.
For one, module loading checks module size for sanity
before loading, so such insanely big module will fail
that test first.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
module.c and module.h conatains code for finding
exported symbols which are declared with EXPORT_UNUSED_SYMBOL,
and this code is compiled in even if CONFIG_UNUSED_SYMBOLS is not set
and thus there can be no EXPORT_UNUSED_SYMBOLs in modules anyway
(because EXPORT_UNUSED_SYMBOL(x) are compiled out to nothing then).
This patch adds required #ifdefs.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
reorder struct module to save space on 64 bit builds.
saves 1 cacheline_size (128 on default x86_64 & 64 on AMD
Opteron/athlon) when CONFIG_MODULE_UNLOAD=y.
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
PTE_PFN_MASK was getting lonely, so I made it a friend.
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Rusty, in his peevish way, complained that macros defining constants
should have a name which somewhat accurately reflects the actual
purpose of the constant.
Aside from the fact that PTE_MASK gives no clue as to what's actually
being masked, and is misleadingly similar to the functionally entirely
different PMD_MASK, PUD_MASK and PGD_MASK, I don't really see what the
problem is.
But if this patch silences the incessent noise, then it will have
achieved its goal (TODO: write test-case).
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
(Jeremy said:
rusty: use PTE_MASK
rusty: use PTE_MASK
rusty: use PTE_MASK
When I asked:
jsgf: does that include the NX flag?
He responded eloquently:
rusty: use PTE_MASK
rusty: use PTE_MASK
yes, it's the official constant of masking flags out of ptes
)
Change a15af1c9ea 'x86/paravirt: add
pte_flags to just get pte flags' removed lguest's private pte_flags()
in favor of a generic one.
Unfortunately, the generic one doesn't filter out the non-flags bits:
this results in lguest creating corrupt shadow page tables and blowing
up host memory.
Since noone is supposed to use the pfn part of pte_flags(), it seems
safest to always do the filtering.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-and-morning-tea-spilled-by: Ingo Molnar <mingo@elte.hu>
This changes the MTD core to handle pci_name() now returning a constant
string.
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This adds a new sysdev_ext_attribute that stores a pointer to the variable
it manages and some utility functions/macro to easily use them.
Previously all users wrote custom macros to generate show/store
functions for each variable, with this it is possible to avoid
that in many cases.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This allow to dynamically generate attributes and share show/store
functions between attributes. Right now most attributes are generated
by special macros and lots of duplicated code. With the attribute
passed it's instead possible to attach some data to the attribute
and then use that in shared low level functions to do different things.
I need this for the dynamically generated bank attributes in the x86
machine check code, but it'll allow some further cleanups.
I converted all users in tree to the new show/store prototype. It's a single
huge patch to avoid unbisectable sections.
Runtime tested: x86-32, x86-64
Compiled only: ia64, powerpc
Not compile tested/only grep converted: sh, arm, avr32
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
driver core: Suppress sysfs warnings for device_rename().
Renaming network devices to an already existing name is not
something we want sysfs to print a scary warning for, since the
callers can deal with this correctly. So let's introduce
sysfs_create_link_nowarn() which gets rid of the common warning.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
debugfs_remove_recursive() will remove a dentry and all its children.
Drivers can use this to zap their whole debugfs tree so that they don't
need to keep track of every single debugfs dentry they created.
It may fail to remove the whole tree in certain cases:
sh-3.2# rmmod atmel-mci < /sys/kernel/debug/mmc0/ios/clock
mmc0: card b368 removed
atmel_mci atmel_mci.0: Lost dma0chan1, falling back to PIO
sh-3.2# ls /sys/kernel/debug/mmc0/
ios
But I'm not sure if that case can be handled in any sane manner.
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Cc: Pierre Ossman <drzeus-list@drzeus.cx>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
reorder kobject to save space on 64 bit builds.
shrinks from 72 to 64 bytes & moves allocated kobject to a smaller
slab.
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Sometimes it is necessary to enable/disable the interrupt of a UIO device
from the userspace part of the driver. With this patch, the UIO kernel driver
can implement an "irqcontrol()" function that does this. Userspace can write
an s32 value to /dev/uioX (usually 0 or 1 to turn the irq off or on). The
UIO core will then call the driver's irqcontrol function.
Signed-off-by: Hans J. Koch <hjk@linutronix.de>
Acked-by: Uwe Kleine-König <Uwe.Kleine-Koenig@digi.com>
Acked-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There is no such thing as a "device id size" in the driver core, so
remove the define and fix up any users of this odd define in the rest of
the kernel.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There is no such thing as a "device name size" in the driver core, so
remove the define and fix up any users of this odd define in the rest of
the kernel.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kobjects do not have a limit in name size since a while, so stop
pretending that they do.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This adds the infrastructure to properly handle lockdep issues when the
internal class semaphore is changed to a mutex.
Matthew wrote the original patch, and Greg fixed it up to work properly
with the class_create() function.
From: Matthew Wilcox <matthew@wil.cx>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Dave Young <hidave.darkstar@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This moves the portions of struct class that are dynamic (kobject and
lock and lists) out of the main structure and into a dynamic, private,
structure.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This mirrors the functionality that driver_find_device has as well.
We add a start variable, and all callers of the function are fixed up at
the same time.
The block layer will be using this new functionality in a follow-on
patch.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This mirrors the functionality that driver_for_each_device has as well.
We add a start variable, and all callers of the function are fixed up at
the same time.
The block layer will be using this new functionality in a follow-on
patch.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Now that device_create() has been audited, rename things back to the
original call to be sane.
Keep the device_create_drvdata macro around to make merges easier.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There are no more users of this, and it is racy. Use
device_create_drvdata() or device_create_vargs() instead.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Why?:
There are occasions where userspace would like to access sysfs
attributes for a device but it may not know how sysfs has named the
device or the path. For example what is the sysfs path for
/dev/disk/by-id/ata-ST3160827AS_5MT004CK? With this change a call to
stat(2) returns the major:minor then userspace can see that
/sys/dev/block/8:32 links to /sys/block/sdc.
What are the alternatives?:
1/ Add an ioctl to return the path: Doable, but sysfs is meant to reduce
the need to proliferate ioctl interfaces into the kernel, so this
seems counter productive.
2/ Use udev to create these symlinks: Also doable, but it adds a
udev dependency to utilities that might be running in a limited
environment like an initramfs.
3/ Do a full-tree search of sysfs.
[kay.sievers@vrfy.org: fix duplicate registrations]
[kay.sievers@vrfy.org: cleanup suggestions]
Cc: Neil Brown <neilb@suse.de>
Cc: Tejun Heo <htejun@gmail.com>
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Reviewed-by: SL Baur <steve@xemacs.org>
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Acked-by: Mark Lord <lkml@rtr.ca>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Introduce a new dma attriblue DMA_ATTR_WEAK_ORDERING to use weak ordering
on DMA mappings in the Cell processor. Add the code to the Cell's IOMMU
implementation to use this code.
Dynamic mappings can be weakly or strongly ordered on an individual basis
but the fixed mapping has to be either completely strong or completely weak.
This is currently decided by a kernel boot option (pass iommu_fixed=weak
for a weakly ordered fixed linear mapping, strongly ordered is the default).
Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use the new PPC_LONG_ALIGN macro instead of passing an argument
to the asm for consistency.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add a #define for aligning to a long-sized boundary. It would be nice
to use sizeof(long) for this, but that requires generating the value
with asm-offsets.c, and asm-offsets.c includes asm-compat.h and we
descend into some sort of recursive include hell.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add sub match id for ps3 system bus so that two different system bus
devices can be connected to a shared device.
Signed-off-by: Masakazu Mokuno <mokuno@sm.sony.co.jp>
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Update iommu_alloc() to take the struct dma_attrs and pass them on to
tce_build(). This change propagates down to the tce_build functions of
all the platforms.
Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch adds support for the power button on future IBM cell blades.
It actually doesn't shut down the machine. Instead it exposes an
input device /dev/input/event0 to userspace which sends KEY_POWER
if power button has been pressed.
haldaemon actually recognizes the button, so a plattform independent acpid
replacement should handle it correctly.
Signed-off-by: Christian Krafft <krafft@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Since commit 7560fa60fc (gpio: <linux/gpio.h>
and "no GPIO support here" stubs) drivers can use GPIOs if they're available,
but don't require them.
This patch actually enables this feature.
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use __raw_{read,write}w() in __ide_{in,out}sw()
and remove no longer needed {in,out}w_be().
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use __raw_{read,write}w() in __ide_{in,out}sw().
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use %r0 for outw_be() to make it match __raw_writew().
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (100 commits)
usb-storage: revert DMA-alignment change for Wireless USB
USB: use reset_resume when normal resume fails
usb_gadget: composite cdc gadget fault handling
usb gadget: minor USBCV fix for composite framework
USB: Fix bug with byte order in isp116x-hcd.c fio write/read
USB: fix double kfree in ipaq in error case
USB: fix build error in cdc-acm for CONFIG_PM=n
USB: remove board-specific UP2OCR configuration from pxa27x-udc
USB: EHCI: Reconciling USB register differences on MPC85xx vs MPC83xx
USB: Fix pointer/int cast in USB devio code
usb gadget: g_cdc dependso on NET
USB: Au1xxx-usb: suspend/resume support.
USB: Au1xxx-usb: clean up ohci/ehci bus glue sources.
usbfs: don't store bad pointers in registration
usbfs: fix race between open and unregister
usbfs: simplify the lookup-by-minor routines
usbfs: send disconnect signals when device is unregistered
USB: Force unbinding of drivers lacking reset_resume or other methods
USB: ohci-pnx4008: I2C cleanups and fixes
USB: debug port converter does not accept more than 8 byte packets
...
This patch (as1024) takes care of a FIXME issue: Drivers that don't
have the necessary suspend, resume, reset_resume, pre_reset, or
post_reset methods will be unbound and their interface reprobed when
one of the unsupported events occurs.
This is made slightly more difficult by the fact that bind operations
won't work during a system sleep transition. So instead the code has
to defer the operation until the transition ends.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch renames the existing usb_reset_device in hub.c to
usb_reset_and_verify_device and renames the existing
usb_reset_composite_device to usb_reset_device. Also the new
usb_reset_and_verify_device does't need to be EXPORTED .
The idea of the patch is that external interface driver
should warn the other interfaces' driver of the same
device before and after reseting the usb device. One interface
driver shoud call _old_ usb_reset_composite_device instead of
_old_ usb_reset_device since it can't assume the device contains
only one interface. The _old_ usb_reset_composite_device
is safe for single interface device also. we rename the two
functions to make the change easily.
This patch is under guideline from Alan Stern.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
From the current implementation of usb_reset_composite_device
function, the iface parameter is no longer useful. This function
doesn't do something special for the iface usb_interface,compared
with other interfaces in the usb_device. So remove the parameter
and fix the related caller.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1102) clarifies two points in the USB Gadget kerneldoc:
Request completion callbacks are always made with interrupts
disabled;
Device controllers may not support STALLing the status stage
of a control transfer after the data stage is over.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
General cleanup on ir-usb module. Introduced
a common header that could be used also on
usb gadget framework.
Lot's of cleanups and now using macros from the header
file.
Signed-off-by: Felipe Balbi <me@felipebalbi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add <linux/usb/composite.h> interfaces for composite gadget drivers, and
basic implementation support behind it:
- struct usb_function ... groups one or more interfaces into a function
managed as one unit within a configuration, to which it's added by
usb_add_function().
- struct usb_configuration ... groups one or more such functions into
a configuration managed as one unit by a driver, to which it's added
by usb_add_config(). These operate at either high or full/low speeds
and at a given bMaxPower.
- struct usb_composite_driver ... groups one or more such configurations
into a gadget driver, which may be registered or unregistered.
- struct usb_composite_dev ... a usb_composite_driver manages this; it
wraps the usb_gadget exposed by the controller driver.
This also includes some basic kerneldoc.
How to use it (the short version): provide a usb_composite_driver with a
bind() that calls usb_add_config() for each of the needed configurations.
The configurations in turn have bind() calls, which will usb_add_function()
for each function required. Each function's bind() allocates resources
needed to perform its tasks, like endpoints; sometimes configurations will
allocate resources too.
Separate patches will convert most gadget drivers to this infrastructure.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Define three new descriptor manipulation utilities, for use when
setting up functions that may have multiple instances:
usb_copy_descriptors() to copy a vector of descriptors
usb_free_descriptors() to free the copy
usb_find_endpoint() to find a copied version
These will be used as follows. Functions will continue to have static
tables of descriptors they update, now used as __initdata templates.
When a function creates a new instance, it patches those tables with
relevant interface and string IDs, plus endpoint assignments. Then it
copies those morphed descriptors, associates the copies with the new
function instance, and records the endpoint descriptors to use when
activating the endpoints. When initialization is done, only the copies
remain in memory. The copies are freed on driver removal.
This ensures that each instance has descriptors which hold the right
instance-specific data. Two instances in the same configuration will
obviously never share the same interface IDs or use the same endpoints.
Instances in different configurations won't do so either, which means
this is slightly less memory-efficient in some cases.
This also includes a bugfix to the epautoconf code that shows up with
this usage model. It must replace the previous endpoint number when
updating the template descriptors, not just mask in a few more bits.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch removes CVS keywords that weren't updated for a long time
from comments.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1091) changes the way usbcore handles interface
unbinding. If the interface's driver supports "soft" unbinding (a new
flag in the driver structure) then in-flight URBs are not cancelled
and endpoints are not disabled. Instead the driver is allowed to
continue communicating with the device (although of course it should
stop before its disconnect routine returns).
The purpose of this change is to allow drivers to do a clean shutdown
when they get unbound from a device that is still plugged in. Killing
all the URBs and disabling the endpoints before calling the driver's
disconnect method doesn't give the driver any control over what
happens, and it can leave devices in indeterminate states. For
example, when usb-storage unbinds it doesn't want to stop while in the
middle of transmitting a SCSI command.
The soft_unbind flag is added because in the past, a number of drivers
have experienced problems related to ongoing I/O after their disconnect
routine returned. Hence "soft" unbinding is made available only to
drivers that claim to support it.
The patch also replaces "interface_to_usbdev(intf)" with "udev" in a
couple of places, a minor simplification.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This changes usb_create_hcd() to be able to handle the fact that
pci_name() has changed to a constant string.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: (25 commits)
mmtimer: Push BKL down into the ioctl handler
[IA64] Remove experimental status of kdump
[IA64] Update ia64 mmr list for SGI uv
[IA64] Avoid overflowing ia64_cpu_to_sapicid in acpi_map_lsapic()
[IA64] adding parameter check to module_free()
[IA64] improper printk format in acpi-cpufreq
[IA64] pv_ops: move some functions in ivt.S to avoid lack of space.
[IA64] pvops: documentation on ia64/pv_ops
[IA64] pvops: add to hooks, pv_time_ops, for steal time accounting.
[IA64] pvops: add hooks, pv_irq_ops, to paravirtualized irq related operations.
[IA64] pvops: add hooks, pv_iosapic_ops, to paravirtualize iosapic.
[IA64] pvops: define initialization hooks, pv_init_ops, for paravirtualized environment.
[IA64] pvops: paravirtualize NR_IRQS
[IA64] pvops: paravirtualize entry.S
[IA64] pvops: paravirtualize ivt.S
[IA64] pvops: paravirtualize minstate.h.
[IA64] pvops: define paravirtualized instructions for native.
[IA64] pvops: preparation for paravirtulization of hand written assembly code.
[IA64] pvops: introduce pv_cpu_ops to paravirtualize privileged instructions.
[IA64] pvops: add an early setup hook for pv_ops.
...
As suggested by Dave:
This patch adds a function to get the driver name from a struct net_device,
and consequently uses this in the watchdog timeout handler to print as
part of the message.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IOMMU code and the block layer can split things
up using different rules, so this can't work reliably.
Signed-off-by: David S. Miller <davem@davemloft.net>
There are a couple of places where (P)Dprintk is used which is an old
compile time enabled printk wrapper. Convert it to the generic
pr_debug().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
netfilter: nf_conntrack_sctp: fix sparse warnings
netfilter: nf_nat_sip: c= is optional for session
netfilter: xt_TCPMSS: collapse tcpmss_reverse_mtu{4,6} into one function
netfilter: nfnetlink_log: send complete hardware header
netfilter: xt_time: fix time's time_mt()'s use of do_div()
netfilter: accounting rework: ct_extend + 64bit counters (v4)
netlink: add NLA_PUT_BE64 macro
netfilter: nf_nat_core: eliminate useless find_appropriate_src for IP_NAT_RANGE_PROTO_RANDOM
hdlcdrv: Fix CRC calculation.
Revert "pkt_sched: Make default qdisc nonshared-multiqueue safe."
net: In __netif_schedule() use WARN_ON instead of BUG_ON
net: Improve simple_tx_hash().
pkt_sched: Remove unused variable skb in dev_deactivate_queue function.
sunhme: Remove stop/wake TX queue calls in set-multicast-list handler.
ucc_geth: do not touch net queue in adjust_link phylib callback
gianfar: do not touch net queue in adjust_link phylib callback
atl1: Do not wake queue before queue has been started.
* 'x86/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (160 commits)
x86: remove extra calling to get ext cpuid level
x86: use setup_clear_cpu_cap() when disabling the lapic
KVM: fix exception entry / build bug, on 64-bit
x86: add unknown_nmi_panic kernel parameter
x86, VisWS: turn into generic arch, eliminate leftover files
x86: add ->pre_time_init to x86_quirks
x86: extend and use x86_quirks to clean up NUMAQ code
x86: introduce x86_quirks
x86: improve debug printout: add target bootmem range in early_res_to_bootmem()
Subject: devmem, x86: fix rename of CONFIG_NONPROMISC_DEVMEM
x86: remove arch_get_ram_range
x86: Add a debugfs interface to dump PAT memtype
x86: Add a arch directory for x86 under debugfs
x86: i386: reduce boot fixmap space
i386/xen: add proper unwind annotations to xen_sysenter_target
x86: reduce force_mwait visibility
x86: reduce forbid_dac's visibility
x86: fix two modpost warnings
x86: check function status in EDD boot code
x86_64: ia32_signal.c: remove signal number conversion
...
* 'for-linus' of git://neil.brown.name/md: (52 commits)
md: Protect access to mddev->disks list using RCU
md: only count actual openers as access which prevent a 'stop'
md: linear: Make array_size sector-based and rename it to array_sectors.
md: Make mddev->array_size sector-based.
md: Make super_type->rdev_size_change() take sector-based sizes.
md: Fix check for overlapping devices.
md: Tidy up rdev_size_store a bit:
md: Remove some unused macros.
md: Turn rdev->sb_offset into a sector-based quantity.
md: Make calc_dev_sboffset() return a sector count.
md: Replace calc_dev_size() by calc_num_sectors().
md: Make update_size() take the number of sectors.
md: Better control of when do_md_stop is allowed to stop the array.
md: get_disk_info(): Don't convert between signed and unsigned and back.
md: Simplify restart_array().
md: alloc_disk_sb(): Return proper error value.
md: Simplify sb_equal().
md: Simplify uuid_equal().
md: sb_equal(): Fix misleading printk.
md: Fix a typo in the comment to cmd_match().
...
This patch adds some fields to NFLOG to be able to send the complete
hardware header with all necessary informations.
It sends to userspace:
* the type of hardware link
* the lenght of hardware header
* the hardware header
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Initially netfilter has had 64bit counters for conntrack-based accounting, but
it was changed in 2.6.14 to save memory. Unfortunately in-kernel 64bit counters are
still required, for example for "connbytes" extension. However, 64bit counters
waste a lot of memory and it was not possible to enable/disable it runtime.
This patch:
- reimplements accounting with respect to the extension infrastructure,
- makes one global version of seq_print_acct() instead of two seq_print_counters(),
- makes it possible to enable it at boot time (for CONFIG_SYSCTL/CONFIG_SYSFS=n),
- makes it possible to enable/disable it at runtime by sysctl or sysfs,
- extends counters from 32bit to 64bit,
- renames ip_conntrack_counter -> nf_conn_counter,
- enables accounting code unconditionally (no longer depends on CONFIG_NF_CT_ACCT),
- set initial accounting enable state based on CONFIG_NF_CT_ACCT
- removes buggy IPCT_COUNTER_FILLING event handling.
If accounting is enabled newly created connections get additional acct extend.
Old connections are not changed as it is not possible to add a ct_extend area
to confirmed conntrack. Accounting is performed for all connections with
acct extend regardless of a current state of "net.netfilter.nf_conntrack_acct".
Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add NLA_PUT_BE64 macro required for 64bit counters in netfilter
Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce a bvec merge function for device mapper devices
for dynamic size restrictions.
This code ensures the requested biovec lies within a single
target and then calls a target-specific function to check
against any constraints imposed by underlying devices.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-tip testing found this build bug:
arch/x86/kvm/built-in.o:(.text.fixup+0x1): relocation truncated to fit: R_X86_64_32 against `.text'
arch/x86/kvm/built-in.o:(.text.fixup+0xb): relocation truncated to fit: R_X86_64_32 against `.text'
arch/x86/kvm/built-in.o:(.text.fixup+0x15): relocation truncated to fit: R_X86_64_32 against `.text'
arch/x86/kvm/built-in.o:(.text.fixup+0x1f): relocation truncated to fit: R_X86_64_32 against `.text'
arch/x86/kvm/built-in.o:(.text.fixup+0x29): relocation truncated to fit: R_X86_64_32 against `.text'
Introduced by commit 4ecac3fd. The problem is that 'push' will default
to 32-bit, which is not wide enough as a fixup address. (and which would
crash on any real fixup event even if it was wide enough)
Introduce KVM_EX_PUSH to get the proper address push width on 64-bit too.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All modifications and most access to the mddev->disks list are made
under the reconfig_mutex lock. However there are three places where
the list is walked without any locking. If a reconfig happens at this
time, havoc (and oops) can ensue.
So use RCU to protect these accesses:
- wrap them in rcu_read_{,un}lock()
- use list_for_each_entry_rcu
- add to the list with list_add_rcu
- delete from the list with list_del_rcu
- delay the 'free' with call_rcu rather than schedule_work
Note that export_rdev did a list_del_init on this list. In almost all
cases the entry was not in the list anymore so it was a no-op and so
safe. It is no longer safe as after list_del_rcu we may not touch
the list_head.
An audit shows that export_rdev is called:
- after unbind_rdev_from_array, in which case the delete has
already been done,
- after bind_rdev_to_array fails, in which case the delete isn't needed.
- before the device has been put on a list at all (e.g. in
add_new_disk where reading the superblock fails).
- and in autorun devices after a failure when the device is on a
different list.
So remove the list_del_init call from export_rdev, and add it back
immediately before the called to export_rdev for that last case.
Note also that ->same_set is sometimes used for lists other than
mddev->list (e.g. candidates). In these cases rcu is not needed.
Signed-off-by: NeilBrown <neilb@suse.de>
Open isn't the only thing that increments ->active. e.g. reading
/proc/mdstat will increment it briefly. So to avoid false positives
in testing for concurrent access, introduce a new counter that counts
just the number of times the md device it open.
Signed-off-by: NeilBrown <neilb@suse.de>
This patch renames the array_size field of struct mddev_s to array_sectors
and converts all instances to use units of 512 byte sectors instead of 1k
blocks.
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
* 'for-2.6.27' of git://linux-nfs.org/~bfields/linux: (51 commits)
nfsd: nfs4xdr.c do-while is not a compound statement
nfsd: Use C99 initializers in fs/nfsd/nfs4xdr.c
lockd: Pass "struct sockaddr *" to new failover-by-IP function
lockd: get host reference in nlmsvc_create_block() instead of callers
lockd: minor svclock.c style fixes
lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_lock
lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_testlock
lockd: nlm_release_host() checks for NULL, caller needn't
file lock: reorder struct file_lock to save space on 64 bit builds
nfsd: take file and mnt write in nfs4_upgrade_open
nfsd: document open share bit tracking
nfsd: tabulate nfs4 xdr encoding functions
nfsd: dprint operation names
svcrdma: Change WR context get/put to use the kmem cache
svcrdma: Create a kmem cache for the WR contexts
svcrdma: Add flush_scheduled_work to module exit function
svcrdma: Limit ORD based on client's advertised IRD
svcrdma: Remove unused wait q from svcrdma_xprt structure
svcrdma: Remove unneeded spin locks from __svc_rdma_free
svcrdma: Add dma map count and WARN_ON
...
* 'for-linus' of git://git.o-hand.com/linux-mfd:
mfd: let asic3 use mem resource instead of bus_shift
mfd: remove DS1WM register definitions from asic3.h
mfd: add ASIC3_CONFIG_GPIO templates
mfd: fix the asic3 irq demux code
mfd: asic3 should depend on gpiolib
mfd: fix asic3 config array initialisation
mfd: move asic3 probe functions into __init section
mfd: Use uppercase only for asic3 macros and defines
mfd: use dev_* macros for asic3 debugging
mfd: New asic3 gpio configuration code
mfd: asic3 children platform data removal
mfd: asic3 gpiolib support
* 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (70 commits)
KVM: Adjust smp_call_function_mask() callers to new requirements
KVM: MMU: Fix potential race setting upper shadow ptes on nonpae hosts
KVM: x86 emulator: emulate clflush
KVM: MMU: improve invalid shadow root page handling
KVM: MMU: nuke shadowed pgtable pages and ptes on memslot destruction
KVM: Prefix some x86 low level function with kvm_, to avoid namespace issues
KVM: check injected pic irq within valid pic irqs
KVM: x86 emulator: Fix HLT instruction
KVM: Apply the kernel sigmask to vcpus blocked due to being uninitialized
KVM: VMX: Add ept_sync_context in flush_tlb
KVM: mmu_shrink: kvm_mmu_zap_page requires slots_lock to be held
x86: KVM guest: make kvm_smp_prepare_boot_cpu() static
KVM: SVM: fix suspend/resume support
KVM: s390: rename private structures
KVM: s390: Set guest storage limit and offset to sane values
KVM: Fix memory leak on guest exit
KVM: s390: dont allocate dirty bitmap
KVM: move slots_lock acquision down to vapic_exit
KVM: VMX: Fake emulate Intel perfctr MSRs
KVM: VMX: Fix a wrong usage of vmcs_config
...
The stab bits can't be referenced uniless the full
packet scheduler layer is enabled.
Reported by Stephen Rothwell.
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (1232 commits)
iucv: Fix bad merging.
net_sched: Add size table for qdiscs
net_sched: Add accessor function for packet length for qdiscs
net_sched: Add qdisc_enqueue wrapper
highmem: Export totalhigh_pages.
ipv6 mcast: Omit redundant address family checks in ip6_mc_source().
net: Use standard structures for generic socket address structures.
ipv6 netns: Make several "global" sysctl variables namespace aware.
netns: Use net_eq() to compare net-namespaces for optimization.
ipv6: remove unused macros from net/ipv6.h
ipv6: remove unused parameter from ip6_ra_control
tcp: fix kernel panic with listening_get_next
tcp: Remove redundant checks when setting eff_sacks
tcp: options clean up
tcp: Fix MD5 signatures for non-linear skbs
sctp: Update sctp global memory limit allocations.
sctp: remove unnecessary byteshifting, calculate directly in big-endian
sctp: Allow only 1 listening socket with SO_REUSEADDR
sctp: Do not leak memory on multiple listen() calls
sctp: Support ipv6only AF_INET6 sockets.
...
* 'for-linus' of git://www.jni.nu/cris:
[CRISv10] Clean up compressed/misc.c
[CRISv10] Correct whitespace damage.
[CRIS] Correct definition of subdirs for install_headers.
[CRIS] Correct image makefiles to allow using a separate OBJ-directory.
[CRIS] Build fixes for compressed and rescue images for v10 and v32:
It looks at least odd to apply spin_unlock to a mutex.
cris: compile fixes for 2.6.26-rc5
This patch contains the following possible cleanups:
- amiints.c: add a proper prototype for amiga_init_IRQ() in
include/asm-m68k/amigaints.h
- make the following needlessly global code static:
- config.c: amiga_model
- config.c: amiga_psfreq
- config.c: amiga_serial_console_write()
- #if 0 the following unused functions:
- config.c: amiga_serial_puts()
- config.c: amiga_serial_console_wait_key()
- config.c: amiga_serial_gets()
- remove the following unused variable:
- config.c: amiga_masterclock
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Unless I miss something that's code for a sparc machine even the sparc
code no longer supports that got copied to m68k when these files were
copied.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow no CPU/platform type for allnoconfig
- Provide a dummy value for FPSTATESIZE if no CPU type was selected
- Provide a dummy value for NR_IRQS if no platform type was selected
- Warn the user if no CPU or platform type was selected
Note: you still cannot build an allnoconfig kernel, as CONFIG_SWAP=n doesn't
build and we cannot easily fix that
(http://groups.google.com/group/linux.kernel/browse_thread/thread/d430c78b07e1827b)
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch removes CVS keywords that weren't updated for a long time
from comments.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'configfs-fixup-ptr-error' of git://oss.oracle.com/git/jlbec/linux-2.6:
configfs: Allow ->make_item() and ->make_group() to return detailed errors.
Revert "configfs: Allow ->make_item() and ->make_group() to return detailed errors."
Fix up the termios of the people who have not yet got with the program
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Switch cyclades to use the new tty_port structure
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Switch the stallion driver to use the tty_port structure
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Switch istallion to use the new tty_port structure
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Switch drivers using the old "generic serial" driver to use the tty_port
structures
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>