Change sysenter_setup to __cpuinit.
Change __INIT & __INITDATA to be cpu hotplug aware.
Resolve MODPOST warnings similar to:
WARNING: vmlinux - Section mismatch: reference to .init.text:sysenter_setup from
.text between 'identify_cpu' (at offset 0xc040a380) and 'detect_ht'
and
WARNING: vmlinux - Section mismatch: reference to .init.data:vsyscall_int80_end
from .text between 'sysenter_setup' (at offset 0xc041a269) and 'enable_sep_cpu'
WARNING: vmlinux - Section mismatch: reference to
.init.data:vsyscall_int80_start from .text between 'sysenter_setup' (at offset
0xc041a26e) and 'enable_sep_cpu'
WARNING: vmlinux - Section mismatch: reference to
.init.data:vsyscall_sysenter_end from .text between 'sysenter_setup' (at offset
0xc041a275) and 'enable_sep_cpu'
WARNING: vmlinux - Section mismatch: reference to
.init.data:vsyscall_sysenter_start from .text between 'sysenter_setup' (at
offset 0xc041a27a) and 'enable_sep_cpu'
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
For the CAFÉ NAND controller, we need to support non-canonical
representations of the Galois field. Allow the caller to provide its own
function for generating the field, and CAFÉ can use rslib instead of its
own implementation.
Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
This patch adds ablkcipher_request_set_tfm for those users that need
to manage the memory for ablkcipher requests directly.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the mid-level interface for asynchronous block ciphers.
It also includes a generic queueing mechanism that can be used by other
asynchronous crypto operations in future.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch passes the type/mask along when constructing instances of
templates. This is in preparation for templates that may support
multiple types of instances depending on what is requested. For example,
the planned software async crypto driver will use this construct.
For the moment this allows us to check whether the instance constructed
is of the correct type and avoid returning success if the type does not
match.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the frontend interface for asynchronous block ciphers.
In addition to the usual block cipher parameters, there is a callback
function pointer and a data pointer. The callback will be invoked only
if the encrypt/decrypt handlers return -EINPROGRESS. In other words,
if the return value of zero the completion handler (or the equivalent
code) needs to be invoked by the caller.
The request structure is allocated and freed by the caller. Its size
is determined by calling crypto_ablkcipher_reqsize(). The helpers
ablkcipher_request_alloc/ablkcipher_request_free can be used to manage
the memory for a request.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This is a very simple bitbanging I2C bus driver utilizing the new
arch-neutral GPIO API. Useful for chips that don't have a built-in
I2C controller, additional I2C busses, or testing purposes.
To use, include something similar to the following in the
board-specific setup code:
#include <linux/i2c-gpio.h>
static struct i2c_gpio_platform_data i2c_gpio_data = {
.sda_pin = GPIO_PIN_FOO,
.scl_pin = GPIO_PIN_BAR,
};
static struct platform_device i2c_gpio_device = {
.name = "i2c-gpio",
.id = 0,
.dev = {
.platform_data = &i2c_gpio_data,
},
};
Register this platform_device, set up the I2C pins as GPIO if
required and you're ready to go. This will use default values for
udelay and timeout, and will work with GPIO hardware that does not
support open drain mode, but allows sensing of the SDA and SCL lines
even when they are being driven.
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Add back the i2c_smbus_read_block_data helper function, it is needed
by the upcoming lm93 hardware monitoring driver and possibly others.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
The original i2c-algo-bit implementation uses a 33/66 SCL duty cycle
when bits are being written on the bus. While the I2C specification
doesn't forbid it, this prevents us from driving the I2C bus to its
max speed, limiting us to 66 kbps max on standard I2C busses.
Implementing a 50/50 duty cycle instead lets us max out the bandwidth
up to the theoretical max of 100 kbps on standard I2C busses. This is
particularly important when large amounts of data need to be transfered
over the bus, as is the case with some TV adapters when the firmware is
being uploaded.
In fact this change even allows, at least in theory, fast-mode I2C
support at 125, 166 and 250 kbps. There's no way to reach the
theoretical max of 400 kbps with this implementation. But I don't
think we want to put efforts in that direction anyway: software-driven
I2C is very CPU-intensive and bad for latency.
Other timing changes:
* Don't set SDA high explicitly on error, we're going to issue a stop
condition before we leave anyway.
* If an error occurs when sending the slave address, yield the CPU
before retrying, and remove the additional delay after the new start
condition.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
The i2c linux driver for blackfin architecture which supports blackfin
on-chip TWI controller i2c operation.
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Make i2c_del_driver a void function, like all other driver removal
functions. It always returned 0 even when errors occured, and nobody
ever actually checked the return value anyway. And we cannot fail
a module removal anyway.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Move the declaration of i2c-isa-only exported symbols to i2c-isa
itself, that's the best way to ensure nobody will attempt to use them.
Hopefully we'll get rid of the exports themselves soon anyway.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Add a new helper function to instantiate an i2c device. It is meant as a
replacement for i2c_new_device() when you don't know for sure at which
address your I2C/SMBus device lives. This happens frequently on TV
adapters for example, you know there is a tuner chip on the bus, but
depending on the exact board model and revision, it can live at different
addresses. So, the new i2c_new_probed_device() function will probe the bus
according to a list of addresses, and as soon as one of these addresses
responds, it will call i2c_new_device() on that one address.
This function will make it possible to port the old i2c drivers to the
new model quickly.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Add i2c_bit_add_numbered_bus(), which is equivalent to i2c_bit_add_bus
except that it calls i2c_add_numbered_adapter() at the end instead of
i2c_add_adapter().
Signed-off-by: Jean Delvare <khali@linux-fr.org>
This adds a call, i2c_add_numbered_adapter(), registering an I2C adapter
with a specific bus number and then creating I2C device nodes for any
pre-declared devices on that bus. It builds on previous patches adding
I2C probe() and remove() support, and that pre-declaration of devices.
This completes the core support for "new style" I2C device drivers.
Those follow the standard driver model for binding devices to drivers
(using probe and remove methods) rather than a legacy model (where the
driver tries to autoconfigure each bus, and registers devices itself).
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
This provides partial support for new-style I2C driver binding. It builds
on "struct i2c_board_info" declarations that identify I2C devices on a given
board. This is needed on systems with I2C devices that can't be fully probed
and/or autoconfigured, such as many embedded Linux configurations where the
way a given I2C device is wired may affect how it must be used.
There are two models for declaring such devices:
* LATE -- using a public function i2c_new_device(). This lets modules
declare I2C devices found *AFTER* a given I2C adapter becomes available.
For example, a PCI card could create adapters giving access to utility
chips on that card, and this would be used to associate those chips with
those adapters.
* EARLY -- from arch_initcall() level code, using a non-exported function
i2c_register_board_info(). This copies the declarations *BEFORE* such
an i2c_adapter becomes available, arranging that i2c_new_device() will
be called later when i2c-core registers the relevant i2c_adapter.
For example, arch/.../.../board-*.c files would declare the I2C devices
along with their platform data, and I2C devices would behave much like
PNPACPI devices. (That is, both enumerate from board-specific tables.)
To match the exported i2c_new_device(), the previously-private function
i2c_unregister_device() is now exported.
Pending later patches using these new APIs, this is effectively a NOP.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
More update for new style driver support: add a remove() method, and
use it in the relevant code paths.
Again, nothing will use this yet since there's nothing to create devices
feeding this infrastructure.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
One of a series of I2C infrastructure updates to support enumeration using
the standard Linux driver model.
This patch updates probe() and associated hotplug/coldplug support, but
not remove(). Nothing yet _uses_ it to create I2C devices, so those
hotplug/coldplug mechanisms will be the only externally visible change.
This patch will be an overall NOP since the I2C stack doesn't yet create
clients/devices except as part of binding them to legacy drivers.
Some code is moved earlier in the source code, helping group more of the
per-device infrastructure in one place and simplifying handling per-device
attributes.
Terminology being adopted: "legacy drivers" create devices (i2c_client)
themselves, while "new style" ones follow the driver model (the i2c_client
is handed to the probe routine). It's an either/or thing; the two models
don't mix, and drivers that try mixing them won't even be registered.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Let the I2C bus drivers emulate the SMBus Block Read and Block Process
Call transactions if they wish. This requires to define a new message
flag, which i2c-core will use to let the underlying I2C bus driver
know that the first received byte will specify the length of the read
message.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Rename dev_to_i2c_adapter() as to_i2c_adapter(), since the previous
syntax was a surprising and needless difference from normal naming
conventions in Linux.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
This shrinks the size of "struct i2c_client" by 40 bytes:
- Substantially shrinks the string used to identify the chip type
- The "flags" don't need to be so big
- Removes some internal padding
It also adds kerneldoc for that struct, explaining how "name" is really a
chip type identifier; it's otherwise potentially confusing.
Because the I2C_NAME_SIZE symbol was abused for both i2c_client.name
and for i2c_adapter.name, this needed to affect i2c_adapter too. The
adapters which used that symbol now use the more-obviously-correct
idiom of taking the size of that field.
JD: Shorten i2c_adapter.name from 50 to 48 bytes while we're here, to
avoid wasting space in padding.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Kill i2c_adapter_driver as it doesn't make sense and it prevents
further i2c-core cleanups. i2c_adapter devices are virtual devices
(ex-class devices) and as such they don't need a driver.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Kill i2c_adapter.class_dev. Instead, set the class of i2c_adapter.dev
to i2c_adapter_class, so that a symlink will be created for every
i2c_adapter in /sys/class/i2c-adapter.
The same change must be mirrored to i2c-isa as it duplicates some
of the i2c-core functionalities.
User-space tools and libraries might need some adjustments. In
particular, libsensors from lm_sensors 2.10.3 or later is required for
proper discovery of i2c adapter names after this change.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Fix handling of low voltage MMC cards.
The latest MMC and SD specs both agree that support for
low-voltage operations is indicated by bit 7 in the OCR.
The MMC spec states that the low voltage range is
1.65-1.95V while the SD spec leaves the actual voltage
range undefined - meaning that there is still no such
thing as a low voltage SD card.
However, an old Sandisk spec implied that bits 7.0
represented voltages below 2.0V in 1V or 0.5V increments,
and the code was accordingly written with that expectation.
This confusion meant that host drivers attempting to support
the typical low voltage (1.8V) would set the wrong bits in
the host OCR mask (usually bits 5 and/or 6) resulting in the
the low voltage mode never being used.
This change corrects the low voltage range and adds sanity
checks on the reserved bits (0-6) and for SD cards that
claim to support low-voltage operations.
Signed-off-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
libata previously depended upon waits in prereset to get resets after
hotplug right for both spin up and device ready wait. This was
necessary both for reliablity and speed as reset was likely to fail if
initiated too early and each try usually took more than 30secs to
fail. Previous patches fixed the reliability part by fixing status
and SCR handling in resets. This patch remedies the speed part by
improving reset sequencing.
Prereset waiting timeout is adjusted to 10s because spinup wait is
replaced by reset sequencing and !BSY wait is not as important as
before. During boot or module loading where the drive is already
fully spun up, !BSY wait succeeds immediately, so 10s should be enough
in most cases. It matters after hotplugging or other error
conditions, but in those cases, !BSY wait in prereset simply can't be
relied upon due to the varied and weird behaviors ATA controllers and
devices show.
Reset is now driven by ata_eh_reset_timeouts[] table which contains
timeouts for each reset try. The first reset can be softreset but the
following ones are always hardreset if available. Each timeout
defines deadline for the reset try. If a reset try fails, reset is
retried with the next timeout till the end of the timeout table is
reached. If a reset try fails before the timeout with error, libata
waits till the deadline of the failed try before retrying.
IOW, the timeout table defines timetable of reset tries such that the
n'th try always begins at least after the sum of all previous timeouts
has passed. The current timetable defines 4 tries and takes around 1
minute.
@0 : First try. This should succeed most of the time during boot.
@10 : 10s is enough to spin up most consumer harddrives. Give it
another shot.
@20 : 20s should spin up > 99% of working drives. This has 30s
timeout for retarded devices needing long idleness post reset.
@55 : Final try with 5s timeout just in case.
The above timetable is trade off between not annoying the device too
much with frequent resets and taking reasonable amount of time in most
cases. Some controllers may do better with shorter timeouts while
others may fare better with longer but we just can't rely upon LLD
writers to test each controller with wide variety of devices using
various scenarios. We need default behavior which reasonably fits
most cases.
I've tested the above timetable on a dozen SATA controllers and a few
PATA controllers with about a dozen different drives from all major
vendors and 4 different ODDs from three different vendors for both
boot and hotplug (if available) cases.
Boot probing is not affected unless the device is broken in which
cases new code gives up on the port after a minute rather than five or
nine minutes. When hotplugging, most devices get detected on the
first or second try. Multi-platter drives with long spin up time
which sometimes took > 40 secs with the original code, now usually
comes up during the second try and at least right after the third try
@20.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add @deadline to prereset and reset methods and make them honor it.
ata_wait_ready() which directly takes @deadline is implemented to be
used as the wait function. This patch is in preparation for EH timing
improvements.
* ata_wait_ready() never does busy sleep. It's only used from EH and
no wait in EH is that urgent. This function also prints 'be
patient' message automatically after 5 secs of waiting if more than
3 secs is remaining till deadline.
* ata_bus_post_reset() now fails with error code if any of its wait
fails. This is important because earlier reset tries will have
shorter timeout than the spec requires. If a device fails to
respond before the short timeout, reset should be retried with
longer timeout rather than silently ignoring the device.
There are three behavior differences.
1. Timeout is applied to both devices at once, not separately. This
is more consistent with what the spec says.
2. When a device passes devchk but fails to become ready before
deadline. Previouly, post_reset would just succeed and let
device classification remove the device. New code fails the
reset thus causing reset retry. After a few times, EH will give
up disabling the port.
3. When slave device passes devchk but fails to become accessible
(TF-wise) after reset. Original code disables dev1 after 30s
timeout and continues as if the device doesn't exist, while the
patched code fails reset. When this happens, new code fails
reset on whole port rather than proceeding with only the primary
device.
If the failing device is suffering transient problems, new code
retries reset which is a better behavior. If the failing device is
actually broken, the net effect is identical to it, but not to the
other device sharing the channel. In the previous code, reset would
have succeeded after 30s thus detecting the working one. In the new
code, reset fails and whole port gets disabled. IMO, it's a
pathological case anyway (broken device sharing bus with working
one) and doesn't really matter.
* ata_bus_softreset() is changed to return error code from
ata_bus_post_reset(). It used to return 0 unconditionally.
* Spin up waiting is to be removed and not converted to honor
deadline.
* To be on the safe side, deadline is set to 40s for the time being.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Consolidate the list of available voltages.
Up until now, a separate set of defines has been
used for host->vdd than that used for the OCR
voltage mask values. Having two sets of defines
allows them to get out of sync and the current
sets are already inconsistent with one claiming
to describe ranges and the other specific voltages.
Only the SDHCI driver uses the host->vdd defines and
it is easily fixed to use the OCR defines.
Signed-off-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Delegate protocol handling to "bus handlers". This allows the core to
just handle the task of arbitrating the bus. Initialisation and
pampering of cards is now done by the different bus handlers.
This design also allows MMC and SD (and later SDIO) to be more cleanly
separated, allowing easier maintenance.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Move protocol operations and definitions into their own files
in an effort to separate protocol handling and bus
arbitration more clearly.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
The classic MMC bus was defined as multi card bus
system, which is reflected in the design in the MMC
layer.
When SD showed up, the bus topology was abandoned
and a star topology (one card per host) was mandated.
MMC version 4 has followed this, officially deprecating
the bus topology.
As we do not have any known users of the bus
topology we can remove support for it. This will
simplify the code and rectify some incorrect
assumptions in the newer additions.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Make sure we kill of any pending detection runs when the host
is removed instead of when it is freed. Also add some debugging
to make sure the driver doesn't queue up more detection after it
has removed the host.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
All host drivers were #include:ing mmc/protocol.h just to
get access to the OCR bit defines. Move these to host.h instead.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Support for MMC 4.2 sector based cards. This tweaks the init a
bit and reads a new field out of the EXT_CSD.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
It was found that delays associated with issue and completion of the commands
severely limit performance of the new, fast SD cards. To alleviate this issue
scatter-gather emulation in software is implemented for both dma and pio
transfer modes. Non-block aligned and high memory sg entries are accounted
for.
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
State machine used to to track mmc command state was found to be fragile
and unreliable, making many cards unusable. The safer solution is to perform
all needed checks at every card event.
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Some details of the device management (create, add, remove) are really
belong to the tifm_core, as they are not hardware specific.
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Some details of the adapter management (create, add, remove) are really
belong to the tifm_core, as they are not hardware specific.
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Freezeable workqueue makes sure that adapter work items (device insertions
and removals) would be handled after the system is fully resumed. Previously
this was achieved by explicit freezing of the kthread.
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Remove code duplicating the kernel functionality and clean up data
structures involved in driver matching.
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Instead of passing transformed value of adapter interrupt status to
socket drivers, implement two separate callbacks - one for card events
and another for dma events.
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Add code to accept purge commands from userland.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
READDIRPLUS can be a performance hindrance when the client is working with
large directories. In addition, some servers still have bugs in their
implementations (e.g. Tru64 returns wrong values for the fsid).
Add a mount flag to enable users to turn it off at mount time following the
implementation in Apple's NFS client.
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
net/sunrpc/pmap_clnt.c has been replaced by net/sunrpc/rpcb_clnt.c.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Introduce a replacement for the in-kernel portmapper client that supports
all 3 versions of the rpcbind protocol. This code is not used yet.
Original code by Groupe Bull updated for the latest kernel, with multiple
bug fixes.
Note that rpcb_clnt.c does not yet support registering via versions 3 and
4 of the rpcbind protocol. That is planned for a later patch.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Currently rpc_malloc sets req->rq_buffer internally. Make this a more
generic interface: return a pointer to the new buffer (or NULL) and
make the caller set req->rq_buffer and req->rq_bufsize. This looks much
more like kmalloc and eliminates the side effects.
To fix a potential deadlock, this patch also replaces GFP_NOFS with
GFP_NOWAIT in rpc_malloc. This prevents async RPCs from sleeping outside
the RPC's task scheduler while allocating their buffer.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The RPC buffer size estimation logic in net/sunrpc/clnt.c always
significantly overestimates the requirements for the buffer size.
A little instrumentation demonstrated that in fact rpc_malloc was never
allocating the buffer from the mempool, but almost always called kmalloc.
To compute the size of the RPC buffer more precisely, split p_bufsiz into
two fields; one for the argument size, and one for the result size.
Then, compute the sum of the exact call and reply header sizes, and split
the RPC buffer precisely between the two. That should keep almost all RPC
buffers within the 2KiB buffer mempool limit.
And, we can finally be rid of RPC_SLACK_SPACE!
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
NLM version 4 requests estimate the call and reply header sizes rather
conservatively, using the very maximum size allowed in the protocol even
though Linux always uses only a small fraction of the allowable space.
Reduce the size of caller and lock arguments to conserve RPC buffer space
while XDR encoding NLM4 arguments. Add compile-time checks to ensure the
hostname string won't overflow NLM protocol maximums.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Currently we do write coalescing in a very inefficient manner: one pass in
generic_writepages() in order to lock the pages for writing, then one pass
in nfs_flush_mapping() and/or nfs_sync_mapping_wait() in order to gather
the locked pages for coalescing into RPC requests of size "wsize".
In fact, it turns out there is actually a deadlock possible here since we
only start I/O on the second pass. If the user signals the process while
we're in nfs_sync_mapping_wait(), for instance, then we may exit before
starting I/O on all the requests that have been queued up.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Do the coalescing of read requests into block sized requests at start of
I/O as we scan through the pages instead of going through a second pass.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
For backwards compatibility, call_platform_enable_wakeup() can return 0
instead of -EIO since we aren't guaranteed to have errno defined.
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a kvasprintf() function to complement kasprintf().
No in-tree users yet, but I have some coming up.
[akpm@linux-foundation.org: EXPORT it]
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Keir Fraser <keir@xensource.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch changes the docs and behaviour from "all states valid" to "no
states valid" if no .valid callback is assigned. Users of pm_ops that only
need mem sleep can assign pm_valid_only_mem without any overhead, others
will require more elaborate callbacks.
Now that all users of pm_ops have a .valid callback this is a safe thing to
do and prevents things from getting messy again as they were before.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Pavel Machek <pavel@ucw.cz>
Looks-okay-to: Rafael J. Wysocki <rjw@sisk.pl>
Cc: <linux-pm@lists.linux-foundation.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Almost all users of pm_ops only support mem sleep, don't check in .valid and
don't reject any others in .prepare so users can be confused if they check
/sys/power/state, especially when new states are added (these would then
result in s-t-r although they're supposed to be something different).
This patch implements a generic pm_valid_only_mem function that is then
exported for users and puts it to use in almost all existing pm_ops.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: David Brownell <david-b@pacbell.net>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: linux-pm@lists.linux-foundation.org
Cc: Len Brown <lenb@kernel.org>
Acked-by: Russell King <rmk@arm.linux.org.uk>
Cc: Greg KH <greg@kroah.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch removes the firmware disk suspend mode which is the wrong approach,
it is supposed to be used for implementing firmware-based disk suspend but
cannot actually be used for that.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: <linux-pm@lists.linux-foundation.org>
Cc: David Brownell <david-b@pacbell.net>
Cc: Len Brown <lenb@kernel.org>
Acked-by: Russell King <rmk@arm.linux.org.uk>
Cc: Greg KH <greg@kroah.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch series cleans up some misconceptions about pm_ops. Some users of
the pm_ops structure attempt to use it to stop the user from entering suspend
to disk, this, however, is not possible since the user can always use
"shutdown" in /sys/power/disk and then the pm_ops are never invoked. Also,
platforms that don't support suspend to disk simply should not allow
configuring SOFTWARE_SUSPEND (read the help text on it, it only selects
suspend to disk and nothing else, all the other stuff depends on PM).
The pm_ops structure is actually intended to provide a way to enter
platform-defined sleep states (currently supported states are "standby" and
"mem" (suspend to ram)) and additionally (if SOFTWARE_SUSPEND is configured)
allows a platform to support a platform specific way to enter low-power mode
once everything has been saved to disk. This is currently only used by ACPI
(S4).
This patch:
The pm_ops.pm_disk_mode is used in totally bogus ways since nobody really
seems to understand what it actually does.
This patch clarifies the pm_disk_mode description.
It also removes all the arm and sh users that think they can veto suspend to
disk via pm_ops; not so since the user can always do echo shutdown >
/sys/power/disk, they need to find a better way involving Kconfig or such.
ACPI is the only user left with a non-zero pm_disk_mode.
The patch also sets the default mode to shutdown again, but when a new pm_ops
is registered its pm_disk_mode is selected as default, that way the default
stays for ACPI where it is apparently required.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: David Brownell <david-b@pacbell.net>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: <linux-pm@lists.linux-foundation.org>
Cc: Len Brown <lenb@kernel.org>
Acked-by: Russell King <rmk@arm.linux.org.uk>
Cc: Greg KH <greg@kroah.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Today's print_symbol function dumps a kernel symbol with printk. This
patch extends the functionality of kallsyms.c so that the symbol lookup
function may be used without the printk. This is useful for modules that
want to dump symbols elsewhere, for example, to debugfs. I intend to use
the new function call in the GFS2 file system (which will be a separate
patch).
[akpm@linux-foundation.org: build fix]
[clameter@sgi.com: sprint_symbol should return length of string like sprintf]
Signed-off-by: Robert Peterson <rpeterso@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Cc: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Paulo Marques <pmarques@grupopie.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jikos/hid: (21 commits)
USB HID: don't warn on idVendor == 0
USB HID: add 'quirks' module parameter
USB HID: add support for dynamically-created quirks
USB HID: clarify static quirk handling as squirks
USB HID: encapsulate quirk handling into hid-quirks.c
USB HID: EMS USBII device needs HID_QUIRK_MULTI_INPUT
HID: update copyright and authorship macro
HID: introduce proper zeroing of unused bits in output reports
USB HID: add support for WiseGroup MP-8800 Quad Joypad
USB HID: add FF support for Logitech Force 3D Pro Joystick
USB HID: numlock quirk for dell W7658 keyboard
USB HID: Logitech MX3000 keyboard needs report descriptor quirk
USB HID: extend quirk for Logitech S510 keyboard
USB HID: usbkbd/usbmouse - handle errors when registering devices
USB HID: add QUIRK_HIDDEV for Belkin Flip KVM
HID: enable dead keys on a belkin wireless keyboard
USB HID: Thustmaster firestorm dual power v1 support
USB HID: specify explicit size for hid_blacklist.quirks
USB HID: fix retry & reset logic
USB HID: consolidate vendor/product ids
...
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
[IPV4] SNMP: Support OutMcastPkts and OutBcastPkts
[IPV4] SNMP: Support InMcastPkts and InBcastPkts
[IPV4] SNMP: Support InTruncatedPkts
[IPV4] SNMP: Support InNoRoutes
[SNMP]: Add definitions for {In,Out}BcastPkts
[TCP] FRTO: RFC4138 allows Nagle override when new data must be sent
[TCP] FRTO: Delay skb available check until it's mandatory
[XFRM]: Restrict upper layer information by bundle.
[TCP]: Catch skb with S+L bugs earlier
[PATCH] INET : IPV4 UDP lookups converted to a 2 pass algo
[L2TP]: Add the ability to autoload a pppox protocol module.
[SKB]: Introduce skb_queue_walk_safe()
[AF_IUCV/IUCV]: smp_call_function deadlock
[IPV6]: Fix slab corruption running ip6sic
[TCP]: Update references in two old comments
[XFRM]: Export SPD info
[IPV6]: Track device renames in snmp6.
[SCTP]: Fix sctp_getsockopt_local_addrs_old() to use local storage.
[NET]: Remove NETIF_F_INTERNAL_STATS, default to internal stats.
[NETPOLL]: Remove CONFIG_NETPOLL_RX
...
* 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block:
[PATCH] elevator: elv_list_lock does not need irq disabling
[BLOCK] Don't pin lots of memory in mempools
cfq-iosched: speedup cic rb lookup
ll_rw_blk: add io_context private pointer
cfq-iosched: get rid of cfqq hash
cfq-iosched: tighten queue request overlap condition
cfq-iosched: improve sync vs async workloads
cfq-iosched: never allow an async queue idling
cfq-iosched: get rid of ->dispatch_slice
cfq-iosched: don't pass unused preemption variable around
cfq-iosched: get rid of ->cur_rr and ->cfq_list
cfq-iosched: slice offset should take ioprio into account
[PATCH] cfq-iosched: style cleanups and comments
cfq-iosched: sort IDLE queues into the rbtree
cfq-iosched: sort RT queues into the rbtree
[PATCH] cfq-iosched: speed up rbtree handling
cfq-iosched: rework the whole round-robin list concept
cfq-iosched: minor updates
cfq-iosched: development update
cfq-iosched: improve preemption for cooperating tasks
The updated IP-MIB RFC (RFC4293) specifys new objects, InBcastPkts
and OutBcastPkts. This adds definitions for them.
Signed-off-by: Mitsuru Chinen <mitch@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently we scale the mempool sizes depending on memory installed
in the machine, except for the bio pool itself which sits at a fixed
256 entry pre-allocation.
There's really no point in "optimizing" this OOM path, we just need
enough preallocated to make progress. A single unit is enough, lets
scale it down to 2 just to be on the safe side.
This patch saves ~150kb of pinned kernel memory on a 32-bit box.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This patch provides a method for walking skb lists while inserting or
removing skbs from the list.
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
input-polldev provides a skeleton for supporting simple input
devices that need to be periodically scanned or polled to
detect changes in their state.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (107 commits)
smc911x: fix compilation breakage wjen debug is on
[netdrvr] eexpress: minor corrections
add NAPI support to sb1250-mac.c
ixgb: ROUND_UP macro cleanup in drivers/net/ixgb
e1000: ROUND_UP macro cleanup in drivers/net/e1000
Generic HDLC sparse annotations
e100: Optionally use I/O mode only to access register space
e100: allow bad MAC address when running with invalid eeprom csum
ehea: fix for dlpar support
ehea: fix for sysfs entries
3C509: Remove unnecessary include of <linux/pm_legacy.h>
NetXen: Fix for vmalloc issues
NetXen: Fixes for Power PC architecture
NetXen: Port swap feature for multi port cards
NetXen: Removal of redundant macros
NetXen: Multi PCI support for Quad cards
NetXen: Removal of redundant argument passing
NetXen: Use multiple PCI functions
[netdrvr e100] experiment with doing RX in a similar manner to eepro100
[PATCH] ieee80211: add missing global needed by IEEE80211_DEBUG_XXXX
...
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: (86 commits)
SPIN_LOCK_UNLOCKED cleanup in drivers/ata/pata_winbond.c
drivers/ata/pata_cmd640.c: fix build with CONFIG_PM=n
pata_hpt37x: Further small fixes
pata_hpt3x2n: Add HPT371N support and other bits
ata: printk warning fixes
libata: separate ATA_EHI_DID_RESET into DID_SOFTRESET and DID_HARDRESET
ahci: consolidate common port flags
ata_timing: ensure t->cycle is always correct
libata: add missing call to ->cable_detect() in new EH path
pata_amd: remove contamination added during cable_detect conversion
libata: Handle drives that require a spin-up command before first access
libata: HPA support
libata: kill probe_ent and related helpers
libata: convert the remaining PATA drivers to new init model
libata: convert the remaining SATA drivers to new init model
libata: convert ata_pci_init_native_mode() users to new init model
libata: convert drivers with combined SATA/PATA ports to new init model
libata: add init helpers including ata_pci_prepare_native_host()
libata: convert native PCI host handling to new init model
libata: convert legacy PCI host handling to new init model
...
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (105 commits)
sonypi: use mutex instead of semaphore
sony-laptop: remove user visible camera controls as platform attributes
meye: make meye use sony-laptop instead of sonypi
sony-laptop: add a meye-usable include file for camera ops
sony-laptop: complete the motion eye camera support in sony-laptop
sonypi: try to detect if sony-laptop has already taken one of the known ioports
sonypi: suggest sonypi users to try sony-laptop instead
sony-laptop: add edge modem support (also called WWAN)
sony-laptop: add locking on accesses to the ioport and global vars
sony-laptop: add camera enable/disable parameter, better handle possible infinite loop
thinkpad-acpi: make drivers/misc/thinkpad_acpi:fan_mutex static
ACPI: thinkpad-acpi: add sysfs support to wan and bluetooth subdrivers
ACPI: thinkpad-acpi: add sysfs support to hotkey subdriver
ACPI: thinkpad-acpi: improve dock subdriver initialization
ACPI: thinkpad-acpi: improve debugging for acpi helpers
ACPI: thinkpad-acpi: improve fan control documentation
ACPI: thinkpad-acpi: map ENXIO to EINVAL for fan sysfs
ACPI: thinkpad-acpi: fix a fan watchdog invocation
ACPI: thinkpad-acpi: do not arm fan watchdog if it would not work
ACPI: thinkpad-acpi: add a fan-control feature master toggle
...
With this patch you can use iproute2 in user space to efficiently see
how many policies exist in different directions.
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu conviced me that a new flag was overkill; every driver
currently overrides get_stats, so we might as well make the internal
one the default. If someone did fail to set get_stats, they would now
get all 0 stats instead of "No statistics available".
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
CONFIG_NETPOLL_TRAP causes the TX queue controls to be completely bypassed in
the netpoll's "trapped" mode which easily causes overflows in the drivers with
short TX queues (most notably, in 8139too with its 4-deep queue). So, make
this option more sensible by making it only bypass the TX softirq wakeup.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Acked-by: Tom Rini <trini@kernel.crashing.org>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Copy and rename (for easier co-existence) the MEYE-wise exported interface.
Signed-off-by: Mattia Dongili <malattia@linux.it>
Signed-off-by: Len Brown <len.brown@intel.com>
Separate ATA_EHI_DID_RESET into ATA_EHI_DID_SOFTRESET and
ATA_EHI_DID_HARDRESET. ATA_EHI_DID_RESET is redefined as OR of the
two flags. This patch doesn't introduce any behavior change. This
will be used later to determine whether _SDD is necessary or not.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
(S)ATA drives can be configured for "power-up in standby",
a mode whereby a specific "spin up now!" command is required
before the first media access.
Currently, a drive with this feature enabled can not be used at all
with libata, and once in this mode, the drive becomes a doorstop.
The older drivers/ide subsystem at least enumerates the drive,
so that it can be woken up after the fact from a userspace HDIO_*
command, but not libata.
This patch adds support to libata for the "power-up in standby"
mode where a "spin up now!" command (SET_FEATURES) is needed.
With this, libata will recognize such drives, spin them up,
and then re-IDENTIFY them if necessary to get a full/complete
set of drive features data.
Drives in this state are determined by looking for
special values in id[2], as documented in the current ATA specs.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Alan Cox <alan@redhat.com>
Add support for ignoring the BIOS HPA result (off by default) and setting
the disk to the full available size unless already frozen.
Tested with various platforms/disks and confirmed to work with the
Macintosh (which broke earlier) and ata_piix (breakage due to the LBA48
readback that Tejun fixed).
For normal users this brings us, I believe, to feature parity with old IDE
(and of course more featured in some areas too).
Signed-off-by: Jeff Garzik <jeff@garzik.org>
All drivers are converted to new init model. Kill probe_ent,
ata_device_add() and ata_pci_init_native_mode().
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
These will be used to convert LLDs to new init model.
* Add irq_handler field to port_info. In new init model, requesting
IRQ is LLD's responsibility and libata doesn't need to know about
irq_handler. Most LLDs can simply register their irq_handler but
some need different irq_handler depending on specific chip. The
added port_info->irq_handler field can be used by LLDs to select
the matching IRQ handler in such cases.
* Add ata_dummy_port_info.
* Implement ata_pci_prepare_native_host(), a helper to alloc ATA host,
acquire all resources and init the host in one go.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Convert native PCI host handling to alloc-init-register model. New
function ata_pci_init_native_host() follows the new init model and
replaces ata_pci_init_native_mode(). As there are remaining LLD
users, the old function isn't removed yet.
ata_pci_init_one() is reimplemented using the new function and now
fully converted to new init model.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Implement ata_host_alloc_pinfo() and ata_host_register(). These helpers
will be used in the following patches to adopt new init model.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Reorganize ata_host_alloc() and its subroutines into the following
three functions.
* ata_host_alloc() : allocates host and its ports. shost is not
registered automatically.
* ata_scsi_add_hosts() : allocates and adds shosts associated with an
ATA host. Used by ata_host_register().
* ata_host_register() : takes a fully initialized ata_host structure
and registers it to libata layer and probes it.
Only ata_host_alloc() and ata_host_register() are exported.
ata_device_add() is rewritten using the above functions. This patch
does not introduce any observable behavior change. Things worth
mentioning.
* print_id is assigned at registration time and LLDs are allowed to
overallocate ports and reduce host->n_ports during initialization.
ata_host_register() will throw away unused ports automatically.
* All SCSI host initialization stuff now resides in
ata_scsi_add_hosts() in libata-scsi.c, where it should be.
* ipr is now the only user of ata_host_init(). Either kill it by
converting ipr to use ata_host_alloc() and friends or rename and
move it to libata-scsi.c
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Separate out ata_host_start() from ata_device_add(). ata_host_start()
calls ->port_start on each port if available and freezes the port.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Don't embed ap inside shost. Allocate it separately and point it back
from shosts's hostdata. This makes port allocation more flexible and
allows regular ATA and SAS share host alloc/init paths.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
SB600 RAID and SB600 SATA is the same controller and share the
same PCI ID 0x4380. There is no such PCI ID 0x4381.
Signed-off-by: Conke Hu <conke.hu@gmail.com>
---------
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The READ/WRITE LONG commands are theoretically obsolete,
but the majority of drives in existance still implement them.
The WRITE_LONG and WRITE_LONG_ONCE commands are of particular
interest for fault injection testing -- eg. creating "media errors"
at specific locations on a disk.
The fussy bit is that these commands require a non-standard
sector size, usually 520 bytes instead of 512.
This patch adds support to libata for READ/WRITE LONG commands
issued via SG_IO/ATA_16.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Both old-IDE and libata should be able handle all controllers and
devices found using normal resource reservation methods.
This eliminates the awful, low-performing split-driver configuration
where old-IDE drove the PATA portion of a PCI device, in PIO-only mode,
and libata drove the SATA portion of the /same/ PCI device, in DMA mode.
Typically vendors would ship SATA hard drive / PATA optical
configuration, which would lend itself to slow (PIO-only) CD-ROM
performance.
For Intel users running in combined mode, it is now wholly dependent on
your driver choice (potentially link order, if you compile both drivers
in) whether old-IDE or libata will drive your hardware.
In either case, you will get full performance from both SATA and PATA
ports now, without having to pass a kernel command line parameter.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
With Tejun having added adev->ap some time ago we can get rid of the
almost unused port being passed to mode filters. And while we are
doing filters, lets turn on the !IORDY filter as well.
Signed-off-by: Alan Cox <alan@redhat.com>
With some hand massaging from
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This splits set_mode into do_set_mode and the wrapper so that a driver can
call the standard method inside its own. This in theory also obsoletes
->post_set_mode().
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Implement pcim_iounmap_regions() - the opposite of
pcim_iomap_regions().
Signed-off-by: Tejun heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2.6.21-rc has horrible problems with libata and PATA cable types (and
thus speeds). This occurs because Tejun fixed a pile of other bugs and
we now do cable detect enforcement for drive side detection properly.
Unfortunately we don't do the process around cable detection right. Tejun
identified the problem and pointed to the right Annex in the spec, this patch
implements the rest of the needed changes.
We add a ->cable_detect() method called after the identify
sequence which allows a host to do host side detection at this point
should it wish, or to modify the results of the drive side identify.
This separate ->cable_detect method also cleans up a lot of code because
many drivers have their own error_handler methods which really just set
the cable type.
If there is no ->cable_detect method the cable type is left alone so a
driver setting it earlier (eg because it has the SATA flags set or
because it uses the old error_handler approach) will still do the right
thing (or at least the same thing) as before.
This patch simply adds the cable_detect method and helpers it doesn't use
them but other follow up patches will (ie Adrian please don't submit
patches to unexport them ;))
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
It used to be impossible to get from ata_device to ata_port but that is
no longer true. Various methods have been cleaned up over time but
dev_config still takes both and most users don't need both anyway. Tidy
this one up
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
migrate ucc_geth to use the common phylib code.
There are several side effects from doing this:
o deprecate 'interface' property specification present
in some old device tree source files in
favour of a split 'max-speed' and 'interface-type'
description to appropriately match definitions
in include/linux/phy.h. Note that 'interface' property
is still honoured if max-speed or interface-type
are not present (backward compatible).
o compile-time CONFIG_UGETH_HAS_GIGA is eliminated
in favour of probe time speed derivation logic.
o adjust_link streamlined to only operate on maccfg2
and upsmr.r10m, instead of reapplying static initial
values related to the interface-type.
o Addition of UEC MDIO of_platform driver requires
platform code add 'mdio' type to id list
prior to calling of_platform_bus_probe (separate patch).
o ucc_struct_init introduced to reduce ucc_geth_startup
complexity.
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The RGMII spec allows compliance for devices that implement an internal
delay on TXC or RXC inside the transmitter. This patch adds an RGMII_ID
definition to support RGMII-ID devices in the phylib.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
After 13 years of use, it looks like my email address is finally going
to disappear. While this is likely to drop the amount of incoming spam
greatly ;-), it may also affect more appropriate messages, so let's
update my email address in various places. In addition, Host AP mailing
list is subscribers-only and linux-wireless can also be used for
discussing issues related to this driver which is now shown in
MAINTAINERS.
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Current tc35815 driver is very obsolete and less maintained for a long
time. Replace it with a new driver based on one from CELF patch
archive.
Major advantages of CELF version (version 1.23, for kernel 2.6.10) are:
* Independent of JMR3927.
(Actually independent of MIPS, but AFAIK the chip is used only on
MIPS platforms)
* TX4938 support.
* 64-bit proof.
* Asynchronous and on-demand auto negotiation.
* High performance on non-coherent architecture.
* ethtool support.
* Many bugfixes and cleanups.
And improvoments since version 1.23 are:
* TX4939 support.
* NETPOLL support.
* NAPI support. (disabled by default)
* Reduce memcpy on receiving.
* PM support.
* Many cleanups and bugfixes.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* git://git.infradead.org/mtd-2.6: (46 commits)
[MTD] [MAPS] drivers/mtd/maps/ck804xrom.c: convert pci_module_init()
[MTD] [NAND] CM-x270 MTD driver
[MTD] [NAND] Wrong calculation of page number in nand_block_bad()
[MTD] [MAPS] fix plat-ram printk format
[JFFS2] Fix compr_rubin.c build after include file elimination.
[JFFS2] Handle inodes with only a single metadata node with non-zero isize
[JFFS2] Tidy up licensing/copyright boilerplate.
[MTD] [OneNAND] Exit loop only when column start with 0
[MTD] [OneNAND] Fix access the past of the real oobfree array
[MTD] [OneNAND] Update Samsung OneNAND official URL
[JFFS2] Better fix for all-zero node headers
[JFFS2] Improve read_inode memory usage, v2.
[JFFS2] Improve failure mode if inode checking leaves unchecked space.
[JFFS2] Fix cross-endian build.
[MTD] Finish conversion mtd_blkdevs to use the kthread API
[JFFS2] Obsolete dirent nodes immediately on unlink, where possible.
Use menuconfig objects: MTD
[MTD] mtd_blkdevs: Convert to use the kthread API
[MTD] Fix fwh_lock locking
[JFFS2] Speed up mount for directly-mapped NOR flash
...
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6: (78 commits)
USB: update MAINAINERS and CREDITS for Freescale USB driver
USB: update gadget files for fsl_usb2_udc driver
USB: add Freescale high-speed USB SOC device controller driver
USB: quirk for broken suspend of IT8152F/G
USB: iowarrior.c: timeouts too small in usb_control_msg calls
USB: dell device id for option.c
USB: Remove Huawei unusual_devs entry
USB: CP2101 New Device IDs
USB: add picdem device to ldusb
usbfs micro optimitation
USB: remove ancient/broken CRIS hcd
usb ethernet gadget, workaround network stack API glitch
USB: add "busnum" attribute for USB devices
USB: cxacru: ADSL state management
usbatm: Detect usb device shutdown and ignore failed urbs
USB: Remove duplicate define of OHCI_QUIRK_ZFMICRO
USB: BandRich BandLuxe HSDPA Data Card Driver
USB gadget rndis: fix struct rndis_packet_msg_type unaligned bug
USB Elan FTDI: check for driver registration status
USB: sierra: add more checks on shutdown
...
* 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (184 commits)
V4L/DVB (5563): Radio-maestro.c Replace radio_ioctl to use video_ioctl2
V4L/DVB (5562): Radio-gemtek-pci.c Replace gemtek_pci_ioctl to use video_ioctl2
V4L/DVB (5560): Ivtv: fix incorrect bitwise-and for command flags.
V4L/DVB (5558): Opera: use 7-bit i2c addresses
V4L/DVB (5557): Cafe_ccic: check return value of pci_enable_device
V4L/DVB (5556): Radio-gemtek.c Replace gemtek_ioctl to use video_ioctl2
V4L/DVB (5555): Radio-aimslab.c Replace rt_ioctl to use video_ioctl2
V4L/DVB (5554): Fix: vidioc_g_parm were not zeroing the memory
V4L/DVB (5553): Replace typhoon_do_ioctl to use video_ioctl2
V4L/DVB (5552): Plan-b: Switch to refcounting PCI API
V4L/DVB (5551): Plan-b: header change
V4L/DVB (5550): Radio-sf16fmi.c Replace fmi_do_ioctl to use video_ioctl2
V4L/DVB (5549): Radio-sf16fmr2.c Replace fmr2_do_ioctl to use video_ioctl2
V4L/DVB (5548): Fix v4l2 buffer to the length
V4L/DVB (5547): Add ENUM_FRAMESIZES and ENUM_FRAMEINTERVALS ioctls
V4L/DVB (5546): Radio-terratec.c Replace tt_do_ioctl to use video_ioctl2
V4L/DVB (5545): Saa7146: Release capture buffers on device close
V4L/DVB (5544): Budget-av: Make inversion setting configurable, add KNC ONE V1.0 card
V4L/DVB (5543): Tda10023: Add support for frontend TDA10023
V4L/DVB (5542): Budget-av: Remove polarity switching of the clock for DVB-C
...
Minor doc update to <linux/usb/ch9.h> ... say where USB_DT_CS_* came
from and update the definitions to match how they're derived there.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as877) adds a "last_busy" field to struct usb_device, for
use by the autosuspend framework. Now if an autosuspend call comes at
a time when the device isn't busy but hasn't yet been idle for long
enough, the timer can be set to exactly the desired value. And we
will be ready to handle things like HID drivers, which can't maintain
a useful usage count and must rely on the time-of-last-use to decide
when to autosuspend.
The patch also makes some related minor improvements:
Move the calls to the autosuspend condition-checking routine
into usb_suspend_both(), which is the only place where it
really matters.
If the autosuspend timer is already running, don't stop
and restart it.
Replace immediate returns with gotos so that the optional
debugging ouput won't be bypassed.
If autoresume is disabled but the device is already awake,
don't return an error for an autoresume call.
Don't try to autoresume a device if it isn't suspended.
(Yes, this undercuts the previous change -- so sue me.)
Don't duplicate existing code in the autosuspend work routine.
Fix the kerneldoc in usb_autopm_put_interface(): If an
autoresume call fails, the usage counter is left unchanged.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
o The "real" usb-devices export now a device node which can
populate /dev/bus/usb.
o The usb_device class is optional now and can be disabled in the
kernel config. Major/minor of the "real" devices and class devices
are the same.
o The environment of the usb-device event contains DEVNUM and BUSNUM to
help udev and get rid of the ugly udev rule we need for the class
devices.
o The usb-devices and usb-interfaces share the same bus, so I used
the new "struct device_type" to let these devices identify
themselves. This also removes the current logic of using a magic
platform-pointer.
The name of the device_type is also added to the environment
which makes it easier to distinguish the different kinds of devices
on the same subsystem.
It looks like this:
add@/devices/pci0000:00/0000:00:1d.1/usb2/2-1
ACTION=add
DEVPATH=/devices/pci0000:00/0000:00:1d.1/usb2/2-1
SUBSYSTEM=usb
SEQNUM=1533
MAJOR=189
MINOR=131
DEVTYPE=usb_device
PRODUCT=46d/c03e/2000
TYPE=0/0/0
BUSNUM=002
DEVNUM=004
This udev rule works as a replacement for usb_device class devices:
SUBSYSTEM=="usb", ACTION=="add", ENV{DEVTYPE}=="usb_device", \
NAME="bus/usb/$env{BUSNUM}/$env{DEVNUM}", MODE="0644"
Updated patch, which needs the device_type patches in Greg's tree.
I also got a bugzilla assigned for this. :)
https://bugzilla.novell.com/show_bug.cgi?id=250659
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as874) adds another piece to the user-visible part of the
USB autosuspend interface. The new power/level sysfs attribute allows
users to force the device on (with autosuspend off), force the device
to sleep (with autoresume off), or return to normal automatic operation.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as867) adds an entry for the new power/autosuspend
attribute in Documentation/ABI/testing, and it changes the behavior of
the delay value. Now a delay of 0 means to autosuspend as soon as
possible, and negative values will prevent autosuspend.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
this adds another structure for CDC devices to cdc.h.
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (46 commits)
dev_dbg: check dev_dbg() arguments
drivers/base/attribute_container.c: use mutex instead of binary semaphore
mod_sysfs_setup() doesn't return errno when kobject_add_dir() failure occurs
s2ram: add arch irq disable/enable hooks
define platform wakeup hook, use in pci_enable_wake()
security: prevent permission checking of file removal via sysfs_remove_group()
device_schedule_callback() needs a module reference
s390: cio: Delay uevents for subchannels
sysfs: bin.c printk fix
Driver core: use mutex instead of semaphore in DMA pool handler
driver core: bus_add_driver should return an error if no bus
debugfs: Add debugfs_create_u64()
the overdue removal of the mount/umount uevents
kobject: Comment and warning fixes to kobject.c
Driver core: warn when userspace writes to the uevent file in a non-supported way
Driver core: make uevent-environment available in uevent-file
kobject core: remove rwsem from struct subsystem
qeth: Remove usage of subsys.rwsem
PHY: remove rwsem use from phy core
IEEE1394: remove rwsem use from ieee1394 core
...
Negative speed values have to be allowed for reverse playback.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
VIDEO_EVENT_VSYNC needs to tell the application which field it was that
received a VSYNC (odd/even/progressive). The vsync_field was added to the
union in video_event for this purpose.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The cx23415 adds some extra features that this DVB decoding API did
not support. This API has been expanded to support the required
features. Both source and binary backwards compatibility is kept
intact by these changes. So existing applications are not affected.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Ralph Metzler <rjkm@metzlerbros.de>
Signed-off-by: Oliver Endriss <o.endriss@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
VIDIOC_G_CHIP_IDENT improves debugging of card problems: it can be
used to detect which chips are on the board and based on that information
selected register dumps can be made, making it easy to debug complicated
media chips containing tens or hundreds of registers.
This ioctl replaces the internal VIDIOC_INT_G_CHIP_IDENT ioctl.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Add V4L2_BUF_TYPE_VIDEO_OUTPUT_OVERLAY support.
Also add support for local and global alpha overlays.
Add new field enums V4L2_FIELD_INTERLACED_TB and V4L2_FIELD_INTERLACED_BT.
These changes are needed to support the ivtv On Screen Display features.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Add V4L2_CAP_VIDEO_OUTPUT_POS capability and x, y position coordinates
to struct v4l2_pix_format.
This is needed to support positioning the MPEG/YUV output of the cx23415.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Added V4L2_CID_MPEG_AUDIO_MUTE, V4L2_CID_MPEG_VIDEO_MUTE and
V4L2_CID_MPEG_CX2341X_STREAM_INSERT_NAV_PACKETS controls together with
their implementation in the cx2341x module.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Duplicate what Zach Brown did for pr_debug in commit
8b2a1fd1b3
[akpm@linux-foundation.org: fix a couple of things which broke]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
After some more discussion this patch replaces it:
From: Johannes Berg <johannes@sipsolutions.net>
Subject: suspend: add arch irq disable/enable hooks
For powermac, we need to do some things between suspending devices and
device_power_off, for example setting the decrementer. This patch
allows architectures to define arch_s2ram_{en,dis}able_irqs in their
asm/suspend.h to have control over this step.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This defines a platform hook to enable/disable a device as a wakeup event
source. It's initially for use with ACPI, but more generally it could be used
whenever enable_irq_wake()/disable_irq_wake() don't suffice.
The hook is called -- if available -- inside pci_enable_wake(); and the
semantics of that call are enhanced so that support for PCI PME# is no longer
needed. It can now work for devices with "legacy PCI PM", when platform
support allows it. (That support would use some board-specific signal for for
the same purpose as PME#.)
[akpm@linux-foundation.org: Make it compile with CONFIG_PM=n]
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Prevent permission checking from being performed when the kernel wants to
unconditionally remove a sysfs group, by introducing an kernel-only variant
of lookup_one_len(), lookup_one_len_kern().
Additionally, as sysfs_remove_group() does not check the return value of
the lookup before using it, a BUG_ON has been added to pinpoint the cause
of any problems potentially caused by this (and as a form of annotation).
Signed-off-by: James Morris <jmorris@namei.org>
Cc: Nagendra Singh Tomar <nagendra_tomar@adaptec.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as896b) fixes an oversight in the design of
device_schedule_callback(). It is necessary to acquire a reference to the
module owning the callback routine, to prevent the module from being
unloaded before the callback can run.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Satyam Sharma <satyam.sharma@gmail.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I went to use this the other day, only to find it didn't exist.
It's a straight copy of the debugfs u32 code, then s/u32/u64/. A quick
test shows it seems to be working.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch contains the overdue removal of the mount/umount uevents.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It isn't used at all by the driver core anymore, and the few usages of
it within the kernel have now all been fixed as most of them were using
it incorrectly. So remove it.
Now the whole struct subsys can be removed from the system, but that's
for a later patch...
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Driver core: add suspend() and resume() to struct device_type
In cases when there are devices of different types in the same class
we can't use class's implementation of suspend and resume methods and
we need to add them to struct device_type instead.
Also fix error handling in resume code (we should not try to call
class's resume method iof bus's resume method for the device failed.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The completion in the driver release path is due to ancient history in
the _very_ early 2.5 days when we were not tracking the module reference
count of attributes. It is not needed at all and can be removed.
Note, we now have an empty release function for the driver structure.
This is due to the fact that drivers are statically allocated in the
system at this point in time, something which I want to change in the
future. But remember, drivers are really code, which is reference
counted by the module, unlike devices, which are data and _must_ be
reference counted properly in order to work correctly.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Make multithreaded probing work per subsystem instead of per driver.
It doesn't make much sense to probe the same device for multiple drivers in
parallel (after all, only one driver can bind to the device). Instead, create
a probing thread for each device that probes the drivers one after another.
Also make the decision to use multi-threaded probe per bus instead of per
device and adapt the pci code.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If "name" of a device_type is specified, the uevent will
contain the device_type name in the DEVTYPE variable.
This helps userspace to distingiush between different types
of devices, belonging to the same subsystem.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Driver core: use attribute groups in struct device_type
Attribute groups are more flexible than attribute lists
(an attribute list can be represented by anonymous group)
so switch struct device_type to use them.
Also rework attribute creation for devices so that they all
cleaned up properly in case of errors.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Cc: Kay Sievers <kay.sievers@novell.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We get two per-bus sysfs files:
ls-l /sys/subsystem/usb
drwxr-xr-x 2 root root 0 2007-02-16 16:42 devices
drwxr-xr-x 7 root root 0 2007-02-16 14:55 drivers
-rw-r--r-- 1 root root 4096 2007-02-16 16:42 drivers_autoprobe
--w------- 1 root root 4096 2007-02-16 16:42 drivers_probe
The flag "drivers_autoprobe" controls the behavior of the bus to bind
devices by default, or just initialize the device and leave it alone.
The command "drivers_probe" accepts a bus_id and the bus tries to bind a
driver to this device.
Systems who want to control the driver binding with udev, switch off the
bus initiated probing:
echo 0 > /sys/subsystem/usb/drivers_autoprobe
echo 0 > /sys/subsystem/pcmcia/drivers_autoprobe
...
and initiate the probing with udev rules like:
ACTION=="add", SUBSYSTEM=="usb", ATTR{subsystem/drivers_probe}="$kernel"
ACTION=="add", SUBSYSTEM=="pcmcia", ATTR{subsystem/drivers_probe}="$kernel"
...
Custom driver binding can happen in earlier rules by something like:
ACTION=="add", SUBSYSTEM=="usb", \
ATTRS{idVendor}=="1234", ATTRS{idProduct}=="5678" \
ATTR{subsystem/drivers/<custom-driver>/bind}="$kernel"
This is intended to solve the modprobe.conf mess with "install-rules", custom
bind/unbind-scripts and all the weird things people invented over the years.
It should also provide the functionality "libusual" was supposed to do.
With udev, one can just write a udev rule to drive all USB-disks at the
third port of USB-hub by the "ub" driver, and everything else by
usb-storage. One can also instruct udev to bind different wireless
drivers to identical cards - just selected by the pcmcia slot-number, and
whatever ...
To use the mentioned rules, it needs udev version 106, to be able to
write ATTR{}="$kernel" to sysfs files.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
- uses a kset in "struct class" to keep track of all directories
belonging to this class
- merges with the /sys/devices/virtual logic.
- removes the namespace-dir if the last member of that class
leaves the directory.
There may be locking or refcounting fixes left, I stopped when it seemed
to work with network and sound modules. :)
From: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
show_state() (SysRq-T) developed the buggy habbit of not showing
TASK_RUNNING tasks. This was due to the mistaken belief that state_filter
== -1 would be a pass-through filter - while in reality it did not let
TASK_RUNNING == 0 p->state values through.
Fix this by restoring the original '!state_filter means all tasks'
special-case i had in the original version. Test-built and test-booted on
i686, SysRq-T now works as intended.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.infradead.org/ubi-2.6:
UBI: remove unused variable
UBI: add me to MAINTAINERS
JFFS2: add UBI support
UBI: Unsorted Block Images
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (27 commits)
ocfs2: Cache extent records
ocfs2: Remember rw lock level during direct io
ocfs2: Fix up i_blocks calculation to know about holes
ocfs2: Fix extent lookup to return true size of holes
ocfs2: Read from an unwritten extent returns zeros
ocfs2: make room for unwritten extents flag
ocfs2: Use own splice write actor
ocfs2: Use do_sync_mapping_range() in ocfs2_zero_tail_for_truncate()
[PATCH] Turn do_sync_file_range() into do_sync_mapping_range()
ocfs2: zero tail of sparse files on truncate
ocfs2: Teach ocfs2_get_block() about holes
ocfs2: remove ocfs2_prepare_write() and ocfs2_commit_write()
ocfs2: teach ocfs2_file_aio_write() about sparse files
ocfs2: Turn off shared writeable mmap for local files systems with holes.
ocfs2: abstract out allocation locking
ocfs2: teach extend/truncate about sparse files
ocfs2: temporarily remove extent map caching
ocfs2: sparse b-tree support
ocfs2: small cleanup of ocfs2_request_delete()
ocfs2: remove unused code
...
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: (67 commits)
[SCSI] SUNESP: Complete driver rewrite to version 2.0
[SPARC64]: Convert PCI over to generic struct iommu/strbuf.
[SPARC]: device_node name constification fallout
[SPARC64]: Convert SBUS over to generic iommu/strbuf structs.
[SPARC64]: Add generic iommu and strbuf structs to iommu.h
[SPARC64]: Consolidate {sbus,pci}_iommu_arena.
[SPARC]: Make device_node name and type const
[SPARC64]: constify some paramaters of OF routines
[TIGON3]: of_get_property() returns const.
[SPARC64]: Fix PCI rework to adhere to of_get_property() const return.
[SPARC64]: Document and fix calculation of pages_avail.
[SPARC64]: Make sure pbm->prom_node is setup easly enough in psycho.c
[SPARC64]: Use bootmem_bootmap_pages() in choose_bootmap_pfn().
[SPARC64]: Add proper header file extern for cmdline_memory_size.
[SPARC64]: Kill sparc_ultra_dump_{i,d}tlb()
[SPARC64]: Use DECLARE_BITMAP and BITS_TO_LONGS in mm/init.c
[SPARC64]: Give move verbose show_mem() output just like i386.
[SPARC64]: Mark show_mem() printk's with KERN_INFO.
[SPARC64]: Kill kvaddr_to_phys() and friends.
[SPARC64]: Privatize sun4u_get_pte() and fix name.
...
The page_test_and_clear_dirty primitive really consists of two
operations, page_test_dirty and the page_clear_dirty. The combination
of the two is not an atomic operation, so it makes more sense to have
two separate operations instead of one.
In addition to the improved readability of the s390 version of
SetPageUptodate, it now avoids the page_test_dirty operation which is
an insert-storage-key-extended (iske) instruction which is an expensive
operation.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
UBI (Latin: "where?") manages multiple logical volumes on a single
flash device, specifically supporting NAND flash devices. UBI provides
a flexible partitioning concept which still allows for wear-levelling
across the whole flash device.
In a sense, UBI may be compared to the Logical Volume Manager
(LVM). Whereas LVM maps logical sector numbers to physical HDD sector
numbers, UBI maps logical eraseblocks to physical eraseblocks.
More information may be found at
http://www.linux-mtd.infradead.org/doc/ubi.html
Partitioning/Re-partitioning
An UBI volume occupies a certain number of erase blocks. This is
limited by a configured maximum volume size, which could also be
viewed as the partition size. Each individual UBI volume's size can
be changed independently of the other UBI volumes, provided that the
sum of all volume sizes doesn't exceed a certain limit.
UBI supports dynamic volumes and static volumes. Static volumes are
read-only and their contents are protected by CRC check sums.
Bad eraseblocks handling
UBI transparently handles bad eraseblocks. When a physical
eraseblock becomes bad, it is substituted by a good physical
eraseblock, and the user does not even notice this.
Scrubbing
On a NAND flash bit flips can occur on any write operation,
sometimes also on read. If bit flips persist on the device, at first
they can still be corrected by ECC, but once they accumulate,
correction will become impossible. Thus it is best to actively scrub
the affected eraseblock, by first copying it to a free eraseblock
and then erasing the original. The UBI layer performs this type of
scrubbing under the covers, transparently to the UBI volume users.
Erase Counts
UBI maintains an erase count header per eraseblock. This frees
higher-level layers (like file systems) from doing this and allows
for centralized erase count management instead. The erase counts are
used by the wear-levelling algorithm in the UBI layer. The algorithm
itself is exchangeable.
Booting from NAND
For booting directly from NAND flash the hardware must at least be
capable of fetching and executing a small portion of the NAND
flash. Some NAND flash controllers have this kind of support. They
usually limit the window to a few kilobytes in erase block 0. This
"initial program loader" (IPL) must then contain sufficient logic to
load and execute the next boot phase.
Due to bad eraseblocks, which may be randomly scattered over the
flash device, it is problematic to store the "secondary program
loader" (SPL) statically. Also, due to bit-flips it may become
corrupted over time. UBI allows to solve this problem gracefully by
storing the SPL in a small static UBI volume.
UBI volumes vs. static partitions
UBI volumes are still very similar to static MTD partitions:
* both consist of eraseblocks (logical eraseblocks in case of UBI
volumes, and physical eraseblocks in case of static partitions;
* both support three basic operations - read, write, erase.
But UBI volumes have the following advantages over traditional
static MTD partitions:
* there are no eraseblock wear-leveling constraints in case of UBI
volumes, so the user should not care about this;
* there are no bit-flips and bad eraseblocks in case of UBI volumes.
So, UBI volumes may be considered as flash devices with relaxed
restrictions.
Where can it be found?
Documentation, kernel code and applications can be found in the MTD
gits.
What are the applications for?
The applications help to create binary flash images for two purposes: pfi
files (partial flash images) for in-system update of UBI volumes, and plain
binary images, with or without OOB data in case of NAND, for a manufacturing
step. Furthermore some tools are/and will be created that allow flash content
analysis after a system has crashed..
Who did UBI?
The original ideas, where UBI is based on, were developed by Andreas
Arnez, Frank Haverkamp and Thomas Gleixner. Josh W. Boyer and some others
were involved too. The implementation of the kernel layer was done by Artem
B. Bityutskiy. The user-space applications and tools were written by Oliver
Lohmann with contributions from Frank Haverkamp, Andreas Arnez, and Artem.
Joern Engel contributed a patch which modifies JFFS2 so that it can be run on
a UBI volume. Thomas Gleixner did modifications to the NAND layer. Alexander
Schmidt made some testing work as well as core functionality improvements.
Signed-off-by: Artem B. Bityutskiy <dedekind@linutronix.de>
Signed-off-by: Frank Haverkamp <haver@vnet.ibm.com>
This patch makes the wext bits in struct net_device depend on
CONFIG_WIRELESS_EXT.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Provide AF_RXRPC sockets that can be used to talk to AFS servers, or serve
answers to AFS clients. KerberosIV security is fully supported. The patches
and some example test programs can be found in:
http://people.redhat.com/~dhowells/rxrpc/
This will eventually replace the old implementation of kernel-only RxRPC
currently resident in net/rxrpc/.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Export the keyring key type definition and document its availability.
Add alternative types into the key's type_data union to make it more useful.
Not all users necessarily want to use it as a list_head (AF_RXRPC doesn't, for
example), so make it clear that it can be used in other ways.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
del_timer_sync() buys nothing for cancel_delayed_work(), but it is less
efficient since it locks the timer unconditionally, and may wait for the
completion of the delayed_work_timer_fn().
cancel_delayed_work() == 0 means:
before this patch:
work->func may still be running or queued
after this patch:
work->func may still be running or queued, or
delayed_work_timer_fn->__queue_work() in progress.
The latter doesn't differ from the caller's POV,
delayed_work_timer_fn() is called with _PENDING
bit set.
cancel_delayed_work() == 1 with this patch adds a new possibility:
delayed_work->work was cancelled, but delayed_work_timer_fn
is still running (this is only possible for the re-arming
works on single-threaded workqueue).
In this case the timer was re-started by work->func(), nobody
else can do this. This in turn means that delayed_work_timer_fn
has already passed __queue_work() (and wont't touch delayed_work)
because nobody else can queue delayed_work->work.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
do_sync_file_range() accepts a file * from which it takes an address_space to
sync. Abstract out the bulk of the function into do_sync_mapping_range()
which takes the address_space directly. This way callers who want to sync an
address_space directly can take advantage of the functionality provided.
do_sync_file_range() is preserved as a small wrapper around
do_sync_mapping_range().
Ocfs2 in particular would like to use this to initiate a sync of a specific
inode range during truncate, where a file * may not be available.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Remove deprecated /proc/acpi/processor/performance write support
Writing to /proc/acpi/processor/xy/performance interferes with sysfs
cpufreq interface. Also removes buggy cpufreq_set_policy exported symbol.
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
Delete the unreferenced header file include/linux/if_wanpipe_common.h,
as well as the reference to it in the Doc file.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Delete the unreferenced header file include/linux/sdla_fr.h.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
On a system with a lot of SAs, counting SAD entries chews useful
CPU time since you need to dump the whole SAD to user space;
i.e something like ip xfrm state ls | grep -i src | wc -l
I have seen taking literally minutes on a 40K SAs when the system
is swapping.
With this patch, some of the SAD info (that was already being tracked)
is exposed to user space. i.e you do:
ip xfrm state count
And you get the count; you can also pass -s to the command line and
get the hash info.
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pause frames should never make it out of the network device into
the stack. But if a device was misconfigured, it might happen.
So drop pause frames in bridge.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
To avoid raw division, use ktime_to_timeval() to get usec.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do some simple changes to make congestion control API faster/cleaner.
* use ktime_t rather than timeval
* merge rtt sampling into existing ack callback
this means one indirect call versus two per ack.
* use flags bits to store options/settings
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch creates the core cfg80211 code along with some sysfs bits.
This is a stripped down version to allow mac80211 to function, but
doesn't include any configuration yet except for creating and removing
virtual interfaces.
This patch includes the nl80211 header file but it only contains the
interface types which the cfg80211 interface for creating virtual
interfaces relies on.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Getting warnings becuase skb_store_bits has skb as constant,
but the function overwrites it. Looks like const was on the
wrong side.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a packet socket option to allow the orig_dev index to be returned
to userspace when passing traffic through a decapsulated device, such
as the bonding driver.
This is very useful for layer 2 traffic being able to report which
physical device actually received the traffic, instead of having the
encapsulating device hide that information.
The new option is called PACKET_ORIGDEV.
Signed-off-by: Peter P. Waskiewicz Jr. <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add IP(V6)_PMTUDISC_PROBE value for IP(V6)_MTU_DISCOVER. This option forces
us not to fragment, but does not make use of the kernel path MTU discovery.
That is, it allows for user-mode MTU probing (or, packetization-layer path
MTU discovery). This is particularly useful for diagnostic utilities, like
traceroute/tracepath.
Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Switch cb_lock to mutex and allow netlink kernel users to override it
with a subsystem specific mutex for consistent locking in dump callbacks.
All netlink_dump_start users have been audited not to rely on any
side-effects of the previously used spinlock.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The attached patch adds gratuitous arp filtering, more precisely: it
allows checking that the IPv4 source address matches the IPv4
destination address inside the ARP header. It also adds a check for the
hardware address type when matching MAC addresses (nothing critical,
just for better consistency).
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Acked-by: Carl-Daniel Hailfinger <c-d.hailfinger.devel.2006@gmx.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The attached patch by Michael Milner adds support for using iptables and
ip6tables on bridged traffic encapsulated in ppoe frames, similar to
what's already supported for vlan.
Signed-off-by: Michael Milner <milner@blissisland.ca>
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fills in missing documentation for dccp_sock fields.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the bridging hook to be simple function with return value
rather than modifying the skb argument. This could generate better
code and is cleaner.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
When a transmitted packet is looped back directly, CHECKSUM_PARTIAL
maps to the semantics of CHECKSUM_UNNECESSARY. Therefore we should
treat it as such in the stack.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The skb transport pointer is currently used to specify the start
of the checksum region for transmit checksum offload. Unfortunately,
the same pointer is also used during receive side processing.
This creates a problem when we want to retransmit a received
packet with partial checksums since the skb transport pointer
would be overwritten.
This patch solves this problem by creating a new 16-bit csum_start
offset value to replace the skb transport header for the purpose
of checksums. This offset is calculated from skb->head so that
it does not have to change when skb->data changes.
No extra space is required since csum_offset itself fits within
a 16-bit word so we can use the other 16 bits for csum_start.
For backwards compatibility, just before we push a packet with
partial checksums off into the device driver, we set the skb
transport header to what it would have been under the old scheme.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
When looking up route for destination with rules with
source address restrictions, we may need to find a source
address for the traffic if not given.
Based on patch from Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move generic skbuff stuff from XFRM code to generic code so that
AF_RXRPC can use it too.
The kdoc comments I've attached to the functions needs to be checked
by whoever wrote them as I had to make some guesses about the workings
of these functions.
Signed-off-By: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To clearly state the intent of copying to linear sk_buffs, _offset being a
overly long variant but interesting for the sake of saving some bytes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Network drivers which keep stats allocate their own stats structure
then write a get_stats() function to return them. It would be nice if
this were done by default.
1) Add a new "stats" field to "struct net_device".
2) Add a new feature field to say "this driver uses the internal one"
3) Have a default "get_stats" which returns NULL if that feature not set.
4) Change callers to check result of get_stats call for NULL, not if
->get_stats is set.
This should not break backwards compatibility with older drivers, yet
allow modern drivers to shed some boilerplate code.
Lightly tested: works for a modified lguest network driver.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
To clearly state the intent of copying from linear sk_buffs, _offset being a
overly long variant but interesting for the sake of saving some bytes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Right now Xen has a horrible hack that lets it forward packets with
partial checksums. One of the reasons that CHECKSUM_PARTIAL and
CHECKSUM_COMPLETE were added is so that we can get rid of this hack
(where it creates two extra bits in the skbuff to essentially mirror
ip_summed without being destroyed by the forwarding code).
I had forgotten that I've already gone through all the deivce drivers
last time around to make sure that they're looking at ip_summed ==
CHECKSUM_PARTIAL rather than ip_summed != 0 on transmit. In any case,
I've now done that again so it should definitely be safe.
Unfortunately nobody has yet added any code to update CHECKSUM_COMPLETE
values on forward so we I'm setting that to CHECKSUM_NONE. This should
be safe to remove for bridging but I'd like to check that code path
first.
So here is the patch that lets us get rid of the hack by preserving
ip_summed (mostly) on forwarded packets.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The use of nop rules simplifies the usage of goto rules
and adds more flexibility as they allow targets to remain
while the actual content of the branches can change easly.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rules which match against device names in their selector can
remain while the device itself disappears, in fact the device
doesn't have to present when the rule is added in the first
place. The device name is resolved by trying when the rule is
added and later by listening to NETDEV_REGISTER/UNREGISTER
notifications.
This patch adds the flag FIB_RULE_DEV_DETACHED which is set
towards userspace when a rule contains a device match which
is unresolved at the moment. This eases spotting the reason
why certain rules seem not to function properly.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a new rule action FR_ACT_GOTO which allows
to skip a set of rules by jumping to another rule. The rule
to jump to is specified via the FRA_GOTO attribute which
carries a rule preference.
Referring to a rule which doesn't exists is explicitely allowed.
Such goto rules are marked with the flag FIB_RULE_UNRESOLVED
and will act like a rule with a non-matching selector. The rule
will become functional as soon as its target is present.
The goto action enables performance optimizations by reducing
the average number of rules that have to be passed per lookup.
Example:
0: from all lookup local
40: not from all to 192.168.23.128 goto 32766
41: from all fwmark 0xa blackhole
42: from all fwmark 0xff blackhole
32766: from all lookup main
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The destructor per conntrack is unnecessary, then this replaces it with
system wide destructor.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The error pointer argument in netlink message handlers is used
to signal the special case where processing has to be interrupted
because a dump was started but no error happened. Instead it is
simpler and more clear to return -EINTR and have netlink_run_queue()
deal with getting the queue right.
nfnetlink passed on this error pointer to its subsystem handlers
but only uses it to signal the start of a netlink dump. Therefore
it can be removed there as well.
This patch also cleans up the error handling in the affected
message handlers to be consistent since it had to be touched anyway.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a new interface to register rtnetlink message
handlers replacing the exported rtnl_links[] array which
required many message handlers to be exported unnecessarly.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the common "(struct nlmsghdr *)skb->data" sequence, so that we reduce the
number of direct accesses to skb->data and for consistency with all the other
cast skb member helpers.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now to convert the last one, skb->data, that will allow many simplifications
and removal of some of the offset helpers.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
on 64bit architectures, allowing us to combine the 4 bytes hole left by the
layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
:-)
Many calculations that previously required that skb->{transport,network,
mac}_header be first converted to a pointer now can be done directly, being
meaningful as offsets or pointers.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Renaming skb->h to skb->transport_header, skb->nh to skb->network_header and
skb->mac to skb->mac_header, to match the names of the associated helpers
(skb[_[re]set]_{transport,network,mac}_header).
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the common sequence "skb->h.raw - skb->nh.raw", similar to skb->mac_len,
that is precalculated tho, don't think we need to bloat skb with one more
member, so just use this new helper, reducing the number of non-skbuff.h
references to the layer headers even more.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch let userspace programs set the IP_CT_TCP_BE_LIBERAL flag to
force the pickup of established connections.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This unifies the codes to copy netfilter related datas. Before copying,
nf_copy() puts original members in destination skb.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This unifies the codes to copy netfilter related datas. Note that
__nf_copy() assumes destination skb doesn't have any netfilter
related members.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use const to avoid forcing users to cast const data.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the obsolete IPv4 only connection tracking/NAT as scheduled in
feature-removal-schedule.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the places where we need a pointer to the transport header, it is
still legal to touch skb->h.raw directly if just adding to,
subtracting from or setting it to another layer header.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ip_hdrlen() buddy, created to reduce the number of skb->h.th-> uses and to
avoid the longer, open coded equivalent.
Ditched a no-op in bnx2 in the process.
I wonder if we should have a BUG_ON(skb->h.th->doff < 5) in tcp_optlen()...
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For consistency with all the other skb->h.raw accessors.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For consistency with all the other skb->h.raw accessors.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the cases where the transport header is being set to a offset from
skb->data.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the quite common 'skb->h.raw - skb->data' sequence.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the common, open coded 'skb->h.raw = skb->data' operation, so that we can
later turn skb->h.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.
This one touches just the most simple cases:
skb->h.raw = skb->data;
skb->h.raw = {skb_push|[__]skb_pull}()
The next ones will handle the slightly more "complex" cases.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now the skb->nh union has just one member, .raw, i.e. it is just like the
skb->mac union, strange, no? I'm just leaving it like that till the transport
layer is done with, when we'll rename skb->mac.raw to skb->mac_header (or
->mac_header_offset?), ditto for ->{h,nh}.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the cases where the network header is being set to a offset from skb->data.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the places where we need a pointer to the network header, it is still legal
to touch skb->nh.raw directly if just adding to, subtracting from or setting it
to another layer header.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the quite common 'skb->nh.raw - skb->data' sequence.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the common, open coded 'skb->nh.raw = skb->data' operation, so that we can
later turn skb->nh.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.
This one touches just the most simple case, next will handle the slightly more
"complex" cases.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For consistency with all the other skb->nh.raw accessors.
Also do some really obvious simplifications in pppoe_recvmsg, well the
kfree_skb one is not so obvious, but free() and kfree() have the same behaviour
(hint :-) ).
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the places where we need a pointer to the mac header, it is still legal to
touch skb->mac.raw directly if just adding to, subtracting from or setting it
to another layer header.
This one also converts some more cases to skb_reset_mac_header() that my
regex missed as it had no spaces before nor after '=', ugh.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the cases where we want to set skb->mac.raw to an offset from skb->data.
Simple cases first, the memmove ones and specially pktgen will be left for later.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the common, open coded 'skb->mac.raw = skb->data' operation, so that we can
later turn skb->mac.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.
This one touches just the most simple case, next will handle the slightly more
"complex" cases.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Covert network warning messages from a compile time to runtime choice.
Removes kernel config option and replaces it with new /proc/sys/net/core/warnings.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch eliminates some duplicate code for the verification of
receive checksums between UDP-Lite and UDP. It does this by
introducing __skb_checksum_complete_head which is identical to
__skb_checksum_complete_head apart from the fact that it takes
a length parameter rather than computing the first skb->len bytes.
As a result UDP-Lite will be able to use hardware checksum offload
for packets which do not use partial coverage checksums. It also
means that UDP-Lite loopback no longer does unnecessary checksum
verification.
If any NICs start support UDP-Lite this would also start working
automatically.
This patch removes the assumption that msg_flags has MSG_TRUNC clear
upon entry in recvmsg.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nominally an autoconfigured IPv6 address is added to an interface in the
Tentative state (as per RFC 2462). Addresses in this state remain in this
state while the Duplicate Address Detection process operates on them to
determine their uniqueness on the network. During this period, these
tentative addresses may not be used for communication, increasing the time
before a node may be able to communicate on a network. Using Optimistic
Duplicate Address Detection, autoconfigured addresses may be used
immediately for communication on the network, as long as certain rules are
followed to avoid conflicts with other nodes during the Duplicate Address
Detection process.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently use a special structure (struct skb_timeval) and plain
'struct timeval' to store packet timestamps in sk_buffs and struct
sock.
This has some drawbacks :
- Fixed resolution of micro second.
- Waste of space on 64bit platforms where sizeof(struct timeval)=16
I suggest using ktime_t that is a nice abstraction of high resolution
time services, currently capable of nanosecond resolution.
As sizeof(ktime_t) is 8 bytes, using ktime_t in 'struct sock' permits
a 8 byte shrink of this structure on 64bit architectures. Some other
structures also benefit from this size reduction (struct ipq in
ipv4/ip_fragment.c, struct frag_queue in ipv6/reassembly.c, ...)
Once this ktime infrastructure adopted, we can more easily provide
nanosecond resolution on top of it. (ioctl SIOCGSTAMPNS and/or
SO_TIMESTAMPNS/SCM_TIMESTAMPNS)
Note : this patch includes a bug correction in
compat_sock_get_timestamp() where a "err = 0;" was missing (so this
syscall returned -ENOENT instead of 0)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
CC: Stephen Hemminger <shemminger@linux-foundation.org>
CC: John find <linux.kernel@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
New sysctl tcp_frto_response is added to select amongst these
responses:
- Rate halving based; reuses CA_CWR state (default)
- Very conservative; used to be the only one available (=1)
- Undo cwr; undoes ssthresh and cwnd reductions (=2)
The response with rate halving requires a new parameter to
tcp_enter_cwr because FRTO has already reduced ssthresh and
doing a second reduction there has to be prevented. In addition,
to keep things nice on 80 cols screen, a local variable was
added.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
I noticed in oprofile study a cache miss in tcp_rcv_established() to read
copied_seq.
ffffffff80400a80 <tcp_rcv_established>: /* tcp_rcv_established total: 4034293
2.0400 */
55493 0.0281 :ffffffff80400bc9: mov 0x4c8(%r12),%eax copied_seq
543103 0.2746 :ffffffff80400bd1: cmp 0x3e0(%r12),%eax rcv_nxt
if (tp->copied_seq == tp->rcv_nxt &&
len - tcp_header_len <= tp->ucopy.len) {
In this function, the cache line 0x4c0 -> 0x500 is used only for this
reading 'copied_seq' field.
rcv_wup and copied_seq should be next to rcv_nxt field, to lower number of
active cache lines in hot paths. (tcp_rcv_established(), tcp_poll(), ...)
As you suggested, I changed tcp_create_openreq_child() so that these fields
are changed together, to avoid adding a new store buffer stall.
Patch is 64bit friendly (no new hole because of alignment constraints)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add input_set_capability() helper used to indicate that an input
device supports a certain event without need to manipulate bitmaps
directly.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
A security issue is emerging. Disallow Routing Header Type 0 by default
as we have been doing for IPv4.
Note: We allow RH2 by default because it is harmless.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Get rid of the inlined #ifdefs.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a 'quirks' module parameter for the usbhid module, so users can
add or modify quirks at module load time.
Signed-off-by: Paul Walmsley <paul@booyaka.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Add internal support for dynamically-allocated HID quirks, "dquirks"
(for "dynamic quirks"). Includes several functions to add/modify quirks
from the list. This code is used by the next patch to implement quirk
modification upon module load.
Signed-off-by: Paul Walmsley <paul@booyaka.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Move the USB_VENDOR* and USB_DEVICE* defines and the hid_blacklist[]
array there from hid-core.c. Add
hid-quirks.c:usbhid_lookup_any_quirks() to return quirk information to
hid-core.c. Convert __u32, __u16 types to u32, u16.
Signed-off-by: Paul Walmsley <paul@booyaka.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[BRIDGE]: Unaligned access when comparing ethernet addresses
[SCTP]: Unmap v4mapped addresses during SCTP_BINDX_REM_ADDR operation.
[SCTP]: Fix assertion (!atomic_read(&sk->sk_rmem_alloc)) failed message
[NET]: Set a separate lockdep class for neighbour table's proxy_queue
[NET]: Fix UDP checksum issue in net poll mode.
[KEY]: Fix conversion between IPSEC_MODE_xxx and XFRM_MODE_xxx.
[NET]: Get rid of alloc_skb_from_cache
Provide an dummy implementation of devm_ioport_map() and
devm_ioport_unmap() to allow drivers (eg, pata_platform) to build for
platforms where CONFIG_NO_IOPORT is selected.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make kernel-doc comments match macro names.
Correct parameter names in a few places.
Remove '#' from beginning of kernel-doc comment macro names.
Remove extra (erroneous) blank lines in kernel-doc.
Warning(plist.h:100): Cannot understand * #PLIST_HEAD_INIT - static struct plist_head initializer on line 100 - I thought it was a doc line
Warning(plist.h:112): Cannot understand * #PLIST_NODE_INIT - static struct plist_node initializer on line 112 - I thought it was a doc line
Warning(plist.h:103): No description found for parameter '_lock'
Warning(plist.h:129): No description found for parameter 'lock'
Warning(plist.h:158): No description found for parameter 'pos'
Warning(plist.h:169): No description found for parameter 'pos'
Warning(plist.h:169): No description found for parameter 'n'
Warning(plist.h:179): No description found for parameter 'mem'
This still leaves one warning & one error that need attention:
Error(plist.h:219): cannot understand prototype: '('
Warning(plist.h): no structured comments found
Acked-by: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Cc: Daniel Walker <dwalker@mvista.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Otherwise the following calltrace will lead to a wrong
lockdep warning:
neigh_proxy_process()
`- lock(neigh_table->proxy_queue.lock);
arp_redo /* via tbl->proxy_redo */
arp_process
neigh_event_ns
neigh_update
skb_queue_purge
`- lock(neighbor->arp_queue.lock);
This is not a deadlock actually, as neighbor table's proxy_queue
and the neighbor's arp_queue are different queues.
Lockdep thinks there is a deadlock as both queues are initialized
with skb_queue_head_init() and thus have a common class.
Signed-off-by: David S. Miller <davem@davemloft.net>
Since this was added originally for Xen, and Xen has recently (~2.6.18)
stopped using this function, we can safely get rid of it. Good timing
too since this function has started to bit rot.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the writebacks are cancelled via nfs_cancel_dirty_list, or due to the
memory allocation failing in nfs_flush_one/nfs_flush_multi, then we must
ensure that the PG_writeback flag is cleared.
Also ensure that we actually own the PG_writeback flag whenever we
schedule a new writeback by making nfs_set_page_writeback() return the
value of test_set_page_writeback().
The PG_writeback page flag ends up replacing the functionality of the
PG_FLUSHING nfs_page flag, so we rip that out too.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In preparation to switching to struct device and class device
going away provide an alias to allow drivers that create devices
to use either input_dev->cdev.dev or input_dev->dev.parent to
put them into sysfs tree. The former will go away once conversion
to struct device is complete.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Add helpers to set up and access driver-specific data in input
device structure. Once conversion to struct driver is complete
we will drop input_dev->private and will use dev_get_drvdata()
and dev_set_drvdata().
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Acerhk supports already a lot of laptops. Lets import its database so
that everyone can benefit of the work of Olaf Tauber. Only the "tm_new"
laptops were imported. "tm_old" laptops could be possible but requires
more testing and probably only few laptops are still alive. "dritek"
laptops should probably be imported into a different driver. Also compress
the keymaps by fitting each entry on an int. Most of the dmi matching was
written based on google searches, so it's rather prone to errors. That's
why I'm asking people to confirm it works.
Support to generate switch input events was added as some laptops indicate
lid open/close through this interface.
This adds the following hardware:
Acer TravelMate 370
Acer TravelMate 380
Acer TravelMate C300
Acer TravelMate C100
Acer TravelMate C110
Acer TravelMate 250
Acer TravelMate 350
Acer TravelMate 620
Acer TravelMate 630
Acer TravelMate 220
Acer TravelMate 230
Acer TravelMate 260
Acer TravelMate 280
Acer TravelMate 360
Acer TravelMate 2100
Acer TravelMate 2410
Acer Aspire 1500
Acer Aspire 1600
Acer Aspire 3020
Acer Aspire 5020
Medion MD 2900
Medion MD 40100
Medion MD 95400
Medion MD 96500
Fujitsu Siemens Amilo 7820
Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
- consolidate code for binding handlers to a device
- return error codes from handlers connect() methods back to input
core and log failures
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
On Dell W7658 keyboard, when BIOS sets NumLock LED on, it survives the
takeover by kernel and thus confuses users.
Eating of an increasibly scarce quirk bit is unfortunate. We do it for safety,
given the history of nervous input devices which crash if anything unusual
happens.
Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Logitech MX3000 contains report descriptor which doesn't cover usages
above 0x28c, but emits such usages. Report descriptor needs fixing
in the very same way as with receivers shipped with S510 keyboards.
This patch also adds a few mappings for multimedia keys that S510 didn't
emit.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
It is possible for the timer expiry function to run even though the
request has already been handled: ide_timer_expiry() only checks that
the handler is not NULL, but it is possible that we have handled a
request (thus clearing the handler) and then started a new request
(thus starting the timer again, and setting a handler).
A simple way to exhibit this is to set the DMA timeout to 1 jiffy and
run dd: The kernel will panic after a few minutes because
ide_timer_expiry() tries to add a timer when it's already active.
To fix this, we simply add a request generation count that gets
incremented at every interrupt, and check in ide_timer_expiry() that
we have not already handled a new interrupt before running the expiry
function.
Signed-off-by: Suleiman Souhlal <suleiman@google.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Soeren Sonnenburg reported that upon resume he is getting
this backtrace:
[<c0119637>] smp_apic_timer_interrupt+0x57/0x90
[<c0142d30>] retrigger_next_event+0x0/0xb0
[<c0104d30>] apic_timer_interrupt+0x28/0x30
[<c0142d30>] retrigger_next_event+0x0/0xb0
[<c0140068>] __kfifo_put+0x8/0x90
[<c0130fe5>] on_each_cpu+0x35/0x60
[<c0143538>] clock_was_set+0x18/0x20
[<c0135cdc>] timekeeping_resume+0x7c/0xa0
[<c02aabe1>] __sysdev_resume+0x11/0x80
[<c02ab0c7>] sysdev_resume+0x47/0x80
[<c02b0b05>] device_power_up+0x5/0x10
it turns out that on resume we mistakenly re-enable interrupts too
early. Do the timer retrigger only on the current CPU.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Soeren Sonnenburg <kernel@nn7.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Revert all this. It can cause device-mapper to receive a different major from
earlier kernels and it turns out that the Amanda backup program (via GNU tar,
apparently) checks major numbers on files when performing incremental backups.
Which is a bit broken of Amanda (or tar), but this feature isn't important
enough to justify the churn.
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A device can be removed from an md array via e.g.
echo remove > /sys/block/md3/md/dev-sde/state
This will try to remove the 'dev-sde' subtree which will deadlock
since
commit e7b0d26a86
With this patch we run the kobject_del via schedule_work so as to
avoid the deadlock.
Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
patch 4/4:
Limit ATAPI DMA to R/W commands only for TORiSAN DRD-N216 DVD-ROM drives
(http://bugzilla.kernel.org/show_bug.cgi?id=6710)
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
patch 3/4:
The TORiSAN drive locks up when max sector == 256.
Limit max sector to 128 for the TORiSAN DRD-N216 drives.
(http://bugzilla.kernel.org/show_bug.cgi?id=6710)
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
patch 1/4:
Reorder HSM_ST_FIRST, such that the task state transition is easier decoded with human eyes.
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Auto unlock sectors on resume for auto locking flash on power up.
Signed-off-by: Rodolfo Giometti <giometti@enneenne.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Fix the regression resulting from the recent change of suspend code
ordering that causes systems based on Intel x86 CPUs using the microcode
driver to hang during the resume.
The problem occurs since the microcode driver uses request_firmware() in
its CPU hotplug notifier, which is called after tasks has been frozen and
hangs. It can be fixed by telling the microcode driver to use the
microcode stored in memory during the resume instead of trying to load it
from disk.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Adrian Bunk <bunk@stusta.de>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Maxim <maximlevitsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The input_device pointer is not refcounted, which means the device may
disappear while packets are queued, causing a crash when ifb passes packets
with a stale skb->dev pointer to netif_rx().
Fix by storing the interface index instead and do a lookup where neccessary.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg discovered that kernel space was leaking to
userspace on 64 bit platform. He made a first patch to fix that. This
is an improved version of his patch.
Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
* 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block:
Export __splice_from_pipe()
2/2 splice: dont readpage
1/2 splice: dont steal
make elv_register() output atomic
block: blk_max_pfn is somtimes wrong
When CONFIG_IPC_NS=n, clone(CLONE_NEWIPC) claims success, but did not actually
clone a new IPC namespace.
Fix this to return -EINVAL so the caller knows his request was denied.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When CONFIG_UTS_NS=n, clone(CLONE_NEWUTS) quietly refuses. So correctly does
not unshare a new uts namespace, but also does not return -EINVAL.
Fix this to return -EINVAL so the caller knows his request was denied.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
UML/x86_64 needs the same packing of struct epoll_event as x86_64.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ocfs2 wants to implement it's own splice write actor so that it can better
manage cluster / page locks. This lets us re-use the rest of splice write
while only providing our own code where it's actually important.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Change prototypes for __chk_user_ptr and __chk_io_ptr to take const
void* instead of void*, so that code can pass "const void *" to them.
(Right now sparse does not warn about passing const void* to void*
functions, but that is a separate bug that I believe Josh is working on,
and once sparse does check this, the changed prototypes will be
necessary.)
Signed-off-by: Russ Cox <rsc@swtch.com>
Signed-off-by: Josh Triplett <josh@freedesktop.org>
Acked-by: Christopher Li <sparse@chrisli.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
IDE error recovery is using IDLE IMMEDIATE if the drive is busy or has DRQ set.
This violates the ATA spec (can only send IDLEÂ IMMEDIATE when drive is not
busy) and really hoses up some drives (modern drives will not be able to
recover using this error handling). The correct thing to do is issue a SRST
followed by a SET FEATURES command. This is what Western Digital recommends
for error recovery and what Western Digital says Windows does.  It also does
not violate the ATA spec as far as I can tell.
Bart:
* port the patch over the current tree
* undo the recalibration code removal
* send SET FEATURES command after checking for good drive status
* don't check whether the current request is of REQ_TYPE_ATA_{CMD,TASK}
type because we need to send SET FEATURES before handling any requests
* some pre-ATA4 drives require INITIALIZE DEVICE PARAMETERS command before
other commands (except IDENTIFY) so send SET FEATURES only if there are
no pending drive->special requests
* update comments and patch description
* any bugs introduced by this patch are mine and not Suleiman's :-)
Signed-off-by: Suleiman Souhlal <suleiman@google.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Tracing through the code, no current PMU sleep notifier can abort sleep.
Since no new PMU sleep notifiers should be added, this patch simplifies the
code and removes the ability to abort sleep.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Delete the unreferenced header file include/linux/mtd/iflash.h.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
lockdep found a bug during a run of workqueue function - this could be also
caused by a bug from other code running simultaneously.
lockdep really shouldn't be used when debug_locks == 0!
Reported-by: Folkert van Heusden <folkert@vanheusden.com>
Inspired-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix unannotated variable declarations. Variables that have allocation
section annotations (such as __meminitdata) on their definitions must also
have them on their declarations as not doing so may affect the addressing
mode used by the compiler and may result in a linker error.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since d9a9cdfb07 <linux/sysfs.h> is using
ENOSYS without including <linux/errno.h> if CONFIG_SYSFS is disabled.
Fixed by including <linux/errno.h>.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The maximum seconds value we can handle on 32bit is LONG_MAX.
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The current NFS client congestion logic is severly broken, it marks the
backing device congested during each nfs_writepages() call but doesn't
mirror this in nfs_writepage() which makes for deadlocks. Also it
implements its own waitqueue.
Replace this by a more regular congestion implementation that puts a cap on
the number of active writeback pages and uses the bdi congestion waitqueue.
Also always use an interruptible wait since it makes sense to be able to
SIGKILL the process even for mounts without 'intr'.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When the console is in VT_AUTO+KD_GRAPHICS mode, switching to the
SUSPEND_CONSOLE fails, resulting in vt_waitactive() waiting indefinitely or
until the task is interrupted. This patch tests if a console switch can
occur in set_console() and returns early if a console switch is not
possible.
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Andrew Johnson <ajohnson@intrinsyc.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix a bug in the cleanup of an spi_bitbang bus.
The workqueue associated with the bus was destroyed before the call to
spi_unregister_master. That meant that spi devices on that bus would be
unable to do IO in their remove method. The shutdown flag should have been
able to prevent a segfault, but was never getting set. By waiting to
destroy the workqueue until after the master is unregistered, devices are
able to do IO in their remove methods. An added benefit is that neither
the shutdown flag nor a wait for the queue of messages to empty is needed.
Signed-off-by: Chris Lesiak <chris.lesiak@licor.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch corrects work with time in UFS2 case.
1) According to UFS2 disk layout modification/access and so on "time"
should be hold in two variables one 64bit for seconds and another 32bit for
nanoseconds,
at now for some unknown reason we suppose that "inode time" holds in
three variables 32bit for seconds, 32bit for milliseconds and 32bit for
nanoseconds.
2) We set amount of nanoseconds in "VFS inode" to 0 during read, instead of
getting values from "on disk inode"(this should close
http://bugzilla.kernel.org/show_bug.cgi?id=7991).
Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Cc: Bjoern Jacke <bjoern@j3e.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch (as868) adds a helper routine for device drivers that need
to set up a callback to perform some action in a different process's
context. This is intended for use by attribute methods that want to
unregister themselves or their parent device. Attribute method calls
are mutually exclusive with unregistration, so such actions cannot be
taken directly.
Two attribute methods are converted to use the new helper routine: one
for SCSI device deletion and one for System/390 ccwgroup devices.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow drivers to implement their own get and set keycode methods. This
will allow drivers to change their keymaps without allocating huge
tables covering entire range of possible scancodes.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
have it return the buffer it had allocated
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Because we do not reserve space for the pci-x and pci-e state in struct
pci dev we need to dynamically allocate it. However because we need
to support restore being called multiple times after a single save
it is never safe to free the buffers we have allocated to hold the
state.
So this patch modifies the save routines to first check to see
if we have already allocated a state buffer before allocating
a new one. Then the restore routines are modified to not free
the state after restoring it. Simple and it fixes some subtle
error path handling bugs, that are hard to test for.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are two ways pci_save_state and pci_restore_state are used. As
helper functions during suspend/resume, and as helper functions around
a hardware reset event. When used as helper functions around a hardware
reset event there is no reason to believe the calls will be paired, nor
is there a good reason to believe that if we restore the msi state from
before the reset that it will match the current msi state. Since arch
code may change the msi message without going through the driver, drivers
currently do not have enough information to even know when to call
pci_save_state to ensure they will have msi state in sync with the other
kernel irq reception data structures.
It turns out the solution is straight forward, cache the state in the
existing msi data structures (not the magic pci saved things) and
have the msi code update the cached state each time we write to the hardware.
This means we never need to read the hardware to figure out what the hardware
state should be.
By modifying the caching in this manner we get to remove our save_state
routines and only need to provide restore_state routines.
The only fields that were at all tricky to regenerate were the msi and msi-x
control registers and the way we regenerate them currently is a bit dependent
upon assumptions on how we use the allow msi registers to be configured and used
making the code a little bit brittle. If we ever change what cases we allow
or how we configure the msi bits we can address the fragility then.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.infradead.org/mtd-2.6:
[JFFS2] print a message when marking bad block
[JFFS2] Check for all-zero node headers
[MTD] [OneNAND] Classify the page data and oob buffer
[MTD] [OneNAND] Exit the loop when transferring/filling of the oob is finished
[MTD] [OneNAND] add Nokia Copyright and a credit
[MTD] [OneNAND] Fix typo & wrong comments
[MTD] [OneNAND] Use oob buffer instead of main one in oob functions
[MTD] Correct partition failed erase address
[JFFS2] Use yield() between GC passes in background thread.
[MTD] [NAND] Correct misspelled preprocessor variable.
[MTD] [MAPS] dilnetpc: Fix printk warning
[MTD] [NOR] Fix oops in cfi_amdstd_sync
[MTD] ESB2 check for closed ROM window
[JFFS2] Fix writebuffer recovery in the first page of a block
[MTD] [NAND] make oobavail public
Classify the page data and oob buffer
and it prevents the memory fragementation (writesize + oobsize)
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
IA64 and ARM-OABI are currently using their own version of epoll compat_
code.
An architecture needs epoll_event translation if alignof(u64) in 32 bit
mode is different from alignof(u64) in 64 bit mode. If an architecture
needs epoll_event translation, it must define struct compat_epoll_event in
asm/compat.h and set CONFIG_HAVE_COMPAT_EPOLL_EVENT and use
compat_sys_epoll_ctl and compat_sys_epoll_wait.
All 64 bit architecture should use compat_sys_epoll_pwait.
[sfr: restructure and move to fs/compat.c, remove MIPS version
of compat_sys_epoll_pwait, use __put_user_unaligned]
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
During the MTD rework the oobavail parameter of mtd_info structure has become
private. This is not quite correct in terms of integrity and logic. If we have
means to write to OOB area, then we'd like to know upfront how many bytes out
of OOB are spare per page to be able to adapt to specific cases.
The patch inlined adds the public oobavail parameter.
Signed-off-by: Vitaly Wool <vwool@ru.mvista.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
sdhci: release irq during suspend
sdhci: make isr tolerant of read errors
mmc: require explicit support for high-speed
ncpfs: make sure server connection survives a kill
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6:
sis900 warning fixes
mv643xx_eth: Place explicit port number in mv643xx_eth_platform_data
pcnet32: Fix PCnet32 performance bug on non-coherent architecutres
__devinit & __devexit cleanups for de2104x driver
3c59x: Handle pci_enable_device() failure while resuming
dmfe: Fix link detection
dmfe: fix two bugs
dmfe: trivial/spelling fixes
revert "drivers/net/tulip/dmfe: support basic carrier detection"
ucc_geth: returns NETDEV_TX_BUSY when BD ring is full
ucc_geth: Fix BD processing
natsemi: netpoll fixes
bonding: Improve IGMP join processing
bonding: only receive ARPs for us
bonding: fix double dev_add_pack
When the last thread of nfsd exits, it shuts down all related sockets. It
currently uses svc_close_socket to do this, but that only is immediately
effective if the socket is not SK_BUSY.
If the socket is busy - i.e. if a request has arrived that has not yet been
processes - svc_close_socket is not effective and the shutdown process spins.
So create a new svc_force_close_socket which removes the SK_BUSY flag is set
and then calls svc_close_socket.
Also change some open-codes loops in svc_destroy to use
list_for_each_entry_safe.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
They don't really save that much, and aren't worth the hassle.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Include linux/types.h here because we need a definition of __u32. This file
appears not be exported verbatim by libc, so I think this doesn't have any
userspace consequences.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The description for the hrtimer_clock_base struct describes "hrtimer_base".
That should be hrtimer_clock_base.
Signed-off-by: Andres Salomon <dilinger@debian.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The description for HRTIMER_CB_IRQSAFE_NO_SOFTIRQ is backwards; "NO
SOFTIRQ" sounds a whole lot like it means it must not be run in a softirq.
Signed-off-by: Andres Salomon <dilinger@debian.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The new high-speed timings are similar to each other and the old
system, but not identical. And although things "just work" most of
the time, sometimes it does not. So we need to start marking which
hosts are known to fully comply with the new timings.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Use internal buffers instead of the ones supplied by the caller
so that a caller can be interrupted without having to abort the
entire ncp connection.
Signed-off-by: Pierre Ossman <ossman@cendio.se>
Acked-by: Petr Vandrovec <petr@vandrovec.name>
We were using the platform_device.id field to identify which ethernet
port is used for mv643xx_eth device. This is not generally correct.
It will be incorrect, for example, if a hardware platform uses a single
port but not the first port. Here, we add an explicit port_number field
to struct mv643xx_eth_platform_data.
This makes the mv643xx_eth_platform_data structure required, but that
isn't an issue since all users currently provide it already.
Signed-off-by: Dale Farnsworth <dale@farnsworth.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
In active-backup mode, the current bonding code duplicates IGMP
traffic to all slaves, so that switches are up to date in case of a
failover from an active to a backup interface. If bonding then fails
back to the original active interface, it is likely that the "active
slave" switch's IGMP forwarding for the port will be out of date until
some event occurs to refresh the switch (e.g., a membership query).
This patch alters the behavior of bonding to no longer flood
IGMP to all ports, and to issue IGMP JOINs to the newly active port at
the time of a failover. This insures that switches are kept up to date
for all cases.
"GOELLESCH Niels" <niels.goellesch@eurocontrol.int> originally
reported this problem, and included a patch. His original patch was
modified by Jay Vosburgh to additionally remove the existing IGMP flood
behavior, use RCU, streamline code paths, fix trailing white space, and
adjust for style.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fix {nf,ip}_ct_iterate_cleanup unconfirmed list handling:
- unconfirmed entries can not be killed manually, they are removed on
confirmation or final destruction of the conntrack entry, which means
we might iterate forever without making forward progress.
This can happen in combination with the conntrack event cache, which
holds a reference to the conntrack entry, which is only released when
the packet makes it all the way through the stack or a different
packet is handled.
- taking references to an unconfirmed entry and using it outside the
locked section doesn't work, the list entries are not refcounted and
another CPU might already be waiting to destroy the entry
What the code really wants to do is make sure the references of the hash
table to the selected conntrack entries are released, so they will be
destroyed once all references from skbs and the event cache are dropped.
Since unconfirmed entries haven't even entered the hash yet, simply mark
them as dying and skip confirmation based on that.
Reported and tested by Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Doing something like this on a two cpu system
# echo 0 > /sys/devices/system/cpu/cpu0/online
# echo 1 > /sys/devices/system/cpu/cpu0/online
# echo 0 > /sys/devices/system/cpu/cpu1/online
will give me this:
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.21-rc2-g562aa1d4-dirty #7
-------------------------------------------------------
bash/1282 is trying to acquire lock:
(&cpu_base->lock_key){.+..}, at: [<000000000005f17e>] hrtimer_cpu_notify+0xc6/0x240
but task is already holding lock:
(&cpu_base->lock_key#2){.+..}, at: [<000000000005f174>] hrtimer_cpu_notify+0xbc/0x240
which lock already depends on the new lock.
This happens because we have the following code in kernel/hrtimer.c:
migrate_hrtimers(int cpu)
[...]
old_base = &per_cpu(hrtimer_bases, cpu);
new_base = &get_cpu_var(hrtimer_bases);
[...]
spin_lock(&new_base->lock);
spin_lock(&old_base->lock);
Which means the spinlocks are taken in an order which depends on which cpu
gets shut down from which other cpu. Therefore lockdep complains that there
might be an ABBA deadlock. Since migrate_hrtimers() gets only called on
cpu hotplug it's safe to assume that it isn't executed concurrently on a
The same problem exists in kernel/timer.c: migrate_timers().
As pointed out by Christian Borntraeger one possible solution to avoid
the locking order complaints would be to make sure that the locks are
always taken in the same order. E.g. by taking the lock of the cpu with
the lower number first.
To achieve this we introduce two new spinlock functions double_spin_lock
and double_spin_unlock which lock or unlock two locks in a given order.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Christian Borntraeger <cborntra@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently we do not check for vma flags if sys_move_pages is called to move
individual pages. If sys_migrate_pages is called to move pages then we
check for vm_flags that indicate a non migratable vma but that still
includes VM_LOCKED and we can migrate mlocked pages.
Extract the vma_migratable check from mm/mempolicy.c, fix it and put it
into migrate.h so that is can be used from both locations.
Problem was spotted by Lee Schermerhorn
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the SMT-nice feature which idles sibling cpus on SMT cpus to
facilitiate nice working properly where cpu power is shared. The idling of
cpus in the presence of runnable tasks is considered too fragile, easy to
break with outside code, and the complexity of managing this system if an
architecture comes along with many logical cores sharing cpu power will be
unworkable.
Remove the associated per_cpu_gain variable in sched_domains used only by
this code.
Also:
The reason is that with dynticks enabled, this code breaks without yet
further tweaks so dynticks brought on the rapid demise of this code. So
either we tweak this code or kill it off entirely. It was Ingo's preference
to kill it off. Either way this needs to happen for 2.6.21 since dynticks
has gone in.
Signed-off-by: Con Kolivas <kernel@kolivas.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The gpio_keys driver is wrongly ARM-specific; it can't build on
other platforms with GPIO suport. This fixes that problem.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: pHilipp Zabel <philipp.zabel@gmail.com>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Ben Nizette <ben.nizette@iinet.net.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In some cases when we are not using msi we need a way to ensure that the
hardware does not have an msi capability enabled. Currently the code has been
calling disable_msi_mode to try and achieve that. However disable_msi_mode
has several other side effects and is only available when msi support is
compiled in so it isn't really appropriate.
Instead this patch implements pci_msi_off which disables all msi and msix
capabilities unconditionally with no additional side effects.
pci_disable_device was redundantly clearing the bus master enable flag and
clearing the msi enable bit. A device that is not allowed to perform bus
mastering operations cannot generate intx or msi interrupt messages as those
are essentially a special case of dma, and require bus mastering. So the call
in pci_disable_device to disable msi capabilities was redundant.
quirk_pcie_pxh also called disable_msi_mode and is updated to use pci_msi_off.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[VLAN]: Avoid a 4-order allocation.
[HDLC] Fix dev->header_cache_update having a random value.
[NetLabel]: Verify sensitivity level has a valid CIPSO mapping
[PPPOE]: Key connections properly on local device.
[AF_UNIX]: Test against sk_max_ack_backlog properly.
[NET]: Fix bugs in "Whether sock accept queue is full" checking
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] MTX1: clear PCI errors
[MIPS] MTX1: add idsel cardbus ressources
[MIPS] MTX1: remove unneeded settings
[MIPS] dma_sync_sg_for_cpu is a no-op except for non-coherent R10000s.
[MIPS] Cobalt: update reserved resources
[MIPS] SN: PCI fixup needs to include <irq.h>.
[MIPS] DMA: Fix a bunch of warnings due to missing inline keywords.
[MIPS] RM: It should be #ifdef CONFIG_FOO not #if CONFIG_FOO ...
[MIPS] Fix and cleanup the mess that a dozen prom_printf variants are.
[MIPS] DEC: Remove redeclarations of mips_machgroup and mips_machtype.
[MIPS] No need to write c0_compare in plat_timer_setup
[MIPS] Convert to RTC-class ds1742 driver
[MIPS] Oprofile: Add missing break statements.
[MIPS] jmr3927: build fix
[MIPS] SNI: Fix mc146818_decode_year
[MIPS] Replace sys32_timer_create with the generic compat_sys_timer_create.
[MIPS] Replace sys32_socketcall with the generic compat_sys_socketcall.
[MIPS] N32 waitid is the same as o32.
The generic rtc-ds1742 driver can be used for RBTX4927 and JMR3927
(with __swizzle_addr trick). This patch also removes MIPS local
DS1742 stuff.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Use the standard magic.h for kvmfs.
Cc: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Allocate a distinct inode for every vcpu in a VM. This has the following
benefits:
- the filp cachelines are no longer bounced when f_count is incremented on
every ioctl()
- the API and internal code are distinctly clearer; for example, on the
KVM_GET_REGS ioctl, there is no need to copy the vcpu number from
userspace and then copy the registers back; the vcpu identity is derived
from the fd used to make the call
Right now the performance benefits are completely theoretical since (a) we
don't support more than one vcpu per VM and (b) virtualization hardware
inefficiencies completely everwhelm any cacheline bouncing effects. But
both of these will change, and we need to prepare the API today.
Signed-off-by: Avi Kivity <avi@qumranet.com>
This avoids having filp->f_op and the corresponding inode->i_fop different,
which is a little unorthodox.
The ioctl list is split into two: global kvm ioctls and per-vm ioctls. A new
ioctl, KVM_CREATE_VM, is used to create VMs and return the VM fd.
Signed-off-by: Avi Kivity <avi@qumranet.com>
This adds a special MSR based hypercall API to KVM. This is to be
used by paravirtual kernels and virtual drivers.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Avi Kivity <avi@qumranet.com>
The function ide_get_best_pio_mode() fails to return the correct IORDY setting
for the explicitly specified modes -- fix this along with the heading comment,
and also remove the long commented out code.
Also, while at it, correct the misliading comment about the PIO cycle time in
<linux/ide.h> -- it actually consists of only the active and recovery periods,
with only some chips also including the address setup time into equation...
[ bart: sl82c105 seems to be currently the only driver affected by this fix ]
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
This patch splits the vlan_group struct into a multi-allocated struct. On
x86_64, the size of the original struct is a little more than 32KB, causing
a 4-order allocation, which is prune to problems caused by buddy-system
external fragmentation conditions.
I couldn't just use vmalloc() because vfree() cannot be called in the
softirq context of the RCU callback.
Signed-off-by: Dan Aloni <da-x@monatomic.org>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The information contained within platform_data should be self-contained.
Replace the pointer to a MAC address with the actual MAC address in
struct mv643xx_eth_platform_data.
Signed-off-by: Dale Farnsworth <dale@farnsworth.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Conditionalize all PM related stuff in libata core layer using
CONFIG_PM.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The initial simplex handling code is fooled if you suspend and resume.
This also causes problems with some single channel controllers which
claim to be simplex.
The fix is fairly simple, instead of keeping a flag to remember if we
gave away the simplex channel we remember the actual owner. As the owner
is always part of the host_set we don't even need a refcount.
Knowing the owner also means we can reassign simplex DMA channels in
future hotplug code etc if we need to
Signed-off-by: Alan Cox <alan@redhat.com>
(and a signed-off for the patch I sent before while I remember)
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jikos/hid:
HID: fix Logitech DiNovo Edge touchwheel and Logic3 /SpectraVideo middle button
HID: add git tree information to MAINTAINERS
HID: fix broken Logitech S510 keyboard report descriptor; make extra keys work
HID: fix possible double-free on error path in hid parser
HID: hid-debug.c should #include <linux/hid-debug.h>
HID: fix bug in zeroing the last field byte in output reports
USB HID: use CONFIG_HID_DEBUG for outputting report descriptor
USB HID: Fix USB vendor and product IDs endianness for USB HID devices
This patch provides the following hugetlb-related fixes to the recent stacked
shm files changes:
- Update is_file_hugepages() so it will reconize hugetlb shm segments.
- get_unmapped_area must be called with the nested file struct to handle
the sfd->file->f_ops->get_unmapped_area == NULL case.
- The fsync f_op must be wrapped since it is specified in the hugetlbfs
f_ops.
This is based on proposed fixes from Eric Biederman that were debugged and
tested by me. Without it, attempting to use hugetlb shared memory segments
on powerpc (and likely ia64) will kill your box.
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: William Irwin <bill.irwin@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The CAPI trace debug functions were using a fixed size buffer, which can be
overflowed if wrong formatted CAPI messages were sent to the kernel capi
layer. The code was also not protected against multiple callers. This fix
bug 8028.
Additionally the patch make the CAPI trace functions optional.
Signed-off-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
linux/irq.h uses EINVAL but does not #include linux/errno.h. This results in
the compiler spitting out errors on some files.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
throttle_vm_writeout() is designed to wait for the dirty levels to subside.
But if the caller holds IO or FS locks, we might be holding up that writeout.
So change it to take a single nap to give other devices a chance to clean some
memory, then return.
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rename PG_checked to PG_owner_priv_1 to reflect its availablilty as a
private flag for use by the owner/allocator of the page. In the case of
pagecache pages (which might be considered to be owned by the mm),
filesystems may use the flag.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>