Commit Graph

5908 Commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo
ea2ae17d64 [SK_BUFF]: Introduce skb_transport_offset()
For the quite common 'skb->h.raw - skb->data' sequence.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:25:16 -07:00
Arnaldo Carvalho de Melo
badff6d01a [SK_BUFF]: Introduce skb_reset_transport_header(skb)
For the common, open coded 'skb->h.raw = skb->data' operation, so that we can
later turn skb->h.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.

This one touches just the most simple cases:

skb->h.raw = skb->data;
skb->h.raw = {skb_push|[__]skb_pull}()

The next ones will handle the slightly more "complex" cases.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:25:15 -07:00
Arnaldo Carvalho de Melo
0660e03f6b [SK_BUFF]: Introduce ipv6_hdr(), remove skb->nh.ipv6h
Now the skb->nh union has just one member, .raw, i.e. it is just like the
skb->mac union, strange, no? I'm just leaving it like that till the transport
layer is done with, when we'll rename skb->mac.raw to skb->mac_header (or
->mac_header_offset?), ditto for ->{h,nh}.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:25:14 -07:00
Arnaldo Carvalho de Melo
d0a92be05e [SK_BUFF]: Introduce arp_hdr(), remove skb->nh.arph
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:25:12 -07:00
Arnaldo Carvalho de Melo
eddc9ec53b [SK_BUFF]: Introduce ip_hdr(), remove skb->nh.iph
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:25:10 -07:00
Arnaldo Carvalho de Melo
c14d2450cb [SK_BUFF]: Introduce skb_set_network_header
For the cases where the network header is being set to a offset from skb->data.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:25:01 -07:00
Arnaldo Carvalho de Melo
d56f90a7c9 [SK_BUFF]: Introduce skb_network_header()
For the places where we need a pointer to the network header, it is still legal
to touch skb->nh.raw directly if just adding to, subtracting from or setting it
to another layer header.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:59 -07:00
Arnaldo Carvalho de Melo
bbe735e424 [SK_BUFF]: Introduce skb_network_offset()
For the quite common 'skb->nh.raw - skb->data' sequence.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:58 -07:00
Arnaldo Carvalho de Melo
c1d2bbe1cd [SK_BUFF]: Introduce skb_reset_network_header(skb)
For the common, open coded 'skb->nh.raw = skb->data' operation, so that we can
later turn skb->nh.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.

This one touches just the most simple case, next will handle the slightly more
"complex" cases.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:46 -07:00
Arnaldo Carvalho de Melo
797659fb4a [PPPOE]: Introduce pppoe_hdr()
For consistency with all the other skb->nh.raw accessors.

Also do some really obvious simplifications in pppoe_recvmsg, well the
kfree_skb one is not so obvious, but free() and kfree() have the same behaviour
(hint :-) ).

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:43 -07:00
Arnaldo Carvalho de Melo
98e399f82a [SK_BUFF]: Introduce skb_mac_header()
For the places where we need a pointer to the mac header, it is still legal to
touch skb->mac.raw directly if just adding to, subtracting from or setting it
to another layer header.

This one also converts some more cases to skb_reset_mac_header() that my
regex missed as it had no spaces before nor after '=', ugh.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:41 -07:00
Arnaldo Carvalho de Melo
48d49d0ccd [SK_BUFF]: Introduce skb_set_mac_header()
For the cases where we want to set skb->mac.raw to an offset from skb->data.

Simple cases first, the memmove ones and specially pktgen will be left for later.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:37 -07:00
Arnaldo Carvalho de Melo
459a98ed88 [SK_BUFF]: Introduce skb_reset_mac_header(skb)
For the common, open coded 'skb->mac.raw = skb->data' operation, so that we can
later turn skb->mac.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.

This one touches just the most simple case, next will handle the slightly more
"complex" cases.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:32 -07:00
Stephen Hemminger
a2a316fd06 [NET]: Replace CONFIG_NET_DEBUG with sysctl.
Covert network warning messages from a compile time to runtime choice.
Removes kernel config option and replaces it with new /proc/sys/net/core/warnings.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:05 -07:00
Herbert Xu
759e5d0064 [UDP]: Clean up UDP-Lite receive checksum
This patch eliminates some duplicate code for the verification of
receive checksums between UDP-Lite and UDP.  It does this by
introducing __skb_checksum_complete_head which is identical to
__skb_checksum_complete_head apart from the fact that it takes
a length parameter rather than computing the first skb->len bytes.

As a result UDP-Lite will be able to use hardware checksum offload
for packets which do not use partial coverage checksums.  It also
means that UDP-Lite loopback no longer does unnecessary checksum
verification.

If any NICs start support UDP-Lite this would also start working
automatically.

This patch removes the assumption that msg_flags has MSG_TRUNC clear
upon entry in recvmsg.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:23:51 -07:00
David S. Miller
fc910a2783 [NETLINK]: Limit NLMSG_GOODSIZE to 8K.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:23:45 -07:00
Neil Horman
95c385b4d5 [IPV6] ADDRCONF: Optimistic Duplicate Address Detection (RFC 4429) Support.
Nominally an autoconfigured IPv6 address is added to an interface in the
Tentative state (as per RFC 2462).  Addresses in this state remain in this
state while the Duplicate Address Detection process operates on them to
determine their uniqueness on the network.  During this period, these
tentative addresses may not be used for communication, increasing the time
before a node may be able to communicate on a network.  Using Optimistic
Duplicate Address Detection, autoconfigured addresses may be used
immediately for communication on the network, as long as certain rules are
followed to avoid conflicts with other nodes during the Duplicate Address
Detection process.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:23:43 -07:00
Eric Dumazet
b7aa0bf70c [NET]: convert network timestamps to ktime_t
We currently use a special structure (struct skb_timeval) and plain
'struct timeval' to store packet timestamps in sk_buffs and struct
sock.

This has some drawbacks :
- Fixed resolution of micro second.
- Waste of space on 64bit platforms where sizeof(struct timeval)=16

I suggest using ktime_t that is a nice abstraction of high resolution
time services, currently capable of nanosecond resolution.

As sizeof(ktime_t) is 8 bytes, using ktime_t in 'struct sock' permits
a 8 byte shrink of this structure on 64bit architectures. Some other
structures also benefit from this size reduction (struct ipq in
ipv4/ip_fragment.c, struct frag_queue in ipv6/reassembly.c, ...)

Once this ktime infrastructure adopted, we can more easily provide
nanosecond resolution on top of it. (ioctl SIOCGSTAMPNS and/or
SO_TIMESTAMPNS/SCM_TIMESTAMPNS)

Note : this patch includes a bug correction in
compat_sock_get_timestamp() where a "err = 0;" was missing (so this
syscall returned -ENOENT instead of 0)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
CC: Stephen Hemminger <shemminger@linux-foundation.org>
CC: John find <linux.kernel@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:23:34 -07:00
Ilpo Järvinen
3cfe3baaf0 [TCP]: Add two new spurious RTO responses to FRTO
New sysctl tcp_frto_response is added to select amongst these
responses:
	- Rate halving based; reuses CA_CWR state (default)
	- Very conservative; used to be the only one available (=1)
	- Undo cwr; undoes ssthresh and cwnd reductions (=2)

The response with rate halving requires a new parameter to
tcp_enter_cwr because FRTO has already reduced ssthresh and
doing a second reduction there has to be prevented. In addition,
to keep things nice on 80 cols screen, a local variable was
added.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:23:23 -07:00
David S. Miller
e0ef57cc56 [TCP]: Make snd_cwnd_clamp a u32.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:23:21 -07:00
Eric Dumazet
54287cc178 [TCP]: Keep copied_seq, rcv_wup and rcv_next together.
I noticed in oprofile study a cache miss in tcp_rcv_established() to read
copied_seq.

ffffffff80400a80 <tcp_rcv_established>: /* tcp_rcv_established total: 4034293  
2.0400 */

 55493  0.0281 :ffffffff80400bc9:   mov    0x4c8(%r12),%eax copied_seq
543103  0.2746 :ffffffff80400bd1:   cmp    0x3e0(%r12),%eax   rcv_nxt    

if (tp->copied_seq == tp->rcv_nxt &&
        len - tcp_header_len <= tp->ucopy.len) {

In this function, the cache line 0x4c0 -> 0x500 is used only for this
reading 'copied_seq' field.

rcv_wup and copied_seq should be next to rcv_nxt field, to lower number of
active cache lines in hot paths. (tcp_rcv_established(), tcp_poll(), ...)

As you suggested, I changed tcp_create_openreq_child() so that these fields
are changed together, to avoid adding a new store buffer stall.

Patch is 64bit friendly (no new hole because of alignment constraints)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:23:21 -07:00
John Heffner
886236c124 [TCP]: Add RFC3742 Limited Slow-Start, controlled by variable sysctl_tcp_max_ssthresh.
Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:23:19 -07:00
Linus Torvalds
12145387a0 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [BNX2]: Fix occasional NETDEV WATCHDOG on 5709.
  [IPV6]: Disallow RH0 by default.
  [XFRM]: beet: fix pseudo header length value
  [TCP]: Congestion control initialization.
2007-04-24 18:20:32 -07:00
YOSHIFUJI Hideaki
0bcbc92629 [IPV6]: Disallow RH0 by default.
A security issue is emerging.  Disallow Routing Header Type 0 by default
as we have been doing for IPv4.
Note: We allow RH2 by default because it is harmless.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-24 14:58:30 -07:00
Balbir Singh
7e40f2ab0a Taskstats fix the structure members alignment issue
We broke the the alignment of members of taskstats to the 8 byte boundary
with the CSA patches.  In the current kernel, the taskstats structure is
not suitable for use by 32 bit applications in a 64 bit kernel.

On x86_64

Offsets of taskstats' members (64 bit kernel, 64 bit application)

@taskstats'offsetof[@taskstats'indices] = (
        0,      # version
        4,      # ac_exitcode
        8,      # ac_flag
        9,      # ac_nice
        16,     # cpu_count
        24,     # cpu_delay_total
        32,     # blkio_count
        40,     # blkio_delay_total
        48,     # swapin_count
        56,     # swapin_delay_total
        64,     # cpu_run_real_total
        72,     # cpu_run_virtual_total
        80,     # ac_comm
        112,    # ac_sched
        113,    # ac_pad
        116,    # ac_uid
        120,    # ac_gid
        124,    # ac_pid
        128,    # ac_ppid
        132,    # ac_btime
        136,    # ac_etime
        144,    # ac_utime
        152,    # ac_stime
        160,    # ac_minflt
        168,    # ac_majflt
        176,    # coremem
        184,    # virtmem
        192,    # hiwater_rss
        200,    # hiwater_vm
        208,    # read_char
        216,    # write_char
        224,    # read_syscalls
        232,    # write_syscalls
        240,    # read_bytes
        248,    # write_bytes
        256,    # cancelled_write_bytes
    );

Offsets of taskstats' members (64 bit kernel, 32 bit application)

@taskstats'offsetof[@taskstats'indices] = (
        0,      # version
        4,      # ac_exitcode
        8,      # ac_flag
        9,      # ac_nice
        12,     # cpu_count
        20,     # cpu_delay_total
        28,     # blkio_count
        36,     # blkio_delay_total
        44,     # swapin_count
        52,     # swapin_delay_total
        60,     # cpu_run_real_total
        68,     # cpu_run_virtual_total
        76,     # ac_comm
        108,    # ac_sched
        109,    # ac_pad
        112,    # ac_uid
        116,    # ac_gid
        120,    # ac_pid
        124,    # ac_ppid
        128,    # ac_btime
        132,    # ac_etime
        140,    # ac_utime
        148,    # ac_stime
        156,    # ac_minflt
        164,    # ac_majflt
        172,    # coremem
        180,    # virtmem
        188,    # hiwater_rss
        196,    # hiwater_vm
        204,    # read_char
        212,    # write_char
        220,    # read_syscalls
        228,    # write_syscalls
        236,    # read_bytes
        244,    # write_bytes
        252,    # cancelled_write_bytes
    );

This is one way to solve the problem without re-arranging structure members
is to pack the structure.  The patch adds an __attribute__((aligned(8))) to
the taskstats structure members so that 32 bit applications using taskstats
can work with a 64 bit kernel.

Using __attribute__((packed)) would break the 64 bit alignment of members.

The fix was tested on x86_64. After the fix, we got

Offsets of taskstats' members (64 bit kernel, 64 bit application)

@taskstats'offsetof[@taskstats'indices] = (
        0,      # version
        4,      # ac_exitcode
        8,      # ac_flag
        9,      # ac_nice
        16,     # cpu_count
        24,     # cpu_delay_total
        32,     # blkio_count
        40,     # blkio_delay_total
        48,     # swapin_count
        56,     # swapin_delay_total
        64,     # cpu_run_real_total
        72,     # cpu_run_virtual_total
        80,     # ac_comm
        112,    # ac_sched
        113,    # ac_pad
        120,    # ac_uid
        124,    # ac_gid
        128,    # ac_pid
        132,    # ac_ppid
        136,    # ac_btime
        144,    # ac_etime
        152,    # ac_utime
        160,    # ac_stime
        168,    # ac_minflt
        176,    # ac_majflt
        184,    # coremem
        192,    # virtmem
        200,    # hiwater_rss
        208,    # hiwater_vm
        216,    # read_char
        224,    # write_char
        232,    # read_syscalls
        240,    # write_syscalls
        248,    # read_bytes
        256,    # write_bytes
        264,    # cancelled_write_bytes
    );

Offsets of taskstats' members (64 bit kernel, 32 bit application)

@taskstats'offsetof[@taskstats'indices] = (
        0,      # version
        4,      # ac_exitcode
        8,      # ac_flag
        9,      # ac_nice
        16,     # cpu_count
        24,     # cpu_delay_total
        32,     # blkio_count
        40,     # blkio_delay_total
        48,     # swapin_count
        56,     # swapin_delay_total
        64,     # cpu_run_real_total
        72,     # cpu_run_virtual_total
        80,     # ac_comm
        112,    # ac_sched
        113,    # ac_pad
        120,    # ac_uid
        124,    # ac_gid
        128,    # ac_pid
        132,    # ac_ppid
        136,    # ac_btime
        144,    # ac_etime
        152,    # ac_utime
        160,    # ac_stime
        168,    # ac_minflt
        176,    # ac_majflt
        184,    # coremem
        192,    # virtmem
        200,    # hiwater_rss
        208,    # hiwater_vm
        216,    # read_char
        224,    # write_char
        232,    # read_syscalls
        240,    # write_syscalls
        248,    # read_bytes
        256,    # write_bytes
        264,    # cancelled_write_bytes
    );

Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Jay Lan <jlan@engr.sgi.com>
Cc: Shailabh Nagar <nagar@watson.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-24 08:23:08 -07:00
Trond Myklebust
8e821cad12 NFS: clean up the unstable write code
Get rid of the inlined #ifdefs.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-20 22:56:29 -07:00
Linus Torvalds
80d74d5123 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [BRIDGE]: Unaligned access when comparing ethernet addresses
  [SCTP]: Unmap v4mapped addresses during SCTP_BINDX_REM_ADDR operation.
  [SCTP]: Fix assertion (!atomic_read(&sk->sk_rmem_alloc)) failed message
  [NET]: Set a separate lockdep class for neighbour table's proxy_queue
  [NET]: Fix UDP checksum issue in net poll mode.
  [KEY]: Fix conversion between IPSEC_MODE_xxx and XFRM_MODE_xxx.
  [NET]: Get rid of alloc_skb_from_cache
2007-04-17 16:51:32 -07:00
Russell King
93da28790c Provide dummy devm_ioport_* if !HAS_IOPORT
Provide an dummy implementation of devm_ioport_map() and
devm_ioport_unmap() to allow drivers (eg, pata_platform) to build for
platforms where CONFIG_NO_IOPORT is selected.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-17 16:36:27 -07:00
Randy Dunlap
112654208b kernel-doc: fix plist.h comments
Make kernel-doc comments match macro names.
Correct parameter names in a few places.
Remove '#' from beginning of kernel-doc comment macro names.
Remove extra (erroneous) blank lines in kernel-doc.

Warning(plist.h:100): Cannot understand  * #PLIST_HEAD_INIT - static struct plist_head initializer on line 100 - I thought it was a doc line
Warning(plist.h:112): Cannot understand  * #PLIST_NODE_INIT - static struct plist_node initializer on line 112 - I thought it was a doc line
Warning(plist.h:103): No description found for parameter '_lock'
Warning(plist.h:129): No description found for parameter 'lock'
Warning(plist.h:158): No description found for parameter 'pos'
Warning(plist.h:169): No description found for parameter 'pos'
Warning(plist.h:169): No description found for parameter 'n'
Warning(plist.h:179): No description found for parameter 'mem'

This still leaves one warning & one error that need attention:
Error(plist.h:219): cannot understand prototype: '('
Warning(plist.h): no structured comments found

Acked-by: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Cc: Daniel Walker <dwalker@mvista.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-17 16:36:26 -07:00
Pavel Emelianov
c2ecba7171 [NET]: Set a separate lockdep class for neighbour table's proxy_queue
Otherwise the following calltrace will lead to a wrong
lockdep warning:

  neigh_proxy_process()
    `- lock(neigh_table->proxy_queue.lock);
  arp_redo /* via tbl->proxy_redo */
  arp_process
  neigh_event_ns
  neigh_update
  skb_queue_purge
    `- lock(neighbor->arp_queue.lock);

This is not a deadlock actually, as neighbor table's proxy_queue
and the neighbor's arp_queue are different queues.

Lockdep thinks there is a deadlock as both queues are initialized
with skb_queue_head_init() and thus have a common class.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-17 13:13:31 -07:00
Herbert Xu
b4dfa0b1fb [NET]: Get rid of alloc_skb_from_cache
Since this was added originally for Xen, and Xen has recently (~2.6.18)
stopped using this function, we can safely get rid of it.  Good timing
too since this function has started to bit rot.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-17 13:13:16 -07:00
Trond Myklebust
5a6d41b32a NFS: Ensure PG_writeback is cleared when writeback fails
If the writebacks are cancelled via nfs_cancel_dirty_list, or due to the
memory allocation failing in nfs_flush_one/nfs_flush_multi, then we must
ensure that the PG_writeback flag is cleared.

Also ensure that we actually own the PG_writeback flag whenever we
schedule a new writeback by making nfs_set_page_writeback() return the
value of test_set_page_writeback().
The PG_writeback page flag ends up replacing the functionality of the
PG_FLUSHING nfs_page flag, so we rip that out too.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-14 21:46:48 -07:00
Suleiman Souhlal
23450319e2 ide: correctly prevent IDE timer expiry function to run if request was already handled
It is possible for the timer expiry function to run even though the
request has already been handled: ide_timer_expiry() only checks that
the handler is not NULL, but it is possible that we have handled a
request (thus clearing the handler) and then started a new request
(thus starting the timer again, and setting a handler). 

A simple way to exhibit this is to set the DMA timeout to 1 jiffy and
run dd: The kernel will panic after a few minutes because
ide_timer_expiry() tries to add a timer when it's already active.

To fix this, we simply add a request generation count that gets
incremented at every interrupt, and check in ide_timer_expiry() that
we have not already handled a new interrupt before running the expiry
function.

Signed-off-by: Suleiman Souhlal <suleiman@google.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-04-10 22:38:37 +02:00
Ingo Molnar
995f054f2a [PATCH] high-res timers: resume fix
Soeren Sonnenburg reported that upon resume he is getting
this backtrace:

 [<c0119637>] smp_apic_timer_interrupt+0x57/0x90
 [<c0142d30>] retrigger_next_event+0x0/0xb0
 [<c0104d30>] apic_timer_interrupt+0x28/0x30
 [<c0142d30>] retrigger_next_event+0x0/0xb0
 [<c0140068>] __kfifo_put+0x8/0x90
 [<c0130fe5>] on_each_cpu+0x35/0x60
 [<c0143538>] clock_was_set+0x18/0x20
 [<c0135cdc>] timekeeping_resume+0x7c/0xa0
 [<c02aabe1>] __sysdev_resume+0x11/0x80
 [<c02ab0c7>] sysdev_resume+0x47/0x80
 [<c02b0b05>] device_power_up+0x5/0x10

it turns out that on resume we mistakenly re-enable interrupts too
early.  Do the timer retrigger only on the current CPU.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Soeren Sonnenburg <kernel@nn7.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-07 10:03:43 -07:00
Andrew Morton
2363cc0264 [PATCH] remove protection of LANANA-reserved majors
Revert all this.  It can cause device-mapper to receive a different major from
earlier kernels and it turns out that the Amanda backup program (via GNU tar,
apparently) checks major numbers on files when performing incremental backups.

Which is a bit broken of Amanda (or tar), but this feature isn't important
enough to justify the churn.

Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-04 21:12:47 -07:00
NeilBrown
5792a2856a [PATCH] md: avoid a deadlock when removing a device from an md array via sysfs
A device can be removed from an md array via e.g.
  echo remove > /sys/block/md3/md/dev-sde/state

This will try to remove the 'dev-sde' subtree which will deadlock
since
  commit e7b0d26a86

With this patch we run the kobject_del via schedule_work so as to
avoid the deadlock.

Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-04 21:12:47 -07:00
Albert Lee
6f23a31d1c libata: Limit ATAPI DMA to R/W commands only for TORiSAN DVD drives (take 3)
patch 4/4:

  Limit ATAPI DMA to R/W commands only for TORiSAN DRD-N216 DVD-ROM drives
  (http://bugzilla.kernel.org/show_bug.cgi?id=6710)

Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-04-04 02:12:27 -04:00
Albert Lee
18d6e9d518 libata: Limit max sector to 128 for TORiSAN DVD drives (take 3)
patch 3/4:
  The TORiSAN drive locks up when max sector == 256.
  Limit max sector to 128 for the TORiSAN DRD-N216 drives.
  (http://bugzilla.kernel.org/show_bug.cgi?id=6710)

Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-04-04 02:12:27 -04:00
Albert Lee
7152764700 libata: reorder HSM_ST_FIRST for easier decoding (take 3)
patch 1/4:
  Reorder HSM_ST_FIRST, such that the task state transition is easier decoded with human eyes.

Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-04-04 02:12:27 -04:00
Rafael J. Wysocki
1d64b9cb1d [PATCH] Fix microcode-related suspend problem
Fix the regression resulting from the recent change of suspend code
ordering that causes systems based on Intel x86 CPUs using the microcode
driver to hang during the resume.

The problem occurs since the microcode driver uses request_firmware() in
its CPU hotplug notifier, which is called after tasks has been frozen and
hangs.  It can be fixed by telling the microcode driver to use the
microcode stored in memory during the resume instead of trying to load it
from disk.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Adrian Bunk <bunk@stusta.de>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Maxim <maximlevitsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-02 10:06:09 -07:00
Kay Sievers
0c84ce268b [PATCH] driver core: fix built-in drivers sysfs links
built-in drivers had broken sysfs links that caused bootup hangs for
certain driver unregistry sequences.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-02 10:06:09 -07:00
Patrick McHardy
c01003c205 [IFB]: Fix crash on input device removal
The input_device pointer is not refcounted, which means the device may
disappear while packets are queued, causing a crash when ifb passes packets
with a stale skb->dev pointer to netif_rx().

Fix by storing the interface index instead and do a lookup where neccessary.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-29 11:46:52 -07:00
Jeff Garzik
a9c87a10db Merge branch 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream-fixes 2007-03-28 02:21:18 -04:00
Jean Tourrilhes
c2805fbb86 [PATCH] WE-22 : prevent information leak on 64 bit
Johannes Berg discovered that kernel space was leaking to
userspace on 64 bit platform. He made a first patch to fix that. This
is an improved version of his patch.

Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-03-27 14:10:26 -04:00
Linus Torvalds
e0ab0bb6d2 Merge branch 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block:
  Export __splice_from_pipe()
  2/2 splice: dont readpage
  1/2 splice: dont steal
  make elv_register() output atomic
  block: blk_max_pfn is somtimes wrong
2007-03-27 09:05:49 -07:00
Serge E. Hallyn
a28d193cbf [PATCH] ipcns: fix !CONFIG_IPC_NS behavior
When CONFIG_IPC_NS=n, clone(CLONE_NEWIPC) claims success, but did not actually
clone a new IPC namespace.

Fix this to return -EINVAL so the caller knows his request was denied.

Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-27 09:05:16 -07:00
Serge E. Hallyn
78d832f626 [PATCH] utsns: fix !CONFIG_UTS_NS behavior
When CONFIG_UTS_NS=n, clone(CLONE_NEWUTS) quietly refuses.  So correctly does
not unshare a new uts namespace, but also does not return -EINVAL.

Fix this to return -EINVAL so the caller knows his request was denied.

Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-27 09:05:15 -07:00
Jeff Dike
d75e26a829 [PATCH] uml: fix epoll
UML/x86_64 needs the same packing of struct epoll_event as x86_64.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-27 09:05:15 -07:00
Mark Fasheh
40bee44eae Export __splice_from_pipe()
Ocfs2 wants to implement it's own splice write actor so that it can better
manage cluster / page locks. This lets us re-use the rest of splice write
while only providing our own code where it's actually important.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-03-27 08:55:47 +02:00
Russ Cox
04a3952330 [PATCH] Add const to pointer qualifiers for __chk_user_ptr and __chk_io_ptr.
Change prototypes for __chk_user_ptr and __chk_io_ptr to take const
void* instead of void*, so that code can pass "const void *" to them.

(Right now sparse does not warn about passing const void* to void*
functions, but that is a separate bug that I believe Josh is working on,
and once sparse does check this, the changed prototypes will be
necessary.)

Signed-off-by: Russ Cox <rsc@swtch.com>
Signed-off-by: Josh Triplett <josh@freedesktop.org>
Acked-by: Christopher Li <sparse@chrisli.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-26 14:23:52 -07:00