Commit Graph

966 Commits

Author SHA1 Message Date
David S. Miller
c1b4a7e695 [TCP]: Move to new TSO segmenting scheme.
Make TSO segment transmit size decisions at send time not earlier.

The basic scheme is that we try to build as large a TSO frame as
possible when pulling in the user data, but the size of the TSO frame
output to the card is determined at transmit time.

This is guided by tp->xmit_size_goal.  It is always set to a multiple
of MSS and tells sendmsg/sendpage how large an SKB to try and build.

Later, tcp_write_xmit() and tcp_push_one() chop up the packet if
necessary and conditions warrant.  These routines can also decide to
"defer" in order to wait for more ACKs to arrive and thus allow larger
TSO frames to be emitted.

A general observation is that TSO elongates the pipe, thus requiring a
larger congestion window and larger buffering especially at the sender
side.  Therefore, it is important that applications 1) get a large
enough socket send buffer (this is accomplished by our dynamic send
buffer expansion code) 2) do large enough writes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-05 15:24:38 -07:00
David S. Miller
55c97f3e99 [TCP]: Fix __tcp_push_pending_frames() 'nonagle' handling.
'nonagle' should be passed to the tcp_snd_test() function
as 'TCP_NAGLE_PUSH' if we are checking an SKB not at the
tail of the write_queue.  This is because Nagle does not
apply to such frames since we cannot possibly tack more
data onto them.

However, while doing this __tcp_push_pending_frames() makes
all of the packets in the write_queue use this modified
'nonagle' value.

Fix the bug and simplify this function by just calling
tcp_write_xmit() directly if sk_send_head is non-NULL.

As a result, we can now make tcp_data_snd_check() just call
tcp_push_pending_frames() instead of the specialized
__tcp_data_snd_check().

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-05 15:19:38 -07:00
David S. Miller
a2e2a59c93 [TCP]: Fix redundant calculations of tcp_current_mss()
tcp_write_xmit() uses tcp_current_mss(), but some of it's callers,
namely __tcp_push_pending_frames(), already has this value available
already.

While we're here, fix the "cur_mss" argument to be "unsigned int"
instead of plain "unsigned".

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-05 15:19:23 -07:00
David S. Miller
a762a98007 [TCP]: Kill extra cwnd validate in __tcp_push_pending_frames().
The tcp_cwnd_validate() function should only be invoked
if we actually send some frames, yet __tcp_push_pending_frames()
will always invoke it.  tcp_write_xmit() does the call for us,
so the call here can simply be removed.

Also, tcp_write_xmit() can be marked static.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-05 15:18:51 -07:00
David S. Miller
84d3e7b957 [TCP]: Move __tcp_data_snd_check into tcp_output.c
It reimplements portions of tcp_snd_check(), so it
we move it to tcp_output.c we can consolidate it's
logic much easier in a later change.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-05 15:18:18 -07:00
David S. Miller
f6302d1d78 [TCP]: Move send test logic out of net/tcp.h
This just moves the code into tcp_output.c, no code logic changes are
made by this patch.

Using this as a baseline, we can begin to untangle the mess of
comparisons for the Nagle test et al.  We will also be able to reduce
all of the redundant computation that occurs when outputting data
packets.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-05 15:18:03 -07:00
David S. Miller
fc6415bcb0 [TCP]: Fix quick-ack decrementing with TSO.
On each packet output, we call tcp_dec_quickack_mode()
if the ACK flag is set.  It drops tp->ack.quick until
it hits zero, at which time we deflate the ATO value.

When doing TSO, we are emitting multiple packets with
ACK set, so we should decrement tp->ack.quick that many
segments.

Note that, unlike this case, tcp_enter_cwr() should not
take the tcp_skb_pcount(skb) into consideration.  That
function, one time, readjusts tp->snd_cwnd and moves
into TCP_CA_CWR state.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-05 15:17:45 -07:00
David S. Miller
c65f7f00c5 [TCP]: Simplify SKB data portion allocation with NETIF_F_SG.
The ideal and most optimal layout for an SKB when doing
scatter-gather is to put all the headers at skb->data, and
all the user data in the page array.

This makes SKB splitting and combining extremely simple,
especially before a packet goes onto the wire the first
time.

So, when sk_stream_alloc_pskb() is given a zero size, make
sure there is no skb_tailroom().  This is achieved by applying
SKB_DATA_ALIGN() to the header length used here.

Next, make select_size() in TCP output segmentation use a
length of zero when NETIF_F_SG is true on the outgoing
interface.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-05 15:17:25 -07:00
Alexey Dobriyan
b8259d9ad1 [NET]: Remove __ARGS from include/net/slhc_vj.h
I suspect "#define __ARGS(x) ()" was deprecated before I was born.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-05 15:12:04 -07:00
Christoph Hellwig
bc971dee6e [SHAPER]: Switch to spinlocks.
Dave, you were right and the sleeping locks in shaper were
broken. Markus Kanet noticed this and also tested the patch below that
switches locking to spinlocks.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-05 15:03:46 -07:00
Thomas Graf
3d54b82fdf [PKT_SCHED]: Cleanup qdisc creation and alignment macros
Adds qdisc_alloc() to share code between qdisc_create()
and qdisc_create_dflt(). Hides the qdisc alignment behind
macros and makes use of them.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-05 14:15:09 -07:00
Thomas Graf
e41a33e6ec [PKT_SCHED]: Move sch_generic.c prototypes to correct header file
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-05 14:14:30 -07:00
Thomas Graf
1cbb3380ef [NET]: Reduce size of sk_buff by 4 bytes
Reduce local_df to a bit field and ip_summed to a 2 bits
field thus saving 13 bits. Move bit fields, packet type,
and protocol into the spare area between the priority
and the destructor. Saves 4 bytes on both, 32bit and
64bit architectures.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-05 14:13:41 -07:00
Thomas Graf
e176fe8954 [NET]: Remove unused security member in sk_buff
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-05 14:12:44 -07:00
Patrick McHardy
55820ee2f8 [NET]: Fix signedness issues in net/core/filter.c
This is the code to load packet data into a register:

                        k = fentry->k;
                        if (k < 0) {
...
                        } else {
                                u32 _tmp, *p;
                                p = skb_header_pointer(skb, k, 4, &_tmp);
                                if (p != NULL) {
                                        A = ntohl(*p);
                                        continue;
                                }
                        }

skb_header_pointer checks if the requested data is within the
linear area:

        int hlen = skb_headlen(skb);

        if (offset + len <= hlen)
                return skb->data + offset;

When offset is within [INT_MAX-len+1..INT_MAX] the addition will
result in a negative number which is <= hlen.

I couldn't trigger a crash on my AMD64 with 2GB of memory, but a
coworker tried on his x86 machine and it crashed immediately.

This patch fixes the check in skb_header_pointer to handle large
positive offsets similar to skb_copy_bits. Invalid data can still
be accessed using negative offsets (also similar to skb_copy_bits),
anyone using negative offsets needs to verify them himself.

Thanks to Thomas Vgtle <thomas.voegtle@coreworks.de> for verifying the
problem by crashing his machine and providing me with an Oops.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-05 14:08:10 -07:00
Russell King
17af691cd1 [PATCH] ARM: Fix new-ABI layout of struct stat64
Add __attribute__((packed)) to ensure that the stat64 structure is
correctly laid out no matter which ABI the kernel is compiled for.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-07-04 13:02:46 +01:00
Russell King
f9bd6ea446 [PATCH] ARM: Change 'param_offset' to 'boot_params'
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-07-04 10:43:36 +01:00
Linus Torvalds
19f7241a3b Merge master.kernel.org:/home/rmk/linux-2.6-arm 2005-07-03 14:39:33 -07:00
Russell King
e9dea0c65d [PATCH] ARM: Remove machine description macros
Remove the pointless machine description macros, favouring C99
initialisers instead.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-07-03 17:38:58 +01:00
Rob Punkunus
21e2c01dc3 [PATCH] amd74xx: support MCP55 device IDs
From: Rob Punkunus <rpunkunus@nvidia.com>

Rob Punkunus recently submitted a patch to enable support for MCP51/MCP55 in
the amd74xx driver. This patch was whitespace-corrupted and didn't apply to
2.6.12 since MCP51 support was merged in the 2.6.12-rc series.

Gentoo would like to support this hardware for our upcoming release media, so
I fixed the patch, and here it is :)

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
2005-07-03 17:37:18 +02:00
Todd Poynor
3e18a45abc [PATCH] ARM: 2782/1: PXA27x MDREFR K0DB4 define
Patch from Todd Poynor

Add definition of K0DB4 SDCLK<0,3> divide-by-4 control/status bit in the
MDREFR register for Intel XScale PXA27x.

Signed-off-by: Todd Poynor <tpoynor@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-07-01 11:27:06 +01:00
Todd Poynor
26705ca46b [PATCH] ARM: 2781/2: PXA27x Standby mode take 2
Patch from Todd Poynor

Add support for PXA27x Standby mode, a low-power mode that retains CPU
and some peripheral state (the existing "sleep" mode is a power-power
mode that retains less state). Activated via:
echo -n standby > /sys/power/state
From: David Burrage and Todd Poynor

Signed-off-by: Todd Poynor <tpoynor@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-07-01 11:27:05 +01:00
Linus Torvalds
62351cc38d Merge master.kernel.org:/home/rmk/linux-2.6-arm 2005-06-30 17:07:37 -07:00
Linus Torvalds
bd53d1270f Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6 2005-06-30 09:04:36 -07:00
Linus Torvalds
12829dcb10 Merge rsync://rsync.kernel.org/pub/scm/linux/kernel/git/paulus/ppc64-2.6 2005-06-30 08:48:56 -07:00
Chris Zankel
9ec55a9bd3 [PATCH] xtensa: Fix asm macro
Removed dead code in arch/xtensa/kernel/pci.c and use the pci_name() macro.
 Fixed an error in the delay asm macro: '1' is an invalid immediate value.

Signed-off-by: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 08:45:11 -07:00
Chris Zankel
5b0de927d9 [PATCH] xtensa: cleanups for errno and ipc.
I noticed this because I was doing some more ipc cleanups and I did the
original errno and ipc cleanups for other architectures, so it stuck out.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 08:45:10 -07:00
Ingo Molnar
306e440daf [PATCH] x86: i8253/i8259A lock cleanup
Introduce proper declarations for i8253_lock and i8259A_lock.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 08:45:10 -07:00
Heiko Carstens
5ee24d9594 [PATCH] s390: fix finish_arch_switch
Commit 4866cde064 requires finish_arch_switch
to have only one parameter instead of two.

Also fix another compile error (double declaration of account_system_vtime)
if CONFIG_VIRT_CPU_ACCOUNTING is not defined.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 08:45:08 -07:00
Russell King
cfb0810eab [PATCH] ARM: Don't try to send a signal to pid0
If we receive an unrecognised abort during boot, don't try to
send a signal to pid0, but instead report the current state.
This leads to less confusing debug reports.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-30 11:06:49 +01:00
Greg KH
bf164c790d Merge rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2005-06-29 22:54:31 -07:00
Greg Kroah-Hartman
23d3d602cb [PATCH] driver core: change bus_rescan_devices to return void
No one was looking at the return value of bus_rescan_devices, and it
really wasn't anything that anyone in the kernel would ever care about.
So change it which enabled some counting code to be removed also.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-29 22:48:04 -07:00
Cornelia Huck
0edb586049 [PATCH] driver core: add bus_find_device & driver_find_device functions
Add bus_find_device() and driver_find_device() which allow searching for a
device in the bus's resp. the driver's klist and obtain a reference on it.

Signed-off-by: Cornelia Huck <cohuck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-29 22:48:03 -07:00
Michael Ellerman
719d1cd867 [PATCH] ppc64: Replace custom locking code with a spinlock
The hvlpevent_queue (formally ItLpQueue) has a member called xInUseWord
which is used for serialising access to the queue. Because it's a word
(ie. 32 bit) there's a custom 32-bit version of test_and_set_bit() or
thereabouts in ItLpQueue.c.

The xInUseWord is not shared with they hypervisor, so we can replace it
with a spinlock and remove the custom code.

There is also another locking mechanism (ItLpQueueInProcess). This is
redundant because it's only manipulated while the lock's held. Remove it.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-06-30 15:17:02 +10:00
Michael Ellerman
ed094150bd [PATCH] ppc64: Simplify counting of lpevents, remove lpevent_count from paca
Currently there's a per-cpu count of lpevents processed, a per-queue (ie.
global) total count, and a count by event type.

Replace all that with a count by event for each cpu. We only need to add
it up int the proc code.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-06-30 15:16:09 +10:00
Michael Ellerman
74889802a1 [PATCH] ppc64: Don't count number of events processed for caller
Currently we count the number of lpevents processed in 3 seperate places.

One of these counters is never read, so just remove it. This means
hvlpevent_queue_process() no longer needs to return the number of events
processed.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-06-30 15:15:53 +10:00
Michael Ellerman
937b31b114 [PATCH] ppc64: Rename ItLpQueue_* functions to hvlpevent_queue_*
Now that we've renamed the xItLpQueue structure, rename the functions that
operate on it also.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-06-30 15:15:42 +10:00
Michael Ellerman
a61874648d [PATCH] ppc64: Rename xItLpQueue to hvlpevent_queue
The xItLpQueue is a queue of HvLpEvents that we're given by the Hypervisor.
Rename xItLpQueue to hvlpevent_queue and make the type struct hvlpevent_queue.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-06-30 15:15:32 +10:00
Michael Ellerman
0f6014b37e [PATCH] ppc64: Make two ItLpQueue related functions static
External parties don't need to use ItLpQueue_getNextLpEvent() or
ItLpQueue_clearValid(), they're internal to ItLpQueue.c

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-06-30 15:08:56 +10:00
Michael Ellerman
512d31d6a8 [PATCH] ppc64: Move initialisation of xItLpQueue into ItLpQueue.c
The xItLpQueue is initalised manually in iSeries_setup_arch().  Move
this code into ItLpQueue.c for a cleaner separation.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-06-30 15:08:27 +10:00
Michael Ellerman
1b19bc7214 [PATCH] ppc64: Don't pass the pointers to xItLpQueue around
Because there's only one ItLpQueue and we know where it is, ie. xItLpQueue,
there's no point passing pointers to it it around all over the place.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-06-30 15:07:57 +10:00
Michael Ellerman
bea248fb30 [PATCH] ppc64: Remove lpqueue pointer from the paca on iSeries
The iSeries code keeps a pointer to the ItLpQueue in its paca struct. But
all these pointers end up pointing to the one place, ie. xItLpQueue.

So remove the pointer from the paca struct and just refer to xItLpQueue
directly where needed.

The only complication is that the spread_lpevents logic was implemented by
having a NULL lpqueue pointer in the paca on CPUs that weren't supposed to
process events. Instead we just compare the spread_lpevents value to the
processor id to get the same behaviour.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-06-30 15:07:09 +10:00
Linus Torvalds
9b4311eedb Merge rsync://rsync.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 2005-06-29 21:03:03 -07:00
Linus Torvalds
0b35ff23b2 Merge master.kernel.org:/home/rmk/linux-2.6-serial 2005-06-29 21:00:38 -07:00
Linus Torvalds
92dd7ca0af Merge master.kernel.org:/home/rmk/linux-2.6-arm 2005-06-29 20:59:55 -07:00
Tony Luck
d18bfacff2 Auto merge with /home/aegl/GIT/linus 2005-06-29 15:21:41 -07:00
Russell King
026d02a236 [PATCH] Serial: Split 8250 port table (part 2)
Remove legacy ISA serial ports for Accent, Boca, Fourport, Hub6 and MCA
from the architecture specific serial.h include.

The only ports which remain in asm-*/serial.h are the platform specific
entries.  These should really be converted by platform maintainers to
use a platform device, such as can be found in
arch/arm/mach-footbridge/isa.c

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-29 18:45:19 +01:00
Richard Purdie
9ec3c75cde [PATCH] ARM: 2768/1: PXA: Add a required header file for LL_DEBUG
Patch from Richard Purdie

With DEBUG enabled, head.S includes arch/debug-macro.S. On the PXA, this
contains references to the macro io_p2v() so hardware.h needs to be
included.

Signed-off-by: Richard Purdie <rpurdie@openedhand.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-29 15:17:49 +01:00
Russell King
b720f73296 [PATCH] ARM: Convert ARM timer implementations to use readl/writel
Convert ARMs timer implementations to use readl/writel instead of accessing
the registers via a struct.

People have recently asked if accessing timers via a structure is the
"right way" and its not the Linux way.  So fix this code to conform to
"The Linux Way"(tm).

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2005-06-29 15:15:54 +01:00
Russell King
99a0616bcd Merge with ../linux-2.6-smp 2005-06-29 09:40:28 +01:00