Fix a subtle race condition between tg3_start_xmit() and tg3_tx()
discovered by Herbert Xu <herbert@gondor.apana.org.au>:
CPU0 CPU1
tg3_start_xmit()
if (tx_ring_full) {
tx_lock
tg3_tx()
if (!netif_queue_stopped)
netif_stop_queue()
if (!tx_ring_full)
update_tx_ring
netif_wake_queue()
tx_unlock
}
Even though tx_ring is updated before the if statement in tg3_tx() in
program order, it can be re-ordered by the CPU as shown above. This
scenario can cause the tx queue to be stopped forever if tg3_tx() has
just freed up the entire tx_ring. The possibility of this happening
should be very rare though.
The following changes are made:
1. Add memory barrier to fix the above race condition.
2. Eliminate the private tx_lock altogether and rely solely on
netif_tx_lock. This eliminates one spinlock in tg3_start_xmit()
when the ring is full.
3. Because of 2, use netif_tx_lock in tg3_tx() before calling
netif_wake_queue().
4. Change TX_BUFFS_AVAIL to an inline function with a memory barrier.
Herbert and David suggested using the memory barrier instead of
volatile.
5. Check for the full wake queue condition before getting
netif_tx_lock in tg3_tx(). This reduces the number of unnecessary
spinlocks when the tx ring is full in a steady-state condition.
6. Update version to 3.65.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
All caller of netdev_alloc_skb need to assign skb->dev shortly
afterwards. Move it into common code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Handle dev_alloc_skb() failures when initializing the RX rings.
Without proper handling, the driver will crash when using a partial
ring.
Thanks to Stephane Doyon <sdoyon@max-t.com> for reporting the bug and
providing the initial patch.
Howie Xu <howie@vmware.com> also reported the same issue.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add tg3_restart_hw() to handle failures when re-initializing the
device.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the AMD 8131 bridge to the list of chipsets that reorder writes.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable ipv6 TSO feature on chips that support it.
Update version to 3.61.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[IPV6]: Added GSO support for TCPv6
[NET]: Generalise TSO-specific bits from skb_setup_caps
[IPV6]: Added GSO support for TCPv6
[IPV6]: Remove redundant length check on input
[NETFILTER]: SCTP conntrack: fix crash triggered by packet without chunks
[TG3]: Update version and reldate
[TG3]: Add TSO workaround using GSO
[TG3]: Turn on hw fix for ASF problems
[TG3]: Add rx BD workaround
[TG3]: Add tg3_netif_stop() in vlan functions
[TCP]: Reset gso_segs if packet is dodgy
Use GSO to workaround a rare TSO bug on some chips. This hardware
bug may be triggered when the TSO header size is greater than 80
bytes. When this condition is detected in a TSO packet, the driver
will use GSO to segment the packet to workaround the hardware bug.
Thanks to Juergen Kreileder <jk@blackdown.de> for reporting the
problem and collecting traces to help debug the problem.
And thanks to Herbert Xu <herbert@gondor.apana.org.au> for providing
the GSO mechanism that happens to be the perfect workaround for this
problem.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clear a bit to enable a hardware fix for some ASF related problem.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add workaround to limit the burst size of rx BDs being DMA'ed to the
chip. This works around hardware errata on a number of 5750, 5752,
and 5755 chips.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add tg3_netif_stop() when changing the vlgrp (vlan group) pointer. It
is necessary to quiesce the device before changing that pointer.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
One thing this change pointed out was that we really should
pull the "get 'local-mac-address' property" logic into a helper
function all the network drivers can call.
Signed-off-by: David S. Miller <davem@davemloft.net>
Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not
going to scale if we add any more segmentation methods (e.g., DCCP). So
let's merge them.
They were used to tell the protocol of a packet. This function has been
subsumed by the new gso_type field. This is essentially a set of netdev
feature bits (shifted by 16 bits) that are required to process a specific
skb. As such it's easy to tell whether a given device can process a GSO
skb: you just have to and the gso_type field and the netdev's features
field.
I've made gso_type a conjunction. The idea is that you have a base type
(e.g., SKB_GSO_TCPV4) that can be modified further to support new features.
For example, if we add a hardware TSO type that supports ECN, they would
declare NETIF_F_TSO | NETIF_F_TSO_ECN. All TSO packets with CWR set would
have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO
packets would be SKB_GSO_TCPV4. This means that only the CWR packets need
to be emulated in software.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu pointed out that it is unsafe to call netif_tx_disable()
from LLTX drivers because it uses dev->xmit_lock to synchronize
whereas LLTX drivers use private locks.
Convert tg3 to non-LLTX to fix this issue. tg3 is a lockless driver
where hard_start_xmit and tx completion handling can run concurrently
under normal conditions. A tx_lock is only needed to prevent
netif_stop_queue and netif_wake_queue race condtions when the queue
is full.
So whether we use LLTX or non-LLTX, it makes practically no
difference.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove tx_lock where it is unnecessary. tg3 runs lockless and so it
requires interrupts to be disabled and sync'ed, netif_queue and NAPI
poll to be stopped before the device can be reconfigured. After
stopping everything, it is no longer necessary to get the tx_lock.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add recovery logic when we suspect that the system is re-ordering
MMIOs. Re-ordered MMIOs to the send mailbox can cause bogus tx
completions and hit BUG_ON() in the tx completion path.
tg3 already has logic to handle re-ordered MMIOs by flushing the MMIOs
that must be strictly ordered (such as the send mailbox). Determining
when to enable the flush is currently a manual process of adding known
chipsets to a list.
The new code replaces the BUG_ON() in the tx completion path with the
call to tg3_tx_recover(). It will set the TG3_FLAG_MBOX_WRITE_REORDER
flag and reset the chip later in the workqueue to recover and start
flushing MMIOs to the mailbox.
A message to report the problem will be printed. We will then decide
whether or not to add the host bridge to the list of chipsets that do
re-ordering.
We may add some additional code later to print the host bridge's ID so
that the user can report it more easily.
The assumption that re-ordering can only happen on x86 systems is also
removed.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add PCI ID for BCM5786 which is a variant of 5787.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Get rid of all the SUN_570X logic and instead:
1) Make sure MEMARB_ENABLE is set when we probe the SRAM
for config information. If that is off we will get
timeouts.
2) Always try to sync with the firmware, if there is no
firmware running do not treat it as an error and instead
just report it the first time we notice this condition.
3) If there is no valid SRAM signature, assume the device
is onboard by setting TG3_FLAG_EEPROM_WRITE_PROT.
Update driver version and release date.
With help from Michael Chan and Fabio Massimo Di Nitto.
Signed-off-by: David S. Miller <davem@davemloft.net>
Add some missing rx error counters for 5705 and newer chips.
Update version to 3.58.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Even with fiber cards ethtool reports that the connected port is TP,
the patch fix this.
Signed-off-by: Karsten Keil <kkeil@suse.de>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tg3_run_loopback doesn't check that dev_alloc_skb() returns anything
useful.
Even if dev_alloc_skb() fails to return an skb to us we'll happily go
on and assume it did, so we risk dereferencing a NULL pointer. Much
better to fail gracefully by returning -ENOMEM than crashing here.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix bug in nvram write function. If the starting nvram address offset
happens to be the last dword of the page, the NVRAM_CMD_LAST bit will
not get set in the existing code. This patch fixes the bug by changing
the "else if" to "if" so that the last dword condition always gets
checked.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a reset_phy parameter to tg3_reset_hw() and tg3_init_hw(). With
the full chip reset during MAC address change, the automatic PHY reset
during chip reset will cause a link down and bonding will not work
properly as a result. With this reset_phy parameter, we can do a chip
reset without link down when changing MAC address or MTU.
Signed-off-by: Gary Zambrano <zambrano@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do the full chip reset when changing MAC address if ASF is enabled.
ASF sometimes uses a different MAC address than the driver. Without
the reset, the ASF MAC address may be overwritten when the driver's
MAC address is changed.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add some PHY workaround code to reduce jitter on some PHYs.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add netif_carrier_off() call during tg3_phy_reset(). This is needed
to properly track the netif_carrier state in cases where we do a
PHY reset with interrupts disabled. The SerDes code will not run
properly if the netif_carrier state is wrong.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Speed up SRAM read and write functions if possible by using MMIO
instead of config. cycles. With this change, the post reset signature
done at the end of D3 power change must now be moved before the D3
power change.
IBM reported a problem on powerpc blades during ethtool self test that
was caused by the memory test taking excessively long. Config. cycles
are very slow on powerpc and the memory test can take more than 10
seconds to complete using config. cycles.
David Miller informed me that an earlier version of the patch caused
problems on sparc64 systems with built-in tg3 chips. This version
fixes the problem by excluding all SUN built-in tg3 chips from doing
MMIO SRAM access.
TG3_FLAG_EEPROM_WRITE_PROT is also set unconditionally when
TG3_FLG2_SUN_570X is set. This should be sane as all SUN chips are
built-in and do not require Vaux switching.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kill the TG3_FLAG_NO_{TX|RX}_PSEUDO_CSUM flags because they are not
very useful. This will free up some bits for new flags.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a memory leak (buf wasn't freed) spotted by the
Coverity checker.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (48 commits)
Documentation: fix minor kernel-doc warnings
BUG_ON() Conversion in drivers/net/
BUG_ON() Conversion in drivers/s390/net/lcs.c
BUG_ON() Conversion in mm/slab.c
BUG_ON() Conversion in mm/highmem.c
BUG_ON() Conversion in kernel/signal.c
BUG_ON() Conversion in kernel/signal.c
BUG_ON() Conversion in kernel/ptrace.c
BUG_ON() Conversion in ipc/shm.c
BUG_ON() Conversion in fs/freevxfs/
BUG_ON() Conversion in fs/udf/
BUG_ON() Conversion in fs/sysv/
BUG_ON() Conversion in fs/inode.c
BUG_ON() Conversion in fs/fcntl.c
BUG_ON() Conversion in fs/dquot.c
BUG_ON() Conversion in md/raid10.c
BUG_ON() Conversion in md/raid6main.c
BUG_ON() Conversion in md/raid5.c
Fix minor documentation typo
BFP->BPF in Documentation/networking/tuntap.txt
...
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Undo commit 100c467330
MMIOs timeout more quickly that PCI config cycles and some
of these SRAM accesses can take a very long time, triggering
the MMIO limits on some sparc64 PCI controllers and thus
resulting in bus timeouts and bus errors.
Signed-off-by: David S. Miller <davem@davemloft.net>
Skip the main timer code if interrupts are disabled in the full lock
state.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Speed up SRAM read and write functions if possible by using MMIO
instead of config. cycles. With this change, the post reset signature
done at the end of D3 power change must now be moved before the D3
power change.
IBM reported a problem on powerpc blades during ethtool self test
that was caused by the memory test taking excessively long. Config.
cycles are very slow on powerpc and the memory test can take more
than 10 seconds to complete using config. cycles. As a result, NETDEV
WATCHDOG can be triggered during self test and the chip can end up in
a funny state.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Need to check the TG3_FLAG_40BIT_DMA_BUG flag in the workaround code
path instead of device flags.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some older bootcode in some devices may report 0 MAC address in
SRAM when booting up from low power state. This patch fixes the
problem by checking for a valid MAC address in SRAM and falling back
to NVRAM if necessary.
Thanks to walt <wa1ter@myrealbox.com> for reporting the problem
and helping to debug it.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for new chip 5755 which is very similar to 5787.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>