The e1000 driver has a workaround for 82544 on PCI-X where if the
terminating byte of a buffer is at addresses 0-3 mod 8, then 4 bytes
are shaved off it and deferred to a new segment. This is due to an
erratum that could otherwise cause TX hangs.
Unfortunately this breaks TSO because it may cause the TCP header to
be split over two segments which itself causes TX hangs. The solution
is to pull 4 bytes of data up from the next segment rather than pushing
4 bytes off. This ensures the TCP header remains in one piece and
works around the PCI-X hang.
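
As an illustration of the address condition described above, here is a minimal sketch in C. The helper name e1000_sketch_ends_in_low_half() is hypothetical and is not the driver's code; it only shows that "terminating byte at addresses 0-3 mod 8" is the same as bit 2 of the last byte's address being clear.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the driver: returns true when the
 * last byte of a buffer of len bytes starting at addr sits at an address
 * whose value mod 8 is in the range 0..3, i.e. bit 2 of that address is
 * clear.
 */
static bool e1000_sketch_ends_in_low_half(uintptr_t addr, size_t len)
{
	uintptr_t last = addr + len - 1;

	return len > 0 && (last & 4) == 0;
}
```

When this test is true, the old workaround shortened the buffer by 4 bytes, while the fix described here lengthens the linear part by pulling data up from the next segment.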
This patch is based on one from Jesse Brandeburg.
This bug has been triggered by both CONFIG_DEBUG_SLAB and Xen.
Note that the only reason we don't see this normally is because the
TCP stack starts writing from the end, i.e., it writes the TCP header
first then slaps on the IP header, etc. So the end of the TCP header
(skb->tail - 1 here) is always aligned correctly.
Had we made the start of the IP header (e.g., IPv6) 8-byte aligned
instead, this would happen for normal TCP traffic as well.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
On suspend, handle pci_set_power_state errors, and on resume
handle failures in pci_restore_state().
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The PCI MSI and express state are already saved and restored by the
current versions of pci_save_state/pci_restore_state.
Therefore it is no longer necessary for the driver to do it.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Now that the IRQ is requested on open() and freed on close(),
we can safely switch from/to MSI without unloading the module.
We are guaranteed to correctly free IRQ even if the sysfs file got
written in the meantime since the MSI initialization is stored in
mgp->msi_enabled.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Request IRQ in myri10ge_open() and free in close() instead of probe()
and remove() to eliminate potential race between the watchdog and the
interrupt handler. Additionally, the interrupt handler won't get called
on shared irq anymore when the interface is down.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Since pci_save_state() pushes MSI and PCIe states on a kind of stack,
myri10ge saving the state in advance for parity recovery will push the
state again on the stack on suspend. This leads to a memory leak.
We add a couple additional calls to save_state and restore_state so
that we don't leak anymore.
For the future, we are thinking of a better way to recover from parity
error without using pci_save_state().
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The former option is removed and platform code can now specify the
expected behavior.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
net/core/netpoll.c::netpoll_send_skb() calls the poll handler when
it is available. As netconsole can be used from almost any context,
IRQ must not be enabled blindly in the NAPI handler of a driver which
supports netpoll.
b57bd06655 fixed the issue for the
8139too.c driver.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The netxen driver includes a private ioctl that provides access
to functionality that is already available in other ways. The PCI
layer has application access hooks (see setpci), and the statistics
are available in ethtool/netstats.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Currently after an interface up, the link state is detected 2 seconds later
when the first watchdog timer runs. This patch changes that by triggering
the hardware to generate a link-change interrupt from the up() function
instead. This has the result that the link state gets detected immediately
and without races. This has the potential to speed up booting since a normal
distribution boot process waits for a link before DHCP is attempted.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add 3 extra packet redirect counters for tracking purposes to make sure
we can test that all packets arrive properly.
Originally from Jesse Brandeburg <jesse.brandeburg@intel.com>,
rewritten to use feature flags by me.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Allow the user to vary the packet size below which copybreak applies. Currently cb is enabled
for packets < 256 bytes, but various tests indicate that this should be
configurable for specific use cases. In addition, this parameter allows us
to force never/always during testing to get full and predictable coverage of
both code paths.
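
A minimal userspace sketch of a configurable copybreak threshold follows. It assumes the usual meaning of copybreak in NIC drivers (small packets are copied into a right-sized buffer so the large receive buffer can be reused); the names rx_result and rx_copybreak() are hypothetical and do not come from the driver.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct rx_result {
	unsigned char *data;	/* buffer handed up the stack */
	int copied;		/* 1 if data is a freshly allocated copy */
};

/*
 * Hypothetical model of the receive path decision: packets shorter than
 * copybreak are copied into a small buffer, everything else is passed up
 * in the original receive buffer.
 */
static struct rx_result rx_copybreak(unsigned char *rx_buf, size_t len,
				     size_t copybreak)
{
	struct rx_result r = { rx_buf, 0 };

	if (len < copybreak) {
		unsigned char *small = malloc(len ? len : 1);

		if (small) {
			memcpy(small, rx_buf, len);
			r.data = small;
			r.copied = 1;
		}
	}
	return r;
}
```

In this model a threshold of 0 never copies and a threshold larger than any packet always copies, which corresponds to the never/always testing modes mentioned above.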
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Assign the PBA to be large enough to contain at least 2 jumbo frames on
all adapters. This dramatically increases performance on several adapters
and fixes TX performance degradation issues where the PBA was misallocated
in the old algorithm.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The driver has (ancient) code for messing with TIPG from the 82542 days.
Unfortunately this code was running on our current adapters and setting
TIPG for fiber to be +1 over the copper value. This caused 1.45Mpps
to be sent instead of 1.487Mpps.
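
For context, the 1.487Mpps figure is close to the theoretical maximum for minimum-size frames on gigabit Ethernet, and any increase in the inter-packet gap lowers that rate. The sketch below shows the arithmetic only; the function name line_rate_pps() is hypothetical, and it does not model how TIPG register units map to wire time, so it is not meant to reproduce the 1.45Mpps figure.

```c
#include <stdint.h>

/*
 * Frames per second at line rate for a given frame size, preamble and
 * inter-packet gap, all expressed in bytes. Illustrative arithmetic only.
 */
static uint64_t line_rate_pps(uint64_t bits_per_sec, unsigned int frame_bytes,
			      unsigned int preamble_bytes,
			      unsigned int gap_bytes)
{
	uint64_t bits_per_frame =
		8ULL * (frame_bytes + preamble_bytes + gap_bytes);

	return bits_per_sec / bits_per_frame;
}
```

With a 64-byte frame, 8 bytes of preamble and the standard 12-byte gap, each frame occupies 672 bit times, so line_rate_pps(1000000000, 64, 8, 12) yields 1488095.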
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
For older adapters we know that they are of the PCI bus type, so we can
just set this.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This bugfix makes sure that the driver data reflects the full new situation
before the adapter is reinitialized.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
On rare occasions, ESB2 systems would end up started without the RX
unit being turned on. Add a check that runs post-init to work around
this issue.
Originally from Jesse Brandeburg <jesse.brandeburg@intel.com>,
rewritten to use feature flags by me.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
CONFIG_DEBUG_SLAB changes alignments of the data structures the slab
allocators return. These break certain workarounds for TSO on the 82544.
Since DEBUG_SLAB is relatively rare and not used for performance sensitive
cases, the simplest fix is to disable TSO in this special situation.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
If the user has forced gigabit speed, phy power management must be disabled;
otherwise the NIC would try to negotiate to a linkspeed of 10/100 mbit on
shutdown, which would lead to a total loss of link. This loss of link breaks
Wake-on-Lan and IPMI.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Several bugs existed in how we handle manageability issues all
over the driver. This patch consolidates all the manageability
release and init code into two functions and calls them from the
appropriate locations. This fixes several BMC packet redirect issues
and powerup/down hiccups.
Originally from Jesse Brandeburg <jesse.brandeburg@intel.com>, rewritten
to use feature flags by me.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The 82543 chip does not count tx_carrier_errors properly in FD mode;
report zeros instead of garbage.
Originally from Jesse Brandeburg <jesse.brandeburg@intel.com>, rewritten
to use feature flags by me.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The dynamic interrupt rate control patches omitted proper counting
for jumbo frames and TSO, resulting in suboptimal interrupt mitigation strategies.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The lower 2 bits of a user-supplied itr setting (via ethtool) need to be
masked off: These lower two bits are used as control bits.
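
A minimal sketch of the masking, assuming a hypothetical helper itr_sanitize() and a hypothetical macro name for the two control bits:

```c
#include <stdint.h>

/* Hypothetical name for the two low bits that are used as control bits. */
#define ITR_CONTROL_MASK 0x3u

/* Clear the control bits from a user-supplied itr value. */
static uint32_t itr_sanitize(uint32_t user_itr)
{
	return user_itr & ~ITR_CONTROL_MASK;
}
```

For example, a user-supplied value of 8003 becomes 8000, while 8000 is left unchanged.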
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This also adds the required page "writeback" flag handling, that cifs
hasn't been doing and that the page dirty flag changes made obvious.
Acked-by: Steve French <smfltc@us.ibm.com>
Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We use the fixmap for accessing pci config space in pci_mmcfg_read/write().
The problem is in pci_exp_set_dev_base(). It is caching a last
accessed address to avoid calling set_fixmap_nocache() whenever
pci_mmcfg_read/write() is used.
	static inline void pci_exp_set_dev_base(int bus, int devfn)
	{
		u32 dev_base = base | (bus << 20) | (devfn << 12);
		if (dev_base != mmcfg_last_accessed_device) {
			mmcfg_last_accessed_device = dev_base;
			set_fixmap_nocache(FIX_PCIE_MCFG, dev_base);
		}
	}
        cpu0                                    cpu1
---------------------------------------------------------------------------
pci_mmcfg_read("device-A")
    pci_exp_set_dev_base()
        set_fixmap_nocache()
                                        pci_mmcfg_read("device-B")
                                            pci_exp_set_dev_base()
                                                set_fixmap_nocache()
pci_mmcfg_read("device-B")
    pci_exp_set_dev_base()
        /* doesn't flush tlb */
If the CPUs access config space in the above order, the second
pci_mmcfg_read() on cpu0 doesn't flush the TLB, because
"mmcfg_last_accessed_device" is device-B. So the second pci_mmcfg_read()
on cpu0 accesses device-A via a stale TLB entry. This problem caused
several kinds of strange behavior.
This patch fixes the situation by adding a "mmcfg_last_accessed_cpu" check.
[ Alternatively, we could make a per-cpu mapping area or something. Not
that it's probably worth it, but if we wanted to avoid all locking and
instead just disable preemption, that would be the way to go. --Linus ]
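
The effect of the added CPU check can be sketched with a small userspace model. Everything here is a stand-in: model_set_fixmap() only counts calls in place of set_fixmap_nocache(), the CPU number is passed in explicitly, and none of the model_* names exist in the kernel.

```c
#include <stdint.h>

static uint32_t last_accessed_device;
static int last_accessed_cpu = -1;
static unsigned int remap_count;	/* simulated set_fixmap_nocache() calls */

/* Stand-in for set_fixmap_nocache(FIX_PCIE_MCFG, dev_base). */
static void model_set_fixmap(uint32_t dev_base)
{
	(void)dev_base;
	remap_count++;
}

/* Remap when either the device or the accessing CPU has changed. */
static void model_set_dev_base(uint32_t base, int cpu, int bus, int devfn)
{
	uint32_t dev_base = base | ((uint32_t)bus << 20) |
			    ((uint32_t)devfn << 12);

	if (dev_base != last_accessed_device || cpu != last_accessed_cpu) {
		last_accessed_device = dev_base;
		last_accessed_cpu = cpu;
		model_set_fixmap(dev_base);
	}
}
```

Replaying the sequence from the diagram (cpu0 device-A, cpu1 device-B, cpu0 device-B) now remaps three times; with only the device comparison, the third access would have skipped the remap.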
Signed-off-by: OGAWA Hirofumi <hogawa@miraclelinux.com>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Clark Williams reported that suspend doesn't work on his laptop on
2.6.20-rc1-rt kernels. The bug was introduced by the following cleanup
commit:
commit 112cecb2cc
Author: Siddha, Suresh B <suresh.b.siddha@intel.com>
Date: Wed Dec 6 20:34:31 2006 -0800
[PATCH] suspend: don't change cpus_allowed for task initiating the suspend
because with this change 'error' is not initialized to 0 anymore, if
there are no other online CPUs. (i.e. if the system is single-CPU).
The fix is to initialize it to 0. The really weird thing is that my
version of gcc does not warn about this non-initialized variable
situation ...
(also fix the kernel printk in the error branch, it was missing a
newline)
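
The failure mode can be sketched with a small model, assuming a loop over the non-boot CPUs that assigns 'error' only inside its body. The function and callback names below are hypothetical and are not the kernel's code.

```c
/* Hypothetical callbacks standing in for taking a CPU offline. */
static int model_take_down_ok(int cpu)
{
	(void)cpu;
	return 0;
}

static int model_take_down_fail(int cpu)
{
	(void)cpu;
	return -1;
}

/*
 * Hypothetical model: CPU 0 is the boot CPU and stays online. On a
 * single-CPU system the loop body never runs, so 'error' keeps whatever
 * value it was given at declaration; hence the explicit "= 0".
 */
static int model_disable_nonboot_cpus(int online_cpus,
				      int (*take_down)(int cpu))
{
	int cpu;
	int error = 0;

	for (cpu = 1; cpu < online_cpus; cpu++) {
		error = take_down(cpu);
		if (error)
			break;
	}
	return error;
}
```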
Reported-by: Clark Williams <williams@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Thanks to Len Brown for testing this fix, since while they have in the
past, none of my machines run reiserfs at the moment.
Cc: Vladimir V. Saveliev <vs@namesys.com>
Acked-by: Len Brown <lenb@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make cancel_dirty_page() act more like all the other dirty and writeback
accounting functions: test for "mapping" being NULL, and do the
NR_FILE_DIRTY accounting purely based on mapping_cap_account_dirty().
Also, add it to the exports, so that modular filesystems can use it.
Acked-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
On the G965, the GTT size may be larger than is required to cover the
aperture. (In fact, on all hardware we've seen, the GTT is 512KB to the
aperture's 256MB). A previous commit forced the aperture size to 512MB on
G965 to match GTT, which would likely result in hangs at best if users
tried to rely on agpgart's aperture size information. Instead, we use the
resource length for the aperture size and the system's reported GTT size
when available for the GTT size.
Because the MSAC registers which had been read for aperture size detection
on i9xx chips just cause a change in the resource size, we can use generic
code for aperture detection on all i9xx.
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
A space and a bracket are missing (and indentation is wrong).
Signed-off-by: Brice Goglin <Brice.Goglin@ens-lyon.org>
Signed-off-by: Dave Jones <davej@redhat.com>
Fixes the oops in cpufreq_stats with acpi_cpufreq driver. The issue was
that the frequency was reported as 0 in acpi-cpufreq.c. The bug is due to
different indices for freq_table and ACPI perf table.
Also adds a check in cpufreq_stats to check for error return from
freq_table_get_index() and avoid using the error return value.
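
A minimal sketch of the defensive check, using hypothetical model_* names rather than the cpufreq_stats code: the lookup returns -1 when the frequency is not in the table, and the caller refuses to use that value as an array index.

```c
#include <stddef.h>

/* Return the index of freq in table, or -1 if it is not present. */
static int model_freq_table_get_index(const unsigned int *table, size_t n,
				      unsigned int freq)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (table[i] == freq)
			return (int)i;
	return -1;
}

/* Account delta to the state matching freq; fail on an unknown freq. */
static int model_account_time(unsigned long long *time_in_state,
			      const unsigned int *table, size_t n,
			      unsigned int freq, unsigned long long delta)
{
	int idx = model_freq_table_get_index(table, n, freq);

	if (idx < 0)
		return -1;	/* do not index the array with an error value */

	time_in_state[idx] += delta;
	return 0;
}
```

A frequency reported as 0, as in the bug described above, is rejected by this model instead of being used to index time_in_state.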
Patch fixes the issue reported at
http://www.ussg.iu.edu/hypermail/linux/kernel/0611.2/0629.html
and another similar issue here
http://bugme.osdl.org/show_bug.cgi?id=7383 comment 53
Signed-off-by: Dhaval Giani <dhaval.giani@gmail.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
Make x86_64 ACPI_CPU_FREQ select CPU_FREQ_TABLE like other methods do.
(although we should still eliminate as much use of 'select' as possible)
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
This patch is to speed up flipping of pages in and out of the AGP aperture as
needed by the new drm memory manager.
A number of global cache flushes are removed as well as some PCI posting flushes.
The following guidelines have been used:
1) Memory that is only mapped uncached and that has been subject to a global
cache flush after the mapping was changed to uncached does not need any more
cache flushes. Neither before binding to the aperture nor after unbinding.
2) Only do one PCI posting flush after a sequence of writes modifying page
entries in the GATT.
Signed-off-by: Thomas Hellstrom <thomas@tungstengraphics.com>
Signed-off-by: Dave Jones <davej@redhat.com>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (68 commits)
ACPI: replace kmalloc+memset with kzalloc
ACPI: Add support for acpi_load_table/acpi_unload_table_id
fbdev: update after backlight argument change
ACPI: video: Add dev argument for backlight_device_register
ACPI: Implement acpi_video_get_next_level()
ACPI: Kconfig - depend on PM rather than selecting it
ACPI: fix NULL check in drivers/acpi/osl.c
ACPI: make drivers/acpi/ec.c:ec_ecdt static
ACPI: prevent processor module from loading on failures
ACPI: fix single linked list manipulation
ACPI: ibm_acpi: allow clean removal
ACPI: fix git automerge failure
ACPI: ibm_acpi: respond to workqueue update
ACPI: dock: add uevent to indicate change in device status
ACPI: ec: Lindent once again
ACPI: ec: Change #define to enums there possible.
ACPI: ec: Style changes.
ACPI: ec: Acquire Global Lock under EC mutex.
ACPI: ec: Drop udelay() from poll mode. Loop by reading status field instead.
ACPI: ec: Rename gpe_bit to gpe
...
The function isdn_ppp_ccp_reset_alloc_state() sets ->timer.function
and ->timer.data and later on calls add_timer() with no init_timer()
ever done.
Noted by Al Viro.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Karsten Keil <kkeil@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[UDP]: Fix reversed logic in udp_get_port().
[IPV6]: Dumb typo in generic csum_ipv6_magic()
[SCTP]: make 2 functions static
[SCTP]: Fix typo adaption -> adaptation as per the latest API draft.
[SCTP]: Don't export include/linux/sctp.h to userspace.
[TCP]: Fix ambiguity in the `before' relation.
[ATM] drivers/atm/fore200e.c: Cleanups.
[ATM]: Remove dead ATM_TNETA1570 option.
NetLabel: correctly fill in unused CIPSOv4 level and category mappings
NetLabel: perform input validation earlier on CIPSOv4 DOI add ops
The logic in cfq_allow_merge() wasn't clear enough - basically allow
merging for the same queues only. Do a fast check for 'rq and bio both
sync/async' before doing the cfqq hash lookup.
This is verified to work with the fixed elv_try_merge() from commit
bb4067e341.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When this code was converted to use sk_for_each() the
logic for the "best hash chain length" code was reversed,
breaking everything.
The original code was of the form:
	size = 0;
	do {
		if (++size >= best_size_so_far)
			goto next;
	} while ((sk = sk->next) != NULL);
	best_size_so_far = size;
	best = result;
	next:;
and this got converted into:
	sk_for_each(sk2, node, head)
		if (++size < best_size_so_far) {
			best_size_so_far = size;
			best = result;
		}
Which does something very very different from the original.
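
The intended logic can be sketched in plain C as follows. This is a userspace model with hypothetical names (struct node, shortest_chain()), not the udp_get_port() code: a chain only becomes the new best after it has been walked to the end without reaching the current best length.

```c
#include <stddef.h>
#include <stdint.h>

struct node {
	struct node *next;
};

/*
 * Return the index of the bucket with the shortest chain. Counting a
 * chain stops as soon as it can no longer beat the best length so far;
 * the best is updated only after the whole chain has been counted.
 */
static size_t shortest_chain(struct node *const *heads, size_t nbuckets)
{
	size_t best = 0;
	size_t best_size_so_far = SIZE_MAX;
	size_t i;

	for (i = 0; i < nbuckets; i++) {
		const struct node *n;
		size_t size = 0;
		int too_long = 0;

		for (n = heads[i]; n != NULL; n = n->next) {
			if (++size >= best_size_so_far) {
				too_long = 1;
				break;
			}
		}
		if (too_long)
			continue;

		best_size_so_far = size;
		best = i;
	}
	return best;
}
```

The broken conversion updated the best values inside the walk, after seeing only the first node or nodes of a chain, so the recorded "best" length did not reflect the real chain length.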
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes the following needlessly global functions static:
- ipv6.c: sctp_inet6addr_event()
- protocol.c: sctp_inetaddr_event()
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ivan Skytte Jorgensen <isj-sctp@i1.dk>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This file contains protocol definitions and there are no SCTP apps
that use this file.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While looking at DCCP sequence numbers, I stumbled over a problem with
the following definition of before in tcp.h:
	static inline int before(__u32 seq1, __u32 seq2)
	{
		return (__s32)(seq1-seq2) < 0;
	}
Problem: This definition suffers from an ambiguity, i.e. always
	before(a, (a + 2^31) % 2^32) = 1
	before((a + 2^31) % 2^32, a) = 1
In text: when the difference between a and b amounts to 2^31,
a is always considered `before' b; the function cannot decide.
The reason is that implicitly 0 is `before' 1 ... 2^31-1 ... 2^31
Solution: There is a simple fix, by defining before in such a way that
0 is no longer `before' 2^31, i.e. 0 `before' 1 ... 2^31-1
By not using the middle between 0 and 2^32, before can be made
unambiguous.
This is achieved by testing whether seq2-seq1 > 0 (using signed
32-bit arithmetic).
I attach a patch to codify this. Also, since the `after' relation is basically
a redefinition of `before', it is now defined as a macro after before.
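
The two definitions can be compared side by side in a small userspace sketch. The names before_old(), before_new() and after_new() are hypothetical, chosen so both variants can coexist; the conversion of an out-of-range unsigned value to int32_t is implementation-defined in C, and the sketch assumes the usual two's complement behavior.

```c
#include <stdint.h>

/* Original definition: ambiguous when the distance is exactly 2^31. */
static inline int before_old(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) < 0;
}

/* Definition described above: test seq2 - seq1 > 0 in signed arithmetic. */
static inline int before_new(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq2 - seq1) > 0;
}

/* `after' expressed in terms of `before', with the arguments swapped. */
#define after_new(seq2, seq1) before_new(seq1, seq2)
```

With a = 1000 and b = a + 2^31, before_old() returns 1 in both directions, whereas before_new() returns 0 in both directions, so neither value is considered `before' the other. Ordinary comparisons, including those across the 2^32 wrap, are unchanged.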
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch contains the following transformations from custom functions
to standard kernel version:
- fore200e_kmalloc() -> kzalloc()
- fore200e_kfree() -> kfree()
- fore200e_swap() -> cpu_to_be32()
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the unconverted ATM_TNETA1570 option that also lacks
any code in the kernel.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Back when the original NetLabel patches were being changed to use Netlink
attributes correctly some code was accidentally dropped which set all of the
undefined CIPSOv4 level and category mappings to a sentinel value. The result
is the mappings data in the kernel contains bogus mappings which always map to
zero. This patch restores the old/correct behavior by initializing the mapping
data to the correct sentinel value.
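
The idea of sentinel-initialized mappings can be sketched as below. The sentinel value, the macro name MAP_INVALID and the functions map_init() and map_lookup() are all hypothetical and are not the NetLabel code; the point is only that unset entries must be distinguishable from a real mapping to zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel marking an entry that has no mapping. */
#define MAP_INVALID 0x80000000u

/* Mark every entry as unmapped before any real mappings are filled in. */
static void map_init(uint32_t *map, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		map[i] = MAP_INVALID;
}

/* Look up key; return 0 and store the value, or -1 if there is no mapping. */
static int map_lookup(const uint32_t *map, size_t n, size_t key,
		      uint32_t *out)
{
	if (key >= n || map[key] == MAP_INVALID)
		return -1;

	*out = map[key];
	return 0;
}
```

Without the map_init() step, a zero-filled table would make every undefined entry look like a valid mapping to zero, which is the bogus behavior described above.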
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
There are a couple of cases where validation of the user input for a CIPSOv4
DOI add operation was not being done soon enough; the result was unexpected
behavior that caused oopses/panics/lockups on some platforms. This patch moves
the existing input validation code earlier in the code path to protect against
bogus user input.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>