Fix a problem where simple ACKs can be sent ahead of RDMA read
responses thus implicitly NAKing the RDMA read.
Signed-off-by: Ralph Campbell <ralph.cambpell@qlogic.com>
Signed-off-by: Robert Walsh <robert.walsh@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix a missing unlock in ipath_rc_rcv_resp() and remove an extra unlock
from ipath_rc_rcv_error().
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The HW now posts 2 ABORT_RPL and/or PEER_ABORT_REQ messages. We need
to handle them by silenty dropping the 1st but mark that we're ready
for the final message. This plugs some close races between the uP and
HW. Also update the minimum required firmware version.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
I noticed that many source files include <linux/pci.h> while they do
not appear to need it. Here is an attempt to clean it all up.
In order to find all possibly affected files, I searched for all
files including <linux/pci.h> but without any other occurence of "pci"
or "PCI". I removed the include statement from all of these, then I
compiled an allmodconfig kernel on both i386 and x86_64 and fixed the
false positives manually.
My tests covered 66% of the affected files, so there could be false
positives remaining. Untested files are:
arch/alpha/kernel/err_common.c
arch/alpha/kernel/err_ev6.c
arch/alpha/kernel/err_ev7.c
arch/ia64/sn/kernel/huberror.c
arch/ia64/sn/kernel/xpnet.c
arch/m68knommu/kernel/dma.c
arch/mips/lib/iomap.c
arch/powerpc/platforms/pseries/ras.c
arch/ppc/8260_io/enet.c
arch/ppc/8260_io/fcc_enet.c
arch/ppc/8xx_io/enet.c
arch/ppc/syslib/ppc4xx_sgdma.c
arch/sh64/mach-cayman/iomap.c
arch/xtensa/kernel/xtensa_ksyms.c
arch/xtensa/platform-iss/setup.c
drivers/i2c/busses/i2c-at91.c
drivers/i2c/busses/i2c-mpc.c
drivers/media/video/saa711x.c
drivers/misc/hdpuftrs/hdpu_cpustate.c
drivers/misc/hdpuftrs/hdpu_nexus.c
drivers/net/au1000_eth.c
drivers/net/fec_8xx/fec_main.c
drivers/net/fec_8xx/fec_mii.c
drivers/net/fs_enet/fs_enet-main.c
drivers/net/fs_enet/mac-fcc.c
drivers/net/fs_enet/mac-fec.c
drivers/net/fs_enet/mac-scc.c
drivers/net/fs_enet/mii-bitbang.c
drivers/net/fs_enet/mii-fec.c
drivers/net/ibm_emac/ibm_emac_core.c
drivers/net/lasi_82596.c
drivers/parisc/hppb.c
drivers/sbus/sbus.c
drivers/video/g364fb.c
drivers/video/platinumfb.c
drivers/video/stifb.c
drivers/video/valkyriefb.c
include/asm-arm/arch-ixp4xx/dma.h
sound/oss/au1550_ac97.c
I would welcome test reports for these files. I am fine with removing
the untested files from the patch if the general opinion is that these
changes aren't safe. The tested part would still be nice to have.
Note that this patch depends on another header fixup patch I submitted
to LKML yesterday:
[PATCH] scatterlist.h needs types.h
http://lkml.org/lkml/2007/3/01/141
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
These are all the remaining instances of get_property. Simple rename of
get_property to of_get_property.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Fix TERMINATE layer, type, and ecode values based on
conformance testing.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix the pending mmap code so it doesn't corrupt the list of pending
mmaps and crash the machine when pending mmaps are destroyed without
first being mapped. Also, remove an unused variable, and use standard
kernel lists instead of our own homebrewed linked list implementation
to keep the pending mmap list.
Signed-off-by: Robert Walsh <robert.walsh@qlogic.com>
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
With mthca, RC QPs can starve each other and even UD QPs on the same
hardware schedule queue. As a result, userspace MPI can starve
e.g. IPoIB traffic, with netdev watchdog warnings getting printed out,
and TCP connections getting stuck or failing.
Reduce the chance of this happening by using three separate hardware
schedule queues: one for userspace RC QPs, one for kernel RC QPs, and
one for all other QPs.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This fixes a problem which causes too many RC timeouts and
retransmits.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch fixes the problem reported by Bernd Schubert <bs@q-leap.de>
with kernel debug options enabled:
BUG: at kernel/lockdep.c:1860 trace_hardirqs_on()
This was caused by using spin_lock_irq()/spin_unlock_irq() from
interrupt context. Fix all the places that might be called from
interrupts to use spin_lock_irqsave()/spin_unlock_irqrestore().
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: (49 commits)
IB: Set class_dev->dev in core for nice device symlink
IB/ehca: Implement modify_port
IB/umad: Clarify documentation of transaction ID
IPoIB/cm: spin_lock_irqsave() -> spin_lock_irq() replacements
IB/mad: Change SMI to use enums rather than magic return codes
IB/umad: Implement GRH handling for sent/received MADs
IB/ipoib: Use ib_init_ah_from_path to initialize ah_attr
IB/sa: Set src_path_bits correctly in ib_init_ah_from_path()
IB/ucm: Simplify ib_ucm_event()
RDMA/ucma: Simplify ucma_get_event()
IB/mthca: Simplify CQ cleaning in mthca_free_qp()
IB/mthca: Fix mthca_write_mtt() on HCAs with hidden memory
IB/mthca: Update HCA firmware revisions
IB/ipath: Fix WC format drift between user and kernel space
IB/ipath: Check that a UD work request's address handle is valid
IB/ipath: Remove duplicate stuff from ipath_verbs.h
IB/ipath: Check reserved memory keys
IB/ipath: Fix unit selection when all CPU affinity bits set
IB/ipath: Don't allow QPs 0 and 1 to be opened multiple times
IB/ipath: Disable IB link earlier in shutdown sequence
...
To clearly state the intent of copying from linear sk_buffs, _offset being a
overly long variant but interesting for the sake of saving some bytes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now to convert the last one, skb->data, that will allow many simplifications
and removal of some of the offset helpers.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
on 64bit architectures, allowing us to combine the 4 bytes hole left by the
layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
:-)
Many calculations that previously required that skb->{transport,network,
mac}_header be first converted to a pointer now can be done directly, being
meaningful as offsets or pointers.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the common, open coded 'skb->h.raw = skb->data' operation, so that we can
later turn skb->h.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.
This one touches just the most simple cases:
skb->h.raw = skb->data;
skb->h.raw = {skb_push|[__]skb_pull}()
The next ones will handle the slightly more "complex" cases.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
One less thing for drivers writers to worry about.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All RDMA drivers except ehca set class_dev->dev to their dma_device
value (ehca leaves this unset). dma_device is the only value that
makes any sense, so move this assignment to core/sysfs.c. This reduce
the duplicated code in the rest of the drivers and gives ehca a nice
/sys/class/infiniband/ehcaX/device symlink.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add "Modify Port" verb support to eHCA driver. The IB communication
manager needs this to set the IsCM port capability bit when
initializing.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
mthca_free_qp() already has local variables to hold the QP's send_cq
and recv_cq, so we can slightly clean up the calls to mthca_cq_clean()
by using those local variables instead of expressions like
to_mcq(qp->ibqp.send_cq).
Also, by cleaning the recv_cq first, we can avoid worrying about
whether the QP is attached to an SRQ for the second call, because we
would only clean send_cq if send_cq is not equal to recv_cq, and that
means send_cq cannot have any receive completions from the QP being
destroyed.
All this work even improves the generated code a bit:
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-5 (-5)
function old new delta
mthca_free_qp 510 505 -5
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Commit b2875d4c ("IB/mthca: Always fill MTTs from CPU") causes a crash
in mthca_write_mtt() with non-memfree HCAs that have their memory
hidden (that is, have only two PCI BARs instead of having a third BAR
that allows access to the RAM attached to the HCA) on 64-bit
architectures. This is because the commit just before, c20e20ab
("IB/mthca: Merge MR and FMR space on 64-bit systems") makes
dev->mr_table.fmr_mtt_buddy equal to &dev->mr_table.mtt_buddy and
hence mthca_write_mtt() tries to write directly into the HCA's MTT
table. However, since that table is in the HCA's memory, this is
impossible without the PCI BAR that gives access to that memory.
This causes a crash because mthca_tavor_write_mtt_seg() basically
tries to dereference some offset of a NULL pointer. Fix this by
adding a test of MTHCA_FLAG_FMR in mthca_write_mtt() so that we always
use the WRITE_MTT firmware command rather than writing directly if
FMRs are not enabled.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The kernel ib_wc structure now uses a QP pointer, but the user space
equivalent uses a QP number instead. This means we can no longer use
a simple structure copy to copy stuff into user space.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Don't let userspace use the direct-physical-map L_key or R_key.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
At some point things changed so that all the affinity bits can be set,
but cpus_full() macro is not true. This caused problems with the unit
selection logic on multi-unit (board) configurations.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Move the code that shuts down the IB link earlier in the unload
process, to be sure no new packets can arrive while we are unloading.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
To prevent random utility reads and writes of the diag interface to the
chip, we first require a handshake of reading from offset 0 and writing
to offset 0 before any other reads or writes can be done through the
diags device. Otherwise chip errors can be triggered.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If the chip is no longer usable, LEDs should be turned off so system
can be found easily in the cluster.
Also some minor reorganizing so both chips print hardware error
message at same point and only if there were unrecovered errors
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Re-init of the kernel structures after a chip reset was leaving the
portdata structure for port zero in an inconsistent state, and a
pointer to it either stale (in re-init code) or NULL (in devdata)
Fixing the order of operations on this struct, and the condition for
interrupt access, prevents the crashes.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Due to a chip bug, the PIOAvail register is not always updated to
memory. This patch allows userspace to force an update.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
In initialization, if we bailed at chip specific initialization, we
forgot to clean up the irq we had requested.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch fixes a bug where multicast packets without a GRH were not
being dropped as per the IB spec.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If the module parameter "kpiobufs" is set too high, the calculation to
reset it to a sane value was incorrect.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix RDMA read response length checking for RDMA_READ_RESPONSE_ONLY to
allow a zero length response. RDMA read responses which don't match
the expected length or occur in response to some other operation
should generate a completion queue error (see table 56, ch. 9.9.2.3 in
the IB spec).
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Improve port-sharing performance by allowing any process to receive
packets from the shared hardware port under a spin lock for mutual
exclusion. Previously, one process was nominated as the master and
that process was responsible for receiving all packets from the shared
hardware port and either consuming them or forwarding them to their
destination. This led to starvation problems for other processes when
the master process was busy in computation phases.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The port sharing feature mixed kernel virtual addresses as well as
physical addresses for the offset used to describe the mmap address to
map the InfiniPath hardware into user space. This had a conflict on
powerpc. The new scheme converts it to a physical address so it
doesn't conflict with chip addresses and yet still fits in 40/44 bits
so it isn't truncated by 32-bit applications calling mmap64().
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If a receive work request has been removed from the queue but has not
had a CQ entry generated for it and the QP is modified to the error
state, the completion entry generated is incorrect. This patch fixes
the problem.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Code was converted from a &= ~mask to clear_bit, but the bit was left
shifted instead of being used directly, so we were either trashing
memory several pages away, or sometimes taking a kernel page fault on
an invalid page.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Some types of packet errors are moderately common with longer IB
cables and large clusters, and are not reported with prints by other
IB HCA drivers. This suppresses those messages unless the new
__IPATH_ERRPKTDBG bit is set in ipath_debug. Reporting of temporarily
disabled frequent error interrupts was also made clearer
We also distinguish between chip errors, and bad packets sent or
received in the wording of the messages.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch fixes a number of bugs with updating the PSN for retries of
RC requests.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When switching to the QP error state, the completion queue entries
(error or flush) were not being generated correctly.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ipath_dbg doesn't need the same prefixes that printk does.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch adds support for multiple RDMA reads and atomics to be sent
before an ACK is required to be seen by the requester.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If a post send is done in loopback and there is no receive queue
entry, the sending QP is put on a timeout list for a while so the
receiver has a chance to post a receive buffer. If the another post
send is done, the code incorrectly tried to put the QP on the timeout
list again an corrupted the timeout list. This eventually leads to a
spin lock deadlock NMI due to the timer function looping forever with
the lock held.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
A silly programming error causes a CQ entry to not be generated if a
SRQ limit event is generated.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
A recent change was made to allocate memory for a port after CPU
affinity is set. That change didn't account for subports and was
trying to allocate memory for the port twice.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The chip documentation on the expected TID vs eager TID parity error
bits was reversed from what was implemented in the RTL, for both
chips. This corrects the definitions.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The loop which initializes the user memory region from an array of
pages was using the wrong limit for the array. This worked OK when
dma_map_sg() returned the same number as the number of pages. This
patch fixes the problem.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This is a sticky state. It is useful for diagnosing problems with
boards versus cable/switch problems.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
In mthca_arbel_fmr_unmap(), the high bits of the key are masked off.
This gets rid of the effect of adjust_key(), which makes sure that
bits 3 and 23 of the key are equal when the Sinai throughput
optimization is enabled, and so it may happen that an FMR will end up
with bits 3 and 23 in the key being different. This causes data
corruption, because when enabling the throughput optimization, the
driver promises the HCA firmware that bits 3 and 23 of all memory keys
will always be equal.
Fix by re-applying adjust_key() after masking the key.
Thanks to Or Gerlitz for reproducing the problem, and Ariel Shahar for
help in debug.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
As of commit 6cdbd77e ("cxgb3 - missing CPL hanler and register
setting."), the cxgb3 ethernet NIC driver no longer handles SET_TCB
replies, so we need to do it in the iWARP driver.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Commit c20e20ab ("IB/mthca: Merge MR and FMR space on 64-bit systems")
swapped the number of MTTs and MPTs when initializing the MR table. As
a result, we get a kernel oops when the number of MTT segments
allocated exceeds 0x20000.
Noted by Troy Benjegerdes <troy@scl.ameslab.gov>, and reproduced by
Dotan Barak <dotanb@mellanox.co.il>. This fixes
https://bugs.openfabrics.org/show_bug.cgi?id=490
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This was spotted by the Coverity checker (CID 1554).
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
eHCA scaling code must not depend on register_cpu_notifier() if
CONFIG_HOTPLUG_CPU is not set, so put all related code into #ifdefs.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix memory region permission problems:
- remove useless and redundant iwch_mem_perms enum.
- create ib_to_tpt_access_rights() for mapping ib access rights
to T3 TPT permissions.
- create ib_to_mwbind_access_rights() for mapping ib access rights
to T3 MWBIND WR permissions.
- fix up the mem reg code to utilize the new functions.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Only print one AE error for a given connection in the kernel log.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Stop the endpoint timer when the MPA exchange is aborted by the peer.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Change iwch_destroy_qp() to always move the QP to ERROR and let
iwch_modify_qp() decide what to do.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fixes for "normal close" failures:
- Start normal close timer when moving to CLOSING state.
- Handle ABORTING state in close_con_rpl().
- Stop timer correctly on abort during a normal close.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
cxgb3 uses dma_alloc_coherent() et al. thus needs linux/dma-mapping.h
include in order to build reliably.
Noticed on sparc64.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If the consumer rejects the connection we end up under-referencing the
endpoint structure. The fix is to call iwch_ep_disconnect() instead
of the low level disconnect functions so that the endpoint close timer
is started correctly.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The garbled logic in mthca_alloc_memfree() causes it to return 0, even
if it fails to allocate all doorbell records. Fix it to return -ENOMEM
when it fails.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch fixes two issues reported by Roland Dreier and Christoph Hellwig:
- Mismatched sync/locking between completion handler and destroy cq We
introduced a counter nr_events per cq to track number of irq events
seen. This counter is incremented when an event queue entry is seen
and decremented after completion handler has been called regardless
if scaling code is active or not. Note that nr_callbacks tracks
number of events assigned to a cpu and both counters can potentially
diverge.
The sync between running completion handler and destroy cq is done
by using the global spin lock ehca_cq_idr_lock.
- Replace yield by wait_event on the counter above to become zero.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Stop the ep timer in ec_status() if the status indicates a
bad close.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- don't mark static functions in C files as inline - gcc should know
best whether inlining makes sense
- never compile the unused cxio_dbg.c
- make the following needlessly global functions static:
- cxio_hal.c: cxio_hal_clear_qp_ctx()
- iwch_provider.c: iwch_get_qp()
- remove the following unused global functions:
- cxio_hal.c: cxio_allocate_stag()
- cxio_resource.: cxio_hal_get_rhdl()
- cxio_resource.: cxio_hal_put_rhdl()
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch makes the needlessly global functions mthca_tavor_write_mtt_seg()
and mthca_arbel_write_mtt_seg() static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (25 commits)
Documentation/kernel-docs.txt update.
arch/cris: typo in KERN_INFO
Storage class should be before const qualifier
kernel/printk.c: comment fix
update I/O sched Kconfig help texts - CFQ is now default, not AS.
Remove duplicate listing of Cris arch from README
kbuild: more doc. cleanups
doc: make doc. for maxcpus= more visible
drivers/net/eexpress.c: remove duplicate comment
add a help text for BLK_DEV_GENERIC
correct a dead URL in the IP_MULTICAST help text
fix the BAYCOM_SER_HDX help text
fix SCSI_SCAN_ASYNC help text
trivial documentation patch for platform.txt
Fix typos concerning hierarchy
Fix comment typo "spin_lock_irqrestore".
Fix misspellings of "agressive".
drivers/scsi/a100u2w.c: trivial typo patch
Correct trivial typo in log2.h.
Remove useless FIND_FIRST_BIT() macro from cardbus.c.
...
Correct mis-spellings of "algorithm", "appear", "consistent" and
(shame, shame) "kernel".
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Remove the Open Grid Computing copyright. It shouldn't be there.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
For T3B devices, mark user QP in error once we transition
to TERMINATE.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Set the port phys state as returned from ehca_query_port() to LINK_UP.
ehca actually represents a logical HCA, whose phys/link state always
is LINK_UP.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Allow users to en/disable scaling code when loading ib_ehca module,
rather than requiring the module to be rebuilt to change the setting.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix a race condition in find_next_cpu_online() and some other locking
issues in ehca scaling code.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The change to allow allocating ICM chunks from coherent memory did not
increment the count of sg entries properly, so a chunk that required
more than allocation would not be mapped properly by the HCA.
Fix this by adding the missing increment of chunk->nsg.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
RESET->RESET is an allowed QP state transition, so mthca should handle
it correctly, by just returning success without involving the firmware.
Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
IB/mthca: Always fill MTTs from CPU
IB/mthca: Merge MR and FMR space on 64-bit systems
IB/mthca: Fix access to MTT and MPT tables on non-cache-coherent CPUs
IB/mthca: Give reserved MTTs a separate cache line
IB/mthca: Fix reserved MTTs calculation on mem-free HCAs
RDMA/cxgb3: Add driver for Chelsio T3 RNIC
IB: Remove redundant "_wq" from workqueue names
RDMA/cma: Increment port number after close to avoid re-use
IB/ehca: Fix memleak on module unloading
IB/mthca: Work around gcc bug on sparc64
IPoIB: Connected mode experimental support
IB/core: Use ARRAY_SIZE macro for mandatory_table
IB/mthca: Use correct structure size in call to memset()
Speed up memory registration by filling in MTTs directly when the CPU
can write directly to the whole table (all mem-free cards, and to
Tavor mode on 64-bit systems with the patch I posted earlier). This
reduces the number of FW commands needed to register an MR by at least
a factor of 2 and speeds up memory registration significantly.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
For Tavor, we currently reserve separate MPT and MTT space for FMRs to
avoid abusing the vmalloc space on 32 bit kernels. No such problem
exists on 64 bit kernels so let's not do it there.
This way we have a shared pool for MR and FMR resources, used on
demand. This will also make it possible to write MTTs for regular
regions directly from driver.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
We allocate the MTT table with alloc_pages() and then do pci_map_sg(),
so we must call pci_dma_sync_sg() after the CPU writes to the MTT
table. This works since the device will never write MTTs on mem-free
HCAs, once we get rid of the use of the WRITE_MTT firmware command.
This change is needed to make that work, and is an improvement for
now, since it gives FMRs a chance at working.
For MPTs, both the device and CPU might write there, so we must
allocate DMA coherent memory for these.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
MTTs are allocated in non-cache-coherent memory, so we must give
reserved MTTs their own cache line, to prevent both device and
CPU from writing into the same cache line at the same time.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The reserved_mtts field has different meaning in Tavor and Arbel, so
we are wasting mtt entries on memfree. Fix the Arbel case to match
Tavor semantics.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>