A couple of chip bugs in the iba6110 and in the iba6120 are not in more
recent chips. This first bug swaps two of the pioavail register
locations. In the second bug, the chip can sometimes forget to dma the
pio avail register to memory. We indicate the presence of these bugs
with runtime flags and we indicate the presence of the flags by bumping
the SWMINOR.
Signed-off-by: Arthur Jones <arthur.jones@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
iba6110 rev3 and earlier had a chip bug where the chip could overrun the
recv header queue. rev4 fixed this chip bug so userspace no longer needs
to workaround it. Now we only set the workaround flag for older chip
versions.
Signed-off-by: Arthur Jones <arthur.jones@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ipath_poll() suffered from a couple subtle bugs. Under the right
conditions we could leave recv interrupts enabled on an ipath user
context on close, thereby taking potentially unwanted interrupts on the
next open -- this is fixed by unconditionally turning off recv
interrupts on close. Also, we now use counters rather than set/clear
bits which allows us to make sure we catch all interrupts at the cost of
changing the semantics slightly (it's now give me all events since the
last time I called poll() rather than give me all events since I called
_this_ poll routine). We also added some memory barriers which may help
ensure we get all notifications in a timely manner.
Signed-off-by: Arthur Jones <arthur.jones@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The LMC value was being saved by the SMA in two places. This patch
cleans it up so only one copy is kept.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch adds the ability to set the LMC via a sysfs file as if the SM
sent a SubnSet(PortInfo) MAD. It is useful for debugging when no SM is
running.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The code to add an entry to the completion queue stored the QPN which is
needed for the user level verbs view of the completion queue entry but
the kernel struct ib_wc contains a pointer to the QP instead of a QPN.
When the kernel polled for a completion queue entry, the QPN was lookup
up and the QP pointer recovered. This patch stores the CQE differently
based on whether the CQ is a kernel CQ or a user CQ thus avoiding the
QPN to QP lookup overhead.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch implements the IB_EVENT_QP_LAST_WQE_REACHED event which is
needed by ib_ipoib to destroy the QP when used in connected mode.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Follow the IB spec. (C10-96) for post send which states that a flushed
completion event should be generated for work requests posted when a QP
is in the error state.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch removes some redundant initialization code.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
In an earlier change, the amount of data read from the flash was
mistakenly limited to the size known to the current driver. This causes
problems when the length is increased, and written with the new longer
version; the checksum would fail because not enough data was read.
Always read the full 128 byte length to prevent this.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch fixes a bug in the receive processing for UC RDMA WRITE with
immediate which caused the last packet to be dropped.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This is a comment change, only, correcting the comment to match the
implemented workaround, rather than the original workaround, and
clarifying why it's needed.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The ipathfs file system is used to export binary data verses ASCII data
such as through /sys. This patch removes some unneeded files since the
data is available through other /sys files.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
There have been a number of issues where host bandwidth via HT or PCIe
to the InfiniPath chip has been limited in some fashion (BIOS,
configuration, etc.), resulting in user confusion. This check gives a
clear warning that something is wrong and needs to be resolved.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The code to post UD sends tried to process work requests at the time
ib_post_send() is called without using a WQE queue. This was fine as
long as HW resources were available for sending a packet. This patch
changes UD to be handled more like RC and UC and shares more code.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Different processors have different ordering restrictions for write
combining. By taking advantage of this, we can eliminate some write
barriers when writing to the send buffers.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
On iba6110 rev4, support for three more IB counters were added. The
LocalLinkIntegrityError counter, the ExcessiveBufferOverrunErrors
counter and support for error counting of flow control packets on an
invalid VL. These counters trigger GPIO interrupts and the sw keeps
track of the counts. Since we also use GPIO interrupts to signal packet
reception, we need to turn off the fast interrupts, or we risk losing a
GPIO interrupt.
Signed-off-by: Arthur Jones <arthur.jones@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Doing min_t(int, foo, INT_MAX) doesn't work correctly, because if foo
is bigger than INT_MAX, then when treated as a signed integer, it will
become negative and hence such an expression is just an elaborate NOP.
Fix such cases in ehca to do min_t(unsigned, foo, INT_MAX) instead.
This fixes negative reported values for max_cqe, max_pd and max_ah:
Before:
max_cqe: -64
max_pd: -1
max_ah: -1
After:
max_cqe: 2147483647
max_pd: 2147483647
max_ah: 2147483647
Based on a bug report and fix from Anton Blanchard <anton@samba.org>.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Firmware commands are sent to the HCA by writing multiple words to a
command register block. Access to this block of registers is
serialized with a mutex. However, on large SGI systems, problems were
seen with multiple CPUs issuing FW commands at the same time, because
the writes to the register block may be reordered within the system
interconnect and reach the HCA in a different order than they were
issued (even with the mutex). Fix this by adding an mmiowb() before
dropping the mutex.
Tested-by: Arthur Kepner <akepner@sgi.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Increase the number of QPs allowed per multicast group from 8 to 56.
This allows for one QP per core on 16-core systems, which are now
quite common, and allows some space for future growth.
This is basically the same patch that Jack Morgenstein
<jackm@dev.mellanox.co.il> just supplied for mlx4.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Implement FMRs for mlx4. This is an adaptation of code from mthca.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Write MTT entries directly to ICM from the driver (eliminating use of
WRITE_MTT command). This reduces the number of FW commands needed to
register an MR by at least a factor of 2 and speeds up memory
registration significantly. This code will also be used to implement
FMRs.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
display the following device information under /sys/class/infiniband/mlx4_X:
board_id, fw_ver, hw_rev, hca_type.
This patch makes this information available to userspace utilities
such as ibstat and ibv_devinfo.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* Replace {un}register_cpu_notifier with {un}register_hotcpu_notifier
thereby losing a couple of #ifdef HOTPLUG_CPU pairs.
* Move comp_pool_callback_nb declaration to below that of callback
function so that initialization of .notifier_call and .priority can
occur at build time itself and not runtime.
* Mark the notifier_block (and callback function, and another static
function used by it) as __cpuinit{data} for the sake of consistency
and remove enclosing #ifdef. (This may increase size for modular
build of this module, however, because these are no longer dropped
unconditionally now.)
Signed-off-by: Satyam Sharma <satyam@infradead.org>
Acked-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
...because, on virtualized hardware like System p, we can't be sure
that the physical pages behind them are contiguous otherwise.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
We can use raw_smp_processor_id() here because the processor ID is
only used for debug output and therefore our use is preemption-unsafe.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Some firmware levels exhibit a race condition between H_ALLOC_RESOURCE(MR)
and H_FREE_RESOURCE(MR). Work around this problem by locking these hvCalls
against each other.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix some modify_qp() issues related to path migration.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Change hvcall trace output towards better readability: reg numbers
instead of argument numbers, return code as signed decimal instead of
unsigned hex.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Use Paul's new remap_4k_pfn() function to map our 4K firmware contexts
into user space on 64K-page machines without exposing neighboring
firmware contexts. Return the context's offset within a 64K page to
user space so it can determine the proper virtual address.
For details about remap_4k_pfn(), see commit 721151d0 or
http://patchwork.ozlabs.org/linuxppc/patch?id=10281
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
These driver changes incorporate the proposed PCI-X / PCI-Express read
byte count interface. Reading and setting those values doesn't take
place "manually", instead wrapping functions are called to allow
quirks for some PCI bridges.
Signed-off by: Peter Oruba <peter.oruba@amd.com>
Based on work by Stephen Hemminger <shemminger@linux-foundation.org>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
At the moment the ehca module parameters are not exported in sysfs.
Export them with 0444 permissions.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ehca spits out a lot of debugging information. I had to look closely to
see the "Port 1 is not active" message within all the debug:
eHCA Infiniband Device Driver (Rel.: SVNEHCA_0022)
eHCA scaling code enabled
ehca D.001.DQDXYCB-P1-C9: PU0006 EHCA_ERR:ehca_define_sqp Port 1 is not active.
ehca D.001.DQDXYCB-P1-C9: PU0006 EHCA_ERR:ehca_create_qp ehca_define_sqp() failed rc=ffffffffffffffff
ib_mad: Couldn't create ib_mad QP1
ib_mad: Couldn't open ehca0 port 1
ehca D.001.DQDXYCB-P1-C9: PU0006 EHCA_ERR:ehca_alloc_fmr unsupported fmr_attr->page_shift=9
ehca D.001.DQDXYCB-P1-C9: PU0006 EHCA_ERR:ehca_alloc_fmr rc=ffffffffffffffea pd=c000000b4b5b2420 mr_access_flags=7 fmr_attr=c0000005afd37394
fmr_create failed for FMR 0
Remove a few debug statements so that things are clearer:
eHCA Infiniband Device Driver (Rel.: SVNEHCA_0022)
eHCA scaling code enabled
ehca D.001.DQDXYCB-P1-C9: PU0006 EHCA_ERR:ehca_define_sqp Port 1 is not active.
ib_mad: Couldn't create ib_mad QP1
ib_mad: Couldn't open ehca0 port 1
ehca D.001.DQDXYCB-P1-C9: PU0006 EHCA_ERR:ehca_alloc_fmr unsupported fmr_attr->page_shift=9
fmr_create failed for FMR 0
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
mlx4_srq_query() returns a big-endian 16-bit value through an int *,
which screws up sparse checking. Fix this so that a CPU-endian value
is returned.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Recover from MSI-X errors by automatically falling back on regular
interrupt, instead of asking the user to do this manually. This makes
it possible to enable MSI-X by default, and will make it possible to
get rid of the msi_x module option in the future.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ehca_classes.h uses struct mutex, so while <linux/mutex.h> seems to be
pulled in indirectly by one of the headers it includes, the right
thing is to include <linux/mutex.h> directly.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Acked-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Use a __set_data_seg() helper in mlx4_ib_post_recv() too; in addition
to making the code easier to read, this also allows gcc to generate
better code -- on x86_64:
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-8 (-8)
function old new delta
mlx4_ib_post_recv 359 351 -8
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Allow changing parameter values without having to reload the module.
This is safe because these parameters are only looked at when a new
connection is established.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This is an addendum to commit 0e6e7416 ("IB/mlx4: Handle new FW
requirement for send request prefetching"). We also need to handle
prefetch marking properly for S/G segments, or else the HCA may end up
processing S/G segments that are not fully written and end up sending
the wrong data. This can actually cause data corruption in practice,
especially on systems with relatively slow CPUs (where the HCA is more
likely to prefetch while the CPU is in the middle of writing a work
request into memory).
We write S/G segments in reverse order into the WQE, in order to
guarantee that the first dword of all cachelines containing S/G
segments is written last (overwriting the headroom invalidation
pattern). The entire cacheline will thus contain valid data when the
invalidation pattern is overwritten.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/ehca: SRQ fixes to enable IPoIB CM
IB/ehca: Fix Small QP regressions