* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (76 commits)
IB: Update MAINTAINERS with Hal's new email address
IB/mlx4: Implement query SRQ
IB/mlx4: Implement query QP
IB/cm: Send no match if a SIDR REQ does not match a listen
IB/cm: Fix handling of duplicate SIDR REQs
IB/cm: cm_msgs.h should include ib_cm.h
IB/cm: Include HCA ACK delay in local ACK timeout
IB/cm: Use spin_lock_irq() instead of spin_lock_irqsave() when possible
IB/sa: Make sure SA queries use default P_Key
IPoIB: Recycle loopback skbs instead of freeing and reallocating
IB/mthca: Replace memset(<addr>, 0, PAGE_SIZE) with clear_page(<addr>)
IPoIB/cm: Fix warning if IPV6 is not enabled
IB/core: Take sizeof the correct pointer when calling kmalloc()
IB/ehca: Improve latency by unlocking after triggering the hardware
IB/ehca: Notify consumers of LID/PKEY/SM changes after nondisruptive events
IB/ehca: Return QP pointer in poll_cq()
IB/ehca: Change idr spinlocks into rwlocks
IB/ehca: Refactor sync between completions and destroy_cq using atomic_t
IB/ehca: Lock renaming, static initializers
IB/ehca: Report RDMA atomic attributes in query_qp()
...
Instead of all drivers reading pci config space to get the revision
ID, they can now use the pci_device->revision member.
This exposes some issues where drivers where reading a word or a dword
for the revision number, and adding useless error-handling around the
read. Some drivers even just read it for no purpose of all.
In devices where the revision ID is being copied over and used in what
appears to be the equivalent of hotpath, I have left the copy code
and the cached copy as not to influence the driver's performance.
Compile tested with make all{yes,mod}config on x86_64 and i386.
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kick the hardware before unlocking the send/receive queue to overlap
processing a little more.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When firmware reports a nondisruptive port configuration change event,
previous versions of the eHCA driver didn't forward the event to consumers
like IPoIB. Add code that determines the type of configuration change by
comparing old and new port attributes and reports it.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This eliminates lock contention among IRQs as well as the need to
disable IRQs around idr_find, because there are no IRQ writers.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- ehca_cq.nr_events is made an atomic_t, eliminating a lot of locking.
- The CQ is removed from the CQ idr first now to make sure no more
completions are scheduled on that CQ. The "wait for all completions to
end" code becomes much simpler this way.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- Rename all spinlock flags to "flags", matching the vast majority of kernel
code.
- Move hcall_lock into the only file it's used in.
- Replaced spin_lock_init() and friends with static initializers for
global variables.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Support SRQs on eHCA2. Since an SRQ is a QP for eHCA2, a lot of code
(structures, create, destroy, post_recv) can be shared between QP and SRQ.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- Replace init_qp_queues() by a shorter init_qp_queue(), eliminating
duplicate code.
- hipz_h_alloc_resource_qp() doesn't need a pointer to struct ehca_qp any
longer. All input and output data is transferred through the parms
parameter.
- Change the interface to also support SRQ.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
In preparation for support of new eHCA2 features, change adapter probing:
- Hardware level is changed to encode major and minor chip version
- Hardware capabilities are queried from the firmware
- The maximum MTU is queried from the firmware instead of assuming a
fixed value
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Return the PortGUID of the correct port when responding to a NodeInfo
query. Returning the SystemImageGUID causes issues when there are
multiple HCAs in a single system.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The changeset 3859e39d ("IB/ipath: Support larger IB_QP_MAX_DEST_RD_ATOMIC
and IB_QP_MAX_QP_RD_ATOMIC") added support for larger RD_ATOMIC values,
but it failed to take out the stricter checks that were before these and
hence had no effect. This patch takes out the bogus checks....
Signed-off-by: Arthur Jones <arthur.jones@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
All too often, interrupts do not get enabled for our card due to BIOS
misconfiguration and other issues. This patch checks for that
condition on startup and warns the user. This patch is based on work
(check LID availability) by Robert Walsh.
Signed-off-by: Arthur Jones <arthur.jones@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The default calculation for the number of send buffers to allocate to
the kernel was too high for the PCIe version of the chip thus leaving
fewer than desired send buffers for user MPI applications.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
We are more careful to be sure that we don't lose information about
changes that occurred while we were in freeze mode, when the chip will
not notify us, and try to avoid false error interrupts while doing
cleanup. Put all of this logic in a new function ipath_clear_freeze().
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add a barrier to make sure the CPU doesn't reorder writes to memory,
since user programs can be polling on the head index update and the
entry should be written before that.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Change Kconfig objects from "menu, config" into "menuconfig" so
that the user can disable the whole feature without having to
enter the menu first.
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
[ Also remove cast from void * return of kmalloc() as suggested by
Jesper Juhl <jesper.juhl@gmail.com>. ]
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This bug results in an abort request being sent down _after_ the tid
has been released. If the tid happens to have been reused, then the
subsequent generation of the tid gets incorrectly aborted.
The thread running iwch_accecpt_cr() must not abort a connection if an
error is returned after being awakened. If any errors did occur while
iwch_accept_cr() is blocked, then the connection has already been
aborted on the thread processing the error.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The LLD does this for us in cxgb3_remove_tid().
Also fixed active open failure cases where we also shouldn't be
releasing the TID.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Negative advice messages should _not_ count toward the 2 abort
requests needed to indicate an abort request.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Don't set the gen bits nor length bits in the terminate WR. This is
done by the LLD driver.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Due to a HW issue, our current scheme to transition the connection from
streaming to rdma mode is broken on the passive side. The firmware
and driver now support a new transition scheme for the passive side:
- driver posts rdma_init_wr (now including the initial receive seqno)
- driver posts last streaming message via TX_DATA message (MPA start
response)
- uP atomically sends the last streaming message and transitions the
tcb to rdma mode.
- driver waits for wr_ack indicating the last streaming message was ACKed.
NOTE: This change also bumps the required firmware version to 4.3.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Get the maximum message size from the device capabilities returned
from the QUERY_DEV_CAP firmware command, rather than hard-coding 2 GB.
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Now that it's June, it's about time to update
the copyright notices of files that have changed.
Signed-off-by: John Gregor <john.gregor@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Robert Walsh <robert.walsh@qlogic.com>
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix ipath_poll and enhance it so we can poll for urgent packets or
regular packets and receive notifications of when a header queue
overflows.
Signed-off-by: Robert Walsh <robert.walsh@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The IB specification ch. 9.9.3 table 58 says that a QP which isn't set
up for the operation should return a NAK invalid request.
Signed-off-by: Robert Walsh <robert.walsh@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
During compliance testing and when debugging some interconnect issues,
it is very useful to be able to send malformed packets, without having
the device signal them as malformed (drop, or terminate with EBP). The
hardware supports this, but the driver "diagnostic packet" interface
did not.
Extend capability to send specific malformed packets for testing.
Signed-off-by: Michael Albaugh <Michael.Albaugh@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Previously the driver and userspace code handled the case of 1 subport
somewhat inconsistently. The new interpretation of this situation is
that if one subport is requested, the driver turns on the subport
mechanism and arranges for the port to be "shared" by one process. In
normal use the userspace library does not use this configuration and
instead arranges for the port not to be shared at all. This
particular idiom can be useful for testing purposes.
Signed-off-by: Mark Debbage <mark.debbage@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When subports are required to run a program, this patch checks that
the driver and the userspace library have compatible subport
implementations. This is achieved through checks on the swminor
version field built into the driver and userspace library. Bad
combinations are reported through syslog and result in an error when
opening the port.
Signed-off-by: Mark Debbage <mark.debbage@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
A duplicate RDMA read request can fool the responder into NAKing a new
RDMA read request because the responder wasn't keeping track of
whether the queue of RDMA read requests had been sent at least once.
For example, requester sends 4 2K byte RDMA read requests, times out,
and resends the first, then sees the 4 responses, then sends a 5th
RDMA read or atomic operation. The responder sees the 4 requests,
sends 4 responses, sees the resent 1st request, rewinds the queue,
then sees the 5th request but thinks the queue is full and that the
requester is invalidly sending a 5th new request.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The code to copy data from the receive queue buffers to the IB SGEs
doesn't check the SGE length, only the memory region/page length when
copying data. This could overwrite parts of the user's memory that
were not intended to be written. It can only happen if multiple SGEs
are used to describe a receive buffer which almost never happens in
practice.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The send function is called when posting new send work requests.
There is no point in trying to send a packet if the QP is already
waiting for a HW send buffer so don't clear the busy bit until the
buffer available interrupt happens.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
A RDMA read response or atomic response can ACK earlier sends and RDMA
writes. In this case, the wrong work request pointer was being used
to store the read first response or atomic result. Also, if a RDMA
read request is retried, the code to compute which request to resend
was incorrect.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This centralizes the use of the abort functionality, removes the
unneeded buffer cancel (abort does the same thing), sets up to ignore
launch errors after abort, same as cancel. We need abort on exit from
freeze mode to avoid having buffers stuck in the busy state, if a user
process happened to complete the send while we were in freeze mode
doing the recovery.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The values passed have never been right for iba 6120 chips, but just
happened to work. We needed to select the right buffer offset in the
chip (both are in same register), and the total length was wrong also,
but was covered by the rounding up.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Define pkt rcvd 'type' in a way consistent with HW spec and chips.
The hardware considers received packets of type 0 to be expected, and
type 1 to be eager. The driver was calling the ipath_f_put_tid
functions using a variable called 'type' set to 0 for eager and to 1
for expected packets. Worse, the iba6110 and iba6120 drivers used
those values inconsistently. This was quite confusing. Now
everything is consistent with the hardware.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
According to chapter 17.2.8.1.1, QPs start in the migrated state and
should send packets with the M bit set in the BTH.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>