Factor code to set data segment entries out of the work request
posting functions into inline functions mthca_set_data_seg() and
mthca_set_data_seg_inval(). This makes the code more readable and
also allows the compiler to do a better job -- on x86_64:
add/remove: 0/0 grow/shrink: 0/6 up/down: 0/-69 (-69)
function old new delta
mthca_arbel_post_srq_recv 373 369 -4
mthca_arbel_post_receive 570 562 -8
mthca_tavor_post_srq_recv 520 508 -12
mthca_tavor_post_send 1344 1330 -14
mthca_arbel_post_send 1481 1467 -14
mthca_tavor_post_receive 792 775 -17
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Return the receive queue sizes for both userspace QPs and kernel Qps
(not just kernel QPs) from mlx4_ib_query_qp(). Also zero the send
queue sizes for userspace QPs to avoid a possible information leak,
and set the max_inline_data for kernel QPs to 0 since inline sends are
not supported for kernel QPs.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Local write permission makes no sense as part of the QP access flags,
since the access flags only control what the remote end of the
connection is allowed to do. Remove the code in the RDMA CM that
initializes qp_access_flags with IB_ACCESS_LOCAL_WRITE.
Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Commit 9db48926 ("drivers/infiniband/hw/mthca/mthca_qp: kill uninit'd
var warning") added "= 0" to the declarations of f0 to shut up gcc
warnings. However, there's no point in making the code bigger by
initializing f0 to a random value just to get rid of a warning;
setting f0 to 0 is no safer than just using uninitialized_var(), which
documents the situation better and gives smaller code too. For example,
on x86_64:
add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-16 (-16)
function old new delta
mthca_tavor_post_send 1352 1344 -8
mthca_arbel_post_send 1489 1481 -8
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Make some functions that are only used in a single .c file static. In
addition to being a cleanup, this shrinks the generated code. On x86_64:
add/remove: 1/3 grow/shrink: 2/1 up/down: 4777/-4956 (-179)
function old new delta
handle_errors - 3994 +3994
__verbs_timer 42 710 +668
ipath_do_ruc_send 2131 2246 +115
ipath_no_bufs_available 136 - -136
ipath_disarm_senderrbufs 639 - -639
ipath_ib_timer 658 - -658
ipath_intr 5878 2355 -3523
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Make iser_conn_release() and iser_start_rdma_unaligned_sg() static,
since they are only used in the .c file where they are defined. In
addition to being a cleanup, this even shrinks the generated code by
allowing the single call of iser_start_rdma_unaligned_sg() to be
inlined into its callsite. On x86_64:
add/remove: 0/1 grow/shrink: 1/0 up/down: 466/-533 (-67)
function old new delta
iser_reg_rdma_mem 1518 1984 +466
iser_start_rdma_unaligned_sg 533 - -533
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When warning about out-of-date firmware, current mthca code messes up
the formatting of the version if the subminor doesn't have three
digits. It doesn't fill the field with 0s so we end up with:
ib_mthca 0000:0b:00.0: HCA FW version 1.1. 0 is old (1.2. 0 is current).
Change the format from "%3d" to "%03d" to get the right thing printed.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The mthca driver supports both MSI and MSI-X. However, MSI-X works with
all hardware that the driver handles, and provides a superset of what
MSI does, so there's no point in having code for both. Schedule MSI
support for removal in 2008 to give anyone who actually needs MSI and
who can't use MSI time to speak up.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Run the existing ehca code through checkpatch.pl and clean up the
worst of the coding style violations.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Split ehca_set_pagebuf() into three functions depending on MR type
(phys/user/fast) and remove superfluous ehca_set_pagebuf_1().
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- Rename struct ehca_mr fields to clearly distinguish between kernel
and HW page size.
- Sort struct ehca_mr_pginfo into a common part and a union containing
specific fields for physical, user and fast MR
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Instead of one error mapping function for each potential error source
in ehca_mrmw.c, use a centralized function that handles all cases,
saving a three-figure line count.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Autodetection was missing a few HW revisions, causing certain eHCA1
revisions to be treated like eHCA2. Fixed.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When clearing the ib_ah_attr parameter in to_ib_ah_attr(), use sizeof
*ib_ah_attr instead of sizeof *path. This is the same bug as was
fixed for mthca in 99d4f22e ("IB/mthca: Use correct structure size in
call to memset()"), but the code was cut and pasted into mlx4 before the
fix was merged.
Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When a QP is in the INIT state, the sched_queue field hasn't been given
to the firmware yet, so the firmware cannot return the value when the QP
is queried. To handle this, use the port number that is saved in the
driver's QP data structure.
Found by Dotan Barak and Yaron Gepstein of Mellanox.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Correct the mask used to get the flow label, since the field is 20 bits,
not 24 bits.
Found by Dotan Barak and Yaron Gepstein of Mellanox.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
drivers/infiniband/hw/mthca/mthca_qp.c: In function
‘mthca_tavor_post_send’:
drivers/infiniband/hw/mthca/mthca_qp.c:1594: warning: ‘f0’ may be used
uninitialized in this function
drivers/infiniband/hw/mthca/mthca_qp.c: In function
‘mthca_arbel_post_send’:
drivers/infiniband/hw/mthca/mthca_qp.c:1949: warning: ‘f0’ may be used
uninitialized in this function
Initializing 'f0' is not strictly necessary in either case, AFAICS.
I was considering use of uninitialized_var(), but looking at the
complex flow of control in each function, I feel it is wiser and
safer to simply zero the var and be certain of ourselves.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (166 commits)
[SCSI] ibmvscsi: convert to use the data buffer accessors
[SCSI] dc395x: convert to use the data buffer accessors
[SCSI] ncr53c8xx: convert to use the data buffer accessors
[SCSI] sym53c8xx: convert to use the data buffer accessors
[SCSI] ppa: coding police and printk levels
[SCSI] aic7xxx_old: remove redundant GFP_ATOMIC from kmalloc
[SCSI] i2o: remove redundant GFP_ATOMIC from kmalloc from device.c
[SCSI] remove the dead CYBERSTORMIII_SCSI option
[SCSI] don't build scsi_dma_{map,unmap} for !HAS_DMA
[SCSI] Clean up scsi_add_lun a bit
[SCSI] 53c700: Remove printk, which triggers because of low scsi clock on SNI RMs
[SCSI] sni_53c710: Cleanup
[SCSI] qla4xxx: Fix underrun/overrun conditions
[SCSI] megaraid_mbox: use mutex instead of semaphore
[SCSI] aacraid: add 51245, 51645 and 52245 adapters to documentation.
[SCSI] qla2xxx: update version to 8.02.00-k1.
[SCSI] qla2xxx: add support for NPIV
[SCSI] stex: use resid for xfer len information
[SCSI] Add Brownie 1200U3P to blacklist
[SCSI] scsi.c: convert to use the data buffer accessors
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (76 commits)
IB: Update MAINTAINERS with Hal's new email address
IB/mlx4: Implement query SRQ
IB/mlx4: Implement query QP
IB/cm: Send no match if a SIDR REQ does not match a listen
IB/cm: Fix handling of duplicate SIDR REQs
IB/cm: cm_msgs.h should include ib_cm.h
IB/cm: Include HCA ACK delay in local ACK timeout
IB/cm: Use spin_lock_irq() instead of spin_lock_irqsave() when possible
IB/sa: Make sure SA queries use default P_Key
IPoIB: Recycle loopback skbs instead of freeing and reallocating
IB/mthca: Replace memset(<addr>, 0, PAGE_SIZE) with clear_page(<addr>)
IPoIB/cm: Fix warning if IPV6 is not enabled
IB/core: Take sizeof the correct pointer when calling kmalloc()
IB/ehca: Improve latency by unlocking after triggering the hardware
IB/ehca: Notify consumers of LID/PKEY/SM changes after nondisruptive events
IB/ehca: Return QP pointer in poll_cq()
IB/ehca: Change idr spinlocks into rwlocks
IB/ehca: Refactor sync between completions and destroy_cq using atomic_t
IB/ehca: Lock renaming, static initializers
IB/ehca: Report RDMA atomic attributes in query_qp()
...
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (34 commits)
PCI: Only build PCI syscalls on architectures that want them
PCI: limit pci_get_bus_and_slot to domain 0
PCI: hotplug: acpiphp: avoid acpiphp "cannot get bridge info" PCI hotplug failure
PCI: hotplug: acpiphp: remove hot plug parameter write to PCI host bridge
PCI: hotplug: acpiphp: fix slot poweroff problem on systems without _PS3
PCI: hotplug: pciehp: wait for 1 second after power off slot
PCI: pci_set_power_state(): check for PM capabilities earlier
PCI: cpci_hotplug: Convert to use the kthread API
PCI: add pci_try_set_mwi
PCI: pcie: remove SPIN_LOCK_UNLOCKED
PCI: ROUND_UP macro cleanup in drivers/pci
PCI: remove pci_dac_dma_... APIs
PCI: pci-x-pci-express-read-control-interfaces cleanups
PCI: Fix typo in include/linux/pci.h
PCI: pci_ids, remove double or more empty lines
PCI: pci_ids, add atheros and 3com_2 vendors
PCI: pci_ids, reorder some entries
PCI: i386: traps, change VENDOR to DEVICE
PCI: ATM: lanai, change VENDOR to DEVICE
PCI: Change all drivers to use pci_device->revision
...
sysfs is now completely out of driver/module lifetime game. After
deletion, a sysfs node doesn't access anything outside sysfs proper,
so there's no reason to hold onto the attribute owners. Note that
often the wrong modules were accounted for as owners leading to
accessing removed modules.
This patch kills now unnecessary attribute->owner. Note that with
this change, userland holding a sysfs node does not prevent the
backing module from being unloaded.
For more info regarding lifetime rule cleanup, please read the
following message.
http://article.gmane.org/gmane.linux.kernel/510293
(tweaked by Greg to not delete the field just yet, to make it easier to
merge things properly.)
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Instead of all drivers reading pci config space to get the revision
ID, they can now use the pci_device->revision member.
This exposes some issues where drivers where reading a word or a dword
for the revision number, and adding useless error-handling around the
read. Some drivers even just read it for no purpose of all.
In devices where the revision ID is being copied over and used in what
appears to be the equivalent of hotpath, I have left the copy code
and the cached copy as not to influence the driver's performance.
Compile tested with make all{yes,mod}config on x86_64 and i386.
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If a SIDR REQ does not match a listen, we should reply with status
value 1 (service ID not supported), rather than dropping through to
the default case of status 2 (rejected by service provider).
Doing this also fixes a bug where the cm_id_priv is removed from the
remote_sidr_table twice.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix handling to duplicate SIDR REQs to avoid sending a reject if a
duplicate is detected. Duplicates should just be silently discarded.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
cm_msgs.h uses definitions from ib_cm.h. Include it directly, rather
than depending on a specific include order.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The IB CM should include the HCA ACK delay when calculating the local
ACK timeout value to use for RC QPs. If the HCA ACK delay is large
enough relative to the packet life time, then if it is not taken into
account, the calculated timeout value ends up being too small, which
can result in "retry exceeded" errors.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The ib_cm is a little over zealous about using spin_lock_irqsave,
when spin_lock_irq would do.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
MADs sent to the SA should use the the default P_Key (0x7fff/0xffff).
There's no requirement that the default P_Key is stored at index 0 in
the local P_Key table, so add code to the sa_query module to look up
the index of the default P_Key when creating an address handle for the
SA (which is done any time the P_Key table might change), and use this
index for all SA queries.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
InfiniBand HCAs replicate multicast packets back to the QP that sent
them if that QP is attached to the destination multicast group. This
means that IPoIB multicasts are often replicated back to the receive
queue of the interface that generated them. To avoid confusing the
network stack, we drop these duplicates within the IPoIB driver.
However, there's no reason to free the skb that received the duplicate
and then immediately allocate a new skb to post to the receive queue.
We can be more efficient and just repost the same skb.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix
drivers/infiniband/ulp/ipoib/ipoib_cm.c:1151: warning: unused variable 'dev'
by getting rid of the variable dev, which is only used if CONFIG_IPV6
is enabled, and replacing the one use of it with the value it is
assigned, namely priv->dev.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When allocating out_mad in show_pma_counter(), take sizeof *out_mad
instead of sizeof *in_mad. It is true that today the type of in_mad
and out_mad are the same, but this patch will give us a cleaner code.
Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Kick the hardware before unlocking the send/receive queue to overlap
processing a little more.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When firmware reports a nondisruptive port configuration change event,
previous versions of the eHCA driver didn't forward the event to consumers
like IPoIB. Add code that determines the type of configuration change by
comparing old and new port attributes and reports it.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This eliminates lock contention among IRQs as well as the need to
disable IRQs around idr_find, because there are no IRQ writers.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- ehca_cq.nr_events is made an atomic_t, eliminating a lot of locking.
- The CQ is removed from the CQ idr first now to make sure no more
completions are scheduled on that CQ. The "wait for all completions to
end" code becomes much simpler this way.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- Rename all spinlock flags to "flags", matching the vast majority of kernel
code.
- Move hcall_lock into the only file it's used in.
- Replaced spin_lock_init() and friends with static initializers for
global variables.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Support SRQs on eHCA2. Since an SRQ is a QP for eHCA2, a lot of code
(structures, create, destroy, post_recv) can be shared between QP and SRQ.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- Replace init_qp_queues() by a shorter init_qp_queue(), eliminating
duplicate code.
- hipz_h_alloc_resource_qp() doesn't need a pointer to struct ehca_qp any
longer. All input and output data is transferred through the parms
parameter.
- Change the interface to also support SRQ.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
In preparation for support of new eHCA2 features, change adapter probing:
- Hardware level is changed to encode major and minor chip version
- Hardware capabilities are queried from the firmware
- The maximum MTU is queried from the firmware instead of assuming a
fixed value
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Return the PortGUID of the correct port when responding to a NodeInfo
query. Returning the SystemImageGUID causes issues when there are
multiple HCAs in a single system.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The changeset 3859e39d ("IB/ipath: Support larger IB_QP_MAX_DEST_RD_ATOMIC
and IB_QP_MAX_QP_RD_ATOMIC") added support for larger RD_ATOMIC values,
but it failed to take out the stricter checks that were before these and
hence had no effect. This patch takes out the bogus checks....
Signed-off-by: Arthur Jones <arthur.jones@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
All too often, interrupts do not get enabled for our card due to BIOS
misconfiguration and other issues. This patch checks for that
condition on startup and warns the user. This patch is based on work
(check LID availability) by Robert Walsh.
Signed-off-by: Arthur Jones <arthur.jones@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The default calculation for the number of send buffers to allocate to
the kernel was too high for the PCIe version of the chip thus leaving
fewer than desired send buffers for user MPI applications.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
We are more careful to be sure that we don't lose information about
changes that occurred while we were in freeze mode, when the chip will
not notify us, and try to avoid false error interrupts while doing
cleanup. Put all of this logic in a new function ipath_clear_freeze().
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add a barrier to make sure the CPU doesn't reorder writes to memory,
since user programs can be polling on the head index update and the
entry should be written before that.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Change Kconfig objects from "menu, config" into "menuconfig" so
that the user can disable the whole feature without having to
enter the menu first.
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
[ Also remove cast from void * return of kmalloc() as suggested by
Jesper Juhl <jesper.juhl@gmail.com>. ]
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This bug results in an abort request being sent down _after_ the tid
has been released. If the tid happens to have been reused, then the
subsequent generation of the tid gets incorrectly aborted.
The thread running iwch_accecpt_cr() must not abort a connection if an
error is returned after being awakened. If any errors did occur while
iwch_accept_cr() is blocked, then the connection has already been
aborted on the thread processing the error.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The LLD does this for us in cxgb3_remove_tid().
Also fixed active open failure cases where we also shouldn't be
releasing the TID.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Negative advice messages should _not_ count toward the 2 abort
requests needed to indicate an abort request.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Don't set the gen bits nor length bits in the terminate WR. This is
done by the LLD driver.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Due to a HW issue, our current scheme to transition the connection from
streaming to rdma mode is broken on the passive side. The firmware
and driver now support a new transition scheme for the passive side:
- driver posts rdma_init_wr (now including the initial receive seqno)
- driver posts last streaming message via TX_DATA message (MPA start
response)
- uP atomically sends the last streaming message and transitions the
tcb to rdma mode.
- driver waits for wr_ack indicating the last streaming message was ACKed.
NOTE: This change also bumps the required firmware version to 4.3.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Get the maximum message size from the device capabilities returned
from the QUERY_DEV_CAP firmware command, rather than hard-coding 2 GB.
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Now that it's June, it's about time to update
the copyright notices of files that have changed.
Signed-off-by: John Gregor <john.gregor@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Robert Walsh <robert.walsh@qlogic.com>
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix ipath_poll and enhance it so we can poll for urgent packets or
regular packets and receive notifications of when a header queue
overflows.
Signed-off-by: Robert Walsh <robert.walsh@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The IB specification ch. 9.9.3 table 58 says that a QP which isn't set
up for the operation should return a NAK invalid request.
Signed-off-by: Robert Walsh <robert.walsh@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
During compliance testing and when debugging some interconnect issues,
it is very useful to be able to send malformed packets, without having
the device signal them as malformed (drop, or terminate with EBP). The
hardware supports this, but the driver "diagnostic packet" interface
did not.
Extend capability to send specific malformed packets for testing.
Signed-off-by: Michael Albaugh <Michael.Albaugh@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Previously the driver and userspace code handled the case of 1 subport
somewhat inconsistently. The new interpretation of this situation is
that if one subport is requested, the driver turns on the subport
mechanism and arranges for the port to be "shared" by one process. In
normal use the userspace library does not use this configuration and
instead arranges for the port not to be shared at all. This
particular idiom can be useful for testing purposes.
Signed-off-by: Mark Debbage <mark.debbage@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When subports are required to run a program, this patch checks that
the driver and the userspace library have compatible subport
implementations. This is achieved through checks on the swminor
version field built into the driver and userspace library. Bad
combinations are reported through syslog and result in an error when
opening the port.
Signed-off-by: Mark Debbage <mark.debbage@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
A duplicate RDMA read request can fool the responder into NAKing a new
RDMA read request because the responder wasn't keeping track of
whether the queue of RDMA read requests had been sent at least once.
For example, requester sends 4 2K byte RDMA read requests, times out,
and resends the first, then sees the 4 responses, then sends a 5th
RDMA read or atomic operation. The responder sees the 4 requests,
sends 4 responses, sees the resent 1st request, rewinds the queue,
then sees the 5th request but thinks the queue is full and that the
requester is invalidly sending a 5th new request.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The code to copy data from the receive queue buffers to the IB SGEs
doesn't check the SGE length, only the memory region/page length when
copying data. This could overwrite parts of the user's memory that
were not intended to be written. It can only happen if multiple SGEs
are used to describe a receive buffer which almost never happens in
practice.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The send function is called when posting new send work requests.
There is no point in trying to send a packet if the QP is already
waiting for a HW send buffer so don't clear the busy bit until the
buffer available interrupt happens.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
A RDMA read response or atomic response can ACK earlier sends and RDMA
writes. In this case, the wrong work request pointer was being used
to store the read first response or atomic result. Also, if a RDMA
read request is retried, the code to compute which request to resend
was incorrect.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This centralizes the use of the abort functionality, removes the
unneeded buffer cancel (abort does the same thing), sets up to ignore
launch errors after abort, same as cancel. We need abort on exit from
freeze mode to avoid having buffers stuck in the busy state, if a user
process happened to complete the send while we were in freeze mode
doing the recovery.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The values passed have never been right for iba 6120 chips, but just
happened to work. We needed to select the right buffer offset in the
chip (both are in same register), and the total length was wrong also,
but was covered by the rounding up.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Define pkt rcvd 'type' in a way consistent with HW spec and chips.
The hardware considers received packets of type 0 to be expected, and
type 1 to be eager. The driver was calling the ipath_f_put_tid
functions using a variable called 'type' set to 0 for eager and to 1
for expected packets. Worse, the iba6110 and iba6120 drivers used
those values inconsistently. This was quite confusing. Now
everything is consistent with the hardware.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
According to chapter 17.2.8.1.1, QPs start in the migrated state and
should send packets with the M bit set in the BTH.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch fixes a minor bug where the wrong QP was checked for a send
work request that should wait for an RNR timeout.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch fixes a bug introduced when moving some code around for
readability.
Setting the wqe pointer at the end of the function is a NOP since it
isn't used. Move it back to where it is used.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
In ipath_query_device(), some of the struct ib_device_attr fields were
not being initialized.
Signed-off-by: Robert Walsh <robert.walsh@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Although our chip supports 4K MTUs, our driver doesn't yet support
this feature, so limit the maximum MTU to 2K until we get support for
4K MTUs implemented.
Signed-off-by: Robert Walsh <robert.walsh@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Recognize IBA 6110 Revision 4: same feature set, etc. as earlier revisions.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
We currently track various errors, now we enhance that capability by
logging some of them to EEPROM. We also now log a cumulative "active"
time defined by traffic though the InfiniPath HCA beyond the normal SM
traffic.
Signed-off-by: Michael Albaugh <michael.albaugh@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The IPATH_RUNTIME_PBC_REWRITE and the IPATH_RUNTIME_LOOSE_DMA_ALIGN
flags were not ever implemented correctly and did not turn out to be
necessary. Remove the last vestiges of these flags but mark the spot
with a comment to remind us to not reuse these flags in the interest
of binary compatibility. The INFINIPATH_XGXS_SUPPRESS_ARMLAUNCH_ERR
bit was also not found to be useful, so it was dropped in the cleanup
as well.
Signed-off-by: John Gregor <john.gregor@qlogic.com>
Signed-off-by: Arthur Jones <arthur.jones@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The new LED blinking interface adds more contention for the
unprotected GPIO pins that were already shared, though not commonly at
the same time. We add locks to the accesses to these pins so that
Read-Modify-Write is now safe. Some of these locks are added at
interrupt context, so we shadow the registers which drive and inspect
these pins to avoid the mmio read/writes. This mitigates the effects
of the locks and hastens us through the interrupt.
Add locking and always use shadows for registers controlling GPIO pins
(ExtCtrl and GPIOout). The use of shadows implies doing less I/O,
which can make I2C operation too fast on some platforms. An explicit
udelay(1) in SCL manipulation fixes that.
Signed-off-by: Michael Albaugh <michael.albaugh@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When we want to find an InfiniPath HCA in a rack of nodes, it is often
expeditious to blink the status LEDs via a userspace /sys file.
A write-only led_override "file" is published per device. Writes to
this file are interpreted as (string form) numbers, and the resulting
value sent to ipath_set_led_override(). The upper eight bits are
interpretted as a 4.4 fixed-point "frequency in Hertz", and the bottom
two 4-bit values are alternately (D0..3, then D4..7) used by the
board-specific LED-setting function to override the normal state.
Signed-off-by: Michael Albaugh <michael.albaugh@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
mlx4_ib.h uses struct mutex, so although <linux/mutex.h> seems to be
pulled in indirectly by one of the headers it includes, the right
thing is to include <linux/mutex.h> directly.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
gcc correctly warned:
drivers/infiniband/core/umem.c: In function 'ib_umem_get':
drivers/infiniband/core/umem.c:78: warning: 'ret' may be used uninitialized in this function
Set ret to 0 in case npages == 0 and the loop isn't entered at all.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Refactor the ehca changes from commit ed23a727 ("IB: Return "maybe
missed event" hint from ib_req_notify_cq()") so the queue arithmetic
is done in slightly fewer lines. Also, move the spinlock flags into
the block they're used in.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Extend the SMI with switch (intermediate hop) support. Care has been
taken to ensure that the CA (and router) code paths are changed as
little as possible.
Signed-off-by: Suresh Shelvapille <suri@baymicrosystems.com>
Signed-off-by: Hal Rosenstock <halr@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>