Commit Graph

702 Commits

Author SHA1 Message Date
Bryan O'Sullivan
0624b072f2 IB/ipath: Fix compiler warnings and errors on non-x86_64 systems
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-28 11:16:45 -07:00
Bryan O'Sullivan
8d588f8bb7 IB/ipath: Print more informative parity error messages
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-28 11:16:43 -07:00
Bryan O'Sullivan
6a553af286 IB/ipath: Ensure that PD of MR matches PD of QP checking the Rkey
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-28 11:16:41 -07:00
Bryan O'Sullivan
10aeb0e6d8 IB/ipath: RC and UC should validate SLID and DLID
This is required for IB conformance (spec ch. 9.6.1.5).

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-28 11:16:39 -07:00
Bryan O'Sullivan
7dae5bff2e IB/ipath: Only allow complete writes to flash
Don't allow a write to the eeprom from ipathfs unless the write is exactly
128 bytes and starts at offset 0.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-28 11:16:37 -07:00
Bryan O'Sullivan
f3e93c7757 IB/ipath: Count SRQs properly
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-28 11:16:35 -07:00
Bryan O'Sullivan
aa4eaed702 IB/ipath: Lock and count allocated CQs properly
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-28 11:16:34 -07:00
Bryan O'Sullivan
11b054fe1d IB/ipath: Clean up handling of GUID 0
Respond with an error to the SM if our GUID is 0, and don't allow the
user to set our GUID to 0.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-28 11:16:32 -07:00
Bryan O'Sullivan
c78f6415e9 IB/ipath: Unregister from IB core early
This gives upper-level protocols a chance to unregister while the device
is still usable.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-28 11:16:29 -07:00
Bryan O'Sullivan
2c9446a1d6 IB/ipath: Support revision 2 InfiniPath PCIE devices
This also entailed a little GPIO-interrupt general cleanup.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-28 11:16:27 -07:00
Bryan O'Sullivan
9929b0fb0f IB/ipath: Driver support for userspace sharing of HW contexts
This allows multiple userspace processes to share a single hardware
context in a master/slave arrangement.  It is backwards binary compatible
with existing userspace.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-28 11:16:25 -07:00
Bryan O'Sullivan
221e31985b IB/ipath: Fix memory leak if allocation fails
If the second allocation failed, the first structure allocated in this
routine was not freed.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-28 11:16:23 -07:00
Bryan O'Sullivan
6022943eb4 IB/ipath: Limit # of packets sent without an ACK received
The sender requests an ACK every 1/2 MB to avoid retransmit timeouts that
were causing MVAPICH mod_bw to fail after a predictable number of sends.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-28 11:16:21 -07:00
Erez Zilber
fd6a79a786 IB/iser: Fix the description of iSER in Kconfig
Fix the description of iSER in Kconfig.  It is not accurate.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-28 10:54:51 -07:00
Erez Zilber
74a2078061 IB/iser: DMA unmap unaligned for RDMA data before touching it
iSER uses the DMA mapping api to map the page holding the
SCSI command data to the HCA DMA address space. When the
command data is not aligned for RDMA, the data is copied
to/from an allocated buffer which in turn is used for
executing this command. The pages associated with the
command must be unmapped before being touched.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-28 10:53:18 -07:00
Erez Zilber
87e8df7a27 IB/iser: Have iSER data transaction object point to iSER conn
iSER uses a data transaction object (struct iser_dto) as part
of its IB data descriptors (struct iser_desc) management.
It also uses a hierarchy of connection structures pointing to
each other. A DTO may exist even after the iscsi_iser connection
pointed by it is destroyed (eg one that is bound to a post
receive buffer which was flushed by the IB HW). Hence DTOs need
point to the lowest connection, which is struct iser_conn.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-28 10:53:16 -07:00
Roland Dreier
ee30cb5b0b RDMA/amso1100: Fix memory leak in c2_reg_phys_mr()
If the allocation of mr fails, then c2_reg_phys_mr() leaks the
page_list array it allocated earlier.

This was Coverity CID #1413.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-28 10:44:07 -07:00
Eric Sesterhenn
44334bd97e RDMA/amso1100: Fix error path in c2_llp_accept()
Another NULL dereference spotted by the Coverity checker (cid #1395):
In case we can't alloc the vq_req, we goto bail1, where we call
vq_req_free(c2dev, vq_req); which then dereferences vq_req.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Acked-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-28 10:38:32 -07:00
Roland Dreier
6edf602341 RDMA/amso1100: Fix compile warnings
Make sure all 64-bit quantities are cast to unsigned long long
when printed with "%ll" printk formats.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-27 14:42:56 -07:00
Theodore Ts'o
ba52de123d [PATCH] inode-diet: Eliminate i_blksize from the inode structure
This eliminates the i_blksize field from struct inode.  Filesystems that want
to provide a per-inode st_blksize can do so by providing their own getattr
routine instead of using the generic_fillattr() function.

Note that some filesystems were providing pretty much random (and incorrect)
values for i_blksize.

[bunk@stusta.de: cleanup]
[akpm@osdl.org: generic_fillattr() fix]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:18 -07:00
Theodore Ts'o
8e18e2941c [PATCH] inode_diet: Replace inode.u.generic_ip with inode.i_private
The following patches reduce the size of the VFS inode structure by 28 bytes
on a UP x86.  (It would be more on an x86_64 system).  This is a 10% reduction
in the inode size on a UP kernel that is configured in a production mode
(i.e., with no spinlock or other debugging functions enabled; if you want to
save memory taken up by in-core inodes, the first thing you should do is
disable the debugging options; they are responsible for a huge amount of bloat
in the VFS inode structure).

This patch:

The filesystem or device-specific pointer in the inode is inside a union,
which is pretty pointless given that all 30+ users of this field have been
using the void pointer.  Get rid of the union and rename it to i_private, with
a comment to explain who is allowed to use the void pointer.  This is just a
cleanup, but it allows us to reuse the union 'u' for something something where
the union will actually be used.

[judith@osdl.org: powerpc build fix]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Judith Lebzelter <judith@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:17 -07:00
Alexey Dobriyan
1a1d92c10d [PATCH] Really ignore kmem_cache_destroy return value
* Rougly half of callers already do it by not checking return value
* Code in drivers/acpi/osl.c does the following to be sure:

	(void)kmem_cache_destroy(cache);

* Those who check it printk something, however, slab_error already printed
  the name of failed cache.
* XFS BUGs on failed kmem_cache_destroy which is not the decision
  low-level filesystem driver should make. Converted to ignore.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:10 -07:00
Brice Goglin
46ff34633e MSI: Rename PCI_CAP_ID_HT_IRQCONF into PCI_CAP_ID_HT
0x08 is the HT capability, while PCI_CAP_ID_HT_IRQCONF would be
the subtype 0x80 that mpic_scan_ht_pic() uses.
Rename PCI_CAP_ID_HT_IRQCONF into PCI_CAP_ID_HT.

And by the way, use it in the ipath driver instead of defining its
own HT_CAPABILITY_ID.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-26 17:43:52 -07:00
James Bottomley
1aedf2ccc6 Merge mulgrave-w:git/linux-2.6
Conflicts:

	include/linux/blkdev.h

Trivial merge to incorporate tag prototypes.
2006-09-23 21:03:52 -05:00
James Bottomley
c9802cd957 Merge mulgrave-w:git/scsi-misc-2.6
Conflicts:

	drivers/scsi/iscsi_tcp.c
	drivers/scsi/iscsi_tcp.h

Pretty horrible merge between crypto hash consolidation
and crypto_digest_...->crypto_hash_... conversion

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-09-23 15:33:43 -05:00
Al Viro
d7b2004528 [PATCH] missing includes from infiniband merge
indirect chains of includes are arch-specific and can't
be relied upon...  (hell, even attempt to build it for
itanic would trigger vmalloc.h ones; err.h triggers
on e.g. alpha).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-23 11:34:43 -07:00
Krishna Kumar
9cd330d36b IB: Fix typo in kerneldoc for ib_set_client_data()
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:58 -07:00
Eli Cohen
a8bfca0243 IPoIB: Add some likely/unlikely annotations in hot path
Signed-off-by: Eli Cohen <eli@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:58 -07:00
Dotan Barak
507c335046 IPoIB: Remove unused include of vmalloc.h
IPoIB doesn't use anything from <linux/vmalloc.h>, so don't include it.

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:57 -07:00
Eli Cohen
5ccd025553 IPoIB: Rejoin all multicast groups after a port event
When ipoib_ib_dev_flush() is called because of a port event, the
driver needs to rejoin all multicast groups, since the flush will call
ipoib_mcast_dev_flush() (via ipoib_ib_dev_down()).  Otherwise no
(non-broadcast) multicast groups will be rejoined until the networking
core calls ->set_multicast_list again, and so multicast reception will
be broken for potentially a long time.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:56 -07:00
Roland Dreier
d0df6d6d45 IPoIB: Create MCGs with all attributes required by RFC
RFC 4391 ("Transmission of IP over InfiniBand (IPoIB)") says:

  If the IB multicast group does not already exist, one must be
  created first with the IPoIB link MTU.  The MGID MUST use the same
  P_Key, Q_Key, SL, MTU, and HopLimit as those used in the
  broadcast-GID.  The rest of attributes SHOULD follow the values used
  in the broadcast-GID as well.

However, the current IPoIB driver is only setting the attributes
required by the InfiniBand spec to create a multicast group, so in
particular the MTU and HopLimit are not being set.  Add these
attributes when creating MCGs, and also set the Rate attribute, since
IPoIB pays attention to that attribute as well.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:56 -07:00
Roland Dreier
5755d6dad9 IB/iser: INFINIBAND_ISER depends on INET
iSER won't build without CONFIG_INET enabled, so make Kconfig reflect that.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:55 -07:00
Roland Dreier
d35cc330a2 IB/mthca: Simplify calls to mthca_cq_clean()
If a QP has separate send and receive CQs, then the send CQ will never
have receive completions from that QP in it.  So when cleaning the
send CQ, there's no need to pass in an SRQ pointer, even if the QP is
attached to an SRQ.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:55 -07:00
Jack Morgenstein
b3b30f5e8a IB/mthca: Recover from catastrophic errors
Trigger device remove and then add when a catastrophic error is
detected in hardware.  This, in turn, will cause a device reset, which
we hope will recover from the catastrophic condition.

Since this might interefere with debugging the root cause, add a
module option to suppress this behaviour.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:54 -07:00
Michael S. Tsirkin
a70d059009 IB/cm: Do not track remote QPN in timewait state
Do not track remote QPN in TimeWait state, since QP is not connected.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:53 -07:00
Michael S. Tsirkin
c1a0b23bf4 IB/sa: Require SA registration
Require users to register with SA module, to prevent the sa_query
module text from going away while an SA query callback is still
running.  Update all in-tree users for the new interface.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:53 -07:00
Roland Dreier
2439a6e65f IPoIB: Refactor completion handling
Split up ipoib_ib_handle_wc() into ipoib_ib_handle_rx_wc() and
ipoib_ib_handle_tx_wc() to make the code easier to read.  This will
also help implement NAPI in the future.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:52 -07:00
Erez Zilber
d81110285f IB/iser: Do not use FMR for a single dma entry sg
Fast Memory Registration (fmr) is used to register for rdma an sg whose
elements are not linearly sequential after dma mapping.

The IB verbs layer provides an "all dma memory MR (memory region)" which
can be used for RDMA-ing a dma linearly sequential buffer.

Change the code to use the dma mr instead of doing fmr when dma mapping
produces a single dma entry sg.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:52 -07:00
Erez Zilber
e981f1d4b8 IB/iser: fix some debug prints
fix and add some debug prints related to iser
handling of memory for rdma.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:51 -07:00
Erez Zilber
8dfa0876d3 IB/iser: make FMR "page size" be 4K and not PAGE_SIZE
As iser is able to use at most one rdma operation for the
execution of a scsi command, and registration of the sg
associated with scsi command has its restrictions, the code
checks if an sg is "aligned for rdma".

Alignment for rdma is measured in "fmr page" units whose
possible resolutions are different between HCAs and can be
smaller, equal or bigger to the system page size.

When the system page size is bigger than 4KB (eg the default
with ia64 kernels) there a bigger chance that an sg would be
aligned for rdma if the fmr page size is 4KB.

Change the code to create FMR whose pages are of size 4KB
and to take that into account when processing the sg.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:51 -07:00
Erez Zilber
8072ec2f8f IB/iser: Limit the max size of a scsi command
Currently, the data length of a command coming down from scsi-ml
is limited only by the size of its sg list (sg_tablesize). The
max data length may be different for different page size values.
By setting max_sectors, we limit the data length to
max_sectors*512 bytes.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:50 -07:00
Erez Zilber
777a71dd4d IB/iser: fix a check of SG alignment for RDMA
dma mapping may include a "compaction" of the sg associated with scsi command.
Hence, the size of the maximal prefix of the SG which is aligned for rdma must be
compared against the length of the dma mapped sg (mem->dma_nents) and not against
the size of it before it was mapped (mem->size).

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:49 -07:00
Sean Hefty
61a73c708f RDMA/cma: Protect against adding device during destruction
Closes a window where address resolution can attach an rdma_cm_id to a
device during destruction of the rdma_cm_id.  This can result in the
rdma_cm_id remaining in the device list after its memory has been
freed.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:49 -07:00
Tom Tucker
f94b533d09 RDMA/amso1100: Add driver for Ammasso 1100 RNIC
Add a driver for the Ammasso 1100 gigabit ethernet RNIC.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:48 -07:00
Tom Tucker
07ebafbaaa RDMA: iWARP Core Changes.
Modifications to the existing rdma header files, core files, drivers,
and ulp files to support iWARP, including:
 - Hook iWARP CM into the build system and use it in rdma_cm.
 - Convert enum ib_node_type to enum rdma_node_type, which includes
   the possibility of RDMA_NODE_RNIC, and update everything for this.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:47 -07:00
Tom Tucker
922a8e9fb2 RDMA: iWARP Connection Manager.
Add an iWARP Connection Manager (CM), which abstracts connection
management for iWARP devices (RNICs).  It is a logical instance of the
xx_cm where xx is the transport type (ib or iw).  The symbols exported
are used by the transport independent rdma_cm module, and are
available also for transport dependent ULPs.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:46 -07:00
Roland Dreier
3cd965646b IB: Whitespace fixes
Remove some trailing whitespace that has snuck in despite the best
efforts of whitespace=error-all.  Also fix a few other whitespace
bogosities.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:46 -07:00
Sean Hefty
f06d265375 IB/cm: Randomize starting comm ID
Randomize the starting local comm ID to avoid getting a rejected
connection due to a stale connection after a system reboot or
reloading of the ib_cm.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:45 -07:00
James Lentini
2b3e258e5d IB/mad: Remove unused includes
The ib_mad module does not use a kthread function, but mad_priv.h
includes <linux/kthread.h>.  mad_rmpp.c does not do any DMA-related
stuff, but includes <linux/dma-mapping.h>.  Remove the unused includes.

Signed-off-by: James Lentini <jlentini@netapp.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:44 -07:00
Sean Hefty
75ab13443e IB/mad: Add support for dual-sided RMPP transfers.
The implementation assumes that any RMPP request that requires a
response uses DS RMPP.  Based on the RMPP start-up scenarios defined
by the spec, this should be a valid assumption.  That is, there is no
start-up scenario defined where an RMPP request is followed by a
non-RMPP response.  By having this assumption we avoid any API
changes.

In order for a node that supports DS RMPP to communicate with one that
does not, RMPP responses assume a new window size of 1 if a DS ACK has
not been received.  (By DS ACK, I'm referring to the turn-around ACK
after the final ACK of the request.)  This is a slight spec deviation,
but is necessary to allow communication with nodes that do not
generate the DS ACK.  It also handles the case when a response is sent
after the request state has been discarded.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22 15:22:44 -07:00