Some types of packet errors are moderately common with longer IB
cables and large clusters, and are not reported with prints by other
IB HCA drivers. This suppresses those messages unless the new
__IPATH_ERRPKTDBG bit is set in ipath_debug. Reporting of temporarily
disabled frequent error interrupts was also made clearer
We also distinguish between chip errors, and bad packets sent or
received in the wording of the messages.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch fixes a number of bugs with updating the PSN for retries of
RC requests.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When switching to the QP error state, the completion queue entries
(error or flush) were not being generated correctly.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ipath_dbg doesn't need the same prefixes that printk does.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch adds support for multiple RDMA reads and atomics to be sent
before an ACK is required to be seen by the requester.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If a post send is done in loopback and there is no receive queue
entry, the sending QP is put on a timeout list for a while so the
receiver has a chance to post a receive buffer. If the another post
send is done, the code incorrectly tried to put the QP on the timeout
list again an corrupted the timeout list. This eventually leads to a
spin lock deadlock NMI due to the timer function looping forever with
the lock held.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
A silly programming error causes a CQ entry to not be generated if a
SRQ limit event is generated.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
A recent change was made to allocate memory for a port after CPU
affinity is set. That change didn't account for subports and was
trying to allocate memory for the port twice.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The chip documentation on the expected TID vs eager TID parity error
bits was reversed from what was implemented in the RTL, for both
chips. This corrects the definitions.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The loop which initializes the user memory region from an array of
pages was using the wrong limit for the array. This worked OK when
dma_map_sg() returned the same number as the number of pages. This
patch fixes the problem.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This is a sticky state. It is useful for diagnosing problems with
boards versus cable/switch problems.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
In mthca_arbel_fmr_unmap(), the high bits of the key are masked off.
This gets rid of the effect of adjust_key(), which makes sure that
bits 3 and 23 of the key are equal when the Sinai throughput
optimization is enabled, and so it may happen that an FMR will end up
with bits 3 and 23 in the key being different. This causes data
corruption, because when enabling the throughput optimization, the
driver promises the HCA firmware that bits 3 and 23 of all memory keys
will always be equal.
Fix by re-applying adjust_key() after masking the key.
Thanks to Or Gerlitz for reproducing the problem, and Ariel Shahar for
help in debug.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
As of commit 6cdbd77e ("cxgb3 - missing CPL hanler and register
setting."), the cxgb3 ethernet NIC driver no longer handles SET_TCB
replies, so we need to do it in the iWARP driver.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Commit c20e20ab ("IB/mthca: Merge MR and FMR space on 64-bit systems")
swapped the number of MTTs and MPTs when initializing the MR table. As
a result, we get a kernel oops when the number of MTT segments
allocated exceeds 0x20000.
Noted by Troy Benjegerdes <troy@scl.ameslab.gov>, and reproduced by
Dotan Barak <dotanb@mellanox.co.il>. This fixes
https://bugs.openfabrics.org/show_bug.cgi?id=490
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This was spotted by the Coverity checker (CID 1554).
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
eHCA scaling code must not depend on register_cpu_notifier() if
CONFIG_HOTPLUG_CPU is not set, so put all related code into #ifdefs.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix memory region permission problems:
- remove useless and redundant iwch_mem_perms enum.
- create ib_to_tpt_access_rights() for mapping ib access rights
to T3 TPT permissions.
- create ib_to_mwbind_access_rights() for mapping ib access rights
to T3 MWBIND WR permissions.
- fix up the mem reg code to utilize the new functions.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Only print one AE error for a given connection in the kernel log.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Stop the endpoint timer when the MPA exchange is aborted by the peer.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Change iwch_destroy_qp() to always move the QP to ERROR and let
iwch_modify_qp() decide what to do.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fixes for "normal close" failures:
- Start normal close timer when moving to CLOSING state.
- Handle ABORTING state in close_con_rpl().
- Stop timer correctly on abort during a normal close.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
cxgb3 uses dma_alloc_coherent() et al. thus needs linux/dma-mapping.h
include in order to build reliably.
Noticed on sparc64.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If the consumer rejects the connection we end up under-referencing the
endpoint structure. The fix is to call iwch_ep_disconnect() instead
of the low level disconnect functions so that the endpoint close timer
is started correctly.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The garbled logic in mthca_alloc_memfree() causes it to return 0, even
if it fails to allocate all doorbell records. Fix it to return -ENOMEM
when it fails.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch fixes two issues reported by Roland Dreier and Christoph Hellwig:
- Mismatched sync/locking between completion handler and destroy cq We
introduced a counter nr_events per cq to track number of irq events
seen. This counter is incremented when an event queue entry is seen
and decremented after completion handler has been called regardless
if scaling code is active or not. Note that nr_callbacks tracks
number of events assigned to a cpu and both counters can potentially
diverge.
The sync between running completion handler and destroy cq is done
by using the global spin lock ehca_cq_idr_lock.
- Replace yield by wait_event on the counter above to become zero.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Stop the ep timer in ec_status() if the status indicates a
bad close.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- don't mark static functions in C files as inline - gcc should know
best whether inlining makes sense
- never compile the unused cxio_dbg.c
- make the following needlessly global functions static:
- cxio_hal.c: cxio_hal_clear_qp_ctx()
- iwch_provider.c: iwch_get_qp()
- remove the following unused global functions:
- cxio_hal.c: cxio_allocate_stag()
- cxio_resource.: cxio_hal_get_rhdl()
- cxio_resource.: cxio_hal_put_rhdl()
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch makes the needlessly global functions mthca_tavor_write_mtt_seg()
and mthca_arbel_write_mtt_seg() static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (25 commits)
Documentation/kernel-docs.txt update.
arch/cris: typo in KERN_INFO
Storage class should be before const qualifier
kernel/printk.c: comment fix
update I/O sched Kconfig help texts - CFQ is now default, not AS.
Remove duplicate listing of Cris arch from README
kbuild: more doc. cleanups
doc: make doc. for maxcpus= more visible
drivers/net/eexpress.c: remove duplicate comment
add a help text for BLK_DEV_GENERIC
correct a dead URL in the IP_MULTICAST help text
fix the BAYCOM_SER_HDX help text
fix SCSI_SCAN_ASYNC help text
trivial documentation patch for platform.txt
Fix typos concerning hierarchy
Fix comment typo "spin_lock_irqrestore".
Fix misspellings of "agressive".
drivers/scsi/a100u2w.c: trivial typo patch
Correct trivial typo in log2.h.
Remove useless FIND_FIRST_BIT() macro from cardbus.c.
...
Correct mis-spellings of "algorithm", "appear", "consistent" and
(shame, shame) "kernel".
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Remove the Open Grid Computing copyright. It shouldn't be there.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
For T3B devices, mark user QP in error once we transition
to TERMINATE.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Set the port phys state as returned from ehca_query_port() to LINK_UP.
ehca actually represents a logical HCA, whose phys/link state always
is LINK_UP.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Allow users to en/disable scaling code when loading ib_ehca module,
rather than requiring the module to be rebuilt to change the setting.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix a race condition in find_next_cpu_online() and some other locking
issues in ehca scaling code.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The change to allow allocating ICM chunks from coherent memory did not
increment the count of sg entries properly, so a chunk that required
more than allocation would not be mapped properly by the HCA.
Fix this by adding the missing increment of chunk->nsg.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
RESET->RESET is an allowed QP state transition, so mthca should handle
it correctly, by just returning success without involving the firmware.
Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
IB/mthca: Always fill MTTs from CPU
IB/mthca: Merge MR and FMR space on 64-bit systems
IB/mthca: Fix access to MTT and MPT tables on non-cache-coherent CPUs
IB/mthca: Give reserved MTTs a separate cache line
IB/mthca: Fix reserved MTTs calculation on mem-free HCAs
RDMA/cxgb3: Add driver for Chelsio T3 RNIC
IB: Remove redundant "_wq" from workqueue names
RDMA/cma: Increment port number after close to avoid re-use
IB/ehca: Fix memleak on module unloading
IB/mthca: Work around gcc bug on sparc64
IPoIB: Connected mode experimental support
IB/core: Use ARRAY_SIZE macro for mandatory_table
IB/mthca: Use correct structure size in call to memset()
Speed up memory registration by filling in MTTs directly when the CPU
can write directly to the whole table (all mem-free cards, and to
Tavor mode on 64-bit systems with the patch I posted earlier). This
reduces the number of FW commands needed to register an MR by at least
a factor of 2 and speeds up memory registration significantly.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
For Tavor, we currently reserve separate MPT and MTT space for FMRs to
avoid abusing the vmalloc space on 32 bit kernels. No such problem
exists on 64 bit kernels so let's not do it there.
This way we have a shared pool for MR and FMR resources, used on
demand. This will also make it possible to write MTTs for regular
regions directly from driver.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
We allocate the MTT table with alloc_pages() and then do pci_map_sg(),
so we must call pci_dma_sync_sg() after the CPU writes to the MTT
table. This works since the device will never write MTTs on mem-free
HCAs, once we get rid of the use of the WRITE_MTT firmware command.
This change is needed to make that work, and is an improvement for
now, since it gives FMRs a chance at working.
For MPTs, both the device and CPU might write there, so we must
allocate DMA coherent memory for these.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
MTTs are allocated in non-cache-coherent memory, so we must give
reserved MTTs their own cache line, to prevent both device and
CPU from writing into the same cache line at the same time.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The reserved_mtts field has different meaning in Tavor and Arbel, so
we are wasting mtt entries on memfree. Fix the Arbel case to match
Tavor semantics.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add an RDMA/iWARP driver for the Chelsio T3 1GbE and 10GbE adapters.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data. In addition it'll catch accidental writes at compile time to
these shared resources.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replace appropriate pairs of "kmem_cache_alloc()" + "memset(0)" with the
corresponding "kmem_cache_zalloc()" call.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Roland McGrath <roland@redhat.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Greg KH <greg@kroah.com>
Acked-by: Joel Becker <Joel.Becker@oracle.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Percpu data is not freed on module unloading.
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Christoph Raisch <raisch@de.ibm.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
For some reason gcc-3.4.5 on sparc64 does:
WARNING: "____ilog2_NaN" [drivers/infiniband/hw/mthca/ib_mthca.ko] undefined!
Points to note:
(1) The asm volatile flush/flushw are just markers for viewing what comes out
in the assembly; removing them has no effect on the result.
(2) Changing almost anything else in dwh__mthca_arbel_init_srq_context() or
dwh__mthca_alloc_srq() causes the problem to go away.
The compiler command line issued by the kernel build is:
/opt/crosstool/gcc-3.4.5-glibc-2.3.6/sparc64-unknown-linux-gnu/bin/sparc64-unknown-linux-gnu-gcc -fno-strict-aliasing -fno-common -Os -m64 -mno-fpu -mcpu=ultrasparc -mcmodel=medlow -ffixed-g4 -ffixed-g5 -fcall-used-g7 -Wa,--undeclared-regs -pg -fno-omit-frame-pointer -fno-optimize-sibling-calls -fasynchronous-unwind-tables -g -c -o drivers/infiniband/hw/mthca/.tmp_mthca_srq.o drivers/infiniband/hw/mthca/mthca_srq.c
This can be reduced to this whilst still retaining the problem:
/opt/crosstool/gcc-3.4.5-glibc-2.3.6/sparc64-unknown-linux-gnu/bin/sparc64-unknown-linux-gnu-gcc -m64 -c -o drivers/infiniband/hw/mthca/mthca_srq.o drivers/infiniband/hw/mthca/mthca_srq.c -Os
Removing -Os or changing it to -O or -O0 thru -O6 gets rid of the problem.
This patch to the kernel code fixes the problem:
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When clearing the ib_ah_attr parameter in to_ib_ah_attr(), use sizeof
*ib_ah_attr instead of sizeof *path.
Pointed out by Jack Morgenstein <jackm@mellanox.co.il>.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Remove prototypes for functions that don't exist.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch removes do_mmap() from ehca:
- Call remap_pfn_range() for hardware register block
- Use vm_insert_page() to register memory allocated for completion
queues and queue pairs
- The actual mmap() call/trigger is now controlled by user space,
ie. libehca
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
struct ib_wc currently only includes the local QP number: this matches
the IB spec, but seems mostly useless. The following patch replaces
this with the pointer to qp itself, and updates all low level drivers
and all users.
This has the following advantages:
- Ability to get a per-qp context through wc->qp->qp_context
- Existing drivers already have the qp pointer ready in poll cq, so
this change actually saves a tiny bit (extra memory read) on data path
(for ehca it would actually be expensive to find the QP pointer when
polling a CQ, but ehca does not support SRQ so we can leave wc->qp as
NULL for ehca)
- Users that need the QP number can still get it through wc->qp->qp_num
Use case:
In IPoIB connected mode code, I have a common CQ shared by multiple
QPs. To track connection usage, I need a way to get at some per-QP
context upon the completion, and I would like to avoid allocating
context object per work request just to stick a QP pointer into it.
With this code, I can just use wc->qp->qp_context.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The lock is taken with _irqsave and hence must be released with
_irqrestore on all paths.
Signed-off-by Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If a QP being queried is in the RESET state, don't execute the
QUERY_QP firmware command (because it will fail).
Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Here is a patch for ehca to use proper flag, ie. GFP_ATOMIC
resp. GFP_KERNEL, when calling get_zeroed_page() to prevent "Bug:
scheduling while atomic...". This error does not cause a kernel panic
but makes ipoib un-usable afterwards. It is reproducible on
2.6.20-rc4 if one does ifconfig down during a flood ping test. I have
not observed this error in earlier releases incl. 2.6.20-rc1.
This error occurs when a qp event/irq is received and ehca event
handler allocates a control block/page to obtain HCA error data block.
Use of GFP_ATOMIC when in interrupt context prevents this issue.
Signed-off-by Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
According to the Tavor and Arbel programmer's reference manuals, the
number of bytes transferred is not provided in the byte_cnt field of
the CQ entry for atomic operation completions. For atomic operations,
the number of bytes transferred is always 8 (when the status is
"success"), and this constant value should always be used by the
driver in the ib_wc entry returned, rather than using the CQE.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
mthca_table_find() will return the wrong address when the table entry
being searched for is exactly at the beginning of a sglist entry
(other than the first), because it uses >= when it should use >.
Example: assume we have 2 entries in scatterlist, 4K each, offset is
4K. The current code will return first entry + 4K when we really want
the second entry.
In particular this means mapping an FMR on a memfree HCA may end up
writing the page table into the wrong place, leading to memory
corruption and also causing the HCA to use an incorrect address
translation table.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Commit bed8bdfd ("IB: kmemdup() cleanup") introduced one bad conversion to
kmemdup() in mthca_alloc_fmr(), where the structure allocated and the
structure copied are not the same size. Revert this back to the original
kmalloc()/memcpy() code.
Reported-by: Dotan Barak <dotanb@mellanox.co.il>.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <roland@digitalvampire.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
mthca_device_mutex() can be initialized automatically with
DEFINE_MUTEX() rather than explicitly calling mutex_init(). This
saves a bit of text and shrinks the source by a line, so we may as
well do it....
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add module parameters that enable settting some of the HCA
profile values, such as the number of QPs, CQs, etc.
Signed-off-by: Leonid Arsh <leonida@voltaire.com>
Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch implements the interposing DMA mapping functions to allow
support for IOMMUs and remove the dependence on phys_to_virt() and
bus_to_virt().
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Commit 51f65ebc ("IB/ipath - program intconfig register using new HT
irq hook"), which fixed interrupts for HyperTransport HCAs, broke PCI
Express HCAs, because for those HCAs, the driver uses the value of
pdev->irq before pci_enable_msi() and ends up getting a totally bogus
IRQ number. Fix this by using the value of pdev->irq after
pci_enable_msi().
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Remove variables that are set but then never looked at in the ipath
driver. These cleanups came from David Binderman's list of "set but
never used" warnings from icc.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This facility provides three entry points:
ilog2() Log base 2 of unsigned long
ilog2_u32() Log base 2 of u32
ilog2_u64() Log base 2 of u64
These facilities can either be used inside functions on dynamic data:
int do_something(long q)
{
...;
y = ilog2(x)
...;
}
Or can be used to statically initialise global variables with constant values:
unsigned n = ilog2(27);
When performing static initialisation, the compiler will report "error:
initializer element is not constant" if asked to take a log of zero or of
something not reducible to a constant. They treat negative numbers as
unsigned.
When not dealing with a constant, they fall back to using fls() which permits
them to use arch-specific log calculation instructions - such as BSR on
x86/x86_64 or SCAN on FRV - if available.
[akpm@osdl.org: MMC fix]
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David Howells <dhowells@redhat.com>
Cc: Wojtek Kaniewski <wojtekka@toxygen.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
SLAB_KERNEL is an alias of GFP_KERNEL.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
SLAB_ATOMIC is an alias of GFP_ATOMIC
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Conflicts:
drivers/infiniband/core/iwcm.c
drivers/net/chelsio/cxgb2.c
drivers/net/wireless/bcm43xx/bcm43xx_main.c
drivers/net/wireless/prism54/islpci_eth.c
drivers/usb/core/hub.h
drivers/usb/input/hid-core.c
net/core/netpoll.c
Fix up merge failures with Linus's head and fix new compilation failures.
Signed-Off-By: David Howells <dhowells@redhat.com>
It is possible to swap the CQs used for send_cq and recv_cq when
creating two different QPs. If these two QPs are then destroyed at
the same time, an AB-BA deadlock can occur because the CQ locks are
taken our of order. Fix this by always taking CQ locks in a fixed
order.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When initializing an mthca SRQ, the log_srq_size field should be the
log of the number of SRQ WQEs, not the log of the number of bytes in
the SRQ.
This affects only mthca drivers for memfree HCAs which set the initial
srq wqe counter (in the SW2HW transition) to a non-zero value.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This is a patch for ehca to fix a bug in prepare_sqe_to_rts(), which
used WQE address to iterate pending work requests. This might cause
an access violation since the queue pages can not be assumed to follow
each other consecutively. Thus, this patch introduces a few queue
functions to determine WQE offset based on its address and uses WQE
offset to iterate the pending work requests.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The amso1100 driver was missing a couple of __devinit/__devexit
annotations for init/cleanup functions that are called from
__devinit/__devexit functions.
Reported by Randy Dunlap <randy.dunlap@oracle.com>.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Commit b3b30f5e ("IB/mthca: Recover from catastrophic errors")
introduced some section mismatch breakage, because the error recovery
code tears down and reinitializes the device, which calls into lots of
code originally marked __devinit and __devexit from regular .text.
Fix this by getting rid of these now-incorrect section markers.
Reported by Randy Dunlap <randy.dunlap@oracle.com>.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Replace open coded kmemdup() to save some screen space, and allow
inlining/not inlining to be triggered by gcc.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ipath uses skb functions and won't build without CONFIG_NET.
Spotted by Randy Dunlap.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The PCI Express and Hypertransport chip-specific source files should only
be built when the kernel has the capability of actually compiling them.
This fixes the driver build on, for example, ia64.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
IB/mad: Fix race between cancel and receive completion
RDMA/amso1100: Fix && typo
RDMA/amso1100: Fix unitialized pseudo_netdev accessed in c2_register_device
IB/ehca: Activate scaling code by default
IB/ehca: Use named constant for max mtu
IB/ehca: Assure 4K alignment for firmware control blocks
Fix the AMSO1100 firmware version computation, which was broken
due to "&&" being used where "&" should have.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Rework some load-time error handling: c2_register_device() leaked when
it failed, and the function that called it didn't check the return code.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Change ehca's Kconfig to activates scaling code as default. After
several measurements we saw that this feature prevents dropped packets
(UD) in stress situation. Thus, enabling it helps to improve ehca's
bandwidth through IPoIB.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Define and use a constant EHCA_MAX_MTU instead hardcoded value.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Assure 4K alignment for firmware control blocks in 64K page mode,
because kzalloc()'s result address might not be 4K aligned if 64K
pages are enabled. Thus, we introduce wrappers called
ehca_{alloc,free}_fw_ctrlblock(), which use a slab cache for objects
with 4K length and 4K alignment in order to alloc/free firmware
control blocks in 64K page mode. In 4K page mode those wrappers just
are defines of get_zeroed_page() and free_page().
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Eric's changes to the htirq infrastructure require corresponding
modifications to the ipath HT driver code so that interrupts are still
delivered properly.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Several fields in an incoming MAD extended info header were passed
into the MAD_IFC firmware command at incorrect offsets (mostly off by
4 bytes). As the result, the HCA will fail to generate traps in which
this info is needed (e.g. traps which include the GRH of the incoming
packet), in violation of the IB spec.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>