Commit f780a9f1 ("mlx4_core: Add ethernet fields to CQE struct")
introduced a bug in how wc->sl is set in mlx4_ib_poll_one() -- since
cqe->sl_vid is a big-endian value, the shift must be done after
converting to host endianness.
This bug was found using sparse endianness checking.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1429 commits)
net: Allow dependancies of FDDI & Tokenring to be modular.
igb: Fix build warning when DCA is disabled.
net: Fix warning fallout from recent NAPI interface changes.
gro: Fix potential use after free
sfc: If AN is enabled, always read speed/duplex from the AN advertising bits
sfc: When disabling the NIC, close the device rather than unregistering it
sfc: SFT9001: Add cable diagnostics
sfc: Add support for multiple PHY self-tests
sfc: Merge top-level functions for self-tests
sfc: Clean up PHY mode management in loopback self-test
sfc: Fix unreliable link detection in some loopback modes
sfc: Generate unique names for per-NIC workqueues
802.3ad: use standard ethhdr instead of ad_header
802.3ad: generalize out mac address initializer
802.3ad: initialize ports LACPDU from const initializer
802.3ad: remove typedef around ad_system
802.3ad: turn ports is_individual into a bool
802.3ad: turn ports is_enabled into a bool
802.3ad: make ntt bool
ixgbe: Fix set_ringparam in ixgbe to use the same memory pools.
...
Fixed trivial IPv4/6 address printing conflicts in fs/cifs/connect.c due
to the conversion to %pI (in this networking merge) and the addition of
doing IPv6 addresses (from the earlier merge of CIFS).
When we removed the network device argument from several
NAPI interfaces in 908a7a16b8
("net: Remove unused netdev arg from some NAPI interfaces.")
several drivers now started getting unused variable warnings.
This fixes those up.
Signed-off-by: David S. Miller <davem@davemloft.net>
When resizing a CQ, when copying over unpolled CQEs from the old CQE
buffer to the new buffer, the ownership bit must be set appropriately
for the new buffer, or the ownership bit in the new buffer gets
corrupted.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
There is no lock protecting tx_free_list thus causing a system crash
when skb_dequeue() is called and the list is empty. Since it did not give
any performance boost under heavy load, remove it to simplify the code.
Replace get_free_pkt() with dev_alloc_skb() to allocate MAX_CM_BUFFER skb
for connection establishment/teardown as well as MPA request/response.
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When the napi api was changed to separate its 1:1 binding to the net_device
struct, the netif_rx_[prep|schedule|complete] api failed to remove the now
vestigual net_device structure parameter. This patch cleans up that api by
properly removing it..
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using MSI-X mode, create a completion event queue for each CPU.
Report the number of completion EQs in a new struct mlx4_caps member,
num_comp_vectors, and extend the mlx4_cq_alloc() interface with a
vector parameter so that consumers can specify which completion EQ
should be used to report events for the CQ being created.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
vpage is checked not to be NULL just after it is initialized at the
beginning of each loop iteration.
A simplified version of the semantic patch that makes this change is
as follows: (http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@r exists@
local idexpression x;
expression E;
position p1,p2;
@@
if (x@p1 == NULL || ...) { ... when forall
return ...; }
... when != \(x=E\|x--\|x++\|--x\|++x\|x-=E\|x+=E\|x|=E\|x&=E\|&x\)
(
x@p2 == NULL
|
x@p2 != NULL
)
// another path to the test that is not through p1?
@s exists@
local idexpression r.x;
position r.p1,r.p2;
@@
... when != x@p1
(
x@p2 == NULL
|
x@p2 != NULL
)
@fix depends on !s@
position r.p1,r.p2;
expression x,E;
statement S1,S2;
@@
(
- if ((x@p2 != NULL) || ...)
S1
|
- if ((x@p2 == NULL) && ...) S1
|
- BUG_ON(x@p2 == NULL);
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
With the latest flush error completion patch we introduced modulus
operation to calculate the next index within a qmap. Based on
comments from other mailing lists we decided to optimize this
operation by using an addition and an if-statement instead of modulus,
even though this is on the error path.
Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fixes timing race resulting in panic. Not a performance sensitive path.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ipath_piobufbase was a single value offset, but is multiple values on
newer chips, so use only the 32 bits for the 2K buffers (4K buffers
are currently used only by the driver).
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Implement the ignoring of ibsymbol errors and linkrecover errors while
the link is at less than INIT (long needed), to get accurate counts.
Particularly an issue when doing non-IBTA DDR negotiation with chips
from vendors that do not support IBTA mode negotiation. If the driver
is unloaded, and there is a delta, the adjusted counters are written
back to the chip, so they stay adjusted across driver reload.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This fixes an obvious oversight where the return value is not checked
for error.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The PSN of the first packet after an RDMA read is based on the size of
the RDMA read request. This is calculated correctly for the WQE sent
after the first request message but not on subsequent requests if the
RDMA read is resent.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Wrap NES_DEBUG and assert macros with do while (0) to avoid ambiguous
else. No one is using sk_buff * returned from form_cm_frame(), so
drop the return. drop_packet() should not be incrementing reset
counter on receiving a FIN.
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Between the first empty list check and locking the list, the list can
change. Check it again after it is locked to make sure the list is
still not empty.
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ANVL testing showed we are not handling all cm_node states during
connection establishment. Add missing state handlers and fix sequence
number send reset in handle_tcp_options().
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Under heavy traffic, there is a small windows when an APBVT entry is
not yet removed and a new connection is established. Packets for the
new connection are dropped until APBVT entry is removed. This patch
will forward the packets instead of dropping them.
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
In passive open, after indicating MPA request to rdma_cm, an incoming
RST would fire a reset event to rdma_cm causing it to crash, since the
current state is not connected. The solution is to wait for
nes_accept() or nes_reject() before firing the reset event. If
nes_accept() or nes_reject() is already done, then the reset event
will be fired when RST is processed.
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
While processing connected_nodes list, we would release the lock when
we need to send reset to remote partner. That created a window where
the list can be modified. Change this into a two step process: place
nodes that need processing on a local list then process the local list.
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Use nes_free_cqp_request() instead of open coding. Change some
continue to break in nes_cm_timer_tick, because send_entry used to be
a list processed in a loop (so continue went to the next item). Now
it is a single item, so using break is correct.
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Receive work queue entries are checked for L_Key validity, and
pointers to the memory region structure are saved in an allocated
structure. For UD loopback packets, this structure is allocated and
freed for each packet. This patch changes that to allocate/free
during QP creation and destruction.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The return from lookup_one_len() is assigned to *dentry, so that's
what we should be checking with IS_ERR().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
shca_list_lock is taken from softirq context in ehca_poll_eqs, so we
need to lock IRQ safe elsewhere. Found by lockdep.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When resizing a CQ, MTTs associated with the old CQE buffer were not
freed. As a result, if any app used resize CQ repeatedly, all MTTs
were eventually exhausted, which led to all memory registration
operations failing until the driver is reloaded.
Once the RESIZE_CQ command returns successfully from FW, FW no longer
accesses the old CQ buffer, so it is safe to deallocate the MTT
entries used by the old CQ buffer.
Finally, if the RESIZE_CQ command fails, the MTTs allocated for the
new CQEs buffer also need to be de-allocated.
This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1416>.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This fix enables ehca device driver to generate flush work completions
even if the application doesn't request completions for all work
requests. The current implementation of ehca will generate flush work
completions for the wrong work requests if an application uses non
signaled work completions.
Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The error message printed when the eHCA driver prevents memory hotplug
is misleading -- the user might think that hot-removing the lhca,
hotplugging memory, then hot-adding the lhca again will work, but it
actually doesn't.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This use of netdev->priv is wrong.
The right way is:
alloc_netdev() with no memory for private data.
make netdev->ml_priv to point to c2_dev.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Acked-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the iw_cxgb3 module's cxgb3_client "add" func gets called by the
cxgb3 module, the iwarp driver ends up calling the ethtool ops
get_drvinfo function in cxgb3 to get the fw version and other info.
Currently the iwarp driver grabs the rtnl lock around this down call
to serialize. As of 2.6.27 or so, things changed such that the rtnl
lock is held around the call to the netdev driver open function. Also
the cxgb3_client "add" function doesn't get called if the device is
down.
So, if you load cxgb3, then load iw_cxgb3, then ifconfig up the
device, the iw_cxgb3 add func gets called with the rtnl_lock held. If
you load cxgb3, ifconfig up the device, then load iw_cxgb3, the add
func gets called without the rtnl_lock held. The former causes the
deadlock, the latter does not.
In addition, there are iw_cxgb3 sysfs handlers that also can call down
into cxgb3 to gather the fw and hw versions. These can be called
concurrently on different processors and at any time. Thus we need to
push this serialization down in the cxgb3 driver get_drvinfo func.
The fix is to remove rtnl lock usage, and use a per-device lock in cxgb3.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If the initialization of a special QP (e.g. AQP1) fails due to a
software timeout, we have to remove the reference to that special QP
struct from the port struct to stop the driver from accessing the QP,
since it will be/has been destroyed by the caller, eg in this case
ib_mad.
Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Set mr->umem to NULL in mlx4_ib_alloc_fast_reg_mr(). Otherwise
ib_dereg_mr() may invoke ib_umem_release() on a random pointer value
and get an oops.
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Under heavy load, there is an compatibility issue regarding PCIe write
credits with certain chipsets. It can be mitigated by limiting read
requests to 256 Bytes.
This workaround is always enabled for Tbird2 on Gladius. We also add
a module parameter to enable workaround for non-Gladius cards.
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix CQ allocation for multicast receive queue applications. Before
this patch, the CQ was not lined up with the right NIC.
Signed-off-by: Vadim Makhervaks <vadim.makhervaks@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* Roll back allocated structures on failures.
* Use GFP_ATOMIC instead of GFP_KERNEL since we are holding a lock.
* Acquire nesadapter->pbl_lock when modifying PBL counters.
* Decrement PBL counters on deallocation.
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The array wqe->read.reserved has only two entries, but
iwch_post_zb_read() sets [0], [1], and [2], which is one too many.
This is harmless since it runs into the next field, rem_stag, which is
initialized correctly immediately after, but we might as well get
things right, especially since it makes the code smaller.
This was spotted by the Coverity checker (CID 2475).
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u
can be replaced with %pI4
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As a bonus, removes some unnecessary byteswapping.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This converts pretty much everything to print_mac. There were
a few things that had conflicts which I have just dropped for
now, no harm done.
I've built an allyesconfig with this and looked at the files
that weren't built very carefully, but it's a huge patch.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the last packet of a RDMA write with immediate is received, the
next receive work queue entry ID should be used to generate a completion
entry. The code was incorrectly resetting part of the state used to copy
the last packet.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/ehca: Reject dynamic memory add/remove when ehca adapter is present
IB/ehca: Fix reported max number of QPs and CQs in systems with >1 adapter
IPoIB: Set netdev offload features properly for child (VLAN) interfaces
IPoIB: Clean up ethtool support
mlx4_core: Add Ethernet PCI device IDs
mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC
mlx4_core: Multiple port type support
mlx4_core: Ethernet MAC/VLAN management
mlx4_core: Get ethernet MTU and default address from firmware
mlx4_core: Support multiple pre-reserved QP regions
Update NetEffect maintainer emails to Intel emails
RDMA/cxgb3: Remove cmid reference on tid allocation failures
IB/mad: Use krealloc() to resize snoop table
IPoIB: Always initialize poll_timer to avoid crash on unload
IB/ehca: Don't allow creating UC QP with SRQ
mlx4_core: Add QP range reservation support
RDMA/ucma: Test ucma_alloc_multicast() return against NULL, not with IS_ERR()