Commit Graph

38 Commits

Author SHA1 Message Date
Michael S. Tsirkin
8a7f752125 IB/ipoib: Fix packet loss after hardware address update
The neighbour ha field may get updated without destroying the
neighbour.  In this case, the ha field gets out of sync with the
address handle stored in ipoib_neigh->ah, with the result that
the ah field would point to an incorrect path, resulting in all
packets being lost.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-07-24 09:18:07 -07:00
Jack Morgenstein
37c22a7721 IPoIB: Fix kernel unaligned access on ia64
Fix misaligned access faults on ia64: never cast a misaligned
neighbour->ha + 4 pointer to union ib_gid type; pass a void * pointer
instead.  The memcpy was being optimized to use full word accesses
because the compiler thought that union ib_gid is always aligned.

The cast in IPOIB_GID_ARG is safe, since it is fixed to access each
byte separately.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:35 -07:00
Roland Dreier
f697f74a6b IPoIB: Use spin_lock_irq() instead of spin_lock_irqsave()
We know ipoib_flush_paths() is called from plain process context with
interrupts enabled, since it does wait_for_completion().  So there's
no need to use spin_lock_irqsave() -- spin_lock_irq() is fine.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-10 09:43:59 -07:00
Eli Cohen
a30bb96c6f IPoIB: Close race in ipoib_flush_paths()
ib_sa_cancel_query() must be called with priv->lock held since
a completion might arrive and set path->query to NULL.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-10 09:43:59 -07:00
Shirley Ma
0f4852513f IPoIB: Make send and receive queue sizes tunable
Make IPoIB's send and receive queue sizes tunable via module
parameters ("send_queue_size" and "recv_queue_size").  This allows the
queue sizes to be enlarged to fix disastrously bad performance on some
platforms and workloads, without bloating memory usage when large
queues aren't needed.

Signed-off-by: Shirley Ma <xma@us.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-10 09:43:58 -07:00
Jack Morgenstein
bf6a9e31cf IB: simplify static rate encoding
Push translation of static rate to HCA format into low-level drivers,
where it belongs.  For static rate encoding, use encoding of rate
field from IB standard PathRecord, with addition of value 0, for
backwards compatibility with current usage.  The changes are:

 - Add enum ib_rate to midlayer includes.
 - Get rid of static rate translation in IPoIB; just use static rate
   directly from Path and MulticastGroup records.
 - Update mthca driver to translate absolute static rate into the
   format used by hardware.  This also fixes mthca's static rate
   handling for HCAs that are capable of 4X DDR.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-10 09:43:47 -07:00
Michael S. Tsirkin
d2e0655ede IPoIB: Consolidate private neighbour data handling
Consolidate IPoIB's private neighbour data handling into
ipoib_neigh_alloc() and ipoib_neigh_free().  This will make it easier
to keep track of the neighbour structures that IPoIB is handling, and
is a nice cleanup of the code:

add/remove: 2/1 grow/shrink: 1/8 up/down: 100/-178 (-78)
function                                     old     new   delta
ipoib_neigh_alloc                              -      61     +61
ipoib_neigh_free                               -      36     +36
ipoib_mcast_join_finish                     1288    1291      +3
path_rec_completion                          575     573      -2
ipoib_mcast_join_task                        664     660      -4
ipoib_neigh_destructor                       101      92      -9
ipoib_neigh_setup_dev                         14       3     -11
ipoib_neigh_setup                             17       -     -17
path_free                                    238     215     -23
ipoib_mcast_free                             329     306     -23
ipoib_mcast_send                             718     684     -34
neigh_add_path                               705     650     -55

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-04 14:46:48 -07:00
Roland Dreier
ef12d45619 IPoIB: Fix oops with raw sockets
ipoib_hard_header() needs to handle the case that daddr is NULL.  This
can happen when packets are injected via a raw socket, and IPoIB
shouldn't oops in this case.

Reported by Anton Blanchard <anton@samba.org>

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-29 09:36:46 -08:00
Leonid Arsh
7a343d4c46 IPoIB: P_Key change event handling
This patch causes the network interface to respond to P_Key change
events correctly.  As a result, you'll see a child interface in the
"RUNNING" state (netif_carrier_on()) only when the corresponding P_Key
is configured by the SM.  When SM removes a P_Key, the "RUNNING" state
will be disabled for the corresponding network interface.  To
implement this, I added IB_EVENT_PKEY_CHANGE event handling.  To
prevent flushing the device before the device is open by the "delay
open" mechanism, I added an additional device flag called
IPOIB_FLAG_INITIALIZED.

This also prevents the child network interface from trying to join to
multicast groups until the PKEY is configured.  We used to get error
messages like:

    ib0.f2f2: couldn't attach QP to multicast group ff12:401b:f2f2:0:0:0:ffff:ffff

in this case.  To fix this, I just check IPOIB_FLAG_OPER_UP flag in
ipoib_set_mcast_list().

Signed-off-by: Leonid Arsh <leonida@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24 15:47:30 -08:00
Linus Torvalds
3d1f337b3e Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (235 commits)
  [NETFILTER]: Add H.323 conntrack/NAT helper
  [TG3]: Don't mark tg3_test_registers() as returning const.
  [IPV6]: Cleanups for net/ipv6/addrconf.c (kzalloc, early exit) v2
  [IPV6]: Nearly complete kzalloc cleanup for net/ipv6
  [IPV6]: Cleanup of net/ipv6/reassambly.c
  [BRIDGE]: Remove duplicate const from is_link_local() argument type.
  [DECNET]: net/decnet/dn_route.c: fix inconsequent NULL checking
  [TG3]: make drivers/net/tg3.c:tg3_request_irq() static
  [BRIDGE]: use LLC to send STP
  [LLC]: llc_mac_hdr_init const arguments
  [BRIDGE]: allow show/store of group multicast address
  [BRIDGE]: use llc for receiving STP packets
  [BRIDGE]: stp timer to jiffies cleanup
  [BRIDGE]: forwarding remove unneeded preempt and bh diasables
  [BRIDGE]: netfilter inline cleanup
  [BRIDGE]: netfilter VLAN macro cleanup
  [BRIDGE]: netfilter dont use __constant_htons
  [BRIDGE]: netfilter whitespace
  [BRIDGE]: optimize frame pass up
  [BRIDGE]: use kzalloc
  ...
2006-03-21 09:31:48 -08:00
Michael S. Tsirkin
c5ecd62c25 [NET]: Move destructor from neigh->ops to neigh_params
struct neigh_ops currently has a destructor field, which no in-kernel
drivers outside of infiniband use.  The infiniband/ulp/ipoib in-tree
driver stashes some info in the neighbour structure (the results of
the second-stage lookup from ARP results to real link-level path), and
it uses neigh->ops->destructor to get a callback so it can clean up
this extra info when a neighbour is freed.  We've run into problems
with this: since the destructor is in an ops field that is shared
between neighbours that may belong to different net devices, there's
no way to set/clear it safely.

The following patch moves this field to neigh_parms where it can be
safely set, together with its twin neigh_setup.  Two additional
patches in the patch series update ipoib to use this new interface.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:25:41 -08:00
Roland Dreier
bfef73fa78 IPoIB: Get rid of useless test of queue length
In neigh_add_path(), the queue of delayed packets can never be full,
because the queue is always freshly created and cannot be found by any
other code path.  In fact, the test of the queue length is worse than
useless: if somehow the test ever triggered and path_rec_start() also
failed, then dev_kfree_skb_any() will be called twice on the same skb.
Fix this by deleting the useless test.  Pointed out by Michael
S. Tsirkin <mst@mellanox.co.il>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:26 -08:00
Jack Morgenstein
0b3ea0829c IPoIB: Move ipoib_ib_dev_flush() to ipoib workqueue
Move ipoib_ib_dev_flush() to ipoib's workqueue.  This keeps it ordered
with respect to other work scheduled by the ipoib driver.  This fixes
problems with races, for example:
 - ipoib_ib_dev_flush() has started running because of an IB event
 - user does ifconfig ib0 down
 - ipoib_mcast_stop_thread() gets called twice and waits for the same
   completion twice

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:24 -08:00
Michael S. Tsirkin
47f7a0714b IPoIB: Make sure path is fully initialized before using it
The SA path record query completion can initialize path->pathrec.dlid
before IPoIB's callback runs and initializes path->ah, so we must test
ah rather than dlid.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-17 09:22:05 -08:00
Ingo Molnar
95ed644fd1 IB: convert from semaphores to mutexes
semaphore to mutex conversion by Ingo and Arjan's script.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
[ Sanity-checked on real IB hardware ]
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-13 14:51:39 -08:00
Arnaldo Carvalho de Melo
14c850212e [INET_SOCK]: Move struct inet_sock & helper functions to net/inet_sock.h
To help in reducing the number of include dependencies, several files were
touched as they were getting needed headers indirectly for stuff they use.

Thanks also to Alan Menegotto for pointing out that net/dccp/proto.c had
linux/dccp.h include twice.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:21 -08:00
Roland Dreier
267ee88ed3 IPoIB: fix error handling in ipoib_open
If ipoib_ib_dev_up() fails after ipoib_ib_dev_open() is called, then
ipoib_ib_dev_stop() needs to be called to clean up.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-29 10:55:58 -08:00
Roland Dreier
5872a9fc28 IPoIB: always set path->query to NULL when query finishes
Always set path->query to NULL when the SA path record query
completes, rather than only when we don't have an address handle.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-29 10:13:54 -08:00
Roland Dreier
65c7eddaba IPoIB: reinitialize path struct's completion for every query
It's possible that IPoIB will issue multiple SA queries for the same
path struct.  Therefore the struct's completion needs to be
initialized for each query rather than only once when the struct is
allocated, or else we might not wait long enough for later queries to
finish and free the path struct too soon.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-28 21:20:34 -08:00
Roland Dreier
1732b0ef3b [IPoIB] add path record information in debugfs
Add ibX_path files to debugfs that contain information about the IPoIB
path cache.  IPoIB ARP only gives GIDs, which the IPoIB driver must
resolve to real IB paths through the ib_sa module.  For debugging,
when the ARP table looks OK but traffic isn't flowing, it's useful to
be able to see if the resolution from GID to path worked.

Also clean up the formatting of the existing _mcg debugfs files.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-10 10:22:49 -08:00
Roland Dreier
21a384897d [IPoIB] remove unneeded initializations to 0
Shrink our source and .text a little by removing a few assignments of
NULL and 0 to memory that is already cleared as part of the allocation.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-02 10:07:59 -08:00
Roland Dreier
de6eb66b56 [IB] kzalloc() conversions
Replace kmalloc()+memset(,0,) with kzalloc(), for a net savings of 35
source lines and about 500 bytes of text.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-02 07:23:14 -08:00
Roland Dreier
a20583a7c2 [IPoIB] use spin_trylock_irqsave()
Use spin_trylock_irqsave() in ipoib_start_xmit() instead of
reinventing it out of local_irq_save(), spin_trylock() and
local_irq_restore().

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-29 13:54:40 -07:00
Roland Dreier
1993d683f3 [IPoIB] Drop RX packets when out of memory
Change the way IPoIB handles RX packets when it can't allocate a new
receive skbuff.  If the allocation of a new receive skb fails, we now
drop the packet we just received and repost the original receive skb.
This means that the receive ring always stays full and we don't have
to monkey around with trying to schedule a refill task for later.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-28 15:30:34 -07:00
Roland Dreier
4b2d319b53 [IPoIB] Improve ipoib_timeout() output
Use jiffies_to_msecs() so we print a human-readable time so
we don't have to worry about what HZ is configured to, and
print out a few values to make post-mortem analysis easier.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-18 12:20:06 -07:00
Roland Dreier
d70ed6075f [IPoIB] Rename IPoIB's path_lookup() to avoid name clashes
Rename IPoIB driver's path_lookup() to ipoib_path_lookup() to avoid a
clashes with the kernel global path_lookup().  We don't hit this with
the current kernel source, but some external patches seem to trigger
this, and it's cleaner to avoid clashing with global names anyway.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
refs/heads/for-linus
2005-09-28 19:56:57 -07:00
Michael S. Tsirkin
51574e0398 [PATCH] IPoIB: fix module removal race
Since ipoib uses queue_delayed_work to run flush task on port state events,
it must flush scheduled work after unregistering the event handler.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-09-18 22:02:37 -07:00
Michael S. Tsirkin
06c56e44f3 [PATCH] IPoIB: fix memory leak
Fix IPoIB memory leak on device removal.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-09-07 09:48:52 -07:00
Michael S. Tsirkin
1ad62a19f1 [PATCH] IPoIB: Fix device removal race
Currently we may have work scheduled in default kernel workqueue when
the device is going down.  The device could get freed before this
workqueue gets serviced.  I am actually seeing this causing system
hangs.

The following patch fixes this by using ipoib_workqueue which gets
flushed when the device is going down.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-08-26 20:37:38 -07:00
Roland Dreier
4ce059378c [PATCH] IPoIB: Set full membership bit in P_Keys
Always make sure that the full membership bit is set in the P_Keys
that IPoIB uses.  This makes sure that all hosts join the correct
multicast groups so that hosts that are partial partition members
can talk to the rest of the network.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-08-26 20:37:37 -07:00
Olaf Hering
2aeba9a03b [PATCH] IB: Remove unnecessary includes of <linux/version.h>
changing CONFIG_LOCALVERSION rebuilds too much, for no appearent reason.
Remove unneeded includes of <linux/version.h>.

Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-08-26 20:37:36 -07:00
Sean Hefty
97f52eb438 [PATCH] IB: sparse endianness cleanup
Fix sparse warnings.  Use __be* where appropriate.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-08-26 20:37:35 -07:00
Hal Rosenstock
92a6b34bf4 [PATCH] IB: Eliminate redundant NULL checks
IPoIB: Eliminate NULL checks prior to calling kfree

Signed-off-by: Hal Rosenstock <halr@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-08-26 20:37:35 -07:00
Roland Dreier
2a1d9b7f09 [PATCH] IB: Add copyright notices
Make some lawyers happy and add copyright notices for people who
forgot to include them when they actually touched the code.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-08-26 20:37:35 -07:00
Hal Rosenstock
0dca0f7bf8 [PATCH] [IPoIB] Handle sending of unicast RARP responses
RARP replies are another valid case where IPoIB may need to send a
unicast packet with no neighbour structure.

Signed-off-by: Hal Rosenstock <halr@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-07-28 13:17:26 -07:00
Roland Dreier
9adec1a808 [PATCH] IPoIB: convert to debugfs
Convert IPoIB to use debugfs instead of its own custom debugging filesystem.

Signed-off-by: Roland Dreier <roland@topspin.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16 15:26:07 -07:00
Roland Dreier
e6ded99cbb [PATCH] IPoIB: fix static rate calculation
Correct and simplify calculation of static rate.  We need to round up the
quotient of (local_rate - path_rate) / path_rate.  To round up we add
(path_rate - 1) to the numerator, so the quotient simplifies to (local_rate -
1) / path_rate.

No idea how I came up with the old formula.

Signed-off-by: Roland Dreier <roland@topspin.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16 15:26:06 -07:00
Linus Torvalds
1da177e4c3 Linux-2.6.12-rc2
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!
2005-04-16 15:20:36 -07:00