The set_multicast_list function may be called without holding the rtnl
mutex, resulting in races when changing the underlying device's promiscous
and allmulti state. Use the change_rx_mode hook, which is always invoked
under the rtnl.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method drivers currently use to synchronize multicast lists is not
very pretty:
- walk the multicast list
- search each entry on a copy of the previous list
- if new add to lower device
- walk the copy of the previous list
- search each entry on the current list
- if removed delete from lower device
- copy entire list
This patch adds a new field to struct dev_addr_list to store the
synchronization state and adds two helper functions for synchronization
and cleanup.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the set_multicast_list (and set_rx_mode) callbacks are
responsible for configuring the device according to the IFF_PROMISC,
IFF_MULTICAST and IFF_ALLMULTI flags and the mc_list (and uc_list in
case of set_rx_mode).
These callbacks can be invoked from BH context without the rtnl_mutex
by dev_mc_add/dev_mc_delete, which makes reading the device flags and
promiscous/allmulti count racy. For real hardware drivers that just
commit all changes to the hardware this is not a real problem since
the stack guarantees to call them for every change, so at least the
final call will not race and commit the correct configuration to the
hardware.
For software devices that want to synchronize promiscous and multicast
state to an underlying device however this can cause corruption of the
underlying device's flags or promisc/allmulti counts.
When the software device is concurrently put in promiscous or allmulti
mode while set_multicast_list is invoked from bottem half context, the
device might synchronize the change to the underlying device without
holding the rtnl_mutex, which races with concurrent changes to the
underlying device.
Add a dev->change_rx_flags hook that is invoked when any of the flags
that affect rx filtering change (under the rtnl_mutex), which allows
drivers to perform synchronization immediately and only synchronize
the address lists in set_multicast_list/set_rx_mode.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Subject: [patch] net/input: fix net/rfkill/rfkill-input.c bug on 64-bit systems
this recent commit:
commit cf4328cd94
Author: Ivo van Doorn <IvDoorn@gmail.com>
Date: Mon May 7 00:34:20 2007 -0700
[NET]: rfkill: add support for input key to control wireless radio
added this 64-bit bug:
....
unsigned int flags;
spin_lock_irqsave(&task->lock, flags);
....
irq 'flags' must be unsigned long, not unsigned int. The -rt tree has
strict checks about this on 64-bit so this triggered a build failure.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
With
dma-mapping-prevent-dma-dependent-code-from-linking-on.patch
scsi fails to build on !HAS_DMA architectures:
drivers/built-in.o(.text+0x20af6): In function `scsi_dma_map':
: undefined reference to `dma_map_sg'
drivers/built-in.o(.text+0x20b5c): In function `scsi_dma_unmap':
: undefined reference to `dma_unmap_sg'
I split those functions out into a new file. Builds on s390 and i386.
Move scsi_dma_{map,unmap} into scsi_lib_dma.c which is only build if
HAS_DMA is set.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch tidies up scsi_add_lun a bit. I rewrote the kerneldoc to match
the actual parameters, moved the check for RBC and MMC REPORT_LUN devices
away from the switch(), changed the setup of sdev->type to account for
BLIST_ISROM, moved the check for BLIST_NO_ULD_ATTACH further down in
the function, removed a bogus comment and fixed some whitespace issues.
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
remove printk, which triggers because of low scsi clock on SNI RMs
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- base address is now a physical address; no need to convert it
- remove not needed error printk in module init function
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
On Wed, 2007-06-06 at 11:55 -0700, David C Somayajulu wrote:
This patch fixes the code handling underrun and overrun conditions.
Also fixed coding style as per Mike Christie's advice.
Signed-off-by: David Somayajulu <david.somayajulu@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The Megaraid Mailbox driver uses a semaphore as mutex. Use the mutex API
instead of the (binary) semaphore.
Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com>
Acked-by: "Patro, Sumant" <Sumant.Patro@lsi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Adding Adaptec 51245 (16 port), 51645 (20 port) and 52445 (28 port)
Universal Serial RAID controllers to the aacraid documentation.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Following patch bump up the driver version reflecting NPIV addition to
the qla2xxx.
- version changed from 8.01.07-k7 to 8.02.00-k1.
Signed-off-by: Seokmann Ju <seokmann.ju@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Following patch adds support for NPIV (N-Port ID Virtualization) to the
qla2xxx.
- supported within switched-fabric topologies only.
- supports up to 63 virtual ports on each physical port.
Signed-off-by: Seokmann Ju <seokmann.ju@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The original implementation in stex_ys_commands() is inappropriate.
For xfer len information, we should use resid instead.
Signed-off-by: Ed Lin <ed.lin@promise.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The Brownie 1200U3P has the same problem with REPORT LUNS as the
1600U3P. Add it to the blacklist.
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- a couple of prints, they can use the accessors
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- The saved sg_count was a leftover from the time the driver was doing
dma mapping by himself. But now that scsi-ml is called for the mapping
it is not the drivers responsibility.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: G. Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
CONFIG_SCSI_FD_8xx no longer exists.
Apparently it was renamed to CONFIG_SCSI_SEAGATE, but the Makefile was
not correctly updated.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Sweep registered blkdev when scsi_register_driver has failed.
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_create_port':
drivers/scsi/lpfc/lpfc_init.c:1573: error: 'struct kobject' has no member named 'dentry'
Just remove the if check on this ... lpfc shouldn't be poking around
in kobject structures.
drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_pci_probe_one':
drivers/scsi/lpfc/lpfc_init.c:1723: warning: unused variable 'retval'
And remove the unused variable.
Cc: James Smart <James.Smart@Emulex.Com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch uses dma_map_sg with phba->pcidev->dev instead of
scsi_dma_map.
scsi_dma_map doesn't work for NPIV since fc_vport->dev isn't fully
initialized. check_addr() in arch/x86_64/kernel/pci-nommu.c leads to
the crash since dev->dma_mask is NULL.
For more details:
http://marc.info/?l=linux-scsi&m=118312448030633&w=2
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: James Smart <James.Smart@Emulex.Com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This is an addendum to:
commit a0b4f78f9a
Author: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
[SCSI] lpfc: convert to use the data buffer accessors
One place was missed in the merge
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: James Smart <James.Smart@Emulex.Com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Removes an obsolete method scsi_device_cancel which isn't being used
anywhere in the kernel.
Signed-off-by: Priyanka Gupta <priyankag@google.com>
Acked-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
umounting partitions after heavy activity would sometimes trigger a
segmentation violation. This fix appears to remove that problem.
Fix originally provided by Latchesar Ionkov.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
During reorganization, the mount time debug option was removed in favor
of module-load-time parameters. However, the mount time option is still
a useful for feature during debug and for user-fault isolation when the
module is compiled into the kernel.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
This patch expands the impact of the loose cache mode to allow for cached
metadata increasing the performance of directory listings and other metadata
read operations.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
If trans->write returns 0, p9_write_work goes through the error path, but
sets the error code to zero.
This patch sets the error code to EREMOTEIO if trans->write returns zero
value.
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
This patchset moves non-filesystem interfaces of v9fs from fs/9p to net/9p.
It moves the transport, packet marshalling and connection layers to net/9p
leaving only the VFS related files in fs/9p. This work is being done in
preparation for in-kernel 9p servers as well as alternate 9p clients (other
than VFS).
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
XFS filestreams functionality uses radix trees and the preload
functions. XFS can be built as a module and hence we need
radix_tree_preload() exported. radix_tree_preload_end() is a
static inline, so it doesn't need exporting.
Signed-Off-By: Dave Chinner <dgc@sgi.com>
Signed-Off-By: Tim Shimmin <tes@sgi.com>
* 32bit struct xfs_fsop_bulkreq has different size and layout of
members, no matter the alignment. Move the code out of the #else
branch (why was it there in the first place?). Define _32 variants of
the ioctl constants.
* 32bit struct xfs_bstat is different because of time_t and on
i386 because of different padding. Make xfs_bulkstat_one() accept a
custom "output formatter" in the private_data argument which takes care
of the xfs_bulkstat_one_compat() that takes care of the different
layout in the compat case.
* i386 struct xfs_inogrp has different padding.
Add a similar "output formatter" mecanism to xfs_inumbers().
SGI-PV: 967354
SGI-Modid: xfs-linux-melb:xfs-kern:29102a
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
32bit struct xfs_fsop_handlereq has different size and offsets (due to
pointers). TODO: case XFS_IOC_{FSSETDM,ATTRLIST,ATTRMULTI}_BY_HANDLE still
not handled.
SGI-PV: 967354
SGI-Modid: xfs-linux-melb:xfs-kern:29101a
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
i386 struct xfs_fsop_geom_v1 has no padding after the last member, so the
size is different.
SGI-PV: 967354
SGI-Modid: xfs-linux-melb:xfs-kern:29100a
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Remove the hardcoded "fnames" for tracing, and just embed them in tracing
macros via __FUNCTION__. Kills a lot of #ifdefs too.
SGI-PV: 967353
SGI-Modid: xfs-linux-melb:xfs-kern:29099a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Avoid using a special "zero inode" as the parent of the quota inode as
this can confuse the filestreams code into thinking the quota inode has a
parent. We do not want the quota inode to follow filestreams allocation
rules, so pass a NULL as the parent inode and detect this condition when
doing stream associations.
SGI-PV: 964469
SGI-Modid: xfs-linux-melb:xfs-kern:29098a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
In media spaces, video is often stored in a frame-per-file format. When
dealing with uncompressed realtime HD video streams in this format, it is
crucial that files do not get fragmented and that multiple files a placed
contiguously on disk.
When multiple streams are being ingested and played out at the same time,
it is critical that the filesystem does not cross the streams and
interleave them together as this creates seek and readahead cache miss
latency and prevents both ingest and playout from meeting frame rate
targets.
This patch set creates a "stream of files" concept into the allocator to
place all the data from a single stream contiguously on disk so that RAID
array readahead can be used effectively. Each additional stream gets
placed in different allocation groups within the filesystem, thereby
ensuring that we don't cross any streams. When an AG fills up, we select a
new AG for the stream that is not in use.
The core of the functionality is the stream tracking - each inode that we
create in a directory needs to be associated with the directories' stream.
Hence every time we create a file, we look up the directories' stream
object and associate the new file with that object.
Once we have a stream object for a file, we use the AG that the stream
object point to for allocations. If we can't allocate in that AG (e.g. it
is full) we move the entire stream to another AG. Other inodes in the same
stream are moved to the new AG on their next allocation (i.e. lazy
update).
Stream objects are kept in a cache and hold a reference on the inode.
Hence the inode cannot be reclaimed while there is an outstanding stream
reference. This means that on unlink we need to remove the stream
association and we also need to flush all the associations on certain
events that want to reclaim all unreferenced inodes (e.g. filesystem
freeze).
SGI-PV: 964469
SGI-Modid: xfs-linux-melb:xfs-kern:29096a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Appease gcc in regards to "warning: 'rtx' is used uninitialized in
this function".
SGI-PV: 907752
SGI-Modid: xfs-linux-melb:xfs-kern:29007a
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
A check for file_count is always a bad idea. Linux has the ->release
method to deal with cleanups on last close and ->flush is only for the
very rare case where we want to perform an operation on every drop of a
reference to a file struct.
This patch gets rid of vop_close and surrounding code in favour of simply
doing the page flushing from ->release.
SGI-PV: 966562
SGI-Modid: xfs-linux-melb:xfs-kern:28952a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
xfs_count_bits is only called once, and is then compared to 0. IOW, what
it really wants to know is, is the bitmap empty. This can be done more
simply, certainly.
SGI-PV: 966503
SGI-Modid: xfs-linux-melb:xfs-kern:28944a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
The remount readonly path can fail to writeback properly because we still
have active transactions after calling xfs_quiesce_fs(). Further
investigation shows that this path is broken in the same ways that the xfs
freeze path was broken so fix it the same way.
SGI-PV: 964464
SGI-Modid: xfs-linux-melb:xfs-kern:28869a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
During delayed allocation extent conversion or unwritten extent
conversion, we need to reserve some blocks for transactions reservations.
We need to reserve these blocks in case a btree split occurs and we need
to allocate some blocks.
Unfortunately, we've only ever reserved the number of data blocks we are
allocating, so in both the unwritten and delalloc case we can get ENOSPC
to the transaction reservation. This is bad because in both cases we
cannot report the failure to the writing application.
The fix is two-fold:
1 - leverage the reserved block infrastructure XFS already
has to reserve a small pool of blocks by default to allow
specially marked transactions to dip into when we are at
ENOSPC.
Default setting is min(5%, 1024 blocks).
2 - convert critical transaction reservations to be allowed
to dip into this pool. Spots changed are delalloc
conversion, unwritten extent conversion and growing a
filesystem at ENOSPC.
This also allows growing the filesytsem to succeed at ENOSPC.
SGI-PV: 964468
SGI-Modid: xfs-linux-melb:xfs-kern:28865a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
When we are unmounting the filesystem, we flush all the inodes to disk.
Unfortunately, if we have an inode cluster that has just been freed and
marked stale sitting in an incore log buffer (i.e. hasn't been flushed to
disk), it will be holding all the flush locks on the inodes in that
cluster.
xfs_iflush_all() which is called during unmount walks all the inodes
trying to reclaim them, and it doing so calls xfs_finish_reclaim() on each
inode. If the inode is dirty, if grabs the flush lock and flushes it.
Unfortunately, find dirty inodes that already have their flush lock held
and so we sleep.
At this point in the unmount process, we are running single-threaded.
There is nothing more that can push on the log to force the transaction
holding the inode flush locks to disk and hence we deadlock.
The fix is to issue a log force before flushing the inodes on unmount so
that all the flush locks will be released before we start flushing the
inodes.
SGI-PV: 964538
SGI-Modid: xfs-linux-melb:xfs-kern:28862a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>