Now that the ktrace_enter() code is using atomics, the non-power-of-2
buffer sizes - which require modulus operations to get the index - are
showing up as using substantial CPU in the profiles.
Force the buffer sizes to be rounded up to the nearest power of two and
use masking rather than modulus operations to convert the index counter to
the buffer index. This reduces ktrace_enter overhead to 8% of a CPU time,
and again almost halves the trace intensive test runtime.
SGI-PV: 977546
SGI-Modid: xfs-linux-melb:xfs-kern:30538a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
ktrace_enter() is consuming vast amounts of CPU time due to the use of a
single global lock for protecting buffer index increments. Change it to
use per-buffer atomic counters - this reduces ktrace_enter() overhead
during a trace intensive test on a 4p machine from 58% of all CPU time to
12% and halves test runtime.
SGI-PV: 977546
SGI-Modid: xfs-linux-melb:xfs-kern:30537a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
XFS changes the c/mtime of an inode when truncating it to the same size.
The c/mtime is only supposed to change if the size is changed. Not to be
confused with ftruncate, where the c/mtime is supposed to be changed even
if the size is not changed.
The Linux VFS encodes this semantic difference in the flags it sends down
to ->setattr, which XFS currently ignores. We need to make XFS pay
attention to the VFS flags and hence Do The Right Thing.
SGI-PV: 977547
SGI-Modid: xfs-linux-melb:xfs-kern:30536a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
As Dave pointed out after the export ops changes we now always encode the
parent into the filehandle for regular files, but it's not actually needed
when the filesystem is export with no_subtree_check. This one-liner fixes
xfs_fs_encode_fh to skip encoding the parent unless nessecary.
SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30535a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
We can just use xfs_ilock/xfs_iunlock instead and get rid of the ugly
bhv_vrwlock_t.
SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30533a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Instead of of xfs_get_dir_entry use a macro to get the xfs_inode from the
dentry in the callers and grab the reference manually.
Only grab the reference once as it's fine to keep it over the dmapi calls.
(And even that reference is actually superflous in Linux but I'll leave
that for another patch)
SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30531a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Cleanup the unneeded intermediate vnode step in the flushing helpers and
go directly from the xfs_inode to the struct address_space.
SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30530a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
- use proper goto based unwinding instead of the current mess of
multiple conditionals
- rename ip to inode because that's the normal convention for Linux
inodes while ip is the convention for xfs_inodes
- remove unlikely checks for the default_acl - branches marked unlikely
might lead to extreme branch bredictor slowdons if taken and for some
workloads a default acl is quite common
- properly indent the switch statements
- remove xfs_has_fs_struct as nfsd has a fs_struct in any semi-recent
kernel
SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30529a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Now that we update the log tail LSN less frequently on transaction
completion, we pass the contention straight to the global log state lock
(l_iclog_lock) during transaction completion.
We currently have to take this lock to decrement the iclog reference
count. there is a reference count on each iclog, so we need to take þhe
global lock for all refcount changes.
When large numbers of processes are all doing small trnasctions, the iclog
reference counts will be quite high, and the state change that absolutely
requires the l_iclog_lock is the except rather than the norm.
Change the reference counting on the iclogs to use atomic_inc/dec so that
we can use atomic_dec_and_lock during transaction completion and avoid the
need for grabbing the l_iclog_lock for every reference count decrement
except the one that matters - the last.
SGI-PV: 975671
SGI-Modid: xfs-linux-melb:xfs-kern:30505a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
When hundreds of processors attempt to commit transactions at the same
time, they can contend on the AIL lock when updating the tail LSN held in
the in-core log structure.
At the moment, the tail LSN is only needed when actually writing out an
iclog, so it really does not need to be updated on every single
transaction completion - only those that result in switching iclogs and
flushing them to disk.
The result is that we reduce the number of times we need to grab the AIL
lock and the log grant lock by up to two orders of magnitude on large
processor count machines. The problem has previously been hidden by AIL
lock contention walking the AIL list which was recently solved and
uncovered this issue.
SGI-PV: 975671
SGI-Modid: xfs-linux-melb:xfs-kern:30504a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Remove open coded checks for the whether the inode is clean and replace
them with an inlined function.
SGI-PV: 977461
SGI-Modid: xfs-linux-melb:xfs-kern:30503a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Remove the xfs_icluster structure and replace with a radix tree lookup.
We don't need to keep a list of inodes in each cluster around anymore as
we can look them up quickly when we need to. The only time we need to do
this now is during inode writeback.
Factor the inode cluster writeback code out of xfs_iflush and convert it
to use radix_tree_gang_lookup() instead of walking a list of inodes built
when we first read in the inodes.
This remove 3 pointers from each xfs_inode structure and the xfs_icluster
structure per inode cluster. Hence we reduce the cache footprint of the
xfs_inodes by between 5-10% depending on cluster sparseness.
To be truly efficient we need a radix_tree_gang_lookup_range() call to
stop searching once we are past the end of the cluster instead of trying
to find a full cluster's worth of inodes.
Before (ia64):
$ cat /sys/slab/xfs_inode/object_size 536
After:
$ cat /sys/slab/xfs_inode/object_size 512
SGI-PV: 977460
SGI-Modid: xfs-linux-melb:xfs-kern:30502a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
When pdflush is writing back inodes, it can get stuck on inode cluster
buffers that are currently under I/O. This occurs when we write data to
multiple inodes in the same inode cluster at the same time.
Effectively, delayed allocation marks the inode dirty during the data
writeback. Hence if the inode cluster was flushed during the writeback of
the first inode, the writeback of the second inode will block waiting for
the inode cluster write to complete before writing it again for the newly
dirtied inode.
Basically, we want to avoid this from happening so we don't block pdflush
and slow down all of writeback. Hence we introduce a non-blocking async
inode flush flag that pdflush uses. If this flag is set, we use
non-blocking operations (e.g. try locks) whereever we can to avoid
blocking or extra I/O being issued.
SGI-PV: 970925
SGI-Modid: xfs-linux-melb:xfs-kern:30501a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
The only difference between the functions is one passes an inode for the
lookup, the other passes an inode number. However, they don't do the same
validity checking or set all the same state on the buffer that is returned
yet they should.
Factor the functions into a common implementation.
SGI-PV: 970925
SGI-Modid: xfs-linux-melb:xfs-kern:30500a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Remove the xfs_refcache, it was only needed while we were still
building for 2.4 kernels.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30472a
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
On a forced shutdown, xfs_finish_reclaim() will skip flushing the inode.
If the inode flush lock is not already held and there is an outstanding
xfs_iflush_done() then we might free the inode prematurely. By acquiring
and releasing the flush lock we will synchronise with xfs_iflush_done().
SGI-PV: 909874
SGI-Modid: xfs-linux-melb:xfs-kern:30468a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
* Use ide_default_irq() instead of ide_init_default_irq() in
ide_generic host driver (so the correct IRQ is always set
regardless of CONFIG_PCI / CONFIG_BLK_DEV_IDEPCI).
* Remove no longer needed ide_init_default_irq() macro.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Make CONFIG_IDE_GENERIC depended on CONFIG_IDE_ARCH_OBSOLETE_DEFAULTS.
* Move default IDE ports setup from init_ide_data() to ide_generic.
* Use ide_init_port_hw() in ide_generic.
* Remove no longer needed CONFIG_IDE_ARCH_OBSOLETE_DEFAULTS.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
ide_init_default_irq() is always zero for CONFIG_PCI=y so hwif->irq
check in ide_hwif_configure() can be safely removed.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Do explicit port setup in legacy VLB host drivers instead of depending
on init_ide_data(). This way hwif->io_ports[] and hwif->irq are always
correctly set regardless of CONFIG_PCI / CONFIG_BLK_DEV_IDEPCI.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
These host drivers indirectly depend on CONFIG_IDE_ARCH_OBSOLETE_DEFAULTS=y
which is defined only on alpha, x86, ia64, m32r, mips and ppc32.
Moreover:
- on ia64 there is no ISA
- m32r is too new for VLB
- on ppc32 ISA is available only on PPC_CHRP (no default IDE ports)
and PPC_PREP (marked as BROKEN)
[ the common sense tells me that VLB was only used on x86 but there
are urban legends that one of these host drivers was needed on some
other arch - thus the extra care ]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Do explicit port setup instead of depending on init_ide_data().
This way hwif->io_ports[] and hwif->irq are always correctly set
regardless of CONFIG_PCI / CONFIG_BLK_DEV_IDEPCI.
While at it fix printk().
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
CONFIG_BLK_DEV_4DRIVES deserves its own host driver:
* Add drivers/ide/legacy/ide-4drives.c and move "4drives" support there.
* Add ide-4drives.o in the link order after all other legacy host
drivers enabled by "ide0=" options (they all are mutually exclusive).
* Make ide-4drives host driver probe itself for IDE devices instead of
indirectly depending on ide_generic host driver.
* Add "probe" module parameter to ide-4drives and update documentation.
v2:
* s/paramater/parameter/ in ide.txt. (Noticed by Randy Dunlap)
v3:
* s/ide_4drives.probe/ide-4drives.probe/ in help entry.
(Noticed by Sergei Shtylyov)
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
On PPC32 ide_init_default_irq() is non-zero only for PPLUS and PPC_PREP
(the latter marked as BROKEN currently) so this ifdef can be removed.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
It is always == '((base) + 0x206)' if CONFIG_IDE_ARCH_OBSOLETE_DEFAULTS=y
and it is not needed otherwise (arm, blackfin, parisc, ppc64, sh, sparc[64]).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Use CONFIG_IDE_ARCH_OBSOLETE_DEFAULTS instead of
CONFIG_IDE_ARCH_OBSOLETE_INIT in init_ide_data().
* Remove no longer needed CONFIG_IDE_ARCH_OBSOLETE_INIT.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add CONFIG_IDE_ARCH_OBSOLETE_DEFAULTS to drivers/ide/Kconfig and use
it instead of defining IDE_ARCH_OBSOLETE_DEFAULTS in <arch/ide.h>.
v2:
* Define ide_default_irq() in ide-probe.c/ns87415.c if not already defined
and drop defining ide_default_irq() for CONFIG_IDE_ARCH_OBSOLETE_DEFAULTS=n.
[ Thanks to Stephen Rothwell and David Miller for noticing the problem. ]
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
IDE PMAC host driver and all IDE PCI host drivers use pci_enable_device()
nowadays so the following quirk in pmac_pcibios_after_init() can be removed.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add special cases for pplus and prep to ide_default_{irq,io_base}()
(+ FIXMEs about the need to use IDE platform host driver instead).
* Remove no longer needed ppc_ide_md and struct ide_machdep_calls.
* Then remove <linux/ide.h> include from:
- arch/powerpc/kernel/setup_32.c
- arch/ppc/kernel/ppc_ksyms.c
- arch/ppc/kernel/setup.c
- arch/ppc/platforms/pplus.c
- arch/ppc/platforms/prep_setup.c
There should be no functional changes caused by this patch.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Call ide_init_default_irq() for pplus in init_ide_data().
* Remove no longer needed pplus_ide_init_hwif_ports().
There should be no functional changes caused by this patch.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add IDE_HFLAG_FORCE_LEGACY_IRQS host flag for Motorola-Sandpoint platform
to sl82c105 host driver.
* Disable ide_generic host driver in arch/ppc/configs/sandpoint_defconfig
and enable sl82c105 one.
* Remove ppc_ide_md hooks from arch/ppc/platforms/sandpoint.c - no need for
them (sl82c105 host driver takes care of all this setup).
* Then remove no longer needed <linux/ide.h> include.
* Also update arch/ppc/platforms/sandpoint.h.
Unfortunately (unlike lopec's case) sl82c105 host driver was not enabled
in defconfing so there is a funcionality change.
[ Not a big deal since sl82c105 is superior over ide_generic. ]
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add IDE_HFLAG_FORCE_LEGACY_IRQS host flag for Motorola-LoPEC platform
to sl82c105 host driver.
* Remove ppc_ide_md hooks from arch/ppc/platforms/lopec.c - no need for
them (sl82c105 host driver takes care of all this setup).
* Then remove no longer needed <linux/ide.h> include.
Looking at arch/ppc/configs/lopec_defconfig:
...
CONFIG_IDE_GENERIC=y
CONFIG_BLK_DEV_IDEPCI=y
# CONFIG_IDEPCI_SHARE_IRQ is not set
# CONFIG_BLK_DEV_OFFBOARD is not set
CONFIG_BLK_DEV_GENERIC=y
# CONFIG_BLK_DEV_OPTI621 is not set
CONFIG_BLK_DEV_SL82C105=y
...
there should be no functional changes unless somebody preferred to disable
sl82c105 host driver and use only ide_generic one (but why would anybody
want to do such thing :-).
PS It seems that lopec_defconfig hasn't been updated for ages but if somebody
is going to do it please look into disabling IDE_GENERIC and BLK_DEV_GENERIC
config options. Thanks.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Initialize IDE ports in mpc8xx_ide_probe().
* Remove m8xx_ide_init() and ppc_ide_md hooks - no need for them
(IDE mpc8xx host driver takes care of all this setup).
* Remove needless 'if (irq)' and 'if (data_port >= MAX_HWIFS)' checks
from m8xx_ide_init_hwif_ports().
* Remove 'ctrl_port' and 'irq' arguments from m8xx_ide_init_hwif_ports().
* Rename m8xx_ide_init_hwif_ports() to m8xx_ide_init_ports().
* Add __init tag to m8xx_ide_init_ports().
This patch fixes hwif->irq always being overriden to 0 (== auto-probe, is
this even working on PPC?) because of ide_init_default_irq() call in ide.c.
There should be no other functional changes.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Vitaly Bordug <vitb@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add pmac_ide_init_ports() helper and use it instead of
pmac_ide_init_hwif_ports().
* Remove ppc_ide_md hooks - no need for them
(IDE pmac host driver takes care of all this setup).
* Then remove no longer needed <linux/ide.h> include
from arch/powerpc/platforms/powermac/pmac.h.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
There are no "default" IDE ports on PPC4xx so ppc4xx_ide_init_hwif_ports() is
unnecessary, remove it. Also remove no longer needed <linux/ide.h> include.
There should be no functional changes caused by this patch.
Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Also remove now not needed <linux/ide.h> include.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Remove unused pmac_ide_{check_base,get_irq}() and pmac_find_ide_boot(),
then remove no longer needed ide_majors[] and pmac_ide_count.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
mv idefloppy_do_end_request -> idefloppy_end_request as is the case with ide-cd
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
This new struct unifies ide{-floppy,-tape,-scsi}'s view of a packet command. For now,
it represents the common denominator between the three drivers while adding driver-
specific members at the end of the struct which will be merged/simplified into the
generic ATAPI handling code in later steps, or removed completely.
Bart:
- move struct ide_atapi_pc outside of #ifdef/#endif CONFIG_IDE_PROC_FS
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Removing the atomic tests for pc's is unobjectionable. Since this driver will
probably go to /dev/null soon, the atomic tests for tape->flags are left in
place for there are some situations where they're needed (chrdev DSC handling,
low level pipeline operation and so on). While at it, rename all test/set flag
bit defines explicitly to *_FLAG_* for clarity.
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>