Enhanced GPIO alternate functions descriptions,
taken from Intel PXA270 Developers Manual.
Signed-off-by: Robert Jarzmik <rjarzmik@free.fr>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Expose control of the PXA3xx 13MHz CLK_POUT pin via the clock API
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
There have been a few oopses caused by 'struct file's with NULL f_vfsmnts.
There was also a set of potentially missed mnt_want_write()s from
dentry_open() calls.
This patch provides a very simple debugging framework to catch these kinds of
bugs. It will WARN_ON() them, but should stop us from having any oopses or
mnt_writer count imbalances.
I'm quite convinced that this is a good thing because it found bugs in the
stuff I was working on as soon as I wrote it.
[hch: made it conditional on a debug option.
But it's still a little bit too ugly]
[hch: merged forced remount r/o fix from Dave and akpm's fix for the fix]
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Originally from: Herbert Poetzl <herbert@13thfloor.at>
This is the core of the read-only bind mount patch set.
Note that this does _not_ add a "ro" option directly to the bind mount
operation. If you require such a mount, you must first do the bind, then
follow it up with a 'mount -o remount,ro' operation:
If you wish to have a r/o bind mount of /foo on bar:
mount --bind /foo /bar
mount -o remount,ro /bar
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This is the real meat of the entire series. It actually
implements the tracking of the number of writers to a mount.
However, it causes scalability problems because there can be
hundreds of cpus doing open()/close() on files on the same mnt at
the same time. Even an atomic_t in the mnt has massive scalaing
problems because the cacheline gets so terribly contended.
This uses a statically-allocated percpu variable. All want/drop
operations are local to a cpu as long that cpu operates on the same
mount, and there are no writer count imbalances. Writer count
imbalances happen when a write is taken on one cpu, and released
on another, like when an open/close pair is performed on two
Upon a remount,ro request, all of the data from the percpu
variables is collected (expensive, but very rare) and we determine
if there are any outstanding writers to the mount.
I've written a little benchmark to sit in a loop for a couple of
seconds in several cpus in parallel doing open/write/close loops.
http://sr71.net/~dave/linux/openbench.c
The code in here is a a worst-possible case for this patch. It
does opens on a _pair_ of files in two different mounts in parallel.
This should cause my code to lose its "operate on the same mount"
optimization completely. This worst-case scenario causes a 3%
degredation in the benchmark.
I could probably get rid of even this 3%, but it would be more
complex than what I have here, and I think this is getting into
acceptable territory. In practice, I expect writing more than 3
bytes to a file, as well as disk I/O to mask any effects that this
has.
(To get rid of that 3%, we could have an #defined number of mounts
in the percpu variable. So, instead of a CPU getting operate only
on percpu data when it accesses only one mount, it could stay on
percpu data when it only accesses N or fewer mounts.)
[AV] merged fix for __clear_mnt_mount() stepping on freed vfsmount
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
If someone decides to demote a file from r/w to just
r/o, they can use this same code as __fput().
NFS does just that, and will use this in the next
patch.
AV: drop write access in __fput() only after we evict from file list.
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Cc: Erez Zadok <ezk@cs.sunysb.edu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J Bruce Fields" <bfields@fieldses.org>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This patch adds two function mnt_want_write() and mnt_drop_write(). These are
used like a lock pair around and fs operations that might cause a write to the
filesystem.
Before these can become useful, we must first cover each place in the VFS
where writes are performed with a want/drop pair. When that is complete, we
can actually introduce code that will safely check the counts before allowing
r/w<->r/o transitions to occur.
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
open_namei() will, in the future, need to take mount write counts
over its creation and truncation (via may_open()) operations. It
needs to keep these write counts until any potential filp that is
created gets __fput()'d.
This gets complicated in the error handling and becomes very murky
as to how far open_namei() actually got, and whether or not that
mount write count was taken. That makes it a bad interface.
All that the current do_filp_open() really does is allocate the
nameidata on the stack, then call open_namei().
So, this merges those two functions and moves filp_open() over
to namei.c so it can be close to its buddy: do_filp_open(). It
also gets a kerneldoc comment in the process.
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
None of these files use any of the functionality promised by
asm/semaphore.h. It's possible that they (or some user of them) rely
on it dragging in some unrelated header file, but I can't build all
these files, so we'll have to fix any build failures as they come up.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
security: fix up documentation for security_module_enable
Security: Introduce security= boot parameter
Audit: Final renamings and cleanup
SELinux: use new audit hooks, remove redundant exports
Audit: internally use the new LSM audit hooks
LSM/Audit: Introduce generic Audit LSM hooks
SELinux: remove redundant exports
Netlink: Use generic LSM hook
Audit: use new LSM hooks instead of SELinux exports
SELinux: setup new inode/ipc getsecid hooks
LSM: Introduce inode_getsecid and ipc_getsecid hooks
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.26: (1090 commits)
[NET]: Fix and allocate less memory for ->priv'less netdevices
[IPV6]: Fix dangling references on error in fib6_add().
[NETLABEL]: Fix NULL deref in netlbl_unlabel_staticlist_gen() if ifindex not found
[PKT_SCHED]: Fix datalen check in tcf_simp_init().
[INET]: Uninline the __inet_inherit_port call.
[INET]: Drop the inet_inherit_port() call.
SCTP: Initialize partial_bytes_acked to 0, when all of the data is acked.
[netdrvr] forcedeth: internal simplifications; changelog removal
phylib: factor out get_phy_id from within get_phy_device
PHY: add BCM5464 support to broadcom PHY driver
cxgb3: Fix __must_check warning with dev_dbg.
tc35815: Statistics cleanup
natsemi: fix MMIO for PPC 44x platforms
[TIPC]: Cleanup of TIPC reference table code
[TIPC]: Optimized initialization of TIPC reference table
[TIPC]: Remove inlining of reference table locking routines
e1000: convert uint16_t style integers to u16
ixgb: convert uint16_t style integers to u16
sb1000.c: make const arrays static
sb1000.c: stop inlining largish static functions
...
Add the security= boot parameter. This is done to avoid LSM
registration clashes in case of more than one bult-in module.
User can choose a security module to enable at boot. If no
security= boot parameter is specified, only the first LSM
asking for registration will be loaded. An invalid security
module name will be treated as if no module has been chosen.
LSM modules must check now if they are allowed to register
by calling security_module_enable(ops) first. Modify SELinux
and SMACK to do so.
Do not let SMACK register smackfs if it was not chosen on
boot. Smackfs assumes that smack hooks are registered and
the initial task security setup (swapper->security) is done.
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Rename the se_str and se_rule audit fields elements to
lsm_str and lsm_rule to avoid confusion.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Setup the new Audit LSM hooks for SELinux.
Remove the now redundant exported SELinux Audit interface.
Audit: Export 'audit_krule' and 'audit_field' to the public
since their internals are needed by the implementation of the
new LSM hook 'audit_rule_known'.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Introduce a generic Audit interface for security modules
by adding the following new LSM hooks:
audit_rule_init(field, op, rulestr, lsmrule)
audit_rule_known(krule)
audit_rule_match(secid, field, op, rule, actx)
audit_rule_free(rule)
Those hooks are only available if CONFIG_AUDIT is enabled.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Remove the following exported SELinux interfaces:
selinux_get_inode_sid(inode, sid)
selinux_get_ipc_sid(ipcp, sid)
selinux_get_task_sid(tsk, sid)
selinux_sid_to_string(sid, ctx, len)
They can be substitued with the following generic equivalents
respectively:
new LSM hook, inode_getsecid(inode, secid)
new LSM hook, ipc_getsecid*(ipcp, secid)
LSM hook, task_getsecid(tsk, secid)
LSM hook, sid_to_secctx(sid, ctx, len)
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Introduce inode_getsecid(inode, secid) and ipc_getsecid(ipcp, secid)
LSM hooks. These hooks will be used instead of similar exported
SELinux interfaces.
Let {inode,ipc,task}_getsecid hooks set the secid to 0 by default
if CONFIG_SECURITY is not defined or if the hook is set to
NULL (dummy). This is done to notify the caller that no valid
secid exists.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Reviewed-by: Paul Moore <paul.moore@hp.com>
This patch adds the base files for the PB1176 platform support.
Signed-off-by: Bahadir Balban <bahadir.balban@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch adds the base files for the PB11MPCore platform support.
Signed-off-by: Bahadir Balban <bahadir.balban@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch changes the IO_ADDRESS macro for the RealView platforms to
accomodate a wider range of physical addresses on PB11MPCore.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The upcoming PB11MPCore and PB1176 have different memory maps and some
of the definitions in platform.h are no longer common. This patch
moves them to the board-eb.h file and updates their usage in
realview_eb.c.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Since the PB1176 has different UART base addresses, this patch moves
the definitions form platorm.h to board-eb.h. It also modifies
uncompress.h to detect the platform type at run-time.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch moves the timer definitions from platform.h into board-eb.h
as they are different on PB11MPCore and PB1176. It also adds
timerX_va_base variables in core.c which are set by the
realview_eb_timer_init function before invoking realview_timer_init.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch moves the patch definitions into board-eb.h and
realview_eb.c (from core.c) as they are different on the PB11MPCore
and PB1176 platforms.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This is in preparation for the RealView PB11MPCore and PB1176 patches
which have different base addresses for the GIC.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch moves the SCU initialisation from __v6_setup to the
smp_prepare_cpus() function as it relies on platform-specific
settings. Changes to get_core_count() are mainly for allowing cleaner
code with the upcoming PB11MPCore patches.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch adds a prefetch abort handler similar to the data abort one
and renames the latter for consistency. Initial implementation by Paul
Brook with some renaming by Catalin Marinas.
Signed-off-by: Paul Brook <paul@codesourcery.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (137 commits)
[SCSI] iscsi: bidi support for iscsi_tcp
[SCSI] iscsi: bidi support at the generic libiscsi level
[SCSI] iscsi: extended cdb support
[SCSI] zfcp: Fix error handling for blocked unit for send FCP command
[SCSI] zfcp: Remove zfcp_erp_wait from slave destory handler to fix deadlock
[SCSI] zfcp: fix 31 bit compile warnings
[SCSI] bsg: no need to set BSG_F_BLOCK bit in bsg_complete_all_commands
[SCSI] bsg: remove minor in struct bsg_device
[SCSI] bsg: use better helper list functions
[SCSI] bsg: replace kobject_get with blk_get_queue
[SCSI] bsg: takes a ref to struct device in fops->open
[SCSI] qla1280: remove version check
[SCSI] libsas: fix endianness bug in sas_ata
[SCSI] zfcp: fix compiler warning caused by poking inside new semaphore (linux-next)
[SCSI] aacraid: Do not describe check_reset parameter with its value
[SCSI] aacraid: Fix down_interruptible() to check the return value
[SCSI] sun3_scsi_vme: add MODULE_LICENSE
[SCSI] st: rename flush_write_buffer()
[SCSI] tgt: use KMEM_CACHE macro
[SCSI] initio: fix big endian problems for auto request sense
...
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (49 commits)
[GFS2] fix assertion in log_refund()
[GFS2] fix GFP_KERNEL misuses
[GFS2] test for IS_ERR rather than 0
[GFS2] Invalidate cache at correct point
[GFS2] fs/gfs2/recovery.c: suppress warnings
[GFS2] Faster gfs2_bitfit algorithm
[GFS2] Streamline quota lock/check for no-quota case
[GFS2] Remove drop of module ref where not needed
[GFS2] gfs2_adjust_quota has broken unstuffing code
[GFS2] possible null pointer dereference fixup
[GFS2] Need to ensure that sector_t is 64bits for GFS2
[GFS2] re-support special inode
[GFS2] remove gfs2_dev_iops
[GFS2] fix file_system_type leak on gfs2meta mount
[GFS2] Allow bmap to allocate extents
[GFS2] Fix a page lock / glock deadlock
[GFS2] proper extern for gfs2/locking/dlm/mount.c:gdlm_ops
[GFS2] gfs2/ops_file.c should #include "ops_inode.h"
[GFS2] be*_add_cpu conversion
[GFS2] Fix bug where we called drop_bh incorrectly
...
Support for extended CDBs in iscsi.
All we need is to check if command spills over 16 bytes then allocate
an iscsi-extended-header for the leftovers.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Reviewed-by: Pete Wyckoff <pw@osc.edu>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch adds a MigoR specific header file. We may want to use a cpu
specific header file instead, but this will do for now.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Presently this only checks to see if an address is an RAM, but this
doesn't work with XIP, so just always return 1. Follows m68knommu.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add support for Solution Engine SH7721 board(MS7721RP01).
Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add KEYSC platform data for the Solution Engine 7722 board.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add a platform driver for the SuperH KEYSC block. The driver expects to get
mode, timing information and keypad layout from the board code as platform
data. The board code is resonsible for pin configuration.
Both sh7343 and sh7722 should be supported, but only the sh7722 processor has
been tested so far. SH_KEYSC_MODE_3 is yet to be tested.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
dmaengine: ack to flags: make use of the unused bits in the 'ack' field
iop-adma: remove the workaround for missed interrupts on iop3xx
async_tx: kill ->device_dependency_added
async_tx: fix multiple dependency submission
fsldma: Split the MPC83xx event from MPC85xx and refine irq codes.
fsldma: Remove CONFIG_FSL_DMA_SELFTEST, keep fsl_dma_self_test() running always.
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (79 commits)
ata-acpi: don't call _GTF for disabled drive
sata_mv add temporary 3 second init delay for SiliconImage PMs
sata_mv remove redundant edma init code
sata_mv add basic port multiplier support
sata_mv fix SOC flags, enable NCQ on SOC
sata_mv disable hotplug for now
sata_mv cosmetics
sata_mv hardreset rework
[libata] improve Kconfig help text for new PMP, SFF options
libata: make EH fail gracefully if no reset method is available
libata: Be a bit more slack about early devices
libata: cable logic
libata: move link onlineness check out of softreset methods
libata: kill dead code paths in reset path
pata_scc: fix build breakage
libata: make PMP support optional
libata: implement PMP helpers
libata: separate PMP support code from core code
libata: make SFF support optional
libata: don't use ap->ioaddr in non-SFF drivers
...
* git://git.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-hrt:
clocksource: make clocksource watchdog cycle through online CPUs
Documentation: move timer related documentation to a single place
clockevents: optimise tick_nohz_stop_sched_tick() a bit
locking: remove unused double_spin_lock()
hrtimers: simplify lockdep handling
timers: simplify lockdep handling
posix-timers: fix shadowed variables
timer_list: add annotations to workqueue.c
hrtimer: use nanosleep specific restart_block fields
hrtimer: add nanosleep specific restart_block member
* 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc:
Remove DEBUG_SEMAPHORE from Kconfig
Improve semaphore documentation
Simplify semaphore implementation
Add down_timeout and change ACPI to use it
Introduce down_killable()
Generic semaphore implementation
Add semaphore.h to kernel_lock.c
Fix quota.h includes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (104 commits)
IB/iser: Don't change itt endianness
IB/mlx4: Update module version and release date
IPoIB: Handle case when P_Key is deleted and re-added at same index
IB/iser: Release connection resources on RDMA_CM_EVENT_DEVICE_REMOVAL event
IB/mlx4: Fix incorrect comment
IB/mlx4: Fix race when detaching a QP from a multicast group
IB/ehca: Support all ibv_devinfo values in query_device() and query_port()
RDMA/nes: Free IRQ before killing tasklet
IB/mthca: Update module version and release date
IB/mlx4: Update QP state if query QP succeeds
IB/mthca: Update QP state if query QP succeeds
RDMA/amso1100: Add check for NULL reply_msg in c2_intr()
IB/mlx4: Add support for resizing CQs
IB/mlx4: Add support for modifying CQ moderation parameters
IPoIB: Support modifying IPoIB CQ event moderation
IB/core: Add support for modify CQ
IPoIB: Add basic ethtool support
mlx4_core: Increase max number of QPs to 128K
RDMA/amso1100: Add support for "send with invalidate" work requests
IB/core: Add support for "send with invalidate" work requests
...
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (36 commits)
[S390] Remove code duplication from monreader / dcssblk.
[S390] kernel: show last breaking-event-address on oops
[S390] lowcore: Change type of lowcores softirq_pending to __u32.
[S390] zcrypt: Comments and kernel-doc cleanup
[S390] uaccess: Always access the correct address space.
[S390] Fix a lot of sparse warnings.
[S390] Convert s390 to GENERIC_CLOCKEVENTS.
[S390] genirq/clockevents: move irq affinity prototypes/inlines to interrupt.h
[S390] Convert monitor calls to function calls.
[S390] qdio (new feature): enhancing info-retrieval from QDIO-adapters
[S390] replace remaining __FUNCTION__ occurrences
[S390] remove redundant display of free swap space in show_mem()
[S390] qdio: remove outdated developerworks link.
[S390] Add debug_register_mode() function to debug feature API
[S390] crypto: use more descriptive function names for init/exit routines.
[S390] switch sched_clock to store-clock-extended.
[S390] zcrypt: add support for large random numbers
[S390] hw_random: allow rng_dev_read() to return hardware errors.
[S390] Vertical cpu management.
[S390] cpu topology support for s390.
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
slub: No need for per node slab counters if !SLUB_DEBUG
slub: Move map/flag clearing to __free_slab
slub: Fixes to per cpu stat output in sysfs
slub: Deal with config variable dependencies
slub: Reduce #ifdef ZONE_DMA by moving kmalloc_caches_dma near dma logic
slub: Initialize per-cpu stats
64-bit powerpc processors can find the leftmost 1 bit in a 64-bit
doubleword in one instruction, so use that rather than using the
generic fls64(), which does two 32-bit fls() calls.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This deblats ~200 bytes when ipv6 and dccp are 'y'.
Besides, this will ease compilation issues for patches
I'm working on to make inet hash tables more scalable
wrt net namespaces.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
As I can see from the code, two places (tcp_v6_syn_recv_sock and
dccp_v6_request_recv_sock) that call this one already run with
BHs disabled, so it's safe to call __inet_inherit_port there.
Besides (in case I missed smth with code review) the calltrace
tcp_v6_syn_recv_sock
`- tcp_v4_syn_recv_sock
`- __inet_inherit_port
and the similar for DCCP are valid, but assumes BHs to be disabled.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds the low level irq tracing hooks to the powerpc architecture
needed to enable full lockdep functionality.
This is partly based on Johannes Berg's initial version. I removed
the asm trampoline that isn't needed (thus improving performance) and
modified all sorts of bits and pieces, reworking most of the assembly,
etc...
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This moves various definitions used all over the place to parse stack
frames to ptrace.h so only one definition is needed.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
A) It's not modified and so it can be made const. const is good.
B) If one has a function that was given a const pci_bus pointer and you
want to get a pointer to its pci_controller, you'll get a warning from gcc
when you use pci_bus_to_host(). This is the right way to stop that
warning.
Signed-off-by: Trent Piepho <tpiepho@freescale.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Powerpc and ppc have some code in their bitops.h that is exactly the
same as asm-generic/bitops/find.h. Include this header instead of the
private implementation.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Paul Mackerras <paulus@samba.org>
* Use ide_default_irq() instead of ide_init_default_irq() in
ide_generic host driver (so the correct IRQ is always set
regardless of CONFIG_PCI / CONFIG_BLK_DEV_IDEPCI).
* Remove no longer needed ide_init_default_irq() macro.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
It is always == '((base) + 0x206)' if CONFIG_IDE_ARCH_OBSOLETE_DEFAULTS=y
and it is not needed otherwise (arm, blackfin, parisc, ppc64, sh, sparc[64]).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add CONFIG_IDE_ARCH_OBSOLETE_DEFAULTS to drivers/ide/Kconfig and use
it instead of defining IDE_ARCH_OBSOLETE_DEFAULTS in <arch/ide.h>.
v2:
* Define ide_default_irq() in ide-probe.c/ns87415.c if not already defined
and drop defining ide_default_irq() for CONFIG_IDE_ARCH_OBSOLETE_DEFAULTS=n.
[ Thanks to Stephen Rothwell and David Miller for noticing the problem. ]
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add special cases for pplus and prep to ide_default_{irq,io_base}()
(+ FIXMEs about the need to use IDE platform host driver instead).
* Remove no longer needed ppc_ide_md and struct ide_machdep_calls.
* Then remove <linux/ide.h> include from:
- arch/powerpc/kernel/setup_32.c
- arch/ppc/kernel/ppc_ksyms.c
- arch/ppc/kernel/setup.c
- arch/ppc/platforms/pplus.c
- arch/ppc/platforms/prep_setup.c
There should be no functional changes caused by this patch.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
This new struct unifies ide{-floppy,-tape,-scsi}'s view of a packet command. For now,
it represents the common denominator between the three drivers while adding driver-
specific members at the end of the struct which will be merged/simplified into the
generic ATAPI handling code in later steps, or removed completely.
Bart:
- move struct ide_atapi_pc outside of #ifdef/#endif CONFIG_IDE_PROC_FS
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add IDE_{ALTSTATUS,IREASON,BCOUNTL,BCOUNTH}_OFFSET defines.
* Remove IDE_*_REG macros - this results in more readable
and slightly smaller code.
There should be no functional changes caused by this patch.
Cc: Borislav Petkov <petkovbb@gmail.com>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add ide_atapi_{discard_data,write_zeros} inline helpers to <linux/ide.h>
and use them instead of home-brewn helpers in ide-{floppy,tape,scsi}.
There should be no functional changes caused by this patch.
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
hdparm explicitely marks HDIO_[UNREGISTER,SCAN]_HWIF ioctls as DANGEROUS
and given the number of bugs we can assume that there are no real users:
* DMA has no chance of working because DMA resources are released by
ide_unregister() and they are never allocated again.
* Since ide_init_hwif_ports() is used for ->io_ports[] setup the ioctls
don't work for almost all hosts with "non-standard" (== non ISA-like)
layout of IDE taskfile registers (there is a lot of such host drivers).
* ide_port_init_devices() is not called when probing IDE devices so:
- drive->autotune is never set and IDE host/devices are not programmed
for the correct PIO/DMA transfer modes (=> possible data corruption)
- host specific I/O 32-bit and IRQ unmasking settings are not applied
(=> possible data corruption)
- host specific ->port_init_devs method is not called (=> no luck with
ht6560b, qd65xx and opti621 host drivers)
* ->rw_disk method is not preserved (=> no HPT3xxN chipsets support).
* ->serialized flag is not preserved (=> possible data corruption when
using icside, aec62xx (ATP850UF chipset), cmd640, cs5530, hpt366
(HPT3xxN chipsets), rz1000, sc1200, dtc2278 and ht6560b host drivers).
* ->ack_intr method is not preserved (=> needed by ide-cris, buddha,
gayle and macide host drivers).
* ->sata_scr[] and sata_misc[] is cleared by ide_unregister() and it
isn't initialized again (SiI3112 support needs them).
* To issue an ioctl() there need to be at least one IDE device present
in the system.
* ->cable_detect method is not preserved + it is not called when probing
IDE devices so cable detection is broken (however since DMA support is
also broken it doesn't really matter ;-).
* Some objects which may have already been freed in ide_unregister()
are restored by ide_hwif_restore() (i.e. ->hwgroup).
* ide_register_hw() may unregister unrelated IDE ports if free ide_hwifs[]
slot cannot be found.
* When IDE host drivers are modular unregistered port may be re-used by
different host driver that owned it first causing subtle bugs.
Since we now have a proper warm-plug support remove these ioctls,
then remove no longer needed:
- ide_register_hw() and ide_hwif_restore() functions
- 'init_default' and 'restore' arguments of ide_unregister()
- zeroeing of hwif->{dma,extra}_* fields in ide_unregister()
As an added bonus IDE core code size shrinks by ~3kB (x86-32).
v2:
* fix ide_unregister() arguments in cleanup_module() (Andrew Morton).
v3:
* fix ide_unregister() arguments in palm_bk3710.c.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add 'struct class ide_port_class' ('ide_port' class) and a 'struct
device *portdev' ('ide_port' class device) in ide_hwif_t.
* Register 'ide_port' class in ide_init() and unregister it in
cleanup_module().
* Create ->portdev in ide_register_port () and unregister it in
ide_unregister().
* Add "delete_devices" class device attribute for unregistering IDE devices
on a port and "scan" one for probing+registering IDE devices on a port.
* Add ide_sysfs_register_port() helper for registering "delete_devices"
and "scan" attributes with ->portdev. Call it in ide_device_add_all().
* Document IDE warm-plug support in Documentation/ide/warm-plug-howto.txt.
v2:
* Convert patch from using 'struct class_device' to use 'struct device'.
(thanks to Kay Sievers for doing it)
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
->busproc method is used by HDIO_SET_BUSSTATE ioctl but it has no chance
of working as intended (in 2.4.x days) because to issue an ioctl there
is a device node needed and:
- for BUSSTATE_TRISTATE+OFF it is too late (devices are already gone)
- for BUSSTATE_TRISTATE+ON it is too early (devices are not registered yet)
Just remove ->busproc method for now (it was only implemented by hpt366,
siimage and tc86c001 host drivers).
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Rework PowerMac media-bay support in such way that instead of
un/registering the IDE interface we un/register IDE devices:
* Add ide_port_scan() helper for probing+registerering devices on a port.
* Rename ide_port_unregister_devices() to __ide_port_unregister_devices().
* Add ide_port_unregister_devices() helper for unregistering devices on a port.
* Add 'ide_hwif_t *cd_port' to 'struct media_bay_info', pass 'hwif' instead
of hwif->index to media_bay_set_ide_infos() and use it to setup 'cd_port'.
* Use ide_port_unregister_devices() instead of ide_unregister()
and ide_port_scan() instead of ide_register_hw() in media_bay_step().
* Unexport ide_register_hw() and make it static.
v2:
* Fix build by adding <linux/ide.h> include to <asm-powerpc/mediabay.h>.
(Reported by Michael/Kamalesh/Andrew).
Cc: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
IDE devices need to be removed from /proc/ide/ _before_ being unregistered:
* Drop 'ide_hwif_t *hwif' argument from destroy_proc_ide_device()
and use drive->hwif instead.
* Rename destroy_proc_ide_device() to ide_proc_unregister_device().
* Call ide_proc_unregister_device() in drive_release_dev().
* Remove no longer needed destroy_proc_ide_drives().
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Use ide_find_port() instead of ide_deprecated_find_port() in bast-ide/
palm_bk3710/ide-cs/delkin_cb host drivers and in ide_register_hw().
* Remove no longer needed ide_deprecated_find_port().
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
This option is obsolete and can be removed safely.
It allows us to remove the pci_get_device_reverse() function from the
PCI core.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
'ack' is currently a simple integer that flags whether or not a client is done
touching fields in the given descriptor. It is effectively just a single bit
of information. Converting this to a flags parameter allows the other bits to
be put to use to control completion actions, like dma-unmap, and capture
results, like xor-zero-sum == 0.
Changes are one of:
1/ convert all open-coded ->ack manipulations to use async_tx_ack
and async_tx_test_ack.
2/ set the ack bit at prep time where possible
3/ make drivers store the flags at prep time
4/ add flags to the device_prep_dma_interrupt prototype
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
DMA drivers no longer need to be notified of dependency submission
events as async_tx_run_dependencies and async_tx_channel_switch will
handle the scheduling and execution of dependent operations.
[sfr@canb.auug.org.au: extend this for fsldma]
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Shrink struct dma_async_tx_descriptor and introduce
async_tx_channel_switch to properly inject a channel switch interrupt in
the descriptor stream. This simplifies the locking model as drivers no
longer need to handle dma_async_tx_descriptor.lock.
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Make PMP support optional by adding CONFIG_SATA_PMP and leaving out
libata-pmp.c if it isn't set. PMP helpers return constant values if
PMP support is not enabled and PMP declarations alias non-PMP
counterparts. This makes the compiler to leave out PMP related part
out and LLDs to use non-PMP counterparts automatically.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Implement helpers to test whether PMP is supported, attached and
determine pmp number to use when issuing SRST to a link. While at it,
move ata_is_host_link() so that it's together with the two new PMP
helpers.
This change simplifies LLDs and helps making PMP support optional.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Most of PMP support code is already in libata-pmp.c. All that are in
libata-core.c are sata_pmp_port_ops and EXPORTs. Move them to
libata-pmp.c. Also, collect PMP related prototypes and declarations
in header files and move them right above of SFF stuff.
This change is to make PMP support optional.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Now that SFF support is completely separated out from the core layer,
it can be made optional. Add CONFIG_ATA_SFF and let SFF drivers
depend on it. If CONFIG_ATA_SFF isn't set, all codes in libata-sff.c
and data structures for SFF support are disabled. This saves good
number of bytes for small systems.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Now that SFF assumptions are separated out from non-SFF reset
sequence, port_ops->sff_dev_select() is no longer necessary for
non-SFF controllers. Kill ata_noop_dev_select() and ->sff_dev_select
initialization from base and other non-SFF port_ops.
Signed-off-by: Tejun Heo <htejun@gmail.com>
ata_qc_complete_multiple() took @finish_qc and called it on every qc
before completing it. This was to give opportunity to update TF cache
before ata_qc_complete() tries to fill result_tf. Now that result TF
is a separate operation, this is no longer necessary.
Update sata_sil24, which was the only user of this mechanism, such
that it implements its own ops->qc_fill_rtf() and drop @finish_qc from
ata_qc_complete_multiple().
Signed-off-by: Tejun Heo <htejun@gmail.com>
On command completion, ata_qc_complete() directly called ops->tf_read
to fill qc->result_tf. This patch adds ops->qc_fill_rtf to replace
hardcoded ops->tf_read usage.
ata_sff_qc_fill_rtf() which uses ops->tf_read to fill result_tf is
implemented and set in ata_base_port_ops and other ops tables which
don't inherit from ata_base_port_ops, so this patch doesn't introduce
any behavior change.
ops->qc_fill_rtf() is similar to ops->sff_tf_read() but can only be
called when a command finishes. As some non-SFF controllers don't
have TF registers defined unless they're associated with in-flight
commands, this limited operation makes life easier for those drivers
and help lifting SFF assumptions from libata core layer.
Signed-off-by: Tejun Heo <htejun@gmail.com>
If PMP fan-out reset fails and SCR isn't accessible, PMP should be
reset. This used to be tested by sata_pmp_std_hardreset() and
communicated to EH by -ERESTART. However, this logic is generic and
doesn't really have much to do with specific hardreset implementation.
This patch moves SCR access failure detection logic to ata_eh_reset()
where it belongs. As this makes sata_pmp_std_hardreset() identical to
sata_std_hardreset(), the function is killed and replaced with the
standard method.
Signed-off-by: Tejun Heo <htejun@gmail.com>
SError used to be cleared in ->postreset. This has small hotplug race
condition. If a device is plugged in after reset is complete but
postreset hasn't run yet, its hotplug event gets lost when SError is
cleared. This patch makes sata_link_resume() clear SError. This
kills the race condition and makes a lot of sense as some PMP and host
PHYs don't work properly without SError cleared.
This change makes sata_pmp_std_{pre|post}_reset()'s unnecessary as
they become identical to ata_std counterparts. It also simplifies
sata_pmp_hardreset() and ahci_vt8251_hardreset().
Signed-off-by: Tejun Heo <htejun@gmail.com>
Implement sata_std_hardreset(), which simply wraps around
sata_link_hardreset(). sata_std_hardreset() becomes new standard
hardreset method for sata_port_ops and sata_sff_hardreset() moves from
ata_base_port_ops to ata_sff_port_ops, which is where it really
belongs.
ata_is_builtin_hardreset() is added so that both
ata_std_error_handler() and ata_sff_error_handler() skip both builtin
hardresets if SCR isn't accessible.
piix_sidpr_hardreset() in ata_piix.c is identical to
sata_std_hardreset() in functionality and got replaced with the
standard function.
Signed-off-by: Tejun Heo <htejun@gmail.com>
sata_sff_hardreset() contains link readiness wait logic which isn't
SFF specific. Move that part into sata_link_hardreset(), which now
takes two more parameters - @online and @check_ready. Both are
optional. The former is out parameter for link onlineness after
reset. The latter is used to wait for link readiness after hardreset.
Users of sata_link_hardreset() is updated to use new funtionality and
ahci_hardreset() is updated to use sata_link_hardreset() instead of
sata_sff_hardreset(). This doesn't really cause any behavior change.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Factor out waiting logic (which is common to all ATA controllers) from
ata_sff_wait_ready() into ata_wait_ready(). ata_wait_ready() takes
@check_ready function pointer and uses it to poll for readiness. This
allows non-SFF controllers to use ata_wait_ready() to wait for link
readiness.
This patch also implements ata_wait_after_reset() - generic version of
ata_sff_wait_after_reset() - using ata_wait_ready().
ata_sff_wait_ready() is reimplemented using ata_wait_ready() and
ata_sff_check_ready(). Functionality remains the same.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Previously, post-softreset readiness is waited as follows.
1. ata_sff_wait_after_reset() waits for 150ms and then for
ATA_TMOUT_FF_WAIT if status is 0xff and other conditions meet.
2. ata_bus_softreset() finishes with -ENODEV if status is still 0xff.
If not, continue to #3.
3. ata_bus_post_reset() waits readiness of dev0 and/or dev1 depending
on devmask using ata_sff_wait_ready().
And for post-hardreset readiness,
1. ata_sff_wait_after_reset() waits for 150ms and then for
ATA_TMOUT_FF_WAIT if status is 0xff and other conditions meet.
2. sata_sff_hardreset waits for device readiness using
ata_sff_wait_ready().
This patch merges and unifies post-reset readiness waits into
ata_sff_wait_ready() and ata_sff_wait_after_reset().
ATA_TMOUT_FF_WAIT handling is merged into ata_sff_wait_ready(). If TF
status is 0xff, link status is unknown and the port is SATA, it will
continue polling till ATA_TMOUT_FF_WAIT.
ata_sff_wait_after_reset() is updated to perform the following steps.
1. waits for 150ms.
2. waits for dev0 readiness using ata_sff_wait_ready(). Note that
this is done regardless of devmask, as ata_sff_wait_ready() handles
0xff status correctly, this preserves the original behavior except
that it may wait longer after softreset if link is online but
status is 0xff. This behavior change is very unlikely to cause any
actual difference and is intended. It brings softreset behavior to
that of hardreset.
3. waits for dev1 readiness just the same way ata_bus_post_reset() did.
Now both soft and hard resets call ata_sff_wait_after_reset() after
reset to wait for readiness after resets. As
ata_sff_wait_after_reset() contains calls to ->sff_dev_select(),
explicit call near the end of sata_sff_hardreset() is removed.
This change makes reset implementation simpler and more consistent.
While at it, make the magical 150ms wait post-reset wait duration a
constant and ata_sff_wait_ready() and ata_sff_wait_after_reset() take
@link instead of @ap. This is to make them consistent with other
reset helpers and ease core changes.
pata_scc is updated accordingly.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Separate out generic ATA portion from ata_sff_postreset() into
ata_std_postreset() and implement ata_sff_postreset() using the std
version.
ata_base_port_ops now has ata_std_postreset() for its postreset and
ata_sff_port_ops overrides it to ata_sff_postreset().
This change affects pdc_adma, ahci, sata_fsl and sata_sil24. pdc_adma
now specifies postreset to ata_sff_postreset() explicitly. sata_fsl
and sata_sil24 now use ata_std_postreset() which makes no difference
to them. ahci now calls ata_std_postreset() from its own postreset
method, which causes no behavior difference.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Separate out generic ATA portion from ata_sff_prereset() into
ata_std_prereset() and implement ata_sff_prereset() using the std
version. Waiting for device readiness is the only SFF specific part.
ata_base_port_ops now has ata_std_prereset() for its prereset and
ata_sff_port_ops overrides it to ata_sff_prereset(). This change can
affect pdc_adma, ahci, sata_fsl and sata_sil24. pdc_adma implements
its own prereset using ata_sff_prereset() and the rest has hardreset
and thus are unaffected by this change.
This change reflects real world situation. There is no generic way to
wait for device readiness for non-SFF controllers and some of them
don't have any mechanism for that. Non-sff drivers which don't have
hardreset should wrap ata_std_prereset() and wait for device readiness
itself but there's no such driver now and isn't likely to be popular
in the future either.
Signed-off-by: Tejun Heo <htejun@gmail.com>
->sff_irq_clear() is called only from SFF interrupt handler, so there
is no reason to initialize it for non-SFF controllers. Also,
ata_sff_irq_clear() can handle both BMDMA and non-BMDMA SFF
controllers.
This patch kills ata_noop_irq_clear() and removes it from base
port_ops and sets ->sff_irq_clear to ata_sff_irq_clear() in sff
port_ops instead of bmdma port_ops.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Add sff_ prefix to SFF specific port ops.
This rename is in preparation of separating SFF support out of libata
core layer. This patch strictly renames ops and doesn't introduce any
behavior difference.
Signed-off-by: Tejun Heo <htejun@gmail.com>
SFF functions have confusing names. Some have sff prefix, some have
bmdma, some std, some pci and some none. Unify the naming by...
* SFF functions which are common to both BMDMA and non-BMDMA are
prefixed with ata_sff_.
* SFF functions which are specific to BMDMA are prefixed with
ata_bmdma_.
* SFF functions which are specific to PCI but apply to both BMDMA and
non-BMDMA are prefixed with ata_pci_sff_.
* SFF functions which are specific to PCI and BMDMA are prefixed with
ata_pci_bmdma_.
* Drop generic prefixes from LLD specific routines. For example,
bfin_std_dev_select -> bfin_dev_select.
The following renames are noteworthy.
ata_qc_issue_prot() -> ata_sff_qc_issue()
ata_pci_default_filter() -> ata_bmdma_mode_filter()
ata_dev_try_classify() -> ata_sff_dev_classify()
This rename is in preparation of separating SFF support out of libata
core layer. This patch strictly renames functions and doesn't
introduce any behavior difference.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Currently whether a command should be retried after failure is
determined inside ata_eh_finish(). Add ATA_QCFLAG_RETRY and move the
logic into ata_eh_autopsy(). This makes things clearer and helps
extending retry determination logic.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ata_chk_status() just calls ops->check_status and it only adds
confusion with other status functions. Kill it.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ata_pci_default_filter() doesn't really have anything to do with PCI.
It's generally applicable to BMDMA controllers. Move it out of
CONFIG_PCI.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* Move SFF related functions from libata-core.c to libata-sff.c.
ata_[bmdma_]sff_port_ops, ata_devchk(), ata_dev_try_classify(),
ata_std_dev_select(), ata_tf_to_host(), ata_busy_sleep(),
ata_wait_after_reset(), ata_wait_ready(), ata_bus_post_reset(),
ata_bus_softreset(), ata_bus_reset(), ata_std_softreset(),
sata_std_hardreset(), ata_fill_sg(), ata_fill_sg_dumb(),
ata_qc_prep(), ata_dump_qc_prep(), ata_data_xfer(),
ata_data_xfer_noirq(), ata_pio_sector(), ata_pio_sectors(),
atapi_send_cdb(), __atapi_pio_bytes(), atapi_pio_bytes(),
ata_hsm_ok_in_wq(), ata_hsm_qc_complete(), ata_hsm_move(),
ata_pio_task(), ata_qc_issue_prot(), ata_host_intr(),
ata_interrupt(), ata_std_ports()
* Make ata_pio_queue_task() global as it's now called from
libata-sff.c.
* Move SFF related stuff in include/linux/libata.h and
drivers/ata/libata.h into one place. While at it, move timing
constants into the global enum definition and fortify comments a
bit.
This patch strictly moves stuff around and as such doesn't cause any
functional difference.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Currently reset methods are not specified directly in the
ata_port_operations table. If a LLD wants to use custom reset
methods, it should construct and use a error_handler which uses those
reset methods. It's done this way for two reasons.
First, the ops table already contained too many methods and adding
four more of them would noticeably increase the amount of necessary
boilerplate code all over low level drivers.
Second, as ->error_handler uses those reset methods, it can get
confusing. ie. By overriding ->error_handler, those reset ops can be
made useless making layering a bit hazy.
Now that ops table uses inheritance, the first problem doesn't exist
anymore. The second isn't completely solved but is relieved by
providing default values - most drivers can just override what it has
implemented and don't have to concern itself about higher level
callbacks. In fact, there currently is no driver which actually
modifies error handling behavior. Drivers which override
->error_handler just wraps the standard error handler only to prepare
the controller for EH. I don't think making ops layering strict has
any noticeable benefit.
This patch makes ->prereset, ->softreset, ->hardreset, ->postreset and
their PMP counterparts propoer ops. Default ops are provided in the
base ops tables and drivers are converted to override individual reset
methods instead of creating custom error_handler.
* ata_std_error_handler() doesn't use sata_std_hardreset() if SCRs
aren't accessible. sata_promise doesn't need to use separate
error_handlers for PATA and SATA anymore.
* softreset is broken for sata_inic162x and sata_sx4. As libata now
always prefers hardreset, this doesn't really matter but the ops are
forced to NULL using ATA_OP_NULL for documentation purpose.
* pata_hpt374 needs to use different prereset for the first and second
PCI functions. This used to be done by branching from
hpt374_error_handler(). The proper way to do this is to use
separate ops and port_info tables for each function. Converted.
Signed-off-by: Tejun Heo <htejun@gmail.com>
libata core layer doesn't care about sht or ->irq_handler. Those are
only of interest to the LLD during initialization. This is confusing
and has caused several drivers to have duplicate unused initializers
for these fields.
Currently only sata_nv uses these fields. Make sata_nv use
->private_data, which is supposed to carry LLD-specific information,
instead and kill ->sht and ->irq_handler. nv_pi_priv structure is
defined and struct literals are used to initialize private_data.
Notational overhead is negligible.
Signed-off-by: Tejun Heo <htejun@gmail.com>
port_info->private_data is currently used for two purposes - to record
private data about the port_info or to specify host->private_data to
use when allocating ata_host.
This overloading is confusing and counter-intuitive in that
port_info->private_data becomes host->private_data instead of
port->private_data. In addition, port_info and host don't correspond
to each other 1-to-1. Currently, the first non-NULL
port_info->private_data is used.
This patch makes port_info->private_data just be what it is -
private_data for the port_info where LLD can jot down extra info.
libata no longer sets host->private_data to the first non-NULL
port_info->private_data, @host_priv argument is added to
ata_pci_init_one() instead. LLDs which use ata_pci_init_one() can use
this argument to pass in pointer to host private data. LLDs which
don't should use init-register model anyway and can initialize
host->private_data directly.
Adding @host_priv instead of using init-register model for LLDs which
use ata_pci_init_one() is suggested by Alan Cox.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
ata_pci_init_one() is the only function which uses ops->irq_handler
and pi->sht. Other initialization functions take the same information
as arguments. This causes confusion and duplicate unused entries in
structures.
Make ata_pci_init_one() take sht as an argument and use ata_interrupt
implicitly. All current users use ata_interrupt and if different irq
handler is necessary open coding ata_pci_init_one() using
ata_prepare_sff_host() and ata_activate_sff_host can be done under ten
lines including error handling and driver which requires custom
interrupt handler is likely to require custom initialization anyway.
As ata_pci_init_one() was the last user of ops->irq_handler, this
patch also kills the field.
Signed-off-by: Tejun Heo <htejun@gmail.com>
libata lets low level drivers build ata_port_operations table and
register it with libata core layer. This allows low level drivers
high level of flexibility but also burdens them with lots of
boilerplate entries.
This becomes worse for drivers which support related similar
controllers which differ slightly. They share most of the operations
except for a few. However, the driver still needs to list all
operations for each variant. This results in large number of
duplicate entries, which is not only inefficient but also error-prone
as it becomes very difficult to tell what the actual differences are.
This duplicate boilerplates all over the low level drivers also make
updating the core layer exteremely difficult and error-prone. When
compounded with multi-branched development model, it ends up
accumulating inconsistencies over time. Some of those inconsistencies
cause immediate problems and fixed. Others just remain there dormant
making maintenance increasingly difficult.
To rectify the problem, this patch implements ata_port_operations
inheritance. To allow LLDs to easily re-use their own ops tables
overriding only specific methods, this patch implements poor man's
class inheritance. An ops table has ->inherits field which can be set
to any ops table as long as it doesn't create a loop. When the host
is started, the inheritance chain is followed and any operation which
isn't specified is taken from the nearest ancestor which has it
specified. This operation is called finalization and done only once
per an ops table and the LLD doesn't have to do anything special about
it other than making the ops table non-const such that libata can
update it.
libata provides four base ops tables lower drivers can inherit from -
base, sata, pmp, sff and bmdma. To avoid overriding these ops
accidentaly, these ops are declared const and LLDs should always
inherit these instead of using them directly.
After finalization, all the ops table are identical before and after
the patch except for setting .irq_handler to ata_interrupt in drivers
which didn't use to. The .irq_handler doesn't have any actual effect
and the field will soon be removed by later patch.
* sata_sx4 is still using old style EH and currently doesn't take
advantage of ops inheritance.
Signed-off-by: Tejun Heo <htejun@gmail.com>
libata lets low level drivers build scsi_host_template and register it
to the SCSI layer. This allows low level drivers high level of
flexibility but also burdens them with lots of boilerplate entries.
This patch implements SHT initializers which can be used to initialize
all the boilerplate entries in a sht. Three variants of them are
implemented - BASE, BMDMA and NCQ - for different types of drivers.
Note that entries can be overriden by putting individual initializers
after the helper macro.
All sht tables are identical before and after this patch.
Signed-off-by: Tejun Heo <htejun@gmail.com>
->irq_clear() is used to clear IRQ bit of a SFF controller and isn't
useful for drivers which don't use libata SFF HSM implementation.
However, it's a required callback and many drivers implement their own
noop version as placeholder. This patch implements ata_noop_irq_clear
and use it to replace those custom placeholders.
Also, SFF drivers which don't support BMDMA don't need to use
ata_bmdma_irq_clear(). It becomes noop if BMDMA address isn't
initialized. Convert them to use ata_noop_irq_clear().
Signed-off-by: Tejun Heo <htejun@gmail.com>
Over the time, ops in ata_port_operations has become a bit confusing.
Reorganize. SFF/BMDMA ops are separated into separate a group as they
will be taken out of ata_port_operations later.
Signed-off-by: Tejun Heo <htejun@gmail.com>
ata_ehi_schedule_probe() was created to hide details of link-resuming
reset magic. Now that all the softreset workarounds are gone,
scheduling probe is very simple - set probe_mask and request RESET.
Kill ata_ehi_schedule_probe() and open code it. This also increases
consistency as ata_ehi_schedule_probe() couldn't cover individual
device probings so they were open-coded even when the helper existed.
While at it, define ATA_ALL_DEVICES as mask of all possible devices on
a link and always use it when requesting probe on link level for
simplicity and consistency. Setting extra bits in the probe_mask
doesn't hurt anybody.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Some controllers can't reliably record the initial D2H FIS after SATA
link is brought online for whatever reason. Advanced controllers
which don't have traditional TF register based interface often have
this problem as they don't really have the TF registers to update
while the controller and link are being initialized.
SKIP_D2H_BSY works around the problem by skipping the wait for device
readiness before issuing SRST, so for such controllers libata issues
SRST blindly and hopes for the best.
Now that libata defaults to hardreset, this workaround is no longer
necessary. For controllers which have support for hardreset, SRST is
never issued by itself. It is only issued as follow-up SRST for
device classification and PMP initialization, so there's no need to
wait for it from prereset.
Kill ATA_LFLAG_SKIP_D2H_BSY.
Signed-off-by: Tejun Heo <htejun@gmail.com>
ATA_EHI_RESUME_LINK has two functions - promote reset to hardreset if
ATA_LFLAG_HRST_TO_RESUME is set and preventing EH from shortcutting
reset action when probing is requested. The former is gone now and
the latter can easily be achieved by making EH to perform at least one
reset if reset is requested, which also makes more sense than
depending on RESUME_LINK flag.
As ATA_EHI_RESUME_LINK was the only EHI reset modifier, this also
kills reset modifier handling.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Now that hardreset is the preferred method of resetting, there's no
need for ATA_LFLAG_HRST_TO_RESUME flag. Kill it.
Signed-off-by: Tejun Heo <htejun@gmail.com>
When both soft and hard resets are available, libata preferred
softreset till now. The logic behind it was to be softer to devices;
however, this doesn't really help much. Rationales for the change:
* BIOS may freeze lock certain things during boot and softreset can't
unlock those. This by itself is okay but during operation PHY event
or other error conditions can trigger hardreset and the device may
end up with different configuration.
For example, after a hardreset, previously unlockable HPA can be
unlocked resulting in different device size and thus revalidation
failure. Similar condition can occur during or after resume.
* Certain ATAPI devices require hardreset to recover after certain
error conditions. On PATA, this is done by issuing the DEVICE RESET
command. On SATA, COMRESET has equivalent effect. The problem is
that DEVICE RESET needs its own execution protocol.
For SFF controllers with bare TF access, it can be easily
implemented but more advanced controllers (e.g. ahci and sata_sil24)
require specialized implementations. Simply using hardreset solves
the problem nicely.
* COMRESET initialization sequence is the norm in SATA land and many
SATA devices don't work properly if only SRST is used. For example,
some PMPs behave this way and libata works around by always issuing
hardreset if the host supports PMP.
Like the above example, libata has developed a number of mechanisms
aiming to promote softreset to hardreset if softreset is not going
to work. This approach is time consuming and error prone.
Also, note that, dependingon how you read the specs, it could be
argued that PMP fan-out ports require COMRESET to start operation.
In fact, all the PMPs on the market except one don't work properly
if COMRESET is not issued to fan-out ports after PMP reset.
* COMRESET is an integral part of SATA connection and any working
device should be able to handle COMRESET properly. After all, it's
the way to signal hardreset during reboot. This is the most used
and recommended (at least by the ahci spec) method of resetting
devices.
So, this patch makes libata prefer hardreset over softreset by making
the following changes.
* Rename ATA_EH_RESET_MASK to ATA_EH_RESET and use it whereever
ATA_EH_{SOFT|HARD}RESET used to be used. ATA_EH_{SOFT|HARD}RESET is
now only used to tell prereset whether soft or hard reset will be
issued.
* Strip out now unneeded promote-to-hardreset logics from
ata_eh_reset(), ata_std_prereset(), sata_pmp_std_prereset() and
other places.
Signed-off-by: Tejun Heo <htejun@gmail.com>
We were already doing what amounts to a get_phy_id from within
get_phy_device, and rather than duplicate this for the TBIPA
probing, we might as well just factor it out and make it available
instead.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
In order to not trip the clocksource watchdog, kgdb must touch the
clocksource watchdog on the return to normal system run state.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch fixes the hang regression with kgdb when the NMI interrupt
comes in while the master core is returning from an exception.
Adjust the NMI logic such that KGDB will not stop NMI exceptions from
occurring by in general returning NOTIFY_DONE. It is not possible to
distinguish the debug NMI sync vs the normal NMI apic interrupt so
kgdb needs to catch the unknown NMI if it the debugger was previously
active on one of the cpus.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
simplified and streamlined kgdb support on x86, both 32-bit and 64-bit,
based on patch from:
Subject: kgdb: core-lite
From: Jason Wessel <jason.wessel@windriver.com>
[ and countless other authors - see the patch for details. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
polled console handling support, to access a console in an irq-less
way while in debug or irq context.
absolutely zero impact as long as CONFIG_CONSOLE_POLL is disabled.
(which is the default)
[ jan.kiszka@siemens.com: lots of cleanups ]
[ mingo@elte.hu: redesign, splitups, cleanups. ]
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
kgdb core code. Handles the protocol and the arch details.
[ mingo@elte.hu: heavily modified, simplified and cleaned up. ]
[ xemul@openvz.org: use find_task_by_pid_ns ]
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
add probe_kernel_read() and probe_kernel_write().
Uninlined and restricted to kernel range memory only, as suggested
by Linus.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
This is a small addition of forgotten defines to regs-gpio.h include file for the Samsung S3C2410 ARM9 SoC
Signed-off-by: Davide Rizzo <davide@elpa.it>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
There seems to be some problem with at-least the S3C2440 and
bus traffic during an reset. It is unlikely, but still possible
that the system will hang in such a way that the watchdog cannot
get the system out of the state it is in.
Change to making the code that calls the watchdog reset run from
cached memory so that instruction fetches have quiesced before the
watchdog fires.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Fix the name of the S3C2412_CLKDIVN_ARMDIVN define.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add initial defines for the S3C2412's memory controller registers.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Move wakeup code to .c, so that video mode setting code can be shared
between boot and wakeup. Remove nasty assembly code in 64-bit case by
re-using trampoline code. Stack setup was fixed to clear high 16bits
of %esp, maybe that fixes some machines.
.c code sharing and morse code was done H. Peter Anvin, Sam Ravnborg
reviewed kbuild related stuff, and it seems okay to him. Rafael did
some cleanups.
[rjw:
* Made the patch stop breaking compilation on x86-32
* Added arch/x86/kernel/acpi/sleep.h
* Got rid of compiler warnings in arch/x86/kernel/acpi/sleep.c
* Fixed 32-bit compilation on x86-64 systems
* Added include/asm-x86/trampoline.h and fixed the non-SMP
compilation on 64-bit x86
* Removed arch/x86/kernel/acpi/sleep_32.c which was not used
* Fixed some breakage caused by the integration of smpboot.c done
under us in the meantime]
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cleanup references to the early cpu maps for the non-SMP configuration
and remove some functions called for SMP configurations only.
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
UV supports really big systems. So big, in fact, that the APICID register
does not contain enough bits to contain an APICID that is unique across all
cpus.
The UV BIOS supports 3 APICID modes:
- legacy mode. This mode uses the old APIC mode where
APICID is in bits [31:24] of the APICID register.
- x2apic mode. This mode is whitebox-compatible. APICIDs
are unique across all cpus. Standard x2apic APIC operations
(Intel-defined) can be used for IPIs. The node identifier
fits within the Intel-defined portion of the APICID register.
- x2apic-uv mode. In this mode, the APICIDs on each node have
unique IDs, but IDs on different node are not unique. For example,
if each mode has 32 cpus, the APICIDs on each node might be
0 - 31. Every node has the same set of IDs.
The UV hub is used to route IPIs/interrupts to the correct node.
Traditional APIC operations WILL NOT WORK.
In x2apic-uv mode, the ACPI tables all contain a full unique ID (note:
exact bit layout still changing but the following is close):
nnnnnnnnnnlc0cch
n = unique node number
l = socket number on board
c = core
h = hyperthread
Only the "lc0cch" bits are written to the APICID register. The remaining bits are
supplied by having the get_apic_id() function "OR" the extra bits into the value
read from the APICID register. (Hmmm.. why not keep the ENTIRE APICID register
in per-cpu data....)
The x2apic-uv mode is recognized by the MADT table containing:
oem_id = "SGI"
oem_table_id = "UV-X"
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add UV macros for converting between cpu numbers, blade numbers
and node numbers. Note that these are used ONLY within x86_64 UV
modules, and are not for general kernel use.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Definitions of UV MMRs.
Note: this file is auto-generated by hardware design tools.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Increase the number of bits in an apicid from 8 to 32.
By default, MP_processor_info() gets the APICID from the
mpc_config_processor structure. However, this structure limits
the size of APICID to 8 bits. This patch allows the caller of
MP_processor_info() to optionally pass a larger APICID that will
be used instead of the one in the mpc_config_processor struct.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add functions that can be used to determine if an x86_64
system is a SGI "UV" system. UV systems come in 3 types and
are identified by the OEM ID in the MADT.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Introduce a function to read the local APIC_ID.
This change is in preparation for additional changes to
the APICID functions that will come in a later patch.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch renames VM_MASK to X86_VM_MASK (which
in turn defined as alias to X86_EFLAGS_VM) to better
distinguish from virtual memory flags. We can't just
use X86_EFLAGS_VM instead because it is also used
for conditional compilation
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
A 1G section size makes memory hotplug too coarse in a virtual
environment. Retuce it by a factor of 2 to 512M. I would have liked
to make it smaller, but it runs out of reserved flags in the page flags.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Merge what's left from smp_32.h and smp_64.h into smp.h
By now, they're basically extern definitions.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
we merge everything that is inside CONFIG_SMP
to smp.h. They differ a little bit, so we use
CONFIG_X86_32_SMP and CONFIG_X86_64_SMP as markers.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This implementation in x86_64 is clean and consistent, but we
sacrifice it for the sake of being equal to i386 (since the other
way around would be harder).
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Although those constants are always defined in x86_64,
and will have the effect of just including the headers
in the very way we did before, I'm doing this in a separate
patch to be conservative and avoid surprises.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The code is now the same between i386 and x86_64. We already
know what happens when it reaches this point: They go away
from the arch-specific headers, and suddenly appears in the common
header.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We provide a bogus macro for x86_64 in case CONFIG_X86_LOCAL_APIC
is not set. It will always be set for x86_64, so the effect
is just to make the code equal to i386.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
APIC_DEFINITION is not defined in x86_64, so in practice, we keep
our old code here. But as a nice side effect, the code is now
equal to smp_32.h.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The new cacheflush.h API's didn't have any comments describing
how they're to be used yet and the conventions around these functions.
This patch adds comments to this effect; in order for that to be
a logical series, some prototypes had to move around.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Using a naked parameterless macro could lead to other tokens being
unexpectedly replaced.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When compilers became generally better at optimizing code than humans, the
register keyword became mostly useless. For the floppy driver it certainly
is since it's so slow compared to the rest of the system that optimizing
access to a single variable or two isn't going to make any real difference
So let's just leave it to the compiler - it'll do a better job anyway.
This patch does away with a few register keywords in the x86 floppy driver.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On AMD SMM protected memory is part of the address map, but handled
internally like an MTRR. That leads to large pages getting split
internally which has some performance implications. Check for the
AMD TSEG MSR and split the large page mapping on that area
explicitely if it is part of the direct mapping.
There is also SMM ASEG, but it is in the first 1MB and already covered by
the earlier split first page patch.
Idea for this came from an earlier patch by Andreas Herrmann
On a RevF dual Socket Opteron system kernbench shows a clear
improvement from this:
(together with the earlier patches in this series, especially the
split first 2MB patch)
[lower is better]
no split stddev split stddev delta
Elapsed Time 87.146 (0.727516) 84.296 (1.09098) -3.2%
User Time 274.537 (4.05226) 273.692 (3.34344) -0.3%
System Time 34.907 (0.42492) 34.508 (0.26832) -1.1%
Percent CPU 322.5 (38.3007) 326.5 (44.5128) +1.2%
=> About 3.2% improvement in elapsed time for kernbench.
With GB pages on AMD Fam1h the impact of splitting is much higher of course,
since it would split two full GB pages (together with the first
1MB split patch) instead of two 2MB pages. I could not benchmark
a clear difference in kernbench on gbpages, so I kept it disabled
for that case
That was only limited benchmarking of course, so if someone
was interested in running more tests for the gbpages case
that could be revisited (contributions welcome)
I didn't bother implementing this for 32bit because it is very
unlikely the 32bit lowmem mapping overlaps into the TSEG near 4GB
and the 2MB low split is already handled for both.
[ mingo@elte.hu: do it on gbpages kernels too, there's no clear reason
why it shouldnt help there. ]
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: andreas.herrmann3@amd.com
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
RDMSR for 64bit values with exception handling.
Makes it easier to deal with 64bit valued MSRs. The old 64bit code
base had that too as checking_rdmsrl(), but it got dropped somehow.
Signed-off-by: Andi Kleen <andi@firstfloor.org>
Cc: andreas.herrmann3@amd.com
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add a new function to force split large pages into 4k pages.
This is needed for some followup optimizations.
I had to add a new field to cpa_data to pass down the information
that try_preserve_large_page should not run.
Right now no set_page_4k() because I didn't need it and all the
specialized users I have in mind would be more comfortable with
pure addresses. I also didn't export it because it's unlikely
external code needs it.
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: andreas.herrmann3@amd.com
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When end_pfn is not aligned to 2MB (or 1GB) then the kernel might
map more memory than end_pfn. Account this in max_pfn_mapped.
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: andreas.herrmann3@amd.com
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Even on 32bit 2MB pages can map more memory than is in the true
max_low_pfn if end_pfn is not highmem and not aligned to 2MB.
Add a end_pfn_map similar to x86-64 that accounts for this
fact. This is important for code that really needs to know about
all mapping aliases.
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: andreas.herrmann3@amd.com
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All of early setup runs with interrupts disabled, so there is no
need to set up early exception handlers for vectors >= 32
This saves some minor text size.
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Some of pde bits weren't documented, add the short description to them.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>