* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: print out alloc information with KERN_DEBUG instead of KERN_INFO
kthread_work: make lockdep happy
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] fix up documentation for change in ->queuecommand to lockless calling
[SCSI] bfa: rename log_level to bfa_log_level
Low level drivers may behave differently depending on the current
link->lpm_policy. During ata_eh_set_lpm(), DIPM enable commands are
issued after the successful completion of ap->ops->set_lpm(), which
means that the controller is already in the target state. This causes
DIPM enable commands to be processed with mismatching controller power
state and link->lpm_policy value.
In ahci, link->lpm_policy is used to ignore certain PHY events if LPM
is enabled; however, as DIPM commands are issued with stale
link->lpm_policy, they sometimes end up triggering these conditions
and get aborted leading to LPM configuration failure.
Fix it by updating link->lpm_policy before issuing DIPM enable
commands.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Kyle McMartin <kyle@mcmartin.ca>
Cc: stable@kernel.org
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
ata_qc_complete() contains special handling for certain commands. For
example, it schedules EH for device revalidation after certain
configurations are changed. These shouldn't be applied to EH
commands but they were.
In most cases, it doesn't cause an actual problem because EH doesn't
issue any command which would trigger special handling; however, ACPI
can issue such commands via _GTF which can cause weird interactions.
Restructure ata_qc_complete() such that EH commands are always passed
on to __ata_qc_complete().
stable: Please apply to -stable only after 2.6.38 is released.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Kyle McMartin <kyle@mcmartin.ca>
Cc: stable@kernel.org
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Found by this build-error if BMDMA is disabled:
drivers/ata/pata_mpc52xx.c: In function 'mpc52xx_ata_init_one':
drivers/ata/pata_mpc52xx.c:662: error: 'ata_bmdma_interrupt' undeclared (first use in this function)
...
Move the Kconfig entry to the proper location as needed since
9a7780c9ac (libata-sff: make BMDMA optional)
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
pata_cs5536 does work on the other platforms(e.g. Loongson, a MIPS
variant), so, remove the dependency of X86_32 and fix the building
errors under the other platforms via only reserving the X86_32 specific
parts for X86_32.
pata_amd also supports cs5536 IDE controller, but this one saves about
33k for the compressed kernel image(vmlinuz for MIPS).
Signed-off-by: Zhang Le <r0bertz@gentoo.org>
Signed-off-by: Chen Jie <chenj@lemote.com>
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
While separating out BMDMA irq handler from SFF, commit c3b28894
(libata-sff: separate out BMDMA irq handler) incorrectly made
__ata_sff_port_intr() consider an IRQ to be an idle one if the host
state was transitioned to HSM_ST_ERR by ata_bmdma_port_intr().
This makes BMDMA drivers ignore IRQs reporting host bus error which
leads to timeouts instead of triggering EH immediately. Fix it by
making __ata_sff_port_intr() consider the IRQ to be an idle one iff
the state is HSM_ST_IDLE. This is equivalent to adding HSM_ST_ERR to
the "break"ing case but less error-prone.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Antonio Toma <antonio.toma@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Presently the root node is initialized by way of kzalloc on the parent
data structure, which by chance happens to do the bulk of what an
explicit initialization does with GFP_NOWAIT semantics. This however is
more by luck than by design, and as we ideally want to permit radix node
allocations access to the emergency pools anyways, add in the proper
initializer with the desired mask.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
If the irqsoff tracer is in use, stop tracing the interrupt disable
interval when returning to userspace. Tracing userspace execution time
as interrupts disabled time is not helpful for kernel performance
analysis purposes. Only do so if the irqsoff tracer is enabled, to
avoid overhead for lockdep, which doesn't care.
Signed-off-by: Todd Poynor <toddpoynor@google.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Adding in self as maintainer for Nomadik and Ux500, I'm running
an active -next tree for that stuff now. Extend file matchers to
cover a few more relevant drivers and add git references.
Cc: Alessandro Rubini <rubini@unipv.it>
Acked-by: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
... and also remove misleading comment stating that this header is
auto-generated.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Uwe Kleine-Knig <u.kleine-koenig@pengutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Or else we can't operate on the right address when the trans length
is greater than 65535.
Signed-off-by: Mingkai Hu <Mingkai.hu@freescale.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
The user must read N bytes of SPIRF (1 <= N <= 4) that do not exceed the
amount of data in the receive FIFO, so read the SPIRF byte by byte when
the data in receive FIFO is less than 4 bytes.
On Simics, when read N bytes that exceed the amount of data in receive
FIFO, we can't read the data out, that is we can't clear the rx FIFO,
then the CPU will loop on the espi rx interrupt.
Signed-off-by: Mingkai Hu <Mingkai.hu@freescale.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
If we are registering an i2c device that has a device tree node like
this real-world example:
rtc@68 {
compatible = "dallas,ds1337";
reg = <0x68>;
};
of_i2c_register_devices() will try to load a module called ds1337.ko.
There is no such module, so it will fail. If we look in modules.alias
we will find entries like these:
.
.
.
alias i2c:ds1339 rtc_ds1307
alias i2c:ds1338 rtc_ds1307
alias i2c:ds1337 rtc_ds1307
alias i2c:ds1307 rtc_ds1307
alias i2c:ds1374 rtc_ds1374
.
.
.
The module we want is really called rtc_ds1307.ko. If we request a
module called "i2c:ds1337", the userspace module loader will do the
right thing (unless it is busybox) and load rtc_ds1307.ko. So we add
the I2C_MODULE_PREFIX to the request_module() string.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
On my system with a radeon x2, the first GPU was not overlapping vesa
but the test decided it was.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The master clock initialization for SH7201 was wholly bogus. Users of the
legacy API must initialize the clock rate through the struct clk itself
rather than returning the clock frequency. Given that the init function
itself is void, returning the frequency isn't terribly effective.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Now that these have been introduced in to the vmalloc API, sync up the
nommu side of things. At present we don't deal with VMAs as such, so for
the time being these will simply BUG() out. In the future it should be
possible to support this interface by layering on top of the vm_regions.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
With some recent tidying of duplicate register definitions the se7206 IRQ
code broke:
arch/sh/boards/mach-se/7206/irq.c: error: 'INTC_ICR' undeclared (first use in this function)
arch/sh/boards/mach-se/7206/irq.c: error: (Each undeclared identifier is reported only once
arch/sh/boards/mach-se/7206/irq.c: error: for each function it appears in.)
Fix it up.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Some of the SH4-202 code was overlooked in the set_rate() API conversion,
resulting in:
arch/sh/kernel/cpu/sh4/clock-sh4-202.c: error: too many arguments to function 'clk->ops->set_rate'
Fix it up.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
ocfs2: Fix system inodes cache overflow.
ocfs2: Hold ip_lock when set/clear flags for indexed dir.
ocfs2: Adjust masklog flag values
Ocfs2: Teach 'coherency=full' O_DIRECT writes to correctly up_read i_alloc_sem.
ocfs2/dlm: Migrate lockres with no locks if it has a reference
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: hda - Fix GPIO2-fixup for Sony laptops
ALSA: hda - Try to find an empty control index when it's occupied
ALSA: hda - Fix conflict of d-mic capture volume controls
ALSA: hda - Don't apply ALC269-specific initialization to ALC275
ALSA: hda - Add fix-up for Sony VAIO with ALC275 codecs
ALSA: pcm: remember to always call va_end() on stuff that we va_start()
ALSA: HDA: Add auto-mute for Thinkpad SL410/SL510
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (21 commits)
[media] mceusb: set a default rx timeout
[media] mceusb: fix inverted mask inversion logic
[media] mceusb: add another Fintek device ID
[media] lirc_dev: fixes in lirc_dev_fop_read()
[media] lirc_dev: stray unlock in lirc_dev_fop_poll()
[media] rc: fix sysfs entry for mceusb and streamzap
[media] streamzap: merge timeout space with trailing space
[media] mceusb: fix keybouce issue after parser simplification
[media] IR: add tv power scancode to rc6 mce keymap
[media] mceusb: buffer parsing fixups for 1st-gen device
[media] mceusb: fix up reporting of trailing space
[media] nuvoton-cir: improve buffer parsing responsiveness
[media] mceusb: add support for Conexant Hybrid TV RDU253S
[media] s5p-fimc: Fix output DMA handling in S5PV310 IP revisions
[media] s5p-fimc: Use correct fourcc code for 32-bit RGB format
[media] s5p-fimc: Convert m2m driver to unlocked_ioctl
[media] s5p-fimc: Explicitly add required header file
[media] s5p-fimc: Fix vidioc_g_crop/cropcap on camera sensor
[media] s5p-fimc: BKL lock removal - compilation fix
[media] soc-camera: fix static build of the sh_mobile_csi2.c driver
...
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf probe: Fix to support libdwfl older than 0.148
perf tools: Fix lazy wildcard matching
perf buildid-list: Fix error return for success
perf buildid-cache: Fix symbolic link handling
perf symbols: Stop using vmlinux files with no symbols
perf probe: Fix use of kernel image path given by 'k' option
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, kexec: Limit the crashkernel address appropriately
In construct_alloc_key(), up_write() is called in the error path if
__key_link_begin() fails, but this is incorrect as __key_link_begin() only
returns with the nominated keyring locked if it returns successfully.
Without this patch, you might see the following in dmesg:
=====================================
[ BUG: bad unlock balance detected! ]
-------------------------------------
mount.cifs/5769 is trying to release lock (&key->sem) at:
[<ffffffff81201159>] request_key_and_link+0x263/0x3fc
but there are no more locks to release!
other info that might help us debug this:
3 locks held by mount.cifs/5769:
#0: (&type->s_umount_key#41/1){+.+.+.}, at: [<ffffffff81131321>] sget+0x278/0x3e7
#1: (&ret_buf->session_mutex){+.+.+.}, at: [<ffffffffa0258e59>] cifs_get_smb_ses+0x35a/0x443 [cifs]
#2: (root_key_user.cons_lock){+.+.+.}, at: [<ffffffff81201000>] request_key_and_link+0x10a/0x3fc
stack backtrace:
Pid: 5769, comm: mount.cifs Not tainted 2.6.37-rc6+ #1
Call Trace:
[<ffffffff81201159>] ? request_key_and_link+0x263/0x3fc
[<ffffffff81081601>] print_unlock_inbalance_bug+0xca/0xd5
[<ffffffff81083248>] lock_release_non_nested+0xc1/0x263
[<ffffffff81201159>] ? request_key_and_link+0x263/0x3fc
[<ffffffff81201159>] ? request_key_and_link+0x263/0x3fc
[<ffffffff81083567>] lock_release+0x17d/0x1a4
[<ffffffff81073f45>] up_write+0x23/0x3b
[<ffffffff81201159>] request_key_and_link+0x263/0x3fc
[<ffffffffa026fe9e>] ? cifs_get_spnego_key+0x61/0x21f [cifs]
[<ffffffff812013c5>] request_key+0x41/0x74
[<ffffffffa027003d>] cifs_get_spnego_key+0x200/0x21f [cifs]
[<ffffffffa026e296>] CIFS_SessSetup+0x55d/0x1273 [cifs]
[<ffffffffa02589e1>] cifs_setup_session+0x90/0x1ae [cifs]
[<ffffffffa0258e7e>] cifs_get_smb_ses+0x37f/0x443 [cifs]
[<ffffffffa025a9e3>] cifs_mount+0x1aa1/0x23f3 [cifs]
[<ffffffff8111fd94>] ? alloc_debug_processing+0xdb/0x120
[<ffffffffa027002c>] ? cifs_get_spnego_key+0x1ef/0x21f [cifs]
[<ffffffffa024cc71>] cifs_do_mount+0x165/0x2b3 [cifs]
[<ffffffff81130e72>] vfs_kern_mount+0xaf/0x1dc
[<ffffffff81131007>] do_kern_mount+0x4d/0xef
[<ffffffff811483b9>] do_mount+0x6f4/0x733
[<ffffffff8114861f>] sys_mount+0x88/0xc2
[<ffffffff8100ac42>] system_call_fastpath+0x16/0x1b
Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-and-Tested-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The default for non-READ_BACK GPIO regs is to have the clear bits set;
this means that our original errata fix was too simplistic. This
changes it to the following behavior:
- when setting GPIOs, ignore the higher order bits (they're for
clearing, we don't need to care about them).
- when clearing GPIOs, keep all the bits, but unset (via XOR) the
lower order bit that negates the clear bit that we care about. That
is, if we're clearing GPIO 26 (val = 0x04000000), we first XOR what's
currently in the register with 0x0400 (GPIO 26's SET bit), and then
OR that with the GPIO 26's CLEAR bit.
Tested-by: Daniel Drake <dsd@laptop.org>
Signed-off-by: Andres Salomon <dilinger@queued.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The edge detect status GPIOs function differently from the other atomic
model CS5536 GPIO registers; writing 1 to the high bits clears the GPIO,
but writing 1 to the lower bits also clears the bit.
This means that read-modify-write doesn't actually work for it, so don't
apply the errata here. If a negative edge status gets lost after
resume.. well, we tried our best!
Tested-by: Daniel Drake <dsd@laptop.org>
Signed-off-by: Andres Salomon <dilinger@queued.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If pcie_ports_disabled is set, pcie_port_service_register() returns
error code and select_detection_mode() should not attempt to
unregister dummy_driver and use dummy_slots. It should return
PCIEHP_DETECT_ACPI immediately instead.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This reverts commit 4465b46900.
Conflicts:
net/ipv4/fib_frontend.c
As reported by Ben Greear, this causes regressions:
> Change 4465b46900 caused rules
> to stop matching the input device properly because the
> FLOWI_FLAG_MATCH_ANY_IIF is always defined in ip_dev_find().
>
> This breaks rules such as:
>
> ip rule add pref 512 lookup local
> ip rule del pref 0 lookup local
> ip link set eth2 up
> ip -4 addr add 172.16.0.102/24 broadcast 172.16.0.255 dev eth2
> ip rule add to 172.16.0.102 iif eth2 lookup local pref 10
> ip rule add iif eth2 lookup 10001 pref 20
> ip route add 172.16.0.0/24 dev eth2 table 10001
> ip route add unreachable 0/0 table 10001
>
> If you had a second interface 'eth0' that was on a different
> subnet, pinging a system on that interface would fail:
>
> [root@ct503-60 ~]# ping 192.168.100.1
> connect: Invalid argument
Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
https://bugzilla.kernel.org/show_bug.cgi?id=25352
This regression was caused by commit a31437b85: "ext4: use
sb_issue_zeroout in setup_new_group_blocks", by accidentally dropping
the code which reserved the block group descriptor and inode table
blocks.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fix build errors like these (from a randconfig and my defconfig for a custom board):
src/arch/powerpc/platforms/52xx/mpc52xx_gpt.c:549: error: dereferencing pointer to incomplete type: 1 errors in 1 logs
src/arch/powerpc/platforms/52xx/mpc52xx_gpt.c:636: error: implicit declaration of function 'nonseekable_open': 1 errors in 1 logs
src/arch/powerpc/platforms/52xx/mpc52xx_gpt.c:657: error: variable 'mpc52xx_wdt_fops' has initializer but incomplete type: 1 errors in 1 logs
src/arch/powerpc/platforms/52xx/mpc52xx_gpt.c:658: error: excess elements in struct initializer: 1 errors in 1 logs
src/arch/powerpc/platforms/52xx/mpc52xx_gpt.c:658: error: unknown field 'owner' specified in initializer: 1 errors in 1 logs
...
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
The original code returns 0 on success and 1 on failure. In fact, at
this point, "ret" is already either zero or a negative error code so
we can just return it directly.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the user-provided len is less than the expected offset, the
IRLMP_ENUMDEVICES getsockopt will do a copy_to_user() with a very large
size value. While this isn't be a security issue on x86 because it will
get caught by the access_ok() check, it may leak large amounts of kernel
heap on other architectures. In any event, this patch fixes it.
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Vlasov found /proc/net/tcp could sometime loop and display
millions of sockets in LISTEN state.
In 2.6.29, when we converted TCP hash tables to RCU, we left two
sk_next() calls in listening_get_next().
We must instead use sk_nulls_next() to properly detect an end of chain.
Reported-by: Alexey Vlasov <renton@renton.name>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix two related problems in the event-copying loop of
ring_buffer_read_page.
The loop condition for copying events is off-by-one.
"len" is the remaining space in the caller-supplied page.
"size" is the size of the next event (or two events).
If len == size, then there is just enough space for the next event.
size was set to rb_event_ts_length, which may include the size of two
events if the first event is a time-extend, in order to assure time-
extends are kept together with the event after it. However,
rb_advance_reader always advances by one event. This would result in the
event after any time-extend being duplicated. Instead, get the size of
a single event for the memcpy, but use rb_event_ts_length for the loop
condition.
Signed-off-by: David Sharp <dhsharp@google.com>
LKML-Reference: <1293064704-8101-1-git-send-email-dhsharp@google.com>
LKML-Reference: <AANLkTin7nLrRPc9qGjdjHbeVDDWiJjAiYyb-L=gH85bx@mail.gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The fix-up entries by the commit 2785591a97
ALSA: hda - Add fix-up for Sony VAIO with ALC275 codecs
weren't applied in the right position. They had to be before the quirk
entry matching to all Sony devices.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The IPS driver is designed to be able to run detached from i915 and
just not enable GPU turbo in that case, in order to avoid module
dependencies between the two drivers. This means that we don't know
what the load order between the two is going to be, and we had
previously only supported IPS after (optionally) i915, but not i915
after IPS. If the wrong order was chosen, you'd get no GPU turbo, and
something like half the possible graphics performance.
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org
It's required by the specs, but we don't know why. Let's not find out
why.
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
When a mixer control element was already created with the given name,
try to find another index for avoiding conflicts, instead of breaking
with an error. This makes the driver more robust.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
When the d-mics are assigned to the same purpose of another analog mic
pins, the driver doesn't compute the index properly, resulting in an
error with "existing control". This patch fixes it.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm: Include the connector name in the output_poll_execute() debug message
drm/radeon/kms: fix bug in r600_gpu_is_lockup
drm/radeon/kms: reorder display resume to avoid problems
drm/radeon/kms/evergreen: reset the grbm blocks at resume and init
drm/radeon/kms: fix evergreen asic reset
Revert "drm: Don't try and disable an encoder that was never enabled"
drm/radeon: Add early unregister of firmware fb's
drm/radeon: use aperture size not vram size for overlap tests
drm/radeon/kms/evergreen: flush hdp cache when flushing gart tlb
drm/radeon/kms: disable the r600 cb offset checker for linear surfaces
drm/radeon/kms: disable ss fixed ref divide
drm/i915/bios: Reverse order of 100/120 Mhz SSC clocks
agp/intel: Fix missed cached memory flags setting in i965_write_entry()
drm/i915/sdvo: Only use the SDVO pin if it is in the valid range
drm/i915/ringbuffer: Handle wrapping of the autoreported HEAD
drm/i915/dp: Fix I2C/EDID handling with active DisplayPort to DVI converter
This was fixed by David Lamparter in v2.6.36-rc5 3486008 ("spi: free
children in spi_unregister_master, not siblings") and broken again in
v2.6.37-rc1~2^2~4 during the merge of 2b9603a0 ("spi: enable
spi_board_info to be registered after spi_master").
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David Lamparter <equinox@diac24.net>
Cc: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The taskstats structure is internally aligned on 8 byte boundaries but the
layout of the aggregrate reply, with two NLA headers and the pid (each 4
bytes), actually force the entire structure to be unaligned. This causes
the kernel to issue unaligned access warnings on some architectures like
ia64. Unfortunately, some software out there doesn't properly unroll the
NLA packet and assumes that the start of the taskstats structure will
always be 20 bytes from the start of the netlink payload. Aligning the
start of the taskstats structure breaks this software, which we don't
want. So, for now the alignment only happens on architectures that
require it and those users will have to update to fixed versions of those
packages. Space is reserved in the packet only when needed. This ifdef
should be removed in several years e.g. 2012 once we can be confident
that fixed versions are installed on most systems. We add the padding
before the aggregate since the aggregate is already a defined type.
Commit 85893120 ("delayacct: align to 8 byte boundary on 64-bit systems")
previously addressed the alignment issues by padding out the pid field.
This was supposed to be a compatible change but the circumstances
described above mean that it wasn't. This patch backs out that change,
since it was a hack, and introduces a new NULL attribute type to provide
the padding. Padding the response with 4 bytes avoids allocating an
aligned taskstats structure and copying it back. Since the structure
weighs in at 328 bytes, it's too big to do it on the stack.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reported-by: Brian Rogers <brian@xyzw.org>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Guillaume Chazarain <guichaz@gmail.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>