What I realized recently is that calling rebuild_sched_domains() in
arch_reinit_sched_domains() by itself is not enough when cpusets are enabled.
partition_sched_domains() code is trying to avoid unnecessary domain rebuilds
and will not actually rebuild anything if new domain masks match the old ones.
What this means is that doing
echo 1 > /sys/devices/system/cpu/sched_mc_power_savings
on a system with cpusets enabled will not take affect untill something changes
in the cpuset setup (ie new sets created or deleted).
This patch fixes restore correct behaviour where domains must be rebuilt in
order to enable MC powersaving flags.
Test on quad-core Core2 box with both CONFIG_CPUSETS and !CONFIG_CPUSETS.
Also tested on dual-core Core2 laptop. Lockdep is happy and things are working
as expected.
Signed-off-by: Max Krasnyansky <maxk@qualcomm.com>
Tested-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Spencer reported a problem where utime and stime were going negative despite
the fixes in commit b27f03d4bd. The suspected
reason for the problem is that signal_struct maintains it's own utime and
stime (of exited tasks), these are not updated using the new task_utime()
routine, hence sig->utime can go backwards and cause the same problem
to occur (sig->utime, adds tsk->utime and not task_utime()). This patch
fixes the problem
TODO: using max(task->prev_utime, derived utime) works for now, but a more
generic solution is to implement cputime_max() and use the cputime_gt()
function for comparison.
Reported-by: spencer@bluehost.com
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
ipsec: Fix deadlock in xfrm_state management.
ipv: Re-enable IP when MTU > 68
net/xfrm: Use an IS_ERR test rather than a NULL test
ath9: Fix ath_rx_flush_tid() for IRQs disabled kernel warning message.
ath9k: Incorrect key used when group and pairwise ciphers are different.
rt2x00: Compiler warning unmasked by fix of BUILD_BUG_ON
mac80211: Fix debugfs union misuse and pointer corruption
wireless/libertas/if_cs.c: fix memory leaks
orinoco: Multicast to the specified addresses
iwlwifi: fix 64bit platform firmware loading
iwlwifi: fix apm_stop (wrong bit polarity for FLAG_INIT_DONE)
iwlwifi: workaround interrupt handling no some platforms
iwlwifi: do not use GFP_DMA in iwl_tx_queue_init
net/wireless/Kconfig: clarify the description for CONFIG_WIRELESS_EXT_SYSFS
net: Unbreak userspace usage of linux/mroute.h
pkt_sched: Fix locking of qdisc_root with qdisc_root_sleeping_lock()
ipv6: When we droped a packet, we should return NET_RX_DROP instead of 0
hwif_to_node() incorrectly assumes that hwif->dev always belongs to
a PCI device. This results in ide-cs oopsing in init_irq() after
commit c56c5648a3 accidentally fixed
device tree registration for ide-cs. Fix it by using dev_to_node().
Thanks to Martin Michlmayr and Larry Finger for help with debugging
the issue.
Reported-by: Martin Michlmayr <tbm@cyrius.com>
Tested-by: Martin Michlmayr <tbm@cyrius.com>
Cc: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
The sff_dma_ops struct should be wrapped by BLK_DEV_IDEDMA_SFF instead
of BLK_DEV_IDEDMA_PCI.
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* 'for-2.6.27' of git://linux-nfs.org/~bfields/linux:
nfsd: fix buffer overrun decoding NFSv4 acl
sunrpc: fix possible overrun on read of /proc/sys/sunrpc/transports
nfsd: fix compound state allocation error handling
svcrdma: Fix race between svc_rdma_recvfrom thread and the dto_tasklet
Not used anywhere yet, but this complements the existing plain
'insert_resource()' functionality with a version that can expand the
resource we are adding in order to fix up any conflicts it has with
existing resources.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nothing in linux/pim.h should be exported to userspace.
This should fix the XORP build failure reported by
Jose Calhariz, the debain package maintainer.
Nothing originally in linux/mroute.h was exported to userspace
ever, but some of this stuff started to be when it was moved into
this new linux/pim.h, and that was wrong. If we didn't provide these
definitions for 10 years we can reasonably expect that applications
defined this stuff locally or used GLIBC headers providing the
protocol definitions. And as such the only result of this can
be conflict and userland build breakage.
Signed-off-by: David S. Miller <davem@davemloft.net>
Add missing kernel descriptions of struct i2c_driver members.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: David Brownell <david-b@pacbell.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
[PATCH] deal with the first call of ->show() generating no output
[PATCH] fix ->llseek() for a bunch of directories
[PATCH] fix regular readdir() and friends
[PATCH] fix hpux_getdents()
[PATCH] fix osf_getdirents()
[PATCH] ntfs: use d_add_ci
[PATCH] change d_add_ci argument ordering
[PATCH] fix efs_lookup()
[PATCH] proc: inode number fixlet
Technically, the cmd_filter would be applied to other protocols though
it's unlikely to happen. Putting SCSI stuff to request_queue is kinda
layer violation. So let's rename it.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cmd_filter works only for the block layer SG_IO with SCSI block
devices. It breaks scsi/sg.c, bsg, and the block layer SG_IO with SCSI
character devices (such as st). We hit a kernel crash with them.
The problem is that cmd_filter code accesses to gendisk (having struct
blk_scsi_cmd_filter) via inode->i_bdev->bd_disk. It works for only
SCSI block device files. With character device files, inode->i_bdev
leads you to struct cdev. inode->i_bdev->bd_disk->blk_scsi_cmd_filter
isn't safe.
SCSI ULDs don't expose gendisk; they keep it private. bsg needs to be
independent on any protocols. We shouldn't change ULDs to expose their
gendisk.
This patch moves struct blk_scsi_cmd_filter from gendisk to
request_queue, a common object, which eveyone can access to.
The user interface doesn't change; users can change the filters via
/sys/block/. gendisk has a pointer to request_queue so the cmd_filter
code accesses to struct blk_scsi_cmd_filter.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Including <linux/fcntl.h> in the user-visible part of this header has
caused build regressions with headers from 2.6.27-rc. Move it down to
the #ifdef __KERNEL__ part, which is the only place it's needed. Move
some other kernel-only things down there too, while we're at it.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
KVM: fix userspace ABI breakage
KVM: MMU: Fix torn shadow pte
KVM: Use .fixup instead of .text.fixup on __kvm_handle_fault_on_reboot
The following part of commit 9ef621d3be
(KVM: Support mixed endian machines) changed on the size of a struct
that is exported to userspace:
include/linux/kvm.h:
@@ -318,14 +318,14 @@ struct kvm_trace_rec {
__u32 vcpu_id;
union {
struct {
- __u32 cycle_lo, cycle_hi;
+ __u64 cycle_u64;
__u32 extra_u32[KVM_TRC_EXTRA_MAX];
} cycle;
struct {
__u32 extra_u32[KVM_TRC_EXTRA_MAX];
} nocycle;
} u;
-};
+} __attribute__((packed));
Packing a struct was the correct idea, but it packed the wrong struct.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Avi Kivity <avi@qumranet.com>
As pointed out during review d_add_ci argument order should match d_add,
so switch the dentry and inode arguments.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This patch lets the files using linux/version.h match the files that
#include it.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
nohz: fix wrong event handler after online an offlined cpu
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[libata] pata_it821x: fix warning
libata: Fix a large collection of DMA mode mismatches
ahci: sis controllers actually can do PMP
pata_via: clean up recent tf_load changes
libata: restore SControl on detach
libata: use ata_link_printk() when printing SError
libata: always do follow-up SRST if hardreset returned -EAGAIN
libata: fix EH action overwriting in ata_eh_reset()
sata_mv: add the Gen IIE flag to the SoC devices.
ata_piix: IDE Mode SATA patch for Intel Ibex Peak DeviceIDs
ahci: RAID mode SATA patch for Intel Ibex Peak DeviceIDs
sata_mv: don't issue two DMA commands concurrently
libata: implement no[hs]rst force params
Dave Müller sent a diff for the pata_oldpiix that highlighted a problem
where a lot of the ATA drivers assume dma_mode == 0 means "no DMA" while
the core code uses 0xFF.
This turns out to have other consequences such as code doing >= XFER_UDMA_0
also catching 0xFF as UDMAlots. Fortunately it doesn't generally affect
set_dma_mode, although some drivers call back into their own set mode code
from other points.
Having been through the drivers I've added helpers for using_udma/using_mwdma
dma_enabled so that people don't open code ranges that may change (eg if UDMA8
appears somewhere)
Thanks to David for the initial bits
[and added fix for pata_oldpiix from and signed-off-by Dave Mueller
<dave.mueller@gmx.ch> -jg]
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Save SControl during probing and restore it on detach. This prevents
adjustments made by libata drivers to seep into the next driver which
gets attached (be it a libata one or not).
It's not clear whether SControl also needs to be restored on suspend.
The next system to have control (ACPI or kexec'd kernel) would
probably like to see the original SControl value but there's no
guarantee that a link is gonna keep working after SControl is adjusted
without a reset and adding a reset and modified recovery cycle soley
for this is an overkill. For now, do it only for detach.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Implement force params nohrst, nosrst and norst. This is to work
around reset related problems and ease debugging.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
pnp: fix "add acpi:* modalias entries"
UIO: generic irq handling for some uio platform devices
UIO: uio_pdrv: fix license specification
UIO: uio_pdrv: fix memory leak
block: drop references taken by class_find_device()
block: fix partial read() of /proc/{partitions,diskstats}
PM: Remove WARN_ON from device_pm_add
driver core: add init_name to struct device
PM: don't skip device PM init when CONFIG_PM_SLEEP isn't set and CONFIG_PM is set
driver model: anti-oopsing medicine
dev_printk(): constify the `dev' argument
drivers/base/driver.c: remove unused to_dev() macro
Documentation: HOWTO-ja_JP-sync patch
Japanese translation of Documentation/SubmitChecklist
kobject: Replace ALL occurrences of '/' with '!' instead of only the first one.
This patch (as1128) fixes one of the problems related to the new PM
infrastructure. We are not allowed to register new child devices
during the middle of a system sleep transition, but unbinding a USB
driver causes the core to automatically install altsetting 0 and
thereby create new endpoint pseudo-devices.
The patch fixes this problem (and the related problem that installing
altsetting 0 will fail if the device is suspended) by deferring the
Set-Interface call until some later time when it is legal and can
succeed. Possible later times are: when a new driver is being probed
for the interface, and when the interface is being resumed.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This gives us a way to handle both the bus_id and init_name values being
used for a while during the transition period.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add const markings to dev_name and dev_driver_string to make it clear that
dev_printk doesn't modify dev. This is a prerequisite to adding more
const markings to other functions make it clearer, which functions can
modify dev and which can't.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On the tickless system(CONFIG_NO_HZ=y and CONFIG_HIGH_RES_TIMERS=n), after
I made an offlined cpu online, I found this cpu's event handler was
tick_handle_periodic, not tick_nohz_handler.
After debuging, I found this bug was caused by the wrong tick mode. the
tick mode is not changed to NOHZ_MODE_INACTIVE when the cpu is offline.
This patch fixes this bug.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fixes kernel BUG at lib/radix-tree.c:473.
Previously the handler was incidentally provided by tmpfs but this was
removed with:
commit 14fcc23fdc
Author: Hugh Dickins <hugh@veritas.com>
Date: Mon Jul 28 15:46:19 2008 -0700
tmpfs: fix kernel BUG in shmem_delete_inode
relying on this behaviour was incorrect in any case and the BUG also
appeared when the device node was on an ext3 filesystem.
v2: override a_ops at open() time rather than mmap() time to minimise
races per AKPM's concerns.
Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
Cc: Jaya Kumar <jayakumar.lkml@gmail.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Johannes Weiner <hannes@saeurebad.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Kel Modderman <kel@otaku42.de>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: <stable@kernel.org> [14fcc23fd is in 2.6.25.14 and 2.6.26.1]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is a race with dirty page accounting where a page may not properly
be accounted for.
clear_page_dirty_for_io() calls page_mkclean; then TestClearPageDirty.
page_mkclean walks the rmaps for that page, and for each one it cleans and
write protects the pte if it was dirty. It uses page_check_address to
find the pte. That function has a shortcut to avoid the ptl if the pte is
not present. Unfortunately, the pte can be switched to not-present then
back to present by other code while holding the page table lock -- this
should not be a signal for page_mkclean to ignore that pte, because it may
be dirty.
For example, powerpc64's set_pte_at will clear a previously present pte
before setting it to the desired value. There may also be other code in
core mm or in arch which do similar things.
The consequence of the bug is loss of data integrity due to msync, and
loss of dirty page accounting accuracy. XIP's __xip_unmap could easily
also be unreliable (depending on the exact XIP locking scheme), which can
lead to data corruption.
Fix this by having an option to always take ptl to check the pte in
page_check_address.
It's possible to retain this optimization for page_referenced and
try_to_unmap.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Jared Hulbert <jaredeh@gmail.com>
Cc: Carsten Otte <cotte@freenet.de>
Cc: Hugh Dickins <hugh@veritas.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When user calls sys_setpriority(PRIO_PGRP ...) on a NPTL style multi-LWP
process, only the task leader of the process is affected, all other
sibling LWP threads didn't receive the setting. The problem was that the
iterator used in sys_setpriority() only iteartes over one task for each
process, ignoring all other sibling thread.
Introduce a new macro do_each_pid_thread / while_each_pid_thread to walk
each thread of a process. Convert 4 call sites in {set/get}priority and
ioprio_{set/get}.
Signed-off-by: Ken Chen <kenchen@google.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Purely cosmetic for now, but we might as well get it merged ASAP.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: add acpi_find_root_bridge_handle
PCI: acpi_pcihp: run _OSC on a root bridge
x86/PCI: irq and pci_ids patch for Intel Ibex Peak PCHs
x86/PCI: allow scanning of 255 PCI busses
x86, pci: detect end_bus_number according to acpi/e820 reserved, v2
pci: debug extra pci bus resources
pci: debug extra pci resources range
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (94 commits)
pkt_sched: Prevent livelock in TX queue running.
Revert "pkt_sched: Add BH protection for qdisc_stab_lock."
Revert "pkt_sched: Protect gen estimators under est_lock."
pkt_sched: remove bogus block (cleanup)
nf_nat: use secure_ipv4_port_ephemeral() for NAT port randomization
netfilter: ctnetlink: sleepable allocation with spin lock bh
netfilter: ctnetlink: fix sleep in read-side lock section
netfilter: ctnetlink: fix double helper assignation for NAT'ed conntracks
netfilter: ipt_addrtype: Fix matching of inverted destination address type
dccp: Fix panic caused by too early termination of retransmission mechanism
pkt_sched: Don't hold qdisc lock over qdisc_destroy().
pkt_sched: Add lockdep annotation for qdisc locks
pkt_sched: Never schedule non-root qdiscs.
removed unused #include <version.h>
rt2x00: Fix txdone_entry_desc_flags
b43: Fix for another Bluetooth Coexistence SPROM Programming error for BCM4306
mac80211: remove kdoc references to IEEE80211_HW_HOST_GEN_BEACON_TEMPLATE
p54u: reset skb's data/tail pointer on requeue
p54: move p54_vdcf_init to the right place.
iwlwifi: fix printk newlines
...
Consolidate finding of a root bridge and getting its handle to the one
inline function. It's cut & pasted on multiple places. Use this new
inline in those.
Cc: kristen.c.accardi@intel.com
Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Add missing ATA_* defines to <linux/ata.h>. Also add
ATAPI_{LFS,EOM,ILI,IO,CODE} defines while at it.
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add missing ATA_CMD_* defines to <linux/ata.h>. Also add
ATA_EXABYTE_ENABLE_NEST, SETFEATURES_AAM_* and ATA_SMART_*
defines while at it.
Partially based on earlier work by Chris Wedgwood.
Acked-by: Chris Wedgwood <cw@f00f.org>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add missing ATA_ID_* defines and update {ata,atapi}_*()
inlines accordingly. The currently unused defines are
needed for the forthcoming drivers/ide/ changes.
v2:
Add ATA_ID_SPG.
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
security.h: fix build failure
include/linux/security.h: In function 'security_ptrace_traceme':
include/linux/security.h:1760: error: 'parent' undeclared (first use in this function)
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: James Morris <jmorris@namei.org>
The exported copy of videodev2.h contains this line:
#define #include <sys/time.h>
This is because for some reason it defines __user for itself -- despite
the fact that we remove all instances of __user when exporting headers.
_All_ pointers in userspace are user pointers. Fix it by removing the
unnecessary '#define __user' from the file.
The new headers ivtv.h and ivtvfb.h would have the same problem... if
whoever put them there had actually remembered to add them to the Kbuild
file while he was at it. Fix those too, and export them as was
presumably intended.
Note that includes of <linux/compiler.h> are also stripped by the header
export process, so those don't need to be conditional.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Acked-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Try to comment away a little of the confusion between mm's vm_area_struct
vm_flags and vmalloc's vm_struct flags: based on an idea by Ulrich Drepper.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There's an skb_copy_datagram_iovec() to copy out of a paged skb, but
nothing the other way around (because we don't do that).
We want to allocate big skbs in tun.c, so let's add the function.
It's a carbon copy of skb_copy_datagram_iovec() with enough changes to
be annoying.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a TUNGETIFF interface so that userspace can query a
tun/tap descriptor for its name and flags.
This is needed because it is common for one app to create
a tap interface, exec another app and pass it the file
descriptor for the interface. Without TUNGETIFF the spawned
app has no way of detecting wheter the interface has e.g.
IFF_VNET_HDR set.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Acked-by: Max Krasnyansky <maxk@qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>