Commit Graph

2591 Commits

Author SHA1 Message Date
Ingo Molnar
71fd371463 sched: remove HZ dependency from the granularity default
remove HZ dependency from the granularity default. Use 10 msec for
the base granularity, 1 msec for wakeup granularity and 25 msec for
batch wakeup granularity. (These defaults are close to the values
that the default HZ=250 setting got previously, and thus it's the
most common setting.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-24 20:39:10 +02:00
Bruce Ashfield
7c6c16f354 sched: CONFIG_SCHED_GROUP_FAIR=y fixlet
when I built with CONFIG_FAIR_GROUP_SCHED=y, I need the following change
to make things right.

[ From: mingo@elte.hu ]

this config option is not upstream-configurable right now but lets fix
this for completeness.

Signed-off-by: Bruce Ashfield <bruce.ashfield@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-24 20:39:10 +02:00
Linus Torvalds
d0797b39dc Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
  sched: tweak the sched_runtime_limit tunable
  sched: skip updating rq's next_balance under null SD
  sched: fix broken SMT/MC optimizations
  sched: accounting regression since rc1
  sched: fix sysctl directory permissions
  sched: sched_clock_idle_[sleep|wakeup]_event()
2007-08-23 21:38:39 -07:00
Linus Torvalds
de80af4cc9 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6:
  sysfs: don't warn on removal of a nonexistent binary file
  HOWTO: latest lxr url address changed
  HOWTO: korean translation of Documentation/HOWTO
  Fix Off-by-one in /sys/module/*/refcnt
  sysfs: fix locking in sysfs_lookup() and sysfs_rename_dir()
2007-08-23 21:34:43 -07:00
Ingo Molnar
505c0efd58 sched: tweak the sched_runtime_limit tunable
Michael Gerdau reported reniced task CPU usage weirdnesses.
Such symptoms can be caused by limit underruns so double the
sched_runtime_limit.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-23 15:18:02 +02:00
Suresh Siddha
f549da848e sched: skip updating rq's next_balance under null SD
Was playing with sched_smt_power_savings/sched_mc_power_savings and
found out that while the scheduler domains are reconstructed when sysfs
settings change, rebalance_domains() can get triggered with null domain
on other cpus, which is setting next_balance to jiffies + 60*HZ.
Resulting in no idle/busy balancing for 60 seconds.

Fix this.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-23 15:18:02 +02:00
Suresh Siddha
f8700df7c4 sched: fix broken SMT/MC optimizations
On a four package system with HT - HT load balancing optimizations were
broken.  For example, if two tasks end up running on two logical threads
of one of the packages, scheduler is not able to pull one of the tasks
to a completely idle package.

In this scenario, for nice-0 tasks, imbalance calculated by scheduler
will be 512 and find_busiest_queue() will return 0 (as each cpu's load
is 1024 > imbalance and has only one task running).

Similarly MC scheduler optimizations also get fixed with this patch.

[ mingo@elte.hu: restored fair balancing by increasing the fuzz and
                 adding it back to the power decision, without the /2
                 factor. ]

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-23 15:18:02 +02:00
Eric W. Biederman
c57baf1e1e sched: fix sysctl directory permissions
There are two remaining gotchas:

- The directories have impossible permissions (writeable).

- The ctl_name for the kernel directory is inconsistent with
  everything else.  It should be CTL_KERN.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-23 15:18:02 +02:00
Ingo Molnar
2aa44d0567 sched: sched_clock_idle_[sleep|wakeup]_event()
construct a more or less wall-clock time out of sched_clock(), by
using ACPI-idle's existing knowledge about how much time we spent
idling. This allows the rq clock to work around TSC-stops-in-C2,
TSC-gets-corrupted-in-C3 type of problems.

( Besides the scheduler's statistics this also benefits blktrace and
  printk-timestamps as well. )

Furthermore, the precise before-C2/C3-sleep and after-C2/C3-wakeup
callbacks allow the scheduler to get out the most of the period where
the CPU has a reliable TSC. This results in slightly more precise
task statistics.

the ACPI bits were acked by Len.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Len Brown <len.brown@intel.com>
2007-08-23 15:18:02 +02:00
Oleg Nesterov
834d216e1f signalfd: fix interaction with posix-timers
dequeue_signal:

	if (__SI_TIMER) {
		spin_unlock(&tsk->sighand->siglock);
		do_schedule_next_timer(info);
		spin_lock(&tsk->sighand->siglock);
	}

Unless tsk == curent, this is absolutely unsafe: nothing prevents tsk from
exiting. If signalfd was passed to another process, do_schedule_next_timer()
is just wrong.

Add yet another "tsk == current" check into dequeue_signal().

This patch fixes an oopsable bug, but breaks the scheduling of posix timers
if the shared __SI_TIMER signal was fetched via signalfd attached to another
sub-thread. Mostly fixed by the next patch.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Roland McGrath <roland@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-22 19:52:46 -07:00
Oleg Nesterov
d02479bdeb posix-timers: fix creation race
sys_timer_create() sets ->it_process and unlocks ->siglock, then checks
tmr->it_sigev_notify to define if get_task_struct() is needed.

We already passed ->it_id to the caller, another thread can delete this timer
and free its memory in between.

As a minimal fix, move this code under ->siglock, sys_timer_delete() takes it
too before calling release_posix_timer().  A proper serialization would be to
take ->it_lock, we add a partly initialized timer on posix_timers_id, not
good.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-22 19:52:46 -07:00
Thomas Gleixner
179394af7a posix-timers: fix deletion race
timer_delete does:
	lock_timer();
	timer->it_process = NULL;
	unlock_timer();
	release_posix_timer();

timer->it_process is checked in lock_timer() to prevent access to a
timer, which is on the way to be deleted, but the check happens after
idr_lock is dropped. This allows release_posix_timer() to delete the
timer before the lock code can check the timer:

  CPU 0				CPU 1

  lock_timer();
  timer->it_process = NULL;
  unlock_timer();
				lock_timer()
					spin_lock(idr_lock);
					timer = idr_find();
					spin_lock(timer->lock);
					spin_unlock(idr_lock);
  release_posix_timer();
	spin_lock(idr_lock);
	idr_remove(timer);
	spin_unlock(idr_lock);
	free_timer(timer);
					if (timer->......)

Change the locking to prevent this.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-22 19:52:45 -07:00
Andrew Morton
8b7f07155f free_irq(): fix DEBUG_SHIRQ handling
If we're going to run the handler from free_irq() then we must do it with
local irq's disabled.  Otherwise lockdep complains that the handler is taking
irq-safe spinlocks in a non-irq-safe fashion.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-22 19:52:44 -07:00
john stultz
187226f57f futex_unlock_pi() hurts my brain and may cause application deadlock
Avoid futex_unlock_pi returning -EFAULT (which results in deadlock), by
clearing uval before jumping to retry_locked.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-22 19:52:44 -07:00
Adrian Bunk
88ae704c2a kernel/auditsc.c: fix an off-by-one
This patch fixes an off-by-one in a BUG_ON() spotted by the Coverity
checker.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Amy Griffis <amy.griffis@hp.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-22 19:52:44 -07:00
Alexey Dobriyan
256e2fdf03 Fix Off-by-one in /sys/module/*/refcnt
sysfs internals were changed to not pin module in question.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Acked-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-08-22 14:35:35 -07:00
Robin Getz
cb00e99c0a fix - ensure we don't use bootconsoles after init has been released
Gerd Hoffmann pointed out that my patch from yesterday can lead
to a null pointer dereference if the kernel is booted with no
console, and no earlyprintk defined. This fixes that issue.

Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-21 20:23:53 -07:00
Robin Getz
0c5564bd91 ensure we don't use bootconsoles after init has been released
This is a followup to the cleanups for earlyprintk patch from Gerd Hoffmann

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=69331af79cf29e26d1231152a172a1a10c2df511

This ensures that a bootconsole is unregistered if it is not replaced.
The current implementation spews garbage out the bootconsole in this case,
since the bootconsole structure is normally in the init section, and is
freed, but still used.

Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Cc: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-20 22:42:01 -07:00
Christian Heim
e598fbaabd Remove double inclusion of linux/capability.h
Remove the second inclusion of linux/capability.h, which has been
introduced with "[PATCH] move capable() to capability.h" (commit
c59ede7b78)

Signed-off-by: Christian Heim <phreak@gentoo.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-19 10:12:32 -07:00
Linus Torvalds
738ddd3039 Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
  sched: run_rebalance_domains: s/SCHED_IDLE/CPU_IDLE/
  sched: fix sleeper bonus
  sched: make global code static
2007-08-12 11:06:45 -07:00
Thomas Gleixner
2464286ace genirq: suppress resend of level interrupts
Level type interrupts are resent by the interrupt hardware when they are
still active at irq_enable().

Suppress the resend mechanism for interrupts marked as level.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-12 11:05:45 -07:00
Thomas Gleixner
496634217e genirq: cleanup mismerge artifact
Commit 5a43a066b1: "genirq: Allow fasteoi
handler to retrigger disabled interrupts" was erroneously applied to
handle_level_irq().  This added the irq retrigger / resend functionality
to the level irq handler.

Revert the offending bits.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-12 11:05:45 -07:00
Oleg Nesterov
de0cf899bb sched: run_rebalance_domains: s/SCHED_IDLE/CPU_IDLE/
rebalance_domains(SCHED_IDLE) looks strange (typo), change it to CPU_IDLE.

the effect of this bug was slightly more agressive idle-balancing on
SMP than intended.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-12 18:08:19 +02:00
Ingo Molnar
5d2b3d3695 sched: fix sleeper bonus
Peter Ziljstra noticed that the sleeper bonus deduction code
was not properly rate-limited: a task that scheduled more
frequently would get a disproportionately large deduction.
So limit the deduction to delta_exec.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-12 18:08:19 +02:00
Adrian Bunk
6707de00fd sched: make global code static
This patch makes the following needlessly global code static:

- arch_reinit_sched_domains()
- struct attr_sched_mc_power_savings
- struct attr_sched_smt_power_savings

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-12 18:08:19 +02:00
Linus Torvalds
d291676ce8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
  sched debug: dont print kernel address in /proc/sched_debug
  sched: fix typo in the FAIR_GROUP_SCHED branch
  sched: improve rq-clock overflow logic
2007-08-11 15:58:37 -07:00
Peter Chubb
cd5bfea278 fix compilation with gcc 4.2
gcc-4.2 is a lot more picky about its symbol handling.  EXPORT_SYMBOL no
longer works on symbols that are undefined or defined with static scope.

For example, with CONFIG_PROFILE off, I see:

  kernel/profile.c:206: error: __ksymtab_profile_event_unregister causes a section type conflict
  kernel/profile.c:205: error: __ksymtab_profile_event_register causes a section type conflict

This patch moves the EXPORTs inside the #ifdef CONFIG_PROFILE, so we
only try to export symbols that are defined.

Also, in kernel/kprobes.c there's an EXPORT_SYMBOL_GPL() for
jprobes_return, which if CONFIG_JPROBES is undefined is a static
inline and gives the same error.

And in drivers/acpi/resources/rsxface.c, there's an
ACPI_EXPORT_SYMBOPL() for a static symbol. If it's static, it's not
accessible from outside the compilation unit, so should bot be exported.

These three changes allow building a zx1_defconfig kernel with gcc 4.2
on IA64.

[akpm@linux-foundation.org: export jpobe_return properly]
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-11 15:47:42 -07:00
Miao Xie
6ddfca9548 timer: remove clockevents_unregister_notifier
I find a function(clockevents_unregister_notifier) which is not called by
anything in tree.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-11 15:47:42 -07:00
Rafael J. Wysocki
c5a69adff9 Hibernation: do not try to mark invalid PFNs as nosave
On some systems some PFNs reported by the early initialization code as
'nosave' may be invalid.  If we try to set the corresponding bits in the
hibernation bitmap, BUG_ON() in memory_bm_find_bit() will be triggered and
the system won't be able to boot (cf.
https://bugzilla.novell.com/show_bug.cgi?id=296242).

Prevent this from happening by verifying if the 'nosave' PFNs are valid in
mark_nosave_pages().

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-11 15:47:40 -07:00
Lee Schermerhorn
8daec965e7 Fix missing numa_zonelist_order sysctl
Misplaced #endif is hiding the numa_zonelist_order sysctl when !SECURITY.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-11 15:47:40 -07:00
Ingo Molnar
5167e75f4d sched debug: dont print kernel address in /proc/sched_debug
Arjan van de Ven pointed out that we should not print kernel addresses
in world-readable /proc files - fix that.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-08-10 23:05:11 +02:00
Ingo Molnar
e56f31aad9 sched: fix typo in the FAIR_GROUP_SCHED branch
while there's no in-tree way to turn group scheduling at the moment,
fix a typo in it nevertheless.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-10 23:05:11 +02:00
Ingo Molnar
529c77261b sched: improve rq-clock overflow logic
improve the rq-clock overflow logic: limit the absolute rq->clock
delta since the last scheduler tick, instead of limiting the delta
itself.

tested by Arjan van de Ven - whole laptop was misbehaving due to
an incorrectly calibrated cpu_khz confusing sched_clock().

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-08-10 23:05:11 +02:00
Linus Torvalds
be12014dd7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: (61 commits)
  sched: refine negative nice level granularity
  sched: fix update_stats_enqueue() reniced codepath
  sched: round a bit better
  sched: make the multiplication table more accurate
  sched: optimize update_rq_clock() calls in the load-balancer
  sched: optimize activate_task()
  sched: clean up set_curr_task_fair()
  sched: remove __update_rq_clock() call from entity_tick()
  sched: move the __update_rq_clock() call to scheduler_tick()
  sched debug: remove the 'u64 now' parameter from print_task()/_rq()
  sched: remove the 'u64 now' local variables
  sched: remove the 'u64 now' parameter from deactivate_task()
  sched: remove the 'u64 now' parameter from dequeue_task()
  sched: remove the 'u64 now' parameter from enqueue_task()
  sched: remove the 'u64 now' parameter from dec_nr_running()
  sched: remove the 'u64 now' parameter from inc_nr_running()
  sched: remove the 'u64 now' parameter from dec_load()
  sched: remove the 'u64 now' parameter from inc_load()
  sched: remove the 'u64 now' parameter from update_curr_load()
  sched: remove the 'u64 now' parameter from ->task_new()
  ...
2007-08-09 08:23:31 -07:00
Linus Torvalds
88ffc35059 Revert "genirq: temporary fix for level-triggered IRQ resend"
This reverts commit 0fc4969b86.  It was
always meant to be temporary, but it's generating more useless noise
than anything else, and we probably should never have done it in the
generic kernel (only had the people involved test it on their own).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-09 08:10:16 -07:00
Ingo Molnar
7cff8cf61c sched: refine negative nice level granularity
refine the granularity of negative nice level tasks: let them
reschedule more often to offset the effect of them consuming
their wait_runtime proportionately slower. (This makes nice-0
task scheduling smoother in the presence of negatively
reniced tasks.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:52 +02:00
Ingo Molnar
a69edb5560 sched: fix update_stats_enqueue() reniced codepath
the key has to be rescaled to /weight even if it has a positive value.

(this change only affects the scheduling of reniced tasks)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:52 +02:00
Ingo Molnar
194081ebfa sched: round a bit better
round a tiny bit better in high-frequency rescheduling scenarios,
by rounding around zero instead of rounding down.

(this is pretty theoretical though)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:51 +02:00
Ingo Molnar
254753dc32 sched: make the multiplication table more accurate
do small deltas in the weight and multiplication constant table so
that the worst-case numeric error is better than 1:100000000. (8 digits)

the current error table is:

     nice       mult *   inv_mult   error
     ------------------------------------------
     -20:      88761 *      48388  -0.0000000065
     -19:      71755 *      59856  -0.0000000037
     -18:      56483 *      76040   0.0000000056
     -17:      46273 *      92818   0.0000000042
     -16:      36291 *     118348  -0.0000000065
     -15:      29154 *     147320  -0.0000000037
     -14:      23254 *     184698  -0.0000000009
     -13:      18705 *     229616  -0.0000000037
     -12:      14949 *     287308  -0.0000000009
     -11:      11916 *     360437  -0.0000000009
     -10:       9548 *     449829  -0.0000000009
      -9:       7620 *     563644  -0.0000000037
      -8:       6100 *     704093   0.0000000009
      -7:       4904 *     875809   0.0000000093
      -6:       3906 *    1099582  -0.0000000009
      -5:       3121 *    1376151  -0.0000000058
      -4:       2501 *    1717300   0.0000000009
      -3:       1991 *    2157191  -0.0000000035
      -2:       1586 *    2708050   0.0000000009
      -1:       1277 *    3363326   0.0000000014
       0:       1024 *    4194304   0.0000000000
       1:        820 *    5237765   0.0000000009
       2:        655 *    6557202   0.0000000033
       3:        526 *    8165337  -0.0000000079
       4:        423 *   10153587   0.0000000012
       5:        335 *   12820798   0.0000000079
       6:        272 *   15790321   0.0000000037
       7:        215 *   19976592  -0.0000000037
       8:        172 *   24970740  -0.0000000037
       9:        137 *   31350126  -0.0000000079
      10:        110 *   39045157  -0.0000000061
      11:         87 *   49367440  -0.0000000037
      12:         70 *   61356676   0.0000000056
      13:         56 *   76695844  -0.0000000075
      14:         45 *   95443717  -0.0000000072
      15:         36 *  119304647  -0.0000000009
      16:         29 *  148102320  -0.0000000037
      17:         23 *  186737708  -0.0000000028
      18:         18 *  238609294  -0.0000000009
      19:         15 *  286331153  -0.0000000002

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:51 +02:00
Ingo Molnar
6e82a3befe sched: optimize update_rq_clock() calls in the load-balancer
optimize update_rq_clock() calls in the load-balancer: update them
right after locking the runqueue(s) so that the pull functions do
not have to call it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:51 +02:00
Ingo Molnar
2daa357705 sched: optimize activate_task()
optimize activate_task() by removing update_rq_clock() from it.
(and add update_rq_clock() to all callsites of activate_task() that
did not have it before.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:51 +02:00
Ingo Molnar
c3b64f1e4f sched: clean up set_curr_task_fair()
clean up set_curr_task_fair().

( identity transformation that causes no change in functionality. )

   text    data     bss     dec     hex filename
  39170    3750      36   42956    a7cc sched.o.before
  39170    3750      36   42956    a7cc sched.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:51 +02:00
Ingo Molnar
d9e0e6aa6d sched: remove __update_rq_clock() call from entity_tick()
remove __update_rq_clock() call from entity_tick().

no change in functionality because scheduler_tick() already calls
__update_rq_clock().

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:51 +02:00
Ingo Molnar
546fe3c909 sched: move the __update_rq_clock() call to scheduler_tick()
move the __update_rq_clock() call from update_cpu_load() to
scheduler_tick().

( identity transformation that causes no change in functionality. )

this allows the direct use of rq->clock in ->task_tick() functions.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:51 +02:00
Ingo Molnar
a48da48b40 sched debug: remove the 'u64 now' parameter from print_task()/_rq()
remove the 'u64 now' parameter from sched_debug.c:print_task()/_rq().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:51 +02:00
Ingo Molnar
bdd4dfa89c sched: remove the 'u64 now' local variables
final step: remove all (now superfluous) 'u64 now' variables.

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:51 +02:00
Ingo Molnar
2e1cb74a50 sched: remove the 'u64 now' parameter from deactivate_task()
remove the 'u64 now' parameter from deactivate_task().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:49 +02:00
Ingo Molnar
69be72c13d sched: remove the 'u64 now' parameter from dequeue_task()
remove the 'u64 now' parameter from dequeue_task().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:49 +02:00
Ingo Molnar
8159f87e2b sched: remove the 'u64 now' parameter from enqueue_task()
remove the 'u64 now' parameter from enqueue_task().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:49 +02:00
Ingo Molnar
db53181e41 sched: remove the 'u64 now' parameter from dec_nr_running()
remove the 'u64 now' parameter from dec_nr_running().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:49 +02:00
Ingo Molnar
e5fa2237b5 sched: remove the 'u64 now' parameter from inc_nr_running()
remove the 'u64 now' parameter from inc_nr_running().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:49 +02:00
Ingo Molnar
79b5dddf83 sched: remove the 'u64 now' parameter from dec_load()
remove the 'u64 now' parameter from dec_load().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:49 +02:00
Ingo Molnar
29b4b623fe sched: remove the 'u64 now' parameter from inc_load()
remove the 'u64 now' parameter from inc_load().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:49 +02:00
Ingo Molnar
84a1d7a2f9 sched: remove the 'u64 now' parameter from update_curr_load()
remove the 'u64 now' parameter from update_curr_load().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:49 +02:00
Ingo Molnar
ee0827d8b5 sched: remove the 'u64 now' parameter from ->task_new()
remove the 'u64 now' parameter from ->task_new().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:49 +02:00
Ingo Molnar
31ee529cc2 sched: remove the 'u64 now' parameter from ->put_prev_task()
remove the 'u64 now' parameter from ->put_prev_task().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:49 +02:00
Ingo Molnar
ff95f3df54 sched: remove the 'u64 now' parameter from pick_next_task()
remove the 'u64 now' parameter from pick_next_task().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:49 +02:00
Ingo Molnar
fb8d472402 sched: remove the 'u64 now' parameter from ->pick_next_task()
remove the 'u64 now' parameter from ->pick_next_task().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:48 +02:00
Ingo Molnar
f02231e51a sched: remove the 'u64 now' parameter from ->dequeue_task()
remove the 'u64 now' parameter from ->dequeue_task().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:48 +02:00
Ingo Molnar
fd390f6a04 sched: remove the 'u64 now' parameter from ->enqueue_task()
remove the 'u64 now' parameter from ->enqueue_task().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:48 +02:00
Ingo Molnar
f1e14ef64d sched: remove the 'u64 now' parameter from update_curr_rt()
remove the 'u64 now' parameter from update_curr_rt().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:48 +02:00
Ingo Molnar
ab6cde2692 sched: remove the 'u64 now' parameter from put_prev_entity()
remove the 'u64 now' parameter from put_prev_entity().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:48 +02:00
Ingo Molnar
9948f4b2a7 sched: remove the 'u64 now' parameter from pick_next_entity()
remove the 'u64 now' parameter from pick_next_entity().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:48 +02:00
Ingo Molnar
8494f412ed sched: remove the 'u64 now' parameter from set_next_entity()
remove the 'u64 now' parameter from set_next_entity().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:48 +02:00
Ingo Molnar
525c2716a4 sched: remove the 'u64 now' parameter from dequeue_entity()
remove the 'u64 now' parameter from dequeue_entity().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:48 +02:00
Ingo Molnar
668031ca8f sched: remove the 'u64 now' parameter from enqueue_entity()
remove the 'u64 now' parameter from enqueue_entity().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:48 +02:00
Ingo Molnar
2396af69be sched: remove the 'u64 now' parameter from enqueue_sleeper()
remove the 'u64 now' parameter from enqueue_sleeper().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:48 +02:00
Ingo Molnar
dfdc119e54 sched: remove the 'u64 now' parameter from __enqueue_sleeper()
remove the 'u64 now' parameter from __enqueue_sleeper().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:48 +02:00
Ingo Molnar
c7e9b5b293 sched: remove the 'u64 now' parameter from update_stats_curr_end()
remove the 'u64 now' parameter from update_stats_curr_end().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:48 +02:00
Ingo Molnar
19b6a2e370 sched: remove the 'u64 now' parameter from update_stats_dequeue()
remove the 'u64 now' parameter from update_stats_dequeue().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:48 +02:00
Ingo Molnar
79303e9e02 sched: remove the 'u64 now' parameter from update_stats_curr_start()
remove the 'u64 now' parameter from update_stats_curr_start().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:47 +02:00
Ingo Molnar
9ef0a9615b sched: remove the 'u64 now' parameter from update_stats_wait_end()
remove the 'u64 now' parameter from update_stats_wait_end().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:47 +02:00
Ingo Molnar
eac55ea376 sched: remove the 'u64 now' parameter from __update_stats_wait_end()
remove the 'u64 now' parameter from __update_stats_wait_end().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:47 +02:00
Ingo Molnar
d2417e5a3e sched: remove the 'u64 now' parameter from update_stats_enqueue()
remove the 'u64 now' parameter from update_stats_enqueue().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:47 +02:00
Ingo Molnar
5870db5b83 sched: remove the 'u64 now' parameter from update_stats_wait_start()
remove the 'u64 now' parameter from update_stats_wait_start().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:47 +02:00
Ingo Molnar
b7cc089657 sched: remove the 'u64 now' parameter from update_curr()
remove the 'u64 now' parameter from update_curr().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:47 +02:00
Ingo Molnar
5cef9eca38 sched: remove the 'u64 now' parameter from print_cfs_rq()
remove the 'u64 now' parameter from print_cfs_rq().

( identity transformation that causes no change in functionality. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:47 +02:00
Ingo Molnar
d281918d7c sched: remove 'now' use from assignments
change all 'now' timestamp uses in assignments to rq->clock.

( this is an identity transformation that causes no functionality change:
  all such new rq->clock is necessarily preceded by an update_rq_clock()
  call. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:47 +02:00
Ingo Molnar
eb59449400 sched: remove __rq_clock()
remove the (now unused) __rq_clock() function.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:47 +02:00
Ingo Molnar
c1b3da3ecd sched: eliminate __rq_clock() use
eliminate __rq_clock() use by changing it to:

   __update_rq_clock(rq)
   now = rq->clock;

identity transformation - no change in behavior.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:47 +02:00
Ingo Molnar
2ab81159fa sched: remove rq_clock()
remove the now unused rq_clock() function.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:47 +02:00
Ingo Molnar
a8e504d2a5 sched: eliminate rq_clock() use
eliminate rq_clock() use by changing it to:

   update_rq_clock(rq)
   now = rq->clock;

identity transformation - no change in behavior.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:47 +02:00
Ingo Molnar
b04a0f4c16 sched: add [__]update_rq_clock(rq)
add the [__]update_rq_clock(rq) functions. (No change in functionality,
just reorganization to prepare for elimination of the heavy 64-bit
timestamp-passing in the scheduler.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:46 +02:00
Peter Williams
a4ac01c36e sched: fix bug in balance_tasks()
There are two problems with balance_tasks() and how it used:

1. The variables best_prio and best_prio_seen (inherited from the old
move_tasks()) were only required to handle problems caused by the
active/expired arrays, the order in which they were processed and the
possibility that the task with the highest priority could be on either.
  These issues are no longer present and the extra overhead associated
with their use is unnecessary (and possibly wrong).

2. In the absence of CONFIG_FAIR_GROUP_SCHED being set, the same
this_best_prio variable needs to be used by all scheduling classes or
there is a risk of moving too much load.  E.g. if the highest priority
task on this at the beginning is a fairly low priority task and the rt
class migrates a task (during its turn) then that moved task becomes the
new highest priority task on this_rq but when the sched_fair class
initializes its copy of this_best_prio it will get the priority of the
original highest priority task as, due to the run queue locks being
held, the reschedule triggered by pull_task() will not have taken place.
  This could result in inappropriate overriding of skip_for_load and
excessive load being moved.

The attached patch addresses these problems by deleting all reference to
best_prio and best_prio_seen and making this_best_prio a reference
parameter to the various functions involved.

load_balance_fair() has also been modified so that this_best_prio is
only reset (in the loop) if CONFIG_FAIR_GROUP_SCHED is set.  This should
preserve the effect of helping spread groups' higher priority tasks
around the available CPUs while improving system performance when
CONFIG_FAIR_GROUP_SCHED isn't set.

Signed-off-by: Peter Williams <pwil3058@bigpond.net.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:46 +02:00
Alexey Dobriyan
e0361851e5 sched: remove binary sysctls from kernel.sched_domain
kernel.sched_domain hierarchy is under CTL_UNNUMBERED and thus
unreachable to sysctl(2). Generating .ctl_number's in such situation is
not useful.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:46 +02:00
Ingo Molnar
fd8bb43e27 sched: delta_exec accounting fix
small delta_exec accounting fix: increase delta_exec and increase
sum_exec_runtime even if the task is not on the runqueue anymore.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:46 +02:00
Ingo Molnar
c5dcfe72aa sched: clean up delta_mine
cleanup: delta_mine is an unsigned value.

no code impact:

   text    data     bss     dec     hex filename
   27823    2726      16   30565    7765 sched.o.before
   27823    2726      16   30565    7765 sched.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:46 +02:00
Ingo Molnar
8e717b194c sched: schedule() speedup
speed up schedule(): share the 'now' parameter that deactivate_task()
was calculating internally.

( this also fixes the small accounting window between the deactivate
  call and the pick_next_task() call. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:46 +02:00
Ingo Molnar
7bfd048587 sched: uninline rq_clock()
uninline rq_clock() to save 263 bytes of code:

   text    data     bss     dec     hex filename
   39561    3642      24   43227    a8db sched.o.before
   39298    3642      24   42964    a7d4 sched.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:46 +02:00
Josh Triplett
291ae5a120 sched: mark print_cfs_stats static
sched_fair.c defines print_cfs_stats, and sched_debug.c uses it, but sched.c
includes both sched_fair.c and sched_debug.c, so all the references to
print_cfs_stats occur in the same compilation unit.  Thus, mark
print_cfs_stats static.

Eliminates a sparse warning:
warning: symbol 'print_cfs_stats' was not declared. Should it be static?

Signed-off-by: Josh Triplett <josh@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:46 +02:00
Ulrich Drepper
9531b62f5e sched: clean up sched_getaffinity()
here's another tiny cleanup.  The generated code is not affected (gcc is
smart enough) but for people looking over the code it is just irritating
to have the extra conditional.

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:46 +02:00
Peter Williams
4301065920 sched: simplify move_tasks()
The move_tasks() function is currently multiplexed with two distinct
capabilities:

1. attempt to move a specified amount of weighted load from one run
queue to another; and
2. attempt to move a specified number of tasks from one run queue to
another.

The first of these capabilities is used in two places, load_balance()
and load_balance_idle(), and in both of these cases the return value of
move_tasks() is used purely to decide if tasks/load were moved and no
notice of the actual number of tasks moved is taken.

The second capability is used in exactly one place,
active_load_balance(), to attempt to move exactly one task and, as
before, the return value is only used as an indicator of success or failure.

This multiplexing of sched_task() was introduced, by me, as part of the
smpnice patches and was motivated by the fact that the alternative, one
function to move specified load and one to move a single task, would
have led to two functions of roughly the same complexity as the old
move_tasks() (or the new balance_tasks()).  However, the new modular
design of the new CFS scheduler allows a simpler solution to be adopted
and this patch addresses that solution by:

1. adding a new function, move_one_task(), to be used by
active_load_balance(); and
2. making move_tasks() a single purpose function that tries to move a
specified weighted load and returns 1 for success and 0 for failure.

One of the consequences of these changes is that neither move_one_task()
or the new move_tasks() care how many tasks sched_class.load_balance()
moves and this enables its interface to be simplified by returning the
amount of load moved as its result and removing the load_moved pointer
from the argument list.  This helps simplify the new move_tasks() and
slightly reduces the amount of work done in each of
sched_class.load_balance()'s implementations.

Further simplification, e.g. changes to balance_tasks(), are possible
but (slightly) complicated by the special needs of load_balance_fair()
so I've left them to a later patch (if this one gets accepted).

NB Since move_tasks() gets called with two run queue locks held even
small reductions in overhead are worthwhile.

[ mingo@elte.hu ]

this change also reduces code size nicely:

   text    data     bss     dec     hex filename
   39216    3618      24   42858    a76a sched.o.before
   39173    3618      24   42815    a73f sched.o.after

Signed-off-by: Peter Williams <pwil3058@bigpond.net.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:46 +02:00
Ingo Molnar
f1a438d813 sched: reorder update_cpu_load(rq) with the ->task_tick() call
Peter Williams suggested to flip the order of update_cpu_load(rq) with
the ->task_tick() call. This is a NOP for the current scheduler (the
two functions are independent of each other), ->task_tick() might
create some state for update_cpu_load() in the future (or in PlugSched).

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:45 +02:00
Ingo Molnar
0915c4e89d sched: batch sleeper bonus
batch up the sleeper bonus sum a bit more. Anything below
sched-granularity is too small to make a practical difference
anyway.

this optimization reduces the math in high-frequency scheduling
scenarios.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-09 11:16:45 +02:00
Al Viro
175fc48425 fix oops in __audit_signal_info()
The check for audit_signals is misplaced and the check for
audit_dummy_context() is missing; as the result, if we send a signal to
auditd from task with NULL ->audit_context while we have audit_signals
!= 0 we end up with an oops.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-07 19:58:56 -07:00
Al Viro
6f605d83dd take sched_debug.c out of nasal demon territory
C99 6.10.3[11]: preprocessing directive within the argument list of
macro invocation => undefined behaviour.  Don't do that...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-06 18:22:27 -07:00
Oleg Nesterov
247284481c Kill some obsolete sub-thread-ptrace stuff
There is a couple of subtle checks which were needed to handle ptracing from
the same thread group. This was deprecated a long ago, imho this code just
complicates the understanding.

And, the "->parent->signal->flags & SIGNAL_GROUP_EXIT" check in exit_notify()
is not right. SIGNAL_GROUP_EXIT can mean exec(), not exit_group(). This means
ptracer can lose a ptraced zombie on exec(). Minor problem, but still the bug.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-03 15:06:33 -07:00
Daniel Ritz
b6b1d87785 serial: fix 8250 early console setup
the early setup function serial8250_console_early_setup() can be called
from non __init code (eg. hotpluggable serial ports like serial_cs) so
remove the __init from the call chain to avoid crashes.

Signed-off-by: Daniel Ritz <daniel.ritz@gmx.ch>
Cc: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-03 15:02:56 -07:00
Ingo Molnar
6cfb0d5d06 [PATCH] sched: reduce debug code
move the rest of the debugging/instrumentation code to under
CONFIG_SCHEDSTATS too. This reduces code size and speeds code up:

    text    data     bss     dec     hex filename
   33044    4122      28   37194    914a sched.o.before
   32708    4122      28   36858    8ffa sched.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-02 17:41:40 +02:00
Ingo Molnar
8179ca23d5 [PATCH] sched: use schedstat_set() API
make use of the new schedstat_set() API to eliminate two #ifdef sections.

No functional changes:

    text    data     bss     dec     hex filename
   29009    4122      28   33159    8187 sched.o.before
   29009    4122      28   33159    8187 sched.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-02 17:41:40 +02:00
Ingo Molnar
c3c7011969 [PATCH] sched: add schedstat_set() API
add the schedstat_set() API, to allow the reduction of
CONFIG_SCHEDSTAT related #ifdefs. No code changed.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-02 17:41:40 +02:00
Ingo Molnar
9c2172459a [PATCH] sched: move load-calculation functions
move load-calculation functions so that they can use the per-policy
declarations and methods.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-02 17:41:40 +02:00
Ingo Molnar
cad60d93e1 [PATCH] sched: ->task_new cleanup
make sched_class.task_new == NULL a 'default method', this
allows the removal of task_rt_new.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-02 17:41:40 +02:00
Ingo Molnar
4e6f96f313 [PATCH] sched: uninline inc/dec_nr_running()
uninline inc_nr_running() and dec_nr_running():

   text    data     bss     dec     hex filename
   29039    4162      24   33225    81c9 sched.o.before
   29027    4162      24   33213    81bd sched.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-02 17:41:40 +02:00
Ingo Molnar
cb1c4fc924 [PATCH] sched: uninline calc_delta_mine()
uninline calc_delta_mine():

   text    data     bss     dec     hex filename
   29162    4162      24   33348    8244 sched.o.before
   29039    4162      24   33225    81c9 sched.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-02 17:41:40 +02:00
Ingo Molnar
ecf691daf7 [PATCH] sched: calc_delta_mine(): use fixed limit
use fixed limit in calc_delta_mine() - this saves an instruction :)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-02 17:41:40 +02:00
Peter Williams
5a4f3ea77e [PATCH] sched: tidy up left over smpnice code
1. The only place that RTPRIO_TO_LOAD_WEIGHT() is used is in the call to
move_tasks() in the function active_load_balance() and its purpose here
is just to make sure that the load to be moved is big enough to ensure
that exactly one task is moved (if there's one available).  This can be
accomplished by using ULONG_MAX instead and this allows
RTPRIO_TO_LOAD_WEIGHT() to be deleted.

2. This, in turn, allows PRIO_TO_LOAD_WEIGHT() to be deleted.

3. This allows load_weight() to be deleted which allows
TIME_SLICE_NICE_ZERO to be deleted along with the comment above it.

Signed-off-by: Peter Williams <pwil3058@bigpond.net.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-02 17:41:40 +02:00
Ingo Molnar
362a701663 [PATCH] sched: remove cache_hot_time
remove the last unused remains of cache_hot_time.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-02 17:41:40 +02:00
Thomas Gleixner
0fc4969b86 genirq: temporary fix for level-triggered IRQ resend
Marcin Slusarz reported a ne2k-pci "hung network interface" regression.

delayed disable relies on the ability to re-trigger the interrupt in the
case that a real interrupt happens after the software disable was set.
In this case we actually disable the interrupt on the hardware level
_after_ it occurred.

On enable_irq, we need to re-trigger the interrupt. On i386 this relies
on a hardware resend mechanism (send_IPI_self()).

Actually we only need the resend for edge type interrupts. Level type
interrupts come back once enable_irq() re-enables the interrupt line.

I assume that the interrupt in question is level triggered because it is
shared and above the legacy irqs 0-15:

	17:         12   IO-APIC-fasteoi   eth1, eth0

Looking into the IO_APIC code, the resend via send_IPI_self() happens
unconditionally. So the resend is done for level and edge interrupts.
This makes the problem more mysterious.

The code in question lib8390.c does

	disable_irq();
	fiddle_with_the_network_card_hardware()
	enable_irq();

The fiddle_with_the_network_card_hardware() might cause interrupts,
which are cleared in the same code path again,

Marcin found that when he disables the irq line on the hardware level
(removing the delayed disable) the card is kept alive.

So the difference is that we can get a resend on enable_irq, when an
interrupt happens during the time, where we are in the disabled region.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-01 20:46:22 -07:00
Jesper Juhl
c9b3febc5b Fix a use after free bug in kernel->userspace relay file support
Coverity spotted what looks like a real possible case of using a variable
after it has been freed.  The problem is in
kernel/relay.c::relay_open_buf()

If the code hits "goto free_buf;" it ends up in this code :

  free_buf:
    	relay_destroy_buf(buf);	<--- calls kfree() on 'buf'.
  free_name:
   	kfree(tmpname);
  end:
  	return buf;		<-- use after free of 'buf'.

I read through the callers and they all handle a NULL return from this
function as an error (and hitting the 'free_buf' label only happens on
failure to chan->cb->create_buf_file(), so that looks like a clear error to
me).

The patch simply sets 'buf' to NULL after the call to
relay_destroy_buf(buf); - as far as I can see that should take care of the
problem.

The patch also corrects a reference to a documentation file while
I was at it.

Note from Mathieu: the documentation reference change should have been
done in a separate patch, but I guess no one will really care.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Acked-by: "David J. Wilder" <wilder@us.ibm.com>
Tested-by: "David J. Wilder" <wilder@us.ibm.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Tom Zanussi <zanussi@us.ibm.com>
Cc: Karim Yaghmour <karim@opersys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-31 15:39:42 -07:00
Satyam Sharma
e804a4a4dd kthread: silence bogus section mismatch warning
WARNING: kernel/built-in.o(.text+0x16910): Section mismatch:
reference to .init.text: (between 'kthreadd' and 'init_waitqueue_head')

comes because kernel/kthread.c:kthreadd() is not __init but calls
kthreadd_setup() which is __init. But this is ok, because kthreadd_setup()
is only ever called at init time, and then kthreadd() proceeds into its
"for (;;)" loop. We could mark kthreadd __init_refok, but kthreadd_setup()
with just one callsite and 4 lines in it (it's been that small since
10ab825bde) doesn't need to be a separate function at all -- so let's
just move those four lines at beginning of kthreadd() itself.

Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-31 15:39:42 -07:00
Andreas Schwab
f54f098612 futex: pass nr_wake2 to futex_wake_op
The fourth argument of sys_futex is ignored when op == FUTEX_WAKE_OP,
but futex_wake_op expects it as its nr_wake2 parameter.

The only user of this operation in glibc is always passing 1, so this
bug had no consequences so far.

Signed-off-by: Andreas Schwab <schwab@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-31 15:39:40 -07:00
Alexey Dobriyan
c0f3358621 Fix leak on /proc/lockdep_stats
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-31 15:39:40 -07:00
Alexey Dobriyan
5ea473a1df Fix leaks on /proc/{*/sched,sched_debug,timer_list,timer_stats}
On every open/close one struct seq_operations leaks.
Kudos to /proc/slab_allocators.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-31 15:39:40 -07:00
Randy Dunlap
421cee2935 sched: fix kernel-doc warnings
Fix kernel-doc warnings in sched.c:

Warning(linux-2623-rc1g4//kernel/sched.c:1685): No description found for parameter 'notifier'
Warning(linux-2623-rc1g4//kernel/sched.c:1696): No description found for parameter 'notifier'
Warning(linux-2623-rc1g4//kernel/sched.c:1750): No description found for parameter 'prev'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-31 15:39:38 -07:00
Greg Kroah-Hartman
74c5b597e9 modules: better error messages when modules fail to load due to a sysfs problem.
This helps people when debugging problems like the ones that were in the
recent -mm releases.


Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-30 14:25:23 -07:00
Len Brown
673d5b43da ACPI: restore CONFIG_ACPI_SLEEP
Restore the 2.6.22 CONFIG_ACPI_SLEEP build option, but now shadowing the
new CONFIG_PM_SLEEP option.

Signed-off-by: Len Brown <len.brown@intel.com>
[ Modified to work with the PM config setup changes. ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-29 16:53:59 -07:00
Rafael J. Wysocki
296699de6b Introduce CONFIG_SUSPEND for suspend-to-Ram and standby
Introduce CONFIG_SUSPEND representing the ability to enter system sleep
states, such as the ACPI S3 state, and allow the user to choose SUSPEND
and HIBERNATION independently of each other.

Make HOTPLUG_CPU be selected automatically if SUSPEND or HIBERNATION has
been chosen and the kernel is intended for SMP systems.

Also, introduce CONFIG_PM_SLEEP which is automatically selected if
CONFIG_SUSPEND or CONFIG_HIBERNATION is set and use it to select the
code needed for both suspend and hibernation.

The top-level power management headers and the ACPI code related to
suspend and hibernation are modified to use the new definitions (the
changes in drivers/acpi/sleep/main.c are, mostly, moving code to reduce
the number of ifdefs).

There are many other files in which CONFIG_PM can be replaced with
CONFIG_PM_SLEEP or even with CONFIG_SUSPEND, but they can be updated in
the future.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-29 16:45:38 -07:00
Rafael J. Wysocki
b0cb1a19d0 Replace CONFIG_SOFTWARE_SUSPEND with CONFIG_HIBERNATION
Replace CONFIG_SOFTWARE_SUSPEND with CONFIG_HIBERNATION to avoid
confusion (among other things, with CONFIG_SUSPEND introduced in the
next patch).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-29 16:45:38 -07:00
Peter Zijlstra
040b3a2df2 audit: fix two bugs in the new execve audit code
copy_from_user() returns the number of bytes not copied, hence 0 is the
expected output.

axi->mm might not be valid anymore when not equal to current->mm, do not
dereference before checking that - thanks to Al for spotting that.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-28 19:42:22 -07:00
Al Viro
0af3678f7c rip some includes from linux/interrupt.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-28 19:42:22 -07:00
Linus Torvalds
257f49251c Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
  [PATCH] sched: debug feature - make the sched-domains tree runtime-tweakable
  [PATCH] sched: add above_background_load() function
  [PATCH] sched: update Documentation/sched-stats.txt
  [PATCH] sched: mark sysrq_sched_debug_show() static
  [PATCH] sched: make cpu_clock() not use the rq clock
  [PATCH] sched: remove unused rq->load_balance_class
  [PATCH] sched: arch preempt notifier mechanism
  [PATCH] sched: increase SCHED_LOAD_SCALE_FUZZ
2007-07-26 13:59:59 -07:00
Rafael J. Wysocki
58b3b71dfa Fix ThinkPad T42 poweroff failure introduced by by "PM: Introduce pm_power_off_prepare"
Commit bd804eba1c ("PM: Introduce
pm_power_off_prepare") caused problems in the poweroff path, as reported by
YOSHIFUJI Hideaki / 吉藤英明.

Generally, sysdev_shutdown() should be called after the ACPI preparation for
powering the system off.  To make it happen, we can separate sysdev_shutdown()
from device_shutdown() and call it directly wherever necessary.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Tested-by: YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-26 12:13:06 -07:00
Randy Dunlap
61df47c8da kernel-doc fix for kmod.c
Fix kmod.c:
Warning(linux-2.6.23-rc1//kernel/kmod.c:364): No description found for parameter 'envp'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-26 11:33:06 -07:00
Nick Piggin
e692ab5347 [PATCH] sched: debug feature - make the sched-domains tree runtime-tweakable
debugging feature: make the sched-domains tree runtime-tweakable.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ mingo@elte.hu: made it depend on CONFIG_SCHED_DEBUG & small updates ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-26 13:40:43 +02:00
Josh Triplett
f337346193 [PATCH] sched: mark sysrq_sched_debug_show() static
Only sched.c uses sysrq_sched_debug_show, and sched.c includes sched_debug.c,
so all uses of sysrq_sched_debug_show occur in the same source file.

Eliminates a sparse warning:
warning: symbol 'sysrq_sched_debug_show' was not declared. Should it be static?

Signed-off-by: Josh Triplett <josh@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-26 13:40:43 +02:00
Ingo Molnar
2cd4d0ea19 [PATCH] sched: make cpu_clock() not use the rq clock
it is enough to disable interrupts to get the precise rq-clock
of the local CPU.

this also solves an NMI watchdog regression: the NMI watchdog
calls touch_softlockup_watchdog(), which might deadlock on
rq->lock if the NMI hits an rq-locked critical section.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-26 13:40:43 +02:00
Satoru Takeuchi
018a221295 [PATCH] sched: remove unused rq->load_balance_class
Remove unused rq->load_balance_class.

Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-26 13:40:43 +02:00
Avi Kivity
e107be36ef [PATCH] sched: arch preempt notifier mechanism
This adds a general mechanism whereby a task can request the scheduler to
notify it whenever it is preempted or scheduled back in.  This allows the
task to swap any special-purpose registers like the fpu or Intel's VT
registers.

Signed-off-by: Avi Kivity <avi@qumranet.com>
[ mingo@elte.hu: fixes, cleanups ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-26 13:40:43 +02:00
Linus Torvalds
a4fb2122f1 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  ACPI: Kconfig: remove CONFIG_ACPI_SLEEP from source
  ACPI: quiet ACPI Exceptions due to no _PTC or _TSS
  ACPI: Remove references to ACPI_STATE_S2 from acpi_pm_enter
  ACPI: Kconfig: always enable CONFIG_ACPI_SLEEP on X86
  ACPI: Kconfig: fold /proc/acpi/sleep under CONFIG_ACPI_PROCFS
  ACPI: Kconfig: CONFIG_ACPI_PROCFS now defaults to N
  ACPI: autoload modules - Create __mod_acpi_device_table symbol for all ACPI drivers
  ACPI: autoload modules - Create ACPI alias interface
  ACPI: autoload modules - ACPICA modifications
  ACPI: asus-laptop: Fix failure exits
  ACPI: fix oops due to typo in new throttling code
  ACPI: ignore _PSx method for hotplugable PCI devices
  ACPI: Use ACPI methods to select PCI device suspend state
  ACPI, PNP: hook ACPI D-state to PNP suspend/resume
  ACPI: Add acpi_pm_device_sleep_state helper routine
  ACPI: Implement the set_target() callback from pm_ops
2007-07-25 11:28:00 -07:00
john stultz
17c38b7490 Cache xtime every call to update_wall_time
This avoids xtime lag seen with dynticks, because while 'xtime' itself
is still not updated often, we keep a 'xtime_cache' variable around that
contains the approximate real-time that _is_ updated each time we do a
'update_wall_time()', and is thus never off by more than one tick.

IOW, this restores the original semantics for 'xtime' users, as long as
you use the proper abstraction functions (ie 'current_kernel_time()' or
'get_seconds()' depending on whether you want a timespec or just the
seconds field).

[ Updated Patch.  As penance for my sins I've also yanked another #ifdef
  that was added to avoid the xtime lag w/ hrtimers.  ]

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-25 10:17:44 -07:00
john stultz
2c6b47de17 Cleanup non-arch xtime uses, use get_seconds() or current_kernel_time().
This avoids use of the kernel-internal "xtime" variable directly outside
of the actual time-related functions.  Instead, use the helper functions
that we already have available to us.

This doesn't actually change any behaviour, but this will allow us to
fix the fact that "xtime" isn't updated very often with CONFIG_NO_HZ
(because much of the realtime information is maintained as separate
offsets to 'xtime'), which has caused interfaces that use xtime directly
to get a time that is out of sync with the real-time clock by up to a
third of a second or so.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-25 10:09:20 -07:00
Len Brown
e8b2fd0122 ACPI: Kconfig: remove CONFIG_ACPI_SLEEP from source
As it was a synonym for (CONFIG_ACPI && CONFIG_X86),
the ifdefs for it were more clutter than they were worth.

For ia64, just add a few stubs in anticipation of future
S3 or S4 support.

Signed-off-by: Len Brown <len.brown@intel.com>
2007-07-25 01:29:39 -04:00
Linus Torvalds
dd6ccfe64d Merge branch 'audit.b39' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b39' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
  [PATCH] get rid of AVC_PATH postponed treatment
  [PATCH] allow audit filtering on bit & operations
  [PATCH] audit: fix broken class-based syscall audit
  [PATCH] Make IPC mode consistent
2007-07-22 11:18:20 -07:00
Masoud Asgharifard Sharbiani
abd4f7505b x86: i386-show-unhandled-signals-v3
This patch makes the i386 behave the same way that x86_64 does when a
segfault happens.  A line gets printed to the kernel log so that tools
that need to check for failures can behave more uniformly between
debug.show_unhandled_signals sysctl variable to 0 (or by doing echo 0 >
/proc/sys/debug/exception-trace)

Also, all of the lines being printed are now using printk_ratelimit() to
deny the ability of DoS from a local user with a program like the
following:

main()
{
       while (1)
               if (!fork()) *(int *)0 = 0;
}

This new revision also includes the fix that Andrew did which got rid of
new sysctl that was added to the system in earlier versions of this.
Also, 'show-unhandled-signals' sysctl has been renamed back to the old
'exception-trace' to avoid breakage of people's scripts.

AK: Enabling by default for i386 will be likely controversal, but let's see what happens
AK: Really folks, before complaining just fix your segfaults
AK: I bet this will find a lot of silent issues

Signed-off-by: Masoud Sharbiani <masouds@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
[ Personally, I've found the complaints useful on x86-64, so I'm all for
  this. That said, I wonder if we could do it more prettily..   -Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-22 11:03:37 -07:00
Al Viro
4259fa01a2 [PATCH] get rid of AVC_PATH postponed treatment
Selinux folks had been complaining about the lack of AVC_PATH
records when audit is disabled.  I must admit my stupidity - I assumed
that avc_audit() really couldn't use audit_log_d_path() because of
deadlocks (== could be called with dcache_lock or vfsmount_lock held).
Shouldn't have made that assumption - it never gets called that way.
It _is_ called under spinlocks, but not those.

        Since audit_log_d_path() uses ab->gfp_mask for allocations,
kmalloc() in there is not a problem.  IOW, the simple fix is sufficient:
let's rip AUDIT_AVC_PATH out and simply generate pathname as part of main
record.  It's trivial to do.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: James Morris <jmorris@namei.org>
2007-07-22 09:57:02 -04:00
Eric Paris
74f2345b6b [PATCH] allow audit filtering on bit & operations
Right now the audit filter can match on = != > < >= blah blah blah.
This allow the filter to also look at bitwise AND operations, &

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-07-22 09:57:02 -04:00
Klaus Weidner
c926e4f432 [PATCH] audit: fix broken class-based syscall audit
The sanity check in audit_match_class() is wrong.  We are able to audit
2048 syscalls but in audit_match_class() we were accidentally using
sizeof(_u32) instead of number of bits in _u32 when deciding how many
syscalls were valid.  On ia64 in particular we were hitting syscall
numbers over the (wrong) limit of 256.  Fixing the audit_match_class
check takes care of the problem.

Signed-off-by: Klaus Weidner <klaus@atsec.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-07-22 09:57:02 -04:00
Steve Grubb
5b9a426223 [PATCH] Make IPC mode consistent
The mode fields for IPC records are not consistent. Some are hex, others are
octal. This patch makes them all octal.

Signed-off-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-07-22 09:57:02 -04:00
Nigel Cunningham
44bf4cea43 x86: PM_TRACE support
Signed-off-by: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:10 -07:00
Andi Kleen
42ee2b7414 x86_64: Report the pending irq if available in smp_affinity
Otherwise smp_affinity would only update after the next interrupt
on x86 systems.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:07 -07:00
Thomas Gleixner
82644459c5 NTP: move the cmos update code into ntp.c
i386 and sparc64 have the identical code to update the cmos clock.  Move it
into kernel/time/ntp.c as there are other architectures coming along with the
same requirements.

[akpm@linux-foundation.org: build fixes]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: David Miller <davem@davemloft.net>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 17:49:15 -07:00
Ingo Molnar
99bc2fcb28 hrtimer: speedup hrtimer_enqueue
Speedup hrtimer_enqueue by evaluating the rbtree insertion result.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 17:49:15 -07:00
Ingo Molnar
820de5c39e highres: improve debug output
Add some more debug information to the hrtimer and clock events code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 17:49:15 -07:00
john stultz
3704540b48 tick management: spread timer interrupt
After discussing w/ Thomas over IRC, it seems the issue is the sched tick
fires on every cpu at the same time, causing extra lock contention.

This smaller change, adds an extra offset per cpu so the ticks don't line up.
This patch also drops the idle latency from 40us down to under 20us.

Signed-off-by: john stultz <johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 17:49:15 -07:00
Thomas Gleixner
5590a536c0 clockevents: fix device replacement
When a device is replaced by a better rated device, then the broadcast
mode needs to be evaluated again. When the new device has no requirement
for broadcasting, then the broadcast bits for the CPU must be cleared.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 17:49:15 -07:00
Thomas Gleixner
18de5bc4c1 clockevents: fix resume logic
We need to make sure, that the clockevent devices are resumed, before
the tick is resumed. The current resume logic does not guarantee this.

Add CLOCK_EVT_MODE_RESUME and call the set mode functions of the clock
event devices before resuming the tick / oneshot functionality.

Fixup the existing users.

Thanks to Nigel Cunningham for tracking down a long standing thinko,
which affected the jinxed VAIO.

[akpm@linux-foundation.org: xen build fix]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 17:49:15 -07:00
Linus Torvalds
2008220879 Revert "sys_time() speedup"
This basically reverts commit 4e44f3497d,
while waiting for it to be re-done more completely.  There are cases of
people mixing "time()" with higher-resolution time sources, and we need
to take the nanosecond offsets into account.

Ingo has a patch that does that, but it's still under some discussion.
In the meantime, just revert back to the old simple situation of just
doing the whole exact timesource calculations.

But rather than using do_gettimeofday(), use the internal nanosecond
resolution getnstimeofday(), which at least avoids one unnecessary
conversion (since we really don't care about whether the fractional
seconds are nanoseconds or microseconds - we'll just throw them away).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-20 13:32:46 -07:00
Linus Torvalds
efa7e8673c Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] Prevent people from directly including <asm/rwsem.h>.
  [IA64] remove time interpolator
  [IA64] Convert to generic timekeeping/clocksource
  [IA64] refresh some config files for 64K pagesize
  [IA64] Delete iosapic_free_rte()
  [IA64] fallocate system call
  [IA64] Enable percpu vector domain for IA64_DIG
  [IA64] Enable percpu vector domain for IA64_GENERIC
  [IA64] Support irq migration across domain
  [IA64] Add support for vector domain
  [IA64] Add mapping table between irq and vector
  [IA64] Check if irq is sharable
  [IA64] Fix invalid irq vector assumption for iosapic
  [IA64] Use dynamic irq for iosapic interrupts
  [IA64] Use per iosapic lock for indirect iosapic register access
  [IA64] Cleanup lock order in iosapic_register_intr
  [IA64] Remove duplicated members in iosapic_rte_info
  [IA64] Remove block structure for locking in iosapic.c
2007-07-20 12:02:20 -07:00
David Howells
0b1937ac0e FRV: Fix linkage problems
Make it possible to use __start_notes and __stop_notes without getting a GPREL
overflow error from the FRV linker.

Small variables that would otherwise be in .data or .bss may, depending on the
arch, be placed in special sections (.sdata or .sbss) that permit single
instruction references on fixed instruction width machines.

__start_notes and __stop_notes aren't really char variables, and certainly
don't refer to data in .data or .bss.  Making them type "void" fools the
compiler into not assuming anything about them.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-20 12:01:34 -07:00