Commit Graph

5151 Commits

Author SHA1 Message Date
Li Zefan
faa2f98f85 sched: add sanity check in partition_sched_domains()
Impact: cleanup, add debug check

It's wrong to make dattr_new = NULL if doms_new == NULL, it introduces
memory leak if dattr_new != NULL. Fortunately dattr_new is always NULL
in this case. So remove the code and add a sanity check.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-04 10:25:13 +01:00
Li Zefan
a17e226092 sched: remove redundant call to unregister_sched_domain_sysctl()
Impact: cleanup

The sysctl has been unregistered by partition_sched_domains().

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-04 10:23:42 +01:00
Li Zefan
0a0db8f5c9 sched debug: remove NULL checking in print_cfs/rt_rq()
Impact: cleanup

cfs->tg is initialized in init_tg_cfs_entry() with tg != NULL, and
will never be invalidated to NULL. And the underlying cgroup of a
valid task_group is always valid.

Same for rt->tg.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-04 10:23:18 +01:00
Li Zefan
eefd796a8e sched debug: remove sd_level_to_string()
Impact: cleanup

Just use the newly introduced sd->name.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-04 10:21:49 +01:00
Peter Zijlstra
8bb8c4386d sched, ftrace: trace sched.c
Impact: allow function tracing within sched.c

Its useful to see what happens in sched.c.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-03 08:58:55 +01:00
Ingo Molnar
db5935001a Merge commit 'v2.6.28-rc3' into sched/core 2008-11-03 08:57:41 +01:00
Al Viro
28959742c1 PM_TEST_SUSPEND should depend on RTC_CLASS, not RTC_LIB
Insufficient dependency - we really want CONFIG_RTC_CLASS=y there.
That will give us CONFIG_RTC_LIB=y, so the old dependency can be
simply replaced.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-01 12:40:38 -07:00
Linus Torvalds
42c0202363 reserve_region_with_split: Fix GFP_KERNEL usage under spinlock
This one apparently doesn't generate any warnings, because the function
is only used during system bootup, when the warnings are disabled.  But
it's still very wrong.

The __reserve_region_with_split() function is called with the
resource_lock held for writing, so it must only ever do GFP_ATOMIC
allocations.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-01 09:53:58 -07:00
Linus Torvalds
0b23e30b48 Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: remove sched-design.txt from 00-INDEX
  sched: change sched_debug's mode to 0444
2008-10-30 18:32:03 -07:00
Linus Torvalds
147db6e947 Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  ftrace: handle archs that do not support irqs_disabled_flags
2008-10-30 18:31:42 -07:00
Linus Torvalds
43908195e0 Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  resources: fix x86info results ioremap.c:226 __ioremap_caller+0xf2/0x2d6() WARNINGs
2008-10-30 18:31:27 -07:00
Steven Rostedt
9244489a7b ftrace: handle archs that do not support irqs_disabled_flags
Impact: build fix on non-lockdep architectures

Some architectures do not support a way to read the irq flags that
is set from "local_irq_save(flags)" to determine if interrupts were
disabled or enabled. Ftrace uses this information to display to the user
if the trace occurred with interrupts enabled or disabled.

Besides the fact that those archs that do not support this will fail to
compile, unless they fix it, we do not want to have the trace simply
say interrupts were not disabled or they were enabled, without knowing
the real answer.

This patch adds a 'X' in the output to let the user know that the
architecture they are running on does not support a way for the tracer
to determine if interrupts were enabled or disabled. It also lets those
same archs compile with tracing enabled.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-31 00:03:26 +01:00
Linus Torvalds
0b54968f66 Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  ftrace: fix trace_nop config select
  ftrace: perform an initialization for ftrace to enable it
2008-10-30 11:44:09 -07:00
Sukadev Bhattiprolu
d25141a818 'kill sig -1' must only apply to caller's namespace
Currently "kill <sig> -1" kills processes in all namespaces and breaks the
isolation of namespaces.  Earlier attempt to fix this was discussed at:

	http://lkml.org/lkml/2008/7/23/148

As suggested by Oleg Nesterov in that thread, use "task_pid_vnr() > 1"
check since task_pid_vnr() returns 0 if process is outside the caller's
namespace.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Tested-by: Daniel Hokka Zakrisson <daniel@hozac.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 11:38:46 -07:00
Paul Mundt
ce05fcc30e kernel/profile: fix profile_init() section mismatch
profile_init() calls in to alloc_bootmem() on early initialization.  While
alloc_bootmem() is __init, the reference itself is safe in that it is
tucked below a !slab_is_available() check.  So, flag profile_init() as
__ref.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 11:38:46 -07:00
Li Zefan
51308ee59d freezer_cg: simplify freezer_change_state()
Just call unfreeze_cgroup() if goal_state == THAWED, and call
try_to_freeze_cgroup() if goal_state == FROZEN.

No behavior has been changed.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Cedric Le Goater <clg@fr.ibm.com>
Acked-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 11:38:45 -07:00
Li Zefan
00c2e63c31 freezer_cg: use thaw_process() in unfreeze_cgroup()
Don't duplicate the implementation of thaw_process().

[akpm@linux-foundation.org: make __thaw_process() static]
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Acked-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 11:38:45 -07:00
Li Zefan
80a6a2cf3b freezer_cg: remove redundant check in freezer_can_attach()
It is sufficient to check if @task is frozen, and no need to check if the
original freezer is frozen.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Cedric Le Goater <clg@fr.ibm.com>
Acked-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 11:38:45 -07:00
Li Zefan
7ccb97437b freezer_cg: fix improper BUG_ON() causing oops
The BUG_ON() should be protected by freezer->lock, otherwise it can be
triggered easily when a task has been unfreezed but the corresponding
cgroup hasn't been changed to FROZEN state.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Cedric Le Goater <clg@fr.ibm.com>
Acked-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 11:38:45 -07:00
Li Zefan
34f3a814ee sched: switch sched_features to seqfile
Impact: cleanup

So handling of sched_features read is simplified.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-30 11:38:35 +01:00
Li Zefan
a9cf4ddb3b sched: change sched_debug's mode to 0444
Impact: change /proc/sched/debug from rw-r--r-- to r--r--r--

/proc/sched_debug is read-only.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-30 11:37:57 +01:00
Steven Rostedt
f3384b28a0 ftrace: fix trace_nop config select
Impact: build fix on non-function-tracing architectures

The trace_nop is the tracer that is defined when no tracer is set in
the ftrace infrastructure.

The trace_nop was mistakenly selected by HAVE_FTRACE due to the confusion
between ftrace infrastructure and the ftrace function tracer (which has
been solved by renaming the function tracer).

This patch changes the select to the approriate TRACING.

This patch should fix compile errors on architectures that do not define
the FUNCTION_TRACER.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-29 17:21:05 +01:00
Li Zefan
eab172294d sched: cleanup for alloc_rt/fair_sched_group()
Impact: cleanup

Remove checking parent == NULL. It won't be NULLL, because we dynamically
create sub task_group only, and sub task_group always has its parent.
(root task_group is statically defined)

Also replace kmalloc_node(GFP_ZERO) with kzalloc_node().

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-29 11:53:26 +01:00
Suresh Siddha
d68612b257 resources: fix x86info results ioremap.c:226 __ioremap_caller+0xf2/0x2d6() WARNINGs
Impact: avoid false-positive WARN_ON()

Andi Kleen reported:
> When running x86info on a 2.6.27-git8 system I get
>
> resource map sanity check conflict: 0x9e000 0x9efff 0x10000 0x9e7ff System RAM
> ------------[ cut here ]------------
> WARNING: at /home/lsrc/linux/arch/x86/mm/ioremap.c:226 __ioremap_caller+0xf2/0x2d6()
> ...

Some of the pages below the 1MB ISA addresses will be shared typically by both
BIOS and system usable RAM. For example:
	BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
	BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)

x86info reads the low physical address using /dev/mem, which internally
uses ioremap() for accessing non RAM pages. ioremap() of such low
pages conflicts with multiple resource entities leading to the
above warning.

Change the iomem_map_sanity_check() to allow mapping a page spanning multiple
resource entities (minimum granularity that one can map is a page anyhow).

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 19:56:17 +01:00
Frederic Weisbecker
0b6e4d56bf ftrace: perform an initialization for ftrace to enable it
Impact: corrects a bug which made the non-dyn function tracer not functional

With latest git, the non-dynamic function tracer didn't get any trace.

The problem was the fact that ftrace_enabled wasn't initialized to 1
because ftrace hasn't any init function when DYNAMIC_FTRACE is disabled.

So when a tracer tries to register an ftrace_ops struct,
__register_ftrace_function failed to set the hook.

This patch corrects it by setting an init function to initialize
ftrace during the boot.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 19:15:58 +01:00
Linus Torvalds
e946217e4f Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits)
  ftrace: fix current_tracer error return
  tracing: fix a build error on alpha
  ftrace: use a real variable for ftrace_nop in x86
  tracing/ftrace: make boot tracer select the sched_switch tracer
  tracepoint: check if the probe has been registered
  asm-generic: define DIE_OOPS in asm-generic
  trace: fix printk warning for u64
  ftrace: warning in kernel/trace/ftrace.c
  ftrace: fix build failure
  ftrace, powerpc, sparc64, x86: remove notrace from arch ftrace file
  ftrace: remove ftrace hash
  ftrace: remove mcount set
  ftrace: remove daemon
  ftrace: disable dynamic ftrace for all archs that use daemon
  ftrace: add ftrace warn on to disable ftrace
  ftrace: only have ftrace_kill atomic
  ftrace: use probe_kernel
  ftrace: comment arch ftrace code
  ftrace: return error on failed modified text.
  ftrace: dynamic ftrace process only text section
  ...
2008-10-28 09:52:25 -07:00
Linus Torvalds
0d8762c9ee Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  lockdep: fix irqs on/off ip tracing
  lockdep: minor fix for debug_show_all_locks()
  x86: restore the old swiotlb alloc_coherent behavior
  x86: use GFP_DMA for 24bit coherent_dma_mask
  swiotlb: remove panic for alloc_coherent failure
  xen: compilation fix of drivers/xen/events.c on IA64
  xen: portability clean up and some minor clean up for xencomm.c
  xen: don't reload cr3 on suspend
  kernel/resource: fix reserve_region_with_split() section mismatch
  printk: remove unused code from kernel/printk.c
2008-10-28 09:49:27 -07:00
Linus Torvalds
cf76dddb22 Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  irq: make variable static
2008-10-28 09:48:25 -07:00
Linus Torvalds
8ca6215502 Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: fix documentation reference for sched_min_granularity_ns
  sched: virtual time buddy preemption
  sched: re-instate vruntime based wakeup preemption
  sched: weaken sync hint
  sched: more accurate min_vruntime accounting
  sched: fix a find_busiest_group buglet
  sched: add CONFIG_SMP consistency
2008-10-28 09:46:20 -07:00
Steven Rostedt
60063a6623 ftrace: fix current_tracer error return
The commit (in linux-tip) c2931e05ec
 ( ftrace: return an error when setting a nonexistent tracer )
added useful code that would error when a bad tracer was written into
the current_tracer file.

But this had a bug if the amount written was more than the amount read by
that code. The first iteration would set the tracer correctly, but since
it did not consume the rest of what was written (usually whitespace), the
userspace utility would continue to write what was not consumed. This
second iteration would fail to find a tracer and return -EINVAL. Funny
thing is that the tracer would have already been set.

This patch just consumes all the data that is written to the file.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 16:33:47 +01:00
Heiko Carstens
6afe40b4da lockdep: fix irqs on/off ip tracing
Impact: fix lockdep lock-api-caller output when irqsoff tracing is enabled

81d68a96 "ftrace: trace irq disabled critical timings" added wrappers around
trace_hardirqs_on/off_caller. However these functions use
__builtin_return_address(0) to figure out which function actually disabled
or enabled irqs. The result is that we save the ips of trace_hardirqs_on/off
instead of the real caller. Not very helpful.

However since the patch from Steven the ip already gets passed. So use that
and get rid of __builtin_return_address(0) in these two functions.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 11:19:07 +01:00
qinghuang feng
46fec7ac40 lockdep: minor fix for debug_show_all_locks()
When we failed to get tasklist_lock eventually (count equals 0),
we should only print " ignoring it.\n", and not print
" locked it.\n" needlessly.

Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 10:30:04 +01:00
Frederic Weisbecker
21798a84ab tracing: fix a build error on alpha
Impact: build fix on Alpha

When tracing is enabled, some arch have included <linux/irqflags.h>
on their <asm/system.h> but others like alpha or m68k don't.

Build error on alpha:

kernel/trace/trace.c: In function 'tracing_cpumask_write':
kernel/trace/trace.c:2145: error: implicit declaration of function 'raw_local_irq_disable'
kernel/trace/trace.c:2162: error: implicit declaration of function 'raw_local_irq_enable'

Tested on Alpha through a cross-compiler (should correct a similar issue on m68k).

Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 09:53:28 +01:00
Frederic Weisbecker
ea31e72d75 tracing/ftrace: make boot tracer select the sched_switch tracer
Impact: build fix

If the boot tracer is selected but not the sched_switch,
there will be a build failure:

 kernel/built-in.o: In function `boot_trace_init':
 trace_boot.c:(.text+0x5ee38): undefined reference to `sched_switch_trace'
 kernel/built-in.o: In function `disable_boot_trace':
 (.text+0x5eee1): undefined reference to `tracing_stop_cmdline_record'
 kernel/built-in.o: In function `enable_boot_trace':
 (.text+0x5ef11): undefined reference to `tracing_start_cmdline_record'

This patch fixes it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-27 16:47:13 +01:00
Frederic Weisbecker
f66af459a9 tracepoint: check if the probe has been registered
Impact: fix kernel crash that can trigger during tracing

If we try to remove a probe that has not been already registered,
the tracepoint_entry_remove_probe() function will dereference a NULL
pointer.

Check the probe before removing it to avoid crashes.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-27 16:45:46 +01:00
Stephen Rothwell
e2862c9470 trace: fix printk warning for u64
A powerpc ppc64_defconfig build produces these warnings:

kernel/trace/ring_buffer.c: In function 'rb_add_time_stamp':
kernel/trace/ring_buffer.c:969: warning: format '%llu' expects type 'long long unsigned int', but argument 2 has type 'u64'
kernel/trace/ring_buffer.c:969: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'u64'
kernel/trace/ring_buffer.c:969: warning: format '%llu' expects type 'long long unsigned int', but argument 4 has type 'u64'

Just cast the u64s to unsigned long long like we do everywhere else.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-27 11:31:58 +01:00
Ingo Molnar
4944dd62de Merge commit 'v2.6.28-rc2' into tracing/urgent 2008-10-27 10:50:54 +01:00
Stephen Rothwell
2077776641 cgroup: remove unused variable
/scratch/sfr/next/kernel/cgroup.c: In function 'cgroup_tasks_start':
/scratch/sfr/next/kernel/cgroup.c:2107: warning: unused variable 'i'

Introduced in commit cc31edceee "cgroups:
convert tasks file to use a seq_file with shared pid array".

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-26 09:38:17 -07:00
Linus Torvalds
4403b406d4 Revert "Call init_workqueues before pre smp initcalls."
This reverts commit a802dd0eb5 by moving
the call to init_workqueues() back where it belongs - after SMP has been
initialized.

It also moves stop_machine_init() - which needs workqueues - to a later
phase using a core_initcall() instead of early_initcall().  That should
satisfy all ordering requirements, and was apparently the reason why
init_workqueues() was moved to be too early.

Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-25 19:53:38 -07:00
Ingo Molnar
f17845e5d9 ftrace: warning in kernel/trace/ftrace.c
this warning:

  kernel/trace/ftrace.c:189: warning: ‘frozen_record_count’ defined but not used

triggers because frozen_record_count is only used in the KCONFIG_MARKERS
case. Move the variable it there.

Alas, this frozen-record facility seems to have little use. The
frozen_record_count variable is not used by anything, nor the flags.

So this section might need a bit of dead-code-removal care as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-24 12:52:44 +02:00
Peter Zijlstra
3f3a490480 sched: virtual time buddy preemption
Since we moved wakeup preemption back to virtual time, it makes sense to move
the buddy stuff back as well. The purpose of the buddy scheduling is to allow
a quickly scheduling pair of tasks to run away from the group as far as a
regular busy task would be allowed under wakeup preemption.

This has the advantage that the pair can ping-pong for a while, enjoying
cache-hotness. Without buddy scheduling other tasks would interleave destroying
the cache.

Also, it saves a word in cfs_rq.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-24 12:51:03 +02:00
Peter Zijlstra
464b75273f sched: re-instate vruntime based wakeup preemption
The advantage is that vruntime based wakeup preemption has a better
conceptual model. Here wakeup_gran = 0 means: preempt when 'fair'.
Therefore wakeup_gran is the granularity of unfairness we allow in order
to make progress.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-24 12:51:02 +02:00
Mike Galbraith
0d13033bc9 sched: weaken sync hint
Mysql+oltp and pgsql+oltp peaks are still shifted right. The below puts
the peaks back to 1 client/server pair per core.

Use the avg_overlap information to weaken the sync hint.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-24 12:51:01 +02:00
Peter Zijlstra
1af5f730fc sched: more accurate min_vruntime accounting
Mike noticed the current min_vruntime tracking can go wrong and skip the
current task. If the only remaining task in the tree is a nice 19 task
with huge vruntime, new tasks will be inserted too far to the right too,
causing some interactibity issues.

min_vruntime can only change due to the leftmost entry disappearing
(dequeue_entity()), or by the leftmost entry being incremented past the
next entry, which elects a new leftmost (__update_curr())

Due to the current entry not being part of the actual tree, we have to
compare the leftmost tree entry with the current entry, and take the
leftmost of these two.

So create a update_min_vruntime() function that takes computes the
leftmost vruntime in the system (either tree of current) and increases
the cfs_rq->min_vruntime if the computed value is larger than the
previously found min_vruntime. And call this from the two sites we've
identified that can change min_vruntime.

Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-24 12:51:00 +02:00
Peter Zijlstra
01c8c57d66 sched: fix a find_busiest_group buglet
In one of the group load balancer patches:

	commit 408ed066b1
	Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
	Date:   Fri Jun 27 13:41:28 2008 +0200
	Subject: sched: hierarchical load vs find_busiest_group

The following change:

-               if (max_load - this_load + SCHED_LOAD_SCALE_FUZZ >=
+               if (max_load - this_load + 2*busiest_load_per_task >=
                                        busiest_load_per_task * imbn) {

made the condition always true, because imbn is [1,2].
Therefore, remove the 2*, and give the it a fair chance.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-24 12:50:59 +02:00
Ingo Molnar
8c82a17e9c Merge commit 'v2.6.28-rc1' into sched/urgent 2008-10-24 12:48:46 +02:00
Paul Mundt
bea9211241 kernel/resource: fix reserve_region_with_split() section mismatch
Impact: cleanup, small kernel text size reduction, no functionality changed

reserve_region_with_split() calls in to __reserve_region_with_split(),
which is an __init function. The only caller of reserve_region_with_split()
is an __init function, so make it __init too.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-23 21:54:34 +02:00
roel kluin
acff181d35 printk: remove unused code from kernel/printk.c
both log_buf_copy() and log_buf_len are unused.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-23 21:54:29 +02:00
Linus Torvalds
d2441183dc Fix compile warning in kernel/params.c
Move free_module_param_attrs() into the CONFIG_MODULES section, since
it's only used inside there. Thus avoiding the warning

  kernel/params.c:514: warning: 'free_module_param_attrs' defined but not used

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-23 12:09:00 -07:00
Linus Torvalds
88ed86fee6 Merge branch 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc
* 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc: (35 commits)
  proc: remove fs/proc/proc_misc.c
  proc: move /proc/vmcore creation to fs/proc/vmcore.c
  proc: move pagecount stuff to fs/proc/page.c
  proc: move all /proc/kcore stuff to fs/proc/kcore.c
  proc: move /proc/schedstat boilerplate to kernel/sched_stats.h
  proc: move /proc/modules boilerplate to kernel/module.c
  proc: move /proc/diskstats boilerplate to block/genhd.c
  proc: move /proc/zoneinfo boilerplate to mm/vmstat.c
  proc: move /proc/vmstat boilerplate to mm/vmstat.c
  proc: move /proc/pagetypeinfo boilerplate to mm/vmstat.c
  proc: move /proc/buddyinfo boilerplate to mm/vmstat.c
  proc: move /proc/vmallocinfo to mm/vmalloc.c
  proc: move /proc/slabinfo boilerplate to mm/slub.c, mm/slab.c
  proc: move /proc/slab_allocators boilerplate to mm/slab.c
  proc: move /proc/interrupts boilerplate code to fs/proc/interrupts.c
  proc: move /proc/stat to fs/proc/stat.c
  proc: move rest of /proc/partitions code to block/genhd.c
  proc: move /proc/cpuinfo code to fs/proc/cpuinfo.c
  proc: move /proc/devices code to fs/proc/devices.c
  proc: move rest of /proc/locks to fs/locks.c
  ...
2008-10-23 12:04:37 -07:00