Impact: extend the tracing API
The goal of this patch is to normalize and make more easy the
implementation of statistical (histogram) tracing.
It implements a trace_stat file into the /debugfs/tracing directory where
one can print a one-shot output of statistics/histogram entries.
A tracer has to provide two basic iterator callbacks:
stat_start() => the first entry
stat_next(prev, idx) => the next one.
Note that it is adapted for arrays or hash tables or lists.... since it
provides a pointer to the previous entry and the current index of the
iterator.
These two callbacks are called to get a snapshot of the statistics at each
opening of the trace_stat file because. The values are so updated between
two "cat trace_stat". And the tracer is free to lock its datas during the
iteration to keep consistent values.
Since it is almost always interesting to sort statisticals values to
address the problems by priority, this infrastructure provides a "sorting"
of the stat entries too if desired. A tracer has just to provide a
stat_cmp callback to compare two entries and the stat tracing
infrastructure will build a sorted list of the given entries.
A last callback, called stat_headers, can be implemented by a tracer to
output headers on its trace.
If one of these callbacks is changed on runtime, it just have to signal it
to the stat tracing API by calling the init_tracer_stat() helper.
Changes in V2:
- Fix a memory leak if the user opens multiple times the trace_stat file
without closing it. Now we always free our list before rebuilding it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: rework trace.c to use new event register API
Almost every ftrace event has to implement its output display in
trace.c through a different function. Some events did not handle
all the formats (trace, latency-trace, raw, hex, binary), and
this method does not scale well.
This patch converts the format functions to use the event API to
find the event and and print its format. Currently, we have
a print function for trace, latency_trace, raw, hex and binary.
A trace_nop_print is available if the event wants to avoid output
on a particular format.
Perhaps other tracers could use this in the future (like mmiotrace and
function_graph).
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: simplify/generalize/refactor trace.c
The trace.c file is becoming more difficult to maintain due to the
growing number of events. There is several formats that an event may
be printed. This patch sets up the infrastructure of an event hash to
allow for events to register how they should be printed.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup, remove obsolete code
Now that the ring buffer used by ftrace allows for variable length
entries, we do not need the 'cont' feature of the buffer. This code
makes other parts of ftrace more complex and by removing this it
simplifies the ftrace code.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (241 commits)
sched, trace: update trace_sched_wakeup()
tracing/ftrace: don't trace on early stage of a secondary cpu boot, v3
Revert "x86: disable X86_PTRACE_BTS"
ring-buffer: prevent false positive warning
ring-buffer: fix dangling commit race
ftrace: enable format arguments checking
x86, bts: memory accounting
x86, bts: add fork and exit handling
ftrace: introduce tracing_reset_online_cpus() helper
tracing: fix warnings in kernel/trace/trace_sched_switch.c
tracing: fix warning in kernel/trace/trace.c
tracing/ring-buffer: remove unused ring_buffer size
trace: fix task state printout
ftrace: add not to regex on filtering functions
trace: better use of stack_trace_enabled for boot up code
trace: add a way to enable or disable the stack tracer
x86: entry_64 - introduce FTRACE_ frame macro v2
tracing/ftrace: add the printk-msg-only option
tracing/ftrace: use preempt_enable_no_resched_notrace in ring_buffer_time_stamp()
x86, bts: correctly report invalid bts records
...
Fixed up trivial conflict in scripts/recordmcount.pl due to SH bits
being already partly merged by the SH merge.
Impact: cleanup
This patch factors out common code from multiple tracers into a
tracing_reset_online_cpus() function and converts the tracers to use it.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
this warning:
kernel/trace/trace.c: In function ‘print_lat_fmt’:
kernel/trace/trace.c:1826: warning: unused variable ‘state’
Triggers because 'state' has become unused - remove it.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix occasionally incorrect trace output
The tracing code has interesting varieties of printing out task state.
Unfortunalely only one of the instances is correct as it copies the
code from sched.c:sched_show_task(). The others are plain wrong as
they treatthe bitfield as an integer offset into the character
array. Also the size check of the character array is wrong as it
includes the trailing \0.
Use a common state decoder inline which does the Right Thing.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: display ftrace_printk messages "as is"
By default, ftrace_printk() messages find their output with some other
informations like pid, caller, ...
Sometimes a developer just want to have the ftrace_printk left "as is", without
other information.
This is done by providing a default-off option called printk-msg-only.
To enable it, just do `echo printk-msg-only > /debugfs/tracing/trace_options`
Before the patch:
<...>-2739 [000] 145.692153: __might_sleep: I'm an ftrace_printk msg in __might_sleep
<...>-2739 [000] 145.692155: __might_sleep: I'm another ftrace_printk msg in __might_sleep
After the patch and the printk-msg-only option enabled:
I'm an ftrace_printk msg in __might_sleep
I'm another ftrace_printk msg in __might_sleep
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: restructure code, cleanup
Remove BTS bits from the hw-branch-tracer (renamed from bts-tracer) and
use the ds interface.
Signed-off-by: Markus Metzger <markut.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
this warning:
kernel/trace/trace.c: In function ‘trace_vprintk’:
kernel/trace/trace.c:3626: warning: ‘flags’ may be used uninitialized in this function
shows some confusion about irq_flags / flags use here. We already have
irq_flags so remove the extra flags variable.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: Provide a way to pause the function graph tracer
As suggested by Steven Rostedt, the previous patch that prevented from
spinlock function tracing shouldn't use the raw_spinlock to fix it.
It's much better to follow lockdep with normal spinlock, so this patch
adds a new flag for each task to make the function graph tracer able
to be paused. We also can send an ftrace_printk whithout worrying of
the irrelevant traced spinlock during insertion.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
Apply some suggestions of Steven Rostedt:
_turn tracing_selftest_running into a simple int (no need of an atomic_t)
_set it __read_mostly
_fix a comment style
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
As suggested by Steven Rostedt, this patch provide a new macro
task_curr_ret_stack() to move the cpp conditionnal CONFIG into
the linux/ftrace.h headers.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix tracer selfstests false results
After setting a ftrace_printk somewhere in th kernel, I saw the
Function tracer selftest failing.
When a selftest occurs, the ring buffer is lurked to see if
some entries were inserted. But concurrent insertion such as
ftrace_printk could occured at the same time and could give
false positive or negative results.
This patch prevent prevent from TRACE_PRINT entries insertion
during selftests.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Handle the TRACE_PRINT entries from the function grapg tracer
and output them as a C comment just below the function that called
it, as if it was a comment inside this function.
Example with an ftrace_printk inside might_sleep() function:
void __might_sleep(char *file, int line)
{
static unsigned long prev_jiffy; /* ratelimiting */
ftrace_printk("Hi I'm a comment in might_sleep() :-)");
A chunk of a resulting trace:
0) | _reiserfs_free_block() {
0) | reiserfs_read_bitmap_block() {
0) | __bread() {
0) | __getblk() {
0) | __find_get_block() {
0) 0.698 us | mark_page_accessed();
0) 2.267 us | }
0) | __might_sleep() {
0) | /* Hi I'm a comment in might_sleep() :-) */
0) 1.321 us | }
0) 5.872 us | }
0) 7.313 us | }
0) 8.718 us | }
And this patch brings two minor fixes:
- The newline after a switch-out task has disappeared
- The "|" sign just before the cpu number on task-switch has been deleted.
0) 0.616 us | pick_next_task_rt();
0) 1.457 us | _spin_trylock();
0) 0.653 us | _spin_unlock();
0) 0.728 us | _spin_trylock();
0) 0.631 us | _spin_unlock();
0) 0.729 us | native_load_sp0();
0) 0.593 us | native_load_tls();
------------------------------------------
0) cat-2834 => migrati-3
------------------------------------------
0) | finish_task_switch() {
0) 0.841 us | _spin_unlock_irq();
0) 0.616 us | post_schedule_rt();
0) 3.882 us | }
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: New feature
This patch makes the changes to set_ftrace_pid apply to the function
graph tracer.
# echo $$ > /debugfs/tracing/set_ftrace_pid
# echo function_graph > /debugfs/tracing/current_tracer
Will cause only the current task to be traced. Note, the trace flags are
also inherited by child processes, so the children of the shell
will also be traced.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds the file:
/debugfs/tracing/set_graph_function
which can be used along with the function graph tracer.
When this file is empty, the function graph tracer will act as
usual. When the file has a function in it, the function graph
tracer will only trace that function.
For example:
# echo blk_unplug > /debugfs/tracing/set_graph_function
# cat /debugfs/tracing/trace
[...]
------------------------------------------
| 2) make-19003 => kjournald-2219
------------------------------------------
2) | blk_unplug() {
2) | dm_unplug_all() {
2) | dm_get_table() {
2) 1.381 us | _read_lock();
2) 0.911 us | dm_table_get();
2) 1. 76 us | _read_unlock();
2) + 12.912 us | }
2) | dm_table_unplug_all() {
2) | blk_unplug() {
2) 0.778 us | generic_unplug_device();
2) 2.409 us | }
2) 5.992 us | }
2) 0.813 us | dm_table_put();
2) + 29. 90 us | }
2) + 34.532 us | }
You can add up to 32 functions into this file. Currently we limit it
to 32, but this may change with later improvements.
To add another function, use the append '>>':
# echo sys_read >> /debugfs/tracing/set_graph_function
# cat /debugfs/tracing/set_graph_function
blk_unplug
sys_read
Using the '>' will clear out the function and write anew:
# echo sys_write > /debug/tracing/set_graph_function
# cat /debug/tracing/set_graph_function
sys_write
Note, if you have function graph running while doing this, the small
time between clearing it and updating it will cause the graph to
record all functions. This should not be an issue because after
it sets the filter, only those functions will be recorded from then on.
If you need to only record a particular function then set this
file first before starting the function graph tracer. In the future
this side effect may be corrected.
The set_graph_function file is similar to the set_ftrace_filter but
it does not take wild cards nor does it allow for more than one
function to be set with a single write. There is no technical reason why
this is the case, I just do not have the time yet to implement that.
Note, dynamic ftrace must be enabled for this to appear because it
uses the dynamic ftrace records to match the name to the mcount
call sites.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Conflicts:
fs/nfsd/nfs4recover.c
Manually fixed above to use new creds API functions, e.g.
nfs4_save_creds().
Signed-off-by: James Morris <jmorris@namei.org>
Impact: feature, let entry function decide to trace or not
This patch lets the graph tracer entry function decide if the tracing
should be done at the end as well. This requires all function graph
entry functions return 1 if it should trace, or 0 if the return should
not be traced.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix for lockdep and ftrace
The raw_local_irq_save/restore confuses lockdep. This patch
converts them to the local_irq_save/restore variants.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There are architectures that still have no stacktrace support.
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: enhancement to function graph tracer
Export the trace_find_cmdline so the function graph tracer can
use it to print the comms of the threads.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: feature
This patch sets a C-like output for the function graph tracing.
For this aim, we now call two handler for each function: one on the entry
and one other on return. This way we can draw a well-ordered call stack.
The pid of the previous trace is loosely stored to be compared against
the one of the current trace to see if there were a context switch.
Without this little feature, the call tree would seem broken at
some locations.
We could use the sched_tracer to capture these sched_events but this
way of processing is much more simpler.
2 spaces have been chosen for indentation to fit the screen while deep
calls. The time of execution in nanosecs is printed just after closed
braces, it seems more easy this way to find the corresponding function.
If the time was printed as a first column, it would be not so easy to
find the corresponding function if it is called on a deep depth.
I plan to output the return value but on 32 bits CPU, the return value
can be 32 or 64, and its difficult to guess on which case we are.
I don't know what would be the better solution on X86-32: only print
eax (low-part) or even edx (high-part).
Actually it's thee same problem when a function return a 8 bits value, the
high part of eax could contain junk values...
Here is an example of trace:
sys_read() {
fget_light() {
} 526
vfs_read() {
rw_verify_area() {
security_file_permission() {
cap_file_permission() {
} 519
} 1564
} 2640
do_sync_read() {
pipe_read() {
__might_sleep() {
} 511
pipe_wait() {
prepare_to_wait() {
} 760
deactivate_task() {
dequeue_task() {
dequeue_task_fair() {
dequeue_entity() {
update_curr() {
update_min_vruntime() {
} 504
} 1587
clear_buddies() {
} 512
add_cfs_task_weight() {
} 519
update_min_vruntime() {
} 511
} 5602
dequeue_entity() {
update_curr() {
update_min_vruntime() {
} 496
} 1631
clear_buddies() {
} 496
update_min_vruntime() {
} 527
} 4580
hrtick_update() {
hrtick_start_fair() {
} 488
} 1489
} 13700
} 14949
} 16016
msecs_to_jiffies() {
} 496
put_prev_task_fair() {
} 504
pick_next_task_fair() {
} 489
pick_next_task_rt() {
} 496
pick_next_task_fair() {
} 489
pick_next_task_idle() {
} 489
------------8<---------- thread 4 ------------8<----------
finish_task_switch() {
} 1203
do_softirq() {
__do_softirq() {
__local_bh_disable() {
} 669
rcu_process_callbacks() {
__rcu_process_callbacks() {
cpu_quiet() {
rcu_start_batch() {
} 503
} 1647
} 3128
__rcu_process_callbacks() {
} 542
} 5362
_local_bh_enable() {
} 587
} 8880
} 9986
kthread_should_stop() {
} 669
deactivate_task() {
dequeue_task() {
dequeue_task_fair() {
dequeue_entity() {
update_curr() {
calc_delta_mine() {
} 511
update_min_vruntime() {
} 511
} 2813
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
This patch changes the name of the "return function tracer" into
function-graph-tracer which is a more suitable name for a tracing
which makes one able to retrieve the ordered call stack during
the code flow.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add a callback to allow an ftrace plug-in to write its own header.
Move the call to trace->open() up a few lines.
The changes are required by the BTS ftrace plug-in.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix race
vma->vm_file reference is only stable while holding the mmap_sem,
so move usage of it to within the critical section.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix refcounting/object-access bug
Hold mmap_sem while looking up/accessing vma.
Hold the RCU lock while using the task we looked up.
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add new API to disable all of ftrace on anomalies
It case of a serious anomaly being detected (like something caught by
lockdep) it is a good idea to disable all tracing immediately, without
grabing any locks.
This patch adds ftrace_off_permanent that disables the tracers, function
tracing and ring buffers without a way to enable them again. This should
only be used when something serious has been detected.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: modify+improve the userstacktrace tracing visualization feature
Store thread group leader id, and use it to lookup the address in the
process's map. We could have looked up the address on thread's map,
but the thread might not exist by the time we are called. The process
might not exist either, but if you are reading trace_pipe, that is
unlikely.
Example usage:
mount -t debugfs nodev /sys/kernel/debug
cd /sys/kernel/debug/tracing
echo userstacktrace >iter_ctrl
echo sym-userobj >iter_ctrl
echo sched_switch >current_tracer
echo 1 >tracing_enabled
cat trace_pipe >/tmp/trace&
.... run application ...
echo 0 >tracing_enabled
cat /tmp/trace
You'll see stack entries like:
/lib/libpthread-2.7.so[+0xd370]
You can convert them to function/line using:
addr2line -fie /lib/libpthread-2.7.so 0xd370
Or:
addr2line -fie /usr/lib/debug/libpthread-2.7.so 0xd370
For non-PIC/PIE executables this won't work:
a.out[+0x73b]
You need to run the following: addr2line -fie a.out 0x40073b
(where 0x400000 is the default load address of a.out)
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add new (default-off) tracing visualization feature
Usage example:
mount -t debugfs nodev /sys/kernel/debug
cd /sys/kernel/debug/tracing
echo userstacktrace >iter_ctrl
echo sched_switch >current_tracer
echo 1 >tracing_enabled
.... run application ...
echo 0 >tracing_enabled
Then read one of 'trace','latency_trace','trace_pipe'.
To get the best output you can compile your userspace programs with
frame pointers (at least glibc + the app you are tracing).
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix preemptoff and preemptirqsoff tracer self-tests
I was wondering why the preemptoff and preemptirqsoff tracer selftests
don't work on s390. After all its just that they get called from
non-preemptible context:
kernel_init() will execute all initcalls, however the first line in
kernel_init() is lock_kernel(), which causes the preempt_count to be
increased. Any later calls to add_preempt_count() (especially those
from the selftests) will therefore not result in a call to
trace_preempt_off() since the check below in add_preempt_count()
will be false:
if (preempt_count() == val)
trace_preempt_off(CALLER_ADDR0, get_parent_ip(CALLER_ADDR1));
Hence the trace buffer will be empty.
Fix this by releasing the BKL during the self-tests.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix memory leak
Error handling code following a kzalloc should free the allocated data.
The semantic match that finds the problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@r exists@
local idexpression x;
statement S;
expression E;
identifier f,l;
position p1,p2;
expression *ptr != NULL;
@@
(
if ((x@p1 = \(kmalloc\|kzalloc\|kcalloc\)(...)) == NULL) S
|
x@p1 = \(kmalloc\|kzalloc\|kcalloc\)(...);
...
if (x == NULL) S
)
<... when != x
when != if (...) { <+...x...+> }
x->f = E
...>
(
return \(0\|<+...x...+>\|ptr\);
|
return@p2 ...;
)
@script:python@
p1 << r.p1;
p2 << r.p2;
@@
print "* file: %s kmalloc %s return %s" % (p1[0].file,p1[0].line,p2[0].line)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: help to find the better depth of trace
We decided to arbitrary define the depth of function return trace as
"20". Perhaps this is not enough. To help finding an optimal depth, we
measure now the overrun: the number of functions that have been missed
for the current thread. By default this is not displayed, we have to
do set a particular flag on the return tracer: echo overrun >
/debug/tracing/trace_options And the overrun will be printed on the
right.
As the trace shows below, the current 20 depth is not enough.
update_wall_time+0x37f/0x8c0 -> update_xtime_cache (345 ns) (Overruns: 2838)
update_wall_time+0x384/0x8c0 -> clocksource_get_next (1141 ns) (Overruns: 2838)
do_timer+0x23/0x100 -> update_wall_time (3882 ns) (Overruns: 2838)
tick_do_update_jiffies64+0xbf/0x160 -> do_timer (5339 ns) (Overruns: 2838)
tick_sched_timer+0x6a/0xf0 -> tick_do_update_jiffies64 (7209 ns) (Overruns: 2838)
vgacon_set_cursor_size+0x98/0x120 -> native_io_delay (2613 ns) (Overruns: 274)
vgacon_cursor+0x16e/0x1d0 -> vgacon_set_cursor_size (33151 ns) (Overruns: 274)
set_cursor+0x5f/0x80 -> vgacon_cursor (36432 ns) (Overruns: 274)
con_flush_chars+0x34/0x40 -> set_cursor (38790 ns) (Overruns: 274)
release_console_sem+0x1ec/0x230 -> up (721 ns) (Overruns: 274)
release_console_sem+0x225/0x230 -> wake_up_klogd (316 ns) (Overruns: 274)
con_flush_chars+0x39/0x40 -> release_console_sem (2996 ns) (Overruns: 274)
con_write+0x22/0x30 -> con_flush_chars (46067 ns) (Overruns: 274)
n_tty_write+0x1cc/0x360 -> con_write (292670 ns) (Overruns: 274)
smp_apic_timer_interrupt+0x2a/0x90 -> native_apic_mem_write (330 ns) (Overruns: 274)
irq_enter+0x17/0x70 -> idle_cpu (413 ns) (Overruns: 274)
smp_apic_timer_interrupt+0x2f/0x90 -> irq_enter (1525 ns) (Overruns: 274)
ktime_get_ts+0x40/0x70 -> getnstimeofday (465 ns) (Overruns: 274)
ktime_get_ts+0x60/0x70 -> set_normalized_timespec (436 ns) (Overruns: 274)
ktime_get+0x16/0x30 -> ktime_get_ts (2501 ns) (Overruns: 274)
hrtimer_interrupt+0x77/0x1a0 -> ktime_get (3439 ns) (Overruns: 274)
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: give a way to send specific messages to tracers
The current implementation of tracing uses some flags to control the
output of general tracers. But we have no way to implement custom
flags handling for a specific tracer. This patch proposes a new
callback for the struct tracer which called set_flag and a structure
that represents a 32 bits variable flag.
A tracer can implement a struct tracer_flags on which it puts the
initial value of the flag integer. Than it can place a range of flags
with their name and their flag mask on the flag integer. The structure
that implement a single flag is called struct tracer_opt.
These custom flags will be available through the trace_options file
like the general tracing flags. Changing their value is done like the
other general flags. For example if you have a flag that calls "foo",
you can activate it by writing "foo" or "nofoo" on trace_options.
Note that the set_flag callback is optional and is only needed if you
want the flags changing to be signaled to your tracer and let it to
accept or refuse their assignment.
V2: Some arrangements in coding style....
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix trace_options behavior
writing to trace/trace_options use the index of the array
to find the value of the flag. With branch tracer flag
defined conditionally, this breaks writing to trace_options
with branch tracer disabled.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: extend the ->init() method with the ability to fail
This bring a way to know if the initialization of a tracer successed.
A tracer must return 0 on success and a traditional error (ie:
-ENOMEM) if it fails.
If a tracer fails to init, it is free to print a detailed warn. The
tracing api will not and switch to a new tracer will just return the
error from the init callback.
Note: this will be used for the return tracer.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix confusing write() -EINVAL when changing the tracer
The following commit d9e540762f remade
alive the bug which made the set of a new tracer returning -EINVAL if
this is the longest name of tracer. This patch corrects it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix lockdep disabling itself when function tracing is enabled
The raw_local_irq_saves used in ftrace is causing problems with
lockdep. (it thinks the irq flags are out of sync and disables
itself with a warning)
The raw ops here are not needed, and the normal local_irq_save is fine.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Conflicts:
security/keys/internal.h
security/keys/process_keys.c
security/keys/request_key.c
Fixed conflicts above by using the non 'tsk' versions.
Signed-off-by: James Morris <jmorris@namei.org>
Separate the task security context from task_struct. At this point, the
security data is temporarily embedded in the task_struct with two pointers
pointing to it.
Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in
entry.S via asm-offsets.
With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Impact: better handling of CPU buffer start annotation
Because of the confusion with the per CPU buffers wrapping where
one CPU might be more active at the end of the trace than the other
CPUs causing that one CPU to have a shorter history. Kernel
developers were confused by the "missing" data of that one CPU
at the beginning of the trace output. An annotation was added to
the trace output to show that the buffer had started:
# tracer: function
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
##### CPU 3 buffer started ####
<idle>-0 [003] 158.192959: smp_apic_timer_interrupt
[...]
<idle>-0 [003] 161.556520: default_idle
##### CPU 1 buffer started ####
<idle>-0 [001] 161.592494: hrtimer_force_reprogram
[etc]
But this annotation gets a bit messy when tracers do not fill the
buffers. This patch does a couple of things:
One) it adds a flag to trace_options to disable these annotations
Two) it does not annotate if the tracer did not overflow its buffer.
This makes the output much cleaner.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>