2009-01-09 01:01:53 -05:00
/*
 * Performance counter support - PowerPC-specific definitions.
 *
 * Copyright 2008-2009 Paul Mackerras, IBM Corporation.
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version
 * 2 of the License, or (at your option) any later version.
 */
2009-01-09 04:21:55 -05:00
#include <linux/types.h>

#define MAX_HWCOUNTERS		8
#define MAX_EVENT_ALTERNATIVES	8
perf_counter: powerpc: allow use of limited-function counters

POWER5+ and POWER6 have two hardware counters with limited functionality:
PMC5 counts instructions completed in run state and PMC6 counts cycles
in run state. (Run state is the state when a hardware RUN bit is 1;
the idle task clears RUN while waiting for work to do and sets it when
there is work to do.)

These counters can't be written to by the kernel, can't generate
interrupts, and don't obey the freeze conditions. That means we can
only use them for per-task counters (where we know we'll always be in
run state; we can't put a per-task counter on an idle task), and only
if we don't want interrupts and we do want to count in all processor
modes.

Obviously some counters can't go on a limited hardware counter, but there
are also situations where we can only put a counter on a limited hardware
counter - if there are already counters on that exclude some processor
modes and we want to put on a per-task cycle or instruction counter that
doesn't exclude any processor mode, it could go on if it can use a
limited hardware counter.

To keep track of these constraints, this adds a flags argument to the
processor-specific get_alternatives() functions, with three bits defined:
one to say that we can accept alternative event codes that go on limited
counters, one to say we only want alternatives on limited counters, and
one to say that this is a per-task counter and therefore events that are
gated by run state are equivalent to those that aren't (e.g. a "cycles"
event is equivalent to a "cycles in run state" event). These flags
are computed for each counter and stored in the counter->hw.counter_base
field (slightly wonky name for what it does, but it was an existing
unused field).

Since the limited counters don't freeze when we freeze the other counters,
we need some special handling to avoid getting skew between things counted
on the limited counters and those counted on normal counters. To minimize
this skew, if we are using any limited counters, we read PMC5 and PMC6
immediately after setting and clearing the freeze bit. This is done in
a single asm in the new write_mmcr0() function.

The code here is specific to PMC5 and PMC6 being the limited hardware
counters. Being more general (e.g. having a bitmap of limited hardware
counter numbers) would have meant more complex code to read the limited
counters when freezing and unfreezing the normal counters, with
conditional branches, which would have increased the skew. Since it
isn't necessary for the code to be more general at this stage, it isn't.

This also extends the back-ends for POWER5+ and POWER6 to be able to
handle up to 6 counters rather than the 4 they previously handled.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
LKML-Reference: <18936.19035.163066.892208@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-29 08:38:51 -04:00
#define MAX_LIMITED_HWCOUNTERS	2
2009-01-09 04:21:55 -05:00

/*
 * This struct provides the constants and functions needed to
 * describe the PMU on a particular POWER-family CPU.
 */
struct power_pmu {
	int	n_counter;
	int	max_alternatives;
	u64	add_fields;
	u64	test_adder;
2009-05-13 23:29:14 -04:00
	int	(*compute_mmcr)(u64 events[], int n_ev,
2009-01-09 04:21:55 -05:00
				unsigned int hwc[], u64 mmcr[]);
2009-05-13 23:29:14 -04:00
	int	(*get_constraint)(u64 event, u64 *mskp, u64 *valp);
	int	(*get_alternatives)(u64 event, unsigned int flags,
				    u64 alt[]);
2009-01-09 04:21:55 -05:00
	void	(*disable_pmc)(unsigned int pmc, u64 mmcr[]);
2009-05-13 23:29:14 -04:00
	int	(*limited_pmc_event)(u64 event);
2009-04-29 08:38:51 -04:00
	int	limited_pmc5_6;	/* PMC5 and PMC6 have limited function */
2009-01-09 04:21:55 -05:00
	int	n_generic;
	int	*generic_events;
};

extern struct power_pmu *ppmu;

2009-04-29 08:38:51 -04:00
/*
 * Values for flags to get_alternatives()
 */
#define PPMU_LIMITED_PMC_OK	1	/* can put this on a limited PMC */
#define PPMU_LIMITED_PMC_REQD	2	/* have to put this on a limited PMC */
#define PPMU_ONLY_COUNT_RUN	4	/* only counting in run state */

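The following is a minimal sketch of how the first and third flag bits might be derived for one counter, based only on the rules stated in the commit message: a limited PMC is usable only for a per-task counter that wants no interrupts and counts in all processor modes. The `struct counter_request` type and the `alternatives_flags()` helper are hypothetical names introduced for illustration and are not part of this header. The sketch also ignores whether the event itself has a limited-PMC encoding, and it does not compute PPMU_LIMITED_PMC_REQD, which depends on what is already scheduled on the normal counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values as defined in this header. */
#define PPMU_LIMITED_PMC_OK	1	/* can put this on a limited PMC */
#define PPMU_LIMITED_PMC_REQD	2	/* have to put this on a limited PMC */
#define PPMU_ONLY_COUNT_RUN	4	/* only counting in run state */

/* Hypothetical description of a counter request; not a kernel struct. */
struct counter_request {
	bool per_task;		/* attached to a task, not a CPU */
	bool wants_interrupt;	/* needs an overflow interrupt */
	bool excludes_modes;	/* excludes some processor mode */
};

/* Assumed helper: derive get_alternatives() flags for one request. */
static unsigned int alternatives_flags(const struct counter_request *req)
{
	unsigned int flags = 0;

	assert(req != NULL);

	/* A per-task counter only ever runs in run state. */
	if (req->per_task)
		flags |= PPMU_ONLY_COUNT_RUN;

	/* Limited PMCs cannot interrupt and ignore the freeze conditions. */
	if (req->per_task && !req->wants_interrupt && !req->excludes_modes)
		flags |= PPMU_LIMITED_PMC_OK;

	return flags;
}
```

A per-task counter with no interrupt and no mode exclusion gets both PPMU_ONLY_COUNT_RUN and PPMU_LIMITED_PMC_OK under these assumptions; a per-CPU counter gets neither.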
2009-01-09 04:21:55 -05:00
/*
 * The power_pmu.get_constraint function returns a 64-bit value and
 * a 64-bit mask that express the constraints between this event and
 * other events.
 *
 * The value and mask are divided up into (non-overlapping) bitfields
 * of three different types:
 *
 * Select field: this expresses the constraint that some set of bits
 * in MMCR* needs to be set to a specific value for this event.  For a
 * select field, the mask contains 1s in every bit of the field, and
 * the value contains a unique value for each possible setting of the
 * MMCR* bits.  The constraint checking code will ensure that two events
 * that set the same field in their masks have the same value in their
 * value dwords.
 *
 * Add field: this expresses the constraint that there can be at most
 * N events in a particular class.  A field of k bits can be used for
 * N <= 2^(k-1) - 1.  The mask has the most significant bit of the field
 * set (and the other bits 0), and the value has only the least significant
 * bit of the field set.  In addition, the 'add_fields' and 'test_adder'
 * in the struct power_pmu for this processor come into play.  The
 * add_fields value contains 1 in the LSB of the field, and the
 * test_adder contains 2^(k-1) - 1 - N in the field.
 *
 * NAND field: this expresses the constraint that you may not have events
 * in all of a set of classes.  (For example, on PPC970, you can't select
 * events from the FPU, ISU and IDU simultaneously, although any two are
 * possible.)  For N classes, the field is N+1 bits wide, and each class
 * is assigned one bit from the least-significant N bits.  The mask has
 * only the most-significant bit set, and the value has only the bit
 * for the event's class set.  The test_adder has the least significant
 * bit set in the field.
 *
 * If an event is not subject to the constraint expressed by a particular
 * field, then it will have 0 in both the mask and value for that field.
 */
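The select and add fields described in the comment above can be checked with a handful of bitwise operations. The sketch below is one possible way to do it, not necessarily the code the kernel uses. The field layout (a 3-bit add field in bits 0-2 allowing at most N = 2 events, and a 4-bit select field in bits 4-7), `struct constraint`, and `count_compatible()` are all assumptions made for illustration. It relies on the identity a + b = (a | b) + (a & b): values are ORed together everywhere, and the carry term is added back only in add fields, so add fields accumulate a count while select fields must agree.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout, for illustration only:
 *   bits 0-2: add field, k = 3, at most N = 2 events in the class
 *   bits 4-7: select field
 */
#define ADD_FIELDS	0x1ULL	/* 1 in the LSB of the add field */
#define TEST_ADDER	0x1ULL	/* 2^(k-1) - 1 - N = 3 - 2 = 1 */

/* Mask/value pair as returned by a get_constraint()-style function. */
struct constraint {
	uint64_t mask;
	uint64_t value;
};

/*
 * Return how many leading events of ev[] can be scheduled together.
 * Stops at the first event that violates a select or add constraint.
 */
static int count_compatible(const struct constraint ev[], int n)
{
	uint64_t value = 0, mask = 0, nv;
	int i;

	assert(n >= 0);
	for (i = 0; i < n; ++i) {
		/* OR everywhere, true addition inside add fields. */
		nv = (value | ev[i].value) +
		     (value & ev[i].value & ADD_FIELDS);

		/*
		 * Adding test_adder carries into the MSB of an add field
		 * once the count exceeds N; a select-field mismatch shows
		 * up as a difference under the mask.
		 */
		if ((((nv + TEST_ADDER) ^ value) & mask) != 0 ||
		    (((nv + TEST_ADDER) ^ ev[i].value) & ev[i].mask) != 0)
			break;

		value = nv;
		mask |= ev[i].mask;
	}
	return i;
}
```

With this layout, an event in the counted class would use mask 0x4 and value 0x1, and an event needing select setting 1 would use mask 0xf0 and value 0x10. NAND fields are not exercised here.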