Impact: new API
The old topology_core_siblings() and topology_thread_siblings() return
a cpumask_t; these new ones return a (const) struct cpumask *.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Impact: use new API, remove cpumask from stack.
Change smp_call_function_mask() callers to smp_call_function_many().
This removes a cpumask from the stack, and falls back should allocating
the cpumask var fail (only possible with CONFIG_CPUMASKS_OFFSTACK).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: jeremy@xensource.com
Impact: Remove cpumask_t's from stack.
Simple transition to work_on_cpu(), rather than cpumask games.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Robert Richter <robert.richter@amd.com>
Cc: jacob.shin@amd.com
Impact: remove cpumask_t from stack.
We should not try to save and restore cpus_allowed on current.
We can't use work_on_cpu() here, since it's in the hotplug cpu path
(if anyone else tries to get the hotplug lock from a workqueue we
could deadlock against them).
Fortunately, we can just use smp_call_function_single() since the
function can run from an interrupt.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Impact: use new API
Use the accessors rather than frobbing bits directly. Most of this is
in arch code I haven't even compiled, but is straightforward.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Impact: cleanup, futureproof
In fact, all cpumask ops will only be valid (in general) for bit
numbers < nr_cpu_ids. So use that instead of NR_CPUS in various
places.
This is always safe: no cpu number can be >= nr_cpu_ids, and
nr_cpu_ids is initialized to NR_CPUS at boot.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
This patch simply changes cpumask_t to struct cpumask and similar
trivial modernizations.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Impact: cleanup, remove on-stack cpumask.
The "map" arg is always cpu_online_mask. Importantly, set_affinity
always ands the argument with cpu_online_mask anyway, so we don't need
to do it in fixup_irqs(), avoiding a temporary.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Simple change, and eventual space saving when NR_CPUS >> nr_cpu_ids.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Impact: cleanup, consolidate patches, use new API
Consolidate the following into a single patch to adapt to new
sparseirq code in arch/x86/kernel/io_apic.c, add allocation of
cpumask_var_t's in domain and old_domain, and reduce further
merge conflicts. Only one file (arch/x86/kernel/io_apic.c) is
changed in all of these patches.
0006-x86-io_apic-change-irq_cfg-domain-old_domain-to.patch
0007-x86-io_apic-set_desc_affinity.patch
0008-x86-io_apic-send_cleanup_vector.patch
0009-x86-io_apic-eliminate-remaining-cpumask_ts-from-st.patch
0021-x86-final-cleanups-in-io_apic-to-use-new-cpumask-AP.patch
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Impact: use updated APIs
Various API updates for x86:add-cpu_mask_to_apicid_and
(Note: separate because previous patch has been "backported" to 2.6.27.)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Impact: new API
Add a helper function that takes two cpumask's, and's them and then
returns the apicid of the result. This removes a need in io_apic.c
that uses a temporary cpumask to hold (mask & cfg->domain).
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Impact: cleanup, better debugging
This has proven useful in debugging, *before* we try to use
for_each_possible_cpu(). It also now shows nr_cpumask_bits.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Impact: cleanup, change parameter passing
* Change genapic interfaces to accept cpumask_t pointers where possible.
* Modify external callers to use cpumask_t pointers in function calls.
* Create new send_IPI_mask_allbutself which is the same as the
send_IPI_mask functions but removes smp_processor_id() from list.
This removes another common need for a temporary cpumask_t variable.
* Functions that used a temp cpumask_t variable for:
cpumask_t allbutme = cpu_online_map;
cpu_clear(smp_processor_id(), allbutme);
if (!cpus_empty(allbutme))
...
become:
if (!cpus_equal(cpu_online_map, cpumask_of_cpu(cpu)))
...
* Other minor code optimizations (like using cpus_clear instead of
CPU_MASK_NONE, etc.)
Applies to linux-2.6.tip/master.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Ingo Molnar <mingo@elte.hu>
Impact: activates new off-stack cpumask code on MAXSMP (non-default) x86 configs
Set MAXSMP to enable CONFIG_CPUMASK_OFFSTACK which moves cpumask's off
the stack (and in structs) when using cpumask_var_t.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Ingo Molnar <mingo@elte.hy>
Impact: build fix
make intr_remapping.c to include smp.h, so could use boot_cpu_id there
also remove old change that disabling sparseirq with !SMP
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: improve NUMA handling by migrating irq_desc on smp_affinity changes
if CONFIG_NUMA_MIGRATE_IRQ_DESC is set:
- make irq_desc to go with affinity aka irq_desc moving etc
- call move_irq_desc in irq_complete_move()
- legacy irq_desc is not moved, because they are allocated via static array
for logical apic mode, need to add move_desc_in_progress_in_same_domain,
otherwise it will not be moved ==> also could need two phases to get
irq_desc moved.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar wrote:
> allyes64 build failure:
>
> arch/x86/kernel/io_apic.c: In function ‘set_ir_ioapic_affinity_irq_desc’:
> arch/x86/kernel/io_apic.c:2295: error: incompatible type for argument 2 of
> ‘migrate_ioapic_irq_desc’
> arch/x86/kernel/io_apic.c: In function ‘ir_set_msi_irq_affinity’:
> arch/x86/kernel/io_apic.c:3205: error: incompatible type for argument 2 of
> ‘set_extra_move_desc’
> make[1]: *** wait: No child processes. Stop.
Here's a small patch to correct the build error with the post-merge tree.
Built and boot-tested. I'll will reset the follow on patches in my brand
new git tree to accommodate this change.
Fix two references in io_apic.c that were incorrect.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
VMI initialiation can relocate the fixmap, causing early_ioremap to
malfunction if it is initialized before the relocation. To fix this,
VMI activation is split into two phases; the detection, which must
happen before setting up ioremap, and the activation, which must happen
after parsing early boot parameters.
This fixes a crash on boot when VMI is enabled under VMware.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: change calling convention of existing clock_event APIs
struct clock_event_timer's cpumask field gets changed to take pointer,
as does the ->broadcast function.
Another single-patch change. For safety, we BUG_ON() in
clockevents_register_device() if it's not set.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
Impact: change existing irq_chip API
Not much point with gentle transition here: the struct irq_chip's
setaffinity method signature needs to change.
Fortunately, not widely used code, but hits a few architectures.
Note: In irq_select_affinity() I save a temporary in by mangling
irq_desc[irq].affinity directly. Ingo, does this break anything?
(Folded in fix from KOSAKI Motohiro)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
Acked-by: Ingo Molnar <mingo@redhat.com>
Cc: ralf@linux-mips.org
Cc: grundler@parisc-linux.org
Cc: jeremy@xensource.com
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Impact: change calling convention of existing cpumask APIs
Most cpumask functions started with cpus_: these have been replaced by
cpumask_ ones which take struct cpumask pointers as expected.
These four functions don't have good replacement names; fortunately
they're rarely used, so we just change them over.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: paulus@samba.org
Cc: mingo@redhat.com
Cc: tony.luck@intel.com
Cc: ralf@linux-mips.org
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: cl@linux-foundation.org
Cc: srostedt@redhat.com
Impact: cleanup
Each SMP arch defines these themselves. Move them to a central
location.
Twists:
1) Some archs (m32, parisc, s390) set possible_map to all 1, so we add a
CONFIG_INIT_ALL_POSSIBLE for this rather than break them.
2) mips and sparc32 '#define cpu_possible_map phys_cpu_present_map'.
Those archs simply have phys_cpu_present_map replaced everywhere.
3) Alpha defined cpu_possible_map to cpu_present_map; this is tricky
so I just manipulate them both in sync.
4) IA64, cris and m32r have gratuitous 'extern cpumask_t cpu_possible_map'
declarations.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
Tested-by: Tony Luck <tony.luck@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Mike Travis <travis@sgi.com>
Cc: ink@jurassic.park.msu.ru
Cc: rmk@arm.linux.org.uk
Cc: starvik@axis.com
Cc: tony.luck@intel.com
Cc: takata@linux-m32r.org
Cc: ralf@linux-mips.org
Cc: grundler@parisc-linux.org
Cc: paulus@samba.org
Cc: schwidefsky@de.ibm.com
Cc: lethal@linux-sh.org
Cc: wli@holomorphy.com
Cc: davem@davemloft.net
Cc: jdike@addtoit.com
Cc: mingo@redhat.com
We merge the irq/sparseirq, x86/quirks and x86/reboot trees into the
cpus4096 tree because the io-apic changes in the sparseirq change
conflict with the cpumask changes in the cpumask tree, and we
want to resolve those.
Ingo Molnar wrote:
>>> drivers/pci/intr_remapping.c: In function 'irq_2_iommu_alloc':
>>> drivers/pci/intr_remapping.c:72: error: 'boot_cpu_id' undeclared (first use in this function)
>>> drivers/pci/intr_remapping.c:72: error: (Each undeclared identifier is reported only once
>>> drivers/pci/intr_remapping.c:72: error: for each function it appears in.)
sparseirq should only be used with SMP for now.
these warnings:
arch/x86/kernel/paravirt-spinlocks.c: In function ‘default_spin_lock_flags’:
arch/x86/kernel/paravirt-spinlocks.c:12: warning: passing argument 1 of ‘__raw_spin_lock’ from incompatible pointer type
arch/x86/kernel/paravirt-spinlocks.c: At top level:
arch/x86/kernel/paravirt-spinlocks.c:11: warning: ‘default_spin_lock_flags’ defined but not used
showed that the prototype of default_spin_lock_flags() was confused about
what type spinlocks have.
the proper type on UP is raw_spinlock_t.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: Provide a way to pause the function graph tracer
As suggested by Steven Rostedt, the previous patch that prevented from
spinlock function tracing shouldn't use the raw_spinlock to fix it.
It's much better to follow lockdep with normal spinlock, so this patch
adds a new flag for each task to make the function graph tracer able
to be paused. We also can send an ftrace_printk whithout worrying of
the irrelevant traced spinlock during insertion.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: trace more functions
When the function graph tracer is configured, three more files are not
traced to prevent only four functions to be traced. And this impacts the
normal function tracer too.
arch/x86/kernel/process_64/32.c:
I had crashes when I let this file traced. After some debugging, I saw
that the "current" task point was changed inside__swtich_to(), ie:
"write_pda(pcurrent, next_p);" inside process_64.c Since the tracer store
the original return address of the function inside current, we had
crashes. Only __switch_to() has to be excluded from tracing.
kernel/module.c and kernel/extable.c:
Because of a function used internally by the function graph tracer:
__kernel_text_address()
To let the other functions inside these files to be traced, this patch
introduces the __notrace_funcgraph function prefix which is __notrace if
function graph tracer is configured and nothing if not.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
reorder exit path in __get_smp_config().
also move two print outs to acpi_process_madt
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: simplify code
Pass irq_desc and cfg around, instead of raw IRQ numbers - this way
we dont have to look it up again and again.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: sanitize MSI irq number ordering from top-down to bottom-up
Increase new MSI IRQs starting from nr_irqs_gsi (which is somewhere below
256), instead of decreasing from NR_IRQS. (The latter method can result
in confusingly high IRQ numbers - if NR_CPUS is set to a high value and
NR_IRQS scales up to a high value.)
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: new feature
Problem on distro kernels: irq_desc[NR_IRQS] takes megabytes of RAM with
NR_CPUS set to large values. The goal is to be able to scale up to much
larger NR_IRQS value without impacting the (important) common case.
To solve this, we generalize irq_desc[NR_IRQS] to an (optional) array of
irq_desc pointers.
When CONFIG_SPARSE_IRQ=y is used, we use kzalloc_node to get irq_desc,
this also makes the IRQ descriptors NUMA-local (to the site that calls
request_irq()).
This gets rid of the irq_cfg[] static array on x86 as well: irq_cfg now
uses desc->chip_data for x86 to store irq_cfg.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>