android_kernel_xiaomi_sm8350

Author	SHA1	Message	Date
Nick Piggin	20aaffd6a6	[PATCH] x86_64: pageattr use single list Use page->lru.next to implement the singly linked list of pages rather than the struct deferred_page which needs to be allocated and freed for each page. Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-22 07:54:01 -08:00
Andi Kleen	2aed711a39	[PATCH] x86_64: Always pass full number of nodes to NUMA hash computation Previously the numa hash code would be confused by holes in the node space and stop early. This is the first part of the fix for the non boot issue with empty nodes on Opterons. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-02-17 08:00:41 -08:00
Andi Kleen	fdb9df9424	[PATCH] x86_64: Relax SRAT covers all memory check a bit Code was refusing good SRATs because about 12K got lost somewhere. Allow less than 1MB of difference before rejecting it. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-02-17 08:00:41 -08:00
Daniel Yeisley	d1db4ec86c	[PATCH] x86_64: early initialization of cpu_to_node The early initialization of cpu_to_node code as it is now only updates the cpu_to_node array, and does not update cpu_pda()->nodemember. This will cause numa_node_id() to return 0 on systems where CPU 0 is not on Node 0. This leads to a kernel panic in slab.c. I've tested the patch below on a 16 processor x86_64 ES7000-600 server, and no longer see the panic I saw with the original 2.6.16-rc3. Signed-off-by: Dan Yeisley <dan.yeisley@unisys.com> Acked-by: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-02-15 15:32:22 -08:00
Jan Beulich	d646bce4c7	[PATCH] x86_64: minor odering correction to dump_pagetable() Checking of the validity of pointers should be consistently done before dereferencing the pointer. Signed-Off-By: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-02-04 16:43:15 -08:00
Andi Kleen	d22fe80844	[PATCH] x86_64: Do more checking in the SRAT header code - Check if the processor/memory affinity entries are long enough according to the ACPI 3.0 spec. - Ignore memory affinity entries that define a zero length region. All based on BIOS issues found in the field @) Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-02-04 16:43:14 -08:00
Andi Kleen	9391a3f9c7	[PATCH] x86_64: Clear more state when ignoring empty node in SRAT parsing Might fix boot failures on systems with empty PXMs in SRAT Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-02-04 16:43:14 -08:00
Matt Tolentino	44df75e629	[PATCH] x86_64: add x86-64 support for memory hot-add Add x86-64 specific memory hot-add functions, Kconfig options, and runtime kernel page table update functions to make hot-add usable on x86-64 machines. Also, fixup the nefarious conditional locking and exports pointed out by Andi. Tested on Intel and IBM x86-64 memory hot-add capable systems. Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 23:18:35 -08:00
Andi Kleen	8817210d4d	[PATCH] x86_64: Flexmap for 32bit and randomized mappings for 64bit Another try at this. For 32bit follow the 32bit implementation from Ingo - mappings are growing down from the end of stack now and vary randomly by 1GB. Randomized mappings for 64bit just vary the normal mmap break by 1TB. I didn't bother implementing full flex mmap for 64bit because it shouldn't be needed there. Cc: mingo@elte.hu Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 23:18:35 -08:00
Linus Torvalds	f74e6670c4	x86-64: fix initrd freeing The comparison of the initrd start address against "&_end" is unnecessary and incorrect. Make it match the x86 code that just compares the passed-in arguments. Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 11:33:09 -08:00
Andi Kleen	ee408c7942	[PATCH] x86_64: Don't try to put kernel page tables beyond ZONE_DMA32. For not fully explained reasons it broke mem=... on several setups. Also minor cleanup. Cc: axboe@suse.de Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 11:27:59 -08:00
Andi Kleen	6c5acd160a	[PATCH] x86_64: Allow kernel page tables upto the end of memory Previously they would be only allocated before the kernel text at 1MB. This limited the maximum supported memory to 128GB. Now allow the e820 allocator to put them everywhere. Try to put them beyond any DMA zones to avoid filling them up. This should free some GFP_DMA memory compared to earlier kernels. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-11 19:05:03 -08:00
Andi Kleen	cf05013286	[PATCH] x86_64: Move NUMA page_to_pfn/pfn_to_page functions out of line Saves about ~18K .text in defconfig There would be more optimization potential, but that's for later. Suggestion originally from Bill Irwin. Fix from Andy Whitcroft. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-11 19:05:01 -08:00
Ravikiran G Thirumalai	df79efde82	[PATCH] x86_64: Node local pda take 2 -- cpu_pda preparation Helper patch to change cpu_pda users to use macros to access cpu_pda instead of the cpu_pda[] array. Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-11 19:04:59 -08:00
Ravikiran Thirumalai	05b3cbd8bb	[PATCH] x86_64: Early initialization of cpu_to_node Patch enables early intialization of cpu_to_node. apicid_to_node is built by reading the SRAT table, from acpi_numa_init with ACPI_NUMA and k8_scan_nodes with K8_NUMA. x86_cpu_to_apicid is built by parsing the ACPI MADT table, from acpi_boot_init. We combine these two tables and setup cpu_to_node. Early intialization helps the static per_cpu_areas in getting pages from correct node. Change since last release: Do not initialize early init_cpu_to_node for faking node cases. Patch tested on TYAN dual core 4P board with K8 only, ACPI_NUMA. Tested on EM64T NUMA. Also tested with numa=off, numa=fake, and running a kernel compiled with NUMA on a regular EM64 2 way SMP. Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-11 19:04:59 -08:00
Muli Ben-Yehuda	17a941d854	[PATCH] x86_64: Use function pointers to call DMA mapping functions AK: I hacked Muli's original patch a lot and there were a lot of changes - all bugs are probably to blame on me now. There were also some changes in the fall back behaviour for swiotlb - in particular it doesn't try to use GFP_DMA now anymore. Also all DMA mapping operations use the same core dma_alloc_coherent code with proper fallbacks now. And various other changes and cleanups. Known problems: iommu=force swiotlb=force together breaks needs more testing. This patch cleans up x86_64's DMA mapping dispatching code. Right now we have three possible IOMMU types: AGP GART, swiotlb and nommu, and in the future we will also have Xen's x86_64 swiotlb and other HW IOMMUs for x86_64. In order to support all of them cleanly, this patch: - introduces a struct dma_mapping_ops with function pointers for each of the DMA mapping operations of gart (AMD HW IOMMU), swiotlb (software IOMMU) and nommu (no IOMMU). - gets rid of: if (swiotlb) return swiotlb_xxx(); - PCI_DMA_BUS_IS_PHYS is now checked against the dma_ops being set This makes swiotlb faster by avoiding double copying in some cases. Signed-Off-By: Muli Ben-Yehuda <mulix@mulix.org> Signed-Off-By: Jon D. Mason <jdmason@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-11 19:04:55 -08:00
Andi Kleen	8a6fdd3e91	[PATCH] x86_64: Reject SRAT tables that don't cover all memory Broken BIOS on Iwill 8way systems reports these and it causes the bootmem allocator to crash. Add a sanity check if all the PXMs in the SRAT table cover all memory as reported by e820. If the sanity check fails the SRAT is rejected and the code will fall back to discover the NUMA topology using the K8 northbridge registers when applicable. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-11 19:04:55 -08:00
Andi Kleen	6b050f8075	[PATCH] x86_64: Clean up some printks in NUMA code Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-11 19:04:55 -08:00
Andi Kleen	d18ff47068	[PATCH] x86_64: Fix up coding style in numa.c No functional changes Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-11 19:04:54 -08:00
Andi Kleen	ca8642f606	[PATCH] x86_64: Fix off by one in IOMMU check Fix off by one when checking if the machine has enougn memory to need IOMMU This caused the IOMMUs to be needlessly enabled for mem=4G Based on a patch from Jon Mason Signed-off-by: jdmason@us.ibm.com Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-11 19:04:54 -08:00
Andi Kleen	66c581569e	[PATCH] x86_64: Convert page fault error codes to symbolic constants. Much better to deal with these than with the magic numbers. And remove the comment describing the bits - kernel source is no replacement for an architecture manual. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-11 19:04:53 -08:00
Andi Kleen	f95190b28d	[PATCH] x86_64: Remove unnecessary case from the page fault handler Don't need to do the vmalloc check for the module range because its PML4 is shared with the kernel text. Also removed an unnecessary TLB flush. Pointed out by Jan Beulich Cc: jbeulich@novell.com Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-11 19:04:53 -08:00
Andi Kleen	e4e94072d9	[PATCH] x86_64: Return -1 for unknown PCI bus affinity When we don't know the node a PCI bus is connected to return -1. This matches the generic code. Noticed by Ravikiran G Thirumalai <kiran@scalex86.org> Cc: Ravikiran G Thirumalai <kiran@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-11 19:04:52 -08:00
Andi Kleen	1584b89c92	[PATCH] x86_64: Validate SLIT table A lot of Opteron BIOS just pass 10 in all SLIT entries (10 is the normalized unit). This is actually worse than the default heuristic because it leads to pci_distance not knowing the difference between local and remote nodes anymore. This messes up some NUMA heuristics in generic code. In this case it's better to fall back to the default heuristic which just does nodea == nodeb ? 10 : 20. This patch does some basic sanity checking on the SLIT and only accepts the SLIT when it passes. Invariants enforced are: - Node to itself shall be 10 - Any other distance shouldn't be 10 - Distances smaller than 10 are illegal Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-11 19:04:51 -08:00
Jan Beulich	8b1bde9317	[PATCH] x86_64: Adjust page fault handling Adjust page fault protection error check before considering it to be a vmalloc synchronization candidate. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-11 19:01:11 -08:00
Jan Beulich	6e3f361781	[PATCH] x86_64: make trap information available to die notification handlers This adjusts things so that handlers of the die() notifier will have sufficient information about the trap currently being handled. It also adjusts the notify_die() prototype to (again) match that of i386. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-11 19:01:10 -08:00
Arjan van de Ven	67df197b1a	[PATCH] x86/x86_64: mark rodata section read-only: x86-64 support x86-64 specific parts to make the .rodata section read only Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:36 -08:00
Arjan van de Ven	c728252c7a	[PATCH] x86/x86_64: mark rodata section read only: generic x86-64 bugfix Bug fix required for the .rodata work on x86-64: when change_page_attr() and friends need to break up a 2Mb page into 4Kb pages, it always set the NX bit on the PMD, which causes the cpu to consider the entire 2Mb region to be NX regardless of the actual PTE perms. This is fine in general, with one big exception: the 2Mb page that covers the last part of the kernel .text! The fix is to not invent a new permission for the new PMD entry, but to just inherit the existing one minus the PSE bit. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:36 -08:00
Ravikiran G Thirumalai	576fc0978b	[PATCH] x86_64: Fix incorrect node_present_pages on NUMA Currently, we do not pass the correct start_pfn to e820_hole_size, to calculate holes. Following patch fixes that. The bug results in incorrect number of node_present_pages for each pgdat and causes ugly output in /sys and probably VM inbalances. Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Sighed-off-by: Shair Fultheim <shai@scalex86.org> Sighed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-12-29 10:20:19 -08:00
Al Viro	b16b88e55d	[PATCH] i386,amd64: ioremap.c __iomem annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-12-15 10:04:30 -08:00
Eric Dumazet	8309cf66fd	[PATCH] x86_64: Bug correction in populate_memnodemap() As reported by Keith Mannthey, there are problems in populate_memnodemap() The bug was that the compute_hash_shift() was returning 31, with incorrect initialization of memnodemap[] To correct the bug, we must use (1UL << shift) instead of (1 << shift) to avoid an integer overflow, and we must check that shift < 64 to avoid an infinite loop. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-12-12 22:31:16 -08:00
Andi Kleen	bf5421c309	[PATCH] i386/x86-64: Don't call change_page_attr with a spinlock held It's illegal because it can sleep. Use a two step lookup scheme instead. First look up the vm_struct, then change the direct mapping, then finally unmap it. That's ok because nobody can change the particular virtual address range as long as the vm_struct is still in the global list. Also added some LinuxDoc documentation to iounmap. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-12-12 22:31:16 -08:00
Bob Picco	d3ee871e63	[PATCH] x86_64: Fix sparse mem Fix up booting with sparse mem enabled. Otherwise it would just cause an early PANIC at boot. Signed-off-by: Bob Picco <bob.picco@hp.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:18 -08:00
Andi Kleen	9e43e1b7c7	[PATCH] x86_64: Remove CONFIG_CHECKING and add command line option for pagefault tracing CONFIG_CHECKING covered some debugging code used in the early times of the port. But it wasn't even SMP safe for quite some time and the bugs it checked for seem to be gone. This patch removes all the code to verify GS at kernel entry. There haven't been any new bugs in this area for a long time. Previously it also covered the sysctl for the page fault tracing. That didn't make much sense because that code was unconditionally compiled in. I made that a boot option now because it is typically only useful at boot. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:17 -08:00
Magnus Damm	ffd10a2b77	[PATCH] x86_64: Make node boundaries consistent The current x86_64 NUMA memory code is inconsequent when it comes to node memory ranges. The exact behaviour varies depending on which config option that is used. setup_node_bootmem() has start and end as arguments and these are used to calculate the size of the node like this: (end - start). This is all fine if end is pointing to the first non-available byte. The problem is that the current x86_64 code sometimes treats it as the last present byte and sometimes as the first non-available byte. The result is that some configurations might lose a page at the end of the range. This patch tries to fix CONFIG_ACPI_NUMA, CONFIG_K8_NUMA and CONFIG_NUMA_EMU so they all treat the end variable as the first non-available byte. This is the same way as the single node code. The patch is boot tested on dual x86_64 hardware with the above configurations, but maybe the removed code is needed as some workaround? Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:17 -08:00
Eric Dumazet	529a340402	[PATCH] x86_64: Optimize NUMA node hash function Compute the highest possible value for memnode_shift, in order to reduce footprint of memnodemap[] to the minimum, thus making all users (phys_to_nid(), kfree()), more cache friendly. Before the patch : Node 0 MemBase 0000000000000000 Limit 00000001ffffffff Node 1 MemBase 0000000200000000 Limit 00000003ffffffff Using 23 for the hash shift. Max adder is 3ffffffff After the patch : Node 0 MemBase 0000000000000000 Limit 00000001ffffffff Node 1 MemBase 0000000200000000 Limit 00000003ffffffff Using 33 for the hash shift. In this case, only 2 bytes of memnodemap[] are used, instead of 2048 Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:15 -08:00
Andi Kleen	5917089104	[PATCH] x86_64: Replace swiotlb extern with include Minor victory on the continuous quest against all stray extern. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:15 -08:00
Andi Kleen	4d74dbd79a	[PATCH] x86_64: Replace cpu_pda extern with include Minor cleanup - remove obsolete extern Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:15 -08:00
Andi Kleen	2bc0414ee0	[PATCH] x86_64: Only use asm/sections.h to declare section symbols Adding __initdata_* to asm-generic/sections.h Replaces a lot of open coded externs in arch/x86_64/* I had to change __bss_end to __bss_stop to match the other architectures. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:14 -08:00
Siddha, Suresh B	f6c2e3330d	[PATCH] x86_64: Unmap NULL during early bootup We should zap the low mappings, as soon as possible, so that we can catch kernel bugs more effectively. Previously early boot had NULL mapped and didn't trap on NULL references. This patch introduces boot_level4_pgt, which will always have low identity addresses mapped. Druing boot, all the processors will use this as their level4 pgt. On BP, we will switch to init_level4_pgt as soon as we enter C code and zap the low mappings as soon as we are done with the usage of identity low mapped addresses. On AP's we will zap the low mappings as soon as we jump to C code. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:14 -08:00
Andi Kleen	69d81fcde7	[PATCH] x86_64: Speed up numa_node_id by putting it directly into the PDA Not go from the CPU number to an mapping array. Mode number is often used now in fast paths. This also adds a generic numa_node_id to all the topology includes Suggested by Eric Dumazet Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:14 -08:00
Andi Kleen	e18c6874a5	[PATCH] x86_64: Account mem_map in VM holes accounting The VM needs to know about lost memory in zones to accurately balance dirty pages. This patch accounts mem_map in there too, which fixes a constant errror of a few percent. Also some other misc mappings and the kernel text itself are accounted too. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:13 -08:00
Andi Kleen	a2f1b42490	[PATCH] x86_64: Add 4GB DMA32 zone Add a new 4GB GFP_DMA32 zone between the GFP_DMA and GFP_NORMAL zones. As a bit of historical background: when the x86-64 port was originally designed we had some discussion if we should use a 16MB DMA zone like i386 or a 4GB DMA zone like IA64 or both. Both was ruled out at this point because it was in early 2.4 when VM is still quite shakey and had bad troubles even dealing with one DMA zone. We settled on the 16MB DMA zone mainly because we worried about older soundcards and the floppy. But this has always caused problems since then because device drivers had trouble getting enough DMA able memory. These days the VM works much better and the wide use of NUMA has proven it can deal with many zones successfully. So this patch adds both zones. This helps drivers who need a lot of memory below 4GB because their hardware is not accessing more (graphic drivers - proprietary and free ones, video frame buffer drivers, sound drivers etc.). Previously they could only use IOMMU+16MB GFP_DMA, which was not enough memory. Another common problem is that hardware who has full memory addressing for >4GB misses it for some control structures in memory (like transmit rings or other metadata). They tended to allocate memory in the 16MB GFP_DMA or the IOMMU/swiotlb then using pci_alloc_consistent, but that can tie up a lot of precious 16MB GFPDMA/IOMMU/swiotlb memory (even on AMD systems the IOMMU tends to be quite small) especially if you have many devices. With the new zone pci_alloc_consistent can just put this stuff into memory below 4GB which works better. One argument was still if the zone should be 4GB or 2GB. The main motivation for 2GB would be an unnamed not so unpopular hardware raid controller (mostly found in older machines from a particular four letter company) who has a strange 2GB restriction in firmware. But that one works ok with swiotlb/IOMMU anyways, so it doesn't really need GFP_DMA32. I chose 4GB to be compatible with IA64 and because it seems to be the most common restriction. The new zone is so far added only for x86-64. For other architectures who don't set up this new zone nothing changes. Architectures can set a compatibility define in Kconfig CONFIG_DMA_IS_DMA32 that will define GFP_DMA32 as GFP_DMA. Otherwise it's a nop because on 32bit architectures it's normally not needed because GFP_NORMAL (=0) is DMA able enough. One problem is still that GFP_DMA means different things on different architectures. e.g. some drivers used to have #ifdef ia64 use GFP_DMA (trusting it to be 4GB) #elif __x86_64__ (use other hacks like the swiotlb because 16MB is not enough) ... . This was quite ugly and is now obsolete. These should be now converted to use GFP_DMA32 unconditionally. I haven't done this yet. Or best only use pci_alloc_consistent/dma_alloc_coherent which will use GFP_DMA32 transparently. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:13 -08:00
Hugh Dickins	872fec16d9	[PATCH] mm: init_mm without ptlock First step in pushing down the page_table_lock. init_mm.page_table_lock has been used throughout the architectures (usually for ioremap): not to serialize kernel address space allocation (that's usually vmlist_lock), but because pud_alloc,pmd_alloc,pte_alloc_kernel expect caller holds it. Reverse that: don't lock or unlock init_mm.page_table_lock in any of the architectures; instead rely on pud_alloc,pmd_alloc,pte_alloc_kernel to take and drop it when allocating a new one, to check lest a racing task already did. Similarly no page_table_lock in vmalloc's map_vm_area. Some temporary ugliness in __pud_alloc and __pmd_alloc: since they also handle user mms, which are converted only by a later patch, for now they have to lock differently according to whether or not it's init_mm. If sources get muddled, there's a danger that an arch source taking init_mm.page_table_lock will be mixed with common source also taking it (or neither take it). So break the rules and make another change, which should break the build for such a mismatch: remove the redundant mm arg from pte_alloc_kernel (ppc64 scrapped its distinct ioremap_mm in 2.6.13). Exceptions: arm26 used pte_alloc_kernel on user mm, now pte_alloc_map; ia64 used pte_alloc_map on init_mm, now pte_alloc_kernel; parisc had bad args to pmd_alloc and pte_alloc_kernel in unused USE_HPPA_IOREMAP code; ppc64 map_io_page forgot to unlock on failure; ppc mmu_mapin_ram and ppc64 im_free took page_table_lock for no good reason. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-29 21:40:40 -07:00
Andi Kleen	094804c5a1	[PATCH] x86_64: Fix change_page_attr cache flushing Noticed by Terence Ripperda Undo wrong change in global_flush_tlb. We need to flush the caches in all cases, not just when pages were reverted. This was a bogus optimization added earlier, but it was wrong. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-10 16:10:33 -07:00
Ravikiran G Thirumalai	85cc5135ac	[PATCH] x86_64 early numa init fix The tests Alok carried out on Petr's box confirmed that cpu_to_node[BP] is not setup early enough by numa_init_array due to the x86_64 changes in 2.6.14-rc*, and unfortunately set wrongly by the work around code in numa_init_array(). cpu_to_node[0] gets set with 1 early and later gets set properly to 0 during identify_cpu() when all cpus are brought up, but confusing the numa slab in the process. Here is a quick fix for this. The right fix obviously is to have cpu_to_node[bsp] setup early for numa_init_array(). The following patch will fix the problem now, and the code can stay on even when cpu_to_node{BP] gets fixed early correctly. Thanks to Petr for access to his box. Signed off by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-30 12:41:20 -07:00
Ravikiran G Thirumalai	e6a045a5b8	[PATCH] x86_64: fix the BP node_to_cpumask Fix the BP node_to_cpumask. 2.6.14-rc* broke the boot cpu bit as the cpu_to_node(0) is now not setup early enough for numa_init_array. cpu_to_node[] is setup much later at srat_detect_node on acpi srat based em64t machines. This seems like a problem on amd machines too, Tested on em64t though. /sys/devices/system/node/node0/cpumap shows up sanely after this patch. Signed off by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-30 12:41:20 -07:00
Andi Kleen	4b6a455c74	[PATCH] x86-64: Use correct mask to compute conflicting nodes in SRAT The nodes are not set online yet at this point. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:50:58 -07:00
Andi Kleen	2bce2b54ae	[PATCH] x86-64: reset apicid<->node tables when SRAT cannot be parsed Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:50:58 -07:00
Andi Kleen	e58e0d0312	[PATCH] x86-64: Clean up the SRAT node list before computing the hash function Also use for_each_node_mask instead of hand crafted loops. Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:50:58 -07:00

1 2

79 Commits