android_kernel_xiaomi_sm8350

lisa/android_kernel_xiaomi_sm8350

Fork 0

Commit Graph

Author	SHA1	Message	Date
Dipankar Sarma	8b6490e5fa	[PATCH] files: fix rcu initializers First of a number of files_lock scaability patches. Here are the x86 numbers - tiobench on a 4(8)-way (HT) P4 system on ramdisk : (lockfree) Test 2.6.10-vanilla Stdev 2.6.10-fd Stdev ------------------------------------------------------------- Seqread 1400.8 11.52 1465.4 34.27 Randread 1594 8.86 2397.2 29.21 Seqwrite 242.72 3.47 238.46 6.53 Randwrite 445.74 9.15 446.4 9.75 The performance improvement is very significant. We are getting killed by the cacheline bouncing of the files_struct lock here. Writes on ramdisk (ext2) seems to vary just too much to get any meaningful number. Also, With Tridge's thread_perf test on a 4(8)-way (HT) P4 xeon system : 2.6.12-rc5-vanilla : Running test 'readwrite' with 8 tasks Threads 0.34 +/- 0.01 seconds Processes 0.16 +/- 0.00 seconds 2.6.12-rc5-fd : Running test 'readwrite' with 8 tasks Threads 0.17 +/- 0.02 seconds Processes 0.17 +/- 0.02 seconds I repeated the measurements on ramfs (as opposed to ext2 on ramdisk in the earlier measurement) and I got more consistent results from tiobench : 4(8) way xeon P4 ----------------- (lock-free) Test 2.6.12-rc5 Stdev 2.6.12-rc5-fd Stdev ------------------------------------------------------------- Seqread 1282 18.59 1343.6 26.37 Randread 1517 7 2415 34.27 Seqwrite 702.2 5.27 709.46 5.9 Randwrite 846.86 15.15 919.68 21.4 4-way ppc64 ------------ (lock-free) Test 2.6.12-rc5 Stdev 2.6.12-rc5-fd Stdev ------------------------------------------------------------- Seqread 1549 91.16 1569.6 47.2 Randread 1473.6 25.11 1585.4 69.99 Seqwrite 1096.8 20.03 1136 29.61 Randwrite 1189.6 4.04 1275.2 32.96 Also running Tridge's thread_perf test on ppc64 : 2.6.12-rc5-vanilla -------------------- Running test 'readwrite' with 4 tasks Threads 0.20 +/- 0.02 seconds Processes 0.16 +/- 0.01 seconds 2.6.12-rc5-fd -------------------- Running test 'readwrite' with 4 tasks Threads 0.18 +/- 0.04 seconds Processes 0.16 +/- 0.01 seconds The benefits are huge (upto ~60%) in some cases on x86 primarily due to the atomic operations during acquisition of ->file_lock and cache line bouncing in fast path. ppc64 benefits are modest due to LL/SC based locking, but still statistically significant. This patch: RCU head initilizer no longer needs the head varible name since we don't use list.h lists anymore. Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 13:57:54 -07:00
Paul E. McKenney	9b06e81898	[PATCH] Deprecate synchronize_kernel, GPL replacement The synchronize_kernel() primitive is used for quite a few different purposes: waiting for RCU readers, waiting for NMIs, waiting for interrupts, and so on. This makes RCU code harder to read, since synchronize_kernel() might or might not have matching rcu_read_lock()s. This patch creates a new synchronize_rcu() that is to be used for RCU readers and a new synchronize_sched() that is used for the rest. These two new primitives currently have the same implementation, but this is might well change with additional real-time support. Both new primitives are GPL-only, the old primitive is deprecated. Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-01 08:59:04 -07:00
Linus Torvalds	1da177e4c3	Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!	2005-04-16 15:20:36 -07:00

Author

SHA1

Message

Date

Dipankar Sarma

8b6490e5fa

[PATCH] files: fix rcu initializers

First of a number of files_lock scaability patches.

 Here are the x86 numbers -

 tiobench on a 4(8)-way (HT) P4 system on ramdisk :

                                         (lockfree)
 Test            2.6.10-vanilla  Stdev   2.6.10-fd       Stdev
 -------------------------------------------------------------
 Seqread         1400.8          11.52   1465.4          34.27
 Randread        1594            8.86    2397.2          29.21
 Seqwrite        242.72          3.47    238.46          6.53
 Randwrite       445.74          9.15    446.4           9.75

 The performance improvement is very significant.
 We are getting killed by the cacheline bouncing of the files_struct
 lock here. Writes on ramdisk (ext2) seems to vary just too
 much to get any meaningful number.

 Also, With Tridge's thread_perf test on a 4(8)-way (HT) P4 xeon system :

 2.6.12-rc5-vanilla :

 Running test 'readwrite' with 8 tasks
 Threads     0.34 +/- 0.01 seconds
 Processes   0.16 +/- 0.00 seconds

 2.6.12-rc5-fd :

 Running test 'readwrite' with 8 tasks
 Threads     0.17 +/- 0.02 seconds
 Processes   0.17 +/- 0.02 seconds

 I repeated the measurements on ramfs (as opposed to ext2 on ramdisk in
 the earlier measurement) and I got more consistent results from tiobench :

 4(8) way xeon P4
 -----------------
                                         (lock-free)
 Test            2.6.12-rc5      Stdev   2.6.12-rc5-fd   Stdev
 -------------------------------------------------------------
 Seqread         1282            18.59   1343.6          26.37
 Randread        1517            7       2415            34.27
 Seqwrite        702.2           5.27    709.46           5.9
 Randwrite       846.86          15.15   919.68          21.4

 4-way ppc64
 ------------
                                         (lock-free)
 Test            2.6.12-rc5      Stdev   2.6.12-rc5-fd   Stdev
 -------------------------------------------------------------
 Seqread         1549            91.16   1569.6          47.2
 Randread        1473.6          25.11   1585.4          69.99
 Seqwrite        1096.8          20.03   1136            29.61
 Randwrite       1189.6           4.04   1275.2          32.96

 Also running Tridge's thread_perf test on ppc64 :

 2.6.12-rc5-vanilla
 --------------------
 Running test 'readwrite' with 4 tasks
 Threads     0.20 +/- 0.02 seconds
 Processes   0.16 +/- 0.01 seconds

 2.6.12-rc5-fd
 --------------------
 Running test 'readwrite' with 4 tasks
 Threads     0.18 +/- 0.04 seconds
 Processes   0.16 +/- 0.01 seconds

 The benefits are huge (upto ~60%) in some cases on x86 primarily
 due to the atomic operations during acquisition of ->file_lock
 and cache line bouncing in fast path. ppc64 benefits are modest
 due to LL/SC based locking, but still statistically significant.

This patch:

RCU head initilizer no longer needs the head varible name since we don't use
list.h lists anymore.

Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

2005-09-09 13:57:54 -07:00

Paul E. McKenney

9b06e81898

[PATCH] Deprecate synchronize_kernel, GPL replacement

The synchronize_kernel() primitive is used for quite a few different purposes:
waiting for RCU readers, waiting for NMIs, waiting for interrupts, and so on.
This makes RCU code harder to read, since synchronize_kernel() might or might
not have matching rcu_read_lock()s.  This patch creates a new
synchronize_rcu() that is to be used for RCU readers and a new
synchronize_sched() that is used for the rest.  These two new primitives
currently have the same implementation, but this is might well change with
additional real-time support.  Both new primitives are GPL-only, the old
primitive is deprecated.

Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

2005-05-01 08:59:04 -07:00

Linus Torvalds

1da177e4c3

Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!

2005-04-16 15:20:36 -07:00

3 Commits