Interface cast spin lock #1126

vyzo · 2024-02-20T08:40:51Z

This changes the prototype table lock to a spin lock, and the cast benchmark runs about 2.5x faster (2.12 s down to 0.86 s).

Before:

$ /tmp/interface-bench cast 10 1000000
(time (let () (declare (not safe)) (std/interface-benchmark#cast-benchmark _iters180_ _threads179_ std/interface-benchmark#do-cast)))
    2.117610 secs real time
    2.117559 secs cpu time (2.117150 user, 0.000409 system)
    142 collections accounting for 0.122705 secs real time (0.122418 user, 0.000250 system)
    1120026112 bytes allocated
    672 minor faults
    no major faults
    5529541900 cpu cycles

After:

$ /tmp/interface-bench2 cast 10 1000000
(time (let () (declare (not safe)) (std/interface-benchmark#cast-benchmark _iters180_ _threads179_ std/interface-benchmark#do-cast)))
    0.857509 secs real time
    0.857492 secs cpu time (0.857288 user, 0.000204 system)
    142 collections accounting for 0.104038 secs real time (0.103984 user, 0.000000 system)
    1119990592 bytes allocated
    672 minor faults
    no major faults
    2239135638 cpu cycles

fare · 2024-02-20T19:31:33Z

I think we should use futexes, or something else on Linux that doesn't busy-wait and drain laptop batteries on SMP. Importantly, this should all be abstracted behind a macro.

If you want this in, fine for now, but then you should open an issue about getting locking right on SMP.
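
For readers unfamiliar with the futex approach mentioned above, here is a minimal sketch in C of a three-state futex lock in the style commonly attributed to Ulrich Drepper's "Futexes Are Tricky". It is Linux-only and is not code from Gerbil or Gambit; the names `futex_lock_t`, `futex_call`, `futex_lock`, and `futex_unlock` are hypothetical. The point it illustrates is that a contended thread sleeps in the kernel instead of spinning.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical type: 0 = unlocked, 1 = locked,
 * 2 = locked and a waiter may be sleeping in the kernel. */
typedef struct {
    atomic_int state;
} futex_lock_t;

/* Thin wrapper over the raw futex system call (glibc has no futex() wrapper). */
static long futex_call(atomic_int *addr, int op, int val) {
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

static void futex_lock(futex_lock_t *l) {
    int seen = 0;
    /* Fast path: uncontended 0 -> 1 transition, no system call. */
    if (atomic_compare_exchange_strong(&l->state, &seen, 1))
        return;
    /* Contended: mark the lock as having waiters. */
    if (seen != 2)
        seen = atomic_exchange(&l->state, 2);
    while (seen != 0) {
        /* Sleep only while the state is still 2; spurious returns are fine. */
        futex_call(&l->state, FUTEX_WAIT_PRIVATE, 2);
        seen = atomic_exchange(&l->state, 2);
    }
}

static void futex_unlock(futex_lock_t *l) {
    int prev = atomic_fetch_sub(&l->state, 1);
    assert(prev != 0); /* unlocking an unlocked lock is a caller bug */
    if (prev != 1) {
        /* State was 2: somebody may be asleep, so reset and wake one waiter. */
        atomic_store(&l->state, 0);
        futex_call(&l->state, FUTEX_WAKE_PRIVATE, 1);
    }
}
```

The uncontended path is a single compare-and-swap, so the cost relative to a spin lock shows up only under contention, where the waiter pays for a system call rather than burning cycles.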

vyzo · 2024-02-20T19:52:49Z

It is already abstracted behind macros: see `__lock-inline!` and `__unlock-inline!`.

It is a workable solution for SMP: the critical sections are expected to be small (less than 10ns each), so the busy wait is not that bad.

Ideally we'd have futexes; unfortunately full Gambit mutexes are so much slower it is not even funny.
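
To make the trade-off above concrete, here is a minimal sketch in C of a test-and-set spin lock with a bounded spin before yielding. It is an illustration under stated assumptions, not the code behind `__lock-inline!` and `__unlock-inline!`: the names `spin_lock_t`, `spin_try_lock`, `spin_lock`, `spin_unlock`, and `guarded_increment` are hypothetical, and the `SPIN_LIMIT` value is an arbitrary placeholder.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock type: the flag is set while some thread owns the lock. */
typedef struct {
    atomic_flag held;
} spin_lock_t;

#define SPIN_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Assumed tuning knob: failed attempts before giving up the CPU once. */
enum { SPIN_LIMIT = 100 };

/* Returns true if the lock was free and is now owned by the caller. */
static bool spin_try_lock(spin_lock_t *l) {
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the lock is acquired, yielding every SPIN_LIMIT failures. */
static void spin_lock(spin_lock_t *l) {
    int spins = 0;
    while (!spin_try_lock(l)) {
        if (++spins >= SPIN_LIMIT) {
            sched_yield(); /* let the current owner run if we share a core */
            spins = 0;
        }
    }
}

static void spin_unlock(spin_lock_t *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Example of a tiny critical section: bump a counter under the lock. */
static long guarded_increment(spin_lock_t *l, long *counter) {
    assert(l != NULL && counter != NULL);
    spin_lock(l);
    long v = ++*counter;
    spin_unlock(l);
    return v;
}
```

When the protected region is as short as `guarded_increment`, a waiter usually gets the lock after a handful of spins, which is why busy waiting can beat a heavier mutex here. The cost grows if an owner is descheduled while holding the lock, which is the concern raised in the review.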

vyzo · 2024-02-20T20:24:24Z

Follow up issue in #1128.

On top of #1126

And so it begins... the compiler generates specializers for all bound methods that could benefit from it, and interface prototype creation plugs to it, with wondrous performance results for certain programs.

Here is an example:

```
(defclass A (x y))
(defclass (B A) (z))

(defmethod {linear A}
  (lambda (self w)
    (fx+ (fx* (A-x self) w) (A-y self))))

(defmethod {bilinear-combination B}
  (lambda (self w z)
    {self.bilinear {self.linear w} z}))

(defmethod {bilinear B}
  (lambda (self lc z)
    (fx+ lc (fx* (B-z self) z))))

(interface Combinator
  (bilinear-combination w z))

(def (run iters)
  (let (instance (Combinator (B x: 1 y: 2 z: 3)))
    (for (i (in-range iters))
      (let (result (&Combinator-bilinear-combination instance 4 5))
        (unless (= result 21)
          (error "bad result" result: result expected: 21))))))

(def (main iters)
  (let (iters (string->number iters))
    (time (run iters))))
```

With gxc master:

```
$ gxc -exe -o /tmp/ispec-bench -O src/gerbil/test/interface-specialization-bench.ss
/tmp/gxc.1708454081.817515/test__interface-specialization-bench.scm:
/tmp/ispec-bench__exe.scm:
/tmp/gxc.1708454081.817515/test__interface-specialization-bench.c:
/tmp/ispec-bench__exe.c:
/tmp/ispec-bench__exe_.c:
$ /tmp/ispec-bench 1000000
(time (let () (declare (not safe)) (test/interface-specialization-bench#run _iters79_)))
    0.215345 secs real time
    0.215332 secs cpu time (0.211338 user, 0.003994 system)
    20 collections accounting for 0.016191 secs real time (0.015828 user, 0.000356 system)
    159892208 bytes allocated
    671 minor faults
    no major faults
    562301090 cpu cycles
```

With the specializers:

```
$ ./build.sh env gxc -exe -o /tmp/ispec-bench -O gerbil/test/interface-specialization-bench.ss
/tmp/gxc.1708454105.0922964/test__interface-specialization-bench.scm:
/tmp/ispec-bench__exe.scm:
/tmp/gxc.1708454105.0922964/test__interface-specialization-bench.c:
/tmp/ispec-bench__exe.c:
/tmp/ispec-bench__exe_.c:
[*] Done
$ /tmp/ispec-bench 1000000
(time (let () (declare (not safe)) (test/interface-specialization-bench#run _iters79_)))
    0.010587 secs real time
    0.010587 secs cpu time (0.010586 user, 0.000001 system)
    no collections
    1408 bytes allocated
    no minor faults
    no major faults
    27638154 cpu cycles
```

**20x, not bad huh?**

Basically all the dynamic dispatch call cost of the MOP for self references (slots or methods) has disappeared.
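
As a rough analogy for what removing dynamic dispatch on self references means, the following C sketch contrasts a generic method that looks up `linear` and `bilinear` at call time with a specialized one whose call targets are fixed ahead of time. It is only an illustration: the string-keyed method table stands in for the MOP and does not reflect how Gerbil actually dispatches, and every name here (`obj_t`, `lookup`, `combination_generic`, `combination_specialized`, `make_b`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct obj obj_t;
typedef long (*method_fn)(obj_t *self, long a, long b);

/* Stand-in for a method table: name/function pairs, terminated by a NULL name. */
typedef struct {
    const char *name;
    method_fn fn;
} method_entry;

struct obj {
    const method_entry *methods;
    long x, y, z;
};

/* Dynamic dispatch: search the receiver's table by method name. */
static method_fn lookup(const obj_t *o, const char *name) {
    for (const method_entry *m = o->methods; m->name != NULL; m++)
        if (strcmp(m->name, name) == 0)
            return m->fn;
    return NULL;
}

static long linear(obj_t *self, long w, long unused) {
    (void)unused;
    return self->x * w + self->y;
}

static long bilinear(obj_t *self, long lc, long z) {
    return lc + self->z * z;
}

/* Generic version: each self call pays for a lookup on every invocation. */
static long combination_generic(obj_t *self, long w, long z) {
    method_fn lin = lookup(self, "linear");
    method_fn bil = lookup(self, "bilinear");
    assert(lin != NULL && bil != NULL);
    return bil(self, lin(self, w, 0), z);
}

/* Specialized version: the receiver's class is known, so calls are direct. */
static long combination_specialized(obj_t *self, long w, long z) {
    return bilinear(self, linear(self, w, 0), z);
}

static const method_entry b_methods[] = {
    {"linear", linear},
    {"bilinear", bilinear},
    {"bilinear-combination", combination_generic},
    {NULL, NULL},
};

/* Mirrors (B x: 1 y: 2 z: 3) from the Gerbil example when called as make_b(1, 2, 3). */
static obj_t make_b(long x, long y, long z) {
    obj_t o = { b_methods, x, y, z };
    return o;
}
```

With `make_b(1, 2, 3)` and arguments 4 and 5, both versions compute (1*4 + 2) + 3*5 = 21, matching the expected result in the benchmark; the specialized one simply skips the lookups.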

vyzo added 5 commits February 20, 2024 10:31

use a spin lock for protecting the interface prototype table

23c3981

add some spins in the SMP case

058d399

disable interrupts in UP spin lock

a1c124a

reusable low level lock macros

b8a97fc

bootstrap

f3189f2

vyzo requested review from fare and a team February 20, 2024 08:41

vyzo added 2 commits February 20, 2024 10:45

move interface test/benchmark to gerbil/test

a4937fc

improve interface-bench

8808136

vyzo mentioned this pull request Feb 20, 2024

Interface Specializers #1127

Merged

fare approved these changes Feb 20, 2024

vyzo merged commit 000892a into master Feb 20, 2024
12 checks passed

vyzo deleted the interface-spin-lock branch February 20, 2024 20:24
