Interface cast spin lock #1126
✅ Deploy Preview for elastic-ritchie-8f47f9 ready!
I think we should be using futexes or something else on Linux that doesn't busy-wait and waste battery on SMP laptops. Importantly, all of this should be abstracted over in a macro. If you want this in, fine for now, but then you should open an issue about getting locking right on SMP.
It is abstracted over with a macro. It is a workable solution for SMP: the critical sections are expected to be small (less than 10ns each), so the spin/busy wait is not that bad. Ideally we'd have futexes; unfortunately, full Gambit mutexes are so much slower it is not even funny.
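To make the trade-off above concrete, here is a minimal sketch of a spin lock hidden behind a macro layer, written in C11. It is not the code from this PR: the names `TABLE_LOCK`, `TABLE_UNLOCK`, `table_lock_t`, and `run_workers` are hypothetical and chosen only for illustration. The point is that callers only see the macros, so the spinning implementation could later be swapped for a futex-based one without touching call sites.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical macro layer (illustrative names, not Gerbil's or Gambit's).
   Callers use only these macros, so the lock implementation can change. */
typedef atomic_flag table_lock_t;
#define TABLE_LOCK_INIT ATOMIC_FLAG_INIT
#define TABLE_LOCK(l)                                                       \
  do {                                                                      \
    /* Busy-wait until the flag was previously clear. */                    \
    while (atomic_flag_test_and_set_explicit((l), memory_order_acquire)) {} \
  } while (0)
#define TABLE_UNLOCK(l) atomic_flag_clear_explicit((l), memory_order_release)

static table_lock_t table_lock = TABLE_LOCK_INIT;
static long counter = 0; /* stands in for the shared table */

static void *worker(void *arg) {
  long iters = *(const long *)arg;
  for (long i = 0; i < iters; i++) {
    TABLE_LOCK(&table_lock);
    counter++; /* very short critical section, the case where spinning is cheap */
    TABLE_UNLOCK(&table_lock);
  }
  return NULL;
}

/* Runs nthreads workers, each incrementing the counter iters times.
   Returns the final counter value, or -1 on invalid input. */
long run_workers(int nthreads, long iters) {
  pthread_t threads[16];
  int created = 0;
  if (nthreads < 1 || nthreads > 16) return -1;
  counter = 0;
  for (int i = 0; i < nthreads; i++) {
    if (pthread_create(&threads[i], NULL, worker, &iters) != 0) break;
    created++;
  }
  for (int i = 0; i < created; i++) pthread_join(threads[i], NULL);
  return counter;
}
```

Calling `run_workers(4, 100000)` should return the full count when all threads are created, since every increment happens under the lock. The sketch spins unconditionally; a production lock would typically add a pause or yield in the loop, which is part of what the follow-up discussion about SMP locking is about.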
Follow-up issue in #1128.
On top of #1126

And so it begins... the compiler generates specializers for all bound methods that could benefit from it, and interface prototype creation plugs to it, with wondrous performance results for certain programs.

Here is an example:

```
(defclass A (x y))
(defclass (B A) (z))

(defmethod {linear A}
  (lambda (self w)
    (fx+ (fx* (A-x self) w) (A-y self))))

(defmethod {bilinear-combination B}
  (lambda (self w z)
    {self.bilinear {self.linear w} z}))

(defmethod {bilinear B}
  (lambda (self lc z)
    (fx+ lc (fx* (B-z self) z))))

(interface Combinator
  (bilinear-combination w z))

(def (run iters)
  (let (instance (Combinator (B x: 1 y: 2 z: 3)))
    (for (i (in-range iters))
      (let (result (&Combinator-bilinear-combination instance 4 5))
        (unless (= result 21)
          (error "bad result" result: result expected: 21))))))

(def (main iters)
  (let (iters (string->number iters))
    (time (run iters))))
```

With gxc master:

```
$ gxc -exe -o /tmp/ispec-bench -O src/gerbil/test/interface-specialization-bench.ss
/tmp/gxc.1708454081.817515/test__interface-specialization-bench.scm:
/tmp/ispec-bench__exe.scm:
/tmp/gxc.1708454081.817515/test__interface-specialization-bench.c:
/tmp/ispec-bench__exe.c:
/tmp/ispec-bench__exe_.c:
$ /tmp/ispec-bench 1000000
(time (let () (declare (not safe)) (test/interface-specialization-bench#run _iters79_)))
    0.215345 secs real time
    0.215332 secs cpu time (0.211338 user, 0.003994 system)
    20 collections accounting for 0.016191 secs real time (0.015828 user, 0.000356 system)
    159892208 bytes allocated
    671 minor faults
    no major faults
    562301090 cpu cycles
```

With the specializers:

```
$ ./build.sh env gxc -exe -o /tmp/ispec-bench -O gerbil/test/interface-specialization-bench.ss
/tmp/gxc.1708454105.0922964/test__interface-specialization-bench.scm:
/tmp/ispec-bench__exe.scm:
/tmp/gxc.1708454105.0922964/test__interface-specialization-bench.c:
/tmp/ispec-bench__exe.c:
/tmp/ispec-bench__exe_.c:
[*] Done
$ /tmp/ispec-bench 1000000
(time (let () (declare (not safe)) (test/interface-specialization-bench#run _iters79_)))
    0.010587 secs real time
    0.010587 secs cpu time (0.010586 user, 0.000001 system)
    no collections
    1408 bytes allocated
    no minor faults
    no major faults
    27638154 cpu cycles
```

**20x, not bad huh?**

Basically all the dynamic dispatch call cost of the MOP for self references (slots or methods) has disappeared.
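As a rough analogy for where the speedup comes from, the following C sketch contrasts a method that resolves its self calls through a lookup on every invocation with a specialized version in which those calls are bound directly. This is not how gxc implements specializers; the types, the linear method table, and every function name here are assumptions made for illustration, mirroring only the arithmetic of the benchmark above.

```c
#include <stddef.h>
#include <string.h>

typedef struct object object;
typedef long (*method_fn)(object *self, long a, long b);

/* Hypothetical method table entry: method name plus implementation. */
typedef struct {
  const char *name;
  method_fn fn;
} method_entry;

struct object {
  const method_entry *methods;
  size_t nmethods;
  long x, y, z; /* slots, as in classes A and B of the benchmark */
};

/* Dynamic dispatch: search the object's method table by name. */
static method_fn lookup(const object *self, const char *name) {
  for (size_t i = 0; i < self->nmethods; i++)
    if (strcmp(self->methods[i].name, name) == 0) return self->methods[i].fn;
  return NULL;
}

static long linear(object *self, long w, long unused) {
  (void)unused;
  return self->x * w + self->y;
}

static long bilinear(object *self, long lc, long z) {
  return lc + self->z * z;
}

/* Generic version: every self call pays for a lookup. */
static long bilinear_combination_generic(object *self, long w, long z) {
  long lc = lookup(self, "linear")(self, w, 0);
  return lookup(self, "bilinear")(self, lc, z);
}

/* Specialized version: the class is known, so self calls are direct. */
static long bilinear_combination_specialized(object *self, long w, long z) {
  return bilinear(self, linear(self, w, 0), z);
}

static const method_entry B_methods[] = {
  {"linear", linear},
  {"bilinear", bilinear},
  {"bilinear-combination", bilinear_combination_generic},
};

static object make_B(long x, long y, long z) {
  object o = {B_methods, sizeof B_methods / sizeof B_methods[0], x, y, z};
  return o;
}

long demo_generic(void) {
  object b = make_B(1, 2, 3);
  return lookup(&b, "bilinear-combination")(&b, 4, 5);
}

long demo_specialized(void) {
  object b = make_B(1, 2, 3);
  return bilinear_combination_specialized(&b, 4, 5);
}
```

Both paths compute the same value as the benchmark (1*4 + 2 = 6, then 6 + 3*5 = 21); the specialized path simply skips the per-call lookups, which is the cost the comment above describes as having disappeared.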
This changes the prototype table lock to a spin lock, and it is almost 3x faster.
Before:
After: