Comments on nominolo's Blog: Implementing Fast Interpreters

2012-07-24T12:51:14.293+02:00

I'm not sure it makes sense to expend so much effort optimizing an interpreter. I'd expect a splat compiler to be simpler and faster...

2012-07-23T19:33:45.738+02:00

Thomas Schilling: Thanks for the answer!

2012-07-23T14:47:12.744+02:00

Nicely explained. These are some of the very techniques, albeit in Java, that are used in the highly customizable GPVM that I designed for my Computer Organization and Assembly Language students. Well done.

2012-07-23T12:10:41.959+02:00

Alessandro: The design goal is that it shouldn't. The main idea is to remove unnecessary repetition if the code is the same or very similar on different architectures. If it isn't, then you still use architecture-specific code instead of compromising performance.

TerryD: Yes, there are many ways to remove interpreter overhead, but they all start going into JIT territory. E.g., for the MemCpy technique, you need to manage extra memory, possibly flush instruction caches, make sure to dispatch execution to the generated code, possibly invalidate that code if the execution mode changes, etc. Context-threading uses a similar idea to improve branch prediction.

If you're fine with an interpreter and just want a little bit more performance, then going this route is fine and the additional complexity is probably worth it. If you want a better-optimising JIT on top, then it's probably fine to use a simple interpreter and not have an additional proto-JIT.

aweasd: AFAIK, the LLVM JIT is relatively slow. As above, you'll also have to manage the memory for the compiled code. If you care more about ease of implementation than performance, I'd go with a direct-threaded interpreter in C. I'm not sure if LLVM is ready as a full JIT these days. I know the Unladen Swallow (Python JIT) project had its issues and concluded that LLVM wasn't ready yet.
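For readers who have not seen one, here is a minimal sketch of what a direct-threaded interpreter in C might look like. It is not the code from the post: the opcode names, the `cell` union, and `run` are made up for illustration, and it assumes GCC or Clang, since it relies on the GNU "labels as values" extension (`&&label` and `goto *ptr`). The bytecode is first translated so that every opcode is replaced by the address of its handler; each handler then ends by jumping straight to the next handler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for a toy stack machine (not from the post). */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT, OP_COUNT };

/* One cell of threaded code: a handler address or an immediate operand. */
typedef union {
    void *handler;
    intptr_t imm;
} cell;

/* Runs a program of at most 64 words and returns the top of the stack.
   Sketch only: there are no stack underflow checks. */
static intptr_t run(const intptr_t *bytecode, size_t n)
{
    /* GNU extension: &&label yields the address of a label. */
    static void *const handlers[OP_COUNT] = {
        &&do_push, &&do_add, &&do_mul, &&do_halt
    };
    cell code[64];
    intptr_t stack[64];
    intptr_t *sp = stack;
    const cell *ip = code;

    assert(n <= 64);

    /* Translation pass: replace each opcode by its handler address and
       copy the operand of OP_PUSH through unchanged. */
    for (size_t i = 0; i < n; i++) {
        intptr_t op = bytecode[i];
        assert(op >= 0 && op < OP_COUNT);
        code[i].handler = handlers[op];
        if (op == OP_PUSH) {
            assert(i + 1 < n);
            i++;
            code[i].imm = bytecode[i];
        }
    }

/* Dispatch: fetch the next handler address and jump to it directly. */
#define NEXT() goto *(ip++)->handler

    NEXT();

do_push:
    *sp++ = (ip++)->imm;
    NEXT();
do_add:
    sp--;
    sp[-1] += sp[0];
    NEXT();
do_mul:
    sp--;
    sp[-1] *= sp[0];
    NEXT();
do_halt:
    return sp[-1];
#undef NEXT
}
```

Calling `run` on the nine words `OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT` evaluates (2 + 3) * 4. There is no central `switch`: every handler carries its own indirect jump, which is the property that distinguishes direct threading from a plain dispatch loop.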

hsm: I assume you mean the RISC-like MMIX, since the original MIX is an abomination by today's standards. That said, using MMIX would probably hide too many important details. For example, MMIX has 256 registers, but for our purposes we want to optimise register use for each architecture and not rely on having 256 of them. Since ultimately performance is the main goal, we shouldn't try to hide all the implementation details, but just automate the boring parts.

2012-07-23T04:49:44.902+02:00

This wasn't new when Charles Moore invented Forth. Good idea then, good idea now. I wonder: if Knuth's MIX had multiple back ends, would that help with the portability problem?

2012-07-23T03:39:59.208+02:00

What about LLVM bitcode? Seems like a reasonable compromise between portability and assembly-level flexibility.

2012-07-23T03:37:04.804+02:00

If you have ASM code for the different operations and you want to do a little more work, you can use your table as a template and MemCpy the code together. Then, patch the operations that have constants. Presto! From interpreter to compiler! You can spend a year doing little optimizations.
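To make the idea concrete, here is a rough sketch of the copy-and-patch step only. Everything in it is an assumption for illustration: the opcodes, the `code_template` struct, the `assemble` function, and the choice of x86-64 byte sequences as templates. It copies one template per operation into a buffer and patches the 32-bit constant of each push. It deliberately does not execute the result; for that you would still need to place the bytes in executable memory and, on some architectures, flush the instruction cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical opcodes for a toy stack machine (not from the post). */
enum { TPL_PUSH, TPL_ADD, TPL_RET };

/* A machine-code template: the bytes to copy, plus the offset of a
   32-bit constant to patch (-1 if the template has no constant). */
typedef struct {
    const uint8_t *bytes;
    size_t len;
    int imm_offset;
} code_template;

/* Assumed x86-64 encodings, chosen for this sketch. */
static const uint8_t push_bytes[] = { 0x68, 0, 0, 0, 0 };             /* push imm32 */
static const uint8_t add_bytes[]  = { 0x58, 0x48, 0x01, 0x04, 0x24 }; /* pop rax; add [rsp], rax */
static const uint8_t ret_bytes[]  = { 0x58, 0xc3 };                   /* pop rax; ret */

static const code_template templates[] = {
    [TPL_PUSH] = { push_bytes, sizeof push_bytes, 1 },
    [TPL_ADD]  = { add_bytes,  sizeof add_bytes, -1 },
    [TPL_RET]  = { ret_bytes,  sizeof ret_bytes, -1 },
};

/* Copies the template for each operation into `out` and patches constants.
   Returns the number of bytes written, or 0 on a bad opcode, a missing
   operand, or insufficient space. */
static size_t assemble(const int32_t *prog, size_t n, uint8_t *out, size_t cap)
{
    size_t pos = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t op = prog[i];
        if (op < TPL_PUSH || op > TPL_RET)
            return 0;
        const code_template *t = &templates[op];
        if (t->len > cap - pos)
            return 0;
        memcpy(out + pos, t->bytes, t->len);      /* copy the template */
        if (t->imm_offset >= 0) {
            if (i + 1 >= n)
                return 0;
            uint32_t imm = (uint32_t)prog[++i];
            /* Patch the constant, little-endian as x86-64 expects. */
            for (int b = 0; b < 4; b++)
                out[pos + (size_t)t->imm_offset + (size_t)b] =
                    (uint8_t)(imm >> (8 * b));
        }
        pos += t->len;
    }
    return pos;
}
```

For the program `TPL_PUSH, 2, TPL_PUSH, 3, TPL_ADD, TPL_RET` this emits 17 bytes: two patched pushes, the add template, and the return template. Writing the constant byte by byte keeps the generator independent of the host's endianness, even though the emitted code targets one architecture.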

2012-07-23T02:22:50.405+02:00

Loved your post ;).

Assembly, Interpreters, speed... Very cool!

Does the "portable assembly" solution through a DSL incur a significant speed penalty?