First, to prevent ourselves from getting tongue-tied, I will refer to Run-Time Code Generation as RTCG. Like all recent technologies, RTCG is a mouthful of a name for what is really just a simple concept. An example is the best way of understanding what it's all about. Traditional emulators are based on slow interpreters. Modern emulators, such as Java JIT compilers, instead prefer to generate code native to the processor. This code is generated while the emulator is still running. It is also faster because it executes the program being emulated more directly than the slow interpretive layer. Pretty straightforward, huh? Conventional (ahead-of-time) compilers don't fall into this category because the code they generate is run after the compiler is finished.
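To make the interpreter-versus-generated-code contrast concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the four-opcode stack machine, the names `interpret`, `translate` and `generated`, and the choice of emitting Python source rather than real machine code. A real emulator or JIT would emit native instructions, but the shape of the idea is the same: decode the emulated program once, emit host code, and run that instead of re-dispatching every instruction.

```python
# Hypothetical toy "emulated" instruction set: a stack machine with
# PUSH n, LOAD (push the input x), ADD and MUL.
PROGRAM = [("LOAD",), ("PUSH", 3), ("MUL",), ("PUSH", 4), ("ADD",)]  # x*3 + 4


def interpret(program, x):
    """Traditional approach: decode and dispatch every instruction on every run."""
    stack = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(args[0])
        elif op == "LOAD":
            stack.append(x)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return stack.pop()


def translate(program):
    """RTCG approach: decode once, emit host code, then run that code directly."""
    stack = []  # holds expression strings instead of values
    for op, *args in program:
        if op == "PUSH":
            stack.append(repr(args[0]))
        elif op == "LOAD":
            stack.append("x")
        elif op in ("ADD", "MUL"):
            b, a = stack.pop(), stack.pop()
            sign = "+" if op == "ADD" else "*"
            stack.append(f"({a} {sign} {b})")
        else:
            raise ValueError(f"unknown opcode {op!r}")
    source = f"def generated(x):\n    return {stack.pop()}\n"
    namespace = {}
    # The code is produced and loaded while the "emulator" is still running.
    exec(compile(source, "<rtcg>", "exec"), namespace)
    return namespace["generated"], source


generated, source = translate(PROGRAM)
print(source)
print(interpret(PROGRAM, 5), generated(5))
```

Both paths compute the same result; the difference is that `generated` no longer contains any opcode dispatch at all, which is where the speedup of a real translator would come from.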
Even microprocessors use RTCG. Instruction-level parallelism (ILP) is attained by re-arranging instructions, which is essentially a form of RTCG. Starting with the Pentium Pro, the Pentium line translates the horribly complex x86 opcodes into the micro-ops actually understood by its underlying RISC-like engines. Similarly, Transmeta's Crusoe and the Elbrus 2000 translate x86 and IA-64 instructions, respectively, into those actually used by their VLIW processors.
Because translation is such an essential part of RTCG, it is also known as Binary Translation. Dynamic Compiling is another common name, although the Tao Group prefers Dynamic Binding. David Ditzel of Transmeta has coined the most hip term so far: "Code Morphing".
Dynamic Optimization is a more advanced form of RTCG. It aims to improve the performance of (already) native programs by optimizing their most frequently executed code paths. These optimizations are based on information only available at run-time. Thus Dynamic Optimization fills a niche that compilers cannot.
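A rough sketch of that idea, again in Python and again built entirely on assumptions of mine: the `DynamicOptimizer` wrapper, the `HOT_THRESHOLD` value and the `scale` routine are hypothetical, and the "optimization" is just baking a frequently seen argument into freshly generated code. It assumes the hot value is a hashable number. The point is only to show the three ingredients: a run-time profile, a hotness test, and generated code guarded by the condition it was specialized for.

```python
HOT_THRESHOLD = 3  # hypothetical: calls seen before a value counts as "hot"


def scale(values, factor):
    """The original, general-purpose routine."""
    return [v * factor for v in values]


class DynamicOptimizer:
    """Wraps a two-argument function and specializes it on a hot second argument."""

    def __init__(self, func):
        self.func = func
        self.counts = {}       # run-time profile: factor -> number of calls
        self.specialized = {}  # factor -> code generated for that factor

    def __call__(self, values, factor):
        fast = self.specialized.get(factor)
        if fast is not None:  # guard: generated code is only valid for this factor
            return fast(values)
        self.counts[factor] = self.counts.get(factor, 0) + 1
        if self.counts[factor] >= HOT_THRESHOLD:
            self.specialized[factor] = self._generate(factor)
        return self.func(values, factor)

    @staticmethod
    def _generate(factor):
        # Uses knowledge a static compiler never had: the actual hot value.
        if factor == 1:
            body = "list(values)"  # the multiplication disappears entirely
        else:
            body = f"[v * {factor!r} for v in values]"
        source = f"def specialized(values):\n    return {body}\n"
        namespace = {}
        exec(source, namespace)
        return namespace["specialized"]


fast_scale = DynamicOptimizer(scale)
for _ in range(5):
    result = fast_scale([1, 2, 3], 10)
print(result, sorted(fast_scale.specialized))
```

The first three calls with a factor of 10 go through the general routine while the profile fills up; later calls hit the generated version. Any other factor still falls back to the original code, which is what keeps the optimization transparent.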
Research into RTCG was initially motivated in part by the desire to speed up emulators. However, more recent work addresses the inability of compilers to optimize across method boundaries, virtual function calls and DLLs.
So What Does This Have To Do With Me?
RTCG is a new field of computer technology, having only attracted the serious attention of academia and the corporate world within the last decade. However, it has more potential than just helping geeks brag about the latest speed limits their computers violated.
But don't get me wrong: (transparently) improving the performance of software does have its benefits. When more work can be wrung out of computers, they don't need to be upgraded as often. The increase in performance will ease the acceptance of virtual machines (VMs) into popular use. VMs enhance security, simplify development and debugging, and sometimes even allow running applications under foreign OSs. VMs can do this because they insulate OSs and programs from each other.
Better performance also gives higher-level languages some slack to work with, thus easing their adoption by programmers. JIT compilers have helped Java be taken more seriously for server-side development. Criticisms of functional languages for being slow now have less legitimacy. Programmers of imperative languages can be freed from subtler tuning issues and instead focus on writing maintainable code. Issues of array striding and conditionals within loops can be left to the RTCG subsystem.
RTCG also allows compiled binaries to become independent of the processor without a significant tradeoff in performance. Compiled applications are increasingly shared across different architectures. Any Java class file can run on both a SPARC and a Pentium. Tao's Elate RTOS runs the same executable on any processor. However, the ability to "write-once, run-anywhere" only works among environments that provide the same set of APIs. Since computers are bought for the applications they run, there will be less concern for which processor is used, as long as the entire computer system is affordable and fast enough.
Compilers tend not to be released in a timely enough manner for software vendors to take advantage of the latest whizbang processor. Even when the compilers eventually do become available, developers are too lazy to compile all their applications for each architecture (no flames please - I'm a developer myself!). They like to use "deployment complexities" and "user friendliness" as excuses. Well-designed RTCG systems will be able to address these concerns.
The expected growth in digital appliances will bring with it diverse heat, speed, cost and form factor constraints. Clearly, no single company or processor will be able to meet this challenge. However, portability of compiled software between competing appliances is a challenge that RTCG can handle.
Another obstacle RTCG can hurdle is making the instruction set orthogonal to the processor's architecture. The Pentium, Crusoe and Elbrus 2000 are based on RISC and VLIW designs, but are able to execute what is essentially a CISC instruction set by using RTCG. This flexibility is becoming increasingly important in today's market, which places a premium on compatibility with the x86 even though chip designers wish to be rid of its restrictions.
More importantly, RTCG frees VLIW processors from the burden of backward compatibility. Without RTCG, VLIW instruction sets are inherently dependent on the number and latencies of functional units in a specific processor. RTCG should ease the adoption of VLIW processors by extending their expected lifetime. Contemporary corporate research is hedging its bets on the VLIW approach. Witness Intel's IA-64, Sun Microsystems's MAJC, Transmeta's Crusoe and the Elbrus 2000. VLIW processors forgo any hardware-assisted optimization and instead shift all the responsibility (and blame) to the compiler. Perhaps there is a role in this hardware-software spectrum for RTCG to play.
RTCG is software-based. Thus it gives processor designers more elbowroom to work with. They are no longer restricted to hardware-only options when making tradeoffs between a chip's speed, size, cost, heat dissipation and power requirements. But the best is yet to come: RTCG is a path around patent restrictions. Now companies can compete with instruction-compatible processors on a level playing field, beyond the grimy reach of patent lawyers. Coup de grâce! Shifts in the competitive landscape due to RTCG are bound to occur. The extent of change, however, remains to be seen.
Despite all the benefits it brings, RTCG still raises a fuss. What are the legal implications of running code that is different from what the software vendor delivered? Whose tech support do you call when correct code becomes faulty during optimization? (I'd hate to be the guy who has to debug that). Will the emergence of dynamic compiling end the market dominance of the x86 instruction set architecture (ISA)? If so, will it be replaced by a RISC-like ISA like the SPARC? Or a bytecode like Java's? Something more hierarchical, like the high-level Slim Binaries? Or maybe even something resembling a 4-way VLIW ISA? Perhaps a multitude of instruction formats will emerge to serve different niches.
This site aims to foster the understanding of RTCG amongst all developers, not just those who engage in its research or actively follow conference proceedings. I am particularly interested in RTCG's impact on the hardware and software industries, how it will influence microprocessor designs and the trends that have driven research into this field. What are your thoughts on these topics? Recommended links on academic, corporate and related research, products, conferences, books, magazines and jobs are also welcome.
`C (Tick-C) is an extension of C developed at MIT. It allows developers to generate code at run-time by supplying C statements. This now-retired project essentially introduced the functionality of Perl's eval() function to C programmers.
DyC is a compiler from the University of Washington that compiles code on the fly. The intent is to improve program performance by generating specialized code based on the program's data during execution.
Dynamo, from Indiana University, is similar in spirit to DyC. It aims to build an architecture for a compiler that can dynamically optimize code at various levels of representation. (Note: there is a research project from Hewlett Packard that is also called Dynamo).
Tempo is a partial evaluator for C from the University of Rennes in France. This means that given C source and runtime constants, Tempo will emit an optimized version of the source code based on the runtime constants. Unlike the above projects, Tempo can work both offline and at runtime.
The Compilation and Computer Architecture research group at Harvard is exploring various aspects of RTCG. The projects include building infrastructure, temporal profiling and dynamic optimization.
Self is an OO programming environment developed at Sun Microsystems. This project did a lot of seminal work in the RTCG field.
Kanban is Sun Microsystems' umbrella group for projects regarding Java VM JIT technology. HotSpot is Sun's commercial JVM product line and is based on technology from the Kanban group.
Quicksilver is an IBM research project focused on new compilation methods for Java. Jalapeño is a sister project of Quicksilver and features a JIT compiler written in Java itself.
DAISY aims to dynamically compile x86, PowerPC and JVM code for VLIW architectures. Hmm... this sounds suspiciously similar to Transmeta's Code Morphing technology. They've even gone Open Source!
Dynamo is an emulator from Hewlett Packard for their PA-RISC processors that runs on the PA-RISC itself. As it emulates a program, Dynamo builds a library of code fragments. The most frequent of these fragments are optimized and linked together to form long chains of optimized fragments. (Note: there is a research project at Indiana University that is also called Dynamo).
Slashdot discussion (134K) (I recommend filtering out comments with scores < 4).
UQBT is a Binary Translator from the University of Queensland in Australia. It translates fragments of code from one architecture to another. Unlike the above projects, it does not operate at run-time. However, this research will still be relevant in a runtime environment.
The Adaptive Computation Lab at McGill University in Montreal is studying how to adapt both the processor and software for greater performance. Inline caches, bytecode prediction, genetic algorithms, oh my!
ARTiC Labs (Laboratory for Advanced Research in Computing Technology and Compilers) at the University of Minnesota is focused on the interaction between compilers and parallel architectures. The Multiprocessor Architecture and Communications project has published papers on sub-block reuse, value reuse and value prediction, among others.
Transmeta is the first company to release a microprocessor that features dynamic optimization. The Crusoe white paper includes an overview of their Code Morphing technology.
Elbrus is a Russian firm that is developing a microprocessor, the Elbrus 2000, that uses "binary translation" techniques to maintain compatibility with the Intel IA-64.
Tower Technology has developed TowerJ, a server-side Java VM that competes with Sun's HotSpot.
JRockit is a server-side JVM from Appeal Virtual Machines in Sweden.
FX!32 is a discontinued product originally from Digital (but now from Compaq) that translated x86 applications to native Alpha code during execution. Here is a technical white paper on FX!32.
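Several of the projects listed above (DyC and Tempo in particular) revolve around specializing code on values that are only known at run time. The sketch below illustrates the general idea of partial evaluation on the classic power-function example. It is written in Python rather than C, and it is not Tempo's or DyC's actual algorithm or interface: `power`, `specialize_power` and the emitted `power_3` are hypothetical names chosen for illustration.

```python
def power(base, exponent):
    """General version: the loop and its counter are paid for on every call."""
    result = 1
    for _ in range(exponent):
        result *= base
    return result


def specialize_power(exponent):
    """Emit source for power() with `exponent` fixed, unrolling the loop away."""
    if exponent < 0:
        raise ValueError("this sketch only handles non-negative exponents")
    expression = " * ".join(["base"] * exponent) or "1"
    return f"def power_{exponent}(base):\n    return {expression}\n"


# Offline use: the emitted text could be written to a file and compiled later.
cube_source = specialize_power(3)
print(cube_source)

# Run-time use: compile and load the emitted text immediately.
namespace = {}
exec(cube_source, namespace)
cube = namespace["power_3"]
print(cube(4), power(4, 3))
```

Because the specializer hands back plain source text, the same routine serves both modes mentioned for Tempo: the text can be saved and compiled offline, or loaded on the spot as done here.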