The Metaprogramming Dilemma
2016-12-01
Article was originally posted here: https://odin.handmade.network/blogs/p/1723-the_metaprogramming_dilemma
Designing this language has been difficult but fun. Two of the original goals of this language were simplicity and metaprogramming; however, together these could be an oxymoron. But before I explain why, I first need to explain what I mean by “metaprogramming”.
Metaprogramming is the “art” of writing programs that treat other programs as their data. This means that a program could generate, read, analyse, and transform code, or even itself, to achieve a certain solution. The approaches to metaprogramming can be split into a few distinct categories:
- Introspection (and Reflection for OOP languages)
- Compile Time Execution (CTE)
- Template Programming
- Macros (Textual and Syntactic)
- Parametric Polymorphism (“Generics”)
Many languages have metaprogramming functionalities: C has textual macros; C++ has textual macros, a functional templating language, and rudimentary introspection; Nim has all of the above; Go has external textual “templates/macros”. Each approach has its advantages and disadvantages but all can be used together to achieve different solutions and results.
Introspection is already part of the language, and it is a functionality I think is necessary in a “modern” language. It is needed for something like a “type-safe printf” (at runtime) and for serializing data with ease. However, introspection does require extra memory to store the type information. (n.b. Reflection is only appropriate for object-oriented programming languages, which this language is not.)
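To make the “type-safe printf” idea concrete, here is a minimal sketch in C of a value that carries its type information with it at runtime. The names `TypeKind`, `Any`, and `format_any` are hypothetical and chosen only for illustration; this is not how Odin implements introspection, just one way the idea could look.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical runtime type tags. A language with introspection would
   generate this information itself; here it is written by hand. */
typedef enum { TYPE_INT, TYPE_F64, TYPE_STRING } TypeKind;

/* A value bundled with its type tag. The tag is the "extra memory"
   that introspection costs. */
typedef struct {
    TypeKind kind;
    const void *data;
} Any;

/* Formats one value into buf by inspecting its type tag at runtime,
   so the caller never supplies a format specifier that could mismatch.
   Returns whatever snprintf returns, or -1 for an unknown tag. */
int format_any(char *buf, size_t size, Any value) {
    switch (value.kind) {
    case TYPE_INT:
        return snprintf(buf, size, "%d", *(const int *)value.data);
    case TYPE_F64:
        return snprintf(buf, size, "%g", *(const double *)value.data);
    case TYPE_STRING:
        return snprintf(buf, size, "%s", (const char *)value.data);
    }
    return -1;
}
```

In this sketch the tag is still filled in by hand, so it is only as safe as the code that builds the `Any`; with real introspection the compiler would attach the type information automatically.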
Compile Time Execution (CTE) is an idea that I’ve been pondering for a while, and it’s already part of Jon Blow’s language, Jai. It would be a stage of the compiler which runs any Odin code the user requests before the creation of the executable. The data modified and generated by this stage would be used as the initialization data for the compiled code. I have come to the conclusion that this CTE stage would have to be run through an interpreter to achieve the results needed. However, there are a few problems with this CTE stage. The main problem is that pointers will point to invalid memory addresses. This is because the memory space of the interpreter is completely different from the memory space of the executable (compiled code). Numerous types are stored with pointers internally, and these values would be invalid at runtime. Due to this problem (and a few others), this powerful feature becomes extremely delicate.
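The pointer problem can be shown in miniature with plain C. In the sketch below, one buffer stands in for the interpreter's memory and a byte-for-byte copy of it stands in for the executable's initialization data. `Record`, `build_image`, and `record_name` are hypothetical names, and storing an offset next to the pointer is just one common workaround, assumed here for contrast; it is not something this article proposes for Odin.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *name_ptr;  /* absolute address: only valid where it was created */
    size_t name_offset;    /* offset from the start of the image: survives a copy */
} Record;

/* Writes a Record followed by its name string into image.
   Returns the number of bytes used, or 0 if the image is too small. */
size_t build_image(unsigned char *image, size_t size, const char *name) {
    size_t len = strlen(name) + 1;
    if (size < sizeof(Record) + len) return 0;

    Record r;
    r.name_offset = sizeof(Record);
    r.name_ptr = (const char *)(image + r.name_offset);

    memcpy(image, &r, sizeof r);
    memcpy(image + r.name_offset, name, len);
    return sizeof(Record) + len;
}

/* Resolves the name through the offset, relative to wherever the
   image currently lives, instead of trusting the stored pointer. */
const char *record_name(const unsigned char *image) {
    Record r;
    memcpy(&r, image, sizeof r);
    return (const char *)(image + r.name_offset);
}
```

If the image is built in one buffer and then copied to another, the copied `name_ptr` still holds an address inside the first buffer, while `record_name` on the copy still finds the string. Any type that stores pointers internally has the same problem when its bytes move from the interpreter to the executable.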
Templates can be thought of as a subset of macros. Templates are usually a simple substitution mechanism that operates on the Abstract Syntax Tree (AST). Macros, on the other hand, could be a simple text substitution system (akin to C’s preprocessor) or even a compile-time AST modification and generation tool (similar to Lisp or Nim). Templates could even be an entire language built into the language (like C++’s templates). This makes them both extremely powerful but also “magic”.
Parametric polymorphism, commonly referred to as “Generics” (which is a very “generic” name too), is the ability to duplicate certain “snippets” of code that have a similar structure but different types/names/etc. A basic example is a generic sorting function which accepts an array of a certain type and a sorting function for that specific type. In a language such as C, this would have to be achieved either through code duplication (copy&paste or macros) or through void pointers that remove the type information. The former method can become cumbersome and prone to mistakes, while the latter removes a lot of type safety and prevents compiler optimizations. In a language that does have “generics”, this “problem” can be solved. However, the problem at hand is very rarely a generic problem, and “genericizing” the problem doesn’t actually solve the original problem. “Generics” can be emulated through the use of templates and macros, which means that they may not need to be a built-in feature of a language.
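The two C approaches to the sorting example can be sketched side by side. The first uses the standard library's `qsort`, which erases the element type behind `void` pointers; the second uses a macro to stamp out a typed function per element type. `sort_ints_erased`, `DEFINE_INSERTION_SORT`, `less_double`, and `sort_doubles` are hypothetical names for illustration, and insertion sort is used only to keep the macro short.

```c
#include <stddef.h>
#include <stdlib.h>

/* Approach 1: void pointers. qsort knows nothing about the element
   type, so the comparator must cast back and the compiler cannot
   check that the casts match the array being sorted. */
static int compare_ints(const void *a, const void *b) {
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

void sort_ints_erased(int *items, size_t count) {
    qsort(items, count, sizeof items[0], compare_ints);
}

/* Approach 2: code duplication through a macro. Each use generates a
   fully typed insertion sort for one element type T. */
#define DEFINE_INSERTION_SORT(NAME, T)                        \
    void NAME(T *items, size_t count, int (*less)(T, T)) {    \
        for (size_t i = 1; i < count; i++) {                  \
            T key = items[i];                                 \
            size_t j = i;                                     \
            while (j > 0 && less(key, items[j - 1])) {        \
                items[j] = items[j - 1];                      \
                j--;                                          \
            }                                                 \
            items[j] = key;                                   \
        }                                                     \
    }

int less_double(double a, double b) { return a < b; }

/* Generates: void sort_doubles(double *, size_t, int (*)(double, double)) */
DEFINE_INSERTION_SORT(sort_doubles, double)
```

The macro version keeps the types but is harder to read and debug, since errors surface inside the expanded code; the `qsort` version is compact but will happily accept a comparator written for the wrong type.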
This now brings me to the dilemma. How far do I want to go with metaprogramming in this language? How far can I go whilst keeping Odin a simple language? Or is the very definition of metaprogramming not simple? Or should metaprogramming be left to an external (standardized) tool (like go generate)?