The C programming language will turn 50 next year (it first appeared in 1972, 49 years ago). It is still popular and is used across many platforms and applications (e.g. Linux, Git, the Microsoft Windows kernel, embedded systems, real-time applications, ...).
C++ is "younger", with 36 candles, and is popular in gaming, video, desktop/mobile apps...
I like both languages, and there is a big difference between the first C++ (C with Classes) and the latest standards (C++11, C++14, C++17, C++20, ...). But over time, I think C++ became a strange language, especially because of compatibility...
I have wanted to create a new programming language for a long time. I had two options: reinvent the wheel, which would take a lot of time, or fork an existing project. I chose the second one. I forked LLVM/Clang and called it Klang!
(Todo: need to find a better name later).
This post is the first of a series in which I would like to evolve a new language based on C and C++. I also like Assembly, Pascal and C#, so I will add some concepts borrowed from these languages.
The first step was to strip out a couple of things I don't like or find useless in C++.
The "goto" statement is available in C and C++. Under the hood it is a jump instruction (e.g. JMP in assembly): execution jumps to the label given to goto, like this:
#include <stdio.h>

int main()
{
    goto label;
    printf("skipped\n");
label:
    printf("Hello World!!!\n");
    return 0;
}
With goto, it's easy to create spaghetti code (resulting in a program flow that is conceptually like a bowl of spaghetti, twisted and tangled).
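Most uses of goto have a structured equivalent. As a minimal sketch (the `find_value` function and its `grid` parameter are made up for illustration), a search that would traditionally leave two nested loops with a goto can return early from a helper function instead:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true as soon as `target` is found.
// The early return replaces the classic "goto found;" out of nested loops.
bool find_value(const std::vector<std::vector<int>>& grid, int target)
{
    for (std::size_t row = 0; row < grid.size(); ++row)
    {
        for (std::size_t col = 0; col < grid[row].size(); ++col)
        {
            if (grid[row][col] == target)
            {
                return true; // leaves both loops at once
            }
        }
    }
    return false;
}
```

Calling `find_value(grid, 3)` on a grid that contains 3 returns true without visiting the remaining cells.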
Here are NASA's "Power of Ten" rules, the first of which advises against goto:
1. Avoid complex flow constructs, such as goto and recursion.
2. All loops must have fixed bounds. This prevents runaway code.
3. Avoid heap memory allocation.
4. Restrict functions to a single printed page.
5. Use a minimum of two runtime assertions per function.
6. Restrict the scope of data to the smallest possible.
7. Check the return value of all non-void functions, or cast to void to indicate the return value is useless.
8. Use the preprocessor sparingly.
9. Limit pointer use to a single dereference, and do not use function pointers.
10. Compile with all possible warnings active; all warnings should then be addressed before release of the software.
Source: Coding for space programs that took us to the moon and beyond
That's the first thing I did. Klang prohibits all goto statements and labels, so here is the result when we compile the goto example above:
OK, this one is more of a nicety, since it could easily be handled outside compilation with an auto-formatter (e.g. clang-format) that replaces tabs with spaces.
Not all companies use an auto-formatter, and it's easy to slip a stray tab into a code base. One option would be to detect tabs on the build machine, but I try to avoid extra steps.
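To give an idea of how small such a check is, here is a minimal sketch of a tab scanner. It is an illustration only, not Klang's actual lexer code; the `TabLocation` struct and the `find_first_tab` function are hypothetical names, and the code assumes C++17.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical result type: 1-based line and column of the offending tab.
struct TabLocation
{
    std::size_t line;
    std::size_t column;
};

// Scans `source` and returns the position of the first tab character,
// or std::nullopt when the text contains no tabs.
std::optional<TabLocation> find_first_tab(std::string_view source)
{
    std::size_t line = 1;
    std::size_t column = 1;
    for (char c : source)
    {
        if (c == '\t')
        {
            return TabLocation{line, column};
        }
        if (c == '\n')
        {
            ++line;
            column = 1;
        }
        else
        {
            ++column;
        }
    }
    return std::nullopt;
}
```

A build step or a compiler front end could turn the returned location into an error message pointing at the exact line and column.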
Klang detects and prohibits any tab inside a source file:
Some people or companies omit curly braces when the body is a single line. There is no official rule, so to avoid any confusion and to make code faster to read, Klang forces curly braces on selection and iteration statements such as "if", "else", "switch", "for", "do" and "while".
Adding curly braces everywhere sounds like overhead, but it's useful. Below is Apple's "goto fail" bug, which created a vulnerability:
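The pattern can be sketched in a simplified form. The names used here (`check_step`, `verify_unbraced`, `verify_braced`) are hypothetical and this is not Apple's actual source; it only reproduces the shape of the bug and contrasts it with a fully braced version.

```cpp
// Hypothetical stand-in for one verification step: 0 means success.
int check_step(bool ok)
{
    return ok ? 0 : -1;
}

// Shape of the bug: the second "goto fail;" is not part of the if.
int verify_unbraced(bool first_ok, bool second_ok)
{
    int err;
    if ((err = check_step(first_ok)) != 0)
        goto fail;
        goto fail; // always executed, despite the indentation
    if ((err = check_step(second_ok)) != 0)
        goto fail;
fail:
    return err; // still 0 when the first check passed
}

// Same checks with mandatory braces and no goto.
int verify_braced(bool first_ok, bool second_ok)
{
    int err;
    if ((err = check_step(first_ok)) != 0)
    {
        return err;
    }
    if ((err = check_step(second_ok)) != 0)
    {
        return err;
    }
    return 0;
}
```

With `first_ok` true and `second_ok` false, `verify_unbraced` reports success because the second check is never reached, while `verify_braced` reports the failure.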
"The issue is the two consecutive goto fail; statements. Although the indentation of the lines makes it appear as though they’ll both get executed only if the predicate in the if-statement is true, the second one gets executed regardless of whether the predicate is true or false. If the indentation is corrected, the problem becomes more obvious. The return value of zero is provided to the caller, who believes that the signature verification on the "Server Key Exchange" message passed."
[Source]
The code above looks odd but it's C++ code and it will compile with almost any compiler (e.g. Visual Studio 2019).
Why?
Did you find the curly braces on this keyboard?
Some very old keyboards lack keys for certain characters, so digraphs and trigraphs are sequences of two or three characters that stand in for the unavailable ones: {, }, [, ], #, \, ^, |, ~.
Trigraph | Digraph | Equivalent |
---|---|---|
??= | %: | # |
??/ | | \ |
??' | | ^ |
??( | <: | [ |
??) | :> | ] |
??! | | \| |
??< | <% | { |
??> | %> | } |
??- | | ~ |
The same goes for the alternative operator tokens, which act as aliases:
Token | Equivalent |
---|---|
compl | ~ |
not | ! |
bitand | & |
bitor | \| |
and | && |
or | \|\| |
xor | ^ |
and_eq | &= |
or_eq | \|= |
xor_eq | ^= |
not_eq | != |
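As an illustration, the following sketch (a made-up `sum_if_positive` function) is written with digraphs and alternative tokens only and is still valid standard C++. Trigraphs are left out because C++17 removed them from the standard.

```cpp
%:include <cstddef>

// Sums the strictly positive entries of an array.
// Digraphs: %: is #, <% %> are braces, <: :> are brackets.
// Alternative token: "not" is "!".
int sum_if_positive(const int* values, std::size_t count)
<%
    int total = 0;
    for (std::size_t i = 0; i < count; ++i)
    <%
        if (not (values<:i:> <= 0))
        <%
            total += values<:i:>;
        %>
    %>
    return total;
%>
```

For the array `{3, -1, 4, 0}` the function adds 3 and 4 and skips the other two entries.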
I disabled and removed digraphs, trigraphs and alternative operators for Klang!
I have conducted a number of interviews over the last few years, and I like to ask quick whiteboard questions to gauge the candidate's knowledge. For one of them, I write some types (e.g. double, float, int, short, ...) on the whiteboard and ask for the size of each in bytes or bits.
I would say about half of the candidates give the right values.
I think the problem is the naming of the primitives: float, double, short and int are not intuitive!
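The actual widths on a given platform can be checked at compile time. A minimal sketch, where `bits_of` is a hypothetical helper name and the asserted values are those of mainstream desktop compilers rather than guarantees of the C++ standard:

```cpp
#include <climits>
#include <cstddef>

// Hypothetical helper: width of T in bits on the current platform.
template <typename T>
constexpr std::size_t bits_of()
{
    return sizeof(T) * CHAR_BIT;
}

// These hold on mainstream desktop targets (e.g. x86-64 with GCC, Clang
// or MSVC), but the standard does not guarantee them everywhere.
static_assert(bits_of<short>() == 16, "short is 16 bits here");
static_assert(bits_of<int>() == 32, "int is 32 bits here");
static_assert(bits_of<float>() == 32, "float is 32 bits here");
static_assert(bits_of<double>() == 64, "double is 64 bits here");
```

None of these numbers can be read off the type names, which is the point of the interview question.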
I like the primitive naming I used when I coded for the GBA a long time ago.
The GBA uses an ARM processor, and ARM has very good naming for its instructions, especially SIMD (NEON):
[Source] VSHR
Datatype: must be one of S8, S16, S32, S64, U8, U16, U32, or U64.
"float" is a 32 bit floating point type and "double" a 64 bit floating point type.
So for Klang, I replaced "float" with "f32" and "double" with "f64".
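Klang changes the type names in the compiler itself. In standard C++ the same naming can be approximated with aliases; the sketch below is that approximation, not Klang's implementation, and `to_radians` is a made-up function for illustration.

```cpp
// Approximation in standard C++: aliases instead of new keywords.
using f32 = float;
using f64 = double;

// Hypothetical example function written with the new names.
f64 to_radians(f64 degrees)
{
    constexpr f64 pi = 3.14159265358979323846;
    return degrees * pi / 180.0;
}

// f32 works the same way for single-precision values.
f32 half(f32 value)
{
    return value * 0.5f;
}
```

The difference from Klang is that with aliases the old names remain available, so nothing stops a code base from mixing both spellings.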
Here are two examples from Wolfenstein 3D code I updated:
For now, Klang only renames the floating-point types; in the next blog post, I will also change the other primitives (e.g. int, short, ...).
It's easy to compile one source file like the "Hello World" sample:
#include <iostream>

int main() {
    std::cout << "Hello World!";
    return 0;
}
It is better to use a real-world case, so I used one of the first games I played on PC (in the DOS days): Wolfenstein 3D.
The original source code is available on GitHub, but it contains some assembly code and mostly targets DOS.
There is also another version (Chocolate-Wolfenstein-3D), compatible with Windows and macOS, whose goal is: "Chocolate Wolf3D removes all the crap that was added over the years (snow, rain ...) in order to recreate the experience from 1993".
This version was created by Fabien Sanglard. He also wrote a book about how Wolfenstein was made (Game Engine Black Book: Wolfenstein 3D). It's a pretty interesting read.
I cloned the Chocolate-Wolfenstein-3D repository and made some modifications:
- Removed all goto instructions
- Replaced all "float" and "double" primitives with "f32" and "f64"
- Added Wolf3D dependencies: SDL-1.2.15 & SDL_mixer-1.2.12
- Increased default resolution by 50%
- Fixed warnings (e.g. casting)
- Fixed SDL library build (e.g. double->f64)
- Fixed some Windows files for build (tabs, float->f32, ...)
- Removed some tabs
- Added shareware binary files for Wolfenstein 3D
- Added a build batch file to build, copy dll files (SDL) and copy binary data.
Here is the result:
Thanks for reading,
JS.