From 1d943da0cf9154e7ce78ce867cdbb91531c5d78e Mon Sep 17 00:00:00 2001
From: Matthew Sotoudeh
Date: Tue, 25 Jul 2023 14:58:33 -0700
Subject: initial dietc commit

---
 README.txt | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)
 create mode 100644 README.txt

(limited to 'README.txt')

diff --git a/README.txt b/README.txt
new file mode 100644
index 0000000..ca59161
--- /dev/null
+++ b/README.txt
@@ -0,0 +1,76 @@
+# dietC: a C backend for chibicc
+
+What: desugar C code
+
+Why: easier program analysis and transformation
+
+How: fork of Rui Ueyama's great chibicc compiler, replacing the x86 assembly
+backend with a C backend
+
+Quick Example:
+
+    $ cat tests/super_simple.c
+    int main(void) {
+        int x = 1;
+        int y = 2;
+        return y;
+    }
+    $ make
+    ...
+    $ ./dietc tests/super_simple.c
+    #include "/home/matthew/repos/dietc/scripts/dietc_helpers.h"
+    typedef int Type_1 ;
+    ...
+    extern Type_5 main ;
+    ...
+    Type_1 main ( ) {
+        Type_1 t1 ;
+        ...
+        t3 = 1 ;
+        * t5 = t3 ;
+        ...
+        return t1 ;
+    }
+
+Selling points:
+
+ - Guaranteed space-separated tokens
+ - Extremely restricted grammar, super easy to parse; see docs/LANGUAGE.txt
+ - Maintains all type information
+ - Comes with a wrapper to replace GCC in existing build scripts; see
+   docs/DIETCC.txt
+
+Known to be unsupported:
+
+- dietC is ... optimistic about identifier conflicts with its generated
+  labels. It likely will not work if you run it on its own output. This is very
+  fixable, will do soon.
+- chibicc does not have good support for long double; specifically, long
+  doubles cannot be used in constant expressions. This is fixable, but requires
+  non-trivial modifications to the chibicc side of the code.
+- chibicc does not parse const, volatile, etc. type modifiers. it would not be
+  too hard to support this if desired.
+
+Why not cilly?
+
+ - cilly performs both not enough and too much lowering.
+   - it often compiles a struct field access a->foo into a literal offset like
+     *(a + 16).
+   - even after simplification it has multiple control flow constructs: loop,
+     etc.
+ - it's written in OCaml, and hard to parse the result outside of OCaml
+
+Why not C--, assembly, QBE?
+
+ - all these are strictly lower level than cilly. They preserve almost no type
+   information at all.
+
+Why not LLVM?
+
+ - LLVM has breaking changes approximately every 5ns and preservation of
+   high-level type information is not guaranteed.
+ - mostly forced to use LLVM's C++ interfaces and then compile with LLVM
+
+License:
+ - the underlying chibicc code is licensed under the MIT license
+ - my modifications are licensed under the AGPLv3
--
cgit v1.2.3