• Open source C compiler using Regular Expressions

    From sasho648@sasho648@gmail.com to comp.std.c on Wed Sep 1 09:28:49 2021
    From Newsgroup: comp.std.c

    It uses PCRE2 to parse the C file by matching one huge regex composed of several .regex files stitched together by a Perl script (main.pl). There are currently about 94 callouts placed inside it, which invoke C++ code that reads named capture groups and calls the LLVM APIs appropriately to construct a program.
    https://github.com/6a4h8/cparser2/tree/wip
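    A rough sketch of the general idea, not the actual cparser code: Python's re module has no PCRE2-style callouts, so this stand-in dispatches a handler on the named group that matched instead. The fragment names, the handlers, and the tuple output are all made up for illustration; the real project stitches .regex files together and calls LLVM from C++.

```python
import re

# Hypothetical grammar fragments (the real ones live in .regex files).
FRAGMENTS = {
    "number": r"(?P<number>\d+)",
    "ident": r"(?P<ident>[A-Za-z_]\w*)",
    "op": r"(?P<op>[-+*/=;])",
}

# Stitch the fragments into one big alternation.
MASTER = re.compile("|".join(FRAGMENTS.values()))


def on_number(text, out):
    out.append(("const", int(text)))


def on_ident(text, out):
    out.append(("name", text))


def on_op(text, out):
    out.append(("op", text))


# Stand-in for callouts: one handler per named capture group.
HANDLERS = {"number": on_number, "ident": on_ident, "op": on_op}


def run(source):
    """Scan the source and call the handler for each named group that matched."""
    out = []
    for m in MASTER.finditer(source):
        HANDLERS[m.lastgroup](m.group(m.lastgroup), out)
    return out


if __name__ == "__main__":
    print(run("x = 40 + 2;"))
```

    A real callout fires in the middle of a match and can influence it; this stand-in only reacts after each match, so it shows the named-group dispatch and nothing more.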
    This is an open-source compiler built on regular expressions, mainly targeting C89 (as specified in the FIPS PUB 160 PDF document).
    The backend was originally a huge C switch, which I recently converted into C++ virtual functions. There are two sets of them: one for parsing (these can alter the match) and one for producing output.
    The parsing ones are mainly used for typedefs, since those require context-sensitive parsing inside functions.
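    To illustrate why typedefs need context (a toy sketch, not how cparser does it): the statement "T * x;" declares a pointer if T is a typedef name and is a multiplication otherwise, so the parser has to remember which names were typedef'd. The two patterns and the classify function below are hypothetical, and block scoping is ignored.

```python
import re

IDENT = r"[A-Za-z_]\w*"

# Only handles the simplest typedef form, for illustration.
TYPEDEF = re.compile(
    r"\s*typedef\s+(?:int|char|long|short)\s+(?P<name>" + IDENT + r")\s*;"
)
# "A * B;" is ambiguous without knowing whether A names a type.
STAR = re.compile(
    r"\s*(?P<first>" + IDENT + r")\s*\*\s*(?P<second>" + IDENT + r")\s*;"
)


def classify(statements):
    """Label each statement, tracking the typedef names seen so far."""
    typedef_names = set()
    labels = []
    for stmt in statements:
        m = TYPEDEF.fullmatch(stmt)
        if m:
            typedef_names.add(m.group("name"))
            labels.append("typedef")
            continue
        m = STAR.fullmatch(stmt)
        if m:
            is_type = m.group("first") in typedef_names
            labels.append("declaration" if is_type else "expression")
            continue
        labels.append("other")
    return labels


if __name__ == "__main__":
    print(classify(["T * x;", "typedef int T;", "T * x;"]))
```

    The same text gets a different label before and after the typedef, which is the state a pure regex cannot carry on its own.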
    Currently it doesn't implement initialization, incomplete types, or un-prototyped functions; conditional evaluation with the logical operators is a work in progress.
    Most importantly, it doesn't support attributes or preprocessor directives.
    It does implement everything else, hopefully.
    Check out the WIP branch (last worked on under Windows). Invocation:
    cparser main.pl in_src.c
    Expected output (llvm bitcode and IR representation):
    in_src.c.bc
    in_src.c.ll
    Debug output can be enabled by uncommenting the end of line 6 in main.h. This produces 2 output.txt files and significantly slows down compilation.
    --- Synchronet 3.19a-Linux NewsLink 1.113
  • From Benjamin Williams (Hodgez)@benjamin.jacob.williamsmi@gmail.com to comp.std.c on Sun Jun 11 22:55:33 2023
    From Newsgroup: comp.std.c

    Absolute mad lad. I love it. I will have to give it a try later to see how it all works.
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From sasho648@sasho648@gmail.com to comp.std.c on Mon Jun 12 00:46:14 2023
    From Newsgroup: comp.std.c

    On Monday, June 12, 2023 at 8:55:35 AM UTC+3, Benjamin Williams (Hodgez) wrote:
    > Absolute mad lad. I love it. I will have to give it a try later to see how it all works.
    Just FYI: it's now at https://github.com/AnFunctionArray/cllvmbackend (the actual Perl/regex part is a git submodule). On the "mad lad" part, you'll be happy to hear that this version is also multithreaded. That turned out to be faster the last time I checked (I haven't tried the latest Perl updates), since the regex engine is otherwise the bottleneck. You need these env vars:
    MAXTHREADS=8
    MINLEN=50000
    SILENT=1
    Otherwise the syntax is the same:
    regularc ./parse.pl ./bulk/tests/test.c
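    If you'd rather drive it from a script, something like this should do. It is only a sketch: it assumes regularc is on your PATH, and build_invocation and compile_file are made-up helper names. The variable names and values are the ones listed above, passed through as-is.

```python
import os
import subprocess

# Values from the post; passed through unchanged.
DEFAULT_SETTINGS = {"MAXTHREADS": "8", "MINLEN": "50000", "SILENT": "1"}


def build_invocation(source, script="./parse.pl", overrides=None):
    """Return the command line and environment for one compiler run."""
    env = dict(os.environ)
    env.update(DEFAULT_SETTINGS)
    if overrides:
        env.update(overrides)
    return ["regularc", script, source], env


def compile_file(source):
    """Run the compiler; raises CalledProcessError on a non-zero exit."""
    cmd, env = build_invocation(source)
    return subprocess.run(cmd, env=env, check=True)


if __name__ == "__main__":
    cmd, env = build_invocation("./bulk/tests/test.c")
    print(" ".join(cmd))
```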
    That said, it had some issues in general the last time I tried it, since I was using it for a different purpose (for which there is the non-standard INTPROM env var). However, at some point in the past I did manage to compile the C donut program with slight modifications (mainly removing what needs the preprocessor: line concatenation and comments).
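    Those modifications can be automated along these lines. This is a sketch I am adding for illustration, not part of the project: it joins backslash-continued lines and blanks out /* */ comments while leaving string and character literals alone. It does not touch preprocessor directives, and it only knows C89-style comments.

```python
import re

# Match literals as well as comments so that comment-like text inside
# a string or character literal is left untouched.
TOKEN = re.compile(
    r'"(?:\\.|[^"\\\n])*"'    # string literal
    r"|'(?:\\.|[^'\\\n])*'"   # character literal
    r"|/\*.*?\*/",            # block comment
    re.DOTALL,
)


def splice_lines(text):
    """Join lines that end in a backslash with the following line."""
    return text.replace("\\\n", "")


def strip_comments(text):
    """Replace each comment with a single space; keep literals verbatim."""
    def repl(match):
        token = match.group(0)
        return " " if token.startswith("/*") else token
    return TOKEN.sub(repl, text)


def prepare(text):
    """Splice lines first, then strip comments."""
    return strip_comments(splice_lines(text))


if __name__ == "__main__":
    sample = 'int a = 1; /* note */\nint b = a \\\n+ 2;\n'
    print(prepare(sample))
```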
  • From sasho648@sasho648@gmail.com to comp.std.c on Mon Jun 12 00:49:45 2023
    From Newsgroup: comp.std.c

    To clarify "faster": multithreading is only faster for **very large** files; otherwise the speed is the same.