Writing a simple Compiler on my own - Action Rules for Assignments and Simple Statements [C][Flex][Bison]

[Custom Thumbnail]

All the Code of the series can be found at the Github repository:


    Hello it's a me again @drifter1! Today we continue with my Compiler Series, a series where we implement a complete compiler for a simple C-like language by using the C-tools Flex and Bison. In this article we will write the needed Action Rules for Assignments and Simple Statements. These cases are pretty simple, but an article is still worth it :P

More specifically, the topics that we will cover today are:

  1. Action rules for simple statements
  2. Action rules for assignments
  3. Running the compiler


    Actually you need to read and understand all the topics that I covered in the series as a whole, as these articles will give you access to knowledge about:
  • What Compiler Design is (mainly the steps)
  • For which exact Language the Compiler is build for (Tokens and Grammar)
  • How to use Flex and Bison
  • How to implement a lexer and parser for the language using those tools
  • What the Symbol Table is and how we implement it
  • How we combine Flex and Bison together
  • How we can pass information from the Lexer to the Parser
  • How we define operator priorities, precedencies and associativity
  • What Semantic Analysis is (Attributes, SDT etc.)
  • How we do the so called "Scope Resolution"
  • How we declare types and check the type inter-compatibility for different cases, including function parameters
  • How we check function calls later on using a "Revisit queue", as function declarations mostly happen after functions get used (in our Language)
  • Intermediate Code Representations, including AST's
  • How we implement an AST (structure and management)
  • Action Rules for other cases


Talking about the series in general this series can be rated:
  • Intermediate to Advanced
Today's topic(s) can be rated:
  • Easy
So, without further ado, let's now finally start with the actual Tutorial...

Actual Tutorial Content

    In today's we finally start getting into statements. To get more specific, our language has the following statements:

    As you might have already guessed the previous article about expressions is quite important, as expressions are used in most of the types of statements! In general, for all these types of statements we only have to include the following token definition in the parser:

%type <node> statement assigment

    That's because both the statement and assignment rule are represented by AST nodes. So, let's start with the simple statements on the right!

Action Rules for Simple Statements

    By "simple" I refer to the two simple statements of continue and break, and to the statement form of increment/decrement. Let's start out with the second one!

Increment/Decrement statement

    We already saw the increment/decrement stuff in our expressions article! And so we can easily copy-paste that code and use it for the statements too! So, the statement rule at first contains:
... other sub-rules ...
| ID INCR SEMI { /* increment */ if($2.ival == INC){ $$ = new_ast_incr_node($1, 0, 0); } else{ $$ = new_ast_incr_node($1, 1, 0); } ast_traversal($$); /* just for testing */ } | INCR ID SEMI { /* increment */ if($1.ival == INC){ $$ = new_ast_incr_node($2, 0, 1); } else{ $$ = new_ast_incr_node($2, 1, 1); } ast_traversal($$); /* just for testing */ } ;

Continue and break

    The other two types of statements: "continue" and "break" are both represented by an AST Simple Node, by only having a difference in the variable "statement_type", where '0' represents the "continue", whilst '1' represents the "break" statement. A simple visualization of that looks as following:

    Having a function that creates such a node, we just have to pass the correct parameter to "new_ast_simple_node", depending on the statement type. This newly created node of course get's passed over to the statement rule. So, in code the statement rule now contains the following action rules:
... other sub-rules ...
| CONTINUE SEMI { $$ = new_ast_simple_node(0); ast_traversal($$); /* just for testing */ } | BREAK SEMI { $$ = new_ast_simple_node(1); ast_traversal($$); /* just for testing */ }
... other sub-rules ...

Action Rules for Assignments

    The assignment rule is the more complicated one of today's statements. A visualization of that rule looks as following:

    The left part has been managed last time in expressions and so the var_ref node already contains the two things: "symbol table entry" and "reference type". The right part is an expression that will be simply passed as the "assign_val" node entry of the AST Assign Node. Right now, this Node only stores an ST entry and the assign value, but we also have to be able to store the reference type. That's why we will create a new entry called "ref" (to stay with the naming of the Reference Node). Doing that we now have the following changes in the ast.h and ast.c files, as we also have to tweak the creation function:
typedef struct AST_Node_Assign{ enum Node_Type type; // node type
// symbol table entry list_t *entry;
// reference or not int ref; // 0: not reference, 1: reference
// assignment value struct AST_Node *assign_val; }AST_Node_Assign; ...
AST_Node *new_ast_assign_node(list_t *entry, int ref, AST_Node *assign_val);
AST_Node *new_ast_assign_node(list_t *entry, int ref, AST_Node *assign_val){ // allocate memory AST_Node_Assign *v = malloc (sizeof (AST_Node_Assign));
// set entries v->type = ASSIGN_NODE; v->entry = entry; v->ref = ref; v->assign_val = assign_val;
// return type-casted result return (struct AST_Node *) v; }

We can now easily write the needed action code in the assignment rule:
assigment: var_ref ASSIGN expression
    AST_Node_Ref *temp = (AST_Node_Ref*) $1;
    $$ = new_ast_assign_node(temp->entry, temp->ref, $3);

    In the statement rule we simply have to pass the node from the assignment over to the statement rule:
... other sub-rules ...
| assigment SEMI { $$ = $1; /* just pass information */ ast_traversal($$); /* just for testing */ }
... other sub-rules ...

Running the compiler

    Running the compiler for the "full_example" we now get informed about the occurrences of simple statements and assignments! Some parts of the console messages look as following:

    I think we got to a very interesting point now! Next up are the more complicated statements if-else, for and while. I can already see them in the console :P Function calls will be done after function declarations, so that we can also do stuff around the revisit queue! Of course we also never really checked the semantics in these action rules. Right now we are only passing information between the different terminals and non-terminals of the grammar!



No references, just using code that I implemented in my previous articles.


All of the images are custom-made!

Previous parts of the series

Final words | Next up on the project

     And this is actually it for today's post! I hope that I explained everything as much as I needed to, meaning that you learned something out of it.
Next up on this series are:
  • Semantic analysis (using more action rules in Bison)
  • Intermediate Code generation (using the AST structure for more cases)
  • Machine Code generation (MIPS Assembly)
     Which are all topics that will need more than one article to complete. Also, note that we might also get into Optimizations later on, or could even extend the Language by adding complex datatypes (structs and unions), more rules etc.
So, see ya next time!

GitHub Account:


Keep on drifting! ;)

3 columns
2 columns
1 column