Writing a simple Compiler on my own - Revisit Queue and Parameter Checking (part 2) [C][Flex][Bison]

[Custom Thumbnail]

All the Code of the series can be found at the Github repository:


    Hello it's a me again @drifter1! Today we continue with my Compiler Series, a series where we implement a complete compiler for a simple C-like language by using the C-tools Flex and Bison. In this article we will make the necessary changes of the Revisit Queue Structure and it's Functions.

The topics that we will cover today are:

  1. Revisit Queue Structure
  2. Revisit Queue Functions


    Actually you need to read and understand all the topics that I covered in the series as a whole, as these articles will give you access to knowledge about:
  • What Compiler Design is (mainly the steps)
  • For which exact Language the Compiler is build for (Tokens and Grammar)
  • How to use Flex and Bison
  • How to implement a lexer and parser for the language using those tools
  • What the Symbol Table is and how we implement it
  • How we combine Flex and Bison together
  • How we can pass information from the Lexer to the Parser
  • How we define operator priorities, precedencies and associativity
  • What Semantic Analysis is (Attributes, SDT etc.)
  • How we do the so called "Scope Resolution"
  • How we declare types and check the type inter-compatibility for different cases, including function parameters
  • How we check function calls later on using a "Revisit queue", as function declarations mostly happen after functions get used (in our Language)
  • Intermediate Code Representations, including AST's
  • How we implement an AST (structure and management)
  • Action Rules for AST nodes and more


Talking about the series in general this series can be rated:
  • Intermediate to Advanced
Today's topic(s) can be rated:
  • Medium
So, without further ado, let's now finally start with the actual Tutorial...

Actual Tutorial Content

Revisit Queue Structure

    The visualization of the revisit queue structure from the previous article is the following:

    Based on that, we can re-write the struct definition for this structure, which is inside of the "symtab.h" header file, like this:
typedef struct revisit_queue{
    // symbol table entry
    list_t *entry;
// name of identifier char *st_name;
// type of revisit int revisit_type;
// parameters of function calls int **par_types; int *num_of_pars; int num_of_calls;
// maybe additional information to simplify the process ...
struct revisit_queue *next; }revisit_queue;

    By already talking about the optional assignment check, that's needed whenever a function call is a part of the expression in an assignment, we can also re-write the revisit type "define's" like this:
#define PARAM_CHECK 1  /* Check parameters of function call when functions gets declared */
#define ASSIGN_CHECK 2 /* Check assignment when function call part of the expression */

Revisit Queue Functions

The main and most interesting part of the revisit queue are the functions.
Those are:
void add_to_queue(list_t *entry, char *name, int type); revisit_queue *search_queue(char *name); revisit_queue *search_prev_queue(char *name); int revisit(char *name); void revisit_dump(FILE *of);

Some just needed changes...others are completely new.
Let's get into each of them.

Add Element to Queue

    When adding a element to the queue we now also pass the symbol table entry that corresponds to it. This means that we will have to set this additional entry equal to the parameter's "value". Doing that we will also need to set the additional entries for the special case of parameter checking, which just includes the number of calls to be initialized to 0. We still have two cases: "queue is empty" and "queue contains elements". So, the final code becomes:
void add_to_queue(list_t *entry, char *name, int type){
  revisit_queue *q;
/* queue is empty */ if(queue == NULL){ /* set up entry */ q = (revisit_queue*) malloc(sizeof(revisit_queue)); q->entry = entry; q->st_name = name; q->revisit_type = type; q->next = NULL;
/* additional info */ if(type == PARAM_CHECK){ q->num_of_calls = 0; }
/* q "becomes" the queue */ queue = q; } /* queue not empty */ else{ /* find last element */ q = queue; while(q->next != NULL) q = q->next;
/* add element to the end */ q->next = (revisit_queue*) malloc(sizeof(revisit_queue)); q->next->entry = entry; q->next->st_name = name; q->next->revisit_type = type; q->next->next = NULL;
/* additional info */ if(type == PARAM_CHECK){ q->next->num_of_calls = 0; } } }

Search Element in Queue

    The search queue function needs to be revamped a little bit. When having an empty queue the previously defined function would cause a segmentation fault. To fix this, we will not only loop as long as the current element's name is not equal to the searched name, but will also check that the current queue element is not NULL! That way the function will return NULL, when the element was not found!

The code is:
revisit_queue *search_queue(char *name){ /* search queue */
    revisit_queue *q;
/* search for the entry */ q = queue; while( (q != NULL) && (strcmp(q->st_name, name) != 0) ) q = q->next;
return q; }

Search Previous of Element in Queue

    When revisiting an element, we want to be able to remove it afterwards. To do so we need to define a function that gives back the previous element of that element, so that we can make this element point to the next of the "to remove" element, instead of the "to remove" element. This function will return NULL when the whole queue is empty or the searched element is the first entry of the queue. In general, it will return the previous of the searched element.

In code this looks like this:
revisit_queue *search_prev_queue(char *name){
  revisit_queue *q;
/* queue is empty */ if(queue == NULL){ return NULL; }
/* special case - first entry */ if( strcmp(queue->st_name, name) == 0 ){ return NULL; }
/* search for the entry */ q = queue; while( (q != NULL) && (strcmp(q->next->st_name, name) != 0) ) q = q->next;
return q; }

Revisit Element of Queue (by also removing it)

    Right now, what matters is the revisit for parameter checking. Also, let's use the newly created "search_queue" function, instead of having two or even three cases. As we said previously, that function will return NULL when the element was not found. When that happens, we will return '-1', to let the "caller" know that the revisit was not needed. In a success we will return '0'. When the parameter check or any other revisit fails we will return '1'.

    Either way, "switching" based on the revisit type, we can now write the needed code for parameter checks. At first we will run the parameter check using the "func_param_check" function, that we will tweak more next time. What you need to know about this function right now is that it will return '0', when the check was successful. After doing the parameter check, we have to remove the entry using the "search_prev_queue" function. And this is actually it! Let's get into the code:
int revisit(char *name){ /* revisit entry by also removing it from queue */
  revisit_queue *q = search_queue(name);
if(q == NULL){ return -1; // no entry }
/* revisit entry depending on the type */ switch(q->revisit_type){ case PARAM_CHECK: /* run parameter check */ if(!func_param_check(name, q->num_of_calls, q->par_types, q->num_of_pars)){ printf("Successful Parameter Check of %s\n", name); }
/* remove entry by making it point to it's next */ revisit_queue *q2 = search_prev_queue(name); if(q2 == NULL){ /* special case: first entry */ queue = queue->next; } else{ q2->next = q2->next->next; }
break; case ASSIGN_CHECK: /* run assignment check */ break; /* ... */ }
return 0; // success }

Generate Revisit Queue Dump File

    Adding new information for parameter checks, we can also tweak the dump file generation, to include the number of function calls that need revisit! In code this looks like this:
void revisit_dump(FILE *of){
  int i;
  revisit_queue *q;
  q = queue;
fprintf(of,"------------ -------------\n"); fprintf(of,"Identifier Revisit Type\n"); fprintf(of,"------------ -------------\n"); while(q != NULL){ fprintf(of, "%-13s", q->st_name); if(q->revisit_type == PARAM_CHECK){ fprintf(of,"%s","Parameter Check "); fprintf(of,"for %d function calls",q->num_of_calls); } // more later on fprintf(of, "\n"); q = q->next; } }

    Without managing the "print" function declaration, the dump file for the "full_example.c" program looks like this for example:

Running the compiler, in general, we will get messages like this:

But, we still need to tweak the symbol table and the parameter check function!



No references, just using code that I implemented in my previous articles.


All of the images are custom-made!

Previous parts of the series

General Knowledge and Lexical Analysis

Syntax Analysis

  • Syntax Analysis Theory
  • Bison basics
  • Creating a grammar for our Language
  • Combine Flex and Bison
  • Passing information from Lexer to Parser
  • Finishing Off The Grammar/Parser (part 1)
  • Finishing Off The Grammar/Parser (part 2)
  • Semantic Analysis (1)

  • Semantic Analysis Theory
  • Semantics Examples
  • Scope Resolution using the Symbol Table
  • Type Declaration and Checking
  • Function Semantics (part 1)
  • Function Semantics (part 2)
  • Intermediate Code Generation (AST)

  • Abstract Syntax Tree Principle
  • Abstract Syntax Tree Structure
  • Abstract Syntax Tree Management
  • Action Rules for Declarations and Initializations
  • Action Rules for Expressions
  • Action Rules for Assignments and Simple Statements
  • Action Rules for If-Else Statements
  • Action Rules for Loop Statements and some Fixes
  • Action Rules for Function Declarations (part 1)
  • Action Rules for Function Declarations (part 2)
  • Action Rules for Function Calls
  • Semantic Analysis (2)

  • Datatype attribute for Expressions
  • Type Checking for Assignments
  • Revisit Queue and Parameter Checking (part 1)

  • Final words | Next up on the project

         And this is actually it for today's post! I hope that I explained everything as much as I needed to, meaning that you learned something out of it.
    Next up on this series are:
    • Semantic analysis (using even more action rules in Bison)
    • Machine Code generation (MIPS Assembly)
         Which are all topics that will need more than one article to complete. Also, note that we might also get into Optimizations later on, or could even extend the Language by adding complex datatypes (structs and unions), more rules etc.
    So, see ya next time!

    GitHub Account:


    Keep on drifting! ;)

    3 columns
    2 columns
    1 column