Writing a simple Compiler on my own - Revisit Queue and Parameter Checking (part 1) [C][Flex][Bison]




[Custom Thumbnail]

All the Code of the series can be found at the Github repository:
https://github.com/drifter1/compiler


Introduction

    Hello it's a me again @drifter1! Today we continue with my Compiler Series, a series where we implement a complete compiler for a simple C-like language by using the C-tools Flex and Bison. In this article we will visualize function calls, and how we use the Revisit Queue concept to solve the problem of knowing the return type and if the parameters are actually compatible with the parameter declaration, that happens much later on! The actual implementation will come in the next parts...

The topics that we will cover today are:

  1. Revisit Queue Concept
  2. Visualizing the Problem

Requirements:

    Actually you need to read and understand all the topics that I covered in the series as a whole, as these articles will give you access to knowledge about:
  • What Compiler Design is (mainly the steps)
  • For which exact Language the Compiler is build for (Tokens and Grammar)
  • How to use Flex and Bison
  • How to implement a lexer and parser for the language using those tools
  • What the Symbol Table is and how we implement it
  • How we combine Flex and Bison together
  • How we can pass information from the Lexer to the Parser
  • How we define operator priorities, precedencies and associativity
  • What Semantic Analysis is (Attributes, SDT etc.)
  • How we do the so called "Scope Resolution"
  • How we declare types and check the type inter-compatibility for different cases, including function parameters
  • How we check function calls later on using a "Revisit queue", as function declarations mostly happen after functions get used (in our Language)
  • Intermediate Code Representations, including AST's
  • How we implement an AST (structure and management)
  • Action Rules for AST nodes and more

Difficulty:

Talking about the series in general this series can be rated:
  • Intermediate to Advanced
Today's topic(s) can be rated:
  • Medium
So, without further ado, let's now finally start with the actual Tutorial...

Actual Tutorial Content

Revisit Queue Concept

    One of the most difficult aspects of the language that we are trying to implement a compiler for is that we are allowed to declare the functions after using them in function calls! In the actual C language you have to declare the function before-hand, and you can define the actual code of the function later on. So, how should we check different kinds of semantics later on? Well, in the function semantics articles, I talked about a concept called "revisit queue". This concept is all about adding elements that should be revisited later on to a queue (or list), and revisiting the queue right after the information that was needed is accessible.

In functions, the topics that need to be revisited are:
  • parameter compatibility checking
  • return datatype for assignment statements

What we want to cover right now is the parameter check!

Visualizing the Problem

    After this "recap" of the revisit queue concept, let's now get into a visualization! Functions are described by identifiers (IDs) and so stored inside of a symbol table entry. A function call is one of statement types (when returning void) and can also be used as a part of a larger or smaller expression (when used in assignments). But, either way a function call will occur inside of the statements and maybe even inside of the function declarations part. The function declarations are the final part of each program. In general, we can say that a program looks like this:


    As you can see, the parameter check takes it's information from the function call and function declaration. But, what exactly get's stored to the entries of the symbol table and revisit queue, while the whole process is in effect? The following diagram shows that:

  • The first occurrence (mostly function call) creates the symbol table entry
  • A revisit queue entry will be created, as we are not declaring (declare = 0)
  • Each function call adds it's information (parameter datatypes and number of parameters) to the rq entry
  • The function declaration will add the datatype to the common symbol table entry (can be quite difficult to implement) and trigger the revisit
  • During the revisit we will perform the parameter check
Let's dive more deeply into which exact changes happen to those entries..

Symbol table entry

    The symbol table entry stores the information around the function declaration. This information defines the semantics of the function and is used for semantic checks in anything that has to do with functions and mainly function calls. The important pieces of information, that have to do with functions, can be visualized as:

Revisit queue entry

    The revisit queue entry stores the information around function calls, which means that we store the datatype of each parameter and the actual count of them, for each function call that has to do with a specific function. At first the entry get's created, when creating the symbol table entry, by knowing that we will have to perform a revisit (not currently declaring). That way the whole structure can be visualized as:

A complete diagram

The whole problem can be summarized like this:


    I just added some more specific information to the diagram of the entries. This information includes some of the needed functions and their parameters, to give you a better understanding of how we will implement it later on. I hope that this whole article gave you a great insight to the problem :)

RESOURCES

References:

No references, just using code that I implemented in my previous articles.

Images:

All of the images are custom-made!

Previous parts of the series


Final words | Next up on the project

     And this is actually it for today's post! I hope that I explained everything as much as I needed to, meaning that you learned something out of it.
Next up on this series are:
  • Semantic analysis (using even more action rules in Bison)
  • Machine Code generation (MIPS Assembly)
     Which are all topics that will need more than one article to complete. Also, note that we might also get into Optimizations later on, or could even extend the Language by adding complex datatypes (structs and unions), more rules etc.
So, see ya next time!

GitHub Account:

https://github.com/drifter1

Keep on drifting! ;)

H2
H3
H4
3 columns
2 columns
1 column
4 Comments
Ecency