Compiler - Why and Where

Speedbreaker 1 : Semantic Gap

So finally we are here: this is the first stop of our compiler journey. Are you excited for this stop? Of course you are, RIGHT!!!

I guess the title has already given you a hint about today's stop. If it has, then you are ahead of me; otherwise, let me explain it first and then we will continue our talk. And why "Speedbreaker" in the subtitle? Because in order to understand the concepts clearly we need to slow down and learn everything patiently. That way we will not get a sudden jolt from new things.

Right now, we are going to understand why exactly a compiler is needed and, if we consider the language processing diagram, where the compiler is located . . .

Here is the plan for our stop:

  • Why and where

  • Understanding semantic gap

  • How to overcome it

Why and where

If you are aware of the steps of language processing, then you might already know that a compiler takes pure high-level language as input and generates assembly language as output. So we can conclude that the compiler is located after the pre-processor and before the assembler. Now we have the answer to WHERE, but what about WHY?

Let's travel back to get some insights . . .

Semantic gap

Grace Hopper designed the first compiler, for the A-0 programming language, in 1952, and she is often credited with pioneering the concept of the compiler.

If you are thinking I am going to give you a history lecture, that is not going to happen, as I get bored of those too. But it is good to know a little of it, and we may get back to the sentence above at a future stop.

Imagine you want to design a software application; all you are worried about is its behaviour. But in order to execute that software on a machine, we need to write it in machine language, which is quite challenging.

HOLD TIGHT . . .

It may start to get confusing, so we are going to analyse everything slowly, one step at a time:

  1. You have an idea and want to develop an application/software.

  2. This application should run/execute on a computer system.

  3. To run correctly, the application must follow certain rules, cases or conditions.

Now, let's break it down...

  • Application domain - The ideas and behaviour of the application that you want to develop

  • Execution domain - To execute your ideas, they must be expressed in a form the machine can carry out; that form is the execution domain

  • Semantics - The rules of meaning of a domain

  • Semantic gap - There is always a difference between our ideas and what we have implemented. This difference is nothing but the semantic gap.

🖥
Let's take a look at an example:

🛑 WARNING: Don't try to understand the code, just take a look

You want to develop a simple addition program:

  • Application domain
    Take two numbers, add them and then stop

  • Execution domain
    I am simply trying to visualize the machine code for you, using a made-up architecture.

      Instruction: Load the first number into register 1
      Binary: 0001 0000 (Assume 0001 represents the load operation and 0000 represents register 1)
      Decimal: 16
    
      Instruction: Load the second number into register 2
      Binary: 0001 0001 (Assume 0001 again represents the load operation and 0001 represents register 2)
      Decimal: 17
    
      Instruction: Add the contents of register 1 and register 2, and store the result in register 1
      Binary: 0011 0000 (Assume 0011 represents the add operation and 0000 represents register 1, which receives the result)
      Decimal: 48
    
      Instruction: Halt the program
      Binary: 1111 1111 (Assume 1111 represents the halt operation)
      Decimal: 255
    
      The same program written in assembly language (x86, NASM syntax, 32-bit Linux system calls):
    
      section .data
          num1 dd 10       ; First number (32-bit integer)
          num2 dd 20       ; Second number (32-bit integer)
    
      section .text
          global _start
    
      _start:
          ; Load num1 into eax
          mov eax, [num1]
    
          ; Add num2 to eax
          add eax, [num2]
    
          ; At this point, eax contains the sum of num1 and num2
    
          ; Exit program
          mov eax, 1       ; syscall number for exit
          xor ebx, ebx     ; exit code 0
          int 0x80         ; invoke syscall
    

This is not real machine code; it is just for visualization purposes.

Because of the semantic gap you have to face many consequences. I am listing only some important ones:

  • Development takes a very long time

  • Development also needs a lot of effort

  • The quality of the final software will be poor

These consequences arise because we are trying to develop the application directly in machine language. And even if we manage to develop executable software despite the above issues, it is really hard to make even a small change.

Solution

In order to tackle this issue we use a software engineering approach, with methodologies and programming languages (PLs). So the process has two steps:

  1. Specification, design and coding step (done by the software developer, who uses a programming language)

  2. PL implementation step (done by the programming language designer)

Now we are going to introduce a new domain, the 'PL domain'.
By introducing this new domain we bridge the semantic gap in two engineering steps. We will understand what the resulting gaps are, but first try to visualize what just happened.

The first step bridges the gap between the application domain and the PL domain, and the second step bridges the gap between the PL domain and the execution domain.

  • The gap between the application domain and the PL domain is the specification and design gap, or simply the specification gap.

  • The gap between the PL domain and the execution domain is the execution gap.

PL stands for Programming language

How it works

The software development team bridges the specification gap by gathering the specifications and requirements of the application. The developers then write the software in a programming language. Now the important part: what about the execution gap? But first, what is meant by the execution gap?
The execution gap exists because developers write the application in a programming language like C, CPP, Python, etc., but the machine can understand only machine language. So how are we going to resolve this? This issue is the execution gap.
The execution gap is bridged by the designer of the programming language processor, viz. a translator or an interpreter.

  • Specification gap - the semantic gap between two specifications of the same task

  • Execution gap - the gap between the semantics of programs (that perform the same task) written in different programming languages

Semantics can be thought of as rules

We assume a specification language (SL) for every domain, so a specification written in an SL is nothing but a program in that SL. For example, a specification written in the C language is nothing but a program in C.

Domain            | Specification language
PL domain         | PL (programming language), e.g. C, CPP, . . .
Execution domain  | Machine language of the computer system

💡
Let's take a look at an example with the PL domain:

🛑 WARNING: Don't try to understand the code, just take a look

You want to develop a simple addition program:

  • Application domain
    Take two numbers, add them and then stop

  • PL domain
    We will use the C programming language:

#include<stdio.h>

int main(){
    int a,b;
    scanf("%d",&a);
    scanf("%d",&b);
    a = a + b;
    printf("%d",a);
    return 0;
}
  • Execution domain
    Machine code generated from the program above by a C compiler.

Now you can see that we can easily make modifications in the PL domain.

So far we have understood that, to develop an application, we need the help of a programming language, which is then converted to machine language somehow.

Our task is to understand that "somehow".

THAT'S IT. WE HAVE TO GO NOW TO THE NEXT STOP IN ORDER TO CONTINUE OUR JOURNEY.

#FAQ

  1. How is machine code generated?

In a real-world scenario, machine code is typically generated by an assembler, which translates assembly code into machine code specific to the target architecture. Writing machine code directly is extremely complex and error-prone. We already know from the language processing diagram that the compiler generates assembly language. This assembly language is then used by the assembler to generate machine code.

  2. As of today, 19-04-2024, who is enjoying the compiler journey?

As of today's date, 19-04-2024, Aishwarya Patil & Paul Bogatyr are enjoying our compiler journey. I would like to thank them for being my companions.

  3. Where can I find the complete compiler design journey?

You can go through the complete compiler journey by pasting the following URL into any browser:

https://malivinayak.hashnode.dev/series/compiler-journey

  4. Is there any way I could get notified at every STOP of the journey?

There is always a way. You can subscribe to the newsletter in order to get notified at every STOP. You can find it in the navbar at the top right or at the bottom of the page.
