Every C programmer should understand what happens behind the scenes when a C program is built. This makes the core concepts of C easier to grasp, and it starts with the compilation stages in C.
To discuss this, we need a basic C program:
#include <stdio.h>
int main()
{
    printf("Hello World !! Welcome to onesandzerose.com\n");
}
The above program is very basic C code that produces the output below:
Hello World !! Welcome to onesandzerose.com
OK, but how does this happen? The program we wrote is called source code, and it is in a human-readable language. How does the CPU execute it? What if the programmer breaks the rules of C? Who does the validation?
The simple answer to all of the questions above is the ‘Compiler’. A compiler is a software tool that converts source code (human-readable programs) into executable files (which the CPU can run). One of the most widely used C compilers is GCC, which is common on Linux systems.
Before going through the compiler, let's quickly look at what each line of the above C program means.
line 1: #include <stdio.h>
This tells the preprocessor to include the file named stdio.h in this .c file. stdio.h contains the declarations of the standard input and output (stdio) functions. These functions let a program interact with the user: printf() displays a message to the user and scanf() takes input from the user. With stdio.h included, the compiler knows what functions such as printf() and scanf() look like. C programs use several other .h files, depending on the functions the programmer calls; the upcoming classes discuss them in more detail.
Note: a C program can be written without header file inclusions. A more detailed explanation is given in "C program without #include <stdio.h>" in the functions chapter.
line 2: int main()
This is the first function that executes in a C program. Almost every C program contains a main function.
Note: a C program can also be written without a main function. A more detailed explanation is given in "C program without main function" in the functions chapter.
lines 3 and 5: { and }
These tell the compiler that the main function starts at { (line 3) and ends at } (line 5). The lines between { and } are called the definition of the main function, or the main function block. Any block of code must be written within curly braces.
line 4: printf("Hello World !! Welcome to onesandzerose.com\n");
This is a statement that calls the printf function. printf() displays a message to the user: the text written within the double quotes is printed, and the \n at the end moves the output to a new line.
I hope the short C program is now clear. Let's move on to the compiling stages in C.
I am using a Linux system to explain the compiling stages of C. If you do not have a Linux system, you can use an online Linux environment such as CoCalc.
Compiling Stages in C
The stages involved in converting source code into executable code are called the compiling stages. Here I am using the file name “myfirstcode.c”. There are 4 compiling stages in C:
1. Preprocessor:
In this stage the .c file is converted into a .i file. The generated myfirstcode.i contains the full contents of the included header file (here stdio.h), with all macros expanded (no macros are used here) and comments removed.
In Linux, the command for the preprocessor stage is:
$ gcc -E myfirstcode.c > myfirstcode.i
A new file named myfirstcode.i is created.
2. Compiler:
In the compiler stage the C code is checked against the rules of the language and converted into assembly code (a low-level, human-readable language) with .s as the extension. The command for the compiler stage is:
$ gcc -S myfirstcode.c
A file named myfirstcode.s is generated. If you open it, you will see the assembly instructions equivalent to the C program you wrote.
3. Assembler:
The assembly code generated by the compiler stage is converted into object code in this stage. Object code is binary code that the machine can understand, but it is not yet a complete executable. The command for the assembler stage, which generates the file myfirstcode.o, is:
$ gcc -c myfirstcode.c
4. Linker:
In the linker stage, the C library is combined with the object code generated by the assembler to produce the executable. The C library defines and stores many frequently used functions. These functions, known as predefined functions, are readily available for use. The definition of the printf() function used in the sample code is not present in the myfirstcode.c file; because it is a predefined function, the programmer only calls it. During the linker stage, the linker connects that call to the function's definition, which is present in the C library.
Note: More detailed explanations on functions are here.
$ gcc myfirstcode.c -o myfirstcode
This command generates the final executable file, named myfirstcode. It is a complete binary file that the CPU can understand and execute according to the instructions present in the file.
Types of compilers
Native compiler:
A native compiler is a compiler that runs on a particular platform and generates executable code for that same platform. It produces executable files that can be directly executed on the host platform where the compiler is running. For example, if you are using a compiler on a Linux x86 machine and it produces executable code for the same Linux x86 machine, then that compiler is a native compiler.
Cross compiler:
A cross compiler runs on one platform (the host platform) but generates executable code for a different platform (the target platform). It enables you to compile code on one architecture or operating system and generate executables for a different architecture or operating system. For example, if you are using a compiler on a Linux x86 machine, but it produces executable code for an ARM-based embedded device running a different operating system, then that compiler is a cross compiler.