🚀 Have you ever wondered, when you open a program like VLC (a media player):
- What happens?
- How does it appear on your PC's screen?
- How does it use your resources and run on your machine?
Well, there are many workers responsible for running a process.
📔 But wait, before we start: what are a program and a process?
📍 Program
Whenever someone writes lines of code in a file (let's say in Java), we call that file a set of instructions.
class Test {
    public static void main(String[] args) {
        System.out.println("Hello world");
    }
}
The code above simply tells the computer to load the Test class, call its main() method, and print "Hello world" on our screen.
- This file of instructions is called a program (a fancy word)
- When we run (execute) the program, it takes some resources, such as a share of CPU, memory, and storage
A running program is called a process. Yes, another fancy word.
😢 Note:- The interesting thing is that our computer doesn't understand English at all. It only understands bits, each of which has one of two values: 1 or 0.
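To make the idea of a process a bit more concrete, here is a minimal sketch in which a running Java program asks about the process it has become. The class name ProcessSnapshot and its helper methods are made up for this illustration; the calls they wrap (ProcessHandle and Runtime) are standard Java APIs, and ProcessHandle needs Java 9 or newer.

```java
// Sketch: a running program inspecting its own process.
// ProcessSnapshot and its method names are hypothetical, chosen for this example.
class ProcessSnapshot {

    // ID that the operating system assigned to this running JVM process.
    static long currentPid() {
        return ProcessHandle.current().pid();
    }

    // Number of logical CPU cores this process is allowed to use.
    static int availableCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Heap memory (in bytes) the JVM currently holds for this process.
    static long reservedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    public static void main(String[] args) {
        System.out.println("PID: " + currentPid());
        System.out.println("CPU cores available: " + availableCores());
        System.out.println("Heap reserved (bytes): " + reservedHeapBytes());
    }
}
```

The numbers differ on every run and every machine, which is the point: the program file stays the same, while each process gets its own ID and its own share of resources.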
😎 That is where your compiler comes into the picture.
Now, what the heck is a compiler?
📍 Compiler
Again, compiler is another fancy word; it just means a translator and inspector. A compiler is a program, written by compiler programmers, which converts human-readable source files (code written in languages like Java or C++) into a lower-level form such as assembly code.
😅 Assembly code is another format which is closer to the machine (our PC). For example:
section .text
global main_entry_point
main_entry_point:
sub rsp, 40
mov rax, [address_of_System_out_object_pointer]
lea rdx, [address_of_hello_world_string_object]
mov rdi, rax
mov rsi, rdx
call address_of_native_println_implementation
add rsp, 40
ret
The code above is assembly code (simplified for illustration), which is hard for software engineers to write but easy for an assembler to understand and convert to machine code. And machine code is what machines understand. For a routine like this, it looks something like:
01010101 ; push rbp
01001000 10001001 11100101 ; mov rbp, rsp
01001000 10000011 11101100 00101000 ; sub rsp, 0x28
01001000 10111000 <64 bits of address_ptr> ; mov rax, address_of_System_out_object_pointer
01001000 10001001 01000101 11111000 ; mov [rbp-0x8], rax
01001000 10111010 <64 bits of address_str> ; mov rdx, address_of_hello_world_string_object
01001000 10001001 01010101 11110000 ; mov [rbp-0x10], rdx
01001000 10001011 01000101 11111000 ; mov rax, [rbp-0x8]
01001000 10001001 11000111 ; mov rdi, rax
01001000 10001011 01010101 11110000 ; mov rdx, [rbp-0x10]
01001000 10001001 11010110 ; mov rsi, rdx
11101000 <32 bits of offset> ; call address_of_native_println_implementation
11001001 ; leave
11000011 ; ret
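Those rows of 1s and 0s are just bytes written out in binary. As a small sketch, the snippet below takes a few of the same bytes in hexadecimal (0x55 for push rbp, 0x48 0x89 0xE5 for mov rbp, rsp, 0xC3 for ret) and prints them bit by bit. The class name MachineCodeBytes and its methods are hypothetical helpers for this example, not part of any assembler.

```java
// Sketch: showing that machine code is just bytes, printed here in binary.
// MachineCodeBytes is a hypothetical helper class for this illustration.
class MachineCodeBytes {

    // Formats one byte as 8 binary digits, keeping the leading zeros.
    static String toBinary(int value) {
        String bits = Integer.toBinaryString(value & 0xFF);
        return String.format("%8s", bits).replace(' ', '0');
    }

    // Formats a sequence of bytes as space-separated groups of 8 bits.
    static String toBinary(int[] bytes) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(toBinary(bytes[i]));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBinary(new int[] {0x55}) + " ; push rbp");
        System.out.println(toBinary(new int[] {0x48, 0x89, 0xE5}) + " ; mov rbp, rsp");
        System.out.println(toBinary(new int[] {0xC3}) + " ; ret");
    }
}
```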
📍 The process of conversion so far looks like this:
- The picture above shows the execution of a C++ program: the C++ file is compiled by g++ (a C++ compiler), a program responsible for converting C++ code to assembly code.
- The assembler is then responsible for converting the assembly code to machine code, which finally runs on our machine.
✅ But here is a very important point: unlike Java, C++ is not a write once, run anywhere (WORA) language.
That is because C++ code is compiled directly to machine code with no intermediate step, and that machine code is generated for the specific architecture and operating system of the target machine.
💯 For example, x86 and ARM processors understand different machine-level instructions. Hence you can't take a binary compiled for an Apple Mac M1 (ARM-based processor) and run it directly on an x86-based processor (such as an Intel i9).
That's why Java introduces an intermediate step: a virtual machine and bytecode.
📍 Virtual machine & Byte Code
A virtual machine, as the name suggests, is a machine which is not real but mimics a real machine.
⭐ A virtual machine is a piece of software which takes a small percentage of the CPU, memory, and storage and runs as a mini computer inside your main computer. Hence, we can run multiple virtual machines on a single host (main) machine.
Pros of VM:-
- A software developer using a VM has a lot of control, because we can tweak things at the software level that are hard to change at the hardware level
- A VM can be made secure and can act as an intermediate step before your program reaches the hardware and executes
❤️ For the above reasons, a virtual machine specific to Java was introduced: the Java Virtual Machine (JVM).
People wrote JVMs for x86 and ARM architectures, and for Windows, Linux, and other operating systems. Because a JVM is available for practically every machine, Java can run on practically any machine.
📍 This is how a Java program runs on our machines:-
As you can see, the intermediate step here is the JVM, a program written by developers. It takes the intermediate code file (bytecode), which it understands, converts that bytecode to machine code, and then runs it.
📍 Here you can see:-
- The compiler doesn't care what the underlying OS or architecture (x86, ARM) is
- It just knows how to convert source code to bytecode, and it does
- The rest of the work is done by the JVM, which is built for the underlying OS and architecture
- Hence software developers (who focus on Main.java) don't need to care about their OS and architecture; the JVM handles that
- On the other hand, as you may have noticed, C++ code often differs between Windows and Linux, and libraries also differ between x86 and ARM architectures
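One way to see this in action is a tiny sketch that asks the JVM which platform it is running on. The same compiled class file can be copied to different machines, and only the answers change. The class name PlatformInfo and the describe() method are invented for this example; os.name, os.arch, and java.vm.name are standard Java system properties.

```java
// Sketch: one class file, different answers depending on where the JVM runs.
// PlatformInfo and describe() are hypothetical names for this illustration.
class PlatformInfo {

    // Builds a one-line description of the OS, CPU architecture, and JVM in use.
    static String describe() {
        String os = System.getProperty("os.name");     // e.g. Linux, Windows 11, Mac OS X
        String arch = System.getProperty("os.arch");   // e.g. amd64, aarch64
        String vm = System.getProperty("java.vm.name");
        return "OS=" + os + ", architecture=" + arch + ", JVM=" + vm;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Nothing in this source file mentions x86 or ARM, yet it reports the right one, because the JVM underneath was built for that specific machine.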
📍 Let's see the conversion of code in Java
- The Java file that we write
// File: Test.java
class Test {
    public static void main(String[] args) {
        System.out.println("Hello world");
    }
}
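- The file that the compiler translates it to, shown here in the readable form that a disassembler such as javap -c prints, with comments added (don't worry about the details; there is no need to fear this syntax)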
Compiled from "Test.java"
class Test {
// Default constructor added by javac
Test();
Code:
0: aload_0 // Load 'this' onto the operand stack
1: invokespecial #1 // Method java/lang/Object."<init>":()V (Call superclass constructor)
4: return // Return from constructor
public static void main(java.lang.String[]);
Code:
// --- The interesting part ---
0: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
// Pushes the static field System.out (a PrintStream object) onto the stack
3: ldc #3 // String "Hello world"
// Pushes the String "Hello world" from the constant pool onto the stack
5: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
// Pops the string and PrintStream object, calls the println method
8: return // Return void from main
}
- The JVM, which converts the bytecode to machine code and runs it on the machine
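The bytecode itself lives in a binary .class file. As a rough sketch of what the JVM sees first, the snippet below reads the first 8 bytes of a class file that ships with the JDK (java/lang/Object.class). Every valid class file starts with the magic number 0xCAFEBABE, followed by the minor and major version of the class file format. The class name ClassFileHeader and the readHeader() method are made up for this example, and looking the file up through the system class loader is just one convenient way to find a class file.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Sketch: peeking at the first bytes of a compiled .class file.
// ClassFileHeader and readHeader() are hypothetical names for this illustration.
class ClassFileHeader {

    // Returns {magic, minorVersion, majorVersion} read from the start of a class file.
    // The path is resolved through the system class loader, e.g. "java/lang/Object.class".
    static int[] readHeader(String resourcePath) {
        try (InputStream raw = ClassLoader.getSystemResourceAsStream(resourcePath)) {
            if (raw == null) {
                throw new IllegalArgumentException("Class file not found: " + resourcePath);
            }
            DataInputStream in = new DataInputStream(raw);
            int magic = in.readInt();            // first 4 bytes
            int minor = in.readUnsignedShort();  // next 2 bytes
            int major = in.readUnsignedShort();  // next 2 bytes
            return new int[] {magic, minor, major};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader("java/lang/Object.class");
        System.out.println("Magic number: 0x" + Integer.toHexString(header[0]).toUpperCase());
        System.out.println("Class file version: " + header[2] + "." + header[1]);
    }
}
```

The version numbers depend on the JDK you run this with, so they are not shown here.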
📍 When the program runs
When the program (machine code) runs on your machine, it takes memory, CPU, and storage. While running, the process treats its memory as a stack and a heap, which act as static and dynamic memory respectively.
Stack
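public class Main {
    private static void printHello() {
        System.out.println("Hello world");
        // 'bmw' is a local reference on the stack; the Car object itself lives on the heap
        Car bmw = new Car();
    }

    public static void main(String[] args) {
        printHello();
    }
}

// Minimal placeholder so the example compiles; the original snippet does not define Car
class Car {
}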
- Stack memory holds local variables, function parameters, and references (pointers) to dynamic memory whenever a list or object (bmw in the above example) is created
The picture above shows that we first call main(), then push printHello() onto the stack, and then push println(), which prints the output. Then we create an object. Objects are created in heap memory, and the pointer to the object's memory location is stored on the stack.
Note:- The stack is cleared automatically as soon as a function completes its execution, while the heap is not cleared in the same way
- To clean up the heap, there is another program (part of the JVM) which runs and frees heap memory that our Java program allocated but no longer uses
- That program is called the garbage collector, and the work it does is called garbage collection
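Here is a small sketch of both ideas. It lists the methods currently on the call stack, and it shows that a local variable only holds a reference to an object on the heap, so copying the variable does not copy the object. CallStackDemo, its nested Car class with a color field, and the currentFrames() helper are all invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: call stack frames plus stack references pointing at one heap object.
// CallStackDemo, Car, and currentFrames() are hypothetical names for this example.
class CallStackDemo {

    // Stand-in for the Car class mentioned in the text.
    static class Car {
        String color = "black";
    }

    // Returns the method names on this thread's call stack, innermost call first.
    static List<String> currentFrames() {
        List<String> names = new ArrayList<>();
        for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
            names.add(frame.getMethodName());
        }
        return names;
    }

    static List<String> printHello() {
        Car bmw = new Car();   // the object is on the heap; 'bmw' is a reference on the stack
        Car same = bmw;        // copies the reference, not the object
        same.color = "blue";   // changes the single shared heap object
        System.out.println("bmw.color = " + bmw.color); // prints "bmw.color = blue"
        return currentFrames();
    }

    public static void main(String[] args) {
        // The list shows currentFrames above printHello, which is above main.
        System.out.println("Frames: " + printHello());
    }
}
```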
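As a hedged sketch of this behaviour, the snippet below watches an object through a WeakReference, which lets us observe an object without keeping it alive. Once the last normal reference is dropped, the object becomes eligible for collection. Note that System.gc() is only a request, so the JVM decides if and when the memory is actually freed, and the final line may print either answer. GarbageDemo and its nested Car class are hypothetical names for this example.

```java
import java.lang.ref.WeakReference;

// Sketch: an object becomes eligible for garbage collection once nothing refers to it.
// GarbageDemo and Car are hypothetical names for this illustration.
class GarbageDemo {

    static class Car {
    }

    // While a normal (strong) reference still exists, the object cannot be collected.
    static boolean survivesWhileReferenced() {
        Car bmw = new Car();
        WeakReference<Car> watcher = new WeakReference<>(bmw);
        System.gc();                  // a request only; the JVM may ignore it
        return watcher.get() == bmw;  // 'bmw' is still in use here, so the object is alive
    }

    public static void main(String[] args) {
        System.out.println("Alive while referenced? " + survivesWhileReferenced());

        Car bmw = new Car();
        WeakReference<Car> watcher = new WeakReference<>(bmw);
        bmw = null;                   // drop the only strong reference
        System.gc();                  // ask the garbage collector to run
        System.out.println("Collected yet? " + (watcher.get() == null));
    }
}
```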
CPU cycles and memory representation:- Hertz, binary numbers and hexadecimal base
When we look at the clock we measure time using seconds, minutes or hours. Similarly, the unit of work for our CPU is called its CPU cycle.
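Every CPU has a clock speed, or frequency. For example, the AMD Ryzen 7 9800X3D has a base frequency of 4.7 GHz (1 giga = 1 billion), which means it goes through 4.7 billion cycles in one second. How many instructions that amounts to depends on the CPU and the code: some instructions take several cycles, and modern CPUs can also complete several instructions in a single cycle.
In simple words, instructions are the small machine-level steps that our lines of code are translated into; a single line of code usually becomes several instructions.
Question:- Can you estimate roughly how much work the CPU has to do to calculate 2+4?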
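To get a feel for these numbers, here is a small arithmetic sketch that turns a clock speed in GHz into cycles per second and into the duration of a single cycle. ClockMath and its two methods are made-up names for this example; the only input is the 4.7 GHz figure from the text.

```java
import java.util.Locale;

// Sketch: simple arithmetic on clock speed.
// ClockMath and its methods are hypothetical names for this illustration.
class ClockMath {

    // 1 GHz means 1 billion cycles per second.
    static double cyclesPerSecond(double gigahertz) {
        return gigahertz * 1_000_000_000.0;
    }

    // Duration of one cycle in nanoseconds: at 1 GHz a cycle lasts exactly 1 ns.
    static double nanosecondsPerCycle(double gigahertz) {
        return 1.0 / gigahertz;
    }

    public static void main(String[] args) {
        double ghz = 4.7;
        System.out.println(String.format(Locale.ROOT, "Cycles per second: %.0f", cyclesPerSecond(ghz)));
        System.out.println(String.format(Locale.ROOT, "One cycle lasts about %.3f ns", nanosecondsPerCycle(ghz)));
    }
}
```

So at 4.7 GHz a single cycle lasts roughly a fifth of a nanosecond.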
📍 Next comes memory representation
Memory addresses are usually written in hexadecimal base. What is hexadecimal base?
In decimal base we have the digits 0,1,2,3,4,5,6,7,8,9
In binary base we have the digits 0,1
In hexadecimal base we have the digits 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F
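As a quick sketch of how the three bases relate, the snippet below converts one value, 0x45FBC, between hexadecimal, decimal, and binary. The hexToDecimal() method does the conversion by hand, digit by digit, so the positional idea is visible; the class name BaseConversion and that method are invented for this example, while Integer.toBinaryString, Integer.toHexString, and Integer.parseInt are standard Java.

```java
// Sketch: the same number written in hexadecimal, decimal, and binary.
// BaseConversion and hexToDecimal() are hypothetical names for this illustration.
class BaseConversion {

    // Converts hexadecimal text to a number by hand: each new digit
    // multiplies what we have so far by 16 and adds the digit's value.
    static int hexToDecimal(String hex) {
        int result = 0;
        for (char c : hex.toCharArray()) {
            result = result * 16 + Character.digit(c, 16);
        }
        return result;
    }

    public static void main(String[] args) {
        int value = 0x45FBC;
        System.out.println(value);                                     // 286652
        System.out.println(Integer.toBinaryString(value));             // 1000101111110111100
        System.out.println(Integer.toHexString(value).toUpperCase());  // 45FBC
        System.out.println(hexToDecimal("45FBC"));                     // 286652
        System.out.println(Integer.parseInt("1010", 2));               // 10
    }
}
```

Each hexadecimal digit stands for exactly four binary digits (for example F is 1111), which is why hexadecimal is a compact way to write long binary values.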
Conversion between binary, hexadecimal, and the decimal numbers that we humans understand:-
We represent memory in hexadecimal base (base 16, because we have 16 digits instead of the 10 digits that we have in decimal base)
- 0x45FBC memory means:-