64-Bit Assembly Language - Lab 5

Introduction

Hi everybody! In Lab 5, we are moving on from 6502 assembly language to modern processors: x86_64 and AArch64. The 6502 offers a minimal instruction set, while these modern processors have far more advanced capabilities and architecture. In this lab, we will explore their assembly languages.

All experiments will be run on remote x86_64 and AArch64 servers.

Setting Up

On our class servers, the code examples are available at this path: /public/spo600-assembler-lab-examples.tgz.

We will extract this .tgz file using the command tar.

tar xvf /public/spo600-assembler-lab-examples.tgz

The code will be presented in this structure:

spo600
 └── examples
     └── hello                     # "hello world" example programs
         ├── assembler
         │   ├── aarch64           # aarch64 gas assembly language version
         │   │   ├── hello.s
         │   │   └── Makefile
         │   ├── Makefile
         │   └── x86_64            # x86_64 assembly language versions
         │       ├── hello-gas.s   # ... gas syntax
         │       ├── hello-nasm.s  # ... nasm syntax
         │       └── Makefile
         └── c                     # Portable C versions
             ├── hello2.c          # ... using write()
             ├── hello3.c          # ... using syscall()
             ├── hello.c           # ... using printf()
             └── Makefile

📌 `AArch64` Assembly Program

First, we will look at the AArch64 server. Log in and go to the AArch64 assembly example directory:

cd ~/spo600/examples/hello/assembler/aarch64

There is a hello.s file. This is the source file of our code!

Next, you will see that there is a Makefile in this location. This means that we can run make and compile our program. Build the program using the make command.

Based on the dependencies defined in the Makefile, make determines which parts of the code have changed and need to be rebuilt, then executes the necessary commands to update only those parts.

make

After this, you will see a binary file named hello in the directory. Run this resulting binary file.

./hello

It prints "Hello, world!" to your terminal.

Okay, now let's take a look at the code using the objdump command. I want to compare the disassembled output of the object file (hello.o) to the source file (hello.s).

objdump -d hello.o > hello_disassembled.txt

Disassembled Output (hello.o)

hello.o:     file format elf64-littleaarch64


Disassembly of section .text:

0000000000000000 <_start>:
   0:   d2800020        mov     x0, #0x1                        // #1
   4:   10000001        adr     x1, 0 <_start>
   8:   d28001c2        mov     x2, #0xe                        // #14
   c:   d2800808        mov     x8, #0x40                       // #64
  10:   d4000001        svc     #0x0
  14:   d2800000        mov     x0, #0x0                        // #0
  18:   d2800ba8        mov     x8, #0x5d                       // #93
  1c:   d4000001        svc     #0x0

Source File (hello.s)

.text
.globl _start
_start:

        mov     x0, 1           /* file descriptor: 1 is stdout */
        adr     x1, msg         /* message location (memory address) */
        mov     x2, len         /* message length (bytes) */

        mov     x8, 64          /* write is syscall #64 */
        svc     0               /* invoke syscall */

        mov     x0, 0           /* status -> 0 */
        mov     x8, 93          /* exit is syscall #93 */
        svc     0               /* invoke syscall */

.data
msg:    .ascii      "Hello, world!\n"
len=    . - msg

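Comparing the two, the disassembly is an accurate instruction-by-instruction translation of our source file into machine code. The symbolic constant len shows up as its value (#0xe, which is 14), and the immediates are printed in hex. The only real difference is the adr instruction: the source refers to the msg label, but the disassembly shows adr x1, 0 <_start>. This is because hello.o has not been linked yet, so the address of msg is still a placeholder that the linker fills in later, and objdump prints it relative to the nearest symbol it knows.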
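
To check that the hex words in the disassembly really are the same instructions, the immediate moves can be decoded by hand. The sketch below is a Python model, not part of the lab files. It assumes the standard A64 layout for a 64-bit MOVZ (fixed pattern in bits 31..23, a shift selector in bits 22..21, the 16-bit immediate in bits 20..5, the destination register in bits 4..0). The helper name decode_movz is hypothetical.

```python
# Decode the 64-bit MOVZ encodings that objdump printed for hello.o.
# decode_movz is a hypothetical helper written for this illustration.

MOVZ_64_PATTERN = 0b110100101  # sf=1, opc=10, fixed bits 100101


def decode_movz(word):
    """Return (register name, immediate) for a 64-bit MOVZ instruction word."""
    if (word >> 23) != MOVZ_64_PATTERN:
        raise ValueError(f"{word:#010x} is not a 64-bit MOVZ")
    shift = ((word >> 21) & 0b11) * 16   # hw field: shift by 0, 16, 32 or 48
    imm16 = (word >> 5) & 0xFFFF         # 16-bit immediate
    rd = word & 0b11111                  # destination register number
    return f"x{rd}", imm16 << shift


# The four "mov xN, #imm" words from the disassembly above.
for word in (0xD2800020, 0xD28001C2, 0xD2800808, 0xD2800BA8):
    reg, value = decode_movz(word)
    print(f"{word:08x}  mov {reg}, #{value}")
```

The decoded values (1, 14, 64 and 93) match the file descriptor, message length, and the two syscall numbers in hello.s.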

Modifying the `AArch64` Assembly Program

Here is a basic loop in AArch64 assembler.

.text
 .globl _start
 min = 0                         /* starting value for the loop index; note that this is a symbol (constant), not a variable */
 max = 6                         /* loop exits when the index hits this number */
 _start:
     mov     x19, min
 loop:

     /* ... body of the loop ... do something useful here ... */

     add     x19, x19, 1     /* increment the loop counter */
     cmp     x19, max        /* see if we've hit the max */
     b.ne    loop            /* if not, then continue the loop */

     mov     x0, 0           /* set exit status to 0 */
     mov     x8, 93          /* exit is syscall #93 */
     svc     0               /* invoke syscall */

The code loops 6 times: the index starts at min (0) and the loop exits when it reaches max (6). The index is kept in register x19 to keep track of iterations. The body of the loop is empty right now.
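
To convince myself of the iteration count, here is a small Python model of the same control flow. It is only a sketch: run_loop is a hypothetical helper, and the Python variable x19 stands in for the register.

```python
def run_loop(min_value, max_value):
    """Model of the add / cmp / b.ne loop; returns the index seen on each pass."""
    seen = []
    x19 = min_value              # mov  x19, min
    while True:
        seen.append(x19)         # the loop body runs before the test
        x19 += 1                 # add  x19, x19, 1
        if x19 == max_value:     # cmp  x19, max
            break                # b.ne not taken: fall through to the exit code
    return seen


print(run_loop(0, 6))
```

Because the exit test is an equality check made after the body, the body always runs at least once, and min has to start below max for the loop to stop where we expect.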

☑️ Print Loop

Let's change this code so that it prints "Loop" on every iteration. We add a .data section holding the message and put the write syscall in the body of the loop.
Change hello.s as shown below:

.text
.globl _start

min = 0                          /* starting value for the loop index */
max = 6                          /* loop exits when the index hits this number */

_start:
    mov     x19, min            /* initialize loop counter */
loop:
    /* Print "Loop" message */
    mov     x0, 1               /* file descriptor: 1 is stdout */
    adr     x1, msg             /* message location (memory address) */
    mov     x2, len             /* message length (bytes) */
    mov     x8, 64              /* write is syscall #64 */
    svc     0                   /* invoke syscall */

    add     x19, x19, 1         /* increment the loop counter */
    cmp     x19, max            /* see if we've hit the max */
    b.ne    loop                /* if not, then continue the loop */

    mov     x0, 0               /* set exit status to 0 */
    mov     x8, 93              /* exit is syscall #93 */
    svc     0                   /* invoke syscall */

.data
msg:    .ascii      "Loop\n"
len=    . - msg

Test the new code by rebuilding and running it:

make clean
make
./hello

Output:

[kzaw@aarch64-002 aarch64]$ ./hello
Loop
Loop
Loop
Loop
Loop
Loop

☑️ Print Loop and Index Number

Let's change the code again so that it will print Loop: # where '#' is the current index number.

To do that, we will need to convert our loop counter number to its ASCII character representation. In ASCII/ISO-8859-1/Unicode UTF-8, the digit characters are in the range 48-57 (0x30-0x39).

add     x20, x19, #48       /* x20 = x19 + 48 (ASCII '0') */

This converts the loop counter in register x19 by adding 48 (the ASCII code for '0') and stores the result in x20. For example,

when x19 = 0, x20 becomes 48 (ASCII '0').
when x19 = 1, x20 becomes 49 (ASCII '1').
And so on...
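
The same mapping can be checked quickly in Python. This is just an illustration of the arithmetic; digit_to_ascii is a hypothetical name, not something from the lab code.

```python
ASCII_ZERO = 48  # same value as ord("0")


def digit_to_ascii(value):
    """Return the ASCII code for a single decimal digit (0-9)."""
    if not 0 <= value <= 9:
        raise ValueError("only single digits map onto '0'..'9'")
    return value + ASCII_ZERO    # add x20, x19, #48


for x19 in range(6):
    x20 = digit_to_ascii(x19)
    print(x19, x20, chr(x20))
```

Anything above 9 lands outside the digit range, which is why the two-digit version later needs a division.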

Next, define our message.

msg:    .ascii      "Loop: #\n"    /* # is a placeholder for the digit */

I added the code below to the body of the loop. It gets the address of msg and adds an offset of 6 bytes, which skips over "Loop: " (6 characters) and points at the '#' placeholder.

The strb instruction writes a single byte from a register to memory.

w20 is the 32-bit view of x20. strb stores only the lowest 8 bits (1 byte) of that register and ignores the upper 24 bits. Since x20 holds an ASCII digit, which fits in 1 byte, nothing is lost and no neighbouring bytes of the message are overwritten.

[x21] is the destination address. Taken together, the instruction writes the ASCII digit to the memory location x21 points to, replacing the '#'.

adr     x21, msg            /* Get address of message */
add     x21, x21, #6        /* Position of the digit character (after "Loop: ") */
strb    w20, [x21]          /* Store the ASCII character at that position */
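
Here is how I picture those three instructions, as a Python sketch. A bytearray plays the role of the .data section, the message is assumed to start at offset 0 of that buffer, and the function name strb is simply borrowed from the instruction for illustration.

```python
def strb(memory, address, register_value):
    """Model of strb: store only the lowest 8 bits of the register."""
    memory[address] = register_value & 0xFF


msg = bytearray(b"Loop: #\n")    # the message template in .data

x19 = 3                          # pretend we are on loop index 3
x20 = x19 + 48                   # add  x20, x19, #48
x21 = 0                          # adr  x21, msg  (msg is at offset 0 in this model)
x21 = x21 + 6                    # add  x21, x21, #6
strb(msg, x21, x20)              # strb w20, [x21]

print(msg.decode("ascii"), end="")
```

Only one byte of the buffer changes, so the "Loop: " prefix and the newline stay intact.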

✨
Modified Code:

.text
.globl _start

min = 0                          /* starting value for the loop index */
max = 6                          /* loop exits when the index hits this number */

_start:
    mov     x19, min            /* initialize loop counter */
loop:
    /* Convert loop counter to ASCII character */
    add     x20, x19, #48       /* x20 = x19 + 48 (ASCII '0') */

    /* Store the ASCII digit in the message */
    adr     x21, msg            /* Get address of message */
    add     x21, x21, #6        /* Position of the digit character (after "Loop: ") */
    strb    w20, [x21]          /* Store the ASCII character at that position */

    /* Print message with loop counter */
    mov     x0, 1               /* file descriptor: 1 is stdout */
    adr     x1, msg             /* message location (memory address) */
    mov     x2, len             /* message length (bytes) */
    mov     x8, 64              /* write is syscall #64 */
    svc     0                   /* invoke syscall */

    add     x19, x19, 1         /* increment the loop counter */
    cmp     x19, max            /* see if we've hit the max */
    b.ne    loop                /* if not, then continue the loop */

    mov     x0, 0               /* set exit status to 0 */
    mov     x8, 93              /* exit is syscall #93 */
    svc     0                   /* invoke syscall */

.data
msg:    .ascii      "Loop: #\n"    /* # is a placeholder for the digit */
len=    . - msg

Output:

[kzaw@aarch64-002 aarch64]$ ./hello
Loop: 0
Loop: 1
Loop: 2
Loop: 3
Loop: 4
Loop: 5

☑️ Loop From 00 - 32

The next requirement is to loop from 00 - 32, printing 2-digit decimal numbers. Besides changing the max symbol to 33, there are other important changes we must make.

We copy the loop counter to x22.

mov     x22, x19            /* Copy loop counter to x22 */

Next, we need to change the conversion to handle two digits.
Before, we just added 48 to the counter and that was it. Now it takes a few more steps.

To get the tens digit, we divide by 10. The udiv instruction (unsigned divide) gives us the quotient of the division.

So x20 = x22 / 10

/* Calculate tens digit: quotient of division by 10 */
mov     x23, #10            /* Set divisor to 10 */
udiv    x20, x22, x23       /* x20 = x22 / 10 (quotient = tens digit) */

Now that the tens digit is extracted, we can get the ones digit by getting the remainder. To do this, we will use mul to multiply the quotient with 10. This value is then subtracted from original value to get the remainder.

/* Calculate ones digit: remainder of division by 10 */
mul     x24, x20, x23       /* x24 = quotient * 10 */
sub     x21, x22, x24       /* x21 = original - (quotient * 10) = remainder */

After all of this, we can convert everything to ASCII.

/* Convert digits to ASCII */
add     x20, x20, #48       /* Convert tens digit to ASCII */
add     x21, x21, #48       /* Convert ones digit to ASCII */
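
The three steps (divide, multiply back, subtract) can be modelled in Python to check the digits they produce. This is a sketch of the arithmetic only; split_two_digits is a hypothetical helper and the variable names mirror the registers used above.

```python
def split_two_digits(n):
    """Return (tens, ones) as ASCII codes, using the udiv / mul / sub steps."""
    x22 = n                  # mov  x22, x19
    x23 = 10                 # mov  x23, #10
    x20 = x22 // x23         # udiv x20, x22, x23   (quotient = tens digit)
    x24 = x20 * x23          # mul  x24, x20, x23
    x21 = x22 - x24          # sub  x21, x22, x24   (remainder = ones digit)
    return x20 + 48, x21 + 48


for n in (0, 7, 10, 32):
    tens, ones = split_two_digits(n)
    print(n, chr(tens) + chr(ones))
```

udiv does not produce a remainder, which is why the multiply and subtract are needed. AArch64 also has an msub instruction that computes Xa - Xn*Xm in one step, so the mul/sub pair could likely be fused, but I kept the two separate instructions here.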

Let me show you a table of registers, comparing the old and new so we can understand better.

Register	Original Purpose	New Purpose
x19	Loop counter	Loop counter (unchanged)
x20	ASCII digit	Tens digit (after conversion to ASCII)
x21	Message pointer	Ones digit (after conversion to ASCII)
x22	–	Copy of loop counter for calculations
x23	–	Constant value 10 (divisor)
x24	–	Scratch for quotient × 10, then pointer for message buffer manipulation

The calculation is complete! Now update the code that patches the message so it stores both digits.

/* Store the ASCII digits in the message */
adr     x24, msg            /* Get address of message */
add     x24, x24, #6        /* Position of the first digit (after "Loop: ") */
strb    w20, [x24]          /* Store the tens digit */
add     x24, x24, #1        /* Move to the position of the second digit */
strb    w21, [x24]          /* Store the ones digit */

✨
Modified Code:

.text
.globl _start

min = 0                          /* starting value for the loop index */
max = 33                         /* loop exits when the index hits this number */

_start:
    mov     x19, min            /* initialize loop counter */
loop:
    /* Convert loop counter to two ASCII digits */
    mov     x22, x19            /* Copy loop counter to x22 */

    /* Calculate tens digit: quotient of division by 10 */
    mov     x23, #10            /* Set divisor to 10 */
    udiv    x20, x22, x23       /* x20 = x22 / 10 (quotient = tens digit) */

    /* Calculate ones digit: remainder of division by 10 */
    mul     x24, x20, x23       /* x24 = quotient * 10 */
    sub     x21, x22, x24       /* x21 = original - (quotient * 10) = remainder */

    /* Convert digits to ASCII */
    add     x20, x20, #48       /* Convert tens digit to ASCII */
    add     x21, x21, #48       /* Convert ones digit to ASCII */

    /* Store the ASCII digits in the message */
    adr     x24, msg            /* Get address of message */
    add     x24, x24, #6        /* Position of the first digit (after "Loop: ") */
    strb    w20, [x24]          /* Store the tens digit */
    add     x24, x24, #1        /* Move to the position of the second digit */
    strb    w21, [x24]          /* Store the ones digit */

    /* Print message with loop counter */
    mov     x0, 1               /* file descriptor: 1 is stdout */
    adr     x1, msg             /* message location (memory address) */
    mov     x2, len             /* message length (bytes) */
    mov     x8, 64              /* write is syscall #64 */
    svc     0                   /* invoke syscall */

    add     x19, x19, 1         /* increment the loop counter */
    cmp     x19, max            /* see if we've hit the max */
    b.ne    loop                /* if not, then continue the loop */

    mov     x0, 0               /* set exit status to 0 */
    mov     x8, 93              /* exit is syscall #93 */
    svc     0                   /* invoke syscall */

.data
msg:    .ascii      "Loop: ##\n"   /* ## are placeholders for the two digits */
len=    . - msg

Output:

[kzaw@aarch64-002 aarch64]$ ./hello
Loop: 00
Loop: 01
...
Loop: 09
Loop: 10
...
Loop: 32

☑️ Loop Without Leading Zeros

The next change we're making is removing the leading zero for single-digit numbers.

To make this happen, we need to implement conditional logic that detects whether we're dealing with a single-digit or double-digit number, and formats the output accordingly.

We need to use different message formats depending on the number's value:

For numbers 0-9: Use "Loop:  #" (notice two spaces after the colon)
For numbers 10-32: Use "Loop: ##" (notice one space after the colon)

.data
msg1:   .ascii      "Loop:  #\n"   /* Single-digit format (note: two spaces after colon) */
len1=   . - msg1
msg2:   .ascii      "Loop: ##\n"   /* Double-digit format (note: one space after colon) */
len2=   . - msg2

The KEY change is this conditional check. We compare the tens digit with the ASCII value '0'. If it's not '0', then we know we have a two-digit number. We will add a new double_digit label as the branch target for that case.

/* Determine if number is single or double digit */
cmp     x20, #48            /* Compare tens digit to ASCII '0' */
b.ne    double_digit        /* If not '0', it's a double-digit number */

If it's not a double-digit number, we continue on to the single-digit case. Use msg1 and store the digit at offset 7 (after "Loop:  ", which is 7 characters because of the two spaces). After that, jump to the common print_msg routine.

/* Single-digit case (0-9) */
adr     x24, msg1           /* Get address of single-digit message */
strb    w21, [x24, #7]      /* Store ones digit at position after "Loop:  " */
mov     x1, x24             /* Set message address for print */
mov     x2, len1            /* Set message length */
b       print_msg           /* Jump to print routine */

If it is a double-digit number, we jump to the double_digit label. Use msg2, set the address and length up for printing, and then fall through to the print logic.

double_digit:
    /* Double-digit case (10-32) */
    adr     x24, msg2           /* Get address of double-digit message */
    strb    w20, [x24, #6]      /* Store tens digit */
    strb    w21, [x24, #7]      /* Store ones digit */
    mov     x1, x24             /* Set message address for print */
    mov     x2, len2            /* Set message length */

The shared printing logic is below. x1 and x2 have already been set by whichever branch ran.

print_msg:
    /* Print message with loop counter */
    mov     x0, 1               /* file descriptor: 1 is stdout */
    mov     x8, 64              /* write is syscall #64 */
    svc     0                   /* invoke syscall */
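
To check the offsets, here is the branch modelled in Python. It is a sketch under a couple of assumptions: format_line_aarch64 is a hypothetical helper, and it copies the template on each call, while the assembly patches msg1 and msg2 in place.

```python
MSG1 = b"Loop:  #\n"   # single-digit template (two spaces after the colon)
MSG2 = b"Loop: ##\n"   # double-digit template (one space after the colon)


def format_line_aarch64(n):
    """Return the bytes that the write syscall would print for index n."""
    x20 = n // 10 + 48               # tens digit as ASCII
    x21 = n % 10 + 48                # ones digit as ASCII
    if x20 != 48:                    # cmp x20, #48 ; b.ne double_digit
        buf = bytearray(MSG2)
        buf[6] = x20                 # strb w20, [x24, #6]
        buf[7] = x21                 # strb w21, [x24, #7]
    else:
        buf = bytearray(MSG1)
        buf[7] = x21                 # strb w21, [x24, #7]
    return bytes(buf)


for n in (0, 9, 10, 32):
    print(format_line_aarch64(n).decode("ascii"), end="")
```

Both templates are 9 bytes long, so the numbers stay right-aligned in the output.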

✨
Modified Code:

.text
.globl _start

min = 0                          /* starting value for the loop index */
max = 33                         /* loop exits when the index hits this number */

_start:
    mov     x19, min            /* initialize loop counter */
loop:
    /* Convert loop counter to two ASCII digits */
    mov     x22, x19            /* Copy loop counter to x22 */

    /* Calculate tens digit: quotient of division by 10 */
    mov     x23, #10            /* Set divisor to 10 */
    udiv    x20, x22, x23       /* x20 = x22 / 10 (quotient = tens digit) */

    /* Calculate ones digit: remainder of division by 10 */
    mul     x24, x20, x23       /* x24 = quotient * 10 */
    sub     x21, x22, x24       /* x21 = original - (quotient * 10) = remainder */

    /* Convert digits to ASCII */
    add     x20, x20, #48       /* Convert tens digit to ASCII */
    add     x21, x21, #48       /* Convert ones digit to ASCII */

    /* Determine if number is single or double digit */
    cmp     x20, #48            /* Compare tens digit to ASCII '0' */
    b.ne    double_digit        /* If not '0', it's a double-digit number */

    /* Single-digit case (0-9) */
    adr     x24, msg1           /* Get address of single-digit message */
    strb    w21, [x24, #7]      /* Store ones digit at position after "Loop:  " */
    mov     x1, x24             /* Set message address for print */
    mov     x2, len1            /* Set message length */
    b       print_msg           /* Jump to print routine */

double_digit:
    /* Double-digit case (10-32) */
    adr     x24, msg2           /* Get address of double-digit message */
    strb    w20, [x24, #6]      /* Store tens digit */
    strb    w21, [x24, #7]      /* Store ones digit */
    mov     x1, x24             /* Set message address for print */
    mov     x2, len2            /* Set message length */

print_msg:
    /* Print message with loop counter */
    mov     x0, 1               /* file descriptor: 1 is stdout */
    mov     x8, 64              /* write is syscall #64 */
    svc     0                   /* invoke syscall */

    add     x19, x19, 1         /* increment the loop counter */
    cmp     x19, max            /* see if we've hit the max */
    b.ne    loop                /* if not, then continue the loop */

    mov     x0, 0               /* set exit status to 0 */
    mov     x8, 93              /* exit is syscall #93 */
    svc     0                   /* invoke syscall */

.data
msg1:   .ascii      "Loop:  #\n"   /* Single-digit format (note: two spaces after colon) */
len1=   . - msg1
msg2:   .ascii      "Loop: ##\n"   /* Double-digit format (note: one space after colon) */
len2=   . - msg2

Output:

Loop:  0
Loop:  1
Loop:  2
...
Loop:  9
Loop: 10
Loop: 11
...
Loop: 32

☑️ Loop with Hex Output (0 - 20)

Now, let's say we want to output in hex instead of decimal. We can remove the single vs. double digit branch and go back to a single two-character template. These are the changes to make.

We will now divide by 16 instead of 10. This makes the quotient the “high nibble” (first hexadecimal digit) and the remainder the “low nibble.”

mov     x23, #16

For each nibble, check whether its value is less than 10. If it is, convert it to ASCII by adding 48. If it's not, add 55 so that 10 becomes 65 ('A'), 11 becomes 66 ('B'), etc.

Here's that logic for the high nibble. The same is applied to low nibble.

    cmp     x20, #10            /* Check if high nibble is less than 10 */
    blt     high_digit_decimal
    add     x20, x20, #55       /* For hex A-F */
    b       high_digit_done
high_digit_decimal:
    add     x20, x20, #48       /* For digits 0-9 */
high_digit_done:
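
The nibble logic is easy to check in Python against the built-in hex formatting. This is an illustrative sketch only; nibble_to_ascii and to_hex_pair are hypothetical names.

```python
def nibble_to_ascii(nibble):
    """Convert a value 0-15 to the ASCII code of its hex digit."""
    if nibble < 10:              # cmp ..., #10 ; blt ..._decimal
        return nibble + 48       # '0'..'9'
    return nibble + 55           # 'A'..'F' (10 + 55 = 65 = 'A')


def to_hex_pair(n):
    """Return the two hex characters for n (0-255) using udiv / mul / sub."""
    high = n // 16               # udiv by 16: high nibble
    low = n - high * 16          # mul + sub: low nibble
    return chr(nibble_to_ascii(high)) + chr(nibble_to_ascii(low))


for n in (9, 10, 15, 16, 32):
    print(n, to_hex_pair(n))
```

Since 16 is a power of two, a shift and a mask would give the same nibbles, but dividing keeps the code parallel to the decimal version.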

✨
Modified Code:

.text
.globl _start

min = 0                          /* starting value for the loop index */
max = 33                         /* loop exits when the index hits this number */

_start:
    mov     x19, min            /* initialize loop counter */
loop:
    /* Convert loop counter to two hexadecimal digits */
    mov     x22, x19            /* Copy loop counter to x22 */

    /* Calculate high nibble: quotient of division by 16 */
    mov     x23, #16            /* Set divisor to 16 for hex */
    udiv    x20, x22, x23       /* x20 = x22 / 16 (high nibble) */

    /* Calculate low nibble: remainder of division by 16 */
    mul     x24, x20, x23       /* x24 = high nibble * 16 */
    sub     x21, x22, x24       /* x21 = original - (high nibble * 16) = low nibble */

    /* Convert high nibble to ASCII */
    cmp     x20, #10            /* Check if high nibble is less than 10 */
    blt     high_digit_decimal
    /* For hex A-F: add 55 (10+55=65 -> 'A') */
    add     x20, x20, #55       
    b       high_digit_done
high_digit_decimal:
    add     x20, x20, #48       /* For digits 0-9: add 48 ('0') */
high_digit_done:

    /* Convert low nibble to ASCII */
    cmp     x21, #10            /* Check if low nibble is less than 10 */
    blt     low_digit_decimal
    /* For hex A-F */
    add     x21, x21, #55       
    b       low_digit_done
low_digit_decimal:
    add     x21, x21, #48       /* For digits 0-9 */
low_digit_done:

    /* Store the ASCII characters in the message template */
    adr     x24, msg           /* Get address of the message template */
    strb    w20, [x24, #6]      /* Store high nibble at position after "Loop: " */
    strb    w21, [x24, #7]      /* Store low nibble */

    /* Print message */
    mov     x1, x24             /* Set message address for print */
    mov     x2, len             /* Set message length */
    mov     x0, 1               /* file descriptor: 1 is stdout */
    mov     x8, 64              /* write is syscall #64 */
    svc     0                   /* invoke syscall */

    add     x19, x19, 1         /* increment the loop counter */
    cmp     x19, max            /* see if we've hit the max */
    b.ne    loop                /* if not, then continue the loop */

    mov     x0, 0               /* set exit status to 0 */
    mov     x8, 93              /* exit is syscall #93 */
    svc     0                   /* invoke syscall */

.data
msg:   .ascii      "Loop: ##\n"   /* Message format for hexadecimal output */
len=   . - msg

Output:

[kzaw@aarch64-002 aarch64]$ ./hello
Loop: 00
Loop: 01
Loop: 02
Loop: 03
Loop: 04
Loop: 05
Loop: 06
Loop: 07
Loop: 08
Loop: 09
Loop: 0A
Loop: 0B
Loop: 0C
Loop: 0D
Loop: 0E
Loop: 0F
Loop: 10
Loop: 11
Loop: 12
Loop: 13
Loop: 14
Loop: 15
Loop: 16
Loop: 17
Loop: 18
Loop: 19
Loop: 1A
Loop: 1B
Loop: 1C
Loop: 1D
Loop: 1E
Loop: 1F
Loop: 20

📌 `x86` Assembly Program

While the overall logic of our loop programs is similar, several key differences set x86_64 apart from AArch64.

Overall Differences

The first difference is in register naming and constants.

AArch64: Registers are named x0, x1, …, and constants are used without a special prefix.
x86: Registers have names like %rax, %rdi, %rsi, etc., and constants are prefixed with $ (in GAS syntax).

Memory addressing is also different.

AArch64: We use adr.
x86: We use the lea (load effective address) instruction combined with %rip.

Invoking a syscall is different.

AArch64: The syscall is invoked with svc 0.
x86: We use the syscall instruction.

There are also other differences in division and arithmetic logic, which you will see in the code.
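
As a side-by-side summary, here is the syscall setup for both architectures written as a small Python table. The register names and syscall numbers are the ones used by the example programs in this post; the dictionary layout and the describe_write helper are just my own illustration.

```python
# Syscall conventions as used by the example programs in this lab.
SYSCALL_ABI = {
    "aarch64": {
        "number_reg": "x8",
        "arg_regs": ("x0", "x1", "x2"),
        "instruction": "svc 0",
        "write": 64,
        "exit": 93,
    },
    "x86_64": {
        "number_reg": "%rax",
        "arg_regs": ("%rdi", "%rsi", "%rdx"),
        "instruction": "syscall",
        "write": 1,
        "exit": 60,
    },
}


def describe_write(arch):
    """Describe how a write(fd, buffer, length) call is set up on arch."""
    abi = SYSCALL_ABI[arch]
    fd, buf, length = abi["arg_regs"]
    return (
        f"{abi['number_reg']}={abi['write']}, "
        f"{fd}=fd, {buf}=buffer, {length}=length, then {abi['instruction']}"
    )


for arch in SYSCALL_ABI:
    print(arch + ":", describe_write(arch))
```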

Modifying for `x86`

cd ~/spo600/examples/hello/assembler/x86_64

In this directory, you'll find two assembly source files: hello-gas.s using GNU Assembler (GAS) syntax and hello-nasm.s using NASM syntax. We'll work with the GAS version since it's more consistent with what we used for AArch64.

☑️ Basic Loop Template

.text
 .globl    _start

 min = 0                         /* starting value for the loop index */
 max = 5                         /* loop exits when the index hits this number */

 _start:
     mov     $min,%r15           /* loop index */

 loop:
     /* ... body of the loop ... do something useful here ... */

     inc     %r15                /* increment the loop index */
     cmp     $max,%r15           /* see if we've hit the max */
     jne     loop                /* if not, then continue the loop */

     mov     $0,%rdi             /* set exit status to 0 */
     mov     $60,%rax            /* exit is syscall #60 */
     syscall                     /* invoke syscall */

Some differences from AArch64:

In GAS syntax for x86_64, the operand order of mov is reversed: the source comes first and the destination last (AArch64 puts the destination first)
We use inc for incrementing instead of add
We use jne (jump if not equal) instead of b.ne

☑️ Print Loop

.data
 msg: .ascii "Loop\n"
 len = .-msg

 .text
 .globl    _start

 min = 0                         /* starting value for the loop index */
 max = 6                         /* loop exits when the index hits this number */

 _start:
     mov     $min,%r15           /* loop index */

 loop:
     mov     $1, %rax   /* syscall: write */
     mov     $1, %rdi   /* File descriptor: stdout */
     lea     msg(%rip), %rsi /* Message address */
     mov     $len, %rdx /* Message length */
     syscall            /* Invoke syscall */

     inc     %r15                /* increment the loop index */
     cmp     $max,%r15           /* see if we've hit the max */
     jne     loop                /* if not, then continue the loop */

     mov     $0,%rdi             /* set exit status to 0 */
     mov     $60,%rax            /* exit is syscall #60 */
     syscall                     /* invoke syscall */

The lea instruction is used to load the address of msg into the %rsi register. The (%rip) part is relative addressing from the current instruction pointer.
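
The relative addressing boils down to one addition, which I sketched in Python below. All of the addresses are made up for illustration, and the 7-byte instruction length is what I would expect for this form of lea rather than something I read out of the binary. The one rule it relies on is that %rip-relative displacements are measured from the address of the next instruction.

```python
def rip_relative_target(instruction_address, instruction_length, displacement):
    """Return the address a %rip-relative operand refers to."""
    next_instruction = instruction_address + instruction_length
    return next_instruction + displacement   # %rip already points past the lea


# Hypothetical layout: the lea sits at 0x401000 and msg at 0x402000.
lea_address = 0x401000
lea_length = 7
msg_address = 0x402000

# The linker would store this displacement inside the instruction.
displacement = msg_address - (lea_address + lea_length)
print(hex(displacement))
print(hex(rip_relative_target(lea_address, lea_length, displacement)))
```

Because only the distance is encoded, the code keeps working wherever the program is loaded in memory.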

The registers used are different!

%rax holds the syscall number (1 for write)
%rdi holds the first argument (file descriptor: 1 for stdout)
%rsi holds the second argument (message address)
%rdx holds the third argument (message length)

After building with make and running ./hello-gas, you should see Loop printed 6 times as intended.

☑️ Print Loop and Index Number

.data
 msg: .ascii "Loop   \n"
 len = .-msg

 .text
 .globl    _start

 min = 0                         /* starting value for the loop index */
 max = 6                         /* loop exits when the index hits this number */

 _start:
     mov     $min,%r15           /* loop index */

 loop:
     mov        %r15, %rax
     mov        $10, %r10
     xor        %rdx, %rdx
     div        %r10            /* quotient is stored in %rax, remainder in %rdx */

     # Convert digits to ASCII
     add     $'0', %al
     add     $'0', %dl

     lea        msg(%rip), %rsi /* Message address */
     movb    %al, 5(%rsi)   /* Store tens digit in the 6th position of msg */
     movb    %dl, 6(%rsi)    /* Store units digit in the 7th position of msg */

     mov        $1, %rax   /* syscall: write */
     mov        $1, %rdi   /* File descriptor: stdout */
     mov        $len, %rdx /* Message length */
     syscall            /* Invoke syscall */

     inc     %r15                /* increment the loop index */
     cmp     $max,%r15           /* see if we've hit the max */
     jne     loop                /* if not, then continue the loop */

     mov     $0,%rdi             /* set exit status to 0 */
     mov     $60,%rax            /* exit is syscall #60 */
     syscall                     /* invoke syscall */

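Here we're using different instructions for arithmetic. Note that this version already writes both a tens and a ones digit, so the output reads Loop 00, Loop 01, and so on: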

We first copy the loop counter from %r15 to %rax
The xor %rdx, %rdx instruction clears the %rdx register
The div %r10 instruction divides the combined 128-bit value in %rdx:%rax by the value in %r10 (which is 10)
After division, %rax contains the quotient and %rdx contains the remainder
%al refers to the lower 8 bits of %rax, and %dl refers to the lower 8 bits of %rdx
movb instruction moves a single byte to memory
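
To see why %rdx has to be cleared first, here is a Python model of the unsigned div. It is a sketch: div64 is a hypothetical helper, and Python exceptions stand in for the CPU's divide error.

```python
MASK64 = (1 << 64) - 1


def div64(rdx, rax, divisor):
    """Model of unsigned div: divide rdx:rax, return (new rax, new rdx)."""
    if divisor == 0:
        raise ZeroDivisionError("divide error: divisor is zero")
    dividend = (rdx << 64) | rax             # 128-bit value in rdx:rax
    quotient, remainder = divmod(dividend, divisor)
    if quotient > MASK64:
        raise OverflowError("divide error: quotient does not fit in 64 bits")
    return quotient, remainder


print(div64(0, 32, 10))    # rdx cleared with xor: tens and ones of 32
print(div64(1, 32, 10))    # stale value left in rdx: a very different result
```

With rdx cleared, 32 / 10 gives quotient 3 and remainder 2. With a leftover 1 in rdx, the dividend becomes 2^64 + 32 and the digits are wrong, which is why the xor comes right before the div.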

☑️ Loop From 00 - 32

.data
 msg: .ascii "Loop   \n"
 len = .-msg

 .text
 .globl    _start

 min = 0                         /* starting value for the loop index */
 max = 33                        /* loop exits when the index hits this number */

 _start:
     mov     $min,%r15           /* loop index */

 loop:
     mov        %r15, %rax
     mov        $10, %r10
     xor        %rdx, %rdx
     div        %r10            /* quotient is stored in %rax, remainder in %rdx */

     # Convert digits to ASCII
     add     $'0', %al
     add     $'0', %dl

     lea        msg(%rip), %rsi /* Message address */
     movb    %al, 5(%rsi)   /* Store tens digit in the 6th position of msg */
     movb    %dl, 6(%rsi)    /* Store units digit in the 7th position of msg */

     mov        $1, %rax   /* syscall: write */
     mov        $1, %rdi   /* File descriptor: stdout */
     mov        $len, %rdx /* Message length */
     syscall            /* Invoke syscall */

     inc     %r15                /* increment the loop index */
     cmp     $max,%r15           /* see if we've hit the max */
     jne     loop                /* if not, then continue the loop */

     mov     $0,%rdi             /* set exit status to 0 */
     mov     $60,%rax            /* exit is syscall #60 */
     syscall                     /* invoke syscall */

The only change here is to set max = 33.

☑️ Loop Without Leading Zeros

.data
 msg: .ascii "Loop   \n"
 len = .-msg

 .text
 .globl    _start

 min = 0                         /* starting value for the loop index */
 max = 33                        /* loop exits when the index hits this number */

 _start:
     mov     $min,%r15           /* loop index */

 loop:
     mov        %r15, %rax
     mov        $10, %r10
     xor        %rdx, %rdx
     div        %r10            /* quotient is stored in %rax, remainder in %rdx */

     # Convert digits to ASCII
     add        $'0', %al
     add        $'0', %dl

     cmp        $'0', %al
     lea        msg(%rip), %rsi /* Message address */
     je         single_digit

     movb       %al, 5(%rsi)   /* Store tens digit in the 6th position of msg */
     movb       %dl, 6(%rsi)    /* Store units digit in the 7th position of msg */
     jmp        print

single_digit:
     movb       %dl, 5(%rsi)   /* Store units digit in the 6th position of msg */
     movb       $' ', 6(%rsi)    /* Store space in the 7th position of msg */

print:
     mov        $1, %rax   /* syscall: write */
     mov        $1, %rdi   /* File descriptor: stdout */
     mov        $len, %rdx /* Message length */
     syscall            /* Invoke syscall */

     inc     %r15                /* increment the loop index */
     cmp     $max,%r15           /* see if we've hit the max */
     jne     loop                /* if not, then continue the loop */

     mov     $0,%rdi             /* set exit status to 0 */
     mov     $60,%rax            /* exit is syscall #60 */
     syscall                     /* invoke syscall */

Key changes:

We compare the tens digit (%al) with ASCII '0'
If equal (je), we jump to the single_digit label
For single-digit numbers, we display only the ones digit followed by a space
We use jmp print to skip the single-digit handling code for two-digit numbers
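
Here is the same branch modelled in Python, with a bytearray standing in for msg. It is only a sketch (format_line_x86 is a hypothetical helper), but unlike my AArch64 model it patches one shared buffer in place, just like the assembly does.

```python
def format_line_x86(n, msg):
    """Patch msg in place for index n and return the bytes to be written."""
    al = n // 10 + ord("0")          # quotient from div, then add $'0', %al
    dl = n % 10 + ord("0")           # remainder from div, then add $'0', %dl
    if al == ord("0"):               # cmp $'0', %al ; je single_digit
        msg[5] = dl                  # movb %dl, 5(%rsi)
        msg[6] = ord(" ")            # movb $' ', 6(%rsi)
    else:
        msg[5] = al                  # movb %al, 5(%rsi)
        msg[6] = dl                  # movb %dl, 6(%rsi)
    return bytes(msg)


msg = bytearray(b"Loop   \n")
for n in (5, 12, 5):
    print(format_line_x86(n, msg).decode("ascii"), end="")
```

The explicit space store is what keeps the buffer clean if a single-digit number is ever printed after a two-digit one, as in the last call above. With the loop counting upward it never happens, but it makes the routine safe to reuse.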

☑️ Loop with Hex Output (0 - 20)

.data
 msg: .ascii "Loop   \n"
 len = .-msg

 .text
 .globl    _start

 min = 0                         /* starting value for the loop index */
 max = 0x21                      /* loop exits when the index hits this number */

 _start:
     mov     $min,%r15           /* loop index */

 loop:
     mov        %r15, %rax
     mov        $16, %r10
     xor        %rdx, %rdx
     div        %r10            /* quotient is stored in %rax, remainder in %rdx */

     cmp        $9, %al
     jg         convert_letter_l
     add        $'0', %al
     jmp        right_digit

convert_letter_l:
     sub        $10, %al
     add        $'A', %al

right_digit:
     cmp        $9, %dl
     jg         convert_letter_r
     add        $'0', %dl
     jmp        store_num

convert_letter_r:
     sub        $10, %dl
     add        $'A', %dl

store_num:
     lea     msg(%rip), %rsi
     movb    %al, 5(%rsi)
     movb    %dl, 6(%rsi)

print:
     mov        $1, %rax   /* syscall: write */
     mov        $1, %rdi   /* File descriptor: stdout */
     mov        $len, %rdx /* Message length */
     syscall            /* Invoke syscall */

     inc     %r15                /* increment the loop index */
     cmp     $max,%r15           /* see if we've hit the max */
     jne     loop                /* if not, then continue the loop */

     mov     $0,%rdi             /* set exit status to 0 */
     mov     $60,%rax            /* exit is syscall #60 */
     syscall                     /* invoke syscall */
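
The letter conversion here looks different from the AArch64 version (subtract 10 and add 'A', instead of adding 55), so I checked in Python that both give the same characters. Both function names are hypothetical and exist only for this comparison.

```python
def hex_digit_x86(value):
    """x86 version: cmp $9 / jg, then sub $10 and add $'A' for letters."""
    if value > 9:
        return value - 10 + ord("A")
    return value + ord("0")


def hex_digit_aarch64(value):
    """AArch64 version: cmp #10 / blt, then add #55 for letters."""
    if value < 10:
        return value + 48
    return value + 55


for value in range(16):
    assert hex_digit_x86(value) == hex_digit_aarch64(value)

print("".join(chr(hex_digit_x86(v)) for v in range(16)))
```

Since 'A' is 65, subtracting 10 and adding 65 is the same as adding 55. The x86 form just spells out where the constant comes from.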

Final Thoughts

This lab helped me understand modern assembly language and write more code in it. I got practice with several concepts, from memory manipulation and loops to character encoding, and this solidified my understanding.

One major takeaway is how explicit everything is in assembly. Every character printed and every loop iteration must be carefully crafted. I became fully aware that at this level nothing happens behind the scenes. I had to manage my own memory locations and be cautious with my index counters. It is far more low level than C!

Thank you for your time. See you soon! 😄
