Introduction

This book is a reflection and a practical study of x86-64 assembly language programming on Ubuntu. It is based on the book “x86-64 Assembly Language Programming with Ubuntu” by Dr. Ed Jorgensen.

In each chapter I describe what I learned, include illustrations and solved exercises, and provide explanations to aid understanding.

Foreword

This material is adapted and reorganized from the x86 folder to provide a coherent, interactive learning path about x86-64 assembly language. Content was reviewed for clarity and formatting.

Prerequisites

Before using this book, ensure you have the necessary tools installed (for example, mdbook, an assembler, and a Linux environment). Basic familiarity with C or another systems language and the Linux command line is recommended.

0 - Why Assembly?

1.0 - Assembly and low level ?

Assembly language is a low-level, machine-specific language: each processor family has its own instruction set and therefore its own assembly. In our case, code written in x86-64 assembly does not run on a different processor architecture. High-level languages are translated into assembly or machine code, which the processor executes.
Assembly language helps you see how system resources such as memory and registers work.

It also helps improve algorithm development and debugging skills through practice, because it requires more thought, more attention, and a more nuanced approach. Function and procedure calls, input/output instructions, and the content and structure of functions are important implementation details that are best understood when working at a low level.

In the security world, fundamental multiprocessing mechanisms such as shared memory and threaded processing help to understand race conditions and other bugs associated with these concepts.

1.1 - History and Beginning

x86-64 is the instruction set for the x86-64 class of processors, used by 64-bit operating systems. x86-64 is a CISC (Complex Instruction Set Computing) architecture. There are multiple internal processor design philosophies.

x86-64 (also called x86_64, AMD64, or x64) is a 64-bit extension of the x86 instruction set. The first available implementation was the AMD Opteron in 2003. It introduced 64-bit mode and compatibility mode, and a new 4-level paging mechanism. Compared to 32-bit x86, it supports larger virtual and physical memory and increases the number of GPRs (General Purpose Registers) from 8 to 16. SSE2 instructions are available in 64-bit mode for floating-point arithmetic.

Each 128-bit vector register (XMM register) can store one or two double-precision floating-point numbers, or up to four single-precision numbers.

In 64-bit mode, instructions are modified to support 64-bit operands and 64-bit addressing.
The x86-64 architecture also allows 16-bit and 32-bit applications to run on the system.
AMD product names such as Opteron and Athlon were followed by X2, X3, or X4 to indicate the number of cores, as well as XLT models; the same applies to Turion, Sempron, and Phenom.

Note that readers should have some background in C, C++, or Java, because many explanations assume familiarity with programming concepts, with Linux-based operating systems, and with their command-line interface.

If not, I highly recommend doing the C Pool of 42 School.

1 - Assembly and Low Level

1.0 - Assembly and low level ?

Assembly language is a low-level, machine-specific language: each instruction-set processor uses its own assembly. In our case, x86-64 assembly does not run on a different processor.

High-level languages are translated into assembly or machine code, which the processor executes. Assembly language helps you understand system resources like memory and registers.

It also helps improve algorithm development skills and debugging through practice, because it requires more thought, attention, and a nuanced approach. Function and procedure calls, input/output instructions, and the structure of functions — important implementation details — are best understood when working at a low level.

In the security world, fundamental mechanisms of multiprocessing such as shared memory and threaded processing help to understand race conditions and other bugs associated with those concepts.

1.1 - History and Beginning

The x86-64 instruction set is used by 64-bit operating systems. x86-64 is a CISC (Complex Instruction Set Computing) architecture. There are multiple internal processor design philosophies.

(x86_64, AMD64, x64) is a 64-bit extension of the x86 instruction set. The first widely available implementation was the AMD Opteron in 2003. It introduced 64-bit mode, compatibility mode, and a new 4-level paging mechanism. Compared to 32-bit, it supports larger virtual and physical address spaces and increased the number of general-purpose registers (GPR) from 8 to 16. SSE2 instructions are available and are commonly used in 64-bit mode for floating-point arithmetic.

128-bit vector registers (XMM registers) can store one or two double-precision floating-point numbers or up to four single-precision numbers.

64-bit mode instructions were modified to support 64-bit operands and 64-bit addressing.
x86-64 architecture can run 16-bit and 32-bit applications on compatible systems.
AMD product names such as Opteron and Athlon were followed by X2, X3, X4 to indicate the number of cores; similar naming was used for Turion, Sempron, Phenom, and others.

It is important to note that readers should have prerequisites in C, C++, or Java, because many explanations assume familiarity with programming concepts and with Linux-based operating systems and their command-line interfaces.

If not, I highly recommend doing a C Pool from 42 School.

2.0 - Architecture Overview

2.1 -

A computer’s components include a CPU (Central Processing Unit), RAM (Random Access Memory), a storage device (SSD / HDD), and input / output devices (screen, keyboard, mouse).

Von Neumann

A Big Picture of Von Neumann Architecture

A CPU, or processor, is the brain of the computer. It contains a Control Unit (CU), an Arithmetic Logic Unit (ALU), and registers. A CPU includes all the circuitry required to process input, store data, and generate output. It also follows program instructions that tell it which information to process and how to process it.

CU (Control Unit): It is responsible for how data moves throughout the system, for directing all input and output flow, and for fetching instruction code.
ALU (Arithmetic and Logic Unit): It is the part of the CPU that handles all computations, including arithmetic operations (such as addition and subtraction), logical operations, comparisons, and bit-shifting operations.
Register: It is a very fast kind of computer memory used to accept, store, and transfer data. Processor register is the term for the registers that the CPU uses (for example, the General Purpose Registers).
Accumulator: It stores the results of the calculations that the ALU makes.
Program Counter: It holds the memory address of the next instruction to be executed. This address is passed from the Program Counter to the Memory Address Register.
Memory Address Register: MAR stores the memory addresses of the locations that are to be read from or written to.
Memory Data Register: MDR stores the instructions or data fetched from memory, or any information that is to be transferred to and stored in memory.
Current Instruction Register: CIR stores the most recently fetched instruction while it waits to be decoded and executed.
Instruction Buffer Register: IBR holds instructions that are not executed immediately.

Input / Output Devices

The program or the data is read into main memory (RAM) from secondary storage (HDD, SSD). Once results have been computed and saved by the computer, they can be presented to the user via output devices.

Buses: Data is sent from one part of a computer to another via buses, which connect all key internal components to the memory and CPU.
Control Bus: It carries control commands from the CPU, as well as status signals from other devices, which are used to control and coordinate all of the computer’s actions.
Address Bus: It carries memory addresses (not the actual data) between the processor and memory.
Data Bus: It relays data between the memory unit, I/O devices, and the processor.

2.2 Data Storage Sizes

The x86-64 architecture supports a specific set of data storage sizes. These storage sizes correlate directly with variable declarations in high-level languages (C, C++, Rust).

2.3 CPU Registers

A CPU register, or just register, is a temporary storage or working location built into the CPU itself (separate from main memory). There are sixteen 64-bit General Purpose Registers (GPRs). Each can be accessed with all 64 bits or with some portion or subset of them.

When working with data elements smaller than 64 bits (32-bit, 16-bit, or 8-bit), the lower portion of a register can be accessed by using a different register name, as described here:

As shown in the Excalidraw diagram, the first four registers, rax, rbx, rcx, and rdx, also allow bits 8-15 to be accessed with the ah, bh, ch, and dh register names. These names are provided for legacy support.

A register stores the value assigned to it in binary, which is usually displayed in hexadecimal. For example:

rax = 50,000,000,000      # value assigned to rax, in decimal (base 10)
rax = 0000 000B A43B 7400 # the same rax value shown in hex (base 16)

ax = 50,000               # ax is then set to 50,000 in decimal (base 10)
ax = C350                 # the ax value shown in hex (base 16)

rax = 0000 000B A43B C350 # each hex digit is 4 bits, so the 16 digits total 64 bits and each space-separated group is 16 bits
                          # only the lower 16 bits (ax) of rax are set; the upper 48 bits are unaffected
                          # note the change of ax from 7400 (16) to C350 (16)

al = 50                   # al is set to 50 (10), which is 32 (16) in hex
rax = 0000 000B A43B C332 # when the lower 8 bits (al) are set, the upper 56 bits are unaffected

For 32-bit register operations, the upper 32 bits of the 64-bit register are cleared (set to 0). For example, writing to eax zeroes the upper half of rax.
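The partial-register rules above can be modelled with simple bit masks. The following is only a sketch in Python, not real register access: the helper names write_reg and show are made up for this illustration.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def write_reg(rax: int, value: int, bits: int) -> int:
    """Return the new 64-bit rax after writing `value` to its low `bits` bits."""
    if bits == 64:
        return value & MASK64
    if bits == 32:
        # Writing eax zero-extends: the upper 32 bits are cleared.
        return value & 0xFFFF_FFFF
    if bits in (16, 8):
        low_mask = (1 << bits) - 1
        # Writing ax or al leaves the remaining upper bits untouched.
        return (rax & ~low_mask & MASK64) | (value & low_mask)
    raise ValueError("bits must be 8, 16, 32, or 64")


def show(reg: int) -> str:
    """Format a 64-bit value as four groups of four hex digits."""
    digits = f"{reg:016X}"
    return " ".join(digits[i:i + 4] for i in range(0, 16, 4))


rax = write_reg(0, 50_000_000_000, 64)
print(show(rax))  # 0000 000B A43B 7400
rax = write_reg(rax, 50_000, 16)  # set ax
print(show(rax))  # 0000 000B A43B C350
rax = write_reg(rax, 50, 8)       # set al
print(show(rax))  # 0000 000B A43B C332
rax = write_reg(rax, 50, 32)      # set eax
print(show(rax))  # 0000 0000 0000 0032
```

The last call shows the special case: unlike the 8-bit and 16-bit writes, the 32-bit write discards whatever was in the upper half.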

RSP (Register Stack Pointer)

The rsp register is not used for data or other general purposes; it points to the top of the stack.

RBP (Register Base Pointer)

The rbp register is used as a base pointer during function calls; this is its only function.

RIP (Register Instruction Pointer)

The rip register is a special register used by the CPU to point to the next instruction to be executed. This means that, in a debugger, rip points to an instruction that has not been executed yet.

Flag Register (rFlags)

The flag register, rFlags, is used to store status and CPU control information about the instruction that was just executed. The rFlags register is updated by the processor and is not directly accessible by programs.

XMM registers

The XMM registers are a set of dedicated registers used to support 32-bit and 64-bit floating-point operations and Single Instruction Multiple Data (SIMD) instructions. SIMD allows a single instruction to be applied simultaneously to multiple data items, which helps to increase performance.

Cache Memory

Cache memory is a small subset of the primary storage or RAM located in the CPU chip. If a memory location is accessed, a copy of the value is placed in the cache. A memory read involves sending the address via the bus to the memory controller, which will obtain the value at the requested memory location, and send it back through the bus. Comparatively, if a value is in cache, it would be much faster to access that value. A cache hit occurs when the requested data can be found in a cache, while a cache miss occurs when it cannot. Cache hits are served by reading data from the cache, which is faster than reading from main memory. The more requests that can be served from cache, the faster the system will typically perform.

Main memory

Memory can be viewed as a series of bytes, one after another. That is, memory is byte addressable. This means each memory address holds one byte of information. To store a double-word, four bytes are required, which use four memory addresses. Additionally, the architecture is little-endian. This means that the Least Significant Byte (LSB) is stored in the lowest memory address. The Most Significant Byte (MSB) is stored in the highest memory location.

For example, assume the value 5,000,000 (10), which is 004C4B40 (16), is to be placed in a double-word variable named var1. For a little-endian architecture, the memory picture would be as follows:

Based on the little-endian architecture, the LSB is stored in the lowest memory address and the MSB is stored in the highest memory location.
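The byte order can be checked with a short sketch. Python's struct module packs the value the same way a little-endian machine stores it; the base address 0x1000 is a made-up placeholder for wherever var1 happens to live.

```python
import struct

value = 5_000_000
raw = struct.pack("<I", value)  # "<I" = little-endian unsigned 32-bit

base = 0x1000  # hypothetical address of var1, chosen for illustration
for offset, byte in enumerate(raw):
    print(f"{base + offset:#06x}: {byte:02X}")
# 0x1000: 40   (LSB, lowest address)
# 0x1001: 4B
# 0x1002: 4C
# 0x1003: 00   (MSB, highest address)

# Reading the bytes back in little-endian order recovers the value.
restored = int.from_bytes(raw, "little")
```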

2.5 Memory Layout

The general memory layout for a program is as shown:

The reserved section is not available to user programs. The text (or code) section is where the machine language (i.e., the 1’s and 0’s that represent the code) is stored. The data section is where the initialized data is stored. This includes declared variables that have been provided an initial value at assemble-time. The uninitialized data section, typically called BSS section, is where declared variables that have not been provided an initial value are stored. If accessed before being set, the value will not be meaningful. The heap is where dynamically allocated data will be stored (if requested). The stack starts in high memory and grows downward.

Memory Hierarchy

In order to fully understand the various different memory levels and associated usage, it is useful to review the memory hierarchy. In general terms, faster memory is more expensive and slower memory blocks are less expensive. The CPU registers are small, fast, and expensive. Secondary storage devices such as disk drives and Solid State Drives (SSD’s) are larger, slower, and less expensive. The overall goal is to balance performance with cost. An overview of the memory hierarchy is as follows:

Where the top of the triangle represents the fastest, smallest, and most expensive memory. As we move down levels, the memory becomes slower, larger, and less expensive. The goal is to use an effective balance between the small, fast, expensive memory and the large, slower, and cheaper memory.

Based on this table, a primary storage access at 100 nanoseconds is 30,000 times faster than a secondary storage access at 3 milliseconds. The typical speeds improve over time (and these are already out of date). The key point is that the relative difference between each memory unit is significant. This difference between the memory units applies even as newer, faster SSDs are being utilized.

3.0 Data Representation

In a computer, data representation is how information is stored. It differs depending on what you want to store; for example, the method used to store integers is different from the method used to store floating-point numbers and strings.

Different bases are used here (binary, decimal, hex) to represent a number. Ex:

19 = 19 (10) = 13 (16) = 0x13
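As a quick check of these conversions, here is a small sketch using Python's built-in base helpers:

```python
n = 19

print(bin(n), n, hex(n))  # 0b10011 19 0x13

# Converting back from text in a given base yields the same number.
from_hex = int("13", 16)
from_binary = int("10011", 2)
```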

3.1 Integer Representation

The computer represents numbers in binary (1s and 0s). A number or variable can be stored in a limited amount of space provided by the computer. This directly impacts the size or range of numbers that can be represented.

Ex:
1 byte (8 bits) can represent 2^8, or 256, different numbers. These 256 numbers (0-255) can be unsigned (all positive). The signed range is (-128 to +127).

If a number needs more space than one byte provides, a larger size must be used. For example:

A word (16 bits) can represent 65,536 different values: 0 to 65,535 unsigned, or -32,768 to +32,767 signed.
A double-word (32 bits) can represent 4,294,967,296 different values: 0 to 4,294,967,295 unsigned, or -2,147,483,648 to +2,147,483,647 signed.
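These ranges follow directly from the number of bits. A minimal sketch that computes them for each storage size (the function names are chosen for this illustration):

```python
def unsigned_range(bits: int) -> tuple[int, int]:
    """Smallest and largest unsigned value that fits in `bits` bits."""
    return 0, 2**bits - 1


def signed_range(bits: int) -> tuple[int, int]:
    """Smallest and largest two's complement value that fits in `bits` bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


SIZES = {"byte": 8, "word": 16, "double-word": 32, "quadword": 64}
for name, bits in SIZES.items():
    u_lo, u_hi = unsigned_range(bits)
    s_lo, s_hi = signed_range(bits)
    print(f"{name:12} unsigned {u_lo:,} to {u_hi:,}   signed {s_lo:,} to {s_hi:,}")
```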

size-bytes

To know whether a value can be represented, you need to know the size of the storage element (byte, word, double-word, quadword) being used and whether the value is signed or unsigned. Unsigned values use a standard binary representation. Signed values use a two’s complement representation.

For example, the unsigned byte range can be represented using a number line as follows:

unsigned-bytes

For example, the signed byte range can be represented using a number line as follows:

signed-bytes

When we examine a binary file with a debugger, it is difficult to know whether a variable in memory is signed or unsigned, because the same bit pattern can represent either an unsigned value or a signed value.

For example, when the unsigned and signed values are within the overlapping positive range (0 to +127):

A signed byte representation of 12 (10) is 0x0C (16)
An unsigned byte representation of 12 (10) is also 0x0C (16)

When the unsigned and signed values are outside the overlapping range:

A signed byte representation of -15 (10) is 0xF1 (16)
An unsigned byte representation of 241 (10) is also 0xF1 (16)
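The same byte can be read both ways. This sketch interprets the bytes from the examples above as unsigned and as signed (two's complement) values:

```python
outside = bytes([0xF1])  # outside the overlapping range
inside = bytes([0x0C])   # inside the overlapping range (0 to +127)

as_unsigned = int.from_bytes(outside, "little", signed=False)
as_signed = int.from_bytes(outside, "little", signed=True)
print(as_unsigned, as_signed)  # 241 -15

# Within the overlapping range both interpretations agree.
same_unsigned = int.from_bytes(inside, "little", signed=False)
same_signed = int.from_bytes(inside, "little", signed=True)
print(same_unsigned, same_signed)  # 12 12
```

Nothing in the byte itself says which interpretation is intended; that depends on the instructions the program uses on it.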

Note: a group of four set bits (1111) is 15 in decimal, or F in hexadecimal.

Two’s Complement

To find a two’s-complement representation for negative values:

Take the one’s complement (negate all bits).
Add 1 (binary addition).

For example, to represent -9:

Start with the positive value 9.
Convert it to binary.
Invert all bits (change 1 to 0 and 0 to 1).
Add 1.

See the example below:
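The same steps can be sketched in code for a byte-sized value. The function name twos_complement is chosen for this illustration.

```python
def twos_complement(value: int, bits: int = 8) -> int:
    """Return the bit pattern that represents -value in `bits` bits."""
    mask = (1 << bits) - 1
    inverted = ~value & mask      # one's complement: flip every bit
    return (inverted + 1) & mask  # add 1, staying within `bits` bits


positive = 9
pattern = twos_complement(positive)
print(f"{positive:08b}")                    # 00001001  (+9)
print(f"{~positive & 0xFF:08b}")            # 11110110  (bits inverted)
print(f"{pattern:08b} = 0x{pattern:02X}")   # 11110111 = 0xF7  (-9)
```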

3.3 Floating-point Representation

The representation of floating-point numbers differs by format. The representation shown here is IEEE 754 32-bit floating-point.

Where s is the sign (0 => positive and 1 => negative). More formally, this can be written as:

After these calculations, the next step is to calculate the biased exponent, which is the exponent from the normalized scientific notation plus the bias.

The bias for the IEEE 32-bit floating-point standard is 127. The result should be converted to 8 bits (1 byte) and stored in the biased exponent portion of the word.

The 64-bit floating-point representation follows the same scheme as the 32-bit one; however, the format uses an 11-bit biased exponent with a bias of 1023.
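To see the three fields of a 32-bit value, the sketch below splits a float into its sign, biased exponent, and fraction bits. The sample value -7.75 is picked for this illustration because it is exactly representable: -7.75 is -1.1111 (2) × 2^2.

```python
import struct


def decode_float32(x: float) -> tuple[int, int, int, int]:
    """Return (raw bits, sign, biased exponent, fraction) of x as IEEE 754 32-bit."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    sign = bits >> 31                      # 1 bit
    biased_exponent = (bits >> 23) & 0xFF  # 8 bits
    fraction = bits & 0x7FFFFF             # 23 bits
    return bits, sign, biased_exponent, fraction


bits, sign, biased, fraction = decode_float32(-7.75)
print(f"0x{bits:08X}")       # 0xC0F80000
print(sign)                  # 1 (negative)
print(biased, biased - 127)  # 129 2 (biased exponent, then true exponent)
print(f"{fraction:023b}")    # 11110000000000000000000
```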

It is possible that when a value is interpreted as floating-point and it does not conform to the standard (either 32-bit or 64-bit), then it cannot be used as a floating-point value. This can occur if an integer representation is mistakenly interpreted as a floating-point value, or when a floating-point arithmetic operation (add, subtract, multiply, divide) produces a value that is too large or too small to represent.

An incorrect format or an unrepresentable value is referred to as NaN (Not a Number).

3.4 Characters and strings

Computer memory is designed to store and retrieve numbers. So symbols (non-numeric data like characters) are assigned numeric values. This is the functionality of the ASCII (American Standard Code for Information Interchange) table. For example, the “A” character has the value “65” in decimal and “0x41” in hexadecimal. It is important to distinguish between the “2” character and the integer 2.

Unicode is an improvement over ASCII because it supports many languages.

A string is a series of ASCII characters terminated with NULL (a non-printable ASCII character). It is used to mark the end of strings.

As described, strings can contain numeric symbols, but they are not treated as numbers. A character uses 1 byte, so a string uses one byte per character plus one byte for the NULL terminator. An integer such as 19,653 needs at least 2 bytes.

Again, it is very important to understand the difference between the string 19653 (using 6 bytes) and the single integer 19,653 (10) (which can be stored in a single word, which is 2 bytes).
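The difference is easy to see by looking at the bytes themselves. A small sketch comparing the NULL-terminated ASCII string with the 16-bit little-endian integer:

```python
text = b"19653" + b"\x00"               # five ASCII digits plus the NULL terminator
number = (19653).to_bytes(2, "little")  # the integer stored in one word

print(len(text), text.hex(" "))      # 6 31 39 36 35 33 00
print(len(number), number.hex(" "))  # 2 c5 4c
```

The string holds the character codes of the digits (0x31 is the character 1), while the word holds the value itself, 0x4CC5, with its low byte first.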

4 - Assembly Programming (Draft)

Program Format

An assembly file contains several parts:

Data section where initialized data is declared and defined.
BSS section where uninitialized data is declared.
Text section where code is placed.

Note that assembly uses a semicolon ; for comments; any text after ; is ignored. Numbers can be specified in decimal, hex, or octal. The default base is decimal. All hex or base-16 values must be preceded with 0x. For octal or base-8 values, use a q suffix like 777q.

A constant is defined with the equ keyword; its value cannot be changed during program execution. It does not have an associated memory location or a fixed size (byte, word, double-word); the size depends on the value.

<name> equ 	<value>

Ex:

SIZE equ 	10000 ; Could be used as word or double-word but not a byte

Data Section

The Data section contains all initialized variables and constants. A naming specification for your variables is simple: It can start with an underscore or letter, followed by letters or numbers, including some special characters.

<variablename>   <dataType> 	<initialValue>

Refer to these tables for a series of examples using various data types.

data type

See the example below for common assembler directives used for initialized data declarations. Here d means define.

bVar		db		10 					; byte variable
cVar		db		"H"					; single character
strng		db		"Hello World"		; string
wVar		dw		5000				; 16-bit variable
dVar		dd		50000				; 32-bit variable
arr			dd		100, 200, 300		; 3 element array
flt1		dd		3.14159				; 32-bit float
qVar		dq		1000000000			; 64-bit variable

The value specified here must fit the specified data type; for example, a byte variable set to 500 would generate an assembler error.

BSS section

The BSS section contains uninitialized data or variables. It uses the same declaration pattern as the Data section.

<variable>		<resType>		<count>

The supported data types are as follows:

data type

The following are common assembler directives for uninitialized data declarations.

bArr	resb		10 		; 10 element byte array
wArr	resw		50 		; 50 element word array
dArr	resd		100 	; 100 element double-word array
qArr	resq		200 	; 200 element quad array

Here res means reserve, and the final letter (b, w, d, or q) gives the size of each element to reserve. The space is reserved without storing any initial values.

Text Section

The code is placed in the text section, and instructions are specified line by line. The text section includes some headers or labels before the code that defines the initial program entry point.

global _start
_start:

No special label or directives are required to terminate the program. A system service is used to inform the operating system that the program should be terminated and the resources, such as memory, recovered and re-used.

; Simple example demonstrating basic program format and layout.
; Ed Jorgensen
; July 18, 2014
; ************************************************************
; Some basic data declarations

section .data

; -----

; Define constants

EXIT_SUCCESS	equ		0 		; successful operation
SYS_exit		equ		60      ; call code for terminate

; -----

; Byte (8-bit) variable declarations

bVar1			db		17
bVar2			db		9
bResult			db		0

; -----

; Word (16-bit)	variable declarations

wVar1			dw		17000
wVar2			dw		9000
wResult			dw		0

; -----

; Double-Word (32-bit) variable declarations

dVar1			dd		17000000
dVar2			dd		9000000
dResult			dd		0

; -----

; quadword (64-bit) variable declarations

qVar1			dq		1700000000
qVar2			dq		900000000
qResult			dq		0

; ************************************************************
; Code Section

section		.text
global _start
_start:

; Performs a series of very basic addition operations
; to demonstrate basic program format.

; ----------
; Byte example
; bResult = bVar1 + bVar2

mov     al, byte [bVar1]
add     al, byte [bVar2]
mov     byte [bResult], al

;------------
; Word example
; wResult = wVar1 + wVar2

mov     ax, word [wVar1]
add     ax, word [wVar2]
mov     word [wResult], ax

;------------
; Double-Word example
; dResult = dVar1 + dVar2

mov     eax, dword [dVar1]
add     eax, dword [dVar2]
mov     dword [dResult], eax

;------------
; QuadWord example
; qResult = qVar1 + qVar2

mov     rax, qword [qVar1]
add     rax, qword [qVar2]
mov     qword [qResult], rax


; ************************************************************
; Done, terminate program.

last:
mov		rax, SYS_exit		; Call code for exit
mov		rdi, EXIT_SUCCESS	; Exit program with success
syscall

5 - Examples and Exercises (Draft)

Toolchain

In general, the set of programming tools used to create a program is referred to as the toolchain. The toolchain used here consists of the following:

Assembler
Linker
Loader
Debugger

5.1 - Assemble / Link / Load Overview

The source code file passes through multiple stages before becoming a running program during the assemble, link, and load process. The human-readable source code file is converted into an object file by the assembler, the object file is transformed into an executable by the linker, and the executable is loaded into memory by the loader.

Overview: Assemble, Link, Load

Assembler

The assembler is a program that reads an assembly language source file as input and converts the code into machine language (binary).

During this process, the comments are removed, and variable names and labels are converted into the appropriate addresses (as required by the CPU during execution).

The assembler used here is yasm.

yasm -g dwarf2 -f elf64 example.asm -l example.lst

-g dwarf2 : tells the assembler to include debugging information in the object file (.o).
-f elf64 : tells the assembler to create the object file in the ELF64 format (which is appropriate for 64-bit Linux-based systems).
example.asm : the assembly source file used as input.
-l example.lst : tells the assembler to create a list file named example.lst.

But what is a list file? A list file shows the line number, the relative address, the machine-language version of the instruction (including variable references), and the original source line. This information is useful when debugging.

36 00000009 40660301		dVar1	dd		17000000
37 0000000D 40548900		dVar2	dd		9000000
38 00000011 00000000		dResult	dd		0

Line 36

Relative address: 0x00000009 is where dVar1 is stored in the data area.
Double-word variable: dVar1 requires four bytes.
The next address is 0x0000000D, so dVar1 uses addresses 0x00000009, 0x0000000A, 0x0000000B, and 0x0000000C.
0x40660301 is the value in hex, as placed in memory. 17,000,000 is 0x01036640 in hex. Remember that the architecture used here is little-endian, so the LSB (0x40) is placed in the lowest memory address.
0x40 is placed at address 0x00000009, 0x66 at 0x0000000A, 0x03 at 0x0000000B, and 0x01 at 0x0000000C.
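The reading of line 36 can be reproduced with a short sketch that splits the list-file line into its fields and decodes the bytes. The parsing is deliberately naive (whitespace splitting) and only meant for simple data lines like this one.

```python
line = "36 00000009 40660301\t\tdVar1\tdd\t\t17000000"

fields = line.split()
line_number = int(fields[0])
address = int(fields[1], 16)     # relative address in the data area
data = bytes.fromhex(fields[2])  # bytes in the order they sit in memory

# One byte per address, lowest address first.
for offset, byte in enumerate(data):
    print(f"0x{address + offset:08X}: 0x{byte:02X}")

value = int.from_bytes(data, "little")  # undo the little-endian byte order
next_address = address + len(data)
print(value, f"0x{next_address:08X}")   # 17000000 0x0000000D
```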

Little Endian

For example, a fragment of the list file text section, excerpted from the example program in the previous chapter is as follows:

95						last:
96	0000005A 48C7C03C000000		mov		rax, SYS_exit
97	00000061 48C7C700000000		mov		rdi, EXIT_SUCCESS
98	00000068 0F05				syscall

Again, the numbers on the left are the line numbers. The next number, 0x0000005A, is the relative address of where the line of code is placed.
The next number, 0x48C7C03C000000, is the machine language version of the instruction, in hex, that the CPU reads and understands.
The rest of the line is the original assembly language source instruction.
The label last: does not have a machine language instruction because a label is not an executable instruction.
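To make the machine language column less opaque, the sketch below picks apart the bytes of line 96. It handles only this one instruction form (a REX.W prefix, opcode C7, and a 32-bit immediate) and is not a general disassembler; the function name decode_mov_imm32 is made up for this illustration.

```python
REGISTERS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mov_imm32(code: bytes) -> tuple[str, int]:
    """Decode 'mov r64, imm32' encoded as 48 C7 <ModRM> <imm32>."""
    rex, opcode, modrm = code[0], code[1], code[2]
    # 0x48 selects 64-bit operands; ModRM must be 11 000 rrr (register form).
    if rex != 0x48 or opcode != 0xC7 or modrm >> 3 != 0b11000:
        raise ValueError("not a 'mov r64, imm32' in the form handled here")
    register = REGISTERS[modrm & 0b111]  # low 3 bits pick the register
    immediate = int.from_bytes(code[3:7], "little", signed=True)
    return register, immediate


register, immediate = decode_mov_imm32(bytes.fromhex("48C7C03C000000"))
print(register, immediate)  # rax 60
```

The immediate 60 is the SYS_exit constant, stored as the four little-endian bytes 3C 00 00 00.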

Two-Pass Assembler

The assembler reads the source code and converts it into machine language, the binary 1’s and 0’s that the CPU understands. Because of this direct relationship between assembly code and machine language, machine language can be converted back into human-readable assembly. However, the comments, variable names, and label names are lost, so the resulting code can be very difficult to read.

As the assembler reads each line, it generates the machine code for that instruction. This works well except for jumps, such as those used for if statements or unconditional jumps: when the target label appears later in the file, its address is not yet known, so the instruction cannot be fully converted.

Ex :

mov		rax, 0
jmp		skipRest
...
...
...
skipRest:

Reading line by line, the assembler reaches jmp skipRest before it has seen the skipRest label, so it cannot know the label’s address or even whether the label exists. The solution is to read the file twice, which is why this design is known as a two-pass assembler.

First pass

The steps taken on the first pass vary with the design of the specific assembler, but the basic operations typically include:

Create symbol table
Expand macros
Evaluate constant expressions

A macro is a program element that is expanded into a set of programmer-predefined instructions. A constant expression is an expression composed entirely of constants. Since its value is fixed at assemble time, the assembler can evaluate the whole expression during the first pass.

For example, assuming BUFF is a defined constant:

mov rax, BUFF+5

Second pass

The steps taken on the second pass also vary with the design of the specific assembler. The basic operations typically include:

Final generation of code
Creation of list file (if requested)
Create object file

Code generation refers to the conversion of the assembly language instructions into CPU-executable machine instructions. Because of the one-to-one correspondence, instructions that do not use symbols can be converted on either the first or the second pass.

Depending on the assembler design, much of the code generation might be done on the first pass, or all of it on the second pass. Either way, the final generation is performed on the second pass and requires using the symbol table to check program symbols and obtain the appropriate addresses.
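The two passes can be illustrated with a toy assembler. Everything about the instruction set below is invented for this sketch (three mnemonics, each encoded as one opcode byte plus one operand byte); it is not x86-64 encoding and not how yasm is implemented. What it shows is the idea from the text: the first pass builds the symbol table, and the second pass uses it to resolve a forward jump.

```python
# Hypothetical toy instruction set: every instruction is 2 bytes.
OPCODES = {"load": 0x01, "jmp": 0x02, "nop": 0x03}


def assemble(source: str) -> bytes:
    # Strip comments and blank lines.
    lines = [ln.split(";")[0].strip() for ln in source.splitlines()]
    lines = [ln for ln in lines if ln]

    # First pass: record the address of every label in the symbol table.
    symbols, address = {}, 0
    for ln in lines:
        if ln.endswith(":"):
            symbols[ln[:-1]] = address
        else:
            address += 2

    # Second pass: generate code, looking labels up in the symbol table.
    out = bytearray()
    for ln in lines:
        if ln.endswith(":"):
            continue  # labels produce no machine code
        mnemonic, *operands = ln.split()
        operand = operands[0] if operands else "0"
        value = symbols[operand] if operand in symbols else int(operand, 0)
        out += bytes([OPCODES[mnemonic], value])
    return bytes(out)


program = """
    load 0
    jmp skipRest   ; forward reference, unknown on the first read
    nop
    nop
skipRest:
    nop
"""
machine_code = assemble(program)
print(machine_code.hex(" "))  # 01 00 02 08 03 00 03 00 03 00
```

The jmp operand 08 is the address of skipRest, which only becomes known once the whole file has been read.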

Assembler Directives

Assembler directives are instructions to the assembler that direct the assembler to do something. This might be formatting or layout. These directives are not translated into instructions for the CPU.

Linker

The linker, sometimes referred to as the linkage editor, combines one or more object files into a single executable file, including any necessary libraries. The following example links the example file from the previous chapter with the GNU gold linker, ld:

ld -g -o example example.o

-g includes debugging information in the final executable file.
-o example names the executable file example (with no extension). When -o is omitted, the output file is named a.out. The linker reads example.o as its input file here. Note that the executable can have any name; it does not need to match the name of any input object file.

It is also possible to link multiple object files.

ld -g -o example main.o example.o

When functions are located in an external source file, any function not in the current source file must be declared as extern. Variables, such as global variables, in other source files can also be accessed by using the extern statement; however, data is typically passed as arguments of the function call.

Linking Process

The object files and library routines are combined into a single executable module. As part of combining the object files, the linker must adjust the relocatable addresses as necessary.

Assume there are two source files, a main file and a secondary source file, both of which have been assembled into the object files main.o and funcs.o. When each file is assembled, calls to routines outside the file being assembled are declared with the extern assembler directive.

The code is not available for an external reference and such references are marked as external in the object file. The list file will show an R for such relocatable addresses. The linker must satisfy the external references. Additionally, the final location of the external references must be placed in the code. For example, if the main.o object file calls a function in the funcs.o file, the linker must update the call with the appropriate address as shown in the following illustration.

linking multiple

Here fnc1 is external to main.o: it is defined in funcs.o, so the call in main.o is marked with an R. Within funcs.o, fnc1 starts at relative address 0x100. When funcs.o is combined with main.o into the final executable, fnc1 ends up at address 0x400, and the linker inserts this address into the call statement in main to complete the linking process and ensure the function call works correctly. The linker adjusts relocatable addresses in this way for both code and data.
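The address adjustment can be sketched with a toy linker. The object-file layout below (plain dictionaries), the size of 0x300 for main.o, the size of funcs.o, and the call-site offset 0x050 are all assumptions made for this illustration; only fnc1's relative address 0x100 and final address 0x400 come from the example above. A real linker works on ELF sections and usually patches relative displacements rather than absolute addresses.

```python
# Hypothetical, heavily simplified object files.
main_o = {
    "size": 0x300,                     # assumed size of main.o's code
    "symbols": {"_start": 0x000},
    "relocations": [{"offset": 0x050, "symbol": "fnc1"}],  # call marked R
}
funcs_o = {
    "size": 0x200,                     # assumed size of funcs.o's code
    "symbols": {"fnc1": 0x100},        # relative address inside funcs.o
    "relocations": [],
}


def link(objects: list[dict]) -> tuple[dict, dict]:
    """Place objects one after another and resolve external references."""
    bases, base = [], 0
    for obj in objects:
        bases.append(base)
        base += obj["size"]

    # Final address of every symbol = base of its object + relative address.
    table = {}
    for obj, obj_base in zip(objects, bases):
        for name, relative in obj["symbols"].items():
            table[name] = obj_base + relative

    # Each relocation site receives the final address of its symbol.
    patches = {}
    for obj, obj_base in zip(objects, bases):
        for reloc in obj["relocations"]:
            patches[obj_base + reloc["offset"]] = table[reloc["symbol"]]
    return table, patches


table, patches = link([main_o, funcs_o])
print(hex(table["fnc1"]))  # 0x400
print({hex(site): hex(target) for site, target in patches.items()})
# {'0x50': '0x400'}
```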

Dynamic Linking

The Linux operating system supports dynamic linking through shared object files (.so), which allows the resolution of some symbols to be postponed until the program is executed. The actual instructions are not placed in the executable file; instead, if needed, they are resolved and accessed at run-time.

This approach offers two advantages:

Commonly used libraries can be stored in a single location instead of being duplicated in every binary.
If a bug in a shared library is fixed, programs that use it dynamically will benefit from the fix on next run.

Disadvantages

When a library is updated, the executable may break because it depends on the previous library version.
A program using its own library must be trusted; replacing components can introduce compatibility issues.

Assemble / Link Script

To avoid retyping the assemble and link commands each time, you can write a script that performs the whole assemble and link process:

#!/bin/bash

# Simple assemble/link script

# Require the main file name as the first argument.
if [ -z "$1" ]; then
    echo "Usage: ./asm64 <asmMainFile> (no extension)"
    exit 1
fi

# Verify no extensions were entered

if [ ! -e "$1.asm" ]; then
    echo "Error, $1.asm not found."
    echo "Note, do not enter file extensions."
    exit 1
fi

# Assemble and link.

yasm -Worphan-labels -g dwarf2 -f elf64 "$1.asm" -l "$1.lst"

ld -g -o "$1" "$1.o"

This script file can be named asm64. No file extension is required, because Linux does not rely on extensions to decide whether a file is executable.

chmod +x asm64  # give the script file execute permission

Execute it:

./asm64 example # assemble and link example.asm into an executable named example; any other file name works the same way

Loader

The loader is the part of the operating system that loads the executable file from secondary storage (such as a hard drive) into primary storage (RAM). It creates a new process for the executable and loads the code into memory. The loader runs implicitly when the executable is invoked:

./example  # the executable created by the assemble and link steps

Debugger

The debugger is used to control the execution of a program. If the program prints nothing during execution, the debugger makes it possible to inspect its results. Several debuggers exist; the one used here is GNU DDD, a graphical front end for GDB.

Style Guide

Follow the project’s style conventions for headings, prose, code blocks, and links. Prefer title case for headings, italics for terms, and hard wrap at 80 characters. Use relative links for intra-book references and bash highlighting for shell examples.
