intro to assembly language programming - Copy

Assembly language is a low-level programming language that provides a more understandable format than machine language, allowing programmers to write instructions using symbolic code. It is processor-dependent and requires an assembler to convert the code into machine language. The document also covers the advantages of assembly language, installation of the NASM assembler, and provides detailed steps for creating and executing assembly programs on Linux.

Uploaded by

hetavimodi2005
Copyright
© © All Rights Reserved
Assembly language programming

What is Assembly Language?


A processor understands only machine language instructions, which are strings of 1s and 0s.
However, machine language is too obscure and complex to use for software development.
Assembly language was therefore designed as a low-level language, specific to a family of
processors, that represents the processor's instructions in symbolic code and a more
understandable form.
Assembly language is processor-dependent because each of its mnemonics stands for a machine
code instruction of a particular processor. Writing a program with these mnemonics is much
easier than writing it in machine code consisting only of 0s and 1s.
An assembler is a program that converts a program written in assembly language into the
machine code (instructions) of a particular CPU.
Advantages of Assembly Language
Having an understanding of assembly language makes one aware of −
• How programs interface with the OS, processor, and BIOS;
• How data is represented in memory and other external devices;
• How the processor accesses and executes instructions;
• How instructions access and process data;
• How a program accesses external devices.
Other advantages of using assembly language are −
• Well-written assembly code can require less memory and execution time;
• It makes hardware-specific tasks easier to implement;
• It is well suited to writing interrupt service routines and other memory-resident
programs, and to system programming.

There are many good assembler programs, such as −
• Microsoft Assembler (MASM)
• Borland Turbo Assembler (TASM)
• The GNU assembler (GAS)
• The Netwide Assembler (NASM)
[Figure: Steps of compilation for higher-level languages]
[Figure: Steps of assembling for a low-level language]

Difference between high-level languages and low-level (assembly) language

High-level languages are suitable for application programming, such as database, web, AI,
and image and video processing applications.
Low-level languages are suitable for system programming, e.g. device drivers.
We will use the NASM assembler, as it is −
• Free;
• Well documented;
• Usable on both Linux and Windows.

Installing NASM in Linux


If you select "Development Tools" while installing Linux, NASM may be installed along with
the operating system, in which case you do not need to download and install it separately. To
check whether you already have NASM installed, take the following steps −
• Open a Linux terminal.
• Type whereis nasm and press ENTER.
• If NASM is already installed, a line like nasm: /usr/bin/nasm appears. If you see
just nasm:, you need to install NASM.

Method 1 Install nasm Using apt-get


Update the apt database with apt-get using the following command:
sudo apt-get update
After updating the apt database, we can install nasm with apt-get by running the following
command:
sudo apt-get -y install nasm

Method 2 Install nasm Using apt


Update the apt database with apt using the following command:
sudo apt update
After updating the apt database, we can install nasm with apt by running the following
command:
sudo apt -y install nasm

Procedure to create and execute a simple assembly program on Ubuntu (Linux) using NASM

1. Start typing “terminal”. The available terminal applications will be displayed.
2. Click on the “terminal” icon. A terminal window will open showing the command prompt.
3. At the prompt, give the gedit command to invoke the gedit editor.
4. Type the program in the gedit window.
5. Save it under the name hello.asm and exit.
6. To assemble the program, type one of the following commands at the prompt and press the
Enter key:
nasm -f elf32 hello.asm -o hello.o (for 32-bit)
nasm -f elf64 hello.asm -o hello.o (for 64-bit)
7. If the program is error-free, the object file hello.o is created.
8. To link it and create the executable, give the command
ld -o hello hello.o
(when linking a 32-bit object file on a 64-bit system, use ld -m elf_i386 -o hello hello.o)
9. To execute the program, type at the prompt
./hello
10. The program's message, “Hello, world!”, will be displayed at the prompt.

The Linux System Calls (int 80h for 32-bit execution)


System calls are provided by the OS for standard operations such as reading from the keyboard,
displaying on the screen, and file operations such as create, open, close, read and write.
To make a system call in 32-bit code:
1. Write the system call number in EAX.
2. Set up the arguments to the system call in EBX, ECX, etc.
3. Call the interrupt 80h.
4. The result is usually returned in EAX.

The “hello world!” assembly program (32-bit execution)


section .data
msg db 'Hello, world!',10 ; message to be displayed, followed by a newline
msglen equ $ - msg ; length of our message

section .text
global _start ; make the entry point visible to the linker
_start: ; program execution begins here
    mov edx, msglen ; message length
    mov ecx, msg ; address of the message to write
    mov ebx, 1 ; file descriptor 1: standard output (screen)
    mov eax, 4 ; system call number for write
    int 80h ; call kernel
    mov ebx, 0 ; exit status 0 (no error)
    mov eax, 1 ; system call number for exit
    int 80h ; call kernel

After typing the above program, save it under the name hello.asm,
then type the following commands:

nasm -f elf32 hello.asm -o hello.o

ld -m elf_i386 -o hello hello.o

./hello

The -m elf_i386 option is needed when the 32-bit object file is linked on a 64-bit system.

The “hello world!” assembly program (64-bit execution)

To make a system call in 64-bit code:
1. Write the system call number in RAX.
2. Set up the arguments to the system call in RDI, RSI, RDX, etc.
3. Make the system call with the SYSCALL instruction.
4. The result is usually returned in RAX.

section .data
msg db 'Hello world!',10 ; message followed by a newline
msglen equ $ - msg ; length of the 'Hello world!' string

section .text
global _start
_start:
    mov rax, 1 ; system call code for write
    mov rdi, 1 ; file handle 1 is stdout
    mov rsi, msg ; address of message
    mov rdx, msglen ; number of bytes in message
    syscall ; invoke operating system to do the write
    mov rax, 60 ; system call code for exit
    mov rdi, 0 ; return code 0 (means no error)
    syscall ; system call to exit

After typing the program, save it under the name hello.asm and run the following commands
(if a command is shown with a leading $, do not type the $; it is the terminal prompt).

nasm -f elf64 hello.asm -o hello.o

If there are any syntax (grammatical) errors, nasm displays the line numbers and the type of
each error. Open the asm file again in gedit, correct those errors, and run the above command
again.
If there are no errors, run the following command to link the object file and create the
executable program:
ld -o hello hello.o
If there are no errors, run the following command to execute it:
./hello

Assembly program structure


An assembly program can be divided into three sections −
• The data section,
• The bss section, and
• The text section.

The .data section

This section is for declaring initialized data, in other words defining variables that already
have a value when the program starts. It is typically used for things like messages, file names
and buffer sizes. Many of these values are never changed at run time, although the section
itself is writable. You can also define constants using the EQU directive. Here you can use
the DB, DW, DD, DQ and DT pseudo-instructions. For example:

section .data
message db 'Hello world!' ; Declare message
msglength equ $-message ; Declare msglength, a constant equal to the length
                        ; of the above message
buffersize dw 1024 ; Declare buffersize as a word containing the value 1024

The .bss section

This section is where you declare your uninitialized variables. You use the RESB, RESW, RESD,
RESQ and REST pseudo-instructions to reserve uninitialized space in memory for your variables,
like this:
section .bss
filename resb 255 ; Reserve 255 bytes
number resb 1 ; Reserve 1 byte
bignum resw 1 ; Reserve 1 word (1 word = 2 bytes)
realarray resq 10 ; Reserve an array of 10 quadwords (e.g. ten 8-byte reals)
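
As an illustration of how reserved space is used, the sketch below reads a line from the keyboard into a .bss buffer and writes it back to the screen. It assumes 32-bit Linux and the int 80h interface used in the 32-bit "hello world" program. The file name echo.asm, the label buffer and the 64-byte size are chosen only for this example.

```nasm
; echo.asm (hypothetical file name): read a line and print it back
section .bss
buffer  resb 64             ; 64 uninitialized bytes (size chosen for this example)

section .text
global _start
_start:
    mov eax, 3              ; system call number for read
    mov ebx, 0              ; file descriptor 0: standard input (keyboard)
    mov ecx, buffer         ; address of the reserved buffer
    mov edx, 64             ; read at most 64 bytes
    int 80h                 ; EAX = number of bytes read (negative on error)

    mov edx, eax            ; write back as many bytes as were read
    mov eax, 4              ; system call number for write
    mov ebx, 1              ; file descriptor 1: standard output (screen)
    mov ecx, buffer         ; address of the data to write
    int 80h

    mov eax, 1              ; system call number for exit
    mov ebx, 0              ; exit status 0
    int 80h
```

It can be assembled with nasm -f elf32 echo.asm -o echo.o and linked with ld -m elf_i386 -o echo echo.o (the -m elf_i386 option applies when linking on a 64-bit system).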

The .text section


This is where the actual assembly code is written. The .text section must begin with the
declaration global _start, which makes the _start label visible to the linker so that it can
mark where program execution begins. (It's like the main function in C or Java, only it's not
a function, just a starting point.)
E.g.:

section .text
global _start

_start:

; the program actually begins here

Assembly Language Statements


Assembly language programs consist of three types of statements −
• Executable instructions, or simply instructions,
• Assembler directives or pseudo-ops, and
• Macros.
The executable instructions or simply instructions tell the processor what to do. Each
instruction consists of an operation code (opcode). Each executable instruction generates one
machine language instruction.

The assembler directives or pseudo-ops tell the assembler about the various aspects of the
assembly process. These are non-executable and do not generate machine language instructions.

Macros are basically a text substitution mechanism.
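
As a minimal sketch of this text substitution in NASM, the macro below is defined once and expanded twice. The macro name write_string and the labels msg1, len1, msg2 and len2 are made up for this example. The code assumes 32-bit Linux with the int 80h interface. NASM defines a multi-line macro between %macro and %endmacro, and %1 and %2 stand for the first and second parameters.

```nasm
; write_string is a hypothetical macro name; 2 is its number of parameters
%macro write_string 2
    mov eax, 4              ; system call number for write
    mov ebx, 1              ; file descriptor 1: standard output
    mov ecx, %1             ; first parameter: address of the string
    mov edx, %2             ; second parameter: length of the string
    int 80h
%endmacro

section .data
msg1 db 'First line', 10
len1 equ $ - msg1
msg2 db 'Second line', 10
len2 equ $ - msg2

section .text
global _start
_start:
    write_string msg1, len1 ; the assembler replaces this line with the macro body
    write_string msg2, len2 ; and expands it again here with other arguments

    mov eax, 1              ; system call number for exit
    mov ebx, 0              ; exit status 0
    int 80h
```

Each use of write_string is expanded by the assembler into the five instructions of the macro body before machine code is generated, so no call or return takes place at run time.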

Syntax of Assembly Language Statements


Assembly language statements are entered one statement per line. Each statement has the
following format −
[label] mnemonic [operands] [;comment]
The fields in the square brackets are optional. A basic instruction has two parts: the first is
the name of the instruction (the mnemonic) to be executed, and the second is the operands, or
parameters, of the instruction.

Following are some examples of typical assembly language statements −

INC COUNT ; Increment the memory variable COUNT
MOV TOTAL, 48 ; Transfer the value 48 to the memory variable TOTAL
ADD AH, BH ; Add the content of the BH register to the AH register
           ; and store the result in AH
ADD MARKS, 10 ; Add 10 to the variable MARKS
MOV AL, 10 ; Transfer the value 10 to the AL register
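
The statements above are written in a generic style. In NASM, a variable name stands for its address, so a memory operand is written in square brackets, with a size keyword (byte, word, dword) wherever the size cannot be inferred from a register. The sketch below shows the same statements in NASM form inside a complete 32-bit program. The variable definitions and their initial values are assumptions made for this example.

```nasm
section .data
count   db 5                ; assumed initial values, for illustration only
total   dd 0
marks   dw 60

section .text
global _start
_start:
    inc byte [count]        ; increment the memory variable count (5 becomes 6)
    mov dword [total], 48   ; store the value 48 in the memory variable total
    mov ah, 2
    mov bh, 3
    add ah, bh              ; AH = AH + BH = 5
    add word [marks], 10    ; add 10 to the variable marks (60 becomes 70)
    mov al, 10              ; transfer the value 10 to the AL register

    mov eax, 1              ; system call number for exit
    movzx ebx, byte [count] ; use count as the exit status
    int 80h
```

After the program is assembled with nasm -f elf32, linked and run, typing echo $? at the prompt should print 6, the incremented value of count.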

Memory Segments
A segmented memory model divides the system memory into groups of independent segments, each
referenced by a pointer held in a segment register. Each segment is used to contain a
specific type of data. One segment is used to contain instruction codes, another segment stores
the data elements, and a third segment keeps the program stack.
In the light of the above discussion, we can specify various memory segments as −
• Data segment − It is represented by the .data section and the .bss section. The .data
section declares the memory region where the program's data elements are stored. This section
cannot be expanded after the data elements are declared, and it remains static throughout the
program.
The .bss section is also a static memory section; it contains buffers for data to be declared
later in the program. This buffer memory is zero-filled.
• Code segment − It is represented by the .text section. This defines an area in memory that
stores the instruction codes. This is also a fixed area.
• Stack − This segment contains data values passed to functions and procedures within the
program.

System calls
System calls are the interface (API) between user programs and the Linux kernel. They are used
to let the kernel perform various system tasks, such as file access and process management.
Each system call has a fixed number. System calls take parameters to perform their task. Those
parameters are passed by writing them into the appropriate registers before making the actual
call; each parameter position has a specific register.
On 32-bit x86, a system call is made using an interrupt, and all required information is passed
to the kernel by copying it into general-purpose registers. You need to take the following
steps for using Linux system calls in your program −
• Put the system call number in the EAX register.
• Store the arguments to the system call in the registers EBX, ECX, etc.
• Call the relevant interrupt (80h).
• The result is usually returned in the EAX register.
64-bit x86 uses the syscall instruction instead of interrupt 0x80, and the result value is
returned in RAX.
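
As a sketch of how the returned result can be used, the 64-bit program below writes a short message and then passes the value returned by write (the number of bytes written) to exit as the exit status. It follows the 64-bit convention: system call number in RAX, arguments in RDI, RSI and RDX, and the syscall instruction. The message text and label names are chosen only for this example.

```nasm
section .data
msg     db 'Hi', 10         ; example message: 2 characters plus a newline
msglen  equ $ - msg         ; 3 bytes

section .text
global _start
_start:
    mov rax, 1              ; system call number for write
    mov rdi, 1              ; file descriptor 1: standard output
    mov rsi, msg            ; address of the message
    mov rdx, msglen         ; number of bytes to write
    syscall                 ; RAX = bytes written (negative on error)

    mov rdi, rax            ; pass the result of write as the exit status
    mov rax, 60             ; system call number for exit
    syscall
```

After the program is assembled with nasm -f elf64, linked with ld and run, echo $? shows the exit status. When the write succeeds in full, this is 3, the length of the message.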

Assembler directives
An assembler directive is a statement that gives the assembler the information it needs to
generate machine code.
Directives are commands to the assembler, not instructions of the processor. They are also
called pseudo-instructions.
Because they are not executed by the processor, they are not converted to machine language
instructions.
E.g.
section .data
section .bss
section .text
DB, DW, DD, etc.
Explanations of these directives are given in the write-ups.
