ARM assembly programming
This is a quick tutorial on using a Raspberry Pi to learn ARM assembly
programming. It assumes that you have previous experience with assembly language,
say, with x86. We’ll compute Fibonacci numbers as a simple example. Unless
stated otherwise, descriptions and examples refer to the ARMv7 architecture
(arm-linux-gnueabihf).
Registers
ARM has 16 addressable registers, R0 to R15, each of which is 32 bits wide. In User Mode, R13 holds the stack pointer (SP), R14 is the link register (LR) and R15 is the program counter (PC). There is also a current program status register (CPSR) which holds certain status flags, the most important of which are “NZCV” (these bits are updated by the most recent flag-setting instruction, such as a compare or an instruction with the 's' suffix):
N = 1 if the result was negative
Z = 1 if the result was zero
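C = 1 if an addition produced a carry, or a subtraction did not produce a borrow (shift operations can also set it to the last bit shifted out)
V = 1 if the result overflowed as a signed value
These flags are used with branching instructions (e.g. BNE = branch if not equal, taken when Z = 0).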
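As a minimal sketch of how the flags drive a branch, the program below compares two registers and picks its exit status from the result. The labels `differ` and `done` and the constants being compared are made up for illustration; the exit sequence at the bottom (R7 and `swi 0`) is explained in the next section.

```asm
.text
.global _start
.syntax unified
_start:
mov r1, 7
mov r2, 7
cmp r1, r2 /* computes R1 - R2, sets NZCV, discards the result */
bne differ /* taken only when Z = 0, i.e. R1 != R2 */
mov r0, 0 /* equal: exit status 0 */
b done
differ:
mov r0, 1 /* not equal: exit status 1 */
done:
mov r7, 1 /* sys_exit */
swi 0
```

With both constants equal, as written, the subtraction yields zero, Z is set, and the program should exit with status 0. Change one of the constants and it should exit with status 1.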
System calls
We’ll be executing our programs on Linux, so you need to be aware of the numbers assigned to system calls (syscalls). A system call works like a function call into the kernel: parameters are passed in R0 to R6, and the return value (if any) comes back in R0. The syscall number itself goes in R7.
.text /* Text section is where program code goes */
.global _start /* label where execution begins */
.syntax unified /* modern "unified" UAL syntax */
_start:
mov r7, 1 /* sys_exit(int) has syscall number 1 */
mov r0, 42 /* first parameter to sys_exit() */
swi 0 /* software interrupt executes system call */
We can create an executable thus:
as -o ex1.o ex1.s # GNU assembler
ld -o ex1 ex1.o # GNU linker
./ex1 # Run the program, does nothing
echo $? # Prints 42
_start is the symbol that ld uses by default as the program's entry point.
Input and Output
The function signatures of the write(2) and read(2) system calls are:
ssize_t read(int fd, void *buf, size_t count); // syscall number 3
ssize_t write(int fd, const void *buf, size_t count); // syscall number 4
File descriptor fd has standard values for stdin (0), stdout (1) and stderr (2). So, a Hello World program looks like:
.text
.global _start
.syntax unified
_start:
mov r7, 4 /* sys_write() has syscall number 4 */
mov r0, 1 /* first param fd = stdout */
ldr r1, =msg /* second param buf -> load address of `msg' to r1 */
mov r2, 13 /* third param count = 13 bytes, including the newline */
swi 0 /* software interrupt executes system call */
end:
mov r7, 1 /* sys_exit */
swi 0
.data /* Data section holds variables and constants */
msg:
.ascii "Hello, World\n"
Assembling and executing the program, as before:
as -o temp.o ex2.s
ld -o ex2 temp.o
./ex2 # "Hello, World"
echo $? # 13 = return value of `sys_write' was in R0
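The Hello World program only exercises write(2). As a hedged sketch of the read(2) side, the program below reads up to 64 bytes from stdin and writes back whatever it received. The buffer name `buf`, its size, and the file name `ex3.s` are assumptions chosen for illustration.

```asm
.text
.global _start
.syntax unified
_start:
mov r7, 3 /* sys_read() has syscall number 3 */
mov r0, 0 /* first param fd = stdin */
ldr r1, =buf /* second param buf -> address of the buffer */
mov r2, 64 /* third param count = size of the buffer */
swi 0 /* R0 = number of bytes read, or a negative error code */
cmp r0, 0
ble end /* end of input or error: nothing to write */
mov r2, r0 /* count = number of bytes actually read */
mov r7, 4 /* sys_write() */
mov r0, 1 /* fd = stdout */
ldr r1, =buf
swi 0
end:
mov r0, 0
mov r7, 1 /* sys_exit(0) */
swi 0
.bss /* Bss section holds uninitialised data */
buf:
.space 64
```

Assembled and linked like the earlier examples, `echo hi | ./ex3` should print the line back. The program performs a single read, so input longer than the buffer is cut off at 64 bytes.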
Fibonacci numbers
The Fibonacci sequence was well-known to Indian Hindu mathematicians centuries before Fibonacci (13th c.) discovered it. Virahanka (6th c.) mentions the sequence in his discussion of Sanskrit prosody. Other pre-Fibonacci Hindu mathematicians, Gopala and Hemachandra, refer to Virahanka as well.
Sample code that computes a Fibonacci number in ten loop iterations and returns it as the exit status:
.text
.global _start
.syntax unified
_start:
mov r10, 10 /* counter = 10 loop iterations */
mov r1, 0 /* fib(0) = 0 */
mov r2, 1 /* fib(1) = 1 */
fib:
add r3, r1, r2 /* fib(n) = fib(n-2) + fib(n-1) */
mov r1, r2 /* R1 = fib(n-1), which becomes the new fib(n-2) */
mov r2, r3 /* R2 = fib(n), which becomes the new fib(n-1) */
subs r10, r10, 1 /* R10 = R10 - 1. Suffix 's' sets NZCV flags as well. */
beq end /* If 'Z' flag is set, branch to 'end' */
bal fib /* Branch always to 'fib' */
end:
mov r0, r3 /* Final result is in R3 */
mov r7, 1
swi 0
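Executing the above code and then running echo $? should print 89, the value reached after ten iterations (fib(11) when counting from fib(0) = 0).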
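The pair of branches at the bottom of the loop can be folded into one. The variant below is a sketch of the same computation that loops with a single BNE and simply falls through to the exit code once the counter reaches zero. It is an alternative arrangement for comparison, not a correction of the listing above.

```asm
.text
.global _start
.syntax unified
_start:
mov r10, 10 /* counter = 10 loop iterations, as before */
mov r1, 0 /* fib(0) = 0 */
mov r2, 1 /* fib(1) = 1 */
fib:
add r3, r1, r2 /* next term = sum of the previous two */
mov r1, r2
mov r2, r3
subs r10, r10, 1 /* decrement the counter and set the flags */
bne fib /* Z = 0: counter is not yet zero, loop again */
mov r0, r3 /* reached only when the counter hits zero */
mov r7, 1
swi 0
```

The exit status should again be 89. Note that the shell only sees the low 8 bits of an exit status, so this way of reporting the result stops being faithful once the value exceeds 255.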
TODO: Discuss subroutines, branching-with-linking, etc.