
MIPS cheatsheet

This is a cheatsheet for 32-bit MIPS. It is worth mentioning that MIPS is a RISC (Reduced Instruction Set Computer) architecture with 32 general-purpose registers and 3 instruction formats, which are covered in more detail below.

The MIPS architecture uses 32-bit memory addresses and 32-bit data words (4 bytes). MIPS can be either little-endian or big-endian; this post assumes little-endian whenever data in memory is shown.

Speaking of memory: it’s important to know that the address of a word read from or written to memory must be word-aligned (divisible by 4). With that, we have a good grasp of the basics of MIPS.
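To make the little-endian layout and the alignment rule concrete, here is a minimal Python sketch (Python is used only as a calculator here, not as part of MIPS). The helper names `to_little_endian_bytes` and `is_word_aligned` and the sample addresses are made up for illustration.

```python
import struct

def to_little_endian_bytes(word):
    """Return the 4 bytes of a 32-bit word in little-endian memory order."""
    return struct.pack("<I", word & 0xFFFFFFFF)

def is_word_aligned(address):
    """A word access is aligned when the address is divisible by 4."""
    return address % 4 == 0

# The least significant byte (0x0D) sits at the lowest address.
print(to_little_endian_bytes(0x1337F00D).hex())  # 0df03713

# Hypothetical addresses: the first is word-aligned, the second is not.
print(is_word_aligned(0x10010004), is_word_aligned(0x10010006))  # True False
```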

Before we dive in, it is worth mentioning that most of the examples in this post are from Digital Design and Computer Architecture, 2nd edition, by David Harris and Sarah Harris, which I highly recommend for those interested in computer architecture.


Registers

Name       Number  Use
$0         0       constant 0
$at        1       assembler temporary
$v0–$v1    2–3     function return value
$a0–$a3    4–7     function arguments
$t0–$t7    8–15    temporary variables
$s0–$s7    16–23   saved variables
$t8–$t9    24–25   temporary variables
$k0–$k1    26–27   operating system (OS) temporaries
$gp        28      global pointer
$sp        29      stack pointer
$fp        30      frame pointer
$ra        31      function return address

Instruction formats

There are 3 types/formats for instructions in MIPS:

  • R-Type
  • I-Type
  • J-Type

R-Type

op      rs      rt      rd      shamt   funct
6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

  • op: opcode/operation (equals 0 in R-type)
  • rs: source register
  • rt: second source register (since t comes after s in alphabetical order)
  • rd: destination register
  • shamt: shift amount (for shift instructions)
  • funct: function (holds the actual functionality of the instruction for R-Type)

Examples:

add $s0, $s1, $s2

             op      rs      rt      rd      shamt   funct
re-arranged  000000  $s1     $s2     $s0     00000   add
decimal      0       17      18      16      0       32
binary       000000  10001   10010   10000   00000   100000
size         6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

machine code: 0x02328020

sll $s0, $s2, 2

         op      rs      rt      rd      shamt   funct
fields   000000  00000   18      16      2       000000
size     6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

Notes:

  • the funct values (32 for add, 0 for sll) come from the MIPS manual, which assigns a unique code to every instruction
  • the register numbers are taken from the registers table above

So the machine code of the add instruction is 0x02328020 in hex, and this is how it’s stored in the executable file, or in memory once the operating system has loaded it for execution!
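The field packing above can be checked with a few shifts. This is a minimal Python sketch, not an assembler; the function name `encode_r` and the constants `S0`, `S1`, `S2` are illustrative choices, with the register numbers taken from the registers table.

```python
def encode_r(rs, rt, rd, shamt, funct, op=0):
    """Pack the six R-type fields into one 32-bit instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# Register numbers from the registers table.
S0, S1, S2 = 16, 17, 18

# add $s0, $s1, $s2  ->  rs = $s1, rt = $s2, rd = $s0, funct = 32
add_word = encode_r(rs=S1, rt=S2, rd=S0, shamt=0, funct=32)
# sll $s0, $s2, 2    ->  rs unused (0), rt = $s2, rd = $s0, shamt = 2, funct = 0
sll_word = encode_r(rs=0, rt=S2, rd=S0, shamt=2, funct=0)

print(f"0x{add_word:08X}")  # 0x02328020
print(f"0x{sll_word:08X}")  # 0x00128080
```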

I-Type

op      rs      rt      immediate
6 bits  5 bits  5 bits  16 bits

Examples:

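addi $s0, $s1, 5

        op      rs      rt      immediate
fields  8       17      16      5
size    6 bits  5 bits  5 bits  16 bits

machine code: 0x22300005

sw $s1, 4($t1)

        op      rs      rt      immediate
fields  43      9       17      4
size    6 bits  5 bits  5 bits  16 bits

machine code: 0xAD310004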
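The two I-Type encodings above can be reproduced with the same shift-and-or approach. A minimal Python sketch follows; `encode_i` is a made-up helper name, and the third call (a negative immediate) is an extra illustration that is not in the examples above.

```python
def encode_i(op, rs, rt, imm):
    """Pack the four I-type fields; the immediate is truncated to 16 bits."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

# addi $s0, $s1, 5   ->  op = 8,  rs = $s1 (17), rt = $s0 (16), imm = 5
print(f"0x{encode_i(8, 17, 16, 5):08X}")     # 0x22300005
# sw $s1, 4($t1)     ->  op = 43, rs = $t1 (9),  rt = $s1 (17), imm = 4
print(f"0x{encode_i(43, 9, 17, 4):08X}")     # 0xAD310004
# addi $sp, $sp, -12 ->  a negative immediate is stored in two's complement
print(f"0x{encode_i(8, 29, 29, -12):08X}")   # 0x23BDFFF4
```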

J-Type

A special format for the J (Jump) and JAL (Jump and Link) instructions.

NOTE: the JR (Jump Register) instruction is an R-Type instruction that uses only the rs operand.

op      address
6 bits  26 bits

Example:

jal loop

        op      address
fields  3       0x100028
size    6 bits  26 bits

machine code: 0x0C100028

Note that the label address is encoded using pseudo-direct addressing, which makes it possible to fit a 32-bit address into only 26 bits. It is discussed in the next section.


Addressing modes

MIPS has 5 addressing modes, i.e. 5 different ways an instruction can specify its operands or a target address:

Register-only

All source and destination operands are registers (R-Type instructions use it).

Immediate addressing

A 16-bit immediate is used together with register operands (e.g. addi and lui).

Base-addressing

Used by memory access instructions (e.g. lw and sw).

memory address = base register + sign-extended 16-bit immediate offset

Example: lw $s0, 8($s1) reads from address = $s1 (base pointer) + 8
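The sign extension matters when the offset is negative. Below is a minimal Python sketch of the address calculation; the helper names and the base address 0x10010000 are assumptions chosen for illustration.

```python
def sign_extend_16(imm):
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def effective_address(base, imm16):
    """base register value + sign-extended 16-bit offset, wrapped to 32 bits."""
    return (base + sign_extend_16(imm16)) & 0xFFFFFFFF

# lw $s0, 8($s1) with a hypothetical $s1 = 0x10010000
print(hex(effective_address(0x10010000, 8)))       # 0x10010008
# lw $s0, -4($s1): the 16-bit field holds 0xFFFC
print(hex(effective_address(0x10010000, 0xFFFC)))  # 0x1000fffc
```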

PC-relative

Conditional branch instructions (e.g. beq, bne, …) use it to compute the new value of the PC (Program Counter).

Branch Target Address (BTA) = (PC + 4) + (sign-extended immediate × 4)

The immediate counts instructions, not bytes, so if it is negative the label is above the current instruction.

Example:

0xA4       beq  $t0, $0, else
0xA8       addi $v0, $0, 1
0xAC       addi $sp, $sp, 8
0xB0       jr   $ra
0xB4 else: addi $a0, $a0, -1
0xB8       jal  factorial

Assuming PC = 0xA4, the BTA is (0xA4 + 4) + 3 instructions = 0xA8 + 3 × 4 = 0xB4. In other words, the target is 3 instructions after the instruction at 0xA8, so the immediate field holds 3. (With an immediate of -5, the target would be 5 instructions before 0xA8.)
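The branch arithmetic for this example can be verified in a few lines. This is a minimal Python sketch; `branch_target` and `branch_immediate` are illustrative names, not part of any MIPS toolchain.

```python
def sign_extend_16(imm):
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def branch_target(pc, imm16):
    """BTA = (PC + 4) + sign-extended immediate * 4."""
    return (pc + 4 + (sign_extend_16(imm16) << 2)) & 0xFFFFFFFF

def branch_immediate(pc, target):
    """Inverse: the 16-bit field an assembler would store for a given label."""
    return ((target - (pc + 4)) >> 2) & 0xFFFF

print(hex(branch_target(0xA4, 3)))            # 0xb4  (the else label)
print(branch_immediate(0xA4, 0xB4))           # 3
print(hex(branch_target(0xA4, -5 & 0xFFFF)))  # 0x94  (5 instructions before 0xA8)
```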

Pseudo-direct

Here the address is specified in the instruction itself. It is used by the J and JAL instructions (J-Type instructions). Recall from the J-Type example that the instruction has only 26 bits to store the address, while the PC needs a 32-bit address!

This is the algorithm for deriving the 26-bit address from the 32-bit address:

1- Get the address of the instruction at the label, the Jump Target Address (JTA).

2- Discard the 2 least significant bits, JTA[1:0]. Instructions are word-aligned (4 = (0100)₂, 8 = (1000)₂, 12 = (1100)₂), so the 2 LSBs are always zero.

3- Discard the 4 most significant bits, JTA[31:28]. They are taken from the upper 4 bits of PC + 4 instead, which means the target must be in the same 256 MB region as the jump; this constrains the range of a jump.

Example:

0x0040005C jal sum
...
0x004000A0 sum: add $v0, $a0, $a1

The JTA for JAL instruction here is 0x004000A0 and here’s the conversion of it to 26-bit address by applying the above algorithm:

JTA (hex)       0x004000A0
JTA (binary)    0000 0000 0100 0000 0000 0000 1010 0000
bits 27:2       00 0001 0000 0000 0000 0010 1000
26-bit address  0x0100028

Note that the remaining 26 bits are grouped into hex digits from right to left. To reverse the process, append 00 on the right and prepend the 4 most significant bits of the PC (which are zero here) to get the original address 0x004000A0 back!
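Both directions of the conversion, plus the final jal encoding from the J-Type section, can be sketched in Python as follows. The function names are made up for illustration, and the upper bits are taken from PC + 4 as is standard for MIPS jumps.

```python
def jta_to_field(jta):
    """Drop JTA[1:0] and JTA[31:28], keeping the 26 bits in between."""
    return (jta >> 2) & 0x03FFFFFF

def field_to_jta(field, pc):
    """Rebuild the 32-bit target: upper 4 bits of PC + 4, the field, then 00."""
    return ((pc + 4) & 0xF0000000) | (field << 2)

def encode_j(op, field):
    """Pack the 6-bit opcode and the 26-bit address field."""
    return (op << 26) | field

field = jta_to_field(0x004000A0)
print(f"0x{field:07X}")                            # 0x0100028
print(f"0x{field_to_jta(field, 0x0040005C):08X}")  # 0x004000A0
print(f"0x{encode_j(3, field):08X}")               # 0x0C100028 (jal, op = 3)
```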


Variables and Arrays

Variables

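To store a variable, you can load it “immediately” (I-Type) into a register if it is 16 bits or less. Note that addi sign-extends its immediate, so for a value with bit 15 set, like the one here, ori (which zero-extends) is the safe choice:

ori $s0, $0, 0xF00D # $s0 = 0xF00D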

or if it’s 32 bits:

lui $s0, 0x1337 # $s0 = 0x13370000
ori $s0, $s0, 0xF00D # $s0 |= 0xF00D = 0x1337F00D
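The lui/ori pair works because lui fills the upper half of the register and ori fills the lower half without touching the upper one. A minimal Python model of the two instructions (the function names simply mirror the mnemonics):

```python
def lui(imm16):
    """Load upper immediate: place the 16-bit value in bits 31:16, zero the rest."""
    return (imm16 & 0xFFFF) << 16

def ori(value, imm16):
    """OR immediate: the 16-bit immediate is zero-extended before the OR."""
    return value | (imm16 & 0xFFFF)

s0 = lui(0x1337)
print(f"0x{s0:08X}")  # 0x13370000
s0 = ori(s0, 0xF00D)
print(f"0x{s0:08X}")  # 0x1337F00D
```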

or you can store it in memory in the data section and load it into a register:

.data
  num: .word 0x1337F00D

.text
.globl main

main:
  la $t0, num # $t0 = &num
  lw $s0, ($t0) # $s0 = num = 0x1337F00D

Note that .word in the data section is an assembler directive that sets the size of the variable (word = 4 bytes = 32 bits). Other common directives are:

  • .space n (reserves n bytes)
  • .byte (1 byte = 8 bits)
  • .word (4 bytes)
  • .asciiz (null-terminated string)
  • .ascii (string without null terminator)
  • .align n (aligns the next data on a 2^n byte boundary)

Arrays

The array is stored in memory in the data section:

.data
  arr: .word 1, 2, 3

.text
.globl main

main:
  la $s0, arr    # $s0 = base address of arr
  lw $t0, 0($s0) # $t0 = arr[0] = 1
  lw $t1, 4($s0) # $t1 = arr[1] = 2
  lw $t2, 8($s0) # $t2 = arr[2] = 3

exit:
  li $v0, 10
  syscall

Note that a string is nothing but an array of characters in memory!
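The offsets 0, 4 and 8 in the array example come from index × 4, since each .word takes 4 bytes. The Python sketch below models the array as little-endian bytes; `element_offset`, `load_word` and `arr_bytes` are hypothetical names used only for this illustration.

```python
import struct

WORD_SIZE = 4  # bytes per .word

def element_offset(index):
    """Byte offset of arr[index] from the base address of a word array."""
    return index * WORD_SIZE

def load_word(memory, offset):
    """Read one little-endian 32-bit word, like lw with a byte offset."""
    return struct.unpack_from("<I", memory, offset)[0]

# Hypothetical memory image of `arr: .word 1, 2, 3`.
arr_bytes = struct.pack("<3I", 1, 2, 3)

for i in range(3):
    # Prints the index, its byte offset, and the loaded value.
    print(i, element_offset(i), load_word(arr_bytes, element_offset(i)))
```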


Multiplication and division

Since multiplying two 32-bit registers can produce a 64-bit value, and dividing two numbers produces both a quotient and a remainder, the mult and div instructions write their results to 2 special-purpose registers, hi and lo:

mult $s0, $s1 # hi:lo = $s0 * $s1 (64-bit product)
div  $s0, $s1 # lo = quotient, hi = remainder
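A small Python model shows how the results land in the two registers. On MIPS, div leaves the quotient in lo and the remainder in hi. The sketch assumes signed operands and a quotient truncated toward zero; the function names mirror the mnemonics and each returns a (hi, lo) pair.

```python
MASK32 = 0xFFFFFFFF

def mult(a, b):
    """Signed multiply: return (hi, lo), the two halves of the 64-bit product."""
    product = (a * b) & 0xFFFFFFFFFFFFFFFF
    return product >> 32, product & MASK32

def div(a, b):
    """Signed divide: return (hi, lo) = (remainder, quotient), truncating toward zero."""
    quotient = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        quotient = -quotient
    remainder = a - quotient * b
    return remainder & MASK32, quotient & MASK32

print(mult(0x10000, 0x10000))  # (1, 0): the product 2**32 does not fit in lo alone
print(div(17, 5))              # (2, 3): hi = remainder 2, lo = quotient 3
```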

If / else conditions

The main observation here is that the low-level branch condition is the inverse of the high-level condition.

High-level code:

if (x == y):
  x = y << 2
else:
  x = y

Low-level code

..
  addi $s0, $0, 0     # x = 0
  addi $s1, $0, 1     # y = 1
  bne  $s0, $s1, else # if (x != y) go to else
  sll  $s0, $s1, 2    # x = y << 2
  j    done           # skip the else block

else:
  add  $s0, $0, $s1   # x = y

done:
..

Note that this code contains a conditional branch (bne) for the comparison and an unconditional jump (j) that skips the else block, since assembly naturally executes from top to bottom!


Loops

In programming, a loop consists of 4 main pieces:

  • Initialization
  • Condition
  • Code that gets repeated
  • Counter update

and so does a loop in assembly!

High-level code:

int pow = 1;         // (initialization)
int x = 0;

while (pow != 128) { // (condition)
  pow = pow * 2;     // (code that gets repeated)
  x = x + 1;         // (counter update)
}

Low-level code

..
  addi $s0, $0, 1    # pow = 1 (initialization)
  addi $s1, $0, 0    # x = 0
  addi $t0, $0, 128  # t0 = 128 for comparison

while:
  beq $s0, $t0, done # if pow == 128, exit while loop (condition)
  sll $s0, $s0, 1    # pow = pow * 2 (code that gets repeated)
  addi $s1, $s1, 1   # x = x + 1 (counter update)
  j while

done:
..

Functions

There are no call/ret instructions in MIPS assembly like in Intel x86, but fortunately we have labels! With labels, Jump and Link (JAL) and Jump Register (JR) we can do magic!

Function call

Jump and Link (JAL) copies the address of the next instruction (PC+4) into the $ra register and jumps to the address of the label. When returning, JR $ra copies the value of $ra back into the PC, so the program continues execution from where it left off.

This is an implementation for a basic function that adds two numbers and returns the result:

High-level code:

int add_func(int x, int y) {
  return x + y;
}

int main() {
  int a = add_func(2, 4);
}

Low-level code:

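.text
.globl main
add_func:
  add $v0, $a0, $a1 # return value = x + y
  jr $ra            # return (PC = $ra)

main:
  addi $a0, $0, 2   # first argument = 2
  addi $a1, $0, 4   # second argument = 4
  jal add_func      # call function ($ra = PC + 4, PC --> add_func)
  add $s0, $0, $v0  # a = return value
  li $v0, 10        # exit system call, so main does not run past its end
  syscall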
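To trace what jal and jr do to the PC and $ra, here is a minimal Python model. The addresses are assumptions: the text segment is taken to start at 0x00400000, which puts add_func at 0x00400000 and the jal instruction at 0x00400010 in the listing above.

```python
def jal(state, target):
    """Jump and link: save the return address in $ra, then jump."""
    state["ra"] = state["pc"] + 4
    state["pc"] = target

def jr(state, reg="ra"):
    """Jump register: copy a register (here $ra) back into the PC."""
    state["pc"] = state[reg]

# Hypothetical starting point: the PC is at the jal instruction.
state = {"pc": 0x00400010, "ra": 0}

jal(state, 0x00400000)                     # call add_func
print(hex(state["pc"]), hex(state["ra"]))  # 0x400000 0x400014

jr(state)                                  # return to the instruction after jal
print(hex(state["pc"]))                    # 0x400014
```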

Notes:

  • saved registers ($s0-$s7) must hold the same values after the call as before it
  • temporary registers ($t0-$t9) can be changed inside the function
  • arguments should be passed in the $a0-$a3 registers
  • return values should be placed in the $v0-$v1 registers

Stack frames

Each function should have its own stack frame, used for purposes like saving the saved registers ($s0-$s7) on entry and restoring their values at the end. Two registers are used to control the stack:

  • $fp: base of the current stack frame
  • $sp: top of the stack

Note that $fp > $sp since the stack grows towards lower memory addresses

Example of allocating and de-allocating stack frame of a function:

 addi $sp, $sp, -12 # allocation
 sw   $s0, 8($sp) # saves $s0
 sw   $t0, 4($sp) # saves $t0
 sw   $t1, 0($sp) # saves $t1
 ..
 lw   $t1, 0($sp) # restores $t1
 lw   $t0, 4($sp) # restores $t0
 lw   $s0, 8($sp) # restores $s0
 addi $sp, $sp, 12 # de-allocation
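The save/restore sequence above can be traced with a tiny Python model of the stack. Everything here is illustrative: the starting $sp value, the register contents, and the `sw`/`lw` helpers (which model only $sp-relative accesses) are assumptions.

```python
# Hypothetical register values and starting stack pointer.
regs = {"sp": 0x7FFFFFFC, "s0": 11, "t0": 22, "t1": 33}
memory = {}  # address -> 32-bit word

def sw(reg, offset):
    """Store a register at offset($sp)."""
    memory[regs["sp"] + offset] = regs[reg]

def lw(reg, offset):
    """Load a register from offset($sp)."""
    regs[reg] = memory[regs["sp"] + offset]

def call_with_frame():
    regs["sp"] -= 12                          # allocation: the stack grows downward
    sw("s0", 8); sw("t0", 4); sw("t1", 0)     # save the registers
    regs["s0"] = regs["t0"] = regs["t1"] = 0  # the function body overwrites them
    lw("t1", 0); lw("t0", 4); lw("s0", 8)     # restore the registers
    regs["sp"] += 12                          # de-allocation

call_with_frame()
print(regs["s0"], regs["t0"], regs["t1"], hex(regs["sp"]))  # 11 22 33 0x7ffffffc
```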

System calls

To perform tasks like taking user input, printing on screen or exiting the program, we use system calls as shown in the table:

Service       System call code  Arguments                   Result
print_int     1                 $a0 = integer
print_float   2                 $f12 = float
print_double  3                 $f12 = double
print_string  4                 $a0 = string
read_int      5                                             integer (in $v0)
read_float    6                                             float (in $f0)
read_double   7                                             double (in $f0)
read_string   8                 $a0 = buffer, $a1 = length
sbrk          9                 $a0 = amount
exit          10

The system call code is loaded into $v0, followed by a syscall instruction to execute it.

Example of exit system call to exit a program:

li $v0, 10
syscall

Another example, printing the word “foo” on screen:

.data
	foo: .asciiz "foo"

.text
.globl main

main:
	la   $a0, foo
	li   $v0, 4
	syscall

exit:
	li $v0, 10
	syscall

output: foo

This post is licensed under CC BY 4.0 by the author.