The Stack

The stack is used by the CPU to store return addresses when subroutines are called.

Imagine you can't remember where you have just been. You'd have to write down each place you leave and, if you're visiting several locations, put the notes onto a stack. Your stack pointer tells you where the top of that stack is. A microcontroller does just that: when a subroutine is called, it leaves the place in flash where it was working and saves the return address on the stack.

The stack needs a stack pointer (SP) and space in SRAM (the stack pointer must point above the first SRAM address). When a return address is stored, the SP is post-decremented: each byte is written to the address the SP points to, and then the SP is decremented. In other words, the stack grows towards smaller SRAM addresses. You get the biggest possible stack by initialising the SP to RAMEND, the last SRAM address. The stack can then grow all the way down to the first SRAM address.

Here is how the stack is changed by rcall and ret.

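.org 0x00
ldi r16, low(RAMEND)    ; SPL and SPH are I/O registers: load the value
out SPL, r16            ; into r16 first, then write it with out
ldi r16, high(RAMEND)   ; (SPH only exists on devices with more than
out SPH, r16            ; 256 bytes of data space)
rcall subrtn_1          ; at address 0x04, so the return address is 0x05
stop:
rjmp stop               ; 0x05: execution continues here after subrtn_1

.org 0x100
subrtn_1:
rcall subrtn_2          ; at address 0x100, so the return address is 0x101
ret

.org 0x140
subrtn_2:
ret

Each layer in the tables below stands for one 16-bit return address, i.e. two bytes of SRAM (on devices with a 16-bit program counter).

Stack state before init:

Layer     Stack value   SP value   Comment
layer 0:  ---           SP = ???   Stack before init
layer 1:  ---
layer 2:  ---

Then the SP is set to RAMEND:

Layer     Stack value   SP value   Comment
layer 0:  ---           <- SP      Stack after init
layer 1:  ---
layer 2:  ---

Stack state after rcall subrtn_1:

Layer     Stack value   SP value   Comment
layer 0:  0x0005                   return address
layer 1:  ---           <- SP      SP = SP - 2 (one layer)
layer 2:  ---

Stack state after rcall subrtn_2:

Layer     Stack value   SP value   Comment
layer 0:  0x0005                   return address
layer 1:  0x0101                   return address
layer 2:  ---           <- SP      SP = SP - 2 (one layer)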

When the return is executed, the return address is popped from the stack and the SP is incremented. In the example, when returning from subrtn_2, the micro jumps to 0x101 (the ret instruction in subrtn_1) and the Stack Pointer points to stack layer 1 again. I didn't make a table for that as it should be easy to understand now.

The stack can also be used to pass arguments to subroutines using push and pop. If a subroutine has a 16-bit argument, passing it would look like this:

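push r16            ; push the 16-bit argument r16:r17
push r17            ;
rcall set_TCNT1     ; and call the subroutine (rcall puts the return
                    ; address on top of the argument)

set_TCNT1:          ; our subroutine writes its 16-bit argument to the
                    ; Timer 1 counter register
pop r19             ; the return address is on top of the stack, so take
pop r18             ; it off first (r18 and r19 are scratch registers here)
pop r17             ; now pop the argument from the stack
pop r16             ; (reversed order!)
out TCNT1H, r17     ; and use it
out TCNT1L, r16     ;
push r18            ; put the return address back in the original order
push r19            ;
ret                 ; now it returns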

It's important to keep the push and pop instructions balanced. ret simply pops the two bytes on top of the stack and jumps there. If a value is pushed on the stack as an argument, followed by a subroutine call, the next ret can result in unexpected behavior if the subroutine popped too many arguments or none at all, because the bytes on top of the stack are then not the return address. One push, one pop. This bug is often hard to find.
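The same rule applies when a subroutine saves the registers it is going to overwrite. The following is only a sketch: the name delay_short, the loop counts and the choice of r16 and r17 are assumptions made for illustration, not part of the example above.

```asm
; Sketch only: delay_short, the loop counts and the registers are assumptions.
delay_short:
    push r16            ; save r16 (first push)
    push r17            ; save r17 (second push)
    ldi  r16, 0x10      ; outer loop count (arbitrary value)
delay_outer:
    ldi  r17, 0xFF      ; inner loop count (arbitrary value)
delay_inner:
    dec  r17
    brne delay_inner
    dec  r16
    brne delay_outer
    pop  r17            ; restore in reverse order: last pushed, first popped
    pop  r16
    ret                 ; two pushes, two pops: the return address is on top again
```

If one of the two pop instructions were missing, ret would take a saved register value as part of the return address.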

Why can't the subroutine just use r16:r17 instead of the stack as a base for passing arguments? Good question. By using the stack, you can use any register to push the value on the stack. You're not limited to r16 and r17. You can also push an argument and then use the registers to calculate the next one (file systems for example need lots of registers for calculations). You can also use a heap to pass arguments. This has the advantage that you can't mess up your return addresses.
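One way to keep arguments away from the return addresses is a second, software-managed stack in SRAM, addressed through the Y pointer. This is a minimal sketch under stated assumptions: ARG_STACK_TOP and its value are hypothetical (any SRAM address that leaves enough room for the hardware stack above it would do), and the label done is made up for the example.

```asm
; Sketch only: ARG_STACK_TOP, its value and the label "done" are assumptions.
.equ ARG_STACK_TOP = 0x00A0      ; hypothetical SRAM address below the hardware stack

    ldi   YL, low(ARG_STACK_TOP) ; Y is the argument stack pointer
    ldi   YH, high(ARG_STACK_TOP)

    st    -Y, r16                ; "push" low byte (pre-decrement Y, then store)
    st    -Y, r17                ; "push" high byte
    rcall set_TCNT1              ; the return address goes on the hardware stack only
done:
    rjmp  done                   ; the caller would continue here

set_TCNT1:
    ld    r17, Y+                ; "pop" high byte (load, then post-increment Y)
    ld    r16, Y+                ; "pop" low byte
    out   TCNT1H, r17            ; write the high byte first
    out   TCNT1L, r16
    ret                          ; the arguments never touched the hardware stack
```

The price is that Y is tied up as the argument stack pointer, and the two stacks must not grow into each other.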

Let's take a closer look at how the return address is stored on the stack by simulating it in AVR Studio. I've not included images of this in order to save space, but it's quite simple. This is the code for finding out how return addresses are pushed on the stack:

.include "2313def.inc"

.org 0x0000             ; reset interrupt vector
rjmp reset

reset:                  ; initialisation:
ldi r16, low(RAMEND)    ; stack pointer to RAMEND
out SPL, r16
rcall dummy             ; this will push 0x0004 on the stack (note 1)

.org 0x0123
dummy:                  ; first dummy routine
rcall dummy2            ; address on stack: 0x0124 [Break point]
ret                     ; the ret is at address 0x0124

dummy2:                 ; second dummy routine
ret                     ; [Break point]

Note 1: rcall dummy pushes 0x0004 on the stack because it is preceded by three instructions (rjmp, ldi, out) that use one word of code space each. The rcall itself therefore sits at 0x0003, and the address of the next instruction, which is the return address, is 0x0004.

The simulator is set up as follows: 2313 @ 1MHz, one memory window (Data) for viewing SRAM contents.

Now run the code. After the first break the SRAM will hold 0x04 at address 0xDF and 0x00 at address 0xDE. That means that the low byte of the address (which is 0x04) is at the higher address.

After the second rcall (second break) the return address to dummy's ret is also pushed on the stack: 0x24 at address 0xDD and 0x01 at address 0xDC.

The low address byte is pushed first, as the simulation shows. If you wanted to do calculations on that address, you'd have to pop the high byte first. Beware: Messing with the stack is not easy and should be done with caution!
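As an illustration of such a calculation, here is a sketch of a subroutine that makes its caller skip one word of inline data. It assumes a device with a 16-bit program counter (two-byte return addresses, as in the simulation above); the name skip_word and the data value are hypothetical, and Z is overwritten in the process.

```asm
; Sketch only: skip_word and the data word are assumptions; Z is clobbered.
    rcall skip_word
    .dw   0x1234        ; one word of inline data that must not be executed
    nop                 ; execution resumes here after skip_word returns
    ; (the caller's code would continue here)

skip_word:
    pop   ZH            ; the high byte of the return address comes off first
    pop   ZL            ; then the low byte
    adiw  ZL, 1         ; return addresses are word addresses: +1 skips one word
    push  ZL            ; push the low byte first,
    push  ZH            ; then the high byte, just as rcall did
    ret                 ; returns to the nop instead of the data word
```

Two pops and two pushes keep the stack balanced, so ret finds a complete (modified) return address on top.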