My New Hugo Site

How the Call Stack Works

On the quest to understanding how the xv6 boot, I got sidetracked to how to get from assembly to C. We know C is a high level language, and it’s closest to assembler amongst all high level languages. In the cake of abstractions, at the lowest it’s machine code.

  • C
  • assembler
  • machine code CPU only understand machine code, but they’re very not human readable. Fortunately we use assembly language to represent machine code, and the mapping is 1-to-1.

Although assembler is comprehensible, it’s still too troublesome to write complex programs. Certainly a “hello world” is within human’s mental capacity, we need something more abstract to code more complex things. I have heard fairly tales about Ken Thomson or Dennis Ritchie failed to program an operating system until C was invented by Dennis, because coding in assembly was just too complicated.

The source code we are following is the xv6 riscv variant. https://github.com/mit-pdos/xv6-riscv For curious reading, this is a detailed video (series) walking through this code repository. https://www.youtube.com/watch?v=KkenLT8S9Hs For this blog post, I’m going to assume these points are understood:

  1. The files and functions involved during startup.

This comment in entry.S line 8 interests me very much

# set up a stack for C.
# stack0 is declared in start.c,

Why we need to setup a stack for C? Is a stack required for the C language?

After giving some thoughts about these two questions, my answer is yes. But this requirement is not stated in the C language specification. source: https://stackoverflow.com/a/51579770 Instead, a stack is required if we want function calls.

C or ASM                         mem addr   instruction
-------------------------------------------------------------------
entry.S:19 call start            0x80000016 jal 0x80000064 <start>
...                        
start.c:15                       0x80000064 addi sp,sp,-16
start.c:15                       0x80000066 sd ra,8(sp)
start.c:15                       0x80000068 sd s0,0(sp)
start.c:15                       0x8000006a addi s0,sp,16
start.c:18                       0x8000006c csrr a5,mstatus
...

As we look into the disassembled instructions, we highlight the sp, ra, and s0 registers. These are the ABI names of the riscv registers. For our understand purpose without complicating things, let’s take ABI names as a naming convention to the general purpose riscv registers. They’re “general purpose” because a program may use these registers however they want, as oppose to specialized registers like CSR register, vector registers, etc.

ra is conventionally the Return Address register. x1 being the actual register in machine. sp is conventionally the Stack Pointer register. x2 being the actual register. s0, some might name fp, is the Frame Pointer register. x8 being the actual register. Frame Pointer is not strictly necessary for a call stack. Its existence is to support stack unwinding during a stack trace when debugging. https://stackoverflow.com/a/74650674

top of stack, defined in start.c:11
(actual memory address is a function of toolchain
 for our demonstration let's say it's 0xffff)
--------------------------------------------------
0xffff  <nothing>
0xfffe  ra -> entry.S:19
0xfffd  fp
0xfffc  ra -> start.c:41
0xfffb  fp
0xfffa
0xfff0
0xffef

It is easier to understand the call stack if we focus on the ret instruction at 0x80000062. First of all, this is a pseudo instruction translated to jalr x0, x1, 0. We previously established x1 is holding the return address, a.k.a ra. If we look at the C code at start.c:66, we see that the timerinit() function is completed, and ready to “return” to the caller. The steps to perform this “return” action in C is translated to these steps in assembler:

  1. load this return address to ra register. The value of this return address is stored at “top of stack” in memory. This “top of stack” memory address is stored in sp register. sd ra,8(sp) instruction performed this action.
  2. “popping” the stack, meaning remove the top layer of the stack. addi sp,sp,16 is doing this action. Because the stack grows downwards, shrinking is technically done by increasing memory address.
  3. ret instruction, which jump to which ra points to. program counter register pc will be changed underlying.
C or ASM                         mem addr   instruction
-------------------------------------------------------------------
start.c:53                       0x8000001c addi sp,sp,-16
start.c:53                       0x8000001e sd ra,8(sp)
start.c:53                       0x80000020 sd s0,0(sp)
start.c:53                       0x80000022 addi s0,sp,16
...
start.c:66                       0x8000005c ld ra,8(sp)
start.c:66                       0x8000005e ld s0,0(sp)
start.c:66                       0x80000060 addi sp,sp,16
start.c:66                       0x80000062 ret
...
start.c:41 timerinit()           0x800000bc jal 0x8000001c <timerinit>

Now we shift our focus to start.c:53, how to jump to a difference function. We see C had done the previous steps in reverse.

I think the call stack as a “context switching” of the CPU. Each context is equivalent to a C function. We introduce an abstract data type named “stack” to keep track of difference context. By pushing to the stack, we jump to a new function. By popping from the stack, we return to the previous location. On a side track, you might have heard of the famous stackoverflow issue. This is done by jumping to self function address. Each jump will push a new level to the call stack. If the function doesn’t have an exit condition, the stack will exhaust all available memory. Here an exit condition means not jumping to self anymore. Then the program continues to hit a series of ret and clear the stack.

First call - start() call start enter start

Second call - timerinit() before timerinit() entered timerinit() leaving timerinit()

I couldn’t have learnt this without setting up breakpoints. Let’s discuss that in a different post.