Intro

In earlier articles, we talked about various parts of an ELF file and the many steps needed to create an executable ELF file that can run on your computer.

(Note: The steps are shown visually below; for the source code, check out the symbol table article in this series.)



     ┌────────────────────┐                        ┌─────────────────┐         ┌─────────────────┐
     │                    │                        │                 │         │                 │
     │   libarithmatic.c  │                        │ libarithmatic.h ├───────► │     main.c      │
     │                    │                        │                 │         │                 │
     └─────────┬──────────┘                        └─────────────────┘         └────────┬────────┘
               │                                                                        │
               │                                                                        │
               │ /* Compile + assemble */                                               │ /* Compile + assemble */
               │                                                                        │
               │                                                                        │
               ▼                                                                        ▼
    ┌─────────────────────┐                                                   ┌────────────────────┐
    │                     │                                                   │                    │
    │   libarithmatic.o   │                                                   │       main.o       │
    │                     │                                                   │                    │
    └─────────┬───────────┘                                                   └──────────┬─────────┘
              │                                                                          │
              │                                                                          │
              │                                                                          │
              │                                                                          │
              │                          /* Linking Magic */                             │
              └───────────────────────────────────┬──────────────────────────────────────┘
                                           ┌────────────────┐
                                           │                │
                                           │     calc       │
                                           │                │
                                           └────────────────┘

After completing this process, we have an ELF executable called calc. However, we didn’t directly include any library that contains definitions for functions like printf or scanf, which we used in our main.c file to input and output data. So, how does that work?

Answer: Dynamic linking (which is a complex topic, so for this article, we’ll just cover the basics).

If you use the file command on the calc executable, it will display interesting information such as dynamically linked and interpreter /lib64/ld-linux-x86-64.so.2.

> file calc
calc: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=65b929ceea26ea5e9fb8df1b15f2ab24b5c43ff6, for GNU/Linux 4.4.0, not stripped

Before we move forward, let’s discuss some basics about libraries and how Linux manages them. Linux supports two types of libraries: static and shared.

Static libraries are copied into a program at build time (during the linking phase), while dynamic libraries (also known as shared libraries) are loaded when the application is launched, and all symbol resolution and binding is done at runtime.

Dynamic or shared libraries can be handled in two ways: either you link your program with the shared library and let Linux load the library when the program runs (dynamic linking) [1], or you design your application so that it loads the library from a specified path and then calls a particular function within that library (dynamic loading) [2].
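To make the second option concrete, here is a minimal sketch of dynamic loading using the POSIX dlopen/dlsym/dlclose calls. The library name "libm.so.6" and the helper name call_cos_dynamically are assumptions chosen for illustration; they are not part of the calc project.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Dynamic loading: open a shared library by name at runtime, look up
 * one symbol, call it, and close the library again.
 * Returns 0 on success (result stored in *out), -1 on failure.
 * call_cos_dynamically is a hypothetical helper, not from calc. */
int call_cos_dynamically(double x, double *out)
{
    /* Assumption: "libm.so.6" is the math library name on a glibc system. */
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (handle == NULL)
        return -1;

    /* dlsym returns void *; POSIX allows converting it to a function pointer. */
    double (*cos_fn)(double) = (double (*)(double))dlsym(handle, "cos");
    if (cos_fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *out = cos_fn(x);
    dlclose(handle);
    return 0;
}
```

Calling call_cos_dynamically(0.0, &result) from a main function should leave 1.0 in result. On older glibc versions you may need to link with -ldl. With dynamic linking, by contrast, none of this code is written by hand: the dynamic linker does the equivalent work before and during the run.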

Now, looking at our calc binary, it’s evident that we’re using dynamic linking to handle functions like printf and scanf (since we are not loading any other library in our code). If you have any background in C programming, you’ve likely heard of the standard C library (libc) at least once. libc contains definitions for many standard functions used by many C programs, including printf and scanf, which we need in our calc executable.

So, when we run the calc executable, Linux maps calc into memory and hands control to the interpreter we saw in the file output, which is the dynamic linker. The dynamic linker figures out which libraries calc needs, loads them into the process memory space, and resolves the dynamic symbols calc refers to.

Typically, the resolving part of this process is done lazily. The libraries themselves are still loaded at startup, but a function symbol is only resolved the first time it is actually called. This approach is called lazy binding, and it helps speed up the startup of calc itself.

Since symbol resolution happens at runtime, the address of the resolved symbol needs to be stored somewhere so that we don’t have to resolve it every time it’s needed.

GOT (Global Offset Table) and PLT (Procedure Linkage Table)

Let’s visualize our situation: We need the address of the printf function to make a call, but we don’t know where in the process memory space the libc library will load, so we can’t determine the exact address for printf.

How can we call printf then?

One naive method would be to load libc into the process memory space, find the exact address for printf using libc’s base address, and then modify the .text section of calc to update the placeholder address of printf with the exact address. This seems straightforward and will work. However, code that gets patched like this can no longer be shared between processes: memory can only be shared when it is read-only and never modified. Every instance of calc, or of any other program or library patched this way, would need its own private copy of the code. This isn’t efficient because it would mean having many copies of the same code in memory.

Another approach is to add a level of indirection to this method. In this approach, we patch the .got and/or .got.plt section (which contains the Global Offset Table) of calc instead of its code. The idea is that when the library is loaded, the dynamic linker examines the relocation, finds the exact address of printf, and patches the .got and/or .got.plt entry as required. Then, the calc binary refers to these tables to point to the right place. This way, everything works seamlessly!

What does the PLT do here?

The PLT (Procedure Linkage Table) adds another level of indirection that uses the .got.plt section to keep track of function jumps. Essentially, the Global Offset Table (GOT) is a list of addresses of functions from libc, while the PLT is a list of small code stubs whose addresses are used as call targets in the .text section of the calc binary.

By utilizing this combination of the PLT and the .got.plt section, there’s no need to directly patch the .text section of the calc binary. This approach offers security benefits as it avoids modifying the executable code, which could potentially introduce vulnerabilities or trigger security mechanisms designed to detect such modifications.

Security benefits ++
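The PLT/GOT idea described above can be imitated in plain C with a function pointer. This is only a toy model under stated assumptions, not how the dynamic linker is implemented: got_slot, resolver_stub, real_square and square_via_plt are all hypothetical names, and a single pointer stands in for a whole table.

```c
/* A function-pointer type standing in for "some library function". */
typedef int (*func_t)(int);

/* Stand-in for the real library function (like printf inside libc). */
static int real_square(int x) { return x * x; }

static int resolve_count = 0;      /* how many times the slow path ran */
static int resolver_stub(int x);   /* forward declaration */

/* One "GOT slot": it starts out pointing at the resolver, not the target. */
static func_t got_slot = resolver_stub;

/* Slow path, playing the role of the dynamic linker: find the target,
 * patch the slot, then finish the call the caller originally asked for. */
static int resolver_stub(int x)
{
    resolve_count++;
    got_slot = real_square;   /* patch the table, not the calling code */
    return got_slot(x);
}

/* The "PLT stub": callers always come here, and it jumps through the slot. */
int square_via_plt(int x) { return got_slot(x); }

/* Lets a caller observe that resolution happened only once. */
int get_resolve_count(void) { return resolve_count; }
```

The first call to square_via_plt takes the slow path and patches got_slot; every later call goes straight to real_square. The calling code itself is never modified, which is the property the PLT and .got.plt combination gives calc.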

Analysis

It will become clearer when we examine the disassembly (which is my favorite part).

As usual, we’ll disassemble the main function first. We don’t have to check everything here, just focus on the printf and scanf call instructions.

0x000055555555518f <+38>:    call   0x555555555050 <printf@plt>
0x00005555555551b2 <+73>:    call   0x555555555060 <__isoc99_scanf@plt>

An interesting thing to note here is that the two call targets are just 0x555555555060 - 0x555555555050 = 16 bytes apart. I’m sure neither of these functions can be defined in just 16 bytes.

These are PLT stubs, the area that the .text section refers to for all calls into dynamically linked libraries.

(gdb) x/3i 0x555555555050
   0x555555555050 <printf@plt>:         jmp    QWORD PTR [rip+0x2fba]        # 0x555555558010 <printf@got.plt>
   0x555555555056 <printf@plt+6>:       push   0x2
   0x55555555505b <printf@plt+11>:      jmp    0x555555555020

(gdb) x/3i 0x555555555060
   0x555555555060 <__isoc99_scanf@plt>:         jmp    QWORD PTR [rip+0x2fb2]        # 0x555555558018 <__isoc99_scanf@got.plt>
   0x555555555066 <__isoc99_scanf@plt+6>:       push   0x3
   0x55555555506b <__isoc99_scanf@plt+11>:      jmp    0x555555555020

If you examine the first instruction in each stub, you’ll notice that they jump through memory locations 0x555555558010 and 0x555555558018, which are the Global Offset Table (GOT) entries. These entries are meant to hold the addresses of the actual functions from the dynamic libraries. You can inspect these locations to find where the first instruction in the PLT stub is directing the jump to.
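The GOT addresses that gdb prints in its comments can be checked by hand. On x86-64, a rip-relative operand is computed from the address of the next instruction, so the target is the instruction address plus its length plus the displacement. A small sketch of that arithmetic follows; the helper name rip_relative_target is hypothetical, and the instruction length of 6 bytes is read off the disassembly above (printf@plt to printf@plt+6).

```c
#include <stdint.h>

/* Target of a rip-relative operand: rip already points at the NEXT
 * instruction when the displacement is applied, so add the length too. */
uint64_t rip_relative_target(uint64_t insn_addr, uint64_t insn_len, int32_t disp)
{
    return insn_addr + insn_len + (uint64_t)(int64_t)disp;
}
```

For the printf stub, rip_relative_target(0x555555555050, 6, 0x2fba) gives 0x555555558010, and for the scanf stub, rip_relative_target(0x555555555060, 6, 0x2fb2) gives 0x555555558018, matching the printf@got.plt and __isoc99_scanf@got.plt addresses in the gdb output.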

(gdb) x/1x 0x555555558010
0x555555558010 <printf@got.plt>:        0x55555056

(gdb) x/1x 0x555555558018
0x555555558018 <__isoc99_scanf@got.plt>:        0x55555066

Alright, since we’re using these functions for the first time, the steps of finding the function’s address and storing it in the Global Offset Table haven’t been completed yet (which is part of the lazy binding logic). So each entry still points back into its own PLT stub, at the instruction right after the first jmp. Note that x/1x prints only the low 32 bits here; the full addresses are 0x555555555056 and 0x555555555066 respectively.

The next instruction in each PLT stub pushes a number onto the stack (0x2 for printf and 0x3 for scanf; this index tells the resolver which symbol is being requested), and then both PLT stubs jump to the same address: 0x555555555020. This is the code that triggers the dynamic symbol resolution process.

(gdb) x/3i 0x555555555020
=> 0x555555555020:      push   QWORD PTR [rip+0x2fca]        # 0x555555557ff0
   0x555555555026:      jmp    QWORD PTR [rip+0x2fcc]        # 0x555555557ff8
   0x55555555502c:      nop    DWORD PTR [rax+0x0]

It pushes the value stored at 0x555555557ff0 onto the stack (a pointer the dynamic linker uses to identify calc) and then jumps to the address stored at 0x555555557ff8, which is actually the _dl_runtime_resolve_xsavec function in our dynamic linker (/lib64/ld-linux-x86-64.so.2). This function resolves the address of the requested function (printf or scanf) and then patches its GOT entry.
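The numbers pushed by the stubs (0x2 for printf, 0x3 for scanf) line up with the GOT addresses seen earlier. The sketch below assumes the usual x86-64 layout, where the first three .got.plt slots are reserved (the second and third being the two slots used at 0x555555557ff0 and 0x555555557ff8) and each slot is 8 bytes. The base address 0x555555557fe8 is derived from that assumption rather than taken from the gdb output, and got_plt_slot is a hypothetical helper name.

```c
#include <stdint.h>

/* Assumption: x86-64 convention of three reserved .got.plt slots,
 * each slot holding one 8-byte address. */
enum { GOT_RESERVED_SLOTS = 3, GOT_SLOT_SIZE = 8 };

/* Address of the .got.plt slot that belongs to the given pushed index. */
uint64_t got_plt_slot(uint64_t got_plt_base, uint64_t reloc_index)
{
    return got_plt_base + (GOT_RESERVED_SLOTS + reloc_index) * GOT_SLOT_SIZE;
}
```

With a base of 0x555555557fe8, index 2 lands on 0x555555558010 (printf@got.plt) and index 3 on 0x555555558018 (__isoc99_scanf@got.plt), which suggests how the resolver can tell which entry to patch from the pushed number alone.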

Once that is done, you can check the patched entries in the GOT:

(gdb) x/1x 0x555555558010
0x555555558010 <printf@got.plt>:        0xf7e16730

(gdb) x/1x 0x555555558018
0x555555558018 <__isoc99_scanf@got.plt>:        0xf7e16430

Now, these are the real addresses of the printf and scanf functions within the calc program’s memory space (x/1x shows only their low 32 bits). With this completed, whenever calc needs to use printf or scanf, their addresses will already be stored in the GOT, and the corresponding PLT stubs will jump straight to them.

Conclusion

In conclusion, dynamic linking plays a crucial role in how programs interact with shared libraries on Linux systems. Using the Procedure Linkage Table (PLT) and the Global Offset Table (GOT), a program can call functions from shared libraries without knowing their addresses ahead of time and without having its code patched. With lazy binding, a function’s address is resolved and stored in the GOT only when the function is first called, which keeps startup fast, and every later call reuses the stored address. This is how calc can use printf and scanf without containing their definitions, while libc stays a single shared library used by many programs.


Resources