Library injection is a powerful technique that enables the modification of an existing process’s behavior by dynamically loading external libraries. In this article, we delve into a more advanced approach using ptrace, a system call that grants deep control over a running process. This article is presented as a technical report based on a conference presentation.

Ptrace

The ptrace() system call allows one process (the tracer) to control the execution of another process (the tracee) and to read and modify its memory and registers. To inject libraries, I used several capabilities of ptrace(). This section presents the custom helper functions used for library injection, along with their error handling.

Read and write target memory

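Ptrace can read from and write to a traced process’s memory. Each PTRACE_PEEKTEXT or PTRACE_POKETEXT request transfers a single word (8 bytes on x86-64) at the given address, so the helpers below take the PID, the starting address, and the size of the memory block, and loop one word at a time.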

Here is the code to read from memory:

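void ptrace_read(int pid, unsigned long addr, void *vptr, int len)
{
	int bytesRead = 0;
	long word = 0;
	char *ptr = (char *) vptr;

	while (bytesRead < len)
	{
		/* PEEKTEXT returns the word itself, so -1 can be valid data: errno (from <errno.h>) tells the two apart. */
		errno = 0;
		word = ptrace(PTRACE_PEEKTEXT, pid, addr + bytesRead, NULL);
		if(word == -1 && errno != 0)
		{
			perror("ptrace(PTRACE_PEEKTEXT)");
			exit(1);
		}
		/* Copy only the bytes that still fit, so a len that is not a multiple of the word size cannot overflow vptr. */
		int chunk = len - bytesRead;
		if (chunk > (int) sizeof(word))
			chunk = sizeof(word);
		memcpy(ptr + bytesRead, &word, chunk);
		bytesRead += chunk;
	}
}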

And to write to memory:

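void ptrace_write(int pid, unsigned long addr, void *vptr, int len)
{
	int byteCount = 0;
	long word = 0;
	char *src = (char *) vptr;

	while (byteCount < len)
	{
		int chunk = len - byteCount;
		if (chunk >= (int) sizeof(word))
			chunk = sizeof(word);
		else
			/* Partial last word: fetch the target's bytes so those past len are written back unchanged. */
			ptrace_read(pid, addr + byteCount, &word, sizeof(word));
		memcpy(&word, src + byteCount, chunk);
		if(ptrace(PTRACE_POKETEXT, pid, addr + byteCount, word) == -1)
		{
			fprintf(stderr, "ptrace(PTRACE_POKETEXT) failed\n");
			exit(1);
		}
		byteCount += chunk;
	}
}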
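
One detail of word-sized transfers is worth isolating: PTRACE_POKETEXT always writes a full word, so when the block length is not a multiple of the word size, the bytes located after the end of the block have to be preserved. The following is a minimal local sketch of that merge step, with no ptrace involved; merge_partial_word is a hypothetical helper name and is not part of the original tool.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return `existing` with its first `n` bytes (in memory order) replaced by
 * the bytes at `src`; the remaining bytes of the word are left untouched.
 * `n` is clamped to sizeof(long).
 * Hypothetical helper used for illustration only.
 */
long merge_partial_word(long existing, const void *src, size_t n)
{
	if (n > sizeof(existing))
		n = sizeof(existing);
	memcpy(&existing, src, n);   /* overwrite only the leading n bytes */
	return existing;
}
```

In a tracer, `existing` would be the word obtained with PTRACE_PEEKTEXT at the destination address, and the returned value would be the word passed to PTRACE_POKETEXT.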

Read and write target registers

Ptrace, which GDB itself relies on, can also read and modify the registers of the traced process. This capability will be used to alter the execution flow of the target process.

Code to read registers:

void ptrace_getregs(pid_t pid,struct user_regs_struct *regs){
    if (ptrace(PTRACE_GETREGS, pid, NULL, regs) == -1) {
        perror("ptrace GETREGS");
        ptrace(PTRACE_DETACH, pid, NULL, NULL);
        exit(1);
    }
}

And to write registers:

void ptrace_setregs(pid_t pid,struct user_regs_struct *regs) {
    if (ptrace(PTRACE_SETREGS, pid, NULL, regs) == -1) {
        perror("ptrace SETREGS");
        ptrace(PTRACE_DETACH, pid, NULL, NULL);
        exit(1);
    }
}

Wait for the process

The waitpid function allows waiting for the target process to stop, either when the injection process attaches to the target or when the program hits a breakpoint.

void ptrace_wait(int pid){
    int status;
    waitpid(pid, &status, 0);
    if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGSEGV) {
        fprintf(stderr, "[!] Segmentation fault detected in target process (PID: %d).\n", pid);
        ptrace(PTRACE_DETACH, pid, NULL, NULL);
        exit(1);
    }
}
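
The function above only singles out SIGSEGV. As a hedged sketch of how the status returned by waitpid could be decoded more completely, the helper below distinguishes a breakpoint stop (SIGTRAP) from other stops and from termination. The enum and the name classify_wait_status are assumptions introduced for this example; only the standard W* macros from <sys/wait.h> are used.

```c
#include <signal.h>
#include <sys/wait.h>

/* Hypothetical result codes for this sketch. */
enum tracee_event {
	TRACEE_BREAKPOINT,   /* stopped by SIGTRAP, e.g. after an int3 instruction */
	TRACEE_STOPPED,      /* stopped by another signal (SIGSTOP after attach, SIGSEGV, ...) */
	TRACEE_EXITED,       /* terminated normally */
	TRACEE_KILLED,       /* terminated by a signal */
	TRACEE_UNKNOWN
};

/* Decode a status value filled in by waitpid(). */
enum tracee_event classify_wait_status(int status)
{
	if (WIFSTOPPED(status))
		return WSTOPSIG(status) == SIGTRAP ? TRACEE_BREAKPOINT : TRACEE_STOPPED;
	if (WIFEXITED(status))
		return TRACEE_EXITED;
	if (WIFSIGNALED(status))
		return TRACEE_KILLED;
	return TRACEE_UNKNOWN;
}
```

With such a helper, the injector could refuse to restore memory and registers when the target has exited or was killed instead of reaching the breakpoint.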

Continue process execution

When the target process is stopped, waiting for an action from the debugger, the continue operation resumes its execution. The process will then either terminate naturally or stop again, for example at a breakpoint, awaiting a debugger action. The following function executes this operation and waits for the process to stop:

void ptrace_continue(pid_t pid){
    if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1) {
        perror("ptrace CONT (dlopen call)");
        ptrace(PTRACE_DETACH, pid, NULL, NULL);
        exit(1);
    }

    ptrace_wait(pid);
}

Attach to and detach from the process

To manipulate the target process, the only missing functions are those to take control of the target process and release it so it can resume its execution flow.


void ptrace_attach(int pid){
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1) {
        perror("ptrace ATTACH");
        exit(1);
    }
}

void ptrace_detach(pid_t pid){
    if (ptrace(PTRACE_DETACH, pid, NULL, NULL) == -1) {
        perror("ptrace DETACH");
        exit(1);
    }
}

Implementation of injection

On Linux, it is possible to load a library dynamically using a function void *dlopen(const char *filename, int flags);. The goal of this injection is to execute this function within the target process. The concept is relatively simple, but the implementation is more technical.

Indeed, a process cannot execute code (except in specific cases) within another program’s memory space. Calling a function in another process also introduces the challenge of locating the function’s address within the target process. This is particularly difficult due to ASLR (Address Space Layout Randomization), which randomizes the memory addresses where libraries are loaded.

Locate address of dlopen

The first challenge is locating the address of dlopen. Addresses are randomized, but fortunately, with elevated privileges, there is a simple way to do this on a system.

ASLR (Address Space Layout Randomization) affects the location of shared libraries, but not necessarily the functions inside them. These functions maintain the same offset relative to the base address of the library. This means that once we determine the base address of the library, we can calculate the address of dlopen using a fixed offset.

The following function, given a PID, locates the base address of libc if the process has sufficient privileges:

long getlibcaddr(pid_t pid)
{
    FILE *fp;
    char filename[30];
    char line[850];
    long addr = 0;
    char perms[5];
    char modulePath[256];

    sprintf(filename, "/proc/%d/maps", pid);
    fp = fopen(filename, "r");
    if (fp == NULL) {
        perror("fopen");
        exit(1);
    }

    while (fgets(line, sizeof(line), fp) != NULL)
    {
        if (sscanf(line, "%lx-%*lx %4s %*s %*s %*d %s", &addr, perms, modulePath) == 3) {
            if ((strstr(modulePath, "libc-") || strstr(modulePath, "libc.so")) && perms[0] == 'r' && perms[2] == 'x') {
                break;
            }
        }
    }

    fclose(fp);
    return addr;
}
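
To make the parsing step easier to follow, here is a minimal sketch that splits a single line of /proc/<pid>/maps into its fields. The struct map_entry type and the parse_maps_line name are hypothetical and introduced only for this example; the field order (address range, permissions, offset, device, inode, optional path) is the one documented for the maps file. Paths containing spaces are truncated at the first space by this sketch.

```c
#include <stdio.h>

/* Hypothetical record type for one line of /proc/<pid>/maps. */
struct map_entry {
	unsigned long start;
	unsigned long end;
	unsigned long offset;
	char perms[5];      /* e.g. "r-xp" */
	char path[256];     /* empty for anonymous mappings */
};

/* Returns 1 if the line was parsed, 0 otherwise. */
int parse_maps_line(const char *line, struct map_entry *e)
{
	e->path[0] = '\0';
	/* The device and inode fields are skipped; the path is optional. */
	int n = sscanf(line, "%lx-%lx %4s %lx %*s %*s %255s",
	               &e->start, &e->end, e->perms, &e->offset, e->path);
	return n >= 4;
}
```

Keeping both ends of the range also makes it possible to check that a mapping is large enough before writing into it.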

Once the base address of libc is obtained, we simply need to find the address of the dlopen function in the current process to determine the offset. The following function retrieves the address of a libc function by its name using the dlsym function, as explained in the previous article. Note that libc itself only exports dlopen since glibc 2.34; on older versions the function lives in libdl, and this lookup returns NULL.

long getFunctionAddress(char* funcName)
{
	void* self = dlopen("libc.so.6", RTLD_LAZY);
	void* funcAddr = dlsym(self, funcName);
	return (long)funcAddr;
}

From there, it is easy to deduce the address of dlopen in the child process:

void inject_library(pid_t pid, const char *lib_path) {
    [...]
    long dlopen_addr = getFunctionAddress("dlopen") - getlibcaddr(getpid()) + getlibcaddr(pid);
    [...]
}
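
The arithmetic in that line can be isolated in a small helper, shown here as a sketch. The name translate_address is hypothetical, and the computation is only valid under the assumption that both processes map the same libc file, so that the function sits at the same offset from the mapping in each of them.

```c
#include <stdint.h>

/*
 * Translate an address known in the local process into the matching address
 * in the remote process, given where the same library mapping starts in each.
 * Hypothetical helper used for illustration only.
 */
uintptr_t translate_address(uintptr_t local_addr,
                            uintptr_t local_base,
                            uintptr_t remote_base)
{
	uintptr_t offset = local_addr - local_base;  /* identical in both processes */
	return remote_base + offset;
}
```

For example, a function located 0x90680 bytes after the local mapping is expected 0x90680 bytes after the remote mapping, whatever base addresses ASLR picked for the two processes.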

Payload

Once dlopen is located, it must now be executed. To do this, we need to control the execution flow of the target process. We will use the previously mentioned ptrace functions to capture this flow and release it once the library is injected, ensuring that the process continues its normal execution without disruption.

void inject_library(pid_t pid, const char *lib_path) {
    [...]
    ptrace_attach(pid);
    ptrace_wait(pid);
    [...]
    ptrace_detach(pid);
}

Modifying the execution flow involves altering the registers of the target process. Before releasing the process, it is also necessary to restore these registers to their original state as they were at the time of capture:

void inject_library(pid_t pid, const char *lib_path) {
    struct user_regs_struct regs, backup_regs;
    [...]
    ptrace_attach(pid);
    ptrace_wait(pid);
    ptrace_getregs(pid,&regs);
    memcpy(&backup_regs, &regs, sizeof(struct user_regs_struct));
    [...]
    ptrace_setregs(pid, &backup_regs);
    ptrace_detach(pid);
}

To execute a function call in the child process, assembly is required to craft executable code because ptrace only allows modifying registers and memory, but does not provide a direct way to invoke functions within the target process.

void injectSharedLibrary(){
	asm(
        "callq *%r9 \n"
        "int $3"
    );
}

void injectSharedLibrary_end()
{

}

This assembly code is not the most common one found online for this type of injection. However, it has the advantage of being compact and minimalist, and it requires only the address of the dlopen function, unlike other payloads that rely on additional functions such as malloc and free and may therefore alter the target’s memory. I prefer to limit such operations to avoid disrupting the execution flow.

The payload calls the function whose address is stored in the R9 register. Since dlopen takes two parameters, the System V AMD64 calling convention used on x86-64 Linux requires the first parameter (the path of the library to inject) to be in the RDI register and the second parameter (the flags) in the RSI register. The payload ends with the int $3 instruction, which is a breakpoint instruction: it raises SIGTRAP, which halts the target process and returns control to our injection program.

Here is the code used to modify the registers in the target process (the value 1 placed in RSI is RTLD_LAZY):

    struct user_regs_struct regs, backup_regs;
    [...]
    ptrace_attach(pid);
    ptrace_wait(pid);
    ptrace_getregs(pid,&regs);
    memcpy(&backup_regs, &regs, sizeof(struct user_regs_struct));
    
    regs.r9 = dlopen_addr;
    regs.rsi = 1;
    regs.rdi = ...;

    ptrace_setregs(pid, &regs);

    ptrace_setregs(pid, &backup_regs);
    ptrace_detach(pid);
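
The register setup can also be grouped in one helper, sketched below for x86-64 Linux. The name prepare_call and its parameters are hypothetical. The stack adjustment is an assumption that is not present in the original code: the System V AMD64 ABI expects RSP to be 16-byte aligned at a call instruction and reserves a 128-byte red zone below RSP, and a process stopped at an arbitrary point may satisfy neither. Since the saved registers are restored after the call, moving RSP here should not affect the target afterwards.

```c
#include <sys/user.h>   /* struct user_regs_struct (x86-64 Linux) */

/*
 * Fill in the registers needed to run "call *%r9; int3" located at code_addr,
 * calling func_addr(arg0, arg1). Hypothetical helper used for illustration.
 */
void prepare_call(struct user_regs_struct *regs,
                  unsigned long code_addr, unsigned long func_addr,
                  unsigned long arg0, unsigned long arg1)
{
	regs->rip = code_addr;   /* first instruction to execute */
	regs->r9  = func_addr;   /* target of "call *%r9" */
	regs->rdi = arg0;        /* first argument */
	regs->rsi = arg1;        /* second argument */
	/* Assumption, not in the original code: skip the red zone, then align. */
	regs->rsp = (regs->rsp - 128) & ~0xFUL;
}
```

A call such as prepare_call(&regs, code_addr, dlopen_addr, path_addr, RTLD_LAZY) would then replace the individual assignments, where code_addr and path_addr are the addresses at which the payload and the library path were written.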

The injectSharedLibrary_end label marks the end of our payload, making it easier to manipulate the payload in memory.

Memory

As mentioned earlier, a program, except in rare cases, can only execute code within an executable memory region of its own address space. The following function searches for a memory address that meets this requirement:

long executable_memory_addr(pid_t pid)
{
	FILE *fp;
	char filename[30];
	char line[850];
	long addr = 0;
	char str[20];
	char perms[5];
	sprintf(filename, "/proc/%d/maps", pid);
	fp = fopen(filename, "r");
	if(fp == NULL)
		exit(1);
	while(fgets(line, 850, fp) != NULL)
	{
		sscanf(line, "%lx-%*lx %s %*s %s %*d", &addr, perms, str);

		if(strstr(perms, "x") != NULL)
		{
			break;
		}
	}
	fclose(fp);
	return addr;
}
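
The function above returns the last address it parsed when no executable mapping exists, and it does not check the size of the mapping. As a sketch of a stricter variant, the helper below scans maps-formatted text from any FILE stream and returns 0 when nothing suitable is found. The name find_exec_region and the needed parameter are assumptions introduced for this example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan maps-formatted text and return the start address of the first
 * executable mapping that is at least `needed` bytes long, or 0 if there
 * is none. Hypothetical helper used for illustration only.
 */
unsigned long find_exec_region(FILE *maps, size_t needed)
{
	char line[1024];
	unsigned long start, end;
	char perms[5];

	while (fgets(line, sizeof(line), maps) != NULL) {
		/* Only the address range and the permissions are needed here. */
		if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) != 3)
			continue;
		if (strchr(perms, 'x') != NULL && end - start >= needed)
			return start;
	}
	return 0;
}
```

In the injector, the stream would come from fopen on /proc/<pid>/maps and needed would be the number of bytes about to be written.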

This memory region is crucial as it will store both the payload and the library path, which will be passed as a parameter to the function. Although such a region is normally not writable by the target itself, the tracer can still modify it through PTRACE_POKETEXT. In memory, strings are stored as raw bytes, so to pass the path to dlopen we pass the address at which it was written.

To manipulate memory, it is necessary to calculate the size of the payload and the total memory to inject. The total is the payload size, plus the length of the library path, plus 1 byte for the null terminator at the end of the string:

size_t payload_size = (intptr_t)injectSharedLibrary_end - (intptr_t)injectSharedLibrary;
size_t required_size = payload_size + strlen(lib_path) + 1;
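
The layout described above (code bytes, then the library path, then the null terminator) can be sketched as a standalone function that builds the buffer to write into the target. The name build_payload and its signature are hypothetical and only illustrate the size computation and the two copies.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Allocate and fill the buffer to write into the target:
 *   [ code bytes ][ library path ][ '\0' ]
 * On success, *out_size receives code_len + strlen(lib_path) + 1 and the
 * caller frees the buffer. Returns NULL if the allocation fails.
 * Hypothetical helper used for illustration only.
 */
unsigned char *build_payload(const unsigned char *code, size_t code_len,
                             const char *lib_path, size_t *out_size)
{
	size_t path_len = strlen(lib_path);
	size_t total = code_len + path_len + 1;
	unsigned char *buf = calloc(1, total);   /* zeroed, so the terminator is already in place */

	if (buf == NULL)
		return NULL;
	memcpy(buf, code, code_len);
	memcpy(buf + code_len, lib_path, path_len);
	*out_size = total;
	return buf;
}
```

The path then starts code_len bytes after the address where the buffer is written, which is the value to place in RDI.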

The rest is relatively straightforward: first, save the memory region that will be modified, then copy the payload and the library path into the target process’s memory. Once the target process stops at the payload’s breakpoint, restore the altered memory to its original state. Additionally, we must set the RDI register to the memory address where the library path is stored.

void inject_library(pid_t pid, const char *lib_path) {
    struct user_regs_struct regs, backup_regs;
    long dlopen_addr = getFunctionAddress("dlopen") - getlibcaddr(getpid()) + getlibcaddr(pid);

    printf("[+] Attaching to process %d\n", pid);
    ptrace_attach(pid);
    ptrace_wait(pid);
    ptrace_getregs(pid,&regs);

    memcpy(&backup_regs, &regs, sizeof(struct user_regs_struct));

    long mem_addr = executable_memory_addr(pid);
    if( mem_addr == 0 ){
        printf("writable/executable section not found");
        exit(1);
    }
    size_t payload_size = (intptr_t)injectSharedLibrary_end - (intptr_t)injectSharedLibrary;
    size_t required_size = payload_size + strlen(lib_path) + 1;
    char* backup = malloc(required_size);
    ptrace_read(pid, mem_addr, backup, required_size);

    char* newcode = malloc(required_size);
    memset(newcode, 0, required_size);
    memcpy(newcode, injectSharedLibrary, payload_size);
    memcpy(newcode+payload_size, lib_path, strlen(lib_path));

    ptrace_write(pid,mem_addr,newcode,required_size);
    regs.r9 = dlopen_addr;
    regs.rsi = 1;
    regs.rdi = mem_addr + payload_size;

    [...]

    ptrace_setregs(pid, &regs);

    ptrace_write(pid,mem_addr,backup,required_size);
    
    ptrace_setregs(pid, &backup_regs);

    ptrace_detach(pid);

    printf("[+] Injection completed successfully.\n");
}

Update execution flow

Now the memory is ready for injection. The only remaining step is to execute our payload. To do this, we use the RIP register, which holds the address of the next instruction to execute, by setting it to the payload’s address. Then, we use ptrace with ptrace_continue to resume execution.

void inject_library(pid_t pid, const char *lib_path) {
    struct user_regs_struct regs, backup_regs;
    long dlopen_addr = getFunctionAddress("dlopen") - getlibcaddr(getpid()) + getlibcaddr(pid);

    printf("[+] Attaching to process %d\n", pid);
    ptrace_attach(pid);
    ptrace_wait(pid);
    ptrace_getregs(pid,&regs);

    memcpy(&backup_regs, &regs, sizeof(struct user_regs_struct));

    long mem_addr = executable_memory_addr(pid);
    if( mem_addr == 0 ){
        printf("writable/executable section not found");
        exit(1);
    }
    size_t payload_size = (intptr_t)injectSharedLibrary_end - (intptr_t)injectSharedLibrary;
    size_t required_size = payload_size + strlen(lib_path) + 1;
    char* backup = malloc(required_size);
	ptrace_read(pid, mem_addr, backup, required_size);

    char* newcode = malloc(required_size);
	memset(newcode, 0, required_size);
    memcpy(newcode, injectSharedLibrary, payload_size);
    memcpy(newcode+payload_size, lib_path, strlen(lib_path));

    ptrace_write(pid,mem_addr,newcode,required_size);
    regs.r9 = dlopen_addr;
    regs.rsi = 1;
    regs.rdi = mem_addr + payload_size;
    regs.rip = (long) mem_addr + 4;

    ptrace_setregs(pid, &regs);
    ptrace_continue(pid);
    ptrace_write(pid,mem_addr,backup,required_size);
    
    ptrace_setregs(pid, &backup_regs);

    ptrace_detach(pid);

    printf("[+] Injection completed successfully.\n");
}

The RIP is set to mem_addr + 4 to skip the two instructions of the standard function prologue that the compiler places at the start of injectSharedLibrary (push %rbp, 1 byte, and mov %rsp,%rbp, 3 bytes), so that execution begins directly at the call instruction, as the disassembly below shows:

(gdb) disassemble injectSharedLibrary
Dump of assembler code for function injectSharedLibrary:
   0x0000000000401296 <+0>:     push   %rbp
   0x0000000000401297 <+1>:     mov    %rsp,%rbp
   0x000000000040129a <+4>:     call   *%r9
   0x000000000040129d <+7>:     int3
   0x000000000040129e <+8>:     nop
   0x000000000040129f <+9>:     pop    %rbp
   0x00000000004012a0 <+10>:    ret

We also notice in this assembly code that the payload can be optimized, as fundamentally, the only instructions that matter are at addresses 0x40129A and 0x40129D.

Optimization

It is possible to optimize further by creating a shellcode for our payload and keeping only the essential instructions in the payload.

(gdb) x/4xb 0x40129a
0x40129a <injectSharedLibrary+4>:       0x41    0xff    0xd1    0xcc

Here is the optimized code using shellcode:

#define SHELLCODE_SIZE 4

unsigned char shellcode[] = "\x41\xff\xd1\xcc";

void inject_library(pid_t pid, const char *lib_path) {
    struct user_regs_struct regs, backup_regs;
    long dlopen_addr = getFunctionAddress("dlopen") - getlibcaddr(getpid()) + getlibcaddr(pid);

    printf("[+] Attaching to process %d\n", pid);
    ptrace_attach(pid);
    ptrace_wait(pid);
    ptrace_getregs(pid,&regs);

    memcpy(&backup_regs, &regs, sizeof(struct user_regs_struct));

    long mem_addr = executable_memory_addr(pid);
    if( mem_addr == 0 ){
        printf("writable/executable section not found");
        exit(1);
    }
    size_t required_size = SHELLCODE_SIZE + strlen(lib_path) + 1;
    char* backup = malloc(required_size);
	ptrace_read(pid, mem_addr, backup, required_size);

    char* newcode = malloc(required_size);
	memset(newcode, 0, required_size);
    memcpy(newcode, shellcode, SHELLCODE_SIZE);
    memcpy(newcode+SHELLCODE_SIZE, lib_path, strlen(lib_path));

    ptrace_write(pid,mem_addr,newcode,required_size);
    regs.r9 = dlopen_addr;
    regs.rsi = 1;
    regs.rdi = mem_addr + SHELLCODE_SIZE;
    regs.rip = (long) mem_addr;

    ptrace_setregs(pid, &regs);
    ptrace_continue(pid);
    ptrace_write(pid,mem_addr,backup,required_size);
    
    ptrace_setregs(pid, &backup_regs);

    ptrace_detach(pid);

    printf("[+] Injection completed successfully.\n");
}

Final code

// gcc inject.c -o inject
//./inject ./lib.so 14883
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <dlfcn.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/user.h>
#include <sys/mman.h>

#define SHELLCODE_SIZE 4

unsigned char shellcode[] = "\x41\xff\xd1\xcc";

long getlibcaddr(pid_t pid)
{
    FILE *fp;
    char filename[30];
    char line[850];
    long addr = 0;
    char perms[5];
    char modulePath[256];

    sprintf(filename, "/proc/%d/maps", pid);
    fp = fopen(filename, "r");
    if (fp == NULL) {
        perror("fopen");
        exit(1);
    }

    while (fgets(line, sizeof(line), fp) != NULL)
    {
        if (sscanf(line, "%lx-%*lx %4s %*s %*s %*d %s", &addr, perms, modulePath) == 3) {
            if ((strstr(modulePath, "libc-") || strstr(modulePath, "libc.so")) && perms[0] == 'r' && perms[2] == 'x') {
                break;
            }
        }
    }

    fclose(fp);
    return addr;
}

long getFunctionAddress(char* funcName)
{
	void* self = dlopen("libc.so.6", RTLD_LAZY);
	void* funcAddr = dlsym(self, funcName);
	return (long)funcAddr;
}

long executable_memory_addr(pid_t pid)
{
	FILE *fp;
	char filename[30];
	char line[850];
	long addr = 0;
	char str[20];
	char perms[5];
	sprintf(filename, "/proc/%d/maps", pid);
	fp = fopen(filename, "r");
	if(fp == NULL)
		exit(1);
	while(fgets(line, 850, fp) != NULL)
	{
		sscanf(line, "%lx-%*lx %s %*s %s %*d", &addr, perms, str);

		if(strstr(perms, "x") != NULL)
		{
			break;
		}
	}
	fclose(fp);
	return addr;
}

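#include <errno.h>	/* errno, used below to detect a failed PEEKTEXT */

void ptrace_read(int pid, unsigned long addr, void *vptr, int len)
{
	int bytesRead = 0;
	long word = 0;
	char *ptr = (char *) vptr;

	while (bytesRead < len)
	{
		/* PEEKTEXT returns the word itself, so -1 can be valid data: errno tells the two apart. */
		errno = 0;
		word = ptrace(PTRACE_PEEKTEXT, pid, addr + bytesRead, NULL);
		if(word == -1 && errno != 0)
		{
			perror("ptrace(PTRACE_PEEKTEXT)");
			exit(1);
		}
		/* Copy only the bytes that still fit, so a len that is not a multiple of the word size cannot overflow vptr. */
		int chunk = len - bytesRead;
		if (chunk > (int) sizeof(word))
			chunk = sizeof(word);
		memcpy(ptr + bytesRead, &word, chunk);
		bytesRead += chunk;
	}
}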

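void ptrace_write(int pid, unsigned long addr, void *vptr, int len)
{
	int byteCount = 0;
	long word = 0;
	char *src = (char *) vptr;

	while (byteCount < len)
	{
		int chunk = len - byteCount;
		if (chunk >= (int) sizeof(word))
			chunk = sizeof(word);
		else
			/* Partial last word: fetch the target's bytes so those past len are written back unchanged. */
			ptrace_read(pid, addr + byteCount, &word, sizeof(word));
		memcpy(&word, src + byteCount, chunk);
		if(ptrace(PTRACE_POKETEXT, pid, addr + byteCount, word) == -1)
		{
			fprintf(stderr, "ptrace(PTRACE_POKETEXT) failed\n");
			exit(1);
		}
		byteCount += chunk;
	}
}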


void ptrace_attach(int pid){
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1) {
        perror("ptrace ATTACH");
        exit(1);
    }
}

void ptrace_wait(int pid){
    int status;
    waitpid(pid, &status, 0);
    if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGSEGV) {
        fprintf(stderr, "[!] Segmentation fault detected in target process (PID: %d).\n", pid);
        ptrace(PTRACE_DETACH, pid, NULL, NULL);
        exit(1);
    }
}

void ptrace_getregs(pid_t pid,struct user_regs_struct *regs){
    if (ptrace(PTRACE_GETREGS, pid, NULL, regs) == -1) {
        perror("ptrace GETREGS");
        ptrace(PTRACE_DETACH, pid, NULL, NULL);
        exit(1);
    }
}

void ptrace_setregs(pid_t pid,struct user_regs_struct *regs) {
    if (ptrace(PTRACE_SETREGS, pid, NULL, regs) == -1) {
        perror("ptrace SETREGS");
        ptrace(PTRACE_DETACH, pid, NULL, NULL);
        exit(1);
    }
}

void ptrace_detach(pid_t pid){
    if (ptrace(PTRACE_DETACH, pid, NULL, NULL) == -1) {
        perror("ptrace DETACH");
        exit(1);
    }
}

void ptrace_continue(pid_t pid){
    if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1) {
        perror("ptrace CONT (dlopen call)");
        ptrace(PTRACE_DETACH, pid, NULL, NULL);
        exit(1);
    }

    ptrace_wait(pid);
}
void inject_library(pid_t pid, const char *lib_path) {
    struct user_regs_struct regs, backup_regs;
    long dlopen_addr = getFunctionAddress("dlopen") - getlibcaddr(getpid()) + getlibcaddr(pid);

    printf("[+] Attaching to process %d\n", pid);
    ptrace_attach(pid);
    ptrace_wait(pid);
    ptrace_getregs(pid,&regs);

    memcpy(&backup_regs, &regs, sizeof(struct user_regs_struct));

    long mem_addr = executable_memory_addr(pid);
    if( mem_addr == 0 ){
        printf("writable/executable section not found");
        exit(1);
    }
    size_t required_size = SHELLCODE_SIZE + strlen(lib_path) + 1;
    char* backup = malloc(required_size);
	ptrace_read(pid, mem_addr, backup, required_size);

    char* newcode = malloc(required_size);
	memset(newcode, 0, required_size);
    memcpy(newcode, shellcode, SHELLCODE_SIZE);
    memcpy(newcode+SHELLCODE_SIZE, lib_path, strlen(lib_path));

    ptrace_write(pid,mem_addr,newcode,required_size);
    regs.r9 = dlopen_addr;
    regs.rsi = 1;
    regs.rdi = mem_addr + SHELLCODE_SIZE;
    regs.rip = (long) mem_addr;

    ptrace_setregs(pid, &regs);
    ptrace_continue(pid);
    ptrace_write(pid,mem_addr,backup,required_size);
    
    ptrace_setregs(pid, &backup_regs);

    ptrace_detach(pid);

    printf("[+] Injection completed successfully.\n");
}

int main(int argc, char **argv) {
    if (geteuid() != 0) {
        printf("[!] Run as root!\n");
        exit(EXIT_FAILURE);
    }

    if (argc < 3) {
        printf("Usage: %s <lib_path> <pid>\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    inject_library(atoi(argv[2]), argv[1]);
    return 0;
}

A simple target:

// gcc loop.c -o loop

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(){
    printf("PID : %d\n", getpid());
    while(1){
    }
}

Library code:

// gcc -shared -fPIC lib.c  -o lib.so
#include <stdio.h>


void hello()
{
	printf("Hello,world!\n");
}

__attribute__((constructor))
void loadMsg()
{
	hello();
}

Conclusion

This project, developed for a technical conference, proved to be an incredibly enriching experience. It deepened my understanding of ptrace, enhanced my knowledge of dynamic linking on Linux, and gave me the opportunity to have fun with assembly and shellcode in a hands-on way. Beyond the technical insights, it reinforced my passion for low-level systems.