We are only going to concern ourselves with dynamically linked ELF files whose
Elf64_Ehdr.e_type is ET_EXEC (executable files) or ET_DYN (dynamic shared objects,
basically shared libraries).
Note: If you don’t know what dynamic linking means, I suggest reading this article.
I am deliberately leaving ELF sections out. They are not needed to load or run executables and
shared libraries, so they don’t have to be there and should be treated as a nice bonus
when they actually are. See sstrip.
Removing them is a “technique” that malware uses fairly often, and you don’t need sstrip to do the job.
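As a rough illustration of doing this without sstrip, here is a minimal sketch that blanks the section header references in an ELF64 header. It assumes a little-endian ELF64 image, and the function name is made up for this example. Unlike sstrip, it does not truncate the file; it only makes the header stop pointing at the section header table.

```python
import struct

def hide_section_headers(data):
    """Return a copy of a little-endian ELF64 image with e_shoff, e_shnum
    and e_shstrndx zeroed, so the header no longer references any sections."""
    patched = bytearray(data)
    struct.pack_into("<Q", patched, 40, 0)       # e_shoff is at byte 40
    struct.pack_into("<HH", patched, 60, 0, 0)   # e_shnum at 60, e_shstrndx at 62
    return bytes(patched)
```

The section data itself stays in the file, so this only hides the table from tools that trust the header.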
e_phoff specifies the file offset of the program header table (PHT). The PHT is made
of Elf64_Phdr entries (segments), each with the fields p_type, p_flags, p_offset, p_vaddr,
p_paddr, p_filesz, p_memsz and p_align.
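To make the layout concrete, here is a minimal sketch that reads the PHT from a raw file image. It assumes a little-endian ELF64 file and does no validation; the helper name and the namedtuple are just for illustration.

```python
import struct
from collections import namedtuple

# Elf64_Phdr field order for ELF64, little-endian: 56 bytes per entry.
PHDR_FORMAT = "<IIQQQQQQ"
Phdr = namedtuple(
    "Phdr", "p_type p_flags p_offset p_vaddr p_paddr p_filesz p_memsz p_align"
)

PT_LOAD, PT_DYNAMIC, PT_INTERP = 1, 2, 3

def read_program_headers(data):
    """Return the Elf64_Phdr entries found in the raw ELF image `data`."""
    # e_phoff is at byte 32 of the ELF64 header; e_phentsize and e_phnum
    # are the two 16-bit fields at bytes 54 and 56.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    return [
        Phdr(*struct.unpack_from(PHDR_FORMAT, data, e_phoff + i * e_phentsize))
        for i in range(e_phnum)
    ]
```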
p_type can have values such as PT_LOAD, PT_DYNAMIC, PT_INTERP etc.
When loading an ELF binary, the Linux kernel looks for PT_LOAD segments
and maps them into memory (among other things). When doing so, it
uses both p_offset (the segment's file offset) and p_vaddr (the address at which to map
the segment). ELF segments can overlap in the file. Usually, there are two PT_LOAD
segments, one for code (R-X) and one for data (RW-), but there can also be just one or more than two.
Whenever a virtual address needs to be converted to a file offset, find the PT_LOAD segment that
contains it, subtract that segment's p_vaddr and add its p_offset.
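A minimal sketch of that conversion follows. The Segment tuple carries only the fields the calculation needs and is an assumption of this example, not a real ELF structure.

```python
from collections import namedtuple

PT_LOAD = 1
# Only the Elf64_Phdr fields needed for the translation.
Segment = namedtuple("Segment", "p_type p_offset p_vaddr p_filesz")

def vaddr_to_offset(segments, vaddr):
    """Translate a virtual address to a file offset via the PT_LOAD segments.

    Returns None when no PT_LOAD segment backs the address with file data.
    """
    for seg in segments:
        if seg.p_type != PT_LOAD:
            continue
        if seg.p_vaddr <= vaddr < seg.p_vaddr + seg.p_filesz:
            return vaddr - seg.p_vaddr + seg.p_offset
    return None
```

The check uses p_filesz rather than p_memsz, because addresses past the file-backed part of a segment (such as .bss) have no offset in the file.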
When you dynamically link an ELF, PT_DYNAMIC can be found in the program header table
of the resulting binary. It usually lies within the second PT_LOAD segment, therefore it is loaded
into memory. PT_INTERP names the dynamic linker (the program interpreter) that the kernel has to
load, so the kernel is very sensitive about it.
PT_DYNAMIC points to an array of dynamic entries (Elf64_Dyn), each made of a d_tag and a
value (d_un, read as d_val or d_ptr).
d_tag is the type of the dynamic entry. Dynamic entries contain vital information
for the dynamic linker, such as the symbol relocations used to figure out which APIs you are
trying to call (simplified).
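Walking the dynamic table can be sketched like this. It assumes little-endian ELF64, where each Elf64_Dyn is 16 bytes and the array ends with a DT_NULL entry; the function name is made up for this example.

```python
import struct

DT_NULL = 0  # marks the end of the dynamic array

def read_dynamic_entries(data, offset):
    """Return (d_tag, d_un) pairs starting at file offset `offset`,
    stopping at DT_NULL or at the end of the data."""
    entries = []
    while offset + 16 <= len(data):
        # d_tag is a signed 64-bit value, d_un an unsigned one.
        d_tag, d_un = struct.unpack_from("<qQ", data, offset)
        if d_tag == DT_NULL:
            break
        entries.append((d_tag, d_un))
        offset += 16
    return entries
```

Which offset you pass in is exactly the question the rest of this post is about.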
Case: executable binaries
Let’s compile a program and look at it with radare2 (always use the git version)!
I am using radare on OS X.
I am using gcc (Debian 4.9.2-10) 4.9.2 and ldd (Debian GLIBC 2.19-18+deb8u3) 2.19 on Debian 8.3 x64.
Compile it (it should be dynamically linked unless given -static) and check its size:
Use sstrip and check its size:
Alright. Radare can clearly see which APIs we are trying to call. In short, radare
used the information contained within the dynamic entries (relocations, dynamic symbol table,
dynamic string table etc.) to figure it out.
Let’s try the readelf utility.
Oops. readelf by default relies on the section header table (which we took away). That is the first mistake, but this behaviour has been
known for a while. You have to force it to use PT_DYNAMIC instead of the .dynamic section (readelf has a -D / --use-dynamic option for this).
But how do they load the dynamic entries from the file?
They use PT_DYNAMIC’s p_offset - the file offset.
Is this correct? Well..
Let’s jump into 010 Editor, load our program and use Tim Strazzere’s ELF template
(the one on their website is fairly outdated) to change the p_offset field of PT_DYNAMIC to 0x0.
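If you would rather script the change than use a hex editor, the patch can be sketched as below. It assumes a little-endian ELF64 file, and the function name is made up for this example.

```python
import struct

PT_DYNAMIC = 2

def zero_dynamic_offset(data):
    """Return a copy of the ELF64 image with PT_DYNAMIC's p_offset set to 0."""
    patched = bytearray(data)
    (e_phoff,) = struct.unpack_from("<Q", patched, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", patched, 54)
    for i in range(e_phnum):
        entry = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from("<I", patched, entry)
        if p_type == PT_DYNAMIC:
            # p_offset is the 64-bit field at byte 8 of an Elf64_Phdr.
            struct.pack_into("<Q", patched, entry + 8, 0)
    return bytes(patched)
```

Only p_offset is touched; p_vaddr and everything else in the entry stay as they were.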
Run it.
That works. Let’s check radare.
PT_DYNAMIC's p_offset is 0x0, and radare shows no relocations.
The Linux kernel does not really care about the dynamic segment, but searching for the PT_DYNAMIC identifier on LXR turns up this, for example.
FreeBSD does it too, and so do glibc and IDA.
The general consensus seems to be that the offset of the dynamic table should be calculated
from its virtual address (p_vaddr), just like converting any other virtual address to a file offset,
rather than taken from p_offset.
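Here is a minimal sketch of that approach, with p_offset of PT_DYNAMIC deliberately ignored. The Segment tuple and the function name are assumptions of this example.

```python
from collections import namedtuple

PT_LOAD, PT_DYNAMIC = 1, 2
Segment = namedtuple("Segment", "p_type p_offset p_vaddr p_filesz")

def dynamic_table_offset(segments):
    """Return the file offset of the dynamic table, derived only from
    PT_DYNAMIC's p_vaddr and the PT_LOAD segment that contains it."""
    dyn = next((s for s in segments if s.p_type == PT_DYNAMIC), None)
    if dyn is None:
        return None
    for seg in segments:
        if seg.p_type != PT_LOAD:
            continue
        if seg.p_vaddr <= dyn.p_vaddr < seg.p_vaddr + seg.p_filesz:
            return dyn.p_vaddr - seg.p_vaddr + seg.p_offset
    return None
```

Because the PT_DYNAMIC entry's own p_offset never enters the calculation, zeroing it makes no difference to the result.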
Is this it?
Yes. Well, except maybe adding another dynamic table with a personal touch at the end of the file..
Shared libraries
Sample shared library, let’s call it libtest.c:
Try to run it with our previous myprogram:
Check disassembly of foo symbol in radare:
Change PT_DYNAMIC's p_offset to 0x0:
Run again..
Now take a look again with radare:
You can probably tell where I am going with this.
Implications
I think it’s a nice trick to fool some popular tools and newbie reversers. :-) Also somewhat
helpful if you are parsing ELF in your tool.
There are more things that can be done, but this post is way longer than I expected. Maybe next time.
Note: if I remember correctly, there was a CTF that used two dynamic string tables. One was
referenced from the dynamic table that PT_DYNAMIC pointed to and the other from the .dynamic section.
This caused some tools to show the wrong APIs. If someone finds a link, let me know and I will update the post.