6 Extending ELF concepts
6.1 Debugging
Traditionally the primary method of post mortem debugging is referred to as the core dump. The term core comes from the original physical characteristics of magnetic core memory, which uses the orientation of small magnetic rings to store state.
Thus a core dump is simply a complete snapshot of the program as it was running at a particular time. A debugger can then be used to examine this dump and reconstruct the program state. Example 6.1.1, Example of creating a core dump and using it with gdb, shows a sample program that writes to an arbitrary, invalid memory location in order to force a crash. At this point the process is halted and a dump of its current state is recorded.
$ cat coredump.c
int main(void) {
char *foo = (char*)0x12345;
*foo = 'a';
return 0;
}
$ gcc -Wall -g -o coredump coredump.c
$ ./coredump
Segmentation fault (core dumped)
$ file ./core
./core: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style, from './coredump'
$ gdb ./coredump
...
(gdb) core core
[New LWP 31614]
Core was generated by `./coredump'.
Program terminated with signal 11, Segmentation fault.
#0 0x080483c4 in main () at coredump.c:3
3 *foo = 'a';
(gdb)
Thus a core dump is just another ELF file, with a range of segments that the debugger understands to represent parts of the running program.
6.1.1 Symbols and Debugging Information
As Example 6.1.1, Example of creating a core dump and using it with gdb, shows, the debugger gdb requires both the original executable and the core dump to reconstruct the environment for the debugging session. Note that the original executable was built with the -g flag, which instructs the compiler to include debugging information. This extra debugging information is kept in special sections of the ELF file. It describes in detail things like which registers or memory locations currently hold which variables used in the code, the size of variables, the length of arrays, and so on. It is generally in the standard DWARF format (a pun on the almost-synonym ELF).
Including debugging information can make executable files and libraries very large; although this data does not need to be resident in memory to run the program, it can still take up considerable disk space. Thus the usual process is to strip this information from the ELF file.
While it is possible to arrange for shipping of both stripped and unstripped files, almost all current binary distribution methods provide the debugging information in separate files. The objcopy tool can be used to extract the debugging information into a separate file (--only-keep-debug) and then add a link in the original executable to this extracted information (--add-gnu-debuglink). After this is done, a special section called .gnu_debuglink will be present in the original executable. It contains the name of the debug file followed by a checksum, so that when a debugging session starts the debugger can be sure it associates the right debugging information with the right executable.
$ gcc -g -shared -o libtest.so libtest.c
$ objcopy --only-keep-debug libtest.so libtest.debug
$ objcopy --add-gnu-debuglink=libtest.debug libtest.so
$ objdump -s -j .gnu_debuglink libtest.so
libtest.so: file format elf32-i386
Contents of section .gnu_debuglink:
0000 6c696274 6573742e 64656275 67000000 libtest.debug...
0010 52a7fd0a R...
Symbols take up much less space, but are also targets for removal from final output. Once the individual object files of an executable are linked into the single final image, there is generally no need for most symbols to remain. As discussed in Section 3.2, Symbols and Relocations, symbols are required to fix up relocation entries, but once this is done the symbols are not strictly necessary for running the final program. On Linux, the strip program from the GNU toolchain provides options to remove symbols. Note that some symbols must be resolved at run-time (for dynamic linking, the focus of Chapter 9, Dynamic Linking), but these are put in separate dynamic symbol tables so they will not be removed and render the final output useless.
6.1.2 Inside coredumps
A coredump is really just another ELF file; this illustrates the flexibility of ELF as a binary format.
$ readelf --all ./core
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: CORE (Core file)
Machine: Intel 80386
Version: 0x1
Entry point address: 0x0
Start of program headers: 52 (bytes into file)
Start of section headers: 0 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 15
Size of section headers: 0 (bytes)
Number of section headers: 0
Section header string table index: 0
There are no sections in this file.
There are no sections to group in this file.
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
NOTE 0x000214 0x00000000 0x00000000 0x0022c 0x00000 0
LOAD 0x001000 0x08048000 0x00000000 0x01000 0x01000 R E 0x1000
LOAD 0x002000 0x08049000 0x00000000 0x01000 0x01000 RW 0x1000
LOAD 0x003000 0x489fc000 0x00000000 0x01000 0x1b000 R E 0x1000
LOAD 0x004000 0x48a17000 0x00000000 0x01000 0x01000 R 0x1000
LOAD 0x005000 0x48a18000 0x00000000 0x01000 0x01000 RW 0x1000
LOAD 0x006000 0x48a1f000 0x00000000 0x01000 0x153000 R E 0x1000
LOAD 0x007000 0x48b72000 0x00000000 0x00000 0x01000 0x1000
LOAD 0x007000 0x48b73000 0x00000000 0x02000 0x02000 R 0x1000
LOAD 0x009000 0x48b75000 0x00000000 0x01000 0x01000 RW 0x1000
LOAD 0x00a000 0x48b76000 0x00000000 0x03000 0x03000 RW 0x1000
LOAD 0x00d000 0xb771c000 0x00000000 0x01000 0x01000 RW 0x1000
LOAD 0x00e000 0xb774d000 0x00000000 0x02000 0x02000 RW 0x1000
LOAD 0x010000 0xb774f000 0x00000000 0x01000 0x01000 R E 0x1000
LOAD 0x011000 0xbfeac000 0x00000000 0x22000 0x22000 RW 0x1000
There is no dynamic section in this file.
There are no relocations in this file.
There are no unwind sections in this file.
No version information found in this file.
Notes at offset 0x00000214 with length 0x0000022c:
Owner Data size Description
CORE 0x00000090 NT_PRSTATUS (prstatus structure)
CORE 0x0000007c NT_PRPSINFO (prpsinfo structure)
CORE 0x000000a0 NT_AUXV (auxiliary vector)
LINUX 0x00000030 Unknown note type: (0x00000200)
$ eu-readelf -n ./core
Note segment of 556 bytes at offset 0x214:
Owner Data size Type
CORE 144 PRSTATUS
info.si_signo: 11, info.si_code: 0, info.si_errno: 0, cursig: 11
sigpend: <>
sighold: <>
pid: 31614, ppid: 31544, pgrp: 31614, sid: 31544
utime: 0.000000, stime: 0.000000, cutime: 0.000000, cstime: 0.000000
orig_eax: -1, fpvalid: 0
ebx: 1219973108 ecx: 1243440144 edx: 1
esi: 0 edi: 0 ebp: 0xbfecb828
eax: 74565 eip: 0x080483c4 eflags: 0x00010286
esp: 0xbfecb818
ds: 0x007b es: 0x007b fs: 0x0000 gs: 0x0033 cs: 0x0073 ss: 0x007b
CORE 124 PRPSINFO
state: 0, sname: R, zomb: 0, nice: 0, flag: 0x00400400
uid: 1000, gid: 1000, pid: 31614, ppid: 31544, pgrp: 31614, sid: 31544
fname: coredump, psargs: ./coredump
CORE 160 AUXV
SYSINFO: 0xb774f414
SYSINFO_EHDR: 0xb774f000
HWCAP: 0xafe8fbff <fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss tm pbe>
PAGESZ: 4096
CLKTCK: 100
PHDR: 0x8048034
PHENT: 32
PHNUM: 8
BASE: 0
FLAGS: 0
ENTRY: 0x8048300
UID: 1000
EUID: 1000
GID: 1000
EGID: 1000
SECURE: 0
RANDOM: 0xbfecba1b
EXECFN: 0xbfecdff1
PLATFORM: 0xbfecba2b
NULL
LINUX 48 386_TLS
index: 6, base: 0xb771c8d0, limit: 0x000fffff, flags: 0x00000051
index: 7, base: 0x00000000, limit: 0x00000000, flags: 0x00000028
index: 8, base: 0x00000000, limit: 0x00000000, flags: 0x00000028
In Example 6.1.2.1, Example of using readelf and eu-readelf to examine a coredump, we can see an examination of the core file produced by Example 6.1.1, Example of creating a core dump and using it with gdb, first using the readelf tool. There are no sections, relocations or other extraneous information in the file that may be required for loading an executable or library; it simply consists of a series of program headers describing LOAD segments. These segments are raw data dumps, created by the kernel, of the current memory allocations.
The other component of the core dump is the NOTE segment, which contains data necessary for debugging but not necessarily captured in a straight snapshot of the memory allocations. The eu-readelf program used in the second part of the example provides a more complete view of the data by decoding it.
The PRSTATUS note gives a range of interesting information about the process as it was running; for example we can see from cursig that the program received signal 11, a segmentation fault, as we would expect. Along with process number information, it also includes a dump of all the current registers. Given the register values, the debugger can reconstruct the stack state and hence provide a backtrace; combined with the symbol and debugging information from the original binary, the debugger can show exactly how you reached the current point of execution.
Another interesting output is the current auxiliary vector (AUXV), discussed in Section 8.1, Kernel communication to programs. The 386_TLS note describes global descriptor table entries used for the x86 implementation of thread-local storage (see Section 4.1.1.3, Fast System Calls for more information on the use of segmentation, and Section 4.3.1.1, Threads for information on threads).
The kernel creates the core dump file within the bounds of the current ulimit settings. Since a program using a lot of memory could result in a very large dump, potentially filling up the disk and making problems even worse, the ulimit is generally set low or even at zero, as most non-developers have little use for a core dump file. However, the core dump remains the single most useful way to debug an unexpected situation in a postmortem fashion.
6.2 Custom sections
For the most part, organisation of code, data and symbols is something a programmer can leave up to the toolchain defaults. However, there are times when it makes sense to extend or customise sections and their contents. One common example of this is with Linux kernel modules, which are used to dynamically load drivers and other features into the running kernel. Because these modules are not portable, insofar as they only work with one fixed kernel build version, the interface between modules and the kernel can be flexible and is not bound to particular standards. This means the methods of storing things like license information, authorship, dependencies and parameters for the module can be uniquely and wholly defined by the kernel.
The modinfo tool can inspect this information within a module and present it to the user. Below we use the example of the FUSE Linux kernel module, which allows user-space libraries to provide file-system implementations to the kernel.
$ cd /lib/modules/$(uname -r)
$ sudo modinfo ./kernel/fs/fuse/fuse.ko
filename: /lib/modules/3.2.0-4-amd64/./kernel/fs/fuse/fuse.ko
alias: devname:fuse
alias: char-major-10-229
license: GPL
description: Filesystem in Userspace
author: Miklos Szeredi <miklos@szeredi.hu>
depends:
intree: Y
vermagic: 3.2.0-4-amd64 SMP mod_unload modversions
parm: max_user_bgreq:Global limit for the maximum number of backgrounded requests an unprivileged user can set (uint)
parm: max_user_congthresh:Global limit for the maximum congestion threshold an unprivileged user can set (uint)
$ objdump -s -j .modinfo ./kernel/fs/fuse/fuse.ko
./kernel/fs/fuse/fuse.ko: file format elf64-x86-64
Contents of section .modinfo:
0000 616c6961 733d6465 766e616d 653a6675 alias=devname:fu
0010 73650061 6c696173 3d636861 722d6d61 se.alias=char-ma
0020 6a6f722d 31302d32 32390070 61726d3d jor-10-229.parm=
0030 6d61785f 75736572 5f636f6e 67746872 max_user_congthr
0040 6573683a 476c6f62 616c206c 696d6974 esh:Global limit
0050 20666f72 20746865 206d6178 696d756d for the maximum
0060 20636f6e 67657374 696f6e20 74687265 congestion thre
0070 73686f6c 6420616e 20756e70 72697669 shold an unprivi
0080 6c656765 64207573 65722063 616e2073 leged user can s
0090 65740070 61726d74 7970653d 6d61785f et.parmtype=max_
00a0 75736572 5f636f6e 67746872 6573683a user_congthresh:
00b0 75696e74 00706172 6d3d6d61 785f7573 uint.parm=max_us
00c0 65725f62 67726571 3a476c6f 62616c20 er_bgreq:Global
00d0 6c696d69 7420666f 72207468 65206d61 limit for the ma
00e0 78696d75 6d206e75 6d626572 206f6620 ximum number of
00f0 6261636b 67726f75 6e646564 20726571 backgrounded req
0100 75657374 7320616e 20756e70 72697669 uests an unprivi
0110 6c656765 64207573 65722063 616e2073 leged user can s
0120 65740070 61726d74 7970653d 6d61785f et.parmtype=max_
0130 75736572 5f626772 65713a75 696e7400 user_bgreq:uint.
0140 6c696365 6e73653d 47504c00 64657363 license=GPL.desc
0150 72697074 696f6e3d 46696c65 73797374 ription=Filesyst
0160 656d2069 6e205573 65727370 61636500 em in Userspace.
0170 61757468 6f723d4d 696b6c6f 7320537a author=Miklos Sz
0180 65726564 69203c6d 696b6c6f 7340737a eredi <miklos@sz
0190 65726564 692e6875 3e000000 00000000 eredi.hu>.......
01a0 64657065 6e64733d 00696e74 7265653d depends=.intree=
01b0 59007665 726d6167 69633d33 2e322e30 Y.vermagic=3.2.0
01c0 2d342d61 6d643634 20534d50 206d6f64 -4-amd64 SMP mod
01d0 5f756e6c 6f616420 6d6f6476 65727369 _unload modversi
01e0 6f6e7320 00 ons .
As you can see above, modinfo is parsing the .modinfo section embedded within the module file to present the details of the module. Example 6.2.2, Putting module info into sections, shows how one field, the "author", is put into the module. The code mostly comes from include/linux/module.h.
/*
* Start at the bottom, and work your way up!
*/
/* Indirect macros required for expanded argument pasting, eg. __LINE__. */
#define ___PASTE(a,b) a##b
#define __PASTE(a,b) ___PASTE(a,b)
#define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__)
/* Indirect stringification. Doing two levels allows the parameter to be a
* macro itself. For example, compile with -DFOO=bar, __stringify(FOO)
* converts to "bar".
*/
#define __stringify_1(x...) #x
#define __stringify(x...) __stringify_1(x)
#define __MODULE_INFO(tag, name, info) \
static const char __UNIQUE_ID(name)[] \
__used __attribute__((section(".modinfo"), unused, aligned(1))) \
= __stringify(tag) "=" info
/* Generic info of form tag = "info" */
#define MODULE_INFO(tag, info) __MODULE_INFO(tag, tag, info)
/*
* Author(s), use "Name <email>" or just "Name", for multiple
* authors use multiple MODULE_AUTHOR() statements/lines.
*/
#define MODULE_AUTHOR(_author) MODULE_INFO(author, _author)
/* ---- */
MODULE_AUTHOR("Your Name <your@name.com>");
At first, this looks like a macro nightmare, but it can be unravelled step by step. Starting at the bottom, we see that MODULE_AUTHOR is a wrapper around the generic MODULE_INFO macro, which in turn calls __MODULE_INFO, where most of the magic happens. There, we can see that we are building up a static const char[] variable to hold the string "author=Your Name <your@name.com>". The interesting thing to note is that the variable has an extra attribute, __attribute__((section(".modinfo"))), which tells the compiler not to put it in one of the usual data sections with all the other variables, but to stash it in its own ELF section called .modinfo. The other attributes stop the variable being optimised away because it looks unused, and ensure the variables are packed in next to each other by specifying the alignment.
There is extensive use of stringification macros, which are rather arcane tricks used within the C pre-processor to ensure that strings and definitions can live together. The only other trick is the use of the __COUNTER__ special define provided by gcc, which expands to a unique, incrementing value each time it is used; this allows multiple MODULE_AUTHOR calls to appear in the one file and not end up with the same variable name.
We can inspect the symbols placed in the final module to see the end result:
$ objdump --syms ./fuse.ko | grep modinfo
0000000000000000 l d .modinfo 0000000000000000 .modinfo
0000000000000000 l O .modinfo 0000000000000013 __UNIQUE_ID_alias1
0000000000000013 l O .modinfo 0000000000000018 __UNIQUE_ID_alias0
000000000000002b l O .modinfo 0000000000000011 __UNIQUE_ID_alias8
000000000000003c l O .modinfo 000000000000000e __UNIQUE_ID_alias7
000000000000004a l O .modinfo 0000000000000068 __UNIQUE_ID_max_user_congthresh6
00000000000000b2 l O .modinfo 0000000000000022 __UNIQUE_ID_max_user_congthreshtype5
00000000000000d4 l O .modinfo 000000000000006e __UNIQUE_ID_max_user_bgreq4
0000000000000142 l O .modinfo 000000000000001d __UNIQUE_ID_max_user_bgreqtype3
000000000000015f l O .modinfo 000000000000000c __UNIQUE_ID_license2
000000000000016b l O .modinfo 0000000000000024 __UNIQUE_ID_description1
000000000000018f l O .modinfo 000000000000002a __UNIQUE_ID_author0
00000000000001b9 l O .modinfo 0000000000000011 __UNIQUE_ID_alias0
00000000000001d0 l O .modinfo 0000000000000009 __module_depends
00000000000001d9 l O .modinfo 0000000000000009 __UNIQUE_ID_intree1
00000000000001e2 l O .modinfo 000000000000002f __UNIQUE_ID_vermagic0
6.3 Linker Scripts
In Example 3.3.2.2, Sections, we described how sections make up segments in the final output. It is the job of the linker to build these sections into segments; to achieve this it uses a linker script, which describes where segments start, what sections go into them and various other parameters.
Example 6.3.1, The default linker script, shows an extract of the default linker script, which the linker will print when given its verbose flag by passing -Wl,--verbose to gcc. The default script is built in to the linker and is based on the standard API definitions to create working user-space programs for the target platform.
$ gcc -Wl,--verbose -o test test.c
GNU ld (GNU Binutils for Debian) 2.26
...
using internal linker script:
==================================================
OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64",
"elf64-x86-64")
OUTPUT_ARCH(i386:x86-64)
ENTRY(_start)
SEARCH_DIR("=/usr/local/lib/x86_64-linux-gnu"); ...
SECTIONS
{
/* Read-only sections, merged into text segment: */
PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x400000)); . = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS;
.interp : { *(.interp) }
.note.gnu.build-id : { *(.note.gnu.build-id) }
.hash : { *(.hash) }
.gnu.hash : { *(.gnu.hash) }
.dynsym : { *(.dynsym) }
.dynstr : { *(.dynstr) }
.gnu.version : { *(.gnu.version) }
.gnu.version_d : { *(.gnu.version_d) }
.gnu.version_r : { *(.gnu.version_r) }
.rela.dyn :
{
...
}
PROVIDE (etext = .);
.rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) }
.rodata1 : { *(.rodata1) }
...
You can roughly see how the linker script specifies things like starting locations and which sections to group into various segments. In the same way that -Wl is used to pass --verbose to the linker via gcc, a customised linker script can be provided with a flag (GNU ld accepts one via its -T option). Regular user-space developers are unlikely to need to override the default linker script. However, very customised applications, such as kernel builds, often require customised linker scripts.