Object File Format
This format describes a single object file. Each file must begin with
a header containing the following data at the given byte offsets. The
first value is a "magic number" that identifies the file as an object
file. It is defined in util.Version.objMagicNumber.
Byte offset | Value |
0 | Magic Number |
4 | Symbol table size (symTabSize) |
8 | Reference table size (refTabSize) |
12 | Data segment size (dataSegSize) |
16 | Text segment size (textSegSize) |
20 | Symbol table |
20 + symTabSize | Reference table |
20 + symTabSize + refTabSize | Data segment |
20 + symTabSize + refTabSize + dataSegSize | Text segment |
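The offsets in the table above can be computed mechanically from the four size fields. The following is a minimal sketch, not the project's own loader. It assumes that each header field is a 32-bit integer (which the 4-byte spacing suggests), that sizes are given in bytes, and that the byte order is big-endian; the byte order is a guess and is not stated here. The function name parse_object_header and the optional expected_magic parameter are hypothetical.

```python
import struct

# Assumption: five big-endian 32-bit integers. The byte order is not
# stated in this document; change ">" to "<" if it is little-endian.
OBJ_HEADER_FORMAT = ">5i"
OBJ_HEADER_SIZE = struct.calcsize(OBJ_HEADER_FORMAT)  # 20 bytes


def parse_object_header(blob, expected_magic=None):
    """Return the magic number and a (start, size) pair for each section."""
    if len(blob) < OBJ_HEADER_SIZE:
        raise ValueError("file too short to hold an object header")
    magic, sym_size, ref_size, data_size, text_size = struct.unpack_from(
        OBJ_HEADER_FORMAT, blob, 0
    )
    # The caller supplies the real value of util.Version.objMagicNumber.
    if expected_magic is not None and magic != expected_magic:
        raise ValueError("not an object file: bad magic number")

    # Sections follow the header back to back, in table order.
    sym_start = OBJ_HEADER_SIZE
    ref_start = sym_start + sym_size
    data_start = ref_start + ref_size
    text_start = data_start + data_size
    return {
        "magic": magic,
        "symbol_table": (sym_start, sym_size),
        "reference_table": (ref_start, ref_size),
        "data_segment": (data_start, data_size),
        "text_segment": (text_start, text_size),
    }
```

A section can then be sliced out of the file contents as blob[start:start + size].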
Symbol Table
This includes just the global symbols defined in this module,
represented as an ASCII string like this: "foo 16 bar 128 baz 32". It
maps each symbol to an offset into the data segment. (We should
actually put all symbols in here.)
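As an illustration of the string format, a symbol table can be read as alternating name/offset pairs. This is a sketch only: it assumes tokens are separated by whitespace and that offsets are written in decimal, as in the example string. The function name parse_symbol_table is hypothetical.

```python
def parse_symbol_table(raw):
    """Turn b"foo 16 bar 128" into {"foo": 16, "bar": 128}."""
    tokens = raw.decode("ascii").split()
    if len(tokens) % 2 != 0:
        raise ValueError("symbol table must hold name/offset pairs")
    # Even positions are names, odd positions are data segment offsets.
    return {name: int(offset) for name, offset in zip(tokens[0::2], tokens[1::2])}


symbols = parse_symbol_table(b"foo 16 bar 128 baz 32")
# symbols == {"foo": 16, "bar": 128, "baz": 32}
```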
Reference Table
The reference table is another symbol table, but it maps each symbol to
a LIST of offsets into the text segment where references to that symbol
occur. An example looks like this: "print 256 260 alloc 60 globalVariable 8"
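Because each symbol can be followed by any number of offsets, a reader has to decide where one entry ends and the next begins. The sketch below assumes that a token made only of decimal digits is an offset and that anything else starts a new symbol, so it would misread a symbol whose name is purely numeric. The function name parse_reference_table is hypothetical.

```python
def parse_reference_table(raw):
    """Turn b"print 256 260 alloc 60" into {"print": [256, 260], "alloc": [60]}."""
    references = {}
    current = None
    for token in raw.decode("ascii").split():
        if token.isdigit():
            # A number is a text segment offset for the most recent symbol.
            if current is None:
                raise ValueError("offset appears before any symbol name")
            references[current].append(int(token))
        else:
            # Any non-numeric token starts (or resumes) a symbol's list.
            current = token
            references.setdefault(current, [])
    return references


refs = parse_reference_table(b"print 256 260 alloc 60 globalVariable 8")
# refs == {"print": [256, 260], "alloc": [60], "globalVariable": [8]}
```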
Data Segment
The data segment consists of binary data representing the static data
of this module.
Text Segment
The text segment consists of binary data representing the text
(instructions) in this module. Some of the instructions will have
offsets that have yet to be resolved. These might be references to
global variables that will be stitched by the linker.
Executable File Format
Cebollita executable files have the following layout:
Byte offset | Value |
0 | Magic Number |
4 | Text segment size |
8 | Data segment size |
12 | Stack size |
16 | Heap size |
20 | Entry point (first instruction to execute) |
24 | Text segment |
24 + textSegSize | Data segment |
The first value is a "magic number", used to assert that this file is
indeed an executable file. It is defined in util.Version.exeMagicNumber.
The next five values are integers that represent information about the
executable: text and data segment size describe the size of the
respective segments. Stack and heap size are the requested maximum
sizes for these two dynamic memory regions. Entry point is the offset
into the text segment of the first instruction to execute.
The text segment contains the binary instructions with fully resolved
offsets. Finally, the data segment represents the program's
initialized static/global data.
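The executable header can be read in the same way as the object header. The following is a minimal sketch under the same assumptions: each field is a 32-bit integer, sizes are in bytes, and the byte order is big-endian (a guess, not stated here). The function name parse_executable_header and the optional expected_magic parameter are hypothetical.

```python
import struct

# Assumption: six big-endian 32-bit integers. The byte order is not
# stated in this document; change ">" to "<" if it is little-endian.
EXE_HEADER_FORMAT = ">6i"
EXE_HEADER_SIZE = struct.calcsize(EXE_HEADER_FORMAT)  # 24 bytes


def parse_executable_header(blob, expected_magic=None):
    """Return the header fields plus the file offsets of both segments."""
    if len(blob) < EXE_HEADER_SIZE:
        raise ValueError("file too short to hold an executable header")
    magic, text_size, data_size, stack_size, heap_size, entry = struct.unpack_from(
        EXE_HEADER_FORMAT, blob, 0
    )
    # The caller supplies the real value of util.Version.exeMagicNumber.
    if expected_magic is not None and magic != expected_magic:
        raise ValueError("not an executable file: bad magic number")

    text_start = EXE_HEADER_SIZE
    data_start = text_start + text_size
    return {
        "magic": magic,
        "text_segment": (text_start, text_size),
        "data_segment": (data_start, data_size),
        "stack_size": stack_size,
        "heap_size": heap_size,
        # The entry point is relative to the start of the text segment.
        "entry_point": entry,
        "entry_file_offset": text_start + entry,
    }
```

The entry_file_offset value is a convenience derived from the table: since the text segment starts at byte 24, an entry point of 8 corresponds to byte 32 of the file.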