Does anybody have an idea how to statically compile any resource file right into the executable or the shared library file using GCC?

For example I'd like to add image files that never change (and if they do, I'd have to replace the file anyway) and wouldn't want them to lie around in the file system.

If this is possible (and I think it is because Visual C++ for Windows can do this, too), how can I load the files which are stored in the binary itself? Does the executable parse itself, find the file and extract the data out of it?

Maybe there's an option for GCC which I haven't seen yet. Using search engines didn't really spit out the right stuff.

I would need this to work for shared libraries and normal ELF-executables.

Solution 1

Update: I have grown to prefer the control that John Ripley's assembly .incbin-based solution offers, and I now use a variant of it.

I have used objcopy (GNU binutils) to link the binary data from a file foo-data.bin into the data section of the executable:

objcopy -B i386 -I binary -O elf32-i386 foo-data.bin foo-data.o

This gives you a foo-data.o object file which you can link into your executable. The C interface looks something like

/** created from binary via objcopy */
extern uint8_t foo_data[]      asm("_binary_foo_data_bin_start");
extern uint8_t foo_data_size[] asm("_binary_foo_data_bin_size");
extern uint8_t foo_data_end[]  asm("_binary_foo_data_bin_end");

so you can do stuff like

for (uint8_t *byte=foo_data; byte<foo_data_end; ++byte) {
    transmit_single_byte(*byte);
}

or

size_t foo_size = (size_t)((void *)foo_data_size);
void  *foo_copy = malloc(foo_size);
assert(foo_copy);
memcpy(foo_copy, foo_data, foo_size);

If your target architecture has special constraints as to where constant and variable data is stored, or you want to store that data in the .text segment to make it fit into the same memory type as your program code, you can play with the objcopy parameters some more.

Solution 2

With imagemagick:

convert file.png data.h

Gives something like:

/*
  data.h (PNM).
*/
static unsigned char
  MagickImage[] =
  {
    0x50, 0x36, 0x0A, 0x23, 0x43, 0x72, 0x65, 0x61, 0x74, 0x65, 0x64, 0x20, 
    0x77, 0x69, 0x74, 0x68, 0x20, 0x47, 0x49, 0x4D, 0x50, 0x0A, 0x32, 0x37, 
    0x37, 0x20, 0x31, 0x36, 0x32, 0x0A, 0x32, 0x35, 0x35, 0x0A, 0xFF, 0xFF, 
    0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 

....

For compatibility with other code you can then use either fmemopen to get a "regular" FILE * object, or alternatively std::stringstream to make an iostream. std::stringstream is not great for this though and you can of course just use a pointer anywhere you can use an iterator.

If you're using this with automake don't forget to set BUILT_SOURCES appropriately.

The nice thing about doing it this way is:

  1. You get text out, so it can be kept in version control and diffed and patched sensibly
  2. It is portable and well defined on every platform

Solution 3

You can embed binary files in an executable using the ld linker. For example, if you have a file foo.bar, you can embed it in the executable by adding the following options to the ld command line:

--format=binary foo.bar --format=default

If you are invoking ld through gcc, you need to pass these options with -Wl:

-Wl,--format=binary -Wl,foo.bar -Wl,--format=default

Here --format=binary tells the linker that the following file is binary, and --format=default switches back to the default input format (this is useful if you specify other input files after foo.bar).

Then you can access the content of your file from code:

extern uint8_t data[]     asm("_binary_foo_bar_start");
extern uint8_t data_end[] asm("_binary_foo_bar_end");

There is also a symbol named "_binary_foo_bar_size". I think it is of type uintptr_t, but I didn't check.

Solution 4

You can put all your resources into a ZIP file and append that to the end of the executable file:

g++ foo.c -o foo0
zip -r resources.zip resources/
cat foo0 resources.zip >foo

This works because a) most executable image formats don't care if there is extra data after the image, and b) zip stores its central directory, together with the signature that locates it, at the end of the archive. Your executable is therefore still a valid zip file after this (zip readers tolerate the executable data in front of the archive), and it can be opened and read with libzip.

Solution 5

If you want control over the exact symbol name and placement of resources, you can use (or script) the GNU assembler (not really part of gcc) to import whole binary files. Try this:

Assembly (x86/arm):

thing.s

    .section .rodata

    .global thing
    .type   thing, @object
    .balign 4
thing:
    .incbin "meh.bin"
thing_end:

    .global thing_size
    .type   thing_size, @object
    .balign 4
thing_size:
    .int    thing_end - thing

C:

main.c

#include <stdio.h>

extern const char thing[];
extern const unsigned thing_size;

int main() {
  printf("%p %u\n", thing, thing_size);
  return 0;
}

You can compile this simply with gcc main.c thing.s.

Whatever you use, it's probably best to make a script to generate all the resources, and have nice/uniform symbol names for everything.

Depending on your data and the system specifics, you might need to use different alignment values (preferably with .balign for portability), or integer types of a different size for thing_size, or a different element type for the thing[] array.

Solution 6

From http://www.linuxjournal.com/content/embedding-file-executable-aka-hello-world-version-5967:

I recently had the need to embed a file in an executable. Since I'm working at the command line with gcc, et al and not with a fancy RAD tool that makes it all happen magically it wasn't immediately obvious to me how to make this happen. A bit of searching on the net found a hack to essentially cat it onto the end of the executable and then decipher where it was based on a bunch of information I didn't want to know about. Seemed like there ought to be a better way...

And there is, it's objcopy to the rescue. objcopy converts object files or executables from one format to another. One of the formats it understands is "binary", which is basically any file that's not in one of the other formats that it understands. So you've probably envisioned the idea: convert the file that we want to embed into an object file, then it can simply be linked in with the rest of our code.

Let's say we have a file named data.txt that we want to embed in our executable:

# cat data.txt
Hello world

To convert this into an object file that we can link with our program we just use objcopy to produce a ".o" file:

# objcopy --input binary \
--output elf32-i386 \
--binary-architecture i386 data.txt data.o

This tells objcopy that our input file is in the "binary" format, that our output file should be in the "elf32-i386" format (object files on the x86). The --binary-architecture option tells objcopy that the output file is meant to "run" on an x86. This is needed so that ld will accept the file for linking with other files for the x86. One would think that specifying the output format as "elf32-i386" would imply this, but it does not.

Now that we have an object file we only need to include it when we run the linker:

# gcc main.c data.o

When we run the result we get the prayed for output:

# ./a.out
Hello world

Of course, I haven't told the whole story yet, nor shown you main.c. When objcopy does the above conversion it adds some "linker" symbols to the converted object file:

_binary_data_txt_start
_binary_data_txt_end

After linking, these symbols specify the start and end of the embedded file. The symbol names are formed by prepending _binary_ and appending _start or _end to the file name. If the file name contains any characters that would be invalid in a symbol name, they are converted to underscores (e.g. data.txt becomes data_txt). If you get unresolved names when linking using these symbols, do a hexdump -C on the object file and look at the end of the dump for the names that objcopy chose.

The code to actually use the embedded file should now be reasonably obvious:

#include <stdio.h>

extern char _binary_data_txt_start;
extern char _binary_data_txt_end;

int main(void)
{
    /* Walk from the start address to the end address, printing each byte. */
    char*  p = &_binary_data_txt_start;

    while ( p != &_binary_data_txt_end ) putchar(*p++);
    return 0;
}

One important and subtle thing to note is that the symbols added to the object file aren't "variables". They don't contain any data; rather, their address is their value. I declare them as type char because it's convenient for this example: the embedded data is character data. However, you could declare them as anything, as int if the data is an array of integers, or as struct foo_bar_t if the data were an array of foo bars. If the embedded data is not uniform, then char is probably the most convenient: take its address and cast the pointer to the proper type as you traverse the data.

Solution 7

After reading all the posts here and elsewhere on the Internet, I concluded that there is no resource tool that is:

1) Easy to use in code.

2) Automated (easy to include in cmake/make).

3) Cross-platform.

I decided to write the tool myself. The code is available here: https://github.com/orex/cpp_rsc

Using it with CMake is very easy.

Add the following code to your CMakeLists.txt file:

file(DOWNLOAD https://raw.github.com/orex/cpp_rsc/master/cmake/modules/cpp_resource.cmake ${CMAKE_BINARY_DIR}/cmake/modules/cpp_resource.cmake) 

set(CMAKE_MODULE_PATH ${CMAKE_BINARY_DIR}/cmake/modules)

include(cpp_resource)

find_resource_compiler()
add_resource(pt_rsc) #Add target pt_rsc
link_resource_file(pt_rsc FILE <file_name1> VARIABLE <variable_name1> [TEXT]) #Adds resource files
link_resource_file(pt_rsc FILE <file_name2> VARIABLE <variable_name2> [TEXT])

...

#Get the file to link and the "resource.h" folder
#Unfortunately, CMake does not allow a custom target in the add_executable file list.
get_property(RSC_CPP_FILE TARGET pt_rsc PROPERTY _AR_SRC_FILE)
get_property(RSC_H_DIR TARGET pt_rsc PROPERTY _AR_H_DIR)

add_executable(<your_executable> <your_source_files> ${RSC_CPP_FILE})

A real example using this approach can be downloaded here: https://bitbucket.org/orex/periodic_table