Introduction to library development under Linux

Author: Razvan MIHAIU
razvan_rem@rem_mihaiu.name (please remove '_rem' and 'rem_')
From: www.mihaiu.name
Date: 11/04/2005




Introduction to object files

An object file holds the machine code and data that a compiler generates from source code, in a form that cannot yet be executed. It represents an intermediate step in the preprocess - compile - link cycle.

To create an object file from a C source file, invoke the C compiler with the -c parameter (the file will be compiled but not linked). The result of the command:

gcc -c test.c
will be the creation of the object file "test.o".

On GNU systems all compiled languages use object files as an intermediate step before producing the executable. One advantage of this system is that, in theory, object files produced by compilers for different languages can be linked into a single executable; for example, one can link C object files with Fortran object files to produce an executable program.

This holds only in theory, because many compilers internally change the names of the symbols that they encounter in the source code. Examples of compilers that use this technique are the C++ compiler and the Fortran compiler. The technique is called 'name mangling'. Because each compiler mangles names in its own way, C++ and Fortran object files generally cannot be linked together directly.

One 'clean' language is C. The C compiler does not perform 'name mangling', so object files from this language can be linked with object files from any other GNU compiled language. This is one of the advantages of C over C++.
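
To make name mangling concrete, here is a minimal sketch of a C source file together with the header idiom commonly used to let C++ code call it. The file names and the function add_numbers are illustrative assumptions and do not come from a particular project; the mangled name quoted in the comment is what g++ typically produces under the Itanium C++ ABI.

```c
/* ---- add_numbers.h (hypothetical header) ---- */
#ifdef __cplusplus
extern "C" {            /* only a C++ compiler defines __cplusplus */
#endif

int add_numbers(int a, int b);

#ifdef __cplusplus
}
#endif

/* ---- add_numbers.c (hypothetical source file) ---- */
/*
 * Compiled with a C compiler, this function is stored in the object
 * file under the plain symbol name "add_numbers".
 * Compiled as C++ without the extern "C" block above, g++ would
 * typically store it as "_Z11add_numbersii" (the name plus an
 * encoding of the argument types).
 */
int add_numbers(int a, int b)
{
    return a + b;
}
```

The extern "C" block is only seen by a C++ compiler; it asks that compiler to use the plain C symbol name for the declarations inside it.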

An object file contains a table of all the variables, constants and functions that it defines, together with the external ones that it references. The names of these variables, constants and functions are called symbols. Because the definitions of the symbols are distributed among the object files, and because the final addresses of the symbols are not known at this stage, references between object files are recorded by symbol name. That is, to make the necessary connections between the object files, the linker uses names and not addresses.

The symbols of a given object file can be listed with the nm program. Typical output looks like this:

00000000 R MAGIC_NO
         U exit
         U fork
00000000 t gcc2_compiled.
         U getpid
         U getppid
00000000 T main
         U perror
         U printf
         U scanf
00000000 D switch_state

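The address displayed by nm is the relative address of the symbol within the object file. The address of a given symbol will most likely change in the final executable. The symbol types that appear in the nm listing are:

T - symbol defined in the text (code) section, visible to other object files
t - symbol defined in the text (code) section, local to this object file
D - symbol defined in the initialized data section
R - symbol defined in a read-only data section
U - undefined symbol: referenced here but defined in another object file or library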
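
As an illustration, a source file along the following lines would produce symbols of the types R, D, t, T and U seen in the nm listing. It is only a sketch: the names MAGIC_NO and switch_state are taken from the listing, but their values and the two functions are assumptions, since the source behind that listing is not shown.

```c
#include <stdio.h>

const int MAGIC_NO = 42;   /* global constant: read-only data, nm type R */
int switch_state = 1;      /* initialized global variable: nm type D */

/* A static function is local to this object file: nm type t. */
static int toggle(int state)
{
    return !state;
}

/*
 * A global function is defined in the text section: nm type T.
 * printf is only referenced here, so it is listed as undefined (U)
 * until the linker finds it in the C library.
 */
int report_state(void)
{
    switch_state = toggle(switch_state);
    printf("magic=%d state=%d\n", MAGIC_NO, switch_state);
    return switch_state;
}
```

Compiling this file with gcc -c and without optimization, then running nm on the resulting object file, should show the letters noted in the comments; an optimizing build may inline toggle and drop its symbol.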

Building ordinary (non-shared) libraries

A collection of object files can be used by many applications in order to provide some common functionality. The collection of object files can be grouped into a library.

For building libraries we will use 2 tools: ar and ranlib.

'ar' is used to build an archive and ranlib is used to build the symbol table for all the objects in the archive.

The command:
ar cru libtest.a test1.o test2.o test3.o
will create the archive "libtest.a".

The archive will be transformed into a library with the ranlib command:

ranlib libtest.a

Note 1:

On some Unix variants the functionality of ranlib has been incorporated into ar. On those systems the ranlib command either does nothing or, even worse, does not exist at all (this is bad because many scripts will break).

Note 2:

'ar' is just an archiver, like 'tar'. It is used for creating libraries because its format is understood by various tools such as ranlib and the linker.

Since 'ar' is just an archiver, any type of file can be inserted into an archive. This is not recommended because some linkers may behave unpredictably.
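
Because the 'ar' format is so simple, an archive can be inspected with a few lines of C. The sketch below assumes the common System V / GNU archive layout (an 8-byte global header followed by one 60-byte text header per member). The names list_archive, field_to_long and ar_member_header are hypothetical, written for this illustration only, and are not part of any library.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One member header: 60 bytes of space-padded ASCII text. */
struct ar_member_header {
    char name[16];   /* member name (GNU ar terminates it with '/') */
    char mtime[12];  /* modification time, decimal seconds since 1970 */
    char uid[6];     /* owner user id, decimal */
    char gid[6];     /* owner group id, decimal */
    char mode[8];    /* access rights, octal */
    char size[10];   /* size of the member data in bytes, decimal */
    char magic[2];   /* always "`\n" */
};

/* Convert a space-padded, unterminated text field to a number. */
static long field_to_long(const char *field, size_t width, int base)
{
    char buf[17];

    if (width > sizeof buf - 1)
        width = sizeof buf - 1;
    memcpy(buf, field, width);
    buf[width] = '\0';
    return strtol(buf, NULL, base);   /* stops at the padding spaces */
}

/*
 * Print one line per archive member.
 * Returns the number of members, or -1 if the file cannot be opened
 * or does not look like an archive.
 */
int list_archive(const char *path)
{
    struct ar_member_header h;
    char global[8];
    int count = 0;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    if (fread(global, 1, sizeof global, f) != sizeof global ||
        memcmp(global, "!<arch>\n", sizeof global) != 0) {
        fclose(f);
        return -1;
    }
    while (fread(&h, 1, sizeof h, f) == sizeof h) {
        long size = field_to_long(h.size, sizeof h.size, 10);

        if (memcmp(h.magic, "`\n", 2) != 0 || size < 0) {
            fclose(f);
            return -1;
        }
        printf("%.16s mode=%lo uid=%ld mtime=%ld size=%ld\n",
               h.name,
               (unsigned long)field_to_long(h.mode, sizeof h.mode, 8),
               field_to_long(h.uid, sizeof h.uid, 10),
               field_to_long(h.mtime, sizeof h.mtime, 10),
               size);
        count++;
        /* Skip the member data, which is padded to an even length. */
        if (fseek(f, size + (size & 1), SEEK_CUR) != 0)
            break;
    }
    fclose(f);
    return count;
}
```

Calling list_archive("libtest.a") on an archive built with the commands above should print one line per member. In GNU archives the first member is usually the symbol index added by ranlib, stored under the name "/".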

Note 3:

A library is different from one big object file. When doing a partial link with the -r option the programmer can create one big object file from many smaller object files.

Example:

gcc -r test1.o test2.o -o big_obj_file.o

The output of such a command is just an ordinary object file, which is different from a library. A library is a collection of object files bundled together with a quick-access index, while the result of the above command is just one big object file. Since the library is also an archive, it stores some additional information about each included object file: its modification time, its owner and its access rights.

A library can be linked into a project in the same way an object file is linked:

gcc test.o libtest.a -o test

There is one subtle difference between linking a library and linking a large object file created with the '-r' option: when a library is linked, only the required object files from inside the library are included, whereas when a large object file is linked, all the routines from that object file are included, making the executable unnecessarily large.

In order to install a library the programmer must copy it to a directory where the GNU compiler will search for it. There are 2 options:

/usr/lib
/usr/local/lib

Since some compilers will only search the path '/usr/lib', I recommend using it in preference to '/usr/local/lib'.

Once the library is installed the program can be compiled like this:

gcc test.o -ltest -o test

Notice the '-ltest' flag. It tells the compiler that a library called libtest.a is located somewhere in '/usr/lib' or '/usr/local/lib' and that this library must be linked with the current project.

If you are unable to install the library in either '/usr/lib' or '/usr/local/lib', the '-L' parameter lets you specify the path to your library:

gcc test.o -L/home/razvan/work -ltest -o test

Any library must be composed of source files (*.cpp, *.cc, *.c) and header files (*.h). The source files are compiled into object files, while the header files declare the resources defined inside the library. This means that, in order to be useful, any library must have at least one header file that is to be included in the source code of the target application.
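
The split between header and source file can be sketched as follows. The library name libtest.a and the object file test1.o come from the earlier commands, while the header name test.h and the functions test_sum and clamp_count are assumptions made up for this illustration. Both files are shown in one listing so that it compiles as it stands.

```c
/* ---- test.h (hypothetical header shipped with libtest.a) ---- */
#ifndef TEST_H
#define TEST_H

/* Public interface: declarations only, no function bodies. */
int test_sum(const int *values, int count);

#endif /* TEST_H */

/* ---- test1.c (compiled to test1.o and archived in libtest.a) ---- */
/* In a real project this file would begin with: #include "test.h" */

/* Internal helper: 'static' keeps it private to this object file,
   so it is not listed in the header. */
static int clamp_count(int count)
{
    return count < 0 ? 0 : count;
}

/* Definition of the function declared in the header. */
int test_sum(const int *values, int count)
{
    int i;
    int total = 0;

    count = clamp_count(count);
    for (i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

An application only needs the declaration from the header at compile time; the definition is supplied by the library when the program is linked.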

The header files can be saved in 2 locations:

/usr/include
/usr/local/include

If you are not able to access those directories then the '-I' compiler flag can be used to specify the non-standard location of your header file.

Example:

Let's suppose that a library called gemini ('libgemini.a') has a corresponding header file called 'gemini.h'. Neither the library nor the header file is installed in a standard directory.

The path for the library is '/home/razvan/work/lib/gemini' and the path for the header file is '/home/razvan/work/include/gemini'.

The application to be built is called 'center'.

The application will be compiled with the following commands:

gcc -c -I/home/razvan/work/include/gemini center.cc
gcc center.o -L/home/razvan/work/lib/gemini/ -lgemini -o center

The first step performs only compilation (hence the -c flag) and produces 'center.o'. The directory that contains the header file is specified with the '-I' flag. Somewhere in 'center.cc' the header gemini.h is included:

#include <gemini.h>

In the second step the linker is called. The '-L' flag specifies the directory that contains the 'libgemini.a' library, and '-lgemini' names the library itself.









Best regards,
Razvan MIHAIU



25/05/2002 - Bucharest, Romania




Razvan Mihaiu © 2000 - 2024