On shared libraries

How to build a shared library using gcc/g++?

Building a shared library with gcc is easy, but it takes two steps:

gcc -c -fpic mytools.c
gcc -shared -o libmytools.so mytools.o

The first step compiles mytools.c into mytools.o. The second step links mytools.o into the shared library libmytools.so. If you try to combine the two steps into one, as you would when building an executable program:

g++ -shared -o libmytools.so mytools.c

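You will get the following error, because the code was not compiled as position-independent (adding -fpic to the one-step command avoids it):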

/usr/bin/ld: /tmp/ccGRogtm.o: relocation R_X86_64_32 against `.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
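
The source of mytools.c is not shown in this post. As a purely hypothetical stand-in, a file like the one below would work with the build commands above. The function names mytools_greeting and mytools_length and the string are my own invention, not the original library. The string literal is an example of the read-only data that a shared object has to address in a position-independent way.

```c
/* mytools.c: hypothetical stand-in for the library source used in this post. */
#include <stddef.h>

/* Returns a pointer to a constant string (read-only data). */
const char *mytools_greeting(void)
{
    return "hello from libmytools";
}

/* Returns the length of a NUL-terminated string, or 0 for a NULL pointer. */
size_t mytools_length(const char *s)
{
    size_t n = 0;

    if (s == NULL)
        return 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

An application such as myapp.c would declare these two functions (normally through a small header) and call them like any other C function.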

How to use a shared library?

Linking against a shared library is easier than building it.

gcc -o myapp myapp.c -lmytools -L.

In just one step, you build an app that is linked against the .so file generated before.

Now, it is time to run the app:

./myapp
./myapp: error while loading shared libraries: libmytools.so: cannot open shared object file: No such file or directory

When the executable starts, the kernel passes control to the dynamic linker/loader (ld.so/ld-linux.so.*; on my system, /lib64/ld-linux-x86-64.so.2, which is a symbolic link to /lib64/ld-2.17.so). The loader looks like an ordinary shared library, but it is not one. It is responsible for loading the shared libraries your program requires into memory.

Where does the dynamic loader look for the needed shared libraries?

From the error above, we can tell that the loader does not search the current directory for libmytools.so. If we copy libmytools.so to /lib64/, myapp runs without problems. According to the official ld.so documentation, the dynamic loader searches /lib64 and then /usr/lib64. On my system, /lib64 is a symbolic link to /usr/lib64, so there is actually only one directory to search.

But what if I do not want to copy my library into a system directory? Can the loader still find it? No problem: you can put your .so file in any directory and use the environment variable LD_LIBRARY_PATH to specify its location. For example, you can keep the .so file in the same directory as the main program and set the following environment variable:

export LD_LIBRARY_PATH=.

Now ld.so will search the current directory for libmytools.so, and myapp runs normally.
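
To make the idea of a search path concrete, here is a small sketch of how a colon-separated list such as LD_LIBRARY_PATH can be turned into candidate file names. It is only an illustration under my own assumptions: the function name candidate_path is hypothetical, and the code does not model details of the real loader such as its handling of empty entries.

```c
#include <stdio.h>
#include <string.h>

/*
 * Builds "<dir>/<soname>" for the index-th directory of a colon-separated
 * list. Returns 0 on success, -1 if index is out of range or out is too small.
 */
int candidate_path(const char *path_list, size_t index,
                   const char *soname, char *out, size_t out_size)
{
    const char *start = path_list;

    for (;;) {
        size_t len = strcspn(start, ":");   /* length of the current entry */

        if (index == 0) {
            int n = snprintf(out, out_size, "%.*s/%s", (int)len, start, soname);
            return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
        }
        if (start[len] == '\0')             /* no more entries */
            return -1;
        start += len + 1;                   /* skip the entry and the ':' */
        index--;
    }
}
```

With the list ".:/opt/mytools/lib" (the second directory is made up), index 0 gives ./libmytools.so and index 1 gives /opt/mytools/lib/libmytools.so. A loader-like search would try such candidates in order and stop at the first one that exists.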

But what if I sell my program to customers and do not install my .so files in the standard (default) system library directories (/lib, /usr/lib, etc.)? Must I tell the customers to set LD_LIBRARY_PATH every time they run my program? No. Your installation script can add your custom library directory to /etc/ld.so.conf and then run ldconfig. /etc/ld.so.conf contains a list of directories in which to look for .so files. However, the dynamic loader does not search those directories directly. Why? If too many directories were listed there, it would take the loader too long to find a library (worse, the loader would repeat the search every time a program runs). Instead, a program called ldconfig reads /etc/ld.so.conf, goes through all the .so files in those directories, and builds a soname->sofile map once and for all. The map is saved in /etc/ld.so.cache. You can use “ldconfig -p” to see what is in /etc/ld.so.cache, or use “strings /etc/ld.so.cache” to peek at the printable strings in that file. Although this binary file seems mysterious because you cannot read it in vi, it is mainly a mapping from soname to file location. ld.so looks up the (sorted) cache file to get the locations of the .so files and loads them directly, which speeds up program loading.
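
The benefit of a sorted soname-to-path map can be sketched in a few lines of C. This is a toy model only: the real /etc/ld.so.cache is a binary file with its own layout, and the struct, the function name toy_cache_lookup, and the paths below are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* One entry of the toy cache: a soname and the full path of the file. */
struct cache_entry {
    const char *soname;
    const char *path;
};

/* Entries are sorted by soname (strcmp order) so bsearch can be used. */
static const struct cache_entry toy_cache[] = {
    { "libc.so.6",       "/lib64/libc.so.6" },
    { "libm.so.6",       "/lib64/libm.so.6" },
    { "libmytools.so.1", "/opt/mytools/lib/libmytools.so.1" },
};

/* bsearch comparator: the key is a soname string, the element an entry. */
static int compare_entry(const void *key, const void *elem)
{
    const struct cache_entry *entry = elem;
    return strcmp((const char *)key, entry->soname);
}

/* Returns the cached path for soname, or NULL if it is not in the cache. */
const char *toy_cache_lookup(const char *soname)
{
    const struct cache_entry *hit =
        bsearch(soname, toy_cache,
                sizeof toy_cache / sizeof toy_cache[0],
                sizeof toy_cache[0], compare_entry);

    return hit ? hit->path : NULL;
}
```

A lookup touches only a handful of entries instead of scanning every directory on disk, which is the point of building the map once with ldconfig.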

Nowadays, apps do not insert their library directories directly into /etc/ld.so.conf. Instead, they create their own configuration file in the directory /etc/ld.so.conf.d/ and write their library directories in that file. Looking at /etc/ld.so.conf.d/, you may find that there are already some conf files such as mariadb-x86_64.conf and dyninst-x86_64.conf. Those files were generated when you installed the corresponding software packages. /etc/ld.so.conf now contains only one line:

include ld.so.conf.d/*.conf

So /etc/ld.so.conf becomes the aggregator of multiple configuration files.

Now, we have the default search paths, we have /etc/ld.so.cache, we have ldconfig to generate that cache. It seems we have resolved the problem of finding the dependent shared libraries at run-time, perfectly. But…

What is SONAME used for?

When you link against a shared library as in the example above, the linker saves the .so file name in the dynamic section of the executable, like this:

readelf -d myapp

Dynamic section at offset 0xe18 contains 25 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libmytools.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]

....................................

When myapp runs, the dynamic loader extracts the needed .so file name from the executable and locates it as described above. As we have seen, ldconfig scans the directory containing the .so file and generates the mapping from the .so file name to its exact location, so the dynamic loader can find the .so file and load it.

Consider the situation where your program uses (links against) a third-party library. Your customer installs the third-party library on his system and then installs your software. He can run your program without problems. Later, the third-party library is upgraded, and your customer upgrades it on his system using yum. If the library changes the logic of the function(s) your program uses but does not change the .so file name, chances are your program will not work as expected, or will even crash, despite the fact that it can find the .so file and load it normally. Here, crashing may be better than unexpected behavior, because at least it makes the problem obvious. A better solution is to prevent your program from starting in the first place. There are many ways to implement this. For example, the third-party library can change its .so name if its developers think the new version will break old apps linked against the old version and decide to drop compatibility from then on. Your program will then fail to start because the .so file cannot be found.

Many Linux libraries use a more elegant versioning mechanism. They name their .so file (the realname) xxx.so.major.minor.patch. Every release uses a different file name, so users can recognize a version change simply from the file name. The developers promise backwards compatibility across changes of minor and patch. They do not promise backwards compatibility when major changes. But since the .so file name changes even for a minor code change, a program linked against a previous version would not start, as discussed above. So how can they promise backwards compatibility? This is where SONAME plays its role.

When building a shared library, you can pass a SONAME as a linker option:

gcc -shared -Wl,-soname,libmytools.so.1 -o libmytools.so.1.0.0 mytools.o

Now the soname libmytools.so.1 is put into the dynamic section of the generated libmytools.so.1.0.0. You can see it using readelf.

readelf -d libmytools.so.1.0.0

Dynamic section at offset 0xe08 contains 25 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000e (SONAME)             Library soname: [libmytools.so.1]
..............

You can also check the soname using objdump:

objdump -p libmytools.so.1.0.0

................
Dynamic Section:
  NEEDED               libc.so.6
  SONAME               libmytools.so.1
........................................

A problem arises when you link myapp to libmytools.so.1.0.0 as usual:

gcc -o myapp -lmytools -L. myapp.c
/usr/bin/ld: cannot find -lmytools
collect2: error: ld returned 1 exit status

The linker always expands -lmytools to libmytools.so when it looks for the shared library, so you need to create a symbolic link (the so-called linkname) to the real file:

ln -s libmytools.so.1.0.0 libmytools.so

With the help of the linkname, myapp can be built. An important thing to note is that the soname (libmytools.so.1) in libmytools.so.1.0.0 is read and put into the dynamic section of myapp. We can verify this using readelf or objdump.

readelf -d myapp

Dynamic section at offset 0xe18 contains 25 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libmytools.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
............................

 

objdump -p myapp
......................
Dynamic Section:
  NEEDED               libmytools.so.1
  NEEDED               libc.so.6
................

Note that, this time, it is the soname in the .so file, not the .so file name, that is put in the exe.

When myapp starts, ld.so gets this soname (libmytools.so.1) from the executable, then looks for libmytools.so.1 in /etc/ld.so.cache, the system library directories, etc., as discussed before. Since our generated file is libmytools.so.1.0.0, it seems we need to create a symbolic link libmytools.so.1 that points to libmytools.so.1.0.0; otherwise, ld.so won’t find our library. Fortunately, ldconfig can do this for us automatically. ldconfig (of course, you should add the directory where libmytools.so.1.0.0 resides to /etc/ld.so.conf before running it) parses libmytools.so.1.0.0, extracts the soname (libmytools.so.1) from it, and creates a symbolic link in the same directory as the shared library: libmytools.so.1 => libmytools.so.1.0.0. Then ldconfig inserts an entry into /etc/ld.so.cache: libmytools.so.1 => /pathto/libmytools.so.1.

The benefit of using a soname is that the version of the .so file can keep evolving while the soname inside the file stays the same for as long as the developer of the library wants to maintain backwards compatibility. Once the developer has to break compatibility, he uses a different soname, such as libmytools.so.2, when building the library.
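
The naming convention can be sketched as a small helper that maps a realname to the soname it would conventionally carry. Keep in mind that this only models the convention: the actual soname is whatever was passed with -soname at build time, and ldconfig reads it from the file rather than deriving it from the file name. The function name soname_from_realname is hypothetical.

```c
#include <string.h>

/*
 * Writes the conventional soname ("<base>.so.<major>") for a realname such
 * as "libmytools.so.1.0.0" into out. Returns 0 on success, -1 if the name
 * has no ".so.<major>" part or out is too small.
 */
int soname_from_realname(const char *realname, char *out, size_t out_size)
{
    const char *so = strstr(realname, ".so.");
    const char *major;
    size_t digits, len;

    if (so == NULL)
        return -1;

    major = so + 4;                           /* first char after ".so." */
    digits = strspn(major, "0123456789");     /* length of the major number */
    if (digits == 0)
        return -1;

    len = (size_t)(major - realname) + digits;
    if (len + 1 > out_size)
        return -1;

    memcpy(out, realname, len);
    out[len] = '\0';
    return 0;
}
```

Under this convention, libmytools.so.1.0.0 and libmytools.so.1.2.3 both map to libmytools.so.1, so an app that needs libmytools.so.1 keeps working across those releases, while libmytools.so.2.0.0 maps to libmytools.so.2 and is not picked up by old apps.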

Another subtle detail I want to mention: if we don’t provide a soname when building a shared library, no soname appears in the .so file. In that case, ldconfig won’t create symbolic links. But entries for the library still appear in /etc/ld.so.cache, in the form filename=>fullpathname.

Backwards compatibility and versioning

From the discussion above, we know that SONAME is not used to achieve compatibility but to break it: if we think the new version of our library will break old apps that are linked against the old version, we change the soname of the new library. So how do we achieve compatibility? So far, that depends heavily on the care taken by the developer. He should use the same soname as in previous versions, and he should change the code of the current version carefully, thinking (or hoping) that the updated code won’t break old apps. Although there is no complete set of rules for producing a compatible new library, there are some things the developer must not do. For example, the developer must not change the signature of existing functions; otherwise, old apps are almost certain to hit a segmentation fault. To achieve backwards compatibility, we need more reliable methods.

One possible method is to change the SONAME for every version of the library and to ship all old versions of the .so files with each new release. This way, old apps happily use the old .so files and new apps use the new one. The shortcoming is that no release stands on its own: a release of the library is actually a collection of all its old versions.

Another way to achieve compatibility is to include old versions of functions in the new version of the .so file. A versioning mechanism is used to distinguish the different versions of a function within the same .so file. At link time, the linker stores in the executable the names of the external functions together with their versions in the library. At run time, the loader checks whether the .so includes the proper version of each function. If it does not, the loader refuses to load the .so and the app won’t start. This way, one new version of the .so can keep old apps running (possibly apps linked to several different old versions of the library) as long as it includes the old versions of the functions they need. Glibc uses such a versioning mechanism to achieve backwards compatibility.
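
The idea of looking up a function by name and version can be illustrated with a toy model in plain C. This is not how glibc implements it (the real mechanism lives in ELF version sections and linker version scripts); the table, the version strings MYTOOLS_1.0 and MYTOOLS_1.1, and the function resolve_versioned are all invented for this sketch.

```c
#include <stddef.h>
#include <string.h>

typedef int (*tool_fn)(int);

/* Two behaviours of the same function name, kept side by side. */
static int scale_v1(int x) { return x * 2; }   /* old behaviour */
static int scale_v2(int x) { return x * 3; }   /* new behaviour */

struct versioned_symbol {
    const char *name;
    const char *version;
    tool_fn     fn;
};

/* The "library": every (name, version) pair it still provides. */
static const struct versioned_symbol exported[] = {
    { "scale", "MYTOOLS_1.0", scale_v1 },
    { "scale", "MYTOOLS_1.1", scale_v2 },
};

/* Returns the function for an exact (name, version) match, else NULL. */
tool_fn resolve_versioned(const char *name, const char *version)
{
    size_t i;

    for (i = 0; i < sizeof exported / sizeof exported[0]; i++) {
        if (strcmp(exported[i].name, name) == 0 &&
            strcmp(exported[i].version, version) == 0)
            return exported[i].fn;
    }
    return NULL;   /* a real loader would refuse to start the app here */
}
```

An old app that recorded scale at MYTOOLS_1.0 keeps getting the old behaviour, a new app gets MYTOOLS_1.1, and a request for a version the library does not provide fails up front instead of misbehaving later.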

 

 
