From Fedora Project Wiki

 
Latest revision as of 15:56, 16 February 2010

Understanding the (Proposed) Change to DSO Linking

A quick list of packages that were found to have DSO link issues in mock builds is at DSOLinkBugs


Basics

The default behaviour for ld allows users to 'indirectly' link to required objects/libraries through intermediate objects/libraries. While this is convenient, it can also be dangerous because it ties your program's dependencies to the dependencies of other objects. If those objects ever change their linkages, they can break your program without any changes to your own code!

For example:

libxml2.so has:

 NEEDED            Shared library: [libdl.so.2]
 NEEDED            Shared library: [libz.so.1]

Under the old system, a program that links with libxml2 and uses dlopen need not link with libdl, and a program that links with libxml2 and uses gzopen need not link with libz. While these programs will work, they will break if libxml2 is ever changed to omit the dependency on libdl/libz.
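To see which libraries an object records as needed, the dynamic section can be dumped with readelf -d and filtered. The following is a minimal sketch: the helper name parse_needed is hypothetical, and it assumes the output looks like the NEEDED lines shown above.

```shell
# parse_needed: hypothetical helper, not part of binutils.
# Reads `readelf -d` output on stdin and prints one library name per NEEDED entry.
parse_needed() {
  sed -n 's/.*NEEDED.*Shared library: \[\([^]]*\)].*/\1/p'
}

# Typical use (the library path is an assumption and varies by system):
#   readelf -d /usr/lib/libxml2.so | parse_needed
```

Any library your program calls directly should appear on your own link line, whether or not it also shows up in this list for one of your dependencies.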

What's the difference?

For example (courtesy Roland McGrath):

 ==> foo1.c <==
 #include <stdio.h>
 extern int foo ();
 int
 main ()
 {
   printf ("%d\n", foo ());
 }
 ==> foo2.c <==
 extern int foo ();
 int bar () { return foo (); }
 ==> foo3.c <==
 int foo () { return 0; }


Prepare position-independent code:

gcc -g -fPIC -c foo1.c foo2.c foo3.c

Generate foo3.so:

gcc -shared -o foo3.so foo3.o

Generate foo2.so, linking foo3.so:

gcc -shared -o foo2.so foo2.o foo3.so

The proposed change will affect the next step: Creating foo1.

Current

A call to gcc will succeed quietly, even though the link to foo3.so is only implicit.

 gcc -o foo1 foo1.o foo2.so -Wl,--rpath-link=.

Proposed

The call to gcc will fail, prompting the user to explicitly link the required shared object.

 gcc -o foo1 foo1.o foo2.so -Wl,--rpath-link=.
 /usr/bin/ld: foo1.o: undefined reference to symbol 'foo'
 /usr/bin/ld: note: 'foo' is defined in DSO ./foo3.so so try adding it to the linker command line


So the difference is whether you can refer to a symbol defined in a DSO that you did not list explicitly on your link line, but that is a DT_NEEDED dependency of a DSO you did list (or, recursively, a dependency of those dependencies, I think).

The big difference is that with the proposed change in place, ld will no longer resolve your program's symbols against libraries that were pulled in only as dependencies of other libraries. Under the current default, ld quietly satisfies such references from any library listed as needed by a library on your link line. In abstract terms, if libA is needed by libB and your program uses both libA and libB, your program can get away with linking only to libB. If a later version of libB no longer lists libA as a needed library, a rebuild of your program will mysteriously break.

What do I do?

If you encounter this error, the error message will prompt you to explicitly link to the DSO that you need. From the foo example, adding foo3.so will get rid of the error:

gcc -o foo1 foo1.o foo2.so foo3.so -Wl,--rpath-link=.
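When a build produces many of these errors, the DSO names can be collected from the linker notes in one pass. This is a sketch only: suggest_dsos is a hypothetical helper, and it assumes the note wording matches the messages quoted on this page.

```shell
# suggest_dsos: hypothetical helper that reads linker output on stdin and
# prints each DSO named in a "try adding it to the linker command line" note.
suggest_dsos() {
  sed -n "s/.*note: '[^']*' is defined in DSO \([^ ]*\) so try adding it to the linker command line.*/\1/p" |
    sort -u
}

# Typical use, capturing stderr from the failing link step:
#   gcc -o foo1 foo1.o foo2.so -Wl,--rpath-link=. 2>&1 | suggest_dsos
```

For the foo example this would print ./foo3.so, which is the object to add to the link line.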

Example deltarpm

Run fedora-cvs deltarpm, or check out the 'devel' version of deltarpm from:

:pserver:anonymous@cvs.fedoraproject.org:/cvs/pkgs

Go to the devel folder and run 'make srpm' to produce a source rpm.

In /etc/mock, copy the desired fedora-rawhide-*.cfg file to test.cfg. In the test.cfg file, change the root to 'test'.

Add the following to test.cfg:

 [ld-test]
 name=ld-test
 baseurl=http://roland.fedorapeople.org/ld-test/
 enabled=1
 gpgcheck=0

(Note that the changes to ld are within gcc-4.4.3-5.fc13 so this step should not be necessary)

This will enable the mock build to pull the latest test version of ld. Next, run the build with the new configuration by executing mock -r test /path/to/deltarpm/srpm

The following error should appear in /var/lib/mock/test/result/build.log:

RPM build errors:
/usr/bin/ld.bfd: rpmdumpheader.o: undefined reference to symbol 'Fopen'
/usr/bin/ld.bfd: note: 'Fopen' is defined in DSO /usr/lib/librpmio.so.0 so try adding it to the linker command line
/usr/lib/librpmio.so.0: could not read symbols: Invalid operation
*** /usr/bin/ld: ld behavior mismatch! ***
*** /usr/bin/ld.bfd succeeeded ***
*** /usr/bin/ld.bfd --no-add-needed exits 1 ***
*** arguments: --eh-frame-hdr --build-id -m elf_i386 --hash-style=gnu -dynamic-linker /lib/ld-linux.so.2 -o rpmdumpheader
/usr/lib/gcc/i686-redhat-linux/4.4.2/../../../crt1.o /usr/lib/gcc/i686-redhat-linux/4.4.2/../../../crti.o /usr/lib/gcc/i686-redhat-linux/4.4.2/crtbegin.o
-L/usr/lib/gcc/i686-redhat-linux/4.4.2 -L/usr/lib/gcc/i686-redhat-linux/4.4.2 -L/usr/lib/gcc/i686-redhat-linux/4.4.2/../../.. rpmdumpheader.o -lrpm -lgcc 
--as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/i686-redhat-linux/4.4.2/crtend.o
/usr/lib/gcc/i686-redhat-linux/4.4.2/../../../crtn.o
collect2: ld returned 1 exit status
make: *** [rpmdumpheader] Error 1


This indicates that deltarpm used /usr/lib/librpmio.so.0 without explicitly linking to it. To fix, add -lrpmio to the gcc command for any binaries that use librpmio. In deltarpm, this can be done quickly by changing the Makefile:

rpmdumpheader: rpmdumpheader.o
-       $(CC) $(LDFLAGS) $^ -lrpm -o $@
+       $(CC) $(LDFLAGS) $^ -lrpm -lrpmio -o $@
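
The step from the DSO path in the linker note to the flag added in the Makefile can be sketched as a small shell function. The name dso_to_link_arg is hypothetical, and the sketch assumes the usual libNAME.so naming convention. Paths that do not follow it, such as ./foo3.so, are passed through unchanged so that they can be listed on the link line directly.

```shell
# dso_to_link_arg: hypothetical helper that turns a DSO path from the linker
# note into something that can be appended to the link line.
dso_to_link_arg() {
  name=${1##*/}                      # strip the directory part
  case $name in
    lib*.so*)
      name=${name#lib}               # drop the "lib" prefix
      printf '%s\n' "-l${name%%.so*}" ;;  # drop ".so" and any version suffix
    *)
      printf '%s\n' "$1" ;;          # not a libNAME.so name: use the path itself
  esac
}

# Typical use:
#   dso_to_link_arg /usr/lib/librpmio.so.0
```

With the path from the deltarpm build log this yields -lrpmio, matching the Makefile change above.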