<html>
<head>
<title>
Debugging transported core dumps
</title>
</head>
<body>
<h1>Debugging transported core dumps</h1>
<p>
When a core dump is moved to a machine other than the one where it was
produced (a "transported core dump"), debuggers (dbx, gdb, windbg or SA) do not
always open the dump successfully. This is due to a kernel or library (shared
object or DLL) mismatch between the core dump machine and the debugger machine.
</p>
<p>
On most platforms, core dumps do not contain text (a.k.a. code) pages.
These pages have to be read from the executable and the shared objects (or DLLs).
Therefore it is important to have matching executable and shared object
files on the debugger machine.
</p>
<h3>Solaris transported core dumps</h3>
<p>
Debuggers on Solaris (and Linux) use two additional shared objects,
<b>rtld_db.so</b> and <b>libthread_db.so</b>. rtld_db.so is used to
read information on shared objects from the core dump. libthread_db.so
is used to get information on threads from the core dump. rtld_db.so
evolves along with rtld.so (the runtime linker library) and libthread_db.so
evolves along with libthread.so (the user-land multithreading library).
Hence, the debugger machine must have the right versions of rtld_db.so and
libthread_db.so to open the core dump successfully. More details on
these debugger libraries can be found in the
<a href="http://docs.sun.com/app/docs/doc/817-1984/">
Solaris Linkers and Libraries Guide - 817-1984</a>.
</p>
<h3>Solaris SA against transported core dumps</h3>
<p>
With transported core dumps, you may get "rtld_db failures" or
"libthread_db failures", or SA may just throw some other error
(for example, that a hotspot symbol is missing) when opening the core dump.
The environment variable <b>LIBSAPROC_DEBUG</b> may be set to any value
to debug such scenarios. With this variable set, SA prints many
messages to standard error, which can be useful for further debugging.
SA on Solaris uses the <b>libproc.so</b> library. This library also
prints debug messages when the environment variable <b>LIBPROC_DEBUG</b>
is set, but setting LIBSAPROC_DEBUG results in LIBPROC_DEBUG being set as well.
</p>
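<p>
As an illustration of the debugging setup described above, here is a minimal
Python sketch that enables LIBSAPROC_DEBUG for a child process and saves its
standard error to a file. The helper names (sa_debug_env, run_with_debug_log)
are hypothetical, and the SA launch command is not given in this document, so
it is passed in as a parameter rather than assumed.
</p>

```python
import os
import subprocess


def sa_debug_env(base_env=None):
    """Return a copy of the environment with SA native debug output enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Any value works; per the text, this also results in LIBPROC_DEBUG being set.
    env["LIBSAPROC_DEBUG"] = "1"
    return env


def run_with_debug_log(command, log_path, base_env=None):
    """Run `command` (a list of strings) and write its stderr to log_path.

    `command` is whatever you normally use to start SA on the core dump;
    it is an input here, not something this sketch knows about.
    """
    with open(log_path, "wb") as log:
        result = subprocess.run(command, env=sa_debug_env(base_env), stderr=log)
    return result.returncode
```

<p>
Keeping the debug messages in a file makes it easier to search them for the
first rtld_db or libthread_db failure.
</p>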
<p>
The best way to debug a transported core dump is to match the
debugger machine to the core dump machine, i.e., have the same kernel
and libthread patch level on both machines. mdb (the Solaris modular
debugger) may be used to find the kernel patch level of the core dump
machine, and the debugger machine may then be brought to the same level.
</p>
<p>
If the matching machine is "far off" in your network, then
<ul>
<li>consider using rlogin and <a href="clhsdb.html">CLHSDB - SA command line HSDB interface</a>, or
<li>use SA remote debugging and debug the core on the core machine remotely.
</ul>
</p>
<p>
However, it may not be feasible to find a matching machine to debug on.
If so, you can copy all application shared objects (and libthread_db.so, if needed) from the core dump
machine into a directory on your debugger machine, say, /export/applibs. Now, set the <b>SA_ALTROOT</b>
environment variable to point to the /export/applibs directory. Note that /export/applibs should either
contain the matching 'full path' of each library, i.e., /usr/lib/libthread_db.so from the core
machine should be under the /export/applibs/usr/lib directory, /usr/java/jre/lib/sparc/client/libjvm.so
from the core machine should be under /export/applibs/usr/java/jre/lib/sparc/client, and so on, or /export/applibs
should just contain libthread_db.so, libjvm.so etc. directly.
</p>
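<p>
The two directory layouts described above can be sketched in Python as
follows. This is only an illustration: staged_path and stage_library are
hypothetical helper names, and the sketch assumes the libraries have already
been copied from the core dump machine to some local file.
</p>

```python
import shutil
from pathlib import Path, PurePosixPath


def staged_path(alt_root, original_path, flat=False):
    """Return where a library from the core machine belongs under alt_root.

    flat=False recreates the library's full path below alt_root;
    flat=True places the file directly in alt_root.
    """
    original = PurePosixPath(original_path)
    if flat:
        return Path(alt_root) / original.name
    # Drop the leading "/" so the full path is rebuilt below alt_root.
    parts = original.parts[1:] if original.is_absolute() else original.parts
    return Path(alt_root).joinpath(*parts)


def stage_library(alt_root, copied_file, original_path, flat=False):
    """Copy one already-fetched library into its place under alt_root."""
    target = staged_path(alt_root, original_path, flat)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(copied_file, target)
    return target
```

<p>
For example, staged_path("/export/applibs", "/usr/lib/libthread_db.so") gives
/export/applibs/usr/lib/libthread_db.so, while passing flat=True gives
/export/applibs/libthread_db.so.
</p>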
<p>
Support for transported core dumps is <b>not</b> built into the standard version of libproc.so. You need to
set the <b>LD_LIBRARY_PATH</b> environment variable to point to the directory containing a specially built version of libproc.so.
Note that this version of libproc.so has a special symbol to support transported core dump debugging.
In the future, this feature may be built into the standard libproc.so; if that happens, this step (of
setting LD_LIBRARY_PATH) can be skipped.
</p>
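<p>
A minimal Python sketch of an environment that combines SA_ALTROOT with
LD_LIBRARY_PATH might look like this. The function name transported_core_env
and the directory /export/saproc used in the example are assumptions for
illustration; use the directory that actually holds your specially built
libproc.so.
</p>

```python
import os


def transported_core_env(alt_root, libproc_dir, base_env=None):
    """Return an environment for running SA against a transported core dump."""
    env = dict(os.environ if base_env is None else base_env)
    env["SA_ALTROOT"] = alt_root
    # Put the special libproc.so directory first so it is found before
    # the standard one; keep any existing entries after it.
    existing = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = libproc_dir + (":" + existing if existing else "")
    return env
```

<p>
For example, transported_core_env("/export/applibs", "/export/saproc") returns
a dictionary that can be passed as the environment of the process that starts SA.
</p>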
<h3>Ignoring libthread_db.so failures</h3>
<p>
If you are okay with missing thread-related information, you can set the
<b>SA_IGNORE_THREADDB</b> environment variable to any value. With this
set, SA ignores libthread_db failures. You won't be able to get any
thread-related information, but you will still be able to use SA to get
other information.
</p>
<h3>Linux SA against transported core dumps</h3>
<p>
On Linux, SA parses core and shared library ELF files. SA <b>does not</b> use
libthread_db.so or rtld_db.so for core dump debugging (although
libthread_db.so is used for live process debugging). You
may still face problems with transported core dumps, however, because matching shared
objects may not be in the path(s) specified in the core dump file. To
work around this, you can set the environment variable <b>SA_ALTROOT</b>
to the directory where the shared libraries are kept. The semantics of
this environment variable are the same as on Solaris (see above).
</p>
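<p>
To check that a staged directory will satisfy the paths recorded in a core
dump, the SA_ALTROOT semantics described in this document can be modelled
roughly as below. This is a sketch of the documented behaviour, not SA's
actual lookup code; the function names and the order in which the two
candidates are tried are assumptions.
</p>

```python
import os
from pathlib import PurePosixPath


def alt_root_candidates(alt_root, recorded_path):
    """List the places under alt_root where a recorded library may be kept."""
    recorded = PurePosixPath(recorded_path)
    parts = recorded.parts[1:] if recorded.is_absolute() else recorded.parts
    root = PurePosixPath(alt_root)
    # First the full path recreated below alt_root, then the bare file name.
    return [str(root.joinpath(*parts)), str(root / recorded.name)]


def resolve_library(alt_root, recorded_path, exists=os.path.exists):
    """Return the first candidate that exists, or None if none is staged."""
    for candidate in alt_root_candidates(alt_root, recorded_path):
        if exists(candidate):
            return candidate
    return None
```

<p>
Running resolve_library over every shared object path recorded in the core
dump shows which libraries are still missing from the alternate root before
SA is started.
</p>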
</body>
</html>