<html>
<head>
<title>
Debugging transported core dumps
</title>
</head>
<body>
<h1>Debugging transported core dumps</h1>
<p>
When a core dump is moved to a machine other than the one that produced it
(a "transported core dump"), debuggers (dbx, gdb, windbg or SA) do not
always open the dump successfully. This is due to kernel and library (shared
object or DLL) mismatches between the core dump machine and the debugger machine.
</p>
<p>
On most platforms, core dumps do not contain text (a.k.a. code) pages.
These pages have to be read from the executable and the shared objects (or DLLs).
It is therefore important to have matching executable and shared object
files on the debugger machine.
</p>
<h3>Solaris transported core dumps</h3>
<p>
Debuggers on Solaris (and Linux) use two additional shared objects,
<b>rtld_db.so</b> and <b>libthread_db.so</b>. rtld_db.so is used to
read information on shared objects from the core dump. libthread_db.so
is used to get information on threads from the core dump. rtld_db.so
evolves along with rtld.so (the runtime linker library) and libthread_db.so
evolves along with libthread.so (the userland multithreading library).
Hence, the debugger machine must have the right versions of rtld_db.so and
libthread_db.so to open the core dump successfully. More details on
these debugger libraries can be found in the
<a href="http://docs.sun.com/app/docs/doc/817-1984/">
Solaris Linkers and Libraries Guide - 817-1984</a>.
</p>
<h3>Solaris SA against transported core dumps</h3>
<p>
With transported core dumps, you may get "rtld_db failures" or
"libthread_db failures", or SA may throw some other error
(for example, that a hotspot symbol is missing) when opening the core dump.
The environment variable <b>LIBSAPROC_DEBUG</b> may be set to any value
to debug such scenarios. With this variable set, SA prints many
messages to standard error, which can be useful for further debugging.
SA on Solaris uses the <b>libproc.so</b> library. This library also
prints debug messages when the environment variable <b>LIBPROC_DEBUG</b> is set, but
setting LIBSAPROC_DEBUG sets LIBPROC_DEBUG as well.
</p>
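<p>
As a rough illustration, the debug variable can be set for a single SA run from a small wrapper script.
This is only a sketch: the helper names <code>sa_debug_env</code> and <code>run_with_sa_debug</code> are
hypothetical, the value "1" is an arbitrary choice (any value works), and the command passed in is assumed
to be whatever you normally use to start SA.
</p>

```python
import os
import subprocess


def sa_debug_env(base_env=None):
    """Return a copy of the environment with SA native debug output enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Any value enables the messages; "1" is an arbitrary choice.
    env["LIBSAPROC_DEBUG"] = "1"
    return env


def run_with_sa_debug(command, log_path):
    """Run command (a list of arguments, assumed to start SA) with debug
    output enabled, writing its standard error to log_path.

    Returns the exit status of the command.
    """
    with open(log_path, "w") as log:
        return subprocess.run(command, env=sa_debug_env(), stderr=log).returncode
```

<p>
Sending standard error to a file keeps the (often large) debug output apart from the normal SA output.
</p>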
<p>
The best way to debug a transported core dump is to match the
debugger machine to the core dump machine, i.e., have the same kernel
and libthread patch levels on both machines. mdb (the Solaris modular
debugger) may be used to find the kernel patch level of the core dump
machine, and the debugger machine may then be brought to the same level.
</p>
<p>
If the matching machine is "far off" in your network, then
<ul>
<li>consider using rlogin and <a href="clhsdb.html">CLHSDB - SA command line HSDB interface</a> or
<li>use SA remote debugging and debug the core on the core machine remotely.
</ul>
</p>
<p>
However, it may not be feasible to find a matching machine to debug on.
If so, you can copy all application shared objects (and libthread_db.so, if needed) from the core dump
machine into a directory on your debugger machine, say, /export/applibs. Then set the <b>SA_ALTROOT</b>
environment variable to point to the /export/applibs directory. Note that /export/applibs should either
mirror the full paths of the libraries (for example, /usr/lib/libthread_db.so from the core
machine should be under /export/applibs/usr/lib, /usr/java/jre/lib/sparc/client/libjvm.so
from the core machine should be under /export/applibs/usr/java/jre/lib/sparc/client, and so on), or
contain libthread_db.so, libjvm.so etc. directly.
</p>
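<p>
The two directory layouts described above can be sketched as a small staging helper. This is a
minimal illustration, not part of SA: the function names <code>alt_root_target</code> and
<code>stage_library</code> are hypothetical, and the script assumes you have already fetched a
local copy of each library from the core dump machine.
</p>

```python
import os
import shutil


def alt_root_target(alt_root, original_path, flat=False):
    """Return where a library from the core machine should live under alt_root.

    original_path is the library's absolute path on the core dump machine.
    With flat=False the full path is mirrored under alt_root; with flat=True
    the file is placed directly in alt_root.
    """
    if flat:
        return os.path.join(alt_root, os.path.basename(original_path))
    return os.path.join(alt_root, original_path.lstrip("/"))


def stage_library(local_copy, original_path, alt_root, flat=False):
    """Copy local_copy into alt_root using the chosen layout; return the target."""
    target = alt_root_target(alt_root, original_path, flat)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    shutil.copy2(local_copy, target)
    return target
```

<p>
For example, staging /usr/lib/libthread_db.so with the mirrored layout and an alt root of
/export/applibs places the file under /export/applibs/usr/lib.
</p>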
<p>
Support for transported core dumps is <b>not</b> built into the standard version of libproc.so. You need to
set the <b>LD_LIBRARY_PATH</b> environment variable to point to the directory containing a specially built version of libproc.so.
This version of libproc.so has a special symbol to support transported core dump debugging.
In the future, this feature may be built into the standard libproc.so; if that happens, this step (of
setting LD_LIBRARY_PATH) can be skipped.
</p>
<h3>Ignoring libthread_db.so failures</h3>
<p>
If you are okay with missing thread-related information, you can set the
<b>SA_IGNORE_THREADDB</b> environment variable to any value. With this
set, SA ignores libthread_db failures. You won't be able to get any
thread-related information, but you will still be able to use SA to get
other information.
</p>
<h3>Linux SA against transported core dumps</h3>
<p>
On Linux, SA parses core and shared library ELF files. SA <b>does not</b> use
libthread_db.so or rtld_db.so for core dump debugging (although
libthread_db.so is used for live process debugging). However, you
may still face problems with transported core dumps, because matching shared
objects may not be in the path(s) specified in the core dump file. To
work around this, you can set the environment variable <b>SA_ALTROOT</b>
to the directory where the shared libraries are kept. The semantics of
this environment variable are the same as on Solaris (see above).
</p>
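<p>
Before starting SA, it can help to check that every shared object recorded in the core dump can be found
under the SA_ALTROOT directory. The sketch below only mirrors the two layouts described for SA_ALTROOT
(full path under the alt root, or the file directly in the alt root); it is not SA's actual lookup code,
and the names <code>find_under_alt_root</code> and <code>missing_libraries</code> are hypothetical. The
list of recorded paths is assumed to come from elsewhere, for instance from a debugger's shared library listing.
</p>

```python
import os


def find_under_alt_root(alt_root, recorded_path):
    """Return the first existing candidate for recorded_path under alt_root.

    recorded_path is the absolute path of a shared object as recorded in the
    core dump. Returns None when neither layout contains the file.
    """
    candidates = [
        # Mirrored layout: the full path is reproduced under alt_root.
        os.path.join(alt_root, recorded_path.lstrip("/")),
        # Flat layout: the file sits directly in alt_root.
        os.path.join(alt_root, os.path.basename(recorded_path)),
    ]
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


def missing_libraries(alt_root, recorded_paths):
    """Return the recorded paths that cannot be found under alt_root."""
    return [p for p in recorded_paths if find_under_alt_root(alt_root, p) is None]
```

<p>
An empty result from <code>missing_libraries</code> suggests the alt root is complete for the paths you listed.
</p>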
</body>
</html>