hotspot/agent/doc/transported_core.html
changeset 1 489c9b5090e2
0:fd16c54261b3 1:489c9b5090e2
       
     1 <html>
       
     2 <head>
       
     3 <title>
       
     4 Debugging transported core dumps
       
     5 </title>
       
     6 </head>
       
     7 <body>
       
     8 <h1>Debugging transported core dumps</h1>
       
     9 
       
    10 <p>
       
    11 When a core dump is moved to a machine different from the one where it was
       
    12 produced ("transported core dump"), debuggers (dbx, gdb, windbg or SA) do not
       
     13 always successfully open the dump. This is due to a kernel or library (shared
       
     14 object or DLL) mismatch between the core dump machine and the debugger machine.
       
    15 </p>
       
    16 
       
    17 <p>
       
     18 On most platforms, core dumps do not contain text (a.k.a. code) pages.
       
     19 These pages must be read from the executable and shared objects (or DLLs).
       
     20 Therefore it is important to have matching executable and shared object
       
     21 files on the debugger machine.
       
    22 </p>
       
    23 
       
    24 <h3>Solaris transported core dumps</h3>
       
    25 
       
    26 <p>
       
     27 Debuggers on Solaris (and Linux) use two additional shared objects,
       
    28 <b>rtld_db.so</b> and <b>libthread_db.so</b>. rtld_db.so is used to
       
    29 read information on shared objects from the core dump. libthread_db.so
       
    30 is used to get information on threads from the core dump. rtld_db.so
       
    31 evolves along with rtld.so (the runtime linker library) and libthread_db.so
       
    32 evolves along with libthread.so (user land multithreading library). 
       
     33 Hence, the debugger machine should have the right versions of rtld_db.so and
       
    34 libthread_db.so to open the core dump successfully. More details on
       
    35 these debugger libraries can be found in 
       
    36 <a href="http://docs.sun.com/app/docs/doc/817-1984/">
       
     37 Solaris Linkers and Libraries Guide - 817-1984</a>.
       
    38 </p>
       
    39 
       
    40 <h3>Solaris SA against transported core dumps</h3>
       
    41 
       
    42 <p>
       
    43 With transported core dumps, you may get "rtld_db failures" or
       
    44 "libthread_db failures" or SA may just throw some other error
       
    45 (hotspot symbol is missing) when opening the core dump. 
       
     46 The environment variable <b>LIBSAPROC_DEBUG</b> may be set to any value
       
     47 to debug such scenarios. With this variable set, SA prints many
       
     48 messages to standard error, which can be useful for further debugging.
       
     49 SA on Solaris uses the <b>libproc.so</b> library. This library also
       
     50 prints debug messages when the environment variable <b>LIBPROC_DEBUG</b> is set, but
       
     51 setting LIBSAPROC_DEBUG results in setting LIBPROC_DEBUG as well.
       
    52 </p>
       
    53 <p>
       
     54 The best way to debug a transported core dump is to match the
       
     55 debugger machine to the core dump machine, i.e., have the same kernel
       
     56 and libthread patch level on both machines. mdb (the Solaris modular
       
     57 debugger) may be used to find the kernel patch level of the core dump
       
     58 machine, and the debugger machine may then be brought to the same level.
       
    59 </p>
       
    60 <p>
       
    61 If the matching machine is "far off" in your network, then
       
    62 <ul>
       
    63 <li>consider using rlogin and <a href="clhsdb.html">CLHSDB - SA command line HSDB interface</a> or
       
     64 <li>use SA remote debugging to debug the core remotely on the core dump machine.
       
    65 </ul>
       
    66 </p>
       
    67 
       
    68 <p>
       
     69 However, it may not be feasible to find a matching machine to debug on.
       
     70 If so, you can copy all application shared objects (and libthread_db.so, if needed) from the core dump
       
     71 machine into a directory on your debugger machine, say, /export/applibs. Then set the <b>SA_ALTROOT</b>
       
     72 environment variable to point to the /export/applibs directory. Note that /export/applibs should either
       
     73 contain the libraries under their matching full paths, or contain them directly. In the first layout, /usr/lib/libthread_db.so from the core
       
     74 machine should be under the /export/applibs/usr/lib directory, /usr/java/jre/lib/sparc/client/libjvm.so
       
     75 from the core machine should be under /export/applibs/usr/java/jre/lib/sparc/client, and so on. In the second layout, /export/applibs
       
     76 should just contain libthread_db.so, libjvm.so, etc. directly.
       
    77 </p>
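<p>
The full-path layout described above can be prepared with a small helper script. The following is only a sketch: the function name <code>mirror_into_altroot</code> and its <code>source_root</code> parameter are hypothetical, and it assumes the core dump machine's files are reachable locally (for example through a network mount or an unpacked archive).
</p>

```python
import os
import shutil


def mirror_into_altroot(lib_paths, altroot, source_root="/"):
    """Copy each library under altroot, preserving its full path.

    lib_paths   -- absolute paths as seen on the core dump machine
    altroot     -- directory that SA_ALTROOT will point to
    source_root -- hypothetical: where the core dump machine's file
                   system is reachable from the debugger machine
    """
    copied = []
    for lib in lib_paths:
        rel = lib.lstrip("/")                 # e.g. usr/lib/libthread_db.so
        src = os.path.join(source_root, rel)
        dst = os.path.join(altroot, rel)      # e.g. /export/applibs/usr/lib/...
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.copy2(src, dst)                # copy contents and timestamps
        copied.append(dst)
    return copied
```

<p>
For example, calling <code>mirror_into_altroot(["/usr/lib/libthread_db.so"], "/export/applibs", source_root="/mnt/coremachine")</code> would place the library under /export/applibs/usr/lib; the mount point /mnt/coremachine is an assumed name.
</p>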
       
    78 
       
    79 <p>
       
    80 Support for transported core dumps is <b>not</b> built into the standard version of libproc.so. You need to
       
     81 set the <b>LD_LIBRARY_PATH</b> environment variable to point to the directory containing a specially built version of libproc.so.
       
     82 Note that this version of libproc.so has a special symbol to support transported core dump debugging.
       
     83 In the future, this feature may be built into the standard libproc.so; if that happens, this step (of
       
     84 setting LD_LIBRARY_PATH) can be skipped.
       
    85 </p>
       
    86 
       
    87 <h3>Ignoring libthread_db.so failures</h3>
       
    88 <p>
       
     89 If you are okay with missing thread-related information, you can set the
       
     90 <b>SA_IGNORE_THREADDB</b> environment variable to any value. With this
       
     91 set, SA ignores libthread_db failures, so you won't be able to get any
       
     92 thread-related information, but you will still be able to use SA and get
       
     93 other information.
       
    94 </p>
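<p>
The environment variables named in this document (SA_ALTROOT, LD_LIBRARY_PATH, LIBSAPROC_DEBUG and SA_IGNORE_THREADDB) can be collected in one place before starting SA. The sketch below only builds the environment; the helper name <code>sa_environment</code> and its parameters are hypothetical, and the command used to launch SA is deliberately left to the reader.
</p>

```python
import os


def sa_environment(altroot=None, libproc_dir=None, debug=False,
                   ignore_threaddb=False, base_env=None):
    """Return a copy of the environment with the SA-related variables set."""
    env = dict(os.environ if base_env is None else base_env)
    if altroot:
        # Directory holding libraries copied from the core dump machine.
        env["SA_ALTROOT"] = altroot
    if libproc_dir:
        # Directory of the specially built libproc.so, searched first.
        old = env.get("LD_LIBRARY_PATH")
        env["LD_LIBRARY_PATH"] = libproc_dir + (os.pathsep + old if old else "")
    if debug:
        # Any value enables debug messages on standard error.
        env["LIBSAPROC_DEBUG"] = "1"
    if ignore_threaddb:
        # Any value makes SA ignore libthread_db failures.
        env["SA_IGNORE_THREADDB"] = "1"
    return env
```

<p>
The returned dictionary can be passed as the <code>env</code> argument of <code>subprocess.run</code> together with whatever command you normally use to start SA.
</p>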
       
    95 
       
    96 <h3>Linux SA against transported core dumps</h3>
       
    97 <p>
       
     98 On Linux, SA parses core and shared library ELF files. SA <b>does not</b> use
       
     99 libthread_db.so or rtld_db.so for core dump debugging (although
       
    100 libthread_db.so is used for live process debugging). However, you
       
    101 may still face problems with transported core dumps, because matching shared
       
    102 objects may not be in the path(s) specified in the core dump file. To
       
    103 work around this, you can define the environment variable <b>SA_ALTROOT</b>
       
    104 to be the directory where the shared libraries are kept. The semantics of
       
    105 this environment variable are the same as for Solaris (please refer to the section above).
       
   106 </p>
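<p>
Before starting SA, it can help to check that a library path recorded in the core dump has a copy under the SA_ALTROOT directory in one of the two layouts described for Solaris. The helper below is a checking aid only, not SA's actual lookup code; the name <code>find_in_altroot</code> and the order in which the two layouts are tried are assumptions.
</p>

```python
import os


def find_in_altroot(lib_path, altroot):
    """Return the copy of lib_path under altroot, or None if there is none."""
    candidates = [
        # Full-path layout: altroot/usr/lib/libthread_db.so
        os.path.join(altroot, lib_path.lstrip("/")),
        # Flat layout: altroot/libthread_db.so
        os.path.join(altroot, os.path.basename(lib_path)),
    ]
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None
```

<p>
A result of <code>None</code> for a library listed in the core dump suggests that the library still needs to be copied from the core dump machine.
</p>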
       
   107 
       
   108 
       
   109 </body>
       
   110 </html>