Learn a JVM Plugin: Using HSDIS to Disassemble JIT-Generated Code

Date: 2022-05-06

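HSDIS is an officially recommended disassembler plugin for the code that the HotSpot VM's JIT compiler produces. Once the plugin is installed, the JVM option -XX:+PrintAssembly (a diagnostic option, so it must be combined with -XX:+UnlockDiagnosticVMOptions) makes HotSpot load HSDIS, which turns the native code generated by the JIT at runtime back into assembly and prints it.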

The plugin's official website no longer seems to exist. If you visit www.kenai.com, you are simply told to contact Oracle:

The practical route is to search the web for the file (hsdis-amd64.dylib) and download it, or to get it from here: https://github.com/importsource/jvm-tuts/blob/master/hsdis-amd64.dylib

Download the build that matches your operating system. The code in this article runs on a Mac, so I use hsdis-amd64.dylib.

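Next, put the file in a directory of your choice. I placed it in an hsdis directory under the Downloads directory on my machine, and then configured IDEA as follows: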

Note: LD_LIBRARY_PATH=/Users/hezhuofan/Downloads/hsdis. The value should stop at the directory that contains the library; do not include the file name itself.
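Before launching the JVM, it can help to confirm that the library really sits in one of the directories listed in LD_LIBRARY_PATH. The following is only a minimal sketch: the class name HsdisLocator and its methods are made up for illustration, and the file-name mapping assumes the hsdis-amd64 naming used in this article with the usual per-OS shared-library extensions.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the JDK or of HSDIS.
public class HsdisLocator {

    // Maps an os.name value to the expected library file name.
    // Assumption: x86-64 build named hsdis-amd64 with a per-OS extension.
    static String libraryFileName(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        String ext;
        if (os.contains("mac") || os.contains("darwin")) {
            ext = "dylib";
        } else if (os.contains("win")) {
            ext = "dll";
        } else {
            ext = "so";
        }
        return "hsdis-amd64." + ext;
    }

    // Returns the first entry of a separator-delimited directory list
    // that contains a regular file called fileName.
    static Optional<Path> findIn(String pathList, String separator, String fileName) {
        if (pathList == null || pathList.isEmpty()) {
            return Optional.empty();
        }
        for (String dir : pathList.split(Pattern.quote(separator))) {
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String name = libraryFileName(System.getProperty("os.name"));
        Optional<Path> hit = findIn(System.getenv("LD_LIBRARY_PATH"), File.pathSeparator, name);
        System.out.println(hit.map(p -> "Found " + p)
                .orElse("No " + name + " found on LD_LIBRARY_PATH"));
    }
}
```

Run it with the same environment variable that the IDEA run configuration uses; if it reports that nothing was found, the variable most likely points at the wrong level of the directory tree.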

Then add the following JVM startup options:

-XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -Xcomp -XX:CompileCommand=dontinline,*Bar.sum -XX:CompileCommand=compileonly,*Bar.sum
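In short, -Xcomp forces methods to be compiled on first invocation, dontinline keeps sum from being inlined into its caller, and compileonly restricts compilation to the matching method so the output stays small. If you prefer to launch from code rather than from the IDE, the same options can be assembled programmatically. This is a rough sketch only: PrintAssemblyCommand, build and processFor are illustrative names, and it assumes that a java launcher is on the PATH and that the target class is on the working directory's classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical launcher helper; the names are illustrative only.
public class PrintAssemblyCommand {

    // Builds the command line used in this article for a given class and method.
    static List<String> build(String mainClass, String method) {
        String pattern = "*" + mainClass + "." + method; // e.g. *Bar.sum
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-XX:+UnlockDiagnosticVMOptions"); // required before PrintAssembly
        cmd.add("-XX:+PrintAssembly");
        cmd.add("-Xcomp");                         // compile on first invocation
        cmd.add("-XX:CompileCommand=dontinline," + pattern);
        cmd.add("-XX:CompileCommand=compileonly," + pattern);
        cmd.add(mainClass);
        return cmd;
    }

    // Prepares (but does not start) a process with LD_LIBRARY_PATH set
    // to the directory that holds the hsdis library.
    static ProcessBuilder processFor(List<String> cmd, String hsdisDir) {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.environment().put("LD_LIBRARY_PATH", hsdisDir);
        pb.inheritIO(); // show the disassembly on this console once started
        return pb;
    }

    public static void main(String[] args) {
        // Only prints the command; call processFor(...).start() to actually run it.
        System.out.println(String.join(" ", build("Bar", "sum")));
    }
}
```

Calling processFor(build("Bar", "sum"), "/Users/hezhuofan/Downloads/hsdis").start() would then launch the JVM with the environment described in the note above.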

Then run the program:

/**
 * java -XX:+UnlockDiagnosticVMOptions
 *      -XX:+PrintAssembly
 *      -Xcomp
 *      -XX:CompileCommand=dontinline,*Bar.sum
 *      -XX:CompileCommand=compileonly,*Bar.sum
 *
 *
 * Environment: LD_LIBRARY_PATH=/Users/hezhuofan/Downloads/hsdis
 * @author hezhuofan
 */
public class Bar {
    int a=1;
    static int b=2;
    public int sum(int c){
        return a+b+c;
    }

    public static void main(String[] args){
        new Bar().sum(3);
    }
}

Output:

CompilerOracle: dontinline *Bar.sum

CompilerOracle: compileonly *Bar.sum

Java HotSpot(TM) 64-Bit Server VM warning: PrintAssembly is enabled; turning on DebugNonSafepoints to gain additional output

Loaded disassembler from hsdis-amd64.dylib

Decoding compiled method 0x00000001037b3590:

Code:

[Disassembling for mach='i386:x86-64']

[Entry Point]

[Constants]

# {method} {0x000000011793e510} 'sum' '(I)I' in 'com/importsource/jvm/tuts/Bar'

# this: rsi:rsi = 'com/importsource/jvm/tuts/Bar'

# parm0: rdx = int

# [sp+0x40] (sp of caller)

0x00000001037b3700: mov 0x8(%rsi),%r10d

0x00000001037b3704: shl $0x3,%r10

0x00000001037b3708: cmp %rax,%r10

0x00000001037b370b: jne 0x00000001036edb60 ; {runtime_call}

0x00000001037b3711: data32 data32 nopw 0x0(%rax,%rax,1)

0x00000001037b371c: data32 data32 xchg %ax,%ax

[Verified Entry Point]

0x00000001037b3720: mov %eax,-0x14000(%rsp)

0x00000001037b3727: push %rbp

0x00000001037b3728: sub $0x30,%rsp

0x00000001037b372c: movabs $0x11793e7d8,%rax ; {metadata(method data for {method} {0x000000011793e510} 'sum' '(I)I' in 'com/importsource/jvm/tuts/Bar')}

0x00000001037b3736: mov 0xdc(%rax),%edi

0x00000001037b373c: add $0x8,%edi

0x00000001037b373f: mov %edi,0xdc(%rax)

0x00000001037b3745: movabs $0x11793e510,%rax ; {metadata({method} {0x000000011793e510} 'sum' '(I)I' in 'com/importsource/jvm/tuts/Bar')}

0x00000001037b374f: and $0x0,%edi

0x00000001037b3752: cmp $0x0,%edi

0x00000001037b3755: je 0x00000001037b377b ;*aload_0

; - com.importsource.jvm.tuts.Bar::sum@0 (line 11)

0x00000001037b375b: mov 0xc(%rsi),%eax ;*getfield a

; - com.importsource.jvm.tuts.Bar::sum@1 (line 11)

0x00000001037b375e: movabs $0x7956f9020,%rsi ; {oop(a 'java/lang/Class' = 'com/importsource/jvm/tuts/Bar')}

0x00000001037b3768: mov 0x68(%rsi),%esi ;*getstatic b

; - com.importsource.jvm.tuts.Bar::sum@4 (line 11)

0x00000001037b376b: add %esi,%eax

0x00000001037b376d: add %edx,%eax

0x00000001037b376f: add $0x30,%rsp

0x00000001037b3773: pop %rbp

0x00000001037b3774: test %eax,-0x9bd67a(%rip) # 0x0000000102df6100

; {poll_return}

0x00000001037b377a: retq

0x00000001037b377b: mov %rax,0x8(%rsp)

0x00000001037b3780: movq $0xffffffffffffffff,(%rsp)

0x00000001037b3788: callq 0x00000001037a8da0 ; OopMap{rsi=Oop off=141}

;*synchronization entry

; - com.importsource.jvm.tuts.Bar::sum@-1 (line 11)

; {runtime_call}

0x00000001037b378d: jmp 0x00000001037b375b

0x00000001037b378f: nop

0x00000001037b3790: nop

0x00000001037b3791: mov 0x290(%r15),%rax

0x00000001037b3798: movabs $0x0,%r10

0x00000001037b37a2: mov %r10,0x290(%r15)

0x00000001037b37a9: movabs $0x0,%r10

0x00000001037b37b3: mov %r10,0x298(%r15)

0x00000001037b37ba: add $0x30,%rsp

0x00000001037b37be: pop %rbp

0x00000001037b37bf: jmpq 0x00000001037185a0 ; {runtime_call}

0x00000001037b37c4: hlt

0x00000001037b37c5: hlt

0x00000001037b37c6: hlt

0x00000001037b37c7: hlt

0x00000001037b37c8: hlt

0x00000001037b37c9: hlt

0x00000001037b37ca: hlt

0x00000001037b37cb: hlt

0x00000001037b37cc: hlt

0x00000001037b37cd: hlt

0x00000001037b37ce: hlt

0x00000001037b37cf: hlt

0x00000001037b37d0: hlt

0x00000001037b37d1: hlt

0x00000001037b37d2: hlt

0x00000001037b37d3: hlt

0x00000001037b37d4: hlt

0x00000001037b37d5: hlt

0x00000001037b37d6: hlt

0x00000001037b37d7: hlt

0x00000001037b37d8: hlt

0x00000001037b37d9: hlt

0x00000001037b37da: hlt

0x00000001037b37db: hlt

0x00000001037b37dc: hlt

0x00000001037b37dd: hlt

0x00000001037b37de: hlt

0x00000001037b37df: hlt

[Exception Handler]

[Stub Code]

0x00000001037b37e0: callq 0x00000001037a6720 ; {no_reloc}

0x00000001037b37e5: mov %rsp,-0x28(%rsp)

0x00000001037b37ea: sub $0x80,%rsp

0x00000001037b37f1: mov %rax,0x78(%rsp)

0x00000001037b37f6: mov %rcx,0x70(%rsp)

0x00000001037b37fb: mov %rdx,0x68(%rsp)

0x00000001037b3800: mov %rbx,0x60(%rsp)

0x00000001037b3805: mov %rbp,0x50(%rsp)

0x00000001037b380a: mov %rsi,0x48(%rsp)

0x00000001037b380f: mov %rdi,0x40(%rsp)

0x00000001037b3814: mov %r8,0x38(%rsp)

0x00000001037b3819: mov %r9,0x30(%rsp)

0x00000001037b381e: mov %r10,0x28(%rsp)

0x00000001037b3823: mov %r11,0x20(%rsp)

0x00000001037b3828: mov %r12,0x18(%rsp)

0x00000001037b382d: mov %r13,0x10(%rsp)

0x00000001037b3832: mov %r14,0x8(%rsp)

0x00000001037b3837: mov %r15,(%rsp)

0x00000001037b383b: movabs $0x10249fa8e,%rdi ; {external_word}

0x00000001037b3845: movabs $0x1037b37e5,%rsi ; {internal_word}

0x00000001037b384f: mov %rsp,%rdx

0x00000001037b3852: and $0xfffffffffffffff0,%rsp

0x00000001037b3856: callq 0x00000001022d33de ; {runtime_call}

0x00000001037b385b: hlt

[Deopt Handler Code]

0x00000001037b385c: movabs $0x1037b385c,%r10 ; {section_word}

0x00000001037b3866: push %r10

0x00000001037b3868: jmpq 0x00000001036ef100 ; {runtime_call}

0x00000001037b386d: hlt

0x00000001037b386e: hlt

0x00000001037b386f: hlt

Decoding compiled method 0x00000001037b3950:

Code:

[Entry Point]

[Constants]

# {method} {0x000000011793e510} 'sum' '(I)I' in 'com/importsource/jvm/tuts/Bar'

# this: rsi:rsi = 'com/importsource/jvm/tuts/Bar'

# parm0: rdx = int

# [sp+0x40] (sp of caller)

0x00000001037b3aa0: mov 0x8(%rsi),%r10d

0x00000001037b3aa4: shl $0x3,%r10

0x00000001037b3aa8: cmp %rax,%r10

0x00000001037b3aab: jne 0x00000001036edb60 ; {runtime_call}

0x00000001037b3ab1: data32 data32 nopw 0x0(%rax,%rax,1)

0x00000001037b3abc: data32 data32 xchg %ax,%ax

[Verified Entry Point]

0x00000001037b3ac0: mov %eax,-0x14000(%rsp)

0x00000001037b3ac7: push %rbp

0x00000001037b3ac8: sub $0x30,%rsp ;*aload_0

; - com.importsource.jvm.tuts.Bar::sum@0 (line 11)

0x00000001037b3acc: mov 0xc(%rsi),%eax ;*getfield a

; - com.importsource.jvm.tuts.Bar::sum@1 (line 11)

0x00000001037b3acf: movabs $0x7956f9020,%rsi ; {oop(a 'java/lang/Class' = 'com/importsource/jvm/tuts/Bar')}

0x00000001037b3ad9: mov 0x68(%rsi),%esi ;*getstatic b

; - com.importsource.jvm.tuts.Bar::sum@4 (line 11)

0x00000001037b3adc: add %esi,%eax

0x00000001037b3ade: add %edx,%eax

0x00000001037b3ae0: add $0x30,%rsp

0x00000001037b3ae4: pop %rbp

0x00000001037b3ae5: test %eax,-0x9bd9eb(%rip) # 0x0000000102df6100

; {poll_return}

0x00000001037b3aeb: retq

0x00000001037b3aec: nop

0x00000001037b3aed: nop

0x00000001037b3aee: mov 0x290(%r15),%rax

0x00000001037b3af5: movabs $0x0,%r10

0x00000001037b3aff: mov %r10,0x290(%r15)

0x00000001037b3b06: movabs $0x0,%r10

0x00000001037b3b10: mov %r10,0x298(%r15)

0x00000001037b3b17: add $0x30,%rsp

0x00000001037b3b1b: pop %rbp

0x00000001037b3b1c: jmpq 0x00000001037185a0 ; {runtime_call}

0x00000001037b3b21: hlt

0x00000001037b3b22: hlt

0x00000001037b3b23: hlt

0x00000001037b3b24: hlt

0x00000001037b3b25: hlt

0x00000001037b3b26: hlt

0x00000001037b3b27: hlt

0x00000001037b3b28: hlt

0x00000001037b3b29: hlt

0x00000001037b3b2a: hlt

0x00000001037b3b2b: hlt

0x00000001037b3b2c: hlt

0x00000001037b3b2d: hlt

0x00000001037b3b2e: hlt

0x00000001037b3b2f: hlt

0x00000001037b3b30: hlt

0x00000001037b3b31: hlt

0x00000001037b3b32: hlt

0x00000001037b3b33: hlt

0x00000001037b3b34: hlt

0x00000001037b3b35: hlt

0x00000001037b3b36: hlt

0x00000001037b3b37: hlt

0x00000001037b3b38: hlt

0x00000001037b3b39: hlt

0x00000001037b3b3a: hlt

0x00000001037b3b3b: hlt

0x00000001037b3b3c: hlt

0x00000001037b3b3d: hlt

0x00000001037b3b3e: hlt

0x00000001037b3b3f: hlt

[Exception Handler]

[Stub Code]

0x00000001037b3b40: callq 0x00000001037a6720 ; {no_reloc}

0x00000001037b3b45: mov %rsp,-0x28(%rsp)

0x00000001037b3b4a: sub $0x80,%rsp

0x00000001037b3b51: mov %rax,0x78(%rsp)

0x00000001037b3b56: mov %rcx,0x70(%rsp)

0x00000001037b3b5b: mov %rdx,0x68(%rsp)

0x00000001037b3b60: mov %rbx,0x60(%rsp)

0x00000001037b3b65: mov %rbp,0x50(%rsp)

0x00000001037b3b6a: mov %rsi,0x48(%rsp)

0x00000001037b3b6f: mov %rdi,0x40(%rsp)

0x00000001037b3b74: mov %r8,0x38(%rsp)

0x00000001037b3b79: mov %r9,0x30(%rsp)

0x00000001037b3b7e: mov %r10,0x28(%rsp)

0x00000001037b3b83: mov %r11,0x20(%rsp)

0x00000001037b3b88: mov %r12,0x18(%rsp)

0x00000001037b3b8d: mov %r13,0x10(%rsp)

0x00000001037b3b92: mov %r14,0x8(%rsp)

0x00000001037b3b97: mov %r15,(%rsp)

0x00000001037b3b9b: movabs $0x10249fa8e,%rdi ; {external_word}

0x00000001037b3ba5: movabs $0x1037b3b45,%rsi ; {internal_word}

0x00000001037b3baf: mov %rsp,%rdx

0x00000001037b3bb2: and $0xfffffffffffffff0,%rsp

0x00000001037b3bb6: callq 0x00000001022d33de ; {runtime_call}

0x00000001037b3bbb: hlt

[Deopt Handler Code]

0x00000001037b3bbc: movabs $0x1037b3bbc,%r10 ; {section_word}

0x00000001037b3bc6: push %r10

0x00000001037b3bc8: jmpq 0x00000001036ef100 ; {runtime_call}

0x00000001037b3bcd: hlt

0x00000001037b3bce: hlt

0x00000001037b3bcf: hlt
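Much of a listing like this is hlt padding and blank lines, and the interesting parts are the lines annotated with the bytecode they came from (;*getfield a, ;*getstatic b and so on). As a reading aid, a small filter along the following lines could trim saved output. It is only a sketch: AssemblyListingFilter is a made-up name, and the regular expressions assume the exact line shapes shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical post-processing helper for saved PrintAssembly output.
public class AssemblyListingFilter {

    // An address followed only by the hlt mnemonic, e.g. "0x00000001037b3b21: hlt".
    private static final Pattern PADDING = Pattern.compile("^0x[0-9a-f]+:\\s+hlt\\s*$");

    // A bytecode annotation such as ";*getfield a".
    private static final Pattern BYTECODE = Pattern.compile(";\\*(.*)$");

    // Removes blank lines and hlt padding, keeping everything else in order.
    static List<String> dropPadding(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || PADDING.matcher(trimmed).matches()) {
                continue;
            }
            kept.add(trimmed);
        }
        return kept;
    }

    // Collects the bytecode names that HotSpot attached to instructions.
    static List<String> bytecodes(List<String> lines) {
        List<String> found = new ArrayList<>();
        for (String line : lines) {
            Matcher m = BYTECODE.matcher(line);
            if (m.find()) {
                found.add(m.group(1).trim());
            }
        }
        return found;
    }
}
```

Applied to the two dumps above, bytecodes would pick out entries such as aload_0, getfield a and getstatic b, which map directly onto the expression a+b+c in Bar.sum.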

Why disassemble at all?

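When you analyze how code actually runs, reasoning from bytecode instructions alone cannot show the real execution details. Many virtual machine implementations have drifted some distance from the JVM specification, and the specification has gradually become a conceptual model: an implementation only has to produce equivalent observable behavior.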

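You can also analyze a program with debuggers such as GDB or WinDbg by setting breakpoints, but breakpoint debugging does not readily reach the native code generated by the JIT. In that situation, disassembling the JIT output is the way to see what the code is doing at the lowest level.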