0

I am getting a bug where sometimes this code works sometimes it does not:

48 8B 41 08            ; MOV RAX, [RCX + 0x08] gets the refcount
48 FF C8               ; DEC RAX               ; decrement refCount
48 89 41 08            ; MOV [RCX + 0x08], RAX ; update refCount in obj
48 83 F8 00            ; CMP RAX, 0
74 01                  ; JE +1                 ; (skips early return if RAX is zero)
C3                     ; RET                   ; (return from function)
51                     ; PUSH RCX              ; to cache it
48 8B 49 10            ; MOV RCX, [RCX + 0x10] ; Read the child object into RCX
48 83 F9 00            ; CMP RCX, 0            ; ensure the child object hasn't been set to Nothing
74 11                  ; JE 0x11               ; jump the next 17 bytes to avoid calling release
48 8B 01               ; MOV RAX, [RCX]        ; Get vtable pointer from [RCX] into RAX
48 8B 40 10            ; MOV RAX, [RAX + 0x10] ; Get IUnknown::Release function pointer from [RAX + 0x10] into RAX (QI + hex(ptr_size * 2) for Release)
48 83 EC 28            ; SUB RSP, 0x28         ; Allocate shadow space
FF D0                  ; CALL RAX              ; Call the function pointed to by RAX
48 83 C4 28            ; ADD RSP, 0x28         ; Deallocate shadow space
59                     ; POP RCX               ; to restore it to the parent object
48 B8 {addrCoTaskMemFree} ; MOV RAX, addrCoTaskMemFree
48 83 EC 28            ; SUB RSP, 0x28
FF D0                  ; CALL RAX
48 83 C4 28            ; ADD RSP, 0x28
48 31 C0               ; XOR RAX, RAX
C3                     ; RET

the purpose of this code is a 64 bit Windows Intel implementation of an object's IUnknown::Release method. Here I have a parent object and a child object, and this is the parent's release method.

IUnknown::Release should:

  • decrement a refcount
  • if the refcount is 0 call IUnknown::Release on any member objects
  • call CoTaskMemFree on the parent object instance
  • return the refcount after release

For reference the memory layout of the parent object is something this

Offset Value size
0x00 vtablePointer 0x08 - 64 bits
0x08 refcount 0x08 - 64 bits
0x10 **chiltVTable 0x08 - 64 bits

I am trying to track down a complex bug - this code works fine when I make the parent object hold a second instance of itself as a child, and that object holds nothing as a child. So parent->child->nullptr

However if I make the parent hold an instance of some other COM object like a VBA.Collection, then I get a crash.


What I've tried

I have tried making a diagram of every step of the code making note of the registers in an attempt to ensure the flow is correct see here or image below. But I cannot figure out. I think it may be an issue with the shadow stack space, as sometimes I use the stack prior to a function call, but I feel like I'm banging my head against a brick wall. Debugging this is very tedious, the assembly code is a thunk in VBA injected into memory at runtime, and so I have to attach to Excel.exe with x64dbg, create a breakpoint at the memory allocated for the function code, inject it and wait for the breakpoint to be hit. VBA is interpreted so the callstack is very confusing. As this code is injected I also cannot use jump labels in assembly and need to hardcode jump relative offsets which is error prone.

I'm hoping someone can look at my assembly and immediately spot the issue, I am a complete novice with assembly, especially 64 bit which I have only learnt for this project.

enter image description here

3
  • @Jester I though it was 0x28 because 0x20 is the shadow space, then the additional 0x08 is to add to the 0x08 of the return address and keep things 16 byte aligned. So the callstack looks like [padding][shadow 1][shadow 2][shadow 3][shadow 4][return addr] when I'm inside CoTaskMemFree, with 3 pairs of 8 byte values to keep 16 byte alignment.
    – Greedo
    Commented Jul 7 at 23:48
  • Except PUSH RCX already used up 8 bytes.
    – Jester
    Commented Jul 7 at 23:51
  • 1
    You also forgot to declare unwind codes, so if (say) a stack expansion guard page exception occurs, the system will not be able to walk the stack and decide that your process has been compromised and forcibly terminate it. Commented Jul 7 at 23:58

1 Answer 1

1

You misalign the stack due to the combination of PUSH RCX and how you allocate the shadow space. Also you can get rid of the ADD/SUB pair that cancel each other out. Code starting from the early return could be:

    PUSH RCX              ; to cache it
    SUB RSP, 0x20         ; Allocate shadow space
    MOV RCX, [RCX + 0x10] ; Read the child object into RCX
    TEST RCX, RCX         ; ensure the child object hasn't been set to Nothing
    JZ skip
    MOV RAX, [RCX]        ; Get vtable pointer from [RCX] into RAX
    MOV RAX, [RAX + 0x10] ; Get IUnknown::Release function pointer from [RAX + 0x10] into RAX (QI + hex(ptr_size * 2) for Release)
    CALL RAX              ; Call the function pointed to by RAX
skip:
    MOV RCX, [RSP + 0x20] ; to restore it to the parent object
    MOV RAX, addrCoTaskMemFree
    CALL RAX
    ADD RSP, 0x28         ; free shadow space and discard saved RCX
    XOR EAX, EAX          ; 32 bit ops zero extend no need for REX
    RET

If you are optimizing for size you can replace the MOV RCX, [RSP + 0x20] with POP RCX; PUSH RCX (ie. push it back onto the stack immediately).

8
  • The push/pop (or any other stack pointer modifying instructions aside from call) is not allowed outside a prologue or epilogue. Commented Jul 8 at 0:00
  • Moving the first two instructions to the prologue and adjusting the early return code is left as an exercise :)
    – Jester
    Commented Jul 8 at 0:19
  • 1
    Sorry got it backwards I meant MOV [RSP+0x08], RCX to store RCX in the shadow space, then make the function calls and avoid the need to push and pop. Can I use that space for anything I like is the question.
    – Greedo
    Commented Jul 8 at 0:46
  • 1
    Yes you can use your own shadow space. In fact since you actually even want to store the incoming argument there, it's intended for that specific purpose. Just make sure you adjust the offset if RSP is changed.
    – Jester
    Commented Jul 8 at 0:59
  • 1
    The system needs to understand where all non-volatile registers are, where the return address is, and where the stack pointer is, at any point during execution of your function, because asynchronous exceptions need to be able to walk the stack to find an appropriate handler. You tell the system about all the things you have done with the return address, nonvolatile registers, and the stack by declaring unwind codes in your program. Commented Jul 8 at 2:05

Not the answer you're looking for? Browse other questions tagged or ask your own question.