0

Could you please help me identify where the issue might be in my code, or is there possibly a problem with the compiler itself?" I was writing my own lex/yacc by cpp. There is some codes for generating binary tree of regular expression. Such code produced a weird bug. Nodes[parent].right = append(R, output, s); It copies a tree R to be the right sub-tree of Nodes[parent]. All the nodes within a tree were stored in a array Nodes in heap. So, the Nodes[parent].right and return value of append(R, output, s) are both of a type of size_t. The initial value of Nodes[parent].right is (size_t)-1. I checked the return value just before the append(R, output, s) returned, it was a correct value of 8. After its return I printed the value of Nodes[parent].right, it still was the wrong initial value of (size_t)-1. It seemed that the return met a failure. When I use size_t temp; temp = append(R, output, s); Nodes[parent].right = temp; instead, this bug disappeared.

Both two code style seems to be equivalent, why the first one produced a bug, while the second one was right.

the following is definition of the template of the binary tree: I have to say sorry that template class buffer<size_t> and list<size_t> was defined by myself, they are not a part of std.

template <class T> class Bitree
{
public:
    struct node
    {
        size_t left;
        size_t right;
        T content;
    };
    Bitree();
    ~Bitree();
    void clear(void);
    void SetHead(size_t head);
    size_t NewNodeOffset(void);
    node* NewNode(void);
    node* Node(size_t site)const;
    node* LeftChild(const node* now)const;
    node* RightChild(const node* now)const;
    const node& operator[](size_t target) const;
    node& operator[](size_t target);
    void postorderTraversal(buffer<size_t>& output, list<size_t>& s) const;
    void postorderTraversal(buffer<size_t>& output) const;
    void inorderTraversal(buffer<size_t>& output, list<size_t>& s) const;
    void inorderTraversal(buffer<size_t>& output) const;
    bool IfLeaf(size_t site)const;
    void Demo(FILE* fp)const;


    void removal(size_t site);
    void removal(size_t site, buffer<size_t>& outpaput, list<size_t>& s);
    size_t append(const Bitree<T>& source, buffer<size_t>& output, list<size_t>& s);
        
    void append(const Bitree<T>& left, size_t parent);
    void append(const Bitree<T>& left, const Bitree<T>& right, size_t parent);

    size_t append_t(const Bitree<T>& source, buffer<size_t>& output, list<size_t>& s, bool key);//for testing of this bug
    void append_t(const Bitree<T>& left, const Bitree<T>& right, size_t parent, bool key);
//for testing of this bug
    size_t Head;// root 
    private:
        
    size_t Size;// capacity of node* Nodes
    size_t Count;// amount of nodes in this tree
    size_t FirstEmpty;
    node* Nodes;
};

And the following is the definition of (All the results printed by std::cout are correct.) void append(const Bitree<T>& left, const Bitree<T>& right, size_t parent);

void append(const Bitree<T>& left, const Bitree<T>& right, size_t parent); 
{
    size_t now, new_node, temp;
    s.refresh();
    output.refresh();
    new_node = SizeMax;
    source.postorderTraversal(output, s);
    s.refresh();
    s.renew(source.Size);
    while (output.dequeue(now))
    {
        new_node = NewNodeOffset();
          std::cout << "new_node: " << new_node << std::endl;
          std::cout << "now: " << now << std::endl;
        temp = source.Nodes[now].left;
        (Nodes + new_node)->left = (temp == SizeMax ? SizeMax : s[temp]);
          std::cout << "source.Nodes[now].left: " << temp << std::endl;
          std::cout << "(Nodes + new_node)->left: " << (Nodes + new_node)->left << std::endl;
        temp = source.Nodes[now].right;
        (Nodes + new_node)->right = (temp == SizeMax ? SizeMax : s[temp]);
          std::cout << "source.Nodes[now].right: " << temp << std::endl;
          std::cout << "(Nodes + new_node)->right: " << (Nodes + new_node)->right << std::endl;
        (Nodes + new_node)->content = source.Nodes[now].content;
        s[now] = new_node;

    }
      std::cout << "newHead: " << new_node << std::endl;
      std::cout << "return newHead: "<< new_node <<"\n\n" << std::endl;
    return new_node;
}

the code that called oid append(const Bitree<T>& left, const Bitree<T>& right, size_t parent); was here:

template <class T> void Bitree<T>::append(const Bitree<T>& L, const Bitree<T>& R, size_t parent)
{
    buffer<size_t> output;
    list<size_t> s;
    size_t temp;
    size_t as[15];
    removal((Nodes + parent)->left, output, s);
    removal((Nodes + parent)->right, output, s);
      std::cout << "parent: " << parent << std::endl;
      std::cout << "(Nodes + parent)->left: " << (Nodes + parent)->left << std::endl;
    (Nodes + parent)->left = append(L, output, s);
      std::cout << "parent: " << parent << std::endl;
    temp = (Nodes + parent)->right;
      std::cout << "(Nodes + parent)->left: " << (Nodes + parent)->left << std::endl;
      std::cout << "(Nodes + parent)->right: " << temp << std::endl;
    (Nodes + parent)->right = 10007;
      std::cout << "(Nodes + parent)->right: " << (Nodes + parent)->right << std::endl;
      std::cout << "temp ptr: " << &temp << std::endl;
      std::cout << "(Nodes + parent) ptr: " << (Nodes + parent) << std::endl;
      std::cout << "(Nodes + parent)->left ptr: " << (size_t) & ((Nodes + parent)->left) << std::endl;
      std::cout << "(Nodes + parent)->right ptr: " << &((Nodes + parent)->right) << std::endl;
        
    Nodes[parent].right = append(R, output, s);// the style that produced the bug.
    //temp = append(R, output, s);// the style that produced no bug.
    //Nodes[parent].right = temp;// the style that produced no bug. 

      std::cout << "parent: " << parent << std::endl;
      std::cout << "(Nodes + parent)->right: " << (Nodes + parent)->right << std::endl;// this line outputs whether Nodes[parent].right was correct valued.

}

I was writing my own lex/yacc by cpp. I met a weird bug when I use g++ version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC),This bug disappeared when I changed my compiler to icc. My code is purely serial, and I compiled it with only option -g or -O0. And I have chosen the option -Wall -Wextra -fno-strict-aliasing -fwrapv -fno-aggressive-loop-optimizations -Wconversion -ftrapv to minimize the warnings. Finally I used option -S to see what was the g++ compiler's outputs. the return process after the calling was following:(append_t wersion for testing)

        movq    -184(%rbp), %rax
        movq    32(%rax), %rcx
        movq    -208(%rbp), %rdx
        movq    %rdx, %rax
        addq    %rax, %rax
        addq    %rdx, %rax
        salq    $3, %rax
        leaq    (%rcx,%rax), %rbx
        movzbl  -212(%rbp), %edi
        leaq    -48(%rbp), %rcx
        leaq    -176(%rbp), %rdx
        movq    -200(%rbp), %rsi
        movq    -184(%rbp), %rax
        movl    %edi, %r8d
        movq    %rax, %rdi
        call    _ZN8hyperlex6BitreeINS_7RegTree4infoEE8append_tERKS3_RNS_6bufferImEERNS_4listImEEb
        movq    %rax, 8(%rbx)

movq %rax, 8(%rbx) is the return process. %rbx is Nodes[parent] and 8(%rbx) is Nodes[parent].right. It kept the return value in the register %rax.

This is beginning of append(const Bitree<T>& L, const Bitree<T>& R, size_t parent)

        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        pushq   %rbx
        subq    $88, %rsp
        .cfi_offset 3, -24
        movq    %rdi, -56(%rbp)
        movq    %rsi, -64(%rbp)
        movq    %rdx, -72(%rbp)
        movq    %rcx, -80(%rbp)
        movl    %r8d, %eax
        movb    %al, -84(%rbp)
        movq    $0, -40(%rbp)
        movl    $.LC186, %esi
        movl    $_ZSt4cout, %edi
        movq    (%rax), %rbx
        movl    $.LC187, %esi
        call    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc

And this is end of append(const Bitree<T>& L, const Bitree<T>& R, size_t parent).

  movq    -40(%rbp), %rbx
        movl    $.LC177, %esi
        movl    $_ZSt4cout, %edi
        call    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
        movq    %rbx, %rsi
        movq    %rax, %rdi
        call    _ZNSolsEm
        movl    $.LC178, %esi
        movq    %rax, %rdi
        call    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
        movl    $_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_, %esi
        movq    %rax, %rdi
        call    _ZNSolsEPFRSoS_E
        leaq    -40(%rbp), %rax
        addq    $32, %rax
        movl    $_ZSt4cout, %edi
        call    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
        movq    %rbx, %rsi
        movq    %rax, %rdi
        call    _ZNSolsEm
        movl    $_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_, %esi
        movq    %rax, %rdi
        call    _ZNSolsEPFRSoS_E
        movq    -40(%rbp), %rax
        addq    $88, %rsp
        popq    %rbx
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc

The output before the return of append(R, output, s) is the value of %rax so the retrun value itself was right. It seemeed that address in %rbx may be wrong. Because append(R, output, s) used %rbx, and bat the beginning the old value of %rbx was pushed into stack by pushq %rbx, this value mat be wrongly changed by some overflows. But I successfully catched the location that %rbx was stored and printed its value many times, it turns out that the value of old %rbx in stack was not changed during the calling of append(R, output, s). Could you please help me identify where the issue might be in my code, or is there possibly a problem with the compiler itself?"

6
  • s.refresh(); output.refresh(); What is s, what is output, what is refresh? append uses a bunch of names that don't seem to ever have been declared. Commented Jul 8 at 14:07
  • One possiblity is that append has a side effect on Nodes[parent], by changing either Nodes or parent. Prior to C++17, the order of evaluation of the two halves of the assignment was unspecified; the compiler could effectively do Node* temp = &Nodes[parent]; temp->right = append(...); In this case, Nodes[parent] before the assignment may not be the same as Nodes[parent] you inspect after the assignment. This would explain why introducing an intermediate temp variable apperas to fix things - it changes the order of evaluation. Commented Jul 8 at 14:17
  • Say, does NewNodeOffset reallocate Nodes array when full, by any chance? That would explain how append may invalide Nodes[parent]. Commented Jul 8 at 14:27
  • Yes, thank you so much, you're right, you're so great. What you wrote is exact the situation. 'NewNodeOffset' does reallocate 'Nodes' array when full. 'append ' does has side effect on 'Nodes' while it won't change 'parent'.The x86 asm verifies your viewpoint. It computed 'Node* temp = &Nodes[parent]' firstly, storing the result in '%rbx'. Then it completed 'temp->right = append(...)' and receive return value by 'movq %rax, 8(%rbx)' , and in this calling,'NewNodeOffset' reallocated 'Nodes' array, so the address stored in '%rbx' was stale, finally led to a assignment failure.
    – Yiping Hao
    Commented Jul 9 at 3:16
  • Yes, thank you so much, you're right, you're so great. What you wrote is exact the situation. NewNodeOffset does reallocate Nodes array when full. append does has side effect on Nodes while it won't change parent. The x86 asm verifies your viewpoint. It computed Node* temp = &Nodes[parent] firstly, storing the result in %rbx. Then it completed temp->right = append(...) and receive return value by movq %rax, 8(%rbx) , and in this calling, NewNodeOffset reallocated Nodes array, so the address stored in %rbx was stale, finally led to a assignment failure.
    – Yiping Hao
    Commented Jul 9 at 3:19

0

Browse other questions tagged or ask your own question.