Could you please help me identify where the issue might be in my code, or is there possibly a problem with the compiler itself?"
I was writing my own lex/yacc by cpp. There is some codes for generating binary tree of regular expression. Such code produced a weird bug. Nodes[parent].right = append(R, output, s);
It copies a tree R to be the right sub-tree of Nodes[parent]
. All the nodes within a tree were stored in a array Nodes
in heap. So, the Nodes[parent].right
and return value of append(R, output, s)
are both of a type of size_t
. The initial value of Nodes[parent].right
is (size_t)-1
. I checked the return value just before the append(R, output, s)
returned, it was a correct value of 8. After its return I printed the value of Nodes[parent].right
, it still was the wrong initial value of (size_t)-1
. It seemed that the return met a failure. When I use size_t temp; temp = append(R, output, s); Nodes[parent].right = temp;
instead, this bug disappeared.
Both two code style seems to be equivalent, why the first one produced a bug, while the second one was right.
the following is definition of the template of the binary tree: I have to say sorry that template class buffer<size_t>
and list<size_t>
was defined by myself, they are not a part of std.
template <class T> class Bitree
{
public:
struct node
{
size_t left;
size_t right;
T content;
};
Bitree();
~Bitree();
void clear(void);
void SetHead(size_t head);
size_t NewNodeOffset(void);
node* NewNode(void);
node* Node(size_t site)const;
node* LeftChild(const node* now)const;
node* RightChild(const node* now)const;
const node& operator[](size_t target) const;
node& operator[](size_t target);
void postorderTraversal(buffer<size_t>& output, list<size_t>& s) const;
void postorderTraversal(buffer<size_t>& output) const;
void inorderTraversal(buffer<size_t>& output, list<size_t>& s) const;
void inorderTraversal(buffer<size_t>& output) const;
bool IfLeaf(size_t site)const;
void Demo(FILE* fp)const;
void removal(size_t site);
void removal(size_t site, buffer<size_t>& outpaput, list<size_t>& s);
size_t append(const Bitree<T>& source, buffer<size_t>& output, list<size_t>& s);
void append(const Bitree<T>& left, size_t parent);
void append(const Bitree<T>& left, const Bitree<T>& right, size_t parent);
size_t append_t(const Bitree<T>& source, buffer<size_t>& output, list<size_t>& s, bool key);//for testing of this bug
void append_t(const Bitree<T>& left, const Bitree<T>& right, size_t parent, bool key);
//for testing of this bug
size_t Head;// root
private:
size_t Size;// capacity of node* Nodes
size_t Count;// amount of nodes in this tree
size_t FirstEmpty;
node* Nodes;
};
And the following is the definition of (All the results printed by std::cout are correct.) void append(const Bitree<T>& left, const Bitree<T>& right, size_t parent);
void append(const Bitree<T>& left, const Bitree<T>& right, size_t parent);
{
size_t now, new_node, temp;
s.refresh();
output.refresh();
new_node = SizeMax;
source.postorderTraversal(output, s);
s.refresh();
s.renew(source.Size);
while (output.dequeue(now))
{
new_node = NewNodeOffset();
std::cout << "new_node: " << new_node << std::endl;
std::cout << "now: " << now << std::endl;
temp = source.Nodes[now].left;
(Nodes + new_node)->left = (temp == SizeMax ? SizeMax : s[temp]);
std::cout << "source.Nodes[now].left: " << temp << std::endl;
std::cout << "(Nodes + new_node)->left: " << (Nodes + new_node)->left << std::endl;
temp = source.Nodes[now].right;
(Nodes + new_node)->right = (temp == SizeMax ? SizeMax : s[temp]);
std::cout << "source.Nodes[now].right: " << temp << std::endl;
std::cout << "(Nodes + new_node)->right: " << (Nodes + new_node)->right << std::endl;
(Nodes + new_node)->content = source.Nodes[now].content;
s[now] = new_node;
}
std::cout << "newHead: " << new_node << std::endl;
std::cout << "return newHead: "<< new_node <<"\n\n" << std::endl;
return new_node;
}
the code that called oid append(const Bitree<T>& left, const Bitree<T>& right, size_t parent);
was here:
template <class T> void Bitree<T>::append(const Bitree<T>& L, const Bitree<T>& R, size_t parent)
{
buffer<size_t> output;
list<size_t> s;
size_t temp;
size_t as[15];
removal((Nodes + parent)->left, output, s);
removal((Nodes + parent)->right, output, s);
std::cout << "parent: " << parent << std::endl;
std::cout << "(Nodes + parent)->left: " << (Nodes + parent)->left << std::endl;
(Nodes + parent)->left = append(L, output, s);
std::cout << "parent: " << parent << std::endl;
temp = (Nodes + parent)->right;
std::cout << "(Nodes + parent)->left: " << (Nodes + parent)->left << std::endl;
std::cout << "(Nodes + parent)->right: " << temp << std::endl;
(Nodes + parent)->right = 10007;
std::cout << "(Nodes + parent)->right: " << (Nodes + parent)->right << std::endl;
std::cout << "temp ptr: " << &temp << std::endl;
std::cout << "(Nodes + parent) ptr: " << (Nodes + parent) << std::endl;
std::cout << "(Nodes + parent)->left ptr: " << (size_t) & ((Nodes + parent)->left) << std::endl;
std::cout << "(Nodes + parent)->right ptr: " << &((Nodes + parent)->right) << std::endl;
Nodes[parent].right = append(R, output, s);// the style that produced the bug.
//temp = append(R, output, s);// the style that produced no bug.
//Nodes[parent].right = temp;// the style that produced no bug.
std::cout << "parent: " << parent << std::endl;
std::cout << "(Nodes + parent)->right: " << (Nodes + parent)->right << std::endl;// this line outputs whether Nodes[parent].right was correct valued.
}
I was writing my own lex/yacc by cpp. I met a weird bug when I use g++ version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC),This bug disappeared when I changed my compiler to icc. My code is purely serial, and I compiled it with only option -g
or -O0
. And I have chosen the option -Wall -Wextra -fno-strict-aliasing -fwrapv -fno-aggressive-loop-optimizations -Wconversion -ftrapv
to minimize the warnings. Finally I used option -S to see what was the g++ compiler's outputs.
the return process after the calling was following:(append_t wersion for testing)
movq -184(%rbp), %rax
movq 32(%rax), %rcx
movq -208(%rbp), %rdx
movq %rdx, %rax
addq %rax, %rax
addq %rdx, %rax
salq $3, %rax
leaq (%rcx,%rax), %rbx
movzbl -212(%rbp), %edi
leaq -48(%rbp), %rcx
leaq -176(%rbp), %rdx
movq -200(%rbp), %rsi
movq -184(%rbp), %rax
movl %edi, %r8d
movq %rax, %rdi
call _ZN8hyperlex6BitreeINS_7RegTree4infoEE8append_tERKS3_RNS_6bufferImEERNS_4listImEEb
movq %rax, 8(%rbx)
movq %rax, 8(%rbx) is the return process. %rbx is Nodes[parent] and 8(%rbx) is Nodes[parent].right. It kept the return value in the register %rax.
This is beginning of append(const Bitree<T>& L, const Bitree<T>& R, size_t parent)
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
pushq %rbx
subq $88, %rsp
.cfi_offset 3, -24
movq %rdi, -56(%rbp)
movq %rsi, -64(%rbp)
movq %rdx, -72(%rbp)
movq %rcx, -80(%rbp)
movl %r8d, %eax
movb %al, -84(%rbp)
movq $0, -40(%rbp)
movl $.LC186, %esi
movl $_ZSt4cout, %edi
movq (%rax), %rbx
movl $.LC187, %esi
call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
And this is end of append(const Bitree<T>& L, const Bitree<T>& R, size_t parent)
.
movq -40(%rbp), %rbx
movl $.LC177, %esi
movl $_ZSt4cout, %edi
call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
movq %rbx, %rsi
movq %rax, %rdi
call _ZNSolsEm
movl $.LC178, %esi
movq %rax, %rdi
call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
movl $_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_, %esi
movq %rax, %rdi
call _ZNSolsEPFRSoS_E
leaq -40(%rbp), %rax
addq $32, %rax
movl $_ZSt4cout, %edi
call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
movq %rbx, %rsi
movq %rax, %rdi
call _ZNSolsEm
movl $_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_, %esi
movq %rax, %rdi
call _ZNSolsEPFRSoS_E
movq -40(%rbp), %rax
addq $88, %rsp
popq %rbx
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
The output before the return of append(R, output, s)
is the value of %rax
so the retrun value itself was right. It seemeed that address in %rbx may be wrong. Because append(R, output, s)
used %rbx
, and bat the beginning the old value of %rbx
was pushed into stack by pushq %rbx
, this value mat be wrongly changed by some overflows. But I successfully catched the location that %rbx
was stored and printed its value many times, it turns out that the value of old %rbx
in stack was not changed during the calling of append(R, output, s)
. Could you please help me identify where the issue might be in my code, or is there possibly a problem with the compiler itself?"
s.refresh(); output.refresh();
What iss
, what isoutput
, what isrefresh
?append
uses a bunch of names that don't seem to ever have been declared.append
has a side effect onNodes[parent]
, by changing eitherNodes
orparent
. Prior to C++17, the order of evaluation of the two halves of the assignment was unspecified; the compiler could effectively doNode* temp = &Nodes[parent]; temp->right = append(...);
In this case,Nodes[parent]
before the assignment may not be the same asNodes[parent]
you inspect after the assignment. This would explain why introducing an intermediate temp variable apperas to fix things - it changes the order of evaluation.NewNodeOffset
reallocateNodes
array when full, by any chance? That would explain howappend
may invalideNodes[parent]
.NewNodeOffset
does reallocateNodes
array when full.append
does has side effect onNodes
while it won't changeparent
. The x86 asm verifies your viewpoint. It computedNode* temp = &Nodes[parent]
firstly, storing the result in%rbx
. Then it completedtemp->right = append(...)
and receive return value bymovq %rax, 8(%rbx)
, and in this calling,NewNodeOffset
reallocatedNodes
array, so the address stored in%rbx
was stale, finally led to a assignment failure.