AddressSanitizer identifies std::vector<T>::push_back as reason for heap-use-after-free error

Updated: 2023-10-16

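I am trying to debug a program that frequently crashes on startup (it eventually starts after a few attempts). After compiling with ASAN I get the following trace, which shows that 9 out of 10 crashes are triggered by std::vector<T>::push_back (note frames #9 and #15 in the two traces below):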

==35520== ERROR: AddressSanitizer: heap-use-after-free on address 0x60520005c37f at pc 0x51cc5c bp 0x7f257ebfc050 sp 0x7f257ebfc048
READ of size 1 at 0x60520005c37f thread T8 (CELOXICA)
#0 0x51cc5b in MappingData** std::__copy_move<false, true, std::random_access_iterator_tag>::__copy_m<MappingData*>(MappingData* const*, MappingData* const*, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:372
#1 0x51c9f5 in MappingData** std::__copy_move_a<false, MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:390
#2 0x51c502 in MappingData** std::__copy_move_a2<false, MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:428
#3 0x51b916 in MappingData** std::copy<MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:460
#4 0x519852 in MappingData** std::__uninitialized_copy<true>::__uninit_copy<MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:93
#5 0x5158d6 in MappingData** std::uninitialized_copy<MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:117
#6 0x511aeb in MappingData** std::__uninitialized_copy_a<MappingData**, MappingData**, MappingData*>(MappingData**, MappingData**, MappingData**, std::allocator<MappingData*>&) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:258
#7 0x50df13 in MappingData** std::__uninitialized_move_if_noexcept_a<MappingData**, MappingData**, std::allocator<MappingData*> >(MappingData**, MappingData**, MappingData**, std::allocator<MappingData*>&) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:281
#8 0x509202 in std::vector<MappingData*, std::allocator<MappingData*> >::_M_insert_aux(__gnu_cxx::__normal_iterator<MappingData**, std::vector<MappingData*, std::allocator<MappingData*> > >, MappingData* const&) /home/olumide/4.8.5/include/c++/4.8.5/bits/vector.tcc:362
#9 0x5060ac in std::vector<MappingData*, std::allocator<MappingData*> >::push_back(MappingData* const&) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_vector.h:913
#10 0x4f1270 in Queue::publishMappingData(MappingData*) /home/olumide/repo/source/app/src/framework/queue.cpp:149
#11 0x7f258c449cd7 in Manager::communicationThread() (/home/fmeprod/apps/current/celoxica.so+0x2bcd7)
#12 0x7f25995f9e82 in thread_proxy (/home/repo/boost/boost_1_56_x64/lib/libboost_thread.so.1.56.0+0x10e82)
#13 0x7f2596248b87 in __asan::AsanThread::ThreadStart() /home/olumide/tmp/build/gcc-4.8.5/gcc-build-4.8.5/x86_64-unknown-linux-gnu/libsanitizer/asan/../../../../libsanitizer/asan/asan_thread.cc:99
#14 0x331ac07aa0 in start_thread (/lib64/libpthread.so.0+0x331ac07aa0)
#15 0x331a8e8bcc in clone (/lib64/libc.so.6+0x331a8e8bcc)
0x60520005c37f is located 2047 bytes inside of 2048-byte region [0x60520005bb80,0x60520005c380)
==35520== AddressSanitizer CHECK failed: ../../../../libsanitizer/asan/asan_allocator2.cc:216 "((id)) != (0)" (0x0, 0x0)
#0 0x7f25962423dd in __asan::AsanCheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) /home/olumide/tmp/build/gcc-4.8.5/gcc-build-4.8.5/x86_64-unknown-linux-gnu/libsanitizer/asan/../../../../libsanitizer/asan/asan_rtl.cc:60
#1 0x7f2596249123 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) /home/olumide/tmp/build/gcc-4.8.5/gcc-build-4.8.5/x86_64-unknown-linux-gnu/libsanitizer/sanitizer_common/../../../../libsanitizer/sanitizer_common/sanitizer_common.cc:57
#2 0x7f25962356ab in __asan::GetStackTraceFromId(unsigned int, __sanitizer::StackTrace*) /home/olumide/tmp/build/gcc-4.8.5/gcc-build-4.8.5/x86_64-unknown-linux-gnu/libsanitizer/asan/../../../../libsanitizer/asan/asan_allocator2.cc:216
#3 0x7f2596246e7a in __asan::DescribeHeapAddress(unsigned long, unsigned long) /home/olumide/tmp/build/gcc-4.8.5/gcc-build-4.8.5/x86_64-unknown-linux-gnu/libsanitizer/asan/../../../../libsanitizer/asan/asan_report.cc:342
#4 0x7f2596247f61 in __asan_report_error /home/olumide/tmp/build/gcc-4.8.5/gcc-build-4.8.5/x86_64-unknown-linux-gnu/libsanitizer/asan/../../../../libsanitizer/asan/asan_report.cc:693
#5 0x7f2596242763 in __asan_report_load1 /home/olumide/tmp/build/gcc-4.8.5/gcc-build-4.8.5/x86_64-unknown-linux-gnu/libsanitizer/asan/../../../../libsanitizer/asan/asan_rtl.cc:226
#6 0x51cc5b in MappingData** std::__copy_move<false, true, std::random_access_iterator_tag>::__copy_m<MappingData*>(MappingData* const*, MappingData* const*, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:372
#7 0x51c9f5 in MappingData** std::__copy_move_a<false, MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:390
#8 0x51c502 in MappingData** std::__copy_move_a2<false, MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:428
#9 0x51b916 in MappingData** std::copy<MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:460
#10 0x519852 in MappingData** std::__uninitialized_copy<true>::__uninit_copy<MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:93
#11 0x5158d6 in MappingData** std::uninitialized_copy<MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:117
#12 0x511aeb in MappingData** std::__uninitialized_copy_a<MappingData**, MappingData**, MappingData*>(MappingData**, MappingData**, MappingData**, std::allocator<MappingData*>&) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:258
#13 0x50df13 in MappingData** std::__uninitialized_move_if_noexcept_a<MappingData**, MappingData**, std::allocator<MappingData*> >(MappingData**, MappingData**, MappingData**, std::allocator<MappingData*>&) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:281
#14 0x509202 in std::vector<MappingData*, std::allocator<MappingData*> >::_M_insert_aux(__gnu_cxx::__normal_iterator<MappingData**, std::vector<MappingData*, std::allocator<MappingData*> > >, MappingData* const&) /home/olumide/4.8.5/include/c++/4.8.5/bits/vector.tcc:362
#15 0x5060ac in std::vector<MappingData*, std::allocator<MappingData*> >::push_back(MappingData* const&) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_vector.h:913
#16 0x4f1270 in Queue::publishMappingData(MappingData*) /home/olumide/repo/source/app/src/framework/queue.cpp:149
#17 0x7f258c449cd7 in Manager::communicationThread() (/home/fmeprod/apps/current/celoxica.so+0x2bcd7)
#18 0x7f25995f9e82 in thread_proxy (/home/repo/boost/boost_1_56_x64/lib/libboost_thread.so.1.56.0+0x10e82)
#19 0x7f2596248b87 in __asan::AsanThread::ThreadStart() /home/olumide/tmp/build/gcc-4.8.5/gcc-build-4.8.5/x86_64-unknown-linux-gnu/libsanitizer/asan/../../../../libsanitizer/asan/asan_thread.cc:99
#20 0x331ac07aa0 in start_thread (/lib64/libpthread.so.0+0x331ac07aa0)
#21 0x331a8e8bcc in clone (/lib64/libc.so.6+0x331a8e8bcc)

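I can't post the code because it is too large and is the property of my employer. However, the essence of the relevant parts of the code is: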

// Thread 1
void Manager::communicationThread()
{
    MappingData* data = new MappingData(...);
    ...
    m_queue.publishMappingData( data ); // m_queue is available to all threads
    // data is not referenced or deallocated after this point
}

void Queue::publishMappingData(MappingData* data)
{
    ...
    // m_buffer is a member of type std::vector<MappingData*>
    m_buffer.push_back( data );
    // contents of m_buffer are ONLY deallocated on shutdown
}

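The strange thing is that:

  1. The pointer created by thread 1 is never deleted or accessed by thread 1 after creation.
  2. The pointers stored in m_buffer by thread 2 are never deleted until the application shuts down.

The remaining 1 in 10 crashes occur while iterating over the contents of the same m_buffer object, as follows: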

// Thread 3
void Transaction::completion()
{
    ...
    m_queue.publishStatus();  // m_queue is available to all threads
    ...
}

void Queue::publishStatus()
{
    ...
    for( int i = 0; i < m_buffer.size(); ++i )
    {
        ... new StatusCode( m_buffer[i]->m_id ); // crashes here
        ...                                      // m_id is a member of MappingData
    }
}

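I know there is practically a 0% chance of a bug in the standard library, but I don't know how to proceed. The only thing I can think of is the difference in pointer widths starting from line 10 of the trace. I thought this was due to an incompatibility between libraries, but I have used the Linux file utility to check that all the applications and shared objects are 64-bit. (They are.) (The difference in pointer widths from line 10 of the trace onwards is due to the stack trace omitting leading zeros in hexadecimal addresses.)

Update

Fearing that AddressSanitizer might be crying wolf, I decided to go back to compiling without ASAN and to debug with gdb. I also built the application with both gcc 4.4.7 and 4.8.5 (ancient, I know, but these are the only compilers we can use at the moment, and they have worked fine until now). Both binaries produce traces similar to the one from the ASAN build.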

gcc 4.4.7

#0  _wordcopy_fwd_aligned (dstp=140736214036472, srcp=140735609323312, len=75535088) at wordcopy.c:101
#1  0x000000331a8839d2 in memmove (dest=0x7fffb40b5fc0, src=<value optimized out>, len=604284864) at memmove.c:73
#2  0x0000000000481d86 in __copy_m<MappingData*> (this=0x7116e0, __position=, __x=<value optimized out>) at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/stl_algobase.h:378
#3  __copy_move_a<false, MappingData**, MappingData**> (this=0x7116e0, __position=, __x=<value optimized out>)
at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/stl_algobase.h:397
#4  __copy_move_a2<false, MappingData**, MappingData**> (this=0x7116e0, __position=, __x=<value optimized out>)
at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/stl_algobase.h:436
#5  copy<MappingData**, MappingData**> (this=0x7116e0, __position=, __x=<value optimized out>) at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/stl_algobase.h:468
#6  uninitialized_copy<MappingData**, MappingData**> (this=0x7116e0, __position=, __x=<value optimized out>)
at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/stl_uninitialized.h:92
#7  uninitialized_copy<MappingData**, MappingData**> (this=0x7116e0, __position=, __x=<value optimized out>)
at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/stl_uninitialized.h:116
#8  __uninitialized_copy_a<MappingData**, MappingData**, MappingData*> (this=0x7116e0, __position=, __x=<value optimized out>)
at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/stl_uninitialized.h:256
#9  __uninitialized_move_a<MappingData**, MappingData**, std::allocator<MappingData*> > (this=0x7116e0, __position=, __x=<value optimized out>)
at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/stl_uninitialized.h:266
#10 std::vector<MappingData*, std::allocator<MappingData*> >::_M_insert_aux (this=0x7116e0, __position=, __x=<value optimized out>)
at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/vector.tcc:338
#11 0x0000000000472e91 in push_back (this=0x711590, data=0x7fffb40b5d50) at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/stl_vector.h:741
#12 Queue::publishMappingData (this=0x711590, data=0x7fffb40b5d50) at src/framework/queue.cpp:149

gcc 4.8.5

#0  0x000000331a889e1a in _wordcopy_bwd_aligned (dstp=140736349684976, srcp=140736348970864, len=84044416) at wordcopy.c:293
#1  0x000000331a8839ba in memmove (dest=0x7fff940df128, src=<value optimized out>, len=672355336) at memmove.c:99
#2  0x00000000004bd9b8 in MappingData** std::__copy_move<false, true, std::random_access_iterator_tag>::__copy_m<MappingData*>(MappingData* const*, MappingData* const*, MappingData**) () at /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:372
#3  0x00000000004bd87e in MappingData** std::__copy_move_a<false, MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) () at /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:390
#4  0x00000000004bd5ac in MappingData** std::__copy_move_a2<false, MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) () at /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:428
#5  0x00000000004bcf91 in MappingData** std::copy<MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) ()
at /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:460
#6  0x00000000004bbeb7 in MappingData** std::__uninitialized_copy<true>::__uninit_copy<MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) () at /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:93
#7  0x00000000004ba28d in MappingData** std::uninitialized_copy<MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) () at /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:117
#8  0x00000000004b80dd in MappingData** std::__uninitialized_copy_a<MappingData**, MappingData**, MappingData*>(MappingData**, MappingData**, MappingData**, std::allocator<MappingData*>&) () at /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:258
#9  0x00000000004b5cfc in MappingData** std::__uninitialized_move_if_noexcept_a<MappingData**, MappingData**, std::allocator<MappingData*> >(MappingData**, MappingData**, MappingData**, std::allocator<MappingData*>&) () at /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:281
#10 0x00000000004b3239 in std::vector<MappingData*, std::allocator<MappingData*> >::_M_insert_aux(__gnu_cxx::__normal_iterator<MappingData**, std::vector<MappingData*, std::allocator<MappingData*> > >, MappingData* const&) () at /home/olumide/4.8.5/include/c++/4.8.5/bits/vector.tcc:369
#11 0x00000000004b1398 in std::vector<MappingData*, std::allocator<MappingData*> >::push_back(MappingData* const&) () at /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_vector.h:913
#12 0x00000000004a4e43 in Queue::publishMappingData(MappingData*) () at src/framework/queue.cpp:149

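What strikes me is that the len argument passed to _wordcopy_bwd_aligned (gcc 4.8.5) and _wordcopy_fwd_aligned (gcc 4.4.7) is close to 100 million, while the len passed to memmove is over half a billion in both cases! (Recall that the vector stores pointers.)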
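As a rough sanity check on these numbers, the memmove byte counts from the two traces can be divided by the pointer size to see how many elements the vector apparently believed it held. The sketch below assumes 8-byte pointers (the question states that everything is 64-bit); the helper name impliedElementCount is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: 8-byte pointers, as on the 64-bit build described in the question.
const std::size_t kPointerSize = 8;

// Hypothetical helper: number of pointer-sized elements implied by a byte count.
std::size_t impliedElementCount(std::size_t bytes)
{
    assert(bytes % kPointerSize == 0); // expect a whole number of pointers
    return bytes / kPointerSize;
}
```

For the two traces, impliedElementCount(604284864) is 75535608 and impliedElementCount(672355336) is 84044417, which is in line with the word counts passed to the _wordcopy functions. Tens of millions of pointers seems implausibly large here, which hints that the vector's internal begin and end pointers may have been inconsistent when the copy started.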

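The length passed to the glibc functions is computed by std::__copy_move, and is the difference between the __last and __first pointers that std::vector<_Tp,_Alloc>::_M_insert_aux ultimately passes to it. A quick inspection of that member function template shows that the branch leading to the crash is taken because this->_M_impl._M_finish == this->_M_impl._M_end_of_storage, i.e. the vector has run out of capacity and must reallocate and relocate its contents. But that is what vectors are designed to do. I don't know whether the reasoning above is correct, but this is as far as I have got.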
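The reallocation branch described above can be observed in isolation. The following is a minimal single-threaded sketch, not the asker's code: int* merely stands in for MappingData*, the function name is invented, and C++11 (nullptr, data()) is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if a push_back on a full vector grew and moved its storage.
bool pushBackOnFullVectorReallocates()
{
    std::vector<int*> v;
    v.reserve(4);
    while (v.size() < v.capacity())
        v.push_back(nullptr);           // fill to capacity, no reallocation yet

    assert(v.size() == v.capacity());   // the state that triggers the slow path
    const std::size_t oldCapacity = v.capacity();
    // Record the buffer address as an integer so the old pointer is never used again.
    const std::uintptr_t oldStorage = reinterpret_cast<std::uintptr_t>(v.data());

    v.push_back(nullptr);               // size == capacity, so storage must grow

    const std::uintptr_t newStorage = reinterpret_cast<std::uintptr_t>(v.data());
    return v.capacity() > oldCapacity && newStorage != oldStorage;
}
```

After the last push_back the old buffer has been freed, so any pointer, reference, or index-based access that another thread is performing against the old buffer at that moment touches freed memory.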

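I don't see any thread synchronization primitives. What is probably happening is that m_buffer.push_back triggers a reallocation in one thread, while another thread using the vector still accesses the old memory region. In other words, this has nothing to do with the MappingData* pointers themselves, but with the memory region inside the vector class that stores those pointers. When the vector reaches its current capacity, that region is deallocated and a new one is allocated in thread A. Thread B, meanwhile, accesses the data through m_buffer[i] or inside m_buffer.push_back() and crashes because that memory region has already been freed.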
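One way to rule this out is to guard every access to m_buffer with a single mutex. The sketch below is an assumption-laden illustration rather than the asker's actual class: MappingData is a stand-in struct, sumIds() stands in for the loop in publishStatus(), demoConcurrentPublish is a made-up driver, and C++11 std::mutex is assumed (with Boost.Thread already present in the trace, boost::mutex could presumably play the same role).

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

struct MappingData { int m_id = 0; };    // stand-in for the real class

class Queue
{
public:
    void publishMappingData(MappingData* data)
    {
        std::lock_guard<std::mutex> lock(m_mutex); // writers take the lock
        m_buffer.push_back(data);
    }

    // Stand-in for publishStatus(): readers take the same lock while iterating.
    long sumIds()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        long sum = 0;
        for (std::size_t i = 0; i < m_buffer.size(); ++i)
            sum += m_buffer[i]->m_id;
        return sum;
    }

    std::size_t size()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_buffer.size();
    }

private:
    std::mutex m_mutex;                  // protects m_buffer
    std::vector<MappingData*> m_buffer;
};

// Hypothetical driver: two threads publish concurrently; returns the final count.
std::size_t demoConcurrentPublish(std::size_t perThread)
{
    std::vector<MappingData> items(2 * perThread); // owns the objects for the demo
    Queue queue;
    auto writer = [&](std::size_t begin) {
        for (std::size_t i = 0; i < perThread; ++i)
            queue.publishMappingData(&items[begin + i]);
    };
    std::thread a(writer, 0);
    std::thread b(writer, perThread);
    a.join();
    b.join();
    return queue.size();
}
```

With the lock in place no element is lost and no thread observes the vector mid-reallocation. Without it, the same driver is a data race and therefore undefined behavior. On GCC, building threaded code typically needs the -pthread flag.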

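Some tips to start tackling this problem:

  • Use unique_ptr instead of new.

  • Give each thread an ID and assert that functions really are called from the thread you expect. If a thread is pushing back onto m_buffer and the item being pushed exceeds the vector's current capacity, the vector may relocate all of its items to different memory, which invalidates any iterators currently held.

  • This doesn't make much sense to me: shouldn't it be the size of the vector? Here an int is being incremented towards a pointer, and it may go out of bounds: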

    for(int i = 0; i < m_buffer; ++i)
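The second tip, asserting that a function runs on the expected thread, might look roughly like the sketch below. ThreadChecker and calledOnOwnerThread are invented names, and C++11 std::thread is assumed. The object remembers the thread that constructed it and reports whether the current caller is that same thread.

```cpp
#include <cassert>
#include <thread>

// Hypothetical helper: remembers the thread that created it.
class ThreadChecker
{
public:
    ThreadChecker() : m_owner(std::this_thread::get_id()) {}

    // True only when called from the thread that constructed this object.
    bool calledOnOwnerThread() const
    {
        return std::this_thread::get_id() == m_owner;
    }

private:
    const std::thread::id m_owner;
};
```

A class such as Queue could hold one of these and write assert(m_checker.calledOnOwnerThread()); at the top of each member function that touches m_buffer. If the assertion fires, the vector is being used from more than one thread and needs synchronization.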