Tag Archives: clang

Testing libc++ with -fsanitize=undefined

Soon after I posted my last article, Testing libc++ with Address Sanitizer, I received a tweet from @jurederman, whose profile on Twitter identifies himself as a “Mozilla security bug hunter”, asking “Will you do -fsanitize=undefined next? :)”.

I responded with “Already running the UBSan tests”.

Address Sanitizer (ASan), which I used in the last post, is not the only “sanitizer” that clang offers. There are “Thread Sanitizer” (TSan), “Undefined Behavior Sanitizer” (UBSan), and others. There’s an integer overflow sanitizer, which I believe is called IOC, coming in the 3.3 release of clang. The documentation for UBSan can be found on the LLVM site.

Anyway, I have been looking at the results of running the libc++ test suite with UBSan enabled.

The mechanics

Like ASan, UBSan is a compiler pass and a custom runtime library. You enable this by passing -fsanitize=undefined to the compiler and linker. I ran the libc++ test suite like this:

cd $LLVM/libcxx/test
CC=/path/to/tot/clang OPTIONS="-std=c++11 -stdlib=libc++ -fsanitize=undefined" ./testit

Unfortunately, this failed; because I was working with unreleased compilers and libraries, I needed updated versions of both libc++.dylib and libc++abi.dylib. So I built those from source, and then used DYLD_LIBRARY_PATH to make sure that the test programs used the libraries that I’d just built. (I didn’t want to replace the ones in /usr/lib, because lots of things in the system depend on them.)

cd $LLVM/libcxx/test
DYLD_LIBRARY_PATH=$LLVM/libcxx/lib:$LLVM/libcxxabi/lib CC=/path/to/tot/clang OPTIONS="-std=c++11 -stdlib=libc++ -fsanitize=undefined -L $LLVM/libcxxabi/lib -lc++abi" ./testit

where, as before, “/path/to/tot/clang” is the clang that I just built from source, and $LLVM is where I’ve checked out the various parts of LLVM from Subversion.

The results

And the tests were off and running. In the last article, I noted that these tests take about 30 minutes to run on my MacBook Pro. The ASan tests took about 90 minutes. I was pleasantly surprised when the UBSan tests finished in about 42 minutes, or about 40% slower than the baseline tests.

There were 12 tests (out of more than 4800) that failed under normal circumstances. Using UBSan, 49 tests failed (those 12 plus 37 new ones), and UBSan reported 48,463 runtime errors.

The failing tests

Of the 37 additional tests that failed under UBSan, 34 were aborted because of an “uncaught exception of type XXXX”, where XXXX was an exception class from the standard library (std::out_of_range, for example). This is caused by a mismatch between libc++ and libc++abi, specifically by the fact that both my custom-built libc++ and my custom-built libc++abi contained typeinfo records for some of the standard exception classes. Getting this right and getting all the bits of the test infrastructure to use the right libraries turned into a big mess very quickly, and I still don’t have a good solution here. Hopefully this will be the subject of a future blog post. However, I was able to convince myself that these failures were not the result of a bug in libc++, the test suite, or UBSan.

The other three failures were in the std::thread test suite. When I investigated, it turned out that there was a race condition in some of the thread tests. A race condition? In threading code? Inconceivable! Apparently the runtime environment under UBSan was different enough to trigger the (latent) race condition in these three tests. Looking at the test suite, I found the same race condition in 10 other tests as well. I committed revision 178029 to fix this in all 13 tests.

The error messages

48K errors! I can’t look at 48K error messages; so I decided to bin them.

There were 37,675 messages of the form: 0x000106ae3fff: runtime error: value inf is outside the range of representable values of type 'xxxx'

and 10,693 messages of the form: 0x000101a8f244: runtime error: value nan is outside the range of representable values of type 'xxxx'

Where “xxxx” could be “double” or “float”. The first bin also included “-inf”.

There were 52 messages of the form: what.pass.cpp:24:9: runtime error: member call on address 0x7fff5e8f48d0 which does not point to an object of type 'std::logic_error'

There were 29 messages like this: eval.pass.cpp:180:14: runtime error: division by zero

There were 6 messages like this: /Sources/LLVM/libcxx/include/memory:3163:25 runtime error: load of misaligned address 0x7fff569a85c6 for type 'const unsigned long', which requires 8 byte alignment

There were 5 messages like this: 0x0001037a329e: runtime error: load of value 4294967294, which is not a valid value for type 'std::regex_constants::match_flag_type'

There were 2 messages like this: /Sources/LLVM/libcxx/include/locale:3361:48: runtime error: index 40 out of bounds for type 'char_type [10]'

There was one message like this: runtime error: load of value 64, which is not a valid value for type 'bool'

The first thing that I noticed is that sometimes UBSan will give you a file and line number, and other times just a hex address. The file and line number are incredibly useful for tracking stuff down.

The Analysis

Working from the bottom up:

The load of value 64, which is not a valid value for type 'bool' message came out of one of the atomics tests, where it is trying to clear and set an atomic flag that has been default constructed. I don’t know what the correct behavior is here; still looking at this one.

The index 40 out of bounds for type 'char_type [10]' errors came from the money formatting tests in libc++, and were failing only on “wide string” versions of the tests; i.e., with two (or four) byte characters. The offending line turned out to be:

*__nc = __src[find(__atoms, __atoms+sizeof(__atoms), *__w) - __atoms];

and the problem was that sizeof(__atoms) was assumed to be the same as the number of entries in that array. Perfectly fine for character arrays, not so fine for wide character arrays. Fixed in revision 177694.
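To make the difference concrete, here is a minimal sketch of the byte-count versus element-count distinction. The names element_count and index_of are hypothetical helpers written for this illustration; they are not the code that went into libc++.

```cpp
#include <algorithm>
#include <cstddef>

// Number of elements in a built-in array, independent of the element size.
template <class T, std::size_t N>
constexpr std::size_t element_count(const T (&)[N])
{
    return N;
}

// Hypothetical lookup in the style of the offending line: returns the index
// of c in atoms, or N if c is not present.
template <class CharT, std::size_t N>
std::size_t index_of(const CharT (&atoms)[N], CharT c)
{
    // atoms + sizeof(atoms) would point N * sizeof(CharT) elements past the
    // start, which is beyond the end whenever sizeof(CharT) > 1.
    // atoms + N is exactly one past the last element.
    return static_cast<std::size_t>(std::find(atoms, atoms + N, c) - atoms);
}
```

For a wchar_t array of ten entries, sizeof gives 20 or 40 bytes depending on the platform, while element_count gives 10, which is the bound that std::find actually needs.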

The load of value 4294967294, which is not a valid value for type 'std::regex_constants::match_flag_type' errors turned out to be simple to fix as well, once we decided what the right fix was. This turned out to be complicated, because it involved a close reading of the standards document. The problem was that match_flag_type was an enum, emulating a bitmask. The type also had an operator ~(), which flipped all the bits in the type. But since the type was implemented as an enum, it had an underlying integer type that it was represented as, and the operator ~ just flipped all the bits. This led to values that UBSan didn’t like. A large discussion followed, with sentiments like “does it matter” and “can any code actually tell”, and so on. Eventually, I just changed the operator ~ to only flip the bits that are valid in the enumeration. Fixed in revision 177693.
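The idea of a complement that only flips the valid bits can be sketched like this. The enum below, with its three flags and the flag_mask enumerator, is a made-up example for illustration and is not the definition of match_flag_type in libc++.

```cpp
// Hypothetical three-bit bitmask enum; the names are illustrative only.
enum match_flags {
    flag_none = 0,
    flag_a    = 1 << 0,
    flag_b    = 1 << 1,
    flag_c    = 1 << 2,
    flag_mask = flag_a | flag_b | flag_c   // every valid bit
};

inline constexpr match_flags operator|(match_flags x, match_flags y)
{
    return static_cast<match_flags>(static_cast<int>(x) | static_cast<int>(y));
}

// Complement restricted to the valid bits, so the result always stays
// within the range of values that the enumeration can represent.
inline constexpr match_flags operator~(match_flags x)
{
    return static_cast<match_flags>(~static_cast<int>(x) & static_cast<int>(flag_mask));
}
```

With this definition ~flag_a is flag_b | flag_c rather than an integer with all of the high bits set, which is the kind of value UBSan was complaining about.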

The load of misaligned address 0x7fff569a85c6 for type 'const unsigned long', which requires 8 byte alignment messages were in the hashing code for strings. The misaligned loads are a performance optimization, and I haven’t tried to touch them. Whatever changes are made here will have to be done very carefully, since this will affect the performance of all the associative containers.
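For reference, one common way to read a word from an address that may not be aligned is to copy the bytes with std::memcpy. This is only a sketch of that general technique, with a hypothetical helper name; it is not what the libc++ hashing code does, and whether it would be fast enough there is exactly the open question.

```cpp
#include <cstddef>
#include <cstring>

// Reads sizeof(std::size_t) bytes starting at p, which need not be aligned.
// Copying through memcpy avoids dereferencing a misaligned pointer.
inline std::size_t load_unaligned(const void* p)
{
    std::size_t v;
    std::memcpy(&v, p, sizeof(v));
    return v;
}
```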

The “division by zero” messages were in three different groups of tests. There were 3 of them in the numeric limits tests, and they were there on purpose. There were 2 of them in the complex number tests, and they were also on purpose. The other 24 of them were in the random number test suite, where the tests were generating a bunch of random numbers (using various distributions) and checking to see that the mean, variance, standard deviation, skew, etc., were all what the programmer expected. The problem is in the last measurement: skew. It is some calculated value divided by the variance. If the variance is zero, then the skew should be infinity. Many of the tests in the random number suite are testing “edge cases” of the random number generators, and some of these edge cases will produce a sequence where all the numbers are the same (and thus, the variance == 0). We solved this by commenting out the calculation of the skew for these degenerate cases, and leaving a comment in the test source file. Howard fixed this in revision 177826.
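The degenerate case is easy to reproduce in a small sketch. The code below assumes the usual population skewness (third central moment divided by the variance to the power 3/2), which may differ from the exact formula in the tests; the moments struct and compute_moments function are names invented for this example. Instead of commenting the calculation out, it simply skips the division when the variance is zero.

```cpp
#include <cmath>
#include <vector>

// Hypothetical summary of a sample; skew is meaningful only if skew_defined.
struct moments {
    double mean;
    double variance;
    double skew;
    bool   skew_defined;
};

inline moments compute_moments(const std::vector<double>& xs)
{
    moments m = {0.0, 0.0, 0.0, false};
    if (xs.empty())
        return m;

    const double n = static_cast<double>(xs.size());
    for (double x : xs)
        m.mean += x;
    m.mean /= n;

    double third = 0.0;   // third central moment
    for (double x : xs) {
        const double d = x - m.mean;
        m.variance += d * d;
        third += d * d * d;
    }
    m.variance /= n;
    third /= n;

    // Every value identical means variance == 0; dividing here is the
    // operation that UBSan flagged, so skip it for the degenerate case.
    if (m.variance > 0.0) {
        m.skew = third / (m.variance * std::sqrt(m.variance));
        m.skew_defined = true;
    }
    return m;
}
```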

The runtime error: member call on address 0x7fff5e8f48d0 which does not point to an object of type 'std::logic_error' messages, as it turned out, were due to a bug in UBSan.

I’m just getting started on the inf/-inf/nan messages (about 48K of those). Most of these come from the complex number regression tests. Since this is a test suite for a library that implements a bunch of numeric routines, a lot of the tests actually do generate and use nan/inf, so I expect that many of these will be “false positives”.

Conclusions

This exercise, while not completed, has already turned up a set of bugs in the libc++ test suite, as well as a bug in libc++ and some undefined behavior in libc++. There’s more to look at here, but I think this was a good exercise. There’s kind of a mismatch of expectations here, especially in the complex and numeric test suites, because UBSan is looking for nan/inf/-inf and the libc++ test code is deliberately generating them.

Thanks to Howard Hinnant for his patience and explanations about the C++ standard and libc++ and the libc++ test suite, and to Richard Smith for his help with UBSan and interpreting the C++ standard.

Testing libc++ with Address Sanitizer

I’ve been running the libc++ tests off and on for a while. It’s a quite extensive test suite, but I wondered if there were any bugs that the test suite was not uncovering. In the upcoming clang 3.3, there is a new feature named Address Sanitizer which inserts a bunch of runtime checks into your executable to see if there are any “out of bounds” reads and writes to memory.

In the back of my head, I’ve always thought that it would be nice to be able to say that libc++ was “ASan clean” (i.e., passed all of the test suite when running with Address Sanitizer).

So I decided to do that. [ All of this work was done on Mac OS X 10.8.2/3, btw ]

How to run the tests:

There’s a script for running the tests. It’s called testit.

    $ cd $LLVM/libcxx/test ; ./testit

where $LLVM/libcxx is where libc++ is checked out.

This takes about 30 minutes to run.

Without Address Sanitizer, libc++ fails 12 out of the 4348 tests.

Running the tests with Address Sanitizer

    $ cd $LLVM/libcxx/test ; CC=/path/to/tot/clang++ OPTIONS="-std=c++11 -stdlib=libc++ -fsanitize=address" ./testit

Note: the default options are “-std=c++11 -stdlib=libc++”; that’s what you get if you don’t specify anything.

This takes about 92 minutes; just a bit more than three times as long.

With Address Sanitizer, libc++ fails 54 tests (again, out of 4348).

What are the failures?

  • In 11 tests, Address Sanitizer detected a one-byte write outside a heap block. All of these involve iostreams. I created a small test program that ASan also fires on, and sent it to Howard Hinnant (who wrote most of libc++), and he found a place where he was allocating a zero-byte buffer by mistake. One bug, multiple failures. He fixed this in revision 177452.
  • 2 tests for std::random were failing. This turned out to be an off-by-one error in the test code, not in libc++. I fixed these in revisions 177355 and 177464.
  • Address Sanitizer detected memory allocations failing in 4 cases. This is expected, since some of the tests are testing the memory allocation system of libc++. However, it appears that ASan does not call the user-supplied new_handler when memory allocation fails (and may not throw std::bad_alloc, either). I have filed PR15544 to track this issue.
  • 25 cases are failing where the program is failing to load, due to a missing symbol. This is most commonly std::__1::__get_sp_mut(void const *), but there are a couple others. Howard says that this was added to libc++ after 10.8 shipped, so it’s not in the dylib in /usr/lib. If the tests are run with a copy of libc++ built from source, they pass.
  • There are the 12 cases that were failing before enabling Address Sanitizer.

Once Howard and I fixed the random tests and the bug in the iostreams code, I re-ran the tests using a recently built dylib.

    $ cd $LLVM/libcxx/test ; DYLD_LIBRARY_PATH=$LLVM/libcxx/lib CC=/path/to/tot/clang++ OPTIONS="-std=c++11 -stdlib=libc++ -fsanitize=address" ./testit

This gave us 16 failures:

  • The 4 failures that have to do with memory allocation failures.
  • The 12 failures that we started with.

Conclusion

I’m glad to see that there were so few problems in the libc++ code. It’s a fundamental building block for applications on Mac OS X. And now it’s better than it was when we started this exercise.

However, we did find a couple of bugs in the test suite, and one heap-smashing bug in libc++. We also found a limitation in Address Sanitizer, which I’m hoping the developers will address soon.

C++ and Xcode 4.6

So, you’ve installed Xcode 4.6, and you are a C++ programmer.

You want to use the latest and greatest, so you create a new project, and add your sources to the project, and hit Build, and … guess what? Your code doesn’t build!

What’s up with that?

In Xcode 4.6 (and presumably, later versions), the default C++ compiler is clang, the default language is C++11, and the standard library is libc++.

This is a change from previous versions, where the default was gcc 4.2.1, C++03, and libstdc++.

This is good news

Clang is a much more capable compiler than gcc 4.2.1. It’s also better integrated into Xcode.

C++11 is a major upgrade in functionality from C++03. There have been lots of articles written about the new features, so I won’t belabor them here.

However, with a new language, compiler, and standard library, there are some incompatibilities. I’ll try to run through the common ones, and hopefully you will be up and running quickly.

How can I tell if I’m using libc++?

If you’re writing cross-platform code, sometimes you need to know what standard library you are using. In theory, they should all offer equivalent functionality, but that’s just theory. Sometimes you just need to know. The best way to check for libc++ is to look for the preprocessor symbol _LIBCPP_VERSION. If that’s defined, then you’re using libc++.

    #ifdef  _LIBCPP_VERSION
    //  libc++ specific code here
    #else
    //  generic code here
    #endif

Note that this symbol is only defined after you include any of the libc++ header files. If you need a small header file to include just for this, you can do:

    #include <ciso646>

The header file “ciso646” is required by both the C++03 and C++11 standards, and defined to do nothing.
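Putting the two pieces together, a check might look like the sketch below. The function name standard_library_name is made up for this example, and the __GLIBCXX__ branch is my assumption about how to recognize libstdc++; older libstdc++ releases may not define that macro from <ciso646> alone, in which case the function falls through to “unknown”.

```cpp
#include <ciso646>   // small header; pulls in the library's configuration macros

// Returns a short name for the standard library this file was compiled against.
inline const char* standard_library_name()
{
#if defined(_LIBCPP_VERSION)
    return "libc++";
#elif defined(__GLIBCXX__)   // assumption: libstdc++'s identifying macro
    return "libstdc++";
#else
    return "unknown";
#endif
}
```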

What happened to TR1?

Technical Report #1 (TR1) was a set of library additions to the C++03 standard. Representing the fact that they were not part of the “official” standard, they were placed in the namespace std::tr1.

In C++11, they are officially part of the standard, and live in the namespace std, just like vector and string. The include files no longer live in the “tr1” folder, either.

So, code like this:

    #include <iostream>
    #include <tr1/unordered_map>
    int main()
    {
        std::tr1::unordered_map <int, int> ma;
        std::cout << ma.size () << std::endl;
        return 0;
    }

Needs to be changed to:

    #include <iostream>
    #include <unordered_map>
    int main()
    {
        std::unordered_map <int, int> ma;
        std::cout << ma.size () << std::endl;
        return 0;
    }

It’s probably easiest to just search your code base for references to tr1 and remove them.

Missing identifiers (include what you use)

“My code used to build with Xcode 4.5, and now I’m getting “unknown identifier” errors with stuff in the standard C (or C++) library!”

Library headers may include other library headers. Sometimes, this is required by the standard, sometimes it is done as an “implementation feature” of the library.

To be portable, you should explicitly include the header files that define the routines that you use. That way, you’re not dependent on the internal details of libc++ (or libstdc++).

For example, if you are calling std::malloc (or malloc), you should really #include <cstdlib> (or #include <stdlib.h>) to make sure that it is defined.
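As a small illustration, the sketch below includes a header for every library routine it calls instead of relying on some other header to drag them in. The function duplicate_c_string is a hypothetical helper written for this example.

```cpp
#include <cstdlib>   // std::malloc, std::free
#include <cstring>   // std::memcpy, std::strlen

// Returns a heap copy of a C string, or a null pointer if allocation fails.
// The caller releases the copy with std::free.
inline char* duplicate_c_string(const char* s)
{
    const std::size_t n = std::strlen(s) + 1;   // include the terminator
    char* copy = static_cast<char*>(std::malloc(n));
    if (copy != nullptr)
        std::memcpy(copy, s, n);
    return copy;
}
```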

[ Updated 03-03 ]  In an Xcode-Users mailing list posting, Todd Heberlein writes:

I have my own C++ library. When I link against it in a Cocoa app (with the appropriate files set to .mm), everything works fine.

But when I start a new "Command Line Tool" project and try to link against the library, I get a lot of errors about missing STL symbols.

This is almost certainly because his library was built with gcc/libstdc++, and his new tool with clang/libc++.

More to come.

As I find other differences, I will be adding to this document. If you come across things, please let me know in the comments and I will add them.

Clang and standard libraries on Mac OS X

I’ve seen several people on the boost developers list (and the boost users list) using clang to build their programs. This generally goes pretty well; the diagnostics that clang produces are much better than gcc’s (though gcc 4.7 has made great strides in improving their error messages), but there’s a common problem when people try to turn on c++11 support.

For the first step, they just add -std=c++11 to their compiler options (Xcode configuration, makefile, command-line, whatever) to turn on c++11 language support. This generally works well for their existing code base. Then they start adding things like auto and range-based for loops, and this works great as well.

Then they start to use library features such as std::move or std::forward or #include <chrono> (and so on). And it all comes crashing down.

    ../boost/test/tools/assertion.hpp:386:36: error: no member named 'forward' in namespace 'std'
    return value_expr<T>( std::forward<T>( v ) );

The problem is that the standard library that clang uses is the gcc standard (libstdc++) library that Apple ships (which is based on gcc 4.2).

The advantage of this is that you can “mix and match” your code; compiling some parts with gcc and other parts with clang, and link them all together and they will work.

The disadvantage is that libstdc++ 4.2 predates the c++11 standard; it does not support most of the c++11 features. So, your code that uses std::forward, etc will not compile with this library, even if you turn on c++11 support in clang – this switch only controls what language the compiler will accept.

The second step that you have to do is to add -stdlib=libc++ to your command-line (Xcode settings, makefile, whatever). This tells clang to use libc++ as the standard library.

You have to tell the linker to link against libc++ instead of libstdc++ as well.
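A quick way to check that both switches are in effect is to compile something that needs the C++11 library, not just the C++11 language. The sketch below uses std::forward and <chrono>, two of the features mentioned above; make_string and seconds_in are names invented for this example. It should build with clang++ -std=c++11 -stdlib=libc++, and should fail to find std::forward or <chrono> when compiled against the old libstdc++.

```cpp
#include <chrono>
#include <string>
#include <utility>

// Perfect-forwards its argument into a std::string; needs std::forward.
template <class T>
std::string make_string(T&& v)
{
    return std::string(std::forward<T>(v));
}

// Uses <chrono>, a header that only exists in a C++11 standard library.
inline long long seconds_in(std::chrono::minutes m)
{
    return std::chrono::duration_cast<std::chrono::seconds>(m).count();
}
```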

Here’s how to add a clang-11 toolset to your boost setup. In your “user-config.jam”, put this:

using clang : 11
    : "/usr/bin/clang++"
    : <cxxflags>"-std=c++11 -stdlib=libc++" <linkflags>"-stdlib=libc++"
    ;

Now you can run b2 like this: b2 toolset=clang-11

On the boost list, Julian reminded me that if you’re building your clang yourself (instead of getting it through Xcode), you’ll need to get/install libc++ as well. (MacPorts is good for this.)