Testing libc++ with -fsanitize=undefined

Soon after I posted my last article, Testing libc++ with Address Sanitizer, I received a tweet from @jurederman, whose profile on Twitter identifies himself as a “Mozilla security bug hunter”, asking “Will you do -fsanitize=undefined next? :)”.

I responded with “Already running the UBSan tests”.

Address Sanitizer (ASan), which I used in the last post, is not the only “sanitizer” that clang offers. There are “Thread Sanitizer” (TSan), “Undefined Behavior Sanitizer” (UBSan), and others. There’s an integer overflow sanitizer which I believe is called IOC coming in the 3.3 release of clang. The documenation for UBSan can be found on the LLVM site

Anyway, I have been looking at the results of running the libc++ test suite with UBSan enabled.

The mechanics

Like ASan, UBSan is a compiler pass and a custom runtime library. You enable this by passing -fsanitize=undefined to the compiler and linker. I ran the libc++ test suite like this:

cd $LLVM/libcxx/test
CC=/path/to/tot/clang OPTIONS="--std=c++11 -stdlib=libc++ -fsanitize=undefined" ./testit

Unfortunately, this failed; working with unreleased compilers and libraries, I needed updated versions of both libc++.dylib and libc++abi.dylib. So I built those from sources, and then used DYLD_LIBRARY_PATH to make sure that the test program used the libraries that I’d just built. (I didn’t want to replace the ones in /usr/lib, because lots of things in the system depend on them)

cd $LLVM/libcxx/test
DYLD_LIBRARY_PATH=$LLVM/libcxx/lib:$LLVM/libcxxabi/lib CC=/path/to/tot/clang OPTIONS="-std=c++11 -stdlib=libc++ -fsanitize=undefined -L $LLVM/libcxxabi/lib -lc++abi" ./testit

where, as before “/path/to/tot/clang” is the clang that I just built from source, and $LLVM is where I’ve checked out the various parts of LLVM from Subversion.

The results

And the tests were off and running. In the last article, I noted that these tests take about 30 minutes to run on my MacBook Pro. The ASan tests took about 90 minutes. I was pleasantly surprised when the UBSan tests finished in about 42 minutes, or about 40% slower than the baseline tests.

There were 12 tests (out of more than 4800) that failed under normal circumstances. Using UBSan, 49 tests failed, and there were about 48,463 different runtime errors reported by UBSan.

The failing tests

Of the 37 tests that failed under UBSan, 34 of them were aborted because of uncaught exception of type XXXX, where XXX was from the standard library (std::out_of_range, for example). This is caused by a mismatch between libc++ and libc++abi, specifically by the fact that both my custom-built libc++ and my custom-built libc++abi contained typeinfo records for some of the standard exception classes. Getting this right and getting all the bits of the test infrastructure to use the right libraries turned into a big mess very quickly, and I still don’t have a good solution here. Hopefully this will be the subject of a future blog post. However, I was able to convince myself that these failures were not the result of a bug in either libc++, the test suite or UBSan.

The other three failures were in the std::thread test suite. When I investigated, it turned out that there was a race condition in some of the thread tests. A race condition? In threading code? Inconceivable! Apparently the runtime environment under UBSan was different enough to trigger the (latent) race condition in these three tests. Looking at the test suite, I found the same race condition in 10 other tests as well. I committed revision 178029 to fix this in all 13 tests.

The error messages

48K errors! I can’t look at 48K error messages; so I decided to bin them.

There were 37,675 messages of the form: 0x000106ae3fff: runtime error: value inf is outside the range of representable values of type 'xxxx'

and 10,693 messages of the form: 0x000101a8f244: runtime error: value nan is outside the range of representable values of type 'xxxx'

Where “xxxx” could be “double” or “float”. Also, the first bin also included “-inf” as well.

There were 52 messages of the form: what.pass.cpp:24:9: runtime error: member call on address 0x7fff5e8f48d0 which does not point to an object of type 'std::logic_error'

There were 29 messages like this: eval.pass.cpp:180:14: runtime error: division by zero

There were 6 messages like this: /Sources/LLVM/libcxx/include/memory:3163:25 runtime error: load of misaligned address 0x7fff569a85c6 for type 'const unsigned long', which requires 8 byte alignment

There were 5 messages like this: 0x0001037a329e: runtime error: load of value 4294967294, which is not a valid value for type 'std::regex_constants::match_flag_type'

There were 2 messages like this: /Sources/LLVM/libcxx/include/locale:3361:48: runtime error: index 40 out of bounds for type 'char_type [10]'

There was one message like this: runtime error: load of value 64, which is not a valid value for type 'bool'

The first thing that I noticed is that sometimes UBSan will give you file and line number, and otherwise just a hex address. The file and line number is incredibly useful for tracking stuff down.

The Analysis

Working from the bottom up:

The load of value 64, which is not a valid value for type 'bool' message came out of one of the atomics tests, where it is trying to clear and set an atomic flag that has been default constructed. I don’t know what the correct behavior is here; still looking at this one.

The index 40 out of bounds for type 'char_type [10]' errors came from the money formatting tests in libc++, and were failing only on “wide string” versions of the tests; i.e, with two (or four) byte characters. The offending line turned out to be:

*__nc = __src[find(__atoms, __atoms+sizeof(__atoms), *__w) - __atoms];

and the problem was that sizeof(__atoms) was assumed to be the same as the number of entries in that array. Perfectly fine for character arrays, not so fine for wide character arrays. Fixed in revision 177694.

The load of value 4294967294, which is not a valid value for type 'std::regex_constants::match_flag_type' errors turned out to be simple to fix as well, once we decided what the right fix was. This turned out to be complicated, because it involved a close reading of the standards document. The problem was that match_flag_type was an enum, emulating a bitmask. The type also had an operator ~(), which flipped all the bits in the type. But since the type was implemented as an enum, it had an underlying integer type that it was represented as, and the operator ~ just flipped all the bits. This led to values that UBSan didn’t like. A large discussion followed, with sentiments like “does it matter” and “can any code actually tell”, and so on. Eventually, I just changed the operator ~ to only flip the bits that are valid in the enumeration. Fixed in revision 177693.

The load of misaligned address 0x7fff569a85c6 for type 'const unsigned long', which requires 8 byte alignment were in the hashing code for strings. They are a performance optimization, and I haven’t tried to touch them. Whatever changes are made here will have to be done very carefully, since this will affect the performance of all the associative containers.

The “division by zero” messages were in three different tests. There were 3 of them in the numeric limits tests, and they were there on purpose. There were 2 of them in the complex number tests, and they were also on purpose. The other 24 of them were in the random number test suite, where the tests were generating a bunch of random numbers (using various distributions) and checking to see that the mean, variance, standard deviation, skew, etc, were all what the programmer expected. The problem is in the last measurement: skew. It is some calculated value divided by the variance. If the variance is zero, then the skew should be infinity. Many of the tests in the random number suite are testing “edge cases” of the random number generators, and some of these edge cases will produce a sequence where all the numbers are the same (and thus, the variance == 0). We solved this by commenting out the calculation of the skew for these degenerate cases, and leaving a comment in the test source file. Howard fixed this in revision 177826.

The runtime error: member call on address 0x7fff5e8f48d0 which does not point to an object of type 'std::logic_error' messages, as it turned out, were due to a bug in UBSan.

I’m just getting started on the inf/-inf/nan messages (about 48K of those). Most of these come from the complex number regression tests. Since this is a test suite for a library that implements a bunch of numeric routines, a lot of the tests actually do generate and use nan/inf, so I expect that many of these will be “false positives”.

Conclusions

This exercise, while not completed, has already turned up a set of bugs in the libc++ test suite, as well as a bug in libc++ and some undefined behavior in libc++. There’s more to look at here, but I think this was a good exercise. There’s kind of a mismatch of expectations here, especially in the complex and numeric test suites, because UBSan is looking for nan/inf/-inf and the libc++ test code is deliberately generating them.

Thanks to Howard Hinnant for his patience and explanations about the C++ standard and libc++ and the libc++ test suite, and to Richard Smith for his help with UBSan and interpreting the C++ standard.

Richard Smith March 26, 2013 at 9:02 pm

Hi Jesse,

Regarding those “false positives”:

The C++ standard’s treatment of Inf and NaN values is highly underspecified, so for the most part it’s not clear what has defined behavior and what does not. In particular:

* Can inf and nan be converted from float to double? The C and C++ standards requires that this be allowed for all values of type ‘float’, but (for instance) C also says “Certain object representations need not represent a value of the object type”, so ‘float’ need not have a NaN value, even if there is a bit representation for one. The standards also require that the set of values of type float is a subset of the set of values of type double, but that’s clearly problematic if NaNs with different payloads are considered different (since double has more payload bits).

* Can inf and nan be converted from double to float? The C and C++ standards say that this has undefined behavior of the source value is not in the range of representable values of the target… which might include infinity (in which case inf is OK) or might not. It’s hard to see how NaN could be part of any range, since it’s not ordered with respect to other floating-point values. So presumably that means that converting NaN from double to float has undefined behavior — which seems unlikely to be the intent.

* Can double values > FLT_MAX be converted to float? Again, this is unclear, and depends on whether infinity is in the range of representable values of float.

* Can integer values > FLT_MAX be converted to float? Note that the largest value of type unsigned __int128 does not fit in a float, and the largest value of type unsigned short does not fit in an __fp16.

C11’s Annex F gives us some clues as to how IEEE-754 and C-derived languages are supposed to interact (but is not mandatory in any of the C or C++ language standards for compilers such as Clang and GCC which choose not to define __STDC_IEC_559__). In particular, it says: “Since negative and positive infinity are representable in IEC 60559 formats, all real numbers lie within the range of representable values.” That implies that the finite and infinite cases are all OK, but the NaN double-to-float conversion may still not be (oops).

Anyway… I’m updating UBSan to suppress the diagnostics for conversions of ‘Inf’ and ‘NaN’ between floating-point types, and will probably split out a separate flag for finite overflow in conversions to floating-point types, so that users can turn it off as needed. I think that’s the right compromise for the time being.

Reply ↓

4 thoughts on “Testing libc++ with -fsanitize=undefined”

Jesse Ruderman March 26, 2013 at 6:14 pm

Interesting that UBSan caught an out-of-bounds error that ASan missed!

I’m curious about the “false positives” involving NaN and Inf. Is it undefined behavior to cast those values among float types (rather than to integer types)?

It would be nice if you’d link to the changesets, discussions, and bug reports.

Reply ↓
1. Richard Smith March 26, 2013 at 9:02 pm
  
  Hi Jesse,
  
  Regarding those “false positives”:
  
  The C++ standard’s treatment of Inf and NaN values is highly underspecified, so for the most part it’s not clear what has defined behavior and what does not. In particular:
  
  * Can inf and nan be converted from float to double? The C and C++ standards requires that this be allowed for all values of type ‘float’, but (for instance) C also says “Certain object representations need not represent a value of the object type”, so ‘float’ need not have a NaN value, even if there is a bit representation for one. The standards also require that the set of values of type float is a subset of the set of values of type double, but that’s clearly problematic if NaNs with different payloads are considered different (since double has more payload bits).
  
  * Can inf and nan be converted from double to float? The C and C++ standards say that this has undefined behavior of the source value is not in the range of representable values of the target… which might include infinity (in which case inf is OK) or might not. It’s hard to see how NaN could be part of any range, since it’s not ordered with respect to other floating-point values. So presumably that means that converting NaN from double to float has undefined behavior — which seems unlikely to be the intent.
  
  * Can double values > FLT_MAX be converted to float? Again, this is unclear, and depends on whether infinity is in the range of representable values of float.
  
  * Can integer values > FLT_MAX be converted to float? Note that the largest value of type unsigned __int128 does not fit in a float, and the largest value of type unsigned short does not fit in an __fp16.
  
  C11’s Annex F gives us some clues as to how IEEE-754 and C-derived languages are supposed to interact (but is not mandatory in any of the C or C++ language standards for compilers such as Clang and GCC which choose not to define __STDC_IEC_559__). In particular, it says: “Since negative and positive infinity are representable in IEC 60559 formats, all real numbers lie within the range of representable values.” That implies that the finite and infinite cases are all OK, but the NaN double-to-float conversion may still not be (oops).
  
  Anyway… I’m updating UBSan to suppress the diagnostics for conversions of ‘Inf’ and ‘NaN’ between floating-point types, and will probably split out a separate flag for finite overflow in conversions to floating-point types, so that users can turn it off as needed. I think that’s the right compromise for the time being.
  
  Reply ↓
Pingback: Undefined, implementation defined and unspecified behaviors in C++ – Native Coding
Markus February 5, 2018 at 6:24 pm

Hi,

did you ever happen to find a solution for the ABI problem described under “The failing tests”?

Thanks
Markus

Reply ↓

Marshall's C++ Musings

Testing libc++ with -fsanitize=undefined

The mechanics

The results

The failing tests

The error messages

The Analysis

Conclusions

4 thoughts on “Testing libc++ with -fsanitize=undefined”

Leave a comment Cancel reply

The mechanics

The results

The failing tests

The error messages

The Analysis

Conclusions

Share this:

Related

4 thoughts on “Testing libc++ with -fsanitize=undefined”

Leave a comment Cancel reply