Monthly Archives: February 2013

Fixing an interface bug

The problem

In Boost 1.50, I introduced the Boost.Algorithm library.

In it, I included a pair of variations of std::copy, called copy_while and copy_until. I’m pretty sure that you can figure out what they do from the names. In fact, I mentioned boost::copy_while in an answer to a question on Stack Overflow.

Here’s the implementation of boost::copy_while:

template<typename InputIterator, typename OutputIterator, typename Predicate> 
OutputIterator copy_while ( InputIterator first, InputIterator last, 
OutputIterator result, Predicate p )
{
    for ( ; first != last && p(*first); ++first )
        *result++ = *first;
    return result;
}

This matches the interface of std::copy_if and std::copy_n in the C++11 standard: the only thing returned is the output iterator.

However, Sean Parent has convinced me that I got the interface wrong. He says, and I (now) agree, that I have to return the modified input iterator as well. The problem is that if you use copy_while to process part of a sequence, how do you know where to pick up the processing from? If you are using real, actual input iterators, then you’re fine, because then you can just pick up with the input iterator that you passed to copy_while. But if you are processing a std::list<T>, for example, then you probably want to know which was the first element that was not copied.

I know, this all seems obvious – now.

But I didn’t think of it at the time. I should have. I remember going to a talk by Alex Stepanov several years ago, and he said that when writing the STL, one of his guiding principles was “Don’t throw away information”. Here, we are throwing away the position in the source list.
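To make the lost information concrete, here is a minimal sketch. The names copy_while_v1 (the original interface, renamed), is_small, and demo_lost_position are made up for illustration. With a std::list, the only way to find out where the copy stopped is to walk the source a second time.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// The original interface, renamed for this sketch: only the output
// iterator comes back, so the stopping position in the source is lost.
template<typename InputIterator, typename OutputIterator, typename Predicate>
OutputIterator copy_while_v1 ( InputIterator first, InputIterator last,
                               OutputIterator result, Predicate p )
{
    for ( ; first != last && p(*first); ++first )
        *result++ = *first;
    return result;
}

// Hypothetical predicate used throughout the example.
inline bool is_small ( int x ) { return x < 5; }

inline void demo_lost_position ()
{
    const int values[] = { 1, 2, 3, 10, 4 };
    std::list<int> src ( values, values + 5 );
    std::vector<int> out;

    copy_while_v1 ( src.begin (), src.end (), std::back_inserter ( out ), is_small );
    assert ( out.size () == 3 );    // copied 1, 2, 3

    // The algorithm knew where it stopped, but that iterator is gone.
    // To resume, the caller has to scan the list again.
    std::list<int>::iterator resume =
        std::find_if_not ( src.begin (), src.end (), is_small );
    assert ( resume != src.end () && *resume == 10 );
}
```

The second scan costs another pass over the list and another round of predicate calls, which is exactly the work the algorithm had already done.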

Not that it makes me feel better, but the C++ standards committee made the same mistake with std::copy_n.

So, how do we fix it?

There’s the “don’t break existing code” school of thought:

template<typename InputIterator, typename OutputIterator, typename Predicate> 
OutputIterator copy_while ( InputIterator first, InputIterator last, 
OutputIterator result, Predicate p, InputIterator *pos = NULL )
{
    for ( ; first != last && p(*first); ++first )
        *result++ = *first;
    if ( pos != NULL )
        *pos = first;
    return result;
}

Advantages:

  • Calling code does not have to change
  • Callers who want the position from the input list can get it by passing the address of an iterator as the optional last argument

Disadvantages:

  • It’s ugly. The code used to be written in a functional style; all the parameters are input-only, and all the results are in the function return. That’s a powerful argument for me, and the fact that this function is all about side-effects (modifying the things that the iterators ‘point to’) only diminishes the argument somewhat.
  • It reduces our options in the future. A default argument has to be at the end of the parameter list.
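As a rough illustration of what call sites would look like under this option, here is a sketch. The name copy_while_pos and the demo function are hypothetical; the body is the version shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Option 1, renamed for this sketch: the stopping position is reported
// through an optional out-parameter.
template<typename InputIterator, typename OutputIterator, typename Predicate>
OutputIterator copy_while_pos ( InputIterator first, InputIterator last,
                                OutputIterator result, Predicate p,
                                InputIterator *pos = NULL )
{
    for ( ; first != last && p(*first); ++first )
        *result++ = *first;
    if ( pos != NULL )
        *pos = first;
    return result;
}

inline bool is_small ( int x ) { return x < 5; }

inline void demo_out_parameter ()
{
    const int values[] = { 1, 2, 3, 10, 4 };
    std::list<int> src ( values, values + 5 );
    std::vector<int> out;

    // Existing call sites keep compiling unchanged.
    copy_while_pos ( src.begin (), src.end (), std::back_inserter ( out ), is_small );
    assert ( out.size () == 3 );

    // New call sites opt in by passing the address of an iterator.
    std::list<int>::iterator stopped;
    out.clear ();
    copy_while_pos ( src.begin (), src.end (), std::back_inserter ( out ), is_small, &stopped );
    assert ( out.size () == 3 );
    assert ( *stopped == 10 );
}
```

Note that the caller has to declare an iterator before the call just to receive the result, which is part of what makes this style feel clumsy.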

How about we just “Make it right”, and screw the callers?

template<typename InputIterator, typename OutputIterator, typename Predicate> 
std::pair<InputIterator, OutputIterator>
copy_while ( InputIterator first, InputIterator last, 
OutputIterator result, Predicate p )
{
    for ( ; first != last && p(*first); ++first )
        *result++ = *first;
    return std::make_pair(first, result);
}

Advantages:

  • Matches the “style” of the STL. Parameters are inputs, results are outputs.
  • No hidden default arguments.
  • The change is visible at compile time; there will be no “silent failures”.

Disadvantages:

  • Calling code may have to change
    • There are two scenarios to consider.
    • (void) copy_while (first, last, out, pred);
    • out2 = copy_while (first, last, out, pred);

In the first case, nothing has to change. The return value is not being used, so changing its type means nothing. In the second case, however, the calling site must be modified. Fortunately, the change is simple:

 out2 = copy_while (first, last, out, pred).second;

or:

 std::tie(std::ignore,out2) = copy_while (first, last, out, pred);
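Here is a sketch of what the pair-returning interface buys you: processing a std::list in two passes without re-scanning. The names copy_while_pair, is_small, is_large, and demo_resume are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <tuple>
#include <utility>
#include <vector>

// Option 2, renamed for this sketch: both iterators are returned.
template<typename InputIterator, typename OutputIterator, typename Predicate>
std::pair<InputIterator, OutputIterator>
copy_while_pair ( InputIterator first, InputIterator last,
                  OutputIterator result, Predicate p )
{
    for ( ; first != last && p(*first); ++first )
        *result++ = *first;
    return std::make_pair ( first, result );
}

inline bool is_small ( int x ) { return x < 5; }
inline bool is_large ( int x ) { return x >= 5; }

inline void demo_resume ()
{
    const int values[] = { 1, 2, 3, 10, 20, 4 };
    std::list<int> src ( values, values + 6 );
    std::vector<int> small, large;

    // First pass: copy the leading small values and keep the stopping point.
    std::list<int>::iterator pos =
        copy_while_pair ( src.begin (), src.end (), std::back_inserter ( small ), is_small ).first;
    assert ( small.size () == 3 && *pos == 10 );

    // Second pass: pick up from that position; no second scan of the list.
    pos = copy_while_pair ( pos, src.end (), std::back_inserter ( large ), is_large ).first;
    assert ( large.size () == 2 && *pos == 4 );

    // Callers that only want the output iterator can still get it.
    std::vector<int> dst ( 6 );
    std::vector<int>::iterator out_end;
    std::tie ( std::ignore, out_end ) =
        copy_while_pair ( src.begin (), src.end (), dst.begin (), is_small );
    assert ( out_end - dst.begin () == 3 );
}
```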

Can we “Make it right” and not screw the callers?

On the boost mailing list, Sebastian Redl suggested using what he called a “biased_pair” type, which works like a std::pair (and is convertible to a std::pair), but also includes an implicit conversion (chosen at compile time) to either the first element of the pair or the second. For copy_while, the conversion would be to OutputIterator (the second element).

[ Added 03-03: Motti Lanzkron notes in the comments that this won’t work with auto anyway. ]
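To give a feel for the idea, here is a rough sketch of what such a type might look like. This is not Sebastian’s code; second_biased_pair, copy_while_biased, and the demo function are names invented for this example, and a real version would need much more of std::pair’s interface.

```cpp
#include <cassert>
#include <list>
#include <utility>
#include <vector>

// Hypothetical sketch: a pair that also converts implicitly to its
// second member, so old call sites keep working.
template<typename First, typename Second>
struct second_biased_pair
{
    First  first;
    Second second;

    second_biased_pair ( const First &f, const Second &s )
        : first ( f ), second ( s ) {}

    // The "bias": implicit conversion to the second element.
    operator Second () const { return second; }

    // Conversion to a plain std::pair for callers that want both.
    operator std::pair<First, Second> () const
    { return std::pair<First, Second> ( first, second ); }
};

template<typename InputIterator, typename OutputIterator, typename Predicate>
second_biased_pair<InputIterator, OutputIterator>
copy_while_biased ( InputIterator first, InputIterator last,
                    OutputIterator result, Predicate p )
{
    for ( ; first != last && p(*first); ++first )
        *result++ = *first;
    return second_biased_pair<InputIterator, OutputIterator> ( first, result );
}

inline bool is_small ( int x ) { return x < 5; }

inline void demo_biased ()
{
    const int values[] = { 1, 2, 3, 10, 4 };
    std::list<int> src ( values, values + 5 );
    std::vector<int> dst ( 5 );

    // Old-style call site: the result converts to the output iterator.
    std::vector<int>::iterator out_end =
        copy_while_biased ( src.begin (), src.end (), dst.begin (), is_small );
    assert ( out_end - dst.begin () == 3 );

    // New-style call site: both positions are available.
    std::pair<std::list<int>::iterator, std::vector<int>::iterator> both =
        copy_while_biased ( src.begin (), src.end (), dst.begin (), is_small );
    assert ( *both.first == 10 );

    // With auto, no conversion happens: r is the biased pair itself,
    // not the output iterator that older code would have expected.
    auto r = copy_while_biased ( src.begin (), src.end (), dst.begin (), is_small );
    assert ( *r.first == 10 && r.second - dst.begin () == 3 );
}
```

The auto case is the one raised in the comments: a caller who wrote auto for the old return value would silently get a different type.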

Advantages:

  • Calling code does not have to change
  • The code is still written in a style that appeals to me; there’s no mixing of inputs and outputs in the parameter list.

Disadvantages:

  • That’s a lot of infrastructure to build. std::pair is a distressingly complicated class. If Sebastian’s “biased_pair” were already in boost, I probably would have used it; but I didn’t want to write two classes and a bunch of boilerplate code. The cost, from my point of view, outweighed the benefit (not having to change calling code).

Conclusion

After some discussion on the boost list, I went with option #2 (break the callers). We’ll see how this works out.

C++ and Xcode 4.6

So, you’ve installed Xcode 4.6, and you are a C++ programmer.

You want to use the latest and greatest, so you create a new project, and add your sources to the project, and hit Build, and … guess what? Your code doesn’t build!

What’s up with that?

In Xcode 4.6 (and presumably, later versions), the default C++ compiler is clang, the default language is C++11, and the standard library is libc++.

This is a change from previous versions, where the default was gcc 4.2.1, C++03, and libstdc++.

This is good news

Clang is a much more capable compiler than gcc 4.2.1. It’s also better integrated into Xcode.

C++11 is a major upgrade in functionality from C++03. There have been lots of articles written about the new features, so I won’t belabor them here.

However, with a new language, compiler, and standard library, there are some incompatibilities. I’ll try to run through the common ones, and hopefully you will be up and running quickly.

How can I tell if I’m using libc++?

If you’re writing cross-platform code, sometimes you need to know what standard library you are using. In theory, they should all offer equivalent functionality, but that’s just theory. Sometimes you just need to know. The best way to check for libc++ is to look for the preprocessor symbol _LIBCPP_VERSION. If that’s defined, then you’re using libc++.

    #ifdef  _LIBCPP_VERSION
    //  libc++ specific code here
    #else
    //  generic code here
    #endif

Note that this symbol is only defined after you include any of the libc++ header files. If you need a small header file to include just for this, you can do:

    #include <ciso646>

The header file “ciso646” is required by both the C++03 and C++11 standards, and defined to do nothing.
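Putting the two pieces together, a small sketch might look like this. The function name standard_library_name is made up, and the __GLIBCXX__ check for libstdc++ is an addition that goes beyond what is described above. Since any standard header triggers the definition, this sketch uses <cstddef>.

```cpp
#include <cassert>
#include <cstddef>   // any standard library header defines the macro

// Returns a short name for the standard library this file was built with.
inline const char *standard_library_name ()
{
#if defined(_LIBCPP_VERSION)
    return "libc++";
#elif defined(__GLIBCXX__)
    return "libstdc++";     // extra check, not part of the text above
#else
    return "unknown";
#endif
}

inline void demo_library_name ()
{
    const char *name = standard_library_name ();
    assert ( name != NULL && name[0] != '\0' );
}
```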

What happened to TR1?

Technical Report #1 (TR1) was a set of library additions to the C++03 standard. Representing the fact that they were not part of the “official” standard, they were placed in the namespace std::tr1.

In C++11, they are officially part of the standard, and live in the namespace std, just like vector and string. The include files no longer live in the “tr1” folder, either.

So, code like this:

    #include <iostream>
    #include <tr1/unordered_map>
    int main()
    {
        std::tr1::unordered_map <int, int> ma;
        std::cout << ma.size () << std::endl;
        return 0;   
    }

Needs to be changed to:

    #include <iostream>
    #include <unordered_map>
    int main()
    {
        std::unordered_map <int, int> ma;
        std::cout << ma.size () << std::endl;
        return 0;   
    }

It’s probably easiest to just search your code base for references to tr1 and remove them.

Missing identifiers (include what you use)

“My code used to build with Xcode 4.5, and now I’m getting “unknown identifier” errors with stuff in the standard C (or C++) library!”

Library headers may include other library headers. Sometimes this is required by the standard; sometimes it is done as an “implementation feature” of the library, and a different library (or a different version of the same one) may not do it.

To be portable, you should explicitly include the header files that define the routines that you use. That way, you’re not dependent on the internal details of libc++ (or libstdc++).

For example, if you are calling std::malloc (or malloc), you should really #include <cstdlib> (or #include <stdlib.h>) to make sure that it is defined.
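A minimal sketch of what that looks like in practice follows; make_zeroed_buffer and demo_buffer are hypothetical names. Every library routine used here has its header included explicitly, so the code does not care which headers the standard library happens to pull in on its own.

```cpp
#include <cassert>   // assert
#include <cstdlib>   // std::malloc, std::free, std::size_t
#include <cstring>   // std::memset

// Allocates a zero-filled buffer; returns NULL if the allocation fails.
inline unsigned char *make_zeroed_buffer ( std::size_t size )
{
    void *p = std::malloc ( size );
    if ( p != NULL )
        std::memset ( p, 0, size );
    return static_cast<unsigned char *> ( p );
}

inline void demo_buffer ()
{
    unsigned char *buf = make_zeroed_buffer ( 16 );
    assert ( buf != NULL );
    assert ( buf[0] == 0 && buf[15] == 0 );
    std::free ( buf );
}
```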

[ Updated 03-03 ]  In an Xcode-Users mailing list posting, Todd Heberlein writes:

I have my own C++ library. When I link against it in a Cocoa app (with the appropriate files set to .mm), everything works fine.

But when I start a new "Command Line Tool" project and try to link against the library, I get a lot of errors about missing STL symbols.

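This is almost certainly because his library was built with gcc/libstdc++, and his new tool with clang/libc++. The two standard libraries are not binary compatible, so both sides need to be built against the same one.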

More to come.

As I find other differences, I will be adding to this document. If you come across things, please let me know in the comments and I will add them.