Marshall's C++ Musings

Fixing an interface bug

Advertisements

The problem

In Boost 1.50, I introduced the Boost.Algorithm library.

In it, I included a pair of variations of std::copy, called copy_while and copy_until. I’m pretty sure that you can figure out what they do from the names. In fact, I mentioned boost::copy_while in as an answer to a question on Stack Overflow

Here’s the implementation of boost::copy_while:

template<typename InputIterator, typename OutputIterator, typename Predicate> 
OutputIterator copy_while ( InputIterator first, InputIterator last, 
OutputIterator result, Predicate p )
{
    for ( ; first != last && p(*first); ++first )
        *result++ = *first;
    return result;
}

This matches the definition of std::copy_if and std::copy_n in the C++11 standard.

However, Sean Parent has convinced me that I got the interface wrong. He says, and I (now) agree that I have to return the modified input iterator as well. The problem is that if you use copy_while to process part of a sequence, how do you know where to pick up the processing from? If you are using real, actual input iterators, then you’re fine, because then you can just pick up with the input iterator that you passed to copy_while. But if you are processing a std::list<T>, for example, then probably want to know which was the first element that was not copied.

I know, this all seems obvious – now.

But I didn’t think of it at the time. I should have. I remember going to a talk by Alex Stepanov several years ago, and he said that when writing the STL, one of his guiding principles was “Don’t throw away information”. Here, we are throwing away the position in the source list.

Not that it makes me feel better, but the C++ standards committee made the same mistake with std::copy_n.

So, how do we fix it?

There’s the “don’t break existing code” school of thought

template<typename InputIterator, typename OutputIterator, typename Predicate> 
OutputIterator copy_while ( InputIterator first, InputIterator last, 
OutputIterator result, Predicate p, InputIterator *pos = NULL )
{
    for ( ; first != last && p(*first); ++first )
        *result++ = *first;
    if ( pos != NULL )
        *pos = first;
    return result;
}

Advantages:

Disadvantages:

How about we just “Make it right”, and screw the callers?

template<typename InputIterator, typename OutputIterator, typename Predicate> 
std::pair<InputIterator, OutputIterator>
copy_while ( InputIterator first, InputIterator last, 
OutputIterator result, Predicate p )
{
    for ( ; first != last && p(*first); ++first )
    *result++ = *first;
    return std::make_pair(first, result);
}

Advantages:

Disadvantages:

In the first case, nothing has to change. The return value is not being used, so changing its type means nothing. In the second case, however, the calling site must be modified. Fortunately, the change is simple:

 out2 = copy_while (first, last, out, pred).second;

or:

 std::tie(std::ignore,out2) = copy_while (first, last, out, pred);

Can we “Make it right” and don’t screw the callers?

On the boost mailing list, Sebastian Redl suggested using what he called a “biased_pair” type, which works like a std::pair (and is convertible to a std::pair), but also includes an implicit conversion (chosen at compile time) to either the first element of the pair or the second. For copy_while, the conversion would be to OutputIterator (the second element).

[ Added 03-03: Motti Lanzkron notes in the comments that this won’t work with auto anyway. ]

Advantages:

Disadvantages:

Conclusion

After some discussion on the boost list, I went with option #2 (break the callers). We’ll see how this works out.

Advertisements