Fixing an interface bug

The problem

In Boost 1.50, I introduced the Boost.Algorithm library.

In it, I included a pair of variations of std::copy, called copy_while and copy_until. I’m pretty sure that you can figure out what they do from the names. In fact, I mentioned boost::copy_while in as an answer to a question on Stack Overflow

Here’s the implementation of boost::copy_while:

template<typename InputIterator, typename OutputIterator, typename Predicate> 
OutputIterator copy_while ( InputIterator first, InputIterator last, 
OutputIterator result, Predicate p )
{
    for ( ; first != last && p(*first); ++first )
        *result++ = *first;
    return result;
}

This matches the definition of std::copy_if and std::copy_n in the C++11 standard.

However, Sean Parent has convinced me that I got the interface wrong. He says, and I (now) agree that I have to return the modified input iterator as well. The problem is that if you use copy_while to process part of a sequence, how do you know where to pick up the processing from? If you are using real, actual input iterators, then you’re fine, because then you can just pick up with the input iterator that you passed to copy_while. But if you are processing a std::list<T>, for example, then probably want to know which was the first element that was not copied.

I know, this all seems obvious – now.

But I didn’t think of it at the time. I should have. I remember going to a talk by Alex Stepanov several years ago, and he said that when writing the STL, one of his guiding principles was “Don’t throw away information”. Here, we are throwing away the position in the source list.

Not that it makes me feel better, but the C++ standards committee made the same mistake with std::copy_n.

So, how do we fix it?

There’s the “don’t break existing code” school of thought

template<typename InputIterator, typename OutputIterator, typename Predicate> 
OutputIterator copy_while ( InputIterator first, InputIterator last, 
OutputIterator result, Predicate p, InputIterator *pos = NULL )
{
    for ( ; first != last && p(*first); ++first )
        *result++ = *first;
    if ( pos != NULL )
        *pos = first;
    return result;
}

Advantages:

  • Calling code does not have to change
  • Callers who want the position from the input list can get it by passing the address of an iterator (in the second case)

Disadvantages:

  • It’s ugly. The code used to be written in a functional style; all the parameters are input-only, and all the results are in the function return. That’s a powerful argument for me, and the fact that this function is all about side-effects (modifying the things that the iterators ‘point to’) only diminishes the argument somewhat.
  • It reduces our options in the future. A default argument has to be at the end of the parameter list.

How about we just “Make it right”, and screw the callers?

template<typename InputIterator, typename OutputIterator, typename Predicate> 
std::pair<InputIterator, OutputIterator>
copy_while ( InputIterator first, InputIterator last, 
OutputIterator result, Predicate p )
{
    for ( ; first != last && p(*first); ++first )
    *result++ = *first;
    return std::make_pair(first, result);
}

Advantages:

  • Matches the “style” of the STL. Parameters are inputs, results are outputs.
  • No hidden default arguments.
  • The change is visible at compile time; there will be no “silent failures”.

Disadvantages:

  • Calling code may have to change
    • There are two scenarios to consider.
    • (void) copy_while (first, last, out, pred);
    • out2 = copy_while (first, last, out, pred);

In the first case, nothing has to change. The return value is not being used, so changing its type means nothing. In the second case, however, the calling site must be modified. Fortunately, the change is simple:

 out2 = copy_while (first, last, out, pred).second;

or:

 std::tie(std::ignore,out2) = copy_while (first, last, out, pred);

Can we “Make it right” and don’t screw the callers?

On the boost mailing list, Sebastian Redl suggested using what he called a “biased_pair” type, which works like a std::pair (and is convertible to a std::pair), but also includes an implicit conversion (chosen at compile time) to either the first element of the pair or the second. For copy_while, the conversion would be to OutputIterator (the second element).

[ Added 03-03: Motti Lanzkron notes in the comments that this won't work with auto anyway. ]

Advantages:

  • Calling code does not have to change
  • The code is still written in a style that appeals to me; there’s no mixing of inputs and outputs in the parameter list.

Disadvantages:

  • That’s a lot of infrastructure to build. std::pair is a distressingly complicated class. If Sebastian’s “biased_pair” was already in boost, I probably would have used it; but I didn’t want to write two classes and a bunch of boilerplate code. The cost, from my point of view, outweighed the benefit (not having to change calling code)

Conclusion

After some discussion on the boost list, I went with option #2 (break the callers). We’ll see how this works out.

About these ads

7 thoughts on “Fixing an interface bug

    1. Marshall Clow Post author

      Have you looked at std::pair in C++11? Eight constructors, six non-member comparison operators, swap. I suspect that all of those would have to be replicated to inherit. (and the specializations of std::get, too)

      And no virtual destructor.

      Reply
  1. Motti Lanzkron

    Even if you use `biased_pair` you may break some calling code which uses `auto out2 = copy_while (first, last, out, pred);` since the using type inference will prevent the “biased” casting operator from being called.

    Reply
    1. Marshall Clow Post author

      C++ does not distinguish between functions that differ only in return type.

      For example:

      int foo (void);
      double foo (void);

      gives an error: functions that differ only in their return type cannot be overloaded

      Reply

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s