SMOP map operator


The 'map' operator in Perl 6 is heavily functional, and is designed so that it never forces eager evaluation of any of its input data.

 ... <== map { ... } <== ...

This may look simple at first, but then you have to work out how to traverse a List without knowing its size, especially when that list might autovivify as you traverse it. It gets worse when that list is also lazy.

The immediate alternative is providing Iterators. The Perl 6 spec still doesn't really speak about them (besides implying the existence of an Iterator type), but they are indeed the only sane way I can think of to implement the traversal of a lazy list, with the implicit benefit of adding better support for cursor-based data (e.g. Berkeley DB B-trees).

So far so good, but which API does the iterator implement? The Java language implements iterators in terms of "has_next" and "next" calls. This might look ok, but it splits the process of consuming an iterator into two steps, which is likely to produce problems with concurrency (like checking for eof before reading from a network socket: the answer may no longer hold by the time the read happens). As TimToady explained on IRC, trying and failing produces saner semantics, so we end up with a single method in the Iterator API, as the spec seems to imply: prefix:<=>.
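To make the "try and fail" idea concrete, here is a minimal sketch in Python (used only as an analogy, since the Perl 6 in this note is conceptual). The names SingleCallIterator, pull and Exhausted are hypothetical and not part of any spec; the point is only that a single call either hands back an item or fails, so there is no gap between asking and reading.

```python
class Exhausted(Exception):
    """Hypothetical signal: the iterator has nothing more to give."""


class SingleCallIterator:
    """Models an iterator whose only operation is 'give me the next item'."""

    def __init__(self, source):
        self._source = iter(source)

    def pull(self):
        # One step: either an item comes back or the call fails.
        # There is no separate has_next() whose answer could go stale.
        try:
            return next(self._source)
        except StopIteration:
            raise Exhausted from None


def countdown(n):
    # A source whose size the consumer does not know in advance.
    while n > 0:
        yield n
        n -= 1


def drain(iterator):
    # Consume by trying until the iterator fails.
    items = []
    while True:
        try:
            items.append(iterator.pull())
        except Exhausted:
            return items
```

Calling drain(SingleCallIterator(countdown(3))) walks the source without ever asking for its size or for an end-of-data flag.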

But don't go away yet, because we also have one thing called "Slice context". In slice context, map should return a two-dimensional list of the captures returned by each map iteration (actually that applies to all loop operators, not only map), while in List context it should return a lazy list that flattens the items returned by the iterations.

The solution I've reached at this point is that the context imposed when obtaining the iterator is the key to solving that. If you obtain the iterator in item context, the iterator will provide prefix:<=>, which will return one item at a time. If you obtain it in list context, we can create a generic lazy list that will consume the iterator as needed, and when you obtain the iterator in slice context, you'll get a generic lazy slice.
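The difference between the list and slice views can be sketched in Python (again only as an analogy). Here a capture is modelled as a tuple, and captures, in_list_context and in_slice_context are hypothetical names made up for the illustration; both views consume the same kind of capture stream lazily and differ only in whether the per-iteration structure survives.

```python
def captures():
    # Hypothetical stream of captures, one per map iteration.
    # Each capture may hold several items or none.
    yield (1, 2)
    yield ()
    yield (3,)


def in_list_context(capture_iter):
    # Lazy list: the items of every capture, flattened, produced on demand.
    for capture in capture_iter:
        yield from capture


def in_slice_context(capture_iter):
    # Lazy slice: one row per iteration, so the result is two-dimensional.
    for capture in capture_iter:
        yield list(capture)


flat = list(in_list_context(captures()))
rows = list(in_slice_context(captures()))
```

With the stream above, flat is [1, 2, 3] while rows is [[1, 2], [], [3]]: the empty capture vanishes in list context but is still visible as an empty row in slice context.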

In SMOP we have late context propagation (conceptually; this can be optimized later), which means that the following code

   # this is a good example of loose context information anyway...
   other_something(| map { ... }, something())

will

  1. coerce the return of something to Iterator (returning an Iterator object)
  2. coerce the iterator to item context (returning an IteratorInItemContext object)
  3. return an iterator that holds a reference to both the input iterator and to the code

The iterator returned by map will itself be used in some context, producing a lazy scalar, a lazy list or a lazy slice.

When data is required from the iterator returned by map (which will depend on how it will be used), the following occurs:

  1. map calls prefix:<=> and gets the next item (even if that means many iterations or no new iterations in the input data)
  2. map calls { ... }($item) (the number of arguments is defined by the .arity of the signature, and a call to prefix:<=> is made to get each argument.)
  3. map returns the capture with no context applied (which means that it might contain several or no items)
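The three steps above can be sketched in Python as follows. This is an analogy under stated assumptions, not the SMOP implementation: the arity is read with inspect.signature as a stand-in for .arity, a capture is modelled as whatever the code returns (tuples in the usage below), and a trailing partial group of arguments is simply dropped when the input runs out, which is my assumption and not something the spec says.

```python
import inspect


class MapIterator:
    """Pulls as many input items as the code takes, then calls the code."""

    def __init__(self, input_iterator, code):
        self.input_iterator = iter(input_iterator)
        self.code = code
        # Assumption: the number of declared parameters stands in for .arity.
        self.arity = len(inspect.signature(code).parameters)

    def __iter__(self):
        return self

    def __next__(self):
        # Step 1: one pull on the input per argument. If the input is
        # exhausted, StopIteration propagates and ends this iterator too.
        args = [next(self.input_iterator) for _ in range(self.arity)]
        # Steps 2 and 3: call the code and hand back its result as is,
        # with no flattening or other context applied.
        return self.code(*args)


pairs = list(MapIterator([1, 2, 3, 4], lambda a, b: (a + b,)))
doubled = list(MapIterator([1, 2], lambda x: (x, x)))
```

Here pairs is [(3,), (7,)] and doubled is [(1, 1), (2, 2)]; whether those captures get flattened is left to whoever consumes the iterator.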

The key to solving the puzzle here is that the Iterator and the Iterator in each context are different objects. If the iterator returned by map is used in item context (like being the input for another map), it will follow the first rule of the above list.
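A Python sketch of that separation, as an analogy only: the item-context view is its own object wrapping an iterator of captures. The buffering with a deque and the helper map_captures are assumptions made for the illustration; the name GenericItemIterator is borrowed from the conceptual notes below. It shows why fetching one item may take many iterations of the underlying map (empty captures) or none at all (items left over from a previous capture).

```python
from collections import deque


class GenericItemIterator:
    """Item-context view over an iterator of captures (tuples here)."""

    def __init__(self, capture_iterator):
        self._captures = iter(capture_iterator)
        self._pending = deque()

    def __iter__(self):
        return self

    def __next__(self):
        # An empty capture costs an extra iteration; a capture with several
        # items serves several calls without touching the input again.
        while not self._pending:
            self._pending.extend(next(self._captures))
        return self._pending.popleft()


def map_captures(code, items):
    # Hypothetical one-argument map that yields the raw capture per iteration.
    for item in items:
        yield code(item)


# The first map returns two items for odd input and nothing for even input.
first = map_captures(lambda x: (x, x * 10) if x % 2 else (), [1, 2, 3])
# The second map consumes the first one through its item-context view.
second = map_captures(lambda y: (y + 1,), GenericItemIterator(first))
result = list(second)
```

The second map sees the items 1, 10, 3, 30 one at a time, so result is [(2,), (11,), (4,), (31,)], even though the first map ran only three iterations.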

Conceptual notes

class List is also {
  # the iterator returned by each object is private to that type and is implementation specific.
  method Iterator() {...}
}
role Iterator {
  # enforces item context
  method FETCH() {...}
  # returns a lazy list
  method List() {...}
  # returns a lazy slice
  method Slice() {...}
  # returns the capture of the next iteration
  method prefix:<=> {...}
}
role ItemIterator {
  # returns the next item
  method prefix:<=> {...}
}
class MapIterator does Iterator {
  has $.input_iterator;
  has $.code;
  method FETCH() {
    return GenericItemIterator.new(input_iterator => self);
  }
  method List() {
    return GenericLazyList.new(input_iterator => self);
  }
  method Slice() {
    return GenericLazySlice.new(input_iterator => self);
  }
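  # pulls the next item from the input iterator, runs the code on it and
  # returns the resulting capture with no context applied
  method prefix:<=> {
    return $.code.(=$.input_iterator);
  }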
}