The Iterator design pattern is described in the GoF ‘Design Patterns’ book. It is used in many places: a SQL cursor is an ‘iterator’, the C++ Standard Template Library uses iterators heavily, and the .NET LINQ interfaces are based on IEnumerable (i.e. an iterator). However, I rarely see projects creating or using ‘custom’ iterator classes, even though many problems can be solved ‘elegantly’ with customized iterators. This talk is about the ‘power of iterators’: how custom iterators can solve common problems and help create modular, reusable code components.
Key Discussion Points
Typical examples of iterators in common use.
Kinds of problems that can be ‘elegantly’ solved with iterators
When to use custom iterators?
How to write custom iterators in C++/C#
From a webinar I did on TechGig
http://www.techgig.com/expert-speak/Iterator-a-powerful-but-underappreciated-pattern-449
Iterator - a powerful but underappreciated design pattern
1. Iterator
– A powerful but underappreciated pattern
Nitin Bhide
Chief Software Engineer
Geometric Ltd.
nitin.bhide@geometricglobal.com
2. Iterator – A Definition
WHAT IS AN ITERATOR? WHAT PROBLEM DOES IT SOLVE?
Confidential
3. Iterator - Definition
In object-oriented computer programming, an
iterator is an object that enables a programmer to
traverse a container (or collection).
• http://en.wikipedia.org/wiki/Iterator
In object-oriented programming, the iterator
pattern is a design pattern in which an iterator is
used to traverse a container and access the
container's elements.
• http://en.wikipedia.org/wiki/Iterator_pattern
4. Some More Info about Iterators
• Defined in the GoF book as the “Iterator Pattern”.
• It is a Behavioral Pattern
• The essence of the Iterator Pattern is to
"Provide a way to access the elements of an aggregate
object sequentially without exposing its underlying
representation."
• The iterator pattern decouples algorithms from
containers.
5. Benefits of “decoupling of algorithms and collections”
• Common ‘interface’ can be defined to access the elements of
collection.
• Algorithms can be written using this common iterator interface
• Allows us to change the ‘internal’ implementation of collection
with no change in the algorithms implementation
• Allows us to add new algorithms which work with all existing
collection types.
• For example, C++ Standard Template Library provides a set of
algorithms which work on multiple collections types by using a
common ‘iterator’ interface
6. Typical operations in an Iterator Interface
• CurrentElement()
• referencing one particular element in the object collection
• Next()
• modifying itself so it points to the next element
• Reset() – Optional
• Reset the current element to start element
• HasNext() or IsEnd() – Optional
• Detect if there is no element left in the collection.
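The operations listed above can be sketched as a small class. This is only an illustration in Python, not code from the talk: the names `ListIterator`, `current_element`, `is_end` and the helper `collect` are hypothetical.

```python
class ListIterator:
    """Hypothetical iterator exposing the operations listed above."""

    def __init__(self, items):
        self._items = items
        self._pos = 0  # iterator state, kept separate from the collection

    def current_element(self):
        # Reference one particular element of the collection.
        return self._items[self._pos]

    def next(self):
        # Modify the iterator so that it points to the next element.
        self._pos += 1

    def reset(self):
        # Optional: go back to the start element.
        self._pos = 0

    def is_end(self):
        # Optional: detect that no element is left.
        return self._pos >= len(self._items)


def collect(it):
    """Walk an iterator from its current position to the end."""
    out = []
    while not it.is_end():
        out.append(it.current_element())
        it.next()
    return out
```

Note that `collect` only relies on the iterator interface; it never touches the underlying list directly.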
7. Looping with Iterator (pseudo code)
Simple for loop
• for(int i=0; i<10; ++i)
{
curval = i;
}
For loop with Iterator
• for(iterator it=iterator();
it.IsEnd()==false;
it.Next())
{
curval = it.Current();
}
9. Iterator Implementations – Concepts and Limitations
• Iterator as an ‘object’ allows iterator to have its own state
(i.e. its own member variables) different than the state of
collection
• Allows multiple iterator objects corresponding to same
collection
• If collection changes while ‘iteration’ is in progress, all
existing iterators can become invalid.
• Implementation of Iterator and the collection on which it
operates are usually ‘tightly coupled’
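Two of the points above, independent iterator state and invalidation on change, can be observed with Python's built-in iterators. This is a small sketch; the function names are made up for illustration.

```python
def independent_positions():
    """Two iterators over the same list keep their own position."""
    numbers = [1, 2, 3]
    first, second = iter(numbers), iter(numbers)
    next(first)
    next(first)
    # 'first' has advanced twice, 'second' is still at the start.
    return next(first), next(second)


def mutate_while_iterating():
    """Changing a dict during iteration invalidates its iterator."""
    ages = {"ann": 30, "bob": 25}
    try:
        for name in ages:
            ages[name + "_copy"] = ages[name]  # changes the dict size
    except RuntimeError:
        # Python detects the change and refuses to continue iterating.
        return "invalidated"
    return "completed"
```

Other languages react differently to the same situation: Java typically throws ConcurrentModificationException, while C++ leaves the behavior undefined.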
10. Examples of Iterator implementations
C++
• STL containers implement their own iterators
• STL algorithms are defined in terms of iterators
C#
• The IEnumerable<T> and IEnumerator<T>
interfaces define the iterators
Java
• SDK defines iterator interface
• and collections implement their iterators
SQL
• Cursor is an iterator
12. LINQ – Language INtegrated Query
• Language Integrated Query (LINQ, pronounced "link") is a
Microsoft .NET Framework component that adds native
data querying capabilities to .NET languages.
• From MSDN Documentation:
In Visual Studio you can write LINQ queries in Visual Basic or C#
with SQL Server databases, XML documents, ADO.NET
Datasets, and any collection of objects that supports IEnumerable
or the generic IEnumerable<T> interface.
Essentially, the power of LINQ comes from
the power of iterators
13. Reason behind Iterators Power
Simple for loop with if check
for(i=0; i<100; i++)
{
if( i % 3 == 0)
{
print i;
}
}
This for loop has 3 responsibilities
1. Looping
2. Filtering
3. Printing the results
For loop with iterator
for(it.Reset(); it.HasNext(); it.MoveNext())
{
print it.Current;
}
This for loop has only two responsibilities
1. Looping
2. Printing the results
The iterator’s MoveNext function is overridden so that it stops only on
numbers divisible by 3.
The responsibility of filtering is moved to the
iterator.
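A minimal sketch of the idea above, written in Python. The class names and the `is_end`/`move_next` interface are hypothetical, chosen to mirror the pseudo code; the loop in `collect` knows nothing about filtering.

```python
class RangeIterator:
    """Hypothetical base iterator over the integers 0 .. stop-1."""

    def __init__(self, stop):
        self._stop = stop
        self.current = 0

    def reset(self):
        self.current = 0

    def is_end(self):
        return self.current >= self._stop

    def move_next(self):
        self.current += 1


class DivisibleByThreeIterator(RangeIterator):
    """Same interface, but move_next() skips values not divisible by 3."""

    def move_next(self):
        # The filtering responsibility lives here, not in the caller's loop.
        super().move_next()
        while not self.is_end() and self.current % 3 != 0:
            super().move_next()


def collect(it):
    """The 'loop' part: only looping and collecting, no filtering."""
    out = []
    it.reset()
    while not it.is_end():
        out.append(it.current)
        it.move_next()
    return out
```

This sketch assumes the start value (0) passes the filter; a more general version would also skip rejected values in `reset()`.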
14. Simple For loop with LINQ
Simple for loop with if check
for(i=0; i<100; i++)
{
if( i % 3 == 0)
{
print i;
}
}
With LINQ
var nums = Enumerable.Range(0, 100).Where(n => n % 3 == 0);
foreach(var i in nums)
{
print i;
}
15. LINQ Providers
• LINQ to SQL/ADO.Net
• LINQ to XML
• LINQ to Amazon
• LINQ to Active Directory
• LINQ to CRM
• LINQ To Geo - Language Integrated Query for Geospatial Data
• LINQ to Excel
• LINQ to Flickr
• LINQ to Google
• LINQ to Indexes (LINQ and i4o)
• LINQ to JSON
• LINQ to IMAP
• LINQ to NHibernate
• LINQ to LDAP
• LINQ to Lucene
• LINQ to MySQL, Oracle and PostgreSql (DbLinq)
• LINQ to NCover
• LINQ to Opf3
• LINQ to RDF Files
• LINQ to Sharepoint
• LINQ to SimpleDB
• LINQ to Streams
• LINQ to WebQueries
• LINQ to WMI
From Links to LINQ page :
http://blogs.msdn.com/b/charlie/archive/2008/02/28/link-to-everything-a-list-of-linqproviders.aspx
16. Reasons behind Power of Iterators
Iterators can act as ‘filters’ or ‘views’
• Instead of returning all elements of a
collection, an iterator can ‘filter’ the elements and return a
‘partial’ list.
• The filtering happens as ‘needed’ and an
intermediate ‘collection’ of filtered results can be
avoided.
• Different implementations of an iterator can provide
different filtering criteria.
• Hence algorithms written with ‘iterators’ can work
on the original collection or on filtered collections.
17. Reasons behind Power of Iterators
Iterators can be polymorphic
• Hence the same function can return different
results when a different iterator is passed to it.
• For example, our previous example of
printing numbers divisible by 3 can easily be
changed to print all ‘primes’ by replacing
the iterator. There is no need to change the
print function.
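The point above can be sketched in Python: one unchanged function, two different iterators. The function names (`divisible_by_three`, `primes`, `render`) are hypothetical, and `render` returns a string instead of printing so that the result is easy to check.

```python
def divisible_by_three(limit):
    """Lazily produce the numbers below 'limit' that are divisible by 3."""
    return (n for n in range(limit) if n % 3 == 0)


def primes(limit):
    """Lazily produce the primes below 'limit' (simple trial division)."""
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            yield n


def render(numbers):
    """The 'print function': it accepts any iterator of numbers."""
    return ", ".join(str(n) for n in numbers)
```

`render` stays the same in both cases; only the iterator passed to it changes.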
18. Reasons behind Power of Iterators
Iterators can be combined to provide more flexibility
• For example, combine two existing iterators into a new iterator
such that it ‘chains’ the child iterators
• For example, the ‘itertools’ module of Python provides many ways
of combining iterators
• The following one-liner in Python 2 efficiently returns the ‘dot product’ of
two mathematical vectors
sum(imap(operator.mul, vector1, vector2))
The imap() function takes multiple iterators and calls ‘mul’ (the multiply
function) with the current values of those iterators. In Python 3, the built-in map() behaves like imap().
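A Python 3 sketch of the two ideas above, chaining child iterators and combining two iterators element by element. The wrapper names `dot_product` and `chained` are made up for illustration; `itertools.chain`, `operator.mul` and the built-in `map` are standard library.

```python
import itertools
import operator


def dot_product(vector1, vector2):
    """Multiply matching elements lazily and sum the products."""
    # In Python 3 the built-in map() is lazy, like imap() was in Python 2.
    return sum(map(operator.mul, vector1, vector2))


def chained(*iterables):
    """Combine several child iterators into one that visits them in turn."""
    return itertools.chain(*iterables)
```

No intermediate list of products or of concatenated elements is built in either function.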
19. Reasons behind Power of Iterators
Iterators can work with ‘collection like’ objects.
• For example, we can write an iterator which
returns one word at a time from an input
stream (or a text file)
• The best example is how LINQ works with
databases or XML through exactly the same
interface.
20. Few More Examples of Powerful Iterators
Suppose we want to extract ‘unique words’ from
a text file.
• Since an iterator can have its own state, we can write an
iterator that remembers the words it has already ‘seen’ and
• if a word has already been ‘seen’, ignores that word and reads the
next word from the istream.
• Such an iterator will return just the ‘unique’ words in the
istream.
• Now if you have a function which takes a word iterator as a
parameter and prints the words, then just by passing the
‘unique word’ iterator, we can print the unique words in the
text file.
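The unique-word iterator described above could look roughly like this in Python. It is a sketch under assumptions: words are split on whitespace, and the names `words`, `UniqueWordIterator` and `render` are hypothetical.

```python
import io


def words(stream):
    """Yield one whitespace-separated word at a time from a text stream."""
    for line in stream:
        yield from line.split()


class UniqueWordIterator:
    """Iterator with its own state: the set of words already seen."""

    def __init__(self, word_iter):
        self._source = iter(word_iter)
        self._seen = set()

    def __iter__(self):
        return self

    def __next__(self):
        # Skip words seen before; return the first new one.
        for word in self._source:
            if word not in self._seen:
                self._seen.add(word)
                return word
        raise StopIteration


def render(word_iter):
    """Stands in for the 'print the words' function; accepts any word iterator."""
    return " ".join(word_iter)
```

Passing `UniqueWordIterator(words(stream))` instead of `words(stream)` changes what `render` produces without changing `render` itself.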
21. Laser Cutting Tool path Generation
• For one of my projects, the problem was to generate the
toolpath for laser cutting of sheet metal parts
• Toolpath generation had different strategies based on
various parameters, part selection logic, etc.
• A naïve implementation would require
• writing complicated for loops for the different strategies
• Parameters passed as function parameters
• Combining strategies is nearly impossible.
• A lot of duplicated boilerplate code.
• A new strategy would require changes in the existing code.
22. Laser Cutting Tool path Generation - Solution
• We defined a base class ToolPathIterator
• Every new toolpath creation strategy is implemented as a ‘derived
class’ of this iterator.
• We also implemented iterators which derived from this base
class and used some existing ToolPathIterators internally to
combine the strategies.
• A Factory method instantiated the necessary toolpath
iterator based on the name of the strategy.
• Only the Factory method depends on all the concrete iterator
implementations.
• Everywhere else the ToolPathIterator base class was used.
• Adding a new strategy required
• (a) adding a new derived class from ToolPathIterator
• (b) a change in the Factory method.
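The structure described above might be sketched as follows. Everything here other than the name ToolPathIterator is an assumption made for illustration: the strategy names, the part fields (`x`, `area`), the threshold and the factory function are hypothetical and do not reflect the actual project code.

```python
from abc import ABC, abstractmethod


class ToolPathIterator(ABC):
    """Base class: yields parts in the order they should be cut."""

    def __init__(self, parts):
        self._parts = list(parts)

    @abstractmethod
    def __iter__(self):
        ...


class LeftToRight(ToolPathIterator):
    def __iter__(self):
        return iter(sorted(self._parts, key=lambda p: p["x"]))


class LargestFirst(ToolPathIterator):
    def __iter__(self):
        return iter(sorted(self._parts, key=lambda p: -p["area"]))


class LargeThenRest(ToolPathIterator):
    """Combined strategy built from two existing iterators."""

    THRESHOLD = 50  # hypothetical area limit

    def __iter__(self):
        large = [p for p in self._parts if p["area"] >= self.THRESHOLD]
        rest = [p for p in self._parts if p["area"] < self.THRESHOLD]
        yield from LargestFirst(large)
        yield from LeftToRight(rest)


# Only the factory knows the concrete classes.
_STRATEGIES = {
    "left_to_right": LeftToRight,
    "largest_first": LargestFirst,
    "large_then_rest": LargeThenRest,
}


def create_tool_path_iterator(name, parts):
    return _STRATEGIES[name](parts)


def cut_order(tool_path):
    """Client code: depends only on the ToolPathIterator base class."""
    return "".join(part["name"] for part in tool_path)
```

In this sketch, adding a strategy means adding one derived class and one entry in `_STRATEGIES`; `cut_order` does not change.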
23. Why don’t I see many custom iterators?
WHY ARE ITERATORS UNDERAPPRECIATED?
24. Most probably because Iterator is a
‘really simple concept’
and people have difficulty in believing
that such a simple concept can have
so much power.
25. Why are iterators underappreciated?
• Difficulty in thinking of traversal/iteration as a ‘separate
object/responsibility’
• Difficulty in thinking of ‘iterator’ as an interface with multiple
possible implementations
• Writing an iterator requires defining a new class.
• This may require additional coding, while writing a for loop may
look simpler/quicker.
26. Difficulty in ‘thinking’ traversal/iteration as ‘separate object’
• Developers are used to thinking in loops which work on
‘indices’.
• For them ‘i++’ is conceptually easier than ‘it.next()’
• I see many examples where iteration over a C++ vector uses the indexing
operation (e.g. vec[i]) rather than vec.begin()
• Hence creating a separate class/object to keep track of the
current element and to move to the next element is somehow a
difficult ‘leap’.
• However, once you make that ‘leap’, your subsequent designs may
change significantly from your past designs
27. ‘iterator’ as interface with multiple possible implementations
• Since ‘iterator’ is a separate class/object, we can create
‘iterator’ interfaces (or base classes).
• If we write function using the ‘iterator’ interface, then we
can pass different implementations of iterators and get
different behavior.
• Different traversal algorithms (e.g. pre order, post order
traversals in trees)
• Filtering
• All this flexibility is possible, if you start thinking in terms
of ‘iterator interface and its implementations’ or a
hierarchy of iterators.
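As an illustration of different traversal algorithms behind one interface, here is a minimal Python sketch. The `Node` class and the function names are hypothetical; both traversals are plain iterators, so any function written against ‘an iterator of values’ works with either.

```python
class Node:
    """Hypothetical tree node with a value and an ordered list of children."""

    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)


def pre_order(node):
    """Visit the node first, then its children."""
    yield node.value
    for child in node.children:
        yield from pre_order(child)


def post_order(node):
    """Visit the children first, then the node."""
    for child in node.children:
        yield from post_order(child)
    yield node.value


def first_greater_than(values, limit):
    """An algorithm written only against the iterator interface."""
    for value in values:
        if value > limit:
            return value
    return None
```

The result of `first_greater_than` depends on which traversal iterator is passed in, although the function itself is unchanged.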
28. Writing an iterator requires defining a new class
• In languages like Java and C++, writing an iterator means
adding a new class to the system.
• It requires more code than a simple for loop
• The developer has to take care in defining the iterator interface (e.g.
iterators that will work with STL containers).
• Developers treat it as an ‘additional burden/cost’. They cannot
visualize the benefits and hence decide that a separate
iterator class is probably not worth the effort.
• However, this is changing with co-routine
implementations
• ‘yield return’ and similar keywords are being introduced in
languages like C# and Python.
29. Yield Return in C#
• A Trivial Example : Writing an iterator which returns numbers divisible by
three
• Defining the Iterator using Yield Return.
IEnumerable<int> DivisibleByThree()
{
for(int i=0; i< 100; i++)
{
if(i % 3 == 0)
{
yield return i;
}
}
}
• Using the Iterator with foreach
foreach(var j in DivisibleByThree())
{
print j;
}
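For comparison, the same trivial iterator can be written with Python's `yield` keyword. This is a sketch; the `limit` parameter is an addition for illustration and defaults to the 100 used in the C# example.

```python
def divisible_by_three(limit=100):
    """Generator: the function body is suspended at each 'yield'."""
    for i in range(limit):
        if i % 3 == 0:
            yield i


def first_few(numbers, count):
    """Consume only 'count' values; the rest are never computed."""
    out = []
    for n in numbers:
        if len(out) == count:
            break
        out.append(n)
    return out
```

No separate iterator class is needed; the language builds the iterator object from the function body.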
31. Use Iterators instead of ‘indexing’ in loops
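• Loops with iterators can be more efficient than
index based loops
• For example, getting the element at the ‘i’th index is equivalent
to
*(Startpos + i*sizeof(element))
while getting the next element from the current element is
equivalent to
*(cur_elem_pos++)
• Hence ‘iterator’ based loops can be slightly faster, although optimizing compilers often generate the same code for both forms on arrays. The difference matters most for containers without cheap random access (e.g. linked lists).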
32. Never directly return a member ‘collection’ from a class
Don’t
class XYZ
{
private List<int> intlist = new List<int>();
public List<int> IntList
{
get {
return intlist;
}
}
}
Do
class XYZ
{
private List<int> intlist = new List<int>();
public IEnumerable<int> IntList
{
get {
return intlist;
}
}
}
33. Loop+If : see if you can use an iterator
• If you see a similar loop + if condition, or similar loop
contents, in multiple places
• See if you can extract the similarities into an iterator
• Sometimes you have to create a ‘base iterator’ and override the
‘MoveNext()’ implementation
• If you are using C#/Python, see if you can use ‘yield’
34. Never directly return a member ‘collection’ from a class.
• It violates ‘encapsulation’
• Also, all the users of your class now ‘explicitly’ depend
on the collection type (e.g. List).
• Tomorrow, if you want to change the List to a HashSet,
every place where your class is used will potentially need to
change.
• For C#
• Define an IEnumerable as ‘get’ property. It will allow you to
change the internal collection type with almost no impact on the
external interface.
• For Java/C++
• Define an iterator for your class.
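The same guideline can be sketched in Python by exposing only an iterator over the private collection. The class `Inventory` and its methods are hypothetical names used for illustration.

```python
class Inventory:
    """Exposes iteration over its items, not the underlying collection."""

    def __init__(self):
        # Internal detail: could later become a set, a dict or a database
        # cursor without affecting callers that only iterate.
        self._items = []

    def add(self, item):
        self._items.append(item)

    def __iter__(self):
        # Callers receive an iterator, not the list itself.
        return iter(self._items)
```

Callers can loop over an `Inventory` or pass it to any function that accepts an iterable, but they cannot append to or reorder the internal list through what they receive.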
35. Summary
As a developer, change your thinking to
• Think of traversal/iteration as a ‘separate
class/object/responsibility’
• Think of ‘iterator’ as an interface with
multiple possible implementations
• Different traversal strategies/algorithms
can be implemented as different iterator
implementations.
Once you do that, you will find many more uses of iterators
• That can simplify your code
• Make it less bug prone and more stable.
• Make it easier to enhance and maintain