mirror of git://gcc.gnu.org/git/gcc.git
				
				
				
			
		
			
				
	
	
		
			182 lines
		
	
	
		
			7.6 KiB
		
	
	
	
		
			XML
		
	
	
	
			
		
		
	
	
| <chapter xmlns="http://docbook.org/ns/docbook" version="5.0" 
 | |
| 	 xml:id="std.iterators" xreflabel="Iterators">
 | |
| <?dbhtml filename="iterators.html"?>
 | |
| 
 | |
| <info><title>
 | |
|   Iterators
 | |
|   <indexterm><primary>Iterators</primary></indexterm>
 | |
| </title>
 | |
|   <keywordset>
 | |
|     <keyword>ISO C++</keyword>
 | |
|     <keyword>library</keyword>
 | |
|   </keywordset>
 | |
| </info>
 | |
| 
 | |
| 
 | |
| 
 | |
| <!-- Sect1 01 : Predefined -->
 | |
| <section xml:id="std.iterators.predefined" xreflabel="Predefined"><info><title>Predefined</title></info>
 | |
|   
 | |
| 
 | |
|   <section xml:id="iterators.predefined.vs_pointers" xreflabel="Versus Pointers"><info><title>Iterators vs. Pointers</title></info>
 | |
|     
 | |
|    <para>
 | |
|      The following
 | |
| FAQ <link linkend="faq.iterator_as_pod">entry</link> points out that
 | |
| iterators are not implemented as pointers.  They are a generalization
 | |
| of pointers, but they are implemented in libstdc++ as separate
 | |
| classes.
 | |
|    </para>
 | |
|    <para>
 | |
|      Keeping that simple fact in mind as you design your code will
 | |
|       prevent a whole lot of difficult-to-understand bugs.
 | |
|    </para>
 | |
|    <para>
 | |
|      You can think of it the other way 'round, even.  Since iterators
 | |
|      are a generalization, that means
 | |
|      that <emphasis>pointers</emphasis> are
 | |
|       <emphasis>iterators</emphasis>, and that pointers can be used
 | |
|      whenever an iterator would be.  All those functions in the
 | |
|      Algorithms section of the Standard will work just as well on plain
 | |
|      arrays and their pointers.
 | |
|    </para>
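|    <para>
|      As a minimal sketch of pointers standing in for iterators, the
|      two helpers below (<code>sort_array</code> and
|      <code>index_of</code> are made-up names for illustration, not
|      library functions) hand plain pointers straight to
|      <code>std::sort</code> and <code>std::find</code>.
|    </para>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Illustrative helper (not part of any library): sorts a plain array
// in place by passing two raw pointers to std::sort.
void sort_array(int* data, std::size_t n)
{
    assert(data != nullptr || n == 0);
    std::sort(data, data + n);   // the range is [data, data + n)
}

// Illustrative helper: returns the index of 'value' in the array,
// or n when the value is absent (std::find then returns data + n).
std::size_t index_of(const int* data, std::size_t n, int value)
{
    assert(data != nullptr || n == 0);
    const int* pos = std::find(data, data + n, value);
    return static_cast<std::size_t>(pos - data);
}
```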
 | |
|    <para>
 | |
|      That doesn't mean that when you pass in a pointer, it gets
 | |
|       wrapped into some special delegating iterator-to-pointer class
 | |
|       with a layer of overhead.  (If you think that's the case
 | |
|       anywhere, you don't understand templates to begin with...)  Oh,
 | |
|       no; if you pass in a pointer, then the compiler will instantiate
 | |
|       that template using T* as a type, and good old high-speed
 | |
|       pointer arithmetic as its operations, so the resulting code will
 | |
|       be doing exactly the same things as it would be doing if you had
 | |
|       hand-coded it yourself (for the 273rd time).
 | |
|    </para>
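|    <para>
|      To make that concrete, here is a sketch of a small generic
|      function (<code>sum</code>, <code>deduced_as_pointer</code> and
|      <code>sum_demo</code> are hypothetical names used only for this
|      example).  When it is called with an array, the template
|      parameter is deduced as <code>int*</code> itself, so the loop
|      compiles down to ordinary pointer arithmetic.
|    </para>

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical generic algorithm: adds up every element in [first, last).
template <typename Iterator>
int sum(Iterator first, Iterator last)
{
    int total = 0;
    for (; first != last; ++first)   // for int*, this is plain pointer ++
        total += *first;             // and a plain pointer dereference
    return total;
}

// Reports whether the template parameter was deduced as a raw pointer.
template <typename Iterator>
bool deduced_as_pointer(Iterator, Iterator)
{
    return std::is_pointer<Iterator>::value;
}

// Usage sketch: the same template serves a plain array and a std::vector.
inline int sum_demo()
{
    int a[] = {1, 2, 3};
    std::vector<int> v(a, a + 3);
    assert(sum(a, a + 3) == sum(v.begin(), v.end()));
    return sum(a, a + 3);
}
```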
 | |
|    <para>
 | |
|      How much overhead <emphasis>is</emphasis> there when using an
 | |
|       iterator class?  Very little.  Most of the layering classes
 | |
|       contain nothing but typedefs, and typedefs are
 | |
|       "meta-information" that simply tell the compiler some
 | |
|       nicknames; they don't create code.  That information gets passed
 | |
|       down through inheritance, so while the compiler has to do work
 | |
|       looking up all the names, your runtime code does not.  (This has
 | |
|       been a prime concern from the beginning.)
 | |
|    </para>
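|    <para>
|      The following sketch illustrates the point about typedefs with two
|      made-up classes (<code>element_types</code> and
|      <code>my_iterator_types</code> are assumptions for this example,
|      not libstdc++ internals).  Everything is checked at compile
|      time, and the classes stay empty.
|    </para>

```cpp
#include <iterator>
#include <type_traits>

// Hypothetical layering class that contains nothing but typedefs.
struct element_types
{
    typedef int   value_type;
    typedef int*  pointer;
    typedef int&  reference;
};

// The typedefs are passed down through inheritance; no data members
// or functions are added along the way.
struct my_iterator_types : element_types
{
    typedef std::random_access_iterator_tag iterator_category;
};

// The inherited nickname is visible in the derived class...
static_assert(std::is_same<my_iterator_types::value_type, int>::value,
              "value_type is inherited from element_types");

// ...and the whole hierarchy still holds no data.
static_assert(std::is_empty<my_iterator_types>::value,
              "typedefs add no storage");

// The standard traits class describes plain pointers in the same way.
static_assert(std::is_same<std::iterator_traits<int*>::iterator_category,
                           std::random_access_iterator_tag>::value,
              "a pointer is a random-access iterator");
```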
 | |
| 
 | |
| 
 | |
|   </section>
 | |
| 
 | |
|   <section xml:id="iterators.predefined.end" xreflabel="end() Is One Past the End"><info><title>One Past the End</title></info>
 | |
|     
 | |
| 
 | |
|    <para>This starts off sounding complicated, but is actually very easy,
 | |
|       especially towards the end.  Trust me.
 | |
|    </para>
 | |
|    <para>Beginners usually have a little trouble understanding the whole
 | |
|       'past-the-end' thing, until they remember their early algebra classes
 | |
|       (see, they <emphasis>told</emphasis> you that stuff would come in handy!) and
 | |
|       the concept of half-open ranges.
 | |
|    </para>
 | |
|    <para>First, some history, and a reminder of some of the funkier rules in
 | |
|       C and C++ for builtin arrays.  The following rules have always been
 | |
|       true for both languages:
 | |
|    </para>
 | |
|    <orderedlist inheritnum="ignore" continuation="restarts">
 | |
|       <listitem>
 | |
| 	<para>You can point anywhere in the array, <emphasis>or to the first element
 | |
| 	  past the end of the array</emphasis>.  A pointer that points to one
 | |
| 	  past the end of the array is guaranteed to be as unique as a
 | |
| 	  pointer to somewhere inside the array, so that you can compare
 | |
| 	  such pointers safely.
 | |
| 	</para>
 | |
|       </listitem>
 | |
|       <listitem>
 | |
| 	<para>You can only dereference a pointer that points into an array.
 | |
| 	  If your array pointer points outside the array -- even to just
 | |
| 	  one past the end -- and you dereference it, Bad Things happen.
 | |
| 	</para>
 | |
|       </listitem>
 | |
|       <listitem>
 | |
| 	<para>Strictly speaking, simply pointing anywhere else invokes
 | |
| 	  undefined behavior.  Most programs won't puke until such a
 | |
| 	  pointer is actually dereferenced, but the standards leave that
 | |
| 	  up to the platform.
 | |
| 	</para>
 | |
|       </listitem>
 | |
|    </orderedlist>
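|    <para>
|      The three rules above can be sketched in a few lines
|      (<code>count_elements</code> and <code>past_the_end_demo</code>
|      are hypothetical names for this example).  The past-the-end
|      pointer is formed and compared, but never dereferenced.
|    </para>

```cpp
#include <cassert>
#include <cstddef>

// Counts the elements of [first, last) by walking a pointer up to,
// but never through, the one-past-the-end position.
std::size_t count_elements(const int* first, const int* last)
{
    std::size_t n = 0;
    for (const int* p = first; p != last; ++p)  // comparing with 'last' is safe
        ++n;
    return n;
}

inline void past_the_end_demo()
{
    int a[5] = {10, 20, 30, 40, 50};
    int* end = a + 5;          // rule 1: forming and comparing this pointer is allowed
    assert(end != a);
    assert(end - a == 5);
    assert(*(end - 1) == 50);  // the last element sits just before 'end'
    // rule 2: '*end' would be undefined behavior, so it is never evaluated.
    // rule 3: even forming 'a + 6' is undefined, so it does not appear here.
}
```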
 | |
|    <para>The reason this past-the-end addressing was allowed is to make it
 | |
|       easy to write a loop to go over an entire array, e.g.,
 | |
|       <code>while (*d++ = *s++);</code>.
 | |
|    </para>
 | |
|    <para>So, when you think of two pointers delimiting an array, don't think
 | |
|       of them as indexing 0 through n-1.  Think of them as <emphasis>boundary
 | |
|       markers</emphasis>:
 | |
|    </para>
 | |
|    <programlisting>
 | |
| 
 | |
|    beginning            end
 | |
|      |                   |
 | |
|      |                   |               This is bad.  Always having to
 | |
|      |                   |               remember to add or subtract one.
 | |
|      |                   |               Off-by-one bugs very common here.
 | |
|      V                   V
 | |
| 	array of N elements
 | |
|      |---|---|--...--|---|---|
 | |
|      | 0 | 1 |  ...  |N-2|N-1|
 | |
|      |---|---|--...--|---|---|
 | |
| 
 | |
|      ^                       ^
 | |
|      |                       |
 | |
|      |                       |           This is good.  This is safe.  This
 | |
|      |                       |           is guaranteed to work.  Just don't
 | |
|      |                       |           dereference 'end'.
 | |
|    beginning                end
 | |
| 
 | |
|    </programlisting>
 | |
|    <para>See?  Everything between the boundary markers is part of the array.
 | |
|       Simple.
 | |
|    </para>
 | |
|    <para>Now think back to your junior-high school algebra course, when you
 | |
|       were learning how to draw graphs.  Remember that a graph terminating
 | |
|       with a solid dot meant, "Everything up through this point,"
 | |
|       and a graph terminating with an open dot meant, "Everything up
 | |
|       to, but not including, this point," respectively called closed
 | |
|       and open ranges?  Remember how closed ranges were written with
 | |
|       brackets, <emphasis>[a,b]</emphasis>, and open ranges were written with parentheses,
 | |
|       <emphasis>(a,b)</emphasis>?
 | |
|    </para>
 | |
|    <para>The boundary markers for arrays describe a <emphasis>half-open range</emphasis>,
 | |
|       starting with (and including) the first element, and ending with (but
 | |
|       not including) the one-past-the-end position:  <emphasis>[beginning,end)</emphasis>.  See, I
 | |
|       told you it would be simple in the end.
 | |
|    </para>
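|    <para>
|      One payoff of the half-open convention is that the arithmetic
|      needs no plus-or-minus-one corrections.  A small sketch, with
|      hypothetical helper names <code>length</code> and
|      <code>splits_cleanly</code>:
|    </para>

```cpp
#include <cassert>
#include <cstddef>

// Number of elements in the half-open range [beginning, end).
// No "+ 1" correction is needed, unlike with a closed range.
std::ptrdiff_t length(const int* beginning, const int* end)
{
    assert(beginning <= end);   // both must point into the same array
    return end - beginning;
}

// Splits [beginning, end) at 'middle' into [beginning, middle) and
// [middle, end); the two lengths always add up to the whole.
bool splits_cleanly(const int* beginning, const int* middle, const int* end)
{
    return length(beginning, middle) + length(middle, end)
           == length(beginning, end);
}
```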
 | |
|    <para>Iterators, and everything working with iterators, follow this same
 | |
|       time-honored tradition.  A container's <code>begin()</code> method returns
 | |
|       an iterator referring to the first element, and its <code>end()</code>
 | |
|       method returns a past-the-end iterator, which is guaranteed to be
 | |
|       unique and comparable against any other iterator pointing into the
 | |
|       middle of the container.
 | |
|    </para>
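|    <para>
|      A minimal sketch of the usual traversal, assuming a
|      <code>std::vector</code> of <code>int</code> (the function names
|      <code>sum_all</code> and <code>last_element</code> are made up for
|      this example).  Note that <code>end()</code> is only compared
|      against or stepped back from.
|    </para>

```cpp
#include <cassert>
#include <vector>

// Visits every element by walking from begin() to end().
// end() is only ever compared against, never dereferenced.
int sum_all(const std::vector<int>& v)
{
    int total = 0;
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it)
        total += *it;
    return total;
}

// Returns the last element, which sits one step before end().
int last_element(const std::vector<int>& v)
{
    assert(!v.empty());     // end() - 1 is valid only for a non-empty vector
    return *(v.end() - 1);
}
```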
 | |
|    <para>Container constructors, container methods, and algorithms all take
 | |
|       pairs of iterators describing a range of values on which to operate.
 | |
|       All of these ranges are half-open ranges, so you pass the beginning
 | |
|       iterator as the starting parameter, and the one-past-the-end iterator
 | |
|       as the finishing parameter.
 | |
|    </para>
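|    <para>
|      As a sketch, the helpers below (<code>doubled_copy</code> and
|      <code>occurrences</code> are hypothetical names) use a
|      constructor, a member function, and an algorithm, each of which
|      takes the same kind of <emphasis>[first,last)</emphasis> pair.
|    </para>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Builds a vector from a range, then appends the same range again.
std::vector<int> doubled_copy(const int* first, const int* last)
{
    std::vector<int> v(first, last);     // constructor taking a range
    v.insert(v.end(), first, last);      // member function taking a range
    assert(v.size() == 2 * static_cast<std::size_t>(last - first));
    return v;
}

// Counts how often 'value' occurs in the whole container.
long occurrences(const std::vector<int>& v, int value)
{
    // algorithm taking a range
    return static_cast<long>(std::count(v.begin(), v.end(), value));
}
```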
 | |
|    <para>This generalizes very well.  You can operate on sub-ranges quite
 | |
|       easily this way; functions accepting a <emphasis>[first,last)</emphasis> range
 | |
|       don't know or care whether they are the boundaries of an entire {array,
 | |
|       sequence, container, whatever}, or whether they only enclose a few
 | |
|       elements from the center.  This approach also makes zero-length
 | |
|       sequences very simple to recognize:  if the two endpoints compare
 | |
|       equal, then the {array, sequence, container, whatever} is empty.
 | |
|    </para>
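|    <para>
|      Both ideas fit in a short sketch (<code>is_empty_range</code> and
|      <code>sort_middle</code> are hypothetical names for this
|      example): one function tests for a zero-length range, and the
|      other sorts only a slice from the middle of a container.
|    </para>

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Works on any [first, last) range and neither knows nor cares whether
// that range is a whole container or a slice of one.
template <typename Iterator>
bool is_empty_range(Iterator first, Iterator last)
{
    return first == last;
}

// Sorts only the three elements in the middle of a five-element vector,
// leaving the first and last elements where they are.
inline std::vector<int> sort_middle(std::vector<int> v)
{
    assert(v.size() == 5);
    std::sort(v.begin() + 1, v.end() - 1);   // sub-range [begin+1, end-1)
    return v;
}
```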
 | |
|    <para>Just don't dereference <code>end()</code>.
 | |
|    </para>
 | |
| 
 | |
|   </section>
 | |
| </section>
 | |
| 
 | |
| <!-- Sect1 02 : Stream -->
 | |
| 
 | |
| </chapter>
 |