mirror of git://gcc.gnu.org/git/gcc.git
				
				
				
			
		
			
				
	
	
		
| <chapter xmlns="http://docbook.org/ns/docbook" version="5.0"
 | ||
| 	 xml:id="manual.ext.containers.pbds" xreflabel="pbds">
 | ||
|   <info>
 | ||
|     <title>Policy-Based Data Structures</title>
 | ||
|     <keywordset>
 | ||
|       <keyword>ISO C++</keyword>
 | ||
|       <keyword>policy</keyword>
 | ||
|       <keyword>container</keyword>
 | ||
|       <keyword>data</keyword>
 | ||
|       <keyword>structure</keyword>
 | ||
|       <keyword>associated</keyword>
 | ||
|       <keyword>tree</keyword>
 | ||
|       <keyword>trie</keyword>
 | ||
|       <keyword>hash</keyword>
 | ||
|       <keyword>metaprogramming</keyword>
 | ||
|     </keywordset>
 | ||
|   </info>
 | ||
|   <?dbhtml filename="policy_data_structures.html"?>
 | ||
| 
 | ||
|   <!-- 2006-04-01 Ami Tavory -->
 | ||
|   <!-- 2011-05-25 Benjamin Kosnik -->
 | ||
| 
 | ||
|   <!-- S01: intro -->
 | ||
|   <section xml:id="pbds.intro">
 | ||
|     <info><title>Intro</title></info>
 | ||
| 
 | ||
|     <para>
 | ||
|       This is a library of policy-based elementary data structures:
 | ||
|       associative containers and priority queues. It is designed for
 | ||
|       high performance, flexibility, semantic safety, and conformance to
 | ||
|       the corresponding containers in <literal>std</literal> and
 | ||
|       <literal>std::tr1</literal> (except for some points where it differs
 | ||
|       by design).
 | ||
|     </para>
 | ||
|     <para>
 | ||
|     </para>
 | ||
| 
 | ||
|     <section xml:id="pbds.intro.issues">
 | ||
|       <info><title>Performance Issues</title></info>
 | ||
|       <para>
 | ||
|       </para>
 | ||
| 
 | ||
|       <para>
 | ||
| 	An attempt is made to categorize the wide variety of possible
 | ||
| 	container designs in terms of performance-impacting factors. These
 | ||
| 	performance factors are translated into design policies and
 | ||
| 	incorporated into container design.
 | ||
|       </para>
 | ||
| 
 | ||
|       <para>
 | ||
| 	Unraveling these factors into a coherent set of policies creates
 | ||
| 	some tension. Every attempt is made to keep the set of factors
 | ||
| 	minimal. However, in many cases multiple factors make for long
 | ||
| 	template names. Every attempt is made to alias and use typedefs in
 | ||
| 	the source files, but the generated names for external symbols can
 | ||
| 	be long in binary files or debuggers.
 | ||
|       </para>
 | ||
| 
 | ||
|       <para>
 | ||
| 	In many cases, the longer names allow capabilities and behaviours
 | ||
| 	controlled by macros to also be unambiguously emitted as distinct
 | ||
| 	generated names.
 | ||
|       </para>
 | ||
| 
 | ||
|       <para>
 | ||
| 	Specific issues found while unraveling performance factors in the
 | ||
| 	design of associative containers and priority queues follow.
 | ||
|       </para>
 | ||
| 
 | ||
|       <section xml:id="pbds.intro.issues.associative">
 | ||
| 	<info><title>Associative</title></info>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Associative containers depend on their composite policies to a very
 | ||
| 	  large extent. Implicitly hard-wiring policies can hamper their
 | ||
| 	  performance and limit their functionality. An efficient hash-based
 | ||
| 	  container, for example, requires policies for testing key
 | ||
| 	  equivalence, hashing keys, translating hash values into positions
 | ||
| 	  within the hash table, and determining when and how to resize the
 | ||
| 	  table internally. A tree-based container can efficiently support
 | ||
| 	  order statistics, i.e. the ability to query what is the order of
 | ||
| 	  each key within the sequence of keys in the container, but only if
 | ||
| 	  the container is supplied with a policy to internally update
 | ||
| 	  meta-data. There are many other such examples.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Ideally, all associative containers would share the same
 | ||
| 	  interface. Unfortunately, underlying data structures and mapping
 | ||
| 	  semantics differentiate between different containers. For example,
 | ||
| 	  suppose one writes a generic function manipulating an associative
 | ||
| 	  container.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<programlisting>
 | ||
| 	  template<typename Cntnr>
 | ||
| 	  void
 | ||
| 	  some_op_sequence(Cntnr& r_cnt)
 | ||
| 	  {
 | ||
| 	  ...
 | ||
| 	  }
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Given this, what can one assume about the instantiating
 | ||
| 	  container? The answer varies according to its underlying data
 | ||
| 	  structure. If the underlying data structure of
 | ||
| 	  <literal>Cntnr</literal> is based on a tree or trie, then the order
 | ||
| 	  of elements is well defined; otherwise, it is not, in general. If
 | ||
| 	  the underlying data structure of <literal>Cntnr</literal> is based
 | ||
| 	  on a collision-chaining hash table, then modifying
 | ||
| 	  <literal>r_cnt</literal> will not invalidate its iterators' order;
 | ||
| 	  if the underlying data structure is a probing hash table, then this
 | ||
| 	  is not the case. If the underlying data structure is based on a tree
 | ||
| 	  or trie, then a reference to the container can efficiently be split;
 | ||
| 	  otherwise, it cannot, in general. If the underlying data structure
 | ||
| 	  is a red-black tree, then splitting a reference to the container is
 | ||
| 	  exception-free; if it is an ordered-vector tree, exceptions can be
 | ||
| 	  thrown.
 | ||
| 	</para>
 | ||
| 
 | ||
|       </section>
 | ||
| 
 | ||
|       <section xml:id="pbds.intro.issues.priority_queue">
 | ||
| 	<info><title>Priority Queue</title></info>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Priority queues are useful when one needs to efficiently access a
 | ||
| 	  minimum (or maximum) value as the set of values changes.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Most useful data structures for priority queues have a relatively
 | ||
| 	  simple structure, as they are geared toward relatively simple
 | ||
| 	  requirements. Unfortunately, these structures do not support access
 | ||
| 	  to an arbitrary value, which turns out to be necessary in many
 | ||
| 	  algorithms, for example when decreasing an arbitrary value in a
 | ||
| 	  graph algorithm. Therefore, some extra mechanism is necessary for
 | ||
| 	  accessing arbitrary values. There are at least two
 | ||
| 	  alternatives: embedding an associative container in a priority
 | ||
| 	  queue, or allowing cross-referencing through iterators. The first
 | ||
| 	  solution adds significant overhead; the second solution requires a
 | ||
| 	  precise definition of iterator invalidation, which is the next
 | ||
| 	  point.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Priority queues, like hash-based containers, store values in an
 | ||
| 	  order that is meaningless and undefined externally. For example, a
 | ||
| 	  <code>push</code> operation can internally reorganize the
 | ||
| 	  values. Because of this characteristic, describing a priority
 | ||
| 	  queue's iterator is difficult: on one hand, the values to which
 | ||
| 	  iterators point can remain valid, but on the other, the logical
 | ||
| 	  order of iterators can change unpredictably.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Roughly speaking, any element that is both inserted into a priority
 | ||
| 	  queue (e.g., through <code>push</code>) and removed
 | ||
| 	  from it (e.g., through <code>pop</code>) incurs a
 | ||
| 	  logarithmic overhead (in the amortized sense). Different underlying
 | ||
| 	  data structures place the actual cost differently: some are
 | ||
| 	  optimized for amortized complexity, whereas others guarantee that
 | ||
| 	  specific operations only have a constant cost. One underlying data
 | ||
| 	  structure might be chosen if modifying a value is frequent
 | ||
| 	  (Dijkstra's shortest-path algorithm), whereas a different one might
 | ||
| 	  be chosen otherwise. Unfortunately, an array-based binary heap, an
 | ||
| 	  underlying data structure that optimizes (in the amortized sense)
 | ||
| 	  <code>push</code> and <code>pop</code> operations, differs from the
 | ||
| 	  others in terms of its invalidation guarantees. Other design
 | ||
| 	  decisions also impact the cost and placement of the overhead, at the
 | ||
| 	  expense of more difference in the kinds of operations that the
 | ||
| 	  underlying data structure can support. These differences pose a
 | ||
| 	  challenge when creating a uniform interface for priority queues.
 | ||
| 	</para>
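	<para>
	  As a rough illustration of the second alternative
	  (cross-referencing through iterators), the sketch below assumes
	  the <classname>__gnu_pbds::priority_queue</classname> interface
	  shipped with libstdc++ in
	  <filename>ext/pb_ds/priority_queue.hpp</filename>. The
	  pairing-heap tag, the <classname>std::greater</classname>
	  comparison, the function name, and the sample values are choices
	  made for this example only.
	</para>

```cpp
#include <ext/pb_ds/priority_queue.hpp>
#include <functional>

// Min-queue: with std::greater, top() yields the smallest value.
typedef __gnu_pbds::priority_queue<int, std::greater<int>,
                                   __gnu_pbds::pairing_heap_tag> min_queue;

// Pushes three values, keeps the point iterator returned for one of
// them, decreases that value in place, and reports the new minimum.
int decrease_and_peek()
{
  min_queue q;
  q.push(40);
  min_queue::point_iterator it = q.push(30); // handle to this one value
  q.push(20);

  q.modify(it, 5); // decrease an arbitrary value through its handle
  return q.top();
}
```

	<para>
	  The point iterator here serves only as a handle to one value; the
	  sketch never increments it, which sidesteps the question of what
	  order such iterators would follow.
	</para>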
 | ||
|       </section>
 | ||
|     </section>
 | ||
| 
 | ||
|     <section xml:id="pbds.intro.motivation">
 | ||
|       <info><title>Goals</title></info>
 | ||
| 
 | ||
|       <para>
 | ||
| 	Many fine associative-container libraries have already been written,
 | ||
| 	most notably, the C++ standard's associative containers. Why
 | ||
| 	then write another library? This section shows some possible
 | ||
| 	advantages of this library, when considering the challenges in
 | ||
| 	the introduction. Many of these points stem from the fact that
 | ||
| 	the ISO C++ process introduced associative-containers in a
 | ||
| 	two-step process (first standardizing tree-based containers,
 | ||
| 	only then adding hash-based containers, which are fundamentally
 | ||
| 	different), did not standardize priority queues as containers,
 | ||
| 	and (in our opinion) overloaded the iterator concept.
 | ||
|       </para>
 | ||
| 
 | ||
|       <section xml:id="pbds.intro.motivation.associative">
 | ||
| 	<info><title>Associative</title></info>
 | ||
| 	<para>
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<section xml:id="motivation.associative.policy">
 | ||
| 	  <info><title>Policy Choices</title></info>
 | ||
| 	  <para>
 | ||
| 	    Associative containers require a relatively large number of
 | ||
| 	    policies to function efficiently in various settings. In some
 | ||
| 	    cases this is needed for making their common operations more
 | ||
| 	    efficient, and in other cases this allows them to support a
 | ||
| 	    larger set of operations.
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <orderedlist>
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		Hash-based containers, for example, support look-up and
 | ||
| 		insertion methods (<function>find</function> and
 | ||
| 		<function>insert</function>). In order to locate elements
 | ||
| 		quickly, they are supplied a hash functor, which specifies
 | ||
| 		how to transform a key object into some size type; a hash
 | ||
| 		functor might transform <constant>"hello"</constant>
 | ||
| 		into <constant>1123002298</constant>. A hash table, though,
 | ||
| 		requires transforming each key object into a size-type
 | ||
| 		value in some specific domain; a hash table with 128
 | ||
| 		entries might transform <constant>"hello"</constant> into
 | ||
| 		position <constant>63</constant>. The policy by which the
 | ||
| 		hash value is transformed into a position within the table
 | ||
| 		can dramatically affect performance.  Hash-based containers
 | ||
| 		also do not resize naturally (as opposed to tree-based
 | ||
| 		containers, for example). The appropriate resize policy is
 | ||
| 		unfortunately intertwined with the policy that transforms
 | ||
| 		a hash value into a position within the table.
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		Tree-based containers, for example, also support look-up and
 | ||
| 		insertion methods, and are primarily useful when maintaining
 | ||
| 		order between elements is important. In some cases, though,
 | ||
| 		one can utilize their balancing algorithms for completely
 | ||
| 		different purposes.
 | ||
| 	      </para>
 | ||
| 
 | ||
| 	      <para>
 | ||
| 		Figure A shows a tree in which each node contains two entries:
 | ||
| 		a floating-point key, and some size-type
 | ||
| 		<emphasis>metadata</emphasis> (in bold beneath it) that is
 | ||
| 		the number of nodes in the sub-tree. (The root has key 0.99,
 | ||
| 		and has 5 nodes (including itself) in its sub-tree.) A
 | ||
| 		container based on this data structure can obviously answer
 | ||
| 		efficiently whether 0.3 is in the container object, but it
 | ||
| 		can also answer what is the order of 0.3 among all those in
 | ||
| 		the container object: see <xref linkend="biblio.clrs2001"/>.
 | ||
| 
 | ||
| 	      </para>
 | ||
| 
 | ||
| 	      <para>
 | ||
| 		As another example, Figure B shows a tree in which each node
 | ||
| 		contains two entries: a half-open geometric line interval,
 | ||
| 		and numeric <emphasis>metadata</emphasis> (in bold beneath
 | ||
| 		it) that is the largest endpoint of all intervals in its
 | ||
| 		sub-tree.  (The root describes the interval <constant>[20,
 | ||
| 		36)</constant>, and the largest endpoint in its sub-tree is
 | ||
| 		99.) A container based on this data structure can obviously
 | ||
| 		answer efficiently whether <constant>[3, 41)</constant> is
 | ||
| 		in the container object, but it can also answer efficiently
 | ||
| 		whether the container object has intervals that intersect
 | ||
| 		<constant>[3, 41)</constant>. These types of queries are
 | ||
| 		very useful in geometric algorithms and lease-management
 | ||
| 		algorithms.
 | ||
| 	      </para>
 | ||
| 
 | ||
| 	      <para>
 | ||
| 		It is important to note, however, that as the trees are
 | ||
| 		modified, their internal structure changes. To maintain
 | ||
| 		these invariants, one must supply some policy that is aware
 | ||
| 		of these changes.  Without this, it would be better to use a
 | ||
| 		linked list (in itself very efficient for these purposes).
 | ||
| 	      </para>
 | ||
| 
 | ||
| 	    </listitem>
 | ||
| 	  </orderedlist>
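	  <para>
	    The two-step mapping described for hash-based containers (key
	    to hash value, then hash value to table position) can be
	    sketched in plain C++. The names
	    <classname>mask_range_hashing</classname>,
	    <classname>mod_range_hashing</classname>, and
	    <function>position_of</function> are hypothetical and exist only
	    in this example; they are not the library's own policy classes.
	  </para>

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical range-hashing policy: keep the low bits of the hash value.
// Only meaningful when table_size is a power of two.
struct mask_range_hashing
{
  std::size_t operator()(std::size_t hash, std::size_t table_size) const
  { return hash & (table_size - 1); }
};

// Hypothetical range-hashing policy: reduce the hash value modulo the
// table size. Works for any non-zero table size.
struct mod_range_hashing
{
  std::size_t operator()(std::size_t hash, std::size_t table_size) const
  { return hash % table_size; }
};

// Step 1: key -> size-type hash value (the hash functor).
// Step 2: hash value -> position in the table (the range-hashing policy).
template<typename Range_Hashing>
std::size_t position_of(const std::string& key, std::size_t table_size)
{
  const std::size_t hash = std::hash<std::string>()(key);
  return Range_Hashing()(hash, table_size);
}
```

	  <para>
	    Masking only makes sense if the table size stays a power of
	    two, whereas the modulo variant accepts any non-zero size. This
	    is one way to see why the range-hashing policy and the resize
	    policy cannot be chosen independently.
	  </para>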
 | ||
| 
 | ||
| 	  <figure>
 | ||
| 	    <title>Node Invariants</title>
 | ||
| 	    <mediaobject>
 | ||
| 	      <imageobject>
 | ||
| 		<imagedata align="center" format="PNG" scale="100"
 | ||
| 			   fileref="../images/pbds_node_invariants.png"/>
 | ||
| 	      </imageobject>
 | ||
| 	      <textobject>
 | ||
| 		<phrase>Node Invariants</phrase>
 | ||
| 	      </textobject>
 | ||
| 	    </mediaobject>
 | ||
| 	  </figure>
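	  <para>
	    The sub-tree-size metadata described for Figure A is the kind of
	    invariant a node-update policy maintains. The sketch below
	    assumes the <classname>__gnu_pbds::tree</classname> container and
	    the <classname>tree_order_statistics_node_update</classname>
	    policy from libstdc++; the typedef, the helper functions, and
	    all sample keys other than 0.99 and 0.3 are made up for
	    illustration.
	  </para>

```cpp
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
#include <cstddef>
#include <functional>

// A set of doubles whose nodes also carry sub-tree sizes, kept up to
// date by the order-statistics node-update policy.
typedef __gnu_pbds::tree<double, __gnu_pbds::null_type, std::less<double>,
                         __gnu_pbds::rb_tree_tag,
                         __gnu_pbds::tree_order_statistics_node_update>
  order_stat_set;

// Number of stored keys strictly smaller than key.
std::size_t rank_of(const order_stat_set& s, double key)
{ return s.order_of_key(key); }

// Builds a five-element sample set (keys chosen for this example).
order_stat_set make_sample()
{
  order_stat_set s;
  const double keys[] = { 0.99, 0.3, 0.1, 0.5, 1.5 };
  for (std::size_t i = 0; i != sizeof(keys) / sizeof(keys[0]); ++i)
    s.insert(keys[i]);
  return s;
}
```

	  <para>
	    With this sample, <code>rank_of(make_sample(), 0.3)</code>
	    returns 1, because only 0.1 is smaller. Without the node-update
	    policy the same tree would still answer membership queries but
	    would have no efficient way to answer rank queries.
	  </para>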
 | ||
| 
 | ||
| 	</section>
 | ||
| 
 | ||
| 	<section xml:id="motivation.associative.underlying">
 | ||
| 	  <info><title>Underlying Data Structures</title></info>
 | ||
| 	  <para>
 | ||
| 	    The standard C++ library contains associative containers based on
 | ||
| 	    red-black trees and collision-chaining hash tables. These are
 | ||
| 	    very useful, but they are not ideal for all types of
 | ||
| 	    settings.
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    The figure below shows the different underlying data structures
 | ||
| 	    currently supported in this library.
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <figure>
 | ||
| 	    <title>Underlying Associative Data Structures</title>
 | ||
| 	    <mediaobject>
 | ||
| 	      <imageobject>
 | ||
| 		<imagedata align="center" format="PNG" scale="100"
 | ||
| 			   fileref="../images/pbds_different_underlying_dss_1.png"/>
 | ||
| 	      </imageobject>
 | ||
| 	      <textobject>
 | ||
| 		<phrase>Underlying Associative Data Structures</phrase>
 | ||
| 	      </textobject>
 | ||
| 	    </mediaobject>
 | ||
| 	  </figure>
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    A shows a collision-chaining hash-table, B shows a probing
 | ||
| 	    hash-table, C shows a red-black tree, D shows a splay tree, E shows
 | ||
| 	    a tree based on an ordered vector (implicit in the order of the
 | ||
| 	    elements), F shows a PATRICIA trie, and G shows a list-based
 | ||
| 	    container with update policies.
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    Each of these data structures has some performance benefits, in
 | ||
| 	    terms of speed, size or both. For now, note that vector-based trees
 | ||
| 	    and probing hash tables manipulate memory more efficiently than
 | ||
| 	    red-black trees and collision-chaining hash tables, and that
 | ||
| 	    list-based associative containers are very useful for constructing
 | ||
| 	    "multimaps".
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    Now consider a function manipulating a generic associative
 | ||
| 	    container,
 | ||
| 	  </para>
 | ||
| 	  <programlisting>
 | ||
| 	    template<class Cntnr>
 | ||
| 	    int
 | ||
| 	    some_op_sequence(Cntnr &r_cnt)
 | ||
| 	    {
 | ||
| 	    ...
 | ||
| 	    }
 | ||
| 	  </programlisting>
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    Ideally, the underlying data structure
 | ||
| 	    of <classname>Cntnr</classname> would not affect what can be
 | ||
| 	    done with <varname>r_cnt</varname>.  Unfortunately, this is not
 | ||
| 	    the case.
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    For example, if <classname>Cntnr</classname>
 | ||
| 	    is <classname>std::map</classname>, then the function can
 | ||
| 	    use
 | ||
| 	  </para>
 | ||
| 	  <programlisting>
 | ||
| 	    std::for_each(r_cnt.find(foo), r_cnt.find(bar), foobar);
 | ||
| 	  </programlisting>
 | ||
| 	  <para>
 | ||
| 	    in order to apply <classname>foobar</classname> to all
 | ||
| 	    elements between <classname>foo</classname> and
 | ||
| 	    <classname>bar</classname>. If
 | ||
| 	    <classname>Cntnr</classname> is a hash-based container,
 | ||
| 	    then this call's results are undefined.
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    Also, if <classname>Cntnr</classname> is tree-based, the type
 | ||
| 	    and object of the comparison functor can be
 | ||
| 	    accessed. If <classname>Cntnr</classname> is hash-based, these
 | ||
| 	    queries are nonsensical.
 | ||
| 	  </para>
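	  <para>
	    A small sketch of this difference, using the standard
	    containers rather than this library: it assumes a C++11
	    compiler for <classname>std::unordered_map</classname>, and the
	    two helper function names are invented for the example.
	  </para>

```cpp
#include <map>
#include <string>
#include <unordered_map>

// A tree-based container exposes its ordering policy, so generic code
// can ask whether one key sorts before another.
bool ordered_before(const std::map<std::string, int>& m,
                    const std::string& a, const std::string& b)
{ return m.key_comp()(a, b); }

// A hash-based container has no ordering policy to query. What it can
// expose instead is its hash functor and its key-equivalence functor.
bool same_key(const std::unordered_map<std::string, int>& m,
              const std::string& a, const std::string& b)
{
  return m.key_eq()(a, b)
    && m.hash_function()(a) == m.hash_function()(b);
}
```

	  <para>
	    Generic code that calls <function>key_comp</function> therefore
	    does not compile for a hash-based container, and code that calls
	    <function>hash_function</function> does not compile for a
	    tree-based one.
	  </para>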
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    There are various other differences based on the container's
 | ||
| 	    underlying data structure. For one, they can be constructed by,
 | ||
| 	    and queried for, different policies. Furthermore:
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <orderedlist>
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		Containers based on C, D, E and F store elements in a
 | ||
| 		meaningful order; the others store elements in a meaningless
 | ||
| 		(and probably time-varying) order. By implication, only
 | ||
| 		containers based on C, D, E and F can
 | ||
| 		support <function>erase</function> operations taking an
 | ||
| 		iterator and returning an iterator to the following element
 | ||
| 		without performance loss.
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		Containers based on C, D, E, and F can be split and joined
 | ||
| 		efficiently, while the others cannot. Containers based on C
 | ||
| 		and D, furthermore, can guarantee that this is exception-free;
 | ||
| 		containers based on E cannot guarantee this.
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		Containers based on all but E can guarantee that
 | ||
| 		erasing an element is exception free; containers based on E
 | ||
| 		cannot guarantee this. Containers based on all but B and E
 | ||
| 		can guarantee that modifying an object of their type does
 | ||
| 		not invalidate iterators or references to their elements,
 | ||
| 		while containers based on B and E cannot. Containers based
 | ||
| 		on C, D, and E can furthermore make a stronger guarantee,
 | ||
| 		namely that modifying an object of their type does not
 | ||
| 		affect the order of iterators.
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 	  </orderedlist>
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    A unified tag and traits system (as used for the C++ standard
 | ||
| 	    library iterators, for example) can ease generic manipulation of
 | ||
| 	    associative containers based on different underlying data
 | ||
| 	    structures.
 | ||
| 	  </para>
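	  <para>
	    The following sketch shows how such a tag and traits system
	    might be used. It assumes the
	    <classname>__gnu_pbds::container_traits</classname> template and
	    the category tags declared in
	    <filename>ext/pb_ds/tag_and_trait.hpp</filename>; the typedefs
	    and the functions <function>describe</function>,
	    <function>category_of</function>, and
	    <function>keeps_keys_ordered</function> are hypothetical helpers
	    written for this example.
	  </para>

```cpp
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tag_and_trait.hpp>
#include <string>

typedef __gnu_pbds::tree<int, int> tree_map;          // red-black tree by default
typedef __gnu_pbds::cc_hash_table<int, int> hash_map; // collision chaining

// Overloads selected at compile time by the container's category tag.
inline std::string describe(__gnu_pbds::tree_tag) { return "tree"; }
inline std::string describe(__gnu_pbds::basic_hash_tag) { return "hash"; }

// Tag dispatch: pick the overload matching Cntnr's category.
template<typename Cntnr>
std::string category_of()
{
  typedef typename __gnu_pbds::container_traits<Cntnr>::container_category tag;
  return describe(tag());
}

// Trait query: are the elements stored in a meaningful order?
template<typename Cntnr>
bool keeps_keys_ordered()
{ return __gnu_pbds::container_traits<Cntnr>::order_preserving; }
```

	  <para>
	    A generic function can branch on these queries instead of
	    hard-wiring assumptions about one particular underlying data
	    structure.
	  </para>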
 | ||
| 
 | ||
| 	</section>
 | ||
| 
 | ||
| 	<section xml:id="motivation.associative.iterators">
 | ||
| 	  <info><title>Iterators</title></info>
 | ||
| 	  <para>
 | ||
| 	    Iterators are central to the design of the standard library
 | ||
| 	    containers, because of the container/algorithm/iterator
 | ||
| 	    decomposition that allows an algorithm to operate on a range
 | ||
| 	    through iterators of some sequence.  Iterators, then, are useful
 | ||
| 	    because they allow going over a
 | ||
| 	    specific <emphasis>sequence</emphasis>.  The standard library
 | ||
| 	    also uses iterators for accessing a
 | ||
| 	    specific <emphasis>element</emphasis>: when an associative
 | ||
| 	    container returns one through <function>find</function>. The
 | ||
| 	    standard library consistently uses the same types of iterators
 | ||
| 	    for both purposes: going over a range, and accessing a specific
 | ||
| 	    found element. Before the introduction of hash-based containers
 | ||
| 	    to the standard library, this made sense (with the exception of
 | ||
| 	    priority queues, which are discussed later).
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    When the standard associative containers are used together with
 | ||
| 	    non-order-preserving associative containers (and also with
 | ||
| 	    priority-queue containers), there is a possible need for
 | ||
| 	    different types of iterators for self-organizing containers:
 | ||
| 	    the iterator concept seems overloaded to mean two different
 | ||
| 	    things (in some cases). <remark> XXX
 | ||
| 	    "ds_gen.html#find_range">Design::Associative
 | ||
| 	    Containers::Data-Structure Genericity::Point-Type and Range-Type
 | ||
| 	    Methods</remark>.
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <section xml:id="associative.iterators.using">
 | ||
| 	    <info>
 | ||
| 	      <title>Using Point Iterators for Range Operations</title>
 | ||
| 	    </info>
 | ||
| 	    <para>
 | ||
| 	      Suppose <classname>cntnr</classname> is some associative
 | ||
| 	      container, and say <varname>c</varname> is an object of
 | ||
| 	      type <classname>cntnr</classname>. Then what will be the outcome
 | ||
| 	      of
 | ||
| 	    </para>
 | ||
| 
 | ||
| 	    <programlisting>
 | ||
| 	      std::for_each(c.find(1), c.find(5), foo);
 | ||
| 	    </programlisting>
 | ||
| 
 | ||
| 	    <para>
 | ||
| 	      If <varname>c</varname> is a tree-based container
 | ||
| 	      object, then an in-order walk will
 | ||
| 	      apply <classname>foo</classname> to the relevant elements,
 | ||
| 	      as in the graphic below, label A. If <varname>c</varname> is
 | ||
| 	      a hash-based container, then the order of elements between any
 | ||
| 	      two elements is undefined (and probably time-varying); there is
 | ||
| 	      no guarantee that the elements traversed will coincide with the
 | ||
| 	      <emphasis>logical</emphasis> elements between 1 and 5, as in
 | ||
| 	      label B.
 | ||
| 	    </para>
 | ||
| 
 | ||
| 	    <figure>
 | ||
| 	      <title>Range Iteration in Different Data Structures</title>
 | ||
| 	      <mediaobject>
 | ||
| 		<imageobject>
 | ||
| 		  <imagedata align="center" format="PNG" scale="100"
 | ||
| 			     fileref="../images/pbds_point_iterators_range_ops_1.png"/>
 | ||
| 		</imageobject>
 | ||
| 		<textobject>
 | ||
| 		  <phrase>Range Iteration in Different Data Structures</phrase>
 | ||
| 		</textobject>
 | ||
| 	      </mediaobject>
 | ||
| 	    </figure>
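	    <para>
	      For the tree-based case, the behaviour can be checked with a
	      standard container. The helper
	      <function>keys_between</function> below is a hypothetical
	      name, and the sketch assumes a C++11 compiler and that both
	      keys are present in the container.
	    </para>

```cpp
#include <algorithm>
#include <map>
#include <vector>

// Collects the keys visited when walking from find(first) up to, but
// not including, find(last).
//
// This is only well defined when find(last) is reachable from
// find(first) by incrementing. That holds for an order-preserving
// container when first <= last and both keys are present. For a
// hash-based container nothing guarantees it, and the walk may even
// run past end().
template<typename Cntnr>
std::vector<int> keys_between(const Cntnr& c, int first, int last)
{
  std::vector<int> visited;
  std::for_each(c.find(first), c.find(last),
                [&visited](const typename Cntnr::value_type& v)
                { visited.push_back(v.first); });
  return visited;
}
```

	    <para>
	      For a <classname>std::map</classname> holding the keys 1
	      through 6, <code>keys_between(c, 1, 5)</code> visits 1, 2, 3,
	      and 4, exactly the logical elements between the two keys.
	    </para>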
 | ||
| 
 | ||
| 	    <para>
 | ||
| 	      In our opinion, this problem is not caused just because
 | ||
| 	      red-black trees are order preserving while
 | ||
| 	      collision-chaining hash tables are (generally) not; it
 | ||
| 	      is more fundamental. Most of the standard's containers
 | ||
| 	      order sequences in a well-defined manner that is
 | ||
| 	      determined by their <emphasis>interface</emphasis>:
 | ||
| 	      calling <function>insert</function> on a tree-based
 | ||
| 	      container modifies its sequence in a predictable way, as
 | ||
| 	      does calling <function>push_back</function> on a list or
 | ||
| 	      a vector. Conversely, collision-chaining hash tables,
 | ||
| 	      probing hash tables, priority queues, and list-based
 | ||
| 	      containers (which are very useful for "multimaps") are
 | ||
| 	      self-organizing data structures; each
 | ||
| 	      operation modifies their sequences in a manner that is
 | ||
| 	      (practically) determined by their
 | ||
| 	      <emphasis>implementation</emphasis>.
 | ||
| 	    </para>
 | ||
| 
 | ||
| 	    <para>
 | ||
| 	      Consequently, applying an algorithm to a sequence obtained from most
 | ||
| 	      containers may or may not make sense, but applying it to a
 | ||
| 	      sub-sequence of a self-organizing container does not.
 | ||
| 	    </para>
 | ||
| 	  </section>
 | ||
| 
 | ||
| 	  <section xml:id="associative.iterators.cost">
 | ||
| 	    <info>
 | ||
| 	      <title>Cost to Point Iterators to Enable Range Operations</title>
 | ||
| 	    </info>
 | ||
| 	    <para>
 | ||
| 	      Suppose <varname>c</varname> is some collision-chaining
 | ||
| 	      hash-based container object, and one calls
 | ||
| 	    </para>
 | ||
| 	    <programlisting>c.find(3)</programlisting>
 | ||
| 	    <para>
 | ||
| 	      Then what composes the returned iterator?
 | ||
| 	    </para>
 | ||
| 
 | ||
| 	    <para>
 | ||
| 	      In the graphic below, label A shows the simplest (and
 | ||
| 	      most efficient) implementation of a collision-chaining
 | ||
| 	      hash table.  The little box marked
 | ||
| 	      <classname>point_iterator</classname> shows an object
 | ||
| 	      that contains a pointer to the element's node. Note that
 | ||
| 	      this "iterator" has no way to move to the next element
 | ||
| 	      (it cannot support
 | ||
| 	      <function>operator++</function>). Conversely, the little
 | ||
| 	      box marked <classname>iterator</classname> stores both a
 | ||
| 	      pointer to the element and some other
 | ||
| 	      information (the bucket number of the element). The
 | ||
| 	      second iterator, then, is "heavier" than the first one:
 | ||
| 	      it requires more time and space. If we were to use a
 | ||
| 	      different container to cross-reference into this
 | ||
| 	      hash-table using these iterators, it would take much
 | ||
| 	      more space. As noted above, nothing much can be done by
 | ||
| 	      incrementing these iterators, so why is this extra
 | ||
| 	      information needed?
 | ||
| 	    </para>
 | ||
| 
 | ||
| 	    <para>
 | ||
| 	      Alternatively, one might create a collision-chaining hash-table
 | ||
| 	      where the lists might be linked, forming a monolithic total-element
 | ||
| 	      list, as in the graphic below, label B.  Here the iterators are as
 | ||
| 	      light as can be, but the hash-table's operations are more
 | ||
| 	      complicated.
 | ||
| 	    </para>
 | ||
| 
 | ||
| 	    <figure>
 | ||
| 	      <title>Point Iteration in Hash Data Structures</title>
 | ||
| 	      <mediaobject>
 | ||
| 		<imageobject>
 | ||
| 		  <imagedata align="center" format="PNG" scale="100"
 | ||
| 			     fileref="../images/pbds_point_iterators_range_ops_2.png"/>
 | ||
| 		</imageobject>
 | ||
| 		<textobject>
 | ||
| 		  <phrase>Point Iteration in Hash Data Structures</phrase>
 | ||
| 		</textobject>
 | ||
| 	      </mediaobject>
 | ||
| 	    </figure>
 | ||
| 
 | ||
| 	    <para>
 | ||
| 	      It should be noted that containers based on collision-chaining
 | ||
| 	      hash-tables are not the only ones with this type of behavior;
 | ||
| 	      many other self-organizing data structures display it as well.
 | ||
| 	    </para>
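	    <para>
	      The difference in weight between the two kinds of iterator
	      can be made concrete with a toy layout. Every type below is
	      hypothetical and much simpler than a real hash table; in
	      particular, storing the bucket array and bucket count inside
	      the range iterator is an assumption of this sketch, not a
	      description of the library's implementation.
	    </para>

```cpp
#include <cstddef>

// Hypothetical node of a collision-chaining hash table.
struct node
{
  int value;
  node* next_in_bucket;
};

// Hypothetical "point" iterator: enough to reach one element, no more.
struct point_iterator
{
  node* p_node;
  int& operator*() const { return p_node->value; }
};

// Hypothetical "range" iterator: it must also remember where it is in
// the table, so that operator++ can move on to the next non-empty
// bucket once the current chain ends.
struct range_iterator
{
  node* p_node;
  node** buckets;
  std::size_t bucket_index;
  std::size_t bucket_count;

  int& operator*() const { return p_node->value; }

  range_iterator& operator++()
  {
    p_node = p_node->next_in_bucket;
    // Chain exhausted: scan forward for the next non-empty bucket.
    while (p_node == 0 && ++bucket_index < bucket_count)
      p_node = buckets[bucket_index];
    return *this; // p_node == 0 here means "past the end"
  }
};

static_assert(sizeof(range_iterator) > sizeof(point_iterator),
              "the range iterator carries extra state");
```

	    <para>
	      A container that stores many such iterators for
	      cross-referencing pays for the extra fields in every stored
	      copy, even if it never increments any of them.
	    </para>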
 | ||
| 	  </section>
 | ||
| 
 | ||
| 	  <section xml:id="associative.iterators.invalidation">
 | ||
| 	    <info><title>Invalidation Guarantees</title></info>
 | ||
| 	    <para>Consider the following snippet:</para>
 | ||
| 	    <programlisting>
 | ||
| 	      it = c.find(3);
 | ||
| 	      c.erase(5);
 | ||
| 	    </programlisting>
 | ||
| 
 | ||
| 	    <para>
 | ||
| 	      Following the call to <classname>erase</classname>, what is the
 | ||
| 	      validity of <classname>it</classname>: can it be de-referenced?
 | ||
| 	      Can it be incremented?
 | ||
| 	    </para>
 | ||
| 
 | ||
| 	    <para>
 | ||
| 	      The answer depends on the underlying data structure of the
 | ||
| 	      container. The graphic below shows three cases: A1 and A2 show
 | ||
| 	      a red-black tree; B1 and B2 show a probing hash-table; C1 and C2
 | ||
| 	      show a collision-chaining hash table.
 | ||
| 	    </para>
 | ||
| 
 | ||
| 	    <figure>
 | ||
| 	      <title>Effect of erase in different underlying data structures</title>
 | ||
| 	      <mediaobject>
 | ||
| 		<imageobject>
 | ||
| 		  <imagedata align="center" format="PNG" scale="100"
 | ||
| 			     fileref="../images/pbds_invalidation_guarantee_erase.png"/>
 | ||
| 		</imageobject>
 | ||
| 		<textobject>
 | ||
| 		  <phrase>Effect of erase in different underlying data structures</phrase>
 | ||
| 		</textobject>
 | ||
| 	      </mediaobject>
 | ||
| 	    </figure>
 | ||
| 
 | ||
| 	    <orderedlist>
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  Erasing 5 from A1 yields A2. Clearly, an iterator to 3 can
 | ||
| 		  be de-referenced and incremented. The sequence of iterators
 | ||
| 		  changed, but in a way that is well-defined by the interface.
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  Erasing 5 from B1 yields B2. Clearly, an iterator to 3 is
 | ||
| 		  not valid at all; it cannot be de-referenced or
 | ||
| 		  incremented; the order of iterators changed in a way that is
 | ||
| 		  (practically) determined by the implementation and not by
 | ||
| 		  the interface.
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  Erasing 5 from C1 yields C2. Here the situation is more
 | ||
| 		  complicated. On the one hand, there is no problem in
 | ||
| 		  de-referencing <classname>it</classname>. On the other hand,
 | ||
| 		  the order of iterators changed in a way that is
 | ||
| 		  (practically) determined by the implementation and not by
 | ||
| 		  the interface.
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 	    </orderedlist>
 | ||
| 
 | ||
| 	    <para>
 | ||
| 	      So in the standard library containers, it is not always possible
 | ||
| 	      to express whether <varname>it</varname> is valid or not. This
 | ||
| 	      is true also for <function>insert</function>. Again, the
 | ||
| 	      iterator concept seems overloaded.
 | ||
| 	    </para>
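	    <para>
	      The first case (a red-black tree) can be reproduced with
	      <classname>std::map</classname>, whose
	      <function>erase</function> invalidates only iterators to the
	      erased element. The function name and the key range 1 to 6
	      are assumptions made for this sketch.
	    </para>

```cpp
#include <map>

// Returns the key reached by incrementing an iterator to 3 twice after
// 5 has been erased from a tree-based container holding 1..6.
int key_two_steps_after_three()
{
  std::map<int, char> c;
  for (int k = 1; k <= 6; ++k)
    c[k] = 'x';

  std::map<int, char>::iterator it = c.find(3);
  c.erase(5);

  // The iterator to 3 is still valid: it can be de-referenced and
  // incremented, and the sequence it walks is defined by the interface.
  ++it; // now at 4
  ++it; // 5 is gone, so this lands on 6
  return it->first;
}
```

	    <para>
	      The same sequence of calls on a probing hash table gives no
	      such assurance, which is why the guarantee needs to be
	      expressed per container rather than assumed.
	    </para>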
 | ||
| 	  </section>
 | ||
| 	</section> <!--iterators-->
 | ||
| 
 | ||
| 
 | ||
| 	<section xml:id="motivation.associative.functions">
 | ||
| 	  <info><title>Functional</title></info>
 | ||
| 	  <para>
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    The design of the functional overlay to the underlying data
 | ||
| 	    structures differs slightly from some of the conventions used in
 | ||
| 	    the C++ standard.  A strict public interface of methods should
 | ||
| 	    comprise only operations which depend on the class's internal
 | ||
| 	    structure; other operations are best designed as external
 | ||
| 	    functions (see <xref linkend="biblio.meyers02both"/>). With this
 | ||
| 	    rubric, the standard associative containers lack some useful
 | ||
| 	    methods, and provide other methods which would be better
 | ||
| 	    removed.
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <section xml:id="motivation.associative.functions.erase">
 | ||
| 	    <info><title><function>erase</function></title></info>
 | ||
| 
 | ||
| 	    <orderedlist>
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  Order-preserving standard associative containers provide the
 | ||
| 		  method
 | ||
| 		</para>
 | ||
| 		<programlisting>
 | ||
| 		  iterator
 | ||
| 		  erase(iterator it)
 | ||
| 		</programlisting>
 | ||
| 
 | ||
| 		<para>
 | ||
| 		  which takes an iterator, erases the corresponding
 | ||
| 		  element, and returns an iterator to the following
 | ||
| 		  element. The standard hash-based associative
 | ||
| 		  containers provide this method as well. This seemingly
 | ||
| 		  increases genericity between associative containers,
 | ||
| 		  since it is possible to use
 | ||
| 		</para>
 | ||
| 		<programlisting>
 | ||
| 		  typename C::iterator it = c.begin();
 | ||
| 		  typename C::iterator e_it = c.end();
 | ||
| 
 | ||
| 		  while(it != e_it)
 | ||
| 		  it = pred(*it)? c.erase(it) : ++it;
 | ||
| 		</programlisting>
 | ||
| 
 | ||
| 		<para>
 | ||
| 		  in order to erase from a container object
 | ||
| 		  <varname>c</varname> all elements which match a
 | ||
| 		  predicate <classname>pred</classname>. However, in a
 | ||
| 		  different sense this actually decreases genericity: an
 | ||
| 		  integral implication of this method is that tree-based
 | ||
| 		  associative containers' memory use is linear in the total
 | ||
| 		  number of elements they store, while hash-based
 | ||
| 		  containers' memory use is unbounded in the total number of
 | ||
| 		  elements they store. Assume a hash-based container is
 | ||
| 		  allowed to decrease its size when an element is
 | ||
| 		  erased. Then the elements might be rehashed, which means
 | ||
| 		  that there is no "next" element - it is simply
 | ||
| 		  undefined. Consequently, it is possible to infer from the
 | ||
| 		  fact that the standard library's hash-based containers
 | ||
| 		  provide this method that they cannot downsize when
 | ||
| 		  elements are erased. As a consequence, different code is
 | ||
| 		  needed to manipulate different containers, assuming that
 | ||
| 		  memory should be conserved. Therefore, this library's
 | ||
| 		  non-order preserving associative containers omit this
 | ||
| 		  method.
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  All associative containers include a conditional-erase method
 | ||
| 		</para>
 | ||
| 		<programlisting>
 | ||
| 		  template<
 | ||
| 		  class Pred>
 | ||
| 		  size_type
 | ||
| 		  erase_if
 | ||
| 		  (Pred pred)
 | ||
| 		</programlisting>
 | ||
| 		<para>
 | ||
| 		  which erases all elements matching a predicate. This is probably the
 | ||
| 		  only way to ensure linear-time multiple-item erase which can
 | ||
| 		  actually downsize a container.
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  The standard associative containers provide methods for
 | ||
| 		  multiple-item erase of the form
 | ||
| 		</para>
 | ||
| 		<programlisting>
 | ||
| 		  size_type
 | ||
| 		  erase(It b, It e)
 | ||
| 		</programlisting>
 | ||
| 		<para>
 | ||
| 		  erasing a range of elements given by a pair of
 | ||
| 		  iterators. For tree-based or trie-based containers, this can
 | ||
| 		  be implemented more efficiently as a (small) sequence of split
 | ||
| 		  and join operations. For other, unordered, containers, this
 | ||
| 		  method isn't much better than an external loop. Moreover,
 | ||
| 		  if <varname>c</varname> is a hash-based container,
 | ||
| 		  then
 | ||
| 		</para>
 | ||
| 		<programlisting>
 | ||
| 		  c.erase(c.find(2), c.find(5))
 | ||
| 		</programlisting>
 | ||
| 		<para>
 | ||
| 		  is almost certain to do something
 | ||
| 		  different than erasing all elements whose keys are between 2
 | ||
| 		  and 5, and is likely to produce other undefined behavior.
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 	    </orderedlist>
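<para>
  As a rough illustration of the conditional-erase method, the
  following sketch removes all entries with even keys from a
  collision-chaining hash table in a single call. The names
  <type>map_t</type>, <type>key_is_even</type>, and
  <function>erase_even_keys</function> are hypothetical and exist only
  for this example; the predicate is assumed to be called with each
  stored key/mapped pair.
</para>

```cpp
#include <cstddef>
#include <ext/pb_ds/assoc_container.hpp>

// Hypothetical names used only in this sketch.
typedef __gnu_pbds::cc_hash_table<int, char> map_t;

// Predicate applied to each stored (key, mapped) pair.
struct key_is_even
{
  template<typename Value>
  bool operator()(const Value& v) const
  { return v.first % 2 == 0; }
};

// Erases every entry whose key is even and returns the number of
// entries removed. No iterator has to stay valid across the erasure,
// so the container remains free to resize itself.
std::size_t erase_even_keys(map_t& c)
{ return c.erase_if(key_is_even()); }
```

<para>
  The caller never holds an iterator during the erasure, which is what
  lets a hash-based container shrink while elements are removed.
</para>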
 | ||
| 	  </section> <!-- erase -->
 | ||
| 
 | ||
| 	  <section xml:id="motivation.associative.functions.split">
 | ||
| 	    <info>
 | ||
| 	      <title>
 | ||
| 		<function>split</function> and <function>join</function>
 | ||
| 	      </title>
 | ||
| 	    </info>
 | ||
| 	    <para>
 | ||
| 	      It is well-known that tree-based and trie-based container
 | ||
| 	      objects can be efficiently split or joined (See
 | ||
| 	      <xref linkend="biblio.clrs2001"/>). Externally splitting or
 | ||
| 	      joining trees is super-linear, and, furthermore, can throw
 | ||
| 	      exceptions. Split and join methods, consequently, seem good
 | ||
| 	      choices for tree-based container methods, especially since, as
 | ||
| 	      noted above, they are efficient replacements for erasing
 | ||
| 	      sub-sequences.
 | ||
| 	    </para>
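<para>
  A minimal sketch of how a sub-sequence might be erased with split
  and join follows. It assumes that <function>split</function> moves
  all entries whose keys are larger than the given key into the other
  container, and that <function>join</function> absorbs a container
  whose keys do not overlap the receiver's. The names
  <type>tree_t</type> and <function>erase_key_range</function> are
  hypothetical.
</para>

```cpp
#include <ext/pb_ds/assoc_container.hpp>

// Hypothetical names used only in this sketch.
typedef __gnu_pbds::tree<int, char> tree_t;

// Erases every entry whose key k satisfies lo < k <= hi.
void erase_key_range(tree_t& c, int lo, int hi)
{
  tree_t middle, upper;
  c.split(lo, middle);      // c: keys <= lo;       middle: keys > lo
  middle.split(hi, upper);  // middle: lo < k <= hi; upper: keys > hi
  c.join(upper);            // c: keys <= lo together with keys > hi
  // middle goes out of scope here, destroying the erased entries.
}
```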
 | ||
| 
 | ||
| 	  </section> <!-- split -->
 | ||
| 
 | ||
| 	  <section xml:id="motivation.associative.functions.insert">
 | ||
| 	    <info>
 | ||
| 	      <title>
 | ||
| 		<function>insert</function>
 | ||
| 	      </title>
 | ||
| 	    </info>
 | ||
| 	    <para>
 | ||
| 	      The standard associative containers provide methods of the form
 | ||
| 	    </para>
 | ||
| 	    <programlisting>
 | ||
| 	      template<class It>
 | ||
| 	      size_type
 | ||
| 	      insert(It b, It e);
 | ||
| 	    </programlisting>
 | ||
| 
 | ||
| 	    <para>
 | ||
| 	      for inserting a range of elements given by a pair of
 | ||
| 	      iterators. At best, this can be implemented as an external loop,
 | ||
| 	      or, even more efficiently, as a join operation (for the case of
 | ||
| 	      tree-based or trie-based containers). Moreover, these methods seem
 | ||
| 	      similar to constructors taking a range given by a pair of
 | ||
| 	      iterators; the constructors, however, are transactional, whereas
 | ||
| 	      the insert methods are not; this is possibly confusing.
 | ||
| 	    </para>
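<para>
  To make the distinction concrete, here is a sketch of an external
  function that gives range insertion all-or-nothing behavior, taking
  <quote>transactional</quote> to mean that the container is left
  unchanged if an exception is thrown part-way. It uses the
  copy-and-swap technique on a standard container; the names
  <function>insert_all_or_nothing</function> and
  <function>merge_batch</function> are hypothetical.
</para>

```cpp
#include <map>
#include <utility>
#include <vector>

// Inserts [b, e) into c, or leaves c untouched if anything throws.
template<typename Container, typename It>
void insert_all_or_nothing(Container& c, It b, It e)
{
  Container tmp(c);   // work on a copy
  tmp.insert(b, e);   // may throw part-way; c is still unchanged
  c.swap(tmp);        // commit the result
}

typedef std::map<int, char> map_t;
typedef std::vector<std::pair<int, char> > batch_t;

// Example caller: merges a batch of entries into a map.
void merge_batch(map_t& m, const batch_t& batch)
{ insert_all_or_nothing(m, batch.begin(), batch.end()); }
```

<para>
  The price is a full copy of the container, which is one reason such
  behavior is arguably better left to an external function than built
  into every <function>insert</function>.
</para>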
 | ||
| 
 | ||
| 	  </section> <!-- insert -->
 | ||
| 
 | ||
| 	  <section xml:id="motivation.associative.functions.compare">
 | ||
| 	    <info>
 | ||
| 	      <title>
 | ||
| 		<function>operator==</function> and <function>operator<=</function>
 | ||
| 	      </title>
 | ||
| 	    </info>
 | ||
| 
 | ||
| 	    <para>
 | ||
| 	      Associative containers are parametrized by policies that allow
 | ||
| 	      testing key equivalence: a hash-based container can do this through
 | ||
| 	      its equivalence functor, and a tree-based container can do this
 | ||
| 	      through its comparison functor. In addition, some standard
 | ||
| 	      associative containers have global function operators, like
 | ||
| 	      <function>operator==</function> and <function>operator<=</function>,
 | ||
| 	      that allow comparing entire associative containers.
 | ||
| 	    </para>
 | ||
| 
 | ||
| 	    <para>
 | ||
| 	      In our opinion, these functions are better left out. To begin
 | ||
| 	      with, they do not significantly improve over an external
 | ||
| 	      loop. More importantly, however, they are possibly misleading -
 | ||
| 	      <function>operator==</function>, for example, usually checks for
 | ||
| 	      equivalence, or interchangeability, but the associative
 | ||
| 	      container cannot check for values' equivalence, only keys'
 | ||
| 	      equivalence; also, are two containers considered equivalent if
 | ||
| 	      they store the same values in a different order? This is an
 | ||
| 	      arbitrary decision.
 | ||
| 	    </para>
 | ||
| 	  </section> <!-- compare -->
 | ||
| 
 | ||
| 	</section>  <!-- functional -->
 | ||
| 
 | ||
|       </section> <!--associative-->
 | ||
| 
 | ||
|       <section xml:id="pbds.intro.motivation.priority_queue">
 | ||
| 	<info><title>Priority Queues</title></info>
 | ||
| 
 | ||
| 	<section xml:id="motivation.priority_queue.policy">
 | ||
| 	  <info><title>Policy Choices</title></info>
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    Priority queues are containers that allow efficiently inserting
 | ||
| 	    values and accessing the maximal value (in the sense of the
 | ||
| 	    container's comparison functor). Their interface
 | ||
| 	    supports <function>push</function>
 | ||
| 	    and <function>pop</function>. The standard
 | ||
| 	    container <classname>std::priority_queue</classname> indeed supports
 | ||
| 	    these methods, but little else. For algorithmic and
 | ||
| 	    software-engineering purposes, other methods are needed:
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <orderedlist>
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		Many graph algorithms (see
 | ||
| 		<xref linkend="biblio.clrs2001"/>) require increasing a
 | ||
| 		value in a priority queue (again, in the sense of the
 | ||
| 		container's comparison functor), or joining two
 | ||
| 		priority-queue objects.
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 
 | ||
| 	    <listitem>
 | ||
| 	      <para>The return type of <classname>priority_queue</classname>'s
 | ||
| 	      <function>push</function> method is a point-type iterator, which can
 | ||
| 	      be used for modifying or erasing arbitrary values. For
 | ||
| 	      example:</para>
 | ||
| 	      <programlisting>
 | ||
| 		priority_queue<int> p;
 | ||
| 		priority_queue<int>::point_iterator it = p.push(3);
 | ||
| 		p.modify(it, 4);
 | ||
| 	      </programlisting>
 | ||
| 
 | ||
| 	      <para>These types of cross-referencing operations are necessary
 | ||
| 	      for making priority queues useful for different applications,
 | ||
| 	      especially graph applications.</para>
 | ||
| 
 | ||
| 	    </listitem>
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		It is sometimes necessary to erase an arbitrary value in a
 | ||
| 		priority queue. For example, consider
 | ||
| 		the <function>select</function> function for monitoring
 | ||
| 		file descriptors:
 | ||
| 	      </para>
 | ||
| 
 | ||
| 	      <programlisting>
 | ||
| 		int
 | ||
| 		select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *errorfds,
 | ||
| 		struct timeval *timeout);
 | ||
| 	      </programlisting>
 | ||
| 	      <para>
 | ||
| 		then, as the select documentation states:
 | ||
| 	      </para>
 | ||
| 	      <para>
 | ||
| 		<quote>
 | ||
| 		  The nfds argument specifies the range of file
 | ||
| 		  descriptors to be tested. The select() function tests file
 | ||
| 		descriptors in the range of 0 to nfds-1.</quote>
 | ||
| 	      </para>
 | ||
| 
 | ||
| 	      <para>
 | ||
| 		It stands to reason, therefore, that we might wish to
 | ||
| 		maintain a minimal value for <varname>nfds</varname>, and
 | ||
| 		priority queues immediately come to mind. Note, though, that
 | ||
| 		when a socket is closed, the minimal file descriptor might
 | ||
| 		change; in the absence of an efficient means to erase an
 | ||
| 		arbitrary value from a priority queue, we might as well
 | ||
| 		avoid its use altogether.
 | ||
| 	      </para>
 | ||
| 
 | ||
| 	      <para>
 | ||
| 		The standard containers typically support iterators. It is
 | ||
| 		somewhat unusual
 | ||
| 		for <classname>std::priority_queue</classname> to omit them
 | ||
| 		(See <xref linkend="biblio.meyers01stl"/>). One might
 | ||
| 		ask why do priority queues need to support iterators, since
 | ||
| 		they are self-organizing containers with a different purpose
 | ||
| 		than abstracting sequences. There are several reasons:
 | ||
| 	      </para>
 | ||
| 	      <orderedlist>
 | ||
| 		<listitem>
 | ||
| 		  <para>
 | ||
| 		    Iterators (even in self-organizing containers) are
 | ||
| 		    useful for many purposes: cross-referencing
 | ||
| 		    containers, serialization, and debugging code that uses
 | ||
| 		    these containers.
 | ||
| 		  </para>
 | ||
| 		</listitem>
 | ||
| 
 | ||
| 		<listitem>
 | ||
| 		  <para>
 | ||
| 		    The standard library's hash-based containers support
 | ||
| 		    iterators, even though they too are self-organizing
 | ||
| 		    containers with a different purpose than abstracting
 | ||
| 		    sequences.
 | ||
| 		  </para>
 | ||
| 		</listitem>
 | ||
| 
 | ||
| 		<listitem>
 | ||
| 		  <para>
 | ||
| 		    In standard-library-like containers, it is natural to specify the
 | ||
| 		    interface of operations for modifying a value or erasing
 | ||
| 		    a value (discussed previously) in terms of iterators.
 | ||
| 		    It should be noted that the standard
 | ||
| 		    containers also use iterators for accessing and
 | ||
| 		    manipulating a specific value. In hash-based
 | ||
| 		    containers, one checks the existence of a key by
 | ||
| 		    comparing the iterator returned by <function>find</function> to the
 | ||
| 		    iterator returned by <function>end</function>, and not by comparing a
 | ||
| 		    pointer returned by <function>find</function> to <type>NULL</type>.
 | ||
| 		  </para>
 | ||
| 		</listitem>
 | ||
| 	      </orderedlist>
 | ||
| 	    </listitem>
 | ||
| 	  </orderedlist>
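<para>
  The file-descriptor scenario above could be sketched roughly as
  follows. The class <classname>fd_tracker</classname> and its members
  are hypothetical; the sketch assumes the default
  <classname>__gnu_pbds::priority_queue</classname> configuration, in
  which <function>top</function> yields the largest value and point
  iterators returned by <function>push</function> stay usable until
  their value is erased.
</para>

```cpp
#include <map>
#include <utility>
#include <ext/pb_ds/priority_queue.hpp>

// Hypothetical helper: tracks open descriptors and the nfds value
// that select() would need for them.
class fd_tracker
{
  typedef __gnu_pbds::priority_queue<int> queue_t;
  typedef std::map<int, queue_t::point_iterator> index_t;

  queue_t fds_;    // largest descriptor is at the top
  index_t where_;  // descriptor -> its position in the queue

public:
  void add(int fd)
  {
    if (where_.find(fd) == where_.end())
      where_.insert(std::make_pair(fd, fds_.push(fd)));
  }

  void remove(int fd)
  {
    index_t::iterator pos = where_.find(fd);
    if (pos == where_.end())
      return;
    fds_.erase(pos->second);  // erase an arbitrary value
    where_.erase(pos);
  }

  // Smallest nfds covering all tracked descriptors.
  int nfds() const
  { return fds_.empty() ? 0 : fds_.top() + 1; }
};
```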
 | ||
| 
 | ||
| 	</section>
 | ||
| 
 | ||
| 	<section xml:id="motivation.priority_queue.underlying">
 | ||
| 	  <info><title>Underlying Data Structures</title></info>
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    There are three main implementations of priority queues: the
 | ||
| 	    first employs a binary heap, typically one which uses a
 | ||
| 	    sequence; the second uses a tree (or forest of trees), which is
 | ||
| 	    typically less structured than an associative container's tree;
 | ||
| 	    the third simply uses an associative container. These are
 | ||
| 	    shown in the figure below with labels A1 and A2, B, and C.
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <figure>
 | ||
| 	    <title>Underlying Priority Queue Data Structures</title>
 | ||
| 	    <mediaobject>
 | ||
| 	      <imageobject>
 | ||
| 		<imagedata align="center" format="PNG" scale="100"
 | ||
| 			   fileref="../images/pbds_different_underlying_dss_2.png"/>
 | ||
| 	      </imageobject>
 | ||
| 	      <textobject>
 | ||
| 		<phrase>Underlying Priority Queue Data Structures</phrase>
 | ||
| 	      </textobject>
 | ||
| 	    </mediaobject>
 | ||
| 	  </figure>
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    No single implementation can completely replace any of the
 | ||
| 	    others. Some have better <function>push</function>
 | ||
| 	    and <function>pop</function> amortized performance, some have
 | ||
| 	    better bounded (worst case) response time than others, some
 | ||
| 	    optimize a single method at the expense of others, etc. In
 | ||
| 	    general the "best" implementation is dictated by the specific
 | ||
| 	    problem.
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    As with associative containers, the more implementations
 | ||
| 	    co-exist, the more necessary a traits mechanism is for handling
 | ||
| 	    generic containers safely and efficiently. This is especially
 | ||
| 	    important for priority queues, since the invalidation guarantees
 | ||
| 	    of one of the most useful data structures - binary heaps - are
 | ||
| 	    markedly different than those of most of the others.
 | ||
| 	  </para>
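<para>
  As a sketch of how such a traits mechanism might be queried, the
  following code dispatches on the invalidation-guarantee tag of a
  priority queue type. The helper names are hypothetical, and the
  example assumes the tag types
  <classname>basic_invalidation_guarantee</classname> and
  <classname>point_invalidation_guarantee</classname> exposed through
  <classname>container_traits</classname>.
</para>

```cpp
#include <functional>
#include <ext/pb_ds/priority_queue.hpp>

// Overloads selected by the guarantee tag (hypothetical helpers).
inline bool keeps_point_iterators(__gnu_pbds::basic_invalidation_guarantee)
{ return false; }

inline bool keeps_point_iterators(__gnu_pbds::point_invalidation_guarantee)
{ return true; }

// True if point iterators of Cntnr stay valid while other values
// are pushed or erased.
template<typename Cntnr>
bool point_iterators_survive_updates()
{
  typedef typename __gnu_pbds::container_traits<Cntnr>::invalidation_guarantee
    guarantee;
  return keeps_point_iterators(guarantee());
}

typedef __gnu_pbds::priority_queue<int, std::less<int>,
                                   __gnu_pbds::binary_heap_tag> binary_pq;
typedef __gnu_pbds::priority_queue<int, std::less<int>,
                                   __gnu_pbds::pairing_heap_tag> pairing_pq;
```

<para>
  Generic code can use such a query to decide whether it may store
  point iterators for later use or must look values up again.
</para>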
 | ||
| 
 | ||
| 	</section>
 | ||
| 
 | ||
| 	<section xml:id="motivation.priority_queue.binary_heap">
 | ||
| 	  <info><title>Binary Heaps</title></info>
 | ||
| 
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    Binary heaps are one of the most useful underlying
 | ||
| 	    data structures for priority queues. They are very efficient in
 | ||
| 	    terms of memory (since they don't require per-value structure
 | ||
| 	    metadata), and have the best amortized <function>push</function> and
 | ||
| 	    <function>pop</function> performance for primitive types like
 | ||
| 	    <type>int</type>.
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    The standard library's <classname>priority_queue</classname>
 | ||
| 	    implements this data structure as an adapter over a sequence,
 | ||
| 	    typically
 | ||
| 	    <classname>std::vector</classname>
 | ||
| 	    or <classname>std::deque</classname>, which correspond to labels
 | ||
| 	    A1 and A2 respectively in the graphic above.
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    This is indeed an elegant example of the adapter concept and
 | ||
| 	    the algorithm/container/iterator decomposition. (See <xref linkend="biblio.nelson96stlpq"/>). There are
 | ||
| 	    several reasons why a binary-heap priority queue
 | ||
| 	    may be better implemented as a container instead of a
 | ||
| 	    sequence adapter:
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <orderedlist>
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		<classname>std::priority_queue</classname> cannot erase values
 | ||
| 		from its adapted sequence (irrespective of the sequence
 | ||
| 		type). This means that the memory use of
 | ||
| 		an <classname>std::priority_queue</classname> object is always
 | ||
| 		proportional to the maximal number of values it ever contained,
 | ||
| 		and not to the number of values that it currently
 | ||
| 		contains. (See <filename>performance/priority_queue_text_pop_mem_usage.cc</filename>.)
 | ||
| 		This implementation of binary heaps acts very differently than
 | ||
| 		other underlying data structures (See also pairing heaps).
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		Some combinations of adapted sequences and value types
 | ||
| 		are very inefficient or just don't make sense. If one uses
 | ||
| 		<classname>std::priority_queue<std::string,
 | ||
| 		std::vector<std::string> ></classname>, for example, then not only will each
 | ||
| 		operation perform a logarithmic number of
 | ||
| 		<classname>std::string</classname> assignments, but, furthermore, any
 | ||
| 		operation (including <function>pop</function>) can render the container
 | ||
| 		useless due to exceptions. Conversely, if one uses
 | ||
| 		<classname>std::priority_queue<int, std::deque<int>
 | ||
| 		></classname>, then each operation incurs a logarithmic
 | ||
| 		number of indirect accesses (through pointers) unnecessarily.
 | ||
| 		It might be better to let the container make a conservative
 | ||
| 		deduction whether to use the structure in the graphic above, labels A1 or A2.
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		There does not seem to be a systematic way to determine
 | ||
| 		what exactly can be done with the priority queue.
 | ||
| 	      </para>
 | ||
| 	      <orderedlist>
 | ||
| 		<listitem>
 | ||
| 		  <para>
 | ||
| 		    If <classname>p</classname> is a priority queue adapting an
 | ||
| 		    <classname>std::vector</classname>, then it is possible to iterate over
 | ||
| 		    all values by using <function>&p.top()</function> and
 | ||
| 		    <function>&p.top() + p.size()</function>, but this will not work
 | ||
| 		    if <varname>p</varname> is adapting an <classname>std::deque</classname>; in any
 | ||
| 		    case, one cannot use <classname>p.begin()</classname> and
 | ||
| 		    <classname>p.end()</classname>. If a different sequence is adapted, it
 | ||
| 		    is even more difficult to determine what can be
 | ||
| 		    done.
 | ||
| 		  </para>
 | ||
| 		</listitem>
 | ||
| 
 | ||
| 		<listitem>
 | ||
| 		  <para>
 | ||
| 		    If <varname>p</varname> is a priority queue adapting an
 | ||
| 		    <classname>std::deque</classname>, then the reference returned by
 | ||
| 		  </para>
 | ||
| 		  <programlisting>
 | ||
| 		    p.top()
 | ||
| 		  </programlisting>
 | ||
| 		  <para>
 | ||
| 		    will remain valid until it is popped,
 | ||
| 		    but if <varname>p</varname> adapts an <classname>std::vector</classname>, the
 | ||
| 		    next <function>push</function> will invalidate it. If a different
 | ||
| 		    sequence is adapted, it is even more difficult to
 | ||
| 		    determine what can be done.
 | ||
| 		  </para>
 | ||
| 		</listitem>
 | ||
| 	      </orderedlist>
 | ||
| 	    </listitem>
 | ||
| 
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		Sequence-based binary heaps can still implement
 | ||
| 		linear-time <function>erase</function> and <function>modify</function> operations.
 | ||
| 		This means that if one needs to erase a small
 | ||
| 		(say logarithmic) number of values, then one might still
 | ||
| 		choose this underlying data structure. Using
 | ||
| 		<classname>std::priority_queue</classname>, however, this will generally
 | ||
| 		change the order of growth of the entire sequence of
 | ||
| 		operations.
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 	  </orderedlist>
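<para>
  By contrast, when the priority queue is itself a container, the same
  iteration code can be written for every underlying data structure.
  The following sketch sums all values through
  <function>begin</function> and <function>end</function>; the name
  <function>sum_values</function> is hypothetical.
</para>

```cpp
#include <functional>
#include <ext/pb_ds/priority_queue.hpp>

// Sums all values of a priority queue without popping any of them.
// The iteration order is unspecified; only the total is meaningful.
template<typename PQ>
long sum_values(const PQ& p)
{
  long total = 0;
  for (typename PQ::const_iterator it = p.begin(); it != p.end(); ++it)
    total += *it;
  return total;
}

typedef __gnu_pbds::priority_queue<int, std::less<int>,
                                   __gnu_pbds::binary_heap_tag> binary_pq;
typedef __gnu_pbds::priority_queue<int, std::less<int>,
                                   __gnu_pbds::pairing_heap_tag> pairing_pq;
```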
 | ||
| 
 | ||
| 	</section>
 | ||
|       </section>
 | ||
|     </section> <!-- goals/motivation -->
 | ||
|   </section> <!-- intro -->
 | ||
| 
 | ||
|   <!-- S02: Using -->
 | ||
|   <section xml:id="containers.pbds.using">
 | ||
|     <info><title>Using</title></info>
 | ||
|     <?dbhtml filename="policy_data_structures_using.html"?>
 | ||
| 
 | ||
|     <section xml:id="pbds.using.prereq">
 | ||
|       <info><title>Prerequisites</title></info>
 | ||
| 
 | ||
|       <para>The library contains only header files, and does not require any
 | ||
|       other libraries except the standard C++ library. All classes are
 | ||
|       defined in namespace <code>__gnu_pbds</code>. The library internally
 | ||
|       uses macros beginning with <code>PB_DS</code>, but
 | ||
|       <code>#undef</code>s anything it <code>#define</code>s (except for
 | ||
|       header guards). Compiling the library in an environment where macros
 | ||
|       beginning with <code>PB_DS</code> are defined may yield unpredictable
 | ||
|       results in compilation, execution, or both.</para>
 | ||
| 
 | ||
|       <para>
 | ||
| 	Further dependencies are necessary to create the visual output
 | ||
| 	for the performance tests. To create these graphs, an
 | ||
| 	additional package is needed: <command>pychart</command>.
 | ||
|       </para>
 | ||
|     </section>
 | ||
| 
 | ||
|     <section xml:id="pbds.using.organization">
 | ||
|       <info><title>Organization</title></info>
 | ||
| 
 | ||
|       <para>
 | ||
| 	The various data structures are organized as follows.
 | ||
|       </para>
 | ||
| 
 | ||
|       <itemizedlist>
 | ||
| 	<listitem>
 | ||
| 	  <para>
 | ||
| 	    Branch-Based
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <itemizedlist>
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		<classname>basic_branch</classname>
 | ||
| 		is an abstract base class for branch-based
 | ||
| 		associative-containers
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		<classname>tree</classname>
 | ||
| 		is a concrete base class for tree-based
 | ||
| 		associative-containers
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		<classname>trie</classname>
 | ||
| 		is a concrete base class for trie-based
 | ||
| 		associative-containers
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 	  </itemizedlist>
 | ||
| 	</listitem>
 | ||
| 
 | ||
| 	<listitem>
 | ||
| 	  <para>
 | ||
| 	    Hash-Based
 | ||
| 	  </para>
 | ||
| 	  <itemizedlist>
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		<classname>basic_hash_table</classname>
 | ||
| 		is an abstract base class for hash-based
 | ||
| 		associative-containers
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		<classname>cc_hash_table</classname>
 | ||
| 		is a concrete collision-chaining hash-based
 | ||
| 		associative container
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		<classname>gp_hash_table</classname>
 | ||
| 		is a concrete (general) probing hash-based
 | ||
| 		associative container
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 	  </itemizedlist>
 | ||
| 	</listitem>
 | ||
| 
 | ||
| 	<listitem>
 | ||
| 	  <para>
 | ||
| 	    List-Based
 | ||
| 	  </para>
 | ||
| 	  <itemizedlist>
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		<classname>list_update</classname>
 | ||
| 		is a list-based update-policy associative container
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 	  </itemizedlist>
 | ||
| 	</listitem>
 | ||
| 	<listitem>
 | ||
| 	  <para>
 | ||
| 	    Heap-Based
 | ||
| 	  </para>
 | ||
| 	  <itemizedlist>
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		<classname>priority_queue</classname>
 | ||
| 		is a priority queue
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 	  </itemizedlist>
 | ||
| 	</listitem>
 | ||
|       </itemizedlist>
 | ||
| 
 | ||
|       <para>
 | ||
| 	The hierarchy is composed naturally so that commonality is
 | ||
| 	captured by base classes. Thus <function>operator[]</function>
 | ||
| 	is defined at the base of any hierarchy, since all derived
 | ||
| 	containers support it. Conversely <function>split</function> is
 | ||
| 	defined in <classname>basic_branch</classname>, since only
 | ||
| 	tree-like containers support it.
 | ||
|       </para>
 | ||
| 
 | ||
|       <para>
 | ||
| 	In addition, there are the following diagnostics classes,
 | ||
| 	used to report errors specific to this library's data
 | ||
| 	structures.
 | ||
|       </para>
 | ||
| 
 | ||
|       <figure>
 | ||
| 	<title>Exception Hierarchy</title>
 | ||
| 	<mediaobject>
 | ||
| 	  <imageobject>
 | ||
| 	    <imagedata align="center" format="PDF" scale="75"
 | ||
| 		       fileref="../images/pbds_exception_hierarchy.pdf"/>
 | ||
| 	  </imageobject>
 | ||
| 	  <imageobject>
 | ||
| 	    <imagedata align="center" format="PNG" scale="100"
 | ||
| 		       fileref="../images/pbds_exception_hierarchy.png"/>
 | ||
| 	  </imageobject>
 | ||
| 	  <textobject>
 | ||
| 	    <phrase>Exception Hierarchy</phrase>
 | ||
| 	  </textobject>
 | ||
| 	</mediaobject>
 | ||
|       </figure>
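<para>
  A sketch of how one of these diagnostics classes might be handled
  follows. It assumes that joining two trees with overlapping key
  ranges is reported by throwing
  <classname>__gnu_pbds::join_error</classname>, which is caught here
  through its base class
  <classname>__gnu_pbds::container_error</classname>. The names
  <type>tree_t</type> and <function>try_join</function> are
  hypothetical.
</para>

```cpp
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/exception.hpp>

// Hypothetical names used only in this sketch.
typedef __gnu_pbds::tree<int, char> tree_t;

// Attempts a.join(b). Returns true on success and false if the
// library rejected the join with one of its own exceptions.
bool try_join(tree_t& a, tree_t& b)
{
  try
    {
      a.join(b);
      return true;
    }
  catch (const __gnu_pbds::container_error&)
    {
      return false;
    }
}
```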
 | ||
| 
 | ||
|     </section>
 | ||
| 
 | ||
|     <section xml:id="pbds.using.tutorial">
 | ||
|       <info><title>Tutorial</title></info>
 | ||
| 
 | ||
|       <section xml:id="pbds.using.tutorial.basic">
 | ||
| 	<info><title>Basic Use</title></info>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  For the most part, the policy-based containers in
 | ||
| 	  namespace <literal>__gnu_pbds</literal> have the same interface as
 | ||
| 	  the equivalent containers in the standard C++ library, except for
 | ||
| 	  the names used for the container classes themselves. For example,
 | ||
| 	  this shows basic operations on a collision-chaining hash-based
 | ||
| 	  container:
 | ||
| 	</para>
 | ||
| 	<programlisting>
 | ||
| 	  #include <cassert>
 | ||
| 	  #include <ext/pb_ds/assoc_container.hpp>
 | ||
| 
 | ||
| 	  int main()
 | ||
| 	  {
 | ||
| 	    __gnu_pbds::cc_hash_table<int, char> c;
 | ||
| 	    c[2] = 'b';                    // insert the pair (2, 'b')
 | ||
| 	    assert(c.find(1) == c.end());  // key 1 was never inserted
 | ||
| 	  }
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  The container is called
 | ||
| 	  <classname>__gnu_pbds::cc_hash_table</classname> instead of
 | ||
| 	  <classname>std::unordered_map</classname>, since <quote>unordered
 | ||
| 	  map</quote> does not necessarily mean a hash-based map as implied by
 | ||
| 	  the C++ library (C++11 or TR1). For example, list-based associative
 | ||
| 	  containers, which are very useful for the construction of
 | ||
| 	  "multimaps," are also unordered.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>This snippet shows a red-black tree based container:</para>
 | ||
| 
 | ||
| 	<programlisting>
 | ||
| 	  #include <cassert>
 | ||
| 	  #include <ext/pb_ds/assoc_container.hpp>
 | ||
| 
 | ||
| 	  int main()
 | ||
| 	  {
 | ||
| 	    __gnu_pbds::tree<int, char> c;
 | ||
| 	    c[2] = 'b';                    // insert the pair (2, 'b')
 | ||
| 	    assert(c.find(2) != c.end());  // key 2 is now present
 | ||
| 	  }
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>The container is called <classname>tree</classname> instead of
 | ||
| 	<classname>map</classname> since the underlying data structures are
 | ||
| 	being named with specificity.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  The member function naming convention is to strive to be the same as
 | ||
| 	  the equivalent member functions in other C++ standard library
 | ||
| 	  containers. The familiar methods are unchanged:
 | ||
| 	  <function>begin</function>, <function>end</function>,
 | ||
| 	  <function>size</function>, <function>empty</function>, and
 | ||
| 	  <function>clear</function>.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  This isn't to say that things are exactly as one would expect, given
 | ||
| 	  the container requirements and interfaces in the C++ standard.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  The names of containers' policies and policy accessors are
 | ||
| 	  different than usual. For example, if <type>hash_type</type> is
 | ||
| 	some type of hash-based container, then</para>
 | ||
| 
 | ||
| 	<programlisting>
 | ||
| 	  hash_type::hash_fn
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  gives the type of its hash functor, and if <varname>obj</varname> is
 | ||
| 	  some hash-based container object, then
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<programlisting>
 | ||
| 	  obj.get_hash_fn()
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>will return a reference to its hash-functor object.</para>
 | ||
| 
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Similarly, if <type>tree_type</type> is some type of tree-based
 | ||
| 	  container, then
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<programlisting>
 | ||
| 	  tree_type::cmp_fn
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  gives the type of its comparison functor, and if
 | ||
| 	  <varname>obj</varname> is some tree-based container object,
 | ||
| 	  then
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<programlisting>
 | ||
| 	  obj.get_cmp_fn()
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>will return a reference to its comparison-functor object.</para>
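<para>
  For instance, the comparison-functor accessor might be used as in
  the following sketch. The container is configured with
  <classname>std::greater</classname> purely for illustration, and the
  name <function>ordered_before</function> is hypothetical.
</para>

```cpp
#include <functional>
#include <ext/pb_ds/assoc_container.hpp>

// A tree-based container ordered from largest key to smallest.
typedef __gnu_pbds::tree<int, char, std::greater<int> > tree_type;

// Asks the container's own comparison functor whether key a is
// ordered before key b.
bool ordered_before(const tree_type& obj, int a, int b)
{
  const tree_type::cmp_fn& cmp = obj.get_cmp_fn();
  return cmp(a, b);
}
```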
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  It would be nice to give names consistent with those in the existing
 | ||
| 	  C++ standard (inclusive of TR1). Unfortunately, these standard
 | ||
| 	  containers don't consistently name types and methods. For example,
 | ||
| 	  <classname>std::tr1::unordered_map</classname> uses
 | ||
| 	  <type>hasher</type> for the hash functor, but
 | ||
| 	  <classname>std::map</classname> uses <type>key_compare</type> for
 | ||
| 	  the comparison functor. Also, we could not find an accessor for
 | ||
| 	  <classname>std::tr1::unordered_map</classname>'s hash functor, but
 | ||
| 	  <classname>std::map</classname> uses <function>key_comp</function>
 | ||
| 	  for accessing the comparison functor.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Instead, <literal>__gnu_pbds</literal> attempts to be internally
 | ||
| 	  consistent, and uses standard-derived terminology if possible.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Another source of difference is in scope:
 | ||
| 	  <literal>__gnu_pbds</literal> contains more types of associative
 | ||
| 	  containers than the standard C++ library, and more opportunities
 | ||
| 	  to configure these new containers, since different types of
 | ||
| 	  associative containers are useful in different settings.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Namespace <literal>__gnu_pbds</literal> contains different classes for
 | ||
| 	  hash-based containers, tree-based containers, trie-based containers,
 | ||
| 	  and list-based containers.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Since associative containers share parts of their interface, they
 | ||
| 	  are organized as a class hierarchy.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>Each type or method is defined in the most-common ancestor
 | ||
| 	in which it makes sense.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>For example, all associative containers support iteration
 | ||
| 	expressed in the following form:
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<programlisting>
 | ||
| 	  const_iterator
 | ||
| 	  begin() const;
 | ||
| 
 | ||
| 	  iterator
 | ||
| 	  begin();
 | ||
| 
 | ||
| 	  const_iterator
 | ||
| 	  end() const;
 | ||
| 
 | ||
| 	  iterator
 | ||
| 	  end();
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  But not all containers contain or use hash functors. Yet, both
 | ||
| 	  collision-chaining and (general) probing hash-based associative
 | ||
| 	  containers have a hash functor, so
 | ||
| 	  <classname>basic_hash_table</classname> contains the interface:
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<programlisting>
 | ||
| 	  const hash_fn&
 | ||
| 	  get_hash_fn() const;
 | ||
| 
 | ||
| 	  hash_fn&
 | ||
| 	  get_hash_fn();
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  so all hash-based associative containers inherit the same
 | ||
| 	  hash-functor accessor methods.
 | ||
| 	</para>
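 | ||
| 	<para>
 | ||
| 	  As a minimal sketch of this shared interface, the code below reads
 | ||
| 	  a call counter through <function>get_hash_fn</function> on both a
 | ||
| 	  collision-chaining and a probing table. The
 | ||
| 	  <classname>counting_hash</classname> functor and the
 | ||
| 	  <function>hash_calls_after_insert</function> helper are hypothetical
 | ||
| 	  names introduced only for this illustration.
 | ||
| 	</para>
 | ||
| 	<programlisting>
```cpp
#include <cassert>
#include <cstddef>
#include <ext/pb_ds/assoc_container.hpp>

// Hypothetical hash functor (not part of the library) that counts
// how many times it has been invoked.
struct counting_hash
{
  counting_hash() : calls(0) { }

  std::size_t
  operator()(int key) const
  {
    ++calls;
    return static_cast<std::size_t>(key);
  }

  mutable std::size_t calls;
};

// Generic over the kind of hash table: get_hash_fn belongs to the
// interface shared by hash-based containers.
template<typename Hash_Table>
std::size_t
hash_calls_after_insert(int key)
{
  Hash_Table table;
  table[key] = 'x';
  assert(table.size() == 1);
  return table.get_hash_fn().calls;
}

typedef __gnu_pbds::cc_hash_table<int, char, counting_hash> chained_table;
typedef __gnu_pbds::gp_hash_table<int, char, counting_hash> probing_table;
```
| 	</programlisting>
 | ||
| 	<para>
 | ||
| 	  Calling <function>hash_calls_after_insert</function> with either
 | ||
| 	  typedef should report at least one hash invocation; the exact
 | ||
| 	  count depends on the implementation.
 | ||
| 	</para>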
 | ||
| 
 | ||
|       </section> <!--basic use -->
 | ||
| 
 | ||
|       <section xml:id="pbds.using.tutorial.configuring">
 | ||
| 	<info>
 | ||
| 	  <title>
 | ||
| 	    Configuring via Template Parameters
 | ||
| 	  </title>
 | ||
| 	</info>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  In general, each of this library's containers is
 | ||
| 	  parametrized by more policies than those of the standard library. For
 | ||
| 	  example, the standard hash-based container is parametrized as
 | ||
| 	  follows:
 | ||
| 	</para>
 | ||
| 	<programlisting>
 | ||
| 	  template<typename Key, typename Mapped, typename Hash,
 | ||
| 	  typename Pred, typename Allocator, bool Cache_Hash_Code>
 | ||
| 	  class unordered_map;
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  and so can be configured by key type, mapped type, a functor
 | ||
| 	  that translates keys to unsigned integral types, an equivalence
 | ||
| 	  predicate, an allocator, and an indicator whether to store hash
 | ||
| 	  values with each entry. This library's collision-chaining
 | ||
| 	  hash-based container is parametrized as
 | ||
| 	</para>
 | ||
| 	<programlisting>
 | ||
| 	  template<typename Key, typename Mapped, typename Hash_Fn,
 | ||
| 	  typename Eq_Fn, typename Comb_Hash_Fn,
 | ||
| 	  typename Resize_Policy, bool Store_Hash,
 | ||
| 	  typename Allocator>
 | ||
| 	  class cc_hash_table;
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  and so can be configured by the first four types of
 | ||
| 	  <classname>std::tr1::unordered_map</classname>, then a
 | ||
| 	  policy for translating the key-hash result into a position
 | ||
| 	  within the table, then a policy by which the table resizes,
 | ||
| 	  an indicator whether to store hash values with each entry,
 | ||
| 	  and an allocator (which is typically the last template
 | ||
| 	  parameter in standard containers).
 | ||
| 	</para>
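 | ||
| 	<para>
 | ||
| 	  The parameter list above might be spelled out in full roughly as
 | ||
| 	  follows. This is a sketch under assumptions: the
 | ||
| 	  <classname>int_hash</classname> functor and the
 | ||
| 	  <function>actual_size_after_resize</function> helper are
 | ||
| 	  hypothetical, and the policy arguments are one plausible choice
 | ||
| 	  rather than a recommendation.
 | ||
| 	</para>
 | ||
| 	<programlisting>
```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/hash_policy.hpp>

// Hypothetical identity hash for int keys (illustration only).
struct int_hash
{
  std::size_t
  operator()(int key) const
  { return static_cast<std::size_t>(key); }
};

typedef __gnu_pbds::cc_hash_table<
  int,                                         // Key
  char,                                        // Mapped
  int_hash,                                    // Hash_Fn
  std::equal_to<int>,                          // Eq_Fn
  __gnu_pbds::direct_mask_range_hashing<>,     // Comb_Hash_Fn
  __gnu_pbds::hash_standard_resize_policy<     // Resize_Policy
    __gnu_pbds::hash_exponential_size_policy<>,
    __gnu_pbds::hash_load_check_resize_trigger<>,
    true>,                                     // allow external size access
  true>                                        // Store_Hash
  configured_table;

// Resizes an empty table and reports the size actually chosen by
// the size policy, then shows that the table is usable as a map.
std::size_t
actual_size_after_resize(std::size_t requested)
{
  configured_table table;
  table.resize(requested);
  const std::size_t actual = table.get_actual_size();
  assert(actual >= requested);

  table[1] = 'a';
  table[2] = 'b';
  return actual;
}
```
| 	</programlisting>
 | ||
| 	<para>
 | ||
| 	  The size policy rounds the request up, so the value returned is
 | ||
| 	  at least the size asked for.
 | ||
| 	</para>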
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Nearly all policy parameters have default values, so this
 | ||
| 	  need not be considered for casual use. It is important to
 | ||
| 	  note, however, that hash-based containers' policies can
 | ||
| 	  dramatically alter their performance in different settings,
 | ||
| 	  and that tree-based containers' policies can make them
 | ||
| 	  useful for other purposes than just look-up.
 | ||
| 	</para>
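 | ||
| 	<para>
 | ||
| 	  One example of a tree policy that goes beyond look-up is the
 | ||
| 	  order-statistics node update. The sketch below assumes a set of
 | ||
| 	  <type>int</type> keys; <classname>ranked_set</classname>,
 | ||
| 	  <function>kth_smallest</function>, and
 | ||
| 	  <function>count_less_than</function> are names made up for this
 | ||
| 	  illustration.
 | ||
| 	</para>
 | ||
| 	<programlisting>
```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>

// A red-black tree "set" whose nodes also maintain subtree sizes.
typedef __gnu_pbds::tree<
  int,
  __gnu_pbds::null_type,
  std::less<int>,
  __gnu_pbds::rb_tree_tag,
  __gnu_pbds::tree_order_statistics_node_update>
  ranked_set;

// Returns the k-th smallest key, counting from 0.
int
kth_smallest(ranked_set& s, std::size_t k)
{
  assert(k < s.size());
  return *s.find_by_order(k);
}

// Returns the number of stored keys strictly smaller than key.
std::size_t
count_less_than(ranked_set& s, int key)
{ return s.order_of_key(key); }
```
| 	</programlisting>
 | ||
| 	<para>
 | ||
| 	  With the keys 10, 20, and 30 inserted,
 | ||
| 	  <function>kth_smallest(s, 1)</function> yields 20 and
 | ||
| 	  <function>count_less_than(s, 25)</function> yields 2.
 | ||
| 	</para>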
 | ||
| 
 | ||
| 
 | ||
| 	<para>As opposed to associative containers, priority queues have
 | ||
| 	relatively few configuration options. The priority queue is
 | ||
| 	parametrized as follows:</para>
 | ||
| 	<programlisting>
 | ||
| 	  template<typename Value_Type, typename Cmp_Fn, typename Tag,
 | ||
| 	  typename Allocator>
 | ||
| 	  class priority_queue;
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>The <classname>Value_Type</classname>, <classname>Cmp_Fn</classname>, and
 | ||
| 	<classname>Allocator</classname> parameters are the container's value type,
 | ||
| 	comparison-functor type, and allocator type, respectively;
 | ||
| 	these are very similar to the standard's priority queue. The
 | ||
| 	<classname>Tag</classname> parameter is different: there are a number of
 | ||
| 	pre-defined tag types corresponding to binary heaps, binomial
 | ||
| 	heaps, etc., and <classname>Tag</classname> should be instantiated
 | ||
| 	by one of them.</para>
 | ||
| 
 | ||
| 	<para>Note that as opposed to the
 | ||
| 	<classname>std::priority_queue</classname>,
 | ||
| 	<classname>__gnu_pbds::priority_queue</classname> is not a
 | ||
| 	sequence-adapter; it is a regular container.</para>
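 | ||
| 	<para>
 | ||
| 	  The role of the <classname>Tag</classname> parameter can be
 | ||
| 	  sketched as follows: the same code runs over different underlying
 | ||
| 	  heaps merely by changing the tag. The helper
 | ||
| 	  <function>drain_in_priority_order</function> is a hypothetical name
 | ||
| 	  used only here.
 | ||
| 	</para>
 | ||
| 	<programlisting>
```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>
#include <ext/pb_ds/priority_queue.hpp>

// Pushes all values into a priority queue built on the heap selected
// by Tag, then pops them; with std::less the largest comes out first.
template<typename Tag>
std::vector<int>
drain_in_priority_order(const std::vector<int>& values)
{
  typedef __gnu_pbds::priority_queue<int, std::less<int>, Tag> queue_t;

  queue_t q;
  for (std::size_t i = 0; i < values.size(); ++i)
    q.push(values[i]);

  std::vector<int> out;
  while (!q.empty())
    {
      out.push_back(q.top());
      q.pop();
    }

  assert(out.size() == values.size());
  return out;
}
```
| 	</programlisting>
 | ||
| 	<para>
 | ||
| 	  Instantiating the helper with, for example,
 | ||
| 	  <classname>pairing_heap_tag</classname> or
 | ||
| 	  <classname>binary_heap_tag</classname> gives the same output order;
 | ||
| 	  only the performance characteristics differ.
 | ||
| 	</para>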
 | ||
| 
 | ||
|       </section>
 | ||
| 
 | ||
|       <section xml:id="pbds.using.tutorial.traits">
 | ||
| 	<info>
 | ||
| 	  <title>
 | ||
| 	    Querying Container Attributes
 | ||
| 	  </title>
 | ||
| 	</info>
 | ||
| 	<para></para>
 | ||
| 
 | ||
| 	<para>A container's underlying data structure
 | ||
| 	affects its performance; unfortunately, it can also affect
 | ||
| 	its interface. When manipulating associative containers
 | ||
| 	generically, it is often useful to be able to statically
 | ||
| 	determine what they can support and what they cannot.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>Happily, the standard provides a good solution to a similar
 | ||
| 	problem - that of the different behavior of iterators. If
 | ||
| 	<classname>It</classname> is an iterator, then
 | ||
| 	</para>
 | ||
| 	<programlisting>
 | ||
| 	  typename std::iterator_traits<It>::iterator_category
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>is one of a small number of pre-defined tag classes, and
 | ||
| 	</para>
 | ||
| 	<programlisting>
 | ||
| 	  typename std::iterator_traits<It>::value_type
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>is the value type to which the iterator "points".</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Similarly, in this library, if <type>C</type> is a
 | ||
| 	  container, then <classname>container_traits</classname> is a
 | ||
| 	  trait class that stores information about the kind of
 | ||
| 	  container that is implemented.
 | ||
| 	</para>
 | ||
| 	<programlisting>
 | ||
| 	  typename container_traits<C>::container_category
 | ||
| 	</programlisting>
 | ||
| 	<para>
 | ||
| 	  is one of a small number of predefined tag structures that
 | ||
| 	  uniquely identifies the type of underlying data structure.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>In most cases, however, the exact underlying data
 | ||
| 	structure is not really important, but what is important is
 | ||
| 	one of its other attributes: whether it guarantees storing
 | ||
| 	elements by key order, for example. For this one can
 | ||
| 	use</para>
 | ||
| 	<programlisting>
 | ||
| 	  container_traits<C>::order_preserving
 | ||
| 	</programlisting>
 | ||
| 	<para>
 | ||
| 	  Also,
 | ||
| 	</para>
 | ||
| 	<programlisting>
 | ||
| 	  typename container_traits<C>::invalidation_guarantee
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>is the container's invalidation guarantee. Invalidation
 | ||
| 	guarantees are especially important regarding priority queues,
 | ||
| 	since in this library's design, iterators are practically the
 | ||
| 	only way to manipulate them.</para>
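 | ||
| 	<para>
 | ||
| 	  These queries can be checked at compile time. The sketch below
 | ||
| 	  assumes a C++11 compiler (for <literal>static_assert</literal> and
 | ||
| 	  <classname>std::is_same</classname>); the typedefs and the
 | ||
| 	  <classname>keeps_key_order</classname> wrapper are hypothetical
 | ||
| 	  names chosen for this illustration.
 | ||
| 	</para>
 | ||
| 	<programlisting>
```cpp
#include <functional>
#include <type_traits>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/priority_queue.hpp>
#include <ext/pb_ds/tag_and_trait.hpp>

namespace pbds = __gnu_pbds;

typedef pbds::tree<int, pbds::null_type>          tree_set;
typedef pbds::cc_hash_table<int, pbds::null_type> hash_set;
typedef pbds::priority_queue<int, std::less<int>,
                             pbds::pairing_heap_tag> pairing_queue;

// order_preserving is an enumerator rather than a type, so it is
// converted to a bool constant here.
template<typename C>
struct keeps_key_order
{
  static const bool value =
    pbds::container_traits<C>::order_preserving != 0;
};

static_assert(keeps_key_order<tree_set>::value,
              "a red-black tree stores its keys in order");
static_assert(!keeps_key_order<hash_set>::value,
              "a hash table does not");

static_assert(std::is_same<
                pbds::container_traits<hash_set>::container_category,
                pbds::cc_hash_tag>::value,
              "the category names the underlying data structure");

static_assert(std::is_same<
                pbds::container_traits<tree_set>::invalidation_guarantee,
                pbds::range_invalidation_guarantee>::value,
              "tree iterators stay usable for iteration");

static_assert(std::is_same<
                pbds::container_traits<pairing_queue>::invalidation_guarantee,
                pbds::point_invalidation_guarantee>::value,
              "pairing-heap iterators stay usable for dereferencing");
```
| 	</programlisting>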
 | ||
|       </section>
 | ||
| 
 | ||
|       <section xml:id="pbds.using.tutorial.point_range_iteration">
 | ||
| 	<info>
 | ||
| 	  <title>
 | ||
| 	    Point and Range Iteration
 | ||
| 	  </title>
 | ||
| 	</info>
 | ||
| 	<para></para>
 | ||
| 
 | ||
| 	<para>This library differentiates between two types of methods
 | ||
| 	and iterators: point-type, and range-type. For example,
 | ||
| 	<function>find</function> and <function>insert</function> are point-type methods, since
 | ||
| 	they each deal with a specific element; their returned
 | ||
| 	iterators are point-type iterators. <function>begin</function> and
 | ||
| 	<function>end</function> are range-type methods, since they are not used to
 | ||
| 	find a specific element, but rather to go over all elements in
 | ||
| 	a container object; their returned iterators are range-type
 | ||
| 	iterators.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>Most containers store elements in an order that is
 | ||
| 	determined by their interface. Correspondingly, it is fine that
 | ||
| 	their point-type iterators are synonymous with their range-type
 | ||
| 	iterators. For example, in the following snippet
 | ||
| 	</para>
 | ||
| 	<programlisting>
 | ||
| 	  std::for_each(c.find(1), c.find(5), foo);
 | ||
| 	</programlisting>
 | ||
| 	<para>
 | ||
| 	  two point-type iterators (returned by <function>find</function>) are used
 | ||
| 	  for a range-type purpose - going over all elements whose key is
 | ||
| 	  between 1 and 5.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Conversely, the above snippet makes no sense for
 | ||
| 	  self-organizing containers - ones that order (and reorder)
 | ||
| 	  their elements by implementation. It would be nice to have a
 | ||
| 	  uniform iterator system that would allow the above snippet to
 | ||
| 	  compile only if it made sense.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  This could trivially be done by specializing
 | ||
| 	  <function>std::for_each</function> for the case of iterators returned by
 | ||
| 	  <classname>std::tr1::unordered_map</classname>, but this would only solve the
 | ||
| 	  problem for one algorithm and one container. Fundamentally, the
 | ||
| 	  problem is that one can loop using a self-organizing
 | ||
| 	  container's point-type iterators.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  This library's containers define two families of
 | ||
| 	  iterators: <type>point_const_iterator</type> and
 | ||
| 	  <type>point_iterator</type> are the iterator types returned by
 | ||
| 	  point-type methods; <type>const_iterator</type> and
 | ||
| 	  <type>iterator</type> are the iterator types returned by range-type
 | ||
| 	  methods.
 | ||
| 	</para>
 | ||
| 	<programlisting>
 | ||
| 	  class <- some container ->
 | ||
| 	  {
 | ||
| 	  public:
 | ||
| 	  ...
 | ||
| 
 | ||
| 	  typedef <- something -> const_iterator;
 | ||
| 
 | ||
| 	  typedef <- something -> iterator;
 | ||
| 
 | ||
| 	  typedef <- something -> point_const_iterator;
 | ||
| 
 | ||
| 	  typedef <- something -> point_iterator;
 | ||
| 
 | ||
| 	  ...
 | ||
| 
 | ||
| 	  public:
 | ||
| 	  ...
 | ||
| 
 | ||
| 	  const_iterator begin() const;
 | ||
| 
 | ||
| 	  iterator begin();
 | ||
| 
 | ||
| 	  point_const_iterator find(...) const;
 | ||
| 
 | ||
| 	  point_iterator find(...);
 | ||
| 	  };
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>For
 | ||
| 	containers whose interface defines sequence order, it
 | ||
| 	is very simple: point-type and range-type iterators are exactly
 | ||
| 	the same, which means that the above snippet will compile if it
 | ||
| 	is used for an order-preserving associative container.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  For self-organizing containers, however, (hash-based
 | ||
| 	  containers as a special example), the preceding snippet will
 | ||
| 	  not compile, because their point-type iterators do not support
 | ||
| 	  <function>operator++</function>.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>In any case, both for order-preserving and self-organizing
 | ||
| 	containers, the following snippet will compile:
 | ||
| 	</para>
 | ||
| 	<programlisting>
 | ||
| 	  typename Cntnr::point_iterator it = c.find(2);
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  because a range-type iterator can always be converted to a
 | ||
| 	  point-type iterator.
 | ||
| 	</para>
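 | ||
| 	<para>
 | ||
| 	  A generic function written against
 | ||
| 	  <type>point_iterator</type> therefore works for both kinds of
 | ||
| 	  container. The sketch below assumes maps from <type>int</type> to
 | ||
| 	  <type>char</type>; <function>lookup_or</function>,
 | ||
| 	  <function>overwrite</function>, and the two typedefs are
 | ||
| 	  hypothetical names introduced for illustration.
 | ||
| 	</para>
 | ||
| 	<programlisting>
```cpp
#include <cassert>
#include <ext/pb_ds/assoc_container.hpp>

// Looks up key in any map-like container from this library and
// returns fallback when the key is absent.
template<typename Cntnr>
char
lookup_or(Cntnr& c, int key, char fallback)
{
  typename Cntnr::point_iterator it = c.find(key);
  if (it == c.end())
    return fallback;
  return it->second;
}

// Overwrites the mapped value of a key that must already be present.
template<typename Cntnr>
void
overwrite(Cntnr& c, int key, char value)
{
  typename Cntnr::point_iterator it = c.find(key);
  assert(it != c.end());
  it->second = value;
}

typedef __gnu_pbds::cc_hash_table<int, char> hash_map;  // self-organizing
typedef __gnu_pbds::tree<int, char>          tree_map;  // order-preserving
```
| 	</programlisting>
 | ||
| 	<para>
 | ||
| 	  Neither helper increments the iterator, so both compile for
 | ||
| 	  <classname>hash_map</classname> as well as for
 | ||
| 	  <classname>tree_map</classname>.
 | ||
| 	</para>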
 | ||
| 
 | ||
| 	<para>Distinguishing between iterator types also
 | ||
| 	raises the point that a container's iterators might have
 | ||
| 	different invalidation rules concerning their de-referencing
 | ||
| 	abilities and movement abilities. This now corresponds exactly
 | ||
| 	to the question of whether point-type and range-type iterators
 | ||
| 	are valid. As explained above, <classname>container_traits</classname> allows
 | ||
| 	querying a container for its data structure attributes. The
 | ||
| 	iterator-invalidation guarantees are certainly a property of
 | ||
| 	the underlying data structure, and so
 | ||
| 	</para>
 | ||
| 	<programlisting>
 | ||
| 	  container_traits<C>::invalidation_guarantee
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  gives one of three pre-determined types that answer this
 | ||
| 	  query.
 | ||
| 	</para>
 | ||
| 
 | ||
|       </section>
 | ||
|     </section> <!-- tutorial -->
 | ||
| 
 | ||
|     <section xml:id="pbds.using.examples">
 | ||
|       <info><title>Examples</title></info>
 | ||
|       <para>
 | ||
| 	Additional code examples are provided in the source
 | ||
| 	distribution, as part of the regression and performance
 | ||
| 	testsuite.
 | ||
|       </para>
 | ||
| 
 | ||
|       <section xml:id="pbds.using.examples.basic">
 | ||
| 	<info><title>Intermediate Use</title></info>
 | ||
| 
 | ||
| 	<itemizedlist>
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      Basic use of maps:
 | ||
| 	      <filename>basic_map.cc</filename>
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      Basic use of sets:
 | ||
| 	      <filename>basic_set.cc</filename>
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      Conditionally erasing values from an associative container object:
 | ||
| 	      <filename>erase_if.cc</filename>
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      Basic use of multimaps:
 | ||
| 	      <filename>basic_multimap.cc</filename>
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      Basic use of multisets:
 | ||
| 	      <filename>basic_multiset.cc</filename>
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      Basic use of priority queues:
 | ||
| 	      <filename>basic_priority_queue.cc</filename>
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      Splitting and joining priority queues:
 | ||
| 	      <filename>priority_queue_split_join.cc</filename>
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      Conditionally erasing values from a priority queue:
 | ||
| 	      <filename>priority_queue_erase_if.cc</filename>
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 	</itemizedlist>
 | ||
| 
 | ||
|       </section>
 | ||
| 
 | ||
|       <section xml:id="pbds.using.examples.query">
 | ||
| 	<info><title>Querying with <classname>container_traits</classname></title></info>
 | ||
| 	<itemizedlist>
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      Using <classname>container_traits</classname> to query
 | ||
| 	      about underlying data structure behavior:
 | ||
| 	      <filename>assoc_container_traits.cc</filename>
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      A non-compiling example showing wrong use of finding keys in
 | ||
| 	      hash-based containers: <filename>hash_find_neg.cc</filename>
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      Using <classname>container_traits</classname>
 | ||
| 	      to query about underlying data structure behavior:
 | ||
| 	      <filename>priority_queue_container_traits.cc</filename>
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	</itemizedlist>
 | ||
| 
 | ||
|       </section>
 | ||
| 
 | ||
|       <section xml:id="pbds.using.examples.container">
 | ||
| 	<info><title>By Container Method</title></info>
 | ||
| 	<para></para>
 | ||
| 
 | ||
| 	<section xml:id="pbds.using.examples.container.hash">
 | ||
| 	  <info><title>Hash-Based</title></info>
 | ||
| 
 | ||
| 	  <section xml:id="pbds.using.examples.container.hash.resize">
 | ||
| 	    <info><title>size Related</title></info>
 | ||
| 
 | ||
| 	    <itemizedlist>
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  Setting the initial size of a hash-based container
 | ||
| 		  object:
 | ||
| 		  <filename>hash_initial_size.cc</filename>
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  A non-compiling example showing how not to resize a
 | ||
| 		  hash-based container object:
 | ||
| 		  <filename>hash_resize_neg.cc</filename>
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  Resizing a hash-based container object:
 | ||
| 		  <filename>hash_resize.cc</filename>
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  Showing an illegal resize of a hash-based container
 | ||
| 		  object:
 | ||
| 		  <filename>hash_illegal_resize.cc</filename>
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  Changing the load factors of a hash-based container
 | ||
| 		  object: <filename>hash_load_set_change.cc</filename>
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 	    </itemizedlist>
 | ||
| 	  </section>
 | ||
| 
 | ||
| 	  <section xml:id="pbds.using.examples.container.hash.hashor">
 | ||
| 	    <info><title>Hashing Function Related</title></info>
 | ||
| 	    <para></para>
 | ||
| 
 | ||
| 	    <itemizedlist>
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  Using a modulo range-hashing function for the case of an
 | ||
| 		  unknown skewed key distribution:
 | ||
| 		  <filename>hash_mod.cc</filename>
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  Writing a range-hashing functor for the case of a known
 | ||
| 		  skewed key distribution:
 | ||
| 		  <filename>shift_mask.cc</filename>
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  Storing the hash value along with each key:
 | ||
| 		  <filename>store_hash.cc</filename>
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  Writing a ranged-hash functor:
 | ||
| 		  <filename>ranged_hash.cc</filename>
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 	    </itemizedlist>
 | ||
| 
 | ||
| 	  </section>
 | ||
| 
 | ||
| 	</section>
 | ||
| 
 | ||
| 	<section xml:id="pbds.using.examples.container.branch">
 | ||
| 	  <info><title>Branch-Based</title></info>
 | ||
| 
 | ||
| 
 | ||
| 	  <section xml:id="pbds.using.examples.container.branch.split">
 | ||
| 	    <info><title>split or join Related</title></info>
 | ||
| 
 | ||
| 	    <itemizedlist>
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  Joining two tree-based container objects:
 | ||
| 		  <filename>tree_join.cc</filename>
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  Splitting a PATRICIA trie container object:
 | ||
| 		  <filename>trie_split.cc</filename>
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  Order statistics while joining two tree-based container
 | ||
| 		  objects:
 | ||
| 		  <filename>tree_order_statistics_join.cc</filename>
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 	    </itemizedlist>
 | ||
| 
 | ||
| 	  </section>
 | ||
| 
 | ||
| 	  <section xml:id="pbds.using.examples.container.branch.invariants">
 | ||
| 	    <info><title>Node Invariants</title></info>
 | ||
| 
 | ||
| 	    <itemizedlist>
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  Using trees for order statistics:
 | ||
| 		  <filename>tree_order_statistics.cc</filename>
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  Augmenting trees to support operations on line
 | ||
| 		  intervals:
 | ||
| 		  <filename>tree_intervals.cc</filename>
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 	    </itemizedlist>
 | ||
| 
 | ||
| 	  </section>
 | ||
| 
 | ||
| 	  <section xml:id="pbds.using.examples.container.branch.trie">
 | ||
| 	    <info><title>trie</title></info>
 | ||
| 	    <itemizedlist>
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  Using a PATRICIA trie for DNA strings:
 | ||
| 		  <filename>trie_dna.cc</filename>
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 
 | ||
| 	      <listitem>
 | ||
| 		<para>
 | ||
| 		  Using a PATRICIA
 | ||
| 		  trie for finding all entries whose key matches a given prefix:
 | ||
| 		  <filename>trie_prefix_search.cc</filename>
 | ||
| 		</para>
 | ||
| 	      </listitem>
 | ||
| 	    </itemizedlist>
 | ||
| 
 | ||
| 	  </section>
 | ||
| 
 | ||
| 	</section>
 | ||
| 
 | ||
| 	<section xml:id="pbds.using.examples.container.priority_queue">
 | ||
| 	  <info><title>Priority Queues</title></info>
 | ||
| 	  <itemizedlist>
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		Cross referencing an associative container and a priority
 | ||
| 		queue: <filename>priority_queue_xref.cc</filename>
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		Cross referencing a vector and a priority queue using a
 | ||
| 		very simple version of Dijkstra's shortest path
 | ||
| 		algorithm:
 | ||
| 		<filename>priority_queue_dijkstra.cc</filename>
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 	  </itemizedlist>
 | ||
| 
 | ||
| 	</section>
 | ||
| 
 | ||
| 
 | ||
|       </section>
 | ||
| 
 | ||
|     </section>
 | ||
| 
 | ||
|   </section> <!-- using -->
 | ||
| 
 | ||
|   <!-- S03: Design -->
 | ||
| 
 | ||
| 
 | ||
| <section xml:id="containers.pbds.design">
 | ||
|   <info><title>Design</title></info>
 | ||
|   <?dbhtml filename="policy_data_structures_design.html"?>
 | ||
|   <para></para>
 | ||
| 
 | ||
|   <section xml:id="pbds.design.concepts">
 | ||
|     <info><title>Concepts</title></info>
 | ||
| 
 | ||
|     <section xml:id="pbds.design.concepts.null_type">
 | ||
|       <info><title>Null Policy Classes</title></info>
 | ||
| 
 | ||
|       <para>
 | ||
| 	Associative containers are typically parametrized by various
 | ||
| 	policies. For example, a hash-based associative container is
 | ||
| 	parametrized by a hash-functor, transforming each key into a
 | ||
| 	non-negative numerical type. Each such value is then further mapped
 | ||
| 	into a position within the table. The mapping of a key into a
 | ||
| 	position within the table is therefore a two-step process.
 | ||
|       </para>
 | ||
| 
 | ||
|       <para>
 | ||
| 	In some cases, instantiations are redundant. For example, when the
 | ||
| 	keys are integers, it is possible to use a redundant hash policy,
 | ||
| 	which transforms each key into its value.
 | ||
|       </para>
 | ||
| 
 | ||
|       <para>
 | ||
| 	In some other cases, these policies are irrelevant.  For example, a
 | ||
| 	hash-based associative container might transform keys into positions
 | ||
| 	within a table by a different method than the two-step method
 | ||
| 	described above. In such a case, the hash functor is simply
 | ||
| 	irrelevant.
 | ||
|       </para>
 | ||
| 
 | ||
|       <para>
 | ||
| 	When a policy is either redundant or irrelevant, it can be replaced
 | ||
| 	by <classname>null_type</classname>.
 | ||
|       </para>
 | ||
| 
 | ||
|       <para>
 | ||
| 	For example, a <emphasis>set</emphasis> is an associative
 | ||
| 	container with one of its template parameters (the one for the
 | ||
| 	mapped type) replaced with <classname>null_type</classname>. Other
 | ||
| 	places where simplifications are made possible with this technique
 | ||
| 	include node updates in tree and trie data structures, and hash
 | ||
| 	and probe functions for hash data structures.
 | ||
|       </para>
 | ||
|     </section>
 | ||
| 
 | ||
|     <section xml:id="pbds.design.concepts.associative_semantics">
 | ||
|       <info><title>Map and Set Semantics</title></info>
 | ||
| 
 | ||
|       <section xml:id="concepts.associative_semantics.set_vs_map">
 | ||
| 	<info>
 | ||
| 	  <title>
 | ||
| 	    Distinguishing Between Maps and Sets
 | ||
| 	  </title>
 | ||
| 	</info>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Anyone familiar with the standard knows that there are four kinds
 | ||
| 	  of associative containers: maps, sets, multimaps, and
 | ||
| 	  multisets. The map datatype associates each key to
 | ||
| 	  some data.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Sets are associative containers that simply store keys -
 | ||
| 	  they do not map them to anything. In the standard, each map class
 | ||
| 	  has a corresponding set class. E.g.,
 | ||
| 	  <classname>std::map<int, char></classname> maps each
 | ||
| 	  <classname>int</classname> to a <classname>char</classname>, but
 | ||
| 	  <classname>std::set<int></classname> simply stores
 | ||
| 	  <classname>int</classname>s. In this library, however, there are no
 | ||
| 	  distinct classes for maps and sets. Instead, an associative
 | ||
| 	  container's <classname>Mapped</classname> template parameter is a policy: if
 | ||
| 	  it is instantiated by <classname>null_type</classname>, then it
 | ||
| 	  is a "set"; otherwise, it is a "map". E.g.,
 | ||
| 	</para>
 | ||
| 	<programlisting>
 | ||
| 	  cc_hash_table<int, char>
 | ||
| 	</programlisting>
 | ||
| 	<para>
 | ||
| 	  is a "map" mapping each <type>int</type> value to a
 | ||
| 	  <type>char</type>, but
 | ||
| 	</para>
 | ||
| 	<programlisting>
 | ||
| 	  cc_hash_table<int, null_type>
 | ||
| 	</programlisting>
 | ||
| 	<para>
 | ||
| 	  is a type that uniquely stores <type>int</type> values.
 | ||
| 	</para>
 | ||
| 	<para>Once the <classname>Mapped</classname> template parameter is instantiated
 | ||
| 	by <classname>null_type</classname>, then
 | ||
| 	the "set" acts very similarly to the standard's sets - it does not
 | ||
| 	map each key to a distinct <classname>null_type</classname> object. Also,
 | ||
| 	the container's <type>value_type</type> is essentially
 | ||
| 	its <type>key_type</type> - just as with the standard's sets.
 | ||
| 	</para>
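 | ||
| 	<para>
 | ||
| 	  The difference between the two instantiations can be sketched as
 | ||
| 	  follows. The typedef names and the
 | ||
| 	  <function>count_distinct</function> helper are hypothetical and
 | ||
| 	  serve only to illustrate the "set" form.
 | ||
| 	</para>
 | ||
| 	<programlisting>
```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#include <ext/pb_ds/assoc_container.hpp>

// Same container template, two roles: the Mapped parameter decides.
typedef __gnu_pbds::cc_hash_table<int, char>                  int_to_char_map;
typedef __gnu_pbds::cc_hash_table<int, __gnu_pbds::null_type> int_set;

// Counts the distinct values in a sequence by storing them in a "set".
std::size_t
count_distinct(const std::vector<int>& values)
{
  int_set seen;
  for (std::size_t i = 0; i < values.size(); ++i)
    seen.insert(values[i]);   // a key is inserted directly, not a pair

  assert(seen.size() <= values.size());
  return seen.size();
}
```
| 	</programlisting>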
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  The standard's multimaps and multisets allow, respectively,
 | ||
| 	  non-uniquely mapping keys and non-uniquely storing keys. As
 | ||
| 	  discussed, the
 | ||
| 	  reasons why this might be necessary are 1) that a key might be
 | ||
| 	  decomposed into a primary key and a secondary key, 2) that a
 | ||
| 	  key might appear more than once, or 3) any arbitrary
 | ||
| 	  combination of 1)s and 2)s. Correspondingly,
 | ||
| 	  one should use 1) "maps" mapping primary keys to secondary
 | ||
| 	  keys, 2) "maps" mapping keys to size types, or 3) any arbitrary
 | ||
| 	  combination of 1)s and 2)s. Thus, for example, an
 | ||
| 	  <classname>std::multiset<int></classname> might be used to store
 | ||
| 	  multiple instances of integers, but using this library's
 | ||
| 	  containers, one might use
 | ||
| 	</para>
 | ||
| 	<programlisting>
 | ||
| 	  tree<int, size_t>
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  i.e., a <classname>map</classname> of <type>int</type>s to
 | ||
| 	  <type>size_t</type>s.
 | ||
| 	</para>
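 | ||
| 	<para>
 | ||
| 	  A minimal sketch of such a counting "multiset" follows. The
 | ||
| 	  helper functions <function>add_one</function>,
 | ||
| 	  <function>remove_one</function>, and
 | ||
| 	  <function>count_of</function> are hypothetical; the library itself
 | ||
| 	  provides only the underlying map.
 | ||
| 	</para>
 | ||
| 	<programlisting>
```cpp
#include <cassert>
#include <cstddef>
#include <ext/pb_ds/assoc_container.hpp>

// Each key is mapped to the number of times it is "stored".
typedef __gnu_pbds::tree<int, std::size_t> counted_set;

// Adds one instance of key; a new key starts from a count of 0.
void
add_one(counted_set& s, int key)
{ ++s[key]; }

// Removes one instance of key; returns false if key is not stored.
bool
remove_one(counted_set& s, int key)
{
  counted_set::point_iterator it = s.find(key);
  if (it == s.end())
    return false;

  assert(it->second > 0);
  if (--it->second == 0)
    s.erase(key);
  return true;
}

// Returns how many instances of key are stored.
std::size_t
count_of(const counted_set& s, int key)
{
  counted_set::point_const_iterator it = s.find(key);
  return it == s.end() ? 0 : it->second;
}
```
| 	</programlisting>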
 | ||
| 	<para>
 | ||
| 	  These "multimaps" and "multisets" might be confusing to
 | ||
| 	  anyone familiar with the standard's <classname>std::multimap</classname> and
 | ||
| 	  <classname>std::multiset</classname>, because there is no clear
 | ||
| 	  correspondence between the two. For example, in some cases
 | ||
| 	  where one uses <classname>std::multiset</classname> in the standard, one might use
 | ||
| 	  in this library a "multimap" of "multisets" - i.e., a
 | ||
| 	  container that maps primary keys each to an associative
 | ||
| 	  container that maps each secondary key to the number of times
 | ||
| 	  it occurs.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  When one uses a "multimap," one should choose with care the
 | ||
| 	  type of container used for secondary keys.
 | ||
| 	</para>
 | ||
|       </section> <!-- map vs set -->
 | ||
| 
 | ||
| 
 | ||
|       <section xml:id="concepts.associative_semantics.multi">
 | ||
| 	<info><title>Alternatives to <classname>std::multiset</classname> and <classname>std::multimap</classname></title></info>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Brace oneself: this library does not contain containers like
 | ||
| 	  <classname>std::multimap</classname> or
 | ||
| 	  <classname>std::multiset</classname>. Instead, these data
 | ||
| 	  structures can be synthesized via manipulation of the
 | ||
| 	  <classname>Mapped</classname> template parameter.
 | ||
| 	</para>
 | ||
| 	<para>
 | ||
| 	  One maps the unique part of a key - the primary key, into an
 | ||
| 	  associative-container of the (originally) non-unique parts of
 | ||
| 	  the key - the secondary key. A primary associative-container
 | ||
| 	  is an associative container of primary keys; a secondary
 | ||
| 	  associative-container is an associative container of
 | ||
| 	  secondary keys.
 | ||
| 	</para>
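 | ||
| 	<para>
 | ||
| 	  As a rough sketch of this arrangement, assume records keyed by a
 | ||
| 	  client name (the primary key) and an account id (the secondary
 | ||
| 	  key), each holding a balance. The typedefs and the helpers
 | ||
| 	  <function>set_balance</function> and
 | ||
| 	  <function>total_balance</function> are hypothetical names chosen
 | ||
| 	  for illustration.
 | ||
| 	</para>
 | ||
| 	<programlisting>
```cpp
#include <cassert>
#include <string>
#include <ext/pb_ds/assoc_container.hpp>

// Secondary associative container: account id -> balance.
typedef __gnu_pbds::cc_hash_table<unsigned long, float> accounts_t;

// Primary associative container: client -> that client's accounts.
typedef __gnu_pbds::cc_hash_table<std::string, accounts_t> bank_t;

// Records the balance of one account of one client.
void
set_balance(bank_t& bank, const std::string& client,
            unsigned long account_id, float balance)
{
  bank[client][account_id] = balance;
  assert(bank.find(client) != bank.end());
}

// Sums all accounts of one client without touching other clients.
float
total_balance(const bank_t& bank, const std::string& client)
{
  bank_t::point_const_iterator p = bank.find(client);
  if (p == bank.end())
    return 0.0f;

  const accounts_t& accounts = p->second;
  float total = 0.0f;
  for (accounts_t::const_iterator it = accounts.begin();
       it != accounts.end(); ++it)
    total += it->second;
  return total;
}
```
| 	</programlisting>
 | ||
| 	<para>
 | ||
| 	  Listing the accounts of one client requires a single look-up in
 | ||
| 	  the primary container followed by iteration over that client's
 | ||
| 	  secondary container only.
 | ||
| 	</para>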
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Let us step back a bit and start from the beginning.
 | ||
| 	</para>
 | ||
| 
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Maps (or sets) allow mapping (or storing) unique-key values.
 | ||
| 	  The standard library also supplies associative containers which
 | ||
| 	  map (or store) multiple values with equivalent keys:
 | ||
| 	  <classname>std::multimap</classname>, <classname>std::multiset</classname>,
 | ||
| 	  <classname>std::tr1::unordered_multimap</classname>, and
 | ||
| 	  <classname>std::tr1::unordered_multiset</classname>. We first discuss how these might
 | ||
| 	  be used, then why we think it is best to avoid them.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Suppose one builds a simple bank-account application that
 | ||
| 	  records, for each client (identified by a <classname>std::string</classname>)
 | ||
| 	  and account-id (identified by an <type>unsigned long</type>),
 | ||
| 	  the balance in the account (stored as a
 | ||
| 	  <type>float</type>). Suppose further that ordering this
 | ||
| 	  information is not useful, so a hash-based container is
 | ||
| 	  preferable to a tree based container. Then one can use
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<programlisting>
 | ||
| 	  std::tr1::unordered_map<std::pair<std::string, unsigned long>, float, ...>
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  which hashes every combination of client and account-id. This
 | ||
| 	  might work well, except for the fact that it is now impossible
 | ||
| 	  to efficiently list all of the accounts of a specific client
 | ||
| 	  (this would practically require iterating over all
 | ||
| 	  entries). Instead, one can use
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<programlisting>
 | ||
| 	  std::tr1::unordered_multimap<std::pair<std::string, unsigned long>, float, ...>
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  which hashes every client, and decides equivalence based on
 | ||
| 	  client only. This will ensure that all accounts belonging to a
 | ||
| 	  specific user are stored consecutively.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Also, suppose one wants an integers' priority queue
 | ||
| 	  (a container that supports <function>push</function>,
 | ||
| 	  <function>pop</function>, and <function>top</function> operations, the last of which
 | ||
| 	  returns the largest <type>int</type>) that also supports
 | ||
| 	  operations such as <function>find</function> and <function>lower_bound</function>. A
 | ||
| 	  reasonable solution is to build an adapter over
 | ||
| 	  <classname>std::set<int></classname>. In this adapter,
 | ||
| 	  <function>push</function> will just call the tree-based
 | ||
| 	  associative container's <function>insert</function> method; <function>pop</function>
 | ||
| 	  will call its <function>end</function> method, and use it to return the
 | ||
| 	  preceding element (which must be the largest). Then this might
 | ||
| 	  work well, except that the container object cannot hold
 | ||
| 	  multiple instances of the same integer (<function>push(4)</function>
 | ||
| 	  will be a no-op if <constant>4</constant> is already in the
 | ||
| 	  container object). If multiple keys are necessary, then one
 | ||
| 	  might build the adapter over an
 | ||
| 	  <classname>std::multiset<int></classname>.
 | ||
| 	</para>
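	<para>
	  The adapter described above might look roughly like the
	  following sketch, here built over
	  <classname>std::multiset</classname> so that repeated values are
	  kept. The class name <classname>searchable_int_pq</classname> and
	  its exact interface are assumptions made for illustration.
	</para>

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical adapter: a max-priority queue of ints that also offers
// find and lower_bound by delegating to a std::multiset.
class searchable_int_pq
{
public:
  typedef std::multiset<int>::const_iterator const_iterator;

  // push maps onto the tree's insert; the multiset keeps duplicates.
  void push(int v) { m_set.insert(v); }

  // The largest element is the one immediately preceding end().
  int top() const
  {
    assert(!m_set.empty());
    return *m_set.rbegin();
  }

  // Erase exactly one instance of the largest element.
  void pop()
  {
    assert(!m_set.empty());
    std::multiset<int>::iterator last = m_set.end();
    --last;
    m_set.erase(last);
  }

  const_iterator find(int v) const { return m_set.find(v); }
  const_iterator lower_bound(int v) const { return m_set.lower_bound(v); }
  const_iterator end() const { return m_set.end(); }
  std::size_t size() const { return m_set.size(); }

private:
  std::multiset<int> m_set;
};
```

	<para>
	  Replacing the <classname>std::multiset</classname> member with a
	  <classname>std::set</classname> gives the unique-key variant, in
	  which a second <function>push</function> of the same value has no
	  effect.
	</para>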
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  The standard library's non-unique-mapping containers are useful
 | ||
| 	  when (1) a key can be decomposed into a primary key and a
 | ||
| 	  secondary key, (2) a key is needed multiple times, or (3) any
 | ||
| 	  combination of (1) and (2).
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  The graphic below shows how the standard library's container
 | ||
| 	  design works internally; in this figure nodes shaded equally
 | ||
| 	  represent equivalent-key values. Equivalent keys are stored
 | ||
| 	  consecutively using the properties of the underlying data
 | ||
| 	  structure: binary search trees (label A) store equivalent-key
 | ||
| 	  values consecutively (in the sense of an in-order walk)
 | ||
| 	  naturally; collision-chaining hash tables (label B) store
 | ||
| 	  equivalent-key values in the same bucket; the bucket can be
 | ||
| 	  arranged so that equivalent-key values are consecutive.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<figure>
 | ||
| 	  <title>Non-unique Mapping Standard Containers</title>
 | ||
| 	  <mediaobject>
 | ||
| 	    <imageobject>
 | ||
| 	      <imagedata align="center" format="PNG" scale="100"
 | ||
| 			 fileref="../images/pbds_embedded_lists_1.png"/>
 | ||
| 	    </imageobject>
 | ||
| 	    <textobject>
 | ||
| 	      <phrase>Non-unique Mapping Standard Containers</phrase>
 | ||
| 	    </textobject>
 | ||
| 	  </mediaobject>
 | ||
| 	</figure>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Put differently, the standard's non-unique mapping
 | ||
| 	  associative-containers are associative containers that map
 | ||
| 	  primary keys to linked lists that are embedded into the
 | ||
| 	  container. The graphic below shows again the two
 | ||
| 	  containers from the first graphic above, this time with
 | ||
| 	  the embedded linked lists of the grayed nodes marked
 | ||
| 	  explicitly.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<figure xml:id="fig.pbds_embedded_lists_2">
 | ||
| 	  <title>
 | ||
| 	    Effect of embedded lists in
 | ||
| 	    <classname>std::multimap</classname>
 | ||
| 	  </title>
 | ||
| 	  <mediaobject>
 | ||
| 	    <imageobject>
 | ||
| 	      <imagedata align="center" format="PNG" scale="100"
 | ||
| 			 fileref="../images/pbds_embedded_lists_2.png"/>
 | ||
| 	    </imageobject>
 | ||
| 	    <textobject>
 | ||
| 	      <phrase>
 | ||
| 		Effect of embedded lists in
 | ||
| 		<classname>std::multimap</classname>
 | ||
| 	      </phrase>
 | ||
| 	    </textobject>
 | ||
| 	  </mediaobject>
 | ||
| 	</figure>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  These embedded linked lists have several disadvantages.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<orderedlist>
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      The underlying data structure embeds the linked lists
 | ||
| 	      according to its own consideration, which means that the
 | ||
| 	      search path for a value might include several different
 | ||
| 	      equivalent-key values. For example, the search path for the
 | ||
| 	      black node in label A or label B of the first graphic
 | ||
| 	      includes more than a single gray node.
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      The links of the linked lists are the underlying data
 | ||
| 	      structures' nodes, which typically are quite structured.  In
 | ||
| 	      the case of tree-based containers (the graphic above, label
 | ||
| 	      A), each "link" is actually a node with three pointers (one
 | ||
| 	      to a parent and two to children), and a
 | ||
| 	      relatively-complicated iteration algorithm. The linked
 | ||
| 	      lists, therefore, can take up quite a lot of memory, and
 | ||
| 	      iterating over all values equal to a given key (through the
 | ||
| 	      return value of the standard
 | ||
| 	      library's <function>equal_range</function>) can be
 | ||
| 	      expensive.
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      The primary key is stored multiple times; this uses more memory.
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      Finally, the interface of this design excludes several
 | ||
| 	      useful underlying data structures. Of all the unordered
 | ||
| 	      self-organizing data structures, practically only
 | ||
| 	      collision-chaining hash tables can (efficiently) guarantee
 | ||
| 	      that equivalent-key values are stored consecutively.
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 	</orderedlist>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  The above reasons hold even when the ratio of secondary keys to
 | ||
| 	  primary keys (or average number of identical keys) is small, but
 | ||
| 	  when it is large, there are more severe problems:
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<orderedlist>
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      The underlying data structures order the links inside each
 | ||
| 	      embedded linked list according to their internal
 | ||
| 	      considerations, which effectively means that each of the
 | ||
| 	      lists is unordered. Irrespective of the underlying data
 | ||
| 	      structure, searching for a specific value can degrade to
 | ||
| 	      linear complexity.
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      Similarly to the above point, it is impossible to apply
 | ||
| 	      to the secondary keys considerations that apply to primary
 | ||
| 	      keys. For example, it is not possible to maintain secondary
 | ||
| 	      keys in sorted order.
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      While the interface "understands" that all equivalent-key
 | ||
| 	      values constitute a distinct list (through
 | ||
| 	      <function>equal_range</function>), the underlying data
 | ||
| 	      structure typically does not. This means that operations such
 | ||
| 	      as erasing from a tree-based container all values whose keys
 | ||
| 	      are equivalent to a given key can be super-linear in the
 | ||
| 	      size of the tree; this is also true for several other
 | ||
| 	      operations that target a specific list.
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	</orderedlist>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  In this library, all associative containers map
 | ||
| 	  (or store) unique-key values. One can (1) map primary keys to
 | ||
| 	  secondary associative-containers (containers of
 | ||
| 	  secondary keys) or non-associative containers, (2) map identical
 | ||
| 	  keys to a size-type representing the number of times they
 | ||
| 	  occur, or (3) any combination of (1) and (2). Instead of
 | ||
| 	  allowing multiple equivalent-key values, this library
 | ||
| 	  supplies associative containers based on underlying
 | ||
| 	  data structures that are suitable as secondary
 | ||
| 	  associative-containers.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  In the figure below, labels A and B show the equivalent
 | ||
| 	  underlying data structures in this library, as mapped to the
 | ||
| 	  first graphic above, labels A and B, respectively. Each shaded
 | ||
| 	  box represents some size-type or secondary
 | ||
| 	  associative-container.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<figure>
 | ||
| 	  <title>Non-unique Mapping Containers</title>
 | ||
| 	  <mediaobject>
 | ||
| 	    <imageobject>
 | ||
| 	      <imagedata align="center" format="PNG" scale="100"
 | ||
| 			 fileref="../images/pbds_embedded_lists_3.png"/>
 | ||
| 	    </imageobject>
 | ||
| 	    <textobject>
 | ||
| 	      <phrase>Non-unique Mapping Containers</phrase>
 | ||
| 	    </textobject>
 | ||
| 	  </mediaobject>
 | ||
| 	</figure>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  In the first example above, then, one would use an associative
 | ||
| 	  container mapping each client to an associative container which
 | ||
| 	  maps each of that client's account ids to its value (see
 | ||
| 	  <filename>example/basic_multimap.cc</filename>); in the second
 | ||
| 	  example, one would use an associative container mapping
 | ||
| 	  each <classname>int</classname> to some size-type indicating the
 | ||
| 	  number of times it logically occurs
 | ||
| 	  (see <filename>example/basic_multiset.cc</filename>).
 | ||
| 	</para>
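	<para>
	  Both arrangements can be sketched with any unique-key
	  associative container. The example below uses
	  <classname>std::map</classname> purely as a stand-in (the
	  containers of this library could be substituted); the type and
	  function names are hypothetical.
	</para>

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// (1) Primary key -> secondary associative container:
// client -> (account id -> value). Names are illustrative.
typedef std::map<unsigned long, float> account_map;
typedef std::map<std::string, account_map> client_map;

// Listing one client's accounts is a single primary lookup.
inline std::size_t
num_accounts(const client_map& m, const std::string& client)
{
  client_map::const_iterator it = m.find(client);
  return it == m.end() ? 0 : it->second.size();
}

// (2) Key -> number of logical occurrences.
class counted_int_set
{
public:
  void insert(int v) { ++m_counts[v]; }

  // Remove one occurrence; drop the key once its count reaches zero.
  bool erase_one(int v)
  {
    std::map<int, std::size_t>::iterator it = m_counts.find(v);
    if (it == m_counts.end())
      return false;
    assert(it->second > 0); // stored counts are always positive
    if (--it->second == 0)
      m_counts.erase(it);
    return true;
  }

  std::size_t count(int v) const
  {
    std::map<int, std::size_t>::const_iterator it = m_counts.find(v);
    return it == m_counts.end() ? 0 : it->second;
  }

private:
  std::map<int, std::size_t> m_counts;
};
```

	<para>
	  In both cases each primary key is stored once, and the secondary
	  container (or counter) can be chosen independently of the
	  primary one.
	</para>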
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  See the discussion in list-based container types for containers
 | ||
| 	  especially suited as secondary associative-containers.
 | ||
| 	</para>
 | ||
|       </section>
 | ||
| 
 | ||
|     </section> <!-- map and set semantics -->
 | ||
| 
 | ||
|     <section xml:id="pbds.design.concepts.iterator_semantics">
 | ||
|       <info><title>Iterator Semantics</title></info>
 | ||
| 
 | ||
|       <section xml:id="concepts.iterator_semantics.point_and_range">
 | ||
| 	<info><title>Point and Range Iterators</title></info>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Iterator concepts are bifurcated in this design, and are
 | ||
| 	  composed of point-type and range-type iteration.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  A point-type iterator is an iterator that refers to a specific
 | ||
| 	  element as returned through an
 | ||
| 	  associative-container's <function>find</function> method.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  A range-type iterator is an iterator that is used to go over a
 | ||
| 	  sequence of elements, as returned by a container's
 | ||
| 	  <function>begin</function> method.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  A point-type method is a method that
 | ||
| 	  returns a point-type iterator; a range-type method is a method
 | ||
| 	  that returns a range-type iterator.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>For most containers, these types are synonymous; for
 | ||
| 	self-organizing containers, such as hash-based containers or
 | ||
| 	priority queues, these are inherently different (in any
 | ||
| 	implementation, including that of the C++ standard library
 | ||
| 	components), but in this design the difference is made explicit: they are
 | ||
| 	distinct types.
 | ||
| 	</para>
 | ||
|       </section>
 | ||
| 
 | ||
| 
 | ||
|       <section xml:id="concepts.iterator_semantics.both">
 | ||
| 	<info><title>Distinguishing Point and Range Iterators</title></info>
 | ||
| 
 | ||
| 	<para>When using this library, it is necessary to differentiate
 | ||
| 	between two types of methods and iterators: point-type methods and
 | ||
| 	iterators, and range-type methods and iterators. Each associative
 | ||
| 	container's interface includes the methods:</para>
 | ||
| 	<programlisting>
 | ||
| 	  point_const_iterator
 | ||
| 	  find(const_key_reference r_key) const;
 | ||
| 
 | ||
| 	  point_iterator
 | ||
| 	  find(const_key_reference r_key);
 | ||
| 
 | ||
| 	  std::pair<point_iterator,bool>
 | ||
| 	  insert(const_reference r_val);
 | ||
| 	</programlisting>
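	<para>
	  As a hedged illustration of these methods, the sketch below
	  uses the collision-chaining hash table. It assumes the library
	  is reachable through the header
	  <filename>ext/pb_ds/assoc_container.hpp</filename> and the
	  namespace <literal>__gnu_pbds</literal>, as in libstdc++; the
	  helper <function>insert_and_check</function> is a hypothetical
	  name.
	</para>

```cpp
#include <cassert>
#include <utility>
#include <ext/pb_ds/assoc_container.hpp>

typedef __gnu_pbds::cc_hash_table<int, char> table_t;

// Insert (key, value) if the key is absent, then look the key up again.
// Returns true when the key was newly inserted.
inline bool insert_and_check(table_t& t, int key, char value)
{
  // insert yields a point-type iterator plus a "was inserted" flag.
  std::pair<table_t::point_iterator, bool> res =
    t.insert(std::make_pair(key, value));

  // find yields a point-type iterator as well: it can be compared with
  // end() and dereferenced, which is all this function needs.
  table_t::point_iterator it = t.find(key);
  assert(it != t.end());
  assert(it->first == key);
  assert(res.first->first == key);
  return res.second;
}
```

	<para>
	  The function never advances the iterators it obtains, so it
	  relies only on what a point-type iterator provides.
	</para>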
 | ||
| 
 | ||
| 	<para>The relationship between these iterator types varies between
 | ||
| 	container types. The figure below
 | ||
| 	shows the most general invariant between point-type and
 | ||
| 	range-type iterators: in <emphasis>A</emphasis>, <literal>iterator</literal> can
 | ||
| 	always be converted to <literal>point_iterator</literal>. <emphasis>B</emphasis>
 | ||
| 	shows invariants for order-preserving containers: point-type
 | ||
| 	iterators are synonymous with range-type iterators.
 | ||
| 	Orthogonally, <emphasis>C</emphasis> shows invariants for "set"
 | ||
| 	containers: iterators are synonymous with const iterators.</para>
 | ||
| 
 | ||
| 	<figure>
 | ||
| 	  <title>Point Iterator Hierarchy</title>
 | ||
| 	  <mediaobject>
 | ||
| 	    <imageobject>
 | ||
| 	      <imagedata align="center" format="PNG" scale="100"
 | ||
| 			 fileref="../images/pbds_point_iterator_hierarchy.png"/>
 | ||
| 	    </imageobject>
 | ||
| 	    <textobject>
 | ||
| 	      <phrase>Point Iterator Hierarchy</phrase>
 | ||
| 	    </textobject>
 | ||
| 	  </mediaobject>
 | ||
| 	</figure>
 | ||
| 
 | ||
| 
 | ||
| 	<para>Note that point-type iterators in self-organizing containers
 | ||
| 	(hash-based associative containers) lack movement
 | ||
| 	operators, such as <literal>operator++</literal> - in fact, this
 | ||
| 	is the reason why this library differs from the standard C++ library's
 | ||
| 	design on this point.</para>
 | ||
| 
 | ||
| 	<para>Typically, one can determine an iterator's movement
 | ||
| 	capabilities using
 | ||
| 	<literal>std::iterator_traits<It>::iterator_category</literal>,
 | ||
| 	which is a <literal>struct</literal> indicating the iterator's
 | ||
| 	movement capabilities. Unfortunately, none of the standard predefined
 | ||
| 	categories reflect an iterator's <emphasis>not</emphasis> having any
 | ||
| 	movement capabilities whatsoever. Consequently,
 | ||
| 	<literal>pb_ds</literal> adds a type
 | ||
| 	<literal>trivial_iterator_tag</literal> (whose name is taken from
 | ||
| 	a concept in C++ standardese), which is the category of iterators
 | ||
| 	with no movement capabilities. All other standard C++ library
 | ||
| 	tags, such as <literal>forward_iterator_tag</literal>, retain their
 | ||
| 	common use.</para>
 | ||
| 
 | ||
|       </section>
 | ||
| 
 | ||
|       <section xml:id="pbds.design.concepts.invalidation">
 | ||
| 	<info><title>Invalidation Guarantees</title></info>
 | ||
| 	<para>
 | ||
| 	  If one manipulates a container object, then iterators previously
 | ||
| 	  obtained from it can be invalidated. In some cases a
 | ||
| 	  previously-obtained iterator cannot be de-referenced; in other cases,
 | ||
| 	  the iterator's next or previous element might have changed
 | ||
| 	  unpredictably. This corresponds exactly to the question whether a
 | ||
| 	  point-type or range-type iterator (see previous concept) is valid or
 | ||
| 	  not. In this design, one can query a container (at compile time) about
 | ||
| 	  its invalidation guarantees.
 | ||
| 	</para>
 | ||
| 
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Given three different types of associative containers, a modifying
 | ||
| 	  operation (in that example, <function>erase</function>) invalidated
 | ||
| 	  iterators in three different ways: the iterator of one container
 | ||
| 	  remained completely valid - it could be de-referenced and
 | ||
| 	  incremented; the iterator of a different container could not even be
 | ||
| 	  de-referenced; the iterator of the third container could be
 | ||
| 	  de-referenced, but its "next" iterator changed unpredictably.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Distinguishing between point and range types allows fine-grained
 | ||
| 	  invalidation guarantees, because these questions correspond exactly
 | ||
| 	  to the question of whether point-type iterators and range-type
 | ||
| 	  iterators are valid. The graphic below shows tags corresponding to
 | ||
| 	  different types of invalidation guarantees.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<figure>
 | ||
| 	  <title>Invalidation Guarantee Tags Hierarchy</title>
 | ||
| 	  <mediaobject>
 | ||
| 	    <imageobject>
 | ||
| 	      <imagedata align="center" format="PDF" scale="75"
 | ||
| 			 fileref="../images/pbds_invalidation_tag_hierarchy.pdf"/>
 | ||
| 	    </imageobject>
 | ||
| 	    <imageobject>
 | ||
| 	      <imagedata align="center" format="PNG" scale="100"
 | ||
| 			 fileref="../images/pbds_invalidation_tag_hierarchy.png"/>
 | ||
| 	    </imageobject>
 | ||
| 	    <textobject>
 | ||
| 	      <phrase>Invalidation Guarantee Tags Hierarchy</phrase>
 | ||
| 	    </textobject>
 | ||
| 	  </mediaobject>
 | ||
| 	</figure>
 | ||
| 
 | ||
| 	<itemizedlist>
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      <classname>basic_invalidation_guarantee</classname>
 | ||
| 	      corresponds to a basic guarantee that a point-type iterator,
 | ||
| 	      a found pointer, or a found reference, remains valid as long
 | ||
| 	      as the container object is not modified.
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      <classname>point_invalidation_guarantee</classname>
 | ||
| 	      corresponds to a guarantee that a point-type iterator, a
 | ||
| 	      found pointer, or a found reference, remains valid even if
 | ||
| 	      the container object is modified.
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      <classname>range_invalidation_guarantee</classname>
 | ||
| 	      corresponds to a guarantee that a range-type iterator remains
 | ||
| 	      valid even if the container object is modified.
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 	</itemizedlist>
 | ||
| 
 | ||
| 	<para>To find the invalidation guarantee of a
 | ||
| 	container, one can use</para>
 | ||
| 	<programlisting>
 | ||
| 	  typename container_traits<Cntnr>::invalidation_guarantee
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>Note that this hierarchy corresponds to the logic it
 | ||
| 	represents: if a container has range-invalidation guarantees,
 | ||
| 	then it must also have point-invalidation guarantees;
 | ||
| 	correspondingly, its invalidation guarantee (in this case
 | ||
| 	<classname>range_invalidation_guarantee</classname>)
 | ||
| 	can be cast to its base class (in this case <classname>point_invalidation_guarantee</classname>).
 | ||
| 	This means that this hierarchy can be used easily with
 | ||
| 	standard metaprogramming techniques, by specializing on the
 | ||
| 	type of <literal>invalidation_guarantee</literal>.</para>
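	<para>
	  One way to use the hierarchy is overload-based tag dispatch,
	  sketched below. The tag and trait names come from the text
	  above; the header
	  <filename>ext/pb_ds/assoc_container.hpp</filename>, the namespace
	  <literal>__gnu_pbds</literal>, and all function and typedef names
	  are assumptions made for illustration.
	</para>

```cpp
#include <cassert>
#include <ext/pb_ds/assoc_container.hpp>

// One overload per tag; overload resolution prefers the closest base,
// so each tag selects the overload written for it.
inline int guarantee_level(__gnu_pbds::basic_invalidation_guarantee)
{ return 0; }
inline int guarantee_level(__gnu_pbds::point_invalidation_guarantee)
{ return 1; }
inline int guarantee_level(__gnu_pbds::range_invalidation_guarantee)
{ return 2; }

// Query a container type at compile time and dispatch on the result.
template<typename Cntnr>
int guarantee_level_of()
{
  typedef typename
    __gnu_pbds::container_traits<Cntnr>::invalidation_guarantee tag;
  return guarantee_level(tag());
}

// Code that only needs point-type iterators to survive modification can
// be written against point_invalidation_guarantee; the range tag converts
// to it because it is a derived class.
inline bool point_iterators_survive(__gnu_pbds::point_invalidation_guarantee)
{ return true; }
inline bool point_iterators_survive(__gnu_pbds::basic_invalidation_guarantee)
{ return false; }

typedef __gnu_pbds::cc_hash_table<int, int> chained_table;
typedef __gnu_pbds::gp_hash_table<int, int> probing_table;
typedef __gnu_pbds::tree<int, int> tree_map;

// Usage sketch: the range tag also satisfies the point-level overload.
inline void guarantee_demo()
{
  assert(point_iterators_survive(__gnu_pbds::range_invalidation_guarantee()));
  assert(!point_iterators_survive(__gnu_pbds::basic_invalidation_guarantee()));
}
```

	<para>
	  A generic algorithm can call
	  <function>guarantee_level_of</function> on its container
	  parameter and choose, for example, whether previously obtained
	  iterators may be cached across modifications.
	</para>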
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  These types of problems were addressed, in a more general
 | ||
| 	  setting, in <xref linkend="biblio.meyers96more"/> - Item 2. In
 | ||
| 	  our opinion, an invalidation-guarantee hierarchy would solve
 | ||
| 	  these problems in all container types - not just associative
 | ||
| 	  containers.
 | ||
| 	</para>
 | ||
| 
 | ||
|       </section>
 | ||
|     </section> <!-- iterator semantics -->
 | ||
| 
 | ||
|     <section xml:id="pbds.design.concepts.genericity">
 | ||
|       <info><title>Genericity</title></info>
 | ||
| 
 | ||
|       <para>
 | ||
| 	The design attempts to address the following problem of
 | ||
| 	data-structure genericity. When writing a function manipulating
 | ||
| 	a generic container object, what is the behavior of the object?
 | ||
| 	Suppose one writes
 | ||
|       </para>
 | ||
|       <programlisting>
 | ||
| 	template<typename Cntnr>
 | ||
| 	void
 | ||
| 	some_op_sequence(Cntnr &r_container)
 | ||
| 	{
 | ||
| 	...
 | ||
| 	}
 | ||
|       </programlisting>
 | ||
| 
 | ||
|       <para>
 | ||
| 	then one needs to address the following questions in the body
 | ||
| 	of <function>some_op_sequence</function>:
 | ||
|       </para>
 | ||
| 
 | ||
|       <itemizedlist>
 | ||
| 	<listitem>
 | ||
| 	  <para>
 | ||
| 	    Which types and methods does <literal>Cntnr</literal> support?
 | ||
| 	    Containers based on hash tables can be queried for the
 | ||
| 	    hash-functor type and object; this is meaningless for tree-based
 | ||
| 	    containers. Containers based on trees can be split, joined, or
 | ||
| 	    can erase iterators and return the following iterator; this
 | ||
| 	    cannot be done by hash-based containers.
 | ||
| 	  </para>
 | ||
| 	</listitem>
 | ||
| 
 | ||
| 	<listitem>
 | ||
| 	  <para>
 | ||
| 	    What are the exception and invalidation guarantees
 | ||
| 	    of <literal>Cntnr</literal>? A container based on a probing
 | ||
| 	    hash-table invalidates all iterators when it is modified; this
 | ||
| 	    is not the case for containers based on node-based
 | ||
| 	    trees. Containers based on a node-based tree can be split or
 | ||
| 	    joined without exceptions; this is not the case for containers
 | ||
| 	    based on vector-based trees.
 | ||
| 	  </para>
 | ||
| 	</listitem>
 | ||
| 
 | ||
| 	<listitem>
 | ||
| 	  <para>
 | ||
| 	    How does the container maintain its elements? Tree-based and
 | ||
| 	    trie-based containers store elements by key order; others,
 | ||
| 	    typically, do not. Containers based on splay trees or lists
 | ||
| 	    with update policies "cache" "frequently accessed" elements;
 | ||
| 	    containers based on most other underlying data structures do
 | ||
| 	    not.
 | ||
| 	  </para>
 | ||
| 	</listitem>
 | ||
| 	<listitem>
 | ||
| 	  <para>
 | ||
| 	    How does one query a container about characteristics and
 | ||
| 	    capabilities? What is the relationship between two different
 | ||
| 	    data structures, if any?
 | ||
| 	  </para>
 | ||
| 	</listitem>
 | ||
|       </itemizedlist>
 | ||
| 
 | ||
|       <para>The remainder of this section explains these issues in
 | ||
|       detail.</para>
 | ||
| 
 | ||
| 
 | ||
|       <section xml:id="concepts.genericity.tag">
 | ||
| 	<info><title>Tag</title></info>
 | ||
| 	<para>
 | ||
| 	  Tags are very useful for manipulating generic types. For example, if
 | ||
| 	  <literal>It</literal> is an iterator class, then <literal>typename
 | ||
| 	  It::iterator_category</literal> or <literal>typename
 | ||
| 	  std::iterator_traits<It>::iterator_category</literal> will
 | ||
| 	  yield its category, and <literal>typename
 | ||
| 	  std::iterator_traits<It>::value_type</literal> will yield its
 | ||
| 	  value type.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  This library contains a container tag hierarchy corresponding to the
 | ||
| 	  diagram below.
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<figure>
 | ||
| 	  <title>Container Tag Hierarchy</title>
 | ||
| 	  <mediaobject>
 | ||
| 	    <imageobject>
 | ||
| 	      <imagedata align="center" format="PDF" scale="75"
 | ||
| 			 fileref="../images/pbds_container_tag_hierarchy.pdf"/>
 | ||
| 	    </imageobject>
 | ||
| 	    <imageobject>
 | ||
| 	      <imagedata align="center" format="PNG" scale="100"
 | ||
| 			 fileref="../images/pbds_container_tag_hierarchy.png"/>
 | ||
| 	    </imageobject>
 | ||
| 	    <textobject>
 | ||
| 	      <phrase>Container Tag Hierarchy</phrase>
 | ||
| 	    </textobject>
 | ||
| 	  </mediaobject>
 | ||
| 	</figure>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Given any container <type>Cntnr</type>, the tag of
 | ||
| 	  the underlying data structure can be found via <literal>typename
 | ||
| 	  Cntnr::container_category</literal>.
 | ||
| 	</para>
 | ||
| 
 | ||
|       </section> <!-- tag -->
 | ||
| 
 | ||
|       <section xml:id="concepts.genericity.traits">
 | ||
| 	<info><title>Traits</title></info>
 | ||
| 	<para></para>
 | ||
| 
 | ||
| 	<para>Additionally, a traits mechanism can be used to query a
 | ||
| 	container type for its attributes. Given any container
 | ||
| 	<literal>Cntnr</literal>, then <literal>container_traits<Cntnr></literal>
 | ||
| 	is a traits class identifying the properties of the
 | ||
| 	container.</para>
 | ||
| 
 | ||
| 	<para>To find if a container can throw when a key is erased (which
 | ||
| 	is true for vector-based trees, for example), one can
 | ||
| 	use
 | ||
| 	</para>
 | ||
| 	<programlisting>container_traits<Cntnr>::erase_can_throw</programlisting>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  Some of the definitions in <classname>container_traits</classname>
 | ||
| 	  are dependent on other
 | ||
| 	  definitions. If <classname>container_traits<Cntnr>::order_preserving</classname>
 | ||
| 	  is <constant>true</constant> (which is the case for containers
 | ||
| 	  based on trees and tries), then the container can be split or
 | ||
| 	  joined; in this
 | ||
| 	  case, <classname>container_traits<Cntnr>::split_join_can_throw</classname>
 | ||
| 	  indicates whether splits or joins can throw exceptions (which is
 | ||
| 	  true for vector-based trees);
 | ||
| 	  otherwise <classname>container_traits<Cntnr>::split_join_can_throw</classname>
 | ||
| 	  will yield a compilation error. (This is somewhat similar to a
 | ||
| 	  compile-time version of the COM model).
 | ||
| 	</para>
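	<para>
	  A small sketch of such queries follows. It assumes the header
	  <filename>ext/pb_ds/assoc_container.hpp</filename> and the
	  namespace <literal>__gnu_pbds</literal>; the wrapper functions
	  and typedef names are hypothetical.
	</para>

```cpp
#include <cassert>
#include <ext/pb_ds/assoc_container.hpp>

typedef __gnu_pbds::tree<int, int> tree_map;          // tree-based
typedef __gnu_pbds::cc_hash_table<int, int> hash_map; // collision chaining

// Does the container keep its elements in key order?
template<typename Cntnr>
bool keeps_key_order()
{ return __gnu_pbds::container_traits<Cntnr>::order_preserving; }

// Can erasing a key throw?
template<typename Cntnr>
bool erase_may_throw()
{ return __gnu_pbds::container_traits<Cntnr>::erase_can_throw; }

// Usage sketch: trees preserve order, hash tables do not.
inline void traits_demo()
{
  assert(keeps_key_order<tree_map>());
  assert(!keeps_key_order<hash_map>());
}
```

	<para>
	  Because the values are compile-time constants, they can also
	  drive template specialization rather than run-time branches.
	</para>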
 | ||
| 
 | ||
|       </section> <!-- traits -->
 | ||
| 
 | ||
|     </section> <!-- genericity -->
 | ||
|   </section> <!-- concepts -->
 | ||
| 
 | ||
|   <section xml:id="pbds.design.container">
 | ||
|     <info><title>By Container</title></info>
 | ||
| 
 | ||
|     <!-- hash -->
 | ||
|     <section xml:id="pbds.design.container.hash">
 | ||
|       <info><title>hash</title></info>
 | ||
| 
 | ||
|       <!--
 | ||
| 
 | ||
| // hash policies
 | ||
| /// general terms / background
 | ||
| /// range hashing policies
 | ||
| /// ranged-hash policies
 | ||
| /// implementation
 | ||
| 
 | ||
| // resize policies
 | ||
| /// general
 | ||
| /// size policies
 | ||
| /// trigger policies
 | ||
| /// implementation
 | ||
| 
 | ||
| // policy interactions
 | ||
| /// probe/size/trigger
 | ||
| /// hash/trigger
 | ||
| /// eq/hash/storing hash values
 | ||
| /// size/load-check trigger
 | ||
|       -->
 | ||
|       <section xml:id="container.hash.interface">
 | ||
| 	<info><title>Interface</title></info>
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  The collision-chaining hash-based container has the
 | ||
| 	following declaration.</para>
 | ||
| 	<programlisting>
 | ||
| 	  template<
 | ||
| 	  typename Key,
 | ||
| 	  typename Mapped,
 | ||
| 	  typename Hash_Fn = std::hash<Key>,
 | ||
| 	  typename Eq_Fn = std::equal_to<Key>,
 | ||
| 	  typename Comb_Hash_Fn = direct_mask_range_hashing<>,
 | ||
| 	  typename Resize_Policy = /* default explained below */,
 | ||
| 	  bool Store_Hash = false,
 | ||
| 	  typename Allocator = std::allocator<char> >
 | ||
| 	  class cc_hash_table;
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>The parameters have the following meaning:</para>
 | ||
| 
 | ||
| 	<orderedlist>
 | ||
| 	  <listitem><para><classname>Key</classname> is the key type.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para><classname>Mapped</classname> is the mapped-policy.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para><classname>Hash_Fn</classname> is a key hashing functor.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para><classname>Eq_Fn</classname> is a key equivalence functor.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para><classname>Comb_Hash_Fn</classname> is a range-hashing functor;
 | ||
| 	  it describes how to translate hash values into positions
 | ||
| 	  within the table. </para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para><classname>Resize_Policy</classname> describes how a container object
 | ||
| 	  should change its internal size. </para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para><classname>Store_Hash</classname> indicates whether the hash value
 | ||
| 	  should be stored with each entry. </para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para><classname>Allocator</classname> is an allocator
 | ||
| 	  type.</para></listitem>
 | ||
| 	</orderedlist>
 | ||
| 
 | ||
| 	<para>The probing hash-based container has the following
 | ||
| 	declaration.</para>
 | ||
| 	<programlisting>
 | ||
| 	  template<
 | ||
| 	  typename Key,
 | ||
| 	  typename Mapped,
 | ||
| 	  typename Hash_Fn = std::hash<Key>,
 | ||
| 	  typename Eq_Fn = std::equal_to<Key>,
 | ||
| 	  typename Comb_Probe_Fn = direct_mask_range_hashing<>,
 | ||
| 	  typename Probe_Fn = /* default explained below */,
 | ||
| 	  typename Resize_Policy = /* default explained below */,
 | ||
| 	  bool Store_Hash = false,
 | ||
| 	  typename Allocator =  std::allocator<char> >
 | ||
| 	  class gp_hash_table;
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>The parameters are identical to those of the
 | ||
| 	collision-chaining container, except for the following.</para>
 | ||
| 
 | ||
| 	<orderedlist>
 | ||
| 	  <listitem><para><classname>Comb_Probe_Fn</classname> describes how to transform a probe
 | ||
| 	  sequence into a sequence of positions within the table.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para><classname>Probe_Fn</classname> describes a probe sequence policy.</para></listitem>
 | ||
| 	</orderedlist>
 | ||
| 
 | ||
| 	<para>Some of the default template values depend on the values of
 | ||
| 	other parameters, and are explained below.</para>
 | ||
| 
 | ||
|       </section>
 | ||
|       <section xml:id="container.hash.details">
 | ||
| 	<info><title>Details</title></info>
 | ||
| 
 | ||
| 	<section xml:id="container.hash.details.hash_policies">
 | ||
| 	  <info><title>Hash Policies</title></info>
 | ||
| 
 | ||
| 	  <section xml:id="details.hash_policies.general">
 | ||
| 	    <info><title>General</title></info>
 | ||
| 
 | ||
| 	    <para>Following is an explanation of some functions which hashing
 | ||
| 	    involves. The graphic below illustrates the discussion.</para>
 | ||
| 
 | ||
| 	    <figure>
 | ||
| 	      <title>Hash functions, ranged-hash functions, and
 | ||
| 	      range-hashing functions</title>
 | ||
| 	      <mediaobject>
 | ||
| 		<imageobject>
 | ||
| 		  <imagedata align="center" format="PNG" scale="100"
 | ||
| 			     fileref="../images/pbds_hash_ranged_hash_range_hashing_fns.png"/>
 | ||
| 		</imageobject>
 | ||
| 		<textobject>
 | ||
| 		  <phrase>Hash functions, ranged-hash functions, and
 | ||
| 		  range-hashing functions</phrase>
 | ||
| 		</textobject>
 | ||
| 	      </mediaobject>
 | ||
| 	    </figure>
 | ||
| 	    
 | ||
| 	    <para>Let U be a domain (e.g., the integers, or the
 | ||
| 	    strings of 3 characters). A hash-table algorithm needs to map
 | ||
| 	    elements of U "uniformly" into the range [0,..., m -
 | ||
| 	    1] (where m is a non-negative integral value, and
 | ||
| 	    is, in general, time varying). I.e., the algorithm needs
 | ||
| 	    a ranged-hash function</para>
 | ||
| 
 | ||
| 	    <para>
 | ||
| 	      f : U × Z<subscript>+</subscript> → Z<subscript>+</subscript>
 | ||
| 	    </para>
 | ||
| 
 | ||
| 	    <para>such that for any u in U ,</para>
 | ||
| 
 | ||
| 	    <para>0 ≤ f(u, m) ≤ m - 1</para>
 | ||
| 
 | ||
| 	    <para>and which has "good uniformity" properties (see
 | ||
| 	    <xref linkend="biblio.knuth98sorting"/>).
 | ||
| 	    One
 | ||
| 	    common solution is to use the composition of the hash
 | ||
| 	    function</para>
 | ||
| 
 | ||
| 	    <para>h : U → Z<subscript>+</subscript> ,</para>
 | ||
| 
 | ||
| 	    <para>which maps elements of U into the non-negative
 | ||
| 	    integrals, and</para>
 | ||
| 
 | ||
| 	    <para>g : Z<subscript>+</subscript> × Z<subscript>+</subscript> →
 | ||
| 	    Z<subscript>+</subscript>,</para>
 | ||
| 
 | ||
| 	    <para>which maps a non-negative hash value, and a non-negative
 | ||
| 	    range upper-bound into a non-negative integral in the range
 | ||
| 	    between 0 (inclusive) and the range upper bound (exclusive),
 | ||
| 	    i.e., for any r in Z<subscript>+</subscript>,</para>
 | ||
| 
 | ||
| 	    <para>0 ≤ g(r, m) ≤ m - 1</para>
 | ||
| 
 | ||
| 
 | ||
| 	    <para>The resulting ranged-hash function is</para>
 | ||
| 
 | ||
| 	    <!-- ranged_hash_composed_of_hash_and_range_hashing -->
 | ||
| 	    <equation>
 | ||
| 	      <title>Ranged Hash Function</title>
 | ||
| 	      <mathphrase>
 | ||
| 		f(u , m) = g(h(u), m)
 | ||
| 	      </mathphrase>
 | ||
| 	    </equation>
 | ||
| 
 | ||
| 	    <para>From the above, it is obvious that given g and
 | ||
| 	    h, f can always be composed (however the converse
 | ||
| 	    is not true). The standard's hash-based containers allow specifying
 | ||
| 	    a hash function, and use a hard-wired range-hashing function;
 | ||
| 	    the ranged-hash function is implicitly composed.</para>
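	    <para>
	      The composition can be sketched in a few lines. Here g is
	      the division method (r mod m) and h is any hash functor;
	      the names <function>range_hash_mod</function>,
	      <function>ranged_hash</function>, and
	      <classname>identity_hash</classname> are hypothetical and are
	      not part of this library's interface.
	    </para>

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// g: a range-hashing function (division method) mapping r into [0, m).
inline std::size_t range_hash_mod(std::size_t r, std::size_t m)
{
  assert(m > 0);
  return r % m;
}

// f(u, m) = g(h(u), m): compose a hash functor h with g.
template<typename Key, typename Hash_Fn>
std::size_t ranged_hash(const Key& u, std::size_t m, Hash_Fn h)
{ return range_hash_mod(h(u), m); }

// Convenience overload that uses std::hash as h.
template<typename Key>
std::size_t ranged_hash(const Key& u, std::size_t m)
{ return ranged_hash(u, m, std::hash<Key>()); }

// A toy h for unsigned keys that makes the composition easy to follow.
struct identity_hash
{
  std::size_t operator()(unsigned long u) const { return u; }
};
```

	    <para>
	      With <classname>identity_hash</classname>, the key 37 and a
	      table size of 8 give g(h(37), 8) = 37 mod 8 = 5.
	    </para>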
 | ||
| 
 | ||
| 	    <para>The above describes the case where a key is to be mapped
 | ||
| 	    into a single position within a hash table, e.g.,
 | ||
| 	    in a collision-chaining table. In other cases, a key is to be
 | ||
| 	    mapped into a sequence of positions within a table,
 | ||
| 	    e.g., in a probing table. Similar terms apply in this
 | ||
| 	    case: the table requires a ranged probe function,
 | ||
| 	    mapping a key into a sequence of positions within the table.
 | ||
| 	    This is typically achieved by composing a hash function
 | ||
| 	    mapping the key into a non-negative integral type, a
 | ||
| 	    probe function transforming the hash value into a
 | ||
| 	    sequence of hash values, and a range-hashing function
 | ||
| 	    transforming the sequence of hash values into a sequence of
 | ||
| 	    positions.</para>
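 | ||
| 	    <para>A minimal sketch of such a ranged probe function, assuming a
 | ||
| 	    linear probe function and the division method for range hashing.
 | ||
| 	    All names here are hypothetical and serve only as illustration.</para>
 | ||

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical probe function: the i-th value derived from hash value r.
inline std::size_t
linear_probe(std::size_t r, std::size_t i)
{ return r + i; }

// Hypothetical range-hashing function (division method), for m > 0.
inline std::size_t
range_hash(std::size_t r, std::size_t m)
{
  assert(m > 0);
  return r % m;
}

// Returns the first `count` table positions probed for hash value r
// in a table of size m.
inline std::vector<std::size_t>
probe_positions(std::size_t r, std::size_t m, std::size_t count)
{
  std::vector<std::size_t> positions;
  for (std::size_t i = 0; i < count; ++i)
    positions.push_back(range_hash(linear_probe(r, i), m));
  return positions;
}
```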
 | ||
| 
 | ||
| 	  </section>
 | ||
| 
 | ||
| 	  <section xml:id="details.hash_policies.range">
 | ||
| 	    <info><title>Range Hashing</title></info>
 | ||
| 
 | ||
| 	    <para>Some common choices for range-hashing functions are the
 | ||
| 	    division, multiplication, and middle-square methods (<xref linkend="biblio.knuth98sorting"/>), defined
 | ||
| 	    as</para>
 | ||
| 
 | ||
| 	    <equation>
 | ||
| 	      <title>Range-Hashing, Division Method</title>
 | ||
| 	      <mathphrase>
 | ||
| 		g(r, m) = r mod m
 | ||
| 	      </mathphrase>
 | ||
| 	    </equation>
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 	    <para>g(r, m) = ⌈ u/v ( a r mod v ) ⌉</para>
 | ||
| 
 | ||
| 	    <para>and</para>
 | ||
| 
 | ||
| 	    <para>g(r, m) = ⌈ u/v ( r<superscript>2</superscript> mod v ) ⌉</para>
 | ||
| 
 | ||
| 	    <para>respectively, for some positive integrals u and
 | ||
| 	    v (typically powers of 2), and some a. Each of
 | ||
| 	    these range-hashing functions works best for some different
 | ||
| 	    setting.</para>
 | ||
| 
 | ||
| 	    <para>The division method (see above) is a
 | ||
| 	    very common choice. However, even this single method can be
 | ||
| 	    implemented in two very different ways: using the low-level
 | ||
| 	    % (modulo) operation (for any m), or using the low-level
 | ||
| 	    & (bit-mask) operation (for the case where
 | ||
| 	    m is a power of 2), i.e.,</para>
 | ||
| 
 | ||
| 	    <equation>
 | ||
| 	      <title>Division via Prime Modulo</title>
 | ||
| 	      <mathphrase>
 | ||
| 		g(r, m) = r % m
 | ||
| 	      </mathphrase>
 | ||
| 	    </equation>
 | ||
| 
 | ||
| 	    <para>and</para>
 | ||
| 
 | ||
| 	    <equation>
 | ||
| 	      <title>Division via Bit Mask</title>
 | ||
| 	      <mathphrase>
 | ||
| 		g(r, m) = r & m - 1, (with m =
 | ||
| 		2<superscript>k</superscript> for some k)
 | ||
| 	      </mathphrase>
 | ||
| 	    </equation>
 | ||
| 
 | ||
| 
 | ||
| 	    <para>respectively.</para>
 | ||
| 
 | ||
| 	    <para>The % (modulo) implementation has the advantage that for
 | ||
| 	    m a prime far from a power of 2, g(r, m) is
 | ||
| 	    affected by all the bits of r (minimizing the chance of
 | ||
| 	    collision). It has the disadvantage of using the costly modulo
 | ||
| 	    operation. This method is hard-wired into SGI's
 | ||
| 	    implementation.</para>
 | ||
| 
 | ||
| 	    <para>The & (bit-mask) implementation has the advantage of
 | ||
| 	    relying on the fast bit-wise and operation. It has the
 | ||
| 	    disadvantage that g(r, m) is affected only by the
 | ||
| 	    low order bits of r. This method is hard-wired into
 | ||
| 	    Dinkumware's implementation.</para>
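 | ||
| 	    <para>The trade-off can be illustrated with a small sketch. The
 | ||
| 	    function names are hypothetical; the two functions agree whenever
 | ||
| 	    m is a power of 2, but the mask version ignores the high-order
 | ||
| 	    bits of r.</para>
 | ||

```cpp
#include <cassert>
#include <cstddef>

// Division via modulo: valid for any m > 0.
inline std::size_t
mod_range_hash(std::size_t r, std::size_t m)
{
  assert(m > 0);
  return r % m;
}

// Division via bit mask: valid only when m is a power of 2.
inline std::size_t
mask_range_hash(std::size_t r, std::size_t m)
{
  assert(m > 0 && (m & (m - 1)) == 0);
  return r & (m - 1);
}
```

| 	    <para>For example, the hash values 0x10 and 0x20 differ only above
 | ||
| 	    the four low-order bits, so they collide under a mask with m = 16
 | ||
| 	    but not under a modulo with m = 17.</para>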
 | ||
| 
 | ||
| 
 | ||
| 	  </section>
 | ||
| 
 | ||
| 	  <section xml:id="details.hash_policies.ranged">
 | ||
| 	    <info><title>Ranged Hash</title></info>
 | ||
| 
 | ||
| 	    <para>In some cases it is beneficial to allow the
 | ||
| 	    client to directly specify a ranged-hash function. It is
 | ||
| 	    true that the writer of the ranged-hash function cannot rely
 | ||
| 	    on the values of m having specific numerical properties
 | ||
| 	    suitable for hashing (in the sense used in <xref linkend="biblio.knuth98sorting"/>), since
 | ||
| 	    the values of m are determined by a resize policy with
 | ||
| 	    possibly orthogonal considerations.</para>
 | ||
| 
 | ||
| 	    <para>There are two cases where a ranged-hash function can be
 | ||
| 	    superior. The first is when using perfect hashing; the
 | ||
| 	    second is when the values of m can be used to estimate
 | ||
| 	    the "general" number of distinct values required. This is
 | ||
| 	    described in the following.</para>
 | ||
| 
 | ||
| 	    <para>Let</para>
 | ||
| 
 | ||
| 	    <para>
 | ||
| 	      s = [ s<subscript>0</subscript>,..., s<subscript>t - 1</subscript>]
 | ||
| 	    </para>
 | ||
| 
 | ||
| 	    <para>be a string of t characters, each of which is from
 | ||
| 	    domain S. Consider the following ranged-hash
 | ||
| 	    function:</para>
 | ||
| 	    <equation>
 | ||
| 	      <title>
 | ||
| 		A Standard String Hash Function
 | ||
| 	      </title>
 | ||
| 	      <mathphrase>
 | ||
| 		f<subscript>1</subscript>(s, m) = ∑ <subscript>i =
 | ||
| 		0</subscript><superscript>t - 1</superscript> s<subscript>i</subscript> a<superscript>i</superscript> mod m
 | ||
| 	      </mathphrase>
 | ||
| 	    </equation>
 | ||
| 	    
 | ||
| 
 | ||
| 	    <para>where a is some non-negative integral value. This is
 | ||
| 	    the standard string-hashing function used in SGI's
 | ||
| 	    implementation (with a = 5). Its advantage is that
 | ||
| 	    it takes into account all of the characters of the string.</para>
 | ||
| 
 | ||
| 	    <para>Now assume that s is the string representation of a
 | ||
| 	    long DNA sequence (and so S = {'A', 'C', 'G',
 | ||
| 	    'T'}). In this case, scanning the entire string might be
 | ||
| 	    prohibitively expensive. A possible alternative might be to use
 | ||
| 	    only the first k characters of the string, where</para>
 | ||
| 
 | ||
| 	    <para>|S|<superscript>k</superscript> ≥ m ,</para>
 | ||
| 
 | ||
| 	    <para>i.e., using the hash function</para>
 | ||
| 
 | ||
| 	    <equation>
 | ||
| 	      <title>
 | ||
| 		Only k String DNA Hash
 | ||
| 	      </title>
 | ||
| 	      <mathphrase>
 | ||
| 		f<subscript>2</subscript>(s, m) = ∑ <subscript>i
 | ||
| 		= 0</subscript><superscript>k - 1</superscript> s<subscript>i</subscript> a<superscript>i</superscript> mod m 
 | ||
| 	      </mathphrase>
 | ||
| 	    </equation>
 | ||
| 
 | ||
| 	    <para>requiring scanning over only</para>
 | ||
| 
 | ||
| 	    <para>k = log<subscript>4</subscript>( m )</para>
 | ||
| 
 | ||
| 	    <para>characters.</para>
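 | ||
| 	    <para>Both functions can be sketched as follows. This is an
 | ||
| 	    illustration only: the function names are hypothetical, a = 5 is
 | ||
| 	    the value the text attributes to SGI's implementation, k is taken
 | ||
| 	    as the smallest integer with 4<superscript>k</superscript> ≥ m, and
 | ||
| 	    m is assumed small enough that the products do not overflow.</para>
 | ||

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// f1: sum of s[i] * a^i over all characters, reduced mod m (m > 0).
inline std::size_t
full_string_hash(const std::string& s, std::size_t m, std::size_t a = 5)
{
  assert(m > 0);
  std::size_t sum = 0;
  std::size_t power = 1 % m;  // a^i mod m
  for (std::size_t i = 0; i < s.size(); ++i)
    {
      sum = (sum + static_cast<unsigned char>(s[i]) % m * power) % m;
      power = power * (a % m) % m;
    }
  return sum;
}

// Smallest k such that 4^k >= m, i.e. the prefix length for |S| = 4.
inline std::size_t
prefix_length(std::size_t m)
{
  std::size_t k = 0;
  std::size_t reach = 1;  // 4^k
  while (reach < m)
    {
      reach *= 4;
      ++k;
    }
  return k;
}

// f2: the same sum, restricted to the first k characters.
inline std::size_t
prefix_string_hash(const std::string& s, std::size_t m, std::size_t a = 5)
{
  const std::size_t k = std::min(prefix_length(m), s.size());
  return full_string_hash(s.substr(0, k), m, a);
}
```

| 	    <para>Note that the prefix length depends on m, so the function must
 | ||
| 	    receive the table size directly; it cannot be expressed as a hash
 | ||
| 	    function followed by a separate range-hashing step.</para>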
 | ||
| 
 | ||
| 	    <para>Other more elaborate hash-functions might scan k
 | ||
| 	    characters starting at a random position (determined at each
 | ||
| 	    resize), or scanning k random positions (determined at
 | ||
| 	    each resize), i.e., using</para>
 | ||
| 
 | ||
| 	    <para>f<subscript>3</subscript>(s, m) = ∑ <subscript>i =
 | ||
| 	    r<subscript>0</subscript></subscript><superscript>r<subscript>0</subscript> + k - 1</superscript> s<subscript>i</subscript>
 | ||
| 	    a<superscript>i</superscript> mod m ,</para>
 | ||
| 
 | ||
| 	    <para>or</para>
 | ||
| 
 | ||
| 	    <para>f<subscript>4</subscript>(s, m) = ∑ <subscript>i = 0</subscript><superscript>k -
 | ||
| 	    1</superscript> s<subscript>r<subscript>i</subscript></subscript> a<superscript>r<subscript>i</subscript></superscript> mod
 | ||
| 	    m ,</para>
 | ||
| 
 | ||
| 	    <para>respectively, for r<subscript>0</subscript>,..., r<subscript>k-1</subscript>
 | ||
| 	    each in the (inclusive) range [0,...,t-1].</para>
 | ||
| 
 | ||
| 	    <para>It should be noted that the above functions cannot be
 | ||
| 	    decomposed into a hash function and a range-hashing function,
 | ||
| 	    as in the composed ranged-hash function described earlier.</para>
 | ||
| 
 | ||
| 
 | ||
| 	  </section>
 | ||
| 
 | ||
| 	  <section xml:id="details.hash_policies.implementation">
 | ||
| 	    <info><title>Implementation</title></info>
 | ||
| 
 | ||
| 	    <para>This sub-subsection describes the implementation of
 | ||
| 	    the above in this library. It first explains range-hashing
 | ||
| 	    functions in collision-chaining tables, then ranged-hash
 | ||
| 	    functions in collision-chaining tables, then probing-based
 | ||
| 	    tables, and finally lists the relevant classes in this
 | ||
| 	    library.</para>
 | ||
| 
 | ||
| 	    <section xml:id="hash_policies.implementation.collision-chaining">
 | ||
| 	      <info><title>
 | ||
| 		Range-Hashing and Ranged-Hashes in Collision-Chaining Tables
 | ||
| 	      </title></info>
 | ||
| 
 | ||
| 
 | ||
| 	      <para><classname>cc_hash_table</classname> is
 | ||
| 	      parametrized by <classname>Hash_Fn</classname> and <classname>Comb_Hash_Fn</classname>, a
 | ||
| 	      hash functor and a combining hash functor, respectively.</para>
 | ||
| 
 | ||
| 	      <para>In general, <classname>Comb_Hash_Fn</classname> is considered a
 | ||
| 	      range-hashing functor. <classname>cc_hash_table</classname>
 | ||
| 	      synthesizes a ranged-hash function from <classname>Hash_Fn</classname> and
 | ||
| 	      <classname>Comb_Hash_Fn</classname>. The figure below shows an <classname>insert</classname> sequence
 | ||
| 	      diagram for this case. The user inserts an element (point A),
 | ||
| 	      the container transforms the key into a non-negative integral
 | ||
| 	      using the hash functor (points B and C), and transforms the
 | ||
| 	      result into a position using the combining functor (points D
 | ||
| 	      and E).</para>
 | ||
| 
 | ||
| 	      <figure>
 | ||
| 		<title>Insert hash sequence diagram</title>
 | ||
| 		<mediaobject>
 | ||
| 		  <imageobject>
 | ||
| 		    <imagedata align="center" format="PNG" scale="100"
 | ||
| 			       fileref="../images/pbds_hash_range_hashing_seq_diagram.png"/>
 | ||
| 		  </imageobject>
 | ||
| 		  <textobject>
 | ||
| 		    <phrase>Insert hash sequence diagram</phrase>
 | ||
| 		  </textobject>
 | ||
| 		</mediaobject>
 | ||
| 	      </figure>
 | ||
| 	      
 | ||
| 	      <para>If <classname>cc_hash_table</classname>'s
 | ||
| 	      hash functor, <classname>Hash_Fn</classname>, is instantiated by <classname>null_type</classname>, then <classname>Comb_Hash_Fn</classname> is taken to be
 | ||
| 	      a ranged-hash function. The graphic below shows an <function>insert</function> sequence
 | ||
| 	      diagram. The user inserts an element (point A), the container
 | ||
| 	      transforms the key into a position using the combining functor
 | ||
| 	      (points B and C).</para>
 | ||
| 
 | ||
| 	      <figure>
 | ||
| 		<title>Insert hash sequence diagram with a null policy</title>
 | ||
| 		<mediaobject>
 | ||
| 		  <imageobject>
 | ||
| 		    <imagedata align="center" format="PNG" scale="100"
 | ||
| 			       fileref="../images/pbds_hash_range_hashing_seq_diagram2.png"/>
 | ||
| 		  </imageobject>
 | ||
| 		  <textobject>
 | ||
| 		    <phrase>Insert hash sequence diagram with a null policy</phrase>
 | ||
| 		  </textobject>
 | ||
| 		</mediaobject>
 | ||
| 	      </figure>
 | ||
| 	      
 | ||
| 	    </section>
 | ||
| 
 | ||
| 	    <section xml:id="hash_policies.implementation.probe">
 | ||
| 	      <info><title>
 | ||
| 		Probing tables
 | ||
| 	      </title></info>
 | ||
| 	      <para><classname>gp_hash_table</classname> is parametrized by
 | ||
| 	      <classname>Hash_Fn</classname>, <classname>Probe_Fn</classname>,
 | ||
| 	      and <classname>Comb_Probe_Fn</classname>. As before, if
 | ||
| 	      <classname>Hash_Fn</classname> and <classname>Probe_Fn</classname>
 | ||
| 	      are both <classname>null_type</classname>, then
 | ||
| 	      <classname>Comb_Probe_Fn</classname> is a ranged-probe
 | ||
| 	      functor. Otherwise, <classname>Hash_Fn</classname> is a hash
 | ||
| 	      functor, <classname>Probe_Fn</classname> is a functor for offsets
 | ||
| 	      from a hash value, and <classname>Comb_Probe_Fn</classname>
 | ||
| 	      transforms a probe sequence into a sequence of positions within
 | ||
| 	      the table.</para>
 | ||
| 
 | ||
| 	    </section>
 | ||
| 
 | ||
| 	    <section xml:id="hash_policies.implementation.predefined">
 | ||
| 	      <info><title>
 | ||
| 		Pre-Defined Policies
 | ||
| 	      </title></info>
 | ||
| 
 | ||
| 	      <para>This library contains some pre-defined classes
 | ||
| 	      implementing range-hashing and probing functions:</para>
 | ||
| 
 | ||
| 	      <orderedlist>
 | ||
| 		<listitem><para><classname>direct_mask_range_hashing</classname>
 | ||
| 		and <classname>direct_mod_range_hashing</classname>
 | ||
| 		are range-hashing functions based on a bit-mask and a modulo
 | ||
| 		operation, respectively.</para></listitem>
 | ||
| 
 | ||
| 		<listitem><para><classname>linear_probe_fn</classname> and
 | ||
| 		<classname>quadratic_probe_fn</classname> are
 | ||
| 		a linear probe and a quadratic probe function,
 | ||
| 		respectively.</para></listitem>
 | ||
| 	      </orderedlist>
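 | ||
| 	      <para>A sketch of how these classes might be combined. It assumes
 | ||
| 	      the template parameter order Key, Mapped, Hash_Fn, Eq_Fn,
 | ||
| 	      Comb_Hash_Fn for <classname>cc_hash_table</classname>, and Key, Mapped,
 | ||
| 	      Hash_Fn, Eq_Fn, Comb_Probe_Fn, Probe_Fn for
 | ||
| 	      <classname>gp_hash_table</classname>; check this against the installed
 | ||
| 	      headers. The functor int_hash is hypothetical and defined only
 | ||
| 	      for this example.</para>
 | ||

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/hash_policy.hpp>

// Hypothetical hash functor for int keys, for illustration only.
struct int_hash
{
  std::size_t
  operator()(int i) const
  { return static_cast<std::size_t>(i); }
};

// Collision-chaining table using mask-based range hashing.
typedef __gnu_pbds::cc_hash_table<
  int, char, int_hash, std::equal_to<int>,
  __gnu_pbds::direct_mask_range_hashing<> > mask_map;

// Probing table using modulo-based range hashing and a linear probe.
typedef __gnu_pbds::gp_hash_table<
  int, char, int_hash, std::equal_to<int>,
  __gnu_pbds::direct_mod_range_hashing<>,
  __gnu_pbds::linear_probe_fn<> > mod_probe_map;

// Inserts one element into each table and reports whether both hold it.
inline bool
both_tables_store(int key, char value)
{
  mask_map a;
  mod_probe_map b;
  a[key] = value;
  b[key] = value;
  assert(a.size() == 1 && b.size() == 1);
  return a.find(key)->second == value && b.find(key)->second == value;
}
```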
 | ||
| 
 | ||
| 	      <para>
 | ||
| 		The graphic below shows the relationships.
 | ||
| 	      </para>
 | ||
| 	      <figure>
 | ||
| 		<title>Hash policy class diagram</title>
 | ||
| 		<mediaobject>
 | ||
| 		  <imageobject>
 | ||
| 		    <imagedata align="center" format="PNG" scale="100"
 | ||
| 			       fileref="../images/pbds_hash_policy_cd.png"/>
 | ||
| 		  </imageobject>
 | ||
| 		  <textobject>
 | ||
| 		    <phrase>Hash policy class diagram</phrase>
 | ||
| 		  </textobject>
 | ||
| 		</mediaobject>
 | ||
| 	      </figure>
 | ||
| 
 | ||
| 
 | ||
| 	    </section>
 | ||
| 
 | ||
| 	  </section> <!-- impl -->
 | ||
| 
 | ||
| 	</section>
 | ||
| 
 | ||
| 	<section xml:id="container.hash.details.resize_policies">
 | ||
| 	  <info><title>Resize Policies</title></info>
 | ||
| 
 | ||
| 	  <section xml:id="resize_policies.general">
 | ||
| 	    <info><title>General</title></info>
 | ||
| 
 | ||
| 	    <para>Hash-tables, as opposed to trees, do not naturally grow or
 | ||
| 	    shrink. It is necessary to specify policies to determine how
 | ||
| 	    and when a hash table should change its size. Usually, resize
 | ||
| 	    policies can be decomposed into orthogonal policies:</para>
 | ||
| 
 | ||
| 	    <orderedlist>
 | ||
| 	      <listitem><para>A size policy indicating how a hash table
 | ||
| 	      should grow (e.g., it should multiply by powers of
 | ||
| 	      2).</para></listitem>
 | ||
| 
 | ||
| 	      <listitem><para>A trigger policy indicating when a hash
 | ||
| 	      table should grow (e.g., a load factor is
 | ||
| 	      exceeded).</para></listitem>
 | ||
| 	    </orderedlist>
 | ||
| 
 | ||
| 	  </section>
 | ||
| 
 | ||
| 	  <section xml:id="resize_policies.size">
 | ||
| 	    <info><title>Size Policies</title></info>
 | ||
| 
 | ||
| 
 | ||
| 	    <para>Size policies determine how a hash table changes size. These
 | ||
| 	    policies are simple, and there are relatively few sensible
 | ||
| 	    options. An exponential-size policy (with the initial size and
 | ||
| 	    growth factors both powers of 2) works well with a mask-based
 | ||
| 	    range-hashing function, and is the
 | ||
| 	    hard-wired policy used by Dinkumware. A
 | ||
| 	    prime-list based policy works well with a modulo-prime range
 | ||
| 	    hashing function and is the hard-wired policy used by SGI's
 | ||
| 	    implementation.</para>
 | ||
| 
 | ||
| 	  </section>
 | ||
| 
 | ||
| 	  <section xml:id="resize_policies.trigger">
 | ||
| 	    <info><title>Trigger Policies</title></info>
 | ||
| 
 | ||
| 	    <para>Trigger policies determine when a hash table changes size.
 | ||
| 	    Following is a description of two policies: load-check
 | ||
| 	    policies, and collision-check policies.</para>
 | ||
| 
 | ||
| 	    <para>Load-check policies are straightforward. The user specifies
 | ||
| 	    two factors, Α<subscript>min</subscript> and
 | ||
| 	    Α<subscript>max</subscript>, and the hash table maintains the
 | ||
| 	    invariant that</para>
 | ||
| 
 | ||
| 	    <para>Α<subscript>min</subscript> ≤ (number of
 | ||
| 	    stored elements) / (hash-table size) ≤
 | ||
| 	    Α<subscript>max</subscript><remark>load factor min max</remark></para>
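 | ||
| 	    <para>The invariant can be sketched as a pair of checks. The type
 | ||
| 	    load_check_trigger and its members are hypothetical, and the
 | ||
| 	    factors used in any example are illustrative values only.</para>
 | ||

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical load-check trigger that keeps
//   load_min <= num_entries / table_size <= load_max.
struct load_check_trigger
{
  float load_min;
  float load_max;

  // True if the table must grow to restore the upper bound.
  bool
  should_grow(std::size_t num_entries, std::size_t table_size) const
  {
    assert(table_size > 0);
    return num_entries > load_max * table_size;
  }

  // True if the table must shrink to restore the lower bound.
  bool
  should_shrink(std::size_t num_entries, std::size_t table_size) const
  {
    assert(table_size > 0);
    return num_entries < load_min * table_size;
  }
};
```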
 | ||
| 
 | ||
| 	    <para>Collision-check policies work in the opposite direction of
 | ||
| 	    load-check policies. They focus on keeping the number of
 | ||
| 	    collisions moderate and hoping that the size of the table will
 | ||
| 	    not grow very large, instead of keeping a moderate load-factor
 | ||
| 	    and hoping that the number of collisions will be small. A
 | ||
| 	    maximal collision-check policy resizes when the longest
 | ||
| 	    probe-sequence grows too large.</para>
 | ||
| 
 | ||
| 	    <para>Consider the graphic below. Let the size of the hash table
 | ||
| 	    be denoted by m, the length of a probe sequence be denoted by k,
 | ||
| 	    and some load factor be denoted by α. We would like to
 | ||
| 	    calculate the minimal k such that, if there were α m
 | ||
| 	    elements in the hash table, a probe sequence of length k would
 | ||
| 	    be found with probability at most 1/m.</para>
 | ||
| 
 | ||
| 	    <figure>
 | ||
| 	      <title>Balls and bins</title>
 | ||
| 	      <mediaobject>
 | ||
| 		<imageobject>
 | ||
| 		  <imagedata align="center" format="PNG" scale="100"
 | ||
| 			     fileref="../images/pbds_balls_and_bins.png"/>
 | ||
| 		</imageobject>
 | ||
| 		<textobject>
 | ||
| 		  <phrase>Balls and bins</phrase>
 | ||
| 		</textobject>
 | ||
| 	      </mediaobject>
 | ||
| 	    </figure>
 | ||
| 
 | ||
| 	    <para>Denote the probability that a probe sequence of length
 | ||
| 	    k appears in bin i by p<subscript>i</subscript>, the
 | ||
| 	    length of the probe sequence of bin i by
 | ||
| 	    l<subscript>i</subscript>, and assume uniform distribution. Then</para>
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 	    <equation>
 | ||
| 	      <title>
 | ||
| 		Probability of Probe Sequence of Length k
 | ||
| 	      </title>
 | ||
| 	      <mathphrase>
 | ||
| 		p<subscript>1</subscript> = 
 | ||
| 	      </mathphrase>
 | ||
| 	    </equation>
 | ||
| 
 | ||
| 	    <para>P(l<subscript>1</subscript> ≥ k) =</para>
 | ||
| 
 | ||
| 	    <para>
 | ||
| 	      P(l<subscript>1</subscript> ≥ α ( 1 + k / α - 1 ) ) ≤ (a)
 | ||
| 	    </para>
 | ||
| 
 | ||
| 	    <para>
 | ||
| 	      e ^ ( - ( α ( k / α - 1 )<superscript>2</superscript> ) /2)
 | ||
| 	    </para>
 | ||
| 
 | ||
| 	    <para>where (a) follows from the Chernoff bound (<xref linkend="biblio.motwani95random"/>). To
 | ||
| 	    calculate the probability that some bin contains a probe
 | ||
| 	    sequence of length at least k, we note that the
 | ||
| 	    l<subscript>i</subscript> are negatively-dependent
 | ||
| 	    (<xref linkend="biblio.dubhashi98neg"/>). Let
 | ||
| 	    I(.) denote the indicator function. Then</para>
 | ||
| 
 | ||
| 	    <equation>
 | ||
| 	      <title>
 | ||
| 		Probability Probe Sequence in Some Bin
 | ||
| 	      </title>
 | ||
| 	      <mathphrase>
 | ||
| 		P( exists<subscript>i</subscript> l<subscript>i</subscript> ≥ k ) = 
 | ||
| 	      </mathphrase>
 | ||
| 	    </equation>
 | ||
| 
 | ||
| 	    <para>P ( ∑ <subscript>i = 1</subscript><superscript>m</superscript>
 | ||
| 	    I(l<subscript>i</subscript> ≥ k) ≥ 1 ) =</para>
 | ||
| 
 | ||
| 	    <para>P ( ∑ <subscript>i = 1</subscript><superscript>m</superscript> I (
 | ||
| 	    l<subscript>i</subscript> ≥ k ) ≥ m p<subscript>1</subscript> ( 1 + 1 / (m
 | ||
| 	    p<subscript>1</subscript>) - 1 ) ) ≤ (a)</para>
 | ||
| 
 | ||
| 	    <para>e ^ ( ( - m p<subscript>1</subscript> ( 1 / (m p<subscript>1</subscript>)
 | ||
| 	    - 1 ) <superscript>2</superscript> ) / 2 ) ,</para>
 | ||
| 
 | ||
| 	    <para>where (a) follows from the fact that the Chernoff bound can
 | ||
| 	    be applied to negatively-dependent variables (<xref
 | ||
| 	    linkend="biblio.dubhashi98neg"/>). Inserting the first probability
 | ||
| 	    equation into the second one, and equating with 1/m, we
 | ||
| 	    obtain</para>
 | ||
| 
 | ||
| 
 | ||
| 	    <para>k ~ √ ( 2 α ln ( 2 m ln(m) ) ) .</para>
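 | ||
| 	    <para>To get a feel for the magnitudes involved, the estimate can be
 | ||
| 	    evaluated numerically. This sketch reads the estimate as the square
 | ||
| 	    root of 2 α ln(2 m ln(m)); the function name is hypothetical, and the
 | ||
| 	    result is an asymptotic estimate, not an exact bound.</para>
 | ||

```cpp
#include <cassert>
#include <cmath>

// Evaluates sqrt(2 * alpha * ln(2 * m * ln(m))) for a load factor
// alpha > 0 and a table size m >= 2.
inline double
max_probe_length_estimate(double alpha, double m)
{
  assert(alpha > 0 && m >= 2);
  return std::sqrt(2 * alpha * std::log(2 * m * std::log(m)));
}
```

| 	    <para>The estimate grows with the square root of the load factor
 | ||
| 	    and only very slowly with the table size.</para>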
 | ||
| 
 | ||
| 	  </section>
 | ||
| 
 | ||
| 	  <section xml:id="resize_policies.impl">
 | ||
| 	    <info><title>Implementation</title></info>
 | ||
| 
 | ||
| 	    <para>This sub-subsection describes the implementation of the
 | ||
| 	    above in this library. It first describes resize policies and
 | ||
| 	    their decomposition into trigger and size policies, then
 | ||
| 	    describes pre-defined classes, and finally discusses controlled
 | ||
| 	    access to the policies' internals.</para>
 | ||
| 
 | ||
| 	    <section xml:id="resize_policies.impl.decomposition">
 | ||
| 	      <info><title>Decomposition</title></info>
 | ||
| 
 | ||
| 
 | ||
| 	      <para>Each hash-based container is parametrized by a
 | ||
| 	      <classname>Resize_Policy</classname> parameter; the container derives
 | ||
| 	      <classname>public</classname>ly from <classname>Resize_Policy</classname>. For
 | ||
| 	      example:</para>
 | ||
| 	      <programlisting>
 | ||
| 		template<typename Key,
 | ||
| 			 typename Mapped,
 | ||
| 			 ...
 | ||
| 			 typename Resize_Policy,
 | ||
| 			 ...>
 | ||
| 		class cc_hash_table : public Resize_Policy
 | ||
| 	      </programlisting>
 | ||
| 
 | ||
| 	      <para>As a container object is modified, it continuously notifies
 | ||
| 	      its <classname>Resize_Policy</classname> base of internal changes
 | ||
| 	      (e.g., collisions encountered and elements being
 | ||
| 	      inserted). It queries its <classname>Resize_Policy</classname> base whether
 | ||
| 	      it needs to be resized, and if so, to what size.</para>
 | ||
| 
 | ||
| 	      <para>The graphic below shows a (possible) sequence diagram
 | ||
| 	      of an insert operation. The user inserts an element; the hash
 | ||
| 	      table notifies its resize policy that a search has started
 | ||
| 	      (point A); in this case, a single collision is encountered -
 | ||
| 	      the table notifies its resize policy of this (point B); the
 | ||
| 	      container finally notifies its resize policy that the search
 | ||
| 	      has ended (point C); it then queries its resize policy whether
 | ||
| 	      a resize is needed, and if so, what is the new size (points D
 | ||
| 	      to G); following the resize, it notifies the policy that a
 | ||
| 	      resize has completed (point H); finally, the element is
 | ||
| 	      inserted, and the policy notified (point I).</para>
 | ||
| 
 | ||
| 	      <figure>
 | ||
| 		<title>Insert resize sequence diagram</title>
 | ||
| 		<mediaobject>
 | ||
| 		  <imageobject>
 | ||
| 		    <imagedata align="center" format="PNG" scale="100"
 | ||
| 			       fileref="../images/pbds_insert_resize_sequence_diagram1.png"/>
 | ||
| 		  </imageobject>
 | ||
| 		  <textobject>
 | ||
| 		    <phrase>Insert resize sequence diagram</phrase>
 | ||
| 		  </textobject>
 | ||
| 		</mediaobject>
 | ||
| 	      </figure>
 | ||
| 
 | ||
| 
 | ||
| 	      <para>In practice, a resize policy can usually be decomposed
 | ||
| 	      orthogonally into a size policy and a trigger policy. Consequently,
 | ||
| 	      the library contains a single class for instantiating a resize
 | ||
| 	      policy: <classname>hash_standard_resize_policy</classname>
 | ||
| 	      is parametrized by <classname>Size_Policy</classname> and
 | ||
| 	      <classname>Trigger_Policy</classname>, derives <classname>public</classname>ly from
 | ||
| 	      both, and acts as a standard delegate (<xref linkend="biblio.gof"/>)
 | ||
| 	      to these policies.</para>
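 | ||
| 	      <para>The delegation can be sketched with toy policies. Every name
 | ||
| 	      below is hypothetical and much simpler than the library's classes;
 | ||
| 	      the sketch only shows a resize policy deriving publicly from a size
 | ||
| 	      policy and a trigger policy and forwarding to each.</para>
 | ||

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical size policy: sizes are doubled.
struct pow2_size_policy
{
  std::size_t
  get_larger_size(std::size_t size) const
  { return size * 2; }
};

// Hypothetical trigger policy: grow when the load factor exceeds 1/2.
struct half_load_trigger
{
  bool
  is_grow_needed(std::size_t num_entries, std::size_t size) const
  { return 2 * num_entries > size; }
};

// Hypothetical resize policy: derives publicly from both policies and
// delegates each question to the policy that can answer it.
template<typename Size_Policy, typename Trigger_Policy>
struct toy_standard_resize_policy
: public Size_Policy, public Trigger_Policy
{
  // "When to resize" is delegated to the trigger policy.
  bool
  is_resize_needed(std::size_t num_entries, std::size_t size) const
  { return Trigger_Policy::is_grow_needed(num_entries, size); }

  // "To what size" is delegated to the size policy.
  std::size_t
  get_new_size(std::size_t size) const
  {
    assert(size > 0);
    return Size_Policy::get_larger_size(size);
  }
};
```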
 | ||
| 
 | ||
| 	      <para>The two graphics immediately below show sequence diagrams
 | ||
| 	      illustrating the interaction between the standard resize policy
 | ||
| 	      and its trigger and size policies, respectively.</para>
 | ||
| 
 | ||
| 	      <figure>
 | ||
| 		<title>Standard resize policy trigger sequence
 | ||
| 		diagram</title>
 | ||
| 		<mediaobject>
 | ||
| 		  <imageobject>
 | ||
| 		    <imagedata align="center" format="PNG" scale="100"
 | ||
| 			       fileref="../images/pbds_insert_resize_sequence_diagram2.png"/>
 | ||
| 		  </imageobject>
 | ||
| 		  <textobject>
 | ||
| 		    <phrase>Standard resize policy trigger sequence
 | ||
| 		    diagram</phrase>
 | ||
| 		  </textobject>
 | ||
| 		</mediaobject>
 | ||
| 	      </figure>
 | ||
| 
 | ||
| 	      <figure>
 | ||
| 		<title>Standard resize policy size sequence
 | ||
| 		diagram</title>
 | ||
| 		<mediaobject>
 | ||
| 		  <imageobject>
 | ||
| 		    <imagedata align="center" format="PNG" scale="100"
 | ||
| 			       fileref="../images/pbds_insert_resize_sequence_diagram3.png"/>
 | ||
| 		  </imageobject>
 | ||
| 		  <textobject>
 | ||
| 		    <phrase>Standard resize policy size sequence
 | ||
| 		    diagram</phrase>
 | ||
| 		  </textobject>
 | ||
| 		</mediaobject>
 | ||
| 	      </figure>
 | ||
| 
 | ||
| 
 | ||
| 	    </section>
 | ||
| 
 | ||
| 	    <section xml:id="resize_policies.impl.predefined">
 | ||
| 	      <info><title>Predefined Policies</title></info>
 | ||
| 	      <para>The library includes the following
 | ||
| 	      instantiations of size and trigger policies:</para>
 | ||
| 
 | ||
| 	      <orderedlist>
 | ||
| 		<listitem><para><classname>hash_load_check_resize_trigger</classname>
 | ||
| 		implements a load check trigger policy.</para></listitem>
 | ||
| 
 | ||
| 		<listitem><para><classname>cc_hash_max_collision_check_resize_trigger</classname>
 | ||
| 		implements a collision check trigger policy.</para></listitem>
 | ||
| 
 | ||
| 		<listitem><para><classname>hash_exponential_size_policy</classname>
 | ||
| 		implements an exponential-size policy (which should be used
 | ||
| 		with mask range hashing).</para></listitem>
 | ||
| 
 | ||
| 		<listitem><para><classname>hash_prime_size_policy</classname>
 | ||
| 		implements a size policy based on a sequence of primes
 | ||
| 		(which should be used with mod range hashing).</para></listitem>
 | ||
| 	      </orderedlist>
 | ||
| 
 | ||
| 	      <para>The graphic below gives an overall picture of the resize-related
 | ||
| 	      classes. <classname>basic_hash_table</classname>
 | ||
| 	      is parametrized by <classname>Resize_Policy</classname>, which it subclasses
 | ||
| 	      publicly. This class is currently instantiated only by <classname>hash_standard_resize_policy</classname>. 
 | ||
| 	      <classname>hash_standard_resize_policy</classname>
 | ||
| 	      itself is parametrized by <classname>Trigger_Policy</classname> and
 | ||
| 	      <classname>Size_Policy</classname>. Currently, <classname>Trigger_Policy</classname> is
 | ||
| 	      instantiated by <classname>hash_load_check_resize_trigger</classname>,
 | ||
| 	      or <classname>cc_hash_max_collision_check_resize_trigger</classname>;
 | ||
| 	      <classname>Size_Policy</classname> is instantiated by <classname>hash_exponential_size_policy</classname>,
 | ||
| 	      or <classname>hash_prime_size_policy</classname>.</para>
 | ||
| 
 | ||
| 	    </section>
 | ||
| 
 | ||
| 	    <section xml:id="resize_policies.impl.internals">
 | ||
| 	      <info><title>Controlling Access to Internals</title></info>
 | ||
| 
 | ||
| 	      <para>There are cases where (controlled) access to resize
 | ||
| 	      policies' internals is beneficial. E.g., it is sometimes
 | ||
| 	      useful to query a hash-table for the table's actual size (as
 | ||
| 	      opposed to its <function>size()</function> - the number of values it
 | ||
| 	      currently holds); it is sometimes useful to set a table's
 | ||
| 	      initial size, externally resize it, or change load factors.</para>
 | ||
| 
 | ||
| 	      <para>Clearly, supporting such methods both decreases the
 | ||
| 	      encapsulation of hash-based containers, and increases the
 | ||
| 	      diversity between different associative-containers' interfaces.
 | ||
| 	      Conversely, omitting such methods can decrease containers'
 | ||
| 	      flexibility.</para>
 | ||
| 
 | ||
| 	      <para>In order to avoid, to the extent possible, the above
 | ||
| 	      conflict, the hash-based containers themselves do not address
 | ||
| 	      any of these questions; this is deferred to the resize policies,
 | ||
| 	      which are easier to change or replace. Thus, for example,
 | ||
| 	      neither <classname>cc_hash_table</classname> nor
 | ||
| 	      <classname>gp_hash_table</classname>
 | ||
| 	      contains methods for querying the actual size of the table; this
 | ||
| 	      is deferred to <classname>hash_standard_resize_policy</classname>.</para>
 | ||
| 
 | ||
| 	      <para>Furthermore, the policies themselves are parametrized by
 | ||
| 	      template arguments that determine the methods they support
 | ||
| 	      (
 | ||
| 	      <xref linkend="biblio.alexandrescu01modern"/>
 | ||
| 	      shows techniques for doing so). <classname>hash_standard_resize_policy</classname>
 | ||
| 	      is parametrized by <classname>External_Size_Access</classname> that
 | ||
| 	      determines whether it supports methods for querying the actual
 | ||
| 	      size of the table or resizing it. <classname>hash_load_check_resize_trigger</classname>
 | ||
| 	      is parametrized by <classname>External_Load_Access</classname> that
 | ||
| 	      determines whether it supports methods for querying or
 | ||
| 	      modifying the loads. <classname>cc_hash_max_collision_check_resize_trigger</classname>
 | ||
| 	      is parametrized by <classname>External_Load_Access</classname> that
 | ||
| 	      determines whether it supports methods for querying the
 | ||
| 	      load.</para>
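 | ||
| 	      <para>A sketch of external size access, assuming that
 | ||
| 	      <classname>hash_standard_resize_policy</classname> exposes the members
 | ||
| 	      get_actual_size and resize when External_Size_Access is true, as in
 | ||
| 	      the library headers; check this against the installed version. The
 | ||
| 	      functor int_hash and the helper function are hypothetical.</para>
 | ||

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/hash_policy.hpp>

// Hypothetical hash functor for int keys, for illustration only.
struct int_hash
{
  std::size_t
  operator()(int i) const
  { return static_cast<std::size_t>(i); }
};

// A collision-chaining table whose resize policy allows external
// access to the table's actual size.
typedef __gnu_pbds::cc_hash_table<
  int, char, int_hash, std::equal_to<int>,
  __gnu_pbds::direct_mask_range_hashing<>,
  __gnu_pbds::hash_standard_resize_policy<
    __gnu_pbds::hash_exponential_size_policy<>,
    __gnu_pbds::hash_load_check_resize_trigger<>,
    true> >  // External_Size_Access
  resizable_map;

// Requests a resize and returns the actual size the table ends up with.
inline std::size_t
actual_size_after_resize(std::size_t requested)
{
  assert(requested > 0);
  resizable_map m;
  m.resize(requested);
  return m.get_actual_size();
}
```

| 	      <para>The size policy decides the final size, so the result may be
 | ||
| 	      larger than the size requested.</para>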
 | ||
| 
 | ||
| 	      <para>Some operations, for example, resizing a container at
 | ||
| 	      run time, or changing the load factors of a load-check trigger
 | ||
| 	      policy, require the container itself to resize. As mentioned
 | ||
| 	      above, the hash-based containers themselves do not contain
 | ||
| 	      these types of methods, only their resize policies.
 | ||
| 	      Consequently, there must be some mechanism for a resize policy
 | ||
| 	      to manipulate the hash-based container. As the hash-based
 | ||
| 	      container is a subclass of the resize policy, this is done
 | ||
| 	      through virtual methods. Each hash-based container has a
 | ||
| 	      <classname>private</classname> <classname>virtual</classname> method:</para>
 | ||
| 	      <programlisting>
 | ||
| 		virtual void
 | ||
| 		do_resize
 | ||
| 		(size_type new_size);
 | ||
| 	      </programlisting>
 | ||
| 
 | ||
| 	      <para>which resizes the container. Implementations of
 | ||
| 	      <classname>Resize_Policy</classname> can export public methods for resizing
 | ||
| 	      the container externally; these methods internally call
 | ||
| 	      <classname>do_resize</classname> to resize the table.</para>
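 | ||
| 	      <para>The mechanism can be sketched with two toy classes. The names
 | ||
| 	      external_resize_policy and toy_table are hypothetical; only the
 | ||
| 	      shape of the interaction (a public policy method calling a private
 | ||
| 	      virtual method implemented by the container) follows the text.</para>
 | ||

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical resize policy exporting a public resize() method.
class external_resize_policy
{
public:
  virtual
  ~external_resize_policy() { }

  // Public entry point for resizing the container externally.
  void
  resize(std::size_t new_size)
  {
    assert(new_size > 0);
    do_resize(new_size);
  }

private:
  // Implemented by the container that derives from this policy.
  virtual void
  do_resize(std::size_t new_size) = 0;
};

// Hypothetical container: derives publicly from its resize policy and
// keeps do_resize private.
class toy_table : public external_resize_policy
{
public:
  toy_table() : m_num_buckets(8) { }

  std::size_t
  num_buckets() const
  { return m_num_buckets; }

private:
  // A real table would also rehash its entries here.
  virtual void
  do_resize(std::size_t new_size)
  { m_num_buckets = new_size; }

  std::size_t m_num_buckets;
};
```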
 | ||
| 
 | ||
| 
 | ||
| 	    </section>
 | ||
| 
 | ||
| 	  </section>
 | ||
| 
 | ||
| 
 | ||
| 	</section> <!-- resize policies -->
 | ||
| 
 | ||
| 	<section xml:id="container.hash.details.policy_interaction">
 | ||
| 	  <info><title>Policy Interactions</title></info>
 | ||
| 	  <para>Hash tables are, unfortunately, especially sensitive to the
 | ||
| 	  choice of policies. One of the more complicated aspects of this
 | ||
| 	  is that poor combinations of good policies can form a poor
 | ||
| 	  container. Following are some considerations.</para>
 | ||
| 
 | ||
| 	  <section xml:id="policy_interaction.probesizetrigger">
 | ||
| 	    <info><title>probe/size/trigger</title></info>
 | ||
| 
 | ||
| 	    <para>Some combinations do not work well for probing containers.
 | ||
| 	    For example, combining a quadratic probe policy with an
 | ||
| 	    exponential size policy can yield a poor container: when an
 | ||
| 	    element is inserted, a trigger policy might decide that there
 | ||
| 	    is no need to resize, as the table still contains unused
 | ||
| 	    entries; the probe sequence, however, might never reach any of
 | ||
| 	    the unused entries.</para>
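 | ||
| 	    <para>The effect can be reproduced with a small sketch. It assumes a
 | ||
| 	    quadratic probe whose i-th offset is i * i and a modulo range-hashing
 | ||
| 	    step; the function name is hypothetical.</para>
 | ||

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Counts the distinct positions (r + i * i) mod m that a quadratic probe
// starting at hash value r can reach in a table of size m. Offsets
// repeat with period m, so m probes cover every reachable position.
inline std::size_t
quadratic_probe_reach(std::size_t r, std::size_t m)
{
  assert(m > 0);
  std::set<std::size_t> positions;
  for (std::size_t i = 0; i < m; ++i)
    positions.insert((r + i * i) % m);
  return positions.size();
}
```

| 	    <para>Under these assumptions, a table of size 8 offers only 3
 | ||
| 	    reachable positions per key, so an insert can fail to find a free
 | ||
| 	    entry while most of the table is still unused.</para>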
 | ||
| 
 | ||
| 	    <para>Unfortunately, this library cannot detect such problems at
 | ||
| 	    compile time (detecting them is as hard as the halting problem).
 | ||
| 	    It therefore defines an exception class,
 | ||
| 	    <classname>insert_error</classname>, which is thrown in this case.</para>
 | ||
| 
 | ||
| 	  </section>
 | ||
| 
 | ||
| 	  <section xml:id="policy_interaction.hashtrigger">
 | ||
| 	    <info><title>hash/trigger</title></info>
 | ||
| 
 | ||
| 	    <para>Some trigger policies are especially susceptible to poor
 | ||
| 	    hash functions. Suppose, as an extreme case, that the hash
 | ||
| 	    function transforms each key to the same hash value. After some
 | ||
| 	    inserts, a collision detecting policy will always indicate that
 | ||
| 	    the container needs to grow.</para>
 | ||
| 
 | ||
| 	    <para>The library, therefore, by design, limits each operation to
 | ||
| 	    one resize. For each <classname>insert</classname>, for example, it queries
 | ||
| 	    only once whether a resize is needed.</para>
 | ||
| 
 | ||
| 	  </section>
 | ||
| 
 | ||
| 	  <section xml:id="policy_interaction.eqstorehash">
 | ||
| 	    <info><title>equivalence functors/storing hash values/hash</title></info>
 | ||
| 
 | ||
| 	    <para><classname>cc_hash_table</classname> and
 | ||
| 	    <classname>gp_hash_table</classname> are
 | ||
| 	    parametrized by an equivalence functor and by a
 | ||
| 	    <classname>Store_Hash</classname> parameter. If the latter parameter is
 | ||
| 	    <classname>true</classname>, then the container stores with each entry
 | ||
| 	    a hash value, and uses this value in case of collisions to
 | ||
| 	    determine whether to apply the equivalence functor. This can lower the
 | ||
| 	    cost of collision for some types, but increase the cost of
 | ||
| 	    collisions for other types.</para>
 | ||
| 
 | ||
| 	    <para>If a ranged-hash function or ranged probe function is
 | ||
| 	    directly supplied, however, then it makes no sense to store the
 | ||
| 	    hash value with each entry. This library's container will
 | ||
| 	    fail at compilation, by design, if this is attempted.</para>
 | ||
| 
 | ||
| 	  </section>
 | ||
| 
 | ||
| 	  <section xml:id="policy_interaction.sizeloadtrigger">
 | ||
| 	    <info><title>size/load-check trigger</title></info>
 | ||
| 
 | ||
| 	    <para>Assume a size policy issues an increasing sequence of sizes
 | ||
| 	    a, a q, a q<superscript>2</superscript>, a q<superscript>3</superscript>, ... For
 | ||
| 	    example, an exponential size policy might issue the sequence of
 | ||
| 	    sizes 8, 16, 32, 64, ...</para>
 | ||
| 
 | ||
| 	    <para>If a load-check trigger policy is used, with loads
 | ||
| 	    α<subscript>min</subscript> and α<subscript>max</subscript>,
 | ||
| 	    respectively, then it is a good idea to have:</para>
 | ||
| 
 | ||
| 	    <orderedlist>
 | ||
| 	      <listitem><para>α<subscript>max</subscript> ~ 1 / q</para></listitem>
 | ||
| 
 | ||
| 	      <listitem><para>α<subscript>min</subscript> < 1 / (2 q)</para></listitem>
 | ||
| 	    </orderedlist>
 | ||
| 
 | ||
| 	    <para>This will ensure that the amortized hash cost of each
 | ||
| 	    modifying operation is at most approximately 3.</para>
 | ||
| 
 | ||
| 	    <para>α<subscript>min</subscript> ~ α<subscript>max</subscript> is, in
 | ||
| 	    any case, a bad choice, and α<subscript>min</subscript> >
 | ||
| 	    α<subscript>max</subscript> is horrendous.</para>
 | ||
| 
 | ||
| 	  </section>
 | ||
| 
 | ||
| 	</section>
 | ||
| 
 | ||
|       </section> <!-- details -->
 | ||
| 
 | ||
|     </section> <!-- hash -->
 | ||
| 
 | ||
|     <!-- tree -->
 | ||
|     <section xml:id="pbds.design.container.tree">
 | ||
|       <info><title>Tree</title></info>
 | ||
| 
 | ||
|       <section xml:id="container.tree.interface">
 | ||
| 	<info><title>Interface</title></info>
 | ||
| 
 | ||
| 	<para>The tree-based container has the following declaration:</para>
 | ||
| 	<programlisting>
 | ||
| 	  template<
 | ||
| 	  typename Key,
 | ||
| 	  typename Mapped,
 | ||
| 	  typename Cmp_Fn = std::less<Key>,
 | ||
| 	  typename Tag = rb_tree_tag,
 | ||
| 	  template<
 | ||
| 	  typename Const_Node_Iterator,
 | ||
| 	  typename Node_Iterator,
 | ||
| 	  typename Cmp_Fn_,
 | ||
| 	  typename Allocator_>
 | ||
| 	  class Node_Update = null_node_update,
 | ||
| 	  typename Allocator = std::allocator<char> >
 | ||
| 	  class tree;
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>The parameters have the following meaning:</para>
 | ||
| 
 | ||
| 	<orderedlist>
 | ||
| 	  <listitem>
 | ||
| 	  <para><classname>Key</classname> is the key type.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	  <para><classname>Mapped</classname> is the mapped-policy.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	  <para><classname>Cmp_Fn</classname> is a key comparison functor.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para><classname>Tag</classname> specifies which underlying data structure
 | ||
| 	  to use.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para><classname>Node_Update</classname> is a policy for updating node
 | ||
| 	  invariants.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para><classname>Allocator</classname> is an allocator
 | ||
| 	  type.</para></listitem>
 | ||
| 	</orderedlist>
 | ||
| 
 | ||
| 	<para>The <classname>Tag</classname> parameter specifies which underlying
 | ||
| 	data structure to use. Instantiating it by <classname>rb_tree_tag</classname>, <classname>splay_tree_tag</classname>, or
 | ||
| 	<classname>ov_tree_tag</classname>,
 | ||
| 	specifies an underlying red-black tree, splay tree, or
 | ||
| 	ordered-vector tree, respectively; any other tag is illegal.
 | ||
| 	Note that containers based on the former two contain more types
 | ||
| 	and methods than the latter (e.g.,
 | ||
| 	<classname>reverse_iterator</classname> and <classname>rbegin</classname>), and different
 | ||
| 	exception and invalidation guarantees.</para>
 | ||
| 
 | ||
|       </section>
 | ||
| 
 | ||
|       <section xml:id="container.tree.details">
 | ||
| 	<info><title>Details</title></info>
 | ||
| 
 | ||
| 	<section xml:id="container.tree.node">
 | ||
| 	  <info><title>Node Invariants</title></info>
 | ||
| 
 | ||
| 
 | ||
| 	  <para>Consider the two trees in the graphic below, labels A and B. The first
 | ||
| 	  is a tree of floats; the second is a tree of pairs, each
 | ||
| 	  signifying a geometric line interval. Each element in a tree is referred to as a node of the tree. Of course, each of
 | ||
| 	  these trees can support the usual queries: the first can easily
 | ||
| 	  search for <classname>0.4</classname>; the second can easily search for
 | ||
| 	  <classname>std::make_pair(10, 41)</classname>.</para>
 | ||
| 
 | ||
| 	  <para>Each of these trees can efficiently support other queries.
 | ||
| 	  The first can efficiently determine that the 2rd key in the
 | ||
| 	  tree is <constant>0.3</constant>; the second can efficiently determine
 | ||
| 	  whether any of its intervals overlaps
 | ||
| 	  <programlisting>std::make_pair(29,42)</programlisting> (useful in geometric
 | ||
| 	  applications or distributed file systems with leases, for
 | ||
| 	  example).  It should be noted that an <classname>std::set</classname> can
 | ||
| 	  only solve these types of problems with linear complexity.</para>
 | ||
| 
 | ||
| 	  <para>In order to do so, each tree stores some metadata in
 | ||
| 	  each node, and maintains node invariants (see <xref linkend="biblio.clrs2001"/>). The first stores in
 | ||
| 	  each node the size of the sub-tree rooted at the node; the
 | ||
| 	  second stores at each node the maximal endpoint of the
 | ||
| 	  intervals at the sub-tree rooted at the node.</para>
 | ||
| 
 | ||
| 	  <figure>
 | ||
| 	    <title>Tree node invariants</title>
 | ||
| 	    <mediaobject>
 | ||
| 	      <imageobject>
 | ||
| 		<imagedata align="center" format="PNG" scale="100"
 | ||
| 			   fileref="../images/pbds_tree_node_invariants.png"/>
 | ||
| 	      </imageobject>
 | ||
| 	      <textobject>
 | ||
| 		<phrase>Tree node invariants</phrase>
 | ||
| 	      </textobject>
 | ||
| 	    </mediaobject>
 | ||
| 	  </figure>
 | ||
| 	  
 | ||
| 	  <para>Supporting such trees is difficult for a number of
 | ||
| 	  reasons:</para>
 | ||
| 
 | ||
| 	  <orderedlist>
 | ||
| 	    <listitem><para>There must be a way to specify what a node's metadata
 | ||
| 	    should be (if any).</para></listitem>
 | ||
| 
 | ||
| 	    <listitem><para>Various operations can invalidate node
 | ||
| 	    invariants.  The graphic below shows how a right rotation,
 | ||
| 	    performed on A, results in B, with nodes x and y having
 | ||
| 	    corrupted invariants (the grayed nodes in C). The graphic shows
 | ||
| 	    how an insert, performed on D, results in E, with nodes x and y
 | ||
| 	    having corrupted invariants (the grayed nodes in F). It is not
 | ||
| 	    feasible to know outside the tree the effect of an operation on
 | ||
| 	    the nodes of the tree.</para></listitem>
 | ||
| 
 | ||
| 	    <listitem><para>The search paths of standard associative containers are
 | ||
| 	    defined by comparisons between keys, and not through
 | ||
| 	    metadata.</para></listitem>
 | ||
| 
 | ||
| 	    <listitem><para>It is not feasible to know in advance which methods trees
 | ||
| 	    can support. Besides the usual <classname>find</classname> method, the
 | ||
| 	    first tree can support a <classname>find_by_order</classname> method, while
 | ||
| 	    the second can support an <classname>overlaps</classname> method.</para></listitem>
 | ||
| 	  </orderedlist>
 | ||
| 
 | ||
| 	  <figure>
 | ||
| 	    <title>Tree node invalidation</title>
 | ||
| 	    <mediaobject>
 | ||
| 	      <imageobject>
 | ||
| 		<imagedata align="center" format="PNG" scale="100"
 | ||
| 			   fileref="../images/pbds_tree_node_invalidations.png"/>
 | ||
| 	      </imageobject>
 | ||
| 	      <textobject>
 | ||
| 		<phrase>Tree node invalidation</phrase>
 | ||
| 	      </textobject>
 | ||
| 	    </mediaobject>
 | ||
| 	  </figure>
 | ||
| 
 | ||
| 	  <para>These problems are solved by a combination of two means:
 | ||
| 	  node iterators, and template-template node updater
 | ||
| 	  parameters.</para>
 | ||
| 
 | ||
| 	  <section xml:id="container.tree.node.iterators">
 | ||
| 	    <info><title>Node Iterators</title></info>
 | ||
| 
 | ||
| 
 | ||
| 	    <para>Each tree-based container defines two additional iterator
 | ||
| 	    types, <classname>const_node_iterator</classname>
 | ||
| 	    and <classname>node_iterator</classname>.
 | ||
| 	    These iterators allow descending from a node to one of its
 | ||
| 	    children. Node iterators allow search paths different from those
 | ||
| 	    determined by the comparison functor. The <classname>tree</classname>
 | ||
| 	    supports the methods:</para>
 | ||
| 	    <programlisting>
 | ||
| 	      const_node_iterator
 | ||
| 	      node_begin() const;
 | ||
| 
 | ||
| 	      node_iterator
 | ||
| 	      node_begin();
 | ||
| 
 | ||
| 	      const_node_iterator
 | ||
| 	      node_end() const;
 | ||
| 
 | ||
| 	      node_iterator
 | ||
| 	      node_end(); 
 | ||
| 	    </programlisting>
 | ||
| 
 | ||
| 	    <para>The first pair returns node iterators corresponding to the
 | ||
| 	    root node of the tree; the latter pair returns node iterators
 | ||
| 	    corresponding to a just-after-leaf node.</para>
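 | ||
| 	    <para>The following is a minimal sketch, assuming GCC's libstdc++ headers, of walking a tree through its node iterators rather than through key comparisons. It relies on the node-iterator members <classname>get_l_child</classname> and <classname>get_r_child</classname>, and on dereferencing a node iterator to obtain an ordinary iterator to the stored value. The helper names (<classname>height</classname>, <classname>in_order</classname>, <classname>demo_in_order</classname>, <classname>demo_height</classname>) are illustrative only.</para>
 | ||
```cpp
#include <algorithm>
#include <cstddef>
#include <vector>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>

// A set of ints; defaults give std::less, rb_tree_tag, null_node_update.
typedef __gnu_pbds::tree<int, __gnu_pbds::null_type> int_set;

// Height of the sub-tree rooted at nd; `end` is the just-after-leaf node.
template<typename Node_It>
std::size_t height(Node_It nd, Node_It end)
{
  if (nd == end)
    return 0;
  return 1 + std::max(height(nd.get_l_child(), end),
                      height(nd.get_r_child(), end));
}

// In-order walk using only node iterators.
template<typename Node_It>
void in_order(Node_It nd, Node_It end, std::vector<int>& out)
{
  if (nd == end)
    return;
  in_order(nd.get_l_child(), end, out);
  out.push_back(**nd);   // *nd is an iterator to the value; **nd is the value
  in_order(nd.get_r_child(), end, out);
}

// Hypothetical demo helpers working on a fixed five-element set.
void fill(int_set& s)
{
  const int keys[] = {5, 1, 4, 2, 3};
  for (int k : keys)
    s.insert(k);
}

std::vector<int> demo_in_order()
{
  int_set s;
  fill(s);
  const int_set& cs = s;   // use the const_node_iterator overloads
  std::vector<int> out;
  in_order(cs.node_begin(), cs.node_end(), out);
  return out;
}

std::size_t demo_height()
{
  int_set s;
  fill(s);
  const int_set& cs = s;
  return height(cs.node_begin(), cs.node_end());
}
```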
 | ||
| 	  </section>
 | ||
| 
 | ||
| 	  <section xml:id="container.tree.node.updator">
 | ||
| 	    <info><title>Node Updater</title></info>
 | ||
| 
 | ||
| 	    <para>The tree-based containers are parametrized by a
 | ||
| 	    <classname>Node_Update</classname> template-template parameter. A
 | ||
| 	    tree-based container instantiates
 | ||
| 	    <classname>Node_Update</classname> to some
 | ||
| 	    <classname>node_update</classname> class, and publicly subclasses
 | ||
| 	    <classname>node_update</classname>. The graphic below shows this
 | ||
| 	    scheme, as well as some predefined policies (which are explained
 | ||
| 	    below).</para>
 | ||
| 
 | ||
| 	    <figure>
 | ||
| 	      <title>A tree and its update policy</title>
 | ||
| 	      <mediaobject>
 | ||
| 		<imageobject>
 | ||
| 		  <imagedata align="center" format="PNG" scale="100"
 | ||
| 			     fileref="../images/pbds_tree_node_updator_policy_cd.png"/>
 | ||
| 		</imageobject>
 | ||
| 		<textobject>
 | ||
| 		  <phrase>A tree and its update policy</phrase>
 | ||
| 		</textobject>
 | ||
| 	      </mediaobject>
 | ||
| 	    </figure>
 | ||
| 
 | ||
| 	    <para><classname>node_update</classname> (an instantiation of
 | ||
| 	    <classname>Node_Update</classname>) must define <classname>metadata_type</classname> as
 | ||
| 	    the type of metadata it requires. For order statistics,
 | ||
| 	    e.g., <classname>metadata_type</classname> might be <classname>size_t</classname>.
 | ||
| 	    The tree defines within each node a <classname>metadata_type</classname>
 | ||
| 	    object.</para>
 | ||
| 
 | ||
| 	    <para><classname>node_update</classname> must also define the following method
 | ||
| 	    for restoring node invariants:</para>
 | ||
| 	    <programlisting>
 | ||
| 	      void 
 | ||
| 	      operator()(node_iterator nd_it, const_node_iterator end_nd_it)
 | ||
| 	    </programlisting>
 | ||
| 
 | ||
| 	    <para>In this method, <varname>nd_it</varname> is a
 | ||
| 	    <classname>node_iterator</classname> corresponding to a node whose
 | ||
| 	    A) all descendants have valid invariants, and B) its own
 | ||
| 	    invariants might be violated; <classname>end_nd_it</classname> is
 | ||
| 	    a <classname>const_node_iterator</classname> corresponding to a
 | ||
| 	    just-after-leaf node. This method should correct the node
 | ||
| 	    invariants of the node pointed to by
 | ||
| 	    <classname>nd_it</classname>. For example, say node x in the
 | ||
| 	    graphic below, labeled A, has an invalid invariant, but its children,
 | ||
| 	    y and z, have valid invariants. After the invocation, all three
 | ||
| 	    nodes should have valid invariants, as in label B.</para>
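 | ||
| 	    <para>As a hedged illustration of these requirements, the sketch below defines a node updater for the interval tree discussed earlier: each node's metadata is the largest endpoint in its sub-tree, the update operator recomputes it from the node's children, and an <classname>overlaps</classname> query uses it to steer the search. It assumes GCC's libstdc++ headers; the names <classname>intervals_node_update</classname>, <classname>interval_set</classname>, and <classname>any_overlap</classname> are illustrative, and the search follows the interval-tree algorithm in <xref linkend="biblio.clrs2001"/>.</para>
 | ||
```cpp
#include <algorithm>
#include <functional>
#include <utility>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>

// A closed interval [first, second].
typedef std::pair<unsigned int, unsigned int> interval;

template<class Node_CItr, class Node_Itr, class Cmp_Fn, class Alloc>
struct intervals_node_update
{
  // Each node stores the largest endpoint found in its sub-tree.
  typedef unsigned int metadata_type;

  // True if some stored interval overlaps r.
  bool overlaps(const interval& r) const
  {
    Node_CItr nd = node_begin();
    const Node_CItr end = node_end();
    while (nd != end)
      {
        if (r.second >= (*nd)->first && r.first <= (*nd)->second)
          return true;
        const Node_CItr l = nd.get_l_child();
        const unsigned int l_max = (l == end) ? 0 : l.get_metadata();
        // Descend left only if the left sub-tree can still reach r.
        nd = (l_max >= r.first) ? l : nd.get_r_child();
      }
    return false;
  }

protected:
  // Restores the invariant of nd_it, whose descendants are already valid.
  void operator()(Node_Itr nd_it, Node_CItr end_nd_it)
  {
    const unsigned int l_max = (nd_it.get_l_child() == end_nd_it)
      ? 0 : nd_it.get_l_child().get_metadata();
    const unsigned int r_max = (nd_it.get_r_child() == end_nd_it)
      ? 0 : nd_it.get_r_child().get_metadata();
    // get_metadata() yields a const reference, hence the cast.
    const_cast<unsigned int&>(nd_it.get_metadata()) =
      std::max((*nd_it)->second, std::max(l_max, r_max));
  }

  // Implemented by the tree, which derives from this class.
  virtual Node_CItr node_begin() const = 0;
  virtual Node_CItr node_end() const = 0;

  virtual ~intervals_node_update() { }
};

typedef __gnu_pbds::tree<
  interval, __gnu_pbds::null_type, std::less<interval>,
  __gnu_pbds::rb_tree_tag, intervals_node_update> interval_set;

// Hypothetical demo: query a fixed set {[10,20], [30,40], [50,60]}.
bool any_overlap(unsigned int lo, unsigned int hi)
{
  interval_set s;
  s.insert(interval(10, 20));
  s.insert(interval(30, 40));
  s.insert(interval(50, 60));
  return s.overlaps(interval(lo, hi));
}
```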
 | ||
| 
 | ||
| 
 | ||
| 	    <figure>
 | ||
| 	      <title>Restoring node invariants</title>
 | ||
| 	      <mediaobject>
 | ||
| 		<imageobject>
 | ||
| 		  <imagedata align="center" format="PNG" scale="100"
 | ||
| 			     fileref="../images/pbds_restoring_node_invariants.png"/>
 | ||
| 		</imageobject>
 | ||
| 		<textobject>
 | ||
| 		  <phrase>Restoring node invariants</phrase>
 | ||
| 		</textobject>
 | ||
| 	      </mediaobject>
 | ||
| 	    </figure>
 | ||
| 
 | ||
| 	    <para>When a tree operation might invalidate some node invariant,
 | ||
| 	    it invokes this method in its <classname>node_update</classname> base to
 | ||
| 	    restore the invariant. For example, the graphic below shows
 | ||
| 	    an <function>insert</function> operation (point A); the tree performs some
 | ||
| 	    operations, and calls the update functor three times (points B,
 | ||
| 	    C, and D). (It is well known that any <function>insert</function>,
 | ||
| 	    <function>erase</function>, <function>split</function> or <function>join</function>, can restore
 | ||
| 	    all node invariants by a small number of node invariant updates; see <xref linkend="biblio.clrs2001"/>.)
 | ||
| 	    </para>
 | ||
| 
 | ||
| 	    <figure>
 | ||
| 	      <title>Insert update sequence</title>
 | ||
| 	      <mediaobject>
 | ||
| 		<imageobject>
 | ||
| 		  <imagedata align="center" format="PNG" scale="100"
 | ||
| 			     fileref="../images/pbds_update_seq_diagram.png"/>
 | ||
| 		</imageobject>
 | ||
| 		<textobject>
 | ||
| 		  <phrase>Insert update sequence</phrase>
 | ||
| 		</textobject>
 | ||
| 	      </mediaobject>
 | ||
| 	    </figure>
 | ||
| 
 | ||
| 	    <para>To complete the description of the scheme, three questions
 | ||
| 	    need to be answered:</para>
 | ||
| 
 | ||
| 	    <orderedlist>
 | ||
| 	      <listitem><para>How can a tree which supports order statistics define a
 | ||
| 	      method such as <classname>find_by_order</classname>?</para></listitem>
 | ||
| 
 | ||
| 	      <listitem><para>How can the node updater base access methods of the
 | ||
| 	      tree?</para></listitem>
 | ||
| 
 | ||
| 	      <listitem><para>How can the following cyclic dependency be resolved?
 | ||
| 	      <classname>node_update</classname> is a base class of the tree, yet it
 | ||
| 	      uses node iterators defined in the tree (its child).</para></listitem>
 | ||
| 	    </orderedlist>
 | ||
| 
 | ||
| 	    <para>The first two questions are answered by the fact that
 | ||
| 	    <classname>node_update</classname> (an instantiation of
 | ||
| 	    <classname>Node_Update</classname>) is a <emphasis>public</emphasis> base class
 | ||
| 	    of the tree. Consequently:</para>
 | ||
| 
 | ||
| 	    <orderedlist>
 | ||
| 	      <listitem><para>Any public methods of
 | ||
| 	      <classname>node_update</classname> are automatically methods of
 | ||
| 	      the tree (<xref linkend="biblio.alexandrescu01modern"/>).
 | ||
| 	      Thus an order-statistics node updater,
 | ||
| 	      <classname>tree_order_statistics_node_update</classname> defines
 | ||
| 	      the <function>find_by_order</function> method; any tree
 | ||
| 	      instantiated by this policy consequently supports this method as
 | ||
| 	      well.</para></listitem>
 | ||
| 
 | ||
| 	      <listitem><para>In C++, if a base class declares a method as
 | ||
| 	      <literal>virtual</literal>, it is
 | ||
| 	      <literal>virtual</literal> in its subclasses. If
 | ||
| 	      <classname>node_update</classname> needs to access one of the
 | ||
| 	      tree's methods, say the member function
 | ||
| 	      <function>end</function>, it simply declares that method as
 | ||
| 	      pure <literal>virtual</literal>.</para></listitem>
 | ||
| 	    </orderedlist>
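 | ||
| 	    <para>A minimal sketch of the first point, assuming GCC's libstdc++ headers: instantiating <classname>tree</classname> with <classname>tree_order_statistics_node_update</classname> makes the updater's public methods <function>find_by_order</function> (zero-based) and <function>order_of_key</function> available on the container itself. The helper names <classname>kth_smallest</classname> and <classname>rank_of</classname> are illustrative only.</para>
 | ||
```cpp
#include <cstddef>
#include <functional>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>

typedef __gnu_pbds::tree<
  int, __gnu_pbds::null_type, std::less<int>,
  __gnu_pbds::rb_tree_tag,
  __gnu_pbds::tree_order_statistics_node_update> ordered_set;

// Hypothetical demo data: the keys 10, 20, 30, 40, inserted out of order.
void fill(ordered_set& s)
{
  const int keys[] = {30, 10, 40, 20};
  for (int k : keys)
    s.insert(k);
}

// The k-th smallest key (k counts from 0); k must be below the set size.
int kth_smallest(std::size_t k)
{
  ordered_set s;
  fill(s);
  return *s.find_by_order(k);   // method inherited from the node updater
}

// Number of keys strictly smaller than key.
std::size_t rank_of(int key)
{
  ordered_set s;
  fill(s);
  return s.order_of_key(key);   // method inherited from the node updater
}
```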
 | ||
| 
 | ||
| 	    <para>The cyclic dependency is solved through template-template
 | ||
| 	    parameters. <classname>Node_Update</classname> is parametrized by
 | ||
| 	    the tree's node iterators, its comparison functor, and its
 | ||
| 	    allocator type. Thus, instantiations of
 | ||
| 	    <classname>Node_Update</classname> have all information
 | ||
| 	    required.</para>
 | ||
| 
 | ||
| 	    <para>This library assumes that constructing a metadata object and
 | ||
| 	    modifying it are exception free. Suppose that during some method,
 | ||
| 	    say <classname>insert</classname>, a metadata-related operation
 | ||
| 	    (e.g., changing the value of a metadata object) throws an exception.
 | ||
| 	    Rolling back the method would then be unusually complex.</para>
 | ||
| 
 | ||
| 	    <para>Previously, a distinction was made between redundant
 | ||
| 	    policies and null policies. Node invariants show a
 | ||
| 	    case where null policies are required.</para>
 | ||
| 
 | ||
| 	    <para>Assume a regular tree is required, one which need not
 | ||
| 	    support order statistics or interval overlap queries.
 | ||
| 	    Seemingly, in this case a redundant policy (a policy which
 | ||
| 	    doesn't affect nodes' contents) would suffice. This would lead
 | ||
| 	    to the following drawbacks:</para>
 | ||
| 
 | ||
| 	    <orderedlist>
 | ||
| 	      <listitem><para>Each node would carry a useless metadata object, wasting
 | ||
| 	      space.</para></listitem>
 | ||
| 
 | ||
| 	      <listitem><para>The tree cannot know if its
 | ||
| 	      <classname>Node_Update</classname> policy actually modifies a
 | ||
| 	      node's metadata (this is halting reducible). In the graphic
 | ||
| 	      below, assume the shaded node is inserted. The tree would have
 | ||
| 	      to traverse the useless path shown to the root, applying
 | ||
| 	      redundant updates all the way.</para></listitem>
 | ||
| 	    </orderedlist>
 | ||
| 	    <figure>
 | ||
| 	      <title>Useless update path</title>
 | ||
| 	      <mediaobject>
 | ||
| 		<imageobject>
 | ||
| 		  <imagedata align="center" format="PNG" scale="100"
 | ||
| 			     fileref="../images/pbds_rationale_null_node_updator.png"/>
 | ||
| 		</imageobject>
 | ||
| 		<textobject>
 | ||
| 		  <phrase>Useless update path</phrase>
 | ||
| 		</textobject>
 | ||
| 	      </mediaobject>
 | ||
| 	    </figure>
 | ||
| 
 | ||
| 
 | ||
| 	    <para>A null policy class, <classname>null_node_update</classname>,
 | ||
| 	    solves both these problems. The tree detects that node
 | ||
| 	    invariants are irrelevant, and defines all accordingly.</para>
 | ||
| 
 | ||
| 	  </section>
 | ||
| 
 | ||
| 	</section> 
 | ||
| 
 | ||
| 	<section xml:id="container.tree.details.split">
 | ||
| 	  <info><title>Split and Join</title></info>
 | ||
| 
 | ||
| 	  <para>Tree-based containers support split and join methods.
 | ||
| 	  It is possible to split a tree so that it passes
 | ||
| 	  all nodes with keys larger than a given key to a different
 | ||
| 	  tree. These methods have the following advantages over the
 | ||
| 	  alternative of externally inserting to the destination
 | ||
| 	  tree and erasing from the source tree:</para>
 | ||
| 
 | ||
| 	  <orderedlist>
 | ||
| 	    <listitem><para>These methods are efficient - red-black trees are split
 | ||
| 	    and joined in poly-logarithmic complexity; ordered-vector
 | ||
| 	    trees are split and joined at linear complexity. The
 | ||
| 	    alternatives have super-linear complexity.</para></listitem>
 | ||
| 
 | ||
| 	    <listitem><para>Aside from orders of growth, these operations perform
 | ||
| 	    few allocations and de-allocations. For red-black trees, allocations are not performed,
 | ||
| 	    and the methods are exception-free. </para></listitem>
 | ||
| 	  </orderedlist>
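 | ||
| 	  <para>A minimal sketch of these methods, assuming GCC's libstdc++ headers: <function>split</function> moves every key larger than the given key into the other tree, and <function>join</function> moves everything back, provided the key ranges of the two trees do not overlap. The struct <classname>split_join_result</classname> and the function <classname>split_then_join</classname> are illustrative only.</para>
 | ||
```cpp
#include <cstddef>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>

// A red-black tree set of ints (the default Tag is rb_tree_tag).
typedef __gnu_pbds::tree<int, __gnu_pbds::null_type> int_set;

// Hypothetical record of the container sizes observed along the way.
struct split_join_result
{
  std::size_t low_after_split;
  std::size_t high_after_split;
  int smallest_moved;
  std::size_t low_after_join;
  std::size_t high_after_join;
};

split_join_result split_then_join()
{
  int_set low, high;
  for (int i = 1; i <= 10; ++i)
    low.insert(i);

  split_join_result r;

  // Keys larger than 5 move to `high`; `low` keeps 1..5.
  low.split(5, high);
  r.low_after_split = low.size();
  r.high_after_split = high.size();
  r.smallest_moved = *high.begin();

  // All keys of `high` exceed those of `low`, so the join is legal;
  // afterwards `high` is empty.
  low.join(high);
  r.low_after_join = low.size();
  r.high_after_join = high.size();
  return r;
}
```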
 | ||
| 	</section>
 | ||
| 
 | ||
|       </section> <!-- details -->
 | ||
| 
 | ||
|     </section> <!-- tree -->
 | ||
| 
 | ||
|     <!-- trie -->
 | ||
|     <section xml:id="pbds.design.container.trie">
 | ||
|       <info><title>Trie</title></info>
 | ||
| 
 | ||
|       <section xml:id="container.trie.interface">
 | ||
| 	<info><title>Interface</title></info>
 | ||
| 
 | ||
| 	<para>The trie-based container has the following declaration:</para>
 | ||
| 	<programlisting>
 | ||
| 	  template<typename Key,
 | ||
| 	  typename Mapped,
 | ||
| 	  typename E_Access_Traits = /* default access traits for Key */,
 | ||
| 	  typename Tag = pat_trie_tag,
 | ||
| 	  template<typename Const_Node_Iterator,
 | ||
| 	  typename Node_Iterator,
 | ||
| 	  typename E_Access_Traits_,
 | ||
| 	  typename Allocator_>
 | ||
| 	  class Node_Update = null_node_update,
 | ||
| 	  typename Allocator = std::allocator<char> >
 | ||
| 	  class trie;
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>The parameters have the following meaning:</para>
 | ||
| 
 | ||
| 	<orderedlist>
 | ||
| 	  <listitem><para><classname>Key</classname> is the key type.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para><classname>Mapped</classname> is the mapped-policy.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para><classname>E_Access_Traits</classname> is described below.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para><classname>Tag</classname> specifies which underlying data structure
 | ||
| 	  to use, and is described shortly.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para><classname>Node_Update</classname> is a policy for updating node
 | ||
| 	  invariants. This is described below.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para><classname>Allocator</classname> is an allocator
 | ||
| 	  type.</para></listitem>
 | ||
| 	</orderedlist>
 | ||
| 
 | ||
| 	<para>The <classname>Tag</classname> parameter specifies which underlying
 | ||
| 	data structure to use. Instantiating it by <classname>pat_trie_tag</classname> specifies an
 | ||
| 	underlying PATRICIA trie (explained shortly); any other tag is
 | ||
| 	currently illegal.</para>
 | ||
| 
 | ||
| 	<para>Following is a description of a (PATRICIA) trie
 | ||
| 	(this implementation follows <xref linkend="biblio.okasaki98mereable"/> and 
 | ||
| 	<xref linkend="biblio.filliatre2000ptset"/>). 
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>A (PATRICIA) trie is similar to a tree, but with the
 | ||
| 	following differences:</para>
 | ||
| 
 | ||
| 	<orderedlist>
 | ||
| 	  <listitem><para>It explicitly views keys as a sequence of elements.
 | ||
| 	  E.g., a trie can view a string as a sequence of
 | ||
| 	  characters; a trie can view a number as a sequence of
 | ||
| 	  bits.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para>It is not (necessarily) binary. Each node has fan-out n
 | ||
| 	  + 1, where n is the number of distinct
 | ||
| 	  elements.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para>It stores values only at leaf nodes.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para>Internal nodes have the properties that A) each has at
 | ||
| 	  least two children, and B) each shares the same prefix with
 | ||
| 	  all of its descendants.</para></listitem>
 | ||
| 	</orderedlist>
 | ||
| 
 | ||
| 	<para>A (PATRICIA) trie has some useful properties:</para>
 | ||
| 
 | ||
| 	<orderedlist>
 | ||
| 	  <listitem><para>It can be configured to use large node fan-out, giving it
 | ||
| 	  very efficient find performance (albeit at the cost of insertion
 | ||
| 	  complexity and size).</para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para>It works well for common-prefix keys.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para>It can efficiently support queries such as which
 | ||
| 	  keys match a certain prefix. This is sometimes useful in file
 | ||
| 	  systems and routers, and for "type-ahead" aka predictive text matching
 | ||
| 	  on mobile devices.</para></listitem>
 | ||
| 	</orderedlist>
 | ||
| 
 | ||
| 
 | ||
|       </section>
 | ||
| 
 | ||
|       <section xml:id="container.trie.details">
 | ||
| 	<info><title>Details</title></info>
 | ||
| 
 | ||
| 	<section xml:id="container.trie.details.etraits">
 | ||
| 	  <info><title>Element Access Traits</title></info>
 | ||
| 
 | ||
| 	  <para>A trie inherently views its keys as sequences of elements.
 | ||
| 	  For example, a trie can view a string as a sequence of
 | ||
| 	  characters. A trie needs to map each of n elements to a
 | ||
| 	  number in {0, ..., n - 1}. For example, a trie can map a
 | ||
| 	  character <varname>c</varname> to
 | ||
| 	  <programlisting>static_cast<size_t>(c)</programlisting>.</para>
 | ||
| 
 | ||
| 	  <para>Seemingly, then, a trie can assume that its keys support
 | ||
| 	  (const) iterators, and that the <classname>value_type</classname> of this
 | ||
| 	  iterator can be cast to a <classname>size_t</classname>. There are several
 | ||
| 	  reasons, though, to decouple the mechanism by which the trie
 | ||
| 	  accesses its keys' elements from the trie:</para>
 | ||
| 
 | ||
| 	  <orderedlist>
 | ||
| 	    <listitem><para>In some cases, the numerical value of an element is
 | ||
| 	    inappropriate. Consider a trie storing DNA strings. It is
 | ||
| 	    logical to use a trie with a fan-out of 5 = 1 + |{'A', 'C',
 | ||
| 	    'G', 'T'}|. This requires mapping 'T' to 3, though.</para></listitem>
 | ||
| 
 | ||
| 	    <listitem><para>In some cases the keys' iterators are different than what
 | ||
| 	    is needed. For example, a trie can be used to search for
 | ||
| 	    common suffixes, by using strings'
 | ||
| 	    <classname>reverse_iterator</classname>. As another example, a trie mapping
 | ||
| 	    UNICODE strings would have a huge fan-out if each node would
 | ||
| 	    branch on a UNICODE character; instead, one can define an
 | ||
| 	    iterator iterating over 8-bit (or less) groups.</para></listitem>
 | ||
| 	  </orderedlist>
 | ||
| 
 | ||
| 	  <para><classname>trie</classname> is,
 | ||
| 	  consequently, parametrized by <classname>E_Access_Traits</classname>,
 | ||
| 	  traits which instruct how to access sequences' elements.
 | ||
| 	  <classname>string_trie_e_access_traits</classname>
 | ||
| 	  is a traits class for strings. Each such traits class defines some
 | ||
| 	  types; for example,</para>
 | ||
| 	  <programlisting>
 | ||
| 	    typename E_Access_Traits::const_iterator
 | ||
| 	  </programlisting>
 | ||
| 
 | ||
| 	  <para>is a const iterator iterating over a key's elements. The
 | ||
| 	  traits class must also define methods for obtaining an iterator
 | ||
| 	  to the first and last element of a key.</para>
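 | ||
| 	  <para>The DNA case above can be sketched as follows, assuming GCC's libstdc++ headers. The member names required of a traits class here (<classname>size_type</classname>, <classname>key_type</classname>, <classname>key_const_reference</classname>, <classname>e_type</classname>, <classname>const_iterator</classname>, <classname>max_size</classname>, <function>begin</function>, <function>end</function>, <function>e_pos</function>) follow recent GCC releases and should be treated as an assumption; <classname>dna_access_traits</classname>, <classname>dna_map</classname>, and the helper functions are illustrative names.</para>
 | ||
```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/trie_policy.hpp>

// Element-access traits for DNA strings over the alphabet {A, C, G, T}.
struct dna_access_traits
{
  typedef std::size_t size_type;
  typedef std::string key_type;
  typedef const key_type& key_const_reference;
  typedef char e_type;
  typedef std::string::const_iterator const_iterator;

  // Number of distinct elements.
  enum { max_size = 4 };

  // Iterator to the first element of a key.
  static const_iterator begin(key_const_reference k)
  { return k.begin(); }

  // Iterator just past the last element of a key.
  static const_iterator end(key_const_reference k)
  { return k.end(); }

  // Maps an element to a position in [0, max_size).
  static size_type e_pos(e_type e)
  {
    switch (e)
      {
      case 'A': return 0;
      case 'C': return 1;
      case 'G': return 2;
      case 'T': return 3;
      default: std::abort();   // not a DNA character
      }
  }
};

// A PATRICIA trie mapping DNA strings to ints.
typedef __gnu_pbds::trie<std::string, int, dna_access_traits> dna_map;

// Hypothetical demo data.
void fill(dna_map& t)
{
  t["ACCGGTTACTGGTA"] = 1;
  t["ACCGTTGCAGCTGA"] = 2;
  t["GGCGAACGTGCCGA"] = 3;
}

// Returns the mapped value for key, or -1 if the key is absent.
int lookup(const std::string& key)
{
  dna_map t;
  fill(t);
  dna_map::const_iterator it = t.find(key);
  return it == t.end() ? -1 : it->second;
}

std::size_t demo_size()
{
  dna_map t;
  fill(t);
  return t.size();
}
```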
 | ||
| 
 | ||
| 	  <para>The graphic below shows a
 | ||
| 	  (PATRICIA) trie resulting from inserting the words: "I wish
 | ||
| 	  that I could ever see a poem lovely as a trie" (which,
 | ||
| 	  unfortunately, does not rhyme).</para>
 | ||
| 
 | ||
| 	  <para>The leaf nodes contain values; each internal node contains
 | ||
| 	  two <classname>typename E_Access_Traits::const_iterator</classname>
 | ||
| 	  objects, indicating the maximal common prefix of all keys in
 | ||
| 	  the sub-tree. For example, the shaded internal node roots a
 | ||
| 	  sub-tree with leaves "a" and "as". The maximal common prefix is
 | ||
| 	  "a". The internal node contains, consequently, two const
 | ||
| 	  iterators, one pointing to <varname>'a'</varname>, and the other to
 | ||
| 	  <varname>'s'</varname>.</para>
 | ||
| 
 | ||
| 	  <figure>
 | ||
| 	    <title>A PATRICIA trie</title>
 | ||
| 	    <mediaobject>
 | ||
| 	      <imageobject>
 | ||
| 		<imagedata align="center" format="PNG" scale="100"
 | ||
| 			   fileref="../images/pbds_pat_trie.png"/>
 | ||
| 	      </imageobject>
 | ||
| 	      <textobject>
 | ||
| 		<phrase>A PATRICIA trie</phrase>
 | ||
| 	      </textobject>
 | ||
| 	    </mediaobject>
 | ||
| 	  </figure>
 | ||
| 
 | ||
| 	</section>
 | ||
| 
 | ||
| 	<section xml:id="container.trie.details.node">
 | ||
| 	  <info><title>Node Invariants</title></info>
 | ||
| 
 | ||
| 	  <para>Trie-based containers support node invariants, as do
 | ||
| 	  tree-based containers. There are two minor
 | ||
| 	  differences, though, which, unfortunately, prevent them from
 | ||
| 	  sharing the same node-updating policies:</para>
 | ||
| 
 | ||
| 	  <orderedlist>
 | ||
| 	    <listitem>
 | ||
| 	      <para>A trie's <classname>Node_Update</classname> template-template
 | ||
| 	      parameter is parametrized by <classname>E_Access_Traits</classname>, while
 | ||
| 	      a tree's <classname>Node_Update</classname> template-template parameter is
 | ||
| 	    parametrized by <classname>Cmp_Fn</classname>.</para></listitem>
 | ||
| 
 | ||
| 	    <listitem><para>Tree-based containers store values in all nodes, while
 | ||
| 	    trie-based containers (at least in this implementation) store
 | ||
| 	    values in leaves.</para></listitem>
 | ||
| 	  </orderedlist>
 | ||
| 
 | ||
| 	  <para>The graphic below shows the scheme, as well as some predefined
 | ||
| 	  policies (which are explained below).</para>
 | ||
| 
 | ||
| 	  <figure>
 | ||
| 	    <title>A trie and its update policy</title>
 | ||
| 	    <mediaobject>
 | ||
| 	      <imageobject>
 | ||
| 		<imagedata align="center" format="PNG" scale="100"
 | ||
| 			   fileref="../images/pbds_trie_node_updator_policy_cd.png"/>
 | ||
| 	      </imageobject>
 | ||
| 	      <textobject>
 | ||
| 		<phrase>A trie and its update policy</phrase>
 | ||
| 	      </textobject>
 | ||
| 	    </mediaobject>
 | ||
| 	  </figure>
 | ||
| 
 | ||
| 
 | ||
| 	  <para>This library offers the following pre-defined trie node
 | ||
| 	  updating policies:</para>
 | ||
| 
 | ||
| 	  <orderedlist>
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		<classname>trie_order_statistics_node_update</classname>
 | ||
| 		supports order statistics.
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 
 | ||
| 	    <listitem><para><classname>trie_prefix_search_node_update</classname>
 | ||
| 	    supports searching for ranges that match a given prefix.</para></listitem>
 | ||
| 
 | ||
| 	    <listitem><para><classname>null_node_update</classname>
 | ||
| 	    is the null node updater.</para></listitem>
 | ||
| 	  </orderedlist>
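 | ||
| 	  <para>A minimal sketch of prefix search, assuming GCC's libstdc++ headers: a trie instantiated with <classname>trie_prefix_search_node_update</classname> gains a <function>prefix_range</function> method returning a pair of iterators that delimit the keys matching a prefix. The sketch uses the spelling <classname>trie_string_access_traits</classname> found in recent GCC releases for the string traits class, which is an assumption relative to the name used in this text; <classname>word_trie</classname> and <classname>words_with_prefix</classname> are illustrative names.</para>
 | ||
```cpp
#include <algorithm>
#include <string>
#include <vector>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/trie_policy.hpp>
#include <ext/pb_ds/tag_and_trait.hpp>

typedef __gnu_pbds::trie<
  std::string, __gnu_pbds::null_type,
  __gnu_pbds::trie_string_access_traits<>,
  __gnu_pbds::pat_trie_tag,
  __gnu_pbds::trie_prefix_search_node_update> word_trie;

// Hypothetical demo: all stored words that start with `prefix`, sorted.
std::vector<std::string> words_with_prefix(const std::string& prefix)
{
  const char* const words[] = {
    "I", "wish", "that", "I", "could", "ever", "see",
    "a", "poem", "lovely", "as", "a", "trie"
  };

  word_trie t;
  for (const char* w : words)
    t.insert(w);   // duplicate words are stored once

  std::vector<std::string> out;
  // prefix_range is a public method of the node updater, and hence
  // of the trie itself.
  auto range = t.prefix_range(prefix);
  for (auto it = range.first; it != range.second; ++it)
    out.push_back(*it);

  std::sort(out.begin(), out.end());
  return out;
}
```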
 | ||
| 
 | ||
| 	</section>
 | ||
| 
 | ||
| 	<section xml:id="container.trie.details.split">
 | ||
| 	  <info><title>Split and Join</title></info>
 | ||
| 	  <para>Trie-based containers support split and join methods; the
 | ||
| 	  rationale is equal to that of tree-based containers supporting
 | ||
| 	  these methods.</para>
 | ||
| 	</section>
 | ||
| 
 | ||
|       </section> <!-- details -->
 | ||
| 
 | ||
|     </section> <!-- trie -->
 | ||
| 
 | ||
|     <!-- list_update -->
 | ||
|     <section xml:id="pbds.design.container.list">
 | ||
|       <info><title>List</title></info>
 | ||
| 
 | ||
|       <section xml:id="container.list.interface">
 | ||
| 	<info><title>Interface</title></info>
 | ||
| 
 | ||
| 	<para>The list-based container has the following declaration:</para>
 | ||
| 	<programlisting>
 | ||
| 	  template<typename Key,
 | ||
| 	  typename Mapped,
 | ||
| 	  typename Eq_Fn = std::equal_to<Key>,
 | ||
| 	  typename Update_Policy = lu_move_to_front_policy<>,
 | ||
| 	  typename Allocator = std::allocator<char> >
 | ||
| 	  class list_update;
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>The parameters have the following meaning:</para>
 | ||
| 
 | ||
| 	<orderedlist>
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      <classname>Key</classname> is the key type.
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      <classname>Mapped</classname> is the mapped-policy.
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      <classname>Eq_Fn</classname> is a key equivalence functor.
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      <classname>Update_Policy</classname> is a policy updating positions in
 | ||
| 	      the list based on access patterns. It is described in the
 | ||
| 	      following subsection.
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 
 | ||
| 	  <listitem>
 | ||
| 	    <para>
 | ||
| 	      <classname>Allocator</classname> is an allocator type.
 | ||
| 	    </para>
 | ||
| 	  </listitem>
 | ||
| 	</orderedlist>
 | ||
| 
 | ||
| 	<para>A list-based associative container is a container that
 | ||
| 	stores elements in a linked-list. It does not order the elements
 | ||
| 	by any particular order related to the keys.  List-based
 | ||
| 	containers are primarily useful for creating "multimaps". In fact,
 | ||
| 	list-based containers are designed in this library expressly for
 | ||
| 	this purpose.</para>
 | ||
| 
 | ||
| 	<para>List-based containers might also be useful for some rare
 | ||
| 	cases, where a key is encapsulated to the extent that only
 | ||
| 	key-equivalence can be tested. Hash-based containers need to know
 | ||
| 	how to transform a key into a size type, and tree-based containers
 | ||
| 	need to know if some key is larger than another.  List-based
 | ||
| 	associative containers, conversely, only need to know if two keys
 | ||
| 	are equivalent.</para>
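 | ||
| 	<para>The equivalence-only case can be sketched as follows, assuming a hypothetical key type <classname>opaque_handle</classname> that offers nothing but <function>operator==</function>. The helper <function>lookup_or</function> is likewise an illustrative name, not part of the library.</para>
 | ||
```cpp
#include <ext/pb_ds/assoc_container.hpp>
#include <functional>
#include <utility>

// Hypothetical key type: it can be compared for equality, but it can be
// neither hashed nor ordered.
struct opaque_handle
{
  explicit opaque_handle(int id) : id_(id) { }

  bool
  operator==(const opaque_handle& other) const
  { return id_ == other.id_; }

private:
  int id_;
};

// The default Eq_Fn, std::equal_to<opaque_handle>, is all the container needs.
typedef __gnu_pbds::list_update<opaque_handle, int> handle_map;

// Returns the value mapped to h, or fallback if h is not stored.
// The map is taken by non-const reference because a successful find may
// reorder the underlying list.
int
lookup_or(handle_map& m, const opaque_handle& h, int fallback)
{
  handle_map::point_iterator it = m.find(h);
  return it == m.end() ? fallback : it->second;
}
```
 | ||
| 	<para>Neither a hash-based nor a tree-based container could be instantiated with this key type without extra functors.</para>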
 | ||
| 
 | ||
| 	<para>Since a list-based associative container does not order
 | ||
| 	elements by keys, is it possible to order the list in some
 | ||
| 	useful manner? Remarkably, many on-line competitive
 | ||
| 	algorithms exist for reordering lists to reflect access
 | ||
| 	prediction (see <xref linkend="biblio.motwani95random"/> and <xref linkend="biblio.andrew04mtf"/>).
 | ||
| 	</para>
 | ||
| 
 | ||
|       </section>
 | ||
| 
 | ||
|       <section xml:id="container.list.details">
 | ||
| 	<info><title>Details</title></info>
 | ||
| 	<para>
 | ||
| 	</para>
 | ||
| 	<section xml:id="container.list.details.ds">
 | ||
| 	  <info><title>Underlying Data Structure</title></info>
 | ||
| 
 | ||
| 	  <para>The graphic below shows a
 | ||
| 	  simple list of integer keys. If we search for the integer 6, we
 | ||
| 	  are paying an overhead: the link with key 6 is only the fifth
 | ||
| 	  link; if it were the first link, it could be accessed
 | ||
| 	  faster.</para>
 | ||
| 
 | ||
| 	  <figure>
 | ||
| 	    <title>A simple list</title>
 | ||
| 	    <mediaobject>
 | ||
| 	      <imageobject>
 | ||
| 		<imagedata align="center" format="PNG" scale="100"
 | ||
| 			   fileref="../images/pbds_simple_list.png"/>
 | ||
| 	      </imageobject>
 | ||
| 	      <textobject>
 | ||
| 		<phrase>A simple list</phrase>
 | ||
| 	      </textobject>
 | ||
| 	    </mediaobject>
 | ||
| 	  </figure>
 | ||
| 
 | ||
| 	  <para>List-update algorithms reorder lists as elements are
 | ||
| 	  accessed. They try to determine, by the access history, which
 | ||
| 	  keys to move to the front of the list. Some of these algorithms
 | ||
| 	  require adding some metadata alongside each entry.</para>
 | ||
| 
 | ||
| 	  <para>For example, in the graphic below label A shows the counter
 | ||
| 	  algorithm. Each node contains both a key and a count metadata
 | ||
| 	  (shown in bold). When an element is accessed (e.g. 6) its count is
 | ||
| 	  incremented, as shown in label B. If the count reaches some
 | ||
| 	  predetermined value, say 10, as shown in label C, the count is set
 | ||
| 	  to 0 and the node is moved to the front of the list, as in label
 | ||
| 	  D.
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <figure>
 | ||
| 	    <title>The counter algorithm</title>
 | ||
| 	    <mediaobject>
 | ||
| 	      <imageobject>
 | ||
| 		<imagedata align="center" format="PNG" scale="100"
 | ||
| 			   fileref="../images/pbds_list_update.png"/>
 | ||
| 	      </imageobject>
 | ||
| 	      <textobject>
 | ||
| 		<phrase>The counter algorithm</phrase>
 | ||
| 	      </textobject>
 | ||
| 	    </mediaobject>
 | ||
| 	  </figure>
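 | ||
| 	  <para>The counter algorithm can be sketched independently of this library. The following uses <classname>std::list</classname> purely for illustration; <classname>counted_node</classname> and <function>counter_find</function> are hypothetical names and do not reflect the library's internal representation.</para>
 | ||
```cpp
#include <cassert>
#include <cstddef>
#include <list>

// One list node: a key plus the count metadata kept by the algorithm.
struct counted_node
{
  int key;
  std::size_t count;
};

// Searches for key. On a hit the node's count is incremented; when the
// count reaches max_count it is reset to 0 and the node is moved to the
// front of the list. Returns true if the key was found.
bool
counter_find(std::list<counted_node>& nodes, int key, std::size_t max_count)
{
  assert(max_count > 0);
  for (std::list<counted_node>::iterator it = nodes.begin();
       it != nodes.end(); ++it)
    {
      if (it->key != key)
        continue;
      if (++it->count == max_count)
        {
          it->count = 0;
          // splice relinks the node without copying it.
          nodes.splice(nodes.begin(), nodes, it);
        }
      return true;
    }
  return false;
}
```
 | ||
| 	  <para>A larger threshold makes the list slower to adapt but less sensitive to a single stray access.</para>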
 | ||
| 
 | ||
| 
 | ||
| 	</section>
 | ||
| 
 | ||
| 	<section xml:id="container.list.details.policies">
 | ||
| 	  <info><title>Policies</title></info>
 | ||
| 
 | ||
| 	  <para>This library allows instantiating lists with policies
 | ||
| 	  implementing any algorithm moving nodes to the front of the
 | ||
| 	  list (policies implementing algorithms interchanging nodes are
 | ||
| 	  unsupported).</para>
 | ||
| 
 | ||
| 	  <para>Associative containers based on lists are parametrized by a
 | ||
| 	  <classname>Update_Policy</classname> parameter. This parameter defines the
 | ||
| 	  type of metadata each node contains, how to create the
 | ||
| 	  metadata, and how to decide, using this metadata, whether to
 | ||
| 	  move a node to the front of the list. A list-based associative
 | ||
| 	  container object derives (publicly) from its update policy.
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <para>An instantiation of <classname>Update_Policy</classname> must define
 | ||
| 	  internally <classname>update_metadata</classname> as the metadata it
 | ||
| 	  requires. Internally, each node of the list contains, besides
 | ||
| 	  the usual key and data, an instance of <classname>typename
 | ||
| 	  Update_Policy::update_metadata</classname>.</para>
 | ||
| 
 | ||
| 	  <para>An instantiation of <classname>Update_Policy</classname> must define
 | ||
| 	  internally two operators:</para>
 | ||
| 	  <programlisting>
 | ||
| 	    update_metadata
 | ||
| 	    operator()();
 | ||
| 
 | ||
| 	    bool
 | ||
| 	    operator()(update_metadata &);
 | ||
| 	  </programlisting>
 | ||
| 
 | ||
| 	  <para>The first is called by the container object, when creating a
 | ||
| 	  new node, to create the node's metadata. The second is called
 | ||
| 	  by the container object, when a node is accessed (for example,
 | ||
| 	  when a find operation's key is equivalent to the key of the
 | ||
| 	  node), to determine whether to move the node to the front of
 | ||
| 	  the list.
 | ||
| 	  </para>
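 | ||
| 	  <para>The division of labor between container and policy can be sketched with a toy container. This is an illustration of the interface as described in this section, not the library's implementation: <classname>every_third_access_policy</classname> and <classname>toy_lu_set</classname> are hypothetical, and the metadata typedef follows the name <classname>update_metadata</classname> used in the text.</para>
 | ||
```cpp
#include <cstddef>
#include <list>

// Hypothetical policy: a node moves to the front on every third access.
struct every_third_access_policy
{
  typedef std::size_t update_metadata;

  // Called when a node is created, to produce its metadata.
  update_metadata
  operator()() const
  { return 0; }

  // Called when a node is accessed; returns true if it should move.
  bool
  operator()(update_metadata& count) const
  {
    if (++count < 3)
      return false;
    count = 0;
    return true;
  }
};

// Toy list-based set deriving publicly from its update policy.
template<typename Key, typename Update_Policy>
class toy_lu_set : public Update_Policy
{
  struct node
  {
    Key key;
    typename Update_Policy::update_metadata metadata;
  };

  std::list<node> nodes_;

public:
  void
  insert(const Key& key)
  {
    Update_Policy& policy = *this;
    node n = { key, policy() };      // first operator: create metadata
    nodes_.push_back(n);
  }

  bool
  find(const Key& key)
  {
    Update_Policy& policy = *this;
    for (typename std::list<node>::iterator it = nodes_.begin();
         it != nodes_.end(); ++it)
      if (it->key == key)
        {
          if (policy(it->metadata))  // second operator: move or not
            nodes_.splice(nodes_.begin(), nodes_, it);
          return true;
        }
    return false;
  }

  const Key&
  front() const
  { return nodes_.front().key; }
};
```
 | ||
| 	  <para>The container never inspects the metadata itself; it only stores it and hands it back to the policy.</para>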
 | ||
| 
 | ||
| 	  <para>The library contains two predefined implementations of
 | ||
| 	  list-update policies. The first
 | ||
| 	  is <classname>lu_counter_policy</classname>, which implements the
 | ||
| 	  counter algorithm described above. The second is
 | ||
| 	  <classname>lu_move_to_front_policy</classname>,
 | ||
| 	  which unconditionally moves an accessed element to the front of
 | ||
| 	  the list. The latter type is very useful in this library,
 | ||
| 	  since there is no need to associate metadata with each element
 | ||
| 	  (see <xref linkend="biblio.andrew04mtf"/>).
 | ||
| 	  </para>
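 | ||
| 	  <para>Selecting one of the predefined policies might look as follows. This is a sketch under the assumption that <classname>lu_counter_policy</classname> takes its threshold as a first template argument; the typedef names and the helper <function>probe</function> are illustrative.</para>
 | ||
```cpp
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/list_update_policy.hpp>
#include <functional>
#include <string>
#include <utility>

// A node moves to the front once its access counter reaches 3.
typedef __gnu_pbds::list_update<
  std::string, int, std::equal_to<std::string>,
  __gnu_pbds::lu_counter_policy<3> > counted_map;

// A node moves to the front on every access; no per-node metadata.
typedef __gnu_pbds::list_update<
  std::string, int, std::equal_to<std::string>,
  __gnu_pbds::lu_move_to_front_policy<> > mtf_map;

// Looks up key the given number of times and reports whether every
// lookup succeeded. Each successful lookup counts as an access and may
// therefore reorder the list.
template<typename Map>
bool
probe(Map& m, const std::string& key, int times)
{
  for (int i = 0; i < times; ++i)
    if (m.find(key) == m.end())
      return false;
  return true;
}
```
 | ||
| 	  <para>The two container types expose the same interface; only the reordering behavior differs.</para>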
 | ||
| 
 | ||
| 	</section>
 | ||
| 
 | ||
| 	<section xml:id="container.list.details.mapped">
 | ||
| 	  <info><title>Use in Multimaps</title></info>
 | ||
| 
 | ||
| 	  <para>In this library, there are no equivalents for the standard's
 | ||
| 	  multimaps and multisets; instead one uses an associative
 | ||
| 	  container mapping primary keys to secondary keys.</para>
 | ||
| 
 | ||
| 	  <para>List-based containers are especially useful as associative
 | ||
| 	  containers for secondary keys. In fact, they are implemented
 | ||
| 	  here expressly for this purpose.</para>
 | ||
| 
 | ||
| 	  <para>To begin with, these containers use very little per-entry
 | ||
| 	  structure memory overhead, since they can be implemented as
 | ||
| 	  singly-linked lists. (Arrays use even lower per-entry memory
 | ||
| 	  overhead, but they are less flexible in moving around entries,
 | ||
| 	  and have weaker invalidation guarantees).</para>
 | ||
| 
 | ||
| 	  <para>More importantly, though, list-based containers use very
 | ||
| 	  little per-container memory overhead. The memory overhead of an
 | ||
| 	  empty list-based container is practically that of a pointer.
 | ||
| 	  This is important for when they are used as secondary
 | ||
| 	  associative-containers in situations where the average ratio of
 | ||
| 	  secondary keys to primary keys is low (or even 1).</para>
 | ||
| 
 | ||
| 	  <para>In order to reduce the per-container memory overhead as much
 | ||
| 	  as possible, they are implemented as closely as possible to
 | ||
| 	  singly-linked lists.</para>
 | ||
| 
 | ||
| 	  <orderedlist>
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		List-based containers do not store internally the number
 | ||
| 		of values that they hold. This means that their <function>size</function>
 | ||
| 		method has linear complexity (as was permitted for <classname>std::list</classname> before C++11).
 | ||
| 		Note that finding the number of equivalent-key values in a
 | ||
| 		standard multimap also has linear complexity (because it must be
 | ||
| 		done via <function>std::distance</function> of the
 | ||
| 		multimap's <function>equal_range</function> method), but usually with
 | ||
| 		higher constants.
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 
 | ||
| 	    <listitem>
 | ||
| 	      <para>
 | ||
| 		Most associative-container objects each hold a policy
 | ||
| 		object (a hash-based container object holds a
 | ||
| 		hash functor). List-based containers, conversely, only have
 | ||
| 		class-wide policy objects.
 | ||
| 	      </para>
 | ||
| 	    </listitem>
 | ||
| 	  </orderedlist>
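 | ||
| 	  <para>A multimap-like arrangement could be sketched as below. For brevity the primary container here is a <classname>std::map</classname>, which is an assumption of this example rather than a recommendation; <classname>secondary_set</classname>, <classname>multimap_like</classname>, and <function>count_for</function> are illustrative names.</para>
 | ||
```cpp
#include <ext/pb_ds/assoc_container.hpp>
#include <cstddef>
#include <map>
#include <string>

// Secondary keys live in a list-based set; null_type means that no
// mapped data is stored alongside each key.
typedef __gnu_pbds::list_update<int, __gnu_pbds::null_type> secondary_set;

// Each primary key owns its own, usually tiny, set of secondary keys.
typedef std::map<std::string, secondary_set> multimap_like;

// Number of secondary keys stored under a primary key. The call to
// size() walks the list, so it is linear in the number of secondary keys.
std::size_t
count_for(const multimap_like& m, const std::string& primary)
{
  multimap_like::const_iterator it = m.find(primary);
  if (it == m.end())
    return 0;
  return it->second.size();
}
```
 | ||
| 	  <para>Because each secondary set is a separate container, inserting a secondary key that is already present under the same primary key leaves the set unchanged.</para>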
 | ||
| 
 | ||
| 
 | ||
| 	</section>
 | ||
| 
 | ||
|       </section> <!-- details -->
 | ||
| 
 | ||
|     </section> <!-- list -->
 | ||
| 
 | ||
| 
 | ||
|     <!-- priority_queue -->
 | ||
|     <section xml:id="pbds.design.container.priority_queue">
 | ||
|       <info><title>Priority Queue</title></info>
 | ||
| 
 | ||
|       <section xml:id="container.priority_queue.interface">
 | ||
| 	<info><title>Interface</title></info>
 | ||
| 
 | ||
| 	<para>The priority queue container has the following
 | ||
| 	declaration:
 | ||
| 	</para>
 | ||
| 	<programlisting>
 | ||
| 	  template<typename  Value_Type,
 | ||
| 	  typename  Cmp_Fn = std::less<Value_Type>,
 | ||
| 	  typename  Tag = pairing_heap_tag,
 | ||
| 	  typename  Allocator = std::allocator<char > >
 | ||
| 	  class priority_queue;
 | ||
| 	</programlisting>
 | ||
| 
 | ||
| 	<para>The parameters have the following meaning:</para>
 | ||
| 
 | ||
| 	<orderedlist>
 | ||
| 	  <listitem><para><classname>Value_Type</classname> is the value type.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para><classname>Cmp_Fn</classname> is a value comparison functor.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para><classname>Tag</classname> specifies which underlying data structure
 | ||
| 	  to use.</para></listitem>
 | ||
| 
 | ||
| 	  <listitem><para><classname>Allocator</classname> is an allocator
 | ||
| 	  type.</para></listitem>
 | ||
| 	</orderedlist>
 | ||
| 
 | ||
| 	<para>The <classname>Tag</classname> parameter specifies which underlying
 | ||
| 	data structure to use. Instantiating it by <classname>pairing_heap_tag</classname>, <classname>binary_heap_tag</classname>,
 | ||
| 	<classname>binomial_heap_tag</classname>,
 | ||
| 	<classname>rc_binomial_heap_tag</classname>,
 | ||
| 	or <classname>thin_heap_tag</classname>,
 | ||
| 	specifies, respectively, 
 | ||
| 	an underlying pairing heap (<xref linkend="biblio.fredman86pairing"/>),
 | ||
| 	binary heap (<xref linkend="biblio.clrs2001"/>),
 | ||
| 	binomial heap (<xref linkend="biblio.clrs2001"/>),
 | ||
| 	a binomial heap with a redundant binary counter (<xref linkend="biblio.maverik_lowerbounds"/>),
 | ||
| 	or a thin heap (<xref linkend="biblio.kt99fat_heaps"/>).
 | ||
| 	</para>
 | ||
| 
 | ||
| 	<para>
 | ||
| 	  As mentioned in the tutorial,
 | ||
| 	  <classname>__gnu_pbds::priority_queue</classname> shares most of the
 | ||
| 	  same interface with <classname>std::priority_queue</classname>.
 | ||
| 	  E.g. if <varname>q</varname> is a priority queue of type
 | ||
| 	  <classname>Q</classname>, then <function>q.top()</function> will
 | ||
| 	  return the "largest" value in the container (according to
 | ||
| 	  <classname>typename
 | ||
| 	  Q::cmp_fn</classname>). <classname>__gnu_pbds::priority_queue</classname>
 | ||
| 	  has a larger (and very slightly different) interface than
 | ||
| 	  <classname>std::priority_queue</classname>, however, since typically
 | ||
| 	  <function>push</function> and <function>pop</function> are deemed
 | ||
| 	insufficient for manipulating priority-queues. </para>
 | ||
| 
 | ||
| 	<para>Different settings require different priority-queue
 | ||
| 	implementations, which are described later; the section on traits
 | ||
| 	discusses ways to differentiate between the traits of the
 | ||
| 	different implementations.</para>
 | ||
| 
 | ||
| 
 | ||
|       </section>
 | ||
| 
 | ||
|       <section xml:id="container.priority_queue.details">
 | ||
| 	<info><title>Details</title></info>
 | ||
| 
 | ||
| 	<section xml:id="container.priority_queue.details.iterators">
 | ||
| 	  <info><title>Iterators</title></info>
 | ||
| 
 | ||
| 	  <para>There are many different underlying-data structures for
 | ||
| 	  implementing priority queues. Unfortunately, most such
 | ||
| 	  structures are oriented towards making <function>push</function> and
 | ||
| 	  <function>top</function> efficient, and consequently don't allow efficient
 | ||
| 	  access of other elements: for instance, they cannot support an efficient
 | ||
| 	  <function>find</function> method. In the use case where it
 | ||
| 	  is important to both access and "do something with" an
 | ||
| 	  arbitrary value, one would be out of luck. For example, many graph algorithms require
 | ||
| 	  modifying a value (typically increasing it in the sense of the
 | ||
| 	  priority queue's comparison functor).</para>
 | ||
| 
 | ||
| 	  <para>In order to access and manipulate an arbitrary value in a
 | ||
| 	  priority queue, one needs to reference the internals of the
 | ||
| 	  priority queue from some form of an associative container -
 | ||
| 	  this is unavoidable. Of course, in order to maintain the
 | ||
| 	  encapsulation of the priority queue, this needs to be done in a
 | ||
| 	  way that minimizes exposure to implementation internals.</para>
 | ||
| 
 | ||
| 	  <para>In this library the priority queue's <function>push</function>
 | ||
| 	  method returns an iterator which, if valid, can be used for subsequent <function>modify</function> and
 | ||
| 	  <function>erase</function> operations. This both preserves the priority
 | ||
| 	  queue's encapsulation, and allows accessing arbitrary values (since the
 | ||
| 	  returned iterators from the <function>push</function> operation can be
 | ||
| 	  stored in some form of associative container).</para>
 | ||
| 
 | ||
| 	  <para>Priority queues' iterators present a problem regarding their
 | ||
| 	  invalidation guarantees. One assumes that calling
 | ||
| 	  <function>operator++</function> on an iterator will associate it
 | ||
| 	  with the "next" value. Priority-queues are
 | ||
| 	  self-organizing: each operation changes what the "next" value
 | ||
| 	  means. Consequently, it does not make sense that <function>push</function>
 | ||
| 	  will return an iterator that can be incremented - this can have
 | ||
| 	  no possible use. Also, as in the case of hash-based containers,
 | ||
| 	  it is awkward to define if a subsequent <function>push</function> operation
 | ||
| 	  invalidates a prior returned iterator: it invalidates it in the
 | ||
| 	  sense that its "next" value is not related to what it
 | ||
| 	  previously considered to be its "next" value. However, it might not
 | ||
| 	  invalidate it, in the sense that it can be
 | ||
| 	  de-referenced and used for <function>modify</function> and <function>erase</function>
 | ||
| 	  operations.</para>
 | ||
| 
 | ||
| 	  <para>Similarly to the case of the other unordered associative
 | ||
| 	  containers, this library uses a distinction between
 | ||
| 	  point-type and range-type iterators. A priority queue's <classname>iterator</classname> can always be
 | ||
| 	  converted to a <classname>point_iterator</classname>, and a
 | ||
| 	  <classname>const_iterator</classname> can always be converted to a
 | ||
| 	  <classname>point_const_iterator</classname>.</para>
 | ||
| 
 | ||
| 	  <para>The following snippet demonstrates manipulating an arbitrary
 | ||
| 	  value:</para>
 | ||
| 	  <programlisting>
 | ||
| 	    // A priority queue of integers.
 | ||
| 	    priority_queue<int > p;
 | ||
| 
 | ||
| 	    // Insert some values into the priority queue.
 | ||
| 	    priority_queue<int >::point_iterator it = p.push(0);
 | ||
| 
 | ||
| 	    p.push(1);
 | ||
| 	    p.push(2);
 | ||
| 
 | ||
| 	    // Now modify a value.
 | ||
| 	    p.modify(it, 3);
 | ||
| 
 | ||
| 	    assert(p.top() == 3);
 | ||
| 	  </programlisting>
 | ||
| 
 | ||
| 	  
 | ||
| 	  <para>It should be noted that an alternative design could embed an
 | ||
| 	  associative container in a priority queue. Could, but most
 | ||
| 	  probably should not. To begin with, it should be noted that one
 | ||
| 	  could always encapsulate a priority queue and an associative
 | ||
| 	  container mapping values to priority queue iterators with no
 | ||
| 	  performance loss. One cannot, however, "un-encapsulate" a priority
 | ||
| 	  queue embedding an associative container, which might lead to
 | ||
| 	  performance loss. Assume that one needs to associate each value
 | ||
| 	  with some data unrelated to priority queues. Then, using
 | ||
| 	  this library's design, one could use an
 | ||
| 	  associative container mapping each value to a pair consisting of
 | ||
| 	  this data and a priority queue's iterator. The embedded
 | ||
| 	  method would need two associative containers. Similar
 | ||
| 	  problems might arise in cases where a value can reside
 | ||
| 	  simultaneously in many priority queues.</para>
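 | ||
| 	  <para>The external cross-referencing described above might be sketched as follows, assuming a <classname>std::map</classname> from names to point-type iterators. The class <classname>indexed_queue</classname> and its members are hypothetical names chosen for this example.</para>
 | ||
```cpp
#include <ext/pb_ds/priority_queue.hpp>
#include <cassert>
#include <map>
#include <string>
#include <utility>

typedef std::pair<int, std::string> entry;          // (priority, name)
typedef __gnu_pbds::priority_queue<entry> queue_t;  // default Tag

// Wraps a priority queue together with a map from names to the
// point-type iterators returned by push.
class indexed_queue
{
  typedef std::map<std::string, queue_t::point_iterator> index_t;

  queue_t heap_;
  index_t where_;

public:
  // Inserts name with the given priority, or changes its priority if
  // the name is already queued.
  void
  set_priority(const std::string& name, int priority)
  {
    index_t::iterator pos = where_.find(name);
    if (pos == where_.end())
      where_.insert(std::make_pair(name, heap_.push(entry(priority, name))));
    else
      heap_.modify(pos->second, entry(priority, name));
  }

  const std::string&
  top_name() const
  { return heap_.top().second; }

  void
  pop()
  {
    assert(!heap_.empty());
    where_.erase(heap_.top().second);   // drop the index entry first
    heap_.pop();
  }

  bool
  empty() const
  { return heap_.empty(); }
};
```
 | ||
| 	  <para>Any additional per-name data could be stored in the same map entry next to the iterator, so a single associative container suffices.</para>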
 | ||
| 
 | ||
| 	</section>
 | ||
| 
 | ||
| 
 | ||
| 	<section xml:id="container.priority_queue.details.d">
 | ||
| 	  <info><title>Underlying Data Structure</title></info>
 | ||
| 
 | ||
| 	  <para>There are three main implementations of priority queues: the
 | ||
| 	  first employs a binary heap, typically one which uses a
 | ||
| 	  sequence; the second uses a tree (or forest of trees), which is
 | ||
| 	  typically less structured than an associative container's tree;
 | ||
| 	  the third simply uses an associative container. These are
 | ||
| 	  shown in the graphic below, in labels A1 and A2, label B, and label C, respectively.</para>
 | ||
| 
 | ||
| 	  <figure>
 | ||
| 	    <title>Underlying Priority-Queue Data-Structures.</title>
 | ||
| 	    <mediaobject>
 | ||
| 	      <imageobject>
 | ||
| 		<imagedata align="center" format="PNG" scale="100"
 | ||
| 			   fileref="../images/pbds_priority_queue_different_underlying_dss.png"/>
 | ||
| 	      </imageobject>
 | ||
| 	      <textobject>
 | ||
| 		<phrase>Underlying Priority-Queue Data-Structures.</phrase>
 | ||
| 	      </textobject>
 | ||
| 	    </mediaobject>
 | ||
| 	  </figure>
 | ||
| 
 | ||
| 	  <para>Roughly speaking, any value that is both pushed and popped
 | ||
| 	  from a priority queue must incur a logarithmic expense (in the
 | ||
| 	  amortized sense). Any priority queue implementation that would
 | ||
| 	  avoid this, would violate known bounds on comparison-based
 | ||
| 	  sorting (see <xref linkend="biblio.clrs2001"/> and <xref linkend="biblio.brodal96priority"/>).
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <para>Most implementations do
 | ||
| 	  not differ in the asymptotic amortized complexity of
 | ||
| 	  <function>push</function> and <function>pop</function> operations, but they differ in
 | ||
| 	  the constants involved, in the complexity of other operations
 | ||
| 	  (e.g., <function>modify</function>), and in the worst-case
 | ||
| 	  complexity of single operations. In general, the more
 | ||
| 	  "structured" an implementation (i.e., the more internal
 | ||
| 	  invariants it possesses) - the higher its amortized complexity
 | ||
| 	  of <function>push</function> and <function>pop</function> operations.</para>
 | ||
| 
 | ||
| 	  <para>This library implements different algorithms using a
 | ||
| 	  single class: <classname>priority_queue</classname>.
 | ||
| 	  Instantiating the <classname>Tag</classname> template parameter, "selects"
 | ||
| 	  the implementation:</para>
 | ||
| 
 | ||
| 	  <orderedlist>
 | ||
| 	    <listitem><para>
 | ||
| 	      Instantiating <classname>Tag = binary_heap_tag</classname> creates
 | ||
| 	      a binary heap of the form represented by labels A1 and A2 in the graphic above. The former is internally
 | ||
| 	      selected by <classname>priority_queue</classname>
 | ||
| 	      if <classname>Value_Type</classname> is instantiated by a primitive type
 | ||
| 	      (e.g., an <type>int</type>); the latter is
 | ||
| 	      internally selected for all other types (e.g.,
 | ||
| 	      <classname>std::string</classname>). This implementation is relatively
 | ||
| 	      unstructured, and so has good <function>push</function> and <function>pop</function>
 | ||
| 	      performance; it is the "best-in-kind" for primitive
 | ||
| 	      types, e.g., <type>int</type>s. Conversely, it has
 | ||
| 	      high worst-case complexity, and can support only linear-time
 | ||
| 	    <function>modify</function> and <function>erase</function> operations.</para></listitem>
 | ||
| 
 | ||
| 	    <listitem><para>Instantiating <classname>Tag =
 | ||
| 	    pairing_heap_tag</classname> creates a pairing heap of the form
 | ||
| 	    represented by label B in the graphic above. This
 | ||
| 	    implementation too is relatively unstructured, and so has good
 | ||
| 	    <function>push</function> and <function>pop</function>
 | ||
| 	    performance; it is the "best-in-kind" for non-primitive types,
 | ||
| 	    e.g., <classname>std::string</classname>s. It also has very good
 | ||
| 	    worst-case <function>push</function> and
 | ||
| 	    <function>join</function> performance (O(1)), but has high
 | ||
| 	    worst-case <function>pop</function>
 | ||
| 	    complexity.</para></listitem>
 | ||
| 
 | ||
| 	    <listitem><para>Instantiating <classname>Tag =
 | ||
| 	    binomial_heap_tag</classname> creates a binomial heap of the
 | ||
| 	    form represented by label B in the graphic above. This
 | ||
| 	    implementation is more structured than a pairing heap, and so
 | ||
| 	    has worse <function>push</function> and <function>pop</function>
 | ||
| 	    performance. Conversely, it has sub-linear worst-case bounds for
 | ||
| 	    operations such as <function>pop</function>, and so it might be preferred in
 | ||
| 	    cases where responsiveness is important.</para></listitem>
 | ||
| 
 | ||
| 	    <listitem><para>Instantiating <classname>Tag =
 | ||
| 	    rc_binomial_heap_tag</classname> creates a binomial heap of the
 | ||
| 	    form represented in label B above, accompanied by a redundant
 | ||
| 	    counter which governs the trees. This implementation is
 | ||
| 	    therefore more structured than a binomial heap, and so has worse
 | ||
| 	    <function>push</function> and <function>pop</function>
 | ||
| 	    performance. Conversely, it guarantees O(1)
 | ||
| 	    <function>push</function> complexity, and so it might be
 | ||
| 	    preferred in cases where the responsiveness of a binomial heap
 | ||
| 	    is insufficient.</para></listitem>
 | ||
| 
 | ||
| 	    <listitem><para>Instantiating <classname>Tag =
 | ||
| 	    thin_heap_tag</classname> creates a thin heap of the form
 | ||
| 	    represented by the label B in the graphic above. This
 | ||
| 	    implementation too is more structured than a pairing heap, and
 | ||
| 	    so has worse <function>push</function> and
 | ||
| 	    <function>pop</function> performance. Conversely, it has better
 | ||
| 	    worst-case complexities than a Fibonacci heap, with identical
 | ||
| 	    amortized complexities, and so might be more appropriate for some graph
 | ||
| 	    algorithms.</para></listitem>
 | ||
| 	  </orderedlist>
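 | ||
| 	  <para>Because only the <classname>Tag</classname> argument changes, code can be written once and instantiated for every underlying structure. The following is a sketch; <function>drain_sorted</function> is a hypothetical helper that simply pushes all values and pops them again.</para>
 | ||
```cpp
#include <ext/pb_ds/priority_queue.hpp>
#include <cstddef>
#include <functional>
#include <vector>

// Pushes every input value into a priority queue using the underlying
// data structure selected by Tag, then pops them all. With std::less
// the values come out in non-increasing order.
template<typename Tag>
std::vector<int>
drain_sorted(const std::vector<int>& in)
{
  __gnu_pbds::priority_queue<int, std::less<int>, Tag> q;
  for (std::size_t i = 0; i < in.size(); ++i)
    q.push(in[i]);

  std::vector<int> out;
  while (!q.empty())
    {
      out.push_back(q.top());
      q.pop();
    }
  return out;
}
```
 | ||
| 	  <para>For example, <function>drain_sorted<__gnu_pbds::thin_heap_tag></function> and <function>drain_sorted<__gnu_pbds::binary_heap_tag></function> produce the same sequence; they differ only in the cost profile of the operations involved.</para>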
 | ||
| 
 | ||
| 	  <para>Of course, one can use any order-preserving associative
 | ||
| 	  container as a priority queue, as represented by label C in the graphic above, possibly by creating an adapter class
 | ||
| 	  over the associative container (much as 
 | ||
| 	  <classname>std::priority_queue</classname> can adapt <classname>std::vector</classname>).
 | ||
| 	  This has the advantage that no cross-referencing is necessary
 | ||
| 	  at all; the priority queue itself is an associative container.
 | ||
| 	  Most associative containers are too structured to compete with
 | ||
| 	  priority queues in terms of <function>push</function> and <function>pop</function>
 | ||
| 	  performance.</para>
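 | ||
| 	  <para>Such an adapter might be sketched over <classname>std::multiset</classname>, using only the standard library. The class name <classname>set_backed_queue</classname> is hypothetical; the point is that erasing an arbitrary value needs no stored iterators.</para>
 | ||
```cpp
#include <cassert>
#include <cstddef>
#include <set>

// A priority-queue interface over an order-preserving associative
// container. The largest value is the last element of the multiset.
template<typename T>
class set_backed_queue
{
  std::multiset<T> values_;

public:
  void
  push(const T& v)
  { values_.insert(v); }

  const T&
  top() const
  {
    assert(!values_.empty());
    return *values_.rbegin();
  }

  void
  pop()
  {
    assert(!values_.empty());
    typename std::multiset<T>::iterator last = values_.end();
    --last;
    values_.erase(last);
  }

  // Erases one occurrence of v, found by key in logarithmic time.
  bool
  erase(const T& v)
  {
    typename std::multiset<T>::iterator it = values_.find(v);
    if (it == values_.end())
      return false;
    values_.erase(it);
    return true;
  }

  bool
  empty() const
  { return values_.empty(); }

  std::size_t
  size() const
  { return values_.size(); }
};
```
 | ||
| 	  <para>The price is that every <function>push</function> and <function>pop</function> maintains a fully ordered tree, which is more work than a heap needs to do.</para>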
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 	</section>
 | ||
| 
 | ||
| 	<section xml:id="container.priority_queue.details.traits">
 | ||
| 	  <info><title>Traits</title></info>
 | ||
| 
 | ||
| 	  <para>It would be nice if all priority queues could
 | ||
| 	  share exactly the same behavior regardless of implementation. Sadly, this is not possible. One example is the join operation: joining
 | ||
| 	  two binary heaps might throw an exception (without corrupting
 | ||
| 	  either of the heaps on which it operates), but joining two pairing
 | ||
| 	  heaps is exception free.</para>
 | ||
| 
 | ||
| 	  <para>Tags and traits are very useful for manipulating generic
 | ||
| 	  types. <classname>__gnu_pbds::priority_queue</classname>
 | ||
| 	  publicly defines <classname>container_category</classname> as one of the tags. Given any
 | ||
| 	  container <classname>Cntnr</classname>, the tag of the underlying
 | ||
| 	  data structure can be found via <classname>typename 
 | ||
| 	  Cntnr::container_category</classname>; this is one of the possible tags shown in the graphic below.
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <figure>
 | ||
| 	    <title>Priority-Queue Data-Structure Tags.</title>
 | ||
| 	    <mediaobject>
 | ||
| 	      <imageobject>
 | ||
| 		<imagedata align="center" format="PNG" scale="100"
 | ||
|                  fileref="../images/pbds_priority_queue_tag_hierarchy.png"/>
 | ||
| 	      </imageobject>
 | ||
| 	      <textobject>
 | ||
| 		<phrase>Priority-Queue Data-Structure Tags.</phrase>
 | ||
| 	      </textobject>
 | ||
| 	    </mediaobject>
 | ||
| 	  </figure>
 | ||
| 
 | ||
| 
 | ||
| 	  <para>Additionally, a traits mechanism can be used to query a
 | ||
| 	  container type for its attributes. Given any container
 | ||
| 	  <classname>Cntnr</classname>, <programlisting>__gnu_pbds::container_traits<Cntnr></programlisting>
 | ||
| 	  is a traits class identifying the properties of the
 | ||
| 	  container.</para>
 | ||
| 
 | ||
| 	  <para>To find if a container might throw if two of its objects are
 | ||
| 	  joined, one can use 
 | ||
| 	  <programlisting>
 | ||
| 	    container_traits<Cntnr>::split_join_can_throw
 | ||
| 	  </programlisting>
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    Different priority-queue implementations have different invalidation guarantees. This is
 | ||
| 	    especially important, since there is no way to access an arbitrary
 | ||
| 	    value of priority queues except for iterators. Similarly to
 | ||
| 	    associative containers, one can use
 | ||
| 	    <programlisting>
 | ||
| 	      container_traits<Cntnr>::invalidation_guarantee
 | ||
| 	    </programlisting>
 | ||
| 	  to get the invalidation guarantee type of a priority queue.</para>
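 | ||
| 	  <para>Both queries can be wrapped in small helpers, sketched here under the assumption of a C++11 compiler (for <classname>std::is_base_of</classname>). The function names <function>join_may_throw</function> and <function>point_iterators_stay_valid</function> are illustrative.</para>
 | ||
```cpp
#include <ext/pb_ds/priority_queue.hpp>
#include <ext/pb_ds/tag_and_trait.hpp>
#include <functional>
#include <type_traits>

// True if joining (or splitting) two objects of type Cntnr might throw.
template<typename Cntnr>
bool
join_may_throw()
{
  typedef __gnu_pbds::container_traits<Cntnr> traits;
  return traits::split_join_can_throw != 0;
}

// True if Cntnr guarantees at least point-type invalidation, that is,
// if point-type iterators remain usable after the container is modified.
// The guarantee tags form an inheritance chain, so a stronger guarantee
// also satisfies this test.
template<typename Cntnr>
bool
point_iterators_stay_valid()
{
  typedef typename __gnu_pbds::container_traits<Cntnr>::invalidation_guarantee
    guarantee;
  return std::is_base_of<__gnu_pbds::point_invalidation_guarantee,
                         guarantee>::value;
}
```
 | ||
| 	  <para>Generic code that stores iterators returned by <function>push</function> could use the second helper to reject containers for which those iterators might become invalid.</para>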
 | ||
| 
 | ||
| 	  <para>It is easy to understand from the graphic of the underlying data structures what <classname>container_traits<Cntnr>::invalidation_guarantee</classname>
 | ||
| 	  will be for different implementations. All implementations of
 | ||
| 	  the type represented by label B have <classname>point_invalidation_guarantee</classname>:
 | ||
| 	  the container can freely internally reorganize the nodes -
 | ||
| 	  range-type iterators are invalidated, but point-type iterators
 | ||
| 	  are always valid. Implementations of the type represented by labels A1 and A2 have <classname>basic_invalidation_guarantee</classname>:
 | ||
| 	  the container can freely internally reallocate the array - both
 | ||
| 	  point-type and range-type iterators might be invalidated.</para>
 | ||
| 
 | ||
| 	  <para>
 | ||
| 	    This has major implications, and constitutes a good reason to avoid
 | ||
| 	    using binary heaps. A binary heap can perform <function>modify</function>
 | ||
| 	    or <function>erase</function> efficiently given a valid point-type
 | ||
| 	    iterator. However, in order to supply it with a valid point-type
 | ||
| 	    iterator, one needs to iterate (linearly) over all
 | ||
| 	    values, then supply the relevant iterator (recall that a
 | ||
| 	    range-type iterator can always be converted to a point-type
 | ||
| 	    iterator). This means that if the number of <function>modify</function> or
 | ||
| 	    <function>erase</function> operations is non-negligible (say
 | ||
| 	    super-logarithmic in the total sequence of operations) - binary
 | ||
| 	    heaps will perform badly.
 | ||
| 	  </para>
 | ||
| 
 | ||
| 	</section>
 | ||
| 
 | ||
|       </section> <!-- details -->
 | ||
| 
 | ||
|     </section> <!-- priority_queue -->
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
|   </section> <!-- container -->
 | ||
| 
 | ||
|   </section> <!-- design -->
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
|   <!-- S04: Test -->
 | ||
|   <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" parse="xml"
 | ||
| 	      href="test_policy_data_structures.xml">
 | ||
|   </xi:include>
 | ||
| 
 | ||
|   <!-- S05: Reference/Acknowledgments -->
 | ||
|   <section xml:id="pbds.ack">
 | ||
|     <info><title>Acknowledgments</title></info>
 | ||
|     <?dbhtml filename="policy_data_structures_ack.html"?>
 | ||
| 
 | ||
|     <para>
 | ||
|       Written by Ami Tavory and Vladimir Dreizin (IBM Haifa Research
 | ||
|       Laboratories), and Benjamin Kosnik (Red Hat).
 | ||
|     </para>
 | ||
| 
 | ||
|     <para>
 | ||
|       This library was partially written at IBM's Haifa Research Labs.
 | ||
|       It is based heavily on policy-based design and uses many useful
 | ||
|       techniques from Modern C++ Design: Generic Programming and Design
 | ||
|       Patterns Applied by Andrei Alexandrescu.
 | ||
|     </para>
 | ||
| 
 | ||
|     <para>
 | ||
|       Two ideas are borrowed from the SGI-STL implementation:
 | ||
|     </para>
 | ||
| 
 | ||
|     <orderedlist>
 | ||
|       <listitem>
 | ||
| 	<para>
 | ||
| 	  The prime-based resize policies use a list of primes taken from
 | ||
| 	  the SGI-STL implementation.
 | ||
| 	</para>
 | ||
|       </listitem>
 | ||
| 
 | ||
|       <listitem>
 | ||
| 	<para>
 | ||
| 	  The red-black trees contain both a root node and a header node
 | ||
| 	  (containing metadata), connected in a way that forward and
 | ||
| 	  reverse iteration can be performed efficiently.
 | ||
| 	</para>
 | ||
|       </listitem>
 | ||
|     </orderedlist>
 | ||
| 
 | ||
|     <para>
 | ||
|       Some test utilities borrow ideas from
 | ||
|       <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.boost.org/doc/libs/release/libs/timer/index.html">boost::timer</link>.
 | ||
|     </para>
 | ||
| 
 | ||
|     <para>
 | ||
|       We would like to thank Scott Meyers for useful comments (without
 | ||
|       attributing to him any flaws in the design or implementation of the
 | ||
|       library).
 | ||
|     </para>
 | ||
|     <para>We would like to thank Matt Austern for the suggestion to
 | ||
|     include tries.</para>
 | ||
|   </section>
 | ||
| 
 | ||
|   <!-- S06: Biblio -->
 | ||
| <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" parse="xml"
 | ||
| 	    href="policy_data_structures_biblio.xml">
 | ||
| </xi:include>
 | ||
| 
 | ||
| </chapter>
 |