mirror of git://gcc.gnu.org/git/gcc.git
<chapter xmlns="http://docbook.org/ns/docbook" version="5.0"
         xml:id="std.io" xreflabel="Input and Output">
<?dbhtml filename="io.html"?>

<info><title>
  Input and Output
  <indexterm><primary>Input and Output</primary></indexterm>
</title>
  <keywordset>
    <keyword>ISO C++</keyword>
    <keyword>library</keyword>
  </keywordset>
</info>

<!-- Sect1 01 : Iostream Objects -->
<section xml:id="std.io.objects" xreflabel="IO Objects"><info><title>Iostream Objects</title></info>
<?dbhtml filename="iostream_objects.html"?>

   <para>To minimize the time you have to wait on the compiler, it's good to
      only include the headers you really need.  Many people simply include
      <filename class="headerfile"><iostream></filename> when they don't
      need to -- and that can <emphasis>penalize your runtime as well.</emphasis>
      Here are some tips on which header to use
      for which situations, starting with the simplest.
   </para>
   <para><emphasis><filename class="headerfile"><iosfwd></filename></emphasis>
      should be included whenever you simply need the <emphasis>name</emphasis>
      of an I/O-related class, such as "<classname>ofstream</classname>" or
      "<classname>basic_streambuf</classname>".
      As the name implies, these are forward declarations.
      (A word to all you fellow old school programmers:
      trying to forward declare classes like "<code>class istream;</code>"
      won't work.
      Look in the <filename class="headerfile"><iosfwd></filename> header
      if you'd like to know why.)  For example,
   </para>
   <programlisting>
    #include <iosfwd>

    class MyClass
    {
        ....
        std::ifstream&   input_file;
    };

    extern std::ostream& operator<< (std::ostream&, MyClass&);
   </programlisting>
   <para><emphasis><filename class="headerfile"><ios></filename></emphasis>
      declares the base classes for the entire I/O stream hierarchy,
      <classname>std::ios_base</classname> and <classname>std::basic_ios<charT></classname>,
      the counting types <type>std::streamoff</type> and <type>std::streamsize</type>,
      the file positioning type <type>std::fpos</type>,
      and the various manipulators like <function>std::hex</function>,
      <function>std::fixed</function>, <function>std::noshowbase</function>,
      and so forth.
   </para>
   <para>The <classname>ios_base</classname> class is what holds the format
      flags, the state flags, and the functions which change them
      (<function>setf()</function>, <function>width()</function>,
      <function>precision()</function>, etc).
      You can also store extra data and register callback functions
      through <classname>ios_base</classname>, but that has been historically
      underused.  Anything
      which doesn't depend on the type of characters stored is consolidated
      here.
   </para>
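   <para>As a rough illustration of that last point, here is a minimal sketch
      of a helper that configures formatting purely through an
      <classname>ios_base</classname> reference, so it works for any character
      type.  The names <code>use_hex_field</code> and <code>hex_field</code>
      are made up for this example.
   </para>

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: touches only character-independent state, so an
// ios_base reference is all it needs.
void use_hex_field(std::ios_base& state, std::streamsize width)
{
    state.setf(std::ios_base::hex, std::ios_base::basefield);  // base 16
    state.setf(std::ios_base::showbase);                       // "0x" prefix
    state.width(width);            // applies to the next insertion only
}

// Hypothetical helper: formats one integer in an 8-character hex field.
std::string hex_field(int value)
{
    std::ostringstream os;
    use_hex_field(os, 8);
    os << value;
    return os.str();
}
```

   <para>Calling <code>hex_field(31)</code> should give <code>0x1f</code>
      padded on the left to eight characters.
   </para>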
   <para>The class template <classname>basic_ios</classname> is the highest
      class template in the
      hierarchy; it is the first one depending on the character type, and
      holds all general state associated with that type:  the pointer to the
      polymorphic stream buffer, the facet information, etc.
   </para>
   <para><emphasis><filename class="headerfile"><streambuf></filename></emphasis>
      declares the class template <classname>basic_streambuf</classname>, and
      two standard instantiations, <type>streambuf</type> and
      <type>wstreambuf</type>.  If you need to work with the vastly useful and
      capable stream buffer classes, e.g., to create a new form of storage
      transport, this header is the one to include.
   </para>
   <para><emphasis><filename class="headerfile"><istream></filename></emphasis>
       and <emphasis><filename class="headerfile"><ostream></filename></emphasis>
       are the headers to include when you are using the overloaded
      <code>>></code> and <code><<</code> operators,
      or any of the other abstract stream formatting functions.
      For example,
   </para>
   <programlisting>
    #include <ostream>    // the inserter below only needs std::ostream

    std::ostream& operator<< (std::ostream& os, MyClass& c)
    {
       return os << c.data1() << c.data2();
    }
   </programlisting>
   <para>The <type>std::istream</type> and <type>std::ostream</type> classes
      are the abstract parents of
      the various concrete implementations.  If you are only using the
      interfaces, then you only need to use the appropriate interface header.
   </para>
   <para><emphasis><filename class="headerfile"><iomanip></filename></emphasis>
      provides "extractors and inserters that alter information maintained by
      class <classname>ios_base</classname> and its derived classes,"
      such as <function>std::setprecision</function> and
      <function>std::setw</function>.  If you need
      to write expressions like <code>os << setw(3);</code> or
      <code>is >> setbase(8);</code>, you must include
      <filename class="headerfile"><iomanip></filename>.
   </para>
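   <para>A small sketch of both directions, using string streams so that it
      needs no files.  The function names <code>format_price</code> and
      <code>parse_octal</code> are invented for the example.
   </para>

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: inserters from <iomanip> shape the output.
std::string format_price(double amount)
{
    std::ostringstream os;
    os << std::fixed << std::setprecision(2)  // two digits after the point
       << std::setw(8) << amount;             // right-aligned, 8 characters
    return os.str();
}

// Hypothetical helper: an extractor from <iomanip> changes how input is read.
int parse_octal(const std::string& text)
{
    std::istringstream is(text);
    int value = 0;
    is >> std::setbase(8) >> value;           // read the digits as base 8
    return value;
}
```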
   <para><emphasis><filename class="headerfile"><sstream></filename></emphasis>
      and <emphasis><filename class="headerfile"><fstream></filename></emphasis>
      declare the six stringstream and fstream classes.  As they are the
      standard concrete descendants of <type>istream</type> and <type>ostream</type>,
      you will already know about them.
   </para>
   <para>Finally, <emphasis><filename class="headerfile"><iostream></filename></emphasis>
      provides the eight standard global objects
      (<code>cin</code>, <code>cout</code>, etc).  To do this correctly, this
      header also provides the contents of the
      <filename class="headerfile"><istream></filename> and
      <filename class="headerfile"><ostream></filename>
      headers, but nothing else.  The contents of this header look like:
   </para>
   <programlisting>
    #include <ostream>
    #include <istream>

    namespace std
    {
        extern istream cin;
        extern ostream cout;
        ....

        // this is explained below
        <emphasis>static ios_base::Init __foo;</emphasis>    // not its real name
    }
   </programlisting>
   <para>Now, the runtime penalty mentioned previously:  the global objects
      must be initialized before any of your own code uses them; this is
      guaranteed by the standard.  Like any other global object, they must
      be initialized once and only once.  This is typically done with a
      construct like the one above, and the nested class
      <classname>ios_base::Init</classname> is
      specified in the standard for just this reason.
   </para>
   <para>How does it work?  Because the header is included before any of your
      code, the <emphasis>__foo</emphasis> object is constructed before any of
      your objects.  (Within a single translation unit, global objects are
      built in the order in which they are defined, and destroyed in reverse
      order.)  The first time the
      constructor runs, the eight stream objects are set up.
   </para>
   <para>The <code>static</code> keyword means that each object file compiled
      from a source file containing
      <filename class="headerfile"><iostream></filename> will have its own
      private copy of <emphasis>__foo</emphasis>.  There is no specified order
      of construction across object files (it's one of those pesky NP complete
      problems that make life so interesting), so one copy in each object
      file means that the stream objects are guaranteed to be set up before
      any of your code which uses them could run, thereby meeting the
      requirements of the standard.
   </para>
   <para>The penalty, of course, is that after the first copy of
      <emphasis>__foo</emphasis> is constructed, all the others are just wasted
      processor time.  The time spent is merely for an increment-and-test
      inside a function call, but over several dozen or hundreds of object
      files, that time can add up.  (It's not in a tight loop, either.)
   </para>
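   <para>The increment-and-test can be pictured with a toy version of the
      idiom.  This is only a sketch: the <code>Init</code> struct and its
      counters below are stand-ins invented for illustration, not the
      library's actual <classname>ios_base::Init</classname>.
   </para>

```cpp
// Hypothetical stand-in for ios_base::Init; not the library's real code.
struct Init
{
    Init()
    {
        // The increment-and-test: only the first construction does real work.
        if (ref_count++ == 0)
            ++setup_count;        // stands in for setting up cin, cout, ...
    }

    ~Init()
    {
        // Only the last destruction would flush and tear the streams down.
        if (--ref_count == 0)
            ++teardown_count;
    }

    static int ref_count;
    static int setup_count;
    static int teardown_count;
};

int Init::ref_count = 0;
int Init::setup_count = 0;
int Init::teardown_count = 0;
```

   <para>However many <code>Init</code> objects are alive at once, the setup
      step runs a single time; every later constructor call pays only for the
      counter update.
   </para>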
   <para>The lesson?  Only include
      <filename class="headerfile"><iostream></filename> when you need
      to use one of
      the standard objects in that source file; you'll pay less startup
      time.  Only include the header files you need to in general; your
      compile times will go down when there's less parsing work to do.
   </para>

</section>

<!-- Sect1 02 : Stream Buffers -->
<section xml:id="std.io.streambufs" xreflabel="Stream Buffers"><info><title>Stream Buffers</title></info>
<?dbhtml filename="streambufs.html"?>

  <section xml:id="io.streambuf.derived" xreflabel="Derived streambuf Classes"><info><title>Derived streambuf Classes</title></info>

   <para>Creating your own stream buffers for I/O can be remarkably easy.
      If you are interested in doing so, we highly recommend two
      excellent books:
      <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.angelikalanger.com/iostreams.html">Standard C++
      IOStreams and Locales</link> by Langer and Kreft, ISBN 0-201-18395-1, and
      <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.josuttis.com/libbook/">The C++ Standard Library</link>
      by Nicolai Josuttis, ISBN 0-201-37926-0.  Both are published by
      Addison-Wesley, who isn't paying us a cent for saying that, honest.
   </para>
   <para>Here is a simple example, io/outbuf1, from the Josuttis text.  It
      transforms everything sent through it to uppercase.  This version
      assumes many things about the nature of the character type being
      used (for more information, read the books or the newsgroups):
   </para>
   <programlisting>
    #include <iostream>
    #include <streambuf>
    #include <locale>
    #include <cstdio>

    class outbuf : public std::streambuf
    {
      protected:
        /* central output function
         * - print characters in uppercase mode
         */
        virtual int_type overflow (int_type c) {
            if (c != EOF) {
                // convert lowercase to uppercase
                c = std::toupper(static_cast<char>(c),getloc());

                // and write the character to the standard output
                if (std::putchar(c) == EOF) {
                    return EOF;
                }
            }
            return c;
        }
    };

    int main()
    {
        // create special output buffer
        outbuf ob;
        // initialize output stream with that output buffer
        std::ostream out(&ob);

        out << "31 hexadecimal: "
            << std::hex << 31 << std::endl;
        return 0;
    }
   </programlisting>
   <para>Try it yourself!  More examples can be found in 3.1.x code, in
      <filename>include/ext/*_filebuf.h</filename>, and in this article by James Kanze:
      <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://kanze.james.neuf.fr/articles/fltrsbf1.html">Filtering
      Streambufs</link>.
   </para>
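   <para>In the same spirit, here is a hedged sketch of a filtering variant
      that forwards the uppercased characters to another stream buffer
      instead of calling <code>putchar()</code>.  The class name
      <code>upper_filterbuf</code> and the helper <code>shout</code> are
      invented for this example, and like the version above it assumes plain
      <type>char</type> streams.
   </para>

```cpp
#include <locale>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical class: uppercases each character and hands it to a target
// stream buffer supplied by the caller.
class upper_filterbuf : public std::streambuf
{
  public:
    explicit upper_filterbuf(std::streambuf* target) : target_(target) { }

  protected:
    // No put area is set up, so every character arrives here one at a time.
    virtual int_type overflow(int_type c)
    {
        if (traits_type::eq_int_type(c, traits_type::eof()))
            return traits_type::not_eof(c);       // nothing to write
        const char upper =
            std::toupper(traits_type::to_char_type(c), getloc());
        return target_->sputc(upper);             // eof signals failure
    }

    // Pass flush requests on to the wrapped buffer.
    virtual int sync() { return target_->pubsync(); }

  private:
    std::streambuf* target_;
};

// Hypothetical helper: runs some output through the filter into a string.
std::string shout(const std::string& text)
{
    std::ostringstream sink;
    upper_filterbuf filter(sink.rdbuf());
    std::ostream out(&filter);
    out << text << std::hex << 31;
    return sink.str();
}
```

   <para>Because the target is just a <code>std::streambuf*</code>, the same
      filter can sit in front of a file, a string, or
      <code>std::cout.rdbuf()</code>.
   </para>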

  </section>

  <section xml:id="io.streambuf.buffering" xreflabel="Buffering"><info><title>Buffering</title></info>

   <para>First, are you sure that you understand buffering?  Particularly
      the fact that C++ may not, in fact, have anything to do with it?
   </para>
   <para>The rules for buffering can be a little odd, but they aren't any
      different from those of C.  (Maybe that's why they can be a bit
      odd.)  Many people think that writing a newline to an output
      stream automatically flushes the output buffer.  This is true only
      when the output stream is, in fact, a terminal and not a file
      or some other device -- and <emphasis>that</emphasis> may not even be true
      since C++ says nothing about files nor terminals.  All of that is
      system-dependent.  (The "newline-buffer-flushing only occurring
      on terminals" thing is mostly true on Unix systems, though.)
   </para>
   <para>Some people also believe that sending <code>endl</code> down an
      output stream only writes a newline.  This is incorrect; after a
      newline is written, the buffer is also flushed.  Perhaps this
      is the effect you want when writing to a screen -- get the text
      out as soon as possible, etc -- but the buffering is largely
      wasted when doing this to a file:
   </para>
   <programlisting>
   output << "a line of text" << endl;
   output << some_data_variable << endl;
   output << "another line of text" << endl; </programlisting>
   <para>The proper thing to do in this case is to just write the data out
      and let the libraries and the system worry about the buffering.
      If you need a newline, just write a newline:
   </para>
   <programlisting>
   output << "a line of text\n"
          << some_data_variable << '\n'
          << "another line of text\n"; </programlisting>
   <para>I have also joined the output statements into a single statement.
      You could make the code prettier by moving the single newline to
      the start of the quoted text on the last line, for example.
   </para>
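   <para>If you would like to see the difference rather than take it on
      faith, one way is to count flush requests.  The sketch below uses a
      made-up <code>counting_buf</code> class (derived from
      <code>std::stringbuf</code>) whose <code>sync()</code> override bumps a
      counter; the two helper functions are likewise invented for the
      example.
   </para>

```cpp
#include <ostream>
#include <sstream>

// Hypothetical helper: a stringbuf that counts how often it is asked to flush.
class counting_buf : public std::stringbuf
{
  public:
    counting_buf() : flushes(0) { }
    int flushes;

  protected:
    // flush() on the owning stream ends up here via pubsync().
    virtual int sync()
    {
        ++flushes;
        return 0;   // report success
    }
};

int flushes_with_endl()
{
    counting_buf buf;
    std::ostream output(&buf);
    output << "a line of text" << std::endl;
    output << "another line of text" << std::endl;
    return buf.flushes;     // one flush per endl
}

int flushes_with_newline()
{
    counting_buf buf;
    std::ostream output(&buf);
    output << "a line of text\n"
           << "another line of text\n";
    return buf.flushes;     // nobody asked for a flush
}
```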
   <para>If you do need to flush the buffer above, you can send an
      <code>endl</code> if you also need a newline, or just flush the buffer
      yourself:
   </para>
   <programlisting>
   output << ...... << flush;    // can use std::flush manipulator
   output.flush();               // or call a member fn </programlisting>
   <para>On the other hand, there are times when writing to a file should
      be like writing to standard error; no buffering should be done
      because the data needs to appear quickly (a prime example is a
      log file for security-related information).  The way to do this is
      just to turn off the buffering <emphasis>before any I/O operations at
      all</emphasis> have been done (note that opening counts as an I/O operation):
   </para>
   <programlisting>
   std::ofstream    os;
   std::ifstream    is;
   int   i;

   os.rdbuf()->pubsetbuf(0,0);
   is.rdbuf()->pubsetbuf(0,0);

   os.open("/foo/bar/baz");
   is.open("/qux/quux/quuux");
   ...
   os << "this data is written immediately\n";
   is >> i;   // and this will probably cause a disk read </programlisting>
   <para>Since all aspects of buffering are handled by a streambuf-derived
      member, it is necessary to get at that member with <code>rdbuf()</code>.
      Then the public version of <code>setbuf</code>,
      <code>pubsetbuf()</code>, can be called.  The
      arguments are the same as those for the Standard C I/O Library
      function (a buffer area followed by its size).
   </para>
   <para>A great deal of this is implementation-dependent.  For example,
      <code>streambuf</code> does not specify any actions for its own
      <code>setbuf()</code>-ish functions; the classes derived from
      <code>streambuf</code> each define behavior that "makes
      sense" for that class:  an argument of (0,0) turns off buffering
      for <code>filebuf</code> but does nothing at all for its siblings
      <code>stringbuf</code> and <code>strstreambuf</code>, and specifying
      anything other than (0,0) has varying effects.
      User-defined classes derived from <code>streambuf</code> can
      do whatever they want.  (For <code>filebuf</code> and arguments for
      <code>(p,s)</code> other than zeros, libstdc++ does what you'd expect:
      the first <code>s</code> bytes of <code>p</code> are used as a buffer,
      which you must allocate and deallocate.)
   </para>
   <para>A last reminder:  there are usually more buffers involved than
      just those at the language/library level.  Kernel buffers, disk
      buffers, and the like will also have an effect.  Inspecting and
      changing those is system-dependent.
   </para>

  </section>
</section>

<!-- Sect1 03 : Memory-based Streams -->
<section xml:id="std.io.memstreams" xreflabel="Memory Streams"><info><title>Memory Based Streams</title></info>
<?dbhtml filename="stringstreams.html"?>

  <section xml:id="std.io.memstreams.compat" xreflabel="Compatibility strstream"><info><title>Compatibility With strstream</title></info>

   <para>Stringstreams (defined in the header <code><sstream></code>)
      are in this author's opinion one of the coolest things since
      sliced time.  An example of their use is in the Received Wisdom
      section for Sect1 21 (Strings),
      <link linkend="strings.string.Cstring"> describing how to
      format strings</link>.
   </para>
   <para>The quick definition is:  they are siblings of ifstream and ofstream,
      and they do for <code>std::string</code> what their siblings do for
      files.  All that work you put into writing <code><<</code> and
      <code>>></code> functions for your classes now pays off
      <emphasis>again!</emphasis>  Need to format a string before passing the string
      to a function?  Send your stuff via <code><<</code> to an
      ostringstream.  You've read a string as input and need to parse it?
      Initialize an istringstream with that string, and then pull pieces
      out of it with <code>>></code>.  Have a stringstream and need to
      get a copy of the string inside?  Just call the <code>str()</code>
      member function.
   </para>
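   <para>Those three uses fit in a few lines.  This is a minimal sketch; the
      function names <code>make_label</code> and <code>parse_point</code> are
      invented for the example.
   </para>

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: format several values into one string before
// handing it to some other function.
std::string make_label(const std::string& name, int count)
{
    std::ostringstream os;
    os << name << " x" << count;
    return os.str();                // a copy of the string inside
}

// Hypothetical helper: parse a string that was read as input.
// Returns true only if both fields were extracted.
bool parse_point(const std::string& text, int& x, int& y)
{
    std::istringstream is(text);
    return !(is >> x >> y).fail();
}
```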
   <para>This only works if you've written your
      <code><<</code>/<code>>></code> functions correctly, though,
      and correctly means that they take istreams and ostreams as
      parameters, not i<emphasis>f</emphasis>streams and o<emphasis>f</emphasis>streams.  If they
      take the latter, then your I/O operators will work fine with
      file streams, but with nothing else -- including stringstreams.
   </para>
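   <para>For instance, a pair of operators written against the base classes
      might look like the sketch below.  The <code>Version</code> type, its
      members, and the <code>version_text</code> helper are hypothetical and
      exist only to have something to insert and extract.
   </para>

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical example type.
struct Version
{
    int major_part;
    int minor_part;
};

// Taking std::ostream& (not std::ofstream&) lets this work with files,
// strings, and any other kind of output stream.
std::ostream& operator<<(std::ostream& os, const Version& v)
{
    return os << v.major_part << '.' << v.minor_part;
}

// Likewise std::istream& rather than std::ifstream&.
std::istream& operator>>(std::istream& is, Version& v)
{
    char dot = '\0';
    Version parsed = { 0, 0 };
    if (is >> parsed.major_part >> dot >> parsed.minor_part) {
        if (dot == '.')
            v = parsed;                            // commit only on success
        else
            is.setstate(std::ios_base::failbit);   // wrong separator
    }
    return is;
}

// The same inserter, reused with a string stream.
std::string version_text(const Version& v)
{
    std::ostringstream os;
    os << v;
    return os.str();
}
```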
   <para>If you are a user of the strstream classes, you need to update
      your code.  You don't have to explicitly append <code>ends</code> to
      terminate the C-style character array, you don't have to mess with
      "freezing" functions, and you don't have to manage the
      memory yourself.  The strstreams have been officially deprecated,
      which means that 1) future revisions of the C++ Standard won't
      support them, and 2) if you use them, people will laugh at you.
   </para>

  </section>
</section>

<!-- Sect1 04 : File-based Streams -->
<section xml:id="std.io.filestreams" xreflabel="File Streams"><info><title>File Based Streams</title></info>
<?dbhtml filename="fstreams.html"?>

  <section xml:id="std.io.filestreams.copying_a_file" xreflabel="Copying a File"><info><title>Copying a File</title></info>

   <para>So you want to copy a file quickly and easily, and most important,
      completely portably.  And since this is C++, you have an open
      ifstream (call it IN) and an open ofstream (call it OUT):
   </para>
   <programlisting>
   #include <fstream>

   std::ifstream  IN ("input_file");
   std::ofstream  OUT ("output_file"); </programlisting>
   <para>Here's the easiest way to get it completely wrong:
   </para>
   <programlisting>
   OUT << IN;</programlisting>
   <para>For those of you who don't already know why this doesn't work
      (probably from having done it before), I invite you to quickly
      create a simple text file called "input_file" containing
      the sentence
   </para>
      <programlisting>
      The quick brown fox jumped over the lazy dog.</programlisting>
   <para>surrounded by blank lines.  Code it up and try it.  The contents
      of "output_file" may surprise you.
   </para>
   <para>Seriously, go do it.  Get surprised, then come back.  It's worth it.
   </para>
   <para>The thing to remember is that the <code>basic_[io]stream</code> classes
      handle formatting, nothing else.  In particular, they break up on
      whitespace.  The actual reading, writing, and storing of data is
      handled by the <code>basic_streambuf</code> family.  Fortunately, the
      <code>operator<<</code> is overloaded to take an ostream and
      a pointer-to-streambuf, in order to help with just this kind of
      "dump the data verbatim" situation.
   </para>
   <para>Why a <emphasis>pointer</emphasis> to streambuf and not just a streambuf?  Well,
      the [io]streams hold pointers (or references, depending on the
      implementation) to their buffers, not the actual
      buffers.  This allows polymorphic behavior on the part of the buffers
      as well as the streams themselves.  The pointer is easily retrieved
      using the <code>rdbuf()</code> member function.  Therefore, the easiest
      way to copy the file is:
   </para>
   <programlisting>
   OUT << IN.rdbuf();</programlisting>
   <para>So what <emphasis>was</emphasis> happening with OUT<<IN?  Nothing
      useful, since that particular << isn't defined by the Standard.
      I have seen instances where it is implemented, but the character
      extraction process removes all the whitespace, leaving you with no
      blank lines and only "Thequickbrownfox...".  With
      libraries that do not define that operator, IN (or one of IN's
      member pointers) sometimes gets converted to a void*, and the output
      file then contains a perfect text representation of a hexadecimal
      address (quite a big surprise).  Others don't compile at all; since
      C++11 the stream's conversion to <code>bool</code> is explicit, so a
      conforming library rejects the expression.
   </para>
   <para>Also note that none of this is specific to o<emphasis>f</emphasis>streams.
      The operators shown above are all defined in the parent
      basic_ostream class and are therefore available with all possible
      descendants.
   </para>
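   <para>To make that concrete without touching the file system, the sketch
      below does the same copy between two string streams.  The function
      names <code>copy_stream</code> and <code>copy_verbatim</code> are
      invented for the example.
   </para>

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: copies everything left in one stream to another,
// untouched.  Works for any istream/ostream pair, not only file streams.
void copy_stream(std::istream& in, std::ostream& out)
{
    out << in.rdbuf();      // inserts the buffer's contents, whitespace included
}

// Hypothetical helper: the same copy, driven through string streams.
std::string copy_verbatim(const std::string& contents)
{
    std::istringstream in(contents);
    std::ostringstream out;
    copy_stream(in, out);
    return out.str();
}
```

   <para>The blank lines and the spaces between words survive the trip,
      which is exactly what the formatted extractors would have thrown away.
   </para>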

  </section>

  <section xml:id="std.io.filestreams.binary" xreflabel="Binary Input and Output"><info><title>Binary Input and Output</title></info>

   <para>The first and most important thing to remember about binary I/O is
      that opening a file with <code>ios::binary</code> is not, repeat
      <emphasis>not</emphasis>, the only thing you have to do.  It is not a silver
      bullet, and will not allow you to use the <code><</>></code>
      operators of the normal fstreams to do binary I/O.
   </para>
   <para>Sorry.  Them's the breaks.
   </para>
   <para>This isn't going to try to be a complete tutorial on reading and
      writing binary files (because "binary"
      covers a lot of ground), but we will try to clear
      up a couple of misconceptions and common errors.
   </para>
   <para>First, <code>ios::binary</code> has exactly one defined effect, no more
      and no less.  Normal text mode has to be concerned with the newline
      characters, and the runtime system will translate between (for
      example) '\n' and the appropriate end-of-line sequence (LF on Unix,
      CRLF on DOS, CR on Macintosh, etc).  (There are other things that
      normal mode does, but that's the most obvious.)  Opening a file in
      binary mode disables this conversion, so reading a CRLF sequence
      under Windows won't accidentally get mapped to a '\n' character, etc.
      Binary mode is not supposed to suddenly give you a bitstream, and
      if it is doing so in your program then you've discovered a bug in
      your vendor's compiler (or some other part of the C++ implementation,
      possibly the runtime system).
   </para>
   <para>Second, using <code><<</code> to write and <code>>></code> to
      read isn't going to work with the standard file stream classes, even
      if you use <code>noskipws</code> during reading.  Why not?  Because
      ifstream and ofstream exist for the purpose of <emphasis>formatting</emphasis>,
      not reading and writing.  Their job is to interpret the data into
      text characters, and that's exactly what you don't want to happen
      during binary I/O.
   </para>
   <para>Third, using the <code>get()</code> and <code>put()/write()</code> member
      functions still isn't guaranteed to help you.  These are
      "unformatted" I/O functions, but still character-based.
      (This may or may not be what you want, see below.)
   </para>
   <para>Notice how all the problems here are due to the inappropriate use
      of <emphasis>formatting</emphasis> functions and classes to perform something
      which <emphasis>requires</emphasis> that formatting not be done?  There are a
      seemingly infinite number of solutions, and a few are listed here:
   </para>
   <itemizedlist>
      <listitem>
        <para><quote>Derive your own fstream-type classes and write your own
          <</>> operators to do binary I/O on whatever data
          types you're using.</quote>
        </para>
        <para>
          This is a Bad Thing, because while
          the compiler would probably be just fine with it, other humans
          are going to be confused.  The overloaded bitshift operators
          have a well-defined meaning (formatting), and this breaks it.
        </para>
      </listitem>
      <listitem>
        <para>
          <quote>Build the file structure in memory, then
          <code>mmap()</code> the file and copy the
          structure.
        </quote>
        </para>
        <para>
          Well, this is easy to make work, and easy to break, and is
          pretty equivalent to using <code>::read()</code> and
          <code>::write()</code> directly, and makes no use of the
          iostream library at all...
          </para>
      </listitem>
      <listitem>
        <para>
          <quote>Use streambufs, that's what they're there for.</quote>
        </para>
        <para>
          While not trivial for the beginner, this is the best of all
          solutions.  The streambuf/filebuf layer is the layer that is
          responsible for actual I/O.  If you want to use the C++
          library for binary I/O, this is where you start.
        </para>
      </listitem>
   </itemizedlist>
 | 
						|
   <para>How to go about using streambufs is a bit beyond the scope of this
      document (at least for now), but while streambufs go a long way,
      they still leave a couple of things up to you, the programmer.
      As an example, byte ordering is completely between you and the
      operating system, and you have to handle it yourself.
   </para>
   <para>Deriving your own streambuf or filebuf
      class from the standard ones, specific to your data
      types (or an abstraction thereof), is probably a good idea, and
      lots of examples exist in journals and on Usenet.  Using the
      standard filebufs directly (either by declaring your own or by
      using the pointer returned from an fstream's <code>rdbuf()</code>)
      is certainly feasible as well.
   </para>
   <para>One area that causes problems is trying to do bit-by-bit operations
      with filebufs.  C++ is no different from C in this respect:  I/O
      must be done at the byte level.  If you're trying to read or write
      a few bits at a time, you're going about it the wrong way.  You
      must read/write an integral number of bytes and then process the
      bytes.  (For example, the streambuf functions take and return
      variables of type <code>int_type</code>.)
   </para>
   <para>Another area of problems is opening text files in binary mode.
      Generally, binary mode is intended for binary files, and opening
      text files in binary mode means that you now have to deal with all of
      those end-of-line and end-of-file problems that we mentioned before.
   </para>
   <para>
      An instructive thread from comp.lang.c++.moderated delved into
      this topic, starting more or less at
      <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://groups.google.com/group/comp.std.c++/browse_thread/thread/f87b4abd7954a87/946a3eb9921e382d?q=comp.std.c%2B%2B+binary+iostream#946a3eb9921e382d">this</link>
      post and continuing to the end of the thread. (The subject heading is "binary iostreams" on both comp.std.c++
      and comp.lang.c++.moderated.) Take special note of the replies by James Kanze and Dietmar Kühl.
   </para>
   <para>Briefly, the problems of byte ordering and type sizes mean that
      the unformatted functions like <code>ostream::put()</code> and
      <code>istream::get()</code> cannot safely be used to communicate
      between arbitrary programs, or across a network, or from one
      invocation of a program to another invocation of the same program
      on a different platform, etc.
   </para>
 </section>

</section>

<!-- Sect1 03 : Interacting with C -->
<section xml:id="std.io.c" xreflabel="Interacting with C"><info><title>Interacting with C</title></info>
<?dbhtml filename="io_and_c.html"?>
  


  <section xml:id="std.io.c.FILE" xreflabel="Using FILE* and file descriptors"><info><title>Using FILE* and file descriptors</title></info>
    
    <para>
      See the <link linkend="manual.ext.io">extensions</link> for using
      <type>FILE</type> and <type>file descriptors</type> with
      <classname>ofstream</classname> and
      <classname>ifstream</classname>.
    </para>
  </section>

  <section xml:id="std.io.c.sync" xreflabel="Performance Issues"><info><title>Performance</title></info>
    
    <para>
      Pathetic Performance? Ditch C.
    </para>
   <para>It sounds like a flame on C, but it isn't.  Really.  Calm down.
      I'm just saying it to get your attention.
   </para>
   <para>Because the C++ library includes the C library, both C-style and
      C++-style I/O have to work at the same time.  For example:
   </para>
   <programlisting>
     #include <iostream>
     #include <cstdio>

     int main()
     {
       std::cout << "Hel";
       std::printf ("lo, worl");
       std::cout << "d!\n";
     }
   </programlisting>
   <para>This must do what you think it does: print
      <quote>Hello, world!</quote> on a single line, with the three pieces
      in order.
   </para>
   <para>Alert members of the audience will immediately notice that buffering
      is going to make a hash of the output unless special steps are taken.
   </para>
   <para>The special steps taken by libstdc++, at least for version 3.0,
      involve doing very little buffering for the standard streams, leaving
      most of the buffering to the underlying C library.  (This kind of
      thing is tricky to get right.)
      The upside is that correctness is ensured.  The downside is that
      writing through <code>cout</code> can quite easily lead to awful
      performance when the C++ I/O library is layered on top of the C I/O
      library (as it is for 3.0 by default).  Some patches have been applied
      which improve the situation for 3.1.
   </para>
   <para>However, the C and C++ standard streams only need to be kept in sync
      when both libraries' facilities are in use.  If your program only uses
      C++ I/O, then there's no need to sync with the C streams.  The right
      thing to do in this case is to call
   </para>
   <programlisting>
     #include <emphasis>any of the I/O headers such as ios, iostream, etc.</emphasis>

     std::ios::sync_with_stdio(false);
   </programlisting>
   <para>You must do this before performing any I/O via the C++ stream objects.
      Once you call this, the C++ streams will operate independently of the
      (unused) C streams.  For GCC 3.x, this means that <code>cout</code> and
      company will become fully buffered on their own.
   </para>
   <para>Note, by the way, that the synchronization requirement only applies to
      the standard streams (<code>cin</code>, <code>cout</code>,
      <code>cerr</code>,
      <code>clog</code>, and their wide-character counterparts).  File stream
      objects that you declare yourself have no such requirement and are fully
      buffered.
   </para>


  </section>
</section>

</chapter>