gcc/libstdc++-v3/doc/html/manual/bk01pt05ch13s03.html

58 lines
5.7 KiB
HTML
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>Arbitrary Character Types</title><meta name="generator" content="DocBook XSL Stylesheets V1.74.3" /><meta name="keywords" content="&#10; ISO C++&#10; , &#10; library&#10; " /><link rel="home" href="../spine.html" title="The GNU C++ Library Documentation" /><link rel="up" href="bk01pt05ch13.html" title="Chapter 13. String Classes" /><link rel="prev" href="bk01pt05ch13s02.html" title="Case Sensitivity" /><link rel="next" href="bk01pt05ch13s04.html" title="Tokenizing" /></head><body><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Arbitrary Character Types</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="bk01pt05ch13s02.html">Prev</a> </td><th width="60%" align="center">Chapter 13. String Classes</th><td width="20%" align="right"> <a accesskey="n" href="bk01pt05ch13s04.html">Next</a></td></tr></table><hr /></div><div class="sect1" lang="en" xml:lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a id="strings.string.character_types"></a>Arbitrary Character Types</h2></div></div></div><p>
</p><p>The <code class="code">std::basic_string</code> is tantalizingly general, in that
it is parameterized on the type of the characters which it holds.
In theory, you could whip up a Unicode character class and instantiate
<code class="code">std::basic_string&lt;my_unicode_char&gt;</code>, or assuming
that integers are wider than characters on your platform, maybe just
declare variables of type <code class="code">std::basic_string&lt;int&gt;</code>.
</p><p>That's the theory. Remember however that basic_string has additional
type parameters, which take default arguments based on the character
type (called <code class="code">CharT</code> here):
</p><pre class="programlisting">
template &lt;typename CharT,
typename Traits = char_traits&lt;CharT&gt;,
typename Alloc = allocator&lt;CharT&gt; &gt;
class basic_string { .... };</pre><p>Now, <code class="code">allocator&lt;CharT&gt;</code> will probably Do The Right
Thing by default, unless you need to implement your own allocator
for your characters.
</p><p>But <code class="code">char_traits</code> takes more work. The char_traits
template is <span class="emphasis"><em>declared</em></span> but not <span class="emphasis"><em>defined</em></span>.
That means there is only
</p><pre class="programlisting">
template &lt;typename CharT&gt;
struct char_traits
{
static void foo (type1 x, type2 y);
...
};</pre><p>and functions such as char_traits&lt;CharT&gt;::foo() are not
actually defined anywhere for the general case. The C++ standard
permits this, because writing such a definition to fit all possible
CharT's cannot be done.
</p><p>The C++ standard also requires that char_traits be specialized for
instantiations of <code class="code">char</code> and <code class="code">wchar_t</code>, and it
is these template specializations that permit entities like
<code class="code">basic_string&lt;char,char_traits&lt;char&gt;&gt;</code> to work.
</p><p>If you want to use character types other than char and wchar_t,
such as <code class="code">unsigned char</code> and <code class="code">int</code>, you will
need suitable specializations for them. For a time, in earlier
versions of GCC, there was a mostly-correct implementation that
let programmers be lazy but it broke under many situations, so it
was removed. GCC 3.4 introduced a new implementation that mostly
works and can be specialized even for <code class="code">int</code> and other
built-in types.
</p><p>If you want to use your own special character class, then you have
<a class="ulink" href="http://gcc.gnu.org/ml/libstdc++/2002-08/msg00163.html" target="_top">a lot
of work to do</a>, especially if you with to use i18n features
(facets require traits information but don't have a traits argument).
</p><p>Another example of how to specialize char_traits was given <a class="ulink" href="http://gcc.gnu.org/ml/libstdc++/2002-08/msg00260.html" target="_top">on the
mailing list</a> and at a later date was put into the file <code class="code">
include/ext/pod_char_traits.h</code>. We agree
that the way it's used with basic_string (scroll down to main())
doesn't look nice, but that's because <a class="ulink" href="http://gcc.gnu.org/ml/libstdc++/2002-08/msg00236.html" target="_top">the
nice-looking first attempt</a> turned out to <a class="ulink" href="http://gcc.gnu.org/ml/libstdc++/2002-08/msg00242.html" target="_top">not
be conforming C++</a>, due to the rule that CharT must be a POD.
(See how tricky this is?)
</p></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="bk01pt05ch13s02.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="bk01pt05ch13.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="bk01pt05ch13s04.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Case Sensitivity </td><td width="20%" align="center"><a accesskey="h" href="../spine.html">Home</a></td><td width="40%" align="right" valign="top"> Tokenizing</td></tr></table></div></body></html>