11.3. Operations on Associative Containers
In addition to the types listed in Table 9.2 (p. 330), the associative containers define the types listed in Table 11.3. These types represent the container’s key and value types.
Table 11.3. Associative Container Additional Type Aliases
Type | Description |
---|---|
key_type | Type of the key for this container type |
mapped_type | Type associated with each key; map types only |
value_type | For set s, same as the key_type . For map s, pair<const key_type, mapped_type> |
For the set
types, the key_type
and the value_type
are the same; the values held in a set
are the keys. In a map
, the elements are key–value pairs. That is, each element is a pair
object containing a key and a associated value. Because we cannot change an element’s key, the key part of these pair
s is const
:
set<string>::value_type v1; // v1 is a string
set<string>::key_type v2; // v2 is a string
map<string, int>::value_type v3; // v3 is a pair<const string, int>
map<string, int>::key_type v4; // v4 is a string
map<string, int>::mapped_type v5; // v5 is an int
As with the sequential containers (§ 9.2.2, p. 332), we use the scope operator to fetch a type member—for example, map<string, int>::key_type
.
Only the map
types (unordered_map
, unordered_multimap
, multimap
, and map
) define mapped_type
.
11.3.1. Associative Container Iterators
When we dereference an iterator, we get a reference to a value of the container’s value_type
. In the case of map
, the value_type
is a pair
in which first
holds the const
key and second
holds the value:
// get an iterator to an element in word_count
auto map_it = word_count.begin();
// *map_it is a reference to a pair<const string, size_t> object
cout << map_it->first; // prints the key for this element
cout << " " << map_it->second; // prints the value of the element
map_it->first = "new key"; // error: key is const
++map_it->second; // ok: we can change the value through an iterator
INFO
It is essential to remember that the value_type
of a map
is a pair
and that we can change the value but not the key member of that pair
.
Iterators for set
s Are const
Although the set
types define both the iterator
and const_iterator
types, both types of iterators give us read-only access to the elements in the set
. Just as we cannot change the key part of a map
element, the keys in a set
are also const
. We can use a set
iterator to read, but not write, an element’s value:
set<int> iset = {0,1,2,3,4,5,6,7,8,9};
set<int>::iterator set_it = iset.begin();
if (set_it != iset.end()) {
*set_it = 42; // error: keys in a set are read-only
cout << *set_it << endl; // ok: can read the key
}
Iterating across an Associative Container
The map
and set
types provide all the begin
and end
operations from Table 9.2 (p. 330). As usual, we can use these functions to obtain iterators that we can use to traverse the container. For example, we can rewrite the loop that printed the results in our word-counting program on page 421 as follows:
// get an iterator positioned on the first element
auto map_it = word_count.cbegin();
// compare the current iterator to the off-the-end iterator
while (map_it != word_count.cend()) {
// dereference the iterator to print the element key--value pairs
cout << map_it->first << " occurs "
<< map_it->second << " times" << endl;
++map_it; // increment the iterator to denote the next element
}
The while
condition and increment for the iterator in this loop look a lot like the programs we wrote that printed the contents of a vector
or a string
. We initialize an iterator, map_it
, to refer to the first element in word_count
. As long as the iterator is not equal to the end
value, we print the current element and then increment the iterator. The output statement dereferences map_it
to get the members of pair
but is otherwise the same as the one in our original program.
INFO
The output of this program is in alphabetical order. When we use an iterator to traverse a map, multimap, set
, or multiset
, the iterators yield elements in ascending key order.
Associative Containers and Algorithms
In general, we do not use the generic algorithms (Chapter 10) with the associative containers. The fact that the keys are const
means that we cannot pass associative container iterators to algorithms that write to or reorder container elements. Such algorithms need to write to the elements. The elements in the set
types are const
, and those in map
s are pair
s whose first element is const
.
Associative containers can be used with the algorithms that read elements. However, many of these algorithms search the sequence. Because elements in an associative container can be found (quickly) by their key, it is almost always a bad idea to use a generic search algorithm. For example, as we’ll see in § 11.3.5 (p. 436), the associative containers define a member named find
, which directly fetches the element with a given key. We could use the generic find
algorithm to look for an element, but that algorithm does a sequential search. It is much faster to use the find
member defined by the container than to call the generic version.
In practice, if we do so at all, we use an associative container with the algorithms either as the source sequence or as a destination. For example, we might use the generic copy
algorithm to copy the elements from an associative container into another sequence. Similarly, we can call inserter
to bind an insert iterator (§ 10.4.1, p. 401) to an associative container. Using inserter
, we can use the associative container as a destination for another algorithm.
INFO
Exercises Section 11.3.1
Exercise 11.15: What are the mapped_type
, key_type
, and value_type
of a map
from int
to vector<int>?
Exercise 11.16: Using a map
iterator write an expression that assigns a value to an element.
Exercise 11.17: Assuming c
is a multiset
of string
s and v
is a vector
of string
s, explain the following calls. Indicate whether each call is legal:
copy(v.begin(), v.end(), inserter(c, c.end()));
copy(v.begin(), v.end(), back_inserter(c));
copy(c.begin(), c.end(), inserter(v, v.end()));
copy(c.begin(), c.end(), back_inserter(v));
Exercise 11.18: Write the type of map_it
from the loop on page 430 without using auto
or decltype
.
Exercise 11.19: Define a variable that you initialize by calling begin()
on the multiset
named bookstore
from § 11.2.2 (p. 425). Write the variable’s type without using auto
or decltype
.
11.3.2. Adding Elements
The insert
members (Table 11.4 (overleaf)) add one element or a range of elements. Because map
and set
(and the corresponding unordered types) contain unique keys, inserting an element that is already present has no effect:
vector<int> ivec = {2,4,6,8,2,4,6,8}; // ivec has eight elements
set<int> set2; // empty set
set2.insert(ivec.cbegin(), ivec.cend()); // set2 has four elements
set2.insert({1,3,5,7,1,3,5,7}); // set2 now has eight elements
Table 11.4. Associative Container insert
Operations
Code | Description |
---|---|
c.insert(v) | v is a value_type object; args are used to construct an element. |
c.emplace(args) | For map and set , the element is inserted (or constructed) only if an element with the given key is not already in c . Returns a pair containing an iterator referring to the element with the given key and a bool indicating whether the element was inserted. For multimap and multiset , inserts (or constructs) the given element and returns an iterator to the new element. |
c.insert(b, e) c.insert(il) | b and e are iterators that denote a range of c::value_type values; il is a braced list of such values. Returns void . For map and set , inserts the elements with keys that are not already in c . For multimap and multiset inserts, each element in the range. |
c.insert(p, v) c.emplace(p, args) | Like insert(v) (or emplace(args) ), but uses iterator p as a hint for where to begin the search for where the new element should be stored. Returns an iterator to the element with the given key. |
The versions of insert
that take a pair of iterators or an initializer list work similarly to the corresponding constructors (§ 11.2.1, p. 423)—only the first element with a given key is inserted.
Adding Elements to a map
When we insert
into a map
, we must remember that the element type is a pair
. Often, we don’t have a pair
object that we want to insert. Instead, we create a pair
in the argument list to insert
:
// four ways to add word to word_count
word_count.insert({word, 1});
word_count.insert(make_pair(word, 1));
word_count.insert(pair<string, size_t>(word, 1));
word_count.insert(map<string, size_t>::value_type(word, 1));
As we’ve seen, under the new standard the easiest way to create a pair
is to use brace initialization inside the argument list. Alternatively, we can call make_pair
or explicitly construct the pair
. The argument in the last call to insert
:
map<string, size_t>::value_type(s, 1)
constructs a new object of the appropriate pair
type to insert into the map
.
Testing the Return from insert
The value returned by insert
(or emplace
) depends on the container type and the parameters. For the containers that have unique keys, the versions of insert
and emplace
that add a single element return a pair
that lets us know whether the insertion happened. The first
member of the pair
is an iterator to the element with the given key; the second
is a bool
indicating whether that element was inserted, or was already there. If the key is already in the container, then insert
does nothing, and the bool
portion of the return value is false
. If the key isn’t present, then the element is inserted and the bool
is true
.
As an example, we’ll rewrite our word-counting program to use insert
:
// more verbose way to count number of times each word occurs in the input
map<string, size_t> word_count; // empty map from string to size_t
string word;
while (cin >> word) {
// inserts an element with key equal to word and value 1;
// if word is already in word_count, insert does nothing
auto ret = word_count.insert({word, 1});
if (!ret.second) // word was already in word_count
++ret.first->second; // increment the counter
}
For each word
, we attempt to insert
it with a value 1
. If word
is already in the map
, then nothing happens. In particular, the counter associated with word
is unchanged. If word
is not already in the map
, then that string
is added to the map
and its counter value is set to 1.
The if
test examines the bool
part of the return value. If that value is false
, then the insertion didn’t happen. In this case, word
was already in word_count
, so we must increment the value associated with that element.
Unwinding the Syntax
The statement that increments the counter in this version of the word-counting program can be hard to understand. It will be easier to understand that expression by first parenthesizing it to reflect the precedence (§ 4.1.2, p. 136) of the operators:
++((ret.first)->second); // equivalent expression
Explaining this expression step by step:
ret
holds the value returned byinsert
, which is apair
.ret.first
is thefirst
member of thatpair
, which is amap
iterator referring to the element with the given key.ret.first->
dereferences that iterator to fetch that element. Elements in themap
are alsopair
s.ret.first->second
is the value part of the map elementpair
.++ret.first->second
increments that value.
Putting it back together, the increment statement fetches the iterator for the element with the key word
and increments the counter associated with the key we tried to insert.
For readers using an older compiler or reading code that predates the new standard, declaring and initializing ret
is also somewhat tricky:
pair<map<string, size_t>::iterator, bool> ret =
word_count.insert(make_pair(word, 1));
It should be easy to see that we’re defining a pair
and that the second type of the pair
is bool
. The first type of that pair
is a bit harder to understand. It is the iterator
type defined by the map<string, size_t>
type.
Adding Elements to multiset
or multimap
Our word-counting program depends on the fact that a given key can occur only once. That way, there is only one counter associated with any given word. Sometimes, we want to be able to add additional elements with the same key. For example, we might want to map authors to titles of the books they have written. In this case, there might be multiple entries for each author, so we’d use a multimap
rather than a map
. Because keys in a multi
container need not be unique, insert
on these types always inserts an element:
multimap<string, string> authors;
// adds the first element with the key Barth, John
authors.insert({"Barth, John", "Sot-Weed Factor"});
// ok: adds the second element with the key Barth, John
authors.insert({"Barth, John", "Lost in the Funhouse"});
For the containers that allow multiple keys, the insert
operation that takes a single element returns an iterator to the new element. There is no need to return a bool
, because insert
always adds a new element in these types.
INFO
Exercises Section 11.3.2
Exercise 11.20: Rewrite the word-counting program from § 11.1 (p. 421) to use insert
instead of subscripting. Which program do you think is easier to write and read? Explain your reasoning.
Exercise 11.21: Assuming word_count
is a map
from string
to size_t
and word
is a string
, explain the following loop:
while (cin >> word)
++word_count.insert({word, 0}).first->second;
Exercise 11.22: Given a map<string, vector<int>>
, write the types used as an argument and as the return value for the version of insert
that inserts one element.
Exercise 11.23: Rewrite the map
that stored vector
s of children’s names with a key that is the family last name for the exercises in § 11.2.1 (p. 424) to use a multimap
.
11.3.3. Erasing Elements
The associative containers define three versions of erase
, which are described in Table 11.5. As with the sequential containers, we can erase
one element or a range of elements by passing erase
an iterator or an iterator pair. These versions of erase
are similar to the corresponding operations on sequential containers: The indicated element(s) are removed and the function returns void
.
Table 11.5. Removing Elements from an Associative Container
Code | Description |
---|---|
c.erase(k) | Removes every element with key k from c . Returns size_type indicating the number of elements removed. |
c.erase(p) | Removes the element denoted by the iterator p from c . p must refer to an actual element in c ; it must not be equal to c.end() . Returns an iterator to the element after p or c.end() if p denotes the last element in c . |
c.erase(b, e) | Removes the elements in the range denoted by the iterator pair b , e . Returns e . |
The associative containers supply an additional erase
operation that takes a key_type
argument. This version removes all the elements, if any, with the given key and returns a count of how many elements were removed. We can use this version to remove a specific word from word_count
before printing the results:
// erase on a key returns the number of elements removed
if (word_count.erase(removal_word))
cout << "ok: " << removal_word << " removed\n";
else cout << "oops: " << removal_word << " not found!\n";
For the containers with unique keys, the return from erase
is always either zero or one. If the return value is zero, then the element we wanted to erase was not in the container.
For types that allow multiple keys, the number of elements removed could be greater than one:
auto cnt = authors.erase("Barth, John");
If authors
is the multimap
we created in § 11.3.2 (p. 434), then cnt
will be 2.
11.3.4. Subscripting a map
TrickyThe map
and unordered_map
containers provide the subscript operator and a corresponding at
function (§ 9.3.2, p. 348), which are described in Table 11.6 (overleaf). The set
types do not support subscripting because there is no “value” associated with a key in a set
. The elements are themselves keys, so the operation of “fetching the value associated with a key” is meaningless. We cannot subscript a multimap
or an unordered_multimap
because there may be more than one value associated with a given key.
Table 11.6. Subscript Operation for map
and unordered_map
Code | Description |
---|---|
c[k] | Returns the element with key k ; if k is not in c , adds a new, value-initialized element with key k . |
c.at(k) | Checked access to the element with key k ; throws an out_of_range exception (§ 5.6, p. 193) if k is not in c . |
Like the other subscript operators we’ve used, the map
subscript takes an index (that is, a key) and fetches the value associated with that key. However, unlike other subscript operators, if the key is not already present, a new element is created and inserted into the map
for that key. The associated value is value initialized (§ 3.3.1, p. 98).
For example, when we write
map <string, size_t> word_count; // empty map
// insert a value-initialized element with key Anna; then assign 1 to its value
word_count["Anna"] = 1;
the following steps take place:
word_count
is searched for the element whose key isAnna
. The element is not found.- A new key-value pair is inserted into
word_count
. The key is aconst string
holdingAnna
. The value is value initialized, meaning in this case that the value is 0. - The newly inserted element is fetched and is given the value 1.
Because the subscript operator might insert an element, we may use subscript only on a map
that is not const
.
INFO
Subscripting a map
behaves quite differently from subscripting an array or vector:
Using a key that is not already present adds an element with that key to the map
.
Using the Value Returned from a Subscript Operation
Another way in which the map
subscript differs from other subscript operators we’ve used is its return type. Ordinarily, the type returned by dereferencing an iterator and the type returned by the subscript operator are the same. Not so for map
s: when we subscript a map
, we get a mapped_type
object; when we dereference a map
iterator, we get a value_type
object (§ 11.3, p. 428).
In common with other subscripts, the map
subscript operator returns an lvalue (§ 4.1.1, p. 135). Because the return is an lvalue, we can read or write the element:
cout << word_count["Anna"]; // fetch the element indexed by Anna; prints 1
++word_count["Anna"]; // fetch the element and add 1 to it
cout << word_count["Anna"]; // fetch the element and print it; prints 2
INFO
Unlike vector
or string
, the type returned by the map
subscript operator differs from the type obtained by dereferencing a map
iterator.
The fact that the subscript operator adds an element if it is not already in the map
allows us to write surprisingly succinct programs such as the loop inside our word-counting program (§ 11.1, p. 421). On the other hand, sometimes we only want to know whether an element is present and do not want to add the element if it is not. In such cases, we must not use the subscript operator.
11.3.5. Accessing Elements
The associative containers provide various ways to find a given element, which are described in Table 11.7 (p. 438). Which operation to use depends on what problem we are trying to solve. If all we care about is whether a particular element is in the container, it is probably best to use find
. For the containers that can hold only unique keys, it probably doesn’t matter whether we use find
or count
. However, for the containers with multiple keys, count
has to do more work: If the element is present, it still has to count how many elements have the same key. If we don’t need the count, it’s best to use find
:
Table 11.7. Operations to Find Elements in an Associative Container
INFO
lower_bound
andupper_bound
not valid for the unordered containers- Subscript and at operations only for
map
andunordered_map
that are notconst
.
Code | Description |
---|---|
c.find(k) | Returns an iterator to the (first) element with key k , or the off-the-end iterator if k is not in the container. |
c.count(k) | Returns the number of elements with key k . For the containers with unique keys, the result is always zero or one. |
c.lower_bound(k) | Returns an iterator to the first element with key not less than k . |
c.upper_bound(k) | Returns an iterator to the first element with key greater than k . |
c.equal_range(k) | Returns a pair of iterators denoting the elements with key k . If k is not present, both members are c.end() . |
INFO
Exercises Section 11.3.4
Exercise 11.24: What does the following program do?
map<int, int> m;
m[0] = 1;
Exercise 11.25: Contrast the following program with the one in the previous exercise
vector<int> v;
v[0] = 1;
Exercise 11.26: What type can be used to subscript a map
? What type does the subscript operator return? Give a concrete example—that is, define a map
and then write the types that can be used to subscript the map
and the type that would be returned from the subscript operator.
set<int> iset = {0,1,2,3,4,5,6,7,8,9};
iset.find(1); // returns an iterator that refers to the element with key == 1
iset.find(11); // returns the iterator == iset.end()
iset.count(1); // returns 1
iset.count(11); // returns 0
Using find
Instead of Subscript for map
s
For the map
and unordered_map
types, the subscript operator provides the simplest method of retrieving a value. However, as we’ve just seen, using a subscript has an important side effect: If that key is not already in the map
, then subscript inserts an element with that key. Whether this behavior is correct depends on our expectations. Our word-counting programs relied on the fact that using a nonexistent key as a subscript inserts an element with that key and value 0.
Sometimes, we want to know if an element with a given key is present without changing the map
. We cannot use the subscript operator to determine whether an element is present, because the subscript operator inserts a new element if the key is not already there. In such cases, we should use find
:
if (word_count.find("foobar") == word_count.end())
cout << "foobar is not in the map" << endl;
Finding Elements in a multimap
or multiset
Finding an element in an associative container that requires unique keys is a simple matter—the element is or is not in the container. For the containers that allow multiple keys, the process is more complicated: There may be many elements with the given key. When a multimap
or multiset
has multiple elements of a given key, those elements will be adjacent within the container.
For example, given our map from author to titles, we might want to print all the books by a particular author. We can solve this problem in three different ways. The most obvious way uses find
and count
:
string search_item("Alain de Botton"); // author we'll look for
auto entries = authors.count(search_item); // number of elements
auto iter = authors.find(search_item); // first entry for this author
// loop through the number of entries there are for this author
while(entries) {
cout << iter->second << endl; // print each title
++iter; // advance to the next title
--entries; // keep track of how many we've printed
}
We start by determining how many entries there are for the author by calling count
and getting an iterator to the first element with this key by calling find
. The number of iterations of the for
loop depends on the number returned from count
. In particular, if the count
was zero, then the loop is never executed.
INFO
We are guaranteed that iterating across a multimap
or multiset
returns all the elements with a given key in sequence.
A Different, Iterator-Oriented Solution
Alternatively, we can solve our problem using lower_bound
and upper_bound
. Each of these operations take a key and returns an iterator. If the key is in the container, the iterator returned from lower_bound
will refer to the first instance of that key and the iterator returned by upper_bound
will refer just after the last instance of the key. If the element is not in the multimap
, then lower_bound
and upper_bound
will return equal iterators; both will refer to the point at which the key can be inserted without disrupting the order. Thus, calling lower_bound
and upper_bound
on the same key yields an iterator range (§ 9.2.1, p. 331) that denotes all the elements with that key.
Of course, the iterator returned from these operations might be the off-the-end iterator for the container itself. If the element we’re looking for has the largest key in the container, then upper_bound
on that key returns the off-the-end iterator. If the key is not present and is larger than any key in the container, then the return from lower_bound
will also be the off-the-end iterator.
INFO
The iterator returned from lower_bound
may or may not refer to an element with the given key. If the key is not in the container, then lower_bound
refers to the first point at which this key can be inserted while preserving the element order within the container.
Using these operations, we can rewrite our program as follows:
// definitions of authors and search_item as above
// beg and end denote the range of elements for this author
for (auto beg = authors.lower_bound(search_item),
end = authors.upper_bound(search_item);
beg != end; ++beg)
cout << beg->second << endl; // print each title
This program does the same work as the previous one that used count
and find
but accomplishes its task more directly. The call to lower_bound
positions beg
so that it refers to the first element matching search_item
if there is one. If there is no such element, then beg
refers to the first element with a key larger than search_item
, which could be the off-the-end iterator. The call to upper_bound
sets end
to refer to the element just beyond the last element with the given key. These operations say nothing about whether the key is present. The important point is that the return values act like an iterator range (§ 9.2.1, p. 331).
If there is no element for this key, then lower_bound
and upper_bound
will be equal. Both will refer to the point at which this key can be inserted while maintaining the container order.
Assuming there are elements with this key, beg
will refer to the first such element. We can increment beg
to traverse the elements with this key. The iterator in end
will signal when we’ve seen all the elements. When beg
equals end
, we have seen every element with this key.
Because these iterators form a range, we can use a for
loop to traverse that range. The loop is executed zero or more times and prints the entries, if any, for the given author. If there are no elements, then beg
and end
are equal and the loop is never executed. Otherwise, we know that the increment to beg
will eventually reach end
and that in the process we will print each record associated with this author.
INFO
If lower_bound
and upper_bound
return the same iterator, then the given key is not in the container.
The equal_range
Function
The remaining way to solve this problem is the most direct of the three approaches: Instead of calling upper_bound
and lower_bound
, we can call equal_range
.
This function takes a key and returns a pair
of iterators. If the key is present, then the first iterator refers to the first instance of the key and the second iterator refers one past the last instance of the key. If no matching element is found, then both the first and second iterators refer to the position where this key can be inserted.
We can use equal_range
to modify our program once again:
// definitions of authors and search_item as above
// pos holds iterators that denote the range of elements for this key
for (auto pos = authors.equal_range(search_item);
pos.first != pos.second; ++pos.first)
cout << pos.first->second << endl; // print each title
This program is essentially identical to the previous one that used upper_bound
and lower_bound
. Instead of using local variables, beg
and end
, to hold the iterator range, we use the pair
returned by equal_range
. The first
member of that pair
holds the same iterator as lower_bound
would have returned and second
holds the iterator upper_bound
would have returned. Thus, in this program pos.first
is equivalent to beg
, and pos.second
is equivalent to end
.
INFO
Exercises Section 11.3.5
Exercise 11.27: What kinds of problems would you use count
to solve? When might you use find
instead?
Exercise 11.28: Define and initialize a variable to hold the result of calling find
on a map
from string
to vector
of int
.
Exercise 11.29: What do upper_bound
, lower_bound
, and equal_range
return when you pass them a key that is not in the container?
Exercise 11.30: Explain the meaning of the operand pos.first->second
used in the output expression of the final program in this section.
Exercise 11.31: Write a program that defines a multimap
of authors and their works. Use find
to find an element in the multimap
and erase
that element. Be sure your program works correctly if the element you look for is not in the map
.
Exercise 11.32: Using the multimap
from the previous exercise, write a program to print the list of authors and their works alphabetically.
11.3.6. A Word Transformation Map
We’ll close this section with a program to illustrate creating, searching, and iterating across a map
. We’ll write a program that, given one string
, transforms it into another. The input to our program is two files. The first file contains rules that we will use to transform the text in the second file. Each rule consists of a word that might be in the input file and a phrase to use in its place. The idea is that whenever the first word appears in the input, we will replace it with the corresponding phrase. The second file contains the text to transform.
If the contents of the word-transformation file are
brb be right back
k okay?
y why
r are
u you
pic picture
thk thanks!
l8r later
and the text we are given to transform is
where r u
y dont u send me a pic
k thk l8r
then the program should generate the following output:
where are you
why dont you send me a picture
okay? thanks! later
The Word Transformation Program
Our solution will use three functions. The word_transform
function will manage the overall processing. It will take two ifstream
arguments: The first will be bound to the word-transformation file and the second to the file of text we’re to transform. The buildMap
function will read the file of transformation rules and create a map
from each word to its transformation. The transform
function will take a string
and return the transformation if there is one.
We’ll start by defining the word_transform
function. The important parts are the calls to buildMap
and transform
:
void word_transform(ifstream &map_file, ifstream &input)
{
auto trans_map = buildMap(map_file); // store the transformations
string text; // hold each line from the input
while (getline(input, text)) { // read a line of input
istringstream stream(text); // read each word
string word;
bool firstword = true; // controls whether a space is printed
while (stream >> word) {
if (firstword)
firstword = false;
else
cout << " "; // print a space between words
// transform returns its first argument or its transformation
cout << transform(word, trans_map); // print the output
}
cout << endl; // done with this line of input
}
}
The function starts by calling buildMap
to generate the word-transformation map
. We store the result in trans_map
. The rest of the function processes the input
file. The while
loop uses getline
to read the input file a line at a time. We read by line so that our output will have line breaks at the same position as in the input file. To get the words from each line, we use a nested while
loop that uses an istringstream
(§ 8.3, p. 321) to process each word in the current line.
The inner while
prints the output using the bool firstword
to determine whether to print a space. The call to transform
obtains the word to print. The value returned from transform
is either the original string
in word
or its corresponding transformation from trans_map
.
Building the Transformation Map
The buildMap
function reads its given file and builds the transformation map
.
map<string, string> buildMap(ifstream &map_file)
{
map<string, string> trans_map; // holds the transformations
string key; // a word to transform
string value; // phrase to use instead
// read the first word into key and the rest of the line into value
while (map_file >> key && getline(map_file, value))
if (value.size() > 1) // check that there is a transformation
trans_map[key] = value.substr(1); // skip leading space
else
throw runtime_error("no rule for " + key);
return trans_map;
}
Each line in map_file
corresponds to a rule. Each rule is a word followed by a phrase, which might contain multiple words. We use >>
to read the word that we will transform into key
and call getline
to read the rest of the line into value
. Because getline
does not skip leading spaces (§ 3.2.2, p. 87), we need to skip the space between the word and its corresponding rule. Before we store the transformation, we check that we got more than one character. If so, we call substr
(§ 9.5.1, p. 361) to skip the space that separated the transformation phrase from its corresponding word and store that substring in trans_map
,
Note that we use the subscript operator to add the key–value pairs. Implicitly, we are ignoring what should happen if a word appears more than once in our transformation file. If a word does appear multiple times, our loops will put the last corresponding phrase into trans_map
. When the while
concludes, trans_map
contains the data that we need to transform the input.
Generating a Transformation
The transform
function does the actual transformation. Its parameters are references to the string
to transform and to the transformation map
. If the given string
is in the map
, transform
returns the corresponding transformation. If the given string
is not in the map
, transform
returns its argument:
const string &
transform(const string &s, const map<string, string> &m)
{
// the actual map work; this part is the heart of the program
auto map_it = m.find(s);
// if this word is in the transformation map
if (map_it != m.cend())
return map_it->second; // use the replacement word
else
return s; // otherwise return the original unchanged
}
We start by calling find
to determine whether the given string
is in the map
. If it is, then find
returns an iterator to the corresponding element. Otherwise, find
returns the off-the-end iterator. If the element is found, we dereference the iterator, obtaining a pair
that holds the key and value for that element (§ 11.3, p. 428). We return the second
member, which is the transformation to use in place of s
.
INFO
Exercises Section 11.3.6
Exercise 11.33: Implement your own version of the word-transformation program.
Exercise 11.34: What would happen if we used the subscript operator instead of find
in the transform
function?
Exercise 11.35: In buildMap
, what effect, if any, would there be from rewriting
trans_map[key] = value.substr(1);
as
trans_map.insert({key, value.substr(1)});
Exercise 11.36: Our program does no checking on the validity of either input file. In particular, it assumes that the rules in the transformation file are all sensible. What would happen if a line in that file has a key, one space, and then the end of the line? Predict the behavior and then check it against your version of the program.