11.3. Operations on Associative Containers

In addition to the types listed in Table 9.2 (p. 330), the associative containers define the types listed in Table 11.3. These types represent the container’s key and value types.

Table 11.3. Associative Container Additional Type Aliases

Type	Description
`key_type`	Type of the key for this container type
`mapped_type`	Type associated with each key; `map` types only
`value_type`	For `set`s, same as the `key_type`. For `map`s, `pair<const key_type, mapped_type>`

For the set types, the key_type and the value_type are the same; the values held in a set are the keys. In a map, the elements are key–value pairs. That is, each element is a pair object containing a key and a associated value. Because we cannot change an element’s key, the key part of these pairs is const:

c++

set<string>::value_type v1;      // v1 is a string
set<string>::key_type v2;        // v2 is a string
map<string, int>::value_type v3; // v3 is a pair<const string, int>
map<string, int>::key_type v4;   // v4 is a string
map<string, int>::mapped_type v5; // v5 is an int

As with the sequential containers (§ 9.2.2, p. 332), we use the scope operator to fetch a type member—for example, map<string, int>::key_type.

Only the map types (unordered_map, unordered_multimap, multimap, and map) define mapped_type.

11.3.1. Associative Container Iterators

When we dereference an iterator, we get a reference to a value of the container’s value_type. In the case of map, the value_type is a pair in which first holds the const key and second holds the value:

c++

// get an iterator to an element in word_count
auto map_it = word_count.begin();
// *map_it is a reference to a pair<const string, size_t> object
cout << map_it->first;          // prints the key for this element
cout << " " << map_it->second;  // prints the value of the element
map_it->first = "new key";      // error: key is const
++map_it->second;     // ok: we can change the value through an iterator

INFO

It is essential to remember that the value_type of a map is a pair and that we can change the value but not the key member of that pair.

Iterators for `set`s Are `const`

Although the set types define both the iterator and const_iterator types, both types of iterators give us read-only access to the elements in the set. Just as we cannot change the key part of a map element, the keys in a set are also const. We can use a set iterator to read, but not write, an element’s value:

c++

set<int> iset = {0,1,2,3,4,5,6,7,8,9};
set<int>::iterator set_it = iset.begin();
if (set_it != iset.end()) {
    *set_it = 42;            // error: keys in a set are read-only
    cout << *set_it << endl; // ok: can read the key
}

Iterating across an Associative Container

The map and set types provide all the begin and end operations from Table 9.2 (p. 330). As usual, we can use these functions to obtain iterators that we can use to traverse the container. For example, we can rewrite the loop that printed the results in our word-counting program on page 421 as follows:

c++

// get an iterator positioned on the first element
auto map_it = word_count.cbegin();
// compare the current iterator to the off-the-end iterator
while (map_it != word_count.cend()) {
    // dereference the iterator to print the element key--value pairs
    cout << map_it->first << " occurs "
         << map_it->second << " times" << endl;
    ++map_it;  // increment the iterator to denote the next element
}

The while condition and increment for the iterator in this loop look a lot like the programs we wrote that printed the contents of a vector or a string. We initialize an iterator, map_it, to refer to the first element in word_count. As long as the iterator is not equal to the end value, we print the current element and then increment the iterator. The output statement dereferences map_it to get the members of pair but is otherwise the same as the one in our original program.

INFO

The output of this program is in alphabetical order. When we use an iterator to traverse a map, multimap, set, or multiset, the iterators yield elements in ascending key order.

Associative Containers and Algorithms

In general, we do not use the generic algorithms (Chapter 10) with the associative containers. The fact that the keys are const means that we cannot pass associative container iterators to algorithms that write to or reorder container elements. Such algorithms need to write to the elements. The elements in the set types are const, and those in maps are pairs whose first element is const.

Associative containers can be used with the algorithms that read elements. However, many of these algorithms search the sequence. Because elements in an associative container can be found (quickly) by their key, it is almost always a bad idea to use a generic search algorithm. For example, as we’ll see in § 11.3.5 (p. 436), the associative containers define a member named find, which directly fetches the element with a given key. We could use the generic find algorithm to look for an element, but that algorithm does a sequential search. It is much faster to use the find member defined by the container than to call the generic version.

In practice, if we do so at all, we use an associative container with the algorithms either as the source sequence or as a destination. For example, we might use the generic copy algorithm to copy the elements from an associative container into another sequence. Similarly, we can call inserter to bind an insert iterator (§ 10.4.1, p. 401) to an associative container. Using inserter, we can use the associative container as a destination for another algorithm.

INFO

Exercises Section 11.3.1

Exercise 11.15: What are the mapped_type, key_type, and value_type of a map from int to vector<int>?

Exercise 11.16: Using a map iterator write an expression that assigns a value to an element.

Exercise 11.17: Assuming c is a multiset of strings and v is a vector of strings, explain the following calls. Indicate whether each call is legal:

c++

copy(v.begin(), v.end(), inserter(c, c.end()));
copy(v.begin(), v.end(), back_inserter(c));
copy(c.begin(), c.end(), inserter(v, v.end()));
copy(c.begin(), c.end(), back_inserter(v));

Exercise 11.18: Write the type of map_it from the loop on page 430 without using auto or decltype.

Exercise 11.19: Define a variable that you initialize by calling begin() on the multiset named bookstore from § 11.2.2 (p. 425). Write the variable’s type without using auto or decltype.

11.3.2. Adding Elements

The insert members (Table 11.4 (overleaf)) add one element or a range of elements. Because map and set (and the corresponding unordered types) contain unique keys, inserting an element that is already present has no effect:

c++

vector<int> ivec = {2,4,6,8,2,4,6,8};    // ivec has eight elements
set<int> set2;                           // empty set
set2.insert(ivec.cbegin(), ivec.cend()); // set2 has four elements
set2.insert({1,3,5,7,1,3,5,7});      // set2 now has eight elements

Table 11.4. Associative Container insert Operations

Code	Description
`c.insert(v)`	`v` is a `value_type` object; `args` are used to construct an element.
`c.emplace(args)`	For `map` and `set`, the element is inserted (or constructed) only if an element with the given key is not already in `c`. Returns a `pair` containing an iterator referring to the element with the given key and a `bool` indicating whether the element was inserted. For `multimap` and `multiset`, inserts (or constructs) the given element and returns an iterator to the new element.
`c.insert(b, e)` `c.insert(il)`	`b` and `e` are iterators that denote a range of `c::value_type` values; `il` is a braced list of such values. Returns `void`. For `map` and `set`, inserts the elements with keys that are not already in `c`. For `multimap` and `multiset` inserts, each element in the range.
`c.insert(p, v)` `c.emplace(p, args)`	Like `insert(v)` (or `emplace(args)`), but uses iterator `p` as a hint for where to begin the search for where the new element should be stored. Returns an iterator to the element with the given key.

The versions of insert that take a pair of iterators or an initializer list work similarly to the corresponding constructors (§ 11.2.1, p. 423)—only the first element with a given key is inserted.

Adding Elements to a `map`

When we insert into a map, we must remember that the element type is a pair. Often, we don’t have a pair object that we want to insert. Instead, we create a pair in the argument list to insert:

c++

// four ways to add word to word_count
word_count.insert({word, 1});
word_count.insert(make_pair(word, 1));
word_count.insert(pair<string, size_t>(word, 1));
word_count.insert(map<string, size_t>::value_type(word, 1));

C++11

As we’ve seen, under the new standard the easiest way to create a pair is to use brace initialization inside the argument list. Alternatively, we can call make_pairor explicitly construct the pair. The argument in the last call to insert:

c++

map<string, size_t>::value_type(s, 1)

constructs a new object of the appropriate pair type to insert into the map.

Testing the Return from `insert`

The value returned by insert (or emplace) depends on the container type and the parameters. For the containers that have unique keys, the versions of insert and emplace that add a single element return a pair that lets us know whether the insertion happened. The first member of the pair is an iterator to the element with the given key; the second is a bool indicating whether that element was inserted, or was already there. If the key is already in the container, then insert does nothing, and the bool portion of the return value is false. If the key isn’t present, then the element is inserted and the bool is true.

As an example, we’ll rewrite our word-counting program to use insert:

c++

// more verbose way to count number of times each word occurs in the input
map<string, size_t> word_count; // empty map from string to size_t
string word;
while (cin >> word) {
    // inserts an element with key equal to word and value 1;
    // if word is already in word_count, insert does nothing
    auto ret = word_count.insert({word, 1});
    if (!ret.second)         // word was already in word_count
        ++ret.first->second; // increment the counter
}

For each word, we attempt to insert it with a value 1. If word is already in the map, then nothing happens. In particular, the counter associated with word is unchanged. If word is not already in the map, then that string is added to the map and its counter value is set to 1.

The if test examines the bool part of the return value. If that value is false, then the insertion didn’t happen. In this case, word was already in word_count, so we must increment the value associated with that element.

Unwinding the Syntax

The statement that increments the counter in this version of the word-counting program can be hard to understand. It will be easier to understand that expression by first parenthesizing it to reflect the precedence (§ 4.1.2, p. 136) of the operators:

c++

++((ret.first)->second); // equivalent expression

Explaining this expression step by step:

ret holds the value returned by insert, which is a pair.
ret.first is the first member of that pair, which is a map iterator referring to the element with the given key.
ret.first-> dereferences that iterator to fetch that element. Elements in the map are also pairs.
ret.first->second is the value part of the map element pair.
++ret.first->second increments that value.

Putting it back together, the increment statement fetches the iterator for the element with the key word and increments the counter associated with the key we tried to insert.

For readers using an older compiler or reading code that predates the new standard, declaring and initializing ret is also somewhat tricky:

c++

pair<map<string, size_t>::iterator, bool> ret =
            word_count.insert(make_pair(word, 1));

It should be easy to see that we’re defining a pair and that the second type of the pair is bool. The first type of that pair is a bit harder to understand. It is the iterator type defined by the map<string, size_t> type.

Adding Elements to `multiset` or `multimap`

Our word-counting program depends on the fact that a given key can occur only once. That way, there is only one counter associated with any given word. Sometimes, we want to be able to add additional elements with the same key. For example, we might want to map authors to titles of the books they have written. In this case, there might be multiple entries for each author, so we’d use a multimap rather than a map. Because keys in a multi container need not be unique, insert on these types always inserts an element:

c++

multimap<string, string> authors;
// adds the first element with the key Barth, John
authors.insert({"Barth, John", "Sot-Weed Factor"});
// ok: adds the second element with the key Barth, John
authors.insert({"Barth, John", "Lost in the Funhouse"});

For the containers that allow multiple keys, the insert operation that takes a single element returns an iterator to the new element. There is no need to return a bool, because insert always adds a new element in these types.

INFO

Exercises Section 11.3.2

Exercise 11.20: Rewrite the word-counting program from § 11.1 (p. 421) to use insert instead of subscripting. Which program do you think is easier to write and read? Explain your reasoning.

Exercise 11.21: Assuming word_count is a map from string to size_t and word is a string, explain the following loop:

c++

while (cin >> word)
  ++word_count.insert({word, 0}).first->second;

Exercise 11.22: Given a map<string, vector<int>>, write the types used as an argument and as the return value for the version of insert that inserts one element.

Exercise 11.23: Rewrite the map that stored vectors of children’s names with a key that is the family last name for the exercises in § 11.2.1 (p. 424) to use a multimap.

11.3.3. Erasing Elements

The associative containers define three versions of erase, which are described in Table 11.5. As with the sequential containers, we can erase one element or a range of elements by passing erase an iterator or an iterator pair. These versions of erase are similar to the corresponding operations on sequential containers: The indicated element(s) are removed and the function returns void.

Table 11.5. Removing Elements from an Associative Container

Code	Description
`c.erase(k)`	Removes every element with key `k` from `c`. Returns `size_type` indicating the number of elements removed.
`c.erase(p)`	Removes the element denoted by the iterator `p` from `c`. `p` must refer to an actual element in `c`; it must not be equal to `c.end()`. Returns an iterator to the element after `p` or `c.end()` if `p` denotes the last element in `c`.
`c.erase(b, e)`	Removes the elements in the range denoted by the iterator pair `b`, `e`. Returns `e`.

The associative containers supply an additional erase operation that takes a key_type argument. This version removes all the elements, if any, with the given key and returns a count of how many elements were removed. We can use this version to remove a specific word from word_count before printing the results:

c++

// erase on a key returns the number of elements removed
if (word_count.erase(removal_word))
     cout << "ok: " << removal_word << " removed\n";
else cout << "oops: " << removal_word << " not found!\n";

For the containers with unique keys, the return from erase is always either zero or one. If the return value is zero, then the element we wanted to erase was not in the container.

For types that allow multiple keys, the number of elements removed could be greater than one:

c++

auto cnt = authors.erase("Barth, John");

If authors is the multimap we created in § 11.3.2 (p. 434), then cnt will be 2.

11.3.4. Subscripting a `map`

Tricky

The map and unordered_map containers provide the subscript operator and a corresponding at function (§ 9.3.2, p. 348), which are described in Table 11.6 (overleaf). The set types do not support subscripting because there is no “value” associated with a key in a set. The elements are themselves keys, so the operation of “fetching the value associated with a key” is meaningless. We cannot subscript a multimap or an unordered_multimap because there may be more than one value associated with a given key.

Table 11.6. Subscript Operation for map and unordered_map

Code	Description
`c[k]`	Returns the element with key `k`; if `k` is not in `c`, adds a new, value-initialized element with key `k`.
`c.at(k)`	Checked access to the element with key `k`; throws an `out_of_range` exception (§ 5.6, p. 193) if `k` is not in `c`.

Like the other subscript operators we’ve used, the map subscript takes an index (that is, a key) and fetches the value associated with that key. However, unlike other subscript operators, if the key is not already present, a new element is created and inserted into the map for that key. The associated value is value initialized (§ 3.3.1, p. 98).

For example, when we write

c++

map <string, size_t> word_count; // empty map
// insert a value-initialized element with key Anna; then assign 1 to its value
word_count["Anna"] = 1;

the following steps take place:

word_count is searched for the element whose key is Anna. The element is not found.
A new key-value pair is inserted into word_count. The key is a const string holding Anna. The value is value initialized, meaning in this case that the value is 0.
The newly inserted element is fetched and is given the value 1.

Because the subscript operator might insert an element, we may use subscript only on a map that is not const.

INFO

Subscripting a map behaves quite differently from subscripting an array or vector: Using a key that is not already present adds an element with that key to the map.

Using the Value Returned from a Subscript Operation

Another way in which the map subscript differs from other subscript operators we’ve used is its return type. Ordinarily, the type returned by dereferencing an iterator and the type returned by the subscript operator are the same. Not so for maps: when we subscript a map, we get a mapped_type object; when we dereference a map iterator, we get a value_type object (§ 11.3, p. 428).

In common with other subscripts, the map subscript operator returns an lvalue (§ 4.1.1, p. 135). Because the return is an lvalue, we can read or write the element:

c++

cout << word_count["Anna"]; // fetch the element indexed by Anna; prints 1
++word_count["Anna"];       // fetch the element and add 1 to it
cout << word_count["Anna"]; // fetch the element and print it; prints 2

INFO

Unlike vector or string, the type returned by the map subscript operator differs from the type obtained by dereferencing a map iterator.

The fact that the subscript operator adds an element if it is not already in the map allows us to write surprisingly succinct programs such as the loop inside our word-counting program (§ 11.1, p. 421). On the other hand, sometimes we only want to know whether an element is present and do not want to add the element if it is not. In such cases, we must not use the subscript operator.

11.3.5. Accessing Elements

The associative containers provide various ways to find a given element, which are described in Table 11.7 (p. 438). Which operation to use depends on what problem we are trying to solve. If all we care about is whether a particular element is in the container, it is probably best to use find. For the containers that can hold only unique keys, it probably doesn’t matter whether we use find or count. However, for the containers with multiple keys, count has to do more work: If the element is present, it still has to count how many elements have the same key. If we don’t need the count, it’s best to use find:

Table 11.7. Operations to Find Elements in an Associative Container

INFO

lower_bound and upper_bound not valid for the unordered containers
Subscript and at operations only for map and unordered_map that are not const.

Code	Description
`c.find(k)`	Returns an iterator to the (first) element with key `k`, or the off-the-end iterator if `k` is not in the container.
`c.count(k)`	Returns the number of elements with key `k`. For the containers with unique keys, the result is always zero or one.
`c.lower_bound(k)`	Returns an iterator to the first element with key not less than `k`.
`c.upper_bound(k)`	Returns an iterator to the first element with key greater than `k`.
`c.equal_range(k)`	Returns a pair of iterators denoting the elements with key `k`. If `k` is not present, both members are `c.end()`.

INFO

Exercises Section 11.3.4

Exercise 11.24: What does the following program do?

c++

map<int, int> m;
m[0] = 1;

Exercise 11.25: Contrast the following program with the one in the previous exercise

c++

vector<int> v;
v[0] = 1;

Exercise 11.26: What type can be used to subscript a map? What type does the subscript operator return? Give a concrete example—that is, define a map and then write the types that can be used to subscript the map and the type that would be returned from the subscript operator.

c++

set<int> iset = {0,1,2,3,4,5,6,7,8,9};
iset.find(1);   // returns an iterator that refers to the element with key == 1
iset.find(11);  // returns the iterator == iset.end()
iset.count(1);  // returns 1
iset.count(11); // returns 0

Using `find` Instead of Subscript for `map`s

For the map and unordered_map types, the subscript operator provides the simplest method of retrieving a value. However, as we’ve just seen, using a subscript has an important side effect: If that key is not already in the map, then subscript inserts an element with that key. Whether this behavior is correct depends on our expectations. Our word-counting programs relied on the fact that using a nonexistent key as a subscript inserts an element with that key and value 0.

Sometimes, we want to know if an element with a given key is present without changing the map. We cannot use the subscript operator to determine whether an element is present, because the subscript operator inserts a new element if the key is not already there. In such cases, we should use find:

c++

if (word_count.find("foobar") == word_count.end())
    cout << "foobar is not in the map" << endl;

Finding Elements in a `multimap` or `multiset`

Finding an element in an associative container that requires unique keys is a simple matter—the element is or is not in the container. For the containers that allow multiple keys, the process is more complicated: There may be many elements with the given key. When a multimap or multiset has multiple elements of a given key, those elements will be adjacent within the container.

For example, given our map from author to titles, we might want to print all the books by a particular author. We can solve this problem in three different ways. The most obvious way uses find and count:

c++

string search_item("Alain de Botton"); // author we'll look for
auto entries = authors.count(search_item); // number of elements
auto iter = authors.find(search_item); // first entry for this author
// loop through the number of entries there are for this author
while(entries) {
    cout << iter->second << endl; // print each title
    ++iter;     // advance to the next title
    --entries;  // keep track of how many we've printed
}

We start by determining how many entries there are for the author by calling count and getting an iterator to the first element with this key by calling find. The number of iterations of the for loop depends on the number returned from count. In particular, if the count was zero, then the loop is never executed.

INFO

We are guaranteed that iterating across a multimap or multiset returns all the elements with a given key in sequence.

A Different, Iterator-Oriented Solution

Alternatively, we can solve our problem using lower_bound and upper_bound. Each of these operations take a key and returns an iterator. If the key is in the container, the iterator returned from lower_bound will refer to the first instance of that key and the iterator returned by upper_bound will refer just after the last instance of the key. If the element is not in the multimap, then lower_bound and upper_bound will return equal iterators; both will refer to the point at which the key can be inserted without disrupting the order. Thus, calling lower_bound and upper_bound on the same key yields an iterator range (§ 9.2.1, p. 331) that denotes all the elements with that key.

Of course, the iterator returned from these operations might be the off-the-end iterator for the container itself. If the element we’re looking for has the largest key in the container, then upper_bound on that key returns the off-the-end iterator. If the key is not present and is larger than any key in the container, then the return from lower_bound will also be the off-the-end iterator.

INFO

The iterator returned from lower_bound may or may not refer to an element with the given key. If the key is not in the container, then lower_bound refers to the first point at which this key can be inserted while preserving the element order within the container.

Using these operations, we can rewrite our program as follows:

c++

// definitions of authors and search_item as above
// beg and end denote the range of elements for this author
for (auto beg = authors.lower_bound(search_item),
          end = authors.upper_bound(search_item);
     beg != end; ++beg)
    cout << beg->second << endl; // print each title

This program does the same work as the previous one that used count and find but accomplishes its task more directly. The call to lower_bound positions beg so that it refers to the first element matching search_item if there is one. If there is no such element, then beg refers to the first element with a key larger than search_item, which could be the off-the-end iterator. The call to upper_bound sets end to refer to the element just beyond the last element with the given key. These operations say nothing about whether the key is present. The important point is that the return values act like an iterator range (§ 9.2.1, p. 331).

If there is no element for this key, then lower_bound and upper_bound will be equal. Both will refer to the point at which this key can be inserted while maintaining the container order.

Assuming there are elements with this key, beg will refer to the first such element. We can increment beg to traverse the elements with this key. The iterator in end will signal when we’ve seen all the elements. When beg equals end, we have seen every element with this key.

Because these iterators form a range, we can use a for loop to traverse that range. The loop is executed zero or more times and prints the entries, if any, for the given author. If there are no elements, then beg and end are equal and the loop is never executed. Otherwise, we know that the increment to beg will eventually reach end and that in the process we will print each record associated with this author.

INFO

If lower_bound and upper_bound return the same iterator, then the given key is not in the container.

The `equal_range` Function

The remaining way to solve this problem is the most direct of the three approaches: Instead of calling upper_bound and lower_bound, we can call equal_range.

This function takes a key and returns a pair of iterators. If the key is present, then the first iterator refers to the first instance of the key and the second iterator refers one past the last instance of the key. If no matching element is found, then both the first and second iterators refer to the position where this key can be inserted.

We can use equal_range to modify our program once again:

c++

// definitions of authors and search_item as above
// pos holds iterators that denote the range of elements for this key
for (auto pos = authors.equal_range(search_item);
     pos.first != pos.second; ++pos.first)
    cout << pos.first->second << endl; // print each title

This program is essentially identical to the previous one that used upper_bound and lower_bound. Instead of using local variables, beg and end, to hold the iterator range, we use the pair returned by equal_range. The first member of that pair holds the same iterator as lower_bound would have returned and second holds the iterator upper_bound would have returned. Thus, in this program pos.first is equivalent to beg, and pos.second is equivalent to end.

INFO

Exercises Section 11.3.5

Exercise 11.27: What kinds of problems would you use count to solve? When might you use find instead?

Exercise 11.28: Define and initialize a variable to hold the result of calling find on a map from string to vector of int.

Exercise 11.29: What do upper_bound, lower_bound, and equal_range return when you pass them a key that is not in the container?

Exercise 11.30: Explain the meaning of the operand pos.first->second used in the output expression of the final program in this section.

Exercise 11.31: Write a program that defines a multimap of authors and their works. Use find to find an element in the multimap and erase that element. Be sure your program works correctly if the element you look for is not in the map.

Exercise 11.32: Using the multimap from the previous exercise, write a program to print the list of authors and their works alphabetically.

11.3.6. A Word Transformation Map

We’ll close this section with a program to illustrate creating, searching, and iterating across a map. We’ll write a program that, given one string, transforms it into another. The input to our program is two files. The first file contains rules that we will use to transform the text in the second file. Each rule consists of a word that might be in the input file and a phrase to use in its place. The idea is that whenever the first word appears in the input, we will replace it with the corresponding phrase. The second file contains the text to transform.

If the contents of the word-transformation file are

brb be right back
k okay?
y why
r are
u you
pic picture
thk thanks!
l8r later

and the text we are given to transform is

where r u
y dont u send me a pic
k thk l8r

then the program should generate the following output:

where are you
why dont you send me a picture
okay? thanks! later

The Word Transformation Program

Our solution will use three functions. The word_transform function will manage the overall processing. It will take two ifstream arguments: The first will be bound to the word-transformation file and the second to the file of text we’re to transform. The buildMap function will read the file of transformation rules and create a map from each word to its transformation. The transform function will take a string and return the transformation if there is one.

We’ll start by defining the word_transform function. The important parts are the calls to buildMap and transform:

c++

void word_transform(ifstream &map_file, ifstream &input)
{
    auto trans_map = buildMap(map_file); // store the transformations
    string text;                    // hold each line from the input
    while (getline(input, text)) {  // read a line of input
        istringstream stream(text); // read each word
        string word;
        bool firstword = true;      // controls whether a space is printed
        while (stream >> word) {
           if (firstword)
               firstword = false;
           else
               cout << " ";  // print a space between words
           // transform returns its first argument or its transformation
           cout << transform(word, trans_map); // print the output
        }
        cout << endl;         // done with this line of input
    }
}

The function starts by calling buildMap to generate the word-transformation map. We store the result in trans_map. The rest of the function processes the input file. The while loop uses getline to read the input file a line at a time. We read by line so that our output will have line breaks at the same position as in the input file. To get the words from each line, we use a nested while loop that uses an istringstream (§ 8.3, p. 321) to process each word in the current line.

The inner while prints the output using the bool firstword to determine whether to print a space. The call to transform obtains the word to print. The value returned from transform is either the original string in word or its corresponding transformation from trans_map.

Building the Transformation Map

The buildMap function reads its given file and builds the transformation map.

c++

map<string, string> buildMap(ifstream &map_file)
{
    map<string, string> trans_map;  // holds the transformations
    string key;    // a word to transform
    string value;  // phrase to use instead
    // read the first word into key and the rest of the line into value
    while (map_file >> key && getline(map_file, value))
        if (value.size() > 1) // check that there is a transformation
            trans_map[key] = value.substr(1); // skip leading space
        else
            throw runtime_error("no rule for " + key);
    return trans_map;
}

Each line in map_file corresponds to a rule. Each rule is a word followed by a phrase, which might contain multiple words. We use >> to read the word that we will transform into key and call getline to read the rest of the line into value. Because getline does not skip leading spaces (§ 3.2.2, p. 87), we need to skip the space between the word and its corresponding rule. Before we store the transformation, we check that we got more than one character. If so, we call substr (§ 9.5.1, p. 361) to skip the space that separated the transformation phrase from its corresponding word and store that substring in trans_map,

Note that we use the subscript operator to add the key–value pairs. Implicitly, we are ignoring what should happen if a word appears more than once in our transformation file. If a word does appear multiple times, our loops will put the last corresponding phrase into trans_map. When the while concludes, trans_map contains the data that we need to transform the input.

Generating a Transformation

The transform function does the actual transformation. Its parameters are references to the string to transform and to the transformation map. If the given string is in the map, transform returns the corresponding transformation. If the given string is not in the map, transform returns its argument:

c++

const string &
transform(const string &s, const map<string, string> &m)
{
    // the actual map work; this part is the heart of the program
    auto map_it = m.find(s);
    // if this word is in the transformation map
    if (map_it != m.cend())
        return map_it->second; // use the replacement word
    else
        return s;              // otherwise return the original unchanged
}

We start by calling find to determine whether the given string is in the map. If it is, then find returns an iterator to the corresponding element. Otherwise, find returns the off-the-end iterator. If the element is found, we dereference the iterator, obtaining a pair that holds the key and value for that element (§ 11.3, p. 428). We return the second member, which is the transformation to use in place of s.

INFO

Exercises Section 11.3.6

Exercise 11.33: Implement your own version of the word-transformation program.

Exercise 11.34: What would happen if we used the subscript operator instead of find in the transform function?

Exercise 11.35: In buildMap, what effect, if any, would there be from rewriting

c++

    trans_map[key] = value.substr(1);

c++

    trans_map.insert({key, value.substr(1)});

Exercise 11.36: Our program does no checking on the validity of either input file. In particular, it assumes that the rules in the transformation file are all sensible. What would happen if a line in that file has a key, one space, and then the end of the line? Predict the behavior and then check it against your version of the program.