C plus plus

Anonymous
Asked: Apr 20th, 2017

Question Description

Note: these questions are the exercises and Not the review questions.

  • Chapter 5: 6, 10,

4. Chapter 6: 1, 2

5. Chapter 7: 2, 5,

From page 258

Programming Abstractions in C++
Eric S. Roberts
Stanford University
Spring Quarter 2012

Foreword

This text represents a major revision of the course reader that we’ve been using at Stanford for the last several years. The primary goal of the revision was to bring the approach more closely in line with the way C++ is used in industry, which will in turn make it easier to export Stanford’s approach to teaching data structures to a larger fraction of schools. Even though this quarter’s draft is reasonably complete, the text remains somewhat rough. In particular, these chapters have not yet had the benefit of the wonderful copyediting service that my wife Lauren Rusk has provided for all my books.

This textbook has had an interesting evolutionary history that in some ways mirrors the genesis of the C++ language itself. Just as Bjarne Stroustrup’s first version of C++ was implemented on top of a C language base, this reader began its life as my textbook Programming Abstractions in C (Addison-Wesley, 1998). In 2002-03, Julie Zelenski updated it for use with the C++ programming language, which we began using in CS106B and CS106X during that year. Although the revised text worked fairly well at the outset, CS106B and CS106X have evolved in recent years so that their structure no longer tracks the organization of the book. In 2009, I embarked on a comprehensive process of rewriting the book so that students in these courses can use it as both a tutorial and a reference. As always, that process takes a considerable amount of time, and there are almost certainly some sections of the book that need a substantial rewrite.

I want to thank my colleagues at Stanford over the last several years, starting with Julie Zelenski for her extensive work on the initial C++ revision. My colleagues Keith Schwarz, Jerry Cain, Stephen Cooper, and Mehran Sahami have all made important contributions to the revision.
I also need to express my thanks to several generations of section leaders and so many students over the years, all of whom have helped make it so exciting to teach this wonderful material. Finally, I want to thank the students in CS106B in winter quarter 2011-12 who put up with a partially finished reader and contributed many error reports and suggestions.

I’ve always believed that programming is one of the most challenging and exciting intellectual activities that humans have ever discovered. By staying close to the machine, C++ gives you all the freedom you need to build applications that take advantage of the full power of modern computing. I hope you all enjoy the ride.

Contents

1 Overview of C++
    1.1 Your first C++ program
    1.2 The history of C++
    1.3 The structure of a C++ program
    1.4 Variables
    1.5 Data types
    1.6 Expressions
    1.7 Statements
    1.8 The Stanford C++ libraries
    Summary, Review questions, Exercises

2 Functions and Libraries
    2.1 The idea of a function
    2.2 Libraries
    2.3 Defining functions in C++
    2.4 The mechanics of function calls
    2.5 Reference parameters
    2.6 Interfaces and implementations
    2.7 Principles of interface design
    2.8 Designing a random number library
    2.9 A brief tour of the graphics library
    Summary, Review questions, Exercises

3 Strings
    3.1 Using strings as abstract values
    3.2 String operations
    3.3 The library
    3.4 Modifying the contents of a string
    3.5 The legacy of C-style strings
    3.6 Writing string applications
    3.7 The strlib.h library
    Summary, Review questions, Exercises

4 Streams
    4.1 Using strings as abstract values
    4.2 Formatted input
    4.3 Data files
    4.4 Class hierarchies
    4.5 Exploring the Stanford libraries
    Summary, Review questions, Exercises

5 Collections
    5.1 The Vector class
    5.2 The Stack class
    5.3 The Queue class
    5.4 The Map class
    5.5 The Set class
    5.6 Iterating over a collection
    Summary, Review questions, Exercises

6 Designing Classes
    6.1 Representing points
    6.2 Operator overloading
    6.3 Rational numbers
    6.4 Designing a token scanner class
    6.5 Encapsulating programs as classes
    Summary, Review questions, Exercises

7 Introduction to Recursion
    7.1 A simple example of recursion
    7.2 The factorial function
    7.3 The Fibonacci function
    7.4 Checking palindromes
    7.5 The binary search algorithm
    7.6 Mutual recursion
    7.7 Thinking recursively
    Summary, Review questions, Exercises

8 Recursive Strategies
    8.1 The Towers of Hanoi
    8.2 The subset-sum problem
    8.3 Generating permutations
    8.4 Graphical recursion
    Summary, Review questions, Exercises

9 Backtracking Algorithms
    9.1 Recursive backtracking in a maze
    9.2 Backtracking and games
    9.3 The minimax algorithm
    Summary, Review questions, Exercises

10 Algorithmic Analysis
    10.1 The sorting problem
    10.2 Computational complexity
    10.3 Recursion to the rescue
    10.4 Standard complexity classes
    10.5 The Quicksort algorithm
    10.6 Mathematical induction
    Summary, Review questions, Exercises

11 Pointers and Arrays
    11.1 The structure of memory
    11.2 Pointers
    11.3 Arrays
    11.4 Using functions as data values
    Summary, Review questions, Exercises

12 Dynamic Memory Management
    12.1 Dynamic allocation and the heap
    12.2 Defining a CharStack class
    12.3 Heap-stack diagrams
    12.4 Copying objects
    Summary, Review questions, Exercises

13 Efficiency and Representation
    13.1 The concept of an editor buffer
    13.2 Defining the buffer abstraction
    13.3 An array-based implementation
    13.4 A stack-based implementation
    13.5 A list-based implementation
    Summary, Review questions, Exercises

14 Linear Structures
    14.1 Templates
    14.2 Defining stacks as a template class
    14.3 Implementing stacks as linked lists
    14.4 Implementing queues
    14.5 Implementing vectors
    Summary, Review questions, Exercises

15 Maps and Hashing
    15.1 Implementing maps using arrays
    15.2 Lookup tables
    15.3 Hashing
    Summary, Review questions, Exercises

16 Trees
    16.1 Family trees
    16.2 Binary search trees
    16.3 Balanced trees
    16.4 Implementing maps using BSTs
    Summary, Review questions, Exercises

17 Expression Trees
    17.1 Overview of the interpreter
    17.2 The structure of expressions
    17.3 Defining the expression hierarchy
    17.4 Parsing an expression
    Summary, Review questions, Exercises

18 Sets
    18.1 Sets as a mathematical abstraction
    18.2 Expanding the set interface
    18.3 Implementation strategies for sets
    18.4 Optimizing sets of small integers
    Summary, Review questions, Exercises

19 Graphs
    19.1 The structure of a graph
    19.2 Representation strategies
    19.3 A low-level graph abstraction
    19.4 Graph traversals
    19.5 Defining a Graph class
    19.6 Finding shortest paths
    19.7 Implementing priority queues
    Summary, Review questions, Exercises

A Library Interfaces
    A-1 cmpfn.h
    A-2 console.h
    A-3 direction.h
    A-4 error.h
    A-5 filelib.h
    A-6 foreach.h
    A-7 gevents.h
    A-8 graph.h
    A-9 graphics.h
    A-10 grid.h
    A-11 gtypes.h
    A-12 gwindow.h
    A-13 hashmap.h
    A-14 hashset.h
    A-15 lexicon.h
    A-16 map.h
    A-17 point.h
    A-18 pqueue.h
    A-19 queue.h
    A-20 random.h
    A-21 set.h
    A-22 simpio.h
    A-23 sound.h
    A-24 stack.h
    A-25 strlib.h
    A-26 thread.h
    A-27 tokenscanner.h
    A-28 vector.h

Index

Chapter 1
An Overview of C++

Out of these various experiments come programs.
This is our experience: programs do not come out of the minds of one person or two people such as ourselves, but out of day-to-day work.
(Stokely Carmichael and Charles V. Hamilton, Black Power, 1967)

In Lewis Carroll’s Alice’s Adventures in Wonderland, the King asks the White Rabbit to “begin at the beginning and go on till you come to the end: then stop.” Good advice, but only if you’re starting from the beginning. This book is designed for a second course in computer science and therefore assumes that you have already begun your study of programming. At the same time, because first courses vary considerably in what they cover, it is difficult to rely on any specific material. Some of you, for example, will already understand C++ control structures from prior experience with closely related languages such as C or Java. For others, however, the structure of C++ will seem unfamiliar. Because of this disparity in background, the best approach is to adopt the King’s advice. This chapter therefore “begins at the beginning” and introduces you to those parts of the C++ language you will need to write simple programs.

1.1 Your first C++ program

As you will learn in more detail in the following section, C++ is an extension of an extremely successful programming language called C, which appeared in the early 1970s. In the book that serves as C’s defining document, The C Programming Language, Brian Kernighan and Dennis Ritchie offer the following advice on the first page of Chapter 1.

    The only way to learn a new programming language is by writing programs in it. The first program to write is the same for all languages:

        Print the words
            hello, world

    This is the big hurdle; to leap over it you have to be able to create the program text somewhere, compile it successfully, load it, run it, and find out where the output went. With these mechanical details mastered, everything else is comparatively easy.
If you were to rewrite it in C++, the “Hello World” program would end up looking something like the code in Figure 1-1.

Figure 1-1. The “Hello World” program

    /*
     * File: HelloWorld.cpp
     * --------------------
     * This file is adapted from the example
     * on page 1 of Kernighan and Ritchie's
     * book The C Programming Language.
     */

    #include <iostream>
    using namespace std;

    int main() {
       cout << "hello, world" << endl;
       return 0;
    }

At this point, the important thing is not to understand exactly what all of the lines in this program mean. There is plenty of time to master those details later. Your mission, and you should decide to accept it, is to get the HelloWorld program running. Type in the program exactly as it appears in Figure 1-1 and then figure out what you need to do to make it work. The exact steps you need to use depend on the programming environment you’re using to create and run C++ programs. If you are using this textbook for a class, your instructor will presumably provide some reference material on the programming environments you are expected to use. If you are reading this book on your own, you’ll need to refer to the documentation that comes with whatever programming environment you’re using for C++.

When you get all of these details worked out, you should see the output from the HelloWorld program on a window somewhere on your computer screen. On the Apple Macintosh on which I prepared this book, that window looks like this:

    HelloWorld
    hello, world

On your computer, the window will probably have a somewhat different appearance and may include additional status messages along with your program’s cheery “hello, world” greeting. But the message will be there. And although it may not be true that “everything else is comparatively easy” as Kernighan and Ritchie suggest, you will have achieved a significant milestone.
1.2 The history of C++

In the early days of computing, programs were written in machine language, which consists of the primitive instructions that can be executed directly by the machine. Programs written in machine language are difficult to understand, mostly because the structure of machine language reflects the design of the hardware rather than the needs of programmers. Worse still, each type of computing hardware has its own machine language, which means that a program written for one machine will not run on other types of hardware.

In the mid-1950s, a group of programmers under the direction of John Backus at IBM had an idea that profoundly changed the nature of computing. Would it be possible, Backus and his colleagues wondered, to write programs that resembled the mathematical formulas they were trying to compute and have the computer itself translate those formulas into machine language? In 1955, this team produced the initial version of FORTRAN (whose name is a contraction of formula translation), which was the first example of a higher-level programming language.

Since that time, many new programming languages have been invented, most of which build on previous languages in an evolutionary way. C++ represents the joining of two branches in that evolution. One of its ancestors is a language called C, which was designed at Bell Laboratories by Dennis Ritchie in 1972 and then later revised and standardized by the American National Standards Institute (ANSI) in 1989. But C++ also descends from a family of languages designed to support a different style of programming that has dramatically changed the nature of software development in recent years.

The object-oriented paradigm

Over the last two decades, computer science and programming have gone through something of a revolution.
Like most revolutions, whether political upheavals or the conceptual restructurings that Thomas Kuhn describes in his 1962 book The Structure of Scientific Revolutions, this change has been driven by the emergence of an idea that challenges an existing orthodoxy. Initially, the two ideas compete, and, at least for a while, the old order maintains its dominance. Over time, however, the strength and popularity of the new idea grow, until it begins to displace the older idea in what Kuhn calls a paradigm shift. In programming, the old order is represented by the procedural paradigm, in which programs consist of a collection of procedures and functions that operate on data. The new model is called the object-oriented paradigm, in which programs are viewed instead as a collection of data objects that embody particular characteristics and behavior.

The idea of object-oriented programming is not really all that new. The first object-oriented language was SIMULA, a language for coding simulations designed by the Scandinavian computer scientists Ole-Johan Dahl and Kristen Nygaard in 1967. With a design that was far ahead of its time, SIMULA anticipated many of the concepts that later became commonplace in programming, including the concept of abstract data types and much of the modern object-oriented paradigm. In fact, much of the terminology used to describe object-oriented languages comes from the original 1967 report on SIMULA.

Unfortunately, SIMULA did not generate a great deal of interest in the years after its introduction. The first object-oriented language to gain any significant following within the computing profession was Smalltalk, which was developed at the Xerox Palo Alto Research Center in the late 1970s. The purpose of Smalltalk, which is described in the book Smalltalk-80: The Language and Its Implementation by Adele Goldberg and David Robson, was to make programming accessible to a wider audience.
As such, Smalltalk was part of a larger effort at Xerox PARC that gave rise to much of the modern user-interface technology that is now standard on personal computers.

Despite many attractive features and a highly interactive user environment that simplifies the programming process, Smalltalk never achieved much commercial success. The profession as a whole took an interest in object-oriented programming only when the central ideas were incorporated into variants of C, which had already become an industry standard. Although there were several parallel efforts to design an object-oriented language based on C, the most successful was the language C++, which was developed by Bjarne Stroustrup at AT&T Bell Laboratories in the early 1980s. C++ includes standard C as a subset, which makes it possible to integrate C++ code into existing C programs in a gradual, evolutionary way.

Although object-oriented languages have gained some of their popularity at the expense of procedural ones, it would be a mistake to regard the object-oriented and procedural paradigms as mutually exclusive. Programming paradigms are not so much competitive as they are complementary. The object-oriented and procedural paradigms, along with other important paradigms such as the functional programming style embodied in LISP, all have important applications in practice. Even within the context of a single application, you are likely to find a use for more than one approach. As a programmer, you must master many different paradigms, so that you can use the conceptual model that is most appropriate to the task at hand.

The compilation process

When you write a program in C++, your first step is to create a file that contains the text of the program, which is called a source file. Before you can run your program, you need to translate the source file into an executable form.
The first step in that process is to invoke a program called a compiler, which translates the source file into an object file containing the corresponding machine-language instructions. This object file is then combined with other object files to produce an executable file that can be run on the system. The other object files typically include predefined object files called libraries that contain the machine-language instructions for various operations commonly required by programs. The process of combining all the individual object files into an executable file is called linking. The steps in the compilation process are illustrated in Figure 1-2.

As noted in the discussion of the HelloWorld program earlier in this chapter, the specific details of the compilation process vary from one machine to another. There is no way that a general textbook like this can tell you exactly what commands you should use to run a program on your system. The good news is that the C++ programs themselves will look the same. One of the advantages of programming in a higher-level language like C++ is that doing so allows you to ignore the particular characteristics of the hardware and create programs that will run on many different machines.

1.3 The structure of a C++ program

The best way to get a feeling for the C++ programming language is to look at some sample programs, even before you understand the details of the language. The HelloWorld program is a start, but it is so simple that it doesn’t include many of the features you’d expect to see in a program.
Since this book is designed for a second course in computer science, you’ve almost certainly written programs that read input from the user, store values in variables, use loops to perform repeated calculations, and make use of subsidiary functions to simplify the structure of the program. The HelloWorld program does none of these things. To illustrate more of the features of C++, Figure 1-3 shows the code for a program that lists powers of two along with some annotations that describe the various parts of the program.

Figure 1-2. The compilation process (diagram: the compiler translates the source file into an object file, and the linker combines that object file with library files to produce the executable file)

Figure 1-3. The structure of a C++ program (the figure annotates the parts of the listing, in order, as: program comments, library inclusions, function prototype, main program, function comment, function definition)

    /*
     * File: PowersOfTwo.cpp
     * ---------------------
     * This program generates a list of the powers of
     * two up to an exponent limit entered by the user.
     */

    #include <iostream>
    using namespace std;

    /* Function prototypes */

    int raiseToPower(int n, int k);

    /* Main program */

    int main() {
       int limit;
       cout << "This program lists powers of two." << endl;
       cout << "Enter exponent limit: ";
       cin >> limit;
       for (int i = 0; i <= limit; i++) {
          cout << "2 to the " << i << " = "
               << raiseToPower(2, i) << endl;
       }
       return 0;
    }

    /*
     * Function: raiseToPower
     * Usage: int p = raiseToPower(n, k);
     * ----------------------------------
     * Returns the integer n raised to the kth power.
     */

    int raiseToPower(int n, int k) {
       int result = 1;
       for (int i = 0; i < k; i++) {
          result *= n;
       }
       return result;
    }

When you run the PowersOfTwo program shown in Figure 1-3, the computer begins by asking you for an exponent limit, which specifies how many powers of two the program should generate. If you type in 8, for example, the program will generate a list of the powers of two up to 2^8, as follows:

    PowersOfTwo
    This program lists powers of two.
    Enter exponent limit: 8
    2 to the 0 = 1
    2 to the 1 = 2
    2 to the 2 = 4
    2 to the 3 = 8
    2 to the 4 = 16
    2 to the 5 = 32
    2 to the 6 = 64
    2 to the 7 = 128
    2 to the 8 = 256

This screen image shows what happens if you execute the PowersOfTwo program. Such images are called sample runs. In each sample run, input from the user appears in blue so that it is easy to distinguish input data from the output generated by the program.

As the annotations in Figure 1-3 indicate, the PowersOfTwo program is divided into several components, which are discussed in the next few sections.

Comments

Much of the text in Figure 1-3 consists of English-language comments. A comment is text that is ignored by the compiler but which nonetheless conveys information to other programmers. A comment consists of text enclosed between the markers /* and */ and may continue over several lines. Alternatively, you can also specify single-line comments, which begin with the characters // and extend through the end of the line. This book uses the multiline /* ... */ comment form except when the comment marks some part of a program that is not yet complete.
Adopting that strategy makes it easier to find unfinished parts of a program.

Modern programming environments typically use colors to make it easier to distinguish the various parts of a program. On a color monitor, it is easy to display many colors, so that a programming environment might, for example, display keywords in one color, strings in another, and comments in yet some other shade. Because this text is typeset using only two colors, program listings must be more economical in their use of color. The only visual distinction that survives the restricted color palette is between comments, which appear in blue, and the code actually executed by the computer, which appears in black.

As you can see from Figure 1-3, the PowersOfTwo program includes a comment at the beginning that describes the program as a whole and one before the definition of raiseToPower that describes the operation of that function at a lower level of detail. In addition, the program includes a couple of one-line comments that act like section headings in English text.

Library inclusions

Modern programs are never written without using libraries, which are collections of previously written tools that perform useful operations. C++ defines a number of standard libraries, of which one of the most important is iostream. This library defines a set of simple input and output operations based on a stream, which is a data structure used to manage the flow of information to or from some data source, such as the console or a file. To gain access to the iostream library, your program must contain the line

    #include <iostream>

This line instructs the C++ compiler to read the relevant definitions from what is called a header file. The angle brackets in this line indicate that the header file is a system library that is part of standard C++.
Beginning in Chapter 2, you will also have occasion to use header files that you have written yourself or that come from other libraries. Those header files typically end with the suffix .h and are enclosed in quotation marks instead of angle brackets.

In C++, reading in a header file using #include is often not sufficient by itself to use a system library. To ensure that the names defined in different parts of a large system do not interfere with one another, the designers of C++ made it possible to segment code into structures called namespaces, each of which keeps track of its own set of names. The standard C++ libraries use a namespace called std, which means that you cannot refer to the names defined in standard header files like iostream unless you let the compiler know to which namespace those definitions belong.

Increasingly, professional C++ programmers specify the namespace explicitly by adding the prefix std:: before each name to which it applies. Using this approach, the first statement of the HelloWorld program becomes

    std::cout << "hello, world" << std::endl;

If you want to write code that looks like that of a professional, you can adopt this style. For someone just learning the language, all those std:: tags make programs harder to read, so this book instead adopts the convention of adding the line

    using namespace std;

at the end of the library inclusion section. At times, most importantly when you start to define your own library interfaces in Chapter 2, you will need to remember that the complete name of anything in the standard library namespace includes the std:: prefix. For the moment, however, it is probably easiest to think of the using namespace std; line as one of the incantations that the C++ compiler requires to work its magic on your code.

Function prototypes

Computation in a C++ program is carried out in the context of functions. A function is a named section of code that performs a specific operation.
The PowersOfTwo program contains two functions, main and raiseToPower, each of which is described in more detail in one of the sections that follow. Although the definitions of these functions appear toward the end of the file, the PowersOfTwo program provides a concise description of the raiseToPower function just after the library inclusions. This concise form is called a prototype and makes it possible to make calls to that function before its actual definition appears. A C++ prototype consists of the first line of the function definition followed by a semicolon, as illustrated by the prototype

    int raiseToPower(int n, int k);

This prototype tells the compiler everything it needs to know about how to call that function when it appears in the code. As you will see in the expanded discussion of functions in Chapter 2, the prototype for raiseToPower indicates that the function takes two integers as arguments and returns an integer as its result.

You must provide the declaration or definition of each function before making any calls to that function. C++ requires such prototype declarations so the compiler can check whether calls to functions are compatible with the function definitions. If you accidentally supply the wrong number of arguments or if the arguments are of the wrong type, the compiler reports an error, which makes it easier to find such problems in your code.

The main program

Every C++ program must contain a function with the name main. This function specifies the starting point for the computation and is called when the program starts up. When main has finished its work and returns, execution of the program ends.

The first line of the main function in the PowersOfTwo program is an example of a variable declaration, which reserves space for a value used by the program.
In this case, the line int limit; introduces a new variable named limit capable of holding a value of type int, the standard type used to represent integers. The syntax of variable declarations is discussed in more detail in the section on “Variables” later in this chapter. For now, all you need to know is that this declaration creates space for an integer variable that you can then use in the body of the main function. The next line of the main function is cout << "This program lists powers of two." << endl; This line has the same effect of the single statement in HelloWorld and sends a message to the user indicating what the program does. The identifier cout refers to the console output stream, which is one of the facilities defined by the iostream interface. The effect of this statement is to take the characters enclosed in quotation marks and send them to the cout stream, followed by the end-of-line sequence endl, which ensures that next output operation will begin on a new line. The next two lines are concerned with asking the user to provide a value for the variable limit. The line cout << "Enter exponent limit: "; also prints a message to the cout stream, just as the first line did. The purpose of this line is to let the user know what kind of input value is required. Such messages are generally known as prompts. When you print a prompt requesting input from the user, it is conventional to omit the endl value so that the prompt appears on the same line as the user input. When the computer executes this line of the program, it will display the prompt but leaves the console cursor—the blinking vertical bar or square that marks the current input position—at the end of the line, waiting for the user’s value, as follows: PowersOfTwo This program lists powers of two. 
    Enter exponent limit: |

The actual request for the input value is the line

    cin >> limit;

The identifier cin represents the console input stream, which is the counterpart to cout for reading input from the user. This statement indicates that the next value from the cin stream should be stored in the variable limit. Moreover, because limit is declared as an integer variable, the >> operator automatically converts the characters typed by the user into the appropriate integer. Thus, when the user types 8 and hits the RETURN key, the effect is to set limit to 8.

The next line of the main function is a for statement, which is used to repeat a block of code. Like all control statements in C++, the for statement is divided into a header line, which defines the nature of the control operation, and a body, which indicates which statements are affected by the control operation. In this case, the division looks like this:

    for (int i = 0; i <= limit; i++) {
        cout << "2 to the " << i << " = "
             << raiseToPower(2, i) << endl;
    }

Here the first line, up to the opening brace, is the header line, and the statement enclosed in the braces is the loop body.

The header line, which you will have a chance to understand in more detail in the section on “The for statement” later in this chapter, indicates that the statements in the body, whatever they are, should be repeated for each value of i beginning with 0 and continuing up to and including the value of limit. Each execution of the body of the loop prints a single line showing the value of $2^i$ for the current value of the variable i.

The single statement in the body of the loop is

    cout << "2 to the " << i << " = "
         << raiseToPower(2, i) << endl;

This statement illustrates a couple of points. First, statements can be split across more than one line; it is the semicolon that marks the end of the statement, and not simply the end of the line. Second, this statement showcases the ability of the cout stream to chain different output values together and to convert numeric quantities into a printable form.
The first part of the output is the character sequence

    2 to the

This output is then followed by the value of the variable i, an equal sign surrounded by spaces, the value of the function call

    raiseToPower(2, i)

and finally the end of line marker. The spaces in the strings ensure that the numeric values do not run together.

Before the program can print the output line, however, it must invoke the raiseToPower function to see what the value should be. Calling raiseToPower suspends the execution of the main function, which then waits until the desired value is returned.

As is typically the case with functions, raiseToPower needs some information from the main program in order to do its work. If you think about what raising a number to a power involves, you quickly realize that raiseToPower needs to know the base, which in this case is the constant 2, and the exponent, which is currently stored in the variable i. That variable, however, is declared within the body of the main function and is accessible only within the main program. If raiseToPower is to have access to the value of the base and exponent, the main program must pass them as arguments to the function by including them in parentheses after the function name. Doing so copies those values into the corresponding parameters, as described in the following section.

As was true in the HelloWorld program as well, the final statement in main is

    return 0;

This statement indicates that the value of the main function is 0. By convention, C++ uses the value of the main function to report the status of the entire program. A value of 0 indicates success; any other value is taken as an indication of failure.

Function definitions

Because large programs are difficult to understand in their entirety, most programs are broken down into several smaller functions, each of which is easier to understand.
In the PowersOfTwo program, the raiseToPower function is used to raise an integer to a power, an operation that is not built into C++ and must therefore be defined explicitly.

The first line of raiseToPower is the variable declaration

    int result = 1;

This declaration introduces a new variable named result capable of holding values of type int and initializes it to have the value 1.

The next statement in the function is a for loop, similar to the one you’ve already seen in the main program, that executes its body k times. The body of the for loop consists of the line

    result *= n;

which is C++ shorthand for the English sentence “Multiply result by n.” Because the function initializes the value of result to 1 and then multiplies result by n a total of k times, the variable result ends up with the value $n^k$.

The last statement in raiseToPower is

    return result;

which indicates that the function should return result as the value of the function.

1.4 Variables

Data values in a program are usually stored in variables, which are named locations in memory capable of holding a particular data type. You have already seen examples of variables in the PowersOfTwo program and are almost certainly familiar with the basic concept from your earlier programming experience. The purpose of this section is to outline the rules for using variables in C++.

Variable declarations

In C++, you must declare each variable before you use it. The primary function of declaring a variable is to make an association between the name of the variable and the type of value that variable contains. The placement of the declaration in a program also determines the scope of the variable, which is the region in which that variable is accessible.

The usual syntax for declaring a variable is

    type namelist;

where type indicates the data type and namelist is a list of variable names separated by commas. In most cases, each declaration introduces a single variable name.
For example, the function main in PowersOfTwo begins with the line

    int limit;

which declares the variable limit to be of type int. You can, however, declare several variable names at once, as in the following declaration, which declares three variables named n1, n2, and n3:

    double n1, n2, n3;

In this case, the variables are each declared to be of type double, which is the type C++ uses to represent numbers that can have fractional parts. The name double is short for double-precision floating-point, but there is no reason to worry about what all those terms mean. This declaration appears in the AddThreeNumbers program in Figure 1-4, which reads in three numbers and writes out their sum.

FIGURE 1-4 Program to add three numbers

    /*
     * File: AddThreeNumbers.cpp
     * -------------------------
     * This program adds three floating-point numbers and prints their sum.
     */

    #include <iostream>
    using namespace std;

    int main() {
       double n1, n2, n3;
       cout << "This program adds three numbers." << endl;
       cout << "1st number: ";
       cin >> n1;
       cout << "2nd number: ";
       cin >> n2;
       cout << "3rd number: ";
       cin >> n3;
       double sum = n1 + n2 + n3;
       cout << "The sum is " << sum << endl;
       return 0;
    }

It is important to remember that both the name and the type of a variable remain fixed throughout its lifetime but that the value of that variable will typically change as the program runs. To emphasize the dynamic nature of the value of a variable, it often helps to diagram variables as boxes in which the name appears outside as a label on the box and the value appears on the inside. For example, you might diagram the declaration of limit like this:

    limit
    +---------+
    |         |
    +---------+

Assigning a value to limit overwrites any previous contents of the box, but does not change the name or the type.

In C++, the initial contents of a variable are undefined. If you want a variable to have a particular value, you need to initialize it explicitly. To do so, all you need to do is include an equal sign and a value after a variable name. Thus, the declaration

    int result = 1;

is a shorthand for the following code, in which the declaration and assignment are separate:

    int result;
    result = 1;

An initial value specified as part of a declaration is called an initializer.

Naming conventions

The names used for variables, functions, types, constants, and so forth are collectively known as identifiers. In C++, the rules for identifier formation are

1. The name must start with a letter or the underscore character (_).
2. All other characters in the name must be letters, digits, or the underscore. No spaces or other special characters are permitted in names.
3. The name must not be one of the reserved keywords listed in Table 1-1.

Uppercase and lowercase letters appearing in an identifier are considered to be different. Thus, the name ABC is not the same as the name abc. Identifiers can be of any length, but C++ compilers are not required to consider any more than the first 31 characters in determining whether two names are identical.

You can improve your programming style by adopting conventions for identifiers that help readers identify their function. In this text, names of variables and functions begin with a lowercase letter, such as limit or raiseToPower. The names of classes and other programmer-defined data types begin with an uppercase letter, as in Direction or TokenScanner. Constant values are written entirely in uppercase, as in PI or HALF_DOLLAR.
Whenever an identifier consists of several English words run together, the usual convention is to capitalize the first letter of each word to make the name easier to read. Because that strategy doesn’t work for constants, programmers use the underscore character to mark the word boundaries.

TABLE 1-1 Reserved words in C++

    asm               auto              bool              break             case
    catch             char              class             const             const_cast
    continue          default           delete            do                double
    dynamic_cast      else              enum              explicit          extern
    false             float             for               friend            goto
    if                inline            int               long              mutable
    namespace         new               operator          private           protected
    public            register          reinterpret_cast  return            short
    signed            sizeof            static            static_cast       struct
    switch            template          this              throw             true
    try               typedef           typeid            typename          union
    unsigned          using             virtual           void              volatile
    wchar_t           while

Local and global variables

Most variables are declared within the body of a function. Such variables are called local variables. The scope of a local variable extends to the end of the block in which it is declared. When the function is called, space for each local variable is allocated for the duration of that function call. When the function returns, all its local variables disappear.

If a variable declaration appears outside any function definition, that declaration introduces a global variable. The scope of a global variable is the remainder of the file in which it is declared, and it exists throughout the execution of a program. Global variables are therefore able to store values that persist across function calls. While that property may seem useful, the disadvantages of global variables easily outweigh their advantages. For one thing, global variables can be manipulated by any function in a program, and it is difficult to keep those functions from interfering with one another. Because of the problems they so often cause, global variables are not used in this text except to declare constants, as discussed in the following section.
Although global variables may sometimes seem tempting, you will find it easier to manage the complexity of your programs if you avoid them as well.

Constants

As you write your programs, you will find that you often use the same constant many times in a program. If, for example, you are performing geometrical calculations that involve circles, the constant π comes up frequently. Moreover, if those calculations require high precision, you might actually need all the digits that fit into a value of type double, which means you would be working with the value 3.14159265358979323846. Writing that constant over and over again is tedious at best, and likely to introduce errors if you type it in by hand each time instead of cutting and pasting the value. It would be better if you could give this constant a name and then refer to it by that name everywhere in the program. You could, of course, simply declare it as a local variable by writing

    double pi = 3.14159265358979323846;

but you would then be able to use it only within the function in which it was defined. A better strategy is to declare it as a global constant like this:

    const double PI = 3.14159265358979323846;

The keyword const at the beginning of this declaration indicates that the value will not change after the variable is initialized, thereby ensuring that the value remains constant. It would not be appropriate, after all, to change the value of π (despite the fact that a bill was introduced in 1897 into the Indiana State Legislature attempting to do just that). The rest of the declaration consists of the type, the name, and the value, as before. The only difference is that the name is written entirely in uppercase to be consistent with the C++ naming conventions for constants.

Using named constants offers several advantages. First, descriptive constant names make the program easier to read.
More importantly, using constants can dramatically simplify the problem of maintaining the code for a program as it evolves. Even if the value of PI is unlikely to change, some “constants” in a program specify values that might change as the program evolves, even though they will be constant for a particular version of that program. The importance of this principle is easiest to illustrate by historical example. Imagine for the moment that you are a programmer in the late 1960s working on the initial design of the ARPANET, the first large-scale computer network and the ancestor of today’s Internet. Because resource constraints were quite serious at that time, you would need to impose a limit—as the actual designers of the ARPANET did in 1969—on the number of host computers that could be connected. In the early years of the ARPANET, that limit was 127 hosts. If C++ had existed in those days, you might have declared a constant that looked like this: const int MAXIMUM_NUMBER_OF_HOSTS = 127; At some later point, however, the explosive growth of networking would force you to raise this bound. That process would be relatively easy if you used named constants in your programs. To raise the limit on the number of hosts to 1023, it might well be sufficient to change this declaration so that it read const int MAXIMUM_NUMBER_OF_HOSTS = 1023; If you used MAXIMUM_NUMBER_OF_HOSTS everywhere in your program to refer to that maximum value, making this change would automatically propagate to every part of the program in which the name was used. Note that the situation would be entirely different if you had used the numeric constant 127 instead. In that case, you would need to search through the entire program and change all instances of 127 used for this purpose to the larger value. Some instances of 127 might well refer to other things than the limit on the number of hosts, and it would be just as important not to change any of those values. 
In the likely event that you made a mistake, you would have a very hard time tracking down the bug.

1.5 Data types

Each variable in a C++ program contains a value constrained to be of a particular type. You set the type of the variable as part of the declaration. So far, you have seen variables of type int and double, but these types merely scratch the surface of the types available in C++. Programs today work with many different data types, some of which are built into the language and some of which are defined as part of a particular application. Learning how to manipulate data of various types is an essential part of mastering the basics of any language, including C++.

The concept of a data type

In C++, every data value has an associated data type. From a formal perspective, a data type is defined by two properties: a domain, which is the set of values that belong to that type, and a set of operations, which defines the behavior of that type. For example, the domain of the type int includes all integers

    . . . −5, −4, −3, −2, −1, 0, 1, 2, 3, 4, 5 . . .

and so on, up to the limits established by the hardware of the machine. The set of operations applicable to values of type int includes, for example, the standard arithmetic operations like addition and multiplication. Other types have a different domain and set of operations.

As you will learn in the later chapters in this book, much of the power of modern programming languages like C++ comes from the fact that you can define new data types from existing ones. To get that process started, C++ includes several fundamental types that are defined as part of the language. These types, which act as the building blocks for the type system as a whole, are called atomic or primitive types. These predefined types are grouped into five categories (integer, floating-point, Boolean, character, and enumerated types), which are discussed in the sections that follow.
Integer types

Although the concept of an integer seems like a simple one, C++ actually includes several different data types for representing integer values. In most cases, all you need to know is the type int, which corresponds to the standard representation of an integer on the computer system you are using. In certain cases, however, you need to be more careful. Like all data, values of type int are stored internally in storage units that have a limited capacity. Those values therefore have a maximum size, which limits the range of integers you can use. To get around this problem, C++ defines three integer types (short, int, and long) distinguished from each other by the size of their domains.

Unfortunately, the language definition for C++ does not specify an exact range for these three types. As a result, the range for the different integer types depends on the machine and the compiler you’re using. In the early years of computing, the maximum value of type int was typically 32,767, which is very small by today’s standards. If you had wanted, for example, to perform a calculation involving the number of seconds in a year, you could not use type int on those machines, because that value (31,536,000) is considerably larger than 32,767. Modern machines tend to support larger integers, but the only properties you can count on are the following:

• The internal size of an integer cannot decrease as you move from short to int to long. A compiler designer for C++ could, for example, decide to make short and int the same size but could not make int smaller than short.
• The maximum value of type int must be at least 32,767 ($2^{15} - 1$).
• The maximum value of type long must be at least 2,147,483,647 ($2^{31} - 1$).

The designers of C++ could have chosen to define the allowable range of type int more precisely. For example, they could have declared, as the designers of Java did, that the maximum value of type int would be $2^{31} - 1$ on every machine.
Had they done so, it would be easier to move a program from one system to another and have it behave in the same way. The ability to move a program between different machines is called portability, which is an important consideration in the design of a programming language.

In C++, each of the integer types int, long, and short may be preceded by the keyword unsigned. Adding unsigned creates a new data type in which no negative values are allowed. Unsigned values also offer twice the range of positive values when compared to their signed counterparts. On modern machines, for example, the maximum value of type int is typically 2,147,483,647, but the maximum value of type unsigned int is 4,294,967,295. C++ allows the type unsigned int to be abbreviated to unsigned, and most programmers who use this type tend to follow this practice.

An integer constant is ordinarily written as a string of decimal digits. If the number begins with the digit 0, however, the compiler interprets the value as an octal (base 8) integer. Thus, the constant 040 is taken to be in octal and represents the decimal number 32. If you prefix a numeric constant with the characters 0x, the compiler interprets that number as hexadecimal (base 16). Thus, the constant 0xFF is equivalent to the decimal constant 255. You can explicitly indicate that an integer constant is of type long by adding the letter L at the end of the digit string. Thus, the constant 0L is equal to 0, but the value is explicitly of type long. Similarly, if you use the letter U as a suffix, the constant is taken to be unsigned.

Floating-point types

Numbers that include a decimal fraction are called floating-point numbers, which are used to approximate real numbers in mathematics. C++ defines three different floating-point types: float, double, and long double.
Although ANSI C++ does not specify the exact representation of these types, the way to think about the difference is that the longer types (where long double is longer than double, which is in turn longer than float) allow numbers to be represented with greater precision at the cost of occupying more memory space. Unless you are doing exacting scientific calculation, however, the differences between these types will not make a great deal of difference. In keeping with a common convention among C++ programmers, this text uses the type double as its standard floating-point type.

Floating-point constants in C++ are written with a decimal point. Thus, if 2.0 appears in a program, the number is represented internally as a floating-point value; if the programmer had written 2, this value would be an integer. Floating-point values can also be written in a special programmer’s style of scientific notation, in which the value is represented as a floating-point number multiplied by an integral power of 10. To write a number using this style, you write a floating-point number in standard notation, followed immediately by the letter E and an integer exponent, optionally preceded by a + or - sign. For example, the speed of light in meters per second can be written in C++ as

    2.9979E+8

where the E stands for the words times 10 to the power.

Boolean type

In the programs you write, it is often necessary to test a particular condition that affects the subsequent behavior of your code. Typically, that condition is specified using an expression whose value is either true or false. This data type, for which the only legal values are the constants true and false, is called Boolean data, after the mathematician George Boole, who developed an algebraic approach for working with such values.

In C++, the Boolean type is called bool. You can declare variables of type bool and manipulate them in the same way as other data objects.
The operations that apply to the type bool are described in detail in the section entitled “Boolean operators” on page 34.

Characters

In the early days, computers were designed to work only with numeric data and were sometimes called number crunchers as a result. Modern computers, however, work less with numeric data than they do with text data, that is, any information composed of individual characters that appear on the keyboard and the screen. The ability of modern computers to process text data has led to the development of word processing systems, online reference libraries, electronic mail, social networks, and a seemingly infinite supply of exciting applications.

The most primitive elements of text data are individual characters, which are represented in C++ using the predefined data type char. The domain of type char is the set of symbols that can be displayed on the screen or typed on the keyboard: the letters, digits, punctuation marks, spacebar, RETURN key, and so forth. Internally, these values are represented inside the computer by assigning each character a numeric code. In most implementations of C++, the coding system used to represent characters is called ASCII, which stands for the American Standard Code for Information Interchange. The decimal values of the characters in the ASCII set are shown in Table 1-2, where the ASCII code for any character is the sum of the numbers at the beginning of its row and column.

TABLE 1-2 ASCII character codes

           0     1     2     3     4     5     6     7     8     9
      0x   \000  \001  \002  \003  \004  \005  \006  \a    \b    \t
      1x   \n    \v    \f    \r    \016  \017  \020  \021  \022  \023
      2x   \024  \025  \026  \027  \030  \031  \032  \033  \034  \035
      3x   \036  \037  space !     "     #     $     %     &     '
      4x   (     )     *     +     ,     -     .     /     0     1
      5x   2     3     4     5     6     7     8     9     :     ;
      6x   <     =     >     ?     @     A     B     C     D     E
      7x   F     G     H     I     J     K     L     M     N     O
      8x   P     Q     R     S     T     U     V     W     X     Y
      9x   Z     [     \     ]     ^     _     `     a     b     c
     10x   d     e     f     g     h     i     j     k     l     m
     11x   n     o     p     q     r     s     t     u     v     w
     12x   x     y     z     {     |     }     ~     \177

Although it is important to know that characters are represented internally using a numeric code, it is not generally useful to know what numeric value corresponds to a particular character. When you type the letter A, the hardware logic built into the keyboard automatically translates that character into the ASCII code 65, which is then sent to the computer. Similarly, when the computer sends the ASCII code 65 to the screen, the letter A appears.

You can write a character constant in C++ by enclosing the character in single quotes. Thus, the constant 'A' represents the internal code of the uppercase letter A. In addition to the standard characters, C++ allows you to write special characters in a multicharacter form beginning with a backward slash (\). This form is called an escape sequence. Table 1-3 shows the escape sequences that C++ supports.

Strings

Characters are most useful when they are collected together into sequential units. In programming, a sequence of characters is called a string. So far, the strings you’ve seen in the HelloWorld and PowersOfTwo programs have been used simply to display messages on the screen, but they have many more applications than that.

You write string constants in C++ by enclosing the characters contained within the string in double quotes. As with characters, C++ uses the escape sequences from Table 1-3 to represent special characters. If two or more string constants appear consecutively in a program, the compiler concatenates them together. The most important implication of this rule is that you can break a long string over several lines so that it doesn’t end up running past the right margin of your program.
TABLE 1-3 Escape sequences

    \a      Audible alert (beeps or rings a bell)
    \b      Backspace
    \f      Formfeed (starts a new page)
    \n      Newline (moves to the beginning of the next line)
    \r      Return (returns to the beginning of the current line without advancing)
    \t      Tab (moves horizontally to the next tab stop)
    \v      Vertical tab (moves vertically to the next tab stop)
    \0      Null character (the character whose ASCII code is 0)
    \\      The character \ itself
    \'      The character ' (requires the backslash only in character constants)
    \"      The character " (requires the backslash only in string constants)
    \ddd    The character whose ASCII code is the octal number ddd

Given that they are essential to so many applications, all modern programming languages include special features for working with strings. Unfortunately, C++ complicates the issue by defining two different string types: an older style inherited from C and a more sophisticated string library that supports the object-oriented paradigm. To minimize confusion, this text uses the string library wherever possible, and you should, for the most part, feel free to ignore the fact that two string models exist. The times when that complexity raises its ugly head are outlined in Chapter 3, which covers the string library in more detail. For the moment, you can simply imagine that C++ offers a built-in data type called string whose domain is the set of all sequences of characters. You can declare variables of type string and pass string values back and forth between functions as arguments and results.

The fact that string is a library type and not a built-in feature does have a few implications.
If you use the type name string in a program, you need to add the string library to the list of #include lines, like this:

    #include <string>

Moreover, because the string type is part of the standard library namespace, the compiler will recognize the type name only if you have included the line

    using namespace std;

at the beginning of the file, as the programs in this book invariably do.

Enumerated types

As the discussion of ASCII codes in the preceding section makes clear, computers store character data as integers by assigning a number to each character. This idea of encoding data as integers by numbering the elements of the domain is actually a much more general principle. C++ allows you to define new types simply by listing the elements in their domain. Such types are called enumerated types.

The syntax for defining an enumerated type is

    enum typename { namelist };

where typename is the name of the new type and namelist is a list of the constants in the domain, separated by commas. In this book, all type names start with an uppercase letter, and the names of the enumeration constants are written entirely in uppercase. For example, the following definition introduces a new Direction type whose values are the four compass directions:

    enum Direction { NORTH, EAST, SOUTH, WEST };

When the C++ compiler encounters this definition, it assigns values to the constant names by numbering them consecutively starting with 0. Thus, NORTH is assigned the value 0, EAST is assigned the value 1, SOUTH is assigned the value 2, and WEST is assigned the value 3.

C++ allows you to assign explicit underlying values to each of the constants of an enumerated type. For example, the type declaration

    enum Coin {
       PENNY = 1,
       NICKEL = 5,
       DIME = 10,
       QUARTER = 25,
       HALF_DOLLAR = 50,
       DOLLAR = 100
    };

introduces an enumerated type for U.S. coinage in which each constant is defined to equal the monetary value of that coin.
If you supply values for some of the constants but not others, the C++ compiler will automatically choose values for the unassigned constants by numbering them consecutively after the last value you supplied. Thus, the type declaration

    enum Month {
       JANUARY = 1,
       FEBRUARY,
       MARCH,
       APRIL,
       MAY,
       JUNE,
       JULY,
       AUGUST,
       SEPTEMBER,
       OCTOBER,
       NOVEMBER,
       DECEMBER
    };

introduces a type for the months of the year in which JANUARY has the value 1, FEBRUARY has the value 2, and so forth up to DECEMBER, which has the value 12.

Compound types

The atomic types described in the preceding sections form the basis of a very rich type system that allows you to create new types from existing ones. Moreover, because C++ represents a synthesis of the object-oriented and procedural paradigms, the type system includes both objects and more traditional structures. Learning how to define and manipulate these types is, to a large extent, the theme of this entire book. It therefore does not make sense to squeeze a complete description of these types into Chapter 1. That’s what the rest of the chapters are for.

Over the years of teaching this material at Stanford, we have discovered that you are much more likely to master the concepts of object-oriented programming if the details of defining classes and objects are presented after you have had a chance to use them in a high-level way. This book adopts that strategy and postpones any discussion of how to create your own objects until Chapter 6, at which point you will have had plenty of time to discover just how useful objects can be.

1.6 Expressions

Whenever you want a program to perform calculations, you need to write an expression that specifies the necessary operations in a form similar to that used for expressions in mathematics.
For example, suppose that you wanted to solve the quadratic equation

    $ax^2 + bx + c = 0$

As you know from high-school mathematics, this equation has two solutions given by the formula

    $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$

The first solution is obtained by using + in place of the ± symbol; the second is obtained by using – instead. In C++, you could compute the first of these solutions by writing the following expression:

    (-b + sqrt(b * b - 4 * a * c)) / (2 * a)

There are a few differences in form: multiplication is represented explicitly by a *, division is represented by a /, and the square root function (which comes from a library called <cmath> described in detail in Chapter 2) is written using the name sqrt rather than a mathematical symbol. Even so, the C++ form of the expression captures the intent of its mathematical counterpart in a way that is quite readable, particularly if you’ve written programs in any modern programming language.

In C++, an expression is composed of terms and operators. A term, such as the variables a, b, and c or the constants 2 and 4 in the preceding expression, represents a single data value and must be either a constant, a variable, or a function call. An operator is a character (or sometimes a short sequence of characters) that indicates a computational operation. A list of the operators available in C++ appears in Table 1-4. The table includes familiar arithmetic operators like + and - along with several others that pertain only to types introduced in later chapters.

Precedence and associativity

The point of listing all the operators in a single table is to establish how they relate to one another in terms of precedence, which is a measure of how tightly an operator binds to its operands in the absence of parentheses. If two operators compete for the same operand, the one that appears higher in the precedence table is applied first.
Thus, in the expression

   (-b + sqrt(b * b - 4 * a * c)) / (2 * a)

the multiplications b * b and 4 * a * c are performed before the subtraction because * has a higher precedence than -. It is, however, important to note that the - operator occurs in two forms. Operators that connect two operands are called binary operators; operators that take just one operand are called unary operators. When a minus sign is written in front of a single operand, as in -b, it is interpreted as a unary operator signifying negation. When it appears between two operands, as it does inside the argument to sqrt, the minus sign is a binary operator signifying subtraction. The precedences of the unary and binary versions of an operator are different and are listed separately in the precedence table.

TABLE 1-4 Operators available in C++

   Operators organized into precedence groups          Associativity
   () [] -> .                                          left
   unary operators: - ++ -- ! & * ~ (type) sizeof      right
   * / %                                               left
   + -                                                 left
   << >>                                               left
   < <= > >=                                           left
   == !=                                               left
   &                                                   left
   ^                                                   left
   |                                                   left
   &&                                                  left
   ||                                                  left
   ?:                                                  right
   = op=                                               right

If two operators have the same precedence, they are applied in the order specified by their associativity, which indicates whether that operator groups to the left or to the right. Most operators in C++ are left-associative, which means that the leftmost operator is evaluated first. A few operators, most notably the assignment operator discussed in its own section later in this chapter, are right-associative, which means that they group from right to left. The associativity for each operator appears in Table 1-4.

The quadratic formula illustrates the importance of paying attention to precedence and associativity rules. Consider what would happen if you wrote the expression without the parentheses around 2 * a, as follows:

   (-b + sqrt(b * b - 4 * a * c)) / 2 * a

Without the parentheses, the division operator would be performed first because / and * have the same precedence and associate to the left. The numerator would therefore be divided by 2 and the quotient then multiplied by a, which is not what the formula specifies.
This example illustrates the use of the bug icon to mark code that is intentionally incorrect to make sure that you don’t copy it into your own programs.

Mixing types in an expression

In C++, you can write an expression that includes values of different numeric types. If C++ encounters an operator whose operands are of different types, the compiler automatically converts the operands to a common type by choosing the type that appears closest to the top of the hierarchy in Table 1-5. The type of the result is always that of the operands after any conversions are applied. This convention ensures that the result of the computation is as precise as possible.

TABLE 1-5 Type conversion hierarchy for numeric types

   long double        (most precise)
   double
   float
   unsigned long
   long
   unsigned int
   int
   unsigned short
   short
   char               (least precise)

As an example, suppose that n is declared as an int, and x is declared as a double. The expression

   n + 1

is evaluated using integer arithmetic and produces a result of type int. The expression

   x + 1

however, is evaluated by converting the integer 1 to the floating-point value 1.0 and adding the results together using double-precision floating-point arithmetic, which results in a value of type double.

Integer division and the remainder operator

The fact that applying an operator to two integer operands generates an integer result leads to an interesting situation with respect to the division operator. If you write an expression like

   9 / 4

C++’s rules specify that the result of this operation must be an integer, because both operands are of type int. When C++ evaluates this expression, it divides 9 by 4 and discards any remainder. Thus, the value of this expression in C++ is 2, not 2.25. If you want to compute the mathematically correct result of 9 divided by 4, at least one of the operands must be a floating-point number.
For example, the three expressions

   9.0 / 4
   9 / 4.0
   9.0 / 4.0

each produce the floating-point value 2.25. The decimal fraction is thrown away only if both operands are of type int. The operation of discarding a decimal fraction is called truncation.

The / operator in C++ is closely associated with the % operator, which returns the remainder left over when the first operand is divided by the second. For example, the value of

   9 % 4

is 1, since 4 goes into 9 twice, with 1 left over. The following are some other examples of the % operator:

   0 % 4 = 0        19 % 4 = 3
   1 % 4 = 1        20 % 4 = 0
   4 % 4 = 0      2001 % 4 = 1

The / and % operators turn out to be extremely useful in a wide variety of programming applications. You can, for example, use the % operator to test whether one number is divisible by another; to determine whether an integer n is divisible by 3, you just check whether the result of the expression n % 3 is 0.

It is, however, important to use caution if either or both of the operands to / and % might be negative, because the results have historically differed from machine to machine. On most machines, division truncates its result toward 0, but this behavior was not guaranteed by the original ANSI standard (the C++11 standard and its successors do require truncation toward 0). In general, it is good programming practice to avoid, as this book does, using these operators with negative values.

Type casts

In C++, you can request an explicit conversion from one type to another by using what is called a type cast. Type casts are usually written by specifying the name of the desired type followed by the value you wish to convert in parentheses.
For example, if num and den are declared as integers, you can compute the floating-point quotient by writing

   quotient = double(num) / den;

The first step in evaluating the expression is to convert num to a double, after which the division is performed using floating-point arithmetic as described in the section on “Mixing types in an expression” earlier in this chapter.

As long as the conversion moves upward in the hierarchy shown in Table 1-5, the conversion involves no loss of information. If, however, you convert a value of a more precise type to a less precise one, some information may be lost. For example, if you use a type cast to convert a value of type double to type int, any decimal fraction is simply dropped. Thus, the value of the expression

   int(1.9999)

is the integer 1.

The assignment operator

In C++, assignment of values to variables is built into the expression structure. The = operator takes two operands, just like + or *. The left operand must indicate a value that can change, which is typically a variable name. When the assignment operator is executed, the expression on the right-hand side is evaluated, and the resulting value is then stored in the variable that appears on the left-hand side. Thus, if you evaluate an expression like

   result = 1

the effect is that the value 1 is assigned to the variable result. In most cases, assignment expressions of this sort appear in the context of simple statements, which are formed by adding a semicolon after the expression, as in the line

   result = 1;

Such statements are often called assignment statements.

The assignment operator converts the type of the value on the right-hand side so that it matches the declared type of the variable. Thus, if the variable total is declared to be of type double, and you write the assignment statement

   total = 0;

the integer 0 is converted into a double as part of making the assignment.
If n is declared to be of type int, the assignment

   n = 3.14159265;

has the effect of setting n to 3, because the value is truncated to fit in the integer variable.

Even though assignment operators usually occur in the context of simple statements, they can also be incorporated into larger expressions, in which case the result of applying the assignment operator is simply the value assigned. For example, the expression

   z = (x = 6) + (y = 7)

has the effect of setting x to 6, y to 7, and z to 13. The parentheses are required in this example because the = operator has a lower precedence than +. Assignments that are written as part of larger expressions are called embedded assignments.

Although there are contexts in which embedded assignments are convenient, they often make programs more difficult to read because the assignment is easily overlooked in the middle of a complex expression. For this reason, this text limits the use of embedded assignments to a few special circumstances in which they seem to make the most sense. Of these, the most important is when you want to set several variables to the same value. C++’s definition of assignment as an operator makes it possible, instead of writing separate assignment statements, to write a single statement like

   n1 = n2 = n3 = 0;

which has the effect of setting all three variables to 0. This statement works because C++ evaluates assignment operators from right to left. The entire statement is therefore equivalent to

   n1 = (n2 = (n3 = 0));

The expression n3 = 0 is evaluated, which sets n3 to 0 and then passes 0 along as the value of the assignment expression. That value is assigned to n2, and the result is then assigned to n1. Statements of this sort are called multiple assignments.

As a programming convenience, C++ allows you to combine assignment with a binary operator to produce a form called a shorthand assignment.
For any binary operator op, the statement

   variable op= expression;

is equivalent to

   variable = variable op (expression);

where the parentheses are included to emphasize that the entire expression is evaluated before op is applied. Thus, the statement

   balance += deposit;

is a shorthand for

   balance = balance + deposit;

which adds deposit to balance. Because this same shorthand applies to any binary operator in C++, you can subtract the value of surcharge from balance by writing

   balance -= surcharge;

Similarly, you can divide the value of x by 10 using

   x /= 10;

Increment and decrement operators

Beyond the shorthand assignment operators, C++ offers a further level of abbreviation for the particularly common programming operations of adding or subtracting 1 from a variable. Adding 1 to a variable is called incrementing it; subtracting 1 is called decrementing it. To indicate these operations in an extremely compact form, C++ uses the operators ++ and --. For example, the statement

   x++;

in C++ has the same effect on the variable x as

   x += 1;

which is itself short for

   x = x + 1;

Similarly,

   y--;

has the same effect as

   y -= 1;

or

   y = y - 1;

As it happens, these operators are more intricate than the previous examples would suggest. To begin with, each of these operators can be written in two ways. The operator can come after the operand to which it applies, as in

   x++

or before the operand, as in

   ++x

The first form, in which the operator follows the operand, is called the postfix form; the second is called the prefix form.

If all you do is execute the ++ operator in isolation (as you do in the context of a separate statement or the standard for loop patterns), the prefix and postfix operators have precisely the same effect. You notice the difference only if you use these operators as part of a larger expression. Then, like all operators, the ++ operator returns a value, but the value depends on where the operator is written relative to the operand.
The two cases are as follows:

   x++   Calculates the value of x first, and then increments it. The value returned to the surrounding expression is the original value before the increment operation is performed.

   ++x   Increments the value of x first, and then uses the new value as the value of the ++ operation as a whole.

The -- operator behaves similarly, except that the value is decremented rather than incremented.

You may wonder why anyone would use such an arcane feature. The ++ and -- operators are certainly not essential. Moreover, there are not many circumstances in which programs that embed these operators in larger expressions are demonstrably better than those that use a simpler approach. On the other hand, ++ and -- are firmly entrenched in the historical tradition shared by the languages C, C++, and Java. Programmers use them so frequently that they have become essential idioms in these languages. In light of their widespread use in programs, you need to understand these operators so that you can make sense of existing code.

Boolean operators

C++ defines three classes of operators that manipulate Boolean data: the relational operators, the logical operators, and the ?: operator. The relational operators are used to compare two values. C++ defines six relational operators, as follows:

   ==   Equal
   !=   Not equal
   >    Greater than
   <    Less than
   >=   Greater than or equal to
   <=   Less than or equal to

When you write programs that test for equality, be careful to use the == operator, which is composed of two equal signs. A single equal sign is the assignment operator. Since the double equal sign violates conventional mathematical usage, replacing it with a single equal sign is a particularly common mistake. This mistake can also be very difficult to track down because the C++ compiler does not usually catch it as an error. A single equal sign turns the expression into an embedded assignment, which is perfectly legal in C++; it just isn’t at all what you want.
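Two points from the preceding paragraphs are easy to check in code: the value produced by the postfix and prefix forms of ++, and the effect of writing = where == was intended. The following is a minimal sketch rather than code from the text; the function names postfixDemo, prefixDemo, isZero, and isZeroBuggy are hypothetical and chosen only for illustration.

```cpp
#include <utility>

// Returns a pair: (value of the expression x++, value of x afterward).
std::pair<int, int> postfixDemo(int x) {
   int valueOfExpression = x++;   // yields the original value, then increments x
   return std::make_pair(valueOfExpression, x);
}

// Returns a pair: (value of the expression ++x, value of x afterward).
std::pair<int, int> prefixDemo(int x) {
   int valueOfExpression = ++x;   // increments x first, then yields the new value
   return std::make_pair(valueOfExpression, x);
}

// Intended test: is n equal to zero?
bool isZero(int n) {
   return n == 0;
}

// Intentionally incorrect: the single = assigns 0 to the local copy of n.
// The value of that embedded assignment is 0, which converts to false,
// so this function returns false for every argument.
bool isZeroBuggy(int n) {
   return (n = 0);
}
```

Starting from x equal to 5, both forms leave x equal to 6, but the postfix expression itself has the value 5 while the prefix expression has the value 6. The buggy predicate compiles without an error, which is what makes the mistake hard to track down.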
The relational operators can be used to compare atomic data values like integers, floating-point numbers, Boolean values, and characters, but those operators can also be applied to many of the types supplied by libraries, such as string.

In addition to the relational operators, C++ defines three logical operators that take Boolean operands and combine them to form other Boolean values:

   !    Logical not (true if the following operand is false)
   &&   Logical and (true if both operands are true)
   ||   Logical or (true if either or both operands are true)

These operators are listed in decreasing order of precedence.

Although the operators &&, ||, and ! closely resemble the English words and, or, and not, it is important to remember that English can be somewhat imprecise when it comes to logic. To avoid that imprecision, it is often helpful to think of these operators in a more formal, mathematical way. Logicians define these operators using truth tables, which show how the value of a Boolean expression changes as the values of its operands change. The truth table in Table 1-6 illustrates the result for each of the logical operators, given all possible values of the variables p and q.

Whenever a C++ program evaluates an expression of the form

   exp1 && exp2

or

   exp1 || exp2

the individual subexpressions are always evaluated from left to right, and evaluation ends as soon as the result can be determined. For example, if exp1 is false in the expression involving &&, there is no need to evaluate exp2 since the final result will always be false. Similarly, in the example using ||, there is no need to evaluate the second operand if the first operand is true. This style of evaluation, which stops as soon as the result is known, is called short-circuit evaluation.
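Short-circuit evaluation can be observed directly by counting how many operands are actually evaluated. The sketch below assumes a hypothetical helper named probe that increments a counter each time it is called; none of these function names come from the text. The last function shows a common practical use, in which the left operand guards a right operand that would otherwise be unsafe to evaluate.

```cpp
// Hypothetical helper: records that an operand was evaluated, then returns its value.
bool probe(bool value, int & calls) {
   calls++;
   return value;
}

// Evaluates first && second; calls reports how many operands were evaluated.
bool andWithCount(bool first, bool second, int & calls) {
   calls = 0;
   return probe(first, calls) && probe(second, calls);
}

// Evaluates first || second; calls reports how many operands were evaluated.
bool orWithCount(bool first, bool second, int & calls) {
   calls = 0;
   return probe(first, calls) || probe(second, calls);
}

// The remainder x % n is computed only when n is nonzero, because the
// && operator skips its right operand when the left operand is false.
bool isDivisible(int x, int n) {
   return n != 0 && x % n == 0;
}
```

When the first operand of && is false, or the first operand of || is true, the counter ends at 1 rather than 2, which confirms that the second operand was skipped.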
TABLE 1-6 Truth table for the logical operators

   p       q       p && q   p || q   !p
   false   false   false    false    true
   false   true    false    true     true
   true    false   false    true     false
   true    true    true     true     false

The C++ programming language includes another Boolean operator called ?: that can be extremely useful in certain situations. In programming parlance, the name of this operator is always pronounced as question-mark colon, even though the two characters do not appear adjacent to each other in the code. Unlike the other operators in C++, ?: is written in two parts and requires three operands. The general form of the operation is

   (condition) ? exp1 : exp2

The parentheses are not technically required, but C++ programmers often include them to emphasize the boundaries of the conditional test.

When a C++ program encounters the ?: operator, it first evaluates the condition. If the condition turns out to be true, exp1 is evaluated and used as the value of the entire expression; if the condition is false, the value is the result of evaluating exp2. For example, you can use the ?: operator to assign to max either the value of x or the value of y, whichever is greater, as follows:

   max = (x > y) ? x : y;

1.7 Statements

Programs in C++ are composed of functions, which are made up in turn of statements. As in most languages, statements in C++ fall into one of two principal classifications: simple statements that perform some action and control statements that affect the way in which other statements are executed. The sections that follow review the principal statement forms available in C++, giving you the tools you need to write your own programs.

Simple statements

The most common statement in C++ is the simple statement, which consists of an expression followed by a semicolon:

   expression;

In most cases, the expression is a function call, an assignment, or a variable followed by the increment or decrement operator.
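To make the three common kinds of simple statement concrete, here is a minimal sketch in which the body of one function contains an assignment, an increment, and a function call. The assignment reuses the ?: expression from the end of section 1.6. The names printValue and largerOf, and the comparisons counter, are hypothetical and exist only for this illustration.

```cpp
#include <iostream>

// Hypothetical helper used to show a function call as a simple statement.
void printValue(int value) {
   std::cout << value << std::endl;
}

// Returns the larger of x and y and adds 1 to the caller's comparisons counter.
int largerOf(int x, int y, int & comparisons) {
   int max;                  // declaration of a local variable
   max = (x > y) ? x : y;    // simple statement: an assignment
   comparisons++;            // simple statement: an increment
   printValue(max);          // simple statement: a function call
   return max;
}
```

Each of the three marked lines is just an expression followed by a semicolon, which is all that the definition of a simple statement requires.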
Blocks

As C++ is defined, control statements typically apply to a single statement. When you are writing a program, you often want a particular control statement to apply to a whole group of statements. To indicate that a sequence of statements is part of a coherent unit, you can assemble those statements into a block, which is a collection of statements enclosed in curly braces, as follows:

   {
      statement1
      statement2
      . . .
      statementn
   }

When the C++ compiler encounters a block, it treats the entire block as a single statement. Thus, whenever the notation statement appears in a pattern for one of the control forms, you can substitute for it either a single statement or a block. To emphasize that they are statements as far as the compiler is concerned, blocks are sometimes referred to as compound statements. In C++, the statements in any block may be preceded by declarations of variables.

The statements in the interior of a block are usually indented relative to the enclosing context. The compiler ignores the indentation, but the visual effect is extremely helpful to the human reader, because it makes the structure of the program jump out at you from the format of the page. Empirical research has shown that indenting three or four spaces at each new level makes the program structure easiest to see; the programs in this text use three spaces for each new level. Indentation is critical to good programming, so you should strive to develop a consistent indentation style in your programs.

The if statement

In writing a program, you will often want to check whether some condition applies and use the result of that check to control the subsequent execution of the program. This type of program control is called conditional execution.
The easiest way to express conditional execution in C++ is by using the if statement, which comes in two forms:

   if (condition) statement

   if (condition) statement else statement

You use the first form of the if statement when your solution strategy calls for a set of statements to be executed only if a particular Boolean condition is true. If the condition is false, the statements that form the body of the if statement are simply skipped. You use the second form of the if statement for situations in which the program must choose between two independent sets of actions based on the result of a test. This statement form is illustrated by the following code, which reports whether an integer n is even or odd.

   if (n % 2 == 0) {
      cout << "That number is even." << endl;
   } else {
      cout << "That number is odd." << endl;
   }

As with any control statement, the statements controlled by the if statement can be either a single statement or a block. Even if the body of a control form is a single statement, you are free to enclose it in a block if you decide that doing so improves the readability of your code. The programs in this book enclose the body of every control statement in a block unless the entire statement (both the control form and its body) is so short that it fits on a single line.

The switch statement

The if statement is ideal for those applications in which the program logic calls for a two-way decision point: some condition is either true or false, and the program acts accordingly. Some applications, however, call for more complicated decision structures involving several mutually exclusive cases: in one case, the program should do x; in another case, it should do y; in a third, it should do z; and so forth. In many applications, the most appropriate statement to use for such situations is the switch statement, which has the following syntactic form:

   switch (e) {
      case c1:
         statements
         break;
      case c2:
         statements
         break;
      . . . more case clauses . . .
      default:
         statements
         break;
   }

The expression e is called the control expression. When the program executes a switch statement, it evaluates the control expression and compares it against the values c1, c2, and so forth, each of which must be a constant. If one of the constants matches the value of the control expression, the statements in the associated case clause are executed. When the program reaches the break statement at the end of the clause, the operations specified by that clause are complete, and the program continues with the statement that follows the entire switch statement.

The default clause is used to specify what action occurs if none of the constants match the value of the control expression. The default clause, however, is optional. If none of the cases match and there is no default clause, the program simply continues on with the next statement after the switch statement without taking any action at all. To avoid the possibility that the program might ignore an unexpected case, it is good programming practice to include a default clause in every switch statement unless you are certain you have enumerated all the possibilities.

The code pattern I’ve used to illustrate the syntax of the switch statement deliberately suggests that break statements are required at the end of each clause. In fact, C++ is defined so that if the break statement is missing, the program starts executing statements from the next clause after it finishes the selected one. While this design can be useful in some cases, it causes many more problems than it solves. To reinforce the importance of remembering to exit at the end of each case clause, the programs in this text include a break or return statement in each such clause.

The one exception to this rule is that multiple case lines specifying different constants can appear together, one after another, before the same statement group.
For example, a switch statement might include the following code:

   case 1:
   case 2:
      statements
      break;

which indicates that the specified statements should be executed if the control expression is either 1 or 2. The C++ compiler treats this construction as two case clauses, the first of which is empty. Because the empty clause contains no break statement, a program that selects the first path simply continues on with the second clause. From a conceptual point of view, however, you are better off if you think of this construction as a single case clause representing two possibilities.

The constants in a switch statement must be of a scalar type, which is defined in C++ as a type that uses an integer as its underlying representation. In particular, characters are often used as case constants, as illustrated by the following function, which tests to see if its argument is a vowel:

   bool isVowel(char ch) {
      switch (ch) {
         case 'A': case 'E': case 'I': case 'O': case 'U':
         case 'a': case 'e': case 'i': case 'o': case 'u':
            return true;
         default:
            return false;
      }
   }

Enumerated types also qualify as scalar types, as illustrated by the function

   string directionToString(Direction dir) {
      switch (dir) {
         case NORTH: return "NORTH";
         case EAST: return "EAST";
         case SOUTH: return "SOUTH";
         case WEST: return "WEST";
         default: return "???";
      }
   }

which converts a Direction value to a string. The default clause returns "???" if the internal value of dir does not match any of the Direction constants.

As a second example of using switch with enumerated types, the following function returns the number of days for a given month and year:

   int daysInMonth(Month month, int year) {
      switch (month) {
         case APRIL:
         case JUNE:
         case SEPTEMBER:
         case NOVEMBER:
            return 30;
         case FEBRUARY:
            return (isLeapYear(year)) ? 29 : 28;
         default:
            return 31;
      }
   }

This code assumes the existence of a function isLeapYear(year) that tests whether year is a leap year.
You can implement isLeapYear using the logical operators && and ||, as follows:

   bool isLeapYear(int year) {
      return ((year % 4 == 0) && (year % 100 != 0))
          || (year % 400 == 0);
   }

This function simply encodes the rule for determining leap years: a leap year is any year divisible by 4, except for years ending in 00, in which case the year must be divisible by 400.

Functions that return Boolean values, like isVowel and isLeapYear in this section, are called predicate functions. Predicate functions play a useful role in programming, and you will encounter many of them in this text.

The while statement

In addition to the conditional statements if and switch, C++ includes several control statements that allow you to execute some part of the program multiple times to form a loop. Such control statements are called iterative statements. The simplest iterative statement in C++ is the while statement, which executes a statement repeatedly until a conditional expression becomes false. The general form for the while statement looks like this:

   while (conditional-expression) {
      statements
   }

When a program encounters a while statement, it first evaluates the conditional expression to see whether it is true or false. If it is false, the loop terminates and the program continues with the next statement after the entire loop. If the condition is true, the entire body is executed, after which the program goes back to the beginning of the loop to check the condition again. A single pass through the statements in the body constitutes a cycle of the loop.

There are two important principles about the operation of a while loop:

1. The conditional test is performed before every cycle of the loop, including the first. If the test is false initially, the body of the loop is not executed at all.

2. The conditional test is performed only at the beginning of a loop cycle.
   If that condition becomes false at some point during the loop, the program won’t notice that fact until it completes the entire cycle. At that point, the program evaluates the test condition again. If it is still false, the loop terminates.

The operation of the while loop is illustrated by the following function, which computes the sum of the digits in an integer:

   int digitSum(int n) {
      int sum = 0;
      while (n > 0) {
         sum += n % 10;
         n /= 10;
      }
      return sum;
   }

The function depends on the following observations:

• The expression n % 10 always returns the last digit in a positive integer n.
• The expression n / 10 returns a number without its final digit.

The while loop is designed for situations in which there is some test condition that can be applied at the beginning of a repeated operation, before any of the statements in the body of the loop are executed. If the problem you are trying to solve fits this structure, the while loop is the perfect tool. Unfortunately, many programming problems do not fit easily into the standard while loop structure. Instead of allowing a convenient test at the beginning of the operation, some problems are structured in such a way that the test you want to write to determine whether the loop is complete falls most naturally somewhere in the middle of the loop.

The most common examples of such loops are those that read in data from the user until some special value, or sentinel, is entered to signal the end of the input. When expressed in English, the structure of the sentinel-based loop consists of repeating the following steps:

1. Read in a value.
2. If the value is equal to the sentinel, exit from the loop.
3. Perform whatever processing is required for that value.

Unfortunately, there is no test you can perform at the beginning of the loop to determine whether the loop is finished.
The termination condition for the loop is reached when the input value is equal to the sentinel; in order to check this condition, the program must first read in some value. If the program has not yet read in a value, the termination condition doesn’t make sense. When some operations must be performed before you can check the termination condition, you have a situation that programmers call the loop-and-a-half problem.

One strategy for solving the loop-and-a-half problem in C++ is to use the break statement, which, in addition to its use in the switch statement, has the effect of immediately terminating the innermost enclosing loop. Using break, it is possible to code the loop structure in a form that follows the natural structure of the problem, which is called the read-until-sentinel pattern:

   while (true) {
      Prompt user and read in a value.
      if (value == sentinel) break;
      Process the data value.
   }

Note that the while (true) line itself seems to introduce an infinite loop because the value of the constant true can never become false. The only way this program can exit from the loop is by executing the break statement inside it. The AddIntegerList program in Figure 1-5 uses the read-until-sentinel pattern to compute the sum of a list of integers terminated by the sentinel value 0.

There are other strategies for solving the loop-and-a-half problem, most of which involve copying part of the code outside the loop or introducing additional Boolean variables. Empirical studies have demonstrated that students are more likely to write correct programs if they use a break statement to exit from the middle of the loop than if they are forced to use some other strategy. This evidence and my own experience have convinced me that using the read-until-sentinel pattern is the best solution to the loop-and-a-half problem.
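To see what the alternative strategies cost, the following sketch places the read-until-sentinel pattern next to a version that copies the read operation outside the loop. It is an illustration under stated assumptions, not code from the text: both function names are hypothetical, the input comes from a std::istream parameter so that the functions can be exercised without a keyboard, and each loop also stops if the read fails, a check that the pattern in the text does not include.

```cpp
#include <iostream>
#include <sstream>   // only needed to supply test input from a string

const int SENTINEL = 0;

// Read-until-sentinel pattern: the read appears once, and the exit test
// sits in the middle of the loop.
int sumUntilSentinel(std::istream & in) {
   int total = 0;
   while (true) {
      int value;
      in >> value;
      if (in.fail() || value == SENTINEL) break;
      total += value;
   }
   return total;
}

// Alternative strategy: the test moves to the top of the loop, but the
// read must then be written twice, once before the loop and once at the
// end of each cycle.
int sumWithDuplicatedRead(std::istream & in) {
   int total = 0;
   int value;
   in >> value;
   while (in && value != SENTINEL) {
      total += value;
      in >> value;
   }
   return total;
}
```

Given the input 3 4 5 0 9, both functions stop at the sentinel and return 12. The second version produces the same result, but any later change to the way a value is read has to be made in two places.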
The for statement

One of the most important control statements in C++ is the for statement, which is used in situations in which you want to repeat an operation a particular number of times. All modern programming languages have a statement that serves that purpose, but the for statement in the C family of languages is especially powerful and is useful in a wide variety of applications.

You have already seen two examples of the for loop in the PowersOfTwo program earlier in this chapter. The first appeared in the main program, when it cycled through the desired values of the exponent. That loop looked like this:

   for (int i = 0; i <= limit; i++) {
      cout << "2 to the " << i << " = "
           << raiseToPower(2, i) << endl;
   }

FIGURE 1-5 Program to add a list of integers

   /*
    * File: AddIntegerList.cpp
    * -----------------------
    * This program adds a list of integers. The end of the
    * input is indicated by entering a sentinel value, which
    * is defined by setting the value of the constant SENTINEL.
    */

   #include <iostream>
   using namespace std;

   /*
    * Constant: SENTINEL
    * ------------------
    * This constant defines the value used to terminate the input
    * list and should therefore not be a value one would want to
    * include as a data value. The value 0 therefore makes sense
    * for a program that adds a series of numbers because the
    * user can simply skip any 0 values in the input.
    */

   const int SENTINEL = 0;

   /* Main program */

   int main() {
      cout << "This program adds a list of numbers." << endl;
      cout << "Use " << SENTINEL << " to signal the end." << endl;
      int total = 0;
      while (true) {
         int value;
         cout << " ? ";
         cin >> value;
         if (value == SENTINEL) break;
         total += value;
      }
      cout << "The total is " << total << endl;
      return 0;
   }

The second example appeared in the implementation of raiseToPower and had the following form:

   for (int i = 0; i < k; i++) {
      result *= n;
   }

Each of these examples represents an idiomatic pattern that will come up often as you write your programs.
Of these two patterns, the second is more common. The general form of this pattern is

    for (int var = 0; var < n; var++)

and has the effect of executing the body of the loop n times. You can substitute any variable name you want into this pattern, but that variable is often not used inside the body of the for loop at all, as is the case in raiseToPower.

The pattern from the main program counts from one value to another and has the following general form:

    for (int var = start; var <= finish; var++)

In this pattern, the body of the for loop is executed with the variable var set to each value between start and finish, inclusive. Thus, you could use a for loop to have the variable i count from 1 to 100 like this:

    for (int i = 1; i <= 100; i++)

You can also use the for loop pattern to implement the factorial function, which is defined as the product of the integers between 1 and n and is usually written in mathematics as n!. The code is similar to the implementation of raiseToPower:

    int fact(int n) {
        int result = 1;
        for (int i = 1; i <= n; i++) {
            result *= i;
        }
        return result;
    }

In this implementation, the for loop ensures that the variable i counts from 1 to n. The body of the loop then updates the value of result by multiplying it by i, which takes on each of the desired values in turn.

The variable used in these for loop patterns is called an index variable. The convention of using single-letter names such as i and j for index variables dates back at least as far as the early versions of FORTRAN (which required integer variables to start with a predefined set of letters in the middle of the alphabet) but also fits with similar conventions in mathematics. Although short variable names are usually poor choices because they convey very little about the purpose of that variable, the fact that this naming convention exists makes such names appropriate in this context.
Whenever you see the variable i or j in a for loop, you can be reasonably confident that the variable is counting through some range of values.

The for loop in C++, however, is considerably more general than the earlier examples suggest. The general form of the for loop pattern is

    for (init; test; step) {
        statements
    }

This code is equivalent to the following while statement:

    init;
    while (test) {
        statements
        step;
    }

The code fragment specified by init, which is typically a variable declaration, runs before the loop begins and is most often used to initialize an index variable. For example, if you write

    for (int i = 0; . . .

the loop will begin by setting the index variable i to 0. If the loop begins

    for (int i = -7; . . .

the variable i will start as −7, and so on.

The test expression is a conditional test written exactly like the test in a while statement. As long as the test expression is true, the loop continues. Thus, the loop

    for (int i = 0; i < n; i++)

begins with i equal to 0 and continues as long as i is less than n, which turns out to represent a total of n cycles, with i taking on the values 0, 1, 2, and so forth, up to the final value n–1. The loop

    for (int i = 1; i <= n; i++)

begins with i equal to 1 and continues as long as i is less than or equal to n. This loop also runs for n cycles, with i taking on the values 1, 2, and so forth, up to n.

The step expression indicates how the value of the index variable changes from cycle to cycle. The most common form of step specification is to increment the index variable using the ++ operator, but this is not the only possibility. For example, one can count backward using the -- operator, or count by twos using += 2 instead of ++.
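The correspondence between a for statement and its while expansion can be checked directly. The sketch below uses two hypothetical functions, sumEvensFor and sumEvensWhile, that add the even numbers from 0 up to a limit by counting by twos; the names and the task are assumptions chosen only for illustration.

```cpp
/* Adds 0 + 2 + 4 + ... up to and including limit, using a for loop. */
int sumEvensFor(int limit) {
    int total = 0;
    for (int i = 0; i <= limit; i += 2) {
        total += i;
    }
    return total;
}

/* The same computation with init, test, and step moved into a while loop. */
int sumEvensWhile(int limit) {
    int total = 0;
    int i = 0;                  // init
    while (i <= limit) {        // test
        total += i;             // statements
        i += 2;                 // step
    }
    return total;
}
```

Both functions return 30 when called with a limit of 10. One difference worth noticing is scope: in the for version the index variable i exists only inside the loop, while in the while version it remains visible until the end of the function.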
As an illustration of counting in the reverse direction, the program

    int main() {
        for (int t = 10; t >= 0; t--) {
            cout << t << endl;
        }
        return 0;
    }

generates a sample run in which the values 10, 9, 8, and so on down to 0 each appear on a separate line.

Each of the expressions in a for statement is optional, but the semicolons must appear. If init is missing, no initialization is performed. If test is missing, it is assumed to be true. If step is missing, no action occurs between loop cycles.

The return statement

The one other control statement you have seen in this chapter is the return statement, which appears in each of the functions defined in this chapter. When it is used to return a value, the return statement has the following form:

    return expression;

Executing the return statement causes the current function to return immediately to its caller, passing back the value of the expression to the caller as the value of the function.

Although this chapter offers several examples that should enable you to write simple functions of your own, the topic of functions in C++ is both complex enough and important enough to be worth a chapter of its own. In Chapter 2, you’ll have the chance to explore functions in much more detail.

1.8 The Stanford C++ libraries

The programs in this chapter use only features of C++ that are completely standard. For that reason, these programs should work with any C++ compiler on any type of hardware. The standard libraries, however, do not include all the features that you would want as a programmer. Even though almost all modern applications use graphical displays, the standard libraries offer no graphical capabilities. Every modern operating system will provide libraries that support drawing pictures on the screen, but those libraries are different for each machine. In order to create applications that are as interesting as those you are used to using, you will certainly need to use nonstandard libraries somewhere along the way.
As part of the supporting material for this textbook, Stanford University provides a set of libraries that make learning to program in C++ both easier and more enjoyable. Those libraries include a sophisticated graphics package that works in exactly the same way on the various platforms that most people use. The Stanford libraries, however, provide several tools that it makes sense to use even at this point in your exploration of C++.

One library package that you might want to use for the exercises in this chapter is the console library, which creates a console window on the screen as soon as your program starts up. Having this window appear automatically solves the problem that Dennis Ritchie identified in The C Programming Language when he wrote that one of the harder challenges in running a program is to “find out where the output went.” To use this library, all you have to do is add the line

    #include "console.h"

at the beginning of your program. If this line appears, the program creates the console. If it’s missing, the program works the way it always did.

The other Stanford library that you might want to begin using right away is the simpio.h library, which simplifies getting input from the user. Instead of the lines

    int limit;
    cout << "Enter exponent limit: ";
    cin >> limit;

using simpio.h allows you to collapse this idea into the following single line:

    int limit = getInteger("Enter exponent limit: ");

This library also includes a getReal function for reading a floating-point value and a getLine function for reading an entire line as a string. You will learn how to write these functions in Chapter 4, but you can use them now to get in the habit.

Summary

This chapter is itself a summary, which makes it hard to condense it to a few central points. Its purpose was to introduce you to the C++ programming language and give you a crash course in how to write simple programs in that language.
This chapter focused on the low-level structure of the language, concentrating on the concepts of expressions and statements, which together make it possible to define functions. Important points in this chapter include:

• In the 30+ years of its existence, the C++ programming language has become one of the most widely used languages in the world.

• A typical C++ program consists of comments, library inclusions, program-level definitions, function prototypes, a function named main that is called when the program is started, and a set of auxiliary functions that work together with the main program to accomplish the required task.

• Variables in a C++ program must be declared before they are used. Most variables in C++ are local variables, which are declared within a function and can only be used inside the body of that function.

• A data type is defined by a domain of values and a set of operations. C++ includes several primitive types that allow programs to store common data values including integers, floating-point numbers, Booleans, and characters. As you will learn in later chapters, C++ also allows programmers to define new types from existing ones.

• The easiest way to perform input and output operations in C++ is to use the iostream library. This library defines three standard streams that refer to the console: cin for reading input data, cout for writing normal program output, and cerr for reporting error messages. Console input is traditionally performed using the >> operator, as in the statement

      cin >> limit;

  which reads a value from the console into the variable limit. Console output uses the << operator, as illustrated by the line

      cout << "The answer is " << answer << endl;

  which would display the value of answer on the console along with an identifying label. The endl value ensures that the next console output appears on a new line.
• Expressions in C++ are written in a form similar to that in most programming languages, with individual terms connected by operators. A list of the C++ operators appears in Table 1-4 along with their precedence and associativity.

• Statements in C++ fall into two general categories: simple statements and control statements. A simple statement consists of an expression (typically an assignment or a function call) followed by a semicolon. The control statements described in this chapter are the if, switch, while, and for statements. The first two are used to express conditional execution, while the last two are used to specify repetition.

• C++ programs are typically subdivided into several functions. You can use the examples in this chapter as models for writing your own functions, or you can wait until functions are covered in more detail in Chapter 2.

• The materials associated with this book include the Stanford C++ libraries, which provide tools to make learning programming more fun and exciting. You will learn more about the features of the Stanford C++ libraries as you read this text. Although it is by no means necessary to use them, two libraries that might make your life easier are console.h and simpio.h.

Review questions

1. When you write a C++ program, do you prepare a source file or an object file?

2. What characters are used to mark comments in a C++ program?

3. In an #include line, the name of the library header file can be enclosed in either angle brackets or double quotation marks. What is the difference between the two forms of punctuation?

4. How would you define a constant called CENTIMETERS_PER_INCH with the value 2.54?

5. What is the name of the function that must be defined in every C++ program? What statement typically appears at the end of that function?

6. What is the purpose of endl when you are writing to the cout output stream?

7. Define each of the following terms as it applies to a variable: name, type, value, and scope.

8. Indicate which of the following are legal variable names in C++:

    a. x                      g. total output
    b. formula1               h. aVeryLongVariableName
    c. average_rainfall       i. 12MonthTotal
    d. %correct               j. marginal-cost
    e. short                  k. b4hand
    f. tiny                   l. _stk_depth

9. What are the two attributes that define a data type?

10. What is the difference between the types short, int, and long?

11. What does ASCII stand for?

12. List all possible values of type bool.

13. What statements would you include in a program to read a value from the user and store it in the variable x, which is declared as a double?

14. Suppose that a function contains the following declarations:

    int i;
    double d;
    char c;
    string s;

Write a statement that displays the values of each of these variables on the screen, along with the name of the variable so that you can tell which value is which.

15. Indicate the values and types of the following expressions:

    a. 2 + 3                  d. 3 * 6.0
    b. 19 / 5                 e. 19 % 5
    c. 19.0 / 5               f. 2 % 7

16. What is the difference between the unary minus and the binary subtraction operator?

17. What does the term truncation mean?

18. Calculate the result of each of the following expressions:

    a. 6 + 5 / 4 - 3
    b. 2 + 2 * (2 * 2 - 2) % 2 / 2
    c. 10 + 9 * ((8 + 7) % 6) + 5 * 4 % 3 * 2 + 1
    d. 1 + 2 + (3 + 4) * ((5 * 6 % 7 * 8) - 9) - 10

19. How do you specify a shorthand assignment operation?

20. What is the difference between the expressions ++x and x++?

21. What is meant by short-circuit evaluation?

22. Write out the general syntactic form for each of the following control statements: if, switch, while, for.

23. Describe in English the operation of the switch statement, including the role of the break statement at the end of each case clause.

24. What is a sentinel?

25. What for loop control line would you use in each of the following situations?

    a. Counting from 1 to 100
    b. Counting by sevens starting at 0 until the number has more than two digits
    c. Counting backward by twos from 100 to 0

Exercises

1. Write a program that reads in a temperature in degrees Celsius and displays the corresponding temperature in degrees Fahrenheit. The conversion formula is

    $F = \frac{9}{5} C + 32$

2. Write a program that converts a distance in meters to the corresponding English distance in feet and inches. The conversion factors you need are

    1 inch = 0.0254 meters
    1 foot = 12 inches

3. As mathematical historians have told the story, the German mathematician Carl Friedrich Gauss (1777-1855) began to show his mathematical talent at a very early age. When he was in elementary school, Gauss was asked by his teacher to compute the sum of the numbers between 1 and 100. Gauss is said to have given the answer instantly: 5050. Write a program that computes the answer to the question Gauss’s teacher posed.

4. Write a program that reads in a positive integer N and then calculates and displays the sum of the first N odd integers. For example, if N is 4, your program should display the value 16, which is 1 + 3 + 5 + 7.

5. Write a program that reads in a list of integers from the user until the user enters the value 0 as a sentinel. When the sentinel appears, your program should display the largest value in the list, as illustrated in the following sample run:

    [Sample run not shown.]

Be sure to define the sentinel value as a constant in a way that makes it easy to change. You should also make sure that the program works correctly if all the input values are negative.

6. For a slightly more interesting challenge, write a program that finds both the largest and the second-largest number in a list, prior to the entry of a sentinel. Once again using 0 as the sentinel, a sample run of this program might look like this:

    FindSecondLargest
    This program finds the largest integer in a list.
    Enter 0 to signal the end of the list.
     ? 223
     ? 251
     ? 317
     ? 636
     ? 766
     ? 607
     ? 607
     ? 0
    The largest value was 766.
    The second largest value was 636.

The values in this sample run are the number of pages in the British hardcover editions of J. K. Rowling’s Harry Potter series. The output therefore tells us that the longest book (Harry Potter and the Order of the Phoenix) has 766 pages and the second-longest book (Harry Potter and the Goblet of Fire) weighs in at a mere 636 pages.

7. Using the AddIntegerList program from Figure 1-5 as a model, write a program AverageList that reads in a list of integers representing exam scores and then prints out the average. Because some unprepared student might actually get a score of 0, your program should use −1 as the sentinel to mark the end of the input.

8. Using the digitSum function from the section entitled “The while statement” as a model, write a program that reads in an integer and then displays the number that has the same digits in the reverse order, as illustrated by this sample run:

    [Sample run not shown.]

9. Every positive integer greater than 1 can be expressed as a product of prime numbers. This factorization is unique and is called the prime factorization. For example, the number 60 can be decomposed into the factors 2 × 2 × 3 × 5, each of which is prime. Note that the same prime can appear more than once in the factorization. Write a program to display the prime factorization of a number n, as illustrated by the following sample run:

    [Sample run not shown.]

10. In 1979, Douglas Hofstadter, Professor of Cognitive Science at Indiana University, wrote Gödel, Escher, Bach, which he described as “a metaphorical fugue on minds and machines in the spirit of Lewis Carroll.” The book won the Pulitzer Prize for General Nonfiction and has over the years become one of the classics of computer science. Much of its charm comes from the mathematical oddities and puzzles it contains, many of which can be expressed in the form of computer programs.
Of these, one of the most interesting concerns the sequence of numbers formed by repeatedly executing the following rules for some positive integer n:

• If n is equal to 1, you’ve reached the end of the sequence and can stop.
• If n is even, divide it by two.
• If n is odd, multiply it by three and add one.

Although it also goes by several other names, this sequence is often called the hailstone sequence because the values tend to go up and down before coming back to 1, much as hailstones do in the clouds in which they form.

Write a program that reads in a number from the user and then generates the hailstone sequence from that point, as in the following sample run:

    Hailstone
    Enter a number: 15
    15 is odd, so I multiply by 3 and add 1 to get 46
    46 is even, so I divide it by 2 to get 23
    23 is odd, so I multiply by 3 and add 1 to get 70
    70 is even, so I divide it by 2 to get 35
    35 is odd, so I multiply by 3 and add 1 to get 106
    106 is even, so I divide it by 2 to get 53
    53 is odd, so I multiply by 3 and add 1 to get 160
    160 is even, so I divide it by 2 to get 80
    80 is even, so I divide it by 2 to get 40
    40 is even, so I divide it by 2 to get 20
    20 is even, so I divide it by 2 to get 10
    10 is even, so I divide it by 2 to get 5
    5 is odd, so I multiply by 3 and add 1 to get 16
    16 is even, so I divide it by 2 to get 8
    8 is even, so I divide it by 2 to get 4
    4 is even, so I divide it by 2 to get 2
    2 is even, so I divide it by 2 to get 1

As you can see, this program offers a narrative account of the process as it goes along, much as Hofstadter does in his book.

One of the fascinating things about the hailstone sequence is that no one has yet been able to prove that the process always stops. The number of steps in the process can get very large, but somehow, it always seems to climb back down to one.

11. The German mathematician Leibniz (1646–1716) discovered the rather remarkable fact that the mathematical constant π can be computed using the following mathematical relationship:

    $\frac{\pi}{4} \approx 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \frac{1}{11} + \cdots$

The formula to the right of the equal sign represents an infinite series; each fraction represents a term in that series. If you start with 1, subtract one-third, add one-fifth, and so on, for each of the odd integers, you get a number that gets closer and closer to the value of π/4 as you go along. Write a program that calculates an approximation of π consisting of the first 10,000 terms in Leibniz’s series.

12. You can also compute π by approximating the area bounded by a circular arc. Consider the following quarter circle:

    [Figure: a quarter circle of radius r]

which has a radius r equal to two inches. From the formula for the area of a circle, you can easily determine that the area of the quarter circle should be π square inches. You can also approximate the area computationally by adding up the areas of a series of rectangles, where each rectangle has a fixed width and the height is chosen so that the circle passes through the midpoint of the top of the rectangle. For example, if you divide the area into 10 rectangles from left to right, you get the following diagram:

    [Figure: the quarter circle divided into 10 rectangles]

The sum of the areas of the rectangles approximates the area of the quarter circle. The more rectangles there are, the closer the approximation.

For each rectangle, the width w is a constant derived by dividing the radius by the number of rectangles. The height h, on the other hand, varies depending on the position of the rectangle. If the midpoint of the rectangle in the horizontal direction is given by x, the height of the rectangle can be computed using the sqrt function to express the distance formula

    $h = \sqrt{r^2 - x^2}$

The area of each rectangle is then simply h × w. Write a program to compute the area of the quarter circle by dividing it into 10,000 rectangles.
Chapter 2
Functions and Libraries

I have always imagined Paradise as a kind of library.
Jorge Luis Borges, “Poem of the Gifts,” 1960

As you know from the examples from Chapter 1, programs in C++ typically consist of a sequence of functions. This chapter looks more expansively at the idea of a function and the realization of that concept in C++. In addition, this chapter explores how functions can be stored in libraries, which makes it easier to use those functions in a variety of applications.

2.1 The idea of a function

When you are in the early stages of your study of programming, your most important responsibility is learning how to use functions effectively in programs. Fortunately, the concept of a function is likely to be familiar from mathematics, so that you’re not learning a new concept completely from scratch. At the same time, functions in C++ are much more general than their counterparts in mathematics, which means that you’ll have to move beyond the mathematical conception and think more expansively about how you might use functions as a programmer. The sections that follow begin with the mathematical notion of a function and then generalize that concept to encompass the applications of functions in the programming domain.

Functions in mathematics

When you studied mathematics in high school, you almost certainly encountered the concept of a function. For example, you might have seen a function definition like

    $ƒ(x) = x^2 + 1$

which states that the function ƒ transforms a number x into the square of x plus one. For any value of x, you can compute the value of the function simply by evaluating the formula that appears in the definition. Thus, the value of ƒ(3) is $3^2 + 1$, or 10.

Ever since the development of FORTRAN in the 1950s, programming languages have incorporated this mathematical approach to functions into their computational framework.
For example, building on the examples of functions you saw in Chapter 1, you know that you can implement the function ƒ in C++ like this:

    double f(double x) {
        return x * x + 1;
    }

This definition includes various bits of syntax that are absent from the mathematical formulation, but the basic idea is the same. The function f takes an input value represented by the variable x and returns as its output the value of the expression x * x + 1.

Functions in programming

In programming languages, the concept of a function is more general than it is in mathematics. Like their mathematical counterparts, functions in C++ can specify input values, but don’t need to do so. Similarly, functions in C++ aren’t required to return results. The essential characteristic of a C++ function is that it associates a computational operation (specified by a block of code that forms the body of the function) with a particular name. Once a function has been defined, other parts of the program can trigger the associated operations using only the function name. There is no need to repeat the code for the underlying operation, because the steps required to implement that operation are specified in the function body.

This model of functions in the context of programming makes it possible to define several terms that are essential to understanding how functions work in C++. First of all, a function is a block of code that has been organized into a separate unit and given a name. The act of using the name to invoke that code is known as calling that function. To specify a function call in C++, you write the name of the function, followed by a list of expressions enclosed in parentheses. These expressions, called arguments, allow the calling program to pass information to the function. If a function requires no information from its caller, it need not have any arguments, but an empty set of parentheses must appear in both the function definition and any calls to that function.
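As a small illustration of these terms, the sketch below defines one function that takes no arguments and returns no result and another that takes an argument and returns a value. The names printBanner and square are hypothetical. The first definition relies on the keyword void, which C++ uses to mark a function that returns no result.

```cpp
#include <iostream>
using namespace std;

/* Takes no arguments and returns no result; the parentheses are still required. */
void printBanner() {
    cout << "Welcome!" << endl;
}

/* Takes one argument and returns a value to its caller. */
int square(int n) {
    return n * n;
}
```

A call to the first function is written printBanner(); with the empty parentheses, while the call square(7) passes the argument 7 and evaluates to 49.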
Once called, the function takes the data supplied as arguments, does its work, and then returns to the program at the point in the code from which the call was made. Remembering what the calling program was doing so that the program can get back to the precise location of the call is one of the essential characteristics of the function-calling mechanism. The operation of going back to the calling program is called returning from the function. As it returns, a function often passes a value back to its caller. This operation is called returning a value.

The advantages of using functions

Functions play several important roles in a programming language. First, defining functions makes it possible to write the code for an operation once but then use it many times. The ability to invoke the same sequence of instructions from many parts of a program can dramatically reduce its size. Having the code for a function appear in just one place also makes a program easier to maintain. If you need to make a change in the way a function operates, it is far easier to do so if the code appears only once than if the same operations are repeated throughout the code.

Defining a function, however, is valuable even if you use that function only once in a particular program. The most important role that functions play is that they make it possible to divide a large program into smaller, more manageable pieces.
If you choose the individual pieces well, each one will have integrity as a unit and make the program as a whole much simpler to understand. If you choose unwisely, the decomposition can easily get in your way. There are no hard-and-fast rules for selecting a particular decomposition. Programming is an art, and you will learn to choose good decomposition strategies mostly through experience.

As a general rule, however, it makes sense to begin the decomposition process starting with the main program. At this level, you think about the problem as a whole and try to identify the major pieces of the entire task. Once you figure out what the big pieces of the program are, you can define them as independent functions. Since some of these functions may themselves be complicated, it is often appropriate to decompose them into still smaller ones. You can continue this process until every piece of the problem is simple enough to be solved on its own. This process is called top-down design or stepwise refinement.

Functions and algorithms

In addition to their role as a tool for managing complexity, functions are important in programming because they provide a basis for the implementation of algorithms, which are precisely specified strategies for solving computational problems. The term comes from the name of the ninth-century Persian mathematician Abu Ja‘far Mohammed ibn Mûsâ al-Khowârizmî, whose treatise on mathematics entitled Kitab al jabr w’al-muqabala gave rise to the English word algebra. Mathematical algorithms, however, go back much further in history, and certainly extend at least as far as the early Greek, Chinese, and Indian civilizations.

One of the earliest known algorithms worthy of that distinction is named for the Greek mathematician Euclid, who lived in Alexandria during the reign of Ptolemy I (323-283 BCE).
In his great mathematical treatise called Elements, Euclid outlines a procedure for finding the greatest common divisor (or gcd for short) of two integers x and y, which is defined to be the largest integer that divides evenly into both. For example, the gcd of 49 and 35 is 7, the gcd of 6 and 18 is 6, and the gcd of 32 and 33 is 1. In modern English, Euclid’s algorithm can be described as follows:

1. Divide x by y and compute the remainder; call that remainder r.
2. If r is zero, the algorithm is complete, and the answer is y.
3. If r is not zero, set x to the old value of y, set y equal to r, and repeat the process.

You can easily translate this algorithmic description into the following code in C++:

    int gcd(int x, int y) {
        int r = x % y;
        while (r != 0) {
            x = y;
            y = r;
            r = x % y;
        }
        return y;
    }

Euclid’s algorithm is considerably more efficient than any strategy you would be likely to discover on your own, and is still used today in a variety of practical applications, including the implementation of the cryptographic protocols that enable secure communication on the Internet.

At the same time, it is not easy to see exactly why the algorithm gives the correct result. Fortunately for those who rely on it in modern-day applications, Euclid was able to prove the correctness of his algorithm in Elements, Book VII, proposition 2. While it is not always necessary to have formal proofs of the algorithms that drive computer-based applications, having such proofs makes it possible to have more confidence in the correctness of those programs.

2.2 Libraries

When you write a C++ program, most of the code the computer executes is not the code you have written yourself, but the library code that you have loaded along with your application. In a way, programs today are like icebergs, in which most of the bulk lies hidden beneath the surface.
If you want to become an effective C++ programmer, you need to spend at least as much time learning about the standard libraries as you do learning the language itself.

Every program you have seen in this book, all the way back to the tiny HelloWorld example at the beginning of Chapter 1, includes the <iostream> library, which gives the program access to the cin and cout streams. The details of how those streams are implemented aren’t important to you when you are writing HelloWorld or PowersOfTwo. Those implementations, in fact, are not simply beyond your reach at the moment, but entirely beyond the scope of this book. In your role as a programmer, all that matters is that you know how to use these libraries and that the library implementations do what they’re supposed to do.

For some reason, beginning programmers are sometimes uncomfortable with the idea of calling a function without understanding its underlying implementation. In fact, you’ve probably been doing precisely that in mathematics for a long time. In high school, you presumably encountered several functions that turned out to be useful even if you had no idea of, and probably no interest in, how the values of that function are computed. In your algebra classes, for example, you learned about functions like logarithms and square root. If you took a trigonometry course, you worked with functions like sine and cosine. If you needed to know the value of one of these functions, you didn’t compute the result by hand, but instead looked up the answer in a table or, more likely, typed the appropriate values into a calculator.

When you’re writing programs, you want to follow that same strategy. If you need to invoke some mathematical function, what you want to have is a library function that computes it with no more conceptual effort than it takes to press the right key on a calculator.
Fortunately, C++ has an extensive mathematical library called <cmath> that includes all the functions you are ever likely to need. The most common functions in the <cmath> library appear in Table 2-1. Don’t worry if you have no idea what some of those functions mean. This book requires very little mathematics and will explain any concepts beyond basic algebra as they come up.

TABLE 2-1 Selected functions in the <cmath> library

General mathematical functions
  abs(x)        Returns the absolute value of x.
  sqrt(x)       Returns the square root of x.
  floor(x)      Returns the largest integer less than or equal to x as a floating-point value.
  ceil(x)       Returns the smallest integer greater than or equal to x as a floating-point value.

Logarithmic and exponential functions
  exp(x)        Returns the exponential function of x (e^x).
  log(x)        Returns the natural logarithm (base e) of x.
  log10(x)      Returns the common logarithm (base 10) of x.
  pow(x, y)     Returns x^y.

Trigonometric functions
  cos(theta)    Returns the cosine of the angle theta, measured in radians counterclockwise from the +x axis. You can convert from degrees to radians by multiplying by π/180.
  sin(theta)    Returns the sine of the radian angle theta.
  tan(theta)    Returns the tangent of the radian angle theta.
  atan(x)       Returns the principal arctangent of x. The result is an angle expressed in radians between –π/2 and +π/2.
  atan2(y, x)   Returns the radian angle formed between the x-axis and the line extending from the origin through the point (x, y).

Whenever a library makes some service available to the programs that include it, computer scientists say that the library exports that service. The <iostream> library, for example, exports the cin and cout streams; the <cmath> library exports the sqrt function, along with the other functions in Table 2-1.

One of the design goals of any library is to hide the complexity involved in the underlying implementation. By exporting the sqrt function, the designers of the <cmath> library make it far easier to write programs that use it.
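As a small illustration of using these exported functions as a client, the sketch below calls sqrt and sin from <cmath>. The helper names toRadians, hypotenuse, and sineOfDegrees and the constant PI are assumptions made for this example; they are not part of the library.

```cpp
#include <cmath>

// Hypothetical constant for this sketch; the value of pi is supplied by hand.
const double PI = 3.14159265358979323846;

// Converts degrees to radians by multiplying by pi/180, as noted in Table 2-1.
double toRadians(double degrees) {
   return degrees * PI / 180;
}

// Length of the hypotenuse of a right triangle with legs a and b.
double hypotenuse(double a, double b) {
   return std::sqrt(a * a + b * b);
}

// Sine of an angle given in degrees; sin itself expects radians.
double sineOfDegrees(double degrees) {
   return std::sin(toRadians(degrees));
}
```

A client can call hypotenuse(3, 4), which yields 5, or sineOfDegrees(30), which is approximately 0.5, without knowing anything about how sqrt or sin compute their results.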
When you call the sqrt function, you don’t need to have any idea how sqrt works internally. Those details are relevant only to the programmers who implement the library in the first place.

Knowing how to call the sqrt function and knowing how to implement it are both important skills. It is important to recognize, however, that those two skills (calling a function and implementing one) are to a large extent independent. Successful programmers often use functions that they wouldn’t have a clue how to write. Conversely, programmers who implement a library function can never anticipate all the potential uses for that function.

To emphasize the difference in perspective between programmers who implement a library and those who use it, computer scientists have assigned names to programmers working in each of these roles. Naturally enough, a programmer who implements a library is called an implementer. Conversely, a programmer who calls functions provided by a library is called a client of that library. As you go through the chapters in this book, you will have a chance to look at several libraries from both of these perspectives, first as a client and later as an implementer.

2.3 Defining functions in C++

Although you saw several functions in Chapter 1 and even had a chance to write a few of your own in the exercises, it makes sense to review the rules for writing functions in C++ before going on to investigate how to use them most effectively. In C++, a function definition has the following syntactic form:

    type name(parameters) {
       . . . body . . .
    }

In this example, type is the type returned by the function, name is the function name, and parameters is a list of declarations separated by commas, giving the type and name of each parameter to the function. A parameter is a placeholder for one of the arguments supplied in the function call and acts in most respects like a local variable.
The only difference is that each parameter is initialized automatically to hold the value of the corresponding argument. If a function takes no parameters, the entire parameter list in the function header line is empty.

The body of the function is a block consisting of the statements that implement the function, along with the declarations of any local variables the function requires. For functions that return a value to their caller, at least one of those statements must be a return statement, which usually has the form

    return expression;

Executing the return statement causes the function to return immediately to its caller, passing back the value of the expression as the value of the function.

Functions can return values of any type. The following function, for example, returns a Boolean value indicating whether the argument n is an even integer:

    bool isEven(int n) {
       return n % 2 == 0;
    }

Once you have defined this function, you can use it in an if statement like this:

    if (isEven(i)) . . .

As noted in Chapter 1, functions that return Boolean results play an important role in programming and are called predicate functions.

Functions, however, do not need to return a value at all. A function that does not return a value and is instead executed for its effect is often called a procedure. Procedures are indicated in the definition of a function by using the reserved word void as the result type. Procedures ordinarily finish by reaching the end of the statements in the body. You can, however, signal early completion of a procedure by executing a return statement without a value expression, as follows:

    return;

Function prototypes

When the C++ compiler encounters a function call in your program, it needs to have some information about that function in order to generate the correct code. In most cases, the compiler doesn’t need to know all the steps that form the body of the function.
All it really needs to know is what arguments the function requires and what type of value it returns. That information is usually provided by a prototype, which is simply the header line of the function followed by a semicolon.

You have already seen examples of prototypes in Chapter 1. The PowersOfTwo program from Figure 1-3, for example, provides the following prototype for the raiseToPower function:

    int raiseToPower(int n, int k);

which tells the compiler that raiseToPower takes two integers as arguments and returns an integer as its result. The names of the parameters are optional in a prototype, but supplying them usually helps the reader.

If you always define functions before you call them, prototypes are not required. Some programmers organize their source files so that the low-level functions come at the beginning, followed by the intermediate-level functions that call them, with the main program coming at the very bottom. While this strategy can save a few prototype lines, it forces an order on the source file that often makes it harder for the reader to figure out what is going on. In particular, programs organized in this way end up reflecting the opposite of top-down design, since the most general functions appear at the end. In this text, every function other than main has an explicit prototype, which makes it possible to define those functions in any order.

Overloading

In C++, it is legal to give the same name to more than one function as long as the pattern of arguments is different. When the compiler encounters a call to a function with that name, it checks to see what arguments have been provided and chooses the version of the function that fits those arguments. Using the same name for more than one version of a function is called overloading. The pattern of arguments taken by a function (which refers only to the number and types of the arguments and not the parameter names) is called its signature.
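The two ideas in this section, prototypes that free you to define functions in any order and overloaded functions distinguished by signature, can be sketched together as follows. The names sumOfSquares, square, and kind are invented for this illustration.

```cpp
#include <string>

/* Function prototypes */

int sumOfSquares(int a, int b);
int square(int n);
std::string kind(int n);        // signature: one int
std::string kind(double x);     // signature: one double

/* sumOfSquares can call square before its definition because of the prototype */

int sumOfSquares(int a, int b) {
   return square(a) + square(b);
}

int square(int n) {
   return n * n;
}

/* Two overloaded versions; the compiler picks one from the argument type. */
/* The parameter names are omitted here because the bodies do not use them. */

std::string kind(int) {
   return "int";
}

std::string kind(double) {
   return "double";
}
```

With these definitions, kind(3) selects the int version and kind(3.0) selects the double version, while sumOfSquares(3, 4) evaluates to 25.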
As an example of overloading, the <cmath> library includes several versions of the function abs, one for each of the built-in arithmetic types. For example, the library includes the function

    int abs(int x) {
       return (x < 0) ? -x : x;
    }

as well as the identically named function

    double abs(double x) {
       return (x < 0) ? -x : x;
    }

The only difference between these functions is that the first version takes an int as its argument and the second takes a double. The compiler chooses which of these versions to invoke by looking at the types of the arguments the caller provides. Thus, if abs is called with an int, the compiler invokes the integer-valued version of the function and returns a value of type int. If, by contrast, the argument is of type double, the compiler will choose the version that takes a double.

The primary advantage of using overloading is that doing so makes it easier for you as a programmer to keep track of different function names for the same operation when it is applied in slightly different contexts. If, for example, you need to call the absolute value function in C, which does not support overloading, you have to remember to call fabs for floating-point numbers and abs for integers. In C++, all you need to remember is the single function name abs.

Default parameters

In addition to overloading, C++ makes it possible to specify that certain parameters are optional. The parameter variables still appear in the function header line, but the prototype for the function specifies the values that those parameters should have if they do not appear in the call. Such parameters are called default parameters.

To indicate that a parameter is optional, all you need to do is include an initial value in the declaration of that parameter in the function prototype.
For example, if you were designing a set of functions to implement a word processor, you might define a procedure with the following prototype:

    void formatInColumns(int nColumns = 2);

The formatInColumns procedure takes the number of columns as an argument, but the = 2 in the prototype declaration means that this argument may be omitted. If you leave it out and call

    formatInColumns();

the parameter variable nColumns will automatically be initialized to 2.

When you use default parameters, it helps to keep the following rules in mind:

  • The specification of the default value appears only in the function prototype and not in the function definition.
  • Any default parameters must appear at the end of the parameter list.

Default parameters tend to be overused in C++. It is always possible, and usually preferable, to achieve the same effect through overloading. Suppose, for example, that you want to define a procedure setInitialLocation that takes an x and a y coordinate as arguments. The prototype presumably looks like this:

    void setInitialLocation(double x, double y);

Now suppose that you want to change this definition so that the initial location defaults to the origin (0, 0). One way to accomplish that goal is to add initializers to the prototype, like this:

    void setInitialLocation(double x = 0, double y = 0);

While this definition has the desired effect, having it in place makes it possible to call setInitialLocation with one argument, which would almost certainly be confusing to anyone reading the code.
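To make the concern about the one-argument call concrete, consider the following sketch. It uses a hypothetical function describeLocation, which has the same default parameters as the setInitialLocation prototype above but returns a string so that the effect of each call is visible.

```cpp
#include <sstream>
#include <string>

/* Prototype: the default values appear here and not in the definition */

std::string describeLocation(double x = 0, double y = 0);

/* Hypothetical stand-in for setInitialLocation that reports its arguments */

std::string describeLocation(double x, double y) {
   std::ostringstream out;
   out << "(" << x << ", " << y << ")";
   return out.str();
}
```

Calling describeLocation() produces "(0, 0)" and describeLocation(5, 7) produces "(5, 7)", as intended. The call describeLocation(5), however, also compiles and quietly produces "(5, 0)", which a reader of the calling code is unlikely to expect.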
It is almost certainly better to define an overloaded version of the function with the following implementation:

    void setInitialLocation() {
       setInitialLocation(0, 0);
    }

2.4 The mechanics of function calls

Although you can certainly get by with an intuitive understanding of how the function-calling process works, it sometimes helps (particularly when you start to work with recursive functions in Chapter 7) to understand precisely what happens when one function calls another in C++. The sections that follow define the process in detail and then walk you through a simple example designed to help you visualize exactly what is going on.

The steps in calling a function

Whenever a function call occurs, the C++ compiler generates code to implement the following operations:

1. The calling function computes values for each argument using the bindings of local variables in its own context. Because the arguments are expressions, this computation can involve operators and other functions, all of which are evaluated before execution of the new function actually begins.
2. The system creates new space for all the local variables required by the new function, including any parameters. These variables are allocated together in a block, which is called a stack frame.
3. The value of each argument is copied into the corresponding parameter variable. If there is more than one argument, the arguments are copied into the parameters in order; the first argument is copied into the first parameter, and so forth. If necessary, type conversions are performed between the argument values and the parameter variables, as in an assignment statement. For example, if you pass a value of type int to a function that expects a parameter of type double, the integer is converted into the equivalent floating-point value before it is copied into the parameter variable.
4. The statements in the function body are executed until the program encounters a return statement or there are no more statements to execute.
5. The value of the return expression, if any, is evaluated and returned as the value of the function. If the value being returned does not precisely match the result type declared for the function, a type conversion is performed. Thus, if a return statement specifies a floating-point value in a function defined to return an int, the result is truncated to an integer.
6. The stack frame created for this function call is discarded. In the process, all local variables disappear.
7. The calling program continues, with the returned value substituted in place of the call.

Although this process may seem to make sense, you probably need to work through an example or two before you understand it thoroughly. Reading through the example in the next section will give you some insight into the process, but it is probably even more helpful to take one of your own programs and walk through it at the same level of detail. And while you can trace through a program on paper or a whiteboard, it may be better to get yourself a supply of 3 × 5 index cards and then use a card to represent each stack frame. The advantage of the index-card model is that you can create a stack of index cards that closely models the operation of the computer. Calling a function adds a card; returning from the function removes it.

The combinations function

Suppose that you have a collection of six coins, which in the United States might be a penny, a nickel, a dime, a quarter, a half-dollar, and a dollar. Given those six coins, how many ways are there to choose two of them? As you can see from the full enumeration of the possibilities in Figure 2-1, the answer is 15. As a computer scientist, you should immediately think about the more general question: Given a set containing n distinct elements, how many ways can you choose a subset with k elements?
The answer to that question is well known in mathematics and is computed by the combinations function C(n, k), which is defined as follows:

$$C(n, k) = \frac{n!}{k! \times (n - k)!}$$

where the exclamation point indicates the factorial function, which is simply the product of the integers between 1 and the specified number, inclusive.

[Figure 2-1. Illustration of the combinations function: if you start with six coins, there are 15 ways to choose two coins.]

The code to compute the combinations function in C++ appears in Figure 2-2, along with a main program that requests values of n and k from the user and then displays the value of the function C(n, k). A sample run of the program might look like this:

    Enter the number of objects (n): 6
    Enter the number to be chosen (k): 2
    C(n, k) = 15

As you can see from Figure 2-2, the Combinations program is divided into three functions. The main function handles the interaction with the user. The combinations function computes C(n, k). Finally, the fact function, which is borrowed from Chapter 1, computes the factorials required for the computation.

FIGURE 2-2 Program to compute the combinations function

    /*
     * File: Combinations.cpp
     * ----------------------
     * This program computes the mathematical function C(n, k) from
     * its mathematical definition in terms of factorials.
     */

    #include <iostream>
    using namespace std;

    /* Function prototypes */

    int combinations(int n, int k);
    int fact(int n);

    /* Main program */

    int main() {
       int n, k;
       cout << "Enter the number of objects (n): ";
       cin >> n;
       cout << "Enter the number to be chosen (k): ";
       cin >> k;
       cout << "C(n, k) = " << combinations(n, k) << endl;
       return 0;
    }

    /*
     * Function: combinations(n, k)
     * Usage: int nWays = combinations(n, k);
     * --------------------------------------
     * Returns the mathematical combinations function C(n, k), which is
     * the number of ways one can choose k elements from a set of size n.
     */

    int combinations(int n, int k) {
       return fact(n) / (fact(k) * fact(n - k));
    }

    /*
     * Function: fact(n)
     * Usage: int result = fact(n);
     * ----------------------------
     * Returns the factorial of n, which is the product of all the
     * integers between 1 and n, inclusive.
     */

    int fact(int n) {
       int result = 1;
       for (int i = 1; i <= n; i++) {
          result *= i;
       }
       return result;
    }

Tracing the combinations function

While the Combinations program can be interesting in its own right, the purpose of this example is to illustrate the steps involved in executing functions. In C++, all programs begin by making a call to the function main. To implement a function call, the system (which encompasses both the operating system you’re using and the hardware on which it runs) creates a new stack frame to keep track of the local variables that function declares. In the Combinations program, the main function declares two integers, n and k, so the stack frame must include space for these variables.

In the diagrams in this book, each stack frame appears as a rectangle surrounded by a double line. The frame rectangle shows the code for the function along with a pointing-hand icon that makes it easy to keep track of the current execution point. It also contains labeled boxes for each of the local variables.
The stack frame for main therefore looks like this when execution begins:

[Stack frame diagram: the frame for main, with the execution point at the first cout statement; the boxes for n and k are still empty.]

From this point, the system executes the statements in order, printing out the prompts on the console, reading in the associated values from the user, and storing those values in the variables in that frame. If the user enters the values shown in the earlier sample run, the frame will look like this when it reaches the statement that displays the result:

[Stack frame diagram: the frame for main, with the execution point at the statement that displays C(n, k); n = 6 and k = 2.]

Before the program can complete the output line, it has to evaluate the call to combinations(n, k). At this point, the function main is calling the function combinations, which means that the computer has to go through all the steps that are required in making a function call.

The first step is to evaluate the arguments in the context of the current frame. The variable n has the value 6, and the variable k has the value 2. These two arguments are then copied into the parameter variables n and k when the computer creates the combinations stack frame. The new frame gets stacked on top of the old one, which allows the computer to remember the values of the local variables in main, even though they are not currently accessible.
The situation after creating the new frame and initializing the parameter variables looks like this:

[Stack frame diagram: the frame for combinations stacked on top of the frame for main, with the execution point at the return statement; n = 6 and k = 2.]

To compute the value of the combinations function, the program must make three calls to the function fact. In C++, those function calls can happen in any order, but it’s easiest to process them from left to right. The first call, therefore, is the call to fact(n). To evaluate this function, the system must create yet another stack frame, this time for the function fact with an argument value of 6:

[Stack frame diagram: the frame for fact stacked on top of the frames for combinations and main; n = 6, and the boxes for result and i are still empty.]

Unlike the earlier stack frames, the frame for fact includes both parameters and local variables. The parameter n is initialized to the value of the calling argument and therefore has the value 6. The two local variables, i and result, have not yet been initialized, but the system nonetheless needs to reserve space in the frame for those variables. Until you assign a new value to those variables, they will contain whatever data happened to be left over in the memory cells assigned to that stack frame, which is completely unpredictable. It is therefore important to initialize all local variables before you use them, ideally as part of the declaration.

The system then executes the statements in the function fact. In this instance, the body of the for loop will be executed six times.
On each cycle, the value of result is multiplied by the loop index i, which means that it will eventually hold the value 720 (1 × 2 × 3 × 4 × 5 × 6 or 6!). When the program reaches the return statement, the stack frame looks like this:

[Stack frame diagram: the frame for fact on top of the frames for combinations and main, with the execution point at the return statement; n = 6, result = 720, and the box for i is empty.]

In this diagram, the box for the variable i is empty, because the value of i is no longer defined at this point in the program. In C++, index variables declared in a for loop header are accessible only inside the loop body. Showing an empty box emphasizes the fact that the value of i is no longer available.

Returning from a function involves copying the value of the return expression (in this case the local variable result) into the point at which the call occurred. The frame for fact is then discarded, which leads to the following configuration:

[Stack frame diagram: the frame for combinations on top of the frame for main, with the value 720 filled in for the call fact(n); n = 6 and k = 2.]

The next step in the process is to make a second call to fact, this time with the argument k. In the calling frame, k has the value 2.
That value is then used to initialize the parameter n in the new stack frame, as follows:

[Stack frame diagram: a new frame for fact on top of the frames for combinations and main; n = 2, and the boxes for result and i are still empty.]

The computation of fact(2) is a bit easier to perform in one’s head than the earlier call to fact(6). This time around, the value of result will be 2, which is then returned to the calling frame, like this:

[Stack frame diagram: the frame for combinations on top of the frame for main, with 720 filled in for fact(n) and 2 filled in for fact(k); n = 6 and k = 2.]

The code for combinations makes one more call to fact, this time with the argument n - k. As before, this call creates a new frame with n equal to 4:

[Stack frame diagram: a new frame for fact on top of the frames for combinations and main; n = 4, and the boxes for result and i are still empty.]

The value of fact(4) is 1 × 2 × 3 × 4, or 24.
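Because the three calls to fact may happen in any order, a sketch like the following can record the order that a particular compiler happens to choose. The global vector callLog is an addition made only for this experiment and is not part of the Combinations program.

```cpp
#include <vector>

// Hypothetical log, added only to observe the order of the calls to fact.
std::vector<int> callLog;

int fact(int n) {
   callLog.push_back(n);          // record the argument of each call
   int result = 1;
   for (int i = 1; i <= n; i++) {
      result *= i;
   }
   return result;
}

int combinations(int n, int k) {
   // The order in which these three calls are evaluated is up to the compiler.
   return fact(n) / (fact(k) * fact(n - k));
}
```

After a call to combinations(6, 2), callLog holds the values 6, 2, and 4 in whatever order the calls were evaluated, and the result is 15 in every case.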
When the call to fact(4) returns, the system is able to fill in the last of the missing values in the calculation, as follows:

[Stack frame diagram: the frame for combinations on top of the frame for main, with 720, 2, and 24 filled in for the three calls to fact; n = 6 and k = 2.]

The computer then divides 720 by the product of 2 and 24 to get the answer 15. This value is then returned to the main function, which leads to the following state:

[Stack frame diagram: the frame for main, with the value 15 substituted for the call to combinations(n, k); n = 6 and k = 2.]

From here, all that remains is to generate the output line and return from the main function, which completes the execution of the program.

2.5 Reference parameters

In C++, whenever you pass a simple variable from one function to another, the function gets a copy of the calling value. Assigning a new value to the parameter as part of the function changes the local copy but has no effect on the calling argument. For example, if you try to implement a procedure that initializes a variable to zero using the code

    void setToZero(int var) {
       var = 0;
    }

that procedure ends up having no effect whatever. If you call

    setToZero(x);

the parameter var is initialized to a copy of whatever value is stored in x. The assignment statement

    var = 0;

inside the function sets the local copy to 0 but leaves x unchanged in the calling program.

If you want to change the value of the calling argument (and there are often compelling reasons for doing so), you can change the parameter from the usual kind of C++ parameter (which is called a value parameter) into a reference parameter by adding an ampersand between the type and the name in the function header.
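A minimal sketch contrasting the two kinds of parameter appears below. The names setToZeroByValue and setToZeroByReference are hypothetical, chosen only so that both versions can coexist in one file.

```cpp
// Value parameter: var is a copy, so the caller's variable is untouched.
void setToZeroByValue(int var) {
   var = 0;
}

// Reference parameter: the ampersand makes var refer to the caller's variable.
void setToZeroByReference(int & var) {
   var = 0;
}
```

If x holds 42, calling setToZeroByValue(x) leaves x equal to 42, whereas calling setToZeroByReference(x) changes x to 0.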
Unlike value parameters, reference parameters are not copied. What happens instead is that the function receives a reference to the original variable, which means that the memory used for that variable is shared between the function and its caller. The new version of setToZero looks like this:

    void setToZero(int & var) {
       var = 0;
    }

This style of parameter passing is known as call by reference. When you use call by reference, the argument corresponding to the reference parameter must be an assignable value, such as a variable name. Although calling setToZero(x) would correctly set the integer variable x to 0, it would be illegal to call setToZero(3) because 3 is not assignable.

In C++, one of the most common uses of call by reference occurs when a function needs to return more than one value to the calling program. A single result can easily be returned as the value of the function itself. If you need to return more than one result from a function, the return value is no longer appropriate. The standard approach to solving the problem is to turn that function into a procedure and pass values back and forth through the argument list.

As an example, suppose that you are writing a program to solve the quadratic equation

$$ax^2 + bx + c = 0$$

Because of your commitment to good programming style, you want to structure that program into three phases as illustrated in the following flowchart:

1. Input phase: Accept values of the coefficients from the user.
2. Computation phase: Solve the quadratic equation for those coefficients.
3. Output phase: Display the roots of the equation on the screen.

The Quadratic program in Figure 2-3 shows how call by reference makes it possible to decompose the quadratic equation problem in this way. Each of the functions in Figure 2-3 corresponds to one of the phases in the flowchart. The main program provides information to the function using conventional parameters.
Whenever a function needs to get information back to the main program, it uses reference parameters. The solveQuadratic function uses parameters of each type. The parameters a, b, and c are used as input to the function and give it access to the three coefficients. The parameters x1 and x2 are output parameters and allow the program to pass back the two roots of the quadratic equation.

The Quadratic program also introduces a new strategy for reporting errors. Whenever the code encounters a condition that makes further progress impossible, it calls a function named error that prints a message indicating the nature of the problem and then terminates the execution of the program. The code for error looks like this:

    void error(string msg) {
       cerr << msg << endl;
       exit(EXIT_FAILURE);
    }

The code for error uses two features of C++ that have not yet made their appearance in this book: the cerr output stream and the exit function. The cerr stream is similar to cout, but is reserved for reporting errors. The exit function terminates the execution of the main program immediately, using the value of the parameter to report the program status. The constant EXIT_FAILURE is defined in the <cstdlib> library and is used to indicate that some kind of failure occurred.

FIGURE 2-3 Program to solve the quadratic equation

    /*
     * File: Quadratic.cpp
     * -------------------
     * This program finds roots of the quadratic equation
     *
     *    a x^2 + b x + c = 0
     *
     * If a is 0 or if the equation has no real roots, the
     * program prints an error message and exits.
     */

    #include <cmath>
    #include <cstdlib>
    #include <iostream>
    #include <string>
    using namespace std;

    /* Function prototypes */

    void getCoefficients(double & a, double & b, double & c);
    void solveQuadratic(double a, double b, double c,
                        double & x1, double & x2);
    void printRoots(double x1, double x2);
    void error(string msg);

    /* Main program */

    int main() {
       double a, b, c, r1, r2;
       getCoefficients(a, b, c);
       solveQuadratic(a, b, c, r1, r2);
       printRoots(r1, r2);
       return 0;
    }

    /*
     * Function: getCoefficients
     * Usage: getCoefficients(a, b, c);
     * --------------------------------
     * Reads in the coefficients of a quadratic equation into the
     * reference parameters a, b, and c.
     */

    void getCoefficients(double & a, double & b, double & c) {
       cout << "Enter coefficients for the quadratic equation:" << endl;
       cout << "a: ";
       cin >> a;
       cout << "b: ";
       cin >> b;
       cout << "c: ";
       cin >> c;
    }

    /*
     * Function: solveQuadratic
     * Usage: solveQuadratic(a, b, c, x1, x2);
     * ---------------------------------------
     * Solves a quadratic equation for the coefficients a, b, and c.  The
     * roots are returned in the reference parameters x1 and x2.
     */

    void solveQuadratic(double a, double b, double c,
                        double & x1, double & x2) {
       if (a == 0) error("The coefficient a must be nonzero.");
       double disc = b * b - 4 * a * c;
       if (disc < 0) error("This equation has no real roots.");
       double sqrtDisc = sqrt(disc);
       x1 = (-b + sqrtDisc) / (2 * a);
       x2 = (-b - sqrtDisc) / (2 * a);
    }

    /*
     * Function: printRoots
     * Usage: printRoots(x1, x2);
     * --------------------------
     * Displays x1 and x2, which are the roots of the quadratic equation.
     */

    void printRoots(double x1, double x2) {
       if (x1 == x2) {
          cout << "There is a double root at " << x1 << endl;
       } else {
          cout << "The roots are " << x1 << " and " << x2 << endl;
       }
    }

    /*
     * Function: error
     * Usage: error(msg);
     * ------------------
     * Writes the string msg to the cerr stream and then exits the program
     * with a standard status value indicating that a failure has occurred.
     */

    void error(string msg) {
       cerr << msg << endl;
       exit(EXIT_FAILURE);
    }

The error function, of course, would be useful in many applications beyond the one that solves quadratic equations. Although you could easily copy the code into another program, it would make much more sense to save the error function in a library. In the following section, you’ll have a chance to do precisely that.

2.6 Interfaces and implementations

When you define a library in C++, you need to supply two parts. First, you must define the interface, which provides the information clients need to use the library but leaves out the details about how the library works. Second, you need to provide the implementation, which specifies the underlying details. A typical interface will export several definitions, which are typically functions, types, or constants. Each individual definition is called an interface entry.

In C++, the interface and implementation are usually written as two separate files. The name of the interface file ends with .h, which marks it as a header file. The implementation appears in a file with the same root, but with the suffix .cpp. Following this convention, the error library is defined in the file error.h and implemented in the file error.cpp.

Defining the error library

The contents of the error.h interface appear in Figure 2-4. As you can see, the file consists mostly of comments.
The only other parts of the interface are the prototype for the error function and three additional lines that are often referred to as interface boilerplate, which is a patterned set of lines that appears in every interface. In this interface, the boilerplate consists of the #ifndef and #define lines at the beginning of the interface and the matching #endif line at the end. These lines make sure that the compiler doesn’t compile the same interface twice.

Figure 2-4. Interface for the error library

    /*
     * File: error.h
     * -------------
     * This file defines a simple function for reporting errors.
     */

    #ifndef _error_h
    #define _error_h

    /*
     * Function: error
     * Usage: error(msg);
     * ------------------
     * Writes the string msg to the cerr stream and then exits the program
     * with a standard status code indicating failure.  The usual pattern for
     * using error is to enclose the call to error inside an if statement that
     * checks for a particular condition, which might look something like this:
     *
     *    if (divisor == 0) error("Division by zero");
     */

    void error(std::string msg);

    #endif

The #ifndef directive checks whether the _error_h symbol has been defined. When the compiler reads this interface for the first time, the answer is no. The next line, however, defines that symbol. Thus, if the compiler is later asked to read the same interface, the _error_h symbol will already be defined, and the compiler will skip over the contents of the interface this time around. This book uses the convention of checking for a symbol that begins with an underscore followed by the name of the interface file after replacing the dot with a second underscore.

The prototype for error, however, looks slightly different when it appears in interface form. The parameter declaration now uses the type name std::string to indicate that string comes from the std namespace.
Interfaces are typically read before the using namespace std line and therefore cannot use identifiers from that namespace without explicitly including the std:: qualifier.

The error.cpp implementation appears in Figure 2-5. Comments in the implementation are intended for programmers responsible for maintaining the library and are often less extensive than those in the interface. Here, the body of the error function is all of two lines long, and the purpose of each of those lines is instantly recognizable to any C++ programmer.

Figure 2-5. Implementation of the error library

    /*
     * File: error.cpp
     * ---------------
     * This file implements the error.h interface.
     */

    #include <cstdlib>
    #include <iostream>
    #include <string>
    #include "error.h"
    using namespace std;

    /*
     * Implementation notes: error
     * ---------------------------
     * This function writes out the error message to the cerr stream and
     * then exits the program.  The EXIT_FAILURE constant is defined in
     * <cstdlib> to represent a standard failure code.
     */

    void error(string msg) {
        cerr << msg << endl;
        exit(EXIT_FAILURE);
    }

Exporting types

The error.h interface described in the preceding section exports a single function and nothing else. Most of the interfaces you will use in C++ also export data types. Most of these types will be classes, which are the foundation of the object-oriented type system that C++ provides. Given that you won’t learn how to define your own classes until Chapter 6, it is premature to introduce class-based examples at this point in the text. What does make sense is to create an interface that exports one of the enumerated types introduced in Chapter 1, such as the Direction type used to encode the four standard compass points:

    enum Direction { NORTH, EAST, SOUTH, WEST };

The simplest approach to making this type accessible through a library interface would be to write a direction.h interface that contained only this line along with the usual interface boilerplate.
If you were to adopt that strategy, you wouldn’t need to supply an implementation at all. It is more useful, however, to have this interface export some simple functions for working with Direction values. For example, it would be useful to export the directionToString function defined on page 40, which returns the name of a direction given the enumerated value. For some of the programs that appear later in this book, it will be useful to have functions leftFrom and rightFrom that return the Direction value that results from turning 90 degrees in the specified direction, so that, for example, leftFrom(NORTH) would return WEST. If you add these functions to the direction.h interface, you will also need to supply a direction.cpp file to implement those functions. Those files appear in Figures 2-6 and 2-7.

Figure 2-6. Interface for the direction library

    /*
     * File: direction.h
     * -----------------
     * This interface exports an enumerated type called Direction whose
     * elements are the four compass points: NORTH, EAST, SOUTH, and WEST.
     */

    #ifndef _direction_h
    #define _direction_h

    #include <string>

    /*
     * Type: Direction
     * ---------------
     * This enumerated type is used to represent the four compass directions.
     */

    enum Direction { NORTH, EAST, SOUTH, WEST };

    /*
     * Function: leftFrom
     * Usage: Direction newdir = leftFrom(dir);
     * ----------------------------------------
     * Returns the direction that is to the left of the argument.
     * For example, leftFrom(NORTH) returns WEST.
     */

    Direction leftFrom(Direction dir);

    /*
     * Function: rightFrom
     * Usage: Direction newdir = rightFrom(dir);
     * -----------------------------------------
     * Returns the direction that is to the right of the argument.
     * For example, rightFrom(NORTH) returns EAST.
     */

    Direction rightFrom(Direction dir);

    /*
     * Function: directionToString
     * Usage: string str = directionToString(dir);
     * -------------------------------------------
     * Returns the name of the direction as a string.
     */

    std::string directionToString(Direction dir);

    #endif

Figure 2-7. Implementation of the direction library

    /*
     * File: direction.cpp
     * -------------------
     * This file implements the direction.h interface.
     */

    #include <string>
    #include "direction.h"
    using namespace std;

    /*
     * Implementation notes: leftFrom, rightFrom
     * -----------------------------------------
     * These functions use the remainder operator to cycle through the
     * internal values of the enumeration type.  Note that the leftFrom
     * function cannot subtract 1 from the direction because the result
     * might then be negative; adding 3 achieves the same effect but
     * ensures that the values remain positive.
     */

    Direction leftFrom(Direction dir) {
        return Direction((dir + 3) % 4);
    }

    Direction rightFrom(Direction dir) {
        return Direction((dir + 1) % 4);
    }

    /*
     * Implementation notes: directionToString
     * ---------------------------------------
     * Most C++ compilers require the default clause to make sure that this
     * function always returns a string, even if the direction is not one
     * of the legal values.
     */

    string directionToString(Direction dir) {
        switch (dir) {
            case NORTH: return "NORTH";
            case EAST: return "EAST";
            case SOUTH: return "SOUTH";
            case WEST: return "WEST";
            default: return "???";
        }
    }

The implementations of leftFrom and rightFrom require some subtlety that is worth a little explanation. While C++ lets you freely convert an enumerated value to an integer, conversions in the opposite direction, from an integer to the corresponding value of an enumerated type, require a type cast. This fact is illustrated by the implementation of rightFrom, which looks like this:

    Direction rightFrom(Direction dir) {
        return Direction((dir + 1) % 4);
    }

The arithmetic operators in the expression (dir + 1) % 4 automatically convert the Direction value to its underlying representation as an integer: 0 for NORTH, 1 for EAST, 2 for SOUTH, and 3 for WEST.
Turning right from one of these compass points corresponds to adding one to the underlying value, with the exception of turning right from WEST, when it is necessary to cycle back to the value 0 to indicate the direction NORTH. As is so often the case in situations that involve cyclical structures, using the % operator eliminates the need for special-case testing. Before it returns, however, the function must use a type cast to convert the result of the arithmetic expression to the Direction value that rightFrom is defined to return.

As a cautionary reminder about the dangers of using % with negative numbers, the leftFrom function cannot be defined as

    Direction leftFrom(Direction dir) {
        return Direction((dir - 1) % 4);
    }

The problem with this implementation arises when you call leftFrom(NORTH). When dir has the value NORTH, the expression (dir - 1) % 4 has the value -1 on most machines, which is not a legal Direction. Fortunately, it is easy to fix the problem by coding leftFrom like this:

    Direction leftFrom(Direction dir) {
        return Direction((dir + 3) % 4);
    }

Exporting constant definitions

In addition to functions and types, interfaces often export constant definitions so that several clients can share that constant without redefining it in every source file. If, for example, you are writing programs that involve geometrical calculations, it is useful to have a definition of the mathematical constant π, which, given the usual conventions for constants, would presumably be named PI. If you declare PI as a constant in the fashion introduced in Chapter 1, you would write

    const double PI = 3.14159265358979323846;

In C++, constants written in this form are private to the source file that contains them and cannot be exported through an interface. To export the constant PI, you need to add the keyword extern to both its definition and the prototype declaration in the interface.
This strategy is illustrated by the gmath.h interface in Figure 2-8, which exports PI along with some simple functions that simplify working with angles in degrees. The corresponding implementation appears in Figure 2-9.

Figure 2-8. Simplified interface for the gmath library

    /*
     * File: gmath.h
     * -------------
     * This file exports the constant PI along with a few degree-based
     * trigonometric functions, which are typically easier to use.
     */

    #ifndef _gmath_h
    #define _gmath_h

    /* Constants */

    extern const double PI;     /* The mathematical constant pi */

    /*
     * Function: sinDegrees
     * Usage: double sine = sinDegrees(angle);
     * ---------------------------------------
     * Returns the trigonometric sine of angle expressed in degrees.
     */

    double sinDegrees(double angle);

    /*
     * Function: cosDegrees
     * Usage: double cosine = cosDegrees(angle);
     * -----------------------------------------
     * Returns the trigonometric cosine of angle expressed in degrees.
     */

    double cosDegrees(double angle);

    /*
     * Function: toDegrees
     * Usage: double degrees = toDegrees(radians);
     * -------------------------------------------
     * Converts an angle from radians to degrees.
     */

    double toDegrees(double radians);

    /*
     * Function: toRadians
     * Usage: double radians = toRadians(degrees);
     * -------------------------------------------
     * Converts an angle from degrees to radians.
     */

    double toRadians(double degrees);

    #endif

Figure 2-9. Implementation of the gmath library

    /*
     * File: gmath.cpp
     * ---------------
     * This file implements the gmath.h interface.  In all cases, the
     * implementation for each function requires only one line of code,
     * which makes detailed documentation unnecessary.
     */

    #include <cmath>
    #include "gmath.h"

    extern const double PI = 3.14159265358979323846;

    double sinDegrees(double angle) {
        return sin(toRadians(angle));
    }

    double cosDegrees(double angle) {
        return cos(toRadians(angle));
    }

    double toDegrees(double radians) {
        return radians * 180 / PI;
    }

    double toRadians(double degrees) {
        return degrees * PI / 180;
    }

The Stanford library collection

From time to time, this book introduces new library interfaces like error.h, direction.h, and gmath.h that are worth keeping around for future use. These interfaces appear in Appendix A, often in a form that includes more features than those described explicitly in the text. Those interfaces are also available for downloading from the web site associated with this book, along with a compiled version of the libraries you can use to create new applications. It can take a little time to figure out how to use the Stanford libraries in the programming environment you’re using, but that effort will quickly repay itself many times over.

2.7 Principles of interface design

One of the reasons that programming is difficult is that programs reflect the complexity of the underlying application. As long as computers are used to solve problems of ever-increasing sophistication, the process of programming will of necessity become more sophisticated as well.

Writing a program to solve a large or difficult problem forces you to manage a staggering amount of complexity. There are algorithms to design, special cases to consider, user requirements to meet, and innumerable details to get right. To make programming manageable, you must reduce the complexity of the programming process as much as possible.

Functions reduce some of the complexity; libraries offer a similar reduction in programming complexity but at a higher level of detail. A function gives its caller access to a set of steps that together implement a single operation.
A library gives its client access to a set of functions and types that implement what computer scientists describe as a programming abstraction. The extent to which a particular abstraction simplifies the programming process, however, depends on how well you have designed its interface.

To design an effective interface, you must balance several criteria. In general, you should try to develop interfaces that are

  • Unified. A single interface should define a consistent abstraction with a clear unifying theme. If a function does not fit within that theme, it should be defined in a separate interface.

  • Simple. To the extent that the underlying implementation is itself complex, the interface must seek to hide that complexity from the client.

  • Sufficient. When clients use an abstraction, the interface must provide sufficient functionality to meet their needs. If some critical operation is missing from an interface, clients may decide to abandon it and develop their own, more powerful abstraction. As important as simplicity is, the designer must avoid simplifying an interface to the point that it becomes useless.

  • General. A well-designed interface should be flexible enough to meet the needs of many different clients. An interface that performs a narrowly defined set of operations for one client is not as useful as one that can be used in many different situations.

  • Stable. The functions defined in an interface should continue to have precisely the same structure and effect, even if the underlying implementation changes. Making changes in the behavior of an interface forces clients to change their programs, which compromises the value of the interface.

The sections that follow discuss each of these criteria in detail.

The importance of a unifying theme

    Unity gives strength.
    -- Aesop, The Bundle of Sticks, ~600 BCE

A central feature of a well-designed interface is that it presents a unified and consistent abstraction.
In part, this criterion implies that the functions within a library should be chosen so that they reflect a coherent theme. Thus, the <cmath> library exports mathematical functions, the <iostream> library exports the cin, cout, and cerr streams along with the operators that perform input and output, and the error.h interface introduced in section 2.6 exports a function for reporting errors. Each interface entry exported by these libraries fits the purpose of that interface. For example, you would not expect to find sqrt in the <string> library since it fits much more naturally into the framework of the <cmath> library.

The principle of a unifying theme also influences the design of the functions within a library interface. The functions within an interface should behave in as consistent a way as possible. For example, all the functions in the <cmath> library measure angles in radians. If the implementer of the library had decided to add a function that required a different unit of measurement, clients would have to remember what units to use for each function.

Simplicity and the principle of information hiding

    Embrace simplicity.
    -- Lao-tzu, The Way of Lao-tzu, ~550 BCE

Because a primary goal of using interfaces is to reduce the complexity of the programming process, it makes sense that simplicity is a desirable criterion in the design of an interface. In general, an interface should be as easy to use as possible. The underlying implementation may perform extremely intricate operations, but the client should be able to think about those operations in a simple, more abstract way.

To a certain extent, an interface acts as a reference guide to a particular library abstraction. When you want to know how to use the error function, you consult the error.h interface to find out how to do so. The interface contains precisely the information that you, as a client, need to know, and no more.
For clients, getting too much information can be as bad as getting too little, because additional detail is likely to make the interface more difficult to understand. Often, the real value of an interface lies not in the information it reveals but rather in the information it hides.

When you design an interface, you should try to protect the client from as many of the complicating details of the implementation as possible. In that respect, it is perhaps best to think of an interface not primarily as a communication channel between the client and the implementation, but instead as a wall that divides them.

[Diagram: the interface drawn as a wall between the client and the implementation]

Like the wall that divided the lovers Pyramus and Thisbe in Greek mythology, the wall representing an interface contains a chink that allows the client and the implementation to communicate. The main purpose of the wall, however, is to keep the two sides apart. Because we conceive of it as lying at the border of the abstraction represented by the library, an interface is sometimes called an abstraction boundary. Ideally, all the complexity involved in the realization of a library lies on the implementation side of the wall. The interface is successful if it keeps that complexity away from the client side. Keeping details confined to the implementation domain is called information hiding.

The principle of information hiding has important practical implications for interface design. When you write an interface, you should be sure you don’t reveal details of implementation, even in the commentary. Especially if you are writing an interface and an implementation at the same time, you may be tempted to describe in your interface all the clever ideas you used in the implementation. Try to resist that temptation. The interface is written for the benefit of the client and should contain only what the client needs to know.
Similarly, you should design the functions in an interface so that they are as simple as possible. If you can reduce the number of arguments or find a way to eliminate confusing special cases, it will be easier for the client to understand how to use those functions. Moreover, it is good practice to limit the total number of functions exported by an interface, so that the client does not become lost in a mass of functions, unable to make sense of the whole.

Meeting the needs of your clients

    Everything should be as simple as possible, but no simpler.
    -- attributed to Albert Einstein

Simplicity is only part of the story. You can easily make an interface simple just by throwing away any parts of it that are hard or complicated. There is a good chance you will also make the interface useless. Sometimes clients need to perform tasks that have some inherent complexity. Denying your clients the tools they require just to make the interface simpler is not an effective strategy. Your interface must provide sufficient functionality to serve the clients’ needs. Learning to strike the right balance between simplicity and completeness in interface design is one of the fundamental challenges in programming.

In many cases, the clients of an interface are concerned not only with whether a particular function is available but also with the efficiency of its implementation. For example, if you are developing a system for air-traffic control and need to call functions provided by a library interface, those functions must return the correct answer quickly. Late answers may be just as devastating as wrong answers.

For the most part, efficiency is a concern for the implementation rather than the interface. Even so, you will often find it valuable to think about implementation strategies while you are designing the interface itself. Suppose, for example, that you are faced with a choice of two designs.
If you determine that one of them would be much easier to implement efficiently, it makes sense (assuming there are no compelling reasons to the contrary) to choose that design.

The advantages of general tools

    Give us the tools and we will finish the job.
    -- Winston Churchill, BBC broadcast, 1941

An interface that is perfectly adapted to a particular client’s needs may not be useful to others. A good library abstraction serves the needs of many different clients. To do so, it must be general enough to solve a wide range of problems and not be limited to one highly specific purpose. By choosing a design that offers flexibility, you can create interfaces that are widely used.

The desire to ensure that an interface remains general has an important practical implication. When you are writing a program, you will often discover that you need a particular tool. If you decide that the tool is important enough to go into a library, you then need to change your mode of thought. When you design the interface for that library, you have to forget about the original application and instead design your interface for a more general audience.

The value of stability

    People change and forget to tell each other. Too bad -- causes so many mistakes.
    -- Lillian Hellman, Toys in the Attic, 1959

Interfaces have another property that makes them critically important to programming: they tend to be stable over long periods of time. Stable interfaces can dramatically simplify the problem of maintaining large programming systems by establishing clear boundaries of responsibility. As long as the interface does not change, both implementers and clients are relatively free to make changes on their own side of the abstraction boundary.

For example, suppose that you are the implementer of the math library. In the course of your work, you discover a clever new algorithm for calculating the sqrt function that cuts in half the time required to calculate a square root.
If you can say to your clients that you have a new implementation of sqrt that works just as it did before, only faster, they will probably be pleased. If, on the other hand, you were to say that the name of the function had changed or that its use involved certain new restrictions, your clients would be justifiably annoyed. To use your “improved” implementation of square root, they would be forced to change their programs. Changing programs is a time-consuming, error-prone activity, and many clients would happily give up the extra efficiency for the convenience of being able to leave their programs alone.

Interfaces, however, simplify the task of program maintenance only if they remain stable. Programs change frequently as new algorithms are discovered or as the requirements of applications change. Throughout such evolution, however, the interfaces must remain as constant as possible. In a well-designed system, changing the details of an implementation is a straightforward process. The complexity involved in making that change is localized on the implementation side of the abstraction boundary. On the other hand, changing an interface often produces a global upheaval that requires changing every program that depends on it. Thus, interface changes should be undertaken very rarely and then only with the active participation of clients.

Some interface changes, however, are more drastic than others. For example, adding an entirely new function to an interface is usually a relatively straightforward process, since no clients already depend on that function. Changing an interface in such a way that existing programs will continue to run without modification is called extending the interface. If you find that you need to make evolutionary changes over the lifetime of an interface, it is usually best to make those changes by extension.
2.8 Designing a random number library

The easiest way to illustrate the principles of interface design is to go through a simple design exercise. To this end, this section goes through the design process leading up to the creation of the random.h interface in the Stanford libraries, which makes it possible to write programs that make seemingly random choices. Being able to simulate random behavior is necessary, for example, if you want to write a computer game that involves flipping a coin or rolling a die, but it is also useful in more practical contexts. Programs that simulate such random events are called nondeterministic programs.

Getting a computer to behave in a random way involves a certain amount of complexity. For the benefit of client programmers, you want to hide this complexity behind an interface. In this section, you will have the opportunity to focus your attention on that interface from several different perspectives: that of the interface designer, the implementer, and the client.

Random versus pseudorandom numbers

Partly because early computers were used primarily for numerical applications, the idea of generating randomness using a computer is often expressed in terms of being able to generate a random number in a particular range. From a theoretical perspective, a number is random if there is no way to determine in advance what value it will have among a set of equally probable possibilities. For example, rolling a die generates a random number between 1 and 6. If the die is fair, there is no way to predict which number will come up. The six possible values are equally likely.

Although the idea of a random number makes intuitive sense, it is a difficult notion to represent inside a computer. Computers operate by following a sequence of instructions in memory and therefore function in a deterministic mode. How is it possible to generate unpredictable results by following a deterministic set of rules?
If a number is the result of a deterministic process, any user should be able to work through that same set of rules and anticipate the computer’s response.

Yet computers do in fact use a deterministic procedure to generate what we call random numbers. This strategy works because, even though the user could, in theory, follow the same set of rules and anticipate the computer’s response, no one actually bothers to do so. In most practical applications, it doesn’t matter if the numbers are truly random; all that matters is that the numbers appear to be random. For numbers to appear random, they should (1) behave like random numbers from a statistical point of view and (2) be sufficiently difficult to predict in advance that no user would bother. “Random” numbers generated by an algorithmic process inside a computer are referred to as pseudorandom numbers to underscore the fact that no truly random activity is involved.

Pseudorandom numbers in the standard libraries

The <cstdlib> library exports a low-level function called rand that produces pseudorandom numbers. The prototype for rand is

    int rand();

which indicates that rand takes no arguments and returns an integer. Each call to rand produces a new value that is difficult for clients to predict and therefore appears to be random. The result of rand is guaranteed to be nonnegative and no larger than the constant RAND_MAX, which is also defined in <cstdlib>. Thus, each time you call rand, it returns an integer between 0 and RAND_MAX, inclusive.

If you want to get a feel for how the rand function works, one strategy is to write a simple program to test it. The RandTest program in Figure 2-10 displays the value of RAND_MAX and then prints out the result of calling rand ten times.

Figure 2-10. Program to test the rand function

    /*
     * File: RandTest.cpp
     * ------------------
     * This program tests the random number generator in C++ and produces
     * the values used in the examples in the text.
     */

    #include <iostream>
    #include <iomanip>
    #include <cstdlib>
    using namespace std;

    const int N_TRIALS = 10;

    int main() {
        cout << "On this computer, RAND_MAX is " << RAND_MAX << endl;
        cout << "The first " << N_TRIALS << " calls to rand:" << endl;
        for (int i = 0; i < N_TRIALS; i++) {
            cout << setw(10) << rand() << endl;
        }
        return 0;
    }

A sample run of this program looks like this:

    On this computer, RAND_MAX is 2147483647
    The first 10 calls to rand:
    1103527590
     377401575
     662824084
    1147902781
    2035015474
     368800899
    1508029952
     486256185
    1062517886
     267834847

As you can see, the value of rand is never negative and never larger than the value of RAND_MAX. The values, moreover, appear to jump around unpredictably within that range, which is exactly what you want from a pseudorandom process.

Given that the C++ libraries include a function for generating pseudorandom numbers, it is reasonable to ask why one would bother to develop a new interface to support this process. The answer, in part, is that the rand function itself does not return a value that clients are likely to use directly. For one thing, the value of RAND_MAX depends on the hardware and software environment. On many systems, it is defined to be the largest positive value of type int, which is typically 2,147,483,647, but it may have different values on different systems; the standard guarantees only that it is at least 32,767. Moreover, even if you could count on RAND_MAX having that particular value, there are few (if any) applications where what you need is a random number between 0 and 2,147,483,647. As a client, you are much more likely to want values that fall into some other range, usually much smaller. For example, if you are trying to simulate flipping a coin, you want a function that has only two outcomes: heads and tails. Similarly, if you are trying to represent rolling a die, you need to produce a random integer between 1 and 6.
If you are trying to simulate processes in the physical world, you will often need to produce random values over a continuous range, where the result needs to be represented as a double rather than an int. If you could design an interface that was better suited to the needs of such clients, that interface would be much more flexible and easy to use.

Another reason to design a higher-level interface is that using the low-level functions from <cstdlib> introduces several complexities, as you will discover when you turn your attention to the implementation, that clients would be happy to ignore. Part of your job as an interface designer is to hide that complexity from clients as much as possible. Defining a higher-level random.h interface makes that possible, because all the complexity can be localized within the implementation.

Choosing the right set of functions

As an interface designer, one of your primary challenges is choosing what functions to export. Although interface design turns out to be more of an art than a science, there are some general principles you can apply, including those outlined in section 2.7. In particular, the functions you export through random.h should be simple and should hide as much of the underlying complexity as possible. They should also provide the functionality necessary to meet the needs of a wide range of clients, which means that you need to have some idea of what operations clients are likely to need. Understanding those needs depends in part on your own experience, but often requires interacting with potential clients to get a better sense of their requirements.

From my own experience with programming, I know that the operations clients expect from an interface like random.h include the following:

• The ability to select a random integer in a specified range. If you want, for example, to simulate the process of rolling a standard six-sided die, you need to choose a random integer between 1 and 6.
• The ability to choose a random real number in a specified range. If you want to position an object at a random point in space, you would need to choose random x and y coordinates within whatever limits are appropriate to the application.
• The ability to simulate a random event with a specific probability. If you want to simulate flipping a coin, you want to generate the value heads with probability 0.5, which corresponds to 50 percent of the time.

Translating these conceptual operations into a set of function prototypes is a relatively straightforward task. The first three functions in the random.h interface (randomInteger, randomReal, and randomChance) correspond to these three operations. The complete interface, which also exports a function called setRandomSeed that will be described later in the chapter, appears in Figure 2-11.

As you can see from either the comments or the prototype in Figure 2-11, the randomInteger function takes two integer arguments and returns an integer in that range. If you want to simulate a die roll, you would call

    randomInteger(1, 6)

To simulate a spin on a European roulette wheel (American wheels have both a 0 and a 00 slot in addition to the numbers from 1 to 36), you would call

    randomInteger(0, 36)

The function randomReal is conceptually similar to randomInteger. It takes two floating-point values, low and high, and returns a floating-point value r subject to the condition that low ≤ r < high. For example, calling randomReal(0, 1) returns a random number that can be as small as 0 but is always strictly less than 1. In mathematics, a range of real numbers that can be equal to one endpoint but not the other is called a half-open interval.
On a number line, a half-open interval is marked using an open circle to show that the endpoint is excluded, like this:

    [Diagram: number line showing the half-open interval from 0 to 1]

In mathematics, the standard convention is to use square brackets to indicate closed ends of intervals and parentheses to indicate open ones, so that the notation [0, 1) indicates the half-open interval corresponding to this diagram.

The function randomChance is used to simulate random events that occur with some fixed probability. To be consistent with the conventions of statistics, a probability is represented as a number between 0 and 1, where 0 means that the event never occurs and 1 means that it always does. Calling randomChance(p) returns true with probability p. Thus, calling randomChance(0.75) returns true 75 percent of the time.

Figure 2-11. Interface for the random number library (random.h)

    /*
     * File: random.h
     * --------------
     * This interface exports functions for generating pseudorandom numbers.
     */

    #ifndef _random_h
    #define _random_h

    /*
     * Function: randomInteger
     * Usage: int n = randomInteger(low, high);
     * ----------------------------------------
     * Returns a random integer in the range low to high, inclusive.
     */

    int randomInteger(int low, int high);

    /*
     * Function: randomReal
     * Usage: double d = randomReal(low, high);
     * ----------------------------------------
     * Returns a random real number in the half-open interval [low .. high). A
     * half-open interval includes the first endpoint but not the second, which
     * means that the result is always greater than or equal to low but
     * strictly less than high.
     */

    double randomReal(double low, double high);

    /*
     * Function: randomChance
     * Usage: if (randomChance(p)) . . .
     * ---------------------------------
     * Returns true with the probability indicated by p. The argument p must
     * be a floating-point number between 0 (never) and 1 (always). For
     * example, calling randomChance(.30) returns true 30 percent of the time.
     */

    bool randomChance(double p);

    /*
     * Function: setRandomSeed
     * Usage: setRandomSeed(seed);
     * ---------------------------
     * Sets the internal random number seed to the specified value. You can
     * use this function to set a specific starting point for the pseudorandom
     * sequence or to ensure that program behavior is repeatable during the
     * debugging phase.
     */

    void setRandomSeed(int seed);

    #endif

You can use randomChance to simulate flipping a coin, as illustrated by the following function, which returns "heads" or "tails" with equal probability:

    string flipCoin() {
        if (randomChance(0.50)) {
            return "heads";
        } else {
            return "tails";
        }
    }

Constructing a client program

One of the best ways to validate the design of an interface is to write applications that use it. The program in Figure 2-12, for example, uses the randomInteger function to play the casino game called craps. The rules for craps appear in the comments at the beginning of the program, although you would probably want the program to explain the rules to the user as well. In this example, the printed instructions have been omitted to save space.

Although the Craps.cpp program is nondeterministic and will produce different outcomes each time, one possible sample run looks like this:

    This program plays a game of craps.
    Rolling the dice . . .
    You rolled 5 and 5 - that's 10
    Your point is 10.
    Rolling the dice . . .
    You rolled 4 and 1 - that's 5
    Rolling the dice . . .
    You rolled 4 and 6 - that's 10
    You made your point. You win.

Implementing the random number library

Up to now, this chapter has looked only at the design of the random.h interface. Before you can actually use that interface in a program, it is necessary to write the corresponding random.cpp implementation.
All you know at this point is that you have a function rand in the <cstdlib> library that generates a random integer between 0 and a positive constant called RAND_MAX, which is some point on a number line that looks like this:

    [Diagram: number line from 0 to RAND_MAX]

Figure 2-12. Program to play the casino game of craps (Craps.cpp)

    /*
     * File: Craps.cpp
     * ---------------
     * This program plays the casino game called craps, which is
     * played using a pair of dice. At the beginning of the game,
     * you roll the dice and compute the total. If your first roll
     * is 7 or 11, you win with what gamblers call a "natural."
     * If your first roll is 2, 3, or 12, you lose by "crapping
     * out." In any other case, the total from the first roll
     * becomes your "point," after which you continue to roll
     * the dice until one of the following conditions occurs:
     *
     * a) You roll your point again, in which case you win.
     * b) You roll a 7, in which case you lose.
     *
     * Other rolls, including 2, 3, 11, and 12, have no effect
     * during this phase of the game.
     */

    #include <iostream>
    #include "random.h"
    using namespace std;

    /* Function prototypes */

    bool tryToMakePoint(int point);
    int rollTwoDice();

    /* Main program */

    int main() {
        cout << "This program plays a game of craps." << endl;
        int point = rollTwoDice();
        switch (point) {
          case 7: case 11:
            cout << "That's a natural. You win." << endl;
            break;
          case 2: case 3: case 12:
            cout << "That's craps. You lose." << endl;
            break;
          default:
            cout << "Your point is " << point << "." << endl;
            if (tryToMakePoint(point)) {
                cout << "You made your point. You win." << endl;
            } else {
                cout << "You rolled a seven. You lose." << endl;
            }
        }
        return 0;
    }

    /*
     * Function: tryToMakePoint
     * Usage: flag = tryToMakePoint(point);
     * ------------------------------------
     * Rolls the dice repeatedly until you either make your point or roll a 7.
     * The function returns true if you make your point and false if a 7 comes
     * up first.
     */

    bool tryToMakePoint(int point) {
        while (true) {
            int total = rollTwoDice();
            if (total == point) return true;
            if (total == 7) return false;
        }
    }

    /*
     * Function: rollTwoDice
     * Usage: total = rollTwoDice();
     * -----------------------------
     * Simulates the process of rolling two dice. The individual values of the
     * dice are printed on cout along with the sum, which is returned as the
     * value of the function.
     */

    int rollTwoDice() {
        cout << "Rolling the dice . . ." << endl;
        int d1 = randomInteger(1, 6);
        int d2 = randomInteger(1, 6);
        int total = d1 + d2;
        cout << "You rolled " << d1 << " and " << d2
             << " - that's " << total << endl;
        return total;
    }

To simulate the die roll, for example, what you need to do is transform that random integer into one of the following discrete outcomes:

    1   2   3   4   5   6

As it happens, there are many bad strategies for performing this transformation. A surprising number of textbooks, for example, suggest code that looks like this:

    int die = rand() % 6 + 1;

This code looks reasonable on the surface. The rand function always returns a nonnegative integer, so the remainder on division by six must be an integer between 0 and 5. Adding one to that value gives an integer in the desired range of 1 to 6.

The problem here is that rand guarantees only that the value it produces is uniformly distributed over the range from 0 to RAND_MAX. There is, however, no guarantee that the remainders on division by six will be at all random. In fact, early versions of rand distributed with the Unix operating system generated values that alternated between odd and even values, despite the fact that those values did fall uniformly over the range. Taking the remainder on division by six would also alternate between even and odd values, which hardly fits any reasonable definition of random behavior.
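The odd-even alternation described above can be reproduced with a small experiment. The sketch below is an illustration only: ToyRand is a hypothetical linear congruential generator, not the historical Unix rand, and rollWithRemainder and parityAlternates are helper names invented for this example. Because the multiplier and increment are both odd and the modulus is a power of two, the lowest bit of the state flips on every call, and taking the remainder on division by six preserves that pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy linear congruential generator (hypothetical, for illustration only).
// The multiplier and increment are odd and the modulus is 2^31, so the
// lowest bit of the state flips on every call to next().
class ToyRand {
public:
    explicit ToyRand(std::uint32_t seed) : state(seed & 0x7fffffffu) {}

    std::uint32_t next() {
        state = (state * 1103515245u + 12345u) & 0x7fffffffu;
        return state;
    }

private:
    std::uint32_t state;
};

// Rolls `count` dice using the flawed remainder technique.
std::vector<int> rollWithRemainder(std::uint32_t seed, int count) {
    assert(count >= 0);
    ToyRand gen(seed);
    std::vector<int> rolls;
    for (int i = 0; i < count; i++) {
        rolls.push_back(static_cast<int>(gen.next() % 6) + 1);
    }
    return rolls;
}

// Returns true if every pair of consecutive rolls differs in parity.
bool parityAlternates(const std::vector<int>& rolls) {
    for (std::size_t i = 1; i < rolls.size(); i++) {
        if (rolls[i] % 2 == rolls[i - 1] % 2) return false;
    }
    return true;
}
```

With this generator, parityAlternates(rollWithRemainder(seed, n)) is true for every seed, which means two consecutive rolls can never show the same face.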
If nothing else, it would be impossible to roll doubles on successive die rolls, given that one die would always be even and the other would always be odd.

What you want to do instead is divide the integers between 0 and RAND_MAX into six equal-sized segments that correspond to the different outcomes, as follows:

    [Diagram: the number line from 0 to RAND_MAX divided into six equal segments labeled 1 through 6]

In the more general case, you need to divide the number line between 0 and RAND_MAX into k equal intervals, where k is the number of possible outcomes in the desired range. Once you’ve done that, all you need to do is map the interval number into the value the client wants.

The process of transforming the result of the rand function into an integer in a finite range can be understood most easily if you decompose it into the following four-step process:

1. Normalize the integer result from rand by converting it into a floating-point number d in the range 0 ≤ d < 1.
2. Scale the value d by multiplying it by the size of the desired range, so that it spans the correct number of integers.
3. Translate the value by adding in the lower bound so that the range begins at the desired point.
4. Convert the number to an integer by calling the function floor from the <cmath> library, which returns the largest integer that is no larger than its argument.

These phases are illustrated in Figure 2-13, which also traces one possible path through the process. If the initial call to rand returns 848,256,064 and the value of RAND_MAX has its most common value of 2,147,483,647, the normalization process produces a value close to 0.4 (the values in Figure 2-13 have been rounded to a single digit after the decimal point to make them fit in the diagram). Multiplying this value by 6 in the scaling step gives 2.4, which is then shifted to 3.4 in the translation step. In the conversion step, calling floor on 3.4 yields the value 3.

    [Figure 2-13. Steps required to generate a random integer in the range 1 to 6. Panels: initial call to rand (848256064 on the line from 0 to RAND_MAX); normalization (0.4 on the line from 0 to 1); scaling (2.4 on the line from 0 to 6); translation (3.4 on the line from 1 to 7); conversion to an integer (3).]

Writing the code to implement this process is not as easy as it might appear, because there are several pitfalls that can easily trap the insufficiently careful programmer. Consider, for example, just the normalization step. You can’t convert the result of rand to a floating-point value in the half-open interval [0, 1) by calling

    double d = double(rand()) / RAND_MAX;

The problem here is that rand might return the value RAND_MAX, which would mean that the variable d would be assigned the value 1.0, which is specifically excluded from the half-open interval. What’s worse is that you also can’t write

    double d = double(rand()) / (RAND_MAX + 1);

although the problem in this case is more subtle. As noted earlier, RAND_MAX is typically chosen to be the largest positive value of type int. If that is indeed the case, adding one to it cannot possibly produce the correct result because that value is outside the integer range. To fix these problems, what you need to do instead is perform the addition using type double instead of int, as follows:

    double d = rand() / (double(RAND_MAX) + 1);

A similar problem arises in the scaling step. Mathematically, the number of integer values in the inclusive range from low to high is given by the expression

    high - low + 1

That value, however, can overflow the range of an integer if high is a large positive number and low is a negative number with a large absolute magnitude.
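Both overflow hazards can be checked numerically. The sketch below uses hypothetical helper names (normalize, countValues, mapToRange) that are not part of random.h, and it takes the raw generator value as a parameter instead of calling rand so that the Figure 2-13 example can be traced deterministically. It assumes a 32-bit int.

```cpp
#include <cassert>
#include <climits>
#include <cmath>

// Hypothetical helper: maps a raw value in [0, randMax] to a double in the
// half-open interval [0, 1). The addition is performed in double, so
// randMax + 1 cannot overflow even when randMax equals INT_MAX.
double normalize(int raw, int randMax) {
    assert(raw >= 0 && raw <= randMax);
    return raw / (double(randMax) + 1);
}

// Hypothetical helper: the number of integers in [low, high], computed in
// double so that the result is correct even for INT_MIN and INT_MAX.
double countValues(int low, int high) {
    assert(low <= high);
    return double(high) - low + 1;
}

// Applies the four steps (normalize, scale, translate, convert) to a raw
// value supplied by the caller.
int mapToRange(int raw, int randMax, int low, int high) {
    double d = normalize(raw, randMax);
    double s = d * countValues(low, high);
    return int(std::floor(low + s));
}
```

Under those assumptions, mapToRange(848256064, 2147483647, 1, 6) yields 3, matching the trace in Figure 2-13, and the extreme raw values 0 and 2147483647 map to 1 and 6 respectively.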
Thus, to calculate a scaled value s from the normalized value d, you need to perform this calculation in double-precision as well:

    double s = d * (double(high) - low + 1);

Fortunately, the last two steps in the transformation process are relatively straightforward. Given the scaled value s, you can combine the translation and conversion steps in a single statement:

    return int(floor(low + s));

The implementation of randomInteger, at least for the moment, therefore looks like this:

    int randomInteger(int low, int high) {
        double d = rand() / (double(RAND_MAX) + 1);
        double s = d * (double(high) - low + 1);
        return int(floor(low + s));
    }

The implementations of randomReal and randomChance are relatively simple, given that you’ve already had to work out the complexities for randomInteger:

    double randomReal(double low, double high) {
        double d = rand() / (double(RAND_MAX) + 1);
        double s = d * (high - low);
        return low + s;
    }

    bool randomChance(double p) {
        return randomReal(0, 1) < p;
    }

Initializing the random number seed

Unfortunately, the implementations of the randomInteger, randomReal, and randomChance functions from the preceding section don’t quite work in exactly the way clients would want. The problem is that, far from producing unpredictable results, programs that use them always produce exactly the same results. If, for example, you were to run the Craps program twenty times in succession, you would see exactly the same output each time. So much for randomness.

To figure out why the implementation is behaving in this way, it helps to go back to the RandTest program and run it again. This time, the output is

    On this computer, RAND_MAX is 2147483647
    The first 10 calls to rand:
    1103527590
     377401575
     662824084
    1147902781
    2035015474
     368800899
    1508029952
     486256185
    1062517886
     267834847

The output is the same as the first time around.
In fact, the RandTest program produces the same output every time because the designers of the C++ library (and the earlier C libraries on which these libraries are based) decided that rand should return the same random sequence each time a program is run.

At first, it may be hard to understand why a function that is supposed to return a random number would always return the same sequence of values. After all, deterministic behavior of this sort seems to run counter to the whole idea of randomness. There is, however, a good reason behind this behavior: programs that behave deterministically are easier to debug.

To get a sense of why such repeatability is important, suppose you have written a program to play an intricate game, such as Monopoly. As is always the case with newly written programs, the odds are good that your program will have a few bugs. In a complex program, bugs can be relatively obscure, in the sense that they only occur in rare situations. Suppose you are playing the game and discover that the program is starting to behave in a bizarre way. As you begin to debug the program, it would be very convenient if you could regenerate the same state and take a closer look at what is going on. Unfortunately, if the program is running in a random way, a second run of the program will behave differently from the first. Bugs that showed up the first time may not occur on the second pass. The designers of the original C libraries therefore concluded that it had to be possible to use rand in a deterministic way in order to support debugging.

At the same time, it also has to be possible to use rand so that it doesn’t always deliver the same results. To understand how to implement this behavior, it helps to understand how rand works internally. The rand function generates each new random value by applying a set of mathematical calculations to the last value it produced. Because you don’t know what those calculations are, it is best to think of the entire operation as a black box where old numbers go in on one side and new pseudorandom numbers pop out on the other. Since the first call to rand produces the value 1103527590, the second call to rand corresponds to putting 1103527590 into one end of the black box and having 377401575 pop out on the other side:

    1103527590  ->  [ rand ]  ->  377401575

On the next call to rand, the implementation puts 377401575 into the black box, which returns 662824084:

    377401575  ->  [ rand ]  ->  662824084

This same process is repeated on each call to rand. The computation inside the black box is designed so that (1) the numbers are uniformly distributed over the legal range, and (2) the sequence goes on for a long time before it begins to repeat.

But what about the first call to rand, the one that returns 1103527590? The implementation must have a starting point. There must be an integer s0 that goes into the black box and produces 1103527590:

    s0  ->  [ rand ]  ->  1103527590

This initial value, the value that is used to get the entire process started, is called the seed for the random number generator. In the <cstdlib> library, you can set that seed explicitly by calling srand(seed).

As you know from the multiple runs of the RandTest program, the C++ library sets the initial seed to a constant value every time a program is started, which is why rand always generates the same sequence of values. That behavior, however, makes sense only during the debugging phase. Most modern languages change the default behavior so that the functions in the random-number library return different results on each run unless the programmer specifies otherwise. That design is considerably simpler for clients, and it makes sense to incorporate that design change into the random.h interface. It is still necessary to allow clients to generate a repeatable sequence of values, because doing so simplifies the debugging process.
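The effect of the seed can be observed directly with srand and rand. The following sketch is a minimal illustration; sequenceFromSeed is a hypothetical helper name, not part of <cstdlib> or random.h. It relies on the standard guarantee that calling srand with the same seed restarts the same sequence, and on the rule that a program that never calls srand behaves as if it had called srand(1).

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Hypothetical helper: seeds the generator with `seed` and then collects
// the next `count` values returned by rand.
std::vector<int> sequenceFromSeed(unsigned int seed, int count) {
    assert(count >= 0);
    std::srand(seed);
    std::vector<int> values;
    for (int i = 0; i < count; i++) {
        values.push_back(std::rand());
    }
    return values;
}
```

Calling sequenceFromSeed(1, 10) twice returns two identical vectors, which is the same repeatability seen in the two runs of RandTest. The actual values depend on the library implementation, so they are not listed here.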
The need to provide that option is the reason for the inclusion in the interface of the setRandomSeed function. During the debugging phase, you can add the line

    setRandomSeed(1);

at the beginning of the main program. The calls to the other entries exported by the random.h interface will then produce the repeatable sequence of values generated by using 1 as the initial seed. When you are convinced that the program is working, you can remove this line to restore unpredictable behavior.

To implement this change, the functions randomInteger, randomReal, and randomChance must first check to see whether the random number seed has already been initialized and, if not, set it to some starting value that would be difficult for users to predict, which is usually taken from the value of the system clock. Because that value is different each time you run a program, the random number sequence changes as well. In C++, you can retrieve the current value of the system clock by calling the function time and then converting the result to an integer. This technique allows you to write the following statement, which has the effect of initializing the pseudorandom number generator to an unpredictable point:

    srand(int(time(NULL)));

Although it requires only a single line, the operation to set the random seed to an unpredictable value based on the system clock is quite obscure. If this line were to appear in the client program, the client would have to understand the concept of a random number seed, along with the functions srand and time and the constant NULL (which you won’t learn about until Chapter 11). To make things as easy as possible for the client, you need to hide this complexity away.

The situation becomes even more complicated, however, when you realize that this initialization step must be performed once, and only once, before delivering up the results of any of the other functions. To ensure that the initialization code is not executed more than once, you need to declare a variable to keep track of whether that initialization has been performed. The most difficult implementation decision in the random library is figuring out where to declare that variable. Declaring it as a global leads to significant problems, primarily because C++ does not specify the order in which global variables are initialized. If you declare other global values whose initial values were produced by the random library, there would be no way to ensure that the initialization flag had been set correctly.

In C++, the strategy that guarantees the initialization flag is set correctly is to declare the initialization flag inside the context of the function that checks to see whether the necessary initialization has already been performed. That variable, however, can’t be declared as a traditional local variable, because doing so would mean that a new variable was created on every call. To make this strategy work, you need to mark the flag as static, as shown in the following implementation of initRandomSeed:

    void initRandomSeed() {
        static bool initialized = false;
        if (!initialized) {
            srand(int(time(NULL)));
            initialized = true;
        }
    }

When it is marked with the keyword static, the variable initialized becomes a static local variable, which means that it is accessible only within the body of the function (just like any local variable) but that a single copy of the variable, shared by all calls to initRandomSeed, is allocated outside of the stack frame. The rules of C++ ensure that static local variables are initialized exactly once and, moreover, that the initialization occurs before the function containing them is called for the first time.

Defining the initRandomSeed function represents the final step in coding the random.cpp implementation, which appears in Figure 2-14.
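The once-only behavior of a static local flag is easy to confirm in isolation. The sketch below is a stand-in for initRandomSeed, not the library code itself: initOnce, setupCount, and callManyTimes are hypothetical names, and the one-time work is replaced by a counter so that the number of times it runs can be inspected.

```cpp
#include <cassert>

// Hypothetical counter recording how many times the one-time setup ran.
// It is itself a static local, so a single copy persists across calls.
int& setupCount() {
    static int count = 0;
    return count;
}

// Stand-in for initRandomSeed: the flag is allocated once and keeps its
// value between calls, so the body of the if statement runs only once.
void initOnce() {
    static bool initialized = false;
    if (!initialized) {
        setupCount()++;          // one-time work would go here
        initialized = true;
    }
}

// Calls initOnce repeatedly and reports how often the setup actually ran.
int callManyTimes(int calls) {
    assert(calls >= 0);
    for (int i = 0; i < calls; i++) {
        initOnce();
    }
    return setupCount();
}
```

However many times initOnce is called, callManyTimes reports that the setup ran exactly once. Had initialized been an ordinary local variable, it would be reset to false on every call and the setup would run every time.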
My primary goal in going through the code for the random.cpp implementation is not for you to master all of its intricacies. What I hope to do instead is convince you why it is important for you as a client of random.h not to have to understand all those details. The code in random.cpp is subtle and contains many potential pitfalls that are likely to ensnare the casual programmer who tries to implement these functions from scratch. The primary purpose of library interfaces is to hide precisely this sort of complexity.

The interface wall protects the client from all sorts of dangers, of the sort that might have been marked with “Here Be Dragons” on medieval maps. These dangers are illustrated in the following diagram, in which the interface wall divides the frontispiece from Thomas More’s Utopia (1516) from the seafaring dragon in Charles Gould’s Mythical Monsters (1886):

    [Diagram: the interface wall, with the Utopia frontispiece on one side and the sea dragon on the other]

Figure 2-14. Implementation of the random number library (random.cpp)

    /*
     * File: random.cpp
     * ----------------
     * This file implements the random.h interface.
     */

    #include <cstdlib>
    #include <cmath>
    #include <ctime>
    #include "random.h"
    using namespace std;

    /* Private function prototype */

    void initRandomSeed();

    /*
     * Implementation notes: randomInteger
     * -----------------------------------
     * The code for randomInteger produces the number in four steps:
     *
     * 1. Generate a random real number d in the range [0 .. 1).
     * 2. Scale the number to the range [0 .. N) where N is the number of values.
     * 3. Translate the number so that the range starts at the appropriate value.
     * 4. Convert the result to the next lower integer.
     *
     * The implementation is complicated by the fact that both the expression
     *
     *     RAND_MAX + 1
     *
     * and the expression for the number of values
     *
     *     high - low + 1
     *
     * can overflow the integer range. These calculations must therefore be
     * performed using doubles instead of ints.
     */

    int randomInteger(int low, int high) {
        initRandomSeed();
        double d = rand() / (double(RAND_MAX) + 1);
        double s = d * (double(high) - low + 1);
        return int(floor(low + s));
    }

    /*
     * Implementation notes: randomReal
     * --------------------------------
     * The code for randomReal is similar to that for randomInteger,
     * without the final conversion step.
     */

    double randomReal(double low, double high) {
        initRandomSeed();
        double d = rand() / (double(RAND_MAX) + 1);
        double s = d * (high - low);
        return low + s;
    }

    /*
     * Implementation notes: randomChance
     * ----------------------------------
     * The code for randomChance calls randomReal(0, 1) and then checks
     * whether the result is less than the requested probability.
     */

    bool randomChance(double p) {
        initRandomSeed();
        return randomReal(0, 1) < p;
    }

    /*
     * Implementation notes: setRandomSeed
     * -----------------------------------
     * The setRandomSeed function simply forwards its argument to srand.
     * The call to initRandomSeed is required to set the initialized flag.
     */

    void setRandomSeed(int seed) {
        initRandomSeed();
        srand(seed);
    }

    /*
     * Implementation notes: initRandomSeed
     * ------------------------------------
     * The initRandomSeed function declares a static variable that keeps track
     * of whether the seed has been initialized. The first time initRandomSeed
     * is called, initialized is false, so the seed is set to the current time.
     */

    void initRandomSeed() {
        static bool initialized = false;
        if (!initialized) {
            srand(int(time(NULL)));
            initialized = true;
        }
    }

2.9 A brief tour of the graphics library

One of the challenges of using C++ as a teaching language is that C++ doesn’t offer a standard graphics library. Although it is perfectly possible to learn about data structures and algorithms without using graphics, having a graphics library makes the process a lot more fun.
And because you’re likely to learn things more easily if you can have a little fun along the way, the Stanford libraries include an interface called graphics.h that supports simple two-dimensional graphics on the platforms you’re most likely to be using.

Before looking at the contents of the graphics.h interface, it makes sense to consider a simple example that illustrates the capabilities of the graphics library. Just as the first program in Chapter 1 was the classic “hello, world” program, it is appropriate to begin this section with the GraphicHelloWorld program shown in Figure 2-15, which draws “hello, world” in big letters across the graphics window. And just to give the output some color, the program draws the string in white letters on top of a large blue rectangle. The output of the program, which includes several annotations to illustrate the coordinate system used in the graphics library, appears in Figure 2-16.

    [Figure 2-16. Annotated output of GraphicHelloWorld. Labels in the figure: (0, 0), (50, 50), (60, 140), "hello, world", "300 pixels", "150 pixels", "increasing values of x", "increasing values of y".]

The first line in the main program for GraphicHelloWorld is a call to the function initGraphics, which has the effect of initializing the graphics window and displaying it on the screen. Inside the graphics window, coordinates are measured in units called pixels, which are the individual dots that cover the surface of the display screen. By default, the initGraphics function creates a relatively small window. You can, however, specify an explicit window size by calling

    initGraphics(width, height);

where width and height are the desired number of pixels in each dimension.

Coordinates in the graphics window are represented using a pair of values (x, y) where the x and y values represent the distance in that dimension from the origin, which is the point (0, 0) in the upper left corner of the window.
As in traditional Cartesian coordinates, the value for x increases as you move rightward across the window. Having the location of the origin in the upper left corner, however, means that the y value increases as you move downward, which is precisely opposite to the usual Cartesian plane. Computer-based graphics packages invert the y coordinate because text moves downward on the page. By having y values increase as you move downward, successive lines of text appear at increasing values of y.

The rest of the code for GraphicHelloWorld consists of calls to functions in the graphics.h interface, which is summarized in Table 2-2. The descriptions of those functions are somewhat cursory, and may not be sufficient in themselves for you to understand the intricacies of the graphics library. Given that the graphics library is only an optional component of this textbook, however, it isn’t appropriate to go through every one of those functions in complete detail.

Table 2-2. Selected functions exported by the graphics.h interface

    initGraphics()
    initGraphics(width, height)
        Initializes the graphics window. If width and height are omitted,
        this function creates a window with a default size.

    drawLine(x0, y0, x1, y1)
        Draws a line from the point (x0, y0) to the point (x1, y1).

    drawPolarLine(x, y, r, theta)
        Draws a line from the point (x, y) that extends r pixel units at
        angle theta relative to the +x axis. This function returns the
        endpoint of the line as a GPoint.

    drawRect(x, y, width, height)
        Draws the frame of the rectangle with the specified bounds.

    fillRect(x, y, width, height)
        Fills the rectangle with the specified bounds.

    drawOval(x, y, width, height)
        Draws the frame of the oval inscribed in the specified rectangle.

    fillOval(x, y, width, height)
        Fills the oval inscribed in the specified rectangle.

    drawArc(x, y, width, height, start, sweep)
        Draws an elliptical arc. The x, y, width, and height parameters
        specify the bounding rectangle of the complete ellipse. The start
        parameter indicates the angle at which the arc begins; the sweep
        parameter indicates how far the arc extends. Both angles are
        measured in degrees counterclockwise from the +x axis.

    fillArc(x, y, width, height, start, sweep)
        Fills the wedge-shaped area bounded by the elliptical arc that
        would be produced by drawArc and the radial lines connecting the
        endpoints of the arc to the center of the ellipse.

    drawImage(filename, x, y)
        Draws the image from the specified file with its upper left corner
        at the point (x, y). The graphics library looks for the image file
        in the current directory and then in a subdirectory called images.

    drawString(str, x, y)
        Draws the string str so that its baseline origin appears at the
        point (x, y). The string appears in the current font and color.

    getStringWidth(str)
        Returns the width of the string str in the current font.

    setFont(font)
        Sets the font, where font is in the form "family-style-size".

    getFont()
        Returns the current font in the form "family-style-size".

    setColor(color)
        Sets the current color. The color parameter is either one of the
        standard color names or a string in the form "#rrggbb".

    getColor()
        Returns the current color in the form "#rrggbb".

    saveGraphicsState()
        Saves the state of the graphics context.

    restoreGraphicsState()
        Restores the graphics state from the most recent save.

    getWindowWidth()
        Returns the width of the graphics window in pixels.

    getWindowHeight()
        Returns the height of the graphics window in pixels.

    repaint()
        Initiates an immediate update of the graphics window.

    pause(milliseconds)
        Repaints the window and then pauses for the specified time.

Fortunately, providing a complete description of the graphics library in the textbook seems to be unnecessary.
Graphics is exciting enough that most students figure out how it works through some combination of the abbreviated descriptions in Table 2-2, the examples in the text, the complete listing of the interface in Appendix A, and their own experimentation. If you want to enjoy the process of learning this material to the utmost, I suggest you do the same.

Summary

In this chapter, you learned about functions, which enable you to refer to an entire set of operations by using a simple name. By allowing the programmer to ignore the internal details and concentrate only on the effect of a function as a whole, functions are an essential tool for reducing the conceptual complexity of programs. The important points introduced in this chapter include:

• A function is a block of code that has been organized into a separate unit and given a name. Other parts of the program can then call that function, possibly passing it information in the form of arguments and receiving a result returned by that function.
• Functions serve several useful purposes in programming. Allowing the same set of instructions to be shared by many parts of a program reduces both its size and its complexity. More importantly, functions make it possible to decompose large programs into smaller, more manageable pieces. Functions also serve as the basis for implementing algorithms, which are precisely specified strategies for solving computational problems.
• Functions become even more useful when they are collected into a library, which can then be shared by many different applications. Each library typically defines several functions that share a single conceptual framework. In computer science, the process of making a function available through a library is called exporting that function.
• The <cmath> library exports several functions that are likely to be familiar from mathematics, including functions like sqrt, sin, and cos.
As a client of the library, you need to know how to call these functions, but do not need to know the details of how they work.
• In C++, functions must be declared before they are used. A function declaration is called a prototype. In addition to the prototype, functions have an implementation, which specifies the individual steps that function contains.
• A function that returns a value must have a return statement that specifies the result. Functions may return values of any type. Functions that return Boolean values are called predicate functions and play an important role in programming.
• A function that returns no result and is executed only for its effect is called a procedure.
• C++ allows you to define several functions with the same name as long as the compiler can use the number and types of the arguments to determine which function is required. This process is called overloading. The argument pattern that distinguishes each of the overloaded variants is called a signature. C++ also makes it possible to specify default parameters, which are used if the client omits them from the call.
• Variables declared within a function are local to that function and cannot be used outside of it. Internally, all the variables declared within a function are stored together in a stack frame.
• When you call a function, the arguments are evaluated in the context of the caller and then copied into the parameter variables specified in the function prototype. The association of arguments and parameters always follows the order in which the variables appear in each of the lists.
• C++ makes it possible for a function and its caller to share the value of a parameter variable by marking it with the ampersand character (&). This style of parameter transmission is referred to as call by reference.
• When a function returns, execution continues from precisely the point at which the call was made.
The computer refers to this point as the return address and keeps track of it in the stack frame.
• Programmers who create a library are its implementers; programmers who make use of one are called its clients. The connection between the implementer and the client is called an interface. Interfaces typically export functions, types, and constants, which are collectively known as interface entries.
• In C++, interfaces are stored in header files, which typically end with a .h suffix. Every interface should include several lines of interface boilerplate to ensure that the compiler reads it only once.
• When you design an interface, you must balance several criteria. Well-designed interfaces are unified, simple, sufficient, general, and stable.
• The random.h interface exports several functions that make it easy to simulate random behavior.

Review questions

1. Explain in your own words the difference between a function and a program.
2. Define the following terms as they apply to functions: call, argument, return.
3. True or false: Every function in a C++ program requires a prototype.
4. What is the prototype of the function sqrt in the <cmath> library?
5. Can there be more than one return statement in the body of a function?
6. What is a predicate function?
7. What is meant by the term overloading? How does the C++ compiler use signatures to implement overloading?
8. How do you specify a default value for a parameter?
9. True or false: It is possible to specify a default value for the first parameter to a function without specifying a default value for the second.
10. What is a stack frame?
11. Describe the process by which arguments are associated with parameters.
12. Variables declared within a function are said to be local variables. What is the significance of the word local in this context?
13. What does the term call by reference mean?
14. How do you indicate call by reference in a C++ program?
15. Define the following terms in the context of libraries: client, implementation, interface.
16. If you were writing an interface called mylib.h, what lines would you include as the interface boilerplate?
17. Describe the process used to export a constant definition as part of an interface.
18. What criteria does this chapter list as important in the process of interface design?
19. Why is it important for an interface to be stable?
20. What is meant by the term pseudorandom number?
21. On most computers, how is the value of RAND_MAX chosen?
22. What four steps are necessary to convert the result of rand into an integer value with a different range?
23. How would you use the randomInteger function to generate a pseudorandom number between 1 and 100?
24. By executing each of the statements in the implementation by hand, determine whether the randomInteger function works with negative arguments. What are the possible results of calling the function randomInteger(-5, 5)?
25. Assuming that d1 and d2 have already been declared as variables of type int, could you use the multiple assignment statement

    d1 = d2 = randomInteger(1, 6);

to simulate the process of rolling two dice?
26. True or false: The rand function ordinarily generates the same sequence of random numbers every time a program is run.
27. What is meant by the term seed in the context of random numbers?
28. What suggestion does this chapter offer for debugging a program involving random numbers?
29. What functions are defined in the final version of the random.h interface? In what context would you use each function?

Exercises

1. If you did not do so the first time around, rewrite the Celsius-to-Fahrenheit program from exercise 1 in Chapter 1 so that it uses a function to perform the conversion.
2. Reimplement the distance-conversion program from exercise 2 in Chapter 1 so that it uses a function.
In this case, the function must produce both the number of feet and the number of inches, which means that you need to use call by reference to return these values.
3. When a floating-point number is converted to an integer in C++, the value is truncated by throwing away any fraction. Thus, when 4.99999 is converted to an integer, the result is 4. In many cases, it would be useful to have the option of rounding a floating-point value to the nearest integer. Given a positive floating-point number x, you can round it to the closest integer by adding 0.5 and then truncating the result to an integer. Because truncation always moves toward zero, rounding negative numbers requires you to subtract 0.5, rather than adding it. Write a function roundToNearestInt(x) that rounds the floating-point number x to the nearest integer. Show that your function works by writing a suitable main program to test it.
4. If you are unfortunate enough to be outside in winter weather, you know that your perception of the cold is dependent on the wind speed as well as the temperature. The faster the wind blows, the colder you feel. To quantify how wind affects temperature perception, the National Weather Service reports the wind chill, which is illustrated on their website as shown in Figure 2-17. At the bottom of the figure, you can see that the formula for wind chill is

$$35.74 + 0.6215\,t - 35.75\,v^{0.16} + 0.4275\,t\,v^{0.16}$$

where t is the Fahrenheit temperature and v is the wind speed in miles per hour.

[Figure 2-17. Wind chill as a function of temperature and wind speed. Source: National Weather Service]

Write a function windChill that takes the values of t and v and returns the wind chill. In doing so, your function should take account of two special cases:
• If there is no wind, windChill should return the original temperature t.
• If the temperature is greater than 40° F, the wind chill is undefined, and your function should call error with an appropriate message.
Although it will be easier to do so once you learn how to format numeric data in Chapter 4, you already know enough to generate a table that presents the wind-chill data in columns as shown in Figure 2-17. If you’re up for more of a challenge, write a main program that uses windChill to produce that table.
5. Greek mathematicians took a special interest in numbers that are equal to the sum of their proper divisors (a proper divisor of n is any divisor less than n itself). They called such numbers perfect numbers. For example, 6 is a perfect number because it is the sum of 1, 2, and 3, which are the integers less than 6 that divide evenly into 6. Similarly, 28 is a perfect number because it is the sum of 1, 2, 4, 7, and 14. Write a predicate function isPerfect that takes an integer n and returns true if n is perfect, and false otherwise. Test your implementation by writing a main program that uses the isPerfect function to check for perfect numbers in the range 1 to 9999 by testing each number in turn. When a perfect number is found, your program should display it on the screen. The first two lines of output should be 6 and 28. Your program should find two other perfect numbers in the range as well.
6. An integer greater than 1 is said to be prime if it has no divisors other than itself and one. The number 17, for example, is prime, because there are no numbers other than 1 and 17 that divide evenly into it. The number 91, however, is not prime because it is divisible by 7 and 13. Write a predicate function isPrime(n) that returns true if the integer n is prime, and false otherwise. To test your algorithm, write a main program that lists the prime numbers between 1 and 100.
7. Even though clients of the <cmath> library typically don’t need to understand how functions like sqrt work internally, the implementers of that library have to be able to design an effective algorithm and write the necessary code. If you were asked to implement the sqrt function without using the library version, there are many strategies you could adopt. One of the easiest strategies to understand is successive approximation, in which you make a guess at the solution and then refine that guess by choosing new values that move closer to the solution. You can use successive approximation to determine the square root of x by adopting the following strategy:
   1. Begin by guessing that the square root is x / 2. Call that guess g.
   2. The actual square root must lie between g and x / g. At each step in the successive approximation, generate a new guess by averaging g and x / g.
   3. Repeat step 2 until the values g and x / g are as close together as the machine precision allows. In C++, the best way to check for this condition is to test whether the average is equal to either of the values used to generate it.
Use this strategy to write your own implementation of the sqrt function.
8. Although Euclid’s algorithm for calculating the greatest common divisor is one of the oldest to be dignified with that term, there are other algorithms that date back many centuries. In the Middle Ages, one of the problems that required sophisticated algorithmic thinking was determining the date of Easter, which falls on the first Sunday after the first full moon following the vernal equinox. Given this definition, the calculation involves interacting cycles of the day of the week, the orbit of the moon, and the passage of the sun through the zodiac. Early algorithms for solving this problem date back to the third century and are fully described in the writings of the eighth-century scholar now known as the Venerable Bede.
In 1800, the German mathematician Carl Friedrich Gauss published an algorithm for determining the date of Easter that was purely computational in the sense that it relied on arithmetic rather than looking up values in tables. His algorithm, translated directly from the German, appears in Figure 2-18.

Figure 2-18. Gauss’s algorithm for computing the date of Easter

   I. Divide the number of the year for which one wishes to calculate Easter by 19, by 4, and by 7, and call the remainders of these divisions a, b, and c, respectively. If the division is even, set the remainder to 0; the quotients are not taken into account. Precisely the same is true of the following divisions.
   II. Divide the value 19a + 23 by 30 and call the remainder d.
   III. Finally, divide 2b + 4c + 6d + 3, or 2b + 4c + 6d + 4, choosing the former for years between 1700 and 1799 and the latter for years between 1800 and 1899, by 7 and call the remainder e.
   Then Easter falls on March 22 + d + e, or when d + e is greater than 9, on April d + e – 9.

   Translated from Karl Friedrich Gauss, “Berechnung des Osterfestes,” August 1800
   http://gdz.sub.uni-goettingen.de/no_cache/dms/load/img/?IDDOC=137484

Write a procedure

    void findEaster(int year, string & month, int & day);

that takes a year and returns the Easter date in the reference parameters month and day. Unfortunately, the simple algorithm in Figure 2-18 only works for years in the 18th and 19th centuries. It is easy, however, to search the web for extensions that work for all years. Once you have coded and tested your implementation of Gauss’s original algorithm, undertake the necessary research to find a more general approach.
9. The combinations function C(n, k) described in this chapter determines the number of ways you can choose k values from a set of n elements, ignoring the order of the elements.
If the order of the values matters (so that, in the case of the coin example, choosing a quarter first and then a dime is seen as distinct from choosing a dime and then a quarter), you need to use a different function, which computes the number of permutations. This function is denoted as P(n, k), and has the following mathematical formulation:

$$P(n, k) = \frac{n!}{(n - k)!}$$

Although this definition is mathematically correct, it is not well suited to implementation in practice because the factorials involved can get much too large to store in an integer variable, even when the answer is small. For example, if you tried to use this formula to calculate the number of ways to select two cards from a standard 52-card deck, you would end up trying to evaluate the following fraction:

$$\frac{80,658,175,170,943,878,571,660,636,856,403,766,975,289,505,440,883,277,824,000,000,000,000}{30,414,093,201,713,378,043,612,608,166,064,768,844,377,641,568,960,512,000,000,000,000}$$

even though the answer is the much more manageable 2652 (52 × 51). Write a function permutations(n, k) that computes the P(n, k) function without calling the fact function. Part of your job in this problem is to figure out how to compute this value efficiently. To do so, you will probably find it useful to play around with some relatively small values to get a sense of how the factorials in the numerator and denominator of the formula behave.
10. The C(n, k) function from the text and the P(n, k) function from the preceding exercise come up often in computational mathematics, particularly in an area called combinatorics, which is concerned with counting the ways objects can be combined. Now that you have C++ implementations for each of these functions, it might be worth putting them in a library so that you can use them in many different applications. Write the files combinatorics.h and combinatorics.cpp for a library that exports the functions permutations and combinations.
When you write the implementation, make sure to rewrite the code for the combinations function so that it uses the efficiency enhancements suggested for permutations in exercise 9.
11. Using the direction.h interface as an example, design and implement a calendar.h interface that exports the Month type from Chapter 1, along with the functions daysInMonth and isLeapYear, which also appear in that chapter. Your interface should also export a monthToString function that returns the constant name for a value of type Month. Test your implementation by writing a main program that asks the user to enter a year and then writes out the number of days in each month of that year, as in the following sample run:

    TestCalendar
    Enter a year: 2012
    JANUARY has 31 days.
    FEBRUARY has 29 days.
    MARCH has 31 days.
    APRIL has 30 days.
    MAY has 31 days.
    JUNE has 30 days.
    JULY has 31 days.
    AUGUST has 31 days.
    SEPTEMBER has 30 days.
    OCTOBER has 31 days.
    NOVEMBER has 30 days.
    DECEMBER has 31 days.

12. Write a program RandomAverage that repeatedly generates a random real number between 0 and 1 and then displays the average after a specified number of trials entered by the user.
13.     I shall never believe that God plays dice with the world.
        - Albert Einstein, 1947
Despite Einstein’s metaphysical objections, the current models of physics, and particularly of quantum theory, are based on the idea that nature does indeed involve random processes. A radioactive atom, for example, does not decay for any specific reason that we mortals understand. Instead, that atom has a random probability of decaying within a period of time. Sometimes it does, sometimes it doesn’t, and there is no way to know for sure.
Because physicists consider radioactive decay a random process, it is not surprising that random numbers can be used to simulate it. Suppose you start with a collection of atoms, each of which has a certain probability of decaying in any unit of time.
You can then approximate the decay process by taking each atom in turn and deciding randomly whether it decays, considering the probability. Write a program that simulates the decay of a sample that contains 10,000 atoms of radioactive material, where each atom has a 50 percent chance of decaying in a year. The output of your program should show the number of atoms remaining at the end of each year, which might look something like this:

    RadioactiveDecay
    There are 10000 atoms initially.
    There are 4957 atoms at the end of year 1.
    There are 2484 atoms at the end of year 2.
    There are 1215 atoms at the end of year 3.
    There are 612 atoms at the end of year 4.
    There are 296 atoms at the end of year 5.
    There are 143 atoms at the end of year 6.
    There are 66 atoms at the end of year 7.
    There are 37 atoms at the end of year 8.
    There are 15 atoms at the end of year 9.
    There are 8 atoms at the end of year 10.
    There are 2 atoms at the end of year 11.
    There are 0 atoms at the end of year 12.

As the numbers indicate, roughly half the atoms in the sample decay each year. In physics, the conventional way to express this observation is to say that the sample has a half-life of one year.
14. Random numbers offer yet another strategy for approximating the value of π. Imagine that you have a dartboard hanging on your wall that consists of a circle painted on a square backdrop, as in the following diagram:

[Diagram: a gray circle inscribed in a white square]

What happens if you throw a whole bunch of darts completely randomly, ignoring any darts that miss the board altogether? Some of the darts will fall inside the gray circle, but some will be outside the circle in the white corners of the square. If the darts are random, the ratio of the number of darts landing inside the circle to the total number of darts hitting the square should be approximately equal to the ratio between the two areas.
The ratio of the areas is independent of the actual size of the dartboard, as illustrated by the following formula:

$$\frac{\text{darts falling inside the circle}}{\text{darts falling inside the square}} \cong \frac{\text{area inside the circle}}{\text{area inside the square}} = \frac{\pi r^2}{4r^2} = \frac{\pi}{4}$$

To simulate this process in a program, imagine that the dart board is drawn on the standard Cartesian coordinate plane with its center at the origin and a radius of 1 unit. The process of throwing a dart randomly at the square can be modeled by generating two random numbers, x and y, each of which lies between –1 and +1. This (x, y) point always lies somewhere inside the square. The point (x, y) lies inside the circle if

$$\sqrt{x^2 + y^2} < 1$$

This condition, however, can be simplified considerably by squaring each side of the inequality, which gives the following more efficient test:

$$x^2 + y^2 < 1$$

If you perform this simulation many times and compute the fraction of darts that fall within the circle, the result will be an approximation of π/4. Write a program that simulates throwing 10,000 darts and then uses the simulation technique described in this exercise to generate and display an approximate value of π. Don’t worry if your answer is correct only in the first few digits. The strategy used in this problem is not particularly accurate, even though it occasionally proves useful as an approximation technique. In mathematics, this technique is called Monte Carlo integration, after Monte Carlo, the district of Monaco famous for its casino.
15.     Heads. . . . Heads. . . . Heads. . . . A weaker man might be moved to re-examine his faith, if in nothing else at least in the law of probability.
        - Tom Stoppard, Rosencrantz and Guildenstern Are Dead, 1967
Write a program that simulates flipping a coin repeatedly and continues until three consecutive heads are tossed. At that point, your program should display the total number of coin flips that were made.
The following is one possible sample run of the program:

    ConsecutiveHeads
    heads
    tails
    heads
    tails
    tails
    heads
    heads
    heads
    It took 8 flips to get 3 consecutive heads.

16. Use the graphics library to draw a rainbow that looks something like this:

[Sample output: Rainbow]

Starting at the top, the six stripes in the rainbow are red, orange, yellow, green, blue, and magenta, respectively; cyan makes a lovely color for the sky.
17. Use the graphics library to write a program that draws a checkerboard on the graphics window. Your picture should include the red and black pieces, as they exist at the beginning of the game, like this:

[Sample output: Checkerboard]

Chapter 3
Strings

    Whisper music on those strings.
    - T. S. Eliot, The Waste Land, 1922

Up to now, most of the programming examples you have seen in this book have used numbers as their basic data type. These days, computers work less with numeric data than with text data, which is a generic term for information composed of individual characters. The ability of modern computers to process text data has led to the development of text messaging, electronic mail, word processing systems, online reference libraries, and a wide variety of other useful applications.

This chapter introduces the C++ <string> library, which provides a convenient abstraction for working with strings of characters. Having this library in your toolbox will make it much easier to write interesting applications. This chapter, however, also introduces the notion of a class, which is the term computer scientists have adopted for data types that support the object-oriented programming paradigm. Although C++ also defines a more primitive string type, most text-processing applications work instead with objects of a class called string. By working with the string class in this chapter, you will have a better sense of what classes are when you begin to define your own classes in Chapter 7.
3.1 Using strings as abstract values

Conceptually, a string is simply a sequence of characters. For example, the string "hello, world" is a sequence of 12 characters including ten letters, a comma, and a space. In C++, the string class and its associated operations are defined in the <string> library, and you must therefore include this library in any source file that manipulates string data.

In Chapter 1, you learned that data types are defined by two properties: a domain and a set of operations. For strings, the domain is easy to identify: the domain of type string is the set of all sequences of characters. The more interesting problem is to identify the appropriate set of operations. Early versions of C++ followed the lead of the older C language and offered little support for manipulating strings. The only facilities were low-level operations that required you to understand the underlying representation. The designers of C++ soon solved that problem by introducing a string class that enables clients to work at a more abstract level.

For the most part, you can use string as a primitive data type in much the same way that you use types like int and double. You can, for example, declare a variable of type string and assign it an initial value, much as you would with a numeric variable. When you declare a string variable, you typically specify its initial value as a string literal, which is a sequence of characters enclosed in double quotes. For example, the declaration

    const string ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

sets the constant ALPHABET to a 26-character string containing the uppercase letters.

You can also use the operators >> and << to read and write data values of type string, although doing so requires some caution. For example, you could make the Hello program more conversational by rewriting the main function like this:

    int main() {
        string name;
        cout << "Enter your name: ";
        cin >> name;
        cout << "Hello, " << name << "!" << endl;
        return 0;
    }

This program reads a string from the user into the variable name and then includes that name as part of the greeting, as shown in the following sample run:

    HelloName
    Enter your name: Eric
    Hello, Eric!

This program, however, behaves in a somewhat surprising way if you enter a user’s full name instead of the first name alone. For example, if I had typed my full name in response to the prompt, the sample run would look like this:

    HelloName
    Enter your name: Eric Roberts
    Hello, Eric!

Even though the program contains no code to take the two pieces of my name apart, it somehow still uses only my first name when it prints its greeting. How does the program know to be so colloquial? Answering this question requires you to understand in more detail how the >> operator behaves when you use it to read a string value. Although you might expect it to read an entire line of input, the >> operator stops as soon as it hits the first whitespace character, which is defined to be any character that appears as blank space on the display screen. The most common whitespace character is the space character itself, but the set of whitespace characters also includes tabs and the characters that mark the end of a line.

If you want to read a string that contains whitespace characters, you can’t use the >> operator. The standard approach is to call the function

    getline(cin, str);

which reads an entire line from the console input stream cin into the variable str, which is passed by reference. It is, however, often more convenient to use the getLine function introduced at the end of Chapter 1, which displays a prompt on the console, waits for the user to enter a line, and then returns the entire input line to the caller. Incorporating the getLine function into the HelloName program gives rise to the code shown in Figure 3-1 and enables the program to display the full name of the user, as follows:

    HelloName
    Enter your full name: Eric Roberts
    Hello, Eric Roberts!
In practice, reading a complete line is a much more common operation than reading a substring bounded by whitespace characters. As a result, programs that need to read a string from the user are far more likely to use the getLine function than the >> operator.

Figure 3-1. An interactive version of the “Hello World” program

    /*
     * File: HelloName.cpp
     * -------------------
     * This program extends the classic "Hello world" program by asking
     * the user for a name, which is then used as part of the greeting.
     * This version of the program reads a complete line into name and
     * not just the first word.
     */

    #include <iostream>
    #include <string>
    #include "simpio.h"
    using namespace std;

    int main() {
        string name = getLine("Enter your full name: ");
        cout << "Hello, " << name << "!" << endl;
        return 0;
    }

3.2 String operations

If you need to perform more complex operations using the <string> library, you will discover that the data type string doesn’t behave in precisely the same way that more traditional types do. One of the major differences is in the syntax used to invoke function calls. For example, if you were assured that the <string> library exports a function named length, you might imagine that the way to determine the length of a string str would be to use a statement like this:

    int nChars = length(str);

As the bug icon emphasizes, this statement is incorrect in C++. The problem with this expression is that the data type string is not a traditional type but is instead an example of a class, which is probably easiest to define informally as a template that describes a set of values together with an associated set of operations. In the language of object-oriented programming, the values that belong to a class are called objects. A single class can give rise to many different objects; each such object is said to be an instance of that class.

The operations that apply to instances of a class are called methods.
In C++, methods look and behave much like traditional functions, although it helps to give them a new name to emphasize that there are differences. Unlike functions, methods are tightly coupled to the class to which they belong. In cases where it is useful to emphasize the distinction, traditional functions are sometimes called free functions because they are not bound to a particular class.

In the object-oriented world, objects communicate by sending information and requests from one object to another. Collectively, these transmissions are called messages. The act of sending a message corresponds to having one object invoke a method that belongs to a different object. For consistency with the conceptual model of sending messages, the object that initiates the method is called the sender, and the object that is the target of that transmission is called the receiver. In C++, sending a message is specified using the following syntax:

    receiver.name(arguments)

The object-oriented version of the statement that sets nChars to the length of the string object str is therefore

    int nChars = str.length();

Table 3-1 lists the most common methods exported by the <string> library, all of which use the receiver syntax.

TABLE 3-1  Common methods in the <string> library

String operators

    str1 + str2
        Concatenates str1 and str2 end to end and returns a new string containing the combined characters. Either string can be replaced by a character, but numeric types are not allowed.
    str += str2
        Appends a copy of str2 to the end of str. C++ overloads this operator so that str2 can be a character value as well.
    str1 == str2    str1 != str2
    str1 < str2     str1 <= str2
    str1 > str2     str1 >= str2
        C++ overloads the relational operators so that they compare strings using lexicographic order, which is based on the underlying character codes.
    str[k]
        Returns the character at index position k in str as an assignable value. The [] operator does not check to see if k is in range.
Methods that leave the receiver string unchanged

    str.length()
        Returns the number of characters in str.
    str.at(k)
        Returns the character at index position k in str. In contrast to the [] operator, at generates an exception if k is out of range.
    str.substr(pos, n)
        Returns a new string made of characters starting at pos in str and extending for n characters or up to the end of the string. The second parameter is optional. If n is missing, the substring always extends through the end of the string.
    str.compare(str2)
        Compares the receiver string str with str2 and returns an integer that is 0 if the two strings are equal, negative if str precedes str2, and positive if str follows str2 in lexicographic order. Because C++ overloads the relational operators, programmers rarely call compare explicitly.
    str.find(pattern, pos)
        Searches the receiver string str for pattern, starting at position pos. The function returns the first index at which pattern (which can be either a character or a string) appears; if pattern does not appear, find returns the constant string::npos. The second argument is optional; if it is missing, find starts at the beginning of the string.

Methods that destructively modify the receiver string

    str.erase(pos, n)
        Deletes n characters from str starting at pos.
    str.insert(pos, str2)
        Inserts a copy of str2 into str starting at pos.
    str.replace(pos, n, str2)
        Replaces the n characters in str starting at pos with a copy of str2.

Methods for creating C++ and C-style strings

    string(carray)
        Returns a C++ string containing the same characters as carray.
    string(n, ch)
        Returns a C++ string containing n copies of the character ch.
    str.c_str()
        Returns a C-style character array containing the same characters as str.
Operator overloading

As you can see from the first section of Table 3-1, the <string> library redefines several of the standard operators using an extremely powerful C++ feature called operator overloading, which redefines the behavior of operators depending on the types of their operands. In the <string> library, the most important overloaded operator is the + sign. When + is applied to numbers, it performs addition. When + is applied to strings, it performs concatenation, which is just a fancy word for joining the two strings together end to end.

You can also use the shorthand += form to concatenate new text to the end of an existing string. Both the + and += operators allow concatenation of strings or single characters, as illustrated by the following example, which sets the string variable str to the string "abcd":

    string str = "abc";
    str += 'd';

Those of you who are familiar with Java may expect the + operator to support values of other types by converting those values to strings and then concatenating them together. That feature is not available in C++, which treats any attempt to use the + operator with incompatible operands as an error.

C++ also overloads the relational operators so that you can compare string values much more conveniently than you can in many other languages, including C and Java. For example, you can check to see whether the value of str is equal to "quit" like this:

    if (str == "quit") . . .

The relational operators compare strings using lexicographic order, which is the order defined by the underlying ASCII codes. Lexicographic order means that case is significant, so "abc" is not equal to "ABC".

Selecting characters from a string

In C++, positions within a string are numbered starting from 0. For example, the characters in the string "hello, world" are numbered as in the following diagram:

    h  e  l  l  o  ,     w  o  r  l  d
    0  1  2  3  4  5  6  7  8  9  10 11

The position number written underneath each character is called its index.
The <string> library offers two different mechanisms for selecting individual characters from a string. One approach is to supply the index inside square brackets after the string. For example, if the string variable str contains "hello, world", the expression

    str[0]

selects the character 'h' at the beginning of the string. Although C++ programmers tend to use the square-bracket syntax for its expressiveness, it is arguably better to call the at method instead. The expressions str[i] and str.at(i) have almost the same meaning in C++; the only difference is that at checks to make sure the index is in range.

No matter which syntax you use, selecting an individual character in a string returns a direct reference to the character in the string, which allows you to assign a new value to that character. For example, you can use either the statement

    str[0] = 'H';

or the statement

    str.at(0) = 'H';

to change the value of the string from "hello, world" to "Hello, world". Partly because the second form generates confusion by suggesting that it is possible to assign a value to the result of any function and partly because experience has shown that students have more trouble understanding examples written using at, I have chosen to use bracket selection for the programming examples in this book, despite the lack of range-checking.

Although index positions in strings are conceptually integers, the string class complicates matters by using the type size_t to represent both index positions and lengths in a string. If you want to be slavishly correct in your coding, you should use that type (which is automatically defined when you include the <string> header file) every time you store a string index in a variable. At Stanford, we have found that the additional conceptual complexity of using size_t gets in the way of student understanding and have chosen to use int for this purpose.
Using int works just fine unless your strings are longer than 2,147,483,647 characters, although you may get a warning message from some compilers.

String assignment

C++ does take some steps to mitigate the conceptual problems that follow from allowing the client to change individual characters in an existing string. In particular, C++ redefines assignment for strings so that assigning one string to another copies the underlying characters. For example, the assignment statement

    str2 = str1;

overwrites any previous contents of str2 with a copy of the characters contained in str1. The variables str1 and str2 therefore remain independent, which means that changing the characters in str1 does not affect str2. Similarly, C++ copies the characters in any string passed as a value parameter. Thus, if a function makes changes to an argument that happens to be a string, those changes are not reflected in the calling function unless the string parameter is passed by reference.

Extracting parts of a string

Concatenation makes longer strings from shorter pieces. You often need to do the reverse: separate a string into the shorter pieces it contains. A string that is part of a longer string is called a substring. The string class exports a method called substr that takes two parameters: the index of the first character you want to select and the desired number of characters. Calling str.substr(start, n) creates a new string by extracting n characters from str starting at the index position specified by start. For example, if str contains the string "hello, world", the method call

    str.substr(1, 3)

returns the three-character substring "ell". Because indices in C++ begin at 0, the character at index position 1 is the character 'e'.

The second argument in the substr method is optional. If it is missing, substr returns the substring that starts at the specified position and continues through the end of the string.
Thus, calling

    str.substr(7)

returns the string "world". Similarly, if n is supplied but fewer than n characters follow the specified starting position, substr returns characters only up to the end of the original string.

The following function uses substr to return the second half of the parameter str, which is defined to include the middle character if the length of str is odd:

    string secondHalf(string str) {
        return str.substr(str.length() / 2);
    }

Searching within a string

From time to time, you will find it useful to search a string to see whether it contains a particular character or substring. To make such search operations possible, the string class exports a method called find, which comes in several forms. The simplest form of the call is

    str.find(search);

where search is the content you’re looking for, which can be either a string or a character. When called, the find method searches through str looking for the first occurrence of the search value. If the search value is found, find returns the index position at which the match begins. If the search value does not appear before the end of the string, find returns the constant string::npos. Unlike the constants that you saw in Chapter 1, the identifier npos is defined as part of the string class and therefore requires the string:: qualifier whenever it appears.

The find method also takes an optional second argument that indicates the index position at which to start the search. The effect of both styles of the find method is illustrated by the following examples, which assume that the variable str contains the string "hello, world":

    str.find('o')       →  4
    str.find('o', 5)    →  8
    str.find('x')       →  string::npos

As with string comparison, the methods for searching a string consider uppercase and lowercase characters to be different.
Iterating through the characters in a string

Even though the methods exported by the string class give you the tools you need to implement string applications from scratch, it is usually easier to write programs by adapting existing code examples that implement particularly common operations. In programming terminology, such illustrative examples are called patterns. When you work with strings, one of the most important patterns involves iterating through the characters in a string, which requires the following code:

    for (int i = 0; i < str.length(); i++) {
        . . . body of loop that manipulates str[i] . . .
    }

On each loop cycle, the selection expression str[i] refers to the ith character in the string. Because the purpose of the loop is to process every character, the loop continues until i reaches the length of the string. Thus, you can count the number of spaces in a string using the following function:

    int countSpaces(string str) {
        int nSpaces = 0;
        for (int i = 0; i < str.length(); i++) {
            if (str[i] == ' ') nSpaces++;
        }
        return nSpaces;
    }

For some applications, you will find it useful to iterate through a string in the opposite direction, starting with the last character and continuing backward until you reach the first. This style of iteration uses the following for loop:

    for (int i = str.length() - 1; i >= 0; i--)

Here, the index i begins at the last index position, which is one less than the length of the string, and then decreases by one on each cycle, down to and including the index position 0.

Assuming that you understand the syntax and semantics of the for statement, you could work out the patterns for each iteration direction from first principles each time this pattern comes up in an application. Doing so, however, will slow you down enormously. These iteration patterns are worth memorizing so that you don’t have to waste any time thinking about them.
Whenever you recognize that you need to cycle through the characters in a string, some part of your nervous system between your brain and your fingers should be able to translate that idea effortlessly into the following line:

    for (int i = 0; i < str.length(); i++)

For some applications, you will need to modify the basic iteration pattern to start or end the iteration at a different index position. For example, the following function checks to see whether a string begins with a particular prefix:

    bool startsWith(string str, string prefix) {
        if (str.length() < prefix.length()) return false;
        for (int i = 0; i < prefix.length(); i++) {
            if (str[i] != prefix[i]) return false;
        }
        return true;
    }

This code begins by checking to make sure that str is not shorter than prefix (in which case, the result must certainly be false) and then iterates only through the characters in prefix rather than the string as a whole.

As you read through the code for the startsWith function, it is useful to pay attention to the placement of the two return statements. The code returns false from inside the loop, as soon as it discovers the first difference between the string and the prefix. The code returns true from outside the loop, after it has checked every character in the prefix without finding any differences. You will encounter examples of this basic pattern over and over again as you read through this book.

The startsWith function and the similar endsWith function you will have a chance to write in exercise 1 turn out to be very useful, even though they are not part of the standard <string> library in C++. They are thus ideal candidates for inclusion in a library of string functions that you create for yourself by applying the same techniques used in Chapter 2 to define the error library. The Stanford libraries include an interface called strlib.h that exports several useful string functions.
The contents of that interface are described in more detail later in this chapter and appear in full in Appendix A.

Growing a string through concatenation

The other important pattern to memorize as you learn how to work with strings involves creating a new string one character at a time. The loop structure itself will depend on the application, but the general pattern for creating a string by concatenation looks like this:

    string str = "";
    for (whatever loop header line fits the application) {
        str += the next substring or character;
    }

As a simple example, the following function returns a string consisting of n copies of the character ch:

    string repeatChar(int n, char ch) {
        string str = "";
        for (int i = 0; i < n; i++) {
            str += ch;
        }
        return str;
    }

The repeatChar function is useful, for example, if you need to generate some kind of section separator in console output. One strategy to accomplish this goal would be to use the statement

    cout << repeatChar(72, '-') << endl;

which prints a line of 72 hyphens.

Many string-processing functions use the iteration and concatenation patterns together. For example, the following function reverses the argument string so that, for example, calling reverse("desserts") returns "stressed":

    string reverse(string str) {
        string rev = "";
        for (int i = str.length() - 1; i >= 0; i--) {
            rev += str[i];
        }
        return rev;
    }

3.3 The <cctype> library

Given that strings are composed of characters, it is often useful to have tools for working with those individual characters, and not just the string as a whole. The <cctype> library exports a variety of functions that work with characters, the most common of which appear in Table 3-2.

The first section of Table 3-2 defines a set of predicate functions to test whether a character belongs to a particular category. Calling isdigit(ch), for example, returns true if ch is one of the digit characters in the range between '0' and '9'.
TABLE 3-2  Selected functions in the <cctype> library

Predicate functions for testing character type

    isalpha(ch)
        Returns true if ch is an alphabetic character.
    isupper(ch)
        Returns true if ch is an uppercase alphabetic character.
    islower(ch)
        Returns true if ch is a lowercase alphabetic character.
    isdigit(ch)
        Returns true if ch is a digit ('0'-'9').
    isxdigit(ch)
        Returns true if ch is a hexadecimal digit ('0'-'9', 'A'-'F', 'a'-'f').
    isalnum(ch)
        Returns true if ch is alphanumeric, which means that it is either a letter or a digit.
    ispunct(ch)
        Returns true if ch is a punctuation symbol.
    isspace(ch)
        Returns true if ch is a whitespace character. These characters are ' ' (the space character), '\t', '\n', '\f', '\v', or '\r', all of which appear as blank space.
    isprint(ch)
        Returns true if ch is any printable character.

Functions for case conversion

    toupper(ch)
        Returns ch converted to upper case (or ch itself if ch is not a letter).
    tolower(ch)
        Returns ch converted to lower case (or ch itself if ch is not a letter).

Similarly, calling isspace(ch) returns true if ch is any of the characters that appear as white space on a display screen, such as spaces and tabs.

The functions in the second section of Table 3-2 make it easy to convert between uppercase and lowercase letters. Calling toupper('a'), for example, returns the character 'A'. If the argument to either the toupper or tolower function is not a letter, the function returns its argument unchanged, so that tolower('7') returns '7'.

The functions in the <cctype> library often come in handy when you are working with strings.
The following function, for example, returns true if the argument str is a nonempty string of digits, which means that it represents an integer:

    bool isDigitString(string str) {
        if (str.length() == 0) return false;
        for (int i = 0; i < str.length(); i++) {
            if (!isdigit(str[i])) return false;
        }
        return true;
    }

Similarly, the following function returns true if the strings s1 and s2 are equal, ignoring differences in case:

    bool equalsIgnoreCase(string s1, string s2) {
        if (s1.length() != s2.length()) return false;
        for (int i = 0; i < s1.length(); i++) {
            if (tolower(s1[i]) != tolower(s2[i])) return false;
        }
        return true;
    }

The implementation of equalsIgnoreCase returns false as soon as it finds the first character position that doesn’t match, but must wait until it reaches the end of the loop to return true.

3.4 Modifying the contents of a string

Unlike other languages such as Java, C++ allows you to change the characters in a string by assigning new values to a particular index position. That fact makes it possible to design your own string functions that change the content of a string in much the same way that the erase, insert, and replace functions do. In most cases, however, it is better to write functions so that they return a transformed version of a string without changing the original.

As an illustration of how these two approaches differ, suppose that you want to design a string counterpart to the toupper function in the <cctype> library that converts every lowercase character in a string to its uppercase equivalent. One approach is to implement a procedure that changes the contents of the argument string, as follows:

    void toUpperCaseInPlace(string & str) {
        for (int i = 0; i < str.length(); i++) {
            str[i] = toupper(str[i]);
        }
    }

An alternative strategy is to write a function that returns an uppercase copy of its argument without changing the original.
If you use the iteration and concatenation patterns, such a function might look like this:

    string toUpperCase(string str) {
        string result = "";
        for (int i = 0; i < str.length(); i++) {
            result += toupper(str[i]);
        }
        return result;
    }

The strategy that modifies the string in place is more efficient, but the second is more flexible and less likely to cause unexpected results. Having the first version, however, makes it possible to code the second in a more efficient way:

    string toUpperCase(string str) {
        toUpperCaseInPlace(str);
        return str;
    }

In this implementation, C++ automatically copies the argument string because it is passed by value. Given that str is no longer connected to the argument string in the caller’s domain, it is perfectly acceptable to modify it in place and then return a copy back to the caller.

3.5 The legacy of C-style strings

In its early years, C++ succeeded in part because it includes all of C as a subset, thereby making it possible to evolve gradually from one language to the other. That design decision, however, means that C++ includes some aspects of C that no longer make sense in a modern object-oriented language, but nonetheless need to be maintained for compatibility.

In C, strings are implemented as low-level arrays of characters, which offer none of the high-level facilities that make the string class so useful. Unfortunately, the decision to keep C++ compatible with C means that C++ must support both styles. String literals, for example, are implemented using the older C-based style. For the most part, you can ignore this historical detail because C++ automatically converts a string literal to a C++ string whenever the compiler can determine that what you want is a C++ string. If you initialize a string using the line

    string str = "hello, world";

C++ automatically converts the C-style string literal "hello, world" to a C++ string object, because you’ve told the compiler that str is a string variable.
By contrast, C++ does not allow you to write the declaration

    string str = "hello" + ", " + "world";

even though it seems as if this statement would have the same ultimate effect. The problem here is that this version of the code tries to apply the + operator to string literals, which are not C++ string objects.

If you need to get around this problem, you can explicitly convert a string literal to a string object by calling string on the literal. The following line, for example, correctly converts "hello" to a C++ string object and then uses concatenation to complete the calculation of the initial value:

    string str = string("hello") + ", " + "world";

Another problem that arises from having two different representations for strings is that some C++ libraries require the use of C-style strings instead of the more modern C++ string class. If you use these library abstractions in the context of an application that uses C++ strings, you must at some point convert the C++ string objects into their older, C-style counterparts. Specifying that conversion is simple enough: all you have to do is apply the c_str method to the C++ version of a string to obtain its C-style equivalent. The more important problem, however, is that having to master two different representations for strings increases the conceptual complexity of C++, thereby making it harder to learn.

3.6 Writing string applications

Although they are useful to illustrate how particular string functions work, the string examples you have seen so far are too simple to give you much insight into writing a significant string-processing application. This section addresses that deficiency by developing two applications that manipulate string data.

Recognizing palindromes

A palindrome is a word that reads identically backward and forward, such as level or noon. The goal of this section is to write a predicate function isPalindrome that checks whether a string is a palindrome.
Calling isPalindrome("level") should return true; calling isPalindrome("xyz") should return false.

As with most programming problems, there are several reasonable strategies for solving this problem. In my experience, the approach that most students are likely to try first uses a for loop to run through each index position in the first half of the string. At each position, the code then checks to see whether that character matches the one that appears in the symmetric position relative to the end of the string. Adopting that strategy leads to the following code:

    bool isPalindrome(string str) {
        int n = str.length();
        for (int i = 0; i < n / 2; i++) {
            if (str[i] != str[n - i - 1]) return false;
        }
        return true;
    }

Given the string functions you’ve already encountered in this chapter, you can also code isPalindrome in the following, much simpler form:

    bool isPalindrome(string str) {
        return str == reverse(str);
    }

Of these two implementations, the first is more efficient. The second version has to construct a new string that is the reverse of the original; worse still, it does so by concatenating the characters together one at a time, which means that the program creates as many intermediate string values as there are characters in the original string. The first version doesn’t have to create any strings at all. It does its work by selecting and comparing characters, which turn out to be less costly operations.

Despite this difference in efficiency, I believe that the second coding is much better, particularly as an example for new programmers. For one thing, it takes advantage of existing code by making use of the reverse function. For another, it hides the complexity involved in calculating index positions required by the first version. It takes at least a minute or two for most students to figure out why the code includes the selection expression str[n - i - 1] or why it is appropriate to use the < operator in the for loop test, as opposed to <=.
By contrast, the line

    return str == reverse(str);

reads almost as fluidly as English: a string is a palindrome if it is equal to the same string in reverse order.

Particularly as you are learning about programming, it is much more important to work toward the clarity of the second implementation than the efficiency of the first. Given the speed of modern computers, it is almost always worth sacrificing a few machine cycles to make a program easier to understand.

Translating English to Pig Latin

To give you more of a sense of how to implement string-processing applications, this section describes a C++ program that reads a line of text from the user and then translates each word in that line from English to Pig Latin, a made-up language familiar to most children in the English-speaking world. In Pig Latin, words are formed from their English counterparts by applying the following rules:

1. If the word contains no vowels, no translation is done, which means that the translated word is the same as the original.

2. If the word begins with a vowel, the function adds the string "way" to the end of the original word.

3. If the word begins with a consonant, the function extracts the string of consonants up to the first vowel, moves that collection of consonants to the end of the word, and adds the string "ay".

As an example, suppose that the English word is scram. Because the word begins with a consonant, you divide it into two parts: one consisting of the letters before the first vowel and one consisting of that vowel and the remaining letters:

    scr   am

You then interchange these two parts and add ay at the end, as follows:

    am   scr   ay

Thus the Pig Latin word for scram is amscray. For a word that begins with a vowel, such as apple, you simply add way to the end, which leaves you with appleway.

The code for the PigLatin program appears in Figure 3-2.
The main program reads a line of text from the user and then calls lineToPigLatin to convert that line to Pig Latin. The lineToPigLatin function then calls wordToPigLatin to convert each word to its Pig Latin equivalent. Characters that are not part of a word are copied directly to the output line so that punctuation and spacing remain unaffected.

FIGURE 3-2  Program to translate English to Pig Latin

    /*
     * File: PigLatin.cpp
     * ------------------
     * This program converts lines from English to Pig Latin.
     * This dialect of Pig Latin applies the following rules:
     *
     * 1. If the word contains no vowels, return the original
     *    word unchanged.
     *
     * 2. If the word begins with a consonant, extract the set
     *    of consonants up to the first vowel, move that set
     *    of consonants to the end of the word, and add "ay".
     *
     * 3. If the word begins with a vowel, add "way" to the
     *    end of the word.
     */

    #include <iostream>
    #include <string>
    #include <cctype>
    using namespace std;

    /* Function prototypes */

    string lineToPigLatin(string line);
    string wordToPigLatin(string word);
    int findFirstVowel(string word);
    bool isVowel(char ch);

    /* Main program */

    int main() {
        cout << "This program translates English to Pig Latin." << endl;
        string line;
        cout << "Enter English text: ";
        getline(cin, line);
        string translation = lineToPigLatin(line);
        cout << "Pig Latin output: " << translation << endl;
        return 0;
    }

    /*
     * Function: lineToPigLatin
     * Usage: string translation = lineToPigLatin(line);
     * -------------------------------------------------
     * Translates each word in the line to Pig Latin, leaving all other
     * characters unchanged. The variable start keeps track of the index
     * position at which the current word begins. As a special case,
     * the code sets start to -1 to indicate that the beginning of the
     * current word has not yet been encountered.
     */

    string lineToPigLatin(string line) {
        string result;
        int start = -1;
        for (int i = 0; i < line.length(); i++) {
            char ch = line[i];
            if (isalpha(ch)) {
                if (start == -1) start = i;
            } else {
                if (start >= 0) {
                    result += wordToPigLatin(line.substr(start, i - start));
                    start = -1;
                }
                result += ch;
            }
        }
        if (start >= 0) result += wordToPigLatin(line.substr(start));
        return result;
    }

    /*
     * Function: wordToPigLatin
     * Usage: string translation = wordToPigLatin(word);
     * -------------------------------------------------
     * This function translates a word from English to Pig Latin using
     * the rules specified in the text. The translated word is
     * returned as the value of the function.
     */

    string wordToPigLatin(string word) {
        int vp = findFirstVowel(word);
        if (vp == -1) {
            return word;
        } else if (vp == 0) {
            return word + "way";
        } else {
            string head = word.substr(0, vp);
            string tail = word.substr(vp);
            return tail + head + "ay";
        }
    }

    /*
     * Function: findFirstVowel
     * Usage: int k = findFirstVowel(word);
     * ------------------------------------
     * Returns the index position of the first vowel in word. If
     * word does not contain a vowel, findFirstVowel returns -1.
     */

    int findFirstVowel(string word) {
        for (int i = 0; i < word.length(); i++) {
            if (isVowel(word[i])) return i;
        }
        return -1;
    }

    /*
     * Function: isVowel
     * Usage: if (isVowel(ch)) . . .
     * -----------------------------
     * Returns true if the character ch is a vowel.
     */

    bool isVowel(char ch) {
        switch (ch) {
            case 'A': case 'E': case 'I': case 'O': case 'U':
            case 'a': case 'e': case 'i': case 'o': case 'u':
                return true;
            default:
                return false;
        }
    }

A sample run of the program might look like this:

    PigLatin
    This program translates English to Pig Latin.
    Enter English text: this is pig latin.
    Pig Latin output: isthay isway igpay atinlay.

It is worth taking a careful look at the implementations of lineToPigLatin and wordToPigLatin in Figure 3-2.
The lineToPigLatin function finds the word boundaries in the input and provides a useful pattern for separating a string into individual words. You will see much the same strategy in the implementation of the token scanner in Chapter 6. The wordToPigLatin function uses substr to extract pieces of the English word and then uses concatenation to put them back together in their Pig Latin form.

3.7 The strlib.h library

As I’ve noted from time to time in the text, several of the functions in this chapter seem ideal for inclusion in a library. Once you have written these functions as part of one application, it would be silly not to use them in other applications that need to perform the same operations. While functions like wordToPigLatin are unlikely to show up anywhere else, you will often have occasion to use functions like toUpperCase and startsWith. To avoid having to rewrite them or to cut-and-paste the code, it makes sense to put those functions into a library so that you always have them available.

The Stanford libraries distributed with the book include an interface called strlib.h that exports the functions shown in Table 3-6. You can see on this list several functions whose definitions appear in this chapter. In the exercises, you will have a chance to fill in the definitions of some of the others, including endsWith and trim. The first four functions in Table 3-6, all of which are concerned with converting between numeric values and their string forms, require techniques that are beyond the limits of your knowledge of C++, but only for the moment. In Chapter 4, you will learn how to use a new data type called a stream, which will make it easy to implement these functions.

TABLE 3-6  Functions exported by the strlib.h interface

integerToString(n)
    Converts an integer into the corresponding string of digits.
stringToInteger(str)
    Converts a string of digits into an integer.
realToString(d)
    Converts a floating-point number into the corresponding string form.
stringToReal(str)
    Converts a string representing a real number into its corresponding value.
toUpperCase(str)
    Returns a new string in which all lowercase characters have been converted into their uppercase equivalents.
toLowerCase(str)
    Returns a new string in which all uppercase characters have been converted into their lowercase equivalents.
equalsIgnoreCase(s1, s2)
    Returns true if s1 and s2 are equal discounting differences in case.
startsWith(str, prefix)
    Returns true if the string str starts with the specified prefix, which may be either a string or a character.
endsWith(str, suffix)
    Returns true if the string str ends with the specified suffix, which may be either a string or a character.
trim(str)
    Returns a new string after removing any whitespace characters from the beginning and end of the argument.

Summary

In this chapter, you have learned how to use the <string> library, which makes it possible to write string-processing functions without worrying about the details of the underlying representation. The important points in this chapter include:

• The <string> library exports a class called string that represents a sequence of characters. Although C++ also includes a more primitive string type to maintain compatibility with the C programming language, it is best to use the string class in any programs that you write.

• If you use the >> extraction operator to read string data, the input stops at the first whitespace character. If you want to read an entire line of text from the user, it is usually better to use the getLine function exported by the simpio.h interface.

• The most common methods exported by the string class appear in Table 3-1 on page 128. Because string is a class, the methods use the receiver syntax instead of the traditional functional form. Thus, to obtain the length of a string stored in the variable str, you need to invoke str.length().

• Several of the methods exported by the string class destructively modify the receiver string.
Giving clients free access to methods that change the internal state of an object makes it harder to protect that object’s integrity. The programs in this book therefore minimize the use of these methods.

• The string class uses operator overloading to simplify many common string operations. For strings, the most important operators are + (for concatenation), [] (for selection), and the relational operators.

• The standard pattern for iterating through the characters in a string is

    for (int i = 0; i < str.length(); i++) {
       . . . body of loop that manipulates str[i] . . .
    }

• The standard pattern for growing a string by concatenation is

    string str = "";
    for (whatever loop header line fits the application) {
       str += the next substring or character;
    }

• The <cctype> library exports several functions for working with individual characters. The most important of these functions appear in Table 3-2.

Review questions

1. What is the difference between a character and a string?

2. True or false: If you execute the lines

    string line;
    cin >> line;

   the program will read an entire line of data from the user and store it in the variable line.

3. Which arguments to the getline function are passed by reference?

4. What is the difference between a method and a free function?

5. True or false: In C++, you can determine the length of the string stored in the variable str by calling length(str).

6. If you call s1.replace(0, 1, s2), which string is the receiver?

7. What is the effect of the + operator when it is used with two string operands?

8. When C++ evaluates the expression s1 < s2, what rule does the string class use to compare the string values?

9. What two syntactic forms does this chapter describe for selecting an individual character from a string? How do these two syntactic forms differ in their implementation?

10. When you select an individual character from a C++ string, you can use either the at method or the standard subscript notation using square brackets.
From the client’s perspective, what is the difference between these two options?

11. True or false: If you assign the value of the string variable s1 to the string variable s2, the string class copies the characters so that subsequent changes to the characters in one string will not affect the characters in the other.

12. True or false: The index positions in a string begin at 0 and extend up to the length of the string minus 1.

13. What are the arguments to the substr method? What happens if you omit the second argument?

14. What value does the find method return to indicate that the search string does not appear?

15. What are the arguments to the substr method? What happens if you omit the second argument?

16. What is the significance of the optional second argument to the find method?

17. What is the effect of each of the following calls to the <string> library, given these declarations:

    string s = "ABCDE";
    string t = "";

    a. s.length()              f. s.replace(0, 2, "Z")
    b. t.length()              g. s.substr(0, 3)
    c. s[2]                    h. s.substr(4)
    d. s + t                   i. s.substr(3, 9)
    e. t += 'a'                j. s.substr(3, 3)

18. What is the pattern for iterating through each character in a string?

19. How does this pattern change if you want to iterate through the characters in reverse order, starting with the last character and ending with the first?

20. What is the pattern for growing a string through concatenation?

21. What is the result of each of the following calls to the <cctype> library:

    a. isdigit(7)
    b. isdigit('7')
    c. isalnum(7)
    d. toupper(7)
    e. toupper('A')
    f. tolower('A')

22. Why does C++ support both a string class and a more primitive string type?

23. How can you convert a primitive string value to a C++ string? How can you specify a conversion in the opposite direction?

Exercises

1. Implement the function endsWith(str, suffix), which returns true if str ends with suffix. Like its startsWith counterpart, the endsWith function should allow the second argument to be either a string or a character.

2.
The strlib.h interface exports a function trim(str) that returns a new string formed by removing all whitespace characters from the beginning and end of str. Write the corresponding implementation.

3. Without using the built-in string method substr, implement a free function substr(str, pos, n) that returns the substring of str beginning at position pos and containing at most n characters. Make sure that your function correctly applies the following rules:

   • If n is missing or greater than the length of the string, the substring should extend through the end of the original string.
   • If pos is greater than the length of the string, substr should call error with an appropriate message.

4. Implement a function capitalize(str) that returns a string in which the initial character is capitalized (if it is a letter) and all other letters are converted to lowercase. Characters other than letters are not affected. For example, both capitalize("BOOLEAN") and capitalize("boolean") should return the string "Boolean".

5. In most word games, each letter in a word is scored according to its point value, which is inversely proportional to its frequency in English words. In Scrabble™, the points are allocated as follows:

    Points   Letters
       1     A, E, I, L, N, O, R, S, T, U
       2     D, G
       3     B, C, M, P
       4     F, H, V, W, Y
       5     K
       8     J, X
      10     Q, Z

   For example, the word "FARM" is worth 9 points in Scrabble: 4 for the F, 1 each for the A and the R, and 3 for the M. Write a program that reads in words and prints out their score in Scrabble, not counting any of the other bonuses that occur in the game. You should ignore any characters other than uppercase letters in computing the score. In particular, lowercase letters are assumed to represent blank tiles, which can stand for any letter but have a score of 0.

6. An acronym is a new word formed by combining, in order, the initial letters of a series of words.
For example, the word scuba is an acronym formed from the first letters in self-contained underwater breathing apparatus. Similarly, AIDS is an acronym for Acquired Immune Deficiency Syndrome. Write a function acronym that takes a string and returns the acronym formed from that string. To ensure that your function treats hyphenated compounds like self-contained as two words, it should define the beginning of a word as any alphabetic character that appears either at the beginning of the string or after a nonalphabetic character.

7. Write a function

    string removeCharacters(string str, string remove);

   that returns a new string consisting of the characters in str after removing all instances of the characters in remove. For example, if you call

    removeCharacters("counterrevolutionaries", "aeiou")

   the function should return "cntrrvltnrs", which is the original string after removing all of its vowels.

8. Modify your solution to exercise 7 so that, instead of using a function that returns a new string, you define a function removeCharactersInPlace that removes the letters from the string passed as the first argument.

9.     The waste of time in spelling imaginary sounds and their history (or etymology as it is called) is monstrous in English . . .
           George Bernard Shaw, 1941

   In the early part of the 20th century, there was considerable interest in both England and the United States in simplifying the rules used for spelling English words, which has always been a difficult proposition. One suggestion advanced as part of this movement was to eliminate all doubled letters, so that bookkeeper would be written as bokeper and committee would become comite. Write a function removeDoubledLetters(str) that returns a new string in which any duplicated characters in str have been replaced by a single copy.

10. Write a function

    string replaceAll(string str, char c1, char c2);

   that returns a copy of str with every occurrence of c1 replaced by c2.
For example, calling

    replaceAll("nannies", 'n', 'd');

   should return "daddies". Once you have coded and tested this function, write an overloaded version

    string replaceAll(string str, string s1, string s2);

   that replaces all instances of the string s1 with the replacement string s2.

11. The concept of a palindrome is often extended to full sentences by ignoring punctuation and differences in the case of letters. For example, the sentence

    Madam, I’m Adam.

   is a sentence palindrome, because if you only look at the letters and ignore any distinction between uppercase and lowercase letters, it reads identically backward and forward. Write a predicate function isSentencePalindrome(str) that returns true if the string str fits this definition of a sentence palindrome. For example, you should be able to use your function to write a main program capable of producing the following sample run:

    SentencePalindrome
    This program tests for sentence palindromes.
    Indicate the end of the input with a blank line.
    Enter a sentence: Madam, I'm Adam.
    That sentence is a palindrome.
    Enter a sentence: A man, a plan, a canal -- Panama!
    That sentence is a palindrome.
    Enter a sentence: Not a palindrome.
    That sentence is not a palindrome.
    Enter a sentence:

12. Write a function createRegularPlural(word) that returns the plural of word formed by following these standard English rules:

    a. If the word ends in s, x, z, ch, or sh, add es to the word.
    b. If the word ends in a y preceded by a consonant, change the y to ies.
    c. In all other cases, add just an s.

   Write a test program and design a set of test cases to verify that your program works.

13. Like most other languages, English includes two types of numbers. The cardinal numbers (such as one, two, three, and four) are used in counting; the ordinal numbers (such as first, second, third, and fourth) are used to indicate a position in a sequence.
In text, ordinals are usually indicated by writing the digits in the number, followed by the last two letters of the English word that names the corresponding ordinal. Thus, the ordinal numbers first, second, third, and fourth often appear in print as 1st, 2nd, 3rd, and 4th. The ordinals for 11, 12, and 13, however, are 11th, 12th, and 13th. Devise a rule that determines what suffix should be added to each number, and then use this rule to write a function createOrdinalForm(n) that returns the ordinal form of the number n as a string.

14. When large numbers are written out on paper, it is traditional (at least in the United States) to use commas to separate the digits into groups of three. For example, the number one million is usually written in the following form:

    1,000,000

   To make it easier for programmers to display numbers in this fashion, implement a function

    string addCommas(string digits);

   that takes a string of decimal digits representing a number and returns the string formed by inserting commas at every third position, starting on the right. For example, if you were to execute the main program

    int main() {
       while (true) {
          string digits = getLine("Enter a number: ");
          if (digits == "") break;
          cout << addCommas(digits) << endl;
       }
       return 0;
    }

   your implementation of the addCommas function should be able to produce the following sample run:

    AddCommas
    Enter a number: 17
    17
    Enter a number: 1001
    1,001
    Enter a number: 12345678
    12,345,678
    Enter a number: 999999999
    999,999,999
    Enter a number:

15. As written, the PigLatin program in Figure 3-2 behaves oddly if you enter a string that includes words beginning with an uppercase letter. For example, if you were to capitalize the first word in the sentence and the name of the Pig Latin language, you would see the following output:

    PigLatin
    This program translates English to Pig Latin.
    Enter English text: This is Pig Latin.
    Pig Latin output: isThay isway igPay atinLay.
Rewrite the wordToPigLatin function so that any word that begins with a capital letter in the English line still begins with a capital letter in Pig Latin. Thus, after making your changes in the program, the output should look like this:

    PigLatin
    This program translates English to Pig Latin.
    Enter English text: This is Pig Latin.
    Pig Latin output: Isthay isway Igpay Atinlay.

16. Most people, at least those in English-speaking countries, have played the Pig Latin game at some point in their lives. There are, however, other invented “languages” in which words are created using some simple transformation of English. One such language is called Obenglobish, in which words are created by adding the letters ob before the vowels (a, e, i, o, and u) in an English word. For example, under this rule, the word english gets the letters ob added before the e and the i to form obenglobish, which is how the language gets its name.

   In official Obenglobish, the ob characters are added only before vowels that are pronounced, which means that a word like game would become gobame rather than gobamobe because the final e is silent. While it is impossible to implement this rule perfectly, you can do a pretty good job by adopting the rule that the ob should be added before every vowel in the English word except

   • Vowels that follow other vowels
   • An e that occurs at the end of the word

   Write a function obenglobish that takes an English word and returns its Obenglobish equivalent, using the translation rule given above.
For example, if you used your function with the main program

    int main() {
       while (true) {
          string word = getLine("Enter a word: ");
          if (word == "") break;
          string trans = obenglobish(word);
          cout << word << " -> " << trans << endl;
       }
       return 0;
    }

   you should be able to generate the following sample run:

    Obenglobish
    Enter a word: english
    english -> obenglobish
    Enter a word: hobnob
    hobnob -> hobobnobob
    Enter a word: gooiest
    gooiest -> gobooiest
    Enter a word: amaze
    amaze -> obamobaze
    Enter a word: rot
    rot -> robot
    Enter a word:

17. If you played around with codes and ciphers as a child, the odds are good that you at some point used a cyclic cipher (often called a Caesar cipher because the Roman historian Suetonius records that Julius Caesar used this technique), in which you replace each letter in the original message by the letter that appears a fixed distance ahead in the alphabet. As an example, suppose that you wanted to encode a message by shifting every letter ahead three places. In this cipher, each A becomes a D, B becomes E, and so on. If you reach the end of the alphabet, the process cycles around to the beginning, so that X becomes A, Y becomes B, and Z becomes C.

   To implement a Caesar cipher, you should first define a function

    string encodeCaesarCipher(string str, int shift);

   that returns a new string formed by shifting every letter in str forward the number of letters indicated by shift, cycling back to the beginning of the alphabet if necessary. After you have implemented encodeCaesarCipher, write a program that generates the following sample run:

    CaesarCipher
    This program encodes a message using a Caesar cipher.
    Enter the number of character positions to shift: 13
    Enter a message: This is a secret message.
    Encoded message: Guvf vf n frperg zrffntr.

   Note that the transformation applies only to letters; any other characters are copied unchanged to the output.
Moreover, the case of letters is unaffected: lowercase letters come out as lowercase, and uppercase letters come out as uppercase. You should also write your program so that a negative value of shift means that letters are shifted toward the beginning of the alphabet instead of toward the end, as illustrated by the following sample run:

    CaesarCipher
    This program encodes a message using a Caesar cipher.
    Enter the number of character positions to shift: -1
    Enter a message: IBM 9000
    Encoded message: HAL 9000

18. Although they are certainly simple, Caesar ciphers are also extremely easy to break. There are, after all, only 25 values for the number of characters to shift. If you want to break a Caesar cipher, all you have to do is try each of the 25 possibilities and see which one translates the original message into something readable. A better scheme is to allow each letter in the original message to be represented by an arbitrary letter instead of one a fixed distance from the original. In this case, the key for the encoding operation is a translation table that shows what each of the 26 letters becomes in the encrypted form. Such a coding scheme is called a letter-substitution cipher.

   The key in a letter-substitution cipher is a 26-character string that indicates the translation for each character in the alphabet in order. For example, the key "QWERTYUIOPASDFGHJKLZXCVBNM" indicates that the encoding process should use the following translation rule, in which each letter in the top row is replaced by the letter directly below it:

    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    Q W E R T Y U I O P A S D F G H J K L Z X C V B N M

   Write a program that implements encryption using a letter-substitution cipher. Your program should be able to duplicate the following sample run:

    LetterSubstitutionCipher
    Letter substitution cipher.
    Enter a 26-letter key: QWERTYUIOPASDFGHJKLZXCVBNM
    Enter a message: WORKERS OF THE WORLD UNITE!
    Encoded message: VGKATKL GY ZIT VGKSR XFOZT!

19.
Using the definition of keys for letter-substitution ciphers as described in the preceding problem, write a function invertKey that takes an encryption key and returns the 26-letter key necessary to decrypt a message encoded with that encryption key.

20.     There is no gene for the human spirit.
           Tagline for the 1997 film GATTACA

   The genetic code for all living organisms is carried in their DNA, a molecule with the remarkable capacity to replicate its own structure. The DNA molecule itself consists of a long strand of chemical bases wound together with a similar strand in a double helix. DNA’s ability to replicate comes from the fact that its four constituent bases (adenine, cytosine, guanine, and thymine) combine with each other only in the following ways:

   • Cytosine on one strand links only with guanine on the other, and vice versa.
   • Adenine links only with thymine, and vice versa.

   Biologists abbreviate the names of the bases by writing only the initial letter: A, C, G, or T.

   Inside the cell, a DNA strand acts as a template to which other DNA strands can attach themselves. As an example, suppose that you have the following DNA strand, in which the position of each base has been numbered as it would be in a C++ string:

      T  A  A  C  G  G  T  A  C  G  T  C
      0  1  2  3  4  5  6  7  8  9 10 11

   Your mission in this exercise is to determine where a shorter DNA strand can attach itself to the longer one. If, for example, you were trying to find a match for the strand

      T  T  G  C  C

   the rules for DNA dictate that this strand can bind to the longer one only at position 1:

         T  T  G  C  C
      T  A  A  C  G  G  T  A  C  G  T  C
      0  1  2  3  4  5  6  7  8  9 10 11

   By contrast, the strand

      T  G  C

   matches at either position 2 or position 7. Write a function

    int findDNAMatch(string s1, string s2, int start = 0);

   that returns the first position at which the DNA strand s1 can attach to the strand s2.
As in the find method for the string class, the optional start parameter indicates the index position at which the search should start. If there is no match, findDNAMatch should return -1.

Chapter 4
Streams

    We will not be satisfied until justice rolls down like waters and righteousness like a mighty stream.
       Reverend Martin Luther King, Jr., “I Have a Dream,” August 28, 1963 (paraphrasing Amos 5:24)

Ever since HelloWorld back in Chapter 1, the programs in this book have made use of an important data structure called a stream, which C++ uses to manage the flow of information to or from some data source. In the earlier chapters, you have used the << and >> operators and have already had occasion to use the three standard streams exported by the <iostream> library: cin, cout, and cerr. You have, however, only scratched the surface of what you can do even with the standard streams. To write C++ programs that move beyond the simple examples you have seen up to now, you will have to learn more about streams and how to use them to create more sophisticated applications.

This chapter begins by giving you more insight into the features provided by the << and >> operators. It then moves on to introduce the notion of data files and shows you how to implement file-processing applications. The chapter then concludes by exploring the structure of the C++ stream classes as a representative example of inheritance hierarchies in an object-oriented language.

4.1 Formatted output

The easiest way to generate formatted output in C++ is to use the << operator. This operator is called the insertion operator because it has the effect of inserting data into a stream. The operand on the left is the output stream; the operand on the right is the data you want to insert into that stream. The << operator is overloaded so that the operand on the right can be a string or any primitive value.
If this operand is not a string, the << operator converts it to string form before sending it to the output stream. This feature makes it easy to display the values of variables, because C++ handles the output conversion automatically.

C++ makes generating output even more convenient by having the << operator return the value of the stream. This design decision makes it possible to chain several output operations together, as you have already seen in several examples in this text. Suppose, for example, that you want to display the value of the variable total on an output line that begins with some text telling the user what that value represents. In C++, you would start with the expression

    cout << "The total is "

which copies the characters in the string "The total is " to the cout stream. To insert the decimal representation of the value in total, all you need to do is chain on another instance of the << operator like this:

    cout << "The total is " << total

This expression has the intended effect because the << operator returns the stream. Thus, the left operand of the second << is simply cout, which means that the value of total is displayed there. Finally, you can signal the end of an output line by inserting the value endl using yet another instance of the << operator:

    cout << "The total is " << total << endl;

If total contains the value 42, the resulting output would look like this on the console:

    Total
    The total is 42

Even though you have been using statements like this one since the very beginning, knowing that the << operator propagates the value of cout through the expression as it moves along the chain of insertion operators may help you to appreciate how output works in C++.
Although it seems as if it might be a simple string constant, the endl value used to signal the end of an output line is actually an example of something that C++ calls a manipulator, which is just a fancy name for a special type of value used to control formatting. The C++ libraries export a variety of manipulators that you can use to specify the format for output values, the most common of which appear in Table 4-1. For the most part, these manipulators are automatically available when you include the <iostream> library. The only exceptions are the manipulators that take parameters, such as setw(n), setprecision(digits), and setfill(ch). To use these manipulators, you need to include <iomanip> as well.

Manipulators typically have the effect of setting properties of the output stream in a way that changes the formatting of subsequent output. As the individual entries in Table 4-1 make clear, some manipulators are transient, which means that they affect only the next data value that appears. Most, however, are persistent, which means that they take effect for that stream until they are explicitly changed.

One of the most common applications of manipulators involves specifying a field width to support tabular output. Suppose, for example, that you want to rewrite the PowersOfTwo program from Chapter 1 so that the numbers in the table are aligned in columns. To do so, all you would need to do is add the appropriate manipulators in the output statement, which will look something like this:

    cout << right << setw(2) << i << setw(8) << raiseToPower(2, i) << endl;

TABLE 4-1  Output manipulators

endl
    Inserts an end-of-line sequence into the output stream and ensures that the characters in the output are written to the destination stream.
setw(n)
    Sets the width of the next field to n characters. If the value requires fewer characters, extra space is added to fill the field. This property is transient, which means that it affects only the next value inserted into the stream.
setprecision(digits)
    Sets the precision for the stream to digits. The interpretation of the precision specification depends on other stream settings. If you have set the mode to fixed or scientific, digits specifies the number of digits after the decimal point. If neither of these modes is set, digits indicates the number of significant digits, irrespective of where those digits appear. This property is persistent, which means that it remains in effect for the stream until it is explicitly changed.
setfill(ch)
    Sets the fill character for the stream to ch. By default, spaces are added to the output if additional characters are necessary to fill the field width established by setw. Calling setfill makes it possible to change that character. For example, calling setfill('0') means that fields will be filled with zeroes. This property is persistent.
left
    Specifies that fields should be aligned on the left, which means that any fill characters are inserted after the value. This property is persistent.
right
    Specifies that fields should be aligned on the right, which means that any fill characters are inserted before the value. This property is persistent.
fixed
    Specifies that subsequent floating-point output should be displayed in full, without using scientific notation. By default, floating-point values are displayed in the most compact form. This property is persistent.
scientific
    Specifies that subsequent floating-point output should always appear in scientific notation. This property is persistent.
showpoint
noshowpoint
    These manipulators control whether a decimal point should appear in floating-point numbers, even if the value is equal to an integer. You can use showpoint to force inclusion of the decimal point and later restore the default behavior with noshowpoint. This property is persistent.
showpos
noshowpos
    These manipulators control whether a plus sign is printed before positive values. By default, positive values are printed without a sign. This property is persistent.
uppercase
nouppercase
    These manipulators control the case of any letters generated as part of numeric conversion, such as the E in scientific notation. By default, these characters appear in lowercase. This property is persistent.
boolalpha
noboolalpha
    These manipulators control the format of bool values, which are ordinarily displayed (for historical reasons) using their internal numeric representation. Using the boolalpha manipulator causes them to appear instead as true and false. This property is persistent.

The output statement shown before Table 4-1 prints the value of i in a field of width 2 and the value of the function raiseToPower(2, i) in a field of width 8. Both fields are justified on the right because the effect of the right manipulator is persistent. If you used this line to display the powers of two between 0 and 16, the output would look like this:

    PowersOfTwo
    This program lists powers of two.
    Enter exponent limit: 16
     0       1
     1       2
     2       4
     3       8
     4      16
     5      32
     6      64
     7     128
     8     256
     9     512
    10    1024
    11    2048
    12    4096
    13    8192
    14   16384
    15   32768
    16   65536

Understanding the use of the setprecision(digits) manipulator is complicated by the fact that the interpretation of the argument depends on other mode settings for the stream. In the absence of any specifications to the contrary, C++ represents floating-point numbers using either decimal or scientific notation, choosing the representation that is more compact. The fact that C++ can choose either of these representations makes sense if all you care about is seeing the value. If, however, you want to control the output more precisely, you need to indicate which of these formats you would like C++ to use. The fixed manipulator specifies that floating-point values should always appear as a string of digits with a decimal point in the appropriate position.
Conversely, the scientific manipulator specifies that values should always use the programming form of scientific notation in which the exponent is separated from the value by the letter E. Each of these formats interprets the setprecision manipulator in a slightly different way, which makes it harder to provide a concise description of how setprecision works. As is often the case in programming, one of the best ways to understand how some highly detailed aspect of a library works is to write simple test programs that allow you to see what happens on the screen. The PrecisionExample program in Figure 4-1 shows how three constants (the mathematical constant π, the speed of light in meters per second, and the fine-structure constant that characterizes the strength of electrical interaction) appear using different floating-point modes and precision. The output of the program is shown in Figure 4-2.

Figure 4-1. Program to explore the behavior of setprecision

    /*
     * File: PrecisionExample.cpp
     * --------------------------
     * This program demonstrates various options for floating-point output
     * by displaying three different constants (pi, the speed of light in
     * meters/second, and the fine-structure constant). These constants
     * are chosen because they illustrate a range of exponent scales.
     */

    #include <iostream>
    #include <iomanip>
    using namespace std;

    /* Constants */

    const double PI = 3.14159265358979323846;
    const double SPEED_OF_LIGHT = 2.99792458E+8;
    const double FINE_STRUCTURE = 7.2573525E-3;

    /* Function prototypes */

    void printPrecisionTable();

    /* Main program */

    int main() {
        cout << uppercase << right;
        cout << "Default format:" << endl << endl;
        printPrecisionTable();
        cout << endl << "Fixed format:" << fixed << endl << endl;
        printPrecisionTable();
        cout << endl << "Scientific format:" << scientific << endl << endl;
        printPrecisionTable();
        return 0;
    }

    /*
     * Function: printPrecisionTable
     * -----------------------------
     * Generates a simple precision table for the current cout settings.
     */

    void printPrecisionTable() {
        cout << " prec |      pi      |  speed of light  | fine-structure" << endl;
        cout << "------+--------------+------------------+----------------" << endl;
        for (int prec = 0; prec <= 6; prec += 2) {
            cout << setw(4) << prec << " |";
            cout << " " << setw(12) << setprecision(prec) << PI << " |";
            cout << " " << setw(16) << setprecision(prec) << SPEED_OF_LIGHT << " |";
            cout << " " << setw(14) << setprecision(prec) << FINE_STRUCTURE << endl;
        }
    }

Figure 4-2. Sample run illustrating floating-point output

    PrecisionExample
    Default format:

     prec |      pi      |  speed of light  | fine-structure
    ------+--------------+------------------+----------------
       0 |            3 |            3E+08 |          0.007
       2 |          3.1 |            3E+08 |         0.0073
       4 |        3.142 |        2.998E+08 |       0.007257
       6 |      3.14159 |      2.99792E+08 |     0.00725735

    Fixed format:

     prec |      pi      |  speed of light  | fine-structure
    ------+--------------+------------------+----------------
       0 |            3 |        299792458 |              0
       2 |         3.14 |     299792458.00 |           0.01
       4 |       3.1416 |   299792458.0000 |         0.0073
       6 |     3.141593 | 299792458.000000 |       0.007257

    Scientific format:

     prec |      pi      |  speed of light  | fine-structure
    ------+--------------+------------------+----------------
       0 |        3E+00 |            3E+08 |          7E-03
       2 |     3.14E+00 |         3.00E+08 |       7.26E-03
       4 |   3.1416E+00 |       2.9979E+08 |     7.2574E-03
       6 | 3.141593E+00 |     2.997925E+08 |   7.257352E-03

Although the mechanisms for formatted input and output (which computer scientists often abbreviate as I/O) in any programming language can be quite useful, they also tend to be detail-ridden. In general, it makes sense to learn how to perform a few of the most common formatting tasks but to look up the details of other, less common operations only when you need to use them.

4.2 Formatted input

Formatted input in C++ is enabled by the operator >>, which you have already used in a variety of programs. This operator is called the extraction operator, because it is used to extract formatted data from an input stream. Up to now, you have used the >> operator to request input values from the console, typically using a sequence of statements such as the following lines from the PowersOfTwo program from Chapter 1:

    int limit;
    cout << "Enter exponent limit: ";
    cin >> limit;

By default, the >> operator skips over whitespace characters before trying to read the input data. If necessary, you can change this behavior by using the skipws and noskipws manipulators, which appear in the list of input manipulators in Table 4-2. For example, if you execute the lines

    char ch;
    cout << "Enter a single character: ";
    cin >> noskipws >> ch;

it would be possible to enter a space or tab character in response to the prompt. Had you left out the noskipws manipulator, the program would have skipped over the whitespace characters before storing the next input character in ch.

Although the extraction operator makes it easy to write simple test programs that read input data from the console, it is not widely used in practice. The primary problem with the >> operator is that it offers little support for checking whether user input is valid. Users are notoriously sloppy when entering data into a computer. They make typographical errors or, worse still, fail to understand exactly what input the program requires.
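The effect of noskipws described above can be checked without typing at the console. The following is only a minimal sketch under stated assumptions: it reads from an istringstream (a string-backed stream, introduced later in this chapter) as a stand-in for cin, and the helper name firstChar is hypothetical rather than part of any library.

```cpp
#include <ios>
#include <sstream>
#include <string>
using namespace std;

// Hypothetical helper (not from the text): returns the first character
// that >> extracts from text, with whitespace skipping on or off.
char firstChar(const string & text, bool skipWhitespace) {
    istringstream input(text);                 // stand-in for cin
    if (!skipWhitespace) input >> noskipws;    // persistent until skipws is used
    char ch = '\0';
    input >> ch;
    return ch;
}
```

With the input "  x", calling firstChar with skipping enabled should yield 'x', whereas the noskipws version should yield the leading space.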
Well-designed programs check the user’s input to make sure that it is well-formed and that it makes sense in the context. Unfortunately, the >> operator is simply not up to this task, which is the reason behind the existence of the simpio.h library introduced at the end of Chapter 1. If you use the facilities provided by this library, the code to read a value from the user shrinks from three lines to one:

    int limit = getInteger("Enter exponent limit: ");

The real advantage, however, is that getInteger implements the necessary error checking internally, making it much safer to use as well. You’ll have a chance to see how getInteger is implemented later in this chapter, after which it won’t seem so mysterious.

Table 4-2. Input manipulators

skipws, noskipws
These manipulators control whether the extraction operator skips over whitespace characters before reading a value. If you specify noskipws, the extraction operator treats all characters (including whitespace characters) as part of the input field. You can later use skipws to restore the default behavior. This property is persistent.

ws
Reads characters from the input stream until some character appears that is not in the whitespace category. The effect of this manipulator is therefore to skip over any spaces, tabs, and newlines in the input. Unlike skipws and noskipws, which change the behavior of the stream for subsequent input operations, the ws manipulator takes effect immediately.

4.3 Data files

Whenever you want to store information on the computer for longer than the running time of a program, the usual approach is to collect the data into a logically cohesive whole and store it on a permanent storage medium as a file. Ordinarily, a file is stored using magnetic or optical media, such as the hard disk installed inside your computer or a portable flash drive or memory stick.
The specific details of the medium, however, are not critical; the important point is that the permanent data objects you store on the computer (documents, games, executable programs, source code, and the like) are all stored in the form of files.

On most systems, files come in a variety of types. For example, in the programming domain, you work with source files, object files, and executable files, each of which has a distinct representation. When you use a file to store data for use by a program, that file usually consists of text and is therefore called a text file. You can think of a text file as a sequence of characters stored in a permanent medium and identified by a file name. The name of the file and the characters it contains have the same relationship as the name of a variable and its contents.

As an example, the following text file contains the first stanza of Lewis Carroll’s nonsense poem “Jabberwocky,” which appears in Through the Looking Glass:

    Jabberwocky.txt

    'Twas brillig, and the slithy toves
    Did gyre and gimble in the wabe;
    All mimsy were the borogoves,
    And the mome raths outgrabe.

The name of the file is Jabberwocky.txt, and the contents of the file consist of the four lines of characters that comprise the first stanza of the poem.

When you look at a file, it is often convenient to regard it as a two-dimensional structure: a sequence of lines composed of individual characters. Internally, however, text files are represented as a one-dimensional sequence of characters. In addition to the printing characters you can see, files also contain a newline character that marks the end of each line.

In many respects, text files are similar to strings. Each consists of an ordered collection of characters with a specified endpoint. On the other hand, strings and files differ in several important respects. The most important difference is the permanence of the data.
A string is stored temporarily in the computer’s memory during the time that a program runs; a file is stored permanently on a long-term storage device until it is explicitly deleted. There is also a difference in how you refer to individual characters in strings and files. Because each character in a string has an index, you can process those characters in any order simply by choosing the appropriate sequence of index values. By contrast, characters in a text file tend to be processed sequentially. Programs that work with file data typically start at the beginning of the file and then work their way to the end of the file, either reading existing characters from an input file or writing new ones to an output file.

Using file streams

As you will discover in section 4.4, the C++ stream library exports several classes that form a hierarchical structure. To help you make sense of that structure as a whole, it is useful to start with two stream classes, ifstream and ofstream, exported by the <fstream> library. Once you are familiar with those examples, it will be easier to generalize from that experience to understand the stream hierarchy as a whole.

The most common methods that apply to file streams appear in Table 4-3. Studying this table, however, is not likely to be as helpful to you as learning a few simple patterns for working with files. File processing in any language tends to be idiomatic in the sense that you need to learn a general strategy and then apply that strategy in the applications you write. C++ is no exception to this rule.

Reading or writing a file in C++ requires the following steps:

1. Declare a stream variable to refer to the file. Programs that work with files typically declare a stream variable for each file that is simultaneously active. Thus, if you are writing a program that reads an input file and uses that data to write an output file, you need to declare two variables, as follows:

    ifstream infile;
    ofstream outfile;

2. Open the file. Before you can use a stream variable, you need to establish an association between that variable and an actual file. This operation is called opening the file and is performed by calling the stream method open. For example, if you want to read the text contained in the Jabberwocky.txt file, you open the file by executing the method call

    infile.open("Jabberwocky.txt");

Because the stream libraries predate the introduction of the string class, the open method expects a C-style string as the file name. A string literal is therefore acceptable as is. If, however, the name of the file is stored in a string variable named filename, you will need to open the file like this:

    infile.open(filename.c_str());

If the requested file is missing, the stream will record that error and let you check for it by calling the predicate method fail. It is your responsibility as a programmer to recover from such failures, and you will see various strategies for doing so later in this chapter.

3. Transfer the data. Once you have opened the data files, you then use the appropriate stream operations to perform the actual I/O operations. Depending on the application, you can choose any of several strategies to transfer the file data. At the simplest level, you can read or write files character by character. In some cases, however, it is more convenient to process files line by line. At a still higher level, you can choose to read and write formatted data, which allows you to intermix numeric data with strings and other data types. The details of each of these strategies appear in the sections that follow.

4. Close the file. When you have finished all data transfers, you need to indicate that fact to the file system by calling the stream method close, as follows:

    infile.close();

This operation, which is called closing the file, breaks the association between a stream and the actual file.

Table 4-3. Useful methods in the stream classes

Methods supported by all streams

stream.fail()
Returns true if the stream is in a failure state. This condition usually occurs when you try to read data past the end of the file, but may also indicate an integrity error in the data.

stream.eof()
Returns true if the stream is positioned at the end of the file. Given the semantics of the C++ stream library, the eof method is useful only after a call to fail. At that point, calling eof allows you to test whether the failure indication was caused by the end of file or some other data error.

stream.clear()
Resets the status bits associated with the stream. You need to call this method whenever you need to reuse a stream after a failure has occurred.

Methods supported by all file streams

stream.open(filename)
Attempts to open the named file and attach it to the stream. The direction of file transfer is determined by the stream type: input streams are opened for input, output streams are opened for output. The filename parameter is a C-style string, which means that you will need to call c_str on any C++ string. You can check whether the open method fails by calling the fail method.

stream.close()
Closes the file attached to the stream.

Methods supported by all input streams

stream >> variable
Reads formatted data into a variable. The data format is controlled by the variable type and whatever input manipulators are in effect.

stream.get()
Returns the next character in the stream. The return value is an integer, which makes it possible to identify the end-of-file character, which is represented by the constant EOF.

stream.unget()
Backs up the internal pointer of the stream so that the last character read will be read again by the next call to get.

getline(stream, str)
Reads the next line of input from stream into the string str.

Methods supported by all output streams

stream << expression
Writes formatted data to an output stream. The data format is controlled by the expression type and whatever output manipulators are in effect.

stream.put(ch)
Writes the character ch to the output stream.

Single character I/O

In many applications, the best way to process the data in a text file is to go through the contents of the file one character at a time. Input streams in the C++ library support reading a single character using a method called get, which returns the next character from the stream.

Although the idea of the get method is simple enough, there is one confusing aspect in its design. If you look at the prototype for get, you’ll discover that it looks like this:

    int get();

At first glance, the result type seems odd. The prototype indicates that get returns an int, even though it would seem more appropriate for the method to return a char. The reason for this design decision is that returning a character would make it harder for a program to detect the end-of-file mark. There are only 256 possible character codes, and a data file might contain any of those values. There is no value (or at least no value of type char) that you could use as a sentinel to indicate the end-of-file condition. By extending the definition so that get returns an integer, the implementation can return a value outside the range of legal character codes to indicate the end-of-file condition. That value has the symbolic name of EOF.

For output streams, the stream method put takes a char value as its argument and writes that character to the stream. A typical call to the put method therefore looks like this:

    outfile.put(ch);

As an example of the use of get and put, the ShowFileContents program in Figure 4-3 displays the contents of a text file on the console.
Assuming that the Jabberwocky.txt file exists, a sample run of this program might look like this:

    ShowFileContents
    Input file: Jabberwocky.txt
    'Twas brillig, and the slithy toves
    Did gyre and gimble in the wabe;
    All mimsy were the borogoves,
    And the mome raths outgrabe.

This implementation uses the read-until-sentinel pattern that was introduced in the AddList program from Chapter 1. The body of the while loop reads the next character into the integer variable ch and then exits the loop if ch is the end-of-file sentinel; if it is not, the program writes that character to the output file stream and loops back to read the next character.

Many C++ programmers, mostly because the style reflects a common coding convention that dates back to the C programming language, will implement this loop in the following slightly shorter but decidedly more cryptic form:

    int ch;
    while ((ch = infile.get()) != EOF) {
        cout.put(ch);
    }

In this style, the test expression for the while loop uses embedded assignment to combine the operations of reading in a character and testing for the end-of-file condition. When C++ evaluates this test, it begins by evaluating the subexpression

    ch = infile.get()

which reads a character and assigns it to ch. Before executing the loop body, the program then goes on to make sure the result of the assignment is not EOF. The parentheses around the assignment are required; without them, the expression would incorrectly assign to ch the result of comparing the character against EOF.

The code in Figure 4-3 also includes the function promptUserForFile, which asks the user to enter a file name and then opens that file for input. If the file does not exist or cannot be opened for some reason, promptUserForFile asks the user for a new file name, continuing that process until the open call succeeds. This design allows the program to recover gracefully if the user enters an invalid file name.
For example, if the user forgot to include the .txt extension the first time around, the first few lines of the console output would appear as shown in the sample run that follows the figure.

Figure 4-3. Program to show the contents of a file

    /*
     * File: ShowFileContents.cpp
     * --------------------------
     * This program displays the contents of a file chosen by the user.
     */

    #include <iostream>
    #include <fstream>
    #include <string>
    using namespace std;

    /* Function prototypes */

    string promptUserForFile(ifstream & infile, string prompt = "");

    /* Main program */

    int main() {
        ifstream infile;
        promptUserForFile(infile, "Input file: ");
        while (true) {
            int ch = infile.get();
            if (ch == EOF) break;
            cout.put(ch);
        }
        infile.close();
        return 0;
    }

    /*
     * Function: promptUserForFile
     * Usage: string filename = promptUserForFile(infile, prompt);
     * ------------------------------------------------------------
     * Asks the user for the name of an input file and opens the reference
     * parameter infile using that name, which is returned as the result of
     * the function. If the requested file does not exist, the user is
     * given additional chances to enter a valid file name. The optional
     * prompt argument is used to give the user more information about the
     * desired input file.
     */

    string promptUserForFile(ifstream & infile, string prompt) {
        while (true) {
            cout << prompt;
            string filename;
            getline(cin, filename);
            infile.open(filename.c_str());
            if (!infile.fail()) return filename;
            infile.clear();
            cout << "Unable to open that file. Try again." << endl;
            if (prompt == "") prompt = "Input file: ";
        }
    }

Sample run:

    ShowFileContents
    Input file: jabberwocky
    Unable to open that file. Try again.
    Input file: jabberwocky.txt

Although the logic behind the implementation of promptUserForFile is easy to follow, there are a few important details that are worth mentioning.
The open call, for example, needs to use c_str to convert the C++ string stored in filename to the old-style C string that the stream library requires. Similarly, the call to clear inside the while loop is necessary to ensure that the failure status indicator in the stream is reset before the user enters a new file name.

When you are reading character data from an input file, you will sometimes find yourself in the position of not knowing that you should stop reading characters until you have already read more than you need. Consider, for example, what happens when the C++ extraction operator tries to read an integer, which is represented as a string of decimal digits. The library implementation cannot know that the number is finished until it reads a character that is not a digit. That character, however, may be part of some subsequent input, and it is essential not to lose that information.

C++ solves this problem for input streams by exporting a method called unget that has the following form:

    infile.unget();

The effect of this call is to “push” the most recent character back into the input stream so that it is returned on the next call to get. The specification for the C++ library guarantees that it will always be possible to push one character back into the input file, but you should not rely on being able to read several characters ahead and then push them all back. Fortunately, being able to push back one character is sufficient in the vast majority of cases.

Line-oriented I/O

Because files are usually subdivided into individual lines, it is often useful to read an entire line of data at a time. The stream function that performs this operation is called getline, which is not the same as the getLine function from simpio.h even though they serve a similar purpose.
The getline function, which is defined as a free function rather than a method, takes two reference parameters: the input stream from which the line is read and a string variable into which the result is written. Calling

    getline(infile, str);

copies the next line of the file into the variable str, up to but not including the newline character that signals the end of the line.

The getline function makes it possible to rewrite the ShowFileContents program so that the program reads the file one line at a time. Using this approach, the while loop in the main program looks like this:

    while (true) {
        string line;
        getline(infile, line);
        if (infile.fail()) break;
        cout << line << endl;
    }

The body of this loop uses getline to read the next line of data from the file and then calls the fail method to check whether the input stream has reached the end of the file. If so, the break statement exits from the loop, following the structure of the read-until-sentinel pattern introduced in Chapter 1. If not, the code uses the << operator to send the contents of the line to cout, followed by a newline character.

Formatted I/O

In addition to the character-by-character and line-by-line approaches to processing files, it is also possible to use the << and >> operators on file streams, just as you have already had occasion to do with the console streams. Suppose, for example, that you want to revise the AddIntegerList program from Figure 1-4 so that it takes its input from a data file instead of from the console. The simplest approach is to open a data file for input and use the resulting ifstream instead of cin to read the input values.

The only other change you need to make is in the code that exits the loop when the input is complete. The console version of the program used a sentinel value to indicate the end of the input. For the version that reads from a file, the loop should continue until there are no more data values to read.
The usual way to test for that condition is to try to read a value and then to call fail to check whether that operation succeeds. If you make these changes to the original code, the program looks like this:

    int main() {
        ifstream infile;
        promptUserForFile(infile, "Input file: ");
        int total = 0;
        while (true) {
            int value;
            infile >> value;
            if (infile.fail()) break;
            total += value;
        }
        infile.close();
        cout << "The sum is " << total << endl;
        return 0;
    }

Unfortunately, this implementation strategy is less than ideal, even if it is not technically incorrect. If all the numbers are formatted in exactly the right way, the program will get the right answer. If, however, there are extraneous characters in the file, the loop will exit before all the input values have been read. Worse still, the program will give no indication that an error has occurred.

The crux of the problem is that the extraction operator in the line

    infile >> value;

will set the failure indicator in either of two cases:

1. Reaching the end of the file, at which point there are no more values to read.
2. Trying to read data from the file that cannot be converted to an integer.

You can make things better by checking to make sure the end of the file has been reached when the loop exits. For example, you could at least let the user know that an error had occurred by adding the following lines after the while loop:

    if (!infile.eof()) {
        error("Data error in file");
    }

The error doesn’t provide the user with much guidance as to the source of the data error, but at least it’s better than nothing.

Another problem, however, is that the extraction operator is overly permissive in terms of the formats it allows. Unless you specify otherwise, the >> operator will accept any sequence of whitespace characters as data separators. Thus, the data in the input file need not be one value per line, but can be formatted in any of a number of ways.
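The permissiveness of >> with respect to separators can be seen in a small sketch. This is an illustration under stated assumptions rather than code from the text: it reads from an istringstream (covered later in this chapter) instead of a file, and the function name sumIntegers is hypothetical.

```cpp
#include <sstream>
#include <string>
using namespace std;

// Hypothetical helper: adds up every integer that >> can extract from text.
// Any run of spaces, tabs, or newlines separates one value from the next.
int sumIntegers(const string & text) {
    istringstream input(text);
    int total = 0;
    while (true) {
        int value;
        input >> value;
        if (input.fail()) break;   // end of data, or text that is not an integer
        total += value;
    }
    return total;
}
```

Under these assumptions, sumIntegers("1\n2\n3\n4\n5\n") and sumIntegers("1 2 3 4 5") should produce the same total, because the extraction operator does not distinguish a newline from a space.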
For example, if you wanted to use your application to add the first five integers, it would not be necessary to enter the data with one value per line, as illustrated by the following data file:

    1
    2
    3
    4
    5

It would work just as well, and might indeed be more convenient, to put the values on a single line like this:

    1 2 3 4 5

One problem with having this much flexibility is that it becomes harder to detect certain kinds of formatting errors. What happens, for example, if you accidentally include a space in one of the integers? As things stand, the SumIntegerFile application would simply read the digits before and after the space as two separate values. In such cases, it is often better to insist on more rigid formatting rules to improve data integrity.

In the SumIntegerFile application, it probably makes sense to insist that the data values appear one per line, as in the first sample file. Unfortunately, enforcing that restriction is difficult if the only tools you have are file streams and the extraction operator. One way to start would be to read the data file one line at a time and then convert each line to an integer before adding it to the total. If you were to adopt this approach, the main program would look like this:

    int main() {
        ifstream infile;
        promptUserForFile(infile, "Input file: ");
        int total = 0;
        while (true) {
            string line;
            getline(infile, line);
            if (infile.fail()) break;
            total += stringToInteger(line);
        }
        infile.close();
        cout << "The sum is " << total << endl;
        return 0;
    }

The only thing that’s missing is the implementation of stringToInteger.

Although the C++ libraries include a function called atoi that converts a string to an integer, that function predates the <string> library and therefore requires the use of C strings, which are much less convenient to use. It would be great if you could find a way to implement that conversion while remaining entirely in the C++ domain.
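For comparison, here is a minimal sketch of what the atoi route looks like. atoi is the standard C function declared in <cstdlib>; the wrapper name cStyleToInteger is hypothetical. Besides requiring c_str, atoi offers no error indication: text with no leading number simply yields 0, and trailing characters after the digits are ignored.

```cpp
#include <cstdlib>
#include <string>
using namespace std;

// Hypothetical wrapper around the C library function atoi.
int cStyleToInteger(const string & str) {
    return atoi(str.c_str());   // atoi accepts only a C-style string
}
```

A caller therefore cannot tell the legitimate input "0" apart from the malformed input "abc", which is one more reason to look for a stream-based conversion.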
You know that the C++ libraries must contain the necessary code, because the >> operator has to perform that conversion when it reads an integer from a file. If there were a way to use that same code to read an integer from a string, the implementation of stringToInteger would follow immediately as a result. As you will learn in the following section, the C++ stream libraries provide precisely that capability, which will turn out to be useful in a wide variety of applications.

String streams

Given that files and strings are both sequences of characters, it seems reasonable to think that programming languages might allow you to treat them symmetrically. C++ provides that capability through the <sstream> library, which exports several classes that allow you to associate a stream with a string value in much the same way that the <fstream> library allows you to associate a stream with a file. The istringstream class is the counterpart of ifstream and makes it possible to use stream operators to read data from a string. For output, the ostringstream class works very much like ofstream except that the output is directed to a string rather than a file.

The existence of the istringstream class makes it possible to implement the stringToInteger function described in the last section as follows:

    int stringToInteger(string str) {
        istringstream stream(str);
        int value;
        stream >> value >> ws;
        if (stream.fail() || !stream.eof()) {
            error("stringToInteger: Illegal integer format");
        }
        return value;
    }

The first line of the function introduces an important feature of variable declarations that you have not yet seen. If you are declaring an object, C++ allows you to supply arguments after the variable name that control how that object is initialized. Here, the line

    istringstream stream(str);

declares a variable named stream and initializes it to an istringstream object that is already set up to read data from the string variable str.
The next two lines int value; stream >> value >> ws; read an integer value from that stream and store it in the variable value. In this implementation, whitespace characters are allowed either before or after the value. The first >> operator automatically skips over any whitespace characters that appear before the value; the ws manipulator at the end of the line reads any whitespace characters that follow the value, thereby ensuring that the stream will be positioned correctly at the end of the line if the input is correctly formatted. The lines if (stream.fail() || !stream.eof()) { error("stringToInteger: Illegal integer format"); } then check to make sure that the input is valid. If the string cannot be parsed as an integer, stream.fail() will return true, thereby triggering the error message. If, however, the string begins with a number but then contains some additional characters, stream.eof() will be false, which also triggers the error message. If you need to convert in the other direction, you can use the ostringstream class. The following function, for example, converts an integer into a string of decimal digits: string integerToString(int n) { ostringstream stream; stream << n; return stream.str(); } The << operator in the second line converts the value into its decimal representation just as it would for a file. Here, however, the output is directed to a string value stored internally as part of the ostringstream object. The str function in the return statement copies the value of that internal string so that it can be returned to the caller. It is interesting to note that converting in this direction is substantially easier because it is no longer necessary to take account of formatting errors. A better strategy for console input String streams also offer a solution to the problem of checking whether user input is properly formed. 
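One way to picture that check is as a predicate on a single line of text. The sketch below is hypothetical (the name isIntegerLine does not appear in the text) and makes one deliberate substitution: instead of chaining the ws manipulator, it detects leftover input by trying to extract one more non-whitespace character, which is expected to fail when the line is well formed.

```cpp
#include <sstream>
#include <string>
using namespace std;

// Hypothetical predicate: true if line holds exactly one integer,
// optionally surrounded by whitespace.
bool isIntegerLine(const string & line) {
    istringstream stream(line);
    int value;
    stream >> value;                  // skips leading whitespace, then reads digits
    if (stream.fail()) return false;  // no integer at the start of the line
    char extra;
    stream >> extra;                  // succeeds only if a non-whitespace character remains
    return stream.fail();             // failure here means nothing but whitespace was left
}
```

Given this sketch, isIntegerLine("  16  ") should be true, while isIntegerLine("1t") and isIntegerLine("") should both be false.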
As discussed in the section on “Formatted input” earlier in the chapter, the >> operator does not check the user’s input for errors. Consider, for example, what happens if you ask the user for an integer value using the following statements from the PowersOfTwo program:

    int limit;
    cout << "Enter exponent limit: ";
    cin >> limit;

If the user enters a valid integer, everything is fine. But what happens if the user tries to enter the value 16 but slips down one row on the keyboard and types the letter t instead of the digit 6? In an ideal world, the program would look at the input 1t and complain about its validity. Unfortunately, if you use the extraction operator in C++, that error will go undetected. When it is asked to read a value into the integer variable limit, the >> operator reads characters until it finds one that is illegal in an integer field. Input therefore stops on the t, but the value 1 is still a legal integer, so the program would keep right on going with limit equal to 1.

The most effective way to ensure that user input is valid is to read an entire line as a string and then convert that string to an integer. This strategy is embodied in the function getInteger shown in Figure 4-4. This function reads an integer from the user just as the >> operator does but also makes sure that the integer is valid.

Figure 4-4. Function to read an integer from the console

    /*
     * Function: getInteger
     * Usage: int n = getInteger(prompt);
     * ----------------------------------
     * Requests an integer value from the user. The function begins by
     * printing the prompt string on the console and then waits for the
     * user to enter a line of input data. If that line contains a
     * single integer, the function returns the corresponding integer
     * value. If the input is not a legal integer or if extraneous
     * characters (other than whitespace) appear on the input line,
     * the implementation gives the user a chance to reenter the value.
     */

    int getInteger(string prompt) {
        int value;
        string line;
        while (true) {
            cout << prompt;
            getline(cin, line);
            istringstream stream(line);
            stream >> value >> ws;
            if (!stream.fail() && stream.eof()) break;
            cout << "Illegal integer format. Try again." << endl;
        }
        return value;
    }

The logic of getInteger is similar to that used in the stringToInteger function from the preceding section. The only major difference is that getInteger gives the user a chance to reenter a corrected value instead of terminating the program with an error message, as illustrated in the following sample run:

    PowersOfTwo
    This program lists powers of two.
    Enter exponent limit: 1t
    Illegal integer format. Try again.
    Enter exponent limit: 16

As this example makes clear, getInteger identifies the typographical error in the first input line and then asks the user for a new value. When the user presses the RETURN key after correctly typing the value 16, getInteger returns this value to the caller.

As you can see from the code in Figure 4-4, the getInteger function takes an optional prompt string as an argument and then displays that string to the user before reading the input line. This design decision follows directly from the fact that the getInteger function needs that information in order to repeat the prompt string if an error occurs.

The getInteger function is a much more reliable tool for reading integer data than the extraction operator. For that reason, it makes sense to use getInteger instead of >> in any application in which checking for errors is important. It therefore makes sense to put getInteger in a library. As you already know from Chapter 1, the Stanford libraries include getInteger as part of the simpio library.

4.4 Class hierarchies

When the designers of C++ undertook the task of modernizing the input/output libraries, they chose to adopt an object-oriented approach.
One implication of that decision is that the data types in the stream libraries are implemented as classes. Classes have a number of advantages over older strategies for representing data. Of these, the most important is that classes provide a framework for encapsulation, which is the process of combining the data representation and the associated operations into a coherent whole that reveals as few details as possible about the underlying structure. The classes you have already seen in this chapter illustrate encapsulation well. When you use these classes, you have no idea how they are implemented. As a client, all you need to know is what methods are available for those classes and how to call them.

Object-oriented programming, however, offers other important advantages besides encapsulation. Classes in an object-oriented language form a hierarchy in which each class automatically acquires the characteristics of the classes that precede it in the hierarchy. This property is called inheritance. Although C++ tends to use inheritance somewhat less frequently than many object-oriented languages do, it is nonetheless one of the features that sets the object-oriented paradigm apart from earlier programming models.

Biological hierarchies

The class structure of an object-oriented language is similar in many ways to the biological classification system developed by the Swedish botanist Carl Linnaeus in the eighteenth century as a way to represent the structure of the biological world. In Linnaeus’s conception, living things are first subdivided into kingdoms. The original system contained only the plant and animal kingdoms, but there are some forms of life—such as fungi and bacteria—that don’t fit well in either category and now have kingdoms of their own. Each kingdom is then further broken down into the hierarchical categories of phylum, class, order, family, genus, and species.
Every living species fits at the bottom of this hierarchy but also belongs to some category at each higher level. This biological classification system is illustrated in Figure 4-5, which shows the classification of the common black garden ant, which has the scientific name of Lasius niger, corresponding to its genus and species. This species of ant, however, is also part of the family Formicidae, which is the classification that actually identifies it as an ant. Moving upward in the hierarchy from there, Lasius niger is also of the order Hymenoptera (which also includes bees and wasps), the class Insecta (which consists of the insects), and the phylum Arthropoda (which also includes, for example, shellfish and spiders).

One of the properties that makes this biological classification system useful is that all living things belong to a category at every level in the hierarchy. Each individual life form therefore belongs to several categories simultaneously and inherits the properties that are characteristic of each one. The species Lasius niger, for example, is an ant, an insect, an arthropod, and an animal—all at the same time. Moreover, each individual ant shares the properties that it inherits from each of those categories. One of the defining characteristics of the class Insecta is that insects have six legs. All ants must therefore have six legs because ants are members of that class.

FIGURE 4-5 Class hierarchies in the biological world
[Diagram not reproduced. Highlighted chain: Living Things, Animalia, Arthropoda, Insecta, Hymenoptera, Formicidae, Lasius niger.]
This black garden ant is classified according to its genus and species as Lasius niger. It is also a member of every class up the highlighted chain. Every class in the biological hierarchy inherits the characteristics of the classes above it. For example, a black garden ant has six legs because ants are a subclass of the class Insecta, and all insects have six legs.

The biological metaphor also helps to illustrate the distinction between classes and objects. Every common black garden ant has the same classification, but there can be many different ants that all fit that single pattern. Thus, each of the ants is an instance of Lasius niger. In the language of object-oriented programming, Lasius niger is a class and each individual ant is an object.

The stream class hierarchy

The classes in the stream libraries form hierarchies that are in many ways similar to the biological hierarchy introduced in the preceding section. So far, you have seen two types of input streams—ifstream and istringstream—and two types of output streams—ofstream and ostringstream—that in each pairing share a common set of operations. In C++, these classes form the hierarchy shown in Figure 4-6. At the top of the hierarchy is the class ios, which represents a general stream type that can be used for any kind of I/O. The hierarchy is then subdivided into two categories—istream and ostream—that generalize the notions of input stream and output stream, respectively. The C++ file and string stream classes then fall naturally into the appropriate position in this hierarchy, as shown.

Figure 4-6 provides a useful framework for introducing some of the terminology associated with object-oriented programming. In this diagram, which was drawn using the same geometrical structure as the evolutionary diagram in Figure 4-5, each class is a subclass of the class that appears above it in the hierarchy. Thus, istream and ostream are both subclasses of ios. In the opposite direction, ios is called a superclass of both istream and ostream. Similar relationships exist at different levels of this diagram. For example, ifstream is a subclass of istream, and ostream is a superclass of ofstream.
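The subclass and superclass relationships just described can be checked directly by the compiler. The sketch below is illustrative only: it uses std::is_base_of from the standard <type_traits> header, and the helper name isSubclassOf is a hypothetical name introduced here, not something defined by the text or by the Stanford libraries.

```cpp
#include <fstream>
#include <ios>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>

// Hypothetical helper: true when Sub is a proper subclass of Super.
// std::is_base_of<Base, Derived> also reports true for identical class
// types, so that case is excluded explicitly.
template <typename Sub, typename Super>
constexpr bool isSubclassOf() {
    return std::is_base_of<Super, Sub>::value
        && !std::is_same<Sub, Super>::value;
}

// Each assertion restates one relationship from the hierarchy diagram.
static_assert(isSubclassOf<std::istream, std::ios>(),
              "istream is a subclass of ios");
static_assert(isSubclassOf<std::ostream, std::ios>(),
              "ostream is a subclass of ios");
static_assert(isSubclassOf<std::ifstream, std::istream>(),
              "ifstream is a subclass of istream");
static_assert(isSubclassOf<std::ofstream, std::ostream>(),
              "ofstream is a subclass of ostream");
static_assert(!isSubclassOf<std::istream, std::istringstream>(),
              "istream is not a subclass of istringstream");
```

If any of these relationships did not hold, the program would fail to compile, which makes the diagram's claims easy to verify on your own system.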
The relationship between subclasses and superclasses is in many ways best conveyed by the English words is a. Every ifstream object is also an istream and, by continuing up the hierarchy, an ios. As in the biological hierarchy, this relationship implies that the characteristics of any class are inherited by its subclasses. In C++, these characteristics correspond to the methods and other definitions associated with the class. Thus, if the istream class exports a particular method, that method is automatically available to any ifstream or istringstream object. More globally, any methods exported by the ios class are available to every class in the hierarchy shown in Figure 4-6.

FIGURE 4-6 Selected classes in the stream hierarchy

    ios
        istream
            ifstream
            istringstream
        ostream
            ofstream
            ostringstream

Although simple diagrams that show only the relationships among classes are often useful in their own right, it is useful to expand them to include the methods exported at each level, as shown in Figure 4-7. This enhanced diagram adopts parts of a standard methodology for illustrating class hierarchies called the Unified Modeling Language, or UML for short. In UML, each class appears as a rectangular box whose upper portion contains the name of the class. The methods exported by that class appear in the lower portion. In UML diagrams, subclasses use open arrowheads to point to their superclasses.

UML diagrams of the sort shown in Figure 4-7 make it easy to determine what methods are available to each of the classes in the diagram. Because each class inherits the methods of every class in its superclass chain, an object of a particular class can call any method defined in any of those classes.
For example, the diagram indicates that any ifstream object has access to the following methods:

• The open and close methods from the ifstream class itself
• The get and unget methods and the >> operator from the istream class
• The clear, fail, and eof methods from the ios class

FIGURE 4-7 Simplified UML diagram for the stream hierarchy

    ios: clear(), fail(), eof()
        istream: get(), unget(), >>
            ifstream: open(cstr), close()
            istringstream: istringstream(s)
        ostream: put(ch), <<
            ofstream: open(cstr), close()
            ostringstream: str()

Choosing the right level in the stream hierarchy

One of the most important decisions you need to make when you use object hierarchies is finding the right level at which to work. As a general rule, it is best to write your code so that it uses the most general level in the hierarchy that supports the operations you need. Adopting this rule ensures that your code is as flexible as possible in that it supports the widest range of types.

As an example, you will often find it useful to define a copyStream method that copies all the characters from an input stream to an output stream. If you had come up with this idea as you were working with file streams, you might have been tempted to implement the method as follows:

    void copyStream(ifstream & infile, ofstream & outfile) {
        while (true) {
            int ch = infile.get();
            if (ch == EOF) break;
            outfile.put(ch);
        }
    }

Although this implementation is not technically incorrect, it is certainly problematic enough to justify marking it as buggy. The problem is that the method works only for file streams, even though the code would be perfectly appropriate for any type of input and output stream if you had chosen more general types for the arguments.
A much better implementation of copyStream looks like this:

    void copyStream(istream & is, ostream & os) {
        while (true) {
            int ch = is.get();
            if (ch == EOF) break;
            os.put(ch);
        }
    }

The advantage of the new coding is that you can use this version of copyStream with more general types of streams. For example, given the new implementation of copyStream, you could replace the while loop in ShowFileContents with the single line

    copyStream(infile, cout);

which copies the contents of the file to cout. The previous version—in which the second argument is declared to be an ofstream rather than the more general class ostream—fails because cout is not a file stream.

Multiple inheritance

Although inheritance is used in various places in the standard libraries, it is not used as widely in C++ as it tends to be in other object-oriented languages, most notably Java. Thus, while it is essential to understand inheritance in the context of existing library packages such as those that implement streams, this book does not cover the implementation of such hierarchies until Chapter 17. You can get along fine using inheritance when it is offered to you without having to code it on your own.

Part of the reason for deferring the implementation of class hierarchies until close to the end of the book is that inheritance in C++ is more powerful—and consequently more complex—than its counterpart in other languages. One of the features that sets C++ apart from other object-oriented languages in common use today is that classes can inherit behavior from more than one superclass. This property is called multiple inheritance.

Although using multiple inheritance in your own classes is beyond the scope of this book, it is useful to see how it is used in the design of existing library packages. The stream hierarchy uses multiple inheritance to define new classes that are both input and output streams.
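A short sketch may help tie these two ideas together. It repeats the general copyStream from above (with std:: qualifiers and an explicit cast added) and then passes a single stringstream first as the output and then as the input, which is possible only because that class is both an ostream and an istream. The helper copyViaStringStream is a hypothetical name used purely for illustration.

```cpp
#include <cstdio>    // defines EOF
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// The general version from the text: works for any istream/ostream pair.
void copyStream(std::istream & is, std::ostream & os) {
    while (true) {
        int ch = is.get();
        if (ch == EOF) break;
        os.put(static_cast<char>(ch));
    }
}

// Hypothetical helper (not from the text): routes a string through a
// stringstream that serves as the destination of one copy and the
// source of the next.
std::string copyViaStringStream(const std::string & text) {
    std::istringstream source(text);
    std::stringstream middle;
    std::ostringstream sink;
    copyStream(source, middle);   // middle is used as an ostream
    copyStream(middle, sink);     // middle is used as an istream
    return sink.str();
}
```

Neither call would compile against the earlier version of copyStream that insists on ifstream and ofstream arguments.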
Figure 4-8 shows how these classes fit into the UML diagram for the stream hierarchy as a whole. Because there are arrows leading from iostream to both istream and ostream, the iostream class inherits the methods from both of its superclasses, which it then passes on to its subclasses.

As an example of how you might use these bidirectional streams, suppose that you have been assigned the task of writing a roundToSignificantDigits function that takes a floating-point value x and rounds it to some specified number of significant digits. For example, if you call

    roundToSignificantDigits(3.14159265, 5)

the function should return the value of π rounded to 5 significant digits, which is 3.1416. Although it is possible to write this function in other ways, the simplest strategy is to use the facilities already provided by streams. All you need to do is write the value of x to an output stream after setting the precision to the desired number of digits and then read that same value back to convert it to its numeric form. The following implementation does just that using a stringstream—which is both an istream and an ostream—to store the intermediate string:

    double roundToSignificantDigits(double x, int nDigits) {
        stringstream ss;
        ss << setprecision(nDigits) << x;
        ss >> x;
        return x;
    }

FIGURE 4-8 Multiple inheritance in the stream hierarchy

    ios: clear(), fail(), eof()
        istream: get(), unget(), >>
            ifstream: open(cstr), close()
            istringstream: istringstream(s)
        ostream: put(ch), <<
            ofstream: open(cstr), close()
            ostringstream: str()
        iostream (inherits from both istream and ostream)
            fstream: open(cstr), close()
            stringstream: stringstream(s), str()

4.5 Exploring the Stanford libraries

Chapter 3 introduced several new functions that were useful enough to package up as the strlib.h library.
In this chapter, you have seen several useful tools for working with streams that have also made their way into two interfaces in the Stanford library: the simpio.h interface you have already seen and a filelib.h interface for the methods that are more closely related to files.

Although it would be possible to list the contents of each of those interfaces in tables of the sort you have already seen for the standard libraries, most modern interfaces are not described on paper. With the expansion of the web, programmers tend to use online reference materials more often than printed ones. Figure 4-9, for example, shows the online documentation for the simpio.h library, which exports the getInteger, getReal, and getLine functions.

FIGURE 4-9 Online documentation for the simpio.h interface

    The Stanford C++ Libraries
    simpio.h

    This interface exports a set of functions that simplify input/output operations in C++ and provide some error-checking on console input.

    Functions
        getInteger(prompt)   Reads a complete line from cin and tries to scan it as an integer.
        getReal(prompt)      Reads a complete line from cin and tries to scan it as a floating-point number.
        getLine(prompt)      Reads a line of text from cin and returns that line as a string.

    Function detail

    int getInteger(string prompt = "");
        Reads a complete line from cin and scans it as an integer. If the scan succeeds, the integer value is returned. If the argument is not a legal integer or if extraneous characters (other than whitespace) appear in the string, the user is given a chance to reenter the value. If supplied, the optional prompt string is printed before reading the value.
        Usage: int n = getInteger(prompt);

    double getReal(string prompt = "");
        Reads a complete line from cin and scans it as a floating-point number. If the scan succeeds, the floating-point value is returned. If the input is not a legal number or if extraneous characters (other than whitespace) appear in the string, the user is given a chance to reenter the value. If supplied, the optional prompt string is printed before reading the value.
        Usage: double x = getReal(prompt);

    string getLine(string prompt = "");
        Reads a line of text from cin and returns that line as a string. The newline character that terminates the input is not stored as part of the return value. If supplied, the optional prompt string is printed before reading the value.
        Usage: string line = getLine(prompt);

Although the tabular description fits well on the printed page and the web-based documentation is ideal for online browsing, you do have another option for learning what resources are available in an interface: you can read the .h file. If an interface is designed and documented effectively, reading the .h file may well be the best option. In any event, reading .h files is a programming skill that you will need to cultivate if you want to become proficient in C++.

To give yourself some practice in reading interfaces, you should look at the filelib.h interface, which appears in Appendix A. That interface includes the promptUserForFile function developed in this chapter, along with a number of other functions that will likely come in handy when you work with files.

Summary

In this chapter, you have learned how to use the libraries in the stream hierarchy to support input/output operations involving the console, strings, and data files. Important points in this chapter include:

• The <iostream> library exports three standard streams: cin, cout, and cerr.
• The <iomanip> library makes it possible to control the output format. This library exports a set of manipulators, the most important of which appear in Table 4-1. The <iostream> library includes manipulators for input as well, but these are less important in practice.
• The <fstream> library in C++ supports reading and writing data files.
The most important methods that apply to file streams are listed in Table 4-3.
• The C++ stream libraries allow you to choose any of several different strategies when reading a file. You can read the file character by character using the get methods, line by line using the getline method, or as formatted data using the >> extraction operator.
• The <sstream> library makes it possible to use the >> and << operators to read and write string data.
• The classes in the stream library form a hierarchy in which subclasses inherit the behavior of their superclasses. When you design functions to work with streams, it is important to choose the most general level of the hierarchy for which the necessary operations are defined.
• At various points in the first three chapters, the text defines new functions that are likely to prove useful in many different contexts. To make sure that these functions are freely accessible without having to copy the code, those functions are exported through several interfaces that collectively constitute the Stanford C++ libraries.

Review questions

1. What are the three standard file streams defined by the <iostream> library?
2. What are the formal names for the << and >> operators?
3. What value do the << and >> operators return? Why is this value important?
4. What is a manipulator?
5. What is the difference between a transient and a persistent property?
6. In your own words, describe how the fixed and scientific manipulators change the format for floating-point output. What happens if you don’t specify either of these options?
7. Suppose that the constant PI has been defined as

    const double PI = 3.14159265358979323846;

   What output manipulators would you use to produce each line of the following sample run:

    PrecisionExercise
    3.141592653589793
    3.141593
    3.141592653589793e+00
    3.141593E+00
    3.141592654
    003.1416

8. What is the purpose of the types ifstream and ofstream?
9. The argument to open must be a C-style string. How does this requirement affect the code you write to open a file?
10. How can you determine if an open operation on a stream was successful?
11. When you are using the get method to read a file character by character, how do you detect the end of a file?
12. Why is the return type of get declared as int instead of char?
13. What is the purpose of the unget method?
14. When you are using the getline method to read a file line by line, how do you detect the end of a file?
15. What classes does the <sstream> library support? How are these classes different from the ones provided in <fstream>?
16. What is meant by the following terms: subclass, superclass, and inheritance?
17. True or false: The stream class hierarchy of Figure 4-8 shows that istream is a subclass of istringstream.
18. Why does the copyStream function take arguments of type istream and ostream instead of ifstream and ofstream?
19. What are the advantages of using the getInteger and getReal functions from simpio.h over using the >> extraction operator?
20. If this text does not describe the functions exported by a library in tabular form, what options do you have for learning how to use that library?

Exercises

1. The <iomanip> library gives programmers more control over output format, which makes it easy, for example, to create formatted tables.
Write a program that displays a table of trigonometric sines and cosines that looks like this:

    TrigTable
     theta | sin(theta) | cos(theta) |
    -------+------------+------------+
       -90 | -1.0000000 |  0.0000000 |
       -75 | -0.9659258 |  0.2588190 |
       -60 | -0.8660254 |  0.5000000 |
       -45 | -0.7071068 |  0.7071068 |
       -30 | -0.5000000 |  0.8660254 |
       -15 | -0.2588190 |  0.9659258 |
         0 |  0.0000000 |  1.0000000 |
        15 |  0.2588190 |  0.9659258 |
        30 |  0.5000000 |  0.8660254 |
        45 |  0.7071068 |  0.7071068 |
        60 |  0.8660254 |  0.5000000 |
        75 |  0.9659258 |  0.2588190 |
        90 |  1.0000000 |  0.0000000 |

The numeric columns should all be aligned on the right, and the columns containing the trigonometric functions (which are listed here for angles in degrees at 15-degree intervals) should all have seven digits after the decimal point.

2. In exercise 4 in Chapter 2, you wrote a function windChill that calculated the wind chill for a given temperature and wind velocity. Write a program that uses this function to display these values in tabular form, as illustrated by the table from the National Weather Service shown in Figure 2-17 on page 117.

3. Write a program that prints out the longest line in a file entered by the user. If several lines are all equally long, your program should print out the first such line.

4. Write a program that reads a file and reports how many lines, words, and characters appear in it. For the purposes of this program, a word consists of a consecutive sequence of any characters except whitespace characters. As an example, suppose that the file Lear.txt contains the following passage from Shakespeare’s King Lear,

    Lear.txt
    Poor naked wretches, wheresoe'er you are,
    That bide the pelting of this pitiless storm,
    How shall your houseless heads and unfed sides,
    Your loop'd and window'd raggedness, defend you
    From seasons such as these? O, I have ta'en
    Too little care of this!

your program should be able to generate the following sample run:

    FileCounts
    Input file: Lear.txt
    Chars: 254
    Words:  43
    Lines:   6

The counts in the output should be displayed in a column that is aligned on the right but which expands to fit the data. For example, if you have a file containing the full text of George Eliot’s Middlemarch, the output of your program should look like this:

    FileCounts
    Input file: Middlemarch.txt
    Chars: 1796948
    Words:  316689
    Lines:   34037

5. The filelib.h interface exports several functions that make it easy to work with filenames. Two functions that are particularly useful are getRoot and getExtension, which divide a filename into its root, which is the identifying part of the filename, and the extension, which indicates its type. For example, given the filename Middlemarch.txt used in the preceding exercise, the root is Middlemarch and the extension is .txt (note that filelib.h defines the extension to include the dot). Write the code necessary to implement these functions. To find out how to handle special cases, such as filenames that don’t include a dot, consult the filelib.h interface in Appendix A.

6. Another useful function in filelib.h is

    string defaultExtension(string filename, string ext);

which adds ext to the end of filename if it doesn’t already have an extension. For example,

    defaultExtension("Shakespeare", ".txt")

would return "Shakespeare.txt". If filename already has an extension, that name is returned unchanged, so that

    defaultExtension("library.h", ".cpp")

would ignore the specified extension and return "library.h" unchanged. If, however, ext includes a star before the dot, defaultExtension removes any existing extension from filename and adds the new one (minus the star). Thus,

    defaultExtension("library.h", "*.cpp")

would return "library.cpp". Write the code for defaultExtension so that it behaves as described in this exercise.

7.
On occasion, publishers find it useful to evaluate layouts and stylistic designs without being distracted by the actual words. To do so, they sometimes typeset sample pages in such a way that all of the original letters are replaced with random letters. The resulting text has the spacing and punctuation structure of the original, but no longer conveys any meaning that might get in the way of the design. The publishing term for text that has been replaced in this way is greek, presumably after the old saying “It’s all Greek to me,” which is itself adapted from a line from Julius Caesar.

Write a program that reads characters from an input file and displays them on the console after making the appropriate random substitutions. Your program should replace every uppercase character in the input by a random uppercase character and every lowercase character by a random lowercase one. Nonalphabetic characters should remain unchanged. For example, if the input file Troilus.txt contains the text from Troilus and Cressida,

    Troilus.txt
    Ay, Greek; and that shall be divulged well
    In characters as red as Mars his heart
    Inflamed with Venus:

your program should generate output that looks something like this:

    Greek
    Input file: Troilus.txt
    Ne, Inyes; fmd ckhj zntqt uv dqijxnkp uyww
    Rt pkjfvkmzdf yt kut ya Itgp byi blxod
    Ogmgmkwl jjbe Nscku:

8.
    Books were bks and Robin Hood was Rbinhd. Little Goody Two Shoes lost her Os and so did Goldilocks, and the former became a whisper, and the latter sounded like a key jiggled in a lck. It was impossible to read “cockadoodledoo” aloud, and parents gave up reading to their children, and some gave up reading altogether. . . .
    —James Thurber, The Wonderful O, 1957

In James Thurber’s children’s story The Wonderful O, the island of Ooroo is invaded by pirates who set out to banish the letter O from the alphabet. Such censorship would be much easier with modern technology. Write a program that asks the user for an input file, an output file, and a string of letters to be eliminated. The program should then copy the input file to the output file, deleting any of the letters that appear in the string of censored letters, no matter whether they appear in uppercase or lowercase form.

As an example, suppose that you have a file containing the first few lines of Thurber’s novel, as follows:

    TheWonderfulO.txt
    Somewhere a ponderous tower clock slowly dropped a dozen strokes into the gloom. Storm clouds rode low along the horizon, and no moon shown. Only a melancholy chorus of frogs broke the soundlessness.

If you run your program with the input

    BanishLetters
    Input file: TheWonderfulO.txt
    Output file: TheWnderful.txt
    Letters to banish: o

it should write the following file:

    TheWnderful.txt
    Smewhere a pnderus twer clck slwly drpped a dzen strkes int the glm. Strm cluds rde lw alng the hrizn, and n mn shwn. nly a melanchly chrus f frgs brke the sundlessness.

If you tried to get greedy and banish all the vowels by entering aeiou in response to the prompt, the contents of the output file would be

    Smwhr pndrs twr clck slwly drppd dzn strks nt th glm. Strm clds rd lw lng th hrzn, nd n mn shwn. nly mlnchly chrs f frgs brk th sndlssnss.

9. Even though comments are essential for human readers, the compiler simply ignores them. If you are writing a compiler, you therefore need to be able to recognize and eliminate comments that occur in a source file. Write a function

    void removeComments(istream & is, ostream & os);

that copies characters from the input stream is to the output stream os, except for characters that appear inside C++ comments. Your implementation should recognize both comment conventions:

• Any text beginning with /* and ending with */, possibly many lines later.
• Any text beginning with // and extending through the end of the line.
The real C++ compiler needs to check to make sure that these characters are not contained inside quoted strings, but you should feel free to ignore that detail. The problem is tricky enough as it stands.

10. Some files use tab characters to align data into columns. Doing so, however, can cause problems for applications that are unable to work directly with tabs. For these applications, it is useful to have access to a program that replaces tabs in an input file with the number of spaces required to reach the next tab stop. In programming, tab stops are usually set at every eight columns. For example, suppose that the input file contains a line of the form

    abc[tab]pqrst[tab]xyz

where the symbol [tab] represents the space taken up by a tab, which differs depending on its position in the line. If the tab stops are set every eight spaces, the first tab character must be replaced by five spaces and the second tab character by three. Write a program that copies an input file to an output file, replacing all tab characters by the appropriate number of spaces.

11. Using the functions stringToInteger and integerToString as a model, write the code necessary to implement stringToReal and realToString.

12. Complete the implementation of the simpio.h interface by implementing the functions getReal and getLine.

Chapter 5 Collections

    You see what a collection I have.
    — Jane Austen, Mansfield Park, 1814

As you know from your programming experience, data structures can be assembled to form hierarchies. The atomic data types like int, char, double, and the enumerated types occupy the lowest level in the hierarchy. To represent more complex information, you combine the atomic types to form larger structures. These larger structures can then be assembled into even larger ones in an open-ended process. Collectively, these assemblages are called data structures.
As you learn more about programming, you will discover that particular data structures are so useful that they are worth studying in their own right. Moreover, it is usually far more important to know how to use those structures effectively than it is to understand their underlying representation. For example, even though a string might be represented inside the machine as an array of characters, it also has an abstract behavior that transcends its representation. A type defined in terms of its behavior rather than its representation is called an abstract data type, which is often abbreviated to ADT. Abstract data types are central to the object-oriented style of programming, which encourages programmers to think about data structures in a holistic way. This chapter introduces five classes—Vector, Stack, Queue, Map, and Set— each of which represents an important abstract data type. Each of these classes, moreover, contains a collection of values of some simpler type. Such classes are therefore called collection classes. For the moment, you don’t need to understand how these classes are implemented, because your primary focus is on learning how to use these classes as a client. In later chapters, you’ll have a chance to explore a variety of implementation strategies and learn about the algorithms and data structures necessary to make those implementations efficient. Being able to separate the behavior of a class from its underlying implementation is a fundamental precept of object-oriented programming. As a design strategy, it offers the following advantages: • Simplicity. Hiding the internal representation from the client means that there are fewer details for the client to understand. • Flexibility. Because a class is defined in terms of its public behavior, the programmer who implements one is free to change its underlying private representation. As with any abstraction, it is appropriate to change the implementation as long as the interface remains the same. 
• Security. The interface boundary acts as a wall that protects the implementation and the client from each other. If a client program has access to the representation, it can change the values in the underlying data structure in unexpected ways. Making the data private in a class prevents the client from making such changes.

To use any of the collection classes introduced in this chapter, you must include the appropriate interface, just as you would for any of the libraries from the earlier chapters. The interface for each of the collection classes is simply the name of the class spelled with a lowercase initial letter and followed with the extension .h at the end. For example, in order to use the Vector class in a program, you must include the line

    #include "vector.h"

The collection classes used in this book are inspired by and draw much of their structure from a more advanced set of classes available for C++ called the Standard Template Library, or STL for short. Although the STL is enormously powerful, it is more difficult to understand from both the client and implementation perspectives. The advantage of using the simplified version is that you can understand the entire implementation by the time you finish this book. Knowing how the implementation works gives you greater insight into what the Standard Template Library is doing for you behind the scenes.

5.1 The Vector class

One of the most valuable collection classes is the Vector class, which provides a facility similar to the arrays you have almost certainly encountered in your earlier experience with programming. Arrays have been around since the early days of programming. Like most languages, C++ supports arrays, and you will have the chance to learn how C++ arrays work in Chapter 11. Arrays in C++, however, have a number of weaknesses, including the following:

• Arrays are allocated with a fixed size that you can’t subsequently change.
• Even though arrays have a fixed size, C++ does not make that size available to the programmer. As a result, programs that work with arrays typically need an additional variable to keep track of the number of elements.
• Traditional arrays offer no support for inserting and deleting elements.
• C++ makes no effort to ensure that the elements you select are actually present in the array. For example, if you create an array with 25 elements and then try to select the value at index position 50, C++ will simply look at the memory addresses at which element 50 would appear if it existed.

The Vector class solves each of these problems by reimplementing the array concept in the form of an abstract data type. You can use the Vector class in place of arrays in any application, usually with surprisingly few changes in the source code and at most a minor reduction in efficiency. In fact, once you have the Vector class, it’s unlikely that you will have much occasion to use arrays at all, unless you actually have to implement classes like Vector, which, not surprisingly, uses arrays in its underlying structure. As a client of the Vector class, however, you are not interested in that underlying structure and can leave the array mechanics to the programmers who implement the abstract data type.

As a client of the Vector class, you are concerned with a different set of issues and need to answer the following questions:

1. How is it possible to specify the type of object contained in a Vector?
2. How does one create an object that is an instance of the Vector class?
3. What methods exist in the Vector class to implement its abstract behavior?

The next three sections explore the answers to each of these questions in turn.

Specifying the base type of a Vector

In C++, collection classes specify the type of object they contain by including the type name in angle brackets following the class name.
For example, the class Vector<int> represents a vector whose elements are integers, Vector<char> specifies a vector whose elements are single characters, and Vector<string> specifies one in which the elements are strings. The type enclosed within the angle brackets is called the base type for the collection.

Classes that include a base-type specification are called parameterized classes in the object-oriented community. In C++, parameterized classes are more often called templates, which reflects the fact that C++ compilers treat Vector<int>, Vector<char>, and Vector<string> as independent classes that share a common structure. The name Vector acts as a template for stamping out a whole family of classes, in which the only difference is what type of value the vector contains. For now, all you need to understand is how to use templates; the process of implementing basic templates is described in Chapter 14.

Declaring a Vector object

One of the philosophical principles behind abstract data types is that clients should be able to think of them as if they were built-in primitive types. Thus, just as you would declare an integer variable by writing a declaration such as

    int n;

it ought to be possible to declare a new vector by writing

    Vector<int> vec;

In C++, that is precisely what you do. That declaration introduces a new variable named vec, which is, as the template marker in angle brackets indicates, a vector of integers.

Vector operations

When you declare a Vector variable, it starts out as an empty vector, which means that it contains no elements. Since an empty vector is not particularly useful, one of the first things you need to learn is how to add new elements to a Vector object. The usual approach is to invoke the add method, which adds a new element at the end of the Vector.
For example, if vec is an empty vector of integers as declared in the preceding section, executing the code

    vec.add(10);
    vec.add(20);
    vec.add(40);

changes vec into a three-element vector containing the values 10, 20, and 40. As with the characters in a string, C++ numbers the elements of a vector starting with 0, which means that you could diagram the contents of vec like this:

    vec:     10   20   40
    index:    0    1    2

Unlike the more primitive array type that will be introduced in Chapter 11, the size of a vector is not fixed, which means that you can add additional elements at any time. Later in the program, for example, you could call

    vec.add(50);

which would add the value 50 to the end of the vector, like this:

    vec:     10   20   40   50
    index:    0    1    2    3

The insertAt method allows you to add new elements in the middle of a vector. The first argument to insertAt is an index number, and the new element is inserted before that position. For example, calling

    vec.insertAt(2, 30);

inserts the value 30 before index position 2, as follows:

    vec:     10   20   30   40   50
    index:    0    1    2    3    4

Internally, the implementation of the Vector class has to expand the array storage and move the values 40 and 50 over one position to make room for the 30. From your perspective as a client, the implementation simply takes care of such details, and you don’t need to understand how it does so.

The Vector class also lets you remove elements. For example, calling

    vec.removeAt(0);

removes the element from position 0, leaving the following values:

    vec:     20   30   40   50
    index:    0    1    2    3

Once again, the implementation takes care of shifting elements to close the gap left by the deleted value.

The Vector class includes two methods for selecting and modifying individual elements. The get method takes an index number and returns the value in that index position. For example, given the value of vec shown in the most recent diagram, calling vec.get(2) would return the value 40.
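The sequence of calls traced above can be replayed as a short sketch. It uses std::vector from the standard library as a stand-in for the Stanford Vector class, which is an assumption made only so that the code compiles without vector.h. The mapping of add, insertAt, and removeAt onto push_back, insert, and erase is approximate, and the function name replayVectorExample is hypothetical.

```cpp
#include <vector>

// Hypothetical helper: replays the chapter's sequence of calls on a
// std::vector<int>, used here as a stand-in for the Stanford Vector<int>.
std::vector<int> replayVectorExample() {
    std::vector<int> vec;               // starts out empty
    vec.push_back(10);                  // plays the role of vec.add(10)
    vec.push_back(20);                  // plays the role of vec.add(20)
    vec.push_back(40);                  // plays the role of vec.add(40)
    vec.push_back(50);                  // plays the role of vec.add(50)
    vec.insert(vec.begin() + 2, 30);    // plays the role of vec.insertAt(2, 30)
    vec.erase(vec.begin());             // plays the role of vec.removeAt(0)
    return vec;                         // now holds 20, 30, 40, 50
}
```

Selecting index 2 of the returned vector yields 40, which matches the result of vec.get(2) in the text.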
Symmetrically, you can use the set method to change the value of an existing element. For example, calling

    vec.set(3, 70);

changes the value in index position 3 from 50 to 70, like this:

    vec:     20   30   40   70
    index:    0    1    2    3

The get, set, insertAt, and removeAt methods all check to make sure that the index value you supply is valid for the vector. For example, if you were to call vec.get(4) on this vector, the get method would call error to report that the index value 4 is too large for a vector in which the index values run from 0 to 3. For get, set, and removeAt, the Vector implementation checks that the index is greater than or equal to 0 and less than the number of elements. The insertAt method allows the index to be equal to the number of elements, in which case the new value is added at the end of the vector, just as it is with add.

The operation of testing whether an index is valid is called bounds-checking. Bounds-checking makes it easier to catch programming errors that can often go unnoticed when you work with traditional arrays.

Selecting elements in a vector

Even though the get and set methods are easy to use, hardly anyone actually calls these methods. One of the characteristics of C++ that sets it apart from most other languages is that classes can overload the standard operators. This feature makes it possible for the Vector class to support the more traditional syntax of using square brackets to specify the desired index. Thus, to select the element at position i, you can use the expression vec[i], just as you would with a traditional array. You can, moreover, change the value of that element by assigning a new value to vec[i]. For example, you can set element 3 in vec to 70 by writing

    vec[3] = 70;

The resulting syntax is marginally shorter than calling set, but is more evocative of the array operations that the Vector class is designed to emulate.
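The bounds-checking rules described in this section can be written down as a small sketch. The helper names isValidIndex, isValidInsertIndex, and checkedGet are hypothetical, the code uses std::vector as a stand-in for the Stanford class, and it throws std::out_of_range where the Stanford library would call error; all three choices are assumptions made for illustration.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical helper: the test applied by get, set, and removeAt.
// The index must satisfy 0 <= index < number of elements.
bool isValidIndex(const std::vector<int> & vec, int index) {
    return index >= 0 && index < static_cast<int>(vec.size());
}

// Hypothetical helper: the test applied by insertAt, which also accepts
// an index equal to the number of elements (insertion at the end).
bool isValidInsertIndex(const std::vector<int> & vec, int index) {
    return index >= 0 && index <= static_cast<int>(vec.size());
}

// Hypothetical bounds-checked selection. Throwing std::out_of_range is an
// assumption; the text says the Stanford implementation calls error instead.
int checkedGet(const std::vector<int> & vec, int index) {
    if (!isValidIndex(vec, index)) {
        throw std::out_of_range("index out of range");
    }
    return vec[index];
}
```

For the four-element vector in the most recent diagram, index 4 fails the first test but passes the second, which is exactly the difference between get and insertAt.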
The index used to select an element from an array can be any expression that evaluates to an integer. One of the most common index expressions is the index of a for loop that cycles through each of the index values in order. The general pattern for cycling through the index positions in a vector looks like this:

    for (int i = 0; i < vec.size(); i++) {
       loop body
    }

Inside the loop body, you can refer to the current element as vec[i].

As an example, the following code writes out the contents of the vector vec as a comma-separated list enclosed in square brackets:

    cout << "[";
    for (int i = 0; i < vec.size(); i++) {
       if (i > 0) cout << ", ";
       cout << vec[i];
    }
    cout << "]" << endl;

If you were to execute this code given the most recent contents of vec, you would see the following output on the screen:

    [20, 30, 40, 70]

Passing a Vector object as a parameter

The code at the end of the preceding section is so useful (particularly when you’re debugging and need to see what values a vector contains) that it is worth defining a function for this purpose. At one level, encapsulating the code inside a function is easy; all you have to do is add the appropriate function header, like this:

    void printVector(Vector<int> & vec) {
       cout << "[";
       for (int i = 0; i < vec.size(); i++) {
          if (i > 0) cout << ", ";
          cout << vec[i];
       }
       cout << "]" << endl;
    }

The header line, however, involves one subtlety that you have to understand before you can use collection classes effectively. As described in Chapter 2, the & before the parameter name indicates that the argument vec is passed by reference, which means that the value in the function is shared with the value in the caller. Call-by-reference is more efficient than C++’s default model of call-by-value, which would require copying every element in the vector. In the printVector example, there is no reason to make that copy, so call-by-reference represents a more efficient design.
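As a variation on printVector, the following sketch builds the bracketed list as a string instead of writing it to cout, which makes the result easy to check. It takes a std::vector by const reference; both the use of std::vector in place of the Stanford class and the const qualifier are assumptions that go beyond the text, and the name vectorToString is hypothetical.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: formats a vector as a comma-separated list in
// square brackets. Passing by reference avoids copying the elements;
// const (an addition not used in the text) promises not to modify them.
std::string vectorToString(const std::vector<int> & vec) {
    std::ostringstream out;
    out << "[";
    for (std::size_t i = 0; i < vec.size(); i++) {
        if (i > 0) out << ", ";    // separator before every element but the first
        out << vec[i];
    }
    out << "]";
    return out.str();
}
```

Called on a vector holding 20, 30, 40, and 70, the function returns the same text that printVector displays, namely [20, 30, 40, 70].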
Perhaps more importantly, using call-by-reference makes it possible to write functions that change the contents of a vector. As an example, the following function deletes any zero-valued elements from a vector of integers:

    void removeZeroElements(Vector<int> & vec) {
       for (int i = vec.size() - 1; i >= 0; i--) {
          if (vec[i] == 0) vec.removeAt(i);
       }
    }

The for loop cycles through each element and checks whether its value is 0. If so, the function calls removeAt to delete that element from the vector. To ensure that removing an element doesn’t change the positions of elements that have not yet been checked, the for loop starts at the end of the vector and runs backwards.

This function depends on the use of call-by-reference. If you had left out the ampersand in this header line, removeZeroElements would have no effect at all. The code would remove the zero elements from the local copy of vec, but not from the vector the caller supplied. When removeZeroElements returns, that copy goes away, leaving the original vector unchanged. This kind of error is easy to make, and you should learn to look for it when your programs go awry.

The ReverseFile program in Figure 5-1 shows a complete C++ program that uses Vector to display the lines from a file in reverse order. The functions promptUserForFile and readEntireFile will come in handy in a variety of applications. For this reason, both of these functions, along with many other useful functions for working with files, are included in the filelib.h library listed in Appendix A.

Creating a Vector of a predefined size

The examples you have seen up to this point start out with an empty vector and then add elements to it, one at a time. In many applications, building up a vector one element at a time is tedious, particularly if you know the size of the vector in advance. In such cases, it makes more sense to specify the number of elements as part of the declaration.
As an example, suppose that you wanted to create a vector to hold the scores for each hole on an 18-hole golf course. The strategy you already know is to create an empty Vector<int> and then add 18 elements to it using a for loop, as follows:

    const int N_HOLES = 18;
    Vector<int> golfScores;
    for (int i = 0; i < N_HOLES; i++) {
       golfScores.add(0);
    }

A better approach is to include the size as a parameter to the declaration like this:

    Vector<int> golfScores(N_HOLES);

This declaration creates a Vector<int> with N_HOLES elements, each of which is initialized to 0, the default value for type int. The effect of these two code fragments is the same. Each of them creates a Vector<int> filled with 18 zero values. The first form requires the client to initialize the elements; the second hands that responsibility off to the Vector class itself.

Figure 5-1. Program to display the lines of a file in reverse order

    /*
     * File: ReverseFile.cpp
     * ---------------------
     * This program displays the lines of an input file in reverse order.
     */

    #include <fstream>
    #include <iostream>
    #include <string>
    #include "filelib.h"
    #include "vector.h"
    using namespace std;

    /* Function prototypes */

    void readEntireFile(istream & is, Vector<string> & lines);

    /* Main program */

    int main() {
       ifstream infile;
       Vector<string> lines;
       promptUserForFile(infile, "Input file: ");
       readEntireFile(infile, lines);
       infile.close();
       for (int i = lines.size() - 1; i >= 0; i--) {
          cout << lines[i] << endl;
       }
       return 0;
    }

    /*
     * Function: readEntireFile
     * Usage: readEntireFile(is, lines);
     * ---------------------------------
     * Reads the entire contents of the specified input stream into the
     * string vector lines.  The client is responsible for opening and
     * closing the stream.
     */

    void readEntireFile(istream & is, Vector<string> & lines) {
       while (true) {
          string line;
          getline(is, line);
          if (is.fail()) break;
          lines.add(line);
       }
    }

As a more significant example of when you might want to declare a vector of a constant size, consider the LetterFrequency program in Figure 5-2, which counts how often each of the 26 letters appears in a data file. Those counts are maintained in the variable letterCounts, which is declared as follows:

    Vector<int> letterCounts(26);

Each element in this vector contains the count of the letter at the corresponding index in the alphabet, with the number of As in letterCounts[0], the number of Bs in letterCounts[1], and so on. For each letter in the file, all the program has to do is increment the value at the appropriate index position in the vector, which can be calculated arithmetically based on the ASCII code of the character.

Figure 5-2. Program to count letter frequencies in a file

    /*
     * File: LetterFrequency.cpp
     * -------------------------
     * This program counts the frequency of letters in a data file.
     */

    #include <cctype>
    #include <fstream>
    #include <iomanip>
    #include <iostream>
    #include "filelib.h"
    #include "vector.h"
    using namespace std;

    /* Constants */

    static const int COLUMNS = 7;

    /* Main program */

    int main() {
       Vector<int> letterCounts(26);
       ifstream infile;
       promptUserForFile(infile, "Input file: ");
       while (true) {
          int ch = infile.get();
          if (ch == EOF) break;
          if (isalpha(ch)) {
             letterCounts[toupper(ch) - 'A']++;
          }
       }
       infile.close();
       for (char ch = 'A'; ch <= 'Z'; ch++) {
          cout << setw(COLUMNS) << letterCounts[ch - 'A'] << " " << ch << endl;
       }
       return 0;
    }

The index calculation occurs in the statement

    letterCounts[toupper(ch) - 'A']++;

Subtracting the ASCII value 'A' from the uppercase character code gives the index of the character ch in the alphabet. This statement then updates the count by incrementing that element of the vector.
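The index arithmetic is easy to try on a string instead of a file. The sketch below applies the same counting step to each character of its argument. It uses std::vector as a stand-in for the Stanford class and assumes an ASCII character set and the default "C" locale; the function name countLetters is hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: counts how often each letter occurs in text.
// Element 0 counts As, element 1 counts Bs, and so on through Z.
// Assumes ASCII and the default "C" locale, so isalpha accepts only A-Z, a-z.
std::vector<int> countLetters(const std::string & text) {
    std::vector<int> letterCounts(26);    // 26 elements, each initialized to 0
    for (std::size_t i = 0; i < text.length(); i++) {
        // Convert to unsigned char before calling the <cctype> functions.
        unsigned char ch = static_cast<unsigned char>(text[i]);
        if (std::isalpha(ch)) {
            letterCounts[std::toupper(ch) - 'A']++;
        }
    }
    return letterCounts;
}
```

For the string "Hello, World!", the element at index 'L' - 'A' ends up as 3 and the element at index 'O' - 'A' as 2, while punctuation and spaces are skipped.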
The remainder of the program is mostly concerned with formatting the output so that the letter counts are properly aligned in columns. As an example, here is what the LetterFrequency program produces if you run it on a file containing the text of George Eliot’s Middlemarch:

    Input file: Middlemarch.txt
     114157 A
      23269 B
      34031 C
      61046 D
     166989 E
      30826 F
      30055 G
      89636 H
      99651 I
       1695 J
      11010 K
      56865 L
      37816 M
      96887 N
     108561 O
      21922 P
       1441 Q
      79808 R
      88555 S
     123433 T
      40647 U
      12792 V
      34508 W
       2069 X
      28700 Y
        249 Z

Constructors for the Vector class

The sample programs you’ve seen so far in this chapter declare vectors in two different ways. The ReverseFile program in Figure 5-1 defines an empty vector of strings using the declaration

    Vector<string> lines;

The LetterFrequency program declares a vector containing 26 zeroes like this:

    Vector<int> letterCounts(26);

As it happens, there is more going on in these declarations than meets the eye. When you declare a variable of a primitive type, as in

    double total;

C++ does not make any attempt to initialize that variable. The contents of the memory used to hold the variable total continue to contain whatever value happened to be there, which can easily lead to unexpected results. For this reason, declarations of primitive variables usually include an explicit initializer that sets the variable to the desired starting value, as in

    double total = 0.0;

The situation is different when you declare a variable to be an instance of a C++ class. In that case, C++ automatically initializes that variable by invoking a special method called a constructor. For example, in the declaration

    Vector<string> lines;

C++ calls the Vector constructor, which initializes the variable lines to be an empty vector that belongs to the parameterized class Vector<string>. The declaration

    Vector<int> letterCounts(26);

calls a different version of the constructor to initialize a Vector<int> with 26 elements.
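The two forms of declaration can be compared in a short sketch. It uses std::vector, whose default and sized constructors behave in the same way for this purpose; treating std::vector as a stand-in for the Stanford class is an assumption, and the helper names makeLines and makeCounts are hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: a declaration with no arguments invokes the
// constructor that creates an empty vector.
std::vector<std::string> makeLines() {
    std::vector<std::string> lines;    // no arguments: starts out empty
    return lines;
}

// Hypothetical helper: a declaration with one integer argument invokes
// the constructor that creates n elements, each set to 0 for type int.
std::vector<int> makeCounts(int n) {
    std::vector<int> counts(n);        // one argument: n zero-valued elements
    return counts;
}
```

Calling makeLines() produces a vector of size 0, and makeCounts(26) produces 26 zeros, mirroring the declarations of lines and letterCounts in the text.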
The C++ compiler determines which version of the constructor to call by looking at the arguments appearing in the declaration, just as it does for overloaded functions. The declaration of lines provides no arguments, which tells the compiler to invoke the constructor that takes no arguments, which is called the default constructor. The declaration of letterCounts provides an integer argument, which tells the compiler to invoke the version of the constructor that takes an integer indicating the vector size. The two versions of the Vector constructor appear, along with a complete list of the Vector methods, in Table 5-1.

If you look closely at the constructor descriptions in Table 5-1, you’ll discover that the second form of the constructor accepts an optional argument indicating the initial value to use for each of the elements. That argument is usually omitted, in which case the initial value for the elements is the default value for the base type, which is 0 for numeric types, false for type bool, the character whose ASCII value is 0 for type char, and the result of calling the default constructor for a class.

Table 5-1. Entries in the vector.h interface

Constructors
    Vector()
        Creates an empty vector.
    Vector(n, value)
        Creates a vector with n elements, each of which has the specified value. If value is missing, each element is initialized to its default value.

Methods
    size()
        Returns the number of elements in the vector.
    isEmpty()
        Returns true if the vector is empty.
    get(index)
        Returns the element at the specified index position. Attempting to get the value of an element outside the vector bounds generates an error.
    set(index, value)
        Sets the element at the specified index to the new value. Attempting to set the value of an element outside the vector bounds generates an error.
    add(value)
        Adds a new element at the end of the vector.
    insertAt(index, value)
        Inserts the new value before the specified index position.
    removeAt(index)
        Deletes the element at the specified index position.
    clear()
        Removes all elements from the vector.

Operators
    vec[index]
        The Vector class extends the square bracket operators so that clients can use array notation to select elements. Attempting to select an element outside the vector bounds generates an error.
    v1 + v2
        Concatenates v1 and v2 and returns a new vector containing the combined elements.
    vec += e1, e2, . . .
        Adds the elements e1, e2, and so on, to the end of the vector vec.

Vector operators

Partly to illustrate the power of operator overloading and partly because these operators turn out to be so useful, the Vector class in the Stanford library defines several operators that apply to vector objects. You have already seen the use of square brackets, which make it possible to select objects from a vector using the traditional selection syntax for arrays. The Vector class also defines the operators + and += as shorthands for concatenating two vectors and for adding elements to an existing vector.

The + operator on vectors is defined so that it works in exactly the way it does for strings. If the character vectors v1 and v2 contain the values

    v1:      'A'  'B'  'C'
    index:    0    1    2

and

    v2:      'D'  'E'
    index:    0    1

evaluating the expression v1 + v2 creates a new five-element vector, as follows:

    v1 + v2: 'A'  'B'  'C'  'D'  'E'
    index:    0    1    2    3    4

The += operator adds elements to the end of an existing vector and turns out to be even more useful, particularly when you need to initialize the contents of a vector. For example, you could declare and initialize the vectors v1 and v2 in this example using the following code:

    Vector<char> v1;
    Vector<char> v2;
    v1 += 'A', 'B', 'C';
    v2 += 'D', 'E';

Representing two-dimensional structures

The type parameter used in the Vector class can be any C++ type and may itself be a parameterized type.
In particular, you can create a two-dimensional structure by declaring a Vector whose base type is itself a Vector. The declaration

    Vector< Vector<int> > sudoku(9, Vector<int>(9));

initializes the variable sudoku to a Vector of nine elements, each of which is itself a Vector of nine elements. The base type of the inner vector is int, and the base type of the outer vector is Vector<int>. The type of the whole assemblage is therefore

    Vector< Vector<int> >

Although compilers that support C++11 or later accept this declaration without the spaces, the form shown here adheres to the earlier C++ standard by including spaces around the inner type parameter. Those spaces ensure that the angle brackets for the type parameters are interpreted correctly. If you instead write this declaration as

    Vector<Vector<int>> sudoku(9, Vector<int>(9));

many older C++ compilers will interpret the >> as a single operator and be unable to compile this line.

The Grid class in the Stanford libraries

Although using nested vectors makes it possible to represent two-dimensional structures, that strategy is by no means convenient. To simplify the development of applications that need to work with two-dimensional structures, the Stanford version of the collection class library includes a class called Grid, even though there is no direct counterpart for this class in the Standard Template Library. The entries exported by grid.h appear in Table 5-2.

Table 5-2. Entries exported by the grid.h interface

Constructors
    Grid()
        Creates an empty grid. Clients who use the default constructor must specify the grid dimensions by calling resize.
    Grid(rows, cols)
        Creates a grid with the specified number of rows and columns. Each element is initialized to the default value for the type.

Methods
    numRows()
    numCols()
        These methods return the number of rows and the number of columns, respectively.
    inBounds(row, col)
        Returns true if the specified row and column coordinates are inside the grid.
    get(row, col)
        Returns the element of the grid that appears at the specified row and column.
    set(row, col, value)
        Sets the element at the specified grid coordinates to the new value.
    resize(rows, cols)
        Changes the dimensions of the grid as specified by the rows and cols parameters. Any previous contents of the grid are discarded.

Operators
    grid[row][col]
        The Grid class extends the square bracket operators so that clients can use array notation to select individual elements of the grid.

5.2 The Stack class

When measured in terms of the operations it supports, the simplest collection class is the Stack class, which, despite its simplicity, turns out to be useful in a variety of programming applications. Conceptually, a stack provides storage for a collection of data values, subject to the restriction that values must be removed from a stack in the opposite order from which they were added. This restriction implies that the last item added to a stack is always the first item that gets removed.

In light of their importance in computer science, stacks have a terminology of their own. Adding a new value to a stack is called pushing that value; removing the most recent item from a stack is called popping the stack. Moreover, the order in which stacks are processed is sometimes called LIFO, which stands for “last in, first out.”

A common (but possibly apocryphal) explanation for the words stack, push, and pop is that the stack model is derived from the way plates are stored in a cafeteria.
In particular, in a cafeteria where customers pick up their own plates at the beginning of a buffet line, those plates are placed in spring-loaded columns that make it easy for people in the cafeteria line to take the top plate, as illustrated in the following diagram:

    [Diagram: a spring-loaded column of plates]

When a dishwasher adds a new plate, it goes on the top of the stack, pushing the others down slightly as the spring is compressed, as shown:

    [Diagram: the same column after a new plate has been added on top]

Customers can take plates only from the top of the stack. When they do, the remaining plates pop back up. The last plate added to the stack is the first one a customer takes.

The primary reason that stacks are important in programming is that nested function calls behave in a stack-oriented fashion. For example, if the main program calls a function named f, a stack frame for f gets pushed on top of the stack frame for main, as illustrated by the following diagram:

    Frame for main:
        int main() {
           f();
           return 0;
        }

    Frame for f (top of stack):
        void f() {
           cout << "This is the function f" << endl;
         ☞ g();
        }

If f calls g, a new stack frame for g is pushed on top of the frame for f, as follows:

    Frame for main:
        int main() {
           f();
           return 0;
        }

    Frame for f:
        void f() {
           cout << "This is the function f" << endl;
           g();
        }

    Frame for g (top of stack):
        void g() {
         ☞ cout << "This is the function g" << endl;
        }

When g returns, its frame is popped off the stack, restoring f to the top of the stack as shown in the original diagram.

The structure of the Stack class

Like Vector and Grid, Stack is a collection class that requires you to specify the element type. For example, Stack<int> represents a stack whose elements are integers, and Stack<string> represents one in which the elements are strings. Similarly, if you define the classes Plate and Frame, you can create stacks of these objects using the classes Stack<Plate> and Stack<Frame>. The list of entries exported by the stack.h interface appears in Table 5-3.

Table 5-3. Entries exported by the stack.h interface

Constructor
    Stack()
        Creates an empty stack capable of holding values of the specified type.

Methods
    size()
        Returns the number of elements currently on the stack.
    isEmpty()
        Returns true if the stack is empty.
    push(value)
        Pushes value on the stack so that it becomes the topmost element.
    pop()
        Pops the topmost value from the stack and returns it to the caller. Calling pop on an empty stack generates an error.
    peek()
        Returns the topmost value on the stack without removing it. Calling peek on an empty stack generates an error.
    clear()
        Removes all the elements from a stack.

Stacks and pocket calculators

One interesting application of stacks is in electronic calculators, where they are used to store intermediate results of a calculation. Although stacks play a central role in the operation of most calculators, that role is easiest to see in early scientific calculators that required users to enter expressions in Reverse Polish Notation, or RPN.

In Reverse Polish Notation, operators are entered after the operands to which they apply. For example, to compute the result of the expression

    8.5 * 4.4 + 6.9 / 1.5

on an RPN calculator, you would enter the operations in the following order:

    8.5  ENTER  4.4  *  6.9  ENTER  1.5  /  +

When the ENTER button is pressed, the calculator takes the previous value and pushes it on a stack. When an operator button is pressed, the calculator first checks whether the user has just entered a value and, if so, automatically pushes it on the stack. It then computes the result of applying the operator by

• Popping the top two values from the stack
• Applying the arithmetic operation indicated by the button to these values
• Pushing the result back on the stack

Except when the user is actually typing in a number, the calculator display shows the value at the top of the stack. Thus, at each point in the operation, the calculator display and stack contain the values shown in Figure 5-3.
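The pop, apply, and push steps listed above can be sketched as a compact function that evaluates a whole RPN expression at once. This is an illustration under stated assumptions, not the calculator program developed later in the section: it uses std::stack from the standard library in place of the Stanford Stack class, it expects tokens separated by spaces with no ENTER key, and the name evaluateRPN and the exceptions it throws are hypothetical choices.

```cpp
#include <sstream>
#include <stack>
#include <stdexcept>
#include <string>

// Hypothetical helper: evaluates a space-separated RPN expression such as
// "8.5 4.4 * 6.9 1.5 / +". Uses std::stack as a stand-in for Stack<double>.
double evaluateRPN(const std::string & expression) {
    std::stack<double> operands;
    std::istringstream tokens(expression);
    std::string token;
    while (tokens >> token) {
        if (token == "+" || token == "-" || token == "*" || token == "/") {
            if (operands.size() < 2) {
                throw std::runtime_error("Too few operands");
            }
            // The right operand was pushed last, so it is popped first.
            double rhs = operands.top(); operands.pop();
            double lhs = operands.top(); operands.pop();
            double result = 0;
            switch (token[0]) {
                case '+': result = lhs + rhs; break;
                case '-': result = lhs - rhs; break;
                case '*': result = lhs * rhs; break;
                case '/': result = lhs / rhs; break;
            }
            operands.push(result);               // push the result back
        } else {
            operands.push(std::stod(token));     // a number: push it
        }
    }
    if (operands.size() != 1) {
        throw std::runtime_error("Malformed expression");
    }
    return operands.top();
}
```

Evaluating "8.5 4.4 * 6.9 1.5 / +" follows the same sequence of stack states as the button presses in the text and produces a value of approximately 42.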
Figure 5-3. Execution diagram of the RPN calculator

    Button    Display    Stack (top first)
    8.5       8.5        (empty)
    ENTER     8.5        8.5
    4.4       4.4        8.5
    *         37.4       37.4
    6.9       6.9        37.4
    ENTER     6.9        6.9, 37.4
    1.5       1.5        6.9, 37.4
    /         4.6        4.6, 37.4
    +         42.0       42.0

Implementing the RPN calculator in C++ requires making some changes in the user-interface design. In a real calculator, the digits and operations appear on a keypad. In this implementation, it is easier to imagine that the user enters lines on the console, where those lines take one of the following forms:

• A floating-point number
• An arithmetic operator chosen from the set +, -, *, and /
• The letter Q, which causes the program to quit
• The letter H, which prints a help message
• The letter C, which clears any values left on the stack

A sample run of the calculator program might therefore look like this:

    RPN Calculator Simulation (type H for help)
    > H
    Enter expressions in Reverse Polish Notation,
    in which operators follow the operands to which
    they apply.  Each line consists of a number, an
    operator, or one of the following commands:
      Q -- Quit the program
      H -- Display this help message
      C -- Clear the calculator stack
    > 8.5
    > 4.4
    > *
    37.4
    > 6.9
    > 1.5
    > /
    4.6
    > +
    42
    > Q

Because the user enters each number on a separate line terminated with the RETURN key, there is no need for any counterpart to the calculator’s ENTER button, which really serves only to indicate that a number is complete. The calculator program can simply push the numbers on the stack as the user enters them. When the calculator reads an operator, it pops the top two elements from the stack, applies the operator, displays the result, and then pushes the result back on the stack.

The complete implementation of the calculator application appears in Figure 5-4.
Figure 5-4. Program to implement a simple RPN calculator

    /*
     * File: RPNCalculator.cpp
     * -----------------------
     * This program simulates an electronic calculator that uses
     * reverse Polish notation, in which the operators come after
     * the operands to which they apply.  Information for users
     * of this application appears in the helpCommand function.
     */

    #include <cctype>
    #include <iostream>
    #include <string>
    #include "error.h"
    #include "stack.h"
    #include "strlib.h"
    using namespace std;

    /* Function prototypes */

    void applyOperator(char op, Stack<double> & operandStack);
    void helpCommand();

    /* Main program */

    int main() {
       cout << "RPN Calculator Simulation (type H for help)" << endl;
       Stack<double> operandStack;
       while (true) {
          cout << "> ";
          string line;
          getline(cin, line);
          if (line.length() == 0) line = "Q";
          char ch = toupper(line[0]);
          if (ch == 'Q') {
             break;
          } else if (ch == 'C') {
             operandStack.clear();
          } else if (ch == 'H') {
             helpCommand();
          } else if (isdigit(ch)) {
             operandStack.push(stringToReal(line));
          } else {
             applyOperator(ch, operandStack);
          }
       }
       return 0;
    }

    /*
     * Function: applyOperator
     * Usage: applyOperator(op, operandStack);
     * ---------------------------------------
     * This function applies the operator to the top two elements on
     * the operand stack.  Because the elements on the stack are
     * popped in reverse order, the right operand is popped before
     * the left operand.
     */

    void applyOperator(char op, Stack<double> & operandStack) {
       double result;
       double rhs = operandStack.pop();
       double lhs = operandStack.pop();
       switch (op) {
          case '+': result = lhs + rhs; break;
          case '-': result = lhs - rhs; break;
          case '*': result = lhs * rhs; break;
          case '/': result = lhs / rhs; break;
          default: error("Illegal operator");
       }
       cout << result << endl;
       operandStack.push(result);
    }

    /*
     * Function: helpCommand
     * Usage: helpCommand();
     * ---------------------
     * This function generates a help message for the user.
     */

    void helpCommand() {
       cout << "Enter expressions in Reverse Polish Notation," << endl;
       cout << "in which operators follow the operands to which" << endl;
       cout << "they apply.  Each line consists of a number, an" << endl;
       cout << "operator, or one of the following commands:" << endl;
       cout << "  Q -- Quit the program" << endl;
       cout << "  H -- Display this help message" << endl;
       cout << "  C -- Clear the calculator stack" << endl;
    }

5.3 The Queue class

As you learned in section 5.2, the defining feature of a stack is that the last item pushed is always the first item popped. As noted in the introduction to that section, this behavior is often referred to in computer science as LIFO, which is an acronym for the phrase “last in, first out.” The LIFO discipline is useful in programming contexts because it reflects the operation of function calls; the most recently called function is the first to return. In real-world situations, however, its usefulness is more limited. In human society, our collective notion of fairness assigns some priority to being first, as expressed in the maxim “first come, first served.” In programming, the usual phrasing of this ordering strategy is “first in, first out,” which is traditionally abbreviated as FIFO.

A data structure that stores items using a FIFO discipline is called a queue. The fundamental operations on a queue, which are analogous to the push and pop operations for stacks, are called enqueue and dequeue.
The enqueue operation adds a new element to the end of the queue, which is traditionally called its tail. The dequeue operation removes the element at the beginning of the queue, which is called its head.

The conceptual difference between these structures can be illustrated most easily with a diagram. In a stack, the client must add and remove elements from the same end of the internal data structure, as follows:

    [Diagram: a row of cells labeled "Stack", with the push and pop arrows both at the end marked "top of stack".]

In a queue, the client adds elements at one end and removes them from the other, like this:

    [Diagram: a row of cells labeled "Queue", with the dequeue arrow at the end marked "head of queue" and the enqueue arrow at the end marked "tail of queue".]

As you might expect from the fact that the models are so similar, the structure of the Queue class looks very much like its Stack counterpart. The list of entries in Table 5-4 bears out that supposition. The only differences are in the terminology, which reflects the difference in the ordering of the elements.

TABLE 5-4 Entries exported by the queue.h interface

Constructor
    Queue()          Creates an empty queue capable of holding values of the specified type.

Methods
    size()           Returns the number of elements currently in the queue.
    isEmpty()        Returns true if the queue is empty.
    enqueue(value)   Adds value to the tail of the queue.
    dequeue()        Removes the element at the head of the queue and returns it to the caller. Calling dequeue on an empty queue generates an error.
    peek()           Returns the value at the head of the queue without removing it. Calling peek on an empty queue generates an error.
    clear()          Removes all the elements from a queue.

The queue data structure has many applications in programming. Not surprisingly, queues turn up in many situations in which it is important to maintain a first-in/first-out discipline in order to ensure that service requests are treated fairly.
For example, if you are working in an environment in which a single printer is shared among several computers, the printing software is usually designed so that all print requests are entered in a queue. Thus, if several users decide to enter print requests, the queue structure ensures that each user's request is processed in the order received.

Queues are also common in programs that simulate the behavior of waiting lines. For example, if you wanted to decide how many cashiers you needed in a supermarket, it might be worth writing a program that could simulate the behavior of customers in the store. Such a program would almost certainly involve queues, because a checkout line operates in a first-in/first-out way. Customers who have completed their purchases arrive in the checkout line and wait for their turn to pay. Each customer eventually reaches the front of the line, at which point the cashier totals up the purchases and collects the money. Because simulations of this sort represent an important class of application programs, it is worth spending a little time understanding how such simulations work.

Simulations and models

Beyond the world of programming, there is an endless variety of real-world events and processes that, although they are undeniably important, are nonetheless too complicated to understand completely. For example, it would be very useful to know how various pollutants affect the ozone layer and how the resulting changes in the ozone layer affect the global climate. Similarly, if economists and political leaders had a more complete understanding of exactly how the national economy works, it would be possible to evaluate whether a cut in the capital-gains tax would spur investment or whether it would exacerbate the existing disparities of wealth and income.

When faced with such large-scale problems, it is usually necessary to come up with an idealized model, which is a simplified representation of some real-world process.
Most problems are far too complex to allow for a complete understanding. There are just too many details. The reason to build a model is that, despite the complexity of a particular problem, it is often possible to make certain assumptions that allow you to simplify a complicated process without affecting its fundamental character. If you can come up with a reasonable model for a process, you can often translate the dynamics of the model into a program that captures the behavior of that model. Such a program is called a simulation.

It is important to remember that creating a simulation is usually a two-step process. The first step consists of designing a conceptual model for the real-world behavior you are trying to simulate. The second consists of writing a program that implements the conceptual model. Because errors can occur in both steps of the process, maintaining a certain skepticism about simulations and their applicability to the real world is probably wise. In a society conditioned to believe the "answers" delivered by computers, it is critical to recognize that the simulations can never be better than the models on which they are based.

The waiting-line model

Suppose that you want to design a simulation that models the behavior of a supermarket waiting line. By simulating the waiting line, you can determine some useful properties of waiting lines that might help a company make such decisions as how many cashiers are needed, how much space needs to be reserved for the line itself, and so forth.

The first step in the process of writing a checkout-line simulation is to develop a model for the waiting line, detailing the simplifying assumptions. For example, to make the initial implementation of the simulation as simple as possible, you might begin by assuming that there is one cashier who serves customers from a single queue. You might then assume that customers arrive with a random probability and enter the queue at the end of the line.
Whenever the cashier is free and someone is waiting in line, the cashier begins to serve that customer. After an appropriate service period, which you must also model in some way, the cashier completes the transaction with the current customer and is free to serve the next customer in the queue.

Discrete time

Another assumption often required in a model is some limitation on the level of accuracy. Consider, for example, the time that a customer spends being served by the cashier. One customer might spend two minutes; another might spend six. It is important, however, to consider whether measuring time in minutes allows the simulation to be sufficiently precise. If you had a sufficiently accurate stopwatch, you might discover that a customer actually spent 3.14159265 minutes. The question you need to resolve is how accurate you need to be.

For most models, and particularly for those intended for simulation, it is useful to introduce the simplifying assumption that all events within the model happen in discrete integral time units. Using discrete time assumes that you can find a time unit that, for the purpose of the model, you can treat as indivisible. In general, the time units used in a simulation must be small enough that the probability of more than one event occurring during a single time unit is negligible. In the checkout-line simulation, for example, minutes may not be accurate enough; two customers could easily arrive in the same minute. On the other hand, you could probably get away with using seconds as the time unit and discount the possibility that two customers arrive in precisely the same second.

Although the checkout-line example assumes that simulation time is measured in seconds, in general, there is no reason you have to measure time in conventional units. When you write a simulation, you can define the unit of time in any way that fits the structure of the model.
For example, you could define a time unit to be five seconds and then run the simulation as a series of five-second intervals.

Events in simulated time

The real advantage of using discrete time units is not that it makes it possible to work with variables of type int instead of being forced to use type double. The most important property of discrete time is that it allows you to structure the simulation as a loop in which each time unit represents a single cycle. When you approach the problem in this way, a simulation program has the following form:

    for (int time = 0; time < SIMULATION_TIME; time++) {
        Execute one cycle of the simulation.
    }

Within the body of the loop, the program performs the operations necessary to advance through one unit of simulated time.

Think for a moment about what events might occur during each time unit of the checkout-line simulation. One possibility is that a new customer might arrive. Another is that the cashier might finish with the current customer and go on to serve the next person in line. These events bring up some interesting issues. To complete the model, you need to say something about how often customers arrive and how much time they spend at the cash register. You could (and probably should) gather approximate data by watching a real checkout line in a store. Even if you collect that information, however, you will need to simplify it to a form that (1) captures enough of the real-world behavior to be useful and (2) is easy to understand in terms of the model. For example, your surveys might show that customers arrive at the line on average once every 20 seconds. This average arrival rate is certainly useful input to the model. On the other hand, you would not have much confidence in a simulation in which customers arrived exactly once every 20 seconds. Such an implementation would violate the real-world condition that customer arrivals have some random variability and that they sometimes bunch together.
For this reason, the arrival process is usually modeled by specifying the probability that an arrival takes place in any discrete time unit instead of the average time between arrivals. For example, if your studies indicated that a customer arrived once every 20 seconds on average, the probability of a customer arriving in any particular second would be 1/20 or 0.05. If you assume that arrivals occur randomly with an equal probability in each unit of time, the arrival process forms a pattern that mathematicians call a Poisson distribution.

You might also choose to make simplifying assumptions about how long it takes to serve a particular customer. For example, the program is easier to write if you assume that the service time required for each customer is uniformly distributed within a certain range. If you do, you can use the randomInteger function from the random.h interface to pick the service time.

Implementing the simulation

Even though it is longer than the other programs in this chapter, the code for the simulation program is reasonably easy to write and appears in Figure 5-5. The core of the simulation is a loop that runs for the number of seconds indicated by the constant SIMULATION_TIME. In each second, the simulation performs the following operations:

1. Determine whether a new customer has arrived and, if so, add that person to the queue.
2. If the cashier is busy, note that the cashier has spent another second with the current customer. Eventually, the required service time will be complete, which will free the cashier.
3. If the cashier is free, serve the next customer in the waiting line.

FIGURE 5-5 Program to simulate a checkout line

    /*
     * File: CheckoutLine.cpp
     * ----------------------
     * This program simulates a checkout line, such as one you
     * might encounter in a grocery store.  Customers arrive at
     * the checkout stand and get in line.  Those customers wait
     * in the line until the cashier is free, at which point
     * they are served and occupy the cashier for some period
     * of time.  After the service time is complete, the cashier
     * is free to serve the next customer in the line.
     *
     * In each unit of time, up to the constant SIMULATION_TIME,
     * the following operations are performed:
     *
     * 1. Determine whether a new customer has arrived.
     *    New customers arrive randomly, with a probability
     *    determined by the constant ARRIVAL_PROBABILITY.
     *
     * 2. If the cashier is busy, note that the cashier has
     *    spent another unit of time with that customer.
     *    Eventually, the customer's time request is satisfied,
     *    which frees the cashier.
     *
     * 3. If the cashier is free, serve the next customer in line.
     *    The service time is taken to be a random period between
     *    MIN_SERVICE_TIME and MAX_SERVICE_TIME.
     *
     * At the end of the simulation, the program displays the
     * simulation constants and the following computed results:
     *
     * o  The number of customers served
     * o  The average time spent in line
     * o  The average number of people in line
     */

    #include <iostream>
    #include <iomanip>
    #include "queue.h"
    #include "random.h"
    using namespace std;

    /* Constants */

    const double ARRIVAL_PROBABILITY = 0.05;
    const int MIN_SERVICE_TIME = 5;
    const int MAX_SERVICE_TIME = 15;
    const int SIMULATION_TIME = 2000;

    /* Function prototypes */

    void runSimulation(int & nServed, int & totalWait, int & totalLength);
    void printReport(int nServed, int totalWait, int totalLength);

    /* Main program */

    int main() {
        int nServed;
        int totalWait;
        int totalLength;
        runSimulation(nServed, totalWait, totalLength);
        printReport(nServed, totalWait, totalLength);
        return 0;
    }

    /*
     * Function: runSimulation
     * Usage: runSimulation(nServed, totalWait, totalLength);
     * ------------------------------------------------------
     * This function runs the actual simulation.  The number of
     * customers served, the sum of the customer waiting times,
     * and the sum of the queue length in each time step are
     * returned through the reference parameters.
     */

    void runSimulation(int & nServed, int & totalWait, int & totalLength) {
        Queue<int> queue;          // holds each customer's arrival time
        int timeRemaining = 0;     // time left with the current customer
        nServed = 0;
        totalWait = 0;
        totalLength = 0;
        for (int t = 0; t < SIMULATION_TIME; t++) {
            if (randomChance(ARRIVAL_PROBABILITY)) {
                queue.enqueue(t);
            }
            if (timeRemaining > 0) {
                timeRemaining--;
            } else if (!queue.isEmpty()) {
                totalWait += t - queue.dequeue();
                nServed++;
                timeRemaining = randomInteger(MIN_SERVICE_TIME,
                                              MAX_SERVICE_TIME);
            }
            totalLength += queue.size();
        }
    }

    /*
     * Function: printReport
     * Usage: printReport(nServed, totalWait, totalLength);
     * ----------------------------------------------------
     * This function reports the results of the simulation.
     */

    void printReport(int nServed, int totalWait, int totalLength) {
        cout << "Simulation results given the following constants:"
             << endl;
        cout << fixed << setprecision(2);
        cout << "  SIMULATION_TIME:     " << setw(4)
             << SIMULATION_TIME << endl;
        cout << "  ARRIVAL_PROBABILITY: " << setw(7)
             << ARRIVAL_PROBABILITY << endl;
        cout << "  MIN_SERVICE_TIME:    " << setw(4)
             << MIN_SERVICE_TIME << endl;
        cout << "  MAX_SERVICE_TIME:    " << setw(4)
             << MAX_SERVICE_TIME << endl;
        cout << endl;
        cout << "Customers served:     " << setw(4) << nServed << endl;
        cout << "Average waiting time: " << setw(7)
             << double(totalWait) / nServed << endl;
        cout << "Average queue length: " << setw(7)
             << double(totalLength) / SIMULATION_TIME << endl;
    }

The waiting line itself is represented, naturally enough, as a queue. The value stored in the queue is the time at which that customer arrived in the queue, which makes it possible to determine how many seconds that customer spent in line before reaching the head of the queue.
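For readers who want to experiment without the Stanford libraries, the same loop can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the program in Figure 5-5: std::queue stands in for Queue, a seeded std::mt19937 with std::bernoulli_distribution and std::uniform_int_distribution stands in for randomChance and randomInteger, and the names SimulationResult and simulateCheckout are hypothetical.

```cpp
#include <cassert>
#include <queue>
#include <random>

// Hypothetical result record; the text returns these values through
// reference parameters instead.
struct SimulationResult {
    int nServed = 0;        // customers who reached the cashier
    long totalWait = 0;     // sum of time units spent waiting in line
    long totalLength = 0;   // sum of queue lengths over all time steps
};

// Sketch of the checkout-line loop using only the C++ standard library.
SimulationResult simulateCheckout(int simulationTime, double arrivalProbability,
                                  int minService, int maxService, unsigned seed) {
    assert(simulationTime >= 0 && minService <= maxService);
    std::mt19937 rng(seed);
    std::bernoulli_distribution arrives(arrivalProbability);
    std::uniform_int_distribution<int> serviceTime(minService, maxService);

    std::queue<int> line;     // each entry is a customer's arrival time
    int timeRemaining = 0;    // time left with the current customer
    SimulationResult result;
    for (int t = 0; t < simulationTime; t++) {
        if (arrives(rng)) line.push(t);
        if (timeRemaining > 0) {
            timeRemaining--;
        } else if (!line.empty()) {
            result.totalWait += t - line.front();
            line.pop();
            result.nServed++;
            timeRemaining = serviceTime(rng);
        }
        result.totalLength += static_cast<long>(line.size());
    }
    return result;
}
```

Because the random-number source differs from the one in random.h, the figures produced by this sketch will not match the sample runs in the text, although a fixed seed makes each run repeatable.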
The simulation is controlled by the following constants:

• SIMULATION_TIME: This constant specifies the duration of the simulation.
• ARRIVAL_PROBABILITY: This constant indicates the probability that a new customer will arrive at the checkout line during a single unit of time. In keeping with standard statistical convention, the probability is expressed as a real number between 0 and 1.
• MIN_SERVICE_TIME, MAX_SERVICE_TIME: These constants define the legal range of customer service time. For any particular customer, the amount of time spent at the cashier is determined by picking a random integer in this range.

When the simulation is complete, the program reports the simulation constants along with the following results:

• The number of customers served
• The average amount of time customers spent in the waiting line
• The average length of the waiting line

For example, the following sample run shows the results of the simulation for the indicated constant values:

    CheckoutLine
    Simulation results given the following constants:
      SIMULATION_TIME:     2000
      ARRIVAL_PROBABILITY:    0.05
      MIN_SERVICE_TIME:       5
      MAX_SERVICE_TIME:      15

    Customers served:      117
    Average waiting time:    8.73
    Average queue length:    0.51

The behavior of the simulation depends significantly on the values of the constants used to control it. Suppose, for example, that the probability of a customer arriving increases from 0.05 to 0.10. Running the simulation with these parameters gives the following results:

    CheckoutLine
    Simulation results given the following constants:
      SIMULATION_TIME:     2000
      ARRIVAL_PROBABILITY:    0.10
      MIN_SERVICE_TIME:       5
      MAX_SERVICE_TIME:      15

    Customers served:      176
    Average waiting time:   68.95
    Average queue length:    6.49

As you can see, doubling the probability of arrival causes the average waiting time to grow from under nine seconds to more than a minute, which is obviously a dramatic increase.
The reason for the poor performance is that the arrival rate in the second run of the simulation means that new customers arrive at the same rate at which they are served: an arrival probability of 0.10 corresponds to one customer every 10 seconds on average, and the average service time, which is the midpoint of the range from 5 to 15 seconds, is also about 10 seconds. When this arrival level is reached, the length of the queue and the average waiting time both begin to grow very quickly. Simulations of this sort make it possible to experiment with different parameter values. Those experiments, in turn, make it possible to identify potential sources of trouble in the corresponding real-world systems.

5.4 The Map class

This section introduces another generic collection called a map, which is conceptually similar to a dictionary. A dictionary allows you to look up a word to find its meaning. A map is a generalization of this idea that provides an association between an identifying tag called a key and an associated value, which may be a much larger and more complicated structure. In the dictionary example, the key is the word you're looking up, and the value is its definition.

Maps have many applications in programming. For example, an interpreter for a programming language needs to be able to assign values to variables, which can then be referenced by name. A map makes it easy to maintain the association between the name of a variable and its corresponding value. When they are used in this context, maps are often called symbol tables, which is just another name for the same concept.

In addition to the Map class described in this section, the Stanford libraries offer a HashMap class that has almost the same structure and behavior. The HashMap class is more efficient but somewhat less convenient to use for certain applications. For now, it's best to focus on the Map class until you understand how maps work in general. You'll have a chance to learn about the differences between these two implementations of the map concept in Chapter 15.
The structure of the Map class

As with the collection classes introduced earlier in this chapter, Map is implemented as a template class that must be parameterized with both the key type and the value type. For example, if you want to simulate a dictionary in which individual words are associated with their definitions, you can start by declaring a dictionary variable as follows:

    Map<string,string> dictionary;

Similarly, if you are implementing a programming language, you could use a Map to store the values of floating-point variables by associating variable names and values, as follows:

    Map<string,double> symbolTable;

These definitions create empty maps that contain no keys and values. In either case, you would subsequently need to add key/value pairs to the map. In the case of the dictionary, you could read the contents from a data file. For the symbol table, you would add new associations whenever an assignment statement appeared.

The most common methods used with the Map class appear in Table 5-5. Of these, the ones that implement the fundamental behavior of the map concept are put and get. The put method creates an association between a key and a value. Its operation is analogous to assigning a value to a variable in C++: if there is a value already associated with the key, the old value is replaced by the new one. The get method retrieves the value most recently associated with a particular key and therefore corresponds to the act of using a variable name to retrieve its value. If no value appears in the map for a particular key, calling get with that key generates an error condition. You can check for that condition by calling the containsKey method, which returns true or false depending on whether the key exists in the map.

TABLE 5-5 Entries exported by the map.h interface

Constructors
    Map()              Creates an empty map associating keys and values.

Methods
    size()             Returns the number of key/value pairs contained in the map.
    isEmpty()          Returns true if the map is empty.
    put(key, value)    Associates the specified key and value in the map. If key has no previous definition, a new entry is added; if a previous association exists, the old value is discarded and replaced by the new one.
    get(key)           Returns the value currently associated with key in the map. If there is no such value, get generates an error.
    remove(key)        Removes key from the map along with any associated value. If key does not exist, this call leaves the map unchanged.
    containsKey(key)   Checks to see whether key is associated with a value. If so, this method returns true; if not, it returns false.
    clear()            Removes all the key/value pairs from the map.

Operators
    map[key]           The Map class overloads the square bracket operator so that a map acts as an associative array indexed by the key value. If the key does not exist in the map, the square bracket operator creates a new entry and sets its value to the default for that type.

A few simple diagrams may help to illustrate the operation of the Map class in more detail. Suppose that you have declared the symbolTable variable to be a Map<string,double> as you saw earlier in the section. That declaration creates an empty map with no associations, as follows:

    symbolTable
      (empty)

Once you have the map, you can use put to establish new associations. For example, if you call

    symbolTable.put("pi", 3.14159);

the conceptual effect is to add an association between the key "pi" and the value 3.14159, as follows:

    symbolTable
      pi = 3.14159

Calling

    symbolTable.put("e", 2.71828);

would add a new association between the key "e" and the value 2.71828, like this:

    symbolTable
      pi = 3.14159
      e = 2.71828

You can then use get to retrieve these values. Calling symbolTable.get("pi") would return the value 3.14159, and calling symbolTable.get("e") would return 2.71828.

Although it hardly makes sense in the case of mathematical constants, you could change the values in the map by making additional calls to put.
You could, for example, reset the value associated with "pi" (as an 1897 bill before the Indiana State General Assembly sought to do) by calling

    symbolTable.put("pi", 3.0);

which would leave the map in the following state:

    symbolTable
      pi = 3.0
      e = 2.71828

At this point, calling symbolTable.containsKey("pi") would return true; by contrast, calling symbolTable.containsKey("x") would return false.

Using maps in an application

If you fly at all frequently, you quickly learn that every airport in the world has a three-letter code assigned by the International Air Transport Association (IATA). For example, John F. Kennedy airport in New York City is assigned the three-letter code JFK. Other codes, however, are considerably harder to recognize. Most web-based travel systems offer some means of looking up these codes as a service to their customers.

Suppose that you have been asked to write a simple C++ program that reads a three-letter airport code from the user and responds with the location of that airport. The data you need is in the form of a text file called AirportCodes.txt, which contains a list of the several thousand airport codes that IATA has assigned. Each line of the file consists of a three-letter code, an equal sign, and the location of the airport. If the file were sorted in descending order by passenger traffic in 2009, as compiled by Airports Council International, the file would begin with the lines in Figure 5-6.

The existence of the Map class makes this application extremely easy to write. The entire application fits on a single page, as shown in Figure 5-7.
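The line format just described (a three-letter code, an equal sign, and the location) can be parsed in a few lines. The sketch below is an illustration under stated assumptions rather than the code in Figure 5-7: it uses std::map from the standard library instead of the Stanford Map class, it reads from any std::istream so that it can be tried without the data file, it does not convert codes to uppercase, and the function names loadAirportCodes and findLocation are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads lines of the form "ATL=Atlanta, GA, USA"
// and returns a std::map from airport code to location.
std::map<std::string, std::string> loadAirportCodes(std::istream & input) {
    std::map<std::string, std::string> codes;
    std::string line;
    while (std::getline(input, line)) {
        // A valid line has a three-letter code followed by '='.
        if (line.length() < 4 || line[3] != '=') {
            throw std::runtime_error("Illegal data line: " + line);
        }
        codes[line.substr(0, 3)] = line.substr(4);
    }
    return codes;
}

// Hypothetical lookup: returns the location for a three-letter code,
// or an empty string if the code is not in the map.
std::string findLocation(const std::map<std::string, std::string> & codes,
                         const std::string & code) {
    assert(code.length() == 3);    // precondition: codes are three letters
    auto it = codes.find(code);
    return it == codes.end() ? "" : it->second;
}
```

Passing a std::istringstream that holds a few lines in the same format is enough to try the functions; an std::ifstream opened on the real data file would work the same way.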
FIGURE 5-6 Beginning of data file containing airport codes and locations

    AirportCodes.txt
    ATL=Atlanta, GA, USA
    ORD=Chicago, IL, USA
    LHR=London, England, United Kingdom
    HND=Tokyo, Japan
    LAX=Los Angeles, CA, USA
    CDG=Paris, France
    DFW=Dallas/Ft Worth, TX, USA
    FRA=Frankfurt, Germany
    PEK=Beijing, China
    MAD=Madrid, Spain
    DEN=Denver, CO, USA
    AMS=Amsterdam, Netherlands
    JFK=New York, NY, USA
    HKG=Hong Kong, Hong Kong
    LAS=Las Vegas, NV, USA
    IAH=Houston, TX, USA
    PHX=Phoenix, AZ, USA
    BKK=Bangkok, Thailand
    SIN=Singapore, Singapore
    MCO=Orlando, FL, USA
    . . .

FIGURE 5-7 Program to look up three-letter airport codes

    /*
     * File: AirportCodes.cpp
     * ----------------------
     * This program looks up a three-letter airport code in a Map object.
     */

    #include <iostream>
    #include <fstream>
    #include <string>
    #include "error.h"
    #include "map.h"
    #include "strlib.h"
    using namespace std;

    /* Function prototypes */

    void readCodeFile(string filename, Map<string,string> & map);

    /* Main program */

    int main() {
        Map<string,string> airportCodes;
        readCodeFile("AirportCodes.txt", airportCodes);
        while (true) {
            string line;
            cout << "Airport code: ";
            getline(cin, line);
            if (line == "") break;          // an empty line ends the program
            string code = toUpperCase(line);
            if (airportCodes.containsKey(code)) {
                cout << code << " is in " << airportCodes.get(code) << endl;
            } else {
                cout << "There is no such airport code" << endl;
            }
        }
        return 0;
    }

    /* Reads lines of the form CODE=Location from the file into the map */

    void readCodeFile(string filename, Map<string,string> & map) {
        ifstream infile;
        infile.open(filename.c_str());
        if (infile.fail()) error("Can't read the data file");
        while (true) {
            string line;
            getline(infile, line);
            if (infile.fail()) break;
            if (line.length() < 4 || line[3] != '=') {
                error("Illegal data line: " + line);
            }
            string code = toUpperCase(line.substr(0, 3));
            map.put(code, line.substr(4));
        }
        infile.close();
    }

The main program in the AirportCodes application reads in three-letter codes, looks up the corresponding location, and writes that back to the console, as shown in the following sample run:

    AirportCodes
    Airport code: LHR
    LHR is in London, England, United Kingdom
    Airport code: SFO
    SFO is in San Francisco, CA, USA
    Airport code: XXX
    There is no such airport code
    Airport code:

Maps as associative arrays

The Map class overloads the square bracket operators used for array selection so that the statement

    map[key] = value;

acts as a shorthand for

    map.put(key, value);

Similarly, the expression map[key] returns the value from map associated with key in exactly the same way that map.get(key) does, provided that the key exists. While these shorthand forms of the put and get methods are undoubtedly convenient, using array notation for maps is initially surprising, given that maps and arrays seem to be entirely different in their structure. If you think about maps and arrays in the right way, however, they turn out to be more alike than you might at first suspect.

The insight necessary to unify these two seemingly different structures is that you can think of arrays as structures that map index positions to elements. Suppose, for example, that you have an array (or equivalently a vector) containing a set of scores such as those you might assign for a gymnastics match:

    scores
    index:   0     1     2     3     4
    value:   9.2   9.9   9.7   8.9   9.5

This array maps the key 0 into the value 9.2, the key 1 into 9.9, the key 2 into 9.7, and so forth. Thus, you can think of an array as simply a map with integer keys. Conversely, you can think of a map as an array that uses the key type as its indices, which is precisely what the overloaded selection syntax for the Map class suggests.

Using array syntax to perform map operations is becoming increasingly common in programming languages even beyond the C++ domain. Many popular scripting languages implement all arrays internally as maps, making it possible to use index values that are not necessarily integers. Arrays implemented using maps as their underlying representation are called associative arrays.
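The bracket notation described above also exists in the standard library's std::map, which treats a missing key the same way Table 5-5 describes: the lookup inserts an entry with the default value for the value type. The following brief sketch assumes std::map rather than the Stanford Map class, and the function names countLetters and countOf are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical example: tallies how often each character occurs in text.
// counts[ch]++ works even the first time ch is seen, because operator[]
// creates the entry with the default value 0 before incrementing it.
std::map<char, int> countLetters(const std::string & text) {
    std::map<char, int> counts;
    for (char ch : text) {
        counts[ch]++;
    }
    return counts;
}

// Read-only lookup that does not insert: returns 0 for an absent key.
int countOf(const std::map<char, int> & counts, char ch) {
    auto it = counts.find(ch);
    return it == counts.end() ? 0 : it->second;
}
```

The design point worth noticing is that selection with brackets can change the size of the map, whereas a lookup through find (or, in the Stanford class, a containsKey test followed by get) leaves the map untouched.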
5.5 The Set class One of the most useful collection classes is the Set class, which exports the entries shown in Table 5-6. This class is used to model the mathematical abstraction of a set, which is a collection in which the elements are unordered and in which each value appears only once. Sets turn out to be extremely useful in many algorithmic T A B L E 5 - 6 Entries exported by the set.h interface Constructor Set() Creates an empty set containing values of the specified type. Methods size() Returns the number of elements in the set. isEmpty() Returns true if the set is empty. add(value) Adds the value to the set. If the value is already in the set, no error is generated, and the set remains unchanged. remove(value) Removes the value to the set. If the value is not present, no error is generated, and the set remains unchanged. contains(value) Returns true if the value is in the set. clear() Removes all elements from the set. isSubsetOf(set) Returns true if this set is a subset of the set passed as an argument. first() Returns the first element of the set in the ordering specified for its value type. Operators s1 + s2 Returns the union of s1 and s2, which consists of the elements in either or both of the original sets. s1 * s2 Returns the intersection of s1 and s2, which consists of the elements common to both of the original sets. s1 - s2 Returns the set difference of s1 and s2, which consists of the all elements in s1 that are not present in s2. s1 += s2 s1 -= s2 s1 *= s2 The +, -, and * operators can be combined with assignment just as they can with numeric values. For += and -=, the right hand value can be a set, a single value, or a list of values separated by commas. 5.5 The Set class applications and are therefore worth a chapter of their own. 
Even before you have a chance to read the more detailed account in Chapter 18, it is worth presenting a few examples of sets so you can get a better sense of how they work and how they might be useful in applications.

Implementing the <cctype> library

In Chapter 3, you learned about the <cctype> library, which exports several predicate functions that test the type of a character. Calling isdigit(ch), for example, tests whether the character ch is one of the digit characters. In that case, you can implement the function by testing the character code for ch against a simple range of values, as follows:

    bool isdigit(char ch) {
        return ch >= '0' && ch <= '9';
    }

The situation gets a little more complicated with some of the other functions. Implementing ispunct in this same style would be more difficult because the punctuation characters are spread over several intervals of the ASCII range. Things would be a lot easier if you could simply define a set of all the punctuation marks, in which case all you need to do to implement ispunct(ch) is check whether the character ch is in that set.

A set-based implementation of the predicate functions from <cctype> appears in Figure 5-8. The code creates a Set<char> for each of the character types and then defines the predicate functions so that they simply invoke contains on the appropriate set. For example, to implement isdigit, the cctype implementation defines a set containing the digit characters like this:

    const Set<char> DIGIT_SET = setFromString("0123456789");

The setFromString function, which appears at the bottom of Figure 5-8, is a simple helper function that creates a set by adding each of the characters in the argument string. This function makes it very easy to define sets like the set of punctuation characters simply by listing the characters that fit that description.

One of the advantages of working with sets is that doing so makes it easier to think in terms of abstract, high-level operations.
While most of the sets in cctype.cpp use setFromString to create the set from the actual characters, a few use the + operator, which is overloaded to return the union of two sets. For example, once you have defined the sets LOWER_SET and UPPER_SET so that they contain the lowercase and uppercase letters, you can define ALPHA_SET by writing

    const Set<char> ALPHA_SET = LOWER_SET + UPPER_SET;

FIGURE 5-8 Set-based implementation of <cctype>

    /*
     * File: cctype.cpp
     * ----------------
     * This program simulates the <cctype> interface using sets of characters.
     */

    #include <string>
    #include "set.h"
    #include "cctype.h"
    using namespace std;

    /* Function prototypes */

    Set<char> setFromString(string str);

    /*
     * Constant sets
     * -------------
     * These sets are initialized to contain the characters in the
     * corresponding character class.
     */

    const Set<char> DIGIT_SET = setFromString("0123456789");
    const Set<char> LOWER_SET = setFromString("abcdefghijklmnopqrstuvwxyz");
    const Set<char> UPPER_SET = setFromString("ABCDEFGHIJKLMNOPQRSTUVWXYZ");
    const Set<char> PUNCT_SET = setFromString("!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~");
    const Set<char> SPACE_SET = setFromString(" \t\v\f\n\r");
    const Set<char> XDIGIT_SET = setFromString("0123456789ABCDEFabcdef");
    const Set<char> ALPHA_SET = LOWER_SET + UPPER_SET;
    const Set<char> ALNUM_SET = ALPHA_SET + DIGIT_SET;
    const Set<char> PRINT_SET = ALNUM_SET + PUNCT_SET + SPACE_SET;

    /* Exported functions */

    bool isalnum(char ch) { return ALNUM_SET.contains(ch); }
    bool isalpha(char ch) { return ALPHA_SET.contains(ch); }
    bool isdigit(char ch) { return DIGIT_SET.contains(ch); }
    bool islower(char ch) { return LOWER_SET.contains(ch); }
    bool isprint(char ch) { return PRINT_SET.contains(ch); }
    bool ispunct(char ch) { return PUNCT_SET.contains(ch); }
    bool isspace(char ch) { return SPACE_SET.contains(ch); }
    bool isupper(char ch) { return UPPER_SET.contains(ch); }
    bool isxdigit(char ch) { return XDIGIT_SET.contains(ch); }

    /* Helper function to create a set from a string of characters */

    Set<char> setFromString(string str) {
        Set<char> set;
        for (int i = 0; i < str.length(); i++) {
            set.add(str[i]);
        }
        return set;
    }

Creating a word list

In the discussion of the Map class earlier in the chapter, one of the examples used to explain the underlying concept was that of a dictionary in which the keys were individual words and the corresponding values were the definitions. In some applications, such as a spelling checker or a program that plays Scrabble, you don’t need to know the definition of a word. All you need to know is whether a particular combination of letters is a legal word. For such applications, the Set class is an ideal tool. Instead of a map containing both words and definitions, all you need is a Set<string> whose elements are the legal words. A word is legal if it is contained in the set, and illegal if it is not.

A set of words with no associated definitions is called a lexicon. If you have a text file named EnglishWords.txt containing all the words in English, one word per line, you could create an English lexicon using the following code:

    Set<string> lexicon;
    ifstream infile;
    infile.open("EnglishWords.txt");
    if (infile.fail()) error("Can't open EnglishWords.txt");
    while (true) {
        string word;
        getline(infile, word);
        if (infile.fail()) break;
        lexicon.add(word);
    }
    infile.close();

The Lexicon class in the Stanford libraries

Although the Set class works reasonably well as the underlying representation for a lexicon, it is not particularly efficient. Because having an efficient representation for lexicons opens up many exciting possibilities for programming projects, the Stanford libraries include a Lexicon class, which is essentially a specialized version of Set optimized for storing sets of words. The entries exported by the Lexicon class appear in Table 5-7. As you can see, these entries are largely the same as those for Set.
The library distribution also includes a data file called EnglishWords.dat that contains a compiled representation of a lexicon containing a reasonably complete list of English words. Programs that use the English lexicon conventionally initialize it using the declaration

    Lexicon english("EnglishWords.dat");

TABLE 5-7 Entries exported by the lexicon.h interface

Constructors
    Lexicon()                 Creates an empty lexicon.
    Lexicon(file)             Initializes a lexicon by reading data from a file.

Methods
    size()                    Returns the number of words in the lexicon.
    isEmpty()                 Returns true if the lexicon is empty.
    add(word)                 Adds a new word to the lexicon, if it is not already present. All words in a lexicon are stored in lower case.
    addWordsFromFile(file)    Adds all the words in the named file to the lexicon. The file must either be a text file, in which case the words are listed on separate lines, or a compiled data file specifically formatted for the lexicon.
    contains(word)            Returns true if word is in the lexicon.
    containsPrefix(prefix)    Returns true if any of the words in the lexicon start with the specified prefix.
    clear()                   Removes all the words from a lexicon.

In word games like Scrabble, it is useful to memorize as many two-letter words as you can, because knowing the two-letter words makes it easier to attach new words to the existing words on the board. Given that you have a lexicon containing English words, you could create such a list by generating all two-letter strings and then using the lexicon to check which of the resulting combinations are actually words. The code to do so appears in Figure 5-9.

As you will discover in the following section, it is also possible to solve this problem by going through the lexicon and printing out the words whose length is two. Given that there are more than 100,000 English words in the lexicon and only 676 (26 × 26) combinations of two letters, the strategy used in Figure 5-9 is probably more efficient.
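The two strategies just described can be compared side by side in a small sketch. This is not the Stanford Lexicon class: it assumes a std::set<std::string> of lowercase words as a stand-in lexicon, and the function names twoLetterWordsByGeneration and twoLetterWordsByScan are hypothetical. The second function relies on the range-based for loop of C++11 to step through the set.

```cpp
#include <set>
#include <string>
#include <vector>

// Strategy 1: generate all 676 two-letter strings and keep those that
// appear in the lexicon (the approach the text attributes to Figure 5-9).
std::vector<std::string> twoLetterWordsByGeneration(const std::set<std::string> & lexicon) {
    std::vector<std::string> result;
    std::string word = "xx";
    for (char c0 = 'a'; c0 <= 'z'; c0++) {
        word[0] = c0;
        for (char c1 = 'a'; c1 <= 'z'; c1++) {
            word[1] = c1;
            if (lexicon.count(word) > 0) result.push_back(word);
        }
    }
    return result;
}

// Strategy 2: visit every word in the lexicon and keep those of length 2.
// The work grows with the size of the lexicon rather than staying at 676 probes.
std::vector<std::string> twoLetterWordsByScan(const std::set<std::string> & lexicon) {
    std::vector<std::string> result;
    for (const std::string & word : lexicon) {
        if (word.length() == 2) result.push_back(word);
    }
    return result;
}
```

For a lexicon that contains only lowercase words, both functions return the same list in alphabetical order; they differ only in how much of the lexicon they have to touch.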
5.6 Iterating over a collection

The TwoLetterWords program introduced in Figure 5-9 produces a list of the two-letter words by generating every possible combination of two letters and then looking up each one to see whether that two-letter string appears in the lexicon of English words. Another strategy that accomplishes the same result is to go through every word in the lexicon and display the words whose length is equal to 2. To do so, all you need is some way of stepping through each word in a Lexicon object, one word at a time.

FIGURE 5-9 Program to generate a list of all the two-letter English words

    /*
     * File: TwoLetterWords.cpp
     * ------------------------
     * This program generates a list of the two-letter words.
     */

    #include <iostream>
    #include <string>
    #include "lexicon.h"
    using namespace std;

    int main() {
        Lexicon english("EnglishWords.dat");
        string word = "xx";
        for (char c0 = 'a'; c0 <= 'z'; c0++) {
            word[0] = c0;
            for (char c1 = 'a'; c1 <= 'z'; c1++) {
                word[1] = c1;
                if (english.contains(word)) {
                    cout << word << endl;
                }
            }
        }
        return 0;
    }

Iterating through the elements is a fundamental operation for any collection class. Moreover, if the package of collection classes is well designed, clients should be able to use the same strategy to perform that operation, no matter whether they are cycling through all elements in a vector or a grid, all keys in a map, or all words in a lexicon.

The Standard Template Library offers a powerful mechanism called an iterator for doing just that. Unfortunately, understanding these standard iterators depends on a high level of familiarity with the low-level details of C++, most notably the concept of pointers. Given that one of the goals of this text is to defer covering those details until you understand the high-level ideas, introducing standard iterators has the effect of dragging in a large amount of complexity to achieve a goal that is in fact quite simple.
All you really need is some way to express the algorithmic idea suggested by the following pseudocode:

    For each element in a particular collection {
        Process that element
    }

Most modern languages define a syntactic form that expresses precisely that idea. Unfortunately, the syntax of C++ does not yet include such a facility, although one has been proposed for a future release. The good news, however, is that it is possible to use the macro-definition capabilities of the C++ preprocessor to achieve exactly what you would like to see in the language. Although the implementation is beyond the scope of this text, the collection classes (both those in the Standard Template Library and the simplified versions used in this text) support an extended control pattern called foreach that takes the following form:

    foreach (type variable in collection) {
        body of the loop
    }

For example, if you want to iterate through all the words in the English lexicon and select only those containing two letters, you could write

    foreach (string word in english) {
        if (word.length() == 2) {
            cout << word << endl;
        }
    }

Iteration order

When you use the foreach pattern, it is sometimes useful to understand the order in which it processes the individual values. There is no universal rule. Each collection class defines its own policy about iteration order, usually based on considerations of efficiency. The classes you’ve already seen make the following guarantees about the order of values:

• When you iterate through the elements of the Vector class, foreach delivers the elements in the order of the index position, so that the element in position 0 comes first, followed by the element in position 1, and so on, up to the end of the vector.
  The iteration order is therefore the same as that produced by the traditional for loop pattern:

    for (int i = 0; i < vec.size(); i++) {
        code to process vec[i]
    }

• When you iterate through the elements of the Grid class, foreach begins by cycling through the elements of row 0 in order, then the elements of row 1, and so forth. This iteration strategy for Grid is thus analogous to using the following for loop:

    for (int row = 0; row < grid.numRows(); row++) {
        for (int col = 0; col < grid.numCols(); col++) {
            code to process grid[row][col]
        }
    }

  This order, in which the row subscript appears in the outer loop, is called row-major order.

• When you iterate through the elements of the Map class, foreach returns the keys in the natural order for the key type. For example, a Map whose keys are integers will process the keys in ascending numerical order. A Map whose keys are strings will process the keys in lexicographic order, which is the order determined by comparing the underlying ASCII codes.

• When you iterate through the elements of a Set or a Lexicon, foreach always returns the elements in the natural order for the value type. In the Lexicon class, foreach returns the words in lower case.

• You cannot use foreach in conjunction with the Stack and Queue classes. Allowing unrestricted access to these structures would violate the principle that only one element (the element at the top of a stack or the one at the head of a queue) is visible at a particular time.

Pig Latin revisited

When you convert English to Pig Latin as described in section 3.2, most words turn into something that sounds vaguely Latinate but certainly distinct from conventional English. There are, however, a few words whose Pig Latin equivalents just happen to be English words. For example, the Pig Latin translation of trash is ashtray, and the translation for entry is entryway.
Such words are not all that common; in the lexicon stored in EnglishWords.dat, there are only 27 words with that property out of over 100,000 English words. Given foreach and the wordToPigLatin function from the PigLatin program from Chapter 3, it is easy to write a program that lists all such words, as shown in Figure 5-10.

FIGURE 5-10 Program to list words that remain English words in Pig Latin

    /*
     * File: PigEnglish.cpp
     * --------------------
     * This program finds all English words that remain words when
     * you convert them to Pig Latin, such as "trash" (which becomes
     * "ashtray") and "entry" (which becomes "entryway"). The code
     * ignores words containing no vowels (mostly Welsh-derived
     * words like "cwm"), which don't change form under the Pig Latin
     * rules introduced in Chapter 3.
     */

    #include <iostream>
    #include <string>
    #include <cctype>
    #include "lexicon.h"
    using namespace std;

    /* Function prototypes */

    string wordToPigLatin(string word);
    int findFirstVowel(string word);
    bool isVowel(char ch);

    /* Main program */

    int main() {
        cout << "This program finds words that remain words"
             << " when translated to Pig Latin." << endl;
        Lexicon english("EnglishWords.dat");
        foreach (string word in english) {
            string pig = wordToPigLatin(word);
            if (pig != word && english.contains(pig)) {
                cout << word << " -> " << pig << endl;
            }
        }
        return 0;
    }

    /* The code for the helper functions appears in Figure 3-2 */

Computing word frequencies

Computers have revolutionized many fields of academic inquiry, including some in which the use of such modern tools might at first seem surprising. Over the last few decades, computer analysis has become central to resolving questions of disputed authorship. For example, there are plays from the Elizabethan era that might have been written by Shakespeare, even though they are not part of the traditional canon. Conversely, several plays that are attributed to Shakespeare have parts that don’t sound like his other works and may have in fact been written by someone else. To resolve such questions, Shakespearean scholars often compute the frequency of particular words that appear in the text and see whether those frequencies match what we expect to find based on an analysis of Shakespeare’s known works.

The code for the word frequency program appears in Figure 5-11. Given the tools you have at your disposal from earlier examples, the code required to tabulate word frequencies is quite straightforward. The strategy for dividing a line into words is similar to what you have already seen in the PigLatin program from Chapter 3. To keep track of the mapping between words and their associated counts, a Map<string,int> is precisely what you need.

FIGURE 5-11 Program to compute word frequencies

    /*
     * File: WordFrequency.cpp
     * -----------------------
     * This program computes the frequency of words in a text file.
     */

    #include <iostream>
    #include <fstream>
    #include <iomanip>
    #include <string>
    #include <cctype>
    #include "filelib.h"
    #include "map.h"
    #include "strlib.h"
    #include "vector.h"
    using namespace std;

    /* Function prototypes */

    void countWords(istream & stream, Map<string,int> & wordCounts);
    void displayWordCounts(Map<string,int> & wordCounts);
    void extractWords(string line, Vector<string> & words);

    /* Main program */

    int main() {
        ifstream infile;
        Map<string,int> wordCounts;
        promptUserForFile(infile, "Input file: ");
        countWords(infile, wordCounts);
        infile.close();
        displayWordCounts(wordCounts);
        return 0;
    }

    /*
     * Function: countWords
     * Usage: countWords(stream, wordCounts);
     * --------------------------------------
     * Counts the words in the input stream, updating the counts in
     * the wordCounts map for each word in the stream.
     */

    void countWords(istream & stream, Map<string,int> & wordCounts) {
        Vector<string> lines, words;
        readEntireFile(stream, lines);
        foreach (string line in lines) {
            extractWords(line, words);
            foreach (string word in words) {
                wordCounts[toLowerCase(word)]++;
            }
        }
    }

    /*
     * Function: displayWordCounts
     * Usage: displayWordCounts(wordCounts);
     * -------------------------------------
     * Displays the count associated with each word in the wordCounts map.
     */

    void displayWordCounts(Map<string,int> & wordCounts) {
        foreach (string word in wordCounts) {
            cout << left << setw(15) << word
                 << right << setw(5) << wordCounts[word] << endl;
        }
    }

    /*
     * Function: extractWords
     * Usage: extractWords(line, words);
     * ---------------------------------
     * This function scans through the characters in the line and stores
     * each individual word (any sequence of alphabetic characters) in the
     * vector of strings passed by reference as the second argument. The
     * caller is responsible for creating the vector.
     */

    void extractWords(string line, Vector<string> & words) {
        words.clear();
        int start = -1;
        for (int i = 0; i < line.length(); i++) {
            if (isalpha(line[i])) {
                if (start == -1) start = i;
            } else {
                if (start >= 0) {
                    words.add(line.substr(start, i - start));
                    start = -1;
                }
            }
        }
        if (start >= 0) words.add(line.substr(start));
    }

Suppose, for example, that you have a text file containing a passage from Shakespeare, such as the following well-known lines from Act 5 of Macbeth:

    Macbeth.txt
    Tomorrow, and tomorrow, and tomorrow
    Creeps in this petty pace from day to day

If you are trying to determine the relative frequency of words in Shakespeare’s writing, you need to have a program that counts how many times each word appears in the data file.
Thus, given the file Macbeth.txt, you would like your program to produce something like the following output:

    WordFrequency
    Input file: Macbeth.txt
    and                2
    creeps             1
    day                2
    from               1
    in                 1
    pace               1
    petty              1
    this               1
    to                 1
    tomorrow           3

Summary

This chapter introduced the C++ classes Vector, Stack, Queue, Map, and Set that together represent a powerful framework for storing collections. For the moment, you have looked at these classes only as a client. In subsequent chapters, you will have a chance to learn more about how they are implemented. Given that you will be implementing them as you complete the text, the classes presented here have been simplified to some extent from the vector, stack, queue, map, and set classes in the Standard Template Library, although they export a very similar collection of methods. The chapter also introduced two additional classes from the Stanford libraries, Grid and Lexicon, that make it easier to implement various interesting applications.

Important points in this chapter include:

• Data structures defined in terms of their behavior rather than their representation are called abstract data types. Abstract data types have several important advantages over more primitive data structures. These advantages include:

    1. Simplicity. The underlying data representation is hidden, which means that there are fewer details for the client to understand.
    2. Flexibility. The implementer is free to change the underlying representation as long as the methods in the interface continue to behave in the same way.
    3. Security. The interface barrier prevents the client from making unexpected changes in the internal structure.

• Classes that contain other objects as elements of an integral collection are called collection classes. In C++, container classes are defined using a template or parameterized type, in which the type name of the element appears in angle brackets after the name of the container class.
  For example, the class Vector<int> signifies a vector containing values of type int.

• The Vector class is an abstract data type that behaves in much the same fashion as a one-dimensional array but is much more powerful. Unlike arrays, a Vector can grow dynamically as elements are added and removed. Vectors are also more secure, because the Vector class checks to make sure that selected elements exist. Although you can use vectors of vectors to create two-dimensional structures, it is usually easier to use the Grid class, which is part of the Stanford libraries.

• The Stack class represents a collection of objects whose behavior is defined by the property that items are removed from a stack in the opposite order from which they were added: last in, first out (LIFO). The fundamental operations on a stack are push, which adds a value to the stack, and pop, which removes and returns the value most recently pushed.

• The Queue class is similar to the Stack class except for the fact that elements are removed from a queue in the same order in which they were added: first in, first out (FIFO). The fundamental operations on a queue are enqueue, which adds a value to the end of a queue, and dequeue, which removes and returns the value from the front.

• The Map class makes it possible to associate keys with values in a way that allows those associations to be retrieved efficiently. The fundamental operations on a map are put, which adds a key/value pair, and get, which returns the value associated with a particular key.

• The Set class represents a collection in which the elements are unordered and in which each value appears only once, as with sets in mathematics. The fundamental operations on a set include add, which stores a new element in the set, and contains, which checks to see whether an element is in the set.

• With the exception of Stack and Queue, all collection classes support the foreach pattern, which makes it easy to cycle through the elements of the collection.
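As a compact recap of the Map and iteration points above, the word-frequency computation can be sketched with standard-library stand-ins. This is an illustration under stated assumptions rather than the chapter's own program: it uses std::map and std::vector in place of the Map and Vector classes, the range-based for loop introduced in C++11 in place of foreach, and a hypothetical helper formatCounts that is not part of the WordFrequency program.

```cpp
#include <cctype>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Splits a line into lowercase words, where a word is any run of
// alphabetic characters.
std::vector<std::string> extractWords(const std::string & line) {
    std::vector<std::string> words;
    std::string current;
    for (char ch : line) {
        unsigned char uch = static_cast<unsigned char>(ch);
        if (std::isalpha(uch)) {
            current += static_cast<char>(std::tolower(uch));
        } else if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}

// Counts every word in the stream; operator[] creates a zero count the
// first time a word is seen, so the increment works for new and old keys.
std::map<std::string, int> countWords(std::istream & stream) {
    std::map<std::string, int> wordCounts;
    std::string line;
    while (std::getline(stream, line)) {
        for (const std::string & word : extractWords(line)) {
            wordCounts[word]++;
        }
    }
    return wordCounts;
}

// Hypothetical helper: lists the counts, one "word count" pair per line.
// Iterating over a std::map visits the keys in ascending order.
std::string formatCounts(const std::map<std::string, int> & wordCounts) {
    std::ostringstream out;
    for (const auto & entry : wordCounts) {
        out << entry.first << " " << entry.second << "\n";
    }
    return out.str();
}
```

Applied to the two lines from Macbeth.txt, countWords yields ten distinct words, with a count of 3 for tomorrow and 2 each for and and day. Because std::map iterates in key order, the listing begins with and, mirroring the iteration guarantee this chapter describes for Map.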
Review questions

1. True or false: An abstract data type is one defined in terms of its behavior rather than its representation.
2. What three advantages does this chapter cite for separating the behavior of a class from its underlying implementation?
3. What is the STL?
4. If you want to use the Vector class in a program, what #include line do you need to add to the beginning of your code?
5. List at least three advantages of the Vector class over the more primitive array mechanism available in C++.
6. What is meant by the term bounds-checking?
7. What is a parameterized type?
8. What type name would you use to store a vector of Boolean values?
9. True or false: The default constructor for the Vector class creates a vector with ten elements, although you can make it longer later.
10. How would you initialize a Vector<int> with 20 elements, all equal to 0?
11. What method do you call to determine the number of elements in a Vector?
12. If a Vector object has N elements, what is the legal range of values for the first argument to insertAt? What about for the argument to removeAt?
13. What feature of the Vector class makes it possible to avoid explicit use of the get and set methods?
14. Why is it important to pass vectors and other collection objects by reference?
15. What declaration would you use to initialize a variable called chessboard to an 8×8 grid, each of whose elements is a character?
16. Given the chessboard variable from the preceding exercise, how would you assign the character 'R' (which stands for a white rook in standard chess notation) to the squares in the lower left and lower right corners of the board?
17. What do the acronyms LIFO and FIFO stand for? How do these terms apply to stacks and queues?
18. What are the names of the two fundamental operations for a stack?
19. What are the names for the corresponding operations for a queue?
20. What does the peek operation do in each of the Stack and Queue classes?
21. Describe in your own words what is meant by the term discrete time in the context of a simulation program.
22. What are the two type parameters used with the Map class?
23. What happens if you call get for a key that doesn’t exist in a map?
24. What are the syntactic shorthand forms for get and put that allow you to treat a map as an associative array?
25. Why do the Stanford libraries include a separate Lexicon class even though it is easy to implement the fundamental operations of a lexicon using the Set class?
26. What are the two kinds of data files supported by the constructor for the Lexicon class?
27. What is the general form of the foreach pattern?
28. What reason does the chapter offer for disallowing the use of foreach with the Stack and Queue classes?
29. Describe the order in which foreach processes elements for each of the collection classes introduced in this chapter.

Exercises

1. In statistics, a collection of data values is often referred to as a distribution. One of the primary goals of statistical analysis is to find ways to compress the complete set of data into summary statistics that express properties of the distribution as a whole. The most common statistical measure is the mean, which is simply the traditional average. For the distribution x1, x2, x3, . . . , xn, the mean is usually represented by the symbol x̄. Write a function

    double mean(Vector<double> & data);

that returns the mean of the data in the vector.

2. Another common statistical measure is the standard deviation, which provides an indication of how much the individual values in the distribution differ from the mean. To calculate the standard deviation of a distribution whose elements are x1, x2, . . . , xn you need to perform the following steps:

    a. Calculate the mean of the distribution as in exercise 1.
    b. Go through the individual data items in the distribution and calculate the square of the difference between each data value and the mean. Add all these values to a running total.
    c. Take the total from step (b) and divide it by the number of data items.
    d. Take the square root of the result, which represents the standard deviation.

In mathematical form, the standard deviation (σ) is expressed as follows, at least if you are computing the standard deviation of a complete distribution as opposed to a sample:

    \sigma = \sqrt{\frac{\sum_{i=1}^{n} (\bar{x} - x_i)^2}{n}}

The Greek letter sigma (Σ) indicates a summation of the quantity that follows, which in this case is the square of the difference between the mean and each individual data point. Write a function

    double stddev(Vector<double> & data);

that returns the standard deviation of the data distribution.

3. Write the overloaded functions

    void readVector(istream & is, Vector<int> & vec);
    void readVector(istream & is, Vector<double> & vec);
    void readVector(istream & is, Vector<string> & vec);

each of which reads lines from the input stream specified by is into the vector vec. In the input stream, each element of the vector appears on a line of its own. The function should read elements until it encounters a blank line or the end of the file. To illustrate the operation of this function, suppose that you have the data file

    SquareAndCubeRoots.txt
    1.0000
    1.4142
    1.7321
    2.0000

    1.0000
    1.2599
    1.4422
    1.5874
    1.7100
    1.8171
    1.9129
    2.0000

and that you have opened infile as an ifstream on that file. In addition, suppose that you have declared the variable roots as follows:

    Vector<double> roots;

The first call to readVector(infile, roots) should initialize roots so that it contains the four elements shown at the beginning of the file. The second call should change the value of roots so that it contains the eight elements shown at the end of the file. Calling readVector a third time should set roots to an empty vector.

4. A histogram is a graphical way of displaying data by dividing the data into separate ranges and then indicating how many data values fall into each range.
For example, given the set of exam scores

    100, 95, 47, 88, 86, 92, 75, 89, 81, 70, 55, 80

a traditional histogram would have the following form:

                                            *
                                            *
                                            *
                                       *    *    *
                        *    *         *    *    *    *
    00s  10s  20s  30s  40s  50s  60s  70s  80s  90s  100

The asterisks in the histogram indicate one score in the 40s, one score in the 50s, five scores in the 80s, and so forth. When you generate histograms using a computer, however, it is much easier to display them sideways on the page, as in this sample run:

    Histogram
    00s:
    10s:
    20s:
    30s:
    40s: *
    50s: *
    60s:
    70s: **
    80s: *****
    90s: **
    100: *

Write a program that reads in a vector of integers from a data file and then displays a histogram of those numbers, divided into the ranges 0–9, 10–19, 20–29, and so forth, up to the range containing only the value 100. Your program should generate output that looks as much like the sample run as possible.

5. Extend the flexibility of the previous exercise by defining a hist.h interface that gives clients more control over the format of the histogram. At a minimum, your interface should allow clients to specify the minimum and maximum values along with the size of each histogram range, but you should feel free to provide other capabilities as well. Use your imagination!

6. In the third century B.C.E., the Greek astronomer Eratosthenes developed an algorithm for finding all the prime numbers up to some upper limit N. To apply the algorithm, you start by writing down a list of the integers between 2 and N. For example, if N were 20, you would begin by writing the following list:

    2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

You then circle the first number in the list, indicating that you have found a prime. Whenever you mark a number as a prime, you go through the rest of the list and cross off every multiple of that number, since none of those multiples can itself be prime.
Thus, after executing the first cycle of the algorithm, you will have circled the number 2 and crossed off every multiple of 2, as follows (circled numbers are shown in parentheses and crossed-off numbers in square brackets):

    (2) 3 [4] 5 [6] 7 [8] 9 [10] 11 [12] 13 [14] 15 [16] 17 [18] 19 [20]

To complete the algorithm, you simply repeat the process by circling the first number in the list that is neither crossed off nor circled, and then crossing off its multiples. In this example, you would circle 3 as a prime and cross off all multiples of 3 in the rest of the list, which results in the following state:

    (2) (3) [4] 5 [6] 7 [8] [9] [10] 11 [12] 13 [14] [15] [16] 17 [18] 19 [20]

Eventually, every number in the list will either be circled or crossed out, as shown in this diagram:

    (2) (3) [4] (5) [6] (7) [8] [9] [10] (11) [12] (13) [14] [15] [16] (17) [18] (19) [20]

The circled numbers are the primes; the crossed-out numbers are composites. This algorithm is called the sieve of Eratosthenes. Write a program that uses the sieve of Eratosthenes to generate a list of the primes between 2 and 1000.

7. One of the problems in using the Grid class is that it isn’t as easy to set up a particular set of initial values as it is with a vector, where you can add the elements you want with the += operator. One way to streamline the process of initializing a grid is to define a function

    void fillGrid(Grid<int> & grid, Vector<int> & values);

that fills the elements of the grid from the values in the vector. For example, the code

    Grid<int> matrix(3, 3);
    Vector<int> values;
    values += 1, 2, 3;
    values += 4, 5, 6;
    values += 7, 8, 9;
    fillGrid(matrix, values);

initializes the variable matrix to be a 3 × 3 grid containing the values

    1 2 3
    4 5 6
    7 8 9

8. A magic square is a two-dimensional grid of integers in which the rows, columns, and diagonals all add up to the same value. One of the most famous magic squares appears in the 1514 engraving “Melencolia I” by Albrecht Dürer shown in Figure 5-12, in which a 4 × 4 magic square appears in the upper right, just under the bell.
In Dürer’s square, which can be read more easily in the magnified inset shown at the right of the figure, all four rows, all four columns, and both diagonals add up to 34.

FIGURE 5-12 Magic square in Albrecht Dürer’s Melencolia I (1514)

    16  3  2 13
     5 10 11  8
     9  6  7 12
     4 15 14  1

A more familiar example is the following 3 × 3 magic square in which each of the rows, columns, and diagonals adds up to 15:

    8 1 6
    3 5 7
    4 9 2

Implement a function

    bool isMagicSquare(Grid<int> & square);

that tests to see whether the grid contains a magic square. Your program should work for square grids of any size. If you call isMagicSquare with a grid in which the number of rows and the number of columns are different, it should simply return false.

9. In the last several years, a new logic puzzle called Sudoku has become quite popular throughout the world. In Sudoku, you start with a 9 × 9 grid of integers in which some of the cells have been filled in with digits between 1 and 9. Your job in the puzzle is to fill in each of the empty spaces with a digit between 1 and 9 so that each digit appears exactly once in each row, each column, and each of the smaller 3 × 3 squares. Each Sudoku puzzle is carefully constructed so that there is only one solution. For example, Figure 5-13 shows a typical Sudoku puzzle on the left and its unique solution on the right.

Although you won’t discover the algorithmic strategies you need to solve Sudoku puzzles until later in this book, you can easily write a method that checks to see whether a proposed solution follows the Sudoku rules against duplicating values in a row, column, or outlined 3 × 3 square. Write a function

    bool checkSudokuSolution(Grid<int> & puzzle);

that performs this check and returns true if the puzzle is a valid solution.
Your program should check to make sure that puzzle contains a 9 × 9 grid of integers and report an error if this is not the case.

FIGURE 5-13 Typical Sudoku puzzle and its solution

[Puzzle grid (left) not reproduced; the solution grid (right) is shown.]

    3 7 6 2 8 1 9 4 5
    9 4 8 5 3 7 6 2 1
    2 1 5 4 9 6 7 3 8
    4 8 2 1 6 9 5 7 3
    6 9 7 3 2 5 8 1 4
    5 3 1 8 7 4 2 9 6
    8 6 4 7 1 2 3 5 9
    1 2 3 9 5 8 4 6 7
    7 5 9 6 4 3 1 8 2

10. In the game of Minesweeper, a player searches for hidden mines on a rectangular grid that might (for a very small board) look like this:

[Figure: sample Minesweeper board]

One way to represent that grid in C++ is to use a grid of Boolean values marking mine locations, where true indicates the location of a mine. In Boolean form, this sample grid therefore looks like this:

    T F F F F T
    F F F F F T
    T T F T F T
    T F F F F F
    F F T F F F
    F F F F F F

Given such a grid of mine locations, write a function

    void fixCounts(Grid<bool> & mines, Grid<int> & counts);

that creates a grid of integers storing the number of mines in each neighborhood. The neighborhood of a location includes the location itself and the eight adjacent locations, but only if they are inside the boundaries of the grid. The reference parameter counts is used to store the result. Your job in this exercise is to make sure that it has the same size as the mines grid and then to assign to each element an integer between 0 and 9. For example, if mineLocations contains the Boolean grid shown earlier, the code

    Grid<int> mineCounts;
    fixCounts(mineLocations, mineCounts);

should initialize mineCounts as follows:

    1 1 0 0 2 2
    3 3 2 1 4 3
    3 3 2 1 3 2
    3 4 3 2 2 1
    1 2 1 1 0 0
    0 1 1 1 0 0

11. The resize method in the Grid class resets the dimensions of the grid but also initializes every element of the grid to its default value.
Write a function

    void reshape(Grid<int> & grid, int nRows, int nCols);

that resizes the grid but fills in the data from the original grid by copying elements in the standard row-major order (left-to-right/top-to-bottom). For example, if myGrid initially contains the values

    1  2  3  4
    5  6  7  8
    9 10 11 12

calling the function

    reshape(myGrid, 4, 3)

should change the dimensions and contents of myGrid as follows:

     1  2  3
     4  5  6
     7  8  9
    10 11 12

If the new grid does not include enough space for all of the original values, the values at the bottom of the grid are simply dropped. For example, if you call

    reshape(myGrid, 2, 5)

there is no room for the last two elements, so the new grid will look like this:

    1 2 3 4  5
    6 7 8 9 10

Conversely, if there are not enough elements in the original grid to fill the available space, the entries at the end should simply retain their default values.

12. Write a program that uses a stack to reverse a sequence of integers read from the console one number per line, as shown in the following sample run:

    ReverseList
    Enter a list of integers, ending with 0:
    ? 10
    ? 20
    ? 30
    ? 40
    ? 0
    Those integers in reverse order are:
      40
      30
      20
      10

13.
    And the first one now
    Will later be last
    For the times they are a-changin’.
        - Bob Dylan, “The Times They Are a-Changin’,” 1963

Following the inspiration from Bob Dylan’s song (which is itself inspired by Matthew 19:30), write a function

    void reverseQueue(Queue & queue);

that reverses the elements in the queue. Remember that you have no access to the internal representation of the queue and must therefore come up with an algorithm, presumably involving other structures, that accomplishes the task.

14. Write a program that checks whether the bracketing operators (parentheses, brackets, and curly braces) in a string are properly matched.
As an example of proper matching, consider the string

    { s = 2 * (a[2] + 3); x = (1 + (2)); }

If you go through the string carefully, you discover that all the bracketing operators are correctly nested, with each open parenthesis matched by a close parenthesis, each open bracket matched by a close bracket, and so on. On the other hand, the following strings are all unbalanced for the reasons indicated:

    (([])     The line is missing a close parenthesis.
    )(        The close parenthesis comes before the open parenthesis.
    {(})      The bracketing operators are improperly nested.

15. The figures in this book are created using PostScript®, a powerful graphics language developed by the Adobe Corporation in the early 1980s. PostScript programs store their data on a stack. Many of the operators available in the PostScript language have the effect of manipulating the stack in some way. You can, for example, invoke the pop operator, which pops the top element off the stack, or the exch operator, which swaps the top two elements.

One of the most interesting (and surprisingly useful) PostScript operators is the roll operator, which takes two arguments: n and k. The effect of applying roll(n, k) is to rotate the top n elements of a stack by k positions, where the general direction of the rotation is toward the top of the stack. More specifically, roll(n, k) has the effect of removing the top n elements, cycling the top element to the last position k times, and then replacing the reordered elements back on the stack. Figure 5-14 shows before and after pictures for three different examples of roll.

Write a function

    void roll(Stack & stack, int n, int k)

that implements the roll(n, k) operation on the specified stack.
Your implementation should check that n and k are both nonnegative and that n is not larger than the stack size; if either of these conditions is violated, your implementation should call error with the message

    roll: argument out of range

Note, however, that k can be larger than n, in which case the roll operation continues through more than a complete cycle. This case is illustrated in the final example in Figure 5-14, in which the top two elements on the stack are rolled four times, leaving the stack exactly as it started.

FIGURE 5-14 Examples of the roll function for stacks

       roll(4, 1)         roll(3, 2)         roll(2, 4)
     before  after      before  after      before  after
       D       C          D       B          D       D
       C       B          C       D          C       C
       B       A          B       C          B       B
       A       D          A       A          A       A

16. The checkout-line simulation in Figure 5-5 can be extended to investigate important practical questions about how waiting lines behave. As a first step, rewrite the simulation so that there are several independent queues, as is usually the case in supermarkets. A customer arriving at the checkout area finds the shortest checkout line and enters that queue. Your revised simulation should calculate the same results as the simulation in the chapter.

17. As a second extension to the checkout-line simulation, change the program from the previous exercise so that there is a single waiting line served by multiple cashiers, a practice that has become more common in recent years. In each cycle of the simulation, any cashier that becomes idle serves the next customer in the queue. If you compare the data produced by this exercise and the preceding one, what can you say about the relative advantages of these two strategies?

18. Write a program to simulate the following experiment, which was included in the 1957 Disney film Our Friend the Atom, to illustrate the chain reactions involved in nuclear fission.
The setting for the experiment is a large cubical box, the bottom of which is completely covered with 625 mousetraps arranged to form a square grid 25 mousetraps on a side. Each of the mousetraps is initially loaded with two ping-pong balls. At the beginning of the simulation, an additional ping-pong ball is released from the top of the box and falls on one of the mousetraps. That mousetrap springs and shoots its two ping-pong balls into the air. The ping-pong balls bounce around the sides of the box and eventually land on the floor, where they are likely to set off more mousetraps.

In writing this simulation, you should make the following simplifying assumptions:

• Every ping-pong ball that falls always lands on a mousetrap, chosen randomly by selecting a random row and column in the grid. If the trap is loaded, its balls are released into the air. If the trap has already been sprung, having a ball fall on it has no effect.
• Once a ball falls on a mousetrap, whether or not the trap is sprung, that ball stops and takes no further role in the simulation.
• Balls launched from a mousetrap bounce around the room and land again after a random number of simulation cycles have gone by. That random interval is chosen independently for each ball and is always between one and four cycles.

Your simulation should run until there are no balls in the air. At that point, your program should report how many time units have elapsed since the beginning, what percentage of the traps have been sprung, and the maximum number of balls in the air at any time in the simulation.

19. Telephone numbers in the United States and Canada are organized into various three-digit area codes. A single state or province will often have many area codes, but a single area code will not cross a state boundary. This rule makes it possible to list the geographical locations of each area code in a data file.
For this problem, assume that you have access to the file AreaCodes.txt, which lists all the area codes paired with their locations as illustrated by the first ten lines of that file:

    AreaCodes.txt
    201-New Jersey
    202-District of Columbia
    203-Connecticut
    204-Manitoba
    205-Alabama
    206-Washington
    207-Maine
    208-Idaho
    209-California
    210-Texas

Using the AirportCodes program as a model, write the code necessary to read this file into a Map<int,string>, where the key is the area code and the value is the location. Once you’ve read in the data, write a main program that repeatedly asks the user for an area code and then looks up the corresponding location, as illustrated in the following sample run:

    FindAreaCode
    Enter area code or state name: 650
    California
    Enter area code or state name: 202
    District of Columbia
    Enter area code or state name: 778
    British Columbia

As the prompt suggests, however, your program should also allow users to enter the name of a state or province and have the program list all the area codes that serve that area, as illustrated by the following sample run:

    FindAreaCode
    Enter area code or state name: Oregon
    458
    503
    541
    971
    Enter area code or state name: Manitoba
    204

20. When you wrote the FindAreaCode program for the previous exercise, it is likely that you generated the list of area codes for a state by looping through the entire map and printing out any area codes that mapped to that state. Although this strategy is fine for small maps like the area code example, efficiency might become an issue in working with much larger collections of data. This strategy also feels uncomfortably asymmetric. When you want to translate an area code to a state name, you ask the map and it gives you the answer immediately; translating in the opposite direction requires a lot more work. What you would like to do is invert the map so that you could perform lookup operations in either direction.
You can’t, however, declare the inverted map as a Map<string,int>, because there is often more than one area code associated with a state. An int can’t hold all the necessary information. A better strategy is to make the inverted map a Map< string,Vector<int> > that maps each state name to a vector of the area codes that serve that state. Rewrite the FindAreaCode program so that it creates an inverted map after reading in the data file and then uses that map to list the area codes for a state.

21. In May of 1844, Samuel F. B. Morse sent the message “What hath God wrought!” by telegraph from Washington to Baltimore, heralding the beginning of the age of electronic communication. To make it possible to communicate information using only the presence or absence of a single tone, Morse designed a coding system in which letters and other symbols are represented as coded sequences of short and long tones, traditionally called dots and dashes. In Morse code, the 26 letters of the alphabet are represented by the codes shown in Figure 5-15.

Write a program that reads in lines from the user and translates each line either to or from Morse code depending on the first character of the line:

• If the line starts with a letter, you want to translate it to Morse code. Any characters other than the 26 letters should simply be ignored.
• If the line starts with a period (dot) or a hyphen (dash), it should be read as a series of Morse code characters that you need to translate back to letters. Each sequence of dots and dashes is separated by spaces, but any other characters should be ignored. Because there is no encoding for the space between words, the characters of the translated message will be run together when your program translates in this direction.

The program should end when the user enters a blank line.
A sample run of this program (taken from the messages between the Titanic and the Carpathia in 1912) might look like this:

    MorseCode
    Morse code translator
    > SOS TITANIC
    ... --- ... - .. - .- -. .. -.-.
    > WE ARE SINKING FAST
    .-- . .- .-. . ... .. -. -.- .. -. --. ..-. .- ... -
    > .... . .- -.. .. -. --. ..-. --- .-. -.-- --- ..-
    HEADINGFORYOU
    >

22. Section 3.6 defines the function isPalindrome that checks whether a word reads identically forward and backward. Use that function together with the English lexicon to print out a list of all words that are palindromes.

23. In Scrabble, knowing the two-letter word list is important because those short words make it easy to “hook” a new word into tiles already on the board. Another list that Scrabble experts memorize is the list of three-letter words that can be formed by adding a letter to the front or back of a two-letter word. Write a program that generates this list.

24. One of the most important strategic principles in Scrabble is to conserve your S tiles, because the rules for English plurals mean that many words take an S-hook at the end. Some words, of course, allow an S tile to be added at the beginning, but it turns out that there are 680 words (including, for example, both the words cold and hot) that allow S-hooks on either end. Write a program that uses the English lexicon to make a list of all such words.

Chapter 6
Designing Classes

    You don’t understand. I coulda had class. . . .
        - Marlon Brando’s character in On the Waterfront, 1954

Although you have been using classes extensively throughout this book, you have not yet had the chance to define classes of your own. The purpose of this chapter is to fill that gap by giving you the tools you need to implement new classes. This chapter, however, presents only the beginning of the story. In later chapters, you will have a chance to learn other important aspects of class design, including memory management and inheritance.
6.1 Representing points

One of the useful properties of classes (but by no means the only one) is that they make it possible to combine several related pieces of information into a composite value that can be manipulated as a unit. As a simple example, suppose that you are working with coordinates in an x-y grid in which the coordinates are always integers. Although it is possible to work with the x and y values independently, it is more convenient to define an abstract data type that combines an x and a y value together. In geometry, this unified pair of coordinate values is called a point, and it therefore makes sense to use the name Point for the corresponding type.

C++ offers several strategies for defining a Point type, ranging in sophistication from the simple structure types that have always been available in the C family of languages to definitions that use a much more modern object-oriented style. The individual sections that follow explore these strategies, beginning with the structure-based model and then moving on to the class-based form.

Defining Point as a structure type

In your past experience with programming, you have almost certainly encountered types that are defined by combining values of simpler types that already exist. Such types are called records or structures, where the first term is used more broadly in the computer science community and the second is more common among C++ programmers. Given that C++ supports the facilities available in C, you can define the Point type as a structure type using the following C-style structure definition:

    struct Point {
       int x;
       int y;
    };

This code defines the type Point as a traditional structure with two components. In a structure, components are called fields or members. In this example, the Point structure contains a field named x and a field named y, both of type int.
When you work with structures or classes in C++, it is important to keep in mind that the definition introduces a new type and does not in itself declare any variables. Once you have the definition, you can then use the type name to declare variables, just as you would with any other type. For example, if you include the local variable declaration

    Point p;

in a function, the compiler will reserve space in the stack frame for a variable of type Point named p, just as the declaration

    int n;

reserves space for a variable of type int named n. The only difference is that the Point variable p includes internal fields that hold the values of its x and y components. If you were to draw a box diagram of the variable p, it would look something like this:

    p:   x [   ]   y [   ]

The variable p holds a compound value, which contains the internal fields x and y. Given a structure, you can select the individual fields using the dot operator, which is written in the form

    var.name

where var is the variable containing the structured value and name specifies the desired field. For example, you can use the expressions p.x and p.y to select the individual coordinate values of the Point structure stored in the variable p. Selection expressions are assignable, so you can initialize the components of p to represent the point (2, 3) using the following code:

    p.x = 2;
    p.y = 3;

which leads to the state shown in the following diagram:

    p:   x [ 2 ]   y [ 3 ]

The fundamental characteristic of a structure is that it is possible to view it both as a collection of individual fields and as a single value. At the lower levels of the implementation, the values stored in the individual fields are likely to be important. At higher levels of abstraction, it makes sense to focus on the value as an integral unit.

C++ makes it easier to maintain a high-level perspective by defining several important operations that work with the structure as a whole.
Given a Point value, for example, you can assign that value to a variable, pass it as a parameter to a function, or return it as a result. Clients can select the x and y fields if they need to look at the components individually, but it is often sufficient to work with the value as a whole. These design decisions mean that you can pass a structure up and down through the various levels of an application. At all but the lowest levels, the underlying details of that structure are unimportant.

Defining Point as a class

Although structure types are part of the history of C++ and the languages that came before it, they have largely been supplanted by classes, which offer greater power and flexibility. The Point structure from the preceding section is identical to the following class definition:

    class Point {
    public:
       int x;
       int y;
    };

As you can see from this example, the fields of a class (which are also called instance variables when they occur in the context of a class) are declared using the same syntax used for fields in a structure. The only syntactic difference is that fields within a class are separated into public and private sections to control what parts of the program have access to those fields. The keyword public introduces the public section containing fields that are available to anyone who uses the defining class. A class definition, however, can also include a private section introduced by the keyword private. Fields that are declared in the private section are visible only to the defining class and not to any of its clients.

In C++ today, structures are implemented using classes and behave in exactly the same way except for the fact that entries are public by default.

As the definition of the Point class now stands, the x and y fields are included as part of the public section, which makes them visible to clients. You can select a public field in an object using the dot operator.
For example, if you have a variable pt containing an object that is an instance of the Point class, you can select its x field by writing

    pt.x

just as if it were still the Point structure from the preceding section.

Declaring public instance variables, however, is discouraged in modern object-oriented programming. Today, the common practice is to make all instance variables private, which means that clients have no direct access to the internal variables. Clients instead use methods exported by the class to obtain access to any information the class contains. Keeping implementation details away from the client is likely to foster simplicity, flexibility, and security, as described in the introduction to Chapter 5.

Making the instance variables private is easy enough. All you have to do is change the label that introduces that section of the class from public to private, as follows:

    class Point {
    private:
       int x;
       int y;
    };

The problem with this definition is that clients no longer have any way to reach the information stored within a Point object, which makes it unusable in its current form. At a minimum, clients need some way to create Point objects and to retrieve the individual x and y coordinates from an existing Point.

As you know from your experience with collection classes in Chapter 5, creating objects is the responsibility of a constructor, which always has the same name as the class. Classes often define more than one version of the constructor to take account of different initialization patterns. In particular, most classes define a constructor that takes no arguments, which is called the default constructor. The default constructor is used to initialize an object declared without specifying a parameter list. In the case of the Point class, it is useful to define a version of the constructor that takes a pair of coordinate values.
In computer science, methods that retrieve the values of instance variables are formally called accessors, but are more often known as getters. By convention, the name of a getter method begins with the prefix get followed by the name of the field after capitalizing the first letter in its name. The getters for the Point class are therefore getX and getY.

In keeping with the general strategy used in this book of showing examples before explaining each new concept in detail, Figure 6-1 offers a minimal definition of a Point class that defines two constructors, the methods getX and getY, and a toString method in its public section. I have omitted the comments that would ordinarily accompany the class and its methods to provide room for the annotations, which illustrate the structure of the class definition.

FIGURE 6-1 A simple version of the Point class

    /*
     * Class: Point
     * ------------
     * This class represents an x-y coordinate point on a two-dimensional
     * integer grid.
     */

    #include <string>
    #include "strlib.h"
    using namespace std;

    class Point {
    public:                                  /* Public section */

       Point() {                             /* Constructors */
          x = 0;
          y = 0;
       }

       Point(int xc, int yc) {
          x = xc;
          y = yc;
       }

       int getX() {                          /* Getter methods */
          return x;
       }

       int getY() {
          return y;
       }

       string toString() {
          return "(" + integerToString(x) + "," + integerToString(y) + ")";
       }

    private:                                 /* Private section */

       int x;                                /* Instance variables */
       int y;

    };

For the most part, the code in Figure 6-1 should be easy to understand. The only subtlety that might cause confusion shows up in the names of the parameters to the second form of the constructor. Logically, the constructor takes an x and a y coordinate, and it is therefore reasonable to expect that the parameter names would be x and y instead of xc and yc (where I’ve included the c to suggest the word coordinate).
Using the names x and y as parameters to the constructor would result in confusion as to whether a reference to the variable x was intended to refer to the parameter or to the instance variable with that name. Having one variable hide an identically named variable in some larger scope is called shadowing. In Chapter 11, you’ll learn a simple technique to resolve this ambiguity; unfortunately, that technique depends on concepts that are as yet beyond your knowledge. For the moment, therefore, the examples in this text avoid the problem of shadowing altogether by choosing different names for parameters and instance variables.

The other question you may have after reading the code in Figure 6-1 is why certain methods aren’t part of the class. Although the getter methods make it possible to get information out of a Point object, the current definition provides no means of changing the values of these fields. One approach that is useful in some situations is to export methods that set the values of specific instance variables. Such methods are called mutators or, more informally, setters. If the Point class were to export a setX and a setY method that allowed the client to change the values of these fields, you could easily replace any application that previously used the old structure type with a version that relies entirely on the new Point class. All you would need to do is replace every assignment of the form

    pt.x = value;

with a call to

    pt.setX(value);

Similarly, every reference to pt.y that isn’t on the left side of an assignment statement would need to be rewritten as pt.getY().

It is, however, a bit unsatisfying to think about adding setter methods to a class so soon after deciding that it was important to make its instance variables private. After all, part of the reason for making instance variables private is to ensure that clients don’t have unrestricted access to them.
Having every instance variable in a class come with a public setter method circumvents those restrictions and eliminates the advantages one might have obtained by making those variables private in the first place.

In general, it is considerably safer to allow clients to read the values of the instance variables than it is to have clients change those values. As a result, setter methods are far less common than getters in object-oriented design.

Many programmers, in fact, take the recommendation against allowing change to an even higher level by making it impossible to change the values of any instance variables after an object has been created. Classes designed in this way are said to be immutable. The Point class is immutable, as that concept tends to be defined in practice in C++. Although it is still possible to change the contents of a Point object by assigning another Point to it, there is no way to change the individual fields in a point independently.

Separating the interface from the implementation

The Point class as it appears in Figure 6-1 is useful only if you plan to use that class entirely within the context of a single source file. It is generally more useful to export class definitions in a library, thereby making those definitions available to a broader set of applications. In this case, what you would like to do is create a point.h file that serves as the interface for the class and a separate point.cpp file that includes the corresponding implementation.

As you know from Chapter 2, an interface typically contains only the prototypes for its functions and not the full implementation. The same is true for the methods in a class. The class definition in the interface includes only the method prototypes, deferring the code for those methods to the implementation.
The header files therefore look similar to header files for the other libraries you have seen, except for the fact that the prototypes appear inside a class definition.

The structure of the implementation file, however, is somewhat different. In C++, when you separate the interface for a class from its implementation, the class definition itself exists only in the .h file. The corresponding code appears in the .cpp file as independent method definitions that are not nested within a class definition in the way that the prototypes are. For this reason, the method definitions need to specify the class to which they belong in a different way. In C++, this identification is accomplished by adding the class name as a qualifier before the method name, separating the two with a double colon. Thus, the fully qualified name of the getX method in the Point class is Point::getX. Once that minor syntactic wrinkle is out of the way, the implementations are straightforward. Figure 6-2 provides a complete interface for the Point class, and the corresponding implementation appears in Figure 6-3.

Hiding the underlying representation

One of the primary reasons for separating an interface from its implementation is that doing so hides complexity from the client. A well-designed interface contains only the information that is necessary to use the entries it exports. The details of how those entries work should be hidden inside the implementation. The point.h interface in Figure 6-2 approaches this goal by specifying only the prototypes for the methods; the corresponding implementations are hidden away in point.cpp.

FIGURE 6-2 Preliminary interface for the Point class (point.h)

    /*
     * File: point.h
     * -------------
     * This interface exports the Point class, which represents a point on
     * a two-dimensional integer grid.
     */

    #ifndef _point_h
    #define _point_h

    #include <string>

    class Point {

    public:

    /*
     * Constructor: Point
     * Usage: Point origin;
     *        Point pt(xc, yc);
     * ------------------------
     * Creates a Point object.  The default constructor sets the coordinates
     * to 0; the second form sets the coordinates to xc and yc.
     */

       Point();
       Point(int xc, int yc);

    /*
     * Methods: getX, getY
     * Usage: int x = pt.getX();
     *        int y = pt.getY();
     * -------------------------
     * These methods return the x and y coordinates of the point.
     */

       int getX();
       int getY();

    /*
     * Method: toString
     * Usage: string str = pt.toString();
     * ----------------------------------
     * Returns a string representation of the Point in the form "(x,y)".
     */

       std::string toString();

    private:

       int x;                /* The x-coordinate */
       int y;                /* The y-coordinate */

    };

    #endif

FIGURE 6-3 Preliminary implementation of the Point class (point.cpp)

    /*
     * File: point.cpp
     * ---------------
     * This file implements the point.h interface.
     */

    #include <string>
    #include "point.h"
    #include "strlib.h"
    using namespace std;

    /*
     * Implementation notes: Constructors
     * ----------------------------------
     * The constructors initialize the instance variables x and y.  In the
     * second form of the constructor, the parameter names are xc and yc
     * to avoid the problem of shadowing the instance variables.
     */

    Point::Point() {
       x = 0;
       y = 0;
    }

    Point::Point(int xc, int yc) {
       x = xc;
       y = yc;
    }

    /*
     * Implementation notes: Getters
     * -----------------------------
     * The getters return the value of the corresponding instance variable.
     * No setters are provided to ensure that Point objects are immutable.
     */

    int Point::getX() {
       return x;
    }

    int Point::getY() {
       return y;
    }

    /*
     * Implementation notes: toString
     * ------------------------------
     * The implementation of toString uses the integerToString function
     * from the strlib.h interface.
     */

    string Point::toString() {
       return "(" + integerToString(x) + "," + integerToString(y) + ")";
    }

So far, so good.
Unfortunately, the point.h file still specifies certain details that are of no interest to the client. The instance variables x and y, which are really part of the implementation, are right there in the interface for everyone to see, even though clients have no direct access to these variables. Even though those declarations appear in a section marked as private, it would be better if the client didn’t have to see those details at all.

Unfortunately, the rules of C++ make it tricky to hide the declarations of the instance variables from the client. Syntactically, both the public and private sections of the class must be enclosed within the class body. Given the fact that the compiler has to be able to see the definition of the entire class, there is no way to prevent a determined client from seeing the contents of the private section. Even so, it is still useful to avoid sticking all that information in the client’s face. To accomplish this goal, the interface files in this book replace the private section with an #include line that reads that section from another file. In the case of point.h, for example, the finished version of the interface replaces the private section, including the private keyword, with the line

    #include "pointpriv.h"

This line instructs the compiler to read in the contents of the file pointpriv.h at that point in the compilation. That file contains exactly the same code that used to appear in the point.h interface, but in some sense moves it out of sight. Anyone reading the interface could go off and look at the contents of pointpriv.h, but in practice no one bothers. The information you need as a client is in the public section, and you have no interest in being loaded down with irrelevant details.

6.2 Operator overloading

As you know from your experience with the library classes in several of the earlier chapters, C++ makes it possible to extend the standard operators so that they apply to new types.
This technique is called operator overloading. For example, the string class overloads the + operator so that it behaves differently when applied to strings. When the C++ compiler sees the + operator, it decides how to evaluate it by looking at the types of the operands, just as it uses the argument signature to choose among various overloaded versions of a function. If the compiler sees + applied to two integers, it generates the instructions necessary to add those values to produce an integer result. If the operands are strings, the compiler instead generates a call to the method provided by the string class that implements concatenation.

The ability to overload operators is a powerful feature of C++ that can make programs much easier to read, but only if the interpretation of each operator remains consistent across the types to which it is applied. The classes that overload the + operator, for example, use it for operations that are conceptually similar to addition, like concatenating strings. If you write an expression like s1 + s2 for two variables of type string, it is easy to think about this operation as one of adding the strings together. If, however, you redefine an operator in such a way that readers have no clue what it means, operator overloading can make programs essentially unreadable. It is therefore important to exercise restraint by using this feature only in circumstances where it enhances program readability.

The sections that follow illustrate how you can overload operators in your own type definitions, using the Point class from section 6.1 as the starting point for the examples and then moving on to add some useful operators to the Direction type introduced in Chapter 1.

Overloading the insertion operator

As you can see from the point.h interface in Figure 6-2, the Point class exports a method called toString that converts a Point object into a string containing the coordinate values enclosed in parentheses.
The primary purpose for including this method is to make it easy to display the value of a Point. When you’re debugging a program, it is often convenient to display the value of a variable. To display the value of a Point variable named pt, all you need to do is add a statement to your program that looks something like this:

    cout << "pt = " << pt.toString() << endl;

Operator overloading makes it possible to simplify this process even further. C++ already overloads the stream insertion operator << so that it can display strings along with the primitive types. If you overload this operator to support the Point class as well, you can simplify the preceding statement to

    cout << "pt = " << pt << endl;

A minor change to be sure, but one that makes printing the value of a point require even less thought.

Each of the operators in C++ is associated with a function name used to define its overloaded behavior. In almost all cases, the function name consists of the keyword operator followed by the operator symbol. For example, if you want to redefine the + operator for some new type, you define a function named operator+ that takes arguments of that type. Similarly, you overload the insertion operator by supplying a new definition for the function operator<<.

The hardest part of coding the operator<< function is writing its prototype. The left operand of the << operator is the output stream. For this argument, it makes sense to choose the most general class that implements the insertion operator, which is ostream. The right operand of << is the value you want to insert into that stream, which in this example is a Point object. The overloaded definition of operator<< therefore takes two arguments: an ostream and a Point.

Completing the prototype, however, requires some care to take account of the fact that streams cannot be copied. This restriction implies that the ostream argument must be passed by reference.
That same restriction, however, also comes up when the << operator returns. As I describe on page 160, the insertion operator implements its wonderful chaining behavior by returning the output stream, which is then carried forward to the next << operator in the chain. To avoid copying the stream on this end of the process, the definition of operator<< must also return its result by reference.

Returning results by reference is much less common than passing parameters by reference. There are a few cases, including the current example of overloading the << operator, in which you need that capability, but there is no reason to introduce it as a general tool. For those applications where it comes up, it is enough to know that you can specify return by reference using pretty much the same syntax you use to indicate call by reference: you simply add an ampersand after the result type.

Putting all these observations together suggests that the prototype for the overloaded version of operator<< must look like this:

    ostream & operator<<(ostream & os, Point pt);

The implementation must print the string representation of pt on the output stream and then return the stream by reference so that it can be used again in the surrounding context. If you code these steps sequentially, you get the following:

    ostream & operator<<(ostream & os, Point pt) {
       os << pt.toString();
       return os;
    }

You can, however, reduce this implementation to a single line, as follows:

    ostream & operator<<(ostream & os, Point pt) {
       return os << pt.toString();
    }

The classes in this text use this second form, which emphasizes that the << operator continues to produce the same value it returns when you apply it to other types.

Testing points for equality

If you look at the final version of the point.h interface in Appendix A, you will discover that the Point class supports other operators besides stream insertion.
For example, given two points, p1 and p2, you can test whether those points are equal by applying the == operator, just as you would with strings or primitive values.

C++ offers two strategies for overloading a built-in operator so that it works with objects of a newly defined class:

1. You can define the operator as a method within the class. When you use this style to overload a binary operator, the left operand is the receiver object and the right operand is passed as a parameter.

2. You can define the operator as a free function outside the class. If you use this style, the operands for a binary operator are both passed as parameters.

If you use the method-based style, the first step in extending the == operator is to add the prototype for this operator to the point.h interface, as follows:

    bool operator==(Point rhs);

This method is part of the Point class and must therefore be defined as part of its public section. The corresponding implementation appears in the point.cpp file, where you would need to add the following code:

    bool Point::operator==(Point rhs) {
       return x == rhs.x && y == rhs.y;
    }

As with other methods that are part of a class exported through an interface, the implementation of operator== must specify that it is associated with the Point class by including the Point:: prefix before the method name.

The client code that calls this method presumably looks something like this:

    if (pt == origin) . . .

Assuming that both pt and origin are variables of type Point, the compiler will invoke the == operator from the Point class when it encounters this expression. Because operator== is a method, the compiler designates the variable pt as the receiver and then copies the value of origin to the parameter rhs. In the body of the operator== method, the unqualified references to x and y therefore refer to fields in the variable pt, while the expressions rhs.x and rhs.y refer to fields in the variable origin.
The code for the operator== method offers a useful illustration of an important property of object-oriented programming. The operator== method clearly has access to the x and y fields of the current object, because any method in a class has access to its own private variables. What is perhaps harder to understand is that the operator== method also has access to the private variables of rhs, even though that variable holds a completely different object. These references are legal in C++ because the definitions in the private section of a class are private to the class and not to the object. The code for the methods of a class can refer to the instance variables of any object of that class.

In my experience, students often find the method-based form of operator overloading confusing because the compiler treats the operands differently, designating one as the receiver and passing the other as a parameter. The easiest way to restore symmetry is to use the alternative approach of defining the operators as free functions. If you use this strategy, the point.h interface needs to include the following prototype:

    bool operator==(Point p1, Point p2);

This prototype declares a free function and must therefore appear outside the definition of the Point class. The corresponding implementation, which no longer includes the Point:: prefix because the operator is no longer part of the class, looks like this:

    bool operator==(Point p1, Point p2) {
       return p1.x == p2.x && p1.y == p2.y;
    }

Although this implementation is easier to follow because it treats the parameters p1 and p2 symmetrically, this code has a significant problem: it doesn’t actually work. In fact, if you add this definition to the Point class, your code won’t even compile. The crux of the problem is that the == operator is now defined as a free function and therefore has no access to the private instance variables x and y.
I haven’t included a bug icon here because the code for the == operator is going to end up looking exactly as it does in this example. What saves the day is that C++ makes it possible to solve the access problem in another way.

Given that the == operator appears in the point.h interface, it is conceptually associated with the Point class and therefore in some sense deserves to have access to the private variables of the class. To make this design work, the Point class must let the C++ compiler know that it is okay for a particular function, in this case the overloaded version of the == operator, to see its private instance variables. To make such access legal, the Point class must designate the operator== function as a friend. In this context, friendship has much the same character as it does on social networks. Your private information is typically not shared with the community at large, but is accessible to those you have accepted as your friends.

In C++, the syntax for declaring a free function as a friend is

    friend prototype;

where prototype is the complete function prototype. Here, for example, you need to specify that the free function operator== is a friend of the Point class by writing

    friend bool operator==(Point p1, Point p2);

This line is part of the class definition and must therefore be available when the point.h interface is read. As with the declarations of the instance variables, however, the client is probably not interested in this level of detail. It therefore makes sense to put the friend specification in the pointpriv.h file instead of in the point.h interface itself.

In C++, a class can declare that everything in some other class should gain the benefits of friendship by including the line

    friend class name;

where name is the name of the class. In C++, such declarations of friendship are not automatically reciprocal.
If two classes both want access to the private variables of the other, each class must explicitly declare the other class as a friend.

Whenever you overload the == operator for a class, it is good practice to provide an overloaded definition for the != operator as well. Clients, after all, will expect that they can test whether two points are different as easily as they can test whether those points are the same. C++ does not assume that == and != return opposite values; if you want that behavior, you have to overload each of these operators separately. You can, however, make use of operator== when you implement operator!=, because operator== is exported publicly through the interface. The most straightforward implementation of != therefore looks like this:

    bool operator!=(Point p1, Point p2) {
       return !(p1 == p2);
    }

The final version of the Point class appears on the next few pages. Figure 6-4 contains the point.h interface, Figure 6-5 lists the contents of pointpriv.h, and Figure 6-6 completes the presentation with the point.cpp implementation.

Figure 6-4. Complete interface for the Point class (finished version of point.h)

    /*
     * File: point.h
     * -------------
     * This interface exports the Point class, which represents a point on
     * a two-dimensional integer grid.
     */

    #ifndef _point_h
    #define _point_h

    #include <iostream>
    #include <string>

    class Point {

    public:

    /*
     * Constructor: Point
     * Usage: Point origin;
     *        Point pt(xc, yc);
     * ------------------------
     * Creates a Point object.  The default constructor sets the coordinates
     * to 0; the second form sets the coordinates to xc and yc.
     */

       Point();
       Point(int xc, int yc);

    /*
     * Methods: getX, getY
     * Usage: int x = pt.getX();
     *        int y = pt.getY();
     * -------------------------
     * These methods return the x and y coordinates of the point.
     */

       int getX();
       int getY();

    /*
     * Method: toString
     * Usage: string str = pt.toString();
     * ----------------------------------
     * Returns a string representation of the Point in the form "(x,y)".
     */

       std::string toString();

    #include "pointpriv.h"

    };

    /*
     * Operator: <<
     * Usage: cout << pt;
     * ------------------
     * Overloads the << operator so that it is able to display Point values.
     */

    std::ostream & operator<<(std::ostream & os, Point pt);

    /*
     * Operator: ==
     * Usage: p1 == p2
     * ---------------
     * This operator supports equality testing for points.
     */

    bool operator==(Point p1, Point p2);

    /*
     * Operator: !=
     * Usage: p1 != p2
     * ---------------
     * This operator implements the != operator for points.  It is good
     * practice to overload this operator whenever you overload == to
     * ensure that clients can perform either test.
     */

    bool operator!=(Point p1, Point p2);

    #endif

Figure 6-5. The private section of the Point class (pointpriv.h)

    /*
     * File: pointpriv.h
     * -----------------
     * This file contains the private section of the Point class.  This
     * information is stored in a separate file to ensure that the client
     * doesn't see this information when reading the interface.
     */

    /* Friend declaration */

       friend bool operator==(Point p1, Point p2);

    private:

    /* Instance variables */

       int x;                         /* The x-coordinate */
       int y;                         /* The y-coordinate */

Figure 6-6. Complete implementation of the Point class (point.cpp)

    /*
     * File: point.cpp
     * ---------------
     * This file implements the point.h interface.  The comments have been
     * eliminated from this listing so that the implementation fits on a
     * single page.
     */

    #include <string>
    #include "point.h"
    #include "strlib.h"
    using namespace std;

    Point::Point() {
       x = 0;
       y = 0;
    }

    Point::Point(int xc, int yc) {
       x = xc;
       y = yc;
    }

    int Point::getX() {
       return x;
    }

    int Point::getY() {
       return y;
    }

    string Point::toString() {
       return "(" + integerToString(x) + "," + integerToString(y) + ")";
    }

    bool operator==(Point p1, Point p2) {
       return p1.x == p2.x && p1.y == p2.y;
    }

    bool operator!=(Point p1, Point p2) {
       return !(p1 == p2);
    }

    ostream & operator<<(ostream & os, Point pt) {
       return os << pt.toString();
    }

Adding operators to the Direction type

Although operator overloading is most commonly associated with classes, C++ also allows you to extend the definition of operators so that they work with enumerated types. This feature makes it possible to add two operators to the direction.h interface from Chapter 2 that make the enumerated Direction type considerably easier to use.

For exactly the same reasons that made it useful to overload the << operator for the Point class, it makes sense to define that operator for the Direction type as well. Given that the direction.h interface exports the directionToString function, the extended implementation of operator<< is straightforward:

    ostream & operator<<(ostream & os, Direction dir) {
       return os << directionToString(dir);
    }

As with any function in a library interface, the body of this function belongs in direction.cpp, and its prototype must appear in direction.h.

Before introducing the second operator, which is both more important and more subtle, it is useful to take note of an unfortunate limitation in the capabilities provided by the Direction type. As you will discover in Chapter 9, it is often useful to iterate through the elements of the Direction type, cycling through the values NORTH, EAST, SOUTH, and WEST in order.
To accomplish that goal, what you would like to do is use what seems to be the obvious for loop idiom, as follows:

    for (Direction dir = NORTH; dir <= WEST; dir++) . . .

Unfortunately, that statement doesn’t quite work for the Direction type as it is currently defined. The problem is that the ++ operator doesn’t work for enumerated types. To achieve the effect of dir++, you have to write the much less elegant expression

    dir = Direction(dir + 1)

Once again, operator overloading comes to the rescue. To make the standard for loop idiom work exactly as it should, all you have to do is overload the ++ operator for the Direction type. Doing so, however, is not quite as simple as it sounds. The ++ and -- operators are special in C++ because they occur in two forms. When they are written in the prefix position, as in the expression ++x, the operator is applied first and the value of the expression is the value at the end of the operation. When they are written in the suffix position, as in the expression x++, the value of the variable changes in exactly the same way, but the value of the expression is the value of the variable before the operation takes place.

When you overload the ++ or -- operators in C++, you have to tell the compiler whether you want to redefine the prefix or suffix form of the operator. The designers of C++ chose to indicate the suffix form by passing an integer argument that has no purpose other than to differentiate it from its prefix counterpart. Thus, to overload the prefix form of the ++ operator for the Direction type, you would define the function

    Direction operator++(Direction & dir) {
       dir = Direction(dir + 1);
       return dir;
    }

To overload the suffix form, you would instead define the function

    Direction operator++(Direction & dir, int) {
       Direction old = dir;
       dir = Direction(dir + 1);
       return old;
    }

Note that the dir parameter must be passed by reference so that the function can change its value.
This example also illustrates the fact that C++ doesn’t require a parameter name if you aren’t going to use that value.

Given that the only purpose in overloading this operator is to enable the standard for loop idiom, the library version of the Direction type overloads only the suffix form. This extension is useful for any enumerated type for which it makes sense to iterate through the elements.

Once you have made the extensions described in this section, the Direction type becomes much easier to use. For example, if you execute

    for (Direction dir = NORTH; dir <= WEST; dir++) {
       cout << dir << endl;
    }

you get the following output:

    NORTH
    EAST
    SOUTH
    WEST

Generating this output without these extensions would be much more difficult.

6.3 Rational numbers

Although the Point class from section 6.1 illustrates the basic mechanics used to define a new class, developing a solid understanding of the topic requires you to consider more sophisticated examples. This section walks you through the design of a class to represent rational numbers, which are those numbers that can be represented as the quotient of two integers. In elementary school, you probably called these numbers fractions.

In some respects, rational numbers are similar to the floating-point numbers you have been using since Chapter 1. Both types of numbers can represent fractional values, such as 1.5, which is the rational number 3/2. The difference is that rational numbers are exact, while floating-point numbers are approximations limited by the precision of the hardware.

To get a sense of why this distinction might be important, consider the arithmetic problem of adding together the following fractions:

$$\frac{1}{2} + \frac{1}{3} + \frac{1}{6}$$

Basic arithmetic, or even a little intuition, makes it clear that the mathematically precise answer is 1, but that answer is impossible to obtain using double-precision arithmetic on a computer.
Writing a program to perform this calculation and display the result with 16 digits of precision is easy:

    #include <iomanip>
    #include <iostream>
    using namespace std;

    int main() {
       double a = 1.0 / 2.0;
       double b = 1.0 / 3.0;
       double c = 1.0 / 6.0;
       double sum = a + b + c;
       cout << setprecision(16);
       cout << "1/2 + 1/3 + 1/6 = " << sum << endl;
       return 0;
    }

The result of running this program, however, might be a bit surprising:

    1/2 + 1/3 + 1/6 = 0.9999999999999999

The problem is that the memory cells used to store numbers inside a computer have a limited storage capacity, which in turn restricts the precision they can offer. Within the limits of double-precision arithmetic, the sum of one-half plus one-third plus one-sixth is closer to 0.9999999999999999 than it is to 1.0. Worse still, the calculated value of the sum really is less than 1 and would show up as such if you were to test it in your program. At the end of the run, the value of the expression sum < 1 would be true, and the value of sum == 1 would be false. That result is all rather unsettling from a mathematical point of view.

Rational numbers are not subject to rounding errors because no approximations are involved. What’s more, rational numbers are well understood as a mathematical concept with their own arithmetic rules, which are summarized in Figure 6-7. The problem is that C++ does not include rational numbers among its predefined types. To use rational numbers in C++, you have to define a new class to represent them.

A strategy for defining new classes

When you work in object-oriented languages, designing new classes is the most important skill you need to master. As with much of programming, designing a new class is as much an art as it is a science. Developing effective class designs requires a strong sense of aesthetics and considerable sensitivity to the needs and desires of clients who will use those classes as tools.
Experience and practice are the best teachers, but following a general design framework can help get you started along this path. From my own experience, I’ve found that the following step-by-step approach is often helpful:

1. Think generally about how clients are likely to use the class. From the very beginning of the process, it is essential to remember that library classes are designed to meet the needs of clients and not for the convenience of the implementer. In a professional context, the most effective way to ensure that a new class meets those needs is to involve clients in the design process. At a minimum, however, you need to put yourself in the client role as you sketch the outlines of the class design.

Figure 6-7. Rules for rational arithmetic

Addition: $\frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd}$

Subtraction: $\frac{a}{b} - \frac{c}{d} = \frac{ad - bc}{bd}$

Multiplication: $\frac{a}{b} \times \frac{c}{d} = \frac{ac}{bd}$

Division: $\frac{a}{b} \div \frac{c}{d} = \frac{ad}{bc}$

2. Determine what information belongs in the private state of each object. Although the private section is conceptually part of the implementation of a class, it simplifies the later design phases if you have at least an intuitive sense about what information objects of this class contain. In many cases, you can write down the instance variables that go into the private section. Although this precise level of detail is not essential at this point, having at least a feeling for the inner structure makes it easier to define its constructors and methods.

3. Define a set of constructors to create new objects. Classes typically define more than one overloaded form of the constructor, and it is useful to think from the client’s point of view about the types of objects that need to be created and what information the client has on hand at that time. Typically, every class exports a default constructor, which makes it possible for clients to declare variables of that class and initialize them later on.
During this phase, it is also useful to consider whether the constructors need to apply any restrictions to ensure that the resulting objects are valid.

4. Enumerate the operations that will become the public methods of the class. In this phase, the goal is to write the prototypes for the exported methods, thereby adding specificity to the general outline you developed at the beginning of the process. You can also use this phase to refine the overall design, following the principles of unity, simplicity, sufficiency, generality, and stability outlined in Chapter 2.

5. Code and test the implementation. Once you have the interface specification, you need to write the code that implements it. Writing the implementation is of course essential to having a working program but also offers validation for the design. As you write the implementation, it is sometimes necessary to revisit the interface design if, for example, you discover that a particular feature is difficult to implement at an acceptable level of efficiency. As the implementer, you also have a responsibility to test the class to ensure that the class delivers the functionality it advertises in the interface.

The sections that follow carry out these steps for the Rational class.

Adopting the client perspective

As a first step toward the design of the Rational class, you need to think about what features your clients are likely to need. In a large company, you might have various implementation teams that need to use rational numbers and can give you a good sense of what they need. In that setting, it would be useful to work together with those clients to agree on a set of design goals.

Given that this example is designed for a textbook, it isn’t possible to schedule meetings with prospective clients. The primary purpose of the example, moreover, is to illustrate the structure of class definitions in C++.
Given these limitations and the need to manage the complexity of the example, it makes sense to limit the design goals so that the Rational class implements only the arithmetic operations defined in Figure 6-7.

The private state of the Rational class

For the Rational class, the private state is easy to specify. A rational number is defined as the quotient of two integers. Each rational object must therefore keep track of these two integers. The declarations of the instance variables that will go into the private section will therefore look something like this:

    int num;
    int den;

The names for these variables are shortened versions of the mathematical terms numerator and denominator, which refer to the upper and lower parts of a fraction.

It is interesting to note that the instance variables of the Point class and the Rational class are the same except for the variable names. The values maintained by each of these classes consist of a pair of integers. What makes these classes different is the interpretation of that data, which is reflected in the operations each class supports.

Constructors for the Rational class

Given that a rational number represents the quotient of two integers, one of the constructors will presumably take two integers representing the components of the fraction. Having such a constructor makes it possible, for example, to define the rational number one-third by calling Rational(1, 3). The prototype for this form of the constructor, which is the only part of the definition that goes into the interface, looks like this:

    Rational(int x, int y);

Although it isn’t necessary to think about the implementation at this stage in the process, keeping implementation issues at least at the back of your mind can sometimes save you headaches later on.
In this case, it is worth recognizing that it isn’t appropriate to implement this constructor in the following form:

    Rational(int x, int y) {
       num = x;
       den = y;
    }

The problem with this implementation is that the rules of arithmetic place constraints on the values of the numerator and denominator, and those constraints need to be incorporated into the constructor. The most obvious constraint is that the value of the denominator cannot be zero, and the constructor should check for this case and signal an error if it occurs. There is, however, a more subtle issue. If the client is given unconstrained choice for the numerator and denominator, there will be many different ways to represent the same rational number. For example, the rational number one-third can be written as a fraction in any of the following ways:

$$\frac{1}{3} \qquad \frac{2}{6} \qquad \frac{100}{300} \qquad \frac{-1}{-3}$$

Given that these fractions all represent the same rational number, it is inelegant to allow arbitrary combinations of numerator and denominator values in a Rational object. It simplifies the implementation if every rational number has a consistent, unique representation.

Mathematicians achieve this goal by insisting on the following rules:

• The fraction is always expressed in lowest terms, which means that any common factors are eliminated from the numerator and the denominator. In practice, the easiest way to reduce a fraction to lowest terms is to divide both the numerator and the denominator by their greatest common divisor, which you already know how to compute using the gcd function presented on page 61.

• The denominator is always positive, which means that the sign of the value is stored with the numerator.

• The rational number 0 is always represented as the fraction 0/1.
Implementing these rules results in the following code for the constructor:

    Rational(int x, int y) {
        if (y == 0) error("Rational: Division by zero");
        if (x == 0) {
            num = 0;
            den = 1;
        } else {
            int g = gcd(abs(x), abs(y));
            num = x / g;
            den = abs(y) / g;
            if (y < 0) num = -num;
        }
    }

As a general rule, every class should have a default constructor, which is used when no parameters are provided in a declaration. The appropriate mathematical value for the default rational number is zero, which is represented as the fraction 0/1. The code for the default constructor therefore looks like this:

    Rational() {
        num = 0;
        den = 1;
    }

Finally, it turns out to be useful to define a third version of the constructor that takes a single argument representing an integer:

    Rational(int n) {
        num = n;
        den = 1;
    }

Defining methods for the Rational class

In light of the earlier decision to limit the functionality of the Rational class to the arithmetic operators, figuring out what methods to export is a relatively easy task, particularly in C++. In many object-oriented languages (including Java, for example) the only way to define arithmetic operations would be to export methods named add, subtract, multiply, and divide to implement the four arithmetic operations. What’s worse is that you have to use the receiver syntax to apply those operators. Instead of writing the intuitively satisfying declaration

    Rational sum = a + b + c;

languages like Java would require you to write

    Rational sum = a.add(b).add(c);

While it is not all that difficult to puzzle out what this expression means, it has none of the fluidity and expressiveness that redefining the arithmetic operators provides.

The situation is much better in C++, where you implement rational arithmetic by overloading the operators +, -, *, and / to work with Rational objects.
As was true for the Point class in section 6.2, it is easier to define these operators as free functions than as methods, which means that the prototypes for the four operators look like this:

    Rational operator+(Rational r1, Rational r2);
    Rational operator-(Rational r1, Rational r2);
    Rational operator*(Rational r1, Rational r2);
    Rational operator/(Rational r1, Rational r2);

As with the == operator in the Point class, these arithmetic operators will need to have access to the fields of r1 and r2, which means that these operator functions must be declared as friends of the Rational class. The most convenient place to put those declarations is in the rationalpriv.h file that hides away the private section of the class. The contents of that file appear in Figure 6-8.

Although there are many other methods and operators that would make sense in a professional implementation of the Rational class, the only additional facilities included in this example are a toString method and an overloaded version of the << operator, mostly to get you in the habit of including these facilities in every class you design. Being able to display the values of your objects in a human-readable form is tremendously important for both testing and debugging, which are essential phases of the development process.

These design decisions make it possible to complete the definition of the rational.h interface, which appears in Figure 6-9.

Implementing the Rational class

The final step in the process is writing the code for the Rational class, which appears in Figure 6-10. Given that the only complex part of the implementation is the constructor, for which you have already seen the necessary code, the contents of rational.cpp are reasonably straightforward.
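The three representation rules enforced by the two-argument constructor can be tried out in isolation before reading the full figures. The following is a minimal sketch rather than part of the rational.h package: the Fraction struct and the normalize function are hypothetical names introduced only for this illustration, and the sketch throws a standard exception where the book's code calls the library's error function, so that it does not depend on error.h.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdexcept>

// Hypothetical stand-in for the private state of a Rational object.
struct Fraction {
    int num;
    int den;
};

// Euclid's algorithm, as in the gcd function used by the constructor.
int gcd(int x, int y) {
    assert(x >= 0 && y > 0);    // callers pass absolute values, y nonzero
    int r = x % y;
    while (r != 0) {
        x = y;
        y = r;
        r = x % y;
    }
    return y;
}

// Hypothetical helper that applies the three rules:
//   1. lowest terms, 2. positive denominator, 3. zero stored as 0/1.
Fraction normalize(int x, int y) {
    if (y == 0) throw std::invalid_argument("normalize: Division by zero");
    Fraction f = {0, 1};
    if (x != 0) {
        int g = gcd(std::abs(x), std::abs(y));
        f.num = x / g;
        f.den = std::abs(y) / g;
        if (y < 0) f.num = -f.num;   // move the sign to the numerator
    }
    return f;
}
```

With these definitions, normalize(1, 3), normalize(2, 6), normalize(100, 300), and normalize(-1, -3) all produce the pair num = 1, den = 3, while normalize(1, -3) produces num = -1, den = 3. Having a single stored form for each value is what lets the arithmetic operators simply pass their raw products to the constructor.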
Figure 6-8 The private section of the Rational class

    /*
     * File: rationalpriv.h
     * --------------------
     * This file contains the private section (and the friend declarations)
     * for the Rational class.
     */

    /* Declare the operator functions as friends */

        friend Rational operator+(Rational r1, Rational r2);
        friend Rational operator-(Rational r1, Rational r2);
        friend Rational operator*(Rational r1, Rational r2);
        friend Rational operator/(Rational r1, Rational r2);

    private:

    /* Instance variables */

        int num;    /* The numerator of this Rational object   */
        int den;    /* The denominator of this Rational object */

Figure 6-9 Interface for the Rational class

    /*
     * File: rational.h
     * ----------------
     * This interface exports a class for representing rational numbers.
     */

    #ifndef _rational_h
    #define _rational_h

    #include <iostream>
    #include <string>

    /*
     * Class: Rational
     * ---------------
     * The Rational class is used to represent rational numbers, which
     * are defined to be the quotient of two integers.
     */

    class Rational {

    public:

    /*
     * Constructor: Rational
     * Usage: Rational zero;
     *        Rational num(n);
     *        Rational r(x, y);
     * ------------------------
     * Creates a Rational object.  The default constructor creates the
     * rational number 0.  The single-argument form creates a rational
     * equal to the specified integer, and the two-argument form creates
     * a rational number corresponding to the fraction x/y.
     */

        Rational();
        Rational(int n);
        Rational(int x, int y);

    /*
     * Method: toString()
     * Usage: string str = r.toString();
     * ---------------------------------
     * Returns the string representation of this rational number.
     */

        std::string toString();

    #include "rationalpriv.h"

    };

    /*
     * Operator: <<
     * ------------
     * Overloads the << operator so that it is able to display Rational values.
     */

    std::ostream & operator<<(std::ostream & os, Rational rat);

    /*
     * Operator: +
     * Usage: r1 + r2
     * --------------
     * Overloads the + operator so that it performs addition on two
     * rational numbers.
     */

    Rational operator+(Rational r1, Rational r2);

    /*
     * Operator: -
     * Usage: r1 - r2
     * --------------
     * Overloads the - operator so that it performs subtraction on two
     * rational numbers.
     */

    Rational operator-(Rational r1, Rational r2);

    /*
     * Operator: *
     * Usage: r1 * r2
     * --------------
     * Overloads the * operator so that it performs multiplication on two
     * rational numbers.
     */

    Rational operator*(Rational r1, Rational r2);

    /*
     * Operator: /
     * Usage: r1 / r2
     * --------------
     * Overloads the / operator so that it performs division on two
     * rational numbers.
     */

    Rational operator/(Rational r1, Rational r2);

    #endif

Figure 6-10 Implementation of the Rational class

    /*
     * File: rational.cpp
     * ------------------
     * This file implements the Rational class.
     */

    #include <cstdlib>
    #include <string>
    #include "error.h"
    #include "rational.h"
    #include "strlib.h"
    using namespace std;

    /* Function prototypes */

    int gcd(int x, int y);

    /*
     * Implementation notes: Constructors
     * ----------------------------------
     * There are three constructors for the Rational class.  The default
     * constructor creates a Rational with a zero value, the one-argument
     * form converts an integer to a Rational, and the two-argument form
     * allows specifying a fraction.  The constructors ensure that
     * the following invariants are maintained:
     *
     *   1. The fraction is always reduced to lowest terms.
     *   2. The denominator is always positive.
     *   3. Zero is always represented as 0/1.
     */

    Rational::Rational() {
        num = 0;
        den = 1;
    }

    Rational::Rational(int n) {
        num = n;
        den = 1;
    }

    Rational::Rational(int x, int y) {
        if (y == 0) error("Rational: Division by zero");
        if (x == 0) {
            num = 0;
            den = 1;
        } else {
            int g = gcd(abs(x), abs(y));
            num = x / g;
            den = abs(y) / g;
            if (y < 0) num = -num;
        }
    }

    /* Implementation of toString and the << operator */

    string Rational::toString() {
        if (den == 1) {
            return integerToString(num);
        } else {
            return integerToString(num) + "/" + integerToString(den);
        }
    }

    ostream & operator<<(ostream & os, Rational rat) {
        return os << rat.toString();
    }

    /*
     * Implementation notes: arithmetic operators
     * ------------------------------------------
     * The implementation of the operators follows directly from the definitions.
     */

    Rational operator+(Rational r1, Rational r2) {
        return Rational(r1.num * r2.den + r2.num * r1.den, r1.den * r2.den);
    }

    Rational operator-(Rational r1, Rational r2) {
        return Rational(r1.num * r2.den - r2.num * r1.den, r1.den * r2.den);
    }

    Rational operator*(Rational r1, Rational r2) {
        return Rational(r1.num * r2.num, r1.den * r2.den);
    }

    Rational operator/(Rational r1, Rational r2) {
        return Rational(r1.num * r2.den, r1.den * r2.num);
    }

    /*
     * Implementation notes: gcd
     * -------------------------
     * This implementation uses Euclid's algorithm to calculate the
     * greatest common divisor.
     */

    int gcd(int x, int y) {
        int r = x % y;
        while (r != 0) {
            x = y;
            y = r;
            r = x % y;
        }
        return y;
    }

When you implement the operators as free functions, the code for the operators follows directly from the mathematical definitions in Figure 6-7.
For example, the implementation of operator+

    Rational operator+(Rational r1, Rational r2) {
        return Rational(r1.num * r2.den + r2.num * r1.den, r1.den * r2.den);
    }

is a direct translation of the rule for adding the rational numbers r1 and r2:

    \[ r1 + r2 = \frac{r1_{num} \cdot r2_{den} + r2_{num} \cdot r1_{den}}{r1_{den} \cdot r2_{den}} \]

6.4 Designing a token scanner class

In Chapter 3, the most sophisticated example of string processing is the Pig Latin translator. As it appears in Figure 3-2, the PigLatin program decomposes the problem into two phases: the lineToPigLatin function divides the input into words and then calls wordToPigLatin to convert each word to its Pig Latin form. The first phase of this decomposition, however, is not at all specific to the Pig Latin domain. Many applications need to divide a string into words, or more generally, into logical units that may be larger than a single character. In computer science, such units are typically called tokens.

Given that the problem of dividing a string into individual tokens comes up so frequently in applications, it is useful to build a library package that takes care of that task. This section introduces a TokenScanner class designed for that purpose. The primary goal is to build a package that is simple to use but nonetheless flexible enough to meet the needs of a variety of clients.

What clients want from a token scanner

As always, the best way to begin the design of the TokenScanner class is to look at the problem from the client perspective. Every client that wants to use a scanner starts with a source of tokens, which might be a string but might also be an input stream for applications that read data from files. In either case, what the client needs is some way to retrieve individual tokens from that source.

There are several strategies for designing a TokenScanner class that offers the necessary functionality. You could, for example, have the token scanner return a vector containing the entire list of tokens.
That strategy, however, isn’t appropriate for applications that work with large input files, because the scanner has to create a single vector containing the entire list of tokens. A more space-efficient approach is to have the scanner deliver its tokens one at a time. When you use this design, the process of reading tokens from a scanner has the following pseudocode form:

    Set the input for the token scanner to be some string or input stream.
    while (more tokens are available) {
        Read the next token.
    }

This pseudocode structure immediately suggests the sort of methods that the TokenScanner class will have to support. From this example, you would expect TokenScanner to export the following methods:

• A setInput method that allows clients to specify the token source. For maximum flexibility, this method should be overloaded to take either a string or an input stream as its argument.
• A hasMoreTokens method that tests whether the token scanner has any tokens left to process.
• A nextToken method that scans and returns the next token.

These methods define the operational structure of a token scanner and are largely independent of the specifics of the applications. Different applications, however, define tokens in all sorts of different ways, which means that the TokenScanner class must give the client some control over what types of tokens are recognized.

The need to recognize different types of tokens is easiest to illustrate by offering a few examples. As a starting point, it is instructive to revisit the problem of translating English into Pig Latin. If you rewrite the PigLatin program to use the token scanner, you can’t ignore the spaces and punctuation marks, because those characters need to be part of the output. In the context of the Pig Latin problem, tokens fall into one of two categories:

1. A string of consecutive alphanumeric characters representing a word.
2. A single-character string consisting of a space or punctuation mark.
If you gave the token scanner the input

    this is "pig latin"

calling nextToken repeatedly would return the following sequence of nine tokens, in which (space) stands for a token consisting of a single space character:

    this   (space)   is   (space)   "   pig   (space)   latin   "

Other applications, however, are likely to define tokens in different ways. Your C++ compiler, for example, uses a token scanner to break programs into tokens that make sense in the programming context, including identifiers, constants, operators, and other symbols that define the syntactic structure of the language. For example, if you gave the compiler’s token scanner the line

    cout << "hello, world" << endl;

you would like it to deliver up the following sequence of six tokens:

    cout   <<   "hello, world"   <<   endl   ;

There are several differences between these two application domains in the definition of a token. In the Pig Latin translator, anything that’s not a sequence of alphanumeric characters is returned as a single-character token. In the compiler example, the situation is more complicated. For one thing, programming languages often define multicharacter operators like << that must be treated as single tokens. Similarly, the string constant "hello, world" has the correct meaning only if the token scanner treats it as a single entity. Perhaps less obviously, the compiler’s token scanner ignores spaces in the input entirely, unless they appear inside string constants.

As you will learn if you go on to take a course on compilers, it is possible to build a token scanner that allows the client to specify what constitutes a legal token, typically by supplying a precise set of rules. That design offers the greatest possible generality. That generality, however, often comes at the expense of simplicity. If you force clients to specify the rules for token formation, they need to learn how to write those rules, which is similar in many respects to learning a new language.
Worse still, the rules for token formation (particularly if you are trying to specify, for example, the rules that a compiler uses to recognize numbers) are complicated and difficult for clients to get right.

If your goal in the interface is to maximize simplicity, it is probably better to design the TokenScanner class so that clients can enable specific options that allow it to recognize the type of tokens used in specific application contexts. If all you want is a token scanner that collects consecutive alphanumeric characters into words, you use the TokenScanner class in its simplest possible configuration. If you instead want the TokenScanner to identify the units in a C++ program, you enable options that tell the scanner, for example, to ignore whitespace, to treat quoted strings as single units, and to treat certain combinations of punctuation marks as multicharacter operators.

The tokenscanner.h interface

The Stanford C++ library includes a TokenScanner class that offers considerable flexibility without sacrificing simplicity. The methods exported by TokenScanner appear in Table 6-1. Many of the methods in the interface are used to enable options that change the default behavior of the scanner. For example, you can ignore all whitespace in the input stream by initializing a token scanner like this:

    TokenScanner scanner;
    scanner.ignoreWhitespace();

Table 6-1 Methods exported by the library TokenScanner class

Constructors

    TokenScanner()
    TokenScanner(str)
    TokenScanner(infile)
        Initializes a scanner object. The source for the tokens is initialized from the specified string or input file. If no token source is provided, the client must call setInput before reading tokens from the scanner.

Methods for reading tokens

    hasMoreTokens()
        Returns true if there are more tokens to read from the input source.

    nextToken()
        Returns the next token from this scanner. If nextToken is called when no tokens are available, it returns the empty string.

    saveToken(token)
        Saves the specified token as part of this scanner’s internal state so that it will be returned on the next call to nextToken.

Methods for controlling scanner options

    ignoreWhitespace()
        Tells the scanner to ignore whitespace characters.

    ignoreComments()
        Tells the scanner to ignore comments, which can be in either the slash-star or slash-slash form.

    scanNumbers()
        Tells the scanner to recognize any legal number as a single token.

    scanStrings()
        Tells the scanner to return a string enclosed in quotation marks as a single token. The quotation marks (which may be either single or double quotes) are included in the scanned token so that clients can differentiate strings from other token types.

    addWordCharacters(str)
        Adds the characters in str to the set of characters legal in a word.

    addOperator(op)
        Defines a new multicharacter operator.

Miscellaneous methods

    setInput(str)
    setInput(infile)
        Sets the input source for this scanner to the specified string or input stream. Any tokens remaining in the previous source are lost.

    getPosition()
        Returns the current position of the scanner in the input stream.

    isWordCharacter(ch)
        Returns true if the character ch is valid in a word.

    verifyToken(expected)
        Reads the next token and makes sure it matches the string expected.

    getTokenType(token)
        Returns the type of the token, which must be one of the following constants: EOF, SEPARATOR, WORD, NUMBER, STRING, OPERATOR.
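Table 6-1 says only that addOperator defines a new multicharacter operator; it does not say how a scanner should choose when one registered operator is a prefix of another, as with << and <<=. One common rule, assumed here for illustration and not taken from the library's documentation, is to take the longest registered operator that matches at the current position. The function name longestOperator and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical helper: returns the longest string in ops that begins at
// index pos of text. If no registered operator matches there, it returns
// the single character at pos, mirroring the one-character default token.
std::string longestOperator(const std::string & text, std::size_t pos,
                            const std::set<std::string> & ops) {
    assert(pos < text.length());
    std::string best(1, text[pos]);
    for (const std::string & op : ops) {
        // compare() looks only at the op.length() characters starting at pos
        if (op.length() > best.length()
                && text.compare(pos, op.length(), op) == 0) {
            best = op;
        }
    }
    return best;
}
```

Under this rule, with <<, <<=, and <= registered, the characters starting at index 2 of "a <<= b" are scanned as the single token <<= rather than as << followed by =. A linear search over the set is enough for a sketch; a production scanner would more likely extend the candidate one character at a time.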
If you instead want to initialize a TokenScanner so that it adheres to the rules for tokens in C++, you could use the following code:

    TokenScanner scanner;
    scanCPlusPlusTokens(scanner);

The implementation of scanCPlusPlusTokens in Figure 6-11 tells the scanner to ignore whitespace and comments, to scan numbers and strings as single tokens, to accept the underscore as a legal character in an identifier, and to recognize the multicharacter operators shown in the various calls to addOperator (many of which are likely to be unfamiliar but are nonetheless defined in C++).

Figure 6-11 Initializing a TokenScanner to scan C++ tokens

    /*
     * Function: scanCPlusPlusTokens
     * Usage: scanCPlusPlusTokens(scanner);
     * ------------------------------------
     * Sets the necessary options for the scanner so that it can
     * read C++ source code.
     */

    void scanCPlusPlusTokens(TokenScanner & scanner) {
        scanner.ignoreWhitespace();
        scanner.ignoreComments();
        scanner.scanNumbers();
        scanner.scanStrings();
        scanner.addWordCharacters("_");
        scanner.addOperator("++");
        scanner.addOperator("--");
        scanner.addOperator("==");
        scanner.addOperator("!=");
        scanner.addOperator("<=");
        scanner.addOperator(">=");
        scanner.addOperator("<<");
        scanner.addOperator(">>");
        scanner.addOperator("&&");
        scanner.addOperator("||");
        scanner.addOperator("+=");
        scanner.addOperator("-=");
        scanner.addOperator("*=");
        scanner.addOperator("%=");
        scanner.addOperator("^=");
        scanner.addOperator("&=");
        scanner.addOperator("|=");
        scanner.addOperator("<<=");
        scanner.addOperator(">>=");
        scanner.addOperator("->");
        scanner.addOperator("::");
    }

The tokenscanner.h interface makes it much easier to write a variety of applications, including several you have already seen in this book.
You could, for example, use it to simplify the PigLatin program from Figure 3-2 by rewriting the lineToPigLatin function as follows:

    string lineToPigLatin(string line) {
        TokenScanner scanner(line);
        string result = "";
        while (scanner.hasMoreTokens()) {
            string word = scanner.nextToken();
            if (isalpha(word[0])) word = wordToPigLatin(word);
            result += word;
        }
        return result;
    }

The new version of lineToPigLatin is shorter than the original implementation, but the real simplification is conceptual. The original code had to operate at the level of individual characters; the new version gets to work with complete words because the TokenScanner class takes care of the low-level details.

Implementing the TokenScanner class

Given the number of options it supports, the complete implementation of the TokenScanner class is too complicated to serve as an effective example. Figures 6-12 through 6-14 therefore present a simplified version of the token scanner package that defines only the following methods:

• A constructor that takes a string argument, in addition to the default constructor
• The setInput method that sets the scanner input to a string
• The nextToken method, which returns the next token from the string
• The hasMoreTokens method, which allows clients to see if tokens are available
• The ignoreWhitespace method, which tells the scanner to ignore spaces

The ignoreWhitespace method serves as a model for the other option settings that are available in this package, and you will have a chance to implement all of these options in the exercises. Adding the functionality to read tokens from data files, however, depends on concepts that won’t be introduced until Chapter 11, and you will have a chance at that point to finish the implementation of the interface.
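Before turning to the figures, it may help to see the default scanning rule on its own, outside any class. The sketch below is not the package's code: it is a free-function rendering of the two token categories (alphanumeric words, and single characters for everything else), with the scan position passed by reference instead of stored in an instance variable. The countTokens helper is likewise a hypothetical name added only to exercise the loop.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Returns the token that starts at index cp of buffer and advances cp
// past it. Returns the empty string if cp is at or beyond the end.
std::string nextToken(const std::string & buffer, std::size_t & cp) {
    if (cp >= buffer.length()) return "";
    std::size_t start = cp;
    if (std::isalnum(static_cast<unsigned char>(buffer[cp]))) {
        // A word: consume the whole run of letters and digits
        while (cp < buffer.length()
               && std::isalnum(static_cast<unsigned char>(buffer[cp]))) {
            cp++;
        }
    } else {
        // Anything else: a one-character token
        cp++;
    }
    assert(cp > start);    // each call consumes at least one character
    return buffer.substr(start, cp - start);
}

// Hypothetical helper: counts tokens by reading them one at a time.
int countTokens(const std::string & buffer) {
    std::size_t cp = 0;
    int count = 0;
    while (cp < buffer.length()) {
        nextToken(buffer, cp);
        count++;
    }
    return count;
}
```

Applied to the input this is "pig latin", countTokens returns 9, matching the nine-token sequence described earlier. The class version in the figures keeps the buffer and the index together as private state, which is what lets clients call nextToken without carrying the position around themselves.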
Figure 6-12 Simplified interface for the TokenScanner class

    /*
     * File: tokenscanner.h
     * --------------------
     * This file exports a simple TokenScanner class that divides a string
     * into individual logical units called tokens.  In this version
     * of the package, tokens come in two forms:
     *
     *   1. Strings of consecutive letters and digits representing words
     *   2. One-character strings representing punctuation or separators
     *
     * The use of the TokenScanner class is illustrated by the following code
     * pattern, which reads the tokens in the string variable input:
     *
     *     TokenScanner scanner;
     *     scanner.setInput(input);
     *     while (scanner.hasMoreTokens()) {
     *         string token = scanner.nextToken();
     *         . . . process the token . . .
     *     }
     *
     * This version of the TokenScanner class includes the ignoreWhitespace
     * method.  The other options available in the library version of the
     * class are included as exercises in the text.
     */

    #ifndef _tokenscanner_h
    #define _tokenscanner_h

    #include <string>

    /*
     * Class: TokenScanner
     * -------------------
     * This class is used to represent a single instance of a scanner.
     */

    class TokenScanner {

    public:

    /*
     * Constructor: TokenScanner
     * Usage: TokenScanner scanner;
     *        TokenScanner scanner(str);
     * ---------------------------------
     * Initializes a scanner object.  The initial token stream comes from
     * the string str, if it is specified.  The default constructor creates
     * a scanner with an empty token stream.
     */

        TokenScanner();
        TokenScanner(std::string str);

    /*
     * Method: setInput
     * Usage: scanner.setInput(str);
     * -----------------------------
     * Sets the input for this scanner to the specified string.  Any
     * previous input string is discarded.
     */

        void setInput(std::string str);

    /*
     * Method: hasMoreTokens
     * Usage: if (scanner.hasMoreTokens()) . . .
     * -----------------------------------------
     * Returns true if there are additional tokens for this scanner to read.
     */

        bool hasMoreTokens();

    /*
     * Method: nextToken
     * Usage: token = scanner.nextToken();
     * -----------------------------------
     * Returns the next token from this scanner.  If called when no tokens
     * are available, nextToken returns the empty string.
     */

        std::string nextToken();

    /*
     * Method: ignoreWhitespace()
     * Usage: scanner.ignoreWhitespace();
     * ----------------------------------
     * Tells the scanner to ignore whitespace characters.  By default, the
     * nextToken method treats whitespace characters (typically spaces and
     * tabs) just like any other punctuation mark and returns them as
     * single-character tokens.  Calling
     *
     *     scanner.ignoreWhitespace();
     *
     * changes this behavior so that the scanner ignores whitespace characters.
     */

        void ignoreWhitespace();

    #include "tokenscannerpriv.h"

    };

    #endif

Figure 6-13 The private section of the TokenScanner class

    /*
     * File: tokenscannerpriv.h
     * ------------------------
     * This file contains the private section for the TokenScanner class.
     */

    private:

    /* Instance variables */

        std::string buffer;          /* The input string containing the tokens */
        int cp;                      /* The current position in the buffer     */
        bool ignoreWhitespaceFlag;   /* Flag set by a call to ignoreWhitespace */

    /* Private methods */

        void skipWhitespace();

Figure 6-14 Implementation of the simplified TokenScanner class

    /*
     * File: tokenscanner.cpp
     * ----------------------
     * This file implements the TokenScanner class.  Most of the methods
     * are straightforward enough to require no additional documentation.
     */

    #include <cctype>
    #include <string>
    #include "tokenscanner.h"
    using namespace std;

    /*
     * Both constructors clear ignoreWhitespaceFlag explicitly, because
     * instance variables of primitive types have no defined initial value.
     */

    TokenScanner::TokenScanner() {
        ignoreWhitespaceFlag = false;
        setInput("");
    }

    TokenScanner::TokenScanner(string str) {
        ignoreWhitespaceFlag = false;
        setInput(str);
    }

    void TokenScanner::setInput(string str) {
        buffer = str;
        cp = 0;
    }

    bool TokenScanner::hasMoreTokens() {
        if (ignoreWhitespaceFlag) skipWhitespace();
        return cp < buffer.length();
    }

    /*
     * Implementation notes: nextToken
     * -------------------------------
     * This method starts by looking at the character at the current position
     * indicated by the index cp.  If that index is past the end of the string,
     * nextToken returns the empty string.  If that character is alphanumeric,
     * nextToken scans ahead until it finds the end of that word; if not,
     * nextToken returns that character as a one-character string.
     */

    string TokenScanner::nextToken() {
        if (ignoreWhitespaceFlag) skipWhitespace();
        if (cp >= buffer.length()) {
            return "";
        } else if (isalnum(buffer[cp])) {
            int start = cp;
            while (cp < buffer.length() && isalnum(buffer[cp])) {
                cp++;
            }
            return buffer.substr(start, cp - start);
        } else {
            return string(1, buffer[cp++]);
        }
    }

    void TokenScanner::ignoreWhitespace() {
        ignoreWhitespaceFlag = true;
    }

    /*
     * Implementation notes: skipWhitespace
     * ------------------------------------
     * This method is a private method and is therefore not exported.  It
     * does, however, need to be declared in the private section of the
     * class, which is contained in the tokenscannerpriv.h file.  This
     * method is called from both hasMoreTokens and nextToken.
     */

    void TokenScanner::skipWhitespace() {
        while (cp < buffer.length() && isspace(buffer[cp])) {
            cp++;
        }
    }

6.5 Encapsulating programs as classes

Most of the class definitions you have seen in this chapter create new abstract types that you can then use as if they were primitive objects.
Once you have defined the Rational class, for example, you can then use Rational objects in much the same ways that you use the primitive types in C++. You can declare variables that hold Rational values, assign new values to those variables, combine them using operators, print them on cout, and store them in any of the collection classes. Programs that work with rational numbers typically create many Rational objects, all of which are instances of the same class.

Classes, however, can still be useful even if you never intend to have more than one object of a particular class. For example, it often makes sense to write a program as a class rather than as a collection of free functions. The primary advantage of doing so is that classes provide better encapsulation. The fact that access to any private data is limited to the class itself means that it is much safer to use private instance variables to share information than it is to use global variables, which offer no such security.

As an illustration of this technique, the program in Figure 6-15 shows how to redesign the checkout-line simulation from Figure 5-5 as a class. In the new design, the free functions runSimulation and printReport become public methods in a new CheckoutLineSimulation class. Given that the bodies of those methods are exactly the same as they were before, this change alone has little impact on the complexity of the code. What has changed, however, is that the information shared between these functions can now be stored in instance variables and need not be passed as arguments. Being able to share access to such data among all the methods in the class substantially reduces the size and complexity of the parameter lists, which in this example shrink from three parameters to zero.

When you use this approach, the main function typically becomes considerably shorter.
It declares an object of the class that encapsulates the program operation and then calls the public methods necessary to get it going, as illustrated by the definition of main in Figure 6-15:

    int main() {
        CheckoutLineSimulation simulation;
        simulation.runSimulation();
        simulation.printReport();
        return 0;
    }

The advantage of using a class increases along with the complexity of the program. Programs in this book use this technique only if doing so simplifies the code.

Figure 6-15 Class-based version of the checkout-line simulation

    /*
     * File: CheckoutLineClass.cpp
     * ---------------------------
     * This program duplicates the CheckoutLine program from Chapter 5,
     * but embeds the entire program in a class definition.
     */

    #include <iostream>
    #include <iomanip>
    #include "queue.h"
    #include "random.h"
    using namespace std;

    /* Constants */

    const double ARRIVAL_PROBABILITY = 0.05;
    const int MIN_SERVICE_TIME = 5;
    const int MAX_SERVICE_TIME = 15;
    const int SIMULATION_TIME = 2000;

    /*
     * Class: CheckoutLineSimulation
     * -----------------------------
     * This class encapsulates the code and data for the simulation.
     */

    class CheckoutLineSimulation {

    public:

        void runSimulation() {
            . . . same as in Figure 5-5 . . .
        }

        void printReport() {
            . . . same as in Figure 5-5 . . .
        }

    private:

        int nServed;       /* Number of customers served                */
        int totalWait;     /* Sum of all customer waiting times         */
        int totalLength;   /* Sum of the queue length at each time step */

    };

    /* Main program */

    int main() {
        CheckoutLineSimulation simulation;
        simulation.runSimulation();
        simulation.printReport();
        return 0;
    }

Summary

The primary purpose of this chapter is to give you the tools you need to design and implement classes on your own. The examples in this chapter focus on classes that encapsulate data and operations into a coherent whole, deferring the more complex issue of inheritance to Chapter 17.
Important points covered in this chapter include:

• In many applications, it is useful to combine several independent data values into a single abstract data type. C++ offers several strategies for encapsulating data in this way. At the lowest level, C++ continues to support the definition of C-style structure types. In modern programming practice, however, this kind of encapsulation is more commonly accomplished using classes.
• In C++, a class is divided into sections that control the access clients have to the fields and methods in that section. The public section of a class is accessible to all clients; the private section is accessible only to the implementation. In modern object-oriented programming, instance variables are declared in the private section. One class can give other functions and classes access to its private data by declaring them as friends.
• Given a compound object that is either a structure or a class, you select the individual components using the dot operator. Clients can select a field from a compound object only if that field is in the public section. The implementation of a class, however, has access to the private members of all objects of that class.
• Class definitions typically export one or more constructors that are responsible for initializing objects of that class. In general, all class definitions include a default constructor that takes no arguments.
• Methods that give clients access to the values of the instance variables are called getters; methods that allow clients to change the value of an instance variable are called setters. A class that gives the client no opportunity to change the value of an object after it is created is said to be immutable.
• Class definitions exported by an interface typically separate the definition of class methods between the interface and the implementation. The .h file contains only method prototypes; the bodies of those methods go in the .cpp file.
  In C++, the implementation file must specify the class to which each method belongs by adding a :: tag before the method name.

• Class definitions can overload the standard operators in either of two ways. Defining an operator as a class method means that the operator is part of the class and therefore has free access to the private methods. Defining an operator as a free function often produces code that is easier to read but means that the operator function must be designated as a friend to refer to the private data.

• One of the most useful operators to overload is the insertion operator <<, because doing so makes it easy to display values of that class on the console. In this text, most classes overload the << operator and define a toString method that converts a value of that class to a string.

• Designing new classes is as much an art as a science. Although the chapter offers some general guidelines to guide you in this process, experience and practice are the best teachers.

• The Stanford libraries export a TokenScanner class that supports the process of breaking input text into individual units called tokens. The library version of the TokenScanner class supports a variety of options that make this package useful in a wide range of applications.

• For applications that are complex enough to require maintaining more than a modest amount of internal state, it often makes sense to encapsulate the entire program inside a class. When you use such a design, the main program declares a variable of that class and then invokes some method to start it up.

Review questions

1. Define each of the following terms: object, structure, class, instance variable, method.

2. In a C++ class definition, what do the keywords public and private mean?

3. True or false: In C++, the only difference between the keyword struct and the keyword class is that struct makes fields public by default.

4. What operator does C++ use to select an instance variable from an object?

5. What is the syntax for a C++ constructor?

6. How many arguments are passed to the default constructor?

7. What are getters and setters?

8. What does it mean for a class to be immutable?

9. When you separate the interface and implementation of a class, how does the implementation let the compiler know to which class a particular method definition belongs?

10. What strategy is used in the .h files in this chapter to prevent clients from seeing the contents of the private section?

11. In C++, what method name would you use to overload the % operator?

12. How does C++ differentiate between the prefix and suffix versions of the ++ and -- operators?

13. Why does the overloaded implementation of the << operator require the use of return by reference?

14. True or false: Return by reference is used as frequently in C++ programs as call by reference.

15. Describe the differences between the method-based and free-function-based approaches to overloading the operators for a class. What are the advantages and disadvantages of each style?

16. What does it mean for one class to declare a method or another class as a friend?

17. What reason does this chapter offer for overloading the ++ operator for the Direction type?

18. What are the five steps suggested in this chapter as a guideline for designing a class?

19. What is a rational number?

20. What restrictions does the Rational constructor place on the values of the num and den variables?

21. The code for the Rational constructor on page 288 includes an explicit check to see whether x is zero. Would the Rational class still work the same way if this check were eliminated?

22. In the rationalpriv.h file in Figure 6-8, why is it necessary to designate the operator methods for +, -, *, and / as friends but not the operator method for <

Exercises

[...] that holds the coordinates of each of these pegs, which are inserted into the vector starting at the upper left and then proceeding clockwise around the edges of the rectangle, as follows:

    0  1  2  3  4  5  6  7  8  9
   27                          10
   26                          11
   25                          12
   24                          13
   23 22 21 20 19 18 17 16 15 14

From here, you create a figure by drawing lines between the pegs, starting at peg 0 and then moving ahead a fixed number of spaces on each cycle, as specified by the constant DELTA. For example, if DELTA is 11, the first line goes from peg 0 to peg 11, the second goes from peg 11 to peg 22, and the third—which has to count 11 pegs clockwise past the beginning—goes from peg 22 to peg 5. The process continues in this way until the line returns to peg 0. As usual, implementing the wrap-around feature is much easier if you make use of the % operator.

Write a program that simulates this process on the graphics window using larger values for N_ACROSS and N_DOWN. The output of the program with N_ACROSS equal to 50, N_DOWN equal to 30, and DELTA equal to 67 appears in Figure 6-16. By changing those constants, you can create other wonderful patterns composed entirely of straight lines.

[Figure 6-16. Sample run of the yarn-pattern program]

3. The game of dominos is played using pieces that are usually black rectangles with some number of white dots on each side. For example, the domino is called the 4-1 domino, with four dots on its left side and one on its right. Define a simple Domino class that represents a traditional domino.
Your class should export the following entries:

• A default constructor that creates the 0-0 domino
• A constructor that takes the number of dots on each side
• A toString method that creates a string representation of the domino
• Two getter methods named getLeftDots and getRightDots

All instance variables should be private to the class.

Write the domino.h interface and the domino.cpp implementation that export this class. As with the examples in the text, the interface should overload the << operator so that it is possible to print the string representation of a domino on any output stream.

Test your implementation of the Domino class by writing a program that creates a full set of dominos from 0-0 up to 6-6 and then displays those dominos on the console. A full set of dominos contains one copy of each possible domino in that range, disallowing duplicates that result from flipping a domino over. A domino set, therefore, has a 4-1 domino but not a separate 1-4 domino.

4. Define a Card class suitable for representing a standard playing card, which is identified by two components: a rank and a suit. The rank is stored as an integer between 1 and 13 in which an ace is a 1, a jack is an 11, a queen is a 12, and a king is 13. The suit is one of the four constants in the following enumeration type:

   enum Suit { CLUBS, DIAMONDS, HEARTS, SPADES };

The Card class should export the following methods:

• A default constructor that creates a card that can later be assigned a value
• A constructor that takes a short string name like "10S" or "JD"
• A constructor that takes separate values for the rank and the suit
• A toString method that returns the short string representation of the card
• The getter methods getRank and getSuit

Write the card.h interface and the card.cpp implementation necessary to export the Card class.
In addition to the Card class itself, the card.h interface should export the Suit type, constant names for the ranks that are usually named rather than numbered (ACE, JACK, QUEEN, KING), and any other definitions you need to run the following main program:

   int main() {
      for (Suit suit = CLUBS; suit <= SPADES; suit++) {
         for (int rank = ACE; rank <= KING; rank++) {
            cout << " " << Card(rank, suit);
         }
         cout << endl;
      }
      return 0;
   }

This program should produce the following sample run:

   TestCardClass
    AC 2C 3C 4C 5C 6C 7C 8C 9C 10C JC QC KC
    AD 2D 3D 4D 5D 6D 7D 8D 9D 10D JD QD KD
    AH 2H 3H 4H 5H 6H 7H 8H 9H 10H JH QH KH
    AS 2S 3S 4S 5S 6S 7S 8S 9S 10S JS QS KS

5. Extend the calendar.h interface from Chapter 2, exercise 11 so that it also exports a Date class that exports the following methods:

• A default constructor that sets the date to January 1, 1970, which is used as the base of many computerized time standards.

• A constructor that takes a month, day, and year and initializes the Date to contain those values. For example, the declaration

   Date moonLanding(JULY, 20, 1969);

  should initialize moonLanding so that it represents July 20, 1969.

• An overloaded version of the constructor that takes the first two parameters in the opposite order, for the benefit of clients in other parts of the world. This change allows the declaration of moonLanding to be written as

   Date moonLanding(20, JULY, 1969);

• The getter methods getDay, getMonth, and getYear.

• A toString method that returns the date in the form dd-mmm-yyyy, where dd is a one- or two-digit date, mmm is the three-letter English abbreviation for the month, and yyyy is the four-digit year. Thus, calling moonLanding.toString() should return the string "20-Jul-1969".

6. Extend the calendar.h interface still further by adding overloaded versions of the following operators:

• The insertion operator <<.
• The relational operators ==, !=, <, <=, >, and >=
• The expression date + n, which returns the date n days after date
• The expression date - n, which returns the date n days before date
• The expression d1 - d2, which returns how many days separate d1 and d2
• The shorthand assignment operators += and -= with an integer on the right
• The ++ and -- operators in both their prefix and suffix form.

Suppose, for example, that you have made the following definitions:

   Date electionDay(6, NOVEMBER, 2012);
   Date inaugurationDay(21, JANUARY, 2013);

Given these values of the variables, electionDay < inaugurationDay is true because electionDay comes before inaugurationDay. Evaluating inaugurationDay - electionDay returns 76, which is the number of days between the two events. The definitions of these operators, moreover, allow you to write a for loop like:

   for (Date d = electionDay; d <= inaugurationDay; d++)

which would cycle through each of these days, including both endpoints.

7. The Rational class presented in the text defines the operators +, -, *, / but needs several other operators for completeness, including the following:

• The relational operators ==, !=, <, <=, >, and >=
• The shorthand assignment operators +=, -=, *=, and /=
• The ++ and -- operators in both their prefix and suffix form.

Add these operators to the interface and implementation.

8. Reimplement the RPN calculator from Figure 5-4 so that it performs its internal calculations using rational instead of floating-point numbers. For example, your program should be able to produce the following sample run (which demonstrates that rational arithmetic is always exact):

   RationalRPNCalculator
   RPN Calculator Simulation (type H for help)
   > 1
   > 2
   > /
   1/2
   > 1
   > 3
   > /
   1/3
   > 1
   > 6
   > /
   1/6
   > +
   1/2
   > +
   1
   > Q

9. For certain applications, it is useful to be able to generate a series of names that form a sequential pattern.
For example, if you were writing a program to number figures in a paper, having some mechanism to return the sequence of strings "Figure 1", "Figure 2", "Figure 3", and so on, would be very handy. However, you might also need to label points in a geometric diagram, in which case you would want a similar but independent set of labels for points such as "P0", "P1", "P2", and so forth. If you think about this problem more generally, the tool you need is a label generator that allows the client to define arbitrary sequences of labels, each of which consists of a prefix string ("Figure " or "P" for the examples in the preceding paragraph) coupled with an integer used as a sequence number. Because the client may want different sequences to be active simultaneously, it makes sense to define the label generator as an abstract type called LabelGenerator. To initialize a new generator, the client provides the prefix string and the initial index as arguments to the LabelGenerator constructor. Once the generator has been created, the client can return new labels in the sequence by calling nextLabel on the LabelGenerator. 
As an illustration of how the interface works, the main program shown in Figure 6-17 produces the following sample run:

   TestLabelGenerator
   Figure numbers: Figure 1, Figure 2, Figure 3
   Point numbers: P0, P1, P2, P3, P4
   More figures: Figure 4, Figure 5, Figure 6

Figure 6-17. Main program to test the label generator

   int main() {
      LabelGenerator figureNumbers("Figure ", 1);
      LabelGenerator pointNumbers("P", 0);
      cout << "Figure numbers: ";
      for (int i = 0; i < 3; i++) {
         if (i > 0) cout << ", ";
         cout << figureNumbers.nextLabel();
      }
      cout << endl << "Point numbers: ";
      for (int i = 0; i < 5; i++) {
         if (i > 0) cout << ", ";
         cout << pointNumbers.nextLabel();
      }
      cout << endl << "More figures: ";
      for (int i = 0; i < 3; i++) {
         if (i > 0) cout << ", ";
         cout << figureNumbers.nextLabel();
      }
      cout << endl;
      return 0;
   }

Write the files labelgen.h and labelgen.cpp that define and implement this class.

10. Write a program that checks the spelling of all words in a file. Your program should use the TokenScanner class to read tokens from an input file and then look up each word in the lexicon stored in the file EnglishWords.dat introduced in Chapter 5. If the word does not appear in the lexicon, your program should print a message to that effect. If, for example, you run the program on a file containing the text of this paragraph, the SpellCheck program would produce the following output:

   SpellCheck
   Input file: SampleParagraph.txt
   "TokenScanner" is not in the dictionary
   "EnglishWords" is not in the dictionary
   "dat" is not in the dictionary
   "SpellCheck" is not in the dictionary

11. Write a program that implements a simple arithmetic calculator. Input to the calculator consists of lines composed of numbers (either integers or reals) combined together using the arithmetic operators +, -, *, and /. For each line of input, your program should display the result of applying the operators from left to right.
You should use the token scanner to read the terms and operators and set up the scanner so that it ignores any whitespace characters. Your program should exit when the user enters a blank line. A sample run of your program might look like this:

   ExpressionCalculator
   > 2 + 2
   4
   > 355 / 113
   3.14159
   > 20.4 - 3.6 * 2.5
   42
   > 4+9-2*16+1/3*6-67+8*2-3+26-1/34+3/7+2-5
   0
   >

The last line in this sample run is the arithmetic problem the Mathemagician gives to Milo in Norton Juster’s children’s story, The Phantom Tollbooth.

12. Extend the program you wrote for the preceding exercise so that the terms in the expressions can also be variable names assigned earlier in the session by entering lines like

   var = exp

as shown in the following sample run:

   ExpressionCalculator
   > pi = 3.1415926535
   > r = 1.5
   > area = pi * r * r
   > area
   7.06858
   >

13. Implement the saveToken method for the TokenScanner class. This method saves the specified token so that subsequent calls to nextToken return the saved token without consuming any additional characters from the input. Your implementation should allow clients to save multiple tokens, which are then returned so that the last token saved is the first token returned.

14. Using ignoreWhitespace as a model, implement the ignoreComments method for the TokenScanner class. If ignoreComments is in effect, the token scanner should ignore any characters in comments marked using either the /* . . . */ or // style.

15. Implement the scanNumbers method for the TokenScanner class, which causes the token scanner to read any valid C++ number as a single token. The difficult part of this extension lies in understanding the rules for what constitutes a valid numeric string and then finding a way to implement those rules efficiently. The easiest way to specify those rules is in a form that computer scientists call a finite-state machine, which is usually represented diagrammatically as a collection of circles representing the possible states of the machine.
The circles are then connected by a set of labeled arcs that indicate how the process moves from one state to another. A finite-state machine for scanning a real number appears in Figure 6-18. When you use a finite-state machine, you start in state s0 and then follow the labeled arcs for each character in the input until there is no arc that matches the current character. If you end up in a state marked by a double circle, you have successfully scanned a number. These states that indicate successful scanning of a token are called final states. Figure 6-18 includes three examples that show how the finite-state machine scans numbers of various kinds.

Figure 6-18. Finite-state machine for scanning numbers

[State diagram with states s0 through s5 connected by arcs labeled digit, ., E, and +, -]

Examples:

   1729            3.1416          3.0E+9
   state  char     state  char     state  char
   s0     1        s0     3        s0     3
   s1     7        s1     .        s1     .
   s1     2        s2     1        s2     0
   s1     9        s2     4        s2     E
   s1              s2     1        s4     +
                   s2     6        s5     9
                   s2              s5

The easiest way to write the code that scans a token when scanNumbers is in effect is to simulate the operation of the finite-state machine. Your code should keep track of the current state and then go through the input one character at a time. Each character will either signal the end of the number or send the machine into a new state.

16. Implement the scanStrings method for the TokenScanner class. When scanStrings is in effect, the token scanner should return quoted strings as single tokens. The strings may use either single or double quotation marks and should include the quotation marks in the string that nextToken returns.

Chapter 7
Introduction to Recursion

And often enough, our faith beforehand in a certain result is the only thing that makes the result come true.
— William James, The Will To Believe, 1897

Most algorithmic strategies used to solve programming problems have counterparts outside the domain of computing. When you perform a task repeatedly, you are using iteration.
When you make a decision, you exercise conditional control. Because these operations are familiar, most people learn to use the control statements for, while, and if with relatively little trouble. Before you can solve many sophisticated programming tasks, however, you will have to learn to use a powerful problem-solving strategy that has few direct counterparts in the real world. That strategy, called recursion, is defined as any solution technique in which large problems are solved by reducing them to smaller problems of the same form. The italicized phrase is crucial to the definition, which otherwise describes the basic strategy of stepwise refinement. Both strategies involve decomposition. What makes recursion special is that the subproblems in a recursive solution have the same form as the original problem. If you are like most beginning programmers, the idea of breaking a problem down into subproblems of the same form does not make much sense when you first hear it. Unlike repetition or conditional testing, recursion is not a concept that comes up in day-to-day life. Because it is unfamiliar, learning how to use recursion can be difficult. To do so, you must develop the intuition necessary to make recursion seem as natural as all the other control structures. For most students of programming, reaching that level of understanding takes considerable time and practice. Even so, learning to use recursion is definitely worth the effort. As a problem-solving tool, recursion is so powerful that it at times seems almost magical. In addition, using recursion often makes it possible to write complex programs in simple and profoundly elegant ways. 7.1 A simple example of recursion To gain a better sense of what recursion is, let’s imagine that you have been appointed as the funding coordinator for a large charitable organization that is long on volunteers and short on cash. Your job is to raise $1,000,000 in contributions so the organization can meet its expenses. 
If you know someone who is willing to write a check for the entire $1,000,000, your job is easy. On the other hand, you may not be lucky enough to have friends who are generous millionaires. In that case, you must raise the $1,000,000 in smaller amounts. If the average contribution to your organization is $100, you might choose a different tack: call 10,000 friends and ask each of them for $100. But then again, you probably don’t have 10,000 friends. So what can you do?

As is often the case when you are faced with a task that exceeds your own capacity, the answer lies in delegating part of the work to others. Your organization has a reasonable supply of volunteers. If you could find 10 dedicated supporters in different parts of the country and appoint them as regional coordinators, each of those 10 people could then take responsibility for raising $100,000.

Raising $100,000 is simpler than raising $1,000,000, but it hardly qualifies as easy. What should your regional coordinators do? If they adopt the same strategy, they will in turn delegate parts of the job. If they each recruit 10 fundraising volunteers, those people will only have to raise $10,000. The delegation process can continue until the volunteers are able to raise the money on their own; because the average contribution is $100, the volunteer fundraisers can probably raise $100 from a single donor, which eliminates the need for further delegation.

If you express this fundraising strategy in pseudocode, it has the following structure:

   void collectContributions(int n) {
      if (n <= 100) {
         Collect the money from a single donor.
      } else {
         Find 10 volunteers.
         Get each volunteer to collect n/10 dollars.
         Combine the money raised by the volunteers.
      }
   }

The most important thing to notice about this pseudocode translation is that the line

   Get each volunteer to collect n/10 dollars.

is simply the original problem reproduced at a smaller scale.
The basic character of the task—raise n dollars—remains exactly the same; the only difference is that n has a smaller value. Moreover, because the problem is the same, you can solve it by calling the original function. Thus, the preceding line of pseudocode would eventually be replaced with the following line:

   collectContributions(n / 10);

It’s important to note that the collectContributions function ends up calling itself if the contribution level is greater than $100. In the context of programming, having a function call itself is the defining characteristic of recursion.

The structure of the collectContributions function is typical of recursive functions. In general, the body of a recursive function has the following form:

   if (test for simple case) {
      Compute a simple solution without using recursion.
   } else {
      Break the problem down into subproblems of the same form.
      Solve each of the subproblems by calling this function recursively.
      Reassemble the subproblem solutions into a solution for the whole.
   }

This structure provides a template for writing recursive functions and is therefore called the recursive paradigm. You can apply this technique to programming problems as long as they meet the following conditions:

1. You must be able to identify simple cases for which the answer is easily determined.

2. You must be able to identify a recursive decomposition that allows you to break any complex instance of the problem into simpler problems of the same form.

The collectContributions example illustrates the power of recursion. As in any recursive technique, the original problem is solved by breaking it down into smaller subproblems that differ from the original only in their scale. Here, the original problem is to raise $1,000,000. At the first level of decomposition, each subproblem is to raise $100,000.
These problems are then subdivided in turn to create smaller problems until the problems are simple enough to be solved immediately without recourse to further subdivision. Because the solution depends on dividing hard problems into simpler instances of the same problem, recursive solutions of this form are often called divide-and-conquer algorithms.

7.2 The factorial function

Although the collectContributions example illustrates the idea of recursion, it gives little insight into how recursion is used in practice, mostly because the steps that make up the solution, such as finding 10 volunteers and collecting money, are not easily represented in a C++ program. To get a practical sense of the nature of recursion, you need to consider problems that fit more easily into the programming domain.

For most people, the best way to understand recursion is to start with simple mathematical functions in which the recursive structure follows directly from the statement of the problem and is therefore easy to see. Of these, the most common is the factorial function—traditionally denoted in mathematics as n!—which is defined as the product of the integers between 1 and n. In C++, the equivalent problem is to write an implementation of a function with the prototype

   int fact(int n);

that takes an integer n and returns its factorial.

As you probably know from your programming experience, it is easy to code the fact function using a for loop, as illustrated by the following implementation:

   int fact(int n) {
      int result = 1;
      for (int i = 1; i <= n; i++) {
         result *= i;
      }
      return result;
   }

This implementation uses a for loop to cycle through each of the integers between 1 and n. In the recursive implementation, this loop does not exist. The same effect is generated instead by the cascading recursive calls.

Implementations that use looping (typically by using for and while statements) are said to be iterative.
Iterative and recursive strategies are often seen as opposites because they can be used to solve the same problem in rather different ways. These strategies, however, are not mutually exclusive. Recursive functions sometimes employ iteration internally, and you will see examples of this technique in Chapter 8.

The recursive formulation of fact

The iterative implementation of fact, however, does not take advantage of an important mathematical property of factorials. Each factorial is related to the factorial of the next smaller integer in the following way:

   n! = n × (n – 1)!

Thus, 4! is 4 × 3!, 3! is 3 × 2!, and so on. To make sure that this process stops at some point, mathematicians define 0! to be 1. Thus, the conventional mathematical definition of the factorial function looks like this:

   n! = 1                if n = 0
        n × (n – 1)!     otherwise

This definition is recursive, because it defines the factorial of n in terms of the factorial of n – 1. The new problem—finding the factorial of n – 1—has the same form as the original problem, which is the fundamental characteristic of recursion. You can then use the same process to define (n – 1)! in terms of (n – 2)!. Moreover, you can carry this process forward step by step until the solution is expressed in terms of 0!, which is equal to 1 by definition.

From your perspective as a programmer, the practical impact of the mathematical definition is that it provides a template for a recursive implementation. In C++, you can implement a function fact that computes the factorial of its argument as follows:

   int fact(int n) {
      if (n == 0) {
         return 1;
      } else {
         return n * fact(n - 1);
      }
   }

If n is 0, the result of fact is 1. If not, the implementation computes the result by calling fact(n - 1) and then multiplying the result by n. This implementation follows directly from the mathematical definition of the factorial function and has precisely the same recursive structure.
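The recursive and iterative formulations can be checked against each other. The following is a minimal sketch rather than code from the text: the name factIterative and the assert precondition are additions made here for illustration, and the comments about range assume an int of 32 bits, which the C++ standard does not guarantee.

```cpp
#include <cassert>

// Recursive factorial, mirroring the mathematical definition.
// Assumption (not in the text): n is nonnegative. With a 32-bit int the
// result is exact only for n <= 12, because 13! exceeds 2147483647.
int fact(int n) {
    assert(n >= 0);               // a negative n would never reach the simple case
    if (n == 0) {
        return 1;                 // simple case: 0! is 1 by definition
    } else {
        return n * fact(n - 1);   // recursive case: n! = n * (n - 1)!
    }
}

// Iterative factorial (hypothetical name), used only as a cross-check.
int factIterative(int n) {
    assert(n >= 0);
    int result = 1;
    for (int i = 1; i <= n; i++) {
        result *= i;
    }
    return result;
}
```

Under these assumptions, fact(4) evaluates to 24 and fact(12) to 479001600, and the two functions agree for every n from 0 through 12.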
Tracing the recursive process

If you work from the mathematical definition, writing the recursive implementation of fact is straightforward. On the other hand, even though the definition is easy to write, the brevity of the solution may seem suspicious. When you are learning about recursion for the first time, the recursive implementation of fact seems to leave something out. Even though it clearly reflects the mathematical definition, the recursive formulation makes it hard to identify where the actual computational steps occur. When you call fact, for example, you want the computer to give you the answer. In the recursive implementation, all you see is a formula that transforms one call to fact into another one. Because the steps in that calculation are not explicit, it seems somewhat magical when the computer gets the right answer.

If you follow through the logic the computer uses to evaluate any function call, however, you discover that no magic is involved. When the computer evaluates a call to the recursive fact function, it goes through the same process it uses to evaluate any other function call. To visualize the process, suppose that you have executed the statement

   cout << "fact(4) = " << fact(4) << endl;

as part of the function main. When main calls fact, the computer creates a new stack frame and copies the argument value into the formal parameter n. The frame for fact temporarily supersedes the frame for main, as shown in the following diagram:

[Diagram: the stack frame for main, covered by a frame for fact in which n = 4. The current-position marker is at the start of the body of fact.]

In the diagram, the code for the body of fact is shown inside the frame to make it easier to keep track of the current position in the program. In this diagram, the current position indicator appears at the beginning of the code because all function calls start at the first statement of the function body.
The computer now begins to evaluate the body of the function, starting with the if statement. Because n is not equal to 0, control proceeds to the else clause, where the program must evaluate and return the value of the expression

   n * fact(n - 1)

Evaluating this expression requires computing the value of fact(n - 1), which introduces a recursive call. When that call returns, all the program has to do is to multiply the result by n. The current state of the computation can therefore be diagrammed as follows:

[Diagram: the same frame for fact with n = 4. The call fact(n - 1) in the return statement is underlined and marked with a question mark because its value is not yet known.]

As soon as the call to fact(n - 1) returns, the result is substituted for the expression underlined in the diagram, allowing computation to proceed.

The next step in the computation is to evaluate the call to fact(n - 1), beginning with the argument expression. Because the current value of n is 4, the argument expression n - 1 has the value 3. The computer then creates a new frame for fact in which the formal parameter is initialized to this value. Thus, the next frame looks like this:

[Diagram: a new frame for fact with n = 3, stacked on top of the frame for fact(4) and the frame for main. The current-position marker is at the start of the body of fact.]

There are now two frames labeled fact. In the most recent one, the computer is just starting to calculate fact(3). This new frame hides the previous frame for fact(4), which will not reappear until the fact(3) computation is complete.

Computing fact(3) again begins by testing the value of n. Since n is still not 0, the else clause instructs the computer to evaluate fact(n - 1).
As before, this process requires the creation of a new stack frame, as shown:

[Diagram: a new frame for fact with n = 2, stacked on top of the frames for fact(3), fact(4), and main. The current-position marker is at the start of the body of fact.]

Following the same logic, the program must now call fact(1), which in turn calls fact(0), creating two new stack frames, as follows:

[Diagram: five frames for fact stacked on top of the frame for main. The topmost frame has n = 0, and the current-position marker is at the start of its body.]

At this point, however, the situation changes. Because the value of n is 0, the function can return its result immediately by executing the statement

   return 1;

The value 1 is returned to the calling frame, which resumes its position on top of the stack, as shown:

[Diagram: the frame for fact with n = 1 is again on top of the stack. The value 1 has been substituted for the call fact(n - 1) in its return statement.]

From this point, the computation proceeds back through each of the recursive calls, completing the calculation of the return value at each level. In this frame, for example, the call to fact(n - 1) can be replaced by the value 1, as shown in the diagram for the stack frame. In this stack frame, n has the value 1, so the result of this call is simply 1.
This result gets propagated back to its caller, which is represented by the top frame in the following diagram:

[Diagram: the frame for fact with n = 2 on top of the stack, with the value 1 substituted for the call fact(n - 1).]

Because n is now 2, evaluating the return statement causes the value 2 to be passed back to the previous level, as follows:

[Diagram: the frame for fact with n = 3 on top of the stack, with the value 2 substituted for the call fact(n - 1).]

At this stage, the program returns 3 × 2 to the previous level, so that the frame for the initial call to fact looks like this:

[Diagram: the frame for fact with n = 4 on top of the frame for main, with the value 6 substituted for the call fact(n - 1).]

The final step in the calculation process consists of calculating 4 × 6 and returning the value 24 to the main program.

The recursive leap of faith

The point of including the complete trace of the fact(4) computation is to convince you that the computer treats recursive functions just like all other functions. When you are faced with a recursive function, you can, at least in theory, mimic the operation of the computer and figure out what it will do. By drawing all the frames and keeping track of all the variables, you can duplicate the entire operation and come up with the answer. If you do so, however, you will usually find that the complexity of the process ends up making the computation much harder to follow.

Whenever you try to understand a recursive program, it is useful to put the underlying details aside and focus instead on a single level of the operation.
At that level, you are allowed to assume that any recursive call automatically gets the right answer as long as the arguments to that call are in some sense simpler than the original arguments. This psychological strategy, assuming that any simpler recursive call will work correctly, is called the recursive leap of faith. Learning to apply this strategy is essential to using recursion in practical applications.

As an example, consider what happens when this implementation is used to compute fact(n) with n equal to 4. To do so, the recursive implementation must compute the value of the expression

    n * fact(n - 1)

By substituting the current value of n into the expression, you know that the result is

    4 * fact(3)

Stop right there. Computing fact(3) is simpler than computing fact(4). Because it is simpler, the recursive leap of faith allows you to assume that it works. Thus, you should assume that the call to fact(3) will correctly compute the value of 3!, which is 3 × 2 × 1, or 6. The result of calling fact(4) is therefore 4 × 6, or 24.

As you look at the examples in the rest of this chapter, try to focus on the big picture instead of the details. Once you have made the recursive decomposition and identified the simple cases, be satisfied that the computer can handle the rest.

7.3 The Fibonacci function

In a mathematical treatise entitled Liber Abbaci published in 1202, the Italian mathematician Leonardo Fibonacci proposed a problem that has had a wide influence on many fields, including computer science. The problem was phrased as an exercise in population biology, a field that has become increasingly important in recent years. Fibonacci’s problem concerns how the population of rabbits would grow from generation to generation if the rabbits reproduced according to the following, admittedly fanciful, rules:

• Each pair of fertile rabbits produces a new pair of offspring each month.
• Rabbits become fertile in their second month of life.
• Old rabbits never die.

If a pair of newborn rabbits is introduced in January, how many pairs of rabbits are there at the end of the year?

You can solve Fibonacci’s problem simply by keeping a count of the rabbits at each month during the year. At the beginning of January, there are no rabbits, since the first pair is introduced sometime in that month, which leaves one pair of rabbits on February 1. Since the initial pair of rabbits is newborn, they are not yet fertile in February, which means that the only rabbits on March 1 are the original pair of rabbits. In March, however, the original pair is now of reproductive age, which means that a new pair of rabbits is born. The new pair increases the colony’s population, counting by pairs, to two on April 1. In April, the original pair goes right on reproducing, but the rabbits born in March are as yet too young. Thus, there are three pairs of rabbits at the beginning of May. From here on, with more and more rabbits becoming fertile each month, the rabbit population begins to grow more quickly.

Computing terms in the Fibonacci sequence

At this point, it is useful to record the population data so far as a sequence of terms, indicated here by the subscripted value ti, each of which shows the number of rabbit pairs at the beginning of the ith month from the start of the experiment on January 1. The sequence itself is called the Fibonacci sequence and begins with the following terms, which represent the results of our calculation so far:

       t0   t1   t2   t3   t4
        0    1    1    2    3

You can simplify the computation of further terms in this sequence by making an important observation. Because rabbits in this problem never die, all the rabbits that were around in the previous month are still around. Moreover, all of the fertile rabbits have produced a new pair. The number of fertile rabbit pairs capable of reproduction is simply the number of rabbits that were alive in the month before the previous one.
The net effect is that each new term in the sequence must simply be the sum of the preceding two. Thus, the next several terms in the Fibonacci sequence look like this:

       t0   t1   t2   t3   t4   t5   t6   t7   t8   t9  t10  t11  t12
        0    1    1    2    3    5    8   13   21   34   55   89  144

The number of rabbit pairs at the end of the year is therefore 144.

From a programming perspective, it helps to express the rule for generating new terms in the following, more mathematical form:

    t_n = t_{n-1} + t_{n-2}

An expression of this type, in which each element of a sequence is defined in terms of earlier elements, is called a recurrence relation.

The recurrence relation alone is not sufficient to define the Fibonacci sequence. Although the formula makes it easy to calculate new terms in the sequence, the process has to start somewhere. In order to apply the formula, you need to have at least two terms already available, which means that the first two terms in the sequence, t0 and t1, must be defined explicitly. The complete specification of the terms in the Fibonacci sequence is therefore

    t_n = n                      if n is 0 or 1
    t_n = t_{n-1} + t_{n-2}      otherwise

This mathematical formulation is an ideal model for a recursive implementation of a function fib(n) that computes the nth term in the Fibonacci sequence. All you need to do is plug the simple cases and the recurrence relation into the standard recursive paradigm. The recursive implementation of fib(n) is shown in Figure 7-1, which also includes a test program that displays the terms in the Fibonacci sequence between two specified indices.

Gaining confidence in the recursive implementation

Now that you have a recursive implementation of the function fib, how can you go about convincing yourself that it works? You can always begin by tracing through the logic. Consider, for example, what happens if you call fib(5).
Because this is not one of the simple cases enumerated in the if statement, the implementation computes the result by evaluating the line

    return fib(n - 1) + fib(n - 2);

which is in this case equivalent to

    return fib(4) + fib(3);

At this point, the computer calculates the result of fib(4), adds that to the result of calling fib(3), and returns the sum as the value of fib(5).

But how does the computer go about evaluating fib(4) and fib(3)? The answer, of course, is that it uses precisely the same strategy. The essence of recursion is to break problems down into simpler ones that can be solved by calls to exactly the same function. Those calls get broken down into simpler ones, which in turn get broken down into even simpler ones, until at last the simple cases are reached.

On the other hand, it is best to regard this entire mechanism as irrelevant detail. Remember the recursive leap of faith. Your job at this level is to understand how the call to fib(5) works. In the course of walking through the execution of that function, you have managed to transform the problem into computing the sum of fib(4) and fib(3). Because the argument values are smaller, each of these calls represents a simpler case. Applying the recursive leap of faith, you can assume that the program correctly computes each of these values, without going through all the steps yourself. For the purposes of validating the recursive strategy, you can just look the answers up in the table: fib(4) is 3 and fib(3) is 2. The result of calling fib(5) is therefore 3 + 2, or 5, which is indeed the correct answer. Case closed. You don’t need to see all the details, which are best left to the computer.

FIGURE 7-1 Program to list the Fibonacci series

    /*
     * File: Fib.cpp
     * -------------
     * This program lists the terms in the Fibonacci sequence with
     * indices ranging from MIN_INDEX to MAX_INDEX.
     */

    #include <iostream>
    #include <iomanip>
    using namespace std;

    /* Constants */

    const int MIN_INDEX = 0;     /* Index of first term to generate */
    const int MAX_INDEX = 20;    /* Index of last term to generate  */

    /* Function prototypes */

    int fib(int n);

    /* Main program */

    int main() {
       cout << "This program lists the Fibonacci sequence." << endl;
       for (int i = MIN_INDEX; i <= MAX_INDEX; i++) {
          if (i < 10) cout << " ";
          cout << "fib(" << i << ")";
          cout << " = " << setw(4) << fib(i) << endl;
       }
       return 0;
    }

    /*
     * Function: fib
     * Usage: int f = fib(n);
     * ----------------------
     * Returns the nth term in the Fibonacci sequence using the
     * following recursive formulation:
     *
     *    fib(0) = 0
     *    fib(1) = 1
     *    fib(n) = fib(n - 1) + fib(n - 2)
     */

    int fib(int n) {
       if (n < 2) {
          return n;
       } else {
          return fib(n - 1) + fib(n - 2);
       }
    }

Efficiency of the recursive implementation

If you do decide to go through the details of the evaluation of the call to fib(5), however, you will quickly discover that the calculation is extremely inefficient. The recursive decomposition makes many redundant calls, in which the computer ends up calculating the same term in the Fibonacci sequence several times. This situation is illustrated in Figure 7-2, which shows all the recursive calls required in the calculation of fib(5). As you can see from the diagram, the program ends up making one call to fib(4), two calls to fib(3), three calls to fib(2), five calls to fib(1), and three calls to fib(0). Given that the Fibonacci function can be implemented efficiently using iteration, the explosion of steps required by the recursive implementation is more than a little disturbing.
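The call counts quoted above can be checked mechanically. The following is a minimal sketch, assuming an instrumented copy of the recursive function; the names countedFib and callCount are hypothetical and do not appear in Figure 7-1.

```cpp
#include <map>

// Hypothetical instrumentation (not part of Figure 7-1):
// callCount[n] records how many times countedFib(n) has been invoked.
std::map<int, int> callCount;

// Same recursive formulation as fib in Figure 7-1, plus one line of bookkeeping.
int countedFib(int n) {
    callCount[n]++;                 // tally this call before doing any work
    if (n < 2) {
        return n;                   // simple cases: fib(0) = 0, fib(1) = 1
    } else {
        return countedFib(n - 1) + countedFib(n - 2);
    }
}
```

After a single call to countedFib(5), inspecting callCount for the indices 4 down to 0 should give the tallies listed in the paragraph above. Because each call below the top level is repeated, the total number of calls grows rapidly as n increases.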
FIGURE 7-2 Steps in the calculation of fib(5)

    fib(5)
       fib(4)
          fib(3)
             fib(2)
                fib(1) = 1
                fib(0) = 0
             fib(1) = 1
          fib(2)
             fib(1) = 1
             fib(0) = 0
       fib(3)
          fib(2)
             fib(1) = 1
             fib(0) = 0
          fib(1) = 1

Recursion is not to blame

On discovering that the implementation of fib(n) given in Figure 7-1 is highly inefficient, many people are tempted to point their finger at recursion as the culprit. The problem in the Fibonacci example, however, has nothing to do with recursion per se but rather the way in which recursion is used. By adopting a different strategy, it is possible to write a recursive implementation of the fib function in which the large-scale inefficiencies revealed in Figure 7-2 disappear completely.

As is often the case when using recursion, the key to finding a more efficient solution lies in adopting a more general approach. The Fibonacci sequence is not the only sequence whose terms are defined by the recurrence relation

    t_n = t_{n-1} + t_{n-2}

Depending on how you choose the first two terms, you can generate many different sequences. The traditional Fibonacci sequence

    0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, . . .

comes from defining t0 = 0 and t1 = 1. If, for example, you defined t0 = 3 and t1 = 7, you would get this sequence instead:

    3, 7, 10, 17, 27, 44, 71, 115, 186, 301, 487, 788, 1275, . . .

Similarly, defining t0 = –1 and t1 = 2 gives rise to the following sequence:

    –1, 2, 1, 3, 4, 7, 11, 18, 29, 47, 76, 123, 199, . . .

These sequences all use the same recurrence relation, which specifies that each new term is the sum of the previous two. The only way the sequences differ is in the choice of the first two terms. As a general class, the sequences that follow this pattern are called additive sequences.

This concept of an additive sequence makes it possible to convert the problem of finding the nth term in the Fibonacci sequence into the more general problem of finding the nth term in an additive sequence whose initial terms are t0 and t1.
Such a function requires three arguments and might be expressed in C++ as a function with the following prototype:

    int additiveSequence(int n, int t0, int t1);

If you had such a function, it would be easy to implement fib using it. All you would need to do is supply the correct values of the first two terms, as follows:

    int fib(int n) {
       return additiveSequence(n, 0, 1);
    }

The body consists of a single line of code that does nothing but call another function, passing along a few extra arguments. Functions of this sort, which simply return the result of another function, often after transforming the arguments in some way, are called wrapper functions. Wrapper functions are extremely common in recursive programming. In most cases, a wrapper function is used, as it is here, to supply additional arguments to a subsidiary function that solves a more general problem.

From here, the one remaining task is to implement additiveSequence. If you think about this more general problem for a few minutes, you will discover that additive sequences have an interesting recursive character of their own. The simple case for the recursion consists of the terms t0 and t1, whose values are part of the definition of the sequence. In the C++ implementation, the values of these terms are passed as arguments. If you need to compute t0, for example, all you have to do is return the argument t0.

But what if you are asked to find a term further down in the sequence? Suppose, for example, that you want to find t6 in the additive sequence whose initial terms are 3 and 7. By looking at the list of terms in the sequence

       t0   t1   t2   t3   t4   t5   t6   t7   t8   t9
        3    7   10   17   27   44   71  115  186  301  ...

you can see that the correct value is 71. The interesting question, however, is how you can use recursion to determine this result.
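Before turning to the recursive answer, the tables of additive-sequence terms can be reproduced with a short loop. This is only a checking aid under stated assumptions: the helper name additiveTerms is hypothetical, and the loop is iterative, so it is not the recursive additiveSequence function that the text goes on to develop.

```cpp
#include <vector>

// Hypothetical helper (not from the text): returns the first `count` terms of
// the additive sequence whose initial terms are t0 and t1.
std::vector<int> additiveTerms(int t0, int t1, int count) {
    std::vector<int> terms;
    int current = t0;               // term about to be recorded
    int next = t1;                  // term that follows it
    for (int i = 0; i < count; i++) {
        terms.push_back(current);
        int sum = current + next;   // each new term is the sum of the previous two
        current = next;
        next = sum;
    }
    return terms;
}
```

Calling additiveTerms(3, 7, 10) regenerates the row of values shown above, and element 6 of the result is the term t6 that the text asks about.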
The key insight you need to discover is that the nth term in any additive sequence is simply the (n–1)st term in the additive sequence that begins one step further along. For example, t6 in the sequence shown in the most recent example is simply t5 in the additive sequence

       t0   t1   t2   t3   t4   t5   t6   t7   t8
        7   10   17   27   44   71  115  186  301  ...

that begins with 7 and 10.

This insight makes it possible to implement the function additiveSequence as follows:

    int additiveSequence(int n, int t0, int t1) {
       if (n == 0) return t0;
       if (n == 1) return t1;
       return additiveSequence(n - 1, t1, t0 + t1);
    }

If you trace through the steps in the calculation of fib(5) using this technique, you will discover that the calculation involves none of the redundant computation that plagued the earlier recursive formulation. The steps lead directly to the solution, as shown in the following diagram:

    fib(5)
      = additiveSequence(5, 0, 1)
      = additiveSequence(4, 1, 1)
      = additiveSequence(3, 1, 2)
      = additiveSequence(2, 2, 3)
      = additiveSequence(1, 3, 5)
      = 5

Even though the new implementation is entirely recursive, it is comparable in efficiency to the traditional iterative version of the Fibonacci function. In fact, it is possible to use more sophisticated mathematics to write an entirely recursive implementation of fib(n) that is considerably more efficient than the iterative strategy. You will have a chance to code this implementation on your own in the exercises for Chapter 10.

7.4 Checking palindromes

Although the factorial and Fibonacci functions provide excellent examples of how recursive functions work, they are mathematical in nature and may therefore convey the incorrect impression that recursion is applicable only to mathematical functions. In fact, you can apply recursion to any problem that can be decomposed into simpler problems of the same form.
It is therefore useful to consider a few more examples of recursion, focusing on those that are less mathematical in nature. This section, for example, illustrates the use of recursion in a simple string application.

A palindrome is a string that reads identically backward and forward, such as "level" or "noon". Although it is easy to check whether a string is a palindrome by iterating through its characters, palindromes can also be defined recursively. The insight you need is that any palindrome longer than a single character must contain a shorter palindrome in its interior. For example, the string "level" consists of the palindrome "eve" with an "l" at each end. Thus, to check whether a string is a palindrome (assuming the string is sufficiently long that it does not constitute a simple case), all you need to do is

1. Check to see that the first and last characters are the same.
2. Check to see whether the substring generated by removing the first and last characters is itself a palindrome.

If both conditions apply, the string is a palindrome.

The only other question you must consider before writing a recursive solution to the palindrome problem is what the simple cases are. Clearly, any single-character string is a palindrome because reversing a one-character string has no effect. The one-character string therefore represents a simple case, but it is not the only one. The empty string, which contains no characters at all, is also a palindrome, and any recursive solution must operate correctly in this case as well.

Figure 7-3 contains a recursive implementation of the isPalindrome function, which returns true if and only if its argument is a palindrome. The function begins by checking to see whether the length of the string is less than 2. If it is, the string is certainly a palindrome. If not, the function checks to make sure that the string meets both of the necessary criteria.
FIGURE 7-3 Program to check for palindromes

    /*
     * Function: isPalindrome
     * Usage: if (isPalindrome(str)) . . .
     * -----------------------------------
     * Returns true if str is a palindrome, which is a string that
     * reads the same backwards and forwards.  This implementation
     * uses the recursive insight that all strings of length 0 or 1
     * are palindromes and that longer strings are palindromes if
     * their first and last characters match and the remaining substring
     * is a palindrome.
     */

    bool isPalindrome(string str) {
       int len = str.length();
       if (len <= 1) {
          return true;
       } else {
          return str[0] == str[len - 1] && isPalindrome(str.substr(1, len - 2));
       }
    }

The implementation shown in Figure 7-3 is inefficient, even though the recursive decomposition is easy to follow. You can improve the performance of isPalindrome by making the following changes:

• Calculate the length of the argument only once. The original implementation of isPalindrome calculates the length of the string at every level of the recursive decomposition. It is more efficient to call the length method once and then pass that information down through the chain of recursive calls.

• Don’t make a substring on each call. The other source of inefficiency in the first version of isPalindrome is the repeated substr calls that remove the first and last characters. You can avoid the calls to substr entirely by passing indices to keep track of the positions at which the desired substring begins and ends.

Each of these changes requires the recursive function to take additional arguments. Figure 7-4 shows a revised version of isPalindrome implemented as a wrapper function that calls the helper function isSubstringPalindrome to do the actual work. The isSubstringPalindrome function takes the additional arguments p1 and p2, which specify the indices between which the checking occurs.
FIGURE 7-4 More efficient implementation of isPalindrome

    /*
     * Function: isPalindrome
     * Usage: if (isPalindrome(str)) . . .
     * -----------------------------------
     * Returns true if str is a palindrome, which is a string that reads the
     * same backwards and forwards.  This level of the implementation is
     * simply a wrapper for isSubstringPalindrome, which does the real work.
     */

    bool isPalindrome(string str) {
       return isSubstringPalindrome(str, 0, str.length() - 1);
    }

    /*
     * Function: isSubstringPalindrome
     * Usage: if (isSubstringPalindrome(str, p1, p2)) . . .
     * ----------------------------------------------------
     * Returns true if the characters in str from p1 to p2 inclusive form
     * a palindrome.  This implementation uses the recursive insight that
     * all strings of length 0 or 1 are palindromes (the simple cases) and
     * that longer strings are palindromes only if their first and last
     * characters match and the remaining substring is a palindrome.
     */

    bool isSubstringPalindrome(string str, int p1, int p2) {
       if (p1 >= p2) {
          return true;
       } else {
          return str[p1] == str[p2] && isSubstringPalindrome(str, p1 + 1, p2 - 1);
       }
    }

7.5 The binary search algorithm

When you work with sequences of values stored in a vector, one of the common operations consists of searching that vector for a particular element. For example, if you work frequently with string vectors, it is useful to have a function

    int findInVector(string key, Vector<string> & vec);

that searches through each of the elements of vec, looking for an element whose value is equal to key. If a matching value is found, findInVector returns the index at which it appears. If no such value exists, the function returns –1.

If you have no specific knowledge about the order of elements within the array, the implementation of findInVector must simply check each of the elements in turn until it either finds a match or runs out of elements.
This strategy is called the linear-search algorithm, which can be time-consuming if the arrays are large. On the other hand, if you know that the elements of the array are arranged in alphabetical order, you can adopt a much more efficient approach. All you have to do is divide the array in half and compare the key you’re trying to find against the element closest to the middle of the array, using the order defined by the ASCII character codes, which is called lexicographic order. If the key you’re looking for precedes the middle element, then the key, if it exists at all, must be in the first half. Conversely, if the key follows the middle element in lexicographic order, you only need to look at the elements in the second half. This strategy is called the binary-search algorithm. Because binary search makes it possible for you to discard half the possible elements at each step in the process, it turns out to be much more efficient than linear search for sorted arrays.

The binary-search algorithm (which appears in Figure 7-5) is also a perfect example of the divide-and-conquer strategy. It should therefore not be particularly surprising that binary search has a natural recursive implementation. Note that the function findInSortedVector is implemented as a wrapper, leaving the real work to the recursive function binarySearch, which takes two additional arguments (the indices p1 and p2) to limit the range of the search.

The simple cases for binarySearch are

1. There are no elements in the active part of the array. This condition is marked by the fact that the index p1 is greater than the index p2, which means that there are no elements left to search.
2. The middle element matches the specified key. Since the key has just been found, findInSortedVector can simply return the index of that value.
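The two simple cases listed above map directly onto the first two tests in a recursive search. The sketch below is one way to write it, with two substitutions that are assumptions rather than the text's own code: it uses the standard std::vector in place of the Vector class, and it passes the key and the vector by const reference.

```cpp
#include <string>
#include <vector>

// Sketch only: std::vector stands in for the Vector class used in the text.
// Searches vec[p1..p2] (inclusive), which must be sorted in lexicographic order.
int binarySearch(const std::string& key, const std::vector<std::string>& vec,
                 int p1, int p2) {
    if (p1 > p2) return -1;               // simple case 1: no elements left to search
    int mid = p1 + (p2 - p1) / 2;         // index of the element closest to the middle
    if (key == vec[mid]) return mid;      // simple case 2: the middle element matches
    if (key < vec[mid]) {
        return binarySearch(key, vec, p1, mid - 1);   // key can only be in the first half
    } else {
        return binarySearch(key, vec, mid + 1, p2);   // key can only be in the second half
    }
}

// Wrapper that supplies the initial search limits.
int findInSortedVector(const std::string& key, const std::vector<std::string>& vec) {
    return binarySearch(key, vec, 0, static_cast<int>(vec.size()) - 1);
}
```

Writing the midpoint as p1 + (p2 - p1) / 2 gives the same index as (p1 + p2) / 2 for the nonnegative limits used here; it is a common precaution against overflow when the indices are very large. Each recursive call discards at least half of the active range, so the range eventually becomes empty if the key is absent.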
FIGURE 7-5 Divide-and-conquer implementation of binary search

    /*
     * Function: findInSortedVector
     * Usage: int index = findInSortedVector(key, vec);
     * ------------------------------------------------
     * Searches for the specified key in the Vector<string> vec, which
     * must be sorted in lexicographic (character code) order.  If the
     * key is found, the function returns the index in the vector at
     * which that key appears.  (If the key appears more than once in
     * the vector, any of the matching indices may be returned).  If the
     * key does not exist in the vector, the function returns -1.  This
     * implementation is simply a wrapper function; all of the real work
     * is done by the more general binarySearch function.
     */

    int findInSortedVector(string key, Vector<string> & vec) {
       return binarySearch(key, vec, 0, vec.size() - 1);
    }

    /*
     * Function: binarySearch
     * Usage: int index = binarySearch(key, vec, p1, p2);
     * --------------------------------------------------
     * Searches for the specified key in the Vector<string> vec, looking
     * only at indices between p1 and p2, inclusive.  The function returns
     * the index of a matching element or -1 if no match is found.
     */

    int binarySearch(string key, Vector<string> & vec, int p1, int p2) {
       if (p1 > p2) return -1;
       int mid = (p1 + p2) / 2;
       if (key == vec[mid]) return mid;
       if (key < vec[mid]) {
          return binarySearch(key, vec, p1, mid - 1);
       } else {
          return binarySearch(key, vec, mid + 1, p2);
       }
    }

If neither of these cases applies, the implementation can simplify the problem by choosing the appropriate half of the array and calling itself recursively with an updated set of search limits.

7.6 Mutual recursion

In each of the examples considered so far, the recursive functions have called themselves directly, in the sense that the body of the function contains a call to itself. Although most of the recursive functions you encounter are likely to adhere to this style, the definition of recursion is actually somewhat broader.
To be recursive, a function must call itself at some point during its evaluation. If a function is subdivided into subsidiary functions, the recursive call can occur at a deeper level of nesting. For example, if a function ƒ calls a function g, which in turn calls ƒ, those function calls are still considered to be recursive. Because the functions ƒ and g call each other, this type of recursion is called mutual recursion.

As a simple example, it turns out to be easy, although wildly inefficient, to use recursion to test whether a number is even or odd. The code in Figure 7-6, for example, implements the isEven and isOdd functions, by taking advantage of the following informal definition:

• A number is even if its predecessor is odd.
• A number is odd if it is not even.
• The number 0 is even by definition.

Even though these rules seem simplistic, they form the basis of an effective strategy for distinguishing odd and even numbers, as long as those numbers are nonnegative. The code in Figure 7-6 ensures that condition by having isEven and isOdd take arguments of type unsigned, which C++ uses to represent an integer that can never be less than zero.

FIGURE 7-6 Mutually recursive definitions of isEven and isOdd

    /*
     * Function: isOdd
     * Usage: if (isOdd(n)) . . .
     * --------------------------
     * Returns true if the unsigned number n is odd.  A number is odd
     * if it is not even.
     */

    bool isOdd(unsigned int n) {
       return !isEven(n);
    }

    /*
     * Function: isEven
     * Usage: if (isEven(n)) . . .
     * ---------------------------
     * Returns true if the unsigned number n is even.  A number is even
     * either (1) if it is zero or (2) if its predecessor is odd.
     */

    bool isEven(unsigned int n) {
       if (n == 0) {
          return true;
       } else {
          return isOdd(n - 1);
       }
    }

7.7 Thinking recursively

For most people, recursion is not an easy concept to grasp.
Learning to use it effectively requires considerable practice and forces you to approach problems in entirely new ways. The key to success lies in developing the right mindset: learning how to think recursively. The remainder of this chapter is designed to help you achieve that goal.

Maintaining a holistic perspective

When you are learning to program, I think it helps enormously to keep in mind the philosophical concepts of holism and reductionism. Simply stated, reductionism is the belief that the whole of an object can be understood merely by understanding the parts that make it up. Its antithesis is holism, the position that the whole is often greater than the sum of its parts. As you learn about programming, it helps to be able to interleave these two perspectives, sometimes focusing on the behavior of a program as a whole, and at other times delving into the details of its execution. When you try to learn about recursion, however, this balance seems to change. Thinking recursively requires you to think holistically. In the recursive domain, reductionism is the enemy of understanding and almost always gets in the way.

To maintain the holistic perspective, you must become comfortable adopting the recursive leap of faith, which was introduced in its own section earlier in this chapter. Whenever you are writing a recursive program or trying to understand the behavior of one, you must get to the point where you ignore the details of the individual recursive calls. As long as you have chosen the right decomposition, identified the appropriate simple cases, and implemented your strategy correctly, those recursive calls will simply work. You don’t need to think about them.

Unfortunately, until you have had extensive experience working with recursive functions, applying the recursive leap of faith does not come easily.
The problem is that doing so requires you to suspend your disbelief and make assumptions about the correctness of your programs that fly in the face of your experience. After all, when you write a program, the odds are good, even if you are an experienced programmer, that your program won’t work the first time. In fact, it is quite likely that you have chosen the wrong decomposition, messed up the definition of the simple cases, or somehow gotten things muddled as you tried to implement your strategy. If you have done any of these things, your recursive calls won’t work.

When things go wrong (as they inevitably will), you have to remember to look for the error in the right place. The problem lies somewhere in your recursive implementation, not in the recursive mechanism itself. If there is a problem, you should be able to find it by looking at a single level of the recursive hierarchy. Looking down through additional levels of recursive calls is not going to help. If the simple cases work and the recursive decomposition is correct, the subsidiary calls will work correctly. If they don’t, the problem must lie in your formulation of the recursive decomposition.

Avoiding the common pitfalls

As you gain experience with recursion, the process of writing and debugging recursive programs will become more natural. At the beginning, however, finding out what you need to fix in a recursive program can be difficult. The following is a checklist that will help you identify the most common sources of error.

• Does your recursive implementation begin by checking for simple cases? Before you attempt to solve a problem by transforming it into a recursive subproblem, you must first check to see if the problem is so simple that such decomposition is unnecessary. In almost all cases, recursive functions begin with the keyword if. If your function doesn’t, you should look carefully at your program and make sure that you know what you’re doing.

• Have you solved the simple cases correctly? A surprising number of bugs in recursive programs arise from having incorrect solutions to the simple cases. If the simple cases are wrong, the recursive solutions to more complicated problems will inherit the same mistake. For example, if you had mistakenly defined fact(0) as 0 instead of 1, calling fact on any argument would end up returning 0.

• Does your recursive decomposition make the problem simpler? For recursion to work, the problems have to get simpler as you go along. More formally, there must be some metric (a standard of measurement that assigns a numeric difficulty rating to the problem) that gets smaller as the computation proceeds. For mathematical functions like fact and fib, the value of the integer argument serves as a metric. On each recursive call, the value of the argument gets smaller. For the isPalindrome function, the appropriate metric is the length of the argument string, because the string gets shorter on each recursive call. If the problem instances do not get simpler, the decomposition process will just keep making more and more calls, giving rise to the recursive analogue of the infinite loop, which is called nonterminating recursion.

• Does the simplification process eventually reach the simple cases, or have you left out some of the possibilities? A common source of error is failing to include simple case tests for all the cases that can arise as the result of the recursive decomposition. For example, in the isPalindrome implementation presented in Figure 7-3, it is critically important for the function to check the zero-character case as well as the one-character case, even if the client never intends to call isPalindrome on the empty string. As the recursive decomposition proceeds, the string arguments get shorter by two characters at each level of the recursive call.
If the length of the original argument string is even, the recursive decomposition will never get to the one-character case.

• Do the recursive calls in your function represent subproblems that are truly identical in form to the original? When you use recursion to break down a problem, it is essential that the subproblems be of the same form. If the recursive calls change the nature of the problem or violate one of the initial assumptions, the entire process can break down. As several of the examples in this chapter illustrate, it is often useful to define the publicly exported function as a simple wrapper that calls a more general recursive function that is private to the implementation. Because the private function has a more general form, it is usually easier to decompose the original problem and still have it fit within the recursive structure.

• When you apply the recursive leap of faith, do the solutions to the recursive subproblems provide a complete solution to the original problem? Breaking a problem down into recursive subinstances is only part of the recursive process. Once you get the solutions, you must also be able to reassemble them to generate the complete solution. The way to check whether this process in fact generates the solution is to walk through the decomposition, religiously applying the recursive leap of faith. Work through all the steps in the current function call, but assume that every recursive call generates the correct answer. If following this process yields the right solution, your program should work.

Summary

This chapter has introduced the idea of recursion, a powerful programming strategy in which complex problems are broken down into simpler problems of the same form. The important points presented in this chapter include:

• Recursion is similar to stepwise refinement in that both strategies consist of breaking a problem down into simpler problems that are easier to solve.
The distinguishing characteristic of recursion is that the simpler subproblems must have the same form as the original.

• To use recursion, you must be able to identify simple cases for which the answer is easily determined and a recursive decomposition that allows you to break any complex instance of the problem into simpler problems of the same type.

• In C++, recursive functions typically have the following paradigmatic form:

    if (test for simple case) {
        Compute a simple solution without using recursion.
    } else {
        Break the problem down into subproblems of the same form.
        Solve each of the subproblems by calling this function recursively.
        Reassemble the subproblem solutions into a solution for the whole.
    }

• Recursive functions are implemented using exactly the same mechanism as any other function call. Each call creates a new stack frame that contains the local variables for that call. Because the computer creates a separate stack frame for each function call, the local variables at each level of the recursive decomposition remain separate.

• Before you can use recursion effectively, you must learn to limit your analysis to a single level of the recursive decomposition and to rely on the correctness of all simpler recursive calls without tracing through the entire computation. Trusting these simpler calls to work correctly is called the recursive leap of faith.

• Mathematical functions often express their recursive nature in the form of a recurrence relation, in which each element of a sequence is defined in terms of earlier elements.

• Although some recursive functions may be less efficient than their iterative counterparts, recursion itself is not the problem. As is typical with all types of algorithms, some recursive strategies are more efficient than others.

• In order to ensure that a recursive decomposition produces subproblems that are identical in form to the original, it is often necessary to generalize the problem.
As a result, it is often useful to implement the solution to a specific problem as a simple wrapper function whose only purpose is to call a subsidiary function that handles the more general case.

• Recursion need not consist of a single function that calls itself but may instead involve several functions that call each other in a cyclical pattern. Recursion that involves more than one function is called mutual recursion.

• You will be more successful at understanding recursive programs if you can maintain a holistic perspective rather than a reductionistic one.

Thinking about recursive problems in the right way does not come easily. Learning to use recursion effectively requires practice and more practice. For many students, mastering the concept takes years. But because recursion will turn out to be one of the most powerful techniques in your programming repertoire, that time will be well spent.

Review questions

1. Define the terms recursive and iterative. Is it possible for a function to employ both strategies?

2. What is the fundamental difference between recursion and traditional stepwise refinement?

3. In the pseudocode for the collectContributions function, the if statement looks like this:

    if (n <= 100)

Why is it important to use the <= operator instead of simply checking whether n is exactly equal to 100?

4. What is the standard recursive paradigm?

5. What two properties must a problem have for recursion to make sense as a solution strategy?

6. Why is the term divide and conquer appropriate to recursive techniques?

7. What is meant by the recursive leap of faith? Why is this concept important to you as a programmer?

8. In the section entitled “Tracing the recursive process,” the text goes through a long analysis of what happens internally when fact(4) is called. Using this section as a model, trace through the execution of fib(3), sketching out each stack frame created in the process.

9. What is a recurrence relation?

10.
Modify Fibonacci’s rabbit problem by introducing the additional rule that rabbit pairs stop reproducing after giving birth to three litters. How does this assumption change the recurrence relation? What changes do you need to make in the simple cases?

11. How many times is fib(1) called when calculating fib(n) using the recursive implementation given in Figure 7-1?

12. What is a wrapper function? Why are wrapper functions often useful in writing recursive functions?

13. What would happen if you eliminated the if (n == 1) check from the function additiveSequence, so that the implementation looked like this:

    int additiveSequence(int n, int t0, int t1) {
        if (n == 0) return t0;
        return additiveSequence(n - 1, t1, t0 + t1);
    }

Would the function still work? Why or why not?

14. Why is it important that the implementation of isPalindrome in Figure 7-3 checks for the empty string as well as the single-character string? What would happen if the function didn’t check for the single-character case and instead checked only whether the length is 0? Would the function still work correctly?

15. Explain the effect of the function call isPalindrome(str, p1 + 1, p2 - 1) in the isPalindrome implementation given in Figure 7-4.

16. What is mutual recursion?

17. What would happen if you defined isEven and isOdd as follows:

    bool isEven(unsigned int n) {
        return !isOdd(n);
    }

    bool isOdd(unsigned int n) {
        return !isEven(n);
    }

Which of the errors explained in the section “Avoiding the common pitfalls” is illustrated in this example?

18. The following definitions of isEven and isOdd are also incorrect:

    bool isEven(unsigned int n) {
        if (n == 0) {
            return true;
        } else {
            return isOdd(n - 1);
        }
    }

    bool isOdd(unsigned int n) {
        if (n == 1) {
            return true;
        } else {
            return isEven(n - 1);
        }
    }

Give an example that shows how this implementation can fail. What common pitfall is illustrated here?

Exercises

1.
Spherical objects, such as cannonballs, can be stacked to form a pyramid with one cannonball at the top, sitting on top of a square composed of four cannonballs, sitting on top of a square composed of nine cannonballs, and so forth. Write a recursive function cannonball that takes as its argument the height of the pyramid and returns the number of cannonballs it contains. Your function must operate recursively and must not use any iterative constructs, such as while or for.

2. Unlike many programming languages, C++ does not include a predefined operator that raises a number to a power. As a partial remedy for this deficiency, write a recursive implementation of a function

    int raiseToPower(int n, int k)

that calculates $n^k$. The recursive insight that you need to solve this problem is the mathematical property that

\[ n^k = \begin{cases} 1 & \text{if } k \text{ is } 0 \\ n \times n^{k-1} & \text{otherwise} \end{cases} \]

3. In the 18th century, the astronomer Johann Daniel Titius proposed a rule, later recorded by Johann Elert Bode, for calculating the distance from the sun to each of the planets known at that time. To apply that rule, which is now known as the Titius-Bode Law, you begin by writing down the sequence

\[ b_1 = 1 \quad b_2 = 3 \quad b_3 = 6 \quad b_4 = 12 \quad b_5 = 24 \quad b_6 = 48 \quad \ldots \]

where each element after $b_2$ is twice the preceding one. It turns out that an approximate distance to the i-th planet can be computed from this series by applying the formula

\[ d_i = \frac{4 + b_i}{10} \]

The distance $d_i$ is expressed in astronomical units (AU), which correspond to the average distance from the earth to the sun (approximately 93,000,000 miles). Except for a disconcerting gap between Mars and Jupiter, the Titius-Bode law gives reasonable approximations for the distances to the seven planets known at the time:

    Planet     Distance
    Mercury     0.5 AU
    Venus       0.7 AU
    Earth       1.0 AU
    Mars        1.6 AU
    ?           2.8 AU
    Jupiter     5.2 AU
    Saturn     10.0 AU
    Uranus     19.6 AU

Concern about the gap in the sequence led astronomers to discover the asteroid belt, which they suggested might have been the remains of a planet that had once orbited the sun at the distance specified by the missing entry in the table. Write a recursive function getTitiusBodeDistance(k) that calculates the expected distance between the sun and the kth planet, numbering outward from Mercury starting with 1. Test your function by writing a program that displays the distances to each of these planets in tabular form.

4. The greatest common divisor (often abbreviated to gcd) of two nonnegative integers is the largest integer that divides evenly into both. In the third century BCE, the Greek mathematician Euclid discovered that the greatest common divisor of x and y can always be computed as follows:

• If x is evenly divisible by y, then y is the greatest common divisor.
• Otherwise, the greatest common divisor of x and y is always equal to the greatest common divisor of y and the remainder of x divided by y.

Use Euclid’s insight to write a recursive function gcd(x, y) that computes the greatest common divisor of x and y.

5. Write an iterative implementation of the function fib(n).

6. For each of the two recursive implementations of the function fib(n) presented in this chapter, write a recursive function (you can call these countFib1 and countFib2 for the two algorithms) that counts the number of function calls made during the evaluation of the corresponding Fibonacci calculation. Write a main program that uses these functions to display a table showing the number of calls made by each algorithm for various values of n, as shown in the following sample run:

    CountFib
    This program counts the number of calls made by the two
    algorithms used to compute the Fibonacci sequence.

     n   fib1   fib2
    --   ----   ----
     0      1      2
     1      1      2
     2      3      3
     3      5      4
     4      9      5
     5     15      6
     6     25      7
     7     41      8
     8     67      9
     9    109     10
    10    177     11
    11    287     12
    12    465     13

7. Write a recursive function digitSum(n) that takes a nonnegative integer and returns the sum of its digits. For example, calling digitSum(1729) should return 1 + 7 + 2 + 9, which is 19. The recursive implementation of digitSum depends on the fact that it is very easy to break an integer down into two components using division by 10. For example, given the integer 1729, you can divide it into two pieces as follows:

    1729  →  172 and 9

Each of the resulting integers is strictly smaller than the original and thus represents a simpler case.

8. The digital root of an integer n is defined as the result of summing the digits repeatedly until only a single digit remains. For example, the digital root of 1729 can be calculated using the following steps:

    Step 1:   1 + 7 + 2 + 9   →   19
    Step 2:   1 + 9           →   10
    Step 3:   1 + 0           →    1

Because the total at the end of step 3 is the single digit 1, that value is the digital root. Write a function digitalRoot(n) that returns the digital root of its argument. Although it is easy to implement digitalRoot using the digitSum function from exercise 7 and a while loop, part of the challenge of this problem is to write the function recursively without using any explicit loop constructs.

9. As you know from Chapter 2, the mathematical combinations function c(n, k) is usually defined in terms of factorials, as follows:

\[ c(n, k) = \frac{n!}{k! \times (n - k)!} \]

The values of c(n, k) can also be arranged geometrically to form a triangle in which n increases as you move down the triangle and k increases as you move from left to right.
The resulting structure, which is called Pascal’s Triangle after the French mathematician Blaise Pascal, is arranged like this:

                        c(0, 0)
                    c(1, 0)  c(1, 1)
                c(2, 0)  c(2, 1)  c(2, 2)
            c(3, 0)  c(3, 1)  c(3, 2)  c(3, 3)
        c(4, 0)  c(4, 1)  c(4, 2)  c(4, 3)  c(4, 4)

Pascal’s Triangle has the interesting property that every entry is the sum of the two entries above it, except along the left and right edges, where the values are always 1. Consider, for example, the bracketed entry in the following display of Pascal’s Triangle:

                          1
                        1   1
                      1   2   1
                    1   3   3   1
                  1   4   6   4   1
                1   5  10  10   5   1
              1   6 [15] 20  15   6   1
            1   7  21  35  35  21   7   1

This entry, which corresponds to c(6, 2), is the sum of the two entries, 5 and 10, that appear above it to either side. Use this relationship between entries in Pascal’s Triangle to write a recursive implementation of the c(n, k) function that uses no loops, no multiplication, and no calls to fact.

10. Write a recursive function that takes a string as argument and returns the reverse of that string. The prototype for this function should be

    string reverse(string str);

and the statement

    cout << reverse("program") << endl;

should display

    margorp

Your solution should be entirely recursive and should not use any iterative constructs such as while or for.

11. The strlib.h library contains a function integerToString. Reimplement this function without using streams by exploiting the recursive decomposition of an integer outlined in exercise 7.

Chapter 8
Recursive Strategies

Tactics without strategy is the noise before defeat.
Sun Tzu, ~5th Century BCE

When a recursive decomposition follows directly from a mathematical definition, as it does in the case of the fact and fib functions in Chapter 7, applying recursion is not particularly hard. In most cases, you can translate the mathematical definition directly into a recursive implementation by plugging the appropriate expressions into the standard recursive paradigm.
The situation changes, however, as you begin to solve more complex problems. This chapter introduces several programming problems that seem—at least on the surface—much more difficult than those in Chapter 7. In fact, if you try to solve these problems without using recursion, relying instead on more familiar iterative techniques, you will find them quite difficult to solve. By contrast, each of the problems has a recursive solution that is surprisingly short. If you exploit the power of recursion, a few lines of code are sufficient for each task. The brevity of these solutions, however, endows them with a deceptive aura of simplicity. The hard part of solving these problems has nothing to do with the length of the code. What makes these programs difficult is finding the recursive decomposition in the first place. Doing so occasionally requires some cleverness, but what you need even more is confidence. You have to accept the recursive leap of faith. 8.1 The Towers of Hanoi The first example in this chapter is a simple puzzle that has come to be known as the Towers of Hanoi. Invented by French mathematician Édouard Lucas in the 1880s, the Towers of Hanoi puzzle quickly became popular in Europe. Its success was due in part to the legend that grew up around the puzzle, which was described as follows in La Nature by the French mathematician Henri de Parville (as translated by the mathematical historian W. W. R. Ball): In the great temple at Benares beneath the dome which marks the center of the world, rests a brass plate in which are fixed three diamond needles, each a cubit high and as thick as the body of a bee. On one of these needles, at the creation, God placed sixty-four disks of pure gold, the largest disk resting on the brass plate and the others getting smaller and smaller up to the top one. This is the Tower of Brahma. 
Day and night unceasingly, the priests transfer the disks from one diamond needle to another according to the fixed and immutable laws of Brahma, which require that the priest on duty must not move more than one disk at a time and that he must place this disk on a needle so that there is no smaller disk below it. When all the sixty-four disks shall have been thus transferred from the needle on which at the creation God placed them to one of the other needles, tower, temple and Brahmins alike will crumble into dust, and with a thunderclap the world will vanish.

Over the years, the setting has shifted from India to Vietnam, but the puzzle and its legend remain the same.

As far as I know, the Towers of Hanoi puzzle has no practical use except one: teaching recursion to computer science students. In that domain, it has tremendous value because the solution involves nothing besides recursion. In contrast to most recursive algorithms that arise in response to real-world problems, the Towers of Hanoi problem has no extraneous complications that might interfere with your understanding and keep you from seeing how the recursive solution works. Because it works so well as an example, the Towers of Hanoi is included in most textbooks that treat recursion and has become, much like the “hello, world” program in Chapter 1, part of the cultural heritage that computer scientists share.

In commercial versions of the puzzle, the 64 golden disks of legend are replaced with eight wooden or plastic ones, which makes the puzzle considerably easier to solve (not to mention cheaper). The initial state of the puzzle looks like this:

[Figure: three spires labeled A, B, and C, with all eight disks stacked on spire A]

At the beginning, all eight disks are on spire A. Your goal is to move these eight disks from spire A to spire B, but you must adhere to the following rules:

• You can only move one disk at a time.
• You are not allowed to move a larger disk on top of a smaller disk.
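
The two rules can be made concrete with a small model of the puzzle. The following is only a sketch, not part of the text's solution: the Puzzle structure and the makePuzzle and moveDisk functions are hypothetical names introduced here, and the choice of one vector of disk sizes per spire is an assumption made for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of the puzzle (not from the text): each spire is a
// stack of disk sizes, with the top disk at the back of the vector.
struct Puzzle {
    std::vector<int> spires[3];   // index 0 = A, 1 = B, 2 = C
};

// Builds the initial state: n disks on spire A, largest (size n) at the bottom.
Puzzle makePuzzle(int n) {
    Puzzle p;
    for (int size = n; size >= 1; size--) {
        p.spires[0].push_back(size);
    }
    return p;
}

// Tries to move the top disk from one spire to another. Returns false and
// leaves the state unchanged if there is no disk to move or if the move
// would place a larger disk on top of a smaller one.
bool moveDisk(Puzzle& p, char from, char to) {
    assert(from >= 'A' && from <= 'C' && to >= 'A' && to <= 'C');
    std::vector<int>& src = p.spires[from - 'A'];
    std::vector<int>& dst = p.spires[to - 'A'];
    if (from == to || src.empty()) return false;
    if (!dst.empty() && dst.back() < src.back()) return false;
    dst.push_back(src.back());    // exactly one disk moves per call
    src.pop_back();
    return true;
}
```

With three disks, moveDisk(p, 'A', 'B') succeeds on the initial state, while an immediate second moveDisk(p, 'A', 'B') is rejected because it would put disk 2 on top of disk 1.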

Framing the problem

In order to apply recursion to the Towers of Hanoi problem, you must first frame the problem in more general terms. Although the ultimate goal is moving eight disks from A to B, the recursive decomposition of the problem will involve moving smaller subtowers from spire to spire in various configurations. In the more general case, the problem you need to solve is moving a tower of a given height from one spire to another, using the third spire as a temporary repository. To ensure that all subproblems fit the original form, your recursive procedure must therefore take the following arguments:

1. The number of disks to move
2. The name of the spire where the disks start out
3. The name of the spire where the disks should finish
4. The name of the spire used for temporary storage

The number of disks to move is clearly an integer, and the fact that the spires are labeled with the letters A, B, and C suggests the use of type char to indicate which spire is involved. Knowing the types allows you to write a prototype for the operation that moves a tower, as follows:

    void moveTower(int n, char start, char finish, char tmp);

To move the eight disks in the example, the initial call is

    moveTower(8, 'A', 'B', 'C');

This function call corresponds to the English command “Move a tower of size 8 from spire A to spire B using spire C as a temporary.” As the recursive decomposition proceeds, moveTower will be called with different arguments that move smaller towers in various configurations.

Finding a recursive strategy

Now that you have a more general definition of the problem, you can return to the problem of finding a strategy for moving a large tower. To apply recursion, you must first make sure that the problem meets the following conditions:

1. There must be a simple case. In this problem, the simple case occurs when n is equal to 1, which means that there is only a single disk to move.
As long as you don’t violate the rule of placing a larger disk on top of a smaller one, you can move a single disk as a single operation.

2. There must be a recursive decomposition. In order to implement a recursive solution, it must be possible to break the problem down into simpler problems in the same form as the original. This part of the problem is harder and will require closer examination.

To see how solving a simpler subproblem helps solve a larger problem, it helps to go back and consider the original example with eight disks.

[Figure: initial configuration, with all eight disks on spire A]

The goal here is to move eight disks from spire A to spire B. You need to ask yourself how it would help if you could solve the same problem for a smaller number of disks. In particular, you should think about how being able to move a stack of seven disks would help you to solve the eight-disk case.

If you think about the problem for a few moments, it becomes clear that you can solve the problem by dividing it into these three steps:

1. Move the entire stack consisting of the top seven disks from spire A to spire C.
2. Move the bottom disk from spire A to spire B.
3. Move the stack of seven disks from spire C to spire B.

Executing the first step takes you to the following position:

[Figure: the largest disk alone on spire A, the other seven disks on spire C]

Once you have gotten rid of the seven disks on top of the largest disk, the second step is simply to move that disk from spire A to spire B, which results in the following configuration:

[Figure: the largest disk on spire B, the other seven disks on spire C]

All that remains is to move the tower of seven disks back from spire C to spire B, which is again a smaller problem of the same form. This operation is the third step in the recursive strategy, and leaves the puzzle in the desired final configuration:

[Figure: all eight disks on spire B]

That’s it! You’re finished. You’ve reduced the problem of moving a tower of size eight to one of moving a tower of size seven. More importantly, this recursive strategy generalizes to towers of size N, as follows:

1.
Move the top N–1 disks from the start spire to the temporary spire.
2. Move a single disk from the start spire to the finish spire.
3. Move the stack of N–1 disks from the temporary spire back to the finish spire.

At this point, it is hard to avoid saying to yourself, “Okay, I can reduce the problem to moving a tower of size N–1, but how do I accomplish that?” The answer, of course, is that you move a tower of size N–1 in precisely the same way. You break that problem down into one that requires moving a tower of size N–2, which further breaks down into moving a tower of size N–3, and so forth, until there is just one disk to move. Psychologically, however, the important thing is to avoid asking that question altogether. The recursive leap of faith should be sufficient. You’ve reduced the scale of the problem without changing its form. That’s the hard work. All the rest is bookkeeping, and it’s best to let the computer take care of that.

Once you have identified the simple cases and the recursive decomposition, all you need to do is plug them into the standard recursive paradigm, which results in the following pseudocode procedure:

    void moveTower(int n, char start, char finish, char tmp) {
        if (n == 1) {
            Move a single disk from start to finish.
        } else {
            Move a tower of size n - 1 from start to tmp.
            Move a single disk from start to finish.
            Move a tower of size n - 1 from tmp to finish.
        }
    }

Validating the strategy

Although the pseudocode strategy is in fact correct, the derivation up to this point has been a little careless. Whenever you use recursion to decompose a problem, you must make sure that the new problems are identical in form to the original. The task of moving N–1 disks from one spire to another certainly sounds like an instance of the same problem and fits the moveTower prototype. Even so, there is a subtle but important difference. In the original problem, the destination and temporary spires are empty.
When you move a tower of size N–1 to the temporary spire as part of the recursive strategy, you’ve left a disk behind on the starting spire. Does the presence of that disk change the nature of the problem and thus invalidate the recursive solution?

To answer this question, you need to think about the subproblem in light of the rules of the game. If the recursive decomposition doesn’t end up violating the rules, everything should be okay. The first rule, that only one disk can be moved at a time, is not an issue. If there is more than a single disk, the recursive decomposition breaks the problem down to generate a simpler case. The steps in the pseudocode that actually transfer disks move only one disk at a time. The second rule, that you are not allowed to place a larger disk on top of a smaller one, is the critical one. You need to convince yourself that you will not violate this rule in the recursive decomposition. The important observation to make is that, as you move a subtower from one spire to the other, the disk you leave behind on the original spire (and indeed any disk left behind at any previous stage in the operation) must be larger than anything in the current subtower. Thus, as you move those disks among the spires, the only disks below them will be larger in size, which is consistent with the rules.

Coding the solution

To complete the Towers of Hanoi solution, the only remaining step is to substitute function calls for the remaining pseudocode. The task of moving a complete tower requires a recursive call to the moveTower function. The only other operation is moving a single disk from one spire to another. For the purposes of writing a test program that displays the steps in the solution, all you need is a function that records its operation on the console.
For example, you can implement the function moveSingleDisk as follows:

    void moveSingleDisk(char start, char finish) {
        cout << start << " -> " << finish << endl;
    }

The moveTower code itself looks like this:

    void moveTower(int n, char start, char finish, char tmp) {
        if (n == 1) {
            moveSingleDisk(start, finish);
        } else {
            moveTower(n - 1, start, tmp, finish);
            moveSingleDisk(start, finish);
            moveTower(n - 1, tmp, finish, start);
        }
    }

The complete implementation appears in Figure 8-1.

Figure 8-1. Program to solve the Towers of Hanoi puzzle (Hanoi.cpp)

    /*
     * File: Hanoi.cpp
     * ---------------
     * This program solves the Tower of Hanoi puzzle.
     */

    #include <iostream>
    #include "simpio.h"
    using namespace std;

    /* Function prototypes */

    void moveTower(int n, char start, char finish, char tmp);
    void moveSingleDisk(char start, char finish);

    /* Main program */

    int main() {
        int n = getInteger("Enter number of disks: ");
        moveTower(n, 'A', 'B', 'C');
        return 0;
    }

    /*
     * Function: moveTower
     * Usage: moveTower(n, start, finish, tmp);
     * ----------------------------------------
     * Moves a tower of size n from the start spire to the finish
     * spire using the tmp spire as the temporary repository.
     */

    void moveTower(int n, char start, char finish, char tmp) {
        if (n == 1) {
            moveSingleDisk(start, finish);
        } else {
            moveTower(n - 1, start, tmp, finish);
            moveSingleDisk(start, finish);
            moveTower(n - 1, tmp, finish, start);
        }
    }

    /*
     * Function: moveSingleDisk
     * Usage: moveSingleDisk(start, finish);
     * -------------------------------------
     * Executes the transfer of a single disk from the start spire to the
     * finish spire. In this implementation, the move is simply displayed
     * on the console; in a graphical implementation, the code would update
     * the graphics window to show the new arrangement.
     */

    void moveSingleDisk(char start, char finish) {
        cout << start << " -> " << finish << endl;
    }

Tracing the recursive process

The only problem with this implementation of moveTower is that it seems like magic. If you’re like most students learning about recursion for the first time, the solution seems so short that you feel sure there must be something missing. Where is the strategy? How can the computer know which disk to move first and where it should go?

The answer is that the recursive process (breaking a problem down into smaller subproblems of the same form and then providing solutions for the simple cases) is all you need to solve the problem. If you trust the recursive leap of faith, you’re done. You can skip this section of the book and go on to the next. If, on the other hand, you’re still suspicious, it may be necessary for you to go through the steps in the complete process and watch what happens.

To make the problem more manageable, consider what happens if there are only three disks in the original tower. The main program call is therefore

    moveTower(3, 'A', 'B', 'C');

To trace how this call computes the steps necessary to transfer a tower of size 3, all you need to do is keep track of the operation of the program, using precisely the same strategy as in the factorial example from Chapter 7. For each new function call, you introduce a stack frame that shows the values of the parameters for that call. The initial call to moveTower, for example, creates the following stack frame:

[Stack frame diagram: main has called moveTower; the new frame holds n = 3, start = 'A', finish = 'B', tmp = 'C', and the arrow points at the first statement of moveTower]

As the arrow in the code indicates, the function has just been called, so execution begins with the first statement in the function body.
The current value of n is not equal to 1, so the program skips ahead to the else clause and executes the statement

    moveTower(n-1, start, tmp, finish);

As with any function call, the first step is to evaluate the arguments. To do so, you need to determine the values of the variables n, start, tmp, and finish. Whenever you need to find the value of a variable, you use the value as it is defined in the current stack frame. Thus, the moveTower call is equivalent to

    moveTower(2, 'A', 'C', 'B');

This operation, however, indicates another function call, which means that the current operation is suspended until the new function call is complete. To trace the operation of the new function call, you need to generate a new stack frame and repeat the process. As always, the parameters in the new stack frame are copied from the calling arguments in the order in which they appear. Thus, the new stack frame looks like this:

[Stack frame diagram: a new moveTower frame with n = 2, start = 'A', finish = 'C', tmp = 'B', stacked on top of the frame for the original call; the arrow marks the first statement in the function body.]

As the diagram illustrates, the new stack frame has its own set of variables, which temporarily supersede the variables in frames that are further down on the stack. Thus, as long as the program is executing in this stack frame, n will have the value 2, start will be 'A', finish will be 'C', and tmp will be 'B'. The old values in the previous frame will not reappear until the subtask represented by this call to moveTower is complete.
The evaluation of the recursive call to moveTower proceeds exactly like the original one. Once again, n is not 1, which requires another call of the form

    moveTower(n-1, start, tmp, finish);

Because this call comes from a different stack frame, however, the values of the individual variables are different from those in the original call. If you evaluate the arguments in the context of the current stack frame, you discover that this function call is equivalent to

    moveTower(1, 'A', 'B', 'C');

The effect of making this call is to introduce yet another stack frame for the moveTower function, as follows:

[Stack frame diagram: a third moveTower frame with n = 1, start = 'A', finish = 'B', tmp = 'C', stacked on top of the two earlier frames; the arrow marks the first statement in the function body.]

This call to moveTower, however, does represent the simple case. Since n is 1, the program calls the moveSingleDisk function to move a disk from A to B, leaving the puzzle in the following configuration:

[Diagram: spires A, B, and C, with the smallest disk now on B and the two larger disks still on A.]

At this point, the most recent call to moveTower is complete and the function returns.
In the process, its stack frame is discarded, which brings the execution back to the previous stack frame, having just completed the first statement in the else clause:

[Stack frame diagram: the moveTower frame with n = 2, start = 'A', finish = 'C', tmp = 'B', on top of the frame for the original call; the arrow marks the call to moveSingleDisk in the else clause.]

The call to moveSingleDisk again represents a simple operation, which leaves the puzzle in the following state:

[Diagram: spires A, B, and C, with the largest disk on A, the smallest disk on B, and the middle disk on C.]

With the moveSingleDisk operation completed, the only remaining step required to finish the current call to moveTower is the last statement in the function:

    moveTower(n-1, tmp, finish, start);

Evaluating these arguments in the context of the current frame reveals that this call is equivalent to

    moveTower(1, 'B', 'C', 'A');

Once again, this call requires the creation of a new stack frame. By this point in the process, however, you should be able to see that the effect of this call is simply to move a tower of size 1 from B to C, using A as a temporary repository. Internally, the function determines that n is 1 and then calls moveSingleDisk to reach the following configuration:

[Diagram: spires A, B, and C, with the largest disk on A, nothing on B, and the two smaller disks on C.]

This operation again completes a call to moveTower, allowing it to return to its caller having completed the subtask of moving a tower of size 2 from A to C.
Discarding the stack frame from the just-completed subtask reveals the stack frame for the original call to moveTower, which is now in the following state:

[Stack frame diagram: the original moveTower frame with n = 3, start = 'A', finish = 'B', tmp = 'C'; the arrow marks the call to moveSingleDisk in the else clause.]

The next step is to call moveSingleDisk to move the largest disk from A to B, which results in the following position:

[Diagram: spires A, B, and C, with nothing on A, the largest disk on B, and the two smaller disks on C.]

The only operation that remains is to call

    moveTower(n-1, tmp, finish, start);

with the arguments from the current stack frame, which are

    moveTower(2, 'C', 'B', 'A');

If you’re still suspicious of the recursive process, you can draw the stack frame created by this function call and continue tracing the process to its ultimate conclusion. At some point, however, it is essential that you trust the recursive process enough to see that function call as a single operation having the effect of the following command in English:

    Move a tower of size 2 from C to B, using A as a temporary repository.

If you think about the process in this holistic form, you can immediately see that completion of this step will move the tower of two disks back from C to B, leaving the desired final configuration:

[Diagram: spires A, B, and C, with all three disks on B.]

8.2 The subset-sum problem

Although the Towers of Hanoi problem offers a wonderful illustration of the power of recursion, its effectiveness as an example is compromised by its lack of any practical application. Many people are drawn to programming because it enables them to solve practical problems. If the only examples of recursion are like the Towers of Hanoi, it’s easy to conclude that recursion is useful only for solving abstract puzzles. Nothing could be further from the truth.
Recursive strategies give rise to extremely efficient solutions to practical problems, most notably the problem of sorting introduced in Chapter 10, that are hard to solve in other ways.

The problem covered in this section is called the subset-sum problem, which can be defined as follows:

    Given a set of integers and a target value, determine whether it is possible to find a subset of those integers whose sum is equal to the specified target.

For example, given the set { -2, 1, 3, 8 } and the target value 7, the answer to the subset-sum question is yes, because the subset { -2, 1, 8 } adds up to 7. If the target value had been 5, however, the answer would be no, because there is no way to choose a subset of the integers in { -2, 1, 3, 8 } that adds up to 5.

It is easy to translate the idea of the subset-sum problem into C++. The concrete goal is to write a predicate function

    bool subsetSumExists(Set<int> & set, int target);

that takes the required information and returns true if it is possible to generate the value target by adding up some combination of elements chosen from set.

Even though it might at first seem that the subset-sum problem is just as esoteric as the Towers of Hanoi, it has significant importance in both the theory and practice of computer science. As you will discover in Chapter 10, the subset-sum problem is an instance of an important class of computational problems that are hard to solve efficiently. That very fact, however, makes problems like subset-sum useful in applications where the goal is to keep information secret. The first implementation of public-key cryptography, for example, used a variant of the subset-sum problem as its mathematical foundation. By basing their operation on problems that are provably hard, modern encryption strategies are more difficult to break.

The search for a recursive solution

The subset-sum problem is difficult to solve using a traditional iterative approach.
To make any headway, you need to think recursively. As always, you therefore need to identify a simple case and a recursive decomposition. In applications that work with sets, the simple case is almost always when the set is empty. If the set is empty, there is no way that you can add elements to produce a target value unless the target is zero. That discovery suggests that the code for subsetSumExists will start off like this:

    bool subsetSumExists(Set<int> & set, int target) {
        if (set.isEmpty()) {
            return target == 0;
        } else {
            // Find a recursive decomposition that simplifies the problem.
        }
    }

In this problem, the hard part is finding that recursive decomposition.

When you are looking for a recursive decomposition, you need to be on the lookout for some value in the inputs (which are conveyed as arguments in the C++ formulation of the problem) that you can make smaller. In this case, what you need to do is make the set smaller, because what you’re trying to do is move toward the simple case that occurs when the set is empty. If you take an element out of the set, what’s left over is smaller by one element. The operations exported by the Set class make it easy to choose an element from a set and then determine what’s left over. All you need is the following code:

    int element = set.first();
    Set<int> rest = set - element;

The first method returns the element of the set that appears first in its iteration order, and the expression involving the overloaded - operator produces the set that contains every element in set except the value of element. The fact that element is first in iteration order is not really important here. All you really need is some way to choose some element and then to create a smaller set by removing the element you selected from the original set.

The inclusion/exclusion pattern

Making the set smaller, however, is not enough to solve this problem.
The code to divide a set into an element and the rest of the set will come up again in many recursive applications and is part of a general programming pattern for working with the Set class. Structurally, you know that subsetSumExists must call itself recursively on the smaller set now stored in the variable rest. What you haven’t yet determined is how the solution to these recursive subproblems will help to solve the original.

The key insight you need to solve this problem is that there are two ways that you might be able to produce the desired target sum after you have identified a particular element. One possibility is that the subset you’re looking for includes that element. For that to happen, it must be possible to take the rest of the set and produce the value target - element. The other possibility is that the subset you’re looking for excludes that element, in which case it must be possible to generate the value target using only the leftover set of elements. This insight is enough to complete the implementation of subsetSumExists, as follows:

    bool subsetSumExists(Set<int> & set, int target) {
        if (set.isEmpty()) {
            return target == 0;
        } else {
            int element = set.first();
            Set<int> rest = set - element;
            return subsetSumExists(rest, target)
                || subsetSumExists(rest, target - element);
        }
    }

Because the recursive strategy subdivides the general case into one branch that includes a particular element and another that excludes it, this strategy is sometimes called the inclusion/exclusion pattern. As you work through the exercises in this chapter as well as several subsequent ones, you will find that this pattern, with slight variations, comes up in many different applications. Although the pattern is easiest to recognize when you are working with sets, it also comes up in applications involving vectors and strings, and you should be on the lookout for it in those situations as well.
8.3 Generating permutations

Many word games and puzzles require the ability to rearrange a set of letters to form a word. Thus, if you wanted to write a Scrabble program, it would be useful to have a facility for generating all possible arrangements of a particular set of tiles. In word games, such arrangements are generally called anagrams. In mathematics, they are known as permutations.

Let’s suppose you want to write a function

    Set<string> generatePermutations(string str);

that returns a set containing all permutations of the string. For example, if you call

    generatePermutations("ABC")

the function should return a set containing the following elements:

    { "ABC", "ACB", "BAC", "BCA", "CAB", "CBA" }

How might you go about implementing the generatePermutations function? If you are limited to iterative control structures, finding a general solution that works for strings of any length is difficult. Thinking about the problem recursively, on the other hand, leads to a relatively straightforward solution.

As is usually the case with recursive programs, the hard part of the solution process is figuring out how to divide the original problem into simpler instances of the same problem. In this case, to generate all permutations of a string, you need to discover how being able to generate all permutations of a shorter string might contribute to the solution.

Before you look at my solution on the next page, stop and think about this problem for a few minutes. When you are first learning about recursion, it is easy to look at a recursive solution and believe that you could have generated it on your own. Without trying it first, however, it is hard to know whether you would have come up with the necessary recursive insight.

Finding the recursive insight

To give yourself more of a feel for the problem, it helps to consider a concrete case. Suppose you want to generate all permutations of a five-character string, such as "ABCDE".
In your solution, you can apply the recursive leap of faith to generate all permutations of any shorter string. Just assume that the recursive calls work and be done with it. Once again, the critical question is how being able to permute shorter strings helps you solve the problem of permuting the original five-character string.

If you focus on breaking the five-character permutation problem down into some number of instances of the permutation problem involving four-character strings, you will soon discover that the permutations of the five-character string "ABCDE" consist of the following strings:

  • The character 'A' followed by every possible permutation of "BCDE"
  • The character 'B' followed by every possible permutation of "ACDE"
  • The character 'C' followed by every possible permutation of "ABDE"
  • The character 'D' followed by every possible permutation of "ABCE"
  • The character 'E' followed by every possible permutation of "ABCD"

More generally, you can construct the set of all permutations of a string of length n by selecting each character in turn and then, for each of those n possible first characters, concatenating the selected character on to the front of every possible permutation of the remaining n - 1 characters. The problem of generating all permutations of n - 1 characters is a smaller instance of the same problem and can therefore be solved recursively.

As always, you also need to define a simple case. One possibility is to check to see whether the string contains a single character. Computing all the permutations of a single-character string is easy, because there is only one possible ordering. In string processing, however, the best choice for the simple case is rarely a one-character string, because there is in fact an even simpler alternative: the empty string containing no characters at all. Just as there is only one ordering for a single-character string, there is only one way to write the empty string.
If you call generatePermutations(""), you should get back a set containing a single element, which is the empty string.

Once you have both the simple case and the recursive insight, writing the code for generatePermutations becomes reasonably straightforward. The code for generatePermutations appears in Figure 8-2, along with a simple test program that asks the user for a string and then prints out every possible permutation of the characters in that string.

FIGURE 8-2 Program to generate all permutations of a string (Permutations.cpp)

    /*
     * File: Permutations.cpp
     * ----------------------
     * This file generates all permutations of an input string.
     */

    #include <iostream>
    #include "set.h"
    #include "simpio.h"
    using namespace std;

    /* Function prototypes */

    Set<string> generatePermutations(string str);

    /* Main program */

    int main() {
        string str = getLine("Enter a string: ");
        cout << "The permutations of \"" << str << "\" are:" << endl;
        foreach (string s in generatePermutations(str)) {
            cout << "  \"" << s << "\"" << endl;
        }
        return 0;
    }

    /*
     * Function: generatePermutations
     * Usage: Set<string> permutations = generatePermutations(str);
     * ------------------------------------------------------------
     * Returns a set consisting of all permutations of the specified string.
     * This implementation uses the recursive insight that you can generate
     * all permutations of a string by selecting each character in turn,
     * generating all permutations of the string without that character,
     * and then concatenating the selected character on the front of each
     * string generated.
     */

    Set<string> generatePermutations(string str) {
        Set<string> result;
        if (str == "") {
            result += "";
        } else {
            for (int i = 0; i < str.length(); i++) {
                char ch = str[i];
                string rest = str.substr(0, i) + str.substr(i + 1);
                foreach (string s in generatePermutations(rest)) {
                    result += ch + s;
                }
            }
        }
        return result;
    }

If you run the Permutations program and enter the string "ABC", you see the following output:

    Enter a string: ABC
    The permutations of "ABC" are:
      "ABC"
      "ACB"
      "BAC"
      "BCA"
      "CAB"
      "CBA"

The use of sets in this application ensures that the program generates permutations in alphabetical order and that each distinct ordering of the characters appears exactly once, even if there are repeated letters in the input string. For example, if you enter the string AABB in response to the prompt, the program produces only six permutations, as follows:

    Enter a string: AABB
    The permutations of "AABB" are:
      "AABB"
      "ABAB"
      "ABBA"
      "BAAB"
      "BABA"
      "BBAA"

The recursive process adds a new element to this set the full total of 24 (4!) times, but the implementation of the Set class ensures that no duplicate values appear.

You can use the generatePermutations function to generate all anagrams of a word by changing the main program from Figure 8-2 so that it checks each string against the English lexicon. If you enter the string "aeinrst", you get the following output, a list that serious Scrabble players will recognize instantly:

    Enter the letters: aeinrst
    The anagrams of aeinrst are:
      anestri
      nastier
      ratines
      retains
      retinas
      retsina
      stainer
      stearin

8.4 Graphical recursion

Some of the most exciting applications of recursion use graphics to create intricate pictures in which a particular motif is repeated at many different scales. The remainder of this chapter offers a few examples of graphical recursion that make use of the graphics.h interface introduced briefly at the end of Chapter 2.
This material is not essential to learning about recursion, and you can skip it if you don’t have ready access to the graphics library. On the other hand, working through these examples will make recursion seem a lot more powerful, not to mention more fun.

An example from computer art

In the early part of the twentieth century, a controversial artistic movement arose in Paris, largely under the influence of Pablo Picasso and Georges Braque. The Cubists, as they were called by their critics, rejected classical artistic notions of perspective and representationalism and instead produced highly fragmented works based on primitive geometrical forms. Strongly influenced by Cubism, the Dutch painter Piet Mondrian (1872–1944) produced a series of compositions based on horizontal and vertical lines, such as the one shown in Figure 8-3.

FIGURE 8-3 Image and grid pattern from Mondrian, “Composition with Grid 6,” 1919

Suppose that you want to generate compositions such as the following, which, like much of Mondrian’s work, consists only of horizontal and vertical lines:

[Figure: a sample composition made up of horizontal and vertical lines.]

How would you go about designing a general strategy to create such a figure using the graphics library?

To understand how a program might produce such a figure, it helps to think about the process as one of successive decomposition. At the beginning, the canvas was simply an empty rectangle that looked like this:

[Figure: an empty rectangular canvas.]

If you want to subdivide the canvas using a series of horizontal and vertical lines, the easiest way to start is by drawing a single line that divides the rectangle in two:

[Figure: the canvas divided in two by a single line.]

If you’re thinking recursively, the thing to notice at this point is that you now have two empty rectangular canvases, each of which is smaller in size. The task of subdividing these rectangles is the same as before, so you can perform it by using a recursive implementation of the same procedure.

The only thing needed for a complete recursive strategy is a simple case.
The process of dividing up rectangles can’t go on indefinitely. As the rectangles get smaller and smaller, at some point the process has to stop. One approach is to look at the area of each rectangle before you start. Once the area of a rectangle falls below some threshold, you needn’t bother to subdivide it any further.

The Mondrian.cpp program in Figure 8-4 implements the recursive algorithm, using the entire graphics window as the initial canvas. In Mondrian.cpp, the recursive function subdivideCanvas does all the work. The arguments give the position and dimensions of the current rectangle on the canvas. At each step in the decomposition, the function simply checks to see whether the rectangle is large enough to split. If it is, the function checks to see which dimension (width or height) is larger and accordingly divides the rectangle with a vertical or horizontal line. In each case, the function draws only a single line; all remaining lines in the figure are drawn by subsequent recursive calls.

Fractals

In the late 1970s, a researcher at IBM named Benoit Mandelbrot (1924–2010) generated a great deal of excitement by publishing a book on fractals, which are geometrical structures in which the same pattern is repeated at many different scales. Although mathematicians have known about fractals for a long time, there was a resurgence of interest in the subject during the 1980s, partly because the development of computers made it possible to do so much more with fractals than had ever been possible before.

One of the earliest examples of fractal figures is called the Koch snowflake after its inventor, Helge von Koch (1870–1924). The Koch snowflake begins with an equilateral triangle like this:

[Figure: an equilateral triangle.]

FIGURE 8-4 Program to subdivide the plane in a Mondrian-like style (Mondrian.cpp)

    /*
     * File: Mondrian.cpp
     * ------------------
     * This program creates a random line drawing in a style reminiscent
     * of the Dutch painter Piet Mondrian.
     */

    #include <iostream>
    #include "graphics.h"
    #include "random.h"
    using namespace std;

    /* Constants */

    const double MIN_AREA = 10000;   /* Smallest square that will be split */
    const double MIN_EDGE = 20;      /* Smallest edge length allowed       */

    /* Function prototypes */

    void subdivideCanvas(double x, double y, double width, double height);

    /* Main program */

    int main() {
        initGraphics();
        subdivideCanvas(0, 0, getWindowWidth(), getWindowHeight());
        return 0;
    }

    /*
     * Function: subdivideCanvas
     * Usage: subdivideCanvas(x, y, width, height);
     * --------------------------------------------
     * Decomposes the specified rectangular region on the canvas recursively
     * using randomly chosen horizontal and vertical lines. The rectangle
     * will not be split if its area is too small, and the division is always
     * chosen so that it splits the larger of the two dimensions.
     */

    void subdivideCanvas(double x, double y, double width, double height) {
        if (width * height >= MIN_AREA) {
            if (width > height) {
                double mid = randomReal(MIN_EDGE, width - MIN_EDGE);
                subdivideCanvas(x, y, mid, height);
                subdivideCanvas(x + mid, y, width - mid, height);
                drawLine(x + mid, y, x + mid, y + height);
            } else {
                double mid = randomReal(MIN_EDGE, height - MIN_EDGE);
                subdivideCanvas(x, y, width, mid);
                subdivideCanvas(x, y + mid, width, height - mid);
                drawLine(x, y + mid, x + width, y + mid);
            }
        }
    }

This triangle, in which the sides are straight lines, is called the Koch snowflake of order 0. The figure is then revised in stages to generate fractals of successively higher orders. At each stage, every straight-line segment in the figure is replaced by one in which the middle third consists of a triangular bump protruding outward from the figure.
Thus, the first step is to replace each line segment in the triangle with a line that looks like this:

[Figure: a line segment whose middle third is replaced by a triangular bump.]

Applying this transformation to each of the three sides of the original triangle generates the Koch snowflake of order 1, as follows:

[Figure: the Koch snowflake of order 1.]

If you then replace each line segment in this figure with a new line that again includes a triangular wedge, you create the following order-2 Koch snowflake:

[Figure: the Koch snowflake of order 2.]

Replacing each of these line segments gives the order-3 fractal shown in the following diagram, which now looks even more like a snowflake:

[Figure: the Koch snowflake of order 3.]

Because figures like the Koch snowflake are much easier to draw by computer than by hand, it makes sense to write a program that uses the facilities exported by the graphics.h interface to generate this design. Although it is possible to draw fractal snowflakes using only the drawLine function, it is usually easier to use the drawPolarLine function in the graphics library, which lets you specify a line in terms of its length and a direction. In mathematics, the length and direction of a line segment are conventionally represented by the symbols r and θ, which are called its polar coordinates. The use of polar coordinates is illustrated by the following diagram, in which the solid line has length r and extends from its starting point at the angle θ measured in degrees counterclockwise from the x-axis:

[Figure: a line of length r drawn at angle θ, measured counterclockwise from the x-axis.]

The drawPolarLine function takes the coordinates of the starting point (either as separate coordinates or as a GPoint object) and returns the coordinates of the other endpoint of the line segment, thereby making it easy to chain consecutive segments together. For example, the following code draws a downward-pointing equilateral triangle whose upper left corner is at the original value of pt:

    pt = drawPolarLine(pt, size, 0);
    pt = drawPolarLine(pt, size, -120);
    pt = drawPolarLine(pt, size, +120);

This code creates the snowflake fractal of order 0.
To generalize it so that it creates higher-order fractals, all you need to do is replace the calls to drawPolarLine with a new function drawFractalLine that takes an additional parameter specifying the order of the fractal line, as follows:

    pt = drawFractalLine(pt, size, 0, order);
    pt = drawFractalLine(pt, size, -120, order);
    pt = drawFractalLine(pt, size, +120, order);

The only remaining task is to implement drawFractalLine, which is easy if you think about it recursively. The simple case for drawFractalLine occurs when order is 0, in which case the function simply draws a straight line with the specified length and direction. If order is greater than 0, the fractal line is broken down into four components, each of which is a fractal line of the next lower order. Like drawPolarLine, the drawFractalLine function returns the end point of the last segment it draws so that the next fractal line can begin where the last one left off. The complete implementation of the Snowflake program, which includes the finished code for drawFractalLine, appears in Figure 8-5.

FIGURE 8-5 Program to draw the Koch fractal snowflake (Snowflake.cpp)

    /*
     * File: Snowflake.cpp
     * -------------------
     * This program draws a Koch fractal snowflake.
     */

    #include <iostream>
    #include <cmath>
    #include "graphics.h"
    #include "gtypes.h"
    #include "simpio.h"
    using namespace std;

    /* Function prototypes */

    GPoint drawFractalLine(GPoint pt, double len, double theta, int order);

    /* Main program */

    int main() {
        initGraphics();
        cout << "Program to draw a snowflake fractal."
             << endl;
        double size = getReal("Enter edge length in pixels: ");
        int order = getInteger("Enter order of fractal: ");
        double cx = getWindowWidth() / 2;
        double cy = getWindowHeight() / 2;
        GPoint pt(cx - size / 2, cy - sqrt(3.0) * size / 6);
        pt = drawFractalLine(pt, size, 0, order);
        pt = drawFractalLine(pt, size, -120, order);
        pt = drawFractalLine(pt, size, +120, order);
        return 0;
    }

    /*
     * Function: drawFractalLine
     * Usage: GPoint end = drawFractalLine(pt, len, theta, order);
     * -----------------------------------------------------------
     * Draws a fractal edge starting from the point pt and then extending
     * len units in direction theta. If order is 0, the fractal edge is
     * a straight line. If order is greater than zero, the edge is divided
     * into four line segments, each of which is a fractal edge of the next
     * lower order. The function returns the end point of the line.
     */

    GPoint drawFractalLine(GPoint pt, double len, double theta, int order) {
        if (order == 0) {
            return drawPolarLine(pt, len, theta);
        } else {
            pt = drawFractalLine(pt, len / 3, theta, order - 1);
            pt = drawFractalLine(pt, len / 3, theta + 60, order - 1);
            pt = drawFractalLine(pt, len / 3, theta - 60, order - 1);
            return drawFractalLine(pt, len / 3, theta, order - 1);
        }
    }

Summary

This chapter in fact introduces relatively few new concepts because the fundamental precepts of recursion were introduced in Chapter 7. The point of this chapter is to raise the sophistication of the recursive examples to the point at which the problems become difficult to solve in any other way. Given this increase in sophistication, beginning students often find these problems much harder to comprehend than those in the preceding chapter. They are indeed harder, but recursion is a tool for solving hard problems. To master it, you need to practice with problems at this level of complexity.
The important points in this chapter include:

  • Whenever you want to apply recursion to a programming problem, you have to devise a strategy that transforms the problem into simpler instances of the same problem. Until you find the correct insight that leads to the recursive strategy, there is no way to apply recursive techniques.
  • Once you identify a recursive approach, it is important for you to check your strategy to ensure that it does not violate any conditions imposed by the problem.
  • When the problems you are trying to solve increase in complexity, the importance of accepting the recursive leap of faith increases.
  • Recursion is not magical. If you need to do so, you can simulate the operation of the computer yourself by drawing the stack frames for every procedure that is called in the course of the solution. On the other hand, it is critical to get beyond the skepticism that forces you to look at all the underlying details.

Review questions

1. In your own words, describe the recursive insight necessary to solve the Towers of Hanoi puzzle.

2. The following strategy for solving the Towers of Hanoi puzzle is structurally similar to the strategy used in the text:
   a. Move the top disk from the start spire to the temporary spire.
   b. Move a stack of N-1 disks from the start spire to the finish spire.
   c. Move the top disk now on the temporary spire back to the finish spire.
   Why does this strategy fail?

3. If you call

       moveTower(16, 'A', 'B', 'C')

   what line is displayed by moveSingleDisk as the first step in the solution? What is the last step in the solution?

4. What is a permutation?

5. In your own words, explain the recursive insight necessary to enumerate the permutations of the characters in a string.

6. How many permutations are there of the string "WXYZ"?

7. What simple case is used to terminate the recursion in Mondrian.cpp?

8. Draw a picture of the order-1 fractal snowflake.

9.
How many line segments appear in the order-2 fractal snowflake?

Exercises

1. Following the logic of the moveTower function, write a recursive function countHanoiMoves(n) that computes the number of moves required to solve the Towers of Hanoi puzzle for n disks.

2. To make the operation of the program somewhat easier to explain, the implementation of moveTower in this chapter uses if (n == 1) as its simple case test. Whenever you see a recursive program use 1 as its simple case, it pays to be a little skeptical; in most applications, 0 is a more appropriate choice. Rewrite the Towers of Hanoi program so that the moveTower function checks whether n is 0 instead. What happens to the length of the moveTower implementation?

3. Rewrite the Towers of Hanoi program so that it uses an explicit stack of pending tasks instead of recursion. In this context, a task can be represented most easily as a structure containing the number of disks to move and the names of the spires used for the start, finish, and temporary repositories. At the beginning of the process, you push onto your stack a single task that describes the process of moving the entire tower. The program then repeatedly pops the stack and executes the task found there until no tasks are left. Except for the simple cases, the process of executing a task results in the creation of more tasks that get pushed onto the stack for later execution.

4. In the subset-sum problem introduced in section 8-2, there are often several ways to generate the desired target number. For example, given the set { 1, 3, 4, 5 }, there are two different ways to produce the target value 5:
   • Select the 1 and the 4
   • Select just the 5
   By contrast, no subset of { 1, 3, 4, 5 } adds up to 11. Write a function
       int countSubsetSumWays(Set<int> & set, int target);
   that returns the number of ways in which you can produce the target value by choosing a subset of the specified set.
   For example, suppose that sampleSet has been initialized as follows:
       Set<int> sampleSet;
       sampleSet += 1, 3, 4, 5;
   Given this definition of sampleSet, calling
       countSubsetSumWays(sampleSet, 5)
   should return 2 (there are two ways to make 5), and calling
       countSubsetSumWays(sampleSet, 11)
   should return 0 (there are no ways to make 11).

5. Write a program EmbeddedWords that finds all English words that can be formed by taking some subset of letters in order from a given starting word. For example, given the starting word happy, you can certainly produce the words a, ha, hap, and happy, in which the letters appear consecutively. You can also produce the words hay and ay, because those letters appear in happy in the correct left-to-right order. You cannot, however, produce the words pa or pap because the letters, even though they appear in the word, don’t appear in the correct order. A sample run of the program might look like this:
       EmbeddedWords
       Enter starting word: happy
       The embedded words are:
       a ay ha hap happy hay

6. I am the only child of parents who weighed, measured, and priced everything; for whom what could not be weighed, measured, and priced had no existence.
   - Charles Dickens, Little Dorrit, 1857

   In Dickens’s time, merchants measured many commodities using weights and a two-pan balance, a practice that continues in many parts of the world today. If you are using a limited set of weights, however, you can only measure certain quantities accurately. For example, suppose that you have only two weights: a 1-ounce weight and a 3-ounce weight.
   With these you can easily measure out 4 ounces, as shown:
   [Figure: two-pan balance with the 3-ounce and 1-ounce weights together on one pan]
   It is somewhat more interesting to discover that you can also measure out 2 ounces by shifting the 1-ounce weight to the other side, as follows:
   [Figure: two-pan balance with the 1-ounce weight on one pan and the 3-ounce weight on the other]
   Write a recursive function
       bool isMeasurable(int target, Vector<int> & weights)
   that determines whether it is possible to measure out the desired target amount with a given set of weights, which is stored in the vector weights. For example, suppose that sampleWeights has been initialized like this:
       Vector<int> sampleWeights;
       sampleWeights += 1, 3;
   Given these values, the function call
       isMeasurable(2, sampleWeights)
   should return true because it is possible to measure out 2 ounces using the sample weight set as illustrated in the preceding diagram. On the other hand, calling
       isMeasurable(5, sampleWeights)
   should return false because it is impossible to use the 1- and 3-ounce weights to measure out 5 ounces.

7. In the card game called Cribbage, part of the game consists of adding up the score from a set of five playing cards. One of the components of the score is the number of distinct card combinations whose values add up to 15, with aces counting as 1 and all face cards (jacks, queens, and kings) counting as 10. Consider, for example, the following cards:
   [Figure: the five cards AD, 10S, 4H, 5C, and 9C]
   There are three different combinations that sum to 15, as follows:
       AD + 10S + 4H
       AD + 5C + 9C
       5C + 10S
   As a second example, the cards
   [Figure: the five cards 5C, 5D, 5H, 5S, and JC]
   contain the following eight different combinations that add up to 15:
       5C + JC        5C + 5D + 5H
       5D + JC        5C + 5D + 5S
       5H + JC        5C + 5H + 5S
       5S + JC        5D + 5H + 5S
   Write a function
       int countFifteens(Vector<Card> & cards);
   that takes a vector of Card values (as defined in Chapter 6, exercise 4) and returns the number of ways you can make 15 from that set of cards.
   You don’t need to know much about the Card class to solve this problem. The only thing you need is the getRank method, which returns the rank of the card as an integer. You may assume that the card.h interface exports the constant names ACE, JACK, QUEEN, and KING with the values 1, 11, 12, and 13, respectively.

8. The recursive decomposition presented in section 8-3 to solve the problem of generating permutations is not the only effective strategy. Another way to look at the problem is to formulate the recursive case like this:
   a) Remove the first character from the string and store it in the variable ch.
   b) Generate the set containing all permutations of the remaining characters.
   c) Form a new set by inserting ch in every possible position in each of those permutations.
   Reimplement the Permutations program so that it uses this new strategy.

9. The strategy used to implement the Permutations program in the text is designed to emphasize its recursive character. The resulting code is not particularly efficient, mostly because it ends up generating sets that are later discarded and because it applies methods like substr that require copying the characters in a string. It is possible to eliminate those inefficiencies using the following recursive formulation:
   a) At each level, pass the entire string along with an index that indicates where the permutation process starts. Characters in the string before this index stay where they are; characters at or after that position must go through all their permutations.
   b) The simple case occurs when the index reaches the end of the string.
   c) The recursive case operates by swapping the character at the index position with every other character at or after that position, generating every permutation starting with the next higher index, and then swapping the characters back to ensure that the original order is restored.
   Use this strategy to implement a function
       void listPermutations(string str);
   that lists on cout all permutations of the string str without generating any sets at all or applying any string methods other than length or selection. The listPermutations function itself must be a wrapper function for a second function that includes the index.
   This function is relatively easy to implement if you don’t try to take account of duplicated letters in the string. The interesting challenge comes in adapting the structure of the algorithm so that it lists each unique permutation exactly once without using sets to accomplish that task. You should not, however, worry about the order in which listPermutations delivers its output.

10. On a telephone keypad, the digits are mapped onto the alphabet as shown in the diagram below:
        1          2 ABC      3 DEF
        4 GHI      5 JKL      6 MNO
        7 PQRS     8 TUV      9 WXYZ
        *          0          #
    In order to make their phone numbers more memorable, service providers like to find numbers that spell out some word (called a mnemonic) appropriate to their business that makes that phone number easier to remember. Imagine that you have just been hired by a local telephone company to write a function listMnemonics that will generate all possible letter combinations that correspond to a given number, represented as a string of digits. For example, the call
        listMnemonics("723")
    should list the following 36 possible letter combinations that correspond to that prefix:
        PAD PBD PCD QAD QBD QCD RAD RBD RCD SAD SBD SCD
        PAE PBE PCE QAE QBE QCE RAE RBE RCE SAE SBE SCE
        PAF PBF PCF QAF QBF QCF RAF RBF RCF SAF SBF SCF

11. Rewrite the program from exercise 10 so that it uses the Lexicon class and the EnglishWords.dat file so that the program only lists mnemonics that are valid English words.

12. These days, the letters on a telephone keypad are not used for mnemonics as much as they are for texting.
    Entering text using a keypad is problematic, because there are fewer keys than there are letters in the alphabet. Some cell phones use a “multi-tap” user interface, in which you tap the 2 key once for a, twice for b, and three times for c, which can get tedious. A streamlined alternative is to use a predictive strategy in which the cell phone guesses which of the possible letters you intended, based on the sequence so far and its possible completions.
    For example, if you type the digit sequence 72, there are 12 possibilities: pa, pb, pc, qa, qb, qc, ra, rb, rc, sa, sb, and sc. Only four of these letter pairs (pa, ra, sa, and sc) seem promising because they are prefixes of common English words like party, radio, sandwich, and scanner. The others can be ignored because there are no common words that begin with those sequences of letters. If the user enters 9956, there are 144 (4 x 4 x 3 x 3) possible letter sequences, but you can be assured the user meant xylo since that is the only sequence that is a prefix of any English word. Write a function
        void listCompletions(string digits, Lexicon & lex);
    that prints all words from the lexicon that can be formed by extending the given digit sequence. For example, calling
        listCompletions("72547", english)
    should generate the following sample run:
        ListCompletions
        palisade palisaded palisades palisading palish rakis rakish rakishly rakishness sakis
    If your only concern is getting the answer, the easiest way to solve this problem is to iterate through the words in the lexicon and print out each word that matches the specified digit string. That solution requires no recursion and very little thinking. Your managers, however, believe that looking through every word in the dictionary is slow and insist that your code use the lexicon only to test whether a given string is a word or a prefix of an English word.
    With that restriction, you need to figure out how to generate all possible letter sequences from the string of digits. That task is easiest to solve recursively.

13. Many of Mondrian’s geometrical paintings fill in the rectangular regions with some color, as in his “Composition with Color Planes and Gray Lines 1” shown in Figure 8-6. Extend the Mondrian program from the text so that it fills in some fraction of the rectangular regions it creates with randomly chosen colors.

    FIGURE 8-6 Piet Mondrian, “Composition with Color Planes and Gray Lines 1,” 1918

14. In countries like the United States that still use the traditional English system of measurement, each inch on a ruler is marked off into fractions using tick marks that look like this:
    The longest tick mark falls at the half-inch position, two smaller tick marks indicate the quarter inches, and even smaller ones are used to mark the eighths and sixteenths. Write a recursive program that draws a 1-inch line at the center of the graphics window and then draws the tick marks shown in the diagram. Assume that the length of the tick mark indicating the half-inch position is given by the constant definition
        const double HALF_INCH_TICK = 0.2;
    and that each smaller tick mark is half the size of the next larger one.

15. One of the reasons that fractals have generated so much interest is that they turn out to be useful in some surprising practical contexts. For example, the most successful techniques for drawing computer images of mountains and certain other landscape features involve using fractal geometry. As a simple example of where this issue comes up, consider the problem of connecting two points A and B with a fractal that looks like a coastline on a map. The simplest possible strategy would be to draw a straight line between the two points:
    [Figure: straight line from A to B]
    This is the order-0 coastline and represents the base case of the recursion.
    Of course, a real coastline will have small peninsulas or inlets somewhere along its length, so you would expect a realistic drawing of a coastline to jut in or out occasionally like a real one. As a first approximation, you could replace the straight line with the same fractal line used to create the snowflake fractal, as follows:
    [Figure: order-1 fractal line from A to B]
    This process gives the order-1 coastline. However, in order to give the feeling of a traditional coastline, it is important for the triangular wedge in this line sometimes to point up and sometimes down, with equal probability.
    If you then replace each of the straight-line segments in the order-1 fractal with a fractal line in a random direction, you get the order-2 coastline, which might look like this:
    [Figure: order-2 coastline from A to B]
    Continuing this process eventually results in a drawing that conveys a remarkably realistic sense, as in this order-5 coastline:
    [Figure: order-5 coastline from A to B]
    Write a program to draw a fractal coastline on the graphics window.

16. To celebrate its 550th anniversary in 2008, Magdalen College at Oxford commissioned the English artist Mark Wallinger to create a sculpture called Y that has a decidedly recursive structure. A photograph of the sculpture appears at the left of Figure 8-7 and a diagram illustrating its fractal nature appears at the right. Given its branching structure, the underlying pattern in Wallinger’s sculpture is called a fractal tree. The tree begins as a simple trunk indicated by a straight vertical line, as follows:
    The trunk may branch at the top to form two lines that veer off at an angle, as shown:
    These branches may themselves split to form new branches, which split to form new ones, and so on. Write a program that uses the graphics library to draw the fractal tree in Wallinger’s sculpture. If you carry this process on to the eighth-order fractal, you get the image on the right of Figure 8-7.

    FIGURE 8-7 Photograph and structure of Mark Wallinger’s fractal tree

18.
    If you search the web for fractal designs, you will find many intricate wonders beyond the Koch snowflake illustrated in this chapter. One of these is the Sierpinski Triangle, named after its inventor, the Polish mathematician Wacław Sierpiński (1882–1969). The order-0 Sierpinski Triangle is an equilateral triangle:
    To create an order-N Sierpinski Triangle, you draw three Sierpinski Triangles of order N – 1, each of which has half the edge length of the original. Those three triangles are placed in the corners of the larger triangle, which means that the order-1 Sierpinski Triangle looks like this:
    The downward-pointing triangle in the middle of this figure is not drawn explicitly, but is instead formed by the sides of the other three triangles. That area, moreover, is not recursively subdivided and will remain unchanged at every level of the fractal decomposition. Thus, the order-2 Sierpinski Triangle has the same open area in the middle:
    If you continue this process through three more recursive levels, you get the order-5 Sierpinski Triangle, which looks like this:
    Write a program that asks the user for an edge length and a fractal order and draws the resulting Sierpinski Triangle in the center of the graphics window.

Chapter 9
Backtracking Algorithms

Truth is not discovered by proofs but by exploration. It is always experimental.
- Simone Weil, The New York Notebook, 1942

For many real-world problems, the solution process consists of working your way through a sequence of decision points in which each choice leads you further along some path. If you make the correct set of choices, you end up at the solution. On the other hand, if you reach a dead end or otherwise discover that you have made an incorrect choice somewhere along the way, you have to backtrack to a previous decision point and try a different path. Algorithms that use this approach are called backtracking algorithms.
If you think about a backtracking algorithm as the process of repeatedly exploring paths until you encounter the solution, the process appears to have an iterative character. As it happens, however, most problems of this form are easier to solve recursively. The fundamental recursive insight is simply this: a backtracking problem has a solution if and only if at least one of the smaller backtracking problems that result from making each possible initial choice has a solution. The examples in this chapter are designed to illustrate this process and demonstrate the power of recursion in this domain.

9.1 Recursive backtracking in a maze

Once upon a time, in the days of Greek mythology, the Mediterranean island of Crete was ruled by a tyrannical king named Minos. From time to time, Minos demanded tribute from the city of Athens in the form of young men and women, whom he would sacrifice to the Minotaur, a fearsome beast with the head of a bull and the body of a man. To house this deadly creature, Minos forced his servant Daedalus (the engineering genius who later escaped by constructing a set of wings) to build a vast underground labyrinth at Knossos. The young sacrifices from Athens would be led into the labyrinth, where they would be eaten by the Minotaur before they could find their way out. This tragedy continued until young Theseus of Athens volunteered to be one of the sacrifices. Following the advice of Minos’s daughter Ariadne, Theseus entered the labyrinth with a sword and a ball of string. After slaying the monster, Theseus was able to find his way back to the exit by unwinding the string as he went along.

The right-hand rule

Ariadne’s strategy is an algorithm for escaping from a maze, but not everyone trapped in a maze is lucky enough to have a ball of string. Fortunately, there are other strategies for solving a maze.
Of these strategies, the best known is called the right-hand rule, which can be expressed in the following pseudocode form:

    Put your right hand against a wall.
    while (you have not yet escaped from the maze) {
        Walk forward keeping your right hand on a wall.
    }

To visualize the operation of the right-hand rule, imagine that Theseus has successfully dispatched the Minotaur and is now standing in the position marked by the first character in Theseus’s name, the Greek letter theta (Θ):
[Figure: maze with Theseus’s position marked Θ]
If Theseus puts his right hand on the wall and then follows the right-hand rule from there, he will trace out the path shown by the dashed line in this diagram:
[Figure: the same maze with the right-hand-rule path shown as a dashed line]
Unfortunately, the right-hand rule does not in fact work in every maze. If there is a loop that surrounds the starting position, Theseus can get trapped in an infinite loop, as illustrated by the following simple maze:
[Figure: maze in which a loop of wall surrounds the starting position Θ]

Finding a recursive approach

As the while loop in its pseudocode form makes clear, the right-hand rule is an iterative strategy. You can, however, also think about the process of solving a maze from a recursive perspective. To do so, you must adopt a different mindset. You can no longer think about the problem in terms of finding a complete path. Instead, your goal is to find a recursive insight that simplifies the problem, one step at a time. Once you have made the simplification, you use the same process to solve each of the resulting subproblems.

Let’s go back to the initial configuration of the maze shown in the illustration of the right-hand rule. Put yourself in Theseus’s place. From the initial configuration, you have three choices, as indicated by the arrows in the following diagram:
[Figure: maze with arrows showing the three possible moves from Θ]
The exit, if any, must lie along one of those paths. Moreover, if you choose the correct direction, you will be one step closer to the solution. The maze has therefore become simpler along that path, which is the key to a recursive solution.
This observation suggests the necessary recursive insight. The original maze has a solution if and only if it is possible to solve at least one of the new mazes shown in Figure 9-1. The × in each diagram marks the original starting square and is off-limits for any of the recursive solutions because the optimal solution will never have to backtrack through this square.

FIGURE 9-1 Recursive decomposition of a maze [three submazes labeled (a), (b), and (c)]

By looking at the mazes in Figure 9-1, it is easy to see, at least from your global vantage point, that the submazes labeled (a) and (c) represent dead-end paths and that the only solution begins in the direction shown in the submaze (b). If you are thinking recursively, however, you don’t need to carry on the analysis all the way to the solution. You have already decomposed the problem into simpler instances. All you need to do is rely on the power of recursion to solve the individual subproblems, and you’re home free. You still have to identify a set of simple cases so that the recursion can terminate, but the hard work has been done.

Identifying the simple cases

What constitutes the simple case for a maze? One possibility is that you might already be standing outside the maze. If so, you’re finished. Clearly, this situation represents one simple case. There is, however, another possibility. You might also reach a blind alley where you’ve run out of places to move. For example, if you try to solve the sample maze by moving north and then continue to make recursive calls along that path, you will eventually be in the position of trying to solve the following maze:
[Figure: maze with Θ at the end of a blind alley, surrounded by walls and marked squares]
At this point, you’ve run out of room to maneuver. Every path from the new position is either marked or blocked by a wall, which makes it clear that the maze has no solution from this point. Thus, the maze problem has a second simple case in which every direction from the current square is blocked, either by a wall or a marked square.
It is easier to code the recursive algorithm if, instead of checking for marked squares as you consider the possible directions of motion, you go ahead and make the recursive calls on those squares. If you check at the beginning of the procedure to see whether the current square is marked, you can terminate the recursion at that point. After all, if you find yourself positioned on a marked square, you must be retracing your path, which means that the optimal solution must lie in some other direction.

Thus, the two simple cases for this problem are as follows:

1. If the current square is outside the maze, the maze is solved.
2. If the current square is marked, the maze is unsolvable, at least along the path you’ve chosen so far.

Coding the maze solution algorithm

Although the recursive insight and the simple cases are all you need to solve the problem on a conceptual level, writing a complete program to navigate a maze requires you to consider a number of implementation details as well. For example, you need to decide on a representation for the maze itself that allows you to figure out where the walls are, keep track of the current position, indicate that a particular square is marked, and determine whether you have escaped from the maze. While designing an appropriate data structure for the maze is an interesting programming challenge in its own right, it has very little to do with understanding the recursive algorithm, which is the focus of this discussion. If anything, the details of the data structure are likely to get in the way and make it more difficult for you to understand the algorithmic strategy as a whole.

Fortunately, it is possible to set those details aside by introducing a new interface that hides some of the complexity. The maze.h interface in Figure 9-2 exports a class called Maze that encapsulates all the information necessary to keep track of the passages in a maze and to display that maze in the graphics window.
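If you want to experiment with the two simple cases listed above without the graphics library, a minimal text-based sketch along the following lines may help. It is only an illustration under stated assumptions: GridMaze and solveGridMaze are hypothetical names, and the representation (a grid of characters in which '#' is a wall cell and anything off the grid counts as outside) is a simplification that differs from the Maze class, where walls lie between squares.

```cpp
#include <string>
#include <vector>

// Hypothetical text-grid stand-in for a maze (not the maze.h interface):
// '#' is a wall cell, any other character is open, and every position
// off the grid counts as outside the maze.
struct GridMaze {
    std::vector<std::string> rows;
    std::vector<std::vector<bool>> marked;   // marked[y][x]: square is on the current path

    explicit GridMaze(const std::vector<std::string>& r) : rows(r) {
        for (const std::string& row : rows) {
            marked.push_back(std::vector<bool>(row.size(), false));
        }
    }

    bool isOutside(int x, int y) const {
        return y < 0 || y >= static_cast<int>(rows.size())
            || x < 0 || x >= static_cast<int>(rows[y].size());
    }

    bool isWall(int x, int y) const {
        return rows[y][x] == '#';
    }
};

// Returns true if there is a path from (x, y) to the outside of the grid.
// On success, the squares along the path remain marked.
bool solveGridMaze(GridMaze& maze, int x, int y) {
    if (maze.isOutside(x, y)) return true;     // simple case 1: escaped
    if (maze.isWall(x, y)) return false;       // walls are cells in this sketch
    if (maze.marked[y][x]) return false;       // simple case 2: retracing the path
    maze.marked[y][x] = true;
    static const int DX[] = { 0, 1, 0, -1 };   // north, east, south, west
    static const int DY[] = { -1, 0, 1, 0 };   // y increases downward
    for (int d = 0; d < 4; d++) {
        if (solveGridMaze(maze, x + DX[d], y + DY[d])) return true;
    }
    maze.marked[y][x] = false;                 // dead end: undo the mark and backtrack
    return false;
}
```

Because walls are cells here, the wall test happens at the start of the call rather than before each recursive call; the marking, recursion, and unmarking steps are otherwise the ones the simple cases call for.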
Once you have access to the Maze class, writing a program to solve a maze becomes much simpler. The goal of this exercise is to write a function

    bool solveMaze(Maze & maze, Point pt);

The arguments to solveMaze are (1) the Maze object that holds the data structure and (2) the starting position, which changes for each of the recursive subproblems. To ensure that the recursion can terminate when a solution is found, the solveMaze function returns true if a solution has been found, and false otherwise.

Given this definition of solveMaze, the main program looks like this:

int main() {
    initGraphics();
    Maze maze("SampleMaze.txt");
    maze.showInGraphicsWindow();
    if (solveMaze(maze, maze.getStartPosition())) {
        cout << "The marked path is a solution." << endl;
    } else {
        cout << "No solution exists." << endl;
    }
    return 0;
}

FIGURE 9-2 The maze.h interface

/*
 * File: maze.h
 * ------------
 * This interface exports the Maze class.
 */

#ifndef _maze_h
#define _maze_h

#include "grid.h"
#include "point.h"

/*
 * Class: Maze
 * -----------
 * This class represents a two-dimensional maze contained in a rectangular
 * grid of squares. The maze is read in from a data file in which the
 * characters '+', '-', and '|' represent corners, horizontal walls, and
 * vertical walls, respectively; spaces represent open passageway squares.
 * The starting position is indicated by the character 'S'. For example,
 * the following data file defines a simple maze:
 *
 *       +-+-+-+-+-+
 *       |     |
 *       + +-+ + +-+
 *       |S  |     |
 *       +-+-+-+-+-+
 */

class Maze {

public:

/*
 * Constructor: Maze
 * Usage: Maze maze(filename);
 * ---------------------------
 * Constructs a new maze by reading the specified data file.
 */

    Maze(string filename);

/*
 * Method: showInGraphicsWindow
 * Usage: showInGraphicsWindow();
 *        showInGraphicsWindow(x, y);
 * ----------------------------------
 * Displays the maze in the graphics window, positioned with its origin
 * at the point (x, y), which defaults to center the maze in the window.
 */

    void showInGraphicsWindow();
    void showInGraphicsWindow(double x, double y);

/*
 * Method: getStartPosition
 * Usage: Point start = maze.getStartPosition();
 * ---------------------------------------------
 * Returns a Point indicating the coordinates of the start square.
 */

    Point getStartPosition();

/*
 * Method: isOutside
 * Usage: if (maze.isOutside(pt)) . . .
 * ------------------------------------
 * Returns true if the specified point is outside the boundary of the maze.
 */

    bool isOutside(Point pt);

/*
 * Method: wallExists
 * Usage: if (maze.wallExists(pt, dir)) . . .
 * ------------------------------------------
 * Returns true if there is a wall in direction dir from the square at pt.
 */

    bool wallExists(Point pt, Direction dir);

/*
 * Method: markSquare
 * Usage: maze.markSquare(pt);
 * ---------------------------
 * Marks the specified square in the maze.
 */

    void markSquare(Point pt);

/*
 * Method: unmarkSquare
 * Usage: maze.unmarkSquare(pt);
 * -----------------------------
 * Unmarks the specified square in the maze.
 */

    void unmarkSquare(Point pt);

/*
 * Method: isMarked
 * Usage: if (maze.isMarked(pt)) . . .
 * -----------------------------------
 * Returns true if the specified square is marked.
 */

    bool isMarked(Point pt);

#include "mazepriv.h"

};

#endif

The code for the solveMaze function appears in Figure 9-3, along with the function adjacentPoint(start, dir), which returns the point you reach if you move in the specified direction from the starting point.
FIGURE 9-3 The implementation of the solveMaze function

/*
 * Function: solveMaze
 * Usage: solveMaze(maze, start);
 * ------------------------------
 * Attempts to generate a solution to the current maze from the specified
 * start point. The solveMaze function returns true if the maze has a
 * solution and false otherwise. The implementation uses recursion
 * to solve the submazes that result from marking the current square
 * and moving one step along each open passage.
 */

bool solveMaze(Maze & maze, Point start) {
    if (maze.isOutside(start)) return true;
    if (maze.isMarked(start)) return false;
    maze.markSquare(start);
    for (Direction dir = NORTH; dir <= WEST; dir++) {
        if (!maze.wallExists(start, dir)) {
            if (solveMaze(maze, adjacentPoint(start, dir))) {
                return true;
            }
        }
    }
    maze.unmarkSquare(start);
    return false;
}

/*
 * Function: adjacentPoint
 * Usage: Point finish = adjacentPoint(start, dir);
 * ------------------------------------------------
 * Returns the point that results from moving one square from start
 * in the direction specified by dir. For example, if pt is the
 * point (1, 1), calling adjacentPoint(pt, EAST) returns the
 * point (2, 1). To maintain consistency with the graphics package,
 * the y coordinates increase as you move downward on the screen. Thus,
 * moving NORTH decreases the y component, and moving SOUTH increases it.
 */

Point adjacentPoint(Point start, Direction dir) {
    switch (dir) {
        case NORTH: return Point(start.getX(), start.getY() - 1);
        case EAST:  return Point(start.getX() + 1, start.getY());
        case SOUTH: return Point(start.getX(), start.getY() + 1);
        case WEST:  return Point(start.getX() - 1, start.getY());
    }
    return start;
}

Convincing yourself that the solution works

In order to use recursion effectively, at some point you must be able to look at a recursive function like the solveMaze example in Figure 9-3 and say to yourself something like this: “I understand how this works.
The problem is getting simpler because more squares are marked each time. The simple cases are clearly correct. This code must do the job.” For most of you, however, that confidence in the power of recursion will not come easily. Your natural skepticism makes you want to see the steps in the solution. The problem is that, even for a maze as simple as the one shown earlier in this chapter, the complete history of the steps involved in the solution is far too large to think about comfortably. Solving that maze, for example, requires 66 calls to solveMaze that are nested 27 levels deep when the solution is finally discovered. If you attempt to trace the code in detail, you will almost certainly get lost.

If you are not yet ready to accept the recursive leap of faith, the best you can do is track the operation of the code in a more general sense. You know that the code first tries to solve the maze by moving one square to the north, because the for loop goes through the directions in the order defined by the Direction enumeration. Thus, the first step in the solution process is to make a recursive call that starts in the following position:
[Figure: maze with Θ one square north of the start]
At this point, the same process occurs again. The program again tries to move north and makes a new recursive call in the following position:
[Figure: maze with Θ two squares north of the start]
At this level of the recursion, moving north is no longer possible, so the for loop cycles through the other directions. After a brief excursion southward, upon which the program encounters a marked square, the program finds the opening to the west and proceeds to generate a new recursive call. The same process occurs in this new square, which in turn leads to the following configuration:
[Figure: maze with Θ in a dead-end square]
In this position, none of the directions in the for loop do any good; every square is either blocked by a wall or already marked. Thus, when the for loop at this level exits at the bottom, it unmarks the current square and returns to the previous level.
It turns out that all the paths have also been explored in this position, so the program once again unmarks the square and returns to the next higher level in the recursion. Eventually, the program backtracks all the way to the initial call, having completely exhausted the possibilities that begin by moving north. The for loop then tries the eastward direction, finds it blocked, and continues on to explore the southern corridor, beginning with a recursive call in the following configuration:

[Maze diagram: Θ marks the current position]

From here on, the same process ensues. The recursion systematically explores every corridor along this path, backing up through the stack of recursive calls whenever it reaches a dead end. The only difference along this route is that eventually, after descending through an additional recursive level for every step on the path, the program makes a recursive call in the following position:

[Maze diagram: Θ marks the current position, outside the maze]

At this point, Theseus is outside the maze, so the simple case kicks in and returns true to its caller. This value is then propagated back through all 27 levels of the recursion, eventually returning back to the main program.

9.2 Backtracking and games

Although backtracking is easiest to illustrate in the context of a maze, the strategy is considerably more general. For example, you can apply backtracking to most two-player strategy games. The first player has several choices for an initial move. Depending on which move is chosen, the second player then has a particular set of responses. Each of these responses leads in turn to new options for the first player, and this process continues until the end of the game. The different possible positions at each turn in the game form a branching structure in which each option opens up more and more possibilities.

If you want to program a computer to take one side of a two-player game, one approach is to have the computer follow all the branches in the list of possibilities.
Before making its first move, the computer would try every possible choice. For each of these choices, it would then try to determine what its opponent’s response would be. To do so, it would follow the same logic: try every possibility and evaluate the possible counterplays. If the computer can look far enough ahead to discover that some move would leave its opponent in a hopeless position, it should make that move.

In theory, this strategy can be applied to any two-player strategy game. In practice, the process of looking at all the possible moves, potential responses, responses to those responses, and so on requires too much time and memory, even for modern computers. There are, however, several games that are simple enough to solve by looking at all the possibilities, yet complex enough that the solution is not immediately obvious to the human player.

The game of Nim

To see how recursive backtracking applies to two-player games, it helps to consider a simple example such as the game of Nim, which is the generic name for an entire class of games in which players take turns removing objects from some initial configuration. In this particular version, the game begins with a pile of 13 coins. On each turn, players take either one, two, or three coins from the pile and put them aside. The object of the game is to avoid being forced to take the last coin. Figure 9-4 shows a sample game between the computer and a human player.

How would you go about writing a program to play a winning game of Nim? The mechanical aspects of the game (keeping track of the number of coins, asking the player for a legal move, determining the end of the game, and so forth) are a straightforward programming task. The interesting part of the program consists of figuring out how to give the computer a strategy for playing the best possible game.
Finding a successful strategy for Nim is not especially hard, particularly if you work backward from the end of the game. The rules of Nim indicate that the loser is the player who takes the last coin. Thus, if you ever find yourself with just one coin on the table, you’re in a bad position. You have to take that coin and lose. On the other hand, things look good if you find yourself with two, three, or four coins. In any of these cases, you can always take all but one of the remaining coins, leaving your opponent in the unenviable position of being stuck with just one coin.

FIGURE 9-4 Sample run of the Nim game

Welcome to the game of Nim!
In this game, we will start with a pile of
13 coins on the table. On each turn, you
and I will alternately take between 1 and
3 coins from the table. The player who
takes the last coin loses.

There are 13 coins in the pile.
How many would you like? 2
There are 11 coins in the pile.
I'll take 2.
There are 9 coins in the pile.
How many would you like? 3
There are 6 coins in the pile.
I'll take 1.
There are 5 coins in the pile.
How many would you like? 1
There are 4 coins in the pile.
I'll take 3.
There is only one coin left.
I win.

But what if there are five coins on the table? What can you do then? After a bit of thought, it’s easy to see that you’re also doomed if you’re left with five coins. No matter what you do, you have to leave your opponent with two, three, or four coins, and you’ve just discovered that those situations are good positions from your opponent’s perspective. If your opponent is playing intelligently, you will surely be left with a single coin on your next turn. Since you have no good moves, being left with five coins is clearly a bad position.

This informal analysis reveals an important insight about the game of Nim. On each turn, you are looking for a good move. A good move is one that leaves your opponent in a bad position. But what is a bad position?
A bad position is one in which there is no good move. Even though these definitions of good move and bad position are circular, they nonetheless constitute a complete strategy for playing a perfect game of Nim. You just have to rely on the power of recursion. If you have a function findGoodMove that takes the number of coins as its argument, all it has to do is try every possibility, looking for one that leaves a bad position for the opponent. You can then assign the job of determining whether a particular position is bad to the predicate function isBadPosition, which calls findGoodMove to see if there is one. The two functions call each other back and forth, evaluating all possible branches as the game proceeds. The mutually recursive functions findGoodMove and isBadPosition provide all the strategy that the Nim program needs to play a perfect game. To complete the program, all you need to do is write the code that takes care of the mechanics of playing Nim with a human player. This code is responsible for setting up the game, printing out instructions, keeping track of whose turn it is, asking the user for a move, checking whether that move is legal, updating the number of coins, figuring out when the game is over, and letting the user know who won. Although none of these tasks is conceptually difficult, the Nim application is large enough that it makes sense to adopt the implementation strategy described in section 6.5, in which the program is defined as a class rather than as a collection of free functions. Figure 9-5 shows an implementation of the Nim game that adopts this design. The code for the game is encapsulated in a class called SimpleNim, along with two instance variables that keep track of the progress of play: • An integer variable nCoins that records the number of coins in the pile. • A variable whoseTurn that indicates which player is about to move. 
This value is stored using the enumerated type Player, which defines the constants HUMAN and COMPUTER. At the end of each turn, the code for the play method passes the turn to the next player by setting whoseTurn to opponent(whoseTurn).

FIGURE 9-5 The Nim.cpp implementation

/*
 * File: Nim.cpp
 * -------------
 * This program simulates a simple variant of the game of Nim.  In this
 * version, the game starts with a pile of 13 coins on a table.  Players
 * then take turns removing 1, 2, or 3 coins from the pile.  The player
 * who takes the last coin loses.
 */

#include <iostream>
#include <string>
#include "error.h"
#include "simpio.h"
#include "strlib.h"
using namespace std;

/* Constants */

const int N_COINS = 13;         /* Initial number of coins                 */
const int MAX_MOVE = 3;         /* Number of coins a player may take       */
const int NO_GOOD_MOVE = -1;    /* Marker indicating there is no good move */

/*
 * Type: Player
 * ------------
 * This enumerated type differentiates the human and computer players.
 */

enum Player { HUMAN, COMPUTER };

/*
 * Method: opponent
 * Usage: Player other = opponent(player);
 * ---------------------------------------
 * Returns the opponent of the player.  The opponent of the computer
 * is the human player and vice versa.
 */

Player opponent(Player player) {
   return (player == HUMAN) ? COMPUTER : HUMAN;
}

/*
 * Constant: STARTING_PLAYER
 * -------------------------
 * Indicates which player should start the game.
 */

const Player STARTING_PLAYER = HUMAN;

/*
 * Class: SimpleNim
 * ----------------
 * The SimpleNim class implements the simple version of Nim.
 */

class SimpleNim {

public:

/*
 * Method: play
 * Usage: game.play();
 * -------------------
 * Plays one game of Nim with the human player.
 */

   void play() {
      nCoins = N_COINS;
      whoseTurn = STARTING_PLAYER;
      while (nCoins > 1) {
         cout << "There are " << nCoins << " coins in the pile." << endl;
         if (whoseTurn == HUMAN) {
            nCoins -= getUserMove();
         } else {
            int nTaken = getComputerMove();
            cout << "I'll take " << nTaken << "." << endl;
            nCoins -= nTaken;
         }
         whoseTurn = opponent(whoseTurn);
      }
      announceResult();
   }

/*
 * Method: printInstructions
 * Usage: game.printInstructions();
 * --------------------------------
 * This method explains the rules of the game to the user.
 */

   void printInstructions() {
      cout << "Welcome to the game of Nim!" << endl;
      cout << "In this game, we will start with a pile of" << endl;
      cout << N_COINS << " coins on the table. On each turn, you" << endl;
      cout << "and I will alternately take between 1 and" << endl;
      cout << MAX_MOVE << " coins from the table. The player who" << endl;
      cout << "takes the last coin loses." << endl << endl;
   }

private:

/*
 * Method: getComputerMove
 * Usage: int nTaken = getComputerMove();
 * --------------------------------------
 * Figures out what move is best for the computer player and returns
 * the number of coins taken.  The method first calls findGoodMove
 * to see if a winning move exists.  If none does, the program takes
 * only one coin to give the human player more chances to make a mistake.
 */

   int getComputerMove() {
      int nTaken = findGoodMove(nCoins);
      return (nTaken == NO_GOOD_MOVE) ? 1 : nTaken;
   }

/*
 * Method: findGoodMove
 * Usage: int nTaken = findGoodMove(nCoins);
 * -----------------------------------------
 * This method looks for a winning move, given the specified number
 * of coins.  If there is a winning move in that position, the method
 * returns that value; if not, the method returns the constant
 * NO_GOOD_MOVE.  This method depends on the recursive insight that a
 * good move is one that leaves your opponent in a bad position and a bad
 * position is one that offers no good moves.
 */

   int findGoodMove(int nCoins) {
      int limit = (nCoins < MAX_MOVE) ? nCoins : MAX_MOVE;
      for (int nTaken = 1; nTaken <= limit; nTaken++) {
         if (isBadPosition(nCoins - nTaken)) return nTaken;
      }
      return NO_GOOD_MOVE;
   }

/*
 * Method: isBadPosition
 * Usage: if (isBadPosition(nCoins)) . . .
 * ---------------------------------------
 * This method returns true if nCoins is a bad position.
 * A bad position is one in which there is no good move.
 * Being left with a single coin is clearly a bad position
 * and represents the simple case of the recursion.
 */

   bool isBadPosition(int nCoins) {
      if (nCoins == 1) return true;
      return findGoodMove(nCoins) == NO_GOOD_MOVE;
   }

/*
 * Method: getUserMove
 * Usage: int nTaken = getUserMove();
 * ----------------------------------
 * Asks the user to enter a move and returns the number of coins taken.
 * If the move is not legal, the user is asked to reenter a valid move.
 */

   int getUserMove() {
      while (true) {
         int nTaken = getInteger("How many would you like? ");
         int limit = (nCoins < MAX_MOVE) ? nCoins : MAX_MOVE;
         if (nTaken > 0 && nTaken <= limit) return nTaken;
         cout << "That's cheating! Please choose a number";
         cout << " between 1 and " << limit << "." << endl;
         cout << "There are " << nCoins << " coins in the pile." << endl;
      }
   }

/*
 * Method: announceResult
 * Usage: announceResult();
 * ------------------------
 * This method announces the final result of the game.
 */

   void announceResult() {
      if (nCoins == 0) {
         cout << "You took the last coin. You lose." << endl;
      } else {
         cout << "There is only one coin left." << endl;
         if (whoseTurn == HUMAN) {
            cout << "I win." << endl;
         } else {
            cout << "I lose." << endl;
         }
      }
   }

/* Instance variables */

   int nCoins;          /* Number of coins left on the table */
   Player whoseTurn;    /* Identifies which player moves next */

};

/* Main program */

int main() {
   SimpleNim game;
   game.printInstructions();
   game.play();
   return 0;
}

A generalized program for two-player games

The code in Figure 9-5 is highly specific to Nim. The play method, for example, is directly responsible for setting up the nCoins variable and updating it after each player moves. The general structure of a two-player game, however, is more widely applicable. Many games can be solved using the same overall strategy, even though different games will require different implementations to get the details right.

One of the key concepts in this text is the notion of abstraction, which is the process of separating out the general aspects of a problem so that they are no longer obscured by the details of a specific domain. You may not be terribly interested in a program that plays Nim; after all, Nim is rather boring once you figure it out. What you would probably enjoy more is a program that is general enough to be adapted to play Nim, tic-tac-toe, or any other two-player strategy game you choose.

The possibility of creating such a generalization arises from the fact that most games share a few fundamental concepts. The first such concept is that of state. For any game, there are data values that define exactly what is happening at any point in time. In the Nim game, for example, the state consists of the values of its two instance variables, nCoins and whoseTurn. For a game like chess, the state would need to include what pieces are currently placed on which squares, although it would presumably continue to include the whoseTurn variable, or something that fulfills the same function. For any game, however, it should be possible to store the relevant data in the instance variables of the class that implements the game.
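In Nim, the strategy code needs only one piece of that state, the number of coins. As a minimal sketch (not a replacement for Figure 9-5), the two mutually recursive methods can be lifted out of the class as free functions that take that state as a parameter, which makes the good-move and bad-position logic easy to try on its own. The constants are copied from Figure 9-5; the forward declaration is an addition needed only because the functions now stand outside a class.

```cpp
// Same constants as in Figure 9-5.
const int MAX_MOVE = 3;
const int NO_GOOD_MOVE = -1;

// Forward declaration so the two functions can call each other.
int findGoodMove(int nCoins);

// A bad position is one that offers no good move.
// Being left with a single coin is the simple case.
bool isBadPosition(int nCoins) {
    if (nCoins == 1) return true;
    return findGoodMove(nCoins) == NO_GOOD_MOVE;
}

// Returns a move that leaves the opponent in a bad position,
// or NO_GOOD_MOVE if every legal move leaves a good one.
int findGoodMove(int nCoins) {
    int limit = (nCoins < MAX_MOVE) ? nCoins : MAX_MOVE;
    for (int nTaken = 1; nTaken <= limit; nTaken++) {
        if (isBadPosition(nCoins - nTaken)) return nTaken;
    }
    return NO_GOOD_MOVE;
}
```

With two, three, or four coins, findGoodMove returns the move that leaves exactly one coin; with five coins it returns NO_GOOD_MOVE, matching the informal analysis earlier in this section.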
The second important concept is that of a move. In Nim, a move consists of an integer representing the number of coins taken away. In chess, a move might consist of a pair indicating the starting and ending coordinates of the piece that is moving, although this approach is in fact complicated by the need to represent such esoteric moves as castling or the promotion of a pawn. For any game, however, it is possible to define a Move class that encapsulates whatever information is necessary to represent a move in that game. When you implement the Nim game, however, it doesn’t quite work to define the Move class in the simplest possible way:

class Move {
   int nTaken;
};

The problem here is that the default visibility for entries in a class is private, which means that the class implementing the game would have no access to the instance variable.

There are several strategies you can adopt to fix this problem. One strategy is to make the nTaken variable public or, equivalently, to replace the class definition by a traditional C-style struct. That strategy, however, violates the principle of encapsulation by making the details of the Move class available to the entire program. A second approach is to define getter and setter methods in the style of the classes introduced in Chapter 6. That approach follows the style of modern object-oriented programming at the cost of making the Move class harder to use. That extra complexity seems unwarranted for a class whose only client is likely to be the class that implements the game. A third strategy, and the one used in this text, is to have the Move class declare the game class as a friend.
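The effect of the friend declaration can be sketched as follows. The SimpleNim class here is a pared-down stand-in for the real game class, and the methods makeTakeMove and coinsTaken are hypothetical helpers introduced only to show that the game class can reach the private field; they do not appear in the text.

```cpp
// Move keeps its data private but grants access to the game class.
class Move {
    int nTaken;                 // private by default
    friend class SimpleNim;     // SimpleNim may read and write nTaken
};

// Pared-down stand-in for the game class, showing only the access to Move.
class SimpleNim {
public:
    // Hypothetical helper: builds a Move that takes n coins.
    Move makeTakeMove(int n) {
        Move move;
        move.nTaken = n;        // allowed because SimpleNim is a friend
        return move;
    }

    // Hypothetical helper: reads the private field back.
    int coinsTaken(const Move & move) {
        return move.nTaken;
    }
};
```

Any code outside SimpleNim that tries to touch move.nTaken still fails to compile, so the details of Move stay hidden from the rest of the program.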
Once you have a Move class, it is possible to define a few additional helper methods that allow you to rewrite the play method like this:

void play() {
   initGame();
   while (!gameIsOver()) {
      displayGame();
      if (getCurrentPlayerType() == HUMAN) {
         makeMove(getUserMove());
      } else {
         Move move = getComputerMove();
         displayMove(move);
         makeMove(move);
      }
   }
   announceResult();
}

The most important thing to notice about the implementation of the play method is that the code gives no indication about what game is being played. It might be Nim, but it could just as easily be tic-tac-toe (which you’ll have a chance to implement in the exercises) or some other strategy game. Every game requires its own definitions for the Move class, along with the various game-specific methods such as initGame and makeMove. Even so, the structure of the play method is general enough to work for many different two-player games.

If you compare the generalized implementation of play with the code in Figure 9-5, you will also notice that the code to switch turns is no longer included at this level of the implementation. In the general version of the program, it makes sense to embed this aspect of the game into the methods initGame, makeMove, and getCurrentPlayerType. Making this change means that the play method no longer refers directly to the instance variables but instead calls methods to do the job. This strategy allows more flexibility in the underlying implementation.

The play method and the mechanics of taking turns, however, are not the most exciting aspects of a game. The algorithmically interesting part is embedded inside the method getComputerMove, which is responsible for choosing the best move for the computer. The version of Nim in Figure 9-5 implements this strategy using the mutually recursive methods findGoodMove and isBadPosition, which search through all possible choices for each player to find a winning move in the current position.
As a strategy, that idea is also independent from the details of any particular game, and it therefore ought to be possible to write these methods in a more general way. Before going further down that path, however, it helps to make a further generalization to the problem, which will make it suitable for a wider variety of games.

9.3 The minimax algorithm

The techniques described in the preceding section work well for simple, completely solvable games like Nim. As games become more complex, however, it quickly becomes impossible to examine every possible outcome. If you tried to go through every possible game of chess, for example, the process could take billions of years, even at the speed of modern computers. Yet, somehow, in spite of this limitation, computers are very good at chess. In 1997, IBM’s “Deep Blue” supercomputer beat the reigning world champion at that time, Garry Kasparov. Deep Blue did not win by conducting an exhaustive analysis of all possible games; it instead looked ahead only for a restricted number of moves, in much the same way that humans do.

Even for games for which it is computationally infeasible to work through every possible sequence of moves, the recursive concepts of good moves and bad positions from the Nim game still come in handy. Although it may not be possible to identify a move as a surefire winner, it is still true that the best move in any position is the one that leaves your opponent in the worst position. Similarly, the worst position is the one that offers the weakest best move. This strategy, which consists of finding the position that leaves your opponent with the worst possible best move, is called the minimax algorithm because the goal is to find the move that minimizes your opponent’s maximum opportunity.

Game trees

The best way to visualize the operation of the minimax strategy is to think about the possible future moves in a game as forming a branching diagram that expands on each turn.
Because of this branching structure, such diagrams are called game trees. The initial state is represented by a dot at the top of the game tree. If there are, for example, three possible moves from this position, there will be three lines emanating downward from the current state to three new states that represent the results of these moves, as shown in the following diagram:

[Game tree diagram: the initial state with three lines leading down to three new states]

From each of these new positions, your opponent also has options. If each position again has three options, the next generation of the game tree looks like this:

[Game tree diagram: each of the three states branching into three further states]

Which move do you choose in the initial position? Clearly, your goal is to achieve the best outcome. Unfortunately, you only get to control half of the game. If you were able to select your opponent’s move as well as your own, you could select the path to the state two turns away that left you in the best position. Given that your opponent is also trying to win, the best thing you can do is choose the initial move that leaves your opponent with as few winning chances as possible.

Rating positions and moves

In order to get a sense of how you should proceed, it helps to add some quantitative data to the analysis. Deciding whether a particular move is better than some alternative is much easier if you assign a numeric score to each possible move. The higher the numeric score, the better the move. Thus, a move that has a score of +7, for example, is better than a move with a rating of -4. In addition to rating each possible move, it makes sense to assign a similar numeric rating to each position in the game. Thus, one position might have a rating of +9 and would therefore be better than a position with a score of only +2.

Both positions and moves are rated from the perspective of the player having the move. Moreover, the rating system is designed to be symmetric around 0, in the sense that a position that has a score of +9 for the player to move would have a score of -9 from the opponent’s point of view.
This interpretation of rating numbers captures the idea that a position that is good for one player is bad for the other, as was true for the Nim game. More importantly, defining the rating system in this way makes it easy to express the relationship between the scores for moves and positions. The rating for any move is simply the negative of the rating for the resulting position when evaluated by your opponent. Similarly, the rating of any position can be defined as the rating of its best move.

To make this discussion more concrete, it helps to consider a simple example. Suppose that you have looked two steps ahead in the game, covering one move by you and the possible responses from your opponent. In computer science, a single move for a single player is called a ply to avoid the ambiguity associated with the words move and turn, which sometimes imply that both players have a chance to play. If you rate the positions at the conclusion of the two-ply analysis, the game tree might look like this:

[Game tree diagram: three branches from the starting position, each with three replies leading to rated positions]
   Left branch:    +7   +6   -9
   Center branch:  -5   +9   -4
   Right branch:   -1   +1   -2

Because the positions at the bottom of this tree are again positions in which, as at the top of the tree, you have to move, the rating numbers in those positions are assigned from your perspective. Given these ratings of the potential positions, what move should you make from the original configuration? At first glance, you might be attracted by the fact that the center branch contains a path that leads to a +9, which is an excellent outcome for you. Unfortunately, the fact that the center branch offers such a wonderful outcome doesn’t really matter. If your opponent is playing rationally, there is no way that the game can reach the +9 position. Suppose, for example, that you do choose the center branch.
Given the options available, your opponent will select the leftmost branch, as illustrated by the colored path in the following game tree:

[Game tree diagram: leaves +7 +6 -9 | -5 +9 -4 | -1 +1 -2, with the path through the center branch to the -5 position highlighted]

Your initial choice thus leaves you in a position that, from your point of view, has a rating of -5. You would do better to choose the rightmost branch, from which your opponent’s best strategy leaves you in a position with a -2 rating:

[Game tree diagram: leaves +7 +6 -9 | -5 +9 -4 | -1 +1 -2, with the path through the rightmost branch to the -2 position highlighted]

As noted earlier in this section, the rating of a move is the negative of the rating of the resulting position when evaluated from the opponent’s perspective. The rating of the last move in the highlighted line of the game tree is +2 because it leads to a position with a -2 rating. The negative sign indicates the shift in perspective. Moves that lead to positions that are bad for my opponent are good for me, and vice versa. The rating of each position is simply the rating of the best move it offers. The ratings for the positions and moves along the highlighted path in the game tree therefore look like this:

[Game tree diagram with ratings along the highlighted path: starting position -2, your move -2, the resulting position +2 and your opponent's reply +2 (both as rated by your opponent), final position -2]

The rating of the starting position is therefore -2. While this position is hardly ideal, it is better for you than the other possible outcomes, assuming that your opponent is playing rationally.

Any implementation of the minimax algorithm will need to compare the ratings of moves to determine the best one. For this reason, it is convenient to store the rating of each move in the corresponding Move object. The simplest way to accomplish this goal is to add a new instance variable called rating to the Move class, so that the definition now looks like this for the Nim game:

class Move {
   int nTaken;
   int rating;
   friend class SimpleNim;
};

The code that implements the minimax algorithm will assume that the Move structure, no matter what else it has that is specific to a particular game, always includes a rating field.
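The two-ply reasoning above can be checked with a short sketch. The names Choice and chooseBranch are hypothetical and are not part of the text's minimax code; the function also works directly on leaf ratings given from the first player's perspective, so it takes the minimum over the opponent's replies and the maximum over your own choices instead of negating ratings at each level.

```cpp
#include <vector>

// Result of the two-ply analysis: which branch to choose and its rating
// from the point of view of the player making the first move.
struct Choice {
    int branch;
    int rating;
};

// leaves[i][j] is the rating, from the first player's perspective, of the
// position reached by choosing branch i and the opponent replying with j.
// Assumes at least one branch and at least one reply per branch.
Choice chooseBranch(const std::vector<std::vector<int>> & leaves) {
    Choice best = { -1, 0 };
    for (int i = 0; i < (int) leaves.size(); i++) {
        // The opponent picks the reply that is worst for the first player.
        int worst = leaves[i][0];
        for (int r : leaves[i]) {
            if (r < worst) worst = r;
        }
        // The first player picks the branch whose worst case is best.
        if (best.branch == -1 || worst > best.rating) {
            best.branch = i;
            best.rating = worst;
        }
    }
    return best;
}
```

For the example tree, with leaves {+7, +6, -9}, {-5, +9, -4}, and {-1, +1, -2}, the worst cases are -9, -5, and -2, so the sketch selects the rightmost branch (index 2) with a rating of -2, in agreement with the analysis in the text.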
In the implementation of the minimax application outlined later in this chapter, the values used as ratings are integers that must fall between the limits defined by the following constants:

const int WINNING_POSITION = 1000;
const int LOSING_POSITION = -WINNING_POSITION;

At the end of a game, the rating of a position can be determined by checking to see who has won. The rating of any position for which the outcome is not yet determined must be an integer somewhere between these extremes.

Limiting the depth of the recursive search

If you could search an entire game tree from the beginning of a game through to every possible conclusion, you could implement the minimax algorithm using pretty much the same structure as in the earlier Nim example. All you would need to do is write two mutually recursive functions, one that finds the best move and one that evaluates positions. For games that involve any significant level of complexity, however, it is impossible to search the entire game tree in a reasonable amount of time. A practical implementation of the minimax algorithm must therefore include a provision for cutting off the search at a certain point.

The usual strategy for limiting the search is to set some maximum value for the depth of the recursion. You could, for example, allow the recursion to continue until each player has made five moves, for a total of ten ply. If the game ends before that limit is reached, you can evaluate the final position by checking to see who won the game and then returning WINNING_POSITION or LOSING_POSITION, as appropriate.

But what happens if you hit the recursion limit before the outcome of the game is decided? At that point, you need to evaluate the position in some other way that does not involve making additional recursive calls. Given that this kind of analysis depends only on the state of the game as it stands, it is usually called static analysis.
In chess-playing programs, for example, static analysis usually performs some simple calculation based on the values of the pieces each side has on the board. If the player to move is ahead in that calculation, the position has a positive rating; if not, the rating will be negative.

Although any simple calculation is sure to overlook some important factor, it is important to remember that static analysis applies only after the recursion limit is reached. If, for example, there was some line of play that would force a win in the game in the next few moves, the quality of the static analysis is irrelevant, because the recursive evaluation will find that winning line of play before it gets to the static-analysis phase.

The easiest way to add a depth limit to the minimax implementation is to have each of the recursive methods take a parameter named depth that records how many levels have already been analyzed and add one to that value before trying to rate the next position. Once that parameter reaches a defined constant MAX_DEPTH, any further evaluations must be performed using static analysis.

Implementing the minimax algorithm

The minimax algorithm can be implemented using two mutually recursive methods: findBestMove and evaluatePosition. The findBestMove method considers every possible move and then calls evaluatePosition on the resulting positions, looking for the one with the lowest rating when evaluated from the opponent’s perspective. The code for evaluatePosition uses findBestMove to determine the best move and then returns the rating of that move, unless the recursion limit or the state of the game requires static analysis. The code for these methods appears in Figure 9-6.

FIGURE 9-6 Generalized implementation of the minimax algorithm

/*
 * Method: findBestMove
 * Usage: move = findBestMove(depth);
 * ----------------------------------
 * Finds the best move for the current player.  The depth parameter
 * and the constant MAX_DEPTH are used to limit the depth of the search
 * for games that are too difficult to analyze in full detail.
 */

Move findBestMove(int depth = 0) {
   Vector<Move> moveList;
   Move bestMove;
   int minRating = WINNING_POSITION + 1;
   generateMoveList(moveList);
   if (moveList.isEmpty()) error("No moves available");
   foreach (Move move in moveList) {
      makeMove(move);
      int rating = evaluatePosition(depth + 1);
      if (rating < minRating) {
         bestMove = move;
         minRating = rating;
      }
      retractMove(move);
   }
   bestMove.rating = -minRating;
   return bestMove;
}

/*
 * Method: evaluatePosition
 * Usage: rating = evaluatePosition(depth);
 * ----------------------------------------
 * Evaluates a position by finding the rating of the best move in
 * the current position.  The depth parameter and the constant
 * MAX_DEPTH are used to limit the depth of the search.
 */

int evaluatePosition(int depth) {
   if (gameIsOver() || depth >= MAX_DEPTH) {
      return evaluateStaticPosition();
   }
   return findBestMove(depth).rating;
}

The code in Figure 9-6 calls several methods, each of which is coded independently for a particular game, that are worth some further explanation:

• The generateMoveList method fills the moveList vector with the legal moves available in the current state.
• The methods makeMove and retractMove have the effect of making and taking back a particular move. These methods allow the program to try out a potential move, evaluate the resulting position, and then go back to the original state.
• The gameIsOver method checks to see if the game has reached a final state in which no further analysis is possible.
• The evaluateStaticPosition method evaluates a particular state in the game without making any further recursive calls.

Summary

In this chapter, you have learned to solve problems that require making a sequence of choices as you search for a goal, as illustrated by finding a path through a maze or a winning strategy in a two-player game.
The basic strategy is to write programs that can backtrack to previous decision points if those choices lead to dead ends. By exploiting the power of recursion, however, you can avoid coding the details of the backtracking process explicitly and develop general solution strategies that apply to a wide variety of problem domains.

Important points in this chapter include:

• You can solve most problems that require backtracking by adopting the following recursive approach:

   If you are already at a solution, report success.
   for (every possible choice in the current position) {
      Make that choice and take one step along the path.
      Use recursion to solve the problem from the new position.
      If the recursive call succeeds, report the success to the next higher level.
      Back out of the current choice to restore the state at the beginning of the loop.
   }
   Report failure.

• The complete history of recursive calls in a backtracking problem, even for relatively simple applications, is usually too complex to understand in detail. For problems that involve any significant amount of backtracking, it is essential to accept the recursive leap of faith.
• You can often find a winning strategy for two-player games by adopting a recursive-backtracking approach. Because the goal in such games involves minimizing the winning chances for your opponent, the conventional strategic approach is called the minimax algorithm.

Review questions

1. What is the principal characteristic of a backtracking algorithm?
2. Using your own words, state the right-hand rule for escaping from a maze. Would a left-hand rule work equally well?
3. What is the insight that makes it possible to solve a maze by recursive backtracking?
4. What are the simple cases that apply in the recursive implementation of solveMaze?
5. Why is it important to mark squares as you proceed through the maze? What would happen in the solveMaze function if you never marked any squares?
6. What is the purpose of the unmarkSquare call at the end of the for loop in the solveMaze implementation? Is this statement essential to the algorithm?
7. What is the purpose of the Boolean result returned by solveMaze?
8. In your own words, explain how the backtracking process actually takes place in the recursive implementation of solveMaze.
9. In the simple Nim game, the human player plays first and begins with a pile of 13 coins. Is this a good or a bad position? Why?
10. Write a simple C++ expression based on the value of nCoins that has the value true if the position is good for the current player and false otherwise.
11. What is the minimax algorithm? What does its name signify?
12. Why is it useful to develop an abstract implementation of the minimax algorithm that does not depend on the details of a particular game?
13. What is the role of the depth argument in the functions findBestMove and evaluatePosition?
14. Explain the role of the evaluateStaticPosition function in the minimax implementation.
15. Suppose you are in a position in which the analysis for the next two moves shows the following rated outcomes from your original player's point of view:

   [Game-tree diagram; ratings in extraction order: -9 -4 +4 -3 -2 -4 -3 +2 -5 +1 +0 -5]

   If you adopt the minimax strategy, what is the best move to make in this position? What is the rating of that move from your perspective?

Exercises

1. In many mazes, there are multiple paths. For example, Figure 9-7 shows three solutions for the same maze. None of these solutions, however, is optimal. The shortest path through the maze has a path length of 11:

   [maze diagram]

   [Figure 9-7: Multiple paths through a maze; three solutions with lengths 13, 15, and 13]

   Write a function

      int shortestPathLength(Maze & maze, Point start);

   that returns the length of the shortest path in the maze from the specified position to any exit. If there is no solution, shortestPathLength should return -1.

2.
As implemented in Figure 9-3, the solveMaze function unmarks each square as it discovers there are no solutions from that point. Although this design strategy has the advantage that the final configuration of the maze shows the solution path as a series of marked squares, the decision to unmark squares as you backtrack has a cost in terms of the overall efficiency of the algorithm. If you've marked a square and then backtracked through it, you've already explored the possibilities leading from that square. If you come back to it by some other path, you might as well rely on your earlier analysis instead of exploring the same options again.

   To give yourself a sense of how much these unmarking operations cost in terms of efficiency, extend the solveMaze program so that it records the number of recursive calls as it proceeds. Use this program to calculate how many recursive calls are required to solve the following maze if the call to unmarkSquare remains part of the program:

   [maze diagram]

   Run your program again, this time without the call to unmarkSquare. What happens to the number of recursive calls?

3. As the result of the preceding exercise makes clear, the idea of keeping track of the path through a maze by using the markSquare facility in the Maze class has a substantial cost. A more practical approach is to change the definition of the recursive function so that it keeps track of the current path as it goes. Following the logic of solveMaze, write a function

      bool findSolutionPath(Maze & maze, Point start, Vector<Point> & path);

   that takes, in addition to the coordinates of the starting position, a vector of Point values called path. Like solveMaze, findSolutionPath returns a Boolean value indicating whether the maze is solvable. In addition, the findSolutionPath function initializes the elements of the path vector to a sequence of coordinates beginning with the starting position and ending with the coordinates of the first square that lies outside the maze.
   For this exercise, it is sufficient for findSolutionPath to find any solution path. It need not find the shortest one.

4. Most drawing programs for personal computers make it possible to fill an enclosed region on the screen with a solid color. Typically, you invoke this operation by selecting a "paint bucket" tool and then clicking the mouse, with the cursor somewhere in your drawing. When you do, the paint spreads to every part of the picture it can reach without going through a line. For example, suppose you have just drawn the following picture of a house:

   If you select the paint bucket and click inside the door, the drawing program fills the area bounded by the door frame as shown at the left side of the following diagram. If you instead click somewhere on the front wall of the house, the program fills the entire wall space except for the windows and doors, as shown on the right:

   In order to understand how this process works, it is important to understand that the screen of the computer is broken down into an array of tiny dots called pixels. On a monochrome display, pixels can be either white or black. The paint-fill operation consists of painting black the starting pixel (i.e., the pixel you click while using the paint-bucket tool) along with any pixels connected to that starting point by an unbroken chain of white pixels. Thus, the patterns of pixels on the screen representing the preceding two diagrams would look like this:

   It is easy to represent a pixel grid using the type Grid<bool>. White pixels in the grid have the value false, and black pixels have the value true. Given this representation, write a function

      void fillRegion(Grid<bool> & pixels, int row, int col)

   that simulates the operation of the paint-bucket tool by painting in black all white pixels reachable from the specified row and column without crossing an existing black pixel.

5.
The most powerful piece in the game of chess is the queen, which can move any number of squares in any direction, horizontally, vertically, or diagonally. For example, the queen shown in this chessboard can move to any of the marked squares:

   Even though the queen can cover a large number of squares, it is possible to place eight queens on an 8 × 8 chessboard so that none of them attacks any of the others, as shown in the following diagram:

   Write a program that solves the more general problem of whether it is possible to place N queens on an N × N chessboard so that none of them can move to a square occupied by any of the others in a single turn. Your program should either display a solution if it finds one or report that no solutions exist.

6. In chess, a knight moves in an L-shaped pattern: two squares in one direction horizontally or vertically, and then one square at right angles to that motion. For example, the white knight in the upper right side of the following diagram can move to any of the eight squares marked with an ×:

   The mobility of a knight decreases near the edge of the board, as illustrated by the black knight in the corner, which can reach only the two squares marked with an o.

   It turns out that a knight can visit all 64 squares on a chessboard without ever moving to the same square twice. A path for the knight that moves through all the squares without repeating a square is called a knight's tour. One such tour is shown in the following diagram, in which the numbers in the squares indicate the order in which they were visited:

      52  47  56  45  54   5  22  13
      57  44  53   4  23  14  25   6
      48  51  46  55  26  21  12  15
      43  58   3  50  41  24   7  20
      36  49  42  27  62  11  16  29
      59   2  37  40  33  28  19   8
      38  35  32  61  10  63  30  17
       1  60  39  34  31  18   9  64

   Write a program that uses backtracking recursion to find a knight's tour.

7. In the 1960s, a puzzle called Instant Insanity was popular for some years before it faded from view.
   The puzzle consisted of four cubes whose faces were each painted with one of the colors red, blue, green, and white, represented in the rest of this problem by their initial letter. The goal of the puzzle was to arrange the cubes into a line so that if you looked at the line from any of its edges, you would see no duplicated colors. Cubes are hard to draw in two dimensions, but the following diagram shows what the cubes would look like if you unfolded them and placed them flat on the page:

   [Diagram: four unfolded cubes. Face letters in extraction order: G W B B R R W W G B G G W B R R W B G G R R R W]

   Write a program that uses backtracking to solve the Instant Insanity puzzle.

8. In theory, the recursive backtracking strategy described in this chapter should be sufficient to solve most puzzles that involve performing a sequence of moves looking for some solution. In practice, however, some of those puzzles are too complex to solve in a reasonable amount of time. One puzzle that is just at the limit of what recursive backtracking can accomplish without using some additional cleverness is the peg solitaire puzzle, which dates from the 17th century. Peg solitaire is usually played on a board that looks like this:

   The black dots in the diagram are pegs, which fill the board except for the center hole. On a turn, you are allowed to jump over and remove a peg, as illustrated in the following diagram, in which the colored peg jumps into the vacant center hole and the peg in the middle is removed:

   The object of the game is to perform a series of jumps that leaves only one peg in the center hole. Write a program to solve this puzzle.

9. The game of dominos is played with rectangular pieces composed of two connected squares, each of which is marked with a certain number of dots. For example, each of the following four rectangles represents a domino:

   Dominos can be connected end-to-end to form chains, subject to the condition that two dominos can be linked together only if the numbers match.
   For example, you can form a chain consisting of these four dominos by connecting them in the following order:

   As in the traditional game, dominos can be rotated by 180° so that their numbers are reversed. In this chain, for example, the 1-6 and 3-4 dominos have been "turned over" so that they fit into the chain.

   Suppose that you have access to a Domino class (see Chapter 6, exercise 3) that exports the methods getLeftDots and getRightDots. Given this class, write a recursive function

      bool formsDominoChain(Vector<Domino> & dominos);

   that returns true if it is possible to build a chain consisting of every domino in the vector.

10. Suppose that you have been assigned the job of buying the plumbing pipes for a construction project. Your foreman gives you a list of the varying lengths of pipe needed, but the distributor sells stock pipe only in one fixed size. You can, however, cut each stock pipe in any way needed. Your job is to figure out the minimum number of stock pipes required to satisfy the list of requests, thereby saving money and minimizing waste.

   Write a recursive function

      int cutStock(Vector<int> & requests, int stockLength);

   that takes two arguments (a vector of the lengths needed and the length of stock pipe that the distributor sells) and returns the minimum number of stock pipes needed to service all requests in the vector. For example, if the vector contains [ 4, 3, 4, 1, 7, 8 ] and the stock pipe length is 10, you can purchase three stock pipes and divide them as follows:

      Pipe 1: 4, 4, 1
      Pipe 2: 3, 7
      Pipe 3: 8

   Doing so leaves you with two small remnants left over. There are other possible arrangements that also fit into three stock pipes, but it cannot be done with fewer.

11. Most operating systems and many applications that allow users to work with files support wildcard patterns, in which special characters are used to create filename patterns that can match many different files.
   The most common special characters used in wildcard matching are ?, which matches any single character, and *, which matches any sequence of characters. Other characters in a filename pattern must match the corresponding character in a filename. For example, the pattern *.* matches any filename that contains a period, such as EnglishWords.dat or HelloWorld.cpp, but would not match filenames that do not contain a period. Similarly, the pattern test.? matches any filename that consists of the name test, a period, and a single character; thus, test.? matches test.h but not test.cpp. These patterns can be combined in any way you like. For example, the pattern ??* matches any filename containing at least two characters.

   Write a function

      bool wildcardMatch(string filename, string pattern);

   that takes two strings, representing a filename and a wildcard pattern, and returns true if that filename matches the pattern. Thus,

      wildcardMatch("US.txt", "*.*")         returns  true
      wildcardMatch("test", "*.*")           returns  false
      wildcardMatch("test.h", "test.?")      returns  true
      wildcardMatch("test.cpp", "test.?")    returns  false
      wildcardMatch("x", "??*")              returns  false
      wildcardMatch("yy", "??*")             returns  true
      wildcardMatch("zzz", "??*")            returns  true

12. Rewrite the simple Nim game so that it uses the generalized minimax algorithm presented in Figure 9-6. Your program should not change the code for findBestMove or evaluatePosition. Your job is to come up with an appropriate definition of the Move class and the various game-specific methods so that the program still plays a perfect game of Nim.

13. Modify the code for the simple Nim game you wrote for exercise 12 so that it plays a different variant of Nim. In this version, the pile begins with 17 coins. On each turn, players alternate taking one, two, three, or four coins from the pile. In the simple Nim game, the coins the players took away were simply ignored; in this game, the coins go into a pile for each player.
   The player whose pile contains an even number of coins after the last coin is taken wins the game.

14. In the most common variant of Nim, the coins are not combined into a single pile but are instead arranged in three rows like this:

   [Diagram: three rows of coins, labeled Row 0, Row 1, and Row 2]

   A move in this game consists of taking any number of coins, subject to the condition that all the coins must come from the same row. The player who takes the last coin loses. Write a program that uses the minimax algorithm to play a perfect game of three-pile Nim. The starting configuration shown here is a typical one, but your program should be general enough so that you can easily change either the number of rows or the number of coins in each row.

15. The game of tic-tac-toe is played by two players who take turns placing Xs and Os in a 3×3 grid that looks like this:

   The object of the game is to line up three of your own symbols in a row, horizontally, vertically, or diagonally. In the following game, for example, X has won the game by completing three in a row across the top:

   If the board fills up without anyone completing a row, the game is a draw, which is called a cat's game in tic-tac-toe. Write a program that uses the minimax algorithm to play a perfect game of tic-tac-toe. Figure 9-8 shows a sample run against a particularly inept player.

FIGURE 9-8 Sample run of the tic-tac-toe game

   Welcome to TicTacToe, the game of three in a row.
   I'll be X, and you'll be O.
   The squares are numbered like this:

    1 | 2 | 3
   ---+---+---
    4 | 5 | 6
   ---+---+---
    7 | 8 | 9

   I'll move to 1.
   The game now looks like this:

    X |   |
   ---+---+---
      |   |
   ---+---+---
      |   |

   Your move.
   What square? 5
   The game now looks like this:

    X |   |
   ---+---+---
      | O |
   ---+---+---
      |   |

   I'll move to 2.
   The game now looks like this:

    X | X |
   ---+---+---
      | O |
   ---+---+---
      |   |

   Your move.
   What square? 7
   The game now looks like this:

    X | X |
   ---+---+---
      | O |
   ---+---+---
    O |   |

   I'll move to 3.
   The final position looks like this:

    X | X | X
   ---+---+---
      | O |
   ---+---+---
    O |   |

   I win.

Chapter 10
Algorithmic Analysis

Without analysis, no synthesis.
Friedrich Engels, Herr Eugen Dühring's Revolution in Science, 1878

In Chapter 7, you were introduced to two different recursive implementations of the function fib(n), which computes the nth Fibonacci number. The first is based directly on the mathematical definition

$$\mathrm{fib}(n) = \begin{cases} n & \text{if } n \text{ is 0 or 1} \\ \mathrm{fib}(n - 1) + \mathrm{fib}(n - 2) & \text{otherwise} \end{cases}$$

and turns out to be wildly inefficient. The second implementation, which uses the notion of additive sequences to produce a version of fib(n) that is comparable in efficiency to traditional iterative approaches, demonstrates that recursion itself is not the cause of the problem. Even so, examples like the first version of the Fibonacci function have such high execution costs that recursion sometimes gets a bad name as a result.

As you will see in this chapter, the ability to think recursively about a problem often leads to new strategies that are considerably more efficient than anything that would come out of an iterative design process. The power of divide-and-conquer algorithms is enormous and has a profound impact on many problems that arise in practice. By using recursive algorithms of this form, it is possible to achieve dramatic increases in efficiency that can cut the solution times, not by factors of two or three, but by factors of a thousand or more.

Before looking at these algorithms, however, it is important to ask a few questions. What does the term efficiency mean in an algorithmic context? How would you go about measuring that efficiency? These questions form the foundation for the subfield of computer science known as analysis of algorithms.
Although a detailed understanding of algorithmic analysis requires a reasonable facility with mathematics and a lot of careful thought, you can get a sense of how it works by investigating the performance of a few simple algorithms.

10.1 The sorting problem

The easiest way to illustrate the analysis of algorithms is to consider a problem domain in which different algorithms vary widely in their performance. Of these, one of the most interesting is the problem of sorting, which consists of reordering the elements in an array or vector so that they fall in some defined sequence. For example, suppose you have stored the following integers in the variable vec, which is a Vector<int>:

   vec:    56  25  37  58  95  19  73  30
   index:   0   1   2   3   4   5   6   7

Your mission is to write a function sort(vec) that rearranges the elements into ascending order, like this:

   vec:    19  25  30  37  56  58  73  95
   index:   0   1   2   3   4   5   6   7

The selection sort algorithm

There are many algorithms you could choose to sort a vector of integers into ascending order. One of the simplest is called selection sort. Given a vector of size N, the selection sort algorithm goes through each element position and finds the value that should occupy that position in the sorted vector. When it finds the appropriate element, the algorithm exchanges it with the value that previously occupied the desired position to ensure that no elements are lost. Thus, on the first cycle, the algorithm finds the smallest element and swaps it with the first vector position. On the second cycle, it finds the smallest remaining element and swaps it with the second position. Thereafter, the algorithm continues this strategy until all positions in the vector are correctly ordered. An implementation of sort that uses selection sort is shown in Figure 10-1.

FIGURE 10-1 Implementation of the selection sort algorithm

/*
 * Function: sort
 * --------------
 * This implementation uses an algorithm called selection sort, which can
 * be described as follows.  With your left hand (lh), point at each element
 * in the vector in turn, starting at index 0.  At each step in the cycle:
 *
 * 1. Find the smallest element in the range between your left hand and the
 *    end of the vector, and point at that element with your right hand (rh).
 *
 * 2. Move that element into its correct position by exchanging the elements
 *    indicated by your left and right hands.
 */

   void sort(Vector<int> & vec) {
      int n = vec.size();
      for (int lh = 0; lh < n; lh++) {
         int rh = lh;
         for (int i = lh + 1; i < n; i++) {
            if (vec[i] < vec[rh]) rh = i;
         }
         int tmp = vec[lh];
         vec[lh] = vec[rh];
         vec[rh] = tmp;
      }
   }

For example, if the initial contents of the vector are

   vec:    56  25  37  58  95  19  73  30
   index:   0   1   2   3   4   5   6   7

the first cycle through the outer for loop identifies the 19 in index position 5 as the smallest value in the entire vector and then swaps it with the 56 in index position 0 to leave the following configuration:

   vec:    19  25  37  58  95  56  73  30
   index:   0   1   2   3   4   5   6   7

On the second cycle, the algorithm finds the smallest element between positions 1 and 7, which turns out to be the 25 in position 1. The program goes ahead and performs the exchange operation, which swaps the 25 with itself and leaves the vector unchanged from the preceding diagram. On each subsequent cycle, the algorithm performs a swap operation to move the next smallest value into its appropriate final position. When the for loop is complete, the entire vector is sorted.

Empirical measurements of performance

How efficient is the selection sort algorithm as a strategy for sorting? To answer this question, it helps to collect empirical data about how long it takes the computer to sort vectors of various sizes.
When I did this experiment on my MacBook Pro laptop, I observed the following timing data for selection sort, where N represents the number of elements in the vector:

         N    Running time
        10    0.0000024 sec
        50    0.0000448 sec
       100    0.000169 sec
       500    0.00402 sec
      1000    0.0159 sec
      5000    0.395 sec
    10,000    1.58 sec
    50,000    39.6 sec
   100,000    158.7 sec

For a vector of 10 integers, the selection sort algorithm completes its work in a couple of microseconds. Even for 5000 integers, this implementation of sort takes less than a second, which certainly seems fast enough in terms of our human sense of time. As the vector sizes get larger, however, the performance of selection sort begins to go downhill. For a vector of 100,000 integers, the algorithm requires more than two and a half minutes. If you're sitting in front of your computer waiting for it to reply, that is an awfully long time.

Even more disturbing is the fact that the performance of selection sort rapidly gets worse as the vector size increases. As you can see from the timing data, every time you multiply the number of values by 10, the time required to sort the vector goes up a hundredfold. If this pattern continues, sorting a list of a million numbers would take about four and a half hours. If your business requires sorting vectors on this scale, you would have no choice but to find a more efficient approach.

Analyzing the performance of selection sort

What makes selection sort perform so badly as the number of values to be sorted becomes large? To answer this question, it helps to think about what the algorithm has to do on each cycle of the outer loop. To correctly determine the first value in the vector, the selection sort algorithm must consider all N elements as it searches for the smallest value. Thus, the time required on the first cycle of the loop is presumably proportional to N.
For each of the other elements in the vector, the algorithm performs the same basic steps but looks at one fewer element each time. It looks at $N - 1$ elements on the second cycle, $N - 2$ on the third, and so on, so the total running time is roughly proportional to

$$N + (N - 1) + (N - 2) + \cdots + 3 + 2 + 1$$

Because it is difficult to work with an expression in this expanded form, it is useful to simplify it by applying a bit of mathematics. As you may have learned in an algebra course, the sum of the first N integers is given by the formula

$$\frac{N \times (N + 1)}{2}$$

or, multiplying out the numerator,

$$\frac{N^2 + N}{2}$$

You will learn how to prove that this formula is correct in the section on "Mathematical induction" later in this chapter. For the moment, all you need to know is that the sum of the first N integers can be expressed in this more compact form.

If you write out the values of the function

$$\frac{N^2 + N}{2}$$

for various values of N, you get a table that looks like this:

         N    (N^2 + N) / 2
        10    55
        50    1275
       100    5050
       500    125,250
      1000    500,500
      5000    12,502,500
    10,000    50,005,000
    50,000    1,250,025,000
   100,000    5,000,050,000

Because the running time of the selection sort algorithm is presumably related to the amount of work the algorithm needs to do, the values in this table should be roughly proportional to the observed execution time of the algorithm, which turns out to be true. If you look at the measured timing data for selection sort in Figure 10-2, for example, you discover that the algorithm requires 1.58 seconds to sort 10,000 numbers. In that time, the selection sort algorithm has to perform 50,005,000 operations in its innermost loop.
Assuming that there is indeed a proportionality relationship between these two values, dividing the time by the number of operations gives the following estimate of the proportionality constant:

$$\frac{1.58 \text{ seconds}}{50{,}005{,}000} = 3.16 \times 10^{-8} \text{ seconds}$$

If you apply this same proportionality constant to the other entries in the table, you discover that the formula

$$3.16 \times 10^{-8} \text{ seconds} \times \frac{N^2 + N}{2}$$

offers a reasonable approximation of the running time, at least for large values of N. The observed times and the estimates calculated using this formula appear in Figure 10-2, along with the relative error between the two.

FIGURE 10-2 Observed and estimated times for selection sort

         N    Observed time    Estimated time    Error
        10    0.0000024 sec    0.0000017 sec     28%
        50    0.0000448 sec    0.0000403 sec     10%
       100    0.000169 sec     0.000159 sec      6%
       500    0.00402 sec      0.00395 sec       2%
      1000    0.0159 sec       0.0159 sec        < 1%
      5000    0.395 sec        0.395 sec         < 1%
    10,000    1.58 sec         1.58 sec          < 1%
    50,000    39.6 sec         39.5 sec          < 1%
   100,000    158.7 sec        158.0 sec         < 1%

10.2 Computational complexity

The problem with carrying out a detailed analysis like the one shown in Figure 10-2 is that you end up with too much information. Although it is occasionally useful to have a formula for predicting exactly how long a program will take, you can usually get away with more qualitative measures. The reason that selection sort is impractical for large values of N has little to do with the precise timing characteristics of a particular implementation running on the laptop I happen to own at the moment. The problem is simpler and more fundamental. At its essence, the problem with selection sort is that doubling the size of the input vector increases the running time of the selection sort algorithm by a factor of four, which means that the running time grows more quickly than the number of elements in the vector.
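The factor-of-four claim can be checked without a stopwatch by counting the elements the algorithm examines. The following is a minimal sketch, not the book's code: it uses std::vector in place of the Stanford Vector class, and the names sortAndCount and descendingVector are hypothetical helpers introduced only for this illustration.

```cpp
#include <utility>
#include <vector>

// Selection sort that also reports how many elements it examined.
// Assumption: std::vector<int> stands in for the Stanford Vector<int>.
long long sortAndCount(std::vector<int> & vec) {
    long long examined = 0;
    int n = static_cast<int>(vec.size());
    for (int lh = 0; lh < n; lh++) {
        int rh = lh;
        examined++;                       // the element under the left hand
        for (int i = lh + 1; i < n; i++) {
            examined++;                   // each remaining element is inspected once
            if (vec[i] < vec[rh]) rh = i;
        }
        std::swap(vec[lh], vec[rh]);
    }
    return examined;
}

// Hypothetical helper: builds the vector n, n-1, ..., 1.
std::vector<int> descendingVector(int n) {
    std::vector<int> vec;
    for (int i = n; i > 0; i--) vec.push_back(i);
    return vec;
}
```

Under these assumptions, sorting 1000 elements examines 500,500 elements and sorting 2000 examines 2,001,000, which is very nearly four times as many. The count depends only on the size of the vector, not on its initial order.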
The most valuable qualitative insights you can obtain about algorithmic efficiency are usually those that help you understand how the performance of an algorithm responds to changes in problem size. Problem size is usually easy to quantify. For algorithms that operate on numbers, it generally makes sense to let the numbers themselves represent the problem size. For most algorithms that operate on arrays or vectors, you can use the number of elements. When evaluating algorithmic efficiency, computer scientists traditionally use the letter N to indicate the size of the problem, no matter how it is calculated. The relationship between N and the performance of an algorithm as N becomes large is called the computational complexity of that algorithm. In general, the most important measure of performance is execution time, although it is also possible to apply complexity analysis to other concerns, such as the amount of memory space required. Unless otherwise stated, all assessments of complexity used in this text refer to execution time.

Big-O notation

Computer scientists use a special shorthand called big-O notation to denote the computational complexity of algorithms. Big-O notation was introduced by the German mathematician Paul Bachmann in 1892, long before the development of computers. The notation itself is very simple and consists of the letter O, followed by a formula enclosed in parentheses. When it is used to specify computational complexity, the formula is usually a simple function involving the problem size N. For example, in this chapter you will soon encounter the big-O expression

$$O(N^2)$$

which is read aloud as "big-oh of N squared."

Big-O notation is used to specify qualitative approximations and is therefore ideal for expressing the computational complexity of an algorithm.
Coming as it does from mathematics, big-O notation has a precise definition, which appears later in this chapter in the section entitled "A formal definition of big-O." At this point, however, it is far more important for you, no matter whether you think of yourself as a programmer or a computer scientist, to understand what big-O means from a more intuitive point of view.

Standard simplifications of big-O

When you use big-O notation to estimate the computational complexity of an algorithm, the goal is to provide a qualitative insight as to how changes in N affect the algorithmic performance as N becomes large. Because big-O notation is not intended to be a quantitative measure, it is not only appropriate but desirable to reduce the formula inside the parentheses so that it captures the qualitative behavior of the algorithm in the simplest possible form. The most common simplifications that you can make when using big-O notation are as follows:

1. Eliminate any term whose contribution to the total ceases to be significant as N becomes large. When a formula involves several terms added together, one of those terms often grows much faster than the others and ends up dominating the entire expression as N becomes large. For large values of N, this term alone will control the running time of the algorithm, and you can ignore the other terms in the formula entirely.

2. Eliminate any constant factors. When you calculate computational complexity, your main concern is how running time changes as a function of the problem size N. Constant factors have no effect on the overall pattern. If you bought a machine that was twice as fast as your old one, any algorithm that you executed on your machine would run twice as fast as before for every value of N. The growth pattern, however, would remain exactly the same. Thus, you can ignore constant factors when you use big-O notation.
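As a quick illustration of the two rules, consider a made-up running-time formula. The function $T(N)$ below is purely hypothetical and is not taken from any algorithm in this chapter; it is chosen only so that each rule has something to remove.

```latex
\begin{align*}
T(N) &= 3N^2 + 5N + 20 \\
     &\rightarrow 3N^2 && \text{rule 1: drop } 5N + 20 \text{, which becomes insignificant as } N \text{ grows} \\
     &\rightarrow N^2  && \text{rule 2: drop the constant factor } 3
\end{align*}
```

The result would be written as $O(N^2)$. For a sense of scale, at $N = 1000$ the discarded terms contribute only 5020 to a total of 3,005,020, which is less than 0.2 percent.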
The computational complexity of selection sort

You can apply the simplification rules from the preceding section to derive a big-O expression for the computational complexity of selection sort. From the analysis in the section “Analyzing the performance of selection sort” earlier in the chapter, you know that the running time of the selection sort algorithm for a vector of N elements is proportional to

$$\frac{N^2 + N}{2}$$

Although it would be mathematically correct to use this formula directly in the big-O expression

$$O\left(\frac{N^2 + N}{2}\right)$$

you would never do so in practice because the formula inside the parentheses is not expressed in the simplest form. The first step toward simplifying this relationship is to recognize that the formula is actually the sum of two terms, as follows:

$$\frac{N^2}{2} + \frac{N}{2}$$

You then need to consider the contribution of each of these terms to the total formula as N increases in size, which is illustrated by the following table:

    N          N^2 / 2          N / 2     (N^2 + N) / 2
    10         50               5         55
    100        5000             50        5050
    1000       500,000          500       500,500
    10,000     50,000,000       5000      50,005,000
    100,000    5,000,000,000    50,000    5,000,050,000

As N increases, the term involving $N^2$ quickly dominates the term involving N. As a result, the simplification rule allows you to eliminate the smaller term from the expression. Even so, you would not write that the computational complexity of selection sort is

$$O\left(\frac{N^2}{2}\right)$$

because you can eliminate the constant factor. The simplest expression you can use to indicate the complexity of selection sort is

$$O(N^2)$$

This expression captures the essence of the performance of selection sort. As the size of the problem increases, the running time tends to grow by the square of that increase. Thus, if you double the size of the vector, the running time goes up by a factor of four. If you instead multiply the number of input values by 10, the running time will explode by a factor of 100.
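One way to check the factor-of-four claim is to count comparisons rather than time them. The sketch below is an illustration under stated assumptions, not the text's own code: it uses std::vector<int> in place of the Vector class, and the names selectionSortCountingComparisons and reversedVector are hypothetical. It counts only the comparisons in the inner loop, which come to (N^2 - N) / 2. That differs from the formula above in its low-order term but has the same dominant term.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sorts vec into increasing order using selection sort and returns the
// number of element comparisons performed. std::vector<int> stands in
// for the Vector class used in the text (an assumption of this sketch).
long long selectionSortCountingComparisons(std::vector<int> & vec) {
    long long comparisons = 0;
    std::size_t n = vec.size();
    for (std::size_t lh = 0; lh < n; lh++) {
        std::size_t rh = lh;                 // index of smallest value seen so far
        for (std::size_t i = lh + 1; i < n; i++) {
            comparisons++;
            if (vec[i] < vec[rh]) rh = i;
        }
        std::swap(vec[lh], vec[rh]);         // move the smallest value into place
    }
    return comparisons;
}

// Builds a vector holding n, n-1, ..., 1 to use as sample input.
std::vector<int> reversedVector(int n) {
    assert(n >= 0);
    std::vector<int> result;
    for (int value = n; value >= 1; value--) {
        result.push_back(value);
    }
    return result;
}
```

Calling the function on reversedVector(100) and then on reversedVector(200) gives 4950 and 19,900 comparisons, a ratio of just over four for a doubling of N.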
Deducing computational complexity from code

The following function computes the average of the elements in a vector:

    double average(Vector<int> & vec) {
        int n = vec.size();
        double total = 0;
        for (int i = 0; i < n; i++) {
            total += vec[i];
        }
        return total / n;
    }

How would you determine its computational complexity? When you call this function, some parts of the code are executed only once, such as the initialization of total to 0 and the division operation in the return statement. These computations take a certain amount of time, but that time is constant in the sense that it doesn’t depend on the size of the vector. Code whose execution time does not depend on the problem size is said to run in constant time, which is expressed in big-O notation as $O(1)$.

Some of you will probably find the designation $O(1)$ confusing, because the expression inside the parentheses does not depend on N. In fact, this lack of any dependency on N is the whole point of the $O(1)$ notation. As you increase the size of the problem, the time required to execute code whose running time is $O(1)$ increases in exactly the same way that 1 increases; in other words, the running time of the code does not increase at all.

There are, however, other parts of the average function that are executed exactly n times, once for each cycle of the for loop. These components include the expression i++ in the for loop and the statement

    total += vec[i];

that constitutes the loop body. Although any single execution of this part of the computation takes a fixed amount of time, the fact that these statements are executed n times means their total execution time is directly proportional to the vector size. The computational complexity of this part of the average function is $O(N)$, which is commonly called linear time.

The total running time for average is therefore the sum of the times required for the constant parts and the linear parts of the algorithm.
As the size of the problem increases, however, the constant term becomes less and less relevant. By exploiting the simplification rule that allows you to ignore terms that become insignificant as N gets large, you can assert that the average function as a whole runs in $O(N)$ time.

You could, however, predict this result just by looking at the loop structure of the code. For the most part, the individual expressions and statements run in constant time, unless they involve function calls that must be accounted for separately. What matters in terms of computational complexity is how often those statements are executed. For many programs, you can determine the computational complexity simply by finding the piece of the code that is executed most often and determining how many times it runs as a function of N. In the case of the average function, the body of the loop is executed n times. Because no part of the code is executed more often than this, you can predict that the computational complexity will be $O(N)$.

The selection sort function can be analyzed in a similar way. The most frequently executed part of the code is the comparison in the statement

    if (vec[i] < vec[rh]) rh = i;

That statement is nested inside two for loops whose limits depend on the value of N. The inner loop runs N times as often as the outer loop, which implies that the inner loop body is executed $O(N^2)$ times. Algorithms like selection sort that exhibit $O(N^2)$ performance are said to run in quadratic time.

Worst-case versus average-case complexity

In some cases, the running time of an algorithm depends not only on the size of the problem but also on the specific characteristics of the data.
For example, consider the function

    int linearSearch(int key, Vector<int> & vec) {
        int n = vec.size();
        for (int i = 0; i < n; i++) {
            if (key == vec[i]) return i;
        }
        return -1;
    }

which returns the first index position in vec at which the value key appears, or -1 if the value key does not appear anywhere in the vector. Because the for loop in the implementation executes n times, you expect the performance of linearSearch, as its name implies, to be $O(N)$.

On the other hand, some calls to linearSearch can be executed very quickly. Suppose, for example, that the key element you are searching for happens to be in the first position in the vector. In that case, the body of the for loop will run only once. If you’re lucky enough to search for a value that always occurs at the beginning of the vector, linearSearch will run in constant time.

When you analyze the computational complexity of a program, you’re usually not interested in the minimum possible time. In general, computer scientists tend to be concerned about the following two types of complexity analysis:

• Worst-case complexity. The most common type of complexity analysis consists of determining the performance of an algorithm in the worst possible case. Such an analysis is useful because it allows you to set an upper bound on the computational complexity. If you analyze for the worst case, you can guarantee that the performance of the algorithm will be at least as good as your analysis indicates. You might sometimes get lucky, but you can be confident that the performance will not get any worse.

• Average-case complexity. From a practical point of view, it is often useful to consider how well an algorithm performs if you average its behavior over all possible sets of input data. Particularly if you have no reason to assume that the specific input to your problem is in any way atypical, the average-case analysis provides the best statistical estimate of actual performance.
The problem, however, is that average-case analysis is usually much more difficult to carry out and typically requires considerable mathematical sophistication.

The worst case for the linearSearch function occurs when the key is not in the vector at all. When the key is not there, the function must complete all n cycles of the for loop, which means that its performance is $O(N)$. If the key is known to be in the vector, the for loop will be executed about half as many times on average, which implies that average-case performance is also $O(N)$. As you will discover in the section on “The Quicksort algorithm” later in this chapter, the average-case and worst-case performances of an algorithm sometimes differ in qualitative ways, which means that in practice it is often important to take both performance characteristics into consideration.

A formal definition of big-O

Because understanding big-O notation is critical to modern computer science, it is important to offer a more formal definition to help you understand why the intuitive model of big-O works and why the suggested simplifications of big-O formulas are in fact justified. In mathematics, big-O notation is used to express the relationship between two functions, in an expression like this:

$$t(N) = O(f(N))$$

The formal meaning of this expression is that $f(N)$ is an approximation of $t(N)$ with the following characteristic: it must be possible to find a constant $N_0$ and a positive constant $C$ so that for every value of $N \ge N_0$, the following condition holds:

$$t(N) \le C \times f(N)$$

In other words, as long as N is “large enough,” the function $t(N)$ is always bounded by a constant multiple of the function $f(N)$.

When it is used to express computational complexity, the function $t(N)$ represents the actual running time of the algorithm, which is usually difficult to compute.
The function $f(N)$ is a much simpler formula that nonetheless provides a reasonable qualitative estimate for how the running time changes as a function of N, because the condition expressed in the mathematical definition of big-O ensures that the actual running time cannot grow faster than $f(N)$.

To see how the formal definition applies, it is useful to go back to the selection sort example. Analyzing the loop structure of selection sort showed that the operations in the innermost loop were executed

$$\frac{N^2 + N}{2}$$

times and that the running time was presumably roughly proportional to this formula. When this complexity was expressed in terms of big-O notation, the constants and low-order terms were eliminated, leaving only the assertion that the execution time was $O(N^2)$, which is in fact an assertion that

$$\frac{N^2 + N}{2} = O(N^2)$$

To show that this expression is indeed true under the formal definition of big-O, all you need to do is find constants $C$ and $N_0$ so that

$$\frac{N^2 + N}{2} \le C \times N^2$$

for all values of $N \ge N_0$. This particular example is extremely simple. All you need to do to satisfy the constraints is to set the constants $C$ and $N_0$ both to 1. After all, as long as N is no smaller than 1, you know that $N^2 \ge N$. It must therefore be the case that

$$\frac{N^2 + N}{2} \le \frac{N^2 + N^2}{2}$$

But the right side of this inequality is simply $N^2$, which means that

$$\frac{N^2 + N}{2} \le N^2$$

for all values of $N \ge 1$, as required by the definition.

You can use a similar argument to show that any polynomial of degree k, which can be expressed in general terms as

$$a_k N^k + a_{k-1} N^{k-1} + a_{k-2} N^{k-2} + \cdots + a_2 N^2 + a_1 N + a_0$$

is $O(N^k)$. To do so, your goal is to find constants $C$ and $N_0$ so that

$$a_k N^k + a_{k-1} N^{k-1} + a_{k-2} N^{k-2} + \cdots + a_2 N^2 + a_1 N + a_0 \le C \times N^k$$

for all values of $N \ge N_0$. As in the preceding example, start by letting $N_0$ be 1. For all values of $N \ge 1$, each successive power of N is at least as large as its predecessor, so

$$N^k \ge N^{k-1} \ge N^{k-2} \ge \cdots \ge N \ge 1$$

This property in turn implies that

$$a_k N^k + a_{k-1} N^{k-1} + a_{k-2} N^{k-2} + \cdots + a_1 N + a_0 \le |a_k| N^k + |a_{k-1}| N^k + |a_{k-2}| N^k + \cdots + |a_1| N^k + |a_0| N^k$$

where the vertical bars surrounding the coefficients on the right side of the inequality indicate absolute value. By factoring out $N^k$, you can simplify the right side of this inequality to

$$\left( |a_k| + |a_{k-1}| + |a_{k-2}| + \cdots + |a_1| + |a_0| \right) N^k$$

Thus, if you define the constant $C$ to be

$$|a_k| + |a_{k-1}| + |a_{k-2}| + \cdots + |a_1| + |a_0|$$

you have established that

$$a_k N^k + a_{k-1} N^{k-1} + a_{k-2} N^{k-2} + \cdots + a_2 N^2 + a_1 N + a_0 \le C \times N^k$$

This result proves that the entire polynomial is $O(N^k)$.

If all this mathematics scares you, try not to worry. It is much more important for you to understand what big-O means in practice than it is to follow all the steps in the formal derivation.

10.3 Recursion to the rescue

At this point, you know considerably more about complexity analysis than you did when you started the chapter. However, you are no closer to solving the practical problem of how to write a sorting algorithm that is more efficient for large vectors. The selection sort algorithm is clearly not up to the task, because the running time increases in proportion to the square of the input size. The same is true for most sorting algorithms that process the elements of the vector in a linear order. To develop a better sorting algorithm, you need to adopt a qualitatively different approach.

The power of divide-and-conquer strategies

Oddly enough, the key to finding a better sorting strategy lies in recognizing that the quadratic behavior of algorithms like selection sort has a hidden virtue. The basic characteristic of quadratic complexity is that, as the size of a problem doubles, the running time increases by a factor of four. The reverse, however, is also true.
If you divide the size of a quadratic problem by two, you decrease the running time by that same factor of four. This fact suggests that dividing a vector in half and then applying a recursive divide-and-conquer approach might reduce the required sorting time.

To make this idea more concrete, suppose you have a large vector that you need to sort. What happens if you divide the vector into two halves and then use the selection sort algorithm to sort each of those pieces? Because selection sort is quadratic, each of the smaller vectors requires one quarter of the original time. You need to sort both halves, of course, but the total time required to sort the two smaller vectors is still only half the time that would have been required to sort the original vector. If it turns out that sorting two halves of a vector simplifies the problem of sorting the complete vector, you will be able to reduce the total time substantially. More importantly, once you discover how to improve performance at one level, you can use the same algorithm recursively to sort each half.

To determine whether a divide-and-conquer strategy is applicable to the sorting problem, you need to answer the question of whether dividing a vector into two smaller vectors and then sorting each one helps to solve the general problem. As a way to gain some insight into this question, suppose that you start with a vector containing the following eight elements:

    vec      56  25  37  58  95  19  73  30
    index     0   1   2   3   4   5   6   7

If you divide the vector of eight elements into two vectors of length four and then sort each of those smaller vectors (remember that the recursive leap of faith means you can assume that the recursive calls work correctly), you get the following situation in which each of the smaller vectors is sorted:

    v1       25  37  56  58
    index     0   1   2   3

    v2       19  30  73  95
    index     0   1   2   3

How useful is this decomposition?
Remember that your goal is to take the values out of these smaller vectors and put them back into the original vector in the correct order. How does having these smaller sorted vectors help you in accomplishing that goal?

Merging two vectors

As it happens, reconstructing the complete vector from the smaller sorted vectors is a much simpler problem than sorting itself. The required technique, called merging, depends on the fact that the first element in the complete ordering must be either the first element in v1 or the first element in v2, whichever is smaller. In this example, the first element you want in the new vector is the 19 in v2. If you add that element to an empty vector vec and, in effect, cross it out of v2, you get the following configuration:

    v1       25  37  56  58
    v2       19  30  73  95      (19 already used)
    vec      19

Once again, the next element can only be the first unused element in one of the two smaller vectors. You compare the 25 from v1 against the 30 in v2 and choose the former:

    v1       25  37  56  58      (25 already used)
    v2       19  30  73  95      (19 already used)
    vec      19  25

You can easily continue this process of choosing the smaller value from v1 or v2 until you have reconstructed the entire vector.

The merge sort algorithm

The merge operation, combined with recursive decomposition, gives rise to a new sorting algorithm called merge sort, which you can implement in a straightforward way. The basic idea of the algorithm can be outlined as follows:

1. Check to see if the vector is empty or has only one element. If so, it must already be sorted. This condition defines the simple case for the recursion.
2. Divide the vector into two smaller vectors, each of which is half the size.
3. Sort each of the smaller vectors recursively.
4. Clear the original vector so that it is again empty.
5. Merge the two sorted vectors back into the original one.

The code for the merge sort algorithm, shown in Figure 10-3, divides neatly into two functions: sort and merge.
The code for sort follows directly from the outline of the algorithm. After checking for the special case, the algorithm divides the original vector into two smaller ones, v1 and v2. As soon as the code for sort has copied all of the elements into either v1 or v2, the rest of the function sorts these vectors recursively, clears the original vector, and then calls merge to reassemble the complete solution.

Most of the work is done by the merge function, which takes the destination vector, along with the smaller vectors v1 and v2. The indices p1 and p2 mark the progress through each of the subsidiary vectors. On each cycle of the loop, the function selects an element from v1 or v2, whichever is smaller, and adds that value to the end of vec. As soon as the elements in either of the two smaller vectors are exhausted, the function can simply copy the elements from the other vector without bothering to test them. In fact, because one of these vectors is already exhausted when the first while loop exits, the function can simply copy the rest of each vector to the destination. One of these vectors will be empty, and the corresponding while loop will therefore not be executed at all.

Figure 10-3. Implementation of the merge sort algorithm

    /* Prototype, so that sort can call merge before merge is defined */
    void merge(Vector<int> & vec, Vector<int> & v1, Vector<int> & v2);

    /*
     * Function: sort
     * --------------
     * This function sorts the elements of the vector into increasing order
     * using the merge sort algorithm, which consists of the following steps:
     *
     * 1. Divide the vector into two halves.
     * 2. Sort each of these smaller vectors recursively.
     * 3. Merge the two vectors back into the original one.
     */
    void sort(Vector<int> & vec) {
        int n = vec.size();
        if (n <= 1) return;
        Vector<int> v1;
        Vector<int> v2;
        for (int i = 0; i < n; i++) {
            if (i < n / 2) {
                v1.add(vec[i]);
            } else {
                v2.add(vec[i]);
            }
        }
        sort(v1);
        sort(v2);
        vec.clear();
        merge(vec, v1, v2);
    }

    /*
     * Function: merge
     * ---------------
     * This function merges two sorted vectors, v1 and v2, into the vector
     * vec, which should be empty before this operation. Because the input
     * vectors are sorted, the implementation can always select the first
     * unused element in one of the input vectors to fill the next position.
     */
    void merge(Vector<int> & vec, Vector<int> & v1, Vector<int> & v2) {
        int n1 = v1.size();
        int n2 = v2.size();
        int p1 = 0;
        int p2 = 0;
        while (p1 < n1 && p2 < n2) {
            if (v1[p1] < v2[p2]) {
                vec.add(v1[p1++]);
            } else {
                vec.add(v2[p2++]);
            }
        }
        while (p1 < n1) vec.add(v1[p1++]);
        while (p2 < n2) vec.add(v2[p2++]);
    }

The computational complexity of merge sort

You now have an implementation of the sort function based on the strategy of divide-and-conquer. How efficient is it? You can measure its efficiency by sorting vectors of numbers and timing the result, but it is helpful to start by thinking about the algorithm in terms of its computational complexity.

When you call the merge sort implementation of sort on a list of N numbers, the running time can be divided into two components:

1. The amount of time required to execute the operations at the current level of the recursive decomposition
2. The time required to execute the recursive calls

At the top level of the recursive decomposition, the cost of performing the nonrecursive operations is proportional to N. The loop to fill the subsidiary vectors accounts for N cycles, and the call to merge has the effect of refilling the original N positions in the vector. If you add these operations and ignore the constant factor, you discover that any single call to sort, not counting the recursive calls within it, requires $O(N)$ operations.
But what about the cost of the recursive operations? To sort a vector of size N, you must recursively sort two vectors of size N / 2. Each of these operations requires some amount of time. If you apply the same logic, you quickly determine that sorting each of these smaller vectors requires time proportional to N / 2 at that level, plus whatever time is required by its own recursive calls. The same process then continues until you reach the simple case in which the vectors consist of a single element or no elements at all.

The total time required to solve the problem is the sum of the time required at each level of the recursive decomposition. In general, the decomposition has the structure shown in Figure 10-4. As you move down through the recursive hierarchy, the vectors get smaller, but more numerous. The amount of work done at each level, however, is always directly proportional to N. Determining the total amount of work is therefore a question of finding out how many levels there will be.

Figure 10-4. Recursive decomposition of merge sort

    Sorting a vector of size N                        N operations
    requires sorting two vectors of size N / 2        2 × N/2 operations
    requires sorting four vectors of size N / 4       4 × N/4 operations
    requires sorting eight vectors of size N / 8      8 × N/8 operations
    and so on.

At each level of the hierarchy, the value of N is divided by 2. The total number of levels is therefore equal to the number of times you can divide N by 2 before you get down to 1. Rephrasing this problem in mathematical terms, you need to find a value of k such that

$$N = 2^k$$

Solving the equation for k gives

$$k = \log_2 N$$

Because the number of levels is $\log_2 N$ and the amount of work done at each level is proportional to N, the total amount of work is proportional to $N \log_2 N$.
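The level-by-level argument can be checked by instrumenting the algorithm. The following is a minimal sketch, not the text's own implementation: it ports the merge sort logic to std::vector<int> (an assumption, since the Vector class is not part of the standard library), and the names MergeSortStats and mergeSortCounting are hypothetical. It records the deepest level of recursion and the total number of elements written back by merge steps.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counters collected while sorting (names are illustrative only).
struct MergeSortStats {
    long long elementsMerged = 0;  // total elements written by all merge steps
    int maxDepth = 0;              // deepest level of recursion reached
};

// Merge sort on std::vector<int> that also updates the counters in stats.
// The top-level call is at depth 0.
void mergeSortCounting(std::vector<int> & vec, MergeSortStats & stats, int depth = 0) {
    if (depth > stats.maxDepth) stats.maxDepth = depth;
    std::size_t n = vec.size();
    if (n <= 1) return;                       // simple case: already sorted

    // Divide the vector into two halves.
    auto mid = vec.begin() + static_cast<std::ptrdiff_t>(n / 2);
    std::vector<int> v1(vec.begin(), mid);
    std::vector<int> v2(mid, vec.end());

    // Sort each half recursively.
    mergeSortCounting(v1, stats, depth + 1);
    mergeSortCounting(v2, stats, depth + 1);

    // Clear the original vector and merge the halves back into it.
    vec.clear();
    std::size_t p1 = 0;
    std::size_t p2 = 0;
    while (p1 < v1.size() && p2 < v2.size()) {
        if (v1[p1] < v2[p2]) {
            vec.push_back(v1[p1++]);
        } else {
            vec.push_back(v2[p2++]);
        }
    }
    while (p1 < v1.size()) vec.push_back(v1[p1++]);
    while (p2 < v2.size()) vec.push_back(v2[p2++]);
    stats.elementsMerged += static_cast<long long>(n);
}
```

For the eight-element vector 56 25 37 58 95 19 73 30, the recursion reaches depth 3, which is log2 of 8, and the merge steps write 24 elements in total, which is 8 × 3.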
Unlike other scientific disciplines, in which logarithms are expressed in terms of powers of 10 (common logarithms) or the mathematical constant e (natural logarithms), computer science tends to use binary logarithms, which are based on powers of 2. Logarithms computed using different bases differ only by a constant factor, and it is therefore traditional to omit the logarithmic base when you talk about computational complexity. Thus, the computational complexity of merge sort is usually written as

$$O(N \log N)$$

Comparing N^2 and N log N performance

But how good is $O(N \log N)$? One way to assess the level of improvement is to look at empirical data to get a sense of how the running times of the selection and merge sort algorithms compare. That timing information appears in Figure 10-5. For 10 items, this implementation of merge sort is more than five times slower than selection sort. At 100 items, selection sort is still faster, but not by very much. By the time you get up to 100,000 items, merge sort is almost 500 times faster than selection sort. On my computer, the selection sort algorithm requires more than two and a half minutes to sort 100,000 items while merge sort completes the job in less than half a second. For large vectors, merge sort clearly represents a significant improvement.

Figure 10-5. Empirical comparison of selection and merge sorts

    N          Selection sort    Merge sort
    10         0.0000024 sec     0.0000128 sec
    50         0.0000448 sec     0.0000887 sec
    100        0.000169 sec      0.000196 sec
    500        0.00402 sec       0.00110 sec
    1000       0.0159 sec        0.00236 sec
    5000       0.395 sec         0.0129 sec
    10,000     1.58 sec          0.027 sec
    50,000     39.6 sec          0.156 sec
    100,000    158.7 sec         0.324 sec
You can, however, get much the same information by comparing the computational complexity formulas for the two algorithms, as follows:

    N          N^2            N log N
    10         100            33
    100        10,000         664
    1000       1,000,000      9965
    10,000     100,000,000    132,877

The numbers in both columns grow as N becomes larger, but the $N^2$ column grows much faster than the $N \log N$ column. Sorting algorithms based on an $N \log N$ algorithm will therefore be useful over a much larger range of vector sizes.

10.4 Standard complexity classes

In programming, most algorithms fall into one of several common complexity classes. The most important complexity classes are shown in Figure 10-6, which gives the common name of the class along with the corresponding big-O expression and a representative algorithm in that class.

Figure 10-6. Standard complexity classes

    Constant       O(1)          Returning the first element in a vector
    Logarithmic    O(log N)      Binary search in a sorted vector
    Linear         O(N)          Linear search in a vector
    N log N        O(N log N)    Merge sort
    Quadratic      O(N^2)        Selection sort
    Cubic          O(N^3)        Conventional algorithms for matrix multiplication
    Exponential    O(2^N)        Tower of Hanoi

The classes in Figure 10-6 are presented in strictly increasing order of complexity. If you have a choice between one algorithm that requires $O(\log N)$ time and another that requires $O(N)$ time, the first will always outperform the second as N grows large. For small values of N, terms that are discounted in the big-O calculation may allow a theoretically less efficient algorithm to do better against one that has a lower computational complexity. On the other hand, as N grows larger, there will always be a point at which the theoretical difference in efficiency becomes the deciding factor. The differences in efficiency between these classes are in fact profound.
You can begin to get a sense of how the different complexity functions stand in relation to one another by looking at the graph in Figure 10-7, which plots these complexity functions on a traditional linear scale.

[Figure 10-7. Growth characteristics of the standard complexity classes: linear plot of O(1), O(log N), O(N), O(N log N), O(N^2), O(N^3), and O(2^N)]

Unfortunately, this graph tells an incomplete and somewhat misleading part of the story, because the values of N are all very small. Complexity analysis, after all, is primarily relevant as the values of N become large. Figure 10-8 shows the same data plotted on a logarithmic scale, which gives you a better sense of how these functions grow over a more extensive range of values.

Algorithms that fall into the constant, linear, quadratic, and cubic complexity classes are all part of a more general family called polynomial algorithms that execute in time $N^k$ for some constant k. One of the useful properties of the logarithmic plot shown in Figure 10-8 is that the graph of any function $N^k$ always comes out as a straight line whose slope is proportional to k. From looking at the figure, it is immediately clear that the function $N^k$, no matter how big k happens to be, will invariably grow more slowly than the exponential function represented by $2^N$, which continues to curve upward as the value of N increases. This property has important implications in terms of finding practical algorithms for real-world problems.
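The claim that the exponential eventually overtakes any fixed power can be spot-checked numerically for small exponents. The sketch below is an illustration only; the function names integerPower and exponentialTakeoverPoint are hypothetical, and the limits on k and on the search range are assumptions chosen so that every value fits in a 64-bit unsigned integer.

```cpp
#include <cassert>
#include <cstdint>

// Returns base raised to the given exponent by repeated multiplication.
// The caller is responsible for keeping the result within 64 bits.
std::uint64_t integerPower(std::uint64_t base, int exponent) {
    std::uint64_t result = 1;
    for (int i = 0; i < exponent; i++) {
        result *= base;
    }
    return result;
}

// Returns the smallest N0 such that 2^N > N^k for every N in the range
// N0 <= N <= limit, or limit + 1 if even N = limit fails the test.
// Assumes 0 <= k <= 10 and 1 <= limit <= 62 so that nothing overflows.
int exponentialTakeoverPoint(int k, int limit) {
    assert(k >= 0 && k <= 10 && limit >= 1 && limit <= 62);
    int n0 = limit + 1;
    // Walk downward from limit until the exponential stops being ahead.
    for (int n = limit; n >= 1; n--) {
        if (integerPower(2, n) > integerPower(static_cast<std::uint64_t>(n), k)) {
            n0 = n;
        } else {
            break;
        }
    }
    return n0;
}
```

Within the range checked, exponentialTakeoverPoint(2, 40) is 5 and exponentialTakeoverPoint(3, 40) is 10: from N = 10 onward, 2^N stays ahead of N^3. A larger k pushes the takeover point further out but does not remove it.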
[Figure 10-8. Growth characteristics of the standard complexity classes: logarithmic plot of O(1), O(log N), O(N), O(N log N), O(N^2), O(N^3), and O(2^N)]

Even though the selection sort example makes it clear that quadratic algorithms have substantial performance problems for large values of N, algorithms whose complexity is $O(2^N)$ are considerably worse. As a general rule of thumb, computer scientists classify problems that can be solved using algorithms that run in polynomial time as tractable, in the sense that they are amenable to implementation on a computer. Problems for which no polynomial-time algorithm exists are regarded as intractable.

Unfortunately, there are many commercially important problems for which all known algorithms require exponential time. One of those is the subset-sum problem introduced in Chapter 8, which turns up in several practical contexts. Another is the traveling salesman problem, which seeks to find the shortest route by which one can visit a set of N cities connected by some transportation system and then return to the starting point. As far as anyone knows, it is not possible to solve either the subset-sum problem or the traveling salesman problem in polynomial time. The best-known approaches all have exponential performance in the worst case and are equivalent in efficiency to generating all possible routings and comparing the cost. In general, the best known solution to each of these problems is to try every possibility, which requires exponential time. On the other hand, no one has been able to prove conclusively that no polynomial-time algorithm for these problems exists. There might be some clever algorithm that makes these problems tractable. If so, it turns out that many problems currently believed to be difficult would move into the tractable range as well.
The question of whether problems like subset-sum or the traveling salesman problem, which are part of a class of problems that computer scientists designate as NP-complete problems, can be solved in polynomial time is one of the most important open questions in computer science and, indeed, in mathematics.

10.5 The Quicksort algorithm

Even though the merge sort algorithm presented earlier in this chapter performs well in theory and has a worst-case complexity of $O(N \log N)$, it is not used much in practice. Instead, most sorting programs in use today are based on an algorithm called Quicksort, developed by the British computer scientist C. A. R. (Tony) Hoare.

Both Quicksort and merge sort employ a divide-and-conquer strategy. In the merge sort algorithm, the original vector is divided into two halves, each of which is sorted independently. The resulting sorted vectors are then merged together to complete the sort operation for the entire vector. Suppose, however, that you took a different approach to dividing up the vector. What would happen if you started the process by making an initial pass through the vector, changing the positions of the elements so that “small” values come at the beginning of the vector and “large” values come at the end, for some definition of the words large and small?

For example, suppose that the original vector you wanted to sort was the following one, presented earlier in the discussion of merge sort:

    vec      56  25  37  58  95  19  73  30
    index     0   1   2   3   4   5   6   7

Half of these elements are larger than 50 and half are smaller, so it might make sense to define small in this case as being less than 50 and large as being 50 or more.
If you could then find a way to rearrange the elements so that all the small elements came at the beginning and all the large ones at the end, you would wind up with a vector that looks something like the following diagram, which shows one of many possible orderings that fit the definition:

             19  25  37  30 | 56  95  73  58
    index     0   1   2   3 |  4   5   6   7
             small elements | large elements

When the elements are divided into parts in this fashion, all that remains to be done is to sort each of the parts, using a recursive call to the function that does the sorting. Since all the elements on the left side of the boundary line are smaller than all those on the right, the final result will be a completely sorted vector:

             19  25  30  37  56  58  73  95
    index     0   1   2   3   4   5   6   7

If you could always choose the optimal boundary between the small and large elements on each cycle, this algorithm would divide the vector in half each time and end up demonstrating the same qualitative characteristics as merge sort. In practice, the Quicksort algorithm selects some existing element in the vector and uses that value as the dividing line between the small and large elements. For example, a common approach is to pick the first element, which was 56 in the original vector, and use it as the demarcation point between small and large elements. When the vector is reordered, the boundary is therefore at a particular index position rather than between two positions, as follows:

             19  25  37  30  56  95  73  58
    index     0   1   2   3   4   5   6   7

From this point, the recursive calls must sort the vector between positions 0 and 3 and the vector between positions 5 and 7, leaving index position 4 right where it is.

As in merge sort, the simple case of the Quicksort algorithm is a vector of size 0 or 1, which must already be sorted. The recursive part of the Quicksort algorithm consists of the following steps:

1. Choose an element to serve as the boundary between the small and large elements. This element is traditionally called the pivot.
For the moment, it is sufficient to choose any element for this purpose, and the simplest strategy is to select the first element in the vector.

2. Rearrange the elements in the vector so that large elements are moved toward the end of the vector and small elements toward the beginning. More formally, the goal of this step is to divide the elements around a boundary position so that all elements to the left of the boundary are less than the pivot and all elements to the right are greater than or possibly equal to the pivot. This processing is called partitioning the vector and is discussed in detail in the next section.

3. Sort the elements in each of the partial vectors. Because all elements to the left of the pivot boundary are strictly less than all those to the right, sorting each of the vectors must leave the entire vector in sorted order. Moreover, since the algorithm uses a divide-and-conquer strategy, these smaller vectors can be sorted using a recursive application of Quicksort.

Partitioning the vector

In the partition step of the Quicksort algorithm, the goal is to rearrange the elements so that they are divided into three classes: those that are smaller than the pivot; the pivot element itself, which is situated at the boundary position; and those elements that are at least as large as the pivot. The tricky part about partition is to rearrange the elements without using any extra storage, which is typically done by swapping pairs of elements.

Tony Hoare's original approach to partitioning is fairly easy to explain in English. Because the pivot value has already been selected when you start the partitioning phase of the algorithm, you can tell immediately whether a value is large or small relative to that pivot. To make things easier, let's assume that the pivot value is stored in the initial element position. Hoare's partitioning algorithm proceeds as follows:

1. For the moment, ignore the pivot element at index position 0 and concentrate on the remaining elements. Use two index values, lh and rh, to record the index positions of the first and last elements in the rest of the vector, as shown:

    56  25  37  58  95  19  73  30
     0   1   2   3   4   5   6   7
        lh                      rh

2. Move the rh index to the left until it either coincides with lh or points to an element containing a value that is small with respect to the pivot. In this example, the value 30 in position 7 is already a small value, so the rh index does not need to move.

3. Move the lh index to the right until it coincides with rh or points to an element containing a value that is larger than or equal to the pivot. In this example, the lh index must move to the right until it points to an element larger than 56, which leads to the following configuration:

    56  25  37  58  95  19  73  30
     0   1   2   3   4   5   6   7
                lh              rh

4. If the lh and rh index values have not yet reached the same position, exchange the elements in those positions in the vector, which leaves it looking like this:

    56  25  37  30  95  19  73  58
     0   1   2   3   4   5   6   7
                lh              rh

5. Repeat steps 2 through 4 until the lh and rh positions coincide. On the next pass, for example, the exchange operation in step 4 swaps the 19 and the 95. As soon as that happens, the next execution of step 2 moves the rh index to the left, where it ends up matching the lh, as follows:

    56  25  37  30  19  95  73  58
     0   1   2   3   4   5   6   7
                    lh = rh

6. Unless the chosen pivot just happened to be the smallest element in the entire vector (and the code includes a special check for this case), the point at which the lh and rh index positions coincide will be the small value that is furthest to the right in the vector.
The only remaining step is to exchange the value in this position with the pivot element at the beginning of the vector, as shown:

    19  25  37  30  56  95  73  58
     0   1   2   3   4   5   6   7
                     ^
                  boundary

Note that this configuration meets the requirements of the partitioning step. The pivot value is at the marked boundary position, with every element to the left being smaller and every element to the right being at least that large.

An implementation of sort using the Quicksort algorithm is shown in Figure 10-9.

Figure 10-9 Implementation of the Quicksort algorithm

    /*
     * Function: sort
     * --------------
     * This function sorts the elements of the vector into
     * increasing numerical order using the Quicksort algorithm.
     * In this implementation, sort is a wrapper function that
     * calls quicksort to do all the work.
     */

    void sort(Vector<int> & vec) {
        quicksort(vec, 0, vec.size() - 1);
    }

    /*
     * Function: quicksort
     * -------------------
     * Sorts the elements in the vector between index positions
     * start and finish, inclusive. The Quicksort algorithm begins
     * by "partitioning" the vector so that all elements smaller
     * than a designated pivot element appear to the left of a
     * boundary and all equal or larger values appear to the right.
     * Sorting the subsidiary vectors to the left and right of the
     * boundary ensures that the entire vector is sorted.
     */

    void quicksort(Vector<int> & vec, int start, int finish) {
        if (start >= finish) return;
        int boundary = partition(vec, start, finish);
        quicksort(vec, start, boundary - 1);
        quicksort(vec, boundary + 1, finish);
    }

    /*
     * Function: partition
     * -------------------
     * This function rearranges the elements of the vector so that the
     * small elements are grouped at the left end of the vector and the
     * large elements are grouped at the right end. The distinction
     * between small and large is made by comparing each element to the
     * pivot value, which is initially taken from vec[start]. When the
     * partitioning is done, the function returns a boundary index such
     * that vec[i] < pivot for all i < boundary, vec[i] == pivot
     * for i == boundary, and vec[i] >= pivot for all i > boundary.
     */

    int partition(Vector<int> & vec, int start, int finish) {
        int pivot = vec[start];
        int lh = start + 1;
        int rh = finish;
        while (true) {
            while (lh < rh && vec[rh] >= pivot) rh--;
            while (lh < rh && vec[lh] < pivot) lh++;
            if (lh == rh) break;
            int tmp = vec[lh];
            vec[lh] = vec[rh];
            vec[rh] = tmp;
        }
        if (vec[lh] >= pivot) return start;
        vec[start] = vec[lh];
        vec[lh] = pivot;
        return lh;
    }

Analyzing the performance of Quicksort

A head-to-head comparison of the actual running times for the merge sort and Quicksort algorithms appears in Figure 10-10. As you can see, this implementation of Quicksort tends to run several times faster than the implementation of merge sort given in Figure 10-3, which is one of the reasons why programmers use it more frequently in practice. Moreover, the running times for both algorithms appear to grow in roughly the same way.

The empirical results presented in Figure 10-10, however, obscure an important point. As long as the Quicksort algorithm chooses a pivot that is close to the median value in the vector, the partition step will divide the vector into roughly equal parts. If the algorithm chooses its pivot value poorly, one of the two partial vectors may be much larger than the other, which defeats the purpose of the divide-and-conquer strategy. In a vector with randomly chosen elements, Quicksort tends to perform well, with an average-case complexity of $O(N \log N)$.
In the worst case, which paradoxically consists of a vector that is already sorted, the performance degenerates to $O(N^2)$. Despite this inferior behavior in the worst case, Quicksort is so much faster in practice than most other algorithms that it has become the standard choice for general sorting procedures.

Figure 10-10 Empirical comparison of merge sort and Quicksort

    N          Merge sort       Quicksort
    10         0.0000128 sec    0.0000014 sec
    50         0.0000887 sec    0.0000120 sec
    100        0.000200 sec     0.0000288 sec
    500        0.00110 sec      0.000196 sec
    1000       0.00236 sec      0.000456 sec
    5000       0.0129 sec       0.00284 sec
    10,000     0.027 sec        0.00608 sec
    50,000     0.156 sec        0.0365 sec
    100,000    0.324 sec        0.0774 sec

There are several strategies you can use to increase the likelihood that the pivot is in fact close to the median value in the vector. One simple approach is to have the Quicksort implementation choose the pivot element at random. Although it is still possible that the random process will choose a poor pivot value, it is unlikely that it would make the same mistake repeatedly at each level of the recursive decomposition. Moreover, there is no distribution of the original vector that is always bad. Given any input, choosing the pivot randomly ensures that the average-case performance for that vector would be $O(N \log N)$. Another possibility, which is explored in more detail in exercise 6, is to select a few values, typically three or five, from the vector and choose the median of those values as the pivot.

You do have to be somewhat careful as you try to improve the algorithm in this way. Picking a good pivot improves performance, but also costs some time. If the algorithm spends more time choosing the pivot than it gets back from making a good choice, you will end up slowing down the implementation rather than speeding it up.

10.6 Mathematical induction

Earlier in the chapter, I asked you to rely on the fact that the sum

$$N + (N - 1) + (N - 2) + \cdots + 3 + 2 + 1$$

could be simplified to the more manageable formula

$$\frac{N^2 + N}{2}$$

If you were skeptical about this simplification, how would you go about proving that the simplified formula is indeed correct?

There are, in fact, several different proof techniques you could try. One possibility is to represent the original extended sum in a geometric form. Suppose, for example, that N is 5. If you then represent each term in the summation with a row of dots, those dots form the following triangle:

If you make a copy of this triangle and flip it upside down, the two triangles fit together to form a rectangle, shown here with the lower triangle in gray:

Since the pattern is now rectangular, the total number of dots, both black and gray, is easy to compute. In this picture, there are five rows of six dots each, so the total collection of dots, counting both colors, is 5 × 6, or 30. Since the two triangles are identical, exactly half of these dots are black; thus the number of black dots is 30 / 2, or 15. In the more general case, there are N rows containing N + 1 dots each, and the number of black dots from the original triangle is therefore

$$\frac{N \times (N + 1)}{2}$$

Proving that a formula is correct in this fashion, however, has some potential drawbacks. For one thing, geometrical arguments presented in this style are not as formal as many computer scientists would like. More to the point, constructing this type of argument requires that you come up with the right geometrical insight, which is different for each problem. It would be better to adopt a more general proof strategy that would apply to many different problems.

The technique that computer scientists generally use to prove propositions like

$$N + (N - 1) + (N - 2) + \cdots + 3 + 2 + 1 = \frac{N \times (N + 1)}{2}$$

is called mathematical induction. Mathematical induction applies when you want to show that a proposition is true for all values of an integer N beginning at some initial starting point.
This starting point is called the basis of the induction and is typically 0 or 1. The process consists of the following steps:

• Prove the base case. The first step is to establish that the proposition holds true when N has the basis value. In most cases, this step is a simple matter of plugging the basis value into a formula and showing that the desired relationship holds.
• Prove the inductive case. The second step is to demonstrate that, if you assume the proposition to be true for N, it must also be true for N + 1.

As an example, here is how you can use mathematical induction to prove the proposition that

$$N + (N - 1) + (N - 2) + \cdots + 3 + 2 + 1 = \frac{N \times (N + 1)}{2}$$

is indeed true for all N greater than or equal to 1. The first step is to prove the base case, when N is equal to 1. That part is easy. All you have to do is substitute 1 for N in both halves of the formula to determine that

$$1 = \frac{1 \times (1 + 1)}{2} = \frac{2}{2} = 1$$

To prove the inductive case, you begin by assuming that the proposition

$$N + (N - 1) + (N - 2) + \cdots + 3 + 2 + 1 = \frac{N \times (N + 1)}{2}$$

is indeed true for N. This assumption is called the inductive hypothesis. Your goal is now to verify that the same relationship holds for N + 1. In other words, what you need to do to establish the truth of the current formula is to show that

$$(N + 1) + N + (N - 1) + (N - 2) + \cdots + 3 + 2 + 1 = \frac{(N + 1) \times (N + 2)}{2}$$

If you look at the left side of the equation, you should notice that the sequence of terms beginning with N is exactly the same as the left side of your inductive hypothesis. Since you have assumed that the inductive hypothesis is true, you can substitute the equivalent closed-form expression, so that the left side of the proposition you're trying to prove looks like this:

$$N + 1 + \frac{N \times (N + 1)}{2}$$

From here on, the rest of the proof is simple algebra:

$$
\begin{aligned}
N + 1 + \frac{N \times (N + 1)}{2} &= \frac{2N + 2}{2} + \frac{N^2 + N}{2} \\
&= \frac{N^2 + 3N + 2}{2} \\
&= \frac{(N + 1) \times (N + 2)}{2}
\end{aligned}
$$

The last line in this derivation is precisely the result you were looking for and therefore completes the proof.

Many students need time to get used to the idea of mathematical induction. At first glance, the inductive hypothesis seems to be "cheating" in some sense; after all, you get to assume precisely the proposition that you are trying to prove. In fact, the process of mathematical induction is nothing more than an infinite family of proofs, each of which proceeds by the same logic. The base case in a typical example establishes that the proposition is true for N = 1. Once you have proved the base case, you can adopt the following chain of reasoning:

    Now that I know the proposition is true for N = 1, I can prove it is true for N = 2.
    Now that I know the proposition is true for N = 2, I can prove it is true for N = 3.
    Now that I know the proposition is true for N = 3, I can prove it is true for N = 4.
    Now that I know the proposition is true for N = 4, I can prove it is true for N = 5.
    And so on. . . .

At each step in this process, you could write out a complete proof by applying the logic you used to establish the inductive case. The power of mathematical induction comes from the fact that you don't actually need to write out the details of each step individually.

In a way, the process of mathematical induction is like the process of recursion viewed from the opposite direction. If you try to explain a typical recursive decomposition in detail, the process usually sounds something like this:

    To calculate this function for N = 5, I need to know its value for N = 4.
    To calculate this function for N = 4, I need to know its value for N = 3.
    To calculate this function for N = 3, I need to know its value for N = 2.
    To calculate this function for N = 2, I need to know its value for N = 1.
    The value N = 1 represents a simple case, so I can return the result immediately.

Both induction and recursion require you to make a leap of faith. When you write a recursive function, this leap consists of believing that all simpler instances of the function call will work without your paying any attention to the details. Making the inductive hypothesis requires much the same mental discipline. In both cases, you have to restrict your thinking to one level of the solution and not get sidetracked trying to follow the details all the way to the end.

Summary

The most valuable concept to take with you from this chapter is that algorithms for solving a problem can vary widely in their performance characteristics. Choosing an algorithm that has better computational properties can often reduce the time required to solve a problem by many orders of magnitude. The difference in behavior is illustrated dramatically by the tables presented in this chapter that give the actual running times for various sorting algorithms. When sorting a vector of 10,000 integers, for example, the Quicksort algorithm outperforms selection sort by a factor of almost 250; as the vector sizes get larger, the difference in efficiency between these algorithms will become even more pronounced.

Other important points in this chapter include:

• Most algorithmic problems can be characterized by an integer N that represents the size of the problem. For algorithms that operate on large integers, the size of the integer provides an effective measure of problem size; for algorithms that operate on arrays or vectors, it usually makes sense to define the problem size as the number of elements.
• The most useful qualitative measure of efficiency is computational complexity, which is defined as the relationship between problem size and algorithmic performance as the problem size becomes large.
• Big-O notation provides an intuitive way of expressing computational complexity because it allows you to highlight the most important aspects of the complexity relationship in the simplest possible form.
• When you use big-O notation, you can simplify the formula by eliminating any term in the formula that becomes insignificant as N becomes large, along with any constant factors.
• You can often predict the computational complexity of a program by looking at the nesting structure of the loops it contains.
• Two useful measures of complexity are worst-case and average-case analysis. Average-case analysis is usually much more difficult to conduct.
• Divide-and-conquer strategies make it possible to reduce the complexity of sorting algorithms from $O(N^2)$ to $O(N \log N)$, which is a significant reduction.
• Most algorithms fall into one of several common complexity classes, which include the constant, logarithmic, linear, N log N, quadratic, cubic, and exponential classes. Algorithms whose complexity class appears earlier in this list are more efficient than those that come afterward when the problems being considered are sufficiently large.
• Problems that can be solved in polynomial time, which is defined to be $O(N^k)$ for some constant value k, are considered to be tractable. Problems for which no polynomial-time algorithm exists are considered intractable because solving such problems requires prohibitive amounts of time, even for problems of relatively modest size.
• Because Quicksort, developed by Tony Hoare, tends to perform extremely well in practice, most sorting programs are based on it, even though its worst-case complexity is $O(N^2)$.
• Mathematical induction provides a general technique for proving that a property holds for all values of N greater than or equal to some base value. To apply this technique, your first step is to demonstrate that the property holds in the base case. In the second step, you must prove that, if the formula holds for a specific value N, then it must also hold for N + 1.

Review questions

1. The simplest recursive implementation of the Fibonacci function is considerably less efficient than the iterative version. Does this fact allow you to make any general conclusions about the relative efficiency of recursive and iterative solutions?

2. What is the sorting problem?

3. The implementation of sort shown in Figure 10-1 runs through the code to exchange the values at positions lh and rh even if these values happen to be the same. If you change the program so that it checks to make sure lh and rh are different before making the exchange, it is likely to run more slowly than the original algorithm. Why might this be so?

4. Suppose that you are using the selection sort algorithm to sort a vector of 250 values and find that it takes 50 milliseconds to complete the operation. What would you expect the running time to be if you used the same algorithm to sort a vector of 1000 values on the same machine?

5. What is the closed-form expression that computes the sum of the series

$$N + (N - 1) + (N - 2) + \cdots + 3 + 2 + 1$$

6. In your own words, define the concept of computational complexity.

7. True or false: Big-O notation was invented as a means to express computational complexity.

8. What are the two rules presented in this chapter for simplifying big-O notation?

9. Is it technically correct to say that selection sort runs in $O\left(\frac{N^2 + N}{2}\right)$ time? What, if anything, is wrong with doing so?

10. Is it technically correct to say that selection sort runs in $O(N^3)$ time? Again, what, if anything, is wrong with doing so?

11. Why is it customary to omit the base of the logarithm in big-O expressions such as $O(N \log N)$?

12. What is the computational complexity of the following function:

    int mystery1(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < i; j++) {
                sum += i * j;
            }
        }
        return sum;
    }

13. What is the computational complexity of this function:

    int mystery2(int n) {
        int sum = 0;
        for (int i = 0; i < 10; i++) {
            for (int j = 0; j < i; j++) {
                sum += j * n;
            }
        }
        return sum;
    }

14. Explain the difference between worst-case and average-case complexity. In general, which of these measures is harder to compute?

15. State the formal definition of big-O.

16. In your own words, explain why the merge function runs in linear time.

17. The last two lines of the merge function are

    while (p1 < n1) vec.add(v1[p1++]);
    while (p2 < n2) vec.add(v2[p2++]);

Would it matter if these two lines were reversed? Why or why not?

18. What are the seven complexity classes identified in this chapter as the most common classes encountered in practice?

19. What does the term polynomial algorithm mean?

20. What criterion do computer scientists use to differentiate tractable and intractable problems?

21. In the Quicksort algorithm, what conditions must be true at the conclusion of the partitioning step?

22. What are the worst- and average-case complexities for Quicksort?

23. Describe the two steps involved in a proof by mathematical induction.

24. In your own words, describe the relationship between recursion and mathematical induction.

Exercises

1. It is easy to write a recursive function

    double raiseToPower(double x, int n)

that calculates $x^n$, by relying on the recursive insight that

$$x^n = x \times x^{n-1}$$

Such a strategy leads to an implementation that runs in linear time.
You can, however, adopt a recursive divide-and-conquer strategy which takes advantage of the fact that

$$x^{2n} = x^n \times x^n$$

Use this fact to write a recursive version of raiseToPower that runs in $O(\log N)$ time.

2. There are several other sorting algorithms that exhibit the $O(N^2)$ behavior of selection sort. Of these, one of the most important is insertion sort, which operates as follows. You go through each element in the vector in turn, as with the selection sort algorithm. At each step in the process, however, the goal is not to find the smallest remaining value and switch it into its correct position, but rather to ensure that the values considered so far are correctly ordered with respect to each other. Although those values may shift as more elements are processed, they form an ordered sequence in and of themselves.

For example, if you consider again the data used in the sorting examples from this chapter, the first cycle of the insertion sort algorithm requires no work, because a vector of one element is always sorted:

    in order
    56 | 25  37  58  95  19  73  30
     0 |  1   2   3   4   5   6   7

On the next cycle, you need to put 25 into the correct position with respect to the elements you have already seen, which means that you need to exchange the 56 and 25 to reach the following configuration:

    in order
    25  56 | 37  58  95  19  73  30
     0   1 |  2   3   4   5   6   7

On the third cycle, you need to find where the value 37 should go. To do so, you need to move backward through the earlier elements (which you know are in order with respect to each other), looking for the position where 37 belongs. As you go, you need to shift each of the larger elements one position to the right, which eventually makes room for the value you're trying to insert. In this case, the 56 gets shifted by one position, and the 37 winds up in position 1.
Thus, the configuration after the third cycle looks like this:

    in order
    25  37  56 | 58  95  19  73  30
     0   1   2 |  3   4   5   6   7

After each cycle, the initial portion of the vector is always sorted, which implies that cycling through all the positions in this way will sort the entire vector.

The insertion sort algorithm is important in practice because it runs in linear time if the vector is already more or less in the correct order. It therefore makes sense to use insertion sort to restore order to a large vector in which only a few elements are out of sequence.

Write an implementation of sort that uses the insertion sort algorithm. Construct an informal argument to show that the worst-case behavior of insertion sort is $O(N^2)$.

3. Write a function that keeps track of the elapsed time as it executes the sort procedure on a randomly chosen vector. Use that function to write a program that produces a table of the observed running times for a predefined set of sizes, as shown in the following sample run:

    SortTimer
        N    | Time (msec)
    ---------+--------------
          10 |     0.00074
          50 |     0.00688
         100 |     0.01654
         500 |     0.11578
        1000 |     0.25458
        5000 |     1.59824
       10000 |     3.46010
       50000 |    20.45690
      100000 |    43.76030

The best way to measure elapsed system time for programs of this sort is to use the ANSI clock function, which is exported by the <ctime> interface. The clock function takes no arguments and returns the amount of time the processing unit of the computer has used in the execution of the program. The unit of measurement and even the type used to store the result of clock differ depending on the type of machine, but you can always convert the system-dependent clock units into seconds by using the following expression:

    double(clock()) / CLOCKS_PER_SEC

If you record the starting and finishing times in the variables start and finish, you can use the following code to compute the time required by a calculation:

    double start = double(clock()) / CLOCKS_PER_SEC;
    . . . Perform some calculation . . .
    double finish = double(clock()) / CLOCKS_PER_SEC;
    double elapsed = finish - start;

Unfortunately, calculating the time requirements for a program that runs quickly requires some subtlety because there is no guarantee that the system clock unit is precise enough to measure the elapsed time. For example, if you used this strategy to time the process of sorting 10 integers, the odds are good that the value of elapsed at the end of the code fragment would be 0. The reason is that the processing unit on most machines can execute many instructions in the space of a single clock tick, almost certainly enough to get the entire sorting process done for a vector of 10 elements. Because the system's internal clock may not tick in the interim, the values recorded for start and finish are likely to be the same.

The best way to get around this problem is to repeat the calculation many times between the two calls to the clock function. For example, if you want to determine how long it takes to sort 10 numbers, you can perform the sort-10-numbers experiment 1000 times in a row and then divide the total elapsed time by 1000. This strategy gives you a timing measurement that is much more accurate.

4. Suppose you know that all the values in an integer array fall into the range 0 to 9999. Show that it is possible to write an $O(N)$ algorithm to sort arrays with this restriction. Implement your algorithm and evaluate its performance by taking empirical measurements using the strategy outlined in exercise 3. Explain why the performance of the algorithm is so bad for small values of N.

5. Write a program that generates a table comparing the performance of two algorithms, linear and binary search, when used to find a randomly chosen integer key in a sorted Vector<int>. The linear search algorithm simply goes through each element of the vector in turn until it finds the desired one or determines that the key does not appear.
The binary search algorithm, which is implemented for string vectors in Figure 7-5, uses a divide-and-conquer strategy by checking the middle element of the vector and then deciding which half of the remaining elements to search.

The table you generate in this problem, rather than computing the time as in exercise 3, should instead calculate the number of comparisons made against elements of the vector. To ensure that the results are not completely random, your program should average the results over several independent trials. A sample run of the program might look like this:

    SearchComparison
        N    |  Linear  |  Binary
    ---------+----------+----------
          10 |      4.4 |      2.2
          50 |     31.2 |      5.4
         100 |     33.2 |      6.2
         500 |    270.4 |      8.6
        1000 |    637.4 |      9.2
        5000 |   4101.6 |     12.0
       10000 |   4632.8 |     13.8
       50000 |  32681.4 |     15.6
      100000 |  51598.4 |     16.8

6. Change the implementation of the Quicksort algorithm so that, instead of picking the first element in the vector as the pivot, the partition function chooses the median of the first, middle, and last elements.

7. Although $O(N \log N)$ sorting algorithms are clearly more efficient than $O(N^2)$ algorithms for large vectors, the simplicity of quadratic algorithms like selection sort often means that they perform better for small values of N. This fact raises the possibility of developing a strategy that combines the two algorithms, using Quicksort for large vectors but selection sort whenever the vectors become smaller than some threshold called the crossover point. Approaches that combine two different algorithms to exploit the best features of each are called hybrid strategies.

Reimplement sort using a hybrid of the Quicksort and selection sort strategies. Experiment with different values of the crossover point below which the implementation chooses to use selection sort, and determine what value gives the best performance.
The value of the crossover point depends on the specific timing characteristics of your computer and will change from system to system.

8. Another interesting hybrid strategy for the sorting problem is to start with a recursive implementation of Quicksort that simply returns when the size of the vector falls below a certain threshold. When this function returns, the vector is not sorted, but all the elements are relatively close to their final positions. At this point, you can use the insertion sort algorithm presented in exercise 2 on the entire vector to fix any remaining problems. Because insertion sort runs in linear time on vectors that are mostly sorted, you may be able to save some time using this approach.

9. Suppose you have two functions, f and g, for which f(N) is less than g(N) for all values of N. Use the formal definition of big-O to prove that

$$15 f(N) + 6 g(N)$$

is $O(g(N))$.

10. Use the formal definition of big-O to prove that $N^2$ is $O(2^N)$.

11. Use mathematical induction to prove that the following properties hold for all positive values of N.

a) $1 + 3 + 5 + 7 + \cdots + (2N - 1) = N^2$

b) $1^2 + 2^2 + 3^2 + 4^2 + \cdots + N^2 = \frac{N \times (N + 1) \times (2N + 1)}{6}$

c) $1^3 + 2^3 + 3^3 + 4^3 + \cdots + N^3 = (1 + 2 + 3 + 4 + \cdots + N)^2$

d) $2^0 + 2^1 + 2^2 + 2^3 + \cdots + 2^N = 2^{N+1} - 1$

12. Exercise 1 shows that it is possible to compute $x^n$ in $O(\log N)$ time. This fact in turn makes it possible to write an implementation of the function fib(n) that also runs in $O(\log N)$ time, which is much faster than the traditional iterative version. To do so, you need to rely on the somewhat surprising fact that the Fibonacci function is closely related to a value called the golden ratio, which has been known since the days of Greek mathematics. The golden ratio, which is usually designated by the Greek letter $\varphi$, is defined to be the value that satisfies the equation

$$\varphi^2 - \varphi - 1 = 0$$

Because this is a quadratic equation, it actually has two roots.
If you apply the quadratic formula, you will discover that these roots are

$\varphi = \frac{1 + \sqrt{5}}{2}$

$\hat{\varphi} = \frac{1 - \sqrt{5}}{2}$

In 1718, the French mathematician Abraham de Moivre discovered that the nth Fibonacci number can be represented in closed form as

$\frac{\varphi^n - \hat{\varphi}^n}{\sqrt{5}}$

Moreover, because $\hat{\varphi}^n$ is always very small, the formula can be simplified to

$\frac{\varphi^n}{\sqrt{5}}$

rounded to the nearest integer. Use this formula and the raiseToPower function from exercise 1 to write an implementation of fib(n) that runs in O(log N) time. Once you have verified empirically that the formula seems to work for the first several terms in the sequence, use mathematical induction to prove that the formula

$\frac{\varphi^n - \hat{\varphi}^n}{\sqrt{5}}$

actually computes the nth Fibonacci number.

13. If you’re ready for a real algorithmic challenge, write the function

   int findMajorityElement(Vector<int> & vec);

that takes a vector of nonnegative integers and returns the majority element, which is defined to be a value that occurs in an absolute majority (at least 50 percent plus one) of the element positions. If no majority element exists, the function should return -1 to signal that fact. Your function must also meet the following conditions:

• It must run in O(N) time.
• It must use O(1) additional space. In other words, it may use individual temporary variables but may not allocate any additional array or vector storage. Moreover, this condition rules out recursive solutions, because the space required to store the stack frames would grow with the depth of the recursion.
• It may not change any of the values in the vector.

The hard part about this problem is coming up with the algorithm, not implementing it. Play with some sample vectors and see if you can come up with the right strategy.

14. If you enjoyed the previous problem, here’s an even more challenging one that used to be an interview question at Microsoft.
Suppose that you have a vector of N elements, in which each element has a value in the inclusive range 1 to $N - 1$. Given that there are N elements in the vector and only $N - 1$ possible values to store in each slot, there must be at least one value that is duplicated in the vector. There may, of course, be many duplicated values, but you know that there must be at least one by virtue of what mathematicians call the pigeonhole principle: if you have more items to put into a set of pigeonholes than the number of pigeonholes, there must be some pigeonhole that ends up with more than one item. Your task in this problem is to write a function

   int findDuplicate(Vector<int> vec);

that takes a vector whose values are constrained to be in the 1 to $N - 1$ range and returns one of the duplicated values. The hard part of this problem is to design an algorithm so that your implementation adheres to the same set of conditions as the solution to the preceding exercise:

• It must run in O(N) time.
• It must use O(1) additional space. In other words, it may use individual temporary variables but may not allocate any additional array or vector storage. Moreover, this condition rules out recursive solutions, because the space required to store the stack frames would grow with the depth of the recursion.
• It may not change any of the values in the vector.

It’s easy, for example, to write a quadratic-time solution to this problem, which looks like this:

   int findDuplicate(Vector<int> & vec) {
      for (int i = 0; i < vec.size(); i++) {
         for (int j = 0; j < i; j++) {
            if (vec[i] == vec[j]) return vec[i];
         }
      }
      error("Vector has no duplicates");
      return -1;
   }

The hard part is optimizing it so that it runs in linear time.

Chapter 11
Pointers and Arrays

Orlando ran her eyes through it and then, using the first finger of her right hand as pointer, read out the following facts as being most germane to the matter.
Virginia Woolf, Orlando, 1928

For the most part, the programs in this book have relied on abstract data types to represent compound objects. From a practical point of view, this strategy is clearly correct. When you write programs in an object-oriented language like C++, you should take as much advantage as you can of the abstract types provided by the libraries and stay as far away as possible from the complexity of the low-level details. At the same time, it is useful to know a bit more about how C++ represents data. Having that knowledge gives you a better sense of how those abstract types work and helps you understand why C++ behaves as it does.

At this point in the text, there is another compelling reason to learn how memory works in C++. In Chapter 5, you learned about the wonderful collection classes in C++ that make programming so much easier. In the chapters that follow, the primary goal is to understand how those structures can be implemented efficiently. Evaluating the efficiency of different possible algorithms requires you to know what costs are involved in the various options. Without a detailed understanding of the low-level structures that C++ uses to implement those algorithms (most notably pointers and arrays), those costs are impossible to evaluate.

11.1 The structure of memory

Before you can understand C++’s memory model in any detail, you need to know how information is stored inside a computer. Every modern computer contains some amount of high-speed internal memory that is its principal repository for information. In a typical machine, that memory is built out of special integrated-circuit chips called RAM, which stands for random-access memory. Random-access memory allows the program to use the contents of any memory cell at any time. The technical details of how the RAM chip operates are not important to most programmers. What is important is how the memory is organized.

Bits, bytes, and words

Inside the computer, all data values, no matter how complex, are stored as combinations of the fundamental unit of information, which is called a bit. Each bit can be in one of two possible states. If you think of the circuitry inside the machine as if it were a tiny light switch, you might label those states as off and on. If you think of each bit as a Boolean value, you might instead use the labels false and true. However, because the word bit comes originally from a contraction of binary digit, it is more common to label those states as 0 and 1, which are the two digits used in the binary number system on which computer arithmetic is based.

Since a single bit holds so little information, individual bits are not the most convenient mechanism for storing data. To make it easier to store such traditional types of information as numbers or characters, individual bits are collected together into larger units that are then treated as integral units of storage. The smallest such combined unit is called a byte, which consists of eight bits and is large enough to hold a value of type char. On most machines, bytes are assembled into larger structures called words, where a word is usually defined to be the size required to hold a value of type int. Today, most machines use words that are either four or eight bytes long (32 or 64 bits).

The amount of memory available to a particular computer varies over a wide range. Early machines supported memories whose size was measured in kilobytes (KB), the machines of the 1980s and ’90s had memory sizes measured in megabytes (MB), and today’s machines typically have memories measured in gigabytes (GB). In most sciences, the prefixes kilo, mega, and giga stand for one thousand, one million, and one billion, respectively. In the world of computers, however, those base-10 values do not fit well into the internal structure of the machine.
By tradition, therefore, these prefixes are taken to represent the power of two closest to their traditional interpretations. Thus, in programming, the prefixes kilo, mega, and giga have the following meanings:

   kilo (K) = $2^{10}$ = 1,024
   mega (M) = $2^{20}$ = 1,048,576
   giga (G) = $2^{30}$ = 1,073,741,824

A 64KB computer from the early 1970s would have had 64×1024 or 65,536 bytes of memory. Similarly, a modern 4GB machine would have 4×1,073,741,824 or 4,294,967,296 bytes of memory.

Binary and hexadecimal representations

Each of the bytes inside a machine holds data whose meaning depends on how the system interprets the individual bits. Depending on the hardware instructions that are used to manipulate it, a particular sequence of bits can represent an integer, a character, or a floating-point value, each of which requires some kind of encoding scheme. The easiest encoding scheme to describe is that for unsigned integers. The bits in an unsigned integer are represented using binary notation, in which the only legal values are 0 and 1, just as is true for the underlying bits. Binary notation is similar in structure to our more familiar decimal notation, but uses 2 rather than 10 as its base. The contribution that a binary digit makes to the entire number depends on its position within the number as a whole. The rightmost digit represents the units field, and each of the other positions counts for twice as much as the digit to its right. Consider, for example, the eight-bit byte containing the following binary digits:

   0 0 1 0 1 0 1 0

That sequence of bits represents the number forty-two, which you can verify by calculating the contribution for each of the individual bits, starting with the rightmost, as follows:

   0 ×   1 =  0
   1 ×   2 =  2
   0 ×   4 =  0
   1 ×   8 =  8
   0 ×  16 =  0
   1 ×  32 = 32
   0 ×  64 =  0
   0 × 128 =  0
             --
             42

This diagram illustrates how to map an integer into bits using binary notation, but also helps to demonstrate the fact that writing numbers in binary form is terribly inconvenient.
Binary numbers are cumbersome, mostly because they tend to be so long. Decimal representations are intuitive and familiar but make it harder to understand how the number translates into bits.

For applications in which it is useful to understand how a number translates into its binary representation without having to work with binary numbers that stretch all the way across the page, computer scientists tend to use hexadecimal (base 16) representation instead. In hexadecimal notation, there are sixteen digits, representing values from 0 to 15. The decimal digits 0 through 9 are perfectly adequate for the first ten digits, but classical arithmetic does not define the extra symbols you need to represent the remaining six digits. Computer science traditionally uses the letters A through F for this purpose, where the letters have the following values:

   A = 10
   B = 11
   C = 12
   D = 13
   E = 14
   F = 15

What makes hexadecimal notation so attractive is that you can instantly convert between hexadecimal values and the underlying binary representation. All you need to do is combine the bits into groups of four. For example, the number forty-two can be converted from binary to hexadecimal like this:

   0 0 1 0   1 0 1 0
      2         A

The first four bits represent the number 2, and the next four represent the number 10. Converting each of these to the corresponding hexadecimal digit gives 2A as the hexadecimal form. You can then verify that this number still has the value 42 by adding up the digit values, as follows:

   2 × 16 = 32
   A ×  1 = 10
            --
            42

For the most part, numeric representations in this book use decimal notation for readability. If the base is not clear from the context, the text follows the usual strategy of using a subscript to denote the base.
Thus, the three most common representations for the number forty-two (decimal, binary, and hexadecimal) look like this:

   $42_{10} = 00101010_2 = 2A_{16}$

The key point is that the number itself is always the same; the numeric base affects only the representation. Forty-two has a real-world interpretation that is independent of the base. That real-world interpretation is perhaps easiest to see in the representation an elementary school student might use, which is after all just another way of writing the number down:

   [Figure: forty-two line segments drawn as tally marks]

The number of line segments in this representation is forty-two. The fact that a number is written in binary, decimal, or any other base is a property of the representation and not of the number itself.

Representing other data types

In many ways, the fundamental idea behind modern computing is that any data value can be represented as a collection of bits. It is easy to see, for example, how to represent a Boolean value in a single bit. All you have to do is assign each of the possible states of a bit to one of the two Boolean values. Conventionally, 0 is interpreted as false, and 1 is interpreted as true. As the last section makes clear, you can store unsigned integers by interpreting a sequence of bits as a number in binary notation, so that the eight-bit sequence 00101010 represents the number 42. With eight bits, it is possible to represent numbers between 0 and $2^8 - 1$, or 255. Sixteen bits are sufficient to represent numbers between 0 and $2^{16} - 1$, or 65,535. Thirty-two bits allow for numbers up to $2^{32} - 1$, or 4,294,967,295.

The fact that each byte of memory can store a numeric value between 0 and 255 means that a byte is a perfect size to store an ASCII character. For historical reasons dating back to its predecessor languages, C++ defines the data type char to be exactly one byte in size.
This design decision makes it more difficult for C++ programs to work with expanded character sets needed to encode languages that don’t fit easily into the ASCII model. The C++ standard libraries define a type called wchar_t to represent “wide characters” that extend outside the ASCII range. Those facilities, however, are beyond the scope of this book.

Signed integers can also be stored as bit sequences by making a minor change to the encoding. Primarily because doing so simplifies the hardware design, most computers use a representation called two’s complement arithmetic to represent signed integers. If you want to express a nonnegative value in two’s complement arithmetic, you simply use its traditional binary expansion. To represent a negative value, you subtract its absolute value from $2^N$, where N is the number of bits used in the representation. For example, the two’s complement representation of -1 in a 32-bit word is calculated by performing the following binary subtraction:

     100000000000000000000000000000000
   -  00000000000000000000000000000001
     ---------------------------------
      11111111111111111111111111111111

Floating-point numbers are also represented as fixed-length bit sequences in C++. Although the details of floating-point representation are beyond the scope of this text, it isn’t too hard to imagine building hardware that would use some subset of the bits in a word to represent the digits in the floating-point value and some other subset to represent the exponent by which that value is scaled. The important thing to remember is simply that, internally, every data value is stored as bits.

In C++, different data types require different amounts of memory.
For the primitive types, the following values are typical, although the C++ standard gives compiler-writers some flexibility to choose different sizes that are more convenient for a particular type of hardware:

   char           1 byte (by definition)
   bool           1 byte
   short          2 bytes
   int            4 bytes
   float          4 bytes
   long           8 bytes
   double         8 bytes
   long double   16 bytes

In C++, the size of an object is usually just the sum of the sizes of the instance variables it contains. If, for example, you define the Point class as it appears in Chapter 6, its private section would contain the following instance variables:

   int x;
   int y;

Each of these instance variables typically requires four bytes, so the total space required to store the data for the object is eight bytes on most machines. Compilers, however, are allowed to add memory space to the underlying representation of an object, mostly because doing so sometimes allows them to generate more efficient machine language code. Thus, the size of a Point object must be at least the eight bytes necessary to hold the instance variables x and y, but it might be larger.

In a C++ program, you can determine how much memory will be assigned to a variable using the sizeof operator. The sizeof operator takes a single operand, which must be either a type name enclosed in parentheses or an expression. If the operand is a type, the sizeof operator returns the number of bytes required to store a value of that type; if the operand is an expression, sizeof returns the number of bytes required to store the value of that expression. For example, the expression

   sizeof(int)

returns the number of bytes required to store a value of type int. The expression

   sizeof x

returns the number of bytes required to store the variable x.

Memory addresses

Within the memory system of a typical computer, every byte is identified by a numeric address.
The first byte in the computer is numbered 0, the second is numbered 1, and so on, up to the number of bytes in the machine minus one. As an example, the memory addresses in a tiny 64KB computer would begin with a byte numbered 0 and end with a byte numbered 65,535. Those numbers, however, are represented as decimal values, which is not how most programmers think about addresses. Given that addresses are closely tied to the internal structure of the hardware, it is more common to think about addresses as beginning at address 0000 and ending with FFFF, using the hexadecimal notation introduced in the preceding section. It is important, however, to remember that addresses are simply numbers and that the base determines only how those numbers are written down.

Although it is possible to work with decimal addresses, this book uses hexadecimal notation for the following reasons:

• Address numbers are conventionally written in hexadecimal, and C++ debuggers and runtime environments tend to display addresses in this form.
• Writing address numbers in their hexadecimal form using a sans-serif font makes it easier to recognize that a particular number represents an address rather than some unidentified integer. In this text, if you see the number 65,536, you can assume that it represents an integer. If you see instead the number FFFF, you can be confident that number represents an address.
• Using hexadecimal makes it easier to see why particular limits are chosen. If you write it as a decimal value, the number 65,535 seems like a rather random value. If you express that same number in hexadecimal as FFFF, it becomes easier to recognize that this value is the largest value that can be represented in 16 bits.

Although addresses in memory are usually specified in terms of bytes, most computers support operations on larger units such as words.
In a typical machine, a word consists of four bytes, and it is therefore possible to group four bytes together to refer to an individual word. In that case, however, the addresses of consecutive words increase by four. The difference between byte and word addressing is illustrated in Figure 11-1.

FIGURE 11-1 Typical memory layout for C++ programs
[Figure: three columns. Under byte addressing, consecutive addresses run 0000, 0001, 0002, and so on up to FFFF. Under word addressing, consecutive addresses run 0000, 0004, 0008, and so on up to FFFC. The memory layout column shows program code and global variables at the lowest addresses, a middle region available for dynamic allocation as described in Chapter 12, and the highest addresses available for stack frames.]

The right side of Figure 11-1 provides a rough sketch as to how memory is organized in a typical C++ program. The instructions in the program (which are represented internally as bit patterns stored in memory words) and global variables tend to be stored in low-numbered addresses near the beginning of the address space. The amount of memory allocated in this region typically does not change as the program runs.

The highest addresses in memory are used for stack frames. Each time your program calls a function or a method, the computer creates a new stack frame in this memory region. When that function returns, the stack frame is discarded, leaving the memory free to be used for the stack frames of subsequent calls. The structure of these stack frames is described in more detail in the next section.

The region of memory between the end of the program data and the stack is available for use by the program in case it needs to acquire more memory as it runs. This technique is extremely important in the design and implementation of abstract data types and is described in Chapter 12.

Assigning memory to variables

When you declare a variable in a C++ program, the compiler must make sure that the variable is assigned enough memory to hold a value of that type. The source from which that memory is taken depends on how the variable is declared. Global variables (of which the only kind used in this book are constants) are typically allocated in the same region of memory as the program, which in most architectures today appears at relatively small addresses in memory. Thus, if the compiler sees the declaration

   const double PI = 3.14159;

it will reserve eight bytes of memory somewhere in the low-address region of memory and store the value 3.14159 in that variable. As a programmer, you have no idea what memory address the compiler will choose, but it often helps you to visualize what is happening inside the machine if you make up an address and use that in a diagram. Here, for example, you might imagine that the constant PI is stored in the address 0200, as shown in the following diagram:

   0200 | 3.14159 | PI

Most variables, however, are local variables. Local variables are allocated in the stack region at the high end of memory in a contiguous block of addresses called a stack frame. You’ve already seen stack frames beginning in Chapter 2, but those frames have been represented abstractly as boxes. Internally, these variables are assigned space in a block that is pushed onto the top of the stack at the time of each function call.

To make this discussion concrete, it makes sense to walk through the execution of the PowersOfTwo program from Chapter 1 to show exactly what happens on the stack. So that you don’t have to keep flipping back to the original figure, the code for the program (without the comments and prototype lines) is reprinted as Figure 11-2.

When you run the PowersOfTwo program, the first thing that happens is that the operating system generates a call to the main function.
The main function takes no arguments but does have two local variables, limit and i. The stack frame must therefore allocate space for these integer variables, as follows:

   FFF4 |      | limit
   FFF8 |      | i
        [gray rectangle]

FIGURE 11-2 Code for the PowersOfTwo program

   /*
    * File: PowersOfTwo.cpp
    * ---------------------
    * This figure contains only the function definitions from the
    * PowersOfTwo program from Chapter 1.
    */

   int main() {
      int limit;
      cout << "This program lists powers of two." << endl;
      cout << "Enter exponent limit: ";
      cin >> limit;
      for (int i = 0; i <= limit; i++) {
         cout << "2 to the " << i << " = " << raiseToPower(2, i) << endl;
      }
      return 0;
   }

   int raiseToPower(int n, int k) {
      int result = 1;
      for (int i = 0; i < k; i++) {
         result *= n;
      }
      return result;
   }

In this diagram, the variable limit has been assigned to address FFF4, and the variable i appears at FFF8. These addresses are in some sense arbitrary in that there is no way to predict exactly what addresses the compiler will assign or whether limit will come before i or the other way around. What you can count on is that both of these variables will be allocated in the region assigned to that stack frame.

The gray rectangle at the bottom of the diagram indicates that the computer will need to keep track of additional information about each function call beyond the values of the local variables. If nothing else, each stack frame needs to keep track of the location in the program to which it should return. The format of that information depends on the architecture of the machine and is not at all essential to understanding the data model. The stack diagrams in this book include a gray rectangle in each stack frame both to remind you that this extra information exists and to make it easier to see the extent of each frame visually.

When the program runs, each function has access to its own stack frame and updates the values of the local variables as they change.
Assuming that the user has entered 8 as the value of limit, the situation immediately before the first call to raiseToPower looks like this:

   FFF4 | 8 | limit
   FFF8 | 0 | i
        [gray rectangle]

Calling raiseToPower(2, i) creates a new stack frame on top of the existing one. The frame contains entries for the parameter variables n and k, as well as the local variables result and i. The parameter variables are initialized to the values of the arguments, which means that the stack now looks like this:

   FFE0 |   | result
   FFE4 |   | i
   FFE8 | 0 | k
   FFEC | 2 | n
        [gray rectangle]
   FFF4 | 8 | limit
   FFF8 | 0 | i
        [gray rectangle]

When raiseToPower returns, its stack frame is discarded, restoring the state before the call.

This example is quite simple and includes none of the complexity involved in creating more accurate memory diagrams. This example does, however, give you enough of a sense of how variables are assigned to memory to cover the topic of pointers, which are introduced in the following section. In Chapter 12, you’ll have a chance to get back to memory diagrams and learn more about strategies for memory allocation.

11.2 Pointers

One of the principles behind the design of C++ is that programmers should have as much access as possible to the facilities provided by the underlying hardware. For this reason, C++ makes the fact that memory locations have addresses visible to the programmer. A data item whose value is an address in memory is called a pointer. In many high-level programming languages, pointers are used sparingly because those languages provide other mechanisms that eliminate much of the need for pointers. The Java programming language, for example, hides pointers from the programmer altogether. In C++, pointers are pervasive, and it is impossible to understand most professional C++ programs without knowing how pointers work.

In C++, pointers serve several purposes, of which the following are the most important:

• Pointers allow you to refer to a large data structure in a compact way.
Data structures in a program can become arbitrarily large. No matter how large they grow, however, the data structures still reside somewhere in the computer’s memory and therefore have an address. Pointers allow you to use the address as a shorthand for the complete value. Because a memory address typically fits in four bytes of memory, this strategy offers considerable space savings when the data structures themselves are large.
• Pointers make it possible to reserve new memory during program execution. Up to now, the only memory you could use in your programs was the memory assigned to variables that you have declared explicitly. In many applications, it is convenient to acquire new memory as the program runs and to refer to that memory using pointers. This strategy is discussed in the section on “Dynamic allocation” in Chapter 12.
• Pointers can be used to record relationships among data items. In advanced programming applications, pointers are used extensively to model connections between individual data values. For example, programmers often indicate that one data item follows another in a conceptual sequence by including a pointer to the second item in the internal representation of the first. Data structures that use pointers to create connections between individual components are called linked structures. Linked structures play a critical role in implementing many of the abstract data types you will encounter later in this book.

Using addresses as data values

In C++, any expression that refers to an internal memory location capable of storing data is called an lvalue (pronounced “ell-value”). The l at the beginning of lvalue comes from the observation that lvalues can appear on the left side of an assignment statement in C++. For example, simple variables are lvalues because you can write a statement like

   x = 1.0;

Many values in C++, however, are not lvalues. For example, constants are not lvalues because a constant cannot be changed.
Similarly, although the result of an arithmetic expression is a value, it is not an lvalue, because you cannot assign a new value to the result of an arithmetic expression.

The following properties apply to lvalues in C++:

• Every lvalue is stored somewhere in memory and therefore has an address.
• Once it has been declared, the address of an lvalue never changes, even though the contents of those memory locations may change.
• The address of an lvalue is a pointer value, which can be stored in memory and manipulated as data.

Declaring pointer variables

As with all other variables in C++, you must declare pointer variables before you use them. To declare a variable as a pointer, all you need to do is add an asterisk (*) before the variable name in the declaration. For example, the line

   int *p;

declares p to be of the conceptual type pointer-to-int. Similarly, the line

   char *cptr;

declares cptr to be of type pointer-to-char. These two types, pointer-to-int and pointer-to-char, are distinct in C++, even though each of them is represented internally as an address. To use the value at that address, the compiler needs to know how to interpret it and therefore requires that its type be specified explicitly. The value at the address specified by a pointer is called its target. The type of that target value is called the base type for the pointer. Thus, the type pointer-to-int has int as its base type.

It is important to note that the asterisk symbol used to indicate that a variable is a pointer belongs syntactically with the variable name and not with the base type. If you use the same declaration to declare two pointers of the same type, you need to mark each of the variables with an asterisk, as in

   int *p1, *p2;

The declaration

   int *p1, p2;

declares p1 as a pointer to an integer, but declares p2 as an integer variable.

The fundamental pointer operations

C++ defines two operators that allow you to move back and forth between a pointer and its target value:

   &    Address-of
   *    Value-pointed-to

The & operator takes an lvalue as its operand and returns the memory address in which that lvalue is stored. The * operator takes a value of any pointer type and returns the lvalue to which it points. This operation is called dereferencing the pointer. The * operation produces an lvalue, which means that you can assign a value to a dereferenced pointer.

The easiest way to illustrate these operators is by example. Consider the declarations

   int x, y;
   int *p1, *p2;

These declarations allocate memory for four words, two of type int and two of type pointer-to-int. For concreteness, let’s suppose that these values are stored on the stack at the machine addresses indicated by the following diagram:

   FF00 |      | x
   FF04 |      | y
   FF08 |      | p1
   FF0C |      | p2

Given these declarations, you can assign values to x and y just as you always have. For example, executing the assignment statements

   x = 42;
   y = 163;

results in the following memory state:

   FF00 | 42   | x
   FF04 | 163  | y
   FF08 |      | p1
   FF0C |      | p2

To initialize the pointer variables p1 and p2, you need to assign values that represent the addresses of some integer objects. In C++, the operator that produces addresses is the & operator. You can use assignment and the & operator to make p1 point to y and p2 point to x like this:

   p1 = &y;
   p2 = &x;

These assignments leave memory in the following state:

   FF00 | 42   | x
   FF04 | 163  | y
   FF08 | FF04 | p1
   FF0C | FF00 | p2

The variable p1 contains the value FF04, which is the address of the variable y. Similarly, p2 contains the value FF00, which is the address of the variable x.

Using explicit addresses to represent pointers emphasizes the fact that addresses are stored internally as numbers. It does not, however, give you an intuitive sense of what pointers mean. To accomplish that goal, it is better to use arrows to indicate the target of each pointer.
If you eliminate the address values entirely and use arrows to represent the pointers, the diagram looks like this:

    x     42
    y     163
    p1    (arrow to y)
    p2    (arrow to x)

Using arrows in the diagram makes it clear that the variables p1 and p2 point to the cells indicated by the arrowheads. Arrows make it easier to understand how pointers work and therefore appear in most of the memory diagrams in this text. At the same time, it is important to remember that pointers are simply numeric addresses and that there are no arrows inside the machine.

To move from a pointer to the value it points to, you use the * operator. For example, the expression

    *p1

indicates the value in the memory location to which p1 points. Moreover, since p1 is declared as a pointer to an integer, the compiler knows that the expression *p1 must refer to an integer. Thus, given the configuration of memory illustrated in the diagram, *p1 turns out to be another name for the variable y.

Like the simple variable name y, the expression *p1 is an lvalue, and you can assign new values to it. Executing the assignment statement

    *p1 = 17;

changes the value in the variable y because that is the target of the pointer p1. After you make this assignment, the memory configuration is

    x     42
    y     17
    p1    (arrow to y)
    p2    (arrow to x)

You can see that the value of p1 itself is unaffected by this assignment. It continues to point to the variable y.

It is also possible to assign new values to the pointer variables themselves. For instance, the statement

    p1 = p2;

tells the computer to take the value contained in the variable p2 and copy it into the variable p1. The value contained in p2 is the pointer value FF00. If you copy this value into p1, both p1 and p2 contain the same pointer, as illustrated by the following diagram:

    FF00    42      x
    FF04    17      y
    FF08    FF00    p1
    FF0C    FF00    p2

As long as you remember the fact that a pointer has an underlying representation as an integer, the idea of copying a pointer is not at all mysterious.
The value of the pointer is simply copied unchanged to the destination. If you draw your memory diagrams using arrows, you have to keep in mind that copying a pointer replaces the destination pointer with a new arrow that points to the same location as the old one. Thus, the effect of the assignment

    p1 = p2;

is to change the arrow leading from p1 so that it points to the same location as the arrow originating at p2, like this:

    x     42
    y     17
    p1    (arrow to x)
    p2    (arrow to x)

It is important to distinguish the assignment of a pointer from that of a value. Pointer assignment, such as

    p1 = p2;

makes p1 and p2 point to the same location. By contrast, value assignment, which is represented by the statement

    *p1 = *p2;

copies the value from the memory location addressed by p2 into the location addressed by p1.

Pointers to structures and objects

The examples in the preceding sections declare pointers only to the primitive types. In C++, it is much more common to use pointers in conjunction with structures or objects. For example, the declarations

    Point pt(3, 4);
    Point *pp = &pt;

declare two local variables. The variable pt contains a Point object with the coordinate values 3 and 4. The variable pp contains a pointer to that same Point object. Using the pointer-based format, the memory diagram that results from these declarations looks like this:

    pt    3, 4
    pp    (arrow to pt)

From the pointer pp, you can move to the object using the * operator, so that *pp and pt are effectively synonyms. You do, however, need to exercise a bit of caution if you try to refer to the fields and methods of a compound object given a pointer to that object. You cannot, for example, obtain the x coordinate of the point by writing

    *pp.getX()

Although the code looks right, there is a problem with the precedence of the operators in this expression.
In C++, the dot operator has higher precedence than the star operator, which means that the compiler tries to interpret this expression as

    *(pp.getX())

which is meaningless. What you want to do is dereference the pointer first and then invoke the method, which means that the statement needs to be parenthesized as follows:

    (*pp).getX()

This expression has the desired effect but is too cumbersome for everyday use. As you write more sophisticated applications, you’ll find yourself using pointers to objects all the time. Forcing programmers to include these parentheses in every selection operation would make pointers to objects considerably less convenient. To eliminate some of this inconvenience, C++ defines the operator -> (usually read aloud as arrow), which combines the operations of dereference and selection into a single operator. The conventional way to invoke the getX method given the pointer variable pp is therefore

    pp->getX()

The keyword this

When you are writing the implementation of a class, C++ defines the keyword this as a pointer to the current object. That definition has several important applications that you will discover in examples throughout the rest of this book. Of these, one of the most common is that you can use the keyword this to select the instance variables of an object even if those names are shadowed by a parameter or local variable.

The problem of shadowing was introduced in Chapter 6 in the context of the constructor for the Point class, which at the time looked like this:

    Point(int cx, int cy) {
        x = cx;
        y = cy;
    }

The parameters in this constructor had to be named something other than x and y to avoid name conflicts with the instance variables. Clients, however, are likely to find the new names at least a bit confusing. From the client’s point of view, x and y are precisely the right names for the constructor parameters.
From the perspective of the implementer, however, x and y are the perfect names for the instance variables. Using the keyword this to select the instance variables makes it possible to satisfy the client and the implementer at the same time, as follows:

    Point(int x, int y) {
        this->x = x;
        this->y = y;
    }

Some programmers argue that using this with every reference to a member of the current object makes the code easier to read. The JavaScript language goes so far as to require the keyword this for all such references. This text follows the common C++ convention of using this only when doing so helps to resolve an ambiguity.

The special pointer NULL

In many pointer applications, it is useful to have a special pointer value that indicates that the pointer does not in fact refer to any valid memory address, at least for the present. That special value is called the null pointer and is represented internally as the value 0. In C++, the best way to indicate the null pointer is to use the constant NULL, which is defined in the <cstddef> interface.

It is illegal to use the * operator on a null pointer. The popular programming environments in use today typically detect that error and terminate the program, but that response is not guaranteed. On some machines, trying to read the target value of a null pointer simply gives you back the contents of address 0000 in the machine. The situation is much the same in the case of uninitialized pointers. If you declare a pointer but fail to initialize it, the computer will try to interpret the contents of that pointer as an address and try to read that region of memory. In such cases, programs can fail in ways that are extremely difficult to detect.

The uses of the null pointer will be introduced in this text as they become relevant to a particular application. For now, the important thing to remember is that this constant exists.
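The -> operator, the keyword this, and the NULL constant can be seen working together in a short sketch. The Point class below is a stripped-down stand-in written for this illustration, not the Point class developed in Chapter 6, and the helper functions getXOr and getXChecked are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>   // defines NULL

// Simplified stand-in for a Point class (an assumption of this sketch).
class Point {
public:
    Point(int x, int y) {
        this->x = x;   // this->x is the instance variable; x is the parameter
        this->y = y;
    }
    int getX() const { return x; }
    int getY() const { return y; }

private:
    int x;
    int y;
};

// Returns the x coordinate of *pp, or fallback if pp is the null pointer.
int getXOr(const Point *pp, int fallback) {
    if (pp == NULL) return fallback;   // never dereference a null pointer
    return pp->getX();                 // same as (*pp).getX()
}

// Variant that treats a null pointer as a programming error and stops
// the program (in builds where assert is enabled) instead of dereferencing it.
int getXChecked(const Point *pp) {
    assert(pp != NULL);
    return pp->getX();
}
```

Given the declarations Point pt(3, 4); and Point *pp = &pt; the calls getXOr(pp, -1) and getXChecked(pp) both return 3, while getXOr(NULL, -1) returns the fallback value -1.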
Pointers and call by reference

As an illustration of their use in practice, pointers are used internally by C++ to implement call by reference. When a parameter is passed by reference, the stack frame stores a pointer to the location in the caller at which that value resides. Any changes to that value are made to the target of the pointer, which means that those changes remain in effect after the function returns.

The program in Figure 11-3 offers a simple illustration of how C++ implements call by reference. The program reads two integers from the user and then checks to see that they are in ascending order. If not, the program calls the function swap, which exchanges the values of its arguments.

Suppose that you run this program and enter the values 20 and 10, as follows:

    SwapIntegers
    Enter n1: 20
    Enter n2: 10

Those values are out of order, so the main program will call swap. The contents of the stack immediately before that call look like this:

    FFF4    20      n1
    FFF8    10      n2

The function swap takes its parameters by reference, which means that the stack frame for swap is given the addresses of the calling arguments rather than the values. Immediately after the call, the contents of the stack look like this:

    FFE4            tmp
    FFE8    FFF4    &x
    FFEC    FFF8    &y
    FFF4    20      n1
    FFF8    10      n2

FIGURE 11-3 Program to ensure two integers are in sequence

    /*
     * File: SwapIntegers.cpp
     * ----------------------
     * This program illustrates the use of call by reference to exchange
     * the values of two integers.
     */

    #include <iostream>
    #include "simpio.h"
    using namespace std;

    /* Function prototype */

    void swap(int & x, int & y);

    /* Main program */

    int main() {
        int n1 = getInteger("Enter n1: ");
        int n2 = getInteger("Enter n2: ");
        if (n1 > n2) swap(n1, n2);
        cout << "The range is " << n1 << " to " << n2 << "." << endl;
        return 0;
    }

    /*
     * Function: swap
     * Usage: swap(x, y);
     * ------------------
     * Exchanges the values of x and y.  The arguments are passed by
     * reference and can therefore be modified.
     */

    void swap(int & x, int & y) {
        int tmp = x;
        x = y;
        y = tmp;
    }

All references to x and y inside swap are passed along to the variables n1 and n2, which are the targets of the pointers. Exchanging those values means that the effect of this function persists after the call to swap returns, when the stack looks like this:

    FFF4    10      n1
    FFF8    20      n2

The program continues with the updated values, leading to the following output:

    SwapIntegers
    Enter n1: 20
    Enter n2: 10
    The range is 10 to 20.

Although call by reference is extremely convenient, it is not an essential feature of the C++ language. You can simulate the effect of call by reference by making the pointers explicit. In this program, all you would have to do is change the implementation of swap to

    void swap(int *px, int *py) {
        int tmp = *px;
        *px = *py;
        *py = tmp;
    }

and the call in the main program to

    swap(&n1, &n2);

In the exercises, you’ll have a chance to practice converting programs that use call by reference into their pointer-based equivalents.

11.3 Arrays

When the Vector class first appeared in Chapter 5, the introduction to that section described vectors in terms of arrays, noting that you were likely to have some idea about arrays from your previous programming experience. C++ offers a built-in array type, which is based on the language model that C++ inherits from C. Given that the Vector collection class is uniformly more flexible and convenient, there are few reasons to use arrays in new code, although you will certainly encounter arrays in existing applications.

In C++, an array is a low-level collection of individual data values with two distinguishing characteristics:

1. An array is ordered. You must be able to count off the individual components of an array in order: here is the first, here is the second, and so on.
2. An array is homogeneous. Every value stored in an array must be of the same type.
   Thus, you can define an array of integers or an array of floating-point numbers but not an array in which the two types are mixed.

Given, however, that you are already familiar with the Vector class, it’s probably easiest to think of arrays as a primitive implementation of the vector idea. As with vectors, arrays are composed of individual elements of some base type selected by an integer index. The pictures one draws to represent a vector almost certainly work for arrays as well. There are a few small differences in syntax, but those are easily mastered. The real difference is that arrays have the following limitations that make them less useful in practice than the more powerful vector type:

• Arrays are allocated with a fixed size that you can’t subsequently change.
• Even though arrays have a fixed size, C++ does not make that size available to the programmer. As a result, programs that work with arrays typically need an additional variable to keep track of the number of elements.
• Arrays offer no support for inserting and deleting elements.
• C++ performs no bounds-checking to ensure that the elements you select are actually present in the array.

Despite these clear disadvantages, arrays are the framework from which the more powerful collection classes are built. To understand the implementation of those classes, you need to have some familiarity with the mechanics of arrays.

Array declaration

Like any other variable in C++, an array must be declared before it is used. The general form for an array declaration is

    type name[size];

where type is the type of each element in the array, name is the name of the array variable, and size is an integer value indicating the number of elements allocated to the array. For example, the declaration

    int intArray[10];

declares an array named intArray with 10 elements, each of which is of type int.
In most cases, however, you should specify the size as a symbolic constant rather than an explicit integer so that the array size is easier to change. Thus, a more conventional declaration would look like this:

    const int N_ELEMENTS = 10;
    int intArray[N_ELEMENTS];

You can represent this declaration pictorially as follows:

    intArray
    index:    0    1    2    3    4    5    6    7    8    9

As with the elements in a vector, the index numbers for an array always begin with 0 and run up to the array size minus one. Thus, in an array with 10 elements, the index numbers are 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9.

Array selection

To refer to a specific element within an array, you specify both the array name and the index corresponding to the position of that element within the array. The process of identifying a particular element within an array is called selection, and is indicated in C++ by writing the name of the array and following it with the index written in square brackets, just as if you were selecting an element from a vector.

The result of a selection expression is an lvalue, which means that you can assign new values to it. For example, if you execute the for loop

    for (int i = 0; i < N_ELEMENTS; i++) {
        intArray[i] = 10 * i;
    }

the variable intArray will be initialized as follows:

    intArray
    value:    0   10   20   30   40   50   60   70   80   90
    index:    0    1    2    3    4    5    6    7    8    9

When you select an element from an array, C++ performs no bounds checking. If the index is out of range, C++ simply figures out where that element would be in memory and uses that value, leading to unpredictable results. Worse still, if you assign a value to that index position, you can overwrite the contents of memory used by some other part of the program. Writing beyond the end of an array is one of the primary vulnerabilities used by hackers to attack computer systems.

Static initialization of arrays

Array variables can be given initial values at the time they are declared.
In this case, the equal sign specifying the initial value is followed by a list of initializers enclosed in curly braces. For example, the declaration

    const int DIGITS[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

declares a constant array called DIGITS in which each of the 10 elements is initialized to its own index number. As you can see from this example, specifying explicit initializers allows you to omit the array size from the declaration, which is then taken from the number of values.

In the DIGITS example, you know that there are 10 digits in the list. In many cases, however, the program will need to determine the number of elements in a statically initialized array in order to free the programmer from having to count the number of elements each time the program is changed. As an example, imagine you’re writing a program that requires an array containing the names of all U.S. cities with populations of over 1,000,000. Taking data from the 2010 census, you could declare and initialize BIG_CITIES as a constant global array using the following declaration:

    const string BIG_CITIES[] = {
        "New York",
        "Los Angeles",
        "Chicago",
        "Houston",
        "Philadelphia",
        "Phoenix",
        "San Antonio",
        "San Diego",
        "Dallas",
    };

This list, however, is not static over time. Between the census results of 1990 and 2000, Detroit dropped off this list while Phoenix and San Antonio joined it. When the results are in from the 2020 census, it is likely that San Jose will join the list of cities with more than 1,000,000 people. If you are responsible for maintaining the program that contains this code, the last thing you want to do is to have to count how many cities there are in the list just so the program can determine how many elements there are. What you would like to do instead is update the list of cities and have the compiler figure out how many there are.
Fortunately, C++ offers a standard idiom for determining the allocated size of an array that uses static initialization to set the number of elements. Given a statically initialized array named MY_ARRAY, the number of elements in MY_ARRAY can be computed using the idiomatic pattern

    sizeof MY_ARRAY / sizeof MY_ARRAY[0]

This expression takes the size of the entire array and divides it by the size of the initial element in the array. Because all elements of an array are the same size, the result is the number of elements in the array, regardless of the element type. Thus you could initialize a variable N_BIG_CITIES to hold the number of cities in the BIG_CITIES array by writing

    const int N_BIG_CITIES = sizeof BIG_CITIES / sizeof BIG_CITIES[0];

Effective and allocated sizes

Although the sizeof technique allows you to determine the size of a statically allocated array, there are many applications in which you have no way of knowing how large an array should be when you write the code because the actual number of elements depends on the user’s data. One strategy for solving the problem of choosing an appropriate array size is to declare an array that you know is larger than you need and then use only part of it. Thus, instead of declaring the array so that it holds the actual number of elements, you define a constant indicating the maximum number of elements and use that constant in the declaration of the array. On any given use of the program, the actual number of elements is less than or equal to this bound. When you use this strategy, you need to maintain a separate integer variable that keeps track of the number of values that are actually in use. The size of the array specified in the declaration is called the allocated size; the number of elements actively in use is called the effective size.
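The two ideas above, recovering the allocated size with sizeof and tracking the effective size in a separate variable, might be combined as in the following sketch. The constant MAX_READINGS and the functions appendValue and average are hypothetical names chosen for this illustration.

```cpp
#include <cassert>

const int MAX_READINGS = 100;   // hypothetical upper bound for this sketch

// Stores value in the first unused slot of data if there is room.
// allocated is the allocated size; nUsed is the effective size, which
// is updated through the reference parameter. Returns false if full.
bool appendValue(double data[], int allocated, int &nUsed, double value) {
    assert(nUsed >= 0 && nUsed <= allocated);
    if (nUsed == allocated) return false;
    data[nUsed++] = value;
    return true;
}

// Averages only the elements in use, ignoring the unused tail of the array.
double average(const double data[], int nUsed) {
    assert(nUsed > 0);
    double sum = 0;
    for (int i = 0; i < nUsed; i++) {
        sum += data[i];
    }
    return sum / nUsed;
}
```

A caller would declare double readings[MAX_READINGS]; together with int nUsed = 0; and could compute the allocated size as sizeof readings / sizeof readings[0]. That computation works only where the array declaration itself is visible; inside appendValue the parameter data is just a pointer, which is why the allocated size has to be passed in separately.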
Suppose, for example, that you want to define an array that holds the scores for competitors in a sport like gymnastics, where judges rate each entrant on a numeric scale, which might run (as Olympic gymnastics did until 2005) from 0.0 to 10.0. If you wanted your program to allow for a maximum of 100 judges even though the actual number is usually smaller, you might declare the array like this:

    const int MAX_JUDGES = 100;
    double scores[MAX_JUDGES];

To keep track of the effective size, you need to declare an additional variable, which you might call nJudges, and make sure that it keeps track of how many judges there actually are.

The relationship between pointers and arrays

In C++, the name of an array is synonymous with a pointer to its initial element. This identity is most easily illustrated by example. The declaration

    int list[5];

allocates space for an array of five integers, which is assigned storage in the current stack frame, as illustrated in the following diagram:

    FF60            list[0]
    FF64            list[1]
    FF68            list[2]
    FF6C            list[3]
    FF70            list[4]

The name list represents an array but can also be used as a pointer value. When it is used as a pointer, list is defined to be the address of the initial element in the array. Thus, if the compiler encounters the variable name list on its own, without any subscript after it, it translates the array name into the pointer value at which the array begins in memory.

One of the most important implications of the way that C++ treats arrays as pointers is that array parameters appear to be shared with the calling argument even though no explicit call by reference is involved.
You can, for example, implement an array-based version of the selection sort algorithm as follows:

    void sort(int array[], int n) {
        for (int lh = 0; lh < n; lh++) {
            int rh = lh;
            for (int i = lh + 1; i < n; i++) {
                if (array[i] < array[rh]) rh = i;
            }
            swap(array[lh], array[rh]);
        }
    }

This function correctly sorts the array passed by the caller because the function initializes the array parameter by copying the address of the calling argument. The function then selects elements from the array using that address, which means that the elements are the ones in the calling argument.

The sort function would work exactly the same way if you had written its prototype like this:

    void sort(int *array, int n)

In this case, the first argument is declared as a pointer, but the effect is the same as in the original implementation, which declared this parameter as an array. In either case, the value stored in the stack frame under the name array is the address of the initial element of the calling argument. Inside the machine, the declarations are equivalent. No matter which of these forms you use in the declaration, you can apply the same operations to the variable array.

As a general rule, you should declare parameters in the way that reflects their use. If you intend to use a parameter as an array and select elements from it, you should declare that parameter as an array. If you intend to use the parameter as a pointer and dereference it, you should declare it as a pointer.

The crucial difference between arrays and pointers in C++ comes into play when variables are originally declared, not when those values are passed as parameters. The fundamental distinction between the declaration

    int array[5];

and the declaration

    int *p;

is one of memory allocation. The first declaration reserves five consecutive words of memory capable of holding the array elements. The second declaration reserves only a single word, which is large enough to hold a machine address.
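Both claims, that the two parameter forms are equivalent and that the two variable declarations reserve different amounts of storage, can be checked with a short sketch. It assumes a C++11 or later compiler (for static_assert), renames the function to selectionSort to avoid any clash with library names, and uses std::swap from <utility> in place of the swap function written earlier; the function declarationStorageDemo is a hypothetical helper.

```cpp
#include <cassert>
#include <utility>   // std::swap

// These two prototypes declare the same function, not two overloads,
// because an array parameter is treated as a pointer to its first element.
void selectionSort(int array[], int n);
void selectionSort(int *array, int n);

// Sorts the caller's array in place using selection sort.
void selectionSort(int array[], int n) {
    assert(n >= 0);
    for (int lh = 0; lh < n; lh++) {
        int rh = lh;
        for (int i = lh + 1; i < n; i++) {
            if (array[i] < array[rh]) rh = i;
        }
        std::swap(array[lh], array[rh]);
    }
}

// Contrasts the storage reserved by an array declaration with the
// storage reserved by a pointer declaration.
bool declarationStorageDemo() {
    int array[5] = {0, 0, 0, 0, 0};
    int *p = array;   // the array name is converted to the address of array[0]
    static_assert(sizeof array == 5 * sizeof(int),
                  "the array declaration reserves room for five ints");
    static_assert(sizeof p == sizeof(int *),
                  "the pointer declaration reserves room for one address");
    return p == &array[0];
}
```

If a caller declares int data[] = {56, 25, 37, 58, 95, 19, 73, 30}; and calls selectionSort(data, 8), the elements of data itself end up in ascending order, even though the call involves no reference parameter.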
The implication of this distinction is important for you to keep in mind. If you declare an array, you have storage to work with; if you declare a pointer variable, that variable is not associated with any storage until you initialize it.

The simplest way to initialize a pointer to an array is to copy the base address of an existing array into the pointer variable. As an example, if you were to write the statement

    p = array;

after making the preceding declarations, the pointer variable p would point to the same address used for array, and you could use the two names interchangeably.

The technique of setting a pointer to the address of an existing array is rather limited. After all, if you already have an array name, you might as well use it. Assigning that name to a pointer does not really do you any good. The real advantage of using a pointer as an array comes from the fact that you can initialize that pointer to new memory that has not previously been allocated, which allows you to create new arrays as the program runs. This technique is described later in Chapter 12.

Pointer arithmetic

In C++, you can apply the operators + and - to pointers. The results are similar to the familiar arithmetic operations in certain respects but different in others. The process of applying these operators to pointer values is called pointer arithmetic.

Pointer arithmetic is defined by a simple rule. If p is a pointer to the initial element in an array array, and k is an integer, the following identity always holds:

    p + k    is defined to be    &array[k]

In other words, if you add an integer k to a pointer value, the result is the address of the array element at index k for an array beginning at the original pointer address.

To illustrate how this rule applies, suppose that a function contains the following declarations:

    double list[3];
    double *p;

Each of these variables is given space in the frame for this function.
For the array variable list, the compiler allocates space for the three elements in the array, each of which is large enough to hold a double. For p, the compiler allocates enough space for a pointer, which will be used to hold the address of some lvalue of type double. If the stack frame begins at location FFA0, the memory allocation looks like this:

    FFA0            list[0]
    FFA8            list[1]
    FFB0            list[2]
    FFB8            p

Since no values have been assigned to any of these memory locations, their initial contents are undefined. Suppose that you use the following assignment statements to store values in each of the array elements:

    list[0] = 1.0;
    list[1] = 1.1;
    list[2] = 1.2;

and initialize the pointer variable p to the beginning of the array by executing the assignment statement

    p = list;

or, equivalently,

    p = &list[0];

After these assignments, the stack frame holds the following values:

    FFA0    1.0     list[0]
    FFA8    1.1     list[1]
    FFB0    1.2     list[2]
    FFB8    FFA0    p

In this diagram, p now points to the initial address in the array list. If you add an integer k to the pointer p, the result is the address corresponding to the array element at index position k. For example, if a program contained the expression

    p + 2

the result of evaluating this expression would be a new pointer value that references list[2]. Thus, in the preceding diagram, in which p points to address FFA0, p + 2 points to the address of the element that appears two elements later in the array, which is at address FFB0. It’s important to note that pointer addition is not equivalent to traditional addition because the calculation must take into account the size of the base type. In this example, for each unit that is added to a pointer value, the internal numeric value must be increased by eight to take account of the fact that a double requires eight bytes.

The C++ compiler interprets subtraction of an integer from a pointer in a similar way.
The expression

    p - k

in which p is a pointer and k is an integer, computes the address of an array element located k elements before the address currently indicated by p. Thus, if you had set p to the address of list[1] using

    p = &list[1];

the addresses corresponding to p - 1 and p + 1 would be the addresses of list[0] and list[2], respectively.

The arithmetic operations *, /, and % make no sense for pointers and cannot be used with pointer operands. Moreover, the uses of + and - with pointers are limited. In C++, you can add an integer to a pointer or subtract an integer from a pointer, but you cannot, for example, add two pointers together. The only other arithmetic operation defined for pointers is subtracting one pointer from another. The expression

    p1 - p2

is defined to return the number of array elements between the current values of p1 and p2. For example, if p1 points at list[2] and p2 points at list[0], the expression

    p1 - p2

has the value 2, since there are two elements between the current pointer values.

Incrementing and decrementing pointers

Knowing the rules for pointer arithmetic makes it possible to understand one of the most common idiomatic constructions in C++, which is the expression

    *p++

In this expression, the * operator and the ++ operator compete for the operand p. Because unary operators in C++ are evaluated in right-to-left order, the ++ takes precedence over the *, so the compiler interprets this expression as if it had been written like this:

    *(p++)

As you learned in Chapter 1, the postfix ++ operator increments the value of p and then returns the value that p had prior to the increment operation. Since p is a pointer, the increment operation uses pointer arithmetic. Thus, adding 1 to the value of p creates a pointer to the next element in the array. If p originally pointed to arr[0], for example, the increment operation would cause it to point to arr[1].
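The rules described so far can be confirmed with a small sketch that reuses the list and arr examples from the text. The function name pointerArithmeticDemo is hypothetical, and the byte-level check uses sizeof(double) rather than assuming that a double occupies eight bytes on every machine.

```cpp
#include <cassert>
#include <cstddef>   // std::ptrdiff_t

// Exercises p + k, p - k, p1 - p2, and *p++ on small local arrays.
// Returns the element distance between &list[2] and &list[0].
std::ptrdiff_t pointerArithmeticDemo() {
    double list[3] = {1.0, 1.1, 1.2};
    double *p = list;                  // same as p = &list[0]
    assert(p + 2 == &list[2]);         // p + k is &list[k]
    assert(*(p + 2) == list[2]);

    // Adding 1 to a double pointer advances the raw address by sizeof(double).
    char *rawStart = reinterpret_cast<char *>(p);
    char *rawNext = reinterpret_cast<char *>(p + 1);
    assert(rawNext - rawStart == static_cast<std::ptrdiff_t>(sizeof(double)));

    p = &list[1];
    assert(p - 1 == &list[0]);         // one element before p
    assert(p + 1 == &list[2]);         // one element after p

    int arr[3] = {10, 20, 30};
    int *q = arr;
    int first = *q++;                  // parsed as *(q++): reads arr[0], then advances q
    assert(first == 10 && q == &arr[1]);

    double *p1 = &list[2];
    double *p2 = &list[0];
    return p1 - p2;                    // number of elements between the two pointers
}
```

Calling pointerArithmeticDemo() returns 2, matching the p1 - p2 example above.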
Thus, the expression *p++ has the following meaning in English:

    Dereference the pointer p and return as an lvalue the object to which it currently points. As a side effect, increment the value of p so that, if the original lvalue was an element in an array, the new value of p points to the next element in that array.

As an example of how you might use the *p++ construction, the following code adds the elements in an array of effective size n using a pointer-based model:

    int sumArray(int *p, int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += *p++;
        }
        return sum;
    }

At each cycle of the for loop, the expression *p++ returns the value of the current element and then advances p so that it is correctly positioned for the next cycle.

Partly for reasons of history and habit, C++ programmers tend to overuse the *p++ idiom in places where array notation makes the intent much clearer. The sumArray function is easier to read in the following array-based form:

    int sumArray(int array[], int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += array[i];
        }
        return sum;
    }

The examples in this text do not use pointer arithmetic, relying instead on array indexing to improve the readability of the program examples. In general, you would be well-advised to adopt the same rule in your own code. If you look at existing code, however, you will certainly encounter the *p++ idiom (and possibly even more complex applications of pointer arithmetic), and you need to know what that construction means. The same syntactic pattern, moreover, is used by the Standard Template Library to implement iterators, which are introduced in Chapter 14. In that context, it is useful to remember that *p++ is a C++ shorthand for retrieving the current element of an array and then advancing to the next one.

11.4 Using functions as data values

In the programming you have done up to this point, the concepts of functions and data structures have remained quite separate.
Functions provide the means for representing an algorithm; data structures allow you to organize the information to which those algorithms are applied. Functions have been part of the algorithmic structure, not part of the data structure. Being able to use functions as data values, however, often makes it much easier to design effective interfaces because doing so allows clients to specify operations as well as data.

Pointers to functions

In the earliest days of computing, programs were represented in a form that made them entirely separate from data. Typically, instructions were punched on paper tape and then fed into the machine, which would then execute the instructions in sequence. If you wanted to change the program, you had to punch a new tape. One of the most important characteristics of modern computers is that the same memory words used to store data values are also used to store the machine-language instructions executed by the hardware. This technique of storing instructions in the memory addresses used for data values is called the von Neumann architecture, after the mathematician John von Neumann. Although computing historians now believe that von Neumann is unlikely to have originated the idea, he seems to have been the first to publish it, and the concept continues to bear his name.

One of the important implications of the von Neumann architecture is that every machine-language instruction in a program has an address in memory. That fact makes it possible to create a pointer to a function, which is simply the address of its first instruction. Most modern object-oriented languages use pointers to functions internally but hide the details from the programmer. By contrast, C++ makes it possible for programmers to declare pointers to functions and then use those functions as data values in an application.
A simple plotting application

Before looking at the details of how C++ incorporates pointers to functions into its syntax, it helps to consider an example that shows how this technique is used in practice. One of the most convincing examples is an application that plots the value of a client-specified function. Suppose, for example, that you want to write a program that plots the value of a function ƒ(x) for values of x in a specified range. For example, if ƒ is the trigonometric sine function, you would like your program to produce a sample run that looks something like this:

[Figure: PlotFunction sample run]

The graphical output shows only the shape of the graph and does not indicate any units along the x- and y-axes. In this diagram, the values of x vary from –2π to 2π, and the values of y extend from –1 to 1. The plot function called by the client would need to include these ranges as parameters, which means that the call that produces this output would look something like this, assuming the appropriate definition of the constant PI:

    plot(sin, -2 * PI, 2 * PI, -1, 1);

The interesting parameter here is the first argument, which is the name of the function you want to plot. In this example, the function is the trigonometric function sin from the library. If plot is designed in a general way, however, it should be possible to plot a different function by changing the first argument. For example, the call

    plot(sqrt, 0, 4, 0, 2);

should plot the sqrt function on a graph that extends from 0 to 4 along the x-axis and from 0 to 2 along the y-axis, as follows:

[Figure: PlotFunction sample run]

At the same time, the first argument to plot can’t be any function at all. It wouldn’t make sense, for example, to use a string function, because graphical plots on an x-y grid make sense only for functions that apply to numbers. What’s more, it also wouldn’t make sense to call plot with a numeric function that takes several arguments.
The function that appears as the first argument must be a function that takes a real number (presumably a double) and returns one as well. Thus, you can say that the first argument to plot must be an instance chosen from the general class of functions that map one double into another double.

The first argument, moreover, must be a data value that is part of the C++ type system. When the compiler encounters the name of a function that is not followed by the parentheses that would indicate a function call, it treats that function name as a pointer to the beginning of the machine-language implementation of that function in memory. The first argument to plot is therefore a pointer to a function that takes a double as its argument and returns a double as a result.

Declaring pointers to functions

The next question to consider is how to declare the parameter variable that represents the first argument to plot. C++ uses a syntactic form to declare a pointer to a function that often appears odd at first but that is in fact consistent with other declaration forms. If you want, for example, to declare a variable fn to be a pointer to a function taking and returning a double, you do so by writing

    double (*fn)(double);

It’s important to remember the parentheses around *fn in the declaration of a function pointer. The alternative

    double *fn(double);

declares fn as a function returning a pointer to a double.

This declaration syntax makes it possible to write down the prototype for the plot function, which looks like this:

    void plot(double (*fn)(double), double minX, double maxX,
              double minY, double maxY);

Implementing the plot function

Once you have defined the prototype, you can write a simple implementation of plot using the graphics library introduced in Chapter 2, as shown in Figure 11-4.
The implementation cycles through each pixel coordinate across the graphics window and then transforms that value into the x coordinate by choosing the point at the corresponding position in the interval between minX and maxX. Thus, the point halfway across the window corresponds to the point halfway between minX and maxX. The program then calls the function passed as the first argument to compute the value of y, which appears in the code as the declaration

    double y = fn(x);

The final step consists of converting the y value to the appropriate vertical position on the screen by scaling this value with respect to minY and maxY. This operation is essentially the inverse of the transformation to derive the value of x. The only difference is that the y-axis on the screen is inverted from the traditional Cartesian coordinate plane, making it necessary to subtract the computed value from the height of the graphics window.

Figure 11-4 Implementation of the plot function

    /*
     * Function: plot
     * Usage: plot(fn, minX, maxX, minY, maxY);
     * ----------------------------------------
     * Plots the specified function (which must map one double to another
     * double) on the screen. The remaining arguments indicate the range
     * of values in the x and y directions, respectively.
     */

    void plot(double (*fn)(double), double minX, double maxX,
              double minY, double maxY) {
        double width = getWindowWidth();
        double height = getWindowHeight();
        int nSteps = int(width);
        double dx = (maxX - minX) / nSteps;
        double sx0 = 0;
        double sy0 = height - (fn(minX) - minY) / (maxY - minY) * height;
        for (int i = 1; i < nSteps; i++) {
            double x = minX + i * dx;
            double y = fn(x);
            double sx1 = (x - minX) / (maxX - minX) * width;
            double sy1 = height - (y - minY) / (maxY - minY) * height;
            drawLine(sx0, sy0, sx1, sy1);
            sx0 = sx1;
            sy0 = sy1;
        }
    }
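The two coordinate transformations described above can be isolated as small helper functions, which makes them easy to check by hand. This is only a sketch: the names stepToX and yToScreen are hypothetical, and the x mapping is written as a single expression instead of the dx increment used in the figure.

```cpp
// Maps step i (0 <= i <= nSteps) across the window to an x value
// in the interval from minX to maxX.
double stepToX(int i, int nSteps, double minX, double maxX) {
    return minX + (maxX - minX) * i / nSteps;
}

// Maps a function value y in the interval from minY to maxY to a
// vertical screen coordinate. Screen coordinates grow downward, so
// the scaled value is subtracted from the window height.
double yToScreen(double y, double minY, double maxY, double height) {
    return height - (y - minY) / (maxY - minY) * height;
}
```

For a window 200 pixels high showing y values from -1 to 1, yToScreen maps 1 to the top of the window (0), 0 to the middle (100), and -1 to the bottom (200).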
The implementation of plot begins by computing the coordinates of the point at the left edge of the window, storing the result in the variables sx0 and sy0. From there it fills in the coordinates one pixel to the right in the variables sx1 and sy1. The code then connects these points by calling the graphics library function

    drawLine(sx0, sy0, sx1, sy1);

Each additional cycle of the loop connects the current point to its predecessor. This process has the effect of approximating the graph of the function by connecting a sequence of line segments, each of which extends one pixel in the x direction.

The plot function in Figure 11-4 is probably too primitive to be used in a practical application. It nonetheless makes a useful first example of how treating functions as data values can be useful in an application.

Comparison functions

Another common application of functions as data arises when you want to sort a vector or an array using some ordering scheme other than the natural ordering provided by the relational operators <, ==, and >. The usual approach in such cases is to pass a comparison function to the sorting function that implements the new ordering. Comparison functions typically take two values, v1 and v2, and return an integer with the following properties:

• If v1 and v2 are equal, the comparison function returns 0.
• If v1 comes before v2, the comparison function returns a number less than 0.
• If v1 comes after v2, the comparison function returns a number greater than 0.

Figure 11-5 defines a sort function that sorts a vector of strings according to a client-supplied comparison function along with a compareIgnoringCase function that compares two strings without regard to case. Calling

    sort(vec, compareIgnoringCase);

therefore sorts a vector of strings without considering the case of the letters, which is presumably what you want for most programs that alphabetize a list of words.
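The three-way convention is not tied to strings or to the sort in Figure 11-5. As a sketch under different assumptions, the following code applies the same convention to integers and hands the work to std::sort from the standard library in place of the book's selection sort. The names compareByMagnitude and sortWith are hypothetical.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Three-way comparison (hypothetical example) that orders integers by
// absolute value: returns 0 if equal, negative if v1 comes first,
// and positive if v1 comes after v2.
int compareByMagnitude(int v1, int v2) {
    int m1 = std::abs(v1);
    int m2 = std::abs(v2);
    if (m1 == m2) return 0;
    return (m1 < m2) ? -1 : 1;
}

// Sorts vec using a client-supplied three-way comparison function.
// std::sort expects a "comes before" test, so the lambda converts the
// integer result into a bool by checking whether it is negative.
void sortWith(std::vector<int> & vec, int (*cmp)(int, int)) {
    std::sort(vec.begin(), vec.end(),
              [cmp](int a, int b) { return cmp(a, b) < 0; });
}
```

Calling sortWith(vec, compareByMagnitude) on a vector containing -5, 2, -1, and 4 rearranges it into the order -1, 2, 4, -5. Passing a different comparison function changes the ordering without touching sortWith.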
The compareIgnoringCase function is only one of many that you could use in connection with this extended version of sort. If, for example, you wanted to sort strings in order of increasing length, you could give sort the following comparison function:

    int compareByLength(string s1, string s2) {
        return s1.length() - s2.length();
    }

The subtraction operator is all you need here. If s1 is shorter than s2, this function will return a negative number. Similarly, if s1 is longer than s2, the result will be positive. If the lengths are the same, the compareByLength comparison function returns 0.

As another example, you can sort strings in reverse lexicographic order by calling sort with the following comparison function:

    int compareBackwards(string s1, string s2) {
        if (s1 == s2) return 0;
        if (s1 > s2) return -1;
        return 1;
    }

Figure 11-5 Implementation of selection sort using a comparison function

    /*
     * Function: sort
     * Usage: sort(vec, cmp);
     * ----------------------
     * Sorts the vector of strings using the client-specified comparison
     * function cmp to determine the ordering. The comparison function
     * takes two strings, s1 and s2, and returns an integer whose sign
     * indicates the result of the comparison. If s1 and s2 are equal,
     * the comparison function must return 0. If s1 comes before s2 in
     * the ordering, the comparison function must return some integer less
     * than 0. If s1 comes after s2, the comparison function returns an
     * integer greater than 0.
     */

    void sort(Vector<string> & vec, int (*cmp)(string, string)) {
        int n = vec.size();
        for (int lh = 0; lh < n; lh++) {
            int rh = lh;
            for (int i = lh + 1; i < n; i++) {
                if (cmp(vec[i], vec[rh]) < 0) rh = i;
            }
            string tmp = vec[lh];
            vec[lh] = vec[rh];
            vec[rh] = tmp;
        }
    }

    /*
     * Function: compareIgnoringCase
     * Usage: sort(vec, compareIgnoringCase);
     * --------------------------------------
     * Returns an integer whose sign indicates the relationship between
     * the strings s1 and s2 when considered without regard to case.
     * In most cases, this function is not called explicitly but is instead
     * passed to the sort function or some other utility that requires a
     * comparison function. This implementation adopts the simple but
     * inefficient tactic of converting the entire string to lowercase
     * and then comparing the result using the relational operators for
     * the string class.
     */

    int compareIgnoringCase(string s1, string s2) {
        s1 = toLowerCase(s1);
        s2 = toLowerCase(s2);
        if (s1 == s2) return 0;
        if (s1 < s2) return -1;
        return 1;
    }

Summary

One of the goals of this book is to encourage you to use high-level structures that allow you to think about data in an abstract way that is independent of the underlying representation. Abstract data types and classes help make it possible to maintain this holistic viewpoint. At the same time, using C++ effectively requires you to have a mental model of how data structures are represented in memory. In this chapter, you have had a chance to see how those structures are stored and to get a sense of what goes on “under the hood” as you write your programs.

The important points introduced in this chapter include:

• The fundamental unit of information in a modern computer is a bit, which can be in one of two possible states. The state of a bit is usually represented in memory diagrams using the binary digits 0 and 1, but it is equally appropriate to think of these values as off and on or false and true, depending on the application.
• Sequences of bits are combined inside the hardware to form larger structures, including bytes, which are eight bits long, and words, which contain either four bytes (32 bits) or eight bytes (64 bits) depending on the machine architecture.
• The internal memory of a computer is arranged into a sequence of bytes in which each byte is identified by its index position in that sequence, which is called its address.
• Computer scientists tend to write address values and the contents of memory locations in hexadecimal notation (base 16) because doing so makes it easy to identify the individual bits.
• The primitive types in C++ require different amounts of memory. A value of type char requires one byte, a value of type int typically requires four, and a value of type double requires eight.
• Addresses of data in memory are themselves data values and can be manipulated as such by a program. A data value that is the address of some other piece of data is called a pointer. Pointer variables are declared in C++ by writing an asterisk in front of the variable name in its declaration line.
• The fundamental operations on pointers are & and *, which indicate the address of a stored value and the value stored at a particular address, respectively.
• Data values that you create in a C++ program are allocated in different regions of memory. Static variables and constants are allocated in a region of memory devoted to the program code and static data. Local variables are allocated in a region called the stack, which is apportioned into structures called frames that contain all of the local variables for a function or method. As you will discover in Chapter 12, programs can also allocate additional memory as they run.
• The stack frame for a method call includes an entry identified by the keyword this that identifies the current object.
• C++ uses the -> operator to select a member of a structure or an object given a pointer to that value.
• There is a special pointer value called NULL, which is used to indicate that a pointer does not refer to anything.
• Reference parameters are implemented in C++ by storing a pointer to the calling argument in the stack frame.
• Like most languages, C++ includes a built-in array type for storing an ordered, homogeneous collection of elements. As in a vector, each element in an array has an integer index that begins with 0.
• Arrays are declared by specifying the size in square brackets after the name of the array. The size specified in the declaration is called the allocated size of the array and is typically larger than the effective size, which is the actual number of elements in use.
• Arrays in C++ are interpreted internally as a pointer to their first element. An important implication of this design is that passing an array as a parameter does not copy the elements. Instead, the function stores the address of the array in the caller. As a result, if a function changes the values of any elements of an array passed as a parameter, those changes will be visible to the caller.
• C++ defines arithmetic on pointers so that adding an integer to a pointer generates the address of an array element that number of index positions further down in the array. Thus, if the pointer p points to array[0], the expression p + 2 points to array[2].
• The idiomatic pattern *p++ returns the value to which p currently points and then increments p so that it points to the next element in an array of that type.
• Most computers use the von Neumann architecture, in which machine-language instructions are stored in the same memory as any other kind of data. As a result, it is possible to define a pointer to a function as the address of its first instruction.
• In C++, the declaration of a variable or parameter of type pointer-to-function must specify the types of the parameters and the function result.
The syntactic form of the declaration is illustrated by the following example, which appears in the argument list to the plot function:

    double (*fn)(double)

This declaration specifies that the parameter fn is a pointer to a function taking one argument of type double and returning a double as its result.

Review questions

1. Define the following terms: bit, byte, and word.
2. What is the etymology of the word bit?
3. How many bytes of memory are there in a 2GB machine?
4. Convert each of the following decimal numbers to its hexadecimal equivalent:
   a) 17
   b) 256
   c) 1729
   d) 2766
5. Convert each of the following hexadecimal numbers to decimal:
   a) 17
   b) 64
   c) CC
   d) FAD
6. How many bytes does C++ assign to a value of type char? How many bytes are typically required for a double?
7. What are the three memory regions in which values can be stored in a C++ program?
8. True or false: In C++, values of type char always require one byte of memory.
9. True or false: In C++, values of type int always require four bytes of memory.
10. If a machine uses two’s complement arithmetic to represent negative numbers, what is the internal representation of –7 in a 32-bit integer format?
11. What is the purpose of the sizeof operator? How do you use it?
12. What is an address?
13. What is an lvalue?
14. What reasons for using pointers are cited in this chapter?
15. What are the types of the variables introduced by the following declaration:

    int * p1, p2;

16. What are the two fundamental pointer operations? Which one corresponds to the term dereferencing?
17. Explain the difference between pointer assignment and value assignment.
18. Assuming that variables of type int and all pointers require four bytes of memory, draw a diagram showing a portion of the stack frame that contains the following declarations:

    int v1 = 10;
    int v2 = 25;
    int *p1 = &v1;
    int *p2 = &v2;

    In your diagram, trace through the operation of these statements:

    *p1 += *p2;
    p2 = p1;
    *p2 = *p1 + *p2;

19. True or false: For any variable x, the expression *&x is essentially a synonym for x.
20. True or false: For any variable p, the expression &*p is essentially a synonym for p.
21. How are pointers used in the implementation of call by reference?
22. Write array declarations for the following array variables:
   a) An array realArray consisting of 100 floating-point values
   b) An array inUse consisting of 16 Boolean values
   c) An array lines that can hold up to 1000 strings
   Remember to declare constants to specify the allocated size for these arrays.
23. Write the variable declaration and for loop necessary to create and initialize the following integer array:

    squares
    index:  0   1   2   3   4   5   6   7   8   9   10
    value:  0   1   4   9   16  25  36  49  64  81  100

24. What is the difference between the allocated size and the effective size of an array?
25. Assuming that intArray is declared as

    int intArray[10];

    and that j is an integer variable, describe the steps the computer would take to determine the value of the following expression:

    &intArray[j + 3];

26. If array is declared to be an array, describe the distinction between the expressions

    array[2]

    and

    array + 2

27. Assume that variables of type double take up eight bytes on the computer system you are using. If the base address of the array doubleArray is FF00, what is the address value of doubleArray + 5?
28. True or false: If p is a pointer variable, the expression p++ adds 1 to the internal representation of p.
29. Describe the effect of the idiomatic C++ expression

    *p++

30. In the expression *p++, which operator (* or ++) is applied first?
In general, what rule does C++ use to determine the precedence of unary operators?
31. What feature of the von Neumann architecture makes it possible to define pointers to functions?
32. Describe the difference in the declarations

    char *fn1(string);

    and

    char (*fn1)(string);

33. How would you declare a variable fn as a pointer to a function taking two integers and returning a Boolean value?
34. What is the rule governing the return value of a comparison function?

Exercises

1. As you know from the chapter, integers are represented inside the computer as a sequence of bits, each of which is a single digit in the binary number system and can therefore have only the value 0 or 1. With N bits, you can represent 2^N distinct integers. For example, three bits are sufficient to represent the eight (2^3) integers between 0 and 7, as follows:

    000 → 0
    001 → 1
    010 → 2
    011 → 3
    100 → 4
    101 → 5
    110 → 6
    111 → 7

The bit patterns for these integers follow a recursive pattern. The binary numbers with N bits consist of the following two sets in order:

• All binary numbers with N − 1 bits preceded by a 0.
• All binary numbers with N − 1 bits preceded by a 1.

Write a recursive function

    void generateBinaryCode(int nBits);

that generates the bit patterns for the binary representation of all integers that can be represented using the specified number of bits. For example, calling generateBinaryCode(3) should produce the following output:

    000
    001
    010
    011
    100
    101
    110
    111

2. Although the binary coding used in exercise 1 is ideal for most applications, it has certain drawbacks. As you count in standard binary notation, there are some points in the sequence at which several bits change at the same time. For example, in the three-bit binary code, the value of every bit changes as you move from 3 (011) to 4 (100).

In some applications, this instability in the bit patterns used to represent adjacent numbers can lead to problems.
Imagine for the moment that you are using some hardware measurement device that produces a three-bit value from some real-world phenomenon that happens to be varying between 3 and 4. Sometimes, the device will register 011 to indicate the value 3; at other times, it will register 100 to indicate 4. For this device to work correctly, the transitions for each of the individual bits must occur simultaneously. If the first bit changes more quickly than the others, for example, there may be an intermediate state in which the device reads 111, which would be a highly inaccurate reading.

It turns out that you can avoid this problem simply by changing the numbering system. Instead of using binary representation in the traditional way, you can assign three-bit values to each of the numbers 0 through 7 with the highly useful property that only one bit changes in the representation between every pair of adjacent integers. Such an encoding is called a Gray code (after its inventor, the mathematician Frank Gray) and looks like this:

    000 → 0
    001 → 1
    011 → 2
    010 → 3
    110 → 4
    111 → 5
    101 → 6
    100 → 7

In the Gray code representation, the bit patterns for 3 and 4 differ only in their leftmost bit. If the hardware measurement device used Gray codes, a value oscillating between 3 and 4 would simply turn that bit on and off, eliminating any problems with synchronization.

The recursive insight that you need to create a Gray code of N bits is summarized in the following informal procedure:

1. Write down the Gray code for N – 1 bits.
2. Copy that same list in reverse order below the original one.
3. Add a 0 bit in front of the encodings in the original half of the list and a 1 bit in front of those in the reversed copy.
This procedure is illustrated in the following derivation of the Gray code for three bits:

    1-bit Gray code    2-bit Gray code    3-bit Gray code
    0                  00                 000
    1                  01                 001
                       11 (reversed)      011
                       10 (reversed)      010
                                          110 (reversed)
                                          111 (reversed)
                                          101 (reversed)
                                          100 (reversed)

Write a recursive function generateGrayCode(nBits) that generates the Gray code patterns for the specified number of bits. For example, if you call the function

    generateGrayCode(3)

the program should produce the following output:

    000
    001
    011
    010
    110
    111
    101
    100

3. Write overloaded versions of the integerToString and stringToInteger functions that take a second argument indicating the numeric base, which can be any integer in the range 2 through 36 (the 10 digits plus the 26 letters). For example, calling

    integerToString(42, 16)

should return the string "2A". Similarly, calling

    stringToInteger("111111", 2)

should return the integer 63. Your functions should allow for negative numbers and should generate an error if any of the digits in the first argument to stringToInteger is out of range for the specified base.

4. Rewrite the simple expression calculator from Chapter 6, exercise 11, so that both the input and output values are represented in hexadecimal. A sample run of this program might look like this:

    > 2 + 2
    4
    > 9 + 3
    C
    > 10000 - 1
    FFFF
    > 6 * 7
    2A
    > FEED - CAFE
    33EF
    >

In writing this program, the easiest approach is to scan all tokens as words rather than numbers and then call the functions you wrote for exercise 3 to perform the conversions.

5. Rewrite the Quadratic program from Figure 2-3 so that it uses explicit pointers instead of call by reference to return values from the functions getCoefficients and solveQuadratic.

6. Using the definitions of MAX_JUDGES and scores on page 502 as a starting point, write a program that reads in gymnastics scores between 0 and 10 from a set of judges and then computes the average of the scores after eliminating both the highest and lowest scores from consideration. Your program should accept input values until the user enters a blank line as a sentinel. A sample run of this program might look like this:

    Enter score for each judge in the range 0 to 10.
    Enter a blank line to signal the end of the list.
    Judge #1: 9.0
    Judge #2: 9.1
    Judge #3: 9.3
    Judge #4: 9.0
    Judge #5: 8.8
    Judge #6: 9.0
    Judge #7:
    The average after eliminating 8.80 and 9.30 is 9.03.

7. Rewrite the implementation of the merge sort algorithm from Figure 10-3 so that it sorts an array rather than a vector. The revised function should use the prototype

    void sort(int array[], int n)

as in the reimplementation of the selection sort algorithm on page 503.

8. In calculus, the definite integral of a function is defined to be the area bounded horizontally by two specified limits and vertically by the x-axis and the value of the function. For example, the definite integral of the trigonometric sine function in the range 0 to π is the area of the shaded region in the following diagram:

[Figure: graph of the sine function with the region between 0 and π shaded]

You can compute an approximation to this area by adding up the area of small rectangles of a fixed width, where the height is given by the value of the function at the midpoint of the rectangle:

[Figure: the same region approximated by rectangles of fixed width]

Design the prototype and write the code for a function integrate that calculates the definite integral by summing the areas of a set of rectangles. For example, to calculate the area of the shaded region in the earlier example, the client would write

    double value = integrate(sin, 0, PI, 20);

where the last argument is the number of rectangles into which the area gets divided; the larger this value, the more accurate the approximation.
Note that any region that falls below the x-axis is treated as negative area. Thus, if you compute the definite integral of sin from 0 to 2π, the result would be 0 because the areas above and below the axis cancel each other out.

9. As noted in the text, the code for compareIgnoringCase in Figure 11-5 is inefficient because it converts the entire string to lowercase even if you can determine the relative ordering by looking only at the first character. Write a more efficient version of compareIgnoringCase that looks at as few characters as possible as it determines the result.

10. Write a comparison function compareByTitle that takes two strings and compares them subject to the following rules:

• The comparison should ignore differences in case.
• All punctuation marks except spaces should be ignored.
• The words a, an, or the at the beginning of a title should be ignored.

Chapter 12
Dynamic Memory Management

You have burdened your memory with exploded systems and useless names.
(Mary Shelley, Frankenstein, 1818)

Up to this point in the text, you have seen two mechanisms for assigning memory to variables. When you declare a global constant, the compiler allocates memory space that persists throughout the entire program. This style of allocation is called static allocation because the variables are assigned to locations in memory that remain unchanged throughout the lifetime of the program. When you declare a local variable inside a function, the space for that variable is allocated on the stack. Calling the function assigns memory to the variable; that memory is freed when the function returns. This style of allocation is called automatic allocation. There is, however, a third way of allocating memory that permits you to acquire new memory as the program runs. The process of acquiring new storage while the program is running is called dynamic allocation.
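The three styles of allocation can be seen side by side in a few lines of code. The sketch below is only illustrative: the names MAX_SCORES and makeHeapInt are hypothetical, and the new operator it relies on is the one this chapter introduces in the section on the heap.

```cpp
// Static allocation: this constant occupies the same memory
// for the entire lifetime of the program.
const int MAX_SCORES = 10;

// Returns the address of a heap integer initialized to value.
int *makeHeapInt(int value) {
    int local = value;     // automatic allocation: lives in this stack frame
    int *ip = new int;     // dynamic allocation: storage acquired at run time
    *ip = local;
    return ip;             // local disappears on return; the heap integer does not
}
```

The integer created by makeHeapInt survives after the function returns, so the caller becomes responsible for releasing it with the delete operator, which the chapter covers in its discussion of freeing memory.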
Dynamic allocation is one of the most important techniques that you need to learn before you can consider yourself fluent in C++. In part, the importance comes from the fact that dynamic allocation makes it possible for data structures to expand as needed while a program is running. The collection classes introduced in Chapter 5, for example, depend on this ability. There is no arbitrary limit on the size of a Vector or a Map. If these classes need more memory, they simply request it from the system.

In C++, dynamic allocation takes on special importance because the language assigns much more responsibility to the programmer than most modern languages do. In C++, it is not enough to know how to allocate memory. You also have to learn how to free memory when it is no longer needed. The process of allocating and freeing memory in a disciplined way is called memory management.

12.1 Dynamic allocation and the heap

When a program is loaded into memory, it usually occupies only a fraction of the available storage. Like most programming languages, C++ allows you to allocate some of the unused storage to the program whenever your application needs more memory. For example, if you need space for an array while the program is running, you can reserve part of the unallocated memory, leaving the rest for subsequent allocations. The pool of unallocated memory available to a program is called the heap.

In most modern architectures, memory is arranged so that the heap and the stack grow in opposite directions toward each other, as illustrated in the following diagram:

[Diagram: the heap grows toward higher addresses and the stack grows toward lower addresses]

The advantage of this strategy is that either region can grow as needed until all of the available memory is filled.

The ability to allocate memory from the heap when you need it turns out to be an extremely useful technique that has widespread application to programming.
For example, all the collection classes use the heap to store their elements, because dynamic allocation is essential to creating data structures that can expand as needed. Later in this chapter, you’ll have a chance to build a simplified version of one of those collection classes. Before doing so, however, it is important to learn the underlying mechanics of dynamic allocation and how the process works.

The new operator

C++ uses the new operator to allocate memory from the heap. In its simplest form, the new operator takes a type and allocates space for a variable of that type located in the heap. For example, if you want to allocate an integer in the heap, you call

    int *ip = new int;

The call to the new operator returns the address of a storage location in the heap that has been set aside to hold an integer. If the first free word in the heap is located at address 1000, the variable ip in the current stack frame will be assigned the address of that memory word, as follows:

[Diagram: the variable ip, at stack address FFF0, contains 1000, the address of the newly allocated word in the heap]

Conceptually, the local variable ip on the stack points to the newly allocated word on the heap. That relationship becomes clearer if you indicate the relationship with an arrow rather than an address, as follows:

[Diagram: ip drawn with an arrow pointing to the word in the heap]

The contents of the memory locations are the same. The only change is in how you draw the picture.

Once you have allocated the space for the integer value in heap memory, you can refer to that integer by dereferencing the pointer. For example, you can store the integer 42 into that word by executing the statement

    *ip = 42;

which leaves memory looking like this:

[Diagram: ip points to a heap word containing 42]

You can also use the new operator to allocate objects on the heap.
If you use only the class name, as in

    Rational *rp = new Rational;

C++ allocates space for a Rational object on the heap and invokes the default constructor, creating the following state in memory:

[Diagram: rp points to a Rational object in the heap containing the values 0 and 1]

If you supply arguments after the type name, C++ will call the matching version of the constructor. The declaration

    Rational *rp = new Rational(2, 3);

therefore creates the following state:

[Diagram: rp points to a Rational object in the heap containing the values 2 and 3]

Dynamic arrays

The new operator also makes it possible to allocate space for an array in the heap, which is called a dynamic array. To allocate space for a dynamic array, you follow the type name with the desired number of elements enclosed in square brackets. Thus, the declaration

    double *array = new double[3];

initializes array so that it points to a contiguous block of memory large enough to hold three doubles, like this:

[Diagram: array points to a block of three doubles in the heap]

The variable array is now a fully functioning array whose storage lives in the heap rather than on the stack. You can assign values to the elements of array, which are then stored in the appropriate cells in the heap.

Freeing memory

Although they are getting larger all the time, computer memory systems are finite in size. As a result, the heap will eventually run out of space. When this occurs, the new operator will be unable to allocate a block of the requested size. Failure to fulfill an allocation request is such a serious error that there is usually nothing the program can do to recover.

The best strategy to ensure that you don’t run out of memory is to free any heap storage when you are finished using it. To this end, C++ includes the operator delete, which takes a pointer previously allocated by new and frees the memory associated with that pointer. If, for example, you determine that you are completely finished using the storage allocated for ip, you can free that storage by calling

    delete ip;

If the heap memory is an array, you need to add square brackets after the delete keyword.
Thus, when you are finished using array in the earlier example, you can free its memory by executing the statement

    delete[] array;

Knowing when to free a piece of memory is not an easy task, particularly as programs become large. If several parts of a program share some data structure that has been allocated in the heap, it may not be possible for any single part to recognize that the memory is no longer needed. For simple programs that run only until they produce the desired results, it is often a reasonable strategy to allocate whatever memory you need without bothering to free it again. Doing so, however, is likely to create a dangerous habit. If some other programmer tries to use your code in a long-running application, the fact that you were careless with memory will become a serious problem. It is therefore good practice to make sure that you at some point free any heap storage that you allocate. When a program fails to free its unused heap storage, that program is said to contain a memory leak.

Some languages, such as Java, support a strategy for dynamic allocation that actively goes through memory freeing any storage that is no longer in active use. This strategy is called garbage collection. Garbage collection makes memory management extremely easy for programmers, although it also imposes some costs. Scanning through the entire heap to figure out what parts of it are in use takes some time. Worse still, garbage collection often makes it impossible to predict how long it will take to execute a specific task. A function that usually runs very quickly can suddenly take a significant amount of time if that particular call ends up running the garbage collector. If that function is responsible for some time-critical task, the application might fail to respond in time.

Increasingly, programming languages are adopting garbage collection as a strategy for memory management.
As a general principle, it is worth sacrificing some amount of processing time to save a lot of programmer time. Processors, after all, are cheap, and programmers are expensive.

C++, however, dates from an earlier era. For better or worse, the designers of C++ decided to leave the responsibility for freeing heap memory in the hands of the programmers instead of delegating that task to an automatic process. Fortunately, those designers made up for the lack of garbage collection by offering programmers a different strategy for memory management that simplifies the problem enormously. In C++, each class is allowed to specify what happens when an object of that class disappears. In well-designed C++ applications, each class takes responsibility for its own heap storage, thereby freeing clients from the nearly impossible task of remembering exactly what heap storage is currently active. Learning to use this strategy effectively is one of the most important techniques you need to master as a C++ programmer, so it is worth paying particular attention to the following section, which describes this approach in detail.

Destructors

As you know from the examples you have already seen, C++ classes typically define one or more constructors that initialize an object. Each class can also define a destructor, which is called automatically when an object of that class disappears. This destructor can perform a variety of cleanup operations. It can, for example, close any files the object has opened. The most important role for a destructor, however, is freeing any heap memory created by that object.

In C++, the destructor has the same name as the class preceded by the tilde character ~. Thus, if you needed to define a destructor for a class named MyClass, the prototype in the interface would look like this:

    ~MyClass();

As with the constructors, the destructor does not include a result type. Unlike constructors, the destructor cannot be overloaded.
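To make the destructor syntax concrete, here is a small sketch of a class that allocates heap storage in its constructor and frees it in its destructor. The class name IntBuffer, the counter liveBuffers, and the function liveCountInsideScope are hypothetical names chosen for this illustration and are not part of the text. Copying is disabled with the C++11 = delete syntax, an addition made here so that two objects can never try to free the same array.

```cpp
// Hypothetical counter (not from the text) that records how many IntBuffer
// objects currently exist, so the effect of the destructor can be observed.
int liveBuffers = 0;

// Hypothetical class that owns a heap array for its entire lifetime.
class IntBuffer {
public:
    // Constructor: allocates the heap storage owned by this object.
    explicit IntBuffer(int n) : data(new int[n]) {
        liveBuffers++;
    }

    // Destructor: same name as the class preceded by ~, no result type,
    // no arguments. It frees the heap storage the constructor allocated.
    ~IntBuffer() {
        delete[] data;
        liveBuffers--;
    }

    // Copying is disabled (an assumption of this sketch) so that the
    // array is freed exactly once.
    IntBuffer(const IntBuffer &) = delete;
    IntBuffer & operator=(const IntBuffer &) = delete;

private:
    int *data;
};

// Creates a local IntBuffer and reports the live count while it exists.
// The destructor runs automatically when the function returns.
int liveCountInsideScope() {
    IntBuffer buffer(10);
    return liveBuffers;
}
```

A client of a class written this way never calls delete itself; the object releases its own storage when it disappears.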
Each class has only one destructor, which takes no arguments.

In C++, objects disappear in several different ways. In most cases, objects are declared as local variables within some function, which means that they are allocated on the stack. Those objects disappear when the function returns. That means that their destructor is automatically called at that same time, which gives the class definition the chance to free any heap memory that it allocated during that object’s lifetime. In most C++ documentation, local variables that disappear when a function returns are said to go out of scope.

Objects can also be created as temporaries in the evaluation of expressions, even if their value is never stored in a local variable. As an example, the test program for the Rational class from Chapter 6 includes the following code:

    Rational a(1, 2);
    Rational b(1, 3);
    Rational c(1, 6);
    Rational sum = a + b + c;

The local variables a, b, c, and sum will disappear when the function returns. The final line, however, ends up computing the Rational value 5/6 as an intermediate result before going on to compute the final value. Temporary values of this sort go out of scope as soon as the evaluation is complete for the expression that generates them. If the Rational class had a destructor, it would be called at that point to free the storage associated with the temporary object.

12.2 Defining a CharStack class

The easiest way to get a more detailed sense of how to write classes that use dynamic allocation is to implement one of the container classes from Chapter 5. You already know how these classes work, so you can focus on the implementation without having to learn how some new class operates from the client’s perspective. In part because it exports the smallest set of methods, the easiest container class to implement is the Stack class.
As you know from Chapter 5, the Stack class in the Stanford libraries allows clients to create stacks of different kinds by specifying the base type in angle brackets after the type name. The RPN calculator in Chapter 5, for example, uses a Stack to hold the values. Writing a parameterized class, however, requires the use of the template facility described in Chapter 14. At the moment, the goal is to see how classes like Stack use dynamic allocation to manage memory. For that purpose, the base type of the stack is not important, and it will work just as well to define a specific type of stack (in this case a stack of characters, which will turn out to be useful in its own right in Chapter 13) to illustrate the general principles.

The charstack.h interface

Given that you already know the operations that a stack supports from your experience with the generic Stack class from Chapter 5, the interface that exports the CharStack class is easy to write. The contents of the charstack.h interface appear in Figure 12-1. The entries exported by the interface include the default constructor and the various methods (size, isEmpty, clear, push, pop, and peek) that define the behavior of the stack abstraction.

Figure 12-1 Interface for the CharStack class

    /*
     * File: charstack.h
     * -----------------
     * This interface defines the CharStack class, which implements
     * the stack abstraction for characters.
     */

    #ifndef _charstack_h
    #define _charstack_h

    /*
     * Class: CharStack
     * ----------------
     * This class models a stack of characters. The fundamental operations
     * are the same as those for the Stack class.
     */

    class CharStack {

    public:

        /*
         * Constructor: CharStack
         * Usage: CharStack cstk;
         * ----------------------
         * Initializes a new empty stack that can contain characters.
         */

        CharStack();

        /*
         * Destructor: ~CharStack
         * Usage: (usually implicit)
         * -------------------------
         * Frees any heap storage associated with this character stack.
         */

        ~CharStack();

        /*
         * Method: size
         * Usage: int nElems = cstk.size();
         * --------------------------------
         * Returns the number of characters in this stack.
         */

        int size();

        /*
         * Method: isEmpty
         * Usage: if (cstk.isEmpty()) . . .
         * --------------------------------
         * Returns true if this stack contains no