<>

Part I: Artificial Intelligence

Chapter 1<>Introduction ... 1

<>What Is AI? ... 1
<>1.1.1<>Acting humanly: The Turing test approach ... 2
<>1.1.2<>Thinking humanly: The cognitive modeling approach ... 2
<>1.1.3<>Thinking rationally: The ``laws of thought'' approach ... 3
<>1.1.4<>Acting rationally: The rational agent approach ... 3
<>1.1.5<>Beneficial machines ... 4
<>1.2<>The Foundations of Artificial Intelligence ... 5
<>1.2.1<>Philosophy ... 6
<>1.2.2<>Mathematics ... 8
<>1.2.3<>Economics ... 9
<>1.2.4<>Neuroscience ... 11
<>1.2.5<>Psychology ... 12
<>1.2.6<>Computer engineering ... 14
<>1.2.7<>Control theory and cybernetics ... 15
<>1.2.8<>Linguistics ... 16
<>1.3<>The History of Artificial Intelligence ... 17
<>1.3.1<>The inception of artificial intelligence (1943--1956) ... 17
<>1.3.2<>Early enthusiasm, great expectations (1952--1969) ... 18
<>1.3.3<>A dose of reality (1966--1973) ... 21
<>1.3.4<>Expert systems (1969--1986) ... 22
<>1.3.5<>The return of neural networks (1986--present) ... 24
<>1.3.6<>Probabilistic reasoning and machine learning (1987--present) ... 24
<>1.3.7<>Big data (2001--present) ... 26
<>1.3.8<>Deep learning (2011--present) ... 26
<>1.4<>The State of the Art ... 27
<>1.5<>Risks and Benefits of AI ... 31
<>Summary ... 34
<>Bibliographical and Historical Notes ... 35

Chapter 2<>Intelligent Agents ... 36

<>2.1<>Agents and Environments ... 36
<>2.2<>Good Behavior: The Concept of Rationality ... 39
<>2.2.1<>Performance measures ... 39
<>2.2.2<>Rationality ... 40
<>2.2.3<>Omniscience, learning, and autonomy ... 40
<>2.3<>The Nature of Environments ... 42
<>2.3.1<>Specifying the task environment ... 42
<>2.3.2<>Properties of task environments ... 43
<>2.4<>The Structure of Agents ... 47
<>2.4.1<>Agent programs ... 48
<>2.4.2<>Simple reflex agents ... 49
<>2.4.3<>Model-based reflex agents ... 51
<>2.4.4<>Goal-based agents ... 53
<>2.4.5<>Utility-based agents ... 54
<>2.4.6<>Learning agents ... 56
<>2.4.7<>How the components of agent programs work ... 58
<>Summary ... 60
<>Bibliographical and Historical Notes ... 60

Part II: Problem-solving

Chapter 3<>Solving Problems by Searching ... 63

<>3.1<>Problem-Solving Agents ... 63
<>3.1.1<>Search problems and solutions ... 65
<>3.1.2<>Formulating problems ... 66
<>3.2<>Example Problems ... 66
<>3.2.1<>Standardized problems ... 66
<>3.2.2<>Real-world problems ... 69
<>3.3<>Search Algorithms ... 71
<>3.3.1<>Best-first search ... 73
<>3.3.2<>Search data structures ... 73
<>3.3.3<>Redundant paths ... 74
<>3.3.4<>Measuring problem-solving performance ... 75
<>3.4<>Uninformed Search Strategies ... 76
<>3.4.1<>Breadth-first search ... 76
<>3.4.2<>Dijkstra's algorithm or uniform-cost search ... 77
<>3.4.3<>Depth-first search and the problem of memory ... 78
<>3.4.4<>Depth-limited and iterative deepening search ... 80
<>3.4.5<>Bidirectional search ... 82
<>3.4.6<>Comparing uninformed search algorithms ... 84
<>3.5<>Informed (Heuristic) Search Strategies ... 84
<>3.5.1<>Greedy best-first search ... 85
<>3.5.2<>A* search ... 85
<>3.5.3<>Search contours ... 89
<>3.5.4<>Satisficing search: Inadmissible heuristics and weighted A* ... 90
<>3.5.5<>Memory-bounded search ... 92
<>3.5.6<>Bidirectional heuristic search ... 96
<>3.6<>Heuristic Functions ... 97
<>3.6.1<>The effect of heuristic accuracy on performance ... 98
<>3.6.2<>Generating heuristics from relaxed problems ... 99
<>3.6.3<>Generating heuristics from subproblems: Pattern databases ... 100
<>3.6.4<>Generating heuristics with landmarks ... 102
<>3.6.5<>Learning to search better ... 103
<>3.6.6<>Learning heuristics from experience ... 104
<>Summary ... 104
<>Bibliographical and Historical Notes ... 106

Chapter 4<>Search in Complex Environments ... 110

<>4.1<>Local Search and Optimization Problems ... 110
<>4.1.1<>Hill-climbing search ... 111
<>4.1.2<>Simulated annealing ... 114
<>4.1.3<>Local beam search ... 115
<>4.1.4<>Evolutionary algorithms ... 115
<>4.2<>Local Search in Continuous Spaces ... 119
<>4.3<>Search with Nondeterministic Actions ... 122
<>4.3.1<>The erratic vacuum world ... 122
<>4.3.2<>AND—OR search trees ... 123
<>4.3.3<>Try, try again ... 125
<>4.4<>Search in Partially Observable Environments ... 126
<>4.4.1<>Searching with no observation ... 126
<>4.4.2<>Searching in partially observable environments ... 130
<>4.4.3<>Solving partially observable problems ... 132
<>4.4.4<>An agent for partially observable environments ... 132
<>4.5<>Online Search Agents and Unknown Environments ... 134
<>4.5.1<>Online search problems ... 135
<>4.5.2<>Online search agents ... 137
<>4.5.3<>Online local search ... 138
<>4.5.4<>Learning in online search ... 140
<>Summary ... 141
<>Bibliographical and Historical Notes ... 142

Chapter 5<>Adversarial Search and Games ... 146

<>5.1<>Game Theory ... 146
<>5.1.1<>Two-player zero-sum games ... 147
<>5.2<>Optimal Decisions in Games ... 148
<>5.2.1<>The minimax search algorithm ... 149
<>5.2.2<>Optimal decisions in multiplayer games ... 151
<>5.2.3<>Alpha--Beta Pruning ... 152
<>5.2.4<>Move ordering ... 153
<>5.3<>Heuristic Alpha--Beta Tree Search ... 156
<>5.3.1<>Evaluation functions ... 156
<>5.3.2<>Cutting off search ... 158
<>5.3.3<>Forward pruning ... 159
<>5.3.4<>Search versus lookup ... 160
<>5.4<>Monte Carlo Tree Search ... 161
<>5.5<>Stochastic Games ... 164
<>5.5.1<>Evaluation functions for games of chance ... 166
<>5.6<>Partially Observable Games ... 168
<>5.6.1<>Kriegspiel: Partially observable chess ... 168
<>5.6.2<>Card games ... 171
<>5.7<>Limitations of Game Search Algorithms ... 173
<>Summary ... 174
<>Bibliographical and Historical Notes ... 175

Chapter 6<>Constraint Satisfaction Problems ... 180

<>6.1<>Defining Constraint Satisfaction Problems ... 180
<>6.1.1<>Example problem: Map coloring ... 181
<>6.1.2<>Example problem: Job-shop scheduling ... 182
<>6.1.3<>Variations on the CSP formalism ... 183
<>6.2<>Constraint Propagation: Inference in CSPs ... 185
<>6.2.1<>Node consistency ... 186
<>6.2.2<>Arc consistency ... 186
<>6.2.3<>Path consistency ... 187
<>6.2.4<>Kconsistency ... 188
<>6.2.5<>Global constraints ... 188
<>6.2.6<>Sudoku ... 189
<>6.3<>Backtracking Search for CSPs ... 191
<>6.3.1<>Variable and value ordering ... 193
<>6.3.2<>Interleaving search and inference ... 194
<>6.3.3<>Intelligent backtracking: Looking backward ... 195
<>6.3.4<>Constraint learning ... 196
<>6.4<>Local Search for CSPs ... 197
<>6.5<>The Structure of Problems ... 199
<>6.5.1<>Cutset conditioning ... 200
<>6.5.2<>Tree decomposition ... 201
<>6.5.3<>Value symmetry ... 203
<>Summary ... 203
<>Bibliographical and Historical Notes ... 204

Part III: Knowledge, reasoning, and planning

Chapter 7<>Logical Agents ... 208

<>7.1<>Knowledge-Based Agents ... 209
<>7.2<>The Wumpus World ... 210
<>7.3<>Logic ... 214
<>7.4<>Propositional Logic: A Very Simple Logic ... 217
<>7.4.1<>Syntax ... 217
<>7.4.2<>Semantics ... 218
<>7.4.3<>A simple knowledge base ... 220
<>7.4.4<>A simple inference procedure ... 220
<>7.5<>Propositional Theorem Proving ... 222
<>7.5.1<>Inference and proofs ... 223
<>7.5.2<>Proof by resolution ... 225
<>Conjunctive normal form ... 226
<>A resolution algorithm ... 227
<>Completeness of resolution ... 228
<>7.5.3<>Horn clauses and definite clauses ... 229
<>7.5.4<>Forward and backward chaining ... 230
<>7.6<>Effective Propositional Model Checking ... 232
<>7.6.1<>A complete backtracking algorithm ... 233
<>7.6.2<>Local search algorithms ... 235
<>7.6.3<>The landscape of random SAT problems ... 236
<>7.7<>Agents Based on Propositional Logic ... 237
<>7.7.1<>The current state of the world ... 237
<>7.7.2<>A hybrid agent ... 241
<>7.7.3<>Logical state estimation ... 241
<>7.7.4<>Making plans by propositional inference ... 244
<>Summary ... 246
<>Bibliographical and Historical Notes ... 247

Chapter 8<>First-Order Logic ... 251

<>8.1<>Representation Revisited ... 251
<>8.1.1<>The language of thought ... 252
<>8.1.2<>Combining the best of formal and natural languages ... 254
<>8.2<>Syntax and Semantics of First-Order Logic ... 256
<>8.2.1<>Models for first-order logic ... 256
<>8.2.2<>Symbols and interpretations ... 257
<>8.2.3<>Terms ... 259
<>8.2.4<>Atomic sentences ... 260
<>8.2.5<>Complex sentences ... 260
<>8.2.6<>Quantifiers ... 260
<>Universal quantification (∀) ... 260
<>Existential quantification (∃) ... 262
<>Nested quantifiers ... 263
<>Connections between ∀ and ∃ ... 263
<>8.2.7<>Equality ... 264
<>8.2.8<>Database semantics ... 264
<>8.3<>Using First-Order Logic ... 265
<>8.3.1<>Assertions and queries in first-order logic ... 265
<>8.3.2<>The kinship domain ... 266
<>8.3.3<>Numbers, sets, and lists ... 268
<>8.3.4<>The wumpus world ... 270
<>8.4<>Knowledge Engineering in First-Order Logic ... 271
<>8.4.1<>The knowledge engineering process ... 272
<>8.4.2<>The electronic circuits domain ... 273
<>Identify the questions ... 273
<>Assemble the relevant knowledge ... 274
<>Decide on a vocabulary ... 274
<>Encode general knowledge of the domain ... 275
<>Encode the specific problem instance ... 276
<>Pose queries to the inference procedure ... 276
<>Debug the knowledge base ... 277
<>Summary ... 277
<>Bibliographical and Historical Notes ... 278

Chapter 9<>Inference in First-Order Logic ... 280

<>9.1<>Propositional vs. First-Order Inference ... 280
<>9.1.1<>Reduction to propositional inference ... 281
<>9.2<>Unification and First-Order Inference ... 282
<>9.2.1<>Unification ... 283
<>9.2.2<>Storage and retrieval ... 284
<>9.3<>Forward Chaining ... 286
<>9.3.1<>First-order definite clauses ... 286
<>9.3.2<>A simple forward-chaining algorithm ... 287
<>9.3.3<>Efficient forward chaining ... 289
<>Matching rules against known facts ... 289
<>Incremental forward chaining ... 291
<>Irrelevant facts ... 292
<>9.4<>Backward Chaining ... 293
<>9.4.1<>A backward-chaining algorithm ... 293
<>9.4.2<>Logic programming ... 294
<>9.4.3<>Redundant inference and infinite loops ... 295
<>9.4.4<>Database semantics of Prolog ... 297
<>9.4.5<>Constraint logic programming ... 298
<>9.5<>Resolution ... 298
<>9.5.1<>Conjunctive normal form for first-order logic ... 299
<>9.5.2<>The resolution inference rule ... 300
<>9.5.3<>Example proofs ... 301
<>9.5.4<>Completeness of resolution ... 303
<>9.5.5<>Equality ... 306
<>9.5.6<>Resolution strategies ... 308
<>Practical uses of resolution theorem provers ... 309
<>Summary ... 309
<>Bibliographical and Historical Notes ... 310

Chapter 10<>Knowledge Representation ... 314

<>10.1<>Ontological Engineering ... 314
<>10.2<>Categories and Objects ... 317
<>10.2.1<>Physical composition ... 318
<>10.2.2<>Measurements ... 319
<>10.2.3<>Objects: Things and stuff ... 321
<>10.3<>Events ... 322
<>10.3.1<>Time ... 324
<>10.3.2<>Fluents and objects ... 325
<>10.4<>Mental Objects and Modal Logic ... 326
<>10.4.1<>Other modal logics ... 328
<>10.5<>Reasoning Systems for Categories ... 329
<>10.5.1<>Semantic networks ... 329
<>10.5.2<>Description logics ... 331
<>10.6<>Reasoning with Default Information ... 333
<>10.6.1<>Circumscription and default logic ... 333
<>10.6.2<>Truth maintenance systems ... 335
<>Summary ... 337
<>Bibliographical and Historical Notes ... 338

Chapter 11<>Automated Planning ... 344

<>11.1<>Definition of Classical Planning ... 344
<>11.1.1<>Example domain: Air cargo transport ... 345
<>11.1.2<>Example domain: The spare tire problem ... 346
<>11.1.3<>Example domain: The blocks world ... 346
<>11.2<>Algorithms for Classical Planning ... 348
<>11.2.1<>Forward state-space search for planning ... 348
<>11.2.2<>Backward search for planning ... 350
<>11.2.3<>Planning as Boolean satisfiability ... 351
<>11.2.4<>Other classical planning approaches ... 352
<>11.3<>Heuristics for Planning ... 353
<>11.3.1<>Domain-independent pruning ... 354
<>11.3.2<>State abstraction in planning ... 355
<>11.4<>Hierarchical Planning ... 356
<>11.4.1<>High-level actions ... 357
<>11.4.2<>Searching for primitive solutions ... 358
<>11.4.3<>Searching for abstract solutions ... 360
<>11.5<>Planning and Acting in Nondeterministic Domains ... 365
<>11.5.1<>Sensorless planning ... 367
<>11.5.2<>Contingent planning ... 370
<>11.5.3<>Online planning ... 371
<>11.6<>Time, Schedules, and Resources ... 374
<>11.6.1<>Representing temporal and resource constraints ... 375
<>11.6.2<>Solving scheduling problems ... 376
<>11.7<>Analysis of Planning Approaches ... 378
<>Summary ... 379
<>Bibliographical and Historical Notes ... 380

Part IV: Uncertain knowledge and reasoning

Chapter 12<>Quantifying Uncertainty ... 385

<>12.1<>Acting under Uncertainty ... 385
<>12.1.1<>Summarizing uncertainty ... 386
<>12.1.2<>Uncertainty and rational decisions ... 387
<>12.2<>Basic Probability Notation ... 388
<>12.2.1<>What probabilities are about ... 388
<>12.2.2<>The language of propositions in probability assertions ... 390
<>12.2.3<>Probability axioms and their reasonableness ... 393
<>12.3<>Inference Using Full Joint Distributions ... 395
<>12.4<>Independence ... 397
<>12.5<>Bayes' Rule and Its Use ... 399
<>12.5.1<>Applying Bayes' rule: The simple case ... 399
<>12.5.2<>Using Bayes' rule: Combining evidence ... 400
<>12.6<>Naive Bayes Models ... 402
<>12.6.1<>Text classification with naive Bayes ... 403
<>12.7<>The Wumpus World Revisited ... 404
<>Summary ... 407
<>Bibliographical and Historical Notes ... 408

Chapter 13<>Probabilistic Reasoning ... 412

<>13.1<>Representing Knowledge in an Uncertain Domain ... 412
<>13.2<>The Semantics of Bayesian Networks ... 414
<>A method for constructing Bayesian networks ... 415
<>Compactness and node ordering ... 417
<>13.2.1<>Conditional independence relations in Bayesian networks ... 418
<>13.2.2<>Efficient Representation of Conditional Distributions ... 420
<>13.2.3<>Bayesian nets with continuous variables ... 422
<>13.2.4<>Case study: Car insurance ... 424
<>13.3<>Exact Inference in Bayesian Networks ... 427
<>13.3.1<>Inference by enumeration ... 427
<>13.3.2<>The variable elimination algorithm ... 430
<>Operations on factors ... 431
<>Variable ordering and variable relevance ... 432
<>13.3.3<>The complexity of exact inference ... 433
<>13.3.4<>Clustering algorithms ... 434
<>13.4<>Approximate Inference for Bayesian Networks ... 435
<>13.4.1<>Direct sampling methods ... 436
<>Rejection sampling in Bayesian networks ... 437
<>Importance sampling ... 439
<>13.4.2<>Inference by Markov chain simulation ... 441
<>Gibbs sampling in Bayesian networks ... 442
<>Analysis of Markov chains ... 443
<>Why Gibbs sampling works ... 445
<>Complexity of Gibbs sampling ... 446
<>Metropolis--Hastings sampling ... 447
<>13.4.3<>Compiling approximate inference ... 448
<>13.5<>Causal Networks ... 449
<>13.5.1<>Representing actions: do-operator ... 451
<>13.5.2<>The back-door criterion ... 453
<>Summary ... 453
<>Bibliographical and Historical Notes ... 454

Chapter 14<>Probabilistic Reasoning over Time ... 461

<>14.1<>Time and Uncertainty ... 461
<>14.1.1<>States and observations ... 462
<>14.1.2<>Transition and sensor models ... 463
<>14.2<>Inference in Temporal Models ... 465
<>14.2.1<>Filtering and prediction ... 466
<>14.2.2<>Smoothing ... 468
<>14.2.3<>Finding the most likely sequence ... 471
<>14.3<>Hidden Markov Models ... 473
<>14.3.1<>Simplified matrix algorithms ... 474
<>14.3.2<>Hidden Markov model example: Localization ... 476
<>14.4<>Kalman Filters ... 479
<>14.4.1<>Updating Gaussian distributions ... 479
<>14.4.2<>A simple one-dimensional example ... 480
<>14.4.3<>The general case ... 482
<>14.4.4<>Applicability of Kalman filtering ... 483
<>14.5<>Dynamic Bayesian Networks ... 485
<>14.5.1<>Constructing DBNs ... 486
<>14.5.2<>Exact inference in DBNs ... 489
<>14.5.3<>Approximate inference in DBNs ... 491
<>Summary ... 496
<>Bibliographical and Historical Notes ... 497

Chapter 15<>Probabilistic Programming ... 500

<>15.1<>Relational Probability Models ... 501
<>15.1.1<>Syntax and semantics ... 502
<>15.1.2<>Example: Rating player skill levels ... 505
<>15.1.3<>Inference in relational probability models ... 506
<>15.2<>Open-Universe Probability Models ... 507
<>15.2.1<>Syntax and semantics ... 508
<>15.2.2<>Inference in open-universe probability models ... 510
<>15.2.3<>Examples ... 511
<>Citation matching ... 511
<>Nuclear treaty monitoring ... 512
<>15.3<>Keeping Track of a Complex World ... 514
<>15.3.1<>Example: Multitarget tracking ... 515
<>15.3.2<>Example: Traffic monitoring ... 518
<>15.4<>Programs as Probability Models ... 519
<>15.4.1<>Example: Reading text ... 519
<>15.4.2<>Syntax and semantics ... 520
<>15.4.3<>Inference results ... 522
<>15.4.4<>Improving the generative program to incorporate a Markov model ... 522
<>15.4.5<>Inference in generative programs ... 522
<>Summary ... 523
<>Bibliographical and Historical Notes ... 524

Chapter 16<>Making Simple Decisions ... 528

<>16.1<>Combining Beliefs and Desires under Uncertainty ... 528
<>16.2<>The Basis of Utility Theory ... 529
<>16.2.1<>Constraints on rational preferences ... 530
<>16.2.2<>Rational preferences lead to utility ... 531
<>16.3<>Utility Functions ... 532
<>16.3.1<>Utility assessment and utility scales ... 533
<>16.3.2<>The utility of money ... 534
<>16.3.3<>Expected utility and post-decision disappointment ... 536
<>16.3.4<>Human judgment and irrationality ... 538
<>16.4<>Multiattribute Utility Functions ... 540
<>16.4.1<>Dominance ... 540
<>16.4.2<>Preference structure and multiattribute utility ... 543
<>Preferences without uncertainty ... 543
<>Preferences with uncertainty ... 544
<>16.5<>Decision Networks ... 544
<>16.5.1<>Representing a decision problem with a decision network ... 545
<>16.5.2<>Evaluating decision networks ... 546
<>16.6<>The Value of Information ... 547
<>16.6.1<>A simple example ... 547
<>16.6.2<>A general formula for perfect information ... 548
<>16.6.3<>Properties of the value of information ... 549
<>16.6.4<>Implementation of an information-gathering agent ... 550
<>16.6.5<>Nonmyopic information gathering ... 551
<>16.6.6<>Sensitivity analysis and robust decisions ... 552
<>16.7<>Unknown Preferences ... 553
<>16.7.1<>Uncertainty about one's own preferences ... 553
<>16.7.2<>Deference to humans ... 554
<>Summary ... 557
<>Bibliographical and Historical Notes ... 557

Chapter 17<>Making Complex Decisions ... 562

<>17.1<>Sequential Decision Problems ... 562
<>17.1.1<>Utilities over time ... 564
<>17.1.2<>Optimal policies and the utilities of states ... 567
<>17.1.3<>Reward scales ... 569
<>17.1.4<>Representing MDPs ... 570
<>17.2<>Algorithms for MDPs ... 572
<>17.2.1<>Value Iteration ... 572
<>Convergence of value iteration ... 574
<>17.2.2<>Policy iteration ... 576
<>17.2.3<>Linear programming ... 578
<>17.2.4<>Online algorithms for MDPs ... 578
<>17.3<>Bandit Problems ... 581
<>17.3.1<>Calculating the Gittins index ... 583
<>17.3.2<>The Bernoulli bandit ... 584
<>17.3.3<>Approximately optimal bandit policies ... 585
<>17.3.4<>Non-indexable variants ... 586
<>17.4<>Partially Observable MDPs ... 588
<>17.4.1<>Definition of POMDPs ... 588
<>17.5<>Algorithms for Solving POMDPs ... 590
<>17.5.1<>Value iteration for POMDPs ... 590
<>17.5.2<>Online algorithms for POMDPs ... 593
<>Summary ... 595
<>Bibliographical and Historical Notes ... 596

Chapter 18<>Multiagent Decision Making ... 599

<>18.1<>Properties of Multiagent Environments ... 599
<>18.1.1<>One decision maker ... 599
<>18.1.2<>Multiple decision makers ... 600
<>18.1.3<>Multiagent planning ... 601
<>18.1.4<>Planning with multiple agents: Cooperation and coordination ... 604
<>18.2<>Non-Cooperative Game Theory ... 605
<>18.2.1<>Games with a single move: Normal form games ... 605
<>18.2.2<>Social welfare ... 609
<>Computing equilibria ... 610
<>18.2.3<>Repeated games ... 614
<>18.2.4<>Sequential games: The extensive form ... 617
<>Chance and simultaneous moves ... 619
<>Capturing imperfect information ... 619
<>18.2.5<>Uncertain payoffs and assistance games ... 623
<>18.3<>Cooperative Game Theory ... 626
<>18.3.1<>Coalition structures and outcomes ... 626
<>18.3.2<>Strategy in cooperative games ... 627
<>18.3.3<>Computation in cooperative games ... 630
<>Marginal contribution nets ... 630
<>Coalition structures for maximum social welfare ... 631
<>18.4<>Making Collective Decisions ... 632
<>18.4.1<>Allocating tasks with the contract net ... 632
<>18.4.2<>Allocating scarce resources with auctions ... 634
<>Common goods ... 637
<>18.4.3<>Voting ... 638
<>Strategic manipulation ... 641
<>18.4.4<>Bargaining ... 641
<>Bargaining with the alternating offers protocol ... 641
<>Impatient agents ... 642
<>Negotiation in task-oriented domains ... 643
<>The monotonic concession protocol ... 644
<>The Zeuthen strategy ... 644
<>Summary ... 645
<>Bibliographical and Historical Notes ... 646

Part V: Machine Learning

Chapter 19<>Learning from Examples ... 651

<>19.1<>Forms of Learning ... 651
<>19.2<>Supervised Learning ... 653
<>19.2.1<>Example problem: Restaurant waiting ... 656
<>19.3<>Learning Decision Trees ... 657
<>19.3.1<>Expressiveness of decision trees ... 657
<>19.3.2<>Learning decision trees from examples ... 658
<>19.3.3<>Choosing attribute tests ... 661
<>19.3.4<>Generalization and overfitting ... 663
<>19.3.5<>Broadening the applicability of decision trees ... 664
<>19.4<>Model Selection and Optimization ... 665
<>19.4.1<>Model selection ... 667
<>19.4.2<>From error rates to loss ... 669
<>19.4.3<>Regularization ... 671
<>19.4.4<>Hyperparameter tuning ... 671
<>19.5<>The Theory of Learning ... 672
<>19.5.1<>PAC learning example: Learning decision lists ... 674
<>19.6<>Linear Regression and Classification ... 676
<>19.6.1<>Univariate linear regression ... 676
<>19.6.2<>Gradient descent ... 677
<>19.6.3<>Multivariable linear regression ... 679
<>19.6.4<>Linear classifiers with a hard threshold ... 682
<>19.6.5<>Linear classification with logistic regression ... 684
<>19.7<>Nonparametric Models ... 686
<>19.7.1<>Nearest-neighbor models ... 687
<>19.7.2<>Finding nearest neighbors with k-d trees ... 688
<>19.7.3<>Locality-sensitive hashing ... 689
<>19.7.4<>Nonparametric regression ... 691
<>19.7.5<>Support vector machines ... 692
<>19.7.6<>The kernel trick ... 695
<>19.8<>Ensemble Learning ... 696
<>19.8.1<>Bagging ... 697
<>19.8.2<>Random forests ... 697
<>19.8.3<>Stacking ... 699
<>19.8.4<>Boosting ... 699
<>19.8.5<>Gradient boosting ... 701
<>19.8.6<>Online learning ... 702
<>19.9<>Developing Machine Learning Systems ... 704
<>19.9.1<>Problem formulation ... 704
<>19.9.2<>Data collection, assessment, and management ... 705
<>Feature engineering ... 707
<>Exploratory data analysis and visualization ... 708
<>19.9.3<>Model selection and training ... 709
<>19.9.4<>Trust, interpretability, and explainability ... 710
<>19.9.5<>Operation, monitoring, and maintenance ... 712
<>Summary ... 714
<>Bibliographical and Historical Notes ... 715

Chapter 20<>Learning Probabilistic Models ... 721

<>20.1<>Statistical Learning ... 721
<>20.2<>Learning with Complete Data ... 724
<>20.2.1<>Maximum-likelihood parameter learning: Discrete models ... 725
<>20.2.2<>Naive Bayes models ... 727
<>20.2.3<>Generative and discriminative models ... 727
<>20.2.4<>Maximum-likelihood parameter learning: Continuous models ... 728
<>20.2.5<>Bayesian parameter learning ... 730
<>20.2.6<>Bayesian linear regression ... 732
<>20.2.7<>Learning Bayes net structures ... 734
<>20.2.8<>Density estimation with nonparametric models ... 736
<>20.3<>Learning with Hidden Variables: The EM Algorithm ... 737
<>20.3.1<>Unsupervised clustering: Learning mixtures of Gaussians ... 738
<>20.3.2<>Learning Bayes net parameter values for hidden variables ... 741
<>20.3.3<>Learning hidden Markov models ... 744
<>20.3.4<>The general form of the EM algorithm ... 744
<>20.3.5<>Learning Bayes net structures with hidden variables ... 745
<>Summary ... 746
<>Bibliographical and Historical Notes ... 747

Chapter 21<>Deep Learning ... 750

<>21.1<>Simple Feedforward Networks ... 751
<>21.1.1<>Networks as complex functions ... 751
<>21.1.2<>Gradients and learning ... 754
<>21.2<>Computation Graphs for Deep Learning ... 756
<>21.2.1<>Input encoding ... 756
<>21.2.2<>Output layers and loss functions ... 757
<>21.2.3<>Hidden layers ... 759
<>21.3<>Convolutional Networks ... 760
<>21.3.1<>Pooling and downsampling ... 762
<>21.3.2<>Tensor operations in CNNs ... 763
<>21.3.3<>Residual networks ... 764
<>21.4<>Learning Algorithms ... 765
<>21.4.1<>Computing gradients in computation graphs ... 766
<>21.4.2<>Batch normalization ... 768
<>21.5<>Generalization ... 768
<>21.5.1<>Choosing a network architecture ... 768
<>21.5.2<>Neural architecture search ... 770
<>21.5.3<>Weight decay ... 771
<>21.5.4<>Dropout ... 772
<>21.6<>Recurrent Neural Networks ... 772
<>21.6.1<>Training a basic RNN ... 773
<>21.6.2<>Long short-term memory RNNs ... 775
<>21.7<>Unsupervised Learning and Transfer Learning ... 775
<>21.7.1<>Unsupervised learning ... 776
<>Probabilistic PCA: A simple generative model ... 776
<>Autoencoders ... 778
<>Deep autoregressive models ... 779
<>Generative adversarial networks ... 780
<>Unsupervised translation ... 780
<>21.7.2<>Transfer learning and multitask learning ... 781
<>21.8<>Applications ... 782
<>21.8.1<>Vision ... 782
<>21.8.2<>Natural language processing ... 783
<>21.8.3<>Reinforcement learning ... 783
<>Summary ... 784
<>Bibliographical and Historical Notes ... 785

Chapter 22<>Reinforcement Learning ... 789

<>22.1<>Learning from Rewards ... 789
<>22.2<>Passive Reinforcement Learning ... 791
<>22.2.1<>Direct utility estimation ... 792
<>22.2.2<>Adaptive dynamic programming ... 793
<>22.2.3<>Temporal-difference learning ... 793
<>22.3<>Active Reinforcement Learning ... 797
<>22.3.1<>Exploration ... 797
<>22.3.2<>Safe exploration ... 799
<>22.3.3<>Temporal-difference Q-learning ... 801
<>22.4<>Generalization in Reinforcement Learning ... 803
<>22.4.1<>Approximating direct utility estimation ... 804
<>22.4.2<>Approximating temporal-difference learning ... 805
<>22.4.3<>Deep reinforcement learning ... 806
<>22.4.4<>Reward shaping ... 807
<>22.4.5<>Hierarchical reinforcement learning ... 807
<>22.5<>Policy Search ... 810
<>22.6<>Apprenticeship and Inverse Reinforcement Learning ... 812
<>22.7<>Applications of Reinforcement Learning ... 815
<>22.7.1<>Applications in game playing ... 815
<>22.7.2<>Application to robot control ... 816
<>Summary ... 818
<>Bibliographical and Historical Notes ... 819

Part VI: Communicating, perceiving, and acting

Chapter 23<>Natural Language Processing ... 823

<>23.1<>Language Models ... 823
<>23.1.1<>The bag-of-words model ... 824
<>23.1.2<>N-gram word models ... 826
<>23.1.3<>Other n-gram models ... 826
<>23.1.4<>Smoothing n-gram models ... 827
<>23.1.5<>Word representations ... 828
<>23.1.6<>Part-of-speech (POS) tagging ... 829
<>23.1.7<>Comparing language models ... 832
<>23.2<>Grammar ... 833
<>23.2.1<>The lexicon of E0 ... 835
<>23.3<>Parsing ... 835
<>23.3.1<>Dependency parsing ... 838
<>23.3.2<>Learning a parser from examples ... 839
<>23.4<>Augmented Grammars ... 841
<>23.4.1<>Semantic interpretation ... 843
<>23.4.2<>Learning semantic grammars ... 845
<>23.5<>Complications of Real Natural Language ... 845
<>23.6<>Natural Language Tasks ... 849
<>Summary ... 850
<>Bibliographical and Historical Notes ... 851

Chapter 24<>Deep Learning for Natural Language Processing ... 856

<>24.1<>Word Embeddings ... 856
<>24.2<>Recurrent Neural Networks for NLP ... 860
<>24.2.1<>Language models with recurrent neural networks ... 860
<>24.2.2<>Classification with recurrent neural networks ... 862
<>24.2.3<>LSTMs for NLP tasks ... 863
<>24.3<>Sequence-to-Sequence Models ... 864
<>24.3.1<>Attention ... 865
<>24.3.2<>Decoding ... 867
<>24.4<>The Transformer Architecture ... 868
<>24.4.1<>Self-attention ... 868
<>24.4.2<>From self-attention to transformer ... 869
<>24.5<>Pretraining and Transfer Learning ... 871
<>24.5.1<>Pretrained word embeddings ... 871
<>24.5.2<>Pretrained contextual representations ... 873
<>24.5.3<>Masked language models ... 873
<>24.6<>State of the art ... 875
<>Summary ... 878
<>Bibliographical and Historical Notes ... 878

Chapter 25<>Computer Vision ... 881

<>25.1<>Introduction ... 881
<>25.2<>Image Formation ... 882
<>25.2.1<>Images without lenses: The pinhole camera ... 882
<>25.2.2<>Lens systems ... 884
<>25.2.3<>Scaled orthographic projection ... 885
<>25.2.4<>Light and shading ... 886
<>25.2.5<>Color ... 888
<>25.3<>Simple Image Features ... 888
<>25.3.1<>Edges ... 889
<>25.3.2<>Texture ... 892
<>25.3.3<>Optical flow ... 893
<>25.3.4<>Segmentation of natural images ... 894
<>25.4<>Classifying Images ... 895
<>25.4.1<>Image classification with convolutional neural networks ... 896
<>25.4.2<>Why convolutional neural networks classify images well ... 897
<>25.5<>Detecting Objects ... 899
<>25.6<>The 3D World ... 901
<>25.6.1<>3D cues from multiple views ... 901
<>25.6.2<>Binocular stereopsis ... 902
<>25.6.3<>3D cues from a moving camera ... 904
<>25.6.4<>3D cues from one view ... 905
<>25.7<>Using Computer Vision ... 906
<>25.7.1<>Understanding what people are doing ... 906
<>25.7.2<>Linking pictures and words ... 909
<>25.7.3<>Reconstruction from many views ... 911
<>25.7.4<>Geometry from a single view ... 911
<>25.7.5<>Making pictures ... 913
<>25.7.6<>Controlling movement with vision ... 917
<>Summary ... 919
<>Bibliographical and Historical Notes ... 920

Chapter 26<>Robotics ... 925

<>26.1<>Robots ... 925
<>26.2<>Robot Hardware ... 926
<>26.2.1<>Types of robots from the hardware perspective ... 926
<>26.2.2<>Sensing the world ... 927
<>26.2.3<>Producing motion ... 929
<>26.3<>What kind of problem is robotics solving? ... 930
<>26.4<>Robotic Perception ... 931
<>26.4.1<>Localization and mapping ... 932
<>26.4.2<>Other types of perception ... 935
<>26.4.3<>Supervised and unsupervised learning in robot perception ... 937
<>26.5<>Planning and Control ... 938
<>26.5.1<>Configuration space ... 939
<>26.5.2<>Motion planning ... 942
<>Visibility graphs ... 943
<>Voronoi diagrams ... 943
<>Cell decomposition ... 945
<>Randomized motion planning ... 946
<>Rapidly-exploring random trees ... 947
<>Trajectory optimization for kinematic planning ... 948
<>26.5.3<>Trajectory tracking control ... 950
<>Plans versus policies ... 954
<>26.5.4<>Optimal control ... 954
<>26.6<>Planning Uncertain Movements ... 956
<>26.7<>Reinforcement Learning in Robotics ... 958
<>26.7.1<>Exploiting models ... 959
<>26.7.2<>Exploiting other information ... 960
<>26.8<>Humans and Robots ... 961
<>26.8.1<>Coordination ... 961
<>Humans as approximately rational agents ... 961
<>Predicting human action ... 962
<>Human predictions about the robot ... 964
<>Humans as black box agents ... 965
<>26.8.2<>Learning to do what humans want ... 965
<>Preference learning: Learning cost functions ... 965
<>Learning policies directly via imitation ... 966
<>26.9<>Alternative Robotic Frameworks ... 968
<>26.9.1<>Reactive controllers ... 968
<>26.9.2<>Subsumption architectures ... 969
<>26.10<>Application Domains ... 971
<>Summary ... 974
<>Bibliographical and Historical Notes ... 975

Part VII: Conclusions

Chapter 27<>Philosophy, Ethics, and Safety of AI ... 981

<>27.1<>The Limits of AI ... 981
<>27.1.1<>The argument from informality ... 981
<>27.1.2<>The argument from disability ... 982
<>27.1.3<>The mathematical objection ... 983
<>27.1.4<>Measuring AI ... 984
<>27.2<>Can Machines Really Think? ... 984
<>27.2.1<>The Chinese room ... 985
<>27.2.2<>Consciousness and qualia ... 985
<>27.3<>The Ethics of AI ... 986
<>27.3.1<>Lethal autonomous weapons ... 987
<>27.3.2<>Surveillance, security, and privacy ... 990
<>27.3.3<>Fairness and bias ... 992
<>27.3.4<>Trust and transparency ... 996
<>27.3.5<>The future of work ... 998
<>27.3.6<>Robot rights ... 1000
<>27.3.7<>AI Safety ... 1001
<>Summary ... 1005
<>Bibliographical and Historical Notes ... 1006

Chapter 28<>The Future of AI ... 1012

<>28.1<>AI Components ... 1012
<>Sensors and actuators ... 1012
<>Representing the state of the world ... 1013
<>Selecting actions ... 1014
<>Deciding what we want ... 1014
<>Learning ... 1015
<>Resources ... 1017
<>28.2<>AI Architectures ... 1018
<>General AI ... 1020
<>AI engineering ... 1021
<>The future ... 1022

Appendix A:<>Mathematical Background ... 1023

<>A.1<>Complexity Analysis and O() Notation ... 1023
<>A.1.1<>Asymptotic analysis ... 1023
<>A.1.2<>NP and inherently hard problems ... 1024
<>A.2<>Vectors, Matrices, and Linear Algebra ... 1025
<>A.3<>Probability Distributions ... 1027
<>Bibliographical and Historical Notes ... 1029

Appendix B:<>Notes on Languages and Algorithms ... 1030

<>B.1<>Defining Languages with Backus--Naur Form (BNF) ... 1030
<>B.2<>Describing Algorithms with Pseudocode ... 1031
<>B.3<>Online Supplemental Material ... 1032

Bibliography ... 1033

Index ... 1089

<