**IAS/Park City Mathematics Series**

Volume: 25; 2018; 325 pp; Hardcover

MSC: Primary 15; 52; 60; 62; 65; 68; 90

**Print ISBN: 978-1-4704-3575-2
Product Code: PCMS/25**

List Price: $104.00

AMS Member Price: $83.20

MAA Member Price: $93.60

**Electronic ISBN: 978-1-4704-4990-2
Product Code: PCMS/25.E**

List Price: $104.00

AMS Member Price: $83.20

MAA Member Price: $93.60


# The Mathematics of Data

*Edited by*
*Michael W. Mahoney; John C. Duchi; Anna C. Gilbert*

A co-publication of the AMS, IAS/Park City Mathematics Institute, and Society for Industrial and Applied Mathematics

Data science is a highly interdisciplinary field, incorporating ideas
from applied mathematics, statistics, probability, and computer
science, as well as many other areas. This book gives an introduction
to the mathematical methods that form the foundations of machine
learning and data science, presented by leading experts in computer
science, statistics, and applied mathematics. Although the chapters
can be read independently, they are designed to be read together as
they lay out algorithmic, statistical, and numerical approaches in
diverse but complementary ways.

This book can be used both as a text for advanced undergraduate and
beginning graduate courses, and as a survey for researchers interested
in understanding how applied mathematics, broadly defined, is being used
in data science. It will appeal to anyone interested in the
interdisciplinary foundations of machine learning and data science.

Titles in this series are co-published with the Institute for Advanced Study/Park City Mathematics Institute.

#### Readership

Graduate students and researchers interested in the applied mathematics of data.

#### Reviews & Endorsements

What should you expect from a book titled 'The Mathematics of Data'? Nearly anything. There are numerous elementary books with similar titles that don't go far beyond showing the reader how to compute the standard deviation. But what if you saw that the book was published by AMS and SIAM? That changes everything. You know it won't be elementary, and it will probably be high quality, which is indeed the case here.

-- John D. Cook, MAA Reviews

# Table of Contents



- Cover
- Title page
- Preface
- Introduction
- Lectures on Randomized Numerical Linear Algebra
  - Introduction
  - Linear Algebra
  - Discrete Probability
    - Random experiments: basics.
    - Properties of events.
    - The union bound.
    - Disjoint events and independent events.
    - Conditional probability.
    - Random variables.
    - Probability mass function and cumulative distribution function.
    - Independent random variables.
    - Expectation of a random variable.
    - Variance of a random variable.
    - Markov’s inequality.
    - The Coupon Collector Problem.
    - References.
  - Randomized Matrix Multiplication
  - RandNLA Approaches for Regression Problems
  - A RandNLA Algorithm for Low-rank Matrix Approximation
- Optimization Algorithms for Data Analysis
  - Introduction
  - Optimization Formulations of Data Analysis Problems
    - Setup
    - Least Squares
    - Matrix Completion
    - Nonnegative Matrix Factorization
    - Sparse Inverse Covariance Estimation
    - Sparse Principal Components
    - Sparse Plus Low-Rank Matrix Decomposition
    - Subspace Identification
    - Support Vector Machines
    - Logistic Regression
    - Deep Learning
  - Preliminaries
  - Gradient Methods
  - Prox-Gradient Methods
  - Accelerating Gradient Methods
  - Newton Methods
  - Conclusions
- Introductory Lectures on Stochastic Optimization
- Randomized Methods for Matrix Computations
  - Introduction
  - Notation
  - A two-stage approach
  - A randomized algorithm for “Stage A”: the range finding problem
  - Single pass algorithms
  - A method with complexity O(mn log k) for general dense matrices
  - Theoretical performance bounds
  - An accuracy enhanced randomized scheme
  - The Nyström method for positive symmetric definite matrices
  - Randomized algorithms for computing Interpolatory Decompositions
  - Randomized algorithms for computing the CUR decomposition
  - Adaptive rank determination with updating of the matrix
  - Adaptive rank determination without updating the matrix
  - Randomized algorithms for computing a rank-revealing QR decomposition
  - A strongly rank-revealing UTV decomposition
- Four Lectures on Probabilistic Methods for Data Science
- Homological Algebra and Data
