Gilbert Strang’s most recent textbook is remarkable on several fronts. Published in 2019, it contains the key linear algebra and optimization techniques at the forefront of data-science and machine-learning practice today. This is an appropriate choice of content: while state-of-the-art machine learning applications can change each month (as in reinforcement learning, language translation, game playing, or image classification), the underlying mathematical concepts and algorithms do not. Some topics, such as numerical algorithms for various tensor problems, are so recent that they are only now being presented as textbook material.

The book is an offspring of a current Massachusetts Institute of Technology (MIT) mathematics course, Matrix Methods in Data Analysis and Signal Processing, which, in turn, was greatly influenced by a current University of Michigan electrical engineering and computer science course, Matrix Methods for Signal Processing, Data Analysis, and Machine Learning. Both courses are aimed at advanced undergraduate and graduate students in science and engineering. Undergraduate linear algebra is an explicit prerequisite for the MIT course because the basic concepts of linear algebra (such as vector spaces, subspaces, independence, bases, and linear operators) are assumed to be known. As a result, this is not a text for learning linear algebra: although the first 100 pages review many basic matrix concepts, they do not constitute a stand-alone introduction. Moreover, it is not clear how much classical linear algebra is really needed to gain exposure to, and intuition about, modern machine learning techniques. Perhaps the book would be more accurately described as using matrix methods rather than linear algebra, as the title states.

Several aspects of the book stand out. First, the breadth of material covered is impressive.
Discussions of tensor constructs and algorithms, compressed sensing and sketching, the least absolute shrinkage and selection operator (LASSO), graph Laplacians, stochastic gradient descent, and convolutional neural networks (among other topics) are included. Second, the Hemingway-esque writing style, consisting of short sentences, unadorned prose, and conversational vocabulary, is a welcome relief from dry, formal, theorem-proof mathematical writing. Third, the pedagogical exposition is excellent. The author’s previous experience writing outstanding textbooks and monographs (and then actually teaching his courses from them) is apparent. Strang notes what is important and why, where to be careful, and how the key ideas came about. He also uses simple examples (not generalities) to introduce and illustrate concepts. Finally, the book is teeming with insights into the author’s singular curiosity and drive to understand. It is clear that Strang embraces new ideas and trends, seeking to understand and explain them in terms of foundational computations and mathematics.
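To give a flavor of one topic on that list, here is a minimal sketch of stochastic gradient descent on a small least-squares problem. This is the reviewer-style illustration in the spirit of the book's simple examples, not an excerpt from it: each update follows the gradient of a single randomly chosen sample rather than the full loss.

```python
import numpy as np

# Fit w in y = X @ w by stochastic gradient descent:
# each step uses the gradient of one random sample's squared error.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                          # noiseless targets for illustration

w = np.zeros(3)
lr = 0.05                               # constant step size
for step in range(5000):
    i = rng.integers(len(X))            # pick one sample at random
    grad = (X[i] @ w - y[i]) * X[i]     # gradient of 0.5 * (x_i . w - y_i)**2
    w -= lr * grad                      # descend along that noisy gradient

# w converges toward w_true despite never seeing the full gradient
```

On a consistent system like this, the single-sample updates behave much like randomized projections and home in on the true coefficients; with noisy data one would instead decay the step size.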