A Next Big Idea Club Must-Read Title for July
“Why Machines Learn, by the award-winning science writer Anil Ananthaswamy, takes the reader on an entertaining journey into the mind of a machine… [The book] demystifies the underlying mechanisms behind machine learning, which may possibly lead to a better understanding of the learning process itself and the development of improved AI.”
—Physics World
“A skillful primer makes sense of the mathematics beneath AI’s hood.”
—New Scientist
“Whether Ananthaswamy is talking of ML algorithms or manipulation of matrices, he maintains a lightness of language and invokes historical accounts to advance a compelling narrative… A must-read for anyone who is curious to understand ‘the elegant math behind modern AI’ [and] an inspirational guide for teachers of math and mathematical sciences who can adopt these techniques and methods to make classrooms lively.”
—Shaastra, IIT-Madras
“Some books about the development of neural networks describe the underlying mathematics while others describe the social history. This book presents the mathematics in the context of the social history. It is a masterpiece. The author is very good at explaining the mathematics in a way that makes it available to people with only a rudimentary knowledge of the field, but he is also a very good writer who brings the social history to life.”
—Geoffrey Hinton, deep learning pioneer, Turing Award winner, former VP at Google, and Professor Emeritus at University of Toronto
“After just a few minutes of reading Why Machines Learn, you’ll feel your own synaptic weights getting updated. By the end you will have achieved your own version of deep learning—with deep pleasure and insight along the way.”
—Steven Strogatz, New York Times bestselling author of Infinite Powers and professor of mathematics at Cornell University
“If you were looking for a way to make sense of the AI revolution that is well underway, look no further. With this comprehensive yet engaging book, Anil Ananthaswamy puts it all into context, from the origin of the idea and its governing equations to its potential to transform medicine, quantum physics—and virtually every aspect of our life. An essential read for understanding both the possibilities and limitations of artificial intelligence.”
—Sabine Hossenfelder, physicist and New York Times bestselling author of Existential Physics: A Scientist’s Guide to Life’s Biggest Questions
“Why Machines Learn is a masterful work that explains—in clear, accessible, and entertaining fashion—the mathematics underlying modern machine learning, along with the colorful history of the field and its pioneering researchers. As AI has increasingly profound impacts in our world, this book will be an invaluable companion for anyone who wants a deep understanding of what’s under the hood of these often inscrutable machines.”
—Melanie Mitchell, author of Artificial Intelligence and Professor at the Santa Fe Institute
“Generative AI, with its foundations in machine learning, is as fundamental an advance as the creation of the microprocessor, the Internet, and the mobile phone. But almost no one, outside of a handful of specialists, understands how it works. Anil Ananthaswamy has removed the mystery by giving us a gentle, intuitive, and human-oriented introduction to the math that underpins this revolutionary development.”
—Peter E. Hart, AI pioneer, entrepreneur, and co-author of Pattern Classification
“Anil Ananthaswamy’s Why Machines Learn embarks on an exhilarating journey through the origins of contemporary machine learning. With a captivating narrative, the book delves into the lives of influential figures driving the AI revolution while simultaneously exploring the intricate mathematical formalism that underpins it. As Anil traces the roots and unravels the mysteries of modern AI, he gently introduces the underlying mathematics, rendering the complex subject matter accessible and exciting for readers of all backgrounds.”
—Björn Ommer, Professor at the Ludwig Maximilian University of Munich and leader of the original team behind Stable Diffusion
“An inspiring introduction to the mathematics of AI.”
—Arthur I. Miller, author of The Artist in the Machine: The World of AI-Powered Creativity
“[An] illuminating overview of how machine learning works.”
—Kirkus Reviews
Excerpt. © Reprinted by permission. All rights reserved.
Chapter 1
Desperately Seeking Patterns
When he was a child, the Austrian scientist Konrad Lorenz, enamored by tales from a book called The Wonderful Adventures of Nils (the story of a boy’s adventures with wild geese, written by the Swedish novelist and winner of the Nobel Prize for Literature, Selma Lagerlöf), “yearned to become a wild goose.” Unable to indulge his fantasy, the young Lorenz settled for taking care of a day-old duckling his neighbor gave him. To the boy’s delight, the duckling began following him around: It had imprinted on him. “Imprinting” refers to the ability of many animals, including baby ducks and geese (goslings), to form bonds with the first moving thing they see upon hatching. Lorenz would go on to become an ethologist and would pioneer studies in the field of animal behavior, particularly imprinting. (He got ducklings to imprint on him; they followed him around as he walked, ran, swam, and even paddled away in a canoe.) He won the Nobel Prize for Physiology or Medicine in 1973, jointly with fellow ethologists Karl von Frisch and Nikolaas Tinbergen. The three were celebrated “for their discoveries concerning organization and elicitation of individual and social behavior patterns.”
Patterns. While the ethologists were discerning them in the behavior of animals, the animals were detecting patterns of their own. Newly hatched ducklings must have the ability to make out or tell apart the properties of things they see moving around them. It turns out that ducklings can imprint not just on the first living creature they see moving, but on inanimate things as well. Mallard ducklings, for example, can imprint on a pair of moving objects that are similar in shape or color. Specifically, they imprint on the relational concept embodied by the objects. So, if upon birth the ducklings see two moving red objects, they will later follow two objects of the same color (even if those latter objects are blue, not red), but not two objects of different colors. In this case, the ducklings imprint on the idea of similarity. They also show the ability to discern dissimilarity. If the first moving objects the ducklings see are, for example, a cube and a rectangular prism, they will recognize that the objects have different shapes and will later follow two objects that are different in shape (a pyramid and a cone, for example), but they will ignore two objects that have the same shape.
Ponder this for a moment. Newborn ducklings, with the briefest of exposure to sensory stimuli, detect patterns in what they see, form abstract notions of similarity/dissimilarity, and then will recognize those abstractions in stimuli they see later and act upon them. Artificial intelligence researchers would offer an arm and a leg to know just how the ducklings pull this off.
While today’s AI is far from being able to perform such tasks with the ease and efficiency of ducklings, it does have something in common with the ducklings, and that’s the ability to pick out and learn about patterns in data. When Frank Rosenblatt invented the perceptron in the late 1950s, one reason it made such a splash was that it was the first formidable “brain-inspired” algorithm that could learn about patterns in data simply by examining the data. Most important, given certain assumptions about the data, researchers proved that Rosenblatt’s perceptron will always find the pattern hidden in the data in a finite amount of time; or, put differently, the perceptron will converge upon a solution without fail. Such certainties in computing are like gold dust. No wonder the perceptron learning algorithm created such a fuss.
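For readers who like to see an idea run, here is a minimal Python sketch of the perceptron learning rule. It is an illustration under stated assumptions, not the book's own code: the toy points, the labels, and the function names are all made up for this example. The key assumption is that the two groups of points can be separated by a straight line, which is the standard condition under which the convergence guarantee holds.

```python
# A minimal perceptron sketch. The data and names below are hypothetical,
# chosen only to illustrate the learning rule on a tiny example.

def train_perceptron(points, labels, max_epochs=100):
    """Find weights w and bias b so that the sign of (w . x + b) matches each label.

    Labels must be +1 or -1. Assumes the data is linearly separable;
    otherwise the loop gives up after max_epochs passes.
    """
    w = [0.0] * len(points[0])
    b = 0.0
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for x, y in zip(points, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # wrong side of the line, or exactly on it
                # Nudge the line toward classifying this point correctly.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:  # a full pass with no errors: a solution was found
            return w, b, epoch
    raise RuntimeError("no separating line found within max_epochs")


def predict(w, b, x):
    """Return +1 or -1 depending on which side of the learned line x falls."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Two small, clearly separated clusters (toy data).
points = [(-2.0, -1.0), (-1.0, -2.0), (-3.0, -3.0),
          (1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
labels = [-1, -1, -1, 1, 1, 1]

w, b, epochs = train_perceptron(points, labels)
print("weights:", w, "bias:", b, "passes over the data:", epochs)
```

The algorithm only ever looks at the examples themselves. Each mistake moves the dividing line a little, and when a full pass over the data produces no mistakes, the line it has settled on is the "pattern" it has learned.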
But what do these terms mean? What are “patterns” in data? What does “learning about these patterns” imply?