The decade of data revolution: literary review

Damir Valput

2020-01-21 14:04:28

As we wrap up the 2010s, the most uncertain decade yet to many, one sentiment seems to be shared by most: despite being a tumultuous decade, the world moved faster than ever before. In the sea of upheavals, breakthroughs and shifting perspectives that defined the 2010s, the accelerated rise of artificial intelligence and data science will surely be remembered as one of its most remarkable characteristics.

When Watson, IBM’s intelligent machine, beat human champions at Jeopardy! in 2011, it felt like the announcement of a machine revolution. Since then, the whole decade seems to have been a constant release of new AI technologies penetrating our day-to-day lives. Virtual assistants that understand natural language started answering our questions, more machines defeated human game players (e.g. AlphaGo), autonomous cars started driving on our streets (albeit only under human supervision), and “data scientist” was declared the sexiest job of the 21st century.

Data science indeed bridged the gap between tech-savvy geeks and folks not so well-versed in the language of zeroes and ones. It pushed a large number of people from various backgrounds to write their first lines of code, drastically revolutionised decision-making processes in business, and ultimately turned playing with data and creating knowledge into a cool, detective-like endeavour.

To wrap up the decade of data revolution on this blog, I decided to present my pick of five books published in the past 10 years that I found both educational and entertaining and that I wholeheartedly recommend to any data science practitioner out there. I wanted to focus on high-level literary pieces that I believe could stand the test of time, independent of any programming language, BA tool or machine learning library. The entries are listed in reverse chronological order by publication date, starting with the most recent.

1) Machine Learning Yearning; Andrew Ng – 2018

Machine Learning Yearning is not your typical ML textbook that dissects all the ML algorithms known to man. Rather, Andrew Ng intended it to be a compendium of tips and tricks on how to tune ML algorithms, avoid common pitfalls and successfully execute ML projects.

Ng is a professor at Stanford University and is also known for co-founding Coursera, an online learning platform where you can take his excellent course on Machine Learning. His language and writing style are concise and to the point, and his advice covers all the phases of an ML project, which makes this book a must-have for anyone wishing to develop their own ML systems.

Best of all, the book is freely available on Ng’s deeplearning.ai website!

2) Deep Learning; Ian Goodfellow, Yoshua Bengio, Aaron Courville – 2016

Most data practitioners have come across the names Ian Goodfellow, Yoshua Bengio and Aaron Courville at least once during their careers, and a great number have read at least one paper authored by one of them.

Intended as a textbook for students and machine learning practitioners, this book starts with an overview of the mathematical techniques fundamental to machine learning. It then moves on to discuss increasingly complex advancements in deep learning.

Unlike many other sources on deep learning, this one is academically oriented and thus fully theoretical. Nevertheless, I found the book extremely valuable for understanding the field of deep learning and how it might develop in the future; it’s truly indispensable for any deep learning practitioner. It is also freely available online!

3) The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World; Pedro Domingos – 2015

This isn’t another super technical book on machine learning. I don’t think it even contains a single formula except for Bayes’ theorem.

The Master Algorithm is a rather amusingly written overview of different schools of artificial intelligence and how they contrast or complement each other. Throughout the book, Domingos portrays his vision of the quest for the ultimate algorithm that could solve any problem and entertains several futuristic ideas on how it could shape our civilisation.

Domingos’s enthusiasm for the field of ML can be felt throughout the book. I highly recommend picking it up if you are a newcomer to the field or are simply going through a creative drought.

4) How Not To Be Wrong: The Power of Mathematical Thinking; Jordan Ellenberg – 2014

How this book earned its spot on a list of data science book recommendations might not be immediately obvious; however, if you decide to give it a chance, the first few pages will easily clarify its position here among the others.

Ellenberg relies on everyday situations to present rational, data-based approaches to decision-making and complex problem solving – and that is precisely what any data wrangler does. Though this book’s utility might be somewhat greater in data analytics than data science, anyone trying to convert data into useful insights can benefit from what Ellenberg has to say about critical thinking and reasoning fallacies.

This book was also the main source of inspiration for one of my previous posts on optimising your time when travelling by air, which you can also find on this blog.

5) Thinking, Fast and Slow; Daniel Kahneman – 2011

This is another not-so-obvious choice for this list. The book by the 2002 Nobel Prize winner Daniel Kahneman can usually be found under “Psychology books” in bookstores and libraries, and Kahneman is best known for his work in economics and behavioural science.

So why did I include it?

Kahneman masterfully presents how our minds can trick us into committing fallacies and following our biases, something every data scientist must be wary of. Based on the idea of two systems that guide all of our reasoning, a fast-thinking one and a slow-thinking one, Kahneman systematically discusses the human decision-making process and warns that we must not trust human judgment too much, even if the logic seems convincing. Better to rely on data, no?

This is a book that can make you grow as a critical thinker and a believer in data.

This list is my personal selection and is in no way exhaustive. The 2010s were a great decade for data science, and it would be difficult to list, let alone read, all the excellent literary pieces published during it. What are your favourite data science books published in the 2010s? I’m always happy to receive new recommendations!


© datascience.aero