AI-what? A quick intro to AI, Machine Learning, and Deep Learning

Where to start…?

Artificial Intelligence has become the trending field in computer science, and many people think it will bring about technological advancements comparable to the discovery of electricity. However, the conversation around AI, Machine Learning, and Deep Learning can be quite confusing, with terminology and lingo that can put off anyone interested in dipping their toes in.

So, where to start?

In October 2018, I gave a Masterclass at the Postdoc Centre of the University of Cambridge, UK, very much inspired by this wonderful blog post by Cassie Kozyrkov! At deepMirror, we thought we would share it with you, for a gentle introduction.

Is it Artificial Intelligence, Machine Learning, or Deep Learning? Does it even matter?

The terms Artificial Intelligence, Machine Learning and Deep Learning are now used almost interchangeably, although based on their history they can be arranged hierarchically, as in the diagram below (credit to NVIDIA).

In this post we will give an (embarrassingly short) overview of the technological advancements of the last ~70 years, and we will focus on a more detailed introduction to Deep Learning specifically in our next blog post.

Artificial Intelligence

The field of Artificial Intelligence was established in the mid-1950s. At that time, despite amounting to little more than a set of computer instructions capable of playing chequers (quite well, we admit), it was hailed as the key to our civilisation's brightest future, although some just saw it as the newest trend to follow to get funding for their computer science research or their companies (does that sound familiar?).

Soon, AI found itself popularised in different media, and was often given a “human-like” appearance or characteristics. Characters such as C-3PO (with a quirky personality), the Terminator (with an evil executioner’s vibe), and more recently WALL-E and EVE (the most adorable couple in AI history), were created based on the idea of a general purpose Artificial Intelligence capable of doing any task humans can do, and more!

It turns out that making a general purpose AI is very hard (even by today’s standards), which led to a period of disillusionment for the following ~20 years. Finally, in the 1980s computer scientists started having access to large amounts of digitised data, and used it in combination with their improved algorithms and faster machines, paving the way for the field of Machine Learning.

Machine Learning

Machine Learning, at its most basic, is the practice of using algorithms to learn from data, and then making a prediction about the world based on what was learned. So rather than hand-coding software routines with a specific set of instructions to accomplish a particular task, the machine is “trained” using large amounts of data and algorithms that give it the ability to learn how to perform the task.
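To make the difference between hand-coding and learning a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the toy "hours studied vs. exam score" data, the function names fit_line and predict, and the choice of ordinary least squares as the learning algorithm. The point is only that the rule (a slope and an intercept) is never typed in by hand, it is estimated from the examples.

```python
# Toy data, invented for illustration: hours studied -> exam score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]


def fit_line(xs, ys):
    """Learn a slope and an intercept from examples (ordinary least squares)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, divided by how much x varies on its own.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Use the learned parameters to make a prediction for a new input."""
    return slope * x + intercept


# "Training": the parameters come from the data, not from the programmer.
slope, intercept = fit_line(hours, scores)

# "Prediction": apply what was learned to an input the program has not seen.
print(predict(6.0, slope, intercept))
```

If the data changes, the learned slope and intercept change with it, and no line of the program needs to be rewritten. Real Machine Learning systems use far richer models and far more data, but the train-then-predict pattern is the same.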

Examples of successful Machine Learning applications are anti-spam email systems, and Deep Blue, the very famous chess machine that defeated Garry Kasparov in their 1997 rematch.

What better example of extreme intelligence than recognising this email as SPAM?

There is a vast literature on Machine Learning, which we are not going to cover in this blog post!

In the 1990s, technological advancements led to the first practical applications of a subfield of Machine Learning called Deep Learning (although the theory behind it had been available since the 1960s-1970s).

Deep Learning

The name Deep Learning is very catchy, but it would make sense to call it “Deep Artificial Neural Networks-based Machine Learning”. Not so catchy now, is it?

The core idea of Deep Artificial Neural Networks was inspired by our current understanding of how real biological neurons work. Data is “presented” to a series of layers of neurons (see the nodes in the graph below), which perform some computations based on certain numbers (or parameters) associated with the individual neurons. Then, these neurons pass the result to the following layer of neurons, and this is repeated until we reach the final layer of the network (the output). Based on how far the output is from the desired result, the parameters of the neurons in the network are changed, and this process is repeated thousands (or millions!) of times, until the network has learned a specific task.

A schematic representation of a (not very deep) Neural Network.
The blue circles are neurons, and the magenta lines are the connections between them, which get stronger or weaker during learning.

This is inspired by biology, as we think that during development and learning, the brain changes the connections between neurons, to adapt to specific needs.
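For the curious, the loop described above (present data, pass it through the layers, compare the output with the desired result, adjust the parameters, repeat) can be sketched in a few lines of plain Python. This is only an illustrative sketch under our own assumptions: the toy task (XOR of two inputs), the network size (2 inputs, 4 hidden neurons, 1 output), the sigmoid activation, the learning rate, and all names in the code are choices made for this example, not a description of any particular library or production system.

```python
import math
import random

random.seed(0)  # fixed seed so that runs are repeatable


def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Toy task (an assumption for illustration): learn XOR of two inputs.
DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
        ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]

N_IN, N_HIDDEN = 2, 4
# Parameters: one weight per connection, one bias per neuron.
w_hidden = [[random.uniform(-1, 1) for _ in range(N_IN)] for _ in range(N_HIDDEN)]
b_hidden = [0.0] * N_HIDDEN
w_out = [random.uniform(-1, 1) for _ in range(N_HIDDEN)]
b_out = 0.0


def forward(x):
    """Pass the input through the hidden layer, then through the output neuron."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, output


def mean_squared_error():
    """Average squared gap between the network output and the desired output."""
    return sum((forward(x)[1] - y) ** 2 for x, y in DATA) / len(DATA)


def train(epochs=5000, lr=0.5):
    """Repeatedly nudge every parameter in the direction that reduces the error."""
    global b_out
    for _ in range(epochs):
        for x, y in DATA:
            hidden, output = forward(x)
            # Error signal at the output neuron (gradient of the squared
            # error through the sigmoid, up to a constant factor).
            delta_out = (output - y) * output * (1.0 - output)
            # Error signal for each hidden neuron, propagated backwards
            # through the connection that links it to the output.
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                            for j in range(N_HIDDEN)]
            # Strengthen or weaken the connections and shift the biases.
            for j in range(N_HIDDEN):
                w_out[j] -= lr * delta_out * hidden[j]
                b_hidden[j] -= lr * delta_hidden[j]
                for i in range(N_IN):
                    w_hidden[j][i] -= lr * delta_hidden[j] * x[i]
            b_out -= lr * delta_out


initial_loss = mean_squared_error()
train()
final_loss = mean_squared_error()
print(f"error before training: {initial_loss:.3f}, after: {final_loss:.3f}")
```

The weights here play the role of the magenta connections in the schematic: training does nothing more than make some of them stronger and others weaker until the error shrinks. Real Deep Learning models follow the same recipe with many more layers, many more parameters, and specialised software and hardware.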

In the 1990s and 2000s, Deep Learning worked well for simple tasks like recognising handwritten digits or letters. But applications involving large datasets and deeper Neural Networks required long computation times, making the learning process very slow, and very hard to iterate on.

The current boom (from the 2010s) in Deep Learning stems from three main factors:

  • the explosion in availability of data
  • the improved architectures of Artificial Neural Networks
  • the extreme increase in computational power

Since then, Deep Learning has become a viable (in fact the go-to) approach to Machine Learning. This is why we are seeing important advances in Biological and Medical Imaging (including our own KymoButler), why Autonomous Driving is becoming a reality, and why computers have become so good at recognising cats on the internet (and many other things).

Deep Learning is ushering humanity towards a greater civilisation (?)

Next time…

Thank you very much for sticking around until the end of this blog post. We hope you have gained a first intuition about Artificial Intelligence, Machine Learning, and Deep Learning. Next time we’ll see how Deep Learning works using a simple example. Stay tuned!