In April, we had the pleasure of welcoming Simon Knowles, CTO of Graphcore, and Marta Garnelo Abellanas, Research Scientist at DeepMind, to the London Machine Learning Meetup. Below are summaries of their talks.
Graphcore - Simon Knowles
Simon Knowles, CTO at Graphcore and founder of many successful startups in the area of processor development, gives us an overview of the challenges and opportunities faced by today’s chip manufacturers in the context of intelligence computation (model learning and simulations). What is required to execute these new types of computational tasks as quickly and efficiently as possible?
In the first part, Simon walks us through the current state of processor development and explains why we no longer see the performance improvements of 10-15 years ago. He predicts that silicon scaling might yield another 3-10x in performance over the next decade, but to reach 100x improvements we need to come up with groundbreaking new ideas, connect many chips together and adapt to the new forms of computation required. Intelligence computation can usually be represented as a graph, with nodes representing data transformations (computation) and edges representing dependencies (communication), but current CPUs (scalar) and GPUs (vector) are not designed to process these types of data structures efficiently.
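To make the graph view of computation concrete, here is a minimal Python sketch. It is not taken from the talk: the node names (`x`, `w`, `mul`, `add`) and the `evaluate` helper are hypothetical, chosen only to illustrate nodes as computations and edges as data dependencies.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical toy graph: each node is a computation (a function),
# each edge is a data dependency (the value one node sends to another).
nodes = {
    "x": lambda: 2.0,
    "w": lambda: 3.0,
    "mul": lambda x, w: x * w,
    "add": lambda mul, x: mul + x,
}
deps = {"x": [], "w": [], "mul": ["x", "w"], "add": ["mul", "x"]}


def evaluate(nodes, deps):
    """Run every node once all the nodes it depends on have produced a value."""
    order = TopologicalSorter({n: set(d) for n, d in deps.items()}).static_order()
    values = {}
    for name in order:
        args = [values[d] for d in deps[name]]  # "communication" along edges
        values[name] = nodes[name](*args)       # "computation" at the node
    return values


values = evaluate(nodes, deps)
```

In this toy example `values["add"]` is 8.0 (2 * 3 + 2). Real model graphs have the same shape but far more nodes, and many of them can run in parallel once their inputs are ready.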
This is the space that Graphcore is trying to fill. The result of their development is Colossus, an Intelligence Processing Unit (IPU) optimised to work efficiently on graph-type data structures, with memory on chip, 2432 processor tiles and “compiled” communication. To avoid concurrency hazards, the IPU is based on the Bulk Synchronous Parallel (BSP) model, in which processors alternate between local computation, a global synchronisation and a data exchange phase. It can access 600MB at 90TB/s with near zero latency (compared to a GPU's 16GB at 900GB/s).
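The Bulk Synchronous Parallel idea can be sketched in a few lines of Python. This is a sequential simulation under assumed details (a ring of tiles, each passing its value to its right-hand neighbour) and says nothing about how Colossus is actually programmed; `bsp_run` is a hypothetical name.

```python
def bsp_run(values, supersteps):
    """Simulate BSP supersteps on a ring of tiles (assumed topology).

    Each superstep has a compute phase, in which a tile touches only its own
    state and inbox, followed by an exchange phase that starts only after
    every tile has finished computing (the barrier).
    """
    n = len(values)
    state = list(values)
    inbox = [0] * n
    for _ in range(supersteps):
        # Compute phase: purely local work, so no tile can race with another.
        outbox = []
        for tile in range(n):
            state[tile] += inbox[tile]
            outbox.append(state[tile])
        # Barrier: the loop above has completed for every tile.
        # Exchange phase: tile i receives the message sent by tile i - 1.
        inbox = [outbox[(tile - 1) % n] for tile in range(n)]
    return state


result = bsp_run([1, 2, 3, 4], supersteps=2)
```

Because no tile reads another tile's memory during the compute phase, the concurrency hazards mentioned above cannot arise; all sharing happens in the separate exchange phase.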
Watch the video if you would like to know how the Intelligence Processing Unit works and how it performs in the context of machine learning.
Symbolic Representation Learning - Marta Garnelo
In this talk, Marta Garnelo, Research Scientist at DeepMind and PhD student at Imperial College London, invites us to think about the representations that Deep Learning (DL) models generate, and how combining them with Symbolic AI could yield better results.
In the first part, Marta outlines the principles of Symbolic AI and identifies several advantages of this approach: i) interpretability, ii) generalization at the concept level, iii) well-established solver and planning algorithms. She also recognizes that this paradigm suffers from one critical drawback: relations, and therefore knowledge, need to be handcrafted. Comparing it to Neural Networks, she then emphasizes the complementarity between DL and Symbolic AI and details her first attempt to reconcile both approaches in one of her early experiments. Garnelo M., Arulkumaran K. and Shanahan M., 2016 introduced a Deep Symbolic Reinforcement Learning pipeline, composed of a low-level symbol extractor (a convolutional autoencoder) feeding into a representation-building component, whose output is passed to a Q-learning algorithm. They compared their learning pipeline to DeepMind’s DQN in an environment where an agent must collect rewards in a discrete world. While DQN was quick to achieve a perfect score when the position of the rewards was kept constant on the grid, it failed to generalize to random positioning. Their algorithm, on the other hand, achieved consistent results across both set-ups of the experiment.
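The following Python sketch illustrates why a symbolic, relational state can help Q-learning cope with randomly placed rewards. It is not the pipeline from the paper: there is no autoencoder, the world is an assumed 1-D grid, and the "symbolic" state is simply the offset between reward and agent. The function names and hyperparameters are all hypothetical.

```python
import random

ACTIONS = (-1, +1)  # move left or right on an assumed 1-D grid


def train(size=7, episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning where the state is the reward-minus-agent offset."""
    rng = random.Random(seed)
    q = {}  # (offset, action) -> estimated value
    for _ in range(episodes):
        # The reward is placed at a random position in every episode.
        reward_pos = rng.randrange(size)
        agent = rng.choice([p for p in range(size) if p != reward_pos])
        for _ in range(50):  # cap on episode length
            state = reward_pos - agent  # relational state, not absolute position
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
            agent = min(max(agent + action, 0), size - 1)  # walls clamp movement
            done = agent == reward_pos
            reward = 1.0 if done else 0.0
            next_state = reward_pos - agent
            best_next = 0.0 if done else max(
                q.get((next_state, a), 0.0) for a in ACTIONS
            )
            old = q.get((state, action), 0.0)
            # Standard Q-learning update.
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            if done:
                break
    return q


def greedy_action(q, offset):
    """Best known action for a given reward-minus-agent offset."""
    return max(ACTIONS, key=lambda a: q.get((offset, a), 0.0))


q_table = train()
```

Because the table is indexed by the relation between agent and reward rather than by absolute grid positions, what is learned for one reward location carries over to every other location, which is the kind of concept-level generalization discussed above.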
The second part of the talk focuses on the challenges faced by DL and reviews recent attempts to tackle them. The need for interpretability is a controversial topic that has been hotly debated recently. The Interpretable ML Symposium at NIPS 2017 is an example of the questions (and emotions) this topic raises. Recent work on disentangled representations (Chen et al 2016, Higgins et al 2016) consists of learning interpretable representations of the independent generative factors of the data. Another approach towards better interpretability takes advantage of symbolic representations within DL algorithms, such as relational networks (Santoro et al, 2017). Another area for improvement is the ability to generalize at a concept level. Work on higher-level generalization includes combining disentangled representations with a symbolic description of the environment (Higgins et al 2017). Meta-learning, a field that has recently been gaining popularity, can also be considered, by definition, to generalize at a higher level.