Courses offered in 2023/4

  • Mathematical Problem Solving (11-29 September 2023)

    Naina Ralaivaosaona (Stellenbosch University)

    In this course we shall consider a variety of elementary, but challenging, problems in different branches of pure mathematics. Investigations, comparisons of different methods of attack, literature searches, solutions and generalizations of the problems will arise in discussions in class. The objective is for students to learn, by example, different approaches to problem solving and research.

  • Introduction to Machine Learning (11-29 September 2023)

    Claire David (AIMS South Africa)

    This course introduces students to the key concepts of machine learning: linear regression, gradient descent, logistic regression, regularization, and over/underfitting. The most common algorithms in supervised learning will be presented, from boosted decision trees to deep neural networks. Students will code the algorithms from scratch, as well as the methods used to assess the performance of their models (ROC curves, train/validation learning curves, bias/variance). They will learn how to improve their programs through hyperparameter optimization and advanced techniques (momentum, learning-rate schedulers). In unsupervised machine learning, the approaches of dimensionality reduction and clustering will be introduced and illustrated.
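
    As a flavour of the from-scratch coding, here is a minimal sketch (illustrative data and hyperparameters, not course material) that fits a linear regression by batch gradient descent using only NumPy:

      import numpy as np

      rng = np.random.default_rng(0)
      # Synthetic data from y = 3x + 2 plus noise
      X = rng.uniform(-1, 1, size=(200, 1))
      y = 3 * X[:, 0] + 2 + rng.normal(scale=0.3, size=200)

      # Add a bias column so the intercept is learned as an ordinary weight
      Xb = np.c_[X, np.ones(len(X))]
      w = np.zeros(2)
      lr = 0.1

      for epoch in range(500):
          residual = Xb @ w - y
          grad = 2 / len(y) * Xb.T @ residual   # gradient of the mean squared error
          w -= lr * grad

      print(w)   # approximately [3, 2]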

  • Applied Machine Learning at Scale (2-20 October 2023)

    Ulrich Paquet (Google DeepMind, AIMS South Africa)

    ML and AI drive the back-ends and front-ends of many large online companies, and are set to play a transformative role in the “internet of things”. This is a practical module that looks at how ML is applied to internet-scale systems. In this module, students will build their own recommender systems from scratch. Topics covered will include A/B testing, ranking, recommender systems, and the modelling of users and entities that they engage with online (like news stories).
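
    The course does not prescribe a single method, but a classic starting point for a recommender built from scratch is matrix factorisation trained by stochastic gradient descent; the toy NumPy sketch below, on an invented ratings matrix, illustrates the idea:

      import numpy as np

      rng = np.random.default_rng(0)
      n_users, n_items, k = 50, 40, 5

      # Toy ratings generated from hidden user/item factors; only 20% are observed
      R = rng.normal(size=(n_users, k)) @ rng.normal(size=(n_items, k)).T
      observed = rng.random((n_users, n_items)) < 0.2

      U = 0.1 * rng.normal(size=(n_users, k))
      V = 0.1 * rng.normal(size=(n_items, k))
      lr, reg = 0.02, 0.05

      for epoch in range(30):
          for u, i in zip(*np.nonzero(observed)):
              err = R[u, i] - U[u] @ V[i]
              u_old = U[u].copy()
              U[u] += lr * (err * V[i] - reg * U[u])   # stochastic gradient steps
              V[i] += lr * (err * u_old - reg * V[i])

      # The predicted score for any (user, item) pair is simply U[u] @ V[i]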

  • Cloud and Functional Programming (2-20 October 2023)

    William Dekou (Google) and Jeff Sanders (AIMS South Africa)

    Cloud Computing democratised compute power by providing individuals and organisations with access to effectively unlimited compute resources on demand. It can be used to run backend applications as well as to conduct research. This Cloud Computing course equips students with the skills to run Python and C++ Machine Learning code on a Google Compute Engine virtual machine with a GPU. The course provides the fundamentals of Cloud Computing: an introduction to this computing paradigm and its key characteristics. It explores cloud service models and the different deployment models available. We also explore different levels of virtualisation, including virtual machines and containers. The course gets practical by introducing Google Cloud's fundamental building blocks: Virtual Private Cloud (VPC), Compute Engine, Cloud NAT, Cloud Storage, Cloud Build, and Artifact Registry. We go on to create a virtual machine with a GPU and connect to it using SSH and the gcloud tool. We show how to build a container image. We conclude by running our code on the virtual machine.

  • Physics for Machine Learning (30 October - 17 November 2023)

    Hugo Touchette (Stellenbosch University)

    This course covers techniques from physics that are useful in Machine Learning. Monte Carlo methods form a family of stochastic approximation techniques and are widely used in fields spanning Physics, Statistics and Finance. They are also a principal tool for statistical inference in ML. The module will cover Markov chains and Monte Carlo methods, including the Metropolis-Hastings algorithm.
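
    A minimal sketch of the Metropolis-Hastings algorithm, here sampling an invented bimodal target with a random-walk proposal in NumPy:

      import numpy as np

      def log_target(x):
          # Unnormalised log-density of a two-component Gaussian mixture
          return np.logaddexp(-0.5 * (x - 2.0) ** 2, -0.5 * (x + 2.0) ** 2)

      rng = np.random.default_rng(0)
      n_steps, step = 50_000, 1.0
      x = 0.0
      samples = np.empty(n_steps)

      for i in range(n_steps):
          proposal = x + step * rng.normal()                 # symmetric random-walk proposal
          log_alpha = log_target(proposal) - log_target(x)   # log acceptance ratio
          if np.log(rng.random()) < log_alpha:
              x = proposal
          samples[i] = x

      # The histogram of `samples` approximates the bimodal target distribution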

  • Computer Vision (30 October - 17 November 2023)

    Willie Brink (University of Stellenbosch)

    Computer Vision has long been an important driving force for advances in Machine Learning, and has been instrumental in the rise and development of deep learning. The module will start with convolutional neural networks for image classification, and extensions like dropout, batch normalisation, data augmentation, transfer learning, and visual attention. Other typical Computer Vision tasks will then be surveyed, including object segmentation, colourisation, style transfer, and automated image captioning.
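
    As an illustration of the building blocks listed above, here is a minimal convolutional classifier sketched in PyTorch (the framework is an assumption, not specified by the course), combining convolutions, batch normalisation, pooling and dropout:

      import torch
      from torch import nn

      # A small CNN for 32x32 RGB images, with batch normalisation and dropout
      model = nn.Sequential(
          nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
          nn.MaxPool2d(2),                              # 32x32 -> 16x16
          nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
          nn.MaxPool2d(2),                              # 16x16 -> 8x8
          nn.Flatten(),
          nn.Dropout(0.5),
          nn.Linear(64 * 8 * 8, 10),                    # logits for a 10-class problem
      )

      x = torch.randn(4, 3, 32, 32)                     # a dummy mini-batch
      loss = nn.CrossEntropyLoss()(model(x), torch.tensor([0, 1, 2, 3]))
      loss.backward()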

  • Statistical Inference and Causality (27 November - 15 December 2023)

    Ulrich Paquet (Google DeepMind; AIMS South Africa) and St John Grimbly (University of Cape Town)

    The first part of the course covers the bread and butter of Bayesian inference, with applications in Machine Learning in mind. It follows Chris Bishop's Pattern Recognition and Machine Learning textbook, including a Bayesian treatment of linear basis function models, graphical models, and a short overview of information theory. We aim to help students see the wood from the trees of probabilistic inference.
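
    As a concrete instance of the Bayesian treatment of linear basis function models in Bishop's Chapter 3, the sketch below computes the closed-form weight posterior and predictive variance on a toy regression problem; the basis functions, data and precision values are illustrative assumptions:

      import numpy as np

      rng = np.random.default_rng(0)
      # Toy data: noisy samples of sin(2*pi*x), as in Bishop's running example
      N = 25
      x = rng.uniform(0, 1, N)
      t = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=N)

      centres = np.linspace(0, 1, 9)
      def phi(x):
          # Gaussian basis functions plus a bias term
          return np.c_[np.ones(len(x)), np.exp(-0.5 * ((x[:, None] - centres) / 0.1) ** 2)]

      alpha, beta = 2.0, 25.0                           # prior precision and noise precision
      Phi = phi(x)
      S_N = np.linalg.inv(alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi)
      m_N = beta * S_N @ Phi.T @ t                      # posterior mean of the weights

      xs = np.linspace(0, 1, 100)
      Phis = phi(xs)
      pred_mean = Phis @ m_N
      pred_var = 1 / beta + np.sum((Phis @ S_N) * Phis, axis=1)   # predictive variance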

    "Correlation does not imply causation" – a phrase often heard in statistics classrooms, suggests that we cannot determine causation from data alone. Challenging this notion, this module delves into the field of causal inference, with a particular emphasis on graphical causal modelling. You will explore how combining assumptions with data can indeed lead to valid causal conclusions. The course is designed to impart a fundamental understanding and intuition of causal modelling, particularly in the context of machine learning.

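    A minimal simulation (an illustration, not course material) of why assumptions matter: regressing the outcome on the treatment alone gives a biased estimate of the causal effect when a confounder is ignored, while also conditioning on the confounder (a back-door adjustment) recovers the true effect.

      import numpy as np

      rng = np.random.default_rng(0)
      n = 10_000
      z = rng.normal(size=n)                        # confounder
      x = 0.8 * z + rng.normal(size=n)              # treatment, partly driven by z
      y = 1.5 * x + 2.0 * z + rng.normal(size=n)    # outcome; true causal effect of x is 1.5

      # Naive estimate: regress y on x only (biased upwards by the confounder)
      naive = np.linalg.lstsq(np.c_[x, np.ones(n)], y, rcond=None)[0][0]

      # Back-door adjustment: also condition on z, recovering the true effect
      adjusted = np.linalg.lstsq(np.c_[x, z, np.ones(n)], y, rcond=None)[0][0]

      print(round(naive, 2), round(adjusted, 2))    # roughly 2.5 vs 1.5
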
  • CUDA Programming for GPUs (27 November - 15 December 2023)

    Mike Giles (University of Oxford)

    The achievements of Machine Learning have been made possible by the incredible power of GPUs (Graphics Processing Units), which were originally designed for generating graphics but are now used for a variety of applications in High Performance Computing, including Machine Learning. This course introduces students to NVIDIA's CUDA programming language, which is an extension of C/C++. Through a combination of lectures and practicals, students will learn how to write code that executes on GPUs, how to assess the performance of that code, and how to optimise the performance by understanding the movement of data within the GPU. Example applications include Monte Carlo simulation and the solution of finite difference PDE approximations.
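
    The course itself works in CUDA C/C++; purely as a rough illustration of the same grid/block/thread execution model, here is an analogous vector-addition kernel written in Python using Numba's CUDA support (a substitution made for illustration only, not course material):

      import numpy as np
      from numba import cuda

      @cuda.jit
      def vector_add(a, b, out):
          i = cuda.grid(1)                  # global thread index
          if i < out.shape[0]:
              out[i] = a[i] + b[i]

      n = 1_000_000
      a = np.random.rand(n).astype(np.float32)
      b = np.random.rand(n).astype(np.float32)
      out = np.zeros_like(a)

      threads_per_block = 256
      blocks = (n + threads_per_block - 1) // threads_per_block
      vector_add[blocks, threads_per_block](a, b, out)   # launch the kernel on the GPU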

  • AI for Quantum and Quantum for AI (8-26 January 2024)

    Ryan Sweke (IBM Research, Almaden)

    In this course we will explore two complementary research directions at the intersection of quantum computing and AI for science. On the one hand, we will explore a variety of proposals for using ML-type algorithms tailored for quantum computers to solve difficult problems in quantum many-body physics, whose solutions promise applications in diverse areas such as drug design and high-temperature superconductivity. On the other hand, we will explore how state-of-the-art tools from machine learning can be used to address fundamental bottlenecks in the development and implementation of quantum computers themselves. In order to explore these topics we will start by understanding the fundamentals of quantum computing, from both a theoretical and a practical perspective. Using these fundamentals, we will proceed to develop and implement a variety of both quantum and classical ML-type algorithms for problems in quantum many-body physics and in quantum computing itself.
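
    As a taste of the fundamentals, the sketch below simulates a tiny two-qubit circuit as plain linear algebra in NumPy, preparing a Bell state with a Hadamard gate followed by a CNOT (using NumPy rather than a dedicated quantum SDK is an illustrative choice):

      import numpy as np

      H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)      # Hadamard gate
      I = np.eye(2)
      CNOT = np.array([[1, 0, 0, 0],
                       [0, 1, 0, 0],
                       [0, 0, 0, 1],
                       [0, 0, 1, 0]])

      state = np.zeros(4); state[0] = 1.0               # the two-qubit state |00>
      state = CNOT @ (np.kron(H, I) @ state)            # H on qubit 1, then CNOT

      print(np.abs(state) ** 2)                         # [0.5, 0, 0, 0.5]: the Bell state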

  • Simulation-based Inference (8-26 January 2024)

    Jakob Macke (University of Tübingen), Cornelius Schroeder (University of Tübingen), Pedro Goncalves (VIB-Neuroelectronics Research Flanders)

    This course offers an introduction to simulation-based inference, a growing area in machine learning that deals with statistical inference on simulator-based models. We begin by reviewing key concepts in probability theory and statistics, providing a foundation necessary for understanding simulation-based inference. The course will then explore how recent advances in probabilistic deep learning can be applied to this field, enabling us to address complex inference problems. Throughout the course, we will examine examples from various scientific disciplines, with a focus on neuroscience problems. These examples will demonstrate the versatility and effectiveness of simulation-based inference in scientific research. By the end of the course, participants will have a clear understanding of the basics of simulation-based inference, and its potential to extract insights from empirical data. The goal is to equip students with the knowledge and skills to apply simulation-based inference in their scientific endeavours.
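
    The simplest form of simulation-based inference is rejection ABC (approximate Bayesian computation): draw parameters from the prior, run the simulator, and keep the draws whose simulated summaries land near the observed ones. The course emphasises more recent neural approaches, but the toy NumPy sketch below, with an invented Gaussian simulator, conveys the basic idea:

      import numpy as np

      rng = np.random.default_rng(0)

      def simulator(theta, n=100):
          # A stand-in "black box" simulator: data with unknown mean theta
          return rng.normal(loc=theta, scale=1.0, size=n)

      observed_summary = simulator(1.7).mean()          # pretend this came from an experiment

      # Rejection ABC: keep prior draws whose simulated summary lands near the observed one
      prior_draws = rng.uniform(-5, 5, size=20_000)
      accepted = np.array([th for th in prior_draws
                           if abs(simulator(th).mean() - observed_summary) < 0.05])

      print(accepted.mean(), accepted.std())            # approximate posterior over theta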

  • AI, Mind and Brain (8-26 January 2024)

    Martin Butz (University of Tübingen)

    Our brain yields our mind. Starting soon after conception, the brain develops into a neurobiological cognitive architecture. It solves fundamental cognitive science problems, such as the symbol grounding problem, the frame problem, and the binding problem. It learns to control its own body in a self-motivated, goal-directed manner and even seamlessly learns a language. How is this possible? In this course, we will explore answers to this question pursuing a functional and computational approach. In particular, we will introduce reinforcement learning, reasoning, planning, the free energy principle including active inference, as well as generative models. Moreover, we will elaborate on scene segmentation algorithms, the causality challenge, and techniques to foster compositionality. We conclude with an outlook onto a fully integrative and self-organising model of the human mind with all its positive and negative implications and potential.

  • Fluid Dynamics (8-26 January 2024)

    Richard Katz (University of Oxford)

    Fluids are all around us, from the air we breathe to the oceans that determine our climate, and from the oil that powers our industries to the metals that are cast into machinery. The study of fluid dynamics requires sophisticated applications of mathematics and the ability to translate physical problems into mathematical language and back again. The course begins by building a fundamental understanding of viscous fluid flows in the context of unidirectional flows. In more general, higher-dimensional flows, pressure gradients are generated within a fluid to deflect the flow around obstacles rather than the fluid being compressed in front of them; an understanding of the coupling between momentum and mass conservation through the pressure field is therefore key to the understanding and analysis of fluid motions. We will use simple experiments to illustrate and motivate our mathematical understanding of fluid flow. The prerequisite for the course is fluency with differential equations and vector calculus. No previous knowledge of fluid dynamics will be assumed.
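
    The coupling described above is encoded in the incompressible Navier-Stokes equations, in which the pressure adjusts so that the velocity field remains divergence-free:

      \rho \left( \frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\mathbf{u} \right) = -\nabla p + \mu \nabla^2 \mathbf{u}, \qquad \nabla \cdot \mathbf{u} = 0

    Here \mathbf{u} is the velocity field, p the pressure, \rho the density and \mu the dynamic viscosity; the mass-conservation constraint is what forces the pressure to deflect flow around obstacles rather than letting fluid pile up against them.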

  • AI in Climate Science (29 January - 16 February 2024)

    Neil Hart (University of Oxford)

    When was the last time you or your family was affected by a weather or climate event? Did the rains fail to come? Did too much rain come in one day? Did the heat make daily life difficult? Understanding the changing risk of such events is at the core of climate science research today. Building dynamical and statistical methods to better predict these events is the core of atmospheric prediction science. Fundamental research on how the Earth System works continues to support both climate science and prediction science.

    In this module, we will explore the ways in which machine learning tools have been used to make sense of our weather and climate and how the rapid advances of modern AI techniques are accelerating advances across this research area. The goal of this module is to equip you with sufficient knowledge of the Earth System for you to find matches between your growing AI skill set and weather-climate problems.

    In this module we will together explore, and reproduce, applications of unsupervised learning, computer vision and causal inference to such problems. Examples will be provided of new techniques in Explainable AI (XAI) to use AI as a research partner in climate science. Consideration will also be given to the very latest breakthroughs in weather prediction using GraphCast, AIFS, and similar.

  • Principles of Imaging for Radio Astronomy (29 January - 16 February 2024)

    Marta Spinelli (ETH Zurich) and Landman Bester (South Africa Radio Astronomy Observatory; Rhodes University)

    Our best theory for the evolution of the Universe is based on the existence of a “dark” form of matter that has only gravitational interactions and an unknown form of “dark” energy that causes its accelerated expansion. Unveiling the nature of the dark sector is the next fundamental question in Cosmology and requires mapping increasingly larger volumes of the observable Universe.

    The SKA Observatory is an intergovernmental organisation for ground-based astronomy in charge of building and operating the world’s largest radio facilities. One of the two observatories will be in South Africa, in the Karoo Desert. The successful exploitation of the data from SKA and its precursors will shed new light on our understanding of the formation and evolution of our Universe. This new fundamental research needs on the one hand new and innovative data analysis techniques and, on the other hand, the construction of realistic end-to-end simulations, going from the sky emission to cosmological parameter constraints.

    This course will cover the timely and innovative subject of Radio Cosmology with a particular focus on the role of neutral hydrogen in unveiling the large-scale structure of the Universe. It will also discuss how AI has improved the quality of data analysis and the construction of simulations.

    Course outline

    • Summary of the key concepts of Modern Cosmology: evidence in favour of the Big Bang, inflation, dark matter, and dark energy. Description of the main phases of the evolution of the Universe. Concept of distance.

    • Basics of Bayesian Approach in Cosmology.

    • A brief summary of how we simulate the Universe. AI methods to speed up this process.

    • Radio Cosmology: what we can learn about the structure and evolution of our Universe using Radio Astronomy data. Description of the SKA Observatory project. Review of the main cosmological probes accessible in the Radio band.

    • 21cm Intensity Mapping: what is it, how we can do it, and what we can learn with it.

    • ML techniques to analyse the data: distinguishing the signal from the foreground emission (a minimal sketch of the classic PCA approach follows this outline).
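
    A standard baseline for separating the faint 21cm signal from the much brighter, spectrally smooth foregrounds is principal component analysis along the frequency direction. The NumPy sketch below, on entirely synthetic data with an invented foreground shape, removes the leading frequency modes and checks how much of the injected signal survives; it is an illustration of the idea rather than course material.

      import numpy as np

      rng = np.random.default_rng(0)
      n_freq, n_pix = 64, 2000
      freqs = np.linspace(0, 1, n_freq)[:, None]

      # Toy data: a bright, spectrally smooth foreground plus a faint, noise-like 21cm signal
      foreground = 100 * np.exp(-2 * freqs) * rng.random(n_pix)
      signal = 0.1 * rng.normal(size=(n_freq, n_pix))
      data = foreground + signal

      # PCA cleaning: subtract the leading frequency modes, which the foreground dominates
      centred = data - data.mean(axis=1, keepdims=True)
      U, s, Vt = np.linalg.svd(centred, full_matrices=False)
      n_modes = 3
      cleaned = centred - (U[:, :n_modes] * s[:n_modes]) @ Vt[:n_modes]

      print(np.corrcoef(cleaned.ravel(), signal.ravel())[0, 1])   # close to 1: the signal survives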

  • Deep Learning for Ecology (29 January - 16 February 2024)

    Emmanuel Dufourq (AIMS South Africa), Lorène Jeantet (AIMS South Africa), Matthew Van den Berg (Stellenbosch University)

    There has been a catastrophic decline in wildlife populations in recent years. Many species are threatened with extinction due to various factors. Further conservation efforts are urgently required to ensure the survival of the remaining individuals. In this course we will explore how deep neural networks can be applied to wildlife monitoring tasks. Machine learning can be applied to detect animal vocalisations (bioacoustics), detect images of animals (camera traps), estimate animal postures, and understand their behaviours. We will explore how to use acoustic, image, and accelerometry data for conservation purposes. We will also design a small Raspberry Pi acoustic recording unit, record sounds in nature, and build a machine learning classifier. Coding material will be provided in Python and JAX.
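
    Bioacoustic classifiers typically work on spectrograms of the recordings rather than on raw waveforms. The toy sketch below (synthetic "calls" rather than real data, and a simple scikit-learn classifier rather than the deep networks used in the course) illustrates that pipeline: generate clips, compute log spectrograms with SciPy, and fit a classifier.

      import numpy as np
      from scipy.signal import spectrogram
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(0)
      fs = 8000

      def clip(freq):
          # One second of a noisy tone, standing in for an animal call at a given pitch
          t = np.arange(fs) / fs
          return np.sin(2 * np.pi * freq * t) + 0.5 * rng.normal(size=fs)

      X, y = [], []
      for _ in range(40):
          for label, freq in enumerate([600, 1500]):        # two "species"
              _, _, Sxx = spectrogram(clip(freq), fs=fs, nperseg=256)
              X.append(np.log(Sxx + 1e-10).ravel())
              y.append(label)

      clf = LogisticRegression(max_iter=1000).fit(np.array(X), np.array(y))
      print(clf.score(np.array(X), np.array(y)))            # accuracy on the toy data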

  • Mathematical Optimisation (26 February - 15 March 2024)

    Roland Herzog (University of Heidelberg), Ekaterina Kostina (University of Heidelberg), Evelyn Herberg (University of Heidelberg)

  • Theoretical Foundations of Machine Learning and AI (26 February - 15 March 2024)

    Luigi del Debbio (University of Edinburgh)

  • Reinforcement Learning (26 February - 15 March 2024)

    Arnu Pretorius (InstaDeep)

    This course provides an introduction to Reinforcement Learning (RL). It will cover fundamental topics such as Markov decision processes, dynamic programming, Monte Carlo and value-based methods including temporal difference learning, function approximation and policy gradient methods. The presentation of topics will be roughly split equally between theory and code, using notebooks and additional material. Although the focus will be on understanding the foundations of RL, the course will conclude by giving an overview of some more advanced topics and applications.
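
    As an example of the value-based methods covered, the sketch below runs tabular Q-learning (a temporal-difference method) on an invented corridor environment using only NumPy; the environment and hyperparameters are illustrative choices, not course material.

      import numpy as np

      # A corridor of 6 states; action 0 moves left, action 1 moves right;
      # reaching the right-most state ends the episode with reward 1.
      n_states, n_actions = 6, 2
      rng = np.random.default_rng(0)
      Q = np.zeros((n_states, n_actions))
      alpha, gamma, eps = 0.1, 0.95, 0.1

      for episode in range(2000):
          s = 0
          while s != n_states - 1:
              a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
              s_next = max(0, s - 1) if a == 0 else s + 1
              r = 1.0 if s_next == n_states - 1 else 0.0
              # Temporal-difference (Q-learning) update
              Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
              s = s_next

      print(Q.argmax(axis=1))   # the greedy policy: move right in every non-terminal state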

  • Framing Research Problems (25 March - 12 April 2024)

    Max Welling (University of Amsterdam)

    During this course we will discuss in depth three topics where machine learning and science intersect.

    For each week, there will be some lectures to introduce the topic at an accessible level after which we will discuss research papers together.

    Our goals will be to 1) learn about these topics and 2) formulate and execute research programs that are worthy of publication in top-tier venues.

    The following topics will be covered:

    1) Generative models from nonequilibrium statistical mechanics

    2) Training (classical) DFT functionals and force fields

    3) Machine learning and neuroscience

  • Bayesian Modelling and Probabilistic Programming with Examples from Epidemiology (25 March - 12 April 2024)

    Elizaveta Semenova (University of Oxford)

    In this course we will cover topics such as Bayesian inference, hierarchical modelling, Gaussian processes for spatial statistics, and ordinary differential equations for disease transmission modelling.

    We will build probabilistic models and perform inference with the probabilistic programming language NumPyro, in a fully Bayesian manner, to characterise the uncertainty of the modelled quantities.

    Although the course is primarily computational in nature, the models which we will examine are inspired by the typical modelling practices found in epidemiology.
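
    As a flavour of the modelling style, the sketch below specifies a small hierarchical model in NumPyro for infection counts across a few regions and fits it with NUTS; the data, priors, and model structure are invented for illustration and are not the course's own examples.

      import jax.numpy as jnp
      from jax import random
      import numpyro
      import numpyro.distributions as dist
      from numpyro.infer import MCMC, NUTS

      def model(tests, cases=None):
          # Hierarchical (partially pooled) infection rates across regions
          mu = numpyro.sample("mu", dist.Beta(1.0, 1.0))
          kappa = numpyro.sample("kappa", dist.Gamma(2.0, 0.1))
          with numpyro.plate("region", tests.shape[0]):
              theta = numpyro.sample("theta", dist.Beta(mu * kappa, (1.0 - mu) * kappa))
              numpyro.sample("cases", dist.Binomial(tests, probs=theta), obs=cases)

      tests = jnp.array([500, 300, 800, 200])       # invented numbers of tests per region
      cases = jnp.array([23, 9, 51, 4])             # invented numbers of positives

      mcmc = MCMC(NUTS(model), num_warmup=500, num_samples=1000)
      mcmc.run(random.PRNGKey(0), tests, cases)
      mcmc.print_summary()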

  • Natural Language Processing (25 March - 12 April 2024)

    Jan Buys (University of Cape Town), Francois Meyer (University of Cape Town)

  • Geophysical Fluid Dynamics (25 March - 12 April 2024)

    Grae Worster and Jerome Neufeld (University of Cambridge)

    Fluids are all around us in the natural world, from the air we breathe, which determines our weather, to the oceans that affect our climate, and to the ’solid’ earth beneath our feet, which is in constant motion, giving rise to continental drift and volcanism. This course will give an introduction to the sort of mathematical fluid dynamics that allows us to make predictions of such geophysical flows. As time allows: we will explore the role of buoyancy-driven convection and the effect of Earth’s rotation in determining weather patterns and extreme events such as hurricanes and tornadoes; we will study flows through porous media, relating to groundwater aquifers and the storage of CO2 to mitigate climate change; and we will model the change of phase from liquid to solid that determines the freezing of the polar oceans, the formation of solid rocks from molten lava and the growth of the Earth’s inner core, which drives our magnetic field.
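
    A central ingredient that distinguishes geophysical from ordinary fluid dynamics is the Earth's rotation, which adds a Coriolis acceleration to the incompressible momentum equation (written here in a frame rotating with angular velocity \boldsymbol{\Omega}):

      \frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\mathbf{u} + 2\,\boldsymbol{\Omega} \times \mathbf{u} = -\frac{1}{\rho}\nabla p + \nu \nabla^2 \mathbf{u} + \mathbf{g}, \qquad \nabla \cdot \mathbf{u} = 0

    When rotation dominates (small Rossby number), the leading-order balance is geostrophic, 2\boldsymbol{\Omega} \times \mathbf{u} \approx -\frac{1}{\rho}\nabla p, which underlies large-scale weather patterns.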

    It is strongly advised that students taking this course have first taken the course in Fluid Dynamics earlier in the term.

  • Research Project (February 2024 soft start, for 15 April - 24 May 2024)