Courses — AI FOR SCIENCE MASTERS

Introduction to Machine Learning (16 September - 4 October 2024)

Claire David (AIMS South Africa)

This course introduces students to the key concepts of machine learning: linear regression, gradient descent, logistic regression, regularisation, and over/underfitting. The most common algorithms in supervised learning will be presented, from boosted decision trees to deep neural networks. Students will code the algorithms from scratch, as well as the methods used to assess the performance of their models (ROC, train/validation learning curves, bias/variance). They will learn how to improve their programs through hyperparameter optimization and advanced techniques (momentum, scheduler). In unsupervised machine learning, the approaches of dimensionality reduction and clustering will be introduced and illustrated.
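To give a flavour of what coding an algorithm from scratch can look like, here is a minimal sketch of gradient descent for one-variable linear regression. It is an illustration only, not course material: the function name `fit_line`, the learning rate, the step count and the toy data are all assumptions made for this example.

```python
def fit_line(xs, ys, lr=0.1, n_steps=2000):
    """Fit y ~ w * x + b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_steps):
        # Prediction error for every sample under the current parameters.
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        grad_b = 2.0 / n * sum(residuals)
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical noise-free data drawn from the line y = 3x + 1.
xs = [i / 10.0 for i in range(-10, 11)]
ys = [3.0 * x + 1.0 for x in xs]
w, b = fit_line(xs, ys)
print(w, b)  # both should be very close to the true slope and intercept
```

With noise-free data the fitted slope and intercept approach the generating values; on real data one would also hold out a validation set to watch for over- or underfitting.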

Bayesian Inference (16 September - 4 October 2024)

Sanjoy Mahajan (Lund University)

Bayesian probability and statistics, with applications in mathematics, science, engineering, and philosophy of science. The approach emphasises the mathematical description of, and inference from, incomplete information. Topics include: nature of probability, conditional probability, Bayes' theorem, prior and posterior probabilities, Bayes factors, discrete and continuous distributions, point and compound hypotheses, hypothesis testing, likelihood principle, Shannon information, history of the Bayesian approach, and comparison with orthodox (frequentist) statistics. Applications include: medical testing, the problem of old evidence, the reproducibility crisis, drug evaluation, sequential sampling, p values, confidence intervals, legal evidence and reasoning, the Monty Hall problem, and plausible reasoning in mathematics.
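The medical testing application can be illustrated with a short calculation based on Bayes' theorem. This is a sketch under assumed numbers, not an example taken from the course: the prevalence, sensitivity and specificity below are hypothetical, as is the function name.

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """Return P(disease | positive test) using Bayes' theorem."""
    # P(positive and disease)
    true_positive = sensitivity * prior
    # P(positive and no disease)
    false_positive = (1.0 - specificity) * (1.0 - prior)
    # Normalise by the total probability of a positive result.
    return true_positive / (true_positive + false_positive)


# Hypothetical test: 1% prevalence, 95% sensitivity, 90% specificity.
p = posterior_given_positive(prior=0.01, sensitivity=0.95, specificity=0.90)
print(round(p, 4))
```

Under these assumed numbers the posterior probability of disease after a positive result is below 10%, because false positives from the large healthy population outnumber true positives.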

Applied Machine Learning at Scale (7-25 October 2024)

Ulrich Paquet (Google DeepMind, AIMS South Africa)

ML and AI drive the back-ends and front-ends of many large online companies, and are set to play a transformative role in the “internet of things”. This is a practical module that looks at how ML is applied to internet-scale systems. In this module, students will build their own recommender systems from scratch. Topics covered will include A/B testing, ranking, recommender systems, and the modelling of users and entities that they engage with online (like news stories).

Computer Vision (7-25 October 2024)

Willie Brink (Stellenbosch University)

Computer Vision has long been an important driving force for advances in Machine Learning, and has been instrumental in the rise and development of deep learning. The module will start with convolutional neural networks for image classification, and extensions like dropout, batch normalisation, data augmentation, transfer learning, and visual attention. Other typical Computer Vision tasks will then be overviewed, including object segmentation, colourisation, style transfer, and automated image captioning.

Mathematical Problem Solving (4-22 November 2024)

Jacques Rabie (University of Cape Town)

In this course we shall consider a variety of elementary, but challenging, problems in different branches of pure mathematics. Investigations, comparisons of different methods of attack, literature searches, solutions and generalizations of the problems will arise in discussions in class. The objective is for students to learn, by example, different approaches to problem solving and research.

Reinforcement Learning (4-22 November 2024)

Arnu Pretorius (InstaDeep)

This course provides an introduction to Reinforcement Learning (RL). It will cover fundamental topics such as Markov decision processes, dynamic programming, Monte Carlo and value-based methods including temporal difference learning, function approximation and policy gradient methods. The presentation of topics will be roughly split equally between theory and code, using notebooks and additional material. Although the focus will be on understanding the foundations of RL, the course will conclude by giving an overview of some more advanced topics and applications.
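As an illustration of the dynamic programming topic, the following minimal sketch runs value iteration on a tiny deterministic Markov decision process. Everything in it is an assumption made for the example rather than course material: a four-state chain, a reward of 1 for reaching the terminal state, and a discount factor of 0.9.

```python
GAMMA = 0.9          # discount factor (assumed)
N_STATES = 4         # states 0..3; state 3 is terminal
ACTIONS = (-1, +1)   # move left or move right


def step(state, action):
    """Deterministic transition: clip to the chain, reward 1 on reaching the end."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def value_iteration(tol=1e-10):
    """Apply the Bellman optimality update until the values stop changing."""
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES - 1):  # the terminal state keeps value 0
            outcomes = [step(s, a) for a in ACTIONS]
            best = max(r + GAMMA * values[s2] for s2, r in outcomes)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


values = value_iteration()
print(values)
```

States closer to the rewarding end of the chain receive higher values, each one step further away being discounted by another factor of 0.9.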

(Bayesian) Active Learning, Information Theory, and Uncertainty (25 November - 13 December 2024)

Andreas Kirsch (Google DeepMind)

This course provides a comprehensive introduction to active learning and uncertainty quantification, emphasising their integration to enhance deep learning models' efficiency. We will explore data subset selection techniques, such as active learning and active sampling, through the lens of information-theoretic principles. The course will cover the distinction between epistemic and aleatoric uncertainty in deep neural networks, providing insights into their implications for data selection. We will also discuss various strategies for active learning and data subset selection in Bayesian deep learning, emphasising their connection to information-theoretic measures. By adopting a unified perspective, participants will gain a comprehensive understanding of how these concepts can be applied to improve the label and training efficiency of deep learning models.

Bayesian Modelling and Deep Generative Surrogates for Epidemiology (25 November - 13 December 2024)

Elizaveta Semenova (Imperial College London)

In this course we will explore a range of topics in Bayesian modelling, such as Bayesian inference, hierarchical modelling, Gaussian processes for spatial statistics, ordinary differential equations and agent-based models for disease transmission modelling. Using the probabilistic programming language NumPyro, we will construct probabilistic models and perform Bayesian inference to quantify uncertainty in model predictions and parameter estimates. As the course progresses, we will introduce deep generative models as efficient surrogates for computationally demanding model components (yes, this is 'generative AI'!). These surrogates, implemented in JAX, will be integrated seamlessly into NumPyro programs, enabling fast and scalable MCMC inference. While the course emphasises computational techniques, the models and applications are rooted in real-world epidemiology, providing a practical framework for data-driven decision-making in health research.

Deep Learning for Ecology (25 November - 13 December 2024)

Emmanuel Dufourq (AIMS South Africa), Rupa Kurinchi-Vendhan (Massachusetts Institute of Technology), Timm Haucke (Massachusetts Institute of Technology)

There has been a catastrophic decline in wildlife populations in recent years. Many species are threatened with extinction due to various factors. Further conservation efforts are urgently required to ensure the survival of the remaining individuals. In this course we will explore how deep neural networks can be applied to wildlife monitoring tasks. Machine learning can be applied to detect animal vocalisations (bioacoustics) and to detect animals in images (camera traps). You will learn about audio processing and deep learning classifiers for bioacoustics, and how to apply deep learning to penguin monitoring. Bioacoustics will be covered by Emmanuel Dufourq (2 weeks), and camera traps by Rupa Kurinchi-Vendhan and Timm Haucke (1 week). There will be a field trip to the Two Oceans Aquarium to collect penguin video data.

AI for Climate Change (13-31 January 2025)

Neil Hart (University of Oxford), Shruti Nath (University of Oxford)

When was the last time you or your family was affected by a weather or climate event? Did the rains fail to come? Did too much rain come in one day? Did the heat make daily life difficult? Understanding the changing risk of such events is at the core of climate science research today. Building dynamical and statistical methods to better predict these events is the core of atmospheric prediction science. Fundamental research on how the Earth System works continues to support both climate science and prediction science.

In this module, we will explore the ways in which machine learning tools have been used to make sense of our weather and climate and how the rapid advances of modern AI techniques are accelerating advances across this research area. The goal of this module is to equip you with sufficient knowledge of the Earth System for you to find matches between your growing AI skill set and weather-climate problems.

In this module we will together explore, and reproduce, applications of unsupervised learning, computer vision and causal inference to such problems. Examples will be provided of new techniques in Explainable AI (XAI) to use AI as a research partner in climate science. Consideration will also be given to the very latest breakthroughs in weather prediction using GraphCast, AIFS, and similar.

AI for Public Health (13-31 January 2025)

Joacim Rocklöv (University of Heidelberg), Marina Treskova (University of Heidelberg), Steffen Knoblauch (University of Heidelberg)

This course will introduce basic concepts of epidemiology and public health, such as study designs and epidemiological measures. It will familiarise students with disease burdens from a global and regional perspective, and discuss public health ethics, intervention designs and causal inference. It will train the students in critical data appraisal, processing, cleaning and analysis of public health data. Methods will include data-intensive methods and machine learning, with examples mainly relating to geospatial, infectious disease and environmental data. The students will become familiar with major public health data collection activities across Africa, such as demographic surveillance systems (DSS) and health demographic surveillance systems (HDSS). Finally, the students will be trained in using climate data for training predictive models, forecasting public health risks and making scenario-based projections of potential future impacts.

Fluid Dynamics (13-31 January 2025)

Richard Katz (University of Oxford)

Fluids are all around us, from the air we breathe to the oceans that determine our climate and from oil that powers our industries to metals that are cast into machinery. The study of fluid dynamics requires sophisticated applications of mathematics and the ability to translate physical problems into mathematical language and back again. The course begins by building a fundamental understanding of viscous fluid flows in the context of unidirectional flows. In more general, higher dimensional flows, pressure gradients are generated within a fluid to deflect the flow around obstacles rather than the fluid being compressed in front of them, and an understanding of the coupling between momentum and mass conservation through the pressure field is key to the understanding and analysis of fluid motions. We will use simple experiments to illustrate and motivate our mathematical understanding of fluid flow. Prerequisite for the course is fluency with differential equations and vector calculus. No previous knowledge of fluid dynamics will be assumed.

CUDA Programming for GPUs (3-21 February 2025)

Mike Giles (University of Oxford)

The achievements of Machine Learning have been made possible by the incredible power of GPUs (Graphics Processing Units), which were originally designed for generating graphics but are now used for a variety of applications in High Performance Computing, including Machine Learning. This course introduces students to NVIDIA's CUDA programming language, which is an extension of C/C++. Through a combination of lectures and practicals, students will learn how to write code to execute on GPUs, how to assess the performance of that code, and how to optimise the performance by understanding the movement of data within the GPU. Example applications include Monte Carlo simulation and the solution of finite difference PDE approximations.

Deep Generative Models and Generative AI (3-21 March 2025)

Jan-Willem van de Meent (University of Amsterdam) and Floor Eijkelboom (University of Amsterdam)

This course will cover introductory topics in deep generative modeling and generative AI, with the aim of giving participants the pre-requisite background to current state-of-the-art systems and their components. We will begin with a basic introduction to deep generative modeling that focuses on variational autoencoders, then discuss autoregressive models, transformers, and graph neural networks, and finally cover diffusion and flow-based methods for generative AI. We will conclude with a number of case studies to see how the methods and neural network components are used in recent systems for generative AI.

Principles of Imaging for Radio Astronomy (3-21 March 2025)

Marta Spinelli (Université Côte d'Azur) and Landman Bester (South Africa Radio Astronomy Observatory; Rhodes University)

Our best theory for the evolution of the Universe is based on the existence of a “dark” form of matter that has only gravitational interactions and an unknown form of “dark” energy that causes its accelerated expansion. Unveiling the nature of the dark sector is the next fundamental question in Cosmology and requires mapping increasingly larger volumes of the observable Universe.

The SKA Observatory is an intergovernmental organisation for ground-based astronomy in charge of building and operating the world’s largest radio facilities. One of the two observatories will be in South Africa, in the Karoo Desert. The successful exploitation of the data from SKA and its precursors will shed new light on our understanding of the formation and evolution of our Universe. This new fundamental research needs on the one hand new and innovative data analysis techniques and, on the other hand, the construction of realistic end-to-end simulations, going from the sky emission to cosmological parameter constraints.

This course will cover the timely and innovative subject of Radio Cosmology with a particular focus on the role of neutral hydrogen in unveiling the large-scale structure of the Universe. It will also discuss how AI has improved the quality of data analysis and the construction of simulations.

Course outline

Summary of the key concepts of Modern Cosmology: evidence in favour of the Big Bang, inflation, dark matter, and dark energy. Description of the main phases of the evolution of the Universe. Concept of distance.
Basics of Bayesian Approach in Cosmology.
A brief summary of how we simulate the Universe. AI methods to speed up this process.
Radio Cosmology: what we can learn about the structure and evolution of our Universe using Radio Astronomy data. Description of the SKA Observatory project. Review of the main cosmological probes accessible in the Radio band.
21cm Intensity Mapping: what is it, how we can do it, and what we can learn with it.
ML techniques to analyze the data: distinguish the signal from the foreground emission.

Simulation and Inference for Neuroscience (3-21 March 2025)

Philipp Berens (University of Tübingen), Pedro Goncalves (VIB-Neuroelectronics Research Flanders)

Modern Statistics and Machine Learning for Population Health in Africa (24-28 March 2025)

Oliver Ratmann, Alexandra Blenkinsop and Tristan Naidoo (Imperial College London), Juliette Unwin (University of Bristol)

One of the groundbreaking advances in machine learning research in the past decade is the emergence of increasingly sophisticated, robust, and easily usable probabilistic programming languages. These new tools, including Stan and NumPyro, hide tedious calculations involving automatic differentiation and gradient-based optimization from the end-user, making modern statistical methods widely available to data scientists in Africa who wish to address some of the most urgent challenges on the continent, ranging from habitat degradation, air pollution, and extreme weather events to disease outbreaks and population health in general.

This one-week course will cover how you can integrate modern statistical techniques with the Stan probabilistic programming language to effectively address a broad range of applications from epidemiological, genomic and spatial data. We hope this course will equip you with intelligence-driven statistical technologies to drive your own evidence-based discoveries in global health or other applications, and more broadly increase your fluency in artificial intelligence and modern statistics.

Content covered

What attendees will learn

Bayesian workflow with probabilistic programming (Stan)
Core regression models for hierarchical data
Gaussian process regression with Stan
State-of-the-art GP approximations for scalable inference
Infectious disease modelling with probabilistic programming
Pathogen phylogenetics with Stan

The Science and Engineering of Large Language Models (31 March - 11 April 2025)

Amr Khalifa, Alban Rrustemi (Google DeepMind) and teams

1. A very brief history of LLMs (from word2vec to transformers), tokenization (including hands-on lab), and a detailed overview of the transformer architecture, including coding attention and a full dense transformer from scratch.

2. Training and Optimization (April 2): Pre-training strategies, improving training efficiency on a single GPU, and building training loops to train a transformer.

3. Advanced Architectures (April 3 - 4): Scaling laws, Mixture of Experts (MoE) architecture, with a potential lab on converting a dense transformer to MoE.

4. Specialized Topics (April 5): A module on DiLoCo and async training.

5. Adaptation and Efficiency (April 7): Adaptors, LoRA (Low-Rank Adaptation) and sparsity (weight and activation quantization).

6. Scaling (April 8 - 11): Scaling laws, roofline models, profiling (matmuls and transformers on accelerators), data parallelism, and Fully Sharded Data Parallel (FSDP).

7. Post-Training (April 7-11): Instruction tuning and multiple sessions dedicated to advanced post-training techniques.

8. Hands-on Labs: The workshop emphasizes practical application with numerous lab sessions, allowing participants to solidify their understanding and gain hands-on experience.

Selected Topics in AI in Science (21 April - 9 May 2025)

Research Project (February 2025 soft start, for 14 April - 13 June 2025)

Courses offered in 2024/5

Introduction to Machine Learning

Bayesian Inference

Applied Machine Learning at Scale

Computer Vision

Mathematical Problem Solving

Reinforcement Learning

(Bayesian) Active Learning, Information Theory, and Uncertainty

Bayesian Modelling and Deep Generative Surrogates for Epidemiology

Deep Learning for Ecology

AI for Climate Change

AI for Public Health

Fluid Dynamics

CUDA Programming for GPUs

Deep Generative Models and Generative AI

Principles of Imaging for Radio Astronomy

Simulation and Inference for Neuroscience

Modern Statistics and Machine Learning for Population Health in Africa

The Science and Engineering of Large Language Models

Selected Topics in AI in Science

Research Project