Imperial Data Science Intensive Course

Boost your career with this online programme brought to you by the Data Science Institute at Imperial College London and Le Wagon Coding Bootcamp.

In ten intense weeks you'll learn about a range of data science topics, from Python to advanced machine learning, and how to code your own data projects.


Key information

Duration: Ten weeks

Dates: 12 April 2021 to 18 June 2021

Location: Online course (Live teaching)

9.00 – 18.30 (full-time)

Fees: £9,950 (10% alumni discount available)

Programme overview

Data science tools like programming, data modelling and visualisation are becoming common across many sectors as a way of generating business insights and informing strategy. Whether you’re a student just coming out of university or an industry professional, being able to use these tools and having an understanding of statistics, statistical analysis and data analysis will be essential as the need for data scientists grows.

That's why we're excited to present our Imperial Data Science Intensive Course, a partnership between Imperial College London, one of the world’s top universities, and Le Wagon, the educational innovators behind the leading Coding Bootcamp that has helped nearly 10,000 students to build their technical skills. This venture also has the added support of Imperial Projects.

With modules on everything from Pandas to deep learning, in just ten weeks this course will teach you the fundamental skills to begin a career as a data scientist. This full-time, online and immersive experience equips participants with the skills to explore, clean, and transform data into actionable insights and to implement machine learning models from start to finish.

The course will culminate in a Data Spark project, pioneered by Imperial College London. Coached by data scientists from Imperial’s Data Science Institute, participants will apply the skills they've learned in class to capstone projects.

This is an entirely online course, adapted from the world-class, in-person Le Wagon Coding Bootcamp. This means you can get the teaching support and information you need to give your career a boost, wherever you are in the world!

What Imperial brings:

  • Personal mentorship: you’ll be in touch with experienced Imperial academics as part of the training.
  • Imperial Spice: a peek at the future in the form of weekly bite-size online introductions from leading academics working in the field. You will hear about problems they are currently tackling, and be prompted to reflect on problems that no one has really thought about yet.
  • Data science projects: be a part of Imperial's pioneering Data Spark programme. This is your chance to apply the skills learned in class to a real case study, coached by an Imperial researcher with expertise in data analytics.
  • Imperial ideation training: if you have an entrepreneurial venture you would like to explore, we will work with you to translate the idea into an Imperial Data Spark Project.
  • Certification: upon successful completion of the course you'll receive a verified digital certificate issued by the Centre for Continuing Professional Development at Imperial College London.

What Le Wagon brings:

  • Best in class education: Le Wagon will be providing the course curriculum and teaching staff, so you know you'll be getting high quality expertise and experience.
  • Lifetime access to learning: Le Wagon have been refining their bespoke learning platform, Kitt, for the past seven years. As a graduate, you will have lifetime access to this platform and up-to-date course materials, so you can refresh your knowledge at any time.
  • A global alumni network: there are over 10,000 alumni in the Le Wagon community, covering 39 cities and 22 countries around the world.


Our course is designed to help you learn data science step by step, from the basic data toolkit in Python and the required mathematics through to the complete implementation and deployment cycle of machine learning algorithms.

The course will be delivered live and you will get the opportunity to interact with the teaching team in real time, ask questions and get support as you progress.

Some of the modules you can study as part of this course are described below:

Start the course prepared!

As the Imperial Data Science Intensive Course is very intense, our students must complete some online preparation work before starting the course. This work takes around 40 hours and covers the basics of Python, the pre-requisite language of the course, and some mathematical topics used every day by data scientists.

Python for data science

Learn programming in Python, how to work with Jupyter Notebook and how to use powerful Python libraries like Pandas and NumPy to explore and analyse big datasets. Collect data from various sources, including CSV files, SQL queries on relational databases, Google BigQuery, APIs and web scraping.

Relational database and SQL

Learn how to formulate a good question and how to answer it by building the right SQL query. This module will cover schema architecture and then dive deep into the advanced use of SELECT to extract useful information from a stand-alone database or through SQL client software like DBeaver.
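
As a rough illustration of turning a question into a SELECT query (this is not taken from the course materials), the sketch below uses Python's built-in sqlite3 module. The orders table, its columns and its rows are all made up for the example.

```python
import sqlite3

# Hypothetical in-memory database: the table and its rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 30.0), ("bob", 12.5), ("alice", 20.0), ("carol", 5.0)],
)

# Question: which customers have spent more than 10 in total, biggest spender first?
query = """
    SELECT customer, SUM(amount) AS total_spent
    FROM orders
    GROUP BY customer
    HAVING SUM(amount) > 10
    ORDER BY total_spent DESC
"""
top_customers = conn.execute(query).fetchall()
conn.close()

print(top_customers)
```

The same query text could be run unchanged from a SQL client; only the connection code is specific to Python.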

Data visualisation

Make your data analyses more visual and understandable by including data visualisations in your Notebook. Learn how to plot your data frames using Python libraries such as matplotlib and seaborn and transform your data into actionable insights.

Statistics, probability, linear algebra

Understand the underlying maths behind all the libraries and models used in the course. Become comfortable with the basic concepts of statistics and probability (including mean, variance, random variables, Bayes' theorem, etc.) and with matrix computation, which is at the core of numerical operations in libraries like Pandas and NumPy.
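
To give a flavour of these concepts, here is a minimal sketch using only Python's standard library. The sample values and the probabilities passed to the hypothetical bayes_posterior helper are invented for illustration.

```python
from statistics import mean, pvariance

# Made-up sample of eight observations.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
sample_mean = mean(sample)            # arithmetic mean
sample_variance = pvariance(sample)   # population variance (divides by n)


def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(A | B) = P(B | A) * P(A) / P(B).

    P(B) is expanded with the law of total probability:
    P(B) = P(B | A) * P(A) + P(B | not A) * P(not A).
    """
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Illustrative numbers: a condition affecting 1% of people, and a test that
# detects it 90% of the time but also flags 5% of unaffected people.
posterior = bayes_posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)

print(sample_mean, sample_variance, round(posterior, 3))
```

With these assumed numbers, a positive result still leaves the probability of having the condition at roughly 15%, because the condition is rare to begin with.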

Statistical inferences

You'll learn how to structure a Python repository with object-oriented programming in order to clean your code and make it re-usable, how to survive the data preparation phase of a vast dataset, and how to find and interpret meaningful statistical results based on multivariate regression models.


Data analysts are meant to communicate their findings to non-technical audiences: you will learn how to create impact by explaining your technical insights, and how to turn them into business decisions using cost/benefits analysis. You’ll be able to share your progress, present your results and compare your results to your teammates’.

Preprocessing and supervised learning

Learn how to explore, clean, and prepare your dataset through pre-processing techniques like vectorisation. Get familiar with the classic models of supervised learning – linear and logistic regressions. Learn how to solve prediction and classification tasks with the Python library scikit-learn using learning algorithms like KNN (k-nearest neighbors).
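
The idea behind KNN can be sketched in a few lines of plain Python. This is an illustrative toy, not the scikit-learn implementation used on the course; the knn_predict helper, the points and the labels are all hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Rank training examples by Euclidean distance to the query point.
    ranked = sorted(
        range(len(train_points)),
        key=lambda i: dist(train_points[i], query),
    )
    # Count the labels of the k closest examples and return the most common one.
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up two-dimensional data with two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["small", "small", "small", "large", "large", "large"]

prediction_a = knn_predict(points, labels, (1.2, 1.4))
prediction_b = knn_predict(points, labels, (6.8, 6.2))
print(prediction_a, prediction_b)
```

In practice a library implementation handles ties, large datasets and other distance metrics far more efficiently than this sketch.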

Generalisation and overfitting

Implement training and testing phases to make sure your model can be generalised to unseen data and deployed in production with predictable accuracy. Learn how to prevent overfitting using regularisation methods and how to choose the right loss function to improve your model's accuracy.

Performance metrics

Evaluate your model's performance by defining what to optimise and the right error metrics in order to assess your business impact. Improve your model's performance with validation methods such as cross validation or hyperparameter tuning. Finally, discover a powerful supervised learning method called SVM (Support Vector Machines).
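
As a minimal sketch of how cross validation splits a dataset, the hypothetical k_fold_indices helper below yields the training and validation indices for each fold. It does not shuffle the data and is only an illustration; library implementations offer many more options.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, validation_indices) for each fold, without shuffling."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [
        n_samples // n_folds + (1 if fold < n_samples % n_folds else 0)
        for fold in range(n_folds)
    ]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# Ten samples split into three folds: every sample is used for validation exactly once.
folds = list(k_fold_indices(n_samples=10, n_folds=3))
for train, validation in folds:
    print("train:", train, "validation:", validation)
```

A model would be trained on each train split and scored on the matching validation split, and the scores averaged to estimate performance on unseen data.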

Unsupervised learning and advanced methods

Move to unsupervised learning and implement methods like PCA for dimensionality reduction or clustering for discovering groups in a data set. Complete your toolbelt with ensemble methods that combine other models to improve performance, such as Random Forest or Gradient Boosting.

Managing images and text data

Get comfortable with managing high-dimensional variables and transforming them into manageable input. Learn classic pre-processing techniques for images like normalisation, standardisation and whitening. Apply the right type of encodings to prepare your text data for different NLP tasks (Natural Language Processing).
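
Two of the pre-processing steps named above can be sketched with the standard library alone. The pixel intensities are made up, and both helpers are hypothetical illustrations that assume the values are not all identical (otherwise they would divide by zero).

```python
from statistics import mean, pstdev


def min_max_normalise(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lowest, highest = min(values), max(values)
    return [(value - lowest) / (highest - lowest) for value in values]


def standardise(values):
    """Shift and scale values to zero mean and unit (population) standard deviation."""
    centre, spread = mean(values), pstdev(values)
    return [(value - centre) / spread for value in values]


# Made-up 8-bit pixel intensities.
pixels = [0, 64, 128, 255]
normalised = min_max_normalise(pixels)
standardised = standardise(pixels)
print(normalised)
print(standardised)
```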

Neural networks

Understand the architecture of neural networks (neurons, layers, stacks) and their parameters (activation functions, loss function, optimiser). Learn to build your own networks independently, such as Convolutional Neural Networks (for images), Recurrent Neural Networks (for time-series) and Natural Language Processing networks (for text).

Deep learning with Keras

Discover a new library called Keras, a developer-friendly wrapper over TensorFlow, the deep learning library created by Google. We'll teach you the fundamental techniques to build your first deep learning model with Keras.

Computer vision

Go further into computer vision with deep learning by building networks for object detection and recognition. Implement advanced techniques like data augmentation to augment your training set by computing image perturbations (random crops, intensity changes, etc) in order to improve your model's generalisation.

Machine learning pipeline

Move from Jupyter Notebook to a code editor and learn how to set up a machine learning project in the right way so that you can iterate quickly and confidently. Learn how to turn a machine learning model into a robust and scalable pipeline with scikit-learn's Pipeline, using encoders and transformers.

Machine learning workflow with MLflow

Building a machine learning model from start to finish requires a lot of data preparation, experimentation, iteration and tuning. We'll teach you how to do your feature engineering and hyperparameter tuning in order to build the best model. For this, we will leverage a library called MLflow.

Deploying to production with Google Cloud Platform

Finally, we'll show you how to deploy your code and model to production. Using Google Cloud AI Platform, you'll be able to train your model at scale, package it and make it available to the world. As the cherry on top, you will use a Docker environment to deploy your own RESTful Flask API, which could be plugged into any front-end interface.

Student projects

You'll spend the last two weeks of the course working on a group project that explores an exciting data science problem you want to solve! As a team, you'll learn how to collaborate efficiently on a real data science project through a common Python repository and the Git flow. You will use a mix of your own datasets (if you have any from your company / non-profit organisation) and open-data repositories (Government initiatives, Kaggle, etc.). It will be a great way to practise using all the tools, techniques and methodologies covered in the Imperial Data Science Intensive Course and will make you realise how autonomous you have become.


This course is intense and will jump into advanced topics from the very first week. The course is designed for people with basic coding skills in Python and maths:

  • Python: You must be comfortable with data types and variables, conditions, loops, functions and the two data structures list and dict.
  • Maths: A minimum A-level qualification or equivalent in maths is required for the course, as we need you to be comfortable with functions, their derivatives and systems of linear equations.

To get up to speed, you'll be given some preparation work to complete before the Imperial Data Science Intensive Course starts. We have a Python Technical Test as part of the admissions process to make sure you are at the right level to make the course a success.

Our events

We will be hosting lots of taster events open to students and anyone interested in the topic.

List of upcoming events:

  • Wednesday 13 January 2021, 18.30 (GMT)

    Intro to Python

    2/4 taster workshops for the Imperial Data Science Intensive Course. Let's learn how to explore the "insides" of websites and extract information from them.

    Sign up for free
  • Thursday 21 January 2021, 18.30 (GMT)

    Data Science Intensive launch event

    Come to Imperial's Data Science Intensive launch event! More info to come.

    Sign up for free
  • Wednesday 17 February 2021, 18.30 (GMT)

    Intro to SQL

    3/4 taster workshops for our co-branded Bootcamp. Immersion in the life of a data analyst through concrete business cases using real-world datasets.

    Sign up for free
  • Tuesday 23 February 2021, 18.30 (GMT)

    Bootcamp info session

    Come to Imperial's Data Science Intensive info session! Le Wagon will be hosting and will answer all your questions.

    Sign up for free
  • Wednesday 17 March 2021, 18.30 (GMT)

    Intro to Data Analysis (2)

    4/4 taster workshops for our co-branded Bootcamp. Let's discover the basics of programming with Python and how to manage big data.

    Sign up for free

A typical day

From morning lectures to evening recaps, every day of the course is action-packed.

Lecture 9:00 - 10:30

Grab a coffee and start every morning with an engaging and interactive lecture, before putting what you’ve learnt into practice.

Challenges 10:30 - 16:30

Pair up with your buddy for the day, and work on a series of programming challenges with the help of our teaching staff.

Recap 17:00 - 18:00

Review the day’s challenges and get an overview of upcoming lessons during live code sessions.

Programme directors and teachers

Blair Young
Lead teacher and Data Scientist, Le Wagon

Having graduated with an MSc in Computational Biology in 2014, Blair has spent the last six years building his skillset as a data scientist. He has now joined Le Wagon, and is also a meetup organiser, open source contributor and musician.

Benjamin Baranger
Data Science Lead, Le Wagon

Ben studied and worked in IT for 11 years, working closely with developers. In 2017, he realised he wanted to become one and never looked back! He joined the team full-time as a teacher and engineer, and now leads the Data Science programme in the UK.

Dr Mark Kennedy
Programme lead and Co-Director, Data Science Institute at Imperial College London

Mark uses methods of data science to study the emergence of new categories around innovations in organising. Recently, he has been using natural language processing (NLP) to study the emergence of commercial applications of artificial intelligence (AI) and machine learning to new approaches to organisations and work. He has a PhD and an MBA from Northwestern University and its Kellogg School, and an AB degree from Stanford in Philosophy and Logic of Formal Systems.

Dr Susan Mulcahy
Academic Director and Director, Data Spark Programme at Imperial College London

Susan enjoys translating complex tech problems to a wider audience. She currently leads Imperial’s Data Spark programme and is a lecturer in Data Analytics at Ada National College for Digital Skills. She has an MBA from INSEAD in France and a BSc in Mechanical Engineering from Purdue University in the USA.

Dr Sadia Haider
Academic mentor and Research Associate, Imperial College London

Sadia’s research is centred on implementing and developing statistical machine learning models to understand the progression of allergic diseases from childhood to adulthood using longitudinal data from five major birth cohorts. She obtained her PhD in Statistics from the London School of Economics. She was awarded the ‘Star Mentor’ award in 2020 for her mentoring on Imperial’s Data Spark programme.

Aras Selvi
Academic mentor and Postgraduate Researcher, Imperial College London

Aras’s research focuses on models and algorithms for decision-making under uncertainty. His interests include robust optimisation, non-convex optimisation, and theoretical computer science. Aras is an academic mentor on Imperial's Data Spark programme. Past projects he worked on used data analytics to improve operational efficiencies and commercial performance for a global energy company.

Apply to Imperial Data Science Intensive Course