Structured Learning Path

Towards Intelligence

This roadmap organises two parallel disciplines — Computer Science & Engineering and Data & Intelligence — into a structured learning path from mathematical foundations to large language models. Each discipline deliberately separates Science (the theory and the “why”) from Engineering (the practice and the “how”), because understanding the foundations before building on them is the whole point.

01  /  The On-Ramp

Getting Started

The fastest path to becoming employable — these courses build a high-level overview of both disciplines and get you job-ready before going deeper into the specialised theory and practice below.

These are not prerequisites in the academic sense. They exist to give you a working map of the landscape, so you know where to go deep later. The CS&E track gets you building real software with real tools (code, databases, version control, deployment). The Data & Intelligence track gets you working with data and AI systems end-to-end. Together, they give you enough working fluency to land a role and enough context for the deeper sections below to make sense.
Computer Science & Engineering
i. 100 Days of Code
A daily coding habit that builds fluency through consistent practice
ii. Data Structures and Algorithms
The backbone of all software: how data is organised and problems are solved
iii. Database Management Systems
How data is stored, queried, and managed at scale
iv. Open Source Software Development, Linux and Git Specialization
The toolchain: version control, open-source workflows, and the Linux environment
v. Software Engineering
Principles of building software that is maintainable, testable, and scalable
vi. DevOps
Bridging development and operations: CI/CD, automation, and deployment
Data & Intelligence
i. Mathematics for Data Science
The mathematical prerequisites: calculus, linear algebra, and statistics in context
ii. Data Analyst
Exploratory analysis, visualisation, and deriving insights from structured data
iii. AI Engineer
End-to-end AI application development, from model selection to deployment
iv. IBM RAG and Agentic AI
Retrieval-augmented generation and building autonomous AI agents
v. MLOps
Operationalising machine learning: pipelines, monitoring, and model lifecycle
vi. LLMOps
The operational side of large language models: fine-tuning, serving, and evaluation
02  /  Mathematics

Mathematical Foundations

The mathematical layer that underpins everything else — these topics recur across both disciplines at every level of depth.

Mathematics is not a separate track to “finish” — it is a foundation you keep revisiting. Probability and Statistics are essential for machine learning and data analysis. Linear Algebra is the language of neural networks and dimensionality reduction. Undergraduate Mathematics (calculus, real analysis) gives you the rigour to read research papers without hand-waving. The order here is roughly progressive: school-level refreshers first, then university-level depth, then the specific branches that matter most for AI and data work.
School Mathematics
Probability
Linear Algebra
Linear Algebra Done Right – Sheldon Axler Textbook
03  /  CS & Engineering

Computer Science & Engineering

The full knowledge tree — organised as a deliberate progression from mathematical foundations through theoretical computer science to practical engineering at scale.

This discipline is split into Science and Engineering because they answer fundamentally different questions. Science asks: how does computation work? What makes an algorithm correct? Why does a distributed system fail in the ways it does? These are questions of understanding — and skipping them means you end up building things you cannot debug, optimise, or reason about when they break. Engineering asks: how do I build and run real systems? Knowing how a B-tree works is Science; designing a database schema that survives production traffic is Engineering. You need both, and in that order.

Within Science, the split is between Fundamentals and Systems. Fundamentals covers the abstract core — computation theory, algorithms, language design — the concepts that do not change regardless of what technology stack you use. Systems takes those ideas and grounds them in real infrastructure: how databases manage transactions, how operating systems schedule processes, how networks route packets. Fundamentals gives you the “what”; Systems gives you the “where it actually runs.”

Within Engineering, the split is between Building and Operating. Building is about construction — writing software, setting up infrastructure, designing cloud architectures, establishing CI/CD pipelines. But building a system is only half the job. Operating is about keeping it alive: ensuring reliability through SRE practices, catching performance regressions before users do, hardening security posture, and maintaining observability so that when something does go wrong at 3 AM, you can actually find out why. A system that cannot be operated is a system that cannot be trusted.
Read the complete CS & Engineering guide →
Mathematics
The formal language that CS theory is built on.
Discrete Mathematics
Discrete Mathematics and Its Applications – Kenneth Rosen Science · Textbook
Science
Fundamentals
Programming Languages and Compilers
Data Structures and Algorithms
Theory of Computation
Systems
Databases
Distributed Systems
Operating Systems
Computer Networks
Parallel & Concurrent Computing
Computer Organization & Architecture
Fundamentals
The theoretical core — how computation, languages, and algorithms work at a foundational level.
Data Structures and Algorithms
Introduction to Algorithms (CLRS) – Cormen, Leiserson, Rivest, Stein Textbook
The Algorithm Design Manual – Steven Skiena Practical
Neetcode Practice
Theory of Computation
Introduction to the Theory of Computation – Michael Sipser Textbook
Systems
How real computing infrastructure is designed — from a single machine to distributed clusters.
Databases
Database Internals – Alex Petrov Textbook
Designing Data-Intensive Applications – Martin Kleppmann Practical
Distributed Systems
Designing Data-Intensive Applications – Martin Kleppmann Practical
Computer Networks
Computer Networking: A Top-Down Approach – Kurose & Ross Textbook
Parallel & Concurrent Computing
The Art of Multiprocessor Programming – Herlihy & Shavit Textbook
Computer Organization & Architecture
Computer Organization and Design – Patterson & Hennessy Textbook
Engineering
Building
Software Engineering
Infrastructure Engineering
— Development Operations (DevOps) — Cloud Engineering — Platform Engineering
Operating
Reliability Engineering — SRE
Performance Engineering
Security Engineering
Observability Engineering
Building
Constructing software systems and the infrastructure they run on.
Software Engineering
The Pragmatic Programmer – Hunt & Thomas Book
Clean Code – Robert C. Martin Book
Designing Data-Intensive Applications – Martin Kleppmann Book
Infrastructure Engineering
Development Operations (DevOps) · Cloud Engineering · Platform Engineering
The Phoenix Project – Kim, Behr & Spafford Book
Kubernetes in Action – Marko Lukša Book
Operating
Keeping systems reliable, performant, secure, and observable in production.
Reliability Engineering
Site Reliability Engineering
Performance Engineering
Systems Performance – Brendan Gregg Book
Security Engineering
Observability Engineering
Observability Engineering – Charity Majors, Liz Fong-Jones & George Miranda Book
04  /  Data & Intelligence

Data & Intelligence

A dual-track approach to AI and data — the Science track for theoretical depth, the Engineering track for hands-on building — both following the same progression from data analysis to large language models.

The same Science/Engineering split applies here, and for the same reason. Knowing the mathematics behind gradient descent (Science) and knowing how to debug a training run that is not converging (Engineering) are two entirely different skills — and you need both to be effective. The roadmap deliberately mirrors the two tracks topic-for-topic across the same five areas: Data Analysis, Machine Learning, Deep Learning, NLP, and Transformers/LLMs. The Science track uses canonical textbooks and Stanford lecture series to build theoretical depth — understanding why algorithms work, what their guarantees are, and where they break down. The Engineering track uses implementation-focused books and code repositories to build practical fluency — understanding how to actually train, evaluate, ship, and iterate on real models.
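
To make the gradient descent example above concrete, here is a minimal sketch that assumes the simplest possible loss, f(w) = w². The function names, the starting point, and the thresholds in `diagnose` are my own illustrative choices, not taken from any of the listed resources. The Science side tells you the update is w ← (1 − 2η)w, so it converges only when 0 < η < 1. The Engineering side is recognising from the loss curve which regime a run is in.

```python
def gradient_descent(learning_rate, steps=50, start=5.0):
    """Minimise f(w) = w**2 with plain gradient descent.

    Returns the loss recorded after each update step.
    """
    w = start
    losses = []
    for _ in range(steps):
        grad = 2.0 * w                # derivative of w**2
        w -= learning_rate * grad     # equivalent to w = (1 - 2 * lr) * w
        losses.append(w * w)
    return losses


def diagnose(losses, tolerance=1e-6):
    """Crude, illustrative triage of a loss curve (thresholds are assumptions)."""
    if losses[-1] > losses[0]:
        return "diverging"            # loss grew: learning rate is too large
    if losses[-1] < tolerance:
        return "converged"
    return "slow"                     # loss is falling, but far from the minimum


for lr in (0.1, 1.1, 0.001):
    print(lr, diagnose(gradient_descent(lr)))
```

On this toy problem a learning rate of 0.1 converges, 1.1 diverges, and 0.001 is still far from the minimum after 50 steps. Real training runs fail in more ways than this, but the habit of checking the curve against what the theory predicts is the same.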

The Engineering track is further split into Building and Operating, just like in CS&E. Building covers Software Engineering (writing the model code, building the application, choosing the right framework) and Infrastructure Engineering (designing the ML system architecture, data pipelines, feature stores, and serving infrastructure). Operating covers the production lifecycle — monitoring model drift, managing retraining pipelines, ensuring reproducibility, and keeping ML systems reliable at scale. Building gets a model into production; Operating keeps it there.

The deliberate mirroring between Science and Engineering is the point. Theory without practice is academic — you understand the math but cannot ship anything. Practice without theory is fragile — you can follow a tutorial, but the moment something behaves unexpectedly, you have no mental model to fall back on. Both tracks together produce someone who can reason about and build intelligent systems.
Read the complete Data & Intelligence guide →
Science
Data Analysis
Machine Learning
Deep Learning
NLP
Transformers & LLMs
Engineering
Building
Software Engineering — Data Analysis — Machine Learning — Deep Learning — NLP — Transformers & LLMs
Infrastructure Engineering
Operating
State of the Art
Science
Theory & Foundations
The Science track is about building mathematical intuition. These are the canonical textbooks and Stanford lecture series that the field is built on. The goal here is not to memorise proofs, but to develop the ability to read a paper, understand a loss function, or reason about why a model is failing — without reaching for Stack Overflow first.
Data Analysis
Transformers & LLMs
Engineering
Building & Practice
The Engineering track mirrors the Science track topic-for-topic, but the resources are entirely different. Where the Science track assigns a textbook like Goodfellow’s Deep Learning to explain the theory behind backpropagation, the Engineering track assigns Chollet’s Deep Learning with Python to show you how to actually implement, train, and evaluate a neural network in Keras.

This track splits into two sub-disciplines. Software Engineering is about building the models and applications themselves — writing the training code, choosing the right libraries, structuring experiments, and going from a research idea to working software. Infrastructure Engineering is about what sits underneath and around the model — the ML system design, data pipelines, feature stores, model registries, and serving infrastructure that make it possible to run models reliably at scale. You can write a beautiful model in a notebook, but without the infrastructure to serve it, retrain it, and monitor it, it never leaves your laptop.
Software Engineering
Data Analysis
Natural Language Processing
Natural Language Processing in Action – Lane, Howard & Hapke Book
Infrastructure Engineering
Engineering
Operating
This is the production side — monitoring model drift, managing retraining pipelines, ensuring reliability and reproducibility of ML systems in the real world. Resources and structure for this section are still being developed as the earlier tracks solidify.
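
As a placeholder until this section has proper resources, here is a rough sketch of what the simplest form of drift monitoring might look like. It assumes a single numeric feature and compares the mean of live data against the training-time mean, measured in training standard deviations. The function names and the 0.5 threshold are hypothetical, and a mean shift is only one signal among several that a real monitoring setup would track.

```python
from statistics import mean, stdev


def mean_shift_score(reference, live):
    """Distance between the live mean and the reference mean,
    expressed in reference standard deviations."""
    ref_std = stdev(reference)
    if ref_std == 0:
        raise ValueError("reference sample has zero variance")
    return abs(mean(live) - mean(reference)) / ref_std


def has_drifted(reference, live, threshold=0.5):
    """Flag drift when the shift exceeds a threshold (0.5 is an arbitrary assumption)."""
    return mean_shift_score(reference, live) > threshold


# Made-up values of one feature at training time and in two production windows.
training_values = [10, 11, 9, 10, 12, 8, 10, 11]
stable_window = [10, 9, 11, 10, 11, 9]
shifted_window = [14, 15, 13, 14, 16, 15]

print(has_drifted(training_values, stable_window))
print(has_drifted(training_values, shifted_window))
```

A check like this would typically run on a schedule against recent production inputs, with a flagged window feeding an alert or a retraining decision.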
State of the Art
Emerging & Evolving
A deliberately open-ended section. The field moves fast — new architectures, new capabilities, new failure modes. This is where I track the papers, tools, and techniques that haven’t yet settled into textbooks but are shaping where things are heading.