The same Science/Engineering split applies here, and for the same reason. Knowing
the mathematics behind gradient descent (Science) and knowing how to debug a training
run that is not converging (Engineering) are two entirely different skills — and
you need both to be effective. The roadmap deliberately mirrors the two tracks
topic-for-topic across the same five areas: Data Analysis, Machine Learning,
Deep Learning, NLP, and Transformers/LLMs. The Science track uses canonical
textbooks and Stanford lecture series to build theoretical depth — understanding
why algorithms work, what their guarantees are, and where they break down.
The Engineering track uses implementation-focused books and code repositories to build
practical fluency — understanding how to actually train, evaluate, ship,
and iterate on real models.
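To make the split concrete, here is a minimal sketch of both halves on a toy problem. It is not taken from any of the roadmap's resources; the function name `gradient_descent`, the quadratic objective, and the two learning rates are illustrative assumptions. The Science half is the update rule and the fact that, for this particular objective, the error shrinks by a factor of `(1 - 2 * lr)` per step, so the run only converges when `0 < lr < 1`. The Engineering half is noticing from the loss history that a run is blowing up.

```python
def gradient_descent(lr, steps=50, w0=0.0):
    """Minimise f(w) = (w - 3)^2 with plain gradient descent.

    Returns the final weight and the loss recorded before each update.
    """
    w = w0
    losses = []
    for _ in range(steps):
        losses.append((w - 3.0) ** 2)   # loss at the current weight
        grad = 2.0 * (w - 3.0)          # derivative of (w - 3)^2
        w -= lr * grad                  # gradient descent update
    return w, losses


# Hypothetical settings: one learning rate inside the stable range, one outside it.
w_good, losses_good = gradient_descent(lr=0.1)
w_bad, losses_bad = gradient_descent(lr=1.1)

print("lr=0.1 final loss below initial loss:", losses_good[-1] < losses_good[0])
print("lr=1.1 final loss above initial loss:", losses_bad[-1] > losses_bad[0])
```

On a real model the stable range is not available in closed form, which is why reading loss curves is a skill of its own.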
The Engineering track is further split into Building and
Operating, just like in CS&E. Building covers
Software Engineering (writing the model code, building the application,
choosing the right framework) and Infrastructure Engineering (designing
the ML system architecture, data pipelines, feature stores, and serving infrastructure).
Operating covers the production lifecycle — monitoring model drift, managing
retraining pipelines, ensuring reproducibility, and keeping ML systems reliable at
scale. Building gets a model into production; Operating keeps it there.
The deliberate mirroring between Science and Engineering is the point. Theory without
practice is academic — you understand the math but cannot ship anything.
Practice without theory is fragile — you can follow a tutorial, but the moment
something behaves unexpectedly, you have no mental model to fall back on. Both tracks
together produce someone who can reason about and build intelligent systems.
Science
Data Analysis
Machine Learning
Deep Learning
NLP
Transformers & LLMs
Engineering
Building
Software Engineering
— Data Analysis
— Machine Learning
— Deep Learning
— NLP
— Transformers & LLMs
Infrastructure Engineering
Operating
State of the Art
Theory & Foundations
The Science track is about building mathematical intuition. These are the canonical
textbooks and Stanford lecture series that the field is built on. The goal here is
not to memorise proofs, but to develop the ability to read a paper, understand a
loss function, or reason about why a model is failing — without reaching for
Stack Overflow first.
Natural Language Processing
Building & Practice
The Engineering track mirrors the Science track topic-for-topic, but the resources
are entirely different. Where the Science track assigns a textbook like Goodfellow,
Bengio, and Courville’s Deep Learning to explain the theory behind backpropagation, the Engineering
track assigns Chollet’s Deep Learning with Python to show you how to
actually implement, train, and evaluate a neural network in Keras.
The Building side of this track splits into two sub-disciplines. Software Engineering is
about building the models and applications themselves — writing the training code,
choosing the right libraries, structuring experiments, and going from a research idea
to working software. Infrastructure Engineering is about what sits
underneath and around the model — the ML system design, data pipelines, feature
stores, model registries, and serving infrastructure that make it possible to run models
reliably at scale. You can write a beautiful model in a notebook, but without the
infrastructure to serve it, retrain it, and monitor it, it never leaves your laptop.
Software Engineering
Natural Language Processing
Natural Language Processing in Action – Lane, Howard & Hapke (Book)
Infrastructure Engineering
Operating
This is the production side — monitoring model drift, managing retraining
pipelines, ensuring reliability and reproducibility of ML systems in the real world.
Resources and structure for this section are still being developed as the earlier
tracks solidify.
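As a placeholder for what "monitoring model drift" can mean in practice, here is a minimal sketch of one common check, the Population Stability Index (PSI), computed for a single numeric feature. This is an assumption about the kind of tooling the section will eventually cover, not a resource the roadmap has chosen. The function name `psi`, the bin count, the `eps` floor, and the synthetic Gaussian data are all hypothetical.

```python
import math
import random


def psi(reference, live, n_bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature.

    Bins are equal-width over the range of the reference sample.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if the reference is constant

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            # Clamp so live values outside the reference range land in the edge bins.
            idx = min(max(int((v - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        # Floor each proportion at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    live_p = proportions(live)
    return sum((a - e) * math.log(a / e) for e, a in zip(ref_p, live_p))


# Synthetic data for illustration only: training-time values, a fresh sample
# from the same distribution, and a sample whose mean has shifted.
rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(5000)]
same = [rng.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]

psi_same = psi(train, same)
psi_shifted = psi(train, shifted)
print("no drift: ", round(psi_same, 4))
print("with drift:", round(psi_shifted, 4))
```

A commonly quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but those cut-offs are conventions rather than guarantees, and a production setup would track many features plus the model's outputs.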
Emerging & Evolving
A deliberately open-ended section. The field moves fast — new architectures,
new capabilities, new failure modes. This is where I track the papers, tools,
and techniques that haven’t yet settled into textbooks but are shaping
where things are heading.