Learning AI as an Undergraduate

One of the biggest technologies enabling AI today is Machine Learning, and especially the area of it known as Deep Learning. There are currently many extremely high-level libraries that let one experiment with state-of-the-art deep learning methods without any understanding whatsoever of deep learning. You can also easily learn about these libraries, and about simplified versions of how they work, through the many online courses available, but that would not really give you a strong background in the core of machine learning.

As an undergraduate, if you hope to pursue a research career in this field (in academia or industry), it is essential to gain a strong foundation in the mathematical concepts that enable deep learning. I personally find these a lot more interesting than working on mere implementations of machine learning solutions. However, the strong coding skills and hardware understanding that enable the implementation of modern machine learning techniques are equally important.

Math Background

Having a strong mathematical background is essential. The material you cover as part of your major's curriculum does not really go into the depths of these subjects. When following math courses at uni such as Linear Algebra, Probability, and Statistics, try to do some background reading (or even online courses) to grasp the core concepts of those subjects.

At the same time, the Deep Learning Book is something anyone interested in this field should read from the beginning. You may feel that you already know certain parts and wish to skip them (even the authors sometimes suggest this), but I would personally recommend reading every part of it from the very beginning (including the introduction). Also, when reading this book, spend some time brushing up on your mathematics whenever you don't understand a derivation or a concept being explained. Ideally you should develop an intuitive understanding of every concept explained and be able to visualise in your mind what a specific equation or function does.
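One way to build that kind of intuition is to check a derivation numerically as you read. The sketch below is only an illustration and is not taken from the book: it assumes the logistic sigmoid as the example function, and the helper names (`sigmoid_grad`, `numerical_grad`) are hypothetical. It compares the derivative you would derive by hand, s(x)(1 - s(x)), with a central finite-difference estimate.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, chosen here purely as an example function."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x: float) -> float:
    """Derivative obtained by hand: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def numerical_grad(f, x: float, eps: float = 1e-5) -> float:
    """Central finite-difference estimate of f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


points = [-4.0, -1.0, 0.0, 1.0, 4.0]

# Largest disagreement between the hand-derived and numerical gradients.
max_error = max(
    abs(sigmoid_grad(x) - numerical_grad(sigmoid, x)) for x in points
)

for x in points:
    print(f"x={x:+.1f}  analytic={sigmoid_grad(x):.6f}  "
          f"numeric={numerical_grad(sigmoid, x):.6f}")
print(f"max abs error: {max_error:.2e}")
```

If the two columns disagree by more than a tiny amount, the hand derivation (or the code) has a mistake, which is a useful habit to carry over to any formula you are unsure about.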


In terms of programming languages, Python (and later C++ and CUDA) may be sufficient. A thorough understanding of the principles of Python (garbage collection, threading, multi-processing) is extremely useful. You will also have a few courses in your major that specialise in computer organisation (processor / hardware level implementation) and operating systems, which will give you a much needed understanding of how your code runs underneath a high-level language. This kind of understanding becomes necessary especially when optimising your code (for example, when writing new neural network layers or modules). Running training jobs on multiple GPUs or multiple machines in parallel (even using existing frameworks like Horovod) also requires some understanding of operating systems.
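As a small taste of what understanding Python's memory management means in practice, here is a minimal sketch that assumes CPython, where objects are freed by reference counting and a separate collector handles reference cycles. The `Node` class and `make_cycle` helper are hypothetical names used only for this illustration.

```python
import gc
import weakref


class Node:
    """Hypothetical object used only to illustrate object lifetimes."""

    def __init__(self, name):
        self.name = name
        self.partner = None


def make_cycle():
    """Create two nodes that reference each other; return a weak ref to one."""
    a = Node("a")
    b = Node("b")
    a.partner = b
    b.partner = a
    return weakref.ref(a)


gc.disable()  # pause the automatic cycle collector so the demo is deterministic
try:
    # No cycle: in CPython the object is freed as soon as its last
    # reference disappears, so the weak reference goes dead immediately.
    lone = Node("lone")
    lone_ref = weakref.ref(lone)
    del lone
    freed_by_refcount = lone_ref() is None

    # Cycle: the two nodes keep each other alive after the function returns,
    # until the cycle collector runs.
    cycle_ref = make_cycle()
    alive_before_collect = cycle_ref() is not None
    gc.collect()
    alive_after_collect = cycle_ref() is not None
finally:
    gc.enable()

print("freed by reference counting:", freed_by_refcount)
print("cycle alive before gc.collect():", alive_before_collect)
print("cycle alive after gc.collect():", alive_after_collect)
```

The same reasoning explains why long-running training scripts can hold on to memory longer than expected when objects reference each other. Other Python implementations may manage memory differently, so treat the behaviour above as CPython-specific.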

Alongside this, the next step in programming is to actually do some projects that require you to program. A good starting point is to take a recent state-of-the-art work, go through its code repository, get it running locally, dive deep into its core, and understand every bit of that code. I would suggest picking a well-documented repository, which will also let you learn coding and documentation standards. Using an IDE like PyCharm helps considerably in sticking to such standards.

This was an edX course I found extremely useful for learning some basic Python libraries related to data science while studying some very interesting concepts as well.


If your plan is to work in this field after graduation as an ML Engineer, a good understanding of commonly used libraries like TensorFlow, PyTorch, NumPy, and Matplotlib is essential. Some familiarity with common tools like Hadoop and Spark, as well as platforms like AWS, is also highly useful. However, these can be learnt quite easily during your internship (at a company working on ML) or once you start working after graduation. What you need in order to learn them is a core understanding of a programming language and its interaction with the operating system and hardware.


If your goal is to pursue higher studies in this area, having publications as an undergraduate is quite important. It is a good idea to try to collaborate with postgraduates pursuing research abroad (or with anyone experienced who is doing great research and publishing). Leading a research project and publishing it (at a top conference) entirely on your own may be quite difficult given your limited access to resources (both hardware and knowledge). Focussing on a publication at least during your internship (or even before) is definitely quite important.

In terms of an ML framework for starting your machine learning research work, I would strongly recommend PyTorch (as of 11.04.2020) since it's very user-friendly and has a large amount of up-to-date resources (including implementations of most recent research). Also learn to use Google Colab as well as GCP free credits if you are short of GPU resources when working on these.


Training a standard neural network on your dataset and building a basic AI tool is an absolute piece of cake nowadays, given how simple modern ML libraries have become. If you really want to get into AI, you should develop an understanding of the underlying concepts that let you adapt an existing model to your specific use case, along with the methods to optimise and scale it using industry-standard tools. For someone with more of a penchant for pure research, strong knowledge of linear algebra, probability, and statistics (way beyond what you are taught for your major here) alongside in-depth programming skills is a must.

Finally, some advice from one of the top people in the world in this area: quora link (it's a bit old but most of it is still very valid!)