Resources for Getting Started With Machine Learning

By Shawn Hainsworth posted 08-01-2019 14:51

So, you want to learn machine learning. In this post, I break the subject down into topics and point to learning resources for each one. There is an abundance of free information available on all of these topics.

How to Get Started with Machine Learning

  • Start with practical problems in an area where you have strong business knowledge. For example, take an analysis you have already done in Excel, and try to improve it incrementally.
  • Take the time to understand the math and how the algorithms work. 
  • Always benchmark your results by learning how to score and evaluate your models.
  • Be realistic about your programming skills. You can accomplish significant results in Excel, or using visual tools such as the Azure Machine Learning Studio.
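As a minimal sketch of what benchmarking can look like, the Python snippet below scores a simple straight-line model against a naive baseline on held-out data. The spend and sales numbers, the function names, and the choice of mean absolute error as the score are all assumptions made for illustration; they are not taken from any particular tool or dataset.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: advertising spend (x) and sales (y).
train_x, train_y = [1, 2, 3, 4, 5, 6], [3.1, 4.9, 7.2, 8.8, 11.1, 12.9]
test_x, test_y = [7, 8], [15.2, 16.8]  # held out, never used for fitting

# Baseline: always predict the average of the training targets.
baseline = sum(train_y) / len(train_y)
baseline_mae = mean_absolute_error(test_y, [baseline] * len(test_y))

# Model: a straight line fitted on the training data only.
slope, intercept = fit_line(train_x, train_y)
model_mae = mean_absolute_error(test_y, [slope * x + intercept for x in test_x])

print(f"baseline MAE: {baseline_mae:.2f}")
print(f"model MAE:    {model_mae:.2f}")
```

A model is only worth keeping if it beats a simple baseline like this one on data it has not seen.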

Math Topics for Machine Learning

Statistics and Probability

To work in machine learning, you should understand basic statistics, including means, medians, quartiles, and bell curves. There are many online resources for basic statistics.
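If you want to try these measures outside of Excel, here is a small sketch using Python's built-in statistics module. The sales figures are made-up numbers used only for illustration.

```python
import statistics

# Hypothetical sample: monthly sales figures (made-up numbers).
sales = [12, 15, 11, 19, 22, 15, 30, 14]

mean_sales = statistics.mean(sales)
median_sales = statistics.median(sales)

# With n=4, quantiles() returns the three cut points Q1, Q2 (the median), Q3.
q1, q2, q3 = statistics.quantiles(sales, n=4)

# Sample standard deviation: how widely the values spread around the mean.
stdev_sales = statistics.stdev(sales)

print(f"mean={mean_sales}, median={median_sales}")
print(f"quartiles: Q1={q1}, Q2={q2}, Q3={q3}")
print(f"sample standard deviation={stdev_sales:.2f}")
```

Note that different tools interpolate quartiles in slightly different ways, so Q1 and Q3 may not match a spreadsheet exactly.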

Linear Algebra

In order to enhance your understanding of many machine learning algorithms, it is important to understand the basics of linear algebra. Linear algebra extends algebra into an arbitrary number of dimensions. When reading algorithm descriptions, you will often encounter linear algebraic notation. There is a good course on Khan Academy.
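To make the notation a little more concrete, the sketch below shows the dot product and the matrix-vector product in plain Python. The data and weights are hypothetical; the point is only that a linear model's predictions are a matrix-vector product, which is why this notation shows up so often in algorithm descriptions.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def mat_vec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [dot(row, vector) for row in matrix]


# Hypothetical data: each row is one observation with three features.
X = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]

# Hypothetical weights of a linear model, one per feature.
w = [0.5, -1.0, 2.0]

# One prediction per row: the dot product of that row with the weights.
predictions = mat_vec(X, w)
print(predictions)
```

In practice you would use a library such as NumPy for this, but writing it out once by hand helps when reading the formulas.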

Machine Learning Algorithms

This is a very big topic area. The best way to get started is with simpler algorithms and practical problems. See my blog post, A Non-Technical Introduction to Machine Learning, for a description of the major algorithms and their uses.

Familiarize yourself with Jupyter notebooks. It is important to work in a well-documented, repeatable manner.

The following posts will help you get started in more detail:

  • Linear Regression
  • Understand Precision vs. Recall
  • Learn how to Cross-Validate Models
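As a quick illustration of the second topic, here is a minimal sketch of how precision and recall are computed for a binary classifier. The labels and predictions are invented for the example, and the function name is my own, not part of any library.

```python
def precision_recall(actual, predicted, positive=1):
    """Return (precision, recall) for the given positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if p == positive and a == positive)
    fp = sum(1 for a, p in pairs if p == positive and a != positive)
    fn = sum(1 for a, p in pairs if p != positive and a == positive)

    # Precision: of everything flagged positive, how much was right?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything actually positive, how much was found?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels: 1 = positive class, 0 = negative class.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

precision, recall = precision_recall(actual, predicted)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Which of the two matters more depends on the business problem: false alarms hurt precision, while missed cases hurt recall.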

Additional Resources for Learning

For Excel users, an excellent place to start is Data Smart by John Foreman. This book takes you through a number of practical problems using Excel. Most importantly, you can more easily understand how these algorithms work by using spreadsheets and formulas.

I highly recommend The Analytics Edge, a free MIT extension class. In order to get the most out of the course, do the homework, get the certificate, and participate in the Kaggle competition.

I also recommend Hands-On Machine Learning with Scikit-Learn and TensorFlow by Aurélien Géron. This is a very well-written book covering both Scikit-Learn and TensorFlow.

I have two Pluralsight courses for working with Microsoft AI and Machine Learning technologies:

  • Creating & Deploying Microsoft Azure Machine Learning Studio Solutions
  • Scalable Machine Learning with the Microsoft Machine Learning Server

Finally, the best way to learn about these algorithms is to work through examples.

Microsoft Azure has some excellent tooling around AI and Machine Learning. To begin, you can sign up for a free Machine Learning Studio account to create experiments visually. In addition, you can easily copy experiments from the Azure AI Gallery into your workspace to study and modify.

Review the Kaggle competition submissions. Many participants make their kernels publicly available, so this is a great way to learn from other developers. In addition, Kaggle has a number of excellent learning resources, including sample datasets.

Good Luck!