In preparation for the conference, I will be creating a series of machine learning blog posts. These will cover the basic concepts, as well as practical examples of real-world machine learning experiments using R, Python, the Azure Machine Learning Studio, and SQL Server. I hope this will make for a lively conversation during our session: Machine Learning and AI in Action (AI Series, Session 2). I also hope this will start you thinking about ways that machine learning can be used at your firm.
To begin, let's start with a non-technical description of different types of machine learning algorithms.
Machine Learning Algorithms
We use different types of machine learning algorithms to solve different types of problems. There are three main categories of machine learning algorithms: Supervised, Unsupervised and Reinforcement Learning. We will look at each category with relevant examples.
In supervised algorithms, we label each observation with either a category or a value. The algorithm trains a predictive model based on these labels. In other words, the model learns to associate a set of data points with a label. This model can then predict the best label for new, unlabeled observations.
We use different algorithms depending on whether we want to predict a value or a category. Regression algorithms predict a numeric value. Classification algorithms predict a category.
Two-Class Classification Algorithms
Two-class classification algorithms predict whether something is A or B, true or false, 0 or 1. Logistic regression, for example, is a two-class classification algorithm that fits a logistic (sigmoid, or “S”-shaped) curve to the data. Examples of two-class problems include:
- Predict if a client is likely to churn
- Predict if a summer associate will be a good new hire
- Predict if the firm will win a given case
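To make the “S”-shaped curve concrete, here is a minimal Python sketch of how a logistic regression model could turn client data into a two-class churn prediction. The feature names, weights, bias, and 0.5 threshold are all hypothetical values chosen for illustration; a real model would learn its weights from labeled historical data.

```python
import math

def sigmoid(z):
    """Logistic function: squeezes any real number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights, for illustration only (not learned from real data).
WEIGHTS = {"months_since_last_matter": 0.15, "billing_disputes": 0.8}
BIAS = -2.0

def churn_probability(client):
    """Weighted sum of the client's features, passed through the sigmoid."""
    z = BIAS + sum(WEIGHTS[name] * client[name] for name in WEIGHTS)
    return sigmoid(z)

def predict_churn(client, threshold=0.5):
    """Two-class decision: True if the predicted probability reaches the threshold."""
    return churn_probability(client) >= threshold

# Two made-up clients.
active = {"months_since_last_matter": 1, "billing_disputes": 0}
at_risk = {"months_since_last_matter": 12, "billing_disputes": 2}

print(predict_churn(active), predict_churn(at_risk))
```

The model itself outputs a probability; the true/false answer only appears once we pick a threshold, which is a business decision as much as a technical one.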
Multi-Class Classification Algorithms
Multi-class classification algorithms classify instances into one of three or more labels or classes. Examples include:
- Predicting the classification of a text document (eDiscovery)
- Predicting which business development activities will generate the highest return on investment
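As a rough sketch of how a multi-class model chooses among three or more labels, the Python below converts a score per class into probabilities and picks the most likely class. It uses the softmax function, one common approach that this post does not otherwise cover. The class names and scores are made up for illustration.

```python
import math

def softmax(scores):
    """Convert raw per-class scores into probabilities that sum to 1."""
    highest = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - highest) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

# Hypothetical scores a trained model might assign to one document.
scores = {"privileged": 2.0, "responsive": 1.0, "not responsive": 0.1}

probs = softmax(scores)
predicted = max(probs, key=probs.get)  # the class with the highest probability
print(predicted)
```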
Regression Algorithms
Regression algorithms predict a specific value. Examples include:
- Predicting sales or profit for a given Practice Group
- Predicting income per partner
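The simplest regression algorithm fits a straight line through the labeled observations and uses that line to predict a value for a new observation. The sketch below does this with ordinary least squares in plain Python. The headcount and revenue figures are invented for illustration and do not come from any real firm.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up training data: attorneys in a practice group vs. annual revenue ($M).
attorneys = [10, 20, 30, 40]
revenue = [5.0, 9.0, 13.0, 17.0]

slope, intercept = fit_line(attorneys, revenue)

# Predict revenue for a hypothetical group of 50 attorneys.
predicted = slope * 50 + intercept
print(round(predicted, 2))
```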
In unsupervised algorithms, our observations do not have a label. There is no value that we are trying to predict. The goal is to model the underlying structure or distribution in the data.
Clustering is the most common example of an unsupervised machine learning algorithm. The goal is to discover inherent groupings in the data. The results of a clustering algorithm can be used as input to a supervised algorithm. For example:
- Discover clients that behave similarly in ways that are not obvious or intuitive from specific characteristics such as size, industry, or practice areas. Clustering clients can lead to deeper practice management insights.
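One widely used clustering algorithm is k-means, which repeatedly assigns each observation to its nearest group center and then moves each center to the average of its group. The sketch below is a minimal version in plain Python. The client data is invented, and starting from the first k points is a simplification chosen to keep the example deterministic; real implementations usually pick starting centers at random and rescale the features first.

```python
def nearest(point, centroids):
    """Index of the centroid closest to the point (squared Euclidean distance)."""
    distances = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return distances.index(min(distances))

def kmeans(points, k, iterations=10):
    """Minimal k-means: returns (centroids, cluster label for each point)."""
    centroids = list(points[:k])  # simplification: first k points as starting centers
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

# Made-up clients: (annual fees in $K, matters per year). No labels are supplied.
clients = [(50, 2), (400, 20), (60, 3), (420, 22), (55, 2), (390, 18)]

centroids, labels = kmeans(clients, k=2)
print(labels)
```

Note that nobody told the algorithm which clients are "small" or "large"; the two groups emerge from the data alone.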
Association rule learning is another unsupervised technique. The goal is to discover rules that describe inherent relationships in the data. For example:
- Clients that use the firm for an IPO also use the firm for other types of work, such as employment matters.
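Rules like this are usually scored with two simple measures: support (how often the items appear at all) and confidence (how often the rule holds when its first part is true). The following Python sketch computes both for the rule "IPO implies Employment". The client matter histories are invented for illustration.

```python
# Made-up matter histories: the set of practice areas each client has used.
clients = [
    {"IPO", "Employment", "Tax"},
    {"IPO", "Employment"},
    {"IPO", "Litigation"},
    {"Employment", "Tax"},
    {"IPO", "Employment", "Litigation"},
]

def support(itemset, transactions):
    """Fraction of clients whose history contains every item in the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of the clients matching the antecedent, the fraction that also match the consequent."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

ipo_support = support({"IPO"}, clients)
rule_confidence = confidence({"IPO"}, {"Employment"}, clients)

print(f"support={ipo_support:.2f}, confidence={rule_confidence:.2f}")
```

In this toy data set, four of the five clients used the firm for an IPO, and three of those four also used it for employment work.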
Reinforcement learning is concerned with how software agents take actions in a given environment so as to maximize a cumulative reward. At this time, reinforcement learning is more relevant to fields such as robotics and self-driving cars than to the legal profession.
Stay tuned for more posts, including some practical, real-world examples.