In a recent post, Michael Porter asked “What Does AI Have To Do With Strategy?” In discussing the transformational nature of AI, he suggested the very first question you should ask is, “What is AI and Machine Learning?” Lately, I’ve been doing a lot of research into machine learning to understand the concepts and the right applications of the technology. What I’ve found is that the literature often dives into such a pea soup of acronyms, statistics and mathematics that the layman is left to wonder, “What the heck did I just read?” So, I thought I’d provide a very brief, consumable introduction to machine learning with as little jargon as possible. If you were looking for a concise overview, hopefully you have found it, and you will let us know in the comments.
Machine Learning vs Artificial Intelligence
Often the concepts of machine learning and artificial intelligence are talked about as if they are interchangeable. In reality, machine learning is a part of AI, but they are not the same thing. Ultimately, AI is about making machines ‘think’ like, or better than, humans. In most cases, AI capabilities are nowhere near this goal today. Machine learning is about teaching a system to analyze data and make ‘predictions’. These predictions can be used by themselves or can be integrated into larger systems. We have lots of examples of useful machine learning applications today.
As a simple example, I can use machine learning to analyze a picture and make a prediction that it displays a bird. The machine can also predict whether the bird is flying or not. If I train the machine properly, it could even predict what kind of bird it is. But, by itself, the machine doesn’t understand what a bird is, what flying means, or how that information can be used in some useful way.
What is Machine Learning
At its most basic, machine learning is taking data in, running it through some algorithm to predict something, evaluating and teaching the algorithm to make better predictions, and then using that learning to produce useful ‘predictions’ in the future. Of course, this is a very simplistic view and there is lots of complicated stuff that happens in each step, but you get the basic idea. Let’s break this process down into the specific steps:
- Taking data in – in other words, having data available to analyze. In order for machines to learn anything, they need to have data on which to train. Typically machine learning is useful when there are lots and lots of data to analyze. That can mean lots of records, like the number of visits to our website, or lots of variables, like a bird’s color, size of beak, location where spotted, etc.
- Run it through an algorithm – ok, algorithm is a bit of jargon, but an algorithm is just a set of rules to follow to calculate or do something. For example, an algorithm might be: ‘add up all the values of the data and divide by the number of records’. That algorithm is called ‘average’. So algorithms perform a defined series of steps, usually applying calculations to our data, and finally spit out a result. The result is often a prediction of something in which we are interested. So an algorithm to identify the bird in our picture might be 1) extract all the pixels, 2) compare them to pixels from other pictures that we know are birds, and 3) ‘predict’ whether our picture contains a bird or not.
- Learn to make better predictions – this is a key aspect of machine learning. In the previous step, the algorithm compares the data in the image to data we already know. If the algorithm is correct in its predictions, we give it the proverbial pat on the head. If it is wrong, we tweak (teach) the algorithm so it understands that the new pixels mean bird too. This cycle of learning and evaluating continues until our test predictions are accurate enough. Without this learning cycle, the machine just makes guesses.
- Predict in the future – now that the machine has learned to be more accurate, we can take new data in and the machine is far more likely to predict correctly that the image contains a bird. While predicting a bird in an image might seem trivial, this is the underlying technique for a self-driving truck to predict whether an object it sees in a road is a rock (run it over) or a small dog (avoid it).
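To make the four steps above a little more concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: the two features (has_feathers, has_wheels) are made-up stand-ins for whatever a real system would extract from an image's pixels, the training examples are invented, and the learning rule is a simple perceptron-style update that I have swapped in for illustration, not what a real image recognizer uses.

```python
# The 'average' algorithm from step 2: add up the values, divide by the count.
def average(values):
    return sum(values) / len(values)

# Step 1: take data in. Each invented example is (features, label).
# Features are hypothetical: (has_feathers, has_wheels).
# Label 1 means "bird", label 0 means "not a bird".
TRAINING_DATA = [
    ((1.0, 0.0), 1),
    ((1.0, 0.0), 1),
    ((0.0, 1.0), 0),
    ((0.0, 0.0), 0),
]

# Step 2: run the data through an algorithm that spits out a prediction.
def predict(weights, bias, features):
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0

# Step 3: learn. When a prediction is wrong, nudge the weights toward the
# right answer; when it is right, error is 0 and nothing changes.
def train(data, rounds=10, step=0.5):
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(rounds):
        for features, label in data:
            error = label - predict(weights, bias, features)
            weights = [w + step * error * x for w, x in zip(weights, features)]
            bias += step * error
    return weights, bias

weights, bias = train(TRAINING_DATA)

# Step 4: predict on new data using what was learned.
print(predict(weights, bias, (1.0, 0.0)))  # feathers, no wheels
print(predict(weights, bias, (0.0, 1.0)))  # wheels, no feathers
```

Real systems work with far more data and far more features, but the shape of the loop is the same: guess, check, adjust, repeat.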
That Seemed Easy, Now I Have a Bunch of Questions!
As I ran through the steps above I came up with a bunch of questions, and I’ll bet you have some too:
- Data:
- How do I get data in? Data can come from a variety of sources, including existing systems, internet data sources, surveys, etc. Typically machine learning works with very large amounts of data that are contained in various types of databases – this is Big Data.
- What kind of data? Any kind of data can be used: numeric data, text, images, etc. It all depends on your particular need or application.
- How much data? More data is generally better. Dr. Pedro Domingos, an AI researcher and professor, said, “More data beats cleverer algorithms… As a rule of thumb, a dumb algorithm with lots and lots of data beats a clever one with modest amounts of it.”
- Algorithms:
- Are there algorithms available or do I have to create one? We have lots of algorithms available today, built into many different systems that you can use. Each existing algorithm has lots of parameters that can be tweaked to improve training on specific data. A data scientist can help identify useful algorithms or build new ones.
- What’s the best algorithm? Many algorithms perform well in certain areas, but don’t work in other uses. Researchers are seeing that a combination of algorithms is the better approach for most uses.
- Is there one algorithm that can be used for all purposes? Unfortunately, there is no one best algorithm that can be used for all purposes. Dr. Domingos wrote a book titled The Master Algorithm where he discusses the search for the one, master algorithm for all problems. Spoiler alert – the master algorithm hasn’t been found yet.
- Learning:
- How much teaching do I need to do? Learning is often a continual process. There is almost no way to teach an algorithm with 100% of the possible data, so each application should build in a learn/test/evaluate process.
- What accuracy level do I need? That depends on your application. Some uses will require 80% accuracy, while others need to be at 90%+. For example, a spam filter may be fine predicting 80% correctly. However, predicting disease may need to be over 95% accurate.
- Predictions:
- What can I do with the predictions? While predictions can often be used by themselves, predictions are often built into larger systems. As a single predictor, Cornell University built an application (called Merlin) that takes in an image and tells you what kind of bird it detects. However, a digital asset management system might take that prediction to automatically tag the bird, its color, whether it’s flying, etc. so your web site authors can find useful images quickly.
- Is a prediction the only outcome? As mentioned above, the ‘prediction’ often leads to other outcomes. While the Cornell Merlin application predicts what kind of bird it sees, it uses that prediction to give you lots of additional information about that bird that a bird watcher might find useful.
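The learn/test/evaluate process and the accuracy question above can be sketched in a few lines of Python. This is a rough illustration, not a recipe: the function names (split, accuracy, meets_target), the spam labels, and the predictions are all invented for the example, and the 80% and 95% targets are simply the figures mentioned above.

```python
# Hold some data back for testing so we evaluate on examples the
# algorithm did not train on. Takes the last test_fraction of the records.
def split(records, test_fraction=0.25):
    cut = int(len(records) * (1 - test_fraction))
    return records[:cut], records[cut:]

# Accuracy is the share of predictions that match the known answers.
def accuracy(predictions, labels):
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)

# Compare measured accuracy against what the application requires.
def meets_target(predictions, labels, target):
    return accuracy(predictions, labels) >= target

# Invented test results: 1 means "spam", 0 means "not spam".
known_labels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
predictions  = [1, 0, 1, 0, 0, 0, 1, 1, 1, 0]  # two mistakes out of ten

train_part, test_part = split(list(range(8)))
print(len(train_part), len(test_part))

print(accuracy(predictions, known_labels))
print(meets_target(predictions, known_labels, 0.80))  # spam-filter bar
print(meets_target(predictions, known_labels, 0.95))  # disease-prediction bar
```

The same set of predictions can be good enough for one application and not nearly good enough for another, which is why the target has to come from the use case rather than from the algorithm.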
What’s Next
Whether it is big data, algorithms, predictions or testing that piques your interest, you now have at least a basic understanding of machine learning. Hopefully you can use this as a springboard to dive deeper into these topics, and you can contact us for help.