Lemonade is one of this year’s hottest IPOs, and a key reason is the company’s heavy investment in AI (Artificial Intelligence). The company has used this technology to develop bots that handle the purchase of policies and the management of claims.
So how does a company like this create AI models? What is the process? As should be no surprise, it is complex and susceptible to failure.
But then again, there are some key principles to keep in mind. So let’s take a look:
Selection: There are hundreds of algorithms to choose from. In some cases, the best approach is to use several (this is known as ensemble modelling).
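To make the idea of ensemble modelling concrete, here is a minimal sketch that combines three different algorithms with a majority vote. The dataset and model choices are illustrative, not anything Lemonade or the experts quoted here use:

```python
# A hedged sketch of ensemble modelling: several algorithms vote on each
# prediction instead of relying on a single model.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Three different algorithms; "hard" voting takes the majority class.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier(random_state=42)),
        ("forest", RandomForestClassifier(n_estimators=50, random_state=42)),
    ],
    voting="hard",
)
ensemble.fit(X_train, y_train)
accuracy = ensemble.score(X_test, y_test)
```

The point of the vote is that the individual models' errors tend not to overlap, so the combined prediction is often more robust than any single model's.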
“You need to determine the core task,” said Eric Yeh, who is a computer scientist at the Artificial Intelligence Center at SRI International. “What are you trying to accomplish? Suppose you have a bunch of images which you want to classify. The model adopted would be dramatically different from a case where you want to put captions on the images, even if they look similar and have the same input data.”
It’s important to realize that you need the right kind of data for certain models. If anything, this is one of the biggest challenges in the AI development process. “On average, the data preparation process takes 2X or in some cases 3X longer than just the design of the machine learning algorithm,” said Valeria Sadovykh, who is the Emerging Technology Global Delivery Lead at PwC Labs.
So in the early phases of a project, you need to get a good sense of the data. “Conduct an exploratory analysis,” said Dan Simion, who is the VP of AI & Analytics at Capgemini North America. “Visualize the data in 2-dimensions and 3-dimensions, then run simple, descriptive statistics to understand the data more effectively. Next, check for anomalies and missing data. Then clean the data to get a better picture of the sample size.”
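The exploratory steps Simion describes can be sketched in a few lines of pandas. The column names and values below are invented for illustration:

```python
# A minimal sketch of early exploratory analysis: descriptive statistics,
# a missing-data check, an anomaly check, and a cleaned sample size.
import pandas as pd

df = pd.DataFrame({
    "age": [34, 29, 41, None, 25, 390],   # 390 is an obvious anomaly
    "income": [52000, 48000, 61000, 58000, None, 50000],
})

summary = df.describe()               # simple descriptive statistics
missing_per_column = df.isna().sum()  # count missing values per column

# Flag implausible values (here, ages outside a plausible human range).
anomalies = df.loc[(df["age"] < 0) | (df["age"] > 120), "age"]

# Clean: drop rows with missing data to see the usable sample size.
clean = df.dropna()
sample_size = len(clean)
```

In a real project you would also plot the data in two and three dimensions, as Simion suggests, before deciding how to handle the anomalies and gaps.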
But there is no perfect model, as there will always be trade-offs.
“There is an old theorem in the machine learning and pattern recognition community called the No Free Lunch Theorem, which states that there is no single model that is best on all tasks,” said Dr. Jason Corso, who is a Professor of Electrical Engineering and Computer Science at the University of Michigan and the co-founder and CEO of Voxel51. “So, understanding the relationships between the assumptions a model makes and the assumptions a task makes is key.”
Training: Once you have an algorithm – or a set of them – you want to perform tests against the dataset. The best practice is to divide the dataset into at least two parts. About 70% to 80% is for training and tuning the model. The remainder is then used for validation and testing. Through this process, you evaluate the accuracy rates.
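The split described above takes one line with scikit-learn. The data here is a stand-in:

```python
# A minimal sketch of an 80/20 split: roughly 80% of the samples go to
# training and tuning, the rest are held out for validation.
from sklearn.model_selection import train_test_split

X = list(range(100))        # stand-in for 100 samples
y = [i % 2 for i in X]      # stand-in labels

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)
train_fraction = len(X_train) / len(X)
```

Fixing `random_state` makes the split reproducible, which matters when you want to compare accuracy across candidate models on the same held-out data.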
The good news is that there are many AI platforms that can help streamline the process. There are open source offerings, such as TensorFlow, PyTorch, KNIME, Anaconda and Keras, as well as proprietary applications like Alteryx, Databricks, DataRobot, MathWorks and SAS. And of course, there are rich AI systems from Amazon, Microsoft and Google.
“The key is to look for open source tools which allow for easy and quick experimentation,” said Monica Livingston, who is the Director of AI Sales at Intel. “If you prefer to purchase 3rd party solutions, there are many ISVs offering AI-based solutions for tasks like image recognition, chat bots, defect detection and so on.”
Feature Engineering: This is the process of finding the variables that are the best predictors for a model. This is where the expertise of a data scientist is essential. But there is also often a need to have domain experts help out.
“To perform feature engineering, the practitioner building the model is required to have a good understanding of the problem at hand—such as having a preconceived notion of possible effective predictors even before discovering them through the data,” said Jason Cottrell, who is the CEO of Myplanet. “For example, in the case of predicting defaults for loan applicants, an effective predictor could be monthly income flow from the applicant.”
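Cottrell’s loan example can be sketched as a small feature-engineering step: deriving a monthly income flow from raw deposit records. The column names and figures below are invented:

```python
# A hedged sketch of feature engineering for loan-default prediction:
# turning raw per-month deposits into per-applicant candidate predictors.
import pandas as pd

transactions = pd.DataFrame({
    "applicant_id": [1, 1, 1, 2, 2, 2],
    "month": ["2020-01", "2020-02", "2020-03"] * 2,
    "deposits": [4000, 4200, 3900, 1500, 0, 2800],
})

# Engineered features: average monthly inflow and its variability,
# both plausible predictors of default risk.
features = transactions.groupby("applicant_id")["deposits"].agg(
    monthly_income_flow="mean",
    income_volatility="std",
)
```

This is where domain knowledge enters: knowing that income *stability*, not just its level, predicts default is exactly the kind of preconceived notion Cottrell describes.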
But finding the right features can be nearly impossible in some situations. This could be the case with computer vision, such as when used with autonomous vehicles. Yet using sophisticated deep learning can be a solution.
“These days, neural networks are used to learn features, as they are better at understanding statistics than humans,” said Shadi Sifain, who is the senior manager of data science and predictive analytics at Paychex. “However, they are not necessarily a panacea and may also develop features that were not intended. The famous example is the image classifier that was developed to detect tanks and jeeps. Instead, it learned to detect night and day, since all the jeep photos were taken in the day and all the tank photos were taken in the museum at night.”
Tom (@ttaulli) is an advisor to startups and the author of Artificial Intelligence Basics: A Non-Technical Introduction and The Robotic Process Automation Handbook: A Guide to Implementing RPA Systems. He also has developed various online courses, such as for the Python programming language.