Maybe you’ve worked on your first few AI or ML projects and encountered some difficulties.
Or maybe you’re just getting started on your first one. Whatever the case may be, it’s important to recognize that – even though they can provide great value – AI initiatives require a certain amount of diligence.
If you have already struggled, you should give yourself some grace, because even Google Research has called machine learning “the high-interest credit card of technical debt.”
AI initiatives fail for a plethora of reasons. But with a bit of foresight, failures can be avoided. In this article, I’m going to outline some of the problems that I’ve seen and hopefully lay out a path for you to avoid becoming part of a failed AI project.
1. You Don’t Have the Right Data Available
Have you ever tried to bake a cake without a pan? Or change a tire without one of those turner thingies?
That’s what it’s like to be told to build an AI project without the right data. You need a solid foundation to build upon. Oftentimes, everyone on the team is expecting that a certain piece of data is being collected. But with all the data coming in, sometimes a key piece falls through the cracks.
For instance, maybe a particular column is missing data or holds a strange placeholder value instead of the clean values needed to build a model. Data quality is a necessary component of deploying a successful ML model. Data quality checks are usually set up on the data engineering side of the project.
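As a rough illustration of what such a check might look like, here is a minimal sketch that reports how often each column is missing or holds a placeholder value. The column names, the sample rows, and the list of "suspicious" placeholders are all made up for the example; your own data will need its own rules.

```python
from typing import Any

# Placeholder values that often stand in for "unknown".
# This set is an assumption; adjust it to match your own data.
SUSPICIOUS_VALUES = {"", "N/A", "null", "unknown", -1}


def missing_rate_by_column(rows: list[dict[str, Any]]) -> dict[str, float]:
    """Return, per column, the fraction of rows that are missing or suspicious."""
    if not rows:
        return {}
    # A column counts even if only some rows contain it.
    columns = sorted({key for row in rows for key in row})
    report = {}
    for column in columns:
        bad = 0
        for row in rows:
            value = row.get(column)  # None when the key is absent
            if value is None or value in SUSPICIOUS_VALUES:
                bad += 1
        report[column] = bad / len(rows)
    return report


# Hypothetical sample of customer records.
sample_rows = [
    {"customer_id": 1, "last_login": "2024-01-03", "plan": "pro"},
    {"customer_id": 2, "last_login": None, "plan": "N/A"},
    {"customer_id": 3, "plan": "basic"},
    {"customer_id": 4, "last_login": "", "plan": "pro"},
]

for column, rate in missing_rate_by_column(sample_rows).items():
    print(f"{column}: {rate:.0%} missing or suspicious")
```

A report like this is cheap to run before any modeling starts, and it gives the team a concrete list to discuss instead of a vague feeling that something is off.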
Are there ways to make up for missing data?
Yup, but not all will work for every use case.
For example, if you know customers are leaving your platform but you’re not saving the right behavioral data to your data warehouse, then your data scientists really can’t build a model that predicts customer churn. They will try to hunt down some speck of meaning to feed the model, but that’s not what builds the most successful outcomes. They can try to mock up certain metrics to fill the data in, but “your results may vary.”
One way to help prevent this problem is to give your data scientist access to all of your data and ask them to come back with a list of any missing data they think is necessary. Sometimes, it’s actually there but being stored in an unexpected place.
And other times you will have to hire a data engineer or API specialist to start collecting the right data so that you can have a successful AI project.
2. You Don’t Establish a Metric for Success
What’s sometimes more important than the data is your metric for success. This doesn’t have to be a number. It can be an improvement on the distribution of a piece of data or the curve of a graph. Looking at a certain percentage or number isn’t always possible.
After all, data science is a science.
There’s a lot of experimentation involved and we don’t know when the experiment may have a successful end.
With that being said, if your team isn’t setting and sticking to a measure of success, then your team is living on the wild side. You’ll never know if something was achieved or not.
A metric for success removes a lot of ambiguity for data science projects. The DS team knows what they are working toward and you can also hold them accountable for doing more research and finding new ideas in that direction.
It’s easier to make a list of techniques to try and test in a single direction than to just be thinking of all the techniques that have ever been created to solve the ambiguous problem.
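To make this concrete, here is a minimal sketch of writing a success metric down in code so that every experiment is judged against the same bar. The choice of recall, the 0.70 target, and the labels below are assumptions for illustration only; the right metric and target depend on your project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuccessCriterion:
    name: str
    target: float  # hypothetical target agreed on with stakeholders


def recall(y_true: list[int], y_pred: list[int]) -> float:
    """Share of actual positives (label 1) that the model caught."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_pos = sum(1 for t in y_true if t == 1)
    return true_pos / actual_pos if actual_pos else 0.0


def meets(criterion: SuccessCriterion, value: float) -> bool:
    """True when the measured value reaches the agreed target."""
    return value >= criterion.target


# Made-up labels: 1 = customer churned, 0 = customer stayed.
actual = [1, 1, 1, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 1]

criterion = SuccessCriterion(name="recall on churned customers", target=0.70)
score = recall(actual, predicted)
print(f"{criterion.name}: {score:.2f} (target {criterion.target:.2f})")
print("success" if meets(criterion, score) else "keep experimenting")
```

The point is less the specific metric than the habit: once the criterion lives in one place, each new technique can be tested against it instead of being argued about.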
3. You Create a Solution with Excessive Complexity
I’m sure you’ve heard the joke, “How many people does it take to screw in a lightbulb?” Well, the answer is one. And certain techniques are like using ten people to screw in a lightbulb.
Take deep learning for example. It has many great use cases. However, in many business use cases, it’s simply too much. It sounds fancy. The VCs love it and it makes us look cool but it’s probably not the most effective way to solve a problem.
Not only that, but it’s not easy to know why deep learning algorithms make the decisions they do. In other words, they can be a bit of a black box. If you don’t want to work with a black box, or if you have to comply with any form of regulation, a deep learning technique may not be viable.
A data-driven operations, marketing, or sales strategy needs answers. Those answers can be used to improve processes.
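For a sense of how simple an answer-giving model can be, here is a sketch of a one-feature threshold rule (sometimes called a decision stump). The feature, the numbers, and the churn labels are invented for the example, and a real project would evaluate on held-out data rather than the data used to pick the threshold.

```python
def fit_stump(values: list[float], labels: list[int]) -> tuple[float, float]:
    """Find the threshold on one feature that best separates the labels.

    The rule predicts 1 when value >= threshold.
    Returns (threshold, accuracy on the data it was fitted to).
    """
    best_threshold, best_accuracy = values[0], 0.0
    for candidate in sorted(set(values)):
        correct = sum(
            1 for v, y in zip(values, labels) if (v >= candidate) == (y == 1)
        )
        accuracy = correct / len(values)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = candidate, accuracy
    return best_threshold, best_accuracy


# Made-up data: days since last login, and whether the customer churned.
days_inactive = [1, 2, 3, 20, 25, 30, 4, 28]
churned = [0, 0, 0, 1, 1, 1, 0, 1]

threshold, accuracy = fit_stump(days_inactive, churned)
print(f"Rule: flag customers inactive for {threshold}+ days "
      f"(accuracy on this sample: {accuracy:.0%})")
```

A rule like this can be read aloud in a meeting and acted on directly. If a simple baseline already meets your metric for success, the added complexity of a deep learning model may not be worth it.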
4. You Don’t Prototype the Expected Outcome
Prototypes are amazing! And you need one.
A prototype helps you to know what the expected output from the model needs to look like. Oftentimes, we have one idea in our heads and the business has another. By starting with a prototype’s expected output, we can merge these two ideas.
This can be a dashboard that needs to be viewed by managers or the sales team. It can be how a feature will be displayed in an application. It can be how the customer will interact with it on their end or simply how it needs to feed into the email automation system.
None of these has to be fully built out, but the visual will create another opportunity to ensure all stakeholders are on the same page.
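When the output feeds another system rather than a screen, the prototype can be as small as a stub that returns a fixed answer in the agreed shape. The sketch below assumes a churn use case; the field names and the `stub_model` function are hypothetical and exist only to give stakeholders something concrete to react to.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ChurnPrediction:
    """Hypothetical output shape, to be agreed on with the business."""
    customer_id: str
    churn_risk: float  # expected range: 0.0 to 1.0
    top_reason: str    # short explanation shown to the sales team


def stub_model(customer_id: str) -> ChurnPrediction:
    """Placeholder for the real model: always returns the same answer."""
    return ChurnPrediction(
        customer_id=customer_id,
        churn_risk=0.5,
        top_reason="not yet modelled",
    )


def to_payload(prediction: ChurnPrediction) -> str:
    """Serialize a prediction the way a downstream system might receive it."""
    return json.dumps(asdict(prediction), sort_keys=True)


print(to_payload(stub_model("c-1")))
```

Downstream teams can build against the stub right away, and any disagreement about what the model should return surfaces before the expensive modeling work begins.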
5. You Don’t Have Budget for Unexpected Complexities
Data science is a science. The process for putting models into a usable environment is new. This means that you uncover many new things along the way to your solution. Some of those discoveries could add value or insights, but others will mean that you need to rethink your solution.
It is a good idea to communicate appropriate expectations, set cautious budget and timeline projections, and work in an agile manner to account for discoveries or changes. Beyond that, you should include time and budget for testing, deployment, and integration into your product. The length of each phase will vary based on the size of your team and their expertise.
Best Practices for Setting up Data Science Projects
1. Make it reproducible. As with any science, others need to be able to recreate the work or the experiment isn’t valid. Documenting the experiment and verifying the output statistically is a good place to start. This is important to validate your model assumptions.
2. Set up Automation and Testing. While adding testing to ML projects is new, it is possible. In recent years, people have been adapting software engineering techniques so that notifications, alerts, and thresholds keep bad models from being deployed into production.
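Both practices can be sketched in a few lines. In the example below, `evaluate` is a stand-in for a real training and evaluation run (it only simulates a score), and the thresholds passed to `should_deploy` are made-up numbers. The idea is that a fixed seed makes the run repeatable, and an automated gate refuses to ship a model that falls below an agreed floor or below what is already in production.

```python
import random


def evaluate(seed: int, n_samples: int = 200) -> float:
    """Stand-in for a training-plus-evaluation run; returns a simulated accuracy."""
    rng = random.Random(seed)  # local, seeded generator so the run is repeatable
    correct = sum(1 for _ in range(n_samples) if rng.random() < 0.8)
    return correct / n_samples


def should_deploy(candidate_score: float, production_score: float, min_score: float) -> bool:
    """Allow deployment only if the candidate clears the floor and is at least as good as production."""
    return candidate_score >= min_score and candidate_score >= production_score


# Reproducibility check: the same seed must give the same result.
first_run = evaluate(seed=42)
second_run = evaluate(seed=42)
print("reproducible:", first_run == second_run)

# Deployment gate with hypothetical thresholds.
if should_deploy(candidate_score=first_run, production_score=0.75, min_score=0.70):
    print("candidate passes the gate")
else:
    print("candidate blocked; alert the team")
```

In practice, a check like this would typically run automatically as part of your pipeline, so a bad model is stopped before anyone has to notice it by hand.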
Our sincere hope is that this blog provided some insights to help your AI initiatives find success. If you have any questions or could use a little guidance, our team of AI experts is happy to jump in and help.
We (humble brag alert) have helped numerous clients across many verticals/business sizes find early success in their AI initiatives by leveraging our vast expertise. If you’re ready to take the next step in your AI journey, don’t hesitate to reach out!