Planning ML projects is different from planning typical software engineering projects. ML projects are characteristically non-linear and have varying degrees of uncertainty. They require an iterative approach and an experimental mindset.
Project uncertainty
Early-stage planning can be difficult because the best approach typically isn't apparent when beginning a project. This inherent uncertainty makes estimating timelines hard.
A recent Kaggle competition illustrates the uncertainty of ML projects. In the first few weeks of the competition, 350 teams participated. The top teams were able to increase the benchmark prediction quality from 35% to 65%. During the next two weeks, the number of teams working on the problem increased from 350 to 1400. However, the best model only achieved a prediction quality of 68%.
Figure 3 illustrates the uncertainty in ML development by showing the significant increase in effort but only minimal gains in model quality.
Figure 3. Over a two-week period, the number of teams working on the problem increased by a factor of 4, but the model's quality stayed almost the same, highlighting the difficulty of estimating the effort an ML solution requires.
In other words, more than a thousand teams, each experimenting with a variety of data transformations, architectures, and hyperparameters, managed to produce a model with only 68% prediction quality.
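The arithmetic behind this example can be sketched in a few lines. The figures are the ones quoted above; the variable names are illustrative assumptions, and team count is used here only as a rough stand-in for effort.

```python
# Figures quoted in the Kaggle example above.
teams_before, teams_after = 350, 1400
quality_baseline, quality_before, quality_after = 0.35, 0.65, 0.68

# How much the number of teams (a rough proxy for effort) grew.
effort_factor = teams_after / teams_before

# Quality gains in percentage points for each phase of the competition.
early_gain_pts = (quality_before - quality_baseline) * 100
late_gain_pts = (quality_after - quality_before) * 100

print(f"Teams grew {effort_factor:.0f}x")
print(f"Early gain: {early_gain_pts:.0f} points")
print(f"Late gain: {late_gain_pts:.0f} points")
```

Four times as many teams produced roughly a tenth of the earlier improvement, which is why extrapolating from early progress is unreliable.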
An example from industry illustrates the non-linearity of ML projects, where the output isn't proportional to the inputs. Two teams took several months to train a model to 90% prediction quality. However, it took several teams more than five years to get the model ready for production with 99.9% prediction quality.
These examples highlight that production-ready ML is an exploratory process, requiring both a scientific and engineering mindset.
Experimental approach
In most cases, ML development is more like performing experiments than it is practicing traditional software engineering. ML requires testing different features, trying multiple architectures, and tuning hyperparameters just right. By definition, experiments aren't guaranteed to be successful. Because of this, it's best to plan using an experimental framework.
Let's look at a typical software engineering plan to see how it's different from an ML project plan.
Planning software engineering projects
In a typical software engineering plan, you define the requirements, outline the components, estimate the effort, and schedule the work. There's a clearly defined path to a solution. For example, engineers often know with a high degree of certainty the tasks they need to complete to build an application that meets the design spec.
When they predict the time it'll take to complete a task, they can estimate the work based on similar projects. Challenges such as unknown dependencies or changing requirements invariably occur and can make estimating difficult, but a clear path to the solution typically exists.
In contrast, ML projects don't normally have one clear path to success.
Planning ML projects
For most ML projects, you'll find the best solution by experimenting with multiple approaches in a trial-and-error process. You won't typically know the optimal solution to your problem before attempting to solve it. For example, the optimal solution's architecture might be a simple linear model, or a neural net, or possibly a decision tree. Only by trying each approach can you discover the best solution.
This ambiguity makes planning hard. As discussed previously, predicting the effort an ML project will require is difficult. Only by attempting to solve the problem can you get a better sense of the amount of time and resources a solution might require.
The following are recommended strategies for planning ML work:
- Time box the work. Set clear timeframes to complete tasks or attempt a particular solution. For example, you might allocate two weeks to determine if you can get access to the right kind of data. If you can get the data, you then might designate two more weeks to see if a simple model indicates that an ML solution is feasible. If a simple model fails, you might designate two more weeks to try a neural net. At the end of each timeframe, you'll have more information to determine if continuing to apply resources to the problem is worthwhile.
- Scope down the project requirements. If an ML solution appears promising but isn't a critical feature for your product or service, scope back its requirements. For example, when planning the next quarter's work, you might plan to try a very simple solution. Then in subsequent quarters, you might plan to iteratively improve the solution. Many teams have arrived at impactful ML solutions by making incremental improvements over a longer time horizon.
- Make it an intern or new hire project. Directing and guiding an intern or a new hire to attempt an ML solution can be a good way to begin exploring a new space with unknown outcomes. After the project is over, you'll have a better sense of the effort an ML solution will require and of potentially promising approaches to pursue, or of whether resources should be put somewhere else.
With any strategy, it's wise to fail fast. First attempt the approaches with the lowest cost and potentially the highest payoff. If the approach works, you've found a good solution. If it doesn't, you haven't wasted lots of time and resources.
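One way to make the fail-fast ordering concrete is to rank candidate approaches by expected payoff per unit of cost. The following is a minimal sketch, not a prescribed method: the `Approach` class, the scoring formula, and every number in it are hypothetical placeholders that a team would replace with its own rough estimates.

```python
from dataclasses import dataclass


@dataclass
class Approach:
    name: str
    cost_weeks: float    # estimated effort (hypothetical)
    success_prob: float  # rough guess between 0 and 1 (hypothetical)
    payoff: float        # relative value if the approach works (hypothetical)


def expected_payoff_per_week(approach: Approach) -> float:
    """Expected payoff divided by cost: higher means try it sooner."""
    return approach.success_prob * approach.payoff / approach.cost_weeks


def fail_fast_order(approaches: list[Approach]) -> list[Approach]:
    """Sort approaches so cheap, potentially high-payoff ones come first."""
    return sorted(approaches, key=expected_payoff_per_week, reverse=True)


# Made-up estimates for the three model families mentioned earlier.
candidates = [
    Approach("neural net", cost_weeks=6, success_prob=0.5, payoff=10),
    Approach("linear model", cost_weeks=2, success_prob=0.4, payoff=8),
    Approach("decision tree", cost_weeks=3, success_prob=0.3, payoff=8),
]

for approach in fail_fast_order(candidates):
    score = expected_payoff_per_week(approach)
    print(f"{approach.name}: {score:.2f} expected payoff per week")
```

With these made-up numbers the linear model comes first, because it is cheap to try even though the neural net has the larger potential payoff. The estimates are guesses, so the ranking is a starting point for discussion and not a schedule.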
As teams gain experience and exposure to running experiments, they'll be able to better estimate the effort an experiment might require, making planning more predictable. However, an experiment's outcome will almost always be unknown, so the number of experiments needed to find the best solution can't be estimated beforehand.
Planning approaches with an experimental mindset helps set your team up for success. When an approach leads to a dead end, rather than being discouraged, team members understand that it's part of the process of finding an ML solution. More importantly, by discussing the inherent uncertainty in ML development with stakeholders, you're able to set more realistic expectations.
Keep in mind
Learning to plan multiple ML approaches probabilistically takes time and experience. Your project plan might require frequent updates. Think of it as a dynamic document in constant evolution as your team experiments with multiple approaches. By focusing on the following key ideas, you'll increase your chances of success:
- Estimate the cost and chance of success for each approach.
- Attempt a portfolio of approaches.
- Identify lessons learned and try to improve the system one thing at a time.
- Plan for failures.
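The first two ideas can be combined into a quick estimate of how likely a portfolio is to produce at least one working approach. This sketch assumes the approaches succeed or fail independently, which is a simplification, and the probabilities are hypothetical guesses rather than measured values.

```python
import math


def chance_any_succeeds(success_probs) -> float:
    """Probability that at least one approach works, assuming independence."""
    # The portfolio fails only if every approach fails.
    chance_all_fail = math.prod(1.0 - p for p in success_probs)
    return 1.0 - chance_all_fail


# Hypothetical per-approach success estimates.
portfolio = {"linear model": 0.4, "decision tree": 0.3, "neural net": 0.5}

overall = chance_any_succeeds(portfolio.values())
print(f"Chance at least one approach succeeds: {overall:.0%}")
```

No single approach in this made-up portfolio is better than a coin flip, yet together they have roughly a 79% chance of yielding one success. In practice, approaches that share the same data or the same assumptions tend to fail together, so the real figure would likely be lower.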
Occasionally an early approach leads to a breakthrough. Someone might discover a bug in the data generation pipeline or training-validation split. With good planning and thorough documentation, you increase the likelihood that you'll find a model that solves your business problem sooner than expected.