by Anantha Kancherla and Vivian Guo | September 27, 2021
Make Machine Learning Work for Your Company: A Primer
Over the last 50 years, machine learning (ML) has evolved through a series of hype cycles—periods of public fervor as well as funding droughts known as “AI winters”—to reach mainstream applicability and acceptance. With recent computing advances, we now see machine learning being widely used for things like search and feed ranking, spam filtering, and warnings about suspicious credit card activity. A specific form of ML called Deep Learning has fueled the recent growth in Natural Language Processing (NLP), autonomous driving, image and object recognition, and virtual personal assistants. Now, machine learning has evolved to the point where it won’t just be integrated into new products but will also transform how products are built.
Already today, ML offers enough benefits for product development that most companies should consider incorporating it into their processes. But when does it make sense to invest in machine learning capabilities and how do you actually build a machine learning team? And what does this mean for software companies?
When do you need machine learning?
As a basic principle, companies should apply machine learning when the problem is so complex that rules-based approaches no longer scale. This usually means that there are too many data points or that the input data is “high-dimensional.” One caveat to keep in mind: don’t expect perfect predictions. Models will occasionally be wrong, as is true of any statistical system.
For businesses considering ML in a customer-facing part of their offering, the capabilities often depend on rich data that typically becomes available only once the number of customers has grown significantly (after product-market fit). For example, you might be building an e-commerce solution and want to recommend what a customer should buy. Other companies might have rich and high-dimensional data even earlier in their lifetime (e.g., logs from servers, data streams they acquire, audio/images, text, etc.). Both of these scenarios are valid reasons to explore ML/AI. And of course, there are also companies with an “AI-centric” problem, like building a chatbot.
A good signal that the time may be right for machine learning is if you are deploying data scientists on these types of problems and want to further automate and scale your approach. It’s also important to note that machine learning is most effective for scaling and automating strong products. It will not transform a mediocre product into a great one.
How do you get started with your first machine learning team?
Once it’s time to deploy machine learning, the best way to start is by pulling in your most senior tech lead (or the CTO in earlier stage companies) to read up and prototype some ML. Strong engineers should be able to easily pick up the basic techniques. Even if they’re not yet able to keep pace or understand the intricacies, the technical team should still be capable of structuring a solution.
There are a few different types of ML engineers you will find in the market. The first big division is between:
- ML engineers: Look first for engineers who are focused on the ML algorithms. There are two types of ML engineers:
  - ML generalists: These engineers are adept at ML techniques but are flexible in how they apply this skill. These are the most likely candidates for your first ML hires.
  - ML specialists: In addition to being experts at ML algorithms, they are specialized in certain domains (e.g., computer vision, ranking, text, etc.). Ideally, you would hire a specialist whose domain most closely matches the problem you are solving, but this might take time.
- ML infrastructure engineers: There is enough of a difference between ML infrastructure and traditional infrastructure that experts in the former can make a big difference down the road. However, you can start by pulling in your infrastructure generalists to help.
You might end up hiring (or pulling in) a few people for your first ML solution, such as a few generalists, perhaps a specialist, and a couple of individuals to help with the infrastructure. We have seen companies initially keeping them in a separate organizational pod for better load balancing and collaboration. In mid-sized companies, this trend of a central ML team is also particularly common.
That said, ML is a technique—if you are deploying ML in multiple places, it behooves you to embed ML teams in each of these “product areas” while keeping a central pod of engineers who will build your ML infrastructure. This doesn’t mean they will build everything from scratch—even stitching together external offerings will require significant work.
One common failure is that as companies bring in ML, they are still dependent on their data scientists. These companies often have a scaled data science (DS) team that now needs to be migrated to ML. Pay special attention to integrating these two teams (ML and DS). We recommend a combined DS/ML team; when you split and embed ML into multiple product areas, do it together with your DS team.
How should you think about hiring and spend in today’s market?
The job market is incredibly competitive and hiring will be challenging. While major players like Google and Facebook will continue to vacuum up ML Engineers, there are enough ML specialists who will be interested in the specific problem you are trying to solve. And as with hiring your traditional engineers, many ML practitioners will also appreciate the autonomy that a smaller organization provides. Still, know that the “supply” will be limited. Be smart about how your engineering resources are deployed and what you build versus buy.
Lastly, dedicate time early on to think through your ML development workflow. This pre-planning will reduce frustration for your ML engineers who produce the algorithms. Ensuring they see the impact their work has on the company can help tremendously with retention.
As with traditional engineering, it’s important to measure and track the impact of your machine learning deployment over time. A few metrics to consider include:
- The time it takes for the team to iterate on a model
- The degree to which you run iterations in parallel
- The financial cost of the above
- The utilization of expensive hardware
- Failure rate and frequency of resorting to human intervention
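As a rough illustration, several of the metrics above reduce to simple arithmetic over a log of training runs. The sketch below is only one possible shape for such tracking: the `TrainingRun` record, its field names, and both helper functions are hypothetical, and the numbers in the usage example are made up for demonstration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TrainingRun:
    """Hypothetical record of one model iteration (all fields are assumptions)."""
    hours_to_iterate: float    # wall-clock time to complete one iteration
    gpu_hours_used: float      # GPU hours actually spent computing
    gpu_hours_reserved: float  # GPU hours paid for or reserved
    cost_usd: float            # total spend attributed to this run


def summarize_runs(runs):
    """Aggregate iteration time, cost, and hardware utilization over runs."""
    if not runs:
        raise ValueError("need at least one run to summarize")
    used = sum(r.gpu_hours_used for r in runs)
    reserved = sum(r.gpu_hours_reserved for r in runs)
    return {
        "mean_iteration_hours": mean(r.hours_to_iterate for r in runs),
        "total_cost_usd": sum(r.cost_usd for r in runs),
        # Fraction of reserved GPU time that was actually used.
        "gpu_utilization": used / reserved if reserved else 0.0,
    }


def human_intervention_rate(total_predictions, escalated_to_human):
    """Share of predictions that had to be handed off to a person."""
    if total_predictions == 0:
        return 0.0
    return escalated_to_human / total_predictions


if __name__ == "__main__":
    # Illustrative numbers only.
    runs = [
        TrainingRun(4.0, 6.0, 8.0, 120.0),
        TrainingRun(2.0, 3.0, 4.0, 60.0),
    ]
    print(summarize_runs(runs))
    print(human_intervention_rate(200, 10))
```

Tracking these figures per week or per release, rather than as one-off snapshots, makes it easier to see whether iteration speed and hardware use are improving as the team grows.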
It’s also imperative to start understanding the ROI of your ML team early on. Determine how many of your recent “wins” are directly attributable to the ML algorithm. These wins tend to come from two kinds of work:
- Constant experimentation (tweaking) of the shipping models
- Adopting cutting-edge, breakthrough techniques (this is rarer)
This space is rapidly changing. It’s important to keep up with the latest breakthroughs in this area and keep learning to prevent disruption by an upstart that’s managed to harness this emerging technology. AI can also get expensive rapidly if you’re not mindful of costly variables like GPU usage.
Like cloud and mobile, we believe machine learning is a profoundly transformative technology and has limitless potential to change how solutions are created, and therefore how software teams are built. It’s taken more than fifty years, but ML is finally ready for prime time.
We welcome any questions or discourse as you develop your machine learning capabilities.