If you talk to corporate chief information officers today about their internal data science programs, you’ll likely get a response of enthusiasm for the potential business gains … or a worried look from a CIO who just made a major investment and is really hoping they’ll have something to show the Board of Directors that quarter. Or a combination of both.
In my regular conversations with CIOs, I see this push/pull dynamic playing out at company after company. Domino Data Lab recently conducted a survey with Wakefield Research on executive opinions about data science initiatives, and found that 97% of executives expect a revenue increase from their data science programs. I also read an Accenture survey showing that 75% of executives believe their company will go out of business if they can’t scale data science successfully within the next five years.
So you have a rush of excitement about the potential gains from artificial intelligence alongside an extinction-level fear of getting data science wrong. It’s a tough spot to be in, especially when 82% of data execs say they’re under pressure to make short-term, splashy investments so they can show a quick win.
How can a CIO, the board, and the executive team avoid falling into the trap of splashy investments that don’t last? You have to look at your metrics and make sure you’re measuring the right KPIs. To quote Peter Drucker: “If you can’t measure it, you can’t improve it.”
The KPIs you choose should point to building a sustainable machine capable of producing a steady stream of highly profitable models. You want to avoid nearsighted metrics that won’t be sustainable over time, such as predicting consistent quarter-to-quarter growth. And you want to avoid the gold rush mistake of diving into the mine before you build a framework that will support your program over time.
I’ll share a set of core principles that I’ve found the most successful data science programs have in common. These should guide your choice of metrics. I’ll then share specific ideas for metrics that apply both to your data science program and to the actual outputs of data science. Combined, these metrics can help drive sustainable, long-term gains.
Core Principles of Data Science
For companies early in the process of growing their data science programs, here are four principles to keep in mind as you’re thinking about how to measure the long-term impact of your data science program:
- Iteration speed. How rapidly is your team iterating on ideas and models? Velocity is more important than big breakthroughs. You want to set your team up for long-term success. That means building a product-generating machine that will justify your initial investment over time and deliver consistent results.
- Reusable knowledge. Building on the intelligence and experience of your team is more important than producing an immediate answer. You need to build reusable assets. This means you should prioritize creating a searchable, shareable knowledge base that can be a catalyst for future product research and development.
- Tool agility. With the pace of innovation in analytics, you’re seeing new tools all the time. Success will require agility and flexibility in how you use your tools and how quickly the team can ramp up on new software. Don’t put all your technology eggs in one basket. This applies to infrastructure, frameworks, programming languages and tool solutions.
- Process and culture. Building a successful analytics flywheel takes more than just technology—you need a team whose members can support each other’s work and a culture of growth and learning. Senior leaders know that building the right team is their single biggest goal and the biggest determinant of success or failure. Give your team the infrastructure they need to discover miracles.
If these strategic objectives are front and center, you’ll be on the right path towards long-term success. The next step is to look at how to build and evaluate a successful team.
Building the Foundation
When starting out with a newer data science program or expanding a current program, senior leaders should look at ongoing expenses, how much knowledge the team is creating over time, and how quickly new team members are adding value. There are three key areas to consider here:
- Costs of running the program. There will be recurring costs for essential tools and data management to consider, but don’t forget to measure the expense of support from your IT team. If you set up your infrastructure so data scientists can take a self-service approach, then the number of support tickets should drop over time.
- Contributing to the knowledge base. You want to reward data scientists for sharing their insights and contributing to the company knowledge base, not just for producing valuable new models. To quantify and acknowledge that collaboration, track each person’s contributions and commits to the main knowledge management platform to measure individual success, and track the aggregate contribution rate over time to evaluate the team as a whole.
- Onboarding. See if there are ways to set up your data science programs so you can accelerate the onboarding process. You want to make sure that new employees can add value quickly, and not slow down more experienced people on the team by asking where everything is. The easiest way to get people up and running is to make it easier for employees to find information on their own.
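As a minimal sketch of the contribution metrics described above—assuming you can export a log of (author, date) entries from your knowledge management platform’s API or commit history (the names and data here are hypothetical)—the per-person and aggregate numbers might be computed like this:

```python
from collections import Counter
from datetime import date

# Hypothetical contribution log exported from your knowledge platform:
# one (author, date) pair per knowledge-base contribution or commit.
contributions = [
    ("ana", date(2024, 1, 5)),
    ("ana", date(2024, 2, 9)),
    ("ben", date(2024, 1, 20)),
    ("ben", date(2024, 3, 2)),
    ("cara", date(2024, 3, 15)),
]

# Per-person totals, to recognize and reward individual contributors.
per_person = Counter(author for author, _ in contributions)

# Aggregate rate: contributions per active month across the whole team,
# tracked over time to evaluate the team's overall collaboration.
months = {(d.year, d.month) for _, d in contributions}
aggregate_rate = len(contributions) / len(months)

print(per_person)
print(round(aggregate_rate, 2))
```

In practice you would pull this log automatically (for example, from a wiki’s revision API or a Git repository) and chart the aggregate rate quarter over quarter rather than computing a single snapshot.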
Once a data science program is up and running, leaders need to look at how they can speed up the process to deliver improved results over time. Data science is evolving quickly, but you shouldn’t need to reinvent the wheel with each new project. Leaders should look at how they can reuse and iterate on the broader team’s previous work to jump-start the next project and deliver results quickly.
At Domino Data Lab, we call this “model velocity.” This describes how long it takes to create a new model, deploy it to production, and update and retrain it on a regular basis. Model velocity measures the speed of your data science flywheel, and the rate at which you deliver model-based products.
- Model creation. Start tracking the raw time it takes to create a new model, from initial planning to production deployment. If you build up a knowledge base and create collaborative workflows, you may be able to cut the time to create a new model from 180 days down to 14 days. Use your experience from each project to make it easier to build the next model.
- Production deployment. Once you’ve developed a model, look at how long it takes to get a validated model into production. For too many companies, each deployment requires unique infrastructure changes and adjustments to manage incoming data. If you create a documented, repeatable process for your IT team, you can streamline deployment and get a model into production in a day instead of months.
- Regular updates. Once a model is in production, you need to give it some care and “feeding” to maintain viability. When an issue comes up, or data shifts, look at your approach and procedures to figure out the root cause. Make sure you have a defined procedure to fix or update models, and a regular cadence of model updates to proactively address variance in the data stream.
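The model velocity measurements above can be sketched in code. This is an illustrative example, assuming you record kickoff, deployment, and retrain dates for each model in some tracker (the model names and dates are made up):

```python
from datetime import date
from statistics import mean

# Hypothetical lifecycle records per model, as exported from a tracker:
# when work started, when the model reached production, and each retrain.
models = {
    "churn":   {"kickoff": date(2024, 1, 2), "deployed": date(2024, 3, 1),
                "retrains": [date(2024, 4, 1), date(2024, 5, 1)]},
    "pricing": {"kickoff": date(2024, 2, 1), "deployed": date(2024, 2, 20),
                "retrains": [date(2024, 3, 20)]},
}

# Model creation time: days from kickoff to production deployment,
# averaged across models -- the core "model velocity" number.
creation_days = [(m["deployed"] - m["kickoff"]).days for m in models.values()]
avg_creation = mean(creation_days)

def cadence(m):
    """Average days between successive production refreshes,
    counting the initial deployment as the first refresh."""
    points = [m["deployed"], *m["retrains"]]
    gaps = [(later - earlier).days for earlier, later in zip(points, points[1:])]
    return mean(gaps)

print(avg_creation)
print({name: cadence(m) for name, m in models.items()})
```

Watching these two numbers over time tells you whether reuse is actually compounding: average creation time should fall as the knowledge base grows, and update cadence should stay regular rather than spiking reactively when data drifts.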
Apply Key Principles
I’ve shared a few different ways to measure the success of your data science team and to make sure you can show consistent improvement and business impact over time. I’ve seen companies discover unique customer insights thanks to data science. But I’ve also seen programs at blue-chip companies fail when everything is treated like a prototype and the company never builds a long-term program.
By building a data science machine based on continual iteration and improvement, your teams and employees can deliver better results and reduce the time to new insights. The key questions and principles above will give you a place to start the discussion and determine specific metrics for your company so you can track your long-term success.