Veteran Data Scientist Offers Tips for Analytics Success

Loraine Lawson

Rudyard Kipling once said, “If history were taught in the form of stories, it would never be forgotten.” Could the same be true for your data?

Mike Cavaretta argues that it is true. Cavaretta is a veteran data scientist, as well as a manager at the Ford Motor Company in Dearborn, Michigan. In a recent GigaOm column, he says telling a good story is key to helping others understand your data.

“Many analytics presentations crash and burn because no one answered the question, ‘So what?’” he writes. “Almost as bad are the presentations with dense formulas and a single R² value. Take your audience on a data journey.”


Specifically, he recommends watching Hans Rosling, a professor of global health at Sweden's Karolinska Institute, TED speaker and “data visionary.” Rosling apparently makes data “sing” in his presentations.

That’s just one small recommendation Cavaretta makes in his column, but it’s a fairly significant one. In “The Social Neuroscience of Education,” Pepperdine University Professor of Psychology Louis Cozolino, Ph.D., points out that stories are so ubiquitous, we seldom notice their existence — or their power.

“Stories connect us to one another, help to shape our identities, and serve to keep our brains integrated and regulated,” Cozolino writes. “Like our primitive social instincts, storytelling has a deep evolutionary history that has been woven into the fabric of our brains, minds, and relationships.”

I think that’s a real challenge for the data-minded worker. For some, pie charts and line graphs speak volumes; however, it’s worth remembering that not everyone’s mind works that way. Most of us are hardwired to favor stories. If that weren’t the case, there’d be no shortage of data scientists.

Cavaretta breaks down his recommendations along what he sees as the essential components for any analytics strategy: Data, Process and People.

Data is somewhat self-evident, so he doesn’t spend a lot of space on it. The real meat of the piece can be found in the “process” part. He subdivides it into two areas: the process for finding value in the data and the process for adding analytics to the business. The bulk of his advice centers on that critical, but oh-so-tricky, first part: finding the value in the data.

Last week, I wrote about the debate in Big Data over whether you should start with a hypothesis or “explore” the data without a preconceived goal. In “What Charles Darwin Can Teach Business Leaders About Big Data,” I suggested a middle path that would use your existing business assumptions as a hypothesis you should try to disprove.

Cavaretta takes a different tack, but his strategy strikes a similar balance between the two extremes:

“Work from a list of important questions, but leave time for discovery. I’m not a fan of completely unconstrained investigation. Without some generic questions the likelihood of getting value is low. Conversely, it’s easy to leave value on the table if an analytics project is too rigorously structured.”

One other point caught my attention in this excellent piece, and that is Cavaretta’s recommendation that you always look for more data. In particular, he suggests you do that by breaking down data silos (hullo? Integration!), leveraging open data or obtaining a longer time series.

As an added bonus, he even talks about another idea we’ve covered: how to build a data science team instead of hiring a data scientist.


