It’s March Madness, and here in Kentucky, people accept that as a literal command. I don’t know about where you live, but I’ve no doubt right now, all across my state, people are more concerned with finishing their NCAA playoff brackets than anything else.
Today, NPR ran a fun piece discussing the approaches people use to fill in their brackets, which included everything from the usual tactics to picking the winners by mascots (a bear could totally take a cardinal, right?).
But the most data-cool pool by far belongs to Danny Tarlow, a postdoctoral researcher at Microsoft Research Cambridge. In his pool, programmers develop algorithms to fill in the brackets.
Tarlow runs a blog called “This Number Crunching Life,” and each year he hosts a pool to see who can design the best algorithm for filling in the bracket. According to the report, the programs rely on past basketball stats to make their predictions, of course, then refine their guesses as new games are played.
Brilliant! I heard it and thought maybe somebody had written a program that would solve this whole bracket-thing once and for all so we could all get back to what really matters, which is who is going to beat up whom in the Stanley Cup. Priorities: I has them, to quote an LOLCat somewhere.
For his first attempt, Tarlow borrowed the basic structure of a program that recommends books to readers and used it to fill in the bracket. Since he won the pool, he was pretty happy with it.
But did this data-analyst approach outperform the rest of us? Not really.
“Looking at the group of algorithms, it's probably not that much different than you would expect to see out of your group of friends,” Tarlow told NPR.
Now that’s telling. Is it possible the programmers brought their own bias to the algorithms? They all start from the same datasets, although there are rules for adding others. Given that they’re asking the same basic questions of the same basic data, you would think the results would be more or less the same.
But that’s not what happened.
Okay – but who cares? It’s just basketball, so the algorithms aren’t perfect. The problem is, it’s NOT just basketball. What happens when it’s orange juice? Or a data-driven enterprise? What if it’s Wall Street, or even the world?
Tarlow’s contest is food for thought about how data analysis gets coded, but it’s also a reminder that situations where data’s impact stays this limited are rare.
“No matter how much computer analysis you do, you're still stuck with the way the ball bounces,” Mike Weimerskirch, a math professor and sports fan at the University of Minnesota, told NPR.
We’ve been told all our lives: “That’s the way the ball bounces,” but in this age of computation, it’s hard to accept — no matter what the data shows us.
By the way, here’s a stat for you sports fans and data heads: Your chances of filling out a perfect bracket if you pick only the top-seeded teams? About one in 150 billion.
Good luck on that one. Maybe this year, I’ll skip the bracket and buy a lottery ticket instead.
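For the data heads who want to see where a number like that comes from, here is a rough back-of-the-envelope sketch in Python. It rests on assumptions that are mine, not NPR's: a 64-team bracket with 63 games (ignoring play-in games), and every game treated as an independent pick with the same chance of being right. The function names are made up for illustration.

```python
GAMES = 63  # assumption: 64-team bracket, play-in games ignored


def perfect_bracket_odds(p_correct: float, games: int = GAMES) -> float:
    """Return N such that the chance of a perfect bracket is 1 in N,
    assuming each game is picked correctly with probability p_correct."""
    return 1.0 / (p_correct ** games)


def implied_accuracy(one_in: float, games: int = GAMES) -> float:
    """Return the per-game accuracy that gives a 1-in-`one_in` chance
    of a perfect bracket, under the same independence assumption."""
    return one_in ** (-1.0 / games)


if __name__ == "__main__":
    # Pure coin flips: 0.5 for every game.
    print(f"Coin flips: 1 in {perfect_bracket_odds(0.5):.3g}")

    # Work backward from the 150-billion figure quoted above.
    print(f"Implied per-game accuracy: {implied_accuracy(150e9):.1%}")
```

Under those assumptions, flipping a coin for every game works out to roughly one in 9.2 quintillion, and the 150-billion figure corresponds to getting about two out of every three picks right. Real games are not independent and not equally predictable, so treat this as a feel for the scale rather than a reconstruction of the quoted stat.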