If you were asked to name one endeavor that would be the very best application of Big Data technology imaginable, what would it be? How about the fight against human trafficking?
The global industry of buying and selling human beings, whether you call it human trafficking or slavery, is a heartbreakingly bustling one. SumAll Inc., an analytics services startup in New York, has resolved to do something about it. The company has set up a non-profit organization, the SumAll Foundation (SumAll.org, or as the employees call it, “the dot-org”), with the goal of tapping Big Data to further the work of the global non-profit community. Step One in that noble endeavor is an aggressive assault on human trafficking.
I had the opportunity last week to speak with SumAll CEO Dane Atkinson and Korey Lee, the company’s vice president of operations and analytics, about the initiative. I started by asking how the fight against human trafficking emerged as the cause they wanted to put their technology behind. Lee fielded that one:
There’s a struggle in the non-profit space to understand and leverage data. A lot of non-profits will kind of orient themselves around their success metrics; it’s tough to get them to adhere more to key performance indicators. So we wanted to bring more transparency to this particular issue. There’s not a lot of data published out there, given that slavery and trafficking are illegal across the world. A lot of this information is locked up in PDF reports, hundreds of pages of U.N. documentation, and whatnot, so it’s not easily accessible and available. So we wanted to transform that and make it a little easier to understand.
I asked them to map out their strategy—to explain what they’re planning to do and how that’s going to help in the fight against human trafficking. Lee said it’s all about leveraging and understanding data:
Our object and vision for the .org is not so much getting on the ground and combating human trafficking, but rather leveraging data and understanding data so we can broadcast that information, not only back to the non-profit, but to other folks as well, to increase awareness of what those non-profits are doing, and to keep them accountable to key performance indicators so that people in the marketplace can have a better understanding of what those non-profits are doing. And then they can measure themselves against other non-profits, and find common ground, maybe to partner with other non-profits and get a collaborative environment there. So the mission for us at SumAll.org is really to use data to improve quality of life and operational efficiency, and to optimize the fundraising patterns for these charities—we want to leverage data to impact more than one charity. So our focus is not solely on human trafficking—we picked that as a first topic because it’s not a particularly widely known issue, and we wanted to bring light to that.
Atkinson added that the cause was in dire need of some analytic rigor:
By looking at different topic areas, we want to show this model of understanding the information across an entire issue, and consolidating it and presenting it back out to the market. And human trafficking (or “slavery” is a better term) really has horrible data issues. Obviously the nature of an illicit industry is going to cause that. One of the discoveries we made early on in the process was that the U.S. government uses a number for how they benchmark their investment in fighting trafficking, and that number actually comes from an undergrad paper through five different channels; it’s been rewashed and rebranded. There are real issues in getting a sense of what’s going on in these spaces, so we thought that putting some analytic rigor to the process would be valuable. … I think a lot of it comes down to providing these non-profits with insights that come from their own data. A lot of them will gather metrics and raw data, but they don’t have the resources to analyze it and figure out what insights you can glean from it. So we hope to help them do that, and make their operations on the field and off the field more efficient.
I asked them what they see as the hardest part of all this—what they expect to be the biggest obstacle. Atkinson said it’s the simple fact that the data held by non-profits fighting human trafficking is a mess:
The nice part about the professional market is there’s a real eagerness to work and purify data, and the information sources are far more mature. When we looked at the human trafficking issue in particular, and this seems to be the case as well with the other issues we’re exploring now, there is none of that discipline or rigor. It’s a mess. And the facts that sit in the most treasure troves are mind-bogglingly scary. We looked at actual police reports for Ghana and Thailand, and you can see very clearly the trending price for a human life, and understand that we can buy a 13-year-old girl from her parent for $300, and $200 of that goes to the slave trader. And that has gone up 20 percent. But no one sees that data; it’s sitting in some file somewhere, and the people who are fighting the industry really aren’t getting their hands around the stuff that would make people move, that would get people excited. So it’s been a real slog. We’ve had analysts spend eight or nine man-months just pulling together the first blush of what the hell’s going on in this category just to make sense of it, which is a huge data-analysis investment for results. If you’re looking at Twitter data, you can get to the same kind of level in a week.
Lee said the organizations that are “fighting the good fight” just don’t have the right tools:
We’re finding data sets that are PDFs or written reports—even in big U.S. government agencies and the United Nations, they have a very splintered infrastructure for data, so it’s really hard to pull out. There are no APIs that exist for almost any of these issues, so none of these people are presenting their information in a way that’s easy for technologists to grapple with it. So there are a lot of straight-up barriers to entry. When you look at the way companies are successful now, it’s because they’re using information, they’re optimizing data, but there’s a very lagging market here [among non-profits]. The folks that really want to have an impact are not embracing these new ways of doing things.
But, Lee said, there are some encouraging signs:
What’s started happening in the last few weeks is a lot of those non-profits out there struggling with issues, that know the information exists—sometimes even in databases that they’re touching, but can’t make sense of it—are starting to come to us and give us that information. It’s looking very encouraging—we believe we’re going to get our hands on some really fantastic information. We’ve seen that on the professional side of the operation—once we get data in the minds of operators, they actually operate much better, and we hope that continues to be true for more core, human issues.
I’ll share some of the backstory behind all of this in a subsequent post. It’s a remarkable, inspiring tale.