The Potential and Pitfalls of Open Data

    Looking for a way to make money for your company? Invest in open data, Gartner recommended last week. But be forewarned: The suggestion comes amid criticisms by some — including a Gartner analyst — that open data can be problematic in terms of its usability, quality and, yes, ethics.

    Open data is data that’s freely available for anyone to use, without copyright, patent or other restrictions. Mostly, that means data published by governments that are required to release it. Advocates say open data is a key component of an open and transparent government.

    But for businesses, it could be the key to a new strategy and new revenue, says Gartner.

    “Whereas ‘big data’ will make organizations smarter, open data will be far more consequential for increasing revenue and business value in today’s highly competitive environments, according to Gartner, Inc.,” begins a press release issued last week in advance of Gartner’s Symposium/ITxpo. That event, scheduled for Oct. 21-25, will include sessions on open data.

    Of course, it’s really not an either/or question, since most open data sets will require IT to be able to handle Big Data sets.

    But there are two main problems with open data:

    Its quality. This is raw data, and in the UK, at least, that’s triggered major complaints, with the Public Accounts Committee charging that the data is so raw that it’s unusable and inaccessible to the public. The data also tends to suffer from incomplete datasets and inconsistent reporting by local authorities, according to a ZDNet article.

    In theory, open data could transform what we know, adding depth and breadth to the public knowledge. In practice, that seems unlikely to happen anytime soon, contends Andrea Di Maio, a Gartner VP distinguished analyst who wrote a blog post on this issue in May.

    “The more the data, the more sophisticated the analysis and presentation tools, the more specialized are the skills and resources required to process that data,” Di Maio wrote. “Although consumer technologies become increasingly powerful and massive processing resources become available as a commodity, making sense of big, open data is not for the faint of heart, and will require significant investments for the times to come.”

    He also alludes to the second issue with open data:

    The potential for misuse. People are already raising questions about the potential misuse of Big Data.

    “Big data is our generation’s civil rights issue, and we don’t know it,” Alistair Croll, an industry watcher and analyst, wrote in July. In the post, he goes on to show how publicly available data on last names could be used to generate racial boundary maps in London. He’s focused on Big Data, but it doesn’t take a TED talk or a genius to see how much more frightening his examples would be with open datasets added to the mix.

    It’s easy to dismiss this as theoretical paranoia, but keep in mind: The public still doesn’t really know about open data, or its potential for misuse. Last week, Mitt Romney’s presidential campaign made headlines for mining large data sets that included personal information. Will this lead to a backlash against Big Data and the concept of open data? Probably not. But it’s not hard to imagine one from here.

    There’s no doubt that open data adds value. Slashdot pointed out that the public sector tends to view it as a good option, noting that cities are opening up their public datasets to third-party developers and encouraging them to build apps for citizen services. One well-known example: New York’s transit system opened up its maps through an API for developers rather than trying to create and manage its own app.

    Gartner analysts recommend you make an open data strategy a “top priority” if your business or organization uses the Web as a channel for delivering goods and services. They also recommend you consider how your own data could be used as a strategic asset, possibly by opening it through a data API.

    But like so many other issues IT deals with, this isn’t a technology problem. The technology exists to both use and deploy open data sets — with caveats in place about the quality of the data.

    The real issue is whether or not you’re ready to be responsible with that data, and if you’re not, whether you’re prepared to deal with the potential backlash that seems so inevitable.

    Loraine Lawson
    Loraine Lawson is a freelance writer specializing in technology and business issues, including integration, health care IT, cloud and Big Data.
