In New York City, obtaining a public data set required an open records request and the researcher toting in a hard drive.
So grab a notepad, Big Apple, and let the Windy City show you how to do open data.
A recent GCN article describes how Chicago simplified the release and updating of open data by building an OpenData ETL Utility Kit.
Before the kit, the process was onerous. Open data sets required manual updates made mostly with custom-written Java code.
That data updating process is now automated with the OpenData ETL Utility Kit. Pentaho’s Data Integration ETL tool is embedded into the kit, along with pre-built and custom components that can process Big Data sets, GCN reports.
“What’s different now is we have a framework that can be easily used by a lot of people,” Tom Schenk, the city’s chief data officer, told GCN. “I could also give that tool to a number of users around the city of Chicago and they’d be able to program ETLs that are going to be easier for them to understand, easier for them to create. It allows us to be more nimble.”
In a particularly compelling use case, the city tapped into an application programming interface (API) that monitors water quality at Lake Michigan beaches and used the ETL to push out information hourly.
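To make the pattern concrete, here is a minimal sketch of an extract-transform-load pass like the one described above. It is not the city's kit or its Pentaho jobs. The sample payload, the field names (`beach`, `turbidity_ntu`, `measured_at`) and the list standing in for the open data portal are all assumptions made for illustration.

```python
import json
from datetime import datetime

# Hypothetical sample of what a water-quality API might return. The field
# names are assumptions for illustration, not the city's real schema.
SAMPLE_RESPONSE = json.dumps([
    {"beach": "Montrose", "turbidity_ntu": "1.18", "measured_at": "2015-01-20T13:00:00"},
    {"beach": "Ohio Street", "turbidity_ntu": "", "measured_at": "2015-01-20T13:00:00"},
    {"beach": "63rd Street", "turbidity_ntu": "2.40", "measured_at": "2015-01-20T13:00:00"},
])


def extract(raw):
    """Parse the raw JSON payload into a list of dicts."""
    return json.loads(raw)


def transform(records):
    """Drop readings with no value and convert strings to typed fields."""
    cleaned = []
    for rec in records:
        if not rec["turbidity_ntu"]:
            continue  # skip sensors that reported nothing this hour
        cleaned.append({
            "beach": rec["beach"],
            "turbidity_ntu": float(rec["turbidity_ntu"]),
            "measured_at": datetime.fromisoformat(rec["measured_at"]),
        })
    return cleaned


def load(rows, destination):
    """Append rows to the destination (a list standing in for the portal)."""
    destination.extend(rows)
    return len(rows)


def run_once(raw, destination):
    """One full ETL pass over a single payload."""
    return load(transform(extract(raw)), destination)


portal = []
loaded = run_once(SAMPLE_RESPONSE, portal)
print(f"loaded {loaded} rows")
```

In a real deployment, a scheduler such as cron would call something like `run_once` every hour with a fresh API response, which is the part an ETL tool automates.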
If you’re curious about the OpenData ETL Utility Kit (and I’m looking at you, New York City), you can download it from GitHub.
Talend and Hortonworks
A partnership between Hortonworks and Talend will make it easier to run data integration workloads natively within Hadoop. Engineers from both companies integrated the Hortonworks Data Platform and Talend’s ETL tool. Apache Storm, the latest Apache Hadoop extension, is supported in Talend 5.6, Data Center Knowledge reports. That will allow users to more easily stream data in real time.
Talend also supports Apache Kafka, which is a fault-tolerant, publish-subscribe messaging system that can be used with Apache Storm for real-time analysis and rendering of streaming data, according to Talend’s press release.
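For readers unfamiliar with the publish-subscribe model, the toy sketch below shows the basic idea: publishers append messages to a named topic, and each subscriber reads from its own position in that topic's log. This is an in-memory illustration only. `ToyBroker` and its methods are hypothetical names, not the Kafka, Storm or Talend APIs, and it has none of Kafka's fault tolerance.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a publish-subscribe message broker."""

    def __init__(self):
        self._logs = defaultdict(list)    # topic -> ordered list of messages
        self._offsets = defaultdict(int)  # (topic, subscriber) -> next index to read

    def publish(self, topic, message):
        """Append a message to the end of the topic's log."""
        self._logs[topic].append(message)

    def poll(self, topic, subscriber):
        """Return messages this subscriber has not seen yet, then advance its offset."""
        start = self._offsets[(topic, subscriber)]
        messages = self._logs[topic][start:]
        self._offsets[(topic, subscriber)] = len(self._logs[topic])
        return messages


broker = ToyBroker()
broker.publish("readings", "reading-1")
broker.publish("readings", "reading-2")

first_batch = broker.poll("readings", "dashboard")   # dashboard sees both messages
broker.publish("readings", "reading-3")
second_batch = broker.poll("readings", "dashboard")  # only the new message
archive_batch = broker.poll("readings", "archive")   # a new subscriber sees all three
```

Because each subscriber tracks its own offset, a slow or newly added consumer does not affect what the others receive, which is what lets a stream processor read the same data feed independently of other consumers.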
MuleSoft New Release Supports Mobile Integration via API
As enterprises focus more on mobile, they face integration challenges that are often costly. MuleSoft’s new release of its Anypoint Platform targets that pain point.
How big is that pain point? MuleSoft’s press release cites Gartner’s estimation that as much as 70 percent of a mobile app project’s cost can be attributed to integration between the app and existing enterprise applications, services and data sources.
The Anypoint Platform for Mobile allows you to design, build, manage and analyze APIs to connect with Salesforce, ServiceNow, SAP, Siebel and other popular enterprise platforms and services. The Anypoint Platform also supports sharing data with any audience, from any source, through apps that are highly available, scalable and secure, MuleSoft states.
Webinars
Wednesday, Jan. 28, at 4 p.m. ET, “The Big Picture: Understanding the Many Roles of Hadoop Exploratory Webcast,” by The Bloor Group’s 2015 Research Program.
Thursday, Jan. 29, at 2 p.m. EST, “How Big Data is Changing Product Development,” with Tom Davenport, distinguished professor of Information Technology and Management at Babson College and author of Big Data@Work, and Kobi Gershoni, chief research officer and co-founder of Signals Group.
Loraine Lawson is a veteran technology reporter and blogger. She currently writes the Integration blog for IT Business Edge, which covers all aspects of integration technology, including data governance and best practices. She has also covered IT/Business Alignment and IT Security for IT Business Edge. Before becoming a freelance writer, Lawson worked at TechRepublic as a site editor and writer, covering mobile, IT management, IT security and other technology trends. Previously, she was a webmaster at the Kentucky Transportation Cabinet and a newspaper journalist. Follow Lawson at Google+ and on Twitter.