25 Questions to Ask Before Integrating External Datasets

    Traditional data integration tools are used to integrate with other businesses 28 percent of the time, according to Gartner Vice President and Distinguished Analyst Ted Friedman.

    It’s a major emerging trend, thanks largely to Big Data. In fact, Gartner is predicting that within four years, by 2017, over one-third of all new integration flows will extend outside the enterprise firewall.

    There are a lot of drivers for this. First, there are changes in traditional B2B companies, such as financial services and manufacturing, that require more integration. New regulations, fast time-to-market strategies, and cutting costs while increasing efficiency in supply chains are all reasons why B2B companies need integration now more than ever.

    But the push for external data is also happening in more traditional B2C, government and nonprofit organizations, thanks to new technologies such as external sensors and social media feeds.

    So what makes integration with external data sources unique? It turns out, there are a lot of issues you need to consider, according to Jim Damoulakis, CTO at the consulting and IT services firm GlassHouse Technologies, Inc. TechTarget recently interviewed Damoulakis about how to manage external data integration projects, particularly Big Data stores.

    Here are 25 questions from the Damoulakis interview, broken down by challenge, that he recommends CIOs and CTOs answer about any external data integration project.

    Challenge One: Define Your Goal

    1. What’s the business outcome we hope to achieve or support by integrating these datasets?

    2. Are there other business goals we hope to achieve?

    3. What service levels can we provide with this data?

    4. Who is the end user for this data?

    5. What are their needs?

    Challenge Two: Dealing with the Data

    6. How do the goals shape this project? Damoulakis points out there are different requirements for market analysis than security analysis or demographic analysis.

    7. Is this data structured or unstructured?

    8. How much do we trust this external data?

    9. Do we have any guarantee about the quality of the data?

    10. If not, what kind of data cleansing or normalization do we need to perform on this data?

    11. Do we have the data tools on hand to support this?

    12. Do we need to purchase additional tools? (If yes, be sure to read Challenge Four.)
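
    Questions 8 through 10 are easier to answer with a quick profile of a sample of the external data. The following is a minimal sketch in Python and is not taken from the interview: the field names (customer_id, country), the helper names profile_records and normalize_record, and the normalization rule (trim whitespace, treat empty strings as missing) are all assumptions chosen for illustration.

```python
from collections import Counter


def normalize_record(record):
    """Trim whitespace from string values; treat empty strings as missing (None).

    This rule is an illustrative assumption, not a general standard.
    """
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = value.strip()
            if value == "":
                value = None
        cleaned[key] = value
    return cleaned


def profile_records(records, required_fields):
    """Count missing required fields and exact duplicate records after normalization."""
    missing = Counter()
    seen = set()
    duplicates = 0
    for record in records:
        cleaned = normalize_record(record)
        for field in required_fields:
            if cleaned.get(field) is None:
                missing[field] += 1
        # Build a hashable fingerprint of the whole record to detect exact repeats.
        fingerprint = tuple(sorted((k, str(v)) for k, v in cleaned.items()))
        if fingerprint in seen:
            duplicates += 1
        else:
            seen.add(fingerprint)
    return {
        "total": len(records),
        "missing": dict(missing),
        "duplicates": duplicates,
    }


if __name__ == "__main__":
    # Hypothetical sample of records received from an external partner.
    sample = [
        {"customer_id": "A1", "country": " US "},
        {"customer_id": "A1", "country": "US"},
        {"customer_id": "", "country": "DE"},
    ]
    print(profile_records(sample, required_fields=["customer_id", "country"]))
```

    A report like this does not settle how far the data can be trusted, but it gives the team concrete numbers on missing values and duplicates before deciding how much cleansing is needed and whether existing tools can handle it.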

    Challenge Three: Staffing the Project

    13. Do we have the experience in house to manage this integration?

    14. What’s our learning curve?

    15. Would it be more cost-effective or time-efficient to bring in outside expertise?

    16. Who needs to be part of the team that will manage this project and the data?

    17. Who will represent the data users?

    18. Who is responsible for owning and managing the data?

    19. Who is responsible for the infrastructure that houses the data?

    20. Is there a security or compliance issue with this data?

    21. Who is responsible for ensuring the security of the data?

    22. Who is responsible for ensuring the right compliance practices are followed?

    Challenge Four: The DI Vendors

    23. Is there a well-defined roadmap for getting the most value from our technology investment?

    24. What is that roadmap?

    25. Does this vendor use established standards or, if it’s a new technology, is the vendor at least using de facto standards?

    Loraine Lawson
    Loraine Lawson is a freelance writer specializing in technology and business issues, including integration, health care IT, cloud and Big Data.
