Oracle recently released a report noting, among other things, that health care isn’t prepared to manage Big Data. That’s hardly shocking, since health care seems largely inept at managing any data, much less Big Data, which is generally defined as having one or more of these characteristics:
- Variety, meaning structured, semi-structured and unstructured data
- Velocity, meaning the data needs to be moved or processed at high speed
- Volume, meaning terabytes to petabytes of data
I was recently discussing this definition with a friend who writes about health care IT. And the more I thought about it, the more I realized that maybe health care IT doesn’t have a data problem so much as it has a Big Data problem.
What do I mean? Well, most health care records actually fall into the domain of Big Data rather than the typical relational-database kind of data. Specifically:
- Most health care records are actually unstructured data, e.g., text documents or images. Doctors’ notes on patients, nurses’ care plans, lab results, x-rays and MRI results all fall well outside the domain of structured data. In fact, except for billing data, most of what we consider health care records would seem to fall into data types other than structured data. So, clearly, health care IT is dealing with a variety of data types.
- Health care data is often high volume, particularly when you’re talking about a state or national electronic health records system. What’s more, when you deal with images, like x-rays or other scans, you’re increasing the data’s volume in terms of storage requirements.
- Finally, most health care records need to be moved relatively quickly, and as individual records. So, if I’m having a consult tomorrow with a surgeon, then the x-rays need to be at the office by morning. Right now (I kid you not) this is handled by me, driving between the two locations. But with the right technology, there’s no reason those files couldn’t be sent electronically. Beyond individual records, being able to process medical records at high speeds across a geographical area would help doctors identify health trends and possible disease outbreaks sooner. So, velocity will need to play a role in any effective health records system.
I’m in no way an expert, but after writing and reading about Big Data and health care for a few years, it looks like there’s a clear use case for Hadoop and other Big Data technologies in health care.
In fact, if I may be so bold, maybe health care’s data problems are not entirely caused by niche vendors, data silos and a lack of investment. Maybe the reason health care IT is such a mess is that the existing tools can’t handle Big Data needs in an affordable way.
If that’s true, then emerging technologies such as the Apache Hadoop stack could be just what the doctor ordered.