It’s ironic, but the term “semantic technology” isn’t well defined and can be used by different vendors to mean completely different things. Lee Feigenbaum explains to IT Business Edge’s Loraine Lawson the different meanings and how these technologies all relate, and he also explains the advances in the one he considers most important: Semantic Web technology. Feigenbaum is vice president of marketing and technology for Cambridge Semantics and co-chairs the W3C SPARQL Working Group. SPARQL is an RDF query language and is considered a key technology for the Semantic Web.
In part two of this interview, Feigenbaum explains how Semantic Web technology is changing data integration.
Lawson: What is the main thing that businesses and IT people should understand about Semantic Technology that they currently don’t?
Feigenbaum: The word “semantic” is an awful word, especially paired with technology, because semantic technology has been used to mean a ton of different things over the years. So, for some people, it’s firmly ensconced in their brains as being associated with search, right? And for those people it means, “Well, when I search for ‘jaguar’ on a search engine, I can choose whether I want the animal or the car,” right? And that’s sort of semantic search, and for some people, that is semantic technology, period, end of story.
Then there are people whose only exposure to the term “semantic technology” has been in the context of text analytics and natural language processing. So, when they hear “semantic” or “semantic technology,” what they think about is how do you take a Web page or a document or an email and apply text analytics and get some sort of structured information out of it, whether finding people and companies and dates or extracting sentiment from it. So if you're looking at social media and you want to see how your brand is perceived, that sort of thing.
Then it’s probably over-simplifying, but there’s a third camp that thinks of semantic technology specifically as Semantic Web technology, in which case it’s a family of technology standards from the W3C (World Wide Web Consortium). It’s a family of technologies that are, you know, all designed to work well together and that include a flexible graph model for representing data. It includes a schema and ontology language for doing rich modeling of the data. It includes a distributed query language. It includes ways to reason over information and do rules. It’s a whole family of technology standards that are designed to let you integrate and link together very diverse and disparate data.
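To make the graph model and query idea concrete, here is a minimal sketch in plain Python. It is not a real RDF library or SPARQL engine: the `ex:` identifiers, the two data sources, and the `match` and `orders_for_customer_name` helpers are all hypothetical names chosen for illustration. It only shows the general shape of the idea, which is that data stored as subject-predicate-object statements from different sources can be merged and then queried by pattern.

```python
from typing import List, Optional, Set, Tuple

# A statement in the graph: (subject, predicate, object).
Triple = Tuple[str, str, str]

# Two hypothetical sources that happen to share the identifier ex:customer42.
crm_triples: Set[Triple] = {
    ("ex:customer42", "rdf:type", "ex:Customer"),
    ("ex:customer42", "ex:name", "Acme Corp"),
}
orders_triples: Set[Triple] = {
    ("ex:order7", "rdf:type", "ex:Order"),
    ("ex:order7", "ex:placedBy", "ex:customer42"),
    ("ex:order7", "ex:total", "1200"),
}

# Because both sources use the same identifier, merging is just a set union.
graph: Set[Triple] = crm_triples | orders_triples


def match(graph: Set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def orders_for_customer_name(graph: Set[Triple], name: str) -> List[str]:
    """Join two patterns: find customers by name, then the orders they placed."""
    customers = [s for s, _, _ in match(graph, p="ex:name", o=name)]
    return sorted(
        order
        for customer in customers
        for order, _, _ in match(graph, p="ex:placedBy", o=customer)
    )


print(orders_for_customer_name(graph, "Acme Corp"))  # prints ['ex:order7']
```

In an actual Semantic Web stack, the triples would be RDF and the two-step pattern join would be written as a single SPARQL query; the sketch only mirrors the structure of that approach.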
These were technologies that were originally designed, as the name “semantic web” implies, to basically gather data on the Web, but especially in the past five years or so, there’s been a lot of attention paid to how these same technologies that were designed for the Web can be used inside enterprise IT to deal with data and data integration challenges.
So there are at least these three very different views of semantic, and depending on where a particular individual was exposed to the idea, they either think of it as search or they think of it as text analytics or they think of it as the Semantic Web technologies, and that’s a major source of confusion. I regularly do tutorials and lectures, and one of the things I do is put up this Venn diagram slide that shows how these things relate.
They do have some relationship to each other. For example, if you're doing text analytics to look at unstructured content and try to pull actual structured data out of it, you’d need some way to represent that structured data at the end, after you’ve run all your heuristics and analytics. The Semantic Web standards are a great way to do that. So a lot of these different views of the world play very nicely together and complement each other very well, but it still is a big source of confusion.
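As a rough illustration of that hand-off, the sketch below takes the output of an imagined text analytics step and represents it as graph statements. Everything here is an assumption made for the example: the `extracted` list stands in for whatever an entity extractor might return, and `mentions_to_triples`, the `ex:` names, and the document identifier are hypothetical rather than part of any standard or product.

```python
from typing import Dict, List, Tuple

# A statement in the graph: (subject, predicate, object).
Triple = Tuple[str, str, str]


def mentions_to_triples(doc_id: str,
                        mentions: List[Dict[str, str]]) -> List[Triple]:
    """Turn extracted entity mentions into subject-predicate-object statements."""
    triples: List[Triple] = []
    for i, mention in enumerate(mentions):
        node = f"{doc_id}#mention{i}"  # one node per extracted mention
        triples.append((doc_id, "ex:mentions", node))
        triples.append((node, "rdf:type", f"ex:{mention['kind']}"))
        triples.append((node, "ex:text", mention["text"]))
    return triples


# Hypothetical extractor output for one document.
extracted = [
    {"kind": "Company", "text": "Cambridge Semantics"},
    {"kind": "Person", "text": "Lee Feigenbaum"},
]

triples = mentions_to_triples("ex:doc1", extracted)
for triple in triples:
    print(triple)
```

Once the extracted facts are in this form, they can sit in the same graph as data from databases or spreadsheets, which is the complementary relationship described above.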
Lawson: I’m actually probably familiar most with the Semantic Web and the text analytics, but that was a source of confusion for me as well. My first exposure was a Semantic Web discussion and using that in the enterprise; later, a vendor told me they were using Semantic Technology, but they really meant text analytics.
Feigenbaum: There’s hardly an IT vendor out there today that is not using the word semantic in some way. They’ll talk about having a semantic layer. What they usually mean by that is this: in a lot of software, you work with the data at the very nuts-and-bolts level, where you're working with IDs and keys and all these sort of low-level, nitty-gritty details of the data. When they say there’s a semantic layer, they mean there’s a level at which they're looking at data conceptually and they're talking about people and customers and orders and shipments and all that stuff.
When I, and Cambridge Semantics, talk about semantics, we are generally talking about the Semantic Web technologies and their application to enterprise IT.
All of the other types of semantics are interesting and important, but they're really sort of a catch-all. There’s no consistency between how they're used. So from the point of view of talking about trends and momentum and adoption and stuff like that, it mainly makes sense to talk about the Semantic Web and what’s happening with that set of technologies. That’s my personal bias.
Lawson: When you say there’s no consistency between how they're used, do you mean the terms or the actual technology?
Feigenbaum: Both, actually. Outside of the Semantic Web standards, if you line up three vendors and they all say, “Hey, we use Semantic for search,” or, “We do Semantic search,” or if they all say, “We do Semantic text analytics,” or, “We have a Semantic model in our BI platform,” first of all they're using the term in a different way, but second of all, the actual technology that underlies that will be completely different and proprietary for each of those vendors.
On the other hand, when it’s used to mean Semantic Web technologies, you are referring to a cohesive, coherent set of standards from the W3C, such that if you then line up three vendors who all say, “Hey, I’ve got enterprise IT software that is driven by Semantic Web technology,” you know that you’ve got software that’s doing a similar thing, using similar technologies, and that will probably work well together and interoperate.
Lawson: So where are we now with the Semantic Web technology? It’s probably been a year since I talked to anyone; have there been any major breakthroughs?
Feigenbaum: When I talk about this, I sort of give a rough timeline of what I’ve seen in terms of the overall adoption. I’ll just give a little context. You look back to the early 2000s, when this stuff was really new, and what was going on then is that you had all these enthusiasts in the Semantic Web community. They would get together at conferences and talk about the big vision. Really all that was happening was defining some of these foundational standards. Then you fast forward a few years to 2004 or 2005 and people would come to these conferences and say, “Hey, we’ve got these standards in place now, so let’s talk about how we can start using them.” And then they would go back to their day-to-day life and they would start building initial tools that actually use those standards.
Then you fast forward a couple more years and you're in 2007, 2008 and you go to one of these conferences and people are saying, “Hey, I built this tool. I built this database. I built this visualization tool, this integration tool. It uses Semantic Web tech, it’s pretty cool, check it out.”
It was around then that people really started to try to apply these tools and therefore the underlying technologies in their businesses.
So you look at those same conferences in 2009, 2010 or so, and what was happening is people were starting to report on what were basically pilots and proofs of concept of these technologies. And inevitably somebody in the audience would stand up and say, you know, “So you guys are going to go ahead and put this into production and use this for real?” And the person would say, “Well, we’re going to see. We’re going to try.”
So basically where that brings us is that in the past couple of years people have really been doing that. They have been starting to apply these technologies in production use cases and then report back at the conferences. It’s interesting, because the more this stuff is adopted, the less the actual use cases are talked about. As soon as people are actually using it day to day to run their business, they hold their cards a little closer to their vests.
From a macro level, that’s where I think it is: People are starting to use this stuff for real. The point on the adoption curve varies from industry to industry. Most people would agree that life sciences, pharmaceutical especially, is probably the industry that’s the furthest along with adopting this stuff in production. In the pharmaceutical industry, it’s being used in production in a lot of different ways, ranging from doing sort of basic integration and management of research data to analysis of clinical trial data all the way through to the commercial side of pharma companies, where people are using Semantic Web technologies to do predictive sales forecasting. They're using it for competitive intelligence, which is an area that I’ve got a fair amount of familiarity with, because we have a bunch of customers that are doing that. They're using it for monitoring drugs that are already on the market to see if there are any sort of adverse events that start cropping up after a drug’s been approved and sold.
Those are some of the use cases that are from a business perspective, but the technologies are also increasingly being used at an IT layer to do things like master data management, pure data integration, sometimes BI-type applications.