The Semantic Lunch
Lunch today with John Davies, who’s in charge of next-web research for BT. It was quite a long, or rather intense, discussion, so I’ll only tackle the basics here. I’ve been trying to nail this semantic web issue for some time, but every time I start reading an academic paper, my attention seems to wander off. So this was a good opportunity for me. I wasn’t going to deviate. As soon as he sat down, I was in with my carefully prepared, top journalist’s question: “So what’s this semantic web thingy, then?”
It turns out that that is one of the more difficult questions. (Damn!) It depends on what you mean. You might mean making the billions of existing web sites semantic, or you might mean only possible future sites and services. The second of these is the more likely outcome at present. The semantic web is partly about annotating web pages to make them readable by machines. John prefers the expression ‘semantic technologies’ to avoid this confusion.
At the moment, information on the web is pretty much designed for human consumption. You and I know when we go to a shopping site that the figure in bold is the price, that a certain number is the product code and that this piece of text is the shipping information. To a machine, it may make no sense whatsoever. If machines are to be able to bring together all these different pages to make the web more useful, then they need to be able to read them.
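To make the annotation idea concrete, here is a minimal sketch in Python. It is an illustration only: the `data-meaning` attribute, the sample product page and the `read_facts` helper are all invented for this example and are not a real semantic web standard. The point is simply that a program can pull the price out of the annotated page, while the plain page gives it nothing to hold on to.

```python
from html.parser import HTMLParser

# The same (made-up) product snippet, first as a human sees it,
# then with hypothetical machine-readable annotations added.
PLAIN = "<p><b>19.99</b> 4471-X Ships in 3 days</p>"
ANNOTATED = (
    '<p><b data-meaning="price">19.99</b> '
    '<span data-meaning="product_code">4471-X</span> '
    '<span data-meaning="shipping">Ships in 3 days</span></p>'
)


class AnnotationReader(HTMLParser):
    """Collects the text of every element carrying a data-meaning label."""

    def __init__(self):
        super().__init__()
        self.facts = {}
        self._current = None  # label of the element being read, if any

    def handle_starttag(self, tag, attrs):
        self._current = dict(attrs).get("data-meaning")

    def handle_data(self, data):
        if self._current and data.strip():
            self.facts[self._current] = data.strip()

    def handle_endtag(self, tag):
        self._current = None


def read_facts(html):
    """Return a dict of label -> text for the annotated parts of a page."""
    reader = AnnotationReader()
    reader.feed(html)
    reader.close()
    return reader.facts


print(read_facts(PLAIN))      # nothing a machine can rely on
print(read_facts(ANNOTATED))  # price, product code and shipping, by name
```

Real semantic annotations use shared, published vocabularies rather than ad hoc labels like these, which is what lets pages from different sites be combined.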
We’ll see the first applications of semantic technologies in the enterprise space, where the need is more acute. Enterprises have lots of databases, all built by different people according to different rules. Integrating the information from those is already a very costly and time-consuming activity. One database may talk about CustomerName, another may refer to CustomerID, for example. Joining these things together, so that, say, a support department knows what equipment the logistics department has installed for a customer, improves business efficiency. Semantic technologies put what Davies called a “wrapper” around these different data sources to create overarching access, connecting them in a way that doesn’t require nearly so much human effort.
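Here is a rough sketch of what such a wrapper might look like, in Python. Everything in it is an assumption made for illustration: the two tiny databases, the field names `ClientName`, `TicketCount` and `InstalledEquipment`, and the shared terms they map to. It is not how BT or any particular product does it; it only shows the idea of mapping each source's local names onto one common vocabulary.

```python
# Hypothetical records from two departmental databases that were
# built by different people with different naming rules.
support_db = [
    {"CustomerName": "Acme Ltd", "TicketCount": 3},
]
logistics_db = [
    {"ClientName": "Acme Ltd", "InstalledEquipment": ["router", "switch"]},
]

# The "wrapper": for each source, a mapping from its local field
# names to terms in one shared vocabulary (all names invented here).
FIELD_MAPPINGS = {
    "support": {"CustomerName": "customer", "TicketCount": "open_tickets"},
    "logistics": {"ClientName": "customer", "InstalledEquipment": "equipment"},
}


def to_shared_terms(source, record):
    """Rename one record's fields into the shared vocabulary."""
    mapping = FIELD_MAPPINGS[source]
    return {mapping[field]: value for field, value in record.items()}


def unified_view(sources):
    """Merge records from every source into one view keyed by customer."""
    view = {}
    for source, records in sources.items():
        for record in records:
            shared = to_shared_terms(source, record)
            view.setdefault(shared["customer"], {}).update(shared)
    return view


view = unified_view({"support": support_db, "logistics": logistics_db})
# The support desk can now see what logistics installed for a customer.
print(view["Acme Ltd"]["equipment"])
```

The human effort goes into writing the mappings once per source, rather than into hand-building a join for every new question.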
People developing semantic technologies start by building an ontology that describes the sort of data the technology is looking at, and the technology is then able to do some reasoning based on this. An ontology is a form of classification system for whatever it is that’s being examined. For foods, that might include their ingredients, nutritional properties, suppliers and type. It’s not just a list, though: it also captures the relationships between different items. An ontology developed for food might include E101, additives and the CrispyPop bar. It will know that E101 falls under additives, which are part of the ingredients of the bar. If that description then gets combined with a wholesaler’s database of shops where you might send the bar, then the semantic agent will work out that health food shops aren’t going to stock CrispyPop bars. It’s not intelligence in any way, but the application of rules that the creators have decided upon.
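The CrispyPop reasoning can be sketched in a few lines of Python. This is a toy, not a real ontology language or reasoner: the dictionaries, the extra ingredient "oats", the "newsagent" shop type and the rule that health food shops exclude anything containing additives are all assumptions made up for the example.

```python
# A toy food ontology (all entries are illustrative assumptions).
# "is a" relationships: each item points at its broader category.
IS_A = {
    "E101": "additive",
    "CrispyPop bar": "snack bar",
}

# "has ingredient" relationships.
INGREDIENTS = {
    "CrispyPop bar": ["E101", "oats"],
}

# The wholesaler's shop types, each with a rule its creators chose:
# a category of ingredient that the shop type will not accept.
SHOP_RULES = {
    "health food shop": "additive",
    "newsagent": None,
}


def is_a(item, category):
    """Follow the 'is a' chain upwards to see if item falls under category."""
    while item is not None:
        if item == category:
            return True
        item = IS_A.get(item)
    return False


def contains(product, category):
    """True if any ingredient of the product falls under the category."""
    return any(is_a(i, category) for i in INGREDIENTS.get(product, []))


def will_stock(shop_type, product):
    """Apply the shop type's exclusion rule to the product."""
    excluded = SHOP_RULES[shop_type]
    return excluded is None or not contains(product, excluded)


print(will_stock("health food shop", "CrispyPop bar"))
print(will_stock("newsagent", "CrispyPop bar"))
```

Nothing here is clever. The conclusion about health food shops falls out of two stored relationships and one rule, which is exactly the "not intelligence" point.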
Because the semantic technologies are lightweight and open source, they are potentially available to any company. For this reason, enterprises that get some of their data from external sources will still be able to use semantic approaches to integrate and drill into the resulting combination. These are my words, not John’s, but one way to think of the technologies is as providing a toolkit for more easily creating web mashups. Companies already exist, such as Cerebra and Ontoprise, to sell ways to integrate enterprise information using OWL, the Web Ontology Language.
I kind of understood, so far, but I needed a good example. I suspect you may be the same.
BT works closely with the National Health Service in the UK. The Service has already gone a long way towards digitising and collating its patient information through the Electronic Patient Record system, and its medical knowledge through the SNOMED classification system. Unfortunately, though, the data can still be very dispersed. The X-Ray department might have a patient’s data on a different system to the Pharmacy, for example, and those might be completely unconnected to the systems used in a different hospital or by a clinic.
What semantic web technologies bring to this is unity and also what John called Description Logics. They can prise open the different databases and allow an overview, but also reason over the combined data. Imagine a patient’s medical record says that they are allergic to almonds. Then a doctor misses this and somehow prescribes a nut-based food. When the nurse enters this into the patient’s record, the semantic application will use its ontology to work out that almonds are nuts and that therefore this is a very bad idea. Semantic technologies that can perform reasoning of this kind, and potentially save lives, are already in use in the UK Health Service.
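As a rough illustration of the almond check, here is a small Python sketch. It is emphatically not a Description Logic reasoner and has nothing to do with the real NHS systems or SNOMED: the class hierarchy, the item names and the `conflicts` helper are all invented for the example. It only shows the one inference from the story, that an item recorded as containing "nut" clashes with an allergy recorded as "almond", because almond sits under nut in the hierarchy.

```python
# A toy class hierarchy (illustrative assumption, not SNOMED):
# each term points at its broader class.
SUBCLASS_OF = {
    "almond": "nut",
    "walnut": "nut",
    "nut": "food allergen",
}

# What each prescribable item is recorded as containing (made up).
CONTAINS = {
    "nut-based supplement": ["nut"],
    "saline drip": ["salt"],
}


def ancestors(term):
    """Return the term itself plus every broader class above it."""
    chain = []
    while term is not None:
        chain.append(term)
        term = SUBCLASS_OF.get(term)
    return chain


def conflicts(allergy, item):
    """Flag the item if any of its contents is related to the allergy.

    Two cases count as related: the content is a broader class of the
    allergen (item contains "nut", patient allergic to "almond"), or a
    narrower one (item contains "almond", patient allergic to "nut").
    """
    for content in CONTAINS.get(item, []):
        if content in ancestors(allergy) or allergy in ancestors(content):
            return True
    return False


patient_allergies = ["almond"]
for item in ("nut-based supplement", "saline drip"):
    flagged = any(conflicts(a, item) for a in patient_allergies)
    print(item, "-> warn" if flagged else "-> ok")
```

A real system would express the hierarchy in an ontology language and leave the inference to a reasoner, but the warning the nurse sees comes from the same kind of subclass relationship.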
I’ll leave this there for today. My head hurts already. If there’s interest, I’ll be happy to do a follow-up on another day.
