Saving Months, Not Milliseconds: Do More Faster with the Semantic Web


When I suggested that we're often asking the wrong question about why we should use Semantic Web technologies, I promised that I'd write more about what it is about these technologies that lowers the barrier to entry enough to let us do (lots of) things that we otherwise wouldn't. In the meantime, some other people have done a great job of anticipating and echoing my own thoughts on the topic, so I'm going to summarize them here.

The bottom line is this: The Semantic Web lets you do things fast. And because you can do things fast, you can do lots more things than you could before. You can afford to do things that fail (fail fast); you can afford to do things that are unproven and speculative (exploratory analysis); you can afford to do things that are only relevant this week or today (on-demand or situational applications); and you can afford to do things that change rapidly. Of course, you can also do things that you would have done with other technology stacks, only you can have them up and running (& ready to be improved, refined, extended, and leveraged) in a fraction of the time that you otherwise would have spent.

The word "fast" can be a bit deceptive when talking about technology. We can all be a bit obsessed with what I call stopwatch time. Stopwatch time is speed measured in seconds (or less). It's raw performance: How much quicker does my laptop boot up with an SSD? How long does it take to load 100 million records into a database? How many queries per second does your SPARQL implementation do on the Berlin benchmark with and without a recent round of optimizations?

We always talk about stopwatch time. Stopwatch time is impressive. Stopwatch time is sexy. But stopwatch time is often far less important than calendar time.

Calendar time is measured in hours and days or in weeks and months and years. Calendar time is the actual time it takes to get an answer to a question. Not just the time it takes to push the "Go" button and let some software application do a calculation, but all of the time necessary to get to an answer: to install, configure, design, deploy, test, and use an application.

Calendar time is what matters. If my relational database application renders a sales forecast report in 500 milliseconds while my Semantic Web application takes 5 seconds, you might hear people say that the relational approach is 10 times faster than the Semantic Web approach. But if it took six months to design and build the relational solution versus two weeks for the Semantic Web solution, Semantic Sam will be adjusting his supply chain and improving his efficiencies long before Relational Randy has even seen his first report. The Semantic Web lets you do things fast, in calendar time.

Why is this? Ultimately, it's because of the inherent flexibility of the Semantic Web data model (RDF). This flexibility has been described in many different ways. RDF relies on an adaptive, resilient schema (from Mike Bergman); it enables cooperation without coordination (from David Wood via Kendall Clark); it can be incrementally evolved; changes to one part of a system don't require re-designs to the rest of the system. These are all dimensions of the same core flexibility of Semantic Web technologies, and it is this flexibility that lets you do things fast with the Semantic Web.
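To make the "incrementally evolved" point a little more concrete, here is a minimal sketch in plain Python. It is not a real RDF library: the toy `triples` set, the `match` helper, and every `ex:` identifier are made up for illustration. It only assumes that facts are stored as (subject, predicate, object) triples, which is the core of the RDF data model.

```python
# A toy triple store: each fact is a (subject, predicate, object) tuple.
# All identifiers here (ex:..., match) are hypothetical, for illustration only.
triples = {
    ("ex:widget", "ex:unitsSold", 1200),
    ("ex:widget", "ex:region", "EMEA"),
}

def match(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        (t for t in store
         if (s is None or t[0] == s)
         and (p is None or t[1] == p)
         and (o is None or t[2] == o)),
        key=str,
    )

# An existing "query" written against the original data.
sales_before = match(triples, p="ex:unitsSold")

# Later, a new kind of fact arrives. There is no table to alter and no
# migration to run: it is just one more triple with a new predicate.
triples.add(("ex:widget", "ex:supplier", "ex:acme"))

# The old query is unaffected, and the new data is immediately queryable.
sales_after = match(triples, p="ex:unitsSold")
suppliers = match(triples, p="ex:supplier")
```

In a fixed-column table, recording the supplier would typically mean changing the schema first. Here the new predicate sits alongside the old data, and the code written before it existed keeps working unchanged.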

(There is a bit of nuance here: if stopwatch performance is below a minimum threshold of acceptability, then no one will use a solution in the first place. Semantic Web technologies have had a bit of a reputation for this in the past, but that has long since stopped being true. I'll write more about that in a future post.)


That's quite an interesting thread. You write "In the meantime, some other people have done a great job of anticipating and echoing my own thoughts on the topic, so I'm going to summarize them here." Could you provide some links to other people that picked up this conversation? Thanks.

[Lee: Two of the ones I was referring to were Mike Bergman's and Kendall Clark's posts, which are linked lower in the article. There are a couple of others that I can try to dig up, and I will add links as I find them.]

Very good post. I have had similar experiences and questions, especially on things that could be done with other technology stacks. We often hear something like "I could have done this with XML or RDBMS or ... (add your favorite technology here)". It is true, there is always more than one way to implement something. The question to ask is "Why haven't you?" And the answer is often that it is simply too complex and expensive to do using a different technology. Possible in theory, but not possible in practice given resource and time constraints.