TechnicaLee Speaking: September 2011 Archives

September 27, 2011

Saving Months, Not Milliseconds: Do More Faster with the Semantic Web

When I suggested that we're often asking the wrong question about why we should use Semantic Web technologies, I promised that I'd write more about what it is about these technologies that lowers the barrier to entry enough to let us do (lots of) things that we otherwise wouldn't. In the meantime, some other people have done a great job of anticipating and echoing my own thoughts on the topic, so I'm going to summarize them here.

The bottom line is this: The Semantic Web lets you do things fast. And because you can do things fast, you can do lots more things than you could before. You can afford to do things that fail (fail fast); you can afford to do things that are unproven and speculative (exploratory analysis); you can afford to do things that are only relevant this week or today (on-demand or situational applications); and you can afford to do things that change rapidly. Of course, you can also do things that you would have done with other technology stacks, only you can have them up and running (& ready to be improved, refined, extended, and leveraged) in a fraction of the time that you otherwise would have spent.

The word "fast" can be a bit deceptive when talking about technology. We can all be a bit obsessed with what I call stopwatch time. Stopwatch time is speed measured in seconds (or less). It's raw performance: How much quicker does my laptop boot up with an SSD? How long does it take to load 100 million records into a database? How many queries per second does your SPARQL implementation do on the Berlin benchmark with and without a recent round of optimizations?

We always talk about stopwatch time. Stopwatch time is impressive. Stopwatch time is sexy. But stopwatch time is often far less important than calendar time.

Calendar time is measured in hours and days or in weeks and months and years. Calendar time is the actual time it takes to get an answer to a question. Not just the time it takes to push the "Go" button and let some software application do a calculation, but all of the time necessary to get to an answer: to install, configure, design, deploy, test, and use an application.

Calendar time is what matters. If my relational database application renders a sales forecast report in 500 milliseconds while my Semantic Web application takes 5 seconds, you might hear people say that the relational approach is 10 times faster than the Semantic Web approach. But if it took six months to design and build the relational solution versus two weeks for the Semantic Web solution, Semantic Sam will be adjusting his supply chain and improving his efficiencies long before Relational Randy has even seen his first report. The Semantic Web lets you do things fast, in calendar time.
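To put rough numbers on that comparison, here is a small back-of-the-envelope sketch in Python. It uses only the figures from the example above, plus two assumptions of mine: "six months" is treated as 180 days, and the only cost counted after launch is the time spent waiting for a report to render. The names (`time_to_answers`, `break_even_reports`, and so on) are made up for this illustration.

```python
from datetime import timedelta

# Figures from the example; "six months" is ASSUMED to mean 180 days.
relational_build = timedelta(days=180)
semantic_build = timedelta(weeks=2)
relational_report = timedelta(milliseconds=500)
semantic_report = timedelta(seconds=5)


def time_to_answers(build, per_report, reports):
    """Calendar time from project start until `reports` reports have rendered."""
    return build + per_report * reports


# Stopwatch view: how much slower is a single report?
stopwatch_ratio = semantic_report / relational_report

# Calendar view: how much earlier does the Semantic Web solution go live?
head_start = relational_build - semantic_build

# How many reports would have to be rendered before the relational
# solution's per-report advantage pays back that head start?
saving_per_report = semantic_report - relational_report
break_even_reports = head_start / saving_per_report

print(f"stopwatch ratio: {stopwatch_ratio:.0f}x")
print(f"head start: {head_start.days} days")
print(f"break-even after {break_even_reports:,.0f} reports")
```

Under these assumptions the stopwatch gap is 10x, the head start is 166 days, and the relational solution would need to render roughly 3.2 million reports before its faster rendering made up the difference. The point is not the exact figure, which shifts with the assumptions, but the difference in scale between the two kinds of time.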

Why is this? Ultimately, it's because of the inherent flexibility of the Semantic Web data model (RDF). This flexibility has been described in many different ways. RDF relies on an adaptive, resilient schema (from Mike Bergman); it enables cooperation without coordination (from David Wood via Kendall Clark); it can be incrementally evolved; changes to one part of a system don't require re-designs to the rest of the system. These are all dimensions of the same core flexibility of Semantic Web technologies, and it is this flexibility that lets you do things fast with the Semantic Web.
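A toy sketch may make that flexibility more tangible. The Python below is not a real RDF library; it just models facts as (subject, predicate, object) tuples in a set, which is the shape of the RDF data model. Every identifier in it (`ex:widget`, `ex:unitsSold`, `ex:supplier`, `ex:leadTimeDays`, and the `match` helper) is hypothetical and invented for this illustration.

```python
# A toy in-memory triple store: each fact is a (subject, predicate, object) tuple.
# All "ex:" names are hypothetical, made up for this sketch.
triples = {
    ("ex:widget", "ex:name", "Widget"),
    ("ex:widget", "ex:unitsSold", 120),
}


def match(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        (
            t
            for t in store
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        ),
        key=repr,
    )


# The original application only knows about sales figures.
units_before = match(triples, p="ex:unitsSold")

# Later, someone else adds facts using predicates nobody planned for.
# There is no table to alter and no migration to run.
triples |= {
    ("ex:widget", "ex:supplier", "ex:acme"),
    ("ex:acme", "ex:leadTimeDays", 14),
}

# The old query returns exactly what it did before...
units_after = match(triples, p="ex:unitsSold")

# ...and new questions can be asked right away.
suppliers = match(triples, s="ex:widget", p="ex:supplier")
```

The existing query is untouched by the new data, and the new data needed no coordination with whoever wrote the existing query. A real triple store adds URIs, a standard query language (SPARQL), and much more, but the additive character of the model is the same.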

(There is a bit of nuance here: if stopwatch performance is below a minimum threshold of acceptability, then no one will use a solution in the first place. Semantic Web technologies had a bit of a reputation for falling below that threshold in the past, but that has long since ceased to be true. I'll write more about that in a future post.)

September 12, 2011

Why Semantic Web Technologies: Common, Coherent, Standard

To paraphrase both Ecclesiastes and Michael Stonebraker & Joseph Hellerstein, there is nothing new under the sun.

It's as true with Semantic Web technologies as with anything else: tuples are straightforward, ontologies build on schema languages and description logics that have been around for ages, URIs have been baked into the Web for twenty years, etc. But while the technologies are not new, the circumstances are. In particular, the W3C's set of Semantic Web technologies is particularly valuable for having been brought together as a common, coherent set of standards.

  • Common. Semantic Web technologies are broadly applicable to many, many different use cases. People use them to publish pricing data online, to uncover market opportunities, to integrate data in the bowels of corporate IT, to open government data, to promote structured scientific discourse, to build open social networks, to reform supply chain inefficiencies, to search employee skill sets, and to accomplish about ten thousand other tasks. This makes a one-size-fits-all elevator pitch challenging, but it also means that there's a large audience of practitioners who are benefitting from these technologies and so are coming together to create standards, build tool sets, and implement solutions. These are not niche technologies with limited resources for ongoing development, nor are they at risk of being hijacked for a purpose at odds with your own.
  • Coherent. Semantic Web technologies are designed to work together. The infamous layer cake diagram may have many shortcomings, but it does demonstrate that these technologies fit together like jigsaw puzzle pieces. This means that I can build an application using the RDF data model, and then incrementally bring new functionality online by adopting other Semantic Web technologies. Without a coherent set of technologies, I'd have to either roll my own solutions for new functionality (expensive, error-prone) or try to overcome impedance mismatches in connecting together multiple unrelated technologies (expensive, error-prone).
  • Standard. Semantic Web technologies are developed in collaborative working groups under the auspices of the World Wide Web Consortium (W3C). The specifications are free (both as in beer and as in not constrained by intellectual property) and are backed by test suites and implementation reports that go a long way toward encouraging interoperable tools.

The technologies are not novel and are not perfect. But they are common, coherent, and standard and that sets them apart from a lot of what's come before and a lot of other options that are currently out there.