Open-sourcing the Semantic Layered Research Platform: A Roadmap

Yesterday, I promised a bit more detail on what components we'll be open-sourcing under the SLRP banner over the coming weeks and months. We've added much of this information to the SLRP homepage, but I wanted to include it here as well.


The Semantic Layered Research Platform (SLRP, pronounced slurp) is the collective name for the family of software components produced by the IBM Advanced Technology group to utilize semantics throughout the application stack. We'll be releasing these components to the open-source community over the next few months as we polish the initial versions of them and prepare supporting materials (examples, how-tos, documentation, ...). This post is a summary of the components that we'll be releasing, with a brief description of each. The list is arranged in a rough approximation of the order in which we think we'll be able to release them, but the order is very much subject to change.

  1. Boca¹. Boca is the foundation of many of our components. It is an enterprise-featured RDF store that provides support for multiple users, distributed clients, offline work, real-time notification, named-graph modularization, versioning, access controls, and transactions with preconditions. Matt's written more about Boca here. Boca includes two subsystems that may also be interesting on their own:

    • Glitter. Glitter is a SPARQL engine independent of any particular backend. It allows interfaces to backend data sources to plug in to the core engine and generate solutions for portions of SPARQL queries at varying granularity. The core engine orchestrates query rewriting, optimization, and execution, and composes the solutions generated by the backend. A Boca-specific backend allows SPARQL queries to be compiled to Boca's temporal database schema.
    • Sleuth. Sleuth provides full-text search capabilities for text literals within Boca. Text literals are indexed with Apache Lucene, and the index also stores information about the named graph, subject, and predicate to which the literal is attached.
  2. DDR. The Distributed Data Repository (DDR) is the binary counterpart to Boca. It's a write-once, read-many store for binary data. Content within DDR receives an LSID, and a registry of metadata extractors ("scrapers") allows metadata to be pulled from the content and stored into a companion Boca server. DDR contains an LSID resolver that returns the stored binary content for the LSID getData() call and returns the Boca named graph containing the metadata in response to the LSID getMetadata() call.

  3. Queso. Queso is a semantic web-application framework. It stores content (HTML, CSS, JavaScript, etc.), user data, and application data within Boca, and provides mechanisms for deploying modular applications and services that (modulo access control) can remix and reuse service endpoints and semantic data. Ben, Elias, and Wing have already written more about Queso.

  4. ODO. ODO is a family of Perl 5 libraries for parsing, manipulating, persisting, and serializing RDF data. ODO also contains Plastor, a Perl analog of Jastor, which generates Perl classes from an OWL ontology.

  5. Telar. Telar is a family of Java libraries that provide services for creating applications driven by RDF. Some Telar libraries focus on the user interface, supplying bindings between RDF data and SWT widgets, the Eclipse Graphical Editing Framework (GEF), and Eclipse RCP perspectives, editors, and views. Other Telar libraries focus on data management; these libraries provide functionality to manipulate RDF datasets (collections of named graphs) and to perform tasks such as resolving human-readable labels for resources within an RDF graph.

  6. Salsa. Salsa is a Boca application that brings together semantic technologies and spreadsheets. Salsa serializes spreadsheets to a central Boca server, and also uses a transform language to map cells and cell ranges to their RDF semantics. Salsa is an experiment in separating data from layout within spreadsheets, and also in adapting the familiar spreadsheet user-interface paradigm for RDF data.

  7. Taco. Taco is a framework for measuring the performance of RDF stores. It includes utilities that build various kinds of RDF graph structures, and it lets you add measurable operations (queries, statement additions, and statement removals) against those structures. The performance log is also stored in RDF with a defined ontology and can be easily queried with a report generator.
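
To make the DDR description in item 2 a little more concrete, here is a minimal Python sketch of the general pattern: a write-once store that assigns each piece of content an LSID-style identifier, runs registered scrapers over it, and answers separate data and metadata lookups. This is not DDR's actual API. The class `ToyBinaryStore`, its method names, the `example.org` authority, the `ddr` namespace, the `ex:` predicates, and the choice to derive identifiers from a content hash are all assumptions made for illustration, and a plain dictionary stands in for the companion Boca named graphs.

```python
import hashlib


class ToyBinaryStore:
    """In-memory stand-in for a write-once, read-many binary store (hypothetical)."""

    def __init__(self, authority="example.org", namespace="ddr"):
        # Authority and namespace are made-up values for illustration only.
        self.authority = authority
        self.namespace = namespace
        self._blobs = {}     # identifier -> stored bytes
        self._graphs = {}    # identifier -> list of (subject, predicate, object)
        self._scrapers = []  # callables: bytes -> list of (predicate, object)

    def register_scraper(self, scraper):
        """Add a metadata extractor to the registry."""
        self._scrapers.append(scraper)

    def put(self, content):
        """Store content once and return its identifier.

        Content that is already present is never overwritten.
        """
        digest = hashlib.sha256(content).hexdigest()[:16]
        lsid = f"urn:lsid:{self.authority}:{self.namespace}:{digest}"
        if lsid not in self._blobs:
            self._blobs[lsid] = bytes(content)
            triples = []
            for scraper in self._scrapers:
                for predicate, obj in scraper(content):
                    triples.append((lsid, predicate, obj))
            # One "named graph" of metadata per stored object.
            self._graphs[lsid] = triples
        return lsid

    def get_data(self, lsid):
        """Loosely mirrors the role of the LSID getData() call: return the bytes."""
        return self._blobs[lsid]

    def get_metadata(self, lsid):
        """Loosely mirrors the role of getMetadata(): return the metadata triples."""
        return list(self._graphs[lsid])


def size_scraper(content):
    """Hypothetical scraper: record the content length."""
    return [("ex:byteLength", len(content))]


def png_scraper(content):
    """Hypothetical scraper: recognize PNG files by their 8-byte signature."""
    if content.startswith(b"\x89PNG\r\n\x1a\n"):
        return [("ex:format", "image/png")]
    return []


store = ToyBinaryStore()
store.register_scraper(size_scraper)
store.register_scraper(png_scraper)
lsid = store.put(b"hello")
```

Deriving the identifier from a content hash is only one way to get write-once behaviour: storing the same bytes twice yields the same identifier and leaves the original untouched. A real repository could just as well assign identifiers some other way.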


¹ As you might be able to tell, several of our projects were named at lunch time, as we gazed longingly across the street.

1 Comment

Hi,

We are using BOCA2, and when executing SPARQL queries with the BOCA API we need to pass the named graph URIs along with the query.

For this we need to know all the named graphs in which the instances of the model are stored.

Is there a way to specify a regular expression (one that matches all the named graphs whose URIs start with a given URL) in the execute Query method?

Thanks
Sateesh