Compact RDF to Topic Maps Mapping Syntax (CRTM)

Author: Lars Heuer
Organization: Semagia
Date: 2010-09-30
Status: Draft

1   Introduction

The Compact RDF to Topic Maps Mapping Syntax (CRTM) is a language to describe an RDF [RDF] to Topic Maps [TMDM] mapping. It was designed to be compact and easy to write and read.

The RTM ([RTM]) specification defines an RDF vocabulary which describes how RDF triples are converted into Topic Maps constructs. While that approach certainly works and has its advantages (like embedding the mapping into the RDF source), people familiar with Topic Maps have to write the mapping in an RDF format and the mapping may become verbose. More importantly, it is not possible to reuse an existing mapping within another mapping.

CRTM provides the same expressiveness as [RTM] and adds some features, such as translating the optional language tag into scope and reusing existing mappings.

This language operates with the same algorithm as described by [RTM]; it's just an alternative, compact syntax (aside from the enhancements which are not supported by [RTM]).

2   Mapping

Like [RTM], CRTM uses RDF predicates to translate triples to Topic Maps constructs. The general notation is:

predicate-list ':' topic-maps-equivalent

Here, the predicate-list consists of one or more RDF predicates which should be mapped to Topic Maps.

The following sections assume that the prefix foaf was bound to the IRI http://xmlns.com/foaf/0.1/ and the prefix tmdm to the IRI http://psi.topicmaps.org/iso13250/model/.

2.1   Common syntactical constructs

2.1.1   Whitespace

Whitespace consists of one or more space (#x20) characters, carriage returns, line feeds, or tabs. Whitespace characters are allowed everywhere to separate tokens (terminals and non-terminals).

2.1.2   Comments

Comments are introduced with the hash sign (#) and continue until the end of the current line. They are allowed wherever whitespace is allowed:

# This is a comment
foaf:name: tmdm:topic-name # This is also a comment
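One subtlety of the comment rule is that the hash sign may also occur inside an IRI (for example in <http://usefulinc.com/ns/doap#>), where it does not start a comment. The following Python sketch illustrates one way a reader could handle this. It is not part of CRTM: the function name strip_comments is hypothetical, the IRI pattern follows the IRI token of the grammar in section 4, and reading that pattern as excluding both the backslash and the space is an assumption.

```python
import re

# IRI token, following the grammar: '<' [^<>"{}`\ ]+ '>'
# (assumption: both backslash and space are excluded).
# Comment token: '#' up to, but not including, the end of the line.
_TOKEN = re.compile(r'(?P<iri><[^<>"{}`\\ ]+>)|(?P<comment>#[^\n\r]*)')


def strip_comments(source: str) -> str:
    """Remove CRTM comments while leaving '#' inside <...> IRIs intact."""

    def keep_iri(match):
        # IRIs are kept verbatim; comments are replaced by nothing.
        return match.group("iri") or ""

    stripped = _TOKEN.sub(keep_iri, source)
    # Trim the trailing whitespace left behind where a comment was removed.
    return "\n".join(line.rstrip() for line in stripped.split("\n"))
```

Because the IRI alternative is tried first at every position, a hash sign inside angle brackets is consumed as part of the IRI, while a hash sign anywhere else removes the rest of the line.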

2.1.3   IRIs

IRIs are enclosed in < and >:

<http://www.semagia.com/> # This is an IRI
<http://ws.mappify.org/rdf2tm/> # This, too ;)

2.1.4   QNames

QNames abbreviate IRIs and can be used (nearly) everywhere an IRI is allowed. During deserialization, the IRI to which the prefix is bound is concatenated with the local part. The result is always an absolute IRI:

foaf:name: tmdm:topic-name

# Equivalent statement:
<http://xmlns.com/foaf/0.1/name>: <http://psi.topicmaps.org/iso13250/model/topic-name>
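The expansion described above amounts to a lookup followed by a string concatenation. A minimal Python sketch, assuming the prefix bindings are kept in a plain mapping (the names expand_qname and PREFIXES are hypothetical and not defined by CRTM):

```python
from typing import Mapping

# Prefix bindings assumed by the examples in this section.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "tmdm": "http://psi.topicmaps.org/iso13250/model/",
}


def expand_qname(qname: str, prefixes: Mapping[str, str]) -> str:
    """Concatenate the IRI bound to the prefix with the local part."""
    prefix, separator, local = qname.partition(":")
    if not separator:
        raise ValueError(f"not a QName: {qname!r}")
    if prefix not in prefixes:
        raise KeyError(f"unbound prefix: {prefix!r}")
    return prefixes[prefix] + local
```

With these bindings, expand_qname("foaf:name", PREFIXES) yields http://xmlns.com/foaf/0.1/name, which matches the equivalent statement shown above.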

2.1.5   Scope

The scope of a Topic Maps statement is introduced by the @ sign followed by a comma-separated list of QNames or IRIs:

# Maps the foaf:nick to the default name type and uses foaf:nick
# as scope
foaf:nick: tmdm:topic-name @foaf:nick

# Same as above, but assigns two themes
foaf:nick: tmdm:topic-name @foaf:nick, <http://psi.example.org/nickname>

2.2   Identities

While RDF knows just one identity type, Topic Maps provides two 'strong' identities (subject identifiers and subject locators) and one 'weak' identity type, the item identifiers.

CRTM provides support for all three identity types: the object may be mapped either to a subject identifier, a subject locator, or an item identifier.

A mapping to a subject identifier is expressed through either the keyword subject-identifier or sid:

foaf:mbox: subject-identifier

# Shortcut
foaf:mbox: sid

Subject locators can be established through the keyword subject-locator or slo:

foaf:img: subject-locator

# Shortcut
foaf:img: slo

Item identifiers are created through the keyword item-identifier or iid:

foaf:mbox: item-identifier

# Shortcut
foaf:mbox: iid

2.3   Occurrences

Occurrences are created with the keyword occurrence or the shortcut occ:

# Maps the foaf:homepage to an occurrence
foaf:homepage: occurrence

# Shortcut
foaf:homepage: occ

Optionally, a type and a scope may be specified:

# Overrides the foaf:homepage type with the specified type
foaf:homepage: occurrence <http://psi.example.org/homepage>

# Adding type and scope:
foaf:homepage: occurrence ex:homepage @lang:en

If a type and / or scope is specified, the keywords occurrence and occ are optional:

foaf:homepage: <http://psi.example.org/homepage>

foaf:homepage: ex:homepage @lang:en

foaf:homepage: @lang:en

2.4   Names

The notation for names is similar to the occurrence notation, except that the keyword is name. Alternatively, the hyphen - can be used to indicate that the RDF statement should be mapped to a topic name:

# Maps foaf:name to a topic name with the type foaf:name.
foaf:name: name

# Maps foaf:name to the default topic name type.
foaf:name: - tmdm:topic-name

# Maps foaf:nick to the default topic name type and adds foaf:nick to the scope
foaf:nick: - tmdm:topic-name @foaf:nick

# Equivalent statement; using the name keyword:
foaf:nick: name tmdm:topic-name @foaf:nick

2.5   Associations

Associations can be created with the (optional) keywords association or assoc followed by an optional type (if the RDF predicate should not be used as association type) followed by the role types and an optional scope:

# Creates an association where the subject and object play a role of type ``foaf:Person``
foaf:knows: association (foaf:Person, foaf:Person)

# Creates an association where the subject plays the ``ex:member`` role
# and the object the ``ex:group`` role.
ex:member-of: assoc(ex:member, ex:group)

The association keyword is optional:

# Same example as above
ex:member-of: (ex:member, ex:group)

Further, the association type can be overridden:

# The resulting association will have the type ex:parent-of rather than
# ex:child-of
ex:child-of: ex:parent-of(ex:child, ex:parent)
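How a single RDF triple could be turned into an association under such a mapping is sketched below in Python. The data structure and the function name are hypothetical and only illustrate the rules stated in this section: the subject plays the first role, the object plays the second role, and the RDF predicate serves as the association type unless the mapping overrides it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Association:
    type: str
    # Pairs of (role type, role player).
    roles: Tuple[Tuple[str, str], ...]


def map_triple_to_association(
    subject: str,
    predicate: str,
    obj: str,
    subject_role: str,
    object_role: str,
    type_override: Optional[str] = None,
) -> Association:
    """Build an association from a triple and the role types of a mapping."""
    return Association(
        # The predicate is the association type unless it is overridden.
        type=type_override or predicate,
        roles=((subject_role, subject), (object_role, obj)),
    )
```

For the mapping ex:child-of: ex:parent-of(ex:child, ex:parent), a triple with the predicate ex:child-of would be passed with type_override="ex:parent-of", subject_role="ex:child", and object_role="ex:parent".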

2.6   Type-Instance Relationship

While it is possible to model type-instance relationships with the notation for associations, CRTM provides a shortcut for it:

# Maps rdf:type to a type-instance association where the
# subject plays the tmdm:instance role and the object the tmdm:type role.
rdf:type: isa

Like associations, the type-instance shortcut accepts an optional scope.

2.7   Language Tags

Occurrences and names can utilize the optional RDF language tag which is translated into a topic which uses one of the OASIS ISO 639 PSIs [OASIS]. By default, the RDF language tags are ignored (for compatibility with [RTM]).

If the RDF source provides language tags they can be added to the scope of the occurrence:

foaf:homepage: occurrence; lang=true

The lang=true instruction advises the CRTM reader to convert the (optional) language tag into a topic which is added to the occurrence scope.

Given the following [Turtle] statements:

ex:tinyTiM dc:description "tinyTiM is a small Topic Maps engine"@en .
ex:tinyTiM dc:description "tinyTiM ist eine kleine Topic Maps Engine"@de .

the instruction:

dc:description: occurrence; lang=true

results in the following topic (using [CTM]):

%prefix lang <http://psi.oasis-open.org/iso/639/#>
# Other prefixes omitted

ex:tinyTiM
    dc:description: "tinyTiM is a small Topic Maps engine"@lang:eng;
    dc:description: "tinyTiM ist eine kleine Topic Maps Engine"@lang:deu.
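The translation of a language tag into a scope theme can be pictured as a table lookup. The following Python sketch is an illustration only: the function name language_theme and the lookup table are hypothetical, and the table contains just the two languages used in the example above. A real reader would need a complete mapping to the codes used by the [OASIS] PSIs.

```python
from typing import Mapping, Optional

# Base IRI of the language PSIs, taken from the %prefix lang line above.
OASIS_639_BASE = "http://psi.oasis-open.org/iso/639/#"

# Illustrative excerpt only: the two entries match the example above.
ISO_639_1_TO_3 = {"en": "eng", "de": "deu"}


def language_theme(
    language_tag: Optional[str],
    table: Mapping[str, str] = ISO_639_1_TO_3,
) -> Optional[str]:
    """Return the PSI for an RDF language tag, or None if there is none."""
    if not language_tag:
        # Literals without a language tag get no additional theme.
        return None
    code = table.get(language_tag.lower())
    # Tags missing from the table are skipped in this sketch.
    return OASIS_639_BASE + code if code else None
```

In this sketch, the tag en yields http://psi.oasis-open.org/iso/639/#eng, which corresponds to the theme lang:eng in the CTM output above.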

CRTM also offers a global setting to translate the language tags. For that reason, the translation can also be disabled on a per-statement basis:

dc:description: occurrence; lang=false

This would disable the translation of the language tag for the dc:description RDF predicate.

2.8   Prefix Directive

The prefix directive is used to associate an IRI with an identifier and to use QNames instead of more verbose IRIs.

Example:

%prefix ex <http://www.example.org/>

ex:foo: subject-identifier

In the example above the QName ex:foo is expanded to http://www.example.org/foo.

2.9   Language to Scope Directive

The language to scope directive enables / disables the translation of the optional RDF language tag for all occurrences and names.

Example:

%langtoscope true

dc:description: occurrence

Given the following [Turtle] statements:

ex:tinyTiM dc:description "tinyTiM is a small Topic Maps engine"@en .
ex:tinyTiM dc:description "tinyTiM ist eine kleine Topic Maps Engine"@de .

the resulting [CTM] topic would be:

%prefix lang <http://psi.oasis-open.org/iso/639/#>
# Other prefixes omitted

ex:tinyTiM
    dc:description: "tinyTiM is a small Topic Maps engine"@lang:eng;
    dc:description: "tinyTiM ist eine kleine Topic Maps Engine"@lang:deu.

Each mapping may override the global setting by setting lang to false:

%langtoscope true

dc:description: occurrence; lang=false

This results in:

ex:tinyTiM
    dc:description: "tinyTiM is a small Topic Maps engine";
    dc:description: "tinyTiM ist eine kleine Topic Maps Engine".
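The interplay between the global directive and the per-statement lang setting can be summarized as follows. This Python sketch uses hypothetical names; it assumes that a statement without an explicit lang setting inherits the global setting, and that the global setting is false unless the directive says otherwise (language tags are ignored by default).

```python
from typing import Optional


def translate_language_tag(
    global_setting: bool = False,
    statement_setting: Optional[bool] = None,
) -> bool:
    """Decide whether the language tag of a literal becomes a scope theme."""
    if statement_setting is not None:
        # An explicit lang=true or lang=false on the statement wins.
        return statement_setting
    # Otherwise the %langtoscope value (default: false) applies.
    return global_setting
```

The last example of this section corresponds to translate_language_tag(True, False), which returns False.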

2.10   Include Directive

To create modular mappings or to reuse other mappings, CRTM offers the include directive:

%include <foaf.crtm>

The referenced mapping is added to the current CRTM instance. The IRI of the referenced mapping is resolved against the document locator of the current CRTM instance.
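Resolving the reference against the document locator is ordinary relative reference resolution. A minimal Python sketch using the standard library follows; the function name and the example locator http://www.example.org/mappings/doap.crtm are assumptions made for illustration.

```python
from urllib.parse import urljoin


def resolve_include(document_locator: str, reference: str) -> str:
    """Resolve the IRI of an %include directive against the including document."""
    return urljoin(document_locator, reference)
```

If the current CRTM instance were located at http://www.example.org/mappings/doap.crtm, the directive %include <foaf.crtm> would refer to http://www.example.org/mappings/foaf.crtm. An absolute IRI is returned unchanged.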

2.11   Predicate Lists

To shorten mappings further, CRTM allows a list of predicates to be mapped to the same Topic Maps construct.

The following statements:

foaf:familyName: name
foaf:firstName: name
foaf:givenName: name

can be folded into one statement:

foaf:familyName, foaf:firstName, foaf:givenName: name

That notation works for all CRTM instructions:

foaf:homepage, foaf:workInfoHomepage: subject-identifier

2.12   Grouped Statements

If a mapping is dedicated to a particular domain (e.g. the FOAF vocabulary), always typing QNames can be cumbersome. Therefore, CRTM offers an alternative syntax:

%prefix foaf <http://xmlns.com/foaf/0.1/>

foaf {
  name: name
  nick: name
}

The identifiers within the curly braces are interpreted as the local part of a QName with the prefix "foaf". The grouped statements are not limited to one prefix, though:

%prefix doap <http://usefulinc.com/ns/doap#>
%prefix foaf <http://xmlns.com/foaf/0.1/>

doap {
  shortdesc, description: occurrence
}

foaf {
  name: name
  nick: name
}

As shown in the example above, it is also possible to use a list of identifiers which should be mapped to the same Topic Maps statement (doap:shortdesc and doap:description).
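Grouped statements and identifier lists are pure shorthand: each of them can be unfolded into single statements with one QName each. The following Python sketch shows that unfolding on already parsed input. The function name and the input shape (a prefix plus pairs of identifier list and statement body) are assumptions for illustration, not part of CRTM.

```python
from typing import Iterable, List, Sequence, Tuple


def expand_group(
    prefix: str,
    entries: Iterable[Tuple[Sequence[str], str]],
) -> List[str]:
    """Unfold a grouped statement into single CRTM statements."""
    statements = []
    for local_ids, body in entries:
        for local_id in local_ids:
            # Each identifier becomes the local part of a QName.
            statements.append(f"{prefix}:{local_id}: {body}")
    return statements
```

For the doap group above, expand_group("doap", [(["shortdesc", "description"], "occurrence")]) produces the two statements doap:shortdesc: occurrence and doap:description: occurrence.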

3   Example CRTM Mapping

This section shows a complete example mapping:

#
# CRTM example that maps a subset of the DOAP voc. to Topic Maps
#
%prefix doap <http://usefulinc.com/ns/doap#>
%prefix tmdm <http://psi.topicmaps.org/iso13250/model/>
%prefix rdf <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
%prefix foaf <http://xmlns.com/foaf/0.1/>
%prefix ex <http://psi.example.org/doap/>

rdf:type: isa

foaf {
  # Map the foaf:name type to the default name type.
  name: - tmdm:topic-name
  homepage: occurrence
}

doap {
  # Map the doap:name type to the default name type.
  name: - tmdm:topic-name

  shortname: name

  # Map all of the following DOAP properties to occurrences
  shortdesc, description,
  homepage, download-page,
  bug-database, mailing-list,
  license, programming-language, browse: occurrence

  # Create an association from the repository property
  repository: ex:has-repository(ex:project, ex:repository)

  # Create an assoc from the maintainer property
  maintainer: ex:maintains(ex:project, ex:maintainer)

  # Treat the repository URL as subject locator.
  location: subject-locator
}

4   Grammar

The CRTM grammar is defined as follows, using the EBNF notation defined in XML 1.0 (Third Edition) [EBNF].

Note

CRTM has no reserved words; all keywords may be used as identifiers.

instance          ::= directive* (statement | prefix)*

directive         ::= include | prefix | lang-to-scope

prefix            ::= '%prefix' IDENT IRI

include           ::= '%include' IRI

lang-to-scope     ::= '%langtoscope' boolean

statement         ::= grouped-statement | single-statement

grouped-statement ::= (IDENT | IRI) '{' (local_ids ':' statement-body)+ '}'

local_ids         ::= LOCAL_IDENT (',' LOCAL_IDENT)*

single-statement  ::= predicates ':' statement-body

statement-body    ::= identity | type-of | subtype-of | association | occurrence | name

predicates        ::= qiri (',' qiri)*

identity          ::= sid | slo | iid

sid               ::= 'subject-identifier' | 'sid'

slo               ::= 'subject-locator' | 'slo'

iid               ::= 'item-identifier' | 'iid'

type-of           ::= 'isa' scope?

subtype-of        ::= 'ako' scope?

association       ::= ('association' | 'assoc')? type? roles scope?

roles             ::= '(' subject-role ',' object-role ')'

subject-role      ::= qiri

object-role       ::= qiri

occurrence        ::= ('occurrence' | 'occ') type? scope? language?
                      | type scope? language?
                      | scope language?

name              ::= ('name' | '-') type? scope? language?

language          ::= ';' 'lang' '=' boolean

type              ::= qiri

scope             ::= '@' theme (',' theme)*

theme             ::= qiri

qiri              ::= QNAME | IRI

boolean           ::= 'true' | 'false'


IDENT             ::= ID_START (\.* ID_CHAR)*

LOCAL_IDENT       ::= IDENT | ([0-9]+ (\.* ID_CHAR)*)

QNAME             ::= IDENT ':' LOCAL_IDENT

IRI               ::= '<' [^<>"{}`\ ]+ '>'

COMMENT           ::= '#' [^#xA#xD]*

ID_START          ::= [a-zA-Z_]
                      | [\u00C0-\u00D6] | [\u00D8-\u00F6]
                      | [\u00F8-\u02FF] | [\u0370-\u037D]
                      | [\u037F-\u1FFF] | [\u200C-\u200D]
                      | [\u2070-\u218F] | [\u2C00-\u2FEF]
                      | [\u3001-\uD7FF] | [\uF900-\uFDCF]
                      | [\uFDF0-\uFFFD]
                      | [\u10000-\uEFFFF]

ID_CHAR           ::= ID_START | [-.0-9]
                      | \u00B7 | [\u0300-\u036F] | [\u203F-\u2040]

5   Bibliography

[RDF] Resource Description Framework (RDF): Concepts and Abstract Syntax, W3C, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/rdf-concepts/
[TMDM] ISO/IEC 13250-2: Topic Maps — Data Model (TMDM), 2006, http://www.isotopicmaps.org/sam/sam-model/2008-06-03/
[RTM] The RTM RDF to Topic Maps mapping, Ontopia A/S, 2003, http://www.ontopia.net/topicmaps/materials/rdf2tm.html
[Turtle] Turtle - Terse RDF Triple Language, David Beckett, 2008, http://en.wikipedia.org/wiki/Turtle_%28syntax%29
[CTM] ISO/IEC 13250-6: Topic Maps — Compact Syntax (CTM), http://www.isotopicmaps.org/ctm/
[OASIS] OASIS PubSubj TC, Published subjects for languages in ISO 639, http://psi.oasis-open.org/iso/639/
[FOAF] FOAF Vocabulary Specification, 2010, 3rd edition, http://xmlns.com/foaf/spec/20100101.html
[IRI] IETF RFC 3987, Internationalized Resource Identifiers (IRIs), Internet Standards Track Specification, January 2005, http://www.ietf.org/rfc/rfc3987.txt
[EBNF] XML 1.0, Extensible Markup Language (XML) 1.0, W3C, Third Edition, W3C Recommendation, 04 February 2004, http://www.w3.org/TR/REC-xml/