RDF as XML

Over the last week, Planet RDF has seen more than a few posts and comments on the RDF/XML serialisation syntax, most of them looking into its almost innumerable possible variations.

Danny Ayers has a great overview that looks at the matter in terms of expectations, which is not an unimportant point. As my boss always tells me: It’s all about controlling expectations.

All in all, I agree with most of the comments on the subject, even parts of the ones that don’t seem to agree with each other.

In short: RDF/XML is just a syntax (it’s the RDF model that counts), and while I generally find it acceptable, the variation is one aspect I hope I would have done differently had I been involved. Less variation would make it more accessible to XML tools such as XSLT, leading to easier ways of generating clean XML for other uses.

The subject of RDF/XML variation is as close to being a permathread as it can be, and I’ve participated before myself, mostly with regard to the R3X syntax subset. I’ve been doing a lot of XSLT as well, and three recommendations come to mind when considering a syntactic profile or subset of RDF/XML, to reduce the variation:

Drop the attribute form
Except for aesthetic reasons, it’s not necessary.
Don’t use typed nodes
While it may seem easier and smarter to write <foaf:Person>...</foaf:Person> than <rdf:Description><rdf:type rdf:resource="&foaf;Person"/>...</rdf:Description>, it makes it much harder to deal with nodes that have multiple types. (Using named entities such as &foaf; can help a lot with the longer form, too.)
Sort and group
’Nuff said: don’t break statements about the same subject into different elements. Keep them together, and don’t nest at all.
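
As a rough illustration of the first two recommendations, here is a minimal Python sketch that rewrites top-level typed nodes as rdf:Description elements with an explicit rdf:type, and turns property attributes into property elements. The helper names (normalise, types_of) and the sample document are made up for this example, it assumes un-nested, namespace-qualified RDF/XML, and it does not attempt the sorting and grouping step.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"
XML_NS = "{http://www.w3.org/XML/1998/namespace}"

RDF_DESCRIPTION = f"{{{RDF}}}Description"
RDF_TYPE = f"{{{RDF}}}type"
RDF_RESOURCE = f"{{{RDF}}}resource"
# Attributes that identify the node rather than state a property.
SYNTAX_ATTRS = {f"{{{RDF}}}about", f"{{{RDF}}}nodeID", f"{{{RDF}}}ID"}


def normalise(root):
    """Hypothetical helper: rewrite the top-level node elements in place."""
    for node in root:
        # Typed node -> rdf:Description plus an explicit rdf:type element.
        if node.tag != RDF_DESCRIPTION:
            ns, _, local = node.tag[1:].partition("}")
            node.insert(0, ET.Element(RDF_TYPE, {RDF_RESOURCE: ns + local}))
            node.tag = RDF_DESCRIPTION
        # Property attributes -> property elements.
        for name in list(node.attrib):
            if name in SYNTAX_ATTRS or name.startswith(XML_NS):
                continue
            value = node.attrib.pop(name)
            prop = ET.SubElement(node, name)
            if name == RDF_TYPE:
                prop.set(RDF_RESOURCE, value)  # an rdf:type attribute holds a URI
            else:
                prop.text = value              # other attributes hold plain literals
    return root


def types_of(node):
    """All types of a normalised node, found the same way every time."""
    return {t.get(RDF_RESOURCE) for t in node.findall(RDF_TYPE)}


# Made-up sample: a typed node that also carries a second type.
SAMPLE = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:foaf="{FOAF}">
  <foaf:Person rdf:about="http://example.org/#me" foaf:name="Alice">
    <rdf:type rdf:resource="{FOAF}Agent"/>
  </foaf:Person>
</rdf:RDF>
"""

root = normalise(ET.fromstring(SAMPLE))
node = root[0]
print(sorted(types_of(node)))
```

After normalisation, a consumer (an XSLT template, say) no longer has to guess which of the two types ended up as the element name: both are ordinary rdf:type children of the same rdf:Description.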

That said, I don’t consider the deficiencies of RDF/XML to be serious enough to warrant a new XML syntax. After all, there are plenty of RDF/XML parsers out there by now, and the real challenges lie elsewhere (see: Crisis, LargeTripleStores, and Tagtriples + identity precision).

5 thoughts on “RDF as XML”

  1. Why is it more difficult to deal with multiple types if you’re using typed nodes? Is this because you have no idea how an RDF serializer will choose to handle it?

    From an XML/XSLT standpoint, it seems to me using them offers more power (in particular validation) and flexibility.

  2. Bruce,

    It’s because only one of the types of a given resource can be expressed in the typed node form. Since e.g. a foaf:Person (due to inference) is always also a foaf:Agent, you won’t know which of them the serializer will use. Thus, you have to do extra work to handle both possible cases when processing the document with e.g. XSLT.

    The validation aspect is orthogonal as I see it; it doesn’t have anything to do with syntax.

  3. +1

    I am using these 3 restrictions in all RDF files we publish.

    Life is much simpler (and faster…) this way.

  4. Thanks for the clarification. Makes sense. I suppose it’d be helped if a serializer did reasoning and knew that one type was a subclass of the other?

    However, I’m confused by your comment about syntax and validation. What is XML validation about, if not the syntax? If I write a pattern in RNG like:

    Person = element foaf:Person { Name }

    … then I am specifying the syntax.

    I have found that in general most validation languages make it easier to validate these sorts of patterns than to have the same element (rdf:Description) for potentially very different structures. For example, RNG has a very powerful and RDF-friendly feature called interleave, which allows unordered content. IIRC, however, you cannot use interleave where you have the same element name and want to distinguish patterns based on an attribute (e.g. the one from rdf:type).

    Regarding the validation aspect, I think it’s orthogonal to the syntax because RDF doesn’t care much about documents: it’s triples all the way down (see also Missing isn’t broken).

    That’s what makes it possible to add e.g. “unknown” properties without risk.

    As for the case of embedding RDF(/XML) in XML documents, I think it’d be better to simply provide a transformation (per GRDDL) or to allow anything that’s RDF/XML at some specific place. If you insist on validating the structure of (partial) RDF/XML documents, you won’t be able to draw upon the flexibility of RDF.
