SHACL and OWL Compared

On July 20, 2017 the Shapes Constraint Language (SHACL) became an official W3C Recommendation, lifting it into the same league as other RDF-based standards including RDF Schema, OWL, SPARQL, Turtle and JSON-LD. This article examines the similarities and differences between SHACL and the Web Ontology Language (OWL). It explains that OWL has been designed for classification tasks (inferencing in an "open world"), while SHACL covers data validation (in a "closed world") similar to traditional schema languages. Given this division of roles, both technologies can be used together or individually. Further, SHACL can also be used for general purpose rule-based inferencing.

A Historical Perspective

The Creation of RDF, RDF Schema and OWL

The base specification of the W3C semantic technology stack is RDF, a simple and flexible graph-based data model based on URIs and literals that form triples, published in 2004. Parallel to the first version of RDF, work was underway to produce schema and ontology languages leading to the publication of the RDF Schema and Web Ontology Language (OWL) specifications. RDF Schema introduced a core vocabulary to declare classes (using rdfs:Class) and properties (using rdf:Property), and built-in properties to associate properties with classes (rdfs:domain) and to declare their value types (using rdfs:range). RDF Schema is not a schema language in a traditional sense (like for example XML Schema). It is a very minimalistic language designed for inferencing rather than to enforce adherence of data to a schema.

The Web Ontology Language (OWL) was designed as an extension of RDF Schema. OWL added properties such as owl:maxCardinality to express so-called restrictions. A very important thing to note is that, like RDFS, OWL was designed for inferencing. OWL restrictions are not actually data constraints, but rather describe inferences to be applied based on them. For many first-time users of OWL this leads to some very surprising outcomes. For example, assuming there is an owl:maxCardinality 1 restriction stating that a person can only have 1 value for ex:hasFather and there is an instance of ex:Person that has two ex:hasFather values, then an OWL processor will assume that these two values must in fact represent the same real-world entity, just with different URIs. This topic is sometimes called the Unique-Name Assumption (or, in the case of OWL, the lack of that assumption). OWL's interpretation is based on a (rather philosophical) distinction between a resource (URI) and a real-world entity that is represented by that resource. If you follow the OWL spec, you cannot send a set of instances to an OWL processor and ask whether these instances "match" or "conform to" the given schema in the same way that you would send an XML file to an XML Schema validator. Instead, an OWL processor will actually add to the data in an attempt to conform to the restrictions rather than report an error. If the addition of new facts results in logical contradictions, then the processor will report an error. However, the error is rarely traceable to the original data statement(s) that caused the contradictions.
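To make this surprise concrete, here is a minimal sketch of the situation just described. The ex: namespace URI and the individuals ex:John, ex:Bob and ex:Robert are assumptions made purely for illustration. The commented-out triple at the end is what an OWL reasoner would be expected to add instead of reporting an error.

```turtle
@prefix ex:   <http://example.org/ns#> .   # hypothetical namespace
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# A person may have at most one ex:hasFather value.
ex:Person
	a owl:Class ;
	rdfs:subClassOf [
		a owl:Restriction ;
		owl:onProperty ex:hasFather ;
		owl:maxCardinality "1"^^xsd:nonNegativeInteger ;
	] .

# Asserted data: John has two ex:hasFather values.
ex:John
	a ex:Person ;
	ex:hasFather ex:Bob ;
	ex:hasFather ex:Robert .

# Expected OWL inference (no error is raised):
# ex:Bob owl:sameAs ex:Robert .
```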

Another basic design decision gave the OWL properties a certain meaning (aka semantics) that was supposed to make OWL-based data fit for the open world of the Web, in which any RDF resource may link to any other RDF resource and nobody has full control over which triples are actually present when the resource is in use. A consequence of this Open-World Assumption is that an application should not assume that the absence of a certain statement means that the statement is false. For example, if an OWL ontology states that the rdfs:range of the property ex:hasFather is ex:Person and an application only sees a triple stating ex:John ex:hasFather ex:Bob and nothing else, then it should not assume that ex:Bob is not an instance of ex:Person. In fact, the application should assume the opposite, and automatically infer the triple ex:Bob rdf:type ex:Person. Another surprise for newer OWL users is that a missing value for a property with a restriction of owl:minCardinality 1 is not reported as an error by an OWL processor, because more data may appear at any time to satisfy that restriction under the Open-World Assumption.
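The range example can be sketched in a few triples. The ex: namespace URI is an assumption for illustration, and the commented-out triple shows the inference an RDFS or OWL engine would be expected to produce from the asserted data.

```turtle
@prefix ex:   <http://example.org/ns#> .   # hypothetical namespace
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Schema: values of ex:hasFather are persons.
ex:hasFather
	a rdf:Property ;
	rdfs:range ex:Person .

# The only asserted fact that mentions Bob:
ex:John ex:hasFather ex:Bob .

# Expected inference under the Open-World Assumption:
# ex:Bob rdf:type ex:Person .
```

Read as a closed-world validation rule instead, the same range statement would flag ex:Bob as a problem because no rdf:type triple is asserted for it.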

Data Validation prior to SHACL

Although data validation is an important practical use case for the RDF stack, until SHACL came around, there was no W3C standard mechanism for defining data constraints. Over time, people became creative in working around this limitation. Many tools simply decided that for all practical purposes, the open-world and non-unique-name assumptions should simply be ignored. OWL-aware tools including TopBraid and Protégé, for example, provide data entry forms that restrict users from entering more than one value if there is a corresponding owl:maxCardinality 1 restriction, or require the selection of a specific instance of ex:Person if that class is the rdfs:range of the property.

Another practical solution that has emerged over the years is writing SPARQL queries that test an RDF data graph against conditions expressed in a SPARQL WHERE clause. SPARQL does not make any assumptions about inferences being present or not. Instead, a SPARQL processor simply queries the triples that are actually asserted in the data. Even today, many applications use home-grown formats to store a collection of SPARQL queries that are executed to validate data in ad hoc processes.

The usefulness of SPARQL as a data constraint language led to the creation of an RDF vocabulary called the SPARQL Inferencing Notation (SPIN), also known as "SPARQL Rules". SPIN started as a feature in TopBraid Composer and evolved and grew over time into a TopQuadrant W3C Member Submission in 2011. SPIN defines properties that can be used to attach SPARQL queries to classes, stating that all instances of these classes must satisfy the constraints stated by those SPARQL queries. SPIN also includes vocabulary terms to represent inferencing rules, again attached to classes. Over the years, SPIN became popular among a community of users, yet without being an official W3C standard it did not lead to widespread industry adoption.

The Creation of SHACL

Recognizing the lack of a suitable standard to express constraints and schemas with a Closed World Assumption, the W3C launched the RDF Data Shapes Working Group in 2014. The group used SPIN and other member submissions such as IBM's Resource Shapes as input, and the term shape was established to mean a collection of constraints that apply to targeted RDF resources. After a couple of years of intense discussions, the Shapes Constraint Language (SHACL) was standardized as a W3C Recommendation in July 2017.

SHACL can be regarded as a next-generation merger of Resource Shapes and SPIN, and it provides a high-level vocabulary with properties such as sh:minCount and sh:datatype as well as a fallback mechanism that allows users to express basically any constraint using SPARQL or (using the SHACL-JS extension) in JavaScript. The high-level vocabulary makes SHACL also a schema/ontology language, allowing tools to examine the structure of a class to, for example, suggest how instances of a class should be presented on an input form. Since OWL performs (limited, as described above) data validation through inferencing, it has no separation between data validation and reasoning. SHACL separates checking data validity from inferring new facts. Both, however, are possible with SHACL.

Comparing OWL and SHACL

The following sub-sections compare various aspects of OWL and SHACL.

Defining Class Hierarchies

Classes, arranged in a sub-class or sub-set hierarchy, form the backbone of most data models. Both OWL and SHACL rely on the RDF Schema vocabulary for the basic mechanism, as illustrated in the following example:

ex:Person
	a rdfs:Class ;
	rdfs:label "Person" ;
	rdfs:comment "A human being" .
	
ex:Customer
	a rdfs:Class ;
	rdfs:subClassOf ex:Person .

Note that OWL defines its own metaclass owl:Class which can be used instead of rdfs:Class. SHACL treats all these classes in the same way, assuming the graph contains triples that declare owl:Class (or any other metaclass) as a sub-class of rdfs:Class. Some SHACL features such as sh:class and sh:targetClass are defined to automatically "walk" up the superclass hierarchy based on the rdfs:subClassOf relationship. So overall this central concept of schemas and ontologies is handled very similarly between OWL and SHACL.
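The "walking" of the class hierarchy can be illustrated with a small sketch. The shape ex:PersonShape, the individual ex:Alice and the ex: namespace URI are assumptions for illustration, and the sketch assumes that the rdfs:subClassOf triples are part of the graph being validated.

```turtle
@prefix ex:   <http://example.org/ns#> .   # hypothetical namespace
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .

ex:Person a rdfs:Class .

ex:Customer
	a rdfs:Class ;
	rdfs:subClassOf ex:Person .

# The shape targets ex:Person only.
ex:PersonShape
	a sh:NodeShape ;
	sh:targetClass ex:Person ;
	sh:property [
		sh:path ex:hasFather ;
		sh:maxCount 1 ;
	] .

# Alice is typed only as ex:Customer, yet she is a target of ex:PersonShape
# because ex:Customer is a subclass of ex:Person.
ex:Alice
	a ex:Customer ;
	ex:hasFather ex:Bob, ex:Robert .   # two values: violates sh:maxCount 1
```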

Defining Constraints and their Targets

OWL operates on classes, which are understood as sets of instances that satisfy the same restrictions. OWL includes the metaclass owl:Restriction which is typically used as an anonymous superclass of the named class that the restriction is about. For example, to state that an ex:Person cannot have more than one ex:hasFather and this value must be an instance of ex:Person, we can state:

ex:Person
	a owl:Class ;
	rdfs:subClassOf [
		a owl:Restriction ;
		owl:onProperty ex:hasFather ;
		owl:maxCardinality 1 ;
	] ;
	rdfs:subClassOf [
		a owl:Restriction ;
		owl:onProperty ex:hasFather ;
		owl:allValuesFrom ex:Person ;
	] .

Such restrictions apply to all instances of the class and its subclasses. SHACL offers more flexibility. The direct SHACL equivalent of the OWL example above would be:

ex:Person
	a owl:Class, sh:NodeShape ;
	sh:property [
		sh:path ex:hasFather ;
		sh:maxCount 1 ;
		sh:class ex:Person ;
	] .

An alternative approach is to decouple the class definition and its "shape" and to use sh:targetClass to link the shape to its target class(es):

ex:Person
	a owl:Class .

ex:PersonShape
	a sh:NodeShape ;
	sh:targetClass ex:Person ;
	sh:property [
		sh:path ex:hasFather ;
		sh:maxCount 1 ;
		sh:class ex:Person ;
	] .

The latter design is sometimes cleaner if shape definitions should be reused in multiple classes or if they are developed by different people, in different files or namespaces. However, the simpler (first) pattern of directly attaching the constraints to a class is also supported.

A difference between OWL and SHACL is the presence of global property axioms such as owl:FunctionalProperty in OWL. SHACL does not have such global constraints but they can be expressed using shapes that apply to all places where a property is present, for example using sh:targetSubjectsOf. So the following two examples both mean similar things:

ex:identifier
	a owl:FunctionalProperty .

SHACL is a bit more verbose here:

ex:IdentifierShape
	a sh:PropertyShape ;
	sh:targetSubjectsOf ex:identifier ;
	sh:path ex:identifier ;
	sh:maxCount 1 .

SHACL shapes are not limited to targeting instances of a class. In addition, the SHACL vocabulary includes terms to state that a shape applies to all subjects or objects that have values for a certain property. SHACL also offers a way to state that a shape applies to a specific resource or a group of resources by listing their URIs. The custom targets (SHACL Advanced Features document) take this further and even support the selection of target nodes based on SPARQL queries or JavaScript code. This makes it possible to state precisely, based on some criteria, which nodes a given shape should apply to. Overall, SHACL is vastly more flexible here than OWL.
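As a sketch of two of these target types, the following shapes use sh:targetNode and sh:targetObjectsOf from SHACL Core. The shape names, the listed individuals and the ex: namespace URI are assumptions made for illustration.

```turtle
@prefix ex: <http://example.org/ns#> .   # hypothetical namespace
@prefix sh: <http://www.w3.org/ns/shacl#> .

# Applies only to the explicitly listed nodes.
ex:VipCustomerShape
	a sh:NodeShape ;
	sh:targetNode ex:JohnDoe, ex:JaneDoe ;
	sh:property [
		sh:path ex:moneySpent ;
		sh:minCount 1 ;
	] .

# Applies to every node that appears as a value (object) of ex:hasFather,
# regardless of its rdf:type.
ex:FatherShape
	a sh:NodeShape ;
	sh:targetObjectsOf ex:hasFather ;
	sh:class ex:Person .
```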

In OWL, describing what properties a class member may have is done by declaring a class either an rdfs:subClassOf or owl:equivalentClass of some restriction. In the example above, we have used rdfs:subClassOf. The difference between these two options is in what inferences are generated. In each case, the targets of OWL inferencing may be members of the class for which the restriction is declared or members of the class the restriction refers to. The subtleties of this are often misunderstood or ignored by the users who simply want to describe a data model. SHACL uses an arguably easier to understand and more familiar paradigm.

Built-in Constraint Types

The following table summarizes the available kinds of constraints that are built (hard-coded) into OWL and SHACL processors. The table does not necessarily mean that OWL and SHACL are equivalent in their interpretation - as mentioned before there are fundamental differences in how OWL interprets restrictions (for inferencing) from how SHACL interprets constraints (for validation).

Note that SHACL offers its extension mechanisms allowing anyone to create their own constraint types that can then be used with properties similar to the ones below. Such extension namespaces can be published on the web, for anyone to reuse. An example of such an extension namespace is the DASH Data Shapes Vocabulary which is indicated where suitable below. The SHACL Core vocabulary was designed with this extensibility in mind, i.e. although it covers the most common use cases, no attempt was made to be completely comprehensive (which is not realistic anyway).

OWL	SHACL	Notes
Value Type Constraints
`owl:allValuesFrom`	`sh:class` or `sh:datatype`	`sh:datatype` also checks well-formedness
-	`sh:nodeKind`
Cardinality Constraints
`owl:maxCardinality`	`sh:maxCount`
`owl:minCardinality`	`sh:minCount`
`owl:cardinality`	`sh:minCount` + `sh:maxCount`	SHACL is more verbose but less redundant for tools
`owl:FunctionalProperty`	`sh:maxCount 1`	In OWL: global axiom, in SHACL: local constraint
`owl:InverseFunctionalProperty`	`sh:maxCount 1` on a `sh:inversePath`	In OWL: global axiom, in SHACL: local constraint
Value Range Constraints
`owl:onDatatype`/`owl:withRestrictions`/`xsd:minExclusive` etc	`sh:minExclusive` etc	Datatype facets are considerably more verbose in OWL
String-based Constraints
`owl:onDatatype`/`owl:withRestrictions`/`xsd:minLength`	`sh:minLength`
`owl:onDatatype`/`owl:withRestrictions`/`xsd:maxLength`	`sh:maxLength`
`owl:onDatatype`/`owl:withRestrictions`/`xsd:length`	`sh:minLength` + `sh:maxLength`
`owl:onDatatype`/`owl:withRestrictions`/`xsd:pattern`	`sh:pattern`	OWL does not support `sh:flags`
`owl:onDatatype`/`owl:withRestrictions`/`rdf:langRange`	`sh:languageIn`	Different approach
-	`sh:uniqueLang`
Property Pair Constraints
`owl:equivalentProperty`	`sh:equals`
`owl:propertyDisjointWith`, `owl:AllDisjointProperties`	`sh:disjoint`
`owl:inverseOf`	`sh:inversePath/sh:equals`
-	`sh:lessThan`
-	`sh:lessThanOrEquals`
`rdfs:subPropertyOf`	`dash:subSetOf`	Or: combine SHACL with RDFS inferencing
`owl:onProperty`, `owl:propertyChainAxiom`	`sh:path`	SHACL supports arbitrary property paths, OWL does not
Logical Constraints
`owl:complementOf`	`sh:not`
`owl:intersectionOf`	`sh:and`
`owl:unionOf`	`sh:or`
`owl:qualifiedMin/MaxCardinality 1`	`sh:xone`
`owl:disjointUnionOf`	`sh:node`/`sh:or`/`sh:not`	This is verbose in SHACL
Shape-based (structural) Constraints
`rdfs:subClassOf`, `owl:equivalentClass`	`sh:node`
`rdfs:subClassOf`, `owl:equivalentClass`	`sh:property`
`owl:someValuesFrom`	`sh:qualifiedMinCount 1` or `dash:hasValueWithClass`
`owl:qualifiedMinCardinality` etc	`sh:qualifiedMinCount` etc
Other Constraints
-	`sh:closed`, `sh:ignoredProperties`
`owl:hasValue`	`sh:hasValue`
`owl:oneOf`	`sh:in`
`owl:ReflexiveProperty`	`sh:not`/`sh:disjoint` in a node shape
`owl:IrreflexiveProperty`	`sh:disjoint` in a node shape
`owl:SymmetricProperty`	-
`owl:AsymmetricProperty`	-
`owl:TransitiveProperty`	`sh:path` with `*` operator
`owl:hasKey`	`dash:uriStart`	Approximation
`owl:sameAs`, `owl:differentFrom`, `owl:AllDifferent`	-	In SHACL every resource is distinct by default

Both languages have comparable features to build more complex restrictions out of these basic building blocks. For example, an OWL ontology can define a restriction saying that values of a restricted property must again fulfill some other restrictions described in terms of values of their properties. In SHACL this is achieved by shapes that reference other shapes, e.g. using sh:node and sh:property.
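A minimal sketch of one shape referencing another could look as follows. The shapes, the properties ex:address and ex:postalCode, and the ex: namespace URI are assumptions for illustration.

```turtle
@prefix ex:  <http://example.org/ns#> .   # hypothetical namespace
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# A reusable shape without its own target.
ex:AddressShape
	a sh:NodeShape ;
	sh:property [
		sh:path ex:postalCode ;
		sh:datatype xsd:string ;
		sh:minCount 1 ;
	] .

ex:PersonShape
	a sh:NodeShape ;
	sh:targetClass ex:Person ;
	sh:property [
		sh:path ex:address ;
		sh:node ex:AddressShape ;   # each address value must conform to ex:AddressShape
	] .
```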

With some notable exceptions that we will explain below, the expressivity of OWL is comparable to the SHACL Core vocabulary. A syntactic translation between OWL and SHACL is straightforward in most cases. Here is an example stating that the values of schema:attendee of a schema:Event are instances of either schema:Organization or schema:Person.

schema:Event
	a owl:Class ;
	rdfs:subClassOf [
		a owl:Restriction ;
		owl:onProperty schema:attendee ;
		owl:allValuesFrom [
			a owl:Class ;
			owl:unionOf (
				schema:Organization
				schema:Person
			)
		]
	] .

An equivalent SHACL shape may look as follows:

schema:Event
	a owl:Class, sh:NodeShape ;
	sh:property [
		sh:path schema:attendee ;
		sh:or (
			[ sh:class schema:Organization ]
			[ sh:class schema:Person ]
		)
	] .

While the example above illustrates that both languages sometimes only differ in the surface syntax, there are some use cases that cannot be covered by OWL. For example, to state that values of schema:address can be either resources of type schema:PostalAddress or literals with datatype xsd:string is not possible in OWL because object and datatype properties are disjoint. In SHACL it would be:

schema:Person
	a owl:Class, sh:NodeShape ;
	sh:property [
		sh:path schema:address ;
		sh:or (
			[ sh:class schema:PostalAddress ]
			[ sh:datatype xsd:string ]
		)
	] .

Additional SHACL Validation Features

A key differentiator between SHACL and OWL is that SHACL is extensible while OWL is limited to exactly the features that have been specified by the OWL committee. See SHACL-SPARQL and SHACL-JS for details on some SHACL extension points. In particular, SPARQL includes the concept of variables which has no equivalent in OWL and therefore makes it impossible to express many real-world use cases in OWL alone. The following example defines a SHACL constraint that verifies that each value of address:country in an address class must match an identical value of geo:countryCode in a country class:

address:AddressShape
	a sh:NodeShape ;
	sh:targetClass address:Address ;
	sh:sparql [
		sh:message "Address uses undefined country code {?countryCode}." ;
		sh:prefixes address: ;
		sh:select """
			SELECT $this ?countryCode
			WHERE {
				$this address:country ?countryCode .
				FILTER NOT EXISTS {
					?country geo:countryCode ?countryCode .
				}
			}
			""" ;
	] .

Furthermore, SHACL includes a very rich results vocabulary in which the results of the validation process are returned. For example, it includes both human-readable and machine-readable pointers to the specific data that violates SHACL constraints and the specific constraint type that was violated (e.g. sh:minCount), and it distinguishes errors from warnings and informational results. Even if OWL processors returned constraint violations, they would not have a standard way of expressing them.
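For illustration, a validation report for a node with too many ex:hasFather values might look roughly like the sketch below. The focus node, the named property shape ex:PersonShape-hasFather and the ex: namespace URI are assumptions, and the wording of sh:resultMessage varies between implementations.

```turtle
@prefix ex: <http://example.org/ns#> .   # hypothetical namespace
@prefix sh: <http://www.w3.org/ns/shacl#> .

[]
	a sh:ValidationReport ;
	sh:conforms false ;
	sh:result [
		a sh:ValidationResult ;
		sh:resultSeverity sh:Violation ;       # as opposed to sh:Warning or sh:Info
		sh:focusNode ex:John ;                 # the node that failed validation
		sh:resultPath ex:hasFather ;           # the property that was checked
		sh:sourceConstraintComponent sh:MaxCountConstraintComponent ;
		sh:sourceShape ex:PersonShape-hasFather ;
		sh:resultMessage "More than 1 value" ; # illustrative text only
	] .
```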

Individual SHACL shapes can be switched off using sh:deactivated, which greatly improves the reusability of other people's data shapes. In OWL, on the other hand, you either need to use the whole file or create a clone where you remove the axioms that you don't want to apply to your application.
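Assuming a third-party shapes graph declares a property shape with the hypothetical URI ex:PersonShape-hasFather, switching it off takes a single triple in your own graph:

```turtle
@prefix ex: <http://example.org/ns#> .   # hypothetical namespace
@prefix sh: <http://www.w3.org/ns/shacl#> .

# Stated in a local graph that is merged with the imported shapes graph;
# the imported file itself stays untouched.
ex:PersonShape-hasFather sh:deactivated true .
```

This only works when the shape in question has a URI, since a blank node cannot be referenced from another file.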

Not specific to validation, SHACL also includes standard properties to represent annotations that may be used to drive user interfaces, in particular forms, including sh:group, sh:order and sh:defaultValue. OWL also includes annotation properties, which can be used by SHACL files. However, OWL files cannot reuse the SHACL properties above because they need to be used in conjunction with SHACL property shapes. So if you plan to use your data model to drive user interfaces, SHACL arguably offers more built-in capabilities as a starting point.
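A sketch of how these annotations might be combined on a form-oriented shape is shown below. The group ex:FamilyGroup, the property ex:nationality and the ex: namespace URI are assumptions for illustration. How a tool renders these hints is up to the tool.

```turtle
@prefix ex:   <http://example.org/ns#> .   # hypothetical namespace
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .

# A form section; sh:order positions it relative to other groups.
ex:FamilyGroup
	a sh:PropertyGroup ;
	rdfs:label "Family" ;
	sh:order 1 .

ex:PersonShape
	a sh:NodeShape ;
	sh:targetClass ex:Person ;
	sh:property [
		sh:path ex:hasFather ;
		sh:name "father" ;
		sh:group ex:FamilyGroup ;
		sh:order 0 ;                    # first field within the group
	] ;
	sh:property [
		sh:path ex:nationality ;
		sh:name "nationality" ;
		sh:group ex:FamilyGroup ;
		sh:order 1 ;
		sh:defaultValue "unknown" ;     # pre-filled value on an input form
	] .
```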

Inferencing

OWL was designed to support inferencing, but it is only practically applicable for certain kinds of inferencing. As a result, users have been supplementing OWL with SWRL rules, SPIN rules or simply with a set of SPARQL CONSTRUCT queries. Although SHACL was originally designed to focus on data validation, it also includes support for rule-based inferencing. Note that the SHACL features described in this section are not part of the SHACL W3C Recommendation but have been published by the W3C working group in the WG Note SHACL Advanced Features.

The following example is taken from the Pizza Ontology, a toy ontology that was custom built to teach and promote OWL and its Description Logic based inferencing capabilities. It defines an OWL class pizza:InterestingPizza that consists of all pizzas that have at least three values for pizza:hasTopping. Any instance of pizza:Pizza that fulfills this condition would be classified by an OWL DL engine to have rdf:type pizza:InterestingPizza.

pizza:InterestingPizza
	a owl:Class ;
	rdfs:subClassOf owl:Thing ;
	owl:equivalentClass [
		a owl:Class ;
		owl:intersectionOf (
			pizza:Pizza
			[
				a owl:Restriction ;
				owl:minCardinality "3"^^xsd:nonNegativeInteger ;
				owl:onProperty pizza:hasTopping ;
			]
		) ;
	] .

Here is a representation of a comparable inferencing rule using SHACL:

pizza:InterestingPizzaRuleShape
	a sh:NodeShape ;
	sh:targetClass pizza:Pizza ;
	sh:rule [
		a sh:TripleRule ;
		sh:condition [
			sh:path pizza:hasTopping ;
			sh:minCount 3 ;
		] ;
		sh:subject sh:this ;
		sh:predicate rdf:type ;
		sh:object pizza:InterestingPizza ;
	] .

The inference is expressed using a SHACL shape that targets all instances of pizza:Pizza. The shape has a sh:rule of type sh:TripleRule, which is applied to any of the pizzas that conform to the shape represented by sh:condition. This is where any pizzas with fewer than three values are filtered out. Those pizza instances that fulfill the condition become the subject of an inferred triple that has rdf:type as its predicate and pizza:InterestingPizza as its object.

There are notable differences between the types of inferences supported by both languages. OWL is centered around classification problems, i.e. it can be used to find subclass relationships (rdfs:subClassOf triples), instance-of relationships (rdf:type) and find equivalent (owl:sameAs) or different (owl:differentFrom) individuals. However, this is basically all that OWL inferencing is capable of. SHACL can be used to infer arbitrary triples, including the above use cases. So as an alternative to inferring that a resource is an instance of pizza:InterestingPizza it may set a flag pizza:isInteresting to true, or it may even compute the sum of the calories of a pizza by adding up values derived from its individual ingredients. This is shown, using a SPARQL-based SHACL rule, in the next snippet:

pizza:PizzaCaloriesRuleShape
	a sh:NodeShape ;
	sh:targetClass pizza:Pizza ;
	sh:rule [
		a sh:SPARQLRule ;
		sh:construct """
			CONSTRUCT {
				$this pizza:calories ?sum .
			}
			WHERE {
				{
					SELECT (SUM(?calories) AS ?sum) $this
					WHERE {
						$this pizza:hasIngredient ?ingredient .
						?ingredient pizza:calories ?calories .
					} GROUP BY $this
				}
			}""" ;
		sh:prefixes <http://topbraid.org/examples/pizza.shapes> ;
	] .
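The simpler alternative mentioned above, setting a boolean flag instead of inferring a type, could be sketched as a triple rule with a constant object. The shape name and the pizza: namespace URI used here are assumptions for illustration.

```turtle
@prefix pizza: <http://example.org/pizza#> .   # hypothetical namespace
@prefix sh:    <http://www.w3.org/ns/shacl#> .

pizza:InterestingFlagRuleShape
	a sh:NodeShape ;
	sh:targetClass pizza:Pizza ;
	sh:rule [
		a sh:TripleRule ;
		# Only pizzas with at least three toppings pass the condition.
		sh:condition [
			sh:path pizza:hasTopping ;
			sh:minCount 3 ;
		] ;
		sh:subject sh:this ;
		sh:predicate pizza:isInteresting ;
		sh:object true ;   # constant literal instead of a class
	] .
```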

When to use OWL, SHACL or both

As of 2017, OWL has been around for much longer than SHACL. This means that there are more tools and software packages around, and more books and academic papers have been written about it. SHACL has some catching up to do. There are modelers who have learned OWL, are comfortable using it and either do not have issues with its limitations or have developed their own custom approaches to dealing with limitations and/or glossing over certain aspects of OWL semantics. They may be reluctant to move to SHACL.

However, we believe it is fair to say that during the 13 years since it became a standard, OWL has not achieved a critical mass of adoption outside of academia. We believe the main reason for this lies in the design principles of OWL. Despite large amounts of educational material, newcomers to OWL continue to run into the same stumbling blocks as in 2004:

Confusion about the meaning of restrictions - in particular that OWL does not constrain anything but rather describes inferences
The built-in nature of the Open World Assumption and the absence of a Unique Name Assumption - both contradict established approaches from schema languages and make the meaning of certain statements (e.g., cardinality) different from what most modelers expect

SHACL comes with much more traditional semantics and offers a flexible approach to data modeling on graph structures.

Many ontology projects are currently using OWL, primarily because there was no alternative around. For practical matters, ontology designers have simply ignored the official semantics of OWL and, for example, treated an owl:maxCardinality 1 to mean that only one value is permitted. With SHACL now an official W3C standard, new projects may select a different approach. With SHACL's inferencing features, for most projects there is really no strong reason to stay with OWL.

Having said this, a number of useful OWL ontologies have been built and people may need to leverage them or interact with them. One option is a port to SHACL. Another option is co-existence of OWL and SHACL models. We now examine some techniques on how to deal with this situation.

The Power of RDF: URIs and Graphs

Thankfully, the shared base technology of both OWL and SHACL is RDF. RDF is built on top of URIs - global unique identifiers of resources. Anyone is allowed to add statements about such URIs. Some of these statements will be interpreted by OWL engines, others will only carry meaning for SHACL processors. Humans can read and understand both.

RDF is also built on the concept of graphs, which are typically represented by individual files or databases. RDF graphs are sets of triples. Such sets can be merged into larger sets, forming union graphs. Tools can then operate on RDF triples merged from multiple sources. This is an important capability of the RDF world, and a differentiator from comparable technologies such as XML.

Based on URIs and graphs, it is perfectly fine for any URI resource to carry different triples for different purposes. If you define a class with the URI ex:Customer then you may attach both OWL axioms and SHACL constraints to the same class. In order to keep things tidy and organized, it is advisable to create multiple graphs in multiple files, as illustrated and explained as follows:

# File: customers.ttl
# Base Vocabulary (RDF Schema): just the classes and properties
# with labels and sub-class-of relationships

ex:Customer
	a rdfs:Class ;
	rdfs:label "Customer"@en ;
	rdfs:label "Kunde"@de ;
	rdfs:comment "A person who buys goods or services from a shop or business."@en ;
	rdfs:subClassOf schema:Person .

ex:moneySpent
	a rdf:Property ;
	rdfs:label "money spent ($US)"@en ;
	rdfs:label "verschwendetes Geld"@de ;
	rdfs:comment "The amount in $US that the customer has spent so far."@en ;
	rdfs:domain ex:Customer ;    # Optional
	rdfs:range xsd:float .       # Optional

The OWL axioms for this vocabulary can be placed in another file:

# File: customers.owl.ttl
# OWL axioms

<http://example.org/customers.owl>
	a owl:Ontology ;
	owl:imports <http://example.org/customers> .

ex:Customer
	a owl:Class ;
	rdfs:subClassOf [
		a owl:Restriction ;
		owl:onProperty ex:moneySpent ;
		owl:maxCardinality 1 ;
	] .

ex:moneySpent
	a owl:DatatypeProperty .

The SHACL shapes for the vocabulary would be placed in another file too:

# File: customers.shapes.ttl
# SHACL shapes

<http://example.org/customers.shapes>
	a owl:Ontology ;
	owl:imports <http://example.org/customers> .

ex:Customer
	a sh:NodeShape ;
	sh:property [
		sh:path ex:moneySpent ;
		sh:datatype xsd:float ;
		sh:maxCount 1 ;
	] .

As a result of this design, applications can dynamically decide which features of the data model they need for their specific task, by merging the sub-graphs that they require. A SHACL validator does not need the OWL axioms and vice versa.

While the example above expresses the same cardinality restriction in two ways, which may be redundant and extra work, in many practical scenarios the OWL axioms will cover different things than the SHACL constraints. As long as they use the same URIs for the same classes, instance data can be used in both contexts.

Note that there is - from a SHACL perspective - nothing wrong with putting all these triples into a single large file. However, this has two downsides:

Some OWL tools would be "upset" if they encounter SHACL triples in class definitions, sometimes even refusing to perform inferences.
Storing SHACL constraints in the same file as the base vocabulary may make it harder for people to reuse the vocabulary if they disagree with the specific constraints. (Property shapes should have URIs so that people can mark them as sh:deactivated true, yet it is often more courteous to keep concerns separated.)

For these reasons, we recommend the use of individual graphs when RDF/OWL/SHACL files are published. Note that SHACL includes two properties, sh:shapesGraph and sh:suggestedShapesGraph, that can be used to link from an RDF Schema or instances file to shapes graphs. The generic rdfs:seeAlso property can serve as a looser link.
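As a sketch, an instances file could point to the shapes graph from the customers example as follows. The graph URI http://example.org/customers-data is an assumption for illustration.

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix sh:  <http://www.w3.org/ns/shacl#> .

# File: customers-data.ttl (hypothetical instances file)
<http://example.org/customers-data>
	a owl:Ontology ;
	owl:imports <http://example.org/customers> ;
	# Tells a SHACL processor which shapes to validate this data against.
	sh:shapesGraph <http://example.org/customers.shapes> .
```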

SHACL in an Open World

In some use cases, SHACL's closed-world assumption does not work well. Your data may have links to external RDF resources that are not part of the graph that is being validated. For example, your sales database may reference a customer by a URI via ex:customer ex:JohnDoe but there is no triple stating that ex:JohnDoe is actually an instance of ex:Customer. In such cases, using a closed-world sh:class ex:Customer constraint would be a poor choice. Such cases can typically be covered with rdfs:range statements, which provide tools with a hint of the nature of values without being overly strict. A SHACL shape may limit itself to stating that the values must be URIs, using sh:nodeKind sh:IRI.
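A relaxed shape along these lines might look like the following sketch. The class ex:Sale, the shape name and the ex: namespace URI are assumptions for illustration.

```turtle
@prefix ex: <http://example.org/ns#> .   # hypothetical namespace
@prefix sh: <http://www.w3.org/ns/shacl#> .

ex:SaleShape
	a sh:NodeShape ;
	sh:targetClass ex:Sale ;
	sh:property [
		sh:path ex:customer ;
		sh:minCount 1 ;
		# Any IRI is accepted; no rdf:type triple for the customer is required,
		# so references to externally defined customers still validate.
		sh:nodeKind sh:IRI ;
	] .
```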

Combining OWL Inferencing and SHACL

It is perfectly doable to combine OWL and SHACL with their individual strengths. An OWL inferencing engine basically takes an OWL ontology (plus instances) as input and produces new triples. These new, inferred, triples can be added to the graph which is then sent to the SHACL validation engine. Many implementations of OWL and RDF Schema can even perform these inferences on-the-fly by activating a switch.

It is also possible to use OWL and SHACL inferencing rules together, for example to express transformations that go beyond the coverage of OWL. Since both an OWL and a SHACL inference engine will only ever add new triples, the output of each can be used as input to the other, in a loop that runs until a fixpoint has been reached in which no further triples are added.

Holger Knublauch, last updated 2017-08-17