Definition of the CIDOC Conceptual Reference Model
This document is the formal definition of the CIDOC Conceptual Reference Model (“CRM”), a formal ontology intended to facilitate the integration, mediation and interchange of heterogeneous cultural heritage information. The CRM is the culmination of more than a decade of standards development work by the International Committee for Documentation (CIDOC) of the International Council of Museums (ICOM). Work on the CRM itself began in 1996 under the auspices of the ICOM-CIDOC Documentation Standards Working Group. Since 2000, development of the CRM has been officially delegated by ICOM-CIDOC to the CIDOC CRM Special Interest Group, which collaborates with the ISO working group ISO/TC46/SC4/WG9 to bring the CRM to the form and status of an International Standard.
Objectives of the CIDOC CRM
The primary role of the CRM is to enable information exchange and integration between heterogeneous sources of cultural heritage information. It aims at providing the semantic definitions and clarifications needed to transform disparate, localised information sources into a coherent global resource, be it within a larger institution, in intranets or on the Internet.
Its perspective is supra-institutional and abstracted from any specific local context. This goal determines the constructs and level of detail of the CRM.
More specifically, it defines and is restricted to the underlying semantics of database schemata and document structures used in cultural heritage and museum documentation in terms of a formal ontology. It does not define any of the terminology appearing typically as data in the respective data structures; however it foresees the characteristic relationships for its use. It does not aim at proposing what cultural institutions should document. Rather it explains the logic of what they actually currently document, and thereby enables semantic interoperability.
It intends to provide an optimal analysis of the intellectual structure of cultural documentation in logical terms. As such, it is not optimised for implementation-specific storage and processing aspects. Rather, it provides the means to understand the effects of such optimisations on the semantic accessibility of the respective contents.
The CRM aims to support the following specific functionalities:
·Inform developers of information systems as a guide to good practice in conceptual modelling, in order to effectively structure and relate information assets of cultural documentation.
·Serve as a common language for domain experts and IT developers to formulate requirements and to agree on system functionalities with respect to the correct handling of cultural contents.
·Serve as a formal language for the identification of common information contents in different data formats; in particular, support the implementation of automatic data transformation algorithms from local to global data structures without loss of meaning. Such transformations are useful for data exchange, data migration from legacy systems, information integration and mediation of heterogeneous sources.
·Support associative queries against integrated resources by providing a global model of the basic classes and their associations to formulate such queries.
·It is further believed that advanced natural language algorithms and case-specific heuristics can take significant advantage of the CRM to resolve free text information into a formal logical form, if that is regarded as beneficial. The CRM is, however, not thought to be a means to replace scholarly text, rich in meaning, by logical forms, but only a means to identify related data.
Users of the CRM should be aware that the definition of data entry systems requires support of community-specific terminology, guidance on what should be documented and in which sequence, and application-specific consistency controls. The CRM does not provide such notions.
By its very structure and formalism, the CRM is extensible and users are encouraged to create extensions for the needs of more specialized communities and applications.
Scope of the CIDOC CRM
The overall scope of the CIDOC CRM can be summarised in simple terms as the curated knowledge of museums.
However, a more detailed and useful definition can be articulated by defining both the Intended Scope, a broad and maximally-inclusive definition of general application principles, and the Practical Scope, which is expressed by the overall scope of a reference set of specific identifiable museum documentation standards and practices that the CRM aims to encompass, however restricted in its details to the limitations of the Intended Scope.
The Intended Scope of the CRM may be defined as all information required for the exchange and integration of heterogeneous scientific documentation of museum collections. This definition requires further elaboration:
·The term “scientific documentation” is intended to convey the requirement that the depth and quality of descriptive information that can be handled by the CRM should be sufficient for serious academic research. This does not mean that information intended for presentation to members of the general public is excluded, but rather that the CRM is intended to provide the level of detail and precision expected and required by museum professionals and researchers in the field.
·The term “museum collections” is intended to cover all types of material collected and displayed by museums and related institutions, as defined by ICOM. This includes collections, sites and monuments relating to fields such as social history, ethnography, archaeology, fine and applied arts, natural history, history of sciences and technology.
·The documentation of collections includes the detailed description of individual items within collections, groups of items and collections as a whole. The CRM is specifically intended to cover contextual information: the historical, geographical and theoretical background that gives museum collections much of their cultural significance and value.
·The exchange of relevant information with libraries and archives, and the harmonisation of the CRM with their models, falls within the Intended Scope of the CRM.
·Information required solely for the administration and management of cultural institutions, such as information relating to personnel, accounting, and visitor statistics, falls outside the Intended Scope of the CRM.
The Practical Scope of the CRM is expressed in terms of the current reference standards for museum documentation that have been used to guide and validate the CRM’s development. The CRM covers the same domain of discourse as the union of these reference standards; this means that for data correctly encoded according to these museum documentation standards there can be a CRM-compatible expression that conveys the same meaning.
Compatibility with the CRM
Utility of CRM compatibility
The goal of the CRM is to enable the integration of the largest number of information resources. Therefore it aims to provide the greatest flexibility of systems to become compatible, rather than imposing one particular solution.
Users intending to take advantage of the semantic interoperability offered by the CRM may want to make parts of their data structures compatible with the CRM. Compatibility may pertain either to the associations by which users would like their data to be accessible in an integrated environment, or to the contents intended for transport to other environments, allowing encoded meaning to be preserved in a target system.
The CRM does not require complete matching of all user documentation structures with the CRM, nor that systems should always implement all CRM concepts and associations; instead it leaves room both for extensions, needed to capture the full richness of cultural information, and for simplifications, required for reasons of economy.
Furthermore, the CRM provides a means of interpreting structured information so that large amounts of data can be transformed or mediated automatically. It does not require unstructured or semi-structured free text information to be analysed into a formal logical representation. In other words, it does not aim to provide more structure than users have previously provided. The interpretation of information in the form of free text falls outside the scope of compatibility considerations. The CRM does, however, allow free text information to be integrated with structured information.
The Information Integration Environment
The notion of CRM compatibility is based on interoperability. Interoperability is best defined on the basis of specific communication practices between information systems. Following current practice, we distinguish the following types of information integration environments pertaining to information systems:
1.Local information systems. These are either collection management systems or content management systems that constitute institutional memories and are maintained by an institution. They are used for primary data entry, i.e. a relevant part of the information, be it data or metadata, is primary information in digital form that fulfils institutional needs.
2.Integrated access systems. These provide a homogeneous access layer to multiple local systems. The information they manage resides primarily on local systems. We distinguish between:
a.Materialized access systems, which physically import data provided by local systems, using a data warehouse approach. Such systems may employ so-called metadata harvesting techniques or rely on data submission. Data may be transformed to respect the schema of the access system before being merged.
b.Mediation systems [Gio Wiederholt], which send out queries, formulated according to a virtual global schema, to multiple local systems and then collect and integrate the answers. The queries may be transformed to a local schema either by the mediation system or by the receiving local system itself.
Local systems may also import data from other systems, in order to complement collections, or to merge information from other systems. An information system may export information for migration and preservation.
Compatibility with the CRM pertains to one or more of the following data communication capabilities or use cases:
1.data falling within the scope of the CRM can be exported from an information system into an encoded form without loss of meaning with respect to CRM concepts;
2.data falling within the scope of the CRM can be transformed into another encoded form without loss of meaning with respect to CRM concepts;
3.data falling within the scope of the CRM can be imported from an encoded form into an information system without loss of meaning with respect to CRM concepts;
4.data falling within the scope of the CRM that is contained in an information system can be queried and retrieved exhaustively in terms of CRM concepts, subject to the expressive power of a particular query language.
Any declaration of CRM compatibility must specify one or more of the above use cases. System and data structure providers shall not declare their products as “CRM compatible” without specifying the appropriate use cases as detailed below.
In the context of this chapter, the expression “without loss of meaning with respect to the CRM concepts” means the following: The CRM concepts are used to classify items of discourse and their relationships. By virtue of this classification, data can be understood as propositions of a kind declared by the CRM about real world facts, such as “Object x. forms part of: Object y”. In case the encoding, i.e. the language used to describe a fact, is changed, only an expert conversant with both languages can assess if the two propositions do indeed describe the same fact. If this is the case, then there is no loss of meaning with respect to CRM concepts. Communities of practice requiring fewer concepts than the CRM declares may restrict CRM compatibility with respect to an explicitly declared subset of the CRM.
Users of this standard may communicate CRM compatible data, as detailed below, with data structures and systems that are either more detailed and specialized than the CRM or whose scope extends beyond that of the CRM. In such cases, the standard guarantees only the preservation of meaning with respect to CRM concepts. However, additional information that can be regarded as extending CRM concepts may be communicated and preserved in CRM compatible systems through the appropriate use of controlled terminology. The specification of the latter techniques does not fall under the scope of this standard. Communities of practice requiring extensions to the CRM are encouraged to declare their extensions as CRM-compatible standards.
The CRM is a formal ontology which can be expressed in terms of logic or a suitable knowledge representation language. Its concepts can be instantiated as sets of statements that provide a model of reality. We call any encoding of such CRM instances in a formal language that preserves the relations between the CRM classes, properties and inheritance rules a “CRM-compatible form”. Hence data expressed in any CRM-compatible form can be automatically transformed into any other CRM-compatible form without loss of meaning. Classes and properties of the CRM are identified by their initial codes, such as “E55” or “P12”. The names of classes and properties of a CRM-compatible form may be translated into any local language, but the identifying codes must be preserved. A CRM-compatible form should not implement the quantifiers of CRM properties as cardinality constraints for the encoded instances. Quantifiers may be implemented in an informative way, or not at all. Statements that violate quantifiers should be treated as alternative knowledge.
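The rules in the preceding paragraph can be illustrated with a minimal sketch. It assumes a hypothetical in-memory encoding: the `Statement` tuple, the `LABELS` table, the German label and the choice of P4 as an "at most one value" property are illustrative assumptions, not part of the CRM definition. The sketch keeps the identifying codes fixed while names are localised, and it reports quantifier violations as alternative knowledge instead of rejecting them.

```python
from collections import defaultdict
from typing import NamedTuple


class Statement(NamedTuple):
    subject: str        # local identifier of the domain instance
    property_code: str  # CRM identifying code; must survive any re-encoding
    obj: str            # local identifier of the range instance


# Names may be localised, the codes may not. The German label is illustrative.
LABELS = {
    "en": {"P2": "has type"},
    "de": {"P2": "hat den Typus"},
}

# Assumption for this sketch: P4 is treated as "at most one value per subject".
AT_MOST_ONE = {"P4"}


def display(code: str, lang: str) -> str:
    """Render a code with its localised name; the code itself never changes."""
    return f"{code} {LABELS[lang][code]}"


def alternatives(statements: list[Statement]) -> dict[tuple[str, str], list[str]]:
    """Report quantifier violations without rejecting any statement."""
    values = defaultdict(list)
    for s in statements:
        if s.property_code in AT_MOST_ONE:
            values[(s.subject, s.property_code)].append(s.obj)
    return {key: objs for key, objs in values.items() if len(objs) > 1}


data = [
    Statement("event-1", "P4", "timespan-a"),
    Statement("event-1", "P4", "timespan-b"),  # competing claim, kept as-is
    Statement("object-7", "P2", "type-vase"),
]
conflicts = alternatives(data)
```

Here `conflicts` lists the two competing time-spans for `event-1`, while `data` still holds all three statements: the quantifier is used in an informative way only.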
Any encoding of CRM instances in a formal language that preserves the relations within a consistent subset of CRM classes, properties and inheritance rules is regarded as a “reduced CRM-compatible form” if:
·all the conditions applicable to a CRM compatible form are respected;
·the subset does not violate the rules of subsumption and inheritance;
·any instance of the reduced CRM-compatible form is also a valid instance of a (full) CRM-compatible form;
·the subset contains at least the following concepts:
Beginning of Existence
End of Existence
Physical Man-Made Thing
Physical Man-Made Thing
Entity – Domain
Entity - Range
is identified by (identifies)
E1 CRM Entity
has type (is type of)
E1 CRM Entity
E1 CRM Entity
has time-span (is time-span of)
E2 Temporal Entity
took place at (witnessed)
falls within (contains)
occurred in the presence of (was present at)
E77 Persistent Item
- had participant (participated in)
- - carried out by (performed)
- used specific object (was used for)
- has modified (was modified by)
E24 Physical Man-Made Thing
- - has produced (was produced by)
E24 Physical Man-Made Thing
- brought into existence (was brought into existence by)
E63 Beginning of Existence
E77 Persistent Item
- - has produced (was produced by)
E24 Physical Man-Made Thing
- - has created (was created by)
E28 Conceptual Object
- took out of existence (was taken out of existence by)
E64 End of Existence
E77 Persistent Item
was influenced by (influenced)
E1 CRM Entity
- used specific object (was used for)
had specific purpose (was purpose of)
has dimension (is dimension of)
is composed of (forms part of)
E18 Physical Thing
E18 Physical Thing
has section (is located on or within)
E18 Physical Thing
refers to (is referred to by)
E89 Propositional Object
E1 CRM Entity
possesses (is possessed by)
E61 Time Primitive
at some time within
E61 Time Primitive
falls within (contains)
is subject to (applies to)
E72 Legal Object
is composed of (forms part of)
E90 Symbolic Object
E90 Symbolic Object
has current or former member (is current or former member of)
has broader term (has narrower term)
carries (is carried by)
E24 Physical Man-Made Thing
E73 Information Object
shows features of (features are also found on)
assigned attribute to (was attributed by)
E13 Attribute Assignment
E1 CRM Entity
assigned (was assigned by)
E13 Attribute Assignment
E1 CRM Entity
has component (is component of)
E89 Propositional Object
E89 Propositional Object
CRM Compatibility of Data Structure
A data structure is export-compatible with the CRM if it is possible to transform any data from this data structure into a CRM-compatible form without loss of meaning. Implicit concepts may be present in elements of the data structure that are not supported by the CRM. As long as these concepts can be encoded as instances of E55 Type (i.e. as terminology) and attached unambiguously to their respective data items with suitable properties, the data structure is still regarded as export compatible.
Note that not all CRM concepts may be represented by elements of an export-compatible data structure. All data from export-compatible data structures can be transported in a CRM-compatible form. In particular any CRM compatible form or reduced CRM-compatible form is export-compatible with the CRM.
A data structure is import-compatible with the CRM if it is possible to automatically transform any data from a CRM-compatible form into this data structure without loss of meaning, simply on the basis of knowledge about the data structure elements being used. This implies that a data record transformed into this data structure from a CRM-compatible form can be transformed back into the CRM-compatible form without loss of meaning. Note that the back-transformation into a CRM-compatible form may result in a data record that is semantically equivalent but not identical with the original.
Any CRM-compatible form is automatically import-compatible with the CRM. Note that an import-compatible data structure may be semantically richer than the CRM. It may contain elements that, through the use of a transformation algorithm, can be made to correspond to CRM concepts or specializations thereof, or elements with meanings that fall outside the scope of the CRM. However, it must not contain elements that overlap in meaning with CRM concepts and which cannot be subsumed via transformation by a CRM concept other than E1 CRM Entity and E77 Persistent Item.
Import-compatible data structures may be used to transport data for applications that require concepts that lie beyond the scope of the CRM, as well as data from any export-compatible data structure. Note that, in general, applications may make use of data from a CRM import-compatible data structure that has been exported into a CRM compatible form by semantic reduction to CRM concepts, i.e. by generalizing all subsumed concepts to the most specific CRM concept applicable, and by discarding elements that fall outside the scope of the CRM.
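Semantic reduction as described above can be sketched as follows. The extension classes prefixed with `X:`, the `SUPERCLASS` table and the record layout are hypothetical, and the sketch assumes that each extension class has a single direct superclass, so that the first CRM class reached when walking upwards is the most specific one applicable.

```python
# Hypothetical extension classes, each mapped to its direct superclass.
SUPERCLASS = {
    "X:Amphora": "X:Vessel",
    "X:Vessel": "E22 Man-Made Object",
}

# The CRM classes known to this sketch (a tiny illustrative selection).
CRM_CLASSES = {"E22 Man-Made Object", "E21 Person"}


def reduce_class(cls):
    """Walk up the hierarchy until a CRM class is reached.

    Returns None when the class is not subsumed by any CRM class,
    i.e. when it falls outside the scope of the CRM.
    """
    seen = set()
    while cls not in CRM_CLASSES:
        if cls in seen or cls not in SUPERCLASS:
            return None
        seen.add(cls)
        cls = SUPERCLASS[cls]
    return cls


def reduce_records(records):
    """Generalize subsumed classes and discard out-of-scope records."""
    reduced = []
    for rec in records:
        crm_class = reduce_class(rec["class"])
        if crm_class is not None:
            reduced.append({**rec, "class": crm_class})
    return reduced


records = [
    {"id": "a1", "class": "X:Amphora"},     # generalized to a CRM class
    {"id": "t1", "class": "X:TicketSale"},  # administrative, out of scope
    {"id": "p1", "class": "E21 Person"},    # already a CRM class
]
reduced = reduce_records(records)
```

The reduction loses the distinction between an amphora and any other man-made object, which is the expected cost of exporting only with respect to CRM concepts.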
A data structure is partially import-compatible with the CRM if the above holds for a reduced CRM-compatible form.
CRM Compatibility of Information Systems
An information system is export-compatible with the CRM if it is possible to export all user data from this information system into an import-compatible data structure. This capability is the recommended kind of CRM-compatibility for local information systems.
An information system is partially export-compatible if it is possible to export all user data from this information system into a partially import-compatible data structure. This is not the recommended kind of CRM-compatibility, but it may not be feasible for legacy systems to acquire a higher level of CRM compatibility without unreasonable effort. This reduced level of CRM compatibility is nonetheless highly useful.
Note that there is no minimum requirement for the classes and properties that must be present in the exported user data. Therefore it is possible that the data may pertain to instances of just a single property, such as E21 Person. P131 is identified by: E82 Actor Appellation.
An information system is import-compatible with the CRM if it is possible to import data encoded in a CRM-compatible form and to access the data in a manner equivalent to and homogeneous with all generic data of this system that fall under the same concepts. This capability is considered as the normal kind of CRM compatibility for integrated access systems that physically copy source data in a data warehouse style (materialized access systems).
An information system is partially import-compatible with the CRM if it is possible to import data encoded in a reduced CRM-compatible form and to access the data in a manner equivalent to and homogeneous with all generic data of this system that fall under the same concepts. Depending on the functional requirements, it makes sense for integrated access systems to offer access services of reduced complexity by being only partially import-compatible with the CRM.
Note that it makes sense for integrated access systems to import data from extended data structures by semantic reduction to CRM defined concepts.
Note that local information system providers may choose to make their systems import-compatible with the CRM in order to exchange data, for example in the case of museum object loans or for system migration purposes. Communities of practice may choose to agree on import compatibility for extended data structures.
Some local information systems are likely to focus on specialized subject areas, such as inscriptions. For these specialized systems, the ability to import a specific data structure is recommended. This should be export-compatible with the CRM, and encompass the concepts that are required by the subject matter (“dedicated import compatibility”).
An information system is access-compatible with the CRM if it is possible to access the user data in the information system by querying with CRM classes and properties so that the meaning of the answers to the queries corresponds to the query terms used. It is not regarded as a reduction of compatibility if access is limited to data deemed to be exchanged.
An information system is partially access-compatible with the CRM if it is possible to access the user data in the information system by querying with a consistent subset of CRM classes and properties, corresponding to a reduced CRM-compatible form, so that the meaning of the answers to the queries corresponds to the query terms used.
An access-compatible system may be export-compatible with respect to the query answers. Note that it may make sense for an access-compatible content management system to return only content items in response to queries rather than being export compatible.
Fig. 1: Possible data flow between different kinds of CRM-compatible systems and data structures
Fig. 1 shows a symbolic representation of some of the data flow patterns defined above between different kinds of CRM-compatible systems and data structures. In this figure it is assumed that the Local System B exports data into a CRM export-compatible data structure, which implies that it can be exported into a CRM-compatible form or any other CRM import-compatible data structure. Therefore Local System B is export-compatible with the CRM. For Local System A, the figure symbolizes the case where the exported data contain elements that correspond to specializations of the CRM or fall out of its scope.
Compatibility claim declaration
A provider of a data structure or information system claiming compatibility with the CRM has to provide a declaration that describes the kind of compatibility and, depending on the kind, the following additional information:
·For export-compatible data structures:
The subset of CRM concepts directly instantiated by any possible data in this data structure after transformation into a CRM-compatible form.
·For export-compatible systems:
a.A declaration of configurable user data elements, if any, that are not semantically restricted to a CRM Concept (other than E1 CRM Entity or E77 Persistent Item).
b.User data elements or units that are not exported.
c.The subset of CRM concepts directly instantiated by any possible data exported from the system after transformation into a CRM-compatible form.
·For partially or dedicated import-compatible systems:
The subset of CRM concepts under which data can be imported into the system.
·For access-compatible systems:
a.The query language by which the system can be queried.
b.The subset of CRM concepts directly instantiated by any possible query answers exported from the system after transformation into a CRM-compatible form.
c.For partially access-compatible systems, the subset of CRM concepts by which the system can be queried.
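The list above can be read as a checklist of items that a declaration has to mention for each kind of compatibility. The following sketch shows one possible way to check a declaration for completeness. All field names, the `REQUIRED_ITEMS` table and the example query language are hypothetical; the CRM does not prescribe a machine-readable declaration format.

```python
# Hypothetical field names; the required items follow the list above.
REQUIRED_ITEMS = {
    "export-compatible data structure": {"instantiated_concepts"},
    "export-compatible system": {
        "unrestricted_elements",   # may be an empty list ("if any")
        "non_exported_elements",
        "instantiated_concepts",
    },
    "partially import-compatible system": {"importable_concepts"},
    "dedicated import-compatible system": {"importable_concepts"},
    "access-compatible system": {"query_language", "instantiated_concepts"},
    "partially access-compatible system": {
        "query_language",
        "instantiated_concepts",
        "queryable_concepts",
    },
}


def missing_items(kind: str, declaration: dict) -> set[str]:
    """Return the required items that the declaration does not mention."""
    return REQUIRED_ITEMS[kind] - set(declaration)


# An illustrative claim; the query language named here is only an example.
claim = {
    "query_language": "SPARQL",
    "instantiated_concepts": ["E21", "E55", "P2"],
}
complete_for_full_access = missing_items("access-compatible system", claim)
gaps_for_partial_access = missing_items("partially access-compatible system", claim)
```

The check only tests that an item is present, not that it is non-empty, because some items (such as the configurable user data elements) are declared only "if any" exist.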
The provider should be able to demonstrate the claim with suitable test data, according to the procedures included in any applicable certificate practice related statement.
The provider should either make evidence of these procedures publicly available on the Internet on a site nominated by the ISO community of use, so that any third party is able to verify the claim with suitable test data, or acquire a certificate by a certification authority (CA).
A trusted third party, recognised and authorised by a competent regulatory authority to act as a CA in this practice area, should be able to verify the credentials of the provider applying for such a certificate, and thus its claim, with suitable test data before issuing the certificate, so that users can trust the information in the CA certificates.
The CA will grant the provider of the certified system the right to use the “CRM compatible” logo.
The CRM is an ontology in the sense used in computer science. It has been expressed as an object-oriented semantic model, in the hope that this formulation will be comprehensible to both documentation experts and information scientists alike, while at the same time being readily converted to machine-readable formats such as RDF Schema, KIF, DAML+OIL, OWL, STEP, etc. It can be implemented in any relational or object-oriented schema. CRM instances can also be encoded in RDF, XML, DAML+OIL, OWL and others.
Although the definition of the CRM provided here is complete, it is an intentionally compact and concise presentation of the CRM’s 86 classes and 137 unique properties. It does not attempt to articulate the inheritance of properties by subclasses throughout the class hierarchy (this would require the declaration of several thousand properties, as opposed to 137). However, this definition does contain all of the information necessary to infer and automatically generate a full declaration of all properties, including inherited properties.
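The inference mentioned above, from a compact declaration to a full declaration that includes inherited properties, can be sketched in a few lines. The data below is a small illustrative fragment (three classes and three properties) and the dictionary layout is an assumption of this sketch, not the notation used in this document.

```python
# Direct superclasses of each class (a small illustrative fragment).
DIRECT_SUPERCLASSES = {
    "E1 CRM Entity": [],
    "E2 Temporal Entity": ["E1 CRM Entity"],
    "E77 Persistent Item": ["E1 CRM Entity"],
}

# Compact declaration: each property is listed once, on its domain class.
DECLARED_DOMAIN = {
    "P1 is identified by": "E1 CRM Entity",
    "P2 has type": "E1 CRM Entity",
    "P4 has time-span": "E2 Temporal Entity",
}


def ancestors(cls):
    """Collect all direct and indirect superclasses of a class."""
    result = set()
    stack = list(DIRECT_SUPERCLASSES[cls])
    while stack:
        parent = stack.pop()
        if parent not in result:
            result.add(parent)
            stack.extend(DIRECT_SUPERCLASSES[parent])
    return result


def full_declaration():
    """List, for every class, its own and its inherited properties."""
    full = {}
    for cls in DIRECT_SUPERCLASSES:
        applicable = {cls} | ancestors(cls)
        full[cls] = sorted(
            prop for prop, domain in DECLARED_DOMAIN.items() if domain in applicable
        )
    return full


full = full_declaration()
```

In this fragment three declared properties expand to seven class-property pairs, which hints at why the full declaration for the whole model runs to several thousand entries.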
The following definitions of key terminology used in this document are provided both as an aid to readers unfamiliar with object-oriented modelling terminology, and to specify the precise usage, for the purpose of this document, of terms that are sometimes applied inconsistently across the object-oriented modelling community. Where applicable, the editors have tried to consistently use terminology that is compatible with that of the Resource Description Framework (RDF), a recommendation of the World Wide Web Consortium. The editors have tried to find a language which is comprehensible to the non-computer expert and precise enough for the computer expert so that both understand the intended meaning.
A class is a category of items that share one or more common traits serving as criteria to identify the items belonging to the class. These traits need not be explicitly formulated in logical terms, but may be described in a text (here called a scope note) that refers to a common conceptualisation of domain experts. The sum of these traits is called the intension of the class. A class may be the domain or range of none, one or more properties formally defined in a model. The formally defined properties need not be part of the intension of their domains or ranges: such properties are optional. An item that belongs to a class is called an instance of this class. A class is associated with an open set of real life instances, known as the extension of the class. Here “open” is used in the sense that it is generally beyond our capabilities to know all instances of a class in the world and indeed that the future may bring new instances about at any time (Open World). Therefore a class cannot be defined by enumerating its instances. A class plays a role analogous to a grammatical noun, and can be completely defined without reference to any other construct (unlike properties, which must have an unambiguously defined domain and range). In some contexts, the terms individual class, entity or node are used synonymously with class.
Person is a class. To be a Person may actually be determined by DNA characteristics, but we all know what a Person is. A Person may have the property of being a member of a Group, but it is not necessary to be member of a Group in order to be a Person. We shall never know all Persons of the past. There will be more Persons in the future.
A subclass is a class that is a specialization of another class (its superclass). Specialization or the IsA relationship means that:
1.all instances of the subclass are also instances of its superclass,
2.the intension of the subclass extends the intension of its superclass, i.e. its traits are more restrictive than those of its superclass and
3.the subclass inherits the definition of all of the properties declared for its superclass without exceptions (strict inheritance), in addition to having none, one or more properties of its own.
A subclass can have more than one immediate superclass and consequently inherits the properties of all of its superclasses (multiple inheritance). The IsA relationship or specialization between two or more classes gives rise to a structure known as a class hierarchy. The IsA relationship is transitive and may not be cyclic. In some contexts (e.g. the programming language C++) the term derived class is used synonymously with subclass.
Every Person IsA Biological Object, or Person is a subclass of Biological Object.
Also, every Person IsA Actor. A Person may die. However other kinds of Actors, such as companies, don’t die (cf. 2).
Every Biological Object IsA Physical Object. A Physical Object can be moved. Hence a Person can also be moved (cf. 3).
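The IsA examples above can be checked mechanically. The following sketch encodes them as a mapping from each class to its direct superclasses and derives the transitive closure, raising an error if the hierarchy is cyclic. The function names and the dictionary layout are assumptions of this sketch.

```python
# Direct IsA links taken from the examples above (multiple inheritance
# for Person: it is both a Biological Object and an Actor).
ISA = {
    "Person": {"Biological Object", "Actor"},
    "Biological Object": {"Physical Object"},
    "Actor": set(),
    "Physical Object": set(),
}


def superclasses(cls, isa, _path=()):
    """Return all direct and indirect superclasses of cls.

    Raises ValueError if following IsA links leads back to a class that
    is already on the current path, since IsA may not be cyclic.
    """
    if cls in _path:
        raise ValueError(f"cyclic IsA relationship involving {cls!r}")
    result = set()
    for parent in isa.get(cls, ()):
        result.add(parent)
        result |= superclasses(parent, isa, _path + (cls,))
    return result


def is_a(sub, sup, isa=ISA):
    """True if every instance of sub is also an instance of sup."""
    return sub == sup or sup in superclasses(sub, isa)


person_supers = superclasses("Person", ISA)
```

Because IsA is transitive, `is_a("Person", "Physical Object")` holds even though that link is never stated directly, which is why a Person can be moved.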
A superclass is a class that is a generalization of one or more other classes (its subclasses), which means that it subsumes all instances of its subclasses, and that it can also have additional instances that do not belong to any of its subclasses. The intension of the superclass is less restrictive than any of its subclasses. This subsumption relationship or generalization is the inverse of the IsA relationship or specialization.
In some contexts (e.g. the programming language C++) the term parent class is used synonymously with superclass.
“Biological Object subsumes Person” is synonymous with “Biological Object is a superclass of Person”. It needs fewer traits to identify an item as a Biological Object than to identify it as a Person.
The intension of a class or property is its intended meaning. It consists of one or more common traits shared by all instances of the class or property. These traits need not be explicitly formulated in logical terms, but may just be described in a text (here called a scope note) that refers to a conceptualisation common to domain experts. In particular the so-called primitive concepts, which make up most of the CRM, cannot be further reduced to other concepts by logical terms.
The extension of a class is the set of all real life instances belonging to the class that fulfil the criteria of its intension. This set is “open” in the sense that it is generally beyond our capabilities to know all instances of a class in the world and indeed that the future may bring new instances about at any time (Open World). An information system may at any point in time refer to some instances of a class, which form a subset of its extension.
A scope note is a textual description of the intension of a class or property.
Scope notes are not formal modelling constructs, but are provided to help explain the intended meaning and application of the CRM’s classes and properties. Basically, they refer to a conceptualisation common to domain experts and disambiguate between different possible interpretations. Illustrative example instances of classes and properties are also regularly provided in the scope notes for explanatory purposes.
An instance of a class is a real world item that fulfils the criteria of the intension of the class. Note that the number of instances declared for a class in an information system is typically less than the total in the real world. For example, you are an instance of Person, but you are not mentioned in all information systems describing Persons.
The painting known as “The Mona Lisa” is an instance of the class Man-Made Object.
An instance of a property is a factual relation between an instance of the domain and an instance of the range of the property that matches the criteria of the intension of the property.
“The Louvre is current owner of The Mona Lisa” is an instance of the property “is current owner of”.
A property serves to define a relationship of a specific kind between two classes. The property is characterized by an intension, which is conveyed by a scope note. A property plays a role analogous to a grammatical verb, in that it must be defined with reference to both its domain and range, which are analogous to the subject and object in grammar (unlike classes, which can be defined independently). It is arbitrary which class is selected as the domain, just as the choice between active and passive voice in grammar is arbitrary. In other words, a property can be interpreted in both directions, with two distinct but related interpretations. Properties may themselves have properties that relate to other classes (this feature is used in this model only in order to describe dynamic subtyping of properties). Properties can also be specialized in the same manner as classes, resulting in IsA relationships between subproperties and their superproperties.
In some contexts, the terms attribute, reference, link, role or slot are used synonymously with property.
“Physical Man-Made Thing depicts CRM Entity” is equivalent to “CRM Entity is depicted by Physical Man-Made Thing”.
A subproperty is a property that is a specialization of another property (its superproperty). Specialization or IsA relationship means that:
1. all instances of the subproperty are also instances of its superproperty,
2. the intension of the subproperty extends the intension of the superproperty, i.e. its traits are more restrictive than those of its superproperty,
3. the domain of the subproperty is the same as the domain of its superproperty or a subclass of that domain,
4. the range of the subproperty is the same as the range of its superproperty or a subclass of that range,
5. the subproperty inherits the definition of all of the properties declared for its superproperty without exceptions (strict inheritance), in addition to having none, one or more properties of its own.
A subproperty can have more than one immediate superproperty and consequently inherits the properties of all of its superproperties (multiple inheritance). The IsA relationship or specialization between two or more properties gives rise to the structure we call a property hierarchy. The IsA relationship is transitive and may not be cyclic.
Some object-oriented languages, such as C++, have no equivalent to the specialization of properties.
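Conditions 3 and 4 of the subproperty definition can be checked mechanically. The following Python fragment is a minimal sketch under stated assumptions: the class hierarchy is reduced to three classes with intermediate classes omitted, the names SUPERCLASSES, Property and may_specialize are hypothetical, and the property “portrays” is invented for the example and is not part of the CRM.

```python
from dataclasses import dataclass

# Illustrative class hierarchy (class -> immediate superclasses). This is an
# assumption made for the sketch; intermediate CRM classes are omitted.
SUPERCLASSES = {
    "CRM Entity": [],
    "Physical Man-Made Thing": ["CRM Entity"],
    "Person": ["CRM Entity"],
}

def is_same_or_subclass(cls, other):
    """True if cls equals other or is a direct or indirect subclass of it."""
    if cls == other:
        return True
    return any(is_same_or_subclass(parent, other) for parent in SUPERCLASSES[cls])

@dataclass(frozen=True)
class Property:
    name: str
    domain_class: str
    range_class: str

def may_specialize(sub, sup):
    """Check conditions 3 and 4: domain and range must be equal or narrower."""
    return (is_same_or_subclass(sub.domain_class, sup.domain_class)
            and is_same_or_subclass(sub.range_class, sup.range_class))

depicts = Property("depicts", "Physical Man-Made Thing", "CRM Entity")
# "portrays" is a hypothetical property invented for this example.
portrays = Property("portrays", "Physical Man-Made Thing", "Person")

print(may_specialize(portrays, depicts))  # narrower range: allowed
print(may_specialize(depicts, portrays))  # wider range: not allowed
```

The check covers only the formal conditions on domain and range. Whether the intension of the subproperty really extends that of its superproperty remains a judgement expressed in the scope notes.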
A superproperty is a property that is a generalization of one or more other properties (its subproperties), which means that it subsumes all instances of its subproperties, and that it can also have additional instances that do not belong to any of its subproperties. The intension of the superproperty is less restrictive than any of its subproperties. The subsumption relationship or generalization is the inverse of the IsA relationship or specialization.
The domain is the class for which a property is formally defined. This means that instances of the property are applicable to instances of its domain class. A property must have exactly one domain, although the domain class may always contain instances for which the property is not instantiated. The domain class is analogous to the grammatical subject of the phrase for which the property is analogous to the verb. It is arbitrary, which class is selected as the domain and which as the range, just as the choice between active and passive voice in grammar is arbitrary. Property names in the CRM are designed to be semantically meaningful and grammatically correct when read from domain to range. In addition, the inverse property name, normally given in parentheses, is also designed to be semantically meaningful and grammatically correct when read from range to domain.
The range is the class that comprises all potential values of a property. That means that instances of the property can link only to instances of its range class. A property must have exactly one range, although the range class may always contain instances that are not the value of the property. The range class is analogous to the grammatical object of a phrase for which the property is analogous to the verb. It is arbitrary, which class is selected as domain and which as range, just as the choice between active and passive voice in grammar is arbitrary. Property names in the CRM are designed to be semantically meaningful and grammatically correct when read from domain to range. In addition the inverse property name, normally given in parentheses, is also designed to be semantically meaningful and grammatically correct when read from range to domain.
Inheritance of properties from superclasses to subclasses means that if an item x is an instance of a class A, then
1. all properties that must hold for the instances of any of the superclasses of A must also hold for item x, and
2. all optional properties that may hold for the instances of any of the superclasses of A may also hold for item x.
Strict inheritance means that there are no exceptions to the inheritance of properties from superclasses to subclasses. For instance, some systems may declare that elephants are grey, and regard a white elephant as an exception. Under strict inheritance it would hold that: if all elephants were grey, then a white elephant could not be an elephant. Obviously not all elephants are grey. To be grey is not part of the intension of the concept elephant but an optional property. The CRM applies strict inheritance as a normalization principle.
Multiple inheritance means that a class A may have more than one immediate superclass. The extension of a class with multiple immediate superclasses is a subset of the intersection of all extensions of its superclasses. The intension of a class with multiple immediate superclasses extends the intensions of all its superclasses, i.e. its traits are more restrictive than any of its superclasses. If multiple inheritance is used, the resulting “class hierarchy” is a directed graph and not a tree structure. If it is represented as an indented list, there are necessarily repetitions of the same class at different positions in the list.
For example, Person is both an Actor and a Biological Object.
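The class examples used in this section can be sketched as a small directed graph. The Python fragment below is only an illustration: the names SUPERCLASSES, all_superclasses and is_a are hypothetical, and the hierarchy is limited to the four classes named in the examples.

```python
# Immediate superclasses of the classes used in the examples of this section.
# The dictionary layout and the function names are illustrative assumptions.
SUPERCLASSES = {
    "Person": ["Biological Object", "Actor"],   # multiple inheritance
    "Biological Object": ["Physical Object"],
    "Physical Object": [],
    "Actor": [],
}

def all_superclasses(cls):
    """Return every direct and indirect superclass of cls (IsA is transitive)."""
    found = set()
    stack = list(SUPERCLASSES[cls])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(SUPERCLASSES[parent])
    return found

def is_a(cls, other):
    """True if cls is the same class as other or a direct or indirect subclass."""
    return cls == other or other in all_superclasses(cls)

print(sorted(all_superclasses("Person")))
```

Because Person has two immediate superclasses, the structure is a directed graph rather than a tree, and an indented list of it would have to show Person twice.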
“The difference between enduring and perduring entities (which we shall also call endurants and perdurants) is related to their behaviour in time. Endurants are wholly present (i.e., all their proper parts are present) at any time they are present. Perdurants, on the other hand, just extend in time by accumulating different temporal parts, so that, at any time they are present, they are only partially present, in the sense that some of their proper temporal parts (e.g., their previous or future phases) may be not present. E.g., the piece of paper you are reading now is wholly present, while some temporal parts of your reading are not present any more. Philosophers say that endurants are entities that are in time, while lacking however temporal parts (so to speak, all their parts flow with them in time). Perdurants, on the other hand, are entities that happen in time, and can have temporal parts (all their parts are fixed in time).” (Gangemi et al. 2002, pp. 166-181).
A shortcut is a formally defined single property that represents a deduction or join of a data path in the CRM. The scope notes of all properties characterized as shortcuts describe in words the equivalent deduction. Shortcuts are introduced for the cases where common documentation practice refers only to the deduction rather than to the fully developed path. For example, museums often only record the dimension of an object without documenting the Measurement that observed it. The CRM allows shortcuts as cases of less detailed knowledge, while preserving in its schema the relationship to the full information.
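The relationship between a shortcut and its fully developed path can be sketched with the dimension example. The fragment below is an illustration under assumptions: all identifiers and values are invented, and the tuple layout is a simplification rather than a CRM encoding.

```python
# Fully developed paths: (object, measurement, dimension). Identifiers invented.
full_paths = [
    ("object-1", "measurement-1", "height 30 cm"),
]
# Shortcut statements recorded directly, without documenting a Measurement.
shortcuts = {("object-2", "width 12 cm")}

def derive_shortcuts(paths):
    """Every full path implies the shortcut statement (object, dimension)."""
    return {(obj, dimension) for obj, _measurement, dimension in paths}

# Both kinds of record answer the question "which dimensions are known?".
all_dimensions = shortcuts | derive_shortcuts(full_paths)

def measurement_for(obj, dimension):
    """Inverse direction: only known when the full path was documented."""
    for path_obj, measurement, path_dimension in full_paths:
        if (path_obj, path_dimension) == (obj, dimension):
            return measurement
    return None

print(sorted(all_dimensions))
```

The shortcut can always be derived from the full path, while measurement_for returns None for a statement that was recorded only as a shortcut.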
Monotonic reasoning is a term from knowledge representation. A reasoning form is monotonic if an addition to the set of propositions making up the knowledge base never determines a decrement in the set of conclusions that may be derived from the knowledge base via inference rules. In practical terms, if experts successively enter correct statements into an information system, the system should not regard any results derived from the earlier statements as invalid when a new one is entered. The CRM is designed for monotonic reasoning and so enables conflict-free merging of huge stores of knowledge.
Classes are disjoint if the intersection of their extensions is an empty set. In other words, they have no common instances in any possible world.
The term primitive as used in knowledge representation characterizes a concept that is declared and its meaning is agreed upon, but that is not defined by a logical deduction from other concepts. For example, mother may be described as a female human with child. Then mother is not a primitive concept. Event however is a primitive concept.
Most of the CRM is made up of primitive concepts.
The “Open World Assumption” is a term from knowledge base systems. It characterizes knowledge base systems that assume the information stored is incomplete relative to the universe of discourse they intend to describe. This incompleteness may be due to the inability of the maintainer to provide sufficient information or due to more fundamental problems of cognition in the system’s domain. Such problems are characteristic of cultural information systems. Our records about the past are necessarily incomplete. In addition, there may be items that cannot be clearly assigned to a given class.
In particular, absence of a certain property for an item described in the system does not mean that this item does not have this property. For example, if one item is described as Biological Object and another as Physical Object, this does not imply that the latter may not be a Biological Object as well. Therefore complements of a class with respect to a superclass cannot be concluded in general from an information system using the Open World Assumption. For example, one cannot list “all Physical Objects known to the system that are not Biological Objects in the real world”, but one may of course list “all items known to the system as Physical Objects but that are not known to the system as Biological Objects”.
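The difference between the two listings can be made concrete with a small sketch. The item identifiers and the names KNOWN_CLASSES, known_as and is_biological are assumptions introduced for illustration only.

```python
# Class assertions known to a hypothetical information system.
# Item identifiers are invented for illustration.
KNOWN_CLASSES = {
    "item-1": {"Physical Object", "Biological Object"},
    "item-2": {"Physical Object"},
}

def known_as(cls):
    """Items the system explicitly knows to be instances of cls."""
    return {item for item, classes in KNOWN_CLASSES.items() if cls in classes}

# What can be listed: items known as Physical Objects
# that are not known to the system as Biological Objects.
not_known_biological = known_as("Physical Object") - known_as("Biological Object")

def is_biological(item):
    """True if asserted, otherwise None (unknown). Absence never yields False."""
    return True if "Biological Object" in KNOWN_CLASSES[item] else None

print(not_known_biological)
```

The function is_biological never returns False: under the Open World Assumption a missing assertion is read as “unknown”, so item-2 may still turn out to be a Biological Object.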
The complement of a class A with respect to one of its superclasses B is the set of all instances of B that are not instances of A. Formally, it is the set-theoretic difference of the extension of B minus the extension of A. Compatible extensions of the CRM should not declare any class whose intension is to be the complement of one or more other classes. To do so will normally violate the desire to describe an Open World. For example, for all possible cases of human gender, male should not be declared as the complement of female or vice versa. What if someone is both or even of another kind?
Query containment is a problem from database theory: A query X contains another query Y, if for each possible population of a database the answer set to query X contains also the answer set to query Y. If query X and Y were classes, then X would be superclass of Y.
Interoperability means the capability of different information systems to communicate some of their contents. In particular, it may mean that
1. two systems can exchange information, and/or
2. multiple systems can be accessed with a single method.
Generally, syntactic interoperability is distinguished from semantic interoperability. Syntactic interoperability means that the information encoding of the involved systems and the access protocols are compatible, so that information can be processed as described above without error. However, this does not mean that each system processes the data in a manner consistent with the intended meaning. For example, one system may use a table called “Actor” and another one called “Agent”. With syntactic interoperability, data from both tables may only be retrieved as distinct, even though they may have exactly the same meaning. To overcome this situation, semantic interoperability has to be added. The CRM relies on existing syntactic interoperability and is concerned only with adding semantic interoperability.
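The “Actor” and “Agent” example can be sketched as follows. The two systems, their records and the MAPPING table are invented for illustration. The mapping stands for the kind of semantic agreement that a shared model is meant to supply and is not an actual CRM mapping.

```python
# Records from two hypothetical systems; table names follow the example above,
# the record contents are invented.
system_a = {"Actor": [{"name": "name-1"}]}
system_b = {"Agent": [{"name": "name-2"}]}

# Syntactic interoperability only: both tables can be read, but stay distinct.
syntactic_result = {**system_a, **system_b}

# Semantic interoperability: an assumed mapping states that both tables
# carry the same intended meaning.
MAPPING = {"Actor": "Actor", "Agent": "Actor"}

def merge(*systems):
    """Merge rows of all tables that map to the same concept."""
    merged = {}
    for system in systems:
        for table, rows in system.items():
            merged.setdefault(MAPPING[table], []).extend(rows)
    return merged

print(sorted(syntactic_result))   # two separate tables
print(merge(system_a, system_b))  # one merged table
```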
Semantic interoperability means the capability of different information systems to communicate information consistent with the intended meaning. In more detail, the intended meaning encompasses
1. the data structure elements involved,
2. the terminology appearing as data and
3. the identifiers used in the data for factual items such as places, people, objects etc.
Obviously communication about data structure must be resolved first. In this case consistent communication means that data can be transferred between data structure elements with the same intended meaning or that data from elements with the same intended meaning can be merged. In practice, the different levels of generalization in different systems do not allow the achievement of this ideal. Therefore semantic interoperability is regarded as achieved if elements can be found that provide a reasonably close generalization for the transfer or merge. This problem is being studied theoretically as the query containment problem. The CRM is only concerned with semantic interoperability on the level of data structure elements.
We use the term property quantifiers for the declaration of the allowed number of instances of a certain property that an instance of its range or domain may have. These declarations are ontological, i.e. they refer to the nature of the real world described and not to our current knowledge. For example, each person has exactly one father, but collected knowledge may refer to none, one or many.
The fundamental ontological distinction between universals and particulars can be informally understood by considering their relationship with instantiation: particulars are entities that have no instances in any possible world; universals are entities that do have instances. Classes and properties (corresponding to predicates in a logical language) are usually considered to be universals. (after Gangemi et al. 2002, pp. 166-181).
Quantifiers for properties are provided for the purpose of semantic clarification only, and should not be treated as implementation recommendations. The CRM has been designed to accommodate alternative opinions and incomplete information, and therefore all properties should be implemented as optional and repeatable for their domain and range (“many to many (0,n:0,n)”). Therefore the term “cardinality constraints” is avoided here, as it typically pertains to implementations.
The following table lists all possible property quantifiers occurring in this document by their notation, together with an explanation in plain words. In order to provide optimal clarity, two widely accepted notations are used redundantly in this document, a verbal and a numeric one. The verbal notation uses phrases such as “one to many”, and the numeric one, expressions such as “(0,n:0,1)”. While the terms “one”, “many” and “necessary” are quite intuitive, the term “dependent” denotes a situation where a range instance cannot exist without an instance of the respective property. In other words, the property is “necessary” for its range.
many to many (0,n:0,n)
Unconstrained: An individual domain instance and range instance of this property can have zero, one or more instances of this property. In other words, this property is optional and repeatable for its domain and range.
one to many (0,n:0,1)
An individual domain instance of this property can have zero, one or more instances of this property, but an individual range instance cannot be referenced by more than one instance of this property. In other words, this property is optional for its domain and range, but repeatable for its domain only. In some contexts this situation is called a “fan-out”.
many to one (0,1:0,n)
An individual domain instance of this property can have zero or one instance of this property, but an individual range instance can be referenced by zero, one or more instances of this property. In other words, this property is optional for its domain and range, but repeatable for its range only. In some contexts this situation is called a “fan-in”.
many to many, necessary (1,n:0,n)
An individual domain instance of this property can have one or more instances of this property, but an individual range instance can have zero, one or more instances of this property. In other words, this property is necessary and repeatable for its domain, and optional and repeatable for its range.
one to many, necessary (1,n:0,1)
An individual domain instance of this property can have one or more instances of this property, but an individual range instance cannot be referenced by more than one instance of this property. In other words, this property is necessary and repeatable for its domain, and optional but not repeatable for its range. In some contexts this situation is called a “fan-out”.
many to one, necessary (1,1:0,n)
An individual domain instance of this property must have exactly one instance of this property, but an individual range instance can be referenced by zero, one or more instances of this property. In other words, this property is necessary and not repeatable for its domain, and optional and repeatable for its range. In some contexts this situation is called a “fan-in”.
one to many, dependent (0,n:1,1)
An individual domain instance of this property can have zero, one or more instances of this property, but an individual range instance must be referenced by exactly one instance of this property. In other words, this property is optional and repeatable for its domain, but necessary and not repeatable for its range. In some contexts this situation is called a “fan-out”.
one to many, necessary, dependent (1,n:1,1)
An individual domain instance of this property can have one or more instances of this property, but an individual range instance must be referenced by exactly one instance of this property. In other words, this property is necessary and repeatable for its domain, and necessary but not repeatable for its range. In some contexts this situation is called a “fan-out”.
many to one, necessary, dependent (1,1:1,n)
An individual domain instance of this property must have exactly one instance of this property, but an individual range instance can be referenced by one or more instances of this property. In other words, this property is necessary and not repeatable for its domain, and necessary and repeatable for its range. In some contexts this situation is called a “fan-in”.
one to one (1,1:1,1)
An individual domain instance and range instance of this property must have exactly one instance of this property. In other words, this property is necessary and not repeatable for its domain and for its range.
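The numeric notation can be read mechanically. The sketch below assumes that the first pair of a quantifier gives the minimum and maximum for the domain and the second pair those for the range, as suggested by “many to many, necessary (1,n:0,n)”. The function names are hypothetical.

```python
import re

def parse_quantifier(notation):
    """Split a quantifier such as "(0,n:0,1)" into domain and range bounds."""
    pattern = r"\((\d+),(\d+|n):(\d+),(\d+|n)\)"
    match = re.fullmatch(pattern, notation.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a property quantifier: {notation!r}")
    d_min, d_max, r_min, r_max = match.groups()
    return (int(d_min), d_max), (int(r_min), r_max)

def describe_side(minimum, maximum):
    """Express one side of a quantifier in the vocabulary of the table."""
    necessity = "necessary" if minimum >= 1 else "optional"
    repeatability = "repeatable" if maximum == "n" else "not repeatable"
    return f"{necessity} and {repeatability}"

def describe(notation):
    """Describe both sides of a quantifier in plain words."""
    domain, range_ = parse_quantifier(notation)
    return (f"{describe_side(*domain)} for its domain, "
            f"{describe_side(*range_)} for its range")

print(describe("(1,n:0,n)"))
```

A property that is “necessary” for its range is what the table calls “dependent”.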
The CRM defines some properties as being necessary for their domain or as being dependent on their range, following the definitions in the table above. Note that if such a property is not specified for an instance of the respective domain or range, it means that the property exists, but the value on one side of the property is unknown. In the case of optional properties, the methodology proposed by the CRM does not distinguish between a value being unknown or the property not being applicable at all. For example, one may know that an object has an owner, but the owner is unknown. In a CRM instance this case cannot be distinguished from the fact that the object has no owner at all. Of course, such details can always be specified by a textual note.
The following naming conventions have been applied throughout the CRM:
·Classes are identified by numbers preceded by the letter “E” (historically classes were sometimes referred to as “Entities”), and are named using noun phrases (nominal groups) using title case (initial capitals). For example, E63 Beginning of Existence.
·Properties are identified by numbers preceded by the letter “P,” and are named in both directions using verbal phrases in lower case. Properties with the character of states are named in the present tense, such as “has type”, whereas properties related to events are named in past tense, such as “carried out.” For example, P126 employed (was employed in).
·Property names should be read in their non-parenthetical form for the domain-to-range direction, and in parenthetical form for the range-to-domain direction.
·Properties with a range that is a subclass of E59 Primitive Value (such as E1 CRM Entity. P3 has note: E62 String, for example) have no parenthetical name form, because reading the property name in the range-to-domain direction is not regarded as meaningful.
·Properties that have identical domain and range are either symmetric or transitive. Instantiating a symmetric property implies that the same relation holds for both the domain-to-range and the range-to-domain directions. An example of this is E53 Place. P122 borders with: E53 Place. The names of symmetric properties have no parenthetical form, because reading in the range-to-domain direction is the same as the domain-to-range reading. Transitive asymmetric properties, such as E4 Period. P9 consists of (forms part of): E4 Period, have a parenthetical form that relates to the meaning of the inverse direction.
·The choice of the domain of properties, and hence the order of their names, are established in accordance with the following priority list:
·Temporal Entity and its subclasses
·Thing and its subclasses
·Actor and its subclasses
The following modelling principles have guided and informed the development of the CIDOC CRM.
Because the CRM’s primary role is the meaningful integration of information in an Open World, it aims to be monotonic in the sense of Domain Theory. That is, the existing CRM constructs and the deductions made from them must always remain valid and well-formed, even as new constructs are added by extensions to the CRM.
One may add a subclass of E7 Activity to describe the practice of an instance of group to use a certain name for a place over a certain time-span. By this extension, no existing IsA Relationships or property inheritances are compromised.
In addition, the CRM aims to enable the formal preservation of monotonicity when augmenting a particular CRM compatible system. That is, existing CRM instances, their properties and deductions made from them, should always remain valid and well-formed, even as new instances, regarded as consistent by the domain expert, are added to the system.
If someone describes correctly that an item is an instance of E19 Physical Object, and later it is correctly characterized as an instance of E20 Biological Object, the system should not stop treating it as an instance of E19 Physical Object.
In order to formally preserve monotonicity for the frequent cases of alternative opinions, all formally defined properties should be implemented as unconstrained (many: many) so that conflicting instances of properties are merely accumulated. Thus knowledge integrated following the CRM serves as a research base, accumulating relevant alternative opinions around well-defined entities, whereas conclusions about the truth are the task of open-ended scientific or scholarly hypothesis building.
El Greco and even King Arthur should always remain instances of E21 Person and be dealt with as existing within the sense of our discourse, once they are entered into our knowledge base. Alternative opinions about properties, such as their birthplaces and their living places, should be accumulated without validity decisions being made during data compilation.
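The accumulation of alternative opinions can be sketched as an append-only store. The class KnowledgeBase, the statement layout and the placeholder values “Place X”, “Place Y”, “source A” and “source B” are assumptions made for illustration. The fragment is a sketch, not a CRM implementation.

```python
class KnowledgeBase:
    """Append-only store of (subject, property, value, source) statements."""

    def __init__(self):
        self._statements = []

    def add(self, subject, prop, value, source):
        # Statements are only ever added; nothing is overwritten or removed.
        statement = (subject, prop, value, source)
        if statement not in self._statements:
            self._statements.append(statement)

    def values(self, subject, prop):
        """All recorded values, including conflicting ones."""
        return [value for s, p, value, _source in self._statements
                if s == subject and p == prop]

kb = KnowledgeBase()
kb.add("King Arthur", "is instance of", "E21 Person", "source A")
kb.add("King Arthur", "birthplace", "Place X", "source A")  # placeholder value
kb.add("King Arthur", "birthplace", "Place Y", "source B")  # conflicting opinion

print(kb.values("King Arthur", "birthplace"))
```

Adding the second birthplace does not invalidate the first statement or the classification as E21 Person. Both opinions stay available for later scholarly assessment.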
Although the scope of the CRM is very broad, the model itself is constructed as economically as possible.
·A class is not declared unless it is required as the domain or range of a property not appropriate to its superclass, or it is a key concept in the practical scope.
·CRM classes and properties that share a superclass are non-exclusive by default. For example, an object may be both an instance of E20 Biological Object and E22 Man-made Object.
·CRM classes and properties are either primitive, or they are key concepts in the practical scope.
·Complements of CRM classes are not declared.
Some properties are declared as shortcuts of longer, more comprehensively articulated paths that connect the same domain and range classes as the shortcut property via one or more intermediate classes. For example, the property E18 Physical Thing. P52 has current owner (is current owner of): E39 Actor, is a shortcut for a fully articulated path from E18 Physical Thing through E8 Acquisition to E39 Actor. An instance of the fully-articulated path always implies an instance of the shortcut property. However, the inverse may not be true; an instance of the fully-articulated path cannot always be inferred from an instance of the shortcut property.
The class E13 Attribute Assignment allows for the documentation of how the assignment of any property came about, and whose opinion it was, even in cases of properties not explicitly characterized as “shortcuts”.
Classes are disjoint if they share no common instances in any possible world. There are many examples of disjoint classes in the CRM.
A comprehensive declaration of all possible disjoint class combinations afforded by the CRM has not been provided here; it would be of questionable practical utility, and may easily become inconsistent with the goal of providing a concise definition. However, there are two key examples of disjoint class pairs that are fundamental to effective comprehension of the CRM:
·E2 Temporal Entity is disjoint from E77 Persistent Item. Instances of the class E2 Temporal Entity are perdurants, whereas instances of the class E77 Persistent Item are endurants. Even though instances of E77 Persistent Item have a limited existence in time, they are fundamentally different in nature from instances of E2 Temporal Entity, because they preserve their identity between events. Declaring endurants and perdurants as disjoint classes is consistent with the distinctions made in data structures that fall within the CRM’s practical scope.
·E18 Physical Thing is disjoint from E28 Conceptual Object. The distinction is between material and immaterial items, the latter being exclusively man-made. Instances of E18 Physical Thing and E28 Conceptual Object differ in many fundamental ways; for example, the production of instances of E18 Physical Thing implies the incorporation of physical material, whereas the production of instances of E28 Conceptual Object does not. Similarly, instances of E18 Physical Thing cease to exist when destroyed, whereas an instance of E28 Conceptual Object perishes when it is forgotten or its last physical carrier is destroyed.
Virtually all structured descriptions of museum objects begin with a unique object identifier and information about the "type" of the object, often in a set of fields with names like "Classification", "Category", "Object Type", "Object Name", etc. All these fields are used for terms that declare that the object belongs to a particular category of items. In the CRM the class E55 Type comprises such terms from thesauri and controlled vocabularies used to characterize and classify instances of CRM classes. Instances of E55 Type represent concepts (universals) in contrast to instances of E41 Appellation which are used to name instances of CRM classes.
E55 Type is the CRM’s interface to domain specific ontologies and thesauri. These can be represented in the CRM as subclasses of E55 Type, forming hierarchies of terms, i.e. instances of E55 Type linked via P127 has broader term (has narrower term). Such hierarchies may be extended with additional properties.
For this purpose the CRM provides two basic properties that describe classification with terminology, corresponding to what is the current practice in the majority of information systems. The class E1 CRM Entity is the domain of the property P2 has type (is type of), which has the range E55 Type. Consequently, every class in the CRM, with the exception of E59 Primitive Value, inherits the property P2 has type (is type of). This provides a general mechanism for simulating a specialization of the classification of CRM instances to any level of detail, by linking to external vocabulary sources, thesauri, classification schema or ontologies.
Analogous to the function of the P2 has type (is type of) property, some properties in the CRM are associated with an additional property. These are numbered in the CRM documentation with a ‘.1’ extension. The range of these properties of properties always falls under E55 Type. Their purpose is to simulate a specialization of their parent property through the use of property subtypes declared as instances of E55 Type. They do not appear in the property hierarchy list but are included as part of the property declarations and referred to in the class declarations. For example, P62.1 mode of depiction: E55 Type is associated with E24 Physical Man-made Thing. P62 depicts (is depicted by): E1 CRM Entity.
The class E55 Type also serves as the range of properties that relate to categorical knowledge commonly found in cultural documentation. For example, the property P125 used object of type (was type of object used in) enables the CRM to express statements such as “this casting was produced using a mould”, meaning that there has been an unknown or unmentioned object, a mould, that was actually used. This enables the specific instance of the casting to be associated with the entire type of manufacturing devices known as moulds. Further, the objects of type “mould” would be related via P2 has type (is type of) to this term. This indirect relationship may actually help in detecting the unknown object in an integrated environment. On the other hand, some casting may refer directly to a known mould via P16 used specific object (was used for). So a statistical question as to how many objects in a certain collection are made with moulds could be answered correctly (following both paths through P16 used specific object (was used for) - P2 has type (is type of) and P125 used object of type (was type of object used in)). This consistent treatment of categorical knowledge enhances the CRM’s ability to integrate cultural knowledge.
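The statistical question about moulds can be sketched by following both paths. All identifiers and the type term “hammer” are invented for illustration. The dictionaries are a simplified stand-in for instances of P125, P16 and P2.

```python
# Statements from a hypothetical collection; identifiers are invented.
used_object_of_type = {        # P125 used object of type
    "casting-1": {"mould"},
}
used_specific_object = {       # P16 used specific object
    "casting-2": {"object-77"},
    "casting-3": {"object-88"},
}
has_type = {                   # P2 has type
    "object-77": {"mould"},
    "object-88": {"hammer"},
}

def made_with(type_term):
    """Castings that used an object of the given type, following both paths."""
    via_type = {casting for casting, types in used_object_of_type.items()
                if type_term in types}
    via_object = {
        casting for casting, objects in used_specific_object.items()
        if any(type_term in has_type.get(obj, set()) for obj in objects)
    }
    return via_type | via_object

print(sorted(made_with("mould")))
```

Counting only one of the two paths would miss either casting-1, whose mould is unknown, or casting-2, whose mould is individually documented.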
In addition to being an interface to external thesauri and classification systems E55 Type is an ordinary class in the CRM and a subclass of E28 Conceptual Object. E55 Type and its subclasses inherit all properties from this superclass. Thus together with the CRM class E83 Type Creation the rigorous scholarly or scientific process that ensures a type is exhaustively described and appropriately named can be modelled inside the CRM. In some cases, particularly in archaeology and the life sciences, E83 Type Creation requires the identification of an exemplary specimen and the publication of the type definition in an appropriate scholarly forum. This is very central to research in the life sciences, where a type would be referred to as a “taxon,” the type description as a “protologue,” and the exemplary specimens as “original element” or “holotype”.
Finally, types, that is, instances of E55 Type and its subclasses, are used to characterize the instances of a CRM class and hence refine the meaning of the class. A type ‘artist’ can be used to characterize persons through P2 has type (is type of). On the other hand, in an art history application of the CRM it can be adequate to extend the CRM class E21 Person with a subclass E21.xx Artist. What is the difference between the type ‘artist’ and the class Artist? From an everyday conceptual point of view there is no difference. Both denote the concept ‘artist’ and identify the same set of persons. Thus in this setting a type could be seen as a class and the class of types may be seen as a metaclass. Since current systems do not provide adequate control of user-defined metaclasses, the CRM prefers to model instances of E55 Type as if they were particulars, with the relationships described in the previous paragraphs.
Users may decide to implement a concept either as a subclass extending the CRM class system or as an instance of E55 Type. A new subclass should only be created if the concept is sufficiently stable and associated with additional explicitly modelled properties specific to it. Otherwise, an instance of E55 Type provides more flexibility of use. Users who want to describe a discourse not only using a concept extending the CRM but also describing the history of this concept itself may choose to model the same concept both as a subclass and as an instance of E55 Type with the same name. Similarly, it should be regarded as good practice to foresee, for each term hierarchy refining a CRM class, a term equivalent of this class as top term. For instance, a term hierarchy for instances of E21 Person may begin with “Person”.
Since the intended scope of the CRM is a subset of the “real” world and is therefore potentially infinite, the model has been designed to be extensible through the linkage of compatible external type hierarchies.
Compatibility of extensions with the CRM means that data structured according to an extension must also remain valid as a CRM instance. In practical terms, this implies query containment: any queries based on CRM concepts should retrieve a result set that is correct according to the CRM’s semantics, regardless of whether the knowledge base is structured according to the CRM’s semantics alone, or according to the CRM plus compatible extensions. For example, a query such as “list all events” should recall 100% of the instances deemed to be events by the CRM, regardless of how they are classified by the extension.
A sufficient condition for the compatibility of an extension with the CRM is that CRM classes subsume all classes of the extension, and all properties of the extension are either subsumed by CRM properties, or are part of a path for which a CRM property is a shortcut. Obviously, such a condition can only be tested intellectually.
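As a rough illustration of query containment under the subsumption condition, the sketch below answers “list all events” by walking superclass links, so instances classified only by extension classes are still recalled. The classes prefixed with “X” and all instance names are hypothetical, and each class is given a single superclass for brevity even though the CRM allows multiple inheritance.

```python
# Direct superclass links. The E-classes are CRM classes; the "X" classes
# belong to a hypothetical extension invented for this sketch.
SUPERCLASS = {
    "E4 Period": "E2 Temporal Entity",
    "E5 Event": "E4 Period",
    "E7 Activity": "E5 Event",
    "X1 Excavation": "E7 Activity",
    "X2 Flood": "E5 Event",
}

# Hypothetical instances, each recorded with its most specific class.
INSTANCES = {
    "dig at site 12": "X1 Excavation",
    "flood of 1910": "X2 Flood",
    "Bronze Age": "E4 Period",
}


def is_a(cls, ancestor):
    """True if `cls` equals `ancestor` or is subsumed by it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUPERCLASS.get(cls)
    return False


def instances_of(ancestor):
    """Answer a CRM-level query, including instances of extension classes."""
    return sorted(i for i, c in INSTANCES.items() if is_a(c, ancestor))


def extension_is_subsumed(extension_classes, crm_classes):
    """Check that every extension class falls under at least one CRM class."""
    return all(any(is_a(x, c) for c in crm_classes) for x in extension_classes)


print(instances_of("E5 Event"))  # ['dig at site 12', 'flood of 1910']
print(extension_is_subsumed(["X1 Excavation", "X2 Flood"], ["E5 Event"]))  # True
```

A check of this kind only covers the declared class links; whether an extension class really means what its CRM superclass means remains an intellectual judgement, as the text notes.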
Of necessity, some concepts covered by the CRM are less thoroughly elaborated than others: E39 Actor and E30 Right, for example. This is a natural consequence of staying within the CRM’s clearly articulated practical scope in an intrinsically unlimited domain of discourse. These ‘underdeveloped’ concepts can be considered as hooks for compatible extensions.
The CRM provides a number of mechanisms to ensure that coverage of the intended scope is complete:
1. Existing high level classes can be extended, either structurally as subclasses or dynamically using the type hierarchy.
2. Existing high level properties can be extended, either structurally as subproperties, or in some cases, dynamically, using properties of properties which allow subtyping.
3. Additional information that falls outside the semantics formally defined by the CRM can be recorded as unstructured data using E1 CRM Entity. P3 has note: E62 String.
In mechanisms 1 and 2 the CRM concepts subsume and thereby cover the extensions.
In mechanism 3, the information is accessible at the appropriate point in the respective knowledge base. This approach is preferable when detailed, targeted queries are not expected; in general, only those concepts used for formal querying need to be explicitly modelled.
fig. 2 reasoning about spatial information
The diagram above shows a partial view of the CRM, representing reasoning about spatial information. Five of the main hierarchy branches are included in this view: E39 Actor, E51 Contact Point, E41 Appellation, E53 Place, and E70 Thing. The relationships between these main classes and their subclasses are shown as arrows. Properties between classes are shown as green rectangles. A ‘shortcut’ property is included in this view: P59 has section (is located on or within) between E53 Place and E18 Physical Thing is a shortcut of the path through E46 Section Definition. In some cases the order of priority for property names has been modified in order to facilitate reading the diagram from left to right.
As can be seen, an instance of E53 Place is identified by an instance of E44 Place Appellation, which may be an instance of E45 Address, E47 Spatial Coordinates, E48 Place Name, or E46 Section Definition such as ‘basement’, ‘prow’, or ‘lower left-hand corner.’ An instance of E53 Place may consist of or form part of another instance of E53 Place, thereby allowing a hierarchy of physical ‘containers’ to be constructed.
An instance of E45 Address can be considered both as an E44 Place Appellation (a way of referring to an E53 Place) and as an E51 Contact Point for an E39 Actor. An E39 Actor may have any number of instances of E51 Contact Point. An E18 Physical Thing is found at a location as a consequence of being created there or being moved there. Therefore the properties P53 has former or current location (is former or current location of) and P55 has current location (currently holds) are regarded as shortcuts of the fully articulated paths through the respective events. P55 has current location (currently holds) is a subproperty of P53 has former or current location (is former or current location of). The latter is a container for location information in the absence of knowledge about time of validity and related events.
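The relation between the shortcut properties and the event path can be illustrated with a small sketch. It collects the former or current locations of a thing from three sources: direct P53 statements, P55 statements (as a subproperty of P53), and the start and end points of moves in which the thing took part. The instance names are hypothetical, P25 moved (moved by) is used for the link from a move to the moved object, and the path through creation events is left out for brevity.

```python
P53 = "P53 has former or current location"
P55 = "P55 has current location"

# Hypothetical facts about one object.
TRIPLES = {
    ("move_1", "P25 moved", "vase"),
    ("move_1", "P27 moved from", "workshop"),
    ("move_1", "P26 moved to", "gallery 3"),
    ("vase", P55, "gallery 3"),
    ("vase", P53, "excavation trench"),
}


def former_or_current_locations(thing, triples):
    """Collect every place the thing is known to have been at."""
    places = set()
    for s, p, o in triples:
        # Direct statements: every P55 statement also counts as P53.
        if s == thing and p in (P53, P55):
            places.add(o)
        # Fully articulated path: a move of the thing implies that it was
        # located at both the origin and the destination of that move.
        if p == "P25 moved" and o == thing:
            for s2, p2, o2 in triples:
                if s2 == s and p2 in ("P26 moved to", "P27 moved from"):
                    places.add(o2)
    return places


print(sorted(former_or_current_locations("vase", TRIPLES)))
# ['excavation trench', 'gallery 3', 'workshop']
```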
An interesting aspect of the model is the P58 has section definition (defines section) property between E46 Section Definition and E18 Physical Thing (and the corresponding shortcut from E53 Place to E19 Physical Object). This allows an instance of E53 Place to be defined as a section of an instance of E19 Physical Object. For example, we may know that Nelson fell at a particular spot on the deck of H.M.S. Victory, without knowing the exact position of the vessel in geospatial terms at the time of the fatal shooting of Nelson. Similarly, a signature or inscription can be located “in the lower right corner of” a painting, regardless of where the painting is hanging.
fig. 3 reasoning about temporal information
This second example shows how the CRM handles reasoning about temporal information. Four of the main hierarchy branches are included in this view: E2 Temporal Entity, E52 Time-Span, E77 Persistent Item and E53 Place.
The E2 Temporal Entity class is an abstract class (i.e. it has no instances) that serves to group together all classes with a temporal component, such as instances of E4 Period, E5 Event and E3 Condition State.
An instance of E52 Time-Span is simply a temporal interval that does not make any reference to cultural or geographical contexts (unlike instances of E4 Period, which took place at a particular instance of E53 Place). Instances of E52 Time-Span are sometimes identified by instances of E49 Time Appellation, often in the form of E50 Date.
Both E52 Time-Span and E4 Period have transitive properties. E52 Time-Span has the transitive property P86 falls within (contains), denoting a purely incidental inclusion; whereas E4 Period has the transitive property P9 consists of (forms part of) that supports the decomposition of instances of E4 Period into their constituent parts. For example, the E52 Time-Span during which a building is constructed might fall within the E52 Time-Span of a particular government, although there is no causal or contextual connection between the two instances of E52 Time-Span; conversely, the E4 Period of the Chinese Song Dynasty consists of the Northern Song Period and the Southern Song Period.
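Transitivity means that statements can be chained. A minimal sketch of this inference for P9 consists of (forms part of), using the Song Dynasty example and one further sub-period added purely for illustration, might look as follows; the same function would apply to pairs related by P86 falls within (contains).

```python
def transitive_closure(pairs):
    """Return all pairs implied by chaining the given (whole, part) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# (period, sub-period) pairs for P9 consists of. The last pair is an
# illustrative addition that does not appear in the text.
CONSISTS_OF = {
    ("Song Dynasty", "Northern Song Period"),
    ("Song Dynasty", "Southern Song Period"),
    ("Northern Song Period", "reign of Emperor Taizu"),
}

closure = transitive_closure(CONSISTS_OF)
print(("Song Dynasty", "reign of Emperor Taizu") in closure)  # True
```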
Instances of E52 Time-Span are related to their outer bounds (i.e. their indeterminacy interval) by the property P82 at some time within, and to their inner bounds via the property P81 ongoing throughout. The range of these properties is the E61 Time Primitive class, instances of which are treated by the CRM as application or system specific date intervals that are not further analysed.
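One way to picture the two bounds is as a pair of nested date intervals. The sketch below is an assumption-laden illustration rather than a prescribed encoding: the class and method names are invented, and a simple closed date interval stands in for E61 Time Primitive.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Interval:
    """Stand-in for an E61 Time Primitive: a closed date interval."""
    start: date
    end: date

    def contains(self, other: "Interval") -> bool:
        return self.start <= other.start and other.end <= self.end


@dataclass(frozen=True)
class TimeSpan:
    """Stand-in for an E52 Time-Span with its outer and inner bounds."""
    at_some_time_within: Interval                  # P82: outer bound
    ongoing_throughout: Optional[Interval] = None  # P81: inner bound

    def is_consistent(self) -> bool:
        # The inner bound, if known, must lie inside the outer bound.
        inner = self.ongoing_throughout
        return inner is None or self.at_some_time_within.contains(inner)

    def certainly_covers(self, day: date) -> bool:
        # Certain only for days inside the inner bound.
        inner = self.ongoing_throughout
        return inner is not None and inner.start <= day <= inner.end

    def possibly_covers(self, day: date) -> bool:
        # Possible for any day inside the outer bound.
        outer = self.at_some_time_within
        return outer.start <= day <= outer.end


# Hypothetical construction period: known to lie within 1850-1860 and
# known to have been under way from mid-1853 to early 1855.
construction = TimeSpan(
    at_some_time_within=Interval(date(1850, 1, 1), date(1860, 12, 31)),
    ongoing_throughout=Interval(date(1853, 6, 1), date(1855, 3, 31)),
)
print(construction.is_consistent())                  # True
print(construction.certainly_covers(date(1851, 1, 1)))  # False
print(construction.possibly_covers(date(1851, 1, 1)))   # True
```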
Class & Property Hierarchies
Although they do not provide comprehensive definitions, compact monohierarchical presentations of the class and property IsA hierarchies have been found to significantly aid comprehension and navigation of the CRM, and are therefore provided below.
The class hierarchy presented below has the following format:
·Each line begins with a unique class identifier, consisting of a number preceded by the letter “E” (originally denoting “entity,” although now replaced by convention with the term “class”).
·A series of hyphens (“-”) follows the unique class identifier, indicating the hierarchical position of the class in the IsA hierarchy.
·The English name of the class appears to the right of the hyphens.
·The index is ordered by hierarchical level, in a “depth first” manner, from the smaller to the larger subhierarchies.
·Classes that appear in more than one position in the class hierarchy as a result of multiple inheritance are shown in an italic typeface.
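The format described above can be read mechanically. The following sketch recovers the direct superclass of each class from the number of hyphens on its line. It assumes that hyphens are separated by whitespace and that the listing is in depth-first order; the sample lines are a small excerpt written out for this illustration, and the function name is hypothetical.

```python
import re

# Identifier, zero or more whitespace-separated hyphens, then the class name.
LINE = re.compile(r"^(E\d+)\s+((?:-\s+)*)(.+)$")

SAMPLE = """\
E1 CRM Entity
E2 - Temporal Entity
E3 - - Condition State
E4 - - Period
E5 - - - Event
E77 - Persistent Item
"""


def parse_class_hierarchy(text):
    """Map each class identifier to (name, identifier of the line's parent)."""
    result = {}
    stack = []  # stack[d] holds the most recent identifier seen at depth d
    for raw in text.splitlines():
        match = LINE.match(raw.strip())
        if not match:
            continue
        identifier, hyphens, name = match.groups()
        depth = hyphens.count("-")
        del stack[depth:]  # drop siblings and deeper levels
        parent = stack[-1] if stack else None
        result[identifier] = (name, parent)
        stack.append(identifier)
    return result


hierarchy = parse_class_hierarchy(SAMPLE)
print(hierarchy["E5"])  # ('Event', 'E4')
```

Because the listing is monohierarchical, a class with several superclasses appears on several lines; this sketch keeps only the last occurrence, which would need refining for real use.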
The property hierarchy presented below has the following format:
·Each line begins with a unique property identifier, consisting of a number preceded by the letter “P” (for “property”).
·A series of hyphens (“-”) follows the unique property identifier, indicating the hierarchical position of the property in the IsA hierarchy.
·The English name of the property appears to the right of the hyphens, followed by its inverse name in parentheses for reading in the range to domain direction.
·The domain class for which the property is declared.
·The range class that the property references.
·The index is ordered by hierarchical level, in a “depth first” manner, from the smaller to the larger subhierarchies, and by property number between equal siblings.
·Properties that appear in more than one position in the property hierarchy as a result of multiple inheritance are shown in an italic typeface.
The classes of the CRM are comprehensively declared in this section using the following format:
·Class names are presented as headings in bold face, preceded by the class’ unique identifier;
·The line “Subclass of:” declares the superclass of the class from which it inherits properties;
·The line “Superclass of:” is a cross-reference to the subclasses of this class;
·The line “Scope note:” contains the textual definition of the concept the class represents;
·The line “Examples:” contains a bulleted list of examples of instances of this class. If the example is also an instance of a subclass of this class, the unique identifier of the subclass is added in parentheses. If the example instantiates two classes, the unique identifiers of both classes are added in parentheses. Non-fictitious examples may be followed by an explanation in brackets.
·The line “Properties:” declares the list of the class’ properties;
·Each property is represented by its unique identifier, its forward and reverse names, and the range class that it links to, separated by colons;
·Inherited properties are not represented;
·Properties of properties are provided indented and in parentheses beneath their respective domain property.
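Property lines in this format, such as “P22 transferred title to (acquired title through): E39 Actor”, can be split into their parts with a regular expression. The sketch below is one possible reading of the format; the type and function names are hypothetical, and properties of properties are not handled.

```python
import re
from typing import NamedTuple, Optional


class PropertyDeclaration(NamedTuple):
    identifier: str
    forward_name: str
    reverse_name: Optional[str]
    range_class: str


PATTERN = re.compile(
    r"^(P\d+)\s+"        # unique identifier
    r"(.+?)\s*"          # forward name
    r"(?:\((.+)\))?"     # optional reverse name in parentheses
    r":\s*(E\d+\s+.+)$"  # colon, then the range class
)


def parse_property(line: str) -> Optional[PropertyDeclaration]:
    """Parse one property declaration line, or return None if it does not fit."""
    match = PATTERN.match(line.strip())
    if match is None:
        return None
    return PropertyDeclaration(*match.groups())


decl = parse_property("P22 transferred title to (acquired title through): E39 Actor")
print(decl.identifier, "|", decl.forward_name, "|", decl.reverse_name, "|", decl.range_class)
# P22 | transferred title to | acquired title through | E39 Actor
```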
Scope note: This class comprises all phenomena, such as the instances of E4 Periods, E5 Events and states, which happen over a limited extent in time.
In some contexts, these are also called perdurants. This class is disjoint from E77 Persistent Item. This is an abstract class and has no direct instances. E2 Temporal Entity is specialized into E4 Period, which applies to a particular geographic area (defined with a greater or lesser degree of precision), and E3 Condition State, which applies to instances of E18 Physical Thing.
§Bronze Age (E4)
§the earthquake in Lisbon 1755 (E5)
§the Peterhof Palace near Saint Petersburg being in ruins from 1944 – 1946 (E3)
Scope note: This class comprises the states of objects characterised by a certain condition over a time-span.
An instance of this class describes the prevailing physical condition of any material object or feature during a specific E52 Time-Span. In general, the time-span for which a certain condition can be asserted may be shorter than the real time-span for which this condition held.
The nature of that condition can be described using P2 has type. For example, the E3 Condition State “condition of the SS Great Britain between 22 September 1846 and 27 August 1847” can be characterized as E55 Type “wrecked”.
§the “Amber Room” in Tsarskoje Selo being completely reconstructed from summer 2003 until now
§the Peterhof Palace near Saint Petersburg being in ruins from 1944 – 1946
§the state of my turkey in the oven at 14:30 on 25 December, 2002 (P2 has type: E55 Type “still not cooked”)
Scope note: This class comprises sets of coherent phenomena or cultural manifestations bounded in time and space.
It is the social or physical coherence of these phenomena that identifies an E4 Period and not the associated spatio-temporal bounds. These bounds are a mere approximation of the actual process of growth, spread and retreat. Consequently, different periods can overlap and coexist in time and space, such as when a nomadic culture exists in the same area as a sedentary culture.
Typically this class is used to describe prehistoric or historic periods such as the “Neolithic Period”, the “Ming Dynasty” or the “McCarthy Era”. There are however no assumptions about the scale of the associated phenomena. In particular all events are seen as synthetic processes consisting of coherent phenomena. Therefore E4 Period is a superclass of E5 Event. For example, a modern clinical E67 Birth can be seen as both an atomic E5 Event and as an E4 Period that consists of multiple activities performed by multiple instances of E39 Actor.
There are two different conceptualisations of ‘artistic style’, defined either by physical features or by historical context. For example, “Impressionism” can be viewed as a period lasting from approximately 1870 to 1905 during which paintings with particular characteristics were produced by a group of artists that included (among others) Monet, Renoir, Pissarro, Sisley and Degas. Alternatively, it can be regarded as a style applicable to all paintings sharing the characteristics of the works produced by the Impressionist painters, regardless of historical context. The first interpretation is an E4 Period, and the second defines morphological object types that fall under E55 Type.
Another specific case of an E4 Period is the set of activities and phenomena associated with a settlement, such as the populated period of Nineveh.
Scope note: This class comprises changes of states in cultural, social or physical systems, regardless of scale, brought about by a series or group of coherent physical, cultural, technological or legal phenomena. Such changes of state will affect instances of E77 Persistent Item or its subclasses.
The distinction between an E5 Event and an E4 Period is partly a question of the scale of observation. Viewed at a coarse level of detail, an E5 Event is an ‘instantaneous’ change of state. At a fine level, the E5 Event can be analysed into its component phenomena within a space and time frame, and as such can be seen as an E4 Period. The reverse is not necessarily the case: not all instances of E4 Period give rise to a noteworthy change of state.
§the birth of Cleopatra (E67)
§the destruction of Herculaneum by volcanic eruption in 79 AD (E6)
Scope note: This class comprises events that destroy one or more instances of E18 Physical Thing such that they lose their identity as the subjects of documentation.
Some destruction events are intentional, while others are independent of human activity. Intentional destruction may be documented by classifying the event as both an E6 Destruction and E7 Activity.
The decision to document an object as destroyed, transformed or modified is context sensitive:
1. If the matter remaining from the destruction is not documented, the event is modelled solely as E6 Destruction.
2. An event should also be documented using E81 Transformation if it results in the destruction of one or more objects and the simultaneous production of others using parts or material from the original. In this case, the new items have separate identities. Matter is preserved, but identity is not.
3. When the initial identity of the changed instance of E18 Physical Thing is preserved, the event should be documented as E11 Modification.
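The three rules above, together with the remark on intentional destruction, can be condensed into a small decision function. This is a simplified reading offered for illustration only: it treats “also” in rule 2 as “in addition to E6 Destruction”, the function and parameter names are hypothetical, and real documentation decisions remain context sensitive.

```python
def classify_change(identity_preserved, new_items_from_original, intentional=False):
    """Suggest CRM classes for documenting a change to a physical thing.

    identity_preserved: the thing keeps its identity after the change.
    new_items_from_original: new items are produced at the same time from
        parts or material of the original.
    intentional: the destruction was a deliberate human act.
    """
    if identity_preserved:
        # Rule 3: identity survives, so the event is a modification.
        return ["E11 Modification"]
    classes = ["E6 Destruction"]  # Rule 1: identity is lost
    if new_items_from_original:
        # Rule 2: matter is preserved, but identity is not.
        classes.append("E81 Transformation")
    if intentional:
        classes.append("E7 Activity")
    return classes


print(classify_change(False, False))                    # ['E6 Destruction']
print(classify_change(False, True))                     # ['E6 Destruction', 'E81 Transformation']
print(classify_change(True, False))                     # ['E11 Modification']
print(classify_change(False, False, intentional=True))  # ['E6 Destruction', 'E7 Activity']
```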
§the destruction of Herculaneum by volcanic eruption in 79 AD
§the destruction of Nineveh (E6, E7)
§the breaking of a champagne glass yesterday by my dog
Scope note: This class comprises transfers of legal ownership from one or more instances of E39 Actor to one or more other instances of E39 Actor.
The class also applies to the establishment or loss of ownership of instances of E18 Physical Thing. It does not, however, imply changes of any other kinds of right. The recording of the donor and/or recipient is optional. It is possible that in an instance of E8 Acquisition there is either no donor or no recipient. Depending on the circumstances, it may describe:
1. the beginning of ownership
2. the end of ownership
3. the transfer of ownership
4. the acquisition from an unknown source
5. the loss of title due to destruction of the item
It may also describe events where a collector appropriates legal title, for example by annexation or field collection. The interpretation of the museum notion of "accession" differs between institutions. The CRM therefore models legal ownership (E8 Acquisition) and physical custody (E10 Transfer of Custody) separately. Institutions will then model their specific notions of accession and deaccession as combinations of these.
§the collection of a hammer-head shark of the genus Sphyrna (Carcharhiniformes) by John Steinbeck and Edward Ricketts at Puerto Escondido in the Gulf of Mexico on March 25th, 1940
§the acquisition of El Greco’s painting entitled ‘The Apostles Peter and Paul’ by the State Hermitage in Saint Petersburg
§the loss of my stuffed chaffinch ‘Fringilla coelebs Linnaeus, 1758’ due to insect damage last year
P22 transferred title to (acquired title through): E39 Actor
P23 transferred title from (surrendered title through): E39 Actor
P24 transferred title of (changed ownership through): E18 Physical Thing
Scope note: This class comprises changes of the physical location of the instances of E19 Physical Object.
Note that the class E9 Move inherits the property P7 took place at (witnessed): E53 Place. This property should be used to describe the trajectory or a larger area within which a move takes place, whereas the properties P26 moved to (was destination of) and P27 moved from (was origin of) describe the end and start points only. Moves may also be documented as consisting of other moves (via P9 consists of (forms part of)), in order to describe intermediate stages on a trajectory. In that case, start and end points of the partial moves should match appropriately between each other and with the overall event.
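The matching requirement for partial moves can be checked mechanically. The sketch below assumes that the partial moves are supplied in chronological order, which P9 itself does not record, and uses hypothetical class, field and place names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Move:
    moved_from: str  # P27 moved from
    moved_to: str    # P26 moved to
    parts: List["Move"] = field(default_factory=list)  # P9 consists of, in order


def parts_are_consistent(move: Move) -> bool:
    """Check that partial moves chain together and match the overall move."""
    if not move.parts:
        return True
    # The first part must start where the overall move starts,
    # and the last part must end where the overall move ends.
    if move.parts[0].moved_from != move.moved_from:
        return False
    if move.parts[-1].moved_to != move.moved_to:
        return False
    # Each part must start where the previous one ended.
    for earlier, later in zip(move.parts, move.parts[1:]):
        if earlier.moved_to != later.moved_from:
            return False
    # Apply the same check to nested partial moves.
    return all(parts_are_consistent(part) for part in move.parts)


# Hypothetical journey from place A to place C with a stop at place B.
journey = Move("place A", "place C", parts=[
    Move("place A", "place B"),
    Move("place B", "place C"),
])
print(parts_are_consistent(journey))  # True
```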
§the relocation of London Bridge from the UK to the USA
§the movement of the exhibition “Treasures of Tut-Ankh-Amun” 1976-1979
Scope note: This class comprises transfers of physical custody of objects between instances of E39 Actor.
The recording of the donor and/or recipient is optional. It is possible that in an instance of E10 Transfer of Custody there is either no donor or no recipient. Depending on the circumstances it may describe:
1. the beginning of custody
2. the end of custody
3. the transfer of custody
4. the receipt of custody from an unknown source
5. the declared loss of an object
The distinction between the legal responsibility for custody and the actual physical possession of the object should be expressed using the property P2 has type (is type of). A specific case of transfer of custody is theft.
The interpretation of the museum notion of "accession" differs between institutions. The CRM therefore models legal ownership and physical custody separately. Institutions will then model their specific notions of accession and deaccession as combinations of these.
§the delivery of the paintings by Secure Deliveries Inc. to the National Gallery
§the return of Picasso’s “Guernica” to Madrid’s Prado in 1981
P28 custody surrendered by (surrendered custody through): E39 Actor
P29 custody received by (received custody through): E39 Actor
P30 transferred custody of (custody transferred through): E18 Physical Thing
Scope note: This class comprises all instances of E7 Activity that create, alter or change E24 Physical Man-Made Thing.
This class includes the production of an item from raw materials, and other so far undocumented objects, and the preventive treatment or restoration of an object for conservation.
Since the distinction between modification and production is not always clear, modification is regarded as the more generally applicable concept. This implies that some items may be consumed or destroyed in a Modification, and that others may be produced as a result of it. An event should also be documented using E81 Transformation if it results in the destruction of one or more objects and the simultaneous production of others using parts or material from the originals. In this case, the new items have separate identities.
If the instance of the E29 Design or Procedure utilised for the modification prescribes the use of specific materials, they should be documented using properties of the design or procedure, rather than via P126 employed (was employed in): E57 Material.
§the construction of the SS Great Britain (E12)
§the impregnation of the Vasa warship in Stockholm for preservation after 1956
§the transformation of the Enola Gay into a museum exhibit by the National Air and Space Museum in Washington DC between 1993 and 1995 (E12, E81)
§the last renewal of the gold coating of the Toshogu shrine in Nikko, Japan
P31 has modified (was modified by): E24 Physical Man-Made Thing
Scope note: This class comprises activities that are designed to, and succeed in, creating one or more new items.
It specializes the notion of modification into production. The decision as to whether or not an object is regarded as new is context sensitive. Normally, items are considered “new” if there is no obvious overall similarity between them and the consumed items and material used in their production. In other cases, an item is considered “new” because it becomes relevant to documentation by a modification. For example, the scribbling of a name on a potsherd may make it a voting token. The original potsherd may not be worth documenting, in contrast to the inscribed one.
This entity can be collective: the printing of a thousand books, for example, would normally be considered a single event.
An event should also be documented using E81 Transformation if it results in the destruction of one or more objects and the simultaneous production of others using parts or material from the originals. In this case, the new items have separate identities and matter is preserved, but identity is not.
§the construction of the SS Great Britain
§the first casting of the Little Mermaid from the harbour of Copenhagen
§Rembrandt’s creating of the seventh state of his etching “Woman sitting half dressed beside a stove”, 1658, identified by Bartsch Number 197 (E12,E65,E81)
P108 has produced (was produced by): E24 Physical Man-Made Thing
Scope note: This class comprises the actions of making assertions about properties of an object or any relation between two items or concepts.
This class allows the documentation of how the respective assignment came about, and whose opinion it was. All the attributes or properties assigned in such an action can also be seen as directly attached to the respective item or concept, possibly as a collection of contradictory values. All cases of properties in this model that are also described indirectly through an action are characterised as "short cuts" of this action. This redundant modelling of two alternative views is preferred because many implementations may have good reasons to model either the action or the short cut, and the relation between both alternatives can be captured by simple rules.
In particular, the class describes the actions of people making propositions and statements during certain museum procedures, e.g. the person and date when a condition statement was made, an identifier was assigned, the museum object was measured, etc. Which kinds of such assignments and statements need to be documented explicitly in structures of a schema rather than free text depends on whether this information should be accessible by structured queries.
§the assessment of the current ownership of Martin Doerr’s silver cup in February 1997
P140 assigned attribute to (was attributed by): E1 CRM Entity
Scope note: This class describes the act of assessing the state of preservation of an object during a particular period.
The condition assessment may be carried out by inspection, measurement or through historical research. This class is used to document circumstances of the respective assessment that may be relevant to interpret its quality at a later stage, or to continue research on related documents.
§last year’s inspection of humidity damage to the frescos in the St. George chapel in our village
Scope note: This class comprises activities that result in the allocation of an identifier to an instance of E1 CRM Entity. An E15 Identifier Assignment may include the creation of the identifier from multiple constituents, which themselves may be instances of E41 Appellation. The syntax and kinds of constituents to be used may be declared in a rule constituting an instance of E29 Design or Procedure.
Examples of such identifiers include Find Numbers, Inventory Numbers, uniform titles in the sense of librarianship and Digital Object Identifiers (DOI). Documenting the act of identifier assignment and deassignment is especially useful when objects change custody or the identification system of an organization is changed. In order to keep track of the identity of things in such cases, it is important to document by whom, when and for what purpose an identifier is assigned to an item.
The fact that an identifier is a preferred one for an organisation can be expressed by using the property E1 CRM Entity. P48 has preferred identifier (is preferred identifier of): E42 Identifier. It can better be expressed in a context independent form by assigning a suitable E55 Type, such as “preferred identifier assignment”, to the respective instance of E15 Identifier Assignment via the P2 has type property.
§Replacement of the inventory number TA959a by GE34604 for a 17th century lament cloth at the Museum Benaki, Athens
§Assigning the author-uniform title heading “Goethe, Johann Wolfgang von, 1749-1832. Faust. 1. Theil.” for a work (E28)
§On June 1, 2001 assigning the personal name heading “Guillaume, de Machaut, ca. 1300-1377” (E42,E82) to Guillaume de Machaut (E21)
Scope note: This class comprises actions measuring physical properties and other values that can be determined by a systematic procedure.
Examples include measuring the monetary value of a collection of coins or the running time of a specific video cassette.
The E16 Measurement may use simple counting or tools, such as yardsticks or radiation detection devices. The interest is in the method and care applied, so that the reliability of the result may be judged at a later stage, or research continued on the associated documents. The date of the event is important for dimensions, which may change value over time, such as the length of an object subject to shrinkage. Details of methods and devices are best handled as free text, whereas basic techniques such as "carbon 14 dating" should be encoded using P2 has type (is type of): E55 Type.
§measurement of height of silver cup 232 on the 31st August 1997
§the carbon 14 dating of the “Schoeninger Speer II” in 1996 [a complete Palaeolithic wooden spear, about 400,000 years old, found in Schoeningen, Niedersachsen, Germany in 1995]
Scope note: This class comprises the actions of classifying items of whatever kind. Such items include objects, specimens, people, actions and concepts.
This class allows for the documentation of the context of classification acts in cases where the value of the classification depends on the personal opinion of the classifier, and the date that the classification was made. This class also encompasses the notion of "determination," i.e. the systematic and molecular identification of a specimen in biology.
§the first classification of object GE34604 as Lament Cloth, October 2nd
§the determination of a cactus in Martin Doerr’s garden as ‘Cereus hildmannianus K.Schumann’, July 2003
Scope note: This class comprises all persistent physical items with a relatively stable form, man-made or natural.
Depending on the existence of natural boundaries of such things, the CRM distinguishes the instances of E19 Physical Object from instances of E26 Physical Feature, such as holes, rivers, pieces of land etc. Most instances of E19 Physical Object can be moved (if not too heavy), whereas features are integral to the surrounding matter.
The CRM is generally not concerned with amounts of matter in fluid or gaseous states.