Difference between revisions of "Audubon Core Structure"

From TDWG Terms Wiki
m
m (top: added outdated warning with link to up-to-date information)
 
(30 intermediate revisions by 4 users not shown)
Line 1: Line 1:
 +
__NOINDEX__
 +
<span style="background-color:#ffc; padding:5px; color:#f00;border:1px solid #f00;">'''Warning''': This is not the current version of this document. It is kept for archival purposes. For up-to-date information, go to http://rs.tdwg.org/ac/doc/structure/</span>
 +
 +
 
<!-- IMPORTANT:  
 
<!-- IMPORTANT:  
 
   WHEN YOU WANT TO EDIT, UNCOMMENT THE WIP CALL BELOW THIS AND SAVE,
 
   WHEN YOU WANT TO EDIT, UNCOMMENT THE WIP CALL BELOW THIS AND SAVE,
   THEN REOPEN THE EDIT. THIS HELPS ASSURE EDITS WON'T COLLIDE  
+
   THEN REOPEN THE EDIT. THIS HELPS ASSURE EDITS WON'T COLLIDE
{{WIP | end = around 20:00Z | user = {{REVISIONUSER}} }} -->
+
{{WIP | end = when approved by TDWG Executive Committee. While you see this notice, the proposal has not been yet accepted | user = {{REVISIONUSER}} }} -->
  
 
'''Title:''' Audubon Core
  
'''Date:''' TBD. This document is a proposal. See [[Audubon_Core_1.0_Call_For_Public_Review#Audubon_Core_Public_Review]]
+
'''Date:''' 23 October 2013.
  
'''Abstract:''' The Audubon Core is a set of vocabularies designed to represent metadata for biodiversity multimedia resources and collections. These vocabularies aim to represent information that will help to determine whether a particular resource or collection will be fit for some particular biodiversity science application before acquiring the media. Among others, the vocabularies address such concerns as the management of the media and collections, descriptions of their content, their taxonomic, geographic, and temporal coverage, and the appropriate ways to retrieve, attribute and reproduce them. This document contains material introductory to the '''[[Audubon Core Term List (1.0 normative)|Audubon Core Term List]]'''
+
'''Abstract:''' The Audubon Core is a set of vocabularies designed to represent metadata for biodiversity multimedia resources and collections. These vocabularies aim to represent information that will help to determine whether a particular resource or collection will be fit for some particular biodiversity science application before acquiring the media. Among others, the vocabularies address such concerns as the management of the media and collections, descriptions of their content, their taxonomic, geographic, and temporal coverage, and the appropriate ways to retrieve, attribute and reproduce them. This document contains material introductory to the '''[[Audubon Core Term List]]'''.
  
 
'''Contributors:''' Robert A. Morris, Vijay Barve, Mihail Carausu, Vishwas Chavan, Jose Cuadra, Chris Freeland, Gregor Hagedorn, Patrick Leary, Dimitry Mozzherin, Annette Olson, Greg Riccardi, Ivan Teage
Line 14: Line 18:
 
'''Legal:''' This document is governed by the standard legal, copyright, licensing provisions and disclaimers issued by the Taxonomic Databases Working Group.
  
'''Part of TDWG Standard:''' TBD
+
'''Part of TDWG Standard:''' http://www.tdwg.org/standards/638/.
  
 +
'''Document Status:''' This document is a [http://www.tdwg.org/fileadmin/tdwg_std_drafts/tdwg_standards_documentation_specification.html#a_3 TDWG Type 1 Normative Document].
 +
 +
Release 1.0 of this document has wiki revision 10756 with a permalink http://terms.gbif.org/w/index.php?oldid=10756.<br/>
 +
This document has wiki revision ID {{REVISIONID}} with a permalink http://terms.gbif.org/w/index.php?oldid={{REVISIONID}}.
 +
<!--
 
[[Image:WIP.gif]]  This document has revision ID {{REVISIONID}} which can be permanently accessed through http://terms.gbif.org/w/index.php?oldid={{REVISIONID}}. This may have minor clarifications and corrections from the '''[http://terms.gbif.org/w/index.php?oldid=9000 |version under consideration for ratification.]'''<br/>
 
'''<font color="red"> The version under consideration by TDWG has permalink http://terms.gbif.org/w/index.php?oldid=9000.</font>'''
  
ONCE REVIEW AND REVISION FINALIZED: '''This is a normative document.'''
 
  
 +
ONCE REVIEW AND REVISION FINALIZED: '''This is a normative document.'''
 +
-->
 
__TOC__ <!-- table of contents should appear here. -->
  
  
This is the normative documentation for the [http://tdwg.org TDWG] Audubon Core Multimedia Resources Metadata Standard (Audubon Core, or simply AC), During development, it was colloquially known as MRTG, after its developers, the GBIF-TDWG Joint Multimedia Resources Metadata Task Group. Please see the brief '''[[Audubon Core Non Normative Document]]''' and also '''[http://www.keytonature.eu/wiki/MRTG_Development_History MRTG Development History]''' for the development history in detail.  
+
This is the normative documentation for the [http://tdwg.org TDWG] Audubon Core Multimedia Resources Metadata Standard (Audubon Core, or simply AC). During development, it was colloquially known as MRTG, after its developers, the GBIF-TDWG Joint Multimedia Resources Metadata Task Group. Please see the brief '''[[Audubon Core]]''' non-normative document and also '''[http://www.keytonature.eu/wiki/MRTG_Development_History MRTG Development History]''' for the development history in detail.
  
'''If you are unfamiliar with the AudubonCore, ''please'' read the [[Audubon Core Non Normative Document]] before editing this page.''' It lays out why there is perceived a need for a biodiversity media resource metadata schema, and how the standard attempts to use existing metadata standards where possible.
+
'''If you are unfamiliar with the Audubon Core, ''please'' read the [[Audubon Core]] introduction before reading this page.''' It lays out why a need is perceived for a biodiversity media resource metadata schema, and how the standard attempts to use existing metadata standards where possible.
  
 
==Terminology of this specification==
Line 33: Line 43:
  
 
* A ''Multimedia Resource'' is anything that a provider identifies as belonging to one of the possible values of the AC ''Type'' term and optionally one or more of the ''Subtype'' term values. A mechanism is provided by which providers can supply a privately defined subtype that will not collide with the AC defined Subtype values.
* An AC ''record'' is a set of terms with any values conforming to this document, and which contain at least the four mandatory terms described in the [[Audubon_Core_Term_List_(1.0_normative) | Audubon Core Core Term List (1.0 normative)]], and which describes a single multimedia resource (possibly including a Collection). One of these, the value of ''Identifier'' is a Globally Unique IDentifier (GUID), which may have been assigned to the resource by an external authority or by the provider of the metadata record.
+
* An AC ''record'' is a set of terms with any values conforming to this document, which contains at least the four mandatory terms described in the [[Audubon Core Term List]], and which describes a single multimedia resource (possibly including a Collection). The value of one of these terms, ''Identifier'', is a Globally Unique IDentifier (GUID), which may have been assigned to the resource by an external authority or by the provider of the metadata record.
 
* AC terms are divided into two ''Layers''. Those characterized as in ''Layer 1'', including the four mandatory terms, should be meaningfully handled by all consuming client applications. Only wholly complete consuming applications need handle those in ''Layer 2''. What is meant by "meaningfully handle" is up to implementers of this normative specification. It could be as simple as "gracefully ignore".
  
In the [[Audubon_Core_Term_List_(1.0_normative) | Audubon Core Term List]], every AC term has a ''term name'' following a table entry ''"Term:"'', a ''URI'', a plain text normative ''Definition'', a recommended English ''Label'',  an optional ''Notes'' attribute. In addition, a term has an attribute telling whether it is mandatory, one telling whether it is repeatable, and one telling whether it is in Layer 1 or 2. Layer 2 comprises terms likely to only occur for certain media. For example, the term ''DateAvailable'' will apply only to media that are embargoed, but for which the provider is prepared to make the metadata immediately available.  
+
In the [[Audubon Core Term List]], every AC term has a ''term name'' following a table entry ''"Term:"'', a ''URI'', a plain text normative ''Definition'', a recommended English ''Label'', and an optional ''Notes'' attribute. In addition, a term has an attribute telling whether it is mandatory, one telling whether it is repeatable, and one telling whether it is in Layer 1 or 2. Layer 2 comprises terms likely to only occur for certain media. For example, the term ''DateAvailable'' will apply only to media that are embargoed, but for which the provider is prepared to make the metadata immediately available.
  
 
AC metadata can describe either individual multimedia resources or collections of resources. A few, but not many, of the AC properties have different values for collections than for individual media. If no such distinction is mentioned, AC does not assume one.
Line 42: Line 52:
 
Term Names for terms borrowed from other vocabularies are those in use for the corresponding term in those vocabularies. Term Names are intended principally for navigation in the AC documentation. Term Labels are suggestions for English labels in applications. They are recommendations only and are offered only in English, with the added expectation that they may clarify intended usage of the term. Communities may wish to promulgate recommendations for Labels in other languages, or even alternative English Labels for specialized audiences, e.g. school children. Labels may be used for navigation within the Term List, and are often used within the Term List itself when a term is mentioned within the documentation of another term. The Term List provides indices both by name and label.
  
URI's for terms conform to the http URI scheme (See, http://en.wikipedia.org/wiki/URI_scheme, http://www.w3.org/TR/uri-clarification/, or http://www.ietf.org/rfc/rfc2396.txt ). Informally, one may understand this as follows: an http URI has the syntax of an http URL, but there is no expectation that putting it in a web browser will result in any information being returned to the browser, and if there is, it may have no relevance. This conformance requirement applies only to the URIs that identify AC terms. A few AC terms permit '''values''' to be taken from another controlled vocabulary chosen by the user. In this case, those values may involve URIs conforming to a scheme given by that external vocabulary, and AC is silent on what that scheme is.  
+
URIs for terms conform to the http URI scheme (See http://en.wikipedia.org/wiki/URI_scheme, http://www.w3.org/TR/uri-clarification/, or http://www.ietf.org/rfc/rfc2396.txt.) Informally, one may understand this as follows: an http URI has the syntax of an http URL, but there is no expectation that putting it in a web browser will result in any information being returned to the browser, and if there is, it may have no relevance. This conformance requirement applies only to the URIs that identify AC terms. A few AC terms permit '''values''' to be taken from another controlled vocabulary chosen by the user. In this case, those values may involve URIs conforming to a scheme given by that external vocabulary, and AC is silent on what that scheme is.
  
 
The Notes field of a term's documentation points to further information, if any exists, about the term. In particular, for terms borrowed from other vocabularies, this field generally carries a link to the originating vocabulary's documentation for that term.
Line 50: Line 60:
 
A number of terms are repeatable. How to implement repeatability in a given serialization is not defined by Audubon Core. The following section gives advice on some best practices in the context of repeatability.
  
The simplest case is a single repeatable term (e.g., dcterms:identifier). In representations based on an XML Schema that permits elements to be repeated such a term may simply be repeated (e.g. "...<dcterms:identifier>http://example.com/123</dcterms:identifier><dcterms:identifier>http://example.com/456</dcterms:identifier>..."). In serializations that do not easily lend themselves to repeatable elements (e.g. "flat" schemata with all elements occurring only a single time in an otherwise unstructured record) it is possible to define separators to support a list of values within a single element (e.g. "...<dcterms:identifier>http://example.com/123; http://example.com/456</dcterms:identifier>...").
+
The simplest case is a single repeatable term (e.g., dcterms:identifier). In representations based on an XML Schema that permits elements to be repeated, such a term may simply be repeated (e.g. <nowiki>"...<dcterms:identifier>http://example.com/123</dcterms:identifier><dcterms:identifier>http://example.com/456</dcterms:identifier>..."</nowiki>). In serializations that do not easily lend themselves to repeatable elements (e.g. "flat" schemata with all elements occurring only a single time in an otherwise unstructured record), it is possible to define separators to support a list of values within a single element (e.g. "...<dcterms:identifier><nowiki>http://example.com/123; http://example.com/456</nowiki></dcterms:identifier>...").
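
As a minimal sketch of the separator approach just described, the following Python helpers split and join a flat field. Audubon Core does not define a separator; the semicolon used here simply follows the example above, and the function names are hypothetical.

```python
def split_repeated_values(field_text, separator=";"):
    """Split one flat field holding several values into a list of values."""
    # Surrounding whitespace is dropped, and empty fragments are ignored.
    return [part.strip() for part in field_text.split(separator) if part.strip()]


def join_repeated_values(values, separator=";"):
    """Join values into one flat field, refusing values that contain the separator."""
    for value in values:
        if separator in value:
            # A value containing the separator could not be split back reliably.
            raise ValueError(f"value contains the separator {separator!r}: {value!r}")
    return (separator + " ").join(values)


identifiers = split_repeated_values("http://example.com/123; http://example.com/456")
flat_field = join_repeated_values(identifiers)
```

Because a semicolon may legitimately occur inside a URI, an implementation that adopts this convention would need to choose a separator that cannot appear in its values, or define an escaping rule.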
  
 
In certain cases pairs or tuples of properties are repeated. In Audubon Core this situation occurs, for example, in the following cases:
* The language-dependent metadata like title, description, etc. need to be associated with ac:metadataLanguage. One approach here is to use complete Audubon Core records together with the [[Audubon_Core_Term_List_(1.0_normative)#metadataLanguage| ac:metadataLanguage]] property; see there for further detail.
+
* The language-dependent metadata like title, description, etc. need to be associated with ac:metadataLanguage. One approach here is to use complete Audubon Core records together with the [[Audubon Core Term List#ac:metadataLanguage|Metadata Language]] property; see there for further detail.
* The values of properties about a Service Access Point must remain associated with that Service Access Point even if there are multiple Service Access Points. See [[Audubon_Core_Term_List_(1.0_normative)#hasServiceAccessPoint|hasServiceAccessPoint]] for further details.
+
* The values of properties about a Service Access Point must remain associated with that Service Access Point even if there are multiple Service Access Points. See [[Audubon Core Term List#ac:hasServiceAccessPoint|hasServiceAccessPoint]] for further details.
* The terms dwc:scientificName and dwc:identificationQualifier may optionally be structured into pairs (see the notes on [[Audubon_Core_Term_List_(1.0_normative)#Reviewer| identificationQualifier]]).
+
* The terms dwc:scientificName and dwc:identificationQualifier may optionally be structured into pairs. (See the notes on [[Audubon Core Term List#dwc:identificationQualifier|dwc:identificationQualifier]].)
* The terms [[Audubon_Core_Term_List_(1.0_normative)#Reviewer| Reviewer]], being the name of an individual providing some expert review of a resource, and the review text itself in [[Audubon_Core_Term_List_(1.0_normative)#Reviewer_Comments | Reviewer Comments]] are desirable to store as pairs.
+
* The values of [[Audubon Core Term List#ac:Reviewer|Reviewer]], the name of an individual providing some expert review of a resource, and of [[Audubon Core Term List#ac:Reviewer Comments|Reviewer Comments]], the review text itself, are best stored as pairs.
  
 
Many serialization languages provide sufficiently structured forms to deal with repeated terms unambiguously. For example, in XML one might define a container element and use a nesting structure something like this:
 
  <MEDIA_METADATA_CONTAINER>
   <dcterms:identifier>http//:example.com/pictures/thePicture.jpg</dcterms:identifier>
+
   <dcterms:identifier><nowiki>http://example.com/pictures/thePicture.jpg</nowiki></dcterms:identifier>
 
   ...
 
   <ac:hasServiceAccessPoint>
 
     <dcterms:format>jpg</dcterms:format>
     <ac:accessURI>http://example.com/fullres/thePicture.jpg</ac:accessURI>
+
     <ac:accessURI><nowiki>http://example.com/fullres/thePicture.jpg</nowiki></ac:accessURI>
 
     ...
 
   </ac:hasServiceAccessPoint>
Line 74: Line 84:
 
Another example may reference access points by identifier:
 
  <MEDIA_METADATA_CONTAINER>
   <dcterms:identifier>http//:example.com/pictures/thePicture.jpg</dcterms:identifier>
+
   <dcterms:identifier><nowiki>http://example.com/pictures/thePicture.jpg</nowiki></dcterms:identifier>
 
   ...
   <ac:hasServiceAccessPoint>http//:example.com/pictures/thePicture.jpg#ac0001</ac:hasServiceAccessPoint>
+
   <ac:hasServiceAccessPoint><nowiki>http://example.com/pictures/thePicture.jpg#ac0001</nowiki></ac:hasServiceAccessPoint>
   <ac:hasServiceAccessPoint>http//:example.com/pictures/thePicture.jpg#ac0002</ac:hasServiceAccessPoint>
+
   <ac:hasServiceAccessPoint><nowiki>http://example.com/pictures/thePicture.jpg#ac0002</nowiki></ac:hasServiceAccessPoint>
   <ac-classes:ServiceAccessPoint id="http//:example.com/pictures/thePicture.jpg#ac0001">
+
   <ac-classes:ServiceAccessPoint id="<nowiki>http://example.com/pictures/thePicture.jpg#ac0001</nowiki>">
 
     <dcterms:format>jpg</dcterms:format>
     <ac:accessURI>http://example.com/fullres/thePicture.jpg</ac:accessURI>
+
     <ac:accessURI><nowiki>http://example.com/fullres/thePicture.jpg</nowiki></ac:accessURI>
 
     ...
 
   </ac-classes:ServiceAccessPoint>
Line 87: Line 97:
 
Note: In ac-classes:ServiceAccessPoint, ac-classes is the prefix of an illustrative namespace. Namespace recommendations will be made when the normative documents are approved.
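
A consumer of the by-identifier form has to match each ac:hasServiceAccessPoint value to the access point description carrying that id. The following Python sketch shows one way this could be done with the standard library. The namespace URIs bound to the prefixes are assumptions made so the sample is well-formed XML (the ac-classes URI in particular is invented for illustration), and the second access point is added only to make the example complete.

```python
import xml.etree.ElementTree as ET

# Prefix bindings are assumptions for this sketch; ac-classes is purely illustrative.
NS = {
    "dcterms": "http://purl.org/dc/terms/",
    "ac": "http://rs.tdwg.org/ac/terms/",
    "ac-classes": "http://example.com/ac-classes/",
}

RECORD = """
<MEDIA_METADATA_CONTAINER
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:ac="http://rs.tdwg.org/ac/terms/"
    xmlns:ac-classes="http://example.com/ac-classes/">
  <dcterms:identifier>http://example.com/pictures/thePicture.jpg</dcterms:identifier>
  <ac:hasServiceAccessPoint>http://example.com/pictures/thePicture.jpg#ac0001</ac:hasServiceAccessPoint>
  <ac:hasServiceAccessPoint>http://example.com/pictures/thePicture.jpg#ac0002</ac:hasServiceAccessPoint>
  <ac-classes:ServiceAccessPoint id="http://example.com/pictures/thePicture.jpg#ac0001">
    <dcterms:format>jpg</dcterms:format>
    <ac:accessURI>http://example.com/fullres/thePicture.jpg</ac:accessURI>
  </ac-classes:ServiceAccessPoint>
  <ac-classes:ServiceAccessPoint id="http://example.com/pictures/thePicture.jpg#ac0002">
    <dcterms:format>png</dcterms:format>
    <ac:accessURI>http://example.com/fullres/thePicture-hires.png</ac:accessURI>
  </ac-classes:ServiceAccessPoint>
</MEDIA_METADATA_CONTAINER>
"""


def resolve_access_points(xml_text):
    """Return format and accessURI for every referenced service access point."""
    root = ET.fromstring(xml_text)
    # Index the access point descriptions by their id attribute.
    by_id = {
        sap.get("id"): sap
        for sap in root.findall("ac-classes:ServiceAccessPoint", NS)
    }
    resolved = []
    for reference in root.findall("ac:hasServiceAccessPoint", NS):
        sap = by_id[reference.text.strip()]
        resolved.append({
            "format": sap.findtext("dcterms:format", namespaces=NS),
            "accessURI": sap.findtext("ac:accessURI", namespaces=NS),
        })
    return resolved


access_points = resolve_access_points(RECORD)
```

Keeping the format and the access URI inside the same element is what preserves their association when a resource has several access points.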
  
Where such structure is impossible or undesirable, an alternative solution is to to permit only one access point per MEDIA_METADATA_CONTAINER, but to repeat the main MEDIA_METADATA_CONTAINER for a single media resource. This is similar to one of the options discussed for multilingual metadata (see [[Audubon_Core_Term_List_(1.0_normative)#metadataLanguage| ac:metadataLanguage]]). An example in XML for this:
+
Where such structure is impossible or undesirable, an alternative solution is to permit only one access point per MEDIA_METADATA_CONTAINER, but to repeat the main MEDIA_METADATA_CONTAINER for a single media resource. This is similar to one of the options discussed for multilingual metadata (see [[Audubon Core Term List#ac:metadataLanguage|Metadata Language]]). An example in XML for this:
  
 
  <MEDIA_METADATA_CONTAINER>
   <dcterms:identifier>http//:example.com/pictures/thePicture.jpg</dcterms:identifier>
+
   <dcterms:identifier>http://example.com/pictures/thePicture.jpg</dcterms:identifier>
 
   <dcterms:title>A red beech leaf</dcterms:title>
 
   <dcterms:format>jpg</dcterms:format>
   <ac:accessURI>http://example.com/fullres/thePicture.jpg</ac:accessURI>
+
   <ac:accessURI><nowiki>http://example.com/fullres/thePicture.jpg</nowiki></ac:accessURI>
 
   ...
 
  </MEDIA_METADATA_CONTAINER>
 
  <MEDIA_METADATA_CONTAINER>
   <dcterms:identifier>http//:example.com/pictures/thePicture.jpg</dcterms:identifier>
+
   <dcterms:identifier><nowiki>http://example.com/pictures/thePicture.jpg</nowiki></dcterms:identifier>
 
   <dcterms:format>png</dcterms:format>
   <ac:accessURI>http://example.com/fullres/thePicture-hires.png</ac:accessURI>
+
   <ac:accessURI><nowiki>http://example.com/fullres/thePicture-hires.png</nowiki></ac:accessURI>
 
   ...
 
  </MEDIA_METADATA_CONTAINER>
Line 105: Line 115:
 
The same example as a spreadsheet-like table:
 
<table class="wikitable">
<tr><td>'''dcterms:identifier'''</td><td>'''dcterms:title'''</td><td>'''dcterms:format'''</td><td>'''ac:accessURI'''</td></tr>
+
<tr><td>'''dcterms:identifier'''</td><td>'''dcterms:title'''</td><td>'''ac:variant'''</td><td>'''dcterms:format'''</td><td>'''ac:accessURI'''</td></tr>
<tr><td>http//:example.com/pictures/thePicture.jpg</td><td>A red beech leaf</td><td>jpg</td><td>http://example.com/fullres/thePicture.jpg</td></tr>
+
<tr><td><nowiki>http://example.com/pictures/thePicture.jpg</nowiki></td><td>A red beech leaf</td><td>Best Quality</td><td>jpg</td><td><nowiki>http://example.com/fullres/thePicture.jpg</nowiki></td></tr>
<tr><td>http//:example.com/pictures/thePicture.jpg</td><td></td><td>png</td><td>http://example.com/fullres/thePicture-hires.png</td></tr>
+
<tr><td><nowiki>http://example.com/pictures/thePicture.jpg</nowiki></td><td></td><td>Best Quality</td><td>png</td><td><nowiki>http://example.com/fullres/thePicture-hires.png</nowiki></td></tr>
 +
<tr><td><nowiki>http://example.com/pictures/thePicture.jpg</nowiki></td><td></td><td>Thumbnail</td><td>png</td><td><nowiki>http://example.com/thumbs/thePicture-thumb.png</nowiki></td></tr>
 +
 
 
</table>
  
In the example above, only the required identifier is repeated, but not the title field. Whether to repeat all fields or whether to provide all fields only in the first record, limiting later records to the identifier and the service access point properties, is left to specific implementations.
+
In the example above, only the required identifier is repeated, but not the title field. Whether to repeat all fields or whether to provide all fields only in the first record, limiting later records to the identifier and the service access point properties, is left to specific implementations. In the example, the hasServiceAccessPoint property is suppressed as unnecessary. Another approach reduces the need for the property
 +
when flattening the AC structure. It is based on introducing new terms that use the values of [[Audubon Core Term List#ac:variantLiteral|ac:variantLiteral]] ("Thumbnail", "Trailer", "Lower Quality", "Medium Quality", "Good Quality", "Best Quality", "Offline") as prefixes for additional properties in a new namespace, say acf (Audubon Core Flat):
 +
 
 +
<table class="wikitable">
 +
<tr><td>'''dcterms:identifier'''</td><td>'''dcterms:title'''</td><td>'''acf:thumbnailAccessURI'''</td><td>'''acf:thumbnailFormat'''</td><td>'''acf:thumbnailImageWidth'''</td><td>'''acf:thumbnailImageHeight'''</td><td>'''acf:goodQualityAccessURI'''</td><td>'''acf:goodQualityFormat'''</td><td>'''acf:goodQualityImageWidth'''</td><td>'''acf:goodQualityImageHeight'''</td><td>'''acf:bestQualityAccessURI'''</td><td>'''acf:bestQualityFormat'''</td><td>'''acf:bestQualityImageWidth'''</td><td>'''acf:bestQualityImageHeight'''
 +
</td></tr>
 +
<tr><td><nowiki>http://ex.com/pictures/thePicture.jpg</nowiki></td><td>A red beech leaf</td><td><nowiki>http://example.com/thumb/thePic.jpg</nowiki></td><td>image/jpeg</td><td>100</td><td>100</td><td><nowiki>http://ex.com/img/thePic.jpg</nowiki></td><td>image/jpeg</td><td>1000</td><td>1000</td><td><nowiki>http://ex.com/hr/thePic.png</nowiki></td><td>image/png</td><td>10000</td><td>10000</td>
 +
</tr>
 +
</table>
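
The variant-prefixed flattening shown in the table above could be produced along the following lines. This is only a sketch: the acf namespace is the suggestion made in the text, not a defined vocabulary, and the function names and the local keys accessURI, format, imageWidth and imageHeight are hypothetical choices that mirror the column names in the table.

```python
def variant_prefix(variant):
    """Turn a variant literal such as 'Best Quality' into a prefix such as 'bestQuality'."""
    words = variant.split()
    return words[0].lower() + "".join(word.capitalize() for word in words[1:])


def flatten_record(common_properties, access_points):
    """Merge one resource and its access points into a single flat row."""
    row = dict(common_properties)
    for access_point in access_points:
        prefix = variant_prefix(access_point["variant"])
        for name, value in access_point.items():
            if name == "variant":
                continue
            # For example 'accessURI' under 'Thumbnail' becomes 'acf:thumbnailAccessURI'.
            column = "acf:" + prefix + name[0].upper() + name[1:]
            if column in row:
                # Two access points with the same variant cannot share one flat row.
                raise ValueError(f"duplicate column {column}")
            row[column] = value
    return row


row = flatten_record(
    {"dcterms:identifier": "http://ex.com/pictures/thePicture.jpg",
     "dcterms:title": "A red beech leaf"},
    [
        {"variant": "Thumbnail", "accessURI": "http://example.com/thumb/thePic.jpg",
         "format": "image/jpeg", "imageWidth": 100, "imageHeight": 100},
        {"variant": "Best Quality", "accessURI": "http://ex.com/hr/thePic.png",
         "format": "image/png", "imageWidth": 10000, "imageHeight": 10000},
    ],
)
```

The sketch makes one limitation of this layout visible: a flat row can hold at most one access point per variant, so a second access point with the same variant is rejected.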
  
 
==Lists of plain text values==
Line 118: Line 138:
 
==Term List==
  
See: [[Audubon Core Term List (1.0 normative)]]
+
See: [[Audubon Core Term List]]
  
 
==Non-normative documents==
  
See: [[Audubon Core Non Normative Document]]
+
See: [[Audubon Core]] introduction; [[Audubon Core Offline Non Normative Document]]  
  
[[Category:Audubon Core]]
+
[[Class:Audubon Core]]

Latest revision as of 13:11, 5 March 2020

Warning: This is not the current version of this document. It is kept for archival purposes. For up-to-date information, go to http://rs.tdwg.org/ac/doc/structure/


Title: Audubon Core

Date: 23 October 2013.

Abstract: The Audubon Core is a set of vocabularies designed to represent metadata for biodiversity multimedia resources and collections. These vocabularies aim to represent information that will help to determine whether a particular resource or collection will be fit for some particular biodiversity science application before acquiring the media. Among others, the vocabularies address such concerns as the management of the media and collections, descriptions of their content, their taxonomic, geographic, and temporal coverage, and the appropriate ways to retrieve, attribute and reproduce them. This document contains material introductory to the Audubon Core Term List.

Contributors: Robert A. Morris, Vijay Barve, Mihail Carausu, Vishwas Chavan, Jose Cuadra, Chris Freeland, Gregor Hagedorn, Patrick Leary, Dimitry Mozzherin, Annette Olson, Greg Riccardi, Ivan Teage

Legal: This document is governed by the standard legal, copyright, licensing provisions and disclaimers issued by the Taxonomic Databases Working Group.

Part of TDWG Standard: http://www.tdwg.org/standards/638/.

Document Status: This document is a TDWG Type 1 Normative Document.

Release 1.0 of this document has wiki revision 10756 with a permalink http://terms.gbif.org/w/index.php?oldid=10756.
This document has wiki revision ID 46991 with a permalink http://terms.gbif.org/w/index.php?oldid=46991.


This is the normative documentation for the TDWG Audubon Core Multimedia Resources Metadata Standard (Audubon Core, or simply AC). During development, it was colloquially known as MRTG, after its developers, the GBIF-TDWG Joint Multimedia Resources Metadata Task Group. Please see the brief Audubon Core non-normative document and also MRTG Development History for the development history in detail.

If you are unfamiliar with the Audubon Core, please read the Audubon Core introduction before reading this page. It lays out why a need is perceived for a biodiversity media resource metadata schema, and how the standard attempts to use existing metadata standards where possible.

Terminology of this specification

There are many ways to organize metadata specifications, particularly as to the nomenclature of the constituents of the metadata. Note the following as they apply to the Audubon Core:

  • A Multimedia Resource is anything that a provider identifies as belonging to one of the possible values of the AC Type term and optionally one or more of the Subtype term values. A mechanism is provided by which providers can supply a privately defined subtype that will not collide with the AC defined Subtype values.
  • An AC record is a set of terms with any values conforming to this document, which contains at least the four mandatory terms described in the Audubon Core Term List, and which describes a single multimedia resource (possibly including a Collection). The value of one of these terms, Identifier, is a Globally Unique IDentifier (GUID), which may have been assigned to the resource by an external authority or by the provider of the metadata record.
  • AC terms are divided into two Layers. Those characterized as in Layer 1, including the four mandatory terms, should be meaningfully handled by all consuming client applications. Only wholly complete consuming applications need handle those in Layer 2. What is meant by "meaningfully handle" is up to implementers of this normative specification. It could be as simple as "gracefully ignore".

In the Audubon Core Term List, every AC term has a term name following a table entry "Term:", a URI, a plain text normative Definition, a recommended English Label, and an optional Notes attribute. In addition, a term has an attribute telling whether it is mandatory, one telling whether it is repeatable, and one telling whether it is in Layer 1 or 2. Layer 2 comprises terms likely to only occur for certain media. For example, the term DateAvailable will apply only to media that are embargoed, but for which the provider is prepared to make the metadata immediately available.

AC metadata can describe either individual multimedia resources or collections of resources. A few, but not many, of the AC properties have different values for collections than for individual media. If no such distinction is mentioned, AC does not assume one.

Term Names for terms borrowed from other vocabularies are those in use for the corresponding term in those vocabularies. Term Names are intended principally for navigation in the AC documentation. Term Labels are suggestions for English labels in applications. They are recommendations only and are offered only in English, with the added expectation that they may clarify intended usage of the term. Communities may wish to promulgate recommendations for Labels in other languages, or even alternative English Labels for specialized audiences, e.g. school children. Labels may be used for navigation within the Term List, and are often used within the Term List itself when a term is mentioned within the documentation of another term. The Term List provides indices both by name and by label.

URIs for terms conform to the http URI scheme (see http://en.wikipedia.org/wiki/URI_scheme, http://www.w3.org/TR/uri-clarification/, or http://www.ietf.org/rfc/rfc2396.txt). Informally, one may understand this as follows: an http URI has the syntax of an http URL, but there is no expectation that putting it in a web browser will result in any information being returned to the browser, and if any is returned, it may have no relevance. This conformance requirement applies only to the URIs that identify AC terms. A few AC terms permit values to be taken from another controlled vocabulary chosen by the user. In this case, those values may involve URIs conforming to a scheme given by that external vocabulary, and AC is silent on what that scheme is.

The Notes field of a term's documentation points to further information, if any exists, about the term. In particular, for terms borrowed from other vocabularies, this field generally carries a link to the originating vocabulary's documentation for that term.

Multiplicity/Cardinality

A number of terms are repeatable. How to implement repeatability in a given serialization is not defined by Audubon Core. The following section gives advice on some best practices in the context of repeatability.

The simplest case is a single repeatable term (e.g., dcterms:identifier). In representations based on an XML Schema that permits elements to be repeated such a term may simply be repeated (e.g. "...<dcterms:identifier>http://example.com/123</dcterms:identifier><dcterms:identifier>http://example.com/456</dcterms:identifier>..."). In serializations that do not easily lend themselves to repeatable elements (e.g. "flat" schemata with all elements occurring only a single time in an otherwise unstructured record) it is possible to define separators to support a list of values within a single element (e.g. "...<dcterms:identifier>http://example.com/123; http://example.com/456</dcterms:identifier>...").
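The two representations above carry the same information. As a minimal sketch (not part of the specification), the following Python reads both forms into the same list of values. The wrapper element name `record` and the choice of ";" as separator are assumptions made for illustration; Audubon Core does not prescribe either.

```python
import xml.etree.ElementTree as ET

# Namespace URI of Dublin Core terms (the dcterms: prefix).
DCTERMS = "http://purl.org/dc/terms/"


def identifiers_from_repeated(xml_text):
    """Collect the text of every repeated dcterms:identifier element."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter(f"{{{DCTERMS}}}identifier")]


def identifiers_from_flat(value, separator=";"):
    """Split a single flat field into its values (separator is an assumption)."""
    return [part.strip() for part in value.split(separator) if part.strip()]


# Hypothetical wrapper element "record"; only the dcterms elements come from the text.
REPEATED = (
    '<record xmlns:dcterms="http://purl.org/dc/terms/">'
    "<dcterms:identifier>http://example.com/123</dcterms:identifier>"
    "<dcterms:identifier>http://example.com/456</dcterms:identifier>"
    "</record>"
)
FLAT = "http://example.com/123; http://example.com/456"

if __name__ == "__main__":
    print(identifiers_from_repeated(REPEATED))
    print(identifiers_from_flat(FLAT))
```

A plain split like this is only safe when the separator cannot occur inside a value; see the section on lists of plain text values for escaping.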

In certain cases pairs or tuples of properties are repeated. In Audubon Core this situation occurs, for example, in the following cases:

  • Language-dependent metadata such as title, description, etc. need to be associated with ac:metadataLanguage. One approach here is to use complete Audubon Core records together with the Metadata Language property; see that term for further detail.
  • The values of properties about a Service Access Point must remain associated with that Service Access Point even if there are multiple Service Access Points. See hasServiceAccessPoint for further details.
  • The terms dwc:scientificName and dwc:identificationQualifier may optionally be structured into pairs. (See the notes on dwc:identificationQualifier.)
  • The term Reviewer, the name of an individual providing some expert review of a resource, and the review text itself in Reviewer Comments are best stored as pairs.

Many serialization languages provide sufficiently structured forms to deal with repeated terms unambiguously. For example, in XML one might define a container element and use a nesting structure something like this:

<MEDIA_METADATA_CONTAINER>
  <dcterms:identifier>http://example.com/pictures/thePicture.jpg</dcterms:identifier>
  ...
  <ac:hasServiceAccessPoint>
    <dcterms:format>jpg</dcterms:format>
    <ac:accessURI>http://example.com/fullres/thePicture.jpg</ac:accessURI>
    ...
  </ac:hasServiceAccessPoint>
  <ac:hasServiceAccessPoint>
    ...
  </ac:hasServiceAccessPoint>
</MEDIA_METADATA_CONTAINER>

Another example may reference access points by identifier:

<MEDIA_METADATA_CONTAINER>
  <dcterms:identifier>http://example.com/pictures/thePicture.jpg</dcterms:identifier>
  ...
  <ac:hasServiceAccessPoint>http://example.com/pictures/thePicture.jpg#ac0001</ac:hasServiceAccessPoint>
  <ac:hasServiceAccessPoint>http://example.com/pictures/thePicture.jpg#ac0002</ac:hasServiceAccessPoint>
  <ac-classes:ServiceAccessPoint id="http://example.com/pictures/thePicture.jpg#ac0001">
    <dcterms:format>jpg</dcterms:format>
    <ac:accessURI>http://example.com/fullres/thePicture.jpg</ac:accessURI>
    ...
  </ac-classes:ServiceAccessPoint>
  ...
</MEDIA_METADATA_CONTAINER>

Note: the ac-classes prefix in ac-classes:ServiceAccessPoint stands for an illustrative namespace. Namespace recommendations will be made when the normative documents are approved.
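A consumer of the by-reference form has to join each ac:hasServiceAccessPoint value to the element carrying the matching id. The following Python is a minimal sketch of that lookup, not a definitive implementation. The namespace URI bound to ac-classes is purely a placeholder (the prefix is illustrative, as noted above), and the second access point in the sample data is added here for illustration.

```python
import xml.etree.ElementTree as ET

DCTERMS = "http://purl.org/dc/terms/"
AC = "http://rs.tdwg.org/ac/terms/"
ACC = "http://example.org/ac-classes/"  # placeholder URI for the illustrative prefix

EXAMPLE = f"""
<MEDIA_METADATA_CONTAINER xmlns:dcterms="{DCTERMS}" xmlns:ac="{AC}"
                          xmlns:ac-classes="{ACC}">
  <dcterms:identifier>http://example.com/pictures/thePicture.jpg</dcterms:identifier>
  <ac:hasServiceAccessPoint>http://example.com/pictures/thePicture.jpg#ac0001</ac:hasServiceAccessPoint>
  <ac:hasServiceAccessPoint>http://example.com/pictures/thePicture.jpg#ac0002</ac:hasServiceAccessPoint>
  <ac-classes:ServiceAccessPoint id="http://example.com/pictures/thePicture.jpg#ac0001">
    <dcterms:format>jpg</dcterms:format>
    <ac:accessURI>http://example.com/fullres/thePicture.jpg</ac:accessURI>
  </ac-classes:ServiceAccessPoint>
  <ac-classes:ServiceAccessPoint id="http://example.com/pictures/thePicture.jpg#ac0002">
    <dcterms:format>png</dcterms:format>
    <ac:accessURI>http://example.com/fullres/thePicture-hires.png</ac:accessURI>
  </ac-classes:ServiceAccessPoint>
</MEDIA_METADATA_CONTAINER>
"""


def resolve_access_points(xml_text):
    """Return the properties of each referenced access point, in reference order."""
    root = ET.fromstring(xml_text)
    # Index the access point elements by their id attribute.
    by_id = {sap.get("id"): sap for sap in root.iter(f"{{{ACC}}}ServiceAccessPoint")}
    resolved = []
    for ref in root.findall(f"{{{AC}}}hasServiceAccessPoint"):
        sap = by_id.get((ref.text or "").strip())
        if sap is None:
            continue  # dangling reference: nothing to resolve
        resolved.append({
            "format": sap.findtext(f"{{{DCTERMS}}}format"),
            "accessURI": sap.findtext(f"{{{AC}}}accessURI"),
        })
    return resolved


if __name__ == "__main__":
    for access_point in resolve_access_points(EXAMPLE):
        print(access_point)
```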

Where such structure is impossible or undesirable, an alternative solution is to permit only one access point per MEDIA_METADATA_CONTAINER, but to repeat the main MEDIA_METADATA_CONTAINER for a single media resource. This is similar to one of the options discussed for multilingual metadata (see Metadata Language). An XML example:

<MEDIA_METADATA_CONTAINER>
  <dcterms:identifier>http://example.com/pictures/thePicture.jpg</dcterms:identifier>
  <dcterms:title>A red beech leaf</dcterms:title>
  <dcterms:format>jpg</dcterms:format>
  <ac:accessURI>http://example.com/fullres/thePicture.jpg</ac:accessURI>
  ...
</MEDIA_METADATA_CONTAINER>
<MEDIA_METADATA_CONTAINER>
  <dcterms:identifier>http://example.com/pictures/thePicture.jpg</dcterms:identifier>
  <dcterms:format>png</dcterms:format>
  <ac:accessURI>http://example.com/fullres/thePicture-hires.png</ac:accessURI>
  ...
</MEDIA_METADATA_CONTAINER>

The same example as a spreadsheet-like table:

| dcterms:identifier | dcterms:title | ac:variant | dcterms:format | ac:accessURI |
| http://example.com/pictures/thePicture.jpg | A red beech leaf | Best Quality | jpg | http://example.com/fullres/thePicture.jpg |
| http://example.com/pictures/thePicture.jpg | | Best Quality | png | http://example.com/fullres/thePicture-hires.png |
| http://example.com/pictures/thePicture.jpg | | Thumbnail | png | http://example.com/thumbs/thePicture-thumb.png |
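The table above can be produced mechanically from a structured record. The following Python is a minimal sketch under stated assumptions: the in-memory layout (a dict with an "accessPoints" list) and the `repeat_all_fields` switch are hypothetical and chosen only to make the trade-off visible; CSV is used simply as one common spreadsheet-like format.

```python
import csv
import io

COLUMNS = ["dcterms:identifier", "dcterms:title", "ac:variant",
           "dcterms:format", "ac:accessURI"]

# Hypothetical in-memory form of the example record.
RECORD = {
    "dcterms:identifier": "http://example.com/pictures/thePicture.jpg",
    "dcterms:title": "A red beech leaf",
    "accessPoints": [
        {"ac:variant": "Best Quality", "dcterms:format": "jpg",
         "ac:accessURI": "http://example.com/fullres/thePicture.jpg"},
        {"ac:variant": "Best Quality", "dcterms:format": "png",
         "ac:accessURI": "http://example.com/fullres/thePicture-hires.png"},
        {"ac:variant": "Thumbnail", "dcterms:format": "png",
         "ac:accessURI": "http://example.com/thumbs/thePicture-thumb.png"},
    ],
}


def flatten(record, repeat_all_fields=False):
    """Emit one row per access point; the identifier is repeated in every row."""
    rows = []
    for index, access_point in enumerate(record["accessPoints"]):
        row = {"dcterms:identifier": record["dcterms:identifier"]}
        # Resource-level fields go in the first row only, unless asked otherwise.
        if index == 0 or repeat_all_fields:
            row["dcterms:title"] = record.get("dcterms:title", "")
        row.update(access_point)
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialize rows as CSV text; missing cells are written as empty strings."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(flatten(RECORD)))
```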

In the example above, only the required identifier is repeated; the title field is not. Whether to repeat all fields, or to provide all fields only in the first record and limit later records to the identifier and the service access point properties, is left to specific implementations. In the example, the hasServiceAccessPoint property is omitted as unnecessary. Another approach to flattening the AC structure also reduces the need for this property. It introduces new terms that use the values of ac:variantLiteral ("Thumbnail", "Trailer", "Lower Quality", "Medium Quality", "Good Quality", "Best Quality", "Offline") as prefixes for additional properties in a new namespace, say acf (Audubon Core Flat):

| dcterms:identifier | dcterms:title | acf:thumbnailAccessURI | acf:thumbnailFormat | acf:thumbnailImageWidth | acf:thumbnailImageHeight | acf:goodQualityAccessURI | acf:goodQualityFormat | acf:goodQualityImageWidth | acf:goodQualityImageHeight | acf:bestQualityAccessURI | acf:bestQualityFormat | acf:bestQualityImageWidth | acf:bestQualityImageHeight |
| http://ex.com/pictures/thePicture.jpg | A red beech leaf | http://example.com/thumb/thePic.jpg | image/jpeg | 100 | 100 | http://ex.com/img/thePic.jpg | image/jpeg | 1000 | 1000 | http://ex.com/hr/thePic.png | image/png | 10000 | 10000 |
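The column names in this flat layout follow a regular pattern: the variant value in lower camel case, followed by the capitalized property name. The following Python is a minimal sketch of that derivation. The acf namespace is only a suggestion in the text above, and the local property names used here (accessURI, format, imageWidth, imageHeight) are inferred from the table's column names, so treat them as assumptions. The sketch also assumes at most one access point per variant.

```python
def variant_prefix(variant):
    """Turn a variant literal such as "Best Quality" into "bestQuality"."""
    words = variant.split()
    return words[0].lower() + "".join(word.capitalize() for word in words[1:])


def flat_column(variant, local_name):
    """Build a flat column name, e.g. ("Thumbnail", "format") -> acf:thumbnailFormat."""
    return f"acf:{variant_prefix(variant)}{local_name[0].upper()}{local_name[1:]}"


def flatten_by_variant(identifier, title, access_points):
    """Build one flat row; access_points maps a variant literal to its properties."""
    row = {"dcterms:identifier": identifier, "dcterms:title": title}
    for variant, properties in access_points.items():
        for local_name, value in properties.items():
            row[flat_column(variant, local_name)] = value
    return row


# Values taken from the table above; the dict layout itself is hypothetical.
ROW = flatten_by_variant(
    "http://ex.com/pictures/thePicture.jpg",
    "A red beech leaf",
    {
        "Thumbnail": {"accessURI": "http://example.com/thumb/thePic.jpg",
                      "format": "image/jpeg", "imageWidth": 100, "imageHeight": 100},
        "Best Quality": {"accessURI": "http://ex.com/hr/thePic.png",
                         "format": "image/png", "imageWidth": 10000, "imageHeight": 10000},
    },
)

if __name__ == "__main__":
    for column, value in ROW.items():
        print(column, value)
```

A resource with two access points of the same variant (for example, "Best Quality" in both jpg and png) cannot be represented this way without further conventions.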

Lists of plain text values

Some AC terms permit values that are lists to be represented as plain text. The choice of how to separate list items is ultimately left to the implementers of AC. Typical usage is to choose a punctuation mark such as ",", ";", or "|". In these cases a special escape syntax needs to be defined for cases in which the separator is part of the metadata value. Unfortunately, even for standard list formats like CSV, different software packages choose different escape methods, hindering interchange. In the absence of an implementation-specific choice, we recommend using "|" as the separator and "\|" as the escaped vertical bar.
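The recommended convention can be sketched in a few lines of Python. This is an illustration only, and the function names are hypothetical. It implements exactly the rule stated above ("|" separates, "\|" is a literal bar) and nothing more. In particular, it does not define an escape for the backslash itself, so a value that ends in a backslash is ambiguous under this rule; implementers who expect such values would need an additional convention.

```python
def join_list(values, separator="|", escape="\\"):
    """Join values into one string, escaping any separator inside a value."""
    return separator.join(
        value.replace(separator, escape + separator) for value in values
    )


def split_list(text, separator="|", escape="\\"):
    """Split on unescaped separators and unescape the escaped ones."""
    items, current, i = [], [], 0
    while i < len(text):
        if text.startswith(escape + separator, i):
            current.append(separator)          # escaped bar: keep it as data
            i += len(escape) + len(separator)
        elif text.startswith(separator, i):
            items.append("".join(current))     # unescaped bar: end of item
            current = []
            i += len(separator)
        else:
            current.append(text[i])
            i += 1
    items.append("".join(current))
    return items


if __name__ == "__main__":
    encoded = join_list(["leaf", "red|green", "beech"])
    print(encoded)
    print(split_list(encoded))
```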

Term List

See: Audubon Core Term List

Non-normative documents

See: Audubon Core introduction; Audubon Core Offline Non Normative Document