Difference between revisions of "Audubon Core Structure"

Line 1: Line 1:
<!-- IMPORTANT: WHEN YOU WANT TO EDIT, UNCOMMENT THE WIP CALL BELOW THIS AND SAVE, THEN REOPEN THE EDIT. THIS HELPS ASSURE EDITS WON'T COLLIDE  
+
<!-- IMPORTANT:  
{{WIP | end =  | user = {{REVISIONUSER}} }} -->
+
  WHEN YOU WANT TO EDIT, UNCOMMENT THE WIP CALL BELOW THIS AND SAVE,
'''Title:''' Audubon Core
+
  THEN REOPEN THE EDIT. THIS HELPS ASSURE EDITS WON'T COLLIDE  
 +
{{WIP | end =  | user = {{REVISIONUSER}} }}  
 +
-->
 +
{{Concept scheme
 +
|description= The <b>Audubon Core</b> is a set of vocabularies designed to represent metadata for biodiversity multimedia resources and collections. These vocabularies aim to represent information that will help to determine whether a particular resource or collection will be fit for some particular biodiversity science application before acquiring the media. Among others, the vocabularies address such concerns as the management of the media and collections, descriptions of their content, their taxonomic, geographic, and temporal coverage, and the appropriate ways to retrieve, attribute and reproduce them.
 +
|notes= An Audubon Core (AC) record is a description, using the Audubon Core vocabularies, of a multimedia resource. Two kinds of terms are specified by this document: record-level terms and access-level terms. Terms under the Record-level Terms section apply to the whole Audubon Core record regardless of the record type. Almost all terms are record-level terms. One such term, hasServiceAccessPoint plays a special role in helping to retrieve the resource that the record describes. Its value is an identifier of a set of more metadata about services that may provide web or other access to a file containing the media. A multimedia resource may describe more than one such service, each of which is described by values of one or more access-level terms, telling such things as a web address at which the resource from which a digital representation of the resource can be retrieved, the size of such a retrieved object, etc.
  
'''Date:''' TBD. This document is a proposal
+
This document is governed by the standard legal, copyright, licensing provisions and disclaimers issued by the Taxonomic Databases Working Group.
  
'''Abstract:''' The Audubon Core is a set of vocabularies designed to represent metadata for biodiversity multimedia resources and collections. These vocabularies aim to represent information that will help to determine whether a particular resource or collection will be fit for some particular biodiversity science application before acquiring the media. Among others, the vocabularies address such concerns as the management of the media and collections, descriptions of their content, their taxonomic, geographic, and temporal coverage, and the appropriate ways to retrieve, attribute and reproduce them. This document contains material introductory to the '''[[Audubon Core Term List]]'''
+
The terms defined in this concept schema are conveniently presented in the <b>[[Audubon Core Term List]]</b>. <b>See also:</b> [[Audubon Core Non Normative Document]].
 +
|creators=
 +
|contributors=  Robert A. Morris, Vijay Barve, Mihail Carausu, Vishwas Chavan, Jose Cuadra, Chris Freeland, Gregor Hagedorn, Patrick Leary, Dimitry Mozzherin, Annette Olson, Greg Riccardi, Ivan Teage
 +
|preferred namespace prefix= ac
 +
|preferred namespace uri= http://rs.tdwg.org/ac/terms/
 +
}}
  
'''Contributors:''' Robert A. Morris, Vijay Barve, Mihail Carausu, Vishwas Chavan, Jose Cuarda, Chris Freeland, Gregor Hagedorn, Patrick Leary, Dimitry Mozzherin, Annette Olson, Greg Riccardi, Ivan Teage
 
  
'''Legal:''' This document is governed by the standard legal, copyright, licensing provisions and disclaimers issued by the Taxonomic Databases Working Group.
+
----
  
'''Part of TDWG Standard:''' TBD
 
 
'''See also:''' The [[Audubon Core Term List]] and the [[Audubon Core Non Normative Document]]
 
  
 
This document has wiki revision ID {{REVISIONID}} with a permalink http://www.species-id.net/w/index.php?oldid={{REVISIONID}}.<br/>
 
This document has wiki revision ID {{REVISIONID}} with a permalink http://www.species-id.net/w/index.php?oldid={{REVISIONID}}.<br/>
Line 19: Line 25:
  
 
'''This is a normative document.'''
 
'''This is a normative document.'''
 
Here is the '''[[Audubon Core Term List]]'''.
 
  
 
__TOC__ <!-- table of contents should appear here -->
 
__TOC__ <!-- table of contents should appear here -->
Line 29: Line 33:
  
 
==Related Information==
 
==Related Information==
 
 
 
* See also discussion of the'''[http://www.keytonature.eu/wiki/XML_Schema_For_MRTG XML Schema For MRTG]'''. This is obsolete and will be revised for compliance with the normative version when it is approved.
 
* See also discussion of the'''[http://www.keytonature.eu/wiki/XML_Schema_For_MRTG XML Schema For MRTG]'''. This is obsolete and will be revised for compliance with the normative version when it is approved.
 
* See also discussion of the '''[http://www.keytonature.eu/wiki/MRTG_in_RDF MRTG in RDF].''' This is obsolete and will be revised for compliance with the normative version when it is approved.
 
* See also discussion of the '''[http://www.keytonature.eu/wiki/MRTG_in_RDF MRTG in RDF].''' This is obsolete and will be revised for compliance with the normative version when it is approved.
Line 53: Line 55:
 
The Details field of a term's documentation points to further information, if any exists, about the term. In particular, for terms borrowed from other vocabularies, this field generally carries a link to the originating vocabulary's documentation for that term.
 
The Details field of a term's documentation points to further information, if any exists, about the term. In particular, for terms borrowed from other vocabularies, this field generally carries a link to the originating vocabulary's documentation for that term.
  
==Multiplicity==
+
==Multiplicity/Cardinality==
A number of terms do not permit repetition within a single AC metadata record, sometimes in cases where there seems a compelling case for more than one value for a term.  A typical example is the term [[#Reviewer| Reviewer]], being the name of an individual providing some natural language expert review of a resource, typically given as the value of the term [[#Reviewer_Comments | Reviewer Comments]].   Clearly, more than one person might provide such a review, but in this case something has to specify which reviewer made which comments. There are several solutions to this kind of problem, which is endemic to many kinds of descriptive metadata. One is to have some kind of linkage between the related terms. This alone is not sufficient, as we see from this example: a reviewer might make several comments, several reviewers might together make several comments, and some other group of reviewers might make some other group of comments, etc. This the linkage approach requires not only linkages between many terms, but also container objects holding multiples, and linkages between those containers and other objects. Other even more complex ways to allow aggregation of terms to provide relations between them are possible. Instead the Audubon Core is an entirely flat terminology (with the sole exception of the [[#Access_Point |Access Point]]) but this does ''not'' preclude multiple values. At the small cost of having to repeat some of the term values (including the required ones), a description can contain several metadata records for the same Multimedia Resource. Consider for example the following pseudo-code in an imaginary spreadsheet implementation of AC. In this example, the implementing language requires an additional spreadsheet term named EOR ("End of Record") separating several metadata records for the same image (because they have the same Identifier) from one another. Note that this entire issue is principally an issue for machine parsing of attribute values. 
Nothing in AC prevents a metadata provider from having a well documented punctuation mechanism suited for human readability that, for example, allows both a reviewer comment and reviewer name in the same text, and even multiple such pairs all in a single Reviewer_Comments. In that context, the AC multi-record approach illustrated below may be regarded as a standardized way of atomizing such a punctuation convention, and mapping back and forth between this atomization and a punctuation based single piece of text would not be difficult.
+
A number of terms do not permit repetition within a single AC metadata record, sometimes in cases where there seems a compelling case for more than one value for a term.  A typical example is the term [[#Reviewer| Reviewer]], being the name of an individual providing some natural language expert review of a resource, typically given as the value of the term [[#Reviewer_Comments | Reviewer Comments]]. Clearly, more than one person might provide such a review, but in this case something has to specify which reviewer made which comments. There are several solutions to this kind of problem, which is endemic to many kinds of descriptive metadata. One is to have some kind of linkage between the related terms. This alone is not sufficient, as we see from this example: a reviewer might make several comments, several reviewers might together make several comments, and some other group of reviewers might make some other group of comments, etc. This the linkage approach requires not only linkages between many terms, but also container objects holding multiples, and linkages between those containers and other objects. Other even more complex ways to allow aggregation of terms to provide relations between them are possible. Instead the Audubon Core is an entirely flat terminology (with the sole exception of the [[#Access_Point |Access Point]]) but this does ''not'' preclude multiple values. At the small cost of having to repeat some of the term values (including the required ones), a description can contain several metadata records for the same Multimedia Resource. Consider for example the following pseudo-code in an imaginary spreadsheet implementation of AC. In this example, the implementing language requires an additional spreadsheet term named EOR ("End of Record") separating several metadata records for the same image (because they have the same Identifier) from one another. Note that this entire issue is principally an issue for machine parsing of attribute values. 
Nothing in AC prevents a metadata provider from having a well documented punctuation mechanism suited for human readability that, for example, allows both a reviewer comment and reviewer name in the same text, and even multiple such pairs all in a single Reviewer_Comments. In that context, the AC multi-record approach illustrated below may be regarded as a standardized way of atomizing such a punctuation convention, and mapping back and forth between this atomization and a punctuation based single piece of text would not be difficult.
 +
 
 
{| border="1" summary="Example spreadsheet implementation with two metadata records for the same image"
 
{| border="1" summary="Example spreadsheet implementation with two metadata records for the same image"
 
|+ align="bottom" " |''Example spreadsheet implementation with two metadata records for the same image''
 
|+ align="bottom" " |''Example spreadsheet implementation with two metadata records for the same image''
Line 71: Line 74:
 
|EOR || Yes
 
|EOR || Yes
 
|-
 
|-
|dcterms:dentifier || http://bit.ly/pwrmwD
+
|dcterms:dentifier || http://bit.ly/pwrmwD
 
|-
 
|-
|dcterms:type || image
+
|dcterms:type || image
 
|-
 
|-
 
|… (required terms only)|| …
 
|… (required terms only)|| …
 
|-
 
|-
|reviewer || Jonathan Smythe
+
|reviewer || Jonathan Smythe
 
|-
 
|-
|reviewerComments || Colors are inaccurate
+
|reviewerComments || Colors are inaccurate
 
|-
 
|-
 
|EOR || Yes
 
|EOR || Yes
Line 86: Line 89:
 
==Plain Text Lists==
 
==Plain Text Lists==
  
Some AC terms permit values that are lists to be represented as plain text. The choice of how to
+
Some AC terms permit values that are lists to be represented as plain text. The choice of how to separate list items is necessarily left to the implementors of AC. For example, an XML implementation of AC might choose to use standard XML container methods whereas an implementer of a spreadsheet version, in which cells may contain lists, might specify a punctuation mark, e.g. comma, and supply some special escape syntax for use when the comma is part of the metadata value. An implementation might even make different such choices depending on the term involved, the languages supported, etc.
separate list items is necessarily left to the implementors of AC. For
+
example, an XML implementation of AC might choose to use standard XML
+
container methods whereas an implementer of a spreadsheet version, in which
+
cells may contain lists, might specify a punctuation
+
mark, e.g. comma, and supply some special escape syntax for use when
+
the comma is part of the metadata value. An implementation might even
+
make different such choices depending on the term involved, the languages supported, etc.
+
  
 
==Term List==
 
==Term List==
Go here: [[Audubon Core Term List]]
+
See: [[Audubon Core Term List]]
  
 
[[Category:Audubon Core]]
 
[[Category:Audubon Core]]

Revision as of 22:48, 2 October 2012

The Audubon Core is a set of vocabularies designed to represent metadata for biodiversity multimedia resources and collections. These vocabularies aim to represent information that will help to determine whether a particular resource or collection will be fit for some particular biodiversity science application before acquiring the media. Among others, the vocabularies address such concerns as the management of the media and collections, descriptions of their content, their taxonomic, geographic, and temporal coverage, and the appropriate ways to retrieve, attribute and reproduce them.

Notes: An Audubon Core (AC) record is a description, using the Audubon Core vocabularies, of a multimedia resource. Two kinds of terms are specified by this document: record-level terms and access-level terms. Terms under the Record-level Terms section apply to the whole Audubon Core record regardless of the record type. Almost all terms are record-level terms. One such term, hasServiceAccessPoint, plays a special role in helping to retrieve the resource that the record describes. Its value is an identifier of a set of further metadata about services that may provide web or other access to a file containing the media. A record may describe more than one such service, each of which is described by values of one or more access-level terms giving such things as a web address from which a digital representation of the resource can be retrieved, the size of such a retrieved object, etc.

This document is governed by the standard legal, copyright, licensing provisions and disclaimers issued by the Taxonomic Databases Working Group.

The terms defined in this concept schema are conveniently presented in the Audubon Core Term List. See also: Audubon Core Non Normative Document.

Contributors: Robert A. Morris, Vijay Barve, Mihail Carausu, Vishwas Chavan, Jose Cuadra, Chris Freeland, Gregor Hagedorn, Patrick Leary, Dimitry Mozzherin, Annette Olson, Greg Riccardi, Ivan Teage

Namespace URI: http://rs.tdwg.org/ac/terms/ with preferred namespace prefix “ac”.



 



No Collections defined yet in scheme “Audubon Core Structure“.




This document has wiki revision ID 3856 with a permalink http://www.species-id.net/w/index.php?oldid=3856.
The version under consideration by TDWG has permalink http://www.species-id.net/w/index.php?oldid=25835.

This is a normative document.

This is the normative documentation for the TDWG Audubon Core Multimedia Resources Metadata Standard (Audubon Core, or simply AC). During development, it was colloquially known as MRTG, after its developers, the GBIF-TDWG Joint Multimedia Resources Metadata Task Group. Please see the brief Audubon Core Non Normative Document, and also MRTG Development History for the development history in detail.

If you are unfamiliar with the Audubon Core, please read the Audubon Core Non Normative Document before editing this page. It lays out why a biodiversity media resource metadata schema is perceived to be needed, and how the standard attempts to use existing metadata standards where possible.

Related Information

  • See also discussion of the XML Schema For MRTG. This is obsolete and will be revised for compliance with the normative version when it is approved.
  • See also discussion of the MRTG in RDF. This is obsolete and will be revised for compliance with the normative version when it is approved.
  • TDWG09 MRTG WORKGROUP REPORT

Terminology of this specification

There are many ways to organize metadata specifications, particularly as to the nomenclature of the constituents of the metadata. In this document and the associated non-normative documentation, we will follow closely (sometimes verbatim) a portion of the Dublin Core Metadata Initiative (DCMI) metadata nomenclature as described in Section 2.3 of the DCMI Abstract Model (http://www.dublincore.org/documents/abstract-model/). In addition:

  • A Multimedia Resource is anything that a provider identifies as belonging to one of the possible values of the AC Type term and one of the Subtype term values. A mechanism is provided by which providers can supply a privately defined subtype that will not collide with the AC defined Subtype values.
  • An AC record is a set of terms, with values conforming to this document, that contains at least the four mandatory terms described in the Audubon_Core_Term_List and that describes a single multimedia resource (possibly including a Collection). The value of one of these terms, Identifier, is a Globally Unique IDentifier (GUID), which may have been assigned to the resource by an external authority or by the provider of the metadata record.
  • AC terms are divided into two Layers. Those characterized as in the Core Layer, including the five mandatory terms, should be meaningfully handled by all consuming client applications. Only wholly complete consuming applications need handle those in the Extended Layer. What is meant by "meaningfully handle" is up to implementers of this normative specification. It could be as simple as "gracefully ignore".

In the Audubon_Core_Term_List, every AC term has a term name following a table entry "Term:", a URI, a plain text normative Definition, a recommended English Label, an optional Details attribute, and an optional Comments attribute. In addition, a term has an attribute telling whether it is mandatory, one telling whether it is repeatable, and one telling whether it is in the Core or Extended Layer. The Extended Layer comprises terms likely to only occur for certain media. For example, the term DateAvailable will apply only to media that are embargoed, but for which the provider is prepared to make the metadata immediately available.
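The attributes listed above can be pictured as a small data structure. The following is only a sketch under stated assumptions: the class name `Term`, the helper `non_repeatable_violations`, the sample URI, and the sample entry's Layer and mandatory values are hypothetical illustrations, not values taken from the Audubon_Core_Term_List.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Namespace URI stated in this document.
AC_NAMESPACE = "http://rs.tdwg.org/ac/terms/"


@dataclass(frozen=True)
class Term:
    """One term-list entry with the attributes this document names."""
    name: str          # term name, used mainly for navigation
    uri: str           # http URI identifying the term
    definition: str    # plain text normative definition
    label: str         # recommended English label
    mandatory: bool
    repeatable: bool
    layer: str         # "Core" or "Extended"
    details: Optional[str] = None   # optional pointer to further information
    comments: Optional[str] = None  # optional comments

    def __post_init__(self):
        if self.layer not in ("Core", "Extended"):
            raise ValueError(f"unknown layer: {self.layer!r}")


def non_repeatable_violations(rows, terms):
    """Return names of non-repeatable terms that occur more than once.

    rows  -- (term name, value) pairs of a single metadata record
    terms -- mapping from term name to Term
    """
    counts = Counter(name for name, _ in rows)
    return sorted(
        name for name, n in counts.items()
        if n > 1 and name in terms and not terms[name].repeatable
    )


# Hypothetical entry: the URI, layer, and mandatory flag are assumptions.
REVIEWER = Term(
    name="Reviewer",
    uri=AC_NAMESPACE + "reviewer",
    definition="Name of an individual providing an expert review of a resource.",
    label="Reviewer",
    mandatory=False,
    repeatable=False,
    layer="Extended",
)
```

A consuming application could run `non_repeatable_violations` on each record before accepting it; whether to reject or merely warn is left to the implementer.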

AC metadata can describe either individual multimedia resources or collections of resources. A few, but not many, of the AC properties have different values for collections than for individual media. If no such distinction is mentioned, AC does not assume one.

Term Names for terms borrowed from other vocabularies are those in use for the corresponding term in those vocabularies. Term names are intended principally for navigation in this document. Term Labels are suggestions for English labels in applications. They are recommendations only and are offered only in English, with the added expectation that they may clarify intended usage of the term. Communities may wish to promulgate recommendations for Labels in other languages, or even alternative English Labels for specialized audiences, e.g. school children.

URIs for terms conform to the http URI scheme (see http://en.wikipedia.org/wiki/URI_scheme, http://www.w3.org/TR/uri-clarification/, or http://www.ietf.org/rfc/rfc2396.txt). Informally, one may understand this as follows: an http URI has the syntax of an http URL, but there is no expectation that putting it in a web browser will result in any information being returned to the browser, and if any is returned, it may have no relevance. This conformance requirement applies only to the URIs that identify AC terms. A few AC terms permit values to be taken from another controlled vocabulary chosen by the user. In this case, those values may involve URIs conforming to a scheme given by that external vocabulary, and AC is silent on what that scheme is.

The Details field of a term's documentation points to further information, if any exists, about the term. In particular, for terms borrowed from other vocabularies, this field generally carries a link to the originating vocabulary's documentation for that term.

Multiplicity/Cardinality

A number of terms do not permit repetition within a single AC metadata record, sometimes in cases where there seems to be a compelling case for more than one value for a term. A typical example is the term Reviewer, the name of an individual providing some natural language expert review of a resource, with the review itself typically given as the value of the term Reviewer Comments. Clearly, more than one person might provide such a review, but in this case something has to specify which reviewer made which comments.

There are several solutions to this kind of problem, which is endemic to many kinds of descriptive metadata. One is to have some kind of linkage between the related terms. This alone is not sufficient, as we see from this example: a reviewer might make several comments, several reviewers might together make several comments, and some other group of reviewers might make some other group of comments, etc. Thus the linkage approach requires not only linkages between many terms, but also container objects holding multiples, and linkages between those containers and other objects. Other, even more complex ways to aggregate terms and express relations between them are possible.

Instead, the Audubon Core is an entirely flat terminology (with the sole exception of the Access Point), but this does not preclude multiple values. At the small cost of having to repeat some of the term values (including the required ones), a description can contain several metadata records for the same Multimedia Resource. Consider, for example, the following pseudo-code in an imaginary spreadsheet implementation of AC. In this example, the implementing language requires an additional spreadsheet term named EOR ("End of Record") to separate from one another the several metadata records that describe the same image (they share the same Identifier).

Note that this entire issue is principally an issue for machine parsing of attribute values. Nothing in AC prevents a metadata provider from having a well-documented punctuation mechanism suited for human readability that, for example, allows both a reviewer comment and reviewer name in the same text, and even multiple such pairs all in a single Reviewer_Comments. In that context, the AC multi-record approach illustrated below may be regarded as a standardized way of atomizing such a punctuation convention, and mapping back and forth between this atomization and a punctuation-based single piece of text would not be difficult.

Example spreadsheet implementation with two metadata records for the same image

Term                       Value
dcterms:identifier         http://bit.ly/pwrmwD
dcterms:type               image
… (many terms)             …
reviewer                   Susan Grogan
comments                   Excellent Image
EOR                        Yes
dcterms:identifier         http://bit.ly/pwrmwD
dcterms:type               image
… (required terms only)    …
reviewer                   Jonathan Smythe
reviewerComments           Colors are inaccurate
EOR                        Yes
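
As a rough illustration of how a consumer might read such a sheet, the following sketch splits the rows into records at each EOR row and then groups the records by Identifier. The function names and the in-memory `ROWS` list are hypothetical; the elided "…" rows are simply left out, and the identifier term is spelled dcterms:identifier in both records.

```python
from collections import defaultdict

# Rows of the imaginary spreadsheet as (term, value) pairs.
# The elided "…" rows of the example are omitted.
ROWS = [
    ("dcterms:identifier", "http://bit.ly/pwrmwD"),
    ("dcterms:type", "image"),
    ("reviewer", "Susan Grogan"),
    ("comments", "Excellent Image"),
    ("EOR", "Yes"),
    ("dcterms:identifier", "http://bit.ly/pwrmwD"),
    ("dcterms:type", "image"),
    ("reviewer", "Jonathan Smythe"),
    ("reviewerComments", "Colors are inaccurate"),
    ("EOR", "Yes"),
]


def split_records(rows):
    """Split (term, value) rows into one dict per record at each EOR row."""
    records, current = [], {}
    for term, value in rows:
        if term == "EOR":
            if current:
                records.append(current)
            current = {}
        else:
            current[term] = value
    if current:  # tolerate a missing final EOR row
        records.append(current)
    return records


def group_by_identifier(records):
    """Collect all records that describe the same resource."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["dcterms:identifier"]].append(record)
    return dict(grouped)


if __name__ == "__main__":
    for identifier, recs in group_by_identifier(split_records(ROWS)).items():
        print(identifier, [r["reviewer"] for r in recs])
```

Because each record carries exactly one reviewer alongside that reviewer's remarks, the pairing survives without any linkage or container objects.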

Plain Text Lists

Some AC terms permit values that are lists to be represented as plain text. The choice of how to separate list items is necessarily left to the implementors of AC. For example, an XML implementation of AC might choose to use standard XML container methods whereas an implementer of a spreadsheet version, in which cells may contain lists, might specify a punctuation mark, e.g. comma, and supply some special escape syntax for use when the comma is part of the metadata value. An implementation might even make different such choices depending on the term involved, the languages supported, etc.
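
One possible spreadsheet convention, sketched here purely as an assumption, uses a comma as the separator and a backslash as the escape character. AC prescribes neither; `split_list` and `join_list` are hypothetical helper names.

```python
def split_list(text, sep=",", escape="\\"):
    """Split a plain-text list, honouring escaped separators.

    Whitespace around each item is stripped.
    """
    items, current = [], []
    chars = iter(text)
    for ch in chars:
        if ch == escape:
            # Take the next character literally; a trailing escape is kept.
            current.append(next(chars, escape))
        elif ch == sep:
            items.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    items.append("".join(current).strip())
    return items


def join_list(items, sep=",", escape="\\"):
    """Inverse of split_list: escape separators inside items, then join."""
    escaped = [
        item.replace(escape, escape * 2).replace(sep, escape + sep)
        for item in items
    ]
    return (sep + " ").join(escaped)


if __name__ == "__main__":
    cell = join_list(["Smith, J.", "Jones"])
    print(cell)
    print(split_list(cell))
```

An implementation remains free to pick a different separator per term or per language, for example by passing another `sep` value.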

Term List

See: Audubon Core Term List