PBDB Data Service: Documentation

PBDB Data Service 1.2 v2

This is the stable version of the data service. It is now the preferred version for production applications. We may add new URL paths, parameters and response fields from time to time, but existing ones will continue to work as they did before. You can view the change log which tracks changes to this data service across the different versions.

The older version 1.1 will continue to work, but we urge developers to transition to using this version because it provides much greater capabilities.

DESCRIPTION

The function of this data service is to provide programmatic access to the information stored in the Paleobiology Database. Our goal is to make the entire database accessible by means of this service, so that anyone can write client software that interacts with it.

SYNOPSIS

This service currently provides access to the following classes of information, by means of the indicated URLs. The following links will take you to pages which document the individual URL paths, listing the parameters accepted and the data fields returned by each.

Fossil occurrences	A fossil occurrence represents the occurrence of a particular organism at a particular location in time and space. Each occurrence is a member of a single fossil collection, and has a taxonomic identification which may be more or less specific. The fossil occurrence records are the core data concept around which this database is built.
Fossil collections	A fossil collection is somewhat loosely defined as a set of fossil occurrences that are co-located geographically and temporally. Each collection has a geographic location, stratigraphic context, and age estimate.
Specimens and measurements	Many of the fossil occurrences in the database are based on specimens that can be examined and measured. There are also specimens entered into the database for which no information was available as to the location and context in which they were found.
Taxonomic names	The taxonomic names stored in the database are arranged hierarchically. Our tree of life is quite complete down to the class level, and reasonably complete down to the suborder level. Below that, coverage varies. Many parts of the tree have been completely entered, while others are sparser.
Taxonomic opinions	The taxonomic hierarchy in our database is computed algorithmically based on a constantly growing set of taxonomic opinions. These opinions are ranked by publication year and basis, yielding a 'consensus taxonomy' based on the latest research.
Geological time intervals and time scales	The database lists almost every geologic time interval in current use, including the standard set established by the International Commission on Stratigraphy (2013-01).
Geographic places	The collections and specimens in the database are associated with specific geographic locations, which can be queried separately from other operations.
Geological strata	Most of the fossil collections in the database are categorized by the formation from which each was collected, and many by group and member.
Bibliographic references	Each fossil occurrence, collection, specimen, taxonomic name, and opinion in the database is associated with one or more bibliographic references, identifying the source from which this information was entered.
Research Publications	Research papers that make major use of data from this database can be submitted for consideration as official publications. The operations in this section provide a list of official publication records and information about individual records.
Data Archives	Database members can archive data downloads, preserving them on the server so that they or others can later retrieve the exact same result set. This facility can be used to document the research process. Additionally, you can request that a DOI be generated for significant archives so that they can be quoted in research papers.
Client configuration	This operation provides information about the structure, encoding and organization of the information in the database. It is designed to enable the easy configuration of client applications.
Combined data	The operations in this group provide access to multiple types of data records, including auto-completion for client applications.
Database contributors
Support for frontend application	Auxiliary operations to support the frontend Navigator application.
Database statistics	Statistical summaries of database information.

This is a TEST VERSION of the data service which provides data entry operations:

Educational Resources	Data entry operations for educational resource records.
Specimens and Measurements	Data entry operations for specimen and measurement records.
Official publications	Data entry operations for official publication records.
Data archives	Data entry operations for data archive records.

The following links will take you to additional pages that provide information about how to use this service.

Record identifiers and record numbers	Records retrieved from the data service can be identified either by the numeric identifiers from the underlying database records (e.g. 'occurrence_no'), or by an extended identifier syntax.
Specifying taxonomic names	The data service accepts taxonomic names using several different parameters, and there are modifiers that you can add in order to precisely specify which taxa you are interested in.
Ecological and taphonomic vocabulary	The ecology of organisms and the taphonomy of their fossil remains are described by several different data fields with an associated vocabulary.
Specifying dates and times	You can retrieve records based on when they were modified and/or created.
Bibliographic references	Each piece of data entered into the database is linked to the bibliographic reference from which it was entered.
Basis and precision of coordinates	The basis and precision of geographic locations is specified by a set of code values.
Formats and Vocabularies	You can get the results of query operations in a variety of formats, and with the field names expressed in any of the available vocabularies.
Special Parameters	There are a number of special parameters which you can use with almost any data service operation. These constrain or alter the response in various ways.

USAGE

You can access this service by making HTTP requests whose URLs conform to a simple scheme. In most cases each URL maps to a single database query, and the body of the response returns some or all of the resulting records. For a description of how this information is encoded, see the documentation for the various output formats.

For example, consider the following URL:

http://dev.paleobiodb.org/data1.2/taxa/single.json?name=Dascillidae&show=attr

An HTTP GET request using this URL would return information about the taxon Dascillidae (soft-bodied plant beetles). The components of this URL are as follows:

http://dev.paleobiodb.org/	The initial part of the URL specifies the server to be contacted, and the protocol (http or https) to be used in the transaction.
data1.2/	The first component of the URL path indicates which data service you wish to use. There are multiple data service versions available from this server, and you need to specify which one you are talking to. The advantage of this approach is that you can store URLs that use a particular version of the data service, and they will continue to be valid even as we add new versions.
taxa/single	The rest of the URL path indicates the operation to be carried out. For a GET request, it specifies the class of information to be retrieved.
json	The path suffix indicates the format in which the results will be returned. In this case, the result will be expressed in JavaScript Object Notation.
name=Dascillidae	Some of the parameters are used to construct a database query that will retrieve the desired information. This one selects a particular taxonomic name.
show=attr	Other parameters change or augment the set of information returned. This one specifies that in addition to basic information about the taxonomic name the result should also include the name's attribution.

Each URL path accepts its own set of parameters as well as a set of special parameters that control the form of the result.
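
The URL components described above can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper names build_url and fetch_json are hypothetical, and the server, version, operation, and parameters are simply the ones from the example URL above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(server, version, operation, fmt, **params):
    """Assemble a data service URL from its components.

    server    -- protocol and host, e.g. "http://dev.paleobiodb.org"
    version   -- data service version path component, e.g. "data1.2"
    operation -- rest of the URL path, e.g. "taxa/single"
    fmt       -- path suffix selecting the response format, e.g. "json"
    params    -- query parameters, e.g. name="Dascillidae", show="attr"
    """
    url = f"{server.rstrip('/')}/{version}/{operation}.{fmt}"
    query = urlencode(params)  # percent-encodes values where necessary
    return f"{url}?{query}" if query else url


def fetch_json(url, timeout=30):
    """Issue an HTTP GET request and decode the JSON response body.

    Hypothetical helper: only meaningful for URLs with the json suffix.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


url = build_url(
    "http://dev.paleobiodb.org", "data1.2", "taxa/single", "json",
    name="Dascillidae", show="attr",
)
print(url)
```

Calling fetch_json(url) would then perform the GET request. It is defined but not invoked here, so the sketch can be read and run without network access.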

For now, the only HTTP requests that are accepted are GET requests. Once we allow authentication and data modification, these operations will be carried out by means of POST, PUT and DELETE requests.

FORMATS

The following response formats are available for this data service. Not every format is available for every operation. You must select the desired format for a request by adding the appropriate suffix to the URL path.

Format	Suffix	Documentation	Description
JSON	.json	JSON format	The JSON format is intended primarily to support client applications, including the PBDB Navigator. Response fields are named using compact 3-character field names.
Comma-separated text	.txt	Text formats	The text formats (txt, tsv, csv) are intended primarily for researchers downloading data from the database. These downloads can easily be loaded into spreadsheets or other analysis tools. The field names are taken from the PBDB Classic interface, for compatibility with existing tools and analytical procedures.
Comma-separated text	.csv	Text formats	The text formats (txt, tsv, csv) are intended primarily for researchers downloading data from the database. These downloads can easily be loaded into spreadsheets or other analysis tools. The field names are taken from the PBDB Classic interface, for compatibility with existing tools and analytical procedures.
Tab-separated text	.tsv	Text formats	The text formats (txt, tsv, csv) are intended primarily for researchers downloading data from the database. These downloads can easily be loaded into spreadsheets or other analysis tools. The field names are taken from the PBDB Classic interface, for compatibility with existing tools and analytical procedures.
Larkin	.larkin	Larkin format	This format is used for operations that replace the old Larkin data service, written in Javascript, that was used to provide data to the frontend website. This format is designed to produce the same output as the old data service.
RIS	.ris	RIS format	The RIS format is a common format for bibliographic references.
PNG	.png	PNG format	The PNG suffix is used with a few URL paths to fetch images stored in the database.

If an error occurs, the response body will be a JSON object if the URL path suffix is json, and HTML otherwise. If the URL path suffix is not recognized, an error with HTTP status 415 (Unsupported Media Type) will be returned.
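
A client can use this rule to decide how to read an error body. The sketch below assumes only what is stated here: JSON for the json suffix, HTML for every other suffix. The function name decode_error_body is hypothetical, and the sample bodies are made-up placeholders rather than real server output, since the layout of the error object is not described in this section.

```python
import json


def decode_error_body(suffix, body):
    """Interpret an error response body according to the URL path suffix.

    Returns the parsed JSON value when the suffix is json, and the body
    as text (expected to be HTML) for any other suffix.
    """
    text = body.decode("utf-8", errors="replace") if isinstance(body, bytes) else body
    if suffix.lstrip(".").lower() == "json":
        return json.loads(text)
    return text


# Placeholder bodies for illustration only (not actual server responses).
json_error = decode_error_body("json", b'{"errors": ["example message"]}')
html_error = decode_error_body("csv", "<html><body>example message</body></html>")

print(type(json_error).__name__)  # dict
print(type(html_error).__name__)  # str
```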

VOCABULARIES

The following response vocabularies are available for this data service. If you wish your responses to be expressed in a vocabulary other than the default for your selected format, you can use the vocab parameter with the appropriate vocabulary name.

Vocabulary	Name	Default for	Description
PaleobioDB field names	pbdb	txt, csv, tsv	The PBDB vocabulary is derived from the underlying field names and values in the database, augmented by a few new fields. For the most part any response that uses this vocabulary will be directly comparable to downloads from the PBDB Classic interface. This vocabulary is the default for text format responses.
Compact field names	com	json	The Compact vocabulary is a set of 3-character field names designed to minimize the size of the response message. This is the default for JSON format responses. Some of the field values are similarly abbreviated, while others are conveyed in their entirety. For details, see the documentation for the individual response fields.
BibJSON field names	bibjson		The BibJSON vocabulary uses the field names and value formats defined for BibTeX, which is the vocabulary used for BibJSON.
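
As an illustration of the vocab parameter, the following Python sketch adds it to a query string only when the requested vocabulary differs from the default for the chosen format. The defaults are copied from the table above. The function name query_with_vocab is hypothetical, and whether a given operation supports a given vocabulary is not checked here.

```python
from urllib.parse import urlencode

# Default vocabulary for each format suffix, as listed in the table above.
DEFAULT_VOCAB = {"txt": "pbdb", "csv": "pbdb", "tsv": "pbdb", "json": "com"}
KNOWN_VOCABS = {"pbdb", "com", "bibjson"}


def query_with_vocab(fmt, params, vocab=None):
    """Build a query string, adding vocab only when it overrides the default."""
    if vocab is not None and vocab not in KNOWN_VOCABS:
        raise ValueError(f"unknown vocabulary: {vocab!r}")
    query = dict(params)  # copy, so the caller's dict is left unmodified
    if vocab is not None and vocab != DEFAULT_VOCAB.get(fmt):
        query["vocab"] = vocab
    return urlencode(query)


# JSON response using the PBDB field names instead of the compact ones.
print(query_with_vocab("json", {"name": "Dascillidae"}, vocab="pbdb"))
# The compact vocabulary is already the JSON default, so nothing is added.
print(query_with_vocab("json", {"name": "Dascillidae"}, vocab="com"))
```

The resulting query string is appended to the URL path after a question mark, in the same way as the name and show parameters in the USAGE example.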

This service is provided by the Paleobiology Database, hosted by the Department of Geoscience at the University of Wisconsin-Madison.

If you have questions about this data service, or wish to report a bug, please contact the database administrator at admin@paleobiodb.org