ObsPack Data Product

Each ObsPack data product includes the components described below.

Product Name

Each ObsPack data product has a unique product name using the following structure.

obspack_<trace gas identifier>_<preparation lab number>_<product name>_<product version number>_<preparation date>

The version numbering scheme is major.minor[.minor] where a major release is indicated by the first number in the sequence and minor revisions are indicated by the second and third (optional) numbers in the sequence. Below are a few examples.

obspack_co2_1_PROTOTYPE_v1.0.0_2012-11-06 (first major release of PROTOTYPE data product)
obspack_co2_1_PROTOTYPE_v0.9.3_2012-07-26 (minor revision to PROTOTYPE beta-version)
obspack_co2_1_PROTOTYPE_v0.9.2_2012-07-26 (minor revision to PROTOTYPE beta-version)
obspack_co2_1_PROTOTYPE_v0.9.1_2012-07-24 (minor revision to PROTOTYPE beta-version)
obspack_co2_1_PROTOTYPE_v0.9_2012-07-23 (first major release of PROTOTYPE beta-version)
obspack_co2_1_GLOBALVIEW-CO2_v1.0_2012-07-25 (first major release of GLOBALVIEW-CO2 using ObsPack framework)

Please note: The latest minor revision of a major release includes all changes included in intermediate minor revisions if they exist. We can expect a considerable number of minor revisions while the ObsPack framework is being developed. Once the framework has been thoroughly vetted, the number of minor revisions should be greatly reduced.
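
The naming structure above can be split into its parts programmatically. The following is a minimal sketch, not part of the ObsPack software: the helper names parse_product_name and latest_release are hypothetical, and the pattern assumes that the version always begins with "v" and the preparation date is written as YYYY-MM-DD, as in the examples listed.

```python
import re

# Pattern for: obspack_<gas>_<lab number>_<product name>_<version>_<date>
# Assumption: version starts with "v" and the date is YYYY-MM-DD,
# as in the examples above.
PRODUCT_RE = re.compile(
    r"^obspack_(?P<gas>[^_]+)_(?P<lab>\d+)_(?P<name>.+)"
    r"_v(?P<version>\d+\.\d+(?:\.\d+)?)_(?P<date>\d{4}-\d{2}-\d{2})$"
)


def parse_product_name(product):
    """Split an ObsPack product name into its components (hypothetical helper)."""
    match = PRODUCT_RE.match(product)
    if match is None:
        raise ValueError(f"not an ObsPack product name: {product!r}")
    parts = match.groupdict()
    parts["lab"] = int(parts["lab"])
    # Turn "0.9.3" into (0, 9, 3) so that versions compare numerically.
    parts["version"] = tuple(int(n) for n in parts["version"].split("."))
    return parts


def latest_release(products):
    """Return the product name carrying the highest version number."""
    return max(products, key=lambda p: parse_product_name(p)["version"])


names = [
    "obspack_co2_1_PROTOTYPE_v0.9.1_2012-07-24",
    "obspack_co2_1_PROTOTYPE_v1.0.0_2012-11-06",
    "obspack_co2_1_PROTOTYPE_v0.9_2012-07-23",
]
print(parse_product_name(names[0]))
print(latest_release(names))
```

Comparing versions as integer tuples keeps "0.9" ahead of "0.9.1" and both ahead of "1.0.0", which matches the major.minor[.minor] ordering described above.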

The ObsPack product name is used throughout.

Prepared Data Sets

An ObsPack data set is 1) a collection of measurements for a single trace gas species, 2) derived from a single laboratory-project, and 3) prepared according to a set of instructions. A set of instructions, specific to each data set, configures ObsPack software to subset data, average data, or pass data through without alteration. Multiple instruction sets for a given measurement record will create multiple unique data sets. For example, the NOAA quasi-continuous CO2 measurement record from the 396 magl intake height on the Wisconsin tall tower site (LEF) could be subset into two data sets: one consisting of average values of afternoon measurements only, and a second consisting of average values of nighttime measurements only. The ways in which data are prepared depend on the intended use of the data product.
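
The afternoon and nighttime example can be illustrated with a short sketch. It is not the ObsPack software: the function average_window is hypothetical, the hour windows are assumptions rather than the limits ObsPack applies, the records are made-up numbers, and the use of the sample standard deviation as the uncertainty is also an assumption.

```python
from statistics import mean, stdev

# Hypothetical hourly records as (local hour of day, CO2 mole fraction).
# The values are made up for illustration only.
records = [(2, 401.0), (3, 403.0), (13, 395.0), (14, 396.0), (15, 397.0)]


def average_window(records, start_hour, stop_hour):
    """Average the values whose local hour lies in [start_hour, stop_hour)."""
    selected = [value for hour, value in records if start_hour <= hour < stop_hour]
    if not selected:
        return None
    # A standard deviation is only defined for two or more values.
    spread = stdev(selected) if len(selected) > 1 else None
    return {"value": mean(selected), "value_unc": spread, "nvalue": len(selected)}


# The window limits are assumptions, not the ones used by ObsPack.
afternoon = average_window(records, 12, 17)
nighttime = average_window(records, 0, 5)
print(afternoon)
print(nighttime)
```

Each instruction set yields its own data set, so the same measurement record produces one result for the afternoon window and another for the nighttime window.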

Data sets are presented as individual files. File names are unique and include the trace gas species identifier, 3-letter site or project code, measurement project, laboratory identification number, a data selection tag, and a file type identifier such as "nc" (netCDF4) or "txt" (ASCII text). The file name structure is as follows.

<trace gas identifier>_<site code>_<project>_<lab number>_<selection tag>.<filetype extension>

Below are a few examples.

co2_lef_aircraft-pfp_1_valid.txt
co2_lef_surface-pfp_1_valid.txt
co2_lef_tower-insitu_1_afternoon-396magl.txt
co2_lef_tower-insitu_1_nighttime-396magl.txt

co2_nat_surface-flask_26_marine.nc
co2_con_aircraft-flask_20_valid.nc

The selection tag is intended to convey a very general notion of how the data have been selected. Details of the selection, including relevant literature references, are included in the file.
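
Because the fields are separated by underscores, a file name can be taken apart with a few lines of code. This is a minimal sketch with a hypothetical helper, parse_dataset_filename. It assumes that no individual field contains an underscore, which holds for the examples above, where hyphens are used inside the project and selection tag fields.

```python
from pathlib import PurePath


def parse_dataset_filename(filename):
    """Split an ObsPack data set file name into its fields (hypothetical helper)."""
    path = PurePath(filename)
    fields = path.stem.split("_")
    if len(fields) != 5:
        raise ValueError(f"expected 5 underscore-separated fields: {filename!r}")
    gas, site, project, lab, tag = fields
    return {
        "gas": gas,
        "site": site,
        "project": project,
        "lab": int(lab),
        "selection_tag": tag,
        "filetype": path.suffix.lstrip("."),
    }


info = parse_dataset_filename("co2_lef_tower-insitu_1_afternoon-396magl.txt")
print(info["project"], info["selection_tag"], info["filetype"])
```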

Metadata

Each data set includes comprehensive metadata describing the sampling location, sampling strategy, preparation strategy, and contact information for the contributing laboratory and data providers. Each data set also includes URL links to images of a world map highlighting the site location, the contributing lab's logo (where available), and the country flag. These metadata provide users with all the information required to give proper attribution when displaying data from an ObsPack product. Figure 1 is constructed entirely from data and metadata extracted from a single data set.

Figure 1: Data and metadata for this graph come from a single ObsPack data file.

Inside the Data File

Each data file includes a single prepared data set and associated metadata. Each data item in a data set includes the sample collection time, position, reported mole fraction or isotope ratio, estimated uncertainty (when available), the number (n) of individual measurements contributing to the reported value, and a unique ID that distinguishes the item from all other data items in the product. Metadata are presented as global attributes that describe general features of the data set and variable attributes that describe characteristics of the variables associated with each data item. Tables 1 and 2 describe global and variable attributes included in a typical ObsPack netCDF data file.

Table 1. Description of global attributes included in typical ObsPack netCDF data files.
Global Attributes
Name Description
site_code 3-letter site code as defined by GAWSIS (e.g., LEF)
site_name Standard site name (e.g., Park Falls, Wisconsin)
site_country,site_country_flag Country in which site is located and link to image of flag
site_longitude Longitude (decimal degree) at representative site location
site_latitude Latitude (decimal degree) at representative site location
site_elevation Ground or surface elevation at representative site location
site_elevation_unit site_elevation is reported in meters above sea level (masl)
site_map URL link to world map highlighting site location (file type is png)
site_utc2lst Hour conversion from UTC to LST
site_url URL link to site web page (optional)
site_comments Additional relevant site information (optional)
 
dataset_num Integer that uniquely identifies the data set in the ObsPack data product
dataset_name Character string that uniquely identifies the data set in the ObsPack data product. Data set names are discussed in the Prepared Data Sets section above.
dataset_globalview_prefix Character string of equivalent GLOBALVIEW file name prefix (see GLOBALVIEW for details).
dataset_parameter Identifies trace gas species included in data set (e.g., co2, c13co2)
dataset_process String description of ObsPack data preparation (e.g., PassThru, TimeStepAverage)
dataset_project Typically identifies sampling platform and strategy (e.g., surface-flask, tower-insitu, aircraft-pfp)
dataset_db Boolean T/F. Indicates source data are from NOAA operational database (internal use only).
dataset_archive_dir Source data archive directory (internal use only).
dataset_archive_file Source data file or file filter (internal use only).
dataset_intake_ht This attribute is set when it is necessary to subset source data by sample intake height (internal use only).
dataset_intake_ht_unit dataset_intake_ht is reported in meters above ground level (magl) (internal use only).
dataset_time_window_utc Attribute set when necessary to subset source data by sample collection time (UTC) (internal use only).
dataset_time_window_lst Attribute set when necessary to subset source data by sample collection time (LST) (internal use only).
dataset_parse_function Python module used to read source data (internal use only).
dataset_data_frequency Measurement frequency of source data.
dataset_data_frequency_unit Indicates the time unit of the dataset_data_frequency attribute.
dataset_platform Fixed or Mobile.
dataset_start_date Date of first item in data set (ISO 8601 format).
dataset_stop_date Date of last item in data set (ISO 8601 format).
dataset_selection Brief description of how data have been selected by data contributor or prepared by NOAA.
dataset_selection_tag Short descriptor to help convey how data have been selected by data contributor or prepared by NOAA. The selection tag is included in the data set name.
dataset_calibration_scale Measurements are relative to reported calibration scale.
dataset_fair_use This is the ObsPack fair use statement agreed upon by data providers.
dataset_reference_number Number indicating how many references to published literature to expect in this file.
dataset_reference_#_name Reference provided by data contributor. # is a number from 1 to the value of dataset_reference_number.
 
lab_num Laboratory identification number. See Lab Table.
lab_abbr Laboratory abbreviation or acronym (e.g., CONTRAIL, UHEI-IUP)
lab_name Laboratory name
lab_address#
lab_country, lab_country_flag
lab_url (optional)
lab_logo (optional)
lab_ongoing_atmospheric_air_comparison If "yes", lab participates in at least one ongoing direct atmospheric air comparison experiment.
lab_comparison_activity Brief description of measurement comparison activities (optional).
 
program_abbr [ _name, _address, _country, _country_flag, _url, _logo ] Providers may make a distinction between the measurement lab and an over-arching research program (e.g., NACP, ICOS).
 
provider_number Number of providers listed in the file.
provider_#_name
provider_#_address
provider_#_country
provider_#_organization
provider_#_email
provider_#_tel Telephone number
 
obspack_contact_name [ _lab, _email ] Contact information of ObsPack preparer.
obspack_data_time_step Time interval at which ObsPack data are presented (e.g., day, hour).
obspack_name Unique ObsPack identification string. Structure is obspack_<parameter>_<preparation/distribution lab number>_<product name>_<version number>_<preparation date> (e.g., obspack_co2_1_PROTOTYPE_v0.9.1_2012-07-20).
obspack_description Brief description of data product contents.
obspack_version ObsPack software version number.
obspack_creation_date Date when the ObsPack data product was prepared.
obspack_citation Required ObsPack citation. This citation is in addition to the requirements of the ObsPack Fair Use statements.
obspack_fair_use These cooperative data products are made freely available to the scientific community and are intended to stimulate and support carbon cycle modeling studies. We rely on the ethics and integrity of the user to assure that each contributing national and university laboratory receives fair credit for their work. Fair credit will depend on the nature of the work and the requirements of the institutions involved. Your use of an ObsPack data product implies an agreement to contact each contributing laboratory to discuss the nature of the work and the appropriate level of acknowledgement. If an ObsPack data product is essential to the work, or if an important result or conclusion depends on an ObsPack product, co-authorship may be appropriate. This should be discussed with each data provider at an early stage in the work. Contacting the data providers is not optional; if you use an ObsPack data product, you must contact the data providers. To help you meet your obligation, each data product includes an e-mail distribution list of all data providers. ObsPack data products must be obtained directly from the ObsPack Data Portal at www.esrl.noaa.gov/gmd/ccgg/obspack/ and may not be re-distributed.

Beginning November 2013, all new ObsPack data products will have a unique Digital Object Identifier (DOI) registered with the International DOI Foundation. In addition to the conditions of fair use as stated above, users must also include the ObsPack product citation in any publication or presentation using the product. The required citation is included in every data product and in the automated e-mail sent to the user during product download.

Beginning November 2013, there are no longer any exceptions to this policy; it applies to all ObsPack products including GLOBALVIEW.

obspack_warning Every effort is made to create the most accurate and precise data product possible. Contributors reserve the right to make corrections to this product and data based on recalibration of standard gases or for other reasons deemed scientifically justified. Contributors to this product are not responsible for results and conclusions based on use of this product without regard to this warning.
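
Several attributes in Table 1 are numbered (provider_#_name, dataset_reference_#_name), with a companion attribute giving the count. The sketch below shows one way to gather them once the global attributes are available as a dictionary. The helper collect_numbered is hypothetical, the attribute values are placeholders rather than real provider details, and it assumes that # runs from 1 to the count for providers as it does for references.

```python
def collect_numbered(attrs, prefix, count_key, fields):
    """Gather attributes named <prefix>_<n>_<field> for n = 1..attrs[count_key]."""
    items = []
    for n in range(1, int(attrs[count_key]) + 1):
        # attrs.get returns None for optional fields that are absent.
        items.append({f: attrs.get(f"{prefix}_{n}_{f}") for f in fields})
    return items


# Placeholder global attributes; real values come from an ObsPack data file.
attrs = {
    "provider_number": 2,
    "provider_1_name": "Provider One",
    "provider_1_email": "one@example.org",
    "provider_2_name": "Provider Two",
    "provider_2_email": "two@example.org",
}

providers = collect_numbered(attrs, "provider", "provider_number", ["name", "email"])
emails = [p["email"] for p in providers]
print(emails)
```

The same helper could be pointed at dataset_reference_number and the "name" field to list the literature references carried in a file.
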
Table 2. Description of variable attributes associated with each data item in a data set.
Variable Attributes
Name Description
obs_num Unique observation number in a single data set. Ranges from 1 to UNLIMITED (netCDF).
obs_id Unique identification string that distinguishes the data item from all other data items in the ObsPack data product. It includes dataset_name and obs_num.
obspack_num Unique observation index number across all data sets in the ObsPack distribution. Ranges from 1 to max_obspack_num.
obspack_id Unique identification string that distinguishes the data item from all other data items in any ObsPack data product. It includes obspack_name, dataset_name, and obspack_num delimited by a tilde (~).
time Air sample collection time (UTC). POSIX time (number of seconds since January 1, 1970 in UTC).
time_decimal Air sample collection time (UTC) in decimal year notation (e.g., 2012.4523312).
time_components Air sample collection time (UTC) represented as a 6-element array [year, month, day, hour, minute, second]. Calendar time components as integers.
solartime_components Air sample collection time (solar time) represented as a 6-element array [year, month, day, hour, minute, second]. UTC time is converted to local solar time based on longitude and day-of-year. Solar time components as integers.
value Reported mole fraction or isotope ratio. Units depend on trace gas species.
value_unc Standard deviation of the reported mean value when nvalue is greater than 1. Units depend on trace gas species.
nvalue Number of individual measurements used to compute reported value.
latitude Latitude at which air sample was collected (units: decimal degrees).
longitude Longitude at which air sample was collected (units: decimal degrees).
altitude Altitude (surface elevation plus sample intake height) at which air sample was collected. Units are meters above sea level (masl).
pressure Ambient pressure at time of sampling. Units are hectopascal (hPa) where 1 hPa = 100 Pa. This variable is not always available.
elevation Surface or ground elevation at which air sample was collected. Units are meters above sea level (masl).
intake_height Height above ground at which air sample was collected. Units are meters above ground level (magl).
obs_flag Representation flag indicates that reported value has large spatial scale representation (1) or is locally influenced (0). This attribute is derived from the data provider's source data. The implementation of this flag is still being developed. Suggestions welcome.
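
Some of the variables in Table 2 are different views of the same information, and a short sketch may help show how they relate. The function names below are hypothetical. The order of the parts in obspack_id is assumed to follow the order given in the table, and the decimal-year formula (elapsed time divided by the length of that calendar year) is an assumption about how time_decimal is computed.

```python
from datetime import datetime, timezone


def split_obspack_id(obspack_id):
    """Split an obspack_id into obspack_name, dataset_name and obspack_num."""
    obspack_name, dataset_name, obspack_num = obspack_id.split("~")
    return obspack_name, dataset_name, int(obspack_num)


def time_components(posix_seconds):
    """Convert POSIX seconds to [year, month, day, hour, minute, second] in UTC."""
    t = datetime.fromtimestamp(posix_seconds, tz=timezone.utc)
    return [t.year, t.month, t.day, t.hour, t.minute, t.second]


def decimal_year(posix_seconds):
    """Express POSIX seconds as a decimal year.

    Assumption: the fraction is elapsed time divided by the length of that year.
    """
    t = datetime.fromtimestamp(posix_seconds, tz=timezone.utc)
    start = datetime(t.year, 1, 1, tzinfo=timezone.utc)
    end = datetime(t.year + 1, 1, 1, tzinfo=timezone.utc)
    return t.year + (t - start) / (end - start)


# A hypothetical identifier built from names that appear earlier on this page.
example_id = (
    "obspack_co2_1_PROTOTYPE_v1.0.0_2012-11-06"
    "~co2_lef_tower-insitu_1_afternoon-396magl~1234"
)
print(split_obspack_id(example_id))

# Noon UTC on 23 July 2012, expressed as POSIX seconds.
stamp = int(datetime(2012, 7, 23, 12, tzinfo=timezone.utc).timestamp())
print(time_components(stamp), decimal_year(stamp))
```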

Product Summary

The ObsPack product summary (<product name>_dataset_summary.txt) briefly summarizes the contents of the data product including 1) the ObsPack Fair Use Statement, 2) a brief description of the data product and its intended use, 3) the total number of data sets (max_dataset_num) and the total number of observations (max_obs_num) included in the package, and 4) a list of all data sets in the data product. Listed with each data set is the contributing laboratory abbreviation; the start and end date of the included data; indication of lab participation in ongoing direct atmospheric air comparison experiments; and a short phrase indicating the data selection strategy used by the data provider.

Summary files for currently available data products can be found by clicking on the information icon located next to the list of available product versions.

Data Provider E-mail Distribution List

Use of an ObsPack data product implies agreement to contact each contributing laboratory to discuss the nature of the work and the appropriate level of acknowledgement, which may include co-authorship (see the ObsPack Fair Use Statement). To help users meet this obligation, each data product includes an e-mail distribution list of all data providers. The text file <product name>_data_provider_email_list.txt provides the e-mail list in two formats to facilitate use. The list includes e-mail addresses for those data providers who have contributed to the particular data product.