Tutorial: Create a New Variable Representation Manually

Objective

The purpose of this tutorial is to review the steps to manually create a machine-readable scientific variable representation following the principles of the Scientific Variables Ontology (SVO).

A natural-language description of a scientific variable can be converted to machine-readable form by unpacking it one layer at a time, starting at the coarsest level (the entire variable) and working down to the decomposition of individual phenomena. How far the information needs to be unpacked is discretionary, but the choice can be informed by concepts found in existing ontologies.

This document is intended to guide you in creating variable representations manually. It is not a technical specification for SVO. SVO is being continually developed to help automate as much of this process as possible, but it is helpful to understand the basics of the SVO design patterns and how they are applied in order to verify the results yielded by the automation tools.

Instructions for Creating a Variable Representation

The steps for creating a new variable representation are:

  1. Select a concise description for a scientific variable.
  2. Decompose the variable description to identify the primary entities present: the recorded Phenomenon, the corresponding recorded Property, and, if applicable, the Reference for the recorded Property.
    [Figure: diagram showing how a variable is decomposed into phenomenon, property, and reference]
  3. Decompose the recorded Phenomenon into distinct component phenomena, processes (if present), and the relationships that combine these phenomena and processes into the overall recorded Phenomenon.
    1. Identify the distinct component phenomena in the description.
    2. Identify the primary Phenomena of interest that correspond to directly assessed values of the recorded Property. Identify the roles of these primary Phenomena with respect to the Property.
    3. For Phenomena at the same level of granularity, identify groupings of Phenomena that are linked through a process and use the Participant design pattern to create a compound Phenomenon.
    4. Working up from individual Phenomena, combine Phenomena at different granularity levels by using the Context design pattern to express how they are spatiotemporally linked to each other.
    [Figure: diagram showing how a phenomenon is decomposed into participants with participant and variable roles and processes]
  4. Decompose the recorded Property:
    • Identify any applied transformations and operations, including aggregations.
    • Identify component Properties that are combined through mathematical operations to create the Property of interest.
    • Identify the PropertyType and Dimensions if applicable.
    • Identify any associated Abstraction and Property Role.
    [Figure: diagram showing how a property is decomposed into operations/transformations, component properties, property types, property roles, and dimensions]
  5. Decompose the Reference:
    • Identify the reference Phenomenon.
    • Identify the reference Property.
    • Identify the reference Value.
    • Identify the reference Relationship.
    [Figure: diagram showing how a reference is decomposed into phenomenon, property, value, and reference relationship]
  6. Decompose all identified Phenomena by attributes and into component Phenomena as desired.
    [Figure: diagram showing how a phenomenon is generically decomposed into context, abstraction, and attributes]
    [Figure: diagram showing pre-defined phenomenon attributes]
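
One way to keep track of the pieces identified in these steps is a small set of record types. The following is a minimal Python sketch of such a data model. It is not part of SVO: every class, field, and function name here is an assumption chosen to mirror the terms used in the steps, and the labels in the usage lines are placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Phenomenon:
    """A phenomenon with optional roles, attributes, and component phenomena."""
    label: str
    roles: List[str] = field(default_factory=list)         # roles with respect to the Property
    attributes: Dict[str, str] = field(default_factory=dict)
    components: List["Phenomenon"] = field(default_factory=list)
    process: Optional[str] = None                           # process linking the components, if any


@dataclass
class Property:
    """A property and the pieces identified when decomposing it."""
    label: str
    operations: List[str] = field(default_factory=list)    # transformations and aggregations
    components: List["Property"] = field(default_factory=list)
    property_type: Optional[str] = None
    dimensions: Optional[str] = None
    abstraction: Optional[str] = None
    property_role: Optional[str] = None


@dataclass
class Reference:
    """The four parts of a Reference; any of them may be unknown."""
    reference_phenomenon: Optional[Phenomenon] = None
    reference_property: Optional[Property] = None
    reference_value: Optional[str] = None
    reference_relationship: Optional[str] = None


@dataclass
class Variable:
    """Top-level split of a variable description."""
    description: str
    recorded_phenomenon: Phenomenon
    recorded_property: Property
    reference: Optional[Reference] = None   # None when no Reference applies


def leaf_phenomena(phenomenon: Phenomenon) -> List[Phenomenon]:
    """Collect the phenomena that have not been decomposed any further."""
    if not phenomenon.components:
        return [phenomenon]
    leaves: List[Phenomenon] = []
    for component in phenomenon.components:
        leaves.extend(leaf_phenomena(component))
    return leaves


# Placeholder labels only; they do not describe a real variable.
example = Variable(
    description="placeholder description",
    recorded_phenomenon=Phenomenon(
        label="compound phenomenon",
        components=[
            Phenomenon(label="first participant"),
            Phenomenon(label="second participant"),
        ],
    ),
    recorded_property=Property(label="placeholder property"),
)
print([p.label for p in leaf_phenomena(example.recorded_phenomenon)])
```

Keeping the Reference optional reflects the "if applicable" wording in the steps, and the recursive components list lets a Phenomenon be unpacked to whatever depth is judged useful.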

To illustrate the process, we will use examples from the Public SRS Names list (NWIS) and the CF Standard Names. Here are the variables we will represent:

  1. NWIS: brachionus concentration
  2. NWIS: water alkalinity
  3. NWIS: oil slick severity
  4. NWIS: dissolved argon
  5. NWIS: pyridaben in biota
  6. NWIS: solids under 8 millimeters
  7. NWIS: diazoxon in air
  8. NWIS: groundwater level
  9. NWIS: water salinity
  10. NWIS: molybdenum in wet atmospheric deposition
  11. NWIS: light attenuation
  12. NWIS: absorption coefficient
  13. NWIS: somatic coliphage
  14. CF: optical thickness
  15. CF: thermal energy content of ice and snow
  16. CF: radiative flux
  17. CF: carbonate concentration
  18. CF: mass streamfunction
  19. CF: nitrogen flux
  20. CF: air temperature

Examples

NWIS Examples

brachionus concentration

For this example, we will represent NWIS Parameter 92087 in machine-readable SVO form.

  1. The variable description is "BRACHIONUS (4) USGS,ACL-77" and the units provided are #/ml.
    [Figure: define variable brachionus]
  2. Decomposing the variable into its primary elements yields the recorded Phenomenon Brachionus (a genus of planktonic rotifers, which are a type of organism) and the recorded Property count concentration, derived from the units provided. The recorded Property indicates that there is an implied Phenomenon, likely water, which is the medium quantified by the volume in the denominator of the Property.
    [Figure: decompose variable brachionus]
  3. Based on the provided Property, we identify the role of numerator for the Phenomenon organism and the roles of denominator and medium for the Phenomenon water.
    [Figure: identify participants brachionus]
  4. The Property count concentration is sufficiently decomposed.
  5. There is no Reference for the measurement.
  6. We can further atomize the Phenomenon "Brachionus organism" by identifying the attributes genus (Brachionus) and Form (organism).
    [Figure: decompose phenomenon brachionus organisms]

Putting the whole variable representation expansion together yields:

[Figure: final representation of variable brachionus]
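
As a rough textual stand-in for that representation, the pieces identified in the steps above can be collected into a nested structure. The sketch below uses a plain Python dictionary. The key names and the `outline` helper are hypothetical and are not SVO terms or tooling; only the labels, roles, attributes, and units are taken from the steps.

```python
# Hypothetical plain-dict encoding; key names are illustrative, not SVO terms.
brachionus_variable = {
    "source": "NWIS Parameter 92087",
    "description": "BRACHIONUS (4) USGS,ACL-77",
    "units": "#/ml",
    "phenomenon": {
        "components": [
            {
                "label": "Brachionus organism",
                "roles": ["numerator"],
                "attributes": {"genus": "Brachionus", "form": "organism"},
            },
            {
                "label": "water",  # implied by the volume in the units
                "roles": ["denominator", "medium"],
                "attributes": {},
            },
        ],
    },
    "property": {"label": "count concentration"},
    "reference": None,  # no Reference for this measurement
}


def outline(variable):
    """Return the representation as a list of indented text lines."""
    lines = [f"variable: {variable['description']} [{variable['units']}]"]
    lines.append(f"  property: {variable['property']['label']}")
    for part in variable["phenomenon"]["components"]:
        roles = ", ".join(part["roles"])
        lines.append(f"  phenomenon: {part['label']} (roles: {roles})")
        for name, value in part["attributes"].items():
            lines.append(f"    {name}: {value}")
    reference = variable["reference"]
    lines.append(f"  reference: {reference if reference else 'none'}")
    return lines


print("\n".join(outline(brachionus_variable)))
```

Writing the expansion out this way makes it easy to check that every element named in the steps (both phenomena, their roles, the property, and the absent reference) appears exactly once.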

water alkalinity

For this example, we will represent NWIS Parameter 00418 in machine-readable SVO form.

Coming soon!

oil slick severity

Coming soon!

dissolved argon

Coming soon!

pyridaben in biota

Pyridaben, biota, tissue, recoverable, dry weight, micrograms per kilogram

Coming soon!

solids under 8 millimeters

Solids, percent smaller than 8 millimeters

Coming soon!

diazoxon in air

Diazoxon, air, sum of particulate filter plus top and bottom sorbent traps, recoverable, nanograms per cubic meter

Coming soon!

groundwater level

Groundwater level above Guam Vertical Datum of 2004, meters

Coming soon!

water salinity

Salinity, water, in situ, tidally filtered, practical salinity units at 25 degrees Celsius

Coming soon!

molybdenum in wet atmospheric deposition

Molybdenum, wet atmospheric deposition, unfiltered, micrograms per liter

Coming soon!

light attenuation

Depth to 1 percent of surface light, meters

Coming soon!

absorption coefficient

Absorption coefficient at 412 nm for suspended solids, water, filtered (0.7 micron glass fiber filter), units per meter

Coming soon!

somatic coliphage

Coming soon!

CF Examples

optical thickness

atmosphere_absorption_optical_thickness_due_to_particulate_organic_matter_ambient_aerosol_particles

The optical thickness is the integral along the path of radiation of a volume scattering/absorption/attenuation coefficient. The radiative flux is reduced by a factor exp(-optical_thickness) on traversing the path. A coordinate variable of radiation_wavelength or radiation_frequency can be specified to indicate that the optical thickness applies at specific wavelengths or frequencies. "Absorption optical thickness" means that part of the atmosphere optical thickness that is caused by the absorption of incident radiation. "Aerosol" means the system of suspended liquid or solid particles in air (except cloud droplets) and their carrier gas, the air itself. "Ambient_aerosol" means that the aerosol is measured or modelled at the ambient state of pressure, temperature and relative humidity that exists in its immediate environment. "Ambient aerosol particles" are aerosol particles that have taken up ambient water through hygroscopic growth. The extent of hygroscopic growth depends on the relative humidity and the composition of the particles. To specify the relative humidity and temperature at which the quantity described by the standard name applies, provide scalar coordinate variables with standard names of "relative_humidity" and "air_temperature". The specification of a physical process by the phrase due_to_process means that the quantity named is a single term in a sum of terms which together compose the general quantity named by omitting the phrase.
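
The two quantitative statements in this description (optical thickness as a path integral of a coefficient, and flux reduced by a factor exp(-optical_thickness)) can be illustrated with a short numerical sketch. The function names, the rectangle-rule approximation of the integral, and the sample coefficient values are all assumptions made for illustration; they are not part of the CF definition.

```python
import math


def optical_thickness_along_path(coefficients, step_length):
    """Approximate the path integral of a coefficient with a rectangle-rule sum.

    coefficients: coefficient value in each path segment (per unit length)
    step_length:  length of each segment, in the matching length unit
    """
    return sum(c * step_length for c in coefficients)


def transmitted_flux(incident_flux, optical_thickness):
    """Flux remaining after the path: incident flux times exp(-optical_thickness)."""
    return incident_flux * math.exp(-optical_thickness)


# Made-up absorption coefficients for three 1-unit-long path segments.
sample_coefficients = [0.1, 0.2, 0.1]
tau = optical_thickness_along_path(sample_coefficients, step_length=1.0)
print(tau, transmitted_flux(100.0, tau))
```

Because the optical thickness is a sum over the path, contributions from separate path segments add, while the corresponding transmission factors multiply.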

Coming soon!

thermal energy content of ice and snow

change_over_time_in_thermal_energy_content_of_ice_and_snow_on_land

The phrase "change_over_time_in_X" means change in a quantity X over a time-interval, which should be defined by the bounds of the time coordinate. Thermal energy is the total vibrational energy, kinetic and potential, of all the molecules and atoms in a substance. The phrase "ice_and_snow_on_land" means ice in glaciers, ice caps, ice sheets and shelves, river and lake ice, any other ice on a land surface, such as frozen flood water, and snow lying on such ice or on the land surface.

Coming soon!

radiative flux

fraction_of_surface_downwelling_photosynthetic_radiative_flux_absorbed_by_vegetation

Downwelling radiation is radiation from above. It does not mean "net downward". The sign convention is that "upwelling" is positive upwards and "downwelling" is positive downwards. The surface called "surface" means the lower boundary of the atmosphere. The quantity with standard name fraction_of_surface_downwelling_photosynthetic_radiative_flux_absorbed_by_vegetation, often called Fraction of Absorbed Photosynthetically Active Radiation (FAPAR), is the fraction of incoming solar radiation in the photosynthetically active radiation spectral region that is absorbed by a vegetation canopy. "Photosynthetic" radiation is the part of the spectrum which is used in photosynthesis e.g. 400-700 nm. The range of wavelengths could be specified precisely by the bounds of a coordinate of "radiation_wavelength". When thought of as being incident on a surface, a radiative flux is sometimes called "irradiance". In addition, it is identical with the quantity measured by a cosine-collector light-meter and sometimes called "vector irradiance". In accordance with common usage in geophysical disciplines, "flux" implies per unit area, called "flux density" in physics. "Vegetation" means any plants e.g. trees, shrubs, grass. The term "plants" refers to the kingdom of plants in the modern classification which excludes fungi. Plants are autotrophs i.e. "producers" of biomass using carbon obtained from carbon dioxide.

Coming soon!

carbonate concentration

mole_concentration_of_carbonate_expressed_as_carbon_at_equilibrium_with_pure_aragonite_in_sea_water

Mole concentration means number of moles per unit volume, also called "molarity", and is used in the construction "mole_concentration_of_X_in_Y", where X is a material constituent of Y. A chemical or biological species denoted by X may be described by a single term such as "nitrogen" or a phrase such as "nox_expressed_as_nitrogen". The phrase "expressed_as" is used in the construction A_expressed_as_B, where B is a chemical constituent of A. It means that the quantity indicated by the standard name is calculated solely with respect to the B contained in A, neglecting all other chemical constituents of A. The chemical formula of the carbonate anion is CO3 with an electrical charge of minus two. Aragonite is a mineral that is a polymorph of calcium carbonate. The chemical formula of aragonite is CaCO3. At a given salinity, the thermodynamic equilibrium is that between dissolved carbonate ion and solid aragonite. Standard names also exist for calcite, another polymorph of calcium carbonate.

Coming soon!

mass streamfunction

ocean_meridional_overturning_mass_streamfunction_due_to_parameterized_submesoscale_eddy_advection

The specification of a physical process by the phrase due_to_process means that the quantity named is a single term in a sum of terms which together compose the general quantity named by omitting the phrase. Parameterized eddy advection in an ocean model means the part due to a scheme representing parameterized eddy-induced advective effects not included in the resolved model velocity field. Parameterized submesoscale eddy advection occurs on a spatial scale of the order of 1 km horizontally. Reference: James C. McWilliams 2016, Submesoscale currents in the ocean, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, volume 472, issue 2189. DOI: 10.1098/rspa.2016.0117. There are also standard names for parameterized_mesoscale_eddy_advection which, along with parameterized_submesoscale_eddy_advection, contributes to the total parameterized eddy advection.

Coming soon!

nitrogen flux

surface_upward_mass_flux_of_nitrogen_compounds_expressed_as_nitrogen_due_to_all_land_processes_excluding_fires

"Upward" indicates a vector component which is positive when directed upward (negative downward). In accordance with common usage in geophysical disciplines, "flux" implies per unit area, called "flux density" in physics. The phrase "expressed_as" is used in the construction A_expressed_as_B, where B is a chemical constituent of A. It means that the quantity indicated by the standard name is calculated solely with respect to the B contained in A, neglecting all other chemical constituents of A. "Nitrogen compounds" summarizes all chemical species containing nitrogen atoms. The list of individual species that are included in this quantity can vary between models. Where possible, the data variable should be accompanied by a complete description of the species represented, for example, by using a comment attribute. The specification of a physical process by the phrase "due_to_" process means that the quantity named is a single term in a sum of terms which together compose the general quantity named by omitting the phrase. "All land processes" means plant and soil respiration, photosynthesis, animal grazing, crop harvesting, natural fires and anthropogenic land use change.

surface_upward_mass_flux_of_nitrogen_compounds_expressed_as_nitrogen_out_of_vegetation_and_litter_and_soil

The surface called "surface" means the lower boundary of the atmosphere. "Upward" indicates a vector component which is positive when directed upward (negative downward). In accordance with common usage in geophysical disciplines, "flux" implies per unit area, called "flux density" in physics. The phrase "expressed_as" is used in the construction A_expressed_as_B, where B is a chemical constituent of A. It means that the quantity indicated by the standard name is calculated solely with respect to the B contained in A, neglecting all other chemical constituents of A. "Nitrogen compounds" summarizes all chemical species containing nitrogen atoms. The list of individual species that are included in this quantity can vary between models. Where possible, the data variable should be accompanied by a complete description of the species represented, for example, by using a comment attribute. "Vegetation" means any living plants e.g. trees, shrubs, grass. "Litter" is dead plant material in or above the soil.
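
The "due_to_" and "expressed_as" phrases described above give a first, purely textual handle for unpacking a CF standard name before building a representation. The sketch below splits a name on those two phrases. The function name and the returned keys are hypothetical, and the split is deliberately naive: it is not an SVO decomposition, and in a name such as the second one above, where a further phrase ("out_of_...") follows "expressed_as", that trailing phrase would stay attached to the "expressed_as" part.

```python
def split_cf_phrases(standard_name):
    """Split a CF standard name on the 'due_to' and 'expressed_as' phrases."""
    # Everything after '_due_to_' names the process; '' if the phrase is absent.
    base, _, process = standard_name.partition("_due_to_")
    # Within the remainder, '_expressed_as_' separates A from B.
    quantity, _, expressed_as = base.partition("_expressed_as_")
    return {
        "quantity": quantity,
        "expressed_as": expressed_as or None,
        "due_to": process or None,
    }


name = (
    "surface_upward_mass_flux_of_nitrogen_compounds_expressed_as_nitrogen"
    "_due_to_all_land_processes_excluding_fires"
)
parts = split_cf_phrases(name)
for key, value in parts.items():
    print(f"{key}: {value}")
```

For the first nitrogen flux name, this separates the general quantity, the constituent it is expressed as, and the process it is attributed to, which correspond to different parts of the description that the later decomposition steps need to account for.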

Coming soon!

air temperature

tendency_of_air_temperature_due_to_longwave_heating_from_volcanic_ambient_aerosol_particles

The phrase "tendency_of_X" means derivative of X with respect to time. Air temperature is the bulk temperature of the air, not the surface (skin) temperature. The specification of a physical process by the phrase "due_to_" process means that the quantity named is a single term in a sum of terms which together compose the general quantity named by omitting the phrase. The term "longwave" means longwave radiation. "Aerosol" means the system of suspended liquid or solid particles in air (except cloud droplets) and their carrier gas, the air itself. "Ambient_aerosol" means that the aerosol is measured or modelled at the ambient state of pressure, temperature and relative humidity that exists in its immediate environment. "Ambient aerosol particles" are aerosol particles that have taken up ambient water through hygroscopic growth. The extent of hygroscopic growth depends on the relative humidity and the composition of the particles. To specify the relative humidity and temperature at which the quantity described by the standard name applies, provide scalar coordinate variables with standard names of "relative_humidity" and "air_temperature". Volcanic aerosols include both volcanic ash and secondary products such as sulphate aerosols formed from gaseous emissions of volcanic eruptions.

Coming soon!