RESEARCH DATA MANAGEMENT SERVICE GROUP
Comprehensive Data Management Planning & Services

Metadata and describing data

Metadata: the who, what, when, where, why, how of your research.

Properly describing and documenting data allows users (yourself included) to understand and track important details of the work. The information needed to discover, use, and understand data is referred to as metadata. Metadata describes the who, what, when, where, why, and how of your data in the context of your research, and should provide enough information that users know what can and cannot be done with your data. Metadata also facilitates search and retrieval of the data once it is deposited in a data repository, and creating it is an important step toward FAIR (findable, accessible, interoperable, reusable) data.

Describing your data

Metadata can include content such as:

  • contact information,
  • geographic locations,
  • details about units of measure,
  • abbreviations or codes used in the dataset,
  • instrument and protocol information,
  • survey tool details,
  • provenance,
  • version information,
  • and much more.

It is important to describe your data with sufficient detail so that users can:

  • reconstruct the context in which the data were created
  • evaluate whether the data are fit for purpose
  • further analyze and reuse the data appropriately

You may need to describe several facets of your data, including:

  • overall bibliographic information about the dataset (e.g., title, author, related publications)
  • types of files used (e.g., csv, txt, png)
  • key descriptive information about the experiment (e.g., sampling or measurement methods, software used for analysis, any processing or transformations performed)

Your field may have commonly used data formats that help capture and structure relevant metadata. When possible, structure your metadata using an appropriate, agreed-upon metadata standard (see below for examples and guidelines). When no appropriate metadata standard exists, consider composing a "readme" style metadata document, as described in this guide.
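As a rough illustration of a "readme" style metadata document, the sketch below renders a plain-text readme from a small dictionary of fields. The field names, the build_readme helper, and the example values are all assumptions made for illustration; they are not part of any formal standard or of this guide's own template.

```python
# Hypothetical field names chosen for illustration; they are not a formal standard.
REQUIRED_FIELDS = ["title", "authors", "contact", "methods", "file_formats"]


def build_readme(metadata):
    """Render a plain-text readme from a dictionary of metadata fields."""
    missing = [name for name in REQUIRED_FIELDS if not metadata.get(name)]
    if missing:
        raise ValueError("Missing metadata fields: " + ", ".join(missing))

    # Required fields come first, then any extra fields in alphabetical order.
    extras = sorted(name for name in metadata if name not in REQUIRED_FIELDS)
    lines = []
    for name in REQUIRED_FIELDS + extras:
        value = metadata[name]
        if isinstance(value, (list, tuple)):
            value = "; ".join(str(item) for item in value)
        lines.append(f"{name.replace('_', ' ').upper()}: {value}")
    return "\n".join(lines) + "\n"


# Placeholder values only; replace with the details of your own dataset.
example = {
    "title": "Example stream temperature dataset",
    "authors": ["A. Researcher", "B. Collaborator"],
    "contact": "researcher@example.edu",
    "methods": "Hourly readings from in-stream loggers",
    "file_formats": ["csv", "txt"],
    "units": "temperature in degrees Celsius",
}
print(build_readme(example))
```

Failing loudly on a missing field is a deliberate choice here: it is usually easier to fill in a gap while the work is fresh than after the data have been deposited.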

Metadata formats and standards

Specific disciplines, repositories, or data centers may guide or even dictate the content and format of metadata, possibly using a formal standard. Some standards describe general information such as bibliographic metadata; others describe specific data types or are designed for specific disciplines. Some examples of metadata standards are listed below.

To find an appropriate metadata standard for your discipline, try one of these guides:

Tools for generating structured metadata are available for a variety of standards and use cases (e.g., Morpho for creating EML, Nesstar for DDI).

Examples of different metadata standards:

  • Dublin Core - domain agnostic, basic and widely used metadata standard
  • DDI (Data Documentation Initiative) - common standard for social, behavioral and economic sciences, including survey data
  • EML (Ecological Metadata Language) - specific for ecology disciplines
  • ISO 19115 and FGDC-CSDGM (Federal Geographic Data Committee's Content Standard for Digital Geospatial Metadata) - for describing geospatial information
  • MINSEQE (Minimum Information about a high-throughput SEQuencing Experiment) - Genomics standard
  • FITS (Flexible Image Transport System) - Astronomy digital file standard that includes structured, embedded metadata
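To give a feel for what structured metadata looks like in practice, here is a minimal sketch that writes a few Dublin Core elements as XML using Python's standard library. The namespace URI and the fifteen element names come from the Dublin Core element set; the "metadata" wrapper element, the dublin_core_record helper, and the sample values are assumptions for illustration, and a given repository may expect a different wrapper or schema.

```python
import xml.etree.ElementTree as ET

# Namespace of the fifteen-element Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Return an XML string with one dc: element per supplied value."""
    unknown = set(fields) - DC_ELEMENTS
    if unknown:
        raise ValueError("Not Dublin Core elements: " + ", ".join(sorted(unknown)))

    # The wrapper element name is an assumption; repositories differ on this.
    root = ET.Element("metadata")
    for name, values in fields.items():
        if isinstance(values, str):
            values = [values]  # allow a single string or a list of strings
        for value in values:
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values only.
record = dublin_core_record({
    "title": "Example stream temperature dataset",
    "creator": ["A. Researcher", "B. Collaborator"],
    "format": "text/csv",
})
print(record)
```

Dublin Core elements are repeatable, which is why a list of creators becomes several dc:creator elements rather than one combined string.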

Related information

Best Practices in Creating Metadata (ICPSR): part of ICPSR's Guide to Social Science Data Preparation and Archiving

Metadata Best Practices (DataONE)

Metadata Services (RDMSG)

Preparing FAIR data for reuse and reproducibility (RDMSG)

 

Page last updated Sep 2022