RESEARCH DATA MANAGEMENT SERVICE GROUP
Comprehensive Data Management Planning & Services

Data Management Planning

Overview

Many research funders require the submission of a data management plan with a grant proposal. The guide below describes the major areas that researchers are well-advised to consider in preparing a data management plan. The content is based on existing requirements as well as best practices documents from data archives, specific disciplines, and other sources. This guide is not specific to any particular funder, discipline, or type of data, and prospective PIs should always review the call for proposals and the requirements of the funder.

Related information:
  • FAQs - Frequently asked questions about writing a data management plan
  • Funder Data Requirements - More information on meeting public access requirements of various funding agencies

Related services:
  • Contact the RDMSG consultants at rdmsg-help@cornell.edu for general data management planning guidance and review of draft data management plans. 
  • Data Management Plan Tool (DMPTool) - Online tool for creating data management plans, with templates for many funding agencies.

Guide to writing a Data Management Plan (DMP)

General description

Provide a general description of the data expected to be produced over the course of the project. Consider including:

  • A brief, non-technical description of the data (including code or software, if appropriate) that gives a very general idea of what the project will generate.
  • If the proposed research involves obtaining data from other sources, provide a brief description of the data including its content, source, and any particular conditions for obtaining and using the data. Describe plans for redistributing any derived data products, if applicable.
  • Indicate which data you will share and at what stage (raw, processed, reduced, or analyzed).
  • Describe why the data you will share will be of interest to a broader community and how your plan will maximize the potential for reuse of the data.


Managing project data

Describe how data will be managed during the active phase of the proposed research project:

  • Describe how data will be collected and processed, including any quality assurance/quality control procedures.
  • Identify the formats of data files created over the course of the project and the approximate volume of data (if known).
  • Describe how data sets will be organized: how data will be distributed among files, file naming conventions, directory organization, and version management.
  • Describe how data sets will be stored and backed up during the course of the project, including the hardware, storage environment, and local or external services to be used. Include the costs for these services in the proposal budget, if applicable.
  • If your research is subject to oversight by the Institutional Review Board, refer to applicable requirements and describe how your data management practices will ensure compliance.
  • Identify who will have access to working data and how access will be managed. For sensitive data, describe the security measures and any formal standards that will be used.
  • Describe what metadata will be created or captured, when it will be created, and who will create it.
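
A file naming convention is easier to describe in a plan, and to follow during the project, when it can be checked automatically. The sketch below is one illustrative possibility, not a recommended standard: the pattern (project, site, collection date, two-digit version) and the function names `build_name` and `parse_name` are assumptions made for this example.

```python
import re
from datetime import date

# Hypothetical convention: <project>_<site>_<YYYYMMDD>_v<NN>.<ext>
# Lowercase letters and digits only, underscores between parts.
NAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_(?P<site>[a-z0-9]+)_"
    r"(?P<date>\d{8})_v(?P<version>\d{2})\.(?P<ext>[a-z0-9]+)$"
)


def build_name(project, site, collected, version, ext):
    """Build a file name that follows the hypothetical convention."""
    return (
        f"{project.lower()}_{site.lower()}_"
        f"{collected:%Y%m%d}_v{version:02d}.{ext.lower()}"
    )


def parse_name(filename):
    """Return the name's parts as a dict, or None if it breaks the convention."""
    match = NAME_PATTERN.match(filename)
    return match.groupdict() if match else None


if __name__ == "__main__":
    name = build_name("Soil", "Ithaca", date(2024, 3, 5), 2, "csv")
    print(name)
    print(parse_name(name))
    print(parse_name("final data (2).xlsx"))
```

Running a check like `parse_name` over a project directory is one way to find files that have drifted from the convention before they are shared or archived.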

Sharing data

Sharing data makes it possible to conduct synthetic and comparative studies, to validate research results, and to reuse data for teaching and further research. Funders seek to maximize the impact of the research they fund by encouraging or requiring data sharing. 

See also Best practices: Sharing data

A data sharing plan should:

  • Describe how the data will be made available (via a disciplinary data center or repository, an institutional repository, as supplementary material supporting a publication, or other strategy).
  • Include a description of file formats to be used for the data that will be shared. Select file formats for sharing and archiving that maximize the potential for reuse and longevity, and describe the plans for conversion to those formats, if necessary.
  • Include a plan for creating metadata to describe the data. Indicate who will create metadata and when they will do so. Identify the standards that will be used. If no applicable standards exist, indicate this in the data management plan and describe what supplementary documentation you will make available to make publicly shared data understandable and usable by others.
  • Describe conditions for reuse of the data by others. Describe any standard licenses that will be applied to data, as well as any additional terms of use.
  • Describe how users will discover the data (via a specific repository, references in publications, project website, Internet search engines, or other means).
  • Describe how users will obtain the data (direct download, registration and download, or upon request).
  • If acquiring data from another source, describe whether the data or derived versions of the data will be shared, and under what conditions.
  • If data will not be made immediately available, indicate when data will be shared.
  • Indicate who will have primary responsibility for the data and who owns the data. For sponsored research at Cornell, the university is usually considered the owner; see Introduction to intellectual property rights in data management for more information.
  • Describe how your data sharing strategy will maximize the value of the data to the audiences of interest (a particular research community, the general public, etc.).
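
Several of the items above (file format, license, access method, responsible party) can be recorded alongside the data in a small machine-readable file. The following is a minimal sketch, assuming a JSON "sidecar" file is acceptable for your repository. The field names in `REQUIRED_FIELDS` are illustrative and do not follow any formal metadata standard; where a disciplinary standard exists, use that instead.

```python
import json

# Hypothetical minimum set of fields for a shared data set.
REQUIRED_FIELDS = (
    "title",
    "creator",
    "description",
    "license",
    "file_format",
    "access",
)


def missing_fields(record):
    """List required fields that are absent or empty in the record."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(record.get(field, "")).strip()
    ]


def to_sidecar_json(record):
    """Serialize a complete record to JSON; refuse incomplete records."""
    missing = missing_fields(record)
    if missing:
        raise ValueError("missing metadata fields: " + ", ".join(missing))
    return json.dumps(record, indent=2, sort_keys=True)


if __name__ == "__main__":
    example = {
        "title": "Example soil moisture measurements",
        "creator": "Example Lab",
        "description": "Illustrative record; all values are placeholders.",
        "license": "CC0-1.0",
        "file_format": "CSV",
        "access": "direct download",
    }
    print(to_sidecar_json(example))
```

Refusing to write an incomplete record is a deliberate choice here: it surfaces gaps at the time the data are prepared rather than after deposit.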

Preserving data

Research data may be useful well beyond the end of a research project. Ensuring that data are usable and understandable by others requires some thought and planning. The specifics may be closely related to your sharing strategy, and/or determined by the services you choose to use, such as external repositories, data centers, or publishers.

A data preservation plan should:

  • Identify any departmental, institutional, or programmatic policies on data retention, how they influence your plan, and how you will adhere to the policies.
  • Explain the criteria that will be used for selecting data for retention and preservation. Specify how long data will be retained or preserved. Some data may only be retained for the lifetime of the project, some may be retained for the project plus a specified number of years, and some may be worth the effort of long-term preservation (several years to decades). Consider what data are needed to validate the research, what data directly support publications based on the research, and what data have the greatest potential for reuse by others.
  • Include a description of file formats to be used for the data that will be preserved. Select file formats for sharing and archiving that maximize the potential for reuse and longevity, and describe the plans for conversion to those formats, if necessary.
  • Include a plan for creating metadata to describe the data. Indicate who will create metadata and when they will do so. Identify the standards that will be used. If no applicable standards exist, indicate this in the data management plan and describe what supplementary documentation you will make available to make publicly shared data understandable and usable by others.
  • Describe plans for transfer of responsibility, should the need arise. These plans should be agreed upon in consultation with access and preservation service providers.
  • Include costs for any of these activities or services in your proposal budget, if applicable.

Special considerations

Additional considerations for managing data include security of data and the protection of privacy, and policies related to intellectual property.

Protection:

Researchers may have ethical or legal obligations to maintain confidentiality and to protect the privacy of research subjects, or may have other circumstances requiring secure data storage or restricted access to data, such as licensing restrictions that prohibit data sharing.

  • Describe how the data itself will be managed (e.g. measures taken to anonymize data, disposition of data including personally identifiable information, etc.) to protect privacy.
  • Describe how data will be stored, if secure storage and/or restricted access are required.
  • Some funding agencies (including the NSF) recognize that legal and ethical requirements may preclude sharing of some kinds of data. If this is the case, explain in your data management plan the circumstances that prevent you from sharing data.
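
As one illustration of the measures mentioned above, direct identifiers can be dropped and participant IDs replaced with keyed hashes before data leave the secure environment. This is a minimal sketch only: the column names in `DIRECT_IDENTIFIERS`, the `participant_id` field, and the function `pseudonymize` are assumptions for the example. Note that this is pseudonymization, not full anonymization; remaining fields may still allow re-identification, and the key must be stored securely and separately from the data.

```python
import hashlib
import hmac

# Hypothetical column names that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "email"}


def pseudonymize(record, secret_key, id_field="participant_id"):
    """Drop direct identifiers and replace the ID with a keyed hash.

    secret_key must be bytes. The same key and ID always give the
    same pseudonym, so records can still be linked across files.
    """
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS
    }
    digest = hmac.new(
        secret_key,
        str(record[id_field]).encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    cleaned[id_field] = digest[:16]  # shortened for readability
    return cleaned


if __name__ == "__main__":
    row = {
        "participant_id": 1042,
        "name": "Example Person",
        "email": "person@example.org",
        "response": 4,
    }
    print(pseudonymize(row, b"replace-with-a-real-secret"))
```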

Intellectual Property:

Copyright protection does not necessarily extend to data, but some standard licensing options (Creative Commons, Open Data Commons) exist. Many metadata standards accommodate rights or usage statements where conditions for reuse may be expressed. Note that some funding agencies (including the NSF) recognize that commercialization potential may delay or preclude data sharing, and exempt trade secrets and commercial information from the data sharing requirement.


Additional resources