Cornell University

Cornell Data Services


File formats for preservation

“Working” file formats (i.e., those used when collecting and working with project data) may not be ideal for re-use or long-term preservation.

In the absence of specific directives from funders or repositories, we offer the following general guidelines for selecting file formats for preservation and re-use. eCommons@Cornell, a repository service based at Cornell University Library, provides more detailed information in its support document, Recommended File Formats for eCommons.

Principles for selecting file formats

Select open, non-proprietary formats

Open, non-proprietary formats are better for re-use and long-term preservation because they do not depend on costly software and can be created and opened with open-source or free software. As a general rule, plain text formats, such as comma- or tab-delimited files, are open formats and are typically a good choice for both purposes.

  • Example of a proprietary format: Photoshop .psd file
  • Example of an open format: .tiff image file
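As an illustration of the plain text recommendation, here is a minimal sketch that writes tabular data to comma-delimited text and reads it back using only Python's standard library. The function names and the sample columns (`sample_id`, `temperature_c`) are hypothetical and chosen for this example; they do not come from any Cornell tool.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Serialize a header and data rows as comma-delimited plain text."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def csv_to_rows(text):
    """Parse comma-delimited text back into a list of rows (lists of strings)."""
    return list(csv.reader(io.StringIO(text, newline="")))


# Hypothetical example data, for illustration only.
header = ["sample_id", "temperature_c"]
rows = [["A1", "21.5"], ["A2", "19.0"]]

text = rows_to_csv(header, rows)
print(text)
print(csv_to_rows(text) == [header] + rows)
```

Because the result is ordinary text, it can be opened in any text editor or spreadsheet program without the software that produced it.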

Select “lossless” formats

Formats that compress the information in a file produce smaller files, but some kinds of compression permanently remove data from the file. Formats that discard data in this way are “lossy,” while formats that restore all of the original information when uncompressed are “lossless.”

  • Examples of lossy formats: .mp3 audio file, .jpeg image file
  • Examples of lossless formats: .wav audio file, .tiff image file
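The difference can be sketched in a few lines of Python. The first function uses the standard `zlib` module, a lossless compressor, so the round trip returns the original bytes exactly. The second is a deliberately simplified toy that rounds values to a coarser step; it is an assumption made for illustration and not how .mp3 or .jpeg encoding actually works, but it shows why discarded detail cannot be recovered.

```python
import zlib


def lossless_round_trip(data: bytes) -> bytes:
    """Compress and decompress with zlib; the output equals the input."""
    return zlib.decompress(zlib.compress(data))


def lossy_round_trip(samples, step=10):
    """Toy lossy scheme: keep only each value's nearest multiple of `step`."""
    stored = [round(s / step) for s in samples]  # detail is discarded here
    return [q * step for q in stored]            # restored values are approximations


original_bytes = b"temperature,21.5\n" * 100
print(lossless_round_trip(original_bytes) == original_bytes)

original_samples = [3, 14, 27, 101]
print(lossy_round_trip(original_samples))
```

In the toy lossy case the restored values differ from the originals, and no later processing can reconstruct what was dropped.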

Select unencrypted and uncompiled formats

If the encryption key, passphrase, or password to a file is lost, there may be no way to retrieve the data from the file later, rendering it unusable to others. Uncompiled source code is more readily re-usable by others because it can be recompiled for different architectures and platforms.

Learn More

  • Funder Data Requirements
  • Data Management Guidelines from the California Digital Library
DMP Tool

Online tool for creating data management plans, with templates for many funding agencies.


Creative Commons License: This work is licensed under a Creative Commons Attribution 4.0 International License
