
Data Management

A guide to dealing with research data throughout the data lifecycle.

Data Management Plans

Funding agencies are beginning to require the submission of data management plans (DMPs) as part of grant applications. Although many funders have unique requirements for the information included in a DMP, this guide will help you understand and address the most common requirements.

DMPs describe the lifecycle of your data: how it will be created, curated, preserved, and (if possible) shared. Funders--particularly government agencies--are conscious of the importance of making research and underlying data available. In some cases, if you do not plan to share your data, you will need to justify that decision in your grant application.

Remember to include data management costs in your grant application! DataONE has provided guidance on how to calculate these costs.

Data Documentation

Data management plans typically include information about how data will be documented and contextualized. Determining strategies for managing data will not only help with the completion of the grant application, but will also be beneficial to the researcher or research team. Creating robust metadata, establishing file naming conventions, and paying attention to file formats will ensure that data remains retrievable and intelligible throughout the life of the project and beyond.


File naming conventions

It's easy to create bad filenames. When you're working with data made up of multiple files, it is essential to name files descriptively and consistently.

Tips for creating descriptive and consistent filenames:

  • Include the date of creation (the recommended format is ISO 8601: YYYY-MM-DD). Although this information is captured by your operating system and potentially by your instrumentation, including the date in the filename will enhance preservation and sorting.
  • Avoid using spaces; use underscores (_) or dashes (-) instead.
  • Avoid using special characters, such as ! @ $ % ^ & * ( ) [ ] { } ; : " ' , / \
  • Keep filenames short. Although most modern operating systems can handle filenames up to 255 characters, limiting filenames to under 30 characters will allow greater cross-compatibility.
  • Keep sorting in mind when structuring your filenames. For instance, if you begin the filename with the date, you will be able to sort files in a folder chronologically. If you lead with a researcher's initials, you can sort by researcher. Think about how you might need to order your data.
  • Use consistent and meaningful abbreviations.
  • Use unique identifiers (descriptive or numerical).
  • Use leading zeroes when numbering files sequentially. With "001," "002," ... "010," files stay in numerical order when sorted by name; without the zeroes, "10" sorts before "2."
  • Consider developing a versioning convention to add to the end of filenames after you edit them (e.g. v01, v02, etc.).

Other considerations:

  • Organize data in hierarchical folder structures.
  • Document your file naming conventions. This will provide a record of decisions and standards that will ensure consistency over time.
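The filename tips above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_filename`, the order of the parts (date, description, sequence number, version), and the underscore separator are hypothetical choices, not a required standard. Adapt them to whatever convention your team documents.

```python
import re
from datetime import date


def build_filename(created, description, sequence, version, ext):
    """Assemble a filename from its parts (hypothetical convention).

    Order of parts: ISO 8601 date, short description, zero-padded
    sequence number, version suffix, then the file extension.
    """
    parts = [
        created.isoformat(),   # YYYY-MM-DD, so files sort chronologically
        description,
        f"{sequence:03d}",     # leading zeroes: 7 becomes "007"
        f"v{version:02d}",     # versioning suffix: 2 becomes "v02"
    ]
    cleaned = []
    for part in parts:
        part = part.strip().replace(" ", "-")          # no spaces
        part = re.sub(r"[^A-Za-z0-9-]", "", part)      # no special characters
        cleaned.append(part)
    return "_".join(cleaned) + "." + ext.lstrip(".").lower()


def is_short_enough(filename, limit=30):
    """Return True when the filename is under the suggested length."""
    return len(filename) < limit
```

For example, `build_filename(date(2024, 3, 7), "pH log!", 7, 2, "csv")` returns `2024-03-07_pH-log_007_v02.csv`, which is 29 characters long and sorts by date when listed in a folder.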

Metadata: data about your data

Metadata describes and provides context for your data. It can also help others find your data. The following links provide guidance on creating metadata, or you can ask a librarian for help.

Image by cea+; CC BY 2.0


File formats

Most software programs have a default file output format. However, it is important to consider whether those files will be accessible using other programs in the future.

Here are some things to consider when choosing a file format:

  • Is the data in a proprietary format, or is it an open format?
  • What types of software are required to view or use the data?
  • How sustainable is the format you're using? See the Library of Congress page for more info on digital formats.

Find more information here and here.

Storage & Backup

Make sure to keep your data safe from disaster (everything from fire to water damage to hard drive crashes) by saving your data in at least three different places. You may want to consider saving your data in the cloud, on an external drive, and on external media (like discs) as well.

Also make sure you protect any sensitive data from hackers or theft. Lock filing cabinets, use encryption, and password protect sensitive information.
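Keeping copies in several places can be partly automated. The sketch below is a minimal illustration, not a complete backup system: the function names `sha256_of` and `back_up` are hypothetical, the destination folders stand in for wherever your external drive or synced cloud folder is mounted, and the checksum comparison is an added assumption (the guide itself does not prescribe it) used to confirm that each copy matches the original.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source, destination_dirs):
    """Copy one file into each destination folder and verify every copy.

    Raises OSError if a copy's checksum differs from the original's.
    Returns the list of paths to the verified copies.
    """
    source = Path(source)
    expected = sha256_of(source)
    copies = []
    for directory in destination_dirs:
        directory = Path(directory)
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / source.name
        shutil.copy2(source, target)  # copies contents and timestamps
        if sha256_of(target) != expected:
            raise OSError(f"Copy at {target} does not match the original")
        copies.append(target)
    return copies
```

A call such as `back_up("results.csv", ["/path/to/external_drive", "/path/to/cloud_folder"])` would leave the original in place plus two verified copies, for three locations in total. The paths shown are placeholders.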

 

Image by Blue Coat Photos; CC BY-SA 2.0

Data life

It is not necessary to keep all data forever, and it may even be detrimental to do so. If you're unsure whether you should keep your data, check this 5-step guide from the Digital Curation Centre.

Sharing data

Sharing data (as long as it is not sensitive or confidential) helps make new discoveries possible. Shared data can also be cited, which may increase your scholarly profile. If you'd like to upload your data to a repository, you can select a general repository (like figshare) or a discipline-specific one (like the Archaeology Data Service). Find information about trusted repositories by discipline or view more repositories here.

Before sharing your data, you should consider how you would like to license your data. Columbia University has much more information on data licensing and copyright.

If you are interested in depositing your data in Duquesne's Digital Commons, please contact Gesina Phillips, Digital Scholarship Librarian.

Image by Juhan Sonin; CC BY 2.0