Research Data Toolkit

This guide is intended to help researchers, data creators or others who manage digital data as part of a research project, plan, organize, describe, share and preserve their research data for the long term.

Why Is Good File Management Important?

Another facet of the research data management process is to develop good file management practices. Managing your files carefully will assist with identifying, locating, and maintaining data efficiently and effectively.

In addition, establishing consistent naming conventions will help with distinguishing one file from another. This is especially important when you have multiple files in various formats and need to locate them quickly.

Tips for Naming and Organizing Files

Here are a few tips for organizing and naming data files:

Organizing electronic files systematically and consistently in folders will save time when you and your research partners search for them. In addition, applying file naming conventions (FNC) brings order to a varied group of files and allows you to group files with similar information in a logical manner.

Tips for Naming Your Files

  • You don't have to include all of these, but elements of a good file name may include:

                  o   Version Name

                  o   Date the record was created

                  o   Initials of the person who created the record

                  o   Short description of record contents

                  o   Name of research team associated with the data record

                  o   Date data was published

                  o   Project name and number

                  o   Type of data

  • Keep your file names short; a common guideline is fewer than 25 characters.
  • Choose a standard and meaningful vocabulary for file names, so that everyone uses a common language.
  • Format dates as year-month-day: YYYY-MM-DD, YYYY-MM, or, for a range of years, YYYY-YYYY.
  • When separating words, do not use blank spaces. Instead, consider the following formatting:

                  o   Underscores, file_name.xxx

                  o   Dashes, file-name.xxx

                  o   No separation, filename.xxx

  • Special characters such as  ~ ! @ # $ % ^ & * ( ) ` ; < > ? , [ ] { } ' " should be avoided.
  • If you are using a sequential numbering system, pad the numbers with leading zeros so that files sort in order, for example "001, 002, ... 010, 011".
  • Include file extensions in file names to distinguish one file type from another (docx, xls, pdf, jpeg, etc.).
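
The naming tips above can be sketched as a small helper. This is only an illustration, not a prescribed tool: the function names, the choice of underscores as the separator, the element order, and the decision to apply the 25-character guideline to the name without its extension are all assumptions made for the example.

```python
from datetime import date

# Characters the tips above advise against, plus the blank space.
FORBIDDEN = set('~!@#$%^&*()`;<>?,[]{}\'" ')
# Assumption: the length guideline applies to the name without its extension.
MAX_STEM_LENGTH = 25


def build_file_name(project, description, created, sequence, extension):
    """Join the chosen elements with underscores into one file name."""
    parts = [
        project,
        description,
        created.isoformat(),   # YYYY-MM-DD
        f"{sequence:03d}",     # leading zeros: 001, 002, ... 010
    ]
    return "_".join(parts) + "." + extension


def check_file_name(name):
    """Return a list of problems found; an empty list means the name passes."""
    problems = []
    stem, dot, extension = name.rpartition(".")
    if not (stem and dot and extension):
        problems.append("missing file extension")
        stem = name
    bad = sorted(set(name) & FORBIDDEN)
    if bad:
        problems.append(f"avoid these characters: {bad!r}")
    if len(stem) >= MAX_STEM_LENGTH:
        problems.append(
            f"name is {len(stem)} characters; aim for fewer than {MAX_STEM_LENGTH}"
        )
    return problems


# Hypothetical project and description, used only to show the pattern.
name = build_file_name("soil", "ph", date(2024, 3, 5), 7, "csv")
print(name)                                    # soil_ph_2024-03-05_007.csv
print(check_file_name(name))                   # []
print(check_file_name("my data (final).csv"))  # one problem: forbidden characters
```

A check like this could be run over a project folder before files are shared, so that names drifting from the agreed convention are caught early.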

Keeping Track of File Versions (Version Control)

Versioning occurs when you change an existing file and save the result as a new copy. It is important to document new versions of files so that the latest version can be identified.

It is recommended that you use ordinal numbers for major versions and add decimals for minor changes, e.g. v1, v1.1, v2.6.
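
As a rough sketch of how the latest version can be identified from names that follow this pattern, the example below reads a trailing tag such as _v1 or _v1.1 and compares versions. The tag position (just before the extension), the function names, and the file names are assumptions for illustration. The minor part is treated as a counter, so v2.10 is considered later than v2.6; a project that reads it as a true decimal would need a different comparison.

```python
import re

# Matches a trailing version tag such as _v1, _v1.1 or _v2.6 before the extension.
# Assumption: every versioned file name ends with a tag followed by an extension.
VERSION_TAG = re.compile(r"_v(\d+)(?:\.(\d+))?\.[A-Za-z0-9]+$")


def version_key(file_name):
    """Return (major, minor) for a versioned file name, or None if it has no tag."""
    match = VERSION_TAG.search(file_name)
    if match is None:
        return None
    major = int(match.group(1))
    minor = int(match.group(2) or 0)  # "v2" is treated as "v2.0"
    return (major, minor)


def latest_version(file_names):
    """Pick the name with the highest (major, minor) version among tagged names."""
    tagged = [(version_key(n), n) for n in file_names if version_key(n) is not None]
    if not tagged:
        return None
    return max(tagged)[1]


# Hypothetical file names.
files = ["survey_v1.csv", "survey_v1.1.csv", "survey_v2.6.csv", "readme.txt"]
print(latest_version(files))  # survey_v2.6.csv
```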

Documentation & Metadata

Incorporating documentation or metadata into your data management procedures allows users (yourself included) to understand and track important details of the project. It also ensures that when another researcher seeks your data in a repository, it is searchable because you have described it in specific language.

Metadata Formats and Standards

Metadata is often defined as "data about data". Metadata formats and standards can vary by discipline, repository or data center. Metadata can include content such as title, creator, dates, abbreviations or codes, instrument and protocol information, funder information, provenance identifier, language, location, and much more. The key is to make your metadata as descriptive as possible and ensure that it covers as much of your dataset as you want to be searchable.
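
To make this concrete, here is a minimal sketch of a metadata record kept alongside a data file and checked for completeness. The field names, the list of required fields, and all of the values are hypothetical; a real repository or discipline standard will define its own elements.

```python
import json

# Assumption: a project-specific minimum set of fields, chosen for illustration.
REQUIRED_FIELDS = ("title", "creator", "date_created", "language")


def missing_fields(record, required=REQUIRED_FIELDS):
    """List required fields that are absent or empty in the record."""
    return [field for field in required if not record.get(field)]


# Hypothetical record describing a hypothetical data file.
record = {
    "title": "Soil pH survey, example site",
    "creator": "Doe, Jane",
    "date_created": "2024-03-05",
    "language": "en",
    "location": "Example County",
    "funder": "Example Foundation",
    "codes": {"NA": "measurement not taken"},
}

print(missing_fields(record))  # []

# Serialize the record so it can be saved next to the data file.
record_json = json.dumps(record, indent=2, ensure_ascii=False)
```

Keeping the record in a plain, structured format such as JSON makes it easier to convert later into whichever schema the chosen repository requires.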

There are a couple of places to start when you are trying to determine which metadata schemas are appropriate for your dataset.

The Digital Curation Centre provides a catalogue of metadata standards by discipline on its website: http://www.dcc.ac.uk/resources/metadata-standards.

The Research Data Alliance (RDA) “Metadata Directory” (http://rd-alliance.github.io/metadata-directory/standards/) also has a good listing of possible metadata standards by discipline.

In addition, you could seek advice from the repository where you plan to store your data. Metadata editors are another good resource; they allow users to view and edit metadata tags interactively on the computer screen. A full listing of metadata editing programs can be found at the DataONE site.

Links to Commonly Used Metadata Schemas 

The Data Documentation Initiative (DDI) is an international standard for describing social, behavioral and economic sciences research data.

Dublin Core is a basic schema, widely used for simple resource description across disciplines.

Ecological Metadata Language (EML) is specific to ecology.

Darwin Core is a body of standards used in the field of biology.

ISO 19115 is an internationally adopted schema for describing geographic information and services.

Flexible Image Transport System (FITS) is an astronomy digital file standard that includes structured, embedded metadata.
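
As an illustration of what a simple schema looks like in practice, the sketch below builds a small Dublin Core description as XML using Python's standard library. The element names (title, creator, date, language, format) come from the Dublin Core element set and its namespace; the wrapper element name "metadata", the function name, and all of the values are assumptions made for the example, and a repository may expect a different container or a richer profile.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a wrapper element holding one dc:* child per (element, value) pair."""
    root = ET.Element("metadata")  # assumption: illustrative wrapper name
    for element, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return root


# Hypothetical values describing a hypothetical dataset.
fields = [
    ("title", "Soil pH survey, example site"),
    ("creator", "Doe, Jane"),
    ("date", "2024-03-05"),
    ("language", "en"),
    ("format", "text/csv"),
]

xml_text = ET.tostring(dublin_core_record(fields), encoding="unicode")
print(xml_text)
```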