Data storage and backup

Active data storage

Things to consider:

  • How much data will your project generate? This is something to consider during the planning phase, because storage costs should be factored into the overall data management plan.
  • Who will need access to the data during the project's active phase? Collaborative research means additional challenges to storage and access.
  • Will the project involve confidential or sensitive information? If so, you'll need to take extra precautions to avoid accidental disclosure.

Source: University of Saskatchewan

Data backup

What files should I back up?

Ideally, you should back up all the data files and their associated documentation files, such as metadata files, files describing the methodology or the instruments used to obtain the data, and files describing any manipulation or transformation of the dataset.

How often should I back up my files?

There are no absolute rules prescribing how often data files should be backed up. However, critical files, especially datasets under construction, should be backed up every time they are modified. Less crucial files can be backed up at regular intervals, for instance daily or weekly. You should use a software or hardware solution that automates your backup plan and can handle incremental backups, that is, backups that copy only the files that have changed since the previous run.
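
To illustrate what an incremental backup does, here is a minimal Python sketch that copies only the files that are new or have changed since the last run. The function name `incremental_backup` and the change test (modification time and file size) are assumptions made for this example; dedicated backup software is more robust and should be preferred for real projects. The sketch also assumes the destination folder is not located inside the source folder.

```python
import shutil
from pathlib import Path


def incremental_backup(source: Path, destination: Path) -> list[Path]:
    """Copy files that are new or changed since the last run.

    Returns the relative paths of the files that were copied.
    """
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        relative = src_file.relative_to(source)
        dst_file = destination / relative
        if dst_file.exists():
            src_stat, dst_stat = src_file.stat(), dst_file.stat()
            # Assumption: a file is unchanged if the backup copy is at least
            # as recent as the original and has the same size.
            if (src_stat.st_mtime <= dst_stat.st_mtime
                    and src_stat.st_size == dst_stat.st_size):
                continue
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 keeps the modification time
        copied.append(relative)
    return copied
```

Running `incremental_backup(Path("project_data"), Path("/mnt/backup/project_data"))` (both paths are placeholders) from a scheduled task would back up only what changed since the previous run.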

What type of storage should I use to back up my files?

No storage medium is perfect; you should use multiple backup media and store at least one copy in a remote location. Here are some storage solutions (adapted from Mantra -- CC-BY):

Networked drive

Generally managed at the university or departmental level, these devices are regularly backed up (usually on tape) and provide for easy and secure access to your data.

PC or laptop hard drive

A flexible solution while you are working on a dataset, but it should not be the only storage solution you use. Hard drives can fail and PCs can be stolen.

External storage device (USB flash drive, CDs, DVDs)

Although this is a common and affordable backup solution, there are several issues to be aware of:

  • Depending on the size of your dataset, you may have to use multiple devices
  • The longevity of these media is questionable
  • Follow the care and handling instructions carefully
  • Regularly check your files to see if they are accessible and complete
  • Make sure to “refresh” your data by making new copies on a new CD or USB drive
  • Encrypt any confidential data or protect it with a password
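
One way to check that backed-up files are still accessible and complete is to record a checksum for each file when the backup is made and compare it later. The following Python sketch shows the idea under stated assumptions: the helper names (`sha256_of`, `build_manifest`, `verify`) are hypothetical, and SHA-256 is one reasonable choice of checksum among several.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's relative path to its checksum."""
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def verify(folder: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing, unreadable, or changed."""
    problems = []
    for name, expected in manifest.items():
        try:
            if sha256_of(folder / name) != expected:
                problems.append(name)
        except OSError:  # the file is missing or cannot be read
            problems.append(name)
    return problems
```

The manifest would typically be saved alongside the backup (for example as a JSON file) and `verify` run at regular intervals; an empty list means every recorded file is present and unchanged.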

Remote/Cloud storage

Services like Dropbox, Google Drive or OneDrive provide some free storage space on remote servers (more space can be obtained on a subscription basis). Most cloud storage solutions provide automated syncing and data encryption. However, remember that there are drawbacks to using third-party online storage:

  • Legal issues (copyright, data protection licenses) can be complicated or unsatisfactory, especially if the server is located outside of Canada. Using cloud storage is generally not recommended for sensitive data that includes identifying information on human subjects.
  • Bandwidth may be a concern, especially if you have large datasets
  • You are at the mercy of changes in policies and commercial terms

Resources
  • Storage solutions overview from the Consortium of European Social Science Data Archives:
    • Data sensitivity, ease of access, file size and overall data volume affect storage choice. Advantages and disadvantages are detailed as well as precautions that should be taken when working with personal (sensitive) data.