Data storage and backup
At the beginning of your research project, you should devise a storage and backup strategy for all your data and associated files.
For data storage best practices, consult the University of Saskatchewan's "Store Your Data" guide.
Active data storage
Where can I store my active data?
During the active phase of your research, contact Concordia's IT Research Support team. They provide consultation services as well as research storage, research server hosting, and research virtualized servers.
Things to consider (source: University of Saskatchewan):
- How much data will your project generate? This is something to consider during the planning phase, because storage costs should be factored into the overall data management plan.
- Who will need access to the data during the project's active phase? Collaborative research means additional challenges to storage and access.
- Will the project involve confidential or sensitive information? If so, you'll need to take extra precautions to avoid accidental disclosure.
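To put a rough number on the first question, you can measure a pilot dataset and scale it up. The following is a minimal sketch, not an official sizing method. The function names, the scale factor, and the default of three copies are illustrative assumptions and do not come from any Concordia or University of Saskatchewan guideline.

```python
from pathlib import Path


def directory_size_bytes(root: Path) -> int:
    """Total size in bytes of all regular files under root."""
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def projected_storage_gb(pilot_bytes: int, scale_factor: float, copies: int = 3) -> float:
    """Estimate total storage in gigabytes (1 GB = 10**9 bytes).

    pilot_bytes  -- measured size of a pilot or sample dataset
    scale_factor -- assumed ratio of the full project to the pilot
    copies       -- assumed number of copies kept, working copy included
    """
    return pilot_bytes * scale_factor * copies / 1e9
```

For example, a 2 GB pilot that is expected to grow tenfold and is kept in three copies gives `projected_storage_gb(2_000_000_000, 10)`, or 60 GB to budget for.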
What files should I back up?
Ideally, you should back up all data files and their associated documentation, such as metadata files, files describing the methodology or the instruments used to obtain the data, and files describing any manipulation or transformation of the dataset.
How often should I back up my files?
There are no absolute rules prescribing how often data files should be backed up. However, critical files, especially datasets under construction, should be backed up every time they are modified. Less crucial files can be backed up at regular intervals, for instance daily or weekly. You should use a software or hardware solution that automates your backup plan and can handle incremental backups, which copy only the files that have changed since the previous backup.
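To illustrate the idea behind an incremental backup, here is a simplified sketch that copies only the files that are new or whose content has changed since the last run. It is an assumption-laden example rather than a recommended tool: the function names are hypothetical, it keeps no earlier versions, it does not remove files deleted from the source, and it expects the destination folder to sit outside the source folder. Dedicated backup software handles these cases and should be preferred for real projects.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    sha = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            sha.update(chunk)
    return sha.hexdigest()


def incremental_backup(source: Path, destination: Path) -> list[Path]:
    """Copy new or changed files from source to destination.

    Returns the relative paths copied on this run.
    """
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = destination / rel
        # Skip files whose backup copy already has identical content.
        if dst.exists() and file_digest(dst) == file_digest(src):
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also tries to preserve timestamps
        copied.append(rel)
    return copied
```

Running `incremental_backup(Path("project_data"), Path("/mnt/backup/project_data"))` twice in a row (both paths are placeholders) copies everything the first time and nothing the second time, unless a file was edited in between.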
What type of storage should I use to back up my files?
No storage medium is perfect; you should use multiple backup media and store at least one copy in a remote location. Here are some storage solutions (adapted from Mantra -- CC-BY):
Networked drives
Generally managed at the university or departmental level, these devices are regularly backed up (usually on tape) and provide easy and secure access to your data.
PC or laptop hard drive
A flexible solution while you are working on a dataset, but it should not be the only storage solution you use. Hard drives can fail and PCs can be stolen.
External storage device (USB flash drive, CDs, DVDs)
Although this is a common and affordable backup solution, there are several issues and precautions to be aware of:
- Depending on the size of your dataset, you may have to use multiple devices
- The longevity of these media is questionable
- Follow the care and handling instructions carefully
- Regularly check your files to see if they are accessible and complete
- Make sure to “refresh” your data by making new copies on a new CD or USB drive
- Encrypt any confidential data or protect it with a password
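One way to check that files on a backup device are still accessible and complete is to record a checksum for each file when the backup is made and compare against it later. The sketch below is only an illustration under stated assumptions: the function names and the JSON manifest format are hypothetical, and the manifest file should be stored outside the folder being checked.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    sha = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            sha.update(chunk)
    return sha.hexdigest()


def write_manifest(data_dir: Path, manifest: Path) -> None:
    """Record a checksum for every file under data_dir."""
    checksums = {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }
    manifest.write_text(json.dumps(checksums, indent=2))


def verify_manifest(data_dir: Path, manifest: Path) -> dict[str, str]:
    """Return {relative_path: problem} for files that are missing or altered."""
    expected = json.loads(manifest.read_text())
    problems = {}
    for rel, digest in expected.items():
        target = data_dir / rel
        if not target.is_file():
            problems[rel] = "missing"
        elif sha256_of(target) != digest:
            problems[rel] = "changed or corrupted"
    return problems
```

Call `write_manifest` right after making a backup, then run `verify_manifest` on the same device at regular intervals. An empty result means every recorded file is still present and unchanged. Files added after the manifest was written are not reported.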
Cloud storage
Services like Dropbox, Google Drive or OneDrive provide some free storage space on remote servers (more space can be obtained on a subscription basis). Most cloud-storage solutions provide automated syncing and data encryption. However, remember that there are drawbacks to using third-party online storage:
- Legal issues (copyright, data protection, licenses) can be complicated or unsatisfactory, especially if the server is located outside of Canada. Using cloud storage for sensitive data that includes identifying information about human subjects is generally not recommended.
- Bandwidth may be a concern, especially if you have large datasets
- You are at the mercy of changes in policies and commercial terms