
Data storage and backup

Active data storage

Things to consider:

  • How much data will your project generate? This is something to consider during the planning phase, because storage costs should be factored into the overall data management plan.
  • Who will need access to the data during the project's active phase? Collaborative research means additional challenges to storage and access.
  • Will the project involve confidential or sensitive information? If so, you'll need to take extra precautions to avoid accidental disclosure.

Source: University of Saskatchewan
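
To make the first question concrete, a rough estimate of storage needs can be worked out before the project starts. The sketch below is only an illustration: the function name `estimate_storage_gb`, its parameters, and the example figures are assumptions, not part of the original guide, and it uses decimal units (1 GB = 1000 MB).

```python
def estimate_storage_gb(file_count: int, average_file_mb: float,
                        versions_kept: int = 1, backup_copies: int = 2) -> float:
    """Estimate total storage in GB for the working data plus its backup copies.

    All parameters are hypothetical planning inputs:
    - file_count: number of data files the project expects to produce
    - average_file_mb: typical size of one file, in megabytes
    - versions_kept: how many versions of each file are retained
    - backup_copies: number of backup copies kept besides the working copy
    """
    working_gb = file_count * average_file_mb * versions_kept / 1000  # 1 GB = 1000 MB
    return working_gb * (1 + backup_copies)
```

For example, `estimate_storage_gb(2000, 25)` gives 150 GB for 2,000 files of about 25 MB each with two backup copies, a figure you can compare against the capacity of the storage options you are considering.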

Where can I store my active data?

The options below are some of the solutions available for active research data storage.


Microsoft Teams and SharePoint

Concordia faculty and staff have been granted a license for Office 365, which includes access to the standard suite of office applications (Word, Excel, PowerPoint), file storage (OneDrive), intranet (SharePoint), collaboration tools (MS Teams) and more.


Brief description:
  • Teams: Collaboration platform combining chat, video meetings, file storage (including collaboration on files), and application integration. Meant to replace services like Google Drive or Dropbox.
  • SharePoint: Browser-based document management platform.
Example use:
  • Teams: Use to share and collaborate with colleagues, as well as for instant communication to reduce the number of emails.
  • SharePoint: Use to store files for retention purposes, manage versions, and co-edit documents.
Storage capacity: Each Team site has a SharePoint site behind it with a storage limit of 25 TB. Each user can create up to 250 Teams.
Access and collaboration: Files can be accessed and shared with a group. Different permission levels can be granted to different members of a group.
Data allowed: Microsoft employs security measures that meet standards set by the security community; however, Concordia recommends taking additional security measures when sensitive or confidential data is stored. These can include password-protecting documents, using multi-factor authentication to protect your account, or encrypting files. While not always needed, these measures and general cybersecurity awareness help protect the privacy of your data.

See Concordia's Office 365 FAQ and Concordia's Privacy Impact Assessments for more information.
Server location: Québec and Toronto
Versioning: Automatic file versioning for Office 365 files.
Backups: Automatic across both Québec and Toronto data sites. Files can be retrieved using SharePoint for up to 90 days after a user deletes them. The SharePoint site has a Recycle Bin and a Second Stage Recycle Bin from which users can easily recover deleted files themselves.
More information

OneDrive

Concordia faculty and staff have been granted a license for Office 365, which includes access to the standard suite of office applications (Word, Excel, PowerPoint), file storage (OneDrive), intranet (SharePoint), collaboration tools (MS Teams) and more.


Brief description: Used to store personal and work-related files. Files stored in OneDrive are private by default, but there is an option to allow sharing and collaboration with others. Desktop files can be synced with OneDrive to keep a backup in the cloud. Meant to replace personal drives such as the C:\ drive or P:\ drive.
Example use: Use to store personal working and reference documents that you don’t necessarily want to share.
Storage capacity: 100 GB
Access and collaboration: Meant for personal use; however, individual files and folders can be shared with users within Concordia.
Data allowed: Microsoft employs security measures that meet standards set by the security community; however, Concordia recommends taking additional security measures when sensitive or confidential data is stored. These can include password-protecting documents, using multi-factor authentication to protect your account, or encrypting files. While not always needed, these measures and general cybersecurity awareness help protect the privacy of your data.

See Concordia's Office 365 FAQ and Concordia's Privacy Impact Assessments for more information.
Server location: Québec and Toronto
Versioning: Automatic file versioning for Office 365 files.
Backups: Automatic across both Québec and Toronto data sites. Files can be retrieved using SharePoint for up to 90 days after a user deletes them. The SharePoint site has a Recycle Bin and a Second Stage Recycle Bin from which users can easily recover deleted files themselves.
More information: Concordia Office 365 IITS page

OSF

Note that OSF is not a service provided by Concordia.


Brief description: Free online collaboration tool with both hosted and add-on storage options. Use OSF to organize, document and share projects, including files, data, code and protocols.
Example use: Use to work on research projects with multiple collaborators who need varying levels of access.
Storage capacity: 5 GB storage limit per project or component for private projects. 50 GB storage limit per project or component for public projects. Add-on storage extends capacity, but its limits are set by the add-on provider and vary from one provider to another.
Access and collaboration: Can include non-Concordia users. Collaborators can be granted read-only, read-write or administrative permissions.
Data allowed: Most unpublished research data can be added to OSF; however, confidential, restricted, or high-risk data should not be.
Server location: Montréal (default storage location must be set when creating a new project)
Versioning: Automatic file versioning
Backups: Redundant data centers and infrastructure.
More information: OSF website

 

Storing large datasets, or sensitive or confidential data, can be challenging. Although Microsoft Teams/SharePoint/OneDrive can be used for sensitive data if files are password protected or encrypted (see the FAQ on Concordia's Office 365 webpage), the services below can also be considered.

 

Concordia's IT Research Support team: Provides consultation services as well as research storage, research server hosting, and virtualized research servers.

Find out more...
Digital Research Alliance of Canada (the Alliance): The Rapid Access Service allows Principal Investigators (PIs) to request a modest amount of storage. Resource Allocation Competitions are an application-based process for requesting storage and compute resources beyond what is available through the Rapid Access Service.

Find out more about:
REDCap: Note that this is not a service provided by Concordia.

"A secure web application for building and managing online surveys and databases. While REDCap can be used to collect virtually any type of data in any environment (including compliance with 21 CFR Part 11, FISMA, HIPAA, and GDPR), it is specifically geared to support online and offline data capture for research studies and operations."

Find out more...
Pillar Science: Note that this is not a service provided by Concordia.

Research data management, research project management, and research data analysis software for collaborative and interdisciplinary research. The company is based in Montreal.

Find out more...

See also: Active Storage and Security section of the Human Participant Research Data Risk Matrix (p.6) (Portage)

 

Data backup

What files should I back up?

Ideally, you should back up all the data files and associated documentation files, e.g. metadata files, files describing the methodology and/or the instruments used to obtain the data, files describing any manipulation or transformation of the dataset.

How often should I back up my files?

There are no absolute rules prescribing how often data files should be backed up. However, critical files, especially datasets under construction, should be backed up every time they are modified. Less crucial files can be backed up at regular intervals, for instance daily or weekly. You should use a software or hardware solution that automates your backup plan and can handle incremental backups.
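
As an illustration of what an incremental backup does, the following minimal Python sketch copies only the files that are new or have changed since the previous run. The function name `incremental_backup` and the change test (file size and modification time) are assumptions made for this example. It does not remove files that were deleted from the source or keep older versions, so a dedicated backup tool remains the better choice for real projects.

```python
import shutil
from pathlib import Path


def incremental_backup(source: Path, destination: Path) -> list[Path]:
    """Copy files that are new or changed since the last run.

    Returns the relative paths of the files that were copied.
    """
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue  # skip directories; they are created as needed below
        relative_path = src_file.relative_to(source)
        dest_file = destination / relative_path
        if dest_file.exists():
            src_stat, dest_stat = src_file.stat(), dest_file.stat()
            # Assumption: same size and a backup at least as recent as the
            # source means the file has not changed since the last backup.
            unchanged = (
                src_stat.st_size == dest_stat.st_size
                and src_stat.st_mtime <= dest_stat.st_mtime
            )
            if unchanged:
                continue
        dest_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dest_file)  # copy2 also preserves the modification time
        copied.append(relative_path)
    return copied
```

Running the function a second time on an unchanged folder copies nothing, which is what keeps regular (daily or weekly) runs fast even for large projects.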

What type of storage should I use to back up my files?

No storage medium is perfect; you should use multiple backup media and store at least one copy in a remote location. Here are some storage solutions (adapted from Mantra -- CC-BY):

Networked drive

Generally managed at the university or departmental level, these devices are regularly backed up (usually on tape) and provide for easy and secure access to your data.

PC or laptop hard drive

A flexible solution while you are working on a dataset, but it should not be the only storage solution you use. Hard drives can fail and PCs can be stolen.

External storage device (USB flash drive, CDs, DVDs)

Although this is a common and affordable backup solution, there are several issues to be aware of:

  • Depending on the size of your dataset, you may have to use multiple devices
  • The longevity of these media is questionable
  • Follow the care and handling instructions carefully
  • Regularly check your files to see if they are accessible and complete
  • Make sure to “refresh” your data by making new copies on a new CD or USB drive
  • Encrypt any confidential data or protect it with a password
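
One way to check that files on a backup device are still accessible and complete is to compare checksums. The sketch below is a minimal illustration using Python's standard library; the function names (`file_checksum`, `build_manifest`, `verify_copy`) and the choice of SHA-256 are assumptions for this example, not a method prescribed by the guide.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's relative path to its checksum."""
    return {
        p.relative_to(folder).as_posix(): file_checksum(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def verify_copy(manifest: dict[str, str], backup_folder: Path) -> dict[str, list[str]]:
    """Compare a backup against a manifest; report missing and altered files."""
    report = {"missing": [], "changed": []}
    for relative_path, expected in manifest.items():
        candidate = backup_folder / relative_path
        if not candidate.is_file():
            report["missing"].append(relative_path)
        elif file_checksum(candidate) != expected:
            report["changed"].append(relative_path)
    return report
```

A manifest built from the original data when the backup is made can be stored alongside it; running `verify_copy` later against the backup device lists any file that has gone missing or no longer matches, which signals that it is time to make a fresh copy.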
Remote/Cloud storage

Services like Dropbox, Google Drive or OneDrive provide some free storage space on remote servers (more space can be obtained on a subscription basis). Most cloud storage solutions provide automated syncing and data encryption. However, remember that there are drawbacks to using third-party online storage:

  • Legal issues (copyright, data protection licenses) can be complicated or unsatisfactory, especially if the server is located outside of Canada. It is generally not recommended to use cloud storage for sensitive data that includes identifying information on human subjects.
  • Bandwidth may be a concern, especially if you have large datasets
  • You are at the mercy of changes in policies and commercial terms
Resources
  • Storage solutions overview from the Consortium of European Social Science Data Archives:
    • Data sensitivity, ease of access, file size and overall data volume affect storage choice. Advantages and disadvantages are detailed as well as precautions that should be taken when working with personal (sensitive) data.