7 Repository Services @ CU
To meet the NIH data sharing requirement, you will typically have the option of depositing your data in CU Scholar, CU Boulder’s institutional repository. For some grants, NIH may explicitly restrict your repository options; in that case, you should follow their guidance.
7.1 Basic CU Scholar Policies
If you are thinking of depositing your data in CU Scholar, please review the Data Set Policy (scroll all the way down). Two things, in particular, are worth emphasizing:
- Not every contributor to a dataset or project needs to be affiliated with CU Boulder, but the person who deposits the data must be employed by CU Boulder (the depositor must authenticate with CU Boulder credentials).
- Once the data is published, it is immediately open access: anyone can view or download it. As a result, datasets containing personally identifying information are not appropriate for CU Scholar.
7.2 Submitting data to CU Scholar
If you decide to meet NIH data sharing requirements using CU Scholar, please consult the submission guidelines for an overview of the submission process.
The basic steps of the CU Scholar data submission workflow are as follows:
- Click the blue “Share Your Work” button on the main CU Scholar page.
- When prompted to select the type of work, select the “Data Set” option.
- You will be prompted to fill out several fields, and upload your data. Please also upload a Readme or documentation file that provides relevant metadata. You can use the suggested Readme template for CU Scholar for this purpose.
- Once you have completed the submission, we will review it for adherence to FAIR data principles. We may recommend changes to your submission based on this review.
- Once the submission has been finalized, we will register a digital object identifier (DOI) that uniquely identifies the dataset. You can share this DOI with relevant stakeholders as proof of compliance with the NIH data publication requirement.
7.3 Sample CU Scholar datasets
If you would like to see what dataset publications look like on CU Scholar, here are a few recent examples:
7.4 CU Scholar size constraints and costs
Finally, a word on size limits and costs. If your data submission is less than 500 GB, we can publish it on CU Scholar at no additional cost to you.
For data submissions greater than 500 GB, there is a one-time data deposit fee of $450 per terabyte. When assessing costs, deposit sizes are always rounded up to the next whole terabyte (for example, a deposit of 750 GB will be assessed a deposit fee of $450, and a deposit of 1.4 TB will be assessed a deposit fee of $900).
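If you want to estimate the deposit fee while budgeting for a grant, the rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not an official calculator: the function name `deposit_fee_usd` is hypothetical, and it assumes that 1 TB equals 1000 GB and that a deposit of exactly 500 GB falls under the no-cost tier, neither of which is stated in the policy text. Please confirm the actual fee with us before finalizing a budget.

```python
import math

FREE_LIMIT_GB = 500    # deposits up to this size incur no fee (exactly 500 GB is an assumption)
FEE_PER_TB_USD = 450   # one-time fee per terabyte, per the policy above
GB_PER_TB = 1000       # assumption: decimal terabytes (1 TB = 1000 GB)


def deposit_fee_usd(size_gb: float) -> int:
    """Estimate the one-time CU Scholar deposit fee for a dataset of size_gb gigabytes."""
    if size_gb < 0:
        raise ValueError("size_gb must be non-negative")
    if size_gb <= FREE_LIMIT_GB:
        return 0
    # Round the deposit size up to the next whole terabyte before applying the fee.
    billable_tb = math.ceil(size_gb / GB_PER_TB)
    return billable_tb * FEE_PER_TB_USD


if __name__ == "__main__":
    for size in (400, 750, 1400):
        print(f"{size} GB -> ${deposit_fee_usd(size)}")
```

With these assumptions, the two examples from the text are reproduced: `deposit_fee_usd(750)` returns 450 and `deposit_fee_usd(1400)` returns 900.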
7.5 Privacy, Ethics, and CU Scholar
As we noted above, all data on CU Scholar is open data; if you decide to use CU Scholar to meet data publication requirements, please make sure that the data is appropriately de-identified or anonymized to protect human subjects. We can offer some advice on de-identification, but the responsibility for ensuring that the data is appropriate for public dissemination, and that human subjects are protected, ultimately rests with the depositor. For practical guidance on data anonymization, please see this resource from the UK Anonymization Network.
An alternative to anonymization is to deposit your data with a repository that has an infrastructure for restricted-use data. If a repository has a restricted-use option, you will be able to deposit your data just as you would in an ordinary open-access repository, but the repository will only make the data available to researchers under controlled conditions that guarantee the safety of human subjects. For an example of how restricted-use data policies generally work, see this overview of restricted-use data policies at ICPSR.