Research Data Bootcamp Microcredential
The goal of the Research Data Bootcamp Microcredential is to provide learners with an opportunity to demonstrate literacy in:
- R OR Python programming for data analysis
- Critical evaluation of existing data sets and data approaches
To earn the microcredential certifying literacy in these areas, learners must complete the following:
- Attend all three days of the Research Data Bootcamp
- Complete a final project (described below) that demonstrates Bootcamp-specific knowledge and skills
Final Project
The final project will be due by June 30th; to submit the project, push your project code to a dedicated GitHub repository, and send us a link to the repository.
Final projects will consist of a portfolio of work hosted in a GitHub repository (you are also encouraged to publish your portfolio materials via GitHub Pages; see below). Completed microcredential project portfolios must include the following elements:
- Question or Hypothesis: A statement of a question or hypothesis that you are interested in investigating with data (the question or hypothesis could be relevant to your formal academic interests, but this is by no means necessary). Briefly explain the variables (which often correspond to the columns of a conventional tabular data set) you would need to answer your question or test your hypothesis.
- Python or R Code and Analysis Outputs: Find a data set that will allow you to explore the question or hypothesis you formulated above. Write one script (either an R script or a Python script) that sheds light on your question. For example, you might write code that produces a visualization, implements a statistical test, calculates summary statistics, generates a crosstab, or derives a new data set from an existing one (for example, creating a “tidy” data set from an unstructured one).
- Please note that you are welcome to write two scripts (one in R and one in Python) if this would help your learning goals, or you would like to demonstrate proficiency in both. However, you are only required to write one.
- Please comment your code so that other researchers (or your future self!) can easily follow and understand your script.
- Please remember to cite the data set you use in your analysis, and to include other relevant contextual information, in the repository’s README.
- Data Set Evaluation: A brief write-up (250-500 words) that evaluates a data set with respect to one of the FAIR principles (Findable, Accessible, Interoperable, Reusable) and best practices for data publishing more generally. In the write-up, you might reflect on the challenges that a user of the data set might encounter, or suggest changes to the data set’s documentation or metadata with a view towards enhancing its reusability for future users.
Making Your Microcredential GitHub Repository
Please add files that correspond to the above requirements (a statement of a question or hypothesis, an R script or Python script, and data set evaluation) to a GitHub repository that is specifically created to host your Microcredential project materials. When all of your material is uploaded, please send us a link to your repository by completing this form.
We encourage you to write your code in R Markdown or in a Jupyter Notebook, and to publish your files via GitHub Pages, particularly if you plan to submit your microcredential to future employers or mentors (since this will make it easier for others to quickly understand your work, and demonstrate familiarity with important tools that facilitate sharing and reproducibility). If you do publish your code via GitHub Pages, please send us a link to these published files, in addition to the link to your GitHub repository.
However, you are not required to publish your code via GitHub Pages. It is perfectly fine to simply push your raw code to a GitHub repository, and send us the link; this will be sufficient to earn the microcredential.
What You Will Receive
If you complete and submit the microcredential project, you will receive a digital certificate that officially certifies your participation in the Bootcamp, and your completion of the associated project. This certificate will include a link to the repository containing your project materials; this will allow those viewing the microcredential (for example, future employers, supervisors, or collaborators) to verify your familiarity with the Bootcamp’s core competencies.
Examples
To give you a sense of the nature and scope of a possible project, consider the following hypothetical project ideas:
Example 1: Using a data set that contains the geographic locations of US yoga studios, we might calculate the density of yoga studios with respect to a geographic unit of interest. For our portfolio, we might create a table from this data set (e.g. “Census Tracts With Highest Yoga Studio Density per Square Kilometer”), which could be accompanied by a visualization that depicts this variation in density (such as a map or bar chart). The table and visualization, along with the code used to generate them, could be pushed to your GitHub repository as part of your microcredential portfolio.
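As a rough illustration of the density calculation in Example 1, here is a minimal Python sketch. The tract names, studio counts, and areas are invented for illustration and do not come from any real data set; in a real project you would load these values from the data set you cite in your README. The function name `studio_density` is also just a placeholder.

```python
# Hypothetical census tracts: (tract id, number of studios, area in square km).
# These values are made up for illustration; they are not real data.
tracts = [
    ("Tract 001", 4, 2.0),
    ("Tract 002", 1, 5.0),
    ("Tract 003", 6, 1.5),
]


def studio_density(studio_count, area_sq_km):
    """Return the number of studios per square kilometer for one tract."""
    return studio_count / area_sq_km


# Build a table of (tract id, density), sorted from highest to lowest density.
density_table = sorted(
    ((tract_id, studio_density(count, area)) for tract_id, count, area in tracts),
    key=lambda row: row[1],
    reverse=True,
)

# Print the table; this output could be saved alongside the script in the portfolio.
for tract_id, density in density_table:
    print(f"{tract_id}: {density:.2f} studios per sq km")
```

The sorted table is the kind of output that could go into the portfolio, together with the script that produced it.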
Example 2: In a data set of snowfall across several ski resorts/areas, we might write code to generate a table of summary statistics (e.g. mean, median, standard deviation) for the snowfall variable (and other variables of interest). Alternatively, we could calculate correlations between variables of interest, or formally test hypotheses using inferential statistics (for example, using a difference of means test or a regression analysis). In this case, the output containing the results of the relevant test(s) (e.g. a table of summary statistics or a regression table) and the code used to create it would be our portfolio examples.
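To make the summary statistics idea in Example 2 concrete, here is a small Python sketch that uses only the standard library's `statistics` module. The resort names and snowfall values are hypothetical placeholders, not real measurements, and the `summarize` helper is simply one way to organize the calculation.

```python
import statistics

# Hypothetical seasonal snowfall totals (cm) for two made-up resorts.
# These numbers are placeholders for illustration; they are not real data.
snowfall_cm = {
    "Resort A": [120.0, 135.0, 150.0, 110.0, 145.0],
    "Resort B": [90.0, 100.0, 95.0, 105.0, 110.0],
}


def summarize(values):
    """Return the mean, median, and sample standard deviation of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


# One summary dictionary per resort.
summary = {resort: summarize(values) for resort, values in snowfall_cm.items()}

# Print a simple summary table, one row per resort.
print(f"{'Resort':<10}{'Mean':>8}{'Median':>8}{'Std Dev':>9}")
for resort, stats in summary.items():
    print(f"{resort:<10}{stats['mean']:>8.1f}{stats['median']:>8.1f}{stats['stdev']:>9.1f}")
```

A table like this, plus the commented script that generated it, would be enough for the code portion of a portfolio. Libraries such as pandas (Python) or dplyr (R) offer more convenient ways to do the same thing on larger tabular data sets.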
Final Project Tips
- This does not have to be perfect! We are looking for demonstrated literacy and understanding of the concepts and discussions covered throughout the Bootcamp.
- We are here to help! Please join the end-of-day clinics or reach out to us if you have any questions or are struggling with any aspect of the Bootcamp or microcredential. Peer support and feedback are an integral part of data work.
- Explore relevant online resources. In addition to reaching out to us for help, we also encourage you to take this opportunity to troubleshoot possible problems by seeking help from online resources and documentation relevant to R and Python. These resources are substantial, and an important part of the learning process involves self-directed troubleshooting using these resources.
Turn in Your Final Portfolio Project!
Use this Google Survey link to turn in your project.
Be sure to give us your:
- First and last name
- Email (preferably your CU Boulder email, but a personal email is OK if that’s not possible)
- A link to your final GitHub page or repository containing your project materials
Cheers!