
Data Analysis in R

Instructor

Ellery Galvin (she/hers)
PhD Student, Applied Mathematics
Statistics Specialist, Center for Research Data and Digital Scholarship

Introduction

This section provides a primer on using R and RStudio for data-based research. The first session, from 10-12, teaches fundamental skills all R programmers should know: functions, variables, installing packages, loading data, vectors, dataframes, wrangling, and visualization. It assumes the participant has never coded before. The second session, from 1-2, covers modeling and statistics: t-tests, linear models, ANOVA, and mixed-effects models. It assumes the content studied in the morning and covers programming skills alongside some statistics concepts. The third session covers fundamental skills in text mining: wordclouds, tm, and stringr. Participants may choose to attend any combination of the offered topics.

Before the session

Complete the following steps.

Software installation

For instructions on how to download R and RStudio, please see https://posit.co/download/rstudio-desktop/#download. Install both on your machine. R is the programming language; installing it enables your computer to run R code. RStudio is an Integrated Development Environment (IDE for short) that provides a user interface specialized for writing R code.

Next, navigate to https://quarto.org/docs/download/ and install the Quarto CLI for your operating system. The installer may prompt you to install additional components (such as pandoc or developer tools); install them if requested. Quarto is software that enables you to combine code, output, text, figures, tables, and more into a published document such as a website or a PDF file. It also provides a notebook interface to R, which is a powerful way to break code down into small, manageable parts.
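Once both installers have finished, a quick check from the R console can confirm that things are in place. The lines below are a minimal sketch and not part of the session materials. They assume the Quarto CLI was added to your system PATH, which may not hold if you rely on a copy of Quarto bundled with RStudio.

```r
# Print the version of R that is currently running.
print(R.version.string)

# Look for the Quarto command-line tool on the system PATH.
# Sys.which() returns an empty string when the program is not found.
quarto_path <- unname(Sys.which("quarto"))
quarto_on_path <- nzchar(quarto_path)

if (quarto_on_path) {
  message("Quarto found at: ", quarto_path)
} else {
  # Assumption: a missing PATH entry does not prove Quarto is absent.
  message("Quarto was not found on the PATH; it may still be installed elsewhere.")
}
```

If the first line prints a version string, R itself is working. If Quarto is reported as not found, that alone does not mean the installation failed.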

Session materials

Before beginning the lesson, please download the materials linked below and put the .zip file in a thoughtful location on your computer (maybe in Documents, inside a folder called “Research Data Foundations”). Then, double-click the .zip file to expand it into a folder. Inside the folder, you will find everything needed for the session; peruse the README.md file if you wish to explore its contents ahead of time.

Before the session, open RStudio and use the Files pane in the lower right to navigate to the location where you saved the materials. Inside the folder, there is a file called “setup.R”. Open this file. In the upper right of the pane showing the contents of the file (it contains a line with install.packages), there is a button called “Source”. Click this button once. This action will cause R to execute the code in the file, installing all the packages you will need.
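For a sense of what sourcing such a file does, here is a minimal sketch of a setup-style script. It is not the contents of the actual setup.R. The package list and the helper name missing_packages are assumptions chosen for illustration, based on the packages named in the text-mining session description.

```r
# Hypothetical package list; the real setup.R may install different packages.
session_packages <- c("stringr", "tm", "wordcloud")

# Return the subset of `pkgs` that is not yet installed.
# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(session_packages)

# Install only what is missing, and only from a live session such as RStudio,
# so that running the script non-interactively does not trigger downloads.
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}
```

Checking first means that sourcing the script a second time skips packages that are already installed.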

Download Session Materials.

Solutions

After the session, additional solutions will be posted here.