This workshop, in the style of Software Carpentry and Data Carpentry workshops, is for any researcher who has data they want to analyze; no prior computational experience is required. This hands-on workshop teaches basic concepts, skills, and tools for working more effectively with data. We will cover data organization in spreadsheets, an introduction to R, and data analysis and visualization in R. Participants should bring their laptops and plan to participate actively. By the end of the workshop, learners should be able to manage and analyze data more effectively and apply the tools and approaches directly to their ongoing research.
Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) on which they have administrative privileges. They should have a few specific software packages installed (listed below; please install the latest version of each for maximum compatibility).
- OpenRefine (formerly Google Refine) is a powerful tool for working with messy data: cleaning it and transforming it from one format into another. (Download and extract only; no 'install' step.)
- R is a language and environment for statistical computing and graphics.
- RStudio (the open source 'free' version) is an integrated development environment (IDE) for R. It includes a console and a syntax-highlighting editor that supports direct code execution, as well as tools for plotting, history, debugging, and workspace management.
- GitHub Desktop (the app) is a seamless way to contribute to projects on GitHub (the website).
- A GitHub account will allow you to host your version-controlled project folder (repository) in the cloud for collaboration, sharing (and backup!).
- Initiate and provide training to participants in organized, reproducible data workflows
- Produce a structured data management and archiving plan for individual projects
- David Beauchesne
- Rémi Daigle
- Angela Grant
May 1 | Four Points Gatineau, Renaissance B, 4th Floor |
---|---|
6 - 9 PM | - Introduction to Data Workshop (why?, open access) (Angela) - Metadata (what is it, standards, GIS, DCT...) (Angela) - Data organization with spreadsheets (don’t save as Excel file, don’t put spaces in column names, square data frame, standard vocabulary for column headers, dates) (Remi) - Data cleaning and raw data management (OpenRefine and other tools, replicable workflow) (Remi/David) |
May 2 | Four Points Gatineau, Renaissance B, 4th Floor |
---|---|
8:30 AM - Noon | - Shock and awe with R (Remi/David) - Text analysis (David/Remi) - Data analysis and visualization in R (basic dplyr, tidyverse, spatial mapping with http://remi-daigle.github.io/GIS_mapping_in_R/) (Remi/David) |
Noon - 1 PM | - Lunch |
1 - 5 PM | - Data archiving & version control (Angela: review of data repositories (Zenodo, figshare, Dryad, OBIS); Remi/David: version control, student builder pack, GitHub) |
evening | - Hacky Hour (everyone) |