tidyr

Overview

The goal of tidyr is to help you create tidy data. Tidy data is data where:

Each variable is a column; each column is a variable.
Each observation is a row; each row is an observation.
Each value is a cell; each cell is a single value.

Tidy data describes a standard way of storing data that is used wherever possible throughout the tidyverse. If you ensure that your data is tidy, you’ll spend less time fighting with the tools and more time working on your analysis. Learn more about tidy data in vignette("tidy-data").
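As a small illustration of the three rules above, here is a sketch using a made-up table. The data, the column names (`country`, `year`, `cases`), and the object names are all hypothetical and are not taken from the package documentation.

```r
library(tidyr)

# Hypothetical untidy data: the year values are stored in the column names,
# so the "year" variable is spread across two columns.
untidy <- tibble::tibble(
  country = c("A", "B"),
  `2019` = c(10, 20),
  `2020` = c(11, 21)
)

# After pivoting, each variable (country, year, cases) is a column
# and each country-year observation is a row.
tidy <- pivot_longer(
  untidy,
  cols = c(`2019`, `2020`),
  names_to = "year",
  values_to = "cases"
)
```

Here `tidy` has one row per country and year, with the years moved out of the column names and into a `year` column.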

Installation

# The easiest way to get tidyr is to install the whole tidyverse:
install.packages("tidyverse")

# Alternatively, install just tidyr:
install.packages("tidyr")

# Or the development version from GitHub:
# install.packages("pak")
pak::pak("tidyverse/tidyr")

Getting started

library(tidyr)

tidyr functions fall into five main categories:

“Pivoting”, which converts between long and wide forms. tidyr 1.0.0 introduces pivot_longer() and pivot_wider(), replacing the older spread() and gather() functions. See vignette("pivot") for more details.
“Rectangling”, which turns deeply nested lists (as from JSON) into tidy tibbles. See unnest_longer(), unnest_wider(), hoist(), and vignette("rectangle") for more details.
Nesting converts grouped data to a form where each group becomes a single row containing a nested data frame, and unnesting does the opposite. See nest(), unnest(), and vignette("nest") for more details.
Splitting and combining character columns. Use separate_wider_delim(), separate_wider_position(), and separate_wider_regex() to pull a single character column into multiple columns; use unite() to combine multiple columns into a single character column.
Make implicit missing values explicit with complete(); make explicit missing values implicit with drop_na(); replace missing values with the next/previous value using fill(), or with a known value using replace_na().
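The categories above can be sketched on a single toy table. This is only a minimal illustration: the data and every object name (`long`, `wide`, `parts`, and so on) are made up, rectangling is left out because it needs nested list input, and separate_wider_delim() assumes a reasonably recent tidyr (1.3.0 or later).

```r
library(tidyr)

# Hypothetical long-form data; all names and values are invented.
long <- tibble::tibble(
  id = c("a-1", "a-1", "b-2"),
  key = c("x", "y", "x"),
  value = c(1, 2, 3)
)

# Pivoting: one column per distinct `key`. The b-2/y combination
# does not exist in `long`, so that cell becomes NA.
wide <- pivot_wider(long, names_from = key, values_from = value)

# Nesting: one row per `id`, with the remaining columns stored
# in a list-column of data frames; unnest() reverses it.
nested <- nest(long, data = c(key, value))
unnested <- unnest(nested, data)

# Splitting: break `id` into two character columns at the "-" delimiter.
parts <- separate_wider_delim(wide, id, delim = "-", names = c("group", "number"))

# Combining: unite() pastes the two columns back together.
joined <- unite(parts, "id", group, number, sep = "-")

# Missing values: complete() adds the implicit b-2/y row as NA,
# and replace_na() swaps that NA for a known value.
filled <- replace_na(complete(long, id, key), list(value = 0))
```

Each step starts from `long` or from the result of the previous step, so the objects can be printed one by one to see what each function changes.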

Related work

tidyr supersedes reshape2 (2010-2014) and reshape (2005-2010). Somewhat counterintuitively, each iteration of the package has done less. tidyr is designed specifically for tidying data, not for general reshaping (reshape2) or general aggregation (reshape).

data.table provides high-performance implementations of melt() and dcast().

If you’d like to read more about data reshaping from a CS perspective, I’d recommend the following three papers:

To guide your reading, here’s a translation between the terminology used in different places:

tidyr 1.0.0	pivot longer	pivot wider
tidyr < 1.0.0	gather	spread
reshape(2)	melt	cast
spreadsheets	unpivot	pivot
databases	fold	unfold
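To make the first two rows of the table concrete, here is a small sketch that runs the older and the newer verbs on the same made-up data. The table and object names are hypothetical. Note that the two long results hold the same rows but in a different order.

```r
library(tidyr)

# Hypothetical wide data; names and values are invented.
wide <- tibble::tibble(id = 1:2, x = c(1, 2), y = c(3, 4))

# tidyr < 1.0.0 vocabulary: gather() stacks the columns one after another.
old_long <- gather(wide, key = "name", value = "value", x, y)

# tidyr 1.0.0 vocabulary: pivot_longer() keeps the rows for each id together.
new_long <- pivot_longer(wide, cols = c(x, y), names_to = "name", values_to = "value")

# Going back to wide form with either generation of verbs.
old_wide <- spread(old_long, key = name, value = value)
new_wide <- pivot_wider(new_long, names_from = name, values_from = value)
```

Both round trips return a table with the columns `id`, `x`, and `y`.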

Getting help

If you encounter a clear bug, please file a minimal reproducible example on GitHub. For questions and other discussion, please use community.rstudio.com.

Please note that the tidyr project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
