-
Notifications
You must be signed in to change notification settings - Fork 109
/
likert_data.Rmd
93 lines (67 loc) · 4.28 KB
/
likert_data.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
# How to plot likert data
Zining Fan and Xinyuan He
## Introduction
The Likert scale is a five (or seven) point scale which is used to allow the individual to express how much they agree or disagree with a particular statement. A common set of 5 answers can be selected are: Strong Disagree, Disagree, Don't Know/Not Sure, Agree and Strongly Agree.
To visualize the result of a Likert scale datasets, we can use diverging stacked bar chart or stacked bar chart, which we will introduce later in this chapter.
## Diverging stacked bar chart using function likert()
Here is an example of how to plot likert data. We could use function likert(), it is in the package "likert". The plot looks like a diverging bar chart.
```{r message=FALSE, warning=FALSE, include=FALSE, paged.print=FALSE}
library(readxl)
library(reshape2)
library(ggplot2)
library(HH)
library(httr)
library(tidyverse)
library(likert)
b=data.frame("DBN"=c("01M509","02M374","02M520","11X545","12X271","13K605","29Q156"),"StronglyDisagree"=c(0.27,0.32,0.17,0.10,0.37,0.27,0.25),"Disagree"=c(0.43,0.57,0.61,0.46,0.40,0.57,0.54),"DontKnow"=c(0.13,0.05,0.03,0.16,0.12,0.08,0.07),"Agree"=c(0.13,0.05,0.12,0.19,0.09,0.06,0.09),"StronglyAgree"=c(0.03,0.01,0.06,0.10,0.02,0.02,0.04))
HH::likert(DBN~., b, positive.order=TRUE, as.percent = TRUE,
main="At my child's school my child is safe.",
xlab="percent")
```
The code and the plot are shown below,
```{r}
# use function likert() to plot likert data
HH::likert(DBN~., b, positive.order=TRUE, as.percent = TRUE,
main="At my child's school my child is safe.",
xlab="percentage",ylab="School Code")
```
In this picture, the statement is "At my child's school my child is safe.". We could see that the percentage of each catogary for different schools, the "don't know " is set at the middle in the likert plot.
## Data cleaning and preparation
Actually, before we plot, the first thing we need to do is to tidy data. Transforming data to the format we want is very important and more challengeable than ploting. After transforming, the tidy data looks like,
```{r}
# view data frame
b
```
In the table above, each line represents the data from school. The data is in percentage format. If you want to know more about this dataset, you could go to https://www.schools.nyc.gov/about-us/reports/school-quality/nyc-school-survey.
There is another choice to plot the dataset. In the likert data, the neutral data or "don't know" data is the least important, so we could exclude these data. The plot looks like :
```{r}
# exclude "don't know"
bb <- b[,c(1,2,3,5,6)]
HH::likert(DBN~., bb, positive.order=TRUE,as.percent = TRUE,
main="At my child's school my child is safe.(without don't know)",
xlab="percentage",ylab="School Code")
```
In the plot above, we could clearly compare the percentage of "Agree" and "StronglyAgree" to "Disagree" and "StronglyDisagree".
## Stacked bar chart using ggplot()
Another way to plot the likert data is to use stacked bar chart. We can use ggplot() and gemo_bar to implement the stacked bar chart as shown in the example below.
Before drawing the stacked bar chart with ggplot(), we need to reorganize the dataset into the following format:
```{r}
mb = melt(b)
mb
```
Then we plot the data, the code and plot are shown below,
```{r}
g2 = ggplot()+
geom_bar(data = mb, aes(x = reorder(DBN,value), y=value, fill=variable), position="stack", stat="identity")+
coord_flip() +
ggtitle("At my child's school my child is safe")+
ylab("Percentage")+
xlab("School Code")+
scale_fill_brewer(palette="PRGn")+
theme(legend.position="bottom")
#scale_x_discrete(limits=c("StronglyAgree", "Agree", "DontKnow","Disagree","StronglyDisagree"))
g2
```
One advantage of using stacked bar chart is that: since these two features "StronglyAgree" and "StronglyDisagree" are placed at the two ends of the bars,the comparisions between them are very clear. While it will be relatively hard to compare features placed in the middle of the bars, like "Disagree" and "Agree".
## Summary
There are many options to visualize likert data as shown in the previous sections, and each with different advantages and disadvantages. You can choose based on your needs. If you would like to find more information you can go to https://blog.datawrapper.de/divergingbars/.