# (PART) Analysis: hypothesis testing {-}

# Introducing hypothesis tests {#HTIntro}

```{r, fig.cap="", fig.align="center", fig.width=3, out.width="35%"}
SixSteps(5, "Comparison RQs: Hypothesis tests")
```
You have studied how to construct *confidence intervals*, which answer estimation-type RQs, and indicate the precision with which a statistic estimates a parameter.
Now, you begin studying [decision-type RQs](#TypesOfRQs), which help you make *decisions* about the value of unknown parameters based on the value of the statistic (Table \@ref(tab:OverviewTable)).
This is called *hypothesis testing*.
::: {.tipBox .tip data-latex="{iconmonstr-info-6-240.png}"}
The word *hypothesis* means 'a possible explanation'.\smallskip
**Scientific hypotheses** refer to potential *scientific* explanations that can be tested by collecting data.
For example, an engineer may hypothesise that replacing sand with glass in the manufacture of concrete will produce desirable characteristics [@devaraj2021exploring].\smallskip
**Statistical hypotheses** refer to statistical explanations that are required to determine whether the evidence (i.e., data) supports the scientific hypotheses.
The statistical hypotheses are the foundation of the logic of hypothesis testing.\smallskip
This book discusses *statistical hypotheses*.
:::
The [decision-making process](#DecisionMaking) (Chap. \@ref(MakingDecisions)) previously discussed was:
1. **Assumption**:
    Make an assumption about the value of a *population* parameter.
    Initially, assume that sampling variation explains any discrepancy between the observed sample statistic and the assumed value of the population parameter.
2. **Expectation**:
    Based on the assumption about the parameter, describe the distribution of the values of the sample statistic that might reasonably be observed from all the possible samples that might be obtained (due to sampling variation).
3. **Observation**:
    Observe the data from one of the many possible samples, and compute the observed sample statistic from this sample.
4. **Decision**: If the observed sample statistic is:
    - *unlikely* to happen by chance, it *contradicts* the assumption about the *population parameter*, and the assumption is probably **wrong**.
      The *evidence* suggests that the assumption is wrong (but it is not *certainly* wrong).
    - *likely* to happen by chance, it is **consistent with** the assumption about the *population parameter*, and the assumption may be **correct**.
      No *evidence* suggests the assumption is wrong (though it may be wrong).
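
The four steps of the decision-making process can be sketched with a small simulation.
This is only an illustration under stated assumptions: the coin-tossing scenario, the sample size (`n`), the assumed proportion (`p_assumed`) and the observed count (`count_observed`) are all hypothetical, and simulation is used here in place of the formal methods developed in later chapters.

```r
# Hypothetical scenario: is a coin fair?  All values are illustrative only.
set.seed(1)                      # make the simulation reproducible

# 1. Assumption: the population proportion of heads is 0.5
p_assumed <- 0.5
n <- 100                         # number of tosses in each sample (assumed)

# 2. Expectation: simulate the number of heads in many possible samples,
#    if the assumption about the population were true
sim_counts <- rbinom(10000, size = n, prob = p_assumed)

# 3. Observation: the number of heads in the one sample actually collected
count_observed <- 62             # hypothetical observed value

# 4. Decision: how often does sampling variation alone give a count at least
#    as far from the expected count as the one observed?
#    (Comparing counts is equivalent to comparing sample proportions.)
count_expected <- n * p_assumed
prop_as_extreme <- mean(abs(sim_counts - count_expected) >=
                        abs(count_observed - count_expected))
prop_as_extreme
```

A small value of `prop_as_extreme` would mean the observed statistic is unlikely to happen by chance under the assumption, so the evidence would suggest (but not prove) that the assumption is wrong.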
In this Part, we explore decision-type relational or interventional RQs with a *comparison*.
Decision-type RQs with a *connection* are discussed in Chaps. \@ref(Correlation) and \@ref(Regression).
(ref:CI-OneProp) Chap. \@ref(CIOneProportion)

(ref:HT-OneProp) Chap. \@ref(TestOneProportion)

(ref:CI-OneMean) Chap. \@ref(OneMeanConfInterval)

(ref:HT-OneMean) Chap. \@ref(TestOneMean)

(ref:CI-TwoMeans) Chap. \@ref(CITwoMeans)

(ref:HT-TwoMeans) Chap. \@ref(TestTwoMeans)

(ref:CI-Paired) Chap. \@ref(PairedCI)

(ref:HT-Paired) Chap. \@ref(TestPairedMeans)

(ref:CI-OddsR) Chap. \@ref(OddsRatiosCI)

(ref:HT-OddsR) Chap. \@ref(TestsOddsRatio)

(ref:HT-Cor) Sect. \@ref(CorrelationTesting)

(ref:HT-Reg) Sect. \@ref(RegressionCI)

(ref:CI-Reg) Sect. \@ref(RegressionCI)

```{r OverviewTable}
# Build a 7 x 3 array: situation, CI reference, hypothesis-test reference
Overview <- array(dim = c(7, 3))
colnames(Overview) <- c("",
                        "Estimation (CI)",
                        "Decision (Tests)")

Overview[1, ] <- c("Proportions for one sample",
                   "(ref:CI-OneProp)",
                   "(ref:HT-OneProp)")
Overview[2, ] <- c("Means for one sample",
                   "(ref:CI-OneMean)",
                   "(ref:HT-OneMean)")
Overview[3, ] <- c("Mean differences (paired data; within-individual)",
                   "(ref:CI-Paired)",
                   "(ref:HT-Paired)")
Overview[4, ] <- c("Comparing two means (between-individuals)",
                   "(ref:CI-TwoMeans)",
                   "(ref:HT-TwoMeans)")
Overview[5, ] <- c("Comparing two odds (between-individuals)",
                   "(ref:CI-OddsR)",
                   "(ref:HT-OddsR)")
Overview[6, ] <- c("Correlation",
                   " ",
                   "(ref:HT-Cor)")
Overview[7, ] <- c("Regression",
                   "(ref:CI-Reg)",
                   "(ref:HT-Reg)")

# Render the table for the current output format.
# kable(), kable_styling(), pack_rows() and %>% are assumed to be made
# available (knitr, kableExtra) by the book's setup code.
if( knitr::is_latex_output() ) {
  kable(Overview,
        format = "latex",
        booktabs = TRUE,
        align = c("l", "c", "c"),
        longtable = FALSE,
        caption = "Confidence intervals and hypothesis tests for different situations") %>%
    kable_styling("striped",
                  font_size = 10,
                  full_width = FALSE) %>%
    pack_rows("Descriptive RQs", 1, 3) %>%
    pack_rows("Relational/Interventional RQs with a Comparison", 4, 5) %>%
    pack_rows("Relational/Interventional RQs with a Connection", 6, 7)
}

if( knitr::is_html_output() ) {
  kable(Overview,
        format = "html",
        booktabs = TRUE,
        align = c("l", "c", "c"),
        longtable = FALSE,
        caption = "Confidence intervals and hypothesis tests for different situations") %>%
    kable_styling("striped",
                  full_width = FALSE) %>%
    pack_rows("Descriptive RQs", 1, 3) %>%
    pack_rows("Relational/Interventional RQs (Comparison)", 4, 5) %>%
    pack_rows("Relational/Interventional RQs (Connection)", 6, 7)
}
```
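
Table \@ref(tab:OverviewTable) pairs each estimation method (CI) with a corresponding decision method (test).
As a rough sketch of how the two kinds of answer sit side by side for one proportion, base R's `binom.test()` reports both a confidence interval and a *P*-value.
The data (62 successes in 100 trials) and the assumed value (0.5) are hypothetical, and `binom.test()` uses an exact binomial method, which is not necessarily the method developed in the chapters listed in the table.

```r
# Hypothetical data: 62 successes in 100 trials (illustrative only)
result <- binom.test(x = 62, n = 100, p = 0.5, conf.level = 0.95)

# Estimation-type answer: a 95% confidence interval for the population proportion
result$conf.int

# Decision-type answer: a P-value for the assumption that the proportion is 0.5
result$p.value
```

The interval addresses how precisely the sample proportion estimates the population proportion; the *P*-value addresses whether the data are consistent with the assumed value.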