
# (PART) Analysis: hypothesis testing {-}


# Introducing hypothesis tests {#HTIntro}


```{r, fig.cap="", fig.align="center", fig.width=3, out.width="35%"}
SixSteps(5, "Comparison RQs: Hypothesis tests")
```

You have studied how to construct *confidence intervals*, which answer estimation-type RQs, and indicate the precision with which a statistic estimates a parameter.
Now, you begin studying [decision-type RQs](#TypesOfRQs), which help you make *decisions* about the value of unknown parameters based on the value of the statistic (Table \@ref(tab:OverviewTable)).
This is called *hypothesis testing*.


::: {.tipBox .tip data-latex="{iconmonstr-info-6-240.png}"}
The word *hypothesis* means 'a possible explanation'.\smallskip

**Scientific hypotheses** refer to potential *scientific* explanations that can be tested by collecting data.
For example, an engineer may hypothesise that replacing sand with glass in the manufacture of concrete will produce desirable characteristics [@devaraj2021exploring].\smallskip

**Statistical hypotheses** refer to statistical explanations that are required to determine whether the evidence (i.e., data) supports the scientific hypotheses.
The statistical hypotheses are the foundation of the logic of hypothesis testing.\smallskip

This book discusses *statistical hypotheses*.
:::


The [decision-making process](#DecisionMaking) (Chap. \@ref(MakingDecisions)) previously discussed was:

1. **Assumption**: 
   Make an assumption about the *population*.
   Initially, assume that sampling variation explains any discrepancy between the observed sample statistic and the assumed value of the population parameter.
2. **Expectation**: 
   Based on the assumption about the parameter, describe the distribution of the values of the sample statistic that might reasonably be observed from all the possible samples that might be obtained (due to sampling variation).
3. **Observation**:
   Observe the data from one of the many possible samples, and compute the observed sample statistic from this sample.
4. **Decision**: If the observed sample statistic is:
    - *unlikely* to happen by chance, it *contradicts* the assumption about the *population parameter*, and the assumption is probably **wrong**.
      The *evidence* suggests that the assumption is wrong (but it is not *certainly* wrong).
    - *likely* to happen by chance, it is **consistent with** the assumption about the *population parameter*, and the assumption may be **correct**.
      No *evidence* suggests the assumption is wrong (though it may be wrong).
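
The four steps above can be sketched with a small simulation in R.
This is only an illustration under stated assumptions: the assumed population proportion (0.5), the sample size (100), the observed count (62) and all object names are hypothetical, and are not taken from any study in this book.
Simulation is used here for simplicity; it is not necessarily how the tests in later chapters are computed.

```r
# Hypothetical illustration: every number below is an assumption.
set.seed(1)                     # make the simulation reproducible

# 1. Assumption: the population proportion is 0.5
p_assumed <- 0.5
n         <- 100                # sample size (assumed)

# 2. Expectation: the number of successes in each of many possible
#    samples, if the assumption about the population is true
x_sim <- rbinom(10000, size = n, prob = p_assumed)

# 3. Observation: suppose the one sample actually obtained
#    contained 62 successes
x_obs <- 62

# 4. Decision: the proportion of simulated samples at least as far
#    from the expected count (n * p_assumed) as the observed sample
expected_count <- n * p_assumed
prop_extreme   <- mean(abs(x_sim - expected_count) >=
                       abs(x_obs - expected_count))
prop_extreme
```

A small value of `prop_extreme` means that a sample like the observed one would rarely occur through sampling variation alone if the assumption were true, so the observation contradicts the assumption.
A large value means that the observation is consistent with the assumption.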


In this Part, we explore decision-type relational or interventional RQs with a *comparison*.
Decision-type RQs with a *connection* are discussed in Chaps. \@ref(Correlation) and \@ref(Regression).


(ref:CI-OneProp) Chap. \@ref(CIOneProportion)

(ref:HT-OneProp) Chap. \@ref(TestOneProportion)

(ref:CI-OneMean) Chap. \@ref(OneMeanConfInterval)

(ref:HT-OneMean) Chap. \@ref(TestOneMean)

(ref:CI-TwoMeans) Chap. \@ref(CITwoMeans)

(ref:HT-TwoMeans) Chap. \@ref(TestTwoMeans)

(ref:CI-Paired) Chap. \@ref(PairedCI)

(ref:HT-Paired) Chap. \@ref(TestPairedMeans)

(ref:CI-OddsR) Chap. \@ref(OddsRatiosCI)

(ref:HT-OddsR) Chap. \@ref(TestsOddsRatio)


(ref:HT-Cor) Sect. \@ref(CorrelationTesting)

(ref:HT-Reg) Sect. \@ref(RegressionCI)

(ref:CI-Reg) Sect. \@ref(RegressionCI)

```{r OverviewTable}
Overview <- array(dim = c(7, 3))
colnames(Overview) <- c("",
                        "Estimation (CI)",
                        "Decision (Tests)")

Overview[1, ] <- c("Proportions for one sample",
                   "(ref:CI-OneProp)",
                   "(ref:HT-OneProp)")
Overview[2, ] <- c("Means for one sample",
                   "(ref:CI-OneMean)",
                   "(ref:HT-OneMean)")
Overview[3, ] <- c("Mean differences (paired data; within-individual)",
                   "(ref:CI-Paired)",
                   "(ref:HT-Paired)" )
Overview[4, ] <- c("Comparing two means (between-individuals)",
                   "(ref:CI-TwoMeans)",
                   "(ref:HT-TwoMeans)")
Overview[5, ] <- c("Comparing two odds (between-individuals)", 
                   "(ref:CI-OddsR)",
                   "(ref:HT-OddsR)")
Overview[6, ] <- c("Correlation", 
                   " ",
                   "(ref:HT-Cor)")
Overview[7, ] <- c("Regression", 
                   "(ref:CI-Reg)",
                   "(ref:HT-Reg)")
  
if( knitr::is_latex_output() ) {
  kable(Overview,
        format = "latex",
        booktabs = TRUE,
        align = c("l", "c","c"),
        longtable = FALSE,
        caption = "Confidence intervals and hypothesis tests for different situations") %>%
  kable_styling("striped", 
                font_size = 10,
                full_width = FALSE) %>%
  pack_rows("Descriptive RQs", 1, 3) %>%
  pack_rows("Relational/Interventional RQs with a Comparison", 4, 5) %>%
  pack_rows("Relational/Interventional RQs with a Connection", 6, 7)
}
if( knitr::is_html_output() ) {
  kable(Overview,
      format = "html",
      booktabs = TRUE,
      align = c("l", "c","c"),
      longtable = FALSE,
      caption = "Confidence intervals and hypothesis tests for different situations") %>%
    kable_styling("striped", 
                  full_width = FALSE) %>%
    pack_rows("Descriptive RQs", 1, 3) %>%
    pack_rows("Relational/Interventional RQs (Comparison)", 4, 5) %>%
    pack_rows("Relational/Interventional RQs (Connection)", 6, 7)
}
```