forked from PeterKDunn/SRM-Textbook
02-RQs.Rmd
# (PART) Asking research questions {-}
# Research questions {#RQs}
<!-- Introductions; easier to separate by format -->
```{r, child = if (knitr::is_html_output()) {'./introductions/02-RQs-HTML.Rmd'} else {'./introductions/02-RQs-LaTeX.Rmd'}}
```
## Introduction {#Chap2-Intro}
Asking clear and answerable *research questions* (RQs) is important, as the RQ impacts all other components of the research.
Since quantitative research summarises and analyses the data using numerical methods (such as averages or percentages), the RQ must be appropriate for analysis using quantitative methods.
Studies often have an overall, broad research goal with many sub-questions (which may be quantitative or qualitative).
::: {.example #RQs name="Research questions"}
Consider this broad research goal:
> How well are permeable pavements (PPs) working in urban areas?
This goal has many component RQs (Fig.\ \@ref(fig:PermPavements)), and each can be answered separately.
:::
```{r PermPavements, fig.cap="A study of permeable pavements (PPs) may have many sub-questions", fig.align="center", fig.height=1.75, fig.width=7, out.width = '80%'}
par( mar = c(0.10, 0.15, 0.10, 0.15))
openplotmat()
pos <- array(NA,
dim = c(4, 2))
pos[1, ] <- c(0.50, 0.175) # RQ
pos[2, ] <- c(0.15, 0.40) # sub1
pos[3, ] <- c(0.85, 0.40) # sub2
pos[4, ] <- c(0.50, 0.825) # sub3
straightarrow(from = pos[1, ],
to = pos[2, ],
lty = 1,
lwd = 2)
straightarrow(from = pos[1, ],
to = pos[3, ],
lty = 1)
straightarrow(from = pos[1, ],
to = pos[4, ],
lty = 1)
textrect( pos[1, ],
lab = "Are PPs effective\nin urban areas?",
radx = 0.13,
rady = 0.20,
shadow.size = 0,
box.col = ResponseColour,
lcol = ResponseColour)
textrect( pos[2, ],
lab = "What is the\ncost of using PPs?",
radx = 0.16,
rady = 0.18,
shadow.size = 0,
box.col = GroupColour,
lcol = GroupColour)
textrect( pos[3, ],
lab = "Do PPs improve\nrun-off quality?",
radx = 0.16,
rady = 0.18,
shadow.size = 0,
box.col = GroupColour,
lcol = GroupColour)
textrect( pos[4, ],
lab = "How do people perceive\nPP aesthetics?",
radx = 0.16,
rady = 0.18,
shadow.size = 0,
box.col = GroupColour,
lcol = GroupColour)
```
## Elements of RQs
RQs must be *written* carefully so they can be *answered* effectively.
This section introduces the four potential components of a quantitative RQ:
* The **P**opulation (Sect.\ \@ref(Population));
* The **O**utcome (Sect.\ \@ref(Outcome));
* The **C**omparison or **C**onnection (Sect.\ \@ref(Comparison));
* The **I**ntervention (Sect.\ \@ref(Intervention)).
These form the **POCI** acronym (sometimes seen as the PICO acronym).
### The population {#Population}
All quantitative RQs study a *population*: a (usually large) group of interest in the study.
Populations comprise *individuals*, sometimes called *cases*.
If the individuals are people, they are sometimes called *subjects*.
<div style="float:right; width: 222x; border: 1px; padding:10px">
<img src="Pics/iconmonstr-friend-5-240.png" width="50px"/>
</div>
::: {.definition #Population name="Population"}
The *population* is the group of *individuals* from which the total set of observations of interest *could* be made, and to which the results will (hopefully) generalise.
:::
::: {.importantBox .important data-latex="{iconmonstr-warning-8-240.png}"}
To fully understand *individuals*, you should also read about **units of analysis and units of observation** (Sect.\ \@ref(UnitsObsAnalysis)).
The *individuals* are the *units of analysis*.
:::
The population is *any* group of individuals of interest; for example:
* all German males between 18 and 35 years of age.
* all bamboo flooring materials manufactured in China.
* all elderly females with glaucoma in Canada.
* all *Pinguicula grandiflora* growing in Europe.
::: {.importantBox .important data-latex="{iconmonstr-warning-8-240.png}"}
The words *population*, *individuals* and *cases* do **not** just refer to people, though they may be commonly used that way in general conversation.
:::
The *population* is rarely the individuals from which the data are actually *obtained*: *all* elements of the population are rarely accessible in practice.
For example, testing a new drug cannot possibly study *all* people (especially people not yet born who might use the drug).
The population is 'all people', *not* just those studied.
::: {.importantBox .important data-latex="{iconmonstr-warning-8-240.png}"}
The **population** in a RQ is *not* just those studied; it is the whole group to which our results would generalise.
:::
In contrast, a *sample* is the *subset* of the population from which data are obtained (Chap.\ \@ref(Sampling)).
<div style="float:right; width: 75px; padding:10px">
<img src="Pics/iconmonstr-sitemap-20-240.png" width="50px"/>
</div>
::: {.definition #Sample name="Sample"}
A *sample* is a subset of the population from which data are collected.
:::
::: {.example #Samples name="Samples"}
Consider a study of American college women [@data:woolf:ironstatus], which aimed to (p. 52):
> ...assess iron status [...] in highly active (>12 hr purposeful physical activity per week) and sedentary (<2 hr purposeful physical activity per week) women...
The *sample* comprises 28 'active' and 28 'sedentary' American college women, from which data are collected.
The *population* is *all* 'active and sedentary' American college women, not just the 56 in the study.
The group of 56 subjects is the *sample*.
:::
Completely and precisely defining the population sometimes requires *refining* or *clarifying* the population, using *exclusion* and/or *inclusion criteria*.
Exclusion and inclusion criteria clarify which individuals are explicitly included in, or excluded from, the population.
Exclusion and inclusion criteria should be explained when their purpose is not obvious.
Exclusion and inclusion criteria are not both *necessary*; neither, one or both may be used.
<div style="float:right; width: 75px; padding:10px">
<img src="Pics/iconmonstr-checkbox-4-240.png" width="50px"/>
</div>
::: {.definition #InclusionExclusionCriteria name="Inclusion and exclusion criteria"}
*Inclusion criteria* are characteristics that individuals must meet explicitly to be included in the study.
*Exclusion criteria* are characteristics that explicitly disqualify potential individuals from being included in the study.
:::
<div style="float:right; width: 75px; padding:10px">
<img src="Pics/iconmonstr-checkbox-14-240.png" width="50px"/>
</div>
::: {.example #ExclusionCriteriaEG name="Inclusion and exclusion criteria"}
A study of a certain bird species may only include sites with a confirmed sighting within the last two years.
Concrete test cylinders with fissure cracks may be explicitly excluded from tests of concrete strength.
People with severe asthma may be explicitly excluded from exercise studies.
:::
:::{.exampleExtra data-latex=""}
A study on the influenza vaccine [@kheok2008efficacy] listed the **P**opulation as 'health-care workers' [@kheok2008efficacy, p. 466], and the sample comprised healthcare workers at two specific hospitals.
The population was refined using exclusion criteria: those
> ...declining to give consent, a history of egg protein allergy, and neurological or immunological conditions that are contraindications to the influenza vaccine.
>
> --- @kheok2008efficacy, p. 466
:::
::: {.example #ExclusionAmoutees name="Inclusion and exclusion criteria"}
A study [@data:Guirao2017:amputees] of the walking abilities of amputees used inclusion *and* exclusion criteria.
Inclusion criteria included (p. 27):
> ... length of the femur of the amputated limb of at least 15cm measured from the greater trochanter; use of the prosthesis for at least 12 months prior to enrollment and more than 6 h/day...
Exclusion criteria included (p. 27):
> ... the presence of cognitive impairment hindering the ability to follow instructions and/or perform the tests; body weight over 100kg...
:::
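As a rough sketch, inclusion and exclusion criteria can be thought of as filters applied to potential individuals.
The R code below uses an invented data frame of candidates (all column names and values are hypothetical, and not taken from the study) and applies only the criteria quoted in the example above: individuals must meet *every* inclusion criterion, and must meet *none* of the exclusion criteria.

```r
# Hypothetical candidates; all values are invented for illustration
candidates <- data.frame(
  id                   = 1:6,
  femur_cm             = c(18, 12, 20, 16, 25, 15),
  months_using         = c(14, 30, 8, 24, 12, 36),
  hours_per_day        = c(8, 10, 9, 5, 7, 12),
  cognitive_impairment = c(FALSE, FALSE, FALSE, FALSE, TRUE, FALSE),
  weight_kg            = c(82, 75, 90, 70, 88, 104)
)

# Inclusion criteria: every criterion must be met
meets_inclusion <- with(candidates,
                        femur_cm >= 15 & months_using >= 12 & hours_per_day > 6)

# Exclusion criteria: meeting any one criterion disqualifies the individual
meets_exclusion <- with(candidates,
                        cognitive_impairment | weight_kg > 100)

# Eligible individuals meet all inclusion criteria and no exclusion criteria
eligible <- candidates[meets_inclusion & !meets_exclusion, ]
eligible$id
```

In this invented data, some candidates fail an inclusion criterion, while others meet all the inclusion criteria but are removed by an exclusion criterion.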
### Units of observation and analysis {#UnitsObsAnalysis}
*Units of observation* and *units of analysis* are important, but similar, concepts that need to be distinguished to properly identify a population.
Consider this RQ (based on @vaughn2009comparison):
> In Australian 20-something men, is the average thickness of head hair strands the same for
`r if (knitr::is_latex_output()) {
'blond-haired men'
} else {
'[blond-haired men](https://www.dictionary.com/browse/blonde)'
}`
and
`r if (knitr::is_latex_output()) {
'brunet-haired men?'
} else {
'[brunet-haired men](https://www.dictionary.com/browse/brunet?s=t)?'
}`
Comparing 100 hair strands from one blond-haired man, to 100 hair strands from one brunet-haired man, is problematic since *only one man of each hair colour is represented*.
While there are 200 observations, only two people are compared; little is learnt about 20-something men *in general*.
Instead, a lot is learnt about two specific men.
The population is represented by just two men.
<div style="float:right; width: 222x; border: 1px; padding:10px">
<img src="Illustrations/pexels-moose-photos-1036627.jpg" width="200px"/>
</div>
In this study, each individual hair is a *unit of observation*: the hair strands are what must be measured to obtain 'thickness of head hair strands'.
::: {.definition #UnitOfObservation name="Unit of observation"}
*Unit of observation*: The 'who' or 'what' which are observed, from which measurements are taken and data collected.
:::
Since each blond hair comes from the same man, each of those hairs has essentially 'lived its life together' with the others: they are washed at the same time, with the same shampoo, exposed to the same amount of sunlight and exercise, share genetics, etc.
However, different men would potentially use different shampoo, exercise differently, have different genetics, and so on.
Each man tends to be different, and lives differently and independently of others.
The RQ aims to compare blond *men* with brunet *men*; *men* are being compared.
Each man is a collection of units of observation (hair strands).
This leads to a similar, but different, concept: the *unit of analysis*.
In the example above, each man is a *unit of analysis*, where each unit of analysis gives 100 observations (so the two men give 200 observations in total).
::: {.definition #UnitOfAnalysis name="Unit of analysis"}
*Unit of analysis*: The smallest collection of units of observation (and perhaps the units of observation themselves) about which generalizations and conclusions are made; the smallest *independent* 'who' or 'what' for which information is analysed.
:::
In the hair-thickness study, each *person* is a *unit of analysis*.
The sample size is just two.
*Each unit of analysis (man) has 100 units of observation (hair strands).*
Importantly, the sample size for the study is the number of units of analysis; so here, *only two examples of the population of men are in the study*.
::: {.importantBox .important data-latex="{iconmonstr-warning-8-240.png}"}
The *individuals* are the *units of analysis*.
:::
::: {.tipBox .tip data-latex="{iconmonstr-info-6-240.png}"}
The number of units of *analysis* in a study is the size of the sample.
:::
::: {.tipBox .tip data-latex="{iconmonstr-info-6-240.png}"}
Sometimes the *units of analysis* and *units of observation* are the same.
:::
::: {.example #UnitsAnalysis name="Units of analysis"}
In the hair-strand study, each hair strand is a *unit of observation*: measurements of hair strand thickness are taken from individual hair strands.
The *unit of analysis* is the *person*: the hair strands from each man share much in common.
'Men' operate independently, but the hairs on each man are not independent entities.
:::
<div style="float:right; width: 222x; border: 1px; padding:10px">
<img src="Illustrations/winter-tires-4664205_640.jpg" width="200px"/>
</div>
::: {.example #UnitsAnalysis2 name="Units of analysis"}
A study compares the wear on two brands of car tyres.
Four tyres of Brand A are allocated to each of Cars 1--5, and four tyres of Brand B are allocated to each of Cars 6--10.
After 12 months, the amount of wear is recorded on each tyre.
The *unit of observation* is the *tyre*: the amount of wear is measured on each tyre.
The tyres on any one car do not operate independently; the four tyres on a single car 'live their life together'.
They all are exposed to the same day-to-day use, the same drivers, have driven almost identical distances, under the same conditions, etc.
The *unit of analysis* is the *car*: the brand of tyre is allocated to the car, and all wheels on the car get the same brand of tyre.
Each unit of analysis (car) produces four units of observation.
The *sample size* is 10 cars, with $10\times 4 = 40$ observations.
:::
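The distinction in the tyre example can be sketched in R using invented data (the wear values, column names and random seed are all assumptions made only for illustration).
The data are recorded with one row per tyre (the unit of observation), but are summarised to one row per car (the unit of analysis) before the brands are compared.

```r
set.seed(1)  # arbitrary seed, so the invented data are reproducible

# One row per tyre (unit of observation): 10 cars, each with 4 tyres
tyres <- data.frame(
  car   = rep(1:10, each = 4),
  brand = rep(c("A", "B"), each = 20),           # Cars 1-5: Brand A; Cars 6-10: Brand B
  wear  = round(runif(40, min = 2, max = 6), 1)  # invented wear, in mm
)

n_observations      <- nrow(tyres)                # number of tyres
n_units_of_analysis <- length(unique(tyres$car))  # number of cars

# Summarise to one row per car (unit of analysis)
cars <- aggregate(wear ~ car + brand, data = tyres, FUN = mean)

nrow(cars)         # the sample size: the number of cars
table(cars$brand)  # the number of cars with each brand
```

Averaging within each car is just one simple way to respect the unit of analysis; the key point is that the sample size is the number of cars, not the number of tyres.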
<div style="float:right; width: 222x; border: 1px; padding:10px">
<img src="Illustrations/pexels-nappy-936018.jpg" width="200px"/>
</div>
:::{.exampleExtra data-latex=""}
A report on the *Spectrum* website reported:
<!-- (https://www.spectrumnews.org/news/statistical-errors-may-taint-many-half-mouse-studies/) -->
<!-- (https://www.spectrumnews.org/) -->
> Seven years ago, Peter Kind [...] was reading a study about fragile X syndrome, a developmental condition characterized by severe intellectual disability and, often, autism [...]
> Kind was surprised when he noticed a potentially serious statistical flaw.
>
> The research team had looked at 10 neurons from each of the 16 mice in the experiment [...]
> the researchers had analyzed each neuron as if it were an independent [individual observation].
> That gave them 160 data points to work with, 10 times the number of mice in the experiment.
>
> 'The question is, are two neurons in the brain of the same animal truly independent data points? The answer is no,' Kind says.
>
> --- [Spectrum report](https://www.spectrumnews.org/news/statistical-errors-may-taint-many-half-mouse-studies/), accessed 18 Nov 2022
<!-- (https://www.spectrumnews.org/news/statistical-errors-may-taint-many-half-mouse-studies/) -->
The study used 16 units of analysis (mice), but the authors treated the $16\times 10 = 160$ neurons as the units of analysis.
The 10 neurons from each mouse share the same genetic information.
A total of 160 neurons from 16 mice is very different to a study of 160 neurons from 160 genetically-different mice.
:::
The units of observation and units of analysis *may* be the same, and often are.
However, they are sometimes different, and identifying these situations is *crucial*.
Importantly, studies compare units of analysis, not units of observation.
::: {.example #UnitsAnalysis3 name="Units of analysis"}
A study compared two school physical activity (PA) programs.
Each of 44 children (with parental agreement) was allocated to one of two PA programs.
The improvement in children's fitness was measured for every student in the study over six months.
The *units of observation* are the individual students, as the fitness measurements are taken from the students.
The *units of analysis* are also the individual students, as the PA program was allocated to each student individually, and each student has their own sport, family routines and activities, etc.
Each unit of analysis (student) has one unit of observation.
There are 44 units of analysis, and 44 units of observation.
:::
::: {.example #UnitsAnalysisGroups name="Units of analysis"}
Consider comparing the percentage of females and males wearing sunglasses at a specific beach.
People in a *group* at the beach will probably not be operating 'independently': people with similarities tend to group together.
For example, a couple will often *both* be wearing or both *not* wearing sunglasses; families will often all be wearing sunglasses or not wearing sunglasses.
The researchers have two options:
* Use the *groups* of people as the *units of analysis* (some will be groups of one), and record data from just *one* person in each group.
Ideally, the researchers would specify beforehand from which group member to take data (e.g., the person closest to the researchers when the group is noticed).
* Alternatively, do not use data from groups at all, and gather data only from people who are alone.
:::
`r if (knitr::is_latex_output()) '<!--'`
::: {.thinkBox .think data-latex="{iconmonstr-light-bulb-2-240.png}"}
<iframe src='https://www.ferendum.com/en/embeded.php?pregunta_ID=1252809&sec_digit=992403757&embeded_digit=1479848444' style='width:100%; height:500px; overflow: auto; background: #badaff' frameBorder='0'></iframe><BR>
<A href='https://www.ferendum.com' target='_blank'>Free Online Poll Maker</A>
<iframe src='https://www.ferendum.com/en/embeded.php?pregunta_ID=1252811&sec_digit=285796637&embeded_digit=6372971' style='width:100%; height:500px; overflow: auto; background: #badaff' frameBorder='0'></iframe><BR>
<A href='https://www.ferendum.com' target='_blank'>Free Online Poll Maker</A>
`r webexercises::hide()`
**Units of observation**: the individual students, as the fitness measurements are taken from the students individually.
**Units of analysis**: the *schools*, as the PA program was allocated to each *school*.
All students at School A are exposed to Program 1, but all students at School A are also likely to be exposed to similar weather, fitness opportunities, physical conditions, teachers and school-based philosophies, and so on.
*The improvement in the children's fitness levels* and *the program* are both **variables**.
`r webexercises::unhide()`
:::
`r if (knitr::is_latex_output()) '-->'`
`r if (knitr::is_html_output()){
'The following short video may help explain some of these concepts:'
}`
<!-- From: https://stackoverflow.com/questions/43840742/how-to-embed-local-video-in-r-markdown -->
<div style="text-align:center;">
```{r}
htmltools::tags$video(src = "./videos/UnitsOfObsAnalysis-B.mp4",
width = "550",
controls = "controls",
loop = "loop",
style = "padding:5px; border: 2px solid gray;")
```
</div>
<iframe src="https://learningapps.org/watch?v=pmy2wjmun22" style="border:0px;width:100%;height:650px" allowfullscreen="true" webkitallowfullscreen="true" mozallowfullscreen="true"></iframe>
### The outcome {#Outcome}
All RQs study something *about* the population, called the *outcome*.
Because the RQ concerns a population, the outcome describes a group (not individuals).
Hence, the outcome is usually an *average*, a *percentage*, or another numerical quantity summarising the population.
<div style="float:right; width: 75px; padding:10px">
<img src="Pics/iconmonstr-process-1-240.png" width="50px"/>
</div>
::: {.definition #Outcome name="Outcome"}
The *outcome* in a RQ is the result, output, consequence or effect of interest in a study, numerically summarising the population.
:::
The outcome of interest in a population may be (for example):
* *average* increase in heart rates after 30 minutes of exercise.
* *average* amount of wear after 1000 hours of use.
* *proportion* of people whose pupils dilate.
* *average* weight loss after three weeks.
* *percentage* of seedlings that die.
::: {.importantBox .important data-latex="{iconmonstr-warning-8-240.png}"}
The **outcome** in a RQ summarises a *population*; it does not describe the *individuals* in the population.
:::
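To illustrate, the short R sketch below starts with invented measurements on eight individuals (the values and variable names are hypothetical) and computes two possible outcomes.
Each outcome is a single number summarising the group, not a description of any one individual.

```r
# Invented measurements on eight individuals (for illustration only)
weight_loss_kg <- c(1.2, 0.4, 2.1, 0.0, 1.5, 0.9, 1.8, 0.5)
pupils_dilated <- c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE)

# An outcome that is an average
average_weight_loss <- mean(weight_loss_kg)

# An outcome that is a percentage (the mean of TRUE/FALSE values is a proportion)
percentage_dilated <- 100 * mean(pupils_dilated)

average_weight_loss
percentage_dilated
```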
### The comparison or connection {#Comparison}
Some RQs may seek to establish a relationship in the **P**opulation between the **O**utcome and another attribute of the individuals.
This other attribute is called a **C**omparison or **C**onnection.
The implication is that a change in the value of the comparison or connection may be associated with a change in the value of the outcome (which *may* or *may not* be a cause-and-effect relationship).
A *comparison* refers to an attribute recorded in a small number of distinct groups for which the outcome is compared.
A *connection* refers to an attribute that can take many different values, for which a connection with the outcome is explored.
The values of the comparison or connection may be *imposed* on the individuals by the researchers (e.g., fertilizer dose) and called *treatments*, or naturally occur (e.g., age) and called *conditions*.
::: {.definition #Comparison name="Comparisons"}
The *comparison* in the RQ identifies the small number of different, distinct subsets of the population between which the outcome is compared.
:::
::: {.definition #Connections name="Connections"}
The *connection* in the RQ identifies another attribute of the individuals that can take many different values, and may be related to the outcome.
:::
::: {.example #ComparisonConnectionCaloric name="Comparisons and connections"}
A study [@CaloricIntake] examined the mean daily sodium excretion (the **O**utcome) in Israeli adults (the **P**opulation).
The daily sodium excretion was *compared* for two separate groups: those diagnosed with diabetes, and those not diagnosed with diabetes.
A possible *connection* was explored between the daily sodium excretion and the systolic blood pressure.
:::
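The difference between a comparison and a connection can be sketched in R.
The data below are invented (they are not from the study, and the column names are hypothetical): the *comparison* summarises the outcome in each of a small number of groups, while the *connection* relates the outcome to an attribute that takes many different values.

```r
# Invented data for eight adults (for illustration only)
adults <- data.frame(
  sodium   = c(3.1, 3.8, 2.9, 4.2, 3.5, 4.0, 3.3, 4.4),  # daily sodium excretion
  diabetes = c("no", "yes", "no", "yes", "no", "yes", "no", "yes"),
  sbp      = c(118, 131, 115, 138, 124, 135, 120, 142)   # systolic blood pressure
)

# Comparison: the mean sodium excretion in each of two distinct groups
group_means <- tapply(adults$sodium, adults$diabetes, mean)

# Connection: how sodium excretion relates to an attribute with many values
sodium_sbp_cor <- cor(adults$sodium, adults$sbp)

group_means
sodium_sbp_cor
```

The correlation is used here only as one simple way to summarise a connection; other summaries are possible.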
<div style="float:right; width: 75px; padding:10px">
<img src="Pics/iconmonstr-tree-16-240.png" width="50px"/>
</div>
The distinction between *between*-individuals comparisons (being discussed here) and *within*-individuals comparisons is important.
::: {.definition #Betweenindividuals name="Between-individuals comparisons"}
*Between-individuals comparisons* mean that the comparison is between different groups of individuals.
In contrast, *within-individuals comparisons* make comparisons *within* the same individuals.
:::
<div style="float:right; width: 222x; border: 1px; padding:10px">
<img src="Illustrations/football-5441486_1920.jpg" width="200px"/>
</div>
:::{.example #WithinBetweenComparison name="Between- and within-individual comparisons"}
Consider studying the strength of left and right legs of football players.
A *between*-individuals comparison would compare the left and right leg strengths *between* different groups: one group would have their left-leg strength measured, and the other their right-leg strength measured.
The **C** refers to *between*-individual comparisons or connections.
In contrast, a *within*-individuals comparison would measure both the right- and left-leg strength of the same individuals: the comparison is *within* each individual.
In that case, *no between-individuals comparison* exists, so the study does not have a *comparison* for the purpose of writing a RQ.
Instead, the **O**utcome may be given as "the average difference between individuals' right- and left-leg strength".
:::
The outcome may be *compared* between two or more separate, distinct subsets of the population; for example:
* Comparing the average amount of wear in floor boards (O) between two groups: standard wooden flooring materials, and bamboo flooring.
* Comparing the average heart rates (O) across three subsets: those who received no dose of a drug, those who received a daily dose of the drug, and those who received a twice-daily dose of the drug.
Explicitly, the comparison here is *between different individuals in separate groups*, not *within the same individuals*.
Studies may use *both* within- and between-individuals comparisons.
For instance, a study may examine the *change* in each individual's blood pressure (the within-individuals comparison), for two drugs given to different groups (the between-individuals comparison).
::: {.importantBox .important data-latex="{iconmonstr-warning-8-240.png}"}
Be careful!
The definition of a *comparison* refers to a *between-individuals comparison*, which may be *imposed* (e.g., one group is given one dose of fertilizer per day, and another given two doses of fertilizer per day) or *existing* (e.g., one group of people aged under 30, and another aged 30 or over).
If all individuals are treated in the same way, or do not have existing differences that allow them to be divided into groups to be compared, no *comparison* exists according to this definition.
:::
::: {.example #Comparisons name="Comparison"}
Consider comparing the average blood pressure (the Outcome) in the right and left arms of Australians (the Population).
The blood pressure is measured on both arms of every studied individual.
*There is no (between-individuals) comparison*: the individuals are not divided into separate groups to compare average blood pressure; every person is treated the same way, and a left and right arm belongs to each person.
This is a *"within-individuals" comparison*.
The outcome might be best described as 'the average *difference* between right- and left-arm blood pressure'.
A study comparing the average blood pressure between (a) people aged under 40, and (b) people aged 40 or over *does* have a (between-individuals) comparison: two different subsets (under 40; 40 and over) of the population are compared.
:::
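The two situations in this example can be sketched in R using invented blood pressure data (all values and column names are hypothetical).
For the within-individuals comparison, one difference is computed for each person, and the outcome is the average of these differences; for the between-individuals comparison, the outcome is summarised separately for different groups of people.

```r
# Invented systolic blood pressures (in mm Hg) for six people
bp <- data.frame(
  age   = c(25, 34, 38, 45, 52, 61),
  right = c(118, 122, 125, 130, 134, 140),
  left  = c(116, 123, 121, 128, 135, 137)
)

# Within-individuals: one difference per person, then average the differences
bp$diff   <- bp$right - bp$left
mean_diff <- mean(bp$diff)

# Between-individuals: separate groups of people are compared
bp$age_group <- ifelse(bp$age < 40, "under 40", "40 or over")
group_means  <- tapply(bp$right, bp$age_group, mean)

mean_diff
group_means
```

Only the second calculation involves a comparison in the sense used for writing RQs, since only there are the individuals divided into separate groups.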
Often, one of the comparison groups is the *control group*.
The *control group* comprises units of analysis (called *controls*) that are *as similar as possible* to the other units of analysis, but without receiving the treatment, or having the condition, being studied.
The control group acts like a benchmark for detecting changes in the outcome.
Sometimes the control group receives a *placebo*: a non-effective treatment that appears to be the real treatment.
::: {.definition #Control name="Control"}
A *control* is a unit of analysis without the treatment or condition of interest, but as similar as possible in *every other way* to other units of analysis.
:::
::: {.definition #Placebo name="Placebo"}
A *placebo* is a treatment that has no intended effect or active ingredient, but appears to be the real treatment.
:::
::: {.example #ControlGroup name="Control group"}
To test the effectiveness of a new drug, patients report to a doctor to receive injections of a new drug.
Some patients are assigned to the *control group*, but the controls are *not* just people who don't get the injections.
Ideally, controls would be people who, like the treatment group, report to a doctor and receive an injection... however, they just receive an injection that does nothing.
:::
The above ideas also apply to RQs with *connections*.
As the value of the *connection* changes between many possible values, the value of the outcome (potentially) changes; for example:
* Connecting the average heart rate (O) with the amount of caffeine consumed in the previous day (C).
Heart rate and caffeine consumption are both recorded for many different people (the individuals).
* Connecting the percentage seed germination (O) with hours of sunlight per day (C).
The percentage germination and sunlight hours per day are measured on many pots (the individuals) of 10 seeds each.
Again, as with comparisons, the connections here refer to those *between* individuals, not those *within* individuals.
:::{.example #WithinBetweenConnections name="Between- and within-individual connections"}
Consider studying the relationship between daily water intake and amount of sun exposure per day for athletes.
A *between*-individuals connection would record the daily water intake and the daily sun exposure once each for many individuals.
In contrast, a *within*-individuals connection would record the daily water intake and the daily sun exposure for one individual over many days: the connection is *within* each individual.
Explicitly, the connection here is *between different individuals*, not *within the same individuals*.
:::
### The intervention {#Intervention}
RQs with a connection or comparison (C) sometimes also have an *intervention*.
<div style="float:right; width: 75px; padding:10px">
<img src="Pics/iconmonstr-medical-14-240.png" width="50px"/>
</div>
::: {.definition #Intervention name="Intervention"}
An *intervention* is a comparison or connection whose value can be *manipulated* by the researchers.
That is, the researchers *impose* the connection or comparison upon the individuals in the study.
:::
The intervention may be:
* explicitly giving doses of a new drug to patients.
* explicitly applying wear testing loads to two different flooring materials.
* explicitly exposing people to different stimuli.
* explicitly applying a different dose of fertiliser.
<div style="float:right; width: 75px; padding:10px">
<img src="Pics/iconmonstr-medical-6-240.png" width="50px"/>
</div>
:::{.example #InterventionHimalaya name="Intervention"}
A study by @data:Bird2008:wholegrain *gave* one group of participants a diet using refined flour, and gave another group of participants a diet using a new flour variety (*Himalaya 292*).
The type of diet is the comparison.
Since the researchers can manipulate which subject ate which flour, this study has an intervention.
:::
::: {.example #Interventions name="Interventions"}
A study comparing the average blood pressure in female and male Australians measured blood pressure using a blood pressure machine (a sphygmomanometer).
The research team needs to interact with the participants and use the machine to measure blood pressure, but there is *no* intervention.
Using the sphygmomanometer is just a way to measure blood pressure, to *obtain* the data.
*There is no intervention*: the *comparison* is between females and males, which cannot be manipulated or imposed on the individuals by the researchers.
:::
::: {.thinkBox .think data-latex="{iconmonstr-light-bulb-2-240.png}"}
A study of American college women [@data:woolf:ironstatus] measured 'iron status' in highly active women (>12 hr purposeful physical activity per week) and sedentary women (<2 hr purposeful physical activity per week).
What are the Outcome, the Comparison or Connection (if any), and the Intervention (if any)?\label{thinkBox:POCIWomen}
`r if (knitr::is_latex_output()) '<!--'`
`r webexercises::hide()`
*Outcome*: 'average iron status' (which would need an *operational definition*).
*Comparison*: between two groups of individuals: highly active and sedentary women (i.e., between individuals).
These terms would also need operational definitions!
*Intervention*: Probably none; an intervention would mean the *researchers* tell each individual woman to be highly active or sedentary, which seems unlikely.
`r webexercises::unhide()`
`r if (knitr::is_latex_output()) '-->'`
:::
::: {.exampleExtra data-latex=""}
Researchers examined numerous studies of chest compressions by paramedics.
They examined research papers in which the **P**opulation was patients who had experienced a cardiac arrest, and where manual chest compressions were compared with another method.
The table below shows the interventions and outcomes of interest:
**Interventions** | **Outcomes**
-------------------------------|--------------------------------------------------------------------
* Mechanical chest compression | * Mean survival time to hospital discharge
* Mechanical CPR | * Percentage with a return of spontaneous circulation (ROSC)
* Powered chest compressions |
* Powered CPR |
The research concluded that:
> Overall, the evidence analysed suggests that mechanical chest compression devices are statistically superior to manual chest compressions of a high quality, when up-to-date protocols and guidelines are followed.
>
> --- @williams2021mechanical, Table 1
:::
<iframe src="https://learningapps.org/watch?v=pip2dnpq222" style="border:0px;width:100%;height:500px" allowfullscreen="true" webkitallowfullscreen="true" mozallowfullscreen="true"></iframe>
## Definitions {#OperationDefinitions}
Research studies usually include terms that must be carefully and precisely defined, so that others know *exactly* what has been done, without ambiguity.
Two types of definitions can be given when necessary:
* A *conceptual definition* explains *what* is being studied (i.e., what a word or a term *means* in the study).
* An *operational definition* defines *how* something will be studied or measured.
::: {.definition #ConceptualDefinition name="Conceptual definition"}
A *conceptual definition* articulates precisely *what* words or phrases mean; that is, what is being identified, measured, observed or assessed in a study.
:::
::: {.definition #OperationalDefinition name="Operational definition"}
An *operational definition* articulates exactly *how* something will be identified, measured, observed or assessed.
:::
In many cases, a clear *operational definition* is needed to describe how data will be collected to ensure repeatability and consistent data collection, by removing any ambiguity about how data are obtained.
<div style="float:right; width: 222px; border: 1px; padding:10px">
<img src="Illustrations/pexels-oladimeji-ajegbile-2696299.jpg" width="200px"/>
</div>
::: {.example #DefinitionsStress name="Operational and conceptual definitions"}
Consider a study examining stress in students.
A *conceptual definition* would describe *what is meant* by 'stress' (in contrast to, say, 'anxiety').
An *operational definition* would describe *how* 'stress' is *measured*, since stress cannot be measured directly (unlike height, for example).
'Stress' could be *measured* using a questionnaire or by measuring physical characteristics, for instance.
Other ways of measuring stress are also possible, and all have advantages and disadvantages.
:::
<iframe src="https://learningapps.org/watch?v=pt0rqa5t522" style="border:0px;width:100%;height:500px" allowfullscreen="true" webkitallowfullscreen="true" mozallowfullscreen="true"></iframe>
Sometimes the specific definition chosen is not important, provided that a clear definition is given.
However, to avoid confusion, commonly-accepted definitions should be used unless good reasons exist for using a different definition.
When a commonly-accepted definition does not exist, the definition being used should be very clearly articulated.
::: {.example #DefinitionsFlexibility name="Operational and conceptual definitions"}
A research article [@gillet2018shoulder] entitled "Shoulder range of motion and strength in young competitive tennis players with and without history of shoulder problems" provided these necessary conceptual definitions (among others):
* Young: 8--15 years;
* Competitive tennis players: Some of the best players in their age category in France, and members of a French tennis centre of excellence.
An operational definition was provided for 'shoulder strength': the strength as measured using a hand-held dynamometer.
:::
<!-- ::: {.example #DefinitionsFlexibility name="Operational and conceptual definitions"} -->
<!-- A student project at my university used this RQ: -->
<!-- > Amongst students [...], on average do students who participate in competitive swimming have greater shoulder flexibility than the remainder of the able-bodied student population? -->
<!-- *Shoulder flexibility* needs a *conceptual definition* to describe exactly what it *means*. -->
<!-- Additionally, how *shoulder flexibility* is being *measured* is not clear, so an *operational definition* is needed (which was not provided...). -->
<!-- ::: -->
<!-- Zac R., SEM1 2020 -->
::: {.exampleExtra data-latex=""}
Players, administrators and fans are wary of concussions and head injuries in sport.
A conference on concussion in sport developed this *conceptual definition* [@McCrory250]:
> ... a complex pathophysiological process affecting the brain, induced by biomechanical forces...
<div style="float:right; width: 222px; border: 1px; padding:10px">
<img src="Illustrations/women-6905793_1920.jpeg" width="200px"/>
</div>
However, an *operational definition* is needed to explain *how* to identify a player with concussion during a game.
Rugby decided on this *operational definition* [@Raftery642]:
> ... a concussion applies with any of the following:
>
> 1. The presence, pitch side, of any Criteria Set 1 signs or symptoms (table 1)... [Table 1 includes symptoms such as 'convulsion', 'clearly dazed', etc.];
>
> 2. An abnormal post game, same day assessment...;
>
> 3. An abnormal 36--48 h assessment...;
>
> 4. The presence of clinical suspicion by the treating doctor at any time...
:::
::: {.example #DefinitionsWater name="Operational and conceptual definitions"}
Consider a study requiring water temperature to be measured.
An *operational definition* would explain *how* the temperature is measured: the thermometer type, how the thermometer was positioned, how long it was left in the water, and so on.
In contrast, a *conceptual* definition might describe the scientific definition of temperature (and would not be needed, as 'temperature' is a well-understood term).
:::
::: {.exampleExtra data-latex=""}
A study of snacking in Australia [@data:Fayet2017:Snacks] used this conceptual definition of an 'eating occasion':
> ...one or more food or beverage items consumed at the same time of day...
and a 'snacking occasion' as
> ...one or more food or beverage items consumed at the same time of day within a snacking time period...
Finally then, 'snacking' was defined as:
> Eating occasions that occurred during breakfast, midday and evening meals were meals and all eating occasions that occurred between these meals were classified as snacking.
These are all *conceptual* definitions, explaining what the terms *mean*.
An *operational* definition would explain *how* the data were obtained from the participants (e.g., using a food diary).
:::
::: {.exampleExtra data-latex=""}
@data:Meline2006:InclusionExclusion discusses five studies about stuttering, each using a different *operational* definition:
* Study 1: As diagnosed by a speech-language pathologist.
* Study 2: Within-word disfluencies greater than 5 per 150 words.
* Study 3: Unnatural hesitation, interjections, restarted or incomplete phrases, etc.
* Study 4: More than 3 stuttered words per minute.
* Study 5: State guidelines for fluency disorders.
People may be classified as stutterers by some definitions but not others, so it is important to know which definition is used.
:::
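To see how one person can be classified differently under different operational definitions, the two numerical definitions above (Studies 2 and 4) can be written as rules.
The R sketch below is illustrative only: the speech sample, its counts, and the function names `meets_study2()` and `meets_study4()` are all hypothetical, and the studies themselves may have applied their definitions differently.

```r
# Hypothetical speech sample: all counts are invented for illustration.
speaker <- list(
  words_spoken             = 300,  # total words in the sample
  minutes_spoken           = 2,    # duration of the sample (minutes)
  within_word_disfluencies = 8,    # count across the whole sample
  stuttered_words          = 8     # count across the whole sample
)

# Study 2: within-word disfluencies greater than 5 per 150 words
meets_study2 <- function(s) {
  rate_per_150_words <- s$within_word_disfluencies / s$words_spoken * 150
  rate_per_150_words > 5
}

# Study 4: more than 3 stuttered words per minute
meets_study4 <- function(s) {
  rate_per_minute <- s$stuttered_words / s$minutes_spoken
  rate_per_minute > 3
}

meets_study2(speaker)  # 4 per 150 words, so FALSE
meets_study4(speaker)  # 4 per minute, so TRUE
```

This hypothetical speaker would not be classified as a stutterer under the Study 2 definition, but would be under the Study 4 definition.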
:::{.exampleExtra data-latex=""}
A study examined the possible relationship between the 'pace of life' and the incidence of heart disease
[@data:levine1990:paceoflife] in 36 US cities.
The researchers used four different *operational* definitions for 'pace of life' (remember the article was published in 1990!):
1. The walking speed of randomly chosen pedestrians.
1. The speed with which bank clerks gave 'change for two $20 bills or [gave] two $20 bills for change'.
1. The talking speed of postal clerks.
1. The proportion of men and women wearing a wristwatch.
None of these *perfectly* measure 'pace of life', of course.
Nonetheless, the researchers found that, compared to people on the West Coast,
> ... people in the Northeast walk faster, make change faster, talk faster and are more likely to wear a watch...
>
> --- @data:levine1990:paceoflife (p. 455)
:::
::: {.thinkBox .think data-latex="{iconmonstr-light-bulb-2-240.png}"}
Define a 'smoker'.\label{thinkBox:Smoker}
`r if (!knitr::is_html_output()) '<!--'`
`r webexercises::hide()`
This is very difficult!
Some studies use the categories *Never smoked*, *Past smoker*, and *Current smoker*... or ask people to *self*-identify as a smoker or not.
`r webexercises::unhide()`
`r if (!knitr::is_html_output()) '-->'`
:::
## Types of RQs {#TypesOfRQs}
All RQs have a population (P) and an outcome (O).
Different *types* of RQ emerge depending on whether the RQ has a comparison/connection (C) and whether this comparison or connection can be manipulated by the researchers (an intervention, I).
This section explores different types of research questions:
* Descriptive RQs (Sect.\ \@ref(RQsDescriptive));
* Relational RQs (Sect.\ \@ref(RQsRelational));
* Interventional RQs (Sect.\ \@ref(RQsInterventional)).
These are compared in Sect.\ \@ref(RQsCompare).
RQs can also be written with one of two purposes in mind (Sect.\ \@ref(TwoPurposesOfRQs)):
* *Estimation*:
These RQs ask how precisely a *value* in the *population* is estimated by using the *sample* data.
* *Making decisions*:
These RQs are concerned with making a decision about a population, based on the sample data.
Examples for both forms are given for the different types of RQs below.
### Descriptive RQs (PO) {#RQsDescriptive}
*Descriptive RQs* are the most basic RQs, giving the **P**opulation to be studied, and the **O**utcome of interest about this population.
Typically, descriptive RQs have one of these forms:
* *Estimation*: Among {*the population*}, what is {*the outcome*}?
* *Decision-making*: Among {*the population*}, is {*the outcome*} equal to {*a given value*}?
::: {.importantBox .important data-latex="{iconmonstr-warning-8-240.png}"}
These are **not** 'recipes', but guidelines.
:::
<div style="float:right; width: 75px; padding:10px">
<img src="Pics/iconmonstr-medical-6-240.png" width="50px"/>
</div>
::: {.example}
A study examined the 'body temperature of 148 healthy men and women' [@data:mackowiak:bodytemp] aged between 18 and 40 (the P).
One descriptive RQ was:
> What is the mean body temperature?
This RQ is an *estimation* RQ.
A *decision-making* RQ they also studied was whether the average body temperature was the value that had been commonly accepted by medical professionals:
> Is the mean body temperature really 98.6^o^F?
:::
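The difference between the two purposes can be sketched in R.
The code below is a rough illustration only: the temperatures are *simulated* (they are not the data from the study), the simulation settings are arbitrary, and the use of a confidence interval and a one-sample $t$-test is an assumption made here for illustration rather than a description of what the researchers did.

```r
# Simulated temperatures (degrees F): NOT the data from the study.
# The mean and standard deviation used here are arbitrary choices.
set.seed(1)
temps <- rnorm(148, mean = 98.4, sd = 0.7)

# Estimation: what is the mean body temperature?
estimate <- mean(temps)             # the sample mean
interval <- t.test(temps)$conf.int  # 95% confidence interval (the default)

# Decision-making: is the mean body temperature really 98.6 degrees F?
test <- t.test(temps, mu = 98.6)    # compares the sample mean to 98.6
test$p.value                        # evidence used to make the decision
```

The estimation RQ is answered with a value (and some indication of its precision), whereas the decision-making RQ is answered by weighing the evidence against a specific, pre-stated value.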
::: {.thinkBox .think data-latex="{iconmonstr-light-bulb-2-240.png}"}
Consider this RQ:\label{thinkBox:POCICoeliacs}
'Among Indonesian adults, what proportion are coeliacs?'
For this RQ, identify the **P**opulation and the **O**utcome.
`r if (knitr::is_latex_output()) '<!--'`
`r webexercises::hide()`
**P**: Indonesian adults; **O**: The proportion that are coeliacs.
This is an estimation-type descriptive RQ.
`r webexercises::unhide()`
`r if (knitr::is_latex_output()) '-->'`
:::
### Relational RQs (POC) {#RQsRelational}
Usually, studying *relationships* is more interesting than simply describing a population.
*Relational RQs* explore existing relationships, and state the **P**opulation, the **O**utcome, and the **C**omparison or **C**onnection.
Relational RQs have no intervention; the connection or comparison is *not* manipulated by, nor imposed by, the researchers.
Typically, relational RQs with a *comparison* have one of these forms:
* *Estimation*: Among {*the population*}, what is the difference in {*the outcome*} for {*the groups being compared*}?
* *Decision-making*: Among {*the population*}, is {*the outcome*} the same for {*the groups being compared*}?
Typically, relational RQs with a *connection* have one of these forms:
* *Estimation*: Among {*the population*}, how strong is the relationship between {*the outcome*} and {*something else*}?
* *Decision-making*: Among {*the population*}, is {*the outcome*} related to {*something else*}?
<div style="float:right; width: 75px; padding:10px">
<img src="Pics/iconmonstr-medical-6-240.png" width="50px"/>
</div>
::: {.example #RelationalRQ name="Relational RQ"}
Consider this RQ (based on @estevez2019influence):
> Among Cubans between 13 and 20 years of age, is the average heart rate the same for females and males?
The *population* is 'Cubans between 13 and 20 years of age', the *outcome* is 'average heart rate', and the (between-individuals) *comparison* is between two separate groups: females and males.
This is a *relational RQ* since the sex of the individual (the C) is *not* manipulated by, or imposed by, the researchers.
This RQ is a *decision-making RQ*, since it asks if the average heart rate is the same for females and males.
An *estimation*-type relational RQ would ask about the *size* of difference in the average heart rate between females and males.
The same study could also have asked:
> Among Cubans between 13 and 20 years of age, is the average heart rate related to age?
The *connection* is with 'age', which cannot be manipulated by the researchers, so this is a *relational RQ*.
This RQ is a *decision-making RQ*, since it asks if the average heart rate is related to age.
An *estimation*-type relational RQ might be:
> Among Cubans between 13 and 20 years of age, how strong is the relationship between average heart rate and age?
:::
::: {.thinkBox .think data-latex="{iconmonstr-light-bulb-2-240.png}"}
Consider this RQ (based on @data:Brown2000:WarningLights):\label{thinkBox:POCIAmbo}
\vspace{-2ex}
> In the London Ambulance Service last year, what was the difference in the average response time to emergency calls between weekdays and weekends?
\vspace{-2ex}
Identify the population, outcome, and comparison.
:::