forked from PeterKDunn/SRM-Textbook
53-App-Answers.Rmd
# Answers to end-of-chapter exercises {#Answers}
This Appendix contains answers to *most* (not all) exercises.
Some are fully worked, and some are only brief solutions.
* Answers to Chap. \@ref(Intro) (Introduction): Sect. \@ref(IntroAnswer).
* Answers to Chap. \@ref(RQs) (Research questions): Sect. \@ref(RQsAnswer).
* Answers to Chap. \@ref(ResearchDesign) (Research design): Sect. \@ref(ResearchDesignAnswer).
* Answers to Chap. \@ref(Ethics) (Ethics): Sect. \@ref(EthicsAnswer).
* Answers to Chap. \@ref(Sampling) (Sampling): Sect. \@ref(SamplingAnswer).
* Answers to Chap. \@ref(FactorsInfluenceY) (Factors that influence the response variable): Sect. \@ref(FactorsInfluenceYAnswer).
* Answers to Chap. \@ref(DesignExperiment) (Designing experiments): Sect. \@ref(DesigningExperimentsAnswer).
* Answers to Chap. \@ref(DesignObservational) (Designing observational studies): Sect. \@ref(DesigningObservationalAnswer).
* Answers to Chap. \@ref(Interpretation) (Interpretation): Sect. \@ref(InterpretationAnswer).
* Answers to Chap. \@ref(CollectingDataProcedures) (Collecting data): Sect. \@ref(CollectionAnswer).
* Answers to Chap. \@ref(DescribingVars) (Describing variables): Sect. \@ref(DescribeAnswer).
* Answers to Chap. \@ref(Graphs) (Graphs): Sect. \@ref(GraphsAnswer).
* Answers to Chap. \@ref(NumericalQuant) (Numerical summaries for quantitative data): Sect. \@ref(NumericalQuantAnswer).
* Answers to Chap. \@ref(NumericalQual) (Numerical summaries for qualitative data): Sect. \@ref(NumericalQualAnswer).
* Answers to Chap. \@ref(MakingDecisions) (Making decisions): Sect. \@ref(MakingDecisionsAnswer).
* Answers to Chap. \@ref(Probability) (Probability): Sect. \@ref(ProbabilityAnswer).
* Answers to Chap. \@ref(SamplingDistributions) (Sampling distributions): Sect. \@ref(SamplingDistributionsAnswer).
* Answers to Chap. \@ref(SamplingVariation) (Sampling variation): Sect. \@ref(SamplingVariationExercisesAnswer).
* Answers to Chap. \@ref(CIOneProportion) (CIs for one proportion): Sect. \@ref(CIOneProportionAnswer).
* Answers to Chap. \@ref(AboutCIs) (More about forming CIs): Sect. \@ref(AboutCIsAnswer).
* Answers to Chap. \@ref(OneMeanConfInterval) (CIs for one mean): Sect. \@ref(OneMeanConfIntervalAnswer).
* Answers to Chap. \@ref(PairedCI) (CIs for paired data): Sect. \@ref(PairedCIExercisesAnswer).
* Answers to Chap. \@ref(CITwoMeans) (CIs for two independent means): Sect. \@ref(MeansIndSamplesAnswer).
* Answers to Chap. \@ref(OddsRatiosCI) (CIs for odds ratios): Sect. \@ref(OddsRatiosCIAnswer).
* Answers to Chap. \@ref(EstimatingSampleSize) (Sample size estimation): Sect. \@ref(EstimatingSampleSizeAnswer).
* Answers to Chap. \@ref(TestOneProportion) (Tests for one proportion): Sect. \@ref(TestOneProportionAnswer).
* Answers to Chap. \@ref(TestOneMean) (Tests for one mean): Sect. \@ref(TestOneMeanAnswer).
* Answers to Chap. \@ref(MoreAboutTests) (More about hypothesis tests): Sect. \@ref(MoreAboutTestsAnswer).
* Answers to Chap. \@ref(TestPairedMeans) (Tests for paired means): Sect. \@ref(TestPairedMeansAnswer).
* Answers to Chap. \@ref(TestTwoMeans) (Tests for two independent means): Sect. \@ref(TestTwoMeansAnswer).
* Answers to Chap. \@ref(TestsOddsRatio) (Tests for odds ratios): Sect. \@ref(TestsOddsRatioAnswer).
* Answers to Chap. \@ref(TwoQuant) (Relationships between two quantitative variables): Sect. \@ref(TwoQuantAnswer).
* Answers to Chap. \@ref(Correlation) (Correlation): Sect. \@ref(CorrelationAnswer).
* Answers to Chap. \@ref(Regression) (Regression): Sect. \@ref(RegressionAnswer).
* Answers to Chap. \@ref(Reading) (Reading research): Sect. \@ref(ReadAnswer).
* Answers to Chap. \@ref(WritingResearch) (Writing research): Sect. \@ref(WriteAnswer).
## Answers: Introduction {#IntroAnswer}
Answers to exercises in Sect. \@ref(IntroExercises).
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:RQsTypeTourniquet)**:
The RQ requires numerical information to be answered, such as the *average time* taken to apply the tourniquets.
This is a **quant**itative RQ.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:RQsTypeMangroves)**:
The RQ does not require numerical information to be answered.
This is a **qual**itative RQ.
:::
## Answers: RQs {#RQsAnswer}
Answers to exercises in Sect. \@ref(RQsExercises).
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:RQsBloodPressure)**:
**1.** P: Danish University students.
**2.** O: *Average* resting diastolic blood pressure.
**3.** C: between students who regularly drive to university and those who regularly ride their bicycles.
**4.** No intervention.
**5.** Relational.
**6.** *Conceptual*: What is meant by 'regularly'; 'university student' (on-campus and online? undergraduate *and* postgraduate? full-time and part-time?). *Operational*: how 'resting diastolic blood pressure' will be measured.
**7.** Resting diastolic blood pressure; whether they regularly drive to university or regularly ride their bicycles.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:RQsNutrition)**:
**1.** Some elements are not well defined, but perhaps: P: Children aged under 3 in a Peruvian peri-urban community; O: proportion of children with diarrhoea; C: nutritional status; No intervention.
**2.** Hard to be sure; perhaps something like: 'In children aged under 3 in a Peruvian peri-urban community, is there a relationship between diarrhoea status and nutritional status?'.
**3.** Relational.
**4.** How is 'diarrhoea status' measured?
Likewise, how is 'nutritional status' measured?
There are probably others.
**5.** Response: diarrhoea status; explanatory: nutritional status.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:RQsOutcomeResponse)**:
Recall that the outcome is used to describe a *group* (the population), not the individuals.
**1.** The *percentage* of vehicles that crash.
**2.** The *average* jump height.
**3.** The *average* number of tomatoes per plant.
**4.** The *percentage* of people who own a car.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:RQsWalkingSpeed)**:
**1.** What *individual* people are doing with their phones probably explains their walking speed: the *explanatory variable* is the way in which the mobile phone is being used.
Notice that 'talking on the phone' and 'texting on the phone' are not *variables*.
They are particular values that the *variable* can take.
That is, 'what people are doing on their phone' is the variable, because it can vary: Sometimes people will be talking, sometimes texting, etc.
**2.** Walking speed probably depends on (or *responds to*) how *individual* people are using their phone: the *response variable* is the walking speed.
**3.** The *outcome* is how the response *variable* is summarised over a group of individuals.
The walking speeds from many individuals could be summarised numerically using the *average walking speed*, which would be the outcome.
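
The distinction between the response variable and the outcome can be sketched in R.
This is only an illustration: the data frame `walking`, its column names, and all the values are invented, and are not from the study.

```r
# Hypothetical data: one row per individual (all values invented)
walking <- data.frame(
  phone_use = c("Talking", "Texting", "None", "Talking", "Texting", "None"),
  speed     = c(1.30, 1.10, 1.45, 1.25, 1.05, 1.50)  # metres per second
)

# Response variable: the walking speed recorded for each individual
walking$speed

# Outcome: the response variable summarised over the group
mean_speed <- mean(walking$speed)

# The outcome for each value of the explanatory variable
speed_by_use <- tapply(walking$speed, walking$phone_use, mean)
speed_by_use
```

Each row holds a value of the *variables*; only the summaries in the last two steps are *outcomes*.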
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:RQsComparisonExplanatory)**:
Recall that the explanatory variable is what is actually measured on the individuals in the population.
**1.** The type of car fuel.
**2.** The type of coffee.
**3.** The dose of iron supplement.
**4.** The diet.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:RQsComparisonVsPaired)**:
**1.** *Does* have a comparison (between a group of people in winter, and a different group of people in summer).
The outcome is 'the percentage of people wearing hats'.
**2.** *Does not* have a comparison.
Two subsets of the population are not being compared: instead, each person is measured twice.
So an Outcome may be 'average *change* in cholesterol levels'.
**3.** *Does not* have a comparison.
Two subsets of the population are not being compared: instead, each person gets two measurements.
So an Outcome may be 'average *difference* between right- and left-leg balance times'.
**4.** *Does* have a comparison: The three subsets of the population are being compared: the three groups of tomato plants.
The Outcome is 'average yield' (which could be measured in kg/plant, tomatoes/plant, kg/hectare, etc).
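
For the second RQ, the idea of measuring each person twice can be sketched in R.
The cholesterol values below are invented purely for illustration (the names `before` and `after` are assumptions, as are the units).

```r
# Hypothetical cholesterol levels (mmol/L) for five people, each measured twice
before <- c(6.1, 5.8, 6.5, 5.9, 6.3)
after  <- c(5.7, 5.9, 6.0, 5.6, 6.1)

# One change per person: the two measurements belong to the same individual,
# so two subsets of the population are not being compared
change <- after - before

# Outcome: the average change
mean_change <- mean(change)
mean_change
```

The five changes, not the ten separate measurements, are the values being summarised.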
:::
<!-- SEM 2 2018: Christopher WALTON; WILLIAM LUDOWYK; LOK RAJ BHATTA-->
<!-- ::: {.answer data-latex=""} -->
<!-- **Answer to Exercise \@ref(exr:ProjectRQ1)** -->
<!-- * What does 'is there a differentiation' actually mean? Do they simply mean 'is there a difference'? -->
<!-- * People who work 20 hours per week don't actually belong to either group. (This isn't necessarily a problem, but it is odd and looks like the students inadvertently left out people who work 20 hours per week.) -->
<!-- Perhaps try this: -->
<!-- > Among the USC students, -->
<!-- > is the mean study time per week different -->
<!-- > for those who work more than 20 hours per week compared to those that work 20 hours or less? -->
<!-- ::: -->
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:RQsAnimals)**:
The *unit of observation* is the *animal*; the animals, for example, are weighed.
The *unit of analysis* is the *pen*, as the food is allocated to the animals in the whole pen.
In addition, the animals in the same pen are not independent: they compete for the same space, food, resources, and would all have similar environments that they share.
:::
<!-- SEM 2 2018: Christopher WALTON; WILLIUAM LUDOWYK; LOK RAJ BHATTA-->
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:ProjectRQ2)**:
The *population* surely is not 10 adults; that sounds like the sample.
It does not make clear how many fonts are being compared (or which fonts are being used).
Perhaps try this:
> Among Australian adults, is the average time taken to read a passage of text different when Arial font is used compared to Times Roman font?
:::
<!-- SEM 2 2018: Alex Bruce; Levi Crawley; Stuart Stevens-->
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:ProjectRQ3)**:
The RQ is about comparing *groups*, so it should talk about the *average* lung capacity of males and females.
Perhaps:
> Of students that study at UniSC, Sippy Downs, do males have a larger *average* lung capacity than females?
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:RQsCommonCold)**:
Remember that the **outcome** describes a group ('the *average* cold duration') whereas the response **variable** is measured from an individual ('the cold duration for each individual').
The explanatory *variable* is measured on each individual: Whether they take a tablet or not.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:RQsBlueGum)**:
The unit of observation is the leaf: the size of leaves can only be measured from individual leaves.
However, all the leaves on the same tree have 'lived their life together': They all get nutrition from the same trunk and root system, share genetics, and are located in the same place... so the unit of analysis is the tree.
The sample size is the number of units of analysis, so the sample size is 20.
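
A minimal sketch in R of the difference between the two units, assuming (for illustration only) that five leaves are measured on each of the 20 trees; the leaf areas are simulated, not real data.

```r
set.seed(1)  # arbitrary seed so the invented data are reproducible

n_trees <- 20          # number of trees, as in the answer above
leaves_per_tree <- 5   # assumed for illustration only

leaves <- data.frame(
  tree      = rep(seq_len(n_trees), each = leaves_per_tree),
  leaf_area = rnorm(n_trees * leaves_per_tree, mean = 30, sd = 5)  # invented values
)

# 100 units of observation (leaves) ...
nrow(leaves)

# ... but only 20 units of analysis (trees): summarise the leaves within each tree
tree_means <- aggregate(leaf_area ~ tree, data = leaves, FUN = mean)
nrow(tree_means)
```

Treating the 100 leaves as 100 independent observations would overstate the amount of information in the sample.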
:::
## Answers: Research designs {#ResearchDesignAnswer}
Answers to exercises in Sect. \@ref(ResearchDesignsExercises).
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:ResearchDesignConcreteBeams)**:
The researchers could decide which beams go into Group A and into Group B.
Researchers could also allocate treatments to the groups: they could select which treatment is applied to each group of beams.
This is a *true experiment*.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:ResearchDesignMatresses)**:
The researchers had no say in who was in hospital at the time: they could not allocate the patients to the two groups (overlay; mattress).
This is a *quasi-experiment*.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:ResearchDesignPetsAndHealth)**:
1. *P*: Perhaps people in a suburb of the Sunshine Coast;
*O*: number of doctor's visits in the next six months;
*C*: between people owning a pet for those six months, and those who do not own a pet for those six months.
2. For an experiment, we would need to *intervene* to *give* subjects a pet, or *not* give them a pet.
3. For an observational study, we would *not* intervene: We would find the subjects who *already* owned a pet, or who did not *already* own a pet.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:ResearchDesignDietsForWeightLoss)**:
1. *P*: A bit vague from this small extract: people of some kind;
*O*: the *average change* in body weight over two years;
*C*: Between the four diets;
*I*: The diets seem to have been imposed.
2. Experimental: The diets have been *manipulated* and *imposed* by the researchers, with the intent of changing the outcome (the weight change).
3. Probably a true experiment.
4. The individuals: the diets are allocated to each individual.
5. The individuals: those from whom the weight change is taken.
6. The *change* in body weight over two years.
7. The type of diet.
:::
## Answers: Ethics {#EthicsAnswer}
Answers to exercises in Sect. \@ref(EthicsExercises).
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:EthicsCougars)**:
Answers will vary.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:EthicsSideEffects)**:
Answers will vary.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:EthicsLying)**:
Answers will vary.
:::
## Answers: Sampling {#SamplingAnswer}
Answers to exercises in Sect. \@ref(SamplingExercises).
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:SamplingBooks)**:
A tricky thing here is that some books are not physically in the library, as they have been borrowed.
1. Simple random sample:
A list of all the books held by the UniSC library is needed.
This may be possible for a librarian (though the list would be huge), but it certainly is not possible for a student or non-library staff member.
In principle though, number each book, and randomly select a sample from that list.
2. Stratified:
Use the locations (Sippy Downs; Fraser Coast; Caboolture; Gympie; Southbank; SCHI) as strata, and then take a random sample of the books at each location.
3. Cluster:
Consider each set of shelves as a cluster, and randomly select some shelves, and determine the number of pages in each book on the selected shelves.
4. Convenience:
Find books that are in the libraries within reach, easily accessible, and on the shelves.
5. Multi-stage: Consider taking a random sample of campuses, then a random sample of the sets of shelves in the selected libraries, then selecting a random shelf from each set, then a small number of random books from each shelf.
6. Multi-stage perhaps.
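
The random sampling schemes above can be sketched in R using `sample()`.
This is a rough illustration only: the catalogue `books` is invented (three hypothetical campuses, four sets of shelves each, ten books per set), `sample_rows()` is a helper written just for this sketch, and the multi-stage version uses three stages rather than the four described above.

```r
set.seed(2)  # arbitrary seed so the sketch is reproducible

# Hypothetical catalogue: 3 campuses x 4 sets of shelves x 10 books (all invented)
books <- expand.grid(book = 1:10, shelf = 1:4,
                     campus = c("Campus A", "Campus B", "Campus C"),
                     stringsAsFactors = FALSE)
books$shelf_id <- paste(books$campus, books$shelf)

# Helper (hypothetical): a random sample of n rows from a data frame
sample_rows <- function(df, n) df[sample(nrow(df), size = n), ]

# 1. Simple random sample: every book in the list has the same chance of selection
srs <- sample_rows(books, 12)

# 2. Stratified: a random sample of books from within every campus (stratum)
stratified <- do.call(rbind, lapply(split(books, books$campus), sample_rows, n = 4))

# 3. Cluster: randomly select some sets of shelves, then take every book on them
chosen_shelves <- sample(unique(books$shelf_id), size = 3)
cluster <- books[books$shelf_id %in% chosen_shelves, ]

# 5. Multi-stage: random campuses, then random shelves, then random books
chosen_campuses <- sample(unique(books$campus), size = 2)
multistage <- do.call(rbind, lapply(chosen_campuses, function(cam) {
  shelves <- sample(unique(books$shelf_id[books$campus == cam]), size = 2)
  on_shelves <- books[books$shelf_id %in% shelves, ]
  do.call(rbind, lapply(split(on_shelves, on_shelves$shelf_id), sample_rows, n = 3))
}))
```

A convenience sample has no equivalent code: no random selection is involved.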
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:SamplingApartments)**:
**1.** Multi-stage.
**2.** It's a bit like stratified... but not quite.
**3.** Convenience.
**4.** A combination of multi-stage and convenience.
**5.** The second last is poor, and the last is a slight improvement.
The second is a bit odd but is probably OK.
The first might be the best.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:SamplingShoppingCentre)**:
**1.** Convenience, but by approaching every 10th person they are trying to make it a little more representative... but they can do a lot better.
**2.** Convenience, but by approaching every 5th person and going every day for a week they are trying to make it a little more representative... but they can do a lot better.
**3.** Self-selecting.
**4.** Convenience. At least the researcher is trying to get a more representative sample, by going every day for two weeks, and at different times and locations each week, and approaching someone every 15 minutes.
**5.** The fourth is the best, but it is still far from 'random'.
**6.** None.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:SamplingSchools)**:
A bit like *cluster sampling* (randomly selecting a small number of groups from many, and taking everyone (or everything) in those selected groups)... but not every person in the selected schools would respond (they would decide if they responded).
A *combination* of *cluster* and *voluntary response* sampling.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:SamplingMalaria)**:
True; False; True.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:SamplingForest)**:
:::
## Answers: Overview of internal validity {#FactorsInfluenceYAnswer}
Answers to exercises in Sect. \@ref(FactorsInfluenceYExercises).
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:FactorsInfluenceYStudy1)**:
Presumably *all* are extraneous variables, as all are possibly related to the response variable (incidence of depression): That is why the researchers obtained this information.
*None* can be lurking variables, as the researchers measure or observe all of them.
To be a confounding variable, the extraneous variable should be related to *both* the response variable (incidence of depression) *and* the explanatory variable (diet quality).
As a result, all of the extraneous variables could potentially be confounding variables.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:FactorsInfluenceYStudy2)**:
*Response* variable: something like 'risk of developing a cancer of the digestive system'.
*Explanatory* variable: 'whether or not the participants drank green tea at least three times a week'.
*Lurking* variable: 'health consciousness of the participants', because the researchers don't seem to have measured or observed this.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:FactorsInfluenceYStudy3)**:
Older children would probably be more likely to be smokers, and would be larger in general: age would be a confounding variable.
Age is easy to record, and usually *is* recorded in these types of studies, so probably *not* a lurking variable.
(The age, height and gender of each child *is* recorded.)
:::
## Answers: Designing experimental studies {#DesigningExperimentsAnswer}
Answers to exercises in Sect. \@ref(DesigningExperimentsExercises).
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:DesignExpTrueFalse).**
Only Statement 5 is true.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:DesignExpExtraneous).**
Lurking variables and confounding variables are special types of extraneous variables.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:DesignExpWeightLoss).**
Statements 2, 3, 7 and 8 are true.
'Sex of the patient' and 'Initial weight of the patients' are probably possible confounding variables.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:DesignExpParamedicsPills).**
**1.** A group receiving a pill that looks just like Treatment A and B, but has no effective ingredient.
**2.** Blinding the participants.
**3.** To ensure that participants did not change their behaviour because of the treatment they were receiving.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:DesignExpFillBlanks).**
Random allocation; analysis; washout; control group; researchers; participants.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:ResearchTasteOfWater2)**:
Observer bias.
The researcher is directly contacting the subjects, so may unintentionally influence their responses.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:ResearchDesignTasteOfWater3)**:
**1.** Randomly allocate the type of water to the subject (or the *order* in which the subjects taste-test each drink.)
**2.** The subjects do not know which type of water they are drinking.
**3.** The person providing the water and receiving the ratings does not know which type of water the subjects are drinking.
**4.** Hard to find a control.
**5.** Any random sampling is good, if possible.
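
Random allocation, as in the first point above, might be done in R as follows.
This is a sketch under stated assumptions: ten hypothetical subjects (labelled `S1` to `S10`) and two types of water; the real study may have used different numbers.

```r
set.seed(3)  # arbitrary seed so the allocation can be reproduced

subjects <- paste0("S", 1:10)  # ten hypothetical subjects

# Randomly allocate the type of water: five subjects per type, in random order
water <- sample(rep(c("Bottled", "Tap"), each = 5))
allocation <- data.frame(subject = subjects, water = water)
allocation

# Alternatively, every subject tastes both types, in a randomly chosen order
first_drink <- sample(c("Bottled", "Tap"), size = length(subjects), replace = TRUE)
```

Because the computer makes the allocation, the researcher cannot (even unintentionally) influence which subjects receive which water.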
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:ResearchDesignSunscreen)**:
**1.** *Response*: The amount of sunscreen used;
*Explanatory*: The time spent on sunscreen application.
**2.** They were looking at potential confounding variables.
**3.** If the mean of both the response and explanatory variables was different for females and males, then the sex of the participant would be a *confounding* variable, and this would need to be factored into the analysis of the data.
**4.** The participants are blinded to what is happening in the study.
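
The check described in the third point can be sketched in R: compare the mean of *both* variables between the sexes.
The data frame `sunscreen`, its column names, the units, and every value are invented for illustration only.

```r
# Hypothetical data for six participants (all values invented)
sunscreen <- data.frame(
  sex    = c("F", "F", "F", "M", "M", "M"),
  time   = c(50, 62, 58, 35, 41, 38),        # time spent applying (seconds)
  amount = c(7.1, 8.4, 7.9, 5.2, 6.0, 5.5)   # amount of sunscreen used (grams)
)

# Mean of the explanatory variable (time) and response variable (amount), by sex
means_by_sex <- aggregate(cbind(time, amount) ~ sex, data = sunscreen, FUN = mean)
means_by_sex
```

In these invented data, both means differ between females and males, so sex would be a potential confounding variable.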
:::
## Answers: Designing observational studies {#DesigningObservationalAnswer}
Answers to exercises in Sect. \@ref(DesigningObservationalExercises).
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:ResearchDesignObsTrueFalse).**
Only the second statement is true.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:ResearchDesignObsPollen).**
Probably in case the size of the hive is a confounding variable.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:ResearchDesignTasteOfWater)**:
**1.** Since this is an *observational study*, we *cannot* allocate students to receive bottled or tap water (because then the study would be an *experimental study*).
In an *experiment* we could randomly allocate students to receive *either* bottled or tap water and have them rate the taste (or even randomly allocate students to receive bottled or tap water *first*, then swap to the other type of water, and each student would then provide *two* ratings).
**2.** The students would not be aware of which water they would be drinking.
**3.** Neither the students nor the researchers who give the students the water would know which type of water the students are drinking.
**4.** We can't really set up a control here.
**5.** Any of the random sampling methods are possible, and are preferred.
In practice, perhaps use a convenience sample, but try to get a sample as representative as possible (Sect. \@ref(Representative-samples)).
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:ResearchDesignHawthorneEffect)**:
No.
People can know they are being observed.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:ResearchDesignSleep)**:
The description indicates that patients probably knew they were involved, so the Hawthorne effect should be considered when interpreting the results.
:::
## Answers: Interpretation {#InterpretationAnswer}
Answers to exercises in Sect. \@ref(InterpretationExercises).
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:ValidityLighting).**
External.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:ValidityFarmManagement).**
External; internal.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:InterpretationExerciseTrueFalse).**
True; false; false; false.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:InterpretationExerciseValidities).**
Ecological; external; internal; confounding; sampling.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:InterpretationExerciseExternalValidity)**:
Population: 'UniSC students on-campus'.
External validity refers to whether the results apply to other members of *this* population, not to people outside this population (such as members of the general public).
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:InterpretationExerciseParachutes)**:
**1.** P: Aircraft passengers aged 18 and over.
O: Unclear; something about 'composite of death or major traumatic injury'.
C: Between wearing a parachute and wearing a backpack.
I: Yes: Having participants wear the parachute or backpack.
**2.** Experimental: The researchers decide if the participants use a parachute or backpack.
**3.** *Explanatory*: 'whether or not a parachute is worn'.
*Response*: harder to understand; is it 'whether or not the participant dies or sustains a major injury'?
**4.** These results won't apply in the real world; not ecologically valid.
In the real world, parachutes are used at high altitude, for example.
**5.** The study is not very useful!
**6.** Speaking loosely:
That when jumping from a small plane that is on the ground, parachutes are as effective as backpacks in keeping people safe.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:InterpretationSleep)**:
Because the sample is not a random sample, the researchers are (rightly) noting that the results may not *generalise* to all hospitals.
Because the data was only collected at night, perhaps the study is not *ecologically valid*.
:::
## Answers: Data collection {#CollectionAnswer}
Answers to exercises in Sect. \@ref(CollectionExercises).
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:CollectSurveyQuestions1)**:
People aged 18 do not have a category.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:CollectSurveyQuestions2)**:
The second.
The first is *leading*: Should *concerned* cat owners...
The third is *leading* also (Do you *agree*...)
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:SunscreenQuestions)**.
The first question seems fine.
However, as always with these types of response options, what is meant by "seldom" (for instance) may vary from person to person.
The second question has options that overlap: Both 1h and 2h are in two categories.
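
One way to see what non-overlapping options look like is to classify some answers in R using `cut()`.
The category labels and the answers below are assumptions made for this sketch; they are not the options used in the original survey.

```r
# Hypothetical answers, in hours
hours <- c(0.5, 1, 1.5, 2, 3)

# right = FALSE makes each interval include its lower limit but not its upper limit,
# so 1 hour and 2 hours each fall into exactly one category
category <- cut(hours,
                breaks = c(0, 1, 2, Inf),
                right  = FALSE,
                labels = c("Less than 1 h", "1 h to less than 2 h", "2 h or more"))

data.frame(hours, category)
```

Every possible answer belongs to one, and only one, category.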
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:KidsEnvironmentQuestions).**
1. The first two are *closed*, with one option to be selected. The other is *open*.
1. I'm not sure primary school children could recall the answers to the first two questions accurately...
:::
## Answers: Describing variables {#DescribeAnswer}
Answers to exercises in Sect. \@ref(DescribeExercises).
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:DescribeClassifyingGraphsLimeTrees)**:
*Foliage biomass*: quantitative continuous.
*Tree diameter* (in cm): quantitative continuous.
*Age of the tree* (in years): quantitative continuous.
*Origin of the tree*: Qualitative nominal.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:DescribeClassifyingVariables1)**:
**1.** Systolic blood pressure: quantitative continuous.
**2.** Program of enrolment: qualitative nominal.
**3.** Academic grade: qualitative ordinal.
**4.** Number of times people visited the doctor last year: quantitative discrete.
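
These four types of variables correspond roughly to different ways of storing data in R.
The sketch below uses invented values for four hypothetical students; the column names, program names and grade levels are assumptions for illustration.

```r
# Invented values for four hypothetical students
students <- data.frame(
  systolic_bp   = c(118.5, 131.0, 124.2, 109.8),                 # quantitative continuous
  program       = factor(c("Science", "Arts", "Nursing", "Science")),  # qualitative nominal
  grade         = factor(c("Pass", "Distinction", "Credit", "Pass"),
                         levels  = c("Fail", "Pass", "Credit", "Distinction"),
                         ordered = TRUE),                        # qualitative ordinal
  doctor_visits = c(0L, 3L, 1L, 2L)                              # quantitative discrete
)

# Ordinal levels have a natural order, so comparisons make sense
students$grade[2] > students$grade[1]

# Nominal levels have no order: R stores them as an unordered factor
is.ordered(students$program)
```

Storing each variable appropriately helps R choose sensible summaries and graphs.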
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:DescribeClassifyingVariables2)**:
**1.** Age: qualitative ordinal.
**2.** Gender: qualitative nominal.
**3.** Location: qualitative nominal.
**4.** Social media use: qualitative ordinal.
**5.** BMI: quantitative continuous.
**6.** Total sitting time, in minutes per day: quantitative continuous.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:DescribeClassifyingOrthoses)**:
*Gender*: Qualitative nominal.
*Age*: Quantitative continuous.
*Height*: Quantitative continuous.
*Weight*: Quantitative continuous.
*GMFCS*: Qualitative ordinal.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:DescribeClassifyingNitrogenInSoil)**:
*Fertilizer dose*: Quantitative continuous.
*Soil nitrogen*: Quantitative continuous.
*Fertilizer source*: Qualitative nominal.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:DescribeClassifyingKangaroos)**:
*Response of kangaroos*: Qualitative ordinal. (Or perhaps nominal?)
*Height of drone*: 'Height' is quantitative,
but with just four values used it would probably be treated as qualitative ordinal.
*Mob size*: Quantitative discrete.
*Sex*: Qualitative nominal.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:DescribeSelfieDeaths)**:
*Location* is the only variable (something observed from the *individuals*).
The *number of people* and the *percentage of people* who died at each location is a *summary* of the data collected from the individuals.
'Location' is a *nominal*, *qualitative* variable, with seven *levels*.
:::
## Answers: Graphs {#GraphsAnswer}
Answers to exercises in Sect. \@ref(GraphsExercises).
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:GraphsCars)**:
None of them are *bad* graphs.
I'd prefer the bar chart, but any are OK.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:GraphsLimeTrees)**:
A graph of the individual variables is always useful as a starting point: so a *bar chart* for the origin, and a *histogram* for the others.
But *relationships* are the main focus.
Relationships between foliage biomass and tree origin: *boxplot*.
Relationships between foliage biomass and the other variables: *scatterplot*.
On the scatterplot, the different origins of the trees could be *encoded*
by using different colours or plotting symbols.
:::
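One way to encode a qualitative variable on a scatterplot is sketched below.
The data frame `Trees`, its values, and the variable name `DBH` are *assumptions* made up for illustration; they are not the data from the exercise.

```r
# Hypothetical data (NOT the lime-tree data from the exercise)
Trees <- data.frame(
  DBH     = c(5.2, 8.1, 12.4, 6.3, 9.8, 14.0),
  Foliage = c(0.4, 1.1, 2.9, 0.6, 1.5, 3.8),
  Origin  = factor(c("Coppice", "Natural", "Planted",
                     "Coppice", "Natural", "Planted")))

# Convert the levels of Origin to the integer codes 1, 2, 3
OriginCode <- as.integer(Trees$Origin)

# Encode origin using both colour and plotting symbol
plot(Foliage ~ DBH,
     data = Trees,
     las = 1,
     col = OriginCode,
     pch = OriginCode,
     xlab = "Tree diameter (hypothetical values)",
     ylab = "Foliage biomass (hypothetical values)")
legend("topleft",
       legend = levels(Trees$Origin),
       col = seq_along(levels(Trees$Origin)),
       pch = seq_along(levels(Trees$Origin)))
```

Using both colour and symbol means the encoding survives if the graph is printed in greyscale.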
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:GraphsOrthoses)**:
`Gender` and `GMFCS`: both qualitative; the others are quantitative.
Relationships between two *quantitative* variables: use a scatterplot.
Relationships between two *qualitative* variables: (say) a side-by-side bar chart.
With one of each: boxplot.
See Fig. \@ref(fig:CPalsyPlots) for some examples.
:::
```{r CPalsyPlots, fig.align="center", fig.cap="Some graphs from the cerebral palsy data", fig.width=9, fig.height=4, out.width='100%'}
NumP <- 15
Gender <- rep("M",
NumP)
Gender[ c(11, 14, 15)] <- "F"
Age <- c(9, 7, 7, 12, 11, 5, 6, 8, 8, 6, 7, 11, 7, 9, 8)
Ht <- c(136, 106, 129, 152, 146, 113, 112, 112, 138, 116, 113, 141, 136, 128, 133)
Wt <- c(34.5, 16.2, 21.1, 40.4, 39.3, 18.1, 16.7, 19.1, 28.6, 19.3, 17.6, 34.9, 34.5, 21.9, 23.0)
GMFCS <- c(1, 2, 1, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1)
CPalsy <- data.frame(Gender = Gender,
Age = Age,
Height = Ht,
Weight = Wt,
GMFCS = GMFCS)
par( mfrow = c(1, 3))
plot( Height ~ Age,
data = CPalsy,
las = 1,
xlab = "Age (in years)",
ylab = "Ht (in cm)")
boxplot( Height ~ Gender,
data = CPalsy,
las = 1,
xlab = "Gender",
ylab = "Ht (in cm)")
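# Each bar is a gender; the segments within a bar are the GMFCS levels.
# The vertical axis shows counts, so it is labelled accordingly.
barplot( xtabs( ~ GMFCS + Gender,
                data = CPalsy),
         las = 1,
         beside = FALSE,
         legend.text = TRUE,
         args.legend = list(x = "topleft", title = "GMFCS"),
         xlab = "Gender",
         ylab = "Count")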
```
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:GraphNitrogenInSoil)**:
*Fertilizer dose* (quantitative): Histogram (explanatory variable).
*Soil nitrogen* (quantitative): Histogram (response variable).
*Source* (qualitative nominal): Bar chart (explanatory variable).
*Relationships*: Between fertilizer dose and soil nitrogen: scatterplot.
*Source* could be encoded using different *coloured* points.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:GraphSurveyData)**:
A bar chart (or dot chart).
A pie chart would *not* be appropriate, as respondents could select more than one option.
:::
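A short sketch shows why a pie chart fails here.
The survey counts below are *hypothetical*: when respondents can select more than one option, the percentages add to more than 100%, so the options are not parts of a single whole.

```r
# Hypothetical survey (made-up numbers): 10 respondents,
# each able to select more than one option
NumRespondents <- 10
Selections <- c(OptionA = 7, OptionB = 5, OptionC = 4)

# Percentage of respondents who selected each option
Percentages <- 100 * Selections / NumRespondents
sum(Percentages)   # More than 100: the options overlap

barplot(Percentages,
        las = 1,
        ylim = c(0, 100),
        xlab = "Option",
        ylab = "Percentage of respondents")
```

A bar chart displays each percentage separately, so it does not need the percentages to add to 100%.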
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:GraphsAIS)**:
In general, female basketball players are taller than female netball players (the first, second and third quartiles are all greater for basketball players).
For the second and third quartiles, the differences look quite substantial.
The minimum heights are similar.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:GraphNoisyMiners)**:
What do the different plotting symbols mean?
The labels on the axes are not helpful.
The vertical axis goes up to 35, but could easily stop at 20.
See Fig. \@ref(fig:MinersGood).
:::
```{r MinersGood, fig.cap="The number of noisy miners and the number of eucalyptus trees", fig.align="center", fig.width=5, fig.height=3.5}
data(NMiner)
plot(Minerab ~ Eucs,
data = NMiner,
ylim = c(0, 20),
las = 1,
xlab = "Number of eucalypt trees",
ylab = "Number of noisy miners",
main = "No. of eucalypts and the no. of noisy miners in\n 2 ha. buloke woodland patches in Vic.",
pch = 19)
```
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:GraphHorseshoeCrabs)**:
The graph is *inappropriate*!
Both variables are qualitative, but the graph is a scatterplot (used for two *quantitative* variables).
What does that plot even tell you?
A stacked or side-by-side bar chart should be used (Fig. \@ref(fig:CrabsPlotBar)).
:::
```{r CrabsPlotBar, fig.cap="The colour of female horseshoe crabs and the condition of their spines. There are no missing values.", fig.align="center", fig.width=5, fig.height=4}
data(HCrabs)
CrabsMat <- xtabs(~ Spine + Col,
data = HCrabs)
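# The columns of CrabsMat (carapace colour) form the bars;
# the rows (spine condition) form the stacked segments.
barplot( CrabsMat,
         beside = FALSE,
         xlab = "Carapace colour",
         ylab = "Number of crabs",
         ylim = c(0, 100),
         col = grey( seq(0, 1, length.out = 3)),
         las = 1)
legend("topleft",
       title = "Spine condition",
       fill = grey( seq(0, 1, length.out = 3)),
       legend = rownames(CrabsMat))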
```
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:GraphsMADRS)**:
**1.** Response variable: *Change* in MADRS (quantitative continuous).
**2.** Explanatory variable: treatment group (qualitative nominal with three levels).
**3.** Response variable: Histogram. Explanatory: bar chart. Relationship: boxplot.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:GreenBuilding)**:
See Fig. \@ref(fig:OfficeTempsBoxplot).
```{r OfficeTempsBoxplot, fig.cap="Boxplot of the office temperatures", fig.align="center", fig.width=5, fig.height=4}
Green.stats <- matrix(
c(16.4 ,15.9 ,20.1,
22.8 ,23.8 ,24.6,
24.4 ,25.5 ,26.1,
25.5 ,26.9 ,27.2,
27.4 ,31.0 ,30.3),
ncol = 3,
byrow = TRUE)
Green.bxp <- list( stats = Green.stats,
n = rep(100, 3), # MADE UP
conf = NA,
out = rep(numeric(0), 3),
group = rep(numeric(0), 3),
names = c("Office A",
"Office B",
"Office C"))
bxp( Green.bxp,
las = 1,
xlab = "Office",
ylab = "Temperature (in degrees C)",
ylim = c(15, 35)
)
```
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:GraphsSkewBar)**:
The variable is 'Sport' (*qualitative*).
The bars can be placed in any order.
*Skewness makes no sense*: It only makes sense to talk about skewness for *quantitative* variables.
:::
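The point can be sketched with *hypothetical* counts (made up for illustration, not the data from the exercise): the same data give bar charts with different 'shapes', depending only on the order chosen for the bars.

```r
# Hypothetical counts of athletes by sport (NOT the data from the exercise)
Sport <- c(Netball = 23, Rowing = 37, Swimming = 22, Tennis = 11)

# The same counts, with the bars ordered from largest to smallest
SportOrdered <- sort(Sport, decreasing = TRUE)

par(mfrow = c(1, 2))
barplot(Sport,
        las = 2,
        main = "Alphabetical order")
barplot(SportOrdered,
        las = 2,
        main = "Ordered by count")
```

Since the apparent shape changes with an arbitrary ordering, describing the shape as 'skewed' has no meaning.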
## Answers: Numerical summaries for quantitative data {#NumericalQuantAnswer}
Answers to exercises in Sect. \@ref(NumericalQuantExercises).
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:NumericalQuantRides)**:
**1.** 3.7.
**2.** 3.5.
**3.** 1.888562.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:NumericalQuantFulmars)**:
**1.** 643 g.
**2.** 12.884 g.
**3.** 637.5 g.
**4.** We do not know.
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:NumericalQuantNHANES)**:
Probably the median, as the data are slightly skewed right, with some outliers.
*Both* the mean and median *can* be quoted...
:::
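A small sketch, using *hypothetical* values (not the NHANES data), shows why the median is often preferred for right-skewed data with outliers: a single large value pulls the mean upwards but leaves the median unchanged.

```r
# Hypothetical right-skewed data with one large outlier (the value 30)
x <- c(2, 3, 3, 4, 4, 5, 6, 30)
x.NoOutlier <- x[x < 30]

# With the outlier
mean(x)
median(x)

# Without the outlier: the mean changes a lot, the median does not
mean(x.NoOutlier)
median(x.NoOutlier)
```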
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:NumericalQuantSOI)**:
**1.** Sample mean: 0.467.
**2.** Sample median: 3.35.
**3.** Range: 29.6 (from -19.8 to 9.8).
**4.** Sample standard deviation: 10.40263.
(SOI has no units of measurement.)
:::
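The range quoted above is the largest value minus the smallest value, which can be checked directly; the names `SOI.min` and `SOI.max` are chosen here for illustration.

```r
# Smallest and largest SOI values, as quoted in the answer
SOI.min <- -19.8
SOI.max <- 9.8

# Range: largest value minus smallest value
SOI.range <- SOI.max - SOI.min
SOI.range   # 29.6
```

With the full data in a vector (say `x`), `diff(range(x))` gives the same quantity.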
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:NumericalQuantMatchingHistogramsAndBoxplots)**:
**A**: II (median; IQR).
**B**: I (mean; standard deviation).
**C**: III (median; IQR).
:::
::: {.answer data-latex=""}
**Answer to Exercise \@ref(exr:NumericalQuantConstructionWorkerProductivity)**:
See Fig. \@ref(fig:PanelsBoxplot).
Worker 2 is faster in general (more panels installed per minute), including one *fast* outlier.
Workers 1 and 3 have similar medians, but Worker 3 is more consistent (smaller IQR).
:::