Placebo mean: 100.55555555555556
Caffeine mean: 94.22222222222223
Placebo SD: 7.699206308300732
Caffeine SD: 5.607534613753574
Placebo SEM: 2.5664021027669106
Caffeine SEM: 1.8691782045845249
Placebo sum of squares: 474.22222222222223
Caffeine sum of squares: 251.55555555555557
Pooled variance: 45.361111111111114
T Value: 1.9947880650265368
Degrees of Freedom: 16
P Value: 0.0609
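As a sanity check, the pooled variance and t value can be rebuilt from the means, SDs, and SEMs listed above. This is only a sketch using the Python standard library; the variable names are mine, and the group sizes are not given in the output, so they are recovered here from the relationship SEM = SD / sqrt(n).

```python
import math

# Summary figures copied from the output above.
placebo_mean, caffeine_mean = 100.55555555555556, 94.22222222222223
placebo_sd, caffeine_sd = 7.699206308300732, 5.607534613753574
placebo_sem, caffeine_sem = 2.5664021027669106, 1.8691782045845249

# Group sizes are not printed, so recover them: SEM = SD / sqrt(n)  =>  n = (SD / SEM)^2.
n_placebo = round((placebo_sd / placebo_sem) ** 2)
n_caffeine = round((caffeine_sd / caffeine_sem) ** 2)

# Degrees of freedom for an independent two-sample t-test with pooled variance.
df = n_placebo + n_caffeine - 2

# Pooled variance: each group's variance weighted by its own degrees of freedom.
pooled_variance = (
    (n_placebo - 1) * placebo_sd**2 + (n_caffeine - 1) * caffeine_sd**2
) / df

# Standard error of the difference between the two means.
standard_error = math.sqrt(pooled_variance * (1 / n_placebo + 1 / n_caffeine))

t_value = (placebo_mean - caffeine_mean) / standard_error

print("n per group:", n_placebo, n_caffeine)
print("Degrees of freedom:", df)
print("Pooled variance:", pooled_variance)
print("T value:", t_value)
```

The recomputed pooled variance and t value should agree with the figures above up to floating-point rounding.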
Basically, we went into this experiment saying that in order to accept that caffeine affects metabolism, we needed a result significant at the 5% level (the usual 95% confidence threshold). When we ran the test, we got p = 0.0609, which is only significant at about the 6% level. So we're close, but I wouldn't draw any conclusions just yet. To essentially "break the tie" here, we can collect a bigger sample and rerun the test.
From this data we cannot conclude that caffeine has an effect on muscle metabolism. The reason lies in the p-value (0.0609). What this value means is that, if caffeine truly had no effect, we would still see a difference at least this large about 6% of the time. To reject our null hypothesis, it is generally accepted that the p-value must be below 0.05 (the 95% confidence level). That doesn't mean our research should stop here, though. Our data put us very close to that threshold. It is certainly still possible that the effect is real and we simply don't have a large enough sample. We should calculate how much statistical power we have, in order to determine the number of samples needed to be confident that sample size is not the issue. We could also calculate Cohen's d to gauge the effect size. This statistic is not conclusive by itself, but effect size is a helpful piece of evidence for where to look next. For example, if the effect size is large (0.8 or more), then our failure to reach significance may be a Type II error caused by an underpowered study. But if the effect size is small (0.1-0.3), then any real effect is probably too small to matter much, and we can reasonably stay with the null hypothesis.
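To make the Cohen's d and sample-size suggestion concrete, here is a rough sketch using only the Python standard library. Cohen's d is taken as the mean difference divided by the pooled standard deviation. The sample-size function uses the common normal-approximation formula for a two-sided, two-sample comparison. The function name `n_per_group` and the 80% power target are my own assumptions, not something from the experiment; the 0.05 alpha matches the threshold discussed above.

```python
import math
from statistics import NormalDist

# Figures copied from the output above.
mean_difference = 100.55555555555556 - 94.22222222222223
pooled_variance = 45.361111111111114

# Cohen's d: mean difference expressed in units of the pooled standard deviation.
cohens_d = mean_difference / math.sqrt(pooled_variance)


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-sample test.

    Normal approximation: n = 2 * (z_(1 - alpha/2) + z_power)^2 / d^2.
    The 80% power default is an assumed convention, not a value from the study.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d**2)


print("Cohen's d:", cohens_d)
print("Approximate n per group:", n_per_group(cohens_d))
```

Treat the result as a ballpark only. The normal approximation runs slightly low compared with a t-based power calculation, and an effect size estimated from a small sample is itself noisy, so the true requirement could be noticeably larger.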