Built site for gh-pages

SydneyBioX · Dec 15, 2024 · e091ceb
1 parent b6c476d
Showing 11 changed files with 465 additions and 16 deletions.
diff --git a/.nojekyll b/.nojekyll
@@ -1 +1 @@
-fa4d7850
+42df7a45
diff --git a/Procedure1.html b/Procedure1.html
@@ -20,6 +20,40 @@
   margin: 0 0.8em 0.2em -1em; /* quarto-specific, see https://github.com/quarto-dev/quarto-cli/issues/4556 */ 
   vertical-align: middle;
 }
+/* CSS for syntax highlighting */
+pre > code.sourceCode { white-space: pre; position: relative; }
+pre > code.sourceCode > span { line-height: 1.25; }
+pre > code.sourceCode > span:empty { height: 1.2em; }
+.sourceCode { overflow: visible; }
+code.sourceCode > span { color: inherit; text-decoration: inherit; }
+div.sourceCode { margin: 1em 0; }
+pre.sourceCode { margin: 0; }
+@media screen {
+div.sourceCode { overflow: auto; }
+}
+@media print {
+pre > code.sourceCode { white-space: pre-wrap; }
+pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
+}
+pre.numberSource code
+  { counter-reset: source-line 0; }
+pre.numberSource code > span
+  { position: relative; left: -4em; counter-increment: source-line; }
+pre.numberSource code > span > a:first-child::before
+  { content: counter(source-line);
+    position: relative; left: -1em; text-align: right; vertical-align: baseline;
+    border: none; display: inline-block;
+    -webkit-touch-callout: none; -webkit-user-select: none;
+    -khtml-user-select: none; -moz-user-select: none;
+    -ms-user-select: none; user-select: none;
+    padding: 0 4px; width: 4em;
+  }
+pre.numberSource { margin-left: 3em;  padding-left: 4px; }
+div.sourceCode
+  {   }
+@media screen {
+pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
+}
 </style>
 
 
@@ -158,7 +192,12 @@ <h2 id="toc-title">Table of contents</h2>
 
   <ul>
   <li><a href="#intro" id="toc-intro" class="nav-link active" data-scroll-target="#intro"><span class="header-section-number">2.1</span> Intro</a></li>
-  <li><a href="#next-thing" id="toc-next-thing" class="nav-link" data-scroll-target="#next-thing"><span class="header-section-number">2.2</span> Next thing</a></li>
+  <li><a href="#set-up-the-environment-and-data-objects" id="toc-set-up-the-environment-and-data-objects" class="nav-link" data-scroll-target="#set-up-the-environment-and-data-objects"><span class="header-section-number">2.2</span> Set up the environment and data objects</a></li>
+  <li><a href="#final-procedure-using-scm-classifier-with-5-of-features-selected-from-t-test" id="toc-final-procedure-using-scm-classifier-with-5-of-features-selected-from-t-test" class="nav-link" data-scroll-target="#final-procedure-using-scm-classifier-with-5-of-features-selected-from-t-test"><span class="header-section-number">3</span> Final Procedure using SVM Classifier with 5% of features selected from t-test</a>
+  <ul class="collapse">
+  <li><a href="#cross-validated-classification" id="toc-cross-validated-classification" class="nav-link" data-scroll-target="#cross-validated-classification"><span class="header-section-number">3.1</span> Cross-validated Classification</a></li>
+  <li><a href="#classification-evaluation" id="toc-classification-evaluation" class="nav-link" data-scroll-target="#classification-evaluation"><span class="header-section-number">3.2</span> Classification Evaluation</a></li>
+  </ul></li>
   </ul>
 </nav>
     </div>
@@ -186,13 +225,104 @@ <h1 class="title"><span class="chapter-number">2</span>&nbsp; <span class="chapt
 
 <section id="intro" class="level2" data-number="2.1">
 <h2 data-number="2.1" class="anchored" data-anchor-id="intro"><span class="header-section-number">2.1</span> Intro</h2>
-<p>text</p>
+<p>Procedure 1 compares classification sample accuracy between bulked and pseudobulked data using an SVM classifier. Sample accuracy is the proportion of repeats in which an individual is classified correctly. For example, if 70 out of 100 repeats classify an individual correctly, that individual's sample accuracy is 0.70.</p>
+<p>For this analysis, a single-cell transcriptomics dataset for 54 oestrogen receptor positive (ER+)/progesterone receptor positive (PR+) breast cancer patients is used. There are 280 genes and 6 cell types. The classification outcomes are ‘cluster 1’ and ‘cluster 2’, found by unsupervised machine learning. Cluster 1 corresponds to patients with higher expression of LPL, CAVIN2, and TIMP4 in macrophage cells, and ADIPOQ in stromal cells, associated with better survival. Cluster 2 is the opposite, associated with poorer survival. More details on the clusters can be found at https://www.biorxiv.org/content/10.1101/2024.07.02.601790v1.full.</p>
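+<p>The definition of sample accuracy given above can be illustrated with a minimal base R sketch. The individuals, the number of repeats and the predicted classes below are hypothetical and only serve to show the calculation; this is not how ClassifyR computes the metric internally.</p>

```r
# Hypothetical example: 3 individuals, 5 repeats of cross-validation.
# Rows are repeats, columns are individuals; values are predicted classes.
truth <- c(A = "cluster 1", B = "cluster 2", C = "cluster 1")
predicted <- rbind(
  c("cluster 1", "cluster 2", "cluster 2"),
  c("cluster 1", "cluster 2", "cluster 1"),
  c("cluster 1", "cluster 1", "cluster 2"),
  c("cluster 1", "cluster 2", "cluster 2"),
  c("cluster 1", "cluster 2", "cluster 1")
)
colnames(predicted) <- names(truth)

# Compare every row of predictions with the true class of each individual.
correct <- sweep(predicted, 2, truth, FUN = "==")

# Sample accuracy: proportion of repeats in which each individual is correct.
sample_accuracy <- colMeans(correct)
print(sample_accuracy)
```

+<p>In this toy example, individual A is correct in all five repeats, B in four and C in two, giving sample accuracies of 1.0, 0.8 and 0.4.</p>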
+</section>
+<section id="set-up-the-environment-and-data-objects" class="level2" data-number="2.2">
+<h2 data-number="2.2" class="anchored" data-anchor-id="set-up-the-environment-and-data-objects"><span class="header-section-number">2.2</span> Set up the environment and data objects</h2>
+<p><strong>1. Load the R packages into the R environment:</strong></p>
+<p><span style="color: grey;">Timing ~ 6.5s</span></p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(ClassifyR)</span>
+<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(openxlsx)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+</div>
 </section>
-<section id="next-thing" class="level2" data-number="2.2">
-<h2 data-number="2.2" class="anchored" data-anchor-id="next-thing"><span class="header-section-number">2.2</span> Next thing</h2>
-<p>more text</p>
+<section id="final-procedure-using-scm-classifier-with-5-of-features-selected-from-t-test" class="level1" data-number="3">
+<h1 data-number="3"><span class="header-section-number">3</span> Final Procedure using SVM Classifier with 5% of features selected from t-test</h1>
+<p><strong>2. Import datasets</strong></p>
+<p><span style="color: grey;">Timing ~ 0.15s</span></p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb2"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>p <span class="ot">&lt;-</span> <span class="fu">readRDS</span>(<span class="st">"data/procedure1/pseudobulk_overall_sum.rds"</span>)</span>
+<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a>p_cell <span class="ot">&lt;-</span> <span class="fu">readRDS</span>(<span class="st">"data/procedure1/pseudobulk_celltype_sum.rds"</span>)</span>
+<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a>data <span class="ot">=</span> <span class="fu">list</span>(<span class="st">"pseudo_bulk_overall"</span> <span class="ot">=</span> p, <span class="st">"pseudo_bulk_cell"</span> <span class="ot">=</span> p_cell)</span>
+<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a>clusters <span class="ot">&lt;-</span> <span class="fu">readRDS</span>(<span class="st">"data/procedure1/cluster_result_ER_and_PR_onlyER+PR+.rds"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+</div>
+<p>The first and second commands read in the bulked and pseudobulked transcriptomic datasets, respectively, for the 54 individuals.</p>
+<p>Both datasets are combined into a list structure for downstream analysis.</p>
+<p>The final command reads in the clustering outcome to be predicted.</p>
+<section id="cross-validated-classification" class="level2" data-number="3.1">
+<h2 data-number="3.1" class="anchored" data-anchor-id="cross-validated-classification"><span class="header-section-number">3.1</span> Cross-validated Classification</h2>
+<p><strong>3. Classifying patients into clusters</strong></p>
+<p><span style="color: grey;">Timing ~ 17.0s</span></p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb3"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">1</span>)</span>
+<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a>nFeatures <span class="ot">=</span> <span class="fu">list</span>(<span class="at">pseudo_bulk_overall =</span> <span class="fl">0.05</span><span class="sc">*</span><span class="fu">ncol</span>(p), <span class="at">pseudo_bulk_cell =</span> <span class="fl">0.05</span><span class="sc">*</span><span class="fu">ncol</span>(p_cell))</span>
+<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a>classifyr_result3 <span class="ot">&lt;-</span> <span class="fu">crossValidate</span>(data, <span class="at">outcome =</span> clusters<span class="sc">$</span>cluster, <span class="at">classifier =</span> <span class="st">"SVM"</span>, <span class="at">nFeatures =</span> nFeatures, <span class="at">nFolds =</span> <span class="dv">5</span>, <span class="at">nRepeats =</span> <span class="dv">100</span>, <span class="at">nCores =</span> <span class="dv">5</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+</div>
+<p>The <code>set.seed(1)</code> command ensures that any subsequent operations involving randomness yield consistent results across runs.</p>
+<p>The second command defines the number of features to be selected. Here, it is set to 5% of the available features in each dataset.</p>
+<p>The final command uses the <code>crossValidate</code> function to perform 5-fold cross-validation with the SVM classifier on both datasets. The cross-validation is repeated 100 times and uses 5 CPU cores in parallel to speed up classification. The classifier type and the numbers of folds, repeats and cores can be adjusted as needed for other analyses.</p>
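+<p>As a rough illustration of the two ideas in this step, the base R sketch below computes 5% of the feature count and deals 54 samples into 5 folds for a single repeat. The feature counts are assumptions based on the description of the data (280 genes, and 280 genes in each of 6 cell types); the actual column counts of <code>p</code> and <code>p_cell</code> may differ. The fold assignment is a generic sketch, not ClassifyR's internal implementation.</p>

```r
# Assumed dimensions, taken from the data description: 54 patients,
# 280 genes overall and 280 genes x 6 cell types for the cell-type data.
n_samples <- 54
n_features <- c(pseudo_bulk_overall = 280, pseudo_bulk_cell = 280 * 6)

# 5% of the features, rounded so that the count is a whole number.
n_selected <- round(0.05 * n_features)
print(n_selected)

# One repeat of 5-fold cross-validation: shuffle the sample indices,
# then deal them into 5 folds of near-equal size.
set.seed(1)
n_folds <- 5
fold_id <- rep(seq_len(n_folds), length.out = n_samples)
folds <- split(sample(n_samples), fold_id)
print(lengths(folds))

# Every sample is held out exactly once within the repeat.
stopifnot(setequal(unlist(folds), seq_len(n_samples)))
```

+<p>Repeating the shuffle 100 times with different random permutations gives each sample 100 held-out predictions, which is what the sample accuracy is later calculated from.</p>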
+</section>
+<section id="classification-evaluation" class="level2" data-number="3.2">
+<h2 data-number="3.2" class="anchored" data-anchor-id="classification-evaluation"><span class="header-section-number">3.2</span> Classification Evaluation</h2>
+<p><strong>4. Classification Accuracy</strong></p>
+<p><span style="color: grey;">Timing ~ 0.21s</span></p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb4"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>classifyr_result3 <span class="ot">&lt;-</span> <span class="fu">sapply</span>(classifyr_result3, <span class="cf">function</span>(results) {</span>
+<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a>  <span class="fu">calcCVperformance</span>(results, <span class="at">performanceType =</span> <span class="st">"Sample Accuracy"</span>)</span>
+<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a>}) <span class="co"># loop to calculate sample accuracy</span></span>
+<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a>accuracyMatrix <span class="ot">&lt;-</span> <span class="fu">sapply</span>(classifyr_result3, <span class="cf">function</span>(result) <span class="fu">performance</span>(result)[[<span class="st">"Sample Accuracy"</span>]])</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+</div>
+<p>The first command uses <code>calcCVperformance</code> to calculate the classification sample accuracy for both datasets.</p>
+<p>The second command uses <code>performance</code> to extract the sample accuracies into a matrix, with one column per dataset.</p>
+<p><strong>5. Classification Performance Visualisation</strong></p>
+<p><span style="color: grey;">Timing ~ 1.7s</span></p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb5"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="fu">performancePlot</span>(classifyr_result3)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output-display">
+<div>
+<figure class="figure">
+<p><img src="Procedure1_files/figure-html/unnamed-chunk-5-1.png" class="img-fluid figure-img" width="672"></p>
+</figure>
+</div>
+</div>
+</div>
+<p><code>performancePlot</code> outputs side-by-side boxplots of the balanced accuracies for each dataset.<br>
+Both datasets perform comparably in terms of median balanced accuracy, but the smaller range and interquartile range of balanced accuracy for the bulked dataset suggest that it may be preferable when more consistent results are needed.</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb6"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="fu">samplesMetricMap</span>(classifyr_result3)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output-display">
+<div>
+<figure class="figure">
+<p><img src="Procedure1_files/figure-html/unnamed-chunk-6-1.png" class="img-fluid figure-img" width="672"></p>
+</figure>
+</div>
+</div>
+<div class="cell-output cell-output-stdout">
+<pre><code>TableGrob (2 x 1) "arrange": 2 grobs
+  z     cells    name                grob
+1 1 (2-2,1-1) arrange      gtable[layout]
+2 2 (1-1,1-1) arrange text[GRID.text.276]</code></pre>
+</div>
+</div>
+<p><code>samplesMetricMap</code> outputs a heatmap of the sample accuracy of each individual, calculated over the 100 repeats, for each dataset. A greater proportion of samples have a high sample accuracy, in the interval (0.8,1], when classified with the bulked data than with the pseudobulked data.</p>
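+<p>The comparison of the top accuracy bin can also be made numerically. The sketch below uses a small hypothetical accuracy matrix (six samples, two datasets) in place of <code>accuracyMatrix</code>; the values are invented for illustration and are not results from this analysis.</p>

```r
# Hypothetical sample accuracies for six samples under two datasets.
accuracy <- cbind(
  bulked       = c(0.95, 0.90, 0.85, 0.70, 1.00, 0.45),
  pseudobulked = c(0.90, 0.75, 0.55, 0.65, 0.95, 0.30)
)

# Count samples per accuracy interval; intervals are open on the left and
# closed on the right, e.g. (0.8,1], matching the notation in the text.
breaks <- c(0, 0.2, 0.4, 0.6, 0.8, 1)
bins <- apply(accuracy, 2, function(x) {
  table(cut(x, breaks = breaks, include.lowest = TRUE))
})
print(bins)

# Proportion of samples whose accuracy falls in the top interval (0.8,1].
top_bin <- colMeans(accuracy > 0.8)
print(top_bin)
```

+<p>With the real data, replacing <code>accuracy</code> with the matrix of sample accuracies would give the proportion of individuals in the top bin for each dataset.</p>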
+<div class="cell">
+<div class="sourceCode cell-code" id="cb8"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a><span class="fu">plot</span>(accuracyMatrix) <span class="co">#scatterplot of sample accuracies for both datasets</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output-display">
+<div>
+<figure class="figure">
+<p><img src="Procedure1_files/figure-html/unnamed-chunk-7-1.png" class="img-fluid figure-img" width="672"></p>
+</figure>
+</div>
+</div>
+</div>
+<p>The final command outputs a scatterplot to show the prediction accuracy of each sample in each dataset.<br>
+</p>
 
 
+</section>
 </section>
 
 </main> <!-- /main -->

diff --git a/Procedure1_files/figure-html/unnamed-chunk-5-1.png b/Procedure1_files/figure-html/unnamed-chunk-5-1.png
diff --git a/Procedure1_files/figure-html/unnamed-chunk-6-1.png b/Procedure1_files/figure-html/unnamed-chunk-6-1.png
diff --git a/Procedure1_files/figure-html/unnamed-chunk-7-1.png b/Procedure1_files/figure-html/unnamed-chunk-7-1.png
diff --git a/Procedure3.html b/Procedure3.html
@@ -20,6 +20,40 @@
   margin: 0 0.8em 0.2em -1em; /* quarto-specific, see https://github.com/quarto-dev/quarto-cli/issues/4556 */ 
   vertical-align: middle;
 }
+/* CSS for syntax highlighting */
+pre > code.sourceCode { white-space: pre; position: relative; }
+pre > code.sourceCode > span { line-height: 1.25; }
+pre > code.sourceCode > span:empty { height: 1.2em; }
+.sourceCode { overflow: visible; }
+code.sourceCode > span { color: inherit; text-decoration: inherit; }
+div.sourceCode { margin: 1em 0; }
+pre.sourceCode { margin: 0; }
+@media screen {
+div.sourceCode { overflow: auto; }
+}
+@media print {
+pre > code.sourceCode { white-space: pre-wrap; }
+pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
+}
+pre.numberSource code
+  { counter-reset: source-line 0; }
+pre.numberSource code > span
+  { position: relative; left: -4em; counter-increment: source-line; }
+pre.numberSource code > span > a:first-child::before
+  { content: counter(source-line);
+    position: relative; left: -1em; text-align: right; vertical-align: baseline;
+    border: none; display: inline-block;
+    -webkit-touch-callout: none; -webkit-user-select: none;
+    -khtml-user-select: none; -moz-user-select: none;
+    -ms-user-select: none; user-select: none;
+    padding: 0 4px; width: 4em;
+  }
+pre.numberSource { margin-left: 3em;  padding-left: 4px; }
+div.sourceCode
+  {   }
+@media screen {
+pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
+}
 </style>
 
 
@@ -191,6 +225,19 @@ <h2 data-number="4.1" class="anchored" data-anchor-id="intro"><span class="heade
 <section id="next-thing" class="level2" data-number="4.2">
 <h2 data-number="4.2" class="anchored" data-anchor-id="next-thing"><span class="header-section-number">4.2</span> Next thing</h2>
 <p>more text</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">1</span>)</span>
+<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="fu">suppressPackageStartupMessages</span>(<span class="fu">library</span>(ClassifyR))</span>
+<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="co"># mae &lt;- readRDS("data/procedure3/MultiAssayExperiment.rds")</span></span>
+<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="co"># pp &lt;- precisionPathwaysTrain(mae, "Outcome")</span></span>
+<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a><span class="co"># pp &lt;- calcCostsAndPerformance(pp, setNames(c(30, 100, 50), c("Clinical", "RNA_pair", "miRNA_pair")))</span></span>
+<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a><span class="co"># summary(pp)</span></span>
+<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a><span class="co"># bubblePlot(pp)</span></span>
+<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a><span class="co"># strataPlot(pp, "clinical-RNA_pair")</span></span>
+<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a><span class="co"># flowchart(pp, "clinical-RNA_pair")</span></span>
+<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a><span class="co"># predictions &lt;- precisionPathwaysPredict(pp, mae, "Outcome")</span></span>
+<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a><span class="co"># predictions$pathways</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+</div>
 
 
 </section>

diff --git a/Procedure3_files/figure-html/unnamed-chunk-1-1.png b/Procedure3_files/figure-html/unnamed-chunk-1-1.png
diff --git a/Procedure3_files/figure-html/unnamed-chunk-1-2.png b/Procedure3_files/figure-html/unnamed-chunk-1-2.png