diff --git a/.nojekyll b/.nojekyll index ece341cc..af6f4909 100644 --- a/.nojekyll +++ b/.nojekyll @@ -1 +1 @@ -eec41acf \ No newline at end of file +585ab0ee \ No newline at end of file diff --git a/dl_primer.html b/dl_primer.html index 66cdc282..d72f2031 100644 --- a/dl_primer.html +++ b/dl_primer.html @@ -84,6 +84,8 @@ } + + @@ -438,7 +440,7 @@

3.1.2 Brief History of Deep Learning

The concept of deep learning has its roots in the early artificial neural networks. It has witnessed several waves of popularity, starting with the introduction of the Perceptron in the 1950s (Rosenblatt 1957), followed by the development of backpropagation algorithms in the 1980s (Rumelhart, Hinton, and Williams 1986).

-

The term “deep learning” emerged in the 2000s, marked by breakthroughs in computational power and data availability. Key milestones include the successful training of deep networks by Geoffrey Hinton, one of the god fathers of AI, and the resurgence of neural networks as a potent tool for data analysis and modeling.

+

The term “deep learning” emerged in the 2000s, marked by breakthroughs in computational power and data availability. Key milestones include the successful training of deep networks such as AlexNet (Krizhevsky, Sutskever, and Hinton 2012) by Geoffrey Hinton, one of the godfathers of AI, and his students, as well as the resurgence of neural networks as a potent tool for data analysis and modeling.

In recent years, deep learning has witnessed exponential growth, becoming a transformative force across various industries. Figure 3.1 shows that we are currently in the third era of deep learning. From 1952 to 2010, computational growth followed an 18-month doubling pattern. This dramatically accelerated to a 6-month doubling cycle from 2010 to 2022. At the same time, a distinct class of large-scale models emerged between 2015 and 2022; these were trained with 2 to 3 orders of magnitude more compute than the contemporaneous trend and followed a roughly 10-month doubling cycle.
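To make these doubling rates concrete, here is a small illustrative calculation (not part of the original text; the ten-year horizon is an assumed example value) showing how much total compute grows under each of the doubling periods quoted above.

```python
# Illustrative sketch: growth implied by a given compute-doubling period.
# The doubling periods come from the paragraph above; the 10-year horizon
# is an assumed value chosen only for intuition.

def growth_factor(years: float, doubling_months: float) -> float:
    """Multiplicative growth over `years` for a given doubling time in months."""
    return 2.0 ** (years * 12.0 / doubling_months)

eras = {
    "pre-2010 era (18-month doubling)": 18,
    "deep learning era (6-month doubling)": 6,
    "large-scale models (10-month doubling)": 10,
}

for label, months in eras.items():
    print(f"{label}: ~{growth_factor(10, months):,.0f}x over 10 years")
```

Under these assumptions, an 18-month doubling time yields roughly a 100x increase over a decade, while a 6-month doubling time yields about a million-fold increase, which is why the shift between eras matters so much.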

diff --git a/references.html b/references.html index e457ed1b..d1d2f4f8 100644 --- a/references.html +++ b/references.html @@ -389,11 +389,39 @@

References

+
+Abadi, Martín, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, et al. 2016. “TensorFlow: A System for Large-Scale Machine Learning.” In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), 265–83.
ARM.com. “The Future Is Being Built on Arm: Market Diversification Continues to Drive Strong Royalty and Licensing Growth as Ecosystem Reaches Quarter of a Trillion Chips Milestone – Arm®.” https://www.arm.com/company/news/2023/02/arm-announces-q3-fy22-results.
+
+Bank, Dor, Noam Koenigstein, and Raja Giryes. 2023. “Autoencoders.” Machine Learning for Data Science Handbook: Data Mining and Knowledge Discovery Handbook, 353–74.
+
+Chen, Tianqi, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan, et al. 2018. “TVM: An Automated End-to-End Optimizing Compiler for Deep Learning.” In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), 578–94.
+
+Chollet, François. 2015. “Keras.” GitHub Repository. https://github.com/fchollet/keras; GitHub.
+
+Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2020. “Generative Adversarial Networks.” Communications of the ACM 63 (11): 139–44.
Jouppi, Norman P, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates, et al. 2017. “In-Datacenter Performance Analysis of a Tensor Processing Unit.” In Proceedings of the 44th Annual International Symposium on Computer Architecture, 1–12.
@@ -401,6 +429,18 @@
+
+Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E Hinton. 2012. “ImageNet Classification with Deep Convolutional Neural Networks.” Advances in Neural Information Processing Systems 25.
+
+Paszke, Adam, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, et al. 2019. “PyTorch: An Imperative Style, High-Performance Deep Learning Library.” Advances in Neural Information Processing Systems 32.
Rosenblatt, Frank. 1957. The Perceptron, a Perceiving and Recognizing Automaton Project Para. Cornell Aeronautical Laboratory.
@@ -411,6 +451,12 @@
Rumelhart, David E, Geoffrey E Hinton, and Ronald J Williams. 1986. “Learning Representations by Back-Propagating Errors.” Nature 323 (6088): 533–36.
+
+Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. “Attention Is All You Need.” Advances in Neural Information Processing Systems 30.
diff --git a/search.json b/search.json index fa3933d4..30d16472 100644 --- a/search.json +++ b/search.json @@ -130,28 +130,28 @@ "href": "dl_primer.html#overview", "title": "3  Deep Learning Primer", "section": "3.1 Overview", - "text": "3.1 Overview\n\n3.1.1 Definition and Importance\nDeep learning, a subset of machine learning and artificial intelligence (AI), involves algorithms inspired by the structure and function of the human brain, called artificial neural networks. It stands as a cornerstone in the field of AI, spearheading advancements in various domains including computer vision, natural language processing, and autonomous vehicles. Its relevance in embedded AI systems is underscored by its ability to facilitate complex computations and predictions, leveraging the limited resources available in embedded environments.\n\n\n\n3.1.2 Brief History of Deep Learning\nThe concept of deep learning has its roots in the early artificial neural networks. It has witnessed several waves of popularity, starting with the introduction of the Perceptron in the 1950s (Rosenblatt 1957), followed by the development of backpropagation algorithms in the 1980s (Rumelhart, Hinton, and Williams 1986).\nThe term “deep learning” emerged in the 2000s, marked by breakthroughs in computational power and data availability. Key milestones include the successful training of deep networks by Geoffrey Hinton, one of the god fathers of AI, and the resurgence of neural networks as a potent tool for data analysis and modeling.\nIn recent years, deep learning has witnessed exponential growth, becoming a transformative force across various industries. Figure 3.1 shows that we are currently in the third era of deep learning. From 1952 to 2010, computational growth followed an 18-month doubling pattern. This dramatically accelerated to a 6-month cycle from 2010 to 2022. At the same time, we witnessed the advent of major-scale models between 2015 and 2022; these appeared 2 to 3 orders of magnitude faster and followed a 10-month doubling cycle.\n\n\n\nFigure 3.1: Growth of deep learning models.\n\n\nA confluence of factors has fueled this surge, including advancements in computational power, the proliferation of big data, and improvements in algorithmic designs. Firstly, the expansion of computational capabilities, particularly the advent of Graphics Processing Units (GPUs) and Tensor Processing Units (TPUs) (Jouppi et al. 2017), has significantly accelerated the training and inference times of deep learning models. These hardware advancements have made it feasible to construct and train more complex, deeper networks than were possible in the earlier years.\nSecondly, the digital revolution has brought forth an abundance of “big” data, providing rich material for deep learning models to learn from and excel in tasks such as image and speech recognition, language translation, and game playing. The availability of large, labeled datasets has been instrumental in the refinement and successful deployment of deep learning applications in real-world scenarios.\nAdditionally, collaborations and open-source initiatives have fostered a vibrant community of researchers and practitioners, propelling rapid advancements in deep learning techniques. 
Innovations such as deep reinforcement learning, transfer learning, and generative adversarial networks have expanded the boundaries of what is achievable with deep learning, opening new avenues and opportunities in various fields including healthcare, finance, transportation, and entertainment.\nCompanies and organizations worldwide are recognizing the transformative potential of deep learning, investing heavily in research and development to harness its power in offering innovative solutions, optimizing operations, and creating new business opportunities. As deep learning continues its upward trajectory, it is poised to revolutionize how we interact with technology, making our lives more convenient, safe, and connected.\n\n\n3.1.3 Applications of Deep Learning\nDeep learning is widely used in many industries today. It is used in finance for things such as stock market prediction, risk assessment, and fraud detection. It is also used in marketing for things such as customer segmentation, personalization, and content optimization. In healthcare, machine learning is used for tasks such as diagnosis, treatment planning, and patient monitoring. It has had a transformational impact on our society.\nAn example of the transformative impact that machine learning has had on society is how it has saved money and lives. For example, as mentioned earlier, deep learning algorithms can make predictions about stocks, like predicting whether they will go up or down. These predictions guide investment strategies and improve financial decisions. Similarly, deep learning can also make medical predictions to improve patient diagnosis and save lives. The possibilities are endless and the benefits are clear. Machine learning is not only able to make predictions with greater accuracy than humans but it is also able to do so at a much faster pace.\nDeep learning has been applied to manufacturing to great effect. By using software to constantly learn from the vast amounts of data collected throughout the manufacturing process, companies are able to increase productivity while reducing wastage through improved efficiency. Companies are benefiting financially from these effects while customers are receiving better quality products at lower prices. Machine learning enables manufacturers to constantly improve their processes to create higher quality goods faster and more efficiently than ever before.\nDeep learning has also improved products that we use daily like Netflix recommendations or Google Translate’s text translations, but it also allows companies such as Amazon and Uber to save money on customer service costs by quickly identifying unhappy customers.\n\n\n3.1.4 Relevance to Embedded AI\nEmbedded AI, which involves integrating AI algorithms directly into hardware devices, naturally benefits from the capabilities of deep learning. The synergy of deep learning algorithms with embedded systems has paved the way for intelligent, autonomous devices capable of sophisticated on-device data processing and analysis. Deep learning facilitates the extraction of intricate patterns and information from input data, making it a vital tool in the development of smart embedded systems, ranging from household appliances to industrial machines. This union aims to foster a new era of smart, interconnected devices that can learn and adapt to user behaviors and environmental conditions, optimizing performance and offering unprecedented levels of convenience and efficiency." 
+ "text": "3.1 Overview\n\n3.1.1 Definition and Importance\nDeep learning, a subset of machine learning and artificial intelligence (AI), involves algorithms inspired by the structure and function of the human brain, called artificial neural networks. It stands as a cornerstone in the field of AI, spearheading advancements in various domains including computer vision, natural language processing, and autonomous vehicles. Its relevance in embedded AI systems is underscored by its ability to facilitate complex computations and predictions, leveraging the limited resources available in embedded environments.\n\n\n\n3.1.2 Brief History of Deep Learning\nThe concept of deep learning has its roots in the early artificial neural networks. It has witnessed several waves of popularity, starting with the introduction of the Perceptron in the 1950s (Rosenblatt 1957), followed by the development of backpropagation algorithms in the 1980s (Rumelhart, Hinton, and Williams 1986).\nThe term deep learning emerged in the 2000s, marked by breakthroughs in computational power and data availability. Key milestones include the successful training of deep networks such as AlexNet (Krizhevsky, Sutskever, and Hinton 2012) by Geoffrey Hinton, one of the god fathers of AI, and the resurgence of neural networks as a potent tool for data analysis and modeling.\nIn recent years, deep learning has witnessed exponential growth, becoming a transformative force across various industries. Figure 3.1 shows that we are currently in the third era of deep learning. From 1952 to 2010, computational growth followed an 18-month doubling pattern. This dramatically accelerated to a 6-month cycle from 2010 to 2022. At the same time, we witnessed the advent of major-scale models between 2015 and 2022; these appeared 2 to 3 orders of magnitude faster and followed a 10-month doubling cycle.\n\n\n\nFigure 3.1: Growth of deep learning models.\n\n\nA confluence of factors has fueled this surge, including advancements in computational power, the proliferation of big data, and improvements in algorithmic designs. Firstly, the expansion of computational capabilities, particularly the advent of Graphics Processing Units (GPUs) and Tensor Processing Units (TPUs) (Jouppi et al. 2017), has significantly accelerated the training and inference times of deep learning models. These hardware advancements have made it feasible to construct and train more complex, deeper networks than were possible in the earlier years.\nSecondly, the digital revolution has brought forth an abundance of “big” data, providing rich material for deep learning models to learn from and excel in tasks such as image and speech recognition, language translation, and game playing. The availability of large, labeled datasets has been instrumental in the refinement and successful deployment of deep learning applications in real-world scenarios.\nAdditionally, collaborations and open-source initiatives have fostered a vibrant community of researchers and practitioners, propelling rapid advancements in deep learning techniques. 
Innovations such as deep reinforcement learning, transfer learning, and generative adversarial networks have expanded the boundaries of what is achievable with deep learning, opening new avenues and opportunities in various fields including healthcare, finance, transportation, and entertainment.\nCompanies and organizations worldwide are recognizing the transformative potential of deep learning, investing heavily in research and development to harness its power in offering innovative solutions, optimizing operations, and creating new business opportunities. As deep learning continues its upward trajectory, it is poised to revolutionize how we interact with technology, making our lives more convenient, safe, and connected.\n\n\n3.1.3 Applications of Deep Learning\nDeep learning is widely used in many industries today. It is used in finance for things such as stock market prediction, risk assessment, and fraud detection. It is also used in marketing for things such as customer segmentation, personalization, and content optimization. In healthcare, machine learning is used for tasks such as diagnosis, treatment planning, and patient monitoring. It has had a transformational impact on our society.\nAn example of the transformative impact that machine learning has had on society is how it has saved money and lives. For example, as mentioned earlier, deep learning algorithms can make predictions about stocks, like predicting whether they will go up or down. These predictions guide investment strategies and improve financial decisions. Similarly, deep learning can also make medical predictions to improve patient diagnosis and save lives. The possibilities are endless and the benefits are clear. Machine learning is not only able to make predictions with greater accuracy than humans but it is also able to do so at a much faster pace.\nDeep learning has been applied to manufacturing to great effect. By using software to constantly learn from the vast amounts of data collected throughout the manufacturing process, companies are able to increase productivity while reducing wastage through improved efficiency. Companies are benefiting financially from these effects while customers are receiving better quality products at lower prices. Machine learning enables manufacturers to constantly improve their processes to create higher quality goods faster and more efficiently than ever before.\nDeep learning has also improved products that we use daily like Netflix recommendations or Google Translate’s text translations, but it also allows companies such as Amazon and Uber to save money on customer service costs by quickly identifying unhappy customers.\n\n\n3.1.4 Relevance to Embedded AI\nEmbedded AI, which involves integrating AI algorithms directly into hardware devices, naturally benefits from the capabilities of deep learning. The synergy of deep learning algorithms with embedded systems has paved the way for intelligent, autonomous devices capable of sophisticated on-device data processing and analysis. Deep learning facilitates the extraction of intricate patterns and information from input data, making it a vital tool in the development of smart embedded systems, ranging from household appliances to industrial machines. This union aims to foster a new era of smart, interconnected devices that can learn and adapt to user behaviors and environmental conditions, optimizing performance and offering unprecedented levels of convenience and efficiency." 
}, { "objectID": "dl_primer.html#neural-networks", "href": "dl_primer.html#neural-networks", "title": "3  Deep Learning Primer", "section": "3.2 Neural Networks", - "text": "3.2 Neural Networks\nDeep learning takes inspiration from the human brain’s neural networks to create patterns utilized in decision-making. This section explores the foundational concepts that comprise deep learning, offering insights into the underpinnings of more complex topics explored later in this primer.\nNeural networks form the basis of deep learning, drawing inspiration from the biological neural networks of the human brain to process and analyze data in a hierarchical manner. Below, we dissect the primary components and structures commonly found in neural networks.\n\n3.2.1 Perceptrons\nAt the foundation of neural networks is the perceptron, a basic unit or node that forms the basis of more complex structures. A perceptron receives various inputs, applies weights and a bias to these inputs, and then employs an activation function to produce an output as shown below in Figure 3.2.\n\n\n\nFigure 3.2: Perceptron\n\n\nInitially conceptualized in the 1950s, perceptrons paved the way for the development of more intricate neural networks, serving as a fundamental building block in the field of deep learning.\n\n\n3.2.2 Multi-layer Perceptrons\nMulti-layer perceptrons (MLPs) evolve from the single-layer perceptron model, incorporating multiple layers of nodes connected in a feedforward manner. These layers include an input layer to receive data, several hidden layers to process this data, and an output layer to generate the final results. MLPs excel in identifying non-linear relationships, utilizing a backpropagation technique for training, wherein the weights are optimized through a gradient descent algorithm.\n\n\n\nMultilayer Perceptron\n\n\n\n\n3.2.3 Activation Functions\nActivation functions stand as vital components in neural networks, providing the mathematical equations that determine a network’s output. These functions introduce non-linearity to the network, facilitating the learning of complex patterns by allowing the network to adjust weights based on the error during the learning process. Popular activation functions encompass the sigmoid, tanh, and ReLU (Rectified Linear Unit) functions.\n\n\n\nActivation Function\n\n\n\n\n3.2.4 Computational Graphs\nDeep learning employs computational graphs to illustrate the various operations and their interactions within a neural network. This subsection explores the essential phases of computational graph processing.\n\n\n\nTensorFlow Computational Graph\n\n\n\n3.2.4.1 Forward Pass\nThe forward pass denotes the initial phase where data progresses through the network from the input to the output layer. During this phase, each layer conducts specific computations on the input data, utilizing weights and biases before passing the resulting values onto subsequent layers. The ultimate output of this phase is employed to compute the loss, representing the disparity between the predicted output and actual target values.\n\n\n3.2.4.2 Backward Pass (Backpropagation)\nBackpropagation signifies a pivotal algorithm in the training of deep neural networks. This phase involves computing the gradient of the loss function with respect to each weight using the chain rule, effectively maneuvering backwards through the network. 
The gradients calculated in this step guide the adjustment of weights with the objective of minimizing the loss function, thereby enhancing the network’s performance with each iteration of training.\nGrasping these foundational concepts paves the way to understanding more intricate deep learning architectures and techniques, fostering the development of more sophisticated and efficacious applications, especially within the realm of embedded AI systems.\n\n\n\n\n\n3.2.5 Training Concepts\nIn the realm of deep learning, it’s crucial to comprehend various key concepts and terms that set the foundation for creating, training, and optimizing deep neural networks. This section clarifies these essential concepts, providing a straightforward path to delve deeper into the intricate dynamics of deep learning. Overall, ML training is an iterative process. An untrained neural network model takes some features as input and makes a forward prediction pass. Given some ground truth about the prediction, which is known during the training process, we can compute a loss using a loss function and update the neural network parameters during the backward pass. We repeat this process until the network converges towards correct predictions with satisfactory accuracy.\n\n\n\nAn iterative approach to training a model.\n\n\n\n3.2.5.1 Loss Functions\nLoss functions, also known as cost functions, quantify how well a neural network is performing by calculating the difference between the actual and predicted outputs. The objective during the training process is to minimize this loss function to improve the model’s accuracy. As Figure 3.3 shows, models can either have high loss or low loss depending on where in the training phase the network is in.\n\n\n\nFigure 3.3: High loss in the left model; low loss in the right model.\n\n\nVarious loss functions are employed depending on the specific task, such as mean squared error, log loss and cross-entropy loss for regression tasks and categorical crossentropy for classification tasks.\n\n\n3.2.5.2 Optimization Algorithms\nOptimization algorithms play a crucial role in the training process, aiming to minimize the loss function by adjusting the model’s weights. These algorithms navigate through the model’s parameter space to find the optimal set of parameters that yield the minimum loss. Some commonly used optimization algorithms are:\n\nGradient Descent: A first-order optimization algorithm that uses the gradient of the loss function to move the weights in the direction that minimizes the loss.\nStochastic Gradient Descent (SGD): A variant of gradient descent that updates the weights using a subset of the data, thus accelerating the training process.\nAdam: A popular optimization algorithm that combines the benefits of other extensions of gradient descent, often providing faster convergence.\n\n\n\n\nMinimizing loss during the training process.\n\n\n\n\n3.2.5.3 Regularization Techniques\nTo prevent overfitting and help the model generalize better to unseen data, regularization techniques are employed. 
These techniques penalize the complexity of the model, encouraging simpler models that can perform better on new data.\n\nCommon regularization techniques include:\n\nL1 and L2 Regularization: These techniques add a penalty term to the loss function, discouraging large weights and promoting simpler models.\nDropout: A technique where randomly selected neurons are ignored during training, forcing the network to learn more robust features.\nBatch Normalization: This technique normalizes the activations of the neurons in a given layer, improving the stability and performance of the network.\n\nUnderstanding these fundamental concepts and terms forms the backbone of deep learning, setting the stage for a more in-depth exploration into the intricacies of various deep learning architectures and their applications, particularly in embedded AI systems.\n\n\n\n3.2.6 Model Architectures\nDeep learning architectures refer to the various structured approaches that dictate how neurons and layers are organized and interact in neural networks. These architectures have evolved to address different problems and data types efficiently. This section provides an overview of some prominent deep learning architectures and their characteristics.\n\n3.2.6.1 Multi-Layer Perceptrons (MLPs)\nMLPs are fundamental deep learning architectures, consisting of three or more layers: an input layer, one or more hidden layers, and an output layer. These layers are fully connected, meaning every neuron in a layer is connected to every neuron in the preceding and succeeding layers. MLPs can model complex functions and find applications in a wide range of tasks, including regression, classification, and pattern recognition. Their ability to learn non-linear relationships through backpropagation makes them a versatile tool in the deep learning arsenal.\nIn embedded AI systems, MLPs can serve as compact models for simpler tasks, such as sensor data analysis or basic pattern recognition, where computational resources are constrained. Their capability to learn non-linear relationships with relatively less complexity makes them a viable option for embedded systems.\n\n\n3.2.6.2 Convolutional Neural Networks (CNNs)\nCNNs are primarily used in image and video recognition tasks. This architecture uses convolutional layers that apply a series of filters to the input data to identify various features such as edges, corners, and textures. A typical CNN also includes pooling layers that reduce the spatial dimensions of the data, and fully connected layers for classification. CNNs have proven highly effective in tasks like image recognition, object detection, and computer vision applications.\nIn the realm of embedded AI, CNNs are pivotal for image and video recognition applications, where real-time processing is often required. They can be optimized for embedded systems by employing techniques such as quantization and pruning to reduce memory usage and computational demands, enabling efficient object detection and facial recognition functionalities in devices with limited computational resources.\n\n\n3.2.6.3 Recurrent Neural Networks (RNNs)\nRNNs are suited for sequential data analysis, such as time series forecasting and natural language processing. In this architecture, connections between nodes form a directed graph along a temporal sequence, allowing information to be carried across sequences through hidden state vectors. 
Variations of RNNs include Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), which are designed to capture longer dependencies in sequence data.\nIn embedded systems, these networks can be implemented in voice recognition systems, predictive maintenance, or in IoT devices where sequential data patterns are prevalent. Optimizations specific to embedded platforms can help in managing their typically high computational and memory requirements.\n\n\n3.2.6.4 Generative Adversarial Networks (GANs)\nGANs consist of two networks, a generator and a discriminator, that are trained simultaneously through adversarial training. The generator produces data that tries to mimic the real data distribution, while the discriminator aims to distinguish between real and generated data. GANs are widely used in image generation, style transfer, and data augmentation.\nIn embedded contexts, GANs could be used for on-device data augmentation to enhance the training of models directly on the embedded device, facilitating continual learning and adaptation to new data without the need for cloud computing resources.\n\n\n3.2.6.5 Autoencoders\nAutoencoders are neural networks used for data compression and noise reduction. They are structured to encode input data into a lower-dimensional representation and then decode it back to the original form. Variations like Variational Autoencoders (VAEs) introduce probabilistic layers that allow for generative properties, finding applications in image generation and anomaly detection.\nImplementing autoencoders can assist in efficient data transmission and storage, enhancing the overall performance of embedded systems with limited computational and memory resources.\n\n\n3.2.6.6 Transformer Networks\nTransformer networks have emerged as a powerful architecture, especially in the field of natural language processing. These networks use self-attention mechanisms to weigh the influence of different input words on each output word, facilitating parallel computation and capturing complex patterns in data. Transformer networks have led to state-of-the-art results in tasks such as language translation, summarization, and text generation.\nThese networks can be optimized to perform language-related tasks directly on-device. For instance, transformers can be utilized in embedded systems for real-time translation services or voice-assisted interfaces, where latency and computational efficiency are critical factors. Techniques such as model distillation which we will discss later on can be employed to deploy these networks on embedded devices with constrained resources.\nEach of these architectures serves specific purposes and excel in different domains, offering a rich toolkit for tackling diverse problems in the realm of embedded AI systems. Understanding the nuances of these architectures is vital in designing effective and efficient deep learning models for various applications." + "text": "3.2 Neural Networks\nDeep learning takes inspiration from the human brain’s neural networks to create patterns utilized in decision-making. This section explores the foundational concepts that comprise deep learning, offering insights into the underpinnings of more complex topics explored later in this primer.\nNeural networks form the basis of deep learning, drawing inspiration from the biological neural networks of the human brain to process and analyze data in a hierarchical manner. 
Below, we dissect the primary components and structures commonly found in neural networks.\n\n3.2.1 Perceptrons\nAt the foundation of neural networks is the perceptron, a basic unit or node that forms the basis of more complex structures. A perceptron receives various inputs, applies weights and a bias to these inputs, and then employs an activation function to produce an output as shown below in Figure 3.2.\n\n\n\nFigure 3.2: Perceptron\n\n\nInitially conceptualized in the 1950s, perceptrons paved the way for the development of more intricate neural networks, serving as a fundamental building block in the field of deep learning.\n\n\n3.2.2 Multi-layer Perceptrons\nMulti-layer perceptrons (MLPs) evolve from the single-layer perceptron model, incorporating multiple layers of nodes connected in a feedforward manner. These layers include an input layer to receive data, several hidden layers to process this data, and an output layer to generate the final results. MLPs excel in identifying non-linear relationships, utilizing a backpropagation technique for training, wherein the weights are optimized through a gradient descent algorithm.\n\n\n\nMultilayer Perceptron\n\n\n\n\n3.2.3 Activation Functions\nActivation functions stand as vital components in neural networks, providing the mathematical equations that determine a network’s output. These functions introduce non-linearity to the network, facilitating the learning of complex patterns by allowing the network to adjust weights based on the error during the learning process. Popular activation functions encompass the sigmoid, tanh, and ReLU (Rectified Linear Unit) functions.\n\n\n\nActivation Function\n\n\n\n\n3.2.4 Computational Graphs\nDeep learning employs computational graphs to illustrate the various operations and their interactions within a neural network. This subsection explores the essential phases of computational graph processing.\n\n\n\nTensorFlow Computational Graph\n\n\n\n3.2.4.1 Forward Pass\nThe forward pass denotes the initial phase where data progresses through the network from the input to the output layer. During this phase, each layer conducts specific computations on the input data, utilizing weights and biases before passing the resulting values onto subsequent layers. The ultimate output of this phase is employed to compute the loss, representing the disparity between the predicted output and actual target values.\n\n\n3.2.4.2 Backward Pass (Backpropagation)\nBackpropagation signifies a pivotal algorithm in the training of deep neural networks. This phase involves computing the gradient of the loss function with respect to each weight using the chain rule, effectively maneuvering backwards through the network. The gradients calculated in this step guide the adjustment of weights with the objective of minimizing the loss function, thereby enhancing the network’s performance with each iteration of training.\nGrasping these foundational concepts paves the way to understanding more intricate deep learning architectures and techniques, fostering the development of more sophisticated and efficacious applications, especially within the realm of embedded AI systems.\n\n\n\n\n\n3.2.5 Training Concepts\nIn the realm of deep learning, it’s crucial to comprehend various key concepts and terms that set the foundation for creating, training, and optimizing deep neural networks. This section clarifies these essential concepts, providing a straightforward path to delve deeper into the intricate dynamics of deep learning. 
Overall, ML training is an iterative process. An untrained neural network model takes some features as input and makes a forward prediction pass. Given some ground truth about the prediction, which is known during the training process, we can compute a loss using a loss function and update the neural network parameters during the backward pass. We repeat this process until the network converges towards correct predictions with satisfactory accuracy.\n\n\n\nAn iterative approach to training a model.\n\n\n\n3.2.5.1 Loss Functions\nLoss functions, also known as cost functions, quantify how well a neural network is performing by calculating the difference between the actual and predicted outputs. The objective during the training process is to minimize this loss function to improve the model’s accuracy. As Figure 3.3 shows, models can either have high loss or low loss depending on where in the training phase the network is in.\n\n\n\nFigure 3.3: High loss in the left model; low loss in the right model.\n\n\nVarious loss functions are employed depending on the specific task, such as mean squared error, log loss and cross-entropy loss for regression tasks and categorical crossentropy for classification tasks.\n\n\n3.2.5.2 Optimization Algorithms\nOptimization algorithms play a crucial role in the training process, aiming to minimize the loss function by adjusting the model’s weights. These algorithms navigate through the model’s parameter space to find the optimal set of parameters that yield the minimum loss. Some commonly used optimization algorithms are:\n\nGradient Descent: A first-order optimization algorithm that uses the gradient of the loss function to move the weights in the direction that minimizes the loss.\nStochastic Gradient Descent (SGD): A variant of gradient descent that updates the weights using a subset of the data, thus accelerating the training process.\nAdam: A popular optimization algorithm that combines the benefits of other extensions of gradient descent, often providing faster convergence.\n\n\n\n\nMinimizing loss during the training process.\n\n\n\n\n3.2.5.3 Regularization Techniques\nTo prevent overfitting and help the model generalize better to unseen data, regularization techniques are employed. These techniques penalize the complexity of the model, encouraging simpler models that can perform better on new data.\n\nCommon regularization techniques include:\n\nL1 and L2 Regularization: These techniques add a penalty term to the loss function, discouraging large weights and promoting simpler models.\nDropout: A technique where randomly selected neurons are ignored during training, forcing the network to learn more robust features.\nBatch Normalization: This technique normalizes the activations of the neurons in a given layer, improving the stability and performance of the network.\n\nUnderstanding these fundamental concepts and terms forms the backbone of deep learning, setting the stage for a more in-depth exploration into the intricacies of various deep learning architectures and their applications, particularly in embedded AI systems.\n\n\n\n3.2.6 Model Architectures\nDeep learning architectures refer to the various structured approaches that dictate how neurons and layers are organized and interact in neural networks. These architectures have evolved to address different problems and data types efficiently. 
This section provides an overview of some prominent deep learning architectures and their characteristics.\n\n3.2.6.1 Multi-Layer Perceptrons (MLPs)\nMLPs are fundamental deep learning architectures, consisting of three or more layers: an input layer, one or more hidden layers, and an output layer. These layers are fully connected, meaning every neuron in a layer is connected to every neuron in the preceding and succeeding layers. MLPs can model complex functions and find applications in a wide range of tasks, including regression, classification, and pattern recognition. Their ability to learn non-linear relationships through backpropagation makes them a versatile tool in the deep learning arsenal.\nIn embedded AI systems, MLPs can serve as compact models for simpler tasks, such as sensor data analysis or basic pattern recognition, where computational resources are constrained. Their capability to learn non-linear relationships with relatively less complexity makes them a viable option for embedded systems.\n\n\n3.2.6.2 Convolutional Neural Networks (CNNs)\nCNNs are primarily used in image and video recognition tasks. This architecture uses convolutional layers that apply a series of filters to the input data to identify various features such as edges, corners, and textures. A typical CNN also includes pooling layers that reduce the spatial dimensions of the data, and fully connected layers for classification. CNNs have proven highly effective in tasks like image recognition, object detection, and computer vision applications.\nIn the realm of embedded AI, CNNs are pivotal for image and video recognition applications, where real-time processing is often required. They can be optimized for embedded systems by employing techniques such as quantization and pruning to reduce memory usage and computational demands, enabling efficient object detection and facial recognition functionalities in devices with limited computational resources.\n\n\n3.2.6.3 Recurrent Neural Networks (RNNs)\nRNNs are suited for sequential data analysis, such as time series forecasting and natural language processing. In this architecture, connections between nodes form a directed graph along a temporal sequence, allowing information to be carried across sequences through hidden state vectors. Variations of RNNs include Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), which are designed to capture longer dependencies in sequence data.\nIn embedded systems, these networks can be implemented in voice recognition systems, predictive maintenance, or in IoT devices where sequential data patterns are prevalent. Optimizations specific to embedded platforms can help in managing their typically high computational and memory requirements.\n\n\n3.2.6.4 Generative Adversarial Networks (GANs)\nGANs consist of two networks, a generator and a discriminator, that are trained simultaneously through adversarial training (Goodfellow et al. 2020). The generator produces data that tries to mimic the real data distribution, while the discriminator aims to distinguish between real and generated data. 
GANs are widely used in image generation, style transfer, and data augmentation.\nIn embedded contexts, GANs could be used for on-device data augmentation to enhance the training of models directly on the embedded device, facilitating continual learning and adaptation to new data without the need for cloud computing resources.\n\n\n3.2.6.5 Autoencoders\nAutoencoders are neural networks used for data compression and noise reduction (Bank, Koenigstein, and Giryes 2023). They are structured to encode input data into a lower-dimensional representation and then decode it back to the original form. Variations like Variational Autoencoders (VAEs) introduce probabilistic layers that allow for generative properties, finding applications in image generation and anomaly detection.\nImplementing autoencoders can assist in efficient data transmission and storage, enhancing the overall performance of embedded systems with limited computational and memory resources.\n\n\n3.2.6.6 Transformer Networks\nTransformer networks have emerged as a powerful architecture, especially in the field of natural language processing (Vaswani et al. 2017). These networks use self-attention mechanisms to weigh the influence of different input words on each output word, facilitating parallel computation and capturing complex patterns in data. Transformer networks have led to state-of-the-art results in tasks such as language translation, summarization, and text generation.\nThese networks can be optimized to perform language-related tasks directly on-device. For instance, transformers can be utilized in embedded systems for real-time translation services or voice-assisted interfaces, where latency and computational efficiency are critical factors. Techniques such as model distillation which we will discss later on can be employed to deploy these networks on embedded devices with constrained resources.\nEach of these architectures serves specific purposes and excel in different domains, offering a rich toolkit for tackling diverse problems in the realm of embedded AI systems. Understanding the nuances of these architectures is vital in designing effective and efficient deep learning models for various applications." }, { "objectID": "dl_primer.html#libraries-and-frameworks", "href": "dl_primer.html#libraries-and-frameworks", "title": "3  Deep Learning Primer", "section": "3.3 Libraries and Frameworks", - "text": "3.3 Libraries and Frameworks\nIn the world of deep learning, the availability of robust libraries and frameworks has been a cornerstone in facilitating the development, training, and deployment of models, particularly in embedded AI systems where efficiency and optimization are key. These libraries and frameworks are often equipped with pre-defined functions and tools that allow for rapid prototyping and deployment. This section sheds light on popular libraries and frameworks, emphasizing their utility in embedded AI scenarios.\n\n3.3.1 TensorFlow\nTensorFlow, developed by Google, stands as one of the premier frameworks for developing deep learning models. Its ability to work seamlessly with embedded systems comes from TensorFlow Lite, a lightweight solution designed to run on mobile and embedded devices. TensorFlow Lite enables the execution of optimized models on a variety of platforms, making it easier to integrate AI functionalities in embedded systems. 
For TinyML we will be dealing with TensorFlow Lite for Microcontrollers.\n\n\n3.3.2 PyTorch\nPyTorch, an open-source library developed by Facebook, is praised for its dynamic computation graph and ease of use. For embedded AI, PyTorch can be a suitable choice for research and prototyping, offering a seamless transition from research to production with the use of the TorchScript scripting language. PyTorch Mobile further facilitates the deployment of models on mobile and embedded devices, offering tools and workflows to optimize performance.\n\n\n3.3.3 ONNX Runtime\nThe Open Neural Network Exchange (ONNX) Runtime is a cross-platform, high-performance engine for running machine learning models. It is not particularly developed for embedded AI systems, though it supports a wide range of hardware accelerators and is capable of optimizing computations to improve performance in resource-constrained environments.\n\n\n3.3.4 Keras\nKeras serves as a high-level neural networks API, capable of running on top of TensorFlow, and other frameworks like Theano, or CNTK. For developers venturing into embedded AI, Keras offers a simplified interface for building and training models. Its ease of use and modularity can be especially beneficial in the rapid development and deployment of models in embedded systems, facilitating the integration of AI capabilities with minimal complexity.\n\n\n3.3.5 TVM\nTVM is an open-source machine learning compiler stack that aims to enable efficient deployment of deep learning models on a variety of platforms. Particularly in embedded AI, TVM and µTVM (Micro TVM) can be crucial in optimizing and streamlining models to suit the restricted computational and memory resources, thus making deep learning more accessible and feasible on embedded devices.\nThese libraries and frameworks are pivotal in leveraging the capabilities of deep learning in embedded AI systems, offering a range of tools and functionalities that enable the development of intelligent and optimized solutions. Selecting the appropriate library or framework, however, is a crucial step in the development pipeline, aligning with the specific requirements and constraints of embedded systems." + "text": "3.3 Libraries and Frameworks\nIn the world of deep learning, the availability of robust libraries and frameworks has been a cornerstone in facilitating the development, training, and deployment of models, particularly in embedded AI systems where efficiency and optimization are key. These libraries and frameworks are often equipped with pre-defined functions and tools that allow for rapid prototyping and deployment. This section sheds light on popular libraries and frameworks, emphasizing their utility in embedded AI scenarios.\n\n3.3.1 TensorFlow\nTensorFlow, developed by Google (Abadi et al. 2016), stands as one of the premier frameworks for developing deep learning models. Its ability to work seamlessly with embedded systems comes from TensorFlow Lite, a lightweight solution designed to run on mobile and embedded devices. TensorFlow Lite enables the execution of optimized models on a variety of platforms, making it easier to integrate AI functionalities in embedded systems. For TinyML we will be dealing with TensorFlow Lite for Microcontrollers.\n\n\n3.3.2 PyTorch\nPyTorch, an open-source library developed by Facebook (Paszke et al. 2019), is praised for its dynamic computation graph and ease of use. 
For embedded AI, PyTorch can be a suitable choice for research and prototyping, offering a seamless transition from research to production with the use of the TorchScript scripting language. PyTorch Mobile further facilitates the deployment of models on mobile and embedded devices, offering tools and workflows to optimize performance.\n\n\n3.3.3 ONNX Runtime\nThe Open Neural Network Exchange (ONNX) Runtime is a cross-platform, high-performance engine for running machine learning models. It is not particularly developed for embedded AI systems, though it supports a wide range of hardware accelerators and is capable of optimizing computations to improve performance in resource-constrained environments.\n\n\n3.3.4 Keras\nKeras (Chollet 2015) serves as a high-level neural networks API, capable of running on top of TensorFlow, and other frameworks like Theano, or CNTK. For developers venturing into embedded AI, Keras offers a simplified interface for building and training models. Its ease of use and modularity can be especially beneficial in the rapid development and deployment of models in embedded systems, facilitating the integration of AI capabilities with minimal complexity.\n\n\n3.3.5 TVM\nTVM is an open-source machine learning compiler stack that aims to enable efficient deployment of deep learning models on a variety of platforms (Chen et al. 2018). Particularly in embedded AI, TVM and µTVM (Micro TVM) can be crucial in optimizing and streamlining models to suit the restricted computational and memory resources, thus making deep learning more accessible and feasible on embedded devices.\nThese libraries and frameworks are pivotal in leveraging the capabilities of deep learning in embedded AI systems, offering a range of tools and functionalities that enable the development of intelligent and optimized solutions. Selecting the appropriate library or framework, however, is a crucial step in the development pipeline, aligning with the specific requirements and constraints of embedded systems." }, { "objectID": "dl_primer.html#embedded-ai-challenges", "href": "dl_primer.html#embedded-ai-challenges", "title": "3  Deep Learning Primer", "section": "3.4 Embedded AI Challenges", - "text": "3.4 Embedded AI Challenges\nEmbedded AI systems often operate within environments with constrained resources, posing unique challenges in implementing the deep learning algorithms we discussed above efficiently. 
In this section, we explore various challenges encountered in the deployment of deep learning in embedded systems and potential solutions to navigate these complexities.\n\n3.4.1 Memory Constraints\n\nChallenge: Embedded systems usually have limited memory, which can be a bottleneck when deploying large deep learning models.\nSolution: Employing model compression techniques such as pruning and quantization to reduce the memory footprint without significantly affecting performance.\n\n\n\n3.4.2 Computational Limitations\n\nChallenge: The computational capacity in embedded systems can be limited, hindering the deployment of complex deep learning models.\nSolution: Utilizing hardware acceleration through GPUs or dedicated AI chips to boost computational power, and optimizing models for inference through techniques like layer fusion.\n\n\n\n3.4.3 Energy Efficiency\n\nChallenge: Embedded systems, particularly battery-powered devices, require energy-efficient operations to prolong battery life.\nSolution: Implementing energy-efficient neural networks that are designed to minimize energy consumption during operation, and employing dynamic voltage and frequency scaling to adjust the power consumption dynamically.\n\n\n\n3.4.4 Data Privacy and Security\n\nChallenge: Embedded AI systems often process sensitive data, raising concerns regarding data privacy and security.\nSolution: Employing on-device processing to keep sensitive data on the device itself, and incorporating encryption and secure channels for any necessary data transmission.\n\n\n\n3.4.5 Real-Time Processing Requirements\n\nChallenge: Many embedded AI applications demand real-time processing to provide instantaneous responses, which can be challenging to achieve with deep learning models.\nSolution: Streamlining the model through methods such as model distillation to reduce complexity and employing real-time operating systems to ensure timely processing.\n\n\n\n3.4.6 Model Robustness and Generalization\n\nChallenge: Ensuring that deep learning models are robust and capable of generalizing well to unseen data in embedded AI settings.\nSolution: Incorporating techniques like data augmentation and adversarial training to enhance model robustness and improve generalization capabilities.\n\n\n\n3.4.7 Integration with Existing Systems\n\nChallenge: Integrating deep learning capabilities into existing embedded systems can pose compatibility and interoperability issues.\nSolution: Adopting modular design approaches and leveraging APIs and middleware solutions to facilitate smooth integration with existing systems and infrastructures.\n\n\n\n3.4.8 Scalability\n\nChallenge: Scaling deep learning solutions to cater to a growing number of devices and users in embedded AI ecosystems.\nSolution: Utilizing cloud-edge computing paradigms to distribute computational loads effectively and ensuring that the models can be updated seamlessly to adapt to changing requirements.\n\nUnderstanding and addressing these challenges are vital in the successful deployment of deep learning solutions in embedded AI systems. By adopting appropriate strategies and solutions, developers can navigate these hurdles effectively, fostering the creation of reliable, efficient, and intelligent embedded AI systems.\n\n\n\n\nJouppi, Norman P, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates, et al. 2017. 
“In-Datacenter Performance Analysis of a Tensor Processing Unit.” In Proceedings of the 44th Annual International Symposium on Computer Architecture, 1–12.\n\n\nRosenblatt, Frank. 1957. The Perceptron, a Perceiving and Recognizing Automaton Project Para. Cornell Aeronautical Laboratory.\n\n\nRumelhart, David E, Geoffrey E Hinton, and Ronald J Williams. 1986. “Learning Representations by Back-Propagating Errors.” Nature 323 (6088): 533–36." + "text": "3.4 Embedded AI Challenges\nEmbedded AI systems often operate within environments with constrained resources, posing unique challenges in implementing the deep learning algorithms we discussed above efficiently. In this section, we explore various challenges encountered in the deployment of deep learning in embedded systems and potential solutions to navigate these complexities.\n\n3.4.1 Memory Constraints\n\nChallenge: Embedded systems usually have limited memory, which can be a bottleneck when deploying large deep learning models.\nSolution: Employing model compression techniques such as pruning and quantization to reduce the memory footprint without significantly affecting performance.\n\n\n\n3.4.2 Computational Limitations\n\nChallenge: The computational capacity in embedded systems can be limited, hindering the deployment of complex deep learning models.\nSolution: Utilizing hardware acceleration through GPUs or dedicated AI chips to boost computational power, and optimizing models for inference through techniques like layer fusion.\n\n\n\n3.4.3 Energy Efficiency\n\nChallenge: Embedded systems, particularly battery-powered devices, require energy-efficient operations to prolong battery life.\nSolution: Implementing energy-efficient neural networks that are designed to minimize energy consumption during operation, and employing dynamic voltage and frequency scaling to adjust the power consumption dynamically.\n\n\n\n3.4.4 Data Privacy and Security\n\nChallenge: Embedded AI systems often process sensitive data, raising concerns regarding data privacy and security.\nSolution: Employing on-device processing to keep sensitive data on the device itself, and incorporating encryption and secure channels for any necessary data transmission.\n\n\n\n3.4.5 Real-Time Processing Requirements\n\nChallenge: Many embedded AI applications demand real-time processing to provide instantaneous responses, which can be challenging to achieve with deep learning models.\nSolution: Streamlining the model through methods such as model distillation to reduce complexity and employing real-time operating systems to ensure timely processing.\n\n\n\n3.4.6 Model Robustness and Generalization\n\nChallenge: Ensuring that deep learning models are robust and capable of generalizing well to unseen data in embedded AI settings.\nSolution: Incorporating techniques like data augmentation and adversarial training to enhance model robustness and improve generalization capabilities.\n\n\n\n3.4.7 Integration with Existing Systems\n\nChallenge: Integrating deep learning capabilities into existing embedded systems can pose compatibility and interoperability issues.\nSolution: Adopting modular design approaches and leveraging APIs and middleware solutions to facilitate smooth integration with existing systems and infrastructures.\n\n\n\n3.4.8 Scalability\n\nChallenge: Scaling deep learning solutions to cater to a growing number of devices and users in embedded AI ecosystems.\nSolution: Utilizing cloud-edge computing paradigms to distribute computational loads effectively and ensuring 
that the models can be updated seamlessly to adapt to changing requirements.\n\nUnderstanding and addressing these challenges are vital in the successful deployment of deep learning solutions in embedded AI systems. By adopting appropriate strategies and solutions, developers can navigate these hurdles effectively, fostering the creation of reliable, efficient, and intelligent embedded AI systems.\n\n\n\n\nAbadi, Martı́n, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, et al. 2016. “\\(\\{\\)TensorFlow\\(\\}\\): A System for \\(\\{\\)Large-Scale\\(\\}\\) Machine Learning.” In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), 265–83.\n\n\nBank, Dor, Noam Koenigstein, and Raja Giryes. 2023. “Autoencoders.” Machine Learning for Data Science Handbook: Data Mining and Knowledge Discovery Handbook, 353–74.\n\n\nChen, Tianqi, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan, et al. 2018. “\\(\\{\\)TVM\\(\\}\\): An Automated \\(\\{\\)End-to-End\\(\\}\\) Optimizing Compiler for Deep Learning.” In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), 578–94.\n\n\nChollet, François. 2015. “Keras.” GitHub Repository. https://github.com/fchollet/keras; GitHub.\n\n\nGoodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2020. “Generative Adversarial Networks.” Communications of the ACM 63 (11): 139–44.\n\n\nJouppi, Norman P, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates, et al. 2017. “In-Datacenter Performance Analysis of a Tensor Processing Unit.” In Proceedings of the 44th Annual International Symposium on Computer Architecture, 1–12.\n\n\nKrizhevsky, Alex, Ilya Sutskever, and Geoffrey E Hinton. 2012. “Imagenet Classification with Deep Convolutional Neural Networks.” Advances in Neural Information Processing Systems 25.\n\n\nPaszke, Adam, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, et al. 2019. “Pytorch: An Imperative Style, High-Performance Deep Learning Library.” Advances in Neural Information Processing Systems 32.\n\n\nRosenblatt, Frank. 1957. The Perceptron, a Perceiving and Recognizing Automaton Project Para. Cornell Aeronautical Laboratory.\n\n\nRumelhart, David E, Geoffrey E Hinton, and Ronald J Williams. 1986. “Learning Representations by Back-Propagating Errors.” Nature 323 (6088): 533–36.\n\n\nVaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. “Attention Is All You Need.” Advances in Neural Information Processing Systems 30." }, { "objectID": "embedded_ml.html#cloud-ml", @@ -431,7 +431,7 @@ "href": "references.html", "title": "References", "section": "", - "text": "ARM.com. “The Future Is Being Built on Arm: Market Diversification\nContinues to Drive Strong Royalty and Licensing Growth as Ecosystem\nReaches Quarter of a Trillion Chips Milestone – Arm®.” https://www.arm.com/company/news/2023/02/arm-announces-q3-fy22-results.\n\n\nJouppi, Norman P, Cliff Young, Nishant Patil, David Patterson, Gaurav\nAgrawal, Raminder Bajwa, Sarah Bates, et al. 2017. “In-Datacenter\nPerformance Analysis of a Tensor Processing Unit.” In\nProceedings of the 44th Annual International Symposium on Computer\nArchitecture, 1–12.\n\n\nRosenblatt, Frank. 1957. The Perceptron, a Perceiving and\nRecognizing Automaton Project Para. 
Cornell Aeronautical\nLaboratory.\n\n\nRumelhart, David E, Geoffrey E Hinton, and Ronald J Williams. 1986.\n“Learning Representations by Back-Propagating Errors.”\nNature 323 (6088): 533–36." + "text": "Abadi, Martı́n, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis,\nJeffrey Dean, Matthieu Devin, et al. 2016. “{TensorFlow}: A System for {Large-Scale} Machine Learning.” In 12th\nUSENIX Symposium on Operating Systems Design and Implementation (OSDI\n16), 265–83.\n\n\nARM.com. “The Future Is Being Built on Arm: Market Diversification\nContinues to Drive Strong Royalty and Licensing Growth as Ecosystem\nReaches Quarter of a Trillion Chips Milestone – Arm®.” https://www.arm.com/company/news/2023/02/arm-announces-q3-fy22-results.\n\n\nBank, Dor, Noam Koenigstein, and Raja Giryes. 2023.\n“Autoencoders.” Machine Learning for Data Science\nHandbook: Data Mining and Knowledge Discovery Handbook, 353–74.\n\n\nChen, Tianqi, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan,\nHaichen Shen, Meghan Cowan, et al. 2018. “{TVM}: An\nAutomated {End-to-End} Optimizing Compiler for Deep\nLearning.” In 13th USENIX Symposium on Operating Systems\nDesign and Implementation (OSDI 18), 578–94.\n\n\nChollet, François. 2015. “Keras.” GitHub\nRepository. https://github.com/fchollet/keras; GitHub.\n\n\nGoodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David\nWarde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2020.\n“Generative Adversarial Networks.” Communications of\nthe ACM 63 (11): 139–44.\n\n\nJouppi, Norman P, Cliff Young, Nishant Patil, David Patterson, Gaurav\nAgrawal, Raminder Bajwa, Sarah Bates, et al. 2017. “In-Datacenter\nPerformance Analysis of a Tensor Processing Unit.” In\nProceedings of the 44th Annual International Symposium on Computer\nArchitecture, 1–12.\n\n\nKrizhevsky, Alex, Ilya Sutskever, and Geoffrey E Hinton. 2012.\n“Imagenet Classification with Deep Convolutional Neural\nNetworks.” Advances in Neural Information Processing\nSystems 25.\n\n\nPaszke, Adam, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury,\nGregory Chanan, Trevor Killeen, et al. 2019. “Pytorch: An\nImperative Style, High-Performance Deep Learning Library.”\nAdvances in Neural Information Processing Systems 32.\n\n\nRosenblatt, Frank. 1957. The Perceptron, a Perceiving and\nRecognizing Automaton Project Para. Cornell Aeronautical\nLaboratory.\n\n\nRumelhart, David E, Geoffrey E Hinton, and Ronald J Williams. 1986.\n“Learning Representations by Back-Propagating Errors.”\nNature 323 (6088): 533–36.\n\n\nVaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion\nJones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017.\n“Attention Is All You Need.” Advances in Neural\nInformation Processing Systems 30." }, { "objectID": "resources.html#coding",