diff --git a/_quarto.yml b/_quarto.yml index e0df9502..3ca168cc 100644 --- a/_quarto.yml +++ b/_quarto.yml @@ -229,6 +229,8 @@ editor: format: html: lightbox: true + mermaid: + theme: default theme: light: - default diff --git a/contents/core/ml_systems/images/png/convergence.png b/contents/core/ml_systems/images/png/convergence.png new file mode 100644 index 00000000..5613e11a Binary files /dev/null and b/contents/core/ml_systems/images/png/convergence.png differ diff --git a/contents/core/ml_systems/ml_systems.qmd b/contents/core/ml_systems/ml_systems.qmd index c22af1df..1ef2cba1 100644 --- a/contents/core/ml_systems/ml_systems.qmd +++ b/contents/core/ml_systems/ml_systems.qmd @@ -10,19 +10,21 @@ Resources: [Slides](#sec-ml-systems-resource), [Videos](#sec-ml-systems-resource ![*DALL·E 3 Prompt: Illustration in a rectangular format depicting the merger of embedded systems with Embedded AI. The left half of the image portrays traditional embedded systems, including microcontrollers and processors, detailed and precise. The right half showcases the world of artificial intelligence, with abstract representations of machine learning models, neurons, and data flow. The two halves are distinctly separated, emphasizing the individual significance of embedded tech and AI, but they come together in harmony at the center.*](images/png/cover_ml_systems.png) -Machine learning (ML) systems, built on the foundation of computing systems, hold the potential to transform our world. These systems, with their specialized roles and real-time computational capabilities, represent a critical junction where data and computation meet on a micro-scale. They are specifically tailored to optimize performance, energy usage, and spatial efficiency—key factors essential for the successful implementation of ML systems. +The convergence of machine learning and computing systems has ushered in a new era of intelligent computing that extends from powerful cloud infrastructures to tiny embedded devices. Machine learning (ML) systems represent this intersection where algorithmic intelligence meets hardware constraints, creating solutions that must carefully balance computational power, energy efficiency, and real-world practicality. As ML continues to transform various sectors, understanding how to effectively deploy these systems across different computing platforms has become increasingly crucial. -As this chapter progresses, we will explore ML systems' complex and fascinating world. We'll gain insights into their structural design and operational features and understand their key role in powering ML applications. Starting with the basics of microcontroller units, we will examine the interfaces and peripherals that improve their functionalities. This chapter is designed to be a comprehensive guide that explains the nuanced aspects of different ML systems. +Modern ML systems span a remarkable spectrum of capabilities and constraints. At one end, cloud-based systems harness vast computational resources to train and deploy complex models. At the other end, tiny embedded systems bring ML capabilities to resource-constrained devices that operate on minimal power. Between these extremes lie edge and mobile computing solutions, each offering unique advantages for specific use cases. This diversity in deployment options presents both opportunities and challenges for system designers and ML practitioners. 
+
+This chapter explores this diverse landscape of ML systems, beginning with a comprehensive overview of different deployment paradigms. We'll examine how each approach addresses specific challenges and requirements, from processing power and memory constraints to energy efficiency and real-time performance. Through detailed comparisons and real-world examples, we'll develop a deep understanding of when and how to employ each type of ML system effectively.

:::{.callout-tip}

## Learning Objectives

-- Understand the key characteristics and differences between Cloud ML, Edge ML, Mobile ML, and TinyML systems. +- Understand the key characteristics and differences between Cloud ML, Edge ML, Mobile ML, and Tiny ML systems.

- Analyze the benefits and challenges associated with each ML paradigm.

-- Explore real-world applications and use cases for Cloud ML, Edge ML, Mobile ML, and TinyML. +- Explore real-world applications and use cases for Cloud ML, Edge ML, Mobile ML, and Tiny ML.

- Compare the performance aspects of each ML approach, including latency, privacy, and resource utilization.

@@ -33,7 +35,11 @@ As this chapter progresses, we will explore ML systems' complex and fascinating

ML is rapidly evolving, with new paradigms reshaping how models are developed, trained, and deployed. The field is experiencing significant innovation driven by advancements in hardware, software, and algorithmic techniques. These developments are enabling machine learning to be applied in diverse settings, from large-scale cloud infrastructures to edge devices and even tiny, resource-constrained environments.

-Modern machine learning systems span a spectrum of deployment options, each with its own set of characteristics and use cases. At one end, we have cloud-based ML, which leverages powerful centralized computing resources for complex, data-intensive tasks. Moving along the spectrum, we encounter edge ML, which brings computation closer to the data source for reduced latency and improved privacy. At the far end, we find TinyML, which enables machine learning on extremely low-power devices with severe memory and processing constraints. +Modern machine learning systems span a spectrum of deployment options, each with its own set of characteristics and use cases. At one end, we have cloud-based ML, which leverages powerful centralized computing resources for complex, data-intensive tasks. Moving along the spectrum, we encounter edge ML, which brings computation closer to the data source for reduced latency and improved privacy. Mobile ML further extends these capabilities to smartphones and tablets, while at the far end, we find Tiny ML, which enables machine learning on extremely low-power devices with severe memory and processing constraints. + +This chapter explores the landscape of contemporary machine learning systems, covering four key approaches: Cloud ML, Edge ML, Mobile ML, and Tiny ML. @fig-cloud-edge-tinyml-comparison illustrates the spectrum of distributed intelligence across these approaches, providing a visual comparison of their characteristics. We will examine the unique characteristics, advantages, and challenges of each approach, as depicted in the figure. Additionally, we will discuss the emerging trends and technologies that are shaping the future of machine learning deployment, considering how they might influence the balance between these four paradigms. + +![Cloud vs. Edge vs. Mobile vs. Tiny ML: The Spectrum of Distributed Intelligence. 
Source: ABI Research -- Tiny ML.](images/png/cloud-edge-tiny.png){#fig-cloud-edge-tinyml-comparison}

To better understand the dramatic differences between these ML deployment options, @tbl-representative-systems provides examples of representative hardware platforms for each category. These examples illustrate the vast range of computational resources, power requirements, and cost considerations across the ML systems spectrum. As we explore each paradigm in detail, you can refer back to these concrete examples to better understand the practical implications of each approach.

@@ -55,7 +61,7 @@ To better understand the dramatic differences between these ML deployment option

| Mobile ML | iPhone 15 Pro | A17 Pro (6-core CPU, 6-core GPU) | 8GB RAM | 128GB-1TB | 3-5W | $999+ | Face ID, computational | | | | | | | | | photography, voice recognition | +---------------+-----------------------+--------------------------------------+----------------+------------------+-----------+-------------+--------------------------------+ -| TinyML | Arduino Nano 33 | Arm Cortex-M4 @ 64MHz | 256KB RAM | 1MB Flash | 0.02-0.04W| $35 | Gesture recognition, | +| Tiny ML | Arduino Nano 33 | Arm Cortex-M4 @ 64MHz | 256KB RAM | 1MB Flash | 0.02-0.04W| $35 | Gesture recognition, | | | BLE Sense | | | | | | voice detection | +---------------+-----------------------+--------------------------------------+----------------+------------------+-----------+-------------+--------------------------------+ | | ESP32-CAM | Dual-core @ 240MHz | 520KB RAM | 4MB Flash | 0.05-0.25W| $10 | Image classification, | @@ -64,10 +70,6 @@ To better understand the dramatic differences between these ML deployment option : Representative hardware platforms across the ML systems spectrum, showing typical specifications and capabilities for each category. {#tbl-representative-systems .hover .striped}

-This chapter explores the landscape of contemporary machine learning systems, covering four key approaches: Cloud ML, Edge ML, and TinyML. @fig-cloud-edge-tinyml-comparison illustrates the spectrum of distributed intelligence across these approaches, providing a visual comparison of their characteristics. We will examine the unique characteristics, advantages, and challenges of each approach, as depicted in the figure. Additionally, we will discuss the emerging trends and technologies that are shaping the future of machine learning deployment, considering how they might influence the balance between these three paradigms. - -![Cloud vs. Edge vs. TinyML: The Spectrum of Distributed Intelligence. Source: ABI Research -- TinyML.](images/png/cloud-edge-tiny.png){#fig-cloud-edge-tinyml-comparison} - 

The evolution of machine learning systems can be seen as a progression from centralized to increasingly distributed and specialized computing paradigms:

**Cloud ML:** Initially, ML was predominantly cloud-based. Powerful, scalable servers in data centers are used to train and run large ML models. This approach leverages vast computational resources and storage capacities, enabling the development of complex models trained on massive datasets. Cloud ML excels at tasks requiring extensive processing power, distributed training of large models, and is ideal for applications where real-time responsiveness isn't critical. Popular platforms like AWS SageMaker, Google Cloud AI, and Azure ML offer flexible, scalable solutions for model development, training, and deployment. 
Cloud ML can handle models with billions of parameters, training on petabytes of data, but may incur latencies of 100-500ms for online inference due to network delays. @@ -76,18 +78,18 @@ The evolution of machine learning systems can be seen as a progression from cent **Mobile ML:** Building on edge computing concepts, Mobile ML focuses on leveraging the computational capabilities of smartphones and tablets. This approach enables personalized, responsive applications while reducing reliance on constant network connectivity. Mobile ML offers a balance between the power of edge computing and the ubiquity of personal devices. It utilizes on-device sensors (e.g., cameras, GPS, accelerometers) for unique ML applications. Frameworks like TensorFlow Lite and Core ML allow developers to deploy optimized models on mobile devices, with inference times often under 30ms for common tasks. Mobile ML enhances privacy by keeping personal data on the device and can operate offline, but must balance model performance with device resource constraints (typically 4-8GB RAM, 100-200GB storage). -**TinyML:** The latest development in this progression is TinyML, which enables ML models to run on extremely resource-constrained microcontrollers and small embedded systems. TinyML allows for on-device inference without relying on connectivity to the cloud, edge, or even the processing power of mobile devices. This approach is crucial for applications where size, power consumption, and cost are critical factors. TinyML devices typically operate with less than 1MB of RAM and flash memory, consuming only milliwatts of power, enabling battery life of months or years. Applications include wake word detection, gesture recognition, and predictive maintenance in industrial settings. Platforms like Arduino Nano 33 BLE Sense and STM32 microcontrollers, coupled with frameworks like TensorFlow Lite for Microcontrollers, enable ML on these tiny devices. However, TinyML requires significant model optimization and quantization to fit within these constraints. +**Tiny ML:** The latest development in this progression is Tiny ML, which enables ML models to run on extremely resource-constrained microcontrollers and small embedded systems. Tiny ML allows for on-device inference without relying on connectivity to the cloud, edge, or even the processing power of mobile devices. This approach is crucial for applications where size, power consumption, and cost are critical factors. Tiny ML devices typically operate with less than 1MB of RAM and flash memory, consuming only milliwatts of power, enabling battery life of months or years. Applications include wake word detection, gesture recognition, and predictive maintenance in industrial settings. Platforms like Arduino Nano 33 BLE Sense and STM32 microcontrollers, coupled with frameworks like TensorFlow Lite for Microcontrollers, enable ML on these tiny devices. However, Tiny ML requires significant model optimization and quantization to fit within these constraints. Each of these paradigms has its own strengths and is suited to different use cases: - Cloud ML remains essential for tasks requiring massive computational power or large-scale data analysis. - Edge ML is ideal for applications needing low-latency responses or local data processing in industrial or enterprise environments. - Mobile ML is suited for personalized, responsive applications on smartphones and tablets. -- TinyML enables AI capabilities in small, power-efficient devices, expanding the reach of ML to new domains. 
+- Tiny ML enables AI capabilities in small, power-efficient devices, expanding the reach of ML to new domains.

This progression reflects a broader trend in computing towards more distributed, localized, and specialized processing. The evolution is driven by the need for faster response times, improved privacy, reduced bandwidth usage, and the ability to operate in environments with limited or no connectivity, while also catering to the specific capabilities and constraints of different types of devices.

-@fig-vMLsizes illustrates the key differences between Cloud ML, Edge ML, Mobile ML, and TinyML in terms of hardware, latency, connectivity, power requirements, and model complexity. As we move from Cloud to Edge to TinyML, we see a dramatic reduction in available resources, which presents significant challenges for deploying sophisticated machine learning models. This resource disparity becomes particularly apparent when attempting to deploy deep learning models on microcontrollers, the primary hardware platform for TinyML. These tiny devices have severely constrained memory and storage capacities, which are often insufficient for conventional deep learning models. We will learn to put these things into perspective in this chapter. +@fig-vMLsizes illustrates the key differences between Cloud ML, Edge ML, Mobile ML, and Tiny ML in terms of hardware, latency, connectivity, power requirements, and model complexity. As we move from Cloud to Edge to Tiny ML, we see a dramatic reduction in available resources, which presents significant challenges for deploying sophisticated machine learning models. This resource disparity becomes particularly apparent when attempting to deploy deep learning models on microcontrollers, the primary hardware platform for Tiny ML. These tiny devices have severely constrained memory and storage capacities, which are often insufficient for conventional deep learning models. We will learn to put these things into perspective in this chapter.

![From cloud GPUs to microcontrollers: Navigating the memory and storage landscape across computing devices. Source: [@lin2023tiny]](./images/jpg/cloud_mobile_tiny_sizes.jpg){#fig-vMLsizes}

@@ -327,137 +329,151 @@ These applications demonstrate how Mobile ML bridges the gap between cloud-based

## Tiny ML

-TinyML sits at the crossroads of embedded systems and machine learning, representing a burgeoning field that brings smart algorithms directly to tiny microcontrollers and sensors. These microcontrollers operate under severe resource constraints, particularly regarding memory, storage, and computational power. @fig-tiny-ml encapsulates the key aspects of TinyML discussed in this section. +Tiny ML sits at the crossroads of embedded systems and machine learning, representing a burgeoning field that brings smart algorithms directly to tiny microcontrollers and sensors. These microcontrollers operate under severe resource constraints, particularly regarding memory, storage, and computational power. @fig-tiny-ml encapsulates the key aspects of Tiny ML discussed in this section.

![Section overview for Tiny ML.](images/png/tinyml.png){#fig-tiny-ml}

### Characteristics

#### On-Device Machine Learning

-In TinyML, the focus, much like in Mobile ML, is on on-device machine learning. This means that machine learning models are deployed and trained on the device, eliminating the need for external servers or cloud infrastructures. 
This allows TinyML to enable intelligent decision-making right where the data is generated, making real-time insights and actions possible, even in settings where connectivity is limited or unavailable. +In Tiny ML, the focus, much like in Mobile ML, is on on-device machine learning. This means that machine learning models are deployed and run directly on the device, eliminating the need for external servers or cloud infrastructures. This allows Tiny ML to enable intelligent decision-making right where the data is generated, making real-time insights and actions possible, even in settings where connectivity is limited or unavailable.

#### Low Power and Resource-Constrained Environments

-TinyML excels in low-power and resource-constrained settings. These environments require highly optimized solutions that function within the available resources. @fig-tinyml-example showcases an example TinyML device kit, illustrating the compact nature of these systems. These devices can typically fit in the palm of your hand or, in some cases, are even as small as a fingernail. TinyML meets the need for efficiency through specialized algorithms and models designed to deliver decent performance while consuming minimal energy, thus ensuring extended operational periods, even in battery-powered devices like those shown. +Tiny ML excels in low-power and resource-constrained settings. These environments require highly optimized solutions that function within the available resources. @fig-tinyml-example showcases an example Tiny ML device kit, illustrating the compact nature of these systems. These devices can typically fit in the palm of your hand or, in some cases, are even as small as a fingernail. Tiny ML meets the need for efficiency through specialized algorithms and models designed to deliver decent performance while consuming minimal energy, thus ensuring extended operational periods, even in battery-powered devices like those shown.

-![Examples of TinyML device kits. Source: [Widening Access to Applied Machine Learning with TinyML.](https://arxiv.org/pdf/2106.04008.pdf)](images/jpg/tiny_ml.jpg){#fig-tinyml-example} +![Examples of Tiny ML device kits. Source: [Widening Access to Applied Machine Learning with TinyML.](https://arxiv.org/pdf/2106.04008.pdf)](images/jpg/tiny_ml.jpg){#fig-tinyml-example}

::: {#exr-tinyml .callout-caution collapse="true"}

-### TinyML with Arduino +### Tiny ML with Arduino

-Get ready to bring machine learning to the smallest of devices! In the embedded machine learning world, TinyML is where resource constraints meet ingenuity. This Colab notebook will walk you through building a gesture recognition model designed on an Arduino board. You'll learn how to train a small but effective neural network, optimize it for minimal memory usage, and deploy it to your microcontroller. If you're excited about making everyday objects smarter, this is where it begins! +Get ready to bring machine learning to the smallest of devices! In the embedded machine learning world, Tiny ML is where resource constraints meet ingenuity. This Colab notebook will walk you through building a gesture recognition model designed on an Arduino board. You'll learn how to train a small but effective neural network, optimize it for minimal memory usage, and deploy it to your microcontroller. If you're excited about making everyday objects smarter, this is where it begins! 
[![](https://colab.research.google.com/assets/colab-badge.png)](https://colab.research.google.com/github/arduino/ArduinoTensorFlowLiteTutorials/blob/master/GestureToEmoji/arduino_tinyml_workshop.ipynb)

:::

### Benefits

#### Extremely Low Latency

-One of the standout benefits of TinyML is its ability to offer ultra-low latency. Since computation occurs directly on the device, the time required to send data to external servers and receive a response is eliminated. This is crucial in applications requiring immediate decision-making, enabling quick responses to changing conditions. +One of the standout benefits of Tiny ML is its ability to offer ultra-low latency. Since computation occurs directly on the device, the time required to send data to external servers and receive a response is eliminated. This is crucial in applications requiring immediate decision-making, enabling quick responses to changing conditions.

#### High Data Security

-TinyML inherently enhances data security. Because data processing and analysis happen on the device, the risk of data interception during transmission is virtually eliminated. This localized approach to data management ensures that sensitive information stays on the device, strengthening user data security. +Tiny ML inherently enhances data security. Because data processing and analysis happen on the device, the risk of data interception during transmission is virtually eliminated. This localized approach to data management ensures that sensitive information stays on the device, strengthening user data security.

#### Energy Efficiency

-TinyML operates within an energy-efficient framework, a necessity given its resource-constrained environments. By employing lean algorithms and optimized computational methods, TinyML ensures that devices can execute complex tasks without rapidly depleting battery life, making it a sustainable option for long-term deployments. +Tiny ML operates within an energy-efficient framework, a necessity given its resource-constrained environments. By employing lean algorithms and optimized computational methods, Tiny ML ensures that devices can execute complex tasks without rapidly depleting battery life, making it a sustainable option for long-term deployments.

### Challenges

#### Limited Computational Capabilities

-However, the shift to TinyML comes with its set of hurdles. The primary limitation is the devices' constrained computational capabilities. The need to operate within such limits means that deployed models must be simplified, which could affect the accuracy and sophistication of the solutions. +However, the shift to Tiny ML comes with its set of hurdles. The primary limitation is the devices' constrained computational capabilities. The need to operate within such limits means that deployed models must be simplified, which could affect the accuracy and sophistication of the solutions.

#### Complex Development Cycle

-TinyML also introduces a complicated development cycle. Crafting lightweight and effective models demands a deep understanding of machine learning principles and expertise in embedded systems. This complexity calls for a collaborative development approach, where multi-domain expertise is essential for success. +Tiny ML also introduces a complicated development cycle. 
Crafting lightweight and effective models demands a deep understanding of machine learning principles and expertise in embedded systems. This complexity calls for a collaborative development approach, where multi-domain expertise is essential for success. #### Model Optimization and Compression -A central challenge in TinyML is model optimization and compression. Creating machine learning models that can operate effectively within the limited memory and computational power of microcontrollers requires innovative approaches to model design. Developers often face the challenge of striking a delicate balance and optimizing models to maintain effectiveness while fitting within stringent resource constraints. +A central challenge in Tiny ML is model optimization and compression. Creating machine learning models that can operate effectively within the limited memory and computational power of microcontrollers requires innovative approaches to model design. Developers often face the challenge of striking a delicate balance and optimizing models to maintain effectiveness while fitting within stringent resource constraints. ### Example Use Cases #### Wearable Devices -In wearables, TinyML opens the door to smarter, more responsive gadgets. From fitness trackers offering real-time workout feedback to smart glasses processing visual data on the fly, TinyML transforms how we engage with wearable tech, delivering personalized experiences directly from the device. +In wearables, Tiny ML opens the door to smarter, more responsive gadgets. From fitness trackers offering real-time workout feedback to smart glasses processing visual data on the fly, Tiny ML transforms how we engage with wearable tech, delivering personalized experiences directly from the device. #### Predictive Maintenance -In industrial settings, TinyML plays a significant role in predictive maintenance. By deploying TinyML algorithms on sensors that monitor equipment health, companies can preemptively identify potential issues, reducing downtime and preventing costly breakdowns. On-site data analysis ensures quick responses, potentially stopping minor issues from becoming major problems. +In industrial settings, Tiny ML plays a significant role in predictive maintenance. By deploying Tiny ML algorithms on sensors that monitor equipment health, companies can preemptively identify potential issues, reducing downtime and preventing costly breakdowns. On-site data analysis ensures quick responses, potentially stopping minor issues from becoming major problems. #### Anomaly Detection -TinyML can be employed to create anomaly detection models that identify unusual data patterns. For instance, a smart factory could use TinyML to monitor industrial processes and spot anomalies, helping prevent accidents and improve product quality. Similarly, a security company could use TinyML to monitor network traffic for unusual patterns, aiding in detecting and preventing cyber-attacks. TinyML could monitor patient data for anomalies in healthcare, aiding early disease detection and better patient treatment. +Tiny ML can be employed to create anomaly detection models that identify unusual data patterns. For instance, a smart factory could use Tiny ML to monitor industrial processes and spot anomalies, helping prevent accidents and improve product quality. Similarly, a security company could use Tiny ML to monitor network traffic for unusual patterns, aiding in detecting and preventing cyber-attacks. 
Tiny ML could monitor patient data for anomalies in healthcare, aiding early disease detection and better patient treatment.

#### Environmental Monitoring

-In environmental monitoring, TinyML enables real-time data analysis from various field-deployed sensors. These could range from city air quality monitoring to wildlife tracking in protected areas. Through TinyML, data can be processed locally, allowing for quick responses to changing conditions and providing a nuanced understanding of environmental patterns, crucial for informed decision-making. +In environmental monitoring, Tiny ML enables real-time data analysis from various field-deployed sensors. These could range from city air quality monitoring to wildlife tracking in protected areas. Through Tiny ML, data can be processed locally, allowing for quick responses to changing conditions and providing a nuanced understanding of environmental patterns, crucial for informed decision-making.

-In summary, TinyML serves as a trailblazer in the evolution of machine learning, fostering innovation across various fields by bringing intelligence directly to the edge. Its potential to transform our interaction with technology and the world is immense, promising a future where devices are connected, intelligent, and capable of making real-time decisions and responses. +In summary, Tiny ML serves as a trailblazer in the evolution of machine learning, fostering innovation across various fields by bringing intelligence directly to the edge. Its potential to transform our interaction with technology and the world is immense, promising a future where devices are connected, intelligent, and capable of making real-time decisions and responses.

-## Hybrid ML +## Shared Principles

-While we've examined Cloud ML, Edge ML, Mobile ML, and TinyML as distinct approaches, the reality of modern ML deployments is more nuanced. Systems architects often combine these paradigms to create solutions that leverage the strengths of each approach while mitigating their individual limitations. Understanding how these systems can work together opens up new possibilities for building more efficient and effective ML applications. +After exploring the individual ML paradigms, a deeper pattern emerges in how these systems fundamentally operate. @fig-ml-systems-convergence illustrates how different ML implementations, while optimized for distinct contexts, actually converge around core system principles that unite them all.

-### Train-Serve Split +![Core principles converge across different ML system implementations, from cloud to tiny deployments, sharing common foundations in data pipelines, resource management, and system architecture.](./images/png/convergence.png){#fig-ml-systems-convergence}

-One of the most common hybrid patterns is the train-serve split, where model training occurs in the cloud but inference happens on edge, mobile, or tiny devices. This pattern takes advantage of the cloud's vast computational resources for the training phase while benefiting from the low latency and privacy advantages of on-device inference. For example, smart home devices often use models trained on large datasets in the cloud but run inference locally to ensure quick response times and protect user privacy. 
In practice, this might involve training models on powerful systems like the NVIDIA DGX A100, leveraging its 8 A100 GPUs and terabyte-scale memory, before deploying optimized versions to edge devices like the NVIDIA Jetson AGX Orin for efficient inference. Similarly, mobile vision models for computational photography are typically trained on powerful cloud infrastructure but deployed to run efficiently on phone hardware. +The figure shows three key layers that help us understand how ML systems relate to each other. At the top, we see the diverse implementations that we have explored throughout this chapter. Cloud ML operates in data centers, focusing on training at scale with vast computational resources. Edge ML emphasizes local processing with inference capabilities closer to data sources. Mobile ML leverages personal devices for user-centric applications. Tiny ML brings intelligence to highly constrained embedded systems and sensors. -### Hierarchical Processing +Despite their distinct characteristics, the arrows in the figure show how all these implementations connect to the same core system principles. This reflects an important reality in ML systems---while they may operate at dramatically different scales, from cloud systems processing petabytes to tiny devices handling kilobytes, they all must solve similar fundamental challenges in terms of: -Hierarchical processing creates a multi-tier system where data and intelligence flow between different levels of the ML stack. In industrial IoT applications, tiny sensors might perform basic anomaly detection, edge devices aggregate and analyze data from multiple sensors, and cloud systems handle complex analytics and model updates. For instance, we might see ESP32-CAM devices performing basic image classification at the sensor level with their minimal 520KB RAM, feeding data up to Jetson AGX Orin devices for more sophisticated computer vision tasks, and ultimately connecting to cloud infrastructure for complex analytics and model updates. +- Managing data pipelines from collection through processing to deployment +- Balancing resource utilization across compute, memory, energy, and network +- Implementing system architectures that effectively integrate models, hardware, and software -This hierarchy allows each tier to handle tasks appropriate to its capabilities---TinyML devices handle immediate, simple decisions; edge devices manage local coordination; and cloud systems tackle complex analytics and learning tasks. Smart city installations often use this pattern, with street-level sensors feeding data to neighborhood-level edge processors, which in turn connect to city-wide cloud analytics. +These core principles then lead to shared system considerations around optimization, operations, and trustworthiness. This progression helps explain why techniques developed for one scale of ML system often transfer effectively to others. The underlying problems---efficiently processing data, managing resources, and ensuring reliable operation---remain consistent even as the specific solutions vary based on scale and context. -### Federated Learning +Understanding this convergence becomes particularly valuable as we move towards hybrid ML systems. When we recognize that different ML implementations share fundamental principles, combining them effectively becomes more intuitive. 
We can better appreciate why, for example, a cloud-trained model can be effectively deployed to edge devices, or why mobile and tiny ML systems can complement each other in IoT applications. -Federated learning represents a sophisticated hybrid approach where model training is distributed across many edge or mobile devices while maintaining privacy. Devices learn from local data and share model updates, rather than raw data, with cloud servers that aggregate these updates into an improved global model. This pattern is particularly powerful for applications like keyboard prediction on mobile devices or healthcare analytics, where privacy is paramount but benefits from collective learning are valuable. The cloud coordinates the learning process without directly accessing sensitive data, while devices benefit from the collective intelligence of the network. +As we examine each layer of @fig-ml-systems-convergence in detail, we'll see how these relationships manifest in practical system design and implementation. This understanding will prove valuable not just for working with individual ML systems, but for developing hybrid solutions that leverage the strengths of different approaches while mitigating their limitations. -### Progressive Deployment +### Implementations Layer -Progressive deployment strategies adapt models for different computational tiers, creating a cascade of increasingly lightweight versions. A model might start as a large, complex version in the cloud, then be progressively compressed and optimized for edge servers, mobile devices, and finally tiny sensors. Voice assistant systems often employ this pattern---full natural language processing runs in the cloud, while simplified wake-word detection runs on-device. This allows the system to balance capability and resource constraints across the ML stack. +The top layer of @fig-ml-systems-convergence represents the diverse landscape of ML systems we've explored throughout this chapter. Each implementation addresses specific needs and operational contexts, yet all contribute to the broader ecosystem of ML deployment options. -### Collaborative Learning +Cloud ML, centered in data centers, provides the foundation for large-scale training and complex model serving. With access to vast computational resources like the NVIDIA DGX A100 systems we saw in @tbl-representative-systems, cloud implementations excel at handling massive datasets and training sophisticated models. This makes them particularly suited for tasks requiring extensive computational power, such as training foundation models or processing large-scale analytics. -Collaborative learning enables peer-to-peer learning between devices at the same tier, often complementing hierarchical structures. Autonomous vehicle fleets, for example, might share learning about road conditions or traffic patterns directly between vehicles while also communicating with cloud infrastructure. This horizontal collaboration allows systems to share time-sensitive information and learn from each other's experiences without always routing through central servers. +Edge ML shifts the focus to local processing, prioritizing inference capabilities closer to data sources. Using devices like the NVIDIA Jetson AGX Orin, edge implementations balance computational power with reduced latency and improved privacy. This approach proves especially valuable in scenarios requiring quick decisions based on local data, such as industrial automation or real-time video analytics. 
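To make local inference concrete, here is a minimal sketch of the pattern these edge implementations share: load a compressed model from local storage and run predictions with no network round-trip. This is a sketch under stated assumptions, not the chapter's reference implementation; it assumes the `tflite-runtime` package is installed, and the model file name and zeroed input are hypothetical placeholders.

```python
import numpy as np
import tflite_runtime.interpreter as tflite  # assumes the tflite-runtime package is installed

# Load a compressed model from local storage -- no network round-trip involved.
interpreter = tflite.Interpreter(model_path="model.tflite")  # hypothetical model file
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Stand-in for a locally captured camera frame or sensor reading.
frame = np.zeros(inp["shape"], dtype=inp["dtype"])

interpreter.set_tensor(inp["index"], frame)
interpreter.invoke()  # inference happens entirely on the device
prediction = interpreter.get_tensor(out["index"])
print(prediction.shape)
```

The same few lines run essentially unchanged on an edge gateway, a Jetson-class device, or a phone; only the model file and the underlying accelerator differ.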
-These hybrid patterns demonstrate how modern ML systems are evolving beyond simple client-server architectures into rich, multi-tier systems that combine the strengths of different approaches. By understanding these patterns, system architects can design solutions that effectively balance competing demands for computation, latency, privacy, and power efficiency. The future of ML systems likely lies not in choosing between cloud, edge, mobile, or tiny approaches, but in creatively combining them to build more capable and efficient systems. +Mobile ML leverages the capabilities of personal devices, particularly smartphones and tablets. With specialized hardware like Apple's A17 Pro chip, mobile implementations enable sophisticated ML capabilities while maintaining user privacy and providing offline functionality. This paradigm has revolutionized applications from computational photography to on-device speech recognition. -### Real-World Integration Patterns +Tiny ML represents the frontier of embedded ML, bringing intelligence to highly constrained devices. Operating on microcontrollers like the Arduino Nano 33 BLE Sense, tiny implementations must carefully balance functionality with severe resource constraints. Despite these limitations, Tiny ML enables ML capabilities in scenarios where power efficiency and size constraints are paramount. -In practice, ML systems rarely operate in isolation. Instead, they form interconnected networks where each paradigm---Cloud, Edge, Mobile, and TinyML---plays a specific role while communicating with other parts of the system. These interactions follow distinct patterns that emerge from the inherent strengths and limitations of each approach. Cloud systems excel at training and analytics but require significant infrastructure. Edge systems provide local processing power and reduced latency. Mobile devices offer personal computing capabilities and user interaction. TinyML enables intelligence in the smallest devices and sensors. +### System Principles Layer -Cloud systems excel at training and analytics but require significant infrastructure. Edge systems provide local processing power and reduced latency. Mobile devices offer personal computing capabilities and user interaction. TinyML enables intelligence in the smallest devices and sensors. +The middle layer reveals the fundamental principles that unite all ML systems, regardless of their implementation scale. These core principles remain consistent even as their specific manifestations vary dramatically across different deployments. -![Example interaction patterns between ML paradigms, showing data flows, model deployment, and processing relationships across Cloud, Edge, Mobile, and TinyML systems.](./images/png/hybrid.png){#fig-hybrid} +Data Pipeline principles govern how systems handle information flow, from initial collection through processing to final deployment. In cloud systems, this might mean processing petabytes of data through distributed pipelines. For tiny systems, it could involve carefully managing sensor data streams within limited memory. Despite these scale differences, all systems must address the same fundamental challenges of data ingestion, transformation, and utilization. 
-@fig-hybrid illustrates these key interactions through specific connection types: "Deploy" paths show how models flow from cloud training to various devices, "Data" and "Results" show information flow from sensors through processing stages, "Analyze" shows how processed information reaches cloud analytics, and "Sync" demonstrates device coordination. Notice how data generally flows upward from sensors through processing layers to cloud analytics, while model deployments flow downward from cloud training to various inference points. The interactions aren't strictly hierarchical---mobile devices might communicate directly with both cloud services and tiny sensors, while edge systems can assist mobile devices with complex processing tasks. +Resource Management emerges as a universal challenge across all implementations. Whether managing thousands of GPUs in a data center or optimizing battery life on a microcontroller, all systems must balance competing demands for computation, memory, energy, and network resources. The quantities involved may differ by orders of magnitude, but the core principles of resource allocation and optimization remain remarkably consistent. -To understand how these labeled interactions manifest in real applications, let's explore several common scenarios using @fig-hybrid: +System Architecture principles guide how ML systems integrate models, hardware, and software components. Cloud architectures might focus on distributed computing and scalability, while tiny systems emphasize efficient memory mapping and interrupt handling. Yet all must solve fundamental problems of component integration, data flow optimization, and processing coordination. -- **Model Deployment Scenario:** A company develops a computer vision model for defect detection. Following the "Deploy" paths shown in @fig-hybrid, the cloud-trained model is distributed to edge servers in factories, quality control tablets on the production floor, and tiny cameras embedded in the production line. This showcases how a single ML solution can be distributed across different computational tiers for optimal performance. +### System Considerations Layer -- **Data Flow and Analysis Scenario:** In a smart agriculture system, soil sensors (TinyML) collect moisture and nutrient data, following the "Data" path to TinyML inference. The "Results" flow to edge processors in local stations, which process this information and use the "Analyze" path to send insights to the cloud for farm-wide analytics, while also sharing results with farmers' mobile apps. This demonstrates the hierarchical flow shown in @fig-hybrid from sensors through processing to cloud analytics. +The bottom layer of @fig-ml-systems-convergence illustrates how fundamental principles manifest in practical system-wide considerations. These considerations span all ML implementations, though their specific challenges and solutions vary based on scale and context. -- **Edge-Mobile Assistance Scenario:** When a mobile app needs to perform complex image processing that exceeds the phone's capabilities, it utilizes the "Assist" connection shown in @fig-hybrid. The edge system helps process the heavier computational tasks, sending back results to enhance the mobile app's performance. This shows how different ML tiers can cooperate to handle demanding tasks. +Optimization and Efficiency shape how ML systems balance performance with resource utilization. 
In cloud environments, this often means optimizing model training across GPU clusters while managing energy consumption in data centers. Edge systems focus on reducing model size and accelerating inference without compromising accuracy. Mobile implementations must balance model performance with battery life and thermal constraints. Tiny ML pushes optimization to its limits, requiring extensive model compression and quantization to fit within severely constrained environments. Despite these different emphases, all implementations grapple with the core challenge of maximizing performance within their available resources. -- **TinyML-Mobile Integration Scenario:** A fitness tracker uses TinyML to continuously monitor activity patterns and vital signs. Using the "Sync" pathway shown in @fig-hybrid, it synchronizes this processed data with the user's smartphone, which combines it with other health data before sending consolidated updates via the "Analyze" path to the cloud for long-term health analysis. This illustrates the common pattern of tiny devices using mobile devices as gateways to larger networks. +Operational Aspects affect how ML systems are deployed, monitored, and maintained in production environments. Cloud systems must handle continuous deployment across distributed infrastructure while monitoring model performance at scale. Edge implementations need robust update mechanisms and health monitoring across potentially thousands of devices. Mobile systems require seamless app updates and performance monitoring without disrupting user experience. Tiny ML faces unique challenges in deploying updates to embedded devices while ensuring continuous operation. Across all scales, the fundamental problems of deployment, monitoring, and maintenance remain consistent, even as solutions vary. -- **Multi-Layer Processing Scenario:** In a smart retail environment, tiny sensors monitor inventory levels, using "Data" and "Results" paths to send inference results to both edge systems for immediate stock management and mobile devices for staff notifications. Following the "Analyze" path, the edge systems process this data alongside other store metrics, while the cloud analyzes trends across all store locations. This demonstrates how the interactions shown in @fig-hybrid enable ML tiers to work together in a complete solution. +Trustworthy AI considerations ensure ML systems operate reliably, securely, and with appropriate privacy protections. Cloud implementations must secure massive amounts of data while ensuring model predictions remain reliable at scale. Edge systems need to protect local data processing while maintaining model accuracy in diverse environments. Mobile ML must preserve user privacy while delivering consistent performance. Tiny ML systems, despite their size, must still ensure secure operation and reliable inference. These trustworthiness considerations cut across all implementations, reflecting the critical importance of building ML systems that users can depend on. -These real-world patterns demonstrate how different ML paradigms naturally complement each other in practice. While each approach has its own strengths, their true power emerges when they work together as an integrated system. By understanding these patterns, system architects can better design solutions that effectively leverage the capabilities of each ML tier while managing their respective constraints. 
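To ground the optimization challenge described above, here is a minimal sketch of post-training quantization with TensorFlow Lite, one common route for shrinking a trained model toward mobile and Tiny ML budgets. The `saved_model/` directory is a hypothetical placeholder for a trained model, and the snippet shows only the simplest (dynamic-range) form of quantization.

```python
import tensorflow as tf

# Convert a trained model, enabling post-training quantization.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model/")  # hypothetical path
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # dynamic-range quantization
tflite_model = converter.convert()

# Full integer quantization for microcontrollers would additionally
# require a representative_dataset to calibrate activation ranges.
with open("model_quant.tflite", "wb") as f:
    f.write(tflite_model)
print(f"quantized model size: {len(tflite_model)} bytes")
```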
The progression through these layers---from diverse implementations through core principles to shared considerations---reveals why ML systems can be studied as a unified field despite their apparent differences. While specific solutions may vary dramatically based on scale and context, the fundamental challenges remain remarkably consistent. This understanding becomes particularly valuable as we move toward increasingly sophisticated hybrid systems that combine multiple implementation approaches.

The convergence of fundamental principles across ML implementations helps explain why hybrid approaches work so effectively in practice. As we will see in our discussion of hybrid ML, different implementations naturally complement each other precisely because they share these core foundations. Whether we're looking at train-serve splits that leverage cloud resources for training and edge devices for inference, or hierarchical processing that combines Tiny ML sensors with edge aggregation and cloud analytics, the shared principles enable seamless integration across scales.

### From Principles to Practice

This convergence also suggests why techniques and insights often transfer well between different scales of ML systems. A deep understanding of data pipelines in cloud environments can inform how we structure data flow in embedded systems. Resource management strategies developed for mobile devices might inspire new approaches to cloud optimization. System architecture patterns that prove effective at one scale often adapt surprisingly well to others.

Understanding these fundamental principles and shared considerations provides a foundation for comparing different ML implementations more effectively. While each approach has its distinct characteristics and optimal use cases, they all build upon the same core elements. As we move into our detailed comparison in the next section, keeping these shared foundations in mind will help us better appreciate both the differences and similarities between various ML system implementations.

## ML System Comparison

We can now synthesize our understanding by examining how the various ML system approaches compare across different dimensions. This synthesis is particularly important as system designers often face tradeoffs between different deployment options when implementing ML solutions.

The relationship between computational resources and deployment location forms one of the most fundamental comparisons across ML systems. As we move from cloud deployments to tiny devices, we observe a dramatic reduction in available computing power, storage, and energy consumption. Cloud ML systems, with their data center infrastructure, can leverage virtually unlimited resources, processing data at the scale of petabytes and training models with billions of parameters. Edge ML systems, while more constrained, still offer significant computational capability through specialized hardware like edge GPUs and neural processing units. Mobile ML represents a middle ground, balancing computational power with energy efficiency on devices like smartphones and tablets. At the far end of the spectrum, Tiny ML operates under severe resource constraints, often limited to kilobytes of memory and milliwatts of power consumption.

The operational characteristics of these systems reveal another important dimension of comparison. @tbl-big_vs_tiny provides a comprehensive view of how these systems differ across various operational aspects. Latency, for instance, shows a clear pattern: cloud systems typically incur delays of 100-1000ms due to network communication, while edge systems reduce this to 10-100ms by processing data locally. Mobile ML achieves even lower latencies of 5-50ms for many tasks, and Tiny ML systems can respond in 1-10ms for simple inferences. This latency gradient illustrates how moving computation closer to the data source can improve real-time processing capabilities.

## Comparison

Let's bring together the different ML variants we've explored individually for a comprehensive view. For a detailed comparison of these ML variants, we can refer to @tbl-big_vs_tiny. 
This table offers a comprehensive analysis of Cloud ML, Edge ML, and TinyML based on various features and aspects. By examining these different characteristics side by side, we gain a clearer perspective on the unique advantages and distinguishing factors of each approach. This detailed comparison, combined with the visual overview provided by the Venn diagram, aids in making informed decisions based on the specific needs and constraints of a given application or project. +The operational characteristics of these systems reveal another important dimension of comparison. @tbl-big_vs_tiny provides a comprehensive view of how these systems differ across various operational aspects. Latency, for instance, shows a clear pattern: cloud systems typically incur delays of 100-1000ms due to network communication, while edge systems reduce this to 10-100ms by processing data locally. Mobile ML achieves even lower latencies of 5-50ms for many tasks, and TinyML systems can respond in 1-10ms for simple inferences. This latency gradient illustrates how moving computation closer to the data source can improve real-time processing capabilities. +--------------------------+----------------------------------------------------------+----------------------------------------------------------+-----------------------------------------------------------+----------------------------------------------------------+ -| Aspect | Cloud ML | Edge ML | Mobile ML | TinyML | +| Aspect | Cloud ML | Edge ML | Mobile ML | Tiny ML | +:=========================+:=========================================================+:=========================================================+:==========================================================+:=========================================================+ | Processing Location | Centralized cloud servers (Data Centers) | Local edge devices (gateways, servers) | Smartphones and tablets | Ultra-low-power microcontrollers and embedded systems | +--------------------------+----------------------------------------------------------+----------------------------------------------------------+-----------------------------------------------------------+----------------------------------------------------------+ @@ -487,7 +503,7 @@ Let's bring together the different ML variants we've explored individually for a +--------------------------+----------------------------------------------------------+----------------------------------------------------------+-----------------------------------------------------------+----------------------------------------------------------+ | Hardware Requirements | Cloud infrastructure | Edge servers/gateways | Modern smartphones | MCUs/embedded systems | +--------------------------+----------------------------------------------------------+----------------------------------------------------------+-----------------------------------------------------------+----------------------------------------------------------+ -| Framework Support | All ML frameworks | Most frameworks | Mobile-optimized (TFLite, CoreML) | TinyML frameworks | +| Framework Support | All ML frameworks | Most frameworks | Mobile-optimized (TFLite, CoreML) | Tiny ML frameworks | +--------------------------+----------------------------------------------------------+----------------------------------------------------------+-----------------------------------------------------------+----------------------------------------------------------+ | Model Size Limits | None | Several GB | 10s-100s MB | 
Bytes-KB range | +--------------------------+----------------------------------------------------------+----------------------------------------------------------+-----------------------------------------------------------+----------------------------------------------------------+ @@ -496,11 +512,72 @@ | Offline Capability | None | Good | Excellent | Complete | +--------------------------+----------------------------------------------------------+----------------------------------------------------------+-----------------------------------------------------------+----------------------------------------------------------+

-: Comparison of feature aspects across Cloud ML, Edge ML, and TinyML. {#tbl-big_vs_tiny .hover .striped} +: Comparison of feature aspects across Cloud ML, Edge ML, Mobile ML, and Tiny ML. {#tbl-big_vs_tiny .hover .striped}

Privacy and data handling represent another crucial axis of comparison. Cloud ML requires data to leave the device, potentially raising privacy concerns despite robust security measures. Edge ML improves privacy by keeping data within local networks, while Mobile ML further enhances this by processing sensitive information directly on personal devices. Tiny ML offers the strongest privacy guarantees, as data never leaves the sensor or microcontroller where it's collected.

Development complexity and deployment considerations also vary significantly across these paradigms. Cloud ML benefits from mature development tools and frameworks but requires expertise in cloud infrastructure. Edge ML demands knowledge of both ML and networking protocols, while Mobile ML developers must understand mobile-specific optimizations and platform constraints. Tiny ML development, though targeting simpler devices, often requires specialized knowledge of embedded systems and careful optimization to work within severe resource constraints.

Cost structures differ markedly as well. Cloud ML typically involves ongoing operational costs for computation and storage, often running into thousands of dollars monthly for large-scale deployments. Edge ML requires significant upfront investment in edge devices but may reduce ongoing costs. Mobile ML leverages existing consumer devices, minimizing additional hardware costs, while Tiny ML solutions can be deployed for just a few dollars per device, though development costs may be higher.

These comparisons reveal that each paradigm has distinct advantages and limitations. Cloud ML excels at complex, data-intensive tasks but requires constant connectivity. Edge ML offers a balance of computational power and local processing. Mobile ML provides personalized intelligence on ubiquitous devices. Tiny ML enables ML in previously inaccessible contexts but requires careful optimization. Understanding these tradeoffs is crucial for selecting the appropriate deployment strategy for specific applications and constraints.

## Hybrid ML

While our comparison highlighted the distinct advantages and tradeoffs of each paradigm, modern ML deployments often transcend these boundaries. In practice, systems architects rarely confine themselves to a single approach, instead combining various paradigms to create more nuanced solutions. These hybrid approaches leverage the complementary strengths we've analyzed---from cloud's computational power to tiny's efficiency---while mitigating their individual limitations. 
+
+### Hierarchical Processing
+
+Hierarchical processing creates a multi-tier system where data and intelligence flow between different levels of the ML stack. In industrial IoT applications, tiny sensors might perform basic anomaly detection, edge devices aggregate and analyze data from multiple sensors, and cloud systems handle complex analytics and model updates. For instance, ESP32-CAM devices might perform basic image classification at the sensor level within their minimal 520KB of RAM, feed data up to Jetson AGX Orin devices for more sophisticated computer vision tasks, and ultimately connect to cloud infrastructure for complex analytics and model updates.
+
+This hierarchy allows each tier to handle tasks appropriate to its capabilities---Tiny ML devices handle immediate, simple decisions; edge devices manage local coordination; and cloud systems tackle complex analytics and learning tasks. Smart city installations often use this pattern, with street-level sensors feeding data to neighborhood-level edge processors, which in turn connect to city-wide cloud analytics.
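+
+The tiered flow can be sketched as three cooperating functions; the thresholds, sensor readings, and site label below are invented purely for illustration:
+
+```python
+# Hierarchical processing sketch: each tier handles what it can afford to
+# and escalates the rest. All values are illustrative.
+def tiny_tier(reading: float) -> bool:
+    """On-sensor check: a cheap threshold test standing in for a tiny model."""
+    return reading > 0.8
+
+def edge_tier(flags: list) -> bool:
+    """Edge gateway: aggregate flags from many sensors to filter out noise."""
+    return sum(flags) >= 3  # escalate only when several sensors agree
+
+def cloud_tier(event: dict) -> None:
+    """Cloud: heavyweight analytics and fleet-wide model updates (stubbed)."""
+    print(f"cloud received escalation: {event}")
+
+readings = [0.91, 0.85, 0.12, 0.88, 0.83]
+flags = [tiny_tier(r) for r in readings]            # tier 1: on-sensor decisions
+if edge_tier(flags):                                # tier 2: local aggregation
+    cloud_tier({"site": "line-7", "flags": flags})  # tier 3: cloud analytics
+```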
+
+### Federated Learning
+
+Federated learning represents a sophisticated hybrid approach in which model training is distributed across many edge or mobile devices while preserving privacy. Devices learn from local data and share model updates, rather than raw data, with cloud servers that aggregate these updates into an improved global model. This pattern is particularly powerful for applications like keyboard prediction on mobile devices or healthcare analytics, where privacy is paramount but the benefits of collective learning are valuable. The cloud coordinates the learning process without directly accessing sensitive data, while devices benefit from the collective intelligence of the network.
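+
+The aggregation step at the heart of this pattern can be written in a few lines. This is a bare-bones sketch of federated averaging with made-up weight vectors and dataset sizes; real deployments add client sampling, update compression, and secure aggregation:
+
+```python
+# Federated averaging sketch: the server combines model updates, never raw
+# data. Weights are plain lists of floats here; frameworks use tensors.
+def federated_average(client_weights, client_sizes):
+    """Average per-client weights, weighted by local dataset size."""
+    total = sum(client_sizes)
+    n_params = len(client_weights[0])
+    return [
+        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
+        for i in range(n_params)
+    ]
+
+# Three devices train locally and send back only their updated weights.
+updates = [[0.9, -0.2], [1.1, -0.1], [1.0, -0.3]]
+sizes = [500, 1500, 1000]  # devices with more local data get more influence
+print(federated_average(updates, sizes))  # new global model to broadcast
+```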
+
+### Progressive Deployment
+
+Progressive deployment strategies adapt models for different computational tiers, creating a cascade of increasingly lightweight versions. A model might start as a large, complex version in the cloud, then be progressively compressed and optimized for edge servers, mobile devices, and finally tiny sensors. Voice assistant systems often employ this pattern---full natural language processing runs in the cloud, while simplified wake-word detection runs on-device. This allows the system to balance capability and resource constraints across the ML stack.
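+
+A common embodiment of this cascade is an always-on tiny model gating a larger remote one. The sketch below is illustrative only; both models are stubs, and the confidence threshold is an assumed value that would be tuned per product:
+
+```python
+# Progressive deployment sketch: a cheap on-device detector answers the
+# common case and defers rare positives to a large remote model. Stubs only.
+import random
+
+def wake_word_score(audio) -> float:
+    """Tiny on-device model: confidence that the wake word was spoken."""
+    return random.random()  # stand-in for a quantized keyword-spotting model
+
+def cloud_nlp(audio) -> str:
+    """Full cloud model: expensive, invoked only on likely wake words."""
+    return "full transcription and intent parsing"
+
+THRESHOLD = 0.7  # assumed; balances missed wake words against cloud cost
+
+def handle(audio):
+    if wake_word_score(audio) < THRESHOLD:  # runs continuously at low power
+        return None                         # nothing leaves the device
+    return cloud_nlp(audio)                 # escalate the rare positive case
+
+print(handle(b"...microphone samples..."))
+```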
+
+### Collaborative Learning
+
+Collaborative learning enables peer-to-peer learning between devices at the same tier, often complementing hierarchical structures. Autonomous vehicle fleets, for example, might share learning about road conditions or traffic patterns directly between vehicles while also communicating with cloud infrastructure. This horizontal collaboration allows systems to share time-sensitive information and learn from each other's experiences without always routing through central servers.
+
+These hybrid patterns demonstrate how modern ML systems are evolving beyond simple client-server architectures into rich, multi-tier systems that combine the strengths of different approaches. By understanding these patterns, system architects can design solutions that effectively balance competing demands for computation, latency, privacy, and power efficiency. The future of ML systems likely lies not in choosing between cloud, edge, mobile, or tiny approaches, but in creatively combining them to build more capable and efficient systems.
+
+### Real-World Integration Patterns
+
+In practice, ML systems rarely operate in isolation. Instead, they form interconnected networks where each paradigm---Cloud, Edge, Mobile, and Tiny ML---plays a specific role while communicating with other parts of the system. These interactions follow distinct patterns that emerge from the inherent strengths and limitations of each approach. Cloud systems excel at training and analytics but require significant infrastructure. Edge systems provide local processing power and reduced latency. Mobile devices offer personal computing capabilities and user interaction. Tiny ML enables intelligence in the smallest devices and sensors.
+
+![Example interaction patterns between ML paradigms, showing data flows, model deployment, and processing relationships across Cloud, Edge, Mobile, and Tiny ML systems.](./images/png/hybrid.png){#fig-hybrid}
+
+@fig-hybrid illustrates these key interactions through specific connection types: "Deploy" paths show how models flow from cloud training to various devices, "Data" and "Results" show information flow from sensors through processing stages, "Analyze" shows how processed information reaches cloud analytics, and "Sync" demonstrates device coordination. Notice how data generally flows upward from sensors through processing layers to cloud analytics, while model deployments flow downward from cloud training to various inference points. The interactions aren't strictly hierarchical---mobile devices might communicate directly with both cloud services and tiny sensors, while edge systems can assist mobile devices with complex processing tasks.
+
+To understand how these labeled interactions manifest in real applications, let's explore several common scenarios using @fig-hybrid:
+
+- **Model Deployment Scenario:** A company develops a computer vision model for defect detection. Following the "Deploy" paths shown in @fig-hybrid, the cloud-trained model is distributed to edge servers in factories, quality control tablets on the production floor, and tiny cameras embedded in the production line. This showcases how a single ML solution can be distributed across different computational tiers for optimal performance.
+
+- **Data Flow and Analysis Scenario:** In a smart agriculture system, soil sensors (Tiny ML) collect moisture and nutrient data, following the "Data" path to Tiny ML inference. The "Results" flow to edge processors in local stations, which process this information and use the "Analyze" path to send insights to the cloud for farm-wide analytics, while also sharing results with farmers' mobile apps. This demonstrates the hierarchical flow shown in @fig-hybrid from sensors through processing to cloud analytics.
+
+- **Edge-Mobile Assistance Scenario:** When a mobile app needs to perform complex image processing that exceeds the phone's capabilities, it utilizes the "Assist" connection shown in @fig-hybrid. The edge system helps process the heavier computational tasks, sending back results to enhance the mobile app's performance. This shows how different ML tiers can cooperate to handle demanding tasks.
+
+- **Tiny ML-Mobile Integration Scenario:** A fitness tracker uses Tiny ML to continuously monitor activity patterns and vital signs. Using the "Sync" pathway shown in @fig-hybrid, it synchronizes this processed data with the user's smartphone, which combines it with other health data before sending consolidated updates via the "Analyze" path to the cloud for long-term health analysis. This illustrates the common pattern of tiny devices using mobile devices as gateways to larger networks.
+
+- **Multi-Layer Processing Scenario:** In a smart retail environment, tiny sensors monitor inventory levels, using the "Data" and "Results" paths to send inference results to both edge systems for immediate stock management and mobile devices for staff notifications. Following the "Analyze" path, the edge systems process this data alongside other store metrics, while the cloud analyzes trends across all store locations. This demonstrates how the interactions shown in @fig-hybrid enable ML tiers to work together in a complete solution.
+
+These real-world patterns demonstrate how different ML paradigms naturally complement each other in practice. While each approach has its own strengths, their true power emerges when they work together as an integrated system. By understanding these patterns, system architects can better design solutions that effectively leverage the capabilities of each ML tier while managing their respective constraints.
+
 ## Conclusion
 
-In this chapter, we've offered a panoramic view of the evolving landscape of machine learning, covering cloud, edge, and tiny ML paradigms. Cloud-based machine learning leverages the immense computational resources of cloud platforms to enable powerful and accurate models but comes with limitations, including latency and privacy concerns. Edge ML mitigates these limitations by bringing inference directly to edge devices, offering lower latency and reduced connectivity needs. TinyML takes this further by miniaturizing ML models to run directly on highly resource-constrained devices, opening up a new category of intelligent applications.
+In this chapter, we've offered a panoramic view of the evolving landscape of machine learning, covering the cloud, edge, mobile, and tiny ML paradigms. Cloud-based machine learning leverages the immense computational resources of cloud platforms to enable powerful and accurate models but comes with limitations, including latency and privacy concerns. Edge ML mitigates these limitations by bringing inference directly to edge devices, offering lower latency and reduced connectivity needs. Mobile ML extends these capabilities to personal devices, where on-device processing supports responsive, privacy-aware applications. Tiny ML takes this further by miniaturizing ML models to run directly on highly resource-constrained devices, opening up a new category of intelligent applications.
 
 Each approach has its tradeoffs, including model complexity, latency, privacy, and hardware costs. Over time, we anticipate converging these embedded ML approaches, with cloud pre-training facilitating more sophisticated edge and tiny ML implementations. Advances like federated learning and on-device learning will enable embedded devices to refine their models by learning from real-world data.

@@ -528,13 +605,13 @@ These slides are a valuable tool for instructors to deliver lectures and for stu
 
 - [Embedded Inference.](https://docs.google.com/presentation/d/1FOUQ9dbe3l_qTa2AnroSbOz0ykuCz5cbTNO77tvFxEs/edit?usp=drive_link)
 
-- [TinyML on Microcontrollers.](https://docs.google.com/presentation/d/1jwAZz3UOoJTR8PY6Wa34FxijpoDc9gBM/edit?usp=drive_link&ouid=102419556060649178683&rtpof=true&sd=true)
+- [Tiny ML on Microcontrollers.](https://docs.google.com/presentation/d/1jwAZz3UOoJTR8PY6Wa34FxijpoDc9gBM/edit?usp=drive_link&ouid=102419556060649178683&rtpof=true&sd=true)
 
-- TinyML as a Service (TinyMLaaS):
+- Tiny ML as a Service (Tiny MLaaS):
 
-  - [TinyMLaaS: Introduction.](https://docs.google.com/presentation/d/1O7bxb36SnexfDI3iE_p0C8JI_VYXAL8cyAx3JKDfeUo/edit?usp=drive_link)
+  - [Tiny MLaaS: Introduction.](https://docs.google.com/presentation/d/1O7bxb36SnexfDI3iE_p0C8JI_VYXAL8cyAx3JKDfeUo/edit?usp=drive_link)
 
-  - [TinyMLaaS: Design Overview.](https://docs.google.com/presentation/d/1ZUUHtTbKlzeTwVteQMSztscQmdmMxT1A24pBKSys7g0/edit#slide=id.g94db9f9f78_0_2)
+  - [Tiny MLaaS: Design Overview.](https://docs.google.com/presentation/d/1ZUUHtTbKlzeTwVteQMSztscQmdmMxT1A24pBKSys7g0/edit#slide=id.g94db9f9f78_0_2)
 
 :::
 
 ::: {.callout-important collapse="false"}
diff --git a/style.scss b/style.scss
index 9ed2d83a..9e39f413 100644
--- a/style.scss
+++ b/style.scss
@@ -14,6 +14,17 @@ $font-size-root: 16px !default;
 /*-- scss:defaults --*/
 :root {
   --link-color: #A51C30; /* Define the CSS variable for link color */
+
+  --mermaid-bg-color: #f9fafb; /* Light neutral background */
+  --mermaid-edge-color: #4b5563; /* Muted dark gray for edges */
+  --mermaid-node-fg-color: #1f2937; /* Dark gray for node text */
+  --mermaid-fg-color: #2563eb; /* Primary blue for accents */
+  --mermaid-fg-color--lighter: #60a5fa; /* Lighter blue for highlights */
+  --mermaid-fg-color--lightest: #93c5fd; /* Lightest blue for secondary highlights */
+  --mermaid-font-family: 'Roboto', sans-serif; /* Modern, clean font */
+  --mermaid-label-bg-color: #e5e7eb; /* Soft gray for label backgrounds */
+  --mermaid-label-fg-color: #1f2937; /* Dark gray for label text */
+  --mermaid-node-bg-color: #ffffff; /* White for node backgrounds */
 }
 
 div.sidebar-item-container .active {