diff --git a/contents/labs/seeed/xiao_esp32s3/image_classification/image_classification.qmd b/contents/labs/seeed/xiao_esp32s3/image_classification/image_classification.qmd
index e8562adf..e0090616 100644
--- a/contents/labs/seeed/xiao_esp32s3/image_classification/image_classification.qmd
+++ b/contents/labs/seeed/xiao_esp32s3/image_classification/image_classification.qmd
@@ -35,7 +35,7 @@ Each category is split into the **train** (100 images), **test** (10 images), an

 - Download the dataset from the Kaggle website and put it on your computer.

-> Optionally, you can add some fresh photos of bananas, apples, and potatoes from your home kitchen, using, for example, the codediscussed in the setup lab.
+> Optionally, you can add some fresh photos of bananas, apples, and potatoes from your home kitchen, using, for example, the code discussed in the next setup lab.

 ## Training the model with Edge Impulse Studio

@@ -73,7 +73,7 @@ So, starting from the raw images, we will resize them (96x96) pixels and feed th

 #### Pre-processing (Feature Generation)

-Besides resizing the images, we can change them to Grayscale or keep the actual RGB color depth. Let's start selecting `Grayscale`. Doing that, each one of our data samples will have dimension 9, 216 features (96x96x1). Keeping RGB, this dimension would be three times bigger. Working with Grayscale helps to reduce the amount of final memory needed for inference.
+Besides resizing the images, we can convert them to Grayscale or keep the actual RGB color depth. Let's start by selecting `Grayscale`. Doing that, each of our data samples will have 9,216 features (96x96x1). Keeping RGB, this dimension would be three times bigger. Working with Grayscale helps to reduce the amount of final memory needed for inference.

 ![](https://hackster.imgix.net/uploads/attachments/1587492/image_eqGdUoXrMb.png?auto=compress%2Cformat&w=1280&h=960&fit=max)

@@ -163,7 +163,7 @@ Returning to your project (Tab Image), copy one of the Raw Data Sample:

 ![](./images/png/get_test_data.png)

-9, 216 features will be copied to the clipboard. This is the input tensor (a flattened image of 96x96x1), in this case, bananas. Past this Input tensor on`features[] = {0xb2d77b, 0xb5d687, 0xd8e8c0, 0xeaecba, 0xc2cf67, ...}`
+9,216 features will be copied to the clipboard. This is the input tensor (a flattened image of 96x96x1), in this case, bananas. Paste this input tensor into `features[] = {0xb2d77b, 0xb5d687, 0xd8e8c0, 0xeaecba, 0xc2cf67, ...}`

 ![](./images/png/features.png)

@@ -254,7 +254,7 @@ For the test, we can train the model again, using the smallest version of Mobile

 ![](https://hackster.imgix.net/uploads/attachments/1591705/image_lwYLKM696A.png?auto=compress%2Cformat&w=1280&h=960&fit=max)

-> Note that the estimated latency for an Arduino Portenta (ou Nicla), running with a clock of 480MHz is 45ms.
+> Note that the estimated latency for an Arduino Portenta (or Nicla), running with a clock of 480MHz, is 45ms.

 Deploying the model, we got an inference of only 135ms, remembering that the XIAO runs with half of the clock used by the Portenta/Nicla (240MHz):

@@ -281,7 +281,7 @@ Follow the following steps to start the SenseCraft-Web-Toolkit:

 In our case, we will use the blue button at the bottom of the page: `[Upload Custom AI Model]`.

-But first, we must download from Edge Impulse Studio our **quantized .tflite** model.
+But first, we must download from Edge Impulse Studio our **quantized.tflite** model.

 3. Go to your project at Edge Impulse Studio, or clone this one:

@@ -311,10 +311,10 @@ On Device Log, you will get information as:

 ![](./images/jpeg//senseCraft-log.jpg)

-- Preprocess time (image capture and Crop): 4ms;
+- Preprocess time (image capture and Crop): 4ms,
 - Inference time (model latency): 106ms,
-- Postprocess time (display of the image and inclusion of data): 0ms.
-- Output tensor (classes), for example: [[89,0]]; where 0 is Apple (and 1is banana and 2 is potato)
+- Postprocess time (display of the image and inclusion of data): 0ms,
+- Output tensor (classes), for example: [[89,0]]; where 0 is apple (1 is banana and 2 is potato).

 Here are other screenshots:

diff --git a/contents/labs/seeed/xiao_esp32s3/object_detection/object_detection.qmd b/contents/labs/seeed/xiao_esp32s3/object_detection/object_detection.qmd
index 7f9ef588..d7764436 100644
--- a/contents/labs/seeed/xiao_esp32s3/object_detection/object_detection.qmd
+++ b/contents/labs/seeed/xiao_esp32s3/object_detection/object_detection.qmd
@@ -320,7 +320,7 @@ After a few seconds (or minutes), the model will be uploaded to your device, and

 ![](./images/jpeg/sense-craft-3.jpg)

-The detected objects will be marked (the centroid). You can select the Confidence of your inference cursor `Confidence`. and `IoU`, which is used to assess the accuracy of predicted bounding boxes compared to truth bounding boxes
+The detected objects will be marked (the centroid). You can use the cursors to select the `Confidence` of your inference and the `IoU`, which is used to assess the accuracy of predicted bounding boxes compared to ground-truth bounding boxes.

 Clicking on the top button (Device Log), you can open a Serial Monitor to follow the inference, as we did with the Arduino IDE.

@@ -328,10 +328,10 @@ Clicking on the top button (Device Log), you can open a Serial Monitor to follow

 On Device Log, you will get information as:

-- Preprocess time (image capture and Crop): 3 ms;
+- Preprocess time (image capture and Crop): 3 ms,
 - Inference time (model latency): 115 ms,
 - Postprocess time (display of the image and marking objects): 1 ms.
-- Output tensor (boxes), for example, one of the boxes: [[30,150, 20, 20,97, 2]]; where 30,150, 20, 20 are the coordinates of the box (around the centroid); 97 is the inference result, and 2 is the class (in this case 2: fruit)
+- Output tensor (boxes), for example, one of the boxes: [[30, 150, 20, 20, 97, 2]]; where 30, 150, 20, 20 are the coordinates of the box (around the centroid); 97 is the confidence of the inference, and 2 is the class (in this case, 2: fruit).

 > Note that in the above example, we got 5 boxes because none of the fruits got 3 centroids. One solution will be post-processing, where we can aggregate close centroids in one.
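
The final note above suggests aggregating close centroids in post-processing. A minimal sketch of that idea, assuming the `[x, y, w, h, score, class]` layout printed in the Device Log and an arbitrary 20-pixel distance threshold (both the threshold and the greedy merge strategy are illustrative, not what SenseCraft actually does), could look like this:

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// One detection as printed in the Device Log: [x, y, w, h, score, class].
struct Box {
    int x, y, w, h;  // box position and size around the centroid
    int score;       // confidence (0-100)
    int cls;         // class index (e.g., 2: fruit)
};

// Greedily merge boxes of the same class whose centroids lie closer than
// maxDist pixels, keeping the highest-scoring box of each cluster.
// The 20-pixel default is an arbitrary choice and would need tuning.
std::vector<Box> aggregateCentroids(const std::vector<Box>& boxes, int maxDist = 20) {
    std::vector<Box> merged;
    std::vector<bool> used(boxes.size(), false);
    for (size_t i = 0; i < boxes.size(); ++i) {
        if (used[i]) continue;
        Box best = boxes[i];
        for (size_t j = i + 1; j < boxes.size(); ++j) {
            if (used[j] || boxes[j].cls != best.cls) continue;
            double dx = boxes[j].x - best.x;
            double dy = boxes[j].y - best.y;
            if (std::sqrt(dx * dx + dy * dy) < maxDist) {
                used[j] = true;
                if (boxes[j].score > best.score) best = boxes[j];  // keep the strongest detection
            }
        }
        merged.push_back(best);
    }
    return merged;
}

int main() {
    // The box from the log, plus a hypothetical second centroid on the same fruit.
    std::vector<Box> raw = {{30, 150, 20, 20, 97, 2}, {38, 156, 20, 20, 88, 2}};
    std::vector<Box> out = aggregateCentroids(raw);
    std::printf("boxes after aggregation: %zu\n", out.size());  // the 2 raw boxes collapse into 1
    return 0;
}
```

On-device, the same idea would run over the output tensor before the boxes are displayed, so duplicate centroids on a single fruit are counted only once.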