From 1bc5e442eaec9efbf8ae7969da810f6bf5acd20b Mon Sep 17 00:00:00 2001
From: Vijay Janapa Reddi
Date: Wed, 15 Nov 2023 21:17:36 -0500
Subject: [PATCH] Built site for gh-pages

---
 .nojekyll            |  2 +-
 contributors.html    | 56 ++++++++++++++++++++++----------------------
 hw_acceleration.html | 30 +++++++++++++-----------
 references.html      |  7 ++++++
 search.json          |  8 +++----
 5 files changed, 56 insertions(+), 47 deletions(-)

diff --git a/.nojekyll b/.nojekyll
index 1317d3f6..110da80a 100644
--- a/.nojekyll
+++ b/.nojekyll
@@ -1 +1 @@
-0c60ec2c
\ No newline at end of file
+cc417eda
\ No newline at end of file

diff --git a/contributors.html b/contributors.html
index 1593d648..69e9eccc 100644
--- a/contributors.html
+++ b/contributors.html
@@ -521,101 +521,101 @@

Contributors

-Jessica Quaye
Jessica Quaye

+Jennifer Zhou
Jennifer Zhou

-Marcelo Rovai
Marcelo Rovai

+Henry Bae
Henry Bae

-happyappledog
happyappledog

+Colby Banbury
Colby Banbury

-Jared Ni
Jared Ni

+sjohri20
sjohri20

-ishapira
ishapira

+Jessica Quaye
Jessica Quaye

-Shvetank Prakash
Shvetank Prakash

+Marcelo Rovai
Marcelo Rovai

-Ikechukwu Uchendu
Ikechukwu Uchendu

+sophiacho1
sophiacho1

-Henry Bae
Henry Bae

+arnaumarin
arnaumarin

-Pong Trairatvorakul
Pong Trairatvorakul

+AditiR_42
AditiR_42

-aptl26
aptl26

+Mark Mazumder
Mark Mazumder

-naeemkh
naeemkh

+Matthew Stewart
Matthew Stewart

-alxrod
alxrod

+naeemkh
naeemkh

-Colby Banbury
Colby Banbury

+Ikechukwu Uchendu
Ikechukwu Uchendu

-Jayson Lin
Jayson Lin

+Michael Schnebly
Michael Schnebly

-Jennifer Zhou
Jennifer Zhou

+alxrod
alxrod

-sjohri20
sjohri20

+Jeffrey Ma
Jeffrey Ma

-Eric D
Eric D

+Pong Trairatvorakul
Pong Trairatvorakul

-Vijay Janapa Reddi
Vijay Janapa Reddi

+Jared Ni
Jared Ni

-Emil Njor
Emil Njor

+happyappledog
happyappledog

-Mark Mazumder
Mark Mazumder

+oishib
oishib

-oishib
oishib

+Eric D
Eric D

-Jeffrey Ma
Jeffrey Ma

+Jayson Lin
Jayson Lin

-AditiR_42
AditiR_42

+aptl26
aptl26

-Michael Schnebly
Michael Schnebly

+ishapira
ishapira

-Matthew Stewart
Matthew Stewart

+Shvetank Prakash
Shvetank Prakash

-arnaumarin
arnaumarin

+Divya
Divya

-Divya
Divya

+Vijay Janapa Reddi
Vijay Janapa Reddi

Marco Zennaro
Marco Zennaro

-sophiacho1
sophiacho1

+Emil Njor
Emil Njor

diff --git a/hw_acceleration.html b/hw_acceleration.html
index a20150ca..8467790a 100644
--- a/hw_acceleration.html
+++ b/hw_acceleration.html
@@ -768,8 +768,8 @@
Advanced Process No

Cutting-edge manufacturing processes allow packing more transistors into smaller die areas, increasing density. ASICs designed specifically for high-volume applications can better amortize the costs of bleeding-edge process nodes.

-
-

Disadvatages

+
+

Disadvantages

Long Design Timelines

The engineering process of designing and validating an ASIC can take 2-3 years. Synthesizing the architecture using hardware description languages, taping out the chip layout, and fabricating the silicon on advanced process nodes involves long development cycles. For example, to tape out a 7nm chip, teams need to carefully define specifications, write the architecture in HDL, synthesize the logic gates, place components, route all interconnections, and finalize the layout to send for fabrication. This very large scale integration (VLSI) flow means that the full path from specification to manufactured silicon can traditionally take 2-5 years.

@@ -823,8 +823,8 @@
Native Su

A key advantage of FPGAs is the ability to natively implement any bit width for arithmetic units, such as INT4 or bfloat16 used in quantized ML models. For example, Intel’s Stratix 10 NX FPGAs have dedicated INT8 cores that can achieve up to 143 INT8 TOPS at ~1 TOPS/W Intel® Stratix® 10 NX FPGA. Lower bit widths increase arithmetic density and performance. FPGAs can even support mixed precision or dynamic precision tuning at runtime.
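To make the bit-width discussion concrete, here is a minimal NumPy sketch (illustrative only, not from the chapter or any FPGA toolchain) of symmetric per-tensor INT8 quantization, the transformation that lets narrow integer units like the Stratix 10 NX's INT8 cores stand in for floating-point math:

```python
import numpy as np

def quantize_int8(x):
    # Symmetric per-tensor INT8 quantization: map [-max|x|, max|x|] onto [-127, 127].
    scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate float values from the INT8 codes.
    return q.astype(np.float32) * scale

weights = np.array([0.5, -1.0, 0.25, 0.75], dtype=np.float32)
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
```

Each weight is now a single byte plus one shared scale, which is why lower bit widths raise arithmetic density: the fabric packs four times as many INT8 operands as FP32 into the same wires and DSP slices.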

-
-

Disadvatages

+
+

Disadvantages

Lower Peak Throughput than ASICs

FPGAs cannot match the raw throughput numbers of ASICs customized for a specific model and precision. The overheads of the reconfigurable fabric compared to fixed function hardware result in lower peak performance. For example, the TPU v5e pods allow up to 256 chips to be connected with more than 100 petaOps of INT8 performance while FPGAs can offer up to 143 INT8 TOPS or 286 INT4 TOPS Intel® Stratix® 10 NX FPGA.

@@ -874,8 +874,8 @@
-

Disadvatages

+
+

Disadvantages

DSPs make architectural tradeoffs that limit peak throughput, precision, and model capacity compared to other AI accelerators, although their power efficiency and integer math support still make them a strong edge compute option. So while DSPs provide some benefits over CPUs, they come with the following limitations for machine learning workloads:

Lower Peak Throughput than ASICs/GPUs
@@ -926,8 +926,8 @@
Programmable Arc

While not as flexible as FPGAs, GPUs do provide programmability via CUDA and shader languages to customize computations. Developers can optimize data access patterns, create new ops, and tune precisions for evolving models and algorithms.
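As a rough illustration of "creating new ops" (a NumPy sketch of the math only, standing in for what would be a hand-written CUDA kernel; the chapter does not include this code), consider a fused elementwise activation such as the tanh approximation of GELU:

```python
import numpy as np

def gelu(x):
    # Tanh approximation of GELU -- the kind of fused elementwise op a
    # developer might hand-write as a single CUDA kernel so the intermediate
    # products never round-trip through device memory.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

activations = np.array([-2.0, 0.0, 2.0], dtype=np.float32)
out = gelu(activations)
```

On a GPU, fusing the multiply, cube, tanh, and scale into one kernel is exactly the data-access-pattern optimization the paragraph describes: one read and one write per element instead of one per intermediate op.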

-
-

Disadvatages

+
+

Disadvantages

While GPUs have become the standard accelerator for deep learning, their architecture also comes with some key downsides.

Less Efficient than Custom ASICs
@@ -1005,8 +1005,8 @@ Low Power for Infe
Ignatov, Andrey, Radu Timofte, William Chou, Ke Wang, Max Wu, Tim Hartley, and Luc Van Gool. 2018a. “AI Benchmark: Running Deep Neural Networks on Android Smartphones.”
-
-

Disadvatages

+
+

Disadvantages

While providing some advantages, general-purpose CPUs also come with limitations for AI workloads.

Lower Throughput than Accelerators
@@ -1625,7 +1625,7 @@

Quantum techniques may first make inroads for optimization before more generalized ML adoption. Realizing the full potential of quantum ML awaits major milestones in quantum hardware development and ecosystem maturity.

-