Commit 7b03f43

Minor spacing fix before reference
profvjreddi committed Dec 9, 2023
1 parent 4c1a3ae commit 7b03f43
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion hw_acceleration.qmd
@@ -530,7 +530,7 @@ efficiency gains compared to isolated efforts.

Jointly exploring innovations in neural network architectures along with custom hardware design is a powerful co-design technique. This allows finding ideal pairings tailored to each other's strengths [@sze2017efficient].

- For instance, the shift to mobile architectures like MobileNets [@howard_mobilenets_2017] was guided by edge device constraints like model size and latency. The quantization [@jacob2018quantization] and pruning techniques [@gale2019state] that unlocked these efficient models became possible thanks to hardware accelerators with native low-precision integer support and pruning support[@mishrapruning].
+ For instance, the shift to mobile architectures like MobileNets [@howard_mobilenets_2017] was guided by edge device constraints like model size and latency. The quantization [@jacob2018quantization] and pruning techniques [@gale2019state] that unlocked these efficient models became possible thanks to hardware accelerators with native low-precision integer support and pruning support [@mishrapruning].

Attention-based models have thrived on massively parallel GPUs and ASICs where their computation maps well spatially, as opposed to RNN architectures reliant on sequential processing. Co-evolution of algorithms and hardware unlocked new capabilities.
