<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>SKDAVS</title>
    <link rel="stylesheet" type="text/css" href="assets/scripts/bulma.min.css">
    <link rel="stylesheet" type="text/css" href="assets/scripts/theme.css">
    <link rel="stylesheet" type="text/css" href="https://cdn.bootcdn.net/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css">
  </head>
  <body>
    <section class="hero is-light" style="">
      <div class="hero-body" style="padding-top: 50px;">
        <div class="container" style="text-align: center;margin-bottom:5px;">
          <h1 class="title">
            Spatiotemporal Knowledge Distillation for Efficient
          </h1>
          <h1 class="title">
            Estimation of Aerial Video Saliency
          </h1>

          <div class="author">Jia Li<sup>1</sup></div>
          <div class="author">Kui Fu<sup>1,3</sup></div>
          <div class="author">Shengwei Zhao<sup>2,4</sup></div>
          <div class="author">Shiming Ge<sup>2</sup></div>
          <div class="group">
            <a href="http://cvteam.net/">CVTEAM</a>
          </div>
          <div class="aff">
            <p><sup>1</sup>State Key Laboratory of Virtual Reality Technology and Systems, SCSE, Beihang University, Beijing, China</p>
            <p><sup>2</sup>Institute of Information Engineering, Chinese Academy of Sciences</p>
            <p><sup>3</sup>Peng Cheng Laboratory</p>
            <p><sup>4</sup> School of Cyber Security, University of Chinese Academy of Sciences</p>
          </div>
          <div class="con">
            <p  style="font-size: 24px; margin-top:5px; margin-bottom: 15px;">
            TIP 2020
            </p>
          </div>
          <div class="columns">
            <div class="column"></div>
            <div class="column"></div>
            <div class="column">
              <a href="http://cvteam.net/papers/2019-TIP-Spatiotemporal%20Knowledge%20Distillation%20for%20Efficient%20Estimation%20of%20Aerial%20Video%20Saliency.pdf" target="_blank">
                <p class="link">Paper</p>
              </a>
            </div>
            <div class="column">
              <a href="https://github.com/iCVTEAM/SKDAVS/" target="_blank">
                <p class="link">Code</p>
              </a>
            </div>
            <div class="column"></div>
            <div class="column"></div>
          </div>
        </div>
      </div>
    </section>
    <div style="text-align: center;">
      <div class="container" style="max-width:850px">
        <div style="text-align: center;">
          <img src="assets/SKDAVS/head.png" class="centerImage" alt="System framework of SKD">
        </div>
      </div>
      <div class="head_cap">
        <p style="color:gray;">
           System framework of SKD.
        </p>
      </div>
    </div>
    <section class="hero">
      <div class="hero-body">
        <div class="container" style="max-width: 800px" >
          <h1 style="">Abstract</h1>
          <p  style="text-align: justify; font-size: 17px;">
            Video saliency estimation has advanced significantly with
            the rapid development of Convolutional Neural Networks (CNNs).
            However, devices such as cameras and drones may have limited
            computational capability and storage space, which makes
            the direct deployment of complex deep saliency models
            infeasible. To address this problem, this paper
            proposes a dynamic saliency estimation approach for aerial
            videos via spatiotemporal knowledge distillation. The
            approach involves five components: two
            teachers, two students, and the desired spatiotemporal model.
            The knowledge of spatial and temporal saliency is first
            transferred separately from the two complex and redundant
            teachers to their simple and compact students, while the
            input scenes are degraded from high resolution to
            low resolution to remove probable data redundancy and
            greatly speed up the feature extraction process.
            After that, the desired spatiotemporal model is
            trained by distilling and encoding the spatial and
            temporal saliency knowledge of the two students into a unified
            network. In this manner, the inter-model redundancy can be
            removed for the effective estimation of dynamic saliency on
            aerial videos. Experimental results show that the proposed
            approach performs comparably to 11 state-of-the-art models in
            estimating visual saliency on aerial videos, while its
            speed reaches up to 28,738 FPS and 1,490.5
            FPS on the GPU and CPU platforms, respectively.
          </p>
        </div>
      </div>
    </section>
    <section class="hero is-light" style="background-color:#FFFFFF;">
      <div class="hero-body">
        <div class="container" style="max-width:800px;margin-bottom:20px;">
          <h1>
            Qualitative comparisons
          </h1>
        </div>
      <div style="text-align: center;">
        <div class="container" style="max-width:850px">
          <div style="text-align: center;">
            <img src="assets/SKDAVS/comp.png" class="centerImage" alt="Representative frames of the models on AVS1K">
          </div>
        </div>
        <div class="head_cap">
          <p style="color:gray;">
            Representative frames of the models on AVS1K. (a) Video frame, (b) Ground truth, (c) HFT, 
            (d) SP, (e) PNSP, (f) SSD, (g) LDS, (h) eDN, (i) iSEEL, (j) DVA, (k) SalNet, (l) STS, (m) SKD.
          </p>
        </div>
      </div>
      </div>
    </section>
  <section class="hero" style="padding-top:0px;">
    <div class="hero-body">
      <div class="container" style="max-width:800px;">
  <div class="card">
  <header class="card-header">
    <p class="card-header-title">
      BibTeX Citation
    </p>

    <a class="card-header-icon button-clipboard" style="border:0px; background: inherit;" data-clipboard-target="#bibtex-info" >
      <i class="fa fa-copy" height="20px"></i>
    </a>
  </header>
    <div class="card-content">
<pre style="background-color:inherit;padding: 0px;" id="bibtex-info">@article{li2019spatiotemporal,
  title={Spatiotemporal knowledge distillation for efficient estimation of aerial video saliency},
  author={Li, Jia and Fu, Kui and Zhao, Shengwei and Ge, Shiming},
  journal={IEEE Transactions on Image Processing},
  volume={29},
  pages={1902--1914},
  year={2019},
  publisher={IEEE}
}</pre>
    </div>
  </div>
      </div>
    </div>
  </section>
    <script type="text/javascript" src="assets/scripts/clipboard.min.js"></script>
    <script>
      new ClipboardJS('.button-clipboard');
    </script>
  </body>
</html>