<!DOCTYPE html><!--Author: Pranav Rajpurkar 2016-->
<html>
<head>
<meta charset="utf-8">
<title>AdvGLUE Benchmark</title>
<meta name="description" content="AdvGLUE is the Adversarial GLUE Benchmark">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1, user-scalable=no">
<meta property="og:image" content="/logo.png">
<link rel="image_src" type="image/png" href="/logo.png">
<link rel="shortcut icon" href="/favicon.ico" type="image/x-icon">
<link rel="icon" href="/favicon.ico" type="image/x-icon">
<link rel="stylesheet" href="/bower_components/bootstrap/dist/css/bootstrap.min.css">
<link rel="stylesheet" href="/stylesheets/layout.css">
<link rel="stylesheet" href="/stylesheets/index.css">
<script async defer src="https://buttons.github.io/buttons.js"></script>
<script src="/javascripts/analytics.js"></script>
<meta name="google-site-verification" content="SZfYjjinH_LiHzrdZwpRCVPsr_55vXApuX1YZqnO5Mo"/>
</head>
<body>
<div class="navbar navbar-default navbar-fixed-top" id="topNavbar" role="navigation">
<div class="container clearfix" id="navContainer">
<div class="rightNav">
<div class="collapseDiv">
<button class="navbar-toggle collapsed" type="button" data-toggle="collapse" data-target="#navbar"
aria-expanded="false" aria-controls="navbar"><span
class="glyphicon glyphicon-menu-hamburger"></span></button>
</div>
<div class="collapse navbar-collapse" id="navbar">
<ul class="nav navbar-nav navbar-right">
<li><a href="/">Home</a></li>
<li><a href="https://github.com/AI-secure/adversarial-glue">GitHub</a></li>
<li><a href="https://arxiv.org/abs/2111.02840">Paper</a></li>
<li><a href="/explore">AdvExplore</a></li>
</ul>
</div>
</div>
<div class="leftNav">
<div class="brandDiv"><a class="navbar-brand" href="/">AdvGLUE</a></div>
</div>
</div>
</div>
<div class="cover" id="topCover">
<div class="container">
<div class="row">
<!-- <div class="col-md-4">-->
<!-- <img src="logo.png" width="150" height="150" style="float:right;"/>-->
<!-- </div>-->
<div class="col-md-3">
</div>
<div class="col-md-2">
<img src="/logo.png" width="150" height="150"/>
</div>
<div class="col-md-5">
<h1 id="appTitle"><b>AdvGLUE</b></h1>
<h2 id="appSubtitle">The Adversarial GLUE Benchmark</h2>
</div>
</div>
</div>
</div>
<div class="cover" id="contentCover">
<div class="container">
<div class="row">
<div class="col-md-5">
<div class="infoCard">
<div class="infoBody">
<div class="infoHeadline"><h2>What is AdvGLUE?</h2></div>
<p><span>Adversarial GLUE Benchmark (AdvGLUE)</span> is a comprehensive benchmark for
evaluating the adversarial robustness of language models. It covers five natural
language understanding tasks from the GLUE benchmark and serves as an adversarial
version of GLUE. </p>
<!-- <iframe id="igraph" scrolling="no" style="border:none;" seamless="seamless" src="https://plotly.com/~xcj/5.embed" height="100%" width="100%"></iframe>-->
<hr>
<p> AdvGLUE considers textual adversarial attacks from different perspectives and
levels of granularity, including word-level transformations, sentence-level manipulations,
and human-written adversarial examples, which together provide comprehensive coverage of
adversarial linguistic phenomena. </p>
<a class="btn actionBtn" href="/explore">Explore Statistics and Examples of AdvGLUE</a>
<a class="btn actionBtn" href="https://openreview.net/forum?id=GF9cSKI3A_q">AdvGLUE
paper </a>
<hr>
<p> The quality of the AdvGLUE benchmark is validated by human annotators: each adversarial
example in the AdvGLUE dataset has high agreement among annotators. To make sure the
annotators fully understand the GLUE tasks, each worker must pass a training step to qualify
for the main filtering tasks on the generated adversarial examples. </p>
<a class="btn actionBtn" href="/instructions">Explore Human Evaluation</a>
<hr>
<p> <strong style="color: orangered">News [1/25/2024]</strong> We are excited to release our test set with detailed annotations. Please check out our <a href="dataset/test_ann.json">test dataset</a> here. Note that we also include the benign GLUE dev set, with <code>method</code> labeled as <code>glue</code>.</p>
<p> <strong style="color: orangered">News [3/20/2023]</strong> We added a dev set with detailed annotations of our adversarial dataset, including the adversarial attack method, benign examples, and more. Please check out our <a href="dataset/dev_ann.json">new dev dataset</a>. </p>
<div class="infoHeadline"><h2>Getting Started</h2></div>
<p>We have built a few resources to help you get started with the dataset.</p>
<p> Download a copy of the dataset (distributed under the <a
href="http://creativecommons.org/licenses/by-sa/4.0/legalcode">CC BY-SA 4.0</a>
license):
<ul class="list-unstyled">
<li><a class="btn actionBtn inverseBtn" href="/dataset/dev.zip" download>Dev
Set (199 KB)</a></li>
<li><a class="btn actionBtn inverseBtn" href="/dataset/dev_ann.json" download>Dev
Set with Detailed Annotations (334 KB)</a></li>
<li><a class="btn actionBtn inverseBtn" href="/dataset/test_ann.json" download>Test
Set with Detailed Annotations (22.1 MB)</a></li>
</ul>
</p><p>To evaluate your models, we also provide the evaluation script we will use
for official evaluation, along with a sample prediction file that the script takes as input.
To run the evaluation, use <code>python evaluate.py &lt;path_to_dev&gt; &lt;path_to_predictions&gt;</code>.
<ul class="list-unstyled">
<li><a class="btn actionBtn inverseBtn"
href="https://worksheets.codalab.org/rest/bundles/0xdebf0dffbcad42dc92ecbe71679863cf/contents/blob/"
download>Evaluation Script</a></li>
<li><a class="btn actionBtn inverseBtn"
href="https://worksheets.codalab.org/bundles/0xcb8ebb3034cc498d87ef7521f1f7cfc0/"
download>Sample Prediction File (on Dev)</a></li>
</ul>
</p><p>Once you have built a model that works to your expectations on the dev set, you can submit
it to get official scores on the dev set and a hidden test set. To preserve the integrity of test
results, we do not release the test set to the public. Instead, we require you to submit your
model so that we can run it on the test set for you. Here's a tutorial walking you through
official evaluation of your model:</p><a class="btn actionBtn inverseBtn"
href="https://worksheets.codalab.org/worksheets/0x023aaebc1cd74f3fb8eccc57643687dd/">Submission
Tutorial</a>
<p>Because AdvGLUE is an ongoing effort, we expect the dataset to evolve.</p>
<div class="infoHeadline"><h2>Have Questions?</h2></div>
<p> Ask us questions at <a href="mailto:[email protected]">[email protected]</a> and <a
href="mailto:[email protected]">[email protected]</a>.</p>
<div class="infoHeadline"><h2>Acknowledgements</h2></div>
<p> We thank the <a href="https://rajpurkar.github.io/SQuAD-explorer/">SQuAD team</a> for
allowing us to use their website template and submission tutorials. </p></div>
<div class="infoSubheadline">
<!-- <a class="github-button" href="https://github.com/adversarialglue" data-icon="octicon-star"-->
<!-- data-style="mega" data-count-href="/rajpurkar/stargazers"-->
<!-- data-count-api="/repos/rajpurkar#stargazers_count"-->
<!-- data-count-aria-label="# stargazers on GitHub"-->
<!-- aria-label="Star rajpurkar on GitHub">Star</a>-->
</div>
</div>
</div>
<div class="col-md-7">
<div class="infoCard">
<div class="infoBody">
<div class="infoHeadline"><h2>Leaderboard</h2></div>
<p>AdvGLUE is an adversarial robustness evaluation benchmark that thoroughly tests and analyzes
the
vulnerabilities of natural language understanding systems to different adversarial
transformations.</p>
<table class="table performanceTable">
<tr>
<th>Rank</th>
<th>Model</th>
<th>Score</th>
</tr>
<!-- Replace Here-->
<tr>
<td><p>1</p><span class="date label label-default">Jul 24, 2022</span></td>
<td style="word-break:break-word;"><a class="link"
href="/models/TBD-name (single).html">TBD-name
(single)</a>
<p class="institution">CASIA</p></td>
<td><b>0.6545</b></td>
</tr>
<tr>
<td><p>2</p><span class="date label label-default">Mar 31, 2022</span></td>
<td style="word-break:break-word;"><a class="link"
href="/models/CreAT (single model).html">CreAT
(single model)</a>
<p class="institution">SJTU</p></td>
<td>0.6249</td>
</tr>
<tr>
<td><p>3</p><span class="date label label-default">Aug 29, 2021</span></td>
<td style="word-break:break-word;"><a class="link"
href="/models/DeBERTa (single model).html">DeBERTa
(single model)</a>
<p class="institution">UIUC</p></td>
<td>0.6086</td>
</tr>
<tr>
<td><p>4</p><span class="date label label-default">Aug 29, 2021</span></td>
<td style="word-break:break-word;"><a class="link"
href="/models/ALBERT (single model).html">ALBERT
(single model)</a>
<p class="institution">UIUC</p></td>
<td>0.5922</td>
</tr>
<tr>
<td><p>5</p><span class="date label label-default">Aug 29, 2021</span></td>
<td style="word-break:break-word;"><a class="link"
href="/models/T5 (single model).html">T5 (single
model)</a>
<p class="institution">UIUC</p></td>
<td>0.5682</td>
</tr>
<tr>
<td><p>6</p><span class="date label label-default">Aug 29, 2021</span></td>
<td style="word-break:break-word;"><a class="link"
href="/models/SMART_RoBERTa (single model).html">SMART_RoBERTa
(single model)</a>
<p class="institution">UIUC</p></td>
<td>0.5371</td>
</tr>
<tr>
<td><p>7</p><span class="date label label-default">Aug 29, 2021</span></td>
<td style="word-break:break-word;"><a class="link"
href="/models/FreeLB (single model).html">FreeLB
(single model)</a>
<p class="institution">UIUC</p></td>
<td>0.5048</td>
</tr>
<tr>
<td><p>8</p><span class="date label label-default">Aug 29, 2021</span></td>
<td style="word-break:break-word;"><a class="link"
href="/models/RoBERTa (single model).html">RoBERTa
(single model)</a>
<p class="institution">UIUC</p></td>
<td>0.5021</td>
</tr>
<tr>
<td><p>9</p><span class="date label label-default">Aug 29, 2021</span></td>
<td style="word-break:break-word;"><a class="link"
href="/models/InfoBERT (single model).html">InfoBERT
(single model)</a>
<p class="institution">UIUC</p></td>
<td>0.4603</td>
</tr>
<tr>
<td><p>10</p><span class="date label label-default">Aug 29, 2021</span></td>
<td style="word-break:break-word;"><a class="link"
href="/models/ELECTRA (single model).html">ELECTRA
(single model)</a>
<p class="institution">UIUC</p></td>
<td>0.4169</td>
</tr>
<tr>
<td><p>11</p><span class="date label label-default">Aug 29, 2021</span></td>
<td style="word-break:break-word;"><a class="link"
href="/models/BERT (single model).html">BERT
(single model)</a>
<p class="institution">UIUC</p></td>
<td>0.3369</td>
</tr>
<tr>
<td><p>12</p><span class="date label label-default">Aug 29, 2021</span></td>
<td style="word-break:break-word;"><a class="link"
href="/models/SMART_BERT (single model).html">SMART_BERT
(single model)</a>
<p class="institution">UIUC</p></td>
<td>0.3029</td>
</tr>
<!-- Replace End-->
</table>
</div>
</div>
</div>
</div>
</div>
</div>
<nav class="navbar navbar-default navbar-static-bottom footer">
<div class="container clearfix">
<div class="rightNav">
<div>
<ul class="nav navbar-nav navbar-right">
<li><a href="/">AdvGLUE</a></li>
<li><a href="https://aisecure.github.io/">UIUC Secure Learning Lab</a></li>
<li><a href="https://www.microsoft.com/en-us/research/">Microsoft Research</a></li>
</ul>
</div>
</div>
</div>
</nav>
<script src="/bower_components/jquery/dist/jquery.min.js"></script>
<script src="/bower_components/bootstrap/dist/js/bootstrap.min.js"></script>
</body>
</html>