\chapter{Visualizing Data}\label{s:vis}
Tables and lists are great, but visualizations are often more effective---if
they're well designed and your audience is sighted, that is.
There are even more ways to visualize data in the browser
than there are front-end toolkits for JavaScript.
We have chosen to use \href{http://vega.github.io/}{Vega-Lite},
which is a \gref{g:declarative}{declarative} framework:
as a user,
you specify the data and settings,
and let the library take care of everything else.
It doesn't do everything,
but it does common things well and easily,
and it interacts nicely with React.
\section{Vega-Lite}\label{s:vis-vega-lite}
Let's start by creating a skeleton web page to hold our visualization.
For now, we will load Vega, Vega-Lite, and Vega-Embed from the web;
we'll worry about local installation later.
We will create a \texttt{div} to be filled in by the visualization---we
don't have to give it the ID \texttt{vis}, but it's common to do so---and
we will leave space for the script.
Our skeleton looks like this
(with lines broken for the benefit of the printed version):
\begin{minted}{html}
<!DOCTYPE html>
<html>
  <head>
    <title>Embedding Vega-Lite</title>
    <script src="https://cdnjs.cloudflare.com/ajax/libs\
/vega/3.0.7/vega.js"></script>
    <script src="https://cdnjs.cloudflare.com/ajax/libs\
/vega-lite/2.0.1/vega-lite.js"></script>
    <script src="https://cdnjs.cloudflare.com/ajax/libs\
/vega-embed/3.0.0-rc7/vega-embed.js"></script>
  </head>
  <body>
    <div id="vis"></div>
    <script type="text/javascript">
    </script>
  </body>
</html>
\end{minted}
We can now start filling in the script with the beginning of a visualization specification.
This is a blob of \gref{g:json}{JSON} with several top-level fields:
\begin{itemize}
\item
\texttt{\$schema} identifies the version of the spec being used (as a URL).
\item
\texttt{description} is a comment to remind us what we thought we were doing when we created this.
\item
\texttt{data} is the actual data.
\end{itemize}
\begin{minted}{html}
...rest of page as before...
<script type="text/javascript">
  let spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v2.0.json",
    "description": "Create data array but do not display anything.",
    "data": {
      "values": [
        {"a": "A", "b": 28},
        {"a": "B", "b": 55},
        {"a": "C", "b": 43},
        {"a": "D", "b": 91},
        {"a": "E", "b": 81},
        {"a": "F", "b": 53},
        {"a": "G", "b": 19},
        {"a": "H", "b": 87},
        {"a": "I", "b": 52}
      ]
    }
  }
</script>
...rest of page as before...
\end{minted}
In this case,
we represent a two-dimensional data table as an array of objects with explicit keys \texttt{"a"} and \texttt{"b"}.
We have to do this because JSON (like JavaScript) doesn't have a native representation
of two-dimensional arrays with row and column headers,
because programmers.
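If the data starts life as a header row plus rows of values,
as it would in a spreadsheet or CSV file,
a few lines of JavaScript can turn it into the array of objects that the \texttt{values} field expects.
The following is a minimal sketch rather than part of Vega-Lite:
the function name \texttt{rowsToRecords} and the sample table are our own inventions for illustration.
\begin{minted}{js}
// Hypothetical helper (not part of Vega-Lite): convert a header row
// and an array of data rows into an array of objects keyed by header.
const rowsToRecords = (headers, rows) => {
  return rows.map(row => {
    const record = {}
    headers.forEach((name, i) => {
      record[name] = row[i]
    })
    return record
  })
}

// Sample table using the first three rows of the data shown above.
const headers = ['a', 'b']
const rows = [['A', 28], ['B', 55], ['C', 43]]
const values = rowsToRecords(headers, rows)

console.log(JSON.stringify(values))
\end{minted}
The resulting array has the same shape as the hand-written one,
so it could be used as the \texttt{values} entry of the spec's \texttt{data} field.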
Once we have created our spec,
we can call \texttt{vegaEmbed} with the ID of the element that will hold the visualization,
the spec,
and some options (which for now we will leave empty):
\begin{minted}{js}
let spec = {
  "$schema": "https://vega.github.io/schema/vega-lite/v2.0.json",
  "description": "Create data array but do not display anything.",
  "data": {
    "values": [
      // ...as above...
    ]
  }
}
vegaEmbed("#vis", spec, {})
\end{minted}
When we open the page, though, nothing appears,
because we haven't told Vega-Lite \emph{how} to display the data.
To do that,
we need to add two more fields to the spec:
\begin{itemize}
\item
\texttt{mark} specifies the visual element used to show the data.
\item
\texttt{encoding} tells Vega-Lite how to map data fields to properties of the marks,
such as their X and Y positions.
\end{itemize}
Here's our updated spec:
\begin{minted}{js}
let spec = {
  "$schema": "https://vega.github.io/schema/vega-lite/v2.0.json",
  "description": "Add mark and encoding for data.",
  "data": {
    "values": [
      // ...as above...
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "a", "type": "ordinal"},
    "y": {"field": "b", "type": "quantitative"}
  }
}
vegaEmbed("#vis", spec, {})
\end{minted}
When we open the page now,
we see a bar chart,
and feel very proud of ourselves
(\figref{f:vis-mark-encoding}).
\figpdf{figures/vis-mark-encoding.png}{Mark and Encoding}{f:vis-mark-encoding}
There are also some poorly-styled links for various controls that we're not going to use.
We can fill in the options argument to \texttt{vegaEmbed} to turn those off:
\begin{minted}{js}
let spec = {
  "$schema": "https://vega.github.io/schema/vega-lite/v2.0.json",
  "description": "Disable control links.",
  "data": {
    // ...as before...
  },
  // ...mark and encoding as before...
}
let options = {
  "actions": {
    "export": false,
    "source": false,
    "editor": false
  }
}
vegaEmbed("#vis", spec, options)
\end{minted}
We now have the visualization we wanted
(\figref{f:vis-disable-controls}).
\begin{figure}
\centering
\includegraphics{figures/vis-disable-controls.png}
\caption{Without Controls}
\label{f:vis-disable-controls}
\end{figure}
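Because a spec is an ordinary JavaScript object,
we can build it with a function instead of typing it out each time.
The sketch below uses only the fields already shown;
the name \texttt{barChartSpec} is hypothetical,
and the function only assembles the object,
so the result still has to be passed to \texttt{vegaEmbed} in the page.
\begin{minted}{js}
// Hypothetical helper: assemble a bar chart spec like the one above
// from an array of records and the names of the fields to plot.
const barChartSpec = (values, xField, yField) => {
  return {
    '$schema': 'https://vega.github.io/schema/vega-lite/v2.0.json',
    'description': `Bar chart of ${yField} by ${xField}.`,
    'data': {'values': values},
    'mark': 'bar',
    'encoding': {
      'x': {'field': xField, 'type': 'ordinal'},
      'y': {'field': yField, 'type': 'quantitative'}
    }
  }
}

const spec = barChartSpec([{a: 'A', b: 28}, {a: 'B', b: 55}], 'a', 'b')
console.log(JSON.stringify(spec, null, 2))
// In the page, this would be followed by: vegaEmbed("#vis", spec, options)
\end{minted}
Keeping the spec in a function like this is one way to reuse the same chart layout for several data sets.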
Vega-Lite has a \emph{lot} of options:
for example,
we can use points and average the Y values.
(We will change the X data so that values aren't distinct in order to show this off,
because otherwise averaging doesn't do much.)
In our revised spec,
\texttt{x} is now \texttt{"nominal"} instead of \texttt{"ordinal"}
and \texttt{y} has an extra property \texttt{"aggregate"},
which is set to \texttt{"average"}
(but can be used to specify other \gref{g:aggregation-function}{aggregation functions}):
\begin{minted}{js}
let spec = {
  "$schema": "https://vega.github.io/schema/vega-lite/v2.0.json",
  "description": "Aggregate values and display them as points.",
  "data": {
    "values": [
      {"a": "P", "b": 19},
      {"a": "P", "b": 28},
      {"a": "P", "b": 91},
      {"a": "Q", "b": 55},
      {"a": "Q", "b": 81},
      {"a": "Q", "b": 87},
      {"a": "R", "b": 43},
      {"a": "R", "b": 52},
      {"a": "R", "b": 53}
    ]
  },
  "mark": "point",
  "encoding": {
    "x": {"field": "a", "type": "nominal"},
    "y": {"field": "b", "type": "quantitative", "aggregate": "average"}
  }
}
let options = {
  // ...disable controls as before...
}
vegaEmbed("#vis", spec, options)
\end{minted}
\figref{f:vis-aggregate-points} shows the result.
\begin{figure}
\centering
\includegraphics{figures/vis-aggregate-points.png}
\caption{Aggregating and Using Points}
\label{f:vis-aggregate-points}
\end{figure}
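To see what the \texttt{"average"} aggregate is doing,
we can reproduce the calculation in plain JavaScript:
group the records by \texttt{a} and take the mean of \texttt{b} within each group.
This is only an illustration of the arithmetic,
not how Vega-Lite implements it,
and \texttt{averageBy} is a name we have made up.
\begin{minted}{js}
// Hypothetical helper: mean of valueField within each group of keyField.
const averageBy = (records, keyField, valueField) => {
  const totals = new Map()
  for (const record of records) {
    const key = record[keyField]
    const entry = totals.get(key) || {sum: 0, count: 0}
    entry.sum += record[valueField]
    entry.count += 1
    totals.set(key, entry)
  }
  const result = {}
  for (const [key, {sum, count}] of totals) {
    result[key] = sum / count
  }
  return result
}

// Same data as in the spec above.
const values = [
  {a: 'P', b: 19}, {a: 'P', b: 28}, {a: 'P', b: 91},
  {a: 'Q', b: 55}, {a: 'Q', b: 81}, {a: 'Q', b: 87},
  {a: 'R', b: 43}, {a: 'R', b: 52}, {a: 'R', b: 53}
]
const averages = averageBy(values, 'a', 'b')
console.log(averages)
\end{minted}
For this data the mean for P is 46,
and the means for Q and R are roughly 74.3 and 49.3,
which should correspond to the vertical positions of the three points in the figure.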
\section{Local Installation}\label{s:vis-vega-local}
Loading Vega from a \gref{g:cdn}{Content Delivery Network} (CDN) reduces the load on our server,
but prevents offline development.
Since we want to be able to work when we're disconnected,
let's load from local files.
Step 1 is to slim down our HTML file so that it only loads our application:
\begin{minted}{html}
<!DOCTYPE html>
<html>
  <head>
    <title>Load Vega from a File</title>
    <meta charset="utf-8">
    <script src="app.js" async></script>
  </head>
  <body>
    <div id="vis"></div>
  </body>
</html>
\end{minted}
In step 2,
we \texttt{npm\ install\ vega\ vega-lite\ vega-embed} and \texttt{require('vega-embed')} in \texttt{app.js}:
\begin{minted}{js}
const vegaEmbed = require('vega-embed')
const spec = {
  // ...as before...
}
const options = {
  // ...as before...
}
vegaEmbed("#vis", spec, options)
\end{minted}
We launch this with Parcel via our saved \texttt{npm\ run} command:
\begin{minted}{shell}
$ npm run dev -- src/vis/react-01/index.html
\end{minted}
But nothing appears when we open \texttt{http://localhost:4000} in our browser.
Looking in the browser console,
we see a message telling us that \texttt{vegaEmbed} is not a function.
What we have tripped over is something that's still painful in 2018.
The old method of getting libraries is \texttt{require},
and that's still what Node supports as of Version 10.9.0.
The new standard is \texttt{import},
which allows a module to define a default value so that \texttt{import\ something\ from\ 'something'} gets a function, a class, or whatever.
This is really handy, but \texttt{require} doesn't work that way.
We can add the \texttt{-\/-experimental-modules} flag when using Node on the command line
and rename our files with a \texttt{.mjs} extension,
both of which are annoying.
Alternatively,
we can get the thing we want by accessing \texttt{.default} during import,
or by referring to \texttt{vegaEmbed.default} when we call it.
These choices are also annoying,
but after a bit of fiddling and cursing,
we decide to make the fix as the library is loaded:
\begin{minted}{js}
const vegaEmbed = require('vega-embed').default
// ...as before...
\end{minted}
The third option is to use \texttt{import} where we can
and fix the \texttt{require} statements in the server-side code when Node is upgraded.
We can call the thing we import anything we want,
but we will stick to \texttt{vegaEmbed} for consistency with previous examples:
\begin{minted}{js}
import vegaEmbed from 'vega-embed'
// ...as before...
\end{minted}
If we do this,
the bundled file is 74.5K lines of JavaScript,
but at least it's all in one place for distribution.
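The reason \texttt{.default} fixes the \texttt{require} version is that
a module written with \texttt{export\ default} is typically delivered to \texttt{require}
as an object whose \texttt{default} property holds the exported value.
The sketch below imitates that shape with a stand-in object so that it runs without installing anything;
\texttt{fakeModule} and \texttt{unwrapDefault} are hypothetical names,
and the exact shape a real bundler produces is an assumption.
\begin{minted}{js}
// Stand-in for what require('vega-embed') appeared to return in our case:
// an object wrapping the real function in a 'default' property.
// (Assumed shape for illustration; this is not the actual library.)
const fakeModule = {
  default: (selector, spec, options) => `embedding in ${selector}`
}

// Calling the module object itself fails with a TypeError,
// much like the error reported in the browser console.
let message = null
try {
  fakeModule('#vis', {}, {})
} catch (err) {
  message = err.message
}
console.log(message)

// Hypothetical helper: use the default export if there is one,
// otherwise use the module as-is.
const unwrapDefault = (mod) => {
  return (mod && mod.default !== undefined) ? mod.default : mod
}

const vegaEmbed = unwrapDefault(fakeModule)
console.log(vegaEmbed('#vis', {}, {}))
\end{minted}
A helper like this lets the same code cope with either module style,
at the cost of one more layer to explain to whoever reads the code next.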
\section{Exercises}\label{s:vis-exercises}
\exercise{Binned Scatterplots}
Vega-Lite can create
\href{https://vega.github.io/vega-lite/examples/circle_binned.html}{binned scatterplots}
in which the sizes of markers indicate how many values were put in each bin.
Modify the aggregating scatterplot shown above
so that values are binned in this way.
\exercise{Grouped Bar Charts}
Vega-Lite can display
\href{https://vega.github.io/vega-lite/examples/bar_grouped.html}{grouped bar charts}
as well as simple ones.
Find or create a simple data set and construct a grouped bar chart.
How impressed will your supervisor, your committee, or a future employer be
by your chosen color scheme?
\exercise{Limits of Declarative Programming}
Look at Vega-Lite's
\href{https://vega.github.io/vega-lite/examples/}{example gallery}
and identify one kind of plot or transformation you've used or seen
that \emph{isn't} included there.
Do you think this is because they just haven't gotten around to it yet,
or is there something about that plot or transformation
that doesn't lend itself to Vega-Lite's declarative model?
\exercise{Working With Arrays}
Vega-Lite is built on top of Vega, which in turn uses parts of a visualization toolkit called \href{https://d3js.org/}{D3},
which includes \href{https://github.com/d3/d3-array}{a library for manipulating arrays}.
Write a small application that generates 1000 random values using \texttt{Math.random}
and reports the mean, standard deviation, and quartiles.
(You may also want to create a histogram showing the distribution of values.)
\section*{Key Points}
\input{keypoints/vis}