Jupyter
The Kotlin Spark API also supports Kotlin Jupyter notebooks. To use it, simply add
%use spark
to the top of your notebook. This fetches the latest version of the API together with the latest version of Spark. To pin a specific version of Spark or of the API itself, pass it as a parameter like this:
%use spark(spark=3.2, v=1.1.0)
NOTE: You need kotlin-jupyter-kernel to be at least version 0.11.0.83 for the Kotlin Spark API to work. Also, if the %use spark magic does not output "Spark session has been started...", and %use spark-streaming doesn't work at all, add %useLatestDescriptors above it.
Inside the notebook, a Spark session is started automatically and can be accessed via the spark value. sc: JavaSparkContext can also be accessed directly. Apart from that, the API works much as it does outside a notebook.
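As a rough illustration of using the spark and sc values, the cell below is a minimal sketch that relies only on the plain Spark Java API (SparkSession.createDataset and JavaSparkContext.parallelize). It assumes an earlier cell has already run %use spark; the variable names numbers and values are made up for the example.

```kotlin
// Notebook cell; assumes an earlier cell already ran `%use spark`,
// so `spark` (SparkSession) and `sc` (JavaSparkContext) are in scope.
import org.apache.spark.sql.Encoders

// Build a small Dataset through the session that the notebook provides.
val numbers = spark.createDataset(listOf(1, 2, 3), Encoders.INT())
numbers.show()

// Build a JavaRDD through the provided context and collect it back to the driver.
val values = sc.parallelize(listOf(1, 2, 3)).collect()
println(values)
```

The Kotlin Spark API ships its own, more concise helpers for creating Datasets and RDDs; the plain Spark calls are used here only because they are the same in every setup.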
There is also support for HTML rendering of Datasets and simple (Java)RDDs. Check out the example as well.
To use Spark Streaming, use the following instead:
%use spark-streaming
This does not start a Spark session right away, meaning you can call withSparkStreaming(batchDuration) {} in whichever cell you want.
Check out the example.
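A call to withSparkStreaming in a later cell might look roughly like the sketch below. It assumes an earlier cell has already run %use spark-streaming. The timeout argument, the ssc streaming context inside the block, and the socket source on localhost:9999 are assumptions chosen for illustration, so check them against the version of the API you are using.

```kotlin
// Notebook cell; assumes an earlier cell already ran `%use spark-streaming`.
import org.apache.spark.streaming.Durations

// batchDuration is the parameter mentioned above. The timeout (in milliseconds)
// is an assumed parameter, used here so the stream stops by itself.
withSparkStreaming(batchDuration = Durations.seconds(1), timeout = 10_000) {
    // `ssc` is assumed to be the JavaStreamingContext exposed inside this block.
    // The text source is hypothetical, e.g. one started with `nc -lk 9999`.
    val lines = ssc.socketTextStream("localhost", 9999)

    // Print the first elements of every batch to the cell output.
    lines.print()
}
```

Because the session is only created inside the block, nothing runs until this cell is executed, which is what allows it to be placed in whichever cell suits the notebook.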