GEOMESA-3377 Remove spark-jupyter-vegas support
elahrvivaz committed Jul 17, 2024
1 parent 90eb030 commit ce15f81
Showing 8 changed files with 11 additions and 142 deletions.
docs/tutorials/broadcast-join.rst (2 changes: 1 addition & 1 deletion)

@@ -60,7 +60,7 @@ interactive Spark REPL with all dependencies needed for running Spark with GeoMe

 .. code-block:: bash

-    $ bin/spark-shell --jars geomesa-accumulo-spark-runtime-accumulo2_${VERSION}.jar
+    $ bin/spark-shell --jars geomesa-accumulo-spark-runtime-accumulo21_${VERSION}.jar

 .. note::
docs/tutorials/dwithin-join.rst (2 changes: 1 addition & 1 deletion)

@@ -50,7 +50,7 @@ interactive Spark REPL with all dependencies needed for running Spark with GeoMe

 .. code-block:: bash

-    $ bin/spark-shell --jars geomesa-accumulo-spark-runtime-accumulo2_${VERSION}.jar
+    $ bin/spark-shell --jars geomesa-accumulo-spark-runtime-accumulo21_${VERSION}.jar

 .. note::
docs/user/spark/core.rst (2 changes: 1 addition & 1 deletion)

@@ -52,7 +52,7 @@ to the ``spark-submit`` command via the ``--jars`` option:

 .. code-block:: bash

-    --jars file://path/to/geomesa-accumulo-spark-runtime-accumulo2_${VERSION}.jar
+    --jars file://path/to/geomesa-accumulo-spark-runtime-accumulo21_${VERSION}.jar

 or passed to Spark via the appropriate mechanism in notebook servers such as
 Jupyter (see :doc:`jupyter`) or Zeppelin.
docs/user/spark/jupyter.rst (49 changes: 2 additions & 47 deletions)

@@ -73,7 +73,7 @@ GeoMesa:

     # bundled GeoMesa Accumulo Spark and Spark SQL runtime JAR
     # (contains geomesa-accumulo-spark, geomesa-spark-core, geomesa-spark-sql, and dependencies)
-    jars="file://$GEOMESA_ACCUMULO_HOME/dist/spark/geomesa-accumulo-spark-runtime-accumulo2_$VERSION.jar"
+    jars="file://$GEOMESA_ACCUMULO_HOME/dist/spark/geomesa-accumulo-spark-runtime-accumulo21_$VERSION.jar"

     # uncomment to use the converter RDD provider
     #jars="$jars,file://$GEOMESA_ACCUMULO_HOME/lib/geomesa-spark-converter_$VERSION.jar"

@@ -103,8 +103,7 @@ GeoMesa:

 You may also consider adding ``geomesa-tools_${VERSION}-data.jar`` to include prepackaged converters for
 publicly available data sources (as described in :ref:`prepackaged_converters`),
 ``geomesa-spark-jupyter-leaflet_${VERSION}.jar`` to include an interface for the `Leaflet`_ spatial visualization
-library (see :ref:`jupyter_leaflet`, below), and/or ``geomesa-spark-jupyter-vegas_${VERSION}.jar`` to use the `Vegas`_ data
-plotting library (see :ref:`jupyter_vegas`, below).
+library (see :ref:`jupyter_leaflet`, below).

 Running Jupyter
 ---------------

@@ -236,49 +235,6 @@ fillOpacity Number Fill opacity

 Note: Options are comma-separated (i.e. ``{ color: "#ff0000", fillColor: "#0000ff" }``)

-.. _jupyter_vegas:
-
-Vegas for Plotting
-------------------
-
-The `Vegas`_ library may be used with GeoMesa, Spark, and Toree in Jupyter to plot quantitative data. The
-``geomesa-spark-jupyter-vegas`` module builds a shaded JAR containing all of the dependencies needed to run Vegas in
-Jupyter+Toree. This module must be built from source, using the ``vegas`` profile:
-
-.. code-block:: bash
-
-    $ mvn clean install -Pvegas -pl geomesa-spark/geomesa-spark-jupyter-vegas
-
-This will build ``geomesa-spark-jupyter-vegas_${VERSION}.jar`` in the ``target`` directory of the module, and
-should be added to the list of JARs in the ``jupyter toree install`` command described in
-:ref:`jupyter_configure_toree`:
-
-.. code-block:: bash
-
-    jars="$jars,file:///path/to/geomesa-spark-jupyter-vegas_${VERSION}.jar"
-    # then continue with "jupyter toree install" as before
-
-To use Vegas within Jupyter, load the appropriate libraries and a displayer:
-
-.. code-block:: scala
-
-    import vegas._
-    import vegas.render.HTMLRenderer._
-    import vegas.sparkExt._
-
-    implicit val displayer: String => Unit = { s => kernel.display.content("text/html", s) }
-
-Then use the ``withDataFrame`` method to plot data in a ``DataFrame``:
-
-.. code-block:: scala
-
-    Vegas("Simple bar chart").
-      withDataFrame(df).
-      encodeX("a", Ordinal).
-      encodeY("b", Quantitative).
-      mark(Bar).
-      show(displayer)

 .. _Apache Toree: https://toree.apache.org/
 .. _Docker: https://www.docker.com/
 .. _JupyterLab: https://jupyterlab.readthedocs.io/

@@ -287,4 +243,3 @@ Then use the ``withDataFrame`` method to plot data in a ``DataFrame``:

 .. _Python: https://www.python.org/
 .. _SBT: https://www.scala-sbt.org/
 .. _Spark: https://spark.apache.org/
-.. _Vegas: https://github.com/vegas-viz/Vegas
docs/user/spark/pyspark.rst (6 changes: 3 additions & 3 deletions)

@@ -23,7 +23,7 @@ the providers outlined in :ref:`spatial_rdd_providers`.

     mvn clean install -Ppython
     pip3 install geomesa-spark/geomesa_pyspark/target/geomesa_pyspark-$VERSION.tar.gz
-    cp geomesa-accumulo/geomesa-accumulo-spark-runtime-accumulo2/target/geomesa-accumulo-spark-runtime-accumulo2_${VERSION}.jar /path/to/
+    cp geomesa-accumulo/geomesa-accumulo-spark-runtime-accumulo21/target/geomesa-accumulo-spark-runtime-accumulo21_${VERSION}.jar /path/to/

 Alternatively, you can use ``conda-pack`` to bundle the dependencies for your project. This may be more appropriate if
 you have additional dependencies.

@@ -39,7 +39,7 @@ you have additional dependencies.

     # Install additional dependencies using conda or pip here
     conda pack -o environment.tar.gz
-    cp geomesa-accumulo/geomesa-accumulo-spark-runtime-accumulo2/target/geomesa-accumulo-spark-runtime-accumulo2_${VERSION}.jar /path/to/
+    cp geomesa-accumulo/geomesa-accumulo-spark-runtime-accumulo21/target/geomesa-accumulo-spark-runtime-accumulo21_${VERSION}.jar /path/to/

 .. warning::

     ``conda-pack`` currently has issues with Python 3.8, and ``pyspark`` has issues with Python 3.9, hence the explicit

@@ -57,7 +57,7 @@ the ``pyspark`` library.

     import geomesa_pyspark
     conf = geomesa_pyspark.configure(
-        jars=['/path/to/geomesa-accumulo-spark-runtime-accumulo2_${VERSION}.jar'],
+        jars=['/path/to/geomesa-accumulo-spark-runtime-accumulo21_${VERSION}.jar'],
         packages=['geomesa_pyspark','pytz'],
         spark_home='/path/to/spark/').\
         setAppName('MyTestApp')
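For orientation, the ``geomesa_pyspark.configure`` call shown in the pyspark.rst hunk above produces a Spark configuration object; a minimal sketch of how a session is then typically created from it follows. This is illustrative only: it assumes ``pyspark`` and the ``geomesa_pyspark`` package built above are installed, and the JAR path and app name are placeholders taken from the diff context, not values the commit prescribes.

```python
# Sketch (assumed setup): continue from the `conf` built by geomesa_pyspark.configure.
# Requires an installed Spark distribution plus the geomesa_pyspark package;
# the jar path below is a placeholder, matching the renamed accumulo21 artifact.
import geomesa_pyspark
from pyspark.sql import SparkSession

conf = geomesa_pyspark.configure(
    jars=['/path/to/geomesa-accumulo-spark-runtime-accumulo21_${VERSION}.jar'],
    packages=['geomesa_pyspark', 'pytz'],
    spark_home='/path/to/spark/').\
    setAppName('MyTestApp')

# Build the session from the prepared configuration, which injects the
# GeoMesa runtime JAR so its UDTs/UDFs are available to Spark SQL.
spark = SparkSession.builder.config(conf=conf).getOrCreate()
```

This fragment cannot run without a Spark installation, so treat it as a configuration sketch rather than a verified example.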
docs/user/spark/zeppelin.rst (14 changes: 3 additions & 11 deletions)

@@ -17,7 +17,7 @@ Configuring Zeppelin with GeoMesa
 ---------------------------------

 The GeoMesa Accumulo Spark runtime JAR may be found either in the ``dist/spark`` directory of the GeoMesa Accumulo
-binary distribution, or (after building) in the ``geomesa-accumulo/geomesa-accumulo-spark-runtime-accumulo2/target``
+binary distribution, or (after building) in the ``geomesa-accumulo/geomesa-accumulo-spark-runtime-accumulo21/target``
 directory of the GeoMesa source distribution.

 .. note::

@@ -29,8 +29,8 @@ directory of the GeoMesa source distribution.

 #. Scroll to the bottom where the "Spark" interpreter configuration appears.
 #. Click on the "edit" button next to the interpreter name (on the right-hand side of the UI).
 #. In the "Dependencies" section, add the GeoMesa JAR, either as
-   a. the full local path to the ``geomesa-accumulo-spark-runtime-accumulo2_${VERSION}.jar`` described above, or
-   b. the Maven groupId:artifactId:version coordinates (``org.locationtech.geomesa:geomesa-accumulo-spark-runtime-accumulo2_2.12:$VERSION``)
+   a. the full local path to the ``geomesa-accumulo-spark-runtime-accumulo21_${VERSION}.jar`` described above, or
+   b. the Maven groupId:artifactId:version coordinates (``org.locationtech.geomesa:geomesa-accumulo-spark-runtime-accumulo21_2.12:$VERSION``)
 #. Click "Save". When prompted by the pop-up, click to restart the Spark interpreter.

 It is not necessary to restart Zeppelin.

@@ -51,18 +51,10 @@ may be used to print a ``DataFrame`` via this display system:

       dfc.foreach(r => println(r.mkString("\t")))
     }

-It is also possible to use third-party libraries such as `Vegas`_, as described in :ref:`jupyter_vegas`. For
-Zeppelin, the following implicit ``displayer`` method should be used:
-
-.. code-block:: scala
-
-    implicit val displayer: String => Unit = s => println("%html\n"+s)
-
 .. |zeppelin_version| replace:: 0.7.0

 .. _configuring the Zeppelin Spark interpreter: https://zeppelin.apache.org/docs/0.7.0/interpreter/spark.html
 .. _Spark: https://spark.apache.org/
-.. _Vegas: https://github.com/vegas-viz/Vegas/
 .. _Zeppelin: https://zeppelin.apache.org/
 .. _Zeppelin installation instructions: https://zeppelin.apache.org/docs/0.7.0/install/install.html
 .. _Zeppelin table display system: https://zeppelin.apache.org/docs/0.7.0/displaysystem/basicdisplaysystem.html#table
geomesa-spark/geomesa-spark-jupyter-vegas/pom.xml (68 changes: 0 additions & 68 deletions)

This file was deleted.
geomesa-spark/pom.xml (10 changes: 0 additions & 10 deletions)

@@ -22,14 +22,4 @@

     <module>geomesa_pyspark</module>
   </modules>

-  <profiles>
-    <profile>
-      <!-- built in a profile for licensing reasons -->
-      <id>vegas</id>
-      <modules>
-        <module>geomesa-spark-jupyter-vegas</module>
-      </modules>
-    </profile>
-  </profiles>

 </project>