diff --git a/docs/tutorials/broadcast-join.rst b/docs/tutorials/broadcast-join.rst
index 77ad892a609e..b0e4f5ab6da4 100644
--- a/docs/tutorials/broadcast-join.rst
+++ b/docs/tutorials/broadcast-join.rst
@@ -60,7 +60,7 @@ interactive Spark REPL with all dependencies needed for running Spark with GeoMe
 
 .. code-block:: bash
 
-    $ bin/spark-shell --jars geomesa-accumulo-spark-runtime-accumulo2_${VERSION}.jar
+    $ bin/spark-shell --jars geomesa-accumulo-spark-runtime-accumulo21_${VERSION}.jar
 
 .. note::
 
diff --git a/docs/tutorials/dwithin-join.rst b/docs/tutorials/dwithin-join.rst
index 707ed4836e33..f9fb7fd72388 100644
--- a/docs/tutorials/dwithin-join.rst
+++ b/docs/tutorials/dwithin-join.rst
@@ -50,7 +50,7 @@ interactive Spark REPL with all dependencies needed for running Spark with GeoMe
 
 .. code-block:: bash
 
-    $ bin/spark-shell --jars geomesa-accumulo-spark-runtime-accumulo2_${VERSION}.jar
+    $ bin/spark-shell --jars geomesa-accumulo-spark-runtime-accumulo21_${VERSION}.jar
 
 .. note::
 
diff --git a/docs/user/spark/core.rst b/docs/user/spark/core.rst
index 6ba2b4c9dbdb..e4b2cb24ceb2 100644
--- a/docs/user/spark/core.rst
+++ b/docs/user/spark/core.rst
@@ -52,7 +52,7 @@ to the ``spark-submit`` command via the ``--jars`` option:
 
 .. code-block:: bash
 
-    --jars file://path/to/geomesa-accumulo-spark-runtime-accumulo2_${VERSION}.jar
+    --jars file://path/to/geomesa-accumulo-spark-runtime-accumulo21_${VERSION}.jar
 
 or passed to Spark via the appropriate mechanism in notebook servers such as Jupyter (see :doc:`jupyter`) or
 Zeppelin.
diff --git a/docs/user/spark/jupyter.rst b/docs/user/spark/jupyter.rst
index 6540fedd941b..0a323182858a 100644
--- a/docs/user/spark/jupyter.rst
+++ b/docs/user/spark/jupyter.rst
@@ -73,7 +73,7 @@ GeoMesa:
 
     # bundled GeoMesa Accumulo Spark and Spark SQL runtime JAR
    # (contains geomesa-accumulo-spark, geomesa-spark-core, geomesa-spark-sql, and dependencies)
-    jars="file://$GEOMESA_ACCUMULO_HOME/dist/spark/geomesa-accumulo-spark-runtime-accumulo2_$VERSION.jar"
+    jars="file://$GEOMESA_ACCUMULO_HOME/dist/spark/geomesa-accumulo-spark-runtime-accumulo21_$VERSION.jar"
 
     # uncomment to use the converter RDD provider
     #jars="$jars,file://$GEOMESA_ACCUMULO_HOME/lib/geomesa-spark-converter_$VERSION.jar"
@@ -103,8 +103,7 @@ GeoMesa:
 You may also consider adding ``geomesa-tools_${VERSION}-data.jar`` to include prepackaged converters for publicly
 available data sources (as described in :ref:`prepackaged_converters`), ``geomesa-spark-jupyter-leaflet_${VERSION}.jar``
 to include an interface for the `Leaflet`_ spatial visualization
-library (see :ref:`jupyter_leaflet`, below), and/or ``geomesa-spark-jupyter-vegas_${VERSION}.jar`` to use the `Vegas`_ data
-plotting library (see :ref:`jupyter_vegas`, below).
+library (see :ref:`jupyter_leaflet`, below).
 
 Running Jupyter
 ---------------
@@ -236,49 +235,6 @@ fillOpacity Number Fill opacity
 
 Note: Options are comma-separated (i.e. ``{ color: "#ff0000", fillColor: "#0000ff" }``)
 
-.. _jupyter_vegas:
-
-Vegas for Plotting
-------------------
-
-The `Vegas`_ library may be used with GeoMesa, Spark, and Toree in Jupyter to plot quantitative data. The
-``geomesa-spark-jupyter-vegas`` module builds a shaded JAR containing all of the dependencies needed to run Vegas in
-Jupyter+Toree. This module must be built from source, using the ``vegas`` profile:
-
-.. code-block:: bash
-
-    $ mvn clean install -Pvegas -pl geomesa-spark/geomesa-spark-jupyter-vegas
-
-This will build ``geomesa-spark-jupyter-vegas_${VERSION}.jar`` in the ``target`` directory of the module, and
-should be added to the list of JARs in the ``jupyter toree install`` command described in
-:ref:`jupyter_configure_toree`:
-
-.. code-block:: bash
-
-    jars="$jars,file:///path/to/geomesa-spark-jupyter-vegas_${VERSION}.jar"
-    # then continue with "jupyter toree install" as before
-
-To use Vegas within Jupyter, load the appropriate libraries and a displayer:
-
-.. code-block:: scala
-
-    import vegas._
-    import vegas.render.HTMLRenderer._
-    import vegas.sparkExt._
-
-    implicit val displayer: String => Unit = { s => kernel.display.content("text/html", s) }
-
-Then use the ``withDataFrame`` method to plot data in a ``DataFrame``:
-
-.. code-block:: scala
-
-    Vegas("Simple bar chart").
-      withDataFrame(df).
-      encodeX("a", Ordinal).
-      encodeY("b", Quantitative).
-      mark(Bar).
-      show(displayer)
-
 .. _Apache Toree: https://toree.apache.org/
 .. _Docker: https://www.docker.com/
 .. _JupyterLab: https://jupyterlab.readthedocs.io/
@@ -287,4 +243,3 @@ Then use the ``withDataFrame`` method to plot data in a ``DataFrame``:
 .. _Python: https://www.python.org/
 .. _SBT: https://www.scala-sbt.org/
 .. _Spark: https://spark.apache.org/
-.. _Vegas: https://github.com/vegas-viz/Vegas
diff --git a/docs/user/spark/pyspark.rst b/docs/user/spark/pyspark.rst
index 925b4af35a5b..a23904e509e4 100644
--- a/docs/user/spark/pyspark.rst
+++ b/docs/user/spark/pyspark.rst
@@ -23,7 +23,7 @@ the providers outlined in :ref:`spatial_rdd_providers`.
 
     mvn clean install -Ppython
     pip3 install geomesa-spark/geomesa_pyspark/target/geomesa_pyspark-$VERSION.tar.gz
-    cp geomesa-accumulo/geomesa-accumulo-spark-runtime-accumulo2/target/geomesa-accumulo-spark-runtime-accumulo2_${VERSION}.jar /path/to/
+    cp geomesa-accumulo/geomesa-accumulo-spark-runtime-accumulo21/target/geomesa-accumulo-spark-runtime-accumulo21_${VERSION}.jar /path/to/
 
 Alternatively, you can use ``conda-pack`` to bundle the dependencies for your project. This may be more appropriate if
 you have additional dependencies.
@@ -39,7 +39,7 @@ you have additional dependencies.
     # Install additional dependencies using conda or pip here
     conda pack -o environment.tar.gz
 
-    cp geomesa-accumulo/geomesa-accumulo-spark-runtime-accumulo2/target/geomesa-accumulo-spark-runtime-accumulo2_${VERSION}.jar /path/to/
+    cp geomesa-accumulo/geomesa-accumulo-spark-runtime-accumulo21/target/geomesa-accumulo-spark-runtime-accumulo21_${VERSION}.jar /path/to/
 
 .. warning::
     ``conda-pack`` currently has issues with Python 3.8, and ``pyspark`` has issues with Python 3.9, hence the explicit
@@ -57,7 +57,7 @@ the ``pyspark`` library.
 
     import geomesa_pyspark
     conf = geomesa_pyspark.configure(
-        jars=['/path/to/geomesa-accumulo-spark-runtime-accumulo2_${VERSION}.jar'],
+        jars=['/path/to/geomesa-accumulo-spark-runtime-accumulo21_${VERSION}.jar'],
         packages=['geomesa_pyspark','pytz'],
         spark_home='/path/to/spark/').\
         setAppName('MyTestApp')
diff --git a/docs/user/spark/zeppelin.rst b/docs/user/spark/zeppelin.rst
index fa6842a3ccb0..b778c03c2408 100644
--- a/docs/user/spark/zeppelin.rst
+++ b/docs/user/spark/zeppelin.rst
@@ -17,7 +17,7 @@ Configuring Zeppelin with GeoMesa
 ---------------------------------
 
 The GeoMesa Accumulo Spark runtime JAR may be found either in the ``dist/spark`` directory of the GeoMesa Accumulo
-binary distribution, or (after building) in the ``geomesa-accumulo/geomesa-accumulo-spark-runtime-accumulo2/target``
+binary distribution, or (after building) in the ``geomesa-accumulo/geomesa-accumulo-spark-runtime-accumulo21/target``
 directory of the GeoMesa source distribution.
 
 .. note::
@@ -29,8 +29,8 @@ directory of the GeoMesa source distribution.
 #. Scroll to the bottom where the "Spark" interpreter configuration appears.
 #. Click on the "edit" button next to the interpreter name (on the right-hand side of the UI).
 #. In the "Dependencies" section, add the GeoMesa JAR, either as
-   a. the full local path to the ``geomesa-accumulo-spark-runtime-accumulo2_${VERSION}.jar`` described above, or
-   b. the Maven groupId:artifactId:version coordinates (``org.locationtech.geomesa:geomesa-accumulo-spark-runtime-accumulo2_2.12:$VERSION``)
+   a. the full local path to the ``geomesa-accumulo-spark-runtime-accumulo21_${VERSION}.jar`` described above, or
+   b. the Maven groupId:artifactId:version coordinates (``org.locationtech.geomesa:geomesa-accumulo-spark-runtime-accumulo21_2.12:$VERSION``)
 #. Click "Save". When prompted by the pop-up, click to restart the Spark interpreter. It is not
    necessary to restart Zeppelin.
 
@@ -51,18 +51,10 @@ may be used to print a ``DataFrame`` via this display system:
         dfc.foreach(r => println(r.mkString("\t")))
     }
 
-It is also possible to use third-party libraries such as `Vegas`_, as described in :ref:`jupyter_vegas`. For
-Zeppelin, the following implicit ``displayer`` method should be used:
-
-.. code-block:: scala
-
-    implicit val displayer: String => Unit = s => println("%html\n"+s)
-
 .. |zeppelin_version| replace:: 0.7.0
 
 .. _configuring the Zeppelin Spark interpreter: https://zeppelin.apache.org/docs/0.7.0/interpreter/spark.html
 .. _Spark: https://spark.apache.org/
-.. _Vegas: https://github.com/vegas-viz/Vegas/
 .. _Zeppelin: https://zeppelin.apache.org/
 .. _Zeppelin installation instructions: https://zeppelin.apache.org/docs/0.7.0/install/install.html
 .. _Zeppelin table display system: https://zeppelin.apache.org/docs/0.7.0/displaysystem/basicdisplaysystem.html#table
diff --git a/geomesa-spark/geomesa-spark-jupyter-vegas/pom.xml b/geomesa-spark/geomesa-spark-jupyter-vegas/pom.xml
deleted file mode 100644
index 9882f364bcc1..000000000000
--- a/geomesa-spark/geomesa-spark-jupyter-vegas/pom.xml
+++ /dev/null
@@ -1,68 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
-         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
-    <parent>
-        <artifactId>geomesa-spark_2.12</artifactId>
-        <groupId>org.locationtech.geomesa</groupId>
-        <version>5.1.0-SNAPSHOT</version>
-    </parent>
-
-    <modelVersion>4.0.0</modelVersion>
-
-    <artifactId>geomesa-spark-jupyter-vegas_2.12</artifactId>
-    <name>GeoMesa Jupyter Vegas Runtime</name>
-
-    <profiles>
-        <profile>
-            <id>vegas</id>
-
-            <dependencies>
-                <dependency>
-                    <groupId>org.vegas-viz</groupId>
-                    <artifactId>vegas-spark_2.12</artifactId>
-                    <version>0.3.6</version>
-                </dependency>
-            </dependencies>
-
-            <build>
-                <plugins>
-                    <plugin>
-                        <groupId>org.apache.maven.plugins</groupId>
-                        <artifactId>maven-shade-plugin</artifactId>
-
-                        <executions>
-                            <execution>
-                                <phase>package</phase>
-                                <goals>
-                                    <goal>shade</goal>
-                                </goals>
-                                <configuration>
-                                    <createDependencyReducedPom>false</createDependencyReducedPom>
-
-                                    <artifactSet>
-                                        <excludes>
-                                            <exclude>org.scala-lang:*</exclude>
-                                            <exclude>org.scala-lang.modules:*</exclude>
-                                        </excludes>
-                                    </artifactSet>
-
-                                    <filters>
-                                        <filter>
-                                            <artifact>*:*</artifact>
-                                            <excludes>
-                                                <exclude>META-INF/*.SF</exclude>
-                                                <exclude>META-INF/*.DSA</exclude>
-                                                <exclude>META-INF/*.RSA</exclude>
-                                            </excludes>
-                                        </filter>
-                                    </filters>
-                                </configuration>
-                            </execution>
-                        </executions>
-                    </plugin>
-                </plugins>
-            </build>
-
-        </profile>
-    </profiles>
-</project>
diff --git a/geomesa-spark/pom.xml b/geomesa-spark/pom.xml
index bf3e6345b3d0..558a7f792a04 100644
--- a/geomesa-spark/pom.xml
+++ b/geomesa-spark/pom.xml
@@ -22,14 +22,4 @@
         <module>geomesa_pyspark</module>
     </modules>
 
-    <profiles>
-        <profile>
-            <id>vegas</id>
-
-            <modules>
-                <module>geomesa-spark-jupyter-vegas</module>
-            </modules>
-        </profile>
-    </profiles>
-
 </project>