I am trying to run mnist.py in standalone cluster mode. For this I changed master = "local[*]" to "spark://MY_MASTER_IP:7077" and submitted the job with spark-submit, but I get the following error:
```
INFO DAGScheduler: ResultStage 1 (load at NativeMethodAccessorImpl.java:0) failed in 2.200 s due to Job aborted due to stage failure: Task 5 in stage 1.0 failed 4 times, most recent failure: Lost task 5.3 in stage 1.0 (TID 13, 192.168.1.102, executor 1): java.io.FileNotFoundException: File file:/home/semanticslab3/development/spark/spark-2.2.1-bin-hadoop2.7/bin/data/mnist_train.csv does not exist
It is possible the underlying files have been updated. You can explicitly invalidate the cache in Spark by running 'REFRESH TABLE tableName' command in SQL or by recreating the Dataset/DataFrame involved.
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:127)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:174)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:105)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:461)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:461)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1336)
at scala.collection.TraversableOnce$class.aggregate(TraversableOnce.scala:214)
at scala.collection.AbstractIterator.aggregate(Iterator.scala:1336)
at org.apache.spark.rdd.RDD$$anonfun$aggregate$1$$anonfun$22.apply(RDD.scala:1113)
at org.apache.spark.rdd.RDD$$anonfun$aggregate$1$$anonfun$22.apply(RDD.scala:1113)
at org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:2125)
at org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:2125)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Driver stacktrace:
17/12/14 16:29:32 INFO DAGScheduler: Job 1 failed: load at NativeMethodAccessorImpl.java:0, took 2.211758 s
```
With master = "local[*]" it works fine, but then it uses only one worker. I want to use both of my workers.
Thanks
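For reference, the change described above presumably amounts to something like this in mnist.py (a sketch only; the surrounding code in the example script may differ):

```python
from pyspark.sql import SparkSession

# Originally the master was hard-coded to local mode, e.g.
#   master = "local[*]"
# Changed to point at the standalone cluster master instead:
spark = (SparkSession.builder
         .master("spark://MY_MASTER_IP:7077")
         .appName("mnist")
         .getOrCreate())
```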
Did you modify the example script in any way, besides changing the master address? The script is not intended to be submitted through spark-submit.
Thanks for your quick reply. I just added an import statement (from pyspark.sql import SparkSession) to mnist.py and changed the master address, nothing more than that.
You said the script is not intended to be submitted through spark-submit, so why does it run when I use local[*] as the master address?
And if the script is not intended to be submitted through spark-submit, how can I run mnist.py so that it runs across two cluster nodes?
Or is it not possible to run it on a cluster?
Thanks
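A note on the FileNotFoundException above: the path file:/.../spark-2.2.1-bin-hadoop2.7/bin/data/mnist_train.csv is resolved on whichever executor runs the task, so in cluster mode every worker must be able to see the file at that exact path; with local[*] the executors run on the driver machine, where the file exists, which is presumably why local mode works. A minimal sketch of two common workarounds (the HDFS URL is an assumption; substitute your own namenode address):

```python
# Option 1: read from storage every executor can reach, e.g. HDFS
# (hypothetical URL -- adjust to your setup):
df = spark.read.csv("hdfs://MY_MASTER_IP:9000/data/mnist_train.csv",
                    header=False, inferSchema=True)

# Option 2: keep the local path, but first copy data/mnist_train.csv
# to the same absolute path on every worker node.
```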