-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrades to Spark 3.4/JRE 17 and fixes all high/critical CVEs #226
Conversation
Signed-off-by: Adrian Cole <[email protected]>
trivy is clean now, so even if we don't release until 8.14 final... at least we can apply the security settings same as other repos after merge:
|
Signed-off-by: Adrian Cole <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
elastic/elasticsearch-hadoop#2187 unlocked this (releasing in elastic-hadoop 8.14, but not sure when)
other notes below
@@ -150,6 +150,9 @@ public CassandraDependenciesJob build() { | |||
df.setTimeZone(TimeZone.getTimeZone("UTC")); | |||
this.dateStamp = df.format(new Date(builder.day)); | |||
this.conf = new SparkConf(true).setMaster(builder.sparkMaster).setAppName(getClass().getName()); | |||
if (builder.sparkMaster.startsWith("local[")) { | |||
conf.set("spark.driver.bindAddress", "127.0.0.1"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a spark 3.4 thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
basically it tries to detect with the hostname, which isn't needed for local mode anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure it is worth looking into, but InetAddress.getLocalHost().getHostAddress()
may be more reliable option (fe if the host uses IPv6 only).
.values() | ||
.map(DEPENDENCY_LINK_JSON); | ||
JavaRDD<Map<String, Object>> links; | ||
try (JavaSparkContext sc = new JavaSparkContext(conf)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is just polish as we can use try/resources with some of the drivers
@@ -0,0 +1,14 @@ | |||
# Set everything to be logged to the console |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is also spark 3.4 thing (log4j 2 not 1.2 config)
@@ -187,6 +224,53 @@ | |||
<type>pom</type> | |||
<scope>import</scope> | |||
</dependency> | |||
|
|||
<!-- CVE fix versions --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to keep the build without CVE will be difficult, but anyway at least once it is clean ;)
some big wins from merging this include:
|
oh yeah spent so much time doing this I forgot why.. I was trying to polish this up prior to adding dependencies to helm. There was a point where I though maybe we need to rewrite the entire thing (like in beam) to solve the revlock. I'm glad it didn't get that far. |
openzipkin/zipkin#3763 for zipkin changes we can now do |
Signed-off-by: Adrian Cole <[email protected]>
Signed-off-by: Adrian Cole <[email protected]>
exec java ${JAVA_OPTS} -Djava.io.tmpdir=/tmp -cp classes zipkin2.dependencies.ZipkinDependenciesJob $@ | ||
# Spark 3.4 module config from: | ||
# https://github.com/apache/spark/blob/branch-3.4/launcher/src/main/java/org/apache/spark/launcher/JavaModuleOptions.java#L29 | ||
exec java ${JAVA_OPTS} -Djava.io.tmpdir=/tmp \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
thanks for the look folks! |
I manually tested this on all three storage types in docker as well (using zipkin's docker/examples instructions)