Feature convert fcts #59

Merged 47 commits, Dec 1, 2021
Changes from 33 commits
Commits
584a253
added length function
tombonfert Oct 18, 2021
c5aee1d
added length function
tombonfert Oct 18, 2021
5c046a9
Merge branch 'main' into length_function
VincentTNR Oct 19, 2021
ef41513
Merge pull request #2 from tombonfert/length_function
tombonfert Oct 19, 2021
f9de8d8
Merge branch 'databricks:main' into main
tombonfert Oct 19, 2021
225c77d
added length function again...
tombonfert Oct 19, 2021
84f59ef
removed redundant dependency definition
tombonfert Oct 19, 2021
363287e
Merge pull request #3 from tombonfert/feature-length-function
tombonfert Oct 20, 2021
5982c9f
length function arguments can now be attr or expr
tombonfert Oct 26, 2021
49040ad
Merge pull request #4 from databricks/main
tombonfert Oct 26, 2021
fdac9d9
Merge branch 'databricks:main' into main
tombonfert Oct 26, 2021
19bb9af
Merge branch 'databricks:main' into main
tombonfert Oct 26, 2021
e99af2a
Merge branch 'databricks:main' into main
tombonfert Oct 26, 2021
d0839a8
Merge branch 'databricks:main' into main
tombonfert Oct 26, 2021
fd06978
Merge branch 'databricks:main' into main
tombonfert Oct 26, 2021
dfaa7f3
Merge branch 'databricks:main' into main
tombonfert Nov 2, 2021
38aa740
Merge branch 'databricks:main' into main
tombonfert Nov 3, 2021
c39575d
Merge branch 'databricks:main' into main
tombonfert Nov 3, 2021
11a5201
Merge branch 'databricks:main' into main
tombonfert Nov 4, 2021
2a949ad
Merge branch 'databricks:main' into main
tombonfert Nov 4, 2021
391ed98
Merge branch 'databricks:main' into main
tombonfert Nov 4, 2021
476a788
Merge branch 'databricks:main' into main
tombonfert Nov 5, 2021
2cb2470
Merge branch 'databricks:main' into main
tombonfert Nov 5, 2021
520b550
Merge branch 'databricks:main' into main
tombonfert Nov 8, 2021
b371916
Merge branch 'databricks:main' into main
tombonfert Nov 9, 2021
4bd017b
Merge branch 'databricks:main' into main
tombonfert Nov 22, 2021
2dffa25
Merge branch 'databricks:main' into main
tombonfert Nov 22, 2021
23840f3
Merge branch 'databricks:main' into main
tombonfert Nov 22, 2021
5e06add
fixed failing tests
tombonfert Nov 22, 2021
6854eb1
Merge branch 'databricks:main' into main
tombonfert Nov 22, 2021
f1648ec
Merge branch 'databricks:main' into main
tombonfert Nov 26, 2021
a3d06f4
Merge branch 'databricks:main' into main
tombonfert Nov 26, 2021
7424bea
implementation of memk function
tombonfert Nov 26, 2021
c3fddb9
improved kmem implementation
tombonfert Nov 27, 2021
837eaa0
rm TODO
tombonfert Nov 27, 2021
45d3f5e
finalized memk implementation
tombonfert Nov 29, 2021
5c241f5
implementation of rmunit function
tombonfert Nov 29, 2021
62e2357
rm styleerror
tombonfert Nov 29, 2021
e26cd8e
implementation of rmcomma function
tombonfert Nov 29, 2021
c3b3ab0
implementation of ctime function
tombonfert Nov 29, 2021
bca112b
added ctime()
tombonfert Nov 29, 2021
7c8c54d
implementation of num function
tombonfert Nov 30, 2021
6e27dc2
added verification test for num fct
tombonfert Nov 30, 2021
5757f7c
improved implementation of num function
tombonfert Nov 30, 2021
685bf0d
added wildcard fields and none function
tombonfert Nov 30, 2021
e806bc4
added TODO...
tombonfert Nov 30, 2021
f003b84
support for wc fields
tombonfert Dec 1, 2021
15 changes: 15 additions & 0 deletions src/main/scala/spl/catalyst/SplToCatalyst.scala
@@ -212,6 +212,9 @@ object SplToCatalyst extends Logging {
        Literal(null)
      case "isnotnull" =>
        IsNotNull(attrOrExpr(ctx, call.args.head))
      case "memk" =>
        val field = attrOrExpr(ctx, call.args.head)
        callMemk(ctx, field)
      case _ =>
        val approx = s"${call.name}(${call.args.map(_.toString).mkString(",")})"
        throw new ConversionFailure(s"Unknown SPL function: $approx")
@@ -223,6 +226,18 @@ object SplToCatalyst extends Logging {
    }
  }

  private def callMemk(ctx: LogicalContext, field: Expression): Expression = {
    val regex = Literal.create("(?i)^(\\d*\\.?\\d+)([kmg])$")
    val size = Cast(RegExpExtract(field, regex, Literal.create(1)), DoubleType)
    val format = Upper(RegExpExtract(field, regex, Literal.create(2)))
    val multiplier = CaseWhen(Seq(
      (EqualTo(format, Literal.create("K")), Literal.create(1.0)),
      (EqualTo(format, Literal.create("M")), Literal.create(1024.0)),
      (EqualTo(format, Literal.create("G")), Literal.create(1024.0 * 1024.0))
    ), Literal.create(1.0))
    Multiply(size, multiplier)
  }

  private def callCidrMatch(ctx: LogicalContext, cidr: Expression, ip: Expression): Expression = {
    ip match {
      case str: Literal => CidrMatch(cidr, str)
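
Review note: to make the intended semantics of `callMemk` easier to check, here is a minimal sketch of the same conversion on plain strings. It is not part of the PR; `MemkSketch` and its `memk` method are hypothetical names, and the sketch uses Scala's standard regex support instead of Catalyst expressions. It reuses the pattern from the hunk above, so a value without a `k`, `m`, or `g` suffix does not match. In the Catalyst version that case would presumably surface as NULL (the extracted group is empty and the cast to double fails), which may differ from SPL's `memk`, where a missing suffix is, as far as I know, treated as kilobytes.

```scala
// Hypothetical stand-in for the Catalyst expression built by callMemk.
// Assumption: plain Scala regex matching instead of Spark's RegExpExtract.
object MemkSketch {
  // Same pattern as in the diff: a number followed by a required k/m/g suffix.
  private val Pattern = "(?i)^(\\d*\\.?\\d+)([kmg])$".r

  // Returns the size in kilobytes, or None when the input does not match.
  def memk(value: String): Option[Double] = value match {
    case Pattern(size, unit) =>
      val multiplier = unit.toUpperCase match {
        case "K" => 1.0
        case "M" => 1024.0
        case "G" => 1024.0 * 1024.0
      }
      Some(size.toDouble * multiplier)
    case _ => None // no suffix or not a number
  }
}
```

For example, `MemkSketch.memk("10m")` gives `Some(10240.0)`, while `MemkSketch.memk("42")` gives `None` because the suffix is mandatory in this pattern.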
7 changes: 7 additions & 0 deletions src/main/scala/spl/pyspark/PythonGenerator.scala
@@ -187,6 +187,13 @@ object PythonGenerator {
          s".otherwise(${expressionCode(falseVal.get)})"
        } else ""
        s"F.when(${expressionCode(pred)}, ${expressionCode(trueVal)})$otherwiseStmt"
      case CaseWhen(branches: Seq[(Expression, Expression)], elseValue: Option[Expression]) =>
        // ToDo add all branches to when statement + consolidate both CaseWhen cases
        val otherwiseStmt = if (elseValue.isDefined) {
          s".otherwise(${expressionCode(elseValue.get)})"
        } else ""
        s"F.when(${expressionCode(branches.head._1)}, ${expressionCode(branches.head._2)})" +
          s"${otherwiseStmt}"
      case In(attr, items) =>
        s"${expressionCode(attr)}.isin(${items.map(expressionCode).mkString(", ")})"
      case UnresolvedAlias(child, aliasFunc) =>
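
Review note: the new `CaseWhen` case emits only `branches.head`, so for `memk` the `M` and `G` multipliers would be dropped from the generated PySpark code. One possible way to resolve the ToDo is to chain one `when(...)` call per branch. The sketch below is not part of the PR: `CaseWhenSketch` and `caseWhenCode` are hypothetical names, and conditions and values are modelled as already-rendered PySpark snippets (plain strings) rather than Catalyst expressions passed through `expressionCode`.

```scala
// Hypothetical, string-only model of the generator logic.
// Assumption: each branch is already rendered to PySpark source text.
object CaseWhenSketch {
  def caseWhenCode(branches: Seq[(String, String)], elseValue: Option[String]): String = {
    require(branches.nonEmpty, "CASE WHEN needs at least one branch")
    // Render F.when(c1, v1).when(c2, v2)... with one call per branch.
    val whens = branches
      .map { case (cond, value) => s"when($cond, $value)" }
      .mkString("F.", ".", "")
    // Append .otherwise(...) only when an else value is present.
    val otherwise = elseValue.map(v => s".otherwise($v)").getOrElse("")
    whens + otherwise
  }
}
```

With two branches and an else value, this produces a single chained expression of the form `F.when(c1, v1).when(c2, v2).otherwise(e)`, which relies on PySpark's `Column.when` for the second and later branches.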
26 changes: 26 additions & 0 deletions src/test/scala/spl/catalyst/SplToCatalystTest.scala
@@ -850,6 +850,32 @@ class SplToCatalystTest extends AnyFunSuite with PlanTestBase {
    )
  }

  test("memk(x)") {
    val regex = Literal("(?i)^(\\d*\\.?\\d+)([kmg])$")
    val format = Upper(RegExpExtract(UnresolvedAttribute("x"), regex, Literal.create(2)))
    check(ast.SearchCommand(
      ast.Call("memk", Seq(
        ast.Field("x")
      ))),
      (_, tree) => {
        Filter(
          Multiply(
            Cast(RegExpExtract(
              UnresolvedAttribute("x"),
              regex,
              Literal.create(1)),
              DoubleType),
            CaseWhen(Seq(
              (EqualTo(format, Literal("K")), Literal.create(1.0)),
              (EqualTo(format, Literal("M")), Literal.create(1024.0)),
              (EqualTo(format, Literal("G")), Literal.create(1024.0 * 1024.0))
            ), Literal.create(1.0))
          ),
          tree)
      }
    )
  }

  test("round(x)") {
    check(ast.SearchCommand(
      ast.Call("round", Seq(