fix tests and add docs
allisonwang-db committed Nov 27, 2024
1 parent c3ef33d commit 28472ce
Showing 3 changed files with 14 additions and 6 deletions.
5 changes: 5 additions & 0 deletions python/docs/source/user_guide/sql/python_data_source.rst
@@ -516,3 +516,8 @@ The following example demonstrates how to implement a basic Data Source using Ar
df = spark.read.format("arrowbatch").load()
df.show()
Usage Notes
-----------

- During data source resolution, built-in and Java data sources take precedence over Python data sources with the same name. To use a Python data source, make sure its name does not conflict with any built-in or loaded Java data source.
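The precedence rule above can be sketched as a small resolution function. This is a hypothetical illustration, not Spark's actual implementation: the registry dicts and the `lookup_data_source` helper are invented for clarity (Spark's real lookup uses a `ServiceLoader` over `DataSourceRegister` plus an internal Python registry).

```python
def lookup_data_source(name, java_sources, python_sources):
    """Resolve a short data source name, preferring Java/built-in providers.

    java_sources and python_sources are stand-ins for Spark's registries,
    mapping a short name to a provider.
    """
    if name in java_sources:
        # A Java or built-in source with the same name shadows the
        # Python one, matching the precedence described above.
        return java_sources[name]
    if name in python_sources:
        return python_sources[name]
    raise ValueError(f"Data source '{name}' not found")

# A Python source named "parquet" would be shadowed by the built-in one,
# while a uniquely named Python source resolves normally.
java = {"parquet": "BuiltInParquet", "json": "BuiltInJson"}
python = {"parquet": "PyParquet", "myformat": "PyMyFormat"}
print(lookup_data_source("parquet", java, python))   # BuiltInParquet
print(lookup_data_source("myformat", java, python))  # PyMyFormat
```

This mirrors why the docs recommend picking a non-conflicting name: the Java registry is consulted first, so a clashing Python source is silently never selected.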
@@ -683,6 +683,9 @@ object DataSource extends Logging {
}
}
case head :: Nil =>
// We do not check whether the provider is a Python data source
// (isUserDefinedDataSource) to avoid the lookup cost. Java data sources
// always take precedence over Python user-defined data sources.
head.getClass
case sources =>
// There are multiple registered aliases for the input. If there is single datasource
@@ -94,12 +94,6 @@ abstract class PythonDataSourceSuiteBase extends QueryTest with SharedSparkSessi
class PythonDataSourceSuite extends PythonDataSourceSuiteBase {
import IntegratedUDFTestUtils._

test("SPARK-45917: automatic registration of Python Data Source") {
assume(shouldTestPandasUDFs)
val df = spark.read.format(staticSourceName).load()
checkAnswer(df, Seq(Row(0, 0), Row(0, 1), Row(1, 0), Row(1, 1), Row(2, 0), Row(2, 1)))
}

test("SPARK-50426: should not trigger static Python data source lookup") {
assume(shouldTestPandasUDFs)
val testAppender = new LogAppender("Python data source lookup")
@@ -121,6 +115,12 @@ class PythonDataSourceSuite extends PythonDataSourceSuiteBase {
"Loading static Python Data Sources.")))
}

test("SPARK-45917: automatic registration of Python Data Source") {
assume(shouldTestPandasUDFs)
val df = spark.read.format(staticSourceName).load()
checkAnswer(df, Seq(Row(0, 0), Row(0, 1), Row(1, 0), Row(1, 1), Row(2, 0), Row(2, 1)))
}

test("simple data source") {
assume(shouldTestPandasUDFs)
val dataSourceScript =
