AVRO import: Add support for namespace. AVRO import: Fixes exception when doc is missing

Resolves datacontract#121
jochenchrist committed Apr 1, 2024
1 parent 5663bdc commit dd112f7
Showing 5 changed files with 14 additions and 3 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -12,10 +12,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Added export format **great-expectations**: `datacontract export --format great-expectations`
 - Added gRPC support to OpenTelemetry integration for publishing test results
 - Added Databricks SQL dialect for `datacontract export --format sql`
+- Added AVRO import support for namespace (#121)

 ### Fixed

 - Use `sql_type_converter` to build checks.
+- Fixed AVRO import when doc is missing (#121)

 ## [0.9.7] - 2024-03-15

10 changes: 8 additions & 2 deletions datacontract/imports/avro_importer.py
@@ -10,7 +10,8 @@ def import_avro(data_contract_specification: DataContractSpecification, source:
         data_contract_specification.models = {}

     try:
-        avro_schema = avro.schema.parse(open(source, "rb").read())
+        with open(source, "r") as file:
+            avro_schema = avro.schema.parse(file.read())
     except Exception as e:
         raise DataContractException(
             type="schema",
@@ -27,9 +28,14 @@ def import_avro(data_contract_specification: DataContractSpecification, source:
     data_contract_specification.models[avro_schema.name] = Model(
         type="table",
         fields=fields,
-        description=avro_schema.doc,
     )

+    if avro_schema.get_prop("doc") is not None:
+        data_contract_specification.models[avro_schema.name].description = avro_schema.get_prop("doc")
+
+    if avro_schema.get_prop("namespace") is not None:
+        data_contract_specification.models[avro_schema.name].namespace = avro_schema.get_prop("namespace")
+
     return data_contract_specification
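
The pattern in this hunk, copying `doc` and `namespace` onto the model only when the schema defines them, can be illustrated without the `avro` package. The following is a minimal standard-library sketch, not the importer's actual code: the function name `model_metadata_from_avsc` and the plain-dict model are hypothetical, and it reads the schema as raw JSON rather than through `avro.schema.parse`.

```python
import json
from typing import Any, Dict


def model_metadata_from_avsc(avsc_text: str) -> Dict[str, Any]:
    """Collect model-level metadata from an Avro record schema given as JSON text.

    Hypothetical helper for illustration; the real importer uses avro.schema.parse.
    """
    schema = json.loads(avsc_text)
    model: Dict[str, Any] = {"type": "table"}
    # Copy optional attributes only when the schema actually defines them,
    # so a schema without "doc" or "namespace" yields no None-valued keys.
    for avro_key, model_key in (("doc", "description"), ("namespace", "namespace")):
        value = schema.get(avro_key)
        if value is not None:
            model[model_key] = value
    return model


# A schema carrying both optional attributes, mirroring tests/examples/avro/data/orders.avsc.
with_both = model_metadata_from_avsc(
    '{"type": "record", "name": "orders", "doc": "My Model", '
    '"namespace": "com.sample.schema", "fields": []}'
)

# A schema with neither "doc" nor "namespace".
without_doc = model_metadata_from_avsc(
    '{"type": "record", "name": "orders", "fields": []}'
)
```

Assigning the optional values after construction, instead of passing a possibly missing value into the constructor, keeps absent attributes out of the model entirely.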


1 change: 1 addition & 0 deletions datacontract/model/data_contract_specification.py
@@ -88,6 +88,7 @@ class Field(pyd.BaseModel):
 class Model(pyd.BaseModel):
     description: str = None
     type: str = None
+    namespace: str = None
     fields: Dict[str, Field] = {}

3 changes: 2 additions & 1 deletion tests/examples/avro/data/orders.avsc
@@ -41,5 +41,6 @@
   ],
   "name": "orders",
   "doc": "My Model",
-  "type": "record"
+  "type": "record",
+  "namespace": "com.sample.schema"
 }
1 change: 1 addition & 0 deletions tests/test_import_avro.py
@@ -39,6 +39,7 @@ def test_import_avro_schema():
   orders:
     type: table
     description: My Model
+    namespace: com.sample.schema
     fields:
       ordertime:
         type: long
