Data source extensions should be composable #116
[QUESTIONS]:
This part is not really clear to me: where exactly is `nexus/pipline-position` located? Is it in the data source's accompanying resource.json file?
Is the pipeline position assigned temporarily or locally? Does it change every time the individual data sources return their resource catalogs?
With Here you can see that the
It is assigned on the fly by Nexus and only stored in memory. The assignment occurs here in a method named
The value only changes when the pipeline definition (
This merge request adds a pipeline feature to Nexus so that multiple `IDataSource`s can be chained. This solves two problems.

To distinguish which data source should handle which data requests, every resource gets an integer property assigned under the path `nexus/pipline-position`.

This position is set by Nexus when the individual data sources return their resource catalogs. It is later used to distribute `ReadRequest`s to the corresponding data sources.

At the same time, this additional piece of metadata helps make the data processing pipeline more traceable, so that users can always find out which software version and which configuration led to a specific set of data. In the future we should add Git support and create a commit every time the configuration changes. The current commit ID will then become part of the catalog metadata (#119).
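To illustrate the position mechanism described above, here is a minimal Python sketch. It is not the Nexus implementation: the dictionary-based resource model and the function names `assign_positions` and `group_requests_by_position` are hypothetical, and it assumes that a resource keeps the position of the data source that first provided it, even if a later data source modifies it.

```python
from collections import defaultdict

# Property path as spelled in this description.
POSITION_PATH = ("nexus", "pipline-position")


def assign_positions(resources, position):
    """Tag every resource that has no position yet with the given pipeline position."""
    for resource in resources:
        nexus_props = resource.setdefault("properties", {}).setdefault(POSITION_PATH[0], {})
        # Assumption: resources provided by an earlier data source keep their position.
        nexus_props.setdefault(POSITION_PATH[1], position)


def group_requests_by_position(resources_by_id, requested_ids):
    """Group requested resource IDs by the pipeline position stored in their properties."""
    groups = defaultdict(list)
    for resource_id in requested_ids:
        properties = resources_by_id[resource_id]["properties"]
        position = properties[POSITION_PATH[0]][POSITION_PATH[1]]
        groups[position].append(resource_id)
    return dict(groups)


# The first data source (position 0) returns two resources.
resources = [{"id": "temperature"}, {"id": "wind_speed"}]
assign_positions(resources, 0)

# The second data source (position 1) adds one more resource.
resources.append({"id": "wind_speed_filtered"})
assign_positions(resources, 1)

by_id = {resource["id"]: resource for resource in resources}
routing = group_requests_by_position(by_id, ["temperature", "wind_speed_filtered"])
```

Each key of `routing` is a pipeline position, so every data source only sees the read requests for resources it owns.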
A frequent change in the Nexus source code is the renaming of `DataSourceRegistration` to `Pipeline`. This was necessary because a set of catalogs is no longer provided by a single `DataSourceRegistration` but by multiple `DataSourceRegistration`s which compose a pipeline.

There are also many changes caused by the "format on save" feature, i.e. useless spaces have often been removed, or I have reformatted individual lines of code without changing their meaning.

Here are some comments on the individually changed files:
.github/workflows
- Use a specific pyright version because the new one causes type checking errors (however, this means we need to solve the Python type errors in the near future, #124)
.vscode/settings.json
- Exclude `.razor` files from `editor.formatOnSave` because this produces incorrect files
notes/plugin-pipeline.excalidraw
- A drawing which shows the pipeline feature, can be ignored
openapi.json
- This is an auto-generated file for the Swagger UI (https://nexus.iwes.fraunhofer.de/api), can be ignored
src/Nexus.UI/Components/CatalogAboutView.razor
- Since we now have a list of data sources (the pipeline), the data source info page has been adapted to display data per data source
src/Nexus.UI/Core/AppState.cs
- The `CatalogInfo` type (contains display info for the UI) had to be adapted so that info about all pipeline members (data sources) can be provided to the UI
src/Nexus.UI/Core/NexusDemoClient.cs
- Same as before
src/Nexus.UI/ViewModels/FakeResourceCatalogViewModel.cs
- same as before
src/Nexus/API/CatalogsController.cs
- same as before
- mainly renaming from `DataSourceRegistration` to `Pipeline`
- line 293 (old) / 301 (new): I made the extension method `JsonElement.GetStringValue` a bit more efficient by reducing the number of `string.Split` operations, which means that the first parameter is now an array instead of a path-like string. This change occurs in other files as well
src/Nexus/API/SourcesController.cs
- Previously the user-specific `DataSourceRegistration` configuration was part of the `project.json` file in the Nexus configuration folder. This has been factored out and is now part of the user-specific folders (also in the Nexus configuration folder).
- The file `pipelines.json` contains all user-configured pipelines. The pipelines themselves are managed by the newly created service `PipelineService`, and the file system interaction is handled by the already existing `DatabaseService`. Both are injected into this file (`src/Nexus/API/SourcesController.cs`).
- The REST API code in this file has been adapted to let users interact with pipelines instead of data source registrations.
src/Nexus/API/UsersController.cs
- The type `InternalDataSourceRegistration` became superfluous and has been removed. Now `DataSourceRegistration` is used everywhere instead
src/Nexus/Core/CatalogContainer.cs
- This file mainly follows the name changes and the fact that we now have to handle arrays instead of single `DataSourceRegistration` objects
src/Nexus/Core/Models_NonPublic.cs
- As described above, the `DataSourceRegistration`s were previously part of `project.json`, which the type `UserConfiguration` belonged to. Now that the `DataSourceRegistration`s live in their own `pipeline.json` files, the type `UserConfiguration` is no longer required
src/Nexus/Core/Models_Public.cs
- see the "same as before" comments above
- There is now a `DataSourcePipeline` type which is similar to the old `DataSourceRegistration` type, except that it now holds a list of `DataSourceRegistration`s
src/Nexus/Extensibility/DataSource/DataSourceController.cs
- This is the core of the changes: here Nexus has to handle the new pipeline approach, i.e. answer questions like "What to do with multiple GetTimeRange() return values?" (because we now have multiple data sources), and more.
- The solution for multiple GetAvailability() responses is to calculate the average
- The old extension method `GetCatalogAsync` has been renamed to `EnrichCatalogAsync` because every data source now receives the catalog returned by the data source located earlier in the pipeline. The first data source gets an empty catalog.
- Data sources only get read requests passed for resources which belong to the current pipeline position
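The catalog chaining and the availability averaging described for this file can be sketched as follows. This is a simplified illustration rather than the actual controller code: the classes `FirstSource` and `SecondSource`, the synchronous method names, and the parameterless `get_availability` are assumptions made for brevity.

```python
from statistics import fmean


class FirstSource:
    def enrich_catalog(self, catalog):
        # The first data source in the pipeline receives an empty catalog.
        return catalog + [{"id": "temperature"}]

    def get_availability(self):
        return 1.0


class SecondSource:
    def enrich_catalog(self, catalog):
        # Later data sources receive what the earlier ones returned and may extend it.
        return catalog + [{"id": "temperature_mean"}]

    def get_availability(self):
        return 0.5


def build_catalog(pipeline):
    """Pass the catalog through every data source in pipeline order."""
    catalog = []
    for source in pipeline:
        catalog = source.enrich_catalog(catalog)
    return catalog


def combined_availability(pipeline):
    """Average the availability values reported by all data sources."""
    return fmean([source.get_availability() for source in pipeline])


pipeline = [FirstSource(), SecondSource()]
catalog = build_catalog(pipeline)
availability = combined_availability(pipeline)
```

Averaging is only one possible way to merge the values; it treats every data source in the pipeline as equally important.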
src/Nexus/Extensions/Sources/Sample.cs
- mainly just adapt to other code changes
- Line 150 (old) / 149 (new): there is now a new tuple parameter called `originalResourceName` in the method `ReadAsync`. It became necessary because, with the pipeline approach, resource IDs can be modified by data sources which come later in the pipeline. So the data source which originally provided a resource with a specific ID (= name) can no longer rely on the resource ID in the `ReadAsync` method. Therefore Nexus ensures that every resource has an `original-name` property.
- This property can be deliberately set by a data source or, in case the data source doesn't do this, Nexus will do it for you so that this value is never `null`. So the `originalResourceName` will now always be part of a `ReadRequest`.
src/Nexus/Extensions/Writers/Csv.cs
- follow previous code changes
src/Nexus/Program.cs
- register the `PipelineService` for DI
src/Nexus/Services/AppStateManager.cs
- Data source registrations are now managed by `PipelineService`, so remove the unnecessary code from here
src/Nexus/Services/CatalogManager.cs
- follow previous code changes
src/Nexus/Services/DataControllerService.cs
- follow previous code changes
src/Nexus/Services/DataService.cs
- follow previous code changes
src/Nexus/Services/DatabaseService.cs
- Extend this service with functionality to handle pipeline data
src/Nexus/Services/PipelineService.cs
- The pipeline service (handles creation, deletion and retrieval of pipelines per user)
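A rough idea of what per-user creation, deletion and retrieval of pipelines could look like is sketched below in Python. Everything specific here is an assumption: the class name `PipelineStore`, the folder layout `users/<user id>/pipelines.json`, the JSON shape, and the use of UUIDs as pipeline IDs are hypothetical and not taken from the Nexus source.

```python
import json
import uuid
from pathlib import Path


class PipelineStore:
    """Stores the pipelines of each user in that user's own pipelines.json file."""

    def __init__(self, config_root):
        self._root = Path(config_root)

    def _file(self, user_id):
        # Assumed layout: <config root>/users/<user id>/pipelines.json
        return self._root / "users" / user_id / "pipelines.json"

    def _save(self, user_id, pipelines):
        path = self._file(user_id)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(pipelines, indent=2), encoding="utf-8")

    def get_all(self, user_id):
        """Return all pipelines of the user, keyed by pipeline ID."""
        path = self._file(user_id)
        if not path.exists():
            return {}
        return json.loads(path.read_text(encoding="utf-8"))

    def put(self, user_id, registrations):
        """Create a pipeline from an ordered list of registrations and return its ID."""
        pipelines = self.get_all(user_id)
        pipeline_id = str(uuid.uuid4())
        pipelines[pipeline_id] = {"registrations": registrations}
        self._save(user_id, pipelines)
        return pipeline_id

    def delete(self, user_id, pipeline_id):
        """Delete a pipeline; return False if it did not exist."""
        pipelines = self.get_all(user_id)
        if pipeline_id not in pipelines:
            return False
        del pipelines[pipeline_id]
        self._save(user_id, pipelines)
        return True
```

In this sketch the order of the `registrations` list is the pipeline order, so the list index could double as the pipeline position.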
src/Nexus/wwwroot/css/app.css
- auto-generated by Tailwind (can be ignored)
src/clients/dotnet-client/NexusClient.g.cs
- auto-generated (can be ignored)
src/clients/python-client/nexus_api/_nexus_api.py
- auto-generated (can be ignored)
src/extensibility/dotnet-extensibility/DataModel/DataModelExtensions.cs
- For the `original-name` resource property, there is a helper method to create it. This already existed in the project `Nexus.Sources.StructuredFile` but has been moved over into this project
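The role of the `original-name` property can be shown with a small Python sketch. The helper names `ensure_original_name` and `to_read_request`, the dictionary-based resource model, and the placement of `original-name` at the top level of the properties are assumptions for illustration only.

```python
def ensure_original_name(resource):
    """Make sure the resource carries a non-null original-name property."""
    properties = resource.setdefault("properties", {})
    # Keep a name the data source set deliberately; otherwise fall back to the current ID.
    if properties.get("original-name") is None:
        properties["original-name"] = resource["id"]
    return resource


def to_read_request(resource):
    """Build a read request that carries both the current ID and the original name."""
    return {
        "resource_id": resource["id"],
        "original_resource_name": resource["properties"]["original-name"],
    }


# The first data source provides a resource named "T1".
resource = ensure_original_name({"id": "T1"})

# A data source later in the pipeline renames the resource.
resource["id"] = "temperature"

# The first data source can still find its data via the original name.
request = to_read_request(resource)
```

Because the fallback runs before any later data source can rename the resource, the providing data source always gets back the name it originally used.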
src/extensibility/dotnet-extensibility/DataModel/PropertiesExtensions.cs
- As mentioned before, the number of `string.Split()` operations has been reduced to make property access more efficient. Internally, catalog and resource properties are represented by a `JsonElement`, and unfortunately it takes a bit of work to access nested JSON data. That is the reason why this class exists.
src/extensibility/dotnet-extensibility/DataModel/ResourceCatalog.cs
- follow previous code changes
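The property access change mentioned for `PropertiesExtensions.cs` (path segments instead of a path-like string) can be illustrated in Python with nested dictionaries standing in for `JsonElement`. The function names below are hypothetical and only mirror the idea, not the actual C# signatures.

```python
def get_string_value(properties, path_segments):
    """Walk nested properties along pre-split path segments; return None if missing."""
    current = properties
    for segment in path_segments:
        if not isinstance(current, dict) or segment not in current:
            return None
        current = current[segment]
    return current if isinstance(current, str) else None


def get_string_value_from_path(properties, path):
    """Old style: the path-like string has to be split on every single call."""
    return get_string_value(properties, path.split("/"))


properties = {"nexus": {"description": "Wind speed at hub height"}}

# New style: the caller builds the segment list once and reuses it for many lookups.
DESCRIPTION_PATH = ["nexus", "description"]
description = get_string_value(properties, DESCRIPTION_PATH)
```

The gain comes from callers that look up the same path for many resources: the segment list is created once instead of being recomputed per lookup.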
src/extensibility/dotnet-extensibility/Extensibility/DataSource/DataSourceTypes.cs
- follow previous code changes
src/extensibility/dotnet-extensibility/Extensibility/DataSource/IDataSource.cs
- `GetCatalogAsync` has been renamed to `EnrichCatalogAsync` and the parameters changed
src/extensibility/python-extensibility/nexus_extensibility/_extensibility_data_source.py
- mirror C# changes to Python
tests/Nexus.Tests/DataSource/DataSourceControllerFixture.cs
- this unit test fixture prepares test data, i.e. it prepares data source registrations (now two instead of one because we want to test the new pipeline behavior)
tests/Nexus.Tests/DataSource/DataSourceControllerTests.cs
- Tests have been adapted to the pipeline feature
tests/Nexus.Tests/DataSource/SampleDataSourceTests.cs
- follow previous code changes
tests/Nexus.Tests/DataSource/TestSource.cs
- A data source to be used in the tests which modifies existing resources and adds a new resource to the catalog. This data source is placed in pipeline position `1`, i.e. after the actual data source
tests/Nexus.Tests/Other/CatalogContainersExtensionsTests.cs
- Mainly code format changes
- follow previous code changes
tests/Nexus.Tests/Other/PackageControllerTests.cs
- follow previous code changes
tests/Nexus.Tests/Services/CatalogManagerTests.cs
- Tests have been adapted to the pipeline feature
tests/Nexus.Tests/Services/DataControllerServiceTests.cs
tests/Nexus.Tests/Services/DataServiceTests.cs
- follow previous code changes
tests/Nexus.Tests/Services/PipelineServiceTests.cs
- tests for the `PipelineService`
tests/Nexus.Tests/Services/TokenServiceTests.cs
- fix warnings
tests/TestExtensionProject/TestDataSource.cs
- follow previous code changes