Release v0.5.0 #207

Merged 1 commit on Jul 3, 2024

@nfx (Collaborator) commented on Jul 3, 2024

  • Added Command Execution backend which uses Command Execution API on a cluster (#95). This release adds a Command Execution backend to the Databricks Labs lsql library. The new CommandExecutionBackend class initializes a CommandExecutor instance from a cluster ID, workspace client, and language. Its execute method runs SQL commands on the specified cluster, its fetch method returns the query result as an iterator of Row objects, and its save_table method saves the results to tables in the Databricks workspace. The existing StatementExecutionBackend class now inherits from a new abstract base class, ExecutionBackend, which includes the save_table method and is meant to be the common base class for both the Statement and Command Execution backends. The StatementExecutionBackend constructor now also accepts a max_records_per_batch parameter. The execute and fetch methods use the new _only_n_bytes method to truncate SQL statements in log output.
  • Added basic integration with Lakeview Dashboards (#66). In this release, we've added basic integration with Lakeview Dashboards to the project, enhancing its capabilities. This includes updating the databricks-labs-blueprint dependency to version 0.4.2 with the [yaml] extra, allowing for additional functionality related to handling YAML files. A new file, dashboards.py, has been introduced, providing a class for interacting with Databricks dashboards, along with methods for retrieving and saving dashboard configurations. Additionally, a new __init__.py file under the src/databricks/labs/lsql/lakeview directory imports all classes and functions from the model.py module, providing a foundation for further development and customization. The release also introduces a new file, model.py, containing code generated from OpenAPI specs by the Databricks SDK Generator, and a template file, model.py.tmpl, used for handling JSON data during integration with Lakeview Dashboards. A new file, polymorphism.py, provides utilities for checking if a value can be assigned to a specific type, supporting correct data typing and formatting with Lakeview Dashboards. Furthermore, a .gitignore file has been added to the tests/integration directory as part of the initial steps in adding integration testing to ensure compatibility with the Lakeview Dashboards platform. Lastly, the test_dashboards.py file in the tests/integration directory contains a function, test_load_dashboard(ws), which uses the Dashboards class to save a dashboard from a source to a destination path, facilitating testing during the integration process.
  • Added dashboard-as-code functionality (#201). This commit introduces dashboard-as-code functionality for the UCX project, enabling the creation and management of dashboards using code. The feature resolves multiple issues and includes a new create-dashboard command for creating unpublished dashboards. The functionality is available in the lsql lab and allows for specifying the order and width of widgets, overriding default widget identifiers, and supporting various SQL and markdown header arguments. The dashboard.yml file is used to define top-level metadata for the dashboard. This commit also includes extensive documentation and examples for using the dashboard as a library and configuring different options.
  • Automate opening integration test dashboard in debug mode (#167). The integration test dashboard now opens automatically in debug mode, making it easier to debug and troubleshoot. The change imports the standard-library webbrowser module and is_in_debug from databricks.labs.blueprint.entrypoint, and adds a check in the create function to determine whether the code is running in debug mode. If it is, a dashboard URL is constructed from the workspace configuration and dashboard ID and opened in a web browser using webbrowser.open. No other parts of the code are affected by this change.
  • Automatically tile widgets (#109). In this release, we've introduced an automatic widget tiling feature for the dashboard creation process in our open-source library. The Dashboards class now includes a new class variable, _maximum_dashboard_width, set to 6, representing the maximum width allowed for each row of widgets in the dashboard. The create_dashboard method has been updated to accept a new self parameter, turning it into an instance method. A new _get_position method has been introduced to calculate and return the next available position for placing a widget, and a _get_width_and_height method has been added to return the width and height for a widget specification, initially handling CounterSpec instances. Additionally, we've added new unit tests to improve testing coverage, ensuring that widgets are created, positioned, and sized correctly. These tests also cover the correct positioning of widgets based on their order and available space, as well as the expected width and height for each widget.
  • Bump actions/checkout from 4.1.3 to 4.1.6 (#102). In the latest release, the 'actions/checkout' GitHub Action has been updated from version 4.1.3 to 4.1.6, which includes checking the platform to set the archive extension appropriately. This release also bumps the version of github/codeql-action from 2 to 3, actions/setup-node from 1 to 4, and actions/upload-artifact from 2 to 4. Additionally, the minor-actions-dependencies group was updated with two new versions. Disabling extensions.worktreeConfig when disabling sparse-checkout was introduced in version 4.1.4. The release notes and changelog for this update can be found in the provided link. This commit was made by dependabot[bot] with contributions from cory-miller and jww3.
  • Bump actions/checkout from 4.1.6 to 4.1.7 (#151). The actions/checkout GitHub Action, which checks out the repository at the start of the workflow, has been updated from version 4.1.6 to 4.1.7 in the project's push workflow. The update changes only the version number of the actions/checkout step in the release.yml file and does not add methods or alter existing functionality.
  • Create a dashboard with a counter from a single query (#107). In this release, we have introduced several enhancements to our dashboard-as-code approach, including the creation of a Dashboards class that provides methods for getting, saving, and deploying dashboards. A new method, create_dashboard, has been added to create a dashboard with a single page containing a counter widget. The counter widget is associated with a query that counts the number of rows in a specified dataset. The deploy_dashboard method has also been added to deploy the dashboard to the workspace. Additionally, we have implemented a new feature for creating dashboards with a counter from a single query, including modifications to the test_dashboards.py file and the addition of four new tests. These changes improve the robustness of the dashboard creation process and provide a more automated way to view important metrics.
  • Create text widget from markdown file (#142). A new feature has been implemented in the library that allows for the creation of a text widget from a markdown file, enhancing customization and readability for users. This development resolves issue #1
  • Design document for dashboards-as-code (#105). The latest release introduces "Dashboards as Code", a method for defining and managing dashboards through configuration files, enabling version control and controlled changes. The building blocks are .sql, .md, and dashboard.yml files: .sql files define queries and determine tile order, and dashboard.yml specifies top-level metadata and tile overrides. Metadata can be inferred or explicitly defined in the query or files. The tile order can be determined by SQL file order, the tiles order in dashboard.yml, or SQL file metadata. This project can also be used as a library for embedding dashboard generation in your code. Configuration precedence follows this order: command-line flags, SQL file headers, dashboard.yml, and SQL query content. The command-line interface generates dashboards from configuration files.
  • Ensure propagation of lsql version into User-Agent header when it is used as library (#206). In this release, the pyproject.toml file has been updated to ensure that the correct version of the lsql library is propagated into the User-Agent header when used as a library, improving attribution. The databricks-sdk version has been updated from 0.22.0 to 0.29.0, and the __init__.py file of the lsql library has been modified to add the with_user_agent_extra function from the databricks.sdk.core package for correct attribution. The backends.py file has also been updated with improved type handling in the _row_to_sql and save_table functions for accurate SQL insertion and handling of user-defined classes. Additionally, a test has been added to ensure that the lsql version is correctly propagated in the User-Agent header when used as a library. These changes offer improved functionality and accurate type handling, making it easier for developers to identify the library version when used in other projects.
  • Fixed counter encodings (#143). This release fixes the encoding of counters in lsql dashboards by modifying the create_dashboard function in the dashboards.py file. Previously, the counter field encoding was hardcoded as "count"; it now uses the name of the first of the given fields, on the expectation that a counter has only one field. A new integration test in tests/integration/test_dashboards.py checks that dashboard deployment correctly handles SQL queries that do not perform a count. A new test for the Dashboards class, in which the WorkspaceClient is mocked and not called, checks that counter field encoding names are created as expected.
  • Fixed non-existing reference and typo in the documentation (#104). This release fixes a non-existent reference and a typo in the Library size comparison section of the comparison.md document. That section provides guidance for selecting a library based on factors like library size, unified authentication, and compatibility with various Databricks warehouses and SQL Python APIs. The updates clarify the required dependency size for simple applications and scripts and offer more detailed information about each library option. A new subsection titled Detailed comparison gives a more comprehensive overview of each library's features. These changes are intended to help software engineers choose the library best suited to their needs. For applications that transfer large amounts of data serialized in Apache Arrow format and need low result fetching latency, we recommend the Databricks SQL Connector for Python.
  • Fixed parsing message (#146). In this release, the warning message logged during the creation of a dashboard when a ParseError occurs has been updated to provide clearer and more detailed information about the parsing error. The new error message now includes the specific query being parsed and the exact parsing error, enabling developers to quickly identify the cause of parsing issues. This change ensures that engineers can efficiently diagnose and address parsing errors, improving the overall development and debugging experience with a more informative log format: "Parsing {query}: {error}".
  • Improve dashboard as code (#108). The Dashboards class in the dashboards.py file has been updated to improve functionality and usability. A type variable T has been added for type checking, and methods have more descriptive names. The save_to_folder method now accepts and returns a Dashboard object and formats SQL code before saving it to file, and the get_dashboard method now returns a Dashboard object instead of a dictionary. A new static method create_dashboard has been added, two new methods _with_better_names and _replace_names have been added for improved readability, and a new replace_recursively function replaces specific fields in a dataclass recursively. In the tests/integration directory, the queries/counter.sql file has been moved to dashboards/one_counter/counter.sql. Several tests for the Dashboards class in the databricks.labs.lsql.dashboards module have been introduced. They cover saving SQL and YML files to a specified folder, creating a dataset and a counter widget for each query, deploying dashboards with a given display name or dashboard ID, and the behavior of the save_to_folder and deploy_dashboard methods. The commit removes the test_load_dashboard function and updates the test_dashboard_creates_one_dataset_per_query and test_dashboard_creates_one_counter_widget_per_query functions to use the updated Dashboard class. Two test functions have been added: test_dashboards_deploys_exported_dashboard_definition reads a dashboard definition from a JSON file, deploys it, and checks that it was deployed successfully, and test_dashboard_deploys_dashboard_the_same_as_created_dashboard compares the original and deployed dashboards to ensure they are identical.
  • Infer fields from a query (#111). The Dashboards class in the dashboards.py file has been updated with the addition of a new method, _get_fields, which accepts a SQL query as input and returns a list of Field objects using the sqlglot library to parse the query and extract the necessary information. The create_dashboard method has been modified to call this new function when creating Query objects for each dataset. If a ParseError occurs, a warning is logged and iteration continues. This allows for the automatic population of fields when creating a new dashboard, eliminating the need for manual specification. Additionally, new tests have been added for invalid queries and for checking if the fields in a query have the expected names. These tests include test_dashboards_skips_invalid_query and test_dashboards_gets_fields_with_expected_names, which utilize the caplog fixture and create temporary query files to verify functionality. Existing functionality related to creating dashboards remains unchanged.
  • Make constant all caps (#140). In this release, the dashboards.py file has been updated to improve code readability and maintainability. The constant _maximum_dashboard_width has been renamed to all caps, becoming _MAXIMUM_DASHBOARD_WIDTH. This modification affects the Dashboards class and its methods, particularly _get_fields and _get_position; the _get_position method has been revised to use the renamed constant. The change makes constants easier to spot in the code, addressing issue #140, and affects only the dashboards.py file.
  • Read display name from dashboard.yml (#144). In this release, we have introduced a new DashboardMetadata dataclass that reads the display name of a dashboard from a dashboard.yml file located in the dashboard's directory. If the dashboard.yml file is absent, the folder name will be used as the display name. This change improves the readability and maintainability of the dashboard configuration by explicitly defining the display name and reducing the need to specify widget information in multiple places. We have also added a new fixture called make_dashboard for creating and cleaning up lakeview dashboards in the test suite. The fixture handles creation and deletion of the dashboard and provides an option to set a custom display name. Additionally, we have added and modified several unit tests to ensure the proper handling of the DashboardMetadata class and the dashboard creation process, including tests for missing, present, or incorrect display_name keys in the YAML file. The dashboards.deploy_dashboard() function has been updated to handle cases where only dashboard_id is provided.
  • Set widget id in query header (#154). In this release, we've made significant improvements to widget metadata handling in our open-source library. We've introduced a new WidgetMetadata class that replaces the previous WidgetMetadata dataclass, now featuring a path attribute, spec_type property, and optional parameters for order, width, height, and _id. The _get_widgets method has been updated to accept an Iterable of WidgetMetadata objects, and both _get_layouts and _get_widgets methods now sort widgets using the order field. A new class method, WidgetMetadata.from_path, handles parsing widget metadata from a file path, replacing the removed _get_width_and_height method. Additionally, the WidgetMetadata class is now used in the deploy_dashboard method, and the test suite for the dashboards module has been enhanced with updated test_widget_metadata_replaces_width_and_height and test_widget_metadata_replaces_attribute functions, as well as new tests for specific scenarios. Issue #154 has been addressed by setting the widget id in the query header, and the aforementioned changes improve flexibility and ease of use for dashboard development.
  • Use order key in query header if defined (#149). In this release, we've introduced a new feature to use an order key in the query header if defined, enhancing the flexibility and control over the dashboard creation process. The WidgetMetadata dataclass now includes an optional order parameter of type int, and the _get_arguments_parser() method accepts the --order flag with type int. The replace_from_arguments() method has been updated to support the new order parameter, with a default value of self.order. The create_dashboard() method now implements a new _get_datasets() method to retrieve datasets from the dashboard folder and introduces a _get_widgets() method, which accepts a list of files, iterates over them, and yields tuples containing widgets and their corresponding metadata, including the order. These improvements enable the use of an order key in query headers, ensuring the correct order of widgets in the dashboard creation process. Additionally, a new test case has been added to verify the correct behavior of the dashboard deployment with a specified order key in the query header. This feature resolves issue #148.
  • Use widget width and height defined in query header (#147). In this release, metadata in SQL files is read from the header of the file instead of the first line, for improved readability and flexibility. This change includes a new WidgetMetadata class for defining the width and height of a widget in a dashboard, as well as new methods for parsing the widget metadata from a provided path. The documentation now covers the supported widget arguments -w or --width and -h or --height. The release resolves issue #114 by adding a test, test_dashboard_deploys_dashboard_with_big_widget, for deploying a dashboard with a big widget. New test cases have also been added for creating dashboards with custom-sized widgets based on the width and height values in the query header.
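
The query-header and tiling entries above (#109, #147, #149, #154) can be illustrated with a small self-contained sketch. This is not lsql's implementation: `WidgetHeader`, `parse_header`, and `tile` are hypothetical names, and the `/* ... */` comment as the header format is an assumption made for illustration. Only the flag names (`--id`, `--order`, `-w`/`--width`, `-h`/`--height`) and the maximum row width of 6 come from the notes above.

```python
import argparse
import shlex
from dataclasses import dataclass

MAXIMUM_DASHBOARD_WIDTH = 6  # row width taken from the tiling entry (#109)


@dataclass
class WidgetHeader:
    """Hypothetical stand-in for the widget metadata described above."""

    name: str
    order: int
    width: int = 1
    height: int = 1


def parse_header(name: str, sql: str, default_order: int) -> WidgetHeader:
    """Read flags from a leading /* ... */ comment (assumed header format)."""
    # add_help=False frees the -h short flag so it can mean --height.
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--id", dest="name", default=name)
    parser.add_argument("--order", type=int, default=default_order)
    parser.add_argument("-w", "--width", type=int, default=1)
    parser.add_argument("-h", "--height", type=int, default=1)
    text = sql.lstrip()
    flags = ""
    if text.startswith("/*") and "*/" in text:
        flags = text[2 : text.index("*/")]
    args = parser.parse_args(shlex.split(flags))
    return WidgetHeader(args.name, args.order, args.width, args.height)


def tile(widgets: list[WidgetHeader]) -> dict[str, tuple[int, int]]:
    """Place widgets left to right by order, wrapping when a row is full."""
    x = y = row_height = 0
    positions: dict[str, tuple[int, int]] = {}
    for widget in sorted(widgets, key=lambda w: w.order):
        if x + widget.width > MAXIMUM_DASHBOARD_WIDTH:
            # Start a new row below the tallest widget of the current row.
            x, y, row_height = 0, y + row_height, 0
        positions[widget.name] = (x, y)
        x += widget.width
        row_height = max(row_height, widget.height)
    return positions


# Illustrative queries keyed by file stem; the default order is the file order.
queries = {
    "counts": "/* --order 2 --width 2 --height 3 */\nSELECT COUNT(*) AS n FROM events",
    "notes": "SELECT 'hello' AS greeting",
    "table": "/* --id big --order 0 -w 6 -h 4 */\nSELECT * FROM events",
}
headers = [
    parse_header(name, sql, index)
    for index, (name, sql) in enumerate(sorted(queries.items()))
]
positions = tile(headers)
print(positions)  # {'big': (0, 0), 'notes': (0, 4), 'counts': (1, 4)}
```

In this sketch the `table` query is renamed to `big` by `--id` and moved to the front by `--order 0`. Because it fills the whole row, the next two widgets wrap to the row starting at y = 4.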

Dependency updates:

  • Bump actions/checkout from 4.1.3 to 4.1.6 (#102).
  • Bump actions/checkout from 4.1.6 to 4.1.7 (#151).

* Added Command Execution backend which uses Command Execution API on a cluster ([#95](#95)). In this release, the databricks labs lSQL library has been updated with a new Command Execution backend that utilizes the Command Execution API. A new `CommandExecutionBackend` class has been implemented, which initializes a `CommandExecutor` instance taking a cluster ID, workspace client, and language as parameters. The `execute` method runs SQL commands on the specified cluster, and the `fetch` method returns the query result as an iterator of Row objects. The existing `StatementExecutionBackend` class has been updated to inherit from a new abstract base class called `ExecutionBackend`, which includes a `save_table` method for saving data to tables and is meant to be a common base class for both Statement and Command Execution backends. The `StatementExecutionBackend` class has also been updated to use the new `ExecutionBackend` abstract class and its constructor now accepts a `max_records_per_batch` parameter. The `execute` and `fetch` methods have been updated to use the new `_only_n_bytes` method for logging truncated SQL statements. Additionally, the `CommandExecutionBackend` class has several methods, `execute`, `fetch`, and `save_table` to execute commands on a cluster and save the results to tables in the databricks workspace. This new backend is intended to be used for executing commands on a cluster and saving the results in a databricks workspace.
* Added basic integration with Lakeview Dashboards ([#66](#66)). In this release, we've added basic integration with Lakeview Dashboards to the project, enhancing its capabilities. This includes updating the `databricks-labs-blueprint` dependency to version 0.4.2 with the `[yaml]` extra, allowing for additional functionality related to handling YAML files. A new file, `dashboards.py`, has been introduced, providing a class for interacting with Databricks dashboards, along with methods for retrieving and saving dashboard configurations. Additionally, a new `__init__.py` file under the `src/databricks/labs/lsql/lakeview` directory imports all classes and functions from the `model.py` module, providing a foundation for further development and customization. The release also introduces a new file, `model.py`, containing code generated from OpenAPI specs by the Databricks SDK Generator, and a template file, `model.py.tmpl`, used for handling JSON data during integration with Lakeview Dashboards. A new file, `polymorphism.py`, provides utilities for checking if a value can be assigned to a specific type, supporting correct data typing and formatting with Lakeview Dashboards. Furthermore, a `.gitignore` file has been added to the `tests/integration` directory as part of the initial steps in adding integration testing to ensure compatibility with the Lakeview Dashboards platform. Lastly, the `test_dashboards.py` file in the `tests/integration` directory contains a function, `test_load_dashboard(ws)`, which uses the `Dashboards` class to save a dashboard from a source to a destination path, facilitating testing during the integration process.
* Added dashboard-as-code functionality ([#201](#201)). This commit introduces dashboard-as-code functionality for the UCX project, enabling the creation and management of dashboards using code. The feature resolves multiple issues and includes a new `create-dashboard` command for creating unpublished dashboards. The functionality is available in the `lsql` lab and allows for specifying the order and width of widgets, overriding default widget identifiers, and supporting various SQL and markdown header arguments. The `dashboard.yml` file is used to define top-level metadata for the dashboard. This commit also includes extensive documentation and examples for using the dashboard as a library and configuring different options.
* Automate opening integration test dashboard in debug mode ([#167](#167)). A new feature has been added to automatically open the integration test dashboard in debug mode, making it easier for software engineers to debug and troubleshoot. This has been achieved by importing the `webbrowser` and `is_in_debug` modules from "databricks.labs.blueprint.entrypoint", and adding a check in the `create` function to determine if the code is running in debug mode. If it is, a dashboard URL is constructed from the workspace configuration and dashboard ID, and then opened in a web browser using "webbrowser.open". This allows for a more streamlined debugging process for the integration test dashboard. No other parts of the code have been affected by this change.
* Automatically tile widgets ([#109](#109)). In this release, we've introduced an automatic widget tiling feature for the dashboard creation process in our open-source library. The `Dashboards` class now includes a new class variable, `_maximum_dashboard_width`, set to 6, representing the maximum width allowed for each row of widgets in the dashboard. The `create_dashboard` method has been updated to accept a new `self` parameter, turning it into an instance method. A new `_get_position` method has been introduced to calculate and return the next available position for placing a widget, and a `_get_width_and_height` method has been added to return the width and height for a widget specification, initially handling `CounterSpec` instances. Additionally, we've added new unit tests to improve testing coverage, ensuring that widgets are created, positioned, and sized correctly. These tests also cover the correct positioning of widgets based on their order and available space, as well as the expected width and height for each widget.
* Bump actions/checkout from 4.1.3 to 4.1.6 ([#102](#102)). In the latest release, the 'actions/checkout' GitHub Action has been updated from version 4.1.3 to 4.1.6, which includes checking the platform to set the archive extension appropriately. This release also bumps the version of github/codeql-action from 2 to 3, actions/setup-node from 1 to 4, and actions/upload-artifact from 2 to 4. Additionally, the minor-actions-dependencies group was updated with two new versions. Disabling extensions.worktreeConfig when disabling sparse-checkout was introduced in version 4.1.4. The release notes and changelog for this update can be found in the provided link. This commit was made by dependabot[bot] with contributions from cory-miller and jww3.
* Bump actions/checkout from 4.1.6 to 4.1.7 ([#151](#151)). In the latest release, the 'actions/checkout' GitHub action has been updated from version 4.1.6 to 4.1.7 in the project's push workflow, which checks out the repository at the start of the workflow. This change brings potential bug fixes, performance improvements, or new features compared to the previous version. The update only affects the version number in the YAML configuration for the 'actions/checkout' step in the release.yml file, with no new methods or alterations to existing functionality. This update aims to ensure a smooth and enhanced user experience for those utilizing the project's push workflows by taking advantage of the possible improvements or bug fixes in the new version of 'actions/checkout'.
* Create a dashboard with a counter from a single query ([#107](#107)). In this release, we have introduced several enhancements to our dashboard-as-code approach, including the creation of a `Dashboards` class that provides methods for getting, saving, and deploying dashboards. A new method, `create_dashboard`, has been added to create a dashboard with a single page containing a counter widget. The counter widget is associated with a query that counts the number of rows in a specified dataset. The `deploy_dashboard` method has also been added to deploy the dashboard to the workspace. Additionally, we have implemented a new feature for creating dashboards with a counter from a single query, including modifications to the `test_dashboards.py` file and the addition of four new tests. These changes improve the robustness of the dashboard creation process and provide a more automated way to view important metrics.
* Create text widget from markdown file ([#142](#142)). A new feature has been implemented in the library that allows for the creation of a text widget from a markdown file, enhancing customization and readability for users. This development resolves issue [#1](#1)
* Design document for dashboards-as-code ([#105](#105)). "The latest release introduces 'Dashboards as Code,' a method for defining and managing dashboards through configuration files, enabling version control and controlled changes. The building blocks include `.sql`, `.md`, and `dashboard.yml` files, with `.sql` defining queries and determining tile order, and `dashboard.yml` specifying top-level metadata and tile overrides. Metadata can be inferred or explicitly defined in the query or files. The tile order can be determined by SQL file order, `tiles` order in `dashboard.yml`, or SQL file metadata. This project can also be used as a library for embedding dashboard generation in your code. Configuration precedence follows command-line flags, SQL file headers, `dashboard.yml`, and SQL query content. The command-line interface is utilized for dashboard generation from configuration files."
* Ensure propagation of `lsql` version into `User-Agent` header when it is used as library ([#206](#206)). In this release, the `pyproject.toml` file has been updated to ensure that the correct version of the `lsql` library is propagated into the `User-Agent` header when used as a library, improving attribution. The `databricks-sdk` version has been updated from `0.22.0` to `0.29.0`, and the `__init__.py` file of the `lsql` library has been modified to add the `with_user_agent_extra` function from the `databricks.sdk.core` package for correct attribution. The `backends.py` file has also been updated with improved type handling in the `_row_to_sql` and `save_table` functions for accurate SQL insertion and handling of user-defined classes. Additionally, a test has been added to ensure that the `lsql` version is correctly propagated in the `User-Agent` header when used as a library. These changes offer improved functionality and accurate type handling, making it easier for developers to identify the library version when used in other projects.
* Fixed counter encodings ([#143](#143)). This release fixes the encoding of counters in lsql dashboards by modifying the `create_dashboard` function in `dashboards.py`. Previously, the counter field encoding was hardcoded as "count"; it is now taken from the first field name of the given fields, because counters are expected to have only one field. A new integration test in `tests/integration/test_dashboards.py` checks that dashboard deployment correctly handles SQL queries that do not perform a count. A new test for the `Dashboards` class checks that counter field encoding names are created as expected; the `WorkspaceClient` is mocked and not called in this test.
* Fixed non-existing reference and typo in the documentation ([#104](#104)). This release fixes a non-existent reference and a typo in the `Library size comparison` section of `comparison.md`, addressing issue [#104](#104). This section provides guidance for selecting a library based on factors like library size, unified authentication, and compatibility with various Databricks warehouses and SQL Python APIs. The updates clarify the required dependency size for simple applications and scripts, and offer more detailed information about each library option. A new subsection titled `Detailed comparison` provides a more comprehensive overview of each library's features. These changes are intended to help software engineers choose the library best suited to their needs. For applications that transfer large amounts of data serialized in Apache Arrow format and need low result fetching latency, we recommend the Databricks SQL Connector for Python.
* Fixed parsing message ([#146](#146)). The warning logged when a `ParseError` occurs during dashboard creation now includes the specific query being parsed and the exact parsing error, so developers can quickly identify the cause of parsing issues. The new log format is `Parsing {query}: {error}`.
* Improve dashboard as code ([#108](#108)). The `Dashboards` class in `dashboards.py` has been updated to improve functionality and usability, with a type variable `T` added for type checking and more descriptive method names. The `save_to_folder` method now accepts a `Dashboard` object, returns a `Dashboard` object, and formats SQL code before saving it to file. A new static method `create_dashboard` has been added, along with two new methods, `_with_better_names` and `_replace_names`, for improved readability. The `get_dashboard` method now returns a `Dashboard` object instead of a dictionary. A new `replace_recursively` function replaces specific fields in a dataclass recursively. In the project structure, `queries/counter.sql` has been moved to `dashboards/one_counter/counter.sql` in the `tests/integration` directory. Several tests for the `Dashboards` class have been introduced in the `databricks.labs.lsql.dashboards` module; they cover saving SQL and YML files to a specified folder, creating a dataset and a counter widget for each query, deploying dashboards with a given display name or dashboard ID, and the behavior of the `save_to_folder` and `deploy_dashboard` methods. The commit removes the `test_load_dashboard` function and updates the `test_dashboard_creates_one_dataset_per_query` and `test_dashboard_creates_one_counter_widget_per_query` functions to use the updated `Dashboard` class. A new test function, `test_dashboards_deploys_exported_dashboard_definition`, reads a dashboard definition from a JSON file, deploys it, and checks that it is successfully deployed using the `Dashboards` class. Another new test function, `test_dashboard_deploys_dashboard_the_same_as_created_dashboard`, compares the original and deployed dashboards to ensure they are identical.
* Infer fields from a query ([#111](#111)). The `Dashboards` class in `dashboards.py` has a new method, `_get_fields`, which accepts a SQL query and returns a list of `Field` objects, using the `sqlglot` library to parse the query and extract the necessary information. The `create_dashboard` method now calls this new method when creating `Query` objects for each dataset. If a `ParseError` occurs, a warning is logged and iteration continues. Fields are therefore populated automatically when creating a new dashboard, eliminating the need for manual specification. New tests cover invalid queries and check that the fields in a query have the expected names: `test_dashboards_skips_invalid_query` and `test_dashboards_gets_fields_with_expected_names`, which use the `caplog` fixture and create temporary query files to verify functionality. Existing functionality related to creating dashboards remains unchanged.
* Make constant all caps ([#140](#140)). In this release, `dashboards.py` has been updated to improve code readability and maintainability. The constant `_maximum_dashboard_width` has been renamed to all caps, becoming `_MAXIMUM_DASHBOARD_WIDTH`. This modification affects the `Dashboards` class and its methods, particularly `_get_fields` and `_get_position`. The `_get_position` method has been revised to use the renamed constant. This change makes constants easier to spot in the code, addressing issue [#140](#140). The modification only impacts `dashboards.py` and does not affect any other functionality.
* Read display name from `dashboard.yml` ([#144](#144)). In this release, we have introduced a new `DashboardMetadata` dataclass that reads the display name of a dashboard from a `dashboard.yml` file located in the dashboard's directory. If the `dashboard.yml` file is absent, the folder name will be used as the display name. This change improves the readability and maintainability of the dashboard configuration by explicitly defining the display name and reducing the need to specify widget information in multiple places. We have also added a new fixture called `make_dashboard` for creating and cleaning up lakeview dashboards in the test suite. The fixture handles creation and deletion of the dashboard and provides an option to set a custom display name. Additionally, we have added and modified several unit tests to ensure the proper handling of the `DashboardMetadata` class and the dashboard creation process, including tests for missing, present, or incorrect `display_name` keys in the YAML file. The `dashboards.deploy_dashboard()` function has been updated to handle cases where only `dashboard_id` is provided.
* Set widget id in query header ([#154](#154)). In this release, we've made significant improvements to widget metadata handling in our open-source library. We've introduced a new `WidgetMetadata` class that replaces the previous `WidgetMetadata` dataclass, now featuring a `path` attribute, `spec_type` property, and optional parameters for `order`, `width`, `height`, and `_id`. The `_get_widgets` method has been updated to accept an Iterable of `WidgetMetadata` objects, and both `_get_layouts` and `_get_widgets` methods now sort widgets using the order field. A new class method, `WidgetMetadata.from_path`, handles parsing widget metadata from a file path, replacing the removed `_get_width_and_height` method. Additionally, the `WidgetMetadata` class is now used in the `deploy_dashboard` method, and the test suite for the `dashboards` module has been enhanced with updated `test_widget_metadata_replaces_width_and_height` and `test_widget_metadata_replaces_attribute` functions, as well as new tests for specific scenarios. Issue [#154](#154) has been addressed by setting the widget id in the query header, and the aforementioned changes improve flexibility and ease of use for dashboard development.
* Use order key in query header if defined ([#149](#149)). This release adds support for an order key in the query header, giving more control over the dashboard creation process. The `WidgetMetadata` dataclass now includes an optional `order` parameter of type `int`, and the `_get_arguments_parser()` method accepts the `--order` flag with type `int`. The `replace_from_arguments()` method has been updated to support the new `order` parameter, with a default value of `self.order`. The `create_dashboard()` method now uses a new `_get_datasets()` method to retrieve datasets from the dashboard folder and a new `_get_widgets()` method, which accepts a list of files, iterates over them, and yields tuples containing widgets and their corresponding metadata, including the order. These improvements ensure that widgets appear in the intended order when a dashboard is created. A new test case verifies dashboard deployment with an order key specified in the query header. This feature resolves issue [#148](#148).
* Use widget width and height defined in query header ([#147](#147)). In this release, metadata in SQL files is read from the header of the file instead of the first line, for improved readability and flexibility. This change includes a new `WidgetMetadata` class for defining the width and height of a widget in a dashboard, as well as new methods for parsing the widget metadata from a provided path. The documentation now covers the supported widget arguments `-w` or `--width` and `-h` or `--height`. The release resolves issue [#114](#114) by adding a test for deploying a dashboard with a big widget, `test_dashboard_deploys_dashboard_with_big_widget`. New test cases have also been added for creating dashboards with custom-sized widgets based on query header width and height values.
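
The query-header flags described in the last three entries (`--order`, `-w`/`--width`, `-h`/`--height`, and a widget id) can be illustrated with a small standard-library sketch. It is not the `lsql` implementation: the `HeaderMetadata` dataclass, the `parse_header` helper, the `--id` flag name, and the assumption that the header is a single leading `-- ...` comment line are all hypothetical and chosen only for illustration.

```python
import argparse
import shlex
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeaderMetadata:
    """Hypothetical stand-in for the widget metadata described above."""

    order: Optional[int] = None
    width: Optional[int] = None
    height: Optional[int] = None
    widget_id: Optional[str] = None


def build_parser() -> argparse.ArgumentParser:
    # add_help=False frees "-h" so it can mean "--height" instead of "--help".
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--id", dest="widget_id", type=str)  # flag name is an assumption
    parser.add_argument("--order", type=int)
    parser.add_argument("-w", "--width", type=int)
    parser.add_argument("-h", "--height", type=int)
    return parser


def parse_header(sql: str) -> HeaderMetadata:
    """Read flags from a leading `-- ...` comment line, if there is one."""
    first_line = sql.lstrip().splitlines()[0] if sql.strip() else ""
    if not first_line.startswith("--"):
        return HeaderMetadata()
    try:
        flags = shlex.split(first_line[2:].strip())
    except ValueError:
        # Unbalanced quotes in an ordinary comment: treat it as "no metadata".
        return HeaderMetadata()
    # parse_known_args tolerates tokens this sketch does not know about.
    args, _ = build_parser().parse_known_args(flags)
    return HeaderMetadata(**vars(args))


# Two hypothetical query files, sorted by the order key from their headers.
queries = {
    "b.sql": "-- --order 2 --width 3\nSELECT 2 AS b",
    "a.sql": "-- --order 1 --width 6 -h 3\nSELECT 1 AS a",
}
ordered = sorted(queries, key=lambda name: parse_header(queries[name]).order)
```

Passing `add_help=False` is the one design choice worth noting: without it, `argparse` reserves `-h` for `--help`, which would clash with the `-h` shorthand for `--height`.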

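The precedence rule from the design document entry ([#105](#105)), which lists command-line flags, SQL file headers, `dashboard.yml`, and SQL query content, can be sketched with `collections.ChainMap`. This assumes the list runs from highest to lowest precedence, which the entry implies but does not state outright; `resolve_settings` and the setting names are hypothetical and not part of `lsql`.

```python
from collections import ChainMap
from typing import Any, Dict


def resolve_settings(
    cli_flags: Dict[str, Any],
    sql_header: Dict[str, Any],
    dashboard_yml: Dict[str, Any],
    inferred_from_query: Dict[str, Any],
) -> Dict[str, Any]:
    """Merge settings so that earlier sources win over later ones."""
    layers = [cli_flags, sql_header, dashboard_yml, inferred_from_query]
    # Drop unset (None) values so they do not shadow lower-precedence layers.
    cleaned = [
        {key: value for key, value in layer.items() if value is not None}
        for layer in layers
    ]
    # ChainMap looks keys up in the first mapping that contains them.
    return dict(ChainMap(*cleaned))


settings = resolve_settings(
    cli_flags={"width": None},
    sql_header={"width": 6, "order": 1},
    dashboard_yml={"width": 3, "title": "Counter"},
    inferred_from_query={"title": "count", "height": 2},
)
```

In this example the header's `width` beats the one in `dashboard.yml`, the `dashboard.yml` title beats the name inferred from the query, and `height` falls through to the inferred value because no higher layer sets it.
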
Dependency updates:

 * Bump actions/checkout from 4.1.3 to 4.1.6 ([#102](#102)).
 * Bump actions/checkout from 4.1.6 to 4.1.7 ([#151](#151)).

@nfx merged commit 619ff0a into main on Jul 3, 2024
8 checks passed
@nfx deleted the prepare/0.5.0 branch on July 3, 2024 11:02

github-actions bot commented Jul 3, 2024

✅ 35/35 passed, 2 skipped, 50m6s total

Running from acceptance #279
