- #408: Enable Scan On Push for the ECR Repository
- #407: Upgrade frontend dependencies
- #409: Upgrade frontend dependencies
- #410: Upgrade backend dependencies
- #411: Upgrade frontend dependencies
- #412: Upgrade backend dependencies
- #413: Upgrade frontend dependencies
- #414: Upgrade frontend dependencies
- #416: Upgrade backend dependencies
- #415: Upgrade frontend dependencies and move to node20
- #405: Replace AmazonAPIGatewayInvokeFullAccess managed policy with inline policy
- #395: Speed up the json_handler by migrating from a list to a set, reducing lookups from O(n) to O(1)
- #393: Fix for a bug affecting Create Data Mapper API with manually created Glue Tables that don't contain SerdeInfo Parameters metadata
- #390: Upgrade frontend dependencies and fix bug affecting Create Data Mapper UI with manually created Glue Tables
- #389: Upgrade backend dependencies
- #387: Upgrade frontend dependencies
- #386: Upgrade backend dependencies
- #376: Upgrade backend dependencies
- #377: Upgrade backend dependencies
- #383: Upgrade backend dependencies
- #384:
  - Fix an issue impacting JSON Gzipped S3 objects being written uncompressed after a deletion Job
  - Upgrade backend dependencies
- #369: Upgrade frontend dependencies
- #370: Upgrade frontend dependencies
- #371: Upgrade frontend dependencies
- #372: Upgrade backend dependencies
- #368: Fix an issue that prevented deployments for users who have multiple S3F2 instances deployed within the same AWS account, using the same ResourcePrefix parameter, but in different regions.
- #367: Fix deployment issue caused by deprecation of polling-based S3 sources in AWS CodePipeline
- #366: Upgrade backend and frontend dependencies
- #363: Bump cfn-lint version, update CloudFront to use OAC instead of OAI
- #362: Improvements for container image build process
- #360: Refactor of Web UI S3 bucket access control mechanisms
- #359: Fix UI Links to documentation
- #358: Fix for bug that caused failure when opening gzipped files due to pyarrow unzipping
- #348: Cost and performance improvement for Queue API
- #350: Upgrade build dependencies
- #351: Upgrade frontend dependencies
- #352: Upgrade frontend dependencies
- #355: Upgrade frontend and backend dependencies
- #343: Upgrade frontend dependencies
- #345: Improved error handling when processing invalid match IDs during the Query Planning phase
- #346: Upgrade build dependencies
- #347: Add settings to allocate AWS Lambda memory size for API handlers and Deletion Job tasks
- #342: Fix for an issue affecting jobs terminating with FORGET_PARTIALLY_FAILED despite correctly processing all the objects, caused by ECS running multi-processing in fork mode. Includes improved multi-processing handling, as well as enhanced logging that includes PIDs when running on ECS during the Forget phase
- #341: Upgrade build dependencies
- #340: Upgrade frontend dependencies
- #336: Add Mumbai (ap-south-1) deploy link
- #332:
  - Switch to Pyarrow's S3FileSystem to read from S3 on the Forget Phase
  - Switch to boto3's upload_fileobj to write to S3 on the Forget Phase
  - Upgrade backend dependencies
- #327: Improve performance of Athena results handler
- #329: Upgrade frontend dependencies
- #321: Upgrade numpy dependency
- #322: Upgrade to Python 3.9
- #314: Fix query generation step for Composite matches consisting of a single column
- #320: Fix deployment issue introduced in v0.48
- #316: Upgrade dependencies
- #313: Add option to choose IAM for authentication (in place of Cognito)
- #313: Add option to not deploy WebUI component. Cognito auth is required for WebUI
- #310: Improve performance of Athena query generation
- #308: Upgrade frontend dependencies and use npm workspaces to link frontend sub-project
- #306: Add retry behaviour for old object deletion to improve reliability against transient errors from Amazon S3
- #303: Improve performance of Athena query generation
- #301: Include table name in the error when query generation fails due to an invalid column type
- Dependency version updates for:
- #293: Upgrade dependencies
- #289: Upgrade frontend dependencies
- #287: Add data mapper parameter for ignoring Object Not Found exceptions encountered during deletion
- #285: Fix for a bug that caused a job to fail with a false positive `The object s3://<REDACTED> was processed successfully but no rows required deletion` when processing a job with queries running for more than 30m
- #286: Fix for a bug that causes `AthenaQueryMaxRetries` setting to be ignored
- #286: Make state machine more resilient to transient failures by adding retry
- #284: Improve performance of find query for data mappers with multiple column identifiers
- #283: Fix for a bug that caused a job to fail with `Runtime.ExitError` when processing a large queue of objects to be modified
- #281: Improve performance of query generation step for tables with many partitions
- #280: Improve performance for large queues of composite matches
- #279: Improve performance for large queues of simple matches and logging additions
- #278: Fix for a bug that caused a job to fail if the processing of an object took longer than the lifetime of its IAM temporary access credentials
- #276: First attempt for fixing a bug that causes the access token to expire and cause a Job to fail if processing of an object takes more than an hour
- #275: Upgrade JavaScript dependencies
- #274: Fix for a bug that causes deletion to fail in parquet files when a data mapper has multiple column identifiers
- #272: Introduce a retry mechanism when running Athena queries
- #271: Support for decimal type column identifiers in Parquet files
- #270: Fix for a bug affecting the front-end causing a 403 error when making a request to STS in the Data Mappers Page
- #266: Fix creating data mapper bug when glue table doesn't have partition keys
- #264: Upgrade frontend dependencies
- #263: Improve bucket policies
- #261: Upgrade frontend dependencies
- #260: Add Stockholm region
- #257: Introduce data mapper setting to specify the partition keys to be used when querying the data during the Find Phase
- #256: Upgrade backend dependencies
- #252: Upgrade frontend and backend dependencies
- #248: Fix for a bug affecting Deletion Jobs running for cross-account buckets
- #246: Upgrade build dependencies
This version introduces breaking changes to the API and Web UI. Please consult the migrating from <=v0.24 to v0.25 guide
- #239: Remove limit on queue size for individual jobs.
- #240: Add ECR API Endpoint and migrate to Fargate Platform version 1.4.0
- #238: Upgrade frontend dependencies
- #236: Export API Gateway URL + Deletion Queue Table Stream ARN from main CloudFormation Template
- #232: Fix for a bug affecting the Frontend not rendering the Data Mappers list when a Glue Table associated to a Data Mapper gets deleted
- #233: Add GET endpoint for specific data mapper
- #234: Performance improvements for the query generation phase
- #223: This release fixes an issue (#222) where new deployments of the solution could fail due to unavailability of a third-party dependency. Container base images are now retrieved and bundled with each release.
- #220: Fix for a bug affecting Parquet files with lower-cased column identifiers generating an `Apache Arrow processing error: 'Field "customerid" does not exist in table schema` exception during the Forget phase (for example `customerId` in parquet file being mapped to lower-case `customerid` in glue table)
- #216: Fix for a bug affecting Parquet files with complex data types as column identifier generating an `Apache Arrow processing error: Mix of struct and list types not yet supported` exception during the Forget phase
- #216: Fix for a bug affecting workgroups other than `primary` generating a permission error exception during the Find phase
- #215: Support for data registered with AWS Lake Formation
- #213: Fix for a bug causing a FIND_FAILED error related to a States.DataLimitExceeded exception triggered by the Step Functions Athena workflow when executing the SubmitQueryResults lambda
- #208: Fix bug preventing PUT DataMapper from editing an existing data mapper with the same location; fix front-end DataMapper creation to prevent editing an existing one.
- #207: Upgrade frontend dependencies
- #202: Fix a bug that was affecting Partitions with non-string types generating a `SYNTAX_ERROR: line x:y: '=' cannot be applied to integer, varchar(z)` exception during the Find Phase
- #203: Upgrade frontend dependencies
- #204: Improve performance during Cleanup Phase
- #205: Fix a UI issue affecting Firefox that prevented the correct queue size from being shown due to a missing CORS header
- #200: Add API Endpoint for adding deletion queue items in batch - deprecates PATCH /v1/queue
- #170: JSON support
- #193: Add support for datasets with Pandas indexes. Pandas indexes will be preserved if present.
- #194: Remove debugging code from Fargate task
- #195: Fix support for requester pays buckets
- #196: Upgrade backend dependencies
- #197: Fix duplicated query executions during Find Phase
This version introduces breaking changes to the CloudFormation templates. Please consult the migrating from <=v0.8 to v0.9 guide
- #185: Fix dead links to VPC info in docs
- #186: Fix an issue where the Forget phase container could crash when redacting numeric Match IDs from its logs
- #187: Dependency version updates for react-scripts
- #183: Dependency version updates for elliptic
- #173: Show column types and hierarchy in the front-end during Data Mapper creation
- #173: Add support for char, smallint, tinyint, double, float
- #174: Add support for types nested in struct
- #177: Reformat of Python source code (non-functional change)
- Dependency version updates for:
- #172: Fix for an issue where Make may not install the required Lambda layer dependencies, resulting in unusable builds.
- #171: Fix for a bug affecting the API for 5xx responses not returning the appropriate CORS headers
- #164: Fix for a bug affecting v0.2 deployment via CloudFormation
- #161: Fix for a bug affecting Parquet files with nullable values generating a `Table schema does not match schema used to create file` exception during the Forget phase
Initial Release