Stub: Create a step to Hash Content in a workflow #2551

Open
olyveya opened this issue Dec 16, 2024 · 0 comments

Goal: Create a workflow step that hashes content to avoid duplicate records.

Description:
Create a Python script step in the workflow that hashes each chunk of text/content before it is converted into a node and sent to the graph.
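
A minimal sketch of what the hashing step could look like. The function name `hash_chunk`, the choice of SHA-256, and the normalization (Unicode NFC plus collapsed whitespace) are assumptions for illustration; the issue does not specify a hash function or whether content should be normalized first.

```python
import hashlib
import unicodedata


def hash_chunk(text: str) -> str:
    """Return a SHA-256 hex digest for one chunk of text (hypothetical helper)."""
    # Assumption: normalize to NFC and collapse whitespace so that copies
    # differing only in spacing or line breaks produce the same hash.
    normalized = " ".join(unicodedata.normalize("NFC", text).split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(hash_chunk("An example paragraph of content."))
```

Normalizing before hashing is a design choice: without it, the same paragraph with a trailing newline would be treated as new content.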

Requirements:

Check that all content sent to the knowledge graph is hashed using a hash function prior to node creation.

If the hash already exists in the graph, the content is not added and the workflow ends.
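
The two requirements could be sketched as a filter that runs before node creation. Everything here is hypothetical: `content_hash`, `filter_new_chunks`, and the `existing_hashes` set stand in for whatever the workflow and graph actually expose, and SHA-256 is an assumed hash function.

```python
import hashlib
from typing import Iterable, List, Set, Tuple


def content_hash(text: str) -> str:
    """Assumed hash function: SHA-256 over the UTF-8 bytes of the text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def filter_new_chunks(
    chunks: Iterable[str], existing_hashes: Set[str]
) -> List[Tuple[str, str]]:
    """Return (hash, chunk) pairs whose hash is not already known.

    `existing_hashes` is a stand-in for the hashes already stored in the graph.
    """
    seen = set(existing_hashes)  # copy so the caller's set is not mutated
    new_chunks = []
    for chunk in chunks:
        digest = content_hash(chunk)
        if digest in seen:
            # Hash already exists: skip this chunk, no node is created.
            continue
        seen.add(digest)  # also catches duplicates within the same batch
        new_chunks.append((digest, chunk))
    return new_chunks
```

Only the returned pairs would continue to node creation; an empty result means the workflow can end without writing to the graph.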

Open Question:
How do we retrieve existing content from the graph and compare hash outputs (hash strings) so that we don't create a duplicate record in the graph?

Do we store old content for version control?
If the hash is the same for the old and new content, create an edge between "new_use" and "old_paragraph".
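
One way the edge idea could work, sketched against an in-memory stand-in for the graph. The dictionary layout, the `ingest` function, the `paragraph:` ID scheme, and the `REUSES` edge label are all hypothetical; a real implementation would depend on the graph database in use, which the issue does not name.

```python
import hashlib


def content_hash(text: str) -> str:
    """Assumed hash function: SHA-256 over the UTF-8 bytes of the text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def ingest(graph: dict, use_id: str, text: str) -> str:
    """Add `text` as a paragraph node, or link `use_id` to the existing one.

    `graph` is a plain dict with "nodes", "hash_index", and "edges" keys,
    standing in for a real graph store.
    """
    digest = content_hash(text)
    old_paragraph = graph["hash_index"].get(digest)
    if old_paragraph is not None:
        # Same hash as existing content: no new node, only an edge
        # from the new use to the old paragraph.
        graph["edges"].append((use_id, "REUSES", old_paragraph))
        return old_paragraph
    # New content: create the node and index it by hash for later lookups.
    paragraph_id = f"paragraph:{digest[:12]}"
    graph["nodes"][paragraph_id] = {"hash": digest, "text": text}
    graph["hash_index"][digest] = paragraph_id
    return paragraph_id
```

Keeping a hash-to-node index means a lookup only compares hash strings and never has to pull the stored content itself.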

olyveya changed the title from "Stub: Hash Content Before Creating Node in Workflow" to "Stub: Create a step to Hash Content in a workflow" on Dec 18, 2024