Create a script to automatically update the data #6

Open
2 tasks
anuveyatsu opened this issue Aug 31, 2023 · 5 comments

@anuveyatsu
Member

anuveyatsu commented Aug 31, 2023

Acceptance criteria

  • I have a script (preferably Python 3) that updates this dataset.
  • I have a GitHub Actions workflow that runs monthly and executes the script to update the data in this repo.

Analysis

Take a look at how we are doing it in similar repos:

@gradedSystem
Member

gradedSystem commented Sep 5, 2023

Hi there! @anuveyatsu, I am currently applying for a data engineering position at Datopian, and the application asks candidates to complete the task given in this thread. I have a couple of questions about it. Is there a requirement on how the data should be presented? That is, should we check the source (UNData, UNSD Demographic Statistics) for updates each month and, if the data has changed, simply replace the current data with the new data? And should we submit the task as a pull request?

@anuveyatsu
Member Author

anuveyatsu commented Sep 5, 2023

Hi @gradedSystem, thanks for your interest! Would it be OK if I assigned you another, similar issue, so that your effort adds a lot of value for us? You might have similar questions, so feel free to ask them there. Generally we welcome pull requests 👍🏼

Here is another issue I could assign to you: datasets/commodity-prices#4

@anuveyatsu anuveyatsu moved this to Ready to go in Data Curation Sep 7, 2023
@judeleonard

Hi @anuveyatsu, I am also working on this task currently. Is it something I should proceed with?

@anuveyatsu
Member Author

@judeleonard hi, I've just assigned it to you. Please take a look.

@judeleonard

Hi @anuveyatsu, I now have the script to get the data from the website and wrangle it into the desired state, but I'm not sure how you would want this update to happen.

Here are my thoughts:

(1) Overwrite the previous data to include the updated records
(2) Track where the updates happened and insert only the updated records.
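
To make the two options concrete, here is a minimal Python sketch. It is an illustration, not the script discussed in this thread: the function names (`overwrite`, `upsert`, `row_key`), the row representation (a list of dicts, as `csv.DictReader` would produce), and the choice of key columns are all assumptions.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, str]
Key = Tuple[str, ...]


def row_key(row: Row, key_fields: Iterable[str]) -> Key:
    """Build the identity of a row from its key columns (hypothetical helper)."""
    return tuple(row[field] for field in key_fields)


def overwrite(existing: List[Row], fresh: List[Row]) -> List[Row]:
    """Option (1): discard the stored rows and keep the freshly downloaded ones."""
    return list(fresh)


def upsert(
    existing: List[Row], fresh: List[Row], key_fields: List[str]
) -> Tuple[List[Row], List[Row]]:
    """Option (2): insert or replace only the rows that are new or changed.

    Returns the merged rows and the list of rows that were added or modified.
    Rows that exist only in `existing` are kept as they are.
    """
    # dicts preserve insertion order, so existing rows keep their position.
    merged = {row_key(row, key_fields): row for row in existing}
    changed: List[Row] = []
    for row in fresh:
        key = row_key(row, key_fields)
        if merged.get(key) != row:
            changed.append(row)
            merged[key] = row
    return list(merged.values()), changed
```

With this sketch, option (1) is simpler and also picks up rows that were removed or revised upstream, while git history still shows what changed between runs. Option (2) reports exactly which records were touched, but it silently keeps rows that the source has since dropped, so it would need an extra step if deletions matter.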

Labels
None yet
Projects
Status: Maintainer wanted
Development

No branches or pull requests

3 participants