Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

4273 – Refactor API Seeds #1798

Merged
merged 48 commits into from
Mar 1, 2024
Merged

4273 – Refactor API Seeds #1798

merged 48 commits into from
Mar 1, 2024

Conversation

vasconsaurus
Copy link
Contributor

@vasconsaurus vasconsaurus commented Feb 9, 2024

First

Apologies as this got way too big, I probably should have broken it up by action (creating relationships, tipline requests etc)

Description

Our old seeds script was built to create items for one user, once the need to have more users appeared there wasn't an easy way to update it.

This file separates everything into two objects: setup and populated_workspaces.

[1] Setup

  • creates the users, teams and team_users
    • 3 users
    • 4 teams
  • t takes the approach of looking into creating those instances as configuration, so it's more verbose, but I think it's easier to understand what is being created.

[2] Populated Workspaces

  • we initialize a new populated workspace with the value from Setup
  • then with populate_projects we use nested_attributes to create projects, project medias, claim descriptions and fact checks
  • after that we have other methods we can call on our instance to run actions to change the state in our workspace:
    • populated_workspaces.saved_searches
    • populated_workspaces.main_user_feed
    • populated_workspaces.share_feed
    • populated_workspaces.confirm_relationships
    • populated_workspaces.suggest_relationships
    • populated_workspaces.tipline_requests
    • populated_workspaces.publish_fact_checks
    • populated_workspaces.clusters

References: 4273, 4174

How has this been tested?

By running the seeds to create a new user and add a new user.

Notes

  • This script is actually faster than the old one, but it creates more users and teams than before, so it can be quite slow.
    • I think with more people testing this we can better understand a balance between creating all the things vs time.
    • Right now it creates items from all media types for the main user, but skips creating links for the invited users (this drops the time it takes to run from 10 minutes to 6)
    • For now there are ways to edit it to create less:
      • Users: To create only one user, comment lines to skip adding invited users
        • 77 (Setup.users.all_users.merge!(invited_users))
        • 87 (Setup.teams.all_teams.merge!(invited_teams))
      • Items: Play around with how many items get created, there are different ways to do this:
        • you can edit MEDIAS_PARAMS, comment out type of medias you do not want to create
        • you can edit what gets passed to each project in populate_projects, look for project_media_attributes inside project_params (there is one for projects_params_main_user_a and project_params_invited_users)
      • Hopefully we can make all this more dynamic for the user, so they can just input what they want to be created
    • Be mindful of sidekiq queues, a lot get created
      • keep track here: http://localhost:3000/sidekiq
      • if for whatever reason you want to run it again right away, you should probably clear the queues (they add up). In the rails console run Sidekiq::Queue.all.map(&:clear)
  • It does not add imported fact checks yet

I tried to summarize what I thought was important, if there are any questions please let me know.
I know it's a lot, so I'm sure there is a bunch of confusing stuff in there.

db/seeds.rb Fixed Show fixed Hide fixed
db/seeds.rb Fixed Show fixed Hide fixed
@vasconsaurus vasconsaurus force-pushed the 4273-refactor-api-seeds branch from bae90ba to 24e4485 Compare February 12, 2024 18:00
db/seeds.rb Fixed Show fixed Hide fixed
db/seeds.rb Fixed Show fixed Hide fixed
db/seeds.rb Dismissed Show dismissed Hide dismissed
db/seeds.rb Dismissed Show dismissed Hide dismissed
@vasconsaurus vasconsaurus force-pushed the 4273-refactor-api-seeds branch 2 times, most recently from b2784c9 to 057c76e Compare February 19, 2024 14:18
@vasconsaurus vasconsaurus force-pushed the 4273-refactor-api-seeds branch from b767bb8 to 9dff7fd Compare February 26, 2024 21:38
I was creating media and project media again,  and I don't think that was
necessary
This shouldn't create any issues, and will make creating objects easier
We were hoping to create everything when creating projects using
nested team_attributes.

Unfortunately we can't create everything nested at once, because:
  - team needs to exist for project media
  - team user needs to exist because of a callback

Still, re-structuring things has helped:
   - making it easier to see what each object needs to be created and
   how they relate
   - making it more flexible to create how many users and full workspaces
   we need

There is still a lot missing, but I think it makes sense to rethink
how we create things, and this seems like a good direction.
I think this can be a private method of this class,
also changed the method name from medias_params to get_medias_params,
though I'm not too sure about it
Could use attr_reader or access the instance variable using @
But I liked this post:
https://ivoanjo.me/blog/2017/09/20/why-i-always-use-attr_reader-to-access-instance-variables/

And confirmed with Caio this was a good way to go.
I think the setup shouldn't get this info and print it
it should just get it, and we do whatever we wnat with it
Nothing serious, but it was bugging me
We need to call get_medias_params each time, because the urls need
to be different for all users
When setting the params to create project medias we are now
also setting the claim_description_attributes.
To do this I had to add the association and accepts_nested_attributes_for
in project_media.rb
The way we dealt with this before, is: we would rescue link errors,
print a message to the user and move forward with our script creating
other media.

I'm not sure if that will work while crating with nested attributes.
Even if I rescue it seems to fail adding the project_medias, though it
does create the project.

So for now, I'm printing an useful message and not showing the user
information if it errors.

But I'll look further into this.
So we can create fact checks when creating claims.
Since we don't want to create fact checks for all claims,
I added reject_if, when we send a blank summary it does
not create a fact check.
just to make what is being passed clear
Because the interface looks different, we create 2 feeds:
- feed_1 shares Published Fact-Checks
- feed_2 shares both Published Fact-Checks and Media, Claim & Requests,
and has clusters
I think we can make less of each type of item:
- we will still have a bunch in total
- it will be faster (one hopes)
@vasconsaurus vasconsaurus force-pushed the 4273-refactor-api-seeds branch from 9dff7fd to 094166e Compare February 28, 2024 17:59
The other type of medias can remain as constants:
- makes it easier if we only want to create items from CLAIMS_PARAMS
instead of calling medias_params
- the links need to be created each time for each user so that we get
different timestamps

I also added a items_total method, so it can check that for that, instead
of hardcoding a number
The minimun amount of items we need to create is 8, so each type of media
should create that (so we can run only one of them if needed)

for relationships and tiplines I tried to make it more dynamic:
- so it doesn't break with the minimum amount
- if we create everything, it creates more of those
I'm trying to play around with ways to make this faster
- using constants
- creating the links only for the main user
I don't like this yet, but the time did drop from 10 minutes to 6,
so I'm leaving it like this for now

I added a conditional in cluster_teams to make sure it does not break
when we don't create invited teams
@vasconsaurus vasconsaurus changed the title WIP: 4273 – Refactor API Seeds 4273 – Refactor API Seeds Feb 29, 2024
@vasconsaurus vasconsaurus marked this pull request as ready for review February 29, 2024 17:19
@brianfleming
Copy link
Contributor

FYI @vasconsaurus I pushed up a merge with develop to try to clear up those bundle message we chatted about

@brianfleming
Copy link
Contributor

love it @vasconsaurus :

—————
Stretch your legs, this might take a while.
On a mac took about 10 minutes to create all populated workspaces.
...

Copy link
Contributor

@brianfleming brianfleming left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ran without errors for me, poking around the generated workspaces. Will let the experts comment on the code

Copy link

codeclimate bot commented Feb 29, 2024

Code Climate has analyzed commit ceb4e7c and detected 0 issues on this pull request.

The test coverage on the diff in this pull request is 100.0% (100% is the threshold).

This pull request will bring the total coverage in the repository to 100.0% (0.0% change).

View more on Code Climate.

@brianfleming
Copy link
Contributor

Just so I understand the expectations of the data @vasconsaurus
are the counts media, requests, fact-checks just random numbers? Meaning they are not back by actual data correct? I'm guessing thats the case since the number of fact-checks matches the other number, which isn't realistic.

Screenshot 2024-02-29 at 4 07 57 PM

@caiosba
Copy link
Contributor

caiosba commented Feb 29, 2024

That's correct, @brianfleming - for now those are just random numbers because we currently only list the cards. In the next sprint, when we have the item page, the numbers will reflect real data.

Copy link
Contributor

@caiosba caiosba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @vasconsaurus , the code is way more clear, great refactoring!

We don't need to keep this file, right? -> db/seeds_backup.rb? Can we just delete it?

@vasconsaurus
Copy link
Contributor Author

Thanks @vasconsaurus , the code is way more clear, great refactoring!

We don't need to keep this file, right? -> db/seeds_backup.rb? Can we just delete it?

yes! will delete it and merge this

@vasconsaurus vasconsaurus force-pushed the 4273-refactor-api-seeds branch from ceb4e7c to f681d03 Compare March 1, 2024 11:11
@vasconsaurus vasconsaurus merged commit 6265cf5 into develop Mar 1, 2024
7 of 8 checks passed
@vasconsaurus vasconsaurus deleted the 4273-refactor-api-seeds branch March 1, 2024 11:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants