On the test instance I noticed an ever-increasing use of disk storage for the original, raw documents.
While it needs further investigation, I assume this is partly due to re-ingesting documents with the same ID: the existing state is overwritten, but the duplicates stay in the storage backend. Currently the strategy there is that we don't clean up.
The question is: should we change this strategy? One opinion I read about this, in the context of Mastodon IIRC, was that storage is cheaper than the API calls needed to clean it up. On the other hand, constantly increasing storage (and cost) just for being lazy also feels wrong.
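To make the two options concrete, here is a minimal sketch of how they differ on re-ingest. Everything in it is an assumption for illustration, not how our storage backend actually works: the `DocumentStore` class, its `cleanup` flag, the in-memory dictionaries, and keying blobs by their SHA-256 digest are all hypothetical.

```python
import hashlib


class DocumentStore:
    """Hypothetical in-memory stand-in for the index plus the storage backend."""

    def __init__(self, cleanup: bool) -> None:
        self.cleanup = cleanup
        self.index: dict[str, str] = {}    # document ID -> key of the current blob
        self.blobs: dict[str, bytes] = {}  # blob key -> raw document

    def ingest(self, doc_id: str, raw: bytes) -> str:
        # Assumption: blobs are keyed by content digest.
        key = hashlib.sha256(raw).hexdigest()
        self.blobs[key] = raw

        previous = self.index.get(doc_id)
        self.index[doc_id] = key  # overwrite the existing state for this ID

        if self.cleanup and previous is not None and previous != key:
            # Delete the old blob only if no other document ID still uses it.
            if previous not in self.index.values():
                del self.blobs[previous]
        return key


# Current strategy: no cleanup, so the old raw document stays behind.
keep = DocumentStore(cleanup=False)
keep.ingest("doc-1", b"version 1")
keep.ingest("doc-1", b"version 2")

# Alternative: remove the superseded raw document on re-ingest.
tidy = DocumentStore(cleanup=True)
tidy.ingest("doc-1", b"version 1")
tidy.ingest("doc-1", b"version 2")
```

In this sketch the cleanup costs one extra lookup and at most one delete per re-ingest. Against a real storage backend that delete would be an additional API call, which is exactly the trade-off in question.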