Replies: 9 comments 2 replies
-
Hi @mobab-th, thank you very much for contributing to the project again! I think it would be nice to override the language:detected_1 metadata with the first detected language, since our current language detection engine is old. Just one question: why did you choose to use the LibreTranslate API as a service instead of integrating Argos (the actual implementation library) directly? Does it benefit from special hardware such as a GPU?
-
Hi @lfcnassif,
I haven't checked: is IPED using Tika for language detection?
With this implementation you can use a separate translation engine, and no additional dependencies or hardware (e.g. a GPU) are needed on the processing machine.
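To illustrate the client-server split described here, below is a minimal sketch of a client talking to a separately hosted LibreTranslate instance over HTTP. It is not the code from the actual script: the helper names and the `http://localhost:5000` address are hypothetical, and it assumes the server exposes the `/translate` endpoint and, when `source` is `auto`, returns a `detectedLanguage` object next to `translatedText`.

```python
import json
from urllib import request


def build_translate_request(base_url, text, target, source="auto", api_key=None):
    """Build a POST request for the /translate endpoint (hypothetical helper)."""
    payload = {"q": text, "source": source, "target": target, "format": "text"}
    if api_key:
        # Only needed if the server is configured to require API keys.
        payload["api_key"] = api_key
    return request.Request(
        base_url.rstrip("/") + "/translate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_translate_response(body):
    """Return (translated_text, detected_language_or_None) from a JSON reply."""
    data = json.loads(body)
    detected = data.get("detectedLanguage") or {}
    return data["translatedText"], detected.get("language")


def translate(base_url, text, target, timeout=30):
    """Send one text to the remote service; no GPU or models needed locally."""
    req = build_translate_request(base_url, text, target)
    with request.urlopen(req, timeout=timeout) as resp:
        return parse_translate_response(resp.read().decode("utf-8"))
```

With a server running, `translate("http://localhost:5000", "Guten Tag", "en")` would return the translation together with the detected source language, which could then be used to override the language:detected_1 metadata.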
-
Yes, but an outdated version.
I just saw that LibreTranslate can use a GPU for faster processing, great! Do you know if it supports an engine other than Argos?
-
Just an idea:
LibreTranslate uses ArgosTranslate, and ArgosTranslate can also use a GPU. From the documentation: To enable GPU support, you need to set the ARGOS_DEVICE_TYPE env variable to cuda or auto.
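As a small sketch of that setting, the variable can be set from Python before the translation library is imported or the service is started. The helper name is hypothetical, and accepting `cpu` as a third value is an assumption; only `cuda` and `auto` are named in the quoted documentation.

```python
import os

# "cuda" and "auto" come from the quoted documentation; "cpu" is an assumption.
VALID_DEVICE_TYPES = {"cpu", "cuda", "auto"}


def configure_argos_device(device_type="auto", env=os.environ):
    """Set ARGOS_DEVICE_TYPE in the given environment mapping.

    Call this before importing or launching anything that reads the
    variable, otherwise the setting may be picked up too late.
    """
    if device_type not in VALID_DEVICE_TYPES:
        raise ValueError(f"unsupported device type: {device_type!r}")
    env["ARGOS_DEVICE_TYPE"] = device_type
    return env["ARGOS_DEVICE_TYPE"]
```

On the shell side the equivalent is exporting the variable before starting the service.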
-
First version of the translation script: https://github.com/mobab-th/IPED-scripts
-
Thank you @mobab-th! I'll try to take a look this week.
For fast modules, like language or file type detection, I don't think it would help. Since we generally seize dozens or hundreds of pieces of evidence in the same investigation, we distribute processing across different machines at the evidence level, using an IPED automation and distribution service that watches a network share for new evidence and delegates IPED processing to an idle processing node. But heavy tasks, like transcription, or those that need special hardware, like GPUs, can certainly benefit from a client-server processing approach.
-
I just saw that LibreTranslate can do batch processing, which can improve translation speed a lot when using GPUs with enough memory. I also took a preliminary look at the script, thanks @mobab-th. A few comments:
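A rough sketch of how batching could look on the client side, assuming the `/translate` endpoint accepts a list of strings in `q` and returns a list of the same length in `translatedText` (worth verifying against the deployed LibreTranslate version). The function names, the default URL and the batch size are hypothetical.

```python
import json
from urllib import request


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def post_json(url, payload, timeout=60):
    """POST a JSON payload and return the decoded JSON reply."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def translate_batch(texts, target, base_url="http://localhost:5000",
                    source="auto", batch_size=16, post=post_json):
    """Translate many texts with one request per batch instead of per text."""
    url = base_url.rstrip("/") + "/translate"
    results = []
    for batch in chunked(texts, batch_size):
        reply = post(url, {"q": batch, "source": source,
                           "target": target, "format": "text"})
        translated = reply["translatedText"]
        if len(translated) != len(batch):
            raise RuntimeError("server returned an unexpected number of translations")
        results.extend(translated)
    return results
```

The batch size would need tuning against the GPU memory available on the server; the `post` parameter only exists so the HTTP call can be swapped out in tests.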
-
Luis Filipe Nassif ***@***.***> wrote on Sun, Apr 21, 2024, 20:23:
> Just saw LibreTranslate can do batch processing, it can improve translation speed a lot when using GPUs with enough memory.
> I also took a preliminary look at the script, thanks @mobab-th. A few comments:
> - I liked the subitems approach to store translated texts. I thought about appending the translated text to the original text, replacing it (not good from a forensic perspective) or adding it as a new Metadata property, maybe this last one would be an interesting alternative, since search hits will bring the translated text and the original text/item at the same time.

Saving the translation as metadata was my first approach. However, the readability is very poor. Maybe we can do both, at the expense of disk space requirements. It's only one more line of code.

> - if user doesn't setup the translation service, maybe we can start a local one automatically, this would make non technical users life easier.

Then the complete environment for LibreTranslate would have to be delivered, including the language models, or the language models would have to be downloaded at the first start.
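For the auto-start idea, one possible shape is to probe the configured port and only launch a local instance when nothing answers. This is a sketch under assumptions: it presumes a `libretranslate` executable is installed and on the PATH and that it accepts `--host` and `--port`; the function names are hypothetical, and the language models would still have to be present or downloaded on first start, as noted above.

```python
import socket
import subprocess


def is_port_open(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def ensure_local_service(host="127.0.0.1", port=5000, launcher=subprocess.Popen):
    """Start a local LibreTranslate process unless one is already listening.

    Returns None if a service was already reachable, otherwise whatever
    the launcher returns (a Popen handle by default).
    """
    if is_port_open(host, port):
        return None
    # Assumed CLI invocation; adjust to how LibreTranslate is installed.
    return launcher(["libretranslate", "--host", host, "--port", str(port)])
```

A caller would still need to wait until the newly started service is ready before sending the first request.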
…
-
@mobab-th could you create a PR with your translation feature proposal? We have other PRs to review first, but this way it will at least be in our review queue and can be scheduled properly.
-
Hello,
I am currently working on text translation via the LibreTranslate API with a Python script.
The user can configure the following:
Any other ideas?