Segmentation problem for the Alagoas municipalities association in the 14/12/23 gazette #67

Open
ogecece opened this issue Dec 15, 2023 · 0 comments
Labels: bug (Something isn't working), good first issue (Good for newcomers)

ogecece commented Dec 15, 2023

Error logs:

Dec 14 22:04:22: WARNING:root:Could not process gazette: 2700000/2023-12-14/ab3183d921f7bb119a6a253e6dc59dc6fb07a367.pdf. Cause: 'Couldn\'t find info for "albarrasaomiguelal"'
Dec 14 22:04:22: ERROR:root:'Couldn\'t find info for "albarrasaomiguelal"'
Dec 14 22:04:22: Traceback (most recent call last):
Dec 14 22:04:22:   File "/mnt/code/tasks/gazette_text_extraction.py", line 32, in extract_text_from_gazettes
Dec 14 22:04:22:     document_ids = try_process_gazette_file(
Dec 14 22:04:22:   File "/mnt/code/tasks/gazette_text_extraction.py", line 69, in try_process_gazette_file
Dec 14 22:04:22:     territory_segments = segmenter.get_gazette_segments(gazette)
Dec 14 22:04:22:   File "/mnt/code/segmentation/segmenters/al_associacao_municipios.py", line 24, in get_gazette_segments
Dec 14 22:04:22:     gazette_segments = [
Dec 14 22:04:22:   File "/mnt/code/segmentation/segmenters/al_associacao_municipios.py", line 25, in <listcomp>
Dec 14 22:04:22:     self.build_segment(territory_slug, segment_text, gazette).__dict__
Dec 14 22:04:22:   File "/mnt/code/segmentation/segmenters/al_associacao_municipios.py", line 65, in build_segment
Dec 14 22:04:22:     territory_data = get_territory_data(territory_slug, self.territories)
Dec 14 22:04:22:   File "/mnt/code/tasks/utils/territories.py", line 28, in get_territory_data
Dec 14 22:04:22:     raise KeyError(f"Couldn't find info for \"{territory_slug}\"")
Dec 14 22:04:22: KeyError: 'Couldn\'t find info for "albarrasaomiguelal"'

It would probably be enough to change the segmenter's _normalize_territory_name() to handle this case:

[image: screenshot attached to the original issue]
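Since the screenshot is not available here, the following is only a minimal sketch of the kind of fix suggested above, not the segmenter's actual code. Everything in it is an assumption: the standalone `normalize_territory_name()` and `slugify()` helpers, the `NAME_FIXES` table, and in particular the guess that the gazette header read something like "BARRA SÃO MIGUEL AL" (which would slugify to the `albarrasaomiguelal` seen in the log) while the territory data expects "Barra de São Miguel".

```python
import re
import unicodedata

# Hypothetical alias table: header spellings mapped to the canonical
# municipality name. The key below is a guess at the variant that produced
# the slug "albarrasaomiguelal"; confirm it against the actual gazette text.
NAME_FIXES = {
    "BARRA SAO MIGUEL AL": "BARRA DE SAO MIGUEL",
}


def normalize_territory_name(raw_name: str) -> str:
    """Strip accents, collapse whitespace, upper-case, then apply known fixes."""
    decomposed = unicodedata.normalize("NFKD", raw_name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    clean_name = re.sub(r"\s+", " ", without_accents).strip().upper()
    return NAME_FIXES.get(clean_name, clean_name)


def slugify(state_code: str, territory_name: str) -> str:
    """Assumed slug format, inferred from the log: state code + letters only."""
    return state_code.lower() + re.sub(r"[^a-z]", "", territory_name.lower())


if __name__ == "__main__":
    raw = "Barra São Miguel AL"  # assumed header text, not taken from the gazette
    print(slugify("al", normalize_territory_name(raw)))
```

Keeping the special cases in a lookup table means later mismatches of the same kind can be handled by adding one entry, without touching the normalization logic.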
