Parse raw model output to structured JSON object #19

romansinkus · 2024-12-06T08:01:52Z

Create hard-coded keys file for the example log book
Use hard-coded keys to parse raw model output from Florence model to create a structured JSON object

Note:

The parsing accuracy is still quite low. We will likely need to transition to a more general (not OCR-specific) model to parse the model output into a JSON object accurately enough to be usable on the front end.
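For readers without the diff at hand, key-based parsing of this kind could look roughly like the sketch below. It is not the code in this PR: `parse_with_keys`, the sample text, and the key names are all hypothetical, and it assumes the keys are available as a flat list of strings (the real `keys.json` may be structured differently). Each key that appears in the text is located, and its value is taken as the text up to the next key that was found.

```python
import json


def parse_with_keys(text, keys):
    """Hypothetical sketch: split raw text into values using known keys."""
    lowered = text.lower()

    # Record where each key first occurs in the text (case-insensitive).
    hits = []
    for key in keys:
        pos = lowered.find(key.lower())
        if pos != -1:
            hits.append((pos, key))
    hits.sort()

    # Keys that never appear in the text keep a value of None.
    result = {key: None for key in keys}
    for i, (pos, key) in enumerate(hits):
        start = pos + len(key)
        # The value runs until the next located key, or the end of the text.
        end = hits[i + 1][0] if i + 1 < len(hits) else len(text)
        result[key] = text[start:end].strip(" :\n\t") or None
    return result


# Made-up sample input and keys, for illustration only.
sample = "Date: 2024-12-06 Location: Site A Observer: R. S."
keys = ["Date", "Location", "Observer", "Weather"]
print(json.dumps(parse_with_keys(sample, keys), indent=2))
```

A sketch like this also shows why low transcription accuracy hurts: if the model misspells a key, the key is never found and its text is absorbed into the previous field or dropped.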

TonyLiu0226

Good work so far on parsing the model output to keys! As a temporary measure for the demo, please add the full generated text to the response if possible. After parsing to keys, a lot of the transcribed text ends up missing (likely due to model performance, as mentioned).

TonyLiu0226 · 2024-12-07T01:22:56Z

transcription/app.py

@@ -16,6 +18,7 @@

 @app.route("/api/transcribe", methods=["POST"])
 def transcribe():
+    print("START OF ENDPOINT")


remove this print

TonyLiu0226 · 2024-12-07T01:34:09Z

transcription/keys.json

@@ -0,0 +1,18 @@
+{


This is okay for now, since this model's accuracy is low and we will also need to consider other database templates. Eventually, though, we will need a more comprehensive list of keys covering all subheadings in the log template.

TonyLiu0226 · 2024-12-07T01:44:16Z

transcription/app.py

-        return jsonify({"transcription": generated_text})
+        keys = load_keys("keys.json")
+        json_result = parse_florence_output(generated_text, keys)
+        return json_result


Due to the low model accuracy, could you please also include the full generated_text in the API's response for demo purposes? One option would be to return a jsonify or json.dumps result that combines json_result with generated_text.
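A minimal sketch of that suggestion, under a few assumptions: `build_response` and the `"parsed"` field name are hypothetical, and since the return type of `parse_florence_output` is not visible in this hunk, the helper accepts either a JSON string or a dict.

```python
import json


def build_response(generated_text, json_result):
    """Combine the raw transcription with the parsed fields in one payload."""
    # Assumption: json_result may be a JSON string or an already-parsed dict.
    parsed = json.loads(json_result) if isinstance(json_result, str) else json_result
    return {
        "transcription": generated_text,  # same key the endpoint returned before
        "parsed": parsed,                 # hypothetical field name
    }


# Made-up values, for illustration only.
payload = build_response("Date: 2024-12-06 Weather: clear", {"Date": "2024-12-06"})
print(json.dumps(payload, indent=2))
```

In the endpoint this would become something like `return jsonify(build_response(generated_text, json_result))`, which keeps the existing `"transcription"` key so the front end can still show the full text.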

Parse model output using pre-determined keys.

b68912c

romansinkus requested a review from TonyLiu0226 December 6, 2024 08:01

TonyLiu0226 requested changes Dec 7, 2024
