Forces encoding of XML to UTF-8 prior to converting to JSON
Fixes #2894.
afred committed Dec 18, 2024
1 parent bfdff51 commit cf67d36
Showing 1 changed file with 9 additions and 0 deletions.
9 changes: 9 additions & 0 deletions app/controllers/api_controller.rb
Expand Up @@ -55,6 +55,15 @@ def show
# escape double quotes (because they may appear in node values)
xml = xml.gsub(%(\"), %(\\\"))

# Non-ASCII characters that are valid UTF-8 can raise an error during
# the translation process. Since we want UTF-8 in and out, forcing the
# encoding to UTF-8 here should alleviate cases where a multibyte char
# is not recognized. However, if non-UTF-8 encodings are being used,
# this may still error, and we would need to re-open the discussion
# about how/whether to support other encodings, which would presumably
# have to be stored in and read from the PBCore documents themselves.
xml.force_encoding('UTF-8')

json = pbcore_xml_to_json_xsl_doc.transform(Nokogiri::XML(xml))
render json: JSON.pretty_generate(
JSON.parse(json)
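
The comment in this change notes that forcing the encoding helps only when the bytes really are UTF-8. A minimal sketch of that distinction follows, under stated assumptions: `utf8_or_nil` is a hypothetical helper that is not part of this commit, and ISO-8859-1 stands in for "some other encoding" purely as an example. `String#force_encoding` only relabels the existing bytes, so non-UTF-8 input stays invalid until it is actually transcoded with `String#encode`.

```ruby
# Hypothetical helper (not in the commit): relabel bytes as UTF-8 and
# report whether the result is actually valid UTF-8.
def utf8_or_nil(raw)
  # dup first, because force_encoding changes the receiver in place
  text = raw.dup.force_encoding(Encoding::UTF_8)
  text.valid_encoding? ? text : nil
end

# Valid UTF-8 bytes that arrived tagged as binary (ASCII-8BIT).
binary = "caf\xC3\xA9".b
# The same word as ISO-8859-1 bytes: a single 0xE9 byte, invalid as UTF-8.
latin1 = "caf\xE9".b

relabelled = utf8_or_nil(binary) # bytes unchanged, now tagged UTF-8
rejected   = utf8_or_nil(latin1) # nil: relabelling cannot repair these bytes

# Supporting another encoding means transcoding from a known source encoding.
transcoded = latin1.dup.force_encoding(Encoding::ISO_8859_1)
                   .encode(Encoding::UTF_8)
```

In this sketch the source encoding has to be known up front, which is the open question the comment raises about storing it in the PBCore documents.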
