If a template file contains a single non-ASCII character (e.g. "ë"), the converted output might contain a wrongly decoded character (e.g. "Ã«", which is how the UTF-8 bytes of "ë" read when decoded as Latin-1). After doing some research I found that this issue is strongly connected to this PyPugJS issue.
The problem is with the `chardet` package. By scanning a file, it makes a guess at the file's encoding. However, if there is only a single non-ASCII character in the file, the wrong encoding might be detected (see this and this issue). A solution to this problem was proposed, but it never got accepted and is now out of date. The discussion in that last referenced pull request contains a quick and dirty patch, which resolved the issue for me.
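To illustrate the failure mode, here is a minimal sketch that does not call `chardet` at all. It assumes the detector guessed a Latin-1-style encoding for a file that was actually saved as UTF-8; the sample string is made up for the example.

```python
# A template line with exactly one non-ASCII character (hypothetical sample).
text = "p Citroën"

# What ends up on disk when the file is saved as UTF-8.
raw = text.encode("utf-8")

# Decoding with the right encoding round-trips cleanly.
correct = raw.decode("utf-8")

# Decoding with a wrongly guessed single-byte encoding (assumed here to be
# Latin-1) splits the two UTF-8 bytes of "ë" into two separate characters.
wrong = raw.decode("latin-1")

print(correct)  # p Citroën
print(wrong)    # p CitroÃ«n
```

With only two non-ASCII bytes in the whole file, a statistical detector has very little evidence to separate UTF-8 from a single-byte encoding, which is consistent with the behaviour described above.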
I am, however, not satisfied with this kind of solution. Users should not have to go through the research I went through to resolve a strange bug such as this one. Even mentioning the hotfix in the PyPugJS documentation seems like the wrong way to go. The problem is that this package now forces users to rely on an unreliable package.
My proposal is to change the `open` method in `pypugjs/runtime.py` introduced in PR #27 to use a global setting which the user can use to force their preferred encoding. The default value would be `auto`, which uses `chardet`. Other values can be any strings, as long as they are valid names of encodings.
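A rough sketch of what I have in mind follows. The names `TEMPLATE_ENCODING`, `resolve_encoding` and `read_template` are hypothetical and only illustrate the idea; this is not the current code in `pypugjs/runtime.py`. The only `chardet` call assumed is `chardet.detect`, which returns a dict with an `"encoding"` key.

```python
import codecs

# Hypothetical global setting: "auto" keeps the chardet-based guess,
# any other value must be a valid encoding name such as "utf-8".
TEMPLATE_ENCODING = "auto"


def resolve_encoding(raw, setting):
    """Return the encoding name to use for decoding the bytes in `raw`."""
    if setting != "auto":
        # codecs.lookup raises LookupError for unknown encoding names,
        # so a typo in the setting fails loudly instead of silently.
        codecs.lookup(setting)
        return setting
    # Imported lazily so chardet is only required in "auto" mode.
    import chardet
    guess = chardet.detect(raw)["encoding"]
    # detect() may report None; falling back to UTF-8 is an assumption.
    return guess or "utf-8"


def read_template(path, setting=None):
    """Read a template file and decode it according to the setting."""
    if setting is None:
        setting = TEMPLATE_ENCODING
    with open(path, "rb") as fh:
        raw = fh.read()
    return raw.decode(resolve_encoding(raw, setting))
```

A user who knows their templates are UTF-8 could then set `TEMPLATE_ENCODING = "utf-8"` once (or call `read_template(path, "utf-8")`) and never hit the detection step.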
I would love to do this work myself, but I am on a tight deadline for a job, and I might have neither the time nor the urgency to resolve the issue once I'm done with that job. Hopefully somebody else can pick up the slack. Many thanks!