What steps will reproduce the problem?
1. Save a file in UTF-8 without BOM
2. Try to detect Character Encoding.
What is the expected output? What do you see instead?
I expect to see UTF-8 from the #getDetectedCharset() method. Instead I get null.
What version of the product are you using? On what operating system?
I am using juniversalchardet-1.0.3.jar on a Windows 7 system.
Please provide any additional information below.
When I use UTF-8 with BOM I can detect the file just fine, but Java's UTF-8
decoder does not strip the BOM, so I get an unwanted character (U+FEFF) at the
beginning of the file. Therefore I have been using UTF-8 without BOM.
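For reference, the stray leading character can also be dropped after decoding, which would allow keeping the BOM for detection. The following is only a minimal sketch using the JDK alone; the class and method names (`BomStripper`, `decodeUtf8WithoutBom`) are made up for illustration and are not part of juniversalchardet.

```java
import java.nio.charset.StandardCharsets;

public class BomStripper {
    // U+FEFF is the character that the UTF-8 BOM bytes EF BB BF decode to.
    private static final char BOM = '\uFEFF';

    /** Decodes the bytes as UTF-8 and drops a single leading BOM, if present. */
    public static String decodeUtf8WithoutBom(byte[] bytes) {
        String text = new String(bytes, StandardCharsets.UTF_8);
        if (!text.isEmpty() && text.charAt(0) == BOM) {
            return text.substring(1);
        }
        return text;
    }

    public static void main(String[] args) {
        // Hypothetical sample input: BOM followed by the ASCII text "hi".
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(decodeUtf8WithoutBom(withBom));
    }
}
```

Only the first character is checked, so a U+FEFF that appears later in the text is left untouched.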
Perhaps I am not feeding the detector enough data from the file I am reading
in? I don't think that is the case, though, because I have grown the file to
171390 characters with no difference.
Original issue reported on code.google.com by [email protected] on 30 Sep 2011 at 10:20
The problem may be that your file does not contain any non-ASCII characters
(e.g. ä, ü, ß, ...). In that case the library does not seem to be able to
detect the encoding.
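If that is the cause, one possible workaround is to check the bytes yourself whenever the detector returns null. The sketch below uses only the JDK and assumes that a null result for ASCII-only input is the behaviour being hit here, which this thread does not confirm. `Utf8Fallback`, `isPureAscii`, `isValidUtf8` and `resolveCharset` are hypothetical names, and `detected` stands for whatever `getDetectedCharset()` returned.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Fallback {
    /** Returns true if every byte is in the 7-bit ASCII range. */
    public static boolean isPureAscii(byte[] data) {
        for (byte b : data) {
            if (b < 0) { // bytes >= 0x80 are negative as signed Java bytes
                return false;
            }
        }
        return true;
    }

    /** Returns true if the bytes decode as UTF-8 without a malformed sequence. */
    public static boolean isValidUtf8(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /**
     * Hypothetical helper: keeps the detector's answer when there is one,
     * otherwise falls back to "UTF-8" if the bytes are valid UTF-8
     * (pure ASCII always is). Returns null if neither applies.
     */
    public static String resolveCharset(String detected, byte[] data) {
        if (detected != null) {
            return detected;
        }
        return isValidUtf8(data) ? "UTF-8" : null;
    }
}
```

Calling `isPureAscii` on the file's bytes is also a quick way to test the hypothesis above: if it returns true, the file contains nothing that distinguishes UTF-8 from other ASCII-compatible encodings.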