Fails to Detect UTF-8 without BOM #13

Open
GoogleCodeExporter opened this issue May 12, 2015 · 2 comments

Comments

@GoogleCodeExporter

What steps will reproduce the problem?
1. Save a file in UTF-8 without BOM
2. Try to detect Character Encoding.

What is the expected output? What do you see instead?
I expect to see UTF-8 from the #getDetectedCharset() method. Instead I get null.

What version of the product are you using? On what operating system?
I am using juniversalchardet-1.0.3.jar on a Windows 7 System.


Please provide any additional information below.
When I save the file as UTF-8 with a BOM, detection works fine, but Java does 
not strip the BOM when decoding, so I get unwanted characters at the beginning 
of the file. Therefore I have been using UTF-8 without BOM.
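
For readers who would rather keep the BOM, the leading bytes can be dropped by hand before decoding. This is only a minimal sketch using the standard library; the class name `Utf8BomStripper` and the method `decodeUtf8` are made up for illustration and are not part of juniversalchardet.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of juniversalchardet.
final class Utf8BomStripper {

    /** Decodes the bytes as UTF-8, skipping a leading BOM (EF BB BF) if present. */
    static String decodeUtf8(byte[] data) {
        int offset = 0;
        if (data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF) {
            offset = 3; // the UTF-8 BOM is exactly three bytes long
        }
        return new String(data, offset, data.length - offset, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        // Without the check above, the decoded string would start with U+FEFF.
        System.out.println(decodeUtf8(withBom));
    }
}
```

This assumes the whole file fits in memory; for streams the same three-byte check would have to be done on the first read.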

Perhaps I am not feeding the detector enough data with the file I am reading 
in? Although I don't think that is the case because I have extended the amount 
of data inside of the file up to 171390 characters with no difference.

Original issue reported on code.google.com by [email protected] on 30 Sep 2011 at 10:20

Attachments:

@GoogleCodeExporter

Attached file is a test file I am using.

Original comment by [email protected] on 30 Sep 2011 at 10:32

@GoogleCodeExporter

The problem may be that your file does not contain any special (non-ASCII) 
characters (e.g. ä, ü, ß, ...). In this case the library does not seem to be 
able to detect the encoding.
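
If that is the cause, a file made up only of ASCII bytes is byte-for-byte identical in UTF-8, US-ASCII and ISO-8859-1, so there is nothing in it that points to one encoding over another. A possible workaround, sketched below with the standard library only, is to fall back to UTF-8 whenever the detector returns null and the bytes decode cleanly as UTF-8. The class `CharsetFallback` and its methods are hypothetical names; `detected` stands for whatever `getDetectedCharset()` returned.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of juniversalchardet.
final class CharsetFallback {

    /** True if no byte has the high bit set, i.e. the data is plain 7-bit ASCII. */
    static boolean isPureAscii(byte[] data) {
        for (byte b : data) {
            if (b < 0) { // bytes >= 0x80 are negative as signed Java bytes
                return false;
            }
        }
        return true;
    }

    /** True if the data decodes as UTF-8 without any malformed sequence. */
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Keeps the detector's answer if there is one, otherwise tries UTF-8. */
    static String resolve(String detected, byte[] data) {
        if (detected != null) {
            return detected;
        }
        return isValidUtf8(data) ? "UTF-8" : null;
    }
}
```

Treating pure ASCII as UTF-8 is a choice, not something the library prescribes; it is safe in the sense that decoding such a file as UTF-8 gives the same text as decoding it as US-ASCII.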

Original comment by [email protected] on 7 Jan 2014 at 2:40
