forked from DSpace/xoai
Commit
fix(common): apply more lowlevel fix for broken UTF-8 chars #188
With this commit, we introduce an analysis routine that inspects the last few bytes of each read in CopyElement. This is faster than converting to a String and repeatedly checking for the Unicode replacement character (U+FFFD). By using a buffered input stream, we can rewind the stream if necessary and read again, stopping at the last point where no char is broken. Extensive tests were added to make sure the analysis function works with multibyte UTF-8 chars of any length.
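The approach described above can be illustrated with a minimal sketch. This is not the code from the commit: the class `Utf8TailCheck` and the methods `incompleteTail` and `readChunk` are hypothetical names chosen for illustration, and the sketch assumes Java 9+ (for `InputStream.readNBytes`) and a read buffer of at least 4 bytes. It walks backwards over at most four trailing bytes, finds the lead byte of the last UTF-8 sequence, and reports how many bytes belong to a sequence that is cut off. If the chunk ends mid-character, the stream is rewound with `mark`/`reset` and re-read up to the last complete char.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not the class touched by the commit.
class Utf8TailCheck {

    // Returns how many trailing bytes of buf[0..len) belong to an
    // incomplete UTF-8 sequence, or 0 if the data ends on a char boundary
    // (or is malformed in a way this check cannot repair).
    static int incompleteTail(byte[] buf, int len) {
        int continuation = 0;
        for (int i = len - 1; i >= 0 && continuation < 4; i--) {
            int b = buf[i] & 0xFF;
            if ((b & 0xC0) == 0x80) {      // 10xxxxxx: continuation byte
                continuation++;
                continue;
            }
            int expected;                  // total length announced by the lead byte
            if ((b & 0x80) == 0x00) expected = 1;        // 0xxxxxxx
            else if ((b & 0xE0) == 0xC0) expected = 2;   // 110xxxxx
            else if ((b & 0xF0) == 0xE0) expected = 3;   // 1110xxxx
            else if ((b & 0xF8) == 0xF0) expected = 4;   // 11110xxx
            else return 0;                               // invalid lead byte
            int have = continuation + 1;
            return have < expected ? have : 0;
        }
        return 0; // empty input or only continuation bytes seen
    }

    // Reads one chunk that never ends in the middle of a UTF-8 char.
    // Assumes the stream supports mark/reset and buf.length >= 4.
    static int readChunk(InputStream in, byte[] buf) throws IOException {
        if (!in.markSupported() || buf.length < 4) {
            throw new IllegalArgumentException("need mark support and a buffer of >= 4 bytes");
        }
        in.mark(buf.length);
        int n = in.readNBytes(buf, 0, buf.length);
        int tail = incompleteTail(buf, n);
        if (tail == 0 || tail == n) {
            return n; // clean boundary, or input truncated at end of stream
        }
        in.reset();                              // rewind to the mark
        return in.readNBytes(buf, 0, n - tail);  // re-read without the broken tail
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "ab\u20ac".getBytes(StandardCharsets.UTF_8); // 'a', 'b', 3-byte euro sign
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(data));
        byte[] buf = new byte[4];
        int n;
        while ((n = readChunk(in, buf)) > 0) {
            System.out.println(n + " bytes: " + new String(buf, 0, n, StandardCharsets.UTF_8));
        }
    }
}
```

With the 4-byte buffer in `main`, the first raw read would end after two of the three bytes of the euro sign, so the sketch rewinds and returns only the two ASCII bytes; the euro sign then arrives whole in the next chunk. Working on raw bytes avoids decoding each chunk to a String just to look for a replacement character.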
1 parent 5057b23, commit 53f8222
Showing 2 changed files with 98 additions and 30 deletions.