Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Escape characters in CSV #397

Merged
merged 8 commits into from
Jan 29, 2018
Merged

Escape characters in CSV #397

merged 8 commits into from
Jan 29, 2018

Conversation

emartynov
Copy link
Contributor

@emartynov emartynov commented Jan 21, 2018

It is fix for #383.

Changes:

  1. Refactored CSV export
  2. Escaped '\n', '\r', '"', ',' when generating CSV
  3. Extracted one more string constant
  4. Added some missing tests

No changes here:

  1. Using async task
  2. Resume export after user allows write permission

Please review and comment.

@emartynov
Copy link
Contributor Author

@AndyScherzinger can you review?

@coveralls
Copy link

coveralls commented Jan 21, 2018

Coverage Status

Coverage increased (+1.002%) to 20.148% when pulling a57e9d0 on 383-fix-for-csv-export into 54dd90f on develop.

@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@Glucosio Glucosio deleted a comment Jan 21, 2018
@AndyScherzinger
Copy link

@emartynov sure, I'll give the branch a test drive :)

osw.append(i == values.length - 1 ? '\n' : ',');
}
}

private String escapeCSV(String value) {
if (containsSymbolsToEscape(value)) {
Copy link

@AndyScherzinger AndyScherzinger Jan 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SYMBOLS_TO_ESCAPE actually contains more symbols that get checked (and lead to a return true) than the following line will actually escape.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate more? I don't get the comment, sorry.

Did I make some logical error?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try :) containsSymbolsToEscape checks for a certain set of characters while not all of them get escaped then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you be more specific? What characters are not escaped?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try :)

 if (containsSymbolsToEscape(value)) {
    return "\"" + value.replace("\"", "\"\"") + "\"";
} 

escaping is done for " while the condition containsSymbolsToEscape(value) needs to be met of course. Thus I expected the condition to check for the same pattern. But it actually checks for:

private boolean containsSymbolsToEscape(String value) {
        for (char c : SYMBOLS_TO_ESCAPE) {
            if (value.indexOf(c) > -1) {
                return true;
            }
        }

        return false;
    }

while SYMBOLS_TO_ESCAPE contains:

private static final char[] SYMBOLS_TO_ESCAPE = new char[]{',', '"', '\n', '\r'};

So I would have expected the actual ascaping to also then escape the other "symbols to escape".

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

other than that also when \n has been found the check return true and you are escaping for "

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If , or \n or \r is detected then escaping is wrapping the whole string in the single quotes ". However, if the string contains single quote it should be escaped as double quote "". I hope it helps.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah 👍
I didn't think of that, now makes send. Thanks for the explanation and your patience 😃

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues, I also didn't know it by heart and should, of course, mention it in PR description.

@AndyScherzinger
Copy link

Looks good to me code wise, just a minor comment. Will test the export later today after work.

@AndyScherzinger
Copy link

Like mentioned in the in line comment only " and ' seem to be escaped, line breaks still exist and won't be escaped. This leads to new lines which will be treated as new rows, e.g. the following export:

Date,Time,Concentration,Preferred Glucose unit,Measured,Notes
1/22/18,21:37,25,mg/dL,After dinner,"Yet
Another
'
""
Test"
1/22/18,21:36,23.5,mg/dL,After dinner,"This, is
A
New line
Test"

When imported into MS Excel (komma delimiter, "-String key):
2018-01-22 21_53_03-mappe1 - excel

So I would guess an in-app import of it would still fail (don't know how to do an import :/)

@emartynov
Copy link
Contributor Author

We don't have import, only export :(

I was quickly checking https://tools.ietf.org/html/rfc4180. It looks like MS Excel is wrong with importing CSV.

I will dive more :)

@AndyScherzinger
Copy link

I also send it to my gmail adress and opened it as preview within gmail and there it works fine... so yeah, maybe we just have to ignore Excel...

@emartynov
Copy link
Contributor Author

Conclusion on this PR?

@AndyScherzinger
Copy link

@emartynov fine by me and as discussed excel has its cvs issues but that is excel not glucose :)

@AndyScherzinger
Copy link

AndyScherzinger commented Jan 29, 2018

Hi @emartynov I tried to explain it again, sorry for not getting it clearly explained :( Hope this time I got it right :)

@AndyScherzinger
Copy link

All fine by me. Any feedback @paolorotolo?

@paolorotolo
Copy link
Collaborator

I'll try it ASAP

@paolorotolo
Copy link
Collaborator

Works fine in LibreOffice, good work

screenshot from 2018-01-29 15-35-52

@paolorotolo paolorotolo merged commit 2497e54 into develop Jan 29, 2018
@emartynov emartynov deleted the 383-fix-for-csv-export branch February 4, 2018 10:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants