Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix issue when ReadStringBuffer returns without populating the strBuffer #182

Closed
wants to merge 3 commits into from
Closed

Conversation

Ctrl-Alt-Delight
Copy link

ReadStringBuffer can return early "if the string sits entirely within the current record" without populating the strBuffer.
ReadStringBuffer returns the string needed anyway so we can just return ReadStringBuffer.

@MarkPflug
Copy link
Owner

Do you have a file that demonstrates this bug?

@Ctrl-Alt-Delight
Copy link
Author

Sorry I have a file but it is full of private and confidential information. Opening the file and removing such information results in a file that no longer has the issue. The file is from a third party and unfortunately I have no control over its creation.

This pull request fixes my first issue that was throwing an exception of "Index was out of range. Must be non-negative and less than the size of the collection.\r\nParameter name: startIndex" when attempting to create a string from a zero length buffer (as it was never resized and populated)

I don't understand MS-XLS enough to provide the crucial information, it is happening in a record type of Label (516 / 0x0204) if that helps.

Is there some meta data I can pull from the file that would assist?

There is another issue I am facing which might be related in which the first character is a null character inside the same Label record type cause the string returned to include the null character but loose the final character of the string. I am not sure at this point with my limited knowledge if this particular issue is just a poorly formed xls file or not

@MarkPflug
Copy link
Owner

Would it be possible to email it to me privately? mark dot pflug at live dot com. I have a set of "external" tests that I run against files that I don't check into the repository: https://github.com/MarkPflug/Sylvan.Data.Excel/blob/main/source/Sylvan.Data.Excel.Tests/ExternalDataTests.cs

@Ctrl-Alt-Delight
Copy link
Author

Would it be possible to email it to me privately? ....

I have redacted what I can without altering the behaviour of the file and sent an email just now.

@MarkPflug
Copy link
Owner

Got the file and am able to reproduce the issue. Thanks.

@MarkPflug
Copy link
Owner

This issue ended up being a bit more complicated. The null character you were seeing was actually due to a discrepancy with the .xls specification I was using for reference. The label records seem to be different between excel 95 and excel 97. The spec doesn't mention this, but it is pretty obvious what's going on when looking at the actual data.

I've implemented a fix in #183 and have published a pre-release package containing the fix. If you could test that package and verify that it works with your actual, non-redacted files. If all looks good on your end, I'll publish a final 0.4.25 build.

https://www.nuget.org/packages/Sylvan.Data.Excel/0.4.25-b0001

@Ctrl-Alt-Delight
Copy link
Author

Ctrl-Alt-Delight commented Aug 14, 2024

Yeah I was coming to a similar conclusion just now as well, but more just fumbling around after reading up about XLUnicodeString

Had literally just committed to a branch that was passing for my use case but would likely break non excel 95:
908fc34

Glad you got to the bottom of it, will get this version and test.

Thanks.

@Ctrl-Alt-Delight
Copy link
Author

Have tested with a couple of files directly from the third party, all is working as expected.

@MarkPflug
Copy link
Owner

Okay, final 0.4.25 version is now published.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants