Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Implement RichText support for XLSX #441

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

FlixCoder
Copy link

For #440 .

I implemented a RichText representation and changed the Data representation to use it. The parsing is so far only implemented for XLSX, the other formats continue to only ever produce Data::String even if it is actually rich text content.

The shared strings also have to be rich text now ^^

I think it turned out quite well, let me know what you think :)

Copy link
Owner

@tafia tafia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, this looks really neat!
I made a few tiny comments, the most important one being of not having a DataRef::RichText variant.

src/de.rs Show resolved Hide resolved
src/style.rs Show resolved Hide resolved
src/style.rs Outdated Show resolved Hide resolved
src/style.rs Outdated Show resolved Hide resolved
src/style.rs Outdated Show resolved Hide resolved
src/datatype.rs Outdated Show resolved Hide resolved
@FlixCoder
Copy link
Author

Adjusted to the requested changes. I also found a few more little things that I missed, e.g. the DataType::get_string implementation for Data and such.

tests/test.rs Outdated Show resolved Hide resolved
Copy link
Owner

@tafia tafia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot.
I think I still don't see the reason not to keep DataRef::SharedString and just add a new variant DataRef::SharedRichText if necessary.

src/style.rs Show resolved Hide resolved
@@ -11,7 +11,7 @@ use super::{cell_format, parse_formula, wide_str, RecordIter};
pub struct XlsbCellsReader<'a> {
iter: RecordIter<'a>,
formats: &'a [CellFormat],
strings: &'a [String],
strings: &'a [RichText],
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is wrong to force all strings to be RichText. There are already some situation where memory is an issue.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an idea how to handle this? We need to store either normal strings or rich text, both can be the case at any shared string. So naturally you would make an enum:

enum SharedString {
    String(String),
    RichText(RichText),
}

But then the enum type is at least as big as RichText, if not more. So using &[SharedString] does not solve the memory issue.

So how are we able to store rich text while keeping the memory layout of string in the usual case? We could make the rich text varian RichText(Box<RichText>), but that has different overhead, that I am not sure if it is better than additional memory use?

DataRef::SharedString(v) => Some(v),
DataRef::String(v) => Some(v),
DataRef::RichText(v) => Some(v.text()),
DataRef::SharedString(v) => Some(v.text()),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect a SharedString and SharedRichText ??
What is the issue exactly to force RichText all over the place?

@FlixCoder
Copy link
Author

I can't really progress until #441 (comment) . Could you elaborate a bit how you imagine things to be?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants