Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Reading (and writing) formatting #404

Open
MyTotoro opened this issue Feb 9, 2024 · 2 comments
Open

Feature Request: Reading (and writing) formatting #404

MyTotoro opened this issue Feb 9, 2024 · 2 comments

Comments

@MyTotoro
Copy link

MyTotoro commented Feb 9, 2024

It would be very helpful to read the formatting and also write formatting out to excel files

I have the code to get string values out!
6,0: Some(String("Revenue Build"))

Would be curious if there was a way to extract the formatting info:
image

I've done some browsing under the hood and it seems like formats are driven from read_styles in
https://github.com/tafia/calamine/blob/master/src/xlsx/mod.rs#L221

Is the right thing to do to read and parse more styles, and expose this data upwards?

@wpf375516041
Copy link

I understand that the code you're referring to pertains to some formatting properties of the table itself rather than individual cells. If we need to extend functionality for specific cell attributes, I am not well-versed in parsing XLSX files but would like to contribute. Can someone provide me with more detailed suggestions on how to proceed with extending cell functionality?

@noctuid
Copy link

noctuid commented Apr 19, 2024

My team works with large amounts of data in excel files. We report various errors about this data and need to be strict about types to avoid ambiguity. For example, we expect columns with percentage values and either require the inputs to be numbers in the excel percent format or strings ending in "%". Decimals like 0.2 are ambiguous (was this meant to be 20% or 0.2%?), so we report errors for cells like that. I listed some more examples here.

Currently we are using openpyxl, but it is unusable for larger files. Calamine, on the other hand, does not currently allow detecting issues like this. After reading with calamine, we get decimals and know nothing about the original type. With openpyxl, we can get not just a cell value but also the data type and number format (e.g. .data_type == "n" and "%" in cell.number_format to check for percentage-type cells).

It would be nice if calamine could also read/provide type/format information or if it could have more DataTypes so that we could distinguish percentage decimals from plain decimals.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants