-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/396 handle xlsx files #421
Conversation
…. missing. Lint still failing due to cycle dependendies + type any warning.
fixed missing sheets on xlsx import of wb
Please let me know which classes I should write test cases for and to which extend or if this is not required at this stage. @rhazn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
I know I wasn't tagged but still wanted to leave you a few comments ;-)
addSheet(sheet: Sheet) { | ||
this.sheets.push(sheet); | ||
} | ||
|
||
addNewSheet(data: string[][], sheetName?: string) { | ||
const sheetNameOrDefault = sheetName ?? `Sheet${this.sheets.length + 1}`; | ||
this.addSheet(new Sheet(data, sheetNameOrDefault)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation currently allows sheets with duplicate names. I'd try to avoid this, or is this intended behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added check to prevent duplicate names in workbook
PrimitiveValuetypes.Text, | ||
); | ||
const sheet = workbook.getSheetByName(sheetName); | ||
assert(sheet instanceof Sheet); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather sheet !== undefined
(would be more expressive of your intention imo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed assert completly as already checked within function getSheetByName()
.
const workSheet = workBookFromFile.Sheets[workSheetName]; | ||
assert(workSheet !== undefined); | ||
|
||
const workSheetDataArray = Array.prototype.map.call< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That line looks weird to me, why go over the prototype?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be quite frank, because I did't manage to do it differently eg calling map
directly without failing lint and or type checks. Maybe I can have another look with @rhazn.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please bring this up in our sync, we can do it live.
) => string[], | ||
], | ||
string[][] | ||
>(workSheet, (x: xlsx.CellObject[]): string[] => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls use speaking variable names to indicate to readers whats happening.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed to row
], | ||
string[][] | ||
>(workSheet, (x: xlsx.CellObject[]): string[] => { | ||
return x.map<string>((y: xlsx.CellObject) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here: better naming of variables please (e.g., cell)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed to cell
Uhm, weird. Didn't mean to close this PR! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments.
Regarding the PR itself: Please make sure to always at least mention the related task, otherwise it is very hard to navigate. You can do so by just adding a 'closes #' or 'relates to #' in the description where number is the issue id.
Regarding test cases, hopefully @georg-schwarz can help me out here, do you know from the top of your head where we are right now with tests, should we add tests for the new blocks or the iotype?
@@ -134,6 +135,12 @@ export class DebugLogVisitor implements IoTypeVisitor<void> { | |||
|
|||
this.log('... (omitted in peek mode)'); | |||
} | |||
visitWorkbook(workbook: Workbook): void { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should
- have the same empty return with minimal debug granularity as the visitTable function has
- Log the number of worksheets in the workbook by default
- use the log function consistently, never console.log
- log the peek comment (as visit table does)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
- empty return for minimal debug granularity
- Logging the number of worksheets in the workbook
- consistently using log function
- added peek functionality for bigger workbooks
|
||
export class Workbook implements IOTypeImplementation<IOType.WORKBOOK> { | ||
public readonly ioType = IOType.WORKBOOK; | ||
private sheets: Sheet[]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be assigned immediately and you do not need a constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed accordingly.
this.sheets = []; | ||
} | ||
|
||
getSheets(): ReadonlyArray<Sheet> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why readonly array? I'd just reuse the Sheet[]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed accordingly.
return this.sheets; | ||
} | ||
|
||
getSheetByName(sheetName: string): Sheet { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't crash the interpreter with an assert if the sheet does not exist, I'd return Sheet or undefined and use https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/find
return sheet; | ||
} | ||
|
||
addSheet(sheet: Sheet) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why addSheet and addNewSheet with different semantic? addSheet lets you add sheets with duplicate names, addNewSheet does not. This has to be one function, just remove the current addSheet
and rename the addNewSheet
to addSheet
after.
Also, please add return types. These functions should return the workbook itself to enable fluent interfaces (https://en.wikipedia.org/wiki/Fluent_interface).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed as suggested
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the general approach I am not convinced a name should be the property of a sheet. You run into multiple problems using that approach:
- You have complex code because you need to filter
- You have the problem of duplicate names and what to do with them
Why not make the name to sheet mapping a property of a workbook? Keep sheets as they are, nameless. Instead, track sheets in the workbook as objects containing the name and the actual sheet. Then you can have an index (from the position of the object in the array), the sheet and the name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added it as a map<string, workbook> where string is the name of the workbook.
As discussed with the downside not beeing able to reference the sheet by position within the workbook e.g. read from sheet 5.
); | ||
|
||
this.docs.description = | ||
"Interprets an input file as a xlsx-file and outputs a `Workbook` containing `Sheet`'s."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Interprets an input file as a xlsx-file and outputs a `Workbook` containing `Sheet`'s."; | |
"Interprets an input file as a XLSX-file and outputs a `Workbook` containing `Sheet`s."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed as suggested
{ | ||
code: blockExample, | ||
description: | ||
"Interprets an input file as a xlsx-file and outputs a `Workbook` containing `Sheet`'s.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Interprets an input file as a xlsx-file and outputs a `Workbook` containing `Sheet`'s.", | |
"Interprets an input file as a XLSX-file and outputs a `Workbook` containing `Sheet`s.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed as suggested
file: BinaryFile, | ||
context: ExecutionContext, | ||
): Promise<R.Result<Workbook>> { | ||
context.logger.logDebug(`reading from xlsx file`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
context.logger.logDebug(`reading from xlsx file`); | |
context.logger.logDebug(`Reading from XLSX file`); |
const workSheet = workBookFromFile.Sheets[workSheetName]; | ||
assert(workSheet !== undefined); | ||
|
||
const workSheetDataArray = Array.prototype.map.call< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please bring this up in our sync, we can do it live.
@A-M-A-X seems like the dev branch got deleted by mistake, leading to an automation to close this PR. Can you retarget on |
@georg-schwarz not sure if I am able to. I guess I am missing rights to do so. |
You can create a new PR from your branch to main and keep this one closed. |
opened new PR #433 |
Implemented xlsx interpreter and sheet-picker blocks.
Test cases if applicable will be in a different PR.