Feature/396 handle xlsx files #421
Changes from all commits
5898db4
a74a470
bc9c93b
6b7dcb2
4d82690
fd5c19f
0abeb51
f1f0a5c
a7898a2
Review comment: For the general approach, I am not convinced a name should be a property of a sheet. You run into multiple problems using that approach:
Why not make the name-to-sheet mapping a property of the workbook? Keep sheets as they are, nameless. Instead, track sheets in the workbook as objects containing the name and the actual sheet. Then you have an index (from the position of the object in the array), the sheet, and the name.
Reply: Added it as a map<string, workbook> where string is the name of the workbook.
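The proposal in the comment above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code that ended up in the PR: `NamelessSheet`, `NamedSheetEntry`, `WorkbookSketch`, and all of its method names are hypothetical, and the real `Sheet` class carries more than a raw data array.

```typescript
// Hypothetical, simplified stand-in for the PR's Sheet: it holds data only and has no name.
class NamelessSheet {
  constructor(public readonly data: string[][]) {}
}

// The workbook, not the sheet, owns the name. The array position doubles as the index.
interface NamedSheetEntry {
  name: string;
  sheet: NamelessSheet;
}

class WorkbookSketch {
  private readonly entries: NamedSheetEntry[] = [];

  // Returns false (and adds nothing) when the name is already taken.
  addSheet(data: string[][], name?: string): boolean {
    const sheetName = name ?? `Sheet${this.entries.length + 1}`;
    if (this.getSheetByName(sheetName) !== undefined) {
      return false;
    }
    this.entries.push({ name: sheetName, sheet: new NamelessSheet(data) });
    return true;
  }

  // Lookup by name; undefined when no entry matches.
  getSheetByName(name: string): NamelessSheet | undefined {
    return this.entries.find((entry) => entry.name === name)?.sheet;
  }

  // Lookup by position in the workbook; undefined when out of range.
  getSheetByIndex(index: number): NamelessSheet | undefined {
    return this.entries[index]?.sheet;
  }

  getSheetNames(): string[] {
    return this.entries.map((entry) => entry.name);
  }
}
```

Keeping the name outside the sheet means a sheet can be passed along a pipeline without carrying workbook-specific state, while the workbook still offers lookups by index and by name.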
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,47 @@ | ||||||
// SPDX-FileCopyrightText: 2023 Friedrich-Alexander-Universitat Erlangen-Nurnberg | ||||||
// | ||||||
// SPDX-License-Identifier: AGPL-3.0-only | ||||||
|
||||||
import { strict as assert } from 'assert'; | ||||||
|
||||||
import { IOType } from '@jvalue/jayvee-language-server'; | ||||||
|
||||||
import { IOTypeImplementation, IoTypeVisitor } from './io-type-implementation'; | ||||||
import { Sheet } from './sheet'; | ||||||
|
||||||
export class Workbook implements IOTypeImplementation<IOType.WORKBOOK> { | ||||||
public readonly ioType = IOType.WORKBOOK; | ||||||
private sheets: Sheet[]; | ||||||
Review comment: This can be assigned immediately and you do not need a constructor.
Reply: Changed accordingly. |
||||||
constructor() { | ||||||
this.sheets = []; | ||||||
} | ||||||
|
||||||
getSheets(): ReadonlyArray<Sheet> { | ||||||
Review comment: Why readonly array? I'd just reuse the
Reply: Changed accordingly. |
||||||
return this.sheets; | ||||||
} | ||||||
|
||||||
getSheetByName(sheetName: string): Sheet { | ||||||
Review comment: We shouldn't crash the interpreter with an assert if the sheet does not exist. I'd return Sheet or undefined and use https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/find |
||||||
const sheet = this.sheets.filter( | ||||||
(sheet) => sheet.getSheetName() === sheetName, | ||||||
)[0]; | ||||||
assert(sheet instanceof Sheet); | ||||||
return sheet; | ||||||
} | ||||||
|
||||||
addSheet(sheet: Sheet) { | ||||||
Review comment: Why addSheet and addNewSheet with different semantics? addSheet lets you add sheets with duplicate names, addNewSheet does not. This has to be one function, just remove the current
Also, please add return types. These functions should return the workbook itself to enable fluent interfaces (https://en.wikipedia.org/wiki/Fluent_interface).
Reply: Changed as suggested. |
||||||
this.sheets.push(sheet); | ||||||
} | ||||||
|
||||||
addNewSheet(data: string[][], sheetName?: string) { | ||||||
const sheetNameOrDefault = sheetName ?? `Sheet${this.sheets.length + 1}`; | ||||||
Review comment (with suggested change): I like this, but the name should contain a space.
Reply: As discussed, we will keep this without the gap to stay in line with common default naming for worksheets in e.g. Excel or Sheets. |
||||||
if ( | ||||||
Review comment: if (this.getSheetByName), reuse your own API.
Reply: Changed as suggested. |
||||||
this.sheets.some((sheet) => sheet.getSheetName() === sheetNameOrDefault) | ||||||
) | ||||||
return; | ||||||
Review comment: Silently returning without doing anything is suboptimal. Please log a clear error (e.g., did not add sheet X to workbook, sheet with name X already exists).
Reply: Changed accordingly. |
||||||
this.addSheet(new Sheet(data, sheetNameOrDefault)); | ||||||
} | ||||||
|
||||||
acceptVisitor<R>(visitor: IoTypeVisitor<R>): R { | ||||||
return visitor.visitWorkbook(this); | ||||||
} | ||||||
} |
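Taken together, the review comments on this file (immediate field initialization, `Array.prototype.find` with an `undefined` result, a single `addSheet` with a fluent return value, and a logged error on duplicates) point to a shape like the one below. This is only a sketch under assumptions: `SheetStub`, `FluentWorkbook`, and the `LogError` callback are hypothetical stand-ins, and the revised code in the PR may differ.

```typescript
// Hypothetical stand-in for the PR's Sheet class, reduced to what this sketch needs.
class SheetStub {
  constructor(
    public readonly data: string[][],
    private readonly name: string,
  ) {}

  getSheetName(): string {
    return this.name;
  }
}

// Hypothetical logging callback; how the real workbook would reach a logger is not shown in this diff.
type LogError = (message: string) => void;

class FluentWorkbook {
  // Initialized at the declaration rather than inside the constructor body.
  private readonly sheets: SheetStub[] = [];

  constructor(private readonly logError: LogError) {}

  getSheets(): SheetStub[] {
    return this.sheets;
  }

  // Array.prototype.find yields undefined when nothing matches, so no assert is needed.
  getSheetByName(sheetName: string): SheetStub | undefined {
    return this.sheets.find((sheet) => sheet.getSheetName() === sheetName);
  }

  // Single entry point for adding sheets; returns `this` to allow chaining.
  addSheet(data: string[][], sheetName?: string): this {
    const sheetNameOrDefault = sheetName ?? `Sheet${this.sheets.length + 1}`;
    if (this.getSheetByName(sheetNameOrDefault) !== undefined) {
      this.logError(
        `Did not add sheet ${sheetNameOrDefault} to workbook, sheet with name ${sheetNameOrDefault} already exists.`,
      );
      return this;
    }
    this.sheets.push(new SheetStub(data, sheetNameOrDefault));
    return this;
  }
}
```

With this shape, calls can be chained, for example `workbook.addSheet(rowsA, 'A').addSheet(rowsB)`, and a duplicate name produces a log message instead of a silent return.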
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
// SPDX-FileCopyrightText: 2023 Friedrich-Alexander-Universitat Erlangen-Nurnberg | ||
// | ||
// SPDX-License-Identifier: AGPL-3.0-only | ||
|
||
import * as R from '@jvalue/jayvee-execution'; | ||
import { | ||
AbstractBlockExecutor, | ||
BlockExecutorClass, | ||
ExecutionContext, | ||
Sheet, | ||
Workbook, | ||
implementsStatic, | ||
} from '@jvalue/jayvee-execution'; | ||
import { IOType, PrimitiveValuetypes } from '@jvalue/jayvee-language-server'; | ||
|
||
@implementsStatic<BlockExecutorClass>() | ||
export class SheetPickerExecutor extends AbstractBlockExecutor< | ||
IOType.WORKBOOK, | ||
IOType.SHEET | ||
> { | ||
public static readonly type = 'SheetPicker'; | ||
|
||
constructor() { | ||
super(IOType.WORKBOOK, IOType.SHEET); | ||
} | ||
|
||
// eslint-disable-next-line @typescript-eslint/require-await | ||
async doExecute( | ||
workbook: Workbook, | ||
context: ExecutionContext, | ||
): Promise<R.Result<Sheet | null>> { | ||
const sheetName = context.getPropertyValue( | ||
'sheetName', | ||
PrimitiveValuetypes.Text, | ||
); | ||
const sheet = workbook.getSheetByName(sheetName); | ||
return R.ok(sheet); | ||
} | ||
} |
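If `getSheetByName` returns `undefined` for an unknown name, as suggested in the review of the workbook class, the executor has to turn that into a failed result rather than passing the value on. The sketch below shows one way to do that. It is self-contained and uses a hypothetical `PickResult` type and `pickSheet` helper instead of the real `R.Result` API from `@jvalue/jayvee-execution`, whose error constructor is not shown in this diff.

```typescript
// Hypothetical, simplified result type; the PR uses R.Result from '@jvalue/jayvee-execution'.
type PickResult<T> =
  | { ok: true; value: T }
  | { ok: false; message: string };

// Hypothetical lookup signature: returns undefined when no sheet has the given name.
type SheetLookup<S> = (sheetName: string) => S | undefined;

// Picks a sheet by name and reports a missing sheet as a failed result instead of throwing.
function pickSheet<S>(lookup: SheetLookup<S>, sheetName: string): PickResult<S> {
  const sheet = lookup(sheetName);
  if (sheet === undefined) {
    return {
      ok: false,
      message: `Workbook does not contain a sheet named "${sheetName}"`,
    };
  }
  return { ok: true, value: sheet };
}
```

A failed result gives the interpreter the chance to abort the pipeline with a readable message, which matches the behavior described in the block's documentation.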
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,61 @@ | ||||||
// SPDX-FileCopyrightText: 2023 Friedrich-Alexander-Universitat Erlangen-Nurnberg | ||||||
// | ||||||
// SPDX-License-Identifier: AGPL-3.0-only | ||||||
|
||||||
import { strict as assert } from 'assert'; | ||||||
|
||||||
import * as R from '@jvalue/jayvee-execution'; | ||||||
import { | ||||||
AbstractBlockExecutor, | ||||||
BinaryFile, | ||||||
BlockExecutorClass, | ||||||
ExecutionContext, | ||||||
Workbook, | ||||||
implementsStatic, | ||||||
} from '@jvalue/jayvee-execution'; | ||||||
import { IOType } from '@jvalue/jayvee-language-server'; | ||||||
import * as xlsx from 'xlsx'; | ||||||
|
||||||
@implementsStatic<BlockExecutorClass>() | ||||||
export class XLSXInterpreterExecutor extends AbstractBlockExecutor< | ||||||
IOType.FILE, | ||||||
IOType.WORKBOOK | ||||||
> { | ||||||
public static readonly type = 'XLSXInterpreter'; | ||||||
|
||||||
constructor() { | ||||||
super(IOType.FILE, IOType.WORKBOOK); | ||||||
} | ||||||
|
||||||
async doExecute( | ||||||
file: BinaryFile, | ||||||
context: ExecutionContext, | ||||||
): Promise<R.Result<Workbook>> { | ||||||
context.logger.logDebug(`reading from xlsx file`); | ||||||
Review comment (with suggested change) |
||||||
const workBookFromFile = xlsx.read(file.content, { dense: true }); | ||||||
const workbook = new Workbook(); | ||||||
for (const workSheetName of workBookFromFile.SheetNames) { | ||||||
const workSheet = workBookFromFile.Sheets[workSheetName]; | ||||||
assert(workSheet !== undefined); | ||||||
|
||||||
const workSheetDataArray = Array.prototype.map.call< | ||||||
Review comment: That line looks weird to me, why go over the prototype?
Reply: To be quite frank, because I didn't manage to do it differently, e.g. calling
Reply: Yes, please bring this up in our sync, we can do it live. |
||||||
xlsx.WorkSheet, | ||||||
[ | ||||||
callbackfn: ( | ||||||
value: xlsx.CellObject[], | ||||||
index: number, | ||||||
array: xlsx.WorkSheet[], | ||||||
) => string[], | ||||||
], | ||||||
string[][] | ||||||
>(workSheet, (row: xlsx.CellObject[]): string[] => { | ||||||
return row.map<string>((cell: xlsx.CellObject) => { | ||||||
return cell.v?.toString() ?? ''; | ||||||
}); | ||||||
}); | ||||||
|
||||||
workbook.addNewSheet(workSheetDataArray, workSheetName); | ||||||
} | ||||||
return Promise.resolve(R.ok(workbook)); | ||||||
} | ||||||
} |
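The conversion that the review questions (rows of cell objects to `string[][]`) can be written without going through `Array.prototype.map.call`, provided the dense worksheet is first treated as a plain array of rows. The sketch below assumes exactly that and is not the resolution reached in the sync mentioned above: `CellLike`, `DenseRows`, and `rowsToStrings` are hypothetical names, and how the `xlsx` typings expose dense rows is not settled by this diff.

```typescript
// Hypothetical minimal cell shape; the PR uses xlsx.CellObject, whose `v` field holds the raw value.
interface CellLike {
  v?: string | number | boolean | Date;
}

// Dense rows may contain gaps, so missing rows and missing cells are tolerated here.
type DenseRows = ReadonlyArray<ReadonlyArray<CellLike | undefined> | undefined>;

function rowsToStrings(rows: DenseRows): string[][] {
  // Array.from visits every index, including holes, whereas Array.prototype.map skips them.
  return Array.from(rows, (row) => {
    const cells: ReadonlyArray<CellLike | undefined> = row ?? [];
    // Empty or missing cells become empty strings, as in the PR's `cell.v?.toString() ?? ''`.
    return Array.from(cells, (cell) => cell?.v?.toString() ?? '');
  });
}
```

SheetJS also provides `XLSX.utils.sheet_to_json`, which returns an array of arrays when called with `header: 1`. Whether that helper fits this block was not discussed in the review.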
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
// SPDX-FileCopyrightText: 2023 Friedrich-Alexander-Universitat Erlangen-Nurnberg | ||
// | ||
// SPDX-License-Identifier: AGPL-3.0-only | ||
|
||
import { | ||
BlockMetaInformation, | ||
IOType, | ||
PrimitiveValuetypes, | ||
} from '@jvalue/jayvee-language-server'; | ||
|
||
export class SheetPickerMetaInformation extends BlockMetaInformation { | ||
constructor() { | ||
super( | ||
// How the block type should be called: | ||
'SheetPicker', | ||
// Property definitions: | ||
{ | ||
sheetName: { | ||
type: PrimitiveValuetypes.Text, | ||
docs: { | ||
description: 'The name of the sheet to select.', | ||
}, | ||
}, | ||
}, | ||
// Input type: | ||
IOType.WORKBOOK, | ||
|
||
// Output type: | ||
IOType.SHEET, | ||
); | ||
|
||
this.docs.description = | ||
'Selects one `Sheet` from a `Workbook` based on its `sheetName`. If no sheet matches the name, no output is created and the execution of the pipeline is aborted.'; | ||
this.docs.examples = [ | ||
{ | ||
code: `block AgencySheetPicker oftype SheetPicker { | ||
sheetName: "AgencyNames"; | ||
}`, | ||
description: | ||
'Tries to pick the sheet `AgencyNames` from the provided `Workbook`. If `AgencyNames` exists it is passed on as `Sheet`, if it does not exist the execution of the pipeline is aborted.', | ||
}, | ||
]; | ||
} | ||
} |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,33 @@ | ||||||
// SPDX-FileCopyrightText: 2023 Friedrich-Alexander-Universitat Erlangen-Nurnberg | ||||||
// | ||||||
// SPDX-License-Identifier: AGPL-3.0-only | ||||||
|
||||||
import { BlockMetaInformation, IOType } from '@jvalue/jayvee-language-server'; | ||||||
|
||||||
export class XLSXInterpreterMetaInformation extends BlockMetaInformation { | ||||||
constructor() { | ||||||
super( | ||||||
// How the block type should be called: | ||||||
'XLSXInterpreter', | ||||||
// Property definitions: | ||||||
{}, | ||||||
// Input type: | ||||||
IOType.FILE, | ||||||
|
||||||
// Output type: | ||||||
IOType.WORKBOOK, | ||||||
); | ||||||
|
||||||
this.docs.description = | ||||||
"Interprets an input file as a xlsx-file and outputs a `Workbook` containing `Sheet`'s."; | ||||||
Review comment (with suggested change)
Reply: Changed as suggested. |
||||||
this.docs.examples = [ | ||||||
{ | ||||||
code: blockExample, | ||||||
description: | ||||||
"Interprets an input file as a xlsx-file and outputs a `Workbook` containing `Sheet`'s.", | ||||||
Review comment (with suggested change)
Reply: Changed as suggested. |
||||||
}, | ||||||
]; | ||||||
} | ||||||
} | ||||||
const blockExample = `block AgencyXLSXInterpreter oftype XLSXInterpreter { | ||||||
}`; |
Review comment: Should
Reply: Added