Merge pull request #72 from rpbouman/dev
Bugfixes, Duckdb Upgrade, etc.
rpbouman authored Feb 26, 2024
2 parents d42cc95 + e580977 commit 90147d0
Showing 21 changed files with 802 additions and 380 deletions.
48 changes: 31 additions & 17 deletions README.md
@@ -1,5 +1,5 @@
# 🦆 Huey
Huey is a browser-based application that lets you inspect and analyze tabular datasets.
Huey is a browser-based application that lets you explore tabular datasets.
Huey supports reading from multiple file formats, like .csv, .parquet, .json data files as well as .duckdb database files.

__Try Huey now online__ [https://rpbouman.github.io/huey/src/index.html](https://rpbouman.github.io/huey/src/index.html)
@@ -8,10 +8,13 @@ __Try Huey now online__ [https://rpbouman.github.io/huey/src/index.html](https:/


## Key features
- Zero install. Download or checkout the source tree, and open src/index.html in your browser! No server required.
- An intuitive and responsive pivot table that supports filtering and (sub)totals
- Supports many different aggregate functions for reporting and data exploration
- Automatic breakdown of date/time columns into separate parts (year, month, quarter etc) for reporting
- Supports reading .parquet, .csv, .json and .duckdb database files. (Support for reading MS Excel .xlsx files and .sqlite is planned)
- Export of results and/or SQL queries to file or clipboard
- Blazing fast, even for large files - courtesy of [DuckDB](https://duckdb.org)
- An intuitive and responsive pivot table, with support for many types of metrics
- Zero install. Download or checkout the source tree, and open src/index.html in your browser! No server required.

Note: although Huey can run locally, there is nothing that keeps you from deploying it in a webserver if you want to.
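
The automatic breakdown of date/time columns listed under the key features can be pictured as a mapping from a part label to a SQL expression. The following is a minimal sketch, not Huey's actual implementation: the labels mirror the `data-derivation` values that appear in `src/AttributeUi/AttributeUi.css` in this commit, the function names `quoteIdentifier` and `derivationExpression` are hypothetical, and the choice of DuckDB's date part functions (`year`, `quarter`, `month`, `week`, `dayofyear`, `dayofmonth`, `dayofweek`, `hour`, `minute`, `second`) is an assumption.

```javascript
// Hypothetical mapping from a derivation label to a DuckDB date part function.
// The labels come from the CSS in this commit; the function choice is an
// assumption and may differ from what Huey really generates.
const DATE_PART_FUNCTIONS = {
  'year': 'year',
  'quarter': 'quarter',
  'month num': 'month',
  'week num': 'week',
  'day of year': 'dayofyear',
  'day of month': 'dayofmonth',
  'day of week': 'dayofweek',
  'hour': 'hour',
  'minute': 'minute',
  'second': 'second'
};

// Quote a column name as a SQL identifier, doubling embedded double quotes.
function quoteIdentifier(name) {
  return '"' + String(name).replace(/"/g, '""') + '"';
}

// Build the SQL expression that extracts one part from a date/time column.
function derivationExpression(columnName, derivation) {
  const fn = DATE_PART_FUNCTIONS[derivation];
  if (fn === undefined) {
    throw new Error('Unknown derivation: ' + derivation);
  }
  return fn + '( ' + quoteIdentifier(columnName) + ' )';
}
```

For example, `derivationExpression('order date', 'quarter')` yields an expression that could be used in a `SELECT` or `GROUP BY` clause to place the quarter on a pivot table axis.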

@@ -27,33 +30,44 @@ Note: although Huey can run locally, there is nothing that keeps you from deploy
## Registering and Analyzing Files with Huey

### Registering Files
Huey uses [DuckDb WASM](https://duckdb.org/docs/archive/0.9.2/api/wasm/overview) to access and analyze files.
Due to general security policy, the web browser can not simply read arbitrary files from your local computer: you need to explicitly select files and register them in DuckDB WASM's virtual file system.
Huey uses [DuckDb WASM](https://duckdb.org/docs/archive/0.9.2/api/wasm/overview) to read and analyze data files.
Due to general security policy, the web browser can not simply read files from your local computer: you need to explicitly select and register files in DuckDB WASM's virtual file system.

To register one or more files, you can either
1) Click the 'Upload...' button, which is the leftmost button on the toolbar at the top of the page.
2) Drag 'n Drop one or more files onto the "Datasources" tab in the sidebar. (The sidebar is on the left of the screen)
1) Click the 'Upload...' button ![upload button icon](https://github.com/rpbouman/huey/assets/647315/8dbae6ad-c4f2-4d5e-bc9a-f15fa9444c89).
The upload button is always available as the leftmost button on the toolbar at the top of the page. The upload action will pop up a file browser dialog that lets you browse and choose one or more files from your local filesystem.
In the file browser dialog, navigate to the file or files that you want to explore, select them and then confirm the dialog by clicking the 'Ok' button.
2) Drag 'n Drop one or more files onto the "Datasources" tab in the sidebar.

Either action will open the Upload dialog. The upload dialog will show a progress bar for each file that is being registered. Additional progress items may appear in case a duckdb extension needs to be installed and/or loaded.

![image](https://github.com/rpbouman/huey/assets/647315/b0c37783-4b3a-4166-9f3b-7f5a5ff91cd9)

Either of these actions will pop up a dialog that lets you browse and choose one or more files from your local filesystem.
In the file browser dialog, navigate to the file or files that you want to analyze, select them and then confirm the dialog by clicking the 'Ok' button.
After completion of the upload process, the upload dialog is updated to indicate the status of the uploads (or the extension installation, if applicable).

After confirming the dialog, Huey will attempt to register the files in DuckDb.
The successfully registered files are added to the "Datasources" tab in the sidebar.
Items that encountered an error are indicated by red progress bars. In case of errors, the item is expanded to reveal any information that might help to remedy the issue.

When registering new files, Huey will attempt to group files having similar column signature. The group appears as a separate node in the Datasources tab, and the individual files appear indented below it.
Successful actions are indicated by green progress bars. Successfully loaded files are available in the Datasources tab, from where you can start exploring their contents by clicking the explore button ![explore button](https://github.com/rpbouman/huey/assets/647315/7b67ff2d-5cec-44e0-91d4-e670d38487c1). As a convenience, the explore button is also present in the upload dialog.

Huey will attempt to group files that have a similar column signature. The group appears as a separate top-level node in the Datasources tab, with its individual files indented below it. A file group has its own explore button, so that you can explore not only the individual files, but also the UNION of all files in the group:

![image](https://github.com/rpbouman/huey/assets/647315/0ad057e0-e4ab-4bd8-b996-d3f50542853d)

Files that cannot be grouped appear in a separate Miscellaneous Files group.
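
The grouping described above can be sketched as follows. This is an illustration under stated assumptions, not Huey's code: it treats two files as "similar" only when their column names and types match exactly and in the same order, it assumes each file is described by a hypothetical `{ name, columns }` object, and it combines a group with `UNION ALL` over DuckDB's `SELECT * FROM 'file'` syntax. All function names are hypothetical.

```javascript
// Build a signature string from a file's columns: names and types, in order.
// Exact equality is an assumption; Huey's notion of "similar" may be looser.
function columnSignature(columns) {
  return JSON.stringify(columns.map((column) => [column.name, column.type]));
}

// Group files that share a signature. Files whose signature is unique
// end up in a separate "miscellaneous" list.
function groupFilesBySignature(files) {
  const bySignature = new Map();
  for (const file of files) {
    const signature = columnSignature(file.columns);
    if (!bySignature.has(signature)) {
      bySignature.set(signature, []);
    }
    bySignature.get(signature).push(file);
  }
  const groups = [];
  const miscellaneous = [];
  for (const members of bySignature.values()) {
    if (members.length > 1) {
      groups.push(members);
    } else {
      miscellaneous.push(members[0]);
    }
  }
  return { groups, miscellaneous };
}

// Quote a file name as a SQL string literal, doubling embedded single quotes.
function quoteLiteral(text) {
  return "'" + String(text).replace(/'/g, "''") + "'";
}

// Build one query over all files in a group.
// UNION ALL (which keeps duplicate rows) is an assumption.
function unionQuery(group) {
  return group
    .map((file) => 'SELECT * FROM ' + quoteLiteral(file.name))
    .join('\nUNION ALL\n');
}
```

Comparing serialized signatures keeps the grouping a single pass over the files; a looser definition of similarity (for example, ignoring column order) would only require changing `columnSignature`.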

#### Opening DuckDb files
Apart from reading data files directly, Huey can also utilize existing duckdb files and access its tables and views.
The process for accessing duckdb files is exactly the same as for accessing data files. Just make sure you give your duckdb file a '.duckdb' extension - that's how Huey knows it's a duckdb file
(DuckDB data files are not required to have any particular name or extension, but Huey currently cannot detect that, so it relies on a file extension convention instead.)
Apart from reading data files directly, Huey can also open existing duckdb files and access its tables and views. The process for accessing duckdb files is exactly the same as for accessing data files. Just make sure you give your duckdb file a '.duckdb' extension - that's how Huey knows it's a duckdb file. (DuckDB data files are not required to have any particular name or extension, but Huey currently cannot detect that, so it relies on a file extension convention instead.) Successfully loaded .duckdb files will appear in the DuckDb Folder, which appears at the top of the DataSources tab.

![image](https://github.com/rpbouman/huey/assets/647315/c7ca5ed7-7454-4783-8dbc-493244f8bb28)

The schemas in the duckdb database file are presented as folders below the duckdb file entry, and any tables or views in the schema are presented below the schema folder. Each table or view has an explore button which you can click to explore the data.

Note: We ran into a limitation - when the duckdb file itself refers to external files, then it's likely that Huey (or rather, DuckDB WASM) won't be able to find them.
But native duckdb tables, as well as views based on duckdb base tables work marvelously and are quite a bit faster than querying bare data files.
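
The file extension convention described in this section can be sketched as a small classifier. This is a minimal illustration, not Huey's implementation: the function names are hypothetical, the list of data file extensions is taken from the README's key features, and treating extensions case-insensitively is an assumption.

```javascript
// Data file extensions named in the README; '.duckdb' marks a database file.
const DATA_FILE_EXTENSIONS = ['csv', 'parquet', 'json'];

// Return the lower-cased text after the last dot, or '' if there is no dot.
// Lower-casing (case-insensitive matching) is an assumption.
function fileExtension(fileName) {
  const dot = fileName.lastIndexOf('.');
  return dot === -1 ? '' : fileName.slice(dot + 1).toLowerCase();
}

// Decide how a selected file should be treated, based on its extension only.
function classifyFile(fileName) {
  const extension = fileExtension(fileName);
  if (extension === 'duckdb') {
    return 'database';
  }
  if (DATA_FILE_EXTENSIONS.includes(extension)) {
    return 'data';
  }
  return 'unknown';
}
```

Because the decision rests on the name alone, a DuckDB database saved as, say, `sales.db` would be classified as `unknown` here, which is why the README asks for the `.duckdb` extension.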

### Analyzing Datasources
The Datasources have an analyze button. After clicking it, the sidebar switches to the Attributes tab, which is then populated with a list of the Attributes of the selected Datasource.
### Exploring Datasources
The Datasources have an explore button ![explore button](https://github.com/rpbouman/huey/assets/647315/7b67ff2d-5cec-44e0-91d4-e670d38487c1). After clicking it, the sidebar switches to the Attributes tab, which is then populated with a list of the Attributes of the selected Datasource.
You can think of Attributes as a list of values (a column) that can be extracted from the Datasource and presented along the axes of the pivot table.

The pivot table has two axes for placing attribute values:
114 changes: 64 additions & 50 deletions src/AttributeUi/AttributeUi.css
@@ -13,47 +13,61 @@
.attributeUi details > summary > .label {
max-width: calc(100% - 166px);
}

/**
* folder icons
*/
.attributeUi details[data-nodetype=folder] > summary > .icon::before {
/* folder */
content: "\eaad";
}

.attributeUi details[data-nodetype=folder][open] > summary > .icon::before {
/* folder-open */
content: "\faf7";
}

/**
*
* Data type icons
*
*/
.attributeUi details[data-nodetype=column][data-column_type=VARCHAR] > summary > .icon:before {
.attributeUi details[data-nodetype=column][data-column_type=VARCHAR] > summary > .icon::before {
/* letter T */
content: "\ec63";
}

.attributeUi details[data-nodetype=column][data-column_type$=INT] > summary > .icon:before,
.attributeUi details[data-nodetype=column][data-column_type=INTEGER] > summary > .icon:before
.attributeUi details[data-nodetype=column][data-column_type$=INT] > summary > .icon::before,
.attributeUi details[data-nodetype=column][data-column_type=INTEGER] > summary > .icon::before
{
/* 123 */
content: "\f554";
}

.attributeUi details[data-nodetype=column][data-column_type^=STRUCT] > summary > .icon:before
.attributeUi details[data-nodetype=column][data-column_type^=STRUCT] > summary > .icon::before
{
/* code-dots */
content: "\f61a";
}

.attributeUi details[data-nodetype=column][data-column_type^=DECIMAL] > summary > .icon:before,
.attributeUi details[data-nodetype=column][data-column_type=DOUBLE] > summary > .icon:before,
.attributeUi details[data-nodetype=column][data-column_type=REAL] > summary > .icon:before {
.attributeUi details[data-nodetype=column][data-column_type^=DECIMAL] > summary > .icon::before,
.attributeUi details[data-nodetype=column][data-column_type=DOUBLE] > summary > .icon::before,
.attributeUi details[data-nodetype=column][data-column_type=REAL] > summary > .icon::before {
/* decimal */
content: "\fa26";
}

.attributeUi details[data-nodetype=column][data-column_type*=TIMESTAMP] > summary > .icon:before {
.attributeUi details[data-nodetype=column][data-column_type*=TIMESTAMP] > summary > .icon::before {
/* calendar-clock */
content: "\fd2e";
}

.attributeUi details[data-nodetype=column][data-column_type=DATE] > summary > .icon:before {
.attributeUi details[data-nodetype=column][data-column_type=DATE] > summary > .icon::before {
/* calendar */
content: "\ea53";
}

.attributeUi details[data-nodetype=column][data-column_type=TIME] > summary > .icon:before {
.attributeUi details[data-nodetype=column][data-column_type=TIME] > summary > .icon::before {
/* clock */
content: "\ea70";
}
@@ -63,61 +77,61 @@
* Derivation icons
*
*/
.attributeUi details[data-nodetype=derived][data-derivation=iso-date] > summary > .icon:before {
.attributeUi details[data-nodetype=derived][data-derivation=iso-date] > summary > .icon::before {
/* calendar */
content: "\ea53";
}
.attributeUi details[data-nodetype=derived][data-derivation=year] > summary > .icon:before {
.attributeUi details[data-nodetype=derived][data-derivation=year] > summary > .icon::before {
/* letter-y */
content: "\ec68";
}

.attributeUi details[data-nodetype=derived][data-derivation=quarter] > summary > .icon:before {
.attributeUi details[data-nodetype=derived][data-derivation=quarter] > summary > .icon::before {
/* letter q */
content: "\ec60";
}

.attributeUi details[data-nodetype=derived][data-derivation="month num"] > summary > .icon:before {
.attributeUi details[data-nodetype=derived][data-derivation="month num"] > summary > .icon::before {
/* letter m */
content: "\ec5c";
}

.attributeUi details[data-nodetype=derived][data-derivation="week num"] > summary > .icon:before {
/* letter w */
content: "\ec66";
.attributeUi details[data-nodetype=derived][data-derivation="week num"] > summary > .icon::before {
/* calendar-week */
content: "\fd30";
}

.attributeUi details[data-nodetype=derived][data-derivation="day of year"] > summary > .icon:before {
.attributeUi details[data-nodetype=derived][data-derivation="day of year"] > summary > .icon::before {
/* letter-d */
content: "\ec53";
}

.attributeUi details[data-nodetype=derived][data-derivation="day of month"] > summary > .icon:before {
/* letter-d */
content: "\ec53";
.attributeUi details[data-nodetype=derived][data-derivation="day of month"] > summary > .icon::before {
/* calendar-month */
content: "\fd2f";
}

.attributeUi details[data-nodetype=derived][data-derivation="day of week"] > summary > .icon:before {
/* letter-d */
content: "\ec53";
.attributeUi details[data-nodetype=derived][data-derivation="day of week"] > summary > .icon::before {
/* letter-d-small */
content: "\fcca";
}

.attributeUi details[data-nodetype=derived][data-derivation=iso-time] > summary > .icon:before {
.attributeUi details[data-nodetype=derived][data-derivation=iso-time] > summary > .icon::before {
/* clock */
content: "\ea70";
}

.attributeUi details[data-nodetype=derived][data-derivation=hour] > summary > .icon:before {
.attributeUi details[data-nodetype=derived][data-derivation=hour] > summary > .icon::before {
/* letter-h */
content: "\ec57";
}

.attributeUi details[data-nodetype=derived][data-derivation=minute] > summary > .icon:before {
.attributeUi details[data-nodetype=derived][data-derivation=minute] > summary > .icon::before {
/* letter-m-small */
content: "\fcd3";
}

.attributeUi details[data-nodetype=derived][data-derivation=second] > summary > .icon:before {
.attributeUi details[data-nodetype=derived][data-derivation=second] > summary > .icon::before {
/* letter-s */
content: "\ec62";
}
@@ -128,82 +142,82 @@
*
*/

.attributeUi details[data-nodetype=aggregate][data-aggregator=count] > summary > .icon:before {
.attributeUi details[data-nodetype=aggregate][data-aggregator=count] > summary > .icon::before {
/* tallymarks */
content: "\ec4a";
}

.attributeUi details[data-nodetype=aggregate][data-aggregator="distinct count"] > summary > .icon:before {
.attributeUi details[data-nodetype=aggregate][data-aggregator="distinct count"] > summary > .icon::before {
/* tallymark-4 */
content: "\ec49";
}

.attributeUi details[data-nodetype=aggregate][data-aggregator=max] > summary > .icon:before {
.attributeUi details[data-nodetype=aggregate][data-aggregator=max] > summary > .icon::before {
/* math-max */
content: "\f0f5";
}

.attributeUi details[data-nodetype=aggregate][data-aggregator=min] > summary > .icon:before {
.attributeUi details[data-nodetype=aggregate][data-aggregator=min] > summary > .icon::before {
/* math-min */
content: "\f0f6";
}

.attributeUi details[data-nodetype=aggregate][data-aggregator=list] > summary > .icon:before {
.attributeUi details[data-nodetype=aggregate][data-aggregator=list] > summary > .icon::before {
/* list */
content: "\eb6b";
}

.attributeUi details[data-nodetype=aggregate][data-aggregator="distinct list"] > summary > .icon:before {
.attributeUi details[data-nodetype=aggregate][data-aggregator="distinct list"] > summary > .icon::before {
/* list details */
content: "\ef40";
}

.attributeUi details[data-nodetype=aggregate][data-aggregator=histogram] > summary > .icon:before {
.attributeUi details[data-nodetype=aggregate][data-aggregator=histogram] > summary > .icon::before {
/* list numbers */
content: "\ef11";
}

.attributeUi details[data-nodetype=aggregate][data-aggregator=sum] > summary > .icon:before {
.attributeUi details[data-nodetype=aggregate][data-aggregator=sum] > summary > .icon::before {
/* sum */
content: "\eb73";
}

.attributeUi details[data-nodetype=aggregate][data-aggregator=avg] > summary > .icon:before {
.attributeUi details[data-nodetype=aggregate][data-aggregator=avg] > summary > .icon::before {
/* math-avg */
content: "\f0f4";
}

.attributeUi details[data-nodetype=aggregate][data-aggregator=median] > summary > .icon:before {
.attributeUi details[data-nodetype=aggregate][data-aggregator=median] > summary > .icon::before {
/* calculator */
content: "\eb80";
}

.attributeUi details[data-nodetype=aggregate][data-aggregator=mode] > summary > .icon:before {
.attributeUi details[data-nodetype=aggregate][data-aggregator=mode] > summary > .icon::before {
/* calculator */
content: "\eb80";
}

.attributeUi details[data-nodetype=aggregate][data-aggregator=stdev] > summary > .icon:before {
.attributeUi details[data-nodetype=aggregate][data-aggregator=stdev] > summary > .icon::before {
/* calculator */
content: "\eb80";
}

.attributeUi details[data-nodetype=aggregate][data-aggregator=variance] > summary > .icon:before {
.attributeUi details[data-nodetype=aggregate][data-aggregator=variance] > summary > .icon::before {
/* calculator */
content: "\eb80";
}

.attributeUi details[data-nodetype=aggregate][data-aggregator=entropy] > summary > .icon:before {
.attributeUi details[data-nodetype=aggregate][data-aggregator=entropy] > summary > .icon::before {
/* calculator */
content: "\eb80";
}

.attributeUi details[data-nodetype=aggregate][data-aggregator=kurtosis] > summary > .icon:before {
.attributeUi details[data-nodetype=aggregate][data-aggregator=kurtosis] > summary > .icon::before {
/* calculator */
content: "\eb80";
}

.attributeUi details[data-nodetype=aggregate][data-aggregator=skewness] > summary > .icon:before {
.attributeUi details[data-nodetype=aggregate][data-aggregator=skewness] > summary > .icon::before {
/* calculator */
content: "\eb80";
}
@@ -212,7 +226,7 @@
/**
* Prevent the Attribute UI events when the pivot table is busy:
*/
main.layout:has( .workarea > .pivotTableUiContainer[aria-busy=true] ) > nav#sidebar .attributeUi details > summary > label.attributeUiAxisButton {
main.layout:has( .workarea > .pivotTableUiContainer[aria-busy=true] ) > nav#sidebar .attributeUi details > summary {
pointer-events: none;
}

@@ -241,7 +255,7 @@ main.layout:has( .workarea > .pivotTableUiContainer[aria-busy=true] ) > nav#side
display: none;
}

.attributeUi details > summary > .attributeUiAxisButton[data-axis=rows]:before {
.attributeUi details > summary > .attributeUiAxisButton[data-axis=rows]::before {
/* table-column */
/*
It may seem backward that we're using the table-column icon for the rows axis,
@@ -250,7 +264,7 @@ main.layout:has( .workarea > .pivotTableUiContainer[aria-busy=true] ) > nav#side
content: "\faff";
}

.attributeUi details > summary > .attributeUiAxisButton[data-axis=columns]:before {
.attributeUi details > summary > .attributeUiAxisButton[data-axis=columns]::before {
/* table-row */
/*
It may seem backward that we're using the table-row icon for the columns axis,
@@ -259,7 +273,7 @@ main.layout:has( .workarea > .pivotTableUiContainer[aria-busy=true] ) > nav#side
content: "\fb00";
}

.attributeUi details > summary > .attributeUiAxisButton[data-axis=cells]:before {
.attributeUi details > summary > .attributeUiAxisButton[data-axis=cells]::before {
/* layout grid */
content: "\edba";
}
@@ -279,14 +293,14 @@ main.layout:has( .workarea > .pivotTableUiContainer[aria-busy=true] ) > nav#side
color: var( --huey-icon-color-highlight );
}

.attributeUi details > summary > .attributeUiAxisButton[data-axis=filters]:before {
.attributeUi details > summary > .attributeUiAxisButton[data-axis=filters]::before {
/* filter-plus */
/* content: "\fa02"; */
/* filter */
content: "\eaa5";
}

.attributeUi details > summary > .attributeUiAxisButton[data-axis=filters]:has( > input[type=checkbox]:checked ):before {
.attributeUi details > summary > .attributeUiAxisButton[data-axis=filters]:has( > input[type=checkbox]:checked )::before {
/* filter-x */
/* content: "\fa04"; */
/* filter-off */