I've now had the chance to test the internal deployment of cime4r more extensively with larger datasets (mostly 200,000 points, but sometimes up to 800,000). Overall it is working much better than before, but I have noticed a few remaining issues:
Dataset upload: a column named 'shap_0' is still required for all datasets.
Projection: for larger datasets (> 400,000 points), the projection occasionally does not update or fails for no obvious reason, but re-running the projection usually succeeds.
Filtering: the filters shown do not match the filtered data (e.g. the filter labels show -1 to 11, but the data points only range from 0 to 11). This also happens with more complex filter settings, not just on initialisation (a small sketch of this suspected mismatch follows the list).
Filtering: if the dataset is larger than 10,000 points it should be 'randomly' subsampled, but the current filtering is not random: in one example only catalyst 'c' is shown, although the dataset contains at least 5 different catalyst values (see the sampling sketch near the end of this issue).
Aggregate: sometimes fails for large datasets (around 800,000 points), and selecting a single hexagon to inspect the points inside it (on the 'selection' tab) sometimes does not work. I have not yet figured out in which cases it does or does not work.
General: for large datasets the whole interface (particularly aggregation and encoding) can be very slow and sluggish. This is especially problematic when the session times out while working on a dataset and you have to start from scratch.
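To illustrate the suspected filter-label mismatch from the list above, here is a small pandas sketch. It is purely illustrative, not cime4r code: treating -1 as a sentinel value of `experiment_cycle` is an assumption based on the example values in this issue, not a confirmed diagnosis.

```python
import pandas as pd

# Hypothetical reproduction of the label mismatch: if the slider labels are
# derived from the full column (which contains a -1 sentinel) while the plot
# only shows the filtered rows, the labels read -1..11 but the points read 0..11.
df = pd.DataFrame({"experiment_cycle": [-1, 0, 3, 7, 11]})
visible = df[df["experiment_cycle"] >= 0]

label_range = (df["experiment_cycle"].min(), df["experiment_cycle"].max())            # (-1, 11)
point_range = (visible["experiment_cycle"].min(), visible["experiment_cycle"].max())  # (0, 11)
print(label_range, point_range)
```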
For me, the biggest problems above are the filtering issues (both the random sampling and the filter selection being displayed correctly).
For the selection of 10,000 data points, the expected behaviour would be one of the following:
show all experiments with experiment_cycle > -1, then a completely random selection of the rest
always show the first 10,000 points (we could randomise the data when preparing it)
a completely random selection of 10,000 points
(I would not mind which one, but it would be important to understand what is actually happening.)
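To make the three options concrete, here is a minimal pandas sketch of what each strategy could look like. This is illustrative only, not the actual cime4r implementation: the DataFrame `df`, the limit constant `N_MAX`, and the helper `subsample` are hypothetical names; only the `experiment_cycle` column and the 10,000-point cutoff come from the description above.

```python
import pandas as pd

N_MAX = 10_000  # hypothetical display limit, matching the 10,000-point cutoff above

def subsample(df: pd.DataFrame, option: int, seed: int = 42) -> pd.DataFrame:
    """Illustrative sampling strategies for datasets larger than N_MAX."""
    if len(df) <= N_MAX:
        return df

    if option == 1:
        # Option 1: keep all experiments with experiment_cycle > -1,
        # then top up with a completely random selection of the rest.
        keep = df[df["experiment_cycle"] > -1]
        rest = df[df["experiment_cycle"] <= -1]
        n_fill = min(max(N_MAX - len(keep), 0), len(rest))
        return pd.concat([keep, rest.sample(n=n_fill, random_state=seed)])

    if option == 2:
        # Option 2: always show the first N_MAX points
        # (the data could be randomised once during preparation).
        return df.head(N_MAX)

    # Option 3: a completely random selection of N_MAX points.
    return df.sample(n=N_MAX, random_state=seed)
```

Whichever variant the deployment actually implements, documenting it would already resolve most of the confusion.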