feat(transform): add exponential smoothing data transform methods #6522

Merged on Nov 19, 2024 (4 commits)
900 changes: 900 additions & 0 deletions __tests__/integration/snapshots/static/emaBasic.svg
37 changes: 37 additions & 0 deletions __tests__/plots/static/ema-basic.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
import { G2Spec } from '../../../src';

export function emaBasic(): G2Spec {
  return {
    type: 'view',
    children: [
      {
        type: 'line',
        data: {
          type: 'fetch',
          value: 'data/aapl.csv',
          transform: [
            {
              type: 'ema',
              field: 'close',
              alpha: 0.8,
            },
          ],
        },
      },
      {
        type: 'line',
        style: {
          opacity: 0.3,
        },
        data: {
          type: 'fetch',
          value: 'data/aapl.csv',
        },
      },
    ],
    encode: {
      x: 'date',
      y: 'close',
    },
  };
}
1 change: 1 addition & 0 deletions __tests__/plots/static/index.ts
@@ -62,6 +62,7 @@ export { aaplLineAreaBasicSample } from './aapl-line-area-basic-sample';
export { aaplAreaLineSmoothSample } from './aapl-area-line-smooth-sample';
export { aaplLinePointBasicSample } from './aapl-line-point-basic-sample';
export { speciesDensityBasic } from './species-density-basic';
export { emaBasic } from './ema-basic';
export { speciesViolinBasic } from './species-violin-basic';
export { speciesViolinBasicPolar } from './species-violin-basic-polar';
export { unemploymentLineMultiSeries } from './unemployment-line-multi-series';
101 changes: 101 additions & 0 deletions __tests__/unit/data/ema.spec.ts
@@ -0,0 +1,101 @@
import { EMA } from '../../../src/data';

describe('EMA', () => {
  it('EMA({...}) returns a function that is used to exponentially smooth the data', async () => {
    const transform = EMA({ alpha: 0.6, field: 'y' });
    const data = [
      { x: 1, y: 2 },
      { x: 4, y: 5 },
      { x: 5, y: 8 },
    ];
    const r = await transform(data);
    r.forEach((d, i) => {
      if (i > 0) {
        expect(d.y).not.toBe(data[i].y);
      } else {
        // The first value seeds the average, so it is unchanged.
        expect(d.y).toBe(data[i].y);
      }
      expect(d.x).toBe(data[i].x);
    });
  });

  it('The "field" option determines which field is smoothed', () => {
    const transform = EMA({ alpha: 0.6, field: 'x' });
    const data = [
      { x: 1, y: 2 },
      { x: 4, y: 5 },
      { x: 5, y: 8 },
    ];
    const r = transform(data);
    r.forEach((d, i) => {
      if (i > 0) {
        expect(d.x).not.toBe(data[i].x);
      } else {
        expect(d.x).toBe(data[i].x);
      }
      expect(d.y).toBe(data[i].y);
    });
  });

  it('The "as" option avoids overwriting the original data', () => {
    const transform = EMA({ alpha: 0.6, field: 'y', as: 'smooth' });
    const data = [
      { x: 1, y: 2 },
      { x: 4, y: 5 },
      { x: 5, y: 8 },
    ];
    const r = transform(data);
    expect(r).toEqual([
      {
        x: 1,
        y: 2,
        smooth: 2,
      },
      {
        x: 4,
        y: 5,
        smooth: 3.2,
      },
      {
        x: 5,
        y: 8,
        smooth: 5.12,
      },
    ]);
  });

  it('should handle missing field values', function () {
    const data = [{ x: 1 }, { y: 2 }, { y: 3 }, { x: 4 }];
    const result = EMA({ field: 'y' })(data);
    expect(result[0].y).toBe(undefined);
    expect(result[1].y).not.toBe(undefined);
    expect(result[2].y).not.toBe(undefined);
    expect(result[3].y).toBe(undefined);
  });

  it('should handle missing alpha value', function () {
    const data = [{ y: 1 }, { y: 2 }, { y: 3 }];
    const r = EMA({ field: 'y' })(data);
    r.forEach((d, i) => {
      if (i > 0) {
        expect(d.y).not.toBe(data[i].y);
      } else {
        expect(d.y).toBe(data[i].y);
      }
    });
  });

  it('should throw when alpha is outside the range 0 to 1', function () {
    const data = [{ y: 1 }, { y: 2 }, { y: 3 }];
    let alpha = 1.1;
    expect(() => EMA({ field: 'y', alpha })(data)).toThrowError();

    alpha = -0.1;
    expect(() => EMA({ field: 'y', alpha })(data)).toThrowError();
  });

  it('should return an empty array for empty input', function () {
    const data = [];
    const result = EMA({ field: 'y' })(data);
    expect(result).toEqual([]);
  });
});
2 changes: 2 additions & 0 deletions __tests__/unit/lib/core.spec.ts
@@ -156,6 +156,7 @@ import {
KDE,
Log as DataLog,
WordCloud,
EMA,
} from '../../../src/data';
import {
OverflowHide,
@@ -182,6 +183,7 @@ describe('corelib', () => {
'data.join': Join,
'data.kde': KDE,
'data.log': DataLog,
'data.ema': EMA,
'data.wordCloud': WordCloud,
'transform.stackY': StackY,
'transform.binX': BinX,
2 changes: 2 additions & 0 deletions __tests__/unit/lib/std.spec.ts
@@ -170,6 +170,7 @@ import {
Arc,
Log as DataLog,
WordCloud,
EMA,
} from '../../../src/data';
import {
OverflowHide,
@@ -196,6 +197,7 @@ describe('stdlib', () => {
'data.join': Join,
'data.kde': KDE,
'data.venn': Venn,
'data.ema': EMA,
'data.wordCloud': WordCloud,
'data.cluster': Cluster,
'data.arc': Arc,
6 changes: 6 additions & 0 deletions site/docs/spec/data/ema.en.md
@@ -0,0 +1,6 @@
---
title: ema
order: 1
---

<embed src="@/docs/spec/data/ema.zh.md"></embed>
116 changes: 116 additions & 0 deletions site/docs/spec/data/ema.zh.md
@@ -0,0 +1,116 @@
---
title: ema
order: 1
---


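EMA (Exponential Moving Average) is a commonly used smoothing algorithm that computes the exponentially weighted moving average of a series. It smooths the data by giving more weight to recent values, which reduces noise and fluctuation.

In model training, for example, EMA can be used to smooth the data so that its trend is easier to observe.

As the formula below shows, the larger α is, the stronger the smoothing effect, because α multiplies the previous average $EMA_{t-1}$ rather than the new value $P_t$. This is the complement of the α used in the Wikipedia article linked below.


$EMA_t = (1 - \alpha) \cdot P_t + \alpha \cdot EMA_{t-1}$

For details, see the [Wikipedia article on exponential smoothing](https://en.wikipedia.org/wiki/Exponential_smoothing).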
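
Unrolling the recurrence makes the weighting explicit. The expansion below is a sketch that assumes the series is seeded with $EMA_0 = P_0$, which is consistent with the first row of the example in the next section; it is derived from the formula above rather than taken from the library.

$$
EMA_t = (1 - \alpha) \sum_{k=0}^{t-1} \alpha^{k} \cdot P_{t-k} + \alpha^{t} \cdot P_0
$$

Each step back in time multiplies the weight by α. With α = 0.6, the three most recent points carry weights 0.4, 0.24, and 0.144, so older points fade out geometrically but never vanish entirely.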



## Getting started

```ts
const data = [
  { x: 1, y: 2 },
  { x: 4, y: 5 },
  { x: 5, y: 8 },
];

chart
  .data({
    type: 'inline',
    value: data,
    transform: [
      {
        type: 'ema',
        field: 'y',
        alpha: 0.6,
        as: 'other',
      },
    ],
  });
```

After the transform above is applied, the data becomes the following. Because `as` is set to `other`, the original `y` values are left untouched:

```js
[
  {
    "x": 1,
    "y": 2,
    "other": 2,
  },
  {
    "x": 4,
    "y": 5,
    "other": 3.2,
  },
  {
    "x": 5,
    "y": 8,
    "other": 5.12,
  }
];
```
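
To see where the `other` values come from, the recurrence can be traced in a few lines. The following is a minimal standalone sketch, not the library's implementation: `emaSketch` is a hypothetical helper name, and it assumes every input value is a finite number.

```ts
// Hypothetical helper for illustration only; not part of the G2 API.
// Applies EMA_t = (1 - alpha) * P_t + alpha * EMA_{t-1}, seeded with the first value.
function emaSketch(values: number[], alpha: number): number[] {
  const smoothed: number[] = [];
  let last = values[0];
  for (const point of values) {
    last = alpha * last + (1 - alpha) * point;
    smoothed.push(last);
  }
  return smoothed;
}

const smoothed = emaSketch([2, 5, 8], 0.6);
// Round for display, since floating-point arithmetic may leave tiny residues.
console.log(smoothed.map((v) => v.toFixed(2)));
```

Step by step: the first value seeds the average (2), the second is 0.6 × 2 + 0.4 × 5 = 3.2, and the third is 0.6 × 3.2 + 0.4 × 8 = 5.12.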

## Example

```js | ob
(() => {
  const chart = new G2.Chart();

  chart.options({
    type: 'view',
    children: [
      {
        type: 'line',
        data: {
          type: 'fetch',
          value: 'https://gw.alipayobjects.com/os/bmw-prod/551d80c6-a6be-4f3c-a82a-abd739e12977.csv',
          transform: [
            {
              type: 'ema',
              field: 'close',
              alpha: 0.8,
            },
          ],
        },
      },
      {
        type: 'line',
        style: {
          opacity: 0.3,
        },
        data: {
          type: 'fetch',
          value: 'https://gw.alipayobjects.com/os/bmw-prod/551d80c6-a6be-4f3c-a82a-abd739e12977.csv',
        },
      },
    ],
    encode: {
      x: 'date',
      y: 'close',
    },
  });

  return chart.render().then((chart) => chart.getContainer());
})();
```

## Options

| Property | Description | Type | Default |
| -------- | ----------- | ---- | ------- |
| field | Name of the field to smooth | string | `y` |
| alpha | Smoothing factor, in the range 0 to 1; larger values smooth more | number | 0.6 |
| as | Name of the field that stores the result. Defaults to the value of `field`; set a different name to avoid overwriting the original data | string | same as `field` |
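
The interaction between `field`, `as`, and the defaults can be sketched as follows. This is an illustrative re-implementation under stated assumptions, not the library source: `applyEmaSketch`, `Row`, and `EmaSketchOptions` are hypothetical names, and any non-numeric value is treated as missing.

```ts
// Hypothetical sketch of the option semantics; names are illustrative, not the G2 source.
type Row = Record<string, unknown>;

interface EmaSketchOptions {
  field?: string;
  alpha?: number;
  as?: string;
}

function applyEmaSketch(data: Row[], options: EmaSketchOptions = {}): Row[] {
  // Defaults mirror the table: field 'y', alpha 0.6, as = field.
  const { field = 'y', alpha = 0.6, as = field } = options;
  if (alpha < 0 || alpha > 1) {
    throw new Error('alpha must be between 0 and 1.');
  }
  let last: number | undefined;
  return data.map((row) => {
    const point = row[field];
    // Assumption: non-numeric (including missing) values pass through
    // and do not update the running average.
    if (typeof point !== 'number') {
      return { ...row, [as]: point };
    }
    // The first numeric value seeds the average; later values are blended in.
    last = last === undefined ? point : alpha * last + (1 - alpha) * point;
    return { ...row, [as]: last };
  });
}

const rows = [
  { x: 1, y: 2 },
  { x: 4, y: 5 },
  { x: 5, y: 8 },
];
// Without `as`, the smoothed value replaces `y` in the returned rows.
const overwritten = applyEmaSketch(rows);
// With `as`, `y` is kept and the smoothed value is stored under `smooth`.
const preserved = applyEmaSketch(rows, { as: 'smooth' });
```

In both cases new row objects are returned and the input array is not mutated, so "overwriting" only affects the transformed copy.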
62 changes: 62 additions & 0 deletions src/data/ema.ts
@@ -0,0 +1,62 @@
import { DataComponent as DC } from '../runtime';
import { EMADataTransform } from '../spec';

export function ema(values: number[], alpha: number): number[] {
  if (alpha < 0 || alpha > 1) {
    throw new Error('alpha must be between 0 and 1.');
  }
  if (values.length === 0) {
    return [];
  }

  let last = values[0];
  const smoothed: number[] = [];

  for (const point of values) {
    if (point === null || point === undefined) {
      // Missing values are passed through unchanged and do not update the running average.
      smoothed.push(point);
      console.warn('EMA:The value is null or undefined', values);
      continue;
    }

    if (last === null || last === undefined) {
      // Seed the running average with the first defined value.
      last = point;
    }

    const smoothedVal = last * alpha + (1 - alpha) * point;
    smoothed.push(smoothedVal);
    last = smoothedVal;
  }

  return smoothed;
}

export type EMAOptions = Omit<EMADataTransform, 'type'>;

/**
 * https://en.wikipedia.org/wiki/Exponential_smoothing
 * @param options The `field`, `alpha` and `as` options of the transform.
 * @returns A function that maps each row to a copy with the smoothed value stored under `as`.
 */
export const EMA: DC<EMAOptions> = (options) => {
  const { field = 'y', alpha = 0.6, as = field } = options;

  return (data) => {
    const values = data.map((d) => {
      return d[field];
    });

    const out = ema(values, alpha);

    return data.map((d, i) => {
      return {
        ...d,
        [as]: out[i],
      };
    });
  };
};

EMA.props = {};
2 changes: 2 additions & 0 deletions src/data/index.ts
@@ -18,6 +18,7 @@ export { Slice } from './slice';
export { KDE } from './kde';
export { Venn } from './venn';
export { Log } from './log';
export { EMA } from './ema';

export type { FetchOptions } from './fetch';
export type { FoldOptions } from './fold';
@@ -39,3 +40,4 @@ export type { SliceOptions } from './slice';
export type { KDEOptions } from './kde';
export type { VennOptions } from './venn';
export type { LogDataOptions } from './log';
export type { EMAOptions } from './ema';
2 changes: 2 additions & 0 deletions src/lib/core.ts
@@ -154,6 +154,7 @@ import {
Sort as DataSort,
KDE as DataKDE,
Log as DataLog,
EMA as DataEMA,
WordCloud,
} from '../data';
import {
@@ -181,6 +182,7 @@ export function corelib() {
'data.kde': DataKDE,
'data.log': DataLog,
'data.wordCloud': WordCloud,
'data.ema': DataEMA,
'transform.stackY': StackY,
'transform.binX': BinX,
'transform.bin': Bin,