Add usage examples to documentation📖 (#9)
* Add simplified type hints

* Improve API

* Remove checks for subsets

* Add examples

* Extend documentation

* sync readme.md with index.md
KarelZe authored Dec 4, 2023
1 parent 6afad7d commit c915346
Showing 5 changed files with 188 additions and 39 deletions.
118 changes: 90 additions & 28 deletions README.md
@@ -1,41 +1,103 @@
# Trade classification for python 🐍

![GitHubActions](https://github.com/karelze/tclf//actions/workflows/tests.yaml/badge.svg)
![Codecov](https://codecov.io/gh/karlze/tclf/branch/master/graph/badge.svg)

# tclf 💸
`tclf` is a [`scikit-learn`](https://scikit-learn.org/stable/)-compatible implementation of trade classification algorithms that classify financial market transactions into buyer- and seller-initiated trades.

[`scikit-learn`](https://scikit-learn.org/stable/)-compatible implementation of popular trade classification algorithms to classify financial markets transactions into buyer- and seller-initiated trades.
The key features are:

## Algorithms
* **Easy**: Easy to use and learn.
* **Sklearn-compatible**: Compatible with the sklearn API. Use sklearn metrics and visualizations.
* **Feature complete**: Wide range of supported algorithms. Use the algorithms individually or stack them like LEGO blocks.

- Tick test
- Quote rule
- LR algorithm
- EMO rule
- CLNV rule
- Depth rule
- Tradesize rule
## Installation
```console
$ pip install .
---> 100%
Successfully installed tclf-0.0.0
```

## Minimal Example

## Usage
Let's start simple: classify all trades by the quote rule, and randomly classify any trades that the quote rule cannot classify.

Create a `main.py` with:
```python
>>> X = pd.DataFrame(
... [
... [1.5, 1, 3],
... [2.5, 1, 3],
... [1.5, 3, 1],
... [2.5, 3, 1],
... [1, np.nan, 1],
... [3, np.nan, np.nan],
... ],
... columns=["trade_price", "bid_ex", "ask_ex"],
... )
>>> y = pd.Series([-1, 1, 1, -1, -1, 1])
>>> clf = ClassicalClassifier(layers=[("quote", "ex")], strategy="const")
>>> clf.fit(X, y)
ClassicalClassifier(layers=[('quote', 'ex')], strategy='const')
>>> pred = clf.predict_proba(X)
import numpy as np
import pandas as pd

from tclf.classical_classifier import ClassicalClassifier

X = pd.DataFrame(
    [
        [1.5, 1, 3],
        [2.5, 1, 3],
        [1.5, 3, 1],
        [2.5, 3, 1],
        [1, np.nan, 1],
        [3, np.nan, np.nan],
    ],
    columns=["trade_price", "bid_ex", "ask_ex"],
)
y = pd.Series([1, 1, 1, 1, 1, 1])

clf = ClassicalClassifier(layers=[("quote", "ex")], strategy="random")
clf.fit(X, y)
probs = clf.predict_proba(X)
print(probs)
```
Run your script with:
```console
python main.py
```
In this example, input data is available as a `pd.DataFrame`/`pd.Series` with columns conforming to our [naming conventions](https://karelze.github.io/tclf/naming_conventions/).

The parameter `layers=[("quote", "ex")]` sets the quote rule at the exchange level, and `strategy="random"` specifies the fallback strategy for unclassified trades. The true label `y` is not used in classification; by convention, it is accepted only for API consistency.
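
To make the quote rule concrete, here is a minimal pure-Python sketch. It is not `tclf`'s implementation. It assumes the textbook definition: a trade priced above the quote midpoint is a buy (`1`), a trade below it is a sell (`-1`), and a trade at the midpoint or with missing quotes stays unclassified. The function name `quote_rule` and the use of `None` for unclassified trades are illustrative assumptions, and the sketch does not try to replicate how `tclf` treats crossed quotes (bid above ask).

```python
import math
from typing import Optional


def quote_rule(trade_price: float, bid: float, ask: float) -> Optional[int]:
    """Hypothetical helper: classify one trade against the quote midpoint."""
    # Missing quotes: the rule cannot decide.
    if math.isnan(bid) or math.isnan(ask):
        return None
    mid = 0.5 * (bid + ask)
    if trade_price > mid:
        return 1  # buyer-initiated
    if trade_price < mid:
        return -1  # seller-initiated
    return None  # trade at the midpoint stays unclassified


# (trade_price, bid_ex, ask_ex) tuples, made up for illustration
trades = [
    (1.5, 1.0, 3.0),
    (2.5, 1.0, 3.0),
    (2.0, 1.0, 3.0),
    (1.0, float("nan"), 1.0),
]
labels = [quote_rule(*trade) for trade in trades]
print(labels)
```

Trades left as `None` here are the ones a fallback strategy would have to handle.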

## Advanced Example
Often it is desirable to classify on both exchange-level data and NBBO data. Also, data might only be available as a NumPy array. So let's extend the previous example by applying the quote rule first at the exchange level and then at the NBBO level, and by handling all remaining trades with the constant fallback (`strategy="const"`).

```python hl_lines="6 16 17 20"
import numpy as np
from sklearn.metrics import accuracy_score

from tclf.classical_classifier import ClassicalClassifier

X = np.array(
    [
        [1.5, 1, 3, 2, 2.5],
        [2.5, 1, 3, 1, 3],
        [1.5, 3, 1, 1, 3],
        [2.5, 3, 1, 1, 3],
        [1, np.nan, 1, 1, 3],
        [3, np.nan, np.nan, 1, 3],
    ]
)
y_true = np.array([-1, 1, 1, -1, -1, 1])
features = ["trade_price", "bid_ex", "ask_ex", "bid_best", "ask_best"]

clf = ClassicalClassifier(
    layers=[("quote", "ex"), ("quote", "best")], strategy="const", features=features
)
clf.fit(X, y_true)

y_pred = clf.predict(X)
print(accuracy_score(y_true, y_pred))
```
Detailed documentation is available [here](https://KarelZe.github.io/tclf/).
In this example, input data is available as NumPy arrays with both exchange (`"ex"`) and NBBO (`"best"`) data. We set the layers parameter to `layers=[("quote", "ex"), ("quote", "best")]` to classify trades first on subset `"ex"` and the remaining trades on subset `"best"`. Additionally, we have to set `ClassicalClassifier(..., features=features)` to pass column information to the classifier.

Like before, column/feature names must follow our [naming conventions](https://karelze.github.io/tclf/naming_conventions/).
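
The layering idea can be sketched in plain Python as follows. This is an illustration under stated assumptions, not `tclf`'s code: the helpers `make_quote_rule` and `classify_layered` are hypothetical, the quote rule is the textbook midpoint comparison, and the fallback value `0` for trades that no layer can classify is an assumption. Only the column names are taken from the example above.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence

Row = Dict[str, float]
Rule = Callable[[Row], Optional[int]]


def make_quote_rule(subset: str) -> Rule:
    """Hypothetical factory: quote rule reading `bid_<subset>` and `ask_<subset>`."""

    def rule(row: Row) -> Optional[int]:
        bid, ask = row[f"bid_{subset}"], row[f"ask_{subset}"]
        if math.isnan(bid) or math.isnan(ask):
            return None
        mid = 0.5 * (bid + ask)
        if row["trade_price"] > mid:
            return 1
        if row["trade_price"] < mid:
            return -1
        return None

    return rule


def classify_layered(
    rows: Sequence[Row], layers: Sequence[Rule], fallback: int
) -> List[int]:
    """Apply rules in order; the first rule that decides a trade wins."""
    preds = []
    for row in rows:
        label = None
        for rule in layers:
            label = rule(row)
            if label is not None:
                break
        # Assumed constant fallback for trades that no layer could classify.
        preds.append(fallback if label is None else label)
    return preds


features = ["trade_price", "bid_ex", "ask_ex", "bid_best", "ask_best"]
data = [
    [1.5, 1.0, 3.0, 2.0, 2.5],  # decided by the "ex" layer
    [1.0, float("nan"), 1.0, 1.0, 3.0],  # "ex" quotes incomplete, decided by "best"
    [2.0, float("nan"), float("nan"), 1.0, 3.0],  # at the "best" midpoint, falls back
]
rows = [dict(zip(features, values)) for values in data]
layered_preds = classify_layered(
    rows, [make_quote_rule("ex"), make_quote_rule("best")], fallback=0
)
print(layered_preds)
```

Each trade is passed down the list of rules only until one of them returns a label, which is why the order of `layers` matters.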

## Supported Algorithms

- (Rev.) Tick test
- Quote rule
- (Rev.) LR algorithm
- (Rev.) EMO rule
- (Rev.) CLNV rule
- Depth rule
- Tradesize rule

## References

99 changes: 93 additions & 6 deletions docs/index.md
@@ -1,14 +1,101 @@
# Trade classification for python
# Trade classification for python 🐍

`tclf` is [`scikit-learn`](https://scikit-learn.org/stable/)-compatible implementation of popular trade classification algorithms to classify financial markets transactions into buyer- and seller-initiated trades.
![GitHubActions](https://github.com/karelze/tclf//actions/workflows/tests.yaml/badge.svg)
![Codecov](https://codecov.io/gh/karlze/tclf/branch/master/graph/badge.svg)

`tclf` is a [`scikit-learn`](https://scikit-learn.org/stable/)-compatible implementation of trade classification algorithms that classify financial market transactions into buyer- and seller-initiated trades.

The key features are:

* **Easy**: Easy to use and learn.
* **Sklearn-compatible**: Compatible with the sklearn API. Use sklearn metrics and visualizations.
* **Feature complete**: Wide range of supported algorithms. Use the algorithms individually or stack them like LEGO blocks.

## Installation
```console
$ pip install .
---> 100%
Successfully installed tclf-0.0.0
```

## Minimal Example

Let's start simple: classify all trades by the quote rule, and randomly classify any trades that the quote rule cannot classify.

Create a `main.py` with:
```python
import numpy as np
import pandas as pd

from tclf.classical_classifier import ClassicalClassifier

X = pd.DataFrame(
    [
        [1.5, 1, 3],
        [2.5, 1, 3],
        [1.5, 3, 1],
        [2.5, 3, 1],
        [1, np.nan, 1],
        [3, np.nan, np.nan],
    ],
    columns=["trade_price", "bid_ex", "ask_ex"],
)
y = pd.Series([1, 1, 1, 1, 1, 1])

clf = ClassicalClassifier(layers=[("quote", "ex")], strategy="random")
clf.fit(X, y)
probs = clf.predict_proba(X)
print(probs)
```
Run your script with:
```console
python main.py
```
In this example, input data is available as a `pd.DataFrame`/`pd.Series` with columns conforming to our [naming conventions](https://karelze.github.io/tclf/naming_conventions/).

The parameter `layers=[("quote", "ex")]` sets the quote rule at the exchange level, and `strategy="random"` specifies the fallback strategy for unclassified trades. The true label `y` is not used in classification; by convention, it is accepted only for API consistency.
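
What a fallback strategy does can be sketched as below. This is a hedged illustration rather than `tclf`'s implementation: the helper `fill_unclassified`, the `None` marker for unclassified trades, the `seed` argument, and the value `0` used for `"const"` are all assumptions made for this example.

```python
import random
from typing import List, Optional


def fill_unclassified(
    labels: List[Optional[int]], strategy: str = "random", seed: int = 42
) -> List[int]:
    """Hypothetical helper: replace unclassified trades (None) with a fallback label."""
    if strategy == "random":
        rng = random.Random(seed)  # seeded so that the sketch is reproducible
        return [rng.choice((-1, 1)) if label is None else label for label in labels]
    if strategy == "const":
        # Assumed constant; chosen here only for illustration.
        return [0 if label is None else label for label in labels]
    raise ValueError(f"unknown strategy: {strategy!r}")


partial = [-1, 1, None, None]  # labels after the rule layers, None = undecided
filled = fill_unclassified(partial, strategy="random")
const_filled = fill_unclassified(partial, strategy="const")
print(filled)
print(const_filled)
```

Labels that a rule has already assigned are never overwritten; only the undecided trades receive the fallback.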

## Advanced Example
Often it is desirable to classify on both exchange-level data and NBBO data. Also, data might only be available as a NumPy array. So let's extend the previous example by applying the quote rule first at the exchange level and then at the NBBO level, and by handling all remaining trades with the constant fallback (`strategy="const"`).

```python hl_lines="6 16 17 20"
import numpy as np
from sklearn.metrics import accuracy_score

from tclf.classical_classifier import ClassicalClassifier

X = np.array(
    [
        [1.5, 1, 3, 2, 2.5],
        [2.5, 1, 3, 1, 3],
        [1.5, 3, 1, 1, 3],
        [2.5, 3, 1, 1, 3],
        [1, np.nan, 1, 1, 3],
        [3, np.nan, np.nan, 1, 3],
    ]
)
y_true = np.array([-1, 1, 1, -1, -1, 1])
features = ["trade_price", "bid_ex", "ask_ex", "bid_best", "ask_best"]

clf = ClassicalClassifier(
    layers=[("quote", "ex"), ("quote", "best")], strategy="const", features=features
)
clf.fit(X, y_true)

y_pred = clf.predict(X)
print(accuracy_score(y_true, y_pred))
```
In this example, input data is available as NumPy arrays with both exchange (`"ex"`) and NBBO (`"best"`) data. We set the layers parameter to `layers=[("quote", "ex"), ("quote", "best")]` to classify trades first on subset `"ex"` and the remaining trades on subset `"best"`. Additionally, we have to set `ClassicalClassifier(..., features=features)` to pass column information to the classifier.

Like before, column/feature names must follow our [naming conventions](https://karelze.github.io/tclf/naming_conventions/).

## Supported Algorithms

- Tick test
- (Rev.) Tick test
- Quote rule
- LR algorithm
- EMO rule
- CLNV rule
- (Rev.) LR algorithm
- (Rev.) EMO rule
- (Rev.) CLNV rule
- Depth rule
- Tradesize rule

Empty file added docs/naming_conventions.md
Empty file.
5 changes: 3 additions & 2 deletions mkdocs.yml
@@ -8,7 +8,7 @@ theme:
primary: black
accent: teal
icon:
repo: fontawesome/brands/github-alt
repo: fontawesome/brands/github

repo_name: karelze/tclf
repo_url: https://github.com/karelze/tclf
@@ -17,6 +17,7 @@ edit_uri: ""
nav:
- Home: index.md
- API reference: reference.md
- Naming conventions: naming_conventions.md

markdown_extensions:
- toc:
@@ -46,7 +47,7 @@ plugins:

extra:
social:
- icon: fontawesome/brands/github-alt
- icon: fontawesome/brands/github
link: https://github.com/karelze/tclf
- icon: fontawesome/brands/linkedin
link: https://www.linkedin.com/in/markus-bilz/
5 changes: 2 additions & 3 deletions pyproject.toml
@@ -3,11 +3,11 @@ requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[project]
name = "otc"
name = "tclf"
authors = [
{ name="Markus Bilz", email="[email protected]" },
]
description = "Code to perform option trade classification using machine learning."
description = "Code to perform trade classification using trade classification algorithms."
readme = "README.md"
license = {file = "LICENSE.txt"}
requires-python = ">=3.8"
@@ -25,7 +25,6 @@ dependencies = [
"scikit-learn"
]


dynamic = ["version"]

[project.urls]