Commit
docs: refactor /README.md for GH and pypi
by making ad-hoc distinctions with index.MD for mkdocs.

Minor changes.

Signed-off-by: tarilabs <[email protected]>
tarilabs committed Aug 10, 2024
1 parent d5fb577 commit f201893
Showing 8 changed files with 213 additions and 89 deletions.
132 changes: 48 additions & 84 deletions README.md
@@ -7,131 +7,95 @@

# OCI Artifact for ML model & metadata

[![Python](https://img.shields.io/badge/python%20-3.9%7C3.10%7C3.11%7C3.12-blue)](https://github.com/tarilabs/omlmd)
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](LICENSE)
[![Build](https://github.com/tarilabs/omlmd/actions/workflows/build.yaml/badge.svg)](https://github.com/tarilabs/omlmd/actions/workflows/build.yaml)
[![E2E testing](https://github.com/tarilabs/omlmd/actions/workflows/e2e.yaml/badge.svg)](https://github.com/tarilabs/omlmd/actions/workflows/e2e.yaml)
[![PyPI - Version](https://img.shields.io/pypi/v/omlmd)](https://pypi.org/project/omlmd)

[![Static Badge](https://img.shields.io/badge/Website-green?style=plastic&label=Documentation&labelColor=blue)](https://tarilabs.github.io/omlmd)
[![GitHub Repo stars](https://img.shields.io/github/stars/tarilabs/omlmd?label=GitHub%20Repository)](https://github.com/tarilabs/omlmd)
[![YouTube Channel Subscribers](https://img.shields.io/youtube/channel/subscribers/UCmvDe7dCEmiT4J0XoM6TREQ?label=YouTube%20Playlist)](https://www.youtube.com/watch?v=W4GwIRPXE8E&list=PLdbdefeRIj9SRbg6Hkr15GeyPH0qpk_ww)

This project is a collection of blueprints, patterns and a toolchain (in the form of a Python SDK and CLI) to leverage OCI Artifact and containers for ML model and metadata.

Documentation: https://tarilabs.github.io/omlmd

GitHub repository: https://github.com/tarilabs/omlmd <br/>
YouTube video playlist: https://www.youtube.com/watch?v=W4GwIRPXE8E&list=PLdbdefeRIj9SRbg6Hkr15GeyPH0qpk_ww <br/>
PyPI distribution: https://pypi.org/project/omlmd <br/>

## Installation

> [!TIP]
> We recommend checking out the [Getting Started tutorial](https://tarilabs.github.io/omlmd) in the documentation; the instructions below are provided for a quick overview.
In your Python environment, use:

```
pip install omlmd
```

!!! question "Why do I need a Python environment?"

This SDK follows the same prerequisites as [InstructLab](https://github.com/instructlab/instructlab?tab=readme-ov-file#-installing-ilab) and is intended to offer a Pythonic way to create OCI Artifact for ML model and metadata.
For general CLI tools for containers, we invite you to check out [Podman](https://podman.io) and all the [Containers](https://github.com/containers/#%EF%B8%8F-tools) tooling.

## Push

Store ML model file `model.joblib` and its metadata in the OCI repository at `localhost:8080`:

=== "Python"

```py
from omlmd.helpers import Helper
```py
from omlmd.helpers import Helper

omlmd = Helper()
omlmd.push("localhost:8080/matteo/ml-artifact:latest", "model.joblib", name="Model Example", author="John Doe", license="Apache-2.0", accuracy=9.876543210)
```

=== "CLI"

```sh
omlmd push localhost:8080/mmortari/mlartifact:v1 model.joblib --metadata md.json
```
omlmd = Helper()
omlmd.push("localhost:8080/matteo/ml-artifact:latest", "model.joblib", name="Model Example", author="John Doe", license="Apache-2.0", accuracy=9.876543210)
```

## Pull

Fetch everything in a single pull:

=== "Python"

```py
omlmd.pull(target="localhost:8080/matteo/ml-artifact:latest", outdir="tmp/b")
```

=== "CLI"

```sh
omlmd pull localhost:8080/mmortari/mlartifact:v1 -o tmp/a
```
```py
omlmd.pull(target="localhost:8080/matteo/ml-artifact:latest", outdir="tmp/b")
```

Or fetch only the ML model assets:

=== "Python"

```py
omlmd.pull(target="localhost:8080/matteo/ml-artifact:latest", outdir="tmp/b", media_types=["application/x-mlmodel"])
```

=== "CLI"

```sh
omlmd pull localhost:8080/mmortari/mlartifact:v1 -o tmp/b --media-types "application/x-mlmodel"
```
```py
omlmd.pull(target="localhost:8080/matteo/ml-artifact:latest", outdir="tmp/b", media_types=["application/x-mlmodel"])
```

### Custom Pull: just metadata

The features can be composed to expose higher-level capabilities, such as retrieving only the metadata information.
The implementation intends to follow the OCI Artifact convention.

=== "Python"

```py
md = omlmd.get_config(target="localhost:8080/matteo/ml-artifact:latest")
print(md)
```

=== "CLI"

```sh
omlmd get config localhost:8080/mmortari/mlartifact:v1
```
```py
md = omlmd.get_config(target="localhost:8080/matteo/ml-artifact:latest")
print(md)
```

## Crawl

Client-side crawling of metadata.

_Note: a server-side analogue is coming soon; see the blueprints for reference._

=== "Python"

```py
crawl_result = omlmd.crawl([
"localhost:8080/matteo/ml-artifact:v1",
"localhost:8080/matteo/ml-artifact:v2",
"localhost:8080/matteo/ml-artifact:v3"
])
```

=== "CLI"

```sh
omlmd crawl localhost:8080/mmortari/mlartifact:v1 localhost:8080/mmortari/mlartifact:v2 localhost:8080/mmortari/mlartifact:v3
```
```py
crawl_result = omlmd.crawl([
"localhost:8080/matteo/ml-artifact:v1",
"localhost:8080/matteo/ml-artifact:v2",
"localhost:8080/matteo/ml-artifact:v3"
])
```

### Example query

Demonstrate integration of crawling results with querying (in this case using jQ)
Demonstrate integration of crawling results with querying (in this case using [jQ](https://jqlang.github.io/jq))

> Of the crawled ML OCI artifacts, which one exhibits the max accuracy?
=== "Python"

```py
import jq
jq.compile( "max_by(.config.customProperties.accuracy).reference" ).input_text(crawl_result).first()
```

=== "CLI"

```sh
omlmd crawl \
localhost:8080/mmortari/mlartifact:v1 \
localhost:8080/mmortari/mlartifact:v2 \
localhost:8080/mmortari/mlartifact:v3 \
| jq "max_by(.config.customProperties.accuracy).reference"
```
```py
import jq
jq.compile( "max_by(.config.customProperties.accuracy).reference" ).input_text(crawl_result).first()
```

## To be continued...

Don't forget to check out the [documentation website](https://tarilabs.github.io/omlmd) for more information!
1 change: 0 additions & 1 deletion docs/README.md

This file was deleted.

7 changes: 7 additions & 0 deletions docs/appendixes/appendix-links.md
@@ -8,7 +8,14 @@ This is a collection of all the links from this website:
- https://github.com/kubernetes/enhancements/pull/4639
- https://github.com/lampajr/oci-storage-initializer/blob/main/GET_STARTED.md
- https://github.com/opencontainers/image-spec/blob/main/manifest.md#guidelines-for-artifact-usage
- https://github.com/tarilabs/omlmd
- https://github.com/tarilabs/omlmd/actions/workflows/build.yaml
- https://github.com/tarilabs/omlmd/actions/workflows/e2e.yaml
- https://jqlang.github.io/jq
- https://pypi.org/project/omlmd
- https://tarilabs.github.io/omlmd
- https://www.cncf.io/projects/confidential-containers
- https://www.kubeflow.org/docs/components/model-registry
- https://www.openpolicyagent.org/
- https://www.youtube.com/watch?v=W4GwIRPXE8E&list=PLdbdefeRIj9SRbg6Hkr15GeyPH0qpk_ww

3 changes: 2 additions & 1 deletion docs/appendixes/gen-appendix-links.py
@@ -32,8 +32,9 @@ def get_all_hrefs(dir):
tokens = md.parse(f.read())
node = SyntaxTreeNode(tokens)
for n in node.walk():
if n.type == "link":
if n.type == "link" and n.attrs["href"].startswith("http"):
all_href.append(n.attrs["href"])
all_href = list(dict.fromkeys(all_href))
all_href.sort()
print(all_href)
return all_href
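
For illustration, the filtering and de-duplication introduced in this hunk can be sketched as a standalone helper. The function name `unique_sorted_http_links` and the sample links are hypothetical and not part of the script; the sketch only assumes that `dict.fromkeys` preserves insertion order (guaranteed since Python 3.7), so duplicates are dropped before the final sort.

```python
def unique_sorted_http_links(hrefs):
    """Keep absolute http(s) links only, drop duplicates, then sort."""
    # Relative links such as "demos/demo3.md" are skipped, mirroring the
    # startswith("http") check added in the diff.
    http_only = [h for h in hrefs if h.startswith("http")]
    # dict.fromkeys keeps the first occurrence of each key, in order.
    deduped = list(dict.fromkeys(http_only))
    deduped.sort()
    return deduped


# Hypothetical input: one duplicate and one relative link.
sample = [
    "https://pypi.org/project/omlmd",
    "demos/demo3.md",
    "https://github.com/tarilabs/omlmd",
    "https://pypi.org/project/omlmd",
]
print(unique_sorted_http_links(sample))
```

Since the list is sorted afterwards, a `set` would give the same final result here; `dict.fromkeys` simply matches what the commit uses.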
146 changes: 146 additions & 0 deletions docs/index.md
@@ -0,0 +1,146 @@
<!--
NOTE: headers should align to /README.md
-->

![](https://github.com/tarilabs/omlmd/raw/main/docs/imgs/banner.png)

# OCI Artifact for ML model & metadata

[![Python](https://img.shields.io/badge/python%20-3.9%7C3.10%7C3.11%7C3.12-blue)](https://github.com/tarilabs/omlmd)
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](LICENSE)
[![Build](https://github.com/tarilabs/omlmd/actions/workflows/build.yaml/badge.svg)](https://github.com/tarilabs/omlmd/actions/workflows/build.yaml)
[![E2E testing](https://github.com/tarilabs/omlmd/actions/workflows/e2e.yaml/badge.svg)](https://github.com/tarilabs/omlmd/actions/workflows/e2e.yaml)
[![PyPI - Version](https://img.shields.io/pypi/v/omlmd)](https://pypi.org/project/omlmd)

[![Static Badge](https://img.shields.io/badge/Website-green?style=plastic&label=Documentation&labelColor=blue)](https://tarilabs.github.io/omlmd)
[![GitHub Repo stars](https://img.shields.io/github/stars/tarilabs/omlmd?label=GitHub%20Repository)](https://github.com/tarilabs/omlmd)
[![YouTube Channel Subscribers](https://img.shields.io/youtube/channel/subscribers/UCmvDe7dCEmiT4J0XoM6TREQ?label=YouTube%20Playlist)](https://www.youtube.com/watch?v=W4GwIRPXE8E&list=PLdbdefeRIj9SRbg6Hkr15GeyPH0qpk_ww)

This project is a collection of blueprints, patterns and a toolchain (in the form of a Python SDK and CLI) to leverage OCI Artifact and containers for ML model and metadata.

## Installation

In your Python environment, use:

```
pip install omlmd
```

!!! question "Why do I need a Python environment?"

This SDK follows the same prerequisites as [InstructLab](https://github.com/instructlab/instructlab?tab=readme-ov-file#-installing-ilab) and is intended to offer a Pythonic way to create OCI Artifact for ML model and metadata.
For general CLI tools for containers, we invite you to check out [Podman](https://podman.io) and all the [Containers](https://github.com/containers/#%EF%B8%8F-tools) tooling.

## Push

Store ML model file `model.joblib` and its metadata in the OCI repository at `localhost:8080`:

=== "Python"

```py
from omlmd.helpers import Helper

omlmd = Helper()
omlmd.push("localhost:8080/matteo/ml-artifact:latest", "model.joblib", name="Model Example", author="John Doe", license="Apache-2.0", accuracy=9.876543210)
```

=== "CLI"

```sh
omlmd push localhost:8080/mmortari/mlartifact:v1 model.joblib --metadata md.json --plain-http
```
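
The CLI example reads its metadata from `md.json`, whose contents are not shown on this page. As a minimal sketch, assuming the file holds the same flat key/value pairs that the Python example passes to `push()` (this layout is an assumption, not confirmed by the source), such a file could be produced with the standard library. The helper name `write_metadata` is hypothetical.

```python
import json
from pathlib import Path

# Assumed layout: the same flat key/value pairs used in the Python push example.
metadata = {
    "name": "Model Example",
    "author": "John Doe",
    "license": "Apache-2.0",
    "accuracy": 9.876543210,
}


def write_metadata(path, data):
    """Serialize the metadata dictionary to a JSON file and return its path."""
    target = Path(path)
    target.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return target


if __name__ == "__main__":
    # Writes md.json in the current directory and echoes its content.
    print(write_metadata("md.json", metadata).read_text(encoding="utf-8"))
```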

## Pull

Fetch everything in a single pull:

=== "Python"

```py
omlmd.pull(target="localhost:8080/matteo/ml-artifact:latest", outdir="tmp/b")
```

=== "CLI"

```sh
omlmd pull localhost:8080/mmortari/mlartifact:v1 -o tmp/a --plain-http
```

Or fetch only the ML model assets:

=== "Python"

```py
omlmd.pull(target="localhost:8080/matteo/ml-artifact:latest", outdir="tmp/b", media_types=["application/x-mlmodel"])
```

=== "CLI"

```sh
omlmd pull localhost:8080/mmortari/mlartifact:v1 -o tmp/b --media-types "application/x-mlmodel" --plain-http
```

### Custom Pull: just metadata

The features can be composed to expose higher-level capabilities, such as retrieving only the metadata information.
The implementation intends to follow the OCI Artifact convention.

=== "Python"

```py
md = omlmd.get_config(target="localhost:8080/matteo/ml-artifact:latest")
print(md)
```

=== "CLI"

```sh
omlmd get config localhost:8080/mmortari/mlartifact:v1 --plain-http
```

## Crawl

Client-side crawling of metadata.

_Note: a server-side analogue is coming soon; see the blueprints for reference._

=== "Python"

```py
crawl_result = omlmd.crawl([
"localhost:8080/matteo/ml-artifact:v1",
"localhost:8080/matteo/ml-artifact:v2",
"localhost:8080/matteo/ml-artifact:v3"
])
```

=== "CLI"

```sh
omlmd crawl localhost:8080/mmortari/mlartifact:v1 localhost:8080/mmortari/mlartifact:v2 localhost:8080/mmortari/mlartifact:v3 --plain-http
```

### Example query

Demonstrate integration of crawling results with querying (in this case using [jQ](https://jqlang.github.io/jq))

> Of the crawled ML OCI artifacts, which one exhibits the max accuracy?
=== "Python"

```py
import jq
jq.compile( "max_by(.config.customProperties.accuracy).reference" ).input_text(crawl_result).first()
```

=== "CLI"

```sh
omlmd crawl --plain-http \
localhost:8080/mmortari/mlartifact:v1 \
localhost:8080/mmortari/mlartifact:v2 \
localhost:8080/mmortari/mlartifact:v3 \
| jq "max_by(.config.customProperties.accuracy).reference"
```
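
For readers without jq available, the same query can be approximated in plain Python. This is a sketch under assumptions: the shape of the crawl output (a JSON array whose items carry `reference` and `config.customProperties.accuracy`) is inferred from the jq expression above, and the sample values are made up for illustration.

```python
import json

# Hypothetical crawl output, shaped after the jq expression used above.
crawl_result = json.dumps([
    {"reference": "localhost:8080/matteo/ml-artifact:v1",
     "config": {"customProperties": {"accuracy": 0.81}}},
    {"reference": "localhost:8080/matteo/ml-artifact:v2",
     "config": {"customProperties": {"accuracy": 0.93}}},
    {"reference": "localhost:8080/matteo/ml-artifact:v3",
     "config": {"customProperties": {"accuracy": 0.88}}},
])

entries = json.loads(crawl_result)
# Rough equivalent of jq's max_by(.config.customProperties.accuracy).reference
best = max(entries, key=lambda e: e["config"]["customProperties"]["accuracy"])
print(best["reference"])
```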

## To be continued...
2 changes: 1 addition & 1 deletion docs/overview.md
@@ -115,7 +115,7 @@ First, the ML model and (replicable) metadata get signed as OCI Artifact on a OC

This in turn provides a signature chain, supporting more than _Lineage_, also _Provenance_.

See also demo 3 in this project.
See also [Demo 3](demos/demo3.md) in this project.

## Integration with OPA

2 changes: 1 addition & 1 deletion mkdocs.yml
@@ -12,7 +12,7 @@ theme:
extra_css:
- stylesheets/extra.css
nav:
- Getting Started: 'README.md'
- Getting Started: 'index.md'
- Overview: 'overview.md'
- Demos:
- 'Demo 1: Introduction': 'demos/demo.md'
9 changes: 8 additions & 1 deletion pyproject.toml
@@ -1,10 +1,17 @@
[tool.poetry]
name = "omlmd"
version = "0.1.2"
description = ""
description = "OCI Artifact for ML model & metadata"
authors = ["Matteo Mortari <[email protected]>"]
readme = "README.md"

[project.urls]
Homepage = "https://tarilabs.github.io/omlmd"
Documentation = "https://tarilabs.github.io/omlmd"
Repository = "https://github.com/tarilabs/omlmd.git"
Issues = "https://github.com/tarilabs/omlmd/issues"
Changelog = "https://github.com/tarilabs/omlmd/releases"

[tool.poetry.dependencies]
python = "^3.9"
oras = "^0.1.30"
