
Add benchmarks for machine learning application #189

Open
corona10 opened this issue May 2, 2022 · 4 comments
Comments

@corona10
Member

corona10 commented May 2, 2022

Today, machine learning applications are an important use case of Python.
I haven't prepared concrete benchmark implementations yet, but I would like to suggest guidelines for machine learning benchmarks.

A. Each benchmark should provide all of the following implementations, and they should produce the same result.

  • Pure Python-based implementation (might not be easy), or a SymPy-based implementation.
  • Numpy-based implementation.
  • (optional) An implementation based on a well-known framework such as scikit-learn, TensorFlow, or PyTorch.
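Guideline A could be sketched as follows. This is a minimal, hypothetical example (the task, data, and tolerance are mine, not part of the proposal): the same ordinary-least-squares line fit implemented once in pure Python and once with NumPy, with a check that both produce the same result.

```python
import numpy as np

x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]  # exactly y = 2x + 1

def fit_pure_python(xs, ys):
    """Closed-form OLS fit of a line, using only builtins."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(v * v for v in xs)
    sxy = sum(a * b for a, b in zip(xs, ys))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept

def fit_numpy(xs, ys):
    """The same fit via NumPy's polynomial least squares."""
    slope, intercept = np.polyfit(xs, ys, 1)
    return float(slope), float(intercept)

pp = fit_pure_python(x, y)
nf = fit_numpy(x, y)
# Guideline A: both implementations must agree on the result.
assert all(abs(a - b) < 1e-8 for a, b in zip(pp, nf))
```

The cross-check at the end is the important part: it keeps the two implementations honest so that timing differences between them reflect the interpreter and libraries, not different answers.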

B. Each of the following algorithm-based benchmarks should provide both a training benchmark and an inference benchmark.

  • Regression algorithm
  • Decision tree algorithm
  • Clustering algorithm
  • Nearest neighbor algorithm
  • Matrix factorization
  • ... (Please suggest!)
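To illustrate the training/inference split in guideline B, here is a minimal pure-Python sketch using a regression algorithm. The `bench` helper is a stand-in for a real harness such as pyperf, and the data and loop counts are illustrative assumptions.

```python
import random
import time

def bench(func, loops=200):
    """Average wall-clock seconds per call (stand-in for a real harness)."""
    start = time.perf_counter()
    for _ in range(loops):
        func()
    return (time.perf_counter() - start) / loops

random.seed(0)
xs = [random.random() for _ in range(1_000)]
ys = [3.0 * v + 0.5 for v in xs]

def train(xs, ys):
    """The 'training' phase: closed-form OLS fit."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(v * v for v in xs)
    sxy = sum(a * b for a, b in zip(xs, ys))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return slope, (sy - slope * sx) / n

slope, intercept = train(xs, ys)

def infer(xs):
    """The 'inference' phase: apply the fitted model."""
    return [slope * v + intercept for v in xs]

train_time = bench(lambda: train(xs, ys))
infer_time = bench(lambda: infer(xs))
print(f"train: {train_time * 1e6:.1f} us/loop, infer: {infer_time * 1e6:.1f} us/loop")
```

Timing the two phases separately matters because they stress the interpreter differently: training is dominated by arithmetic-heavy loops, while inference is a single pass over the data.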

C. Deep learning or neural network-based benchmarks should only provide an inference benchmark with fixed weights, since a training benchmark needs GPU resources, and using GPU resources is out of scope.

  • Simple neural network
  • ... (Please suggest!)
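Guideline C might look like the sketch below: inference-only, with the weights hard-coded. The tiny 2-3-1 MLP and its weight values are hypothetical; in a real benchmark the fixed weights would ship with the benchmark so results are reproducible across runs.

```python
import math

# Hypothetical fixed weights for a tiny 2-3-1 MLP (illustrative only).
W1 = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]  # hidden layer: 3 neurons, 2 inputs
B1 = [0.0, 0.1, -0.1]
W2 = [[0.7, -0.5, 0.2]]                      # output layer: 1 neuron, 3 inputs
B2 = [0.05]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def dense(inputs, weights, biases):
    """One fully connected layer with sigmoid activation."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs):
    """Inference-only forward pass; weights stay fixed, no training."""
    return dense(dense(inputs, W1, B1), W2, B2)

out = forward([1.0, 2.0])
print(out)
```

Because the weights never change, the benchmark is deterministic and CPU-only, which fits the constraint that GPU-backed training is out of scope.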
@corona10
Member Author

corona10 commented Aug 16, 2022

I changed my mind about the pure Python-based implementation.
A SymPy-based implementation will be enough, since SymPy is implemented in pure Python.
I expect this will reduce the difficulty of implementation.

@brandtbucher
Member

I wonder if Pyston's bm_pytorch_alexnet_inference would fit the bill? The only hiccup that we've run into is that PyTorch doesn't release wheels for early pre-releases of CPython, and building it from source is really hard.

That might be okay, though. Even if we can't run it during pre-releases to inform our work then, we can still run it between stable versions of CPython. If we see that it got X% faster from 3.10 to 3.11, great! If not, we can still gather stats, etc. and use them to inform 3.12 work, even if the feedback loop isn't as tight as we would prefer.

CC @mdboom.

@corona10
Member Author

corona10 commented Aug 18, 2022

I wonder if Pyston's bm_pytorch_alexnet_inference would fit the bill? The only hiccup that we've run into is that PyTorch doesn't release wheels for early pre-releases of CPython, and building it from source is really hard.

It will only cover case C.

If Python is satisfied with its position as a glue language for machine learning applications, that will be sufficient.
(And I think that we should not be.)
It's a different point of view, but if languages like Julia try to position themselves as alternatives to Python, we also need to improve similar workloads written in pure Python to stay competitive.

@corona10
Member Author

I am considering adding a subset of https://github.com/mlcommons/inference
It's a de facto standard benchmark suite for NPU inference.
