Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Undraft references #75

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from
Draft

Conversation

behrica
Copy link
Member

@behrica behrica commented Dec 3, 2024

I added explanantions in the references and in
the ml notebooks I added pointers to the reference.

So to me, the "draft" can be removed.

@behrica behrica requested a review from daslu December 3, 2024 22:02
Copy link
Member

@daslu daslu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@behrica
Thanks for this amazing work, as always.

I added a few initial comments.

@@ -257,6 +257,16 @@ ctx-after-train
;; So now we can add more operations to the pipeline,
;; and nothing else changes, for example drop columns.

;; While most metamorph compliant operations behave the same in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit abstract.

How are :fit and :transform different from a certain notion of "fit" and "transform"?

I think this paragraph leaves the reader confused.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea how to express the concept of the metamorph context in simple words.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the concept of "keeping state", while still have pure functions.
combined with "transformers" which (might !) behave different in the "two passes" over a pipeline,
in "train" and "predict" (= :fit and :transform)

Copy link
Member Author

@behrica behrica Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

combined with the concept that existing dataset->dataset functions can be "lifted automatically by a macro" to become "context"->"context" functions, so we don't need to rewrite tablecloth to become "metamorh compliant"

This is "not simple"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

combined with the fact that we can avoid pipelines as well,
or split it into a dataset transformation pipeline (= using a normal threading macro)
and a metamorph pipeline in most cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks


;; These libraries do not have any functions for the models they contain.
;; `metamorph.ml` has instead of funtcions per model the concept of each model having a
;; unique `key`, the :model-type , which needs to be given when calling
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to consistently use backticks for keywords
:model-type, etc.

;;`metamorph.ml/train`
;;
;; The model libraries register their models under these keys, when their main ns
;; is `require`d. (and the model keys get printed on screen when getting registered)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some periods (.) are missing at the end of sentences.


;; ML tutorial
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a heading?

;; Instead we provide in the the last chapters of the Noj book a complete list
;; of all models (and their keys) incl. the parameters they take with a description.
;; For some models this reference documentation contains as well code examples.
;; This can be used to browse or search for models and their parameters.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice

@@ -6,7 +6,20 @@
[scicloj.ml.smile.projections]
[noj-book.utils.render-tools :refer [render-key-info]]))

;; ## Smile other models reference - DRAFT 🛠
;; ## Smile other models reference
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More context is needed, like the above.

@@ -13,7 +13,16 @@
^:kindly/hide-code
(require '[scicloj.ml.smile.regression])

;; ## Smile regression models reference - DRAFT 🛠
;; ## Smile regression models reference
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More context is needed, like the above.

@@ -29,7 +32,7 @@
(kind/md "----------------------------------------------------------")]))))


;; ## Transformer reference - DRAFT 🛠
;; ## Transformer reference
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More context is needed, like the above.

We cannot expect the reader to know what Transformers are, even if that was mentioned in a paragraph many chapters ago.

This is especially important, since Transformers have different meanings nowadays.



;; ## Tribuo reference - DRAFT 🛠
;; ## Tribuo reference
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More context is needed, like the above.

@@ -5,6 +5,15 @@
[noj-book.utils.render-tools :refer [render-key-info]]))


;; ## Xgboost model reference - DRAFT 🛠
;; ## Xgboost model reference
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More context is needed, like the above.

What is XGBoost, etc.

@behrica
Copy link
Member Author

behrica commented Dec 4, 2024

You ask for quite some changes, so I somehow believe it is not very efficient if I do them myself and we ping-pong here back and forward.
I don't have your notion of "beginner friendly".
So I suggest that you take ownership of this draft PR
and modify it as you whish.

This seems more efficient.

@behrica behrica marked this pull request as draft December 4, 2024 10:57
@daslu
Copy link
Member

daslu commented Dec 4, 2024

Sure, thanks

@behrica
Copy link
Member Author

behrica commented Dec 4, 2024

Just to mention it.
metamorph pipelines have the same goal as this:
https://scikit-learn.org/stable/modules/compose.html

and they are the same thing, just "functional" and "stateless".
The "state" of the pipeline is the metamorph context.
it can contain the "state" of all transformes/operation in pipeline
Most "transformers" have their "state" in the dataset, so they do not write/read anything to the pipeline state (outside the dataset)

@behrica
Copy link
Member Author

behrica commented Dec 4, 2024

Differently to "metamorph transformers", sklearn pipelines cannot have directly "pandas ds->ds operations".

But user could create a "custom transformer" which internally executes pandas operations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants