[WEBSITE] Add scarf to readme for website analytics #219

Merged (2 commits) on Sep 18, 2024
3 changes: 3 additions & 0 deletions PRIVACY_NOTICE.md
@@ -0,0 +1,3 @@
# Privacy Notice

This project follows the [Privacy Policy of Astronomer](https://www.astronomer.io/privacy/)
1 change: 1 addition & 0 deletions README.md
@@ -5,6 +5,7 @@
[![PyPi](https://img.shields.io/pypi/v/dag-factory.svg)](https://pypi.org/project/dag-factory/)
[![Code Style](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black)
[![Downloads](https://pepy.tech/badge/dag-factory)](https://pepy.tech/project/dag-factory)
<img referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=2bb92a5b-beb3-48cc-a722-79dda1089eda" />
Collaborator

This seems to be rendered as

<img src="https://camo.githubusercontent.com/4b5c60ccaf0796d9330635a0927b42159b34946c2bf4d7aa4e412c8b1f606f59/68747470733a2f2f706570792e746563682f62616467652f6461672d666163746f7279" alt="Downloads" data-canonical-src="https://pepy.tech/badge/dag-factory" style="max-width: 100%;">

Probably due to:

Pixel-based telemetry will work on standard webpages, rendered markdown documentation on package registry sites like Docker Hub, npm, and PyPi, and anywhere an image can be embedded, with a notable exception being GitHub. When GitHub renders markdown, it rewrites URLs from their original web address to https://camo.githubusercontent.com/, where GitHub hosts any linked images themselves. This prevents Scarf from providing insights to maintainers, since all that can now be detected at the original web address via the tracking pixel is undifferentiated traffic from GitHub.

Paragraph from: https://docs.scarf.sh/web-traffic/
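
To illustrate the rewriting described in the quoted paragraph, here is a minimal sketch that recovers the original image URL from the camo URL shown above. It assumes the camo path has the form `/<digest>/<hex-encoded original URL>`, which matches the `data-canonical-src` attribute in the rendered tag; the helper name `decode_camo_url` is hypothetical.

```python
from urllib.parse import urlparse


def decode_camo_url(camo_url: str) -> str:
    """Recover the original image URL from a camo.githubusercontent.com URL.

    Assumption: the path looks like /<digest>/<hex-encoded original URL>.
    """
    # The last path segment is taken to be the hex-encoded original URL.
    hex_part = urlparse(camo_url).path.rstrip("/").split("/")[-1]
    return bytes.fromhex(hex_part).decode("utf-8")


# The camo URL from the rendered README badge quoted above.
camo = (
    "https://camo.githubusercontent.com/"
    "4b5c60ccaf0796d9330635a0927b42159b34946c2bf4d7aa4e412c8b1f606f59/"
    "68747470733a2f2f706570792e746563682f62616467652f6461672d666163746f7279"
)

print(decode_camo_url(camo))  # https://pepy.tech/badge/dag-factory
```

The browser only ever requests the camo URL, so the origin server sees GitHub's proxy rather than the individual reader.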

Collaborator

That said, we should be able to see this change in https://pypi.org/project/dag-factory/ once we publish this change to PyPI.

Collaborator Author

@cmarteepants cmarteepants Sep 17, 2024

Confirmed with Arjun that it still works, but that it won't be as fine-grained. We can do a separate one for the PyPI site.

Collaborator

@tatiana tatiana Sep 17, 2024

When users access the README.md hosted on GitHub, Scarf will see the traffic as coming from GitHub IPs, as if GitHub itself were generating it.

This means that, when the access comes from GitHub, we won't have conversion rates from viewing docs to downloading DAG Factory artifacts, and possibly won't know which parts of the DAG Factory documentation are viewed most. These seem to be the main features of this:
https://docs.scarf.sh/web-traffic/

That said, once the package is published, the data that comes from PyPI should be accurate. Would it be possible for us to filter out the misleading/incomplete information added by GitHub in the Scarf UI? Or would it make sense to have a PyPI README that differs from the GitHub one and only add the pixel to the PyPI README?

Collaborator Author

I used the pixel that was specific to the readme. Yes, we should use a separate one for PyPI. It would be interesting to compare, and if we find the one directly embedded in the readme isn't useful, we can remove it at that time.

Collaborator

@tatiana tatiana Sep 17, 2024

Should we make the PyPI change in a follow-up PR? ATM, both GitHub and PyPI use the same README.md:

readme = "README.md"

If we decide to split them, we can rename the tracking pixel from dag-factory-readme to dag-factory-github-readme and create a new dag-factory-pypi-readme. The only downside is that we'll need to keep two READMEs up to date instead of one.

I noticed the project doesn't currently have an automated release pipeline. An alternative, if we decide not to have the tracking pixel in the GitHub README, could be to keep the tracking pixel in a separate markdown file and merge it into the PyPI README as part of the deployment pipeline, using something like https://pandoc.org/
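
As a rough sketch of that alternative, the snippet below generates a PyPI-specific README by appending a pixel tag to the GitHub one at release time. It uses plain Python rather than pandoc, and everything named here is an assumption for illustration: the output file `README_pypi.md`, the helper names, and the `<pypi-pixel-id>` placeholder, which would have to be replaced with a real Scarf pixel ID.

```python
from pathlib import Path

# Hypothetical tag: <pypi-pixel-id> is a placeholder, not a real Scarf pixel ID.
PYPI_PIXEL_TAG = (
    '<img referrerpolicy="no-referrer-when-downgrade" '
    'src="https://static.scarf.sh/a.png?x-pxid=<pypi-pixel-id>" />'
)


def add_pixel(readme_text: str, pixel_tag: str = PYPI_PIXEL_TAG) -> str:
    """Return the README text with the pixel tag appended once."""
    if pixel_tag in readme_text:
        # Already present, so leave the text untouched (idempotent).
        return readme_text
    return readme_text.rstrip("\n") + "\n\n" + pixel_tag + "\n"


def build_pypi_readme(src: str = "README.md", dst: str = "README_pypi.md") -> None:
    """Write a PyPI-only copy of the README that includes the pixel."""
    text = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(add_pixel(text), encoding="utf-8")
```

A release job could call `build_pypi_readme()` before building the package, with the `readme` setting pointed at the generated file, so only one README is maintained by hand.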

Collaborator

For the record, it seems these docs in Scarf are outdated:

When GitHub renders markdown, it rewrites URLs from their original web address to https://camo.githubusercontent.com/, where GitHub hosts any linked images themselves. This prevents Scarf from providing insights to maintainers, since all that can now be detected at the original web address via the tracking pixel is undifferentiated traffic from GitHub.

In practice, as of 20 September, we can track pixel-based events in GitHub markdown pages with the current changes. We were able to see locations and companies in the Scarf UI.


Welcome to *dag-factory*! *dag-factory* is a library for [Apache Airflow®](https://airflow.apache.org) to construct DAGs declaratively via configuration files.

tatiana marked this conversation as resolved.