[WEBSITE] Add scarf to readme for website analytics #219
This adds website analytics to the Dag-Factory readme. Scarf privacy policy: https://about.scarf.sh/privacy-policy Note that while you cannot explicitly opt out of website analytics for the publicly hosted readme (and docs), Scarf respects the browser's Do Not Track (DNT) setting. If it is enabled in the browser, telemetry for that user will not be sent to Scarf.
@@ -5,6 +5,7 @@
[![PyPi](https://img.shields.io/pypi/v/dag-factory.svg)](https://pypi.org/project/dag-factory/)
[![Code Style](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black)
[![Downloads](https://pepy.tech/badge/dag-factory)](https://pepy.tech/project/dag-factory)
<img referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=2bb92a5b-beb3-48cc-a722-79dda1089eda" />
This seems to be rendered as
<img src="https://camo.githubusercontent.com/4b5c60ccaf0796d9330635a0927b42159b34946c2bf4d7aa4e412c8b1f606f59/68747470733a2f2f706570792e746563682f62616467652f6461672d666163746f7279" alt="Downloads" data-canonical-src="https://pepy.tech/badge/dag-factory" style="max-width: 100%;">
Probably due to:
Pixel-based telemetry will work on standard webpages, rendered markdown documentation on package registry sites like Docker Hub, npm, and PyPi, and anywhere an image can be embedded, with a notable exception being GitHub. When GitHub renders markdown, it rewrites URLs from their original web address to https://camo.githubusercontent.com/, where GitHub hosts any linked images themselves. This prevents Scarf from providing insights to maintainers, since all that can now be detected at the original web address via the tracking pixel is undifferentiated traffic from GitHub.
Paragraph from: https://docs.scarf.sh/web-traffic/
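The URL rewriting described above can be inspected directly. The following is a minimal sketch, assuming the last path segment of a Camo URL is the hex-encoded original image URL (which matches the `data-canonical-src` attribute in the rendered tag above). The helper name `camo_original_url` is hypothetical.

```python
from urllib.parse import urlparse


def camo_original_url(camo_url: str) -> str:
    """Recover the original image URL from a Camo proxy URL.

    Assumes the path layout /<digest>/<hex-encoded original URL>.
    """
    # Take the last path segment and decode it from hex to text.
    hex_part = urlparse(camo_url).path.rstrip("/").split("/")[-1]
    return bytes.fromhex(hex_part).decode("utf-8")


# Camo URL copied from the rendered <img> tag quoted in this review comment.
camo = (
    "https://camo.githubusercontent.com/"
    "4b5c60ccaf0796d9330635a0927b42159b34946c2bf4d7aa4e412c8b1f606f59/"
    "68747470733a2f2f706570792e746563682f62616467652f6461672d666163746f7279"
)
original = camo_original_url(camo)
print(original)  # https://pepy.tech/badge/dag-factory
```

Because the browser only ever requests the `camo.githubusercontent.com` address, the origin server sees GitHub's proxy rather than the individual reader.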
That said, we should be able to see this change in https://pypi.org/project/dag-factory/ once we publish this change to PyPI.
Confirmed with Arjun that it still works, but that it won't be as fine-grained. We can create a separate pixel for the PyPI site.
When users access the README.md hosted on GitHub, Scarf sees the traffic as coming from GitHub IPs, as if GitHub itself were generating it.
This means that, for access coming from GitHub, we won't have conversion rates from viewing docs to downloading DAG Factory artifacts, and possibly won't know which parts of the DAG Factory documentation are viewed most. These seem to be the main features of:
https://docs.scarf.sh/web-traffic/
That said, once the package is published, the data that comes from PyPI should be accurate. Would it be possible for us to filter out the misleading or incomplete GitHub data in the Scarf UI? Or would it make sense to have a PyPI README that differs from the GitHub one and add the pixel only to the PyPI README?
I used the pixel that was specific to the readme. Yes, we should use a separate one for PyPI. It would be interesting to compare, and if we find the one directly embedded in the readme isn't useful, we can remove it at that time.
Should we make the PyPI change in a follow-up PR? ATM, both GitHub and PyPI use the same README.md:
Line 10 in 3ff6fe0
readme = "README.md"
If we decide to split them, we can rename the tracking pixel from dag-factory-readme to dag-factory-github-readme and create a new dag-factory-pypi-readme. The only downside is that we'll need to keep two READMEs up to date instead of one.
I noticed the project doesn't currently have an automated release pipeline. An alternative, if we decide not to have the tracking pixel in the GitHub README, could be to keep the tracking pixel in a separate markdown file and merge it into the PyPI README as part of the deployment pipeline, using something like https://pandoc.org/
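As an illustration of the deployment-time approach, here is a minimal sketch that uses plain string concatenation instead of pandoc. It assumes the PyPI README is the GitHub README with a pixel tag appended. The function names and the pixel ID placeholder are hypothetical; a real ID would come from a separately created Scarf pixel such as the proposed dag-factory-pypi-readme.

```python
from pathlib import Path

# Same tag shape as the pixel added in this PR; the ID is filled in per target.
PIXEL_TEMPLATE = (
    '<img referrerpolicy="no-referrer-when-downgrade" '
    'src="https://static.scarf.sh/a.png?x-pxid={pixel_id}" />'
)


def render_pypi_readme(github_readme: str, pixel_id: str) -> str:
    """Return the README text with the tracking pixel appended exactly once."""
    pixel = PIXEL_TEMPLATE.format(pixel_id=pixel_id)
    if pixel in github_readme:
        # Already present: keep the function idempotent.
        return github_readme
    return github_readme.rstrip("\n") + "\n\n" + pixel + "\n"


def write_pypi_readme(source: Path, target: Path, pixel_id: str) -> None:
    """Read the GitHub README and write the PyPI variant next to it."""
    text = source.read_text(encoding="utf-8")
    target.write_text(render_pypi_readme(text, pixel_id), encoding="utf-8")


# Demonstration on an in-memory string with a placeholder (hypothetical) ID.
demo = render_pypi_readme("# DAG Factory\n", "PLACEHOLDER-PYPI-PIXEL-ID")
print(demo)
```

With this approach, the `readme` setting shown above would need to point at the generated file rather than `README.md`, and the release step would call `write_pypi_readme` before building the package.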
For the record, it seems these docs in Scarf are outdated:
When GitHub renders markdown, it rewrites URLs from their original web address to https://camo.githubusercontent.com/, where GitHub hosts any linked images themselves. This prevents Scarf from providing insights to maintainers, since all that can now be detected at the original web address via the tracking pixel is undifferentiated traffic from GitHub.
In practice, as of 20 September, we can track pixel-based events on GitHub-rendered markdown pages with the current changes: we were able to see locations and companies in the Scarf UI.
### Added
- Support using envvar in config YAML by @tatiana in #236
- **Callback improvements**
  - Support installed code via python callable string by @john-drews in #221
  - Add `callback_file` & `callback_name` to `default_args` DAG level by @subbota19 in #218
  - Cast callbacks to functions when set with `default_args` on TaskGroups by @Baraldo and @pankajastro in #235
- **Telemetry**
  - For more information, please, read the [Privacy Notice](https://github.com/astronomer/dag-factory/blob/main/PRIVACY_NOTICE.md#collection-of-data).
  - Add scarf to readme for website analytics by @cmarteepants in #219
  - Support telemetry during DAG parsing emitting data to Scarf by @tatiana in #250.

### Fixed
- Build DAGs when there is an invalid YAML in the DAGs folder by @quydx and @tatiana in #184

### Others
- Development tools
  - Fix make docker-run by @pankajkoti in #249
  - Add vim dot files to .gitignore by @tatiana in #228
  - Use Hatchling to modern package building by @kaxil in #208
- CI
  - Fix static check failures in PR #218 by @pankajkoti in #251
  - Fix pre-commit checks by @tatiana in #247
  - Remove tox and corresponding build jobs in CI by @pankajkoti in #248
  - Install Airflow with different versions in the CI by @pankajkoti in #237
  - Run pre-commit hooks on all existing files by @pankajkoti in #245
  - Add Python 3.11 and 3.12 to CI test pipeline by @pankajkoti in #229
- Tests
  - Fix duplicate test name by @pankajastro in #234
  - Add static check by @pankajastro in #231
  - Fix running tests locally (outside the CI) by @tatiana in #227
  - Add the task_2 back to dataset example by @cmarteepants in #204
  - Remove unnecessary config line by @jlaneve in #202
- Documentation
  - Update the license from MIT to Apache 2.0 by @pankajastro in #191
  - Add registration icon and links to Airflow references by @cmarteepants in #190
  - Update quickstart and add feature examples by @cmarteepants in #189

### Breaking changes
- Removed support for Python 3.7
- The license was changed from MIT to Apache 2.0
Closes: #217