Infra stats missing since October 2024 data for stats.jenkins.io Plugin Installation Trend feature #4386

Open
krisstern opened this issue Nov 12, 2024 · 14 comments

Comments

@krisstern
Member

Service(s)

stats.jenkins.io

Summary

Over the past few days we noticed that the October 2024 data for (1) "Installations by Version" and (2) "Installations by Version (%)" is missing for individual plugins. We are seeing something like the following at https://stats.jenkins.io/plugin-trends:

Screenshot 2024-11-13 at 2 47 33 AM

The expected behavior is that the returned data is not empty.

Reproduction steps

No response

@krisstern krisstern added the triage Incoming issues that need review label Nov 12, 2024
@dduportal dduportal added this to the infra-team-sync-2024-11-19 milestone Nov 13, 2024
@dduportal dduportal removed the triage Incoming issues that need review label Nov 13, 2024
@dduportal dduportal self-assigned this Nov 13, 2024
@dduportal
Contributor

If I understand correctly, it's related to the 2 missing days as per #4285 (comment).

Let's see the November update which should fix this (no ETA yet)

@dduportal
Contributor

For info:

  • On usage.jenkins.io VM (used by both @kohsuke and @abayer for stats tasks):
    • I can see new "raw apache logs" from Oct. 29 up to today (Nov. 14)
    • But no "usage stats logs" since 30 Oct.

=> Most probably an issue with @kohsuke's VPN system (it tends to crash on the client side; we are searching for solutions). Since he travelled recently, he may not have been able to update it. I'm checking with him.

@krisstern
Member Author

Thanks @dduportal for the update! Appreciate the follow-up

@krisstern
Member Author

Any update on this @dduportal?

@dduportal
Contributor

> Any update on this @dduportal?

Nope, we are still depending on @kohsuke for this

@krisstern
Member Author

I see. Thanks for the prompt response, Damien! 👍🏼

@krisstern krisstern changed the title from "Infra stats missing October 2024 data for stats.jenkins.io Plugin Installation Trend feature" to "Infra stats missing since October 2024 data for stats.jenkins.io Plugin Installation Trend feature" on Dec 15, 2024
@krisstern
Member Author

Hi @dduportal is there any chance we will know of any updates soon?

@dduportal
Contributor

> Hi @dduportal is there any chance we will know of any updates soon?

We'll let you know when we have news. You should not expect anything until 18-20 January.

@krisstern
Member Author

krisstern commented Dec 31, 2024

Thanks! The badges/shields team has been blocked by this yet-to-be-resolved issue as well, so at least a ballpark estimate of when to expect updates would be helpful.

@dduportal
Contributor

> Thanks! The badges/shields team has been blocked by this yet-to-be-resolved issue as well, so at least a ballpark estimate of when to expect updates would be helpful.

As explained in the previous issues such as #4285, the process for statistics depends on @kohsuke for the first step (anonymization of the data on his own machine), followed by @abayer (compilation, aggregation and filtering) for the second step.

Yes, it means 2 SPOFs, for historical and safety reasons. Yes, it would be better to change this, but we are not there yet and it requires a LOT of effort over a long time span.

I understand it can block other tasks, but it is a situation where there is nothing we can do except wait.

You did not receive any updates in the past 10 days because the team is on a two-week break for the end of the year. Given the workload and schedules of the different people involved, realistically nothing will happen until 18-20 January.

There is no need to add too many notifications and pings: this issue is tracked properly and we check it every week. It looks like you want an update: we'll add a weekly comment starting on the 7th to say "nothing happened" until it is resolved.

@timja
Member

timja commented Dec 31, 2024

> If I understand correctly, it's related to the 2 missing days as per #4285 (comment).
>
> Let's see the November update which should fix this (no ETA yet)

@dduportal what makes you think it's due to 2 days of data missing?

The issue is that these two fields are empty:

"installationsPerVersion": {},
"installationsPercentagePerVersion": {}

see:
https://old.stats.jenkins.io/plugin-installation-trend/slack.stats.json
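A quick way to spot this symptom in a downloaded stats file is to check whether those two fields are present and non-empty. This is only a minimal sketch: the helper name `empty_per_version_fields` is made up for illustration, and the sample is a trimmed version of the slack output rather than the full file.

```python
import json

# The two fields reported as empty in this issue.
PER_VERSION_FIELDS = (
    "installationsPerVersion",
    "installationsPercentagePerVersion",
)


def empty_per_version_fields(raw_json: str) -> list[str]:
    """Return the per-version fields that are missing or empty (hypothetical helper)."""
    stats = json.loads(raw_json)
    return [field for field in PER_VERSION_FIELDS if not stats.get(field)]


# Trimmed sample mirroring the broken slack output.
sample = """{
    "name": "slack",
    "installations": {"1727740800000": 1537},
    "installationsPerVersion": {},
    "installationsPercentagePerVersion": {}
}"""

print(empty_per_version_fields(sample))
```

An empty list means both per-version fields carry data.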

I assume something is wrong in https://github.com/jenkins-infra/jenkins-usage-stats, cc @abayer

(I'll try to take a look)

@timja
Member

timja commented Dec 31, 2024

Finally managed to get the tool (https://github.com/jenkins-infra/jenkins-usage-stats) to work and I get the same broken results locally.

I'll try to debug some more when I get more time.

@timja
Member

timja commented Dec 31, 2024

I suspect the report command may have been run with the wrong month and data for that month wasn't in the database.

The docs say:

❯ ./build/jenkins-usage-stats report
Error: required flag(s) "database", "directory" not set
Usage:
  jenkins-usage-stats report [flags]

Flags:
      --database string    Database URL to import to
      --directory string   Directory to output to
  -h, --help               help for report
      --latest-month int   Month of latest data to include. Defaults the previous month of when this is running

i.e. the latest month whose data you want to include.
You would expect:

./build/jenkins-usage-stats report \
  --database "postgres://timja@localhost/jenkins_usage_stats?sslmode=disable&timezone=UTC" \
  --directory /Users/timja/code/jenkins/jenkins-usage-stats/build/reports \
  --latest-month 10 \
  --latest-year 2024

to work when I have October data.

But it's actually the previous month of whatever you input:
https://github.com/jenkins-infra/jenkins-usage-stats/blob/7747222884f405115f55dc199012c097ecd73535/report.go#L940
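To make the off-by-one concrete, here is a small sketch of the behaviour as described in this comment. It is not the tool's actual Go code; `reported_month` is a hypothetical name, and the only assumption is that the report covers the month immediately before the `--latest-month` / `--latest-year` pair.

```python
def reported_month(latest_month: int, latest_year: int) -> tuple[int, int]:
    """Return (year, month) of the month before the given flag values.

    Hypothetical helper illustrating the behaviour described in this comment,
    not the implementation in report.go.
    """
    if not 1 <= latest_month <= 12:
        raise ValueError("latest_month must be between 1 and 12")
    if latest_month == 1:
        # January rolls back to December of the previous year.
        return latest_year - 1, 12
    return latest_year, latest_month - 1


print(reported_month(10, 2024))  # (2024, 9): September, so October data is not covered
print(reported_month(11, 2024))  # (2024, 10): October
```

Under that assumption, passing 10 asks for September, which would explain the empty October per-version data.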

I was able to get valid data with:

./build/jenkins-usage-stats report \
  --database "postgres://timja@localhost/jenkins_usage_stats?sslmode=disable&timezone=UTC" \
  --directory /Users/timja/code/jenkins/jenkins-usage-stats/build/reports \
  --latest-month 11 \
  --latest-year 2024

Previously:

{
    "name": "slack",
    "installations": {
        "1727740800000": 1537
    },
    "installationsPercentage": {
        "1727740800000": 14.967378
    },
    "installationsPerVersion": {},
    "installationsPercentagePerVersion": {}
}
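For reference, the key `1727740800000` looks like a Unix timestamp in milliseconds. Assuming that reading (and UTC), a short sketch shows it maps to October 2024; `key_to_month` is a made-up helper name.

```python
from datetime import datetime, timezone


def key_to_month(epoch_ms_key: str) -> str:
    """Convert an epoch-milliseconds key to YYYY-MM in UTC (assumed key format)."""
    moment = datetime.fromtimestamp(int(epoch_ms_key) / 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m")


print(key_to_month("1727740800000"))  # 2024-10
```

So the totals for October are present in both outputs; only the per-version breakdown differs.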

Now:

{
    "name": "slack",
    "installations": {
        "1727740800000": 1537
    },
    "installationsPercentage": {
        "1727740800000": 14.967378
    },
    "installationsPerVersion": {
        "1.7": 1,
        "1.8": 1,
        "1.8.1": 2,
        "2.0.1": 1,
        "2.1": 2,
        "2.12": 1,
        "2.14": 5,
        "2.16": 2,
        "2.18": 1,
        "2.2": 7,
        "2.20": 2,
        "2.22": 2,
        "2.23": 9,
        "2.24": 3,
        "2.26": 1,
        "2.27": 1,
        "2.28": 4,
        "2.29": 4,
        "2.3": 33,
        "2.32": 4,
        "2.34": 15,
        "2.35": 5,
        "2.36": 4,
        "2.37": 4,
        "2.39": 2,
        "2.4": 1,
        "2.40": 21,
        "2.41": 8,
        "2.42": 8,
        "2.43": 6,
        "2.44": 1,
        "2.45": 15,
        "2.46": 8,
        "2.47": 5,
        "2.48": 44,
        "2.49": 38,
        "2.6": 2,
        "602.v0da_f7458945d": 9,
        "608.v19e3b_44b_b_9ff": 20,
        "616.v03b_1e98d13dd": 40,
        "625.va_eeb_b_168ffb_0": 10,
        "629.vf00ea_cb_40d53": 6,
        "631.v40deea_40323b": 88,
        "664.vc9a_90f8b_c24a_": 47,
        "684.v833089650554": 281,
        "714.v62ffe7c796cd": 6,
        "715.v1cfed1b_9c63c": 26,
        "722.vd07f1ea_7ff40": 121,
        "734.v7f9ec8b_66975": 51,
        "741.v00f9591c586d": 109,
        "751.v2e44153c8fe1": 450
    },
    "installationsPercentagePerVersion": {
        "1.7": 0.009738047,
        "1.8": 0.009738047,
        "1.8.1": 0.019476093,
        "2.0.1": 0.009738047,
        "2.1": 0.019476093,
        "2.12": 0.009738047,
        "2.14": 0.048690233,
        "2.16": 0.019476093,
        "2.18": 0.009738047,
        "2.2": 0.06816632,
        "2.20": 0.019476093,
        "2.22": 0.019476093,
        "2.23": 0.08764242,
        "2.24": 0.02921414,
        "2.26": 0.009738047,
        "2.27": 0.009738047,
        "2.28": 0.038952187,
        "2.29": 0.038952187,
        "2.3": 0.32135552,
        "2.32": 0.038952187,
        "2.34": 0.1460707,
        "2.35": 0.048690233,
        "2.36": 0.038952187,
        "2.37": 0.038952187,
        "2.39": 0.019476093,
        "2.4": 0.009738047,
        "2.40": 0.20449898,
        "2.41": 0.07790437,
        "2.42": 0.07790437,
        "2.43": 0.05842828,
        "2.44": 0.009738047,
        "2.45": 0.1460707,
        "2.46": 0.07790437,
        "2.47": 0.048690233,
        "2.48": 0.42847404,
        "2.49": 0.37004578,
        "2.6": 0.019476093,
        "602.v0da_f7458945d": 0.08764242,
        "608.v19e3b_44b_b_9ff": 0.19476093,
        "616.v03b_1e98d13dd": 0.38952187,
        "625.va_eeb_b_168ffb_0": 0.09738047,
        "629.vf00ea_cb_40d53": 0.05842828,
        "631.v40deea_40323b": 0.8569481,
        "664.vc9a_90f8b_c24a_": 0.45768818,
        "684.v833089650554": 2.736391,
        "714.v62ffe7c796cd": 0.05842828,
        "715.v1cfed1b_9c63c": 0.2531892,
        "722.vd07f1ea_7ff40": 1.1783036,
        "734.v7f9ec8b_66975": 0.49664038,
        "741.v00f9591c586d": 1.061447,
        "751.v2e44153c8fe1": 4.382121
    }
}
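As a sanity check on the new output: the per-version counts above add up to 1537, the same as `installations`. The percentages also look consistent if each one is the count divided by a common total. The sketch below assumes that reading (the total is inferred, not taken from the tool) and checks three of the versions.

```python
# Values copied from the output above.
installations = 1537
installations_percentage = 14.967378

# Inferred total behind the percentages (an assumption, not read from the tool).
implied_total = round(installations / installations_percentage * 100)

# (count, reported percentage) for a few versions from the output above.
per_version = {
    "1.7": (1, 0.009738047),
    "684.v833089650554": (281, 2.736391),
    "751.v2e44153c8fe1": (450, 4.382121),
}


def percentage(count: int, total: int) -> float:
    """Share of `total` represented by `count`, in percent."""
    return count / total * 100


print("implied total:", implied_total)
for version, (count, reported) in per_version.items():
    computed = percentage(count, implied_total)
    print(f"{version}: computed {computed:.6f}, reported {reported}")
```

The implied total comes out at 10269, and the recomputed percentages agree with the reported ones to about five decimal places.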

Note: I just imported one file from usage.jenkins.io for this test: access_usage.jenkins.io.log.20241029000000.gz
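The file name appears to carry a `YYYYMMDDHHMMSS` stamp. Assuming that layout, a small sketch can pull the date out, which is handy for checking which days have been imported; `log_timestamp` and the pattern are illustrative only.

```python
import re
from datetime import datetime

# Assumed naming scheme, inferred from the single file name mentioned above.
LOG_NAME = re.compile(r"^access_usage\.jenkins\.io\.log\.(\d{14})\.gz$")


def log_timestamp(filename: str) -> datetime:
    """Parse the 14-digit stamp from a log file name (hypothetical helper)."""
    match = LOG_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected log file name: {filename}")
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


print(log_timestamp("access_usage.jenkins.io.log.20241029000000.gz").date())  # 2024-10-29
```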

@krisstern
Member Author

Thanks @timja for investigating the root cause!
