Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve reliability of fb_league_stats(..., team_or_player = "player") #396

Merged
merged 1 commit into from
Sep 2, 2024

Conversation

tonyelhabr
Copy link
Collaborator

Fixes #395.

  • Refactors backend of fb_league_stats(..., team_or_player = "player") to use rvest::read_html_live()
    • In doing so, also fixes bug with page element used to extract data by fb_league_stats(..., team_or_player = "player") (😅 )

Testing and verifying that all stat_types with fb_league_stats(..., team_or_player = "player") work.

library(worldfootballR)
packageVersion('worldfootballR')
#> [1] '0.6.6.0'

mls_2024_standard <- fb_league_stats(
  country = 'USA',
  gender = 'M',
  season_end_year = 2024,
  tier = '1st',
  stat_type = 'standard',
  team_or_player = 'player'
)
#> Please be aware that `fb_league_stats(..., team_or_player = "player")` depends on `rvest::read_html_live` (and chromote), which may not always work.
#> This message is displayed once per session.

mls_2024_shooting <- fb_league_stats(
  country = 'USA',
  gender = 'M',
  season_end_year = 2024,
  tier = '1st',
  stat_type = 'shooting',
  team_or_player = 'player'
)

mls_2024_passing <- fb_league_stats(
  country = 'USA',
  gender = 'M',
  season_end_year = 2024,
  tier = '1st',
  stat_type = 'passing',
  team_or_player = 'player'
)

mls_2024_passing_types <- fb_league_stats(
  country = 'USA',
  gender = 'M',
  season_end_year = 2024,
  tier = '1st',
  stat_type = 'passing_types',
  team_or_player = 'player'
)

mls_2024_gca <- fb_league_stats(
  country = 'USA',
  gender = 'M',
  season_end_year = 2024,
  tier = '1st',
  stat_type = 'gca',
  team_or_player = 'player'
)

mls_2024_defense <- fb_league_stats(
  country = 'USA',
  gender = 'M',
  season_end_year = 2024,
  tier = '1st',
  stat_type = 'defense',
  team_or_player = 'player'
)

mls_2024_possession <- fb_league_stats(
  country = 'USA',
  gender = 'M',
  season_end_year = 2024,
  tier = '1st',
  stat_type = 'possession',
  team_or_player = 'player'
)

mls_2024_playing_time <- fb_league_stats(
  country = 'USA',
  gender = 'M',
  season_end_year = 2024,
  tier = '1st',
  stat_type = 'playing_time',
  team_or_player = 'player'
)

mls_2024_misc <- fb_league_stats(
  country = 'USA',
  gender = 'M',
  season_end_year = 2024,
  tier = '1st',
  stat_type = 'misc',
  team_or_player = 'player'
)

mls_2024_keepers <- fb_league_stats(
  country = 'USA',
  gender = 'M',
  season_end_year = 2024,
  tier = '1st',
  stat_type = 'keepers',
  team_or_player = 'player'
)

mls_2024_keepers_adv <- fb_league_stats(
  country = 'USA',
  gender = 'M',
  season_end_year = 2024,
  tier = '1st',
  stat_type = 'keepers_adv',
  team_or_player = 'player'
)

dim(mls_2024_standard)
#> [1] 791  39
dim(mls_2024_shooting)
#> [1] 791  28
dim(mls_2024_passing)
#> [1] 791  34
dim(mls_2024_passing_types)
#> [1] 791  26
dim(mls_2024_gca)
#> [1] 791  27
dim(mls_2024_defense)
#> [1] 791  27
dim(mls_2024_possession)
#> [1] 791  33
dim(mls_2024_playing_time)
#> [1] 913  32
dim(mls_2024_misc)
#> [1] 791  27
dim(mls_2024_keepers)
#> [1] 59 29
dim(mls_2024_keepers_adv)
#> [1] 59 36

Copy link
Collaborator Author

@tonyelhabr tonyelhabr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all-in-all, i think fb_league_stats(..., team_or_player = "player") should be ~100% reliable now!

@@ -1,7 +1,7 @@
Type: Package
Package: worldfootballR
Title: Extract and Clean World Football (Soccer) Data
Version: 0.6.5.0008
Version: 0.6.6.0000
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since fb_league_stats(..., team_or_player = "player") seems to be a popular function that was buggy before, I think this deserves a little more than a dev version increment!

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely, good choice.

@@ -36,7 +36,7 @@ Imports:
readr,
rlang,
rstudioapi,
rvest (>= 1.0.0),
rvest (>= 1.0.4),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notable requirement


***

# worldfootballR 0.6.6
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"reset" the NEWS

session <- worldfootballr_chromote_session(url)
player_table <- worldfootballr_html_player_table(session)
session$session$close(wait_ = FALSE)
page <- rvest::read_html_live(url)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow, so easy now! (i can completely delete chromote-fbref.R 😄)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have been using rvest::read_html_live() for work quite a bit, just didn't get to addressing it here though - awesome stuff!

page <- rvest::read_html_live(url)
## for keepers: although URLs have plural term, div elements have singular term
stat_type <- gsub("keepers", "keeper", stat_type)
player_table_element <- page$html_elements(paste0("#div_stats_", stat_type))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks for this ID element in the HTML comment that is loaded client side

image

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before, i hardcoded a search for the #stats_shooting_sh element. in hindsight, i think this meant that fb_league_stats(..., team_or_player = "player") only worked for stat_type = "shooting"?! maybe my thought that our prior code was unreliable was actually more so due to a bug than bad code!

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hahaha a very nice unintended fix!

@tonyelhabr tonyelhabr requested a review from JaseZiv September 2, 2024 11:59
Copy link
Owner

@JaseZiv JaseZiv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome PR.

Thanks so much for addressing @tonyelhabr.

@JaseZiv JaseZiv merged commit e1093e0 into main Sep 2, 2024
0 of 2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fb_league_stats not returning anything other than "shooting" for players
2 participants