Understat match stats and players #386

Merged: 17 commits, Jul 9, 2024
Changes from 7 commits
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -54,4 +54,4 @@ Suggests:
rmarkdown,
testthat
Encoding: UTF-8
-RoxygenNote: 7.2.3
+RoxygenNote: 7.3.1
2 changes: 2 additions & 0 deletions NAMESPACE
@@ -63,7 +63,9 @@ export(tm_team_transfers)
export(understat_available_teams)
export(understat_league_match_results)
export(understat_league_season_shots)
export(understat_match_players)
export(understat_match_shots)
export(understat_match_stats)
export(understat_player_shots)
export(understat_team_meta)
export(understat_team_players_stats)
53 changes: 53 additions & 0 deletions R/understat_match_players.R
@@ -0,0 +1,53 @@

#' Get Understat match player data
#'
#' Returns player values for a selected match from Understat.com
#'
#' @param match_url the URL of the match played
#'
#' @return returns a dataframe with data for all players for the match
#'
#' @importFrom magrittr %>%
#'
#' @export

understat_match_players <- function(match_url) {
[Review comment, Collaborator; conversation marked as resolved by tonyelhabr]
My only new feedback on this is that there might have been a cleaner way of doing all of this. I noticed that {understatr} seems to have a more straightforward approach, although I haven't thoroughly checked the details. It could be the exact same implementation.

[Review comment, Collaborator]
But I don't think this function should be changed at this point.

  # .pkg_message("Scraping player data for match {match_url}. Please acknowledge understat.com as the data source")

  # Keep only the digits of the URL, e.g. ".../match/14789" -> "14789"
  match_id <- gsub("[^0-9]", "", match_url)

  # Collect every <script> node of the match page as text
  match_player_data <- .get_understat_json(page_url = match_url) %>%
    rvest::html_nodes("script") %>%
    as.character()

  # Keep the script that defines `rostersData`, then trim the JavaScript
  # wrapper around the JSON payload. The fixed offsets (41 and 13) depend on
  # the page markup.
  match_player_data <- match_player_data[grep("rostersData\t=", match_player_data)] %>%
    stringi::stri_unescape_unicode() %>%
    substr(41, nchar(.)) %>%
    substr(0, nchar(.) - 13) %>%
    paste0('[', . , ']') %>%
    unlist() %>%
    stringr::str_subset("\\[\\]", negate = TRUE)

  match_player_data <- lapply(match_player_data, jsonlite::fromJSON) %>%
    do.call("rbind", .)

  # `h` and `a` hold the home and away rosters
  match_player_data_home <- do.call(rbind.data.frame, match_player_data$h)
  match_player_data_away <- do.call(rbind.data.frame, match_player_data$a)

  # dplyr verbs are namespaced explicitly, since only %>% is imported above
  match_player_data <- dplyr::bind_rows(match_player_data_home, match_player_data_away) %>%
    dplyr::mutate(match_id = match_id) %>%
    dplyr::select(match_id, team_id,
                  team_status = h_a,
                  player_id, swap_id = id,
                  player, position, positionOrder,
                  time_played = time,
                  dplyr::everything()) %>%
    dplyr::mutate(team_status = ifelse(team_status == "h", "home", "away"))

  return(match_player_data)
}
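
The review comment above raises the possibility of a cleaner extraction. As a rough illustration only, and not the PR's implementation or {understatr}'s, the fixed `substr()` offsets could in principle be replaced by a pattern match on the `JSON.parse('...')` wrapper. The helper name `extract_json_payload()` and the `toy_script` string are hypothetical. The toy string is already unescaped, whereas the PR code also runs `stringi::stri_unescape_unicode()` on the real page text.

```r
# Hypothetical helper (not part of the PR): pull the JSON.parse('...') payload
# assigned to `var_name` out of the text of a <script> node.
extract_json_payload <- function(script_text, var_name) {
  pattern <- paste0(var_name, "\\s*=\\s*JSON\\.parse\\('(.*?)'\\)")
  m <- regmatches(script_text, regexec(pattern, script_text, perl = TRUE))[[1]]
  # m[1] is the full match, m[2] the captured payload; no match gives character(0)
  if (length(m) < 2) {
    return(NA_character_)
  }
  m[2]
}

# Toy input that mimics the shape the PR code searches for ("rostersData\t=")
toy_script <- "<script>\n\tvar rostersData\t= JSON.parse('{\"h\":{},\"a\":{}}');\n</script>"
payload <- extract_json_payload(toy_script, "rostersData")
print(payload)
```

A pattern match like this would not depend on the exact number of characters before and after the payload, which is the main fragility of the offset-based approach.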


56 changes: 56 additions & 0 deletions R/understat_match_stats.R
@@ -0,0 +1,56 @@

#' Get Understat match stats table data
#'
#' Returns the Stats values for a selected match from Understat.com
#'
#' @param match_url the URL of the match played
#'
#' @return returns a dataframe with data from the stats table for the match
#'
#' @importFrom magrittr %>%
#'
#' @export

understat_match_stats <- function(match_url) {
  # .pkg_message("Scraping match stats for match {match_url}. Please acknowledge understat.com as the data source")

  match_stats <- .get_understat_json(page_url = match_url) %>%
    rvest::html_nodes("div.scheme-block.is-hide[data-scheme='stats']") %>%
    rvest::html_nodes(".progress-value") %>%
    rvest::html_text()

  # Split the scraped values by position. Judging by the assignments below, the
  # names are only accurate from the goals row onward: the teams row (2 values)
  # and the chances row (3 values: home, draw, away) come first, so `away[1:2]`
  # hold the home team and home chances, while `home[1:2]` hold the away team
  # and draw chances.
  away <- match_stats[seq(1, length(match_stats), by = 2)]
  home <- match_stats[seq(2, length(match_stats), by = 2)]

  match_stats <- data.frame(
    match_id = gsub("[^0-9]", "", match_url),

    home_team = away[1],
    home_chances = away[2],
    home_goals = home[3],
    home_xG = home[4],
    home_shots = home[5],
    home_shot_on_target = home[6],
    home_deep = home[7],
    home_PPDA = home[8],
    home_xPTS = home[9],

    draw_chances = home[2],

    away_team = home[1],
    away_chances = away[3],
    away_goals = away[4],
    away_xG = away[5],
    away_shots = away[6],
    away_shot_on_target = away[7],
    away_deep = away[8],
    away_PPDA = away[9],
    away_xPTS = away[10]
  )

  return(match_stats)
}
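
The index mapping in `understat_match_stats()` is easy to misread, so here is a small sketch of how it lines up. It assumes the scraped values arrive in the order the function's assignments imply (teams, then home/draw/away chances, then home/away pairs). The `toy_values` vector and its contents are invented for illustration.

```r
# Invented values in the assumed page order: 2 team names, 3 chances, 7 stat pairs
toy_values <- c("Home FC", "Away FC", "45%", "25%", "30%",
                "2", "1", "1.80", "0.90", "12", "8", "5", "3",
                "7", "4", "9.5", "12.1", "1.95", "0.80")

stats <- c("goals", "xG", "shots", "shot_on_target", "deep", "PPDA", "xPTS")
stat_names <- c("home_team", "away_team",
                "home_chances", "draw_chances", "away_chances",
                # interleave home_/away_ names: home_goals, away_goals, home_xG, ...
                as.vector(rbind(paste0("home_", stats), paste0("away_", stats))))
named <- setNames(toy_values, stat_names)

# The same odd/even split the function uses
away <- toy_values[seq(1, length(toy_values), by = 2)]
home <- toy_values[seq(2, length(toy_values), by = 2)]

# Spot checks: the positional lookups agree with the named lookups
print(c(home[3], named[["home_goals"]]))
print(c(away[3], named[["away_chances"]]))
print(c(home[2], named[["draw_chances"]]))
```

Because the chances row contributes three values instead of two, every later home value lands on an even position and every later away value on an odd one, which is why `home_team` is read from `away[1]`.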


17 changes: 17 additions & 0 deletions man/understat_match_players.Rd

Some generated files are not rendered by default.

17 changes: 17 additions & 0 deletions man/understat_match_stats.Rd

Some generated files are not rendered by default.

19 changes: 19 additions & 0 deletions vignettes/extract-understat-data.Rmd
@@ -136,6 +136,25 @@ wba_liv_shots <- understat_match_shots(match_url = "https://understat.com/match/
dplyr::glimpse(wba_liv_shots)
```

### Match Stats

To get the data from the stats table for an individual match, use the `understat_match_stats()` function:

```{r match_stats}
wba_liv_stats <- understat_match_stats(match_url = "https://understat.com/match/14789")
dplyr::glimpse(wba_liv_stats)
```


### Match Players

To get the data for each player in an individual match, use the `understat_match_players()` function:

```{r match_players}
wba_liv_players <- understat_match_players(match_url = "https://understat.com/match/14789")
dplyr::glimpse(wba_liv_players)
```
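
Both functions build `match_id` from `match_url` by removing every non-digit character. A short sketch of what that means in practice follows. The second URL, with a query string, is a hypothetical input, and the anchored `sub()` pattern is only one possible alternative, not something the package uses.

```r
plain_url <- "https://understat.com/match/14789"
# Hypothetical URL with extra digits after the match ID
query_url <- "https://understat.com/match/14789?ref=2024"

# Approach used in the functions: drop everything that is not a digit
id_plain <- gsub("[^0-9]", "", plain_url)   # "14789"
id_query <- gsub("[^0-9]", "", query_url)   # "147892024"

# Possible alternative: capture only the digits that follow "/match/"
id_anchored <- sub(".*/match/([0-9]+).*", "\\1", query_url)  # "14789"

print(c(id_plain, id_query, id_anchored))
```

For canonical match URLs such as the ones used in this vignette, the two approaches give the same result.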


***
