Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rate games accounting for first move advantage (fix #6818) #16281

Closed
wants to merge 53 commits into from

Conversation

ddugovic
Copy link
Contributor

Case studies measure a first move advantage of 35 Elo, for example:
2002 - https://en.chessbase.com/post/the-sonas-rating-formula-better-than-elo
2024 - https://www.robweir.com/blog/2014/01/first-move-advantage-in-chess.html

This PR does not tune tau (see http://www.glicko.net/glicko/glicko2.pdf ),
although doing so may help stabilize variant and ultrabullet ratings.

@niklasf
Copy link
Member

niklasf commented Oct 29, 2024

Love to see this!

I analysed this a while back and concluded that adding this parameter is a significant improvement, but somehow never got around to implementing it.

It's important to get the parameter right. If 35 overshoots too much, it could lead to less accurate results than 0.

It should be possible to calculate the optimum value from aggregate results, like opening explorer data. We want that White's average score equals the rating system prediction for a game between two stable players at 1500.

@niklasf niklasf linked an issue Oct 29, 2024 that may be closed by this pull request
@ddugovic
Copy link
Contributor Author

ddugovic commented Oct 30, 2024

Sampling rated blitz games with rating=1500, I see:

  • 20,707 (49.7%) white wins
  • 19,802 (47.5%) black wins
  • 1,117 (2.7%) draws

These numbers don't match the Opening Explorer numbers, or any other case study I could find. I selected blitz games because the rapid rating distribution is skewed, and the blitz rating distribution is not skewed.

Using this calculator, I see that an Elo difference of +7.786 produces an expected score of 0.510870.


final class GameResult(winner: Rating, loser: Rating, isDraw: Boolean) extends Result:
final class GameResult(val first: Rating, val second: Rating, outcome: Option[Boolean]) extends Result:
Copy link
Collaborator

@ornicar ornicar Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned that you changed the semantics of this class while keeping typing compatibility.

winner: Rating, loser: Rating and val first: Rating, val second: Rating have the same types, but mean different things. Code using it might break without the compiler noticing.

I reckon the class should at least have a different name, to force us to review and fix the code wherever it's used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I changed the only other consumer of this class (PuzzleFinisher.scala) to consume BinaryResult instead.

I tried to think of a better name than GameResult to describe this concept... I needed a way to identify which player had the first move advantage.

private val VOLATILITY = lila.rating.Glicko.default.volatility
private val TAU = 0.75d
private val calculator = glicko2.RatingCalculator(VOLATILITY, TAU)
private val calculator = lila.rating.Glicko.system
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what the fudge... there was a horrible bug here.

private val calculator = glicko2.RatingCalculator(VOLATILITY, TAU)

was completely wrong here, as the constructor goes RatingCalculator(tau: Double = 0.75d, ratingPeriodsPerDay: Double = 0).

And that's what happens when we use weak types like Double. Arguments get swapped during some code change and no-one notices.

Gonna change these to proper opaque type so that it doesn't happen again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Goodness, that is quite a mess and I feel somewhat bad for having introduced it years ago.

Parameter tau being near-zero may have caused solvers' volatility to increase slower when on a hot streak:

Smaller values of τ prevent the volatility measures from changing by large amounts, which in turn prevent enormous changes in ratings based on very improbable results.
http://www.glicko.net/glicko/glicko2.pdf

Parameter ratingPeriodsPerDay being 0;75d instead of 0.21436d may have caused solvers' RD to increase faster.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I wasn't blaming you for the bug. In fact I didn't check who introduced it and assumed it was me.
Thanks for fixing it tho.

@ornicar
Copy link
Collaborator

ornicar commented Oct 30, 2024

looks like I cannot push to this PR, that's a bit unpractical

@ddugovic
Copy link
Contributor Author

I'm trying to grant you collaborator access as well as trying to grant you access here
image


def getOpponent(player: Rating): Rating
def getAdvantage(advantage: Double, player: Rating): Double =
if player == first then advantage / 2.0d else -advantage / 2.0d
Copy link
Member

@niklasf niklasf Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside: It's a bit insane that players are identified by their exact Rating. Not introduced by this PR, though.

Copy link
Contributor Author

@ddugovic ddugovic Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, there is a dilemma when two players have the exact same Rating. (I can't tell at the moment whether this PR either creates or fixes that edge case.)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah that comes from https://github.com/goochjs/glicko2. We could fix it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to fix it... but also, there's a case of dependency inversion since RatingCalculator shouldn't need to inspect RatingPeriodResults to identify what players, puzzles, tutor-exercises, etc. need ratings updated.

Trait RatingPeriodResults should be replaceable by List[Result], because each consumer of RatingCalcuator#updateRatings already has that context, especially because we're only rating games and puzzles serially (one at a time) anyway (except for modules/tutor/src/test/GlickoTest.scala which rates 2 results concurrently).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in any case this PR is not the time and place to fix it.

@ornicar
Copy link
Collaborator

ornicar commented Oct 30, 2024

[EDIT] I'm able to push after accepting the github invitation to contribute.

IDK why I still can't push to this PR

git remote -v
origin	https://github.com/lichess-org/lila.git (fetch)
origin	https://github.com/lichess-org/lila.git (push)
lishogi	https://github.com/ddugovic/lishogi.git (fetch)
lishogi	https://github.com/ddugovic/lishogi.git (push)
git push --set-upstream lishogi glicko-first-move
remote: Permission to ddugovic/lishogi.git denied to ornicar.
fatal: unable to access 'https://github.com/ddugovic/lishogi.git/': The requested URL returned error: 403

It happened before on a couple PRs and has remained a mystery.

@ornicar
Copy link
Collaborator

ornicar commented Oct 30, 2024

sbt scalafmtAll fixes CI

@ddugovic
Copy link
Contributor Author

ddugovic commented Nov 2, 2024

Given excitement in our team channel about crazyhouse, I did a little analysis (and maybe should update this PR):

This month, 1228 rated crazyhouse games were played, with 628 (51%) White wins and 575 (47%) Black wins, a 15.171 Elo advantage (as compared to 7.786 Elo for standard blitz).

fixes clicking the reset password email link
after obtaining a gdpr erasure
players: Iterable[Rating],
results: RatingPeriodResults[?],
skipDeviationIncrease: Boolean = false
) =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is strange, or I'm misunderstanding something.

This updateRatings function now takes an additional argument players.

That info used to come from results which also contains the players as results.players.

Which means there are now 2 sources of truth for the players. Both players and results.players.

I see that results.players has been made private and is completely unused (?!) but it is still a code smell in my opinion, and a potential source of bugs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, this PR exposes what in my mind was previously a code smell: RatingCalculator.scala looking at results.players to figure out "which ratings need to be updated?" was unnecessarily complex (besides that previous code being confusing in other ways with trying to figure out if a player played any games in that RatingPeriodResults or not).

Tutor could have exercises with fixed ratings: in fact, in chess literature (books, magazines, CDs, other training materials) exercises are provided with "if you get this score, we estimate your rating is X." But this PR might be prematurely forcing us to make that design decision.

Copy link
Contributor Author

@ddugovic ddugovic Nov 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For others' benefit; we discussed on stream that we don't think results.players needs to exist, but it's an artifact from a previous reference implementation prematurely optimized straight from java hell. https://xkcd.com/2347/
image

Copy link
Contributor Author

@ddugovic ddugovic Nov 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding result.players, class Result is perhaps the world's least efficient implementation mapping Pair[Rating, Pair[OpponentRating, Score]] or maybe Map[Rating, Pair[OpponentRating, Score]] (for a pair of players playing a game, for a player and a puzzle, or for a player attempting a fixed-rating tutor exercise). At present my head hurts trying to think about it.

Copy link
Contributor Author

@ddugovic ddugovic Nov 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait... Result should actually be Tuple[OpponentRating, Score] and there's no need for any functions. Sure, a game has a winner and a loser, so a game should produce a Pair[Result].

val results = glicko2.GameRatingPeriodResults(
List(
game.winnerColor match
case None => glicko2.GameResult(white, black, true)
case Some(chess.White) => glicko2.GameResult(white, black, false)
case Some(chess.Black) => glicko2.GameResult(black, white, false)
Copy link
Collaborator

@ornicar ornicar Nov 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we go from

          case Some(chess.White) => glicko2.GameResult(white, black, false)
          case Some(chess.Black) => glicko2.GameResult(black, white, false)

to

          case Some(chess.White) => glicko2.GameResult(white, black, Some(true))
          case Some(chess.Black) => glicko2.GameResult(black, white, Some(false))

which I'm pretty sure has a bug, not fixed by more recent commits

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made a best effort to correct this (and pushed my commits), and am overwhelmed trying to understand compile errors.

@ornicar
Copy link
Collaborator

ornicar commented Nov 19, 2024

I'm making a pure glicko calculator API, moving it to scalachess, and integrating it back into lila

@ornicar ornicar closed this Nov 20, 2024
@ddugovic
Copy link
Contributor Author

I'm making a pure glicko calculator API, moving it to scalachess, and integrating it back into lila

Thanks! I see that PR successfully merged into master!

@ddugovic ddugovic deleted the glicko-first-move branch November 21, 2024 04:44
@ddugovic
Copy link
Contributor Author

On scalachess I'm not a collaborator, so I created lichess-org/scalachess#599

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

adjust Glicko-2 to account for first move advantage
3 participants