-
Notifications
You must be signed in to change notification settings - Fork 0
/
NOTES.txt
307 lines (245 loc) · 15.2 KB
/
NOTES.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
In test competion 2 my program crashed five times. See the following files:
logs/game-53360.txt
logs/game-53492.txt
logs/game-53505.txt
logs/game-53576.txt
logs/game-53590.txt
logs/game-53604.txt
The crash in game 53505 has not been explained. The other crashes were caused
by use of the value from the transposition table as a lower bound for the game
value. When this value is accurate, then all subsequent searches will fail low,
and at the top-level *move will never be initialized.
This occured in practice when the previous search went over budget (which caused
the search depth for the next search to be decreased) combined with the opponent
having only a single move available (which caused the search depth for that
search to be extended) and as a result, the tranposition table would contain an
entry with the required minimal depth for the root case of the next search,
which would then crash.
The resolution is to not use the transposition table at the top level at all.
If I implement the killer heuristic then the transposition table might be used
in those situations, but I must be very careful to avoid hash collisions (which,
to be fair, I have not observed in practice) which might cause an invalid move
to be selected, messing up the internal game state or worse!
Re: move ordering:
- first, use heuristical move ordering:
try stacking on opponent first, stacking on own pieces last.
- second, use a killer heuristic:
try the best move from the transposition table first.
================================================================================
Local test competition (4 players, 12 rounds) has the following results:
No Player Points Won Tied Lost Fail Avg Time Max Time
-- -------------------- ------ ---- ---- ---- ---- -------- --------
1 players/dvonner 6930 59 0 13 0 2.651s 4.823s
2 players/holtz 5101 47 0 25 1 174.467s 188.804s
3 ./player 4227 33 0 39 0 2.081s 2.847s
4 players/dDvonn 1004 5 0 67 0 243.754s 418.068s
-- -------------------- ------ ---- ---- ---- ---- -------- --------
1 2 3 4
--- --- --- ---
1 players/dvonner 20 15 24
2 players/holtz 4 19 24
3 ./player 9 5 19
4 players/dDvonn 0 0 5
Win count of player 1 (row) against player 2 (column)
1 2 3 4
------ ------ ------ ------
1 players/dvonner 18.88 1.08 18.50
2 players/holtz -18.88 2.42 7.25
3 ./player -1.08 -2.42 13.83
4 players/dDvonn -18.50 -7.25 -13.83
Average score difference between players.
Interesting conclusions:
- dDvonn player very poorly all around. I'll probably exclude it from now on.
- My player defeats dvonner (number 1) more easily than Holtz (number 2)!
- Except for being terribly slow, Holtz does something right. Should look at
its evaluation function, since it's open-source.
Additionally, I evaluated the placement phase strategies of the programs, by
truncating all games to 49 moves, and have Dvonner finish them by playing
against itself, and seeing which side wins to determine who started the
placement phase from a better position. I ran 10 games per position to reduce
the influence of chance, and the results are as follows:
# won player
------ ----------
5 (tied)
25 ./player
34 players/dDvonn
35 players/holtz
45 players/dvonner
This suggests that Dvonner's setup phase is currently best! I should try
to improve upon it.
================================================================================
No more crashes in test competition 3, but more losses/draws. Some losses occur
against weaker opponents. In those games, the Dvonn stones are close together
on one side of the board, and my stones are too far away from them after them
placement phase, resulting in emberassing cut-offs. That needs to be fixed!
Comparing my logged times and the judge's output, the judge seems to think I use
about 70~90 ms more time than I think I do. To make sure I stay within the time
limit, I will fixed my timelimit to 5s - 120 ms = 4.88 s.
================================================================================
To prevent losses like in competition 3, I reduced the weight of edge fields
from 5 to 2. At the same time, I changed the way time is allocated for moves.
This seemed an innocuous change, but the new program seemed to lose to the
testcomp-3 version, despite the edge fix!
First, I verified the edge fix itself was for the best, by patching the test
competition program with just this weight change and nothing else:
No Player Points Won Tied Lost Fail Avg Time Max Time
-- -------------------- ------ ---- ---- ---- ---- -------- --------
1 ./player-tc3-edgefix 12271 105 1 94 0 2.471s 3.757s
2 ./player-tc3 10946 94 1 105 0 2.239s 3.513s
-- -------------------- ------ ---- ---- ---- ---- -------- --------
(Stone difference: 1.68)
So, that's not the problem. Next, I patched the old time budgeting code back
into the new program (with edge fix) and tried again:
No Player Points Won Tied Lost Fail Avg Time Max Time
-- -------------------- ------ ---- ---- ---- ---- -------- --------
1 ./player 13714 119 5 76 0 2.384s 3.737s
2 ./player-tc3 9555 76 5 119 0 2.230s 3.676s
-- -------------------- ------ ---- ---- ---- ---- -------- --------
(Stone difference: 1.45)
Much better! The only functional difference between the two would be the edge
fix (and a little higher maximum time limit, 4.88 instead of 4.80). So the
conclusion is that apperantly the old time allocation was much better! The
bigger difference in score than in the previous set of games I cannot account
for. It makes sense to chalk it up to chance, as the stone difference appears
to be smaller than before, suggesting the player was not fundamentally stronger.
================================================================================
It turns out that the CodeCup memory limit is 64,000,000 bytes, not 64 MB!
This is slightly over 61 MB. I have set the transposition table to use 57 MB,
leaving about 4 MB for the rest of the program. This seems safe because I do
little other allocations, and although the stack grows dynamically, I do not
use a large amount of stack space. The only recursive function is the search
in AI.c but that is limited to depth 20.
================================================================================
For the transposition table, an interesting question is what happens when we
want to store a position in the transposition table, but its designated table
entry is already in use.
The simplest solution is to always replace the old entry with the new entry.
This prevents stale positions (whose information will never be used anymore)
from keeping useful new data out of the table. This was the first approach I
implemented. The downside is that sometimes valuable data is discared, for
example, when overwriting the search result for a large subtree with the result
of evaluating a single leaf node.
Since we store the search depth used for position in the transposition table,
there is a simple remedy for this: look at the search depth used for the
existing position, and if it is larger than our current search depth, we discard
the new data and keep the old position information. This works great when
solving a single position, but it doesn't work so well in the context of a game
where the transposition table data is preserved across moves. This is because
the transposition table will remain many positions which occur only in
variations that were once possible but have since been cut off (or even have
literally occured in the game history). Keeping these positions around is
useless and especially harmful during the endgame, when the transposition table
is getting filled.
To get the best of both wordls (keep more detailed information, but purge stale
positions too) I implemented a hybrid approach. For each position I define its
relevancy as the number of moves played in the game before it was reached plus
twice the remaining search depth. An existing position is replaced if its
relevancy is less than or equal to the relevancy of the new position.
This has the effect that for positions found from the same starting point, those
higher up in the search tree still take precedence over those lower in the tree
(with have shallower depth remaining and thus are cheaper to recompute).
However, because positions that occur later in the game have higher relevancy
than those that occur earlier, stale entries can be replaced by new entries
occuring in later positions at the same depth or even shallower depth. This is
the second method I implemented.
================================================================================
After test competition 4. Timing seems to be pretty much exact now; raised limit
to 4.95s (seems sensible to keep 50ms of wiggle room).
Only one lost match, against the number two; see logfile:
"102-1558-55523 Maks Verver vs Abdessamad ELKASIMI: BLACK WIN"
Or online:
http://www.codecup.nl/showgame.php?ga=55523
I get disconnected very early. Maybe I should ensure the Dvonn pieces are spread
around so this doesn't happen!
================================================================================
After test competition 5. All games won.
Tested to ensure principal variation search is useful:
No Player Points Won Tied Lost Fail Avg Time Max Time
-- ------------------------------ ------ ---- ---- ---- ---- -------- --------
1 ./player-tc5 (--pvs=1) 31097 250 15 235 0 2.830s 3.651s
2 ./player-tc5 --pvs=0 29626 235 15 250 0 2.826s 3.710s
-- ------------------------------ ------ ---- ---- ---- ---- -------- --------
Score difference: 0.24
So looks like it is, though the benefit is small.
Also evaluated some variations for calculating the final game score, basing the
score mostly on the winner or the controlled player's number of pieces (because
these are counted in the CodeCup too!) but testing against the previous
submission as well as several weaker opponents does not show this to be an
improvement.
Evaluated set-up functions by running my player four times from the starting
position and seeing who wins. Using only games from the second phase of test
competition 5, the results are as follows (grouped per permutation group):
Using my player: Using Dvonner:
22 Szymon Wylezol 5 Bauke Conijn
25 Bauke Conijn 5 Joakim Mjärdner
29 Joakim Mjärdner 8 - (tie)
35 John Christian David 9 Szymon Wylezol
38 - (tie) 11 John Christian David
46 Marcel Vlastuin 12 Marcel Vlastuin
47 Jorik Mandemaker 13 Leon Schoorl
60 Louis Bliss 13 Louis Bliss
68 Leon Schoorl 15 Jorik Mandemaker
68 Rahim Hajibagheran 17 Rahim Hajibagheran
86 saeedeh sadeghi 22 saeedeh sadeghi
96 imanOmojtaba imanOmojtaba 23 imanOmojtaba imanOmojtaba
99 Thijs Marinussen 23 Thijs Marinussen
100 Tomek Czajka 24 Maks Verver
102 Laurent van den Bos 26 Pieter Bootsma
105 Koos van der Linden 27 Koos van der Linden
106 Pieter Bootsma 27 Mohammad Shokri
115 Bas Nieuwenhuizen 28 Abdessamad ELKASIMI
115 Christian Kauth 28 Laurent van den Bos
119 Mohammad Shokri 31 Christian Kauth
121 Valentin Hirschi 31 Tomek Czajka
122 Abdessamad ELKASIMI 32 Bas Nieuwenhuizen
124 Maks Verver 32 Valentin Hirschi
So in either case I'm in the best group, but I'm not sure exactly how I compare
against e.g. Tomek, since I'm on top first while he is last, and later I am at
the bottom while he is near the top! Maybe the conclusion should be that my
current setup works best for the type of follow-up play I do (and Dvonner seems
to do better from other starting positions).
================================================================================
After test competion 6. Top players are getting closer. I lost a few matches,
one (out of four) against Tomek Czajka.
Games lost by me:
104-1662-58383 Maks Verver vs Bas van den Aardweg: BLACK WIN.txt
104-1663-58408 Christian Nilsson vs Maks Verver: WHITE WIN.txt
104-1664-58432 Tomek Czajka vs Maks Verver: WHITE WIN.txt
Games lost by Tomek:
104-1634-57550 Tomek Czajka vs Maks Verver: BLACK WIN.txt
104-1641-57858 Tomek Czajka vs Bas van den Aardweg: BLACK WIN.txt
104-1655-58217 Tomek Czajka vs Christian Kauth: BLACK WIN.txt
104-1658-58289 Tomek Czajka vs Sandor Pandur: BLACK WIN.txt
104-1634-57549 Maks Verver vs Tomek Czajka: WHITE WIN.txt
104-1639-57773 Leon Schoorl vs Tomek Czajka: WHITE WIN.txt
104-1640-57817 Abdessamad ELKASIMI vs Tomek Czajka: WHITE WIN.txt
104-1655-58218 Christian Kauth vs Tomek Czajka: WHITE WIN.txt
104-1659-58314 Leon Schoorl vs Tomek Czajka: WHITE WIN.txt
104-1662-58387 Abdessamad ELKASIMI vs Tomek Czajka: WHITE WIN.txt
104-1663-58409 Bas van den Aardweg vs Tomek Czajka: WHITE WIN.txt
104-1664-58431 Maks Verver vs Tomek Czajka: WHITE WIN.txt
Interesting is that Bas van den Aardweg and Christian Nilson beat both Tomek
and me, so they might be doing something right! Need to investigate.
================================================================================
Friday 21 January -- Last day optimizations!
Can't override GCC 4.3's compilation target, which is too bad. The difference
between "-O2 -march=i386" and "-O2 -march=pentium4" seems to be about 5%, and
with -O3 even around 6%. GCC 4.4 supports __attribute__((target())) to override
the target in the source code, but since I can't find anything equivalent for
GCC 4.3, I'll leave this out.
Hardcoding parameters (most notably evaluation depth) doesn't improve
performance notably. (But I'll leave it in now I've coded it, I guess.)
I seem to do about 300,000 evaluations per second on the CodeCup servers, so
a maximum of around 1,500,000 entries, which would be nearly 70% of my hash
table! In practice, it tends to be closer to 50%, but it's still tricky.
"Improved" hashtable does seem to give a small performance boost, but I'm
worried performance might be affected, so I'll probably keep it disabled.
The "fancy" evaluation function (that weighs fields by distance to Dvonns)
seems to be paying off, but only against itself, which is weird. :-/
================================================================================
With a CodeCup or bust attitude (and after some preliminary testing) I ended up
submitting a program based on the experimental fancy-evaluation branch, and with
largely untested time distribution code; linear probing in the hashtable didn't
make the cut. It was 4:50 AM when I finally submitted my program, and although
I couldn't sleep until much later, losing some sleep was worth it: my program
won the CodeCup without losing a single game! \o/