-
Notifications
You must be signed in to change notification settings - Fork 8
/
README
797 lines (645 loc) · 37.5 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
[note: this file is outdated now that GMP-ECM moved to gitlab.]
This is the README file for GMP-ECM. (See INSTALL-ecm for installing GMP-ECM
and the ecm library, and README.lib for using the ecm library.)
Table of contents of this file:
1. Basic usage.
2. How to use P-1, P+1, and ECM efficiently?
3. Extra factors and Brent-Suyama's extension.
4. Memory usage.
5. Expression syntax reference for GMP-ECM's syntax parser.
6. Options -param, -sigma, -x0, -y0, -A, -torsion.
7. Options -save, -resume and -chkpnt.
8. How to get the best of GMP-ECM?
9. GMP-ECM and GPU.
11. Record factors.
11. Known problems.
12. GMP-ECM packages.
##############################################################################
1. Basic usage
GMP-ECM reads the numbers to be factored from stdin (one number on each
line) and requires a numerical parameter, the stage 1 bound B1. A reasonable
stage 2 bound B2 for the given B1 is chosen by default, but can be overridden
by a second numerical parameter. (For a given B1 value, the "default" B2 might
differ between ECM, P-1, and P+1.) By default, GMP-ECM uses the ECM factoring
algorithm.
Example: To run one curve of ECM with B1=1000000 on each number in the file
"composites", run
ecm 1000000 < composites
To use a B2 value of ~5*10^8 instead of the default value of ~10^9, run
ecm 1000000 5e8 < composites
Scientific notation is accepted for B1 and B2 values. The B1 value might be
of the form B1min-B1, in which case primes up to B1min are skipped, for
example with 1e6-1e9, primes up to 10^6 are skipped. In particular using
the same value enables one to skip stage 1, for example 1e9-1e9 (this is
useful together with -resume when stage 1 was computed with another tool
or on another computer). The actual B2 value used may be larger than the
specified value to let parameters satisfy some conditions imposed by the
stage 2 algorithm.
To run one curve with B1=11e7 on M1061, simply do:
echo "2^1061-1" | ecm 11e7
To run more than one ECM curve on each input number, use the -c parameter.
Example: to run 100 curves with B1=1000000 and default B2 on each number
in "composites", run
ecm -c 100 1000000 < composites
To use the P-1 or P+1 factoring methods, use the -pm1 or -pp1 parameter,
respectively.
Example: to use the P-1 method with B1=10^9 on all numbers in the file
"composites", run
ecm -pm1 1e9 < composites
Note that, unlike for ECM, using the same B1,B2 bounds on one number is
quite useless for P-1, and of limited use for P+1. See "2. How to use P-1,
P+1, and ECM efficiently?"
##############################################################################
2. How to use P-1, P+1, and ECM efficiently?
The P-1 method works well when the input number has a prime factor P such
that P-1 is "smooth", i.e., has all its prime factor less or equal the
step 1 bound B1, except one which may be less or equal the second step
bound B2. For P=67872792749091946529, we have P-1 = 2^5 * 11 * 17 * 19 *
43 * 149 * 8467 * 11004397, so this factor will be found as long as B1 >= 8467
and B2 >= 11004397:
$ echo 67872792749091946529 | ./ecm -pm1 -x0 2809890345 8467 11004397
GMP-ECM ... [powered by GMP ...] [P-1]
Input number is 67872792749091946529 (20 digits)
Using B1=8467, B2=6710-19370830, x0=2809890345
Step 1 took 3ms
Step 2 took 14ms
********** Factor found in step 2: 67872792749091946529
Found input number N
There is no need to run P-1 several times with the same B1 and B2 as there
is for ECM, since a factor found with one seed will (almost always) be found
by another one.
Note: if a factor of P-1 if known, you can specify it with -go. For example
it is known that the Fermat number 2^(2^n)+1 has factors of the form
k*2^(n+2)+1, thus 2^(n+2) is a known factor of P-1. The -go expression has
a special placeholder 'N' which stands for the input number, for example:
$ echo "2^(2^12)+1" | ecm -go "N-1" -pm1 1e6
The P+1 method works well when the input number has a prime factor P such
that P+1 is "smooth". For P=4190453151940208656715582382315221647, we have
P+1 = 2^4 * 283 * 2423 * 21881 * 39839 * 1414261 * 2337233 * 132554351, so
this factor will be found as long as B1 >= 2337233 and B2 >= 132554351:
$ echo 4190453151940208656715582382315221647 | ./ecm -pp1 -x0 7 2337233 132554351
GMP-ECM ... [powered by GMP ...] [P+1]
Input number is 4190453151940208656715582382315221647 (37 digits)
Using B1=2337233, B2=2324738-343958122, x0=7
Step 1 took 750ms
Step 2 took 120ms
********** Factor found in step 2: 4190453151940208656715582382315221647
Found input number N
However not all seeds will succeed: only half of the seeds 'x0' work for P+1
(namely those where the Jacobi symbol of x0^2-4 and P is -1.) Unfortunately,
since P is usually not known in advance, there is no way to ensure that this
holds. However, if the seed is chosen randomly, there is a probability of
about 1/2 that it will give a Jacobi symbol of -1 (i.e., the factor P will
be found if P+1 is smooth enough). A rule of thumb is to run 3 times P+1
with different random seeds. The seeds 2/7 and 6/5 have a slightly higher
chance of success than average as they lead to a group order divisible by
6 or 4, respectively. When factoring Fibonacci numbers F_n or Lucas
numbers L_n, using the seed 23/11 ensures that the group order is
divisible by 2n, making other P+1 (and probably P-1) work unnecessary.
As of version 6.2, a new stage 2 for the P-1 and P+1 algorithms is
implemented. It uses less memory and is faster than the previous code,
thus allowing larger B2 values. If GMP-ECM is configured with the
"--enable-openmp" flag and is compiled with a compiler that implements
OpenMP, it uses multi-threading for computation of polynomial roots and
NTT multiplication. When not using the NTT, it benefits from multi-threading
only in the computation of roots phase. The number of threads to use can
be controlled with the OMP_NUM_THREADS environment variable. Unlike the
previous generic stage 2, the new stage 2 cannot use the Brent-Suyama
extension (-power and -dickson parameters). Specifying these options on
the command line forces use of the generic stage 2. Note: the notation of
the parameters follows that in the paper, the number of multi-point
evaluations (similar to "blocks") is given by s_2. You can specify a lower
limit for s_2 by the -k command line parameter.
The ECM method is a probabilistic method, and can be viewed in some sense
as a generalization of the P-1 and P+1 method, where we only require that
P+t+1 is smooth, where t depends on the curve we use and satisfies
|t| <= 2*P^(1/2) (Hasse's theorem). The optimal B1 and B2 bounds have
to be chosen according to the (usually unknown) size of P. The following
table gives a set of nearly optimal B1 and B2 pairs, with the corresponding
expected number of curves to find a factor of given size (column "-power 1"
does not take into account the extra factors found by Brent-Suyama's exten-
sion, whereas column "default poly" takes them into account, with the poly-
nomial used by default: D(n) means Dickson's polynomial of degree n):
digits D optimal B1 default B2 expected curves
N(B1,B2,D)
-power 1 default poly
20 11e3 1.9e6 74 74 [x^1]
25 5e4 1.3e7 221 214 [x^2]
30 25e4 1.3e8 453 430 [D(3)]
35 1e6 1.0e9 984 904 [D(6)]
40 3e6 5.7e9 2541 2350 [D(6)]
45 11e6 3.5e10 4949 4480 [D(12)]
50 43e6 2.4e11 8266 7553 [D(12)]
55 11e7 7.8e11 20158 17769 [D(30)]
60 26e7 3.2e12 47173 42017 [D(30)]
65 85e7 1.6e13 77666 69408 [D(30)]
Table 1: optimal B1 and expected number of curves to find a
factor of D digits with GMP-ECM.
After performing the expected number of curves from Table 1, the
probability that a factor of D digits was missed is exp(-1), i.e.,
about 37%. After twice the expected number of curves, it is exp(-2),
i.e., about 14%, and so on.
Example: after performing 8266 curves with B1=43e6 and B2=2.4e11
(or 7553 curves with -dickson 12), the probability to miss a 50-digit
factor is about 37%.
From version 6.0 on, GMP-ECM prints the expected number of curves and expected
time to find factors of different sizes in verbose mode (option -v). This
makes it easy to further optimize parameters for a certain factor size if
desired: simply try to minimize the expected time.
(lengthy NOTE: The order of an elliptic curve with Montgomery parameterization
is known to be divisible by 4. Depending on the parametrization that is used
(see Section 6) this number is increased in different ways. Let us call T the
known torsion (depending on the parametrization). One can assume that the
probability that the order is B1,B2 smooth should be about as great as for a
random integer 1/Tth in value. However, Montgomery observed that the order
behaves even nicer than that: heuristically, it seems that the order is as more
likely to be smooth than a random integer about 1/Tth. In "Finding ECM-Friendly
Curves through a Study of Galois Properties", Barbulescu, Bos, Bouvier,
Kleinjung and Montgomery showed how to computed the average valuation that
should be used instead of the average valuation of random number. This is the
values we use in GMP-ECM and the computed probabilities match those observed in
experiments very well. This however means that the so computed values for the
expected number of curves for given B1,B2 values and factor sizes may not match
those published in the previous literature.
The factor GMP-ECM uses is defined as ECM_EXTRA_SMOOTHNESS in rho.c,
EXTRA_SMOOTHNESS_SQUARE and EXTRA_SMOOTHNESS 32BITS_D in ecm.c depending on the
parametrization. You can change it to
ECM_EXTRA_SMOOTHNESS 3.0
EXTRA_SMOOTHNESS_SQUARE 0.333333333
EXTRA_SMOOTHNESS 32BITS_D 0.333333333
if you want to reproduce the more pessimistic values found in the literature.)
In summary, we advise the following method:
0 - choose a target factor size of D digits
1 - choose optimal B1 and B2 values to find factors of D digits (cf Table 1)
2 - run once P-1 with 10*B1, and the default B2 chosen by GMP-ECM
3 - (optional) run 3 times P+1 with 5*B1, and the default B2
4 - run N(B1,B2,D) times ECM with those B1 and B2, where N(B1,B2,D) is the
expected number of ECM curves with step 1 bound B1, step 2 bound B2,
to find a factor of D digits (cf above table).
5 - if no factor is found, either increase D by 5 digits and go to 0, or use
another factorization method (MPQS, NFS)
Note: if a factor is found in steps 2, 3 or 4, simply continue the current
step with the remaining cofactor (if composite). There is no need
to start again from 0, since the cofactor was already tested, too.
##############################################################################
3. Extra factors and Brent-Suyama's extension.
GMP-ECM may sometimes find some "extra" factors, such that one factor of P-1,
P+1 or P+t+1 exceeds the step 2 bound B2, thanks to Brent-Suyama's extension.
Let's explain how it works for P-1, since it's simpler. The classical step 2
(without Brent-Suyama's extension) considers s^(j*d) mod N and s^i mod N,
where N is the number to factor, and s is the residue computed in stage 1.
Here, d is fixed, and the integers i and j vary in two sets so that j*d-i
covers all primes in [B1, B2]. Now consider a polynomial f(x), and compute
s^f(j*d) and s^f(i) instead of s^(j*d) and s^i [thus the classical step 2
corresponds to f(x)=x^1]. Then P will be found whenever all but one of the
factors of P-1 are <= B1, and one factor divides some f(j*d) - f(i):
$ echo 1207946164033269799036708918081 | ./ecm -pm1 -k 3 -power 12 286493 25e6
GMP-ECM ... [powered by GMP ...] [P-1]
Input number is 1207946164033269799036708918081 (31 digits)
Using B1=286493, B2=30806172, polynomial x^12, x0=1548711558
Step 1 took 320ms
Step 2 took 564ms
********** Factor found in step 2: 1207946164033269799036708918081
Found input number N
Here the largest factor of P-1 is 83957197, which is 3.35 times larger than B2.
Warning: these "extra" factorizations may not be reproducible from one
version of GMP-ECM to another one, since they depend on some internal
parameters that may change.
For P-1 with the generic stage 2, the degree of the Brent-Suyama polynomial
should be even. Since i^2k - (j*d)^2k = (i^k - (j*d)^k)(i^k + (j*d)^k),
this allows testing two values symmetric around a multiple of d
simultaneously, halving the amount of computation required in stage 2.
P+1 with the generic stage 2 and ECM do this inherently.
The new fast stage 2 for P-1 and P+1 does not support the Brent-Suyama
extension. By default, the fast stage 2 is used for P-1 and P+1;
giving a -power or -dickson parameter on the command line forces use of
the previous, generic stage 2. It is recommended to use the new stage 2
(from version 6.2) for P-1 and P+1, which is the default: it is so much
faster that it largely compensates the few extra factors that are not found
because Brent-Suyama's extension is not available.
The default polynomial used for ECM with a given B2 should be near optimal,
i.e., give only a marginal overhead in step 2, while enabling extra factors.
##############################################################################
4. Memory usage.
Step 1 does not require much memory: O(n) for an input number of n digits.
Step 2 may be quite memory expensive, especially for large B2, since its
efficient algorithms use some large tables. To reduce the memory usage of
step 2, you may increase the 'k' parameter, which controls the number of
"blocks" performed in step 2. Multiplying the default value of k by 4 will
decrease the memory usage by a factor of 2. For example with B2=1e10 and
a 155-digit number, step 2 requires about 55MB with the default k=4, but
only 27MB with k=16. Increasing k does, however, slightly increase the time
required for step 2 (see section "How to get the best of GMP-ECM?").
An estimation of the memory usage is given at the start of stage 2:
$ echo 95209938255048826235189575712705128366296557149606415206280987204268594538412191641776798249266895999715600261737863698825644292938050707507901970225804581 > c155
$ ecm -v -k 4 10 1e10 < c155
...
Estimated memory usage: 55M
...
Step 2 took 18649ms
$ ecm -v -k 16 10 1e10 < c155
...
Estimated memory usage: 27M
...
Step 2 took 26972ms
Another way is to use the -treefile parameter, which causes some of the
tables to be stored on disk instead of in memory. Using the option
"-treefile /var/tmp/ecmtree" will create the files "/var/tmp/ecmtree.1",
"/var/tmp/ecmtree.2" etc. The files are deleted upon completion of stage 2:
$ ecm -v -treefile /tmp/ecmtree -k 4 10 1e10 < c155
...
Estimated memory usage: 36M
...
Step 2 took 18648ms
Due to time consuming disk I/O, this will cause stage 2 to take somewhat
longer. How much memory is saved depends on stage 2 parameters, but a
typical value is that memory use is reduced by a factor of about 1.5.
Increasing the number of blocks with -k also reduces the amount of data that
needs to get written to disk, thus reducing disk I/O time. Combining these
parameters is a very effective way of reducing memory use.
Up from version 6.1, there is still another (better) possibility, with the
-maxmem option. The command-line -maxmem nnn option tells GMP-ECM to use at
most nnn MB in stage 2. It is better than -k because it takes into account
the size of the number to be factored, and automatically adjusts the number
of blocks to use:
$ ./ecm -v -maxmem 40 10 1e10 < c155
...
dF=8192, k=15, d=79170, d2=11, i0=-10
...
Estimated memory usage: 27M
...
Step 2 took 25456ms
##############################################################################
5. Expression syntax reference for GMP-ECM's syntax parser.
GMP-ECM can handle several kinds of expressions as input numbers. Here is the
syntax that is handled:
1. Raw decimal numbers like 123456789
2. Comments can be placed in the file. The C++ "one line comment" //
is used. Everything after the // on a line (including the //) is ignored.
Warning: no input number should appear on such a comment line.
3. Any white space (space, tab, end of line) is ignored. However, the "end of
line" is used to end the expression. For example, processing this:
1 2 3 4 5 6 7 8 9
would be the same as processing
123456789
4. "common" arithmetic expressions (* / + - %), the period '.' might be used
in place of * for multiply, and - can be unary minus (e.g., -55555551).
Example: echo "3*5+2" | ./ecm 100
5. Grouping ( [ { for start of group (which symbol is used does not matter)
and ) ] } to end a group (again all 3 symbols mean the SAME thing).
6. Exponentiation with the ^ character (i.e., 2^24 is the same as 16777216).
Example: echo "2^24+1" | ./ecm 100
7. Simple factorial using the exclamation point ! character. Example is
53! == 1*2*3*4...*52*53. Example: echo '53!+1' | ./ecm 1e2
8. Multi-factorial as in: n!m with an example: 15!3 == 15.12.9.6.3.
9. Simple Primorial using the # character with example of 11# == 2*3*5*7*11
10. Reduced Primorial n#m with example of 17#5 == 5.7.11.13.17
11. Functions are possible with the expression parser. Currently, the only
available function is Phi(n,x), which is the value of the n-th cyclotomic
polynomial at x, however other functions should be easy to add.
Note: Expressions are maintained as much as possible (even if the expression
becomes longer than the decimal expansion). Expressions are output as cofactors
(if the input was an expression), and are stored into save/resume files
(again if and only if the original input was an expression, and not an expanded
decimal number). When a factor is found, the cofactor expression is of the
form (original_expression)/factor_found (see however option -cofdec):
$ echo "3*2^210+1" | ./ecm -sigma 4007218240 2500
GMP-ECM ... [powered by GMP ...] [ECM]
Input number is 3*2^210+1 (64 digits)
Using B1=2500, B2=186156, polynomial x^1, sigma=4007218240
Step 1 took 16ms
Step 2 took 16ms
********** Factor found in step 2: 1358437
Found probable prime factor of 7 digits: 1358437
Probable prime cofactor (3*2^210+1)/1358437 has 58 digits
##############################################################################
6. Options -param, -sigma, -x0, -y0, -A, -torsion.
This section explains the behavior of the options 'param', 'sigma', 'x0', 'y0',
'A' and 'torsion' when GMP-ECM is used to run ECM.
The parameters A, x0 and y0 can have integer or rational values, for example
-A 22/7 -x0 1/3 -y0 2/7.
a) with -param <= 4 (default value is 0): curves in Montgomery form
-------------------------------------------------------------------
In stage 1, the elliptic curves used are of the following Montgomery form:
b*y^2 = x^3 + A*x^2 + x. The starting point in ( x0 : : 1).
In this case we do not need the value of y0.
Either sigma is given, in which case the values of A and x0 are deduced from
sigma and the params option (see below).
Or A and x0 are given (in which case both must be given).
The -param option is used to choose the parametrization from which the curves
are taken. It can take 4 values (s is the 'sigma' parameter):
0: use Suyama parametrization. It was the parametrization used by previous
versions of GMP-ECM.
u = s^2-5, v = 4*s,
A = (v-u)^3*(3*u+v)/(4*u^3*v)-2,
x0=u^3/v^3.
1: (only for 64-bit processors)
A = 4*s^2/2^64-2,
x0 = 2.
2: use an elliptic parametrization to compute d(s) such that the curve has
a 6-torsion point.
A = 4*d(s)-2,
x0 = 2.
3: it is mostly used for the gpu computation.
A = 4*s/2^32-2,
x0 = 2.
Generally, the parameter s is chosen randomly but the -sigma option can be
used to choose its value.
* with -param 0, there is no constraint on sigma
* with -param 1 (only for 64-bit processors), s^2 should fit in a 64-bit
integer, thus s < 2^32
* with -param 2, there is no constraint on sigma
* with -param 3 (mostly used for GPU), s should fit in a 32-bit word,
thus s < 2^32
In the case where only "-sigma s" is used, "-param 0" is assumed in order to be
compatible with older version of GMP-ECM. The above is also true when resuming
from a file (see Section 7). In the case where only "-param p" is used, the
parameter is chosen at random by the ecm library. In the case where neither
"-sigma" nor "-param" is used, the choice of the parametrization is left to ecm
library and the parameter is chosen at random. In the case where the two
options are used, "-param p -sigma s" can be shorten in "-sigma p:s".
The curve can also be chosen with the -A option which directly specifies the
values of the coefficient a of the curve. In this case the -x0 option is
mandatory.
The -x0 option can be used to override the x0 value in the case where param=0
and is mandatory in the case the curve is given with the -A option. In the case
where param=1, param=2 or param=3, x0 must be equal to 2, so this should not be
used.
b) with param = 5: curves in Weierstrass form
---------------------------------------------
In this case we use curves in Weierstrass form:
E: y^2 = x^3 + A*x + b
The user must provide all of A, x0 and y0. GMP-ECM will consider that
(x0, y0) is a point on E, where b = y0^2-x0^3-A*x0 mod N. Step 2 is performed
as usual, since it already uses the Weierstrass form.
The options of Section 7 apply.
Applications include using curves with some exotic prescribed torsion
subgroup (see d below), or CM (Complex Multiplication) curves.
c) with param = 6, 7: curves in (twisted) Hessian form
------------------------------------------------------
These curves have torsion group Z3xZ3 over Q(sqrt(-3)) and are useful when
the prime factor p we are looking for is known to satisfy p = 1 mod 3.
Since p = 1 mod 3 is quite frequent, a try at this version is not unthinkable
in general.
Curves in projective Hessian form have equation
X^3+Y^3+Z^3=D*X*Y*Z.
They should be used with
-param 6 -A <D> -x0 <x0> -y0 <y0>
with D^3 mod N != 1, and (x0=X0/Z0, y0=Y0/Z0) is a base point on the curve.
Curves in projective twisted Hessian form have equation
a*X^3+Y^3+Z^3=d*X*Y*Z.
They should be used with
-param 7 -A <D> -x0 <x0> -y0 <y0>
where D=a^3/d and (x0=X0/Z0, y0=Y0/Z0) is a base point on the curve.
Ref: JKL-ECM in proceedings ANTS-XII.
d) -torsion:
------------
This option enables the user to ask for a given curve with specific torsion
group (over the rationals). For instance
-torsion Z5 -sigma 2
will generate a curve E over the rationals with torsion group Z/5Z and a point
of infinite order, both generated from parameter sigma = 2.
Note that in the present case, -param is overriden by whatever "good" form
the curves are built with that prescribed torsion.
Available torsion groups (over the rationals) are:
* Z5: for sigma != 0, -1/2, -1/3, -1/4, cf. [1]
* Z7: [1]
* Z9: [1]
* Z10: [1]
* Z2xZ8: [1]
* Z3xZ3
* Z4xZ4
References: [1] Atkin/Morain, Math. Comp., 1993.
##############################################################################
7. Options -save, -resume and -chkpnt.
These -save and -resume options are useful to save the current state of the
computation after step 1, or to exchange data with other software. It allows
to perform step 1 with GMP-ECM, and step 2 with another software (or
vice-versa).
Note: the residue from the end of stage 1 gets written to the file only
after stage 2, if stage 2 is performed in the same program run. This way,
if a factor is found, the save file entry will contain the new cofactor
(if composite) or will be omitted (if cofactor is a probable prime). For
periodic saving during stage 1 for crash recovery, use -chkpnt,
described below.
Here is an example how to reuse some P-1 computation:
$ cat c71
13155161912808540373988986448257115022677318870175067553764004308210487
$ ./ecm -save toto -pm1 -mpzmod -x0 2 5000000 < c71
GMP-ECM ... [powered by GMP ...] [P-1]
Input number is 13155161912808540373988986448257115022677318870175067553764004308210487 (71 digits)
Using B1=5000000, B2=352526802, polynomial x^24, x0=2
Step 1 took 3116ms
Step 2 took 2316ms
The file "toto" now contains some information about the method used, the step
1 bound, the number to factor, the value X at the end of step 1 (in hexa-
decimal), and a checksum to verify that no data was corrupted:
$ cat toto
METHOD=P-1; B1=5000000; N=13155161912808540373988986448257115022677318870175067553764004308210487; X=0x12530157ae22ae14d54d6a5bc404ae9458e54032c1bb2ab269837d1519f; CHECKSUM=2287710189; PROGRAM=GMP-ECM 6.2; X0=0x2; [email protected]; TIME=Sat Apr 12 13:41:01 2008;
Then one can resume the computation with larger B1 and/or B2 as follows:
$ ./ecm -resume toto 1e7
GMP-ECM ... [powered by GMP ...] [ECM]
Resuming P-1 residue saved by [email protected] with GMP-ECM 6.2 on Sat Apr 12 13:41:01 2008
Input number is 13155161912808540373988986448257115022677318870175067553764004308210487 (71 digits)
Using B1=5000000-10000000, B2=9946848-1326917772,
Step 1 took 3076ms
Step 2 took 4304ms
********** Factor found in step 2: 1448595612076564044790098185437
Found probable prime factor of 31 digits: 1448595612076564044790098185437
Probable prime cofactor 9081321110693270343633073697474256143651 has 40 digits
The second run only considered the primes in [5e6-10e6] in step 1,
which saved half the time of step 1.
The format used is the following:
- each line corresponds to a composite (expression ARE saved in the save file)
- a line contains assignments <id>=<value> separated by semi-colons ';'
- possible values for <id> are
- METHOD (value = ECM or P-1 or P+1)
- PARAM (value = ECM parametrization) [ECM only] (see Section 6)
- SIGMA (value = ECM sigma parameter) [ECM only] (see Section 6)
- B1 (first step bound)
- N (composite number to factor)
- X (value at the end of step 1)
- A (A-parameter of the elliptic curve) [ECM only]
- CHECKSUM (internal value to check correctness of the format)
- PROGRAM (program used to perform step 1, useful for factor credits)
- X0 (initial point for ECM, or initial residue for P-1/P+1) [optional]
- WHO (who performed step 1)
- TIME (date and time of first step)
PARAM, SIGMA and X0 would be optional, and would be mainly be used in case of
a factor is found, to be able to reproduce the factorization.
For ECM, one of the SIGMA or A values must be present, so that the
computation can be continued on the correct curve.
The B1 and X values satisfy the condition that X is a lcm(1,2,...,B1)-th
power in the (multiplicatively written) group.
If consecutive lines in a save file being resumed contain the same number to
be factored, say when many ECM curves on one number have been saved, factors
discovered by GMP-ECM are carried over from one attempt to the next so that
factors will be reported only once. If the cofactor is a probable prime, or
if the -one option was given and a factor was found, the remaining
consecutive lines for that number will be skipped.
Note: it is allowed to have both -save f1 and -resume f2 for the same run,
however the files f1 and f2 should be different.
Remark: you should not perform in parallel several -resume runs on the same
input with the same B1/B2 values, since those runs will do the same
computations. Options -save/-resume are useful in the following cases:
(a) somebody did a previous step 1 computation with another software
which is faster than GMP-ECM, and wants to perform Step 2 with
GMP-ECM which is faster for that task.
(b) somebody did a previous step 1 for P-1 or P+1 up to a given bound
B1, and you want to extend that computation with B1' > B1, without
restarting from scratch. Note: this does not apply to ECM, where
the smoothness property depends on the (random) curve chosen, not
only on the input number.
(c) you did a huge step 1 P-1 or P+1 computation on a given machine, and you
want to perform a huge step 2 in parallel on several
machines. For example machine 1 tests the range B2_1-B2_2, machine
2 tests B2_2-B2_3, ... This also decreases the memory usage for
each machine, which is function of the range width B2max-B2min.
For the same reason as (b), this does not apply to ECM.
The -chkpnt option causes GMP-ECM to write the current residue periodically
during the stage 1 computation. This is useful as a safeguard in case the
GMP-ECM process is terminated, or the computer loses power, etc. The
checkpoint is written every ten minutes, when a signal (SIGINT, SIGTERM) is
received, and at the end of stage 1. The format of the checkpoint file is
very similar to that of regular save files, and checkpoints can be resumed
with the -resume option.
For example:
$ ecm -chkpnt pm1chkpoint -pm1 1e10 1 < largenumber.txt
[Computer crashes during computation]
$ ecm -resume pm1chkpoint 1e10 1
Note 1: if an existing file is specified as the checkpoint file, it will be
silently overwritten!
Note 2: When resuming a checkpoint file, additional small primes may be
processed in stage 1 when the checkpoint file is resumed, so the
end-of-stage 1 residues of an uninterrupted run and a checkpointed run
may not match. The extra primes do not reduce the probability of finding
factors, however.
Note 3: for ECM, the -chkpnt option is only implemented with -param 0 so far.
##############################################################################
8. How to get the best of GMP-ECM?
After configuring GMP-ECM, and "make", do:
$ make ecm-params
This will optimize parameters for your machine and put them in ecm-params.h.
If you do an out-of-source build, then:
* copy the new ecm-params.h file into the source folder
* run "make clean" and "make" again
The ecm program automatically selects what it thinks is the best
arithmetic for the given input number. If that choice is not optimal, you may
force the use of a certain arithmetic by trying options -modmulm, -mpzmod,
-redc. (The best choice should depend on B1 and B2 only very little, so long
as B1 is not too small, say >= 1000.)
Number of step 2 blocks. The step 2 range [B1,B2] is divided into k
"big blocks". The default value of k is chosen to be near to optimal.
However, it may be that for a given (B1,B2) pair, another value of k
may be better. Try for this to give the option -k <value> to ecm,
where <value> is 1, 2, 3, ... This will force ecm to divide step 2
in at least <value> blocks.
Changing the value of the number of blocks will not modify the chance
of finding a factor (except for extra factors, but some will be lost,
and some will be won, so the balance should be nearly even). However it
will change the time spent in Step 2 and modify the memory used by Step 2
(see the section "Memory usage").
Optimal thresholds. The thresholds for the algorithms used in ecm are defined
in ecm-params.h, which tries to select a good "params.h" file.
Several params.h files are included in the distribution
and the configure script will select one matching your machine if it exists
(see the output of ecm -printconfig).
If there is no params.h for your machine then see the section "How to get the
best of GMP-ECM?" to generate one optimized for your machine.
Stage 2 now uses Number-Theoretic Transforms (NTT) for polynomial arithmetic
by default for numbers of at most 30 machine words (NTT_SIZE_THRESHOLD in
ecm-ecm.h). The NTT code forces dF to be a power of 2; it can be disabled by
passing the command-line option -no-ntt and unconditionally enabled by -ntt.
Performance of NTT is dependent on:
- Architecture. NTT seems to give the greatest improvement on Athlons,
and the least improvement on Pentiums without SSE2.
- Thresholds. It is vital to have ecm-params.h properly tuned for your machine.
- C compiler. The SSE2 assembly code for 32 bit and the assembly code for
64 bit only work for x86 using gcc or Intel cc, so it is compiler dependent.
Note on factoring Fermat numbers:
GMP-ECM features Schönhage-Strassen multiplication for polynomials in
stage 2 when factoring Fermat numbers (not in the new, fast stage 2 for
P+1 and P-1. This is to be implemented.) This greatly reduces the number of
modular multiplications required, thus improving speed. It does, however,
restrict the length of the polynomials to powers of two, so that for a given
number of blocks (-k parameter), the B2 value can only increase by factors of
approximately 4.
For the number of blocks, choices of 2, 3 or 4 usually give best performance.
However, if the polynomial degree becomes too large, relatively expensive
Karatsuba or Toom-Coom methods are required to split the polynomial before
Schönhage-Strassen's method can handle them. That can make a larger number
of blocks worthwhile. When factoring the m-th Fermat number F_m = 2^(2^m)+1,
degrees up to dF=2^(m+1) can be handled directly. If your B2 choice
requires a degree much larger than this (dF is printed with the -v
parameter), try increasing the number of blocks with -k and see if
performance improves.
The Brent-Suyama extension should not be used when factoring Fermat numbers,
it is more efficient to simply increase B2. Therefore, -power 1 for P+1 and
ECM, and -power 2 for P-1 are the default for Fermat numbers. (Larger
degrees for Brent-Suyama may possibly become worthwhile for P-1 runs on
smaller Fermat numbers and extremely large B2, when Karatsuba and Toom-Cook
are used extensively.)
Factoring Fermat numbers uses a lot of memory, depending on the size of the
Fermat number and on dF. For dF=65536 and F_12, the memory used is about
1700MB. If your system does not have enough memory, you will have to use a
larger number of blocks to reach the desired B2 value with a smaller poly
degree dF, which sacrifices some performance. Additionally, you may use the
-treefile option (see 4. Memory usage)
k=1 k=2 k=3 k=4
dF=256 582132 1222002 1864182 2504052
dF=512 2443992 5008092 7572192 10131672
dF=1024 10016172 20263332 30519732 40766892
dF=2048 42634420 85689250 128744080 171798910
dF=4096 173259252 347500242 521780502 696021492
dF=8192 711738310 1425139180 2138540050 2851940920
dF=16384 2850278350 5703881830 8557643650 11411247130
dF=32768 11702792020 23412731170 35122670320 46832609470
dF=65536 48071333326 96165459406 144259585486 192353711566
dF=131072 194020810630 388069884940 582118959250 776168033560
Table 2: Stage 2 interval length B2-B2min, for dF
a power of 2 and small values of k.
For example, if you'd like to run stage 2 on F_12 with B2 ~= 40G, try
parameters "-k 1 <B1> 48e9", "-k 3 <B1> 35e9" or "-k 4 <B1> 46e9".
##############################################################################
9. GMP-ECM and GPU
The stage 1 of ECM can be computed on a Nvidia GPU.
More details are given in README.gpu.
##############################################################################
10. Record factors.
If you find a very large factor, the program will print a message like:
Report your potential champion to <email address>
(see <url for record factors>)
This means that your factor might be a champion, i.e., one of the top-ten
largest factors ever found by the corresponding method (P-1, P+1 or ECM).
Cf the following URLs:
ECM: https://maths-people.anu.edu.au/~brent/ftp/champs.txt
P-1: http://www.loria.fr/~zimmerma/records/Pminus1.html
P+1: http://www.loria.fr/~zimmerma/records/Pplus1.html
##############################################################################
11. Known problems.
On some machines, GMP-ECM uses the clock() function to measure the cpu time
used by step 1 and 2. Since that function returns a 32-bit integer, there
is a possible wrap-around effect when the clock() value goes back from 2^32-1
to 0, which may produce negative timings.
The NTT code uses primes that fit in one machine word and that are
congruent to 1 (mod l), where l is the largest transform length required
for the desired stage 2 parameters. For very large B2 on 32-bit machines,
there may not be enough suitable primes, which may limit the possible
transform length to less than what available memory would permit. This
problem occurs mostly in the fast stage 2 for P-1 and P+1, as the generic
stage 2 uses far more memory for a given polynomial degree, so that memory
on a 32-bit machine will be exhausted long before suitable NTT primes are.
The maximal transform length depends on the size of the input number.
For a transform length l on a 32 bit machine, N must satisfy:
l=2^11:N<2^756200, l=2^12:N<2^379353, l=2^13:N<2^190044, l=2^14:N<2^94870,
l=2^15:N<2^47414, l=2^16:N<2^23322, l=2^17:N<2^11620, l=2^18:N<2^5891,
l=2^19:N<2^2910, l=2^20:N<2^1340, l=2^21:N<2^578, l=2^22:N<2^228.
Since log(N)*l is approximately constant, this limits the amount of memory
that can be used to about 600MB for P-1, and 1200MB for P+1.
##############################################################################
12. GMP-ECM packages.
GMP-ECM is a standard Debian package. See for example
https://packages.debian.org/sid/gmp-ecm.
An opam package was built by Michael Soegtrop and was tested on MaOS
(including Apple silicon), many Linux variants and Windows (MinGW with cygwin
build host). See https://github.com/ocaml/opam-repository/pull/20105.