DOUBLE MERSENNE PRIME | A distributed math search

Double Mersenne Productivity Page

On this page we will show how programs perform related to the range to analyze. All calculations have been made using an Intel i7 9800X with 32GB of RAM DDR4.

We have used a single core for the testing of mmff and gmp-doublemersennes. We chose 8 workers for mprime and 16 threads for dmdsieve

mmff and dmdsievecl used a RTX 2060 8GB GPU. The parameter for dmdsieve is -W 16, while -g 512 -G 2 has been used for dmdsievecl.

Your mileage may vary depending on the hardware used.

Double Mersenne numbers that are prime

MMp	#	status
MM2	1	Prime!
MM3	2	Prime!
MM5	3	Prime!
MM7	4	Prime!

Numbers tested with ECM

These numbers have been trial-factored for very high values of k (see the history page), and ECM is the best way to search for factors.

ECM with mprime
MMp	#	Exponent
			Digits in factor	25	30	35	40	45	50	55	60	65	70	75
			Bound #1	50k	250k	1M	3M	11M	44M	110M	260M	800M	2400M	7200M
			Curves to test	280	640	1,566	4,588	9,201	16,550	40,830	105,362	194,238	350,860	567,615
MM13	5	8,191	sec/curve	done	done	done	done	done	done	done	done	43,934
MM17	6	131,071	sec/curve	done	done	done	done	done	done	8,462
MM19	7	524,287	sec/curve	done	done	done	done	done	3,135

Numbers tested with mmff

George Woltman's mmff uses GPU to speed up trial-factoring of such numbers. At the moment, MM31 and MM127 reached their seach limits.

mmff
MMp	#	K	K/sec
MM31	8	1,155,000,000,000,000,000	30 Gk/sec
MM61	9	600,000,000,000,000,000	11 Gk/sec
MM89	10	78,000,000,000,000,000	7.8 Gk/sec
MM107	11	300,000,000,000,000,000	4.3 Gk/sec
MM127	12	1,152,921,504,606,846,976	3.2 Gk/sec

Numbers tested with Dmdsieve -x

Dmdsieve is a sieving program written by Mark Rodenkirch.

Mark added a nice option to test the candidates remaining after the sieving.

Moereover, he added parallelism to the sieve, using OpenCL routines

The following table shows the efficiency of the program with different ranges of k.

We assume K/sec using a single instance of the software.

dmdsieve				timing for k range
MMp	#	K start	K/sec	1e7	1e8	1e9	1e10	1e11
MM521	13	1,250,000,000,000	20,000	132s	227s	20m	3h	-
MM607	14	1,250,000,000,000	12,500	139s	288s	32m	5h	-
MM1279	15	240,000,000,000	1,850	235s	21m	3h	1.25d	-
MM2203	16	60,000,000,000	430	10m	1.4h	13h	5d	-
MM2281	17	48,000,000,000	375	12m	1.5h	14h 45m	-	-
MM3217	18	19,200,000,000	150	26m	4h	-	-	-
MM4253	19	9,600,000,000	66	55m	8h 30m	-	-	-
MM4423	20	9,000,000,000	60	1h	9h 20m	-	-	-
MM9689	21	4,800,000,000	8.27	7h	2d 15h	-	-	-
MM9941	22	4,800,000,000	7.65	7h 20m	2d 21h	-	-	-
MM11213	23	3,200,000,000	5,90	9.5h	-	-	-	-
MM19937	24	1,200,000,000	2,00	1d	-	-	-	-
MM21701	25	960,000,000	1.14	2d	-	-	-	-
MM23209	26	800,000,000	0.95	2d 8h	-	-	-	-
MM44497	27	200,000,000	0.18	12d	-	-	-	-
MM86243	28	48,000,000	0.04	50d	-	-	-	-

dmfs vs dmdsieve

To test MMp between MM521 and MM86243 we have two options: dmdsieve -x (as we just saw) and Gary Gostin's dmfs

While dmdsieve concentrates on sieving, trying to cancel out the higher possible number of candidates before the divisibility testing, dmfs relies on multi-threading.

dmfs can use up to 256 parallel threads to divide and conquer a k_range.

We cannot have a 100 percent parallelism betweem threads, there will always be some diminishing of the efficiency. We cannot have a 100% parallelism betweem threads, there will always be some diminishing of the efficiency.

Which program is more efficient for our search? The answer is... it depepds.

Here is a table where we have tested all the MMp numbers with different k ranges and different programs having different configurations.

We tested dmfs with a single thread, dmfs with 8 concurrent threads and dmdsieve with 1 thread and parallelized sieve.

We will show how much faster dmfs is with 8 threads with respect with the version single-threaded - and no, it's not a plain 800%.

We also show how a deeper sieving done by dmdsieve might be winning against a parallelization of testing tasks.

The variable you should keep an eye on is k_sec for the three configurations.

If you have an efficient multicore platform, you might choose dmfs - If you have a fast GPU and you search the smallest MMPs, dmdsieve might be your first choice.

Here is the comparison table!

		dmfs 1 thread				dmfs 8 thread				dmfs	dmdsieve				dmfs vs dmdsieve
#	MMp	k_range	k_tested	% tested	k/sec 1 thr	k_range	k_tested	% tested	k/sec 8 thr	8 thr vs 1 thr	k_range	k_tested	% tested	k/sec 1 thr	1 thr	8 thr
13	521	75,000,000	2,033,669	2.712%	707,122	600,000,000	16,269,570	2.712%	5,179,406	732.46%	10,000,000,000	195,383.973	1.954%	909,984	77.71%	569.18%
14	607	50,000,000	1,333,226	2.666%	450,885	400,000,000	10,663,828	2.666%	3,396,004	753.19%	10,000,000,000	188,169.581	1.882%	605,416	74.48%	560.94%
15	1279	8,500,000	215,077	2.530%	75,394	68,000,000	1,723,558	2.535%	513,145	680.61%	10,000,000,000	183,577.304	1.836%	93,889	80.30%	546.54%
16	2203	2,000,000	50,875	2.544%	17,221	16,000,000	405,388	2.534%	130,152	755.77%	10,000,000,000	176,942.766	1.769%	23,072	74.64%	564.11%
17	2281	1,750,000	44,066	2.518%	15,256	14,000,000	324,155	2.530%	114,319	749.32%	1,000,000,000	18,756.308	1.876%	18,801	81.15%	608.04%
18	3217	750,000	18,956	2.527%	6,181	6,000,000	151,930	2.532%	47,815	773.52%	100,000,000	1,971.959	1.972%	7,237	85.41%	660.69%
19	4253	400,000	10,316	2.579%	3,289	3,200,000	81,521	2.548%	25,829	785.22%	100,000,000	1,920.813	1.921%	3,271	100.55%	789.53%
20	4423	375,000	9,524	2.540%	3,259	3,000,000	75,990	2.533%	24,992	766.81%	100,000,000	1,910.151	1.910%	2,975	109.52%	839.85%
21	9689	100,000	2,560	2.560%	659.6	800,000	20,218	2.527%	5,026	762.03%	100,000,000	1,810.403	1.810%	437.5	150.76%	1,148.83%
22	9941	100,000	2,561	2.561%	656.2	800,000	20,343	2.543%	4,954	755.00%	100,000,000	1,807.141	1.810%	404.9	162.06%	1,223.54%
23	11213	70,000	1,714	2.449%	613.6	560,000	14,303	2.554%	4,454	725.99%	10,000,000	191.140	1.911%	293.8	208.87%	1,516.34%
24	19937	25,000	645	2.580%	333.1	200,000	5,052	2.526%	2,541	762.90%	10,000,000	187.606	1.876%	103.6	321.43%	2,452.20%
25	21701	17,500	433	2.474%	214.0	140,000	3,544	2.531%	1,583	739.96%	10,000,000	184.603	1.846%	59.73	358.27%	2,651.04%
26	23209	16,000	404	2.525%	197.6	128,000	3,225	2.520%	1,503	760.37%	10,000,000	184.991	1.850%	50.02	395.18%	3,004.88%
27	44497	4,000	97	2.425%	31.51	32,000	822	2.569%	228.5	725.14%	10,000,000	170.644	1.706%	9.785	322.06%	2,335.37%
28	86243	2,000	51	2.550%	15.62	8,000	196	2.450%	125.0	800.41%	10,000,000	169.094	1.691%	2.315	674.77%	5,400.90%
29	110503	2,000	52	2.600%	8.778	16,000	405	2.531%	68.95	785.51%	-	-	-	-	-	-
30	132049	1,300	38	2.923%	5.478	10,400	253	2.433%	50.56	922.96%	-	-	-	-	-	-
31	216091	1,200	26	2.167%	2.219	9,600	236	2.635%	14.93	672.74%	-	-	-	-	-	-

Numbers tested with dmdsieve[cl] and mprime [LLR]

As the numbers' size grows up, we ned different algorithms to test their primality.

A possible factor of a (double) Mersenne number has the form 2kp + 1, where p is the prime exponent of the number, and k is an integer.

In double Mersennes, p is a Mersenne prime, of the form 2^p-1.

This way, the factors of double Mersennee have the form 2k(2^p-1) -1; if we change 2k to k', this formula translates into k'+2^p - k' + 1 .

This general form can be tested for pimality using mprime.

The problem here is to filter all the k values that surely can't guarantee a prime. The sieve (dmdsieve) is in charge of such filtering using modular arithmetic tricks.

The values in green show that sieving is complete: the time needed to sieve out a new factor on the actual k range is higher than the PRP test on the same factor.

MMp	#	sieved to	CPU extraction time	GPU extraction time	Tested Thru K	K sieved	worker/thr.	completion time
		dmdsieve[cl]					mprime
MM29	110,503	16 T	4s/k	2s/k	9,999,999	10 M	8x1	0.663 s
MM30	132,049	100 T	21s/k	11s/k	7,499,999	7.5 M	8x1	0.792 s
MM31	216,091	131 T	53s/k	27s/k	4,999,999	5 M	8x1	2.557 s
MM32	757,839	350 T	142s/k	75s/k	4,999,889	5 M	8x1	32.545 s
MM33	859,433	266 T	92s/k	62s/k	4,999,985	5 M	8x1	48.13 s
MM34	1,257,787	1 P	409s/k	250s/k	2,000,084	5 M	8x1	1m 37s
MM35	1,398,269	700 T	230s/k	150s/k	1,500,008	5 M	8x1	2m 01s
MM36	2,976,221	450 T	400s/k	165 s/k	350,460	2 M	8x1	4m 04s
MM37	3,021,377	365 T	350s/k		311,213	2 M	8x1	4m 51s
MM38	6,972,593	1.6 P	1111s/k		114,120	2 M	2x4	29m 59s
MM39	13,466,917	302 T	200s/k		100,536	2 M	1x8	1h 59m 11s
MM40	20,996,011	300 T	227s/k		1,113	2 M	1x8	4h 52m 54s
MM41	24,036,583	331 T	292s/k		1,080	2 M	1x8	6h 16m 58s
MM42	25,964,951	331 T	200s/k		780	2 M	1x8	8h 34m 58s
MM43	30,402,457	400 T	352s/k		720	2 M	1x8	12h 33m 59s
MM44	32,582,657	500 T	460s/k		716	2 M	1x8	14h 53m 51s
MM45	37,156,667	2.1 P	1,964s/k		684	2 M	1x8	21h 03m 20s
MM64	42,643,801	2.5 P	2,065s/k		648	2 M	1x8	1d 07h 51m 52s
MM47	43,112,609	500 T	400s/k		513	2 M	1x8	1d 08h 34m 26s
MM48	57,885,161	750 T	600s/k		485	2 M	1x8	2d 09h 52m 09s
MM49	74,207,281	1.6 P	1,025s/k		405	2 M	1x8	4d 15h 12m 28s
MM50	77,232,917	2.4 P	1800s/k		404	2 M	1x8	4d 21h 35m 14s
MM51	82,589,933	3.6 P	2390s/k	788s/k	341	2 M	1x8	5d 10h 57m 52s
MM52	136,279,841	5 P	5600s/k	1850 s/k	41	2 M	1x8	14d 05h 09m 14s

field legend:

MMp - The double Mersenne number
# - The exponent of the number
worker/thr - Composition of workers and threads in mprime to PRP the factor
Completion time - Time needed to a Intel I7 9800X with 8 active threads to complete the PRP test
Sieved to - The maximum prime sieved from a list of "K sieved" see below)
CPU extraction time - The actual time required to sieve one factor on CPU (16 threads)
GPU extraction time - The actual time required to sieve one factor on GPU
K - the maximum K tested for primality on this MMp
K sieved - List of K used for the sieve program: 2M means all K from 1 to 2,000,000