PrimeSieve.jl

Fast generation and counting of primes
Popularity
10 Stars
Updated Last
1 Year Ago
Started In
December 2014

PrimeSieve

Prime number and factoring functions

This package provides functions related to prime numbers and factoring, mostly for generating and counting primes and factoring integers. This package uses some of the fastest open-source libraries for these functions. The name Primes would be better, but that might cause collisions.

Note: Dec 5, 2016. Bitrot rests for no man. Quite a bit still works, but quite a bit not. Broken tests are commented out, so tests suite should pass. If you want to fix something and make a PR, please do. The most challenging problem is the C wrapper to make libsmsieve.so from a standalone program no longer works for all inputs.

See LICENSE.md for links to the authors of the tables and the libraries used in this package.

I am unaware of binaries of libprimesieve and libprimecount for Windows and OSX, so there is no easy installation for these platforms.

This is not a registered package, and it has a non-registered dependency. You can install it (at least on Unix) with

Pkg.clone("https://github.com/jlapeyre/DeepConvert.jl")
Pkg.clone("https://github.com/jlapeyre/PrimeSieve.jl")
Pkg.build("PrimeSieve")

Some functions in this package

  • genprimes(a,b) generate array of primes between a and b, inclusive
  • genprimes(b) generate array of primes between 2 and b
  • mfactor(n) factor integers up to about 100 decimal digits.
  • primepi(n) the prime counting function: number of primes < n
  • countprimes(a,b) number of primes between a and b
  • nextprime(n), prevprime(n) first prime greater (or smaller) than n
  • someprimes(n1,n2) iterator. all primes from n1 through n2
  • someprimes(n2) all primes from 2 through n2
  • allprimes(n) iterator. all primes >= n
  • allprimes() all primes
  • nthprime(n) the nth prime
  • nprimes(n,start) generate array of the first n primes > start
  • isprime(z) primality test for Gaussian integers
  • randprime(a,b) random prime between a and b

This package uses the following tables and libraries.

[T. Oliveira's tables of the prime counting function] (http://www.ieeta.pt/~tos/primes.html)

Prime number sieve library [libprimesieve] (http://primesieve.org/) and prime counting function library libprimecount

Integer factoring libraries msieve and gmp-ecm

Data types

The tables are encoded in Int128. The native type of the sieve (libprimesieve) is Uint64. The input/output type of the fastest primepi algorithm in libprimecount, the Deleglise Rivat algorithm, is Int128. There is a risk of overflow when constructing and giving arguments to functions in this package. The easiest way to avoid this is to put arguments in quotes: eg countprimes("10^19","10^19+100"). Also available are @bigint and @int128 from DeepConvert.

Functions

Most of the following functions are vectorized.

genprimes

genprimes(start,stop)

Return an array of all primes >= start and <= stop.

genprimes(stop)

Return an array of all primes between 1 and stop

genprimes(start,stop; alg = algorithm)

Generate primes using a specified algorithm. The algorithm must be one of :auto (the default), :sieve, or :next. Which algorithm is more efficient depends on the parameters. In general, :sieve is better for larger intervals, and :next is better for larger values of start. The keyword :sieve uses a very fast sieve (libprimesieve), and :next uses the function nextprime.

If you exceed the upper limit for argument to the sieve, then :next is chosen automatically.

julia> @bigint genprimes(10^20, 10^20+1000)
24-element Array{BigInt,1}:
 100000000000000000039 ...

This could also have been written genprimes(bi"10^20", bi"10^20+1000")

primepi

Computes the prime counting function.

primepi(x; alg = algorithm)

The efficient algorithms (or methods) are :auto (the default), :dr, and :tabsieve. The default, :auto, tries to choose the faster between :dr and :tabsieve (see Notes below). The other algorithms are slower in all cases. They are: :legendre, :lehmer, :meissel, :lmo, :sieve. The algorithm :dr uses an efficient parallel Deleglise Rivat method. The algorithm :tabsieve uses a combination of tables and a sieve and is more efficient when x is not too much greater than a table entry. (Note: Below, 10^14+10^8 is not too much greater than 10^14.) For example

julia> @time primepi(10^14+10^10; alg = :tabsieve)
elapsed time: 6.622672664 seconds (216 bytes allocated)
3205251958942

julia> @time primepi(10^14+10^10; alg = :dr)            # Deleglise Rivat is faster
elapsed time: 0.495413145 seconds (208 bytes allocated)
3205251958942

julia> @time primepi(10^14+10^8; alg = :dr)
elapsed time: 0.505796298 seconds (208 bytes allocated)
3204944853481

julia> @time primepi(10^14+10^8; alg = :tabsieve)       # Table and sieve is faster
elapsed time: 0.08235147 seconds (216 bytes allocated)
3204944853481

mfactor

Factor an integer.

Example:

julia> @time mfactor("2^251-1")
elapsed time: 29.709989827 seconds (13283880 bytes allocated)
Dict{Int128,Int64} with 5 entries:
  12070396178249893039969681 => 1
  178230287214063289511      => 1
  61676882198695257501367    => 1
  503                        => 1
  54217                      => 1

mfactor(n; ecm = true) uses ecm to search for factors larger than 15 digits, rather than less than 15 digits.

mfactor(n; deadline = m) aborts the factoring after m minutes.

mfactor(n; logfile = "filename.log") writes information to a log file.

mfactor(n; info = true) prints log information to the terminal.

mfactor( @bigint [ 2^100 + i for i in -5:5] )  # returns an array of factorizations.

mfactor calls libmsieve and libecm.

countprimes

Count the number of primes (or prime tuplets in an interval. This looks up the largest value in the table that is smaller than the requested value and computes the remaining values. Note that primepi is logically equivalent to countprimes with start=1. For start=1, The function primepi is often much faster than, and is never slower than countprimes

countprimes(stop)            # count the number of primes less than or equal to stop
countprimes(start,stop)      # count the number of primes >= start and <= stop
countprimes([start], stop, tuplet=n) # Count prime n-tuplets
countprimes(start, stop, alg = algorithm) # Count prime n-tuplets

The default value of start is 1. The optional keyword argument 'tuplet' may take values between 1 and 6, that is primes, through prime sextuplets. Tables are implemented only for 'tuplet' equal to one, that is for primes, but not tuplets.

The optional keyword argument alg may be one of :tabsieve (the default), :next, :nexta, or :sieve (:sieve will always be slower than :tabsieve). As above, :tabsieve uses a combination of tables and a fast sieve. :next and :nexta are two different variants of next_prime.

Examples

countprimes(100,1000)  # the number of primes x satisfying  100 <= x <= 1000
143
countprimes(100,tuplet=3)  # the number of prime triplets between 1 and 100
8
countprimes(10,1000,tuplet=6)  # the number of prime sextuplets between 100 and 1000
1     

If you quote the arguments (either as an expression or a string), they will be converted to Int128. This prevents overflow.

countprimes("10^19+10^9")
234057667299198865

If you use BigInt's, then the method :nexta will be chosen automatically. For example

julia> @bigint countprimes(10^50, 10^50+1000)
7

nextprime, prevprime

nextprime(n) returns the smallest prime greater than n.

prevprime(n) returns the largest prime less than n.

nextprime(n,k) returns the kth prime following n.

prevprime(n,k) returns the kth prime preceeding n.

Several algorithms are used. Finding the optimal one (of the available) is partially automated. nextprime1 and prevprime1 use an alternate algorithm coded by H W Borcher.

Iterators

someprimes(n2) All primes n, 2 <= n <= n2

someprimes(n1,n2) All primes n, n1 <= n <= n2

allprimes(n1) All primes n, n > n1

allprimes() All primes

For example, here is the primorial function defined using an iterator:

julia> primorial(n) = prod(someprimes(n))
julia> @bigint primorial(100)
2305567963945518424753102147331756070

nthprime()

Returns the nth prime using a fast algorithm from libprimecount. The argument is converted to Int64.

nthprime(n; alg = :sieve) uses the older algorithm from libprimesieve, which is much slower.

nprimes

Return an array of the first n primes >= start.

Usage

nprimes(n,[start=1])

single threaded versions

Usage

scountprimes([start],stop, tuplets=1)

printprimes

Print all primes (or prime n-tuplets) that are >= start and <= stop

Usage

printprimes([start],stop, [tuplet=1])

The default value of 'start' is 1. The optional keyword argument 'tuplet' may take values between 1 and 6.

legendrephi

The legendre sum or phi function

legendre(x,a)

The arguments are converted to Int64.

primeLi

The offset logarithmic integral. The argument is converted to Int64.

PrimeLiinv

The inverse Li function. The argument is converted to Int64.

randprime

randprime(a,b) choose a random prime between a and b. All primes in the range are equally likely to be chosen.

randprime(b) choose a random prime between 2 and b.

randprime(a,b; lim=n) find a maximum of n random composite numbers before giving up.

randprime(a,b,n1,n2,...) return a n1xn2x... array of random primes

isprime(z)

Returns true if the Gaussian integer z is prime.

Sieve Parameters

primesievesize

Get, set the sieve size in kilobytes. (setting does not seem to work) sz must satisfy 1 <= sz <= 2048

Usage

primesievesize()
primesievesize(sz)

primesieve_num_threads

Get, set the number of threads used in the parallel sieve. By default, the number of cores is used.

Usage

primesieve_num_threads()
primesieve_num_threads(numthreads)

primepi_num_threads

Get, set the number of threads used in the parallel primepi. By default, the number of cores is used.

Usage

primepi_num_threads()
primepi_num_threads(numthreads)

primemaxstop

Return the largest value (as a Uint64) that can be passed as the parameter stop in the sieve.

Usage

primemaxstop()

primepi_xmax()

Function that returns the largest allowed argument to primepi when using the :dr algorithm.

primetest

Run a test of the sieve algorithm.

Usage

primetest()

primepi_test

Run a test of the primepi algorithms

Tables of prime pi function

Some of the above functions use tables, but this is completley hidden from the user. But it is also possible to access them directly. The tables work like this:

julia> @time countprimes(10^17 + 10^14 + 10^10)
elapsed time: 3.729049749 seconds (168 bytes allocated)
2626112053757377

To see what happened, we can look in the tables:

julia> primelookup(10^17 + 10^14 + 10^10)
(14,(2626111798288135,100100000000000000,10000000000))

The 14th table was used. The value of prime pi for 10^17+10^14, 2626111798288135 is in the table, and the primes in an interval of length 10^10 must be found with the sieves.

primetableinfo

Print information about the prime pi tables.

julia> primetableinfo()
Tables of π(x). Listed are: table number, increment in x (and first value of x),
number of entries in the table, largest x in table.

table  incr    tab len  max x
1      10^1    10^4     10^5
2      10^2    10^4     10^6
3      10^3    10^4     10^7
4      10^4    10^4     10^8
5      10^5    10^4     10^9
6      10^6    10^4     10^10
7      10^7    10^4     10^11
8      10^8    10^4     10^12
9      10^9    10^4     10^13
10     10^10   10^4     10^14
11     10^11   10^4     10^15
12     10^12   10^4     10^16
13     10^13   10^4     10^17
14     10^14   10^4     10^18
15     10^15   10^4     10^19
16     10^16   10^4     10^20
17     10^17   10^3     10^20
18     10^18   10^2     10^20
19     10^19   10^2     10^21
20     10^20   10^2     10^22
21     10^21   10^2     10^23
22     10^22   10^1     10^23

primelookup

Look up a value of the prime pi function in the tables. This is only provided to aid in understanding the behavior of countprimes.

Usage

primelookup(x)

A tuple of a single element and another tuple of three elements is returned:

(j,(p,y,rem))
  • j is the number of the best table found
  • y is the largest index satisfying y<x found.
  • p is the value of prime pi at y
  • rem is x-y

primetables

The array of type Array{PrimeTable,1} containing the prime tables. See tables.jl for the format.

Example

show(map(length,primetables)) # see the number of tables and their lengths

primetablefilename

Function returning the path to the file containing the prime pi tables. The tables are loaded when the package is loaded.

Other details

For x>typemax(Int), you need to explicitly ask for a bigger data type. For instance,

julia> countprimes(int128(10)^23)
1925320391606803968923

This example returned a value from the table. The argument was larger than than primemaxstop().

With any of the routines, you can quote the arguments and they will be converted to the appropriate type.

julia> countprimes(:(10^23))
1925320391606803968923
julia> countprimes("10^19 + 10^9")
234057667299198865

Routines that use the tables will convert the arguments to Int128. This is because some indices in the tables are greater than typemax(Uint64). Routines that only use the sieve will be converted to Uint64, which is the data type that the sieve routines use.

The largest stop value that may be given is 2^64 - 10 * 2^32. The largest start value that may be given is 2^64 - 11 * 2^32. The sieve works with the Uint64 data type. But conversions are done depending on the types of start, stop, and n.

countprimes returns Int128, because it uses tables and sieves The other routines only support smaller data types.

primetabletype()

Return data type of tables. This should be Int128. The largest values cannot be used together with the sieve.

primesievetype()

Return the native prime sieve type. This should be Uint64. libprimesieve returns the data in various integer formats. These are chosen by the Julia interface by the type of the start parameter.

eltype(t::PrimeTable)

Return element type of values in table.

Other functions

apopcount

This is a only curiosity. It is supposed to be an optimized C (C++) function, but doing the same thing in a Julia loop is much faster.

apopcount(arr) gives the number of 1's in the binary representation of the array arr. The length of the array is truncated to a multiple of 8.

Note that this treats the contents of the array as a bits type. In particular, if arr is not an array of bits type, then the number of 1's in the pointers in the array are counted.

In the comments, Kim Walisch says this about the algorithm,

This algorithm counts the number of 1 bits (population count) in an array using 64-bit tree merging. To the best of my knowledge this is the fastest integer arithmetic bit population count algorithm, it uses only 8 operations for 8 bytes on 64-bit CPUs

Example:

julia> aa = Array(Uint64,100000000);
julia> fill!(aa,typemax(Uint64));

julia> @time apopcount(aa)
elapsed time: 0.140950344 seconds (96 bytes allocated)  # test shows overhead is negligible
6400000000

julia> lpopcount(x) = ( s = 0; for i in 1:length(x) s += count_ones(x[i]) end; s)

julia> @time lpopcount(aa)
elapsed time: 0.106306266 seconds (96 bytes allocated)  # Julia version is faster !
6400000000

The algorithm is by Cédric Lauradoux

Notes

The algorithms used by :auto in genprimes and primepi are simple and do not always choose the best method. They could be improved. However, the routines are still fast. In the worst case, genprimes is more than an order of magnitude faster than Base.primes.

Bugs

Interrupting a call to the sieves usually does not cause a memory error. But, libprimesieve apparently has some static state, such that, after the interrupt, subsequent sieving runs much slower, and may not give correct results.

Interrupting a call to libprimecount, results in a segfault. We could disable_sigint, but there appear to be memory leaks in libprimecount. Better to crash Julia than the whole system.

msieve interrupt handler is installed, then all c code uses it. We need to install and then uninstall it on every call.