Prime Number algorithm for massively parallel Processor-in-memory machine
Doing Sieve of Eratosthenes on a 1024-way bit-serial processor implemented on an FPGA
The bit-serial processors very simple, backed with 512 bits of RAM.
All processors perform the same operation, subject to individual processor
enables, encompassing boolean logic, and can read and write their own
512 bit RAM.
The 512 bits are allocated among variables, of arbitrary width, perhaps 32 bits.
By a succession of boolean operations, addition, subtraction multiply and
divide are coded into routines. All these operations happen in parallel.
Algorithm design can be difficult, as all program branch paths must be traversed.
A simple sieve on a regular processor takes time proportional to the total
number of candidate primes tested, and the number of factors used in the divisions.
I present a sieve algorithm with time only proportional to the number of factors.