[isabelle-dev] Future of isatest/afptest

Lars Hupel hupel at in.tum.de
Wed Nov 18 18:18:30 CET 2015

(moving to isabelle-dev)

> In the various different isatest configurations for main Isabelle (not
> AFP) we do indeed test normal situations, like threads=4 or threads=8,
> alongside with abnormal ones like threads=1 or skip_proofs=true.

This is definitely something which I will replicate in the new Jenkins
setup as soon as it becomes clear on which hardware the tests will be run.

> What always happens in such situations is that the total runtime
> converges to the longest sequential task.  That used to be JinjaThreads,
> now it is AODV or ConcurrentGC.  These take many hours CPU time.  With a
> good multicore machine of 2015, it should be possible to run all of
> Isabelle + AFP in approximately 1h elapsed time, maybe even less after
> some tweaking.

I probably have some more things to say about this when my
investigations are finished (including hard numbers), but I will share
some findings already here:

It is clear that the current hardware we use to run regression tests
(namely: macbroy2, macafp and – to some extent – lxbroy10) are not
sufficient any more to cope with the ever expanding (which is good!)
AFP, especially given the goal that we want to run the entire thing on
every push. (Replicating Mira's bisecting behaviour is not going to happen.)

Essentially, we've determined three options, in increasing order of cost:

1) dedicated hardware with plenty of cores and RAM provided by a sponsor

I've contacted a potential sponsor (a hoster running an open source
programme with free credits) who offers beefy machines. In theory this
would be possible, but they will only have free capacities some time
next year. Even so, there's no guarantee they'll accept our application.
(If you have any lead here, feel free to contact me off-list.)

2) virtual machines provided by LRZ

LRZ offers cloud hosting of virtual machines. We could get an allowance
of 32 cores, although one single machine can only have 8 at most (e.g.
we could run 4 virtual machines with 8 cores each). This severely
constrains how we can run AFP tests. We would need to run the "slow" AFP
sessions seperately. I managed to get the elapsed time for a "non-slow"
AFP build down to about 3-4 hours, which is still a lot, but since we
could have 4 machines in parallel it wouldn't be a big problem. This is
also the reason why I ran the AFP in single-threaded mode: to squeeze
the last bit of performance out of the machines. It turns out that under
similar conditions, these machines already run about 20% faster than the
currently fastest machine we have (lxbroy10). I haven't yet checked the
elapsed time of the "slow" sessions on one of these machines, but I
suspect it's going to be around 4 hours, too.

3) dedicated hardware with plenty of cores and RAM to be bought by our chair

This option will hopefully solve our scaling problems. It has yet to be
determined how big of a machine we may get. As it stands now this is the
preferred solution. I will keep this list posted when I have new

With all that in mind, we can still keep the other machines we have
(especially the Mac boxes) to ensure that at least the Isabelle
distribution runs correctly on different operating systems and different
hardware configurations. Thanks to Jenkins a reliable integration of
Windows builds might be within reach although I don't know the present
situation wrt. setup of the repository version under Windows in headless


More information about the isabelle-dev mailing list