[isabelle-dev] isabelle build
dm at prolingua.co.uk
Wed Aug 8 10:54:36 CEST 2012
The garbage collector now includes a pass that looks for cells that have
the same contents and merges them using the principle that equality in
ML cannot distinguish between two pointers to the same cell or two cells
with the same contents. There has been a PolyML.shareCommonData
function for a long time that does this but that has always had to be
called explicitly by the user. It has also required quite a lot of
extra memory. Currently, I understand Isabelle uses it just before it
saves the state to reduce the sizes of saved state files.
Some time ago Makarius wondered if it was possible to have this invoked
as part of the GC. I couldn't see how to do this especially without
adding any extra memory overhead but gradually I've been improving it.
The heap sizing code now fires off this pass when memory is particularly
short since it costs several times the cost of a normal GC. The latest
improvement was to find a way to merge deep structures (lists and trees)
something that PolyML.shareCommonData has always done but at the cost of
the extra memory. This made a dramatic difference to JinjaThreads. It
appears that it creates many equal data structures and merging these can
reduce the live data size by some 80%.
Actually, one has to be careful when interpreting the parallelism
figures. Much of the time an ML application is memory-bound which means
that using more processors can make an improvement in overall run-time
but the overall CPU usage will grow. As far as I can tell, the time
that a processor spends waiting for memory counts as its CPU time unlike
when it is waiting for IO.
On 08/08/2012 09:00, Lawrence Paulson wrote:
> I understand about the parallelism, but what has cut back on the memory consumption?
More information about the isabelle-dev