[isabelle-dev] Using PolyML's memory consumption profiling on Isabelle

Thomas Sewell Thomas.Sewell at nicta.com.au
Thu Oct 13 13:24:32 CEST 2011

Good day all. Just wanted to let the Isabelle developers know about the latest feature David Matthews has added to PolyML, and to let you all know how useful it is.

The feature allows profiling of objects after garbage collection. When code is compiled with PolyML.compiler.allocationProfiling set to 1, all objects allocated are also given a pointer to their allocating function. When the garbage collector runs with PolyML.compiler.profiling set to 4, a statistical trace is printed of which objects survived garbage collection.

This means that for Isabelle we get our first real look at what is taking up space at runtime and in images.

I include the last 20 lines of traces produced at four interesting points
  1) After including the HOL theories in HOL's ROOT.ML
  2) After performing a PolyML.shareCommonData after (1)
  3) After adding a 400-element record to a test theory built on the HOL image.
  4) After performing a shareCommonData after (3)

The output is currently a bit raw, since I'm not sure what the units are
(bytes? words? objects?). I suspect objects, since the total is roughly a tenth
of the runtime size. The other difficulty in interpreting this is the function
call graph. Small functions inlined into others are counted as their parent.
Library functions called from many places are counted against all of them

Things for Isabelle developers to note from this:

Isabelle is generating a *lot* of copies of types & terms, particularly via
Term.map_atyps. Since shareCommonData eliminates them, many are
duplicates. It's possible that further use of the "same" variants from
Term_Subst might help. It's also possible that the repeated reconstruction is
necessary (e.g. repeated generalization/abstraction of type variables) and
further use of the new Term_Sharing mechanism might be the answer.

A large quantity of the persistent objects are Table and Net objects, as

There are surprisingly many dummy tasks.

A surprisingly large quantity of the persistent objects are associated with
proof terms and name derivations. This is presumably not Thm.name_derivation
but inlined code from its subcalls Proofterm.thm_proof, proof_combP,
proof_combt' and Library.foldl, none of which are listed. If indeed these foldl
loops are producing this many objects then perhaps the work done unconditionally
here should be rethought.


1) After including the HOL theories in HOL's ROOT.ML

    824925 Term_Subst.instT_same(3)subst_typs(1)
    918632 Task_Queue.dummy_task(1)
    942473 Net.insert(4)ins1(2)
   1032196 Term.map_atyps(2)
   1046107 Term_Subst.inst_same(4)subst(1)
   1058304 Term_Sharing.init(2)typ(1)
   1172790 List.map(2)
   1194308 Term.map_atyps(2)
   1356790 List.map(2)
   1377084 Table().modify(3)modfy(1)
   1875680 Term_Sharing.init(2)term(1)
   2688712 Type.strip_sorts(1)
   3082276 Table().modify(3)modfy(1)
   3177593 Function code
   5746085 Strings
   5914517 Unidentified word data
   8216169 Term.map_types(1)map_aux(1)
   9206871 List.map(2)
   9424264 Table().modify(3)modfy(1)
  13085440 Term.map_atyps(2)

2) After performing a PolyML.shareCommonData after (1)

    183492 Table().map_default(3)(1)
    199062 List.map(2)
    220528 Thm.transfer(2)
    294270 Term_Subst.inst_same(4)subst(1)
    350230 Thm.map_tags(1)(1)
    461706 Ord_List.union(3)unio(2)
    464109 Mutable data
    478833 List.map(2)
    562309 Term_Sharing.init(2)term(1)
    581994 Proofterm.prepare_thm_proof(8)
    674516 Thm.name_derivation(2)
    700221 Net.insert(4)ins1(2)
    918632 Task_Queue.dummy_task(1)
   1019993 Term.map_types(1)map_aux(1)
   1193305 Table().modify(3)modfy(1)
   2619583 Table().modify(3)modfy(1)
   2769346 Unidentified word data
   3177590 Function code
   4684257 Strings
   8033629 Table().modify(3)modfy(1)

3) After adding a 400-element record to a test theory built on the HOL image.

    211270 List.map(2)
    549768 Table().modify(3)modfy(1)
    638246 Logic.varifyT_global_same(1)(1)
    844780 Envir.norm_term2(2)norm(1)
    972069 ListPair.map(3)
   1151428 Type.varify_global(3)thaw(1)
   1222416 List.map(2)
   1379140 Logic.varifyT_global_same(1)(1)
   1727520 Term.map_atyps(2)
   2397356 Type.strip_sorts(1)
   2675166 Thm.name_derivation(2)
   6035908 Term_Subst.map_atypsT_same(1)typ(1)
   6558968 Net.insert(4)ins1(2)
   7004076 List.map(2)
   8166615 Same.map(2)
  10044260 Logic.incr_indexes_same(2)incr(2)
  10359792 Term.map_atyps(2)
  14882581 Term.map_types(1)map_aux(1)
  15712284 List.map(2)
  23758208 Term.map_atyps(2)

4) After performing a shareCommonData after (3)

     19086 Ord_List.union(3)unio(2)
     20925 List.map(2)
     25107 Table().modify(3)modfy(1)
     30360 Thm.transfer(2)
     33459 List.map(2)
     49475 Term_Subst.inst_same(4)subst(1)
     54711 Term_Subst.map_types_same(1)term(1)
     55132 Table().modify(3)modfy(1)
     58414 Mutable data
     62876 Strings
     64358 Thm.map_tags(1)(1)
     72438 List.map(2)
     87924 Proofterm.prepare_thm_proof(8)
    133588 Task_Queue.dummy_task(1)
    347278 Logic.incr_indexes_same(2)incr(2)
    364754 Envir.norm_term2(2)norm(1)
    506900 Table().modify(3)modfy(1)
    993121 Term.map_types(1)map_aux(1)
   2672118 Thm.name_derivation(2)
   3332121 Net.insert(4)ins1(2)

The information in this e-mail may be confidential and subject to legal professional privilege and/or copyright. National ICT Australia Limited accepts no liability for any damage caused by this email or its attachments.

More information about the isabelle-dev mailing list