Dotty Linker: Making your Scala applications smaller and faster

Dmitry Petrashko

About me:

  • PhD at EPFL
  • previously worked on ScalaBlitz
  • for the last year working on Dotty. Implemented:
    • mini-phases framework;
    • pattern matcher;
    • lazy vals, based on SIP-20 - Improved Lazy Vals Initialization;
    • tailrec ;
    • shared JVM8 backend based on GenBCode;
  • supervising:
    • Guillaume Martres: value classes and arrays of value classes
    • Alex Sikiaridis: specialization

The language’s scalability is the result of a careful integration of object-oriented and functional language concepts.

But nothing comes for free.

Example: Boxing

13x slowdown due to boxing

But doesn't @specialize solve this problem?

But doesn't miniboxing solve those problems?

Problems with specialization and miniboxing

trait Function3[-T1, -T2, -T3 +R]

How many specializations are needed?

Specialization: 10000 classes

Miniboxing: 81 classes

Are they actually needed?
in your program?

Most likely no!

What is we could generate only specializations that your program needs?

How do we know which ones?


  • Tells how all methods are actually implemented
  • Included for all classes emitted by dotc
  • source code for dependencies

Call-graph analysis

Detect program entry points, mark them as reachable

For every reachable method:

  • mark all methods called from it as reachable
  • track how type params flow to other methods
  • track which trait combinations were instantiated

Paper: Ali K., Rapoport M., Lhoták O., Dolby J., Tip F. Constructing Call Graphs of Scala Programs, ECOOP 2014

Call-graph analysis, example

Call-graph analysis, example

  • marked main as entry point
  • map is called from main with tparams [String]
  • sum is called from main with tparams [Int]
  • foldLeft is called from sum with tparams [Int]
  • foldl is called from foldLeft with tparams [Int]
  • Function.apply is called from foldl with tparams [Int, Int, Int]

What can linker do?

  • smart specialization
  • smart dead code elimination. squeryl: 400 methods vs 1500 methods reachable by proguard
  • convert classes to value classes
  • infer more precise types
  • eliminate virtual dispatch
  • replace vars by vals
  • remove duplicate vals

Smart dead code elimination


Converting classes to value classes

Class needs to follow the restrictions:

  • Does not reach a callsite that calls AnyRef methods
  • Does not have side effects in constructor
  • Is effectively final

Closing the world

But what if some other program has a class that inherits from class that was marked as a value class?

  • No problem, it has our TASTY and will tune us for it's own use-case.
  • Binary compatibility not guaranteed
  • source- and TASTY-compatibility provided instead

Eliminating virtual dispatch: inlining

  • does not handle recursive methods
  • increases size of program if method is frequently used
  • can hurt performance if methods become huge
  • is already performed by JVM or JS backend

Eliminating virtual dispatch: clonning

  • handles recursive methods
  • small increase in size
  • no huge methods
  • is actually just specialization for singleton types



bytecode size500KB330KB180KB
JVM code cache size(1000 runs)1700KB1500KB1000KB
objects allocated136K120K39K

Future ideas

  • De-lazify lazy vals
  • use offheap memory(scala-offheap)
  • deduplicate methods

Why would one use scala-offheap?

Thank you.

See my other talks.