The document discusses the binary compatibility challenge in Scala and proposes using typed trees as a common intermediate format to address it. Currently, breaks in binary compatibility cause problems when upgrading dependencies. The proposal is to compile to typed trees rather than bytecode, and to distribute libraries as trees instead of bytecode. Libraries could then be recompiled from trees as needed for compatibility, which avoids rebuilding from source while offering more flexibility than bytecode. It could also make builds more efficient and enable cross-platform compilation.
2. The Problem in a Nutshell
• Binary compatibility has been an issue ever since Scala became popular.
• Causes grief when building, friction for upgrading.
• The community has learned to deal with this by becoming more conservative.
• But this makes it harder to innovate and improve.
Break your client’s builds vs Freeze, and stop improving
Is there no third way?
3. What is Binary Compatibility?
Binary compatibility ≠ Source compatibility
Source & binary incompatible:

    object Client { Server.msg.length }

    // Server, before the change
    object Server { val msg = "abc" }

    // Server, after the change
    object Server { val msg = Some("abc") }
4. What is Binary Compatibility?
Binary compatibility ≠ Source compatibility
Source incompatible, binary compatible:

    object Client {
      import a._, b._
      val x: String = 1
    }

    // Before the change
    object a { implicit def f(x: Int): String = x.toString }
    object b

    // After the change: two applicable conversions are in scope,
    // so Client no longer compiles
    object a { implicit def f(x: Int): String = x.toString }
    object b { implicit def g(x: Int): String = "abc" }
5. What is Binary Compatibility?
Binary compatibility ≠ Source compatibility
Source compatible, binary incompatible:

    object Apple extends Edible { def joules = 500000 }

    // Edible, before the change
    trait Edible { def joules: Double }

    // Edible, after the change
    trait Edible {
      def joules: Double
      def calories = joules * 4.184
    }

⇒ Need to recompile on Java 1-7.
In Java 8 it’s more complex but fundamentally the same.
6. What is Binary Compatibility?
Binary compatibility ≠ Source compatibility
Source compatible, binary incompatible:

Source:

    trait Edible {
      def joules: Double
      def calories = joules * 4.184
    }

    object Apple extends Edible { def joules = 500000 }

Compiled to (schematically):

    trait Edible {
      def joules: Double
      def calories: Double
    }

    object Edible$class {
      def calories($this: Edible): Double = $this.joules * 4.184
    }

    object Apple extends Edible {
      def joules = 500000.0
      def calories: Double = Edible$class.calories(this)
    }

An Apple compiled against the old Edible lacks the calories forwarder.
⇒ Need to recompile on Java 1-7.
In Java 8 it’s more complex but fundamentally the same.
7. Other Issues
Compiler optimizations and bug fixes can affect binary
compatibility.
Example: Implementation of lazy values.

    trait Edible {
      def joules: Double
      lazy val calories = joules * 4.184
    }

    object Apple extends Edible { def joules = 500000 }

Compiled to (schematically):

    object Apple extends Edible {
      def joules = 500000.0
      private var initFlags: BitSet
      private var cals: Double = _
      def calories = {
        if (!initFlags(N)) {
          cals = Edible$class.initCals(this)
          initFlags(N) = true
        }
        cals
      }
    }

Previously: 1 bit per lazy val.
To avoid deadlocks: 2 bits.
⇒ All offsets change!
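The offset shift can be illustrated with a small sketch. The names `LazyValLayout`, `oldOffset`, `newOffset` and `movedFlags` are hypothetical, and the layout is a simplification of whatever the compiler really does: it only assumes that the flag for the n-th lazy val sits at bit n in the old scheme and at bit 2n in the new one.

```scala
// Hypothetical sketch of init-flag positions in a bitmap; not the compiler's actual layout.
object LazyValLayout {
  // Old scheme: one bit per lazy val, so the n-th flag is bit n.
  def oldOffset(n: Int): Int = n

  // New scheme: two bits per lazy val, so the n-th flag starts at bit 2n.
  def newOffset(n: Int): Int = 2 * n

  // Indices of the lazy vals whose flag position differs between the two schemes.
  def movedFlags(count: Int): Seq[Int] =
    (0 until count).filter(n => oldOffset(n) != newOffset(n))
}
```

Under these assumptions every flag except the first one moves, so code compiled against the old layout would test the wrong bits.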
8. Compiler Pipeline
Source → Parser → Typer → SyntheticMethods → SuperAccessors → RefChecks → ElimRepeated → ElimLocals → ExtensionMethods → TailRec → PatternMatcher → ExplicitOuter → Erasure → Mixin → Memoize → LazyVals → CapturedVars → Constructors → LambdaLift → Flatten → RestoreScopes → Cleanup → more phases → GenBCode → JVM byte-code
[Diagram: the phases in order, with Symbols shown alongside the pipeline.]
Lots of scope for things to go wrong!
9. Where It Breaks
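[Diagram: A.class is compiled against C.class, which comes from C.scala. A binary incompatible source change to C.scala produces a new C.class, and A.class breaks against it.]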
10. Why Is This Such a Big Problem?
[Diagram: MyApplication depends on DustyLegacyLib and on Scala Library 2.10. Scala Library 2.11 contains a binary incompatible source change (Seq.scala). DustyLegacyLib is too old and can’t be rebuilt, so MyApplication can’t upgrade to Scala 2.11!]
11. Not Just A Problem with Scala-Library
MyApplication
DustyLegacyLib
Akka
3.2
X can’t upgrade to
(can’t rebuild)
Akka
3.3
Actor.scala Actor.scala
(binary incompatible
source change)
Akka 3.3!
12. Not Just A Problem with Scala-Library
MyApplication
DustyLegacyLib
shapeless
2.0
X can’t upgrade to
(can’t rebuild)
shapeless
2.1
unions.scala unions.scala
(binary incompatible
source change)
shapeless 2.1!
13. Dealing With It So Far
“MiMa” tool can detect binary incompatibilities.
Scala release policy:
– Minor versions need to be (forwards and backwards) binary compatible.
– Major versions are allowed to break binary compatibility.
– Major versions are released rarely (18+ months between them).
Problem:
– 3rd party libraries need similar policies but often don’t enforce them.
– Innovation is stifled.
– Simple fixes have to wait for a long time to get in.
– Lots of dev cycles spent on dealing with binary compatibility.
14. What Do Others Do?
Java:
• Language close to JVM bytecode.
• Innovation happens on JVM level.
– Either in the JVM itself or through reflection.
– E.g. Java 8 lambdas, default methods.
• Libraries are frozen when they appear.
– E.g. java.util.Date
• Language is restricted in terms of extensibility.
– E.g. interface1, interface2, ... interface7 in Eclipse.
15. What Do Others Do?
OSGi: Allow multiple versions of a library in an application.
[Diagram: DustyLegacyLib keeps using Scala Library 2.10, while MyApplication is rebuilt against Scala Library 2.11.]
• Fragile, requires serious classloader magic.
• Few frameworks beyond Eclipse have bought in.
16. What Do Others Do?
C/C++:
• Relies on the linker for more flexibility in interfaces.
• Not that great a story either (cf. DLL Hell).
22. Why Can’t Scala Build from Source?
No standard Build Tool
Should we standardize on SBT, Gradle, Maven, Ivy, Ant?
Reproducible builds are rare.
Chicken and egg problem:
Because everyone is used to binary builds, nobody* invests in making builds reproducible.
*Not quite true: Typesafe has invested in a community build and can now build more than 1M lines of community projects. But it’s a huge effort.
23. What We Need
• An interchange format that captures the essence of Scala dependencies.
• This cannot be the JVM bytecode format.
• Nor can it be source.
24. The Idea
Use Typed Trees as an interchange format.
– More robust than source.
– More stable than JVM bytecode.
– Efficient?
29. More Resilient Than Bytecode
Can
– add fields and methods to traits
– add lazy vals anywhere
– change compilation scheme in any way necessary.
None of these would be binary compatible!
Can also
– add or remove implicits
– add methods anywhere
– change imports
All of these could be source incompatible!
30. Efficient?
Can typed trees be efficient enough to build million+ line systems?
Possible issues:
• Size of trees
– on disk
– in memory
• Transformation time
32. Back of the Envelope Calculation:
16 nodes
Average size of node: 32 bytes
512 bytes total.
Double that to include type info.
⇒ 16 bytes source → 1KB tree (factor 64 blow-up).
For a 1M line system:
30MB source → 2GB trees.
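The estimate above can be reproduced as a short calculation. This is only a sketch of the slide's arithmetic; the object and value names are made up for illustration, and all inputs are the figures quoted on the slide.

```scala
// Back-of-the-envelope tree size estimate, using the figures from the slide.
object TreeSizeEstimate {
  val sourceBytes = 16      // size of the example source fragment
  val nodes = 16            // tree nodes for that fragment
  val bytesPerNode = 32     // average size of a node
  val typeInfoFactor = 2    // doubling to include type info

  // 16 nodes * 32 bytes = 512 bytes, doubled = 1024 bytes (1KB)
  val treeBytes: Int = nodes * bytesPerNode * typeInfoFactor

  // 1024 / 16 = factor 64 blow-up
  val blowUp: Int = treeBytes / sourceBytes

  // Tree size in MB for a given source size in MB
  def treesMB(sourceMB: Int): Int = sourceMB * blowUp
}
```

For 30MB of source this gives 1920MB, which the slide rounds to 2GB.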
33. A More Compact Representation
    Apply (34)
      SelectTermWithSig (9)
        Ident (3) "xs"
        "filter"
        "Function1 - Boolean"
      Closure (23)
        ParamDef (7)
          "x"
          TypeRef "scala.Int"
        Apply (14)
          SelectTermWithSig (9)
            Ident (3) "x$"
            "="
            "Integer - Boolean"
          Literal (3) 0

Still navigable, because inner nodes contain the size of the total tree derived from them.
Types or symbols are given at the leaves.
Types of inner nodes are reconstituted using the TypeAssigner.
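A minimal sketch of why size-prefixed nodes stay navigable is given below. It is not the actual tree format: the `Tree`, `Leaf` and `Inner` types, the numeric tags, and the one-byte length field are all assumptions made for illustration. Each node is written as a tag byte, a length byte, and then its payload or its encoded children, so a reader can jump over a whole subtree without decoding it.

```scala
// Hypothetical size-prefixed tree encoding; tags and layout are illustrative only.
object CompactTrees {
  sealed trait Tree { def tag: Byte }
  final case class Leaf(tag: Byte, payload: Vector[Byte]) extends Tree
  final case class Inner(tag: Byte, children: List[Tree]) extends Tree

  // Encode a node as: tag byte, length byte, then payload or encoded children.
  def encode(t: Tree): Vector[Byte] = {
    val body: Vector[Byte] = t match {
      case Leaf(_, payload)   => payload
      case Inner(_, children) => children.toVector.flatMap(c => encode(c))
    }
    require(body.length < 256, "this sketch stores the length in a single byte")
    Vector(t.tag, body.length.toByte) ++ body
  }

  // Total encoded size of the node starting at `pos`, read from its header alone.
  def sizeAt(bytes: Vector[Byte], pos: Int): Int =
    2 + (bytes(pos + 1) & 0xff)

  // Start offsets of the direct children of the inner node at `pos`.
  // Each child is skipped in one step using its stored size.
  def childOffsets(bytes: Vector[Byte], pos: Int): List[Int] = {
    val end = pos + sizeAt(bytes, pos)
    val offsets = List.newBuilder[Int]
    var cur = pos + 2
    while (cur < end) {
      offsets += cur
      cur += sizeAt(bytes, cur)
    }
    offsets.result()
  }
}
```

With this layout, finding the third child of a node costs two header reads rather than a full decode of the first two children, which is the property the slide relies on.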
34. Speed
Transformation + byte-code generation amounts to ~60% of total compile time.
We can speed this up by
– fusing phases, reducing amount of intermediate trees,
– using a fast type assigner, instead of a slow typer,
– building different files in parallel.
Besides, can use incremental compilation.
– Compile only those units that depend on changed libraries.
– Need to do that only once.
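Selecting the units to recompile can be sketched as a walk over the dependency graph. The object `Incremental`, the function `affected`, and the string-keyed graph are hypothetical; the sketch also assumes that transitive dependents of a changed library need recompiling, which a real incremental compiler may refine.

```scala
// Hypothetical sketch: which units must be recompiled when some libraries change.
object Incremental {
  // `deps` maps each unit to the units or libraries it depends on directly.
  // Returns every unit that depends, directly or transitively, on something in `changed`.
  def affected(deps: Map[String, Set[String]], changed: Set[String]): Set[String] = {
    @annotation.tailrec
    def loop(dirty: Set[String]): Set[String] = {
      // Add every unit that has at least one dirty dependency.
      val next = dirty ++ deps.collect { case (unit, ds) if ds.exists(dirty) => unit }
      if (next == dirty) dirty else loop(next)
    }
    loop(changed) -- changed
  }
}
```

Units outside the returned set keep their existing output, and the recompilation is needed only once per library change.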
35. Other Benefits
Optimization
– Typed trees are a great format for interprocedural analyses
– Inlining across compilation units made simple
– Inlining without binary compatibility issues
Program Analysis
– Typed trees are close to source, but easy to traverse
– Ideal for context-dependent program analyses such as FindBugs
– Ideal for instrumentation
Portability
– Typed trees allow retargeting to different backends, as long as dependencies exist.
– Allow libraries to be used on JVM, JS, LLVM... without needing explicit recompilation.
37. Conclusion
Typed trees can fix the binary compatibility problem and they offer lots of other benefits, too.
Let’s start the work to make them real!