Description
Scala collections are serialized through a serialization proxy, and that proxy can leak into the result when a cyclic object graph is deserialized.
Utility:
object SD {
  import java.io._, scala.util.chaining._
  def serialize(obj: AnyRef) = new ByteArrayOutputStream().tap(b => new ObjectOutputStream(b).writeObject(obj)).toByteArray
  def deserialize(a: Array[Byte]) = new ObjectInputStream(new ByteArrayInputStream(a)).readObject()
  def serializeDeserialize[T <: AnyRef](obj: T) = deserialize(serialize(obj)).asInstanceOf[T]
}
Test code:
@Test def coll(): Unit = {
  val b = ListBuffer[AnyRef]()
  val bar = new Bar(b)
  b += bar // b now contains bar, and bar refers back to b: a cycle
  SD.serializeDeserialize(b)
}
This fails with
java.lang.ClassCastException:
cannot assign instance of scala.collection.generic.DefaultSerializationProxy
to field scala.collection.mutable.Bar.c of type scala.collection.mutable.Iterable
in instance of scala.collection.mutable.Bar
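The Bar class is not defined in the test above. The following is a minimal self-contained sketch of the same test, assuming Bar is a plain Serializable class whose field c has type mutable.Iterable (both taken from the exception message). The Bar definition here and the names CollRepro, roundTrip, and attempt are hypothetical, and the result depends on running a Scala 2.13 version affected by this issue.

```scala
import java.io._
import scala.collection.mutable
import scala.util.Try

// Hypothetical stand-in for Bar: the field name `c` and its type come from
// the exception message; the real definition is not shown in the report.
class Bar(val c: mutable.Iterable[AnyRef]) extends Serializable

object CollRepro {
  // Serialize to a byte array and read the object back.
  def roundTrip(obj: AnyRef): AnyRef = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(obj)
    out.close()
    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray)).readObject()
  }

  // Builds the cycle b -> Bar -> b and captures the outcome of the round trip.
  def attempt(): Try[AnyRef] = {
    val b = mutable.ListBuffer[AnyRef]()
    b += new Bar(b)
    Try(roundTrip(b))
  }
}
```

On an affected version, CollRepro.attempt() is expected to be a Failure wrapping the ClassCastException shown above.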
A stand-alone reproducer:
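// A is replaced by AProxy when written, and restored from it when read.
class A(var b: B) extends Serializable {
  def writeReplace: AnyRef = new AProxy(this.b)
}
class AProxy(val b: B) extends Serializable {
  def readResolve: AnyRef = new A(b)
}
// B points back to A, which closes the cycle.
class B(val a: A) extends Serializable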
Test code:
@Test def repr(): Unit = {
  val a = new A(null)
  val b = new B(a)
  a.b = b
  SD.serializeDeserialize(a)
}
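The readResolve method is only invoked once the AProxy instance is fully deserialized. While the proxy's fields are still being read, references to the original a object therefore resolve to the proxy, and assigning that AProxy to the field B.a of type A fails with a ClassCastException.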
@retronym points out that this is documented in the last paragraph of https://docs.oracle.com/javase/8/docs/platform/serialization/spec/input.html#a5903:
"The readResolve method is not invoked on the object until the object is fully constructed, so any references to this object in its object graph will not be updated to the new object nominated by readResolve. [...] if the reference types [...] are not compatible, the construction of the object graph will raise a ClassCastException."
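The exception only appears when the field type is incompatible with the proxy. A minimal sketch of the quieter case, where the back-reference is typed AnyRef and the proxy is assigned without error, might look as follows. The names Holder, HolderProxy, Child, and LeakDemo are hypothetical and chosen for illustration only.

```scala
import java.io._

// Hypothetical classes: Holder is serialized through HolderProxy.
class Holder(var child: Child) extends Serializable {
  // Found reflectively by Java serialization; the proxy is written instead.
  def writeReplace: AnyRef = new HolderProxy(child)
}
class HolderProxy(val child: Child) extends Serializable {
  // Invoked only after the proxy and all of its fields have been read.
  def readResolve: AnyRef = new Holder(child)
}
// The back-reference is typed AnyRef, so a proxy can be stored in it.
class Child(val parent: AnyRef) extends Serializable

object LeakDemo {
  // Serialize to a byte array and read the object back.
  def roundTrip[T <: AnyRef](obj: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(obj)
    out.close()
    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray)).readObject().asInstanceOf[T]
  }

  // Builds the cycle Holder -> Child -> Holder and returns the
  // back-reference found in the deserialized copy.
  def leakedParent(): AnyRef = {
    val h = new Holder(null)
    h.child = new Child(h)
    roundTrip(h).child.parent
  }
}
```

Here the round trip succeeds, but the deserialized Child's parent is a HolderProxy rather than a Holder, so the leak surfaces later, if at all, when the value is used.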
Links
- https://bugs.openjdk.org/browse/JDK-8024931
- Apache Fury: "[Java] add read resolve circular test suite", apache/fory#1161 (comment)
The same behavior can be triggered with Java collections (again @retronym's example); serialization proxies are just less widespread in Java collections.
@Test def jcoll(): Unit = {
  import java.util.{ArrayList => JAL}
  import java.util.{List => JL}
  val c1 = new JAL[JL[_]]()
  val c2 = JL.of(c1) // List.of requires Java 9+; the immutable list is serialized via a proxy
  c1.add(c2)         // c2 contains c1, and c1 contains c2: a cycle
  val c2c = SD.serializeDeserialize(c2)
  c2c.get(0).get(0).size() // ClassCastException: class java.util.CollSer cannot be cast to class java.util.List
}