Speed/Space Benchmarks of Java Object Serialization (V1)
1. Summary
Many articles online already compare the performance of Java serialization systems, yet I think some of those comparisons have problems:
- The sample data structures are too simple
- The testing programs do not generalize over the serialization systems, so the systems are not measured on equal terms
- Only a few serialization systems are covered
- The testing programs are hard to extend with additional serialization systems
That is why this article and its testing program were written: as another reference point, and as a base for building your own evaluation of Java serialization systems.
1.1. Serialization systems involved
- JDK built-in
- Protobuf
- Hessian2
- Kryo
- Fastjson
- Jackson
- Gson
1.2. Test results of interest
- Speed of serialization
- Speed of deserialization
- Space cost after serialization
1.3. Generalization
Protocol Buffers requires its message definitions to be compiled statically, and JDK built-in serialization requires serialized objects to implement java.io.Serializable, while the other frameworks serialize arbitrary plain Java objects dynamically at runtime. To compare them all on the same baseline, the generalization constraints below are defined.
1.3.1. Structure Generalization
- All domain objects used in the tests must have the same structure as the messages predefined in the .proto file, and converters must be provided to convert between the domain objects and those messages in both directions.
- All domain objects used in the tests must implement java.io.Serializable.
1.3.2. Input Generalization
Within one round of benchmark testing, every serialization framework receives exactly the same input data and runs exactly the same number of loops.
1.3.3. How to build and run
Build the testing program
- Enter the build directory
  cd master
- Run a full build
  mvn clean install
Run the testing program
- Enter the benchmark directory
  cd benchmark
- Start the run
  java -jar target/benchmark-<version>.jar
- Check the output and logs in ${user.home}/benchmark.log
2. Testing Program Design
2.1. Sample Testing Models
To make the tests diverse and to exercise both space and speed performance reasonably thoroughly, the sample testing objects should satisfy the following conditions.
- C1-01 Testing objects and their properties are created randomly
- C1-02 Testing objects must use at least integer, string, floating-point, and enum properties; all of these types are mandatory
- C1-03 Testing objects hold at least 1 collection property
- C1-04 Testing objects refer to each other
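Conditions C1-01 to C1-04 can be illustrated with a small random generator. This is only a sketch: the Order, Item, and Status types and the randomOrder method are hypothetical names invented for the example, not the project's actual testing models.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical names throughout; not the project's TestingModels class.
final class TestingModelsSketch {
    enum Status { NEW, PAID, SHIPPED }

    static final class Order {
        int id;                                     // integer (C1-02)
        String customer;                            // string (C1-02)
        double total;                               // floating point (C1-02)
        Status status;                              // enum (C1-02)
        final List<Item> items = new ArrayList<>(); // collection (C1-03)
    }

    static final class Item {
        String sku;
        double price;
        Order owner; // back-reference to the owning order (C1-04)
    }

    // Builds one order with random property values (C1-01).
    static Order randomOrder(Random random) {
        Order order = new Order();
        order.id = random.nextInt(1_000_000);
        order.customer = "customer-" + random.nextInt(1_000);
        order.status = Status.values()[random.nextInt(Status.values().length)];
        int itemCount = 1 + random.nextInt(5); // 1 to 5 items
        for (int i = 0; i < itemCount; i++) {
            Item item = new Item();
            item.sku = "sku-" + random.nextInt(10_000);
            item.price = random.nextDouble() * 100.0;
            item.owner = order;
            order.items.add(item);
            order.total += item.price;
        }
        return order;
    }

    public static void main(String[] args) {
        Order order = randomOrder(new Random());
        System.out.println(order.customer + " " + order.status + " items=" + order.items.size());
    }
}
```

Because each Item points back to its Order, the resulting object graph is cyclic, which is one of the things a serializer under test has to cope with.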
2.2. Serialized Object
A Serialized Object is a wrapper around a POJO and should satisfy the following conditions.
- C2-01 Accepts a POJO for initialization
- C2-02 Provides a no-argument method returning byte[] that gives access to the serialized byte array
- C2-03 Provides a no-argument method returning int that gives the length of the byte array
- C2-04 Provides a no-argument method returning String that gives the UTF-8 form of the byte array
- C2-05 Provides a no-argument method returning String that gives the Base64 form of the byte array
- C2-06 Provides a no-argument method, returning the same type as the accepted POJO, that deserializes the byte array
- C2-07 The method required by C2-06 shall not return the very object that was passed in under C2-01
- C2-08 A Serialized Object is supposed to be IMMUTABLE
  - Provide no mutators that change the object's state, such as set* or add*
  - The method defined by C2-02 should return a defensive copy of the internal byte array
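As a rough illustration of C2-01 to C2-08, the wrapper below uses JDK built-in serialization only. The class name JdkSerializedSketch and its method names are assumptions made for this sketch; the project's own SerializedObject may look different.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical class; a sketch of the constraints, not the project's code.
final class JdkSerializedSketch<T extends Serializable> {
    private final byte[] bytes; // internal state, never handed out directly

    JdkSerializedSketch(T pojo) { // C2-01
        this.bytes = serialize(pojo);
    }

    private static byte[] serialize(Object pojo) {
        try (ByteArrayOutputStream buffer = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(pojo);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    byte[] bytes() { return bytes.clone(); } // C2-02, defensive copy (C2-08)

    int size() { return bytes.length; } // C2-03

    String utf8() { return new String(bytes, StandardCharsets.UTF_8); } // C2-04

    String base64() { return Base64.getEncoder().encodeToString(bytes); } // C2-05

    @SuppressWarnings("unchecked")
    T deserialize() { // C2-06: rebuilt from the bytes, so never the original instance (C2-07)
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (T) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> input = new ArrayList<>(Arrays.asList("a", "b"));
        JdkSerializedSketch<ArrayList<String>> serialized = new JdkSerializedSketch<>(input);
        System.out.println(serialized.size() + " bytes, base64=" + serialized.base64());
        System.out.println(serialized.deserialize());
    }
}
```

There are no setters, and bytes() clones the array, so callers cannot alter the serialized form after construction.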
2.3. Benchmark Testing Objects
2.3.1. Space Benchmark Testing Objects
Space benchmark testing is the simpler one: generate random samples and record the space cost of each serialization system, one by one.
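A space benchmark of this kind can be sketched as a loop over named serializer functions. Everything here is illustrative: the Sample type, the measure method, and the two stand-in serializers (JDK built-in serialization and a hand-written DataOutputStream encoding) are assumptions for the example and are not the systems or code used by the testing program.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;
import java.util.function.Function;

final class SpaceBenchmarkSketch {
    // Hypothetical sample type.
    static final class Sample implements Serializable {
        private static final long serialVersionUID = 1L;
        final int id;
        final String name;
        final double score;

        Sample(int id, String name, double score) {
            this.id = id;
            this.name = name;
            this.score = score;
        }
    }

    // Stand-in serializer 1: JDK built-in serialization.
    static byte[] jdkBytes(Sample sample) {
        try (ByteArrayOutputStream buffer = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(sample);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stand-in serializer 2: fields written by hand, no type metadata.
    static byte[] manualBytes(Sample sample) {
        try (ByteArrayOutputStream buffer = new ByteArrayOutputStream();
             DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(sample.id);
            out.writeUTF(sample.name);
            out.writeDouble(sample.score);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Records the serialized size, in bytes, of one sample per serializer.
    static Map<String, Integer> measure(Sample sample,
                                        Map<String, Function<Sample, byte[]>> serializers) {
        Map<String, Integer> sizes = new LinkedHashMap<>();
        serializers.forEach((name, serializer) -> sizes.put(name, serializer.apply(sample).length));
        return sizes;
    }

    public static void main(String[] args) {
        Map<String, Function<Sample, byte[]>> serializers = new LinkedHashMap<>();
        serializers.put("jdk", SpaceBenchmarkSketch::jdkBytes);
        serializers.put("manual", SpaceBenchmarkSketch::manualBytes);
        Random random = new Random();
        Sample sample = new Sample(random.nextInt(), "name-" + random.nextInt(1_000), random.nextDouble());
        measure(sample, serializers).forEach((name, size) -> System.out.println(name + ": " + size + " bytes"));
    }
}
```

Every serializer sees the same sample instance, which matches the input generalization rule in 1.3.2.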
2.3.2. Speed Benchmark Testing Objects
To compare the serialization systems fairly, the following constraints are defined for speed benchmark testing objects.
- C3-01 Provides a method that accepts 1 Object argument, the object to be serialized, and 1 int argument, the number of loops
- C3-02 Times the method defined by C3-01 from start to end, and calculates the total elapsed time and the average elapsed time per serialization
- C3-03 The execution order of the speed benchmark testing objects can be adjusted freely at runtime
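The three constraints can be sketched in plain Java as follows. The Benchmark interface shape, the Timing class, and the time method are hypothetical names chosen for this example, and the timing is done with an explicit wrapper around System.nanoTime() rather than whatever mechanism the testing program itself uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.TimeUnit;

final class SpeedBenchmarkSketch {
    // Hypothetical shape of a speed benchmark (C3-01).
    interface Benchmark {
        void run(Object target, int loops);
    }

    // Total and per-call elapsed time of one run (C3-02).
    static final class Timing {
        final int loops;
        final long totalNanos;

        Timing(int loops, long totalNanos) {
            this.loops = loops;
            this.totalNanos = totalNanos;
        }

        long totalMillis() { return TimeUnit.NANOSECONDS.toMillis(totalNanos); }

        double averageMicros() { return totalNanos / 1_000.0 / loops; }
    }

    // Measures the whole run from start to end.
    static Timing time(Benchmark benchmark, Object target, int loops) {
        if (loops <= 0) {
            throw new IllegalArgumentException("loops must be positive");
        }
        long start = System.nanoTime();
        benchmark.run(target, loops);
        return new Timing(loops, System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Stand-in workloads; a real benchmark would serialize the target here.
        Benchmark toStringLoop = (target, loops) -> {
            for (int i = 0; i < loops; i++) target.toString();
        };
        Benchmark hashCodeLoop = (target, loops) -> {
            for (int i = 0; i < loops; i++) target.hashCode();
        };
        List<Benchmark> order = new ArrayList<>(Arrays.asList(toStringLoop, hashCodeLoop));
        Collections.shuffle(order); // execution order decided at runtime (C3-03)
        for (Benchmark benchmark : order) {
            Timing timing = time(benchmark, "sample", 10_000);
            System.out.printf("total=%d ms, average=%.3f us%n", timing.totalMillis(), timing.averageMicros());
        }
    }
}
```

Keeping the benchmarks in an ordinary list is one simple way to let the caller reorder them; a JIT-compiled JVM also needs warm-up runs before such numbers mean much, which this sketch leaves out.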
3. Implementing the Testing Program
3.1. SerializedObject
SerializedObject is the base class of all serialized objects and is implemented according to 2.2.
- Subtypes of SerializedObject tell their superclass how to serialize the wrapped object into a byte array.
- Subtypes of SerializedObject tell their superclass how to deserialize the byte array back into an object.
- Subtypes of SerializedObject may implement beforeSerilize() to initialize the utilities needed during serialization.
- Checked exceptions caught by SerializedObject during serialization are wrapped in SerializationException and rethrown.
- SerializedObject provides factory methods to create its known subtypes, since their constructors are package-private.
3.1.1. Hessian2
Hessian2SerializedObject requires extra configuration, placed under the META-INF directory, which specifies the custom serialization and deserialization strategies.
3.2. Benchmark Interface
Benchmark is the speed benchmark interface.
- Benchmark defines the method that executes a single benchmark test, according to 2.3.2.
- Benchmark executions are intercepted by ProfilingAspect, which does the timing.
- ProfilingAspect records the total elapsed time in milliseconds and the average elapsed time of a single call in microseconds.
3.3. SpeedBenchmarks
- Combines all known implementations of Benchmark.
- Runs each Benchmark implementation 1,000, 5,000, 20,000, 50,000, and 200,000 times.
- Defines the thread pool that executes the benchmark tests and manages its lifecycle.
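The orchestration described above might be sketched like this. The class SpeedBenchmarksSketch, the Result type, and the use of IntConsumer to stand for a benchmark are assumptions for illustration. The single-threaded executor is also an assumption, chosen here so that timed runs do not compete with each other; the source does not state the pool size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntConsumer;

final class SpeedBenchmarksSketch {
    // Loop counts taken from the list above.
    static final int[] LOOP_COUNTS = {1_000, 5_000, 20_000, 50_000, 200_000};

    static final class Result {
        final int benchmarkIndex;
        final int loops;
        final long nanos;

        Result(int benchmarkIndex, int loops, long nanos) {
            this.benchmarkIndex = benchmarkIndex;
            this.loops = loops;
            this.nanos = nanos;
        }
    }

    // Runs every benchmark once per loop count; each IntConsumer receives the loop count.
    static List<Result> runAll(List<IntConsumer> benchmarks) {
        ExecutorService pool = Executors.newSingleThreadExecutor(); // assumed pool size
        try {
            List<Future<Result>> pending = new ArrayList<>();
            for (int i = 0; i < benchmarks.size(); i++) {
                final int index = i;
                final IntConsumer benchmark = benchmarks.get(i);
                for (final int loops : LOOP_COUNTS) {
                    pending.add(pool.submit(() -> {
                        long start = System.nanoTime();
                        benchmark.accept(loops);
                        return new Result(index, loops, System.nanoTime() - start);
                    }));
                }
            }
            List<Result> results = new ArrayList<>();
            for (Future<Result> future : pending) {
                results.add(future.get()); // results come back in submission order
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for benchmarks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("benchmark failed", e.getCause());
        } finally {
            pool.shutdown(); // the pool's lifecycle ends with the run
        }
    }

    public static void main(String[] args) {
        // Stand-in workloads; real benchmarks would serialize an object in the loop.
        IntConsumer concat = loops -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < loops; i++) sb.append(i);
        };
        IntConsumer sum = loops -> {
            long total = 0;
            for (int i = 0; i < loops; i++) total += i;
        };
        for (Result r : runAll(Arrays.asList(concat, sum))) {
            System.out.println("benchmark " + r.benchmarkIndex + " x " + r.loops + ": " + r.nanos / 1_000 + " us");
        }
    }
}
```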
3.4. Auto-generated code
Protocol Buffers message classes have to be pre-generated by static compilation. In addition, the testing program uses lombok to avoid verbose code. If classes or libraries appear to be missing after you import the code into an IDE, go to the master directory and run mvn clean install first, then re-import the code.
3.5. Testing Models
TestingModels is the sample testing data generator; it randomly generates the sample objects and enum values under test. The sample model types are generated by the lombok compiler. Whether or not the project has been compiled, the original source files can be found under message/testing-models/src/main/lombok.
3.6. Package io.demo.message.domain.proto
io.demo.message.domain.proto contains 2 kinds of classes:
- Message classes generated by the Protobuf compiler, which can be found under message/testing-models/target/generated-sources/protobuf/java after compilation.
- Converter classes that transform between the Protobuf message classes and the sample testing model classes.
4. How to extend the Testing Program