There is SchemaConverters.createConverterToSQL, but unfortunately it is private. There have been PRs to make it public, but they were never merged.
There is a workaround, though, that we used. Since the method's visibility is restricted to its package, you can expose it by creating a wrapper object in the com.databricks.spark.avro package:
package com.databricks.spark.avro

import org.apache.avro.Schema
import org.apache.avro.generic.GenericRecord
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.DataType

// Lives in com.databricks.spark.avro so it can reach the package-restricted
// SchemaConverters.createConverterToSQL and re-export it publicly.
object MySchemaConversions {
  def createConverterToSQL(avroSchema: Schema, sparkSchema: DataType): (GenericRecord) => Row =
    SchemaConverters.createConverterToSQL(avroSchema, sparkSchema)
      .asInstanceOf[(GenericRecord) => Row]
}
Then you can use it from your Java code like this:
// Derive the Spark SQL schema from the Avro schema (toSqlType is public).
final DataType myAvroType = SchemaConverters.toSqlType(MyAvroRecord.getClassSchema()).dataType();

// Build the converter once and reuse it; it is a Scala Function1.
final Function1<GenericRecord, Row> myAvroRecordConverter =
        MySchemaConversions.createConverterToSQL(MyAvroRecord.getClassSchema(), myAvroType);
// Convert a batch of Avro records to Spark SQL rows.
Row[] convertAvroRecordsToRows(List<GenericRecord> records) {
    return records.stream().map(myAvroRecordConverter::apply).toArray(Row[]::new);
}
For a single record, you can just call the converter directly:
final Row row = myAvroRecordConverter.apply(record);
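If the end goal is a DataFrame, the converted rows can be passed to SparkSession.createDataFrame along with the schema. Here is a minimal sketch; the method name toDataFrame is hypothetical, and it assumes a SparkSession plus the myAvroRecordConverter and myAvroType values from the snippets above:

import java.util.List;
import java.util.stream.Collectors;

import org.apache.avro.generic.GenericRecord;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.StructType;

// Sketch: build a DataFrame from converted records. An Avro record schema
// converts to a Spark StructType, so the cast below is safe for record types.
Dataset<Row> toDataFrame(SparkSession spark, List<GenericRecord> records) {
    List<Row> rows = records.stream()
            .map(myAvroRecordConverter::apply)
            .collect(Collectors.toList());
    // createDataFrame(List<Row>, StructType) is part of the public Java API.
    return spark.createDataFrame(rows, (StructType) myAvroType);
}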