Module iceberg::writer

Iceberg writer module.

The writer API is designed to be extensible and flexible. Each writer is decoupled and can be created and configured independently. Users can: 1. Customize a writer by implementing the writer trait. 2. Combine different writers to build a writer with more complex write logic.
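As a rough illustration of the "combine" idea, here is a minimal, synchronous sketch that wraps one writer in another. The `Writer` trait, `BufferWriter`, `FilterWriter`, and `compose_example` are hypothetical names invented for this sketch; they are not part of the iceberg crate, whose real writer traits are async and operate on record batches.

```rust
/// Hypothetical, simplified stand-in for a writer trait (not the iceberg API).
pub trait Writer {
    type Output;
    /// Accepts a batch of rows.
    fn write(&mut self, rows: &[i64]);
    /// Consumes the writer and returns whatever it produced.
    fn close(self) -> Self::Output;
}

/// Base writer: buffers every row it receives.
pub struct BufferWriter {
    rows: Vec<i64>,
}

impl BufferWriter {
    pub fn new() -> Self {
        Self { rows: Vec::new() }
    }
}

impl Writer for BufferWriter {
    type Output = Vec<i64>;

    fn write(&mut self, rows: &[i64]) {
        self.rows.extend_from_slice(rows);
    }

    fn close(self) -> Vec<i64> {
        self.rows
    }
}

/// Wrapper writer: drops negative values, then forwards the rest to `inner`.
pub struct FilterWriter<W: Writer> {
    inner: W,
    dropped: usize,
}

impl<W: Writer> FilterWriter<W> {
    pub fn new(inner: W) -> Self {
        Self { inner, dropped: 0 }
    }
}

impl<W: Writer> Writer for FilterWriter<W> {
    /// The inner writer's output plus the number of dropped rows.
    type Output = (W::Output, usize);

    fn write(&mut self, rows: &[i64]) {
        let kept: Vec<i64> = rows.iter().copied().filter(|v| *v >= 0).collect();
        self.dropped += rows.len() - kept.len();
        self.inner.write(&kept);
    }

    fn close(self) -> Self::Output {
        (self.inner.close(), self.dropped)
    }
}

/// Builds the combined writer and feeds it two batches.
pub fn compose_example() -> (Vec<i64>, usize) {
    let mut writer = FilterWriter::new(BufferWriter::new());
    writer.write(&[1, -2, 3]);
    writer.write(&[-4, 5]);
    writer.close()
}
```

Because `FilterWriter` only depends on the trait, it can wrap any other implementation, including another wrapper, which is the kind of decoupling described above.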

There are two kinds of writer:

  1. FileWriter: Focuses on writing record batches to different physical file formats (such as Parquet or ORC).
  2. IcebergWriter: Focuses on the logical format of the Iceberg table. It ultimately writes the data through a FileWriter.
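The split between the two layers can be sketched as follows. This is a simplified, synchronous model under stated assumptions: the `FileWriter` trait shown here, `CsvFileWriter`, `DataWriter`, `DataFileMeta`, and `layering_example` are hypothetical and only mirror the roles described above; the crate's actual traits, types, and signatures differ.

```rust
/// Hypothetical physical layer: encodes rows into one concrete file format.
pub trait FileWriter {
    fn write_rows(&mut self, rows: &[(i64, String)]);
    /// Finishes the file and returns its path and encoded bytes.
    fn close(self) -> (String, Vec<u8>);
}

/// Toy format used in place of Parquet or ORC: one "id,name" line per row.
pub struct CsvFileWriter {
    path: String,
    buf: Vec<u8>,
}

impl CsvFileWriter {
    pub fn new(path: &str) -> Self {
        Self { path: path.to_string(), buf: Vec::new() }
    }
}

impl FileWriter for CsvFileWriter {
    fn write_rows(&mut self, rows: &[(i64, String)]) {
        for (id, name) in rows {
            self.buf.extend_from_slice(format!("{id},{name}\n").as_bytes());
        }
    }

    fn close(self) -> (String, Vec<u8>) {
        (self.path, self.buf)
    }
}

/// Hypothetical table-level metadata produced by the logical layer.
#[derive(Debug, PartialEq)]
pub struct DataFileMeta {
    pub path: String,
    pub record_count: usize,
    pub file_size: usize,
}

/// Hypothetical logical layer: tracks table-level facts and delegates
/// the actual encoding to whichever `FileWriter` it was given.
pub struct DataWriter<F: FileWriter> {
    file_writer: F,
    record_count: usize,
}

impl<F: FileWriter> DataWriter<F> {
    pub fn new(file_writer: F) -> Self {
        Self { file_writer, record_count: 0 }
    }

    pub fn write(&mut self, rows: &[(i64, String)]) {
        self.record_count += rows.len();
        self.file_writer.write_rows(rows);
    }

    pub fn close(self) -> DataFileMeta {
        let (path, bytes) = self.file_writer.close();
        DataFileMeta { path, record_count: self.record_count, file_size: bytes.len() }
    }
}

/// Writes two rows through both layers and returns the resulting metadata.
pub fn layering_example() -> DataFileMeta {
    let mut writer = DataWriter::new(CsvFileWriter::new("data/00000.csv"));
    writer.write(&[(1, "a".to_string()), (2, "b".to_string())]);
    writer.close()
}
```

Swapping `CsvFileWriter` for another `FileWriter` implementation changes the physical format without touching the logical layer.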

§Simple example for data file writer:

// Create a parquet file writer builder. The parameters can be obtained from the table.
let file_writer_builder = ParquetWriterBuilder::new(
    0,
    WriterProperties::builder().build(),
    schema,
    file_io.clone(),
    location_gen,
    file_name_gen,
);
// Create a data file writer builder from the parquet file writer builder.
let data_file_builder = DataFileBuilder::new(file_writer_builder);
// Build the data file writer. It must be mutable because writing changes its state.
let mut data_file_writer = data_file_builder.build().await.unwrap();

// Write one record batch, then flush to obtain the resulting data files.
data_file_writer.write(&record_batch).await.unwrap();
let data_files = data_file_writer.flush().await.unwrap();

Modules§

  • Base writer module. Contains the basic writers provided by iceberg: DataFileWriter, PositionDeleteFileWriter, EqualityDeleteFileWriter.
  • This module contains the writers for the data file formats supported by iceberg: parquet, orc.

Traits§