Module iceberg::spec

Spec for Iceberg.

Structs§

  • Data file carries data file path, partition tuple, metrics, …
  • Builder for DataFile.
  • Literal associated with its type. The value and type pair is checked at construction, so the type and value are guaranteed to be consistent when used.
  • Field summary for partition field in the spec.
  • A list is a collection of values with some element type. The element field has an integer id that is unique in the table schema. Elements can be either optional or required. Element types may be any type.
  • A manifest contains metadata and a list of entries.
  • A manifest is an immutable Avro file that lists data files or delete files, along with each file’s partition data tuple, metrics, and tracking information.
  • Entry in a manifest list.
  • Snapshots are embedded in table metadata, but the list of manifests for a snapshot is stored in a separate manifest list file.
  • A manifest list writer.
  • Metadata of a manifest, stored in the key-value metadata of the Avro file.
  • A manifest writer.
  • Map is a collection of key-value pairs with a key type and a value type. It is used in Literal::Map. To make it hashable, the order of key-value pairs is stored in a separate vector so that the map can be hashed deterministically. This also means that the order of key-value pairs affects the hash value.
  • A map is a collection of key-value pairs with a key type and a value type. Both the key field and value field each have an integer id that is unique in the table schema. Map keys are required and map values can be either optional or required. Both map keys and map values may be any type, including nested types.
  • Encodes changes to the previous metadata files for the table
  • A struct is a tuple of typed values. Each field in the tuple is named and has an integer id that is unique in the table schema. Each field can be either optional or required, meaning that values can (or cannot) be null. Fields may be any type. Fields may have an optional comment or doc string. Fields can have default values.
  • Partition fields capture the transform from table data to partition values.
  • Partition spec that defines how to produce a tuple of partition values from a record.
  • Create valid partition specs for a given schema.
  • Raw literal representation used for serde. Its serialization format is the one used by the Avro serializer.
  • Defines schema in iceberg.
  • Schema builder.
  • A snapshot represents the state of a table at some time and is used to access the complete set of data files in the table.
  • A log of when each snapshot was made.
  • Iceberg tables keep track of branches and tags using snapshot references.
  • Entry for every column that is to be sorted
  • A sort order is defined by a sort order id and a list of sort fields. The order of the sort fields within the list defines the order in which the sort is applied to the data.
  • Builder for SortOrder.
  • The SQL representation stores the view definition as a SQL SELECT, with metadata such as the SQL dialect.
  • The partition struct stores the tuple of partition values for each file. Its type is derived from the partition fields of the partition spec used to write the manifest file. In v2, the partition struct’s field ids must match the ids from the partition spec.
  • DataType for a specific struct
  • An iterator that moves out of a struct.
  • Summarises the changes in the snapshot.
  • Fields for the version 2 of the table metadata.
  • Manipulating table metadata.
  • Unbound partition field can be built without a schema and later bound to a schema.
  • Unbound partition spec can be built without a schema and later bound to a schema.
  • Create a new UnboundPartitionSpec
  • Fields for the version 1 of the view metadata.
  • Manipulating view metadata.
  • A list of view representations.
  • A view version represents the definition of a view at a specific point in time.
  • A log of when each snapshot was made.
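
The description of the hashable map above (pairs kept in a separate vector so hashing is deterministic) can be illustrated with a small standalone sketch. Everything here is hypothetical: `OrderedMap` and `hash_of` are illustrative names, not the crate's types or functions, and the real implementation may differ in layout and API. The sketch only shows why storing insertion order makes the hash reproducible, and why that order then becomes part of the hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for an order-tracking map; not the crate's actual type.
#[derive(Debug)]
pub struct OrderedMap<K, V> {
    /// Maps each key to its position in `pairs`, for fast lookup.
    index: HashMap<K, usize>,
    /// Key-value pairs in insertion order; this is what gets hashed.
    pairs: Vec<(K, V)>,
}

impl<K: Hash + Eq + Clone, V> OrderedMap<K, V> {
    pub fn new() -> Self {
        Self { index: HashMap::new(), pairs: Vec::new() }
    }

    /// Inserts a pair and returns the previous value for the key, if any.
    /// Overwriting a key keeps its original position.
    pub fn insert(&mut self, key: K, value: V) -> Option<V> {
        if let Some(&pos) = self.index.get(&key) {
            Some(std::mem::replace(&mut self.pairs[pos].1, value))
        } else {
            self.index.insert(key.clone(), self.pairs.len());
            self.pairs.push((key, value));
            None
        }
    }

    /// Looks up a value through the index.
    pub fn get(&self, key: &K) -> Option<&V> {
        self.index.get(key).map(|&pos| &self.pairs[pos].1)
    }
}

impl<K: Hash, V: Hash> Hash for OrderedMap<K, V> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Walk the vector, not the HashMap: HashMap iteration order is
        // unspecified, while the vector order is stable.
        for (k, v) in &self.pairs {
            k.hash(state);
            v.hash(state);
        }
    }
}

/// Helper that hashes any value with the standard library's default hasher.
pub fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}
```

Two maps built by inserting the same pairs in the same order hash identically under this sketch, while the same pairs inserted in a different order will generally produce a different hash, which is the trade-off the description mentions.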

Enums§

  • Type of content stored by the data file: data, equality deletes, or position deletes (all v1 files are data files)
  • Error type for DataFileBuilder
  • Format of this data.
  • Iceberg format version
  • Values present in iceberg type
  • The type of files tracked by the manifest, either data or delete files; Data(0) for all v1 manifests
  • Used to track additions and deletions in ManifestEntry.
  • Describes the order of null values when sorted.
  • The operation field is used by some operations, like snapshot expiration, to skip processing certain snapshots.
  • Values present in iceberg type
  • Primitive data types
  • The snapshot expiration procedure removes snapshots from table metadata and applies the table’s retention policy.
  • Sort direction in a partition, either ascending or descending
  • Error type for SortOrderBuilder
  • Transform is used to transform predicates to partition predicates, in addition to transforming data values.
  • All data types are either primitives or nested types, which are maps, lists, or structs.
  • Iceberg format version
  • View definitions can be represented in multiple ways. Representations are documented ways to express a view definition.
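
The type model described in these listings (a type is either a primitive or a nested struct, list, or map, and every nested field carries an integer id unique within the schema) can be sketched as a small recursive enum. This is a simplified illustration under stated assumptions: `SketchType`, `SketchField`, `SketchPrimitive`, and the helper functions are hypothetical names invented for this example and are not the crate's definitions.

```rust
use std::collections::HashSet;

/// A few primitive types, for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum SketchPrimitive {
    Boolean,
    Long,
    String,
}

/// Hypothetical field: an id, a name, a required flag, and a type.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchField {
    pub id: i32,
    pub name: String,
    pub required: bool,
    pub field_type: SketchType,
}

/// Hypothetical type model: primitives, or nested struct/list/map types.
#[derive(Debug, Clone, PartialEq)]
pub enum SketchType {
    Primitive(SketchPrimitive),
    Struct(Vec<SketchField>),
    List(Box<SketchField>),
    Map { key: Box<SketchField>, value: Box<SketchField> },
}

impl SketchField {
    pub fn new(id: i32, name: &str, required: bool, field_type: SketchType) -> Self {
        Self { id, name: name.to_string(), required, field_type }
    }
}

/// Records a field's id, then descends into its type.
fn visit_field(field: &SketchField, out: &mut Vec<i32>) {
    out.push(field.id);
    collect_ids(&field.field_type, out);
}

/// Collects every field id in depth-first order.
pub fn collect_ids(ty: &SketchType, out: &mut Vec<i32>) {
    match ty {
        SketchType::Primitive(_) => {}
        SketchType::Struct(fields) => {
            for field in fields {
                visit_field(field, out);
            }
        }
        SketchType::List(element) => visit_field(element, out),
        SketchType::Map { key, value } => {
            visit_field(key, out);
            visit_field(value, out);
        }
    }
}

/// Returns true when no field id appears twice anywhere in the type.
pub fn ids_are_unique(ty: &SketchType) -> bool {
    let mut ids = Vec::new();
    collect_ids(ty, &mut ids);
    let mut seen = HashSet::new();
    ids.into_iter().all(|id| seen.insert(id))
}

/// Builds a struct with a required long, a list of strings, and a string map.
pub fn example_struct() -> SketchType {
    let string = || SketchType::Primitive(SketchPrimitive::String);
    SketchType::Struct(vec![
        SketchField::new(1, "id", true, SketchType::Primitive(SketchPrimitive::Long)),
        SketchField::new(2, "tags", false, SketchType::List(Box::new(
            SketchField::new(3, "element", true, string()),
        ))),
        SketchField::new(4, "props", false, SketchType::Map {
            key: Box::new(SketchField::new(5, "key", true, string())),
            value: Box::new(SketchField::new(6, "value", false, string())),
        }),
    ])
}
```

Modeling list elements and map keys and values as fields in their own right mirrors the descriptions above, where each of them has its own id and its own optional-or-required flag; a uniqueness check then reduces to one traversal.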

Constants§

Traits§

Functions§

Type Aliases§