Module io

File io implementation.

§How to build FileIO

FileIOBuilder builds a FileIO from scratch. For example:

use iceberg::Result;
use iceberg::io::{FileIOBuilder, S3_REGION};

// Build a memory file io.
let file_io = FileIOBuilder::new("memory").build()?;
// Build an fs file io.
let file_io = FileIOBuilder::new("fs").build()?;
// Build an s3 file io.
let file_io = FileIOBuilder::new("s3")
    .with_prop(S3_REGION, "us-east-1")
    .build()?;

Alternatively, pass a path and let FileIO infer the scheme for you:

use iceberg::Result;
use iceberg::io::{FileIO, S3_REGION};

// Build a memory file io.
let file_io = FileIO::from_path("memory:///")?.build()?;
// Build an fs file io.
let file_io = FileIO::from_path("fs:///tmp")?.build()?;
// Build an s3 file io.
let file_io = FileIO::from_path("s3://bucket/a")?
    .with_prop(S3_REGION, "us-east-1")
    .build()?;
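
To illustrate what inferring the scheme from a path means, here is a minimal, standard-library-only sketch. It is not the crate's implementation: `infer_scheme` is a hypothetical helper, and it assumes the scheme is simply the text before the first `://`.

```rust
/// Hypothetical helper (not part of the iceberg crate): returns the scheme
/// portion of a path such as "s3://bucket/a", or `None` when the path has
/// no "://" separator or the text before it is empty.
fn infer_scheme(path: &str) -> Option<&str> {
    // Split once at the first "://"; bail out with None if it is absent.
    let (scheme, _rest) = path.split_once("://")?;
    if scheme.is_empty() {
        None
    } else {
        Some(scheme)
    }
}
```

Under this assumption, the three paths used above yield `memory`, `fs`, and `s3`, while a bare path such as `/tmp/local` yields no scheme.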

§How to use FileIO

Currently FileIO provides simple methods for file operations:

  • delete: Delete file.
  • exists: Check if file exists.
  • new_input: Create input file for reading.
  • new_output: Create output file for writing.
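
The life cycle these four operations imply (write, check, read, delete) can be pictured with a small in-memory model. This is only a sketch: `ToyFileIO` and its methods are hypothetical stand-ins with simplified signatures, not the crate's API, and the toy folds `new_input` and `new_output` into direct `read` and `write` calls.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for a file store; not part of the crate.
#[derive(Default)]
struct ToyFileIO {
    files: HashMap<String, Vec<u8>>,
}

impl ToyFileIO {
    /// Models `new_output` plus a write: stores `data` at `path`,
    /// replacing any previous content.
    fn write(&mut self, path: &str, data: &[u8]) {
        self.files.insert(path.to_string(), data.to_vec());
    }

    /// Models `new_input` plus a read: returns a copy of the bytes, if any.
    fn read(&self, path: &str) -> Option<Vec<u8>> {
        self.files.get(path).cloned()
    }

    /// Models `exists`: reports whether something is stored at `path`.
    fn exists(&self, path: &str) -> bool {
        self.files.contains_key(path)
    }

    /// Models `delete`: removes the entry and reports whether one was removed.
    fn delete(&mut self, path: &str) -> bool {
        self.files.remove(path).is_some()
    }
}
```

In this model, a path written with `write` is reported by `exists` and returned by `read` until `delete` removes it.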

Structs§

AzdlsConfig
Azure Data Lake Storage configuration.
CustomAwsCredentialLoader
Custom AWS credential loader. This can be used to load credentials from a custom source, such as the AWS SDK.
Extensions
Container for storing type-safe extensions used to configure underlying FileIO behavior.
FileIO
FileIO implementation, used to manipulate files in underlying storage.
FileIOBuilder
Builder for FileIO.
FileMetadata
The struct that represents the metadata of a file.
GcsConfig
Google Cloud Storage configuration.
InputFile
Input file is used for reading from files.
OssConfig
Alibaba Cloud OSS storage configuration.
OutputFile
Output file is used for writing to files.
S3Config
Amazon S3 storage configuration.
StorageConfig
Configuration properties for storage backends.

Enums§

OpenDalStorage
OpenDAL-based storage implementation.
OpenDalStorageFactory
OpenDAL-based storage factory.

Constants§

ADLS_ACCOUNT_KEY
The key used to authenticate against the account.
ADLS_ACCOUNT_NAME
The account that you want to connect to.
ADLS_AUTHORITY_HOST
The authority host of the service principal.
ADLS_CLIENT_ID
The client-id.
ADLS_CLIENT_SECRET
The client-secret.
ADLS_CONNECTION_STRING
A connection string.
ADLS_SAS_TOKEN
The shared access signature.
ADLS_TENANT_ID
The tenant-id.
CLIENT_REGION
Region to use for the S3 client (takes precedence over S3_REGION).
GCS_ALLOW_ANONYMOUS
Option to skip signing requests (e.g. for public buckets/folders).
GCS_CREDENTIALS_JSON
Google Cloud Storage credentials JSON string, base64 encoded.
GCS_DISABLE_CONFIG_LOAD
Option to skip loading configuration from config file and the env.
GCS_DISABLE_VM_METADATA
Option to skip loading the credential from GCE metadata server.
GCS_NO_AUTH
Allow unauthenticated requests.
GCS_PROJECT_ID
Google Cloud Project ID.
GCS_SERVICE_PATH
Google Cloud Storage endpoint.
GCS_TOKEN
Google Cloud Storage token.
GCS_USER_PROJECT
Google Cloud user project.
OSS_ACCESS_KEY_ID
Aliyun OSS access key ID.
OSS_ACCESS_KEY_SECRET
Aliyun OSS access key secret.
OSS_ENDPOINT
Aliyun OSS endpoint.
S3_ACCESS_KEY_ID
S3 access key ID.
S3_ALLOW_ANONYMOUS
Option to skip signing requests (e.g. for public buckets/folders).
S3_ASSUME_ROLE_ARN
If set, all AWS clients will assume a role of the given ARN, instead of using the default credential chain.
S3_ASSUME_ROLE_EXTERNAL_ID
Optional external ID used to assume an IAM role.
S3_ASSUME_ROLE_SESSION_NAME
Optional session name used to assume an IAM role.
S3_DISABLE_CONFIG_LOAD
Option to skip loading configuration from config file and the env.
S3_DISABLE_EC2_METADATA
Option to skip loading the credential from EC2 metadata (typically used in conjunction with S3_ALLOW_ANONYMOUS).
S3_ENDPOINT
S3 endpoint URL.
S3_PATH_STYLE_ACCESS
S3 Path Style Access.
S3_REGION
S3 region.
S3_SECRET_ACCESS_KEY
S3 secret access key.
S3_SESSION_TOKEN
S3 session token (required when using temporary credentials).
S3_SSE_KEY
S3 Server Side Encryption Key. If the S3 encryption type is kms, the input is a KMS Key ID; if this property is not set, the default key “aws/s3” is used. If the encryption type is custom, the input is a custom base-64 AES256 symmetric key.
S3_SSE_MD5
S3 Server Side Encryption MD5.
S3_SSE_TYPE
S3 Server Side Encryption Type.
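
The precedence noted for CLIENT_REGION over S3_REGION can be sketched with a plain property map. This is an illustration only: `resolve_region` is a hypothetical helper, and the string values given to the two constants here are assumptions rather than values taken from this page.

```rust
use std::collections::HashMap;

// Assumed string values of the CLIENT_REGION and S3_REGION property keys.
const CLIENT_REGION: &str = "client.region";
const S3_REGION: &str = "s3.region";

/// Hypothetical helper (not part of the crate): picks the region to use,
/// preferring CLIENT_REGION and falling back to S3_REGION.
fn resolve_region(props: &HashMap<String, String>) -> Option<&str> {
    props
        .get(CLIENT_REGION)
        .or_else(|| props.get(S3_REGION))
        .map(String::as_str)
}
```

With only S3_REGION set, that value is returned; once CLIENT_REGION is also present, it wins.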

Traits§

FileRead
Trait for reading file.
FileWrite
Trait for writing file.
Storage
Trait for storage operations in Iceberg.
StorageFactory
Factory for creating Storage instances from configuration.