File io implementation.
§How to build FileIO
We provide a FileIOBuilder to build a FileIO from scratch. For example:
use iceberg::Result;
use iceberg::io::{FileIOBuilder, S3_REGION};
// Build a memory file io.
let file_io = FileIOBuilder::new("memory").build()?;
// Build an fs file io.
let file_io = FileIOBuilder::new("fs").build()?;
// Build an s3 file io.
let file_io = FileIOBuilder::new("s3")
    .with_prop(S3_REGION, "us-east-1")
    .build()?;
Or you can pass a path and let FileIO infer the scheme for you:
use iceberg::Result;
use iceberg::io::{FileIO, S3_REGION};
// Build a memory file io.
let file_io = FileIO::from_path("memory:///")?.build()?;
// Build an fs file io.
let file_io = FileIO::from_path("fs:///tmp")?.build()?;
// Build an s3 file io.
let file_io = FileIO::from_path("s3://bucket/a")?
    .with_prop(S3_REGION, "us-east-1")
    .build()?;
§How to use FileIO
Currently FileIO provides simple methods for file operations:
- delete: Delete file.
- exists: Check if file exists.
- new_input: Create input file for reading.
- new_output: Create output file for writing.
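To make the semantics of these four operations concrete, here is a toy model that uses only the standard library. `ToyFileIO` and its methods are hypothetical stand-ins invented for illustration; they are not the iceberg types. This sketch is synchronous, cannot fail, and folds "create an output file, then write" and "create an input file, then read" into single calls, whereas the real API hands back InputFile and OutputFile handles.

```rust
use std::collections::HashMap;

/// Toy stand-in for a file IO handle (hypothetical, NOT iceberg's FileIO).
#[derive(Default)]
struct ToyFileIO {
    files: HashMap<String, Vec<u8>>,
}

impl ToyFileIO {
    /// Models `exists`: report whether something is stored at `path`.
    fn exists(&self, path: &str) -> bool {
        self.files.contains_key(path)
    }

    /// Models `delete`: remove the file. In this toy, deleting a
    /// missing file is simply a no-op.
    fn delete(&mut self, path: &str) {
        self.files.remove(path);
    }

    /// Models `new_output` followed by a write: store `data` at `path`,
    /// replacing any previous content.
    fn write(&mut self, path: &str, data: &[u8]) {
        self.files.insert(path.to_string(), data.to_vec());
    }

    /// Models `new_input` followed by a read: return the stored bytes,
    /// or `None` if nothing is stored at `path`.
    fn read(&self, path: &str) -> Option<&[u8]> {
        self.files.get(path).map(|bytes| bytes.as_slice())
    }
}

fn main() {
    let mut io = ToyFileIO::default();
    let path = "memory:///a.txt";

    // Nothing has been written yet.
    assert!(!io.exists(path));

    // Write, then read the same bytes back.
    io.write(path, b"hello");
    assert!(io.exists(path));
    assert_eq!(io.read(path), Some(&b"hello"[..]));

    // After deletion the file is gone again.
    io.delete(path);
    assert!(!io.exists(path));
    assert_eq!(io.read(path), None);
}
```

The path string mirrors the "memory:///" form used in the build examples, but the toy treats it as an opaque map key and does no scheme handling.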
Structs§
- AzdlsConfig: Azure Data Lake Storage configuration.
- CustomAwsCredentialLoader: Custom AWS credential loader. This can be used to load credentials from a custom source, such as the AWS SDK.
- Extensions: Container for storing type-safe extensions used to configure underlying FileIO behavior.
- FileIO: FileIO implementation, used to manipulate files in underlying storage.
- FileIOBuilder: Builder for FileIO.
- FileMetadata: The struct that represents the metadata of a file.
- GcsConfig: Google Cloud Storage configuration.
- InputFile: Input file is used for reading from files.
- OssConfig: Alibaba Cloud OSS storage configuration.
- OutputFile: Output file is used for writing to files.
- S3Config: Amazon S3 storage configuration.
- StorageConfig: Configuration properties for storage backends.
Enums§
- OpenDalStorage: OpenDAL-based storage implementation.
- OpenDalStorageFactory: OpenDAL-based storage factory.
Constants§
- ADLS_ACCOUNT_KEY: The key to authenticate against the account.
- ADLS_ACCOUNT_NAME: The account that you want to connect to.
- ADLS_AUTHORITY_HOST: The authority host of the service principal.
- ADLS_CLIENT_ID: The client-id.
- ADLS_CLIENT_SECRET: The client-secret.
- ADLS_CONNECTION_STRING: A connection string.
- ADLS_SAS_TOKEN: The shared access signature.
- ADLS_TENANT_ID: The tenant-id.
- CLIENT_REGION: Region to use for the S3 client (takes precedence over S3_REGION).
- GCS_ALLOW_ANONYMOUS: Option to skip signing requests (e.g. for public buckets/folders).
- GCS_CREDENTIALS_JSON: Google Cloud Storage credentials JSON string, base64 encoded.
- GCS_DISABLE_CONFIG_LOAD: Option to skip loading configuration from config file and the env.
- GCS_DISABLE_VM_METADATA: Option to skip loading the credential from GCE metadata server.
- GCS_NO_AUTH: Allow unauthenticated requests.
- GCS_PROJECT_ID: Google Cloud Project ID.
- GCS_SERVICE_PATH: Google Cloud Storage endpoint.
- GCS_TOKEN: Google Cloud Storage token.
- GCS_USER_PROJECT: Google Cloud user project.
- OSS_ACCESS_KEY_ID: Aliyun OSS access key ID.
- OSS_ACCESS_KEY_SECRET: Aliyun OSS access key secret.
- OSS_ENDPOINT: Aliyun OSS endpoint.
- S3_ACCESS_KEY_ID: S3 access key ID.
- S3_ALLOW_ANONYMOUS: Option to skip signing requests (e.g. for public buckets/folders).
- S3_ASSUME_ROLE_ARN: If set, all AWS clients will assume a role of the given ARN, instead of using the default credential chain.
- S3_ASSUME_ROLE_EXTERNAL_ID: Optional external ID used to assume an IAM role.
- S3_ASSUME_ROLE_SESSION_NAME: Optional session name used to assume an IAM role.
- S3_DISABLE_CONFIG_LOAD: Option to skip loading configuration from config file and the env.
- S3_DISABLE_EC2_METADATA: Option to skip loading the credential from EC2 metadata (typically used in conjunction with S3_ALLOW_ANONYMOUS).
- S3_ENDPOINT: S3 endpoint URL.
- S3_PATH_STYLE_ACCESS: S3 Path Style Access.
- S3_REGION: S3 region.
- S3_SECRET_ACCESS_KEY: S3 secret access key.
- S3_SESSION_TOKEN: S3 session token (required when using temporary credentials).
- S3_SSE_KEY: S3 Server Side Encryption Key. If the S3 encryption type is kms, the input is a KMS Key ID; if this property is not set, the default key “aws/s3” is used. If the encryption type is custom, the input is a custom base-64 AES256 symmetric key.
- S3_SSE_MD5: S3 Server Side Encryption MD5.
- S3_SSE_TYPE: S3 Server Side Encryption Type.
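The constants above are property keys, passed as key/value pairs in the same way as S3_REGION in the build examples. The precedence rule stated for CLIENT_REGION can be sketched with a plain property map. This is a minimal illustration using only the standard library: `resolve_region` is a hypothetical helper, not a function of the crate, and the string values given to the two constants here are assumptions that should be checked against the crate itself.

```rust
use std::collections::HashMap;

// Assumed string values of the property keys (not confirmed by this page).
const S3_REGION: &str = "s3.region";
const CLIENT_REGION: &str = "client.region";

/// Hypothetical helper: pick the region for the S3 client.
/// CLIENT_REGION takes precedence; S3_REGION is the fallback.
fn resolve_region(props: &HashMap<String, String>) -> Option<&str> {
    props
        .get(CLIENT_REGION)
        .or_else(|| props.get(S3_REGION))
        .map(String::as_str)
}

fn main() {
    let mut props: HashMap<String, String> = HashMap::new();

    // Neither key set: no region can be resolved.
    assert_eq!(resolve_region(&props), None);

    // Only S3_REGION set: it is used.
    props.insert(S3_REGION.to_string(), "us-east-1".to_string());
    assert_eq!(resolve_region(&props), Some("us-east-1"));

    // Both set: CLIENT_REGION wins.
    props.insert(CLIENT_REGION.to_string(), "eu-west-1".to_string());
    assert_eq!(resolve_region(&props), Some("eu-west-1"));
}
```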
Traits§
- FileRead: Trait for reading file.
- FileWrite: Trait for writing file.
- Storage: Trait for storage operations in Iceberg.
- StorageFactory: Factory for creating Storage instances from configuration.