What You’ll Need
- Python 3.9+ installed
- Hugging Face datasets Python package
- s3fs Python package
- boto3 AWS SDK (for CLI interactions)
- Akave credentials (get yours at akave.com)
- Access to the Akave O3 S3-compatible endpoint
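Before starting, you can sanity-check the prerequisites from Python itself. This is a minimal stdlib-only sketch; it only checks that the packages listed above are importable:

```python
import sys
import importlib.util

# The tutorial assumes Python 3.9 or newer
assert sys.version_info >= (3, 9), "Python 3.9+ is required"

# Report which of the required packages are importable
for pkg in ("datasets", "s3fs", "boto3"):
    found = importlib.util.find_spec(pkg) is not None
    print(f"{pkg}: {'installed' if found else 'missing (pip install ' + pkg + ')'}")
```

Any package reported missing can be installed with pip before continuing.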
Step 1: Choose Your Dataset from Hugging Face
Go to huggingface.co/datasets and find a dataset you’d like to use.
For this demo, we’ll use the standard IMDB dataset:
imdb
Copy the dataset name to use in your script.
Step 2: Transfer the Dataset into Akave Cloud
We’ll use the Hugging Face datasets library with s3fs to load and push the dataset into an Akave bucket.
Here's a simplified Python script example:
from datasets import load_dataset
import s3fs
import os
# Akave S3 credentials
os.environ["AWS_ACCESS_KEY_ID"] = "<your-key>"
os.environ["AWS_SECRET_ACCESS_KEY"] = "<your-secret>"
endpoint_url = "https://o3-rc2.akave.xyz"
# Load dataset from Hugging Face
dataset = load_dataset("imdb")
# Point s3fs at the Akave endpoint
fs = s3fs.S3FileSystem(client_kwargs={'endpoint_url': endpoint_url})
# Create target bucket + path
bucket_name = "huggingface-bucket"
target_path = f"{bucket_name}/imdb"
# Save it to Akave (datasets 2.8+ takes storage_options rather than the old fs= argument)
dataset.save_to_disk(f"s3://{target_path}", storage_options=fs.storage_options)
You can use aws s3 ls (with --endpoint-url pointing at Akave) to inspect the data after upload.
Step 3: Stream the Dataset Back from Akave
You can now load the dataset directly from Akave storage with the Hugging Face datasets library; no local copy is needed.
from datasets import load_from_disk
dataset = load_from_disk("s3://huggingface-bucket/imdb", storage_options=fs.storage_options)
print(dataset["train"][0])
This enables inference directly from cloud storage, with no egress fees.
Step 4: Use Helper Scripts for Easier Integration
Akave provides pre-written helper scripts to:
- List available buckets and S3 paths
- Initialize S3 connections
- Transfer or sync datasets
- Run basic tests to validate your connection
Available at docs.akave.xyz under the “Hugging Face Integration” section:
- huggingface_s3.py
- huggingface_test.py
These Python helpers automate repetitive setup tasks.
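For intuition, here is a minimal sketch of the kind of wrapper such scripts provide. The class name AkaveS3Helper and its methods are illustrative assumptions, not the actual contents of huggingface_s3.py:

```python
import os

class AkaveS3Helper:
    """Hypothetical helper wrapping Akave S3 connection setup.
    The real scripts at docs.akave.xyz may differ."""

    def __init__(self, endpoint_url="https://o3-rc2.akave.xyz"):
        self.endpoint_url = endpoint_url
        # s3fs/datasets-style storage options, with credentials read
        # from the environment variables set earlier in the tutorial
        self.storage_options = {
            "key": os.environ.get("AWS_ACCESS_KEY_ID"),
            "secret": os.environ.get("AWS_SECRET_ACCESS_KEY"),
            "client_kwargs": {"endpoint_url": endpoint_url},
        }

    def s3_uri(self, bucket, path):
        """Build an s3:// URI for a bucket and key prefix."""
        return f"s3://{bucket}/{path}"

helper = AkaveS3Helper()
print(helper.s3_uri("huggingface-bucket", "imdb"))  # -> s3://huggingface-bucket/imdb
```

The storage_options dict can be passed straight to save_to_disk or load_from_disk as shown in the earlier steps.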
▶️ Watch the demo: How to Use Akave Cloud with Hugging Face | Run Inference on Decentralized Datasets
FAQ
1. Can I run Hugging Face inference directly from Akave Cloud?
Yes, Akave’s S3-compatible gateway allows models and datasets to stream directly from storage without downloading locally.
2. Does this work with any Hugging Face dataset?
Yes, any dataset loaded via load_dataset() or load_from_disk() can be stored on Akave Cloud.
3. Can I use Akave with my existing S3 scripts?
Yes, Akave fully supports S3 APIs, signatures, and paths.
4. Are there egress fees?
No. Akave Cloud provides unlimited retrievals under fair use.
5. How does Akave ensure dataset integrity?
Every object is written with content-addressed integrity checks, cryptographic verifiability, and immutable audit trails.
