How to Use Akave Cloud with Hugging Face for S3-Compatible Inference

This guide demonstrates how to seamlessly move datasets from the Hugging Face datasets library into Akave Cloud using Akave's S3-compatible gateway, and then run inference directly from Akave.
Bart Hofkin
December 3, 2025

What You’ll Need

  • Python 3.9+ installed
  • Hugging Face datasets Python package 
  • s3fs Python package
  • boto3 (the AWS SDK for Python) or the AWS CLI, for inspecting your bucket
  • Akave credentials (get yours at akave.com)
  • Access to the Akave O3 S3-compatible endpoint
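
You can install the Python packages with pip (pip install datasets s3fs boto3) and run a quick import check before starting. This is a minimal sanity check, not part of the Akave tooling:

import datasets, s3fs, boto3

# Confirm the three packages import and report their versions
print("datasets", datasets.__version__)
print("s3fs", s3fs.__version__)
print("boto3", boto3.__version__)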

Step 1: Choose Your Dataset from Hugging Face

Go to huggingface.co/datasets and find a dataset you’d like to use.

For this demo, we’ll use the standard IMDB dataset:

imdb

Copy the dataset name to use in your script.
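
If you want to double-check the name before the full transfer, you can pull a small slice locally first. This is a quick sanity check using the datasets split-slicing syntax; it isn't required for the rest of the guide:

from datasets import load_dataset

# Load only the first three training examples to confirm the dataset name resolves
preview = load_dataset("imdb", split="train[:3]")
print(preview[0]["text"][:200])  # first 200 characters of the first review
print(preview[0]["label"])       # 0 = negative, 1 = positive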

Step 2: Transfer the Dataset into Akave Cloud

We’ll use the Hugging Face datasets library with s3fs to load and push the dataset into an Akave bucket.

Here's a simplified Python script example:

from datasets import load_dataset
import s3fs
import os

# Akave S3 credentials
os.environ["AWS_ACCESS_KEY_ID"] = "<your-key>"
os.environ["AWS_SECRET_ACCESS_KEY"] = "<your-secret>"
endpoint_url = "https://o3-rc2.akave.xyz"

# fsspec/s3fs options pointing at the Akave S3-compatible endpoint
storage_options = {"client_kwargs": {"endpoint_url": endpoint_url}}
fs = s3fs.S3FileSystem(**storage_options)

# Load dataset from Hugging Face
dataset = load_dataset("imdb")

# Target bucket + path on Akave (create the bucket if it doesn't exist yet)
bucket_name = "huggingface-bucket"
target_path = f"{bucket_name}/imdb"
if not fs.exists(bucket_name):
    fs.mkdir(bucket_name)

# Save the dataset to Akave; recent datasets releases take storage_options
# instead of the deprecated fs= argument
dataset.save_to_disk(f"s3://{target_path}", storage_options=storage_options)

After the upload, you can inspect the bucket with the AWS CLI (aws s3 ls, passing the Akave endpoint via --endpoint-url) or with boto3.
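
For example, a minimal boto3 sketch (assuming the same credentials and endpoint as above) lists the objects written under the imdb/ prefix:

import boto3

# Point the standard S3 client at the Akave endpoint; credentials are read
# from AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
s3 = boto3.client("s3", endpoint_url="https://o3-rc2.akave.xyz")

response = s3.list_objects_v2(Bucket="huggingface-bucket", Prefix="imdb/")
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])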

Step 3: Stream the Dataset Back from Akave

You can now load the dataset directly from Akave storage with the datasets library, with no need to keep a local copy.

from datasets import load_from_disk

# Stream the dataset back from Akave, reusing the storage_options defined above
dataset = load_from_disk("s3://huggingface-bucket/imdb", storage_options=storage_options)
print(dataset["train"][0])

This enables inference directly from cloud storage, with no egress fees.
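
As an illustration, here is a short inference sketch over the streamed data. It assumes the transformers package is installed; the model named below is a standard public sentiment-analysis checkpoint, not something Akave-specific:

from transformers import pipeline

# Standard Hugging Face sentiment-analysis pipeline
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

# Run inference on a handful of reviews streamed back from Akave
samples = dataset["test"].select(range(5))
for review, prediction in zip(samples["text"], classifier(samples["text"], truncation=True)):
    print(prediction["label"], round(prediction["score"], 3), review[:80])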

Step 4: Use Helper Scripts for Easier Integration

Akave provides pre-written helper scripts to:

  • List available buckets and S3 paths
  • Initialize S3 connections
  • Transfer or sync datasets
  • Run basic tests to validate your connection

Available at docs.akave.xyz under the “Hugging Face Integration” section:

  • huggingface_s3.py
  • huggingface_test.py

These Python helpers automate repetitive setup tasks; the sketch below shows the kind of connection setup they handle.
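
The sketch below is a rough illustration only, written with plain s3fs rather than the actual Akave helper scripts (init_akave_fs is a hypothetical name):

import s3fs

def init_akave_fs(endpoint_url="https://o3-rc2.akave.xyz"):
    # Hypothetical convenience wrapper, not the Akave-provided helper:
    # credentials are read from AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
    return s3fs.S3FileSystem(client_kwargs={"endpoint_url": endpoint_url})

fs = init_akave_fs()
print(fs.ls(""))                    # list buckets visible to these credentials
print(fs.ls("huggingface-bucket"))  # list paths inside the dataset bucket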

▶️ Watch the demo: How to Use Akave Cloud with Hugging Face | Run Inference on Decentralized Datasets

FAQ

1. Can I run Hugging Face inference directly from Akave Cloud?

Yes, Akave’s S3-compatible gateway allows models and datasets to stream directly from storage without downloading locally.

2. Does this work with any Hugging Face dataset?

Yes, any dataset loaded via load_dataset() can be saved to Akave Cloud with save_to_disk() and read back with load_from_disk().

3. Can I use Akave with my existing S3 scripts?

Yes, Akave fully supports standard S3 APIs, request signing, and bucket/object paths, so existing scripts typically only need Akave credentials and the Akave endpoint URL.

4. Are there egress fees?

No. Akave Cloud provides unlimited retrievals under fair use.

5. How does Akave ensure dataset integrity?

Every object is written with content-addressed integrity checks, cryptographic verifiability, and immutable audit trails.

Modern Infra. Verifiable By Design

Whether you're scaling your AI infrastructure, handling sensitive records, or modernizing your cloud stack, Akave Cloud is ready to plug in. It feels familiar, but works fundamentally better.