The Akave Cloud × Baselight integration creates a "queryable data marketplace" that monetizes siloed datasets through direct queries on source files: granular pricing for providers, efficient access for consumers, and value shifted to the point of query.
For AI teams, data vendors, and DePIN builders, the data economy is broken. High-value datasets sit idle in proprietary silos. They are expensive to store, hard to share, and difficult to monetize. Buyers waste time searching fragmented marketplaces and download terabytes just to extract a few columns.
Akave Cloud and Baselight connect storage to revenue. Static files become on-demand, streamable products. Providers set granular pricing and access rules. Buyers query the source and retrieve only what they need.
This replaces manual file transfers with a live, query-first workflow. Data stays put. Integrity is provable. Access is metered and auditable.
A New Architecture for the Data Economy
The integration replaces slow file movement and duplication with a live, queryable layer.
Akave Cloud provides the verifiable object storage backbone for the data economy. As the permanent, S3-compatible object store, it keeps every dataset secure, content-addressed, and integrity-verifiable. And because Akave Cloud charges no egress fees, customers save up to 80% on costs and can query their data as often as they want.
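Because the store speaks the S3 API, standard tooling works unchanged. Here is a rough sketch of publishing a Parquet file, assuming a placeholder endpoint, bucket, and credentials (the client-side checksum is only an extra sanity check; Akave’s own content addressing happens server-side):

```python
import hashlib
import boto3

# Placeholder endpoint and credentials for an S3-compatible Akave bucket.
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.akave.example",  # hypothetical gateway URL
    aws_access_key_id="AKAVE_ACCESS_KEY",
    aws_secret_access_key="AKAVE_SECRET_KEY",
)

path = "weather_2024.parquet"
# Whole-file read is fine for a sketch; hash in chunks for large files.
digest = hashlib.sha256(open(path, "rb").read()).hexdigest()

# Attach the checksum as object metadata so consumers can re-verify reads.
s3.upload_file(
    path, "datasets", f"weather/{path}",
    ExtraArgs={"Metadata": {"sha256": digest}},
)
print(f"published s3://datasets/weather/{path} (sha256={digest[:12]})")
```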
Baselight provides the discovery and query layer. Built on a high-performance DuckDB engine, its marketplace lists structured datasets, manages access terms, and resolves user queries in real time against source files in Akave Cloud.
When a provider lists a dataset on Baselight, the file stays where it is. Buyers query it directly and stream only relevant slices, without moving or duplicating anything. This shifts value creation to the point of query.
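To make the mechanism concrete: any S3-compatible endpoint can be wired into a DuckDB engine through its httpfs extension. Baselight’s actual engine configuration is internal, so the endpoint and credentials below are placeholders:

```python
import duckdb

con = duckdb.connect()
con.execute("INSTALL httpfs; LOAD httpfs;")  # adds remote (s3://, https://) file support

# Point DuckDB at an S3-compatible gateway; all values are placeholders.
con.execute("SET s3_endpoint='s3.akave.example';")
con.execute("SET s3_url_style='path';")
con.execute("SET s3_access_key_id='AKAVE_ACCESS_KEY';")
con.execute("SET s3_secret_access_key='AKAVE_SECRET_KEY';")
```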
How It Works: Seamless Data Flow & Query Execution
At the core of this partnership is a fully integrated data flow, ensuring structured datasets remain secure, accessible, and query-ready in real time.
- Data Ingestion & Cataloging
  - Users or crawlers add structured data to Baselight’s catalog.
  - The data is uploaded to Akave’s decentralized network and referenced by a Uniform Resource Identifier (URI).
- Automatic Updates & Versioning
  - New datasets, and any changes to existing ones, are reflected in Baselight’s catalog immediately.
- On-Demand Query Execution
  - When a query runs, Baselight resolves the source URI and streams only the Parquet columns the query touches, directly from Akave’s storage layer.
  - The result is efficient, decentralized query execution with no unnecessary data duplication (see the sketch after the quote below).

“We essentially stream-execute the query directly against the file stored in Akave, which is pretty cool.” — Alfonso de la Rocha, CTO of Baselight
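Stream-execution is possible because Parquet is a columnar format: the engine reads the file footer first, then issues range requests for only the columns and row groups a query touches. A minimal, self-contained sketch of what such a query could look like (dataset path, schema, endpoint, and credentials are all illustrative):

```python
import duckdb

con = duckdb.connect()
con.execute("INSTALL httpfs; LOAD httpfs;")
# Same placeholder endpoint and credentials as in the setup sketch above.
con.execute("SET s3_endpoint='s3.akave.example';")
con.execute("SET s3_access_key_id='AKAVE_ACCESS_KEY';")
con.execute("SET s3_secret_access_key='AKAVE_SECRET_KEY';")

# Only the three referenced columns are fetched from the remote Parquet
# file; every other column stays in storage, untouched.
rows = con.execute("""
    SELECT station_id, avg(temperature_c) AS avg_temp
    FROM read_parquet('s3://datasets/weather/weather_2024.parquet')
    WHERE month = 6
    GROUP BY station_id
""").fetchall()
```

Untouched columns never leave storage, which is what makes slice-level, metered access practical.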
A Paradigm Shift in Data Marketplaces (Before vs. After)
- Before: datasets sit idle in silos; buyers download entire files, duplicate terabytes, and pay egress just to extract a few columns.
- After: datasets stay where they are; buyers query the source, stream only the relevant slices, and every access is metered and auditable.
The Impact: From Idle Assets to Active Revenue
For AI teams and researchers. Publish proprietary training data with programmable access so others query only the slices they need without taking full custody.
For enterprises and data vendors. Turn proprietary data into active, monetizable assets with granular, stream-based pricing while retaining ownership and governance.
For Web3 and DePIN builders. Expose and monetize network data without building a custom marketplace.
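Stream-based pricing can then be as simple as metering the bytes a query actually pulls, rather than the size of the file at rest. A toy illustration with made-up numbers:

```python
# Toy illustration of stream-based metering: charge for bytes scanned,
# not for the size of the file at rest. The rate is hypothetical.
PRICE_PER_GB_STREAMED = 0.40  # USD, provider-set

def query_cost(bytes_streamed: float) -> float:
    return bytes_streamed / 1e9 * PRICE_PER_GB_STREAMED

full_file = 250e9      # a 250 GB dataset at rest
query_slice = 1.2e9    # ~1.2 GB of columns actually streamed

print(f"download-everything model: ${query_cost(full_file):.2f}")
print(f"query-the-slice model:     ${query_cost(query_slice):.2f}")
```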
The Future of Data Commerce
Traditional storage treats datasets as static files in a bucket. This architecture treats them as live, queryable products.
List once.
Store verifiably.
Stream only what matters.
Monetize at scale.
It’s live today on Baselight, built on Akave Cloud’s verifiable storage.
Connect with Us
Akave Cloud is enterprise-grade, distributed, scalable object storage designed for large-scale datasets in AI, analytics, and enterprise pipelines. It offers S3 compatibility, cryptographic verifiability, immutable audit trails, and SDKs for AI agents, all with zero egress fees and no vendor lock-in, saving up to 80% on storage costs versus hyperscalers.
Akave Cloud works with a wide ecosystem of partners operating hundreds of petabytes of capacity, enabling deployments across multiple countries and powering sovereign data infrastructure. The stack is also pre-qualified with key enterprise applications such as Snowflake.