Navigating a New Era of AI Governance
The EU AI Act, in force since August 1, 2024, with key obligations for general-purpose AI (GPAI) models applying from August 2, 2025, sets a global standard for responsible AI development. Providers must document training data sources, ensure lawful data use, and maintain transparent audit trails to comply. Violations carry steep penalties: up to €35 million or 7% of global annual turnover for prohibited practices, and up to €15 million or 3% for breaches of other obligations, including transparency requirements. To guide compliance, the European Commission published the General-Purpose AI Code of Practice in July 2025. This voluntary framework outlines practical steps, such as publishing training data summaries, ensuring copyright compliance, and reporting systemic risks.
These requirements underscore a broader goal: fostering AI integrity. Data provenance, the traceable, verifiable record of data origins and handling, is now critical for building trustworthy AI systems that meet regulatory and ethical expectations across training and inferencing pipelines.
Why Legacy Storage Falls Short
The Challenges of Traditional Storage for AI Compliance
Traditional storage systems, while effective for enterprise needs, face limitations in addressing the AI Act’s demand for verifiable provenance in complex, multi-party AI ecosystems.
- On-Premises Systems (e.g., Pure Storage, VAST Data, Dell): Solutions like Pure Storage’s SafeMode, Dell PowerScale’s SmartLock, and VAST Data’s compliance features provide robust immutability through Write-Once-Read-Many (WORM) and Object Lock capabilities, often certified for financial compliance (e.g., SEC 17a-4). These systems excel at securing data within a single organization, with detailed logs supporting internal audits. However, their audit trails rely on vendor-controlled systems, which may not offer the independent verification needed when data crosses boundaries between licensors, AI labs, auditors, and regulators: everyone has to trust what the vendor says happened.
- Cloud Platforms (e.g., AWS S3, Glacier): AWS S3 Object Lock and Glacier Vault Lock ensure immutability and meet regulatory retention standards. Tools like CloudTrail provide comprehensive audit logs. Yet these logs are managed by AWS, so external parties must trust the provider’s systems, which limits their suitability for multi-party scenarios where shared, independent verification is essential.
- Multi-Party AI Supply Chains: AI data flows across diverse stakeholders:
- Data licensors (e.g., content owners, publishers)
- AI labs and providers (training models or running inferencing)
- Auditors and regulators (verifying compliance and lawful sourcing)
- Enterprises (deploying models or inputting proprietary data)
In such ecosystems, traditional logs may not provide the shared trust needed. For instance, how can a licensor confirm which dataset was used in training, or an enterprise ensure its inferencing data was handled securely? While on-premises and cloud systems support internal compliance, their reliance on operator-controlled logs can complicate cross-organizational verification, a key requirement for AI Act compliance.
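To make the vendor-controlled immutability discussed above concrete, the sketch below builds the parameters for an S3 `PutObject` call that applies Object Lock in compliance mode (WORM: the object cannot be overwritten or deleted before the retention date). The bucket and key names are hypothetical; the request is only constructed, not sent.

```python
from datetime import datetime, timezone

def build_worm_put(bucket: str, key: str, body: bytes, retain_until: datetime) -> dict:
    """Build keyword arguments for an S3 PutObject call that applies
    Object Lock in COMPLIANCE mode, so no user can shorten the
    retention period or delete the object before retain_until."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ObjectLockMode": "COMPLIANCE",
        "ObjectLockRetainUntilDate": retain_until,
    }

# Hypothetical bucket and object names, for illustration only.
params = build_worm_put(
    bucket="training-data-archive",
    key="datasets/corpus-v1.parquet",
    body=b"...",
    retain_until=datetime(2030, 1, 1, tzinfo=timezone.utc),
)
# An actual upload would be: boto3.client("s3").put_object(**params)
```

Note that even with this lock in place, the proof that the object was never touched lives in AWS-operated logs, which is exactly the trust boundary described above.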
Akave Cloud: A New Storage Standard for Data Provenance
Akave Cloud addresses these challenges with a decentralized, cryptographic approach to data provenance, designed to align with the EU AI Act’s transparency and auditability requirements. Its platform features:
- Encoded CID (eCID): Akave’s upgraded version of the traditional unique content identifier hash (CID), designed for enterprise use cases. An eCID is a unique cryptographic representation of the encrypted and encoded dataset that embeds extra metadata and verifiable proofs directly into the identifier itself, making the data more trustworthy and auditable.
- Proof of Data Possession (PDP): Cryptographic protocols enable storage providers to prove they hold specific data without retransmitting it, ensuring integrity and availability.
- On-Chain Attestations: Every file transaction (write, read, delete, or policy update) is recorded on an immutable blockchain ledger, creating a verifiable chain of custody.
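The eCID idea, binding metadata into the identifier itself, can be illustrated with a minimal sketch. This is not Akave’s actual eCID format; it is a toy identifier that hashes the encoded payload together with canonical metadata, so that changing either the data or its metadata yields a different identifier.

```python
import hashlib
import json

def make_ecid_like_id(encoded_data: bytes, metadata: dict) -> str:
    """Toy content identifier binding a dataset's bytes to its metadata.
    Hash the payload, then hash that digest together with canonically
    serialized metadata; any change to either input changes the result."""
    payload_digest = hashlib.sha256(encoded_data).hexdigest()
    canonical_meta = json.dumps(metadata, sort_keys=True)
    combined = hashlib.sha256((payload_digest + canonical_meta).encode()).hexdigest()
    return f"ecid-sketch:{combined}"

# Same bytes, different license metadata -> different identifiers.
id1 = make_ecid_like_id(b"dataset-bytes", {"license": "CC-BY-4.0", "source": "publisher-x"})
id2 = make_ecid_like_id(b"dataset-bytes", {"license": "proprietary", "source": "publisher-x"})
assert id1 != id2
```

A verifier holding the dataset and its claimed metadata can recompute the identifier and detect any mismatch without trusting the storage operator’s logs.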
These capabilities allow stakeholders—licensors, auditors, regulators, or enterprises—to independently verify data provenance without relying on a single vendor’s logs. For example, a regulator can confirm a dataset’s lawful sourcing, or an enterprise can validate secure handling of inferencing data, all through cryptographic evidence accessible via Akave Cloud’s immutable ledger.
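The Proof of Data Possession idea can be sketched as a challenge-response exchange. Real PDP schemes use homomorphic tags so the verifier does not need the data itself; this simplified toy assumes the verifier keeps a reference copy, but it shows why a fresh random challenge forces the prover to actually hold the full data.

```python
import hashlib
import secrets

DATA = b"encoded dataset held by the storage provider"

def prove_possession(data: bytes, nonce: bytes) -> str:
    # Prover: hash the fresh challenge nonce together with the data.
    # Without the data, the response cannot be precomputed or replayed.
    return hashlib.sha256(nonce + data).hexdigest()

def verify_possession(reference_data: bytes, nonce: bytes, response: str) -> bool:
    # Verifier: recompute the expected response and compare.
    return response == hashlib.sha256(nonce + reference_data).hexdigest()

nonce = secrets.token_bytes(16)       # fresh random challenge
resp = prove_possession(DATA, nonce)  # provider's proof
assert verify_possession(DATA, nonce, resp)
assert not verify_possession(b"tampered data", nonce, resp)
```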
Akave Cloud extends provenance across the AI lifecycle, from training and fine-tuning to inferencing, ensuring compliance with the AI Act’s requirements for data transparency and copyright adherence.
By embedding provenance into the data architecture, Akave supports not only regulatory compliance but also the trust and integrity essential for responsible AI.
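The chain-of-custody idea behind on-chain attestations can be sketched as a hash-linked log: each entry commits to the previous entry’s hash, so silently rewriting history breaks the chain. This is a minimal in-memory sketch, not Akave’s ledger implementation.

```python
import hashlib
import json

def append_event(chain: list, event: dict) -> dict:
    """Append an event (write/read/delete/policy update) linked to the
    previous entry's hash, forming a tamper-evident chain of custody."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    entry = {"prev": prev_hash, "event": event,
             "hash": hashlib.sha256(body.encode()).hexdigest()}
    chain.append(entry)
    return entry

def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited entry or broken link fails."""
    prev = "0" * 64
    for entry in chain:
        body = json.dumps({"prev": prev, "event": entry["event"]}, sort_keys=True)
        if entry["prev"] != prev or entry["hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
        prev = entry["hash"]
    return True

log = []
append_event(log, {"op": "write", "object": "corpus-v1", "actor": "licensor-a"})
append_event(log, {"op": "read", "object": "corpus-v1", "actor": "ai-lab-b"})
assert verify_chain(log)
log[0]["event"]["actor"] = "someone-else"   # tamper with history
assert not verify_chain(log)
```

Publishing such a chain on a shared ledger is what lets licensors, auditors, and regulators verify the same history independently, rather than each trusting one operator’s private logs.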
How Does Akave Cloud Compare to On-Premises and Cloud Vendors?
Why Akave’s Approach Matters
- Compliance by Design: Akave’s cryptographic ledger simplifies AI Act compliance, enabling instant, auditable data summaries without manual collation or mutable logs.
- Cross-Organizational Trust: Independent verification supports licensors, regulators, and enterprises in complex AI supply chains.
- AI Integrity: Transparent, tamper-proof provenance ensures models are built and operated on trustworthy data, aligning with the AI Act’s goals.
- Scalable Innovation: Akave’s decentralized architecture is designed to adapt to evolving regulatory and industry needs, from data sovereignty to audit readiness.
Looking Ahead: The Future of AI Governance
The EU AI Act is a starting point for global AI regulation. As governance frameworks evolve, trends suggest:
- Regulators may favor verifiable, cryptographic systems to ensure transparency.
- Enterprises may prioritize provenance guarantees for data licensing and inferencing.
- Data sovereignty could become a file-level requirement in regions like the EU.
Akave’s platform is built to anticipate these shifts, offering a scalable solution that embeds trust and compliance into AI infrastructure. While traditional systems remain effective for enterprise needs, Akave’s cryptographic provenance provides a forward-looking approach for multi-party ecosystems.
Closing Thought
The EU AI Act marks a turning point for AI accountability. Trust in AI depends on verifiable data trails, not just vendor assurances. On-premises systems deliver enterprise reliability, cloud platforms offer scale, but Akave provides a new foundation: cryptographic data provenance that ensures compliance and integrity across the AI lifecycle. As regulatory demands grow, Akave Cloud positions itself as the leading S3-compatible storage for AI workloads, combining cryptographic proofs with unique cost savings vs. traditional clouds.
Connect with Us
Akave Cloud is an enterprise-grade, distributed, and scalable object storage platform designed for large-scale datasets in AI, analytics, and enterprise pipelines. It offers S3 object compatibility, cryptographic verifiability, immutable audit trails, and SDKs for AI agents, all with zero egress fees and no vendor lock-in, saving up to 80% on storage costs versus hyperscalers.
Akave Cloud works with a wide ecosystem of partners operating hundreds of petabytes of capacity, enabling deployments across multiple countries and powering sovereign data infrastructure. The stack is also pre-qualified with key enterprise applications such as Snowflake.
References
- European Commission – EU rules for General Purpose AI models start to apply (Digital Strategy, Aug 2025): Link
- European Commission – General Purpose AI Code of Practice (Policy overview): Link
- PubAffairs Bruxelles – General Purpose AI Code of Practice now available (Jul 2025): Link
- Wikipedia – European Artificial Intelligence Office (background on enforcement): Link
- AWS – Amazon S3 Object Lock overview (SEC 17a-4 compliance)
- Dell – PowerScale SmartLock documentation (Enterprise vs Compliance mode)
- Pure Storage – SafeMode snapshots and Cohasset assessment
- VAST Data – Cohasset assessment for WORM and financial compliance