Onchain vs Offchain Storage: A Builder's Decision Framework

We've deployed dozens of smart contracts and helped builders ship onchain products. The question that trips people up the most? Where to store their data.

Store too much onchain and you'll burn through gas fees faster than you can raise funding. Store too little and your "decentralized" app is one server failure away from breaking. Neither extreme works.

This guide shares what we've learned about storage decisions—the trade-offs that actually matter, the patterns that work in production, and the mistakes to avoid. Whether you're launching an NFT collection, building a DAO, or creating an onchain website, you'll walk away with a clear framework for deciding what goes where.

Onchain and Offchain: What's the Real Difference?

Onchain data lives on the blockchain itself. Every full node stores a copy. Once data is written, it stays in the chain's history: transparent, tamper-resistant, and accessible to anyone.

Common onchain data includes:

Token balances and ownership records
Smart contract code and state
Transaction history and event logs
Governance votes and outcomes
Content hashes that point to offchain files

Offchain data lives everywhere else. Your own servers. IPFS nodes. Arweave's permaweb. Third-party databases. The blockchain doesn't know it exists—unless you store a reference pointing to it.

Common offchain data includes:

NFT images, videos, and metadata
User profiles and settings
Front-end code and interfaces
Discussion threads and documents
Analytics and logs

Most production Web3 apps use both. The skill is knowing which data belongs where—and why.

The Six Trade-offs That Matter

Every storage decision comes down to balancing six factors. Understanding them helps you make choices you won't regret later.

1. Cost

Onchain storage is expensive by design. You're paying every validator in the network to store your data forever.

On Ethereum mainnet, writing 1KB to contract storage can cost $10-50 in gas under normal conditions, depending on the gas price and the price of ETH. Layer 2s like Base drop this to roughly $0.01-0.10 per KB. That's a massive improvement, but costs still add up for larger datasets.
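The mainnet range can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes a fresh 32-byte storage slot costs about 20,000 gas. The gas price (30 gwei) and ETH price ($2,500) are illustrative assumptions, not live values.

```python
import math

SLOT_BYTES = 32            # one EVM storage slot holds 32 bytes
GAS_PER_NEW_SLOT = 20_000  # approximate cost of writing a fresh slot


def storage_cost_usd(num_bytes: int, gas_price_gwei: float, eth_price_usd: float) -> float:
    """Estimate the USD cost of writing num_bytes into fresh storage slots."""
    slots = math.ceil(num_bytes / SLOT_BYTES)
    gas = slots * GAS_PER_NEW_SLOT
    eth = gas * gas_price_gwei * 1e-9  # 1 gwei = 1e-9 ETH
    return eth * eth_price_usd


# Hypothetical market conditions: 30 gwei gas, ETH at $2,500.
cost = storage_cost_usd(1024, gas_price_gwei=30, eth_price_usd=2_500)
print(f"1 KB on mainnet: about ${cost:.2f}")  # about $48
```

Plug in your own gas and ETH prices to see how quickly the number moves.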

Offchain is almost free by comparison:

IPFS: Free if you run your own node, or $5-20/month for pinning services
Arweave: One-time payment of roughly $0.01-0.10 per MB for permanent storage
Traditional databases: Pennies per GB

The practical rule: If you're storing media files, frequently changing data, or large datasets, keep them offchain. Reserve onchain for ownership records, financial data, and cryptographic proofs.

2. Trust

This is where onchain really shines.

Onchain data requires zero trust. Run a node or check a block explorer—you can verify everything yourself. The network's consensus guarantees integrity without relying on any company or individual.

Offchain data requires trust in whoever controls it. A centralized server can go offline, serve different data to different users, or disappear entirely. Even IPFS content vanishes if nobody pins it.

The practical rule: For financial data, ownership records, or anything where tampering causes real harm, go onchain. For content that benefits from flexibility or privacy controls, offchain with verification works well.

3. Permanence

Onchain is forever. A contract can overwrite its current state if its code allows it, but every past transaction and state change stays in the chain's history, and nothing you write can be deleted. This creates powerful guarantees for historical records, but zero margin for error.

Offchain ranges from "probably temporary" to "permanent-ish":

Centralized servers: Can change, go down, or get wiped
IPFS: Persists only while someone pins it
Arweave: Designed for permanence with economic incentives to maintain data indefinitely

The practical rule: Legal agreements, audit trails, and proof of ownership belong somewhere permanent. UI elements, user preferences, and evolving content belong somewhere flexible.

4. Privacy

Blockchains are public by default. Every transaction, every state change, every piece of data is visible to anyone who looks. Great for transparency. Terrible for privacy.

Offchain data can be private. You control access through authentication, encryption, or simply not publishing. This makes compliance with regulations like GDPR actually possible.

The practical rule: User profiles, sensitive business data, and anything you wouldn't want on a billboard goes offchain with proper access controls.

5. Speed

This one's straightforward.

Offchain is instant. Databases return queries in milliseconds. APIs respond in under a second. Users get immediate feedback.

Onchain takes time. Ethereum produces a block roughly every 12 seconds, so a first confirmation takes about that long, and more when the network is busy. Base confirms in ~2 seconds but still requires network propagation. Complex contract interactions can take longer.

The practical rule: Build your UI to feel instant using offchain data, then confirm with onchain state in the background.

6. Composability

This is onchain's secret weapon.

Any smart contract can read another contract's state, call its functions, or build on its logic. This "money legos" effect powers DeFi, makes NFTs interoperable, and enables permissionless innovation.

Offchain data needs bridges. APIs must be called. Oracles must relay information. Composability is possible but requires extra infrastructure.

The practical rule: If other developers should be able to build on your data without your permission, it needs to be onchain.

Understanding Onchain Storage Types

Not all onchain storage is equal. Different mechanisms have different costs and capabilities.

Contract Storage

This is your smart contract's persistent memory. State variables declared in Solidity occupy storage slots that persist across transactions.

contract TokenRegistry {
    mapping(address => uint256) public balances;  // Storage
    address public owner;                         // Storage
}

Storage is the most expensive option. Writing to a new slot costs around 20,000 gas. Updating an existing slot costs about 5,000 gas.

Best for: Token balances, ownership records, protocol parameters.

Event Logs

Events record contract activity without storing it in contract state. They're cheaper than storage, but no contract can read event logs, not even the contract that emitted them.

event Transfer(address indexed from, address indexed to, uint256 value);

Emitting an event costs roughly 375 gas, plus 375 per topic (the event signature and each indexed parameter count as topics) and 8 gas per byte of data. For an event like Transfer, that's about 10x cheaper than writing to a new storage slot.

Best for: Transaction history, audit trails, offchain indexing.
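To make the comparison concrete, here is a rough gas estimate for the Transfer event shown earlier. It's a sketch that ignores memory expansion and the rest of the transaction, and it assumes a new storage slot costs about 20,000 gas.

```python
LOG_BASE_GAS = 375       # flat cost of emitting a log
LOG_TOPIC_GAS = 375      # per topic
LOG_DATA_BYTE_GAS = 8    # per byte of non-indexed data
NEW_SLOT_GAS = 20_000    # approximate cost of writing one fresh storage slot


def event_gas(indexed_params: int, data_bytes: int, anonymous: bool = False) -> int:
    """Approximate gas for emitting one event (ignores memory expansion)."""
    # Non-anonymous events spend one extra topic on the event signature hash.
    topics = indexed_params + (0 if anonymous else 1)
    return LOG_BASE_GAS + LOG_TOPIC_GAS * topics + LOG_DATA_BYTE_GAS * data_bytes


# Transfer(address indexed from, address indexed to, uint256 value):
# two indexed parameters plus 32 bytes of data for `value`.
transfer_gas = event_gas(indexed_params=2, data_bytes=32)
print(transfer_gas)                           # 1756
print(round(NEW_SLOT_GAS / transfer_gas, 1))  # 11.4
```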

Calldata

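Function input parameters that a contract can read only while the transaction executes. The bytes remain in the chain's transaction history, but no contract can access them afterward. It's the cheapest way to pass data into a contract: 4 gas per zero byte, 16 gas per non-zero byte.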
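Because zero bytes are cheaper, the cost of a payload depends on its contents as well as its length. A minimal sketch of the per-byte pricing quoted above (it leaves out the transaction's base fee and any execution cost):

```python
ZERO_BYTE_GAS = 4
NONZERO_BYTE_GAS = 16


def calldata_gas(payload: bytes) -> int:
    """Gas charged for the calldata bytes alone."""
    zeros = payload.count(0)
    return zeros * ZERO_BYTE_GAS + (len(payload) - zeros) * NONZERO_BYTE_GAS


# A uint256 argument holding the value 1: 31 zero bytes and 1 non-zero byte.
arg = (1).to_bytes(32, "big")
print(calldata_gas(arg))          # 140
print(calldata_gas(b"\xff" * 32)) # 512
```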

Best for: Large verification proofs, single-transaction operations.

L1 vs L2: The Cost-Security Trade-off

Layer 2 networks like Base, Optimism, and Arbitrum process transactions off Ethereum mainnet while inheriting its security through periodic settlement.

The cost savings are substantial. A typical token transfer:

Ethereum mainnet: $1-5
Base: $0.01-0.05

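The trade-off is finality. Base transactions achieve practical finality in seconds, but L1-backed finality takes longer, on the order of 30 minutes to 2 hours, because the L2 state has to be posted to Ethereum and finalized there. Withdrawals through an optimistic rollup's canonical bridge wait longer still, typically through a challenge period of about seven days.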

For most applications, this delay is invisible to users. The pattern that works: build on Base for cost efficiency, anchor critical state roots to Ethereum when maximum security matters.

Your Offchain Options

Centralized Databases

Traditional databases are fast, cheap, and flexible. You control access, update freely, and get instant queries.

The downside is obvious: users trust you completely. Your servers can go down, serve tampered data, or get acquired by someone else. For many applications this is fine—just be honest about the trust model.

Best for: User profiles, analytics, frequently updated state, private data.

IPFS (InterPlanetary File System)

IPFS identifies files by their content hash rather than their location. Upload a file, get a Content Identifier (CID), and anyone can retrieve it from any node that has it.

The CID is derived from a cryptographic hash of the content itself. Change the file, and the CID changes. This makes IPFS content naturally tamper-evident: store the CID onchain as your proof.

The catch: files only persist while someone pins them. Stop pinning, and the content eventually disappears. Pinning services like Pinata or NFT.Storage solve this for $5-20/month.

Best for: NFT metadata, website front-ends, documents needing verifiable integrity.

Arweave

Arweave takes a different approach. Pay once upfront, and the network economically incentivizes miners to store your data forever.

The math behind it: hardware costs decline predictably over time. Arweave charges enough upfront to cover perpetual storage based on these projections.
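The intuition is a geometric series. If storing a file costs c in the first year and that cost falls by a fixed fraction r every year, the cost of storing it forever is c + c(1-r) + c(1-r)^2 + ... = c / r, which is finite. The sketch below illustrates that idea only. It is not Arweave's actual pricing formula, and the inputs are hypothetical.

```python
def perpetual_storage_cost(first_year_cost: float, annual_decline: float) -> float:
    """Sum of the infinite series c + c(1-r) + c(1-r)^2 + ... = c / r."""
    if not 0 < annual_decline <= 1:
        raise ValueError("annual_decline must be in (0, 1]")
    return first_year_cost / annual_decline


def finite_horizon_cost(first_year_cost: float, annual_decline: float, years: int) -> float:
    """The same series truncated after `years` terms, to show convergence."""
    return sum(first_year_cost * (1 - annual_decline) ** t for t in range(years))


# Hypothetical inputs: $0.001 per MB in year one, costs falling 30% a year.
print(perpetual_storage_cost(0.001, 0.30))    # about 0.00333
print(finite_horizon_cost(0.001, 0.30, 200))  # converges to the same value
```

The slower costs decline, the larger the upfront payment has to be, which is why the assumed decline rate matters so much.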

The downside: higher initial cost and slower retrieval than CDNs. Also no content updates—every version is a new permanent upload.

Best for: Historical archives, legal documents, content that must outlive organizations.

IPFS vs Arweave: Quick Comparison

Factor       IPFS                          Arweave
Cost         $5-20/month for pinning       $0.01-0.10 per MB (one-time)
Permanence   Requires active maintenance   Permanent by design
Speed        Fast with good pinning        Slower, improving
Updates      New CID per version           New transaction per version
Best for     Active projects, iteration    Archives, legal records

Most projects start with IPFS during development and early iterations. As content stabilizes, Arweave becomes more attractive for permanent storage.

Hybrid Patterns That Work

The most successful Web3 projects don't choose onchain or offchain—they combine both strategically.

Pattern 1: Content Offchain, Hash Onchain

Upload your content to IPFS or Arweave. Store only the hash onchain. Users download the content and verify it matches.

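contract VerifiedContent {
    // id => hash of the offchain content
    mapping(uint256 => bytes32) public contentHashes;

    event Published(uint256 indexed id, bytes32 hash);

    function publish(uint256 id, bytes32 hash) external {
        // Refuse to overwrite, so a published hash can't be swapped later.
        // Add access control here if only certain accounts may publish.
        require(contentHashes[id] == bytes32(0), "Already published");
        contentHashes[id] = hash;
        emit Published(id, hash);
    }
}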

This pattern works for NFT metadata, legal documents, website front-ends—any content where integrity matters but size or cost makes onchain impractical.

Pattern 2: Wallet Authentication

Replace traditional passwords with wallet signatures. Users prove identity by signing messages—the signature proves key control without revealing secrets.

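// hasAccess is a mapping(address => bool). recoverSigner is a helper (not
// shown) that wraps ecrecover and is assumed not to modify state.
function authenticate(bytes memory signature, string memory message)
    external
    view
{
    // The message should include a nonce and an expiry so that a captured
    // signature can't be replayed.
    address signer = recoverSigner(message, signature);
    require(hasAccess[signer], "Unauthorized");
}

No password databases to leak. No password resets to build, though a lost key means lost access. Phishing doesn't disappear either: users can still be tricked into signing a malicious message, so show them exactly what they're signing.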

Pattern 3: Commit-Reveal for Fairness

When participants shouldn't see each other's submissions before a deadline, use commit-reveal.

Phase 1 (commit): Users submit hashes of their data.
Phase 2 (reveal): After the deadline, users reveal actual data. Contract verifies it matches the hash.

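// commitDeadline is a timestamp set at deployment (declaration not shown).
function commit(bytes32 hash) external {
    require(block.timestamp < commitDeadline, "Commit phase over");
    commitments[msg.sender] = hash;
}

// hash must equal keccak256(abi.encode(msg.sender, data, salt)).
// The random salt stops observers from brute-forcing low-entropy data;
// binding msg.sender stops others from copying a commitment.
function reveal(string memory data, bytes32 salt) external {
    require(block.timestamp >= commitDeadline, "Reveal phase not started");
    require(keccak256(abi.encode(msg.sender, data, salt)) == commitments[msg.sender], "Hash mismatch");
    revealed[msg.sender] = data;
}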

Prevents frontrunning in auctions, gaming in prediction markets, and manipulation in governance votes.
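An offchain client has to build the commitment before it calls commit. The sketch below shows one way to do that, mixing in a random salt and the sender's address so that low-entropy data can't be brute-forced and a commitment can't be copied by someone else. It uses SHA-256 from the standard library as a stand-in for keccak256, and the field layout is an assumption, so the bytes will not match any Solidity encoding.

```python
import hashlib
import hmac
import secrets


def make_commitment(sender: str, data: str, salt: bytes) -> bytes:
    """Hash the sender, the secret data, and a random salt together."""
    # Length-prefix each field so different inputs can't collide by concatenation.
    parts = [sender.encode(), data.encode(), salt]
    payload = b"".join(len(p).to_bytes(4, "big") + p for p in parts)
    return hashlib.sha256(payload).digest()


def verify_reveal(commitment: bytes, sender: str, data: str, salt: bytes) -> bool:
    """Recompute the commitment from the revealed values and compare."""
    return hmac.compare_digest(commitment, make_commitment(sender, data, salt))


salt = secrets.token_bytes(32)  # kept secret until the reveal phase
c = make_commitment("0xAlice", "bid:150", salt)

print(verify_reveal(c, "0xAlice", "bid:150", salt))  # True
print(verify_reveal(c, "0xAlice", "bid:200", salt))  # False: data changed
print(verify_reveal(c, "0xBob", "bid:150", salt))    # False: different sender
```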

Pattern 4: L2 for Operations, L1 for Anchoring

Build your application on Base for low-cost daily operations. Periodically commit state roots to Ethereum for maximum security.

Most users interact entirely on L2 and benefit from low fees. Power users who need absolute guarantees can verify against L1 state.

Getting Offchain Data Onchain: Oracles

Sometimes offchain information needs to influence onchain logic. Price feeds, sports scores, weather data—this real-world information must cross the boundary somehow.

Oracles bridge this gap. They query offchain sources, aggregate data, and publish results where smart contracts can read them.

The challenge is trust. A compromised oracle can feed false data to your contract. Different oracles address this differently:

Centralized oracles (like Coinbase's price feed) are fast and simple but require trusting a single entity.

Decentralized oracles (like Chainlink) aggregate from multiple independent nodes, making manipulation harder but adding cost.

Optimistic oracles (like UMA) assume data is correct unless challenged, using economic incentives for dispute resolution.

When is oracle dependence acceptable?

Data is objectively verifiable (prices, timestamps)
Multiple independent oracles agree
The value at stake justifies the cost
Your app can tolerate occasional downtime

Avoid oracles when data is subjective, easily manipulated, or when trust collapses to a single failure point.
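The "multiple independent oracles agree" condition can be sketched as a median with a disagreement check. This is a simplified illustration, not how Chainlink or any other named oracle network actually aggregates. The thresholds and the sample prices are assumptions.

```python
from statistics import median
from typing import List


def aggregate_reports(reports: List[float], min_reports: int = 3, max_spread: float = 0.02) -> float:
    """Return the median report, refusing to answer if sources disagree too much."""
    if len(reports) < min_reports:
        raise ValueError("not enough independent reports")
    mid = median(reports)
    # Require a strict majority of reports within max_spread (2%) of the median.
    close = [r for r in reports if abs(r - mid) <= max_spread * abs(mid)]
    if len(close) * 2 <= len(reports):
        raise ValueError("sources disagree; refusing to publish")
    return mid


# Three honest reports and one wild outlier: the median shrugs off the outlier.
print(aggregate_reports([2501.0, 2499.5, 2500.0, 9999.0]))  # 2500.5
```

A single bad source can't move the median far, which is the property that makes aggregation worth its extra cost.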

What Never Goes Onchain

Some data should never touch a public blockchain, regardless of cost.

Personal Information

Names, addresses, phone numbers, emails, medical records. Publishing PII onchain creates permanent privacy violations. Hashing doesn't fix this: when the set of possible inputs is small, as with phone numbers or birth dates, a hash can be reversed by brute force or with rainbow tables.

Store instead: Encrypted databases with proper access controls. Use only anonymous IDs onchain if authentication is needed.
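To see why hashing alone doesn't protect small values, here is a toy brute-force search. The example value (a 4-digit PIN) is hypothetical. The same approach works on any field with a small, enumerable set of possible values.

```python
import hashlib
from typing import Iterable, Optional


def hash_value(value: str) -> str:
    return hashlib.sha256(value.encode()).hexdigest()


def brute_force(target_hash: str, candidates: Iterable[str]) -> Optional[str]:
    """Try every candidate until one hashes to the target."""
    for candidate in candidates:
        if hash_value(candidate) == target_hash:
            return candidate
    return None


# Hypothetical example: a 4-digit PIN hashed without a salt and published.
published = hash_value("4821")
all_pins = (f"{n:04d}" for n in range(10_000))
print(brute_force(published, all_pins))  # 4821
```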

Private Keys and Secrets

This seems obvious, but people make mistakes. Never store API credentials, encryption keys, or private keys onchain. The blockchain is public. Forever.

Store instead: Hardware security modules, secure key management systems, or threshold signature schemes.

Large Files

Even ignoring cost, blockchains optimize for small, frequently-read state—not gigabytes of video. The network wasn't designed for it.

Store instead: IPFS or Arweave for media, with CIDs stored onchain for verification.

Storage by Use Case

NFTs

Onchain: Token ID, owner address, metadata hash/CID
Offchain: Images, videos, full metadata JSON

The tokenURI function returns an IPFS or Arweave URL pointing to metadata:

function tokenURI(uint256 tokenId) public view returns (string memory) {
    return string(abi.encodePacked("ipfs://", tokenCID[tokenId]));
}
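Browsers and most HTTP clients can't fetch an ipfs:// URI directly, so a client usually rewrites it to a gateway URL first. A minimal sketch, assuming the path-style gateway format https://ipfs.io/ipfs/[CID]. The function name is made up for illustration.

```python
def ipfs_uri_to_gateway_url(uri: str, gateway: str = "https://ipfs.io") -> str:
    """Rewrite ipfs://<CID>[/path] to <gateway>/ipfs/<CID>[/path]."""
    prefix = "ipfs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an ipfs:// URI: {uri!r}")
    path = uri[len(prefix):]
    if not path:
        raise ValueError("missing CID")
    return f"{gateway.rstrip('/')}/ipfs/{path}"


print(ipfs_uri_to_gateway_url("ipfs://QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco"))
# https://ipfs.io/ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco
```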

DAOs

Onchain: Proposal hashes, vote tallies, execution logic
Offchain: Discussions, supporting documents, voter rationales

Governance decisions go onchain for transparency. Discussions happen offchain where they can be threaded and updated.

Tokens

Onchain: Contract, balances, transfer logic, governance rules
Offchain: Analytics dashboards, holder lists, engagement metrics

Token economics must be transparent and immutable. Analytics pull from onchain events but compute offchain for speed.

Onchain Websites

Onchain: Domain pointer (ENS), content hash reference
Offchain: HTML, CSS, JavaScript, images (on IPFS/Arweave)

Host your front-end on IPFS or Arweave. Store the CID in an ENS record. Users verify they're loading the authentic version by comparing hashes.

Verifying Content Yourself

Hybrid architectures only work if you can actually verify offchain content matches onchain references.

The verification process:

Download content from IPFS or Arweave
Compute its hash (usually keccak256 or SHA-256)
Compare with the onchain reference
If they match, content is authentic and unmodified
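The hash-and-compare steps can be sketched in a few lines. This version assumes the onchain reference is a SHA-256 digest stored as hex and takes the already-downloaded bytes as input, so the download itself is left out. Note that Python's hashlib.sha3_256 is not the same function as Ethereum's keccak256, so a keccak256 reference needs a third-party library.

```python
import hashlib
import hmac


def verify_content(content: bytes, onchain_hash_hex: str) -> bool:
    """Re-hash downloaded bytes and compare with the hash recorded onchain."""
    hex_digest = onchain_hash_hex[2:] if onchain_hash_hex.startswith("0x") else onchain_hash_hex
    expected = bytes.fromhex(hex_digest)
    actual = hashlib.sha256(content).digest()
    return hmac.compare_digest(actual, expected)


# Stand-in values: in practice `document` comes from IPFS or Arweave and
# `reference` is read from the contract.
document = b"terms of service v1"
reference = "0x" + hashlib.sha256(document).hexdigest()

print(verify_content(document, reference))                # True
print(verify_content(b"terms of service v2", reference))  # False
```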

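Some tools handle parts of this automatically. IPFS nodes check the blocks they retrieve against the CID. MetaMask can resolve ENS names to IPFS content. Block explorers show event logs with published hashes. A public HTTP gateway is a different story: you're trusting it unless you re-hash what it returns.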

Reading IPFS CIDs

CIDs look like QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco (CIDv0) or bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi (CIDv1). Retrieve content through any gateway:

https://ipfs.io/ipfs/[CID]
https://gateway.pinata.cloud/ipfs/[CID]
https://[CID].ipfs.dweb.link
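A CIDv0 is less opaque than it looks: it's a base58-encoded multihash, meaning two prefix bytes (0x12 for sha2-256, 0x20 for a 32-byte length) followed by the digest. The sketch below decodes one with a hand-rolled base58 decoder. Keep in mind that the digest is taken over IPFS's internal block encoding, so it generally won't equal a plain SHA-256 of the original file.

```python
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58decode(text: str) -> bytes:
    """Decode a base58 (Bitcoin alphabet) string to bytes."""
    num = 0
    for ch in text:
        num = num * 58 + B58_ALPHABET.index(ch)  # raises ValueError on bad characters
    body = num.to_bytes((num.bit_length() + 7) // 8, "big")
    # Each leading '1' encodes one leading zero byte.
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + body


def cidv0_digest(cid: str) -> bytes:
    """Return the 32-byte sha2-256 digest inside a CIDv0."""
    raw = b58decode(cid)
    if len(raw) != 34 or raw[:2] != b"\x12\x20":
        raise ValueError("not a CIDv0 (expected a sha2-256 multihash)")
    return raw[2:]


digest = cidv0_digest("QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco")
print(len(digest), digest.hex())
```

The 32-byte digest fits a bytes32, which is one way to store a CIDv0 reference compactly onchain. CIDv1 strings use a different encoding and are rejected by this sketch.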

Avoiding Broken Links

IPFS content disappears if nobody pins it. To ensure persistence:

Use paid pinning services (Pinata, NFT.Storage)
Run your own IPFS node for critical content
Use Filecoin for economic storage incentives
Choose Arweave when content needs to be permanent by design

Optimizing Cost and UX

Reducing Gas

Compress before storing: Bit-pack data, optimize struct layouts.
Batch operations: Group updates into single transactions.
Store hashes, not content: Use onchain pointers to offchain data.
Events over storage: If contracts don't need to read it, emit events instead.
L2 over L1: Base offers 10-100x savings with minimal security trade-offs.
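Bit-packing is easiest to see with plain integers. The sketch below packs a hypothetical record (64-bit timestamp, 128-bit amount, 8 bits of flags) into a single 256-bit word, so it would fit in one storage slot instead of three. The field widths are assumptions chosen for illustration.

```python
from typing import Tuple

TS_BITS, AMOUNT_BITS, FLAG_BITS = 64, 128, 8  # 200 bits total, under 256


def pack(timestamp: int, amount: int, flags: int) -> int:
    """Pack three fields into one integer: [timestamp | amount | flags]."""
    for value, bits in ((timestamp, TS_BITS), (amount, AMOUNT_BITS), (flags, FLAG_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"value {value} does not fit in {bits} bits")
    return (timestamp << (AMOUNT_BITS + FLAG_BITS)) | (amount << FLAG_BITS) | flags


def unpack(word: int) -> Tuple[int, int, int]:
    """Reverse of pack: mask and shift each field back out."""
    flags = word & ((1 << FLAG_BITS) - 1)
    amount = (word >> FLAG_BITS) & ((1 << AMOUNT_BITS) - 1)
    timestamp = word >> (AMOUNT_BITS + FLAG_BITS)
    return timestamp, amount, flags


word = pack(timestamp=1_700_000_000, amount=10**18, flags=0b101)
print(word < 2**256)  # True: fits in one 32-byte slot
print(unpack(word))   # (1700000000, 1000000000000000000, 5)
```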

Communicating Finality

L2 transactions confirm in seconds but may not be final for 30+ minutes. Design your UI to communicate this:

Pending: Transaction submitted
Confirmed: Included in L2 block
Final: Settled to L1 (only show if relevant)
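The three states map naturally onto a small helper that the UI can call. This is a sketch with hypothetical inputs: how you learn the L2 block number or the L1 settlement status depends on your node provider and isn't shown.

```python
from enum import Enum
from typing import Optional


class TxStatus(Enum):
    PENDING = "Pending"
    CONFIRMED = "Confirmed"
    FINAL = "Final"


def status_label(l2_block_number: Optional[int], settled_on_l1: bool, show_final: bool = False) -> str:
    """Pick the label to display for a transaction."""
    if l2_block_number is None:
        return TxStatus.PENDING.value    # submitted, not yet in an L2 block
    if settled_on_l1 and show_final:
        return TxStatus.FINAL.value      # only surfaced when the app opts in
    return TxStatus.CONFIRMED.value      # included in an L2 block


print(status_label(None, False))                     # Pending
print(status_label(1_234_567, False))                # Confirmed
print(status_label(1_234_567, True))                 # Confirmed
print(status_label(1_234_567, True, show_final=True))  # Final
```

Defaulting show_final to False keeps the extra state out of the way for apps where it isn't relevant.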

For most users, L2 confirmation is plenty. They care about seeing their action complete, not theoretical rollback windows.

Your Decision Checklist

Before storing data, ask yourself:

Must anyone be able to verify it?

Yes → Onchain or offchain with onchain hash
No → Offchain with access controls

Must it be permanent and immutable?

Yes → Onchain (critical) or Arweave (large)
No → IPFS or centralized storage

How large is it?

Under 1KB → Consider onchain
1KB-1MB → IPFS or Arweave
Over 1MB → Definitely offchain

How often does it change?

Never → Onchain or Arweave
Sometimes → IPFS with versioned CIDs
Frequently → Centralized database

Do other contracts need to read it?

Yes → Must be onchain
No → Offchain is fine

What's your trust model?

Zero trust required → Onchain
Some trust acceptable → Hybrid with verification
Trust isn't a concern → Centralized with optional proofs
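The checklist can be encoded as a small function, which is handy for documenting decisions per data type. The order in which the questions are applied below is our assumption, not a rule. Treat the output as a starting point for discussion.

```python
KB = 1024


def suggest_storage(size_bytes: int, change_frequency: str,
                    contracts_must_read: bool, must_be_permanent: bool) -> str:
    """Apply the checklist questions in one possible priority order."""
    if change_frequency not in {"never", "sometimes", "frequently"}:
        raise ValueError("change_frequency must be never, sometimes, or frequently")
    if contracts_must_read:
        return "onchain"                      # other contracts can't read offchain data
    if change_frequency == "frequently":
        return "centralized database"
    if size_bytes < KB and change_frequency == "never":
        return "onchain"                      # small and static: consider onchain
    if must_be_permanent:
        return "Arweave, with the hash onchain"
    return "IPFS, with the CID onchain"


print(suggest_storage(32, "never", True, True))              # onchain
print(suggest_storage(5_000_000, "never", False, True))      # Arweave, with the hash onchain
print(suggest_storage(200_000, "sometimes", False, False))   # IPFS, with the CID onchain
print(suggest_storage(2_000, "frequently", False, False))    # centralized database
```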

There's no universal answer to onchain vs offchain. The best Web3 products store critical data onchain, keep large or dynamic content offchain, and connect the two with cryptographic proofs.

Start with this hybrid approach. Optimize for your specific constraints. And remember that storage architecture can evolve as your project grows—just plan for upgrades from day one.

Looking to build an onchain website? Check out our guide to onchain domains or learn how crypto wallets work to understand the authentication layer.