
13 posts tagged with "Company"


Don't Split My Data: I Will Use a Database (Not PostgreSQL) for My Data Needs

· 17 min read
EloqData Core Team

The internet (or at least the IT community) had a field day when a couple of blog posts claimed you could replace Redis and Kafka with PostgreSQL. "Redis is fast, I'll cache in Postgres" and "Kafka is fast -- I'll use Postgres" got plenty of attention on Hacker News here and here, and on Reddit here and here. Unsurprisingly, some of the claims got roasted on HN and Reddit for suggesting you could replace Redis or Kafka with PostgreSQL. Many people (correctly) pointed out that the benchmarks were far from properly configured and the workloads were atypical. Some Kafka advocates also posted long articles to clarify what Kafka is designed for and why it is not hard to use. But, on the flip side, many of the posts also (correctly) made a valid point: keeping fewer moving parts matters, and using the right tool for the job matters even more.


Those "Postgres can replace Redis/Kafka" posts benchmarked tiny workloads where serious tools simply aren't needed and then stretched that into a sweeping "you don't need Redis or Kafka" narrative. It's like writing an article on how much cargo a U-Haul can carry, and concluding that you'll just use your convertible for grocery runs. Nobody would post that on a trucking forum expecting a heated debate. So why does the same flavored reasoning blow up every time someone writes it about databases?

Sure, part of the drama is emotional. Developers get attached to their tools. Kafka, Redis, and Postgres all have cult-like followings, and any praise or criticism feels personal. But there's more to it than fandom. These debates touch on real technical trade-offs: complexity, durability, scaling, and the cost of managing multiple systems. In this article, we'll look at why it actually makes sense to use a real database for workloads that used to require specialized systems, and then tackle the elephant in the room: why this shift hasn't happened before and what's needed to make it possible.

The Advantage of Combining Workloads on a Real Database

Back in the pre-internet days, "database" basically meant Oracle or Sybase: big relational systems that stored all of a company's data and powered every enterprise app. Then came the internet era, and data exploded in both volume and variety. The once-unified database stack shattered into a zoo of specialized systems: PostgreSQL and MySQL for transactional truth, Kafka for streams, Redis for caching, and Spark, Snowflake, or ClickHouse for analytics. We also have specialized tools for graphs, for vectors, for time-series, for tensors, and even just for account balances. Each tool solved a particular problem, but together, they created a monster of complexity. IT teams now spend countless hours wiring, tuning, and babysitting these silos just to keep data moving.

Lately, there's been a growing movement to consolidate around a single database (often PostgreSQL) as the backbone of data infrastructure. The obvious motivation is simplicity and cost: running one system is easier (and cheaper) than orchestrating a zoo of databases stitched together with duct tape and ETL jobs. But beyond operational convenience, there are deeper reasons this trend deserves attention. A converged data architecture unlocks advantages that go far beyond saving money, advantages no systems architect can afford to ignore.

1. Avoid Cascading Failures

Every engineer who's operated a large-scale system knows this pain: one small failure snowballs into a full-blown outage. This phenomenon is called cascading failure. When you glue together multiple systems, say Kafka feeding MySQL with Redis caching on top, each layer becomes a potential failure point. A hiccup in Kafka can stall writes, which can then cause cache invalidations to fail, which can then cause a thundering herd of retries.

Example: Twitter's Cache Incident (via Dan Luu). In 2013, Twitter experienced an outage in which a minor latency issue in the caching layer, caused by an interrupt-affinity misconfiguration, led to a GC spiral on the tweet service. As cache latency grew, more requests bypassed the cache, overwhelming downstream services and eventually collapsing success rates to near 0% in one datacenter.

The database itself wasn't the root cause, but the incident shows how fragile multi-tiered data stacks can be: when layers depend on each other for throughput, even small delays can cascade into total failure.

This is exactly the kind of chain reaction that converged database architectures can avoid by keeping durability, caching, and data access under one coordinated system instead of scattered layers of dependency. In a converged architecture, where all data flows through a single database, node failures degrade capacity proportionally rather than triggering a domino effect. The system scales down gracefully instead of collapsing catastrophically.

2. Avoid Development Complexity

Maintaining two code paths for the same data is a tax on every engineering team. A typical "read-through" cache setup means developers have to write one logic path for MySQL (cache miss) and another for Redis (cache hit). Over time, these two paths drift apart: schema changes, data formats, and business rules slowly go out of sync. Debugging that kind of inconsistency is pure misery. A single database eliminates that duality. Your application talks to one API, with one transaction model, and one source of truth. Simpler code, fewer bugs, faster iteration.

Here is example code for reading user information. With a Redis cache involved, the code looks like this:

import json

# Assumes the redis and postgres clients, their exception classes, and the
# logger are configured elsewhere in the application.

def get_user(user_id):
    # Try cache first
    try:
        user = redis.get(f"user:{user_id}")
        if user:
            return json.loads(user)
    except RedisTimeout:
        logger.warning("Redis timeout, falling back to DB")
    except RedisConnectionError:
        logger.error("Redis down, falling back to DB")

    # Cache miss or Redis failed, try the database
    try:
        with postgres.cursor() as cur:
            cur.execute("SELECT * FROM users WHERE id = %s", (user_id,))
            user = cur.fetchone()

        # Try to update the cache (but don't fail if Redis is down)
        try:
            redis.set(f"user:{user_id}", json.dumps(user), ex=3600)
        except Exception:
            pass  # "It's fine, we'll cache it next time"

        return user
    except PostgresConnectionError:
        # Both Redis and Postgres down? Try the read replica
        with postgres_replica.cursor() as cur:
            cur.execute("SELECT * FROM users WHERE id = %s", (user_id,))
            return cur.fetchone()
            # ... more error handling ...
47 lines of code to read one user. And this doesn't even handle cache invalidation, which was another 200 lines of scattered redis.delete() calls that everyone was afraid to touch.

With everything in one database, the code becomes trivial:

def get_user(user_id):
    with db.cursor() as cur:
        cur.execute("SELECT * FROM users WHERE id = %s", (user_id,))
        return cur.fetchone()

def update_user(user_id, data):
    with db.cursor() as cur:
        cur.execute("UPDATE users SET ... WHERE id = %s", (..., user_id))

3. Avoid Consistency and Durability Confusion

Distributed systems are already hard to reason about. Add multiple data tiers like Redis, Kafka, and a transactional database, and you multiply the number of durability and consistency scenarios. Each component has its own guarantees: Kafka offers a choice of at-least-once, at-most-once, or exactly-once delivery, and whether you need to flush every write to achieve durability is still debated. Redis is eventually consistent, meaning data can be stale, and provides replication, append-only file (AOF) logging, and snapshots as persistence mechanisms. Your SQL database is usually transactionally consistent with full ACID guarantees, but only within its own boundary. Stitch them together, and you're left trying to reason about correctness in a house of cards.
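To make the mismatch concrete, here is a rough sketch of the durability and consistency knobs you end up juggling across the three systems. The option names below follow the Kafka, Redis, and PostgreSQL documentation, but exact spellings vary by client library, so treat this as illustrative rather than a drop-in configuration:

# Illustrative only: each system has its own vocabulary for "don't lose my data".

kafka_producer_opts = {
    "acks": "all",               # wait for all in-sync replicas ("1" or "0" are weaker)
    "enable.idempotence": True,  # required (plus transactions) for exactly-once semantics
}

redis_persistence_opts = {
    "appendonly": "yes",         # AOF logging; the fsync policy is decided separately
    "appendfsync": "everysec",   # "always" is safer but much slower
    "save": "900 1",             # RDB snapshot schedule
}

postgres_session_opts = {
    "synchronous_commit": "on",                         # flush WAL before acknowledging commit
    "default_transaction_isolation": "read committed",  # or "repeatable read" / "serializable"
}

# Three systems, three vocabularies, three failure models -- and the application
# is left to reason about how they compose.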

Example: Stale Writes and Inconsistent Cache at Scale (via Dan Luu). Dan Luu's review of real-world cache incidents includes multiple cases where caches went out of sync with the source of truth. In one Twitter incident, stale data persisted in the cache after the underlying value had changed, leading users to see outdated timelines. In another, duplicate cache entries caused some users to appear in multiple shards simultaneously: an impossible state from the database's perspective. The fixes required strict cache-invalidation ordering and retry logic to prevent stale data from overwriting newer updates.

None of these were database bugs; they were integration bugs, symptoms of managing consistency across disconnected systems.

A single, converged database architecture eliminates this entire class of problems. There's no cache to drift out of sync, no queue to replay twice, no data race between "source of truth" and "derived truth." Everything from reads to writes to streaming happens within one transaction boundary and one consistency model. You don't just simplify the system; you make correctness provable instead of probabilistic.
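As a minimal sketch of what a single transaction boundary buys you (the db handle, the transaction() helper, and the table names are illustrative, not any particular product's API), the row update and the derived "stream" record commit or roll back together:

def change_user_email(user_id, new_email):
    # One transaction: the update and the change event either both commit
    # or both roll back. No cache to invalidate, no queue to keep in sync.
    with db.transaction() as tx:
        tx.execute("UPDATE users SET email = %s WHERE id = %s", (new_email, user_id))
        tx.execute(
            "INSERT INTO user_events (user_id, kind, payload) VALUES (%s, %s, %s)",
            (user_id, "email_changed", new_email),
        )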

What About Scalability and Performance?

If converging everything into a single database simplifies architecture so dramatically, the obvious question is: why hasn't everyone already done it? The short answer: performance and scalability.

One Size Fits All?

Specialized systems like Redis and Kafka were born because traditional databases couldn't keep up. Caches were faster, queues were more scalable, and analytics engines could crunch far more data than your average OLTP database. The trade-off was fragmentation and complexity, but at least things stayed fast. For decades, this was a necessary compromise. In the mid-2000s, the godfather of databases, Michael Stonebraker, famously declared that "one size does not fit all" in database systems. Even as late as 2018, Amazon CTO Werner Vogels declared that "a one-size-fits-all database doesn't fit anyone."

But is that still true? Clearly there is demand for a one-size-fits-all database, as demonstrated by the blog posts mentioned at the beginning of this article. Indeed, only recently have database architectures evolved enough to make convergence a realistic alternative. In this section, we discuss the capabilities a database must have to enable such convergence.

1. Scalable Transactions

For years, transactions were the deal-breaker for scaling databases. Everyone wanted the simplicity of ACID semantics, but once workloads outgrew a single machine, something had to give. The conventional wisdom was: you can have efficiency, transactions, or scale, but not all three. That's how we ended up with a zoo of specialized systems. NoSQL databases like MongoDB and Cassandra threw away multi-record atomicity in exchange for horizontal scale. Developers got used to compensating in application code: implementing retries, deduplication, or manual rollback logic. It worked, but it was painful and brittle. For relational systems, manual sharding became the necessary evil: once you hit the single-node ceiling, you split your data, and your sanity along with it.

As workloads grew, people learned that sharding relational databases by hand was a nightmare. Every cross-shard join, every transaction that touched more than one key, became a distributed systems problem. You could scale reads, but writes were a different story. And once data started to overlap across shards, consistency slipped through your fingers.

That's why the new generation of distributed transactional databases, often grouped under the "NewSQL" label, was such a breakthrough. Systems like Google Spanner, TiDB, and CockroachDB showed that you could have global scale and serializable transactions, thanks to better consensus protocols, hybrid logical clocks, and deterministic commit algorithms. Even previously non-ACID systems have evolved: MongoDB added multi-document transactions in 4.0, and Aerospike introduced distributed ACID transactions in 8.0. The takeaway is that distributed transactions are already a standard feature in many systems.
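For example, a MongoDB multi-document transaction with PyMongo looks roughly like the sketch below (it requires a replica-set deployment; the database and collection names are made up for illustration):

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")

with client.start_session() as session:
    with session.start_transaction():
        # Both writes commit atomically, or neither does.
        client.shop.orders.insert_one(
            {"item": "widget", "qty": 1}, session=session
        )
        client.shop.inventory.update_one(
            {"item": "widget"}, {"$inc": {"qty": -1}}, session=session
        )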

This changes the calculus completely. When your database can handle distributed transactions efficiently, you no longer need to glue together different engines just to scale. You get scalability and correctness in one system that can become a foundation for building truly unified data architectures.

Yet most existing distributed transactional databases are still less efficient than their single-node counterparts. This is an issue we need to address (see below).

2. Independent Resource Scaling

Different workloads stress different parts of a system. Streaming ingestion (like Kafka) is IO-bound and needs massive sequential write throughput. Caching workloads (like Redis) are memory-bound, thriving on low-latency access to hot data. Analytical queries and vector search, on the other hand, are CPU-bound, demanding large compute bursts. Traditional shared-nothing databases, where compute, storage, and memory scale together, force you to over-provision one resource just to satisfy another. You end up paying for RAM you don't use or SSDs that sit mostly idle.

That's one of the reasons why people historically broke their pipelines into multiple specialized systems. You could put Kafka on write-optimized disks, Redis on high-RAM nodes, and MySQL on balanced hardware.

Now, cloud infrastructure is changing that equation. Modern cloud platforms allow elastic and independent scaling of compute, storage, and memory. Databases are starting to embrace this model directly. The separation of compute and storage, first popularized by cloud data warehouses like Snowflake and BigQuery, is now making its way into OLTP systems as well. This decoupling lets a single database scale ingest, query, and cache layers independently.

A truly converged database architecture must exploit this flexibility: scale write nodes when ingestion spikes, scale compute when CPU demand surges, and expand storage as data volume grows.

3. Performance and the Zero-Overhead Principle

C++ developers have a mantra: the Zero-Overhead Principle

  1. You don't pay for what you don't use.
  2. What you do use is just as efficient as what you could reasonably write by hand.

That's exactly the mindset a converged database must adopt. Existing databases rarely meet this bar. Even if you just want to run an in-memory cache, most engines still insist on running full durability machinery: write-ahead logging, background checkpoints, transaction journals, and page eviction logic. The result? You're paying CPU and latency overhead for guarantees you may not need on a given workload.

This is why Redis can outperform MySQL on pure in-memory reads even when both are running on the same hardware and the dataset is identical. The overhead isn't only in the query parser; it's in the architectural layers designed for persistence, recovery, and buffer management that never get out of the way.

A truly unified database must treat these mechanisms as pluggable, not mandatory. Durable writes, replication, and recovery logging should be modular: enabled only when needed, bypassed when not. The same engine should be able to act as a blazing-fast cache, a strongly consistent OLTP store, or a long-term analytical system, without paying unnecessary tax in the fast path.
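To make the idea tangible, here is a purely hypothetical configuration sketch (not the API of any existing engine, ours included) showing what per-workload durability policies could look like:

# Hypothetical per-table policies -- illustrative only.
table_policies = {
    # Cache-like data: skip logging and checkpoints entirely, pay nothing for durability.
    "session_cache": {"write_ahead_log": False, "replication": "async", "checkpoint": None},
    # Core OLTP data: full durability and synchronous replication.
    "orders":        {"write_ahead_log": True,  "replication": "sync",  "checkpoint": "5min"},
    # Event/log data: durable, but batched and latency-relaxed.
    "click_events":  {"write_ahead_log": True,  "replication": "async", "checkpoint": "batched"},
}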

In other words, unification can not come at the cost of overhead. It has to come from rethinking the core execution and storage engine so that the same system can be both flexible and ruthlessly efficient, depending on how it's used.

PostgreSQL May Not Be the Answer

If "just use Postgres for everything" sounds too easy, that's because it is. PostgreSQL deserves its reputation: it's battle-tested, reliable, and absurdly extensible. But it was never designed to be the all-in-one database the modern world needs. Its architecture still reflects the assumptions of the 1990s: local disks, tightly coupled storage and compute, and a single-threaded execution model wrapped in process-based concurrency. The query pipeline, the storage system, the concurrency control mechanism are all designed with a single-node, row-based MVCC ACID relational database in mind.

You can retrofit Postgres with external layers: Citus for sharding, Neon for decoupled storage, TimescaleDB (now TigerData) and pgvector for time-series and vector data. But this is often suboptimal, and it frequently just recreates the complexity we were trying to eliminate (the extensions do not compose). When serious modification is needed, such as making Postgres fully multi-writer capable (as in CockroachDB and YugabyteDB) or streaming capable (as in Materialize and RisingWave), a major rewrite is usually required, and often only the wire protocol can be reused. Extensions like logical replication, foreign data wrappers, and background workers are impressive engineering feats, but they're patches on a monolith, not blueprints for a unified data platform.

When people benchmark Postgres against Redis or Kafka, they're really comparing apples to power tools. Postgres can simulate a cache or a queue, but it isn't optimized for them. WAL writes still happen, visibility maps still update, vacuum still runs. The performance gaps show up not because Postgres is "slow," but because it's doing far more work than those specialized systems ever attempt. Its design trades raw throughput for correctness and compatibility, a perfectly valid choice, just not one that scales linearly to every use case.

Recently, another school of thought has argued that as hardware keeps improving, scalability and efficiency become less relevant. If a single server can pack terabytes of RAM and hundreds of cores, why bother with distributed systems at all? Why not just keep everything in one big Postgres instance and call it a day? We fundamentally disagree with this line of thinking. Hardware growth helps, but data growth is exponential, and user expectations for availability, latency, and global presence grow even faster. Scaling vertically might postpone the problem, but it doesn't solve it. This is especially true in the AI age, when data is the lifeline of every application. We'll dedicate another article to discussing this in detail.

PostgreSQL has earned its place as the default database of our time, but it's not the endgame. There is opportunity in databases that internalize the lessons of distributed systems, cloud elasticity, and modular design.

The Road Ahead: Toward a True Data Substrate

The industry's obsession with patching and stitching different data systems together is a legacy of the past twenty years of hardware and software limits. But those limits are fading. The next generation of data infrastructure won't be defined by whether it speaks SQL or supports transactions. It will be defined by whether it can unify the data lifecycle without compromise: streaming, caching, analytics, and transactions under a single, coherent architecture.

That's exactly what we're building with EloqData's Data Substrate. Instead of bolting more features onto yesterday's database engines, we started from a clean slate: modular storage and compute layers, fully ACID-compliant distributed transactions, object storage as a first-class persistence medium, and elastic scaling across workloads. The same engine that serves as a durable operational database can act as a high-throughput cache, a streaming log, or an analytical backend without duplicating data or wiring together half a dozen systems.

This is the promise of a converged data platform: simplicity without trade-offs, scalability without fragmentation, and performance without overhead. The future of data infrastructure belongs to systems that treat data as a continuous substrate rather than a pile of disconnected silos. That's where we're headed. Of course, we still have a very long way to go, but we are working hard on this goal. Join the discussion on our Discord Channel, or visit our GitHub repo to contribute. We would be happy to hear your thoughts on the future of databases.

How NVMe and S3 Reshape Decoupling of Compute and Storage for Online Databases

· 10 min read
EloqData Core Team

Cloud native databases are designed from the ground up to embrace core cloud principles: distributed architecture, automatic scalability, high availability, and elasticity. A prominent example is Amazon Aurora, which established the prevailing paradigm for online databases by championing the decoupling of compute and storage. This architecture allows the compute layer (responsible for query and transaction processing) and the storage layer (handling data persistence) to scale independently. As a result, database users benefit from granular resource allocation, cost efficiency through pay-per-use pricing, flexibility in hardware choices, and improved resilience by isolating persistent data from ephemeral compute instances.

In this blog post, we re-examine this decoupled architecture through the lens of cloud storage mediums. We argue that this prevailing model is at a turning point, poised to be reshaped by the emerging synergy between instance-level, volatile NVMe and highly durable object storage.

Lessons from the AWS us-east-1 Outage: Why Local NVMe as Primary DB Storage Is Risky

· 5 min read
EloqData Core Team

On October 20, 2025, AWS experienced a major disruption across multiple services in the us-east-1 region. According to AWS Health Status, various compute, storage, and networking services were impacted simultaneously. For many teams running OLTP databases on instances backed by local NVMe, this was not just a downtime problem: it was a data durability nightmare.


Cloud databases must constantly balance durability, performance, and cost. In modern cloud environments, there are three main types of storage available:

| Storage Type | Durability | Latency | Cost | Persistence Across VM Crash |
| --- | --- | --- | --- | --- |
| Block Storage (EBS) | High | Medium | High | Data persists |
| Local NVMe | None | Ultra-fast | Low per IOPS | Lost on restart/crash |
| Object Storage (S3) | Very High | Slow | Lowest | Persistent |

Let’s break down the trade-offs and why recent events place a spotlight on risky architectural choices.


Option 1: Block-Level Storage (EBS) - Durable but Expensive and Slow

EBS is the default choice for reliability:

  • It survives instance failures.
  • It supports cross-AZ replication via multi-replica setups.
  • It enables quick reattachment to replacement nodes.

But the downside?

  • GP2/GP3 disks deliver modest IOPS and high latency.
  • High-performance variants like IO2 are extremely expensive when provisioned for hundreds of thousands of IOPS.
  • Scaling performance often means scaling cost linearly.

EBS gives you durability, but the performance per dollar is disappointing.


Option 2: Local NVMe - Fast but Ephemeral (and Now Proven Risky)

Instance families like i4i provide 400K+ to 1M+ IOPS from local NVMe, making them a natural fit for databases chasing performance.

That's why many database vendors recommend:

  • Use local NVMe for primary storage
  • Add cross-AZ replicas for durability

But here’s the problem: local NVMe is tied to the node lifecycle. If the node restarts, fails, gets terminated by a spot interruption, or is impacted by a region-level failure such as the recent us-east-1 outage, you lose ALL the data.

During routine failures, cross-AZ replicas often protect you. But during region-wide degradation or cascading incidents, with local NVMe there is nothing to recover: the storage is simply gone. All you can do is restore from recent backups, which often lag by days. Any writes between the last backup and the crash are lost.

In contrast, EBS volumes can always be reattached to a new node.

The AWS us-east-1 outage just validated that “local NVMe + async replication” is a high-risk strategy for mission-critical databases.


Option 3: Object Storage (S3) - Durable & Cheap, But Latency Is a Challenge

Object storage is:

  • 3x cheaper than block storage
  • Regionally and cross-region durable
  • Built to survive region-level failures
  • Practically infinite
  • A first-class citizen for modern cloud-native platforms

But the challenge remains: S3 latency is too high for OLTP if accessed synchronously.

This is why traditional OLTP engines avoid it.

So the question becomes: How do we get the cost & durability benefits of S3 without paying the latency penalty?


The Data Substrate Approach: Object Storage First, NVMe as Cache, EBS for Logs

EloqData treats object storage (e.g., S3) as the primary data store and architects the system to avoid the usual latency pitfalls:

| Layer | Role | Why |
| --- | --- | --- |
| S3 (Object Storage) | Primary data store | Ultra-durable, cheap |
| EBS (Block Storage) | Durable log storage | Small volume, low-latency writes |
| Local NVMe | High-performance cache | Accelerates reads & async flushes |

Through Data Substrate, we decouple storage from compute and split durability across three layers (see the sketch below):

  • Log: persists immediately to EBS
  • Data store: periodically checkpointed to S3 (async + batched)
  • NVMe: purely a cache layer, safe to lose at any time

This allows us to:

  • Withstand node crashes seamlessly
  • Recover fully even if local NVMe is wiped
  • Handle region-level disruption by replaying logs and checkpoints
  • Enjoy millions of IOPS from NVMe without durability risk
  • Cut storage cost by 3x+ compared to full EBS-based systems
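As a rough, self-contained illustration of the log / cache / checkpoint split described above (plain Python stand-ins for S3, EBS, and NVMe, not the actual Data Substrate interfaces):

# Minimal sketch: dicts and a list stand in for the real S3 client,
# EBS-backed log, and NVMe cache.
s3_store = {}        # stand-in for the S3 bucket (primary data store)
ebs_log = []         # stand-in for the EBS-backed write-ahead log
nvme_cache = {}      # stand-in for the local NVMe cache (safe to lose)

def write(key, value):
    ebs_log.append((key, value))   # 1. durability first: synchronous log append (EBS)
    nvme_cache[key] = value        # 2. serve hot reads from the cache (NVMe)
                                   # 3. the S3 checkpoint happens later, off the hot path

def checkpoint():
    # Periodically batch recent log records into the object store, then trim the log.
    for key, value in ebs_log:
        s3_store[key] = value
    ebs_log.clear()

def read(key):
    if key in nvme_cache:          # cache hit: NVMe-speed read
        return nvme_cache[key]
    value = s3_store.get(key)      # cold read falls back to object storage
    if value is not None:
        nvme_cache[key] = value
    return value

def recover_after_cache_loss():
    # Losing the NVMe cache loses no data: everything is in the log plus S3.
    nvme_cache.clear()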

Check out more on our products powered by Data Substrate:


The Larger Industry Trend

We are not alone in this shift. The broader ecosystem is moving object-storage-first:

| System | Use of Object Storage |
| --- | --- |
| Snowflake | OLAP on S3 |
| StreamNative Ursa | Streaming data on S3 |
| Confluent Freight Clusters | Streaming data on S3 |
| Turbopuffer | Vector & full-text search on S3 |

EloqData brings this model to OLTP with a transactional, low-latency engine powered by Data Substrate.


After the Outage: A Hard Question Every Architect Should Ask

If my database node died right now, would I lose all my data?

If you're running a primary database on local NVMe, and relying solely on async replicas, the answer might be yes.

It’s time to rethink durability assumptions in the cloud era.


Summary

| Strategy | Performance | Durability | Region Outage Risk | Cost |
| --- | --- | --- | --- | --- |
| EBS only | Medium | High | | $$$ |
| Local NVMe only | Fast | None | | $$ |
| NVMe + async replicas | Fast | Partial | High | $$ |
| Object Storage + Log + NVMe Cache (EloqData) | Fast | ✅✅✅ | ✅ | $ |

AWS us-east-1 just reminded the industry: Performance is replaceable. Lost data is not.

With the right architecture, you don’t have to choose.

  • Build fast.
  • Stay durable.
  • Be outage-proof.

That’s the future we’re building at EloqData.

Check out more on our open source databases:

Coroutines and Async Programming: The Future of Online Databases

· 8 min read
EloqData Core Team

Online databases are the backbone of interactive applications. Despite coming in many different types, online databases are all engineered for low-latency, high-throughput CRUD operations. At EloqData, we use the universal Data Substrate to build online databases for any model—from key-value and tables to JSON documents and vectors. In this post, we explore one of our core engineering practices for future online databases.

The Benefits of Data Substrate Architecture

· 14 min read
EloqData Core Team

In the previous article, we discussed some of the architectural design details of Data Substrate. In this article, we continue the discussion and elaborate on why we made these design choices and how they affect the resulting database solutions we built.

A Deeper Dive Into Data Substrate Architecture

· 18 min read
EloqData Core Team

In this article, we dive deeper into the technical foundations of Data Substrate—highlighting the key design decisions, abstractions, and architectural choices that set it apart from both classical and modern distributed databases.

Data Substrate Technology Explained

· 3 min read
EloqData Core Team

At EloqData, we've developed Data Substrate—a database architecture designed to meet the unprecedented demands of modern applications in the AI age. Unlike traditional database systems that struggle with the scale and complexity of AI workloads, Data Substrate reimagines the database as a unified, distributed computer where memory, compute, logging, and storage are fully decoupled yet globally addressable.

Building a Data Foundation for Agentic AI Applications

This series of articles explores the motivations, technical foundations, and benefits of Data Substrate, providing a comprehensive understanding of how this architecture addresses the critical challenges facing modern data infrastructure in the AI age.

Some of the topics covered are rather heavy on technical jargon and require a good understanding of database internals to appreciate. We apologize in advance.

1. Data Substrate: Motivation and Philosophy

This article introduces the core philosophy behind Data Substrate. We explore why traditional database architectures fall short in the AI era and present our vision for a new approach that treats the entire distributed system as a single, unified computer.

2. A Deeper Dive Into Data Substrate Architecture

This technical deep-dive explores the architectural foundations of Data Substrate. We examine the key design decisions, abstractions, and technical choices that set Data Substrate apart from both classical and modern distributed databases.

3. The Benefits of Data Substrate Architecture

This article examines the practical benefits and real-world implications of Data Substrate. We discuss how our design choices translate into concrete advantages for modern applications, particularly in cloud environments.

Why Data Substrate Matters

Traditional database architectures were designed for a different era—one where data volumes were smaller, workloads were more predictable, and the demands of AI applications were unimaginable. Data Substrate represents a fundamental rethinking of database design, built from the ground up for the challenges and opportunities of the AI age.

By treating the distributed system as a single, unified computer, Data Substrate eliminates many of the complexities that have traditionally made distributed databases difficult to build, operate, and reason about. This approach enables:

  • Modular architecture that enables community collaboration and avoids reinventing the (many) wheels
  • True scalability without sacrificing consistency
  • Independent resource scaling for compute, memory, logging, and storage
  • Better performance through optimized hardware utilization and innovative algorithm design
  • Cloud-native features like auto-scaling and scale-to-zero
  • Simplified development through familiar single-node programming models

Get Started with Data Substrate

Ready to explore Data Substrate in action? Our open-source implementations are available on GitHub:

  • EloqKV: A high-performance key-value store built on Data Substrate
  • EloqSQL: A MySQL-compatible distributed SQL database
  • EloqDoc: A document database for modern applications

Join our Discord community to connect with other developers and stay updated on the latest developments in Data Substrate technology.

Building a Data Foundation for Agentic AI Applications

· 7 min read
EloqData Core Team

We have recently open-sourced our three products: EloqKV, EloqSQL, and EloqDoc. These offerings reflect our commitment to addressing the evolving demands of modern data infrastructure, particularly as we enter an era dominated by powerful, autonomous AI systems.

LLM-powered Artificial Intelligence (AI) applications are driving transformative changes across industries, from healthcare to finance and beyond. We are rapidly entering the Agentic Application Age, an era where autonomous, AI-driven agents not only assist but actively make decisions, manage tasks, and optimize outcomes independently.

However, the backbone of these applications—the data infrastructure—faces immense challenges in scalability, consistency, and performance. In this post, we explore the critical limitations of current solutions and introduce EloqData’s innovative approach specifically designed to address these challenges. We also share our vision for an AI-native database, purpose-built to empower the Agentic Application Age, paving the way for smarter, more autonomous, and responsive AI applications in the future.

Why We Develop EloqDB Mainly in C++

· 8 min read
EloqData Core Team

We have recently introduced EloqKV, our distributed database product built on a cutting-edge architecture known as Data Substrate. Over the past several years, the EloqData team has worked tirelessly to develop this software, ensuring it meets the highest standards of performance and scalability. One key detail we’d like to share is that the majority of EloqKV’s codebase was written in C++.

ACID in EloqKV: Atomic Operations

· 8 min read
EloqData Core Team

In the previous blog, we discussed the durability features of EloqKV and benchmarked its write performance with the write-ahead log enabled. In this blog, we continue exploring the transaction capabilities of EloqKV and benchmark the performance of distributed atomic operations using the Redis MULTI/EXEC commands.

Introduction to Data Substrate

· 12 min read
EloqData Core Team

In this blog post, we introduce our transformative concept Data Substrate. Data Substrate abstracts core functionality in online transactional databases (OLTP) by providing a unified layer for CRUD operations. A database built on this unified layer is modular: a database module is optional, can be replaced and can scale up/out independently of other modules.