
At Tofu, we’re building a fast‑growing B2B SaaS platform that automates bookkeeping with AI. Tofu’s platform connects to accounting systems, financial data sources, and enterprise workflows—processing large volumes of structured and semi‑structured financial data in near real time.
As our customer base and integration complexity grew, scalability, reliability, and zero‑downtime operations became mission‑critical. This post is the story of how Tofu evolved its database architecture—from a single‑node AWS RDS (PostgreSQL) instance to YugabyteDB Aeon, a fully managed, PostgreSQL‑compatible distributed SQL database that now powers production.
When Tofu launched, simplicity was the right call. AWS RDS (PostgreSQL) gave us reliability, predictable pricing, and minimal operations—ideal for an early product.
As Tofu onboarded more organizations and automated more bookkeeping workflows, the shape of the workload changed, and a single-node database became the limiting factor.
We needed a distributed, fault-tolerant, PostgreSQL-compatible database: one that could scale reads and writes without forcing application rewrites.
We looked at managed and open-source paths that would keep PostgreSQL semantics while giving us true horizontal scale. Here is how our short-list stacked up.
Scope: we compared YugabyteDB Aeon, Amazon Aurora PostgreSQL (classic), CockroachDB, and Citus (a PostgreSQL extension).
Note: AWS also offers Aurora PostgreSQL Limitless Database, a separate sharded, multi‑writer architecture; our evaluation focused on classic Aurora.
Why YugabyteDB Aeon fit Tofu’s needs
Tofu’s engineering team prioritized: PostgreSQL wire compatibility (no ORM rewrites), horizontal read/write scale, rolling maintenance with automatic failover, backups plus point-in-time recovery (PITR), private networking (VPC and PrivateLink), and a Datadog integration that plugged into our existing dashboards.
We used YugabyteDB Voyager to keep the migration controlled and low-risk, moving the schema and data over in verifiable stages.
From the application’s perspective, it was still PostgreSQL—only now distributed, resilient, and cloud‑native.
Zero‑downtime rolling maintenance
YugabyteDB Aeon performs rolling upgrades/maintenance on fault‑tolerant clusters, so the service stays available as nodes are patched one at a time. Connections to the node being updated can drop; we rely on pool/driver retries and schedule a maintenance window during off‑peak hours.
High availability and self‑healing
Data is sharded into tablets and synchronously replicated via Raft; leaders are elected automatically, and failovers happen without manual promotion.
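To make the commit rule concrete, here is a toy sketch (illustrative only, not YugabyteDB code) of the Raft majority-quorum condition: a write is durable once a majority of the replication-factor replicas have acknowledged it.

```rust
/// With replication factor `rf`, a Raft write commits once a strict
/// majority of replicas have acknowledged it. An rf=3 cluster therefore
/// tolerates one node failure without losing committed data.
fn is_committed(acks: usize, rf: usize) -> bool {
    acks > rf / 2
}

fn main() {
    assert!(is_committed(2, 3));  // leader + one follower: committed
    assert!(!is_committed(1, 3)); // leader alone: not yet committed
    // Majority quorum means an rf-node cluster survives (rf - 1) / 2 failures.
    println!("rf=3 tolerates {} failure(s)", (3 - 1) / 2);
}
```

This is also why failover needs no manual promotion: any replica holding the committed majority state can be elected leader.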
Scale out, not just up
Adding nodes increases capacity; the cluster rebalances data across nodes, so we can scale reads and writes without app rewrites (performance tracks node count on well‑distributed workloads).
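As a toy illustration of hash sharding (a deliberate simplification, not YugabyteDB's actual implementation), the sketch below maps each key into a 16-bit hash space and assigns it to one of `n` contiguous tablet ranges; rebalancing amounts to moving whole ranges between nodes as the cluster grows.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Map a key into a 16-bit hash space (0..=65535).
fn hash16<K: Hash>(key: &K) -> u16 {
    let mut h = DefaultHasher::new();
    key.hash(&mut h);
    (h.finish() & 0xFFFF) as u16
}

/// Assign a key to one of `n` tablets by splitting the hash space into
/// equal contiguous ranges (a simplification of range-based assignment).
fn tablet_for<K: Hash>(key: &K, n: u16) -> u16 {
    let slice = 65536u32 / n as u32;
    (hash16(key) as u32 / slice).min(n as u32 - 1) as u16
}

fn main() {
    for key in ["invoice:1001", "invoice:1002", "ledger:42"] {
        println!("{key} -> tablet {}", tablet_for(&key, 8));
    }
}
```

Because placement is a pure function of the key's hash, the application never needs to know which node holds a row; the cluster can move tablet ranges without any query changes.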
Backups and point‑in‑time recovery (PITR)
YugabyteDB Aeon supports scheduled full/incremental backups and PITR. Restores/clones pick the closest snapshot and use Flashback to reach your chosen timestamp within the retention window.
Observability that fits our stack
The Datadog integration streams cluster metrics to our existing dashboards with an out-of-the-box view of database health and performance. No custom exporters required.
Private networking on AWS
We use YugabyteDB Aeon VPCs and AWS PrivateLink (Private Service Endpoints) to keep traffic private, granting access to specific AWS principals (ARNs). Database auth uses PostgreSQL mechanisms (SCRAM/LDAP).
Developer simplicity
YSQL is built from PostgreSQL code (v15 lineage) and is wire‑compatible, so our Rust + SQLx code worked unchanged.
Transaction isolation & retries
In YSQL, the effective default is Snapshot (Repeatable Read)—PostgreSQL’s READ COMMITTED maps to Snapshot unless you enable a server flag. Under concurrency (and occasionally due to clock skew), you may see read restart/serialization errors; lightweight, idempotent retries are the right fix. For long‑running read‑only jobs, SERIALIZABLE READ ONLY DEFERRABLE avoids restarts by waiting for a safe snapshot. If your logic allows, enabling true READ COMMITTED can reduce visible restarts.
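The retry advice above can be sketched as a small helper. This is an illustrative pattern, not Tofu's production code: `DbError` and the SQLSTATE check stand in for whatever error type your driver surfaces (SQLx, for instance, exposes the SQLSTATE on database errors), and the simulated operation in `main` stands in for a real idempotent transaction.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Stand-in for a driver error; `code` mimics the SQLSTATE field.
#[derive(Debug)]
struct DbError {
    code: &'static str,
}

/// SQLSTATE 40001 is serialization_failure (read restarts surface this
/// way); 40P01 is deadlock_detected. Both are safe to retry.
fn is_retryable(e: &DbError) -> bool {
    matches!(e.code, "40001" | "40P01")
}

/// Retry an idempotent operation with exponential backoff between attempts.
fn with_retries<T, F>(mut op: F, max_attempts: u32) -> Result<T, DbError>
where
    F: FnMut() -> Result<T, DbError>,
{
    let mut delay = Duration::from_millis(10);
    for attempt in 1..=max_attempts {
        match op() {
            Ok(v) => return Ok(v),
            Err(e) if is_retryable(&e) && attempt < max_attempts => {
                sleep(delay);
                delay *= 2; // back off before restarting the transaction
            }
            Err(e) => return Err(e), // non-retryable, or attempts exhausted
        }
    }
    unreachable!("every attempt returns Ok or Err above");
}

fn main() {
    // Simulate an operation that hits a read-restart error twice, then succeeds.
    let mut calls = 0;
    let result = with_retries(
        || {
            calls += 1;
            if calls < 3 {
                Err(DbError { code: "40001" })
            } else {
                Ok(calls)
            }
        },
        5,
    );
    println!("succeeded on attempt {}", result.unwrap());
}
```

The important design constraint is idempotency: the whole transaction body must live inside the closure so a restart replays it from the beginning.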
Query planning & statistics
YugabyteDB ships a Cost‑Based Optimizer (CBO) for YSQL that accounts for topology and network costs. Keep stats fresh: YugabyteDB Aeon provides Auto Analyze, and we still run ANALYZE after large imports or schema changes. Some plans (join order, index selection) will differ from single‑node Postgres—expected in a distributed layout.
Indexes are online by default
Index creation uses online backfill by default, so routine index adds don’t block writes. That made iterative schema changes safer during releases.
Extensions
Extensions we rely on—uuid-ossp, pgcrypto, pg_trgm—work as expected; you can also use gen_random_uuid() for UUIDs. (As with Postgres, low‑level C extensions that assume specific storage internals aren’t applicable.)
Moving from a single‑node RDS instance to YugabyteDB Aeon has been one of the most impactful infrastructure upgrades in Tofu’s journey. We gained horizontal scale, fault tolerance, and operational simplicity—without sacrificing PostgreSQL familiarity or developer velocity.
If you’re hitting RDS/Aurora limits and need elastic, zero‑downtime operations on PostgreSQL semantics, YugabyteDB Aeon is worth a close look.
A detailed white paper about Tofu’s use of YugabyteDB Aeon will soon be available on the YugabyteDB website.
Written by: Ken Kanai, VP of Engineering at Tofu
Tofu is a fast‑growing B2B SaaS platform transforming the bookkeeping industry with AI‑driven automation.
Learn more at www.gotofu.com.