Announcing Our Investment in PeerDB
While models command the spotlight, the AI age ultimately depends on real-time, reliable data access, as do countless traditional business processes. Today, Postgres is the de facto database for enterprises, SMBs, and developers, yet Change Data Capture (CDC) and data movement tools for Postgres are a massive bottleneck. Picture the Ever Given wedged in the Suez Canal, with data as cargo.
Since its first release in 1996, Postgres has grown from a grassroots following to a central place in the modern data stack, for several reasons. It’s open source, with a robust community that keeps it current and relevant (e.g. pgvector, amid the current craze for vector databases). It supports numerous workloads, including transactional, analytical, time-series, and search. Beyond scaling as an ecosystem, it just works.
The same cannot be said for Postgres ETL and CDC tools, which are typically slow, crash-prone, hard to configure, and missing native features. Many don’t support something as simple as the Postgres COPY command, which can improve throughput by an order of magnitude, or native Postgres data types such as geospatial types and arrays. Because these tools are built to support hundreds of connectors rather than native Postgres features, there is no easy fix.
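For reference, COPY streams rows in bulk over a single round trip rather than issuing per-row INSERTs; a minimal illustration (table and file paths are placeholders):

```sql
-- Bulk export and re-load via COPY. Table name and file path are illustrative.
COPY orders TO '/tmp/orders.csv' WITH (FORMAT csv, HEADER true);
COPY orders FROM '/tmp/orders.csv' WITH (FORMAT csv, HEADER true);
```

From psql, the client-side \copy variant does the same without requiring filesystem access on the server.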
While struggling to adapt legacy tools, longtime friends and Postgres builders Sai Krishna Srirampur and Kaushik Iska realized the urgency of a top-tier Postgres data movement solution. In a few weeks, they built an MVP to stream data from Postgres to BigQuery. The result was 10x faster than incumbent tools, in terms of both replication throughput and latency. PeerDB was born in June 2023, with the vision of becoming the data movement and ETL standard for Postgres enterprises.
PeerDB’s product foundations are carefully chosen, reflecting Sai and Kaushik’s expertise in high-throughput data systems and conviction in the Postgres market. PeerDB implements parallel snapshotting, which reduces massive dataset moves from days to hours. It natively replicates advanced Postgres data types, meaning data, once moved, is ready for action. Fundamentally, it prioritizes quality of connectors over being all things to all people. As Sai explains, “Any change our competitors make, they need to think about 100 connectors. We're more of a P2P model, emphasizing quality of data movement.”
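Parallel snapshotting can be pictured as splitting a table’s key range into contiguous chunks that are copied concurrently. A minimal sketch of the idea, not PeerDB’s implementation (the function names, chunking scheme, and `copy_chunk` callback are all assumptions for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

def partition_range(min_id: int, max_id: int, num_chunks: int):
    """Split the inclusive key range [min_id, max_id] into contiguous chunks."""
    total = max_id - min_id + 1
    base, extra = divmod(total, num_chunks)
    ranges, start = [], min_id
    for i in range(num_chunks):
        size = base + (1 if i < extra else 0)
        if size == 0:
            break
        ranges.append((start, start + size - 1))
        start += size
    return ranges

def snapshot_table(min_id: int, max_id: int, num_workers: int, copy_chunk):
    """Copy each key range concurrently; copy_chunk(lo, hi) performs the I/O."""
    chunks = partition_range(min_id, max_id, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Each worker copies one disjoint slice of the table in parallel.
        list(pool.map(lambda r: copy_chunk(*r), chunks))
    return chunks
```

In a real pipeline, `copy_chunk` would issue something like a keyset-bounded COPY per range; the speedup comes from the disjoint ranges being streamed simultaneously.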
PeerDB’s use cases have been similarly focused, starting with two core areas.
1. Fast, cost-effective data replication from Postgres to warehouses such as Snowflake, BigQuery and ClickHouse for AI-based business analytics.
2. Real-time streaming and CDC from Postgres to queues (including Kafka, Azure Event Hubs, and Google PubSub), enabling use cases like real-time alerting and microservices-based architectures.
Additional planned use cases include migration from legacy databases such as Oracle and SQL Server to Postgres, enterprise-grade Postgres High Availability (HA) and backups, and Vector ETL to enable scalable, advanced semantic search in AI applications.
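Conceptually, CDC turns a stream of ordered row-level change events into an up-to-date replica on the other side of the queue. A toy illustration of the consuming end (the event shape here is invented for clarity and is not PeerDB’s wire format):

```python
def apply_change(table: dict, event: dict) -> None:
    """Apply one CDC event to a replica keyed by primary key."""
    op, key = event["op"], event["key"]
    if op == "insert":
        table[key] = event["row"]
    elif op == "update":
        # Merge changed columns into the existing row.
        table[key] = {**table.get(key, {}), **event["row"]}
    elif op == "delete":
        table.pop(key, None)
    else:
        raise ValueError(f"unknown op: {op}")

def replay(events) -> dict:
    """Replay an ordered change stream into a fresh replica table."""
    table = {}
    for ev in events:
        apply_change(table, ev)
    return table
```

Real-time alerting is then a matter of reacting to individual events as they arrive, rather than replaying the whole stream.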
Seven months into product development, PeerDB’s growth has been rapid and repeatable. With a team of six, PeerDB is doubling revenue every two months. It is consistently 80% cheaper and 10x faster than legacy ETL tools for CDC from Postgres to warehouses like Snowflake, indicating the MVP was no fluke. And it operates at scale, helping customers move 500 TB monthly.
Only an exceptional team could get here, and PeerDB’s co-founders, Sai (CEO) and Kaushik (CTO), supply exceptional technical chops, cohesiveness, and GTM savvy. At Citus Data (acquired by Microsoft), Sai led solutions engineering for Postgres services on Azure. Kaushik led data teams at Palantir, Safegraph, and Google, and represented India in the ICPC World Finals. They have learned their market, and customer challenges, intimately over the past decade. As an old-school RDBMS kernel developer himself, who worked with many legends of the craft, 8VC Partner Bhaskar “BG” Ghosh knew Sai and Kaushik had “it”.
We are privileged to lead PeerDB’s $3.6 million seed round, with participation from Y Combinator, Webb Investment Network, and Wayfinder Ventures, and angels including former Palantir leadership and the founders of Citus Data and Supabase. Like PeerDB itself, this round is focused, not flashy, calculated for their objectives.
In PeerDB, we’ve made a textbook bet: a top-tier team building something necessary, in a way that seems obvious now but wasn’t until very recently. We look forward to supporting Sai, Kaushik, and friends in a classic human endeavor: facilitating movement.