AI-driven Database Cache with Ben Hagan from PolyScale
PolyScale is a database cache, specifically designed to cache just your database. It is completely plug and play and it allows you to scale a database without a huge amount of effort, cost, and complexity. PolyScale currently supports Postgres, MySQL, MariaDB, MS SQL and MongoDB.
In this episode, we spoke with Ben Hagan, Founder & CEO at PolyScale. We discuss AI-driven caching, edge network advantages, use-cases, and PolyScale's future direction and growth.
Timestamps:
01:33 Introduction
04:26 What is PolyScale?
08:36 Supported Databases
14:28 Global Cache Invalidations
19:29 Preference Options
26:57 Global Edge Network
33:55 Pricing
42:15 Observability
51:06 Convincing Customers
Follow Ben: https://twitter.com/ben_hagan
Follow Alex: https://twitter.com/alexbdebrie
PolyScale
Website: https://www.polyscale.ai/
Youtube: https://www.youtube.com/@polyscale
Software Huddle ⤵︎
X: https://twitter.com/SoftwareHuddle
Transcript
[00:01:31] Alex: Ben, Welcome to the show.
[00:01:32] Ben: Thanks, Alex. Great to be here.
Introduction
[00:01:33] Alex: Yeah, absolutely. So I'm excited to talk to you. You are the founder and CEO at PolyScale, which is an AI driven cache. I think it's super interesting because it's just an area I don't know and don't understand that much around like what's going on there.
So I'm excited to dig into some of the technical aspects around this, but maybe tell us a little bit more about your background, and PolyScale.
[00:01:54] Ben: Sure. So, I guess PolyScale came from a classic startup tale of living the problem. My background: I've traditionally been in sales engineering and solutions architecture, going back a few companies, for large data-driven companies. I did some time at Elastic, so Elasticsearch, and before that a startup called DataSift that was focused on mining the Twitter firehose and LinkedIn and Facebook data in privacy-safe ways. And what I observed at these companies was that getting data into the right locations, and being able to support the right access patterns, was complex and difficult unless you had the right teams of people and methods for moving that data.
So, really, the pains of scaling your database and your data tiers in general were what drove the inception of PolyScale. And when I set out, it was really a case of: how can we make it really easy to scale a database without a huge amount of effort, cost, and complexity from teams of people? You've got the traditional vertical scaling challenges, and different access patterns, as I mentioned, but really I arrived on caching as being that layer for scaling those systems. I did a lot of research on things like materialized views, and asked, would PolyScale be a read replica company?
Is that the way to solve this? Caching is obviously notoriously difficult, but it did tick all of the boxes around being able to build a platform that could scale across different types of databases. So that was a big win. But as I say, the focus that was, and I think still is, quite unique is that PolyScale's completely plug and play.
So the idea was, how easy could we make it? How can we make it really trivial to plug this thing in and start scaling those databases? And that's where PolyScale came from: as I say, the whole plug and play ethos.
[00:04:05] Alex: Yep, absolutely. And some of those companies, like Rockset and Elastic, I'm sure being in those sales engineering and leadership roles you saw a ton of really great, data-heavy use cases, but still saw people struggling, even with those great tools, with how to make them work at those volumes and things like that.
What is PolyScale?
So, PolyScale. You mentioned plug and play. I guess, what is PolyScale? What gave you the idea for it, or how are you seeing people…
[00:04:36] Ben: Yeah, so taking a step back, PolyScale is a database cache, specifically designed to cache just your database. Comparing it to a key value store, something like Redis, where you can obviously cache pretty much any type of data, PolyScale is very much focused on databases specifically.
And there's really three pillars that underpin it. The first one is that it's wire protocol compatible with various databases. So, again, going back to the plug and play, it's wire protocol compatible with Postgres, MySQL, and MS SQL Server. We just released MongoDB, which is interesting for us, the first sort of NoSQL environment.
But the idea that you could just update your connection string and start routing your data through PolyScale was really the goal. And taking a step back, architecture-wise, PolyScale is effectively a proxy cache that sits between your database and anything that's connecting to it: your web application, serverless function, whatever it may be.
And it inspects the traffic that passes through. So, being plug and play was the first one. And then secondly, caching gets really hard when you think about what to cache and how long to cache it for. If you're implementing caching at the application tier, you may select a specific query, maybe break that off to be a microservice in its own right.
Maybe it's a leaderboard query or something specific. And you may have a good idea around how long you want to cache that for. And maybe that's something you can invalidate when you know that data's changing. So, if the updates are coming through the application tier, you can easily write that logic.
So the approach we're taking, however, is that we want to cache all queries that are good candidates for caching. The fact that we are a kind of sidecar proxy that sits alongside your application means we can inspect all of the traffic. We get a view of every single interaction between the app and the database, and what's cool about that is you can plug complex applications that may run 50,000 unique queries a day through the platform, and PolyScale can inspect those and work out what to cache.
So the second core principle here is the AI caching engine. It inspects all of that traffic going back and forth, as I've mentioned, and it builds statistical models on every individual unique query. That's incredibly powerful because, as I say, you get the full breadth: any query that can be cached will get picked up and added into the algorithm. So the goal is really to allow a developer to plug in large applications, or full applications, without writing any code and without doing configuration, and automatically start seeing their hit rates go up without dealing with all the complexities of invalidation and all that good stuff.
And then finally, the third pillar of the platform is that we have our own edge network. So the whole idea, again, of the plug and play is that you can connect this thing in and it will route the data through our infrastructure, and whatever the closest location is to your application tier, that's where the data will be cached.
So you get this nice reduction in latency, or you can self-host PolyScale if you want to run it inside your own VPC.
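To make the plug-and-play point concrete, here is a minimal sketch of what that connection-string swap can look like from the application side, using the node-postgres driver in TypeScript. The hostnames and credentials are placeholders, not real PolyScale endpoints; the exact connection details would come from your own setup.

```typescript
// Illustrative only: hostnames and credentials below are placeholders,
// not real PolyScale endpoints. The application code and driver stay the
// same; only the host in the connection string changes.
import { Pool } from "pg";

// Before: connect straight to the origin database.
// const pool = new Pool({ connectionString: "postgres://app:secret@db.example.com:5432/shop" });

// After: point the same driver at the caching proxy, which serves cache
// hits locally and forwards misses and writes to the origin.
const pool = new Pool({
  connectionString: "postgres://app:secret@cache-proxy.example.com:5432/shop",
});

export async function getProduct(id: number) {
  // Reads like this are candidates for caching; the query text is unchanged.
  const { rows } = await pool.query("SELECT * FROM products WHERE id = $1", [id]);
  return rows[0];
}
```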
[00:08:01] Alex: Cool. I love that three pillar approach. That first point, the plug and play, is interesting, and it makes me think of DAX. I use DynamoDB a lot, and they have DAX, which is basically a pass-through cache. We're starting to see a few more of these, and it's so interesting because it lowers the barrier so much in terms of how much work you have to do to integrate a cache.
You're not manually changing all your data storage logic to check the cache first and things like that; it just passes through. So that's pretty interesting. Now, you mentioned you integrate with a bunch of databases.
Supported Databases
Are there particular databases that you're seeing more interest for?
Or did you see a lot of people asking for Mongo? What did that thought process look like?
[00:08:44] Ben: Yeah, so we started with... we picked MySQL. This is going back a couple of years. At the time it was really the biggest, most popular database, and it was a good place to start; we knew the protocol and understood it pretty well. And then from there, I guess Postgres: interest keeps going up and up, and there are new, awesome vendors coming out around Postgres, so that was our next database. And taking a step back, we really started by focusing on those traditional transactional databases. Where was the widespread adoption? That's where we started.
So, as I said, we did MySQL, then Postgres. For us, actually implementing new databases is a reasonable amount of work, for that protocol compatibility reason. But I think the whole concept of asking people to install different ORMs or different drivers or different client libraries to interact with your tool is a burden; there's an overhead, and you're always going to be competing with other libraries. Is there an ORM that someone's picked because of a specific feature? You don't want to be having to replace that, or be in that fight. So being wire protocol compatible was really nice in that you just get plug and play, and it runs across every language.
It runs across TypeScript, Ruby, doesn't matter. But yeah, to answer your question, I think we really focused on the biggest, most popular databases at the time and then moved from there. We did MySQL; MariaDB is obviously a very similar protocol, so we did that. And then, as I say, more recently we added support for Microsoft SQL Server, and that's really interesting because, typically, in the more edge-facing use cases we don't see a huge amount of MS SQL Server, and it's nice to actually be able to be the plumbing for those types of tools, where people can now plug those in and use that data anywhere. And then, more recently, Mongo is our first step towards... there's actually a pretty significant roadmap around where we want to take the whole paradigm of using caching to distribute your data and support those different access patterns, and Mongo is our first sort of NoSQL database.
And then we're also looking to move into things like Elasticsearch and search infrastructure, and also data warehousing, things like ClickHouse and potentially Google BigQuery. So it's really following where the demand is. And I think with MongoDB and Atlas there's a huge demand for Mongo, for high performance distribution, and that caching layer is very useful.
[00:11:39] Alex: Yeah, absolutely. So you mentioned the protocol compatibility and some of the work there. Are you, I mean, do you start from scratch every time? Or do you use the query parsing front end from Postgres or MySQL or things like that? Or are you just like, Hey, we're gonna look at the wire protocol and parse that all in our engine…
[00:11:56] Ben: Yeah, it's got to be the latter. I mean, it is the latter. Our actual engines are written in C++; we have our own proxy that's written in C++. And if you think about what we do, we're a middleman between your app and your database, and if we added any latency to SQL queries, if we added any latency in that process, we're going to fail as a business, right?
There's just no way that people would voluntarily have latency added to their queries. So we've worked really hard to make sure that the inspection we do of all the packets that come through is very low latency, zero copy buffering, and we've written everything from scratch from that perspective. That unfortunately means going right down to the wire protocol, but the nice thing about what we do is that we're not implementing every particular type of query for every database.
What we do implement is the authentication handshake. So we will manage that, and then it flows: there's a handshake with the upstream database, which obviously allows somebody to connect into PolyScale, and that then creates an upstream connection to the database. But from that point onwards, we're just letting packets flow backwards and forwards.
And what we do is inspect those to understand: are these read queries, selects and shows, which are obviously cacheable, versus manipulation queries or mutations, inserts, updates, and deletes? Those mutation queries just flow through and hit the origin database.
And the nice thing about that is you're not altering people's architectures. You're not saying all your writes are going to end up in the same location. You're not having to distribute your database or shard your data in any way. That resonates well with customers: your writes are still going where they always did, but if a query comes in that we've seen and we have it in the cache, then we'll serve it from there. So you're effectively getting a SQL-compatible key value store, at the simplest level.
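As a rough illustration of that read-versus-mutation split, here is a simplified sketch of the idea. This is not PolyScale's actual C++ wire-protocol inspection; it just looks at the leading keyword of a statement to decide whether it is a caching candidate or should flow straight to the origin.

```typescript
// Simplified sketch of the read-vs-mutation split, not PolyScale's actual
// wire-protocol inspection. It only looks at the leading keyword.
type QueryKind = "cacheable-read" | "mutation" | "passthrough";

function classify(sql: string): QueryKind {
  const keyword = sql.trimStart().split(/\s+/)[0]?.toUpperCase() ?? "";
  if (keyword === "SELECT" || keyword === "SHOW") return "cacheable-read";
  if (["INSERT", "UPDATE", "DELETE"].includes(keyword)) return "mutation";
  return "passthrough"; // DDL, transactions, etc. just flow to the origin
}

// Mutations always hit the origin; cacheable reads are served from the
// cache when a fresh entry exists, otherwise forwarded and stored.
console.log(classify("SELECT * FROM orders WHERE id = 7"));                 // cacheable-read
console.log(classify("UPDATE orders SET status = 'shipped' WHERE id = 7")); // mutation
```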
[00:14:03] Alex: Yeah, very cool. Okay. I want to dive a little bit deeper on some of the under the hood stuff. That protocol stuff is great, but keep going. So, one thing y'all talk about is AI driven caching, right? And I know AI is the buzz right now, and that's mostly LLM stuff.
And I'm guessing it's not an LLM
[00:14:23] Ben: Correct. Correct. No LLM tier at the moment.
Global Cache Invalidations
[00:14:28] Alex: Yeah, exactly. So tell me more about that. Cache invalidation is famously one of the hardest problems in computer science. Why is AI a good approach for this? How does that work?
[00:14:41] Ben: Yeah. So I think, when we started looking at this, it was a case of: every cache implementation that's manually done, implemented at the application tier, everyone's literally starting from scratch. And this was staggering to me, even a couple of years ago. Why do we make it really hard for developers to implement caching? Because you get into this process of deciding: okay, pick your key value store, Redis or Memcached or whatever it may be. That's the easiest part of the job, right? Pick your tool; there are some amazing vendors out there with high availability cloud services. But then it's a case of, okay, I've got a blank canvas. I start with modeling the data, then you start building your application logic, and then you've got to work out, as you say, when do I invalidate?
And in so many implementations I've seen, people just say, well, it's good enough, let's put a TTL of 10 minutes in there. I'm either serving some stale data, and that may be fine, or I may just be getting misses and my hit rates aren't what they could be, but that's okay. I think lots of people settle in that world.
So from the inception here, it was a case of: look, with those approaches, typically a developer will pick a handful of queries that are causing pain, right? We've got really slow performance on the database for various reasons, lack of resources or whatever it may be.
Or it could be that you've started writing your cache, you've got halfway down that process, and you realize, well, actually, there's a lot of queries in here we need to solve for. And you get this ongoing burden as well, where a new feature comes into the application and then has to go into that logic too.
So there's an observability task, there's a testing task, there's an ongoing overhead of: is my caching working as expected? So having this sidecar approach, where that logic is completely separated from the app and completely independent, is really nice.
It doesn't touch the database, it doesn't touch the app tier, and it's a one stop shop: you plug it in, and every new feature you add then benefits from it. So the way we do this is, for every query that passes through, there's a whole bunch of inputs that go into the models that get built on it.
The bigger, more important ones are what we call the arrival rate, so how frequently we're seeing the queries coming in, and then how frequently the data is changing on the database. We can look at the payload size that comes back and compare it to say, well, is it changing?
Then, based on the frequency of that change, we can build up a confidence score of how likely it is that this data is actually changing on the database. That's the basics of the statistics-based invalidation, and the platform will set a TTL that it thinks is optimal based on those inputs.
And that gives you some level of comfort: the data will invalidate based on that statistical model. Now, it turns out that humans want really good, fast invalidation that is correct and accurate. So there are a couple of additional things that we bring in.
There's a whole feature set that we call smart invalidation. What that does is look at the manipulation queries: if it sees inserts, updates, and deletes coming across the wire, it will actually determine what data in the cache has been affected by those changes and automatically remove that data from the cache. That's really nice in that, as a developer, you can say: there's an update statement coming in.
It's been applied, and it's affected a whole bunch of read queries that are already in the cache, so let's go and invalidate those. The next request will be a miss, and that will come back and refresh the cache with the new data. So we are on the side of consistency, rather than performance, in those use cases.
And PolyScale's completely distributed as well, so you can run one or more of those instances, and those invalidations happen globally. If you get an update that happens in, let's say, Paris, and you've got another node in New York, that's going to invalidate globally. So that smart invalidation is really nice for the majority of use cases we see, whereby people are just plugging in and they've got a monolith, maybe with some microservices as well, but all of that traffic is coming across the wire and we can inspect everything.
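As a toy illustration of the statistics-driven TTL idea described here (this is not PolyScale's actual model): assume we track, per unique query shape, how often it arrives and how often its returned payload has been observed to change, and derive a TTL that shrinks as the data changes more often. The cap and the divide-by-four factor are arbitrary illustrative choices.

```typescript
// Toy illustration of a statistics-driven TTL, not PolyScale's model.
interface QueryStats {
  arrivalsPerMinute: number;      // how often this query shape is seen
  observedChangesPerHour: number; // how often the payload differed on re-fetch
}

function suggestTtlSeconds(stats: QueryStats): number {
  if (stats.arrivalsPerMinute < 1) return 0; // too rare to be worth caching
  if (stats.observedChangesPerHour === 0) return 20 * 60; // looks static: cache generously (capped)

  // Re-fetch a few times per expected change interval so the staleness
  // window stays small relative to how fast the data actually moves.
  const changeIntervalSeconds = 3600 / stats.observedChangesPerHour;
  return Math.max(1, Math.min(20 * 60, Math.floor(changeIntervalSeconds / 4)));
}

console.log(suggestTtlSeconds({ arrivalsPerMinute: 300, observedChangesPerHour: 0 }));  // 1200
console.log(suggestTtlSeconds({ arrivalsPerMinute: 300, observedChangesPerHour: 60 })); // 15
```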
Preference Options
[00:19:29] Alex: So you mentioned preferring correctness and consistency over top-level performance. Is that something that's tweakable? Like, maybe if I'm Twitter and the like count is always going up, or Reddit with points or something like that, can I say, hey, only refresh this every once in a while?
Or is it like, hey, that's just something we believe in, consistency, and for right now that's what's available?
[00:19:51] Ben: Yeah, so the default behavior, if you just connect up PolyScale, is that it's running in what we call auto mode. So we use the AI to drive everything by default. You can, however, come in and override any part of that. So you can override things; you can set manual TTLs.
For example, if you know, look, I've got my products table and it just doesn't change, I'm going to set that to a cache of 20 minutes, and that's perfect. You can come and do that, and you can go right down to the individual SQL query level. So within the product you get this nice observability view of: what are my slow queries?
Where are we being most efficient? And you can literally override any one of those. You can do that down to the table level as well, if you've got more fine-grained stuff than that. So yeah, you can come in and set those up however you want them to be, or you can just go with the full auto approach.
And what we've found is that the majority of people actually just run in auto mode without any manual interventions, which is nice.
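Purely as an illustration of the precedence described here (an explicit per-query override wins, then a per-table override, then the auto-derived TTL), and not PolyScale's actual configuration format or API:

```typescript
// Purely illustrative, not PolyScale's configuration format or API.
// Precedence: explicit per-query override, then per-table override, then
// whatever TTL the auto (AI-driven) mode derived.
interface CacheRules {
  queryTtls: Map<string, number>; // keyed by normalized query text, in seconds
  tableTtls: Map<string, number>; // keyed by table name, in seconds
}

function effectiveTtl(
  rules: CacheRules,
  normalizedQuery: string,
  table: string,
  autoTtlSeconds: number,
): number {
  return (
    rules.queryTtls.get(normalizedQuery) ??
    rules.tableTtls.get(table) ??
    autoTtlSeconds
  );
}

const rules: CacheRules = {
  queryTtls: new Map(),
  tableTtls: new Map([["products", 20 * 60]]), // "products rarely changes: 20 minutes"
};
console.log(effectiveTtl(rules, "SELECT * FROM products WHERE id = ?", "products", 45)); // 1200
```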
[00:20:56] Alex: Yep. Absolutely. All right. So I interrupted you; you were talking about invalidation. You talked about that first level of basic row invalidation, individual updates, things like that…
[00:21:03] Ben: Yeah. So we've got the statistics-based invalidation, which happens out of the box. We've then got what we call the smart invalidation, which looks for inserts, updates, and deletes. We keep an index of what we have stored, what read queries we have in the cache, and how those get affected by those updates.
And that really means parsing the query: we look at the rows, columns, fields, and tables that actually get affected by those queries. That works really well for most use cases, with the exception of when PolyScale can't see those updates.
So if, for example, you've got some direct updates going into the database, maybe they're Cron jobs, imports, whatever they may be, in those cases you can just connect your CDC stream into PolyScale. And that's really cool: you've got a feed of your real-time updates going in, and that keeps the cache up to date globally.
We're actually working with a really interesting client who's doing that: they're bringing their GraphQL data into PolyScale using that method, a CDC stream for invalidations, to invalidate the cache globally. So those are the three methods we use for keeping the cache real-time and fresh.
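A small sketch of the CDC-driven invalidation idea. The event shape here is a generic, Debezium-style change record rather than PolyScale's actual ingestion format, and it simplifies to table-level granularity, whereas the smart invalidation described above reasons about rows, columns, and fields.

```typescript
// Sketch of CDC-driven invalidation with a generic change-record shape,
// not PolyScale's ingestion format. Granularity is per table for simplicity.
interface ChangeEvent {
  table: string;
  op: "insert" | "update" | "delete";
}

// Maps a table name to the cache keys of result sets that read from it.
const cachedResultsByTable = new Map<string, Set<string>>();

function onChangeEvent(event: ChangeEvent, evict: (key: string) => void): void {
  const keys = cachedResultsByTable.get(event.table);
  if (!keys) return;
  for (const key of keys) evict(key); // the next read is a miss and refreshes the cache
  cachedResultsByTable.delete(event.table);
}

// Usage idea: feed change records from a CDC pipeline into onChangeEvent and
// broadcast the resulting evictions to every cache location.
```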
[00:22:25] Alex: Yep. Very cool. And just to compare you to other things I've seen recently in the space, one that came to mind initially was ReadySet, which is another drop-in, protocol-compatible one, but they are more, as I understand it, Noria data flow based, right?
Where you define the queries you want cached, and they'll hook into CDC and cache those expensive ones, whereas you take all queries and are basically an automated version of the traditional read-aside cache that you don't have to implement yourself.
[00:22:58] Ben: That's right. So my understanding is that ReadySet is effectively rendering those result sets upfront. And there are a few similar platforms and tools, materialized view style implementations, and there are definitely pros and cons to these approaches. And I think you're right: with that approach,
you have to define which queries you want to cache. That isn't uncommon by any means, but it's a case of the developer selecting what those queries are and pre-rendering them up front. And when I was very much in the R&D stage, the early stages of PolyScale,
it was a case of: the types of access patterns may not be known up front, and that was a real blocker for me around the materialized views world. This really does go back to being able to plug in an entire e-commerce app and just watch it start caching without doing anything, literally without doing any configuration. So if you send a brand new query to PolyScale that it's never seen before, you'll typically get a hit on the third request.
And then what it does from there is compare queries that are similar to each other. It removes the query parameters, and it will say, look, if we've seen something similar to this, you'll get a hit on the second request. So the speed at which you can go from nothing to high hit rates is pretty impressive.
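A rough sketch of that parameter-stripping step, using a naive regex rather than real SQL parsing. The point is that structurally similar queries collapse to one shape, so what's been learned about that shape can be reused, even though each concrete parameter value is still cached as its own result.

```typescript
// Naive parameter stripping with regexes, not real SQL parsing.
function normalize(sql: string): string {
  return sql
    .replace(/'(?:[^'\\]|\\.)*'/g, "?") // string literals -> placeholder
    .replace(/\b\d+(\.\d+)?\b/g, "?")   // numeric literals -> placeholder
    .replace(/\s+/g, " ")
    .trim();
}

// Both collapse to the same shape, so the TTL model learned from the first
// can be applied to the second, even though each parameter value still gets
// its own cached result.
console.log(normalize("SELECT * FROM users WHERE id = 42"));
console.log(normalize("SELECT * FROM users WHERE id = 97"));
// -> "SELECT * FROM users WHERE id = ?"
```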
[00:24:23] Alex: Yeah. Is it like Java and the JVM, where it just takes a little while to get up to really hitting peak performance? Like it's learning for a while as it learns your query patterns and access patterns, and then…
[00:24:38] Ben: Yeah, I mean, the actual learning never switches off; every query that comes through the platform feeds into the algorithm. But as I say, it's actually very efficient from a cold start: by around the third query, you'll get a hit. And the queries you care about, you're going to see quite regularly in a caching type environment.
So yeah, you can really go from zero to high hit rates very, very quickly. Even if you're purging your entire data set, it doesn't really matter; you can get back up very quickly.
[00:25:15] Alex: Yeah, very cool. What about under the hood, actually handling the cache? Is that something like Memcached or Redis? Or do you have your own custom caching, like a key value store on those nodes? What does that look like?
[00:25:28] Ben: Yeah, so it's custom. There's a bit of history here: we started out with Redis, and we hit performance issues around the speed we could read and write with concurrency and things of that nature. So we ended up going with our own solution, and it's nice. We share memory and disk: we keep the optimum stuff in memory, the stuff that's hot, and then we fall out to disk if we need to, if stuff's overflowing.
And what's nice about that is we don't have to worry about the size of the data set. People can be running very large data sets and it works well, it scales well, and obviously you can have that data in different regions. If you think about what we actually do, we cache the result of a SQL query, or a non-SQL query, which is typically relatively small.
I mean, obviously there can be larger payload sizes, but typically it's relatively small. And what's nice about that is we're not a database read replica taking an entire copy of your data set; we're just storing the result set, so we can actually store large amounts of query results quite efficiently. So yeah, we store in memory and fall back to disk if we need to, and that gives us a nice way to scale to very large data sets.
Global Edge Network
[00:26:57] Alex: Yep. Very cool. Okay. Now I want to talk about the global edge network. You have these points all around the globe that people can hit, and it'll serve the result if it's cached there, or route it back to the database if needed. We're seeing a little bit more of this with CDN type things,
Netlify, or hosting providers like Fly.io. This is pretty interesting to see from a caching provider. How hard is that to build? Walk me through it. What does it look like to build that sort of infrastructure?
[00:27:26] Ben: Yeah. So we have three tiers to our architecture. The bottom tier is what we call our private cache endpoint, or the proxy component, and that's the component that manages the TCP connections, or HTTP connections, that pass through.
We support both. And that's what actually stores and persists the data; that's the main proxy component. We run that specific proxy component in multiple locations across most of the major providers, AWS, GCP, Azure, Fly, DigitalOcean, and it then connects back to the AI control plane. That proxy component is just responsible for checking: have I got something in the cache?
If I haven't, let's pass it on and go get it, and if I have, I'll serve it from the cache. It's dumb in that perspective, because that's how you go fast, right? Make it simple. What it then does is offset whatever query came through onto the AI control plane, which takes the SQL, parses it, and does the more expensive stuff that we can scale independently of that fast track.
The nice thing about that is we can spin up these different locations wherever they need to be, and they'll connect back to this AI control plane that parses and processes the queries. And then all it does is send TTL messages back to the proxies to tell them: that specific query now has a TTL of this number of seconds.
And there's real-time traffic obviously passing through that process. But it makes it really easy for us to deploy these proxy locations, because they're just containers; we can run them pretty much anywhere, and we just use Kubernetes. So more of the work is in the high availability, the uptime, the monitoring, and the DNS: we have a single DNS network that will resolve to the closest point of presence.
And what happens is, if a PoP goes down for whatever reason, say there's a hardware failure or whatever it may be, we just fall back to the next closest point of presence. That won't be the fastest one, typically, but it's not going to yield downtime, which is the important thing.
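A toy sketch of that fast-path proxy / slower control-plane split, again not PolyScale's implementation: the proxy answers from a local store and reports query shapes asynchronously, and the control plane later pushes back an updated TTL for each shape.

```typescript
// Toy sketch of a fast-path proxy with an asynchronous control plane.
interface CacheEntry { payload: string; expiresAt: number }

const store = new Map<string, CacheEntry>();  // local result-set store
const ttlByShape = new Map<string, number>(); // seconds, pushed by the control plane

async function handleQuery(
  shape: string,                               // normalized query text
  key: string,                                 // query plus concrete parameters
  fetchFromOrigin: () => Promise<string>,
  reportToControlPlane: (shape: string) => void,
): Promise<string> {
  const hit = store.get(key);
  if (hit && hit.expiresAt > Date.now()) return hit.payload; // fast path

  const payload = await fetchFromOrigin();
  const ttl = ttlByShape.get(shape) ?? 0;      // 0 until the control plane has an opinion
  if (ttl > 0) store.set(key, { payload, expiresAt: Date.now() + ttl * 1000 });
  reportToControlPlane(shape);                 // off the hot path; no waiting on the parse
  return payload;
}

// Message from the control plane: "this query shape now has a TTL of N seconds".
function onTtlMessage(shape: string, ttlSeconds: number): void {
  ttlByShape.set(shape, ttlSeconds);
}
```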
[00:29:59] Alex: And then just to be clear, those PoPs, the proxy layer: are they storing any cache data there, or are they just routing it to the nearest cache?
[00:30:38] Ben: They actually store the data, yeah. So the proxy layer is what does the storage, and that's a single component that runs in all the different locations. It does the TCP connectivity and it does the actual storage of the data itself. So they're relatively easy for us to deploy.
We can roll those out to a new location in a couple of hours, and we're doing that based on where customers are asking us to be. And as I say, they connect back to the control plane to do the slower processing of the data.
[00:30:37] Alex: Is that control plane centrally located, or is it distributed across a few locations as well?
[00:30:43] Ben: Yeah, that's actually distributed as well. We've doubled down on AWS specifically for that, so the control plane lives in AWS, but it's distributed across multiple locations. The constraint there is that the control plane must be relatively low latency back to the proxy.
We don't want that hop to be too high; we don't want the round trip to be too high. So we always make sure there's low latency between those two. And then there's actually a third layer on top of everything, which is responsible for the global invalidations: if we get an invalidation from one specific location, that gets fanned out to all the others, and that's what gets managed there.
[00:31:27] Alex: Yep. Okay. And then, can you give me a rough number of how many PoPs you have for the proxy, and roughly how many AI control planes? What do those numbers look like?
[00:31:40] Ben: We have 18 PoPs at the moment across the major hyperscalers, and then we've got a bunch running in Fly as well. So around 20 edge PoPs, and a percentage of those are full control plane PoPs as well; I guess we've got six or seven that are full blown and run the AI control plane too.
[00:32:04] Alex: Gotcha. And is there any proactive filling of caches? Or does it mostly just make sense that, when a request hits a PoP, it reaches out to the control plane and fills it there, and it's unlikely to be hitting one of those other PoPs anyway, so you don't want to do much proactively?
[00:32:20] Ben: It's a really good question, because one of the challenges I saw prior to PolyScale was the classic sharding problem: I need to have read replicas, and what data do I put where? You get to the point where the data sets are so big that you're spending more time replicating out to those regions than actually serving requests. So it actually works really well: at the moment, PolyScale will only store the data that's requested, so you get this natural, lazy sharding that goes on.
If I've got a lot of traffic being requested in New York, that's where it's going to get stored, and it's not going to get put in the other PoPs. That works really well with very large datasets: you can serve your audience quite specifically and only spend the compute running those queries in the locations where they're actually needed.
In the early days, we did look at, well, can we warm the caches in other regions, or should we? And I think there are diminishing returns there. You spend most of the time shuffling data around the planet rather than it actually being used.
Now, what's great is we do have visibility of all of that. We can see what queries are being used where, and we can estimate the likelihood that they'll be used in other regions. So we may go into that area in a bit more detail. But at the moment we do nothing: it's very lazy and it's very much per region.
Pricing
[00:33:55] Alex: I want to shift gears a little bit and get into pricing and operations, because I think that's another interesting place where you all are innovating. So first of all, just pricing and the operational model: it looks like a serverless model. I'm not paying for instances,
and I'm not paying for a certain amount of CPU and RAM or anything like that. Pricing is totally based on egress, is that right?
[00:34:15] Ben: That's right. That's right. At the moment, everything's based on egress, and as you said, that makes it nice in that you can scale to zero. If you think about your classic e-commerce environment where the business follows the sun, different regions can scale right down and others can come up.
So the serverless works well. And more recently, we've just launched our self-hosted version. If, for compliance and security reasons, you can't put your database data into a public cloud, you can run PolyScale inside your VPC. That's based on an event-based model: you pay per million queries that get processed by the platform,
because of course there's no egress actually happening there. Our cost is processing the SQL queries that come through the platform, so we do that at a price per million queries. But yeah, on the serverless offering, it's just egress.
[00:35:11] Alex: Yep. Okay. I want to come back to the pricing and operations, but talking about that self-hosted option: what does that look like? If I want to self-host, am I actually running the commands, setting up a Kubernetes cluster and doing all that? Or are you putting it in my account and managing it for me, but it's in my account?
[00:35:28] Ben: Yeah. So there's two options here, depending on your security and compliance requirements. The first one is you can just take that proxy component, which is just a Docker image, spin it up inside your ECS environment, and you're up and running.
Now, what's great about that is that it's still offsetting all the queries anonymously back to the AI layer that we host, the control plane. So if, as a company, you're comfortable having an anonymous connection going out to the AI control plane, all you have to do is spin up that single proxy component, or as many of those as you want.
You can deploy those into ten different locations and have your own sort of mini edge network. And that's really nice, because people can literally be up and running in a few minutes: you can just pull it down, start it up, and route your traffic through it. If you're a much larger organization, or a bank or whatever it may be, then you want to host the control plane as well.
So you take that control plane and run it internal to your organization too, and obviously there's more involved in that. But for most enterprises it's really nice: you can just pull down that single component and you're good to go.
[00:36:43] Alex: Gotcha. And so in both cases, they are actually running it themselves; they are deploying it. You make it available and, I'm sure, easy for them, but it's not like some models I've seen where maybe they create a separate AWS account and the provider runs stuff in that account, but the customer owns the account and has visibility into it.
In this case, they're actually running it themselves somewhere.
[00:37:05] Ben: Correct. And what works well there is that enterprises have their own requirements around uptime of a database. Specifically, if you think about what we are: if we go down, your database goes down. That's an incredibly important piece of the puzzle, so giving somebody the ability to own that allows them to put in whatever restrictions or structure they need around uptime.
So whatever health checks they're happy with that are currently happening on their database can happen through the PolyScale proxy as well. And likewise, any HA requirements, a hot standby or whatever it may be: it puts the onus on the customer.
Equally, we can build out private SaaS environments, but if you want to run it inside your own VPC, that's typically the model we follow.
[00:38:00] Alex: Gotcha, gotcha. Okay, going back to pricing, because you're pricing on egress, and I've been thinking a lot about egress lately and talking with people about it. Why was that the right factor to price on? Is it because it reflects your cost structure, or because it aligns with value for the customer? How did you settle on egress?
[00:38:20] Ben: Yeah. I mean, egress is really, when we think about it, all of the data coming out of the platform, and from a SaaS environment that's our cost. So it marries up nicely with our internal costs. The two bits that carry cost are the proxy component processing the bytes on the wire that come out, and then actually processing the SQL queries that come through.
It aligns nicely with both of those, and that's really the reason we picked it. There are always going to be customers who have a very small number of queries and a very high amount of egress, and vice versa, but the majority of customers fall into a nice bucket somewhere in the middle.
But yeah, it really aligns nicely with our internal costs, to be candid.
[00:39:09] Alex: Yep. Yep. I love having just one factor to price on. And like you're saying, in most cases it aligns with your cost, but it's also going to align with how much data they're actually working with, so it's a pretty good proxy for a lot of folks. Is that a hard mindset shift for people? First of all, just the serverless pricing generally, but then also specifically egress, which they may not have thought about before.
Like, is that a hard mindset shift?
[00:39:34] Ben: There is a mindset shift there, definitely. And we're always thinking about different pricing models. One option we might look into is taking the self-hosted model, where we're pricing by the number of queries, and rolling that out into the SaaS model as well, because, for the reason you mentioned, how much egress you're running out of your database is not a number that springs to mind easily.
So typically people just connect to the platform, use it for a couple of weeks, and work it out. But if you said to a DBA, how many queries are you running through your database every month, they'd probably have a rough number in mind. So there are definitely pros there. So yeah, typically there is a period of: okay,
I need to see what these numbers are. People do a test, or put it through the staging environment, do a bit of a pilot, but usually they want to do that anyway, right? It's not just to find out how much the egress is. And we can give people good ballpark figures based on what we see with other customers: if you're doing roughly this number of queries, you're going to see roughly this sort of cost.
[00:40:45] Alex: On that same serverless operational aspect: I know sometimes I see people that have trouble letting go of the operational visibility of what's happening. If they're used to running caches, they're like, hey, I monitor my CPU, or my memory used and available.
Have you noticed that with people? I assume you don't make that visible to them. How do people react to that? What metrics should they be monitoring as they're using PolyScale?
[00:41:13] Ben: Yeah. So if you're using the serverless model, then you're right, we don't expose any of that, and any scaling issues we deal with. If the CPU or memory is high, that's something we handle. On the self-hosted model, where you're actually managing the proxy yourself, then yeah, exactly,
that's in your wheelhouse. You're just running a container, so plug in all your Prometheus metrics and it's business as usual, CPU and memory. That goes into whatever you're doing at the moment to run containers; you continue doing it with PolyScale, and we've got recommendations around the minimum RAM and CPU that's required.
But yeah, we're very much a black box on the serverless environment, and again, it goes back to the plug and play stuff: from a developer perspective, I just don't care, I just want to see the cache run. It needs to be fast; we are consistently sub-millisecond response times on every cache hit, and as long as that happens, customers are happy.
Observability
The one area we have invested a lot in that's been really good is the observability side of things. It's amazing how many people plug in the tool and say, well, actually, I didn't realize I was running these types of queries. In the extreme cases, we had one customer who was running 500 queries per second they weren't aware of, just by accident.
There was a bug in the code. But pretty much everyone that looks at it goes, oh, okay, I've learned something here. There are great database observability tools out there, but this gives you that holistic view of: what are the expensive queries, and what are the ones I really care about? Some people go, oh, my ORM's doing something crazy, or I'm missing an index, all the classic stuff. That's definitely been something that really resonates with customers: just showing you what your database is doing is fascinating.
[00:43:10] Alex: That's what I've been thinking throughout this. Aside from the caching, I can't imagine how many people just want visibility into what their database is doing and where those expensive calls are. So many people, I think, don't have visibility into that, and getting a sense of it and seeing where those expensive calls are is just amazing.
[00:43:28] Ben: That goes back to the manual approach: if you're building this yourself, you've first got to work out which queries are the expensive ones. You may know, because in the worst case you've got support tickets telling you, right, this is not working as expected.
But usually it comes down to somebody looking at the slow logs, or whatever it may be, to start defining which ones to look at. Whereas, as I say, you can plug this in pretty quickly and get that view really quickly.
[00:43:56] Alex: Yeah, I did an interview with Jean Yang from Akita Software once, where she basically used eBPF to intercept packets going through. So many people had no visibility into their APIs, and it was like, hey, this is a non-intrusive way to do that and gather it.
And it reminds me of a similar thing here. You don't have to make a ton of code changes, do a bunch of instrumentation to figure out what's slow in your database. You can drop this in and get visibility and the slower ones will start getting cached.
[00:44:25] Ben: Yeah, lots of customers do that. You can actually plug this in and turn off caching, so it's literally just a pass-through, and that gives you all the metrics, all the observability, and all the potential wins if you want to switch that button on. So lots of people start there.
[00:44:39] Alex: You mentioned earlier about adding Mongo, and I'm just going to make the plug for Dynamo. I love Dynamo. I don't know how many people would, but DAX is sort of there on some things. A few things I would say about DAX: number one, it's instance based, so it's not a serverless operational model.
So now you're pulling in this thing that you have to manage, which is unfortunate. Number two, PolyScale is going to be distributed across the globe for you, so if you have customers all over the place, you'll get caching that way. But then also your caching story, I think, could win on just the invalidation stuff,
and especially the CDC, since there's DynamoDB Streams. DAX does item-level caching, so if you're getting an individual item, it'll cache and invalidate that for you. But it also does query caching when you fetch a result set, and it doesn't invalidate that very well. So if you query a set of 10 items and then you update one, it's not going to invalidate that for you.
You sort of have to wait for it to expire. Whereas it sounds like you all have done the work to figure that out. So I'm going to make the pitch for Dynamo.
[00:45:44] Ben: It's definitely a good one. Yeah, because we could effectively plug the CDC stream in, SNS or whatever it may be, and pull all that data in. There's actually a long list of people that pitch us their next data platform of choice. But we talk about this a lot: most enterprises are using multiple persistence layers, the right tool for the right job. If you'd asked me five years ago, I would have said everyone's going to consolidate on one or two databases, and that just isn't the case, right?
It's gone absolutely the opposite way, vector databases and so on. And I think Postgres is definitely having its day; for greenfield projects it's a great starting point because you can scale well across a broad set of use cases. But the whole concept of supporting multiple persistence layers in the same way, being able to drop that data anywhere and get low-latency hits, I think is valuable.
So yeah, we're pretty excited about moving into those different spaces.
[00:46:54] Alex: Yeah. Very cool. I want to close out with some business stuff and just hear about where you're at as a company and as a team. So yeah, just start with that: where are you at as a company? Have you raised funding? How big is your team?
[00:47:05] Ben: Yeah, we've raised funding. We're a small team, fewer than ten people, fully distributed across North America, Spain, Germany, and London, and we've been around for about two and a half years. That first year was really about getting our first product to market, which was the MySQL product, after about a year, maybe 15 months.
And then from there it's the scaling challenges we've been focused on. It's one thing to actually build this, but making it fast and making it scale is a whole other level, so there was a lot of work there before we started adding additional databases. We've raised some seed money,
three and a half million dollars to date, and that's really allowed us to do a lot with a small team, which is by design. We've built a fast and efficient platform now. From a high-level roadmap perspective, so far it's all been about getting the data out of the cache at the right time: making sure we're evicting at the right time, because you don't want to serve people stale data, that's not a good use of a cache. Where we're going in the future is being clever about pre-loading and pre-warming data. You can think about use cases around personalization, for example: logging into your cell phone provider account or your bank account, you're likely to press one of these buttons across the top, or whatever you did previously.
We can go preload that data; we can go pull that data in. And that's really exciting, because then you're using it as a cache, but it becomes a bit more than a cache. You're saying, well, I've got a persistence layer here that can handle any type of query, and it's always going to be running fast, right?
It's always going to be bringing in the data that you need. So that intelligence layer, we can crowdsource that across all users, and that's pretty exciting. That's where we're focused, as well as moving into those different layers I already mentioned. But yeah, we're a small, distributed team with lots of hard problems to deal with.
[00:49:29] Alex: Yeah. Very cool. It's so much fun to see what you can do with a small team that's intensely focused on a hard problem, making some awesome progress.
[00:49:36] Ben: It is. And it gives us that agility, it really does. We can pivot really quickly onto projects that come at us. That said, there are definitely huge benefits from having a supporting team, there really are, but getting the right people is challenging, and getting the skill sets you need at the right time is challenging.
But yeah, it definitely gives you advantages over people that raised a lot more money and have gone out and scaled up much larger team sizes, and there are definitely downsides to that.
[00:50:13] Alex: And a lot of them are zombie companies if they raised at too much of a valuation and have had to grow into it over the last couple of years.
[00:50:21] Ben: The last couple of years have been pretty crazy from a raise perspective. So yeah, we've been pretty tight on where we've invested. We're not out at every event that we'd like to be, but we're building a good product; that's the focus.
[00:50:36] Alex: Great. So one thing: I've talked to a few cloud native database companies, and one thing I always ask them is how they get people to trust them, given that they're a new company dealing with their customers' data. It's a little less of a concern since you're a cache, right?
You're not the primary, permanent, persistent store. But one thing you mentioned is that, being a read-through, your uptime is my uptime now, right? Your availability is my availability. How did you…
Convincing Customers
Yeah. And so I imagine you spend a bunch of time thinking about how to make that better and be highly available on your end.
But how do you convince customers or just deal with their feelings on some of that stuff? How do you approach that?
[00:51:21] Ben: Yeah, it's a really good point. It is front and center, and it should be, for anyone who's plugging in a tool like PolyScale. It's interesting: I was having this conversation a couple of weeks ago with a prospect, and they were asking, how do we get comfortable with what you're doing here? When I dug into their specific scenario, it was quite interesting.
They were already proxying all of their SQL data through a security company that was doing PII analysis and a whole bunch of other stuff. And I said, well, okay, how did you get comfortable with that? Because you're doing exactly the same thing there. Anyway, long story short, I think the focus is that you do it piecemeal.
You take a specific function, or whatever it may be, and you say, let's start routing that through PolyScale. From an integration perspective, PolyScale is just a connection string: rather than going directly to your database, you're going to PolyScale.
And then that gets routed onto your origin. So what's cool is that within your application, whether it's a serverless function or even within your monolith, you can have that sort of dual connectivity: you can route some traffic through PolyScale and other traffic not through PolyScale. But obviously you start with your development and staging environment.
So nine times out of ten, people are going to plug it into a dev or staging environment. If they want to use the cloud environment, the serverless environment, that's a good starting point because it's easy: just connect it and have a play. That allows you to start to build confidence with the platform. People first want to test the smart invalidation, to see that working, and that's a really easy thing to test. But yeah, definitely people start piece by piece. They look at certain features or functions or specific use cases: I'm just going to break out this one query to run on Cloudflare Workers,
and I want to run that through PolyScale because I can run it fast everywhere. That's a great use case and it's easy to do. But yeah, you've got to build that confidence, you've got to build that trust. And that comes down to the infrastructure we provide, having the high availability and failover built in, and we're very public about that.
We've had downtime over the past couple of years, definitely; that's been our fault, and we're also at the mercy of all of the providers we work with across AWS, GCP, and Azure. But what's nice is that if you think about a classic database TCP connection,
it's designed around the fact that it's typical to lose that connection and reconnect. You're potentially going across the public internet; you've got no control over routing or packet loss. ORMs, and whatever reconnect logic is default within a database environment, handle that: if you lose a connection, another one is going to get initiated by the client software.
They're good at that, right? That's what they're good at. So if you are in a situation where you do experience downtime and you switch over to another environment a couple of seconds later, that architecture has to be robust, right? That DNS behavior has to be robust.
So yeah, I think starting small is the answer, and it's definitely a challenge. People start with… pieces of their queries going through this, then the whole environment. So yeah, it's definitely start small and grow.
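A small sketch of that "dual connectivity" idea: keep a direct connection to the origin alongside one that goes through the cache proxy, and move read paths over piece by piece. The connection strings and queries below are made-up placeholders for illustration.

```typescript
// Sketch of piecemeal adoption via two connection pools. Hostnames and
// credentials are placeholders; both pools speak plain Postgres.
import { Pool } from "pg";

const direct = new Pool({ connectionString: "postgres://app:secret@db.example.com:5432/shop" });
const cached = new Pool({ connectionString: "postgres://app:secret@cache-proxy.example.com:5432/shop" });

// Start small: route one read-heavy query through the cache proxy...
export function topProducts() {
  return cached.query("SELECT id, name FROM products ORDER BY sales DESC LIMIT 10");
}

// ...while the rest of the app keeps talking straight to the origin until
// you've built confidence in hit rates and invalidation behavior.
export function createOrder(userId: number, productId: number) {
  return direct.query(
    "INSERT INTO orders (user_id, product_id) VALUES ($1, $2)",
    [userId, productId],
  );
}
```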
[00:54:56] Alex: Yeah. Yeah. On that same note, are most new customers for PolyScale people that were previously using a cache and were just like, hey, I don't want the operational burden of doing all this work? Or are they people that are new to caching generally, and this is a much easier way to do it than having to go back and instrument their code?
[00:55:13] Ben: Yeah, it's interesting, because when we started this, it was: okay, people want to solve that latency issue, that's the number one thing they're after. I'm going to have a distributed app, multi-region is coming, and everyone's running at the edge in the next couple of years.
And really what we found is there are three use cases, and you can't predict which one people are going to come in with. There's the classic: okay, my queries are slow, which is very traditional. My database is on fire for whatever reason; it could be concurrency or indexes or whatever.
Then there's the latency one: I'm in one or more regions and I need to reduce that network latency. And the third one is cost savings. You could look at this and say that's pretty traditional for a cache, but the fact that you can plug PolyScale into your entire application really does yield quite large cost savings, because if I'm serving 75% of my reads from somewhere other than my database, I can either go do more with that resource and serve my writes faster, or I can reduce that infrastructure spend.
So those three things really show up across the board, which isn't a great answer, but we definitely see all three from different types of customers.
[00:56:26] Alex: Yeah. Absolutely. Well, Ben, I appreciate this conversation. It's been a lot of fun and just learning about it and just seeing all the interesting things that you're doing. I think the operational model, the sort of visibility into what's happening in my application, the global distribution, in addition to just like the smart caching work that's happening there.
I think there's a lot of interesting stuff there. If people want to find out more about PolyScale or about you, what's the best place to find you?
[00:56:51] Ben: Yeah. So obviously the website, PolyScale.ai. You can email me, ben at PolyScale.ai, and we're on Twitter and all the usual channels. We also have a Discord channel. So yeah, definitely reach out; it'd be great to connect with people.
[00:57:05] Alex: All right. Sounds great, Ben. Thanks for coming on
[00:57:07] Ben: Great. Thanks for your time, Alex. Great to meet you.
[00:57:08] Alex: Yep. Bye.


