Building a memory-assisted AI application is a pain; Cloudflare made latency less of a worry with Vectorize

In short, Cloudflare released a sentence embedding model that runs directly in their Workers edge runtime, which is a big deal if you're trying to cut down on query times while using something like OpenAI or another LLM API over the network. When I first started playing around with internal client-side apps leveraging these generative APIs, I can't tell you how bummed I was to learn that I not only needed to host a DB, I also had to spin up infrastructure to transform queries into vectors, or ping OpenAI to do the same. That turns a simple search query into a multi-request affair, which may not fly when the application is sensitive to latency.
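
To make that concrete, here's a minimal sketch of the single-request version: a Worker that embeds the query with Workers AI and searches a Vectorize index in the same invocation. The binding names (AI, VECTORIZE), the index itself, and the bge-base model id are my assumptions; they're whatever you've configured in wrangler.toml for your project.

```ts
// Minimal sketch: embed the query at the edge with Workers AI, then
// search a Vectorize index, all within a single Worker request.
// Assumed setup: an AI binding named AI and a Vectorize binding named
// VECTORIZE (both declared in wrangler.toml), plus the
// @cf/baai/bge-base-en-v1.5 embedding model. Adjust to your project.
// Types (Ai, VectorizeIndex) come from @cloudflare/workers-types.

export interface Env {
  AI: Ai;
  VECTORIZE: VectorizeIndex;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const query = new URL(request.url).searchParams.get("q") ?? "";

    // Turn the query text into a vector right here, no extra hop to
    // OpenAI or to a self-hosted embedding service.
    const embedding = (await env.AI.run("@cf/baai/bge-base-en-v1.5", {
      text: [query],
    })) as { data: number[][] };

    // Ask Vectorize for the closest stored vectors.
    const results = await env.VECTORIZE.query(embedding.data[0], {
      topK: 5,
    });

    return Response.json(results.matches);
  },
};
```

One request in, one response out: the embedding and the nearest-neighbour lookup both happen at the edge, with no separate embedding service or self-hosted vector DB in the path.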

Having a more turnkey setup is super useful when your requirements aren't sophisticated enough to justify a shedload of services in your infrastructure.

Leave it to Cloudflare to give you not everything you'd find on AWS, but enough modern primitives to accomplish everything you'd do there and more 👏
