Building a memory-assisted AI application is a pain; Cloudflare made latency less of a worry with Vectorize

In short, Cloudflare released a sentence embedding model that runs directly in their Workers edge runtime, which is a big deal if you're trying to cut down on query times while using something like OpenAI or another LLM API over the network. When I first started playing around with internal client-side apps leveraging these generative APIs, I can't tell you how bummed I was to learn that I not only needed to host a DB, I also had to spin up infrastructure to transform queries into vectors, or ping OpenAI to do the same. That turns a simple search query into a multi-request affair, which may not fly when the application is sensitive to latency.
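
To make that concrete, here's a minimal sketch of the single-request version: a Worker that embeds the query with Workers AI and searches a Vectorize index in the same invocation. The binding names (AI, VECTORIZE), the index itself, and the bge-base model id are my assumptions; they're whatever you've configured in wrangler.toml for your project.

```ts
// Minimal sketch: embed the query at the edge with Workers AI, then
// search a Vectorize index, all within a single Worker request.
// Assumed setup: an AI binding named AI and a Vectorize binding named
// VECTORIZE (both declared in wrangler.toml), plus the
// @cf/baai/bge-base-en-v1.5 embedding model. Adjust to your project.
// Types (Ai, VectorizeIndex) come from @cloudflare/workers-types.

export interface Env {
  AI: Ai;
  VECTORIZE: VectorizeIndex;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const query = new URL(request.url).searchParams.get("q") ?? "";

    // Turn the query text into a vector right here, no extra hop to
    // OpenAI or to a self-hosted embedding service.
    const embedding = (await env.AI.run("@cf/baai/bge-base-en-v1.5", {
      text: [query],
    })) as { data: number[][] };

    // Ask Vectorize for the closest stored vectors.
    const results = await env.VECTORIZE.query(embedding.data[0], {
      topK: 5,
    });

    return Response.json(results.matches);
  },
};
```

One request in, one response out: the embedding and the nearest-neighbour lookup both happen at the edge, with no separate embedding service or self-hosted vector DB in the path.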

Having a more turnkey setup is super useful when your requirements aren't sophisticated enough to justify a shedload of services in your infrastructure.

Leave it to Cloudflare to give you not everything you'd find on AWS, but enough modern primitives to accomplish everything you'd do there and more 👏
