tinyvector should be as simple as possible while remaining as powerful as possible. What is the best abstraction for an embedding database that minimizes the complexity and size of the codebase?
We've made a few assumptions so far:
Each table can have only one index, since we're assuming most users won't need multiple indexes on the same data.
Each index is tied to a single table and cannot span multiple tables or use special clauses. This might need to change in the future: do we want to allow indexes to be built on multiple tables or with complex filtering?
Should indexes be immutable by default, forcing manual deletion and recreation instead of in-place updates? We may want to offer a few mutable indexes for compatibility, but it seems more straightforward (from both a performance and a user-experience perspective) to intend for most indexes to be immutable.
Holding all indexes in memory and relying on vertical scaling seems like the simplest way to build tinyvector. In most common use cases, vectors can easily be held in memory on reasonable hardware. If needed, you can apply dimensionality reduction to your vectors to reduce memory usage and improve performance. Is this the right direction?
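To make the assumptions above concrete, here is a minimal sketch of what they could look like in code. This is not tinyvector's actual API: `ImmutableIndex`, `IndexRegistry`, and `estimate_bytes` are hypothetical names, and the brute-force cosine search over float32 vectors is an assumption chosen only because it is the simplest in-memory index to illustrate.

```python
import numpy as np


def estimate_bytes(num_vectors, dim, bytes_per_value=4):
    """Rough memory footprint of a dense index (float32 = 4 bytes per value)."""
    return num_vectors * dim * bytes_per_value


class ImmutableIndex:
    """Hypothetical brute-force cosine index over a single table's vectors."""

    def __init__(self, table_name, ids, vectors):
        vectors = np.asarray(vectors, dtype=np.float32)
        norms = np.linalg.norm(vectors, axis=1, keepdims=True)
        norms[norms == 0] = 1.0  # avoid dividing zero vectors by zero
        self.table_name = table_name
        self.ids = tuple(ids)
        self._vectors = vectors / norms
        # Freeze the array: any in-place write now raises ValueError.
        self._vectors.setflags(write=False)

    @property
    def nbytes(self):
        return self._vectors.nbytes

    def query(self, vector, k=5):
        """Return the k most similar (id, cosine score) pairs."""
        q = np.asarray(vector, dtype=np.float32)
        q = q / (np.linalg.norm(q) or 1.0)
        scores = self._vectors @ q
        top = np.argsort(-scores)[:k]
        return [(self.ids[i], float(scores[i])) for i in top]


class IndexRegistry:
    """Hypothetical registry enforcing one index per table."""

    def __init__(self):
        self._by_table = {}

    def create(self, table_name, ids, vectors):
        if table_name in self._by_table:
            # No in-place update: the caller must drop and recreate.
            raise ValueError(f"table {table_name!r} already has an index")
        index = ImmutableIndex(table_name, ids, vectors)
        self._by_table[table_name] = index
        return index

    def drop(self, table_name):
        del self._by_table[table_name]

    def get(self, table_name):
        return self._by_table[table_name]
```

Under these assumptions, updating an index means calling `drop` and then `create` again with the new vectors. The footprint helper also shows why dimensionality reduction matters for the in-memory approach: `estimate_bytes(1_000_000, 1536)` is about 6.1 GB, and halving the dimension halves that figure.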