AI Architecture (Experiment)

GitLab has created a common set of tools to support our product groups and their utilization of AI. Our goals with this common architecture are:

  1. Increase the velocity of feature teams by providing a set of high-quality, ready-to-use tools
  2. Make it quick and easy to switch underlying technologies

AI is moving very quickly, and we need to be able to keep pace with changes in the area. We have built an abstraction layer to do this, allowing us to take a more “pluggable” approach to the underlying models, data stores, and other technologies.

The following diagram shows a simplified view of how the different components in GitLab interact. The abstraction layer helps avoid code duplication across the REST APIs in the AI API block.

SaaS-based AI abstraction layer

GitLab currently operates a cloud-hosted AI architecture. We are exploring how self-managed instances integrate with it.

There are two primary reasons for the cloud-hosted approach: the best AI models are cloud-based because they often depend on specialized hardware, and operating self-managed infrastructure capable of running AI at scale with appropriate performance is a significant undertaking. We are actively tracking self-managed customers interested in AI.

Supported technologies

As part of the AI working group, we have been investigating various technologies and vetting them. Below is a list of the tools which have been reviewed and already approved for use within the GitLab application.

It is possible to use other models or technologies, but they must go through a review process before use. Use the AI Project Proposal template as part of your idea and include the new tools required to support it.

Models

The following models have been approved for use:

Vector stores

The following vector stores have been approved for use:

  • pgvector is a Postgres extension that adds support for storing vector embeddings and running approximate nearest neighbor (ANN) searches.

Indexing Update

We are currently using sequential scan, which provides perfect recall. We are considering adding an index if we can ensure that it still produces accurate results, as noted in the pgvector indexing documentation.
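To illustrate why a sequential scan gives perfect recall, here is a minimal Ruby sketch of exact nearest-neighbor search: every stored vector is compared with the query, so the true closest matches are never missed. The function names, the sample data, and the choice of cosine distance are assumptions made for illustration only. This is not the code GitLab or pgvector runs.

```ruby
# Cosine distance (1 - cosine similarity) between two equal-length,
# non-zero numeric vectors. Illustrative helper, not a pgvector API.
def cosine_distance(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  1.0 - dot / (norm_a * norm_b)
end

# Exact k-nearest-neighbor search: scans every row, so recall is perfect,
# but the cost grows linearly with the number of rows.
def exact_nearest(rows, query, k)
  rows.min_by(k) { |_id, vector| cosine_distance(vector, query) }.map(&:first)
end

# Hypothetical sample data: id => embedding
rows = {
  "doc_a" => [1.0, 0.0],
  "doc_b" => [0.0, 1.0],
  "doc_c" => [0.9, 0.1]
}

puts exact_nearest(rows, [1.0, 0.0], 2).inspect
```

An approximate index trades some of this recall for speed by inspecting only part of the data, which is why any index needs to be checked against the exact results.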

Given that the table contains thousands of entries, indexing with these updated settings would likely improve search speed while maintaining high accuracy. However, more testing may be needed to verify the optimal configuration for this dataset size before deploying to production.

A draft MR has been created to update the index.

The index function has been updated to improve search quality. This was tested locally by setting the ivfflat.probes value to 10, executing the following SQL SET statement through the model's database connection:

Embedding::TanukiBotMvc.connection.execute("SET ivfflat.probes = 10")

Raising the probes value improves the recall of searches that use the index, at the cost of query speed, as per the neighbor documentation.

As a starting point for probes and lists values:

  • For lists, start with rows / 1000 for tables with up to 1 million rows and sqrt(rows) for larger tables.
  • For probes, start with lists / 10 for tables with up to 1 million rows and sqrt(lists) for larger tables.
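The rules of thumb above can be sketched as a small Ruby helper. The method name `suggested_ivfflat_settings`, the rounding choices, the minimum of 1, and the table, column, and operator class in the generated SQL are all assumptions for illustration. Treat the output as a starting point to tune, not a verified configuration.

```ruby
# Hypothetical helper (not part of GitLab or pgvector): derives starting
# values for ivfflat `lists` and `probes` from a table's row count.
def suggested_ivfflat_settings(rows)
  if rows <= 1_000_000
    lists = [rows / 1000, 1].max   # integer division; keep at least one list
    probes = [lists / 10, 1].max   # keep at least one probe
  else
    lists = Math.sqrt(rows).round
    probes = Math.sqrt(lists).round
  end

  { lists: lists, probes: probes }
end

settings = suggested_ivfflat_settings(50_000)

# Table name, column name, and operator class are placeholders.
create_index_sql = "CREATE INDEX ON embeddings_table USING ivfflat " \
                   "(embedding vector_cosine_ops) WITH (lists = #{settings[:lists]})"
set_probes_sql = "SET ivfflat.probes = #{settings[:probes]}"

puts create_index_sql
puts set_probes_sql
```

For a table with only a few thousand rows, these formulas yield a handful of lists and a single probe, so the results should be compared with a sequential scan before relying on them.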