Advanced Search (re)indexing is causing GitLab performance issues
Issue
While Advanced Search is (re)indexing, GitLab becomes slow or unresponsive. This may manifest in the following ways:
- Repository files and commits take a long time to load, or do not load at all.
- The GitLab interface becomes sluggish and sometimes returns a 500 error.
- The server(s) running Sidekiq or Gitaly report high CPU utilization.
Environment
Impacted offerings:
- GitLab Self-Managed
Cause
(Re)indexing Advanced Search triggers bulk indexing background jobs, which can saturate the CPU on the Sidekiq and Gitaly servers as they retrieve project and repository data for storage in Elasticsearch. This in turn degrades GitLab performance.
Resolution
Set limits on Advanced Search (re)indexing in Admin Area > Settings > Search > Advanced Search. The settings to adjust are:
- Maximum bulk request size (MiB): Configures how much data must be collected (and stored in memory) in a given indexing process before submitting the payload to the Elasticsearch Bulk API.
- Bulk request concurrency: Configures how many indexer processes (or threads) can run in parallel to collect data.
A very conservative starting point would be:
- Maximum bulk request size (MiB): 2
- Bulk request concurrency: 1
If GitLab remains responsive at these values, increase them gradually to speed up the reindexing operation.
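The two limits above can also be applied through the application settings API instead of the Admin Area. The following is a minimal sketch, not an official procedure. It assumes the settings API attributes are named elasticsearch_max_bulk_size_mb and elasticsearch_max_bulk_concurrency, and it uses a placeholder host and administrator token; verify the attribute names against the API documentation for your GitLab version before relying on it.

```python
import json
import urllib.request


def build_settings_request(base_url, token, bulk_size_mb=2, bulk_concurrency=1):
    """Build (but do not send) a PUT request for the application settings API."""
    # Assumed attribute names for the two Advanced Search limits; confirm
    # them in the settings API documentation for your GitLab version.
    payload = {
        "elasticsearch_max_bulk_size_mb": bulk_size_mb,
        "elasticsearch_max_bulk_concurrency": bulk_concurrency,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v4/application/settings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"PRIVATE-TOKEN": token, "Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    # Hypothetical host and token, for illustration only.
    request = build_settings_request("https://gitlab.example.com", "<admin-token>")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # To apply the change, send it with urllib.request.urlopen(request).
```

The sketch only builds the request so that the payload can be reviewed first; the defaults mirror the conservative starting point of 2 MiB and a concurrency of 1.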
Additional information
- Increasing the values of Maximum bulk request size (MiB) and Bulk request concurrency can negatively impact Sidekiq performance. Return them to their default values if you see increased scheduling_latency_s durations in your Sidekiq logs.
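
To compare scheduling_latency_s before and after changing the limits, the values can be summarized from the Sidekiq logs. The sketch below assumes the logs are JSON-formatted with one object per line, and that the log file path is passed as the first command-line argument; the function name is illustrative and not part of GitLab.

```python
import json
import sys
from statistics import mean


def scheduling_latency_summary(lines):
    """Return (count, mean, max) of scheduling_latency_s across JSON log lines."""
    latencies = []
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue  # skip JSON values that are not log objects
        value = entry.get("scheduling_latency_s")
        # Keep numeric values only; entries without the field are ignored.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            latencies.append(float(value))
    if not latencies:
        return (0, 0.0, 0.0)
    return (len(latencies), mean(latencies), max(latencies))


if __name__ == "__main__" and len(sys.argv) > 1:
    # The log location varies by installation, so it is taken from the command line.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        count, average, peak = scheduling_latency_summary(log_file)
    print(f"jobs={count} mean={average:.3f}s max={peak:.3f}s")
```

Running it on logs captured at each setting gives a rough comparison; a rising mean or maximum after an increase suggests returning to the previous values.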