Redis development guidelines
Redis instances
GitLab uses Redis for the following distinct purposes:
- Caching (mostly via
Rails.cache
). - As a job processing queue with Sidekiq.
- To manage the shared application state.
- To store CI trace chunks.
- As a Pub/Sub queue backend for ActionCable.
- Rate limiting state storage.
- Sessions.
In most environments (including the GDK), all of these point to the same Redis instance.
On GitLab.com, we use separate Redis instances. See the Redis SRE guide for more details on our setup.
Every application process is configured to use the same Redis servers, so they can be used for inter-process communication in cases where PostgreSQL is less appropriate. For example, transient state or data that is written much more often than it is read.
If Geo is enabled, each Geo node gets its own, independent Redis database.
We have development documentation on adding a new Redis instance.
Key naming
Redis is a flat namespace with no hierarchy, which means we must pay attention
to key names to avoid collisions. Typically we use colon-separated elements to
provide a semblance of structure at application level. An example might be
projects:1:somekey
.
Although we split our Redis usage by purpose into distinct categories, and those may map to separate Redis servers in a Highly Available configuration like GitLab.com, the default Omnibus and GDK setups share a single Redis server. This means that keys should always be globally unique across all categories.
It is usually better to use immutable identifiers - project ID rather than full path, for instance - in Redis key names. If full path is used, the key stops being consulted if the project is renamed. If the contents of the key are invalidated by a name change, it is better to include a hook that expires the entry, instead of relying on the key changing.
Multi-key commands
GitLab supports Redis Cluster only for the Redis rate-limiting type, introduced in epic 823.
This imposes an additional constraint on naming: where GitLab is performing operations that require several keys to be held on the same Redis server - for instance, diffing two sets held in Redis - the keys should ensure that by enclosing the changeable parts in curly braces. For example:
project:{1}:set_a
project:{1}:set_b
project:{2}:set_c
set_a
and set_b
are guaranteed to be held on the same Redis server, while set_c
is not.
Currently, we validate this in the development and test environments
with the RedisClusterValidator
,
which is enabled for the cache
and shared_state
Redis instances..
Developers are highly encouraged to use hash-tags where appropriate to facilitate future adoption of Redis Cluster in more Redis types. For example, the Namespace model uses hash-tags for its config cache keys.
Redis in structured logging
For GitLab Team Members: There are basic and advanced videos that show how you can work with the Redis structured logging fields on GitLab.com.
Our structured logging for web requests and Sidekiq jobs contains fields for the duration, call count, bytes written, and bytes read per Redis instance, along with a total for all Redis instances. For a particular request, this might look like:
Field | Value |
---|---|
json.queue_duration_s |
0.01 |
json.redis_cache_calls |
1 |
json.redis_cache_duration_s |
0 |
json.redis_cache_read_bytes |
109 |
json.redis_cache_write_bytes |
49 |
json.redis_calls |
2 |
json.redis_duration_s |
0.001 |
json.redis_read_bytes |
111 |
json.redis_shared_state_calls |
1 |
json.redis_shared_state_duration_s |
0 |
json.redis_shared_state_read_bytes |
2 |
json.redis_shared_state_write_bytes |
206 |
json.redis_write_bytes |
255 |
As all of these fields are indexed, it is then straightforward to
investigate Redis usage in production. For instance, to find the
requests that read the most data from the cache, we can just sort by
redis_cache_read_bytes
in descending order.
The slow log
NOTE: There is a video showing how to see the slow log (GitLab internal) on GitLab.com
On GitLab.com, entries from the Redis slow log are available in the
pubsub-redis-inf-gprd*
index with the redis.slowlog
tag.
This shows commands that have taken a long time and may be a performance
concern.
The
fluent-plugin-redis-slowlog
project is responsible for taking the slowlog
entries from Redis and
passing to Fluentd (and ultimately Elasticsearch).
Analyzing the entire keyspace
The Redis Keyspace Analyzer project contains tools for dumping the full key list and memory usage of a Redis instance, and then analyzing those lists while eliminating potentially sensitive data from the results. It can be used to find the most frequent key patterns, or those that use the most memory.
Currently this is not run automatically for the GitLab.com Redis instances, but is run manually on an as-needed basis.
N+1 calls problem
Introduced in
spec/support/helpers/redis_commands/recorder.rb
viaf696f670
RedisCommands::Recorder
is a tool for detecting Redis N+1 calls problem from tests.
Redis is often used for caching purposes. Usually, cache calls are lightweight and cannot generate enough load to affect the Redis instance. However, it is still possible to trigger expensive cache recalculations without knowing that. Use this tool to analyze Redis calls, and define expected limits for them.
Create a test
It is implemented as a ActiveSupport::Notifications
instrumenter.
You can create a test that verifies that a testable code only makes a single Redis call:
it 'avoids N+1 Redis calls' do
control = RedisCommands::Recorder.new { visit_page }
expect(control.count).to eq(1)
end
or a test that verifies the number of specific Redis calls:
it 'avoids N+1 sadd Redis calls' do
control = RedisCommands::Recorder.new { visit_page }
expect(control.by_command(:sadd).count).to eq(1)
end
You can also provide a pattern to capture only specific Redis calls:
it 'avoids N+1 Redis calls to forks_count key' do
control = RedisCommands::Recorder.new(pattern: 'forks_count') { visit_page }
expect(control.count).to eq(1)
end
You also can use special matchers exceed_redis_calls_limit
and
exceed_redis_command_calls_limit
to define an upper limit for
a number of Redis calls:
it 'avoids N+1 Redis calls' do
control = RedisCommands::Recorder.new { visit_page }
expect(control).not_to exceed_redis_calls_limit(1)
end
it 'avoids N+1 sadd Redis calls' do
control = RedisCommands::Recorder.new { visit_page }
expect(control).not_to exceed_redis_command_calls_limit(:sadd, 1)
end
These tests can help to identify N+1 problems related to Redis calls, and make sure that the fix for them works as expected.
See also
Utility classes
We have some extra classes to help with specific use cases. These are
mostly for fine-grained control of Redis usage, so they wouldn't be used
in combination with the Rails.cache
wrapper: we'd either use
Rails.cache
or these classes and literal Redis commands.
We prefer using Rails.cache
so we can reap the benefits of future
optimizations done to Rails. Ruby objects are
marshalled
when written to Redis, so we must pay attention to store neither huge objects,
nor untrusted user input.
Typically we would only use these classes when at least one of the following is true:
- We want to manipulate data on a non-cache Redis instance.
-
Rails.cache
does not support the operations we want to perform.
Gitlab::Redis::{Cache,SharedState,Queues}
These classes wrap the Redis instances (using
Gitlab::Redis::Wrapper
)
to make it convenient to work with them directly. The typical use is to
call .with
on the class, which takes a block that yields the Redis
connection. For example:
# Get the value of `key` from the shared state (persistent) Redis
Gitlab::Redis::SharedState.with { |redis| redis.get(key) }
# Check if `value` is a member of the set `key`
Gitlab::Redis::Cache.with { |redis| redis.sismember(key, value) }
Gitlab::Redis::Boolean
In Redis, every value is a string.
Gitlab::Redis::Boolean
makes sure that booleans are encoded and decoded consistently.
Gitlab::Redis::HLL
The Redis PFCOUNT
,
PFADD
, and
PFMERGE
commands operate on
HyperLogLogs, a data structure that allows estimating the number of unique
elements with low memory usage. For more information,
see HyperLogLogs in Redis.
Gitlab::Redis::HLL
provides a convenient interface for adding and counting values in HyperLogLogs.
Gitlab::SetCache
For cases where we need to efficiently check the whether an item is in a group
of items, we can use a Redis set.
Gitlab::SetCache
provides an #include?
method that uses the
SISMEMBER
command, as well as #read
to fetch all entries in the set.
This is used by the
RepositorySetCache
to provide a convenient way to use sets to cache repository data like branch
names.
Background migration
Redis-based migrations involve using the SCAN
command to scan the entire Redis instance for certain key patterns.
For large Redis instances, the migration might exceed the time limit
for regular or post-deployment migrations. RedisMigrationWorker
performs long-running Redis migrations as a background migration.
To perform a background migration by creating a class:
module Gitlab
module BackgroundMigration
module Redis
class BackfillCertainKey
def perform(keys)
# implement logic to clean up or backfill keys
end
def scan_match_pattern
# define the match pattern for the `SCAN` command
end
def redis
# define the exact Redis instance
end
end
end
end
end
To trigger the worker through a post-deployment migration:
class ExampleBackfill < Gitlab::Database::Migration[2.1]
disable_ddl_transaction!
MIGRATION='BackfillCertainKey'
def up
queue_redis_migration_job(MIGRATION)
end
end