I previously wrote an article about decentralized transactions that mentioned, with only a brief description, that Redis does not provide data persistence. Today I will dive deeper into this topic and look at how Redis AOF works.
TL;DR
Redis does not guarantee zero data loss, even when the strictest setting is turned on.
The strictest setting here is AOF with fsync at every query. Before getting to the reason, I will introduce AOF in more detail.
Redis AOF
AOF (Append Only File) is a mechanism similar to an LSM tree. After the Redis server finishes processing a command from a client, it appends a log entry for that command to the end of the log file. Three settings determine when these log entries are actually written to the hard disk.
- No fsync at all
- fsync every second
- fsync at every query
The first option performs best, and each following option performs worse. According to the experiment, the last option performs even worse than an LSM tree.
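To make the three settings concrete, here is a minimal sketch of a toy append-only log in Python. It is not Redis code: the class name AppendLog, the policy strings and the fsync_calls counter are my own assumptions for illustration, and the fsync runs inline here, whereas a real server would typically hand it to a background thread.

```python
import os
import time


class AppendLog:
    """Toy append-only log; `policy` mirrors the three fsync settings above."""

    def __init__(self, path, policy="everysec"):
        if policy not in ("no", "everysec", "always"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.last_fsync = time.monotonic()
        self.fsync_calls = 0  # hypothetical counter, only for inspection

    def append(self, command: bytes) -> None:
        # Hand the bytes to the OS; they may still sit in the page cache.
        os.write(self.fd, command + b"\n")
        if self.policy == "always":
            self._fsync()
        elif self.policy == "everysec" and time.monotonic() - self.last_fsync >= 1.0:
            self._fsync()
        # policy == "no": leave flushing entirely to the operating system.

    def _fsync(self) -> None:
        # Ask the OS to push the file's data down to the storage device.
        os.fsync(self.fd)
        self.last_fsync = time.monotonic()
        self.fsync_calls += 1

    def close(self) -> None:
        os.close(self.fd)
```

With the "always" policy every append pays for one fsync, which is where the performance cost of the strictest setting comes from.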
From the official manual, it looks like the data is durable when AOF is enabled. Actually, it is not. The catch is when AOF kicks in: if you look closely, you will find that AOF starts after commands are processed. It is not like the WAL (Write-Ahead Log) of other databases. Therefore, if a command has finished processing and the system then crashes, you will lose that operation.
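The difference between the two orderings can be sketched in a few lines of Python. This is only a model of the argument above, not how any real database is implemented: the function names are hypothetical, a plain list stands in for a durable log file, and the crash flag simulates a failure between the two steps.

```python
class Crash(Exception):
    """Simulated crash between the two steps."""


def apply_then_log(store, log, key, value, crash=False):
    # AOF-style ordering as described above: execute first, log afterwards.
    store[key] = value
    if crash:
        raise Crash()
    log.append((key, value))


def log_then_apply(store, log, key, value, crash=False):
    # WAL-style ordering: record the intent first, then execute.
    log.append((key, value))
    if crash:
        raise Crash()
    store[key] = value


def recover(log):
    # After a restart, in-memory state is rebuilt only from the log.
    store = {}
    for key, value in log:
        store[key] = value
    return store
```

If apply_then_log crashes between its two steps, the command was executed in memory but recover(log) knows nothing about it. With log_then_apply the same crash still leaves the entry in the log, so recovery can replay it.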
fsync every second
fsync at every query has a large performance impact, hence we usually adopt fsync every second. This is also the default value when Redis AOF is enabled. This raises another question:
Does this mean I will only lose one second of data?
The answer is no. AOF writes logs to the disk in two steps.
- WRITE: Write data to file
- SAVE: fsync, i.e., write file to disk
From the implementation of Redis, the flow is as follows:
- Scenario 1: Return without WRITE and SAVE
- Scenario 2: WRITE, but no SAVE
- Scenario 3: WRITE and SAVE
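The three scenarios can be modeled as a small decision function. Treat this as a simplified sketch under my own assumptions rather than the actual Redis source: the conditions (a SAVE still running in the background, a WRITE postponed for at most 2 seconds, a new SAVE at most once per second) are my reading of the flow, and the function and parameter names are hypothetical.

```python
def everysec_step(now, fsync_in_progress, postponed_since, last_fsync):
    """Return (action, postponed_since) for one flush attempt.

    All times are in seconds. `postponed_since` is None when no WRITE is
    currently being postponed.
    """
    if fsync_in_progress:
        if postponed_since is None:
            # First time we see a running fsync: start postponing the WRITE.
            return "return", now
        if now - postponed_since < 2:
            # Scenario 1: still within the 2-second grace period.
            return "return", postponed_since
        # Scenario 2: waited long enough; WRITE anyway, but no new SAVE.
        return "write", None
    if now - last_fsync >= 1:
        # Scenario 3: no fsync running and at least a second has passed.
        return "write+save", None
    # No fsync running, but the last one finished under a second ago.
    return "write", None
```

Under this model, a slow disk can keep a WRITE postponed for up to 2 seconds, so data from that whole window exists only in memory.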
To sum up, if the system crashes, you can lose up to 2 seconds of data. This means the description in the official manual is incorrect: "appendfsync everysec: fsync every second. Fast enough (in 2.4 likely to be as fast as snapshotting), and you can lose 1 second of data if there is a disaster."
As an aside, I attended EuropeCloud Summit 2021, and on day 2 there was a session called Accelerating Application Modernization and Cloud Migration with Redis. One of the slides said: Zero Data Loss around 9+ years in production. I was curious how they did it, so I asked my question; to this day I have not received an answer.
Here is my conclusion: using Redis AOF makes your data much more durable, but it cannot guarantee zero data loss. If you want to persist the data as much as possible, you have to use not only AOF but also replicas, as well as RDB, at the same time.