Redis in Practice: Master-Slave Deployment, Continued

The previous note on Redis showed how to set up a simple Redis master-slave configuration. It looks simple, but the details are worth exploring. This note digs deeper into how Redis master-slave replication works and analyzes a few scenarios, building a level up on the previous one.

In the earlier example we configured one Master server and two Slave servers; a simple slaveof setting was enough to complete the basic master-slave configuration. So how does the Master actually communicate with the Slaves and synchronize data to them?
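As a quick recap, here is a minimal sketch of that setup, using the same IP addresses as the experiments below: each Slave needs only a single slaveof line in its redis.conf (or the equivalent command issued at runtime from redis-cli).

# In each Slave's redis.conf: point the Slave at the Master
slaveof 172.17.0.2 6379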

Characteristics of Master-Slave Replication

First, let's look at the main characteristics of Redis master-slave replication:

  1. One Master server can be connected to by multiple Slave servers
  2. A Slave can itself have Slaves attached, cascading downward: Master -> Slave -> Slave ...
  3. Read-write separation: the Master can be configured to handle only writes, offloading persistence and reads to the Slaves (see the sketch after this list)
  4. The Master synchronizes data to the Slaves in a non-blocking way
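For point 3, a minimal configuration sketch. slave-read-only is a stock redis.conf option in Redis 3.2 (and yes is its default); disabling snapshots on the Master with save "" is an assumption about how far you want to push persistence off the Master, not something this note's experiments do.

# On each Slave: reject write commands (this is the default in 3.2)
slave-read-only yes

# On the Master: optionally disable RDB snapshots and leave
# persistence entirely to the Slaves
save ""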


The basic principle of master-slave replication in Redis:

When the Master receives a command that stores data, it starts a background process to persist the data to a local file. After a Slave sends a sync request to the Master, the Master sends the persisted data file to the Slave; the Slave saves it to a local file first and then loads it into memory. Synchronization between a Slave and its own sub-Slaves works the same way. If one of the Slaves goes down, then once it reconnects to the Master, the Master sends it a complete data file to synchronize all of the data.
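A quick way to watch this synchronization in action, assuming the three servers from the list below are up: write a key on the Master, then read it back from a Slave.

# Write on the Master
src/redis-cli -h 172.17.0.2 -p 6379 set greeting hello

# Read from a Slave; the key has been replicated
src/redis-cli -h 172.17.0.3 -p 6379 get greeting
# "hello"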

Master-Slave Switchover

Let's think about this more carefully. With the previous setup of one Master and two Slaves, if one Slave goes down, the other Slave can in theory keep things running. But what if the Master goes down? We haven't considered that scenario yet. To keep the master-slave architecture highly available, there must be a way to guard against it.

One solution is to convert a healthy Slave into the Master when the Master fails, keeping the system running normally. Let's verify this approach with an experiment.

Server list:

  • Master server redis-master-slave-master 172.17.0.2, exposing port 6379
  • Slave server redis-master-slave-slave1 172.17.0.3, exposing port 6479
  • Slave server redis-master-slave-slave2 172.17.0.4, exposing port 6579

Start the master and slave servers as configured in the previous note, but this time turn off daemon mode on all three servers so that they print debug information to the foreground in real time. Start Redis with:

[root@058704fbf5ea redis-3.2.9]# src/redis-server ./redis.conf --loglevel debug

The debug output shows that replication between the Master and the Slaves is working normally:

Master server:

18:M 09 Jun 20:56:39.405 * Slave 172.17.0.3:6379 asks for synchronization
18:M 09 Jun 20:56:39.405 * Full resync requested by slave 172.17.0.3:6379
18:M 09 Jun 20:56:39.405 * Starting BGSAVE for SYNC with target: disk
18:M 09 Jun 20:56:39.406 * Background saving started by pid 21
21:C 09 Jun 20:56:39.410 * DB saved on disk
21:C 09 Jun 20:56:39.410 * RDB: 0 MB of memory used by copy-on-write
18:M 09 Jun 20:56:39.435 * Background saving terminated with success
18:M 09 Jun 20:56:39.436 * Synchronization with slave 172.17.0.3:6379 succeeded
18:M 09 Jun 20:56:40.940 - 0 clients connected (1 slaves), 1829176 bytes in use
18:M 09 Jun 20:56:45.954 - 0 clients connected (1 slaves), 1829176 bytes in use
18:M 09 Jun 20:56:50.970 - 0 clients connected (1 slaves), 1829176 bytes in use
18:M 09 Jun 20:56:55.986 - 0 clients connected (1 slaves), 1829176 bytes in use
18:M 09 Jun 20:56:59.250 - Accepted 172.17.0.4:39590
18:M 09 Jun 20:56:59.254 * Slave 172.17.0.4:6379 asks for synchronization
18:M 09 Jun 20:56:59.254 * Full resync requested by slave 172.17.0.4:6379
18:M 09 Jun 20:56:59.254 * Starting BGSAVE for SYNC with target: disk
18:M 09 Jun 20:56:59.255 * Background saving started by pid 22
22:C 09 Jun 20:56:59.259 * DB saved on disk
22:C 09 Jun 20:56:59.260 * RDB: 0 MB of memory used by copy-on-write
18:M 09 Jun 20:56:59.297 * Background saving terminated with success
18:M 09 Jun 20:56:59.297 * Synchronization with slave 172.17.0.4:6379 succeeded
18:M 09 Jun 20:57:01.002 - 0 clients connected (2 slaves), 1850072 bytes in use

Slave1 server:

19:S 09 Jun 20:56:59.248 - 0 clients connected (0 slaves), 759720 bytes in use
19:S 09 Jun 20:56:59.248 * Connecting to MASTER 172.17.0.2:6379
19:S 09 Jun 20:56:59.248 * MASTER <-> SLAVE sync started
19:S 09 Jun 20:56:59.249 * Non blocking connect for SYNC fired the event.
19:S 09 Jun 20:56:59.252 * Master replied to PING, replication can continue...
19:S 09 Jun 20:56:59.254 * Partial resynchronization not possible (no cached master)
19:S 09 Jun 20:56:59.256 * Full resync from master: 46b6a42b6abdac211935188be3b3c7c620a50a40:29
19:S 09 Jun 20:56:59.306 * MASTER <-> SLAVE sync: receiving 76 bytes from master
19:S 09 Jun 20:56:59.306 * MASTER <-> SLAVE sync: Flushing old data
19:S 09 Jun 20:56:59.306 * MASTER <-> SLAVE sync: Loading DB in memory
19:S 09 Jun 20:56:59.307 * MASTER <-> SLAVE sync: Finished with success
19:S 09 Jun 20:57:04.263 - 1 clients connected (0 slaves), 780576 bytes in use

Slave2 server:

18:S 09 Jun 20:56:39.396 * Connecting to MASTER 172.17.0.2:6379
18:S 09 Jun 20:56:39.396 * MASTER <-> SLAVE sync started
18:S 09 Jun 20:56:39.399 * Non blocking connect for SYNC fired the event.
18:S 09 Jun 20:56:39.402 * Master replied to PING, replication can continue...
18:S 09 Jun 20:56:39.404 * Partial resynchronization not possible (no cached master)
18:S 09 Jun 20:56:39.407 * Full resync from master: 46b6a42b6abdac211935188be3b3c7c620a50a40:1
18:S 09 Jun 20:56:39.454 * MASTER <-> SLAVE sync: receiving 76 bytes from master
18:S 09 Jun 20:56:39.455 * MASTER <-> SLAVE sync: Flushing old data
18:S 09 Jun 20:56:39.455 * MASTER <-> SLAVE sync: Loading DB in memory
18:S 09 Jun 20:56:39.455 * MASTER <-> SLAVE sync: Finished with success
18:S 09 Jun 20:56:44.411 - 1 clients connected (0 slaves), 780576 bytes in use

Now let's simulate an unexpected Master outage: shut the Master down by hand by pressing Ctrl+C in its console to kill the server process, then watch what happens on the Slaves once the Master process exits.

The Master server exits:

...
^C18:signal-handler (1497014027) Received SIGINT scheduling shutdown...
18:M 09 Jun 21:13:47.171 # User requested shutdown...
18:M 09 Jun 21:13:47.171 * Saving the final RDB snapshot before exiting.
18:M 09 Jun 21:13:47.176 * DB saved on disk
18:M 09 Jun 21:13:47.176 * Removing the pid file.
18:M 09 Jun 21:13:47.176 # Redis is now ready to exit, bye bye...
...

The Slaves' debug output now shows errors; they can no longer connect to the Master:

...
18:S 09 Jun 21:14:17.625 - 0 clients connected (0 slaves), 820480 bytes in use
18:S 09 Jun 21:14:17.625 * Connecting to MASTER 172.17.0.2:6379
18:S 09 Jun 21:14:17.625 * MASTER <-> SLAVE sync started
18:S 09 Jun 21:14:17.626 * Non blocking connect for SYNC fired the event.
18:S 09 Jun 21:14:17.629 # Error reply to PING from master: '-Reading from master: Operation now in progress'
18:S 09 Jun 21:14:18.628 * Connecting to MASTER 172.17.0.2:6379
18:S 09 Jun 21:14:18.628 * MASTER <-> SLAVE sync started
18:S 09 Jun 21:14:18.630 * Non blocking connect for SYNC fired the event.
18:S 09 Jun 21:14:18.639 # Error reply to PING from master: '-Reading from master: Operation now in progress'
...

At this point, if read-write separation is enabled, that is, slave-read-only yes is set in the Slaves' configuration, no data can be written at all: the Master process has exited, and the Slaves cannot take over its work. We need to promote one of the Slave servers to Master to cope with this kind of failure.
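For instance, with slave-read-only yes in effect, a write attempt against a Slave is rejected on the spot:

172.17.0.3:6379> set greeting hello
(error) READONLY You can't write against a read only slave.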

Suppose we pick Slave1 as the new Master. First, connect to Slave1 with redis-cli:

[root@f2c03e85f3aa redis-3.2.9]# ./src/redis-cli -h 172.17.0.3 -p 6379
172.17.0.3:6379> slaveof no one

Afterwards, observing Slave1, we can see it has been promoted to the Master role:

19:M 09 Jun 21:22:41.598 * MASTER MODE enabled (user request from 'id=3 addr=172.17.0.1:50746 fd=6 name= age=98 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=slaveof')
19:M 09 Jun 21:22:43.999 - 1 clients connected (0 slaves), 779576 bytes in use
19:M 09 Jun 21:22:49.014 - 1 clients connected (0 slaves), 779576 bytes in use
19:M 09 Jun 21:22:54.029 - 1 clients connected (0 slaves), 779576 bytes in use
19:M 09 Jun 21:22:59.045 - 1 clients connected (0 slaves), 779576 bytes in use

Notice that the slaveof command was used here. Roughly, this is what it does:

slaveof host port

Makes the current server a slave of the given host, discarding its existing data and synchronizing the new master's data.

slaveof no one

Switches the server from the slave role back to the master role.

But think about it a bit more: Slave1 has switched to the Master role, yet Slave2 still has a problem; it keeps logging that it cannot reach the old Master. The slaveof command does turn a Slave into a Master, but the other Slaves do not automatically become slaves of the new Master; they have to be repointed by hand (see the sketch below). Apart from that, the only solution I have found is the Sentinel mechanism described next.
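For completeness, a sketch of that manual step: repoint Slave2 at the freshly promoted Slave1 with the same slaveof command (the non-interactive redis-cli form shown here is equivalent to typing the command at the prompt).

# Make Slave2 (172.17.0.4) replicate from the new Master (172.17.0.3)
./src/redis-cli -h 172.17.0.4 -p 6379 slaveof 172.17.0.3 6379
# Slave2 replies OK, flushes its old state, and resyncs from 172.17.0.3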

Sentinel

For the failover problem above, Redis has a dedicated solution: Sentinel (available only since Redis 2.8). Here is the official introduction:

What Redis Sentinel does:

  • Monitoring: Sentinel constantly checks whether your master and slave servers are operating normally.
  • Notification: When a monitored Redis server develops a problem, Sentinel can send notifications to an administrator or other applications via an API.
  • Automatic failover: When a master stops working properly, Sentinel starts an automatic failover: it promotes one of the failed master's slaves to be the new master and reconfigures the failed master's other slaves to replicate the new one. When clients try to connect to the failed master, the cluster also returns the new master's address, so the cluster can carry on using the new master in place of the failed one.

Redis Sentinel is a distributed system: you can run multiple Sentinel processes in a single architecture. These processes use gossip protocols to receive information about whether a master is down, and use agreement protocols to decide whether to perform an automatic failover and which slave to choose as the new master.
Although Redis Sentinel ships as a separate executable, redis-sentinel, it is actually just a Redis server running in a special mode; you can also start a normal Redis server with the --sentinel option to run it as a Sentinel.

Sentinel thus ensures that when the Master fails, a Slave is promoted to Master automatically, with no human intervention, and it does so efficiently, safely, and reliably.

A Highly Available Master-Slave Architecture

Talk is cheap, so let's use sentinel to implement master-slave switchover and build a highly available Redis master-slave architecture. This experiment uses Redis 3.2.9, a fairly recent release at the time of writing.

Add one more server (redis-master-slave-sentinel1) to run the sentinel monitor:

$ docker run -d --name redis-master-slave-sentinel1 redis-master-slave

The full server list this time:

  • Master server redis-master-slave-master2 172.17.0.2
  • Slave server redis-master-slave-slave1 172.17.0.3
  • Slave server redis-master-slave-slave2 172.17.0.4
  • Sentinel server redis-master-slave-sentinel1 172.17.0.5

As we can see, the sentinel executable we compiled earlier is already in the src directory, and the configuration files sit in the top-level directory:

[root@f2c03e85f3aa src]# ls redis-se*
redis-sentinel redis-server
[root@f2c03e85f3aa src]# cd ..
[root@f2c03e85f3aa redis-3.2.9]# ls *.conf
redis.conf sentinel.conf

Before configuring anything, let's look at how sentinel is used. It can be started with:

redis-sentinel /path/to/sentinel.conf

or

redis-server /path/to/sentinel.conf --sentinel

If no configuration file is given to sentinel, or the configuration file is not writable, sentinel will refuse to start.

Before starting it, edit sentinel's configuration file, sentinel.conf. For this experiment the configuration is:

sentinel monitor mymaster 172.17.0.2 6379 1

Use this line to replace the default entry in sentinel.conf, which monitors the host 127.0.0.1.

This line tells Sentinel to monitor the Redis instance at 172.17.0.2, port 6379, and names it mymaster as an identifier. The trailing 1 is the quorum: at least 1 sentinel must consider this master failed before it is treated as down; here we deploy only one sentinel server. Finally, set the slaveof option for the Redis master-slave setup as before. Sentinel needs no configuration for the slaves; it detects them automatically through the Master.
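For reference, here is a slightly fuller sentinel.conf sketch. The extra options below are standard Sentinel settings shown with their 3.2 defaults; this experiment does not tune them.

# Monitor the master at 172.17.0.2:6379 under the name mymaster, quorum 1
sentinel monitor mymaster 172.17.0.2 6379 1

# Mark the master subjectively down after 30s without a valid reply (default)
sentinel down-after-milliseconds mymaster 30000

# How many slaves may resync with the new master simultaneously (default)
sentinel parallel-syncs mymaster 1

# Give up on a failover attempt after 3 minutes (default)
sentinel failover-timeout mymaster 180000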

Then start the Master, the Slaves, and the Sentinel server in turn.

[root@cf3d28d40ca0 redis-3.2.9]# src/redis-server redis.conf
[root@ef58f14977e5 redis-3.2.9]# src/redis-sentinel sentinel.conf

Log output from redis-master-slave-master2:

31:M 11 Jun 15:01:30.127 * DB loaded from disk: 0.000 seconds
31:M 11 Jun 15:01:30.127 * The server is now ready to accept connections on port 6379
31:M 11 Jun 15:02:47.424 * Slave 172.17.0.3:6379 asks for synchronization
31:M 11 Jun 15:02:47.424 * Full resync requested by slave 172.17.0.3:6379
31:M 11 Jun 15:02:47.424 * Starting BGSAVE for SYNC with target: disk
31:M 11 Jun 15:02:47.425 * Background saving started by pid 34
34:C 11 Jun 15:02:47.430 * DB saved on disk
34:C 11 Jun 15:02:47.430 * RDB: 0 MB of memory used by copy-on-write
31:M 11 Jun 15:02:47.473 * Background saving terminated with success
31:M 11 Jun 15:02:47.473 * Synchronization with slave 172.17.0.3:6379 succeeded
31:M 11 Jun 15:03:33.379 * Slave 172.17.0.4:6379 asks for synchronization
31:M 11 Jun 15:03:33.379 * Full resync requested by slave 172.17.0.4:6379
31:M 11 Jun 15:03:33.379 * Starting BGSAVE for SYNC with target: disk
31:M 11 Jun 15:03:33.380 * Background saving started by pid 35
35:C 11 Jun 15:03:33.385 * DB saved on disk
35:C 11 Jun 15:03:33.385 * RDB: 0 MB of memory used by copy-on-write
31:M 11 Jun 15:03:33.419 * Background saving terminated with success
31:M 11 Jun 15:03:33.419 * Synchronization with slave 172.17.0.4:6379 succeeded

Log output from redis-master-slave-slave1:

35:S 11 Jun 15:02:47.423 * Connecting to MASTER 172.17.0.2:6379
35:S 11 Jun 15:02:47.424 * MASTER <-> SLAVE sync started
35:S 11 Jun 15:02:47.424 * Non blocking connect for SYNC fired the event.
35:S 11 Jun 15:02:47.424 * Master replied to PING, replication can continue...
35:S 11 Jun 15:02:47.424 * Partial resynchronization not possible (no cached master)
35:S 11 Jun 15:02:47.425 * Full resync from master: 07640bbbe8d6968bb4331e28cba0a58997b22dd2:1
35:S 11 Jun 15:02:47.473 * MASTER <-> SLAVE sync: receiving 94 bytes from master
35:S 11 Jun 15:02:47.474 * MASTER <-> SLAVE sync: Flushing old data
35:S 11 Jun 15:02:47.474 * MASTER <-> SLAVE sync: Loading DB in memory
35:S 11 Jun 15:02:47.474 * MASTER <-> SLAVE sync: Finished with success

Log output from redis-master-slave-slave2:

29:S 11 Jun 15:03:33.378 * Connecting to MASTER 172.17.0.2:6379
29:S 11 Jun 15:03:33.379 * MASTER <-> SLAVE sync started
29:S 11 Jun 15:03:33.379 * Non blocking connect for SYNC fired the event.
29:S 11 Jun 15:03:33.379 * Master replied to PING, replication can continue...
29:S 11 Jun 15:03:33.379 * Partial resynchronization not possible (no cached master)
29:S 11 Jun 15:03:33.380 * Full resync from master: 07640bbbe8d6968bb4331e28cba0a58997b22dd2:71
29:S 11 Jun 15:03:33.419 * MASTER <-> SLAVE sync: receiving 94 bytes from master
29:S 11 Jun 15:03:33.420 * MASTER <-> SLAVE sync: Flushing old data
29:S 11 Jun 15:03:33.420 * MASTER <-> SLAVE sync: Loading DB in memory
29:S 11 Jun 15:03:33.420 * MASTER <-> SLAVE sync: Finished with success

Log output from redis-master-slave-sentinel1:

24:X 11 Jun 15:04:57.880 # Sentinel ID is 165c2d90256badec3c89f47813fa544d8ae5bf8f
24:X 11 Jun 15:04:57.880 # +monitor master mymaster 172.17.0.2 6379 quorum 1
24:X 11 Jun 15:04:57.881 * +slave slave 172.17.0.3:6379 172.17.0.3 6379 @ mymaster 172.17.0.2 6379
24:X 11 Jun 15:04:57.885 * +slave slave 172.17.0.4:6379 172.17.0.4 6379 @ mymaster 172.17.0.2 6379

From the local machine, use telnet (for example, from a CMD window) to reach into the Docker container and operate on the Master; the port exposed for the Master is 6779:

telnet 192.168.0.101 6779
info replication

The output shows that both Slaves are connected to the Master:

# Replication
role:master
connected_slaves:2
slave0:ip=172.17.0.3,port=6379,state=online,offset=68743,lag=0
slave1:ip=172.17.0.4,port=6379,state=online,offset=68729,lag=0
master_repl_offset:68743
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:68742

Now let's kill the Master process on purpose to simulate a Master outage, and watch how the Slaves and the Sentinel react.

From the local machine, send the shutdown command over telnet to stop the Master:

shutdown

The Master has shut down:

31:M 11 Jun 15:23:39.759 # User requested shutdown...
31:M 11 Jun 15:23:39.759 * Saving the final RDB snapshot before exiting.
31:M 11 Jun 15:23:39.773 * DB saved on disk
31:M 11 Jun 15:23:39.773 * Removing the pid file.
31:M 11 Jun 15:23:39.773 # Redis is now ready to exit, bye bye...

The Sentinel log changes: it detects that the Master has gone down unexpectedly, runs an election, and picks redis-master-slave-slave2 as the new Master:

24:X 11 Jun 15:04:57.880 # +monitor master mymaster 172.17.0.2 6379 quorum 1
24:X 11 Jun 15:04:57.881 * +slave slave 172.17.0.3:6379 172.17.0.3 6379 @ mymaster 172.17.0.2 6379
24:X 11 Jun 15:04:57.885 * +slave slave 172.17.0.4:6379 172.17.0.4 6379 @ mymaster 172.17.0.2 6379
24:X 11 Jun 15:24:09.843 # +sdown master mymaster 172.17.0.2 6379
24:X 11 Jun 15:24:09.843 # +odown master mymaster 172.17.0.2 6379 #quorum 1/1
24:X 11 Jun 15:24:09.843 # +new-epoch 1
24:X 11 Jun 15:24:09.843 # +try-failover master mymaster 172.17.0.2 6379
24:X 11 Jun 15:24:09.850 # +vote-for-leader 165c2d90256badec3c89f47813fa544d8ae5bf8f 1
24:X 11 Jun 15:24:09.850 # +elected-leader master mymaster 172.17.0.2 6379
24:X 11 Jun 15:24:09.850 # +failover-state-select-slave master mymaster 172.17.0.2 6379
24:X 11 Jun 15:24:09.934 # +selected-slave slave 172.17.0.4:6379 172.17.0.4 6379 @ mymaster 172.17.0.2 6379
24:X 11 Jun 15:24:09.934 * +failover-state-send-slaveof-noone slave 172.17.0.4:6379 172.17.0.4 6379 @ mymaster 172.17.0.2 6379
24:X 11 Jun 15:24:10.010 * +failover-state-wait-promotion slave 172.17.0.4:6379 172.17.0.4 6379 @ mymaster 172.17.0.2 6379
24:X 11 Jun 15:24:10.915 # +promoted-slave slave 172.17.0.4:6379 172.17.0.4 6379 @ mymaster 172.17.0.2 6379
24:X 11 Jun 15:24:10.915 # +failover-state-reconf-slaves master mymaster 172.17.0.2 6379
24:X 11 Jun 15:24:11.009 * +slave-reconf-sent slave 172.17.0.3:6379 172.17.0.3 6379 @ mymaster 172.17.0.2 6379
24:X 11 Jun 15:24:11.956 * +slave-reconf-inprog slave 172.17.0.3:6379 172.17.0.3 6379 @ mymaster 172.17.0.2 6379
24:X 11 Jun 15:24:11.957 * +slave-reconf-done slave 172.17.0.3:6379 172.17.0.3 6379 @ mymaster 172.17.0.2 6379
24:X 11 Jun 15:24:12.013 # +failover-end master mymaster 172.17.0.2 6379
24:X 11 Jun 15:24:12.014 # +switch-master mymaster 172.17.0.2 6379 172.17.0.4 6379
24:X 11 Jun 15:24:12.014 * +slave slave 172.17.0.3:6379 172.17.0.3 6379 @ mymaster 172.17.0.4 6379
24:X 11 Jun 15:24:12.014 * +slave slave 172.17.0.2:6379 172.17.0.2 6379 @ mymaster 172.17.0.4 6379

Connect to redis-master-slave-slave2 and look at its replication info:

# Replication
role:master
connected_slaves:1
slave0:ip=172.17.0.3,port=6379,state=online,offset=5266,lag=0
master_repl_offset:5266
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:5265

It really has been promoted to the Master role, and it now has one slave connected, redis-master-slave-slave1.

Next, restart the server that "unexpectedly" went down earlier, redis-master-slave-master2, and observe the Sentinel, the Slaves, and the new Master once more.

The Sentinel prints a new log line:

24:X 11 Jun 15:29:21.038 * +convert-to-slave slave 172.17.0.2:6379 172.17.0.2 6379 @ mymaster 172.17.0.4 6379

Clearly, the restarted redis-master-slave-master2 has become a slave of the new Master instead of being restored to its former Master role.
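We could also confirm this from the restarted server itself; a quick sketch, assuming it is reachable at 172.17.0.2:

# Ask the restarted server for its own view of replication
src/redis-cli -h 172.17.0.2 -p 6379 info replication
# Expect: role:slave, master_host:172.17.0.4, master_port:6379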

The replication info on redis-master-slave-slave2 shows the same picture:

# Replication
role:master
connected_slaves:2
slave0:ip=172.17.0.3,port=6379,state=online,offset=22244,lag=1
slave1:ip=172.17.0.2,port=6379,state=online,offset=22244,lag=0
master_repl_offset:22244
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:22243

That completes this simple experiment. Of course, if you want a more resilient architecture, sentinel supports distributed multi-node deployments: you can add more sentinel servers, and there is no need to configure them with each other's information; they recognize each other and communicate automatically, which is a very nice mechanism.
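A sketch of what that looks like: every Sentinel host gets the same monitor line, with the quorum raised so that a majority must agree before a failover starts, and each one is launched exactly as before.

# Identical sentinel.conf on each of, say, three Sentinel hosts;
# quorum 2: two Sentinels must agree that the master is down
sentinel monitor mymaster 172.17.0.2 6379 2

# Start each Sentinel the same way as before
src/redis-sentinel sentinel.conf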

In fact, Redis Sentinel is not the only way to implement master-slave failover; after all, it was only released in version 2.8, and some setups use Keepalived to achieve the same thing.

Summary

Overall, these notes covered how to handle node failures in a Redis master-slave architecture and how to switch roles between master and slave. The basic setup above can be extended into multi-node and distributed deployments, and sentinel itself can run as multiple nodes forming a cluster to improve disaster tolerance, making this a genuinely highly available architecture. This style of deployment is also quite flexible: master and slave servers can be added easily, which provides a degree of scalability. Of course, everything here is only a simple introduction; for real-world use, you still need to understand what each item in the configuration files does. Next time we will build on this master-slave architecture and move up to a distributed cluster architecture. Onward!