Deploying a Redis Cluster on a Cloud Server and Connecting Clients via the Public IP


2024-07-10 15:11 | Source: compiled from the web

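Contents
1. Configuration files
2. Starting the services and creating the cluster
    (1) Start the six Redis services
    (2) Create the cluster with the client command
3. Client connection
    (1) Client configuration
    (2) Test case
    (3) Error log analysis
4. Fixing the problem
    (1) Check the redis.conf configuration file
    (2) Modify the configuration file
    (3) Restart the Redis services and create the cluster
5. Lettuce client connection problems during failover
    (1) Test case
    (2) Stop one of the master nodes to simulate a crash
    (3) Solutions
        1) Switch to a different Redis client
        2) Configure Redis Cluster topology refresh for the Lettuce client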

1. Configuration files

Six configuration files were prepared: redis-6381.conf, redis-6382.conf, redis-6383.conf, redis-6384.conf, redis-6385.conf, and redis-6386.conf. Each file has the following content:

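    # This configuration is trimmed down; compare it with the full redis.conf shipped with Redis for the complete option set.
    # Change the port number in each file accordingly.
    # Run in the background
    daemonize yes
    # Port
    port 6381
    # Bind addresses. Exposing Redis to the public network is not recommended,
    # so only the server's private IP and the loopback address are bound here.
    bind 172.17.0.13 127.0.0.1
    # Directory where Redis stores its data files
    dir /redis/workingDir
    # Log file
    logfile "/redis/logs/cluster-node-6381.log"
    # Enable AOF
    appendonly yes
    # Enable cluster mode
    cluster-enabled yes
    # Cluster state file. It records the state of the other nodes, persistent variables, and so on,
    # and is generated automatically in the dir configured above.
    cluster-config-file cluster-node-6381.conf
    # Maximum time (in milliseconds) a node may be unreachable. If a master is unreachable
    # for longer than this, a failover is triggered.
    cluster-node-timeout 5000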
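Since the six files differ only in the port number, they can be generated instead of edited by hand. The following is a minimal sketch using only the Java standard library; the class name `RedisConfGenerator` and the `render` method are hypothetical helpers introduced for illustration, and the option values are copied from the configuration above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: renders the trimmed configuration shown above for one port. */
class RedisConfGenerator {

    // Private IP taken from the article; change it for your own server.
    static final String PRIVATE_IP = "172.17.0.13";

    /** Returns the configuration text for a single node listening on the given port. */
    static String render(int port) {
        return String.join("\n",
                "daemonize yes",
                "port " + port,
                "bind " + PRIVATE_IP + " 127.0.0.1",
                "dir /redis/workingDir",
                "logfile \"/redis/logs/cluster-node-" + port + ".log\"",
                "appendonly yes",
                "cluster-enabled yes",
                "cluster-config-file cluster-node-" + port + ".conf",
                "cluster-node-timeout 5000",
                "");
    }

    public static void main(String[] args) throws IOException {
        // Output directory is an assumption: first argument, or the current directory.
        Path outDir = Paths.get(args.length > 0 ? args[0] : ".");
        Files.createDirectories(outDir);
        for (int port = 6381; port <= 6386; port++) {
            Path file = outDir.resolve("redis-" + port + ".conf");
            Files.write(file, render(port).getBytes(StandardCharsets.UTF_8));
            System.out.println("wrote " + file);
        }
    }
}
```

Running the class writes redis-6381.conf through redis-6386.conf into the chosen directory. Keeping the template in one place makes it harder for the port, log file, and cluster-config-file entries to drift apart.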

Note: the Redis version is 6.0.4.

2. Starting the services and creating the cluster

(1) Start the six Redis services

    redis-server redis-6381.conf
    redis-server redis-6382.conf
    redis-server redis-6383.conf
    redis-server redis-6384.conf
    redis-server redis-6385.conf
    redis-server redis-6386.conf

(2) Create the cluster with the client command

Create the cluster, assigning one replica to each master node:

    redis-cli --cluster create \
      172.17.0.13:6381 172.17.0.13:6382 172.17.0.13:6383 \
      172.17.0.13:6384 172.17.0.13:6385 172.17.0.13:6386 \
      --cluster-replicas 1

3. Client connection

(1) Client configuration

    import java.util.Arrays;

    import io.lettuce.core.ReadFrom;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.data.redis.connection.RedisClusterConfiguration;
    import org.springframework.data.redis.connection.RedisConnectionFactory;
    import org.springframework.data.redis.connection.lettuce.LettuceClientConfiguration;
    import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;

    @Configuration
    public class RedisClusterConfig {

        @Bean
        public RedisConnectionFactory redisConnectionFactory() {
            // Read/write splitting: prefer replicas for reads
            LettuceClientConfiguration clientConfig = LettuceClientConfiguration.builder()
                    .readFrom(ReadFrom.REPLICA_PREFERRED)
                    .build();
            // Seed nodes, addressed by the server's public IP
            RedisClusterConfiguration redisClusterConfiguration = new RedisClusterConfiguration(Arrays.asList(
                    "122.51.151.130:6381",
                    "122.51.151.130:6382",
                    "122.51.151.130:6383",
                    "122.51.151.130:6384",
                    "122.51.151.130:6385",
                    "122.51.151.130:6386"));
            return new LettuceConnectionFactory(redisClusterConfiguration, clientConfig);
        }
    }

(2) Test case

    import org.junit.Test;
    import org.junit.runner.RunWith;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.boot.test.context.SpringBootTest;
    import org.springframework.data.redis.core.StringRedisTemplate;
    import org.springframework.test.context.junit4.SpringRunner;

    @RunWith(SpringRunner.class)
    @SpringBootTest(classes = Application.class)
    public class RedisClusterTest {

        @Autowired
        private StringRedisTemplate stringRedisTemplate;

        @Test
        public void readFromReplicaWriteToMasterTest() {
            System.out.println("Setting value...");
            stringRedisTemplate.opsForValue().set("username", "Nick");
            System.out.println("Got value: " + stringRedisTemplate.opsForValue().get("username"));
        }
    }

(3) Error log analysis

    2020-08-14 14:57:49.180 WARN 22012 --- [ioEventLoop-6-4] i.l.c.c.topology.ClusterTopologyRefresh : Unable to connect to [172.17.0.13:6384]: connection timed out: /172.17.0.13:6384
    2020-08-14 14:57:49.180 WARN 22012 --- [ioEventLoop-6-3] i.l.c.c.topology.ClusterTopologyRefresh : Unable to connect to [172.17.0.13:6383]: connection timed out: /172.17.0.13:6383
    2020-08-14 14:57:49.182 WARN 22012 --- [ioEventLoop-6-2] i.l.c.c.topology.ClusterTopologyRefresh : Unable to connect to [172.17.0.13:6382]: connection timed out: /172.17.0.13:6382
    2020-08-14 14:57:49.182 WARN 22012 --- [ioEventLoop-6-1] i.l.c.c.topology.ClusterTopologyRefresh : Unable to connect to [172.17.0.13:6381]: connection timed out: /172.17.0.13:6381
    2020-08-14 14:57:49.190 WARN 22012 --- [ioEventLoop-6-1] i.l.c.c.topology.ClusterTopologyRefresh : Unable to connect to [172.17.0.13:6385]: connection timed out: /172.17.0.13:6385
    2020-08-14 14:57:49.191 WARN 22012 --- [ioEventLoop-6-2] i.l.c.c.topology.ClusterTopologyRefresh : Unable to connect to [172.17.0.13:6386]: connection timed out: /172.17.0.13:6386
    2020-08-14 14:57:59.389 WARN 22012 --- [ioEventLoop-6-3] i.l.core.cluster.RedisClusterClient : connection timed out: /172.17.0.13:6382
    2020-08-14 14:58:09.391 WARN 22012 --- [ioEventLoop-6-4] i.l.core.cluster.RedisClusterClient : connection timed out: /172.17.0.13:6381
    2020-08-14 14:58:19.393 WARN 22012 --- [ioEventLoop-6-1] i.l.core.cluster.RedisClusterClient : connection timed out: /172.17.0.13:6383
    2020-08-14 14:58:29.396 WARN 22012 --- [ioEventLoop-6-2] i.l.core.cluster.RedisClusterClient : connection timed out: /172.17.0.13:6384
    2020-08-14 14:58:39.399 WARN 22012 --- [ioEventLoop-6-3] i.l.core.cluster.RedisClusterClient : connection timed out: /172.17.0.13:6386
    2020-08-14 14:58:49.402 WARN 22012 --- [ioEventLoop-6-4] i.l.core.cluster.RedisClusterClient : connection timed out: /172.17.0.13:6385

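The client used here is Lettuce. The log shows that the public IP we configured has been replaced by the private IP. The client takes node addresses from the Redis cluster itself: the configured addresses are only seed nodes, and the topology it discovers through them lists each node under the address that node announces. Because the nodes announce 172.17.0.13, which is not reachable from outside the cloud network, every connection times out. The fix is to make the cluster return the public IP.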
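To see which addresses the cluster hands out, look at the second field of each line of `CLUSTER NODES` output (the node's cluster-config-file uses the same layout). That field has the form `ip:port@bus-port`. The sketch below parses that field with plain Java; the class name `ClusterNodesParser` and its method are hypothetical, and the sample lines are taken from the node file shown later in this article.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: extracts the advertised "ip:port" of every node line. */
class ClusterNodesParser {

    static List<String> advertisedAddresses(String clusterNodesOutput) {
        List<String> result = new ArrayList<>();
        for (String line : clusterNodesOutput.split("\\R")) {
            String[] fields = line.trim().split("\\s+");
            // Node lines look like: <id> <ip:port@bus-port> <flags> ...
            // Anything else (for example the trailing "vars ..." line) is skipped.
            if (fields.length < 3 || !fields[1].contains("@")) {
                continue;
            }
            result.add(fields[1].substring(0, fields[1].indexOf('@')));
        }
        return result;
    }

    public static void main(String[] args) {
        String sample =
                "34287d78c1e9c4ff49880bb976707a0c17676f82 172.17.0.13:6384@16384 slave "
                        + "1a206270f835a79e43e281df5f6f8215ab49d713 0 1597390563209 4 connected\n"
                        + "1a206270f835a79e43e281df5f6f8215ab49d713 172.17.0.13:6381@16381 myself,master "
                        + "- 0 1597390564000 1 connected 0-5460\n";
        // Each printed address is what a cluster-aware client will try to connect to.
        for (String address : advertisedAddresses(sample)) {
            System.out.println(address);
        }
    }
}
```

If the printed addresses are private IPs, a client outside the cloud network cannot reach them, which matches the timeouts in the log above.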

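4. Fixing the problem

(1) Check the redis.conf configuration file

The options for making Redis announce a public IP can be found in redis.conf. The section below is mainly aimed at special deployments such as Docker, but it also lets us specify manually the public IP, the client port, and the cluster bus port (by default the client port plus 10000) that a node announces.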

    ########################## CLUSTER DOCKER/NAT support ########################

    # In certain deployments, Redis Cluster nodes address discovery fails, because
    # addresses are NAT-ted or because ports are forwarded (the typical case is
    # Docker and other containers).
    #
    # In order to make Redis Cluster working in such environments, a static
    # configuration where each node knows its public address is needed. The
    # following two options are used for this scope, and are:
    #
    # * cluster-announce-ip
    # * cluster-announce-port
    # * cluster-announce-bus-port
    #
    # Each instruct the node about its address, client port, and cluster message
    # bus port. The information is then published in the header of the bus packets
    # so that other nodes will be able to correctly map the address of the node
    # publishing the information.
    #
    # If the above options are not used, the normal Redis Cluster auto-detection
    # will be used instead.
    #
    # Note that when remapped, the bus port may not be at the fixed offset of
    # clients port + 10000, so you can specify any port and bus-port depending
    # on how they get remapped. If the bus-port is not set, a fixed offset of
    # 10000 will be used as usually.
    #
    # Example:
    #
    # cluster-announce-ip 10.1.1.5
    # cluster-announce-port 6379
    # cluster-announce-bus-port 6380

(2) Modify the configuration file

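Once the public IP is specified manually, the nodes in the cluster also talk to each other through the public IP, that is, over the external network. The cluster bus ports (16381 and so on in the configuration below) must therefore be opened in the cloud server's security group alongside the client ports. Otherwise the cluster stays in the fail state.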
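As a quick sanity check before recreating the cluster, the ports that need a security group rule can be listed and probed from an outside machine. The sketch below assumes the default bus port offset of 10000 described in redis.conf and uses only the Java standard library; the class name `ClusterPortCheck` and its methods are hypothetical. A successful TCP connect only shows that the port is open, not that the cluster is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists and probes the ports a Redis Cluster on one host needs. */
class ClusterPortCheck {

    // Default offset between the client port and the cluster bus port.
    static final int BUS_PORT_OFFSET = 10000;

    static int busPort(int clientPort) {
        return clientPort + BUS_PORT_OFFSET;
    }

    /** Client ports first, then the matching bus ports. */
    static List<Integer> portsToOpen(int firstPort, int lastPort) {
        List<Integer> ports = new ArrayList<>();
        for (int p = firstPort; p <= lastPort; p++) {
            ports.add(p);
        }
        for (int p = firstPort; p <= lastPort; p++) {
            ports.add(busPort(p));
        }
        return ports;
    }

    /** Returns true if a TCP connection can be established within the timeout. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Host defaults to the public IP used in this article; pass your own as the first argument.
        String host = args.length > 0 ? args[0] : "122.51.151.130";
        for (int port : portsToOpen(6381, 6386)) {
            System.out.println(host + ":" + port + " reachable=" + isReachable(host, port, 2000));
        }
    }
}
```

For the six nodes in this article the list contains 6381 to 6386 and 16381 to 16386, twelve ports in total.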

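    # This configuration is trimmed down; compare it with the full redis.conf shipped with Redis for the complete option set.
    # Change the port number in each file accordingly.
    # Run in the background
    daemonize yes
    # Port
    port 6381
    # Bind addresses. Exposing Redis to the public network is not recommended,
    # so only the server's private IP and the loopback address are bound here.
    bind 172.17.0.13 127.0.0.1
    # Directory where Redis stores its data files
    dir /redis/workingDir
    # Log file
    logfile "/redis/logs/cluster-node-6381.log"
    # Enable AOF
    appendonly yes
    # Enable cluster mode
    cluster-enabled yes
    # Cluster state file. It records the state of the other nodes, persistent variables, and so on,
    # and is generated automatically in the dir configured above.
    cluster-config-file cluster-node-6381.conf
    # Maximum time (in milliseconds) a node may be unreachable. If a master is unreachable
    # for longer than this, a failover is triggered.
    cluster-node-timeout 5000
    # On a cloud server, the public IP must be announced explicitly
    cluster-announce-ip 122.51.151.130
    # Cluster bus port, used to communicate with the other nodes
    cluster-announce-bus-port 16381

(3) Restart the Redis services and create the cluster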

At this point we can compare the node configuration file cluster-node-6381.conf before and after the change.

Before the public IP was specified:

    [universe@VM_0_13_centos workingDir]$ cat cluster-node-6381.conf
    34287d78c1e9c4ff49880bb976707a0c17676f82 172.17.0.13:6384@16384 slave 1a206270f835a79e43e281df5f6f8215ab49d713 0 1597390563209 4 connected
    e306ae5e3ead5f2a837d3bdc0b95c0bd8e3cff99 172.17.0.13:6383@16383 master - 0 1597390565212 3 connected 10923-16383
    0932cc203a19f37a3f5ebca8278962f5b325c67e 172.17.0.13:6385@16385 slave 2cc1aed536ff5b48c2fdd94f16cd96cefc4fd4ef 0 1597390564711 5 connected
    2cc1aed536ff5b48c2fdd94f16cd96cefc4fd4ef 172.17.0.13:6382@16382 master - 0 1597390565000 2 connected 5461-10922
    1a206270f835a79e43e281df5f6f8215ab49d713 172.17.0.13:6381@16381 myself,master - 0 1597390564000 1 connected 0-5460
    0f63accb455594d0625cffa8d09aacc580d7e428 172.17.0.13:6386@16386 slave e306ae5e3ead5f2a837d3bdc0b95c0bd8e3cff99 0 1597390564210 6 connected

After the public IP was specified:

    [universe@VM_0_13_centos workingDir]$ cat cluster-node-6381.conf
    e2691ffd4bf7d867bc91b3b91c7b233a5f1e5dd2 122.51.151.130:6384@16384 master - 0 1597389992286 7 connected 10923-16383
    511668874d39a7b1f701cc3df6f21d00510bfeae 122.51.151.130:6383@16383 slave e2691ffd4bf7d867bc91b3b91c7b233a5f1e5dd2 0 1597389991283 7 connected
    e77e540ef4115abe920fb191f354b81f42e7b4ed 122.51.151.130:6381@16381 myself,master - 0 1597389991000 1 connected 0-5460
    2a3ea359311b34cd59e10da7d2f1bba48403f0ee 122.51.151.130:6385@16385 slave e77e540ef4115abe920fb191f354b81f42e7b4ed 0 1597389990583 5 connected
    2bf4f01a4dba802eb1a50d9510947a4af0ac92ef 122.51.151.130:6382@16382 master - 0 1597389992789 2 connected 5461-10922
    2b7671e002143b329c9c6c969bfb825a86fb41b2 122.51.151.130:6386@16386 slave 2bf4f01a4dba802eb1a50d9510947a4af0ac92ef 0 1597389991784 6 connected
    vars currentEpoch 7 lastVoteEpoch 7

Every node now announces the public IP. Running the test case again, everything works.

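5. Lettuce client connection problems during failover

(1) Test case

    import org.junit.Test;
    import org.junit.runner.RunWith;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.boot.test.context.SpringBootTest;
    import org.springframework.data.redis.core.StringRedisTemplate;
    import org.springframework.test.context.junit4.SpringRunner;

    @RunWith(SpringRunner.class)
    @SpringBootTest(classes = Application.class)
    public class RedisClusterTest {

        @Autowired
        private StringRedisTemplate stringRedisTemplate;

        @Test
        public void automaticFailoverTest() throws InterruptedException {
            int count = 0;
            // Runs until the test is stopped manually: write, read back, sleep
            while (true) {
                try {
                    stringRedisTemplate.opsForValue().set("count", String.valueOf(++count));
                    System.out.println("Set count to: " + count);
                    System.out.println("Read count: " + stringRedisTemplate.opsForValue().get("count"));
                    Thread.sleep(2000);
                } catch (Exception e) {
                    // Any failure is treated as a possible failover; wait and try again
                    System.out.println("A failover may have occurred, retrying...");
                    Thread.sleep(3000);
                }
            }
        }
    }

(2) Stop one of the master nodes to simulate a crash

The log looks like this: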

    2020-08-20 19:33:25.118 INFO 13696 --- [xecutorLoop-1-1] i.l.core.protocol.ConnectionWatchdog : Reconnecting, last destination was /122.51.151.130:6384
    2020-08-20 19:33:26.213 WARN 13696 --- [ioEventLoop-6-1] i.l.core.protocol.ConnectionWatchdog : Cannot reconnect to [122.51.151.130:6384]: Connection refused: no further information: /122.51.151.130:6384
    2020-08-20 19:33:31.015 INFO 13696 --- [xecutorLoop-1-2] i.l.core.protocol.ConnectionWatchdog : Reconnecting, last destination was 122.51.151.130:6384
    2020-08-20 19:33:32.107 WARN 13696 --- [ioEventLoop-6-2] i.l.core.protocol.ConnectionWatchdog : Cannot reconnect to [122.51.151.130:6384]: Connection refused: no further information: /122.51.151.130:6384
    2020-08-20 19:33:36.616 INFO 13696 --- [xecutorLoop-1-2] i.l.core.protocol.ConnectionWatchdog : Reconnecting, last destination was 122.51.151.130:6384
    2020-08-20 19:33:37.709 WARN 13696 --- [ioEventLoop-6-2] i.l.core.protocol.ConnectionWatchdog : Cannot reconnect to [122.51.151.130:6384]: Connection refused: no further information: /122.51.151.130:6384
    2020-08-20 19:33:42.016 INFO 13696 --- [xecutorLoop-1-4] i.l.core.protocol.ConnectionWatchdog : Reconnecting, last destination was 122.51.151.130:6384
    2020-08-20 19:33:43.110 WARN 13696 --- [ioEventLoop-6-4] i.l.core.protocol.ConnectionWatchdog : Cannot reconnect to [122.51.151.130:6384]: Connection refused: no further information: /122.51.151.130:6384
    2020-08-20 19:33:47.216 INFO 13696 --- [xecutorLoop-1-1] i.l.core.protocol.ConnectionWatchdog : Reconnecting, last destination was 122.51.151.130:6384
    2020-08-20 19:33:48.317 WARN 13696 --- [ioEventLoop-6-1] i.l.core.protocol.ConnectionWatchdog : Cannot reconnect to [122.51.151.130:6384]: Connection refused: no further information: /122.51.151.130:6384
    2020-08-20 19:33:56.515 INFO 13696 --- [xecutorLoop-1-2] i.l.core.protocol.ConnectionWatchdog : Reconnecting, last destination was 122.51.151.130:6384
    2020-08-20 19:33:57.605 WARN 13696 --- [ioEventLoop-6-2] i.l.core.protocol.ConnectionWatchdog : Cannot reconnect to [122.51.151.130:6384]: Connection refused: no further information: /122.51.151.130:6384
    2020-08-20 19:34:14.016 INFO 13696 --- [xecutorLoop-1-3] i.l.core.protocol.ConnectionWatchdog : Reconnecting, last destination was 122.51.151.130:6384
    2020-08-20 19:34:15.113 WARN 13696 --- [ioEventLoop-6-3] i.l.core.protocol.ConnectionWatchdog : Cannot reconnect to [122.51.151.130:6384]: Connection refused: no further information: /122.51.151.130:6384
    A failover may have occurred, retrying...
    2020-08-20 19:34:45.116 INFO 13696 --- [xecutorLoop-1-4] i.l.core.protocol.ConnectionWatchdog : Reconnecting, last destination was 122.51.151.130:6384
    2020-08-20 19:34:46.212 WARN 13696 --- [ioEventLoop-6-4] i.l.core.protocol.ConnectionWatchdog : Cannot reconnect to [122.51.151.130:6384]: Connection refused: no further information: /122.51.151.130:6384
    2020-08-20 19:35:16.216 INFO 13696 --- [xecutorLoop-1-1] i.l.core.protocol.ConnectionWatchdog : Reconnecting, last destination was 122.51.151.130:6384
    2020-08-20 19:35:17.310 WARN 13696 --- [ioEventLoop-6-1] i.l.core.protocol.ConnectionWatchdog : Cannot reconnect to [122.51.151.130:6384]: Connection refused: no further information: /122.51.151.130:6384
    A failover may have occurred, retrying...

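Even after a long wait, the client is still stuck trying to reconnect to the stopped node. Something seems off with the Lettuce client.

(3) Solutions

1) Switch to a different Redis client

After switching the client to Jedis and simulating the master crash again, the client connection recovered after a short while.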

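    import java.util.Arrays;

    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.data.redis.connection.RedisClusterConfiguration;
    import org.springframework.data.redis.connection.RedisConnectionFactory;
    import org.springframework.data.redis.connection.jedis.JedisConnectionFactory;

    @Configuration
    public class RedisClusterConfig {

        @Bean
        public RedisConnectionFactory redisConnectionFactory() {
            RedisClusterConfiguration redisClusterConfiguration = new RedisClusterConfiguration(Arrays.asList(
                    "122.51.151.130:6381",
                    "122.51.151.130:6382",
                    "122.51.151.130:6383",
                    "122.51.151.130:6384",
                    "122.51.151.130:6385",
                    "122.51.151.130:6386"));
            return new JedisConnectionFactory(redisClusterConfiguration);
        }
    }

2) Configure Redis Cluster topology refresh for the Lettuce client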

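Does Lettuce really not support reconnecting after a master/replica switch? Of course it does. The Lettuce wiki on GitHub covers its Redis Cluster support at the following addresses:

https://github.com/lettuce-io/lettuce-core/wiki/Redis-Cluster
https://github.com/lettuce-io/lettuce-core/wiki/Client-options#cluster-specific-options

Next, modify the client configuration as the documentation suggests: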

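    import java.time.Duration;
    import java.util.Arrays;

    import io.lettuce.core.ClientOptions;
    import io.lettuce.core.cluster.ClusterClientOptions;
    import io.lettuce.core.cluster.ClusterTopologyRefreshOptions;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.data.redis.connection.RedisClusterConfiguration;
    import org.springframework.data.redis.connection.RedisConnectionFactory;
    import org.springframework.data.redis.connection.lettuce.LettuceClientConfiguration;
    import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;

    @Configuration
    public class RedisClusterConfig {

        @Bean
        public RedisConnectionFactory redisConnectionFactory() {
            // Enable adaptive and periodic cluster topology refresh. Without them, when the master
            // of a slot range goes down the service stays unavailable until that node comes back.
            ClusterTopologyRefreshOptions clusterTopologyRefreshOptions = ClusterTopologyRefreshOptions.builder()
                    // Enable all adaptive refresh triggers; without adaptive refresh,
                    // changes in the Redis cluster lead to connection errors
                    .enableAllAdaptiveRefreshTriggers()
                    // Adaptive refresh timeout (default 30 seconds)
                    .adaptiveRefreshTriggersTimeout(Duration.ofSeconds(30))
                    // Periodic refresh is off by default; enabled without an argument, the period is
                    // 60 seconds (ClusterTopologyRefreshOptions.DEFAULT_REFRESH_PERIOD).
                    // enablePeriodicRefresh(Duration.ofSeconds(2)) is equivalent to
                    // enablePeriodicRefresh().refreshPeriod(Duration.ofSeconds(2))
                    .enablePeriodicRefresh(Duration.ofSeconds(20))
                    .build();

            ClientOptions clientOptions = ClusterClientOptions.builder()
                    .topologyRefreshOptions(clusterTopologyRefreshOptions)
                    .build();

            // Lettuce client configuration carrying the cluster options above
            // (no readFrom is set in this version)
            LettuceClientConfiguration clientConfig = LettuceClientConfiguration.builder()
                    .clientOptions(clientOptions)
                    .build();

            RedisClusterConfiguration redisClusterConfiguration = new RedisClusterConfiguration(Arrays.asList(
                    "122.51.151.130:6381",
                    "122.51.151.130:6382",
                    "122.51.151.130:6383",
                    "122.51.151.130:6384",
                    "122.51.151.130:6385",
                    "122.51.151.130:6386"));
            return new LettuceConnectionFactory(redisClusterConfiguration, clientConfig);
        }
    }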

With the configuration changed, run the test case again and simulate the master crash. This time the client reconnects.
