在线时间:8:00-16:00
迪恩网络APP
随时随地掌握行业动态
扫描二维码
关注迪恩网络微信公众号
环境:PostgreSQL 9.2.4 主机:192.25.10.76 从机:192.25.10.71 做postgresql的流复制主从时,会遇到调整max_wal_sengers这个参数,官方文档对这个参数做了一个简要的说明(9.2.4比早先版本多了几句话并做了一些微调),但没有实际的例子。 1.参数说明:
也就是说,这个参数是在主机上设置的,是从机连接到主机的并发连接数之总和,所以这个参数是个正整型。默认值是0,也即默认没有流复制功能。该并发连接数从进程上看,就是各个wal sender进程数之和,可以通过ps -ef|grep senders来查看,所以该值不能超过系统的最大连接数(max_connections,该BUG在9.1.5被修复),可以允许超过实际的流复制用户数。该参数更改需要重启DB,比如我只配了一个从机: [postgres@ndb2 database]$ ps -ef|grep sender postgres 21257 21247 0 20:57 ? 00:00:00 postgres: wal sender process repuser 192.25.10.71(46161) streaming 0/4018ED8 postgres 22193 20949 0 23:02 pts/0 00:00:00 grep sender 2.异常情况很多时候配置主从的时候会遗漏这个参数,或者没有设置正确(比如实际配的从机数超过设置的连接数),这个时候一般会报错 number of requested standby connections exceeds max_wal_senders (currently X): 备机上显示的日志异常: 2013-08-12 20:53:42.132 CST,,,8859,,5208dad6.229b,1,,2013-08-12 20:53:42 CST,,0,FATAL,XX000,"could not connect to the primary server: FATAL: number of requested standby connections exceeds max_wal_senders (currently 0) ",,,,,,,,,"" 2013-08-12 20:53:47.137 CST,,,8861,,5208dadb.229d,1,,2013-08-12 20:53:47 CST,,0,FATAL,XX000,"could not connect to the primary server: FATAL: number of requested standby connections exceeds max_wal_senders (currently 0) ",,,,,,,,,"" 2013-08-12 20:53:52.142 CST,,,8862,,5208dae0.229e,1,,2013-08-12 20:53:52 CST,,0,FATAL,XX000,"could not connect to the primary server: FATAL: number of requested standby connections exceeds max_wal_senders (currently 0) ",,,,,,,,,"" 2013-08-12 20:53:57.148 CST,,,8864,,5208dae5.22a0,1,,2013-08-12 20:53:57 CST,,0,FATAL,XX000,"could not connect to the primary server: FATAL: number of requested standby connections exceeds max_wal_senders (currently 0) ",,,,,,,,," 主机上显示的日志异常: receiver" 2013-08-12 20:43:26.937 CST,,,21064,"",5208d86e.5248,1,"",2013-08-12 20:43:26 CST,,0,LOG,00000,"connection received: host=192.25.10.71 port=46085",,,,,,,,,"" 2013-08-12 20:43:26.938 CST,"repuser","",21064,"192.25.10.71:46085",5208d86e.5248,2,"authentication",2013-08-12 20:43:26 CST,2/1195,0,LOG,00000,"replication connection authorized: user=repuser",,,,,,,,,"" 2013-08-12 20:43:26.938 CST,"repuser","",21064,"192.25.10.71:46085",5208d86e.5248,3,"startup",2013-08-12 20:43:26 CST,2/0,0,FATAL,53300,"number of requested standby connections exceeds max_wal_senders (currently 0)",,,,,,,,,"walreceiver" 2013-08-12 20:43:26.939 CST,"repuser","",21064,"192.25.10.71:46085",5208d86e.5248,4,"startup",2013-08-12 20:43:26 CST,,0,LOG,00000,"disconnection: session time: 0:00:00.002 user=repuser database= host=192.25.10.71 port=46085",,,,,,,,,"wa lreceiver" 2013-08-12 20:43:41.513 CST,,,21066,"",5208d87d.524a,1,"",2013-08-12 20:43:41 CST,,0,LOG,00000,"connection received: host=192.25.10.71 port=46086",,,,,,,,,"" 2013-08-12 20:43:41.514 CST,"repuser","",21066,"192.25.10.71:46086",5208d87d.524a,2,"authentication",2013-08-12 20:43:41 CST,2/1198,0,LOG,00000,"replication connection authorized: user=repuser",,,,,,,,,"" 2013-08-12 20:43:41.514 CST,"repuser","",21066,"192.25.10.71:46086",5208d87d.524a,3,"startup",2013-08-12 20:43:41 CST,2/0,0,FATAL,53300,"number of requested standby connections exceeds max_wal_senders (currently 0)",,,,,,,,,"walreceiver" 2013-08-12 20:43:41.515 CST,"repuser","",21066,"192.25.10.71:46086",5208d87d.524a,4,"startup",2013-08-12 20:43:41 CST,,0,LOG,00000,"disconnection: session time: 0:00:00.002 user=repuser database= host=192.25.10.71 port=46086",,,,,,,,,"wa lreceiver" ... 遇到如此问题,则需要检查postgresql.conf文件的max _wal_senders参数了。 3.参数生效文档上说明需要重启DB使之生效,如果动态使之生效会报错如下: [postgres@proxy1 ]$ psql psql (9.2.4) Type "help" for help. postgres=# show max_wal_senders ; max_wal_senders ----------------- 0 (1 row) postgres=# set max_wal_senders=1; ERROR: parameter "max_wal_senders" cannot be changed without restarting the server postgres=# 补充: postgresql基于流复制 (streaming replication)的warm-standby 实例一枚: Primary:l 归档设置: Wal_level=archive Archive_mode=on archive_command = 'cp -i %p /data/pgsql/archived_wal/%f' l 流复制相关设置: max_wal_senders = '10' #启动复制进程数量限制,必须大于0 max_replication_slots = '10' #为使用replication slot,必须大于0;replication slot作用是保证wal没有同步到standby之前不能从pg_xlog移走; wal_keep_segments = '50' #指定pg_xlog中最少保留的wal数量 select pg_create_physical_replication_slot(‘gp1_a_slot'); #创建replication slot select * from pg_replication_slots; #查询创建的replication slot l 编辑pg_hba.conf # Allow replication connections from localhost, by a user with the replication privilege. #host replication postgres 127.0.0.1/32 trust #host replication postgres ::1/128 trust local replication postgres trust host replication postgres 192.168.12.0/24 trust l 联机备份过程(基础备份) #touch /var/lib/pgsql/backup_in_progress $psql –c "select pg_start_backup('hot_backup');" $tar -cf /var/lib/pgsql/backup.tar /var/lib/pgsql/data/ $psql -c "select pg_stop_backup();" #rm /var/lib/pgsql/backup_in_progress tar -rf /var/lib/pgsql/backup.tar /var/lib/pgsql/archive/ #打包归档 实例: psql -c "select pg_start_backup('pgbk10');" tar -zcf pgbk10.tgz data/ psql -c "select pg_stop_backup();" Standby:l 编辑recovery.conf standby_mode = 'on' primary_conninfo = 'host=192.168.12.38 port=5666 user=postgres' primary_slot_name='gp1_a_slot' #restore_command = 'cp /data/pgsql/archived_wal/%f %p' #archive_cleanup_command = 'pg_archivecleanup /data/pgsql/archived_wal %r' l 将primary上的基础备份传输到standby上 $scp primary: /var/lib/pgsql/backup.tar . 解压备份到standby上的$PGDATA l 启动standby $pg_ctl start –D $PGDATA 启动standby后,postgres开始从primary上接收wal日志进行恢复,并且一直保持恢复状态,psql不能登录; 以上为个人经验,希望能给大家一个参考,也希望大家多多支持极客世界。如有错误或未考虑完全的地方,望不吝赐教。 |
请发表评论