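Preface
I have already written up primary/standby setup many times on this blog, so I will not explain it from scratch here. The replication part below is copied from my earlier posts, followed by the key points of adding keepalived for high availability.
I. Download the PostgreSQL source package and configure the hosts
https://www.postgresql.org/ftp/source/v10.8/
postgresql-10.8.tar.gz
Virtual machine environment:
node1 192.168.159.4
node2 192.168.159.5
VIP 192.168.159.100
The operating system is Red Hat 7.6 and the database is PostgreSQL 10.8.
Configure /etc/hosts on both nodes (address first, then host name):
vi /etc/hosts
192.168.159.4 node1
192.168.159.5 node2
II. Compile and install
(1) Create the postgres user
useradd -m -r -s /bin/bash -u 5432 postgres
(2) Install the required dependency packages
yum install gettext gcc make perl python perl-ExtUtils-Embed readline-devel zlib-devel openssl-devel libxml2-devel cmake gcc-c++ libxslt-devel openldap-devel pam-devel python-devel cyrus-sasl-devel libgcrypt-devel libgpg-error-devel libstdc++-devel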
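Before moving on to configure, it can help to confirm that the core build tools from the package list above are actually on PATH. The following is only a small sketch; the function name missing_tools is my own and not part of any package.

```shell
# Print each named command that cannot be found on PATH.
# Prints nothing when every command is available.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" > /dev/null 2>&1 || echo "$tool"
    done
}

# Possible usage after the yum install step:
#   missing_tools gcc make perl python
```

An empty result means all listed tools were found. Development headers such as readline-devel are not commands, so configure remains the real check for those.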
(3) Configure PostgreSQL
Upload the source package and extract it:
cd /opt/
tar -zxvf postgresql-10.8.tar.gz
cd /opt/postgresql-10.8
./configure --prefix=/opt/postgresql-10.8 --with-segsize=8 --with-wal-segsize=64 --with-wal-blocksize=16 --with-blocksize=16 --with-libedit-preferred --with-perl --with-python --with-openssl --with-libxml --with-libxslt --enable-thread-safety --enable-nls=zh_CN
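Be sure not to pass --enable-profiling or --enable-debug when configuring. These options cause unnecessary logs to be produced,
and those logs grow very quickly and can fill the disk. Deleting them by hand may also have unwanted side effects, so simply leave these options out.
Their descriptions also say they are intended for development and testing, not for production systems.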
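configure build options:
--prefix=PREFIX  installation directory; defaults to /usr/local/pgsql
--bindir=  directory for executables; defaults to PREFIX/bin
--sysconfdir=  directory for configuration files; defaults to PREFIX/etc
--libdir=  directory for libraries; defaults to PREFIX/lib
--includedir=  directory for header files; defaults to PREFIX/include
--datarootdir=DATAROOTDIR  directory for read-only shared files; defaults to PREFIX/share
--mandir=  directory for man pages; defaults to DATAROOTDIR/man
--with-extra-version=STRING  appends STRING to the version number, as a custom version tag
--with-pgport=NUMBER  default port number for server and clients; defaults to 5432
--with-openssl  builds with SSL support; requires the OpenSSL package to be installed first
--with-pam  builds with PAM support
--with-ldap  builds with LDAP support; requires the OpenLDAP package to be installed first
--with-segsize=  segment size in gigabytes (GB); defaults to 1GB. Large tables are split into multiple files of this size
--with-blocksize=  block size, the basic unit of storage and I/O within tables; defaults to 8KB and rarely needs changing; allowed range 1-32KB
--with-wal-segsize=  WAL (Write-Ahead Logging) segment size in megabytes (MB); defaults to 16MB; allowed range 1-64MB
--with-wal-blocksize=  WAL block size, the basic unit of storage and I/O within the write-ahead log, in kilobytes; defaults to 8KB; allowed range 1-64KB
--enable-debug  (Compiles all programs and libraries with debugging symbols. This means that you can run the programs in a debugger to analyze problems.) With GCC it can be used in production; with other compilers it affects performance; mostly used for development
--enable-profiling  with GCC, all programs and libraries are built so they can be profiled; the gmon.out file written when a backend exits is used for profiling; mostly used for development
--enable-coverage  code coverage testing; development only
--enable-cassert  enables assertion checks in the server; development only
--enable-depend  (Enables automatic dependency tracking. With this option, the makefiles are set up so that all affected object files will be rebuilt when any header file is changed.) Development only
--enable-dtrace  builds with support for the DTrace dynamic tracing tool; not currently usable on Linux, but available on FreeBSD and Solaris
--with-systemd  enables systemd support; requires version 9.6 or later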
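The block-size options only accept powers of two inside their ranges, so a value such as 24 is rejected by configure. Here is a minimal sketch of that check; the helper name is_pow2_in_range is hypothetical, and the ranges passed to it are the ones listed above.

```shell
# Return success if $1 is a power of two within [$2, $3].
is_pow2_in_range() {
    local v=$1 min=$2 max=$3
    case $v in ''|*[!0-9]*) return 1 ;; esac   # accept digits only
    [ "$v" -ge "$min" ] && [ "$v" -le "$max" ] || return 1
    # A power of two has exactly one bit set, so v & (v - 1) is zero.
    [ $(( v & (v - 1) )) -eq 0 ]
}

# Possible usage for the values passed to configure in this article:
#   is_pow2_in_range 16 1 32 && echo "--with-blocksize=16 is acceptable"
#   is_pow2_in_range 16 1 64 && echo "--with-wal-blocksize=16 is acceptable"
```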
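If the last few lines of output look like the following, the configuration succeeded; otherwise install the missing dependency packages indicated by the error messages.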
checking thread safety of required library functions... yes
checking whether gcc supports -Wl,--as-needed... yes
configure: using compiler=gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36)
configure: using CFLAGS=-Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -pg -DLINUX_PROFILE -O2
configure: using CPPFLAGS= -D_GNU_SOURCE -I/usr/include/libxml2
configure: using LDFLAGS= -Wl,--as-needed
configure: creating ./config.status
config.status: creating GNUmakefile
config.status: creating src/Makefile.global
config.status: creating src/include/pg_config.h
config.status: creating src/include/pg_config_ext.h
config.status: creating src/interfaces/ecpg/include/ecpg_config.h
config.status: linking src/backend/port/tas/dummy.s to src/backend/port/tas.s
config.status: linking src/backend/port/dynloader/linux.c to src/backend/port/dynloader.c
config.status: linking src/backend/port/posix_sema.c to src/backend/port/pg_sema.c
config.status: linking src/backend/port/sysv_shmem.c to src/backend/port/pg_shmem.c
config.status: linking src/backend/port/dynloader/linux.h to src/include/dynloader.h
config.status: linking src/include/port/linux.h to src/include/pg_config_os.h
config.status: linking src/makefiles/Makefile.linux to src/Makefile.port
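Alternatively, run echo $? immediately afterwards; an output of 0 means success.
(4) Compile
make && make install
If the last few lines of output look like the following, the build succeeded.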
make[2]: 离开目录“/opt/postgresql-10.8/src/test/perl”
/usr/bin/mkdir -p '/opt/postgresql-10.8/lib/pgxs/src'
/usr/bin/install -c -m 644 Makefile.global '/opt/postgresql-10.8/lib/pgxs/src/Makefile.global'
/usr/bin/install -c -m 644 Makefile.port '/opt/postgresql-10.8/lib/pgxs/src/Makefile.port'
/usr/bin/install -c -m 644 ./Makefile.shlib '/opt/postgresql-10.8/lib/pgxs/src/Makefile.shlib'
/usr/bin/install -c -m 644 ./nls-global.mk '/opt/postgresql-10.8/lib/pgxs/src/nls-global.mk'
make[1]: 离开目录“/opt/postgresql-10.8/src”
make -C config install
make[1]: 进入目录“/opt/postgresql-10.8/config”
/usr/bin/mkdir -p '/opt/postgresql-10.8/lib/pgxs/config'
/usr/bin/install -c -m 755 ./install-sh '/opt/postgresql-10.8/lib/pgxs/config/install-sh'
/usr/bin/install -c -m 755 ./missing '/opt/postgresql-10.8/lib/pgxs/config/missing'
make[1]: 离开目录“/opt/postgresql-10.8/config”
PostgreSQL installation complete.
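Alternatively, run echo $? immediately afterwards; an output of 0 means success.
(5) Install
make world && make install-world
If the last few lines of output look like the following, the installation succeeded.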
/usr/bin/mkdir -p '/opt/postgresql-10.8/lib/pgxs/src'
/usr/bin/install -c -m 644 Makefile.global '/opt/postgresql-10.8/lib/pgxs/src/Makefile.global'
/usr/bin/install -c -m 644 Makefile.port '/opt/postgresql-10.8/lib/pgxs/src/Makefile.port'
/usr/bin/install -c -m 644 ./Makefile.shlib '/opt/postgresql-10.8/lib/pgxs/src/Makefile.shlib'
/usr/bin/install -c -m 644 ./nls-global.mk '/opt/postgresql-10.8/lib/pgxs/src/nls-global.mk'
make[1]: 离开目录“/opt/postgresql-10.8/src”
make -C config install
make[1]: 进入目录“/opt/postgresql-10.8/config”
/usr/bin/mkdir -p '/opt/postgresql-10.8/lib/pgxs/config'
/usr/bin/install -c -m 755 ./install-sh '/opt/postgresql-10.8/lib/pgxs/config/install-sh'
/usr/bin/install -c -m 755 ./missing '/opt/postgresql-10.8/lib/pgxs/config/missing'
make[1]: 离开目录“/opt/postgresql-10.8/config”
PostgreSQL installation complete.
make: 离开目录“/opt/postgresql-10.8”
Alternatively, run echo $? immediately afterwards; an output of 0 means success.
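The echo $? checks after configure, make and make install can be folded into one small wrapper that stops at the first failure. This is only a sketch; run_step is a name I made up, and the usage lines assume you are inside the extracted source tree.

```shell
# Run a command under a label and report whether it succeeded.
# Returns the command's own exit status so callers can chain with &&.
run_step() {
    local label=$1
    shift
    echo ">>> $label"
    "$@"
    local rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "FAILED ($rc): $label" >&2
        return "$rc"
    fi
    echo "OK: $label"
}

# Possible usage (each step runs only if the previous one succeeded):
#   run_step configure ./configure --prefix=/opt/postgresql-10.8 &&
#   run_step build     make world &&
#   run_step install   make install-world
```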
(6) Create the required directories and set the environment variables
mkdir -p /home/postgresql10.8/serverlog
mkdir /home/postgresql10.8/pgdata
su - postgres
vi .bash_profile
(Delete everything that is in the file and paste in the following.)
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH
# postgres
# PostgreSQL port
PGPORT=5432
# PostgreSQL data directory
PGDATA=/home/postgresql10.8/pg/pgdata
export PGPORT PGDATA
# Language in use
export.utf8
# PostgreSQL installation directory
export PGHOME=/home/postgresql10.8/pg/pgdata
# PostgreSQL shared libraries
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib:$LD_LIBRARY_PATH
export DATE=`date +"%Y%m%d%H%M"`
# Add the PostgreSQL command-line tools to PATH
export PATH=$PGHOME/bin:$PATH
# PostgreSQL man pages
export MANPATH=$PGHOME/share/man:$MANPATH
# Default PostgreSQL user
export PGUSER=postgres
# Default PostgreSQL host address
export PGHOST=127.0.0.1
# Default database name
export PGDATABASE=postgres
# Log directory
PGLOG="$PGDATA/serverlog"
source .bash_profile
(7) Initialize the database
# Run the database initialization; first, logged in as root:
chown -R postgres:postgres /home/postgresql10.8
su - postgres
/opt/postgresql-10.8/bin/initdb --encoding=utf8 -D /home/postgresql10.8/pg/pgdata/
WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb.
Success. You can now start the database server using:
Start the database:
su - postgres
/opt/postgresql-10.8/bin/pg_ctl -D '/home/postgresql10.8/pg/pgdata/' -l logfile start
(8) Copy the command binaries (as root)
mkdir /home/postgresql10.8/pg/pgdata/bin
cp /opt/postgresql-10.8/bin/* /home/postgresql10.8/pg/pgdata/bin
chown -R postgres:postgres /home/postgresql10.8/pg/pgdata/bin
III. PostgreSQL primary/standby setup
1. Primary configuration
(1) Create a role named replica for replication
su - postgres
psql
CREATE ROLE replica login replication encrypted password 'replica';
(2) Edit pg_hba.conf to define the networks from which replica may connect (append at the end of the file)
vi /home/postgresql10.8/pg/pgdata/pg_hba.conf
host replication replica 192.168.159.0/24 md5
host all replica 192.168.159.0/24 md5
host all all 192.168.159.0/24 md5
host all all 0.0.0.0/0 md5
(3) Change the following settings in the primary's configuration file and leave everything else as it is
vi /home/postgresql10.8/pg/pgdata/postgresql.conf
listen_addresses = '*'
wal_level = hot_standby # hot standby mode
max_wal_senders = 10 # maximum number of streaming replication connections; roughly one per standby
wal_keep_segments = 100 # important setting
wal_send_timeout = 60s
max_connections = 3000 # the standby's max_connections must not be lower than the primary's
archive_mode = on # enable archiving
archive_command = 'cp %p /home/postgresql10.8/pg/archivedir/%f' # adjust to your environment
checkpoint_timeout = 30min
max_wal_size = 2GB
min_wal_size = 1GB
mkdir /home/postgresql10.8/pg/archivedir
2. Standby environment
(1) Empty the standby's data directory
rm -rf /home/postgresql10.8/pg/*
(2) Run the base backup on the standby
/opt/postgresql-10.8/bin/pg_basebackup -F p --progress -D /home/postgresql10.8/pg/pgdata -h 192.168.159.4 -p 5432 -U replica --password
Enter the password replica when prompted.
Note: once the copy completes, you must reset the ownership of everything under the data directory on the standby:
chown -R postgres:postgres /home/postgresql10.8/pg/pgdata
(3) Create the recovery.conf file
cp /opt/postgresql-10.8/share/recovery.conf.sample /home/postgresql10.8/pg/pgdata/recovery.conf
vi /home/postgresql10.8/pg/pgdata/recovery.conf
standby_mode = on
primary_conninfo = 'host=192.168.159.4 port=5432 user=replica password=replica'
recovery_target_timeline = 'latest'
trigger_file = '/home/postgresql10.8/pg/pgdata/trigger.kenyon'
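Since the two nodes swap roles after a failover, it can be handy to generate recovery.conf from a few parameters instead of editing it by hand each time. A minimal sketch follows; render_recovery_conf is a hypothetical helper, and it only prints the four settings used in this article.

```shell
# Print a minimal recovery.conf for a PostgreSQL 10 streaming standby.
# Arguments: primary host, port, replication user, password, trigger file path.
# Note: the password ends up in plain text in the generated file.
render_recovery_conf() {
    local host=$1 port=$2 user=$3 password=$4 trigger=$5
    cat <<EOF
standby_mode = on
primary_conninfo = 'host=$host port=$port user=$user password=$password'
recovery_target_timeline = 'latest'
trigger_file = '$trigger'
EOF
}

# Possible usage on node2 (standby of node1):
#   render_recovery_conf 192.168.159.4 5432 replica replica \
#       /home/postgresql10.8/pg/pgdata/trigger.kenyon \
#       > /home/postgresql10.8/pg/pgdata/recovery.conf
```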
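(4) Edit postgresql.conf
vi /home/postgresql10.8/pg/pgdata/postgresql.conf
listen_addresses = '*'
wal_level = hot_standby
max_connections = 3000 # the standby's value must not be lower than the primary's
hot_standby = on # this machine also serves queries, not just data archiving
max_standby_streaming_delay = 30s
wal_receiver_status_interval = 10s # how often the standby reports its status to the primary
hot_standby_feedback = on # whether the standby sends feedback to the primary about the queries it is running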
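Because a hot standby refuses to start when its max_connections is lower than the primary's, it is worth comparing the two files before starting the standby. The sketch below assumes you have a copy of both postgresql.conf files on one machine and that settings use the "name = value" form; conf_value and standby_connections_ok are names I made up.

```shell
# Print the value of a "name = value" setting from a postgresql.conf-style file.
# The last uncommented occurrence wins, as it does when the server reads the file.
# Simplification: everything after a '#' is treated as a comment.
conf_value() {
    # $1 = setting name, $2 = file
    awk -v name="$1" '
        { sub(/#.*/, "") }
        {
            n = index($0, "=")
            if (n == 0) next
            key = substr($0, 1, n - 1)
            val = substr($0, n + 1)
            gsub(/[ \t]/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (key == name) found = val
        }
        END { if (found != "") print found }
    ' "$2"
}

# Succeed when the standby's max_connections is at least the primary's.
standby_connections_ok() {
    # $1 = primary conf file, $2 = standby conf file
    local p s
    p=$(conf_value max_connections "$1")
    s=$(conf_value max_connections "$2")
    [ -n "$p" ] && [ -n "$s" ] && [ "$s" -ge "$p" ]
}
```

For example, `standby_connections_ok primary.conf standby.conf || echo "raise max_connections on the standby"`, where the two file names are placeholders for your own copies.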
(5) Start the standby
su - postgres
/opt/postgresql-10.8/bin/pg_ctl -D '/home/postgresql10.8/pg/pgdata/' -l logfile start
If it fails to start, copy postmaster.opts from the primary to the standby as follows:
scp /home/postgresql10.8/pg/pgdata/postmaster.opts 192.168.159.5:/home/postgresql10.8/pg/pgdata/
chown -R postgres:postgres /home/postgresql10.8/pg/pgdata/
cd /home/postgresql10.8/pg/
chmod 700 pgdata/
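The chmod 700 matters because PostgreSQL 10 will not start if the data directory is accessible to group or others. A quick way to check this before starting the server is sketched below; datadir_mode_ok is a hypothetical helper name.

```shell
# Succeed when the given path is a directory with mode 0700 (drwx------).
datadir_mode_ok() {
    # The first 10 characters of `ls -ld` are the file type and permission bits.
    [ "$(ls -ld "$1" | cut -c1-10)" = "drwx------" ]
}

# Possible usage:
#   datadir_mode_ok /home/postgresql10.8/pg/pgdata || echo "fix permissions first"
```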
3. Verify replication
Query on the primary:
su - postgres
psql
postgres=# select client_addr,sync_state from pg_stat_replication;
  client_addr  | sync_state
---------------+------------
 192.168.159.5 | async
(1 row)
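The same check can be scripted. The sketch below reads unaligned psql output (psql -At, fields separated by |) and also looks at the state column of pg_stat_replication, which the query above does not select; standby_is_streaming is a name of my own.

```shell
# Read "client_addr|state|sync_state" rows on stdin and succeed if the
# given standby address appears with state "streaming".
standby_is_streaming() {
    awk -F'|' -v addr="$1" '
        $1 == addr && $2 == "streaming" { ok = 1 }
        END { exit !ok }
    '
}

# Possible usage on the primary, as the postgres user:
#   psql -At -c "select client_addr, state, sync_state from pg_stat_replication" |
#       standby_is_streaming 192.168.159.5 && echo "replication is up"
```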
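One thing to watch on production databases is the time zone.
Open the configuration file postgresql.conf
and check the parameter
timezone = 'PRC'
PRC corresponds to the Shanghai time zone.
IV. keepalived high-availability setup
Installation packages: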
keepalived-1.4.2.tar.gz
libnfnetlink-1.0.0-1.el6.x86_64.rpm
libnfnetlink-devel-1.0.0-1.el6.x86_64.rpm
libnl-devel-1.1.4-2.el6.x86_64.rpm
Download locations:
http://www.keepalived.org/software/keepalived-1.4.2.tar.gz
https://centos.pkgs.org/6/centos-x86_64/libnfnetlink-1.0.0-1.el6.i686.rpm.html
http://rpmfind.net/linux/RPM/centos/6.10/x86_64/Packages/libnfnetlink-devel-1.0.0-1.el6.x86_64.html
http://rpmfind.net/linux/RPM/centos/6.10/x86_64/Packages/libnl-devel-1.1.4-2.el6.x86_64.html
Install keepalived
tar zxvf keepalived-1.4.2.tar.gz
cd keepalived-1.4.2
./configure
If you run into the following error:
!!! OpenSSL is not properly installed on your system. !!!
!!! Can not include OpenSSL headers files.
Fix:
yum -y install openssl-devel
openssl version shows the installed OpenSSL version; 1.0.2k is sufficient. Then rerun ./configure and continue:
make
make install
mkdir /etc/keepalived
cp /usr/local/etc/keepalived/keepalived.conf /etc/keepalived/
cp /usr/local/sbin/keepalived /usr/sbin/
The keepalived configuration file on the master is as follows:
vi /etc/keepalived/keepalived.conf
global_defs {
    router_id PG-HA
}
vrrp_script check_run {
    script "/etc/keepalived/pg_check.sh"
    interval 60
}
vrrp_sync_group VG1 {
    group {
        VI_1
    }
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth1
    virtual_router_id 51
    priority 100
    advert_int 1
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1234
    }
    track_script {
        check_run
    }
    virtual_ipaddress {
        192.168.159.100
    }
}
The keepalived configuration file on the slave is as follows:
mkdir /etc/keepalived
vi /etc/keepalived/keepalived.conf
global_defs {
    router_id PG-HA
}
vrrp_script check_run {
    script "/etc/keepalived/pg_check.sh"
    interval 60
}
vrrp_sync_group VG1 {
    group {
        VI_1
    }
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth1
    virtual_router_id 51
    priority 90
    advert_int 1
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1234
    }
    track_script {
        check_run
    }
    virtual_ipaddress {
        192.168.159.100
    }
}
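The two files above are meant to be identical apart from the priority line, so one way to keep them from drifting is to generate both from a single template. This is just a sketch that takes the master's file as the template; render_keepalived_conf is a hypothetical helper, not something keepalived ships.

```shell
# Print the keepalived.conf used in this article for a given VRRP priority.
render_keepalived_conf() {
    local priority=$1
    cat <<EOF
global_defs {
    router_id PG-HA
}
vrrp_script check_run {
    script "/etc/keepalived/pg_check.sh"
    interval 60
}
vrrp_sync_group VG1 {
    group {
        VI_1
    }
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth1
    virtual_router_id 51
    priority $priority
    advert_int 1
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1234
    }
    track_script {
        check_run
    }
    virtual_ipaddress {
        192.168.159.100
    }
}
EOF
}

# Possible usage:
#   render_keepalived_conf 100 > /etc/keepalived/keepalived.conf   # on the master
#   render_keepalived_conf 90  > /etc/keepalived/keepalived.conf   # on the slave
```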
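Notes:
1. The keepalived configuration files on the master and the slave differ only in priority (100 on the master, 90 on the slave); everything else is identical. The file is organized in blocks, each enclosed in {}; lines starting with # or ! are comments.
2. global_defs holds global definitions that apply to keepalived as a whole, whether or not LVS is used.
3. router_id: an identifier for the machine running keepalived.
4. vrrp_script configures the script that monitors the service process.
5. script: the path of the script file.
6. interval: how often the script runs; here, once every 60 seconds.
7. /etc/keepalived/pg_check.sh checks whether the PG service is healthy. When it cannot connect to PG, it kills the keepalived process so that the VIP fails over. The file contents are as follows.
The key script is below.
Master node:
#!/bin/bash
# Exit 0 while PostgreSQL looks healthy; otherwise stop keepalived
# and promote the standby by copying the trigger file to it.
count=1
while true
do
    # i: psql exit status; j: 0 if a postgres process exists
    su - postgres -c "psql -c 'select 1'" > /dev/null 2>&1
    i=$?
    ps aux | grep postgres | grep -v grep > /dev/null 2>&1
    j=$?
    if [ $i = 0 ] && [ $j = 0 ]
    then
        exit 0
    elif [ $i = 1 ] && [ $j = 0 ]
    then
        exit 0
    else
        if [ $count -gt 5 ]
        then
            break
        fi
        let count++
        continue
    fi
done
/etc/init.d/keepalived stop
scp /opt/trigger.kenyon 192.168.159.5:/home/postgresql10.8/pg/pgdata/trigger.kenyon
# Note: create the file /opt/trigger.kenyon yourself beforehand with touch.
# You also need to set up mutual SSH key trust between the two nodes as root.
On both nodes:
mkdir ~/.ssh
ssh-keygen -t rsa
ssh-keygen -t dsa
On one node only:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys    # store this node's public keys in authorized_keys
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
ssh 192.168.159.5 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys    # append the other node's public keys locally
ssh 192.168.159.5 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys 192.168.159.5:~/.ssh/authorized_keys
On both nodes:
ssh 192.168.159.5 date
ssh 192.168.159.4 date
Once these succeed, the keys are set up.
Slave node (the same script with the IP address swapped):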
#!/bin/bash
# Exit 0 while PostgreSQL looks healthy; otherwise stop keepalived
# and promote the other node by copying the trigger file to it.
count=1
while true
do
    # i: psql exit status; j: 0 if a postgres process exists
    su - postgres -c "psql -c 'select 1'" > /dev/null 2>&1
    i=$?
    ps aux | grep postgres | grep -v grep > /dev/null 2>&1
    j=$?
    if [ $i = 0 ] && [ $j = 0 ]
    then
        exit 0
    elif [ $i = 1 ] && [ $j = 0 ]
    then
        exit 0
    else
        if [ $count -gt 5 ]
        then
            break
        fi
        let count++
        continue
    fi
done
/etc/init.d/keepalived stop
scp /opt/trigger.kenyon 192.168.159.4:/home/postgresql10.8/pg/pgdata/trigger.kenyon
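One thing to be aware of: the loop in the script retries without any pause, so all attempts finish almost instantly. If you would rather give PostgreSQL a moment to recover from a short hiccup, the retry logic could look like the sketch below. The retry function is my own illustration and is not used by the scripts above.

```shell
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
    local attempts=$1 delay=$2 n=1
    shift 2
    while [ "$n" -le "$attempts" ]; do
        "$@" && return 0
        # Sleep only if another attempt is still to come.
        [ "$n" -lt "$attempts" ] && sleep "$delay"
        n=$((n + 1))
    done
    return 1
}

# Possible usage inside a health-check script:
#   retry 5 1 su - postgres -c "psql -c 'select 1'" > /dev/null 2>&1 || {
#       /etc/init.d/keepalived stop
#   }
```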
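With that done, it is time to test.
Start PG and keepalived on both nodes; ip addr on the master node should show the VIP.
Kill the postgres process on the master node:
kill pid
The VIP then moves to the slave node. Log in with psql on the slave node:
create table test (name varchar(9));
If the test table is created successfully, the standby has been promoted to primary.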
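Besides creating a table, you can ask the server directly whether it is still in recovery: pg_is_in_recovery() returns true on a standby and false on a primary. A small sketch that turns that flag into a role name follows; role_from_recovery_flag is a made-up helper.

```shell
# Map the result of "select pg_is_in_recovery()" (t or f in psql -At output)
# to a role name. Any other input is reported as unknown.
role_from_recovery_flag() {
    case $1 in
        t) echo standby ;;
        f) echo primary ;;
        *) echo unknown; return 1 ;;
    esac
}

# Possible usage on either node, as the postgres user:
#   role_from_recovery_flag "$(psql -At -c 'select pg_is_in_recovery()')"
```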
OK! The switchover succeeded: the former standby is now the primary, and the former primary is in effect down (simulated).
How do we repair the former primary at this point?
It is actually quite simple: delete everything under its pg_wal directory (including the archive_status directory),
then copy over the files under pg_wal from the current primary,
and set max_connections in the current standby's postgresql.conf to a value at least as large as the current primary's.
Otherwise startup fails with:
2019-11-28 23:43:43.191 PST [21692] LOG: listening on IPv4 address "0.0.0.0", port 5432
2019-11-28 23:43:43.191 PST [21692] LOG: listening on IPv6 address "::", port 5432
2019-11-28 23:43:43.194 PST [21692] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2019-11-28 23:43:43.285 PST [21693] LOG: database system was interrupted while in recovery at log time 2019-11-28 23:37:49 PST
2019-11-28 23:43:43.285 PST [21693] HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2019-11-28 23:43:43.454 PST [21693] LOG: entering standby mode
2019-11-28 23:43:43.456 PST [21693] FATAL: hot standby is not possible because max_connections = 2000 is a lower setting than on the master server (its value was 2002)
2019-11-28 23:43:43.457 PST [21692] LOG: startup process (PID 21693) exited with exit code 1
2019-11-28 23:43:43.458 PST [21692] LOG: aborting startup due to startup process failure
2019-11-28 23:43:43.471 PST [21692] LOG: database system is shut down
Then start PG on the current standby, start keepalived, and you are done.
Check the replication status
Query on the primary:
postgres=# select client_addr,sync_state from pg_stat_replication;
  client_addr  | sync_state
---------------+------------
 192.168.159.5 | async
(1 row)
master.sh
#!/bin/sh
Master_Log_File=$(ps -ef|grep -v 'grep'|grep -w sender|awk -F ' ' '{print $10}')
Relay_Master_Log_File=$(ps -ef|grep -v 'grep'|grep -w sender|awk -F ' ' '{print $10}')
# psql has no command-line password option; use ~/.pgpass or PGPASSWORD if a password is needed
su - postgres -c "psql -U postgres -h 192.168.159.4 -p 5432 -c 'select 1'"
j=$?
i=1
while true
do
    if [ "$Master_Log_File" = "$Relay_Master_Log_File" ] && [ $j = 0 ] #&& [ $Read_Master_Log_Pos -eq $Exec_Master_Log_Pos ]
    then
        echo "ok"
        break
    else
        sleep 1
        if [ $i -gt 60 ]
        then
            # give up after 60 tries and copy the trigger file to the standby
            scp /opt/trigger.kenyon 192.168.159.5:/data/pg/data/trigger.kenyon
            break
        fi
        i=$((i + 1))
        continue
    fi
done
stop.sh
#!/bin/bash
M_File1=$(ps -ef|grep -v 'grep'|grep -w sender|awk -F ' ' '{print $10}')
sleep 1
M_File2=$(ps -ef|grep -v 'grep'|grep -w sender|awk -F ' ' '{print $10}')
i=1
while true
do
    if [ "$M_File1" = "$M_File2" ]
    then
        echo "ok"
        break
    else
        sleep 1
        if [ $i -gt 60 ]
        then
            break
        fi
        let i++
        continue
    fi
done
su - postgres -c "/opt/postgresql-10.8/bin/pg_ctl -D '/home/postgresql10.8/pg/pgdata/' -l logfile stop"
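Both scripts above try to tell whether the standby has caught up by looking at the WAL sender process. An alternative sketch is to compare the sent_lsn and replay_lsn columns of pg_stat_replication, which exist in PostgreSQL 10. The helpers lsn_to_int and caught_up are my own names, and this is not what the scripts above do.

```shell
# Convert a PostgreSQL LSN such as 0/3000060 to a byte position.
# An LSN is two hexadecimal numbers: the high and the low 32 bits.
lsn_to_int() {
    local hi=${1%%/*} lo=${1##*/}
    echo $(( (0x$hi << 32) + 0x$lo ))
}

# Succeed when the standby has replayed everything the primary has sent.
caught_up() {
    # $1 = sent_lsn, $2 = replay_lsn
    [ "$(lsn_to_int "$2")" -ge "$(lsn_to_int "$1")" ]
}

# Possible usage on the primary, as the postgres user:
#   psql -At -c "select sent_lsn, replay_lsn from pg_stat_replication" |
#       while IFS='|' read -r sent replay; do
#           caught_up "$sent" "$replay" && echo "standby has caught up"
#       done
```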
