PostgreSQL Backup and Incremental Recovery

2020-03-12 23:45:21

Source: reposted

Contributed by: a site user

Preface

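The database I have been using at work recently is PostgreSQL. It is open source, and anyone may use it for commercial purposes. What strikes me most is how well made it is: the installation package is small, yet in performance and ease of use it holds its own against the large commercial databases, and working with it directly from the command line is a pleasure. As the database administrators of a small company, one task we cannot avoid is backing up and restoring data. PostgreSQL has many strengths, but the release used in this article (PostgreSQL 10) has no built-in incremental backup, which is a pity. Still, the flaw does not outweigh the merits; overall it is a very good database.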

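Earlier, in the section "PostgreSQL Asynchronous Master-Slave Streaming Replication", we deployed an asynchronous streaming replication environment. The purpose of replication is to keep a copy of the data and to provide high availability and fault tolerance. What follows is a brief introduction to the backup and recovery approach we commonly use when operating PostgreSQL.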

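Incremental Backup

When PostgreSQL performs a write, every change to the data files is first recorded in the WAL (write-ahead log), and only afterwards are the data files themselves modified. When the database server restarts after a crash or power loss, PostgreSQL first reads the WAL and replays it to repair the data files. In principle, then, if we have a base backup of the database (also called a full backup) together with the WAL generated since that backup, we can restore the database to any point in time covered by that WAL.

This point matters, because the incremental backup commonly used is, put simply, a base backup plus the WAL archived since then, replayed at recovery time.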

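Setting Up Incremental Backup

To demonstrate, we use the pghost1 server from the environment built in "PostgreSQL Asynchronous Master-Slave Streaming Replication" and create the management directories there.

Switch to the postgres user and run: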

mkdir -p /data/pg10/backups
mkdir -p /data/pg10/archive_wals

The backups directory holds the base backups.

The archive_wals directory holds the archived WAL files.

Next, we change the relevant settings in postgresql.conf:

wal_level = replica
archive_mode = on
archive_command = '/usr/bin/lz4 -q -z %p /data/pg10/archive_wals/%f.lz4'

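The default value of archive_command is an empty string. Its value can be a single shell command or a more complex shell script.

Inside the archive_command shell command or script, %p stands for the path of the WAL file to be archived (relative to the data directory, which is the working directory when the command runs), and %f stands for the file name of that WAL file without any path. The command must return a zero exit status only when the file has been archived successfully.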

Changes to wal_level and archive_mode take effect only after the database is restarted. A change to archive_command does not need a restart; a reload is enough, for example:

postgres=# SELECT pg_reload_conf();
postgres=# show archive_command ;

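Creating a Base Backup

We create the base backup with the pg_basebackup command introduced earlier. The base backup is essential: recovery is impossible without it. We recommend generating base backups periodically, on a schedule that fits your business requirements.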

$ pg_basebackup -Ft -Pv -Xf -z -Z5 -p 25432 -D /data/pg10/backups/

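This produces our base backup. With the tar format and gzip compression selected above, it is written as base.tar.gz under /data/pg10/backups/.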
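Because pg_basebackup requires an empty target directory, periodic backups need a fresh directory each time. One possible sketch writes every backup into a date-stamped subdirectory. The function name take_base_backup and the PG_BASEBACKUP_BIN override are assumptions for illustration, and the progress options -Pv are dropped on the assumption that the function runs unattended.

```shell
# take_base_backup [BACKUP_ROOT]: write a compressed tar-format base backup
# into a new, date-stamped subdirectory and print that directory's path.
take_base_backup() {
    root="${1:-/data/pg10/backups}"
    target="$root/$(date +%Y%m%d-%H%M%S)"

    # pg_basebackup needs an empty target directory, so create a new one.
    mkdir -p "$target" || return 1

    # Same format, WAL and compression options as the command above.
    # PG_BASEBACKUP_BIN is a hypothetical hook to point at another binary.
    "${PG_BASEBACKUP_BIN:-pg_basebackup}" -Ft -Xf -z -Z5 -p 25432 -D "$target" || return 1

    echo "$target"
}
```

With this layout, a restore takes base.tar.gz from the chosen subdirectory instead of directly from /data/pg10/backups/.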

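Creating a Restore Point

We usually create a restore point when an important event is about to take place, so that the base backup and the archive can bring us back to the state before the event.

The system function that creates a restore point is pg_create_restore_point. It is called as follows: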

postgres=# SELECT pg_create_restore_point('domac-201810141800');
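Restore point names are free-form text, so a consistent naming pattern makes them easier to find later. A small sketch that builds names in the same prefix-plus-timestamp style as the example above; the function name restore_point_name and the default prefix are assumptions for illustration.

```shell
# restore_point_name [PREFIX]: print PREFIX followed by the current local
# time as YYYYMMDDHHMM, the pattern used by 'domac-201810141800'.
restore_point_name() {
    echo "${1:-domac}-$(date +%Y%m%d%H%M)"
}

# Possible use from the shell (port taken from this article's setup):
#   psql -p 25432 -c "SELECT pg_create_restore_point('$(restore_point_name)');"
```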

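Recovering to a Specific Restore Point

Next, we walk through an example that restores our data to a restore point we have set.

First, we create a test table: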

CREATE TABLE test_restore(
    id SERIAL PRIMARY KEY,
    ival INT NOT NULL DEFAULT 0,
    description TEXT,
    created_time TIMESTAMPTZ NOT NULL DEFAULT now()
);

Insert a few test rows to serve as the baseline data, as shown below:

postgres=# INSERT INTO test_restore (ival) VALUES (1);
INSERT 0 1
postgres=# INSERT INTO test_restore (ival) VALUES (2);
INSERT 0 1
postgres=# INSERT INTO test_restore (ival) VALUES (3);
INSERT 0 1
postgres=# INSERT INTO test_restore (ival) VALUES (4);
INSERT 0 1
postgres=# select * from test_restore;
 id | ival | description |         created_time
----+------+-------------+-------------------------------
  1 |    1 |             | 2018-10-14 11:13:41.57154+00
  2 |    2 |             | 2018-10-14 11:13:44.250221+00
  3 |    3 |             | 2018-10-14 11:13:46.311291+00
  4 |    4 |             | 2018-10-14 11:13:48.820479+00
(4 rows)

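Then create a base backup as described above. One thing to note when testing: a WAL file is archived only once it has been filled (16 MB), and a test may write very little data, so after the base backup finishes you can switch WAL segments by hand. For example: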

postgres=# select pg_switch_wal();
 pg_switch_wal
---------------
 0/1D01B858
(1 row)

Alternatively, set the archive_timeout parameter, which forces a switch to a new WAL segment once the timeout threshold is reached.
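Archiving happens asynchronously after a segment switch, so a test script may want to wait until the segment has actually arrived in the archive directory before continuing. A minimal sketch, assuming the .lz4 naming produced by the archive_command above; the function name wait_for_archive and the 30-second default are assumptions for illustration.

```shell
# wait_for_archive SEGMENT [ARCHIVE_DIR] [TIMEOUT_SECONDS]
# Succeeds once ARCHIVE_DIR/SEGMENT.lz4 exists and is non-empty,
# fails if it has not appeared when the timeout runs out.
wait_for_archive() {
    file="${2:-/data/pg10/archive_wals}/$1.lz4"
    remaining="${3:-30}"

    while [ "$remaining" -gt 0 ]; do
        [ -s "$file" ] && return 0
        sleep 1
        remaining=$((remaining - 1))
    done

    # One last check after the final sleep.
    [ -s "$file" ]
}
```

The segment name passed in would be a 24-character WAL file name such as the ones that appear in the recovery log later in this article, for example as reported by pg_walfile_name().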

Next, create a restore point, as shown below:

postgres=# select pg_create_restore_point('domac-1014');
 pg_create_restore_point
-------------------------
 0/1E0001A8
(1 row)

Next we change the data: we delete every row in test_restore:

postgres=# delete from test_restore;
DELETE 4

Now we run the experiment of recovering to the restore point named "domac-1014", as follows:

Stop the database:

$ pg_ctl stop -D /data/pg10/db

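Remove the old data directory, unpack the base backup into a new one, and create recovery.conf in it:

$ rm -rf /data/pg10/db
$ mkdir /data/pg10/db && chmod 0700 /data/pg10/db
$ tar -xvf /data/pg10/backups/base.tar.gz -C /data/pg10/db
$ cp $PGHOME/share/recovery.conf.sample /data/pg10/db/recovery.conf
$ chmod 0600 /data/pg10/db/recovery.conf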

Edit recovery.conf and set the following parameters:

restore_command = '/usr/bin/lz4 -d /data/pg10/archive_wals/%f.lz4 %p'
recovery_target_name = 'domac-1014'
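When the same recovery is repeated for different restore points, the two settings can be generated instead of edited by hand. A minimal sketch that writes just these two parameters into recovery.conf; the function name write_recovery_conf is an assumption for illustration, and the restore point name is assumed not to contain a single quote.

```shell
# write_recovery_conf DATA_DIR RESTORE_POINT [ARCHIVE_DIR]
# Creates DATA_DIR/recovery.conf with the two settings shown above.
write_recovery_conf() {
    datadir="$1"
    target="$2"
    archive="${3:-/data/pg10/archive_wals}"
    conf="$datadir/recovery.conf"

    # %f and %p are left as-is for PostgreSQL to substitute; only the
    # shell variables are expanded by the here-document.
    cat > "$conf" <<EOF
restore_command = '/usr/bin/lz4 -d $archive/%f.lz4 %p'
recovery_target_name = '$target'
EOF
    chmod 0600 "$conf"
}
```

For the example in this article the call would be: write_recovery_conf /data/pg10/db domac-1014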

Then start the database so that it enters recovery, and watch the log, as shown below:

bash-4.2$ pg_ctl start -D /data/pg10/db
waiting for server to start....2018-10-14 11:26:56.949 UTC [8397] LOG: listening on IPv4 address "0.0.0.0", port 25432
2018-10-14 11:26:56.949 UTC [8397] LOG: listening on IPv6 address "::", port 25432
2018-10-14 11:26:56.952 UTC [8397] LOG: listening on Unix socket "/tmp/.s.PGSQL.25432"
2018-10-14 11:26:56.968 UTC [8398] LOG: database system was interrupted; last known up at 2018-10-14 09:26:59 UTC
2018-10-14 11:26:57.049 UTC [8398] LOG: starting point-in-time recovery to "domac-1014"
/data/pg10/archive_wals/00000002.history.lz4: No such file or directory
2018-10-14 11:26:57.052 UTC [8398] LOG: restored log file "00000002.history" from archive
/data/pg10/archive_w : decoded 16777216 bytes
2018-10-14 11:26:57.077 UTC [8398] LOG: restored log file "000000020000000000000016" from archive
2018-10-14 11:26:57.191 UTC [8398] LOG: redo starts at 0/16000060
2018-10-14 11:26:57.193 UTC [8398] LOG: consistent recovery state reached at 0/16000130
2018-10-14 11:26:57.193 UTC [8397] LOG: database system is ready to accept read only connections
/data/pg10/archive_w : decoded 16777216 bytes
2018-10-14 11:26:57.217 UTC [8398] LOG: restored log file "000000020000000000000017" from archive
 done
server started
/data/pg10/archive_w : decoded 16777216 bytes
2018-10-14 11:26:57.384 UTC [8398] LOG: restored log file "000000020000000000000018" from archive
/data/pg10/archive_w : decoded 16777216 bytes
2018-10-14 11:26:57.513 UTC [8398] LOG: restored log file "000000020000000000000019" from archive
/data/pg10/archive_w : decoded 16777216 bytes
2018-10-14 11:26:57.699 UTC [8398] LOG: restored log file "00000002000000000000001A" from archive
/data/pg10/archive_w : decoded 16777216 bytes
2018-10-14 11:26:57.805 UTC [8398] LOG: restored log file "00000002000000000000001B" from archive
/data/pg10/archive_w : decoded 16777216 bytes
2018-10-14 11:26:57.982 UTC [8398] LOG: restored log file "00000002000000000000001C" from archive
/data/pg10/archive_w : decoded 16777216 bytes
2018-10-14 11:26:58.116 UTC [8398] LOG: restored log file "00000002000000000000001D" from archive
/data/pg10/archive_w : decoded 16777216 bytes
2018-10-14 11:26:58.310 UTC [8398] LOG: restored log file "00000002000000000000001E" from archive
2018-10-14 11:26:58.379 UTC [8398] LOG: recovery stopping at restore point "domac-1014", time 2018-10-14 11:17:20.680941+00
2018-10-14 11:26:58.379 UTC [8398] LOG: recovery has paused
2018-10-14 11:26:58.379 UTC [8398] HINT: Execute pg_wal_replay_resume() to continue.

Once the server has started, we query the test_restore table to check whether the data was restored correctly:

postgres=# select * from test_restore;
 id | ival | description |         created_time
----+------+-------------+-------------------------------
  1 |    1 |             | 2018-10-14 11:13:41.57154+00
  2 |    2 |             | 2018-10-14 11:13:44.250221+00
  3 |    3 |             | 2018-10-14 11:13:46.311291+00
  4 |    4 |             | 2018-10-14 11:13:48.820479+00
(4 rows)

As you can see, the data has been restored to the specified restore point, domac-1014.

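As the HINT in the log shows, recovery is paused at the restore point. Executing pg_wal_replay_resume() ends recovery, after which PostgreSQL 10 renames recovery.conf to recovery.done. Make sure recovery.conf is no longer in the data directory before later restarts, so that the server does not recover to this restore point again.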
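Since PostgreSQL 10 renames recovery.conf to recovery.done when archive recovery ends, a quick way to see which state a data directory is in is to check for these two files. A minimal sketch; the function name recovery_state and the words it prints are assumptions for illustration.

```shell
# recovery_state DATA_DIR: report which recovery marker file is present.
recovery_state() {
    if [ -f "$1/recovery.conf" ]; then
        echo "pending"      # the server enters recovery on its next start
    elif [ -f "$1/recovery.done" ]; then
        echo "finished"     # an earlier archive recovery has completed
    else
        echo "none"         # no recovery configured for this directory
    fi
}
```

For the data directory used in this article the call would be: recovery_state /data/pg10/db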

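Summary

Backup and recovery are among the most important tasks in database administration. In day-to-day operations, back up according to a policy that fits your needs, and run recovery tests periodically to make sure the data is safe.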

That is all for this article. I hope it is of some use for your study or work. If you have questions, feel free to leave a comment.
