pg_streaming_replication Puppet Module

pg_streaming_replication module configures and initiates PostgreSQL streaming replication based on replication slots. It is an open source module, published under Apache 2.0 license, and managed at Github (contributors are welcome). The module is also published at puppet forge.

Basic Usage

For this basic example I'll assume few things:

  • There are two servers, 192.168.1.1 (to be the primary) and 192.168.1.2 (to be the standby);
  • PostgreSQL 9.4 or higher is already installed. Version must be 9.4 or newer because replication slots feature that is used by pg_streaming_replication module is introduced in 9.4.

The primary server must be initiated first, and this can be done by:

class { 'pg_streaming_replication':
  id_rsa_source        => 'puppet:///files/my_postgres_ssh_id_rsa',
  id_rsa_pub_source    => 'puppet:///files/my_postgres_ssh_id_rsa.pub',
  nodes                => ['192.168.1.1', '192.168.1.2'],
  replication_password => 'BDE4CE17-98E5-4FDC-B03C-B94559FE03D8',
  initiate_role        => 'primary',
}

In the previous script I've used a GUID (BDE4CE17-98E5-4FDC-B03C-B94559FE03D8) for replication_password, but you can select the password as you like. The only restriction is that the password cannot be empty and must be the same on all the servers. You should also know that you don't have to remember this password because you won't ever use it directly.

After the primary server is successfully promoted, you can join the standby server by:

class { 'pg_streaming_replication':
  id_rsa_source        => 'puppet:///files/my_postgres_ssh_id_rsa',
  id_rsa_pub_source    => 'puppet:///files/my_postgres_ssh_id_rsa.pub',
  nodes                => ['192.168.1.1', '192.168.1.2'],
  replication_password => 'BDE4CE17-98E5-4FDC-B03C-B94559FE03D8',
  initiate_role        => 'standby',
  primary_server_ip    => '192.168.1.1',
}

That's it. You can test the replication as described here.

After the replication is initiated it is recommended to change initiate_role parameter value to 'none' for both servers, in order to prevent tools conflict problem described below.

Requirements

pg_streaming_replication module has the following requirements:

  • PostgreSQL version 9.4 or newer (already explained);
  • PostgreSQL is configured identically on all the servers, meaning that:

    • The same PostgreSQL version is used (actually this is not the module's but PostgreSQL's requirement);
    • PostgreSQL configuration directory has the same path on all the servers (i.e. /etc/postgresql/9.5/main);
    • PostgreSQL data directory has the same path on all the servers (i.e. /var/lib/postgresql/9.5/main);
    • pg_ctl resides at the same location on all the servers (i.e. /usr/lib/postgresql/9.5/bin/pg_ctl);
  • The main configuration file must be named postgresql.conf, and must reside in the configuration directory explained above;
  • Host-based access file must be named pg_hba.conf, and must reside in the configuration directory explained above;
  • Every server that will be included in replication must have an unique hostname. The hostname must be unique even without domain part. For example, the following two servers cannot be used although they have different FQDN: MY-SERVER1.domain.com and MY-SERVER1.example.com.

PostgreSQL Configuration

pg_streaming_replication module is created in such way that minimally interfere with your existing configuration. This actually means that:

Yet, there are few rules you have to obey to.

Rule 1: When to Configure

Although you can configure almost anything, it does not mean that you can do this at any time. Actually, almost everything should be configured before the replication is initiated, and cannot be changed later. This rule is actually not introduced by pg_streaming_replicaiton module, but by PostgreSQL replication itself. There are some things you can change after the replication is configured:

  • Some insignificant entries in postgresql.conf, such as log file locations, log levels, etc. You can change even more important entries as listen_addresses is, assuming that you'll leave one that is used by the replication. But you cannot change port, for example, when the replication is initiated (or at least not in an easy way).
  • You can change pg_hba.conf rules freely even after the replication is started, but without changing few rules created by pg_streaming_replication module (all referring to replication user).
  • pg_ident.conf does not affects neither pg_streaming_replication module, nor the replication itself, so you can change it whenever you want.

Rule 2: Configuration Order

With puppetlabs/postgresql Module

If you are using puppetlabs/postgresql module for configuration, you don't need to worry about order of configuration steps. pg_streaming_replication module is created in such way that it cooperates with puppetlabs/postgresql module, and knows when it is its term to take over. If you try to define the order by using before, after, require, arrows (->) between pg_streaming_replication class and, for example, postgresql::server class - you'll probably end up with circular dependency error. For example the following will cause an error:

class { 'pg_streaming_replication':
  nodes                => ['192.168.1.1', '192.168.1.2'],
  replication_password => 'BDE4CE17-98E5-4FDC-B03C-B94559FE03D8', 
  require              => Class['postgresql::server'], # Beeeeeep, wrong! Don't do this!
}

Simply, keep in mind that pg_streaming_replication module already knows everything about order when it comes to puppetlabs/postgresql module.

Manual Configuration

If you are not using puppetlabs/postgresql module for configuring PostgreSQL, but instead you doing something like:

file { 'postgresql.conf':
  path   => '/etc/postgresql/9.5/main/postgresql.conf',
  source => 'puppet:///files/my_prepared_postgresql.conf',
}

Or similarly with templates, or file_line resource from puppetlabs/stdlib, then you'll have to take care of the order. In this case the rule is again simple: pg_streaming_replication module must be the last one that deals with configuration. You can accomplish this by adding before attribute to all other configuration resources you are using, for example:

file { 'postgresql.conf':
  path   => '/etc/postgresql/9.5/main/postgresql.conf',
  source => 'puppet:///files/my_prepared_postgresql.conf',
  before => Class['pg_streaming_replication'],
}

Rule 3: Reserved Entries

As already mentioned, there are few postgresql.conf entries that you should not touch, and leave them to pg_streaming_replication module. These are:

  • wal_level - you cannot set;
  • max_replication_slots - you can set only through pg_streaming_replication class' parameter with the same name;
  • max_wal_senders - again, only through a parameter;
  • hot_standby - you cannot set;
  • hot_standby_feedback - you cannot set.

Keep in mind that, besides these reserved entries, you also should not set any other postgresql.conf entry that relates to replication, unless you are absolutely sure that you know what you are doing.

The Biggest Problem

This section is not directly about pg_streaming_replication module, but about general problems in PostgreSQL replication. Nevertheless, you need to understand what the problems are so that you can understand why the module is created in such way, and how it addresses any problems.

If you ask someone without experience in the area what is the single biggest problem in PostgreSQL replication, chances are that he will answer "primary server failure". Well, it is not true. We are establishing the replication to address this exact problem, so it cannot surprise us, and we'll solve it easily - by performing failover. Strangely enough, the single biggest problem is exact opposite: ending up with more than one primary server. This is a situation we are not prepared for (nor our tools are), and it would cause data loss.

And you'll be surprised when I tell you how easy this may happen, and how often it actually happens in practice (if the system isn't carefully and correctly configured).

Scenario 1: Short Outage

It can be enough only 30 seconds for your primary server to loose the network, and if you have any automatic failover tool (i.e. pgpool-II), a failover will be performed, and one of standbys will take over primary server role. When the old primary server gets network connection again it'll be the second primary server in the cluster. Usually in such scenarios the same tool that performs automatic failover (pgpool-II) is responsible to prevent the old primary to come back as primary server.

Scenario 2: Tools Conflict

Let's say that primary server's database has became unresponsive, so failover tool (pgpool-II) does its job and promotes one of standbys to primary server role. Then after a some time puppet agent kicks, starts its run and notices "hey your config file is not good - you should be the master" and changes servers back to their original configuration. This way puppet would actually cause irreparable damage - neither old nor new primary are not valid primaries anymore. For this exact reason pg_streaming_replication module is created to not to change postgresql.conf at all. (It'll be explained later.)

Scenario 3: Mishandling

It is not possible even to imagine every scenario in which a person without knowledge damages the system. One good real world example can be found in this stackowerflow question - a guy decided to test replication by turning off the primary server, and after the failover has taken place, he again turned on the old primary server. Obviously he ended up wondering what time differences log files are talking about. What logs were actually saying is that the standby server also ended up wondering "I have newer data than the primary. How it can be?".

And of course, there are a lot more scenarios than I've enumerated here.

pg_streaming_replication Module's Way

The module essentially does two things: initiates the replication and creates some useful scripts that can be used directly (without puppet) to perform replication-related tasks. But once the replication is initiated, the module does not interfere with it anymore. Having in mind everything said previously, the reason for such implementation is obvious.

It basically means that after the replication is initiated, the module does not change postgresql.conf file in any way anymore. Instead of changing the config file itself, the module keeps two template config files - postgresql.conf.primary and postgresql.conf.standby, both stored in [CONF_DIR]/repltemplates directory. This way the server is ready to be either primary or standby - the only thing that needs to be done is to copy the appropriate template file. These template files are later used by promote.sh and initiate_replication.sh scripts (from [CONF_DIR]/replscripts directory) for performing actual primary-promotion / standby-joining actions respectively.

In certain conditions pg_streaming_replication module will execute these scripts:

  • If initiate_role parameter (to be explained below) is set to 'primary' or 'master', and if the server still doesn't have any role, the module will execute promote.sh script, thus promoting the server to primary server role.
  • Similarly, if initiate_role parameter is set to 'standby' or 'slave', and the server still doesn't have any role, the module will execute initiate_replication.sh script, thus joining the server as standby to existing primary server.

Another thing to explain here is how actually pg_streaming_replication module knows if the server is already joined (has primary or standby role). For this purpose the module uses so-called trigger file and standby file. Term trigger file is well known in PostgreSQL replication terminology, but here it has additional purpose. I've introduced a convention that the primary server must contain, and only primary server can contain trigger file. Similarly, every standby server must contain, and only standby servers can contain standby file. If this convention is enforced, it is easy for administrators and administrative scripts to immediately determine if particular server has any role (and which one).

Details about mentioned scripts (what they actually do) and trigger / standby files can be found in PostgreSQL HA with pgpool-II - Part 3. Once again, I strongly recommend reading the first three parts of this tutorial, since pg_streaming_replication module is based on that. Or at least the third part that explains the scripts.

To conclude: when any of the two scripts mentioned above runs for the first time on a particular server, besides performing the actual task (promoting primary or joining standby), it also creates trigger or standby file. And as long pg_streaming_replication module sees any of these two files - it will not run the scripts again, no matter initiate_role parameter value. It also means that if you (for any reason) want to force the module to run the scripts again - you should ensure that both trigger and standby files are removed from the server, set desired initiate_role parameter value, and run puppet agent.

Tools Conflict Stoppers

Just as a recap I'll say that pg_streaming_replication module implements two mechanisms that are preventing it to run replication scripts unwantedly:

  • Setting initiate_role value to 'none' for all the servers after the replication is initiated - it is the recommended thing to do;
  • Trigger / standby files. The module won't execute any script if any of these are present on the server. This mechanism is implemented out-of-the-box, without any action or configuration required from you.

Template Config Files

When it comes to template config files, pg_streaming_replication module differs from the referenced source tutorial. In the tutorial the template files are fixed since there's no another user (you) who wants to change them. Here they are dynamic in terms that the module keeps them updated according to the currently used postgresql.conf file. It is already explained that the module does not change postgresql.conf file anymore, but still at every run it checks the file for changes, and if it is changed, it mirrors these changes to template files. It means that if you change any entry in postgresql.conf file (except for few reserved entries enumerated above) the module will notice that, and copy the change to postgresql.conf.primary and postgresql.conf.standby files. This way the module actually ensures that the template files are in sync with the currently active configuration, so when they are deployed by executing any of replication scripts - the changes will be preserved.

pg_streaming_replication Class' Arguments

Before starting with explaining particular arguments I'll emphasize that all the arguments except for initiate_role have to be the same for all the servers. It is true that primary_server_ip is not needed on the primary server, but it won't hurt even if it is supplied (for consistency), since it'll be ignored there.

The parameters can be grouped in several groups by their purpose and intended usage. The first group contains arguments that you should supply no matter if you are using puppetlabs/postgresql module or not (unless the default value is correct in your case). These are:

Parameter Default Value Description
nodes N/A (mandatory) Array of IP addresses of all the servers included in replication.
replication_password N/A (mandatory) For replication purpose the module creates PostgreSQL login named 'replication' (not changeable), and this parameter specifies its password.
pg_ctl '/usr/lib/postgresql/[POSTGRESQL_VERSION]/bin/pg_ctl' (i.e. '/usr/lib/postgresql/9.5/bin/pg_ctl') pg_ctl command's full path.
postgres_home_dir '/var/lib/postgresql' Home directory of postgres user.

Two more parameters have to be specified in any non-testing environment:

Parameter Default Value Description
id_rsa_source 'puppet:///modules/pg_streaming_replication/id_rsa' Location of postgres user's SSH private key file. It defaults to private key contained by pg_streaming_replication module, but you certainly don't want to use a private key which anyone can see at Github.
id_rsa_pub_source 'puppet:///modules/pg_streaming_replication/id_rsa.pub' Public key that corresponds to id_rsa_source.

We are introducing the last two parameters because postgres user has to be able to SSH execute script on any of the servers, from every of the servers included in replication. This is accomplished as explained in Passwordless SSH in Linux. There you can see how you can create the private and public keys.

It is also worthwhile noting that the keys do not need to be stored in puppet and published through puppet url. You can also use network share or FTP server. Basically you can use anything that is acceptable by source attribute of file resource.

In the third group there are parameters related to replication itself:

Parameter Default Value Description
max_replication_slots [NUMBER_OF_NODES] + 1

See PostgreSQL documentation for details about this parameter.

To explain the default value: if you have, for example, 5 elements in nodes parameter, the default value of this parameter will be 6.

max_wal_senders max_replication_slots See PostgreSQL documentation for details about this parameter.

You don't have to change these parameters, but you can if you want.

The fourth group is for trigger / standby file paths. There's no reason for setting these (changing the default values), but again you can if you want. The parameters are:

Parameter Default Value Description
trigger_file '[CONFIG_DIR]/im_the_master' (i.e. '/etc/postgresql/9.5/main/im_the_master') Trigger file path.
standby_file '[CONFIG_DIR]/im_slave' (i.e. '/etc/postgresql/9.5/main/im_slave') Standby file path.

The fifth group contains parameters that should not be set if your manifest implements postgresql::server class. pg_streaming_replication module will read these values from puppetlabs/postgresql module, so specifying them twice would be redundant and can cause misconfiguration if the same value isn't configured in both places. These parameters are:

Parameter Default Value Description
config_dir If postgresql::server class is used, then postgresql::params::confdir is used as default; otherwise '/etc/postgresql/[POSTGRESQL_VERSION]/main' (i.e. '/etc/postgresql/9.5/main') PostgreSQL configuration directory (where postgresql.conf and pg_hba.conf are stored).
data_dir If postgresql::server class is used, then postgresql::params::datadir is used as default; otherwise '/var/lib/postgresql/[POSTGRESQL_VERSION]/main' (i.e. '/var/lib/postgresql/9.5/main') The directory where PostgreSQL data (database cluster) is stored.
postgresql_version If not specified, the default value is provided by postgresql::server class. If postgresql::server class is not implemented this parameter is mandatory. PostgreSQL version.
port If postgresql::server class is used, then the default value is taken from there; otherwise 5432 PostgreSQL server listening port.

The last group is for two role-based parameters:

Parameter Default Value Description
initiate_role 'none'

The role of the server in replication. Acceptable values are:

  • 'master' and 'primary' - the same meaning - the primary server;
  • 'slave' and 'standby' - the same meaning - a standby server;
  • 'none' (default) - neutral configuration.

As mentioned several times, after the replication is established this parameter should be set to 'none' for all the servers, no matter role they have.

primary_server_ip N/A (mandatory if initiate_role is set to 'slave' or 'standby') IP address of the primary server. It is used only if initiate_role is 'slave' or 'standby'; otherwise it is ignored.

Usage Example

Since almost all parameters have to be the same for all the servers I suggest you creating intermediary class that will supply parameters (to avoid repetitions):

# 192.168.1.1
node server1.domain.com {
  class { 'mypostgresqlservers':
    initiate_role => 'primary',
  }
}
 
# 192.168.1.2
node server2.domain.com {
  class { 'mypostgresqlservers':
    initiate_role => 'standby',
  }
}
 
# 192.168.1.3
node server3.domain.com {
  class { 'mypostgresqlservers':
    initiate_role => 'standby',
  }
}
  
class mypostgresqlservers (
  $initiate_role = 'none') {
 
  # PostgreSQL installation and configuration example
  class { 'postgresql::globals':
    manage_package_repo => true,
    version             => '9.5',
  }
  ->
  class { 'postgresql::server':
    ip_mask_deny_postgres_user => '0.0.0.0/32',
    listen_addresses           => '*',
    port                       => 5433,
    ipv4acls                   => ['host all postgres 192.168.1.0/24 md5'],
    postgres_password          => 'pa55w0rd',
    manage_pg_hba_conf         => true,
  }
 
  # Defining primary server:
  $ip_primary = '192.168.1.1'
  
  # Configuring replication
  class { 'pg_streaming_replication':
    id_rsa_source        => 'puppet:///files/my_postgres_ssh_id_rsa',
    id_rsa_pub_source    => 'puppet:///files/my_postgres_ssh_id_rsa.pub',
    nodes                => ['192.168.1.1', '192.168.1.2', '192.168.1.3'],
    replication_password => 'B0B52E6F-BB5F-4371-B824-098E086B5C2D',
    initiate_role        => $replication_role,
    primary_server_ip    => $ip_primary,
  }
}

Failover, Promotion and Recovery

Finally few words about what you should do when things go wrong. Let's start by explaining the terms from the title of this section:

  • Failover is when a standby server takes over primary server's role, usually because of failure of the primary server. In PostgreSQL this is done simply by creating the trigger file on the standby server that will take over. It is not important how you'll do this - you can even use puppet for ensuring file resource, but I strongly suggest not doing so, and I don't see too much sense in doing that in the first place. Actually, in my environment I'm not performing failover at all - I always go with full promotion immediately.
  • Promotion (or full promotion) is when a standby server is completely promoted to primary role (making it capable to accept standbys). "What to Do when Primary Server Fails?" section in PostgreSQL HA with pgpool-II - Part 2 page explains the difference between failover and full promotion.
  • Recovery is when we are bringing back a standby server that has failed previously. In replication slots streaming replication there's no any special recovery procedure, and as explained in "How to Recover Standby Server?" section in PostgreSQL HA with pgpool-II - Part 2 page, we basically have two options:

    • If the standby server is repaired without loosing the data, and if we haven't deleted its replication slot at the primary server, the only thing to do is to simply turn on the standby server and it will automatically catch-up, no matter for how long it was down.
    • If the standby server's data is corrupt - we'll simply remove all the data, and rejoin it as new standby (as if it has never been standby before).

Full Promotion by Using Script

This is the point where the scripts created by pg_streaming_replication module come handy. We can fully promote a standby server to primary role in the exact same way as we've created the original primary server in the first place - by executing promote.sh script. If you want to see what arguments the script accepts, execute:

promote.sh --help

The following example will promote the server where the script is executed to primary role:

sudo -u postgres /etc/postgresql/9.5/main/replscripts/promote.sh -f -p BDE4CE17-98E5-4FDC-B03C-B94559FE03D8 -d OLD-PRIMARY.domain.com

Let me explain the previous line, part by part:

  • The line starts with 'sudo -u postgres' because all the scripts generated by pg_streaming_replication module have to be executed as postgres user.
  • Flag -f means force. If specified, it will allow script to execute no matter if trigger / standby files are present or not. Without -f flag, before executing the script you would have to ensure that standby file is absent and trigger file is present. With -f flag the script itself will delete / create the files appropriately.
  • Part '-p BDE4CE17-98E5-4FDC-B03C-B94559FE03D8' defines replication user's password, the same one we've used above for establishing the replication. The password is mandatory.
  • Part '-d OLD-PRIMARY.domain.com' specifies the old primary server. This part is optional, and if specified, besides promoting the current server to primary role, the script will also try to destroy the old primary (by trying to execute disable_postgresql.sh script at the old primary server). Remember - it would be dangerous if the old primary ever comes back, still thinking that he's the master.

After this command is executed you'll have the new master, ready for accepting its own slaves. Note that in streaming replication with replication slots, when the primary server is changed, all standbys (except for the one that is promoted) need to be destroyed and recreated from scratch as standbys for the new primary server.

Promotion By Using Puppet

You can accomplish the similar (full promotion) by using puppet instead of directly executing the script. If you want do this, you first need to ensure that both trigger and standby file are removed from the server (remember pg_streaming_replication module won't execute any script if any of these files already exists). After deleting the files you'll simply change the manifest for that server to include initiate_role attribute with value 'primary', for example:

class { 'pg_streaming_replication':
  id_rsa_source        => 'puppet:///files/my_postgres_ssh_id_rsa',
  id_rsa_pub_source    => 'puppet:///files/my_postgres_ssh_id_rsa.pub',
  nodes                => ['192.168.1.1', '192.168.1.2'],
  replication_password => 'BDE4CE17-98E5-4FDC-B03C-B94559FE03D8',
  initiate_role        => 'primary',
}

When the script is executed by pg_streaming_replication module as in this example, -d option is not specified, meaning that the old primary is not destroyed, so you need prevent it from coming back.

Recovery by Using Script or Puppet

I'll repeat the key thing about recovery: it actually assumes destroying the server and recreating it as completely new standby. So we'll do this in the same exact way as we've originally installed any other standby, except this time we'll delete standby / trigger files first to allow pg_streaming_replication module to execute necessary scripts. After the files are deleted we'll simply set initiate_role to 'standby' and run puppet agent.

Managing Slots

The first thing to tell when it comes to replication slots is that they are in 1-to-1 relationship with standby servers, meaning that every standby server will have (and use) only one slot, and every replication slot can be used by only one standby server.

When replication slots are used we should be careful to not allow existence of orphaned slots. Every created slot, if not used anymore, has to be deleted, or the primary will die sooner or later. pg_streaming_replication module creates replication slots for you, but it does not delete them (since it cannot know if you'll use them again). For this reason you need to know when the slots are created and how they are named, so that you can check if any orphaned slot exists on primary.

pg_streaming_replication module creates replication slots when a new standby is added to the cluster. More precisely, the slot is created by initiate_replication.sh script which is executed at the new standby server when pg_streaming_replication module joins it as a standby (first puppet agent run with initiate_role value 'standby' or 'slave'). Notice that the script is executed at the standby, and the slot is created at the primary server. Newly created replication slot is named after the standby server that uses it. For example, if the new standby is MY-SERVER1.domain.com, the slot will be named my_server1 (hostname without domain, to lowercase, with dashes replaced by underscores). If you decide to remove a particular standby server from replication for good, you also have to manually delete its slot from the primary server. For instructions on how to delete the slot you can check original PostgreSQL documentation, or you can simply open create_slot.sh script created by pg_streaming_replication module and see how the slots are created and deleted there.

Once knowing how the replication slots are created and named, it is obvious why I've defined the requirement that hostnames must be unique.

The last thing I want to explain here is what is happening when the slot with a particular name already exists. For example, if you want to join MY-SERVER1.domain.com server as standby, it is already explained that pg_streaming_replication module (initiate_replication.sh script) will create a new slot at the primary server, with name 'my_server1'. But what happens if a slot with this name already exists? The script that actually creates the slot is create_slot.sh, and what it will do depends on the flags specified. Possible options are:

  • If -r flag (recreate) is specified, create_slot.sh script will first delete the existing slot and then create the new one with the same name.
  • If -r flag is not specified, create_slot.sh won't do anything - it will leave the existing slot intact.

But chances are that you'll never execute this script directly. It is called internally by initiate_replication.sh script, and there it is always called with -r flag specified, meaning that in reality, if the slot with the same name already exists, it will be deleted and the new one with the same name will be created.