Setting up HA for DBmarlin
These steps are for DBmarlin 5.9.0 and above, which include the HA sync scripts.
You might want to set up High Availability (HA) for DBmarlin to ensure continuous visibility into database performance, even during hardware failures, maintenance, or unexpected outages. An HA setup provides resilience by automatically failing over to a standby instance, minimising data loss, and maintaining uninterrupted access to dashboards, alerts, and historical comparisons. This not only improves reliability for operations teams but also builds confidence with stakeholders that database monitoring will always be available when it's most needed.
HA for DBmarlin Repository (PostgreSQL)
DBmarlin uses PostgreSQL as its backend repository database. The bundled PostgreSQL installation is a stand-alone PostgreSQL instance which runs on the same server as the other DBmarlin components. In order to set up HA, you should have 2 machines dedicated to the DBmarlin PostgreSQL cluster and 2 machines dedicated to the DBmarlin app server (4 machines in total).
PostgreSQL supports replication setups in which a Primary instance receives the read and write activity and the Replica receives replication traffic until a failover, when it becomes the Primary. You can set up a remote PostgreSQL instance and turn this into a 2-node cluster following the PostgreSQL documentation https://www.postgresql.org/docs/15/high-availability.html.
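Once the 2-node cluster is running, it can be useful to confirm which node is currently the Primary. The following is a minimal sketch, not part of DBmarlin. It assumes `psql` is installed and that credentials are available without a prompt (for example via `~/.pgpass`). The function names are hypothetical. It relies on PostgreSQL's built-in `pg_is_in_recovery()` function, which returns true on a standby and false on a primary.

```shell
#!/bin/sh
# Map the output of "SELECT pg_is_in_recovery()" to a role name.
# With psql -At the result is printed as "t" (standby) or "f" (primary).
role_from_recovery_flag() {
  case "$1" in
    f) echo "primary" ;;
    t) echo "replica" ;;
    *) echo "unknown" ;;
  esac
}

# Query one PostgreSQL node and print its role.
# $1 = host, $2 = user (both supplied by the caller; nothing is hard-coded).
# -w makes psql fail instead of prompting for a password.
pg_node_role() {
  host="$1"
  user="$2"
  flag=$(psql -w -h "$host" -U "$user" -d postgres -Atc "SELECT pg_is_in_recovery()" 2>/dev/null)
  role_from_recovery_flag "$flag"
}
```

For example, `pg_node_role <primary-host> <user>` prints `primary`, `replica`, or `unknown` if the query could not be run.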
HA for the DBmarlin Server
The DBmarlin Server is stateless as long as you are using a remote PostgreSQL instance. Therefore, the main consideration is keeping the binaries and config in sync between the Primary and the Standby servers.
Setup
These instructions assume 2 DBmarlin machines (dbmarlin-ha-test1 and dbmarlin-ha-test2) which should be kept in sync. Only 1 is active at any time; the other is stopped, ready to take over in the case of a failover.
-
Make sure you can do passwordless ssh in both directions (see How to setup passwordless SSH login). Once set up, you should be able to ssh like this in both directions.
From dbmarlin@dbmarlin-ha-test1:
[dbmarlin@dbmarlin-ha-test1 dbmarlin]$ ssh dbmarlin-ha-test2
From dbmarlin@dbmarlin-ha-test2:
[dbmarlin@dbmarlin-ha-test2 dbmarlin]$ ssh dbmarlin-ha-test1
-
Make sure you have rsync installed on both machines.
yum install rsync
-
Create .ha_remote.conf on each DBmarlin machine in /opt/dbmarlin and configure it to point to the REMOTE host, user and dir as shown below.
On dbmarlin@dbmarlin-ha-test1, .ha_remote.conf will look like:
REMOTE_HOST=dbmarlin-ha-test2
REMOTE_USER=dbmarlin
REMOTE_DIR=/opt/dbmarlin
On dbmarlin@dbmarlin-ha-test2, .ha_remote.conf will look like:
REMOTE_HOST=dbmarlin-ha-test1
REMOTE_USER=dbmarlin
REMOTE_DIR=/opt/dbmarlin
-
When you run start.sh, it checks for the existence of .ha_remote.conf and then checks whether DBmarlin Tomcat is running on the REMOTE_HOST. If so, it refuses to start on this host, to prevent dual writes to the DBmarlin PostgreSQL repository DB.
./start.sh
Starting all processes
Remote postgresql (skipping)
Preparing to start Tomcat
Checking if Tomcat is active on dbmarlin-ha-test1...
Tomcat is already active on dbmarlin-ha-test1. Aborting start to prevent dual writes.
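The prerequisites above (passwordless ssh, rsync on both machines, and a valid .ha_remote.conf) can be checked in one go before relying on failover. This is a rough sketch under stated assumptions, not a script shipped with DBmarlin. The function name `preflight` and the `SSH_BIN` override (present only so the function can be exercised without a real remote host) are hypothetical. It assumes .ha_remote.conf contains only the three KEY=value lines shown above.

```shell
#!/bin/sh
# Pre-flight check for one node of the HA pair.
# $1 = absolute path to .ha_remote.conf (e.g. /opt/dbmarlin/.ha_remote.conf).
preflight() {
  conf="$1"
  [ -f "$conf" ] || { echo "missing $conf"; return 1; }

  # The file holds plain KEY=value lines, so it can be sourced to set
  # REMOTE_HOST, REMOTE_USER and REMOTE_DIR.
  . "$conf"

  # rsync must be available on this machine.
  command -v rsync >/dev/null 2>&1 || { echo "rsync not installed locally"; return 1; }

  # BatchMode=yes makes ssh fail instead of prompting for a password, so a
  # successful run shows that key-based login works. The remote command
  # also confirms that rsync exists on the other machine.
  "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 \
    "$REMOTE_USER@$REMOTE_HOST" "command -v rsync" >/dev/null 2>&1 \
    || { echo "passwordless ssh or remote rsync check failed for $REMOTE_HOST"; return 1; }

  echo "preflight OK for $REMOTE_HOST"
}
```

Running `preflight /opt/dbmarlin/.ha_remote.conf` as the dbmarlin user on each node covers both directions.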
Syncing
There is a ./ha_sync.sh script which can be used to sync files from the remote or to the remote.
-
Create .rsyncignore with these contents if it doesn't already exist:
.pid
*.log
tomcat/logs/
nginx/cache/
nginx/proxy_temp/
.ha_remote.conf
.rsyncignore
-
The passive node can pull from the active node like this:
# Dry run to see which files will be copied/deleted
./ha_sync.sh --pull --dry-run
./ha_sync.sh --pull
-
The active node cannot pull from the passive node. This is prevented with a message like:
./ha_sync.sh --pull
Local Tomcat is running - skipping pull (this is the active node)
-
You can also push from the active node to the passive node, for example after an upgrade:
# Dry run to see which files will be copied/deleted
./ha_sync.sh --push --dry-run
./ha_sync.sh --push
-
Sync can be scheduled via cron. Here is an example crontab which should be applied on both DBmarlin nodes (on the active node it is a no-op; only the passive node syncs from the active one). It runs the sync hourly and logs to /opt/dbmarlin/sync.log:
0 * * * * /opt/dbmarlin/ha_sync.sh --pull >> /opt/dbmarlin/sync.log 2>&1
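To make the pull and push directions concrete, here is a rough sketch of what an equivalent rsync command could look like. This is not the actual ha_sync.sh implementation, which is not shown here. The function name `build_sync_cmd`, the `LOCAL_DIR` variable, and the choice of `-a --delete` with `--exclude-from` are assumptions for illustration. Only the .rsyncignore file and the REMOTE_* settings come from the steps above. The function prints the command instead of running it.

```shell
#!/bin/sh
# Print (not run) an rsync command that mirrors the DBmarlin directory.
# Usage: build_sync_cmd pull|push [--dry-run]
# Expects REMOTE_HOST, REMOTE_USER and REMOTE_DIR to be set, e.g. by
# sourcing .ha_remote.conf. LOCAL_DIR (hypothetical) defaults to /opt/dbmarlin.
build_sync_cmd() {
  direction="$1"
  extra="$2"
  local_dir="${LOCAL_DIR:-/opt/dbmarlin}"
  remote="$REMOTE_USER@$REMOTE_HOST:$REMOTE_DIR/"

  # -a preserves permissions and timestamps; --delete removes files that no
  # longer exist on the source side (an assumption about the real script).
  # --exclude-from reads the patterns listed in .rsyncignore.
  opts="-a --delete --exclude-from=$local_dir/.rsyncignore"
  if [ "$extra" = "--dry-run" ]; then
    opts="$opts --dry-run"
  fi

  case "$direction" in
    pull) echo "rsync $opts $remote $local_dir/" ;;
    push) echo "rsync $opts $local_dir/ $remote" ;;
    *)    echo "usage: build_sync_cmd pull|push [--dry-run]" >&2; return 1 ;;
  esac
}
```

The trailing slashes matter: with rsync, a source path ending in `/` copies the directory's contents and not the directory itself, which keeps both sides aligned under the same path.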