Maintenance of a Carto4ch server

Introduction

This section explains how to maintain:

  • the demonstrator
  • a Carto4CH partner server.

Maintenance mainly consists of regularly compacting the triplestore files and regularly backing up the database and the images.

Access to the triplestore Web interface

In order to observe or intervene directly on the triplestore, you need to create an SSH tunnel that forwards a local port (4040 here) to the Fuseki port (3030) on the demonstrator server.

ssh user@carto4ch.huma-num.fr -N -f -L 4040:carto4ch.huma-num.fr:3030

(replacing 'user' with your username)

You can then access the Jena Fuseki interface at the URL: http://localhost:4040/

You can then discover the data, run SPARQL queries and administer the datasets.
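Because the -f option sends ssh to the background, the tunnel stays open until its process is stopped. The following is a minimal sketch of helper functions to open and close it, assuming the host and ports of the command above. The names forward_spec, open_tunnel and close_tunnel are introduced here for illustration only.

```shell
# Values taken from the ssh command above.
REMOTE_HOST="carto4ch.huma-num.fr"
REMOTE_PORT=3030
LOCAL_PORT=4040

# Build the -L forwarding spec once, so that opening and closing stay in sync.
forward_spec() {
  echo "${LOCAL_PORT}:${REMOTE_HOST}:${REMOTE_PORT}"
}

# Open the tunnel in the background (same options as the command above).
# $1: your username on the demonstrator server.
open_tunnel() {
  ssh "$1@${REMOTE_HOST}" -N -f -L "$(forward_spec)"
}

# Close the tunnel by matching the forwarding spec on the ssh command line.
close_tunnel() {
  pkill -f "ssh .*-L $(forward_spec)"
}
```

For example, open_tunnel user opens the tunnel, and close_tunnel stops it once you have finished working in the Fuseki interface.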

Compacting

Known problem on Jena Fuseki

The Jena Fuseki database tends to bloat: its files keep growing over time. From time to time, you have to stop Fuseki and run a compaction command to prevent the database files from taking up too much disk space.

Solution: the compaction script

Everything you need to know is in the SemApps documentation > Triple Store > Compacting datasets

For example, for the demonstrator, we added (and tested) the following compact.sh script:

#!/bin/bash

# Uncomment if docker-compose is installed in /usr/local/bin (cron uses a minimal PATH)
#PATH=/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/bin

# Abort if the project directory is missing
cd /home/semapps/carto4ch || exit 1

# Stop the Fuseki container only (the other containers keep running)
/usr/bin/docker compose -f docker-compose-prod.yaml stop fuseki

# Run the Fuseki compaction on the same data directory as prod
/usr/bin/docker run --volume="$(pwd)"/data/fuseki:/fuseki --entrypoint=/docker-compact-entrypoint.sh semapps/jena-fuseki-webacl

# Restart the Fuseki container
/usr/bin/docker compose -f docker-compose-prod.yaml up -d fuseki

echo "Cron job finished at $(date)"

Script explanation

  • we stop the fuseki container
  • we run the docker-compact-entrypoint.sh script (contained in the fuseki image)
  • we restart the fuseki container
  • we print a timestamped message once the job has finished.
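Note that the script prints its final message whether or not the compaction itself succeeded. The following is a minimal sketch of a variant that records the exit status and still restarts Fuseki after a failed compaction. It assumes the same directory, compose file and image as above. PROJECT_DIR, COMPOSE_FILE and compact_fuseki are names introduced here for illustration, and docker is assumed to be on the PATH.

```shell
#!/bin/bash

# Assumed defaults, taken from the script above; override them via the environment.
PROJECT_DIR="${PROJECT_DIR:-/home/semapps/carto4ch}"
COMPOSE_FILE="${COMPOSE_FILE:-docker-compose-prod.yaml}"

compact_fuseki() {
  local status=0

  cd "$PROJECT_DIR" || return 1

  # Stop Fuseki only; give up if it cannot be stopped.
  docker compose -f "$COMPOSE_FILE" stop fuseki || return 1

  # Run the compaction and remember its exit code instead of ignoring it.
  docker run --volume="$(pwd)/data/fuseki:/fuseki" \
    --entrypoint=/docker-compact-entrypoint.sh semapps/jena-fuseki-webacl || status=$?

  # Always try to bring Fuseki back up, whatever happened above.
  docker compose -f "$COMPOSE_FILE" up -d fuseki || status=$?

  if [ "$status" -eq 0 ]; then
    echo "Compaction succeeded at $(date)"
  else
    echo "Compaction FAILED (exit code $status) at $(date)"
  fi
  return "$status"
}
```

Calling compact_fuseki on the last line of the script runs the whole sequence. The log then distinguishes successful runs from failed ones.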

Run the script regularly as a cron job

To run the compaction regularly, for example every night at 3am, add the script to the crontab with crontab -e:

0 3 * * * /home/semapps/carto4ch/compact.sh >> /home/semapps/carto4ch/cron-compact.log 2>&1

The log file records each run, so you can check that the job completed.
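As a quick check, the last completion message can be extracted from the log. This is a minimal sketch, assuming the log path used in the crontab line above and the "Cron job finished at" message printed by compact.sh. The function name last_compaction is hypothetical.

```shell
# Print the most recent completion line found in the compaction log.
# $1: path to the log file (defaults to the path used in the crontab entry).
last_compaction() {
  local log_file="${1:-/home/semapps/carto4ch/cron-compact.log}"
  grep "Cron job finished at" "$log_file" | tail -n 1
}
```

Running last_compaction without arguments prints the date of the last completed run. If that date is older than expected, the cron job has probably stopped running.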

Backup

Presentation

To avoid losing data, either from the demonstrator or from a partner's Carto4ch server, it is important to be able to set up regular backups of the database.

Documentation

The backup mechanism used in SemApps is described here: https://semapps.org/docs/middleware/backup

SFTP external backup tests

In the case of the demonstrator, we only have one server and it is not in "production", so we have not set up a permanent backup mechanism.

However, we have tested the mechanism so that we can describe it here, for partners who may wish to implement it.

To do this, we:

  • generated the backup files using the SemApps Backup service
  • set up an external FTP server (which will not be described here)
  • used file transfer via SFTP.

Two steps

We will explain here:

  • how to generate a backup of the Jena Fuseki database
  • how to transfer this backup regularly to an external server using SemApps features

Backup generation

Manual generation

You can generate a backup manually from the Jena Fuseki web interface.
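The same backup can also be requested from the command line through the Fuseki HTTP administration protocol, which exposes a /$/backup/{dataset} endpoint. The following is a minimal sketch, assuming the SSH tunnel on local port 4040 is open and that the admin account is named admin. The FUSEKI_ADMIN_PASSWORD variable, the function names and the dataset name passed as argument are assumptions made for illustration.

```shell
# Build the admin URL that asks Fuseki to back up one dataset.
# $1: base URL of Fuseki, $2: dataset name.
backup_url() {
  local base_url="$1" dataset="$2"
  printf '%s/$/backup/%s\n' "${base_url%/}" "$dataset"
}

# Send the backup request through the SSH tunnel.
# $1: dataset name; FUSEKI_ADMIN_PASSWORD must be set in the environment.
trigger_backup() {
  local dataset="$1"
  curl --fail --silent --show-error -X POST \
    -u "admin:${FUSEKI_ADMIN_PASSWORD}" \
    "$(backup_url "http://localhost:4040" "$dataset")"
}
```

For example, trigger_backup myDataset requests a backup of a dataset named myDataset, where the name should be replaced with one of the datasets listed in the Fuseki interface.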

Automatic generation

As described in the SemApps backup service documentation, it is possible to automate the backup of a directory via CRON.

We first added a user carto4ch on the remote FTP server.

We then tested an SFTP connection from the demonstrator to the remote server.

Note that Fuseki stores its backup files in the docker container, in the directory /app/fuseki-backups.

It is therefore important to expose this directory as a Docker volume:

  data-a:
    ...
    volumes:
      ...
      - ./data/fuseki/backups:/app/fuseki-backups

Next, we update (if necessary) the backup.service.js file in the middleware to tell it what time to start creating the backup files. Be careful not to schedule it at the same time as the compaction (seen earlier), because Fuseki is stopped while the compaction runs.

...
cronJob: {
  time: '00 04 * * *', // Every night at 4am
  timeZone: 'Europe/Paris'
}

Finally, we update the middleware .env.local file, specifying:

  • the FTP user (created on the remote FTP server)
  • its password
  • the URL of the FTP server
  • the directory where the files should be stored
  • the port number (left empty in our tests)
  • the backup directory where the files to be transferred are located

# Backup server (Optional)
SEMAPPS_BACKUP_SERVER_USER=carto4ch
SEMAPPS_BACKUP_SERVER_PASSWORD=*****
SEMAPPS_BACKUP_SERVER_HOST=url-of-ftp-server
SEMAPPS_BACKUP_SERVER_PATH=/home/carto4ch/backups
SEMAPPS_BACKUP_SERVER_PORT=
SEMAPPS_BACKUP_FUSEKI_DATASETS_PATH=/app/fuseki-backups/

Attention

Add a cron job to regularly delete files in the /app/fuseki-backups directory (i.e. data/fuseki/backups on the host); otherwise the backup files accumulate and take up disk space.
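Such a cleanup can be sketched as follows, assuming the backups are in /home/semapps/carto4ch/data/fuseki/backups on the host and that files older than 7 days can be deleted. The retention period, the function name prune_backups and the script path in the crontab comment are assumptions to adapt to your own policy.

```shell
# Delete backup files older than a given number of days and list what was removed.
# $1: backup directory on the host (assumed default below)
# $2: number of days to keep (assumed default: 7)
prune_backups() {
  local backup_dir="${1:-/home/semapps/carto4ch/data/fuseki/backups}"
  local keep_days="${2:-7}"
  find "$backup_dir" -type f -mtime +"$keep_days" -print -delete
}

# Hypothetical crontab entry, for a script that calls prune_backups every night at 5am:
# 0 5 * * * /home/semapps/carto4ch/prune-backups.sh >> /home/semapps/carto4ch/cron-prune.log 2>&1
```

Scheduling the cleanup after the 4am backup keeps it away from both the compaction and the backup generation.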