Maintenance of a Carto4ch server
Introduction
This section explains how to maintain:
- the demonstrator
- a Carto4ch partner server.
Maintenance mainly consists of regularly compacting the triplestore files and backing up the database and the images.
Access to the triplestore Web interface
To inspect or work directly on the triplestore, you need to open an SSH tunnel that forwards a local port (4040 here) to the Fuseki port (3030) on the demonstrator server.
ssh user@carto4ch.huma-num.fr -N -f -L 4040:carto4ch.huma-num.fr:3030
(replace 'user' with your username)
You can then access the Jena Fuseki interface at http://localhost:4040/
From there you can explore the data, run SPARQL queries and administer the datasets.
Compacting
Known problem with Jena Fuseki
The Jena Fuseki database files keep growing over time. From time to time, you have to stop Fuseki and run a compaction command to prevent the database files from taking up too much disk space.
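To see whether compaction is due, you can measure the size of the Fuseki data directory. The sketch below is only an illustration: the helper names dir_size_kb and dir_exceeds_kb are hypothetical and are not part of SemApps or Fuseki.

```shell
#!/bin/bash
# Hypothetical helpers, not part of SemApps or Jena Fuseki.

# Print the disk usage of a directory in kilobytes.
dir_size_kb() {
  du -sk "$1" | awk '{print $1}'
}

# Succeed (exit status 0) when the directory uses more than the given number of kilobytes.
dir_exceeds_kb() {
  [ "$(dir_size_kb "$1")" -gt "$2" ]
}
```

On the demonstrator, the data directory would be /home/semapps/carto4ch/data/fuseki (the volume mounted by the compaction script). For example, dir_exceeds_kb /home/semapps/carto4ch/data/fuseki 10485760 succeeds once the directory exceeds 10 GB; that threshold is an arbitrary assumption to adapt to your disk.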
Solution: the compaction script
Everything you need to know is in the SemApps documentation > Triple Store > Compacting datasets
For example, for the demonstrator, we added (and tested) the following compact.sh script:
#!/bin/bash
# Add /usr/local/bin directory where docker-compose is installed
#PATH=/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/bin
cd /home/semapps/carto4ch
# Stop the Fuseki container
/usr/bin/docker compose -f docker-compose-prod.yaml stop fuseki
# Run fuseki compact with same data as prod
/usr/bin/docker run --volume="$(pwd)"/data/fuseki:/fuseki --entrypoint=/docker-compact-entrypoint.sh semapps/jena-fuseki-webacl
# Restart the Fuseki container
/usr/bin/docker compose -f docker-compose-prod.yaml up -d fuseki
echo "Cron job finished at $(date)"
Script explanation
- stop the fuseki container
- run the docker-compact-entrypoint.sh script (shipped in the semapps/jena-fuseki-webacl image) on the production data
- restart the fuseki container
- print the time at which the job finished.
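The script above does not check whether each step succeeded. As a possible variant (a sketch, not the script tested on the demonstrator), the same three steps can be wrapped in a function that still restarts Fuseki when compaction fails and reports an exit status. The DOCKER variable and the compact_fuseki name are assumptions introduced for this example.

```shell
#!/bin/bash
# Sketch only: same steps as compact.sh, but Fuseki is restarted even when
# compaction fails. DOCKER is a hypothetical override (DOCKER=echo gives a dry run).
DOCKER="${DOCKER:-/usr/bin/docker}"

compact_fuseki() (
  # The function body runs in a subshell, so cd does not affect the caller.
  cd "$1" || exit 1
  status=0
  # Give up early if Fuseki cannot be stopped: compacting a live database is unsafe.
  "$DOCKER" compose -f docker-compose-prod.yaml stop fuseki || exit 1
  "$DOCKER" run --volume="$(pwd)"/data/fuseki:/fuseki \
    --entrypoint=/docker-compact-entrypoint.sh semapps/jena-fuseki-webacl || status=$?
  # Always try to bring Fuseki back up, whatever happened above.
  "$DOCKER" compose -f docker-compose-prod.yaml up -d fuseki || status=$?
  echo "Compaction finished at $(date) with status $status"
  exit "$status"
)
```

Calling compact_fuseki /home/semapps/carto4ch at the end of the script runs the compaction; a non-zero status in the log then points to a failed step.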
Run the script regularly with cron
The script can be added to the crontab with crontab -e, for example to run compaction every night at 3am:
0 3 * * * /home/semapps/carto4ch/compact.sh >> /home/semapps/carto4ch/cron-compact.log 2>&1
The log file keeps the output of each run, so you can check that compaction completed.
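A quick way to read that log is to look for the last completion line written by compact.sh. The helper below is a minimal sketch; the function name last_compaction_run is hypothetical. Keep in mind that compact.sh prints this line even if a docker command failed, so any error messages above it in the log are worth checking too.

```shell
#!/bin/bash
# Hypothetical helper: print the last "Cron job finished at" line of a log file,
# or return a non-zero status when the log contains none.
last_compaction_run() {
  local log_file="$1" line
  line=$(grep 'Cron job finished at' "$log_file" | tail -n 1)
  [ -n "$line" ] && echo "$line"
}
```

For the demonstrator, the call would be last_compaction_run /home/semapps/carto4ch/cron-compact.log.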
Backup
Presentation
To avoid losing data, whether from the demonstrator or from a partner's Carto4ch server, it is important to set up regular backups of the database.
Documentation
The backup mechanism used in SemApps is described here: https://semapps.org/docs/middleware/backup
SFTP external backup tests
In the case of the demonstrator, we only have one server and it is not in "production", so we have not installed a backup mechanism in production.
However, we have tested the mechanism so that we can describe it here, for partners who may wish to implement it.
To do this, we:
- generated the backup files using the SemApps Backup service
- set up an external FTP server (which will not be described here)
- used file transfer via SFTP.
Two steps
This section explains:
- how to generate a backup of the Jena Fuseki database
- how to copy this backup regularly to an external server using SemApps features
Backup generation
Manual generation
You can generate a backup manually from the Jena Fuseki web interface.
Automatic generation
As described in the SemApps backup service documentation, it is possible to automate the backup of a directory via CRON.
We first added a user carto4ch on the remote FTP server.
We then tested an SFTP connection from the demonstrator to the remote server.
Note that Fuseki stores its backup files in the Docker container, in the directory /app/fuseki-backups.
It is therefore important to mount this directory as a Docker volume:
data-a:
  ...
  volumes:
    ...
    - ./data/fuseki/backups:/app/fuseki-backups
Next, we update (if necessary) the backup.service.js file in the middleware to set the time at which the backup files are created. Make sure this time differs from the compaction time (see above), since Fuseki is stopped while compaction runs.
...
cronJob: {
  time: '00 04 * * *', // Every night at 4am
  timeZone: 'Europe/Paris'
}
Finally, we update the middleware .env.local file, specifying:
- the FTP user (created on the remote FTP server)
- its password
- the URL of the FTP server
- the directory where the files should be stored
- the port number (left empty in our tests)
- the backup directory where the files to be transferred are located
# Backup server (Optional)
SEMAPPS_BACKUP_SERVER_USER=carto4ch
SEMAPPS_BACKUP_SERVER_PASSWORD=*****
SEMAPPS_BACKUP_SERVER_HOST=url-of-ftp-server
SEMAPPS_BACKUP_SERVER_PATH=/home/carto4ch/backups
SEMAPPS_BACKUP_SERVER_PORT=
SEMAPPS_BACKUP_FUSEKI_DATASETS_PATH=/app/fuseki-backups/
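Before restarting the middleware, it can help to check that none of these variables was left empty by mistake. The sketch below assumes the file uses plain NAME=value lines as shown above; the function name missing_backup_vars is hypothetical. SEMAPPS_BACKUP_SERVER_PORT is deliberately not checked, since it was left empty in our tests.

```shell
#!/bin/bash
# Hypothetical helper: print the name of each backup variable that is missing
# or empty in an env file. SEMAPPS_BACKUP_SERVER_PORT is skipped on purpose.
missing_backup_vars() {
  local env_file="$1" name
  for name in SEMAPPS_BACKUP_SERVER_USER SEMAPPS_BACKUP_SERVER_PASSWORD \
              SEMAPPS_BACKUP_SERVER_HOST SEMAPPS_BACKUP_SERVER_PATH \
              SEMAPPS_BACKUP_FUSEKI_DATASETS_PATH; do
    # A variable counts as set when a line "NAME=<at least one character>" exists.
    grep -Eq "^${name}=.+" "$env_file" || echo "$name"
  done
}
```

Running missing_backup_vars .env.local from the middleware directory prints nothing when all five variables have a value.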
Add a cron job to regularly delete files in the /app/fuseki-backups directory (i.e. data/fuseki/backups on the host).
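One possible way to do this cleanup is a small function built on find. This is a sketch under assumptions: the name prune_backups and the default retention of 7 days are illustrative choices, not something prescribed by SemApps.

```shell
#!/bin/bash
# Hypothetical helper: delete regular files older than a number of days
# (default 7, an arbitrary retention period) in a backup directory.
prune_backups() {
  local dir="$1" days="${2:-7}"
  # Refuse to run on a missing directory instead of letting find fail.
  [ -d "$dir" ] || return 1
  find "$dir" -type f -mtime +"$days" -exec rm -f {} +
}
```

Called from a script run by cron, for example prune_backups /home/semapps/carto4ch/data/fuseki/backups 7, it removes only files whose last modification is older than the retention period. Schedule it at a time that differs from both the compaction and the backup jobs.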