A Much Better Way To Backup GitLab to AWS S3

Balaji D Loganathan


Like many of you, we use the self-hosted version of GitLab CE, and it is one of the best open source applications written in Ruby on Rails. Although GitLab ships its own rake task for backing up to AWS S3, it wasn’t working for us or for several other users. So we developed our own solution and are sharing it here in case it helps others in need.

Once the combined size of the GitLab repositories and uploads passes 6 GB, the chance of GitLab’s backup rake task completing drops to roughly 50/50 in our experience. When it fails, it is usually either while preparing the backup or while uploading it to AWS S3. To avoid this issue, we simply split the backup into two segments.

First, we upload the repositories, and then we upload the uploads directory. This is done with a simple shell script, available at https://gist.github.com/balajidl/44c9389835547fa40e88e881ac1d40ee

What does the shell script do?

  1. Initialize the log file
  2. Back up the repositories using the GitLab-provided rake task gitlab:backup:create, skipping the uploads folder: gitlab:backup:create SKIP=uploads
  3. Back up the uploads by compressing the GitLab uploads folder
  4. Upload the two archive files (.tar and .tar.gz) to AWS S3
  5. Periodically remove old backup files to avoid excess billing 🙂
  6. Send an email using SendGrid when the backup is successful.
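
Step 1 and the variables the script relies on (LOGFILE, KEEP_DAYS) do not appear in the excerpt that follows. As a minimal sketch of what that setup might look like: the log path and the default retention below are assumptions for illustration, not values taken from the original script.

```shell
#!/bin/bash
# Hypothetical setup block; the path and defaults below are illustrative assumptions.
LOGFILE="${LOGFILE:-/tmp/gitlab_s3_backup.log}"  # assumed log location
KEEP_DAYS="${KEEP_DAYS:-3}"                      # assumed retention, in days

# Initialize (truncate) the log file so each run starts with a clean log.
: > "$LOGFILE"
echo "Log initialized on $(date +%F)" >> "$LOGFILE"
```

Truncating the log on every run is only one option; appending and rotating the file separately would work just as well.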
echo "**************START*************************" >> $LOGFILE
#---------1. GITLAB git repos backup ---------
echo "Start the gitlab backup process" >> $LOGFILE
/opt/gitlab/bin/gitlab-rake gitlab:backup:create SKIP=uploads >> $LOGFILE
echo "git backup done" >> $LOGFILE
echo "Now, uploading gitlab repo to s3" >> $LOGFILE
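DATE=`date +%Y_%m_%d`
# Assumption: the FILENAME definition is missing from this excerpt. GitLab puts the
# date (YYYY_MM_DD) in the archive name, so we match on today's date here.
FILENAME="*${DATE}*_gitlab_backup.tar"
# Pick the newest match in case more than one backup was taken today
EE="$(find /mnt/data/gitlab/ -type f -name "$FILENAME" | sort | tail -n 1)"
# Stop here if no backup file was found, instead of calling aws with an empty path
if [ -z "$EE" ]; then
echo "No gitlab backup file found for ${DATE}, aborting" >> $LOGFILE
exit 1
fi
echo "Gitlab backup repo file to be uploaded is: " $EE >> $LOGFILE
/usr/local/bin/aws s3 cp "${EE}" s3://meow/ >> $LOGFILE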
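#---------2. GITLAB uploads backup ---------
echo "Start - gitlab uploads folder to s3" >> $LOGFILE
BAK_DATE=`date +%F`
BAK_DATETIME=`date +%F-%H%M`
# Assumption: the BAK_FILE definition and the tar step are missing from this excerpt.
# The archive name is illustrative; UPLOADS_DIR must point at your GitLab uploads folder.
BAK_FILE="/mnt/data/gitlab/backups/gitlab-uploads-${BAK_DATETIME}.tar.gz"
echo "creating tar.gz file at " ${BAK_FILE} >> $LOGFILE
tar -czf "${BAK_FILE}" "${UPLOADS_DIR:?set UPLOADS_DIR to the GitLab uploads folder}" >> $LOGFILE 2>&1
echo "Now uploading gitlab uploads folder to s3" >> $LOGFILE
/usr/local/bin/aws s3 cp "${BAK_FILE}" s3://meow/ >> $LOGFILE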
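#---------3. Delete old files ---------
# Fall back to 3 days (the value originally hard-coded below) when KEEP_DAYS is not set
KEEP_DAYS=${KEEP_DAYS:-3}
echo 'Deleting backup older than '${KEEP_DAYS}' days' >> $LOGFILE
find /mnt/data/gitlab/backups/ -type f -name '*.tar' -mtime +${KEEP_DAYS} -exec rm {} \;
find /mnt/data/gitlab/backups/ -type f -name '*.tar.gz' -mtime +${KEEP_DAYS} -exec rm {} \;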
#---------4. Send email to me to make me smile ---------
BAK_DATETIME=`date +%F-%H:%M`
SUBJECT="Gitlab backup successful: ${BAK_DATETIME}"
# JSON body for the SendGrid v3 mail/send endpoint
REQUEST_DATA='{"personalizations": [{
"to": [{ "email": "foo@foo.bar" }],
"subject": "'"$SUBJECT"'"
}],
"from": {
"email": "foo@foo.bar",
"name": "Code.spritle.com"
},
"content": [{
"type": "text/plain",
"value": "Keep smiling"
}]
}'
curl -X "POST" "https://api.sendgrid.com/v3/mail/send" \
-H "Authorization: Bearer $SENDGRID_API_KEY" \
-H "Content-Type: application/json" \
-d "$REQUEST_DATA"
echo "Sent email notification via sendgrid" >> $LOGFILE
echo "***************END***************************" >> $LOGFILE
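
The cleanup step in the script relies on find's -mtime test, which counts age in whole 24-hour periods: -mtime +3 matches files whose age, rounded down to whole days, is greater than 3, so a file has to be at least 4 days old before it is removed. The sketch below shows this in a throwaway directory. prune_old is a hypothetical helper name, and touch -d assumes GNU coreutils.

```shell
#!/bin/bash
# prune_old DIR DAYS: delete *.tar and *.tar.gz files in DIR that match -mtime +DAYS.
prune_old() {
  find "$1" -type f \( -name '*.tar' -o -name '*.tar.gz' \) -mtime +"$2" -exec rm {} \;
}

# Build a scratch directory with backdated files to see what gets removed.
demo_dir="$(mktemp -d)"
touch -d '10 days ago' "$demo_dir/old_gitlab_backup.tar"
touch -d '10 days ago' "$demo_dir/old_uploads.tar.gz"
touch -d '2 days ago'  "$demo_dir/recent_gitlab_backup.tar"
touch -d '10 days ago' "$demo_dir/notes.txt"   # old, but not a backup archive

prune_old "$demo_dir" 3
ls "$demo_dir"   # the two old archives are gone; the recent archive and notes.txt remain
```

Note that this only prunes the local backup directory, as the script does; objects already uploaded to S3 are not touched by it.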
Backup email alert sent via sendgrid
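
The script builds the notification's JSON body inline, which makes it easy to lose a closing bracket. One alternative, sketched here under assumptions, is to move the body into a small function. build_payload and send_mail are hypothetical helpers that are not part of the original script, and the sender address is simply reused as the recipient, as in the script.

```shell
#!/bin/bash
# build_payload TO SUBJECT BODY: print a SendGrid v3 mail/send JSON body.
# Values are inserted verbatim, so they must not contain double quotes or
# backslashes (no JSON escaping is done in this sketch).
build_payload() {
  printf '{"personalizations":[{"to":[{"email":"%s"}],"subject":"%s"}],' "$1" "$2"
  printf '"from":{"email":"%s"},' "$1"
  printf '"content":[{"type":"text/plain","value":"%s"}]}\n' "$3"
}

# send_mail TO SUBJECT BODY: post the payload; expects SENDGRID_API_KEY to be set.
send_mail() {
  curl -sS -X POST "https://api.sendgrid.com/v3/mail/send" \
    -H "Authorization: Bearer $SENDGRID_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(build_payload "$@")"
}

# Print the payload without sending anything.
build_payload "foo@foo.bar" "Gitlab backup successful" "Keep smiling"
```

To send the mail only when the upload worked, the call can be chained to the upload command, for example with `&& send_mail "foo@foo.bar" "Gitlab backup successful" "Keep smiling"`.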

Voila ^_^!

Amazing! Isn’t it? Please feel free to comment if you have questions or need help. Happy to help 🙂
