Open edX is an open source online learning platform originally developed by MIT and Harvard. It is a web-based application made up of many components, including student-facing, course authoring, course delivery, and content management services.
Open edX is built in Python and uses Django as its web framework. It uses MongoDB as a database backend. When building and setting up an Open edX environment, you need to plan for service uptime, because the platform is widely used by students and other learners.
High availability is a must for the MongoDB databases as well as for the application servers. For disaster recovery, a sound backup strategy is key, so that you know you can restore the data if something goes seriously wrong.
In this blog, we will review how to back up your Open edX MongoDB database.
Preparing the Backup Storage
The first thing we need to do is prepare the storage for the MongoDB backup. You can stage the backups on the same infrastructure as the Open edX services and then archive them offsite. For staging, you can use a Storage Area Network (SAN) or Network Attached Storage (NAS) mounted on one of the MongoDB servers. For offsite archiving, AWS provides a storage service called S3, while Google Cloud Platform has Cloud Storage.
These are on-demand services, priced per GiB of backup data stored. For safety, keep your Open edX database backup in at least two different locations: on premises and in the cloud.
Manual Backup for MongoDB
The usual way to back up MongoDB databases is the mongodump utility, which is typically installed alongside the MongoDB server. You can take the backup on one of the MongoDB servers by running mongodump as shown below:
$ mongodump --db edxapp --out /backups/open-edx/`date +"%m-%d-%y"`
2021-01-11T11:23:42.541-0500 writing edxapp.module to /backups/open-edx/01-11-21/edxapp/module.bson
2021-01-11T11:23:42.878-0500 writing edxapp.module metadata to /backups/open-edx/01-11-21/edxapp/module.metadata.json
2021-01-11T11:23:42.923-0500 done dumping edxapp.module (25359 documents)
2021-01-11T11:23:42.945-0500 writing edxapp.system.indexes to /backups/open-edx/01-11-21/edxapp/system.indexes.bson
……
This creates the backup on the MongoDB host. You can then use a script to move the backup files to other storage.
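A script for this step could look like the minimal sketch below. It is one possible approach under stated assumptions, not a definitive implementation: the function names (backup_dir, archive_dump, run_backup), the "run" argument, and the S3_BUCKET variable are hypothetical names introduced for illustration, and the upload assumes the AWS CLI is installed and already configured with credentials. The sketch uses an ISO date (YYYY-MM-DD) for the directory name so that backups sort chronologically, which differs from the date format in the command above.

```shell
#!/bin/sh
# Sketch: dump the Open edX database, compress the dump, and copy it offsite.
set -eu

BACKUP_ROOT="${BACKUP_ROOT:-/backups/open-edx}"  # staging directory used in the example above
DB_NAME="${DB_NAME:-edxapp}"                     # database name used in the example above
S3_BUCKET="${S3_BUCKET:-}"                       # hypothetical bucket name; empty disables the upload

# Print the dated dump directory for today, e.g. /backups/open-edx/2021-01-11.
backup_dir() {
  printf '%s/%s\n' "$BACKUP_ROOT" "$(date +%Y-%m-%d)"
}

# Compress a dump directory into <dir>.tar.gz beside it and print the archive path.
archive_dump() {
  dump_dir="$1"
  archive="${dump_dir}.tar.gz"
  tar -czf "$archive" -C "$(dirname "$dump_dir")" "$(basename "$dump_dir")"
  printf '%s\n' "$archive"
}

# Dump, compress, and (if a bucket is set) upload the archive to S3.
run_backup() {
  dir="$(backup_dir)"
  mkdir -p "$dir"
  mongodump --db "$DB_NAME" --out "$dir"
  archive_path="$(archive_dump "$dir")"
  if [ -n "$S3_BUCKET" ]; then
    # Assumption: the AWS CLI is installed and has write access to the bucket.
    aws s3 cp "$archive_path" "s3://${S3_BUCKET}/open-edx/"
  fi
}

# Only take a backup when explicitly asked to, e.g. ./edx-backup.sh run
if [ "${1:-}" = "run" ]; then
  run_backup
fi
```

Saved under a name of your choice, the script could be run by hand or from cron with the run argument. Setting S3_BUCKET in the environment turns on the offsite copy, which gives you the two locations recommended earlier: the staging directory on premises and the archive in the cloud.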
Backup MongoDB for Open edX Using ClusterControl
ClusterControl supports MongoDB backup for your Open edX platform. It supports mongodump, and we recently added support for a new backup method, Percona Backup for MongoDB (PBM), which is more appropriate for sharded MongoDB clusters. Taking a backup with mongodump in ClusterControl is straightforward thanks to a GUI-based wizard. Go to the Backup tab and select Create Backup. There are two options: create a backup immediately, or schedule it.
And then just click Continue:
Choose mongodump as the backup method, then enter the directory where you want to store the backup. In this step, you can use a Storage Area Network or Network Attached Storage that is mounted on your MongoDB server.
ClusterControl also supports backup to the cloud; currently we support Amazon Web Services (AWS), Google Cloud Platform, and Microsoft Azure.
You can also enable encryption for your backup, which is especially important if you are archiving in the cloud. Next, press Create Backup; it will trigger a new backup job, as shown below:
You can also use Percona Backup for MongoDB for consistent backups of your MongoDB replica set and sharded cluster. Just select percona-backup-mongodb as the backup method. It requires an agent to be installed on each node and shared storage to be mounted on every MongoDB node.