When in production, an application should provide a timely response to the user for the purpose of improving user interaction with your application. At times, however, database queries may start to lag hence taking a longer latency for a response to reach the user or rather the throughput operation terminated due to surpassing the set average timeout.
In this blog we are going to learn how you can identify these problems in MongoDB, ways to fix them whenever they arise and what are the possible strategies to undertake so that this may not happen again.
More often, what leads to slow query responses is degraded CPU capacity that is unable to withstand the underlying working set. Working set in this case is the amount of data and indexes that will be subjected to a throughput instance hence active at that moment. This is especially considered in capacity planning when one expects the amount of data involved to increase over time and the number of users engaging with your platform.
Identifying a Slow Query Problem
There are two ways you can identify slow queries in MongoDB.
- Using the Profiler
- Using db.currentOp() helper
Using the MongoDB Profiler
Database profiler in MongoDB is a mechanism for collecting detailed information about Database Commands executed against a running mongod instance that is: throughput operations (Create, Read, Update and Delete) and the configuration & administration commands.
The profiler utilizes a capped collection named system.profile where it writes all the data. This means, when the collection is full in terms of size, the older documents are deleted to give room for new data.
The Profiler is off by default but depending on the profiling level one can enable it on a per-database or per instance. The possible profiling levels are:
- 0 - the profiler is off hence does not collect any data.
- 1 - the profiler collects data for operations that take longer than the value of slowms
- 2- the profiler collects data for all operations.
However, enabling profiling generates a performance impact on the database and disk usage especially when the profiling level is set to 2 . One should consider any performance implications before enabling and configuring the profiler on a production deployment.
To set the profiling, we use the db.setProfilingLevel() helper such as:
db.setProfilingLevel(2)
A sample document that will be stored in the system.profile collection will be:
{ "was" : 0, "slowms" : 100, "sampleRate" : 1.0, "ok" : 1 }
The “ok”:1 key-value pair indicates that the operation succeeded whereas slowms is the threshold time in milliseconds an operation should take and by default is 100ms.
To change this value
db.setProfilingLevel(1, { slowms: 50 })
To query for data against the system.profile collection run:
db.system.profile.find().pretty()
Using db.currentOp()helper
This function lists the current running queries with very detailed information such as how long they have been running. On a running mongo shell, you run the comment for example:
db.currentOp({“secs_running”: {$gte: 5}})
Where secs_running is the filtering strategy so that only operations that have taken more than 5 seconds to perform will be returned, reducing the output. This is often used when the CPU’s health can be rated 100% due to adverse performance impact it may implicate on the database. So by changing the values you will learn which queries are taking long to execute.
The returned documents have the following as the keys of interest:
- query: what the query entails
- active: if the query is still in progress.
- ns: collection name against which the query is to be executed
- secs_running: duration the query has taken so far in seconds
By highlighting which queries are taking long, you have identified what is overloading the CPU.
Interpreting Results and Fixing the Issues
As we have described above, query latency is very dependent on the amount of data involved which will otherwise lead to inefficient execution plans. This is to say, for example if you don’t use indexes in your collection and want to update certain records, the operation has to go through all the documents rather than filtering for only those that match the query specification. Logically, this will take longer time hence leading to a slow query. You can examine an inefficient execution plan by running: explain(‘executionStats’) which provides statistics about the performance of the query. From this point you can learn how the query is utilizing the index besides providing a clue if the index is optimal.
If the explain helper returns
{
"queryPlanner" : {
"plannerVersion" : 1,
...
"winningPlan" : {
"stage" : "COLLSCAN",
...
}
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 3,
"executionTimeMillis" : 0,
"totalKeysExamined" : 0,
"totalDocsExamined" : 10,
"executionStages" : {
"stage" : "COLLSCAN",
...
},
...
},
...
}
queryPlanner.winningPlan.stage: COLLSCAN key value indicates that the mongod had to scan the entire collection document to identify the results hence it becomes an expensive operation hence leading to slow queries.
executionStats.totalKeysExamined:0means the collection is not utilizing indexing strategy
For a given query, the number of documents involved should be close to zero. If the number of documents is quite large there are two possibilities:
- Not using indexing with the collection
- Using an index which is not optimal.
To create an index for a collection run the command:
db.collection.createIndex( { quantity: 1 } )
Where quantity is an example field you have selected to be optimal for the indexing strategy.
If you want to learn more about indexing and which indexing strategy to use, check on this blog
Conclusion
Database performance degradation can be easily portrayed by having slow queries which is the least expectation we would want platform users to encounter. One can identify slow queries in MongoDB by enabling the profiler and configuring it to its some specifications or executing db.currentOp() on a running mongod instance.
By looking at the time parameters on the returned result, we can identify which queries are lagging. After identifying these queries, we use the explain helper on these queries to get more details for example if the query is using any index.
Without indexing, the operations become expensive since a lot of documents need to be scanned through before applying the changes. With this setback, the CPU will be overworked hence resulting in slow querying and rising CPU spikes.
The major mistake that leads to slow queries is inefficient execution planning which can be easily resolved by using an index with the involved collection.