Improve MongoDB query performance
As the amount of data in a MongoDB collection grows, query performance can decrease. This page covers several strategies to improve query performance: proper indexes, TTL indexes, capped collections, and archival.
Analyze query performance
Before pruning data, analyze query performance to check for proper indexes. Run explain plans to determine whether a specific query is using indexes. For more information, see the MongoDB tutorial on analyzing query performance using explain plans.
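As a minimal sketch of reading an explain plan programmatically, the helper below walks the winning plan and reports whether it contains a full collection scan (stage COLLSCAN), which usually signals a missing index. It assumes the explain output of an unsharded deployment; the function names are illustrative and not part of any MongoDB driver.

```python
def plan_stages(plan):
    """Return the stage names in a query plan, outermost stage first."""
    stages = [plan.get("stage", "UNKNOWN")]
    # A stage with one child nests it under "inputStage"; stages with
    # several children (for example OR) list them under "inputStages".
    children = []
    if "inputStage" in plan:
        children.append(plan["inputStage"])
    children.extend(plan.get("inputStages", []))
    for child in children:
        stages.extend(plan_stages(child))
    return stages


def has_collection_scan(explain_output):
    """True if the winning plan reads every document (stage COLLSCAN)."""
    winning = explain_output["queryPlanner"]["winningPlan"]
    # Assumption: newer servers may wrap the stage tree in a "queryPlan" key.
    winning = winning.get("queryPlan", winning)
    return "COLLSCAN" in plan_stages(winning)
```

With PyMongo, the input could come from something like collection.find({"status": "complete"}).explain(), where the status field is only a placeholder for your own query.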
Indexes are data structures that MongoDB uses for more efficient queries. If an explain plan reveals that a query lacks an index, create one. For more information, see the MongoDB manual on indexes. A proper indexing strategy takes priority over data pruning for improving query performance. If a collection has all the necessary indexes and query performance is still slow, consider the strategies below.
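Creating a missing index could look like the sketch below. It assumes a PyMongo-style collection object that exposes create_index; the helper name and the field names in the usage note are hypothetical.

```python
ASCENDING = 1  # same value as pymongo.ASCENDING


def ensure_index(collection, field_names):
    """Create an ascending index over the given fields, in order.

    `collection` is assumed to be a PyMongo-style collection. Creating an
    index that already exists with the same definition is a no-op.
    """
    keys = [(name, ASCENDING) for name in field_names]
    return collection.create_index(keys)
```

For example, ensure_index(db.jobs, ["status", "created"]) would build a compound index suited to queries that filter on status and sort or filter on created. Field order matters in a compound index, so list the fields in the order your queries use them.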
TTL indexes
If documents in a collection are not needed after a certain amount of time, a TTL index can automatically remove them, either a specified number of seconds after the date stored in the indexed field or at an exact time stored in that field. A TTL index is a single-field index, and documents expire only if the indexed field holds a date value or an array of date values. Removal is performed by a background task, so expired documents can remain in the collection briefly after their expiration time.
TTL indexes are not recommended as a primary means for improving query performance. Use them only if you are certain that the data in a collection needs to persist only for a specific amount of time. For more information, see the MongoDB manual on TTL indexes.
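The two expiration modes can be sketched as follows, assuming a PyMongo-style collection whose create_index accepts the expireAfterSeconds option. The helper name and the field names in the usage note are hypothetical.

```python
def create_ttl_index(collection, date_field, lifetime_seconds=0):
    """Create a TTL index on a single date field.

    lifetime_seconds > 0: documents expire that many seconds after the
    date stored in `date_field`.
    lifetime_seconds == 0: documents expire at the exact time stored in
    `date_field`, so each document carries its own expiration time.
    """
    if lifetime_seconds < 0:
        raise ValueError("lifetime_seconds must be zero or positive")
    return collection.create_index(
        [(date_field, 1)], expireAfterSeconds=lifetime_seconds
    )
```

For instance, create_ttl_index(db.sessions, "created_at", 3600) would remove session documents roughly one hour after creation, while create_ttl_index(db.sessions, "expire_at") would remove each document at the time stored in its expire_at field.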
Capped collections
A capped collection enforces a limit on the total data size that the collection can store and, optionally, on the number of documents. Capped collections use a first-in-first-out strategy: if an insertion pushes the collection past its maximum constraints, the oldest documents are removed to make room.
A collection is normally capped when it is created, although MongoDB also provides a convertToCapped command for existing collections. Use capped collections only when you need a specific amount of the most recently inserted data. For more information, see the MongoDB manual on capped collections.
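A capped collection could be created along these lines, assuming a PyMongo-style database object whose create_collection accepts the capped, size, and max options. MongoDB requires a maximum size in bytes for a capped collection; the document-count limit is optional. The helper name and the values in the usage note are hypothetical.

```python
def create_capped_collection(db, name, max_bytes, max_documents=None):
    """Create a capped collection limited by size and, optionally, count.

    `db` is assumed to be a PyMongo-style database object.
    """
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    options = {"capped": True, "size": max_bytes}
    if max_documents is not None:
        # The size limit still applies when a document limit is set.
        options["max"] = max_documents
    return db.create_collection(name, **options)
```

For example, create_capped_collection(db, "recent_events", 64 * 1024 * 1024, max_documents=100000) would keep at most 100,000 of the newest documents within 64 MiB.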
Archival strategy
If query performance is insufficient even with proper indexes on data that must be retained for historical records, an archival strategy can reduce the working size of a collection. This typically involves moving a subset of documents to another collection, database, or storage system. Documents that no longer need to persist can also be deleted outright. The right archival strategy depends on your use case and performance requirements.
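One possible shape for moving documents to another collection is sketched below. It assumes PyMongo-style collection objects (find, replace_one, delete_many); the function name and batch size are illustrative, and this is not a prescribed Itential or MongoDB procedure.

```python
def archive_documents(source, archive, query, batch_size=1000):
    """Move documents matching `query` from `source` to `archive`.

    Returns the number of documents moved. Each batch is copied before
    it is deleted, so an interruption can leave a document in both
    collections but never in neither.
    """
    moved = 0
    while True:
        batch = list(source.find(query).limit(batch_size))
        if not batch:
            return moved
        for doc in batch:
            # Upsert by _id so a rerun after a failure does not raise
            # duplicate-key errors in the archive.
            archive.replace_one({"_id": doc["_id"]}, doc, upsert=True)
        ids = [doc["_id"] for doc in batch]
        source.delete_many({"_id": {"$in": ids}})
        moved += len(batch)
```

Working in batches keeps memory use bounded and limits how long each delete holds resources. The archive can be a collection in the same database or in a different one, as long as it exposes the same methods.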
Archival example: job metrics
Beginning with 2021.1, Operations Manager includes a dashboard that displays metrics on jobs run in Itential Platform. This dashboard queries the wfe_job_metrics collection. Because only one document is inserted per automation, the collection rarely grows large enough to significantly impact performance. However, search performance can degrade over time if the document count becomes high enough.
In that scenario, you can create a script that archives or removes metrics for all jobs completed before a certain date, which can improve query performance. Metrics for automations with no recently completed jobs are also candidates for archival. Before proceeding, weigh the gain in query performance against the need to keep the archived data visible in the dashboard.
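Such a script might start from the sketch below. The schema of wfe_job_metrics is not described here, so the date field name last_completed is purely hypothetical, as is the assumption that it holds a date value; inspect a real document and adjust before use. The collection argument is assumed to be a PyMongo-style collection. The dry-run default reports how many documents would be affected, which helps when weighing performance against the loss of dashboard data.

```python
from datetime import datetime, timedelta, timezone

DATE_FIELD = "last_completed"  # hypothetical field name; verify first


def stale_metrics_filter(retention_days, now=None, date_field=DATE_FIELD):
    """Build a filter for metrics older than the retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return {date_field: {"$lt": cutoff}}


def prune_job_metrics(collection, retention_days, dry_run=True, now=None):
    """Count (dry run) or delete metrics older than the retention window."""
    query = stale_metrics_filter(retention_days, now)
    if dry_run:
        # Report only; nothing is removed.
        return collection.count_documents(query)
    return collection.delete_many(query).deleted_count
```

Run it first with dry_run=True, review the count, and take a backup or archive copy before calling it with dry_run=False.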