To optimize performance in a system, it's helpful to break down the time it takes for different processes to occur.
- Pulling Data
- Updating Data
- Processes that Occur on a Schedule
There is the amount of time something like a query or report takes to PULL DATA, which is largely determined by the complexity of the query.
For example, if you're pulling dozens of columns from tens or hundreds of thousands of rows joining in lots of different tables, that's just going to be a complex query in any system, so it's crucial to eliminate custom SQL or overly complex queries. Good data hygiene (not keeping around old data, not constantly importing new data, or overwriting existing data) certainly will help here, but often it's solvable by looking at what a specific process is doing.
Then there is the amount of time something takes to UPDATE DATA interactively, such as submitting a form or changing a mailing address. The time it takes for a process here often has nothing to do with server performance or database size but is often affected by other data-updating operations in the database that are blocking the process from completion. These operations include rules, imports, retention policies, consolidate records, and others because they update many records simultaneously. This category is often unrelated to the first.
Occur on a Schedule
Then there is the amount of time something takes to OCCUR ON A SCHEDULE, such as rules or scheduled exports. It's important to distinguish between rules running slowly and rules running infrequently. Most of the time, it's not that rules are running any more slowly, but rather that they are running more infrequently because of other updates that might be occurring on the system, such as those described in category two, or because you've queued up a ton of records to go through the rule execution process. It's not that rules take 30 minutes to run, but rather they might take 15 minutes to get through the evaluation and execution process, but because they are taking longer than desirable to evaluate the criteria (due to typically having a ton of rules or non-optimized rules like the SQL described in the first category), we won't be able to schedule the background job for consideration as frequently.
By breaking down system processes into these categories, it's possible to identify specific areas for optimization and improve overall system performance.
Efficiently written queries run manually have a window of five minutes to execute successfully before timing out. To prevent this, it's essential to remember the following strategies and considerations when building a new query.
Filter in the Affirmative
Select filters that affirmatively specify the records' attributes that should be returned in the query results whenever possible. Using IN rather than NOT or NOT IN is almost always more efficient. Ultimately, filtering to include records instead of excluding records improves your queries' accuracy and transparency.
For example, instead of filtering for person records with a Citizenship Status NOT IN US Citizen, Permanent Resident, or Blank, filter for Citizenship Status IN Foreign National.
Avoid Multiple OR Statements
OR statements multiply the permutations necessary to evaluate. When the filter logic for a query involves multiple OR statements, the better approach is often to build several smaller queries instead.
Filters should be combined whenever possible:
Rather than adding a configurable filter multiple times, with a different selection in each version:
Limit Searching of Free-Text Fields
Filters such as "Application Field Value Contains" are written to look at the field index, i.e., the field value's first 64 characters. Behind the scenes, this uses a LIKE operator in SQL, which can be resource-intensive and may involve searching against several thousand characters per row (You might search against several thousand characters per applicant which could end up being several hundred megabytes of data).
Limit Filters that Reference Large Tables
Some tables may have tens of millions of rows of data. Examples include:
- [source.index] (no longer available for use in queries or reports)
Depending on how the query has been written, referencing tables like these in your query can potentially reduce efficiency in query run times. Therefore, if you find that your query is running slow and is attempting to reach these data points, try inactivating the associated exports or filters to see if it improves the query's run time.
Use Literals to add Static Values
When each row in the query results should have the same static value in a given column, a Literal is a more efficient way of setting that value than using a formula or custom SQL snippet.
Use Existence exports rather than a formula with case statements - Existence exports can return a specific result (string) when a record does or does not meet a set of criteria. We highly recommend that you use these in place of a formula with a case statement whenever possible. Example Formula Case Statement:
CASE WHEN (@Race like '%Indian%') then 'Y'ELSE 'N' END.
This example includes an existence export that filters for Race IN American Indian. When the race is American Indian, display a Y; otherwise, show an N.
Existence exports can be incredibly helpful in improving query efficiency, as the server needs to check whether anything is there rather than evaluating for a specific value. Existence exports are created using subqueries when building Configurable Joins Queries.
Limit the Use of Filters/Exports that Require "Prep" Clauses (e.g., Geographic Proximity)
Filters and exports sometimes use prep clauses that can be set up in the Prep Clause, which are automatically set up in a Prep Clause setting behind the scenes when creating/editing specific exports and filters such as a Geographic Proximity.
Geographic proximity exports and filters require some additional preparations behind the scenes of the query you are creating. These preparations are necessary to calculate proximity correctly, but as you can imagine, it requires extra processing power. If not configured appropriately, some of these types of filters or exports may cause a query slowdown (e.g., an Upcoming Interview Geographic Proximity filter in a Prospects and Applicants query base).
Break a Large Query into Several Smaller Queries
When pulling data from several different tables where a standard query base exists, it can be considerably more efficient to create separate queries rather than one large query.
Limit the Use of Custom Query Bases (e.g., Departmental Query Bases)
Departmental query bases are typically hardcoded to evaluate the currently logged-in user's roles and permissions. This involves some behind-the-scenes layering of the query that increases complexity. As always, something that is more complex can affect the run times.
To put it simply, you're running queries in queries within your query.
Limit the use of Sorts
Applying sorts to complex queries can increase the time it takes for the query to execute successfully. Limiting the number of Sorts, or removing any additional format masks that may have been applied to the Sort exports, can reduce latency.
Consider Materialized Views for Processes
Materialized views can sometimes provide a more efficient solution when a process does not require up to the second production data.
Use Populations and Rules to Offload Computational Complexity
Use a rule to set a field value or assign a record to a population rather than manage it via a query.