Decision Release Performance and Scalability
  • 16 Nov 2023

The decision release process within Slate has been designed for scale. It is not uncommon for us to see single days during the year when more than 500,000 decisions are released within a short period of time, and we've engineered the entire process for this kind of scale.

We use content delivery networks to serve all static resources such as images, CSS, and JavaScript. Each month, more than 500 million requests are served through these networks, which leaves us with just the dynamic content and pages. Nearly all of the organizations with which we work use Slate authentication for their applicants, and our account login system uses minimal locking and in-memory caches to ensure that the login process runs smoothly under load. Once an applicant is in the status portal, all of the XSL transforms driving the pages are cached in memory, and the database lookups are extremely efficient. When an applicant views a letter, we write the timestamp of the view to an in-memory queue, which batch-commits the queued timestamps to the database every minute. This ensures that, under heavy load, there aren't tens of thousands of separate write transactions hitting the database simultaneously. Instead, they are pooled together into a single commit.
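To make the batching idea concrete, here is a minimal sketch in Python. It is not Slate's implementation: the `ViewLogBuffer` class, the `letter_view` table, and the use of an in-memory SQLite database are all assumptions chosen so that the example is self-contained. It shows only the general pattern of appending to memory on the request path and writing everything in one transaction later.

```python
import sqlite3
import threading
from datetime import datetime, timezone


class ViewLogBuffer:
    """Hypothetical buffer: collects letter-view timestamps in memory
    and writes them to the database in batches."""

    def __init__(self, conn):
        self._conn = conn
        self._lock = threading.Lock()
        self._pending = []

    def record_view(self, letter_id, viewed_at=None):
        # Called on the request path: an in-memory append, no database write.
        viewed_at = viewed_at or datetime.now(timezone.utc)
        with self._lock:
            self._pending.append((letter_id, viewed_at.isoformat()))

    def flush(self):
        # Swap the buffer out under the lock so that new views can keep
        # arriving while the database write is in progress.
        with self._lock:
            batch, self._pending = self._pending, []
        if not batch:
            return 0
        with self._conn:  # one transaction, one commit for the whole batch
            self._conn.executemany(
                "INSERT INTO letter_view (letter_id, viewed_at) VALUES (?, ?)",
                batch,
            )
        return len(batch)


# Assumed schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE letter_view (letter_id INTEGER, viewed_at TEXT)")

buffer = ViewLogBuffer(conn)
for letter_id in range(1000):
    buffer.record_view(letter_id)

# In a real service a timer would call flush() on a fixed interval,
# such as once a minute; here it is called once by hand.
written = buffer.flush()
```

In this sketch, a thousand recorded views result in a single commit rather than a thousand separate write transactions. The trade-off is that views recorded since the last flush exist only in memory until the next flush runs.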

The greatest risks, therefore, are probably not with any base functionality but with custom teasers or complex custom form logic that might introduce a bottleneck. Even there, we have many mechanisms in place to minimize record locking: all rules are deferred to execute asynchronously with a low deadlock priority, so that they can be killed and retried if an interactive process needs to take priority.
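The kill-and-retry behavior can be illustrated with a short Python sketch. This is an assumption-laden illustration rather than a description of Slate's internals: `DeadlockVictim` is a hypothetical exception standing in for whatever error a database driver raises when a low-priority transaction is chosen as the deadlock victim, and `run_deferred` and `flaky_rule` are made-up names.

```python
import time


class DeadlockVictim(Exception):
    """Hypothetical error: the database killed this task to break a deadlock."""


def run_deferred(task, max_attempts=5, base_delay=0.01):
    """Run a background task, retrying when it loses a deadlock."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except DeadlockVictim:
            if attempt == max_attempts:
                raise  # give up and surface the error after the last attempt
            # Wait a little longer after each failure so that the
            # interactive work that won the deadlock has time to finish.
            time.sleep(base_delay * attempt)


attempts = []


def flaky_rule():
    # Simulated deferred rule: loses the deadlock twice, then succeeds.
    attempts.append(1)
    if len(attempts) < 3:
        raise DeadlockVictim()
    return "rules applied"


result = run_deferred(flaky_rule)
```

Because the deferred work is the one that yields, the interactive request never waits on it; the background task simply runs again once the contention has passed.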

All this said, we're on standby to monitor any major decision releases, and we can selectively disable functionality if a custom, institution-specific process introduces a bottleneck (apart from an SSO bottleneck).

