Performance tuning

When you start using EmailEngine and have only a few email accounts to test on, you can probably get away with even a very modest server without changing any configuration options. Once your EmailEngine usage grows, this might not be enough anymore.

A lot depends on your specific use case. If you mainly wait for webhooks and do not actively run API requests, a smaller server is enough, while actively running API requests requires a larger one. In some cases, though, just provisioning a larger server is not enough, and you have to adjust specific configuration options.

In this blog post, I try to cover most configuration options for tuning EmailEngine performance.

IMAP

EmailEngine runs a fixed number of worker threads to manage IMAP connections. If you have 100 email accounts and 4 IMAP worker threads, each worker processes ~25 email accounts. If you have a lot of CPU cores and are processing more than a small number of email accounts, consider increasing the number of workers to decrease the load on individual CPU cores.

EENGINE_WORKERS=4

Default is to run 4 worker threads
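
To make the arithmetic above concrete, here is a minimal Python sketch that estimates how many accounts each IMAP worker ends up with. The function name is made up for this example, and it assumes accounts are spread evenly across workers, which may not match how EmailEngine actually assigns them.

```python
import math

def accounts_per_worker(total_accounts: int, workers: int = 4) -> int:
    """Upper bound on accounts per worker, assuming an even spread (an assumption)."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return math.ceil(total_accounts / workers)

print(accounts_per_worker(100, 4))   # 25
print(accounts_per_worker(1000, 8))  # 125
```

If the per-worker number looks high for your hardware and you have idle CPU cores, that is a hint to raise EENGINE_WORKERS.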

If you process many email accounts with few available CPU cores, EmailEngine might overload your CPU when starting. Opening a TCP connection and running the initial commands to set up an IMAP session is CPU intensive, so doing the same for a large number of email accounts can be bad for the CPU.

You can overcome this by setting an artificial initialization delay. With a delay, EmailEngine does not start all email accounts at once but one by one, waiting for the delay to pass before initializing the next IMAP connection. The downside is a longer startup time. A 3-second delay for 1000 email accounts means that it takes EmailEngine about 50 minutes until all email accounts are active. This approach is practical primarily when you wait for webhooks from EmailEngine instead of actively running API commands, as any API request for an account fails until that account has been connected.

EENGINE_CONNECION_SETUP_DELAY=0ms

By default there is no delay set
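
Before picking a delay, it can help to estimate what it does to startup time. The sketch below simply multiplies the number of accounts by the delay. The function name is hypothetical, and it assumes accounts are started strictly one after another, so treat the result as a rough estimate.

```python
from datetime import timedelta

def startup_time(accounts: int, delay_seconds: float) -> timedelta:
    """Approximate time until the last account starts connecting (accounts x delay)."""
    return timedelta(seconds=accounts * delay_seconds)

print(startup_time(1000, 3))    # 0:50:00
print(startup_time(1000, 0.5))  # 0:08:20
```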

Faster notifications

If your use case requires faster notifications for specific folders (for example, you need updates from the Inbox and Sent Mail folder as fast as possible while ignoring everything else), use the sub-connection feature.

While adding a new email account (or updating an existing one), set the subconnections array property to list the folders you want to monitor in addition to the main folder (usually "Inbox"). Each entry is either an absolute folder path or a special-use folder flag.

For example, the following setting asks EmailEngine to monitor the Sent Mail folder (uses special use flag instead of absolute folder path) for changes.

{
  "subconnections": ["\\Sent"]
}

You can use specialUse flags like \Sent or absolute paths "Parent Folder/Sent Mail"

In this case, EmailEngine would not open just one TCP connection but 2. One connection is for the main process, and the other is for the Sent Mail folder. The main process detects changes in the Inbox folder, and the sub-connection detects changes in Sent Mail and notifies the main process about these changes. This allows you to prevent running excessive polling to get those changes. Polling for changes too often would considerably increase CPU usage, so it is better to poll less frequently.

If you are never interested in anything but a few selected mailbox folders, you can limit indexing by defining only the selected paths.

{
  "path": ["Inbox", "\\Sent"],
  "subconnections": ["\\Sent"]
}

This account configuration example sets up an additional probe connection for the Sent Mail folder and limits actively monitored folders to INBOX and the Sent Mail folder. In this case, EmailEngine never checks other folders.
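
If you generate these settings from code, note that the double backslash in the JSON examples is just the JSON escape for a single backslash. The following Python sketch builds the same structure and serializes it. The helper name is hypothetical, and only the path and subconnections fields shown above are used.

```python
import json

def monitoring_settings(paths, subconnections):
    """Build the folder-monitoring part of an account payload (fields from the examples above)."""
    return {"path": list(paths), "subconnections": list(subconnections)}

# In Python source, "\\Sent" is the 5-character string \Sent
settings = monitoring_settings(["Inbox", "\\Sent"], ["\\Sent"])
print(json.dumps(settings))
# {"path": ["Inbox", "\\Sent"], "subconnections": ["\\Sent"]}
```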

Webhooks

By default, EmailEngine processes webhooks serially, one at a time. So if you have 1000 webhook notifications in the queue, and it takes 100ms to process each one, then it should take 100 seconds to process all of them. While 100 seconds would not be a problem, consider that it often takes several seconds to process a webhook, which is more realistic. If you then have tens or hundreds of thousands of webhooks in the queue, it is going to take forever to crunch through the backlog.
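
To get a feel for the numbers, here is a small Python sketch of the backlog calculation. The function and parameter names are made up for illustration. The parallel parameter stands for the number of webhooks handled at the same time, and the model is idealized: it ignores retries and network variance.

```python
def drain_seconds(queued: int, ms_per_webhook: int, parallel: int = 1) -> float:
    """Idealized time to empty the webhook queue; parallel=1 models serial processing."""
    return queued * ms_per_webhook / 1000 / parallel

print(drain_seconds(1000, 100))                      # 100.0 seconds
print(round(drain_seconds(100_000, 2000) / 3600, 1)) # 55.6 hours
```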

EmailEngine inserts all processable events into the webhook queue (you can see the webhook queue contents in Tools → Arena → notify) even if you don't monitor that specific event or have webhooks disabled. EmailEngine decides whether a webhook should be sent at processing time.

To increase throughput, you can increase the number of webhook workers. You need to consider, though, that you might lose consistency by processing multiple webhooks in parallel. Depending on how your webhook processing endpoint works, you might get a messageDeleted event for an email before you get the messageNew event for it. In most cases, this is only a theoretical concern, but you need to be aware of it when designing your systems.

Use the following value to increase the number of worker threads that process webhooks:

EENGINE_WORKERS_WEBHOOKS=1

By default, there is only 1 webhook worker

Additionally, you can increase the queue concurrency variable. Use this only if you see that the CPU cores that run the webhook worker threads are underutilized, as a higher concurrency increases the load on those cores. For example, setting it to 5 means that each webhook worker processes 5 webhooks in parallel instead of 1.

EENGINE_NOTIFY_QC=1

By default, the concurrency is set to 1

You can calculate how many webhooks EmailEngine would maximally handle at a time by multiplying these values.

ACTIVE_WH = EENGINE_WORKERS_WEBHOOKS * EENGINE_NOTIFY_QC

When designing a webhook handler, aim to minimize the time it takes to process a webhook. Ideally, the handler should not perform any operations on the webhook data itself. Instead, it should quickly write the data to an internal processing queue, such as a Kafka host, which should take only a few milliseconds. This allows EmailEngine to function smoothly. Once the event is stored in the persistent internal queue system, you can take your time processing it without concerns about your EmailEngine instance crashing due to insufficient memory.
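
As a rough sketch of that "accept fast, process later" pattern, the Python example below runs a tiny HTTP endpoint that only parses the request body and puts it on a queue. Everything here is an assumption made for illustration: the in-process queue.Queue is a stand-in for a durable system such as Kafka, the sample payload is invented, and the port is picked at random so the demo can post one event to itself.

```python
import json
import queue
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

events = queue.Queue()  # stand-in for a durable queue such as Kafka (assumption)

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            # Enqueue only; no processing happens inside the request
            events.put(json.loads(self.rfile.read(length)))
            status = 200
        except ValueError:  # invalid JSON or invalid encoding
            status = 400
        self.send_response(status)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example quiet

def post_event(port, payload):
    """Send a sample webhook to the local endpoint and return the HTTP status."""
    req = urllib.request.Request(
        f"http://127.0.0.1:{port}/",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

server = HTTPServer(("127.0.0.1", 0), WebhookHandler)  # port 0: use any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# Invented sample payload; real webhooks carry more fields
status = post_event(server.server_port, {"event": "messageNew", "account": "example"})
print(status, events.qsize())

server.shutdown()
server.server_close()
```

A separate consumer would then read from the queue at its own pace. With a real persistent queue, a slow consumer no longer affects how quickly the endpoint responds.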

Email sending

Just like webhooks, EmailEngine processes a single email submission at a time by default. If you need to send a lot of emails, this is going to be too slow.

All queued emails are stored in Redis, which means they are held in RAM. If you expect to send many emails, make sure your Redis server is provisioned with enough memory to hold all of them.

Use the following variable to increase the number of worker threads that process email submissions:

EENGINE_WORKERS_SUBMIT=1

By default, there is only 1 worker thread. 

Additionally, you can increase the queue concurrency value just like with webhooks. Beware, though, that it takes way more RAM to send emails than it takes to process webhooks (the entire RFC822 formatted email needs to be read to memory by the submission worker when sending it), so increasing the concurrency too much might exhaust the available RAM for the worker thread. Your server might have a lot of RAM available, but each thread can allocate a few gigs of heap memory max.

EENGINE_SUBMIT_QC=1

By default, the concurrency is set to 1.
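
As a back-of-the-envelope check before raising the concurrency, the sketch below estimates how much raw message data a single submission worker might hold in memory at once. The function name is hypothetical, and the result is only a lower bound: it assumes every in-flight email is of the given size and ignores any processing overhead.

```python
def in_flight_message_mb(concurrency: int, message_mb: float) -> float:
    """Lower bound on raw message data held by one worker with all slots busy."""
    return concurrency * message_mb

# 10 parallel submissions of 25 MB emails per worker thread
print(in_flight_message_mb(10, 25))  # 250
```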

Redis

  • Make sure the latency between Redis and your EmailEngine instance is as small as possible, so do not use different data centers.
  • Only enable AOF for Redis if you have extremely fast disks. On some Linux distributions, AOF might be automatically enabled for the default Redis package.
  • When setting up Redis, provision a machine with at least twice the RAM that you think you are going to need. Redis needs extra RAM when running periodic snapshots, so regular RAM usage should always be at most 80% of available RAM to allow safe snapshotting. You also need leeway for runaway webhook queues etc., which might accumulate many entries in Redis in a short time.
  • For general mailbox indexing, budget 1-2 MB of Redis storage per email account, or more for accounts that hold a lot of emails. What matters is the number of emails on an account, not its size in bytes.
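
The sizing rules above can be turned into a quick estimate. The Python sketch below multiplies the per-account figure by the number of accounts and then doubles it for headroom. The function name and defaults are assumptions for illustration, and the estimate covers mailbox indexing only, not queued emails or webhook backlogs.

```python
def redis_ram_estimate_mb(accounts: int, mb_per_account: float = 2.0,
                          headroom_factor: float = 2.0) -> float:
    """Index storage per account times account count, times a safety factor."""
    return accounts * mb_per_account * headroom_factor

print(redis_ram_estimate_mb(5000))  # 20000.0, roughly 20 GB for 5000 accounts
```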

Redis keepalive

Some Redis optimization blog posts advise setting the Redis keepalive configuration value to 0. Do not do this! Better leave it as the default (300).

tcp-keepalive 300

The default TCP keepalive value is 300

Redis alternatives

Some Redis alternatives promise better throughput or other benefits compared to the original.

  • Upstash Redis (also the default for fly.io) – EmailEngine is compatible with Upstash Redis. Make sure that your EmailEngine instance runs in the same AWS or GCP DC as your Redis server. EmailEngine runs a huge amount of Redis commands, so check your plan conditions and monitor your usage to prevent any surprises. NB! Upstash Redis limits command size to 1MB. This means that you can't run any operations with large blob values, like sending emails with large attachments (EmailEngine stores all queued emails in Redis).
  • ElastiCache – is reported to be working with EmailEngine. You probably shouldn't use it, though. As the name states, it's a cache. Losing data in EC is normal. For example, it happens any time you restart your cluster. EmailEngine expects Redis storage to be more or less persistent.
  • Memurai – seems to work with EmailEngine. I have only run it for testing, so I don't know how it behaves under load in production.
  • Dragonfly – seems to work with EmailEngine. Dragonfly must be started either with the DFLY_default_lua_flags=allow-undeclared-keys environment variable or with the command line argument --default_lua_flags=allow-undeclared-keys.
  • KeyDB – seems to work with EmailEngine. I have only run it for testing, so I don't know how it behaves under load in production.

ElasticSearch

EmailEngine uses ElasticSearch as its Document Store backend. By default, Document Store is not enabled, so most users do not have to worry about it. If you use the Document Store option, make sure your ElasticSearch instance has enough memory available (you would also need to set the -Xmx and -Xms arguments for the Java process to allocate that memory to ElasticSearch) and fast disks for faster indexing.

Horizontal scaling

EmailEngine does not have built-in horizontal scaling support. If you ran multiple EmailEngine instances against the same database, all of them would try to sync the same email accounts. So, for now, the only option for larger installations is manual sharding: set up multiple independent EmailEngine instances and divide your email accounts between them. For example, assign the first 1000 email accounts to the first EmailEngine server, the next 1000 to the second server, and so on.
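
A minimal sketch of such range-based sharding in Python could look like the following. The instance URLs are hypothetical placeholders, and the sketch assumes every account has a stable sequential index in your own system, so that an account always maps to the same instance.

```python
# Hypothetical instance addresses, each backed by its own database
INSTANCES = [
    "https://ee-1.example.com",
    "https://ee-2.example.com",
    "https://ee-3.example.com",
]

def instance_for(account_index: int, accounts_per_instance: int = 1000) -> str:
    """Map a zero-based account index to the instance that owns it."""
    shard = account_index // accounts_per_instance
    if shard >= len(INSTANCES):
        raise ValueError("no instance provisioned for this account yet")
    return INSTANCES[shard]

print(instance_for(0))     # https://ee-1.example.com
print(instance_for(999))   # https://ee-1.example.com
print(instance_for(1000))  # https://ee-2.example.com
```

Your application then sends all API requests for an account to the instance returned for it. Adding capacity means appending a new instance for the next range of accounts.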