Data and security compliance

Security and privacy matter both from the user's viewpoint and as a legal requirement. Because almost everything related to email is highly sensitive, EmailEngine tries to make compliance on these fronts as easy as possible.

Unlike with some SaaS vendors, many security and data compliance issues do not apply to EmailEngine at all. EmailEngine is not a vendor but a self-hosted application, and it avoids storing identifiable or private data where it can. Being self-hosted also means that you do not have to list EmailEngine as a separate data processor. It is part of your own stack.

In this post, I'm going to address different security-related topics, how EmailEngine handles them, and what the concerns are. I'll also list all the data that EmailEngine stores.

Authentication

EmailEngine requires Bearer tokens for API calls. The web UI uses a regular cookie-based login, and there is also a public account registration form that does not require any authentication.
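As a minimal sketch of what a token-authenticated API call looks like, the snippet below builds (but does not send) a request with a Bearer token in the Authorization header. The base URL, the /v1/accounts path, and the token value are placeholders I picked for illustration; substitute the values from your own installation.

```python
import urllib.request


def build_api_request(base_url: str, path: str, access_token: str) -> urllib.request.Request:
    """Build a GET request that carries the access token as a Bearer token."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


# Placeholder values, not taken from a real installation.
request = build_api_request("http://127.0.0.1:3000", "/v1/accounts", "example-token")

# To actually send the request:
# with urllib.request.urlopen(request) as response:
#     body = response.read()
```

The request is only constructed here so that the example can be run without a live server.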

Encryption

Encryption is not enabled by default. If you do not provide EmailEngine with an encryption secret, all data is stored unencrypted. If you enable encryption after you have already stored some data, that data stays unencrypted until the specific record is updated or you run the migration tool.
  • Field-level encryption. EmailEngine encrypts all secret database fields, e.g. IMAP/SMTP account passwords, the OAuth2 client secret, and OAuth2 access and refresh tokens.
    EmailEngine uses the aes-256-gcm cipher for encryption.
    Encrypted field values cannot be retrieved via the API. You only get an indication of whether the field is set, not the actual value.
    The encryption secret can be provided to EmailEngine either directly by setting an environment variable or via Vault, if your security policy requires secrets to be held in secure storage.
  • Disk-level encryption. EmailEngine does not manage disks, so this is out of scope. You need to manage this yourself by either using an encrypted file system or using platform-provided encrypted volumes. It mostly affects disk volumes for the Redis servers as EmailEngine stores all its data in Redis.
  • Encryption in Transit. There are multiple ways data is transferred between servers and services.
    1. REST API requests. EmailEngine only provides an HTTP interface. To secure it, make sure that EmailEngine is bound only to localhost, and then set up a reverse proxy that handles HTTPS.
    2. EmailEngine and Redis. If EmailEngine and Redis are both hosted on the same server, this can be disregarded. For multi-server setups, note that Redis versions older than 6 do not support TLS at all, although you can use secure tunnels instead. In any case, if you can make TLS connections to Redis, use the rediss:// connection URL instead of redis:// in the configuration.
    3. EmailEngine and IMAP/SMTP servers. EmailEngine always tries to use TLS or STARTTLS. In fact, most servers today reject authentication attempts that are not made over encrypted connections.
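The Redis part of the list above can be turned into a small pre-flight check. This is only a sketch: the function name and the set of "local" hostnames are my own assumptions, and it cannot know about secure tunnels, so a tunnelled redis:// URL to a remote host would be reported as unprotected.

```python
from urllib.parse import urlsplit

# Hostnames treated as "same server" in this sketch (an assumption).
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def redis_link_is_protected(redis_url: str) -> bool:
    """Return True if the URL uses TLS (rediss://) or points at the local host."""
    parts = urlsplit(redis_url)
    if parts.scheme == "rediss":
        return True
    if parts.scheme == "redis":
        return parts.hostname in LOCAL_HOSTS
    raise ValueError(f"Not a Redis connection URL: {redis_url!r}")


# Hypothetical connection URLs.
print(redis_link_is_protected("redis://127.0.0.1:6379/8"))
print(redis_link_is_protected("redis://redis.internal:6379/8"))
print(redis_link_is_protected("rediss://redis.internal:6379/8"))
```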

Data collection

EmailEngine stores some data in order to operate. In general, the stored data is only the metadata required to detect differences between syncing operations. This is how EmailEngine knows whether an email was added to or removed from a folder.

1. Account data

EmailEngine stores the data structure you provide via the account add/update REST API endpoints. The identifiable data is the name of the user and the IMAP/SMTP username, which is usually (but not always) the account's email address. Passwords are encrypted, assuming that EmailEngine encryption is set up.

2. Folder-level data

  • Folder names. Pathnames are identifiers in IMAP so this is a must-have.
  • Technical values like UIDVALIDITY, HIGHESTMODSEQ, UIDNEXT are needed to detect changes in a folder. These are numeric values that do not contain any identifiable information.
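To show why these three numbers are enough to notice that a folder has changed, here is a rough sketch based on standard IMAP semantics. It is not EmailEngine's actual code; the FolderState class and the returned labels are names I made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FolderState:
    path: str
    uid_validity: int
    uid_next: int
    highest_modseq: Optional[int] = None  # absent if the server does not report it


def detect_folder_changes(stored: FolderState, current: FolderState) -> List[str]:
    """Compare the stored folder state with the state the server reports now."""
    if stored.uid_validity != current.uid_validity:
        # Previously seen UIDs are no longer valid, so a full resync is needed.
        return ["uidvalidity-changed"]
    changes = []
    if current.uid_next > stored.uid_next:
        # At least one message has been added since the last sync.
        changes.append("messages-added")
    if (
        stored.highest_modseq is not None
        and current.highest_modseq is not None
        and current.highest_modseq > stored.highest_modseq
    ):
        # Some message metadata (e.g. flags) has changed.
        changes.append("messages-updated")
    return changes


stored = FolderState("INBOX", uid_validity=1, uid_next=100, highest_modseq=500)
current = FolderState("INBOX", uid_validity=1, uid_next=101, highest_modseq=501)
print(detect_folder_changes(stored, current))
```

An empty result only means that no change is visible from these three values alone.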

3. Email-level data

EmailEngine stores metadata for every email in a folder. These values are only needed to track changes or to identify messages. Content values that do not change or do not help to identify a message (e.g. the subject line) are not stored. Instead, they are fetched from the IMAP server every time they are needed.

These are the data fields EmailEngine stores for each message:

  • UID identifier – numeric identifier.
  • MODSEQ value, if available – a numeric value that increments every time the message is updated.
  • Email-ID, depending on the server either the X-GM-MSGID or the EMAILID value. This is a globally unique identifier, currently available in Gmail and Yahoo accounts, because these extensions are not (yet) widely supported. The value does not contain any identifiable data. It is usually a large number, a UUID, or similar.
  • Message flags, e.g. \Seen, \Flagged, \Draft. Most servers allow custom flags as well, and some email clients use these for colored labels.
  • Message labels. Only for Gmail accounts. Gmail allows messages to be stored in multiple folders simultaneously, and labels include all folder paths the message is stored in.
  • Bounce information. This includes the recipient address that failed (invalid@example.com) and the SMTP failure response (550 5.1.1 No such user).
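The sync-related fields above are enough to work out what changed in a folder without keeping any message content. The following is an illustrative sketch only, not EmailEngine's implementation: MessageEntry and diff_folder are hypothetical names, and bounce information is left out because it plays no part in the comparison.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class MessageEntry:
    uid: int
    flags: FrozenSet[str] = frozenset()
    modseq: Optional[int] = None
    email_id: Optional[str] = None
    labels: FrozenSet[str] = frozenset()  # Gmail accounts only


def diff_folder(
    stored: Dict[int, MessageEntry], listed: Dict[int, MessageEntry]
) -> Dict[str, List[int]]:
    """Compare stored entries with a fresh server listing, both keyed by UID."""
    added = sorted(listed.keys() - stored.keys())
    removed = sorted(stored.keys() - listed.keys())
    changed = sorted(
        uid
        for uid in stored.keys() & listed.keys()
        if stored[uid].flags != listed[uid].flags
        or stored[uid].labels != listed[uid].labels
    )
    return {"added": added, "removed": removed, "changed": changed}


stored = {1: MessageEntry(1, frozenset({"\\Seen"})), 2: MessageEntry(2)}
listed = {2: MessageEntry(2, frozenset({"\\Flagged"})), 3: MessageEntry(3)}
print(diff_folder(stored, listed))
# prints {'added': [3], 'removed': [1], 'changed': [2]}
```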

In general, EmailEngine does not store anything that is not actually required for syncing. One objective is to avoid data compliance issues, but the more important one is to reduce memory usage.

EmailEngine stores all data in Redis, which is an in-memory database. If you store information about thousands of messages in thousands of email accounts, the numbers add up quickly, and you might run out of usable memory.

So whenever actual email contents are needed, they are fetched from the IMAP server and not from EmailEngine's storage. If the user has revoked EmailEngine's access to the account, the contents are no longer available either.

Deleting data

Deleting an email account from EmailEngine clears all account-related data.

Older versions of EmailEngine kept a list of that account's pathnames even after the account was deleted. You can remove such data manually by deleting the Redis key named iah:{accountId}, where {accountId} is replaced with the registration ID of the deleted account.
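If you have many deleted accounts to clean up, the manual step can be scripted. The sketch below assumes the key is named exactly as described above and that you pass in a client object with a delete(key) method, such as redis.Redis from the redis-py package. The function names and the account ID are hypothetical.

```python
def leftover_paths_key(account_id: str) -> str:
    """Build the name of the key that older versions left behind."""
    return f"iah:{account_id}"


def delete_leftover_paths(redis_client, account_id: str) -> int:
    """Delete the leftover key and return the number of keys removed."""
    return redis_client.delete(leftover_paths_key(account_id))


print(leftover_paths_key("example"))
# prints iah:example

# With redis-py installed (an assumption), usage could look like this:
# import redis
# client = redis.Redis.from_url("redis://127.0.0.1:6379/8")
# delete_leftover_paths(client, "example")
```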

Backups

EmailEngine backups are really Redis backups, as Redis is where the data is stored. If you keep backups for a long time and a user requests that all their data be deleted, it is your own responsibility to determine whether the data in the backups must be deleted as well. Consult your Data Officer.