IDs explained

If you have used EmailEngine for a while, you have probably noticed the abundance of different message identifiers. There's id, emailId, uid, messageId, and, under the hood, a sequence identifier as well. All of them seemingly do the same thing – identify an email on an IMAP account. What gives, why so many?

The reason is 40 years of IMAP evolution with backward compatibility. Each of these identifiers plays a separate role.

  • id – this is the ID value you can use in EmailEngine's API requests. It identifies a specific message entry within a particular folder, and it never changes for that entry: as long as the message still exists in that folder, the id points to that message. It does not identify the email as such. If you move an email to another folder, the email gets a new id value. The old id then points to a non-existent entry and is no longer valid, even though the message is still on the account. This identifier is actually a wrapper for the uid value: it encodes the folder path, the UIDValidity identifier, and the uid. This is how EmailEngine can locate a message on the IMAP account based on its id.
  • uid – this is the IMAP UID identifier. It's an auto-incrementing integer that is unique per folder, not globally. If you think of an IMAP folder as a separate table in MySQL, then uid would be the AUTO_INCREMENT primary key of that table. If you move a message from one folder to another and then back, the message ends up with a different uid than it initially had. The initial UID was deleted, and uid values cannot be reused, so a new one is assigned to the message. As the id value embeds the uid, it behaves the same way – it identifies a specific message entry. The two are closely tied; prefer id in API requests, as it also carries the folder path and the UIDValidity identifier. You would need the uid value mostly when searching for messages, because you can provide a UID range as part of the search query, e.g., "123:456" would match all messages with UID values from 123 to 456.
  • emailId – uniquely identifies a message entity in the email account. This is the best identifier because it does not change. If you move or copy a message, the resulting email still has the same emailId, so all emails with the same emailId are different instances of the same email. Unfortunately, this identifier requires special IMAP extensions that only a handful of IMAP servers support (Gmail, Yahoo, Fastmail, and that's about it), so you cannot rely on it being available unless you exclusively target Gmail accounts.
  • messageId – the value from the Message-ID header. This is also a good value to use because it is supposed to be globally unique. Unfortunately, that is only a soft guarantee, as nobody can enforce the uniqueness. Nothing prevents a sender from reusing the same Message-ID header over and over (or from not setting it at all). It is still a good indicator because all legitimate email senders set it properly. If it is missing, the message is most probably spam or otherwise suspicious. If you have seen a messageId before, the new message is probably a copy of the earlier one. Some users treat messageId as the main identifier and drop emails without a messageId value (because these emails are usually worthless anyway).
  • There's also the sequence number-based identifier that is otherwise heavily used in IMAP, but EmailEngine does not expose it and only uses it internally.
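
To make the relationship between id and uid more concrete, here is a minimal Python sketch of how a folder path, a UIDValidity value, and a uid could be packed into a single opaque string and unpacked again. The byte layout, the MessageRef type, and the function names pack_id, unpack_id, and uid_in_range are all assumptions made for illustration. EmailEngine's real id encoding is not described here and may well differ.

```python
import base64
import struct
from typing import NamedTuple


class MessageRef(NamedTuple):
    """Hypothetical container for the three values an id wraps."""
    path: str          # folder path, e.g. "INBOX"
    uid_validity: int  # UIDValidity value of the folder
    uid: int           # IMAP UID of the message within that folder


def pack_id(ref: MessageRef) -> str:
    """Pack the values into a URL-safe string (illustrative layout only)."""
    # Two unsigned 32-bit big-endian integers, followed by the UTF-8 path.
    raw = struct.pack(">II", ref.uid_validity, ref.uid) + ref.path.encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def unpack_id(value: str) -> MessageRef:
    """Reverse pack_id: recover folder path, UIDValidity, and uid."""
    padded = value + "=" * (-len(value) % 4)  # restore stripped padding
    raw = base64.urlsafe_b64decode(padded)
    uid_validity, uid = struct.unpack(">II", raw[:8])
    return MessageRef(raw[8:].decode("utf-8"), uid_validity, uid)


def uid_in_range(uid: int, uid_range: str) -> bool:
    """Check a uid against a "first:last" range such as "123:456".

    The IMAP "*" wildcard is deliberately not handled in this sketch.
    """
    first, last = (int(part) for part in uid_range.split(":"))
    return min(first, last) <= uid <= max(first, last)


if __name__ == "__main__":
    original = MessageRef("INBOX", 1700000000, 123)
    # After a move, the same email is a new entry in another folder.
    moved = MessageRef("Archive", 1700000123, 7)
    print(pack_id(original) != pack_id(moved))  # the two ids differ
    print(unpack_id(pack_id(original)))
```

Because the folder path and the uid are part of the packed value, a moved message necessarily gets a different id in this scheme, which mirrors the behavior described above. An emailId, by contrast, would stay the same across both entries on servers that support it.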