How to parse emails with Cloudflare Email Workers?

This blog post is about general email processing with Cloudflare Email Workers and is not specific to EmailEngine. If you want to process incoming emails with EmailEngine instead, see the other posts here.

Cloudflare Email Workers are a nifty way to process incoming emails. The built-in API of Cloudflare Workers allows you to route these emails, and it also provides some email metadata. For example, you can reject an incoming email with a bounce response, forward it, or generate and send a new email. Your worker also receives the SMTP envelope information, such as the envelope-from and envelope-to addresses used for routing.

export default {
  async email(message, env, ctx) {
    message.setReject("I don't like your email :(");
  }
}

All emails sent to a route handled by this worker will bounce with the provided message.
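Rejecting everything is rarely what you want, so here is a minimal sketch of routing on the envelope sender instead. It uses the message.from property together with the setReject() and forward() methods of the message object. The decideAction helper, the blocked domain, and the forwarding address are hypothetical placeholders, and the forwarding target is assumed to be a destination address you have already verified in Cloudflare Email Routing.

```javascript
// Hypothetical routing rules; both values are placeholders for illustration.
const BLOCKED_SENDER_DOMAINS = ['spam.example'];
const FORWARD_TO = 'inbox@example.com'; // assumed to be a verified destination

// Pure helper: decide what to do based on the envelope-from address only.
function decideAction(envelopeFrom) {
  const domain = envelopeFrom.split('@').pop().toLowerCase();
  return BLOCKED_SENDER_DOMAINS.includes(domain) ? 'reject' : 'forward';
}

const worker = {
  async email(message, env, ctx) {
    if (decideAction(message.from) === 'reject') {
      // Bounce the email with a custom reason
      message.setReject('Sender domain is not accepted');
      return;
    }
    // Pass everything else on to the destination address
    await message.forward(FORWARD_TO);
  },
};

// In a real Worker module, finish with: export default worker;
```

Keeping the decision in a separate function makes the rule easy to test without a live email route.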

But what about the content? Envelope information is mainly relevant for routing but not so much for processing. For example, an envelope sender address like bounce-mc.us20_123456789.17649072-1234567899@mail30.atl18.mcdlv.net is rarely useful on its own. It only tells us that this email was sent through a Mailchimp mailing list, but it does not reveal the actual sender. And what about the subject of the email or the HTML body?

It turns out Cloudflare provides some information about the email contents. The message object includes a headers object that you can use to read email headers, which makes it easy to read values like the subject line:

let subject = message.headers.get('subject');

The headers.get() method is good for reading single-line values like the subject line but falls short when processing headers that might have multiple values, like the To: or Cc: address lines. Additionally, there is no information at all about the text contents of the email or its attachments.
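To make the problem with address headers concrete, here is a small sketch. It builds a standalone Headers object as a stand-in for message.headers, and the header values are made up for illustration.

```javascript
// Stand-in for message.headers, which is a standard Headers object.
const headers = new Headers({
  Subject: 'Hello',
  To: '"Doe, John" <john@example.com>, jane@example.com',
});

// Single-valued headers are easy to read.
const subject = headers.get('subject');

// Address headers come back as one string. Splitting on commas is tempting,
// but it breaks when a display name itself contains a comma.
const naiveRecipients = headers.get('to').split(',').map((part) => part.trim());

console.log(subject);
console.log(naiveRecipients.length); // 3 fragments for only 2 recipients
```

Handling quoting, groups, and encoded words correctly takes a real address parser, which is one more reason to parse the full message.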

Luckily, the message object includes an additional property called raw, which is a readable stream. From that stream, we can read the raw source of the email. The source is not very useful by itself either, but we can parse it to get any information we need about the email. Email parsing is quite complex and difficult, but luckily, there is a solution: the postal-mime package.
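If you are curious what the raw stream actually contains, the following sketch reads it into a string and logs the beginning of it. The streamToText helper is a hypothetical name; it relies only on the standard Response class. It decodes the bytes as UTF-8, so treat it as an inspection aid rather than a way to process messages.

```javascript
// Collect a ReadableStream of bytes into a string.
// Response is a standard Fetch API class that is available in Workers.
async function streamToText(stream) {
  return await new Response(stream).text();
}

const worker = {
  async email(message, env, ctx) {
    const source = await streamToText(message.raw);
    // Log only the beginning; raw messages can be large
    console.log(source.slice(0, 500));
  },
};

// In a real Worker module, finish with: export default worker;
```

Keep in mind that a stream can be consumed only once, so after reading message.raw this way you cannot hand the same stream to a parser.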

All you need to do is install the postal-mime dependency from npm.

npm install postal-mime

And import it into your worker code.

import PostalMime from 'postal-mime';

This allows you to easily parse incoming emails.

const parser = new PostalMime();
const email = await parser.parse(message.raw);

The resulting parsed email object includes properties like the subject line (email.subject) and the HTML content of the email (email.html). You can find the full list of available properties in the postal-mime documentation.
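As a rough sketch of working with the parsed object, the helper below builds a short summary from it. The function names are hypothetical. The code assumes the shape postal-mime documents for parsed emails: from is an object with name and address, to is an array of such objects, and attachments is an array.

```javascript
// Format one parsed address entry ({ name, address }) for display.
function formatAddress(entry) {
  if (!entry) return '';
  if (!entry.address) return entry.name || ''; // e.g. an address group
  return entry.name ? `${entry.name} <${entry.address}>` : entry.address;
}

// Build a small summary from a parsed email object.
function summarizeEmail(email) {
  return {
    subject: email.subject || '(no subject)',
    from: formatAddress(email.from),
    to: (email.to || []).map(formatAddress),
    hasHtml: Boolean(email.html),
    attachmentCount: (email.attachments || []).length,
  };
}
```

Inside a worker you would call it as summarizeEmail(email) right after parsing. Note that the recipients arrive as a proper list here, which sidesteps the multi-value header problem described earlier.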

Attachment support

PostalMime parses all attachments into Uint8Array objects. If you want to process the contents of an attachment as a regular string (which makes sense for textual attachments but not for binary files like images), you can use the TextDecoder class for it.

const decoder = new TextDecoder('utf-8');
const attachmentText = decoder.decode(email.attachments[0].content);
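Since decoding only makes sense for textual attachments, you may want to check the attachment type first. The sketch below assumes that each attachment carries a mimeType property next to filename and content, as described in the postal-mime documentation, and that textual attachments are encoded as UTF-8. The decodeTextAttachments name is hypothetical.

```javascript
// Decode only attachments whose MIME type marks them as text.
function decodeTextAttachments(attachments) {
  const decoder = new TextDecoder('utf-8'); // assumes UTF-8 encoded text
  return attachments
    .filter((att) => (att.mimeType || '').startsWith('text/'))
    .map((att) => ({
      filename: att.filename,
      text: decoder.decode(att.content),
    }));
}
```

Binary attachments such as images are skipped, so their bytes stay untouched for whatever other processing you have in mind.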

Full example

The following Email Worker parses an incoming email and logs some information about the parsed email to the worker's log output.

import PostalMime from 'postal-mime';

export default {
  async email(message, env, ctx) {
    const parser = new PostalMime();
    const email = await parser.parse(message.raw);

    console.log('Subject', email.subject);
    console.log('HTML', email.html);

    // Decoding as UTF-8 text only makes sense for textual attachments
    const decoder = new TextDecoder('utf-8');
    email.attachments.forEach((attachment) => {
      console.log('Attachment', attachment.filename, decoder.decode(attachment.content));
    });
  },
};