angle-uparrow-clockwisearrow-counterclockwisearrow-down-uparrow-leftatcalendarcard-listchatcheckenvelopefolderhouseinfo-circlepencilpeoplepersonperson-fillperson-plusphoneplusquestion-circlesearchtagtrashx

Using Python to get Postfix sent status for messages with a message-id

Postfix can generate reports. Here I present another way to get the send status of emails.

20 August 2020 Updated 20 August 2020
In Email
post main image
https://unsplash.com/@walling

The problem: I have a website that sends emails. The emails (meta) are stored in a database before sending to the Postfix MTA. Every email the website sends has a unique message-id, something like  159766775024.77.9154479190021576437@peterspython.com. I want to check if the emails are actually sent and add this information to the email (meta) database records. 

Postfix has statuses like:

  • status=bounced
  • status=deferred
  • status=sent

I first start with sent, later I can add other statuses.

Using Python makes our life more easy

I could not find a program that gave me this information so I decided to write it myself. My server is a Linux server (Ubuntu) with ISPConfig. We use the /var/log/mail.log to get the information.

Warning: different versions of Postfix may have different logline formats. Also on my server outgoing email is checked by Amavis, a content (spam) filter.

Of course we use Python for this task. I also like Bash programming but when things get little more complex you better switch to a more high level language. I use a mix here. Linux commands to reduce processing time and the size of the program, and Python code to do some filtering and generating the result lines.

Grep, Sed, Cut

Linux has a number of commands that greatly reduce our programming efforts, in this case I use:

  • Grep
  • Sed
  • Cut

Grep has the '-f' option where lines in file A can be filtered by a pattern in File B. Sed is very easy to use if you know regular expressions. I use it compress multiple spaces to a single space, add a colon to a line, remove parts of a line. Cut is used to pick one or more fields of the (filtered) line, or columns of a file.

Postfix, queues and mail.log

Postfix uses queues and messages are assigned queue_ids. When we submit a message to Postfix, this message is checked and if passed, put in what I will call the accept_queue. We see a line like:

Aug 17 16:07:52 server8 postfix/cleanup[8062]: C9DC982EEB: message-id=<159767327280.144.3674122625658819909@peterspython.com>

Here the accept_queue_id is 'C9DC982EEB'. If all goes well then after some time, your message is put in what I call the send_queue. This is a bit confusing as the line contains 'status=sent' while it has not been sent! In this case the line is:

Aug 17 16:07:53 server8 postfix/smtp[8063]: C9DC982EEB: to=<peterpm@xs4all.nl>, relay=127.0.0.1[127.0.0.1]:10026, delay=0.33, delays=0.01/0.01/0.01/0.3, dsn=2.0.0, status=sent (250 2.0.0 from MTA(smtp:[127.0.0.1]:10027): 250 2.0.0 Ok: queued as 2083482EEC)

Note the end of this line: 'queued as 2083482EEC)'. To check if and when this message was accepted by the receiving mail server we must now search send_queue_id '2083482EEC'. If you search mail.log you will find a line that not only contains 'status=sent' but also '<> accepted message', where <> is the receiver mail server, example:

Aug 17 16:07:55 server8 postfix/smtp[8067]: 2083482EEC: to=<peterpm@xs4all.nl>, relay=mx3.xs4all.nl[194.109.24.134]:25, delay=2, delays=0.01/0/1.6/0.44, dsn=2.0.0, status=sent (250 2.0.0 mxdrop307.xs4all.net accepted message 07HE7rrG015924)

The code

To execute Linux commands in Python I use:

os.system(command)

The flow is in the method update_message_id_and_status(). The heavy work is done by Linux commands. They generate a number of temporary files. When we have enough data, we use Python code for easy processing.

When you leave debugging on, the program shows the first two lines of each file that is created. Some lines in the resulting message_id_status_file:

159767280175.144.12615794910844491932@peterspython.com sent 2020-08-17 16:00:05
159767326022.144.11661923945211507822@peterspython.com sent 2020-08-17 16:07:42
159767327280.144.3674122625658819909@peterspython.com sent 2020-08-17 16:07:55

Here is the Python script:

import datetime
import inspect
import os
import csv
import subprocess 

MAIL_LOG_FILE='/var/log/mail.log'
MAIL_LOG_FILE='./mail.log' 
TMP_FILE_DIR='/tmp'
POSTFIX_MESSAGE_ID_STATUS_FILE_DIR='.'
POSTFIX_MESSAGE_ID_STATUS_FILENAME='message_id_status.txt'

class PostfixMessageIdStatus:

    def __init__(self):
        self.mail_log_file = MAIL_LOG_FILE
        self.message_id_status_file = os.path.join(POSTFIX_MESSAGE_ID_STATUS_FILE_DIR, POSTFIX_MESSAGE_ID_STATUS_FILENAME)
        self.tmp_message_ids_file = os.path.join(TMP_FILE_DIR, 'tmp_message_ids.txt')
        self.tmp_accept_queue_queue_id_message_id_all_file = os.path.join(TMP_FILE_DIR, 'tmp_accept_queue_queue_id_message_id_all.txt')
        self.tmp_accept_queue_queue_id_message_id_file = os.path.join(TMP_FILE_DIR, 'tmp_accept_queue_queue_id_message_id.txt')
        self.tmp_accept_queue_ids_file = os.path.join(TMP_FILE_DIR, 'tmp_accept_queue_ids.txt')
        self.tmp_accept_queue_id_send_queue_id_file = os.path.join(TMP_FILE_DIR, 'tmp_accept_queue_id_send_queue_id.txt')
        self.tmp_send_queue_ids_file = os.path.join(TMP_FILE_DIR, 'tmp_send_queue_ids.txt')
        self.tmp_sent_lines_file = os.path.join(TMP_FILE_DIR, 'self.tmp_send_lines.txt')

        self.my_message_id_domains = [
            '@peterspython.com'
        ]
        self.class_name = 'PostfixMessageIdStatus'
        self.dbg = False

    def get_fname(self):
        frame = inspect.stack()[1]
        function_name = inspect.currentframe().f_back.f_code.co_name
        fname = self.class_name + '::' + function_name
        return fname

    def print_line_count_and_few_lines_of_file(self, f): 

        if not os.path.isfile(f):
            raise FileNotFoundError(fname, ': file {} does not exist'.format(f))
        result = subprocess.check_output(['wc', '-l', f]).decode("utf-8") 
        line_count = int(result.split(' ')[0]) 
        print('file: {}'.format(f))
        print('line_count: {}'.format(line_count))
        command = "head --lines=2 " + f
        os.system(command)

    def create_empty_message_id_status_file_if_not_exists(self, f):
        fname = self.get_fname()
        dbg = self.dbg

        if not os.path.exists(f):
            print(fname + ': file {} does not exist so create it'.format(f))
            open(f, 'a').close()

            if not os.path.exists(f):
                raise Exception(fname + ': file {} could not be created'.format(f))

    def copy_message_ids_from_message_id_status_file_into_tmp_message_ids_file(self):
        fname = self.get_fname()
        dbg = self.dbg
        if dbg:
            print(fname + '()')

        command = 'cut -d " " -f 1 ' + self.message_id_status_file + ' > ' + self.tmp_message_ids_file
        if dbg:
            print(fname + ': command = {}'.format(command))
        os.system(command)

    def filter_mail_log_to_get_all_accept_queue_id_and_message_ids(self):
        fname = self.get_fname()
        dbg = self.dbg
        if dbg:
            print(fname + '()')

        # only include our own message_ids using the domain part
        message_id_domains_filter = '-e ' + ' -e '.join(self.my_message_id_domains)

        # do not remove ':' from the queue_id, we need this later
        command = "grep 'message-id=' " + self.mail_log_file + " | grep " + message_id_domains_filter + " | sed 's/\s\s*/ /g' | sed 's/message-id\=<//g' | sed 's/>//g' | cut -d ' ' -f 6,7 | sort -u > " + self.tmp_accept_queue_queue_id_message_id_all_file
        if dbg:
            print(fname + ': command = {}'.format(command))
        os.system(command)

    def filter_tmp_accept_queue_queue_id_message_id_all_file_with_tmp_message_ids_file(self):
        fname = self.get_fname()
        dbg = self.dbg
        if dbg:
            print(fname + '()')

        command = "grep -v -f " + self.tmp_message_ids_file + " " + self.tmp_accept_queue_queue_id_message_id_all_file + " > " + self.tmp_accept_queue_queue_id_message_id_file
        if dbg:
            print(fname + ': command = {}'.format(command))
        os.system(command)

    def create_tmp_accept_queue_ids_file(self):
        fname = self.get_fname()
        dbg = self.dbg
        if dbg:
            print(fname + '()')

        command = "cut -d ' ' -f 1 " + self.tmp_accept_queue_queue_id_message_id_all_file + " > " + self.tmp_accept_queue_ids_file
        if dbg:
            print(fname + ': command = {}'.format(command))
        os.system(command)

    def filter_mail_log_by_accept_queue_ids_and_queued_as_to_get_send_queue_ids(self):
        fname = self.get_fname()
        dbg = self.dbg
        if dbg:
            print(fname + '()')

        command = "grep -f " + self.tmp_accept_queue_ids_file + ' '  + self.mail_log_file + " | grep 'status=sent' | grep 'queued as' | sed 's/to.*queued as/ /g' | sed 's/)//g' | sed 's/\s\s*/ /g' |  sed 's/$/:/g' | cut -d ' ' -f 6,7 > " + self.tmp_accept_queue_id_send_queue_id_file
        if dbg:
            print(fname + ': command = {}'.format(command))
        os.system(command)

    def create_tmp_send_queue_ids_file(self):
        fname = self.get_fname()
        dbg = self.dbg
        if dbg:
            print(fname + '()')

        command = "cut -d ' ' -f 2 " + self.tmp_accept_queue_id_send_queue_id_file + " > " + self.tmp_send_queue_ids_file
        if dbg:
            print(fname + ': command = {}'.format(command))
        os.system(command)

    def filter_mail_log_by_send_queue_ids_and_accepted_message_to_get_sent_lines(self):
        fname = self.get_fname()
        dbg = self.dbg
        if dbg:
            print(fname + '()')

        command = "grep -f " + self.tmp_send_queue_ids_file + ' '  + self.mail_log_file + " | grep 'status=sent' | grep 'accepted message'| sed 's/\s\s*/ /g' > " + self.tmp_sent_lines_file
        if dbg:
            print(fname + ': command = {}'.format(command))
        os.system(command)

    def update_message_id_and_status(self):
        fname = self.get_fname()
        dbg = self.dbg
        dbg = True

        if dbg:
            print(fname + '()')

        # we use grep, sed, cut to do heavy work

        self.create_empty_message_id_status_file_if_not_exists(self.message_id_status_file)

        self.copy_message_ids_from_message_id_status_file_into_tmp_message_ids_file()
        if dbg:
            self.print_line_count_and_few_lines_of_file(self.tmp_message_ids_file)

        # after accepting a message, postfix passes the message to cleanup
        # example postfix line:
        # Aug 15 14:51:47 server8 postfix/cleanup[12794]: 23EFC803C7: message-id=<135717442.1972837.1597495901382@mail.yahoo.com>

        # get all accept_queue queue_id and message_id 
        self.filter_mail_log_to_get_all_accept_queue_id_and_message_ids()
        if dbg:
            self.print_line_count_and_few_lines_of_file(self.tmp_accept_queue_queue_id_message_id_all_file)

        # remove lines from tmp_accept_queue_queue_id_message_id_all_file that contain message_ids we already have
        self.filter_tmp_accept_queue_queue_id_message_id_all_file_with_tmp_message_ids_file()
        if dbg:
            self.print_line_count_and_few_lines_of_file(self.tmp_accept_queue_queue_id_message_id_file)

        # we now have a file with lines: accept_queue_id message_id
        # example:
        # E0EB882BF4: 159777360230.7.2488114074420651363@peterspython.com

        # create from the above a file with only accept_queue_ids
        self.create_tmp_accept_queue_ids_file()
        if dbg:
            self.print_line_count_and_few_lines_of_file(self.tmp_accept_queue_ids_file)

        # we now have a file with accept_queue_ids
        # example:
        # E0EB882BF4:

        # use this file to filter the mail.log file to get lines ending with 'queued as D80D582C38)'
        # Aug 15 15:00:01 server8 postfix/smtp[13089]: 82E3D82C26: to=<peterpm@xs4all.nl>, relay=127.0.0.1[127.0.0.1]:10026, delay=0.39, delays=0.03/0.01/0.01/0.33, dsn=2.0.0, status=sent (250 2.0.0 from MTA(smtp:[127.0.0.1]:10027): 250 2.0.0 Ok: queued as D80D582C38)
        self.filter_mail_log_by_accept_queue_ids_and_queued_as_to_get_send_queue_ids()
        if dbg:
            self.print_line_count_and_few_lines_of_file(self.tmp_accept_queue_id_send_queue_id_file)

        # we now have a file with accept_queue_ids and send_queue_ids
        # example:
        # 9168182C54: 8A6FF82CB8:

        # create from the above a file with only send_queue_ids
        self.create_tmp_send_queue_ids_file()
        if dbg:
            self.print_line_count_and_few_lines_of_file(self.tmp_send_queue_ids_file)

        # we now have a file with send_queue_ids
        # example:
        # 5346D82C08:

        # use this file to filter the mail.log file to get lines ending with 'accepted message'
        # Aug 15 15:00:03 server8 postfix/smtp[13029]: D80D582C38: to=<peterpm@xs4all.nl>, relay=mx1.xs4all.nl[194.109.24.132]:25, delay=2, delays=0.01/0.01/1.6/0.42, dsn=2.0.0, status=sent (250 2.0.0 mxdrop301.xs4all.net accepted message 07FD01qu014687)
        self.filter_mail_log_by_send_queue_ids_and_accepted_message_to_get_sent_lines()
        if dbg:
            self.print_line_count_and_few_lines_of_file(self.tmp_sent_lines_file)

        # we now have a file with sent lines
        # example:
        # Aug 16 07:00:04 server8 postfix/smtp[12993]: EB13A82DD3: to=<peterpm@xs4all.nl>, relay=mx1.xs4all.nl[194.109.24.132]:25, delay=2, delays=0.01/0.01/1.6/0.46, dsn=2.0.0, status=sent (250 2.0.0 mxdrop306.xs4all.net accepted message 07G50207012529)

        # from here we use python to process the sent lines and update our message_id and status file

        # slurp and process self.tmp_accept_queue_id_send_queue_id_file
        accept_queue_id2message_ids = {}
        with open(self.tmp_accept_queue_queue_id_message_id_file) as fh_csv:
            rows = csv.reader(fh_csv, delimiter=' ')
            for row in rows:
                accept_queue_id2message_ids[row[0]] = row[1]
        if dbg:
            print(fname + ': accept_queue_id2message_ids = {}'.format(accept_queue_id2message_ids))

        # slurp and process self.tmp_accept_queue_id_send_queue_id_file
        send_queue_id2accept_queue_ids = {}
        with open(self.tmp_accept_queue_id_send_queue_id_file) as fh_csv:
            rows = csv.reader(fh_csv, delimiter=' ')
            for row in rows:
                send_queue_id2accept_queue_ids[row[1]] = row[0]
        if dbg:
            print(fname + ': send_queue_id2accept_queue_ids = {}'.format(send_queue_id2accept_queue_ids))

        # iterate sent lines
        message_id_status_lines = []
        with open(self.tmp_sent_lines_file) as fh:
            for line in fh:
                line_parts = line.split(' ')

                log_dt = '2020 ' + ' '.join(line_parts[:3])
                log_dt_obj = datetime.datetime.strptime(log_dt, '%Y %b %d %H:%M:%S')
                log_dt_iso = log_dt_obj.strftime("%Y-%m-%d %H:%M:%S")

                send_queue_id = line_parts[5]
                if send_queue_id not in send_queue_id2accept_queue_ids:
                    continue

                accept_queue_id = send_queue_id2accept_queue_ids[send_queue_id]
                if accept_queue_id not in accept_queue_id2message_ids:
                    continue

                message_id = accept_queue_id2message_ids[accept_queue_id]

                message_id_status_line = '{} sent {}'.format(message_id, log_dt_iso)
                if dbg:
                    print(fname + ': {}'.format(message_id_status_line))

                message_id_status_lines.append(message_id_status_line)

        # update message_id_status_file
        if len(message_id_status_lines) > 0:
            with open(self.message_id_status_file, 'a') as fh:
                fh.write('\n'.join(message_id_status_lines) + '\n')

# start
postfix_message_id_status = PostfixMessageIdStatus()
postfix_message_id_status.update_message_id_and_status()

Summary

Executing Linux commands from Python makes your program small and fast. Once we have our data, we use Python code to do things that are more difficult to do in Bash. Processing the mail.log file is a challenge but once you know what is going on, it is not that difficult. And finally I have a file containing the message-ids that actually have been sent, which means, accepted by the receiving mail server. I can use this file to update the email (meta) records in the database.

Links / credits

Postfix
http://www.postfix.org/

Leave a comment

Comment anonymously or log in to comment.

Comments

Leave a reply

Reply anonymously or log in to reply.