Motivation

Quite some time ago, I learned how to remove a password from a PDF file using Stirling PDF.

I am currently moving all these minor jobs and automation scripts to a centralized Python management platform; more on that in a later blog post. I have now moved this job as well, and since a few things work slightly differently, I want to revisit that post.

My solution

First, polling the messages got a bit more flexible: since I am no longer bound to Node-Red, I can mark a message as read only once I have actually processed it successfully!

So the part that lists the messages now uses Python’s imap_tools library and looks as follows:

import imap_tools

with imap_tools.MailBox("mailserver.tech-tales.blog").login(
    "username", "password"
) as mailbox:

    for msg in mailbox.fetch(mark_seen=False):
        # Note: I may add a few more search criteria here
        # at some point. Realistically, probably not...

        if condition:  # placeholder for whatever decides a message is relevant
            process_message(msg)
            # Only flag the message as read after processing succeeded.
            mailbox.flag(msg.uid, imap_tools.MailMessageFlags.SEEN, True)

Not too hard, and nothing fancy happening. The real magic happens in the process_message function anyway.
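The "mark as read only after success" idea can be sketched independently of the mail library. This is a minimal sketch, not the code actually running on the platform: the function name process_and_mark and its two callback parameters are made up for illustration, and messages only need a uid attribute.

```python
from typing import Callable, Iterable, List


def process_and_mark(
    messages: Iterable,
    handler: Callable[[object], None],
    mark_seen: Callable[[str], None],
) -> List[str]:
    """Run handler on each message; mark it as seen only if handler did not raise."""
    processed = []
    for msg in messages:
        try:
            handler(msg)
        except Exception as exc:
            # Leave the message unread so that the next run retries it.
            print(f"skipping message {msg.uid}: {exc}")
            continue
        mark_seen(msg.uid)
        processed.append(msg.uid)
    return processed
```

With imap_tools, mark_seen could be something like lambda uid: mailbox.flag(uid, imap_tools.MailMessageFlags.SEEN, True), and handler would be process_message.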

Process message

I have multiple steps here.

Find the correct attachment

This depends on what you want to achieve, but imap_tools makes it quite simple: all attachments are stored in msg.attachments. Each attachment has attachment.filename, attachment.content_type (which should be application/pdf in my case), attachment.payload (the actual file contents, as bytes) and a few other attributes.
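Picking the PDF attachments could then look roughly like this. It is only a sketch: the helper name pdf_attachments is hypothetical, and the fallback on the file extension is an assumption for mailers that send PDFs with a generic content type.

```python
def pdf_attachments(attachments):
    """Yield the attachments that look like PDF files."""
    for att in attachments:
        # Primary check: the declared MIME type.
        is_pdf_type = att.content_type == "application/pdf"
        # Assumed fallback: some senders use a generic type but a .pdf name.
        is_pdf_name = (att.filename or "").lower().endswith(".pdf")
        if is_pdf_type or is_pdf_name:
            yield att
```

Inside process_message this would be called as pdf_attachments(msg.attachments).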

Use Stirling PDF to remove the password of an attachment

Now this was a bit more involved, as the Node-Red flow could not be translated directly. In the end, my request looks as follows:

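import requests

requests.post(
    "https://stirling-pdf.tech-tales.blog/api/v1/security/remove-password",
    # Regular form fields go into data ...
    data={
        "password": "my-super-secret-password"
    },
    # ... and the PDF goes into files as (filename, content, content type).
    files={
        "fileInput": (
            attachment.filename,
            attachment.payload,
            attachment.content_type
        )
    },
    # Deliberately no Content-Type header; see the notes below.
    headers={
        "accept": "*/*"
    }
)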

Interesting changes:

  • I now explicitly define the name of the file. Not that this makes a huge difference.
  • In the Node-Red version, I had set the header Content-Type: "multipart/form-data". If I do that here, the request fails! I have not yet found out why that is.
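A likely explanation for the second point, stated as an assumption and not verified against this particular setup: when files= is used, requests builds the multipart body itself and writes a Content-Type header that includes a boundary parameter. A header set by hand takes precedence, and a plain "multipart/form-data" without the boundary leaves the server unable to split the body into parts. The sketch below builds such a body by hand to show where the boundary appears; encode_multipart is a made-up helper, not part of requests.

```python
import uuid


def encode_multipart(field_name: str, filename: str, payload: bytes, content_type: str):
    """Build a minimal multipart/form-data body containing a single file part."""
    # The boundary is a random token that separates the parts of the body.
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = head + payload + tail
    # The same boundary must be announced in the request header.
    header = f"multipart/form-data; boundary={boundary}"
    return header, body
```

The returned header value carries the boundary that the body uses; dropping it from the header is what a hand-written Content-Type: "multipart/form-data" amounts to.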

Getting the contents of the new file is now easy: unencrypted_pdf_bytes = requests.post(...).content
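One caveat with taking .content directly: if the request fails (a wrong password, for example), the body is presumably an error message and not a PDF. How exactly Stirling PDF reports errors is an assumption here, so the following is only a defensive sketch with a made-up helper name. It relies on the fact that PDF files start with the bytes %PDF-.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Return True if the data starts with the PDF magic bytes."""
    return data.startswith(b"%PDF-")
```

A check like looks_like_pdf(unencrypted_pdf_bytes) before sending the mail, possibly combined with calling raise_for_status() on the response, avoids forwarding an error page as a PDF.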

Sending the message

Again, this is impressively easy with Python’s standard library:

import email.message
import smtplib

msg = email.message.EmailMessage()
msg["From"] = "paperless-preprocess@tech-tales.blog"
msg["To"] = "paperless@tech-tales.blog"
msg["Subject"] = "Hello world"
# Attach the decrypted PDF under its original file name.
msg.add_attachment(
    unencrypted_pdf_bytes,
    maintype="application",
    subtype="pdf",
    filename=attachment.filename
)

# Port 587: upgrade the connection with STARTTLS before logging in.
with smtplib.SMTP("mailserver.tech-tales.blog", 587) as smtp:
    smtp.starttls()
    smtp.login("username", "password")
    smtp.send_message(msg)
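To check what actually ends up in the mail without talking to an SMTP server, the message can be built and inspected locally. This is a small sketch using only the standard library; build_pdf_mail and attached_pdfs are hypothetical helper names, not part of the script above.

```python
from email.message import EmailMessage


def build_pdf_mail(sender, recipient, subject, pdf_bytes, filename):
    """Create a mail whose only content is a single PDF attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.add_attachment(
        pdf_bytes, maintype="application", subtype="pdf", filename=filename
    )
    return msg


def attached_pdfs(msg):
    """Return (filename, content) pairs for every PDF attachment of msg."""
    return [
        (part.get_filename(), part.get_payload(decode=True))
        for part in msg.iter_attachments()
        if part.get_content_type() == "application/pdf"
    ]
```

Calling attached_pdfs on the built message returns the file name together with the decoded bytes, which makes it easy to compare them against what came back from Stirling PDF.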

And that’s all!

Conclusion

I definitely prefer Python over Node-Red. I am not sure why I spent so much time on Node-Red for automation tasks like this when I could simply have used Python.