What’s in the box?!
Recipe Difficulty: Medium
Python Version: 3.5
Operating System: Any
MBOX files are often found in association with UNIX systems, Thunderbird, and Google Takeouts. These MBOX containers are text files whose special formatting separates the messages stored within. Since there are several formats for structuring MBOX files, our script will focus on those from Google Takeout, using the output from the prior recipe.
Getting started
All libraries used in this script are present in Python's standard library. We use the built-in mailbox library to parse the Google Takeout structured MBOX file.
Note
To learn more about the mailbox library, visit https://docs.python.org/3/library/mailbox.html.
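As a quick orientation before building the full script, the following is a minimal sketch of how the mailbox library can open an MBOX file and list each message's subject and attachment names. It is not the recipe's script: the function names summarize_mbox and build_sample_mbox, the sample addresses, and the file names are hypothetical and exist only for illustration. The sketch also relies on the library's default message factory, so it assumes cleanly encoded input and does not deal with the encoding problems a real Google Takeout export may contain.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def summarize_mbox(mbox_path):
    """Return a list of (subject, attachment_names) tuples for an MBOX file."""
    summary = []
    # create=False raises an error instead of silently creating a new file.
    mbox = mailbox.mbox(mbox_path, create=False)
    try:
        for message in mbox:
            # walk() visits every MIME part; get_filename() returns None
            # for parts that are not named attachments.
            names = [part.get_filename() for part in message.walk()
                     if part.get_filename()]
            summary.append((message["Subject"], names))
    finally:
        mbox.close()
    return summary


def build_sample_mbox(mbox_path):
    """Write a one-message MBOX file with a small attachment (demo data only)."""
    msg = EmailMessage()
    msg["From"] = "alice@example.com"      # hypothetical sender
    msg["To"] = "bob@example.com"          # hypothetical recipient
    msg["Subject"] = "Quarterly report"
    msg.set_content("See attached.")
    msg.add_attachment(b"hello", maintype="application",
                       subtype="octet-stream", filename="report.bin")
    mbox = mailbox.mbox(mbox_path)
    try:
        mbox.add(msg)
        mbox.flush()
    finally:
        mbox.close()


# Build a throwaway MBOX file in a temporary directory, then read it back.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.mbox")
build_sample_mbox(sample_path)
result = summarize_mbox(sample_path)
print(result)
```

To try the sketch against real data, point summarize_mbox at the MBOX file from a Google Takeout export instead of the generated sample. Messages with unusual encodings may need additional handling beyond what this sketch does.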
How to do it...
To implement this script, we must:
- Design arguments to accept a file path to the MBOX file and an output directory for the report of its contents.
- Develop a custom MBOX reader that handles encoded data.
- Extract message metadata including attachment names.
- Write attachments to the output directory.
- Create an...