Copying files, attributes, and timestamps
Recipe Difficulty: Easy
Python Version: 2.7 or 3.5
Operating System: Windows
Preserving files is a fundamental task in digital forensics. It is often preferable to containerize files in a format that can store hashes and other metadata of loose files. However, sometimes we need to copy files in a forensic manner from one location to another. Using this recipe, we will demonstrate some of the methods available to copy files while preserving common metadata fields.
Getting started
This recipe requires the installation of two third-party modules, pywin32 and pytz. All other libraries used in this script are present in Python's standard library. This recipe will primarily use two libraries, the built-in shutil and a third-party library, pywin32. The shutil library is our go-to for copying files within Python, and we can use it to preserve most of the timestamps and other file attributes. The shutil module, however, is unable to preserve the creation time of files it copies. Rather, we must rely on the Windows-specific pywin32 library to preserve it. While the pywin32 library is platform specific, it is incredibly useful to interact with the Windows operating system.
Note
To learn more about the shutil library, visit https://docs.python.org/3/library/shutil.html.
To install pywin32, we need to access its SourceForge page at https://sourceforge.net/projects/pywin32/ and download the version that matches our Python installation. To check our Python version, we can import the sys module and inspect sys.version within an interpreter. Both the version and the architecture are important when selecting the correct pywin32 installer.
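As a minimal sketch of that check, the snippet below reports the interpreter version together with whether it is a 32-bit or 64-bit build. The helper name describe_interpreter is our own and is used only for illustration; it relies solely on the sys and struct modules from the standard library.

```python
import struct
import sys


def describe_interpreter():
    """Return the major.minor version string and the pointer width in bits."""
    version = "{}.{}".format(sys.version_info[0], sys.version_info[1])
    # The size of a C pointer reveals whether this is a 32- or 64-bit build
    bits = struct.calcsize("P") * 8
    return version, bits


version, bits = describe_interpreter()
print("Python {} ({}-bit)".format(version, bits))
```

Both values printed here should match the installer we pick.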
Note
To learn more about the sys library, visit https://docs.python.org/3/library/sys.html.
In addition to the installation of the pywin32 library, we need to install pytz, a third-party library used to manage time zones in Python. We can install this library using the pip command:
pip install pytz==2017.2
How to do it…
We perform the following steps to forensically copy files on a Windows system:
- Gather source file and destination arguments.
- Use shutil to copy and preserve most file metadata.
- Manually set timestamp attributes with win32file.
How it works…
Let’s now dive into copying files and preserving their attributes and timestamps. We use some familiar libraries to assist us in the execution of this recipe. Some of the libraries, such as pytz, win32file, and pywintypes, are new. Let’s briefly discuss their purpose here. The pytz module allows us to work with time zones more granularly and allows us to initialize dates for the pywin32 library.
To allow us to pass timestamps in the correct format, we must also import pywintypes. Lastly, the win32file library, available through our installation of pywin32, provides various methods and constants for file manipulation in Windows:
from __future__ import print_function
import argparse
from datetime import datetime as dt
import os
import pytz
from pywintypes import Time
import shutil
from win32file import SetFileTime, CreateFile, CloseHandle
from win32file import GENERIC_WRITE, FILE_SHARE_WRITE
from win32file import OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL

__authors__ = ["Chapin Bryce", "Preston Miller"]
__date__ = 20170815
__description__ = "Gather filesystem metadata of provided file"
This recipe's command-line handler takes two positional arguments, source and dest, which represent the source file to copy and the output directory or file, respectively. It also takes a --timezone argument, which lets the user specify the time zone of the file's timestamps. Although it is passed as a flag, the parser marks it as required.
To prepare the source file, we store the absolute path and split the filename from the rest of the path, which we may need to use later if the destination is a directory. Our last bit of preparation involves reading the timezone input from the user, which must be one of the four common US time zones accepted by the parser. This allows us to initialize the pytz time zone object for later use in the recipe:
parser = argparse.ArgumentParser(
    description=__description__,
    epilog="Developed by {} on {}".format(
        ", ".join(__authors__), __date__)
)
parser.add_argument("source", help="Source file")
parser.add_argument("dest", help="Destination directory or file")
parser.add_argument("--timezone",
                    help="Timezone of the file's timestamp",
                    choices=['EST5EDT', 'CST6CDT', 'MST7MDT', 'PST8PDT'],
                    required=True)
args = parser.parse_args()

source = os.path.abspath(args.source)
# Keep only the final path component; splitting on the first separator
# would return the rest of the path rather than the file name
src_file_name = os.path.basename(source)

dest = os.path.abspath(args.dest)
tz = pytz.timezone(args.timezone)
At this point, we can copy the source file to the destination using the shutil.copy2() method. This method accepts either a directory or file as the destination. The major difference between the shutil copy() and copy2() methods is that the copy2() method also preserves file attributes, including the last written time and permissions. This method does not preserve file creation times on Windows; for that, we need to leverage the pywin32 bindings.
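The difference between the two methods can be observed with a small experiment. The following is a sketch rather than part of the recipe's script: the helper name compare_copy_methods, the temporary file names, and the backdated timestamp are all assumptions chosen for illustration.

```python
import os
import shutil
import tempfile


def compare_copy_methods():
    """Copy one file with copy() and copy2(); return the three modified times."""
    workdir = tempfile.mkdtemp()
    try:
        source = os.path.join(workdir, "source.txt")
        with open(source, "w") as handle:
            handle.write("evidence")

        # Backdate the source so a freshly stamped copy is easy to spot
        old_time = 1000000000  # 2001-09-09 01:46:40 UTC
        os.utime(source, (old_time, old_time))

        plain = os.path.join(workdir, "copy.txt")
        preserved = os.path.join(workdir, "copy2.txt")
        shutil.copy(source, plain)       # content and permission bits only
        shutil.copy2(source, preserved)  # also carries over the timestamps

        return (os.path.getmtime(source),
                os.path.getmtime(plain),
                os.path.getmtime(preserved))
    finally:
        shutil.rmtree(workdir)


src_mtime, copy_mtime, copy2_mtime = compare_copy_methods()
print("source: {}\ncopy():  {}\ncopy2(): {}".format(
    src_mtime, copy_mtime, copy2_mtime))
```

The copy() result carries the time of the copy operation, while the copy2() result should carry the backdated modified time of the source.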
To that end, we must build the destination path for the file copied by the copy2() call by using the following if statement to join the correct path if the user provided a directory at the command line:
shutil.copy2(source, dest)
if os.path.isdir(dest):
    dest_file = os.path.join(dest, src_file_name)
else:
    dest_file = dest
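The same path logic can be wrapped in a small function so that it is easy to exercise on its own. This is a sketch under our own naming: resolve_dest_file is a hypothetical helper, not part of the recipe's script, and the sample paths are made up.

```python
import os
import tempfile


def resolve_dest_file(source, dest):
    """Return the path that copy2() writes to for a given source and dest."""
    if os.path.isdir(dest):
        # copy2() places the file inside the directory under its own name
        return os.path.join(dest, os.path.basename(source))
    # Otherwise dest is already the full path of the copied file
    return dest


workdir = tempfile.mkdtemp()
print(resolve_dest_file(os.path.join("evidence", "report.docx"), workdir))
print(resolve_dest_file("report.docx", os.path.join(workdir, "renamed.docx")))
os.rmdir(workdir)
```

The first call should print a path ending in report.docx inside the temporary directory, and the second should return the renamed path unchanged.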
Next, we prepare the timestamps for the pywin32 library. We use the os.path.getctime(), os.path.getmtime(), and os.path.getatime() methods to gather the source file's created, modified, and accessed times (on Windows, getctime() returns the creation time), and convert each floating-point value into a date using the datetime.fromtimestamp() method. With our datetime objects ready, we can make each value time zone-aware by localizing it with the specified timezone and providing it to the pywintypes.Time() function before printing the timestamps to the console:
created = dt.fromtimestamp(os.path.getctime(source))
created = Time(tz.localize(created))
modified = dt.fromtimestamp(os.path.getmtime(source))
modified = Time(tz.localize(modified))
accessed = dt.fromtimestamp(os.path.getatime(source))
accessed = Time(tz.localize(accessed))

print("Source\n======")
print("Created: {}\nModified: {}\nAccessed: {}".format(
    created, modified, accessed))
With the preparation complete, we can open the file with the CreateFile() method and pass the string path, representing the copied file, followed by arguments specified by the Windows API for accessing the file. Details of these arguments and their meanings can be reviewed at https://msdn.microsoft.com/en-us/library/windows/desktop/aa363858(v=vs.85).aspx:
handle = CreateFile(dest_file, GENERIC_WRITE, FILE_SHARE_WRITE,
                    None, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL,
                    None)
SetFileTime(handle, created, accessed, modified)
CloseHandle(handle)
Once we have an open file handle, we can call the SetFileTime() function to update, in order, the file's created, accessed, and modified timestamps. With the destination file's timestamps set, we need to close the file handle using the CloseHandle() method. To confirm to the user that the copying of the file's timestamps was successful, we print the destination file's created, modified, and accessed times:
created = tz.localize(dt.fromtimestamp(os.path.getctime(dest_file)))
modified = tz.localize(dt.fromtimestamp(os.path.getmtime(dest_file)))
accessed = tz.localize(dt.fromtimestamp(os.path.getatime(dest_file)))

print("\nDestination\n===========")
print("Created: {}\nModified: {}\nAccessed: {}".format(
    created, modified, accessed))
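Rather than comparing the printed values by eye, the check could also be made programmatically. The sketch below is our own addition: the helper name timestamps_match and its one-second tolerance are assumptions, and on non-Windows systems os.path.getctime() reports the last metadata change rather than the creation time, so the created entry is only meaningful on Windows.

```python
import os
import shutil
import tempfile


def timestamps_match(source, dest_file, tolerance=1.0):
    """Return a dict mapping each timestamp name to True when it matches."""
    getters = {
        "created": os.path.getctime,
        "modified": os.path.getmtime,
        "accessed": os.path.getatime,
    }
    # A timestamp matches when the two values differ by at most `tolerance`
    return {name: abs(getter(source) - getter(dest_file)) <= tolerance
            for name, getter in getters.items()}


# Demonstration on a throwaway file copied with copy2() only
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "source.txt")
with open(source, "w") as handle:
    handle.write("evidence")
os.utime(source, (1000000000, 1000000000))
dest_file = os.path.join(workdir, "copy.txt")
shutil.copy2(source, dest_file)
print(timestamps_match(source, dest_file))
shutil.rmtree(workdir)
```

In the recipe itself, such a check would run after CloseHandle() with the real source and dest_file paths.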
The script output shows copying a file from the source to the destination with timestamps successfully preserved:

There's more…
This script can be further improved. We have provided a couple of recommendations here:
- Hash the source and destination files to ensure they were copied successfully. File hashing is introduced in the Hashing files and data streams recipe in the next section.
- Output a log of the files copied and any exceptions encountered during the copying process.
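Both recommendations can be combined in a few lines. The following is only a sketch of one possible approach: the function names sha256_file and verified_copy, the logger name, and the choice of SHA-256 are our own assumptions, and the timestamp handling from the recipe is left out for brevity.

```python
import hashlib
import logging
import os
import shutil
import tempfile

logger = logging.getLogger("forensic_copy")  # hypothetical logger name


def sha256_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files are never read into memory whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_copy(source, dest_file):
    """Copy source to dest_file; return True only if both hashes match."""
    try:
        shutil.copy2(source, dest_file)
    except (IOError, OSError) as error:
        # Record the failure instead of letting the script abort
        logger.error("Failed to copy %s: %s", source, error)
        return False
    source_hash = sha256_file(source)
    dest_hash = sha256_file(dest_file)
    matched = source_hash == dest_hash
    logger.info("%s -> %s sha256=%s match=%s",
                source, dest_file, dest_hash, matched)
    return matched


# Demonstration on a throwaway file
logging.basicConfig(level=logging.INFO)
workdir = tempfile.mkdtemp()
sample = os.path.join(workdir, "sample.bin")
with open(sample, "wb") as handle:
    handle.write(b"abc")
print(verified_copy(sample, os.path.join(workdir, "sample_copy.bin")))
shutil.rmtree(workdir)
```

Passing a filename to logging.basicConfig() would send the same records to a log file instead of the console.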