Hashing and data integrity
- Hashing: This is where the data inside a document is run through a hashing algorithm such as Secure Hash Algorithm version 1 (SHA1) or Message Digest version 5 (MD5). This turns the data inside the file into a fixed-length text string known as a hash value; this is also known as a message digest.
- Hashing the same data: If you copy a file and therefore have two files containing the same data, and you hash them with the same hashing algorithm, they will always produce the same hash value. Please look at the example that follows.
- Verifying integrity: During forensic analysis, the scientist takes a copy of the data prior to investigation. To ensure that he/she has not tampered with it during investigation, he/she will hash the data before starting, hash it again when finished, and then compare the two hash values. If the hashes match, then we know that the integrity of the data is intact.
- One-way function: For the purpose of the exam, hashing is a one-way function and cannot be reversed.
- HMAC authentication: In cryptography, an HMAC (known as either keyed-hash message authentication code or hash-based message authentication code) is a specific type of Message Authentication Code (MAC) involving a cryptographic hash function and a secret cryptographic key. We can have HMAC-MD5 or HMAC-SHA1; for the purpose of the exam, HMAC provides both data integrity and data authentication.
- Digital signature: This is used to verify the integrity of an email so that you know it has not been tampered with in transit. The sender's private key is used to sign a one-way hash of the email; when the email arrives at its destination, the recipient, who has already been given the sender's public key, uses that key to verify that it has not been tampered with in transit. This will be covered more in-depth later in this book.
Can you read data that has been hashed? Hashing does not hide the data; a digitally signed email can still be read, because the signature only verifies integrity. If you wish to stop someone reading the email in transit, you need to encrypt it.
- RACE Integrity Primitives Evaluation Message Digest (RIPEMD): This is a 128-bit hashing function. RIPEMD (https://en.wikipedia.org/wiki/RACE_(Europe)) has been replaced by RIPEMD-160, RIPEMD-256, and RIPEMD-320. For the purpose of the exam, you need to know that it can be used to hash data.
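To make the HMAC idea concrete, here is a minimal sketch using Python's standard hmac and hashlib modules. The key, the messages, and the helper names make_tag and verify_tag are assumptions made up for illustration. HMAC-SHA1 is used only because it is one of the variants named above.

```python
import hashlib
import hmac

# Hypothetical shared secret and message, chosen only for illustration.
secret_key = b"shared-secret-key"
message = b"Transfer 100 to account 12345"


def make_tag(key: bytes, data: bytes) -> str:
    """Return the HMAC-SHA1 tag of data as a hex string."""
    return hmac.new(key, data, hashlib.sha1).hexdigest()


def verify_tag(key: bytes, data: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, data), tag)


tag = make_tag(secret_key, message)

# Same key and same message: the tag verifies.
ok_original = verify_tag(secret_key, message, tag)
# Message altered in transit: integrity check fails.
ok_tampered = verify_tag(secret_key, b"Transfer 900 to account 12345", tag)
# Sender does not hold the shared key: authentication check fails.
ok_wrong_key = verify_tag(b"some-other-key", message, tag)

print(ok_original, ok_tampered, ok_wrong_key)  # True False False
```

The tag only verifies when both the message and the key are unchanged, which is why HMAC is described as giving integrity and authentication together.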
Hash practical
The reason that we hash a file is to verify its integrity so that we know if someone has tampered with it.
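As a rough illustration of this point, the following sketch hashes some sample data with Python's standard hashlib module. The sample sentence and the helper name md5_hex are assumptions for the example. MD5 is used only because the exercise in this section uses an MD5 tool.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 hash value (message digest) of data as hex text."""
    return hashlib.md5(data).hexdigest()


# Hypothetical sample data standing in for the contents of a file.
original = b"The quick brown fox jumps over the lazy dog"
copy = bytes(original)       # identical data, as in a copied file
altered = original + b"."    # one extra dot at the end of the sentence

original_hash = md5_hex(original)
copy_hash = md5_hex(copy)
altered_hash = md5_hex(altered)

print("original:", original_hash)
print("copy:    ", copy_hash)
print("altered: ", altered_hash)
print("copy matches original:   ", copy_hash == original_hash)
print("altered matches original:", altered_hash == original_hash)
```

The copy produces the same hash value as the original, while the single added dot produces a completely different one.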
Hash exercise
In this exercise, we have a file called data.txt. First of all, I use a free MD5 hashing tool and browse to the data.txt file, which generates a hash value. I have also created a folder called Move data to here:
- Get the original hash:

- Copy the hash from the current hash value to the original hash value.
- Copy the data.txt file to the Move data to here folder, then go to the MD5 hash software and browse to the data.txt file in the new location, then press verify. The values should be the same as shown here:

The values are the same, therefore we know that the integrity of the data is intact and that it was not tampered with while moving the data.txt file.
- Next, we go into the data.txt file and change a single character, add an extra dot at the end of a sentence, or even enter a space that cannot be seen. We then take another hash of the data and see that the hash value is different and does not match; this means that the data has been tampered with:
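The same exercise can be approximated without a graphical tool. The sketch below is only an assumption about how one might script it in Python: it creates a temporary data.txt with made-up contents, copies it into a folder named Move data to here, and compares MD5 hash values before and after an invisible edit. The helper name md5_of_file is hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def md5_of_file(path: Path) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as workdir:
    # Step 1: create data.txt (contents are made up) and get the original hash.
    source = Path(workdir) / "data.txt"
    source.write_text("This is the original data.\n", encoding="utf-8")
    original_hash = md5_of_file(source)

    # Step 2: copy the file to the new folder and hash it there.
    target_dir = Path(workdir) / "Move data to here"
    target_dir.mkdir()
    moved = Path(shutil.copy2(source, target_dir))
    hash_after_copy = md5_of_file(moved)

    # Step 3: append a single space that cannot be seen, then hash again.
    with moved.open("a", encoding="utf-8") as handle:
        handle.write(" ")
    hash_after_edit = md5_of_file(moved)

print("copy matches original:       ", hash_after_copy == original_hash)
print("edited copy matches original:", hash_after_edit == original_hash)
```

The first comparison is true because copying does not change the data; the second is false because even one added space changes the hash value.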
