Project Title
Analyze a file hidden by the steganography tool and how the steganography detection
tool detects it.
Authors
• Laili Aidi
• Ahmet Aydın
• Arman Güngör
Report Organisation
This report is organised as follows: Chapter 1 contains the introduction; Chapter
2 contains the background; Chapter 3 contains the basic theory of Steganography and
Steganalysis; Chapter 4 contains the results and discussion; and finally Chapter 5
contains the conclusion and future scope.
Abstract
This project figures out the pattern of the bytes in a stego file and how a
steganalysis tool can identify the bytes that a steganography tool appends to the
truck file. The analysis is based on the basic theory of steganography and
steganalysis, and uses a hex editor to check what kind of bytes the steganography
tool appends to the truck file.
Discussion
1. Introduction
This project focuses on Hiderman and Masker as steganography tools
and Stegspy as a detection tool. We analyze how Hiderman and Masker hide a file
inside another file and encrypt the bytes, and how Stegspy detects whether a file
contains a hidden file or not.
2. Background
This project is based on the theory that a steganography tool leaves a fingerprint,
and that a steganography detection tool looks for this fingerprint to identify
whether a file contains a steganography message or not [1].
The goal of this project is to learn the byte pattern that Hiderman and Masker
produce after the steganography process, predict how Stegspy detects it,
and then combine these results into a comparison between Hiderman
and Masker.
We can only reach this conclusion after we analyze how Hiderman and Masker
hide a file, and how Stegspy can detect it so easily. With this knowledge it is
possible to make a better steganography tool, but that is out of
this project's scope.
3. Steganography and Steganalysis Concept
3.1 Steganography
Steganography is the technique and science of communicating a message in a way
that cannot be detected [2]. Steganography and encryption are both used for data
confidentiality, but there are differences between them [3]:
• Encryption provides secure communication, which requires a key to read the
data; the data cannot be read / removed, but it can be modified.
• Steganography provides secret communication: the embedded data cannot be read /
modified / removed without significantly altering the data in which it is
embedded, and both the embedded data and the communication process itself stay
confidential unless an attacker can detect them.
1) Steganography Type [3]
a. Robust steganography: involves embedding information into a file in a way
that cannot easily be destroyed. There is a hidden mark in some part of the
file, and if that mark is removed / changed, the file is rendered
useless. Types of robust marking:
• Fingerprinting: involves hiding a unique identifier of the user who is
allowed to use the data but not to distribute it. This identifier can be used to
identify which user violated the license agreement by distributing a copy
of that data.
• Watermarking: involves hiding a unique identifier of the owner to identify who
has an illegal copy. Watermarks can be hidden to prevent detection and
removal (imperceptible watermarks), or formed as a visual pattern
overlaid on an image (visible watermarks). Watermarking provides a high
level of robustness, so it should be impossible to remove a watermark
without degrading the quality.
b. Fragile steganography: involves embedding information into a truck file in
such a way that the information is destroyed if the truck file is modified.
2) Steganography Technique
a. Binary File Techniques: embedding information inside a binary file
without affecting the file execution [3]. The original binary file is needed to
decode and extract the embedded watermark, by comparing the marked and
original files. This method is simple but not resistant to attack, because it can
be detected by comparing different versions of the stego files.
b. Plaintext Steganography Techniques: altering the document in a way that is
not visible to the human eye but can be decoded by a computer [3, 1]. This
can be done in several ways:
• Line Shift Coding Protocol, by shifting various lines inside the document up /
down by a small fraction [3].
• Word Shift Coding Protocol, by shifting words left / right [3].
• Feature Coding Protocol, by using features of the text to hide information;
each of these is marked in the document [3].
• White Space Manipulation, by adding a certain amount of white / blank
space to the end of lines / between consecutive words, which
corresponds to a bit value [3, 1].
• Text Content, by changing the sentences but keeping their original meaning
[3].
• XML, by using the different tags allowed by the W3C, the space inside
a tag, etc. [3].
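The White Space Manipulation item above can be illustrated with a minimal Python sketch. The encoding rule used here (a single trailing space on a line means bit 1, no trailing space means bit 0) and the function names are assumptions made for illustration; the cited sources do not prescribe this exact scheme.

```python
def embed_bits(cover_lines, bits):
    """Hide one bit per line: a trailing space encodes 1, no trailing space encodes 0."""
    if len(bits) > len(cover_lines):
        raise ValueError("cover text has too few lines for the message")
    stego_lines = []
    for index, line in enumerate(cover_lines):
        clean = line.rstrip(" ")  # drop any existing trailing spaces first
        if index < len(bits) and bits[index] == 1:
            clean += " "
        stego_lines.append(clean)
    return stego_lines


def extract_bits(stego_lines, bit_count):
    """Read the hidden bits back from the first bit_count lines."""
    return [1 if line.endswith(" ") else 0 for line in stego_lines[:bit_count]]


cover = ["alpha", "beta", "gamma", "delta"]
message_bits = [1, 0, 1, 1]
stego = embed_bits(cover, message_bits)
recovered = extract_bits(stego, len(message_bits))
```

The stego text looks identical to the cover text when displayed, yet the trailing spaces carry the message.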
c. Still Imagery Steganography Techniques [3, 1]: exploit the weakness of the
human visual system (HVS), which cannot detect the variation in luminance
of color vectors at the higher-frequency side of the visual spectrum [1]. A
picture can be represented by a collection of color pixels. These individual
pixels are represented by optical characteristics, and each of these
characteristics can be digitally expressed as 1s and 0s.
• Simple Watermarking, by adding a pattern on top of another image [3].
• Least Significant Bit Hiding, by using the least significant bits of each
pixel in an image to hide the most significant bits of another (image
hiding) [3]. However, this technique is very vulnerable to attacks (image
compression and formatting) [1].
• Discrete Cosine Transformation [3, 1].
• Wavelet Transformation [3].
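A minimal Python sketch of the Least Significant Bit Hiding idea listed above, assuming 8-bit grayscale pixel values held in plain lists. The function names and the choice of 4 hidden bits per pixel are illustrative assumptions, not taken from the cited sources.

```python
def hide_pixel(cover, secret, n_bits=4):
    """Replace the n_bits lowest bits of cover with the n_bits highest bits of secret."""
    keep_mask = (0xFF << n_bits) & 0xFF  # keeps the high bits of the cover pixel
    return (cover & keep_mask) | (secret >> (8 - n_bits))


def reveal_pixel(stego, n_bits=4):
    """Recover an approximation of the secret pixel (its low bits are lost)."""
    return (stego << (8 - n_bits)) & 0xFF


def hide_image(cover_pixels, secret_pixels, n_bits=4):
    """Apply hide_pixel to two equally sized pixel lists."""
    if len(cover_pixels) != len(secret_pixels):
        raise ValueError("cover and secret must have the same number of pixels")
    return [hide_pixel(c, s, n_bits) for c, s in zip(cover_pixels, secret_pixels)]


cover_pixels = [0b10110110, 0b00001111]
secret_pixels = [0b11010011, 0b10100000]
stego_pixels = hide_image(cover_pixels, secret_pixels)
revealed = [reveal_pixel(p) for p in stego_pixels]
```

Each stego pixel differs from its cover pixel by at most 15 out of 255 when 4 bits are used, which is why the change is hard to see, and why recompressing the image (which alters the low bits) destroys the hidden data.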
d. Audio and Video Steganography Techniques [3, 6, 7, 8]
• Spread Spectrum, by matching the narrow bandwidth of the embedded
data to the large bandwidth of the medium, so that it is heard as noise
but can be recognised by the receiver with the correct key [3]. This can be
done using DSSS (direct sequence spread spectrum) or
FHSS (frequency hopping spread spectrum) [9, 10].
• LSB Coding: sampling followed by quantization converts the
analog audio signal to a digital binary sequence, and the least significant
bits of the samples are then used to carry the secret message [3, 1].
• Phase Coding, by using phase changes in the audio signal, which cannot easily
be recognized by the Human Auditory System (HAS). This technique encodes
the secret message bits as phase shifts in the phase spectrum of a digital
signal, achieving an inaudible encoding in terms of signal-to-noise ratio
[3, 1].
• Echo Hiding, by introducing an echo into the cover audio / video signal to
embed the secret message [1].
e. IP datagram steganography / Network Covert Channel / Network
steganography: using header fields of a TCP/IP datagram (the ‘Flags’ field and
‘Identification’ field of the IP header, and the TCP initial sequence number,
ISN) to hide data at the network datagram level in a
TCP/IP based network, so that it is not noticed by network watchers [4, 5].
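As an illustration of hiding data in the ‘Identification’ field, the following Python sketch packs two covert bytes into the 16-bit Identification field of a 20-byte IPv4 header and reads them back. It only builds the header in memory; the addresses, the zero checksum and the function names are assumptions for illustration, and a real sender would have to compute a valid checksum.

```python
import struct

# IPv4 header without options: version/IHL, TOS, total length, identification,
# flags/fragment offset, TTL, protocol, checksum, source, destination.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")


def build_header(covert, src=b"\x0a\x00\x00\x01", dst=b"\x0a\x00\x00\x02"):
    """Build an IPv4 header whose Identification field carries 2 covert bytes."""
    if len(covert) != 2:
        raise ValueError("the Identification field holds exactly 2 bytes")
    identification = int.from_bytes(covert, "big")
    # Checksum is left at 0 in this sketch (assumption: not sent on a real network).
    return IPV4_HEADER.pack(0x45, 0, 20, identification, 0, 64, 6, 0, src, dst)


def read_covert(header):
    """Extract the 2 covert bytes from the Identification field."""
    fields = IPV4_HEADER.unpack(header[:20])
    return fields[3].to_bytes(2, "big")


header = build_header(b"Hi")
hidden = read_covert(header)
```

Because the Identification field normally looks arbitrary to an observer, a sequence of such headers can carry a message two bytes at a time.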
3.2 Steganalysis [1]
Steganalysis is the process of identifying steganography: identifying a suspected
stego medium, inspecting various parameters to determine whether that medium
contains a hidden message, and then trying to recover the hidden data. In
cryptanalysis, it is clear that the intercepted message is encrypted and contains
hidden data, but in steganalysis the suspected media may or may not contain
hidden data.
1) Steganalysis Techniques are based on unusual patterns in the media or on visual
detection of the same. This is possible because the properties of electronic
media change after they are used to hide an object, resulting in degradation
of quality or unusual characteristics of the media.
2) Steganography Attacks
Steganography attacks consist of steganalysis, followed by extracting and
destroying the hidden object in the stego media. Several types of attacks are:
• Known carrier attack: The analysis is based on both the original cover media
and the stego media.
• Steganography only attack: The analysis is based only on the stego media.
• Known message attack: The analysis is based only on the hidden message.
• Known steganography attack: The analysis is based on the cover media, the stego
media and the steganography tool / algorithm.
4. Discussion and Result
This project was done using the software below:
1. Steganography tools:
Hiderman version 3.0
Masker version 7.5
2. Steganalysis tool: Stegspy version 2.0
3. Hex Editor: Hex Editor Neo 4.95
The analysis of how this software works was done by analyzing the byte pattern of
the stego file (comparing it with the truck file and the original file), identifying
the presence of the original file’s bytes, identifying inconsistencies / differences
between the stego files, and reviewing multiple stego files generated by the same
steganography tool to find the signature pattern [11].
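The comparison procedure described above was carried out by hand in a hex editor. The following Python sketch shows how the same two steps could be automated, assuming the stego file begins with the unmodified truck file bytes. The function names and the sample byte strings are hypothetical and do not come from any of the tools.

```python
def appended_bytes(truck, stego):
    """Return the bytes a tool added after the truck file content."""
    if not stego.startswith(truck):
        raise ValueError("stego file does not start with the truck file content")
    return stego[len(truck):]


def common_suffix(blobs):
    """Return the longest byte sequence shared at the end of all blobs."""
    if not blobs:
        return b""
    shortest = min(len(blob) for blob in blobs)
    length = 0
    # Walk backwards from the last byte while every blob agrees.
    while length < shortest and len({blob[-1 - length] for blob in blobs}) == 1:
        length += 1
    return blobs[0][len(blobs[0]) - length:]


# Made-up sample data standing in for two truck files and their stego files.
truck_one, truck_two = b"COVER-ONE", b"COVER-TWO"
stego_one = truck_one + b"\x01\x02secret-aSIG"
stego_two = truck_two + b"\x09payload-bSIG"
tails = [appended_bytes(truck_one, stego_one), appended_bytes(truck_two, stego_two)]
signature_candidate = common_suffix(tails)
```

A suffix that stays constant across stego files made from different truck files, hidden files and passwords is a candidate for the tool's signature.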
4.1 How do Hiderman and Masker work
a. Hiderman
Below is the analysis of the byte pattern in the stego media produced by
the steganography process of Hiderman:
1. The bytes of the truck file content, which are unencrypted.
2. 10 bytes of data with unknown function, whose value depends on the password.
3. The length of the hidden file name, which is unencrypted.
4. The name of the hidden file, which is encrypted.
5. The bytes of the hidden file content, which are stored using this algorithm:
For every 4 bytes of data, the first 2 bytes are unencrypted and the last 2 bytes
are encrypted. For each encryption of these 2 bytes, 1 character of the password
is used as the key.
For example, the first character of the password is used to encrypt the last 2
bytes of the first block, the second character of the password is
used for the last 2 bytes of the second block, etc.
6. 8 bytes of data, which are almost the same for every file. If they are changed /
removed, Hiderman will not authenticate the user to recover the stego file, even
though the given password is correct.
The 6th byte, which is necessary for decryption, could be a checksum or similar,
because if this byte is changed, Hiderman enters a loop condition
(Hiderman does not respond and the size of the truck file increases rapidly) and
cannot recover the stego file.
7. A stream of unknown bytes, whose length is not the same for each file.
8. The last 3 bytes (Hex value 43 44 4e) are the Hiderman signature.
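The block layout described in item 5 can be sketched in Python as follows. This only splits a payload into its unencrypted and encrypted halves and pairs each block with a password character; it does not decrypt anything, since the encryption algorithm itself is unknown. The function name is hypothetical, and reusing the password from its first character once it is exhausted is an assumption that the analysis does not confirm.

```python
def split_blocks(payload, password):
    """Yield (clear_bytes, encrypted_bytes, key_char) for each 4-byte block."""
    if not password:
        raise ValueError("password must not be empty")
    for offset in range(0, len(payload), 4):
        block = payload[offset:offset + 4]
        # Assumption: the password wraps around when it is shorter than the block count.
        key_char = password[(offset // 4) % len(password)]
        yield block[:2], block[2:], key_char


# Made-up payload: uppercase pairs stand for clear bytes, lowercase for encrypted ones.
blocks = list(split_blocks(b"ABxyCDzwEF", "pq"))
visible_plaintext = b"".join(clear for clear, _, _ in blocks)
```

Half of the hidden file content is therefore readable directly in the stego file without knowing the password.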
b. Masker
Since Masker uses standard cryptography algorithms (Blowfish, DES, Cast5,
Serpent-256, Rijndael-256, TripleDES, TWOFISH) to encrypt the hidden message
in the truck file, it is harder to know the exact usage of each byte. However, we
can still see the byte pattern of Masker’s stego file.
Below is the analysis of the byte pattern in the stego file produced by
the steganography process of Masker:
1. The bytes of the truck file content, which are unencrypted.
2. The length of the hidden file content, which is unencrypted and presented twice,
followed by a blank character (Hex value 20), with a total length of 13 bytes.
3. The bytes of the hidden file content, which are encrypted. After the encrypted
bytes of the file content, there is a stream of 0 characters (Hex value 30)
followed by 12 blank characters, and then a 0 character followed by 12 blank
characters again. This pattern possibly marks the end of the file content.
4. A stream of unknown bytes, which possibly contains the password and the
encryption algorithm used for the steganography process. The length of this part
depends on the length of the password.
5. The last 77 bytes are the Masker signature.
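The following Python sketch shows how the tail of a Masker stego file could be examined under the layout listed above. The regular expression is one possible reading of the end-of-content pattern in item 3 (one or more 0 characters, 12 blanks, one 0 character, 12 blanks); the function names and the sample data are hypothetical, and the content of the 77-byte signature is not reproduced here.

```python
import re

MASKER_SIGNATURE_LENGTH = 77
# One reading of item 3: "0"+ then 12 spaces, then "0" then 12 spaces.
END_MARKER = re.compile(rb"0+ {12}0 {12}")


def split_masker_tail(stego):
    """Split a stego file into (body, last 77 bytes)."""
    if len(stego) < MASKER_SIGNATURE_LENGTH:
        raise ValueError("file is shorter than the 77-byte signature")
    return stego[:-MASKER_SIGNATURE_LENGTH], stego[-MASKER_SIGNATURE_LENGTH:]


def find_end_markers(data):
    """Return the offsets where the assumed end-of-content pattern starts."""
    return [match.start() for match in END_MARKER.finditer(data)]


# Made-up sample: 2 "encrypted" bytes, the marker, unknown bytes, placeholder signature.
marker = b"000" + b" " * 12 + b"0" + b" " * 12
sample = b"\xaa\xbb" + marker + b"tail" + bytes(MASKER_SIGNATURE_LENGTH)
body, signature = split_masker_tail(sample)
marker_offsets = find_end_markers(body)
```

The bytes between the end marker and the 77-byte signature correspond to the stream of unknown bytes in item 4.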
4.2 How can StegSpy detect their fingerprints
Stegspy does steganalysis for Hiderman by detecting the last 3 bytes of the stego
file as Hiderman’s signature. Finding the signature at the end of the file is enough
for Stegspy to report steganography within the file, even though that file may not be
a stego file. For example, if we just add Hiderman’s signature to the end of a normal
file, Stegspy detects it as a Hiderman stego file, although Hiderman does not
recognize it as a stego file.
Masker puts a longer fingerprint at the end of the stego file
(77 bytes). However, Stegspy cannot identify files that contain a hidden message
produced by Masker (although Stegspy claims it can identify Masker’s stego files).
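The signature check described in this section can be sketched in Python as follows. This is not Stegspy's actual implementation, which was not inspected; it only reproduces the observed behaviour of flagging any file whose last 3 bytes are 43 44 4e. The function name and sample data are hypothetical.

```python
HIDERMAN_SIGNATURE = bytes.fromhex("43444e")  # ASCII "CDN"


def looks_like_hiderman(data):
    """Flag a file as a Hiderman stego file if it ends with the 3-byte signature."""
    return data.endswith(HIDERMAN_SIGNATURE)


normal_file = b"just a plain text file"
forged_file = normal_file + HIDERMAN_SIGNATURE  # no hidden data, only the signature

normal_flagged = looks_like_hiderman(normal_file)
forged_flagged = looks_like_hiderman(forged_file)
```

The forged file is flagged even though it contains no hidden data, which matches the false positive observed with Stegspy.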
4.3 The Comparison between "Hiderman" and "Masker"
The table below presents our general analysis for Hiderman, Masker and Stegspy:
Table 1. Comparison of Hiderman and Masker
| Comparison | Hiderman | Masker |
| --- | --- | --- |
| Encryption algorithm | The data is hidden by a predictable encryption algorithm, however the encryption algorithm itself is unknown. | The data is hidden by a standard encryption algorithm, and the pattern is unpredictable. The user can choose the encryption algorithm in the user interface. |
| Steganography recovery | Hiderman can recover both the truck file and the hidden file, although sometimes some of the bytes change in the truck file after the recovery process. | Masker cannot recover the truck file, only the hidden file. The truck file still contains the hidden file data after the recovery process. |
| Steganalysis | Stegspy does steganalysis for Hiderman by detecting the last 3 bytes of the stego file as Hiderman’s signature. These bytes are also used by Hiderman to recover the file. | Stegspy cannot identify the stego file produced by Masker. Masker uses the last 77 bytes as the signature. |
5. Limitation of the Study
• The analysis in this project was only done with text and JPEG files, not with
audio or video files. The procedures mentioned above might differ for other
types of files.
• There are parts of the stego files that cannot be analyzed yet, because the
encryption used in the steganography process makes these bytes
complicated to analyze.
6. Conclusions and Future Work Recommendations
Conclusions:
• Hiderman and Masker can be classified as robust steganography and use
Binary File steganography techniques.
• Hiderman and Masker both use an encryption algorithm in the steganography
process, but Masker’s encryption is stronger than Hiderman’s, because
Hiderman’s encryption result is predictable compared to Masker’s.
• Masker provides various encryption algorithms.
• Hiderman and Masker leave a signature in the stego file, and it is easy to detect.
• Stegspy can recognize the stego file produced by Hiderman but not by Masker,
and it just searches for the signature of the steganography programs.
Future Work Recommendation:
• With decryption of all bytes of the stego file, it is possible to make a deeper
analysis in order to understand the steganography process of Hiderman and
Masker.
• The research can be expanded by analyzing the steganography process in
audio and video media files.
• Analysis of other steganography and steganalysis techniques and tools.
Reference(s)
[1] Soumyendu Das, Subhendu Das, Bijoy Bandyopadhyay, Sugata Sanyal.
“Steganography and Steganalysis: Different Approaches”. Internet:
http://www.tifr.res.in/~sanyal/papers/Soumyendu_Steganography_Steganalysis_different_approaches.pdf
[Nov. 13, 2010]
[2] C. Cachin. “An Information-Theoretic Model for Steganography”. Proceedings of
2nd Workshop on Information Hiding, MIT Laboratory for Computer Science,
May 1998. Internet: http://www.zurich.ibm.com/~cca/papers/stego.pdf
[Nov. 21, 2010]
[3] Jonathan Cummins, Patrick Diskin, Samuel Lau and Robert Parlett.
“Steganography And Digital Watermarking”. School of Computer Science, The
University of Birmingham, 2004. Internet:
http://www.cs.bham.ac.uk/~mdr/teaching/modules03/security/students/SS5/Steganography.pdf
[Nov. 15, 2010]
[4] Niels Provos and Peter Honeyman. “Hide and Seek: An Introduction to
Steganography”. IEEE Security & Privacy Magazine. Internet:
http://niels.xtdnet.nl/papers/practical.pdf [Nov. 20, 2010]
[5] Kamran Ahsan and Deepa Kundur. “Practical Data Hiding in TCP/IP”. Electrical
and Computer Engineering, University of Toronto. Internet:
http://wwwiti.cs.uni-magdeburg.de/iti_amsl/acm/acm02/ahsan_kundur.pdf
[Nov. 17, 2010]
[6] Mark Noto. “MP3Stego: Hiding Text in MP3 Files”. The Information Security
Reading Room, SANS Institute. Internet:
http://www.sans.org/reading_room/whitepapers/stenganography/550.php
[Nov. 15, 2010]
[7] Udit Budhia and Deepa Kundur. “Digital video steganalysis exploiting
collusion sensitivity”. Sensors, Command Control, Communications and
Intelligence (C3I) Technologies for Homeland Security and Homeland Defense,
Edward M. Carapezza, ed., Proc. SPIE (vol. 5403), Orlando, Florida, April 2004.
Internet: http://www.ece.tamu.edu/~deepa/pdf/BudKun04.pdf [Nov. 17, 2010]
[8] Tyler Gibson. “Methods of Audio Steganography”. Internet:
http://www.snotmonkey.com/work/school/405/methods.html [Nov. 17, 2010]
[9] “Direct-sequence spread spectrum (DSSS)”. Wikipedia, the free encyclopedia,
GNU Free Documentation License. Internet:
http://en.wikipedia.org/wiki/Direct-sequence_spread_spectrum [Nov. 21, 2010]
[10] “Frequency-hopping spread spectrum (FHSS)”. Wikipedia, the free
encyclopedia, GNU Free Documentation License. Internet:
http://en.wikipedia.org/wiki/Frequency-hopping_spread_spectrum
[Nov. 21, 2010]
[11] SpyHunter. “Steganography and Steganalysis”. Internet:
http://www.spy-hunter.com/steg_presentation_v1.pdf [Nov. 21, 2010]