UNICODE
Hacking The International Character System
Introduction
• Standard for representing text for most of
the world’s writing systems	

• The most recent version at the time of this presentation was Unicode 6.0

• Widely adopted by most programming
platforms, operating systems and The Web	

• The most widely used Unicode encodings
are UTF-8 and UTF-16
Introduction to UTF-8
• UTF-8 (UCS Transformation Format - 8-bit)

• Backwards compatible with ASCII	

• Simple ASCII chars are represented by a
single byte	

• Other characters take 2 to 4 bytes in current
UTF-8 (RFC 3629); the original design allowed
sequences of up to 6 bytes carrying up to 31 bits
UTF-8 Encoding Table
Bits  Byte 1    Byte 2    Byte 3    Byte 4    Byte 5    Byte 6
7     0XXXXXXX
11    110XXXXX  10XXXXXX
16    1110XXXX  10XXXXXX  10XXXXXX
21    11110XXX  10XXXXXX  10XXXXXX  10XXXXXX
26    111110XX  10XXXXXX  10XXXXXX  10XXXXXX  10XXXXXX
31    1111110X  10XXXXXX  10XXXXXX  10XXXXXX  10XXXXXX  10XXXXXX
UTF-8 Encoding Rules
• Every ASCII character is also a valid UTF-8 character
(up to 7 bits, or 128 characters)

• For every other UTF-8 byte sequence, the number of leading
1 bits in the first byte gives the length of the sequence in bytes

• The rest of the bytes from the byte sequence have 10
as the two most significant bits	

• This helps to easily find where a byte sequence
starts and ends	

• There are more rules but this is a good start...
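The table and rules above can be turned into a short encoder. This is a minimal sketch, not a reference implementation: the function names `utf8_encode` and `sequence_length` are made up for illustration, and only the 1- to 4-byte rows are covered because current UTF-8 stops there.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one code point following the 1- to 4-byte rows of the table."""
    if code_point < 0 or code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if code_point < 0x80:            # 7 bits  -> 0XXXXXXX
        return bytes([code_point])
    if code_point < 0x800:           # 11 bits -> 110XXXXX 10XXXXXX
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:         # 16 bits -> 1110XXXX 10XXXXXX 10XXXXXX
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 21 bits -> 11110XXX 10XXXXXX 10XXXXXX 10XXXXXX
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


def sequence_length(first_byte: int) -> int:
    """Return the sequence length announced by a lead byte (1 to 4)."""
    if first_byte < 0x80:
        return 1
    if first_byte >> 5 == 0b110:
        return 2
    if first_byte >> 4 == 0b1110:
        return 3
    if first_byte >> 3 == 0b11110:
        return 4
    raise ValueError("continuation byte or invalid lead byte")


# Compare against Python's built-in encoder for a few sample code points.
for cp in (0x2E, 0xE9, 0x200E, 0x1F600):
    assert utf8_encode(cp) == chr(cp).encode("utf-8")
```

Because every continuation byte starts with 10, a reader that lands in the middle of a sequence can skip bytes until `sequence_length` stops raising, which is the resynchronisation property the rules mention.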
Interesting UTF-8 Characters
• Unicode also defines many formatting characters; their UTF-8 encodings include

• Byte Order Mark (BOM, U+FEFF) - 0xEF, 0xBB, 0xBF are placed at the start of the document to indicate UTF-8

• Left to Right Mark (LRM, U+200E) - 0xE2, 0x80, 0x8E are placed to indicate text orientation

• In HTML - &#x200E;, &#8206; or &lrm;

• Right to Left Mark (RLM, U+200F) - 0xE2, 0x80, 0x8F are placed to indicate text orientation

• In HTML - &#x200F;, &#8207; or &rlm;

• Left to Right Embedding (LRE, U+202A) - 0xE2, 0x80, 0xAA

• In HTML - &#x202A;

• Right to Left Embedding (RLE, U+202B) - 0xE2, 0x80, 0xAB

• In HTML - &#x202B;

• There are more...
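The byte sequences listed above can be checked with a few lines of Python. This is a small sketch; the `MARKS` table and the `utf8_hex` helper are names chosen for this example, and `bytes.hex(" ")` assumes Python 3.8 or later.

```python
import unicodedata

# Code points of the formatting characters listed above.
MARKS = {
    "BOM": 0xFEFF,
    "LRM": 0x200E,
    "RLM": 0x200F,
    "LRE": 0x202A,
    "RLE": 0x202B,
}


def utf8_hex(code_point: int) -> str:
    """Return the UTF-8 bytes of a code point as space-separated hex."""
    return chr(code_point).encode("utf-8").hex(" ").upper()


for label, cp in MARKS.items():
    # Prints the label, the code point, its UTF-8 bytes and the official name.
    print(f"{label}  U+{cp:04X}  {utf8_hex(cp)}  {unicodedata.name(chr(cp))}")
```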
Clarifications
• How exactly does the hex sequence 0xE2, 0x80, 0x8E map to
&#x200E; in HTML?

• 0xE2, 0x80, 0x8E is UTF-8

• &#x200E; refers to code point U+200E, which is 0x20, 0x0E in UTF-16 (big-endian)

• also known as 0x0000200E in UTF-32

• There is no magic! You simply need to know which
encoding system you are working with and find out what
characters it supports.

• http://www.decodeunicode.org - is a good reference
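The relationship between the HTML references and the three encodings can be seen directly in Python. A short sketch using only the standard library; the big-endian codecs are chosen here so that the byte order matches the way the values are written on the slide.

```python
import html

# All three HTML spellings resolve to the same code point.
lrm = html.unescape("&lrm;")
assert lrm == html.unescape("&#x200E;") == html.unescape("&#8206;")

print(f"U+{ord(lrm):04X}")               # the code point itself
print(lrm.encode("utf-8").hex(" "))      # UTF-8 bytes
print(lrm.encode("utf-16-be").hex(" "))  # UTF-16, big-endian
print(lrm.encode("utf-32-be").hex(" "))  # UTF-32, big-endian
```

The code point (U+200E) is the constant; the byte sequences are just different serialisations of it.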
Multiple Representations
• The same character can be represented multiple ways if a decoder accepts overlong (non-shortest) sequences, which the current UTF-8 specification forbids

• For example

• . (DOT) is represented as 0x2E

• A lenient decoder also reads 0xC0, 0xAE as a dot

• It is also the equivalent of 0xE0, 0x80, 0xAE

• It is also the equivalent of 0xF0, 0x80, 0x80, 0xAE

• It is also the equivalent of 0xF8, 0x80, 0x80, 0x80, 0xAE

• It is also the equivalent of 0xFC, 0x80, 0x80, 0x80, 0x80, 0xAE
Translating the . (DOT)
HEX                Byte 1    Byte 2    Byte 3    Byte 4    Byte 5    Byte 6
2E                 00101110
C0 AE              11000000  10101110
E0 80 AE           11100000  10000000  10101110
F0 80 80 AE        11110000  10000000  10000000  10101110
F8 80 80 80 AE     11111000  10000000  10000000  10000000  10101110
FC 80 80 80 80 AE  11111100  10000000  10000000  10000000  10000000  10101110
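The rows of this table can be generated mechanically by padding the code point with leading zero bits. The sketch below is for illustration only: `encode_with_length` is a hypothetical helper, and everything it produces beyond the shortest form is an overlong sequence. The loop at the end shows that Python's built-in strict decoder accepts only the one-byte form.

```python
def encode_with_length(code_point: int, n_bytes: int) -> bytes:
    """Encode code_point using exactly n_bytes (1 to 6), as in the table.

    Illustration only: any form longer than the shortest one is an
    overlong sequence that a conforming decoder rejects.
    """
    if n_bytes == 1:
        if code_point > 0x7F:
            raise ValueError("does not fit in one byte")
        return bytes([code_point])
    if not 2 <= n_bytes <= 6:
        raise ValueError("n_bytes must be between 1 and 6")
    payload_bits = (7 - n_bytes) + 6 * (n_bytes - 1)
    if code_point >= 1 << payload_bits:
        raise ValueError("code point does not fit")
    # Lead byte: n_bytes one-bits, a zero, then the top payload bits.
    lead_prefix = (0xFF << (8 - n_bytes)) & 0xFF
    out = [lead_prefix | (code_point >> (6 * (n_bytes - 1)))]
    # Continuation bytes: 10 followed by six payload bits each.
    for shift in range(6 * (n_bytes - 2), -1, -6):
        out.append(0x80 | ((code_point >> shift) & 0x3F))
    return bytes(out)


for n in range(1, 7):
    raw = encode_with_length(0x2E, n)
    try:
        decoded = repr(raw.decode("utf-8"))
    except UnicodeDecodeError:
        decoded = "rejected by the strict decoder"
    print(raw.hex(" ").upper(), "->", decoded)
```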
Half and Full Width Forms
• Graphic characters are traditionally classed as
halfwidth and fullwidth characters

• In a fixed-width font a halfwidth character takes
half the width of a fullwidth character

• In Unicode you can find characters which are
presented in both their halfwidth and fullwidth forms

• http://www.unicode.org/charts/PDF/UFF00.pdf - for more information
Fullwidth Latin Characters
• Halfwidth and fullwidth notations make sense when
used for characters such as those found in the Japanese
and Chinese character sets

• The specification also covers Latin characters
presented in their fullwidth forms

• As a result the following mappings are possible

• A - 0x41 (halfwidth) = Ａ - 0xEF, 0xBC, 0xA1 (fullwidth, U+FF21)

• B - 0x42 (halfwidth) = Ｂ - 0xEF, 0xBC, 0xA2 (fullwidth, U+FF22)

• etc.
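The mapping above can be reproduced in Python. In this sketch, `fold_fullwidth` is a hypothetical helper that relies on the fullwidth ASCII variants (U+FF01 to U+FF5E) sitting at a fixed offset of 0xFEE0 from their ASCII counterparts; the standard library's NFKC normalization performs the same fold for these characters.

```python
import unicodedata

fullwidth_a = "\uFF21"   # FULLWIDTH LATIN CAPITAL LETTER A

print(fullwidth_a.encode("utf-8").hex(" ").upper())       # UTF-8 bytes
print(fullwidth_a == "A")                                 # different code points
print(unicodedata.normalize("NFKC", fullwidth_a) == "A")  # equal after folding


def fold_fullwidth(text: str) -> str:
    """Map fullwidth ASCII variants (U+FF01..U+FF5E) to ASCII by offset."""
    return "".join(
        chr(ord(ch) - 0xFEE0) if 0xFF01 <= ord(ch) <= 0xFF5E else ch
        for ch in text
    )


print(fold_fullwidth("\uFF21\uFF22"))
```

A filter that compares raw strings sees the fullwidth letters as different from ASCII, while any later stage that normalizes them will not, which is where the security relevance comes from.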
Security Considerations
• Visual Security Issues	

• Internationalized names	

• Left to Right and Right to Left representations	

• Charset Translation Issues	

• Occurs when strings are normalized before and after
translation between character sets	

• Characters with multiple representations

• The same character can be represented in multiple ways
Case Study: Windows Filename Mangling
• Consider the following file names, where [RTLO] stands for the Right-to-Left Override character (U+202E) and [LTRO] for the Left-to-Right Override character (U+202D)

• [RTLO]cod.stnemucodtnatropmi.exe

• [RTLO]cod.yrammusevituc[LTRO]n1c[LTRO].exe

• [RTLO]gpj.!nuf_stohsnee[LTRO]n1c[LTRO].scr

• When displayed, these file names look quite different

• exe.importantdocuments.doc

• n1c.executivesummary.doc

• n1c.screenshots_fun!.jpg
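One way to spot such names is to look for bidirectional control characters before trusting what is displayed. The following is a rough sketch rather than a complete defence: `bidi_controls` and `BIDI_CONTROL_CLASSES` are names invented here, and the check relies on the bidi class reported by Python's `unicodedata` module.

```python
import os
import unicodedata

# Bidi classes of the explicit embedding, override and isolate controls.
BIDI_CONTROL_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF",
                        "LRI", "RLI", "FSI", "PDI"}


def bidi_controls(name: str) -> list:
    """Return the Unicode names of bidi controls found in a file name."""
    return [unicodedata.name(ch) for ch in name
            if unicodedata.bidirectional(ch) in BIDI_CONTROL_CLASSES]


# First example from the slide: U+202E followed by the reversed text.
name = "\u202Ecod.stnemucodtnatropmi.exe"

print(os.path.splitext(name)[1])   # the extension the system actually uses
print(bidi_controls(name))         # the hidden control character
```

The stored name still ends in .exe; only the rendering is reversed, so checking the logical string rather than its appearance reveals the real extension.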
Case Study: The PayPal Scam
• What is the difference between paypal.com
and paypai.com or between intel.com and
lntel.com?

• How about сitybank.com, where the first letter is not a Latin c?

• 0000000: d181 6974 7962 616e 6b2e 636f 6d ..itybank.com

• 0xd1, 0x81 is the UTF-8 encoding of the Cyrillic letter с (U+0441), which looks like the Latin letter c (U+0063)
although they are different characters
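The hex dump above can be decoded and inspected in Python. This sketch uses a crude heuristic, taking the first word of each letter's Unicode name as its script, because the standard library does not expose the Script property; `scripts_used` is a hypothetical helper and not a replacement for a proper confusables check.

```python
import unicodedata

# Bytes copied from the hex dump on the slide.
raw = bytes.fromhex("d181 6974 7962 616e 6b2e 636f 6d")
domain = raw.decode("utf-8")


def scripts_used(label: str) -> set:
    """Rough script guess: first word of each letter's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}


print(domain == "citybank.com")        # looks identical, compares unequal
print(unicodedata.name(domain[0]))     # name of the first letter
print(sorted(scripts_used(domain)))    # more than one script in one label
```

A label that mixes Cyrillic and Latin letters is a reasonable signal for closer inspection, although legitimate mixed-script names do exist.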
Case Study: Directory Traversal
• Let’s say an application shows images by requesting /getimage.jsp?name=image.jpg

• The attacker tries to retrieve an arbitrary file by requesting /getimage.jsp?name=../../../../boot.ini

• Unfortunately the attack fails because the application checks
for the presence of the ../ character sequence

• ../ is 0x2E, 0x2E, 0x2F in hex

• ../ is also 0x2E, 0xC0, 0xAE, 0x2F when the second dot is written in overlong UTF-8

• Since 0x2E, 0xC0, 0xAE, 0x2F is not byte-for-byte equal to 0x2E, 0x2E, 0x2F,
the security check is bypassed; if a lenient decoder later turns 0xC0, 0xAE back into a dot, the file content is retrieved
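The bypass can be reproduced in miniature. Everything in this sketch is hypothetical: `lenient_decode` is a deliberately flawed decoder standing in for the vulnerable component, `naive_filter` stands in for the application's check, and `strict_check` shows one possible ordering (decode strictly first, then inspect the text) that closes the gap.

```python
def lenient_decode(data: bytes) -> str:
    """Decode 1- and 2-byte sequences WITHOUT rejecting overlong forms.

    Deliberately flawed, to mimic the kind of decoder the attack relies on.
    """
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif b >> 5 == 0b110 and i + 1 < len(data):
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            raise ValueError("sequence not handled by this sketch")
    return "".join(out)


def naive_filter(raw: bytes) -> bool:
    """Hypothetical check: accept unless the raw bytes contain ../"""
    return b"../" not in raw


def strict_check(raw: bytes) -> bool:
    """Decode strictly first, then look for the traversal sequence."""
    try:
        text = raw.decode("utf-8")   # the strict decoder rejects overlong forms
    except UnicodeDecodeError:
        return False
    return "../" not in text


# 0x2E, 0xC0, 0xAE, 0x2F repeated four times, then the file name.
payload = b"\x2e\xc0\xae\x2f" * 4 + b"boot.ini"

print(naive_filter(payload))     # the byte-level check lets it through
print(lenient_decode(payload))   # the lenient decoder restores ../
print(strict_check(payload))     # strict decoding refuses the input
```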
References
• http://en.wikipedia.org/wiki/Mapping_of_Unicode_characters	

• http://decodeunicode.org	

• http://unicode.org/reports/tr36/	

• http://www.fileformat.info	

• http://blog.commtouch.com/cafe/email-security-news/using-unicode-to-trick-users-to-install-malware/

• https://dc414.org/wp-content/uploads/2011/01/righttoleften-override.pdf	

• http://norman.com/security_center/security_center_archive/2011/rtlo_unicode_hole/	

• http://en.wikipedia.org/wiki/Halfwidth_and_Fullwidth_Forms	

• http://www.unicode.org/charts/PDF/UFF00.pdf

Mais conteúdo relacionado

Mais procurados

Ascii and Unicode (Character Codes)
Ascii and Unicode (Character Codes)Ascii and Unicode (Character Codes)
Ascii and Unicode (Character Codes)Project Student
 
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6Andrei Zmievski
 
Data encryption and tokenization for international unicode
Data encryption and tokenization for international unicodeData encryption and tokenization for international unicode
Data encryption and tokenization for international unicodeUlf Mattsson
 
Jun 29 new privacy technologies for unicode and international data standards ...
Jun 29 new privacy technologies for unicode and international data standards ...Jun 29 new privacy technologies for unicode and international data standards ...
Jun 29 new privacy technologies for unicode and international data standards ...Ulf Mattsson
 
20180324 leveraging unix tools
20180324 leveraging unix tools20180324 leveraging unix tools
20180324 leveraging unix toolsDavid Horvath
 
File handling in vb.net
File handling in vb.netFile handling in vb.net
File handling in vb.netEverywhere
 
Xml For Dummies Chapter 6 Adding Character(S) To Xml
Xml For Dummies   Chapter 6 Adding Character(S) To XmlXml For Dummies   Chapter 6 Adding Character(S) To Xml
Xml For Dummies Chapter 6 Adding Character(S) To Xmlphanleson
 
Unicode, PHP, and Character Set Collisions
Unicode, PHP, and Character Set CollisionsUnicode, PHP, and Character Set Collisions
Unicode, PHP, and Character Set CollisionsRay Paseur
 
Understanding Character Encodings
Understanding Character EncodingsUnderstanding Character Encodings
Understanding Character EncodingsMobisoft Infotech
 
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...stepheneisenhauer
 
Anton Dorfman - Reversing data formats what data can reveal
Anton Dorfman - Reversing data formats what data can revealAnton Dorfman - Reversing data formats what data can reveal
Anton Dorfman - Reversing data formats what data can revealDefconRussia
 
Overview of character encoding
Overview of character encodingOverview of character encoding
Overview of character encodingDuy Lâm
 
Abap slide class4 unicode-plusfiles
Abap slide class4 unicode-plusfilesAbap slide class4 unicode-plusfiles
Abap slide class4 unicode-plusfilesMilind Patil
 

Mais procurados (20)

Character Sets
Character SetsCharacter Sets
Character Sets
 
Ascii and Unicode (Character Codes)
Ascii and Unicode (Character Codes)Ascii and Unicode (Character Codes)
Ascii and Unicode (Character Codes)
 
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
 
Data encryption and tokenization for international unicode
Data encryption and tokenization for international unicodeData encryption and tokenization for international unicode
Data encryption and tokenization for international unicode
 
Jun 29 new privacy technologies for unicode and international data standards ...
Jun 29 new privacy technologies for unicode and international data standards ...Jun 29 new privacy technologies for unicode and international data standards ...
Jun 29 new privacy technologies for unicode and international data standards ...
 
20180324 leveraging unix tools
20180324 leveraging unix tools20180324 leveraging unix tools
20180324 leveraging unix tools
 
Using unicode with php
Using unicode with phpUsing unicode with php
Using unicode with php
 
Ascii 03
Ascii 03Ascii 03
Ascii 03
 
File handling in vb.net
File handling in vb.netFile handling in vb.net
File handling in vb.net
 
Xml For Dummies Chapter 6 Adding Character(S) To Xml
Xml For Dummies   Chapter 6 Adding Character(S) To XmlXml For Dummies   Chapter 6 Adding Character(S) To Xml
Xml For Dummies Chapter 6 Adding Character(S) To Xml
 
Filehandling
FilehandlingFilehandling
Filehandling
 
Unicode, PHP, and Character Set Collisions
Unicode, PHP, and Character Set CollisionsUnicode, PHP, and Character Set Collisions
Unicode, PHP, and Character Set Collisions
 
Understanding Character Encodings
Understanding Character EncodingsUnderstanding Character Encodings
Understanding Character Encodings
 
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...
 
Anton Dorfman - Reversing data formats what data can reveal
Anton Dorfman - Reversing data formats what data can revealAnton Dorfman - Reversing data formats what data can reveal
Anton Dorfman - Reversing data formats what data can reveal
 
Php Unicode I18n
Php Unicode I18nPhp Unicode I18n
Php Unicode I18n
 
ASCII-EBCDIC-HEX
ASCII-EBCDIC-HEXASCII-EBCDIC-HEX
ASCII-EBCDIC-HEX
 
Overview of character encoding
Overview of character encodingOverview of character encoding
Overview of character encoding
 
Abap slide class4 unicode-plusfiles
Abap slide class4 unicode-plusfilesAbap slide class4 unicode-plusfiles
Abap slide class4 unicode-plusfiles
 
What character is that
What character is thatWhat character is that
What character is that
 

Destaque

Advanced JS Deobfuscation
Advanced JS DeobfuscationAdvanced JS Deobfuscation
Advanced JS DeobfuscationMinded Security
 
Secure Coding - Web Application Security Vulnerabilities and Best Practices
Secure Coding - Web Application Security Vulnerabilities and Best PracticesSecure Coding - Web Application Security Vulnerabilities and Best Practices
Secure Coding - Web Application Security Vulnerabilities and Best PracticesWebsecurify
 
CODE BLUE 2014 : Joy of a bug hunter by Masato Kinugawa
CODE BLUE 2014 : Joy of a bug hunter by Masato KinugawaCODE BLUE 2014 : Joy of a bug hunter by Masato Kinugawa
CODE BLUE 2014 : Joy of a bug hunter by Masato KinugawaCODE BLUE
 
NoSQL Injections in Node.js - The case of MongoDB
NoSQL Injections in Node.js - The case of MongoDBNoSQL Injections in Node.js - The case of MongoDB
NoSQL Injections in Node.js - The case of MongoDBSqreen
 
X-XSS-Nightmare: 1; mode=attack XSS Attacks Exploiting XSS Filter
X-XSS-Nightmare: 1; mode=attack XSS Attacks Exploiting XSS FilterX-XSS-Nightmare: 1; mode=attack XSS Attacks Exploiting XSS Filter
X-XSS-Nightmare: 1; mode=attack XSS Attacks Exploiting XSS FilterMasato Kinugawa
 
SecurityCamp2015「バグハンティング入門」
SecurityCamp2015「バグハンティング入門」SecurityCamp2015「バグハンティング入門」
SecurityCamp2015「バグハンティング入門」Masato Kinugawa
 

Destaque (7)

Advanced JS Deobfuscation
Advanced JS DeobfuscationAdvanced JS Deobfuscation
Advanced JS Deobfuscation
 
Secure Coding - Web Application Security Vulnerabilities and Best Practices
Secure Coding - Web Application Security Vulnerabilities and Best PracticesSecure Coding - Web Application Security Vulnerabilities and Best Practices
Secure Coding - Web Application Security Vulnerabilities and Best Practices
 
CODE BLUE 2014 : Joy of a bug hunter by Masato Kinugawa
CODE BLUE 2014 : Joy of a bug hunter by Masato KinugawaCODE BLUE 2014 : Joy of a bug hunter by Masato Kinugawa
CODE BLUE 2014 : Joy of a bug hunter by Masato Kinugawa
 
Bug-hunter's Sorrow
Bug-hunter's SorrowBug-hunter's Sorrow
Bug-hunter's Sorrow
 
NoSQL Injections in Node.js - The case of MongoDB
NoSQL Injections in Node.js - The case of MongoDBNoSQL Injections in Node.js - The case of MongoDB
NoSQL Injections in Node.js - The case of MongoDB
 
X-XSS-Nightmare: 1; mode=attack XSS Attacks Exploiting XSS Filter
X-XSS-Nightmare: 1; mode=attack XSS Attacks Exploiting XSS FilterX-XSS-Nightmare: 1; mode=attack XSS Attacks Exploiting XSS Filter
X-XSS-Nightmare: 1; mode=attack XSS Attacks Exploiting XSS Filter
 
SecurityCamp2015「バグハンティング入門」
SecurityCamp2015「バグハンティング入門」SecurityCamp2015「バグハンティング入門」
SecurityCamp2015「バグハンティング入門」
 

Semelhante a Unicode - Hacking The International Character System

Lecture_ASCII and Unicode.ppt
Lecture_ASCII and Unicode.pptLecture_ASCII and Unicode.ppt
Lecture_ASCII and Unicode.pptAlula Tafere
 
Applied physics iii lecture3 digital_codes
Applied physics iii lecture3 digital_codesApplied physics iii lecture3 digital_codes
Applied physics iii lecture3 digital_codesJaphet Munnah
 
expect("").length.toBe(1)
expect("").length.toBe(1)expect("").length.toBe(1)
expect("").length.toBe(1)Philip Hofstetter
 
C101 – Intro to Programming with C
C101 – Intro to Programming with CC101 – Intro to Programming with C
C101 – Intro to Programming with Cgpsoft_sk
 
Comprehasive Exam - IT
Comprehasive Exam - ITComprehasive Exam - IT
Comprehasive Exam - ITguest6ddfb98
 
Pipiot - the double-architecture shellcode constructor
Pipiot - the double-architecture shellcode constructorPipiot - the double-architecture shellcode constructor
Pipiot - the double-architecture shellcode constructorMoshe Zioni
 
COMPUTER INTRODUCTION
COMPUTER INTRODUCTIONCOMPUTER INTRODUCTION
COMPUTER INTRODUCTIONAmit Sharma
 
Character Encoding issue with PHP
Character Encoding issue with PHPCharacter Encoding issue with PHP
Character Encoding issue with PHPRavi Raj
 
U-SQL Reading & Writing Files (SQLBits 2016)
U-SQL Reading & Writing Files (SQLBits 2016)U-SQL Reading & Writing Files (SQLBits 2016)
U-SQL Reading & Writing Files (SQLBits 2016)Michael Rys
 
presentation_python_7_1569170870_375360.pptx
presentation_python_7_1569170870_375360.pptxpresentation_python_7_1569170870_375360.pptx
presentation_python_7_1569170870_375360.pptxansariparveen06
 
Unicode
UnicodeUnicode
UnicodeESUG
 
Character encoding and unicode format
Character encoding and unicode formatCharacter encoding and unicode format
Character encoding and unicode formatAdityaSharma1452
 
introduction for computers
introduction for computersintroduction for computers
introduction for computersYogesh Chaure
 
Intro computeRRR
Intro computeRRRIntro computeRRR
Intro computeRRRGHOTRAANGEL
 
Lesson4.2 u4 l1 binary squences
Lesson4.2 u4 l1 binary squencesLesson4.2 u4 l1 binary squences
Lesson4.2 u4 l1 binary squencesLexume1
 

Semelhante a Unicode - Hacking The International Character System (20)

Lecture_ASCII and Unicode.ppt
Lecture_ASCII and Unicode.pptLecture_ASCII and Unicode.ppt
Lecture_ASCII and Unicode.ppt
 
Using unicode with php
Using unicode with phpUsing unicode with php
Using unicode with php
 
Applied physics iii lecture3 digital_codes
Applied physics iii lecture3 digital_codesApplied physics iii lecture3 digital_codes
Applied physics iii lecture3 digital_codes
 
expect("").length.toBe(1)
expect("").length.toBe(1)expect("").length.toBe(1)
expect("").length.toBe(1)
 
C101 – Intro to Programming with C
C101 – Intro to Programming with CC101 – Intro to Programming with C
C101 – Intro to Programming with C
 
Comprehasive Exam - IT
Comprehasive Exam - ITComprehasive Exam - IT
Comprehasive Exam - IT
 
Pipiot - the double-architecture shellcode constructor
Pipiot - the double-architecture shellcode constructorPipiot - the double-architecture shellcode constructor
Pipiot - the double-architecture shellcode constructor
 
COMPUTER INTRODUCTION
COMPUTER INTRODUCTIONCOMPUTER INTRODUCTION
COMPUTER INTRODUCTION
 
C# basics...
C# basics...C# basics...
C# basics...
 
Unicode 101
Unicode 101Unicode 101
Unicode 101
 
Character Encoding issue with PHP
Character Encoding issue with PHPCharacter Encoding issue with PHP
Character Encoding issue with PHP
 
U-SQL Reading & Writing Files (SQLBits 2016)
U-SQL Reading & Writing Files (SQLBits 2016)U-SQL Reading & Writing Files (SQLBits 2016)
U-SQL Reading & Writing Files (SQLBits 2016)
 
presentation_python_7_1569170870_375360.pptx
presentation_python_7_1569170870_375360.pptxpresentation_python_7_1569170870_375360.pptx
presentation_python_7_1569170870_375360.pptx
 
Unicode
UnicodeUnicode
Unicode
 
Character encoding and unicode format
Character encoding and unicode formatCharacter encoding and unicode format
Character encoding and unicode format
 
introduction for computers
introduction for computersintroduction for computers
introduction for computers
 
Intro compute
Intro computeIntro compute
Intro compute
 
Intro computeRRR
Intro computeRRRIntro computeRRR
Intro computeRRR
 
Intro compute
Intro computeIntro compute
Intro compute
 
Lesson4.2 u4 l1 binary squences
Lesson4.2 u4 l1 binary squencesLesson4.2 u4 l1 binary squences
Lesson4.2 u4 l1 binary squences
 

Mais de Websecurify

Security Challenges in Node.js
Security Challenges in Node.jsSecurity Challenges in Node.js
Security Challenges in Node.jsWebsecurify
 
Next Generation of Web Application Security Tools
Next Generation of Web Application Security ToolsNext Generation of Web Application Security Tools
Next Generation of Web Application Security ToolsWebsecurify
 
Web Application Security 101 - 14 Data Validation
Web Application Security 101 - 14 Data ValidationWeb Application Security 101 - 14 Data Validation
Web Application Security 101 - 14 Data ValidationWebsecurify
 
Web Application Security 101 - 12 Logging
Web Application Security 101 - 12 LoggingWeb Application Security 101 - 12 Logging
Web Application Security 101 - 12 LoggingWebsecurify
 
Web Application Security 101 - 10 Server Tier
Web Application Security 101 - 10 Server TierWeb Application Security 101 - 10 Server Tier
Web Application Security 101 - 10 Server TierWebsecurify
 
Web Application Security 101 - 07 Session Management
Web Application Security 101 - 07 Session ManagementWeb Application Security 101 - 07 Session Management
Web Application Security 101 - 07 Session ManagementWebsecurify
 
Web Application Security 101 - 06 Authentication
Web Application Security 101 - 06 AuthenticationWeb Application Security 101 - 06 Authentication
Web Application Security 101 - 06 AuthenticationWebsecurify
 
Web Application Security 101 - 05 Enumeration
Web Application Security 101 - 05 EnumerationWeb Application Security 101 - 05 Enumeration
Web Application Security 101 - 05 EnumerationWebsecurify
 
Web Application Security 101 - 04 Testing Methodology
Web Application Security 101 - 04 Testing MethodologyWeb Application Security 101 - 04 Testing Methodology
Web Application Security 101 - 04 Testing MethodologyWebsecurify
 
Web Application Security 101 - 03 Web Security Toolkit
Web Application Security 101 - 03 Web Security ToolkitWeb Application Security 101 - 03 Web Security Toolkit
Web Application Security 101 - 03 Web Security ToolkitWebsecurify
 
Web Application Security 101 - 02 The Basics
Web Application Security 101 - 02 The BasicsWeb Application Security 101 - 02 The Basics
Web Application Security 101 - 02 The BasicsWebsecurify
 

Mais de Websecurify (11)

Security Challenges in Node.js
Security Challenges in Node.jsSecurity Challenges in Node.js
Security Challenges in Node.js
 
Next Generation of Web Application Security Tools
Next Generation of Web Application Security ToolsNext Generation of Web Application Security Tools
Next Generation of Web Application Security Tools
 
Web Application Security 101 - 14 Data Validation
Web Application Security 101 - 14 Data ValidationWeb Application Security 101 - 14 Data Validation
Web Application Security 101 - 14 Data Validation
 
Web Application Security 101 - 12 Logging
Web Application Security 101 - 12 LoggingWeb Application Security 101 - 12 Logging
Web Application Security 101 - 12 Logging
 
Web Application Security 101 - 10 Server Tier
Web Application Security 101 - 10 Server TierWeb Application Security 101 - 10 Server Tier
Web Application Security 101 - 10 Server Tier
 
Web Application Security 101 - 07 Session Management
Web Application Security 101 - 07 Session ManagementWeb Application Security 101 - 07 Session Management
Web Application Security 101 - 07 Session Management
 
Web Application Security 101 - 06 Authentication
Web Application Security 101 - 06 AuthenticationWeb Application Security 101 - 06 Authentication
Web Application Security 101 - 06 Authentication
 
Web Application Security 101 - 05 Enumeration
Web Application Security 101 - 05 EnumerationWeb Application Security 101 - 05 Enumeration
Web Application Security 101 - 05 Enumeration
 
Web Application Security 101 - 04 Testing Methodology
Web Application Security 101 - 04 Testing MethodologyWeb Application Security 101 - 04 Testing Methodology
Web Application Security 101 - 04 Testing Methodology
 
Web Application Security 101 - 03 Web Security Toolkit
Web Application Security 101 - 03 Web Security ToolkitWeb Application Security 101 - 03 Web Security Toolkit
Web Application Security 101 - 03 Web Security Toolkit
 
Web Application Security 101 - 02 The Basics
Web Application Security 101 - 02 The BasicsWeb Application Security 101 - 02 The Basics
Web Application Security 101 - 02 The Basics
 

Último

How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceanilsa9823
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 

Último (20)

Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...

Unicode - Hacking The International Character System

  • 2. Introduction • Unicode is a standard for representing text in most of the world’s writing systems • The most recent version at the time of writing is Unicode 6.0 • Widely adopted by most programming platforms, operating systems and the web • The most widely used Unicode encodings are UTF-8 and UTF-16
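  • 3. Introduction to UTF-8 • UTF-8 (UCS Transformation Format - 8 bit) • Backwards compatible with ASCII • Simple ASCII chars are represented by a single byte • Other characters take 2 to 4 bytes in standard UTF-8 (RFC 3629), which covers code points up to U+10FFFF in 21 bits • The original design allowed sequences of up to 6 bytes carrying 31 bits; the 5- and 6-byte forms are no longer valid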
  • 4. UTF-8 Encoding Table
      Bits  Byte 1    Byte 2    Byte 3    Byte 4    Byte 5    Byte 6
      7     0XXXXXXX
      11    110XXXXX  10XXXXXX
      16    1110XXXX  10XXXXXX  10XXXXXX
      21    11110XXX  10XXXXXX  10XXXXXX  10XXXXXX
      26    111110XX  10XXXXXX  10XXXXXX  10XXXXXX  10XXXXXX
      31    1111110X  10XXXXXX  10XXXXXX  10XXXXXX  10XXXXXX  10XXXXXX
  • 5. UTF-8 Encoding Rules • Every ASCII character is also a valid UTF-8 character (7 bits, 128 characters) • For every other UTF-8 byte sequence, the number of leading 1 bits in the first byte gives the length of the sequence in bytes • The remaining bytes of the sequence have 10 as their two most significant bits • This makes it easy to find where a byte sequence starts and ends • There are more rules but this is a good start...
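The rules above can be sketched in a few lines of Python. This is a minimal illustration, not a production encoder: the function names `utf8_encode` and `sequence_length` are made up for this example, and the sketch only covers the 1 to 4 byte forms that standard UTF-8 allows today.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one code point using the lead-byte / continuation-byte layout."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogates are not encodable in UTF-8")
    if code_point < 0x80:        # 7 bits:  0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:       # 11 bits: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:     # 16 bits: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 21 bits: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


def sequence_length(lead_byte: int) -> int:
    """Read the sequence length from the high bits of the first byte."""
    if (lead_byte & 0x80) == 0x00:
        return 1
    if (lead_byte & 0xE0) == 0xC0:
        return 2
    if (lead_byte & 0xF0) == 0xE0:
        return 3
    if (lead_byte & 0xF8) == 0xF0:
        return 4
    raise ValueError("continuation byte or invalid lead byte")


if __name__ == "__main__":
    encoded = utf8_encode(0x200E)  # LEFT-TO-RIGHT MARK
    print(encoded.hex(" "), sequence_length(encoded[0]))
```

The result can be compared against Python's built-in encoder, for example `chr(0x200E).encode("utf-8")`.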
  • 6. Interesting UTF-8 Characters • Unicode also provides a lot of function characters; their UTF-8 encodings are • Byte Order Mark (BOM) - 0xEF, 0xBB, 0xBF may be placed at the start of the document to indicate UTF-8 • Left to Right Mark (LRM) - 0xE2, 0x80, 0x8E is placed to indicate text orientation • In HTML - &#x200E; &#8206; or &lrm; • Right to Left Mark (RLM) - 0xE2, 0x80, 0x8F is placed to indicate text orientation • In HTML - &#x200F; &#8207; or &rlm; • Left to Right Embedding (LRE) - 0xE2, 0x80, 0xAA • In HTML - &#x202A; • Right to Left Embedding (RLE) - 0xE2, 0x80, 0xAB • In HTML - &#x202B; • There are more...
  • 7. Clarifications • How exactly does the hex sequence 0xE2, 0x80, 0x8E map to &#x200E; in HTML? • 0xE2, 0x80, 0x8E is the UTF-8 encoding of the code point U+200E • U+200E is 0x20, 0x0E in UTF-16 (big-endian) • also known as 0x0000200E in UTF-32 • The HTML reference &#x200E; names the code point itself, independent of the encoding • There is no magic! You simply need to know which encoding system you are working with and find out what characters it supports. • http://www.decodeunicode.org - is a good reference
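The mapping can be checked with Python's standard library. This is a small sketch; the helper name `encodings_of` is made up for this example, and big-endian byte order is assumed for UTF-16 and UTF-32 so that the bytes match the slide.

```python
import html


def encodings_of(ch: str) -> dict:
    """Return the hex bytes of one character in three Unicode encodings."""
    return {
        "utf-8": ch.encode("utf-8").hex(" "),
        "utf-16-be": ch.encode("utf-16-be").hex(" "),
        "utf-32-be": ch.encode("utf-32-be").hex(" "),
    }


# The HTML reference resolves to the code point U+200E, whatever the encoding.
lrm = html.unescape("&#x200E;")

if __name__ == "__main__":
    for name, hex_bytes in encodings_of(lrm).items():
        print(f"{name:10} {hex_bytes}")
```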
  • 8. Multiple Representations • A decoder that does not enforce the shortest-form rule will accept the same character in multiple representations • For example • . (DOT) is represented as 0x2E • A lenient decoder also reads 0xC0, 0xAE as a dot • And 0xE0, 0x80, 0xAE • And 0xF0, 0x80, 0x80, 0xAE • And 0xF8, 0x80, 0x80, 0x80, 0xAE • And 0xFC, 0x80, 0x80, 0x80, 0x80, 0xAE • These longer forms are called overlong encodings; the standard forbids them and a conforming decoder must reject them
  • 9. Translating the . (DOT)
      HEX                Byte 1    Byte 2    Byte 3    Byte 4    Byte 5    Byte 6
      2E                 00101110
      C0 AE              11000000  10101110
      E0 80 AE           11100000  10000000  10101110
      F0 80 80 AE        11110000  10000000  10000000  10101110
      F8 80 80 80 AE     11111000  10000000  10000000  10000000  10101110
      FC 80 80 80 80 AE  11111100  10000000  10000000  10000000  10000000  10101110
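The table can be reproduced with a deliberately lenient decoder. The sketch below is hypothetical code written for this example (`naive_utf8_decode` and `strict_rejects` are not library functions): it follows the lead-byte lengths but skips the shortest-form check, which is exactly the mistake that makes overlong forms dangerous. Python's own strict decoder is used for comparison.

```python
def naive_utf8_decode(data: bytes) -> str:
    """Lenient decoder: honours lead-byte lengths, skips the shortest-form check."""
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            length, value = 1, lead
        elif (lead & 0xE0) == 0xC0:
            length, value = 2, lead & 0x1F
        elif (lead & 0xF0) == 0xE0:
            length, value = 3, lead & 0x0F
        elif (lead & 0xF8) == 0xF0:
            length, value = 4, lead & 0x07
        elif (lead & 0xFC) == 0xF8:
            length, value = 5, lead & 0x03
        elif (lead & 0xFE) == 0xFC:
            length, value = 6, lead & 0x01
        else:
            raise ValueError("invalid lead byte")
        tail = data[i + 1:i + length]
        if len(tail) != length - 1 or any((b & 0xC0) != 0x80 for b in tail):
            raise ValueError("truncated or malformed sequence")
        for b in tail:
            value = (value << 6) | (b & 0x3F)  # append 6 payload bits
        out.append(chr(value))
        i += length
    return "".join(out)


def strict_rejects(data: bytes) -> bool:
    """True when Python's standard UTF-8 decoder refuses the input."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


DOT_FORMS = [bytes.fromhex(h) for h in (
    "2e", "c0ae", "e080ae", "f08080ae", "f8808080ae", "fc80808080ae")]

if __name__ == "__main__":
    for form in DOT_FORMS:
        print(form.hex(" "), repr(naive_utf8_decode(form)), strict_rejects(form))
```

Only the single byte 0x2E passes the strict decoder; every longer form is rejected.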
  • 10. Half and Full Width Forms • Graphic characters are traditionally classed as halfwidth and fullwidth characters • In a fixed width font a halfwidth character takes half the width of a fullwidth character • Unicode contains characters that exist in both halfwidth and fullwidth forms • http://www.unicode.org/charts/PDF/UFF00.pdf - for more information
  • 11. Fullwidth Latin Characters • Halfwidth and Fullwidth notations make sense when used for characters such as those found in the Japanese and Chinese character sets • The specifications also define Latin characters in fullwidth forms • As a result the following mappings are possible • A - 0x41 (halfwidth) = Ａ - 0xEF, 0xBC, 0xA1 (fullwidth) • B - 0x42 (halfwidth) = Ｂ - 0xEF, 0xBC, 0xA2 (fullwidth) • etc.
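One way to collapse fullwidth Latin letters onto their ordinary forms before running a security check is Unicode compatibility normalization (NFKC). This is a sketch under the assumption that the other NFKC foldings, such as ligatures and superscripts, are acceptable for the input at hand; the helper name `fold_width` is made up for this example.

```python
import unicodedata


def fold_width(text: str) -> str:
    """Map compatibility forms, including fullwidth Latin, to their base characters."""
    return unicodedata.normalize("NFKC", text)


FULLWIDTH_A = "\uff21"  # FULLWIDTH LATIN CAPITAL LETTER A
FULLWIDTH_B = "\uff22"  # FULLWIDTH LATIN CAPITAL LETTER B

if __name__ == "__main__":
    print(FULLWIDTH_A.encode("utf-8").hex(" "))
    print(fold_width(FULLWIDTH_A + FULLWIDTH_B))
```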
  • 12. Security Considerations • Visual Security Issues • Internationalized names • Left to Right and Right to Left representations • Charset Translation Issues • Occurs when strings are normalized before and after translation between character sets • Characters in multiple representation • The same character can be represented in multiple ways
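  • 13. Case Study: Windows Filename Mangling • [RTLO] is the right-to-left override character (U+202E) and [LTRO] is the left-to-right override character (U+202D) • Consider the following files • [RTLO]cod.stnemucodtnatropmi.exe • [RTLO]cod.yrammusevituc[LTRO]n1c[LTRO].exe • [RTLO]gpj.!nuf_stohsnee[LTRO]n1c[LTRO].scr • When displayed, these names look different • exe.importantdocuments.doc • n1c.executivesummary.doc • n1c.screenshots_fun!.jpg • The real extension is still .exe or .scr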
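A filename can be screened for directional control characters before it is shown to a user. The sketch below is one possible check, not a complete defence: the set `BIDI_CONTROLS` and the function `find_bidi_controls` are names chosen for this example, and the set covers the marks, embeddings, overrides and isolates.

```python
import unicodedata

# U+200E/U+200F marks, U+202A..U+202E embeddings and overrides,
# U+2066..U+2069 isolates.
BIDI_CONTROLS = {chr(cp) for cp in (
    0x200E, 0x200F, *range(0x202A, 0x202F), *range(0x2066, 0x206A))}


def find_bidi_controls(name: str) -> list:
    """Return (index, character name) for every directional control in name."""
    return [(i, unicodedata.name(ch))
            for i, ch in enumerate(name) if ch in BIDI_CONTROLS]


if __name__ == "__main__":
    filename = "\u202ecod.stnemucodtnatropmi.exe"
    print(find_bidi_controls(filename))
    print("real extension:", filename.rsplit(".", 1)[-1])
```

Rejecting or escaping such names is a design choice; some legitimate right-to-left filenames may contain these characters.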
  • 14. Case Study:The PAYPAL Scam • What is the difference between paypal.com and paypai.com or between intel.com and lntel.com? • How about citybank.com? • 0000000: d181 6974 7962 616e 6b2e 636f 6d ..itybank.com • 0xd1, 0x81 is the Cyrillic letter c which looks like the latin letter c although they are very different
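A rough way to spot such lookalike names is to check whether the letters of a label come from more than one script. Python's standard library has no script property, so this sketch uses the first word of each character's Unicode name as a heuristic; that shortcut, and the function name `scripts_in`, are assumptions made for this example.

```python
import unicodedata


def scripts_in(label: str) -> set:
    """Heuristic: collect the first word of each letter's Unicode name."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


SPOOFED = "\u0441itybank.com"  # first letter is CYRILLIC SMALL LETTER ES

if __name__ == "__main__":
    print(SPOOFED.encode("utf-8").hex(" "))
    print(scripts_in(SPOOFED))
    print(scripts_in("citybank.com"))
```

A label that mixes scripts is not always malicious, so a result like this is better treated as a warning than as proof.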
  • 15. Case Study: Directory Traversal • Let’s say an application shows images by requesting /getimage.jsp?name=image.jpg • The attacker tries to retrieve an arbitrary file by requesting /getimage.jsp?name=../../../../boot.ini • Unfortunately the attack fails because the application checks for the presence of the ../ character sequence • ../ is 0x2E, 0x2E, 0x2F in hex • ../ is also 0x2E, 0xC0, 0xAE, 0x2F in overlong UTF-8 • Since 0x2E, 0xC0, 0xAE, 0x2F is not equal to 0x2E, 0x2E, 0x2F the security check is bypassed; if the application then decodes the overlong sequence leniently, the path becomes ../ again and the file content is retrieved
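The bypass depends on the order of operations: the check runs on raw bytes and a lenient decoder runs afterwards. The sketch below models that with hypothetical functions written for this example (`lenient_decode`, `vulnerable_is_allowed`, `safe_is_allowed`); the lenient decoder only handles 1- and 2-byte sequences, which is enough to show the effect. The safer variant decodes strictly first and checks the decoded text.

```python
def lenient_decode(raw: bytes) -> str:
    """Hypothetical buggy decoder: accepts overlong 2-byte sequences."""
    out, i = [], 0
    while i < len(raw):
        b = raw[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif ((b & 0xE0) == 0xC0 and i + 1 < len(raw)
              and (raw[i + 1] & 0xC0) == 0x80):
            out.append(chr(((b & 0x1F) << 6) | (raw[i + 1] & 0x3F)))
            i += 2
        else:
            raise ValueError("byte not handled in this sketch")
    return "".join(out)


def vulnerable_is_allowed(raw: bytes) -> bool:
    """Flawed check: looks for ../ in the bytes before any decoding."""
    return b"../" not in raw


def safe_is_allowed(raw: bytes) -> bool:
    """Decode strictly first, then check the decoded text."""
    try:
        name = raw.decode("utf-8")  # rejects overlong forms
    except UnicodeDecodeError:
        return False
    return "../" not in name and "..\\" not in name


PAYLOAD = b".\xc0\xae/.\xc0\xae/boot.ini"

if __name__ == "__main__":
    print(vulnerable_is_allowed(PAYLOAD), lenient_decode(PAYLOAD))
    print(safe_is_allowed(PAYLOAD))
```

Checking a substring is only part of a real defence; resolving the final path and confirming it stays inside the image directory is a common additional step.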
  • 16. References • http://en.wikipedia.org/wiki/Mapping_of_Unicode_characters • http://decodeunicode.org • http://unicode.org/reports/tr36/ • http://www.fileformat.info • http://blog.commtouch.com/cafe/email-security-news/using-unicode-to-trick-users-to-install-malware/ • https://dc414.org/wp-content/uploads/2011/01/righttoleften-override.pdf • http://norman.com/security_center/security_center_archive/2011/rtlo_unicode_hole/ • http://en.wikipedia.org/wiki/Halfwidth_and_Fullwidth_Forms • http://www.unicode.org/charts/PDF/UFF00.pdf