BSDCONV
Kuan-Chung Chiu
(buganini at gmail dot com)
Contents
1 Syntax
1.1 Phases & Cascade
1.2 Codecs & Fallback
1.3 Codec argument
2 Type & Flag
2.1 Type
2.2 Flag
2.3 Helper codecs
3 C Programming guide
3.1 Conversion instance lifecycle
3.2 Skeleton
3.3 Output mode
3.3.1 BSDCONV_HOLD
3.3.2 BSDCONV_AUTOMALLOC
3.3.3 BSDCONV_PREMALLOCED
3.3.4 BSDCONV_FILE
3.3.5 BSDCONV_FD
3.3.6 BSDCONV_NULL
3.3.7 BSDCONV_PASS
3.4 Counters
3.5 Memory pool issue
1 Syntax
1.1 Phases & Cascade
bsdconv defines three types of conversion phase: from, inter, and to. The from
phase takes a byte sequence and decodes it into a list of code points (except
for from/PASS). Conversely, the to phase encodes the list of code points back
into a byte sequence. The inter phase maps code points to code points.
A basic conversion consists of a from phase and a to phase. Codec names are
matched case-insensitively.
ISO-8859-1 : UTF-8
from to
Figure 1: Basic two phases conversion
Between from and to phases, we can have an inter phase.
UTF-8 : UPPER : UTF-8
from inter to
Figure 2: Conversion with inter-mapping phase
There can be more than one inter phase.
UTF-8 : UPPER : FULL : UTF-8
from inter inter to
Figure 3: Conversion with multiple inter-mapping phases
An inter phase can also be used on its own, mostly in programmatic use.
HALF
inter
Figure 4: Standalone inter-mapping phase
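The role assignment shown in Figures 1 to 4 can be sketched in a few lines of C. This is only an illustration of the rule as read off those figures, not bsdconv's parser: count_phases and phase_role are hypothetical helpers, and the input is assumed to be a single, non-cascaded conversion string.

```c
#include <stddef.h>

/* Hypothetical helper: count the phases of a single (non-cascaded)
 * conversion string by counting its ':' separators. */
size_t count_phases(const char *conv)
{
    size_t n = 1;
    for (; *conv != '\0'; conv++)
        if (*conv == ':')
            n++;
    return n;
}

/* Hypothetical helper: role of the phase at position index (0-based)
 * among count phases, following Figures 1 to 4: a lone phase is inter,
 * otherwise the first is from, the last is to, the rest are inter. */
const char *phase_role(size_t index, size_t count)
{
    if (count == 1)
        return "inter";
    if (index == 0)
        return "from";
    if (index == count - 1)
        return "to";
    return "inter";
}
```

For the string of Figure 3, count_phases reports four phases, and phase_role labels positions 1 and 2 as inter.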
Conversions can be cascaded with the pipe symbol (|). In most cases this is
equivalent to a shell pipe, except when codecs that manipulate flags are
involved (described in section 2.2).
UTF-8 : BIG5 | BIG5 : UTF-8
from to from to
Figure 5: Cascaded conversions
ASCII-compatible codecs are designed to exclude the ASCII part and are named
_FOO, with the alias FOO ⇒ _FOO,ASCII or ASCII,_FOO.
1.2 Codecs & Fallback
A phase consists of one or more codecs, separated by commas. A later codec is
used if and only if the codecs before it fail to consume the incoming data.
Once a codec finishes its task, matching starts again from the first codec for
the upcoming data.
UTF-8 : ASCII , 3F
from to
Figure 6: Fallback codec
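The fallback rule can be illustrated without the library. The following is a minimal sketch, not bsdconv's implementation: codec_fn, to_ascii, to_3f, run_phase, and demo_fallback are hypothetical names, and the catch-all codec is assumed to emit the byte 0x3F ('?') for any code point.

```c
#include <stddef.h>
#include <stdint.h>

/* A codec tries to consume one code point: 1 on success, 0 on failure. */
typedef int (*codec_fn)(uint32_t cp, unsigned char *out);

/* Stand-in for an ASCII encoder: accepts only U+0000..U+007F. */
static int to_ascii(uint32_t cp, unsigned char *out)
{
    if (cp > 0x7F)
        return 0;
    *out = (unsigned char)cp;
    return 1;
}

/* Stand-in for a catch-all fallback (assumption: always emits 0x3F). */
static int to_3f(uint32_t cp, unsigned char *out)
{
    (void)cp;
    *out = 0x3F;
    return 1;
}

/* Run one phase. For every code point the search restarts at the first
 * codec; a later codec is tried only if the earlier ones fail.
 * A code point that no codec accepts is silently dropped in this sketch. */
size_t run_phase(const codec_fn *codecs, size_t ncodecs,
                 const uint32_t *in, size_t n, unsigned char *out)
{
    size_t written = 0;
    for (size_t i = 0; i < n; i++) {
        for (size_t c = 0; c < ncodecs; c++) {
            if (codecs[c](in[i], &out[written])) {
                written++;
                break;
            }
        }
    }
    return written;
}

/* Encode 'A', U+2200, 'B' with a two-codec phase (ASCII, then fallback).
 * out must have room for at least 3 bytes. */
size_t demo_fallback(unsigned char *out)
{
    static const codec_fn phase[] = { to_ascii, to_3f };
    static const uint32_t input[] = { 0x41, 0x2200, 0x42 };
    return run_phase(phase, 2, input, 3, out);
}
```

In demo_fallback, only U+2200 reaches the fallback; 'B' is handled by the first codec again because the search restarts for each code point.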
1.3 Codec argument
Some codecs take an argument, given after a hash symbol (#).
UTF-8 : ASCII , ANY#3F
Figure 7: Passing argument to codec
Some codecs take arguments in key-value form. Argument names and values
consist of digits, letters, hyphens, and underscores; binary data is
represented in hexadecimal form.
UTF-8 : ASCII , ESCAPE#PREFIX=2575
Figure 8: Passing argument to codec in key-value form
Multiple arguments are joined with an ampersand (&).
UTF-8 : ASCII , ESCAPE#PREFIX=262378&SUFFIX=3B
Figure 9: Passing multiple arguments to codec
A list of data items can be passed in dot-separated form.
ANY#013F.0121 : ASCII
Figure 10: Data list
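To make the argument syntax concrete, here is a small sketch that pulls one key-value argument out of a codec string and decodes its hexadecimal value. It is not bsdconv's parser: hex_digit, hex_decode, and codec_arg are hypothetical helpers, keys are compared case-sensitively for simplicity, and dot-separated lists are not handled.

```c
#include <stddef.h>
#include <string.h>

/* Value of one hexadecimal digit, or -1 if c is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decode a hex string such as "262378" into bytes.
 * Returns the number of bytes written, or -1 on malformed input. */
int hex_decode(const char *hex, unsigned char *out, size_t cap)
{
    size_t len = strlen(hex);
    if (len % 2 != 0 || len / 2 > cap)
        return -1;
    for (size_t i = 0; i < len; i += 2) {
        int hi = hex_digit(hex[i]);
        int lo = hex_digit(hex[i + 1]);
        if (hi < 0 || lo < 0)
            return -1;
        out[i / 2] = (unsigned char)((hi << 4) | lo);
    }
    return (int)(len / 2);
}

/* Copy the value of argument key from a codec string of the form
 * NAME#KEY=VALUE&KEY=VALUE into val. Returns 1 if found, else 0. */
int codec_arg(const char *spec, const char *key, char *val, size_t cap)
{
    const char *p = strchr(spec, '#');
    size_t klen = strlen(key);
    if (p == NULL)
        return 0;
    p++;                                   /* skip the hash symbol */
    while (*p != '\0') {
        size_t seg = strcspn(p, "&");      /* length of this argument */
        if (seg > klen && strncmp(p, key, klen) == 0 && p[klen] == '=') {
            size_t vlen = seg - klen - 1;
            if (vlen >= cap)
                return 0;
            memcpy(val, p + klen + 1, vlen);
            val[vlen] = '\0';
            return 1;
        }
        p += seg;
        if (*p == '&')
            p++;
    }
    return 0;
}
```

Applied to the codec string of Figure 9, the PREFIX value 262378 decodes to the three bytes "&#x" and the SUFFIX value 3B to ";".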
2 Type & Flag
2.1 Type
A code point packet records its type in its first byte.
ID  Description                 Provider (from)   Consumer (to)
00  Bsdconv special characters  BSDCONV-KEYWORD   BSDCONV-KEYWORD
01  Unicode                     Most decoders     Most encoders
02  CNS11643 [1]                CNS11643          CNS11643
03  Byte                        BYTE; ESCAPE      BYTE; ESCAPE#FOR=BYTE
04  Chinese components          inter/ZH-DECOMP   inter/ZH-COMP
1B  ANSI control sequence       ANSI-CONTROL      -
Table 1: Types and their providers/consumers (just to name a few)
Entity  Unicode  UTF-8 hex
%       U+0025   25
A       U+0041   41
∀       U+2200   E28880

Input (UTF-8 literal):         A∀
Decoder (ASCII,BYTE : ...):    internal data [01 41] [03 E2] [03 88] [03 80]
Encoder (... : ASCII,ESCAPE):  41 ("A"), 25 45 32 ("%E2"), 25 38 38 ("%88"), 25 38 30 ("%80")
Output (UTF-8 literal):        A%E2%88%80
Figure 11: Fallback & Type
[1] For the intersection of CNS11643 and Unicode, from/CNS11643 converts to the
Unicode type if possible. Conversely, to/CNS11643 converts from the Unicode
type if possible.
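The flow of Figure 11 can be reproduced with a small stand-alone model. This is a simplified sketch, not bsdconv's internal representation: struct packet, decode_ascii_byte, and encode_ascii_escape are hypothetical, every packet carries a single-byte value, and only type IDs 01 and 03 from Table 1 are modelled.

```c
#include <stddef.h>
#include <stdio.h>

enum { TYPE_UNICODE = 0x01, TYPE_BYTE = 0x03 };   /* IDs from Table 1 */

/* Simplified packet: a type byte followed by a one-byte value. */
struct packet {
    unsigned char type;
    unsigned char value;
};

/* Model of a from phase "ASCII,BYTE": ASCII bytes become Unicode packets,
 * everything else falls back to Byte packets. out must hold n packets. */
size_t decode_ascii_byte(const unsigned char *in, size_t n, struct packet *out)
{
    for (size_t i = 0; i < n; i++) {
        out[i].type = in[i] < 0x80 ? TYPE_UNICODE : TYPE_BYTE;
        out[i].value = in[i];
    }
    return n;
}

/* Model of a to phase "ASCII,ESCAPE": Unicode packets in the ASCII range
 * are written as-is, Byte packets are written as %XX.
 * Returns the length of the NUL-terminated result. */
size_t encode_ascii_escape(const struct packet *in, size_t n,
                           char *out, size_t cap)
{
    size_t w = 0;
    if (cap == 0)
        return 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i].type == TYPE_UNICODE && in[i].value < 0x80) {
            if (w + 1 >= cap)
                break;                    /* no room for char + NUL */
            out[w++] = (char)in[i].value;
        } else if (in[i].type == TYPE_BYTE) {
            if (w + 3 >= cap)
                break;                    /* no room for %XX + NUL */
            snprintf(out + w, 4, "%%%02X", (unsigned)in[i].value);
            w += 3;
        }
    }
    out[w] = '\0';
    return w;
}
```

Feeding the four input bytes 41 E2 88 80 through both functions yields one Unicode packet followed by three Byte packets, and then the text A%E2%88%80, matching the figure.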
2.2 Flag
A code point packet carries its own flags. Currently there are two flags, FREE
and MARK. The FREE flag indicates that the packet buffer needs to be recycled
or released; it matters only when programming is involved. The MARK flag is
(currently only) added by the codec to/PASS#MARK and used by the codec
from/PASS#UNMARK to identify which packets have already been decoded and need
to be passed through in the from phase.
The code point packet structure, including flags, is retained within cascaded
conversions, but not across a shell pipe. Figure 12 demonstrates the flow of
the conversion "ESCAPE:PASS#MARK&FOR=1,BYTE|PASS#UNMARK,UTF-8:UTF-8".
Entity  Unicode  UTF-8 hex
α       U+03B1   CEB1
β       U+03B2   CEB2

Input (UTF-8 literal):                  %u03B1%CE%B2
Decoder (ESCAPE : ...):                 internal data [01 03 B1] [03 CE] [03 B2]
Encoder (... : PASS#MARK&FOR=1,BYTE):   internal data [01 03 B1] (flagged MARK), CE B2
Decoder (PASS#UNMARK,UTF-8 : ...):      internal data [01 03 B1] [01 03 B2]
Encoder (... : UTF-8):                  CE B1 ("α"), CE B2 ("β")
Output (UTF-8 literal):                 αβ
Figure 12: Flag, from/PASS & to/PASS
2.3 Helper codecs
The codec from/bsdconv can be used to input the internal data structure, and
the codec to/BSDCONV-OUTPUT can be used to inspect types and flags.
3 C Programming guide
3.1 Conversion instance lifecycle
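1. bsdconv_create()
2. bsdconv_init()
3. Set input/output parameters.
4. Is this the last chunk? If yes, set the flush flag.
5. bsdconv()
6. Collect output.
7. Is there a next chunk? If yes, go back to step 3 with the next chunk.
8. Reuse the instance? If yes, go back to step 2; if no, bsdconv_destroy().
Figure 13: Conversion instance lifecycle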
3.2 Skeleton
#include <stdio.h>      /* BUFSIZ */
#include <bsdconv.h>

int main(void)
{
    struct bsdconv_instance *ins;
    char *buf;
    size_t len = 0;     /* placeholder until the buffer is filled */

    ins = bsdconv_create("UTF-8:UPSIDEDOWN:UTF-8");
    bsdconv_init(ins);
    do {
        /* allocate a fresh input buffer for every chunk */
        buf = bsdconv_malloc(BUFSIZ);
        /*
         * fill data into buf
         * len = filled data length
         */
        ins->input.data = buf;
        ins->input.len = len;
        ins->input.flags |= F_FREE;     /* buffer is released by bsdconv */
        ins->input.next = NULL;
        if (ins->input.len == 0) {
            /* last chunk */
            ins->flush = 1;
        }
        /*
         * set output parameter (see section 3.3)
         */
        bsdconv(ins);
        /*
         * collect output (see section 3.3)
         */
    } while (ins->flush == 0);
    bsdconv_destroy(ins);
    return 0;
}
For chunked conversion, allocate a new input buffer for each chunk so that its
content cannot change during conversion. An output buffer with the FREE flag
is safe to reuse.
3.3 Output mode
ins->output_mode      Description
BSDCONV_HOLD          Hold output in memory
BSDCONV_AUTOMALLOC    Return an output buffer, which should be free()d after use
BSDCONV_PREMALLOCED   Fill output into the given buffer
BSDCONV_FILE          Write output to a (FILE *) stream
BSDCONV_FD            Write output to an (int) file descriptor
BSDCONV_NULL          Discard output
BSDCONV_PASS          Pass output to another conversion instance
3.3.1 BSDCONV_HOLD
This is the default output mode after bsdconv_init(). It is usually used
together with BSDCONV_AUTOMALLOC or BSDCONV_PREMALLOCED to get the squeezed
output.
3.3.2 BSDCONV_AUTOMALLOC
The output buffer is allocated dynamically. Its actual size is
ins->output.len plus the length of the output content, which is useful when
you need room for a terminating null byte.
3.3.3 BSDCONV_PREMALLOCED
If ins->output.data is NULL, the total length of the content to be output is
stored in ins->output.len, but the output is still held in memory. Otherwise,
bsdconv() fills as much unfragmented data as possible into the buffer, within
the size limit specified in ins->output.len.
3.3.4 BSDCONV_FILE
Output is written with fwrite() to the FILE * given in ins->output.data.
3.3.5 BSDCONV_FD
Output is written with write() to the (int) file descriptor given in
ins->output.data. A cast through intptr_t (defined in <stdint.h>) is needed to
avoid a compiler warning.
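The cast mentioned above can be sketched as a pair of helpers. fd_to_ptr and ptr_to_fd are hypothetical names introduced here for illustration; only the intptr_t round trip itself is standard C.

```c
#include <stdint.h>

/* Store an int file descriptor in a pointer-sized slot.
 * Going through intptr_t avoids the int-to-pointer size warning. */
void *fd_to_ptr(int fd)
{
    return (void *)(intptr_t)fd;
}

/* Recover the file descriptor from the pointer-sized slot. */
int ptr_to_fd(void *p)
{
    return (int)(intptr_t)p;
}
```

In a program using this output mode, the value returned by fd_to_ptr(fd) is what would be stored in ins->output.data.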
3.3.6 BSDCONV_NULL
Output is discarded. This is usually used when evaluating a conversion (see
section 3.4).
3.3.7 BSDCONV_PASS
Output packets are passed to the (struct bsdconv_instance *) conversion
instance given in ins->output.data.
3.4 Counters
Counters are kept in ins->counter as a linked list with the following structure.
struct bsdconv_counter_entry {
char *key;
bsdconv_counter_t val;
struct bsdconv_counter_entry *next;
};
IERR and OERR are mandatory error counters.
There are two APIs to get or reset counters:
bsdconv_counter_t *bsdconv_counter(char *name);
Returns a pointer to the counter value. bsdconv_counter_t is currently defined
as size_t.
void bsdconv_counter_reset(char *name);
Resets the specified counter; if name is NULL, all counters are reset.
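For readers who want to see how such a list behaves, the following stand-alone sketch walks a counter list by hand. It does not include bsdconv.h: the struct is redeclared locally from the text above, bsdconv_counter_t is assumed to be size_t as stated, and find_counter and reset_counters are hypothetical helpers that only mimic the described behaviour of the two APIs.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-in; the text says the type is currently size_t. */
typedef size_t bsdconv_counter_t;

/* Redeclared from section 3.4 so that the sketch is self-contained. */
struct bsdconv_counter_entry {
    char *key;
    bsdconv_counter_t val;
    struct bsdconv_counter_entry *next;
};

/* Hypothetical helper: pointer to the value of the counter called key,
 * or NULL if the list has no such counter. */
bsdconv_counter_t *find_counter(struct bsdconv_counter_entry *head,
                                const char *key)
{
    for (struct bsdconv_counter_entry *e = head; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return &e->val;
    return NULL;
}

/* Hypothetical helper: reset one counter, or all of them if key is NULL. */
void reset_counters(struct bsdconv_counter_entry *head, const char *key)
{
    for (struct bsdconv_counter_entry *e = head; e != NULL; e = e->next)
        if (key == NULL || strcmp(e->key, key) == 0)
            e->val = 0;
}
```

Because find_counter returns a pointer rather than a copy, the caller can read the value after each conversion without looking the counter up again.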
3.5 Memory pool issue
In case libbsdconv and your program use different memory pools, bsdconv_malloc()
and bsdconv_free() should be used in place of malloc() and free().
9

Mais conteúdo relacionado

Mais procurados

N_Asm Assembly arithmetic instructions (sol)
N_Asm Assembly arithmetic instructions (sol)N_Asm Assembly arithmetic instructions (sol)
N_Asm Assembly arithmetic instructions (sol)Selomon birhane
 
Assembly Language Lecture 2
Assembly Language Lecture 2Assembly Language Lecture 2
Assembly Language Lecture 2Motaz Saad
 
Introduction to 8088 microprocessor
Introduction to 8088 microprocessorIntroduction to 8088 microprocessor
Introduction to 8088 microprocessorDwight Sabio
 
EMBEDDED SYSTEMS 4&5
EMBEDDED SYSTEMS 4&5EMBEDDED SYSTEMS 4&5
EMBEDDED SYSTEMS 4&5PRADEEP
 
Assembly Language Programming By Ytha Yu, Charles Marut Chap 4 (Introduction ...
Assembly Language Programming By Ytha Yu, Charles Marut Chap 4 (Introduction ...Assembly Language Programming By Ytha Yu, Charles Marut Chap 4 (Introduction ...
Assembly Language Programming By Ytha Yu, Charles Marut Chap 4 (Introduction ...Bilal Amjad
 
Chapter 6 Flow control Instructions
Chapter 6 Flow control InstructionsChapter 6 Flow control Instructions
Chapter 6 Flow control Instructionswarda aziz
 
Embedded c program and programming structure for beginners
Embedded c program and programming structure for beginnersEmbedded c program and programming structure for beginners
Embedded c program and programming structure for beginnersKamesh Mtec
 
Chapter 2 The 8088 Microprocessor
Chapter 2   The 8088 MicroprocessorChapter 2   The 8088 Microprocessor
Chapter 2 The 8088 MicroprocessorDwight Sabio
 
Instruction set of 8086
Instruction set of 8086Instruction set of 8086
Instruction set of 80869840596838
 
Assembly Language Lecture 1
Assembly Language Lecture 1Assembly Language Lecture 1
Assembly Language Lecture 1Motaz Saad
 
1344 Alp Of 8086
1344 Alp Of 80861344 Alp Of 8086
1344 Alp Of 8086techbed
 
X86 assembly & GDB
X86 assembly & GDBX86 assembly & GDB
X86 assembly & GDBJian-Yu Li
 
Assembly Language Lecture 4
Assembly Language Lecture 4Assembly Language Lecture 4
Assembly Language Lecture 4Motaz Saad
 

Mais procurados (20)

Embedded c
Embedded cEmbedded c
Embedded c
 
N_Asm Assembly arithmetic instructions (sol)
N_Asm Assembly arithmetic instructions (sol)N_Asm Assembly arithmetic instructions (sol)
N_Asm Assembly arithmetic instructions (sol)
 
Assembly Language Lecture 2
Assembly Language Lecture 2Assembly Language Lecture 2
Assembly Language Lecture 2
 
Introduction to 8088 microprocessor
Introduction to 8088 microprocessorIntroduction to 8088 microprocessor
Introduction to 8088 microprocessor
 
Microcontroller part 4
Microcontroller part 4Microcontroller part 4
Microcontroller part 4
 
EMBEDDED SYSTEMS 4&5
EMBEDDED SYSTEMS 4&5EMBEDDED SYSTEMS 4&5
EMBEDDED SYSTEMS 4&5
 
Assembly Language Programming By Ytha Yu, Charles Marut Chap 4 (Introduction ...
Assembly Language Programming By Ytha Yu, Charles Marut Chap 4 (Introduction ...Assembly Language Programming By Ytha Yu, Charles Marut Chap 4 (Introduction ...
Assembly Language Programming By Ytha Yu, Charles Marut Chap 4 (Introduction ...
 
Chapter 6 Flow control Instructions
Chapter 6 Flow control InstructionsChapter 6 Flow control Instructions
Chapter 6 Flow control Instructions
 
Embedded c program and programming structure for beginners
Embedded c program and programming structure for beginnersEmbedded c program and programming structure for beginners
Embedded c program and programming structure for beginners
 
Lecture5(1)
Lecture5(1)Lecture5(1)
Lecture5(1)
 
Chapter 2 The 8088 Microprocessor
Chapter 2   The 8088 MicroprocessorChapter 2   The 8088 Microprocessor
Chapter 2 The 8088 Microprocessor
 
Lecture6
Lecture6Lecture6
Lecture6
 
Microcontroller part 6_v1
Microcontroller part 6_v1Microcontroller part 6_v1
Microcontroller part 6_v1
 
FPGA - Programmable Logic Design
FPGA - Programmable Logic DesignFPGA - Programmable Logic Design
FPGA - Programmable Logic Design
 
Instruction set of 8086
Instruction set of 8086Instruction set of 8086
Instruction set of 8086
 
Assembly Language Lecture 1
Assembly Language Lecture 1Assembly Language Lecture 1
Assembly Language Lecture 1
 
1344 Alp Of 8086
1344 Alp Of 80861344 Alp Of 8086
1344 Alp Of 8086
 
X86 assembly & GDB
X86 assembly & GDBX86 assembly & GDB
X86 assembly & GDB
 
Introduction to HDLs
Introduction to HDLsIntroduction to HDLs
Introduction to HDLs
 
Assembly Language Lecture 4
Assembly Language Lecture 4Assembly Language Lecture 4
Assembly Language Lecture 4
 

Semelhante a Bsdconv

Error Resiliency and Concealment in H.264 MPEG-4 Part 10
Error Resiliency and Concealment in H.264 MPEG-4 Part 10Error Resiliency and Concealment in H.264 MPEG-4 Part 10
Error Resiliency and Concealment in H.264 MPEG-4 Part 10coldfire7
 
8085 micro processor
8085 micro processor8085 micro processor
8085 micro processorArun Umrao
 
Notes of 8085 micro processor Programming for BCA, MCA, MSC (CS), MSC (IT) &...
Notes of 8085 micro processor Programming  for BCA, MCA, MSC (CS), MSC (IT) &...Notes of 8085 micro processor Programming  for BCA, MCA, MSC (CS), MSC (IT) &...
Notes of 8085 micro processor Programming for BCA, MCA, MSC (CS), MSC (IT) &...ssuserd6b1fd
 
20090814102834_嵌入式C与C++语言精华文章集锦.docx
20090814102834_嵌入式C与C++语言精华文章集锦.docx20090814102834_嵌入式C与C++语言精华文章集锦.docx
20090814102834_嵌入式C与C++语言精华文章集锦.docxMostafaParvin1
 
Berkeley Packet Filters
Berkeley Packet FiltersBerkeley Packet Filters
Berkeley Packet FiltersKernel TLV
 
ADS1256 library documentation
ADS1256 library documentationADS1256 library documentation
ADS1256 library documentationCuriousScientist
 
Assembly Codes in C Programmes - A Short Notes by Arun Umrao
Assembly Codes in C Programmes - A Short Notes by Arun UmraoAssembly Codes in C Programmes - A Short Notes by Arun Umrao
Assembly Codes in C Programmes - A Short Notes by Arun Umraossuserd6b1fd
 
Bascom avr-course
Bascom avr-courseBascom avr-course
Bascom avr-coursehandson28
 
VJITSk 6713 user manual
VJITSk 6713 user manualVJITSk 6713 user manual
VJITSk 6713 user manualkot seelam
 
Basic Interoperable Scrambling System
Basic Interoperable Scrambling SystemBasic Interoperable Scrambling System
Basic Interoperable Scrambling SystemSais Abdelkrim
 
Tutorial-Auto-Code-Generation-for-F2803x-Target.pdf
Tutorial-Auto-Code-Generation-for-F2803x-Target.pdfTutorial-Auto-Code-Generation-for-F2803x-Target.pdf
Tutorial-Auto-Code-Generation-for-F2803x-Target.pdfmounir derri
 
Image compression1.ppt
Image compression1.pptImage compression1.ppt
Image compression1.pptssuser812128
 
OpenWRT manual
OpenWRT manualOpenWRT manual
OpenWRT manualfosk
 

Semelhante a Bsdconv (20)

Pcbgcode
PcbgcodePcbgcode
Pcbgcode
 
Error Resiliency and Concealment in H.264 MPEG-4 Part 10
Error Resiliency and Concealment in H.264 MPEG-4 Part 10Error Resiliency and Concealment in H.264 MPEG-4 Part 10
Error Resiliency and Concealment in H.264 MPEG-4 Part 10
 
8085 micro processor
8085 micro processor8085 micro processor
8085 micro processor
 
Notes of 8085 micro processor Programming for BCA, MCA, MSC (CS), MSC (IT) &...
Notes of 8085 micro processor Programming  for BCA, MCA, MSC (CS), MSC (IT) &...Notes of 8085 micro processor Programming  for BCA, MCA, MSC (CS), MSC (IT) &...
Notes of 8085 micro processor Programming for BCA, MCA, MSC (CS), MSC (IT) &...
 
20090814102834_嵌入式C与C++语言精华文章集锦.docx
20090814102834_嵌入式C与C++语言精华文章集锦.docx20090814102834_嵌入式C与C++语言精华文章集锦.docx
20090814102834_嵌入式C与C++语言精华文章集锦.docx
 
Berkeley Packet Filters
Berkeley Packet FiltersBerkeley Packet Filters
Berkeley Packet Filters
 
ADS1256 library documentation
ADS1256 library documentationADS1256 library documentation
ADS1256 library documentation
 
Compress
CompressCompress
Compress
 
Assembly Codes in C Programmes - A Short Notes by Arun Umrao
Assembly Codes in C Programmes - A Short Notes by Arun UmraoAssembly Codes in C Programmes - A Short Notes by Arun Umrao
Assembly Codes in C Programmes - A Short Notes by Arun Umrao
 
Interprocess Message Formats
Interprocess Message FormatsInterprocess Message Formats
Interprocess Message Formats
 
Multi Process Message Formats
Multi Process Message FormatsMulti Process Message Formats
Multi Process Message Formats
 
Bascom avr-course
Bascom avr-courseBascom avr-course
Bascom avr-course
 
VJITSk 6713 user manual
VJITSk 6713 user manualVJITSk 6713 user manual
VJITSk 6713 user manual
 
Basic Interoperable Scrambling System
Basic Interoperable Scrambling SystemBasic Interoperable Scrambling System
Basic Interoperable Scrambling System
 
OPCDE Crackme Solution
OPCDE Crackme SolutionOPCDE Crackme Solution
OPCDE Crackme Solution
 
Tutorial-Auto-Code-Generation-for-F2803x-Target.pdf
Tutorial-Auto-Code-Generation-for-F2803x-Target.pdfTutorial-Auto-Code-Generation-for-F2803x-Target.pdf
Tutorial-Auto-Code-Generation-for-F2803x-Target.pdf
 
Image compression1.ppt
Image compression1.pptImage compression1.ppt
Image compression1.ppt
 
Lb35189919904
Lb35189919904Lb35189919904
Lb35189919904
 
vorlage
vorlagevorlage
vorlage
 
OpenWRT manual
OpenWRT manualOpenWRT manual
OpenWRT manual
 

Último

Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 

Último (20)

Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 

Bsdconv

  • 1. BSDCONV Kuan-Chung Chiu (buganini at gmail dot com) Contents 1 Syntax 1 1.1 Phases & Cascade . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 Codecs & Fallback . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.3 Codec argument . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 2 Type & Flag 3 2.1 Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 2.2 Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2.3 Helper codecs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 3 C Programming guide 6 3.1 Conversion instance lifecycle . . . . . . . . . . . . . . . . . . . . . 6 3.2 Skeleton . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 3.3 Output mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 3.3.1 BSDCONV HOLD . . . . . . . . . . . . . . . . . . . . . . 8 3.3.2 BSDCONV AUTOMALLOC . . . . . . . . . . . . . . . . 8 3.3.3 BSDCONV PREMALLOCED . . . . . . . . . . . . . . . 8 3.3.4 BSDCONV FILE . . . . . . . . . . . . . . . . . . . . . . . 8 3.3.5 BSDCONV FD . . . . . . . . . . . . . . . . . . . . . . . . 8 3.3.6 BSDCONV NULL . . . . . . . . . . . . . . . . . . . . . . 8 3.3.7 BSDCONV PASS . . . . . . . . . . . . . . . . . . . . . . 8 3.4 Counters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 3.5 Memory pool issue . . . . . . . . . . . . . . . . . . . . . . . . . . 9 1 Syntax 1.1 Phases & Cascade There are three types of conversion phases defined in bsdconv: from, inter, to. The from phase takes byte sequence and decodes it into a list of code points (except for from/PASS), on the other hand, the to phase encodes the list of code points back to byte sequence. The inter phase does code point to code point mapping. 1
  • 2. A basic conversion consists of from and to phases. Search of codec name is case insensitive. ISO-8859-1 : UTF-8 from to Figure 1: Basic two phases conversion Between from and to phases, we can have an inter phase. UTF-8 : UPPER : UTF-8 from inter to Figure 2: Conversion with inter-mapping phase There can be more than one inter phases. UTF-8 : UPPER : FULL : UTF-8 from inter inter to Figure 3: Conversion with multiple inter-mapping phases An inter phase can be used standalonely, mostly in programmatic way. HALF inter Figure 4: Standalone inter-mapping phase Conversions can be cascaded with pipe symbol. In most cases it is equivalent to shell pipe unless the use of codecs manipulating flag (described in section 2.2). UTF-8 : BIG5 | BIG5 : UTF-8 from to from to Figure 5: Cascaded conversions ASCII-compatible codecs are designed to exclude ASCII part and named as FOO, with alias FOO ⇒ FOO,ASCII or ASCII, FOO. 2
  • 3. 1.2 Codecs & Fallback A phase consists of one or more codecs, separated by comma. The latter codecs will be utilized if and only if the former codecs fail to consume the incoming data, once a codec finish its task, the first codec will be up again for upcoming data. UTF-8 : ASCII , 3F from to Figure 6: Fallback codec 1.3 Codec argument Some codecs take arguments, after the hash symbol. UTF-8 : ASCII , ANY#3F Figure 7: Passing argument to codec Some codecs take arguments in key-value form. Argument name and value consist of numbers, alphabets, hyphen and underscore, binary data are repre- sented in hexadecimal form. UTF-8 : ASCII , ESCAPE#PREFIX=2575 Figure 8: Passing argument to codec in key-value form Multiple arguments can be passed by being concatenated with ampersand. UTF-8 : ASCII , ESCAPE#PREFIX=262378&SUFFIX=3B Figure 9: Passing multiple arguments to codec List of data can be passed in dot-separated form. ANY#013F.0121 : ASCII Figure 10: Data list 3
  • 4. 2 Type & Flag 2.1 Type A code point packet note its type at first byte. ID Description Provider(from) Consumer(to) 00 Bsdconv special characters BSDCONV-KEYWORD BSDCONV-KEYWORD 01 Unicode Most decoders Most encoders 02 CNS116431 CNS11643 CNS11643 03 Byte BYTE; ESCAPE BYTE; ESCAPE#FOR=BYTE 04 Chinese components inter/ZH-DECOMP inter/ZH-COMP 1B ANSI control sequence ANSI-CONTROL - Table 1: Types and its provider/consumer (just to name a few) Entity Unicode UTF-8 Hex % U+0025 25 A U+0041 41 ∀ U+2200 E28880 A∀ Input (UTF-8 literal) ASCII,BYTE : ... Decoder 01 41 03 E2 03 88 03 80 Internal data ... : ASCII,ESCAPE Encoder 41 ”A” 25 45 32 ”%E2” 25 38 38 ”%88” 25 38 30 ”%80” Internal data A%E2%88%80 Output (UTF-8 literal) Figure 11: Fallback & Type 1As for the intersection of CNS11643 and Unicode, from/CNS11643 does conversion to unicode type if possible. Vice versa, to/CNS11643 does conversion from unicode type if possible. 4
  • 5. 2.2 Flag

    A code point packet carries its own flags. There are currently two flags, FREE and MARK. FREE indicates that the packet buffer needs to be recycled or released; it matters only when programming against the library. MARK is (currently only) added by codec to/PASS#MARK and used by codec from/PASS#UNMARK to identify packets that have already been decoded and need to be passed through in the from phase. The code point packet structure, including flags, is retained across cascaded conversions, but not across a shell pipe. Figure 12 demonstrates the flow of the conversion "ESCAPE:PASS#MARK&FOR=1,BYTE|PASS#UNMARK,UTF-8:UTF-8".

    Entity  Unicode  UTF-8 hex
    α       U+03B1   CEB1
    β       U+03B2   CEB2

    Input (UTF-8 literal):        %u03B1%CE%B2
    Decoder ESCAPE : ...
    Internal data:                01 03 B1 | 03 CE | 03 B2
    Encoder ... : PASS#MARK&FOR=1,BYTE
    Internal data:                01 03 B1 (MARK) | CE | B2
    Decoder PASS#UNMARK,UTF-8 : ...
    Internal data:                01 03 B1 | 01 03 B2
    Encoder ... : UTF-8
    Internal data:                CE B1 ("α") | CE B2 ("β")
    Output (UTF-8 literal):       αβ
    Figure 12: Flag, from/PASS & to/PASS
  • 6. 2.3 Helper codecs

    Codec from/bsdconv can be used to input the internal data structure, and codec to/BSDCONV-OUTPUT can be used to inspect types and flags.

    3 C Programming guide

    3.1 Conversion instance lifecycle

    1. bsdconv_create()
    2. bsdconv_init()
    3. Set input/output parameters.
    4. Is this the last chunk? If yes, set the flush flag.
    5. bsdconv()
    6. Collect output.
    7. Is there a next chunk? If yes, take the next chunk and return to step 3.
    8. Reuse the instance? If yes, return to step 2; if no, call bsdconv_destroy().
    Figure 13: Conversion instance lifecycle (flowchart, shown here as steps)
  • 7. 3.2 Skeleton

    #include <stdio.h>      /* BUFSIZ */
    #include <bsdconv.h>

    struct bsdconv_instance *ins;
    char *buf;
    size_t len;

    ins = bsdconv_create("UTF-8:UPSIDEDOWN:UTF-8");
    bsdconv_init(ins);
    do {
        /* allocate a fresh input buffer for every chunk */
        buf = bsdconv_malloc(BUFSIZ);
        /*
         * fill data into buf
         * len = filled data length
         */
        ins->input.data = buf;
        ins->input.len = len;
        ins->input.flags |= F_FREE;
        ins->input.next = NULL;
        if (ins->input.len == 0) { /* last chunk */
            ins->flush = 1;
        }
        /*
         * set output parameter (see section 3.3)
         */
        bsdconv(ins);
        /*
         * collect output (see section 3.3)
         */
    } while (ins->flush == 0);
    bsdconv_destroy(ins);

    For chunked conversion, a new input buffer should be allocated for each chunk so that its content cannot change during conversion. An output buffer with flag FREE is safe to reuse.

    3.3 Output mode

    ins->output_mode      Description
    BSDCONV_HOLD          Hold output in memory
    BSDCONV_AUTOMALLOC    Return an output buffer, which should be free()d after use
    BSDCONV_PREMALLOCED   Fill output into the given buffer
    BSDCONV_FILE          Write output to a (FILE *) stream
    BSDCONV_FD            Write output to an (int) file descriptor
    BSDCONV_NULL          Discard output
    BSDCONV_PASS          Pass output to another conversion instance
  • 8. 3.3.1 BSDCONV_HOLD

    This is the default output mode after bsdconv_init(). It is usually used together with BSDCONV_AUTOMALLOC or BSDCONV_PREMALLOCED to get squeezed output.

    3.3.2 BSDCONV_AUTOMALLOC

    The output buffer is allocated dynamically. The actual buffer size is ins->output.len plus the output content length, which is useful when you need room for a terminating null byte.

    3.3.3 BSDCONV_PREMALLOCED

    If ins->output.data is NULL, the total length of the content to be output is stored in ins->output.len, but the output is still held in memory. Otherwise, bsdconv() fills in as much unfragmented data as possible within the buffer size limit specified in ins->output.len.

    3.3.4 BSDCONV_FILE

    Output is written with fwrite() to the FILE * given in ins->output.data.

    3.3.5 BSDCONV_FD

    Output is written with write() to the (int) file descriptor given in ins->output.data. Casting through intptr_t (defined in <stdint.h>) is needed to avoid a compiler warning.

    3.3.6 BSDCONV_NULL

    Output is discarded. This is usually used when evaluating a conversion (see section 3.4).

    3.3.7 BSDCONV_PASS

    Output packets are passed to the (struct bsdconv_instance *) conversion instance given in ins->output.data.

    3.4 Counters

    Counters are kept in ins->counter as a linked list with the following structure.

    struct bsdconv_counter_entry {
        char *key;
        bsdconv_counter_t val;
        struct bsdconv_counter_entry *next;
    };

    IERR and OERR are mandatory error counters.
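As an illustration of how such a linked list can be walked, the following C sketch looks up a counter by key and sums the two mandatory error counters. It is a standalone sketch, not library code: it redeclares the structure locally (so it should not be compiled together with <bsdconv.h>), it assumes bsdconv_counter_t is size_t, the helpers `find_counter` and `total_errors` are hypothetical, and exact key matching is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-ins so the sketch compiles without <bsdconv.h>. */
typedef size_t bsdconv_counter_t;

struct bsdconv_counter_entry {
    char *key;
    bsdconv_counter_t val;
    struct bsdconv_counter_entry *next;
};

/* Hypothetical helper: return a pointer to the value of the counter named
 * key, or NULL if the list has no such entry. Keys are compared exactly. */
static bsdconv_counter_t *find_counter(struct bsdconv_counter_entry *head,
                                       const char *key)
{
    assert(key != NULL);
    for (; head != NULL; head = head->next)
        if (strcmp(head->key, key) == 0)
            return &head->val;
    return NULL;
}

/* Hypothetical helper: sum of the IERR and OERR counters; a counter that
 * is absent from the list contributes zero. */
static bsdconv_counter_t total_errors(struct bsdconv_counter_entry *head)
{
    bsdconv_counter_t *ierr = find_counter(head, "IERR");
    bsdconv_counter_t *oerr = find_counter(head, "OERR");

    return (ierr ? *ierr : 0) + (oerr ? *oerr : 0);
}
```

In a real program the list head would come from ins->counter; a nonzero total after a conversion run with BSDCONV_NULL output would suggest that the input did not convert cleanly.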
  • 9. There are two APIs to get and reset counters:

    bsdconv_counter_t *bsdconv_counter(char *name);

    Returns a pointer to the counter value. bsdconv_counter_t is currently defined as size_t.

    void bsdconv_counter_reset(char *name);

    Resets the specified counter. If name is NULL, all counters are reset.

    3.5 Memory pool issue

    If libbsdconv and your program use different memory pools, bsdconv_malloc() and bsdconv_free() should be used in place of malloc() and free().