ParSyll Algorithm
While the code and test environment still refer to SYLLABIX (the name assigned to the prototype
algorithm prior to the year 2000), the algorithm has been renamed because a game with the name
Syllabix now exists.
At some point the program names, files and environment will be updated to reflect the new name. In the
interim, rights are claimed by way of use, reference, communication and publication, including this very
document as emailed, distributed and reflected in electronic media.
Copyright and right are claimed in terms of the Berne Copyright Convention and in terms of the
Copyright Act 98 of 1978 of South Africa. No part of this publication or of the program(s) or any
associated code may be reproduced or transmitted in any form or by any means, electronic or
mechanical, including photocopying, recording or by any information storage and retrieval system,
without permission in writing from the author Trevor Nigel Gadd. All rights reserved.
The following is a brief description of the algorithm.
The purpose of the algorithm is to segment written words and names into auto-determined 'syllables'
which are then interpreted phonetically to a degree, and used to construct a retrieval 'code' that
inherently 'groups' like-sounding words or names together to 'broaden' search results during a textual
enquiry.
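To illustrate the grouping idea only, the following is a minimal sketch of how a retrieval code can bucket like-sounding names so that one enquiry returns all of them. The function toy_code is a hypothetical stand-in invented for this example (keep the first letter, drop later vowels, collapse repeats); it is NOT the ParSyll encoding, whose rules are not given in this document. The names build_index and search are likewise assumptions.

```python
from collections import defaultdict


def toy_code(word: str) -> str:
    """Hypothetical stand-in for a retrieval code; NOT the ParSyll encoding."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    # Keep the first letter, then drop vowels (and Y) from the remainder.
    kept = [letters[0]] + [c for c in letters[1:] if c not in "AEIOUY"]
    # Collapse adjacent repeats so that 'DD' and 'D' code alike.
    out = [kept[0]]
    for c in kept[1:]:
        if c != out[-1]:
            out.append(c)
    return "".join(out)


def build_index(names):
    """Map each code to the list of names that share it."""
    index = defaultdict(list)
    for name in names:
        index[toy_code(name)].append(name)
    return index


def search(index, query):
    """Return every stored name whose code matches the query's code."""
    return index.get(toy_code(query), [])


index = build_index(["Gadd", "Gad", "Smith", "Smyth"])
print(search(index, "Smithe"))  # ['Smith', 'Smyth']
```

The point of the sketch is the lookup structure: names are stored under their code rather than their spelling, so a query is encoded the same way and matched by code, which is what broadens the result set.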
It is important to note that the ParSyll Algorithm does not attempt to emulate dictionary syllable
definition. Instead, it uses raw logic to attempt syllable segmentation in isolation from referential
data, and NO WHOLE WORDS are stored or referenced in the execution of its task.
The algorithm is divided into eight major segments, executed sequentially:
1. An initial segmentation
1.1 Incorporates some temporary special character-sequence augmentation
which is deleted again at the end of initial segmentation
2. Diphthongs and Triphthongs
2.1 Segmentation is based on 'majority-fit' solutions, resulting in some
incorrect sound-splits and conjoins (is 'ruin' one syllable or two?
'IENCE' in SCIENCE? 'IENCE' in CONSCIENCE? etc.)
3. Complex segmentation
4. Ending sound segmentation
4.1 Some sequences, e.g. 'NG' in the middle of a word, might be split as a
result of syllable segmentation, e.g. '..N~G..'. The same sequence in
an ending sound or final syllable might not, e.g. '~ING'
5. Phonetic substitutions
The phonetic substitutions of PHONIX are established and documented.
It is anticipated that syllable segmentation will enable different,
if similar, substitutions to be defined. Significantly, simpler
substitutions may suffice by virtue of the 'added definition' of
syllable boundaries.
5.1 First Syllable Substitutions
5.1.1 Leading character substitutions
5.1.2 Embedded & trailing character substitutions and negations
5.2 Middle Syllable Substitutions
5.2.1 General character substitutions and negations
5.3 Last Syllable Substitutions
5.3.1 Ending-sound substitutions and negations
6. Elimination of carrier vocalization (vowels) and 'silent' consonants
7. Character mapping to broaden search results
8. Indexing of results for retrieval purposes
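
The sequential structure of the eight segments above can be sketched as a simple pipeline. Everything in this sketch is an assumption made for illustration: the VC~CV split rule in initial_segmentation, the vowel set, the three-letter character map and the '~'-joined key are toy choices, and segments 2 to 5 are left as pass-through placeholders because their rules are not given in this document. It is not the ParSyll implementation.

```python
import re
from typing import Callable, List

VOWELS = "AEIOUY"  # assumption: Y is treated as a vowel in this toy


def initial_segmentation(word: str) -> List[str]:
    """Toy segment 1: split between two consonants flanked by vowels (VC~CV)."""
    pattern = rf"([{VOWELS}][^{VOWELS}])(?=[^{VOWELS}][{VOWELS}])"
    return re.sub(pattern, r"\1~", word.upper()).split("~")


def unspecified(syllables: List[str]) -> List[str]:
    """Placeholder for segments 2 to 5, whose rules are not documented here."""
    return syllables


def drop_vowels(syllables: List[str]) -> List[str]:
    """Toy segment 6: remove vowels, keeping the word's first character."""
    out = []
    for i, syl in enumerate(syllables):
        head, tail = (syl[:1], syl[1:]) if i == 0 else ("", syl)
        out.append(head + "".join(c for c in tail if c not in VOWELS))
    return out


# Toy segment 7: a hypothetical map that merges similar-sounding characters.
BROADEN = str.maketrans({"K": "C", "Q": "C", "Z": "S"})


def map_characters(syllables: List[str]) -> List[str]:
    return [syl.translate(BROADEN) for syl in syllables]


# Segments 2 to 7 in order; each takes and returns a list of syllables.
LATER_SEGMENTS: List[Callable[[List[str]], List[str]]] = [
    unspecified,      # 2. diphthongs and triphthongs
    unspecified,      # 3. complex segmentation
    unspecified,      # 4. ending sound segmentation
    unspecified,      # 5. phonetic substitutions
    drop_vowels,      # 6. elimination of vowels
    map_characters,   # 7. character mapping
]


def encode(word: str) -> str:
    """Run the segments in sequence and join the result into a key (segment 8)."""
    syllables = initial_segmentation(word)
    for segment in LATER_SEGMENTS:
        syllables = segment(syllables)
    return "~".join(syllables)


print(initial_segmentation("singing"))  # ['SIN', 'GING']
print(encode("finger"))                 # FN~GR
```

In this toy, the mid-word 'NG' of 'singing' is split as 'N~G' while the final 'ING' is left intact, which mirrors the behaviour described in 4.1, although the rule that produces it here is invented for the example.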
Data evaluation of results for the development of algorithm segment 5 is underway.
T.N. Gadd
22 December 2015
