aXML-Motor
                             XML Document Parsing
                                  Algorithm
                                 version 2011.11.04
                                                             Abhishek Kumar ~=ABK=~
                                                     http://github.com/abhishekkr
                                                    http://www.twitter.com/abionic

Algorithm, Ruby Source and Gem:
[axml-motor] @GitHub: http://github.com/abhishekkr/axml-motor.git
rubygem's src @GitHub: http://github.com/abhishekkr/rubygem_xml_motor.git
gem install @RubyGems: http://rubygems.org/gems/xml-motor

                        Algorithm-Walk-through
Example XML Content:
      <BODY>
        <DIV id='banner'>
          <H1>aXML-Motor</H1>
          <H5>A new algorithm based compact XML Parser with <I>no dependencies</I>.
          </H5>
        </DIV>
        <DIV id='details'>
          <SPAN class='github'>@github:
            <A href='http://github.com/abhishekkr/axml-motor.git'>axml-motor</A>
          </SPAN>
          <DIV class='gem'>
            <SPAN id='source' class='github'>@github:
              <A href='http://github.com/abhishekkr/rubygem-xml-motor.git'>rubygem-xml-motor</A>
            </SPAN>
            <SPAN class='rubygems'>@rubygems:
              <A href='http://rubygems.org/gems/xml-motor.git'>xml-motor</A>
            </SPAN>
          </DIV>
          <I> It's a new algorithm implemented to build a real compact parser
            (v0.0.2 has less than 200 ruby source code lines) without any
            dependencies.</I>
        </DIV>
      </BODY>
[Step.1] Split the XML Content

           (1.1) Split by '<'
              store as XMLNodes
             [0] BODY>
             [1] DIV id='banner'>
             [2] H1>aXML-Motor
             [3] /H1>
             [4] H5>A new algorithm based compact XML Parser with
             [5] I>no dependencies
             [6] /I>.
             [7] /H5>
             [8] /DIV>
             [9] DIV id='details'>
             [10] SPAN class='github'>@github:
             [11] A href='http://github.com/abhishekkr/axml-motor.git'>axml-motor
             [12] /A>
             [13] /SPAN>
             [14] DIV class='gem'>
             [15] SPAN id='source' class='github'>@github:
             [16] A href='http://github.com/abhishekkr/rubygem-xml-motor.git'>rubygem-xml-motor
             [17] /A>
             [18] /SPAN>
             [19] SPAN class='rubygems'>@rubygems:
             [20] A href='http://rubygems.org/gems/xml-motor.git'>xml-motor
             [21] /A>
             [22] /SPAN>
             [23] /DIV>
             [24] I> It's a new algorithm implemented to build a real compact parser
                  (v0.0.2 has less than 200 ruby source code lines) without any
                  dependencies.
             [25] /I>
             [26] /DIV>
             [27] /BODY>
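
A minimal Ruby sketch of step 1.1, run on a shortened version of the example. The method name split_on_open is an assumption made for illustration and is not taken from the gem's source.

```ruby
# Hypothetical helper (not the gem's API): step 1.1, split by '<'.
def split_on_open(xml)
  # The chunk before the first '<' belongs to no node, so it is dropped.
  xml.split('<')[1..-1] || []
end

xml = "<BODY><DIV id='banner'><H1>aXML-Motor</H1></DIV></BODY>"
chunks = split_on_open(xml)
chunks.each_with_index { |chunk, i| puts "[#{i}] #{chunk}" }
```

Each chunk now starts with a tag and still carries its closing '>', which step 1.2 uses as the next split point.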


           (1.2) Split each result of step 1.1 by '>'
              update XMLNodes
             [0] ['BODY', '']
             [1] ['DIV id='banner'', '']
             [2] ['H1', 'aXML-Motor']
             [3] ['/H1', '']
             [4] ['H5', 'A new algorithm based compact XML Parser with ']
             [5] ['I', 'no dependencies']
             [6] ['/I', '.']
             [7] ['/H5', '']
             [8] ['/DIV', '']
             [9] ['DIV id='details'', '']
             [10] ['SPAN class='github'', '@github: ']
             [11] ['A href='http://github.com/abhishekkr/axml-motor.git'',
                   'axml-motor']
             [12] ['/A', '']
             [13] ['/SPAN', '']
             [14] ['DIV class='gem'', '']
             [15] ['SPAN id='source' class='github'', '@github: ']
             [16] ['A href='http://github.com/abhishekkr/rubygem-xml-motor.git'',
                   'rubygem-xml-motor']
             [17] ['/A', '']
             [18] ['/SPAN', '']
             [19] ['SPAN class='rubygems'', '@rubygems: ']
             [20] ['A href='http://rubygems.org/gems/xml-motor.git'', 'xml-motor']
             [21] ['/A', '']
             [22] ['/SPAN', '']
             [23] ['/DIV', '']
             [24] ['I', 'It's a new algorithm implemented to build a real compact
                   parser (v0.0.2 has less than 200 ruby source code lines) without
                   any dependencies.']
             [25] ['/I', '']
             [26] ['/DIV', '']
             [27] ['/BODY', '']
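
Step 1.2 could be sketched in Ruby as below. The method name split_on_close is hypothetical; the limit of 2 is an assumption added so that a '>' inside the text part is left untouched.

```ruby
# Hypothetical helper (not the gem's API): step 1.2, split each chunk by '>'.
def split_on_close(chunks)
  chunks.map do |chunk|
    # Limit 2: only the first '>' separates the tag part from the text part.
    tag, text = chunk.split('>', 2)
    [tag.to_s, text.to_s]
  end
end

pairs = split_on_close(["BODY>", "H1>aXML-Motor", "/H1>"])
pairs.each_with_index { |pair, i| puts "[#{i}] #{pair.inspect}" }
```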


           (1.3) Split the first element of each entry by space/tab. Mark the
              first part as tag_name, then split each remaining part by '=' to
              build a key=>value pair per attribute.
              update XMLNodes
             [0] [ ['BODY', {}], '' ]
             [1] [ ['DIV', {'id'=>'banner'}], '' ]
             [2] [ ['H1', {}], 'aXML-Motor' ]
             [3] [ ['/H1', {}], '']
             [4] [ ['H5', {}], 'A new algorithm based compact XML Parser with ']
             [5] [ ['I', {}], 'no dependencies']
             [6] [ ['/I', {}], '.']
             [7] [ ['/H5', {}], '']
             [8] [ ['/DIV', {}], '']
             [9] [ ['DIV', {'id'=>'details'}], '']
             [10] [ ['SPAN', {'class'=>'github'}], '@github: ']
             [11] [ ['A', {'href'=>'http://github.com/abhishekkr/axml-motor.git'}],
                   'axml-motor']
             [12] [ ['/A', {}], '']
             [13] [ ['/SPAN', {}], '']
             [14] [ ['DIV', {'class'=>'gem'}], '']
             [15] [ ['SPAN', {'id'=>'source', 'class'=>'github'}], '@github: ']
             [16] [ ['A',
                     {'href'=>'http://github.com/abhishekkr/rubygem-xml-motor.git'}],
                   'rubygem-xml-motor']
             [17] [ ['/A', {}], '']
             [18] [ ['/SPAN', {}], '']
             [19] [ ['SPAN', {'class'=>'rubygems'}], '@rubygems: ']
             [20] [ ['A', {'href'=>'http://rubygems.org/gems/xml-motor.git'}],
                   'xml-motor']
             [21] [ ['/A', {}], '']
             [22] [ ['/SPAN', {}], '']
             [23] [ ['/DIV', {}], '']
             [24] [ ['I', {}], 'It's a new algorithm implemented to build a real
                   compact parser (v0.0.2 has less than 200 ruby source code lines)
                   without any dependencies.']
             [25] [ ['/I', {}], '']
             [26] [ ['/DIV', {}], '']
             [27] [ ['/BODY', {}], '']
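
A rough Ruby sketch of step 1.3 for a single tag string. The name parse_tag is hypothetical, and the sketch assumes attribute values contain no whitespace, since it splits on whitespace exactly as the step describes.

```ruby
# Hypothetical helper (not the gem's API): step 1.3 for one tag string.
# Assumption: attribute values contain no spaces or tabs.
def parse_tag(raw)
  # First whitespace-separated part is the tag name, the rest are attributes.
  name, *pairs = raw.split
  attrs = {}
  pairs.each do |pair|
    key, value = pair.split('=', 2)
    next if value.nil? # skip parts that are not key=value
    # Strip one surrounding quote character from each end of the value.
    attrs[key] = value.gsub(/\A['"]|['"]\z/, '')
  end
  [name, attrs]
end

p parse_tag("SPAN id='source' class='github'")
p parse_tag('/H1')
```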


              The XMLNodes are now in the form we need.
              The next step is to index them.


[Step.2] Index the processed XMLNodes
Three things are involved in indexing the XMLNodes:

       Tag_Name:
       Every element of XMLNodes has three components (tag name, attributes and
       text). The tag name is the first item of the first component, that is
       XMLNodes[i][0][0].
       Depth:
       The level of the node in the XML node tree, starting from 0.
       Index:
       The position of the node in the XMLNodes array.

       How to index?
       XMLTags holds one entry per Tag_Name. Each entry is a Hash keyed by the
       'Depth' at which the tag is found. Each value is an array of
       2*number_of_nodes entries (the starting and ending 'Index' of every node
       at that depth).
       Example:
       From the XMLNodes above, ['DIV'] would hold {1=>[1,8, 9,26], 2=>[14,23]},
       because 'Tag_Name' DIV has the 'Index' pairs 1,8 and 9,26 at 'Depth' 1
       and the 'Index' pair 14,23 at 'Depth' 2.

The indexed XMLTags for the processed XMLNodes above are as follows:

calculated XMLTags
              ['BODY'] = {0=>[0,27]}
              ['DIV']  = {1=>[1,8, 9,26], 2=>[14,23]}
              ['H1']   = {2=>[2,3]}
              ['H5']   = {2=>[4,7]}
              ['I']    = {3=>[5,6], 2=>[24,25]}
              ['SPAN'] = {2=>[10,13], 3=>[15,18, 19,22]}
              ['A']    = {3=>[11,12], 4=>[16,17, 20,21]}
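
The indexing pass can be sketched in Ruby with a running depth counter. The name index_nodes is hypothetical, and the sketch assumes every opening tag has a matching closing tag (no self-closing tags, comments or declarations), as in the example above.

```ruby
# Hypothetical helper (not the gem's API): build XMLTags from XMLNodes.
# Assumption: every opening tag has a matching closing tag.
def index_nodes(nodes)
  tags = Hash.new { |hash, tag| hash[tag] = Hash.new { |h, depth| h[depth] = [] } }
  depth = 0
  nodes.each_with_index do |((name, _attrs), _text), idx|
    if name.start_with?('/')
      depth -= 1                      # a closing tag returns to its opener's depth
      tags[name[1..-1]][depth] << idx # record the ending index
    else
      tags[name][depth] << idx        # record the starting index
      depth += 1                      # children sit one level deeper
    end
  end
  tags
end

nodes = [
  [['BODY', {}], ''],
  [['DIV', { 'id' => 'banner' }], ''],
  [['H1', {}], 'aXML-Motor'],
  [['/H1', {}], ''],
  [['/DIV', {}], ''],
  [['DIV', { 'id' => 'details' }], ''],
  [['DIV', { 'class' => 'gem' }], ''],
  [['/DIV', {}], ''],
  [['/DIV', {}], ''],
  [['/BODY', {}], '']
]
xml_tags = index_nodes(nodes)
xml_tags.each { |tag, depths| puts "['#{tag}'] = #{depths.inspect}" }
```

Because a node at a given depth must close before the next node at that depth opens, the flat array naturally alternates start and end indexes.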



[Step.3] Grab My Node from processed XMLNodes using XMLTags

To fetch a Tag_Name 'XYZ', look up XMLTags['XYZ'], iterate through all of its
depths and take two indexes at a time. Each pair marks the start and end node;
fetch every value between those nodes from XMLNodes.
This returns the set of values held by Tag_Name 'XYZ'.

If a path such as 'ABC.XYZ' is given, start from the top node, 'ABC' in this
case, and grab all of its index ranges. Then move on to the lower node and keep
only the indexes that fall within the index ranges of the earlier node. This
ends with the filtered set of indexes for 'XYZ' that fall under an Index-Range
of 'ABC'.

To match a Tag_Name with an attribute, check for every filtered Index-Range
whether its starting node has the required attribute as a key-value pair.

Example:

Case: Grabbing 'SPAN' with attribute class='github'
It's a single-node query, so grab all its Index-Ranges: (10,13), (15,18) and (19,22).
Here, only XMLNodes[10] and XMLNodes[15] have the required attribute.
Now, grab all data from XMLNodes[10][1] to XMLNodes[13-1][1] and from
XMLNodes[15][1] to XMLNodes[18-1][1].
Result:
['@github: <A href='http://github.com/abhishekkr/axml-motor.git'>axml-motor</A>' ,
'@github: <A href='http://github.com/abhishekkr/rubygem-xml-motor.git'>rubygem-xml-motor</A>']
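
The attribute case above can be sketched in Ruby on a smaller node list. All method names (tag_to_s, content_between, grab) are assumptions for illustration; rebuilding inner tags with single-quoted attributes is also an assumption, chosen to match the result format shown above.

```ruby
# Hypothetical helpers for illustration; none of these names come from the gem.

# Rebuild "<TAG key='value'>" from a parsed [name, attrs] pair.
def tag_to_s(name, attrs)
  parts = [name] + attrs.map { |key, value| "#{key}='#{value}'" }
  "<#{parts.join(' ')}>"
end

# Text of the opening node plus every node strictly between start and end.
def content_between(nodes, start_idx, end_idx)
  inner = nodes[(start_idx + 1)...end_idx].map do |(name, attrs), text|
    tag_to_s(name, attrs) + text
  end
  nodes[start_idx][1] + inner.join
end

# Content of every node of tag_name whose opening tag has the required attributes.
def grab(nodes, tags, tag_name, required = {})
  pairs = tags.fetch(tag_name, {}).values.flat_map { |idxs| idxs.each_slice(2).to_a }
  matching = pairs.sort.select do |start_idx, _end_idx|
    attrs = nodes[start_idx][0][1]
    required.all? { |key, value| attrs[key] == value }
  end
  matching.map { |start_idx, end_idx| content_between(nodes, start_idx, end_idx) }
end

nodes = [
  [['DIV', { 'id' => 'details' }], ''],
  [['SPAN', { 'class' => 'github' }], '@github: '],
  [['A', { 'href' => 'http://github.com/abhishekkr/axml-motor.git' }], 'axml-motor'],
  [['/A', {}], ''],
  [['/SPAN', {}], ''],
  [['SPAN', { 'class' => 'rubygems' }], '@rubygems: '],
  [['A', { 'href' => 'http://rubygems.org/gems/xml-motor.git' }], 'xml-motor'],
  [['/A', {}], ''],
  [['/SPAN', {}], ''],
  [['/DIV', {}], '']
]
tags = {
  'DIV'  => { 0 => [0, 9] },
  'SPAN' => { 1 => [1, 4, 5, 8] },
  'A'    => { 2 => [2, 3, 6, 7] }
}
github_spans = grab(nodes, tags, 'SPAN', { 'class' => 'github' })
puts github_spans
```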


Case: Grabbing 'H5.I'
Top node is 'H5', grab all its Index-Ranges: (4,7).
Second node is 'I', grab all ranges falling within those of the previous node: (5,6).
Now, grab all data from XMLNodes[5][1] to XMLNodes[6-1][1].
Result:
['no dependencies']

As the next case shows, you need not give the entire hierarchy to fetch a
descendant from the child tree of a node. Naming only the major scope nodes
works as well as providing the exact hierarchy.
Case: Grabbing 'DIV.A'
Top node is 'DIV', grab all its Index-Ranges: (1,8), (9,26) and (14,23).
Second node is 'A', grab all ranges falling within those of the previous node:
(11,12), (16,17) and (20,21).
Now, grab all data from XMLNodes[11][1] to XMLNodes[12-1][1], from
XMLNodes[16][1] to XMLNodes[17-1][1] and from XMLNodes[20][1] to
XMLNodes[21-1][1].
Result:
['axml-motor', 'rubygem-xml-motor', 'xml-motor']
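
The range filtering for dotted paths such as 'H5.I' and 'DIV.A' can be sketched in Ruby using only the XMLTags index. The names ranges_for and resolve_path are hypothetical; the XMLTags literal is copied from Step.2.

```ruby
# Hypothetical helpers (not the gem's API): narrow index ranges along a dotted path.

# All [start, end] pairs recorded for a tag, across every depth.
def ranges_for(tags, tag_name)
  tags.fetch(tag_name, {}).values.flat_map { |idxs| idxs.each_slice(2).to_a }.sort
end

# Keep, at each step, only the ranges nested inside a range of the previous step.
def resolve_path(tags, path)
  names = path.split('.')
  scope = ranges_for(tags, names.shift)
  names.each do |name|
    scope = ranges_for(tags, name).select do |start_idx, end_idx|
      scope.any? { |outer_start, outer_end| outer_start < start_idx && end_idx < outer_end }
    end
  end
  scope
end

xml_tags = {
  'BODY' => { 0 => [0, 27] },
  'DIV'  => { 1 => [1, 8, 9, 26], 2 => [14, 23] },
  'H1'   => { 2 => [2, 3] },
  'H5'   => { 2 => [4, 7] },
  'I'    => { 3 => [5, 6], 2 => [24, 25] },
  'SPAN' => { 2 => [10, 13], 3 => [15, 18, 19, 22] },
  'A'    => { 3 => [11, 12], 4 => [16, 17, 20, 21] }
}
p resolve_path(xml_tags, 'H5.I')
p resolve_path(xml_tags, 'DIV.A')
```

Containment is tested on index ranges only, so intermediate levels such as 'SPAN' can be skipped in the path, which is the behaviour the 'DIV.A' case describes.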

Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 

XML-Motor

  • 1. aXML-Motor: XML Document Parsing Algorithm, version 2011.11.04
    Abhishek Kumar ~=ABK=~
    http://github.com/abhishekkr
    http://www.twitter.com/abionic

    Algorithm, Ruby Source and Gem:
    [axml-motor] @GitHub: http://github.com/abhishekkr/axml-motor.git
    rubygem's src @GitHub: http://github.com/abhishekkr/rubygem_xml_motor.git
    gem install @RubyGems: http://rubygems.org/gems/xml-motor

    Algorithm Walk-through
    Example XML Content:
      <BODY>
        <DIV id='banner'>
          <H1>aXML-Motor</H1>
          <H5>A new algorithm based compact XML Parser with <I>no dependencies</I>.
          </H5>
        </DIV>
        <DIV id='details'>
          <SPAN class='github'>@github:
            <A href='http://github.com/abhishekkr/axml-motor.git'>
              axml-motor</A>
          </SPAN>
          <DIV class='gem'>
            <SPAN id='source' class='github'>@github:
              <A href='http://github.com/abhishekkr/rubygem-xml-motor.git'>rubygem-xml-motor</A>
            </SPAN>
            <SPAN class='rubygems'>@rubygems:
              <A href='http://rubygems.org/gems/xml-motor.git'>xml-motor</A>
            </SPAN>
          </DIV>
          <I> It's a new algorithm implemented to build a real compact parser (v0.0.2 has less than 200 ruby source code lines) without any dependencies.</I>
        </DIV>
      </BODY>
  • 2. [Step.1] Split the XML Content

    (1.1) Split by '<'
      store as XMLNodes
      [0] BODY>
      [1] DIV id='banner'>
      [2] H1>aXML-Motor
      [3] /H1>
      [4] H5>A new algorithm based compact XML Parser with
      [5] I>no dependencies
      [6] /I>.
      [7] /H5>
      [8] /DIV>
      [9] DIV id='details'>
      [10] SPAN class='github'>@github:
      [11] A href='http://github.com/abhishekkr/axml-motor.git'>axml-motor
      [12] /A>
      [13] /SPAN>
      [14] DIV class='gem'>
      [15] SPAN id='source' class='github'>@github:
      [16] A href='http://github.com/abhishekkr/rubygem-xml-motor.git'>rubygem-xml-motor
      [17] /A>
      [18] /SPAN>
      [19] SPAN class='rubygems'>@rubygems:
      [20] A href='http://rubygems.org/gems/xml-motor.git'>xml-motor
      [21] /A>
      [22] /SPAN>
      [23] /DIV>
      [24] I> It's a new algorithm implemented to build a real compact parser (v0.0.2 has less than 200 ruby source code lines) without any dependencies.
      [25] /I>
      [26] /DIV>
      [27] /BODY>

    (1.2) Split the previous step 1.1 result by '>'
      update XMLNodes
      [0] ['BODY', '']
      [1] ['DIV id='banner'', '']
      [2] ['H1', 'aXML-Motor']
      [3] ['/H1', '']
      [4] ['H5', 'A new algorithm based compact XML Parser with ']
      [5] ['I', 'no dependencies']
      [6] ['/I', '.']
      [7] ['/H5', '']
      [8] ['/DIV', '']
      [9] ['DIV id='details'', '']
      [10] ['SPAN class='github'', '@github: ']
      [11] ['A href='http://github.com/abhishekkr/axml-motor.git'', 'axml-motor']
  • 3.
      [12] ['/A', '']
      [13] ['/SPAN', '']
      [14] ['DIV class='gem'', '']
      [15] ['SPAN id='source' class='github'', '@github: ']
      [16] ['A href='http://github.com/abhishekkr/rubygem-xml-motor.git'', 'rubygem-xml-motor']
      [17] ['/A', '']
      [18] ['/SPAN', '']
      [19] ['SPAN class='rubygems'', '@rubygems: ']
      [20] ['A href='http://rubygems.org/gems/xml-motor.git'', 'xml-motor']
      [21] ['/A', '']
      [22] ['/SPAN', '']
      [23] ['/DIV', '']
      [24] ['I', 'It's a new algorithm implemented to build a real compact parser (v0.0.2 has less than 200 ruby source code lines) without any dependencies.']
      [25] ['/I', '']
      [26] ['/DIV', '']
      [27] ['/BODY', '']

    (1.3) Split the first element of each entry by space/tab, mark the first part as tag_name, and split the latter part by '=', iterating to build a key=>value pair per attribute
      update XMLNodes
      [0] [ ['BODY', {}], '' ]
      [1] [ ['DIV', {'id'=>'banner'}], '' ]
      [2] [ ['H1', {}], 'aXML-Motor' ]
      [3] [ ['/H1', {}], '' ]
      [4] [ ['H5', {}], 'A new algorithm based compact XML Parser with ' ]
      [5] [ ['I', {}], 'no dependencies' ]
      [6] [ ['/I', {}], '.' ]
      [7] [ ['/H5', {}], '' ]
      [8] [ ['/DIV', {}], '' ]
      [9] [ ['DIV', {'id'=>'details'}], '' ]
      [10] [ ['SPAN', {'class'=>'github'}], '@github: ' ]
      [11] [ ['A', {'href'=>'http://github.com/abhishekkr/axml-motor.git'}], 'axml-motor' ]
      [12] [ ['/A', {}], '' ]
      [13] [ ['/SPAN', {}], '' ]
      [14] [ ['DIV', {'class'=>'gem'}], '' ]
      [15] [ ['SPAN', {'id'=>'source', 'class'=>'github'}], '@github: ' ]
      [16] [ ['A', {'href'=>'http://github.com/abhishekkr/rubygem-xml-motor.git'}], 'rubygem-xml-motor' ]
      [17] [ ['/A', {}], '' ]
      [18] [ ['/SPAN', {}], '' ]
      [19] [ ['SPAN', {'class'=>'rubygems'}], '@rubygems: ' ]
      [20] [ ['A', {'href'=>'http://rubygems.org/gems/xml-motor.git'}], 'xml-motor' ]
      [21] [ ['/A', {}], '' ]
      [22] [ ['/SPAN', {}], '' ]
      [23] [ ['/DIV', {}], '' ]
      [24] [ ['I', {}], 'It's a new algorithm implemented to build a real compact parser (v0.0.2 has less than 200 ruby source code lines) without any dependencies.' ]
  • 4.
      [25] [ ['/I', {}], '' ]
      [26] [ ['/DIV', {}], '' ]
      [27] [ ['/BODY', {}], '' ]

    We now have the XMLNodes in the form we wanted. The next step is to index them.

    [Step.2] Index the processed XMLNodes

    Three things are involved in indexing XMLNodes:
      Tag_Name: iterating through all elements of XMLNodes, every element has three components (tag name, attributes and content); the tag name is available at XMLNodes.all[ [TAG_NAMES, *], *]
      Depth: the level of the node in the XML node tree, starting from '0'.
      Index: the position of the node in the XMLNodes array.

    How to index-ify?
    There is one entry per Tag_Name, holding a Hash keyed by the 'Depth' at which the tag is found. Each value is an array of 2*number_of_nodes entries (the starting and ending 'Index' of each node at that depth).

    Example: from the XMLNodes above, ['DIV'] would hold
      {1=>[1,8, 9,26], 2=>[14,23]}
    because Tag_Name 'DIV' has the 'Index' pairs 1,8 and 9,26 at 'Depth' 1, and the 'Index' pair 14,23 at 'Depth' 2.

    Indexed XMLTags for the processed XMLNodes above:
      calculated XMLTags
      ['BODY'] = {0=>[0,27]}
      ['DIV'] = {1=>[1,8, 9,26], 2=>[14,23]}
      ['H1'] = {2=>[2,3]}
      ['H5'] = {2=>[4,7]}
      ['I'] = {3=>[5,6], 2=>[24,25]}
      ['SPAN'] = {2=>[10,13], 3=>[15,18, 19,22]}
      ['A'] = {3=>[11,12], 4=>[16,17, 20,21]}

    [Step.3] Grab My Node from processed XMLNodes using XMLTags

    Suppose I aim for a Tag_Name 'XYZ': look up XMLTags['XYZ'], iterate through all of its depths and extract two indexes at a time. Each pair of indexes marks the start and end node; fetch all values within those nodes from XMLNodes.
  • 5. This returns the set of values held by Tag_Name 'XYZ'.

    Suppose a tree form such as 'ABC.XYZ' is provided. Start from the top node, 'ABC' in this context, and grab all its Index-Ranges. Then move on to the lower node and keep only the indexes that fall within the Index-Ranges provided by the earlier node. This ends with the filtered set of indexes for 'XYZ' that fall under the Index-Range of 'ABC'.

    To check for a Tag_Name with an attribute, for every filtered Index-Range, check whether the starting node has the required attribute among its key-value pairs.

    Example:

    Case: Grabbing 'SPAN' with attribute "class='github'"
      It's a single node; grab all its Index-Ranges (10,13), (15,18) and (19,22).
      Here, only XMLNodes[10] and XMLNodes[15] have the required attribute.
      Now, grab all data from XMLNodes[10][1] to XMLNodes[13-1][1] and from XMLNodes[15][1] to XMLNodes[18-1][1].
      Result:
        ['@github: <A href='http://github.com/abhishekkr/axml-motor.git'>axml-motor</A>',
         '@github: <A href='http://github.com/abhishekkr/rubygem-xml-motor.git'>rubygem-xml-motor</A>']

    Case: Grabbing 'H5.I'
      Top node is 'H5'; grab all its Index-Ranges: (4,7).
      Second node is 'I'; grab all ranges falling within the ranges from the previous node: (5,6).
      Now, grab all data from XMLNodes[5][1] to XMLNodes[6-1][1].
      Result: ['no dependencies']

    As the next case shows, you need not give the entire hierarchy to fetch a descendant from the child tree of a node. Giving just the major scope nodes works as well as providing the exact hierarchy.

    Case: Grabbing 'DIV.A'
      Top node is 'DIV'; grab all its Index-Ranges: (1,8), (9,26) and (14,23).
      Second node is 'A'; grab all ranges falling within the ranges from the previous node: (11,12), (16,17) and (20,21).
      Now, grab all data from XMLNodes[11][1] to XMLNodes[12-1][1], from XMLNodes[16][1] to XMLNodes[17-1][1] and from XMLNodes[20][1] to XMLNodes[21-1][1].
      Result: ['axml-motor', 'rubygem-xml-motor', 'xml-motor']
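
    The three steps of the walk-through can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions, not the code of the xml-motor gem: the method names (split_nodes, index_tags, ranges_for, rebuild, grab) are hypothetical, attributes are read with a regular expression rather than by splitting on '=', and the input is assumed to be well formed, with no self-closing tags, comments or CDATA. The sample XML and its 'u1'/'u2' href values are placeholders.

```ruby
# Step 1: split by '<', then by '>', then separate tag name and attributes.
# Each node becomes [[tag_name, {attr => value}], content].
def split_nodes(xml)
  xml.split('<').drop(1).map do |chunk|
    tag_part, content = chunk.split('>', 2)
    name, attr_text = tag_part.strip.split(/\s+/, 2)
    attrs = {}
    # Assumption: a regex scan stands in for the '=' split described above.
    attr_text.to_s.scan(/([\w:-]+)\s*=\s*['"]([^'"]*)['"]/) { |k, v| attrs[k] = v }
    [[name, attrs], content.to_s]
  end
end

# Step 2: per tag name, a Hash of depth => flat array of start/end indexes.
def index_tags(nodes)
  tags = {}
  depth = 0
  nodes.each_with_index do |((name, _attrs), _content), idx|
    if name.start_with?('/')
      depth -= 1
      ((tags[name[1..-1]] ||= {})[depth] ||= []) << idx
    else
      ((tags[name] ||= {})[depth] ||= []) << idx
      depth += 1
    end
  end
  tags
end

# All [start, end] index pairs of a tag, across every depth.
def ranges_for(tags, name)
  tags.fetch(name, {}).values.flat_map { |idxs| idxs.each_slice(2).to_a }
end

# Data from nodes[s][1] up to nodes[e - 1][1], re-serialising inner tags.
def rebuild(nodes, s, e)
  inner = nodes[(s + 1)...e].map do |(name, attrs), content|
    attr_text = attrs.map { |k, v| " #{k}='#{v}'" }.join
    "<#{name}#{attr_text}>#{content}"
  end
  nodes[s][1] + inner.join
end

# Step 3: walk a dotted path such as 'DIV.A', keeping only the ranges that
# fall inside a range of the previous tag, then filter by attributes.
def grab(nodes, tags, path, attrs = {})
  names = path.split('.')
  ranges = ranges_for(tags, names.first)
  names.drop(1).each do |name|
    ranges = ranges_for(tags, name).select do |s, e|
      ranges.any? { |ps, pe| ps < s && e < pe }
    end
  end
  ranges = ranges.select { |s, _e| attrs.all? { |k, v| nodes[s][0][1][k] == v } }
  ranges.sort.map { |s, e| rebuild(nodes, s, e) }
end

xml = "<BODY><DIV id='banner'><H1>aXML-Motor</H1></DIV>" \
      "<DIV id='details'><SPAN class='github'>@github: <A href='u1'>axml-motor</A></SPAN>" \
      "<DIV class='gem'><SPAN class='rubygems'>@rubygems: <A href='u2'>xml-motor</A></SPAN>" \
      "</DIV></DIV></BODY>"
nodes = split_nodes(xml)
tags  = index_tags(nodes)
p tags['DIV']                                          # {1=>[1, 4, 5, 16], 2=>[10, 15]}
p grab(nodes, tags, 'DIV.A')                           # ["axml-motor", "xml-motor"]
p grab(nodes, tags, 'SPAN', { 'class' => 'github' })   # ["@github: <A href='u1'>axml-motor</A>"]
```

    The depth counter in index_tags is what lets 'DIV.A' work without the full hierarchy: a range is kept whenever it lies strictly inside any range of the preceding tag, at whatever depth.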