SlideShare uma empresa Scribd logo
1 de 49
ANSI SQL Transparent Dynamic
Multipath Hierarchical Structured
       Data Processing For
 Relational, XML, & Legacy Data

    Michael M David and Lee Fesperman
  Advanced Data Access Technologies, Inc.
             www.adatinc.com

   The example results shown in this presentation can be reproduced
   on our online interactive prototype at www.adatinc.com/demo.html
Previous Database History
Hierarchical processing popular 40 years ago
  Used user navigational processing
Multipath hierarchical query processing used
  Used nonprocedural navigationless processing
  Made possible by automatic semantic processing
Relational processing replaces hierarchical
  Also uses nonprocedural navigationless processing
  But more flexibility with data independence from join
Hierarchical processing technology forgotten
  No Internet to store hierarchical processing knowledge
Hierarchical structures popular again with XML
  User procedural navigation is back again
  Where is navigationless access for structured access?2
SQL/XML Industry Problems
Only proprietary vendor solutions
  All vendor solutions incompatible with each other
  Integration solutions are XML centric and procedural
No satisfactory XML integration solution
  Hierarchical data value loss
  XML not fully integrated into SQL
  XML structured data processed as semistructured
No hierarchical processing standard
  Invalid hierarchical processing result is possible
  Invalid hierarchical structure result is possible
Markup and structured data processed the same
  Structured data needs to be processed differently!
                                                       3
SQL/XML Structured Data
    Processing Problems
Limited to linear single path processing
  Query data selection limitations
  Relational join single path mindset
Requires user database navigation
  Loss of dynamic hierarchical processing
  Limits complex hierarchical processing
  Prevents seamless & transparent processing
Requires specific vendor user training
  Requires XML and vendor trained users
                                           4
A Working ANSI SQL XML Solution
 Limits processing to only structured data
   Allows unambiguous nonprocedural querying
   Enables navigationless schema-free operation
 Only performs hierarchical operations
   Only uses hierarchical Left Outer Join operation
   Naturally supports full hierarchical structures
   Allows hierarchical structure-aware operation
 Supports inherent hierarchical processing
   This solves relational/XML data integration
   This also solves SQL to XML system mapping
   Transparent multipath hierarchical processing   5
Structured Data Automatic
        Processing Benefits
SQL transparent XML integration
  ANSI SQL-92 syntax and semantics, not XML centric
  Nonprocedural and navigationless, Interactive
  No knowledge of structure necessary, schema-free
Multipath nonlinear hierarchical processing
  Hierarchically accurate results automatically
  Dynamic hierarchical processing optimization
  Dynamic output uses hierarchical result structure
SQL hierarchical views fully functional
  Global views with no overhead and maximum reuse
  Global hierarchical queries now possible
  Hierarchical data filtering & XML keyword search    6
Current XML Hierarchical Data
XML Was Created as a Markup Language
Unstructured Mapped to Semistructured
Structured Vs. Semistructured Data
Same as Fixed Vs. Fuzzy Meaning
Why Semistructured Requires Navigation
   Structured       Semistructured
      Data             Data            The multiple B
                                       nodes in this
       A                A              structure make
      B C              B C             it ambiguous
                                       for querying
       D E              B D            without user
  Unambiguous         Ambiguous        navigation.
                                                        7
  Query Structure    Query Structure
Why Hierarchical Data
     structures are Powerful
Automatic Data and Path Reuse
Extending Path Increases Data Value
Adding Paths Increase Data Value Further
Hierarchical XML Becoming Ubiquitous
Data Structures are Unambiguous
           Data/Path       Creating more
                           value than is
       1) A/B          A   collected,
       2) A/C     B C      automatic
       3) A/C/D            nonlinear data
       4) A/C/E    D E     value increase.   8
Why Hierarchical Structured
   Data Processing is Powerful
Dynamic Multipath Query Combinations
Multipath Data Value Grows Continually
Every Node is Related to Every Other Node
No. of Possible Queries Becomes Unlimited
Automatic Processing = Unlimited Complexity
          Increasingly Complex   Naturally utilizes the inherent
            Multipath Queries    hierarchical semantics in
Mview                            multipath queries.
        1) SELECT B,C FROM Mview
 A      2) SELECT B,D FROM Mview WHERE A=2
B C     3) SELECT A,B FROM Mview WHERE E=5
        4) SELECT B,C FROM Mview WHERE D=1 AND E=5
 D E    5) SELECT D,E FROM Mview WHERE B=3 OR C=4            9
Hierarchical Data Structure Types
XML is self defining contiguous and nested
  No foreign key relation needed to make structure
IBM’s IMS database is discontinuous
  Internally linked
Structured VSAM is contiguous
  With hierarchical level & data occurrence counts
Flat tabular objects hierarchically modeled
  Relational tables, flat files, spread sheets

All support same hierarchical operation and principles
                                                     10
Hierarchical Structured Data Makeup
  Multiple Node Types
  Nodes Support Multiple Data Occurrences
  Naturally Built With 1 to M Relationships
  Hierarchical Data Preservation Operation-
  Naturally Support Variable Length Paths
        Co                 C01              Not to be confused with
            1
                                 Co2        function driven single node
            M                               external hierarchical data
                                   Emp3     structure processing.
        Emp             Emp1
        1       1          Emp2             Suitable for IMS, Structured
                                 Proj3      VSAM, XML, COBOL FD and
    M               M
                                    Proj4   hierarchically modeled
 Dpnd           Proj    Dpnd1 Proj2         Relational and flat data. 11
SQL Hierarchical Structured Data
 Can Be Stored in Relational Rowsets
 Multiple Node Types in Relational Rowset
 Node Multiple Data Occurrences in Rowset
 Multiple Pathways in Rowset
 Variable Length Pathways in Rowset

   C01                Relational Rowset
         Co2                                     Hierarchical
                     Co   Emp    Dpnd    Proj    structure
           Emp3     Co1   Emp1   Dpnd1           preserved
Emp1                                             in rowset,
   Emp2             Co1   Emp2           Proj2
                                                 in and back
         Proj3
                    Co2   Emp3           Proj3   out again.
            Proj4
Dpnd1 Proj2         Co2   Emp3           Proj4             12
Mapping Relational SQL to
            Hierarchical XML
 Input View   Conceptual Hierarchical        Result
                      Query
     A                                         A      Shown on this slide:
                  SELECT A.a, D.d, E.e
 BB C             FROM GlobalView            D E 1. Conceptual Level
                  WHERE B.b=“B1”
   D EE                                 XML           2. Multipath Processing
X D
                                       Output         3. Hierarchical Filtering
                                                      4. Dynamic Data Select
    Working Set                     Result Set
                                                      5. Access Optimization
A   B    C    D     E            A       D     E
                                                      6. Global View Support
A1 B1    C1 D1 E1              A1     D1      E1
A1 B1    C1 D2 E1              A1     D2      E1
                                                      7. Schema-free Query
A1 B1    C1 D1 E2              A1     D1      E2      8. Node Promotion
A1 B1    C1 D2 D2              A1     D2      D2      9. Auto Output Format
A1 B2    C1 D1 E1                                                          13
Relational Logical Hierarchical View
 CREATE VIEW EmpView AS
 SELECT * FROM Emp                             Emp
                       Hierarchical SQL
 LEFT JOIN Dpnd        data modeling and
 ON EmpID=DpndEmpID processing            Dpnd          Eaddr
                       semantics defined.
    AND DpndCode=’D’
 LEFT JOIN Eaddr ON EmpCustID=EaddrCustID;

 SELECT EmpID, DpndID, <root>
                        <emp empid="Emp01">
        EaddrID          <dpnd dpndid="Dpnd01"/>
 FROM EmpView            <eaddr eaddrid="Addr01"/>
                                  </emp>
 Logical hierarchical SQL         <emp empid="Emp02">
 view and semantics
                                   <eaddr eaddrid="Addr03"/>
 processed directly by
 relational engine making it
                                  </emp>
 operate fully hierarchically.   </root>
                                                           14
XML Physical Hierarchical View
CREATE XML CustView
                                      CREATE VIEW CustView AS
Cust(                                 SELECT * FROM Cust
 CustID Char(8),                      LEFT JOIN Invoice
 CustStoreID Char(8)),                ON CustID=InvID
Invoice(                              LEFT JOIN ADDR
                      Converted to
 InvID Char(8),
                     SQL CustView
                                      ON CustID=AddrID;
 InvCustID Char(8),
                                     SELECT Cust, Invoice, Addr
 InvStatus Char(8)) Parent Cust,
Addr(                                FROM CustView
 AddrID Char(8),                      <root>
 AddrCustID Char(8),                    <cust custid="Cust01">
 AddrState Char(8)) Parent Cust          <invoice invid="Inv01"/>
                                         <invoice invid="Inv02"/>
                                         <addr addrid="Addr01"/>
               Cust

                             This SQL created outer join view syntax is
     Invoice          Addr   used as a hierarchical map of the physical
                                                                          15
                             IMS CustView for seamless operation.
Joining Heterogeneous Views
SELECT EmpID, DpndID, EaddrID,
       CustID InvID, AddrID    <root>
FROM EmpView                    <emp empid="Emp01">
LEFT JOIN CustView               <dpnd dpndid="Dpnd01"/>
                                 <eaddr eaddrid="Addr01"/>
ON EmpCustID=CustID
                                 <cust custid="Cust01">
                                             <invoice invid="Inv01"/>
  Hierarchical      Heterogeneous            <invoice invid="Inv02"/>
 Structure Join      Logical View            <addr addrid="Addr01"/>
      Emp              of Result            </cust>
                                           </emp>
Dpnd      Eaddr         Emp                <emp empid="Emp02">
                                            <eaddr eaddrid="Addr03"/>
     Cust        Dpnd Eaddr     Cust        <cust custid="Cust03">
                                             <addr addrid="Addr03"/>
Invoice   Addr           Invoice   Addr
                                            </cust>
Emp logical and Cust physical views are    </emp>
both hierarchically modeled in the same   </root>
way in SQL making for a seamless,                                 16
unified and logical hierarchical join.
Logical Hierarchical Structures Offer
 Flexible Relational/XML Integration
 Logical      Physical                Data Model:             Logical
                          Data Type
 Tables         XML                      Natural              hierarchical
                          Mapping        Common               structures
                                      Features:               solve
  Common Outer Join
                          Modeling       Abstraction          hierarchical
    Data Modeling
                                         Consistency          fixed structure
                    Log. and Phy.        Seamless             problems.
                    Structures have   Capabilities:
        Emp
                    same hierarchy       Data Integration
  Dpnd      Eaddr   op principles        Solves Rel and Hier Problems
                                         Separates Structure from Data
                      Heterogeneous      Hierarchical Structure Flexibility
       Cust
                      Logical            Offers Flexibility to Fixed Structures
  Invoice   Addr      Structure

                                                                          17
Heterogeneous Data Structure
 Mashup Uses Linking Below Root
SELECT EmpID, DpndID, InvID,    <root>
       AddrID, EaddrID, CustID   <emp empid="Emp01">
                                  <dpnd dpndid="Dpnd01"/>
FROM EmpView LEFT JOIN
                                  <eaddr eaddrid="Addr01">
     CustView ON EaddrID=AddrID    <cust custid="Cust01">
WHERE CustID <> ”Cust02”            <invoice invid="Inv01"/>
                                              <invoice invid="Inv02"/>
Link Below Root       Result Structure        <addr addrid="Addr01"/>
                                             </cust>
      Emp                   Emp
                                            </eaddr>
 Dpnd     Eaddr        Dpnd     Eaddr      </emp> ...</root>

      Cust                      Cust       Mashups allow linking anywhere
                                           into the lower level structure. We
Invoice   Addr              Invoice Addr
                                           determined this was valid and what
Dashed arrow is linkage.                   the new semantic structure is.
Solid arrow is new structure.                                             18
Association Table Use
  SELECT EmpID, DpndID, EaddrID, CustID, InvoiceID
          AddrID, IntersectData
  FROM EmpView
  LEFT JOIN AssociationTable On DpndID=DpndX
  LEFT JOIN CustView ON CustID=CustX

                             Result
      Emp      EmpView      Structure Emp           Association Table
 Dpnd      Eaddr Association       Dpnd             Capabilities added:
                                            Eaddr
                   Table                            • External Relationships
DpndX CustX IntersectData                           • M to M Relationships
                                   IntersectData
                                                    • Intersecting Data
                                                    • Can be Transparent
        Cust   CustView             Cust            • Retains Hier Structure
 Invoice    Addr               Invoice   Addr                          19
Hierarchical Data Filtering, Auto
Output & Structure-free Processing
 Input View   Conceptual Hierarchical   Result
                                                    Automatic
                      Query
     A                                    A       Structured XML
                SELECT A.a, D.d, E.e
                                                 Output Formatting
 BB C           FROM GlobalView         D E
                WHERE B.b=“B1”
   D EE
X D

 Hierarchical WHERE clause global data filtering
   Cousin nodes like node C above are filtered from B node
   This makes this a more internally complex multipath query
 Structure-free processing
   Navigationless XML access, no need to know structure
 Automatic structure-aware output formatting
                                                              20

   Result structure known from outer join syntax modeling
Optimization, Global Views
           and Node Promotion
 Input View   Conceptual Hierarchical   Result
                                                 This is a conceptual
                      Query
     A                                    A      query where the input
                SELECT A.a, D.d, E.e             structure is not known
 BB C           FROM GlobalView         D E
                                                 and the output adapts
                WHERE B.b=“B1”
   D EE                                          to the dynamic result.
X D
 Hierarchical optimization can remove from access -
     Unreferenced data nodes not on path to referenced data
     Dynamically controlled by SQL’s variable SELECT list
 Optimization makes all views global views
     Because they have no overhead for unreferenced fields
 Node promotion closes around unselected nodes
                                                                   21
     This happens in relational processing too
Global Query Uses Global Views
                           EmpCust              Global views allow entire
         Global Query
                              Emp               structures to be defined
                                                and queried with no
SELECT *         Dpnd        Eaddr   Cust       overhead, more user
FROM EmpCust                                    friendly. Global query
                               Invoice   Addr   allows hierarchical
WHERE InvStatus=‘O’
                                                filtering of entire view.
              Global Data Filter
<root>
 <emp empid="Emp01" empstoreid="Store01" empcustid="Cust01"
       empstatus="F">
  <dpnd dpndid="Dpnd01" dpndempid="Emp01" dpndcode="D"/>
  <eaddr eaddrid="Addr01" eaddrcustid="Cust01" eaddrstate="CA"/>
  <cust custid="Cust01" custstoreid="Store01">
   <invoice invid="Inv02" invcustid="Cust01" invstatus="O"/>
   <addr addrid="Addr01" addrcustid="Cust01" addrstate="CA"/>
  </cust>
 </emp>
</root>                                                              22
Node Promotion and Nested View
 CREATE VIEW EmpCust AS
 SELECT *                        Hierarchical views can be nested.
 FROM EmpView LEFT JOIN CustView Views can contain views.
      ON EmpCustID=CustID

  SELECT EmpID, DpndID,                 <root>
         InvID, AddrID                   <emp empid="Emp01">
  FROM EmpCust                            <dpnd dpndid="Dpnd01"/>
                                          <invoice invid="Inv01"/>
                                          <invoice invid="Inv02"/>
   EmpCust               Node             <addr addrid="Addr01"/>
      Emp              Promotion         </emp>
                                         <emp empid="Emp02">
Dpnd      Eaddr         Emp
                                          <addr addrid="Addr03"/>
                  Dpnd Invoice   Addr    </emp>
     Cust                               </root>
Invoice   Addr       Not SELECTed                              23
Node Promotion Override and XML
Format Change    <emp>
                  <empid>Emp01</empid>
 SELECT EmpID, DpndID,                     <dpnd>
                                            <dpndid>Dpnd01</dpndid>
        InvID, AddrID                      </dpnd>
 FROM EmpCust                              <cust>
 FOR XML ELEMENT KEEP NODE                  <invoice>
                                             <invid>Inv01</invid>
    EmpCust             No Node
                                            </invoice>
                                            <invoice>
      Emp              Promotion             <invid>Inv02</invid>
                         Emp                </invoice>
 Dpnd     Eaddr
                                            <addr>
                    Dpnd    Cust             <addrid>Addr01</addrid>
     Cust
                                            </addr>
Invoice   Addr        Invoice      Addr    </cust>
                                          </emp>
Without node promotion, unselected        <emp>
nodes are output empty. XML output         <empid>Emp02</empid>
format was changed to Element style.                             24
Data Structure Mashup With Node
   Promotion = Data Virtualization
 SELECT EmpID, DpndID,                     <root>
        InvID, AddrID                       <emp empid="Emp01">
 FROM EmpView                                <dpnd dpndid="Dpnd01"/>
      LEFT JOIN CustView                     <invoice invid="Inv01"/>
                                             <invoice invid="Inv02"/>
       ON EaddrID=AddrID
                                             <addr addrid="Addr01"/>
Link Below Root                             </emp>
                    Result Output with
                                            <emp empid="Emp02">
      Emp            Node Promotion
                                             <addr addrid="Addr03"/>
 Dpnd     Eaddr            Emp              </emp>
                                           </root>
                  Dpnd Invoice     Addr
     Cust

Invoice   Addr                             Mashups with node promotion gives
                                           aggregated result of the data. This has
Dashed boxes are unselected, not output.   same effect as data virtualization. 25
Multipath Query Processing and
 its Required LCA Processing
Multipath query references multiple pathways-
  and uses a WHERE data filtering operation
  For example: Selecting data based on data in another path
  This requires a special processing using LCA
Lowest Common Ancestor (LCA) Node
  The LCA node is the lowest common ancestor node -
  Located between two pathway node points in the structure
  Used to keep the hierarchical query result meaningful
Two types of SQL multipath LCA usage
  SELECT with WHERE clause referencing two different paths
  Compound WHERE clause referencing two different paths
     This LCA processing solves the XML Keyword Search problem
  These two uses of LCA can be combined causing nesting
  Each SELECT data item can have its own LCA                     26
Multipath LCA Query Logic for
     Single SELECT Data
  SELECT DpndID                      Lowest Common Ancestor (LCA)
  FROM EmpCust                       processing for the SQL SELECT
  WHERE AddrID=’Addr01’              clause will automatically process
                                     multiple LCA nodes when multiple
                                     specified output data is located
        EmpCust                      across different pathways.

         Emp         SELECT LCA

 Dpnd Eaddr      Cust              <root>
                                    <dpnd dpndid="Dpnd01"/>
           Invoice   Addr
                                   </root>
SELECT data types Dpnd and
Invoice generate different LCAs.
                                                                    27
Multipath LCA Query Logic for
    Multiple SELECT Data
  SELECT DpndID, InvID               Lowest Common Ancestor (LCA)
  FROM EmpCust                       processing for the SQL SELECT
  WHERE AddrID=’Addr01’              clause will automatically process
                                     multiple LCA nodes when multiple
                                     specified output data is located
        EmpCust                      across different pathways.

         Emp         SELECT LCAs
                                   <root>
 Dpnd Eaddr      Cust
                                    <dpnd dpndid="Dpnd01"/>
           Invoice    Addr          <invoice invid=“Inv01"/>
                                    <invoice invid=“Inv02"/>
SELECT data types Dpnd and
Invoice generate different LCAs.   </root>
                                                                    28
Compound WHERE Clauses also
         Require Their Own LCA
                                       Lowest Common Ancestor (LCA) node
 SELECT DpndID                         processing for multipath queries is
 FROM EmpCust                          necessary. We found it working in SQL
 WHERE InvID=’Inv01’                   naturally on both the SELECT and
                                       WHERE operations. Academic projects
 AND AddrID=’Addr01’                   are currently researching how to add it to
                                       XQuery. LCA is the lowest node
       EmpCust                         between node points on separate paths.

        Emp        SELECT LCA

                           WHERE LCA
                                         <root>
Dpnd Eaddr     Cust
                                          <dpnd dpndid="Dpnd01"/>
         Invoice    Addr                  <dpnd dpndid="Dpnd03"/>
This LCA example is actually a           </root>
combined LCA and nested LCA example                                          29
Data Driven
       Variable Structure Generation
 SELECT EmpID, DpndID, InvID,          <root>
        AddrID, EaddrID,                <emp empid="Emp01"
        CustID, EmpStatus                      empstatus="F">
 FROM EmpView                            <dpnd dpndid="Dpnd01"/>
      LEFT JOIN CustView                 <cust custid="Cust01">
                                          <invoice invid="Inv01"/>
          ON EmpCustID=CustID
                                          <invoice invid="Inv02"/>
          AND EmpStatus=“F”               <addr addrid="Addr01"/>
                                         </cust>
Join Structure    Variable Structure    </emp>
                                        <emp empid="Emp02"
      Emp             Emp                    empstatus="">
Dpnd      Eaddr                         </emp>
                  Dpnd
                                       </root>
     Cust                Cust           Current EmpStatus data value
Invoice   Addr                          controls if this CustView data
                  Invoice       Addr
                                        occurrence expands or not.       30
Variable Structure Generation
       Using Multiple Choice
SELECT EmpID, DpndID, InvID,        Variable Structure
       AddrID, EaddrID,        <root>
       CustID, EmpStatus        <emp empid="Emp01"
FROM EmpView                           empstatus="F">
     LEFT JOIN CustViewF         <dpnd dpndid="Dpnd01"/>
         ON EmpCustID=CustID    Insert Full Time View Here
         AND EmpStatus=“F”      </emp>
                                <emp empid="Emp02"
     LEFT JOIN CustViewP
                                     empstatus=“P">
        ON EmpCustID=CustID
                                Insert Part Time View Here
        AND EmpStatus=“P”
                                </emp>
                               </root>
            Emp
                               Current EmpStatus data value
  Dpnd   CustViewF CustViewP   controls which view expands.
                               Control value is up or down path.   31
Data Structure Transformation
Two Basic Types of Data Structure Transformation
  Restructuring: uses data relationships in the data
  Restructuring: Uses only the structure semantics in the data
Restructuring
  Used to bring out new data relationships
  Also removes data and nodes
  New structure is secondary, driven new relationships desired
Reshaping
  Used to reshape data structure to a desired new structure
  No data relationships dependency
  Any-to-any data structure transformation is possible
  Reshaping is driven by structures natural semantics
SQL transformation operation assures accuracy
  Restructuring and Reshaping operations can be combined         32
Restructure Transformation
SELECT X.EmpID, X.DpndID, Y.InvID, X.AddrID
FROM EmpCust Y
LEFT JOIN EmpCust X ON Y.InvCustID=X.EmpCustID
                            <root>
                             <invoice invid="Inv01"/>
EmpCust    X.Emp              <emp empid="Emp01">
                               <dpnd dpndid="Dpnd01"/>
  X.Dpnd   Eaddr X.Cust        <addr addrid="Addr01"/>
                              </emp>
           Y.Invoice Addr
                             </Invoice> ....</root>

 Result Invoice              Restructuring SQL transformation:
                             1) Takes structure apart
           Emp               2) Disregards unwanted portions
                             3) Recombines it differently --
    Dpnd     Addr            4) Uses different data relationships
                             5) This introduces new semantics       33
Semantic Any-to-Any Structure
      Reshaping Transformation
SELECT X.DpndID, Y.EmpID, Y.EaddrID
FROM EmpView X
                       <root>
LEFT JOIN EmpView Y     <dpnd dpndid="Dpnd01">
ON X.DpndID=Y.DpndID      <emp empid="Emp01">
                                  <eaddr eaddrid="Addr01"/>
                    Result       </emp>
        EmpView    Structure    </dpnd>
    X    Emp        Dpnd       </root>
 Dpnd      Eaddr
                    Emp        Reshaping uses only the semantics of the
                               data structure to transform it into the desired
Y        Emp        Eaddr      structure. No new data relationships are
                               needed so this method can always be used.
Dpnd       Eaddr
                               Linking below the root is utilized.         34
Polymorphic Any-to-Any Reshaping
SELECT X.DpndID, Y.EaddrID, Z.EmpID
FROM EmpView X
LEFT JOIN EmpView Y ON X.DpndID=Y.DpndID
LEFT JOIN EmpView Z ON Y.EaddrID=Z.EaddrID
                                  <root>
         EmpView     Target
                                   <dpnd dpndid="Dpnd01">
                    Structure
          Emp                       <eaddr eaddrid="Addr01">
     X               Dpnd
                                     <emp empid="Emp01"/>
 Dpnd       Eaddr                   </eaddr>
                     Eaddr
                                   </dpnd>
 Y        Emp         Emp         </root>
 Dpnd       Eaddr
                    This reshaping example moves only one node at a time
                    which makes its operation polymorphic allowing any
 Z        Emp       shaped input structure that uses the same data names.
                    Reshaping uses comparing the structure to itself for
 Dpnd       Eaddr                                                           35
                    positioning since data relationships are not relied on.
Reshaping to Multipath Structure
SELECT X.DpndID, Y.EaddrID, Z.EmpID
FROM EmpView X
LEFT JOIN EmpView Y ON X.DpndID=Y.DpndID
LEFT JOIN EmpView Z ON X.DpndID=Z.DpndID

   EmpView                 Target
                                         <root>
                          Structure
       Emp
                                          <dpnd dpndid="Dpnd01">
                 X          Dpnd
                                           <eaddr eaddrid="Addr01"/>
 Dpnd     Eaddr                            <emp empid="Emp01"/>
                         Eaddr   Emp
                                          </dpnd>
         Emp         Y                   </root>
       Dpnd Eaddr
                                 This reshaping example demonstrates reconstructing
   Emp       Z                   to a multipath nonlinear structure using SQL’s
                                 hierarchical data modeling. The SQL hierarchical
Dpnd    Eaddr                    semantic operation helps preserve correctness. 36
Multipath Querying Vs. Transform
 Multipath Multipath                 Transformed Transformed
  ViewX ViewX Data                      ViewX     ViewX Data
                                          B         12 16
       A            10
                                          A         10 10
     B C          12 18
                 16                       C         18 18

SELECT A, C                               SELECT A, C        A simple
FROM ViewX                                FROM ViewX         multipath
WHERE B>10                                WHERE B>10         query can
                     Output     Data                         usually avoid
                    Structure Replication Specifically       transforms
Multiple use
                                                             and preserve
multipath                                 transforms
                 10     A      10 10                         structure with
schema-free                               structure A/B to
                                                             more accurate
query uses &                              B/A changing
                 18     C      18 18                         results.
preserves                                 semantics from
data structure                            1-to-M to M-to-1            37
Renaming, Replicating, & Splitting
SELECT EmpID EmpName, DpndID AS DpndName, Nodes
        Addr1.EaddrID AS AddrName,
         Addr1.EaddrID AS AddrName,
        Addr2.Eaddrstate AS AddrState
FROM Emp AS Employee
LEFT JOIN Dpnd AS Dependent ON EmpID=DpndEmpID and DpndCode='D'
LEFT JOIN EAddr AS Addr1 ON EmpCustID=Addr1.EAddrCustID
LEFT JOIN EAddr AS Addr2 ON Addr1.EaddrCustID=Addr2.EAddrCustID

                                      <root>
                                       <employee empname="Emp01">
Dpnd     Emp     Eaddr
                                        <dependent dpndname="Dpnd01"/>
                         Result         <addr1 addrname="Addr01">
                                         <addr2 addrstate="CA"/>
                     Employee
                                        </addr1>
                                       </employee>
               Dependent     Addr1
                                       <employee empname="Emp02">
                             Addr2      <addr1 addrname="Addr03">
                                         <addr2 addrstate="NV"/>
The AS keyword renames XML data         </addr1>
items and nodes, it can be used to     </employee>                 38
replicate nodes to help split them.   </root>
XML Duplicate and Shared
  Unambiguous Node Processing
                                                  This same SQL
   SELECT * FROM Dept                   handles both
   LEFT JOIN Cust ON DeptID=CustDeptID duplicate and
                                        shared data.
   LEFT JOIN Emp ON DeptID=EmpDeptID
   LEFT JOIN Addr AS AddrC ON CustID=AddrCustID
   LEFT JOIN Addr AS AddrE ON EmpID=AddrEmpID


  Ambiguous          Unambiguous              Ambiguous
Shared Structure Hierarchical Structure Duplicate Element Type
     Dept                Dept                  Dept

 Cust   Emp          Cust    Emp           Cust   Emp

                                                                 39
     Addr           AddrC   AddrE          Addr   Addr
Nonlinear Hierarchical ORDER BY
 SELECT CustID, InvID, AddrID <root>
 FROM CustView                 <cust custid="Cust03">
 ORDER BY AddrID Desc,          <addr addrid="Addr03"/>
                               </cust>
          InvID Desc,
                               <cust custid="Cust02">
          CustID Desc
                                <invoice invid="Inv03"/>
                                          <addr addrid="Addr04"/>
       CustView Ordering                  <addr addrid="Addr02"/>
                Cust                     </cust>
                                         <cust custid="Cust01">
           Invoice   Addr


Ordering Inv before Cust is Trouble:   Hierarchical ops like ORDER BY require
                                       special nonlinear hierarchical processing
   Cust1     Cust2    Cust1            to make sense for SQL Hierarchical
                                       processing. With Order By, each path is
    Inv1    Inv2   Inv3                ordered independently as in XML above.
                                                                             40
    Cust becomes out-of-order
Hier & XML Middleware Enabler
                                          Before Query:
 Query     SQL Pre Processing
                                            Establish Views
                    User’s SQL
                                            Preload XML (ETL)
 Real-time           Processor            SQL Pre Processing:
 Hier Access        Operating               Determines structure
                   Hierarchically           Optimizes structure
                                            Rewrite & submit SQL
         SQL Post Processing              Real-time Access:
                                            Structure-aware
 This Unique Implementation:                Retrieve XML (EII)
 • Supports XML enables hier processing     Return XML as rowset
 • Uses in place SQL processor              Retain XML data order
 • Needs no restaging of data
                                          SQL Post processing:
 • Not XML centric, no learning curve
 • ANSI SQL-92, vendor neutral              Remove replicated data
 • Operates seamlessly & transparently      Nonlinear data ordering
 • Protects current SQL investment          Rowset to XML       41
Dynamic Data Structure Creation Recap
  SELECTed Data Output
    Controls processing and output result structure
    With automatic node promotion and collection
  Combining Data Structures
    Joining and mashup of heterogeneous structures
     Mashup & node promotion = data virtualization
  Generating Variable Data Structures
    Multiple choice of view structure generation
    Driven by data value further up or down the path
  Transforming Data Structures
    Restructuring using data relationships
    Reshaping using structure semantics
    Node splitting and renaming
                                                       42
The Power of Hierarchical
   LEFT Outer Join Syntax Recap
ViewA LEFT JOIN ViewX
ON C.c=Z.c AND A.a=10              Shown on this slide:
      Views Expanded              • Automatically Combines and
                                    Unifies Heterogeneous Structures
A LEFT JOIN B ON A.a=B.a
  LEFT JOIN C ON A.a=C.a          • Path Filtering Based Up/Down Path
LEFT JOIN
X LEFT JOIN Y ON X.x=Y.y          • Directly Executable by SQL Engine
  LEFT JOIN Z ON X.x=Z.z
ON C.c=Z.c AND A.a=10             • Flexible Recombinant Properties

  A       A                       • Referencing Below Root is valid
                  Outer join
                  models join
B C      B C      of views,
                                  • Access optimization: paths sliced out
                  semantics
  X           X                      • Processing optimization: reorganized
                  define result
                  structure.         • Naturally Hierarchically Distributable
                                                                          43
Y Z        Y Z
Combining Basic Capabilities
     ViewR
                         Hierarchical     Examples:
       A                 XML Result       Hier View Support
    B C                                   Data Modeling
   X D E                                  Hier. Preservation
                              A1
                                          Var. Length Paths
CREATE View ViewR AS
                          B1 D1 E1        Multiple Paths
                         B2      E2
SELECT * FROM A                           Multi-node Types
LEFT JOIN B ON A.a=B.a                    Dyn. Data Select
LEFT JOIN C ON A.a=C.a    A1B1D1E1        Hier. Data Filtering
LEFT JOIN D ON C.c=D.c    A1B2 E2         Node Promotion
LEFT JOIN E ON C.c=E.c
                                          Node Collection
                                          Optimized Access
 A1B1C2D1E1      SELECT A.a,B.b,D.d,E.e
                 FROM ViewR
                                          Global View
 A1 C1D1
 A1B2C2 E2       WHERE C=‘C2’             Schema-free 44
Combining Advanced Capabilities
 ViewR     ViewX                     Data Mashup    Examples:
                   CREATE XML
   A        X      ViewX AS              A          XML View
 B C      Y Z      X(XID Char(8)),
                                         C          Heterogeneous
                   Y(YID Char(8),                   Data Integration
                     YFK Char(8))        X          Mashup
CREATE View        Parent X,
ViewR AS           Z(ZID Char(8),      Y Z          Var. Structures
SELECT * FROM A      ZFK Char(8))                   LCA Processing
LEFT JOIN B        Parent X
     ON A.a=B.a                          A          Linear Filtering
LEFT JOIN C         Converted to                    Hier. Ordering
     ON A.a=C.a                        B C
                     Outer Join                     Preserve Nodes
                                         X          Unified View
 SELECT A.a, C.c, Y.y, Z.z                          XML Keyword
 FROM ViewR LEFT JOIN ViewX            Y Z          Search
 ON C.c=Z.c AND A=‘A1’
 WHERE Y=‘Y1’ OR Z=‘Z2’              A1C2X1 Y1 Z1
                                                     Hierarchical
 ORDER BY A.a, Y.y, Z.z              A2C2X2 Y2 Z2    XML Result
 FOR XML KEEP NODE                   A3C3X3 Y3 Z3
                                                                    45
Advanced Capabilities 1
Multipath nonlinear processing
  Dynamic increase of data value using structure semantics
  Processes all queries regardless of internal complexity
  Hierarchical data filtering-- XML Keyword Search
Multipath hierarchical data structure joins
  Performed by simple join of hierarchical structure views
  Can be performed interactively and heterogeneously
  Dynamically combines hierarchical structure semantics
Linking below root allows structure mashup
  Enables capability to mashup structures meaningfully
  Includes powerful look-back and look-ahead capability
  Mashup + node promotion = data virtualization
  Enables SQL Transformations                          46
Advanced Capabilities 2
Dynamic automatic variable structure generation
  Dynamically builds structures based on values in data
  Can utilize cascading and embedded view operation
  Operates across multiple paths independently
Dynamic user control of structure
  Dynamic SELECT list controls processing structure
  Dynamic combining views builds data structure
  Transformations Change Structure
Dynamic Structured Output Formatting
  Structure-aware processing knows active structure
  Uses structure of result to format output data
  Active structure is dynamically controlled by user
                                                       47
Advanced Capabilities 3
Polymorphic any-to-any structure transform
  Uses data relationships or just structure semantics
  Utilizes SQL and hierarchical rowset data
  Two types of transforms, Restructure and Reshaping
Multipath nonlinear hierarchical Operations
  Single Order By orders multiple pathways separately
  Aggregation operates on separate pathways
Multipath internal LCA nonlinear processing
  LCA processing limits the domain across pathways
  Is automatic in ANSI SQL, but not in XQuery
  Responsible for keeping multipath result meaningful
Untested natural and automatic operations
  Natural hierarchical distributed processing
  Automatic parallel processing is possible
  Semantic web RDF to SQL, SQLfX driven by RDF      48
Advanced Capabilities Summary
1) ANSI SQL standard and mathematically sound
2) Ease of use (nonprocedural, navigationless, schema-free)
3) Hierarchically correct (principled multipath processing)
4) Greater efficiency (hierarchical processing optimization)
5) Fully interactive (dynamically process native XML)
6) Conceptual hierarchical processing on full structures
7) Queries can operate across the entire hierarchical structure
8) Nonlinear multipath LCA hierarchical processing
9) Full nonlinear hierarchical data structure mashups
A) Variable data generated structure control
B) Any-to-any polymorphic structure transformations
C) All operations are semantically controlled and accurate
D) Data virtualization supported
E) Natural distributed hierarchical processing
F) Automatic parallel processing is possible

All of the capabilities shown in this presentation can be reproduced 49
                                                                     on
    our online interactive demo at www.adatinc.com/demo.html

Mais conteúdo relacionado

Mais procurados

Sedna XML Database: Query Parser & Optimizing Rewriter
Sedna XML Database: Query Parser & Optimizing RewriterSedna XML Database: Query Parser & Optimizing Rewriter
Sedna XML Database: Query Parser & Optimizing RewriterIvan Shcheklein
 
Intro to T-SQL - 1st session
Intro to T-SQL - 1st sessionIntro to T-SQL - 1st session
Intro to T-SQL - 1st sessionMedhat Dawoud
 
Syntax Reuse: XSLT as a Metalanguage for Knowledge Representation Languages
Syntax Reuse: XSLT as a Metalanguage for Knowledge Representation LanguagesSyntax Reuse: XSLT as a Metalanguage for Knowledge Representation Languages
Syntax Reuse: XSLT as a Metalanguage for Knowledge Representation LanguagesTara Athan
 
Intro to T-SQL – 2nd session
Intro to T-SQL – 2nd sessionIntro to T-SQL – 2nd session
Intro to T-SQL – 2nd sessionMedhat Dawoud
 
Ibps it officer exam capsule by affairs cloud
Ibps it officer exam capsule by affairs cloudIbps it officer exam capsule by affairs cloud
Ibps it officer exam capsule by affairs cloudaffairs cloud
 
Sql scripting sorcerypaper
Sql scripting sorcerypaperSql scripting sorcerypaper
Sql scripting sorcerypaperoracle documents
 
Sedna XML Database System: Internal Representation
Sedna XML Database System: Internal RepresentationSedna XML Database System: Internal Representation
Sedna XML Database System: Internal RepresentationIvan Shcheklein
 
STRUCTURE OF SQL QUERIES
STRUCTURE OF SQL QUERIESSTRUCTURE OF SQL QUERIES
STRUCTURE OF SQL QUERIESVENNILAV6
 
Advanced SQL Webinar
Advanced SQL WebinarAdvanced SQL Webinar
Advanced SQL WebinarRam Kedem
 
Go faster with_native_compilation Part-2
Go faster with_native_compilation Part-2Go faster with_native_compilation Part-2
Go faster with_native_compilation Part-2Rajeev Rastogi (KRR)
 
Chapter 2 Relational Data Model-part 3
Chapter 2 Relational Data Model-part 3Chapter 2 Relational Data Model-part 3
Chapter 2 Relational Data Model-part 3Eddyzulham Mahluzydde
 

Mais procurados (20)

Sedna XML Database: Query Parser & Optimizing Rewriter
Sedna XML Database: Query Parser & Optimizing RewriterSedna XML Database: Query Parser & Optimizing Rewriter
Sedna XML Database: Query Parser & Optimizing Rewriter
 
Unit 04 dbms
Unit 04 dbmsUnit 04 dbms
Unit 04 dbms
 
ch3
ch3ch3
ch3
 
Unit 03 dbms
Unit 03 dbmsUnit 03 dbms
Unit 03 dbms
 
Intro to T-SQL - 1st session
Intro to T-SQL - 1st sessionIntro to T-SQL - 1st session
Intro to T-SQL - 1st session
 
SQL
SQLSQL
SQL
 
Syntax Reuse: XSLT as a Metalanguage for Knowledge Representation Languages
Syntax Reuse: XSLT as a Metalanguage for Knowledge Representation LanguagesSyntax Reuse: XSLT as a Metalanguage for Knowledge Representation Languages
Syntax Reuse: XSLT as a Metalanguage for Knowledge Representation Languages
 
Intro to T-SQL – 2nd session
Intro to T-SQL – 2nd sessionIntro to T-SQL – 2nd session
Intro to T-SQL – 2nd session
 
Ibps it officer exam capsule by affairs cloud
Ibps it officer exam capsule by affairs cloudIbps it officer exam capsule by affairs cloud
Ibps it officer exam capsule by affairs cloud
 
HDF5 Abstract Data Model
HDF5 Abstract Data ModelHDF5 Abstract Data Model
HDF5 Abstract Data Model
 
Algebra relacional
Algebra relacionalAlgebra relacional
Algebra relacional
 
Sql scripting sorcerypaper
Sql scripting sorcerypaperSql scripting sorcerypaper
Sql scripting sorcerypaper
 
SQL
SQLSQL
SQL
 
00 introduction
00 introduction00 introduction
00 introduction
 
Sedna XML Database System: Internal Representation
Sedna XML Database System: Internal RepresentationSedna XML Database System: Internal Representation
Sedna XML Database System: Internal Representation
 
STRUCTURE OF SQL QUERIES
STRUCTURE OF SQL QUERIESSTRUCTURE OF SQL QUERIES
STRUCTURE OF SQL QUERIES
 
Unit1
Unit1Unit1
Unit1
 
Advanced SQL Webinar
Advanced SQL WebinarAdvanced SQL Webinar
Advanced SQL Webinar
 
Go faster with_native_compilation Part-2
Go faster with_native_compilation Part-2Go faster with_native_compilation Part-2
Go faster with_native_compilation Part-2
 
Chapter 2 Relational Data Model-part 3
Chapter 2 Relational Data Model-part 3Chapter 2 Relational Data Model-part 3
Chapter 2 Relational Data Model-part 3
 

Semelhante a ANSI SQL Transparent Multipath Hierarchical Structured Data Processing for Relational, XML & Legacy Data

MADlib Architecture and Functional Demo on How to Use MADlib/PivotalR
MADlib Architecture and Functional Demo on How to Use MADlib/PivotalRMADlib Architecture and Functional Demo on How to Use MADlib/PivotalR
MADlib Architecture and Functional Demo on How to Use MADlib/PivotalRPivotalOpenSourceHub
 
Entity Framework v1 and v2
Entity Framework v1 and v2Entity Framework v1 and v2
Entity Framework v1 and v2Eric Nelson
 
The Adventure: BlackRay as a Storage Engine
The Adventure: BlackRay as a Storage EngineThe Adventure: BlackRay as a Storage Engine
The Adventure: BlackRay as a Storage Enginefschupp
 
01 Persistence And Orm
01 Persistence And Orm01 Persistence And Orm
01 Persistence And OrmRanjan Kumar
 
ADO.NET Entity Framework
ADO.NET Entity FrameworkADO.NET Entity Framework
ADO.NET Entity FrameworkDoncho Minkov
 
07.Overview_of_Query_Processing.pdf
07.Overview_of_Query_Processing.pdf07.Overview_of_Query_Processing.pdf
07.Overview_of_Query_Processing.pdfssusera4b8a1
 
Tutorial On Database Management System
Tutorial On Database Management SystemTutorial On Database Management System
Tutorial On Database Management Systempsathishcs
 
CIKB - Software Architecture Analysis Design
CIKB - Software Architecture Analysis DesignCIKB - Software Architecture Analysis Design
CIKB - Software Architecture Analysis DesignAntonio Castellon
 
SQL Server 2008 Overview
SQL Server 2008 OverviewSQL Server 2008 Overview
SQL Server 2008 OverviewDavid Chou
 
Odtug2011 adf developers make the database work for you
Odtug2011 adf developers make the database work for youOdtug2011 adf developers make the database work for you
Odtug2011 adf developers make the database work for youLuc Bors
 
Building Robust ETL Pipelines with Apache Spark
Building Robust ETL Pipelines with Apache SparkBuilding Robust ETL Pipelines with Apache Spark
Building Robust ETL Pipelines with Apache SparkDatabricks
 
Anchor Modeling ER09 Presentation
Anchor Modeling ER09 PresentationAnchor Modeling ER09 Presentation
Anchor Modeling ER09 Presentationpajo01
 
Syntactic Mediation in Grid and Web Service Architectures
Syntactic Mediation in Grid and Web Service ArchitecturesSyntactic Mediation in Grid and Web Service Architectures
Syntactic Mediation in Grid and Web Service ArchitecturesMartin Szomszor
 
High Performance Jdbc
High Performance JdbcHigh Performance Jdbc
High Performance JdbcSam Pattsin
 
Choose'10: Ralf Laemmel - Dealing Confortably with the Confusion of Tongues
Choose'10: Ralf Laemmel - Dealing Confortably with the Confusion of TonguesChoose'10: Ralf Laemmel - Dealing Confortably with the Confusion of Tongues
Choose'10: Ralf Laemmel - Dealing Confortably with the Confusion of TonguesCHOOSE
 
Graph Databases in the Microsoft Ecosystem
Graph Databases in the Microsoft EcosystemGraph Databases in the Microsoft Ecosystem
Graph Databases in the Microsoft EcosystemMarco Parenzan
 

Semelhante a ANSI SQL Transparent Multipath Hierarchical Structured Data Processing for Relational, XML & Legacy Data (20)

MADlib Architecture and Functional Demo on How to Use MADlib/PivotalR
MADlib Architecture and Functional Demo on How to Use MADlib/PivotalRMADlib Architecture and Functional Demo on How to Use MADlib/PivotalR
MADlib Architecture and Functional Demo on How to Use MADlib/PivotalR
 
Entity Framework v1 and v2
Entity Framework v1 and v2Entity Framework v1 and v2
Entity Framework v1 and v2
 
The Adventure: BlackRay as a Storage Engine
The Adventure: BlackRay as a Storage EngineThe Adventure: BlackRay as a Storage Engine
The Adventure: BlackRay as a Storage Engine
 
01 Persistence And Orm
01 Persistence And Orm01 Persistence And Orm
01 Persistence And Orm
 
ADO.NET Entity Framework
ADO.NET Entity FrameworkADO.NET Entity Framework
ADO.NET Entity Framework
 
07.Overview_of_Query_Processing.pdf
07.Overview_of_Query_Processing.pdf07.Overview_of_Query_Processing.pdf
07.Overview_of_Query_Processing.pdf
 
Tutorial On Database Management System
Tutorial On Database Management SystemTutorial On Database Management System
Tutorial On Database Management System
 
Algorithm
AlgorithmAlgorithm
Algorithm
 
CIKB - Software Architecture Analysis Design
CIKB - Software Architecture Analysis DesignCIKB - Software Architecture Analysis Design
CIKB - Software Architecture Analysis Design
 
SQL Server 2008 Overview
SQL Server 2008 OverviewSQL Server 2008 Overview
SQL Server 2008 Overview
 
Odtug2011 adf developers make the database work for you
Odtug2011 adf developers make the database work for youOdtug2011 adf developers make the database work for you
Odtug2011 adf developers make the database work for you
 
Building Robust ETL Pipelines with Apache Spark
Building Robust ETL Pipelines with Apache SparkBuilding Robust ETL Pipelines with Apache Spark
Building Robust ETL Pipelines with Apache Spark
 
Data structures graphics library in computer graphics.
Data structures  graphics library in computer graphics.Data structures  graphics library in computer graphics.
Data structures graphics library in computer graphics.
 
Anchor Modeling ER09 Presentation
Anchor Modeling ER09 PresentationAnchor Modeling ER09 Presentation
Anchor Modeling ER09 Presentation
 
Anchor Modeling
Anchor ModelingAnchor Modeling
Anchor Modeling
 
Syntactic Mediation in Grid and Web Service Architectures
Syntactic Mediation in Grid and Web Service ArchitecturesSyntactic Mediation in Grid and Web Service Architectures
Syntactic Mediation in Grid and Web Service Architectures
 
Evolution of Spark APIs
Evolution of Spark APIsEvolution of Spark APIs
Evolution of Spark APIs
 
High Performance Jdbc
High Performance JdbcHigh Performance Jdbc
High Performance Jdbc
 
Choose'10: Ralf Laemmel - Dealing Confortably with the Confusion of Tongues
Choose'10: Ralf Laemmel - Dealing Confortably with the Confusion of TonguesChoose'10: Ralf Laemmel - Dealing Confortably with the Confusion of Tongues
Choose'10: Ralf Laemmel - Dealing Confortably with the Confusion of Tongues
 
Graph Databases in the Microsoft Ecosystem
Graph Databases in the Microsoft EcosystemGraph Databases in the Microsoft Ecosystem
Graph Databases in the Microsoft Ecosystem
 

Último

Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 

Último (20)

Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 

ANSI SQL Transparent Multipath Hierarchical Structured Data Processing for Relational, XML & Legacy Data

  • 1. ANSI SQL Transparent Dynamic Multipath Hierarchical Structured Data Processing For Relational, XML, & Legacy Data Michael M David and Lee Fesperman Advanced Data Access Technologies, Inc. www.adatinc.com The example results shown in this presentation can be reproduced on our online interactive prototype at www.adatinc.com/demo.html
  • 2. Previous Database History Hierarchical processing popular 40 years ago Used user navigational processing Multipath hierarchical query processing used Used nonprocedural navigationless processing Made possible by automatic semantic processing Relational processing replaces hierarchical Also uses nonprocedural navigationless processing But more flexibility with data independence from join Hierarchical processing technology forgotten No Internet to store hierarchical processing knowledge Hierarchical structures popular again with XML User procedural navigation is back again Where is navigationless access for structured access?2
  • 3. SQL/XML Industry Problems Only proprietary vendor solutions All vendor solutions incompatible with each other Integration solutions are XML centric and procedural No satisfactory XML integration solution Hierarchical data value loss XML not fully integrated into SQL XML structured data processed as semistructured No hierarchical processing standard Invalid hierarchical processing result is possible Invalid hierarchical structure result is possible Markup and structured data processed the same Structured data needs to be processed differently! 3
  • 4. SQL/XML Structured Data Processing Problems Limited to linear single path processing Query data selection limitations Relational join single path mindset Requires user database navigation Loss of dynamic hierarchical processing Limits complex hierarchical processing Prevents seamless & transparent processing Requires specific vendor user training Requires XML and vendor trained users 4
  • 5. A Working ANSI SQL XML Solution Limits processing to only structured data Allows unambiguous nonprocedural querying Enables navigationless schema-free operation Only performs hierarchical operations Only uses hierarchical Left Outer Join operation Naturally supports full hierarchical structures Allows hierarchical structure-aware operation Supports inherent hierarchical processing This solves relational/XML data integration This also solves SQL to XML system mapping Transparent multipath hierarchical processing 5
  • 6. Structured Data Automatic Processing Benefits SQL transparent XML integration ANSI SQL-92 syntax and semantics, not XML centric Nonprocedural and navigationless, Interactive No knowledge of structure necessary, schema-free Multipath nonlinear hierarchical processing Hierarchically accurate results automatically Dynamic hierarchical processing optimization Dynamic output uses hierarchical result structure SQL hierarchical views fully functional Global views with no overhead and maximum reuse Global hierarchical queries now possible Hierarchical data filtering & XML keyword search 6
  • 7. Current XML Hierarchical Data XML Was Created as a Markup Language Unstructured Mapped to Semistructured Structured Vs. Semistructured Data Same as Fixed Vs. Fuzzy Meaning Why Semistructured Requires Navigation Structured Semistructured Data Data The multiple B nodes in this A A structure make B C B C it ambiguous for querying D E B D without user Unambiguous Ambiguous navigation. 7 Query Structure Query Structure
  • 8. Why Hierarchical Data structures are Powerful Automatic Data and Path Reuse Extending Path Increases Data Value Adding Paths Increase Data Value Further Hierarchical XML Becoming Ubiquitous Data Structures are Unambiguous Data/Path Creating more value than is 1) A/B A collected, 2) A/C B C automatic 3) A/C/D nonlinear data 4) A/C/E D E value increase. 8
  • 9. Why Hierarchical Structured Data Processing is Powerful Dynamic Multipath Query Combinations Multipath Data Value Grows Continually Every Node is Related to Every Other Node No. of Possible Queries Becomes Unlimited Automatic Processing = Unlimited Complexity Increasingly Complex Naturally utilizes the inherent Multipath Queries hierarchical semantics in Mview multipath queries. 1) SELECT B,C FROM Mview A 2) SELECT B,D FROM Mview WHERE A=2 B C 3) SELECT A,B FROM Mview WHERE E=5 4) SELECT B,C FROM Mview WHERE D=1 AND E=5 D E 5) SELECT D,E FROM Mview WHERE B=3 OR C=4 9
  • 10. Hierarchical Data Structure Types XML is self defining contiguous and nested No foreign key relation needed to make structure IBM’s IMS database is discontinuous Internally linked Structured VSAM is contiguous With hierarchical level & data occurrence counts Flat tabular objects hierarchically modeled Relational tables, flat files, spread sheets All support same hierarchical operation and principles 10
  • 11. Hierarchical Structured Data Makeup Multiple Node Types Nodes Support Multiple Data Occurrences Naturally Built With 1 to M Relationships Hierarchical Data Preservation Operation- Naturally Support Variable Length Paths Co C01 Not to be confused with 1 Co2 function driven single node M external hierarchical data Emp3 structure processing. Emp Emp1 1 1 Emp2 Suitable for IMS, Structured Proj3 VSAM, XML, COBOL FD and M M Proj4 hierarchically modeled Dpnd Proj Dpnd1 Proj2 Relational and flat data. 11
  • 12. SQL Hierarchical Structured Data Can Be Stored in Relational Rowsets Multiple Node Types in Relational Rowset Node Multiple Data Occurrences in Rowset Multiple Pathways in Rowset Variable Length Pathways in Rowset C01 Relational Rowset Co2 Hierarchical Co Emp Dpnd Proj structure Emp3 Co1 Emp1 Dpnd1 preserved Emp1 in rowset, Emp2 Co1 Emp2 Proj2 in and back Proj3 Co2 Emp3 Proj3 out again. Proj4 Dpnd1 Proj2 Co2 Emp3 Proj4 12
  • 13. Mapping Relational SQL to Hierarchical XML Input View Conceptual Hierarchical Result Query A A Shown on this slide: SELECT A.a, D.d, E.e BB C FROM GlobalView D E 1. Conceptual Level WHERE B.b=“B1” D EE XML 2. Multipath Processing X D Output 3. Hierarchical Filtering 4. Dynamic Data Select Working Set Result Set 5. Access Optimization A B C D E A D E 6. Global View Support A1 B1 C1 D1 E1 A1 D1 E1 A1 B1 C1 D2 E1 A1 D2 E1 7. Schema-free Query A1 B1 C1 D1 E2 A1 D1 E2 8. Node Promotion A1 B1 C1 D2 D2 A1 D2 D2 9. Auto Output Format A1 B2 C1 D1 E1 13
  • 14. Relational Logical Hierarchical View CREATE VIEW EmpView AS SELECT * FROM Emp Emp Hierarchical SQL LEFT JOIN Dpnd data modeling and ON EmpID=DpndEmpID processing Dpnd Eaddr semantics defined. AND DpndCode=’D’ LEFT JOIN Eaddr ON EmpCustID=EaddrCustID; SELECT EmpID, DpndID, <root> <emp empid="Emp01"> EaddrID <dpnd dpndid="Dpnd01"/> FROM EmpView <eaddr eaddrid="Addr01"/> </emp> Logical hierarchical SQL <emp empid="Emp02"> view and semantics <eaddr eaddrid="Addr03"/> processed directly by relational engine making it </emp> operate fully hierarchically. </root> 14
  • 15. XML Physical Hierarchical View CREATE XML CustView CREATE VIEW CustView AS Cust( SELECT * FROM Cust CustID Char(8), LEFT JOIN Invoice CustStoreID Char(8)), ON CustID=InvID Invoice( LEFT JOIN ADDR Converted to InvID Char(8), SQL CustView ON CustID=AddrID; InvCustID Char(8), SELECT Cust, Invoice, Addr InvStatus Char(8)) Parent Cust, Addr( FROM CustView AddrID Char(8), <root> AddrCustID Char(8), <cust custid="Cust01"> AddrState Char(8)) Parent Cust <invoice invid="Inv01"/> <invoice invid="Inv02"/> <addr addrid="Addr01"/> Cust This SQL created outer join view syntax is Invoice Addr used as a hierarchical map of the physical 15 IMS CustView for seamless operation.
  • 16. Joining Heterogeneous Views SELECT EmpID, DpndID, EaddrID, CustID InvID, AddrID <root> FROM EmpView <emp empid="Emp01"> LEFT JOIN CustView <dpnd dpndid="Dpnd01"/> <eaddr eaddrid="Addr01"/> ON EmpCustID=CustID <cust custid="Cust01"> <invoice invid="Inv01"/> Hierarchical Heterogeneous <invoice invid="Inv02"/> Structure Join Logical View <addr addrid="Addr01"/> Emp of Result </cust> </emp> Dpnd Eaddr Emp <emp empid="Emp02"> <eaddr eaddrid="Addr03"/> Cust Dpnd Eaddr Cust <cust custid="Cust03"> <addr addrid="Addr03"/> Invoice Addr Invoice Addr </cust> Emp logical and Cust physical views are </emp> both hierarchically modeled in the same </root> way in SQL making for a seamless, 16 unified and logical hierarchical join.
  • 17. Logical Hierarchical Structures Offer Flexible Relational/XML Integration Logical Physical Data Model: Logical Data Type Tables XML Natural hierarchical Mapping Common structures Features: solve Common Outer Join Modeling Abstraction hierarchical Data Modeling Consistency fixed structure Log. and Phy. Seamless problems. Structures have Capabilities: Emp same hierarchy Data Integration Dpnd Eaddr op principles Solves Rel and Hier Problems Separates Structure from Data Heterogeneous Hierarchical Structure Flexibility Cust Logical Offers Flexibility to Fixed Structures Invoice Addr Structure 17
  • 18. Heterogeneous Data Structure Mashup Uses Linking Below Root SELECT EmpID, DpndID, InvID, <root> AddrID, EaddrID, CustID <emp empid="Emp01"> <dpnd dpndid="Dpnd01"/> FROM EmpView LEFT JOIN <eaddr eaddrid="Addr01"> CustView ON EaddrID=AddrID <cust custid="Cust01"> WHERE CustID <> ”Cust02” <invoice invid="Inv01"/> <invoice invid="Inv02"/> Link Below Root Result Structure <addr addrid="Addr01"/> </cust> Emp Emp </eaddr> Dpnd Eaddr Dpnd Eaddr </emp> ...</root> Cust Cust Mashups allow linking anywhere into the lower level structure. We Invoice Addr Invoice Addr determined this was valid and what Dashed arrow is linkage. the new semantic structure is. Solid arrow is new structure. 18
  • 19. Association Table Use SELECT EmpID, DpndID, EaddrID, CustID, InvoiceID AddrID, IntersectData FROM EmpView LEFT JOIN AssociationTable On DpndID=DpndX LEFT JOIN CustView ON CustID=CustX Result Emp EmpView Structure Emp Association Table Dpnd Eaddr Association Dpnd Capabilities added: Eaddr Table • External Relationships DpndX CustX IntersectData • M to M Relationships IntersectData • Intersecting Data • Can be Transparent Cust CustView Cust • Retains Hier Structure Invoice Addr Invoice Addr 19
  • 20. Hierarchical Data Filtering, Auto Output & Structure-free Processing Input View Conceptual Hierarchical Result Automatic Query A A Structured XML SELECT A.a, D.d, E.e Output Formatting BB C FROM GlobalView D E WHERE B.b=“B1” D EE X D Hierarchical WHERE clause global data filtering Cousin nodes like node C above are filtered from B node This makes this a more internally complex multipath query Structure-free processing Navigationless XML access, no need to know structure Automatic structure-aware output formatting 20 Result structure known from outer join syntax modeling
  • 21. Optimization, Global Views and Node Promotion Input View Conceptual Hierarchical Result This is a conceptual Query A A query where the input SELECT A.a, D.d, E.e structure is not known BB C FROM GlobalView D E and the output adapts WHERE B.b=“B1” D EE to the dynamic result. X D Hierarchical optimization can remove from access - Unreferenced data nodes not on path to referenced data Dynamically controlled by SQL’s variable SELECT list Optimization makes all views global views Because they have no overhead for unreferenced fields Node promotion closes around unselected nodes 21 This happens in relational processing too
  • 22. Global Query Uses Global Views EmpCust Global views allow entire Global Query Emp structures to be defined and queried with no SELECT * Dpnd Eaddr Cust overhead, more user FROM EmpCust friendly. Global query Invoice Addr allows hierarchical WHERE InvStatus=‘O’ filtering of entire view. Global Data Filter <root> <emp empid="Emp01" empstoreid="Store01" empcustid="Cust01" empstatus="F"> <dpnd dpndid="Dpnd01" dpndempid="Emp01" dpndcode="D"/> <eaddr eaddrid="Addr01" eaddrcustid="Cust01" eaddrstate="CA"/> <cust custid="Cust01" custstoreid="Store01"> <invoice invid="Inv02" invcustid="Cust01" invstatus="O"/> <addr addrid="Addr01" addrcustid="Cust01" addrstate="CA"/> </cust> </emp> </root> 22
  • 23. Node Promotion and Nested View CREATE VIEW EmpCust AS SELECT * Hierarchical views can be nested. FROM EmpView LEFT JOIN CustView Views can contain views. ON EmpCustID=CustID SELECT EmpID, DpndID, <root> InvID, AddrID <emp empid="Emp01"> FROM EmpCust <dpnd dpndid="Dpnd01"/> <invoice invid="Inv01"/> <invoice invid="Inv02"/> EmpCust Node <addr addrid="Addr01"/> Emp Promotion </emp> <emp empid="Emp02"> Dpnd Eaddr Emp <addr addrid="Addr03"/> Dpnd Invoice Addr </emp> Cust </root> Invoice Addr Not SELECTed 23
  • 24. Node Promotion Override and XML Format Change <emp> <empid>Emp01</empid> SELECT EmpID, DpndID, <dpnd> <dpndid>Dpnd01</dpndid> InvID, AddrID </dpnd> FROM EmpCust <cust> FOR XML ELEMENT KEEP NODE <invoice> <invid>Inv01</invid> EmpCust No Node </invoice> <invoice> Emp Promotion <invid>Inv02</invid> Emp </invoice> Dpnd Eaddr <addr> Dpnd Cust <addrid>Addr01</addrid> Cust </addr> Invoice Addr Invoice Addr </cust> </emp> Without node promotion, unselected <emp> nodes are output empty. XML output <empid>Emp02</empid> format was changed to Element style. 24
  • 25. Data Structure Mashup With Node Promotion = Data Virtualization SELECT EmpID, DpndID, <root> InvID, AddrID <emp empid="Emp01"> FROM EmpView <dpnd dpndid="Dpnd01"/> LEFT JOIN CustView <invoice invid="Inv01"/> <invoice invid="Inv02"/> ON EaddrID=AddrID <addr addrid="Addr01"/> Link Below Root </emp> Result Output with <emp empid="Emp02"> Emp Node Promotion <addr addrid="Addr03"/> Dpnd Eaddr Emp </emp> </root> Dpnd Invoice Addr Cust Invoice Addr Mashups with node promotion gives aggregated result of the data. This has Dashed boxes are unselected, not output. same effect as data virtualization. 25
  • 26. Multipath Query Processing and its Required LCA Processing Multipath query references multiple pathways- and uses a WHERE data filtering operation For example: Selecting data based on data in another path This requires a special processing using LCA Lowest Common Ancestor (LCA) Node The LCA node is the lowest common ancestor node - Located between two pathway node points in the structure Used to keep the hierarchical query result meaningful Two types of SQL multipath LCA usage SELECT with WHERE clause referencing two different paths Compound WHERE clause referencing two different paths This LCA processing solves the XML Keyword Search problem These two uses of LCA can be combined causing nesting Each SELECT data item can have its own LCA 26
  • 27. Multipath LCA Query Logic for Single SELECT Data SELECT DpndID Lowest Common Ancestor (LCA) FROM EmpCust processing for the SQL SELECT WHERE AddrID=’Addr01’ clause will automatically process multiple LCA nodes when multiple specified output data is located EmpCust across different pathways. Emp SELECT LCA Dpnd Eaddr Cust <root> <dpnd dpndid="Dpnd01"/> Invoice Addr </root> SELECT data types Dpnd and Invoice generate different LCAs. 27
  • 28. Multipath LCA Query Logic for Multiple SELECT Data SELECT DpndID, InvID Lowest Common Ancestor (LCA) FROM EmpCust processing for the SQL SELECT WHERE AddrID=’Addr01’ clause will automatically process multiple LCA nodes when multiple specified output data is located EmpCust across different pathways. Emp SELECT LCAs <root> Dpnd Eaddr Cust <dpnd dpndid="Dpnd01"/> Invoice Addr <invoice invid=“Inv01"/> <invoice invid=“Inv02"/> SELECT data types Dpnd and Invoice generate different LCAs. </root> 28
  • 29. Compound WHERE Clauses also Require Their Own LCA Lowest Common Ancestor (LCA) node SELECT DpndID processing for multipath queries is FROM EmpCust necessary. We found it working in SQL WHERE InvID=’Inv01’ naturally on both the SELECT and WHERE operations. Academic projects AND AddrID=’Addr01’ are currently researching how to add it to XQuery. LCA is the lowest node EmpCust between node points on separate paths. Emp SELECT LCA WHERE LCA <root> Dpnd Eaddr Cust <dpnd dpndid="Dpnd01"/> Invoice Addr <dpnd dpndid="Dpnd03"/> This LCA example is actually a </root> combined LCA and nested LCA example 29
  • 30. Data Driven Variable Structure Generation SELECT EmpID, DpndID, InvID, <root> AddrID, EaddrID, <emp empid="Emp01" CustID, EmpStatus empstatus="F"> FROM EmpView <dpnd dpndid="Dpnd01"/> LEFT JOIN CustView <cust custid="Cust01"> <invoice invid="Inv01"/> ON EmpCustID=CustID <invoice invid="Inv02"/> AND EmpStatus=“F” <addr addrid="Addr01"/> </cust> Join Structure Variable Structure </emp> <emp empid="Emp02" Emp Emp empstatus=""> Dpnd Eaddr </emp> Dpnd </root> Cust Cust Current EmpStatus data value Invoice Addr controls if this CustView data Invoice Addr occurrence expands or not. 30
  • 31. Variable Structure Generation Using Multiple Choice SELECT EmpID, DpndID, InvID, Variable Structure AddrID, EaddrID, <root> CustID, EmpStatus <emp empid="Emp01" FROM EmpView empstatus="F"> LEFT JOIN CustViewF <dpnd dpndid="Dpnd01"/> ON EmpCustID=CustID Insert Full Time View Here AND EmpStatus=“F” </emp> <emp empid="Emp02" LEFT JOIN CustViewP empstatus=“P"> ON EmpCustID=CustID Insert Part Time View Here AND EmpStatus=“P” </emp> </root> Emp Current EmpStatus data value Dpnd CustViewF CustViewP controls which view expands. Control value is up or down path. 31
  • 32. Data Structure Transformation Two Basic Types of Data Structure Transformation Restructuring: uses data relationships in the data Restructuring: Uses only the structure semantics in the data Restructuring Used to bring out new data relationships Also removes data and nodes New structure is secondary, driven new relationships desired Reshaping Used to reshape data structure to a desired new structure No data relationships dependency Any-to-any data structure transformation is possible Reshaping is driven by structures natural semantics SQL transformation operation assures accuracy Restructuring and Reshaping operations can be combined 32
  • 33. Restructure Transformation SELECT X.EmpID, X.DpndID, Y.InvID, X.AddrID FROM EmpCust Y LEFT JOIN EmpCust X ON Y.InvCustID=X.EmpCustID <root> <invoice invid="Inv01"/> EmpCust X.Emp <emp empid="Emp01"> <dpnd dpndid="Dpnd01"/> X.Dpnd Eaddr X.Cust <addr addrid="Addr01"/> </emp> Y.Invoice Addr </Invoice> ....</root> Result Invoice Restructuring SQL transformation: 1) Takes structure apart Emp 2) Disregards unwanted portions 3) Recombines it differently -- Dpnd Addr 4) Uses different data relationships 5) This introduces new semantics 33
  • 34. Semantic Any-to-Any Structure Reshaping Transformation SELECT X.DpndID, Y.EmpID, Y.EaddrID FROM EmpView X <root> LEFT JOIN EmpView Y <dpnd dpndid="Dpnd01"> ON X.DpndID=Y.DpndID <emp empid="Emp01"> <eaddr eaddrid="Addr01"/> Result </emp> EmpView Structure </dpnd> X Emp Dpnd </root> Dpnd Eaddr Emp Reshaping uses only the semantics of the data structure to transform it into the desired Y Emp Eaddr structure. No new data relationships are needed so this method can always be used. Dpnd Eaddr Linking below the root is utilized. 34
  • 35. Polymorphic Any-to-Any Reshaping SELECT X.DpndID, Y.EaddrID, Z.EmpID FROM EmpView X LEFT JOIN EmpView Y ON X.DpndID=Y.DpndID LEFT JOIN EmpView Z ON Y.EaddrID=Z.EaddrID <root> EmpView Target <dpnd dpndid="Dpnd01"> Structure Emp <eaddr eaddrid="Addr01"> X Dpnd <emp empid="Emp01"/> Dpnd Eaddr </eaddr> Eaddr </dpnd> Y Emp Emp </root> Dpnd Eaddr This reshaping example moves only one node at a time which makes its operation polymorphic allowing any Z Emp shaped input structure that uses the same data names. Reshaping uses comparing the structure to itself for Dpnd Eaddr 35 positioning since data relationships are not relied on.
  • 36. Reshaping to Multipath Structure SELECT X.DpndID, Y.EaddrID, Z.EmpID FROM EmpView X LEFT JOIN EmpView Y ON X.DpndID=Y.DpndID LEFT JOIN EmpView Z ON X.DpndID=Z.DpndID EmpView Target <root> Structure Emp <dpnd dpndid="Dpnd01"> X Dpnd <eaddr eaddrid="Addr01"/> Dpnd Eaddr <emp empid="Emp01"/> Eaddr Emp </dpnd> Emp Y </root> Dpnd Eaddr This reshaping example demonstrates reconstructing Emp Z to a multipath nonlinear structure using SQL’s hierarchical data modeling. The SQL hierarchical Dpnd Eaddr semantic operation helps preserve correctness. 36
  • 37. Multipath Querying Vs. Transform Multipath Multipath Transformed Transformed ViewX ViewX Data ViewX ViewX Data B 12 16 A 10 A 10 10 B C 12 18 16 C 18 18 SELECT A, C SELECT A, C A simple FROM ViewX FROM ViewX multipath WHERE B>10 WHERE B>10 query can Output Data usually avoid Structure Replication Specifically transforms Multiple use and preserve multipath transforms 10 A 10 10 structure with schema-free structure A/B to more accurate query uses & B/A changing 18 C 18 18 results. preserves semantics from data structure 1-to-M to M-to-1 37
  • 38. Renaming, Replicating, & Splitting SELECT EmpID EmpName, DpndID AS DpndName, Nodes Addr1.EaddrID AS AddrName, Addr1.EaddrID AS AddrName, Addr2.Eaddrstate AS AddrState FROM Emp AS Employee LEFT JOIN Dpnd AS Dependent ON EmpID=DpndEmpID and DpndCode='D' LEFT JOIN EAddr AS Addr1 ON EmpCustID=Addr1.EAddrCustID LEFT JOIN EAddr AS Addr2 ON Addr1.EaddrCustID=Addr2.EAddrCustID <root> <employee empname="Emp01"> Dpnd Emp Eaddr <dependent dpndname="Dpnd01"/> Result <addr1 addrname="Addr01"> <addr2 addrstate="CA"/> Employee </addr1> </employee> Dependent Addr1 <employee empname="Emp02"> Addr2 <addr1 addrname="Addr03"> <addr2 addrstate="NV"/> The AS keyword renames XML data </addr1> items and nodes, it can be used to </employee> 38 replicate nodes to help split them. </root>
  • 39. XML Duplicate and Shared Unambiguous Node Processing This same SQL SELECT * FROM Dept handles both LEFT JOIN Cust ON DeptID=CustDeptID duplicate and shared data. LEFT JOIN Emp ON DeptID=EmpDeptID LEFT JOIN Addr AS AddrC ON CustID=AddrCustID LEFT JOIN Addr AS AddrE ON EmpID=AddrEmpID Ambiguous Unambiguous Ambiguous Shared Structure Hierarchical Structure Duplicate Element Type Dept Dept Dept Cust Emp Cust Emp Cust Emp 39 Addr AddrC AddrE Addr Addr
  • 40. Nonlinear Hierarchical ORDER BY SELECT CustID, InvID, AddrID <root> FROM CustView <cust custid="Cust03"> ORDER BY AddrID Desc, <addr addrid="Addr03"/> </cust> InvID Desc, <cust custid="Cust02"> CustID Desc <invoice invid="Inv03"/> <addr addrid="Addr04"/> CustView Ordering <addr addrid="Addr02"/> Cust </cust> <cust custid="Cust01"> Invoice Addr Ordering Inv before Cust is Trouble: Hierarchical ops like ORDER BY require special nonlinear hierarchical processing Cust1 Cust2 Cust1 to make sense for SQL Hierarchical processing. With Order By, each path is Inv1 Inv2 Inv3 ordered independently as in XML above. 40 Cust becomes out-of-order
  • 41. Hier & XML Middleware Enabler Before Query: Query SQL Pre Processing Establish Views User’s SQL Preload XML (ETL) Real-time Processor SQL Pre Processing: Hier Access Operating Determines structure Hierarchically Optimizes structure Rewrite & submit SQL SQL Post Processing Real-time Access: Structure-aware This Unique Implementation: Retrieve XML (EII) • Supports XML enables hier processing Return XML as rowset • Uses in place SQL processor Retain XML data order • Needs no restaging of data SQL Post processing: • Not XML centric, no learning curve • ANSI SQL-92, vendor neutral Remove replicated data • Operates seamlessly & transparently Nonlinear data ordering • Protects current SQL investment Rowset to XML 41
  • 42. Dynamic Data Structure Creation Recap SELECTed Data Output Controls processing and output result structure With automatic node promotion and collection Combining Data Structures Joining and mashup of heterogeneous structures Mashup & node promotion = data virtualization Generating Variable Data Structures Multiple choice of view structure generation Driven by data value further up or down the path Transforming Data Structures Restructuring using data relationships Reshaping using structure semantics Node splitting and renaming 42
  • 43. The Power of Hierarchical LEFT Outer Join Syntax Recap ViewA LEFT JOIN ViewX ON C.c=Z.c AND A.a=10 Shown on this slide: Views Expanded • Automatically Combines and Unifies Heterogeneous Structures A LEFT JOIN B ON A.a=B.a LEFT JOIN C ON A.a=C.a • Path Filtering Based Up/Down Path LEFT JOIN X LEFT JOIN Y ON X.x=Y.y • Directly Executable by SQL Engine LEFT JOIN Z ON X.x=Z.z ON C.c=Z.c AND A.a=10 • Flexible Recombinant Properties A A • Referencing Below Root is valid Outer join models join B C B C of views, • Access optimization: paths sliced out semantics X X • Processing optimization: reorganized define result structure. • Naturally Hierarchically Distributable 43 Y Z Y Z
  • 44. Combining Basic Capabilities ViewR Hierarchical Examples: A XML Result Hier View Support B C Data Modeling X D E Hier. Preservation A1 Var. Length Paths CREATE View ViewR AS B1 D1 E1 Multiple Paths B2 E2 SELECT * FROM A Multi-node Types LEFT JOIN B ON A.a=B.a Dyn. Data Select LEFT JOIN C ON A.a=C.a A1B1D1E1 Hier. Data Filtering LEFT JOIN D ON C.c=D.c A1B2 E2 Node Promotion LEFT JOIN E ON C.c=E.c Node Collection Optimized Access A1B1C2D1E1 SELECT A.a,B.b,D.d,E.e FROM ViewR Global View A1 C1D1 A1B2C2 E2 WHERE C=‘C2’ Schema-free 44
  • 45. Combining Advanced Capabilities ViewR ViewX Data Mashup Examples: CREATE XML A X ViewX AS A XML View B C Y Z X(XID Char(8)), C Heterogeneous Y(YID Char(8), Data Integration YFK Char(8)) X Mashup CREATE View Parent X, ViewR AS Z(ZID Char(8), Y Z Var. Structures SELECT * FROM A ZFK Char(8)) LCA Processing LEFT JOIN B Parent X ON A.a=B.a A Linear Filtering LEFT JOIN C Converted to Hier. Ordering ON A.a=C.a B C Outer Join Preserve Nodes X Unified View SELECT A.a, C.c, Y.y, Z.z XML Keyword FROM ViewR LEFT JOIN ViewX Y Z Search ON C.c=Z.c AND A=‘A1’ WHERE Y=‘Y1’ OR Z=‘Z2’ A1C2X1 Y1 Z1 Hierarchical ORDER BY A.a, Y.y, Z.z A2C2X2 Y2 Z2 XML Result FOR XML KEEP NODE A3C3X3 Y3 Z3 45
  • 46. Advanced Capabilities 1 Multipath nonlinear processing Dynamic increase of data value using structure semantics Processes all queries regardless of internal complexity Hierarchical data filtering-- XML Keyword Search Multipath hierarchical data structure joins Performed by simple join of hierarchical structure views Can be performed interactively and heterogeneously Dynamically combines hierarchical structure semantics Linking below root allows structure mashup Enables capability to mashup structures meaningfully Includes powerful look-back and look-ahead capability Mashup + node promotion = data virtualization Enables SQL Transformations 46
  • 47. Advanced Capabilities 2 Dynamic automatic variable structure generation Dynamically builds structures based on values in data Can utilize cascading and embedded view operation Operates across multiple paths independently Dynamic user control of structure Dynamic SELECT list controls processing structure Dynamic combining views builds data structure Transformations Change Structure Dynamic Structured Output Formatting Structure-aware processing knows active structure Uses structure of result to format output data Active structure is dynamically controlled by user 47
  • 48. Advanced Capabilities 3 Polymorphic any-to-any structure transform Uses data relationships or just structure semantics Utilizes SQL and hierarchical rowset data Two types of transforms, Restructure and Reshaping Multipath nonlinear hierarchical Operations Single Order By orders multiple pathways separately Aggregation operates on separate pathways Multipath internal LCA nonlinear processing LCA processing limits the domain across pathways Is automatic in ANSI SQL, but not in XQuery Responsible for keeping multipath result meaningful Untested natural and automatic operations Natural hierarchical distributed processing Automatic parallel processing is possible Semantic web RDF to SQL, SQLfX driven by RDF 48
  • 49. Advanced Capabilities Summary 1) ANSI SQL standard and mathematically sound 2) Ease of use (nonprocedural, navigationless, schema-free) 3) Hierarchically correct (principled multipath processing) 4) Greater efficiency (hierarchical processing optimization) 5) Fully interactive (dynamically process native XML) 6) Conceptual hierarchical processing on full structures 7) Queries can operate across the entire hierarchical structure 8) Nonlinear multipath LCA hierarchical processing 9) Full nonlinear hierarchical data structure mashups A) Variable data generated structure control B) Any-to-any polymorphic structure transformations C) All operations are semantically controlled and accurate D) Data virtualization supported E) Natural distributed hierarchical processing F) Automatic parallel processing is possible All of the capabilities shown in this presentation can be reproduced 49 on our online interactive demo at www.adatinc.com/demo.html