Legacy System Archiving With XML, XQuery and MongoDB

Dave Watson
SVP, iWay Software
@watsondaveny
watson.dave@gmail.com

Agenda


XML Archive Overview and Business Use Cases

XML Archive Technical Discussion




                                          Copyright 2009, Information Builders. Slide 2
iWay Archive

 What is XML Archive
    An extension of ESB for archiving data
       Leverage ESB process-oriented integration and data
           federation capabilities
      Long-term data retention
      Large repository, large index (Big Data)
      Search and retrieve capabilities (High performance)
      Business use examples
         Satisfy regulatory requirements
         e-Discovery (e.g. research, forensic)
         Business analytics




Archive – Solving Business Needs

Examples of Business Requirements:
Regulation / Requirement               Example Data                       Retention
Federal Record Retention Requirement   Patient health records             75 years (after last episode of care)
FDA 21 CFR Part 11                     Clinical trials and FDA approval   35 years
HIPAA (Healthcare)                     Pediatric medical records          21 years
Sarbanes-Oxley (public companies)      Audit                              7 years
SEC 17a-4 (Financial services)         Account records                    6 years
                                       Corporate documentation            Life of the enterprise
Research                               Life science                       Long-term
Analytics                              Financial / Legal                  Long-term


Archive – Types of Data

Can handle all types of data, for example:
       Electronic Documents
          Word, Excel, EDI, HL7, XML, …
       Applications
           ERPs, CRMs, SAP, SFDC, …
       Database Data
          IMS, DB2, Oracle, Sybase, SQL Server, MUMPS, …
       Electronic Files
          VSAM, Unix, Logs, …
       Email
          Outlook, Lotus Notes
       Others
          Multimedia files, Paper, Blueprints, Forms, Claims, …

ESB adapter components can be used to connect to the different types of
data.
Archive – Archiving Needs

Examples of Archiving Requirements:

Archive Requirements
Policy Based – Logical selection of DB records/transactions to be archived
Store very large amounts of data in archive
Keep data for very long periods of time
Become independent from Applications/DBMS/Systems – future proof
Protect authenticity of data – regulation and compliance
Access archived data when needed / as needed
Quickly search huge numbers of archived documents
Discard data after retention period – regulation and compliance
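
The "discard data after retention period" requirement comes down to a date calculation. The sketch below is only an illustration of that idea: the function names and the leap-day rule are assumptions, and the slides do not describe how the archive itself evaluates retention policies.

```python
from datetime import date


def discard_date(last_event: date, retention_years: int) -> date:
    """Return the first date on which a record may be discarded."""
    target_year = last_event.year + retention_years
    try:
        return last_event.replace(year=target_year)
    except ValueError:
        # last_event is Feb 29 and the target year is not a leap year.
        # Assumption: roll forward to Mar 1 so data is never discarded early.
        return date(target_year, 3, 1)


def is_expired(last_event: date, retention_years: int, today: date) -> bool:
    """True once the retention period has fully elapsed."""
    return today >= discard_date(last_event, retention_years)
```

For example, an audit record dated 2010-05-01 with a 7-year retention period would become eligible for discard on 2017-05-01.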




Archive – Example Business Use Case

 Store 75 years' worth of patient data
    Diverse Sources
       XML
       MUMPS
       Oracle
       HL7
 Support archive, query and integration scenarios
    XML to remain unchanged and exist outside the data store
    Ability to query documents
    Ability to retrieve original XML or part of XML using XQuery
    Ability to integrate XML archived data in federated services
     with operational sources (e.g. MUMPS, HL7, Oracle)

Archive – Example Business Requirements

 Highly scalable, high-performance document
    management database
   Easily integrates into an ESB architecture
      Multi-threaded parallel processing
      Distributed processing
      Just another data source along with, e.g., Oracle and
       MUMPS databases
      Leverage ESB Tools for process orchestration,
       process monitoring, data mapping/transformation,
       security and data aggregation capabilities.
   Implementation and vendor neutral – archived data (e.g.
    XML) stored in the operating system's native file system

XML Archive Technical Discussion




Overview

 Highly configurable ESB Java application that can be customized to specific needs.
Load Channel
   Reads XML documents and loads them into the
   document repository.

Query Channel
  Handles query requests and responses against the
  document repository.

Test Channel
   Simple visual interface displaying functionality and
   usage of the Query API.

Technology Involved


ESB -
  iWay Service Manager (commercial)
  IBM WebSphere ESB (commercial)
  Oracle Service Bus (commercial)
  WSO2 ESB (open source)

mongoDB - http://www.mongodb.org/

JSON - JavaScript Object Notation

XQuery - XML query language



mongoDB


 “Humongous”
 Scalable, high-performance, document-oriented database.
 JSON-style documents.
 Mirror capable.
 Auto-Sharding (clustering), horizontal scaling, automatic
  failover, no single point of failure.
 MapReduce support for complex processing. Work is
  distributed among the cluster.
 GridFS support.
     A distributed file system.
 Commercial support from 10gen (OEM by iWay Software)
XQuery

 A query and functional programming language for XML
  documents.
     Is to XML documents what SQL is to databases.
 “FLWOR” expressions.
     FOR, LET, WHERE, ORDER BY, RETURN
     Example:
          for $x in /FEDREG/CNTNTS/AGCY
          where $x/EAR = 'Agricultural'
          order by $x ascending
          return $x
 Supports syntax for constructing new documents.
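
For readers more familiar with general-purpose languages, the FLWOR example above can be approximated in Python with the standard library. This is a rough sketch, not an XQuery engine: the sample document is invented (only the FEDREG/CNTNTS/AGCY/EAR names come from the slide, and HD is a hypothetical element), and the where clause only checks the first EAR child.

```python
import xml.etree.ElementTree as ET

# Invented sample data; only the element names FEDREG, CNTNTS, AGCY and EAR
# appear in the slide. HD is a hypothetical child used to tell records apart.
SAMPLE = (
    "<FEDREG><CNTNTS>"
    "<AGCY><EAR>Agricultural</EAR><HD>Forest Service</HD></AGCY>"
    "<AGCY><EAR>Commerce</EAR><HD>Census Bureau</HD></AGCY>"
    "<AGCY><EAR>Agricultural</EAR><HD>Farm Service Agency</HD></AGCY>"
    "</CNTNTS></FEDREG>"
)


def flwor(xml_text, ear):
    root = ET.fromstring(xml_text)
    if root.tag != "FEDREG":
        return []
    # for $x in /FEDREG/CNTNTS/AGCY
    candidates = root.findall("CNTNTS/AGCY")
    # where $x/EAR = '...'  (approximation: compares the first EAR child only)
    selected = [x for x in candidates if x.findtext("EAR") == ear]
    # order by $x ascending  (sort on the concatenated text of each element)
    selected.sort(key=lambda x: "".join(x.itertext()))
    # return $x
    return selected


headings = [x.findtext("HD") for x in flwor(SAMPLE, "Agricultural")]
```

Each Python statement lines up with one FLWOR clause, which is the main point of the comparison.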



JSON – JavaScript Object Notation

   The new data-interchange language of the web.
                   www.json.org




Base Loading Architecture

[Diagram: inside the ESB, a Listener feeds a Flow with three steps: XML to JSON, Store JSON, Store XML. Back ends shown: mongoDB and GridFS binary storage.]
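
The "XML to JSON" step could look roughly like the sketch below. The slides do not specify the mapping the archive uses, so the conventions here are assumptions: attributes become "@"-prefixed keys, element text goes under "#text" when an element also has attributes or children, repeated child elements become lists, and text that follows a child element (mixed content) is ignored. The sample document is invented.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Convert one element to a JSON-compatible value (assumed convention)."""
    node = {"@" + name: value for name, value in elem.attrib.items()}
    text = (elem.text or "").strip()
    for child in elem:
        value = element_to_dict(child)
        if child.tag in node:
            # Repeated child tag: collect the values in a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    if not node:
        return text  # leaf element without attributes: just its text
    if text:
        node["#text"] = text
    return node


def xml_to_json_doc(xml_text):
    root = ET.fromstring(xml_text)
    return {root.tag: element_to_dict(root)}


doc = xml_to_json_doc('<a name="bob"><b>hello</b><b>world</b></a>')
print(json.dumps(doc))
```

The resulting dictionary is the kind of structure a document database can index, while the original XML is stored separately and left unchanged.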


Base Query Architecture

[Diagram: a Requester calls an HTTP Listener in the ESB; the Flow runs Query DB followed by an optional Get XML step. Back ends shown: mongoDB and GridFS binary storage.]

Loading Modification
External Storage

[Diagram: inside the ESB, a Listener feeds a Flow with three steps: XML to JSON, Store JSON, Store XML. Back ends shown: mongoDB and File System.]
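
A minimal sketch of the external-storage variant: the original XML is written unchanged to the file system, and a small reference record (the part that would sit alongside the JSON in the database) keeps the path and a SHA-256 digest so the file can be verified on retrieval. The directory layout, field names and use of a digest are assumptions for illustration, not details taken from the slides.

```python
import hashlib
import tempfile
from pathlib import Path


def store_external(xml_bytes: bytes, archive_dir: Path) -> dict:
    """Write the XML as-is and return a reference record for the index."""
    digest = hashlib.sha256(xml_bytes).hexdigest()
    # Hypothetical layout: fan out by the first two hex characters.
    target = archive_dir / digest[:2] / f"{digest}.xml"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(xml_bytes)
    return {"xml_path": str(target), "sha256": digest, "size": len(xml_bytes)}


def load_verified(ref: dict) -> bytes:
    """Read the archived XML back and check it against the stored digest."""
    data = Path(ref["xml_path"]).read_bytes()
    if hashlib.sha256(data).hexdigest() != ref["sha256"]:
        raise ValueError("archived XML does not match the recorded digest")
    return data


with tempfile.TemporaryDirectory() as tmp:
    original = b'<a name="bob"><b>hello</b></a>'
    ref = store_external(original, Path(tmp))
    roundtrip = load_verified(ref)
```

Keeping the digest next to the path is one simple way to address the "protect authenticity of data" requirement when the XML lives outside the data store.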



Loading Modification
SAP Loading Architecture

[Diagram: an SAP System connects to an RFC Server in the ESB; the Flow steps shown are IDOC to XML, XML to JSON, Store JSON, Store XML and Store IDOC. Back ends shown: mongoDB and GridFS binary storage.]

Loading Modification
Change Data Capture Loading Architecture

[Diagram: an RDBMS feeds a CDC Listener in the ESB; the Flow steps are XML to JSON, Store JSON, Store XML. Back ends shown: mongoDB and GridFS binary storage.]


Loading Modification
 Salesforce.com Loading Architecture

[Diagram: a Salesforce System feeds a SOAP Listener in the ESB; the Flow steps are XML to JSON, Store JSON, Store XML. Back ends shown: mongoDB and GridFS binary storage.]


Loading Modification
FTP Loading Architecture

[Diagram: a File System feeds an FTP Server in the ESB; the Flow steps are XML to JSON, Store JSON, Store XML. Back ends shown: mongoDB and GridFS binary storage.]


Query Modification
Web Service SOAP Query Architecture

[Diagram: a Web Service Client calls a SOAP Listener in the ESB; the Flow runs Query DB followed by an optional Get XML/IDOC step. Back ends shown: mongoDB and GridFS binary storage.]

The Test Client

Note: The archive is designed to be called from other
 flows or programs.

 A simple AJAX-based human interface for querying the XML
  Archive.
 Provides examples of the HTTP query interface provided by
  the base XML Archive.
 Installed with the base implementation of the XML Archive.




Simple Example

Loaded this simple XML Doc:




Displaying the Document

XML Link:




JSON Link:




Basic Query

Return all documents in which the name attribute of
           the <a> element equals “bob”.




Advanced Queries

  Query handler is a wrapper around the mongoDB
                  query language.

Support for:
   And
   Or
   Regular Expressions
   Ranges
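
The four query features listed above correspond to standard MongoDB query operators: `$and`, `$or`, `$regex`, and range comparisons such as `$gte` and `$lte`. The sketch below builds such filters as plain Python dictionaries and evaluates them with a toy in-memory matcher so it runs without a database. The matcher is a simplified stand-in, not MongoDB's implementation and not the archive's query handler, and the field names (`a.@name`, `a.age`) and sample documents are invented.

```python
import re


def resolve(doc, dotted_path):
    """Follow a dotted path such as 'a.@name' through nested dicts."""
    current = doc
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def match_operators(value, operators):
    """Apply the small subset of operators supported by this sketch."""
    if value is None:
        return False
    for op, arg in operators.items():
        if op == "$regex":
            ok = isinstance(value, str) and re.search(arg, value) is not None
        elif op == "$gte":
            ok = value >= arg
        elif op == "$lte":
            ok = value <= arg
        else:
            raise ValueError(f"unsupported operator: {op}")
        if not ok:
            return False
    return True


def matches(doc, query):
    for key, condition in query.items():
        if key == "$and":
            ok = all(matches(doc, sub) for sub in condition)
        elif key == "$or":
            ok = any(matches(doc, sub) for sub in condition)
        elif isinstance(condition, dict):
            ok = match_operators(resolve(doc, key), condition)
        else:
            ok = resolve(doc, key) == condition  # plain equality
        if not ok:
            return False
    return True


def find_names(docs, query):
    return [d["a"]["@name"] for d in docs if matches(d, query)]


DOCS = [
    {"a": {"@name": "bob", "age": 42}},
    {"a": {"@name": "bobby", "age": 17}},
    {"a": {"@name": "alice", "age": 30}},
]

basic = {"a.@name": "bob"}
regex_and_range = {"$and": [{"a.@name": {"$regex": "^bob"}},
                            {"a.age": {"$gte": 18, "$lte": 65}}]}
either = {"$or": [{"a.@name": "alice"}, {"a.age": {"$lte": 17}}]}
```

With a real MongoDB driver, the same filter dictionaries would be passed to a collection's find call; how the archive's HTTP query interface encodes them is not shown in the slides.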




Basic XQuery

   Return only the <b> element from the document.




 Formatted Result:




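
The query and result shown on the Basic XQuery slide are not available in this text. As a stand-in, the sketch below returns only the `<b>` element of a document using Python's standard library; an XQuery path along the lines of `/a/b` would express the same selection. The sample document and the function name are assumptions.

```python
import xml.etree.ElementTree as ET


def extract(xml_text, child_path):
    """Return the matching child elements of the root, serialized as XML."""
    root = ET.fromstring(xml_text)
    return [ET.tostring(elem, encoding="unicode")
            for elem in root.findall(child_path)]


# Invented sample document with an <a> root and a <b> child.
fragments = extract('<a name="bob"><b>hello</b><c>other</c></a>', "b")
```

The archive stores the original XML unchanged, so partial retrieval like this can run against the stored document at query time.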

Complex Legacy System Archiving/Data Retention with MongoDB and Xquery

  • 1. Legacy System Archiving With XML, XQuery and MongoDB Dave Watson SVP, iWay Software @watsondaveny watson.dave@gmail.com
  • 2. Agenda XML Archive Overview and Business Use Cases XML Archive Technical Discussion Copyright 2009, Information Builders. Slide 2
  • 3. iWay Archive  What is XML Archive  An extension of ESB for archiving data  Leverage ESB process-oriented integration and data federation capabilities  Long term data retention  Large repository, large index (Big Data)  Search and retrieve capabilities (High performance)  Business use examples  Satisfy regulatory requirement  e-Discovery (e.g. research, forensic)  Business analytics Copyright 2009, Information Builders. Slide 3
  • 4. Archive – Solving Business Needs Examples of Business Requirements: Regulations / Reqrs Example Data Retention Federal Record Patient health records 75 years (after last Retention Requirement episode of care) FDA 21 CFR Part 11 Clinical trials and FDA 35 years approval HIPAA (Healthcare) Pediatric medical records 21 years Sarbanes-Oxley (public Audit 7 years companies) SEC 17a-4 (Financial Account records 6 years services) Corporate documentation Life of the enterprise Research Life science Long-term Analytics Financial / Legal Long-term Copyright 2009, Information Builders. Slide 4
  • 5. Archive – Types of Data
      Can handle all types of data, for example:
      • Electronic Documents: Word, Excel, EDI, HL7, XML, …
      • Applications: ERPs, CRMs, SAP, SFDC, …
      • Database Data: IMS, DB2, Oracle, Sybase, SQL Server, MUMPS, …
      • Electronic Files: VSAM, Unix, Logs, …
      • Email: Outlook, Lotus Notes
      • Others: Multimedia files, Paper, Blueprints, Forms, Claims, …
      ESB adapter components can be used to connect to the different types of data.
  • 6. Archive – Archiving Needs
      Examples of Archiving Requirements:
      • Policy based: logical selection of DB records/transactions to be archived
      • Store very large amounts of data in the archive
      • Keep data for very long periods of time
      • Become independent from applications/DBMS/systems (future proof)
      • Protect authenticity of data (regulation and compliance)
      • Access archived data when needed / as needed
      • Quickly search huge numbers of archived documents
      • Discard data after the retention period (regulation and compliance)
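The "discard data after retention period" requirement on slide 6 can be illustrated with a small sketch. It is only an assumption about how such a policy check might look: the retention periods come from the table on slide 4, while the dictionary keys, the function names, and the idea of counting from a single `retention_start` date are hypothetical and not taken from the product.

```python
from datetime import date

# Retention periods in years, taken from the requirements table on slide 4.
# The dictionary keys are hypothetical labels chosen for this sketch.
RETENTION_YEARS = {
    "hipaa_pediatric_medical_record": 21,
    "sarbanes_oxley_audit": 7,
    "sec_17a4_account_record": 6,
}

def expiry_date(retention_start, years):
    """Return the first day on which a record may be discarded."""
    try:
        return retention_start.replace(year=retention_start.year + years)
    except ValueError:
        # retention_start is Feb 29 and the target year is not a leap year.
        return retention_start.replace(year=retention_start.year + years, day=28)

def is_discardable(record_type, retention_start, today):
    """True once the retention period for this record type has elapsed."""
    return today >= expiry_date(retention_start, RETENTION_YEARS[record_type])

# Usage: an SEC 17a-4 account record whose retention clock started in 2010.
start = date(2010, 1, 15)
print(is_discardable("sec_17a4_account_record", start, date(2016, 1, 14)))
print(is_discardable("sec_17a4_account_record", start, date(2016, 1, 15)))
```

Real regulations define the start of the retention clock differently (for example "after last episode of care"), so the start date would have to be derived per policy.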
  • 7. Archive – Example Business Use Case
      • Store 75 years' worth of patient data
      • Diverse sources: XML, MUMPS, Oracle, HL7
      • Support archive, query and integration scenarios
      • XML to remain unchanged and exist outside the data store
      • Ability to query documents
      • Ability to retrieve original XML or part of XML using XQuery
      • Ability to integrate XML archived data in federated services with operational sources (e.g. MUMPS, HL7, Oracle)
  • 8. Archive – Example Business Requirements
      • Highly scalable, high-performance document management database
      • Easily integrates into an ESB architecture
      • Multi-threaded parallel processing
      • Distributed processing
      • Just another data source along with, e.g., Oracle and MUMPS databases
      • Leverage ESB tools for process orchestration, process monitoring, data mapping/transformation, security and data aggregation capabilities
      • Implementation and vendor neutral: archived data (e.g. XML) stored in the operating system's native file system
  • 9. XML Archive Technical Discussion
  • 10. Overview
      Highly configurable ESB Java application that can be customized to specific needs.
      • Load Channel: Reads XML documents and loads them into the document repository.
      • Query Channel: Handles query requests and responses against the document repository.
      • Test Channel: Simple visual interface displaying functionality and usage of the Query API.
  • 11. Technology Involved
      • ESB: iWay Service Manager (commercial), IBM WebSphere ESB (commercial), Oracle Service Bus (commercial), WSO2 ESB (open source)
      • mongoDB: http://www.mongodb.org/
      • JSON: JavaScript Object Notation
      • XQuery: XML query language
  • 12. mongoDB
      • “Humongous”
      • Scalable, high-performance, document-oriented database.
      • JSON-style documents.
      • Mirror capable.
      • Auto-sharding (clustering), horizontal scaling, automatic failover, no single point of failure.
      • MapReduce support for complex processing. Work is distributed among the cluster.
      • GridFS support: a distributed file system.
      • Commercial support from 10gen (OEM by iWay Software)
  • 13. XQuery
      • A query and functional programming language for XML documents.
      • Is to XML documents what SQL is to databases.
      • “FLWOR” expressions: FOR, LET, WHERE, ORDER BY, RETURN
      • Example: for $x in /FEDREG/CNTNTS/AGCY where $x/EAR='Agricultural' order by $x ascending return $x
      • Supports syntax for constructing new documents.
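To make the FLWOR example on slide 13 concrete, here is a rough Python equivalent using only the standard library's ElementTree. It is not an XQuery engine, just a sketch of what each clause does. The sample document is hypothetical: it only mirrors the path /FEDREG/CNTNTS/AGCY from the slide, and the `HD` element is invented for illustration. As a simplification, the WHERE step checks only the first `EAR` child of each element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like the slide's path /FEDREG/CNTNTS/AGCY.
SAMPLE = """<FEDREG>
  <CNTNTS>
    <AGCY><EAR>Agricultural</EAR><HD>Marketing Service</HD></AGCY>
    <AGCY><EAR>Commerce</EAR><HD>Census Bureau</HD></AGCY>
    <AGCY><EAR>Agricultural</EAR><HD>Forest Service</HD></AGCY>
  </CNTNTS>
</FEDREG>"""

def flwor(xml_text, ear_value):
    root = ET.fromstring(xml_text)  # root is the <FEDREG> element
    # FOR $x in /FEDREG/CNTNTS/AGCY
    candidates = root.findall("./CNTNTS/AGCY")
    # WHERE $x/EAR = ear_value (first EAR child only, a simplification)
    matches = [x for x in candidates if x.findtext("EAR") == ear_value]
    # ORDER BY $x ascending, comparing the string value of each element
    matches.sort(key=lambda x: "".join(x.itertext()))
    # RETURN $x
    return matches

result = flwor(SAMPLE, "Agricultural")
headings = [x.findtext("HD") for x in result]
print(headings)
```

The two matching elements come back ordered by their concatenated text, so the "Forest Service" entry sorts before the "Marketing Service" entry.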
  • 14. JSON – JavaScript Object Notation
      The new data-interchange language of the web.
      www.json.org
  • 15. Base Loading Architecture
      [Diagram: ESB with a Listener and a Flow (XML to JSON, Store JSON, Store XML); targets: mongoDB and GridFS Binary Storage]
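The base loading flow on slide 15 (XML to JSON, store JSON, store XML) might be sketched as follows. This is an illustration under stated assumptions, not the product's implementation: the XML-to-JSON mapping (attributes prefixed with `@`, text under `#text`), the hash-based document id, and the two plain dictionaries standing in for mongoDB and GridFS are all hypothetical.

```python
import hashlib
import json
import xml.etree.ElementTree as ET

def element_to_dict(elem):
    """Convert an element to a JSON-friendly dict (one possible mapping)."""
    node = {"@" + k: v for k, v in elem.attrib.items()}
    text = (elem.text or "").strip()
    if text:
        node["#text"] = text
    for child in elem:
        # Child elements are collected into a list per tag name.
        node.setdefault(child.tag, []).append(element_to_dict(child))
    return node

def load_document(xml_bytes, index_store, binary_store):
    """Store a JSON rendering for querying plus the untouched original bytes."""
    root = ET.fromstring(xml_bytes)
    doc_id = hashlib.sha256(xml_bytes).hexdigest()  # hypothetical id scheme
    binary_store[doc_id] = xml_bytes  # stand-in for GridFS binary storage
    index_store[doc_id] = json.dumps({root.tag: element_to_dict(root)})  # stand-in for mongoDB
    return doc_id

# Usage with in-memory stand-ins for the two stores.
index_store, binary_store = {}, {}
original = b'<a name="bob"><b>hello</b></a>'
doc_id = load_document(original, index_store, binary_store)
as_json = json.loads(index_store[doc_id])
print(as_json)
```

Keeping the original bytes next to the JSON rendering reflects the requirement that the XML remains unchanged while the JSON form is what gets queried.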
  • 16. Base Query Architecture
      [Diagram: HTTP Requester; ESB with a Listener and a Flow (Query DB, optional Get XML); sources: mongoDB and GridFS Binary Storage]
  • 17. Loading Modification: External Storage
      [Diagram: ESB with a Listener and a Flow (XML to JSON, Store JSON, Store XML); targets: mongoDB and File System]
  • 18. Loading Modification: SAP Loading Architecture
      [Diagram: SAP System; ESB with an RFC Server and a Flow (IDOC to XML, XML to JSON, Store XML, Store JSON, Store IDOC); targets: mongoDB and GridFS Binary Storage]
  • 19. Loading Modification: Change Data Capture Loading Architecture
      [Diagram: RDBMS; ESB with a CDC Listener and a Flow (XML to JSON, Store JSON, Store XML); targets: mongoDB and GridFS Binary Storage]
  • 20. Loading Modification: Salesforce.com Loading Architecture
      [Diagram: Salesforce System; ESB with a SOAP Listener and a Flow (XML to JSON, Store JSON, Store XML); targets: mongoDB and GridFS Binary Storage]
  • 21. Loading Modification: FTP Loading Architecture
      [Diagram: File System; ESB with an FTP Server and a Flow (XML to JSON, Store JSON, Store XML); targets: mongoDB and GridFS Binary Storage]
  • 22. Query Modification: Web Service SOAP Query Architecture
      [Diagram: Web Service Client; ESB with a SOAP Listener and a Flow (Query DB, optional Get XML/IDOC); sources: mongoDB and GridFS Binary Storage]
  • 23. The Test Client
      Note: The archive is designed to be called from other flows or programs.
      • A simple AJAX-based human interface for querying the XML Archive.
      • Provides examples of the HTTP query interface provided by the base XML Archive.
      • Installed with the base implementation of the XML Archive.
  • 24. Simple Example: Loaded this simple XML doc:
  • 25. Displaying the Document: XML Link: JSON Link:
  • 26. Basic Query: Return all documents whose <a> element has a name attribute equal to “bob”.
  • 27. Advanced Queries
      Query handler is a wrapper around the mongoDB query language. Support for:
      • And
      • Or
      • Regular expressions
      • Ranges
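The four kinds of query listed on slide 27 correspond to standard mongoDB query operators: `$and`, `$or`, `$regex`, and the range operators `$gte` / `$lte`. The sketch below writes each as a filter document in Python. The slides do not show the query handler's own request syntax, so the field names (`name`, `year`) and the tiny in-memory `matches` evaluator are assumptions made only so the filters can be tried without a database; the evaluator covers just this subset and is not how mongoDB itself evaluates queries.

```python
import re

def matches(doc, flt):
    """Evaluate a small subset of mongoDB-style filters against a flat dict."""
    for key, cond in flt.items():
        if key == "$and":
            if not all(matches(doc, sub) for sub in cond):
                return False
        elif key == "$or":
            if not any(matches(doc, sub) for sub in cond):
                return False
        elif isinstance(cond, dict):
            value = doc.get(key)
            for op, arg in cond.items():
                if op == "$regex":
                    if not isinstance(value, str) or not re.search(arg, value):
                        return False
                elif op == "$gte":
                    if value is None or not value >= arg:
                        return False
                elif op == "$lte":
                    if value is None or not value <= arg:
                        return False
                else:
                    raise ValueError("unsupported operator: " + op)
        elif doc.get(key) != cond:
            return False
    return True

# Hypothetical indexed documents.
docs = [
    {"name": "bob", "year": 1998},
    {"name": "bobby", "year": 2005},
    {"name": "alice", "year": 2001},
]

and_filter = {"$and": [{"name": "bob"}, {"year": 1998}]}
or_filter = {"$or": [{"name": "bob"}, {"name": "alice"}]}
regex_filter = {"name": {"$regex": "^bob"}}
range_filter = {"year": {"$gte": 2000, "$lte": 2010}}

def names(flt):
    return [d["name"] for d in docs if matches(d, flt)]

for flt in (and_filter, or_filter, regex_filter, range_filter):
    print(flt, names(flt))
```

Against a real deployment, the same filter dictionaries would be passed to the database driver's find call rather than to the stand-in evaluator.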
  • 28. Basic XQuery: Return only the <b> element from the document. Formatted Result:
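The basic query (slide 26) and the basic XQuery (slide 28) can be combined in one small sketch: select the documents whose <a> element has name="bob", then return only the <b> element of each. The sample documents in the transcript were images and did not survive, so the two documents below are hypothetical, and the Python code only imitates the effect of the query and the XQuery rather than calling the archive's HTTP interface.

```python
import xml.etree.ElementTree as ET

# Hypothetical archived documents; the originals on the slides were images.
archive = [
    '<a name="bob"><b>first</b><c>x</c></a>',
    '<a name="alice"><b>second</b></a>',
]

def b_elements_for(name, documents):
    """Return the serialized <b> element of every <a name=...> document."""
    fragments = []
    for text in documents:
        root = ET.fromstring(text)
        # Basic query: the name attribute of <a> equals the requested value.
        if root.tag == "a" and root.get("name") == name:
            b = root.find("b")
            if b is not None:
                b.tail = None  # drop any trailing whitespace after </b>
                # Basic XQuery: return only the <b> element.
                fragments.append(ET.tostring(b, encoding="unicode"))
    return fragments

fragments = b_elements_for("bob", archive)
print(fragments)
```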