IF4IT
AUTOMATIC AND RAPID
GENERATION OF MASSIVE
KNOWLEDGE REPOSITORIES,
DIRECTLY FROM DATA
Author/Presenter: Frank Guerino
Chairman for The International Foundation for Information Technology (IF4IT)
Email: Frank.Guerino@if4it.com
LinkedIn: https://www.linkedin.com/in/frankguerino/
Follow Us on Twitter: @IF4IT
Co-Author: Dr. Joel Kline, PhD.
Board of Advisors, The International Foundation for Information Technology (IF4IT)
Professor, Lebanon Valley College, PA-USA
The Future is Automated Synthesis of Knowledge Repositories
Read More: https://www.if4it.com/knowledge-management-automated-content-generation-and-curation/
Meet Bob.
Actor: Bob is very competent.
Actions: Bob outperforms other people by generating one great knowledge article per hour.
Results (✖): Few knowledge repositories, limited content, poor curation, lots of dead links, and no semantic relationships.

Meet Bob’s replacement.
Actor: Automated Content Generation Software.
Actions: Bob’s replacement generates millions of higher quality, highly curated, and semantically inter-linked knowledge articles, in the time it takes Bob to create just one… at a fraction of the cost.
Results (✔): More knowledge repositories, far more content, greater curation, almost no dead links, and semantic relationships.
The Wikipedia Problem
• The Wikipedia Community is NOT like an
Enterprise Work Community
- About 17 years to develop,
- Over 130M voluntary editors (i.e. free labor),
- Over 6M content articles
• People believe they can build internal
knowledge repositories (like libraries and intranets) using the same
manual content development paradigm as Wikipedia
• The end result is almost always the same… “Relatively empty and
low value Knowledge/Content Repositories”
People often can’t find the answers they need.
Read More: https://www.if4it.com/wikipedia-problem-understanding-enterprise-knowledge-repositories-fail/
The Problem is Manual Labor
Quantity: Low quantities of artifact delivery.
Quality: Higher levels of human-introduced errors.
Time: Longer artifact delivery times.
Money: High costs for delivery of artifacts.
Trend: Knowledge Repository Automation is very important because,
more often than not, the teams that build repositories have very limited
resources (people & finances).
Trend: With the move to “Digital”, expectations of Knowledge
Repositories are even higher.
The Solution = Automation via Compilation
• The process is called Synthesis (a.k.a. Compilation)
• Compilation is the word used by software developers
• Synthesis is the word used by non-software developers
• Specifically, we use and recommend Data Driven
Synthesis (DDS)
• We use Compiler-based DDS to generate content, curate
content, interlink content, and automatically build and
provision Knowledge/Content repositories
Read More: https://www.if4it.com/understanding-data-driven-synthesis/
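As a rough illustration of what "compiling" data into content could look like, the following minimal Python sketch turns each row of a CSV inventory into one small HTML page. The column names (id, name, owner), the sample rows, and the compile_pages function are hypothetical and are not taken from the IF4IT NOUNZ compiler; the sketch only shows the general idea of generating content from data instead of writing it by hand.

```python
import csv
import html
import io

# Hypothetical input: one CSV row per record (column names are illustrative).
APPLICATIONS_CSV = """id,name,owner
app-1,Billing,Finance
app-2,Payroll,HR
"""


def compile_pages(csv_text, noun_type):
    """Turn each CSV row into one small HTML page, keyed by output file name."""
    pages = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Render every name/value pair of the row as a list item.
        fields = "".join(
            f"<li>{html.escape(k)}: {html.escape(v)}</li>" for k, v in row.items()
        )
        pages[f"{noun_type}/{row['id']}.html"] = (
            f"<h1>{html.escape(row['name'])}</h1><ul>{fields}</ul>"
        )
    return pages


pages = compile_pages(APPLICATIONS_CSV, "application")
for path in pages:
    print(path)
```

Because every page comes from the same function, a change to the layout is made once and applied to all pages on the next run.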
Many Decades of Successful Synthesis
• Synthesis/Compilation of Software (Since 1970s)
• Synthesis of Integrated Circuit Schematics (Since 1992)
- Inputs are Hardware Description Languages (HDLs) like VHDL and Verilog.
- Outputs are used for Simulation, Acceleration, Emulation, and Fabrication
• Synthesis of APIs and software code (i.e. Scaffolding for Software
Developers, such as for Java Spring and Ruby on Rails)
• Synthesis of large volumes of test data to exercise complex systems
• Synthesis of Chemical Compounds for Drug Discovery
• Synthesis of Health Care Pathways (Diagnosis + Treatments)
• Synthesis of (computer generated) Music and Art
• Synthesis of Electronic Documentation
(i.e. data driven content)
• Synthesis of Digital Libraries (massive web sites)
• Synthesis of Semantic Data Graphs (SDGs)
Who cares about DDS-based automation?
• Internet and Intranet Web Content Managers & Developers
• Technical Writers / Technical Communicators
• Architects (Enterprise/Solutions/Business/Applications/Data/etc.)
- Enterprise Models
• Software Developers (Using Compilation for about 5 Decades)
- API Documentation
- Software Configuration Documentation
• Engineers (Using Synthesis for about 3 Decades)
- Hardware, Network, Communications, & Semiconductor Documentation
• Anyone who documents topics, curates content, and publishes results to
web pages in a Content/Knowledge Repository
Common Use Cases Driving DDS
• Strategic Planning – Enterprise Portfolio Impact Analysis
• Faster Domain Documentation – More inter-linked documentation,
with interactive data and with fewer errors, at far lower costs
• Better Customer Support – Rapid and more accurate Incident Impact
Analysis
• Better Operational Work - Faster Knowledge Discovery = faster &
better work decisions
• Lower Development Costs – Synthesis helps eliminate a significant
amount of Software Development
• Better Search & Discovery – Synthesis helps yield better & more
accurate Search Results
Higher Levels of Customer / End-User Satisfaction
Synthesis is Compiler-based
Software Compilation/Synthesis:
Source Code Files → Software Compiler/Synthesizer → Compiled Software

Data Compilation/Synthesis:
Baseline Input Data + Processing Rules → Data Compiler/Synthesizer → Synthesized Output(s)
• Baseline Input Data: Flat files like *.csv sourced from spreadsheets and systems.
• Processing Rules: Control ontologies, formatting, view controls, report generation, semantic relationship harvesting, etc.
• Synthesized Output(s): Outputs are used for machines like computers AND for Humans.
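To make the split between baseline input data and processing rules more concrete, here is a minimal Python sketch of one kind of rule: semantic relationship harvesting. The rule format, the noun types, the column names, and the sample rows are all assumptions made for illustration; the slides do not describe the real rule syntax. Each rule says that a column in one table refers to a record in another table and names the predicate that links them.

```python
import csv
import io

# Hypothetical processing rule: "application.owner_org" points at an organization.
RULES = [
    {"source_type": "application", "column": "owner_org",
     "target_type": "organization", "predicate": "is owned by"},
]

# Hypothetical baseline input data, one CSV text per noun type.
DATA = {
    "application": "id,name,owner_org\napp-1,Billing,org-1\napp-2,Payroll,org-9\n",
    "organization": "id,name\norg-1,Finance\n",
}


def load(data):
    """Parse each CSV text into a dict of rows keyed by the 'id' column."""
    return {
        noun_type: {row["id"]: row for row in csv.DictReader(io.StringIO(text))}
        for noun_type, text in data.items()
    }


def harvest(tables, rules):
    """Return (triples, unresolved): links whose target exists, and those that do not."""
    triples, unresolved = [], []
    for rule in rules:
        targets = tables[rule["target_type"]]
        for row in tables[rule["source_type"]].values():
            ref = row[rule["column"]]
            if ref in targets:
                triples.append((row["id"], rule["predicate"], ref))
            else:
                unresolved.append((row["id"], rule["column"], ref))
    return triples, unresolved


tables = load(DATA)
triples, unresolved = harvest(tables, RULES)
print(triples)
print(unresolved)
```

In this sketch a link is only emitted when its target record exists, which is one plausible way a compiler could avoid generating dead links; references that cannot be resolved are reported instead of published.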
Benefits of DDS
Agile: Changes can be made iteratively and in
seconds/minutes
• Simple CSV flat files can be compiled
• No long software development cycles
Scalable: Hundreds of Thousands or Millions of content
pages can be generated in minutes
Stable: Elimination of human errors, like dead links, leads
to far higher levels of quality.
Affordable: The cost per content page (including both
Quantity and Quality) is a small fraction of manually
generated content
The Synthesis Sequence of Events
1. Synthesizer Inputs (from spreadsheets and systems), e.g. .CSV files:
• Application Data
• Capability Data
• Human Resource Data
• Product Data
• Service Data
• Facility Data
• Organization Data
• Etc. Data
2. Processing Rules for:
• Relationship Discovery
• Data Formatting
• View Generation
• Report Calculations
• Etc.
3. Data Synthesizer / Data Compiler
4. Synthesized Outputs:
• Node Views
• Data Graph/Network Relationships (e.g. CI (x), CI (y), CI (z))
• Business Intelligence: Inventories, Reports, Graphs & Charts, Glossaries, Dashboards, Visualizations, Abbreviations, Acronyms
• Data Indexes
• Catalogs
• Intranet / Digital Library
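Two of the outputs named in this sequence, Catalogs and Data Indexes, can be sketched in a few lines of Python. The sample records and the function names build_catalogs and build_index are hypothetical, and grouping by first letter is only one assumed indexing scheme; the slides do not say how the real indexes are organized.

```python
from collections import defaultdict

# Hypothetical records: (noun_type, name).
INSTANCES = [
    ("Application", "Billing"),
    ("Application", "Payroll"),
    ("Product", "Basic Plan"),
    ("Service", "Help Desk"),
]


def build_catalogs(instances):
    """One catalog per noun type: noun_type -> sorted list of names."""
    catalogs = defaultdict(list)
    for noun_type, name in instances:
        catalogs[noun_type].append(name)
    return {t: sorted(names) for t, names in sorted(catalogs.items())}


def build_index(instances):
    """Alphabetical index: first letter -> sorted list of (name, noun_type)."""
    index = defaultdict(list)
    for noun_type, name in instances:
        index[name[0].upper()].append((name, noun_type))
    return {letter: sorted(entries) for letter, entries in sorted(index.items())}


catalogs = build_catalogs(INSTANCES)
index = build_index(INSTANCES)
print(catalogs)
print(index)
```

Both structures are derived entirely from the input records, so they are rebuilt on every run and cannot drift out of date the way a hand-maintained index can.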
Real Business Impacts
✖ The Traditional Way = $$$$$$$$$$$$$$$$$$$ (Complexity)
(Too many complex, expensive, difficult to deliver & operate systems
and tools… just to get to a comprehensive view of your enterprise!)
• Intranets / Content Management Systems (Confluence, Jive, Drupal, MediaWiki, etc.)
• Architecture Modeling Tools (AMTs) (Troux, Mega, Adaptive, System Architect, etc.)
• Configuration Management Databases (CMDBs) (HP, BMC, ServiceNow, etc.)
• Stand-Alone Knowledge Management Systems (Madcap, KPS, Bitrix, SalesForce, ServiceNow, etc.)
• Library Management Systems (LMSs) (Koha, Soft Link, NGL, LibSys, Folet, etc.)
• Semantic Data Systems (Cambridge Semantics, Protégé, Swoop, LDIF, etc.)
• Expensive Integration
• Expensive Business Intelligence & Reporting
• Expensive People with Specific Skills
Many Years & Countless Resources

✔ DDS Results = $ (Simplicity)
(A very simple, very quick, and very affordable “Compiler Based Approach”)
Your Data + Your Rules → Your Compiler (Data Synthesizer / Data Compiler) → Your Branded Digital Libraries
(Complete with Catalogs, Indexes, Relationships, Data Views, Reports, Dashboards, Visualizations, etc.)
Minutes/Hours & Small # of Resources
Compiler-based DDS helps generate
“Knowledge Structures”
1. Content – High quantities, richly formatted, highly
structured, and strongly inter-linked
2. Interactive Data Visualizations - for Interactive
Analytics, Data Science, and Visual Discovery
3. Knowledge Repositories – fully curated structures
like advanced Intranets and Digital Libraries
Read More: https://www.if4it.com/knowledge-management-understanding-knowledge-structures/
1. Content: SFN over LFN
✖ Raw and unstructured human narrative in the form of “content” (not “data”).
✔ Highly structured data, based on Name/Value pair paradigms (e.g. CSV, JSON, etc.).
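The Name/Value pair idea can be shown with a short Python sketch that reads a CSV record and re-expresses it as JSON. The sample record and its column names are made up for illustration. In both formats each value is attached to a name, which is what lets a compiler process the record without interpreting free-form narrative.

```python
import csv
import io
import json

# Hypothetical record in CSV form: the header row supplies the names.
CSV_TEXT = "name,type,owner\nBilling,Application,Finance\n"

# Each row becomes a mapping of column name -> value.
records = [dict(row) for row in csv.DictReader(io.StringIO(CSV_TEXT))]

# The same name/value pairs, serialized as JSON.
as_json = json.dumps(records, indent=2)
print(as_json)
```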
2. Interactive Data Visualizations
VisualComplexity.com D3js.org
• Data Science and Data Scientists are VERY expensive.
• DDS creates a common set of fully integrated Data Visualizations
• DDS automatically creates many more out-of-the-box and ready-
to-use Data Visualizations, faster and at far lower costs.
Interactive Data Visualization Examples…
• Geographic Maps
• Force Directed Graphs
• Bubbles
• Condegram Spirals
• Bars, Pies, Lines
• Sankey Flows
• Chords
• Multivariate Grids
See many interactive examples in the gallery at: http://www.d3js.org
3. Knowledge Repositories
Read More: https://www.if4it.com/nounz/
Generic Example: http://nounz.if4it.com
Domain-Specific Example: http://km.if4it.com
The Spectrum of Synthesizable Knowledge Structures
Range of Synthesizable Knowledge Structures (Simplest → Most Complex)
Example Formats = TXT, CSV, TSV, JSON, XML, HTML, SVG, PDF, Etc.

Simplest Knowledge Structures
• Bits and Bytes
• Built-In Types and Constants
• Lists, Arrays, and Hash Tables
• Stacks and Heaps
• For Loops, Do Loops, and While Loops
• Formulas and Algorithms
• Buffers, Streams and Files
• Classes and Objects

Simple Knowledge Structures
• Data Records/Nodes
• Tables & Inventories
• Charts (Pie, Bar, Area, Bubble, etc.)
• Graphs (Line, Multi-Line, etc.)
• Web Pages
• Catalogs
• Indexes
• Reports
• Semantic Relationships
• Semantic Predicates

Moderately Complex Knowledge Structures
• Dashboards
• Data Visualizations (many different visualizations)
• Semantic Data Graphs (SDGs) / Semantic Data Networks (SDNs)
• HTML Link Networks
• Navigation Taxonomies
• Classification Taxonomies

Complex Knowledge Structures
• General Web Sites
• Intranets
• Architecture Models
• Architecture Repositories
• Configuration Management Databases (CMDBs)
• Domain-specific Knowledge Repositories

Super Complex Knowledge Structures
• Multi-Context/Multi-Domain Digital Libraries that include all other
structures in the spectrum (all simpler categories)
• Industry Specific Determinations…
- Automatic Claim Processing
- New Viable Drugs
- Health Care Pathways
- High Frequency Auto-Investing
- Etc.
Read More: https://www.if4it.com/knowledge-management-understanding-knowledge-structures/
DDS Solves the Wikipedia Problem for Enterprises...
Quantity: Much higher quantities of artifact delivery.
Quality: Much higher levels of quality.
Time: Much shorter times for artifact delivery (i.e.
much higher quantities with higher quality).
Money: Much lower costs to deliver artifacts
(especially for Data Science & Data Visualizations).
FASTER & BETTER
KNOWLEDGE DISCOVERY
AND DECISION MAKING
The Benefits of DDS
• More and Better Knowledge Repositories
- Far higher quantities of more advanced content
- More advanced features and capabilities
- Dynamic integration of data with content
- Higher quality of content (e.g. far fewer dead links)
- Far less investment of time and funds
• Higher stakeholder satisfaction and engagement
Getting Started with DDS
1. Acquire a Data Compiler/Synthesizer
• Contact IF4IT for a free NOUNZ Lite compiler https://www.if4it.com/contact-us/
2. Start with simple Spreadsheet-based Inventories (and SharePoint List
Structure extracts)
3. Incrementally customize small data sets to meet your needs and your
desired look-and-feel
4. Slowly progress to more complicated Data Extracts (from proprietary
systems)
5. Keep in mind that Time-To-Learn is “incremental” [you don’t have to
start with big projects]
Crawl Walk Run
Questions and Discussion
Frank Guerino
CEO & Chairman
The International Foundation for
Information Technology (IF4IT)
Email: Frank.Guerino@if4it.com
Twitter: @IF4IT
Read More:
• Automated Content Generation & Curation: https://www.if4it.com/knowledge-management-automated-content-generation-and-curation/
• The Wikipedia Problem: https://www.if4it.com/wikipedia-problem-understanding-enterprise-knowledge-repositories-fail/
• Understanding Data Driven Synthesis: https://www.if4it.com/understanding-data-driven-synthesis/
• Understanding Knowledge Structures: https://www.if4it.com/knowledge-management-understanding-knowledge-structures/
• Learn about D3 and Interactive Visualizations: http://www.d3js.org
• Learn about the IF4IT NOUNZ Data Compilation Platform: https://www.if4it.com/nounz/
• See Interactive Example of DDS-generated Generic Digital Library: http://nounz.if4it.com (Less than 3 minutes to generate.)
• See Interactive Example of DDS-generated KM Body of Knowledge: http://km.if4it.com (Only seconds to generate.)
APPENDIX
Real Case Studies
Global Biopharmaceutical
-- TOTAL Administration Category Noun Instances = 5: Time = Wednesday June 15, 2016 at 10:04:08
-- TOTAL Assay Noun Instances = 749: Time = Wednesday June 15, 2016 at 10:04:08
-- TOTAL Biological Matrix Category Noun Instances = 42: Time = Wednesday June 15, 2016 at 10:04:08
-- TOTAL Biomarker Noun Instances = 42: Time = Wednesday June 15, 2016 at 10:04:08
-- TOTAL Company Noun Instances = 18: Time = Wednesday June 15, 2016 at 10:04:08
-- TOTAL Disease Mechanism Noun Instances = 17: Time = Wednesday June 15, 2016 at 10:04:08
-- TOTAL Facility Noun Instances = 3: Time = Wednesday June 15, 2016 at 10:04:08
-- TOTAL Immunoassay Platform Noun Instances = 6: Time = Wednesday June 15, 2016 at 10:04:08
-- TOTAL Instrument Category Noun Instances = 5: Time = Wednesday June 15, 2016 at 10:04:08
-- TOTAL Instrument Noun Instances = 37: Time = Wednesday June 15, 2016 at 10:04:08
-- TOTAL Offering Noun Instances = 516: Time = Wednesday June 15, 2016 at 10:04:09
-- TOTAL Program Category Noun Instances = 5: Time = Wednesday June 15, 2016 at 10:04:09
-- TOTAL Study Type Noun Instances = 17: Time = Wednesday June 15, 2016 at 10:04:09
-- TOTAL White Paper Noun Instances = 28: Time = Wednesday June 15, 2016 at 10:04:09
-- TOTAL Application Noun Instances = 1000: Time = Wednesday June 15, 2016 at 10:04:09
-- TOTAL Business Domain Noun Instances = 9: Time = Wednesday June 15, 2016 at 10:04:09
-- TOTAL Capability Noun Instances = 32: Time = Wednesday June 15, 2016 at 10:04:09
-- TOTAL Computing Server Noun Instances = 100: Time = Wednesday June 15, 2016 at 10:04:09
-- TOTAL Contract Noun Instances = 1166: Time = Wednesday June 15, 2016 at 10:04:09
-- TOTAL Country Noun Instances = 251: Time = Wednesday June 15, 2016 at 10:04:09
-- TOTAL Customer Noun Instances = 150: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL Database Noun Instances = 100: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL Data Transport Technology Noun Instances = 4: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL Environment Noun Instances = 8: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL Frequently Asked Question Noun Instances = 32: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL Information Category Noun Instances = 16: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL Interface Noun Instances = 99: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL Language Code Noun Instances = 504: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL Letter Noun Instances = 26: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL Location Noun Instances = 50: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL Market Sector Noun Instances = 2: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL Market Segment Noun Instances = 2: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL News Article Noun Instances = 6: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL Number Noun Instances = 9: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL Organization Noun Instances = 29: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL Policy Noun Instances = 100: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL Process Noun Instances = 26: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL Product Noun Instances = 25: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL Project Noun Instances = 1000: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL Resource Noun Instances = 14: Time = Wednesday June 15, 2016 at 10:04:10
-- TOTAL Sales Transaction Noun Instances = 886: Time = Wednesday June 15, 2016 at 10:04:11
-- TOTAL SDLC Activity Noun Instances = 353: Time = Wednesday June 15, 2016 at 10:04:11
-- TOTAL SDLC Phase Noun Instances = 14: Time = Wednesday June 15, 2016 at 10:04:11
-- TOTAL Service Noun Instances = 561: Time = Wednesday June 15, 2016 at 10:04:11
-- TOTAL Software Noun Instances = 100: Time = Wednesday June 15, 2016 at 10:04:11
-- TOTAL Glossary Term Noun Instances = 235: Time = Wednesday June 15, 2016 at 10:04:11
-- TOTAL Vendor Noun Instances = 100: Time = Wednesday June 15, 2016 at 10:04:11
-- TOTAL Undefined Noun Type Noun Instances = 1: Time = Wednesday June 15, 2016 at 10:04:11
TOTAL Number of Unique Noun Types = 48: Time = Wednesday June 15, 2016 at 10:04:11
TOTAL Noun Instances registered = 8500: Time = Wednesday June 15, 2016 at 10:04:11
TOTAL Number of Unique Abbreviations or Acronyms = 655: Time = Wednesday June 15, 2016 at 10:04:11
TOTAL Number of Unique Semantic Relationships = 30767: Time = Wednesday June 15, 2016 at 10:04:15
TOTAL Number of Unique Semantic Relationship Predicates = 97: Time = Wednesday June 15, 2016 at 10:04:15
TOTAL Minimum Number of HTML Links = 113536: Time = Wednesday June 15, 2016 at 10:07:27
Spreadsheets were used to easily and quickly
collect, organize, and supply data to the NOUNZ
Compiler in First Normal Form (1NF) CSV format.
Vertical industry and business data was collected
from a public Biopharma web site, then organized and
cleansed in about 5 hours.
Generic IT Data was intentionally commingled with
Biopharma vertical industry and business data, in
order to show the effects of mixing different data
types.
TOTALS:
Total unique Noun Types (Data Types) = 48
Total Catalogs = 50
Total Noun Instances (across all Noun Types) = 8500
Total Semantic Relationships = 30767
Total Semantic Predicates = 97
Total Abbreviations and Acronyms = 655
Total “minimum” # of HTML links = 113536
Total Compile Time = 3 Minutes and 27 Seconds
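The totals can be cross-checked against the compiler log with a short script. The Python sketch below parses the per-type lines and sums the counts. Only three log lines are embedded here to keep the example short, and the regular expression is an assumption based on the line format visible in the log, not a documented NOUNZ log format.

```python
import re

# Three lines copied from the compiler log; a full check would read all of them.
LOG = """\
-- TOTAL Assay Noun Instances = 749: Time = Wednesday June 15, 2016 at 10:04:08
-- TOTAL Biomarker Noun Instances = 42: Time = Wednesday June 15, 2016 at 10:04:08
-- TOTAL Facility Noun Instances = 3: Time = Wednesday June 15, 2016 at 10:04:08
"""

# Assumed line shape: "-- TOTAL <noun type> Noun Instances = <count>: ..."
LINE = re.compile(r"^-- TOTAL (?P<noun>.+) Noun Instances = (?P<count>\d+):")


def instance_counts(log_text):
    """Map each noun type named in the log to its instance count."""
    counts = {}
    for line in log_text.splitlines():
        match = LINE.match(line)
        if match:
            counts[match.group("noun")] = int(match.group("count"))
    return counts


counts = instance_counts(LOG)
total = sum(counts.values())
print(len(counts), "noun types,", total, "noun instances")
```

Applied to all 48 per-type lines in the log, the same sum gives 8500, which matches the reported Total Noun Instances.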
Regional Health Care Payer/Insurer
• 47 defined Noun Types (a.k.a. Data Types),
• Almost 49,000 Noun Instances (a.k.a. Data Instances or Records) that are sourced
from the different Noun Types,
• Almost 294,000 automatically synthesized web pages with different views of data
and information,
• Over 300K automatically discovered and harvested Semantic Relationships that
translate directly to over 1,100,000 contextual and meaningful HTML links.
• 46 total Catalogs, including a Master Catalog, 47 Noun Domain Specific Catalogs
(one for each Noun Type), an Abbreviations/Acronyms Catalog, and a Relationship
Predicates Catalog
• 288 unique Indexing Categories with 2582 unique Data Indexes
• 869 harvested and curated Abbreviations and Acronyms
• Over 1,600 unique semantic relationship descriptors (i.e. Predicates)
• 47 Domain Specific Dashboards (one for each Noun Type).
Total Compiler Time = Approximately 15 minutes

RSA Conference Exhibitor List 2024 - Exhibitors DataRSA Conference Exhibitor List 2024 - Exhibitors Data
RSA Conference Exhibitor List 2024 - Exhibitors Data
 
Call Girls In DLf Gurgaon ➥99902@11544 ( Best price)100% Genuine Escort In 24...
Call Girls In DLf Gurgaon ➥99902@11544 ( Best price)100% Genuine Escort In 24...Call Girls In DLf Gurgaon ➥99902@11544 ( Best price)100% Genuine Escort In 24...
Call Girls In DLf Gurgaon ➥99902@11544 ( Best price)100% Genuine Escort In 24...
 
Katrina Personal Brand Project and portfolio 1
Katrina Personal Brand Project and portfolio 1Katrina Personal Brand Project and portfolio 1
Katrina Personal Brand Project and portfolio 1
 
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
 
A DAY IN THE LIFE OF A SALESMAN / WOMAN
A DAY IN THE LIFE OF A  SALESMAN / WOMANA DAY IN THE LIFE OF A  SALESMAN / WOMAN
A DAY IN THE LIFE OF A SALESMAN / WOMAN
 
VVVIP Call Girls In Greater Kailash ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
VVVIP Call Girls In Greater Kailash ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...VVVIP Call Girls In Greater Kailash ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
VVVIP Call Girls In Greater Kailash ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
 
Famous Olympic Siblings from the 21st Century
Famous Olympic Siblings from the 21st CenturyFamous Olympic Siblings from the 21st Century
Famous Olympic Siblings from the 21st Century
 
👉Chandigarh Call Girls 👉9878799926👉Just Call👉Chandigarh Call Girl In Chandiga...
👉Chandigarh Call Girls 👉9878799926👉Just Call👉Chandigarh Call Girl In Chandiga...👉Chandigarh Call Girls 👉9878799926👉Just Call👉Chandigarh Call Girl In Chandiga...
👉Chandigarh Call Girls 👉9878799926👉Just Call👉Chandigarh Call Girl In Chandiga...
 
Monthly Social Media Update April 2024 pptx.pptx
Monthly Social Media Update April 2024 pptx.pptxMonthly Social Media Update April 2024 pptx.pptx
Monthly Social Media Update April 2024 pptx.pptx
 
Call Girls From Pari Chowk Greater Noida ❤️8448577510 ⊹Best Escorts Service I...
Call Girls From Pari Chowk Greater Noida ❤️8448577510 ⊹Best Escorts Service I...Call Girls From Pari Chowk Greater Noida ❤️8448577510 ⊹Best Escorts Service I...
Call Girls From Pari Chowk Greater Noida ❤️8448577510 ⊹Best Escorts Service I...
 
Insurers' journeys to build a mastery in the IoT usage
Insurers' journeys to build a mastery in the IoT usageInsurers' journeys to build a mastery in the IoT usage
Insurers' journeys to build a mastery in the IoT usage
 
B.COM Unit – 4 ( CORPORATE SOCIAL RESPONSIBILITY ( CSR ).pptx
B.COM Unit – 4 ( CORPORATE SOCIAL RESPONSIBILITY ( CSR ).pptxB.COM Unit – 4 ( CORPORATE SOCIAL RESPONSIBILITY ( CSR ).pptx
B.COM Unit – 4 ( CORPORATE SOCIAL RESPONSIBILITY ( CSR ).pptx
 
Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...
Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...
Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...
 
Value Proposition canvas- Customer needs and pains
Value Proposition canvas- Customer needs and painsValue Proposition canvas- Customer needs and pains
Value Proposition canvas- Customer needs and pains
 
The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...
The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...
The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...
 
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabiunwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
 
Call Girls In Noida 959961⊹3876 Independent Escort Service Noida
Call Girls In Noida 959961⊹3876 Independent Escort Service NoidaCall Girls In Noida 959961⊹3876 Independent Escort Service Noida
Call Girls In Noida 959961⊹3876 Independent Escort Service Noida
 
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRL
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRLMONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRL
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRL
 
Organizational Transformation Lead with Culture
Organizational Transformation Lead with CultureOrganizational Transformation Lead with Culture
Organizational Transformation Lead with Culture
 

Automatic and rapid generation of massive knowledge repositories from data

  • 1. IF4IT AUTOMATIC AND RAPID GENERATION OF MASSIVE KNOWLEDGE REPOSITORIES, DIRECTLY FROM DATA Author/Presenter: Frank Guerino Chairman for The International Foundation for Information Technology (IF4IT) Email: Frank.Guerino @ if4it.com LinkedIn: https://www.linkedin.com/in/frankguerino/ Follow Us on Twitter: @IF4IT Co-Author: Dr. Joel Kline, PhD. Board of Advisors, The International Foundation for Information Technology (IF4IT) Professor, Lebanon Valley College, PA-USA 1
  • 2. IF4IT The Future is Automated Synthesis of Knowledge Repositories Read More: https://www.if4it.com/knowledge-management-automated-content-generation-and-curation/ Meet Bob. Bob is very competent. Bob outperforms other people by generating one great knowledge article per hour. Automated Content Generation Software Meet Bob’s replacement. Bob’s replacement generates millions of higher quality, highly curated, and semantically inter-linked knowledge articles, in the time it takes Bob to create just one… at a fraction of the cost. 2 Few knowledge repositories, limited content, poor curation, lots of dead links, and no semantic relationships. More knowledge repositories, far more content, greater curation, almost no dead links, and semantic relationships. ✖ ✔ ACTOR ACTIONS RESULTS
  • 3. IF4IT The Wikipedia Problem • The Wikipedia Community is NOT like an Enterprise Work Community - About 17 years to develop, - Over 130M voluntary editors (i.e. free labor), - Over 6M content articles • People believe they can build internal knowledge repositories (like libraries and intranets) using the same manual content development paradigm as Wikipedia • The end result is almost always the same… “Relatively empty and low value Knowledge/Content Repositories” People often can’t find the answers they need. Read More: https://www.if4it.com/wikipedia-problem-understanding-enterprise-knowledge-repositories-fail/ 3
  • 4. IF4IT The Problem is Manual Labor Quantity: Low quantities of artifact delivery. Quality: Higher levels of human-introduced errors. Time: Longer artifact delivery times. Money: High costs for delivery of artifacts. Trend: Knowledge Repository Automation is very important because, more often than not, the teams that build repositories have very limited resources (people & finances). Trend: With the move to “Digital”, expectations of Knowledge Repositories are even higher. 4
  • 5. IF4IT The Solution = Automation via Compilation • The process is called Synthesis (a.k.a. Compilation) • Compilation is the word used by software developers • Synthesis is the word used by non-software developers • Specifically, we use and recommend Data Driven Synthesis (DDS) • We use Compiler-based DDS to generate content, curate content, interlink content, and automatically build and provision Knowledge/Content repositories Read More: https://www.if4it.com/understanding-data-driven-synthesis/ 5
  • 6. IF4IT Many Decades of Successful Synthesis • Synthesis/Compilation of Software (Since 1970s) • Synthesis of Integrated Circuit Schematics (Since 1992) - Inputs are Hardware Description Languages (HDLs) like VHDL and Verilog. - Outputs are used for Simulation, Acceleration, Emulation, and Fabrication • Synthesis of APIs and software code (i.e. Scaffolding for Software Developers, such as for Java Spring and Ruby on Rails) • Synthesis of large volumes of test data to exercise complex systems • Synthesis of chemical Compounds for Drug Discovery • Synthesis of Health Care Pathways (Diagnosis + Treatments) • Synthesis of (computer generated) Music and Art • Synthesis of Electronic Documentation (i.e. data driven content) • Synthesis of Digital Libraries (massive web sites) • Synthesis of Semantic Data Graphs (SDGs) 6
  • 7. IF4IT Who cares about DDS-based automation? • Internet and Intranet Web Content Managers & Developers • Technical Writers / Technical Communicators • Architects (Enterprise/Solutions/Business/Applications/Data/etc.) • Enterprise Models • Software Developers (Using Compilation for about 5 Decades) • API Documentation • Software Configuration Documentation • Engineers (Using Synthesis for about 3 Decades) • Hardware, Network, Communications, & Semiconductor Documentation • Anyone who documents topics, curates, and who publishes results to web pages in some Content/Knowledge Repository 7
  • 8. IF4IT Common Use Cases Driving DDS • Strategic Planning – Enterprise Portfolio Impact Analysis • Faster Domain Documentation, - More inter-linked documentation, with interactive data and with fewer errors, @ far lower costs • Better Customer Support – Rapid and more accurate Incident Impact Analysis • Better Operational Work - Faster Knowledge Discovery = faster & better work decisions • Lower Development Costs – Synthesis helps eliminate significant Software Development • Better Search & Discovery – Synthesis helps yield better & more accurate Search Results Higher Levels of Customer / End-User Satisfaction 8
  • 9. IF4IT Synthesis is Compiler-based Data Compiler/Synthesizer Baseline Input Data Processing Rules Synthesized Output(s) Outputs are used for machines like computers AND for Humans. Flat files like *.csv sourced from spreadsheets and systems. Controls ontologies, formatting, view controls, report generation, semantic relationship harvesting, etc. 9 Software Compiler/Synthesizer Source Code Files Compiled Software Software Compilation/Synthesis Data Compilation/Synthesis
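The compiler model on slide 9 (baseline input data plus processing rules in, synthesized outputs out) can be illustrated with a minimal Python sketch. This is not the NOUNZ compiler or its rule format; the function `compile_pages`, the `RULES` dictionary, and the sample CSV are hypothetical names and data invented for illustration.

```python
import csv
import html
import io


def compile_pages(csv_text, rules):
    """Turn each CSV row into one HTML page, keyed by output file name."""
    pages = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        title = row[rules["title_field"]]
        file_name = title.lower().replace(" ", "-") + ".html"
        # Only the fields named in the rules are rendered on the page.
        items = "".join(
            f"<li><b>{html.escape(key)}</b>: {html.escape(value)}</li>"
            for key, value in row.items()
            if key in rules["show_fields"]
        )
        pages[file_name] = f"<h1>{html.escape(title)}</h1><ul>{items}</ul>"
    return pages


# Hypothetical baseline input data (a flat CSV inventory) and processing rules.
SAMPLE_CSV = "Name,Owner,Status\nPayroll System,Finance,Active\nCRM,Sales,Retired\n"
RULES = {"title_field": "Name", "show_fields": ["Owner", "Status"]}

pages = compile_pages(SAMPLE_CSV, RULES)
for name in sorted(pages):
    print(name, "->", pages[name])
```

Changing the CSV or the rules and rerunning regenerates every page, which is the property the slides rely on when they describe changes being made in seconds or minutes.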
  • 10. IF4IT Benefits of DDS Agile: Changes can be made iteratively and in seconds/minutes • Simple CSV flat files can be compiled • No long software development cycles Scalable: Hundreds of Thousands or Millions of content pages can be generated in minutes Stable: Elimination of human errors, like dead links, leads to far higher levels of quality. Affordable: The cost per content page (including both Quantity and Quality) is a small fraction of manually generated content 10
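The "Stable" claim rests on links being generated from data rather than typed by hand, which also makes them mechanically checkable. A minimal sketch of such a check, assuming pages are held in a dictionary keyed by file name (the `find_dead_links` helper and the sample pages are hypothetical):

```python
import re


def find_dead_links(pages):
    """Return (source_page, target) pairs whose target is not a generated page."""
    dead = []
    for name, body in pages.items():
        for target in re.findall(r'href="([^"]+)"', body):
            if target not in pages:
                dead.append((name, target))
    return dead


# Hypothetical generated pages; "newark.html" was never generated.
pages = {
    "payroll.html": '<a href="server-01.html">Server-01</a>',
    "server-01.html": '<a href="payroll.html">Payroll</a> <a href="newark.html">Newark</a>',
}

dead = find_dead_links(pages)
print(dead)
```

A compiler can run a check like this on every build and refuse to publish, or skip the offending link, when a target is missing.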
  • 11. IF4IT The Synthesis Sequence of Events Application Data (e.g. .CSV File) Capability Data (e.g. .CSV File) Human Resource Data (e.g. .CSV File) Product Data (e.g. .CSV File) Service Data (e.g. .CSV File) Etc. Data (e.g. .CSV File) Facility Data (e.g. .CSV File) Organization Data (e.g. .CSV File) …Synthesizer Inputs From spreadsheets and systems. 1 Processing Rules for • Relationship Discovery • Data Formatting • View Generation • Report Calculations • Etc. 2 Data Synthesizer/ Data Compiler 3 Node Views Data Graph/Network Relationships CI (z) CI (y) CI (x) Business Intelligence • Inventories • Reports • Graphs & Charts • Glossaries • Dashboards • Visualizations • Abbreviations • Acronyms Data Indexes Catalogs Intranet/ Digital Library 4
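The "Relationship Discovery" step in this sequence can be sketched as follows. The slides do not describe the actual discovery algorithm, so this is one simple assumed approach: a cell whose value equals the name of a known record is treated as a relationship, with the column header acting as the predicate. The function names, the `Name` column convention, and the sample tables are hypothetical.

```python
import csv
import io


def load(csv_text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def discover_relationships(tables):
    """Return (subject, predicate, object) triples found across all tables."""
    # Index every record name so that cell values can be matched against it.
    known = {
        row["Name"]: noun_type
        for noun_type, rows in tables.items()
        for row in rows
    }
    triples = []
    for rows in tables.values():
        for row in rows:
            for column, value in row.items():
                if column != "Name" and value in known:
                    triples.append((row["Name"], column, value))
    return triples


# Hypothetical inputs: one CSV per noun type, as on the slide.
tables = {
    "Application": load("Name,Runs On,Owned By\nPayroll,Server-01,Finance\n"),
    "Computing Server": load("Name,Located In\nServer-01,Newark\n"),
    "Organization": load("Name\nFinance\n"),
}

triples = discover_relationships(tables)
for subject, predicate, obj in triples:
    print(subject, "--", predicate, "->", obj)
```

Each triple can then be rendered as an HTML link on both the subject's and the object's page. "Newark" produces no triple here because no Location table was supplied.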
  • 12. IF4IT Real Business Impacts 12 Your Compiler Intranets / Content Management Systems (Confluence, Jive, Drupal, MediaWiki, etc.) Architecture Modeling Tools (AMTs) (Troux, Mega, Adaptive, System Architect, etc.) Configuration Management Databases (CMDBs) (HP, BMC, ServiceNow, etc.) Stand-Alone Knowledge Management Systems (Madcap, KPS, Bitrix, SalesForce, ServiceNow, etc.) Library Management Systems (LMSs) (Koha, Soft Link, NGL, LibSys, Folet, etc.) Semantic Data Systems (Cambridge Semantics, Protégé, Swoop, LDIF, etc.) The Traditional Way = $$$$$$$$$$$$$$$$$$$ (Too many complex, expensive, difficult to deliver & operate systems and tools… just to get to a comprehensive view of your enterprise!) Expensive Integration Expensive Business Intelligence & Reporting Expensive People with Specific Skills DDS Results = $ (A very simple, very quick, and very affordable “Compiler Based Approach”) Your Data Your Branded Digital Libraries (Complete with Catalogs, Indexes, Relationships, Data Views, Reports, Dashboards, Visualizations, etc.) 3 4 Your Data + Your Rules 1 Complexity Simplicity 2 Data Synthesizer/ Data Compiler ✖ ✔ Many Years & Countless Resources Minutes/Hours & Small # of Resources
  • 13. IF4IT Compiler-based DDS helps generate “Knowledge Structures” 1. Content – High quantities, richly formatted, highly structured, and strongly inter-linked 2. Interactive Data Visualizations - for Interactive Analytics, Data Science, and Visual Discovery 3. Knowledge Repositories – fully curated structures like advanced Intranets and Digital Libraries Read More: https://www.if4it.com/knowledge-management-understanding-knowledge-structures/ 13
  • 14. IF4IT 1. Content: SFN over LFN Raw and unstructured human narrative in the form of “content” (not “data”). Highly structured data, based on Name/Value pair paradigms (e.g. CSV, JSON, etc.). ✖ ✔ 14
  • 15. IF4IT 2. Interactive Data Visualizations VisualComplexity.com D3js.org • Data Science and Data Scientists are VERY expensive. • DDS creates a common set of fully integrated Data Visualizations • DDS automatically creates many more out-of-the-box and ready-to-use Data Visualizations, faster and at far lower costs. 15
  • 16. IF4IT Geographic Maps Interactive Data Visualization Examples… Force Directed Graphs Bubbles Condegram Spirals Bars, Pies, Lines Sankey Flows Chords Multivariate Grids See many interactive examples in the gallery at: http://www.d3js.org 16
  • 17. IF4IT 3. Knowledge Repositories Read More: https://www.if4it.com/nounz/ Generic Example: http://nounz.if4it.com Domain-Specific Example: http://km.if4it.com 17
  • 18. IF4IT The Spectrum of Synthesizable Knowledge Structures Range of Synthesizable Knowledge Structures • Data Records/Nodes • Tables & Inventories • Charts (Pie, Bar, Area, Bubble, etc.) • Graphs (Line, Multi-Line, etc.) • Web Pages • Catalogs • Indexes • Reports • Semantic Relationships • Semantic Predicates Simple Knowledge Structures • Dashboards • Data Visualizations (many different visualizations) • Semantic Data Graphs (SDGs) / Semantic Data Networks (SDNs) • HTML Link Networks • Navigation Taxonomies • Classification Taxonomies Moderately Complex Knowledge Structures • General Web Sites • Intranets • Architecture Models • Architecture Repositories • Configuration Management Databases (CMDBs) • Domain-specific Knowledge Repositories Complex Knowledge Structures • Multi-Context/Multi-Domain Digital Libraries that include all other structures in the spectrum (all columns to the left) • Industry Specific Determinations… - Automatic Claim Processing - New Viable Drugs - Healthcare Care Pathways - High Frequency Auto-Investing - Etc. Super Complex Knowledge Structures Example Formats = TXT, CSV, TSV, JSON, XML, HTML, SVG, PDF, Etc. Simplest Most Complex • Bits and Bytes • Built-In Types and Constants • Lists, Arrays, and Hash Tables • Stacks and Heaps • For Loops, Do Loops, and While Loops • Formulas and Algorithms • Buffers, Streams and Files • Classes and Objects Simplest Knowledge Structures Read More: https://www.if4it.com/knowledge-management-understanding-knowledge-structures/ 18
  • 19. IF4IT DDS Solves the Wikipedia Problem for Enterprises... Quantity: Much higher quantities of artifact delivery. Quality: Much higher levels of quality. Time: Much shorter times for artifact delivery (i.e. much higher quantities with higher quality). Money: Much lower costs to deliver artifacts (especially for Data Science & Data Visualizations). FASTER & BETTER KNOWLEDGE DISCOVERY AND DECISION MAKING 19
  • 20. IF4IT The Benefits of DDS • More and Better Knowledge Repositories - Far higher quantities of more advanced content - More advanced features and capabilities - Dynamic integration of data with content - Higher quality of content (e.g. far fewer dead links) - Far less investment of time and funds • Higher stakeholder satisfaction and engagement 20
  • 21. IF4IT Getting Started with DDS 1. Acquire a Data Compiler/Synthesizer • Contact IF4IT for a free NOUNZ Lite compiler https://www.if4it.com/contact-us/ 2. Start with simple Spreadsheet-based Inventories (and SharePoint List Structure extracts) 3. Incrementally customize small data sets to meet your needs and your desired look-and-feel 4. Slowly progress to more complicated Data Extracts (from proprietary systems) 5. Keep in mind that Time-To-Learn is “incremental” [you don’t have to start with big projects] Crawl Walk Run 21
  • 22. IF4IT Questions and Discussion 22 Frank Guerino CEO & Chairman The International Foundation for Information Technology (IF4IT) Email: Frank.Guerino@if4it.com Twitter: @IF4IT
  • 23. IF4IT Read More: • Automated Content Generation & Curation: https://www.if4it.com/knowledge-management-automated-content-generation-and-curation/ • The Wikipedia Problem: https://www.if4it.com/wikipedia-problem-understanding-enterprise-knowledge-repositories-fail/ • Understanding Data Driven Synthesis: https://www.if4it.com/understanding-data-driven-synthesis/ • Understanding Knowledge Structures: https://www.if4it.com/knowledge-management-understanding-knowledge-structures/ • Learn about D3 and Interactive Visualizations: http://www.d3js.org • Learn about the IF4IT NOUNZ Data Compilation Platform: https://www.if4it.com/nounz/ • See Interactive Example of DDS-generated Generic Digital Library: http://nounz.if4it.com (Less than 3 minutes to generate.) • See Interactive Example of DDS-generated KM Body of Knowledge: http://km.if4it.com (Only seconds to generate.) 23
  • 25. IF4IT Global Biopharmaceutical 25
      Compiler log:
      -- TOTAL Administration Category Noun Instances = 5: Time = Wednesday June 15, 2016 at 10:04:08
      -- TOTAL Assay Noun Instances = 749: Time = Wednesday June 15, 2016 at 10:04:08
      -- TOTAL Biological Matrix Category Noun Instances = 42: Time = Wednesday June 15, 2016 at 10:04:08
      -- TOTAL Biomarker Noun Instances = 42: Time = Wednesday June 15, 2016 at 10:04:08
      -- TOTAL Company Noun Instances = 18: Time = Wednesday June 15, 2016 at 10:04:08
      -- TOTAL Disease Mechanism Noun Instances = 17: Time = Wednesday June 15, 2016 at 10:04:08
      -- TOTAL Facility Noun Instances = 3: Time = Wednesday June 15, 2016 at 10:04:08
      -- TOTAL Immunoassay Platform Noun Instances = 6: Time = Wednesday June 15, 2016 at 10:04:08
      -- TOTAL Instrument Category Noun Instances = 5: Time = Wednesday June 15, 2016 at 10:04:08
      -- TOTAL Instrument Noun Instances = 37: Time = Wednesday June 15, 2016 at 10:04:08
      -- TOTAL Offering Noun Instances = 516: Time = Wednesday June 15, 2016 at 10:04:09
      -- TOTAL Program Category Noun Instances = 5: Time = Wednesday June 15, 2016 at 10:04:09
      -- TOTAL Study Type Noun Instances = 17: Time = Wednesday June 15, 2016 at 10:04:09
      -- TOTAL White Paper Noun Instances = 28: Time = Wednesday June 15, 2016 at 10:04:09
      -- TOTAL Application Noun Instances = 1000: Time = Wednesday June 15, 2016 at 10:04:09
      -- TOTAL Business Domain Noun Instances = 9: Time = Wednesday June 15, 2016 at 10:04:09
      -- TOTAL Capability Noun Instances = 32: Time = Wednesday June 15, 2016 at 10:04:09
      -- TOTAL Computing Server Noun Instances = 100: Time = Wednesday June 15, 2016 at 10:04:09
      -- TOTAL Contract Noun Instances = 1166: Time = Wednesday June 15, 2016 at 10:04:09
      -- TOTAL Country Noun Instances = 251: Time = Wednesday June 15, 2016 at 10:04:09
      -- TOTAL Customer Noun Instances = 150: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL Database Noun Instances = 100: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL Data Transport Technology Noun Instances = 4: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL Environment Noun Instances = 8: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL Frequently Asked Question Noun Instances = 32: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL Information Category Noun Instances = 16: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL Interface Noun Instances = 99: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL Language Code Noun Instances = 504: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL Letter Noun Instances = 26: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL Location Noun Instances = 50: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL Market Sector Noun Instances = 2: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL Market Segment Noun Instances = 2: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL News Article Noun Instances = 6: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL Number Noun Instances = 9: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL Organization Noun Instances = 29: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL Policy Noun Instances = 100: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL Process Noun Instances = 26: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL Product Noun Instances = 25: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL Project Noun Instances = 1000: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL Resource Noun Instances = 14: Time = Wednesday June 15, 2016 at 10:04:10
      -- TOTAL Sales Transaction Noun Instances = 886: Time = Wednesday June 15, 2016 at 10:04:11
      -- TOTAL SDLC Activity Noun Instances = 353: Time = Wednesday June 15, 2016 at 10:04:11
      -- TOTAL SDLC Phase Noun Instances = 14: Time = Wednesday June 15, 2016 at 10:04:11
      -- TOTAL Service Noun Instances = 561: Time = Wednesday June 15, 2016 at 10:04:11
      -- TOTAL Software Noun Instances = 100: Time = Wednesday June 15, 2016 at 10:04:11
      -- TOTAL Glossary Term Noun Instances = 235: Time = Wednesday June 15, 2016 at 10:04:11
      -- TOTAL Vendor Noun Instances = 100: Time = Wednesday June 15, 2016 at 10:04:11
      -- TOTAL Undefined Noun Type Noun Instances = 1: Time = Wednesday June 15, 2016 at 10:04:11
      TOTAL Number of Unique Noun Types = 48: Time = Wednesday June 15, 2016 at 10:04:11
      TOTAL Noun Instances registered = 8500: Time = Wednesday June 15, 2016 at 10:04:11
      TOTAL Number of Unique Abbreviations or Acronyms = 655: Time = Wednesday June 15, 2016 at 10:04:11
      TOTAL Number of Unique Semantic Relationships = 30767: Time = Wednesday June 15, 2016 at 10:04:15
      TOTAL Number of Unique Semantic Relationship Predicates = 97: Time = Wednesday June 15, 2016 at 10:04:15
      TOTAL Minimum Number of HTML Links = 113536: Time = Wednesday June 15, 2016 at 10:07:27
      Notes:
      Spreadsheets were used to easily and quickly collect, organize, and supply data to the NOUNZ Compiler in First Normal Form CSV format.
      Vertical industry and business data was collected from a public Biopharma web site, then organized and cleansed in about 5 hours.
      Generic IT Data was intentionally comingled with Biopharma vertical industry and business data, in order to show the effects of mixing different data types.
      TOTALS:
      Total unique Noun Types (Data Types) = 48
      Total Catalogs = 50
      Total Noun Instances (across all Noun Types) = 8500
      Total Semantic Relationships = 30767
      Total Semantic Predicates = 97
      Total Abbreviations and Acronyms = 655
      Total “minimum” # of HTML links = 113536
      Total Compile Time = 3 Minutes and 27 Seconds
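The totals on slide 25 can be turned into rough density and throughput figures with simple arithmetic. The sketch below uses only the numbers reported on the slide; the variable names are illustrative, and the rates are derived averages, not figures stated by the authors.

```python
# Totals as reported on slide 25.
noun_instances = 8500
semantic_relationships = 30767
html_links = 113536
compile_seconds = 3 * 60 + 27  # "3 Minutes and 27 Seconds"

# Average density of the generated repository.
relationships_per_instance = semantic_relationships / noun_instances
links_per_instance = html_links / noun_instances

# Average throughput over the whole compile.
instances_per_second = noun_instances / compile_seconds
links_per_second = html_links / compile_seconds

print(f"relationships per noun instance: {relationships_per_instance:.2f}")
print(f"HTML links per noun instance:    {links_per_instance:.2f}")
print(f"noun instances per second:       {instances_per_second:.1f}")
print(f"HTML links per second:           {links_per_second:.1f}")
```

On these figures, each noun instance carries on average between three and four semantic relationships and about thirteen HTML links.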
  • 26. IF4IT Regional Health Care Payer/Insurer 26 • 47 defined Noun Types (a.k.a. Data Types) • Almost 49,000 Noun Instances (a.k.a. Data Instances or Records) that are sourced from the different Noun Types • Almost 294,000 automatically synthesized web pages with different views of data and information • Over 300K automatically discovered and harvested Semantic Relationships that translate directly to over 1,100,000 contextual and meaningful HTML links • 46 total Catalogs, including a Master Catalog, 47 Noun Domain Specific Catalogs (one for each Noun Type), an Abbreviations/Acronyms Catalog, and a Relationship Predicates Catalog • 288 unique Indexing Categories with 2582 unique Data Indexes • 869 harvested and curated Abbreviations and Acronyms • Over 1,600 unique semantic relationship descriptors (i.e. Predicates) • 47 Domain Specific Dashboards (one for each Noun Type) Total Compiler Time = Approximately 15 minutes