SlideShare uma empresa Scribd logo
1 de 26
Slinging Data: Data Loading and Cleanup in Evergreen Growing Evergreen Conference 22 April 2010
To migrate data … ,[object Object]
Whence ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
All over the map ,[object Object],[object Object],[object Object],[object Object]
All over the map ,[object Object],[object Object],[object Object],[object Object],[object Object]
All over the map ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
All over the map Legacy Item Type Circ Modifier 0 Regular 1 Media 45 AV 123 Reference 234 Reference
Cleaning up ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Don’t box me in! ,[object Object],[object Object]
Yes, those fixed fields really matter ,[object Object]
Yes, those fixed fields really matter ,[object Object]
Fixed fields
Oops! ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
On stage ,[object Object],[object Object],[object Object]
On stage ,[object Object],[object Object],[object Object],[object Object]
On stage ,[object Object],[object Object]
On stage ,[object Object]
On stage ,[object Object],[object Object],[object Object],[object Object]
On stage ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Counting ,[object Object],[object Object]
Counting ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Counting ,[object Object],[object Object]
Tools ,[object Object],[object Object],[object Object],[object Object],[object Object]
And now something new
Equinox Migration Tools ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object]

Mais conteúdo relacionado

Semelhante a Slinging Data: Data Loading and Cleanup in Evergreen

Brief Intro to Apache Spark @ Stanford ICME
Brief Intro to Apache Spark @ Stanford ICMEBrief Intro to Apache Spark @ Stanford ICME
Brief Intro to Apache Spark @ Stanford ICMEPaco Nathan
 
How to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsHow to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsJulien Le Dem
 
Think Like Spark
Think Like SparkThink Like Spark
Think Like SparkAlpine Data
 
Apache Spark: What? Why? When?
Apache Spark: What? Why? When?Apache Spark: What? Why? When?
Apache Spark: What? Why? When?Massimo Schenone
 
Memory Optimization
Memory OptimizationMemory Optimization
Memory OptimizationWei Lin
 
Memory Optimization
Memory OptimizationMemory Optimization
Memory Optimizationguest3eed30
 
Mapreduce in Search
Mapreduce in SearchMapreduce in Search
Mapreduce in SearchAmund Tveit
 
Mapreduce Algorithms
Mapreduce AlgorithmsMapreduce Algorithms
Mapreduce AlgorithmsAmund Tveit
 
Think Like Spark: Some Spark Concepts and a Use Case
Think Like Spark: Some Spark Concepts and a Use CaseThink Like Spark: Some Spark Concepts and a Use Case
Think Like Spark: Some Spark Concepts and a Use CaseRachel Warren
 
R Spatial Analysis using SP
R Spatial Analysis using SPR Spatial Analysis using SP
R Spatial Analysis using SPtjagger
 
All I know about rsc.io/c2go
All I know about rsc.io/c2goAll I know about rsc.io/c2go
All I know about rsc.io/c2goMoriyoshi Koizumi
 
Big data shim
Big data shimBig data shim
Big data shimtistrue
 
Introduction to Python , Overview
Introduction to Python , OverviewIntroduction to Python , Overview
Introduction to Python , OverviewNB Veeresh
 
Introduction to Dart
Introduction to DartIntroduction to Dart
Introduction to DartRamesh Nair
 
Strata NYC 2015 - What's coming for the Spark community
Strata NYC 2015 - What's coming for the Spark communityStrata NYC 2015 - What's coming for the Spark community
Strata NYC 2015 - What's coming for the Spark communityDatabricks
 
Compiler2016 by abcdabcd987
Compiler2016 by abcdabcd987Compiler2016 by abcdabcd987
Compiler2016 by abcdabcd987乐群 陈
 
Scalding - the not-so-basics @ ScalaDays 2014
Scalding - the not-so-basics @ ScalaDays 2014Scalding - the not-so-basics @ ScalaDays 2014
Scalding - the not-so-basics @ ScalaDays 2014Konrad Malawski
 

Semelhante a Slinging Data: Data Loading and Cleanup in Evergreen (20)

Brief Intro to Apache Spark @ Stanford ICME
Brief Intro to Apache Spark @ Stanford ICMEBrief Intro to Apache Spark @ Stanford ICME
Brief Intro to Apache Spark @ Stanford ICME
 
How to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsHow to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analytics
 
Think Like Spark
Think Like SparkThink Like Spark
Think Like Spark
 
Apache Spark: What? Why? When?
Apache Spark: What? Why? When?Apache Spark: What? Why? When?
Apache Spark: What? Why? When?
 
Memory Optimization
Memory OptimizationMemory Optimization
Memory Optimization
 
Memory Optimization
Memory OptimizationMemory Optimization
Memory Optimization
 
Mapreduce in Search
Mapreduce in SearchMapreduce in Search
Mapreduce in Search
 
Mapreduce Algorithms
Mapreduce AlgorithmsMapreduce Algorithms
Mapreduce Algorithms
 
Think Like Spark: Some Spark Concepts and a Use Case
Think Like Spark: Some Spark Concepts and a Use CaseThink Like Spark: Some Spark Concepts and a Use Case
Think Like Spark: Some Spark Concepts and a Use Case
 
R Spatial Analysis using SP
R Spatial Analysis using SPR Spatial Analysis using SP
R Spatial Analysis using SP
 
All I know about rsc.io/c2go
All I know about rsc.io/c2goAll I know about rsc.io/c2go
All I know about rsc.io/c2go
 
Big data shim
Big data shimBig data shim
Big data shim
 
Introduction to Python , Overview
Introduction to Python , OverviewIntroduction to Python , Overview
Introduction to Python , Overview
 
Introduction to Dart
Introduction to DartIntroduction to Dart
Introduction to Dart
 
Strata NYC 2015 - What's coming for the Spark community
Strata NYC 2015 - What's coming for the Spark communityStrata NYC 2015 - What's coming for the Spark community
Strata NYC 2015 - What's coming for the Spark community
 
Compiler2016 by abcdabcd987
Compiler2016 by abcdabcd987Compiler2016 by abcdabcd987
Compiler2016 by abcdabcd987
 
Scalding - the not-so-basics @ ScalaDays 2014
Scalding - the not-so-basics @ ScalaDays 2014Scalding - the not-so-basics @ ScalaDays 2014
Scalding - the not-so-basics @ ScalaDays 2014
 
What is data structure
What is data structureWhat is data structure
What is data structure
 
Lexyacc
LexyaccLexyacc
Lexyacc
 
Easy R
Easy REasy R
Easy R
 

Mais de Galen Charlton

Putting the Cat in the Catalogue: A Feline-Inspired OPAC Theme For Koha
Putting the Cat in the Catalogue: A Feline-Inspired OPAC Theme For KohaPutting the Cat in the Catalogue: A Feline-Inspired OPAC Theme For Koha
Putting the Cat in the Catalogue: A Feline-Inspired OPAC Theme For KohaGalen Charlton
 
Lovingly crafting a mountain, not by hand: managing piles of metadata
Lovingly crafting a mountain, not by hand: managing piles of metadataLovingly crafting a mountain, not by hand: managing piles of metadata
Lovingly crafting a mountain, not by hand: managing piles of metadataGalen Charlton
 
Calm at the center of the distributed whirlwinds
Calm at the center of the distributed whirlwindsCalm at the center of the distributed whirlwinds
Calm at the center of the distributed whirlwindsGalen Charlton
 
Scotty, I need more speed - Koha tuning
Scotty, I need more speed - Koha tuningScotty, I need more speed - Koha tuning
Scotty, I need more speed - Koha tuningGalen Charlton
 
Lightning Talk - pgTAP
Lightning Talk - pgTAPLightning Talk - pgTAP
Lightning Talk - pgTAPGalen Charlton
 
Stuff, stuff, and more stuff
Stuff, stuff, and more stuffStuff, stuff, and more stuff
Stuff, stuff, and more stuffGalen Charlton
 
Awash in a sea of connections
Awash in a sea of connectionsAwash in a sea of connections
Awash in a sea of connectionsGalen Charlton
 
Koha 3 2 Update @ Code4Lib 2010
Koha 3 2 Update @ Code4Lib 2010Koha 3 2 Update @ Code4Lib 2010
Koha 3 2 Update @ Code4Lib 2010Galen Charlton
 
‡biblios.net Web Services
‡biblios.net Web Services‡biblios.net Web Services
‡biblios.net Web ServicesGalen Charlton
 

Mais de Galen Charlton (12)

Putting the Cat in the Catalogue: A Feline-Inspired OPAC Theme For Koha
Putting the Cat in the Catalogue: A Feline-Inspired OPAC Theme For KohaPutting the Cat in the Catalogue: A Feline-Inspired OPAC Theme For Koha
Putting the Cat in the Catalogue: A Feline-Inspired OPAC Theme For Koha
 
Lovingly crafting a mountain, not by hand: managing piles of metadata
Lovingly crafting a mountain, not by hand: managing piles of metadataLovingly crafting a mountain, not by hand: managing piles of metadata
Lovingly crafting a mountain, not by hand: managing piles of metadata
 
Calm at the center of the distributed whirlwinds
Calm at the center of the distributed whirlwindsCalm at the center of the distributed whirlwinds
Calm at the center of the distributed whirlwinds
 
MetaData, MetaKoha
MetaData, MetaKohaMetaData, MetaKoha
MetaData, MetaKoha
 
Scotty, I need more speed - Koha tuning
Scotty, I need more speed - Koha tuningScotty, I need more speed - Koha tuning
Scotty, I need more speed - Koha tuning
 
Lightning Talk - pgTAP
Lightning Talk - pgTAPLightning Talk - pgTAP
Lightning Talk - pgTAP
 
Stuff, stuff, and more stuff
Stuff, stuff, and more stuffStuff, stuff, and more stuff
Stuff, stuff, and more stuff
 
Awash in a sea of connections
Awash in a sea of connectionsAwash in a sea of connections
Awash in a sea of connections
 
Koha 3 2 Update @ Code4Lib 2010
Koha 3 2 Update @ Code4Lib 2010Koha 3 2 Update @ Code4Lib 2010
Koha 3 2 Update @ Code4Lib 2010
 
Koha 3.2 Status
Koha 3.2 StatusKoha 3.2 Status
Koha 3.2 Status
 
Beyond Koha 3.2
Beyond Koha 3.2Beyond Koha 3.2
Beyond Koha 3.2
 
‡biblios.net Web Services
‡biblios.net Web Services‡biblios.net Web Services
‡biblios.net Web Services
 

Último

Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 

Último (20)

Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 

Slinging Data: Data Loading and Cleanup in Evergreen

  • 1. Slinging Data: Data Loading and Cleanup in Evergreen Growing Evergreen Conference 22 April 2010
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7. All over the map Legacy Item Type Circ Modifier 0 Regular 1 Media 45 AV 123 Reference 234 Reference
  • 8.
  • 9.
  • 10.
  • 11.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 25.
  • 26.