SlideShare uma empresa Scribd logo
1 de 28
Baixar para ler offline
Scaling Servers and
Storage for Film Assets
Mike Sundy
Digital Asset System Administrator

David Baraff
Senior Animation Research Scientist

Pixar Animation Studios
Environment Overview
Scaling Storage
Scaling Servers
Challenges with Scaling
Environment Overview
Environment


    As of March 2011:

    •  ~1000 Perforce users (80% of company)
    •  70 GB db.have
    •  12 million p4 ops per day (on busiest server)
    •  30+ VMWare server instances
    •  40 million submitted changelists (across all servers)
    •  On 2009.1 but planning to upgrade to 2010.1 soon
Growth & Types of Data


    Pixar grew from one code server in 2007 to 90+ Perforce
    servers storing all types of assets:

    •  art – reference and concept art – inspirational art for film.
    •  tech – show-specific data. e.g. models, textures, pipeline.
    •  studio – company-wide reference libraries. e.g. animation
    reference, config files, flickr-like company photo site.
    •  tools – code for our central tools team, software projects.
    •  dept – department-specific files. e.g. Creative Resources
    has “blessed” marketing images.
    •  exotics – patent data, casting audio, data for live action
    shorts, story gags, theme park concepts, intern art show.
Scaling Storage
Storage Stats


    •  115 million files in Perforce.
    •  20+ TB of versioned files.
Techniques to Manage Storage




    •  Use +S filetype for the majority of generated data.
      Saved 40% of storage for Toy Story 3 (1.2 TB).

    •  Work with teams to migrate versionless data out of
       Perforce. Saved 2 TB by moving binary scene data out.

    •  De-dupe files — saved 1 million files and 1 TB.
De-dupe Trigger Cases


    • p4 submit file1 file2 ••• fileN
      p4 submit file1 file2 ••• fileN
                              # only file2 actually modified

    •  p4 submit file         # contents: revision n
                              # five seconds later: “crap!”
      p4 submit file          # contents: revision n–1

    •  p4 delete file
       p4 submit file         # user deletes file (revision n)
                              # five seconds later: “crap!”
      p4 add file
      p4 submit file          # contents: revision n
De-dupe Trigger Mechanics


 repfile.14     repfile.15
 AABBCC…!       AABBCC…!
    file#n        file#n+1


 repfile.24     repfile.25   repfile.26
 AABBCC…!       XXYYZZ…!     AABBCC…!

    file#n        file#n+1   file#n+2


  repfile.34                 repfile.38
  AABBCC…!                   AABBCC…!

    file#n        file#n+1     file#n+2
De-dupe Trigger Mechanics


 repfile.24         repfile.25         repfile.26
 AABBCC…!           XXYYZZ…!           AABBCC…!


   •  +F for all files; detect duplicates via checksums.
   •  Safely discard duplicate:
       $ ln repfile.24 repfile.26.tmp

       $ rename repfile.26.tmp repfile.26!


 repfile.24         repfile.25           repfile.26
 AABBCC…!           XXYYZZ…!             AABBCC…!


       hardlink         repfile.26.tmp         rename
Scaling Servers
Scale Up vs. Scale Out


    Why did we choose to scale out?

    •  Shows are self-contained.
    •  Performance of one depot won’t affect another.*
    •  Easy to browse other depots.
    •  Easier administration/downtime scheduling.
    •  Fits with workflow (e.g. no merging art)
    •  Central code server – share where it matters.
Pixar Perforce Server Spec


    •  VMWare ESX Version 4.
    •  RHEL 5 (Linux 2.6).
    •  4 GB RAM.
    •  50 GB “local” data volume (on EMC SAN).
    •  Versioned files on Netapp GFX.
    •  90 Perforce depots on 6 node VMWare cluster –
    special 2-node cluster for “hot” tech show.
    •  For more details, see 2009 conference paper.
Virtualization Benefits


    •  Quick to spin up new servers.
    •  Stable and fault tolerant.
    •  Easy to remotely administer.
    •  Cost-effective.
    •  Reduces datacenter footprint, cooling, power, etc.
Reduce Dependencies


    •  Clone all servers from a VM template.
    •  RHEL vs. Fedora.
    •  Reduce triggers to minimum.
    •  Default tables, p4d startup options.
    •  Versioned files stored on NFS.
    •  VM on a cluster.
    •  Can build new VM quickly if one ever dies.
Virtualization Gotchas


    •  Had severe performance problem when one datastore
    grew to over 90% full.
    •  Requires some jockeying to ensure load stays
    balanced across multiple nodes – manual vs. auto.
    •  Physical host performance issues can cause cross-
    depot issues.
Speed of Virtual Perforce Servers


    •  Used Perforce Benchmark Results Database tools.
    •  Virtualized servers 95% of performance for
    branchsubmit benchmark.
    •  85% of performance for browse benchmark (not as
    critical to us).
    •  VMWare flexibility outweighed minor performance hit.
Quick Server Setup


    •  Critical to be able to quickly spin up new servers.
    •  Went from 2-3 days for setup to 1 hour.

    1-hour Setup
    •  Clone a p4 template VM. (30 minutes)
    •  Prep the VM. ( 15 minutes)
    •  Run “squire” script to build out p4 instance. (8 seconds)
    •  Validate and test. (15 minutes)
Squire


    Script which automates p4 server setup. Sets up:

    •  p4 binaries
    •  metadata tables (protect/triggers/typemap/counters)
    •  cron jobs (checkpoint/journal/verify)
    •  monitoring
    •  permissions (filesystem and p4)
    •  .initd startup script
    •  linkatron namespace
    •  pipeline integration (for tech depots)
    •  config files
Superp4


   Script for managing p4 metadata tables across multiple
   servers.

   •  Preferable to hand-editing 90 tables.
   •  Database driven (i.e. list of depots)
   •  Scopable by depot domain (art, tech, etc.)
   •  Rollback functionality.
Superp4 example

$ cd /usr/anim/ts3!
$ p4 triggers -o!
Triggers: 

      !noHost form-out client ”removeHost.py %formfile%”!
!
$ cat fix-noHost.py!
def modify(data, depot):!
  return [line.replace("noHost form-out”,!
                                "noHost form-in”)!
                                for line in data]!
!
$ superp4 –table triggers –script fix-noHost.py –diff!
    • Copies triggers to restore dir
    • Runs fix-noHost.py to produce new triggers, for each depot.
    • Shows me a diff of the above.
    • Asks confirmation; finally, modifies triggers on each depot.
    • Tells me where the restore dir is!!
Superp4 options

$ superp4 –help!
   -n                      Don’t actually modify data!
   -diff                   Show diffs for each depot using xdiff.

  -category category       Pick depots by category (art, tech, etc.)
  -units unit1 unit2 ...   Specify an explicit depot list (regexp allowed).

  -script script           Python file to be execfile()'d; must define a
                           function named modify().

  -table tableType         Table to operate on (triggers, typemap,…)
  -configFile configFile   Config file to modify (e.g. admin/values-config)

  -outDir outDir           Directory to store working files, and for restoral.
  -restoreDir restoreDir   Directory previously produced by running
                           superp4, for when you screw up.
Challenges With Scaling
Gotchas


   •  //spec/client filled up.
   •  user-written triggers sub-optimal.
   •  “shadow files” consumed server space.
   •  monitoring difficult – cue templaRX and mayday.
   •  cap renderfarm ops.
   •  beware of automated tests and clueless GUIs.
   •  verify can be dangerous to your health (cross-depot).
Summary


   •  Perforce scales well for large amounts of binary data.
   •  Virtualization = fast and cost-effective server setup.
   •  Use +S filetype and de-dupe to reduce storage usage.
Q&A


  Questions?

Mais conteúdo relacionado

Mais procurados

FBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp serversFBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp servers
Angelo Failla
 
Putting Wings on the Elephant
Putting Wings on the ElephantPutting Wings on the Elephant
Putting Wings on the Elephant
DataWorks Summit
 
Python Deployment with Fabric
Python Deployment with FabricPython Deployment with Fabric
Python Deployment with Fabric
andymccurdy
 

Mais procurados (20)

A Universe From Nothing
A Universe From NothingA Universe From Nothing
A Universe From Nothing
 
FBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp serversFBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp servers
 
IBM SONAS and the Cloud Storage Taxonomy
IBM SONAS and the Cloud Storage TaxonomyIBM SONAS and the Cloud Storage Taxonomy
IBM SONAS and the Cloud Storage Taxonomy
 
Ceph Day Melbourne - Scale and performance: Servicing the Fabric and the Work...
Ceph Day Melbourne - Scale and performance: Servicing the Fabric and the Work...Ceph Day Melbourne - Scale and performance: Servicing the Fabric and the Work...
Ceph Day Melbourne - Scale and performance: Servicing the Fabric and the Work...
 
Perforce Administration: Optimization, Scalability, Availability and Reliability
Perforce Administration: Optimization, Scalability, Availability and ReliabilityPerforce Administration: Optimization, Scalability, Availability and Reliability
Perforce Administration: Optimization, Scalability, Availability and Reliability
 
Perl Dist::Surveyor 2011
Perl Dist::Surveyor 2011Perl Dist::Surveyor 2011
Perl Dist::Surveyor 2011
 
4.8 apend backups
4.8 apend backups4.8 apend backups
4.8 apend backups
 
Red Hat Storage 2014 - Product(s) Overview
Red Hat Storage 2014 - Product(s) OverviewRed Hat Storage 2014 - Product(s) Overview
Red Hat Storage 2014 - Product(s) Overview
 
Kfs presentation
Kfs presentationKfs presentation
Kfs presentation
 
Pilot Hadoop Towards 2500 Nodes and Cluster Redundancy
Pilot Hadoop Towards 2500 Nodes and Cluster RedundancyPilot Hadoop Towards 2500 Nodes and Cluster Redundancy
Pilot Hadoop Towards 2500 Nodes and Cluster Redundancy
 
Quick-and-Easy Deployment of a Ceph Storage Cluster
Quick-and-Easy Deployment of a Ceph Storage ClusterQuick-and-Easy Deployment of a Ceph Storage Cluster
Quick-and-Easy Deployment of a Ceph Storage Cluster
 
Putting Wings on the Elephant
Putting Wings on the ElephantPutting Wings on the Elephant
Putting Wings on the Elephant
 
101 apend. backups
101 apend. backups101 apend. backups
101 apend. backups
 
Node Interactive Debugging Node.js In Production
Node Interactive Debugging Node.js In ProductionNode Interactive Debugging Node.js In Production
Node Interactive Debugging Node.js In Production
 
Redis 101
Redis 101Redis 101
Redis 101
 
TUT18972: Unleash the power of Ceph across the Data Center
TUT18972: Unleash the power of Ceph across the Data CenterTUT18972: Unleash the power of Ceph across the Data Center
TUT18972: Unleash the power of Ceph across the Data Center
 
Comparison of foss distributed storage
Comparison of foss distributed storageComparison of foss distributed storage
Comparison of foss distributed storage
 
OpenZFS novel algorithms: snapshots, space allocation, RAID-Z - Matt Ahrens
OpenZFS novel algorithms: snapshots, space allocation, RAID-Z - Matt AhrensOpenZFS novel algorithms: snapshots, space allocation, RAID-Z - Matt Ahrens
OpenZFS novel algorithms: snapshots, space allocation, RAID-Z - Matt Ahrens
 
Python Deployment with Fabric
Python Deployment with FabricPython Deployment with Fabric
Python Deployment with Fabric
 
Developing High Performance Application with Aerospike & Go
Developing High Performance Application with Aerospike & GoDeveloping High Performance Application with Aerospike & Go
Developing High Performance Application with Aerospike & Go
 

Destaque

[Pixar] Big Data, Big Depots
[Pixar] Big Data, Big Depots[Pixar] Big Data, Big Depots
[Pixar] Big Data, Big Depots
Perforce
 
[Pixar] Templar Underminer
[Pixar] Templar Underminer[Pixar] Templar Underminer
[Pixar] Templar Underminer
Perforce
 

Destaque (9)

[Pixar] Big Data, Big Depots
[Pixar] Big Data, Big Depots[Pixar] Big Data, Big Depots
[Pixar] Big Data, Big Depots
 
Fact Sheet: Mike Sundy & David Baraff, Pixar
Fact Sheet: Mike Sundy & David Baraff, Pixar Fact Sheet: Mike Sundy & David Baraff, Pixar
Fact Sheet: Mike Sundy & David Baraff, Pixar
 
Keynote Presentation: Perforce in the Wild
Keynote Presentation: Perforce in the WildKeynote Presentation: Perforce in the Wild
Keynote Presentation: Perforce in the Wild
 
Build Your Own Monster in the Perforce Workshop
Build Your Own Monster in the Perforce WorkshopBuild Your Own Monster in the Perforce Workshop
Build Your Own Monster in the Perforce Workshop
 
White Paper: Scaling Servers and Storage for Film Assets
White Paper: Scaling Servers and Storage for Film AssetsWhite Paper: Scaling Servers and Storage for Film Assets
White Paper: Scaling Servers and Storage for Film Assets
 
[Pixar] Templar Underminer
[Pixar] Templar Underminer[Pixar] Templar Underminer
[Pixar] Templar Underminer
 
BDAM: Big Data Asset Management
BDAM: Big Data Asset ManagementBDAM: Big Data Asset Management
BDAM: Big Data Asset Management
 
Big Data, Analytics & its impact on asset management
Big Data, Analytics & its impact on asset managementBig Data, Analytics & its impact on asset management
Big Data, Analytics & its impact on asset management
 
ClearCase Escape Plan
ClearCase Escape PlanClearCase Escape Plan
ClearCase Escape Plan
 

Semelhante a Scaling Servers and Storage for Film Assets

Shak larry-jeder-perf-and-tuning-summit14-part2-final
Shak larry-jeder-perf-and-tuning-summit14-part2-finalShak larry-jeder-perf-and-tuning-summit14-part2-final
Shak larry-jeder-perf-and-tuning-summit14-part2-final
Tommy Lee
 
SFBigAnalytics_20190724: Monitor kafka like a Pro
SFBigAnalytics_20190724: Monitor kafka like a ProSFBigAnalytics_20190724: Monitor kafka like a Pro
SFBigAnalytics_20190724: Monitor kafka like a Pro
Chester Chen
 

Semelhante a Scaling Servers and Storage for Film Assets (20)

Linux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old Secrets
 
Multi-Site Perforce at NetApp
Multi-Site Perforce at NetAppMulti-Site Perforce at NetApp
Multi-Site Perforce at NetApp
 
Linux Profiling at Netflix
Linux Profiling at NetflixLinux Profiling at Netflix
Linux Profiling at Netflix
 
Linux Perf Tools
Linux Perf ToolsLinux Perf Tools
Linux Perf Tools
 
Shak larry-jeder-perf-and-tuning-summit14-part2-final
Shak larry-jeder-perf-and-tuning-summit14-part2-finalShak larry-jeder-perf-and-tuning-summit14-part2-final
Shak larry-jeder-perf-and-tuning-summit14-part2-final
 
Hoodie: Incremental processing on hadoop
Hoodie: Incremental processing on hadoopHoodie: Incremental processing on hadoop
Hoodie: Incremental processing on hadoop
 
Supporting Android-based Platform Development in Samsung
Supporting Android-based Platform Development in SamsungSupporting Android-based Platform Development in Samsung
Supporting Android-based Platform Development in Samsung
 
Performance Benchmarking: Tips, Tricks, and Lessons Learned
Performance Benchmarking: Tips, Tricks, and Lessons LearnedPerformance Benchmarking: Tips, Tricks, and Lessons Learned
Performance Benchmarking: Tips, Tricks, and Lessons Learned
 
Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...
 
Kafka to the Maxka - (Kafka Performance Tuning)
Kafka to the Maxka - (Kafka Performance Tuning)Kafka to the Maxka - (Kafka Performance Tuning)
Kafka to the Maxka - (Kafka Performance Tuning)
 
Designing a Highly Available Environment Using Methods of Modern IT Infrastru...
Designing a Highly Available Environment Using Methods of Modern IT Infrastru...Designing a Highly Available Environment Using Methods of Modern IT Infrastru...
Designing a Highly Available Environment Using Methods of Modern IT Infrastru...
 
Docker 102 - Immutable Infrastructure
Docker 102 - Immutable InfrastructureDocker 102 - Immutable Infrastructure
Docker 102 - Immutable Infrastructure
 
TryStack: A Sandbox for OpenStack Users and Admins
TryStack: A Sandbox for OpenStack Users and AdminsTryStack: A Sandbox for OpenStack Users and Admins
TryStack: A Sandbox for OpenStack Users and Admins
 
Using Archivematica 0.8 for Digitized Content
Using Archivematica 0.8 for Digitized ContentUsing Archivematica 0.8 for Digitized Content
Using Archivematica 0.8 for Digitized Content
 
Open Source Logging and Metrics Tools
Open Source Logging and Metrics ToolsOpen Source Logging and Metrics Tools
Open Source Logging and Metrics Tools
 
Open Source Logging and Monitoring Tools
Open Source Logging and Monitoring ToolsOpen Source Logging and Monitoring Tools
Open Source Logging and Monitoring Tools
 
Migration challenges and process
Migration challenges and processMigration challenges and process
Migration challenges and process
 
xlwings performance
xlwings performancexlwings performance
xlwings performance
 
SFBigAnalytics_20190724: Monitor kafka like a Pro
SFBigAnalytics_20190724: Monitor kafka like a ProSFBigAnalytics_20190724: Monitor kafka like a Pro
SFBigAnalytics_20190724: Monitor kafka like a Pro
 
Gears of Perforce: AAA Game Development Challenges
Gears of Perforce: AAA Game Development ChallengesGears of Perforce: AAA Game Development Challenges
Gears of Perforce: AAA Game Development Challenges
 

Mais de Perforce

Mais de Perforce (20)

How to Organize Game Developers With Different Planning Needs
How to Organize Game Developers With Different Planning NeedsHow to Organize Game Developers With Different Planning Needs
How to Organize Game Developers With Different Planning Needs
 
Regulatory Traceability: How to Maintain Compliance, Quality, and Cost Effic...
Regulatory Traceability:  How to Maintain Compliance, Quality, and Cost Effic...Regulatory Traceability:  How to Maintain Compliance, Quality, and Cost Effic...
Regulatory Traceability: How to Maintain Compliance, Quality, and Cost Effic...
 
Efficient Security Development and Testing Using Dynamic and Static Code Anal...
Efficient Security Development and Testing Using Dynamic and Static Code Anal...Efficient Security Development and Testing Using Dynamic and Static Code Anal...
Efficient Security Development and Testing Using Dynamic and Static Code Anal...
 
Understanding Compliant Workflow Enforcement SOPs
Understanding Compliant Workflow Enforcement SOPsUnderstanding Compliant Workflow Enforcement SOPs
Understanding Compliant Workflow Enforcement SOPs
 
Branching Out: How To Automate Your Development Process
Branching Out: How To Automate Your Development ProcessBranching Out: How To Automate Your Development Process
Branching Out: How To Automate Your Development Process
 
How to Do Code Reviews at Massive Scale For DevOps
How to Do Code Reviews at Massive Scale For DevOpsHow to Do Code Reviews at Massive Scale For DevOps
How to Do Code Reviews at Massive Scale For DevOps
 
How to Spark Joy In Your Product Backlog
How to Spark Joy In Your Product Backlog How to Spark Joy In Your Product Backlog
How to Spark Joy In Your Product Backlog
 
Going Remote: Build Up Your Game Dev Team
Going Remote: Build Up Your Game Dev Team Going Remote: Build Up Your Game Dev Team
Going Remote: Build Up Your Game Dev Team
 
Shift to Remote: How to Manage Your New Workflow
Shift to Remote: How to Manage Your New WorkflowShift to Remote: How to Manage Your New Workflow
Shift to Remote: How to Manage Your New Workflow
 
Hybrid Development Methodology in a Regulated World
Hybrid Development Methodology in a Regulated WorldHybrid Development Methodology in a Regulated World
Hybrid Development Methodology in a Regulated World
 
Better, Faster, Easier: How to Make Git Really Work in the Enterprise
Better, Faster, Easier: How to Make Git Really Work in the EnterpriseBetter, Faster, Easier: How to Make Git Really Work in the Enterprise
Better, Faster, Easier: How to Make Git Really Work in the Enterprise
 
Easier Requirements Management Using Diagrams In Helix ALM
Easier Requirements Management Using Diagrams In Helix ALMEasier Requirements Management Using Diagrams In Helix ALM
Easier Requirements Management Using Diagrams In Helix ALM
 
How To Master Your Mega Backlog
How To Master Your Mega Backlog How To Master Your Mega Backlog
How To Master Your Mega Backlog
 
Achieving Software Safety, Security, and Reliability Part 3: What Does the Fu...
Achieving Software Safety, Security, and Reliability Part 3: What Does the Fu...Achieving Software Safety, Security, and Reliability Part 3: What Does the Fu...
Achieving Software Safety, Security, and Reliability Part 3: What Does the Fu...
 
How to Scale With Helix Core and Microsoft Azure
How to Scale With Helix Core and Microsoft Azure How to Scale With Helix Core and Microsoft Azure
How to Scale With Helix Core and Microsoft Azure
 
Achieving Software Safety, Security, and Reliability Part 2
Achieving Software Safety, Security, and Reliability Part 2Achieving Software Safety, Security, and Reliability Part 2
Achieving Software Safety, Security, and Reliability Part 2
 
Should You Break Up With Your Monolith?
Should You Break Up With Your Monolith?Should You Break Up With Your Monolith?
Should You Break Up With Your Monolith?
 
Achieving Software Safety, Security, and Reliability Part 1: Common Industry ...
Achieving Software Safety, Security, and Reliability Part 1: Common Industry ...Achieving Software Safety, Security, and Reliability Part 1: Common Industry ...
Achieving Software Safety, Security, and Reliability Part 1: Common Industry ...
 
What's New in Helix ALM 2019.4
What's New in Helix ALM 2019.4What's New in Helix ALM 2019.4
What's New in Helix ALM 2019.4
 
Free Yourself From the MS Office Prison
Free Yourself From the MS Office Prison Free Yourself From the MS Office Prison
Free Yourself From the MS Office Prison
 

Último

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 

Último (20)

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 

Scaling Servers and Storage for Film Assets

  • 1. Scaling Servers and Storage for Film Assets Mike Sundy Digital Asset System Administrator David Baraff Senior Animation Research Scientist Pixar Animation Studios
  • 2.
  • 3. Environment Overview Scaling Storage Scaling Servers Challenges with Scaling
  • 5. Environment As of March 2011: •  ~1000 Perforce users (80% of company) •  70 GB db.have •  12 million p4 ops per day (on busiest server) •  30+ VMWare server instances •  40 million submitted changelists (across all servers) •  On 2009.1 but planning to upgrade to 2010.1 soon
  • 6. Growth & Types of Data Pixar grew from one code server in 2007 to 90+ Perforce servers storing all types of assets: •  art – reference and concept art – inspirational art for film. •  tech – show-specific data. e.g. models, textures, pipeline. •  studio – company-wide reference libraries. e.g. animation reference, config files, flickr-like company photo site. •  tools – code for our central tools team, software projects. •  dept – department-specific files. e.g. Creative Resources has “blessed” marketing images. •  exotics – patent data, casting audio, data for live action shorts, story gags, theme park concepts, intern art show.
  • 8. Storage Stats •  115 million files in Perforce. •  20+ TB of versioned files.
  • 9. Techniques to Manage Storage •  Use +S filetype for the majority of generated data. Saved 40% of storage for Toy Story 3 (1.2 TB). •  Work with teams to migrate versionless data out of Perforce. Saved 2 TB by moving binary scene data out. •  De-dupe files — saved 1 million files and 1 TB.
  • 10. De-dupe Trigger Cases • p4 submit file1 file2 ••• fileN p4 submit file1 file2 ••• fileN # only file2 actually modified •  p4 submit file # contents: revision n # five seconds later: “crap!” p4 submit file # contents: revision n–1 •  p4 delete file p4 submit file # user deletes file (revision n) # five seconds later: “crap!” p4 add file p4 submit file # contents: revision n
  • 11. De-dupe Trigger Mechanics repfile.14 repfile.15 AABBCC…! AABBCC…! file#n file#n+1 repfile.24 repfile.25 repfile.26 AABBCC…! XXYYZZ…! AABBCC…! file#n file#n+1 file#n+2 repfile.34 repfile.38 AABBCC…! AABBCC…! file#n file#n+1 file#n+2
  • 12. De-dupe Trigger Mechanics repfile.24 repfile.25 repfile.26 AABBCC…! XXYYZZ…! AABBCC…! •  +F for all files; detect duplicates via checksums. •  Safely discard duplicate: $ ln repfile.24 repfile.26.tmp
 $ rename repfile.26.tmp repfile.26! repfile.24 repfile.25 repfile.26 AABBCC…! XXYYZZ…! AABBCC…! hardlink repfile.26.tmp rename
  • 14. Scale Up vs. Scale Out Why did we choose to scale out? •  Shows are self-contained. •  Performance of one depot won’t affect another.* •  Easy to browse other depots. •  Easier administration/downtime scheduling. •  Fits with workflow (e.g. no merging art) •  Central code server – share where it matters.
  • 15. Pixar Perforce Server Spec •  VMWare ESX Version 4. •  RHEL 5 (Linux 2.6). •  4 GB RAM. •  50 GB “local” data volume (on EMC SAN). •  Versioned files on Netapp GFX. •  90 Perforce depots on 6 node VMWare cluster – special 2-node cluster for “hot” tech show. •  For more details, see 2009 conference paper.
  • 16. Virtualization Benefits •  Quick to spin up new servers. •  Stable and fault tolerant. •  Easy to remotely administer. •  Cost-effective. •  Reduces datacenter footprint, cooling, power, etc.
  • 17. Reduce Dependencies •  Clone all servers from a VM template. •  RHEL vs. Fedora. •  Reduce triggers to minimum. •  Default tables, p4d startup options. •  Versioned files stored on NFS. •  VM on a cluster. •  Can build new VM quickly if one ever dies.
  • 18. Virtualization Gotchas •  Had severe performance problem when one datastore grew to over 90% full. •  Requires some jockeying to ensure load stays balanced across multiple nodes – manual vs. auto. •  Physical host performance issues can cause cross- depot issues.
  • 19. Speed of Virtual Perforce Servers •  Used Perforce Benchmark Results Database tools. •  Virtualized servers 95% of performance for branchsubmit benchmark. •  85% of performance for browse benchmark (not as critical to us). •  VMWare flexibility outweighed minor performance hit.
  • 20. Quick Server Setup •  Critical to be able to quickly spin up new servers. •  Went from 2-3 days for setup to 1 hour. 1-hour Setup •  Clone a p4 template VM. (30 minutes) •  Prep the VM. ( 15 minutes) •  Run “squire” script to build out p4 instance. (8 seconds) •  Validate and test. (15 minutes)
  • 21. Squire Script which automates p4 server setup. Sets up: •  p4 binaries •  metadata tables (protect/triggers/typemap/counters) •  cron jobs (checkpoint/journal/verify) •  monitoring •  permissions (filesystem and p4) •  .initd startup script •  linkatron namespace •  pipeline integration (for tech depots) •  config files
  • 22. Superp4 Script for managing p4 metadata tables across multiple servers. •  Preferable to hand-editing 90 tables. •  Database driven (i.e. list of depots) •  Scopable by depot domain (art, tech, etc.) •  Rollback functionality.
  • 23. Superp4 example $ cd /usr/anim/ts3! $ p4 triggers -o! Triggers: 
 !noHost form-out client ”removeHost.py %formfile%”! ! $ cat fix-noHost.py! def modify(data, depot):! return [line.replace("noHost form-out”,! "noHost form-in”)! for line in data]! ! $ superp4 –table triggers –script fix-noHost.py –diff! • Copies triggers to restore dir • Runs fix-noHost.py to produce new triggers, for each depot. • Shows me a diff of the above. • Asks confirmation; finally, modifies triggers on each depot. • Tells me where the restore dir is!!
  • 24. Superp4 options $ superp4 –help! -n Don’t actually modify data! -diff Show diffs for each depot using xdiff. -category category Pick depots by category (art, tech, etc.) -units unit1 unit2 ... Specify an explicit depot list (regexp allowed). -script script Python file to be execfile()'d; must define a function named modify(). -table tableType Table to operate on (triggers, typemap,…) -configFile configFile Config file to modify (e.g. admin/values-config) -outDir outDir Directory to store working files, and for restoral. -restoreDir restoreDir Directory previously produced by running superp4, for when you screw up.
  • 26. Gotchas •  //spec/client filled up. •  user-written triggers sub-optimal. •  “shadow files” consumed server space. •  monitoring difficult – cue templaRX and mayday. •  cap renderfarm ops. •  beware of automated tests and clueless GUIs. •  verify can be dangerous to your health (cross-depot).
  • 27. Summary •  Perforce scales well for large amounts of binary data. •  Virtualization = fast and cost-effective server setup. •  Use +S filetype and de-dupe to reduce storage usage.