Transcript
Page 1: [Pixar] Big Data, Big Depots


Pixar: Big Data, Big Depots

Mark Harrison, Tech Lead, Data Management Group

Mike Sundy, Senior Asset Administrator
David Baraff, Senior Animation Scientist
Pixar Animation Studios, Emeryville, CA


Page 2: [Pixar] Big Data, Big Depots


Three Fundamental Questions of Big Data

•  How big is a big repository?
•  How long:
   •  Do operations take?
   •  Do you use your data?
•  How can you back things up?

Page 3: [Pixar] Big Data, Big Depots


Templar Refresher

•  What we store:
   •  Source code, film assets, original artwork, video
•  Scale of what we store:
   •  Many TB, 100+ depots
•  How long we store it for:
   •  50 year charter
•  Why P4 for storing:
   •  Scalability, off-the-shelf

Page 4: [Pixar] Big Data, Big Depots


Big Data Problems

•  Dealing with slow operations on huge files
•  Dealing with high backup costs
•  Dealing with depots that are 99% R/O, 1% R/W

Page 5: [Pixar] Big Data, Big Depots


Slow Operations

•  Biggest asset: 900 GB
•  Biggest checkin: 6.5 TB
•  Biggest depot: 35 TB
•  Rule of thumb (sketched below):
   •  1 MB = 1 sec, 1 GB = 1 minute, 1 TB = 14 hours
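A back-of-the-envelope sketch of that rule of thumb in Python. The GB and TB figures imply roughly a minute per GB; the 1 MB = 1 sec figure is mostly fixed per-operation overhead, so treat this as an order-of-magnitude guide, not a measured rate. The sizes are the slide's examples; the rate constant is the only input.

    # Rough scaling sketch: roughly a minute per GB of data touched.
    SECONDS_PER_GB = 60.0

    def estimate_hours(size_gb):
        """Very rough wall-clock estimate for an operation touching size_gb of data."""
        return size_gb * SECONDS_PER_GB / 3600

    for label, size_gb in [("900 GB asset", 900),
                           ("6.5 TB checkin", 6.5 * 1024),
                           ("35 TB depot", 35 * 1024)]:
        print(f"{label}: ~{estimate_hours(size_gb):.0f} hours")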

Page 6: [Pixar] Big Data, Big Depots


How do you back it up?

•  Tape, offsite
   •  Expensive
   •  Recurring operation
   •  Needs to be verified, refreshed, tapes upgraded
•  Archived/non-archived
   •  Things not actively being changed can be backed up less aggressively (= more cheaply)

Page 7: [Pixar] Big Data, Big Depots


A Modest User Request

•  “Can we make the depot read-only to reduce costs, but still write to it just a little bit?”

•  Response 1:
   •  #*@#$(*%%%( !!!
•  Response 2:
   •  “we shall investigate the issue and let you know.”

Page 8: [Pixar] Big Data, Big Depots


Problem: Archive + Non Archive

•  Movie process, never totally finished
•  Need to periodically make small additions/tweaks
   •  E.g. ads, interstitials, Oscar promo (hopefully!)
•  Applicable to lots of industries

Page 9: [Pixar] Big Data, Big Depots


Archive + Non Archive (requirements)

•  Conflicting goals:
   •  readonly / writable
   •  slow cheap storage / fast expensive storage
•  Lots of data duplication = need to dedupe
•  Most common file:
   •  125,241 copies

Page 10: [Pixar] Big Data, Big Depots


Archive + Non Archive (requirements)

•  # cueformat = 5;

Page 11: [Pixar] Big Data, Big Depots


Ideas which didn’t work for us

•  +X filetype
•  p4 snap
•  Various vendor backup things

Page 12: [Pixar] Big Data, Big Depots


Solution: split data onto multiple volumes

•  “old” stuff onto archive R/O volume
•  “new” stuff onto active, writable volume
•  Works because of ,d magic (see the sketch below)
•  Hooray P4 Super Brains!!
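A minimal sketch of the volume split, assuming hypothetical mount points (ACTIVE_ROOT, ARCHIVE_ROOT) and an illustrative undermine() helper; the real librarian (',d') layout and file formats are site-specific.

    import os
    import shutil

    # Hypothetical mount points; the real depot layout is site-specific.
    ACTIVE_ROOT = "/p4/markive"        # fast, writable volume (where p4d looks)
    ARCHIVE_ROOT = "/archive/markive"  # cheap, mostly read-only volume

    def undermine(rel_path):
        """Copy one librarian file (a revision under a ',d' directory, say) to the
        archive volume, then swap the original for a symlink so p4d keeps
        resolving the path it has always used."""
        src = os.path.join(ACTIVE_ROOT, rel_path)
        dst = os.path.join(ARCHIVE_ROOT, rel_path)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.copy2(src, dst)        # land the bytes on cheap storage first
        tmp = src + ".undermine"
        os.symlink(dst, tmp)          # build the symlink next to the original
        os.replace(tmp, src)          # atomically swap the link in place of the file

New submits still land as regular files on the writable volume; only "old" revisions end up as symlinks into the read-only volume.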

Page 13: [Pixar] Big Data, Big Depots


Horizontal linking / Vertical Linking

•  Horizontal = volume splitting = underminer
•  Vertical = deduping = shrink ray

Page 14: [Pixar] Big Data, Big Depots

Underminer

[Diagram: the P4 depot spans writable storage and read-only storage, with depot symlinks pointing from the writable volume to the read-only volume]

Page 15: [Pixar] Big Data, Big Depots

Underminer

•  Writable storage:
   •  Normal Perforce configuration
   •  New files added here
   •  Perforce doesn’t care if files are symlinks (GENIUS!)
•  Read-only storage:
   •  Files copied from other storage
   •  Symlinks established
   •  Uses ‘archive’

Page 16: [Pixar] Big Data, Big Depots

Shrink Ray

[Diagram: several duplicate files hard linked to a single copy on one volume]

•  All duplicate files on one volume hard linked together.
•  Reduces storage overhead, but transparent to Perforce.

Page 17: [Pixar] Big Data, Big Depots

[Diagram: shrink ray hard links within the active depot storage and within the read-only storage; underminer links between the two volumes]

Page 18: [Pixar] Big Data, Big Depots


Shrink Ray – P4 Deduper

•  Simple file-level dedupe (not block level)
•  Do this in real-time on checkin
•  Batch shrink ray for undermined files
•  Use p4 checksums, move into database
•  Now all queries are db operations
•  Easily cap number of hard links (see the sketch below)
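A minimal sketch of that file-level dedupe. SQLite and the md5_of()/shrink() helpers are illustrative stand-ins: here the digests are computed locally rather than taken from Perforce's own checksums, and the fileinfo table mirrors the one queried on the next slide.

    import hashlib
    import os
    import sqlite3

    db = sqlite3.connect("fileinfo.db")
    db.execute("CREATE TABLE IF NOT EXISTS fileinfo (path TEXT PRIMARY KEY, digest TEXT)")
    db.execute("CREATE INDEX IF NOT EXISTS fileinfo_digest ON fileinfo (digest)")

    MAX_LINKS = 1000  # cap on hard links per inode

    def md5_of(path):
        h = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest().upper()

    def shrink(path):
        """Record this file's digest; if an identical file already lives on the same
        volume, replace the new copy with a hard link to it (hard links cannot
        cross volumes, which is why shrink ray works per volume)."""
        digest = md5_of(path)
        rows = db.execute("SELECT path FROM fileinfo WHERE digest = ?", (digest,)).fetchall()
        for (other,) in rows:
            if (other != path and os.path.exists(other)
                    and os.stat(other).st_dev == os.stat(path).st_dev
                    and os.stat(other).st_nlink < MAX_LINKS):
                os.remove(path)
                os.link(other, path)  # one copy on disk, many depot paths
                break
        db.execute("INSERT OR REPLACE INTO fileinfo VALUES (?, ?)", (path, digest))
        db.commit()

Running the same check from a submit trigger covers new files in real time; a batch pass handles files that were undermined earlier.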

Page 19: [Pixar] Big Data, Big Depots


Sample db queries

•  Unique files:
   select count(*) from (select digest, count(1) from fileinfo group by digest)
•  How many copies of this file’s contents?
   select count(*) from fileinfo where digest = '1E8529CE1AE991982A0FB5FD760CE92D'

Page 20: [Pixar] Big Data, Big Depots


Why not p4 verify?

•  Takes a long time: 1 week on mediavault depot!
•  Due to NFS, can incapacitate p4 server machine
•  Caused “fear pathology” among p4 end users when p4 operations would “hang”
•  False alarms due to “p4 purge” coupled with long running time of “p4 verify”

Page 21: [Pixar] Big Data, Big Depots


solution: the suminator (offline verify)

•  Runs on separate machine from P4 server
•  Accesses repository store via NFS
•  Gets checksums via “p4 fstat”
•  Verifies repository based on those checksums
•  Uses in-house Python streaming API to minimize memory footprint and startup delay
•  Parallelizable: can farm out to multiple machines (see the sketch below)
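A minimal sketch of the idea (not the in-house streaming API the slide mentions): pull the MD5 digests with 'p4 -ztag fstat -Ol', then checksum the matching librarian files over NFS. The p4_digests()/verify_archive() helpers are illustrative; mapping a depot path to its archive file, and handling storage formats other than a single gzip per revision, are left out.

    import gzip
    import hashlib
    import subprocess

    def p4_digests(path_spec):
        """Yield (depotFile, digest) pairs from 'p4 -ztag fstat -Ol'."""
        out = subprocess.run(["p4", "-ztag", "fstat", "-Ol", path_spec],
                             capture_output=True, text=True, check=True).stdout
        depot_file = None
        for line in out.splitlines():
            if line.startswith("... depotFile "):
                depot_file = line.split(" ", 2)[2]
            elif line.startswith("... digest "):
                yield depot_file, line.split(" ", 2)[2]

    def verify_archive(archive_file, expected_digest):
        """Checksum one gzip-stored librarian file read over NFS and compare it
        to the digest Perforce reported, without loading the p4 server machine."""
        h = hashlib.md5()
        with gzip.open(archive_file, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest().upper() == expected_digest.upper()

Because nothing here runs on the p4 server itself, ranges of the depot can be handed to several verifier machines in parallel.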

Page 22: [Pixar] Big Data, Big Depots


problem: p4 submit of big data

•  no idea if submit is still running
•  one mediavault check-in of 6 TB took 3 days
•  all the user sees:

   > p4 submit -d "test" bigfile
   Submitting change 5784.
   Locking 1 files ...
   edit //markive/miketest/bigfile#2

Page 23: [Pixar] Big Data, Big Depots


solution: progress indicator

•  As of p4 2012.2, you can use ‘p4 -I submit’:

   p4 -I submit bigfile
   Change 5788 created with 1 open file(s).
   Submitting change 5788.
   Locking 1 files ...
   /home/msundy/depots/markive/miketest/bigfile 43%

•  provides feedback and predictability
•  users are happy

Page 24: [Pixar] Big Data, Big Depots


problem: debug big data submit performance

•  triggers can take several minutes with big data
•  p4 log not granular enough to debug trigger performance
•  how long did we spend in each trigger?

   2013/02/25 16:35:28 pid 24112 msundy@msundy-home-depot-shaunkive 138.72.131.168 [p4/2012.1/LINUX26X86_64/442152] 'dm-CommitSubmit'
   <commit triggers fire; can only see p4 ops from within triggers, not trigger phases>
   2013/02/25 16:35:59 pid 24112 completed 30.03s 1+1us 192+264io 0+0net 2672k 0pf

Page 25: [Pixar] Big Data, Big Depots


solution: structured logging

•  as of 2012.1
•  results: cut an 80-minute thumbnail branch bug down to 5 seconds (see the parsing sketch below)

   JSON:11,1343771362,805530491,2012/07/31 14:49:22 805530491,6280,17,msundy,focus-msundy-markive,dm-CommitSubmit,138.72.247.75,,v66,,1,/usr/anim/modsquad/bin/pyrunmodule prodp4.triggers.thumbnails 138.72.240.162%3A1666 5576
   JSON:11,1343776124,571539452,2012/07/31 16:08:44 571539452,6280,17,msundy,focus-msundy-markive,dm-CommitSubmit,unknown,,v66,,2,
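A minimal sketch of pulling trigger timing back out of records like the two above. The field positions (epoch seconds in the second field, pid in the fifth) are read off those sample lines, the leading 'JSON:' tag is simply stripped, and trigger_durations() is an illustrative helper; check the structured-logging schema for the event types you actually log.

    import csv

    def trigger_durations(log_path):
        """Pair up records by pid and report the elapsed time between the first
        and second record seen for each pid."""
        first_seen = {}
        with open(log_path, newline="") as f:
            for row in csv.reader(f):
                if not row:
                    continue
                row[0] = row[0].split(":")[-1]   # drop a 'JSON:'-style prefix
                epoch, pid = int(row[1]), row[4]
                if pid in first_seen:
                    print(f"pid {pid}: {epoch - first_seen.pop(pid)} s between records")
                else:
                    first_seen[pid] = epoch

    # For the two sample records above this prints roughly 4762 s,
    # i.e. the ~80-minute thumbnail trigger.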

Page 26: [Pixar] Big Data, Big Depots


Questions?

Mark Harrison ([email protected])
Mike Sundy ([email protected])
David Baraff ([email protected])

