pixiv thumbnails
Introduction to pixiv
● Illust Communication Site
  ○ http://www.pixiv.net
● Users
  ○ 5 million
● Monthly page views
  ○ 3.3 billion
● Network Traffic
  ○ Over 6Gbps
About me
● Tatsuhiko Kubo (H.N: bokko)
● @cubicdaiya
● Infrastructure & Software Engineer at pixiv Inc.
My Work at pixiv Inc.
Responsible for
● Middleware Development
● Technical Operation & Administration
● Datastore & Caching Strategy
● Image Upload & Transformation
● Illust & User Recommendation
● Notification (called Popboard at pixiv)
etc...
My Work in Private
■ Software
● neoagent
  ○ Yet another memcached protocol proxy server
● dtl
  ○ diff template library in C++
● ngx_small_light
  ○ Dynamic image transformation module for Nginx
■ Writing
● Software Design 2009 Sep
  ○ How is a diff derived? Understanding how diff works
  ○ http://gihyo.jp/dev/column/01/prog/2011/diff_sd200906
And more -> http://cccis.jp
Agenda
● The following mechanisms at pixiv
  ○ Image Upload
  ○ Thumbnail Generation
Scale of pixiv in terms of thumbnails
● pixiv has about 30,000,000 illusts and comics
● Each illust has about 12 or more thumbnails
● 20,000 illusts and comics are uploaded every day
● Total volume is about 30TB
Image Upload
Image Upload Detail 1
● Generate many thumbnails
  ○ about 12 or more per image
● Save the original image and its thumbnails to storage
  ○ not NFS
  ○ with an in-house WebDAV client (ImageClient)
ImageClient
WebDAV Client
ImageClient
Powered by @kamipo
ImageClient
● Performs WebDAV operations against multiple servers transparently
  ○ put image
  ○ delete image
  ○ move image
  ○ make directory
  ○ move directory
  ○ delete directory
ImageClient
[Diagram: ImageClient sends PUT, DELETE, MOVE, and MKCOL requests to each of several Image Storage servers]
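The fan-out in the diagram above can be sketched as follows. This is a minimal illustration, not ImageClient itself: the server names, the function name `build_requests`, and its parameters are all assumptions, since the slides do not show ImageClient's interface. It only builds the requests, one per storage server, and does not send them.

```python
from urllib.request import Request

# Hypothetical storage hosts; the real server list is not given in the slides.
STORAGE_SERVERS = ["http://img-store1.example", "http://img-store2.example"]

WEBDAV_METHODS = {"PUT", "DELETE", "MOVE", "MKCOL"}


def build_requests(method, path, servers=STORAGE_SERVERS, body=None, destination=None):
    """Build the same WebDAV request once per storage server."""
    if method not in WEBDAV_METHODS:
        raise ValueError(f"unsupported WebDAV method: {method}")
    if method == "MOVE" and destination is None:
        raise ValueError("MOVE needs a destination path")

    requests = []
    for base in servers:
        headers = {}
        if method == "MOVE":
            # WebDAV MOVE names its target in the Destination header.
            headers["Destination"] = base + destination
        requests.append(Request(base + path, data=body, headers=headers, method=method))
    return requests
```

Each returned object could then be passed to `urllib.request.urlopen`. How the real client handles a failure on one of the servers is not covered in the slides, so this sketch leaves it out.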
Image Upload Detail 2
● An image is uploaded semi-asynchronously
  ○ Generating thumbnails takes a long time
Semi-asynchronous upload mechanism
● User Side Action
  ○ File Selection View -> Input Information View -> Completed View
  ○ create lock file
  ○ poll until lock file is deleted
● Server Side Action
  ○ prefork server preforks several upload workers
  ○ each upload worker generates thumbnails, then deletes the lock file
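The lock-file handshake described above can be sketched in a few lines. This is an illustration under stated assumptions, not pixiv's code: a `queue.Queue` stands in for the upload job queue, a thread stands in for a preforked worker process, and all function names and the `.lock` suffix are made up for the example.

```python
import os
import queue
import tempfile
import threading
import time

jobs = queue.Queue()  # stand-in for the upload job queue


def upload_worker():
    """Server side: take a job, generate thumbnails, then delete the lock file."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: stop the worker
            break
        time.sleep(0.05)  # placeholder for slow thumbnail generation
        os.remove(job["lock_path"])


def submit_upload(image_name, lock_dir):
    """User side: create the lock file and enqueue the upload job."""
    lock_path = os.path.join(lock_dir, image_name + ".lock")
    with open(lock_path, "x"):  # "x" fails if the lock already exists
        pass
    jobs.put({"image": image_name, "lock_path": lock_path})
    return lock_path


def wait_until_done(lock_path, timeout=5.0, interval=0.01):
    """User side: poll until the worker has deleted the lock file."""
    deadline = time.monotonic() + timeout
    while os.path.exists(lock_path):
        if time.monotonic() > deadline:
            return False
        time.sleep(interval)
    return True


def demo():
    with tempfile.TemporaryDirectory() as lock_dir:
        worker = threading.Thread(target=upload_worker)
        worker.start()
        lock_path = submit_upload("illust_1.jpg", lock_dir)
        done = wait_until_done(lock_path)
        jobs.put(None)
        worker.join()
        return done
```

The user keeps filling in the information form while the worker runs, which is why the upload only feels asynchronous: the completed view is still held back until the lock file is gone.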
Inside Image Upload
● User Side Action
  ○ Apache
  ○ PHP
    ■ ImageClient
● Server Side Action
  ○ daemontools
  ○ Python
    ■ python-prefork, python-q4m, python-worker
      ● python-q4m and python-worker are in-house libraries
  ○ Q4M
    ■ Used as the upload job queue
Thumbnail Generation
Two types of thumbnails at pixiv
● Static Thumbnail
● Dynamic Thumbnail
Static Thumbnail
Generated on upload
Details of thumbnail generation
● pixiv uses ImageMagick
  ○ ImageMagick is not fast
  ○ But the quality of the generated images is good
  ○ Quality is more important than speed for us
  ○ Of course, optimization is important, too
ImageMagick Optimization
libjpeg-turbo
● An optimized libjpeg for x86 and x86_64
● Replaces libjpeg simply by setting LD_LIBRARY_PATH
● ImageMagick uses libjpeg
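A minimal sketch of the LD_LIBRARY_PATH swap, assuming ImageMagick is run as a child process and links libjpeg dynamically. The install directory and the helper name are hypothetical; the slides do not say where libjpeg-turbo is installed or how the variable is set in production.

```python
import os

# Hypothetical install location for libjpeg-turbo; not taken from the slides.
TURBO_LIB_DIR = "/opt/libjpeg-turbo/lib"


def env_with_library_first(lib_dir, base_env=None):
    """Return a copy of the environment with lib_dir searched before other library paths."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    # Entries in LD_LIBRARY_PATH are colon-separated and searched left to right.
    env["LD_LIBRARY_PATH"] = lib_dir + (":" + existing if existing else "")
    return env
```

The result would be passed as `env=` to `subprocess.run` when launching the image-processing command, so the dynamic linker picks up the libjpeg-turbo build without recompiling ImageMagick.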
Benchmark of libjpeg and libjpeg-turbo
[Chart: processing a JPEG image with libjpeg and libjpeg-turbo; legend: libjpeg, libjpeg-turbo]
libjpeg-turbo is 10% faster than libjpeg on x86_64
Disable OpenMP
● The latest ImageMagick has OpenMP enabled by default
  ○ This is very slow in a multi-process environment
● How to disable OpenMP
  ○ Recompile with '--disable-openmp'
  ○ Set OMP_NUM_THREADS=1
● pixiv takes the latter
  ○ Recompiling and repackaging are complicated
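One way to apply the OMP_NUM_THREADS=1 setting is per child process, sketched below. The helper names are illustrative, and the slides do not say where pixiv actually sets the variable, so treat this as one possible arrangement. The `convert ... -resize WxH ...` command is the standard ImageMagick command-line form.

```python
import os
import subprocess


def single_threaded_env(base_env=None):
    """Copy the environment and limit OpenMP to one thread per process."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = "1"
    return env


def resize_command(src, dst, width, height):
    """Build an ImageMagick command line that resizes src into dst."""
    return ["convert", src, "-resize", f"{width}x{height}", dst]


def make_thumbnail(src, dst, width, height):
    """Run ImageMagick with OpenMP limited to a single thread."""
    subprocess.run(
        resize_command(src, dst, width, height),
        env=single_threaded_env(),
        check=True,
    )
```

With many worker processes already running in parallel, one thread per process keeps the workers from competing for the same cores.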
OMP_NUM_THREADS=1
[Chart: CPU Usage of Application Server]
Peak CPU usage dropped by 150~200%
[Chart: Load Average of Application Server]
Load average dropped to almost nothing
Dynamic Thumbnail
Generated on demand
Why is dynamic thumbnail needed?
● Static thumbnail consumes disk space
  ○ Dynamic thumbnail does not consume disk space
● Preparing new static thumbnails takes a long time
  ○ Dynamic thumbnail is ready in a second!
Other Companies' Cases
● Cookpad
  ○ mod_tofu
  ○ http://www.slideshare.net/mirakui/ss-8150494
● livedoor (now NHN Japan)
  ○ mod_small_light
  ○ http://www.slideshare.net/livedoor/smalllight2
pixiv's Case
mod_small_light-1.1.1 pixiv Edition5
Heavily customized...
mod_small_light configuration
Resize image with mod_small_light
GET /tank.jpg
GET /small_light(dw=100,dh=100)/tank.jpg
Many Options
q    image quality (0~100)
of   output format (jpg, gif, png, tiff)
e    processing engine (imlib2, imagemagick)
cc   canvas color
p    pattern name defined with SmallLightPatternDefine
etc  Too many options
Pattern Name
Resize image with mod_small_light
GET /tank.jpg
GET /small_light(p=small)/tank.jpg
or
GET /small_light(dw=100,dh=100,e=imagemagick,jpeghint=y)/tank.jpg
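The request paths above follow a simple shape: the options go in parentheses after `/small_light`, followed by the original image path. A small helper can build them; the function name is an assumption for illustration and is not part of mod_small_light.

```python
def small_light_path(image_path, **options):
    """Build a mod_small_light style request path from an image path and options."""
    if not image_path.startswith("/"):
        raise ValueError("image_path must start with '/'")
    # Keyword arguments keep their call order, so the options appear as written.
    params = ",".join(f"{key}={value}" for key, value in options.items())
    return f"/small_light({params}){image_path}"
```

For example, `small_light_path("/tank.jpg", dw=100, dh=100)` gives `/small_light(dw=100,dh=100)/tank.jpg`, and `small_light_path("/tank.jpg", p="small")` gives the pattern-name form.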
Why not original mod_small_light?
● Some thumbnails are special
  ○ comic cover
  ○ various cropping algorithms depending on aspect ratio
● Default output format is JPEG
  ○ We want the output format to be the same as the input format
Why not original mod_small_light?
● No support for CMYK
  ○ pixiv must support CMYK
● Support for unusual aspect ratios was needed
  ○ da=l -> long-edge
  ○ da=s -> short-edge
  ○ da=p -> pixiv-edge (only pixiv Edition)
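The slide names the aspect modes but not their arithmetic. The sketch below shows one common reading, which is an assumption and not taken from the mod_small_light source: long-edge scales the image so that it fits entirely inside the destination box, and short-edge scales it so that it covers the box. The pixiv-edge mode is not described in the slides and is left out.

```python
def fit_size(src_w, src_h, dst_w, dst_h, mode):
    """Scale (src_w, src_h) toward (dst_w, dst_h) while keeping the aspect ratio.

    mode "l" (assumed meaning of long-edge): the whole image fits inside the box.
    mode "s" (assumed meaning of short-edge): the image covers the whole box.
    """
    scale_w = dst_w / src_w
    scale_h = dst_h / src_h
    if mode == "l":
        scale = min(scale_w, scale_h)
    elif mode == "s":
        scale = max(scale_w, scale_h)
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Never return a zero-sized edge for extreme aspect ratios.
    return max(1, round(src_w * scale)), max(1, round(src_h * scale))
```

Under this reading, a 400x200 source with a 100x100 destination becomes 100x50 in mode "l" and 200x100 in mode "s".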
Special Thumbnail 1
● comic cover
Special Thumbnail 2
● special cropping algorithm
  ○ crop the center of the image if the image is landscape
Special Thumbnail 3
● special cropping algorithm
  ○ crop the top of the image if the image is portrait
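The two cropping rules above can be written as a single function that returns a square crop box. This is a sketch of the rule as stated on the slides; the function name and the square output are assumptions, and the box uses the (left, upper, right, lower) order that Pillow's `Image.crop` accepts.

```python
def square_crop_box(width, height):
    """Return (left, upper, right, lower) for a square crop.

    Landscape images are cropped around the horizontal center.
    Portrait images are cropped from the top.
    """
    side = min(width, height)
    if width > height:
        left = (width - side) // 2  # landscape: center the square horizontally
    else:
        left = 0  # portrait or square: full width, starting at the top
    upper = 0
    return (left, upper, left + side, upper + side)
```

A 400x200 landscape image yields the box (100, 0, 300, 200), while a 200x400 portrait image yields (0, 0, 200, 200), keeping the top of the picture.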
Many Extended Options
rmprof       remove image profiles
crop_square  crop image to a square
cover        add manga cover
samec        conform dw & dh & cw & ch
extendl      extend long-edge
etc          twenty new options in all
Summary
● Optimization of image processing is very important
  ○ Generating thumbnails takes a long time
  ○ Let's tune it and make it asynchronous
● pixiv has two types of thumbnails
  ○ Static Thumbnail
  ○ Dynamic Thumbnail
● Dynamic Thumbnail saves disk space and is flexible
  ○ Big image storage is expensive
  ○ Easy to adapt to changes in the application
    ■ New thumbnail types for new designs
Thanks!