a journey through new languages - rancho dev 2017

LANGUAGES: A JOURNEY @akitaonrails

Upload: fabio-akita

Post on 23-Jan-2018

TRANSCRIPT

LANGUAGES: A JOURNEY

@akitaonrails

RANCHO DEV 2017

www.theconf.club

Language syntax is EASY

Architectures (PATTERNS) are HARD

git checkout -b old_version remotes/origin/old_version

time bin/manga-downloadr -t

#!/usr/bin/env ruby
$LOAD_PATH.unshift File.join(File.dirname(__FILE__), '..', 'lib')
require 'optparse'

options = { test: false }
option_parser = OptionParser.new do |opts|
  opts.banner = "Usage: manga-downloadr [options]"

  opts.on("-t", "--test", "Test routine") do |t|
    options[:url] = "http://www.mangareader.net/onepunch-man"
    options[:name] = "one-punch-man"
    options[:directory] = "/tmp/manga-downloadr/one-punch-man"
    options[:test] = true
  end

  opts.on("-u URL", "--url URL", "Full MangaReader.net manga homepage URL - required") do |v|
    options[:url] = v
  end

  opts.on("-n NAME", "--name NAME", "slug to be used for the sub-folder to store all manga files - required") do |n|
    options[:name] = n
  end

  opts.on("-d DIRECTORY", "--directory DIRECTORY", "main folder where all mangas will be stored - required") do |d|
    options[:directory] = d
  end

  opts.on("-h", "--help", "Show this message") do
    puts opts
    exit
  end
end

require 'manga-downloadr'
generator = MangaDownloadr::Workflow.create(options[:url], options[:name], options[:directory])

generator.fetch_chapter_urls!

generator.fetch_page_urls!

generator.fetch_image_urls!

generator.fetch_images!

generator.compile_ebooks!

require 'manga-downloadr'
generator = MangaDownloadr::Workflow.create(options[:url], options[:name], options[:directory])

puts "Massive parallel scanning of all chapters"
generator.fetch_chapter_urls!

puts "\nMassive parallel scanning of all pages"
generator.fetch_page_urls!

puts "\nMassive parallel scanning of all images"
generator.fetch_image_urls!
puts "\nTotal page links found: #{generator.chapter_pages_count}"

puts "\nMassive parallel download of all page images"
generator.fetch_images!

puts "\nCompiling all images into PDF volumes"
generator.compile_ebooks!

puts "\nProcess finished."

require 'manga-downloadr'
generator = MangaDownloadr::Workflow.create(options[:url], options[:name], options[:directory])

unless generator.state?(:chapter_urls)
  puts "Massive parallel scanning of all chapters"
  generator.fetch_chapter_urls!
end

unless generator.state?(:page_urls)
  puts "\nMassive parallel scanning of all pages"
  generator.fetch_page_urls!
end

unless generator.state?(:image_urls)
  puts "\nMassive parallel scanning of all images"
  generator.fetch_image_urls!
  puts "\nTotal page links found: #{generator.chapter_pages_count}"
end

unless generator.state?(:images)
  puts "\nMassive parallel download of all page images"
  generator.fetch_images!
end

unless options[:test]
  puts "\nCompiling all images into PDF volumes"
  generator.compile_ebooks!
end

puts "\nProcess finished."

MangaDownloadr::Workflow

module MangaDownloadr
  ImageData = Struct.new(:folder, :filename, :url)

  class Workflow
    def initialize(root_url = nil, manga_name = nil, manga_root = nil, options = {})
    end

    def fetch_chapter_urls!
    end

    def fetch_page_urls!
    end

    def fetch_image_urls!
    end

    def fetch_images!
    end

    def compile_ebooks!
    end

    def state?(state)
    end

    private

    def current_state(state)
    end
  end
end
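The skeleton leaves `state?` and `current_state` empty, yet the launcher script relies on them to skip steps that already ran. A minimal sketch of how such checkpointing could behave, assuming the state is just an in-memory list of finished step names. The `StateTracker` class and `run_steps` helper are hypothetical names for illustration; the real manga-downloadr may store its state quite differently.

```ruby
# Hypothetical illustration, not the manga-downloadr implementation.
class StateTracker
  def initialize
    @completed = []
  end

  # True once the given step has been marked as finished.
  def state?(step)
    @completed.include?(step)
  end

  # Marks a step as finished (idempotent).
  def current_state(step)
    @completed << step unless state?(step)
    step
  end
end

# Runs only the steps that are not finished yet and returns their names.
def run_steps(tracker, steps)
  executed = []
  steps.each do |name|
    next if tracker.state?(name)  # skip work done on a previous run
    executed << name              # the real step would do its work here
    tracker.current_state(name)
  end
  executed
end

tracker = StateTracker.new
tracker.current_state(:chapter_urls)  # pretend an earlier run got this far
remaining = run_steps(tracker, [:chapter_urls, :page_urls, :image_urls])
```

Because each step records itself only after it completes, an interrupted run can be restarted and will pick up at the first unfinished step.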

fetch_chapter_urls!

def fetch_chapter_urls!
  doc = Nokogiri::HTML(open(manga_root_url))

  self.chapter_list = doc.css("#listing a").map { |l| l['href'] }
  self.manga_title = doc.css("#mangaproperties h1").first.text

  current_state :chapter_urls
end

def fetch_page_urls!
  chapter_list.each do |chapter_link|
    response = Typhoeus.get "http://www.mangareader.net#{chapter_link}"

    chapter_doc = Nokogiri::HTML(response.body)
    pages = chapter_doc.xpath("//div[@id='selectpage']//select[@id='pageMenu']//option")
    chapter_pages.merge!(chapter_link => pages.map { |p| p['value'] })
    print '.'
  end

  self.chapter_pages_count = chapter_pages.values.inject(0) { |total, list| total += list.size }
  current_state :page_urls
end

def fetch_page_urls!
  chapter_list.each do |chapter_link|
    begin
      response = Typhoeus.get "http://www.mangareader.net#{chapter_link}"

      begin
        chapter_doc = Nokogiri::HTML(response.body)
        pages = chapter_doc.xpath("//div[@id='selectpage']//select[@id='pageMenu']//option")
        chapter_pages.merge!(chapter_link => pages.map { |p| p['value'] })
        print '.'
      rescue => e
        self.fetch_page_urls_errors << { url: chapter_link, error: e, body: response.body }
        print 'x'
      end
    rescue => e
      puts e
    end
  end

  unless fetch_page_urls_errors.empty?
    puts "\nErrors fetching page urls:"
    puts fetch_page_urls_errors
  end

  self.chapter_pages_count = chapter_pages.values.inject(0) { |total, list| total += list.size }
  current_state :page_urls
end

def fetch_page_urls!
  hydra = Typhoeus::Hydra.new(max_concurrency: hydra_concurrency)
  chapter_list.each do |chapter_link|
    begin
      request = Typhoeus::Request.new "http://www.mangareader.net#{chapter_link}"
      request.on_complete do |response|
        begin
          chapter_doc = Nokogiri::HTML(response.body)
          pages = chapter_doc.xpath("//div[@id='selectpage']//select[@id='pageMenu']//option")
          chapter_pages.merge!(chapter_link => pages.map { |p| p['value'] })
          print '.'
        rescue => e
          self.fetch_page_urls_errors << { url: chapter_link, error: e, body: response.body }
          print 'x'
        end
      end
      hydra.queue request
    rescue => e
      puts e
    end
  end
  hydra.run

  unless fetch_page_urls_errors.empty?
    puts "\nErrors fetching page urls:"
    puts fetch_page_urls_errors
  end

  self.chapter_pages_count = chapter_pages.values.inject(0) { |total, list| total += list.size }
  current_state :page_urls
end
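The Hydra version queues every request up front and lets `max_concurrency` cap how many are in flight at once. The same bounded-concurrency idea can be sketched with only the Ruby standard library. Note the swap: Typhoeus drives requests through libcurl, whereas this sketch uses plain threads and a queue, and `map_concurrently` is a hypothetical helper name. The block stands in for the HTTP fetch so that the example runs without a network.

```ruby
require 'thread'

# Calls the block for every item, using at most max_concurrency worker threads.
# Returns a hash of item => block result.
def map_concurrently(items, max_concurrency)
  done = Object.new                 # unique sentinel that tells a worker to stop
  queue = Queue.new
  items.each { |item| queue << item }
  max_concurrency.times { queue << done }

  results = {}
  lock = Mutex.new

  workers = Array.new(max_concurrency) do
    Thread.new do
      until (item = queue.pop).equal?(done)
        value = yield(item)         # stand-in for "fetch and parse this link"
        lock.synchronize { results[item] = value }
      end
    end
  end
  workers.each(&:join)
  results
end

# Made-up chapter links; the block here only measures the string.
sizes = map_concurrently(%w[/chapter/1 /chapter/2 /chapter/3], 2) { |link| link.length }
```

The mutex around `results` plays the role that the single-threaded `on_complete` callbacks play in the Hydra version: only one worker at a time touches the shared collection.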

def fetch_image_urls!
  hydra = Typhoeus::Hydra.new(max_concurrency: hydra_concurrency)
  chapter_list.each do |chapter_key|
    chapter_pages[chapter_key].each do |page_link|
      begin
        request = Typhoeus::Request.new "http://www.mangareader.net#{page_link}"
        request.on_complete do |response|
          begin
            chapter_doc = Nokogiri::HTML(response.body)
            image = chapter_doc.css('#img').first
            tokens = image['alt'].match("^(.*?)\s\-\s(.*?)$")
            extension = File.extname(URI.parse(image['src']).path)

            chapter_images.merge!(chapter_key => []) if chapter_images[chapter_key].nil?
            chapter_images[chapter_key] << ImageData.new(tokens[1], "#{tokens[2]}#{extension}", image['src'])
            print '.'
          rescue => e
            self.fetch_image_urls_errors << { url: page_link, error: e }
            print 'x'
          end
        end
        hydra.queue request
      rescue => e
        puts e
      end
    end
  end
  hydra.run

  unless fetch_image_urls_errors.empty?
    puts "\nErrors fetching image urls:"
    puts fetch_image_urls_errors
  end

  current_state :image_urls
end
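The method above derives the target folder and file name from the image's `alt` text, splitting on the first " - ". A small standalone sketch of that parsing step follows. The sample alt text and URL are made up for illustration (they are not captured from the site), `image_data_for` is a hypothetical helper, and a regular-expression literal is used in place of the string pattern.

```ruby
require 'uri'

ImageData = Struct.new(:folder, :filename, :url)

# Splits an alt text such as "Title 1 - Page 3" into a folder and a file name.
def image_data_for(alt, src)
  tokens = alt.match(/^(.*?)\s-\s(.*?)$/)
  return nil if tokens.nil?  # unexpected alt text: let the caller decide
  extension = File.extname(URI.parse(src).path)
  ImageData.new(tokens[1], "#{tokens[2]}#{extension}", src)
end

data = image_data_for("Onepunch-Man 1 - Page 3", "http://example.invalid/images/op-1-3.jpg")
puts data.folder    # chapter folder, taken from the text before " - "
puts data.filename  # page name plus the extension found in the image URL
```

The hyphen inside the title does not trigger a split because the pattern requires whitespace on both sides of the dash.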

def fetch_images!
  hydra = Typhoeus::Hydra.new(max_concurrency: hydra_concurrency)
  chapter_list.each_with_index do |chapter_key, chapter_index|
    chapter_images[chapter_key].each do |file|
      downloaded_filename = File.join(manga_root_folder, file.folder, file.filename)
      next if File.exist?(downloaded_filename) # effectively resumes the download list without re-downloading everything
      request = Typhoeus::Request.new file.url
      request.on_complete do |response|
        begin
          # download
          FileUtils.mkdir_p(File.join(manga_root_folder, file.folder))
          File.open(downloaded_filename, "wb+") { |f| f.write response.body }

          unless is_test
            # resize
            image = Magick::Image.read(downloaded_filename).first
            resized = image.resize_to_fit(600, 800)
            resized.write(downloaded_filename) { self.quality = 50 }
            GC.start # to avoid too big a leak (ImageMagick is notorious for that, especially on resizes)
          end

          print '.'
        rescue => e
          self.fetch_images_errors << { url: file.url, error: e }
          print '#'
        end
      end
      hydra.queue request
    end
  end
  hydra.run

  unless fetch_images_errors.empty?
    puts "\nErrors downloading images:"
    puts fetch_images_errors
  end

  current_state :images
end

def compile_ebooks!
  folders = Dir[manga_root_folder + "/*/"].sort_by { |element| ary = element.split(" ").last.to_i }
  self.download_links = folders.inject([]) do |list, folder|
    list += Dir[folder + "*.*"].sort_by { |element| ary = element.split(" ").last.to_i }
  end

  # concatenating PDF files (250 pages per volume)
  chapter_number = 0
  while !download_links.empty?
    chapter_number += 1
    pdf_file = File.join(manga_root_folder, "#{manga_title} #{chapter_number}.pdf")
    list = download_links.slice!(0..pages_per_volume)
    Prawn::Document.generate(pdf_file, page_size: page_size) do |pdf|
      list.each do |image_file|
        begin
          pdf.image image_file, position: :center, vposition: :center
        rescue => e
          puts "Error in #{image_file} - #{e}"
        end
      end
    end
    print '.'
  end

  current_state :ebooks
end
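The loop above carves the page list into volumes with `slice!(0..pages_per_volume)`. A short sketch of how range-based slicing behaves, using made-up file names. One thing it makes visible: an inclusive range `0..n` takes n + 1 elements, so if `pages_per_volume` is meant as an exact cap (an assumption about intent, not something the slides state), an exclusive range or `each_slice` would match it more closely. The `volumes` helper is hypothetical.

```ruby
# Splits a list of page files into volumes of at most pages_per_volume pages.
def volumes(pages, pages_per_volume)
  pages.each_slice(pages_per_volume).to_a
end

pages = (1..7).map { |n| "Page #{n}.jpg" }

inclusive = pages.dup.slice!(0..3)   # inclusive range: indexes 0, 1, 2, 3
exclusive = pages.dup.slice!(0...3)  # exclusive range: indexes 0, 1, 2

puts inclusive.size
puts exclusive.size
puts volumes(pages, 3).map(&:size).inspect
```

`each_slice` also leaves the original array untouched, whereas `slice!` consumes it as the loop advances.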

time bin/manga-downloadr -t

17.18s user 17.62s system 41% cpu 1:24.04 total

.
├── _build
│   └── ...
├── config
│   └── config.exs
├── deps
│   ├── ...
├── ex_manga_downloadr
├── lib
│   ├── ex_manga_downloadr
│   │   ├── cli.ex
│   │   ├── mangafox
│   │   │   ├── chapter_page.ex
│   │   │   ├── index_page.ex
│   │   │   └── page.ex
│   │   ├── mangareader
│   │   │   ├── chapter_page.ex
│   │   │   ├── index_page.ex
│   │   │   └── page.ex
│   │   ├── manga_wrapper.ex
│   │   └── workflow.ex
│   └── ex_manga_downloadr.ex
├── mix.exs
├── mix.lock
├── README.md
└── test
    ├── ex_manga_downloadr
    │   ├── mangafox_test.exs
    │   └── mangareader_test.exs
    ├── ex_manga_downloadr_test.exs
    └── test_helper.exs

61 directories, 281 files

mix.exs

defmodule ExMangaDownloadr.Mixfile do
  use Mix.Project

  def project do
    [app: :ex_manga_downloadr,
     version: "1.0.2",
     elixir: "~> 1.4",
     build_embedded: Mix.env == :prod,
     start_permanent: Mix.env == :prod,
     escript: [main_module: ExMangaDownloadr.CLI],
     deps: deps()]
  end

  def application do
    [applications: [:logger, :httpoison, :porcelain, :observer]]
  end

  defp deps do
    [
      {:httpoison, "~> 0.11"},
      {:floki, "~> 0.17"},
      {:porcelain, "~> 2.0.3"},
      {:mock, "~> 0.2", only: :test}
    ]
  end
end

Mixfile

Pool Management

workflow.ex

defmodule ExMangaDownloadr.Workflow do
  def determine_source(url) do
  end

  def chapters({url, source}) do
    {:ok, {_manga_title, chapter_list}} = MangaWrapper.index_page(url, source)
    {chapter_list, source}
  end

  def pages({chapter_list, source}) do
    pages_list = chapter_list
      |> Task.async_stream(MangaWrapper, :chapter_page, [source], max_concurrency: @max_demand)
      |> Enum.to_list()
      |> Enum.reduce([], fn {:ok, {:ok, list}}, acc -> acc ++ list end)
    {pages_list, source}
  end

  def images_sources({pages_list, source}) do
    pages_list
    |> Task.async_stream(MangaWrapper, :page_image, [source], max_concurrency: @max_demand)
    |> Enum.to_list()
    |> Enum.map(fn {:ok, {:ok, image}} -> image end)
  end

  def process_downloads(images_list, directory) do
    images_list
    |> Task.async_stream(MangaWrapper, :page_download_image, [directory], max_concurrency: @max_demand / 2, timeout: @download_timeout)
    |> Enum.to_list()
    directory
  end

  def optimize_images(directory) do … end
  def compile_pdfs(directory, manga_name) do … end

  defp compile_volume(manga_name, directory, {chunk, index}) do … end
  defp prepare_volume(manga_name, directory, chunk, index) do … end
  defp chunk(collection, default_size) do … end
end

:chapter_page

POOL

manga_wrapper.ex

defmodule MangaWrapper do
  require Logger

  def index_page(url, source) do
    source
    |> manga_source("IndexPage")
    |> apply(:chapters, [url])
  end

  def chapter_page(chapter_link, source) do
    source
    |> manga_source("ChapterPage")
    |> apply(:pages, [chapter_link])
  end

  def page_image(page_link, source) do
    source
    |> manga_source("Page")
    |> apply(:image, [page_link])
  end

  def page_download_image(image_data, directory) do
    download_image(image_data, directory)
  end

  defp manga_source(source, module) do
    case source do
      "mangareader" -> :"Elixir.ExMangaDownloadr.MangaReader.#{module}"
      "mangafox" -> :"Elixir.ExMangaDownloadr.Mangafox.#{module}"
    end
  end

  defp download_image({image_src, image_filename}, directory) do
  end
end

:chapter_page

ChapterPage
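`MangaWrapper` picks the site-specific module at run time: it builds the module name from the source string and then calls a function on it with `apply`. For comparison with the earlier Ruby version, the same lookup-by-name idea can be sketched in Ruby with `const_get` and `public_send`. Everything here is hypothetical: the `Sources` namespace, the `dispatch` helper, and the fake page lists are stand-ins, not code from either project.

```ruby
# Hypothetical stand-ins for the per-site modules.
module Sources
  module Mangareader
    module ChapterPage
      def self.pages(chapter_link)
        ["#{chapter_link}/1", "#{chapter_link}/2"]
      end
    end
  end

  module Mangafox
    module ChapterPage
      def self.pages(chapter_link)
        ["#{chapter_link}/1.html"]
      end
    end
  end
end

SOURCE_NAMESPACES = {
  "mangareader" => Sources::Mangareader,
  "mangafox"    => Sources::Mangafox
}.freeze

# Resolves the module for a source at run time, then calls the function on it.
def dispatch(source, module_name, function, *args)
  namespace = SOURCE_NAMESPACES.fetch(source)  # raises KeyError for an unknown source
  namespace.const_get(module_name).public_send(function, *args)
end

links = dispatch("mangafox", "ChapterPage", :pages, "/manga/onepunch_man/c001")
```

Adding a new site then means adding one namespace with the same three modules; the callers that pass `source` around do not change.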

defmodule ExMangaDownloadr.Mangafox.ChapterPage do
  require Logger
  require ExMangaDownloadr

  def pages(chapter_link) do
    ExMangaDownloadr.fetch chapter_link, do: fetch_pages(chapter_link)
  end

  defp fetch_pages(html, chapter_link) do
    [_page | link_template] = chapter_link |> String.split("/") |> Enum.reverse

    html
    |> Floki.find("div[id='top_center_bar'] option")
    |> Floki.attribute("value")
    |> Enum.reject(fn page_number -> page_number == "0" end)
    |> Enum.map(fn page_number ->
      ["#{page_number}.html" | link_template]
      |> Enum.reverse
      |> Enum.join("/")
    end)
  end
end

cli.ex

defmodule ExMangaDownloadr.CLI do
  alias ExMangaDownloadr.Workflow
  require ExMangaDownloadr

  def main(args) do
    args
    |> parse_args
    |> process
  end

  ...

  defp parse_args(args) do
  end

  defp process(:help) do
  end

  defp process(directory, url) do
    File.mkdir_p!(directory)
    File.mkdir_p!("/tmp/ex_manga_downloadr_cache")

    manga_name = directory |> String.split("/") |> Enum.reverse |> Enum.at(0)

    url
    |> Workflow.determine_source
    |> Workflow.chapters
    |> Workflow.pages
    |> Workflow.images_sources
    |> Workflow.process_downloads(directory)
    |> Workflow.optimize_images
    |> Workflow.compile_pdfs(manga_name)
    |> finish_process
  end

  defp process_test(directory, url) do
  end

  defp finish_process(directory) do
  end
end

Workflow

mix deps.get
mix test
mix escript.build

ex_manga_downloadr - 4.6M

time ./ex_manga_downloadr --test

32.03s user 57.97s system 120% cpu 1:14.45 total

.
├── cr_manga_downloadr
├── libs
│   ├── ...
├── LICENSE
├── README.md
├── shard.lock
├── shard.yml
├── spec
│   ├── cr_manga_downloadr
│   │   ├── chapters_spec.cr
│   │   ├── concurrency_spec.cr
│   │   ├── image_downloader_spec.cr
│   │   ├── page_image_spec.cr
│   │   └── pages_spec.cr
│   ├── fixtures
│   │   ├── ...
│   └── spec_helper.cr
└── src
    ├── cr_manga_downloadr
    │   ├── chapters.cr
    │   ├── concurrency.cr
    │   ├── downloadr_client.cr
    │   ├── image_downloader.cr
    │   ├── page_image.cr
    │   ├── pages.cr
    │   ├── records.cr
    │   ├── version.cr
    │   └── workflow.cr
    └── cr_manga_downloadr.cr


File.mkdir_p!(directory)
File.mkdir_p!("/tmp/ex_manga_downloadr_cache")

manga_name = directory |> String.split("/") |> Enum.reverse |> Enum.at(0)

url
|> Workflow.determine_source
|> Workflow.chapters
|> Workflow.pages
|> Workflow.images_sources
|> Workflow.process_downloads(directory)
|> Workflow.optimize_images
|> Workflow.compile_pdfs(manga_name)
|> finish_process
end

... @config.download_directory

pipe Steps.fetch_chapters(@config)
  .>> Steps.fetch_pages(@config)
  .>> Steps.fetch_images(@config)
  .>> Steps.download_images(@config)
  .>> Steps.optimize_images(@config)
  .>> Steps.prepare_volumes(@config)
  .>> unwrap

puts "Done!"
end


# 1
y = c(b(a))

# 2
x = b(a)
y = c(x)

# Elixir Pipes
y = a |> b |> c

# Crystal Macro Pipes
y = pipe a .>> b .>> c .>> unwrap
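For comparison, the same four shapes can be sketched in plain Ruby. This is only an illustration: `INC` and `DOUBLE` are hypothetical stand-ins for the slide's `b` and `c`, and `Object#then` and `Proc#>>` assume Ruby 2.6 or newer, which is later than the Ruby 2.4.1 used in the benchmarks.

```ruby
# Hypothetical stand-ins for the slide's b and c.
INC    = ->(n) { n + 1 }
DOUBLE = ->(n) { n * 2 }

a = 3

# 1: nested calls, read inside-out
y1 = DOUBLE.call(INC.call(a))

# 2: intermediate variable
x  = INC.call(a)
y2 = DOUBLE.call(x)

# 3: left-to-right chaining with Object#then (Ruby 2.6+)
y3 = a.then(&INC).then(&DOUBLE)

# 4: build the pipeline first with Proc#>>, then run it (Ruby 2.6+)
pipeline = INC >> DOUBLE
y4 = pipeline.call(a)

puts [y1, y2, y3, y4].inspect
```

All four forms compute the same value; only the reading order changes.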

defmodule ExMangaDownloadr.MangaReader.IndexPage do
  require Logger
  require ExMangaDownloadr

  def chapters(manga_root_url) do
    ExMangaDownloadr.fetch manga_root_url, do: collect
  end

  defp collect(html) do
    {fetch_manga_title(html), fetch_chapters(html)}
  end

  defp fetch_manga_title(html) do
    html
    |> Floki.find("#mangaproperties h1")
    |> Floki.text
  end

  defp fetch_chapters(html) do
    html
    |> Floki.find("#listing a")
    |> Floki.attribute("href")
  end
end


require "./downloadr_client"
require "xml"

module CrMangaDownloadr
  class Chapters < DownloadrClient
    def fetch
      html = get(@config.root_uri).as(XML::Node)
      nodes = html.xpath_nodes("//table[contains(@id,'listing')]//td//a/@href")
      nodes.map { |node| node.text.as(String) }
    end
  end
end

DownloadrClient


module CrMangaDownloadr
  class DownloadrClient
    ...
    def get(uri : String, binary = false)
      Dir.mkdir_p(@config.cache_directory) unless Dir.exists?(@config.cache_directory)
      cache_path = File.join(@config.cache_directory, cache_filename(uri))
      while true
        begin
          response = if @cache_http && File.exists?(cache_path)
                       body = File.read(cache_path)
                       HTTP::Client::Response.new(200, body)
                     else
                       @http_client.get(uri, headers: HTTP::Headers{"User-Agent" => CrMangaDownloadr::USER_AGENT})
                     end

          case response.status_code
          when 301
            uri = response.headers["Location"]
          when 200
            if (binary || @cache_http) && !File.exists?(cache_path)
              File.open(cache_path, "w") do |f|
                f.print response.body
              end
            end

            if binary
              return cache_path
            else
              return XML.parse_html(response.body)
            end
          end
        rescue IO::Timeout
          puts "Sleeping over #{uri}"
          sleep 1
        end
      end
    end
    ...
  end
end

DownloadrClient
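The pattern in `get` (serve from the cache if possible, follow a 301, retry after a timeout) can be sketched in Ruby with the standard library only. This is not the project's client: `cached_get`, `Response`, the injected `transport` callable, the MD5 cache key and the bounded `max_attempts` are all assumptions made for illustration, whereas the slide's version loops until it succeeds.

```ruby
require "digest"
require "fileutils"
require "timeout"
require "tmpdir"

# Hypothetical response shape: status code, body, redirect target.
Response = Struct.new(:status, :body, :location)

# transport is any callable that takes a URI string and returns a Response.
def cached_get(uri, cache_dir:, transport:, max_attempts: 5, retry_delay: 1)
  FileUtils.mkdir_p(cache_dir)
  # Assumption: the cache key is a hash of the originally requested URI.
  cache_path = File.join(cache_dir, Digest::MD5.hexdigest(uri))
  return File.read(cache_path) if File.exist?(cache_path)

  max_attempts.times do
    begin
      response = transport.call(uri)
      case response.status
      when 301
        uri = response.location          # follow the redirect on the next pass
      when 200
        File.write(cache_path, response.body)
        return response.body
      end
    rescue Timeout::Error
      sleep retry_delay                  # back off, then try again
    end
  end
  raise "gave up on #{uri} after #{max_attempts} attempts"
end

# Usage with a fake transport, so no network is involved.
fake = ->(uri) { Response.new(200, "<html>#{uri}</html>", nil) }
puts cached_get("http://example.test/a", cache_dir: Dir.mktmpdir, transport: fake)
```

Injecting the transport keeps the caching and retry logic testable without a real HTTP client.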


require "fiberpool"

module CrMangaDownloadr
  struct Concurrency(A, B)
    def initialize(@config : Config, @engine_class : DownloadrClient.class)
    end

    def fetch(collection : Array(A)?, &block : A, DownloadrClient -> Array(B)?) : Array(B)
      results = [] of B
      if collection
        pool = Fiberpool.new(collection, @config.download_batch_size)
        pool.run do |item|
          engine = @engine_class.new(@config)
          if reply = block.call(item, engine)
            results.concat(reply)
          end
        end
      end
      results
    end
  end
end

fetch

Concurrency


fetch

Concurrency

module CrMangaDownloadr
  class Workflow
  end

  module Steps
    def self.fetch_chapters(config : Config)
    end

    def self.fetch_pages(chapters : Array(String)?, config : Config)
      puts "Fetching pages from all chapters..."
      reactor = Concurrency(String, String).new(config, Pages)
      reactor.fetch(chapters) do |link, engine|
        engine.try(&.fetch(link)).as(Array(String))
      end
    end

    def self.fetch_images(pages : Array(String)?, config : Config)
    end

    def self.download_images(images : Array(Image)?, config : Config)
    end

    def self.optimize_images(downloads : Array(String), config : Config)
    end

    def self.prepare_volumes(downloads : Array(String), config : Config)
    end
  end
end


crystal deps
crystal spec
crystal build src/cr_manga_downloadr.cr --release

cr_manga_downloadr 752K

time ./cr_manga_downloadr -t


5.57s user 6.79s system 14% cpu 1:26.76 total


.
├── cr_manga_downloadr
├── libs
│   ├── ...
├── LICENSE
├── README.md
├── shard.lock
├── shard.yml
├── spec
│   ├── cr_manga_downloadr
│   │   ├── chapters_spec.cr
│   │   ├── concurrency_spec.cr
│   │   ├── image_downloader_spec.cr
│   │   ├── page_image_spec.cr
│   │   └── pages_spec.cr
│   ├── fixtures
│   │   ├── ...
│   └── spec_helper.cr
└── src
    ├── cr_manga_downloadr
    │   ├── chapters.cr
    │   ├── concurrency.cr
    │   ├── downloadr_client.cr
    │   ├── image_downloader.cr
    │   ├── page_image.cr
    │   ├── pages.cr
    │   ├── records.cr
    │   ├── version.cr
    │   └── workflow.cr
    └── cr_manga_downloadr.cr

.
├── bin
│   └── manga-downloadr
├── Gemfile
├── Gemfile.lock
├── lib
│   ├── manga-downloadr
│   │   ├── chapters.rb
│   │   ├── concurrency.rb
│   │   ├── downloadr_client.rb
│   │   ├── image_downloader.rb
│   │   ├── page_image.rb
│   │   ├── pages.rb
│   │   ├── records.rb
│   │   ├── version.rb
│   │   └── workflow.rb
│   └── manga-downloadr.rb
├── LICENSE.txt
├── manga-downloadr.gemspec
├── Rakefile
├── README.md
└── spec
    ├── fixtures
    │   ├── ...
    ├── manga-downloadr
    │   ├── chapters_spec.rb
    │   ├── concurrency_spec.rb
    │   ├── image_downloader_spec.rb
    │   ├── page_image_spec.rb
    │   └── pages_spec.rb
    └── spec_helper.rb


... @config.download_directory

pipe Steps.fetch_chapters(@config)
  .>> Steps.fetch_pages(@config)
  .>> Steps.fetch_images(@config)
  .>> Steps.download_images(@config)
  .>> Steps.optimize_images(@config)
  .>> Steps.prepare_volumes(@config)
  .>> unwrap

puts "Done!"
end

def self.run(config = Config.new)
  FileUtils.mkdir_p config.download_directory

  CM(config, Workflow)
    .fetch_chapters
    .fetch_pages(config)
    .fetch_images(config)
    .download_images(config)
    .optimize_images(config)
    .prepare_volumes(config)
    .unwrap

  puts "Done!"
end
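How a wrapper like `CM(...)` can turn module functions into a method chain, as a rough sketch. This is not the chainable_methods gem's code: `Chain` and `TextSteps` are hypothetical names, and the only idea borrowed from the slide is that each call passes the current value as the first argument of the next step and `unwrap` returns the final value.

```ruby
# Hypothetical stand-in for a CM(...)-style wrapper.
class Chain
  def initialize(value, context)
    @value   = value
    @context = context
  end

  # Forward unknown methods to the context module, passing the current
  # value as the first argument, and wrap the result so the chain continues.
  def method_missing(name, *args, &block)
    return super unless @context.respond_to?(name)
    Chain.new(@context.public_send(name, @value, *args, &block), @context)
  end

  def respond_to_missing?(name, include_private = false)
    @context.respond_to?(name, include_private) || super
  end

  # Leave the chain and return the plain value.
  def unwrap
    @value
  end
end

# Hypothetical steps, each taking the previous result as its first argument.
module TextSteps
  def self.split_words(text)
    text.split
  end

  def self.longer_than(words, min)
    words.select { |word| word.length > min }
  end

  def self.shout(words)
    words.map(&:upcase)
  end
end

result = Chain.new("a journey through new languages", TextSteps)
  .split_words
  .longer_than(3)
  .shout
  .unwrap

puts result.inspect
```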


# concurrency.cr
pool = Fiberpool.new(collection, @config.download_batch_size)
pool.run do |item|
  engine = @engine_class.new(@config)
  if reply = block.call(item, engine)
    results.concat(reply)
  end
end

pool    = Thread.pool(@config.download_batch_size)
mutex   = Mutex.new
results = []

collection.each do |item|
  pool.process {
    engine = @turn_on_engine ? @engine_klass.new(@config.domain, @config.cache_http) : nil
    reply = block.call(item, engine)&.flatten
    mutex.synchronize do
      results += (reply || [])
    end
  }
end
pool.shutdown

Fibers

Threads
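The Ruby thread version relies on a thread-pool library (`Thread.pool`). A comparable bounded pool can be sketched with core Ruby only. `fetch_concurrently` is a hypothetical helper, not the project's `Concurrency` class, and it assumes Ruby 2.3 or newer for `Queue#close`.

```ruby
# Runs the block for every item using at most batch_size threads and
# merges the arrays the block returns. Result order is not guaranteed.
def fetch_concurrently(collection, batch_size, &block)
  queue = Queue.new
  collection.each { |item| queue << item }
  queue.close                      # pop returns nil once the queue is drained

  results = []
  mutex   = Mutex.new

  workers = Array.new(batch_size) do
    Thread.new do
      while (item = queue.pop)
        reply = block.call(item)
        # Guard the shared array; skip nil replies like the original does.
        mutex.synchronize { results.concat(reply) } if reply
      end
    end
  end

  workers.each(&:join)
  results
end

pages = fetch_concurrently(%w[chapter1 chapter2 chapter3], 2) do |chapter|
  ["#{chapter}/1", "#{chapter}/2"]
end
puts pages.sort.inspect
```

The mutex plays the same role as in the slide: threads share `results`, so writes have to be serialized.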

module CrMangaDownloadr
  class Pages < DownloadrClient
    def fetch(chapter_link : String)
      html = get(chapter_link)
      nodes = html.xpath_nodes("//div[@id='selectpage']//select[@id='pageMenu']//option")
      nodes.map { |node| "#{chapter_link}/#{node.text}" }
    end
  end
end

module MangaDownloadr
  class Pages < DownloadrClient
    def fetch(chapter_link)
      get chapter_link do |html|
        nodes = html.xpath("//div[@id='selectpage']//select[@id='pageMenu']//option")
        nodes.map { |node| [chapter_link, node.children.to_s].join("/") }
      end
    end
  end
end

time bin/manga-downloadr -t


19.77s user 10.65s system 33% cpu 1:31.69 total


Ruby/Typhoeus (hydra_concurrency = 50)    41% CPU    1:24 min
Elixir 1.4.5 (@max_demand=50)            120% CPU    1:14 min
Crystal 0.23.0 (opt_batch_size = 50)      14% CPU    1:26 min
Ruby 2.4.1 (opt_batch_size = 50)          33% CPU    1:31 min
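The CPU column follows from the `time` output shown earlier: (user + system) divided by wall-clock total. A small sketch of that arithmetic, using only the three runs whose raw timings appear in the slides. `RUNS` and `cpu_percent` are names made up for this example, and truncating to a whole percent is an assumption that happens to match the figures shown.

```ruby
# Raw `time` figures from the slides, with totals converted to seconds
# (1:14.45 => 74.45, 1:26.76 => 86.76, 1:31.69 => 91.69).
RUNS = {
  "Elixir 1.4.5"   => { user: 32.03, system: 57.97, total: 74.45 },
  "Crystal 0.23.0" => { user: 5.57,  system: 6.79,  total: 86.76 },
  "Ruby 2.4.1"     => { user: 19.77, system: 10.65, total: 91.69 }
}

# CPU% = CPU time actually consumed / elapsed wall-clock time.
def cpu_percent(user:, system:, total:)
  ((user + system) / total * 100).floor
end

RUNS.each do |name, timing|
  puts format("%-16s %3d%% CPU", name, cpu_percent(**timing))
end
```

A value above 100% (Elixir) means more than one core was busy on average. The low Crystal figure shows that run spent most of its wall-clock time off the CPU, which is expected for a download-heavy job.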


Ruby Typhoeus libcurl

Elixir OTP Poolboy

Crystal Fibers Fiberpool

Ruby Thread Thread/Pool


manga-downloadr

ex_manga_downloadr

cr_manga_downloadr

fiberpool

cr_chainable_methods

chainable_methods

PREMATURE OPTIMIZATION

The Root of ALL Evil

THANKS

@akitaonrails

slideshare.net/akitaonrails

www.theconf.club