Do Cheaters Prosper? Understanding the Rules of Application Development
DESCRIPTION
From an early age, we are told that cheating is bad. There are rules we’re supposed to follow, and breaking these rules, we’re told, is a big no-no. No ice-cream before dinner, no peeking at your classmate’s test, no cutting in line for lunch… or else. But is cheating always a bad thing?

Bad cheating breaks the law, puts people at risk or has severe negative consequences. If you’re making a program to run a nuclear reactor: don’t cheat. If something goes wrong then there will be very bad consequences.

Good cheating is like a magic trick: it’s a method to achieve a desired effect. Just don’t get caught. Getting caught ruins the magic and gets you in trouble. Do things “wrong”, get dirty, use hacks, and optimize for very specific cases. Do all this in order to deliver the desired effect.

Applications capture, retrieve and display resources. These resources have costs, such as the cost of transmission, processing and storage. Different resources have varying probabilities of being required; some may be requested and others not, some may be requested frequently and others rarely. These are all things to consider when deciding what processes can and need to be optimized, which will in turn help you choose from your magic bag of tricks.

Here’s a quick barometer for deciding when to cheat:
- Requests that are free or cheap to deliver and are rare are not worth optimizing since these are essentially effortless already
- Requests that are expensive but rare aren’t a priority, feel free to procrastinate on these
- Requests that are free or cheap and happen frequently are the fun ones to optimize
- Requests that are frequent and expensive are your big wins… if you can figure out how to optimize them.
TRANSCRIPT
Do Cheaters Prosper?
UNDERSTANDING THE RULES OF APPLICATION DEVELOPMENT
Scott Tadman (Chief of Research) / Dom Bortolussi (CEO), TWG @tadmanter @dombort @twg
“The Rules”
• These are taught to you in school.
• Based on conventions, shaped by tradition.
• Supposed to steer you towards best practices.
• Usually distorted by superstition and paranoia.
“The Rules”
• These arbitrary rules put you in a box.
• They limit your thinking.
“The Box”
“Cheating”
• Defies convention.
• Thinks “outside the box”.
• Breaks “rules”.
• Yields an “unfair” advantage.
Cheating?
• Isn’t cheating bad?
Bad Cheating
• Breaks laws.
• Puts people at risk.
• Has severe negative consequences.
Good Cheating
• Often called “hacks” or “tricks”
• Circumvents ritual or tradition.
• Cuts corners.
Bad Cheating
• We usually call it “illegal”.
• As in “get you sent to jail” illegal.
So what are the rules?
The Application’s Rules
• Don’t crash on me.
• Don’t waste my time.
• Don’t lie to me.
• Don’t forget my things.
• Don’t cost me money/bandwidth.
Important Rule
• Don’t kill me.
Asimov’s Law
• “A robot may not harm a human being.”
Developer’s Law?
• “An application may not harm a human being.”
“Fatal Error?”
Nuclear Reactor
Doh
Consequences?
We all die.
X-Ray Machine
Yikes.
Consequences?
Someone dies.
The Application You’re Building
Most Applications
Typical Application
Bah.
Consequences?
Nobody dies.
Don’t worry so much.
Most Applications
• Won’t kill people.
• Won’t misplace lots of money.
• Won’t destroy the universe.
• Have a lot of opportunities to “cheat”.
“Cheating”
It’s like a magic trick.
Don’t get caught.
It ruins it.
Successful Cheating
• Don’t let me catch you crashing on me.
• Don’t let me catch you wasting my time.
• Don’t let me catch you lying to me.
• Don’t let me catch you forgetting my things.
• Don’t let me catch you costing me money/bandwidth.
Successful Cheating
• It’s all about not getting caught.
A Magic Trick
• Rough definition:
• A method to achieve the desired effect.
An Application
• Rough definition:
• A programmed method to achieve desired effects.
Performance Cheats
• Doing the “wrong” thing.
• Getting “dirty”, introducing “hacks”.
• Optimized for very specific cases.
• Are focused on delivering the desired effects.
• It’s always about perception.
Applications Deconstructed
• Capture, retrieve and display resources.
• Resources have varying costs including:
• Cost of transmission.
• Cost of processing.
• Cost of storage.
Application Fundamentals
• Resources may or may not be requested.
• Resources have varying probabilities of being required.
• Some requests are frequent.
• Some requests are rare.
Optimization Regions
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• The Box: Generalized solutions to generalized problems.
• Effortless (free, rare): These don’t require optimization.
• Procrastinate (expensive, rare): These aren’t a priority.
• Pleasure (free, frequent): These are the ones we love to fix.
• Pain (expensive, frequent): Big wins... if you can figure out how.
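As a rough illustration of the four regions, the sketch below sorts a request into a region from two measurements. The `RequestProfile` type, the function name and both thresholds are assumptions made for this example, not values from the talk.

```python
from dataclasses import dataclass


@dataclass
class RequestProfile:
    name: str
    cost_ms: float          # average cost to deliver one response
    requests_per_day: int   # how often the resource is asked for


# Hypothetical cut-offs; tune them against your own measurements.
EXPENSIVE_MS = 200.0
FREQUENT_PER_DAY = 1_000


def optimization_region(profile: RequestProfile) -> str:
    """Map a request onto one of the four regions of the chart."""
    expensive = profile.cost_ms >= EXPENSIVE_MS
    frequent = profile.requests_per_day >= FREQUENT_PER_DAY
    if expensive and frequent:
        return "Pain"            # big wins, if you can figure out how
    if frequent:
        return "Pleasure"        # cheap and frequent: fun to optimize
    if expensive:
        return "Procrastinate"   # expensive but rare: not a priority
    return "Effortless"          # cheap and rare: leave it alone
```

For example, `optimization_region(RequestProfile("weather widget", 450, 20_000))` lands in the "Pain" region under these thresholds.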
Resources
[Chart: Bandwidth against System Power, each running from Low to High, with regions labelled None, Marginal and Unlimited. Platforms plotted: Personal Computer, Mobile, Single Server, Server Cluster.]
Fundamental Strategies
Perception Matters
• Performance is all about perception.
• People are oblivious to your best.
• People will remember your worst.
• Outliers are what people perceive the most.
• Remember: If the application “feels” fast, it is fast.
“Cheater’s Bag of Tricks”
Goals of “Cheating”
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• Identified Problem: Expensive and frequent.
• Optimization A: Method to reduce cost of delivery.
• Optimization B: Method to reduce frequency of requests.
• Optimization A + B: Combination reduces cost and frequency.
Real-World Examples
Weather Widget
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• Implemented: Fetch from API on demand and reformat into widget.
• Popularity: Widget used by large percentage of users.
• Cached: Widget data cached and re-used for different users.
Financial Blog
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• Stock Solution: Images are loaded on demand without any processing.
• Popularity: Blog becomes very popular due to publicity.
• Content Shift: Large, detailed charts now a common feature on blog.
• Solution 1: Lazy-load images further down page.
• Solution 2: Host images on cloud.
• Solution 3: Create smaller inline images.
Specific Strategies
Wait Less
• Work to minimize perceived wait.
• Make things appear instantly.
• Don’t block, lock up, or stall.
• Flip client state immediately, handle in background.
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• Target Zone, “Add User Comment”: Frequent operation but requires some heavy processing to apply correctly, complicated by convoluted business logic.
• Client State Directly Altered: Show result of user action immediately, actually process in background.
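A minimal sketch of the "flip client state immediately, handle in background" idea, assuming a hypothetical `CommentThread` class and a short sleep standing in for the heavy business logic. A real application would also need to undo the visible change if the background work fails.

```python
import threading
import time


class CommentThread:
    """Shows a new comment at once and finishes the slow work in the background."""

    def __init__(self):
        self.visible = []      # what the user sees right away
        self.confirmed = []    # what has actually been processed
        self._lock = threading.Lock()

    def add_comment(self, text: str) -> threading.Thread:
        self.visible.append(text)  # flip client state immediately
        worker = threading.Thread(target=self._process, args=(text,))
        worker.start()
        return worker

    def _process(self, text: str) -> None:
        time.sleep(0.05)  # stand-in for the heavy business logic
        with self._lock:
            self.confirmed.append(text)


comments = CommentThread()
worker = comments.add_comment("Nice post!")
shown_immediately = list(comments.visible)  # the comment is already on screen
worker.join()                               # the real processing finishes later
```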
Wait More
• Make things appear important.
• Keep user informed: spinners, progress bars.
• Keep delay proportional to significance.
• Can avoid really expensive requests by being annoying.
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• Target Zone, “Preview My Book”: Requesting too many previews can cause severe server load issues.
• “Your Book is Queued”: Delayed gratification.
• More Consideration: People less likely to engage feature needlessly.
• Target Zone, “Find Best Flight”: Might be trivial to calculate, but perceived value is very high.
• “Calculating Results...”: Artificial delay makes task seem significant, system more powerful.
Load Less
• Fetching things on demand.
• “Lazy loading”
• Ideal for resources that might not be seen.
• Refrigerator light principle: Always seems on.
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• Target Zone, “Blog Page, Post N”: Things that are requested but not always utilized.
• Lazy Loading: Puts off loading a resource until actually required.
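One way lazy loading might look in code, using Python's `functools.cached_property` so the expensive fetch runs only on first access. The `BlogPost` class and its `load_count` counter are hypothetical names used for illustration.

```python
from functools import cached_property


class BlogPost:
    load_count = 0  # counts how many times the expensive load ran

    def __init__(self, post_id: int):
        self.post_id = post_id

    @cached_property
    def body(self) -> str:
        # Stand-in for a database or network fetch.
        BlogPost.load_count += 1
        return f"Full text of post {self.post_id}"


posts = [BlogPost(n) for n in range(1, 11)]  # nothing loaded yet
first_body = posts[0].body                   # loads post 1 only
first_body_again = posts[0].body             # served from the cached attribute
```

Ten posts exist, but only the one that was actually read paid the loading cost.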
Load More
• Fetching resources ahead of time.
• “Eager loading”
• Ideal for resources that will probably be seen.
• Boy-scout principle: Be prepared.
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• Target Zone, “Site Icons”: Many, many requests for tiny, inexpensive resources can add up.
• Site Icons: Very large number of small files.
• High Aggregate Cost: Requesting many tiny resources can be expensive.
• Asset Bundling: Create single asset that can be used to render all icons.
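Asset bundling can be sketched as concatenating the small files into one blob and keeping an index of offsets. The helper names and the placeholder icon bytes below are assumptions for the example; on the web the same idea usually appears as an image sprite or icon font.

```python
def bundle_assets(assets: dict[str, bytes]) -> tuple[bytes, dict[str, tuple[int, int]]]:
    """Concatenate many small assets into one blob plus an offset index."""
    blob = bytearray()
    index = {}
    for name, data in assets.items():
        index[name] = (len(blob), len(data))  # (offset, length)
        blob.extend(data)
    return bytes(blob), index


def read_asset(blob: bytes, index: dict[str, tuple[int, int]], name: str) -> bytes:
    """Slice one asset back out of the bundle."""
    offset, length = index[name]
    return blob[offset:offset + length]


# Placeholder bytes, not real image data.
icons = {"home.png": b"\x89HOME", "search.png": b"\x89SEARCH", "user.png": b"\x89USER"}
blob, index = bundle_assets(icons)  # one transfer instead of three
```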
Saving Less
• Trim, crop, shrink.
• Strip out redundant or duplicated content.
• Define and enforce limitations.
• One size fits all instead of uncertainty.
• Consider rendering on demand.
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• Target Zone, “Upload Photo”: Users upload full-sized photos when only smaller versions are ever actually displayed.
• Reduce Size, Compress: Shrinking to sizes required by actual use cases, compressed to an acceptable level of quality.
Saving More
• Make copies of things in different formats.
• A form of “caching”, “pre-rendering”
• De-normalize your data.
• Optimize structure around retrieval patterns.
• Try and have everything important ready instantly.
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• Target Zone, “Image Variants by Size”: Images are used at various fixed sizes, each needing some processing to render.
• Pre-Rendering: Reduces frequency of requests requiring heavy processing.
Cache More
• Fastest database call is the one never made.
• Pre-cache when you have the data on hand.
• Reuse and recycle expensive results.
• Let your client cache images, scripts, pages.
• Learn to love the “expires” feature.
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• Target Zone, High Frequency + Cost: Cache anything you can.
• Target Zone, High Frequency: Cache if you can.
• Target Zone, High Cost: Cache if likely to be used more than once.
• “Friends of My Friends” Database Query: Result unlikely to change except in specific circumstances. Tricky to compute, easy to cache.
• “Video Effects Preview”: Result of preview saved temporarily. (Chart points: First Request, Second Request.)
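To make the "expires" idea concrete, here is a small sketch of a cache whose entries are recomputed once they are older than a time-to-live. The `ExpiringCache` class, the cache key and the `friends_of_friends` stand-in are all hypothetical.

```python
import time


class ExpiringCache:
    """Keeps computed values for ttl seconds, then recomputes on the next request."""

    def __init__(self, ttl: float, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key, compute):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the expensive call
        value = compute()
        self._entries[key] = (now + self.ttl, value)
        return value


calls = []


def friends_of_friends():
    calls.append(1)  # stand-in for the tricky database query
    return {"alice", "bob"}


cache = ExpiringCache(ttl=60)
first = cache.get("user:42:fof", friends_of_friends)   # computed
second = cache.get("user:42:fof", friends_of_friends)  # served from the cache
```

The injectable `clock` is a design choice that makes expiry easy to test without waiting.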
Cache Less
• Tuning your database can make caching redundant.
• Caches can make updates take longer.
• Cache invalidation can be really hard.
• Cache mistakes can be really embarrassing.
• Running without a cache leaves you with headroom.
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• Target Zone, “Premature Optimization”: Hiding real costs, creating hidden liabilities.
• Original Implementation: Expensive and slow.
• Caching Opportunity: Potential performance gain.
• Cached Result: Quick and easy fix.
• Refactoring Without Cache: Solves actual performance problem.
Distribute More
• Content distribution networks.
• Clustered databases, data replication.
• Client-side storage, content bundles.
• Push data closer to users.
• “Edge caching”
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• Target Zone, “Server-Side Content”: User-provided content is stored on your application servers; storage and retrieval is getting overwhelming.
• Server Hosted Resource: Stored on server, transmission impedes other server operations.
• CDN Hosted Resource: No effort required to retain and deliver to client.
Distribute Less
• Keep data and content local.
• Self-hosted databases instead of cloud-hosted.
• “Dumb client, smart server” applications.
• Way more control over structure, strategy, technology.
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• Target Zone, “Cloud Hosted Metrics Database”: Slow, unresponsive API, charged per query.
• Remote Cloud Service: Takes hundreds of API calls.
• Local Data Warehouse: Takes a few SQL queries.
Purge More
• Delete data no longer used.
• Archive to cheaper storage systems.
• Generate on demand.
• Less data equals less overhead.
• Minimalist principle.
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• Target Zone, “Outdated, Detailed User Metrics Data”: Expensive to retain, not likely to get used.
• Deleted: Nobody will miss it.
Purge Less
• Dump it on the cloud, forget about it.
• Keep things cached longer.
• Pre-render instead of render on demand.
• Long-tail: Sometimes unpopular things matter.
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• “Movie Actors”: Extensive library of movies, but queries aren’t predictable.
• High Aggregate Cost: Requests for unrelated resources can add up.
• Save Alternate Formats: Also store as XML or JSON, exactly as sent by API.
Compress More
• Lossless or lossy compression.
• Use deflate (gzip) for transfers.
• Aggressively minify scripts.
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• Target Zone, Almost Anything: Most resources benefit from some form of compression.
• Text Resource: Some things compress really well.
• JPEG Image Resource: Some things can’t be compressed much more without damage.
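The difference between the two resource types can be seen with Python's `zlib` module. This sketch assumes repeated English text as the text resource and uses random bytes as a stand-in for data that is already compressed, such as a JPEG; it is not a measurement of real assets.

```python
import os
import zlib

text = ("The quick brown fox jumps over the lazy dog. " * 200).encode("utf-8")
# Random bytes stand in for data that is already compressed, such as a JPEG.
already_dense = os.urandom(len(text))

text_ratio = len(zlib.compress(text)) / len(text)
dense_ratio = len(zlib.compress(already_dense)) / len(already_dense)

print(f"text resource: {text_ratio:.2%} of original size")
print(f"dense resource: {dense_ratio:.2%} of original size")
```

The text shrinks to a small fraction of its size, while the dense data barely changes, so the CPU spent compressing it again is mostly wasted.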
Compress Less
• Keep data in pure, raw form.
• Databases can’t interact with compressed data.
• Serialized or compressed data can’t be queried.
• Some databases love JSON or key-value types.
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• “User Profile Data”: Hundreds of arbitrary fields, stored as JSON.
• Store as Raw JSON: Slightly higher write and retrieval cost.
• Query JSON in DB: Much lower query cost.
Index More
• Every query has a cost.
• Examine access patterns, index accordingly.
• A query without an index: painful.
• A query with a tuned index: bliss.
• Indexes massively reduce retrieval time.
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• “Message Board Table”: Queried very frequently, indexing is vital.
• Aggressive Indexing: Use index to reduce read cost.
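A small sketch of how an index changes the way a query runs, using SQLite from Python's standard library. The `messages` table, its columns and the index name are assumptions made for the example; other databases expose their query plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, board_id INTEGER, body TEXT)")
conn.executemany(
    "INSERT INTO messages (board_id, body) VALUES (?, ?)",
    [(n % 50, f"message {n}") for n in range(5000)],
)

QUERY = "SELECT body FROM messages WHERE board_id = ?"


def query_plan(connection: sqlite3.Connection) -> str:
    """Return SQLite's description of how it would run QUERY."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, (7,)).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


plan_before = query_plan(conn)  # no index yet: SQLite scans the whole table
conn.execute("CREATE INDEX idx_messages_board ON messages (board_id)")
plan_after = query_plan(conn)   # now SQLite searches via idx_messages_board
```

Comparing `plan_before` with `plan_after` is a cheap way to confirm that an index is actually being used before relying on it.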
Index Less
• Every index makes writes more expensive.
• Indexes don’t always get used.
• Some indexes you might want but not need.
• Awkward but fast can be better than easy but slow.
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• “User Activity Table”: More writes than reads.
• Aggressive Indexing: Writes are significantly more expensive.
Multitask More
• Do things in parallel.
• Break down dependencies, avoid contention.
• Distribute work across many systems.
• Map-reduce doesn’t have to be hard.
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• “Render Album Preview”: Each page is independent, but the final book needs to be one file.
• “Render Page”: Smaller scale task is easy to delegate to many workers.
• Perceived Result: Processing time massively reduced.
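The album example maps naturally onto a small map-reduce: render each page independently, then join the results into one book. The sketch below uses a thread pool from `concurrent.futures`; `render_page` is a hypothetical stand-in, and for CPU-bound rendering in CPython a process pool or separate worker machines would likely be the better fit.

```python
from concurrent.futures import ThreadPoolExecutor


def render_page(page_number: int) -> str:
    # Stand-in for an expensive, independent render step.
    return f"[page {page_number}]"


def render_album(page_count: int, workers: int = 4) -> str:
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # "Map": pages render in parallel; map() returns results in page order.
        pages = list(pool.map(render_page, range(1, page_count + 1)))
    # "Reduce": combine the rendered pages into a single book.
    return "".join(pages)


book = render_album(5)
```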
Multitask Less
• Stream things in sequentially.
• Keep load on server light.
• Keeps more network resources available.
• Buffer requests and process casually.
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• “Sync Calendar Entries”: Operation that’s expected to take considerable time can be delayed.
• Background Tasks: Create a simple work queue to minimize number of parallel operations.
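A simple work queue might look like the sketch below: callers enqueue entries and return immediately, while a single worker thread processes them one at a time. The calendar entries and the `synced` list are placeholders for a real sync call.

```python
import queue
import threading

jobs = queue.Queue()
synced = []


def worker() -> None:
    while True:
        entry = jobs.get()
        if entry is None:       # sentinel: no more work
            jobs.task_done()
            break
        synced.append(entry)    # stand-in for the slow sync call
        jobs.task_done()


threading.Thread(target=worker, daemon=True).start()

for entry in ["Standup", "Dentist", "Flight to YYZ"]:
    jobs.put(entry)             # returns immediately; work happens later
jobs.put(None)
jobs.join()                     # wait only when you actually need the result
```

With one worker, at most one sync runs at a time, which keeps the load on the server light at the price of a longer total wait.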
Provision More
• Server hardware can be scaled up.
• Desktop/mobile apps can leverage server hardware.
• On-demand cloud services can be awesome.
• No inherent limit on server capability.
• If you can throw hardware at it, maybe do that.
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• Relatively Expensive: Uses high percentage of CPU.
• Relatively Inexpensive: Same operation on distributed server cluster is no big deal.
Provision Less
• Keep your application footprint lean.
• Don’t burn through CPU and battery.
• Don’t use tons of memory.
• Smaller apps load faster, don’t get kicked out as often.
• Smaller server clusters easier to tune and manage.
[Chart: Cost to Deliver (Free to Expensive) against Probability of Request (Rare to Frequent)]
• “Upload All Photos on Phone”: Interrupts application flow, stalls, heavy CPU usage.
• Batched Operations: Split up task into smaller, low-impact tasks.
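Batching can be as small as a generator that hands out the work in slices. The `batches` helper, the file names and the batch size of three are assumptions for illustration.

```python
from typing import Iterator, Sequence


def batches(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


photos = [f"IMG_{n:04d}.jpg" for n in range(1, 8)]
uploaded = []
for batch in batches(photos, size=3):
    uploaded.append(list(batch))  # stand-in for one small upload request
    # A real app could yield to the UI or pause here between batches.
```

Seven photos become three small uploads instead of one long stall.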
Breaking Rules
“Cheating”
Focus on the effects.
Figure the rest out.
Scott Tadman
twg.ca @ceben @twg
THANK YOU
Scott Tadman / Dom Bortolussi
twg.ca @tadmanter @dombort @twg github.com/[email protected]