how do ad blockers work? - wtf ad blocking uk, 3/10/16
TRANSCRIPT
WTF AdBlockers
How do they work?
Matt O’Neill
hello.
what is ad blocking?
– Lennox1339, reddit/r/ShowerThoughts
“Browsing without an ad blocker is like [making love] without a condom. Only do it with someone you trust.”
“My right to protect myself from malvertising, spyware, and totally irrelevant advertising.”
“A way to speed up my web browsing.”
“It saves me a huge amount of money on my mobile data plan.”
“I don’t want to be tracked by marketers. It’s creepy.”
“Outright theft of my content”
“A reflection of how consumers feel about online advertising”
“It’s only really used by young, male techies.”
“It’s the same as pirating movies”
The hiding or request denial of content not directly related to core content of a page or app. This includes display & video advertising, text links, content recommendation and marketing, native advertising and other paid and unpaid placements.
ad blocking: a brief history
[Chart: horizontal axis runs from 2000 to 2016; vertical axis runs from 0 to 300, units not shown]
History of Ad Blocking
2002
AdBlock launched
Blocks based on source of creative
2004
Content policies leveraged as blocking mechanism
Scripts, background images, stylesheets added to blockable item list
2005
Whitelisting introduced
Localised versions launched
List & filter syncing
2006
ADB+ released as standalone extension
Site level blocking enabled
2010
AdBlock+ released as Chrome Extension
2013
Support added for IE
Removed from Google Play but available on the ADB+ site
2014
AdBlock Plus released for OS X Safari
2015
Blocker browsers
Apple support for ad blocking
Blocker blockers emerge
Network / carrier level blocking
AdBlock+ launched ‘acceptable ads’ as an industry initiative
how does it work?
Some Ad Blocking Terms
Browser Extension: A small program that is linked to a web browser like Chrome or Firefox.
Content Policies / Filtering: The use of a program to screen and exclude from access or availability Web pages or electronic content that is deemed objectionable.
DOM (Document Object Model): An API to an underlying HTML or XML document represented typically by a hierarchy of objects within the document.
DPI (Deep Packet Inspection): A kind of filtering that examines the data part and header of a chunk of information as it passes a point in a network. It searches for protocol non-compliance, viruses, spam, intrusions, and other flagged information.
Gecko: A web browser engine designed to support open Internet standards. It is used by different applications to display web pages and, in some cases, an application's user interface itself. It is free and open-source software.
Lists (e.g. EasyList): A collection of domains, sub-domains, and other references to technology that delivers advertising to browsers.
RegEx (Regular Expression): A pattern for matching parts of a string of text. Similar to using an * wildcard when looking for a filename, but more expressive.
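The wildcard comparison in the RegEx entry can be made concrete with a small Python sketch. The URLs and both patterns are invented for this example; they are not taken from any real filter list.

```python
import fnmatch
import re

# Hypothetical request URLs, invented for this example.
urls = [
    "http://ads.example.com/banner.js",
    "http://www.example.com/article.html",
    "http://cdn.example.net/ads/top.png",
]

# Wildcard style: * stands for "any run of characters".
wildcard_hits = [u for u in urls if fnmatch.fnmatchcase(u, "*/ads/*")]

# Regular expression: matches a host starting with "ads." OR a first path segment "ads".
pattern = re.compile(r"^https?://(ads\.[^/]+|[^/]+/ads/)")
regex_hits = [u for u in urls if pattern.search(u)]

print(wildcard_hits)
print(regex_hits)
```

The wildcard only catches the URL with an /ads/ path, while the single regular expression also catches the ads. host, which is why filter lists lean on regex-like rules.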
Element Hiding + Request Blocking
Element hiding: A CSS snippet is injected into the DOM via the browser extension to hide elements. The ads disappear from the rendered page, but this doesn't prevent their resources from loading in the first place.
Request blocking: To keep the payload from loading in the first place, the HTTP requests for resources on the block list are blocked outright. This makes the page load faster by reducing the amount of data transferred. Request blocking also covers content that is loaded from within Flash or HTML5, including video pre-roll.
http://stackoverflow.com/users/406565/sebastian-noack
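The two techniques can be sketched in a few lines of Python. This is an illustration only: a real blocker runs inside the browser as an extension, and the selectors, domains, and function names here are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical rules, invented for illustration (not taken from EasyList).
HIDE_SELECTORS = [".ad-banner", "#sponsored-links"]
BLOCKED_DOMAINS = {"ads.example.com", "tracker.example.net"}

def element_hiding_css(selectors):
    """Element hiding: build the stylesheet an extension could inject."""
    return ", ".join(selectors) + " { display: none !important; }"

def should_block(url, blocked_domains=BLOCKED_DOMAINS):
    """Request blocking: True if the host is a blocked domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocked_domains)

print(element_hiding_css(HIDE_SELECTORS))
print(should_block("http://ads.example.com/banner.js"))
print(should_block("http://www.example.com/index.html"))
```

Note the difference in timing: the CSS acts after the content has arrived, while should_block would be consulted before the request goes out.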
The steps to blocking
User requests a web page with ad blocking enabled
The ad blocker runs as an extension to the browser
The ad blocker references a list of known ad servers and content delivery networks (CDNs)
The ad blocker inspects the DOM for scripts and CSS known to be affiliated with advertising
The ad blocker (sometimes) cleans up the holes in the page to tidy it up
The page renders in the user’s browser
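The steps above can be walked through with a toy model in Python. The page is a plain list of dictionaries standing in for DOM elements, and every host, class name, and function name is an assumption made up for the sketch.

```python
from urllib.parse import urlsplit

# Hypothetical filter data; a real blocker would load a list such as EasyList.
AD_HOSTS = {"ads.example.com"}
AD_CLASSES = {"ad-slot"}

# A toy stand-in for a parsed page: each dict is one element in the DOM.
PAGE = [
    {"tag": "h1", "classes": [], "src": None, "text": "Headline"},
    {"tag": "script", "classes": [], "src": "http://ads.example.com/ad.js", "text": ""},
    {"tag": "div", "classes": ["ad-slot"], "src": None, "text": "Buy now"},
    {"tag": "img", "classes": [], "src": "http://www.example.com/photo.jpg", "text": ""},
    {"tag": "p", "classes": [], "src": None, "text": "Article body"},
]

def is_ad(element):
    """Check the element's source host and CSS classes against the lists."""
    host = urlsplit(element["src"]).hostname if element["src"] else None
    return host in AD_HOSTS or bool(AD_CLASSES.intersection(element["classes"]))

def render(page, collapse=True):
    """Drop ad elements, or leave an empty placeholder when collapse is False."""
    rendered = []
    for element in page:
        if not is_ad(element):
            rendered.append(element)
        elif not collapse:
            # Without clean-up, the blocked element leaves a hole in the layout.
            rendered.append({"tag": element["tag"], "classes": [], "src": None, "text": ""})
    return rendered

print([e["tag"] for e in render(PAGE)])
```

Calling render(PAGE, collapse=False) keeps five elements, two of them empty, which is the "holes in the page" case.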
Domain Filtering
Other Blocking Methods
“Ad blockers - a white paper”, Secret Media - 2014
What the browser sees
With ad blocking
Let’s make an ad blocker!

#!/usr/bin/perl -w
# filter-easylist-to-hosts.pl: turn EasyList domain rules into hosts-file entries
use strict;

my %hosts = ();

# Keep only pure domain rules of the form ||example.com^
while ( <> ) {
    if ( $_ =~ m/^\|\|([a-z][a-z0-9_.-]+\.([a-z]{2,3}))\^\s*$/ ) {
        $hosts{$1} = 1;
    }
}

# Point each collected host at the loopback address
foreach my $host ( sort keys %hosts ) {
    print( "127.0.0.1\t$host\n" );
}

moneill$ wget https://easylist-downloads.adblockplus.org/easylist.txt
--2016-03-08 14:31:34-- https://easylist-downloads.adblockplus.org/easylist.txt
Resolving easylist-downloads.adblockplus.org... 136.243.62.212, 148.251.139.76, 2a01:4f8:212:1626::2, ...
Connecting to easylist-downloads.adblockplus.org|136.243.62.212|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1756007 (1.7M) [text/plain]
Saving to: 'easylist.txt'
easylist.txt 100%[===================>] 1.67M 368KB/s in 4.4s

moneill$ perl filter-easylist-to-hosts.pl easylist.txt >easylist.hosts
moneill$ cp /etc/hosts /etc/hosts.bak
moneill$ cp easylist.hosts /etc/hosts
https://newspaint.wordpress.com/2014/08/18/filtering-easylist-for-hosts-file-style-adblock/
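To see which EasyList rules the Perl script keeps, here is a rough Python rendering of the same pattern run against a handful of sample lines. The sample lines are hypothetical, written in EasyList syntax rather than copied from the real list, and the function name is made up for the sketch.

```python
import re

# Same pattern as the Perl script, with the hyphen moved to the end of the class.
HOST_RULE = re.compile(r"^\|\|([a-z][a-z0-9_.-]+\.([a-z]{2,3}))\^\s*$")

# Hypothetical filter lines in EasyList syntax; not copied from the real list.
SAMPLE_LINES = [
    "||ads.example.com^",              # plain domain rule: kept
    "||ads.example.com^$third-party",  # has options after ^: skipped
    "||example.com/banners/*",         # has a path: skipped
    "##.ad-banner",                    # element hiding rule: skipped
    "||ads.example.info^",             # TLD longer than 3 letters: skipped
]

def to_hosts_lines(lines):
    """Return sorted, de-duplicated hosts-file entries for the pure domain rules."""
    hosts = {m.group(1) for m in map(HOST_RULE.match, lines) if m}
    return ["127.0.0.1\t" + h for h in sorted(hosts)]

for entry in to_hosts_lines(SAMPLE_LINES):
    print(entry)
```

Only whole-domain rules survive, so a hosts file built this way is a coarse blocker: it cannot express path rules or element hiding, and it drops domains whose top-level part is longer than three letters.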
recent developments
improved mobile blocking
adblock + ad block plus acceptable ads
rise of blocker blockers
shine & carrier level blocking
ad blocking: weaknesses
annoying the privacy gods
adoption of encrypted web calls / https
Mobile is a challenge, especially in app
native and facebook
takeaways
it’s not hard to build
it’s largely open source and globally crowd sourced
it requires access to something in the document to work
ultimately it can be defeated by publishers and ad tech
But at what cost…