TRANSCRIPT
#RSPS15
StubHub's Field Guide To Preventing Competitor Price Scraping, Unwanted Transactions, Brute Force
Attacks, And Click Fraud
SPONSORED BY:
Retail TouchPoints: @RTouchPoints
Distil Networks: @Distil
Marty Boos: @StubHub
Rami Essaid: @RamiEssaid
Alicia Fiorletta: @AliciaFiorletta
Follow this event on LinkedIn & Twitter
Questions, Tweets & Resources
Submit your questions here
Download today’s resources
Join the conversation: #RSPS15
About Retail TouchPoints
○ Launched in 2007
○ Over 30,000 retail subscribers
○ To provide executives with relevant, insightful content across a variety of digital media
Sign up for our weekly newsletter: www.retailtouchpoints.com/subscribe
Panelists
MODERATOR: Alicia Fiorletta, Senior Editor, Retail TouchPoints
Rami Essaid, CEO & Co-Founder, Distil Networks, @ramiessaid
Marty Boos, Sr. Director Technology Operations, StubHub, @StubHub
Agenda
○ The growing bot problem
○ The impact of bots on e-commerce businesses
○ How StubHub squashed malicious bots
○ Selection criteria for a bot detection solution
○ Q & A
What Is Web Scraping?
Web Scraping: Also known as screen scraping, web scraping is the act of copying large amounts of data from a website, either manually or with an automated program (bot).
Legitimate Scraping: Scraping can sometimes be benevolent and totally acceptable, for example when search engine bots index your website.
Malicious Scraping: The systematic theft of intellectual property accessible on a website, including pricing, content, images, and proprietary data.
Web Scraping at Large Online Beauty Retailer
Black Friday saw a 100x increase in bad bots
Challenges and Distil Results
○ Challenge: Competitors were scraping product and pricing data, using it to lure customers away
   Result: Stopped competitors from scraping pricing and product data by blocking bad bots
○ Challenge: Traffic from malicious bots was consuming server resources and slowing site performance
   Result: Eliminated bad bot traffic, cutting server resource needs by 22% while improving performance
○ Challenge: Tracking suspicious IP addresses by hand was a tedious process
   Result: Automated the bot detection and mitigation process, saving valuable IT resources
Beauty Retailer Clamps Down on Competitive Data Mining
One of Europe’s largest online beauty retailers.
“We have a handful of competitors that cause us a lot of headaches. With Distil, we’ve stopped them from scraping our data, which protects our competitive advantage. In addition, we’ve reduced the load by 22%, and our customers experience faster response times.”
- Principal Solutions Developer
How Big is the Problem?
Up to 60% of traffic on ecommerce websites comes from bad bots
4.2 million IP addresses impacted by the “Pushdo” botnet alone
Bot traffic of 15% can equate to hitting each of your pricing pages 30 times per month
Why the Massive Increase in Bot Traffic?
Online data has increased in value: Pricing information, product availability, product descriptions, and vendor reviews change daily and are highly valuable to competitors
Anyone can get in the game: Cheap or free virtual servers, bandwidth, easy-to-use tools, and scrapers for hire
Bots no longer tied to IP addresses:
○ Bots cycle through random IP addresses
○ Bots hide behind anonymous proxies
○ Consumer IPs now infected with bot traffic too
High Profile Web Scraping in the Ecommerce Industry
QVC is an American television home shopping network and online ecommerce site.
Aggressive price and inventory scraping by a shopping aggregator app had the following repercussions for QVC:
● Two day website outage
● Loss of $2M in revenue
● Highly publicized lawsuit
● Damage to QVC Brand
Negative SEO Attacks
Bots steal content, product lists, and prices for duplication elsewhere on the Internet
Duplicated content reduces your company’s uniqueness and thus its quality score
SEO damage may result, especially if
○ Your prices are undercut
○ The content is repurposed on a more popular site
Bots and Negative SEO Attacks
Bots and Competitive Data Mining
Duplicating your Product Portfolio: Bots can easily gather product and supplier lists for replication elsewhere
Undermining your Prices: Bots monitor your prices, ensuring competitors can undercut with lower price listings
Availability Tracking: Identifying when your supply has been exhausted gives competitors a unique opportunity to raise the price of their goods.
Bots and Security Breaches
Brute Force Account Takeover: Using a bot to try, on your site, usernames and passwords stolen in breaches at other websites
Newly compromised accounts are then used for various forms of fraud/theft
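One way to spot the account takeover pattern described above is to count how many different usernames a single client fails to log in as within a short window. The sketch below is only an illustration of that idea, not StubHub's or Distil's method. The class name, the thresholds, and the notion of a `client_key` (ideally a device fingerprint rather than an IP address) are all assumptions.

```python
from collections import defaultdict, deque


class FailedLoginMonitor:
    """Flags a client that fails logins for many different usernames in a short window."""

    def __init__(self, max_distinct_users=3, window_seconds=60.0):
        # Both thresholds are illustrative assumptions, not recommended values.
        self.max_distinct_users = max_distinct_users
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # client key -> (timestamp, username)

    def record_failure(self, client_key, username, now):
        """Record one failed login; return True if the client looks like a bot."""
        window = self._failures[client_key]
        window.append((now, username))
        # Drop failures that have aged out of the sliding window.
        while window and now - window[0][0] > self.window_seconds:
            window.popleft()
        distinct_users = {user for _, user in window}
        return len(distinct_users) > self.max_distinct_users
```

A human who mistypes one password several times stays under the threshold because only one username is involved, while a bot replaying a stolen credential list quickly exceeds it.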
Bots and Transaction Fraud
Carding: Creating micro-transactions against e-commerce sites with stolen credit cards to test whether the cards are valid
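Carding leaves a recognizable trail: one client making small charges on many different cards. The following is a minimal sketch of a batch check for that trail. The function name, the tuple layout, and both thresholds are hypothetical, and a real fraud system would weigh many more signals.

```python
from collections import defaultdict


def find_card_testers(transactions, max_amount=2.00, min_distinct_cards=5):
    """Return client keys that made small charges on many different cards.

    transactions: iterable of (client_key, card_token, amount) tuples.
    The thresholds are illustrative assumptions only.
    """
    cards_by_client = defaultdict(set)
    for client_key, card_token, amount in transactions:
        # Only micro-transactions count toward the carding pattern.
        if amount <= max_amount:
            cards_by_client[client_key].add(card_token)
    return sorted(
        client
        for client, cards in cards_by_client.items()
        if len(cards) >= min_distinct_cards
    )
```

A shopper who makes a small purchase and a large one on the same card is not flagged, since only one distinct card is involved.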
About StubHub
○ Largest secondary ticket marketplace in the world
○ An eBay company
○ Processes nearly 500 transactions per second
○ StubHub is an online marketplace which provides services for buyers and sellers of tickets for sports, concerts, theater and other live entertainment events.
StubHub Bot Challenges
Bot Challenges
○ Bots were used for brute force account takeovers
○ Competitors tried to game the system by scraping prices and monitoring inventory and customer behavior
○ Random spikes in bot traffic were causing increased utilization of resources
○ Tested multiple competing solutions, but they were difficult to configure and in some cases broke our website
StubHub Bot Selection Criteria
Bot Detection and Mitigation Solution Requirements
○ Block web scrapers without impacting human visitors
○ Accurately identify good bots vs. bad bots
○ Cannot rely solely on a rule-based system; must include automated learning to “self tune” for defending against emerging and unknown threats
○ Needs to include the Distil community to improve accuracy of bot detection
○ Must seamlessly co-exist with existing solutions (SIEM, CDN, WAF, etc.)
StubHub Results with Distil Networks
Reduced competitive data mining and fraud
Drastically reduced competitive data mining, increased SEO rankings, and protected our marketplace ecosystem
Distil is a key piece of our fraud detection and prevention suite of tools
StubHub Results with Distil Networks
Improved traffic quality and enriched analytic data
Cut pageviews in half, without impacting human users or ad deliveries
Quality of traffic has greatly improved by stopping unwanted bots and limiting site access for trusted bots
Negative Security Model - Blocking Bad Bots
Positive Security Model - Whitelisting Trusted Sources
The Importance of No False Positives / Negative Impact on Humans
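The two security models named on this slide differ only in their default decision, which a few lines can make concrete. This is a toy sketch with hypothetical function names and client identifiers; it is not taken from the presentation.

```python
def negative_model_allows(client_id, blocklist):
    """Negative model: allow everything except clients known to be bad."""
    return client_id not in blocklist


def positive_model_allows(client_id, allowlist):
    """Positive model: deny everything except clients explicitly trusted."""
    return client_id in allowlist


# An unknown client is let through by the negative model and stopped by the positive one.
unknown_client = "client-123"
print(negative_model_allows(unknown_client, blocklist={"bad-bot-1"}))   # True
print(positive_model_allows(unknown_client, allowlist={"search-bot"}))  # False
```

The difference matters for false positives: under a purely positive model every unlisted human visitor would be blocked, so it suits trusted sources such as known good bots rather than general traffic.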
Good bots make up over 35% of all traffic to the average website
○ Search engines: Google, Bing, Baidu, etc.
○ Alexa Crawler
○ Pingdom, Keynote, etc.
Effective solutions block bad bots but leave good bots unhindered
The Importance of Accurately Identifying Good vs Bad Bots
Source: Distil Networks, 2015 Bad Bot Landscape Report
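Telling a real search engine crawler from a bad bot that copies its user agent is commonly done with a reverse DNS lookup followed by a forward lookup that must return the original IP. The sketch below assumes that approach. The function name and the suffix list are illustrative and incomplete, and the lookups are injectable so the logic can be exercised without network access.

```python
import socket

# Illustrative, incomplete list of crawler hostname suffixes (an assumption).
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com", ".search.msn.com")


def is_verified_crawler(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if ip reverse-resolves to a trusted crawler host
    and that host resolves forward to the same ip (IPv4 only)."""
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        hostname = reverse_lookup(ip).rstrip(".").lower()
        # A spoofed name such as "googlebot.com.evil.example" fails this suffix check.
        if not hostname.endswith(TRUSTED_SUFFIXES):
            return False
        # The forward lookup stops attackers who control only their own reverse DNS.
        return ip in forward_lookup(hostname)
    except OSError:
        # DNS failures are treated as "not verified".
        return False
```

Because DNS lookups are slow, a production check would normally cache results per IP rather than resolve on every request.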
Bot detection should never rely on static signatures or manual rule creation
Automation and machine learning must be performed in real-time
Effective bot mitigation solutions
○ Dynamically classify users by correlating dozens of data points as well as behavior patterns
○Constantly “self-tune” to evolve alongside the morphing threats they encounter and protect against
The Importance of Machine Learning and Self Tuning
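As a rough illustration of what "self tuning" can mean, the sketch below learns a baseline of request rates as traffic arrives and flags rates far above it, so the threshold moves with the site's own traffic instead of being a fixed rule. It uses a running mean and variance (Welford's method) with a z-score cutoff, which is a deliberately simple stand-in and not the machine learning described in the slides. All names and thresholds are assumptions.

```python
import math


class RequestRateBaseline:
    """Learns a running mean and variance of request rates (Welford's method)."""

    def __init__(self, z_threshold=3.0, min_samples=10):
        # Illustrative assumptions: 3 standard deviations, 10 samples before judging.
        self.z_threshold = z_threshold
        self.min_samples = min_samples
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, rate):
        """Fold one observed rate (e.g. requests per minute) into the baseline."""
        self.count += 1
        delta = rate - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (rate - self.mean)

    def is_anomalous(self, rate):
        """True if rate is far above the baseline learned so far."""
        if self.count < self.min_samples:
            return False  # not enough history to judge
        std = math.sqrt(self._m2 / (self.count - 1))
        if std == 0.0:
            # Perfectly flat history: treat any rate above it as unusual.
            return rate > self.mean
        return (rate - self.mean) / std > self.z_threshold
```

In use, each observed rate would be checked with `is_anomalous` and then passed to `update`, optionally skipping the update for flagged samples so that an attack does not inflate the baseline.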
○Real-time updates from a centralized violators database help protect all sites and improve accuracy
○Data from attacks detected anywhere on the network should be centralized, correlated, and analyzed by a big data analysis platform
○Signatures are then constantly updated to drastically reduce false positives (blocking humans) and false negatives (missing bad bots)
The Importance of Community Supported Centralized Threat Database
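The idea of a centralized violators database can be sketched as a shared registry in which a bot fingerprint reported by enough independent sites becomes known to all of them. This in-memory version is purely illustrative. The class name, the reporting threshold, and the fingerprint strings are assumptions and say nothing about how Distil's database works internally.

```python
from collections import defaultdict


class SharedViolatorRegistry:
    """Toy central store of bot fingerprints reported by participating sites."""

    def __init__(self, min_reporting_sites=2):
        # Requiring several independent reports is an assumed guard against
        # one site's false positive blocking a visitor everywhere.
        self.min_reporting_sites = min_reporting_sites
        self._reports = defaultdict(set)  # fingerprint -> sites that reported it

    def report(self, site, fingerprint):
        """Record that a site detected an attack from this fingerprint."""
        self._reports[fingerprint].add(site)

    def is_known_violator(self, fingerprint):
        """True once enough distinct sites have reported the fingerprint."""
        reporting_sites = self._reports.get(fingerprint, ())
        return len(reporting_sites) >= self.min_reporting_sites
```

Repeated reports from the same site count once, because reporting sites are stored as a set.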
Many organizations have complex web environments which may include a multitude of different solutions, including
○ Content Delivery Networks (CDNs)
○ WAFs, FW, IPS
○ SIEMs
○ Load balancers
○ and more
Bot mitigation must be able to deploy seamlessly alongside these technologies without impacting their performance or usage
The Importance of Seamless Compatibility
The First Easy and Accurate Way to Defend Websites Against Malicious Bots
The World’s Most Accurate Bot Detection System
Inline Fingerprinting: Fingerprints stick to the bot even if it attempts to reconnect from random IP addresses or hide behind an anonymous proxy.
Known Violators Database: Real-time updates from the world’s largest Known Violators Database, which is based on the collective intelligence of all Distil-protected sites.
Browser Validation: The first solution to disallow browser spoofing; it validates that each incoming request comes from the browser it self-reports and detects all known browser automation tools.
Behavioral Modeling and Machine Learning: Machine-learning algorithms pinpoint behavioral anomalies specific to your site’s unique traffic patterns.
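To show why a fingerprint can outlast an IP change, the sketch below derives an identifier from request headers only, so the same client produces the same value from any address. This is a simplified assumption for illustration. Headers alone are easy to spoof, the chosen header list is hypothetical, and Distil's actual fingerprinting signals are not described in the slides.

```python
import hashlib

# Hypothetical choice of signals; real fingerprints combine far more than headers.
SIGNAL_HEADERS = ("User-Agent", "Accept", "Accept-Language", "Accept-Encoding")


def client_fingerprint(headers):
    """Hash a fixed set of request headers into a stable identifier.

    The client IP is deliberately left out, so rotating IPs or proxies
    does not change the result.
    """
    parts = [f"{name}={headers.get(name, '')}" for name in SIGNAL_HEADERS]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()
```

Two requests with identical headers map to the same fingerprint regardless of source address, which is what lets a block decision follow the bot rather than the IP.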
How Ecommerce Companies Benefit from Distil
Increase insight & control over human, good bot & bad bot traffic
○ Block 99.9% of malicious bots without impacting legitimate users
○ Slash the high tax bots place on internal teams & web infrastructure
○ Protect data from web scrapers, unauthorized aggregators & hackers
Two Months of Free Service + Traffic Analysis
www.distilnetworks.com/trial/
Offer Ends October 15th
QUESTIONS… COMMENTS?
INFO@DISTILNETWORKS.COM
OR CALL US ON 1.866.423.0606
www.distilnetworks.com
Q & A // Panelists
MODERATOR: Alicia Fiorletta, Senior Editor, Retail TouchPoints
Rami Essaid, CEO & Co-Founder, Distil Networks, @ramiessaid
Marty Boos, Sr. Director Technology Operations, StubHub, @StubHub
http://www3.retailtouchpoints.com/rsp15/
PLEASE JOIN US FOR OUR NEXT SESSION: Today at 2PM ET / 11AM PT
Thanks for attending!