Upload: dr-augustine-fou

Post on 12-Jul-2015

A Technical Whitepaper

Distil.it 2200 Wilson Blvd., Suite 102-219 Arlington, VA 22201

www.distil.it

Inside the Distil Content Protection Network

Introduction  

The Distil Content Protection Network (CPN) is the first cloud-based, intelligent gatekeeper for website content. The system makes real-time decisions to distinguish between human visitors to your website and malicious bots so that controls can be put in place to limit or eliminate content scraping.

A CPN is like anti-virus protection software. It monitors 100% of incoming requests to ensure they are from valid end-users and not from potentially harmful software systems. Protection is provided automatically and around the clock to keep your web content safe from theft, reduce bandwidth requirements, and provide ultimate peace of mind.

The Distil CPN uses a unique, behavior-based learning mechanism that continually gathers knowledge over time to refine bad bot identification. The more you use Distil, the more it learns about your type of visitors and the better protection it provides. In addition, Distil accelerates the performance of your website by using content caching techniques.

An important aspect of the Distil CPN is that it runs entirely as a cloud service. This means no software or hardware infrastructure investment is needed to use the system. Our global network of redundant servers provides high availability for your website and unlimited scalability. Getting started with Distil takes minutes, since your existing website and infrastructure do not have to be modified in any way.

How Distil Works

From an end user’s perspective, there is no change in behavior when your website switches over to use the Distil Content Protection Network. End users will be able to access all services and information as they would normally.

 

From a technical perspective, requests for web pages are first routed through the Distil CPN servers so that all visitors can be monitored, vetted, and fingerprinted. Any visitor that appears to be scraping content is quickly identified by our behavior-based learning algorithms, and action is then taken according to your configuration options. For example, you could block offending users immediately, or use a gradual, stepped response that ends in an eventual ban. This gives you total control over the type of access you want for your particular website.

© 2012, Distil.it

Distil protects your web content from malicious scraping, data mining, and unauthorized duplication.

Dynamic Threat Response

It’s not enough to simply look for a certain type of behavior and ban the offending signatures. Bot and tool designers will quickly change the behavior of their software to work around simple defenses. Because the Distil Content Protection Network uses a behavior-based learning mechanism, the system dynamically adjusts to the type of threat encountered using a variety of methodologies:

Ø Active Connection Monitoring

The Distil CPN monitors every incoming connection and builds a fingerprint of each one. The platform evaluates a wide variety of metrics, such as user-agent, header values, and requests made over time, to look for discrepancies and irregularities that wouldn’t be present in normal, legitimate connection requests.
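As a rough illustration of this kind of monitoring, the sketch below hashes a few request headers into a fingerprint, flags two simple header irregularities, and counts requests over a sliding time window. It is not Distil's implementation: the header list, the irregularity rules, and the names `fingerprint`, `header_irregularities`, and `RateTracker` are all assumptions chosen for the example.

```python
import hashlib
from collections import deque

# Hypothetical header set; the whitepaper does not list the exact metrics used.
FINGERPRINT_HEADERS = ("user-agent", "accept", "accept-language", "accept-encoding")


def fingerprint(client_ip: str, headers: dict) -> str:
    """Hash the client address plus selected header values into a stable ID."""
    lowered = {k.lower(): v for k, v in headers.items()}
    parts = [client_ip] + [lowered.get(name, "") for name in FINGERPRINT_HEADERS]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


def header_irregularities(headers: dict) -> list:
    """Return simple discrepancies that ordinary browsers rarely show."""
    lowered = {k.lower(): v for k, v in headers.items()}
    issues = []
    agent = lowered.get("user-agent", "")
    if not agent:
        issues.append("missing user-agent")
    if "mozilla" in agent.lower() and "accept-language" not in lowered:
        issues.append("browser user-agent without accept-language")
    return issues


class RateTracker:
    """Count requests per fingerprint inside a sliding time window."""

    def __init__(self, window_seconds: float = 10.0, max_requests: int = 20):
        self.window_seconds = window_seconds
        self.max_requests = max_requests
        self._seen = {}

    def record(self, fp: str, now: float) -> bool:
        """Record a request; return True if the rate limit is exceeded."""
        times = self._seen.setdefault(fp, deque())
        times.append(now)
        # Discard timestamps that have fallen out of the window.
        while times and now - times[0] > self.window_seconds:
            times.popleft()
        return len(times) > self.max_requests
```

A real system would weigh many more signals; the point here is only that header values and request timing can be combined into a per-connection profile.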

Ø HTTP Stream Injection

Unlike any other solution on the market, Distil's patent-pending technology randomly and dynamically injects challenges and traps into the HTTP stream, making it impossible to predict our testing algorithms.
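One common form of trap is a hidden link that human visitors never see or follow. The sketch below shows that general idea only; the whitepaper does not describe Distil's actual challenges, and the `/__trap/` path and the function names are hypothetical.

```python
import re
import secrets

TRAP_PREFIX = "/__trap/"  # hypothetical path, not taken from the whitepaper


def inject_trap(html: str, issued_tokens: set) -> str:
    """Insert a hidden link with a random token just before the last </body>."""
    token = secrets.token_hex(8)
    issued_tokens.add(token)
    trap = f'<a href="{TRAP_PREFIX}{token}" style="display:none" rel="nofollow"></a>'
    matches = list(re.finditer(r"</body\s*>", html, flags=re.IGNORECASE))
    if not matches:
        # No closing body tag found: append the trap at the end instead.
        return html + trap
    index = matches[-1].start()
    return html[:index] + trap + html[index:]


def is_trap_hit(path: str, issued_tokens: set) -> bool:
    """True when a request targets a trap URL that was actually issued."""
    if not path.startswith(TRAP_PREFIX):
        return False
    return path[len(TRAP_PREFIX):] in issued_tokens
```

Because the token is random per response, a scraper cannot hard-code a list of URLs to avoid, which is the property the paragraph above relies on.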

Ø Network-wide Threat Information Propagation

All of the servers that make up the Distil CPN internally share information regarding malicious signatures. If a scraper attempts to attack one site protected by Distil, that unique signature will be distributed and flagged for all sites under protection.
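The propagation idea can be modeled in a few lines. This is a toy, in-process sketch under the assumption of a full mesh of peers; the whitepaper does not say how Distil's servers actually exchange signatures, and `Node` and `connect` are hypothetical names.

```python
class Node:
    """One hypothetical CPN server holding a local set of flagged signatures."""

    def __init__(self, name: str):
        self.name = name
        self.flagged = set()
        self.peers = []

    def flag(self, signature: str) -> None:
        """Flag locally, then push the signature to every peer."""
        if signature in self.flagged:
            return  # already known here, so stop the propagation loop
        self.flagged.add(signature)
        for peer in self.peers:
            peer.flag(signature)

    def is_flagged(self, signature: str) -> bool:
        return signature in self.flagged


def connect(nodes: list) -> None:
    """Make every node a peer of every other node (full mesh)."""
    for node in nodes:
        node.peers = [other for other in nodes if other is not node]
```

Flagging a signature on any one node makes `is_flagged` return `True` on all of them, which mirrors the behavior described above.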

Ø User Defined Threat Response

In the event of false positives, the Distil CPN gives the site owner the ability to adjust the severity of threat response. Responses can vary based on what the site owner wishes to configure. A few examples of responses are:

CAPTCHA Challenges: A minimally invasive challenge to confirm a real end user

Block Page: A custom form that captures user information and deploys Distil support to immediately investigate

Drop Connect: A roadblock for the most egregious offenders
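A configurable, stepped response of this kind can be sketched as a table that maps a violation count to an action. The thresholds below are illustrative assumptions, not Distil defaults, and `Action`, `DEFAULT_POLICY`, and `choose_action` are hypothetical names.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    CAPTCHA = "captcha"
    BLOCK_PAGE = "block_page"
    DROP = "drop"


# Hypothetical thresholds: (minimum violations, action to take).
DEFAULT_POLICY = [(1, Action.CAPTCHA), (3, Action.BLOCK_PAGE), (5, Action.DROP)]


def choose_action(violations: int, policy=DEFAULT_POLICY) -> Action:
    """Return the strictest action whose threshold has been reached."""
    action = Action.ALLOW
    for threshold, step in sorted(policy, key=lambda item: item[0]):
        if violations >= threshold:
            action = step
    return action
```

A site owner worried about false positives could pass a more lenient policy, for example `[(10, Action.DROP)]`, so that visitors are left alone until ten violations have been recorded.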

Easy Setup and Configuration

The Distil Content Protection Network was designed from the ground up to be quick to deploy and easy to maintain. Unlike any other solution on the market, the Distil CPN does not require any complex setup. Other providers require hardware and/or custom code integration, which puts unnecessary stress on your infrastructure and is very time-consuming. The Distil CPN instead takes load off your servers by routing your traffic through our network of cloud servers, improving your performance and, more importantly, eliminating all setup effort.

Advantages of the Cloud

By acting as a cloud-based intermediary between end users and content sites, the Distil Content Protection Network has several advantages over appliance-based and server-side software anti-scraping solutions:

Ø Better Response Times and Reduced Latency

Because the Distil CPN is based entirely in the cloud, the system leverages a global network of redundant servers dispersed geographically throughout North America, Europe, and Asia. When end users connect to a specific site, they are routed through the Distil server closest to them. Our servers then forward that connection through an internal network if it is going to a different geographical region, or through a backbone link if it is staying within the same region. This allows the connection to avoid the less-than-optimal links that can occur between ISPs and internet backbone providers.

Ø Limitless Scalability

As your business grows, so will your infrastructure and bandwidth requirements. The Distil CPN scales dynamically with your needs, based on the number of connection requests we observe. If your site experiences only a momentary spike in traffic, the system scales back down afterward, so you only ever use the bandwidth you need.

Ø Better Reporting and Monitoring Options

Every server contains rudimentary reporting and logging options for troubleshooting and traffic monitoring. These tools, however, are often limited in either scope or resources because they aren’t the server’s primary purpose. The Distil CPN isn’t constrained by the same limitations, and the fact that Distil acts as a gateway allows the system to offer enhanced connection monitoring and reporting options.

The Distil Advantage

Due to the cloud-based nature of the Distil Content Protection Network, the service also offers several distinct site acceleration features, allowing us to protect website content while accelerating performance:

Ø Cloud Acceleration

The Distil CPN gives you the option to accelerate the delivery of your files from our global edge nodes. This reduces page load times resulting in happier customers and higher conversions. This will also reduce the load on your infrastructure resulting in better server performance and lower bandwidth consumption.

Ø Dynamic Content Caching

The Distil CPN caches website content in order to return resources faster to users and reduce page load time, bandwidth, and server load. Distil not only caches static content but also identifies content dynamically generated by your applications. Cached content does not mean stale content. Distil allows you to customize the length of time to store cached requests and honors backend "Expires" and "Cache-Control" directives.
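To make "honors backend Expires and Cache-Control directives" concrete, the sketch below derives a cache lifetime from origin response headers, roughly following standard HTTP caching precedence (`s-maxage`, then `max-age`, then `Expires`). It is a simplified assumption of how a shared cache might behave, not Distil's code; `cache_ttl_seconds` and `default_ttl` are hypothetical names.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def cache_ttl_seconds(headers: dict, now: datetime, default_ttl: int = 300) -> int:
    """Return how long a shared cache may keep a response, in seconds."""
    lowered = {k.lower(): v for k, v in headers.items()}
    raw = lowered.get("cache-control", "")
    directives = [d.strip().lower() for d in raw.split(",") if d.strip()]
    # Simplification: treat no-cache like no-store instead of revalidating.
    if any(d in directives for d in ("no-store", "no-cache", "private")):
        return 0
    # s-maxage applies to shared caches and takes precedence over max-age.
    for name in ("s-maxage", "max-age"):
        for directive in directives:
            if directive.startswith(name + "="):
                value = directive.split("=", 1)[1].strip('"')
                if value.isdigit():
                    return int(value)
    expires = lowered.get("expires")
    if expires:
        try:
            expiry = parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            return 0  # an invalid Expires value is treated as already expired
        if expiry.tzinfo is None:
            expiry = expiry.replace(tzinfo=timezone.utc)
        return max(0, int((expiry - now).total_seconds()))
    # No directive from the backend: fall back to the configured default.
    return default_ttl
```

The `default_ttl` argument stands in for the customizable storage time mentioned above; `now` is expected to be a timezone-aware `datetime`.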

Ø Compression

For content-rich applications, data transfer can take a long time, which is why common web servers and browsers support content compression. Configuring compression on your own web server requires complicated settings and technical knowledge, and it consumes substantial processing power on that server. The Distil CPN compresses content for you automatically, even if it is sent uncompressed from your server.
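Compression at an intermediary can be sketched as follows: if the client advertises gzip support and the origin sent the body uncompressed, the proxy compresses it and adjusts the response headers. This is a minimal sketch using Python's standard `gzip` module; `maybe_compress` is a hypothetical name, and the parsing ignores `q` values for brevity.

```python
import gzip


def maybe_compress(body: bytes, request_headers: dict, response_headers: dict):
    """Gzip the body when the client accepts it and the origin did not."""
    req = {k.lower(): v for k, v in request_headers.items()}
    resp = {k.lower(): v for k, v in response_headers.items()}
    # Simplification: q-values such as "gzip;q=0" are not interpreted.
    accepted = [
        item.split(";")[0].strip().lower()
        for item in req.get("accept-encoding", "").split(",")
    ]
    if "gzip" not in accepted or "content-encoding" in resp:
        return body, resp
    compressed = gzip.compress(body)
    if len(compressed) >= len(body):
        return body, resp  # compression did not help, so send as-is
    resp["content-encoding"] = "gzip"
    resp["content-length"] = str(len(compressed))
    vary = resp.get("vary")
    resp["vary"] = "Accept-Encoding" if not vary else vary + ", Accept-Encoding"
    return compressed, resp
```

The function returns the (possibly compressed) body together with a copy of the response headers using lowercase names.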

About Distil.it

Distil is the leading Content Protection Network (CPN) and the first cloud-based, intelligent gatekeeper for website content. Our CPN makes real-time decisions and seamlessly distinguishes human visitors from malicious bots. Distil mitigates duplicate content, improves SEO power, and accelerates the end-user experience, all while reducing server load and infrastructure demand. The setup is lightning-fast, secure, and completely transparent.

Our mission is to provide enterprise-class protection that safeguards commercial and individual content producers. With the Distil CPN, there is finally a solution to protect your content, your brand, and your revenue without impacting your end-user experience.