Inside the Distil Content
Protection Network
A Technical Whitepaper
Distil.it
2200 Wilson Blvd., Suite 102-219
Arlington, VA 22201
www.distil.it
Distil protects your web content from
malicious scraping, data mining, and
unauthorized duplication.
Introduction
The Distil Content Protection Network (CPN) is
the first cloud-based, intelligent gatekeeper for
website content. The system makes real-time
decisions to distinguish between human visitors
to your website and malicious bots so that
controls can be put in place to limit or eliminate
content scraping.
A CPN is like anti-virus protection software. It
monitors 100% of incoming requests to ensure
they are from valid end-users and not from
potentially harmful software systems. Protection
is provided automatically and around the clock to
keep your web content safe from theft, reduce
bandwidth requirements, and provide ultimate
peace of mind.
The Distil CPN uses a unique, behavioral-based
learning mechanism that continually gathers
knowledge over time to perfect bad bot
identification. So, the more you use Distil, the
more it learns about your type of visitors and the
better protection it provides. In addition, Distil
accelerates performance of your website by using
content caching techniques.
An important aspect of the Distil CPN is that it
runs entirely as a cloud service. This means there
is no infrastructure investment from either a
software or a hardware standpoint in order to use
the system. Our global network of redundant
servers provides high availability for your website
and unlimited scalability. And, getting started
with Distil can be done within minutes since your
existing website and infrastructure do not have to
be modified in any way.
How Distil Works
From an end user’s perspective, there is no
change in behavior when your website switches
over to use the Distil Content Protection
Network. End users will be able to access all
services and information as they would normally.
From a technical perspective, requests for web
pages will first be routed through the Distil CPN
servers so that all visitors can be monitored,
vetted, and fingerprinted. Any visitor that
appears to be performing content scraping will be
quickly identified by our behavioral-based
learning algorithms, and then action is taken
according to configuration options. For example,
you could immediately block offending users or
use a gradual, stepped response resulting in an
eventual ban. This gives you total control over
the type of access you want for your particular
website.
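As a rough illustration of the fingerprint-then-respond flow described above, the sketch below hashes a few request attributes into a visitor identifier and escalates the response as violations accumulate. It is not Distil's implementation: the function and class names, the choice of attributes, the "strike" counter, and the thresholds are all assumptions made for the example. Only the response types (captcha, block page, dropped connection) come from this paper.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical escalation ladder. The response names echo those in this paper,
# but the ordering and the thresholds are assumptions.
RESPONSES = ["allow", "captcha", "block_page", "drop_connection"]


def fingerprint(ip: str, headers: Dict[str, str]) -> str:
    """Hash a few request attributes into a stable visitor identifier."""
    parts = [ip, headers.get("User-Agent", ""), headers.get("Accept-Language", "")]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()[:16]


@dataclass
class SteppedResponsePolicy:
    """Escalate one step for every `strikes_per_step` recorded violations."""

    strikes_per_step: int = 3
    strikes: Dict[str, int] = field(default_factory=dict)

    def record_violation(self, visitor_id: str) -> None:
        # A "violation" is whatever an upstream detector flags (assumed here).
        self.strikes[visitor_id] = self.strikes.get(visitor_id, 0) + 1

    def action_for(self, visitor_id: str) -> str:
        step = self.strikes.get(visitor_id, 0) // self.strikes_per_step
        # Clamp so repeat offenders stay at the most severe response.
        return RESPONSES[min(step, len(RESPONSES) - 1)]
```

A site owner who prefers an immediate block could, in this sketch, shorten the `RESPONSES` list or set `strikes_per_step=1`. A more lenient configuration would raise the threshold instead.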
© 2012, Distil.it
Dynamic Threat Response
It’s not enough to simply look for a certain type
of behavior and ban the offending signatures. Bot
and tool designers will quickly change the
behavior of their software to work around simple
defenses. Because the Distil Content Protection
Network uses a behavior-based learning
mechanism, the system dynamically adjusts to
the type of threat encountered using a variety of
methodologies:
➢ Active Connection Monitoring
The Distil CPN monitors every single
connection and builds a fingerprint of every
incoming connection. The platform evaluates
a wide variety of metrics such as user-agent,
header values, and requests made over time,
to look for discrepancies and irregularities
that wouldn’t be present in normal and
legitimate connection requests.
➢ HTTP Stream Injection
Unlike any other solution on the market, the
Distil patent-pending technology randomly
and dynamically injects challenges and traps
into the HTTP stream making it impossible
to predict our testing algorithms.
➢ Network-wide Threat Information Propagation
All of the servers that make up the Distil
CPN internally share information regarding
malicious signatures. If a scraper attempts to
attack one site protected by Distil, that
unique signature will be distributed and
flagged for all sites under protection.
➢ User Defined Threat Response
In the event of false positives, the Distil CPN
gives the site owner the ability to adjust the
severity of threat response. Responses can
vary based on what the site owner wishes to
configure. A few examples of responses are:
Captcha Challenges: A minimally invasive
challenge to guarantee a real end user
Block Page: A custom form that captures
user information and deploys Distil support
to immediately investigate
Drop Connect: A roadblock for the most
egregious offenders
Easy Setup and Configuration
The Distil Content Protection Network was
designed from the ground up to be quick to
deploy and easy to maintain. Unlike any other
solution on the market, the Distil CPN does not
require any complex setup. Other providers
require hardware and/or custom code integration.
This puts unnecessary stress on your
infrastructure and is very time-consuming. The
Distil CPN instead offloads the load from your
servers by routing your traffic through our
network of cloud servers, improving your
performance and, more importantly, eliminating
all setup effort.
Advantages of the Cloud
By acting as a cloud-based intermediary between
end users and content sites, the Distil Content
Protection Network has several advantages over
appliance-based and server-side software
anti-scraping solutions:
➢ Better Response Times and Reduced Latency
Because the Distil CPN is based entirely in
the cloud, the system leverages a global
network of redundant servers dispersed
geographically throughout North America,
Europe, and Asia. When end-users connect to
a specific site, they’ll be routed through a
Distil server closest to them. Our servers will
then forward that connection through an
internal network if going to a different
geographical region or through a backbone
link if staying within the same region. This
allows the connection to avoid less than
optimal links that can occur between ISPs
and internet backbone providers.
➢ Limitless Scalability
As your business grows, so will your
infrastructure and bandwidth requirements.
The Distil CPN will scale with your needs
dynamically based on the number of
connection requests we observe. If your site
is only experiencing a momentary spike in
traffic, the system will scale back down
accordingly, allowing you to always use only
the amount of bandwidth you need.
➢ Better Reporting and Monitoring Options
Every server contains rudimentary reporting
and logging options for troubleshooting and
traffic monitoring. These tools, however, are
often limited in either scope or resources
because they aren’t the server’s primary
purpose. The Distil CPN, however, isn’t
constrained by the same limitations, and the
fact that Distil acts as a gateway allows the
system to offer enhanced connection
monitoring and reporting options.
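The "closest server" routing mentioned under Better Response Times and Reduced Latency can be pictured with a small sketch that picks the nearest edge location by great-circle distance. This is only one plausible reading: the paper does not say how proximity is measured, and production networks often rely on DNS- or anycast-based selection rather than geographic distance. The edge names and coordinates below are hypothetical placeholders, not Distil's actual sites.

```python
import math

# Hypothetical edge locations as (latitude, longitude) in degrees.
EDGES = {
    "north-america": (38.88, -77.10),
    "europe": (50.11, 8.68),
    "asia": (1.35, 103.82),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def closest_edge(client, edges=EDGES):
    """Return the name of the edge location nearest to the client."""
    return min(edges, key=lambda name: haversine_km(client, edges[name]))
```

For instance, `closest_edge((48.85, 2.35))` would select the `"europe"` entry for a client located near Paris.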
The Distil Advantage
Due to the cloud-based nature of the Distil Content
Protection Network, the service also offers
several distinct site acceleration features, allowing
us to protect website content while accelerating
performance:
➢ Dynamic Content Caching
The Distil CPN caches website content in
order to return resources faster to users and
reduce page load time, bandwidth, and server
load. Distil not only caches static content but
also identifies content dynamically generated
by your applications. Cached content does
not mean stale content. Distil allows you to
customize the length of time to store cached
requests and honors backend "Expires" and
"Cache-Control" directives.
➢ Cloud Acceleration
The Distil CPN gives you the option to
accelerate the delivery of your files from our
global edge nodes. This reduces page load
times resulting in happier customers and
higher conversions. This will also reduce the
load on your infrastructure resulting in
better server performance and lower
bandwidth consumption.
➢ Compression
For content-rich applications, data transfer
can take a long time, which is why common
web servers and browsers support
compression for content. Configuring the
compression of resources on your web server
requires complicated settings and technical
knowledge. It also requires substantial
processing power from your web server. The
Distil CPN compresses content for you
automatically even if it is sent uncompressed
from your server.
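To make the caching and compression behavior above more concrete, here is a minimal sketch of how a gateway might derive a cache lifetime from the backend's "Cache-Control" and "Expires" headers, and gzip a response the origin sent uncompressed. It is an illustration under stated assumptions, not Distil's code. The function names and `DEFAULT_TTL` are hypothetical. Header names are assumed to arrive in canonical case. `no-cache` is treated as uncacheable for simplicity, and `s-maxage` and Accept-Encoding q-values are ignored.

```python
import gzip
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

DEFAULT_TTL = 300  # seconds; stands in for a site-owner setting (assumed)


def cache_ttl(headers, now, default_ttl=DEFAULT_TTL):
    """Return how many seconds a response may be cached (0 means do not cache)."""
    directives = [d.strip().lower() for d in headers.get("Cache-Control", "").split(",")]
    if any(d in ("no-store", "no-cache", "private") for d in directives):
        return 0
    for d in directives:
        if d.startswith("max-age="):
            try:
                return max(0, int(d.split("=", 1)[1]))
            except ValueError:
                break  # malformed max-age: fall back to Expires or the default
    if "Expires" in headers:
        try:
            expires = parsedate_to_datetime(headers["Expires"])
            return max(0, int((expires - now).total_seconds()))
        except (TypeError, ValueError):
            return 0  # an unparseable date is treated as already expired
    return default_ttl


def maybe_gzip(body, headers, accept_encoding):
    """Gzip the body if the client accepts gzip and the origin did not encode it."""
    accepted = [t.split(";")[0].strip().lower() for t in accept_encoding.split(",")]
    if "gzip" not in accepted or "Content-Encoding" in headers:
        return body, headers
    compressed = gzip.compress(body)
    out = dict(headers)
    out["Content-Encoding"] = "gzip"
    out["Content-Length"] = str(len(compressed))
    # Tell downstream caches that the response varies by Accept-Encoding.
    out["Vary"] = ", ".join(filter(None, [headers.get("Vary"), "Accept-Encoding"]))
    return compressed, out
```

In this sketch the backend's directives take precedence, and the configured default applies only when the origin gives no caching guidance.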
About Distil.it
Distil is the leading Content Protection Network (CPN) and the first cloud-based, intelligent
gatekeeper for website content. Our CPN makes real-time decisions and seamlessly distinguishes
human visitors from malicious bots. Distil mitigates duplicate content, improves SEO
power, and accelerates the end-user experience – all while reducing server load and infrastructure
demand. The setup is lightning-fast, secure, and completely transparent.
Our mission is to provide enterprise-class protection, safeguarding commercial and individual
content producers. With the Distil CPN there is finally a solution to protect your content, your
brand, and your revenue without impacting your end-user experience.