INTERNET APPLICATIONS
           Internet applications (IAs), also called Rich Internet applications, are web
applications that have the features and functionality of traditional desktop applications.
IAs typically transfer the processing necessary for the user interface to the web client but
keep the bulk of the data (i.e. the program state, the application data, etc.) on the
application server.
IAS TYPICALLY:
          run in a web browser, or do not require software installation
          run locally in a secure environment called a sandbox
can be "occasionally connected" wandering in and out of hot-spots or from office to
office.


HISTORY OF IAS
          The term " Internet application" was introduced in a Macromedia whitepaper in
March 2002, though the concept had been around for a number of years before that under
different names such as:
          Remote Scripting, by Microsoft, circa 1998
          X Internet, by Forrester Research in October 2000
          (Web) clients
          web application
COMPARISON TO STANDARD WEB APPLICATIONS
          Traditional web applications centered all activity around a client-server
architecture with a thin client. Under this system all processing is done on the server, and
the client is only used to display static (in this case HTML) content. The biggest
drawback with this system is that all interaction with the application must pass through
the server, which requires data to be sent to the server, the server to respond, and the page
to be reloaded on the client with the response. By using a client side technology which
can execute instructions on the client's computer, IAs can circumvent this slow and
synchronous loop for many user interactions. This difference is somewhat analogous to
the difference between "terminal and mainframe" and Client-server/Fat client approaches.
Internet standards have evolved slowly and continually over time to accommodate these
techniques, so it is hard to draw a strict line between what constitutes an IA and what
does not. But all IAs share one characteristic: they introduce an intermediate layer of
code, often called a client engine, between the user and the server. This client engine is
usually downloaded at the beginning of the application, and may be supplemented by
further code downloads as the application progresses. The client engine acts as an
extension of the browser, and usually takes over responsibility for rendering the
application's user interface and for server communication.
        What can be done in an IA may be limited by the capabilities of the system used
on the client. But in general, the client engine is programmed to perform application
functions that its designer believes will enhance some aspect of the user interface, or
improve its responsiveness when handling certain user interactions, compared to a
standard Web browser implementation. Also, while simply adding a client engine does
not force an application to depart from the normal synchronous pattern of interactions
between browser and server, in most IAs the client engine performs additional
asynchronous communications with servers.


BENEFITS
        Because IAs employ a client engine to interact with the user, they are:
        Richer. They can offer user-interface behaviors not obtainable using only the
HTML widgets available to standard browser-based Web applications. This richer
functionality may include anything that can be implemented in the technology being used
on the client side, including drag and drop, using a slider to change data, and calculations
performed only by the client which do not need to be sent back to the server (e.g. an
insurance rate calculator).
        More responsive. The interface behaviors are typically much more responsive
than those of a standard Web browser that must always interact with the server.
The most sophisticated examples of IAs exhibit a look and feel approaching that of a
desktop environment. Using a client engine can also produce other performance benefits:
        Client/Server balance. The demand for client and server computing resources is
better balanced, so that the Web server need not be the workhorse that it is with a
traditional Web application. This frees server resources, allowing the same server
hardware to handle more client sessions concurrently.
Asynchronous communication. The client engine can interact with the server
asynchronously -- that is, without waiting for the user to perform an interface action like
clicking on a button or link. This option allows IA designers to move data between the
client and the server without making the user wait. Perhaps the most common application
of this is prefetching, in which an application anticipates a future need for certain data,
and downloads it to the client before the user requests it, thereby speeding up a
subsequent response. Google Maps uses this technique to move adjacent map segments to
the client before the user scrolls their view (a JavaScript sketch of this idea appears after
this list of benefits).
           Network efficiency. The network traffic may also be significantly reduced because
an application-specific client engine can be more intelligent than a standard Web browser
when deciding what data needs to be exchanged with servers. This can speed up
individual requests or responses because less data is being transferred for each
interaction, and overall network load is reduced. However, use of asynchronous
prefetching techniques can neutralize or even reverse this potential benefit. Because the
code cannot anticipate exactly what every user will do next, it is common for such
techniques to download extra data, not all of which is actually needed, to many or all
clients.
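
        As an illustrative sketch of the prefetching idea described under "Asynchronous
communication" above (the tile URLs and endpoint are hypothetical, not taken from any
real product), a client engine written in JavaScript can start background downloads
before the user asks for the data:

    // Minimal prefetching sketch; the /tiles/... URLs are hypothetical.
    function prefetchTile(url) {
        var img = new Image();  // creating an Image starts a background download
        img.src = url;          // the browser caches the tile for instant display later
    }
    // Fetch the tiles around the current view before the user scrolls:
    ['/tiles/3_4.png', '/tiles/3_5.png', '/tiles/4_4.png'].forEach(prefetchTile);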
SHORTCOMINGS AND RESTRICTIONS
          Shortcomings and restrictions associated with IAs are:
          Sandbox. Because IAs run within a sandbox, they have restricted access to system
resources. If assumptions about access to resources are incorrect, IAs may fail to operate
correctly.
          Disabled scripting. JavaScript or another scripting language is often required. If
the user has disabled active scripting in their browser, the IA may not function properly,
if at all.
          Client processing speed. To achieve platform independence, some IAs use client-
side scripts written in interpreted languages such as JavaScript, with a consequential loss
of performance. This is not an issue with compiled client languages such as Java, where
performance is comparable to that of traditional compiled languages, or with Flash
movies, in which the bulk of the operations are performed by the native code of the Flash
player.
          Script download time. Although it does not have to be installed, the additional
client-side intelligence (or client engine) of IA applications needs to be delivered by the
server to the client. While much of this is usually automatically cached it needs to be
transferred at least once. Depending on the size and type of delivery, script download
time may be unpleasantly long. IA developers can lessen the impact of this delay by
compressing the scripts, and by staging their delivery over multiple pages of an
application.
          Loss of integrity. If the application-base is X/HTML, conflicts arise between the
goal of an application (which naturally wants to be in control of its presentation and
behaviour) and the goals of X/HTML (which naturally wants to give away control). The
DOM interface for X/HTML makes it possible to create IAs, but by doing so makes it
impossible to guarantee correct function. Because an IA client can modify the IA's basic
structure and override presentation and behaviour, it can cause an irrecoverable client
failure or crash. Eventually, this problem could be solved by new client-side mechanisms
that granted an IA client more limited permission to modify only those resources within
the scope of its application. (Standard software running natively does not have this
problem because by definition a program automatically possesses all rights to all its
allocated resources).
        Loss of visibility to search engines. Search engines may not be able to index the
text content of the application.
MANAGEMENT COMPLICATIONS


        The advent of IA technologies has introduced considerable additional complexity
into Web applications. Traditional Web applications built using only standard HTML,
having a relatively simple software architecture and being constructed using a limited set
of development options, are relatively easy to design and manage. For the person or
organization using IA technology to deliver a Web application, the additional complexity
of IAs makes them harder to design, test, measure, and support.
        Use of IA technology poses several new Service Level Management ("SLM")
challenges, not all of which are completely solved today. SLM concerns are not always
the focus of application developers, and are rarely if ever perceived by application users,
but they are vital to the successful delivery of an online application. Aspects of the IA
architecture that complicate management processes[1] are:
        Greater complexity makes development harder. The ability to move code to the
client gives application designers and developers far more creative freedom. But this in
turn makes development harder, increases the likelihood of defects (bugs) being
introduced, and complicates software testing activities. These complications lengthen the
software development process, regardless of the particular methodology or process being
employed. Some of these issues may be mitigated through the use of a Web application
framework to standardize aspects of IA design and development. However, increasing
complexity in a software solution can complicate and lengthen the testing process, if it
increases the number of use cases to be tested. Incomplete testing lowers the application's
quality and its reliability during use.
        IA architecture breaks the Web page paradigm. Traditional Web applications can
be viewed as a series of Web pages, each of which requires a distinct download, initiated
by an HTTP GET request. This model has been characterized as the Web page paradigm.
IAs invalidate this model, introducing additional asynchronous server communications to
support a more responsive user interface. In IAs, the time to complete a page download
may no longer correspond to something a user perceives as important, because (for
example) the client engine may be prefetching some of the downloaded content for future
use. New measurement techniques must be devised for IAs, to permit reporting of
response time quantities that reflect the user's experience. In the absence of standard tools
that do this, IA developers must instrument their application code to produce the
measurement data needed for SLM.
       Asynchronous communication makes it harder to isolate performance problems.
Paradoxically, actions taken to enhance application responsiveness also make it harder to
measure, understand, report on, and manage responsiveness. Some IAs do not issue any
further HTTP GET requests from the browser after their first page, using asynchronous
requests from the client engine to initiate all subsequent downloads. The IA client engine
may be programmed to continually download new content and refresh the display, or (in
applications using the Comet approach) a server-side engine can keep pushing new
content to the browser over a connection that never closes. In these cases, the concept of
a "page download" is no longer applicable. These complications make it harder to
measure and subdivide application response times, a fundamental requirement for
problem isolation and service level management. Tools designed to measure traditional
Web applications may -- depending on the details of the application and the tool -- report
such applications either as a single Web page per HTTP request, or as an unrelated
collection of server activities. Neither conclusion reflects what is really happening at the
application level.
       The client engine makes it harder to measure response time. For traditional Web
applications, measurement software can reside either on the client machine or on a
machine that is close to the server, provided that it can observe the flow of network
traffic at the TCP and HTTP levels. Because these protocols are synchronous and
predictable, a packet sniffer can read and interpret packet-level data, and infer the user’s
experience of response time by tracking HTTP messages and the times of underlying
TCP packets and acknowledgments. But the IA architecture reduces the power of the
packet sniffing approach, because the client engine breaks the communication between
user and server into two separate cycles operating asynchronously -- a foreground (user-
to-engine) cycle, and a background (engine-to-server) cycle. Both cycles are important,
because neither stands alone; it is their relationship that defines application behavior. But
that relationship depends only on the application design, which cannot (in general) be
inferred by a measurement tool, especially one that can observe only one of the two
cycles. Therefore the most complete IA measurements can only be obtained using tools
that reside on the client and observe both cycles.


THE CURRENT STATUS OF IA DEVELOPMENT AND ADOPTION
        IAs are still in the early stages of development and user adoption. There are a
number of restrictions and requirements that remain:
Browser adoption: Many IAs require modern web browsers in order to run. Advanced
JavaScript engines must be present in the browser as IAs use techniques such as
XMLHTTPRequest for client-server communication, and DOM Scripting and advanced
CSS techniques to enable the user interface.


Web standards: Differences between web browsers can make it difficult to write an IA
that will run across all major browsers. The consistency of the Java platform, particularly
after Java 1.1, makes this task much simpler for IAs written as Java applets.


Development tools: Some Ajax Frameworks and products like Adobe Flex provide an
integrated environment in which to build IA and B2B web applications.
Accessibility concerns: Additional interactivity may require technical approaches that
limit applications' accessibility.


User adoption: Users expecting standard web applications may find that some accepted
browser functionality (such as the "Back" button) may have somewhat different or even
undesired behaviour.
JUSTIFICATIONS
       Although developing applications to run in a web browser is a much more
limiting, difficult, and intricate process than developing a regular desktop application,
the efforts are often justified because:
       installation is not required -- updating and distributing the application is an
instant, automatically handled process
       users can use the application from any computer with an internet connection, and
usually regardless of what operating system that computer is running
       web-based applications are generally less prone to viral infection than running an
actual executable
       as web usage increases, computer users are becoming less willing to go to the
trouble of installing new software if a browser-based alternative is available
       This last point is often true even if the browser-based alternative is slower or not
as feature-rich. A good example of this phenomenon is webmail.


METHODS AND TECHNIQUES
JAVASCRIPT


       The first major client-side language and technology with the ability to run code,
installed on a majority of web clients, was JavaScript. Although its uses were relatively
limited at first, combined with layers and other developments in DHTML it became
possible to piece together an IA system without the use of a unified client-side solution.
Ajax is a term coined to refer to this combination of techniques, and it has recently been
used most prominently by Google for projects such as Gmail and Google Maps.
However, creating a large application in this manner is very difficult, as many different
technologies must interact to make it work, and browser compatibility requires a lot of
effort. In order to make the process easier, several Ajax frameworks have been
developed.
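
       As a minimal sketch of the Ajax technique just described (the /search endpoint
and the "results" element are hypothetical), a page can ask the server for data
asynchronously and update part of itself without a full reload:

    // Minimal Ajax sketch; endpoint and element ids are hypothetical.
    function ajaxUpdate(query) {
        var xhr = new XMLHttpRequest();
        xhr.open('GET', '/search?q=' + encodeURIComponent(query), true); // true = async
        xhr.onreadystatechange = function () {
            if (xhr.readyState === 4 && xhr.status === 200) {
                // Insert the server's response into the page instead of reloading it
                document.getElementById('results').innerHTML = xhr.responseText;
            }
        };
        xhr.send(null);
    }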
       The "" in " Internet applications" may also suffer from an all-JavaScript approach,
because you are still bound by the media types predictably supported by the world's
various deployed browsers -- video will display in different ways in different browsers
with an all-JavaScript approach, audio support will be unpredictable, realtime
communications, whiteboarding, outbound webcams, opacity compositing, socket
support, all of these are implemented in different ways in different browsers, so all-
JavaScript approaches tend to cluster their "ness" around text refreshes and image
refreshes.


ADOBE FLASH AND APOLLO


       Adobe Flash is another way to build Internet applications. This technology is
cross-platform and quite powerful for creating application UIs. Adobe Flex provides the
possibility to create Flash user interfaces by compiling MXML, an XML-based interface
description language. Adobe is currently working on providing a more powerful IA
platform with the product Adobe Apollo, a technology combining Flash and PDF.


WINDOWS PRESENTATION FOUNDATION


       With the .NET 3.0 Framework, Microsoft introduced Windows Presentation
Foundation (WPF) which provides a way to build single-platform applications with some
similarities to IAs using XAML and languages like C# and Visual Basic. In addition,
Microsoft has announced Windows Presentation Foundation/Everywhere which may
eventually provide a subset of WPF functionality on devices and other platforms.


ACTIVEX CONTROLS


       Embedding ActiveX controls into HTML is a very powerful way to develop
Internet applications; however, they are only guaranteed to run properly in Internet
Explorer. Furthermore, since they can break the sandbox model, they are potential targets
for computer viruses and malware, making them high security risks. At the time of this
writing, the Adobe Flash Player for Internet Explorer is implemented as an ActiveX
control for Microsoft environments, as well as in multi-platform Netscape Plugin
wrappers for the wider world. ActiveX per se is a good choice for building corporate
applications only where corporations have standardized on Internet Explorer as the
primary web browser.


JAVA APPLETS
       Java applets run in standard HTML pages and generally start automatically when
their web page is opened with a modern web browser. Java applets have access to the
screen (inside an area designated in their page's HTML), speakers, keyboard and mouse
of any computer their web page is opened on, as well as access to the Internet, and
provide a sophisticated environment capable of real-time applications.


JAVA APPLICATIONS


       Java based IAs can be launched from within the browser or as free standing
applications via Java Web Start. Java IAs can take advantage of the full power of the Java
platform to deliver functionality, 2D & 3D graphics, and off-line capabilities, but at the
cost of delayed startup.
       Numerous frameworks for Java IAs exist, including XUL-like XML-based
frameworks such as XUI and Swixml.




USER INTERFACE LANGUAGES
       As an alternative to HTML/XHTML, new user interface markup languages can be
used in IAs. For instance, the Mozilla Foundation's XML-based user interface markup
language XUL could be used in IAs, though it would be restricted to Mozilla-based
browsers, since it is not a de facto or de jure standard. The W3C's Web Clients
Activity[2] has initiated a Web Application Formats Working Group whose mission
includes the development of such standards[3].
       IAs' user interfaces can also become richer through the use of scriptable SVG
(though not all browsers support native SVG rendering yet) as well as SMIL.
OTHER TECHNIQUES
IAs could use XForms to enhance their functionality.
       XML and XSLT, along with some XHTML, CSS and JavaScript, can also be used
to generate richer client-side UI components like data tables that can be re-sorted locally
on the client without going back to the server. Mozilla and Internet Explorer browsers
both support this.
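
       A minimal sketch of such local re-sorting in plain JavaScript (the table id and the
column handling are hypothetical; a real component would also parse numbers and
dates):

    // Re-sort a table's rows on the client; no server round trip is needed.
    function sortTable(columnIndex) {
        var table = document.getElementById('data');     // hypothetical table id
        var tbody = table.tBodies[0];
        var rows = Array.prototype.slice.call(tbody.rows); // copy the live collection
        rows.sort(function (a, b) {
            var x = a.cells[columnIndex].textContent;
            var y = b.cells[columnIndex].textContent;
            return x < y ? -1 : x > y ? 1 : 0;
        });
        // Re-append in sorted order; appendChild moves the existing nodes.
        rows.forEach(function (row) { tbody.appendChild(row); });
    }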
       The Omnis Web Client is an ActiveX control or Netscape plug-in which can be
embedded into an HTML page, providing a rich application interface in the end-user's
web browser.


IA WITH REAL-TIME PUSH
       Traditionally, web pages have been delivered to the client only when the client
requested them. For every client request, the browser initiates an HTTP connection to the
web server, which then returns the data, and the connection is closed. The drawback of
this approach is that the page displayed is updated only when the user explicitly refreshes
the page or moves to a new page. Since transferring entire pages can take a long time,
refreshing pages can introduce a long latency. Push techniques such as Comet
(mentioned above) instead keep a connection open so the server can send new content as
soon as it is available.
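
       As an illustrative long-polling sketch of the push idea (the /updates endpoint and
the "live" element are hypothetical), the client keeps a request open so the server can
respond the moment new data exists, then immediately re-opens it:

    function poll() {
        var xhr = new XMLHttpRequest();
        xhr.open('GET', '/updates', true);   // server holds this open until data arrives
        xhr.onreadystatechange = function () {
            if (xhr.readyState === 4) {
                if (xhr.status === 200) {
                    document.getElementById('live').innerHTML = xhr.responseText;
                    poll();                  // immediately wait for the next update
                } else {
                    setTimeout(poll, 5000);  // back off briefly on errors
                }
            }
        };
        xhr.send(null);
    }
    poll();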


DEMAND FOR LOCALISED USAGE OF IA
       With the increasing adoption and improvement of broadband technologies, fewer
users experience poor performance caused by remote latency. Furthermore, one of the
critical reasons for using an IA is that many developers are looking for a language to
serve up desktop applications that is not only desktop-OS neutral but also free of
installation and system issues.
       An IA running in the ubiquitous web browser is a potential candidate even when
used standalone or over a LAN, with the required web server functionality hosted locally.
CLIENT-SIDE FUNCTIONALITIES AND DEVELOPMENT TOOLS FOR IA
With client-side functionalities like JavaScript and DHTML, an IA can operate on top of
a range of OS and web server functionalities.
DIRECTORY SERVICE


       A directory service is a software application — or a set of applications — that
stores and organizes information about a computer network's users and network
resources, and that allows network administrators to manage users' access to the
resources. Additionally, directory services act as an abstraction layer between users and
shared resources.
       A directory service should not be confused with the directory repository itself,
which is the database that holds information about named objects that are managed in the
directory service. In the case of the X.500 distributed directory services model, one or
more namespaces (trees of objects) are used to form the directory service. The directory
service provides the access interface to the data that is contained in one or more directory
namespaces. The directory service interface acts as a central/common authority that can
securely authenticate the system resources that manage the directory data.
       Like a database, a directory service is highly optimized for reads and provides
advanced search on the many different attributes that can be associated with objects in a
directory. The data that is stored in the directory is defined by an extendible and
modifiable schema. Directory services use a distributed model for storing their
information and that information is usually replicated between directory servers. [1]


   INTRODUCTION


       A simple directory service called a naming service maps the names of network
resources to their respective network addresses. With the name service type of directory,
a user doesn't have to remember the physical address of a network resource; providing a
name will locate the resource. Each resource on the network is considered an object on
the directory server. Information about a particular resource is stored as attributes of that
object. Information within objects can be made secure so that only users with sufficient
permissions are able to access it. More sophisticated directories are designed
with namespaces as Subscribers, Services, Devices, Entitlements, Preferences, Content
and so on. This design process is highly related to Identity management.
        A directory service defines the namespace for the network. A namespace in this
context is the term that is used to hold one or more objects as named entries. The
directory design process normally has a set of rules that determine how network
resources are named and identified. The rules specify that the names be unique and
unambiguous. In X.500 (the directory service standards) and LDAP the name is called
the distinguished name (DN) and is used to refer to a collection of attributes (relative
distinguished names) which make up the name of a directory entry.
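
        For illustration, an LDAP/X.500 distinguished name is written as a comma-
separated sequence of relative distinguished names, most specific first (the entry shown
is hypothetical):

    cn=Jane Doe,ou=People,dc=example,dc=org

Here cn (common name), ou (organizational unit) and dc (domain component) are each
attribute=value relative distinguished names; together they uniquely and unambiguously
name one entry in the directory tree.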
        A directory service is a shared information infrastructure for locating, managing,
administrating, and organizing common items and network resources, which can include
volumes, folders, files, printers, users, groups, devices, telephone numbers and other
objects. A directory service is an important component of a NOS (Network Operating
System). In the more complex cases a directory service is the central information
repository for a Service Delivery Platform. For example, looking up "computers" using a
directory service might yield a list of available computers and information for accessing
them.
        Replication and distribution have very distinct meanings in the design and
management of a directory service. The term replication is used to indicate that the same
directory namespace (the same objects) is copied to another directory server for
redundancy and throughput reasons. The replicated namespace is governed by the same
authority. The term distribution is used to indicate that multiple directory servers, that
hold different namespaces, are interconnected to form a distributed directory service.
Each distinct namespace can be governed by different authorities.


DIRECTORY SERVICES SOFTWARE


        Directory services produced by different vendors and standards bodies include the
following offerings:
Windows NT Directory Services (NTDS) for Windows NT
Active Directory for Windows 2000, Server 2003
Apple Open Directory in Mac OS X Server
Novell eDirectory - formerly called Novell Directory Services (NDS) for Novell
NetWare version 4.x-5.x
OpenLDAP
Fedora Directory Server
Sun Directory Services


COMPARISON WITH RELATIONAL DATABASES


       There are a number of things that distinguish a traditional directory service from a
relational database.
       Depending on the directory application, the information is generally read more
often than it is written. Hence the usual database features of transactions and rollback are
not implemented in some directory systems. Data may be made redundant, but the
objective is to get a faster response time during searches.
       Data can be organized in a strictly hierarchical manner which is sometimes seen
to be problematic. To overcome the issues of deep namespaces, some directories
dismantle the object namespace hierarchy in their storage mechanisms in order to
optimize navigation. That is, these directories find the item based on their data attributes
and then determine their namespace values as this is faster than navigating large
namespaces to find the item. In terms of cardinality, traditional directories do not have
many-to-many relations. Instead, such relations must be maintained explicitly using lists
of distinguished names or other identifiers (similar to the cross table identifiers used in
relational databases).
       Originally, X.500-type directory information hierarchies were considered
problematic compared with relational data designs. Today, Java-based object-oriented
databases are being developed and XML document forms have adopted a hierarchical
object model, indicating an evolution from traditional relational data engineering.
       A schema is defined as object classes, attributes, name bindings and knowledge
(namespaces).
An objectClass has:
Must-attributes that each of its instances must have
May-attributes that can be defined for an instance, but could also be omitted when the
object is created. The lack of a certain attribute is somewhat like a NULL in relational
databases
        Attributes are sometimes multi-valued in directories allowing multiple naming
attributes at one level such as machine type and serial number concatenated or multiple
phone numbers for "work phone".
        Attributes and objectClasses are standardized throughout the industry and
formally registered with the IANA for their object ID. Therefore directory applications
seek to reuse much of the standard classes and attributes to maximize the benefit of
existing directory server software.
        Object instances are slotted into namespaces. That is, each objectClass inherits
from its parent objectClass (and ultimately from the root of the hierarchy) adding
attributes to the must/may list.
        Directory services are often a central component in the security design of an IT
system and have a correspondingly fine granularity regarding access control: who may
operate in which manner on what information. Also see: ACLs
        Directory design is quite different from relational database design. With databases
one tends to design a data model for the business issues and process requirements,
sometimes with the online customer, service, user management, presence and system
scale issues omitted. With directories however, if one is placing information into a
common repository for many applications and users, then its information (and identity)
design and schema must be developed around what the objects are representing in real
life. In most cases, these objects represent users, address books, rosters, preferences,
entitlements, products and services, devices, profiles, policies, telephone numbers,
routing information, etc. In addition, one must also consider the operational aspects of
design in regard to performance and scale. A quick check on the operational design is to
take, say, 1 million users with 50 objects each, with users or applications accessing these
objects up to 5000 times a second, minute, or hour (to authorize and update their service
environments), and check whether the server and network machinery being considered
can support this.
The major difference between databases and directories is at the system level: a database
is used to automate a process with a dedicated (relational) data model, whereas a
directory is used to hold "identified" objects that can be used by many applications in
arbitrary ways. A directory service is applied where many applications and users ("multi
governance") must, for integrity and efficiency reasons, use the same information. This
approach to system design gives greater scale and flexibility, so that larger-scale
functions such as Service Delivery Platforms can be specified correctly. SDPs now need
to support hundreds of millions of objects (HSS/HLR, address books, user entitlements,
VoIP telephone numbers, user and device information, etc.) in real time and in random
ways, and to be managed from BSS/OSS/CRM-type systems as well as customer self-
care applications. See Service Delivery Platform.
       Symptomatic of database designs is that larger companies have hundreds (if not
thousands) of them for different processes and are now trying to converge their user and
service identity information and their online goods and services management, and to
deliver these in real time, cost-effectively. A large-scale directory service should
therefore be in their solution architecture.


IMPLEMENTATIONS OF DIRECTORY SERVICES


       Directory services were part of an Open Systems Interconnection (OSI) initiative
to get everyone in the industry to agree to common network standards to provide multi-
vendor interoperability. In the 1980s the ITU and ISO came up with a set of standards -
X.500, for directory services, initially to support the requirements of inter-carrier
electronic messaging and network name lookup. The Lightweight Directory Access
Protocol, LDAP, is based on the directory information services of X.500, but uses the
TCP/IP stack and a string encoding scheme for the X.500 Directory Access Protocol
(DAP), giving it more relevance on the Internet.
       There have been numerous forms of directory service implementations from
different vendors. Among them are:
       NIS: The Network Information Service (NIS) protocol, originally named Yellow
Pages (YP), was Sun Microsystems' implementation of a directory service for Unix
network environments. (In the early 2000s, Sun merged the Netscape directory work of
its iPlanet alliance into its LDAP-based directory service, which became part of Sun
ONE, now called Sun Java Enterprise.)
       eDirectory: This is Novell's implementation of directory services. It supports
multiple architectures including Windows, NetWare, Linux and several flavours of Unix
and has long been used for user administration, configuration management, and software
management. eDirectory has evolved into a central component in a broader range of
Identity management products. It was previously known as Novell Directory Services.
Red Hat Directory Server: Red Hat released the directory service that it acquired from
AOL's Netscape Security Solutions unit[1] as a commercial product running on top of
Red Hat Enterprise Linux, called Red Hat Directory Server, and as part of Fedora Core,
called Fedora Directory Server.
       Active Directory: Microsoft's directory service is the Active Directory which is
included in the Windows 2000 and Windows Server 2003 operating system versions.
       Open Directory: Apple's Mac OS X Server offers a directory service called Open
Directory which integrates with many open standard protocols such as LDAP and
Kerberos as well as proprietary directory solutions like Active Directory and eDirectory.
       Apache Directory Server: Apache Software Foundation offers a directory service
called ApacheDS.
       Oracle Internet Directory: (OID) is Oracle Corporation's directory service, which
is compatible with LDAP version 3.
       Computer Associates eTrust Directory.
Sun Java System Directory Server: Sun Microsystems' current directory service offering,
found at [2].
       OpenDS: A next-generation, open-source directory service, backed by Sun
Microsystems and hosted at [3].
       There are also plenty of open-source tools for creating directory services,
including OpenLDAP and the Kerberos protocol, and the Samba software can act as a
Domain Controller with Kerberos and LDAP back ends.
NEXT GENERATION DIRECTORY SYSTEMS


       Databases have been with the IT industry since the dawn of the computer age and
traditional directories for the last 20-30 years, and they will be with us in the future.
However, with the larger-scale, converged-services and event-driven (presence) systems
now being developed worldwide (e.g. 3G-IMS), information, identity and presence
services engineering and the technologies that support it will require some evolution.
This could take the form of CADS (Composite Adaptive Directory Services) and CADS-
supported Service Delivery Platforms; see www.wwite.com and Service Delivery
Platform. CADS is an advanced directory service that contains functions for managing
identity, presence and content, and adaptation algorithms for self-tuning; with its unique
functions, it greatly simplifies and enhances the design of converged-services SDPs.
DOMAIN NAME SYSTEM
        On the Internet, the domain name system (DNS) stores and associates many types
of information with domain names; most importantly, it translates domain names
(computer hostnames) to IP addresses. It also lists mail exchange servers accepting
e-mail for each domain. In providing a worldwide keyword-based redirection service,
DNS is an essential component of contemporary Internet use.
        Useful for several reasons, the DNS pre-eminently makes it possible to attach
easy-to-remember domain names (such as "wikipedia.org") to hard-to-remember IP
addresses (such as 66.230.200.100). People take advantage of this when they recite URLs
and e-mail addresses. In a subsidiary function, the domain name system makes it possible
for people to assign authoritative names without needing to communicate with a central
registrar each time.


HISTORY OF THE DNS
        The practice of using a name as a more human-legible abstraction of a machine's
numerical address on the network predates even TCP/IP, and goes all the way back to the
ARPAnet era. Originally, each computer on the network retrieved a file called
HOSTS.TXT from SRI (now SRI International) which mapped an address (such as
192.0.34.166) to a name (such as www.example.net). The Hosts file still exists on most
modern operating systems, either by default or through configuration, and allows users to
specify an IP address to use for a hostname without checking the DNS. This file now
serves primarily for troubleshooting DNS errors or for mapping local addresses to more
organic names. Systems based on a HOSTS.TXT file have inherent limitations, because
of the obvious requirement that every time a given computer's address changed, every
computer that seeks to communicate with it would need an update to its Hosts file.
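
        For illustration, a Hosts-file entry is a single line pairing an address with a name,
here using the same example values as above:

    192.0.34.166    www.example.net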
        The growth of networking called for a more scalable system: one
that recorded a change in a host's address in one place only. Other
hosts would learn about the change dynamically through a notification
system, thus completing a globally accessible network of all hosts'
names and their associated IP Addresses.
        Paul Mockapetris invented the DNS in 1983 and wrote the first
implementation. The original specifications appear in RFC 882 and 883.
In 1987, the publication of RFC 1034 and RFC 1035 updated the DNS
specification and made RFC 882 and RFC 883 obsolete. Several more-
recent RFCs have proposed various extensions to the core DNS protocols.
        In 1984, four Berkeley students — Douglas Terry, Mark Painter, David Riggle
and Songnian Zhou — wrote the first UNIX implementation, which was maintained by
Ralph Campbell thereafter. In 1985, Kevin Dunlap of DEC significantly re-wrote the
DNS implementation and renamed it BIND. Mike Karels, Phil Almquist and Paul Vixie
have maintained BIND since then. BIND was ported to the Windows NT platform in the
early 1990s.
        Due to its long history of security issues, several alternative nameserver/resolver
programs have been written and distributed in recent years.


HOW THE DNS WORKS IN THEORY




Domain names, arranged in a tree, cut into zones, each served by a nameserver.
        The domain name space consists of a tree of domain names. Each node or leaf in
the tree has one or more resource records, which hold information associated with the
domain name. The tree sub-divides into zones. A zone consists of a collection of
connected nodes authoritatively served by an authoritative DNS nameserver. (Note that a
single nameserver can host several zones.)
When a system administrator wants to let another administrator
control a part of the domain name space within his or her zone of
authority, he or she can delegate control to the other administrator.
This splits a part of the old zone off into a new zone, which comes
under the authority of the second administrator's nameservers. The old
zone becomes no longer authoritative for what comes under the authority
of the new zone.
        A resolver looks up the information associated with nodes. A resolver knows how
to communicate with name servers by sending DNS requests, and heeding DNS
responses. Resolving usually entails recursing through several name servers to find the
needed information.
        Some resolvers function simplistically and can only communicate with a single
name server. These simple resolvers rely on a recursing name server to perform the work
of finding information for them.


UNDERSTANDING THE PARTS OF A DOMAIN NAME
        A domain name usually consists of two or more parts (technically labels),
separated by dots; for example, wikipedia.org.
        The rightmost label conveys the top-level domain (for example, the address
en.wikipedia.org has the top-level domain org).
        Each label to the left specifies a subdivision or subdomain of the domain above it.
Note that "subdomain" expresses relative dependence, not absolute dependence: for
example, wikipedia.org comprises a subdomain of the org domain, and en.wikipedia.org
comprises a subdomain of the domain wikipedia.org. In theory, this subdivision can go
down to 127 levels deep, and each label can contain up to 63 characters, as long as the
whole domain name does not exceed a total length of 255 characters. But in practice
some domain registries have shorter limits than that.
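
        A minimal JavaScript sketch of the length rules just quoted (simplified: it checks
only these limits, not the characters allowed in a label):

    function isValidDomainName(name) {
        if (name.length > 255) return false;   // whole-name limit
        var labels = name.split('.');
        if (labels.length > 127) return false; // depth limit
        return labels.every(function (label) {
            return label.length >= 1 && label.length <= 63; // per-label limit
        });
    }
    // isValidDomainName('en.wikipedia.org') === true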
        A hostname refers to a domain name that has one or more associated IP
addresses. For example, the en.wikipedia.org and wikipedia.org domains are both
hostnames, but the org domain is not.
        The DNS consists of a hierarchical set of DNS servers. Each domain or
subdomain has one or more authoritative DNS servers that publish information about
that domain and the name servers of any domains "beneath" it. The hierarchy of
authoritative DNS servers matches the hierarchy of domains. At the top of the hierarchy
stand the root servers: the servers to query when looking up (resolving) a top-level
domain name (TLD).
THE ADDRESS RESOLUTION MECHANISM
        In theory a full host name may have several name segments (e.g.
ahost.ofasubnet.ofabiggernet.inadomain.example). In practice, in the experience of the
majority of public users of Internet services, full host names will frequently consist of
just three segments (ahost.inadomain.example, and most often www.inadomain.example).
        For querying purposes, software interprets the name segment by segment, from
right to left, using an iterative search procedure. At each step along the way, the program
queries a corresponding DNS server to provide a pointer to the next server which it
should consult.




       A DNS recurser consults three nameservers to resolve the address
www.wikipedia.org.
        As originally envisaged, the process was as simple as this. The local system is
pre-configured with the known addresses of the root servers in a file of root hints, which
needs to be updated periodically by the local administrator from a reliable source to keep
up with the changes which occur over time. The resolver then queries one of the root
servers to find the server authoritative for the next level down (so in the case of our
simple hostname, a root server would be asked for the address of a server with detailed
knowledge of the example top-level domain). Querying this second server for the address
of a DNS server with detailed knowledge of the second-level domain
(inadomain.example in our example) repeats the previous step, progressing down the
name until the final step which, rather than generating the address of the next DNS
server, returns the final address sought.
        The diagram illustrates this process for the real host www.wikipedia.org.
        The mechanism in this simple form has a difficulty: it places a huge operating
burden on the collective of root servers, with each and every search for an address
starting by querying one of them. Being as critical as they are to the overall function of
the system, such heavy use would create an insurmountable bottleneck for trillions of
queries placed every day. In practice there are two key additions to the mechanism.
        Firstly, the DNS resolution process allows for local recording and subsequent
consultation of the results of a query (or caching) for a period of time after a successful
answer (the server providing the answer initially dictates the period of validity, which
may vary from just seconds to days or even weeks). In our illustration, having found a
list of addresses of servers capable of answering queries about the .example domain, the
local resolver will not need to make the query again until the validity of the currently
known list expires, and so on for all subsequent steps. Hence, having successfully
resolved the address of ahost.inadomain.example, it is not necessary to repeat the
process for some time, since the address already reached will be deemed reliable for a
defined period, and resolution of anotherhost.anotherdomain.example can commence
with already knowing which servers can answer queries for the .example domain.
Caching significantly reduces the rate at which the most critical name servers have to
respond to queries, adding the extra benefit that subsequent resolutions are not delayed
by network transit times for the queries and responses.
        Secondly, most domestic and small-business clients "hand off" address resolution
to their ISP's DNS servers to perform the look-up process, thus allowing for the greatest
benefit from those same ISPs having busy local caches serving a wide variety of queries
and a large number of users.
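
        As a minimal sketch of this hand-off (using Node.js's built-in dns module; the
hostname is the article's example), dns.resolve4 asks the system's configured name
servers -- typically the ISP's caching resolver -- to perform the whole look-up described
above:

    var dns = require('dns');
    dns.resolve4('www.wikipedia.org', function (err, addresses) {
        if (err) throw err;
        console.log(addresses); // the IPv4 addresses the recursing resolver found
    });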




CIRCULAR DEPENDENCIES AND GLUE RECORDS


        Name servers in delegations appear listed by name, rather than by IP address.
This means that a resolving name server must issue another DNS request to find out the
IP address of the server to which it has been referred. Since this can introduce a circular
dependency if the nameserver referred to is under the domain that it is authoritative for,
it is occasionally necessary for the nameserver providing the delegation to also provide
the IP address of the next nameserver. This record is called a glue record.
        For example, assume that the sub-domain en.wikipedia.org contains further
sub-domains (such as something.en.wikipedia.org) and that the authoritative nameserver
for these lives at ns1.en.wikipedia.org. A computer trying to resolve
something.en.wikipedia.org will thus first have to resolve ns1.en.wikipedia.org. Since
ns1 is also under the en.wikipedia.org subdomain, resolving something.en.wikipedia.org
requires resolving ns1.en.wikipedia.org, which is exactly the circular dependency
mentioned above. The dependency is broken by the glue record in the nameserver of
wikipedia.org, which provides the IP address of ns1.en.wikipedia.org directly to the
requestor, enabling it to bootstrap the process by figuring out where
ns1.en.wikipedia.org is located.


DNS IN PRACTICE


        When an application (such as a web browser) tries to find the IP
address of a domain name, it doesn't necessarily follow all of the
steps outlined in the Theory section above. We will first look at the
concept of caching, and then outline the operation of DNS in "the real
world."


CACHING AND TIME TO LIVE
        Because of the huge volume of requests generated by a system like
the DNS, the designers wished to provide a mechanism to reduce the load
on individual DNS servers. The mechanism devised provided that when a
DNS resolver (i.e. client) received a DNS response, it would cache that
response for a given period of time. A value (set by the administrator
of the DNS server handing out the response) called the time to live
(TTL), defines that period of time. Once a response goes into cache,
the resolver will consult its cached (stored) answer; only when the TTL
expires (or when an administrator manually flushes the response from
the resolver's memory) will the resolver contact the DNS server for the
same information.
        Generally, the Start of Authority (SOA) record specifies the time to live. The
SOA record has the following parameters (an illustrative record follows this list):
        Serial — the zone serial number, incremented when the zone file is modified, so
the slave and secondary name servers know when the zone has been changed and should
be reloaded.
        Refresh — the number of seconds between update requests from secondary and
slave name servers.
        Retry — the number of seconds the secondary or slave will wait before retrying
when the last attempt has failed.
        Expire — the number of seconds a master or slave will wait before considering
the data stale if it cannot reach the primary name server.
        Minimum — previously used to determine the minimum TTL, this now offers
negative caching.
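
        An illustrative BIND-style SOA record carrying these parameters (all names and
values here are hypothetical):

    example.com.  IN  SOA  ns1.example.com. hostmaster.example.com. (
          2007041201 ; Serial: bumped on every edit of the zone file
          86400      ; Refresh: secondaries re-check once a day
          7200       ; Retry: wait two hours after a failed refresh
          3600000    ; Expire: give up on the zone after ~41 days
          172800 )   ; Minimum: negative-caching TTL of two days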


CACHING TIME
        As a noteworthy consequence of this distributed and caching architecture,
changes to the DNS do not always take effect immediately and globally. This is best
explained with an example: if an administrator has set a TTL of 6 hours for the host
www.wikipedia.org, and then changes the IP address to which www.wikipedia.org
resolves at 12:01pm, the administrator must consider that a person who cached a
response with the old IP address at 12:00pm will not consult the DNS server again until
6:00pm. The period between 12:01pm and 6:00pm in this example is called caching
time, which is best defined as a period of time that begins when you make a change to a
DNS record and ends after the maximum amount of time specified by the TTL expires.
This essentially leads to an important logistical consideration when making changes to
the DNS: not everyone is necessarily seeing the same thing you're seeing. RFC 1537
helps to convey basic rules for how to set the TTL.
       Note that the term "propagation", although very widely used, does
not describe the effects of caching well. Specifically, it implies that
[1] when you make a DNS change, it somehow spreads to all other DNS
servers (instead, other DNS servers check in with yours as needed), and
[2] that you do not have control over the amount of time the record is
cached (you control the TTL values for all DNS records in your domain,
except your NS records and any authoritative DNS servers that use your
domain name).
Some resolvers may override TTL values, as the protocol supports caching for up to 68
years or no caching at all. Negative caching (caching the non-existence of records) is
determined by name servers authoritative for a zone, which must include the SOA record
when reporting that no data of the requested type exists. The MINIMUM field of the
SOA record and the TTL of the SOA itself are used to establish the TTL for the negative
answer. See RFC 2308.
        Many people incorrectly refer to a mysterious 48-hour or 72-hour propagation
time when you make a DNS change. When one changes the NS records for one's domain
or the IP addresses for hostnames of authoritative DNS servers using one's domain (if
any), there can be a lengthy period of time before all DNS servers use the new
information. This is because those records are handled by the zone parent DNS servers
(for example, the .com DNS servers if your domain is example.com), which typically
cache those records for 48 hours. However, those DNS changes will be immediately
available for any DNS servers that do not have them cached. And any DNS changes on
your domain other than the NS records and authoritative DNS server names can be
nearly instantaneous, if you choose for them to be (by lowering the TTL once or twice
ahead of time, and waiting until the old TTL expires before making the change).




DNS IN THE REAL WORLD
DNS resolving from program to OS-resolver to ISP-resolver to greater system.
        Users generally do not communicate directly with a DNS resolver. Instead, DNS
resolution takes place transparently in client applications such as web browsers (Internet
Explorer, Opera, Mozilla Firefox, Safari, Netscape Navigator, etc.), mail clients (Outlook
Express, Mozilla Thunderbird, etc.), and other Internet applications. When a request is
made which necessitates a DNS lookup, such programs send a resolution request to the
local DNS resolver in the operating system, which in turn handles the communications
required.
        The DNS resolver will almost invariably have a cache (see above)
containing recent lookups. If the cache can provide the answer to the
request, the resolver will return the value in the cache to the program
that made the request. If the cache does not contain the answer, the
resolver will send the request to a designated DNS server or servers.
In the case of most home users, the Internet service provider to which
the machine connects will usually supply this DNS server: such a user
will either configure that server's address manually or allow DHCP to
set it; however, where systems administrators have configured systems
to use their own DNS servers, their DNS resolvers will generally point
to their own nameservers. This name server will then follow the process
outlined above in DNS in theory, until it either successfully finds a
result, or does not. It then returns its results to the DNS resolver;
assuming it has found a result, the resolver duly caches that result
for future use, and hands the result back to the software which initiated the request.




BROKEN RESOLVERS
        An additional level of complexity emerges when resolvers violate the rules of the
DNS protocol. Some people have suggested that a number of large ISPs have configured
their DNS servers to violate rules (presumably to allow them to run on less-expensive
hardware than a fully compliant resolver), such as by disobeying TTLs, or by indicating
that a domain name does not exist just because one of its name servers does not respond.
        As a final level of complexity, some applications such as Web
browsers also have their own DNS cache, in order to reduce the use of
the DNS resolver library itself. This practice can add extra difficulty
to DNS debugging, as it obscures which data is fresh, or lies in which
cache. These caches typically have very short caching times of the
order of one minute. A notable exception is Internet Explorer; recent
versions cache DNS records for half an hour.


OTHER DNS APPLICATIONS
        The system outlined above provides a somewhat simplified scenario. The DNS
includes several other functions:
        Hostnames and IP addresses do not necessarily match on a one-to-one basis.
Many hostnames may correspond to a single IP address: combined with virtual hosting,
this allows a single machine to serve many web sites. Alternatively, a single hostname
may correspond to many IP addresses: this can facilitate fault tolerance and load
distribution, and also allows a site to move physical location seamlessly.
        There are many uses of DNS besides translating names to IP addresses. For
instance, mail transfer agents use DNS to find out where to deliver e-mail for a particular
address. The domain-to-mail-exchanger mapping provided by MX records
accommodates another layer of fault tolerance and load distribution on top of the
name-to-IP-address mapping.
        Sender Policy Framework and DomainKeys, instead of creating their own record
types, were designed to take advantage of another DNS record type, the TXT record.
To provide resilience in the event of computer failure, multiple DNS
servers provide coverage of each domain. In particular, thirteen root
servers exist worldwide. DNS programs or operating systems have the IP
addresses of these servers built in. At least nominally, the USA hosts
all but three of the root servers. However, because many root servers
actually implement anycast, where many different computers can share
the same IP address to deliver a single service over a large geographic
region, most of the physical (rather than nominal) root servers now
operate outside the USA.
        The DNS uses TCP and UDP on port 53 to serve requests. Almost all
DNS queries consist of a single UDP request from the client followed by
a single UDP reply from the server. TCP typically comes into play only
when the response data size exceeds 512 bytes, or for such tasks as
zone transfer. Some operating systems such as HP-UX are known to have
resolver implementations that use TCP for all queries, even when UDP
would suffice.
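As a concrete, hedged sketch of this transport behaviour (not a full
resolver), the fragment below hand-builds a single DNS query for an A
record, sends it over UDP to port 53, and checks the truncation flag
that tells a client to retry over TCP. Using Google's public resolver
at 8.8.8.8 is an assumption made purely for illustration.

import socket
import struct

def build_query(name, qtype=1):  # qtype 1 = A record
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack(">HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
    # Question: the name as length-prefixed labels, then QTYPE, QCLASS=IN.
    qname = b"".join(bytes([len(label)]) + label.encode("ascii")
                     for label in name.split(".")) + b"\x00"
    return header + qname + struct.pack(">HH", qtype, 1)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.settimeout(3.0)
sock.sendto(build_query("example.com"), ("8.8.8.8", 53))
reply, _ = sock.recvfrom(512)      # classic DNS caps UDP replies at 512 bytes
truncated = bool(reply[2] & 0x02)  # TC bit set: retry the query over TCP
print(len(reply), "bytes received; truncated:", truncated)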


EXTENSIONS TO DNS
        EDNS is an extension of the DNS protocol which enhances the
transport of DNS data in UDP packets, and adds support for expanding
the space of request and response codes. It is described in RFC 2671.


IMPLEMENTATIONS OF DNS
        For a commented list of DNS server-side implementations, see
Comparison of DNS server software.


STANDARDS


RFC 882 Concepts and Facilities (Deprecated by RFC 1034)
RFC 883 Domain Names: Implementation specification (Deprecated by RFC
1035)
RFC 1032 Domain administrators guide
RFC 1033 Domain administrators operations guide
RFC 1034 Domain Names - Concepts and Facilities.
RFC 1035 Domain Names - Implementation and Specification
RFC 1101 DNS Encodings of Network Names and Other Types
RFC 1123 Requirements for Internet Hosts -- Application and Support
RFC 1183 New DNS RR Definitions
RFC 1706 DNS NSAP Resource Records
RFC 1876 Location Information in the DNS (LOC)
RFC 1886 DNS Extensions to support IP version 6
RFC 1912 Common DNS Operational and Configuration Errors
RFC 1995 Incremental Zone Transfer in DNS
RFC 1996 A Mechanism for Prompt Notification of Zone Changes (DNS NOTIFY)
RFC 2136 Dynamic Updates in the domain name system (DNS UPDATE)
RFC 2181 Clarifications to the DNS Specification
RFC 2182 Selection and Operation of Secondary DNS Servers
RFC 2308 Negative Caching of DNS Queries (DNS NCACHE)
RFC 2317 Classless IN-ADDR.ARPA delegation
RFC 2671 Extension Mechanisms for DNS (EDNS0)
RFC 2672 Non-Terminal DNS Name Redirection (DNAME record)
RFC 2782 A DNS RR for specifying the location of services (DNS SRV)
RFC 2845 Secret Key Transaction Authentication for DNS (TSIG)
RFC 2874 DNS Extensions to Support IPv6 Address Aggregation and Renumbering
RFC 3403 Dynamic Delegation Discovery System (DDDS) (NAPTR records)
RFC 3696 Application Techniques for Checking and Transformation of Names
RFC 4398 Storing Certificates in the Domain Name System
RFC 4408 Sender Policy Framework (SPF) (SPF records)
TYPES OF DNS RECORDS


Important categories of data stored in the DNS include the following:
An A record or address record maps a hostname to a 32-bit IPv4 address.
An AAAA record or IPv6 address record maps a hostname to a 128-bit IPv6
address.
A CNAME record or canonical name record is an alias of one name to
another. The A record that the alias points to can be either local or
remote - on a foreign name server. This is useful when running multiple
services from a single IP address, where each service has its own entry
in DNS.


        An MX record or mail exchange record maps a domain name to a list
of mail exchange servers for that domain.


        A PTR record or pointer record maps an IPv4 address to the
canonical name for that host. Setting up a PTR record for a hostname in
the in-addr.arpa domain that corresponds to an IP address implements
reverse DNS lookup for that address. For example (at the time of
writing), www.icann.net has the IP address 192.0.34.164, but a PTR
record maps 164.34.0.192.in-addr.arpa to its canonical name,
referrals.icann.org.
      An NS record or name server record maps a domain name to a list
of DNS servers authoritative for that domain. Delegations depend on NS
records.
        An SOA record or start of authority record specifies the DNS
server providing authoritative information about an Internet domain,
the email of the domain administrator, the domain serial number, and
several timers relating to refreshing the zone.
        An SRV record is a generalized service location record.
      A TXT record allows an administrator to insert arbitrary text
into a DNS record. For example, this record is used to implement the
Sender Policy Framework and DomainKeys specifications.
      NAPTR records ("Naming Authority Pointer") are a newer type of
DNS record that support regular expression based rewriting.
      Other types of records simply provide information (for example, a
LOC record gives the physical location of a host), or experimental data
(for example, a WKS record gives a list of servers offering some well
known service such as HTTP or POP3 for a domain).
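These record types can be queried directly, as in the sketch below. It
assumes the third-party dnspython package (installed with
"pip install dnspython" - it is not part of the standard library), and
the domain name is a placeholder.

import dns.resolver  # third-party: dnspython

# Query several record types for one name and print whatever exists.
for rtype in ("A", "AAAA", "MX", "TXT", "NS"):
    try:
        answers = dns.resolver.resolve("example.com", rtype)
        for rdata in answers:
            print(rtype, rdata.to_text())
    except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
        print(rtype, "- no records of this type")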


INTERNATIONALISED DOMAIN NAMES


        While domain names in the DNS have no restrictions on the
characters they use and can include non-ASCII characters, the same is
not true for host names. Host names are the names most people see and
use for things like e-mail and web browsing. Host names are restricted
to a small subset of the ASCII character set that includes the Roman
alphabet in upper and lower case, the digits 0 through 9, the dot, and
the hyphen. (See RFC 3696 section 2 for details.) This prevented the
representation of names and words of many languages natively. ICANN has
approved the Punycode-based IDNA system, which maps Unicode strings
into the valid DNS character set, as a workaround to this issue. Some
registries have adopted IDNA.
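Python's built-in idna codec implements this Punycode-based mapping, as
the short sketch below shows; the name bücher.example is purely
illustrative.

# Map a Unicode host name into the ASCII subset allowed for host names.
unicode_name = "bücher.example"
ascii_name = unicode_name.encode("idna")
print(ascii_name)                 # b'xn--bcher-kva.example'
print(ascii_name.decode("idna"))  # round-trips to 'bücher.example'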
SECURITY ISSUES IN DNS
        DNS was not originally designed with security in mind, and thus
has a number of security issues. DNS responses are traditionally not
cryptographically signed, leading to many attack possibilities; DNSSEC
modifies DNS to add support for cryptographically signed responses.
There are various extensions to support securing zone transfer
information as well.
        Some domain names can spoof other, similar-looking domain names.
For example, "paypal.com" and "paypa1.com" are different names, yet
users may be unable to tell the difference. This problem is much more
serious in systems that support internationalized domain names, since
many characters that are different (from the point of view of ISO
10646) appear identical on typical computer screens.


LEGAL USERS OF DOMAINS
REGISTRANT
        No one in the world really "owns" a domain name except the
Network Information Centre (NIC), or domain name registry. Most of the
NICs in the world receive an annual fee from a legal user in order for
the legal user to utilize the domain name (i.e. a sort of leasing
agreement exists, subject to the registry's terms and conditions).
Depending on the naming conventions of the registries, legal users
become commonly known as "registrants" or as "domain holders".
ICANN holds a complete list of domain registries in the world. One can
find the legal user of a domain name by looking in the WHOIS database
held by most domain registries.
        For most of the more than 240 country code top-level domains
(ccTLDs), the domain registries hold the authoritative WHOIS
(registrant, name servers, expiry dates etc). For instance, DENIC, the
German NIC, holds the authoritative WHOIS for .DE domain names.
        However, some domain registries, such as for .COM, .ORG, .INFO,
etc., use a registry-registrar model. There are hundreds of Domain Name
Registrars that actually perform the domain name registration with the
end-user (see lists at ICANN or VeriSign). By using this method of
distribution, the registry only has to manage the relationship with the
registrar, and the registrar maintains the relationship with the end-
users, or 'registrants'. For .COM and .NET domain names, the domain
registry, VeriSign, holds a basic WHOIS (registrar and name servers
etc). One can find the detailed WHOIS (registrant, name servers, expiry
dates etc) at the registrars.
Since about 2001, most gTLD registries (.ORG, .BIZ, .INFO) have adopted
a so-called "thick" registry approach, i.e. keeping the authoritative
WHOIS with the various registries instead of the registrars.
ADMINISTRATIVE CONTACT
A registrant usually designates an administrative contact to manage the
domain name. In practice, the administrative contact usually has the
most immediate power over a domain. Management functions delegated to
the administrative contacts may include (for example):
the obligation to conform to the requirements of the domain registry in
order to retain the right to use a domain name
authorization to update the physical address, e-mail address and
telephone number etc in WHOIS
TECHNICAL CONTACT
A technical contact manages the name servers of a domain name. The many
functions of a technical contact include:
making sure the configuration of the domain name conforms to the
requirements of the domain registry
updating the domain zone
providing 24x7 functionality of the name servers (which keeps the
domain name accessible)
BILLING CONTACT
The party whom a NIC invoices.
NAME SERVERS
The authoritative name servers that host the zone of a domain name.
POLITICS
Many investigators have voiced criticism of the methods currently used
to control ownership of domains. Critics commonly claim abuse by
monopolies or near-monopolies, such as VeriSign, Inc. Particularly
noteworthy was the VeriSign Site Finder system which redirected all
unregistered .com and .net domains to a VeriSign webpage. Despite
widespread criticism, VeriSign only reluctantly removed it after ICANN
threatened to revoke its contract to administer the root name servers.
There is also significant disquiet regarding United States political
influence over the Internet Corporation for Assigned Names and Numbers
(ICANN). This was a significant issue in the attempt to create a .xxx
Top-level domain and sparked greater interest in Alternative DNS roots
that would be beyond the control of any single country.
TRUTH IN DOMAIN NAMES ACT
In the United States, the "Truth in Domain Names Act", in combination
with the PROTECT Act, forbids the use of a misleading domain name with
the intention of attracting people into viewing a visual depiction of
sexually explicit conduct on the Internet.




ELECTRONIC MAIL


       Electronic mail (abbreviated "e-mail" or, often, "email") is a store and forward
method of composing, sending, storing, and receiving messages over electronic
communication systems. The term "e-mail" (as a noun or verb) applies both to the
Internet e-mail system based on the Simple Mail Transfer Protocol (SMTP) and to
intranet systems allowing users within one organization to e-mail each other. Often these
intranet systems use the Internet protocols for internal e-mail service as well.


   ORIGINS OF E-MAIL


       E-mail predates the Internet; existing e-mail systems were a crucial tool in
creating the Internet. MIT first demonstrated the Compatible Time-Sharing System
(CTSS) in 1961. [1] It allowed multiple users to log into the IBM 7094 [2] from remote
dial-up terminals, and to store files online on disk. This new ability encouraged users to
share information in new ways. E-mail started in 1965 as a way for multiple users of a
time-sharing mainframe computer to communicate. Although the exact history is murky,
among the first systems to have such a facility were SDC's Q32 and MIT's CTSS.
E-mail was quickly extended to become network e-mail, allowing users to pass
messages between different computers. The messages could be transferred between users
on different computers by 1966, but it is possible the SAGE system had something
similar some time before.
       The ARPANET computer network made a large contribution to the evolution of
e-mail. There is one report [1] which indicates experimental inter-system e-mail transfers
on it shortly after its creation, in 1969. Ray Tomlinson initiated the use of the @ sign to
separate the names of the user and their machine in 1971 [2]. The ARPANET
significantly increased the popularity of e-mail, and it became the killer app of the
ARPANET.




MODERN INTERNET E-MAIL
How Internet e-mail works




        A typical sequence of events takes place when
Alice composes a message using her mail user agent (MUA). She types in, or selects
from an address book, the e-mail address of her correspondent. She hits the "send"
button.
          Her MUA formats the message in Internet e-mail format and uses the Simple Mail
Transfer Protocol (SMTP) to send the message to the local mail transfer agent (MTA), in
this case smtp.a.org, run by Alice's Internet Service Provider (ISP).
The MTA looks at the destination address provided in the SMTP protocol (not from the
message header), in this case bob@b.org. An Internet e-mail address is a string of the
form localpart@domain.example, which is known as a Fully Qualified Domain Address
(FQDA). The part before the @ sign is the local part of the address, often the username of
the recipient, and the part after the @ sign is a domain name. The MTA looks up this
domain name in the Domain Name System to find the mail exchange servers accepting
messages for that domain.
          The DNS server for the b.org domain, ns.b.org, responds with an MX record
listing the mail exchange servers for that domain, in this case mx.b.org, a server run by
Bob's ISP.
          smtp.a.org sends the message to mx.b.org using SMTP, which delivers it to the
mailbox of the user bob.
          Bob presses the "get mail" button in his MUA, which picks up the message using
the Post Office Protocol (POP3).
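The same sequence can be driven from Python's standard library, as in
this hedged sketch: smtplib plays the role of Alice's MUA handing the
message to smtp.a.org, and poplib plays Bob's MUA fetching it over
POP3. All host names, addresses and the password are placeholders taken
from the example above.

import smtplib
import poplib
from email.message import EmailMessage

# Alice's side: hand the message to her ISP's MTA over SMTP.
msg = EmailMessage()
msg["From"] = "alice@a.org"
msg["To"] = "bob@b.org"
msg["Subject"] = "Hello"
msg.set_content("Hi Bob!")
with smtplib.SMTP("smtp.a.org") as smtp:   # placeholder MTA
    smtp.send_message(msg)                 # envelope taken from the headers

# Bob's side: pick the message up from his mailbox over POP3.
pop = poplib.POP3("mx.b.org")              # placeholder POP3 server
pop.user("bob")
pop.pass_("secret")                        # placeholder password
count, _ = pop.stat()                      # (message count, mailbox size)
if count:
    _, lines, _ = pop.retr(count)          # fetch the newest message
    print(b"\n".join(lines).decode("utf-8", "replace"))
pop.quit()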
          This sequence of events applies to the majority of e-mail users. However, there
are many alternative possibilities and complications to the e-mail system:
          Alice or Bob may use a client connected to a corporate e-mail system, such as
IBM's Lotus Notes or Microsoft's Exchange. These systems often have their own internal
e-mail format and their clients typically communicate with the e-mail server using a
vendor-specific, proprietary protocol. The server sends or receives e-mail via the Internet
through the product's Internet mail gateway which also does any necessary reformatting.
If Alice and Bob work for the same company, the entire transaction may happen
completely within a single corporate e-mail system.
          Alice may not have a MUA on her computer but instead may connect to a
webmail service.
          Alice's computer may run its own MTA, so avoiding the transfer at step 1.
Bob may pick up his e-mail in many ways, for example using the Internet Message
Access Protocol, by logging into mx.b.org and reading it directly, or by using a webmail
service.
       Domains usually have several mail exchange servers so that they can continue to
accept mail when the main mail exchange server is not available.
       Emails are not secure if email encryption is not used correctly.
It used to be the case that many MTAs would accept messages for any recipient on the
Internet and do their best to deliver them. Such MTAs are called open mail relays. This
was important in the early days of the Internet when network connections were
unreliable. If an MTA couldn't reach the destination, it could at least deliver it to a relay
that was closer to the destination. The relay would have a better chance of delivering the
message at a later time. However, this mechanism proved to be exploitable by people
sending unsolicited bulk e-mail and as a consequence very few modern MTAs are open
mail relays, and many MTAs will not accept messages from open mail relays because
such messages are very likely to be spam.


INTERNET E-MAIL FORMAT


       The format of Internet e-mail messages is defined in RFC 2822 and a series of
RFCs, RFC 2045 through RFC 2049, collectively called Multipurpose Internet Mail
Extensions (MIME). Although as of July 13, 2005 (see [3]) RFC 2822 is technically a
proposed IETF standard and the MIME RFCs are draft IETF standards, these documents
are the de facto standards for the format of Internet e-mail. Prior to the introduction of
RFC 2822 in 2001 the format described by RFC 822 was the de facto standard for
Internet e-mail for nearly two decades; it is still the official IETF standard. The IETF
reserved the numbers 2821 and 2822 for the updated versions of RFC 821 (SMTP) and
RFC 822, honoring the extreme importance of these two RFCs. RFC 822 was published
in 1982 and based on the earlier RFC 733.


Internet e-mail messages consist of two major sections:
Header - Structured into fields such as summary, sender, receiver, and other
information about the e-mail
        Body - The message itself as unstructured text; sometimes containing a signature
block at the end
The header is separated from the body by a blank line.


INTERNET E-MAIL HEADER


        The message header consists of fields, usually including at least the following:
From: The e-mail address, and optionally name, of the sender of the message
To: The e-mail address[es], and optionally name[s], of the receiver[s] of the message
Subject: A brief summary of the contents of the message
Date: The local time and date when the message was originally sent
        Each header field has a name and a value. RFC 2822 specifies the precise syntax.
Informally, the field name starts in the first character of a line, followed by a ":",
followed by the value which is continued on non-null subsequent lines that have a space
or tab as their first character. Field names and values are restricted to 7-bit ASCII
characters. Non-ASCII values may be represented using MIME encoded words.
Note that the "To" field in the header is not necessarily related to the addresses to which
the message is delivered. The actual delivery list is supplied in the SMTP protocol, not
extracted from the header content. The "To" field is similar to the greeting at the top of a
conventional letter which is delivered according to the address on the outer envelope.
Also note that the "From" field does not have to be the real sender of the e-mail message.
It is very easy to fake the "From" field and let a message seem to be from any mail
address. It is possible to digitally sign e-mail, which is much harder to fake. Some
Internet service providers do not relay e-mail claiming to come from a domain not hosted
by them, but very few (if any) check to make sure that the person or even e-mail address
named in the "From" field is the one associated with the connection. Some Internet
service providers apply e-mail authentication systems to e-mail being sent through their
MTA to allow other MTAs to detect forged spam that appears to come from
them.
Other common header fields include:
Cc: carbon copy
Received: Tracking information generated by mail servers that have previously handled a
message
Content-Type: Information about how the message has to be displayed, usually a MIME
type
Many e-mail clients present "Bcc" (Blind carbon copy, recipients not visible in the "To"
field) as a header field. Since the entire header is visible to all recipients, "Bcc" is not
included in the message header. Addresses added as "Bcc" are only added to the SMTP
delivery list, and do not get included in the message data.
IANA maintains a list of standard header fields.
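The sketch below shows how these fields look in practice, using the
standard library's email package; note that a Bcc recipient is
deliberately kept off the header, matching the behaviour described
above. The addresses are placeholders.

from email.message import EmailMessage
from email.utils import formatdate

msg = EmailMessage()
msg["From"] = "Alice <alice@a.org>"
msg["To"] = "Bob <bob@b.org>"
msg["Subject"] = "Meeting notes"
msg["Date"] = formatdate(localtime=True)  # RFC 2822 date syntax
msg.set_content("The notes are attached.")
# A Bcc recipient belongs on the SMTP delivery list, not in the visible
# header, so it would be passed to the mail transport separately.
print(msg.as_string())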




E-MAIL CONTENT ENCODING


        E-mail was originally designed for 7-bit ASCII. Much e-mail software is 8-bit
clean but must assume it will be communicating with 7-bit servers and mail readers. The
MIME standard introduced charset specifiers and two content transfer encodings to
encode 8-bit data for transmission: quoted-printable for mostly 7-bit content with a few
characters outside that range, and base64 for arbitrary binary data. The 8BITMIME
extension was introduced to allow transmission of mail without the need for these
encodings, but many mail transport agents still don't support it fully. For international
character sets, Unicode is growing in popularity.
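The two content transfer encodings can be tried directly with the
standard library's quopri and base64 modules; the sample data below is
illustrative only.

import base64
import quopri

# Quoted-printable: mostly-ASCII text with a few 8-bit bytes stays readable.
text = "café menu".encode("utf-8")
qp = quopri.encodestring(text)
print(qp)                        # e.g. b'caf=C3=A9 menu'
print(quopri.decodestring(qp))   # round-trips to the original bytes

# Base64: arbitrary binary data, at the cost of roughly 33% size overhead.
blob = bytes(range(8))
b64 = base64.b64encode(blob)
print(b64)                       # b'AAECAwQFBgc='
print(base64.b64decode(b64))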


SAVED MESSAGE FILENAME EXTENSION


       Most, but not all, e-mail clients save individual messages as separate files, or
allow users to do so. Different applications save e-mail files with different filename
extensions.
.eml - This is the default e-mail extension for Mozilla Thunderbird and is used by
Microsoft Outlook Express.
.emlx - Used by Apple Mail.
.msg - Used by Microsoft Office Outlook.


MESSAGES AND MAILBOXES


       Messages are exchanged between hosts using the Simple Mail Transfer Protocol
with software like Sendmail. Users can download their messages from servers with
standard protocols such as the POP or IMAP protocols, or, as is more likely in a large
corporate environment, with a proprietary protocol specific to Lotus Notes or Microsoft
Exchange Servers.
       Mail can be stored either on the client, on the server side, or in both places.
Standard formats for mailboxes include Maildir and mbox. Several prominent e-mail
clients use their own proprietary format and require conversion software to transfer e-
mail between them.
       When a message cannot be delivered, the recipient MTA must send a bounce
message back to the sender, indicating the problem.


SPAMMING AND E-MAIL WORMS


       The usefulness of e-mail is being threatened by three phenomena: spamming,
phishing and e-mail worms.
       Spamming is unsolicited commercial e-mail. Because of the very low cost of
sending e-mail, spammers can send hundreds of millions of e-mail messages each day
over an inexpensive Internet connection. Hundreds of active spammers sending this
volume of mail results in information overload for many computer users who receive tens
or even hundreds of junk messages each day.
       E-mail worms use e-mail as a way of replicating themselves into vulnerable
computers. Although the first e-mail worm affected UNIX computers, the problem is
most common today on the more popular Microsoft Windows operating system.
       The combination of spam and worm programs results in users receiving a constant
drizzle of junk e-mail, which reduces the usefulness of e-mail as a practical tool.
A number of anti-spam techniques mitigate the impact of spam. In the United
States, U.S. Congress has also passed a law, the Can Spam Act of 2003, attempting to
regulate such e-mail. Australia also has very strict spam laws restricting the sending of
spam from an Australian ISP
(http://www.aph.gov.au/library/pubs/bd/2003-04/04bd045.pdf), but its impact has been
minimal since most spam comes from regimes that seem reluctant to regulate the sending
of spam.


PRIVACY PROBLEMS REGARDING E-MAIL


E-MAIL PRIVACY
       E-mail privacy, without some security precautions, can be compromised because
e-mail messages are generally not encrypted;
E-mail messages have to go through intermediate computers before reaching their
destination, meaning it is relatively easy for others to intercept and read messages;
        Many Internet Service Providers (ISP) store copies of your e-mail messages on
their mail servers before they are delivered. Backups of these can remain on their
servers for up to several months, even after you delete the messages from your mailbox;


       The Received: headers and other information in the email can often identify the
sender, preventing anonymous communication.
       There are cryptography applications that can serve as a remedy to one or more of
the above. For example, Virtual Private Networks or the Tor anonymity network can be
used to encrypt traffic from the user machine to a safer network while GPG, PGP or
S/MIME can be used for end-to-end message encryption, and SMTP STARTTLS or
SMTP over Transport Layer Security/Secure Sockets Layer can be used to encrypt
communications for a single mail hop between the SMTP client and the SMTP server.
       Another risk is that e-mail passwords might be intercepted during sign-in. One
may use encrypted authentication schemes such as SASL to help prevent this.
FINGER
       In computer networking, the Name/Finger protocol and the Finger user
information protocol are simple network protocols for the exchange of human-oriented
status and user information.


   Name/Finger protocol
       The Name/Finger protocol is based on Request for comments document 742
(December 1977) as an interface to the name and finger programs that provide status
reports on a particular computer system or a particular person at network sites. The finger
program was written in 1971 by Les Earnest who created the program to solve the need
of users who wanted information on other users of the network. Information on who is
logged-in was useful to check the availability of a person to meet.
       Prior to the finger program, the only way to get this information was with a who
program that showed IDs and terminal line numbers for logged-in users, and people used
to run their fingers down the who list. Earnest named his program after this concept.
Finger user information protocol
       The Finger user information protocol is based on RFC 1288 (The Finger User
Information Protocol, December 1991). Typically the server side of the protocol is
implemented by a program fingerd (for finger daemon), while the client side is
implemented by the name and finger programs which are supposed to return a friendly,
human-oriented status report on either the system at the moment or a particular person in
depth. There is no required format, and the protocol consists mostly of specifying a single
command line. It is most often implemented on Unix or Unix-like systems.
       The program would supply information such as whether a user is currently
logged-on, e-mail address, full name etc. As well as standard user information, finger
displays the contents of the .project and .plan files in the user's home directory. Often
these files (maintained by the user) contain either useful information about the user's
current activities, or alternatively all manner of humor.
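Because the protocol is just a single CRLF-terminated request line over
TCP port 79, a client fits in a few lines, as sketched below. The user
and host names are placeholders, and the sketch assumes a host that
still runs fingerd, which is rare today.

import socket

def finger(user, host, port=79):
    # Open a TCP connection to the finger daemon and send one request
    # line: the user name followed by CRLF (an empty line lists users).
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(user.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", "replace")

print(finger("johnc", "example.org"))  # placeholder user and host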
SECURITY CONCERNS
       Supplying such detailed information as e-mail addresses and full names was
considered acceptable and convenient in the early days of the Internet, but later was
considered questionable for privacy and security reasons. Finger information has been
frequently used by crackers as a way to initiate a social engineering attack on a
company's computer security system. By using a finger client to get a list of a company's
employee names, email addresses, phone numbers, and so on, a cracker can telephone or
email someone at a company requesting information while posing as another employee.
The finger daemon has also had several exploitable security holes which crackers have
used to break into systems. The Morris worm exploited an overflow vulnerability in
fingerd (among others) to spread.

        For these reasons, while finger was widely used during the early days of the
Internet, by the 1990s the vast majority of sites on the internet no longer offered the service.
Notable exceptions include John Carmack and Justin Frankel, who until recently still
updated their status information occasionally. In late 2005, John Carmack switched to
using a blog, instead of his old .plan site.




FILE TRANSFER PROTOCOL (FTP)
       FTP or File Transfer Protocol is used to connect two computers over the Internet
so that the user of one computer can transfer files and perform file commands on the
other computer.
       Specifically, FTP is a commonly used protocol for exchanging files over any
network that supports the TCP/IP protocol (such as the Internet or an intranet). There are
two computers involved in an FTP transfer: a server and a client. The FTP server,
running FTP server software, listens on the network for connection requests from other
computers. The client computer, running FTP client software, initiates a connection to the
server. Once connected, the client can perform a number of file manipulation operations,
such as uploading files to the server, downloading files from the server, and renaming or
deleting files on the server. Any software company or individual programmer is able to create
FTP server or client software because the protocol is an open standard. Virtually every
computer platform supports the FTP protocol. This allows any computer connected to a
TCP/IP based network to manipulate files on another computer on that network
regardless of which operating systems are involved (if the computers permit FTP access).


   OVERVIEW
       FTP runs exclusively over TCP. FTP servers by default listen on port 21 for
incoming connections from FTP clients. A connection to this port from the FTP Client
forms the control stream on which commands are passed to the FTP server from the FTP
client and on occasion from the FTP server to the FTP client. For the actual file transfer
to take place, a different connection is required which is called the data stream.
Depending on the transfer mode, the process of setting up the data stream is different.
In active mode, the FTP client opens a random port (> 1023), sends the FTP server the
random port number on which it is listening over the control stream and waits for a
connection from the FTP server. When the FTP server initiates the data connection to the
FTP client it binds the source port to port 20 on the FTP server.
       In passive mode, the FTP Server opens a random port (> 1023), sends the FTP
client the port on which it is listening over the control stream and waits for a connection
from the FTP client. In this case the FTP client binds the source port of the connection to
a random port greater than 1023.
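In the standard library's ftplib the difference is a single switch: the
client defaults to passive mode and can be told to use active mode
instead. The host and credentials below are placeholders, and the
active-mode listing may fail behind a firewall or NAT, as discussed in
this section.

from ftplib import FTP

ftp = FTP("ftp.example.org", timeout=10)  # control connection to port 21
ftp.login("user", "password")             # placeholder credentials

ftp.set_pasv(True)   # passive: client connects to a port the server opens
ftp.retrlines("LIST")                     # listing arrives on a data stream

ftp.set_pasv(False)  # active: server connects back to the client from port 20
ftp.retrlines("LIST")

ftp.quit()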
       While data is being transferred via the data stream, the control stream sits idle.
This can cause problems with large data transfers through firewalls which time out
sessions after lengthy periods of idleness. While the file may well be successfully
transferred, the control session can be disconnected by the firewall, causing an error to be
generated.
       When FTP is used in a UNIX environment, there is an often-ignored but valuable
command, "reget" (meaning "get again") that will cause an interrupted "get" command to
be continued, hopefully to completion, after a communications interruption. The principle
is obvious—the receiving station has a record of what it got, so it can spool through the
file at the sending station and re-start at the right place for a seamless splice. The
converse would be "reput" but is not available. Again, the principle is obvious: The
sending station does not know how much of the file was actually received, so it would
not know where to start.
The objectives of FTP, as outlined by its RFC, are:
        To promote sharing of files (computer programs and/or data).
        To encourage indirect or implicit use of remote computers.
        To shield a user from variations in file storage systems among different hosts.
        To transfer data reliably and efficiently.


CRITICISMS OF FTP
        Passwords and file contents are sent in clear text, which can be intercepted by
eavesdroppers. There are protocol enhancements that circumvent this.
Multiple TCP/IP connections are used, one for the control connection, and one for each
download, upload, or directory listing. Firewall software needs additional logic to
account for these connections.
        It is hard to filter active mode FTP traffic on the client side by using a firewall,
since the client must open an arbitrary port in order to receive the connection. This
problem is largely resolved by using passive mode FTP.
        It is possible to abuse the protocol's built-in proxy features to tell a server to send
data to an arbitrary port of a third computer; see FXP.
        FTP is a high latency protocol due to the number of commands needed to initiate
a transfer.
        No integrity check on the receiver side. If transfer is interrupted the receiver has
no way to know if the received file is complete or not. It is necessary to manage this
externally for example with MD5 sums or cyclic redundancy checking.
        No error detection. FTP relies on the underlying TCP layer for error control,
which uses a weak checksum by modern standards.
        No date/timestamp attribute transfer. Uploaded files are given a new current
timestamp, unlike other file transfer protocols such as SFTP, which allow attributes to be
included. There is no way in the standard FTP protocol to set the time-last-modified (or
time-created) datestamp that most modern filesystems preserve. There is a draft of a
proposed extension that adds new commands for this, but as yet most of the popular
FTP servers do not support it.
SECURITY PROBLEMS
        The original FTP specification is an inherently insecure method of transferring
files because there is no method specified for transferring data in an encrypted fashion.
This means that under most network configurations, user names, passwords, FTP
commands and transferred files can be "sniffed" or viewed by anyone on the same
network using a packet sniffer. This is a problem common to many Internet protocol
specifications written prior to the creation of SSL such as HTTP, SMTP and Telnet. The
common solution to this problem is to use either SFTP (SSH File Transfer Protocol), or
FTPS (FTP over SSL), which adds SSL or TLS encryption to FTP as specified in RFC
4217.


FTP RETURN CODES
        FTP server return codes indicate their status by the digits within them. A brief
explanation of the various digits' meanings is given below:
        1yz: Positive Preliminary reply. The action requested is being initiated but there
will be another reply before it begins.
        2yz: Positive Completion reply. The action requested has been completed. The
client may now issue a new command.
        3yz: Positive Intermediate reply. The command was successful, but a further
command is required before the server can act upon the request.
        4yz: Transient Negative Completion reply. The command was not successful, but
the client is free to try the command again as the failure is only temporary.
        5yz: Permanent Negative Completion reply. The command was not successful
and the client should not attempt to repeat it again.
        x0z: The failure was due to a syntax error.
        x1z: This response is a reply to a request for information.
        x2z: This response is a reply relating to connection information.
x3z: This response is a reply relating to accounting and authorization.
        x4z: Unspecified as yet
        x5z: These responses indicate the status of the Server file system vis-a-vis the
requested transfer or other file system action
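The digit scheme lends itself to a tiny interpreter, sketched below,
which classifies a reply code by its first digit exactly as described
in the list above.

# Classify an FTP reply code by its first digit (see the list above).
FIRST_DIGIT = {
    "1": "positive preliminary: action started, another reply will follow",
    "2": "positive completion: action done, a new command may be issued",
    "3": "positive intermediate: a further command is required to proceed",
    "4": "transient negative: failed for now, the command may be retried",
    "5": "permanent negative: failed, do not repeat the command",
}

def classify(code):
    return FIRST_DIGIT.get(str(code)[0], "unknown reply class")

print(226, classify(226))  # e.g. a completed transfer
print(550, classify(550))  # e.g. file unavailable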
ANONYMOUS FTP
        Many sites that run FTP servers enable so-called "anonymous ftp". Under this
arrangement, users do not need an account on the server. The user name for anonymous
access is typically 'anonymous' or 'ftp'. This account does not need a password. Although
users are commonly asked to send their email addresses as their passwords for
authentication, usually there is trivial or no verification, depending on the FTP server and
its configuration. Internet Gopher has been suggested as an alternative to anonymous
FTP, as well as Trivial File Transfer Protocol.
        While transferring data over the network, several data representations can be
used. The two most common transfer modes are:
ASCII mode
Binary mode
        The two types differ in the way they send the data. When a file is sent using an
ASCII-type transfer, the individual letters, numbers, and characters are sent using their
ASCII character codes. The receiving machine saves these in a text file in the appropriate
format (for example, a Unix machine saves it in a Unix format, a Macintosh saves it in a
Mac format). Hence if an ASCII transfer is used it can be assumed plain text is sent,
which is stored by the receiving computer in its own format. Translating between text
formats entails substituting the end of line and end of file characters used on the source
platform with those on the destination platform, e.g. a Windows machine receiving a file
from a Unix machine will replace the line feeds with carriage return-line feed pairs.
ASCII transfer is also marginally faster, as the highest-order bit is dropped from each
byte in the file.[1]
        Sending a file in binary mode is different. The sending machine sends each file bit
for bit and as such the recipient stores the bitstream as it receives it. Any form of data that
is not plain text will be corrupted if this mode is not used.
By default, most FTP clients use ASCII mode. Some clients try to determine the required
transfer-mode by inspecting the file's name or contents.
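With ftplib the two modes correspond to different retrieval calls:
retrlines issues a TYPE A (ASCII) transfer with end-of-line
translation, while retrbinary issues TYPE I (image/binary) and stores
the bytes untouched. The server and file names below are placeholders,
and the login() call performs the anonymous login described in the
previous section.

from ftplib import FTP

ftp = FTP("ftp.example.org")
ftp.login()  # anonymous login: user 'anonymous', e-mail style password

# ASCII mode: line endings are translated to the local convention.
with open("readme.txt", "w") as out:
    ftp.retrlines("RETR readme.txt", lambda line: out.write(line + "\n"))

# Binary mode: the byte stream is stored exactly as it was sent.
with open("archive.zip", "wb") as out:
    ftp.retrbinary("RETR archive.zip", out.write)

ftp.quit()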
The FTP specifications also list the following transfer modes:
EBCDIC mode
Local mode
         In practice, these additional transfer modes are rarely used. They are however still
used by some legacy mainframe systems.


FTP AND WEB BROWSERS
         Most recent web browsers and file managers can connect to FTP servers, although
they may lack the support for protocol extensions such as FTPS. This allows
manipulation of remote files over FTP through an interface similar to that used for local
files. This is done via an FTP URL, which takes the form
ftp(s)://<ftpserveraddress> (e.g., [2]). A password can optionally be given in the URL,
e.g.: ftp(s)://<login>:<password>@<ftpserveraddress>:<port>. Most web-browsers
require the use of passive mode FTP, which not all FTP servers are capable of handling.
Some browsers allow only the downloading of files, but offer no way to upload files to
the server.


FTP OVER SSH
         FTP over SSH refers to the practice of tunneling a normal FTP session over an
SSH connection.
         Because FTP uses multiple TCP connections (unusual for a TCP/IP protocol that
is still in use), it is particularly difficult to tunnel over SSH. With many SSH clients,
attempting to set up a tunnel for the control channel (the initial client-to-server
connection on port 21) will only protect that channel; when data is transferred, the FTP
software at either end will set up new TCP connections (data channels) which will bypass
the SSH connection, and thus have no confidentiality, integrity protection, etc.
If the FTP client is configured to use passive mode and to connect to a SOCKS server
interface that many SSH clients can present for tunnelling, it is possible to run all the FTP
channels over the SSH connection.
Otherwise, it is necessary for the SSH client software to have specific knowledge
of the FTP protocol, and monitor and rewrite FTP control channel messages and
autonomously open new forwardings for FTP data channels. Version 3 of SSH
Communications Security's software suite, and the GPL licensed FONC are two software
packages that support this mode.
        FTP over SSH is sometimes referred to as secure FTP; this should not be
confused with other methods of securing FTP, such as with SSL/TLS (FTPS). Other
methods of transferring files using SSH that are not related to FTP include SFTP and
SCP; in each of these, the entire conversation (credentials and data) is always protected
by the SSH protocol.




AN OVERVIEW OF THE FILE TRANSFER PROTOCOL
        The File Transfer Protocol (FTP) was one of the first efforts to create a standard
means of exchanging files over a TCP/IP network, so the FTP has been around since the
1970's. The FTP was designed with as much flexibility as possible, so it could be used
over networks other than TCP/IP, and was engineered to be capable of exchanging files
with a broad variety of machines.
        The base specification is RFC 959 and is dated October 1985. There are some
additional RFCs relating to FTP, but it should be noted that even as of this writing
(December 2001) that most of the new additions are not in widespread use. The purpose
of this document is to provide general information about how the protocol works without
getting into too many technical details. RFC 959 should be consulted for details on the
protocol.
Control Connection -- the conversation channel
        The protocol can be thought of as interactive, because clients and servers actually
have a conversation where they authenticate themselves and negotiate file transfers. In
addition, the protocol specifies that the client and server do not exchange data on the
conversation channel. Instead, clients and servers negotiate how to send data files on
separate connections, with one connection for each data transfer. Note that a directory
listing is considered a file transfer.
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1
Internet applications unit1

Mais conteúdo relacionado

Mais procurados

IBM MobileFirst - Hybrid Application Development with Worklight
IBM MobileFirst - Hybrid Application Development with WorklightIBM MobileFirst - Hybrid Application Development with Worklight
IBM MobileFirst - Hybrid Application Development with WorklightIBIZZ
 
Mobile and IBM Worklight Best Practices
Mobile and IBM Worklight Best PracticesMobile and IBM Worklight Best Practices
Mobile and IBM Worklight Best PracticesAndrew Ferrier
 
Introduction to Force.com
Introduction to Force.comIntroduction to Force.com
Introduction to Force.comIMC Institute
 
The Challenges Oracle Forms Migration to ADF
The Challenges Oracle Forms Migration to ADFThe Challenges Oracle Forms Migration to ADF
The Challenges Oracle Forms Migration to ADFDataNext Solutions
 
Static Enabler: A Response Enhancer for Dynamic Web Applications
Static Enabler: A Response Enhancer for Dynamic Web ApplicationsStatic Enabler: A Response Enhancer for Dynamic Web Applications
Static Enabler: A Response Enhancer for Dynamic Web ApplicationsOsama M. Khaled
 
Presentation build and connect apps, devices and data ibm worklight overview
Presentation   build and connect apps, devices and data ibm worklight overviewPresentation   build and connect apps, devices and data ibm worklight overview
Presentation build and connect apps, devices and data ibm worklight overviewxKinAnx
 
Web Component Development with Servlet and JSP Technologies Unit 01
Web Component Development with Servlet and JSP Technologies Unit 01Web Component Development with Servlet and JSP Technologies Unit 01
Web Component Development with Servlet and JSP Technologies Unit 01Prashanth Shivakumar
 
User Virtualization with AppSense
User Virtualization with AppSenseUser Virtualization with AppSense
User Virtualization with AppSensesubtitle
 
Sencha Web Applications Come of Age
Sencha Web Applications Come of AgeSencha Web Applications Come of Age
Sencha Web Applications Come of Agebastila
 
IBM Worklight - Technical Overview
IBM Worklight - Technical OverviewIBM Worklight - Technical Overview
IBM Worklight - Technical OverviewIIC_Barcelona
 
Sybase Success Story
Sybase Success StorySybase Success Story
Sybase Success StoryPeter Mudd
 
Placement of BPM runtime components in an SOA environment
Placement of BPM runtime components in an SOA environmentPlacement of BPM runtime components in an SOA environment
Placement of BPM runtime components in an SOA environmentKim Clark
 

Mais procurados (20)

IBM MobileFirst - Hybrid Application Development with Worklight
IBM MobileFirst - Hybrid Application Development with WorklightIBM MobileFirst - Hybrid Application Development with Worklight
IBM MobileFirst - Hybrid Application Development with Worklight
 
Mobile and IBM Worklight Best Practices
Mobile and IBM Worklight Best PracticesMobile and IBM Worklight Best Practices
Mobile and IBM Worklight Best Practices
 
RIA
RIARIA
RIA
 
Introduction to Force.com
Introduction to Force.comIntroduction to Force.com
Introduction to Force.com
 
AppSense Product Deck
AppSense Product DeckAppSense Product Deck
AppSense Product Deck
 
The Challenges Oracle Forms Migration to ADF
The Challenges Oracle Forms Migration to ADFThe Challenges Oracle Forms Migration to ADF
The Challenges Oracle Forms Migration to ADF
 
Static Enabler: A Response Enhancer for Dynamic Web Applications
Static Enabler: A Response Enhancer for Dynamic Web ApplicationsStatic Enabler: A Response Enhancer for Dynamic Web Applications
Static Enabler: A Response Enhancer for Dynamic Web Applications
 
Presentation build and connect apps, devices and data ibm worklight overview
Presentation   build and connect apps, devices and data ibm worklight overviewPresentation   build and connect apps, devices and data ibm worklight overview
Presentation build and connect apps, devices and data ibm worklight overview
 
IBM Worklight
IBM WorklightIBM Worklight
IBM Worklight
 
Web Component Development with Servlet and JSP Technologies Unit 01
Web Component Development with Servlet and JSP Technologies Unit 01Web Component Development with Servlet and JSP Technologies Unit 01
Web Component Development with Servlet and JSP Technologies Unit 01
 
User Virtualization with AppSense
User Virtualization with AppSenseUser Virtualization with AppSense
User Virtualization with AppSense
 
Sencha Web Applications Come of Age
Sencha Web Applications Come of AgeSencha Web Applications Come of Age
Sencha Web Applications Come of Age
 
RahulParwekar_Resume
RahulParwekar_ResumeRahulParwekar_Resume
RahulParwekar_Resume
 
Kashif guffar
Kashif guffarKashif guffar
Kashif guffar
 
Worklight Overview
Worklight OverviewWorklight Overview
Worklight Overview
 
IBM Worklight - Technical Overview
IBM Worklight - Technical OverviewIBM Worklight - Technical Overview
IBM Worklight - Technical Overview
 
Peoplesoft technical consultant interview questions
Peoplesoft technical consultant interview questionsPeoplesoft technical consultant interview questions
Peoplesoft technical consultant interview questions
 
IBM Worklight-Overview
IBM Worklight-OverviewIBM Worklight-Overview
IBM Worklight-Overview
 
Sybase Success Story
Sybase Success StorySybase Success Story
Sybase Success Story
 
Placement of BPM runtime components in an SOA environment
Placement of BPM runtime components in an SOA environmentPlacement of BPM runtime components in an SOA environment
Placement of BPM runtime components in an SOA environment
 

Destaque

Web Ed Email Security Ppt
Web Ed Email Security PptWeb Ed Email Security Ppt
Web Ed Email Security PptAntonio Ieranò
 
Technical Background Overview Ppt
Technical Background Overview PptTechnical Background Overview Ppt
Technical Background Overview PptAntonio Ieranò
 
China’s Innovations on the Web
China’s Innovations on the WebChina’s Innovations on the Web
China’s Innovations on the WebDayDayUp
 
The Six Highest Performing B2B Blog Post Formats
The Six Highest Performing B2B Blog Post FormatsThe Six Highest Performing B2B Blog Post Formats
The Six Highest Performing B2B Blog Post FormatsBarry Feldman
 
The Outcome Economy
The Outcome EconomyThe Outcome Economy
The Outcome EconomyHelge Tennø
 

Destaque (6)

Web Ed Email Security Ppt
Web Ed Email Security PptWeb Ed Email Security Ppt
Web Ed Email Security Ppt
 
Technical Background Overview Ppt
Technical Background Overview PptTechnical Background Overview Ppt
Technical Background Overview Ppt
 
China’s Innovations on the Web
China’s Innovations on the WebChina’s Innovations on the Web
China’s Innovations on the Web
 
Stay Up To Date on the Latest Happenings in the Boardroom: Recommended Summer...
Stay Up To Date on the Latest Happenings in the Boardroom: Recommended Summer...Stay Up To Date on the Latest Happenings in the Boardroom: Recommended Summer...
Stay Up To Date on the Latest Happenings in the Boardroom: Recommended Summer...
 
The Six Highest Performing B2B Blog Post Formats
The Six Highest Performing B2B Blog Post FormatsThe Six Highest Performing B2B Blog Post Formats
The Six Highest Performing B2B Blog Post Formats
 
The Outcome Economy
The Outcome EconomyThe Outcome Economy
The Outcome Economy
 

Semelhante a Internet applications unit1

Web Application Development-Ultimate Guide To Web Application Architecture
Web Application Development-Ultimate Guide To Web Application ArchitectureWeb Application Development-Ultimate Guide To Web Application Architecture
Web Application Development-Ultimate Guide To Web Application ArchitectureVersatile Mobitech
 
Web Application Architecture: Everything You Need to Know About
Web Application Architecture: Everything You Need to Know AboutWeb Application Architecture: Everything You Need to Know About
Web Application Architecture: Everything You Need to Know AboutNoman Shaikh
 
Iron speed appgeneration
Iron speed appgenerationIron speed appgeneration
Iron speed appgenerationankit.das
 
Quality Attributes of Web Software Applications ∗
Quality Attributes of Web Software Applications ∗Quality Attributes of Web Software Applications ∗
Quality Attributes of Web Software Applications ∗hasnainqayyum1
 
Web application architecture guide how it works types, components, best pract...
Web application architecture guide how it works types, components, best pract...Web application architecture guide how it works types, components, best pract...
Web application architecture guide how it works types, components, best pract...Katy Slemon
 
Application-Servers.pdf
Application-Servers.pdfApplication-Servers.pdf
Application-Servers.pdfSamir Paul
 
Web based lims_white_paper
Web based lims_white_paperWeb based lims_white_paper
Web based lims_white_paperPreeti Gupta
 
J2EE Notes JDBC database Connectiviy and Programs related to JDBC
J2EE Notes JDBC database Connectiviy and Programs related to JDBCJ2EE Notes JDBC database Connectiviy and Programs related to JDBC
J2EE Notes JDBC database Connectiviy and Programs related to JDBCChaithraCSHirematt
 
Web Application Architecture: A Comprehensive Guide for Success in 2023
Web Application Architecture: A Comprehensive Guide for Success in 2023Web Application Architecture: A Comprehensive Guide for Success in 2023
Web Application Architecture: A Comprehensive Guide for Success in 2023stevefary
 
Client Side Performance In Web Applications
Client Side Performance In Web ApplicationsClient Side Performance In Web Applications
Client Side Performance In Web Applicationsvladungureanu
 
A Comprehensive Guide to Web Application Architecture
A Comprehensive Guide to Web Application ArchitectureA Comprehensive Guide to Web Application Architecture
A Comprehensive Guide to Web Application Architecturestevefary
 
SOA Fundamentals
SOA  FundamentalsSOA  Fundamentals
SOA Fundamentalsabhi1112
 
Web Engineering - Web Applications versus Conventional Software
Web Engineering - Web Applications versus Conventional SoftwareWeb Engineering - Web Applications versus Conventional Software
Web Engineering - Web Applications versus Conventional SoftwareNosheen Qamar
 
client and which do not need to be sent back to the server (e.g. an insurance rate calculator), etc.
More responsive. The interface behaviors are typically much more responsive than those of a standard Web browser that must always interact with the server. The most sophisticated examples of IAs exhibit a look and feel approaching that of a desktop environment.
Using a client engine can also produce other performance benefits:
Client/Server balance. The demand for client and server computing resources is better balanced, so that the Web server need not be the workhorse that it is with a traditional Web application. This frees server resources, allowing the same server hardware to handle more client sessions concurrently.
Asynchronous communication. The client engine can interact with the server asynchronously -- that is, without waiting for the user to perform an interface action like clicking on a button or link. This option allows IA designers to move data between the client and the server without making the user wait. Perhaps the most common application of this is prefetching, in which an application anticipates a future need for certain data and downloads it to the client before the user requests it, thereby speeding up a subsequent response. Google Maps uses this technique to move adjacent map segments to the client before the user scrolls their view.
Network efficiency. Network traffic may also be significantly reduced, because an application-specific client engine can be more intelligent than a standard Web browser when deciding what data needs to be exchanged with servers. This can speed up individual requests or responses because less data is transferred for each interaction, and overall network load is reduced. However, the use of asynchronous prefetching techniques can neutralize or even reverse this potential benefit. Because the code cannot anticipate exactly what every user will do next, it is common for such techniques to download extra data, not all of which is actually needed, to many or all clients.
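As a concrete illustration of the prefetching described above, the sketch below shows how a client engine might fetch data it expects to need soon, in the background, so that a later user action can be served from memory. This is a minimal sketch in plain browser JavaScript; the /tiles/... URLs and the cache structure are hypothetical, not taken from any real application.

    // Hypothetical prefetch: fetch adjacent map tiles before the user scrolls.
    var tileCache = {};

    function prefetchTile(url) {
      if (tileCache[url] !== undefined) return;   // already fetched
      var xhr = new XMLHttpRequest();
      xhr.open("GET", url, true);                 // true = asynchronous: the UI never blocks
      xhr.onreadystatechange = function () {
        if (xhr.readyState === 4 && xhr.status === 200) {
          tileCache[url] = xhr.responseText;      // ready before the user asks for it
        }
      };
      xhr.send(null);
    }

    // Warm the cache for the four tiles adjacent to the current view.
    ["/tiles/north", "/tiles/south", "/tiles/east", "/tiles/west"].forEach(prefetchTile);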
SHORTCOMINGS AND RESTRICTIONS
Shortcomings and restrictions associated with IAs are:
Sandbox. Because IAs run within a sandbox, they have restricted access to system resources. If assumptions about access to resources are incorrect, IAs may fail to operate correctly.
Disabled scripting. JavaScript or another scripting language is often required. If the user has disabled active scripting in their browser, the IA may not function properly, if at all.
Client processing speed. To achieve platform independence, some IAs use client-side scripts written in interpreted languages such as JavaScript, with a consequent loss of performance. This is not an issue with compiled client languages such as Java, where performance is comparable to that of traditional compiled languages, or with Flash movies, in which the bulk of the operations are performed by the native code of the Flash player.
Script download time. Although it does not have to be installed, the additional client-side intelligence (or client engine) of IA applications needs to be delivered by the server to the client. While much of this is usually cached automatically, it needs to be transferred at least once. Depending on the size and type of delivery, script download time may be unpleasantly long. IA developers can lessen the impact of this delay by compressing the scripts and by staging their delivery over multiple pages of an application.
Loss of integrity. If the application base is X/HTML, conflicts arise between the goal of an application (which naturally wants to be in control of its presentation and behaviour) and the goals of X/HTML (which naturally wants to give away control). The DOM interface for X/HTML makes it possible to create IAs, but by doing so makes it impossible to guarantee correct function. Because an IA client can modify the IA's basic structure and override presentation and behaviour, it can cause an irrecoverable client failure or crash. Eventually, this problem could be solved by new client-side mechanisms that grant an IA client more limited permission to modify only those resources within the scope of its application. (Standard software running natively does not have this problem because by definition a program automatically possesses all rights to all its allocated resources.)
Loss of visibility to search engines. Search engines may not be able to index the text content of the application.

MANAGEMENT COMPLICATIONS
The advent of IA technologies has introduced considerable additional complexity into Web applications. Traditional Web applications built using only standard HTML have a relatively simple software architecture, are constructed using a limited set of development options, and are relatively easy to design and manage. For the person or organization using IA technology to deliver a Web application, this additional complexity makes applications harder to design, test, measure, and support.
Use of IA technology poses several new Service Level Management ("SLM") challenges, not all of which are completely solved today. SLM concerns are not always the focus of application developers, and are rarely if ever perceived by application users, but they are vital to the successful delivery of an online application. Aspects of the IA architecture that complicate management processes [1] are:
Greater complexity makes development harder. The ability to move code to the client gives application designers and developers far more creative freedom. But this in turn makes development harder, increases the likelihood of defects (bugs) being introduced, and complicates software testing activities. These complications lengthen the software development process, regardless of the particular methodology or process being employed. Some of these issues may be mitigated through the use of a Web application framework to standardize aspects of IA design and development. However, increasing complexity in a software solution can complicate and lengthen the testing process if it increases the number of use cases to be tested. Incomplete testing lowers the application's quality and its reliability during use.
IA architecture breaks the Web page paradigm. Traditional Web applications can be viewed as a series of Web pages, each of which requires a distinct download, initiated by an HTTP GET request. This model has been characterized as the Web page paradigm.
IAs invalidate this model, introducing additional asynchronous server communications to support a more responsive user interface. In IAs, the time to complete a page download may no longer correspond to something a user perceives as important, because (for example) the client engine may be prefetching some of the downloaded content for future use. New measurement techniques must be devised for IAs to permit reporting of response-time quantities that reflect the user's experience. In the absence of standard tools that do this, IA developers must instrument their application code to produce the measurement data needed for SLM; a minimal sketch of such instrumentation appears below.
Asynchronous communication makes it harder to isolate performance problems. Paradoxically, actions taken to enhance application responsiveness also make it harder to measure, understand, report on, and manage responsiveness. Some IAs do not issue any further HTTP GET requests from the browser after their first page, using asynchronous requests from the client engine to initiate all subsequent downloads. The IA client engine may be programmed to continually download new content and refresh the display, or (in applications using the Comet approach) a server-side engine can keep pushing new content to the browser over a connection that never closes. In these cases, the concept of a "page download" is no longer applicable. These complications make it harder to measure and subdivide application response times, a fundamental requirement for problem isolation and service level management. Tools designed to measure traditional Web applications may -- depending on the details of the application and the tool -- report such applications either as a single Web page per HTTP request, or as an unrelated collection of server activities. Neither conclusion reflects what is really happening at the application level.
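A minimal sketch of such instrumentation, assuming plain browser JavaScript: the client engine timestamps each background request itself, since no external tool can see both cycles. How the measurement is reported is left open here; the logging shown is a stand-in.

    // Time each background (engine-to-server) request from inside the
    // client engine, where both cycles are visible.
    function timedRequest(url, callback) {
      var started = new Date().getTime();
      var xhr = new XMLHttpRequest();
      xhr.open("GET", url, true);
      xhr.onreadystatechange = function () {
        if (xhr.readyState === 4) {
          var elapsed = new Date().getTime() - started;
          // In a real application this measurement would be sent to an
          // SLM collector; here it is simply logged.
          if (window.console) console.log(url + " took " + elapsed + " ms");
          callback(xhr);
        }
      };
      xhr.send(null);
    }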
The client engine makes it harder to measure response time. For traditional Web applications, measurement software can reside either on the client machine or on a machine that is close to the server, provided that it can observe the flow of network traffic at the TCP and HTTP levels. Because these protocols are synchronous and predictable, a packet sniffer can read and interpret packet-level data, and infer the user's experience of response time by tracking HTTP messages and the times of underlying TCP packets and acknowledgments. But the IA architecture reduces the power of the packet-sniffing approach, because the client engine breaks the communication between user and server into two separate cycles operating asynchronously -- a foreground (user-to-engine) cycle and a background (engine-to-server) cycle. Both cycles are important, because neither stands alone; it is their relationship that defines application behavior. But that relationship depends only on the application design, which cannot (in general) be inferred by a measurement tool, especially one that can observe only one of the two cycles. Therefore, the most complete IA measurements can only be obtained using tools that reside on the client and observe both cycles.

THE CURRENT STATUS OF IA DEVELOPMENT AND ADOPTION
IAs are still in the early stages of development and user adoption, and a number of restrictions and requirements remain:
Browser adoption: Many IAs require modern web browsers in order to run. Advanced JavaScript engines must be present in the browser, as IAs use techniques such as XMLHttpRequest for client-server communication, and DOM scripting and advanced CSS techniques to enable the user interface.
Web standards: Differences between web browsers can make it difficult to write an IA that will run across all major browsers. The consistency of the Java platform, particularly after Java 1.1, makes this task much simpler for IAs written as Java applets.
Development tools: Some Ajax frameworks and products like Adobe Flex provide an integrated environment in which to build IA and B2B web applications.
Accessibility concerns: Additional interactivity may require technical approaches that limit applications' accessibility.
User adoption: Users expecting standard web applications may find that some accepted browser functionality (such as the "Back" button) has somewhat different or even undesired behaviour.
JUSTIFICATIONS
Although developing applications to run in a web browser is a much more limiting, difficult, and intricate process than developing a regular desktop application, the effort is often justified because:
installation is not required -- updating and distributing the application is an instant, automatically handled process
users can use the application from any computer with an internet connection, and usually regardless of what operating system that computer is running
web-based applications are generally less prone to viral infection than running an actual executable
as web usage increases, computer users are becoming less willing to go to the trouble of installing new software if a browser-based alternative is available
This last point is often true even if the alternative is slower or not as feature-rich. A good example of this phenomenon is webmail.

METHODS AND TECHNIQUES
JAVASCRIPT
The first major client-side language and technology with the ability to run code, available and installed on a majority of web clients, was JavaScript. Although its uses were relatively limited at first, combined with layers and other developments in DHTML it has become possible to piece together an IA system without the use of a unified client-side solution. Ajax is a term coined to refer to this combination of techniques, and it has recently been used most prominently by Google for projects such as Gmail and Google Maps. However, creating a large application in this framework is very difficult, as many different technologies must interact to make it work, and browser compatibility requires a lot of effort. In order to make the process easier, several Ajax frameworks have been developed. A minimal example of the underlying technique is sketched below.
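A minimal sketch of the Ajax pattern just described, in plain JavaScript; the /inbox/unread-count URL and the "unread" element are hypothetical stand-ins for an application's own server endpoint and page element.

    // Ask the server for a fragment of data in the background, then update
    // part of the page without reloading it.
    // (Older Internet Explorer versions would instead need
    //  new ActiveXObject("Microsoft.XMLHTTP").)
    var xhr = new XMLHttpRequest();
    xhr.open("GET", "/inbox/unread-count", true);
    xhr.onreadystatechange = function () {
      if (xhr.readyState === 4 && xhr.status === 200) {
        document.getElementById("unread").innerHTML = xhr.responseText;
      }
    };
    xhr.send(null);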
The "richness" in "rich Internet applications" may also suffer from an all-JavaScript approach, because the application is still bound by the media types predictably supported by the world's various deployed browsers: video will display in different ways in different browsers with an all-JavaScript approach, audio support will be unpredictable, and realtime communications, whiteboarding, outbound webcams, opacity compositing and socket support are all implemented in different ways in different browsers. All-JavaScript approaches therefore tend to cluster their "richness" around text refreshes and image refreshes.

ADOBE FLASH AND APOLLO
Adobe Flash is another way to build Internet applications. This technology is cross-platform and quite powerful for creating application UIs. Adobe Flex provides the possibility to create Flash user interfaces by compiling MXML, an XML-based interface description language. Adobe is currently working on providing a more powerful IA platform with the product Adobe Apollo, a technology combining Flash and PDF.

WINDOWS PRESENTATION FOUNDATION
With the .NET 3.0 Framework, Microsoft introduced Windows Presentation Foundation (WPF), which provides a way to build single-platform applications with some similarities to IAs, using XAML and languages like C# and Visual Basic. In addition, Microsoft has announced Windows Presentation Foundation/Everywhere, which may eventually provide a subset of WPF functionality on devices and other platforms.

ACTIVEX CONTROLS
Embedding ActiveX controls into HTML is a very powerful way to develop Internet applications, but they are only guaranteed to run properly in Internet Explorer. Furthermore, since they can break the sandbox model, they are potential targets for computer viruses and malware, making them high security risks. At the time of this writing, the Adobe Flash Player for Internet Explorer is implemented as an ActiveX control for Microsoft environments, as well as in multi-platform Netscape plugin wrappers for the wider world. Only if corporations have standardized on Internet Explorer as the primary web browser is ActiveX per se a good choice for building corporate applications.
JAVA APPLETS
Java applets run in standard HTML pages and generally start automatically when their web page is opened with a modern web browser. Java applets have access to the screen (inside an area designated in the page's HTML), to the speakers, keyboard and mouse of any computer their web page is opened on, and to the Internet, and they provide a sophisticated environment capable of real-time applications.

JAVA APPLICATIONS
Java-based IAs can be launched from within the browser or as free-standing applications via Java Web Start. Java IAs can take advantage of the full power of the Java platform to deliver functionality, 2D and 3D graphics, and off-line capabilities, but at the cost of delayed startup. Numerous frameworks for Java IAs exist, including XUL-like XML-based frameworks such as XUI and Swixml.

USER INTERFACE LANGUAGES
As an alternative to HTML/XHTML, new user interface markup languages can be used in IAs. For instance, the Mozilla Foundation's XML-based user interface markup language XUL could be used in IAs, though it would be restricted to Mozilla-based browsers, since it is not a de facto or de jure standard. The W3C's Web Clients Activity [2] has initiated a Web Application Formats Working Group whose mission includes the development of such standards [3]. IA user interfaces can also become richer through the use of scriptable SVG (though not all browsers support native SVG rendering yet) as well as SMIL.
OTHER TECHNIQUES
IAs could use XForms to enhance their functionality. XML and XSLT, along with some XHTML, CSS and JavaScript, can also be used to generate richer client-side UI components like data tables that can be re-sorted locally on the client without going back to the server. Mozilla and Internet Explorer browsers both support this.
The Omnis Web Client is an ActiveX control or Netscape plug-in which can be embedded into an HTML page, providing a rich application interface in the end-user's web browser.

IA WITH REAL-TIME PUSH
Traditionally, web pages have been delivered to the client only when the client requested them. For every client request, the browser initiates an HTTP connection to the web server, which then returns the data, and the connection is closed. The drawback of this approach is that the page displayed is updated only when the user explicitly refreshes the page or moves to a new page. Since transferring entire pages can take a long time, refreshing pages can introduce a long latency. (A minimal way of emulating server push over HTTP is sketched at the end of this section.)

DEMAND FOR LOCALISED USAGE OF IA
With the increasing adoption and improvement of broadband technologies, fewer users experience poor performance caused by remote latency. Furthermore, one of the critical reasons for using an IA is that many developers are looking for a language with which to serve up desktop applications that is not only desktop-OS neutral but also free of installation and system issues. An IA running in the ubiquitous web browser is a potential candidate even when used standalone or over a LAN, with the required webserver functionalities hosted locally.

CLIENT-SIDE FUNCTIONALITIES AND DEVELOPMENT TOOLS FOR IA
With client-side functionalities like JavaScript and DHTML, IAs can operate on top of a range of OS and webserver functionalities.
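The push model mentioned above is commonly emulated over plain HTTP by long polling: the client keeps one request pending so the server can answer the moment new data is available. A minimal sketch in plain JavaScript, with a hypothetical /updates endpoint and "ticker" page element:

    // Long polling: keep one request open; when it completes, process the
    // pushed data and immediately open the next request.
    function poll(url, onMessage) {
      var xhr = new XMLHttpRequest();
      xhr.open("GET", url, true);
      xhr.onreadystatechange = function () {
        if (xhr.readyState === 4) {
          if (xhr.status === 200) onMessage(xhr.responseText);
          poll(url, onMessage);   // re-issue: a connection is "always" open
        }
      };
      xhr.send(null);
    }

    poll("/updates", function (data) {
      // Refresh the display with whatever the server pushed.
      document.getElementById("ticker").innerHTML = data;
    });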
DIRECTORY SERVICE
A directory service is a software application -- or a set of applications -- that stores and organizes information about a computer network's users and network resources, and that allows network administrators to manage users' access to those resources. Additionally, directory services act as an abstraction layer between users and shared resources.
A directory service should not be confused with the directory repository itself, which is the database that holds information about named objects that are managed in the directory service. In the case of the X.500 distributed directory services model, one or more namespaces (trees of objects) are used to form the directory service. The directory service provides the access interface to the data that is contained in one or more directory namespaces. The directory service interface acts as a central/common authority that can securely authenticate the system resources that manage the directory data.
Like a database, a directory service is highly optimized for reads and provides advanced search on the many different attributes that can be associated with objects in a directory. The data that is stored in the directory is defined by an extendible and modifiable schema. Directory services use a distributed model for storing their information, and that information is usually replicated between directory servers. [1]

INTRODUCTION
A simple directory service called a naming service maps the names of network resources to their respective network addresses. With the naming-service type of directory, a user doesn't have to remember the physical address of a network resource; providing a name locates the resource. Each resource on the network is considered an object on the directory server. Information about a particular resource is stored as attributes of that object. Information within objects can be made secure, so that only users with the necessary permissions are able to access it.
More sophisticated directories are designed with namespaces for Subscribers, Services, Devices, Entitlements, Preferences, Content and so on. This design process is highly related to identity management.
A directory service defines the namespace for the network. A namespace, in this context, is the term used to hold one or more objects as named entries. The directory design process normally has a set of rules that determine how network resources are named and identified. The rules specify that the names be unique and unambiguous. In X.500 (the directory service standards) and LDAP, the name is called the distinguished name (DN) and is used to refer to a collection of attributes (relative distinguished names) which make up the name of a directory entry (a small illustration of this convention appears at the end of this section).
A directory service is a shared information infrastructure for locating, managing, administrating, and organizing common items and network resources, which can include volumes, folders, files, printers, users, groups, devices, telephone numbers and other objects. A directory service is an important component of a NOS (Network Operating System). In the more complex cases, a directory service is the central information repository for a Service Delivery Platform. For example, looking up "computers" using a directory service might yield a list of available computers and information for accessing them.
Replication and distribution have very distinct meanings in the design and management of a directory service. The term replication indicates that the same directory namespace (the same objects) is copied to another directory server for redundancy and throughput reasons; the replicated namespace is governed by the same authority. The term distribution indicates that multiple directory servers, holding different namespaces, are interconnected to form a distributed directory service; each distinct namespace can be governed by a different authority.
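To illustrate the distinguished-name convention described above, the sketch below splits a DN into its relative distinguished names. It is a naive JavaScript illustration (it ignores escaped commas and other edge cases of the real syntax), and the example DN is invented.

    // Split a distinguished name such as
    // "cn=Jane Doe,ou=People,dc=example,dc=org"
    // into its relative distinguished names (attribute/value pairs).
    function parseDn(dn) {
      var rdns = dn.split(",");
      var parsed = [];
      for (var i = 0; i < rdns.length; i++) {
        var pair = rdns[i].split("=");
        parsed.push({ attribute: pair[0].replace(/^\s+/, ""), value: pair[1] });
      }
      return parsed;
    }

    // [{attribute:"cn", value:"Jane Doe"}, {attribute:"ou", value:"People"}, ...]
    parseDn("cn=Jane Doe,ou=People,dc=example,dc=org");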
DIRECTORY SERVICES SOFTWARE
Directory services produced by different vendors and standards bodies include the following offerings:
Windows NT Directory Services (NTDS) for Windows NT
Active Directory for Windows 2000 and Server 2003
Apple Open Directory in Mac OS X Server
Novell eDirectory -- formerly called Novell Directory Services (NDS) -- for Novell NetWare versions 4.x-5.x
OpenLDAP
Fedora Directory Server
Sun Directory Services

COMPARISON WITH RELATIONAL DATABASES
A number of things distinguish a traditional directory service from a relational database.
Depending on the directory application, the information is generally read more often than it is written. Hence the usual database features of transactions and rollback are not implemented in some directory systems. Data may be made redundant, but the objective is to get a faster response time during searches.
Data can be organized in a strictly hierarchical manner, which is sometimes seen as problematic. To overcome the issues of deep namespaces, some directories dismantle the object namespace hierarchy in their storage mechanisms in order to optimize navigation. That is, these directories find items based on their data attributes and then determine their namespace values, as this is faster than navigating large namespaces to find the item.
In terms of cardinality, traditional directories do not have many-to-many relations. Instead, such relations must be maintained explicitly using lists of distinguished names or other identifiers (similar to the cross-table identifiers used in relational databases).
Originally, X.500-type directory information hierarchies were considered problematic compared with relational data designs. Today, Java-based object-oriented databases are being developed, and XML document forms have adopted a hierarchical object model -- indicating an evolution from traditional relational data engineering.
A schema is defined as object classes, attributes, name bindings and knowledge (namespaces). An objectClass has:
Must-attributes, which each of its instances must have
May-attributes, which can be defined for an instance but may also be omitted when the object is created; the lack of a certain attribute is somewhat like a NULL in relational databases
Attributes are sometimes multi-valued in directories, allowing multiple naming attributes at one level (such as machine type and serial number concatenated, or multiple phone numbers for "work phone"). Attributes and objectClasses are standardized throughout the industry and formally registered with the IANA for their object IDs. Directory applications therefore seek to reuse much of the standard classes and attributes, to maximize the benefit of existing directory server software. Object instances are slotted into namespaces. That is, each objectClass inherits from its parent objectClass (and ultimately from the root of the hierarchy), adding attributes to the must/may list (a small sketch of this must/may model appears at the end of this section).
Directory services are often a central component in the security design of an IT system and have a correspondingly fine granularity regarding access control: who may operate in which manner on what information. Also see: ACLs.
Directory design is quite different from relational database design. With databases, one tends to design a data model for the business issues and process requirements, sometimes with the online customer, service, user management, presence and system scale issues omitted. With directories, however, if one is placing information into a common repository for many applications and users, then its information (and identity) design and schema must be developed around what the objects represent in real life. In most cases, these objects represent users, address books, rosters, preferences, entitlements, products and services, devices, profiles, policies, telephone numbers, routing information, and so on. In addition, one must also consider the operational aspects of the design with regard to performance and scale. A quick check on the operational design is to take, for example, 1 million users with 50 objects each, with users or applications accessing these objects up to 5,000 times a second, minute, or hour (to authorize and update their service environments), and check whether the server and network machinery being considered can support this.
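A minimal sketch of the must/may schema model above, in JavaScript. The objectClass definition here is invented for illustration (loosely person-like) and is not a real registered schema; note that attribute values are arrays, since directory attributes can be multi-valued.

    // An invented objectClass with must- and may-attributes.
    var personLikeClass = {
      must: ["cn", "sn"],                        // every instance needs these
      may:  ["telephoneNumber", "mail"]          // these can be omitted (like NULL)
    };

    // An entry is valid if every must-attribute has at least one value.
    function validate(entry, objectClass) {
      for (var i = 0; i < objectClass.must.length; i++) {
        var values = entry[objectClass.must[i]];
        if (!values || values.length === 0) return false;
      }
      return true;
    }

    var entry = {
      cn: ["Jane Doe"],
      sn: ["Doe"],
      telephoneNumber: ["555-0100", "555-0101"]  // multi-valued "work phone"
    };
    validate(entry, personLikeClass);            // true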
The major difference between databases and directories is at the system level, where a database is used to automate a process with a dedicated (relational) data model, whereas a directory is used to hold "identified" objects that can be used by many applications in random ways. A directory service is applied where "multi-governance" (many applications and users) is, for integrity and efficiency reasons, using the same information. This approach to system design gives greater scale and flexibility, so that larger-scale functions such as Service Delivery Platforms can be specified correctly. SDPs now need to support hundreds of millions of objects (HSS/HLR, address books, user entitlements, VoIP telephone numbers, user and device information, etc.) in real time and in random ways, and to be managed from BSS/OSS/CRM-type systems as well as from customer self-care applications. See Service Delivery Platform.
Symptomatic of database designs is that larger companies have hundreds (if not thousands) of them for different processes, and are now trying to converge their user and service identity information and their online goods and services management, and to deliver these in real time, cost-effectively. So a large-scale directory service should be in their solution architecture.

IMPLEMENTATIONS OF DIRECTORY SERVICES
Directory services were part of an Open Systems Interconnection (OSI) initiative to get everyone in the industry to agree on common network standards providing multi-vendor interoperability. In the 1980s, the ITU and ISO came up with a set of standards for directory services -- X.500 -- initially to support the requirements of inter-carrier electronic messaging and network name lookup. The Lightweight Directory Access Protocol, LDAP, is based on the directory information services of X.500, but uses the TCP/IP stack and a string encoding scheme of the X.500 protocol DAP, giving it more relevance on the Internet.
There have been numerous forms of directory service implementations from different vendors. Among them are:
NIS: The Network Information Service (NIS) protocol, originally named Yellow Pages (YP), was Sun Microsystems' implementation of a directory service for Unix network environments. (Sun has, in the early 2000s, merged its iPlanet alliance with Netscape and developed its LDAP-based directory service to become part of Sun ONE, now called Sun Java Enterprise.)
eDirectory: This is Novell's implementation of directory services. It supports multiple architectures, including Windows, NetWare, Linux and several flavours of Unix, and has long been used for user administration, configuration management, and software management. eDirectory has evolved into a central component in a broader range of identity management products. It was previously known as Novell Directory Services.
Red Hat Directory Server: Red Hat released a directory service, which it acquired from AOL's Netscape Security Solutions unit [1], as a commercial product running on top of Red Hat Enterprise Linux, called Red Hat Directory Server, and as part of Fedora Core, called Fedora Directory Server.
Active Directory: Microsoft's directory service is Active Directory, which is included in the Windows 2000 and Windows Server 2003 operating system versions.
Open Directory: Apple's Mac OS X Server offers a directory service called Open Directory, which integrates with many open-standard protocols such as LDAP and Kerberos, as well as with proprietary directory solutions like Active Directory and eDirectory.
Apache Directory Server: The Apache Software Foundation offers a directory service called ApacheDS.
Oracle Internet Directory (OID): Oracle Corporation's directory service, which is compatible with LDAP version 3.
Computer Associates eTrust Directory.
Sun Java System Directory Server: Sun Microsystems' current directory service offering, found at [2].
OpenDS: A next-generation, open-source directory service, backed by Sun Microsystems and hosted at [3].
There are also plenty of open-source tools for creating directory services, including OpenLDAP, the Kerberos protocol, and Samba, which can act as a Domain Controller with Kerberos and LDAP backends.
NEXT GENERATION DIRECTORY SYSTEMS
Databases have been with the IT industry since the dawn of the computer age, and traditional directories for the last 20-30 years; both will be with us in the future. However, with the larger-scale, converged-services and event-driven (presence) systems now being developed worldwide (e.g. 3G-IMS), information, identity and presence services engineering, and the technologies that support it, will require some evolution. This could take the form of CADS (Composite Adaptive Directory Services) and CADS-supported Service Delivery Platforms; see www.wwite.com and Service Delivery Platform. CADS is an advanced directory service that contains functions for managing identity, presence and content, and adaptation algorithms for self-tuning; with its unique functions, it greatly simplifies and enhances the design of converged-services SDPs.
DOMAIN NAME SYSTEM
On the Internet, the domain name system (DNS) stores and associates many types of information with domain names; most importantly, it translates domain names (computer hostnames) to IP addresses. It also lists the mail exchange servers accepting e-mail for each domain. In providing a worldwide keyword-based redirection service, the DNS is an essential component of contemporary Internet use.
The DNS is useful for several reasons. Pre-eminently, it makes it possible to attach easy-to-remember domain names (such as "wikipedia.org") to hard-to-remember IP addresses (such as 66.230.200.100). People take advantage of this when they recite URLs and e-mail addresses. In a subsidiary function, the domain name system makes it possible for people to assign authoritative names without needing to communicate with a central registrar each time.

HISTORY OF THE DNS
The practice of using a name as a more human-legible abstraction of a machine's numerical address on the network predates even TCP/IP, and goes all the way back to the ARPAnet era. Originally, each computer on the network retrieved a file called HOSTS.TXT from SRI (now SRI International) which mapped an address (such as 192.0.34.166) to a name (such as www.example.net). The hosts file still exists on most modern operating systems, either by default or through configuration, and allows users to specify an IP address to use for a hostname without checking the DNS. This file now serves primarily for troubleshooting DNS errors or for mapping local addresses to more organic names.
Systems based on a HOSTS.TXT file have inherent limitations, because of the obvious requirement that every time a given computer's address changed, every computer that seeks to communicate with it would need an update to its hosts file. The growth of networking called for a more scalable system: one that recorded a change in a host's address in one place only, with other hosts learning about the change dynamically through a notification system, thus completing a globally accessible network of all hosts' names and their associated IP addresses.
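The HOSTS.TXT mechanism is simple enough to sketch in full. The snippet below (JavaScript, with an invented two-line file) parses "address name [aliases...]" lines into a lookup table -- essentially the entire pre-DNS resolution scheme.

    // Parse HOSTS.TXT-style lines into a name-to-address table.
    function parseHosts(text) {
      var table = {};
      var lines = text.split("\n");
      for (var i = 0; i < lines.length; i++) {
        var fields = lines[i].replace(/#.*/, "").trim().split(/\s+/);
        if (fields.length < 2) continue;        // blank or comment-only line
        for (var j = 1; j < fields.length; j++) {
          table[fields[j]] = fields[0];         // every name maps to the address
        }
      }
      return table;
    }

    var hosts = parseHosts("192.0.34.166 www.example.net example.net\n" +
                           "127.0.0.1 localhost");
    hosts["www.example.net"];                   // "192.0.34.166"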
Paul Mockapetris invented the DNS in 1983 and wrote the first implementation. The original specifications appear in RFC 882 and RFC 883. In 1987, the publication of RFC 1034 and RFC 1035 updated the DNS specification and made RFC 882 and RFC 883 obsolete. Several more recent RFCs have proposed various extensions to the core DNS protocols.
In 1984, four Berkeley students -- Douglas Terry, Mark Painter, David Riggle and Songnian Zhou -- wrote the first UNIX implementation, which was maintained by Ralph Campbell thereafter. In 1985, Kevin Dunlap of DEC significantly rewrote the DNS implementation and renamed it BIND. Mike Karels, Phil Almquist and Paul Vixie have maintained BIND since then. BIND was ported to the Windows NT platform in the early 1990s. Due to its long history of security issues, several alternative nameserver/resolver programs have been written and distributed in recent years.

HOW THE DNS WORKS IN THEORY
(Figure: domain names, arranged in a tree, cut into zones, each served by a nameserver.)
The domain name space consists of a tree of domain names. Each node or leaf in the tree has one or more resource records, which hold information associated with the domain name. The tree sub-divides into zones. A zone consists of a collection of connected nodes authoritatively served by an authoritative DNS nameserver. (Note that a single nameserver can host several zones.)
When a system administrator wants to let another administrator control a part of the domain name space within his or her zone of authority, he or she can delegate control to the other administrator. This splits a part of the old zone off into a new zone, which comes under the authority of the second administrator's nameservers. The old zone ceases to be authoritative for what comes under the authority of the new zone.
A resolver looks up the information associated with nodes. A resolver knows how to communicate with name servers by sending DNS requests and heeding DNS responses. Resolving usually entails recursing through several name servers to find the needed information. Some resolvers function simplistically and can only communicate with a single name server; these simple resolvers rely on a recursing name server to perform the work of finding information for them.

UNDERSTANDING THE PARTS OF A DOMAIN NAME
A domain name usually consists of two or more parts (technically, labels) separated by dots -- for example, wikipedia.org. The rightmost label conveys the top-level domain (for example, the address en.wikipedia.org has the top-level domain org). Each label to the left specifies a subdivision, or subdomain, of the domain above it. Note that "subdomain" expresses relative dependence, not absolute dependence: for example, wikipedia.org comprises a subdomain of the org domain, and en.wikipedia.org comprises a subdomain of the domain wikipedia.org. In theory, this subdivision can go down to 127 levels deep, and each label can contain up to 63 characters, as long as the whole domain name does not exceed a total length of 255 characters. In practice, some domain registries have shorter limits than that (the classic label rules are sketched at the end of this section).
A hostname refers to a domain name that has one or more associated IP addresses. For example, the en.wikipedia.org and wikipedia.org domains are both hostnames, but the org domain is not.
The DNS consists of a hierarchical set of DNS servers. Each domain or subdomain has one or more authoritative DNS servers that publish information about that domain and the name servers of any domains "beneath" it. The hierarchy of authoritative DNS servers matches the hierarchy of domains.
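A small JavaScript sketch of the classic label and length rules referred to above (and of the hostname character restrictions discussed under internationalisation later): name at most 255 characters, each label 1-63 characters of letters, digits and hyphens, with no leading or trailing hyphen. This illustrates the rules of RFC 3696 section 2 rather than being a complete validator.

    // Check the classic limits: name <= 255 characters, each label 1-63
    // characters, letters/digits/hyphens only, no leading/trailing hyphen.
    function isValidHostname(name) {
      if (name.length > 255) return false;
      var labels = name.split(".");
      for (var i = 0; i < labels.length; i++) {
        if (!/^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$/i.test(labels[i])) return false;
      }
      return true;
    }

    isValidHostname("en.wikipedia.org");   // true
    isValidHostname("-bad-.example");      // false (label starts/ends with a hyphen)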
At the top of the hierarchy stand the root servers: the servers to query when looking up (resolving) a top-level domain name (TLD).

THE ADDRESS RESOLUTION MECHANISM
In theory, a full host name may have several name segments (e.g. ahost.ofasubnet.ofabiggernet.inadomain.example). In practice, in the experience of the majority of public users of Internet services, full host names frequently consist of just three segments (ahost.inadomain.example, and most often www.inadomain.example). For querying purposes, software interprets the name segment by segment, from right to left, using an iterative search procedure. At each step along the way, the program queries a corresponding DNS server to provide a pointer to the next server which it should consult.
(Figure: a DNS recurser consults three nameservers to resolve the address www.wikipedia.org.)
As originally envisaged, the process was as simple as:
The local system is pre-configured with the known addresses of the root servers in a file of root hints, which must be updated periodically by the local administrator from a reliable source to keep up with the changes which occur over time.
Query one of the root servers to find the server authoritative for the next level down (so in the case of our simple hostname, a root server would be asked for the address of a server with detailed knowledge of the example top-level domain).
Query this second server for the address of a DNS server with detailed knowledge of the second-level domain (inadomain.example in our example).
Repeat the previous step to progress down the name, until the final step, which, rather than returning the address of the next DNS server, returns the final address sought.
The figure illustrates this process for the real host www.wikipedia.org.
The mechanism in this simple form has a difficulty: it places a huge operating burden on the collective of root servers, with each and every search for an address starting by querying one of them. Being as critical as they are to the overall function of the system, such heavy use would create an insurmountable bottleneck for the trillions of queries placed every day. In practice there are two key additions to the mechanism.
Firstly, the DNS resolution process allows for local recording and subsequent consultation of the results of a query (caching) for a period of time after a successful answer (the server providing the answer initially dictates the period of validity, which may vary from just seconds to days or even weeks). In our illustration, having found a list of addresses of servers capable of answering queries about the .example domain, the local resolver will not need to make the query again until the validity of the currently known list expires, and so on for all subsequent steps. Hence, having successfully resolved the address of ahost.inadomain.example, it is not necessary to repeat the process for some time, since the address already reached will be deemed reliable for a defined period, and resolution of anotherhost.anotherdomain.example can commence with the resolver already knowing which servers can answer queries for the .example domain. Caching significantly reduces the rate at which the most critical name servers have to respond to queries, adding the extra benefit that subsequent resolutions are not delayed by network transit times for the queries and responses.
Secondly, most domestic and small-business clients "hand off" address resolution to their ISP's DNS servers to perform the look-up process, thus allowing for the greatest benefit from those same ISPs having busy local caches serving a wide variety of queries and a large number of users.
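In code, this hand-off is what a "stub" resolver call looks like: the application asks for a name, and the configured recursive server (typically the ISP's) does all the iterative work. A minimal sketch using Node.js's dns module, which is purely a modern illustration and not something the original mechanism depends on:

    // The application sees one call; iteration and caching happen in the
    // operating system and the configured recursive DNS server.
    var dns = require("dns");

    dns.lookup("www.wikipedia.org", function (err, address) {
      if (err) throw err;
      console.log("resolved to " + address);   // e.g. an IPv4 address
    });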
CIRCULAR DEPENDENCIES AND GLUE RECORDS
Name servers in delegations appear listed by name, rather than by IP address. This means that a resolving name server must issue another DNS request to find out the IP address of the server to which it has been referred. Since this can introduce a circular dependency if the nameserver referred to is under the domain for which it is authoritative, it is occasionally necessary for the nameserver providing the delegation to also provide the IP address of the next nameserver. This record is called a glue record.
For example, assume that the sub-domain en.wikipedia.org contains further sub-domains (such as something.en.wikipedia.org) and that the authoritative nameserver for these lives at ns1.en.wikipedia.org. A computer trying to resolve something.en.wikipedia.org will thus first have to resolve ns1.en.wikipedia.org. Since ns1 is also under the en.wikipedia.org subdomain, resolving something.en.wikipedia.org requires resolving ns1.en.wikipedia.org, which is exactly the circular dependency mentioned above. The dependency is broken by a glue record in the nameserver of wikipedia.org that provides the IP address of ns1.en.wikipedia.org directly to the requestor, enabling it to bootstrap the process by figuring out where ns1.en.wikipedia.org is located.

DNS IN PRACTICE
When an application (such as a web browser) tries to find the IP address of a domain name, it doesn't necessarily follow all of the steps outlined in the theory section above. We will first look at the concept of caching, and then outline the operation of the DNS in "the real world."

CACHING AND TIME TO LIVE
Because of the huge volume of requests generated by a system like the DNS, the designers wished to provide a mechanism to reduce the load on individual DNS servers. To that end, when a DNS resolver (i.e. client) receives a DNS response, it caches that response for a given period of time. A value, set by the administrator of the DNS server handing out the response and called the time to live (TTL), defines that period. Once a response goes into the cache, the resolver will consult its cached (stored) answer; only when the TTL expires (or when an administrator manually flushes the response from the resolver's memory) will the resolver contact the DNS server again for the same information.
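A miniature version of such a cache, sketched in JavaScript: each answer is stored together with its expiry time, and a lookup returns nothing once the TTL has passed, forcing a fresh query. The names and TTL value here are illustrative only.

    // A toy resolver cache keyed by name, honouring a per-record TTL.
    var cache = {};

    function cachePut(name, address, ttlSeconds) {
      cache[name] = { address: address,
                      expires: new Date().getTime() + ttlSeconds * 1000 };
    }

    function cacheGet(name) {
      var entry = cache[name];
      if (!entry) return null;                      // never seen: must query
      if (new Date().getTime() > entry.expires) {
        delete cache[name];                         // stale: must query again
        return null;
      }
      return entry.address;                         // fresh: no network needed
    }

    cachePut("www.wikipedia.org", "66.230.200.100", 21600);  // six-hour TTL
    cacheGet("www.wikipedia.org");                           // served from cache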
Generally, the Start of Authority (SOA) record specifies the time to live. The SOA record has these parameters:
Serial -- the zone serial number, incremented when the zone file is modified, so that slave and secondary name servers know when the zone has been changed and should be reloaded.
Refresh -- the number of seconds between update requests from secondary and slave name servers.
Retry -- the number of seconds a secondary or slave will wait before retrying when the last attempt has failed.
Expire -- the number of seconds a master or slave will wait before considering the data stale if it cannot reach the primary name server.
Minimum -- previously used to determine the minimum TTL; this now offers negative caching.

CACHING TIME
As a noteworthy consequence of this distributed and caching architecture, changes to the DNS do not always take effect immediately and globally. This is best explained with an example. If an administrator has set a TTL of 6 hours for the host www.wikipedia.org, and then changes the IP address to which www.wikipedia.org resolves at 12:01pm, the administrator must consider that a person who cached a response with the old IP address at 12:00pm will not consult the DNS server again until 6:00pm. The period between 12:01pm and 6:00pm in this example is called caching time, which is best defined as the period that begins when you make a change to a DNS record and ends after the maximum amount of time specified by the TTL expires. This leads to an important logistical consideration when making changes to the DNS: not everyone is necessarily seeing the same thing you're seeing. RFC 1537 helps to convey basic rules for how to set the TTL.
Note that the term "propagation", although very widely used, does not describe the effects of caching well. Specifically, it implies [1] that when you make a DNS change, it somehow spreads to all other DNS servers (instead, other DNS servers check in with yours as needed), and [2] that you do not have control over the amount of time the record is cached (in fact, you control the TTL values for all DNS records in your domain, except your NS records and any authoritative DNS servers that use your domain name). Some resolvers may override TTL values, as the protocol supports caching for up to 68 years or no caching at all.
Negative caching (caching the non-existence of records) is determined by name servers authoritative for a zone, which MUST include the SOA record when reporting that no data of the requested type exists. The MINIMUM field of the SOA record and the TTL of the SOA itself are used to establish the TTL for the negative answer (RFC 2308).
Many people incorrectly refer to a mysterious 48-hour or 72-hour propagation time when you make a DNS change. When one changes the NS records for one's domain, or the IP addresses for hostnames of authoritative DNS servers using one's domain (if any), there can be a lengthy period of time before all DNS servers use the new information. This is because those records are handled by the zone parent DNS servers (for example, the .com DNS servers if your domain is example.com), which typically cache those records for 48 hours. However, those DNS changes will be immediately available for any DNS servers that do not have them cached. And any DNS changes on your domain other than the NS records and authoritative DNS server names can be nearly instantaneous, if you choose for them to be (by lowering the TTL once or twice ahead of time, and waiting until the old TTL expires before making the change).

DNS IN THE REAL WORLD
(Figure: DNS resolving from program, to OS resolver, to ISP resolver, to the greater system.)
Users generally do not communicate directly with a DNS resolver. Instead, DNS resolution takes place transparently in client applications such as web browsers (Internet Explorer, Opera, Mozilla Firefox, Safari, Netscape Navigator, etc.), mail clients (Outlook Express, Mozilla Thunderbird, etc.), and other Internet applications. When a request is made which necessitates a DNS lookup, such programs send a resolution request to the local DNS resolver in the operating system, which in turn handles the communications required.
The DNS resolver will almost invariably have a cache (see above) containing recent lookups. If the cache can provide the answer to the request, the resolver will return the value in the cache to the program that made the request. If the cache does not contain the answer, the resolver will send the request to a designated DNS server or servers. In the case of most home users, the Internet service provider to which the machine connects will usually supply this DNS server: such a user will either configure that server's address manually or allow DHCP to set it; however, where systems administrators have configured systems to use their own DNS servers, their DNS resolvers will generally point to their own nameservers. This name server will then follow the process outlined above in "How the DNS works in theory", until it either successfully finds a result or does not. It then returns its results to the DNS resolver; assuming it has found a result, the resolver duly caches that result for future use, and hands the result back to the software which initiated the request.

BROKEN RESOLVERS
An additional level of complexity emerges when resolvers violate the rules of the DNS protocol. Some people have suggested that a number of large ISPs have configured their DNS servers to violate the rules (presumably to allow them to run on less-expensive hardware than a fully compliant resolver), for example by disobeying TTLs, or by indicating that a domain name does not exist just because one of its name servers does not respond.
As a final level of complexity, some applications, such as web browsers, also have their own DNS cache, in order to reduce the use of the DNS resolver library itself. This practice can add extra difficulty to DNS debugging, as it obscures which data is fresh, or which cache it lies in. These caches typically have very short caching times, of the order of one minute. A notable exception is Internet Explorer; recent versions cache DNS records for half an hour.

OTHER DNS APPLICATIONS
The system outlined above provides a somewhat simplified scenario. The DNS includes several other functions:
Hostnames and IP addresses do not necessarily match on a one-to-one basis. Many hostnames may correspond to a single IP address: combined with virtual hosting, this allows a single machine to serve many web sites. Alternatively, a single hostname may correspond to many IP addresses: this can facilitate fault tolerance and load distribution, and also allows a site to move physical location seamlessly.
There are many uses of the DNS besides translating names to IP addresses. For instance, mail transfer agents use the DNS to find out where to deliver e-mail for a particular address. The domain-to-mail-exchanger mapping provided by MX records accommodates another layer of fault tolerance and load distribution on top of the name-to-IP-address mapping. Rather than creating their own record types, Sender Policy Framework and DomainKeys were designed to take advantage of another DNS record type, the TXT record.
To provide resilience in the event of computer failure, multiple DNS servers provide coverage of each domain. In particular, thirteen root servers exist worldwide. DNS programs or operating systems have the IP addresses of these servers built in.
At least nominally, the USA hosts all but three of the root servers. However, because many root servers actually implement anycast, whereby many different computers can share the same IP address to deliver a single service over a large geographic region, most of the physical (rather than nominal) root servers now operate outside the USA.
The DNS uses TCP and UDP on port 53 to serve requests. Almost all DNS queries consist of a single UDP request from the client followed by a single UDP reply from the server. TCP typically comes into play only when the response data size exceeds 512 bytes, or for tasks such as zone transfer. Some operating systems, such as HP-UX, are known to have resolver implementations that use TCP for all queries, even when UDP would suffice.
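To make the single-UDP-exchange point concrete, the sketch below hand-builds the 12-byte query header and one question, sends the datagram to a public recursive resolver on port 53, and reads the answer count from the reply header. It assumes Node.js (its dgram module) and the resolver address 8.8.8.8, neither of which is part of the original text.

    // One DNS query = one UDP datagram out, one UDP datagram back.
    var dgram = require("dgram");

    function buildQuery(name) {
      // Header: ID 0x1234, flags 0x0100 (recursion desired), QDCOUNT 1.
      var header = [0x12, 0x34, 0x01, 0x00, 0x00, 0x01, 0, 0, 0, 0, 0, 0];
      var question = [];
      name.split(".").forEach(function (label) {
        question.push(label.length);                // length byte...
        for (var i = 0; i < label.length; i++) {
          question.push(label.charCodeAt(i));       // ...then the label bytes
        }
      });
      question.push(0, 0, 1, 0, 1);                 // root label, QTYPE=A, QCLASS=IN
      return Buffer.from(header.concat(question));
    }

    var socket = dgram.createSocket("udp4");
    socket.on("message", function (reply) {
      // ANCOUNT lives at byte offset 6 of the reply header.
      console.log(reply.length + "-byte reply, " +
                  reply.readUInt16BE(6) + " answer(s)");
      socket.close();
    });
    socket.send(buildQuery("example.com"), 53, "8.8.8.8");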
EXTENSIONS TO DNS
EDNS is an extension of the DNS protocol which enhances the transport of DNS data in UDP packets and adds support for expanding the space of request and response codes. It is described in RFC 2671.

IMPLEMENTATIONS OF DNS
For a commented list of DNS server-side implementations, see Comparison of DNS server software.

STANDARDS
RFC 882 Concepts and Facilities (deprecated by RFC 1034)
RFC 883 Domain Names: Implementation Specification (deprecated by RFC 1035)
RFC 1032 Domain Administrators Guide
RFC 1033 Domain Administrators Operations Guide
RFC 1034 Domain Names - Concepts and Facilities
RFC 1035 Domain Names - Implementation and Specification
RFC 1101 DNS Encodings of Network Names and Other Types
RFC 1123 Requirements for Internet Hosts -- Application and Support
RFC 1183 New DNS RR Definitions
RFC 1706 DNS NSAP Resource Records
RFC 1876 Location Information in the DNS (LOC)
RFC 1886 DNS Extensions to Support IP Version 6
RFC 1912 Common DNS Operational and Configuration Errors
RFC 1995 Incremental Zone Transfer in DNS
RFC 1996 A Mechanism for Prompt Notification of Zone Changes (DNS NOTIFY)
RFC 2136 Dynamic Updates in the Domain Name System (DNS UPDATE)
RFC 2181 Clarifications to the DNS Specification
RFC 2182 Selection and Operation of Secondary DNS Servers
RFC 2308 Negative Caching of DNS Queries (DNS NCACHE)
RFC 2317 Classless IN-ADDR.ARPA Delegation
RFC 2671 Extension Mechanisms for DNS (EDNS0)
RFC 2672 Non-Terminal DNS Name Redirection (DNAME record)
RFC 2782 A DNS RR for Specifying the Location of Services (DNS SRV)
RFC 2845 Secret Key Transaction Authentication for DNS (TSIG)
RFC 2874 DNS Extensions to Support IPv6 Address Aggregation and Renumbering
RFC 3403 Dynamic Delegation Discovery System (DDDS) (NAPTR records)
RFC 3696 Application Techniques for Checking and Transformation of Names
RFC 4398 Storing Certificates in the Domain Name System
RFC 4408 Sender Policy Framework (SPF) (SPF records)

TYPES OF DNS RECORDS
Important categories of data stored in the DNS include the following:
An A record or address record maps a hostname to a 32-bit IPv4 address.
An AAAA record or IPv6 address record maps a hostname to a 128-bit IPv6 address.
A CNAME record or canonical name record is an alias of one name to another. The A record that the alias points to can be either local or remote (on a foreign name server). This is useful when running multiple services from a single IP address, where each service has its own entry in the DNS.
An MX record or mail exchange record maps a domain name to a list of mail exchange servers for that domain.
An NS record or name server record maps a domain name to a list of DNS servers authoritative for that domain. Delegations depend on NS records.

An SOA record or start of authority record specifies the DNS server providing authoritative information about an Internet domain, the e-mail address of the domain administrator, the domain serial number, and several timers relating to refreshing the zone.

An SRV record is a generalized service location record.

A TXT record allows an administrator to insert arbitrary text into a DNS record. For example, this record is used to implement the Sender Policy Framework and DomainKeys specifications.

NAPTR records ("Naming Authority Pointer") are a newer type of DNS record that supports regular-expression-based rewriting.

Other types of records simply provide information (for example, a LOC record gives the physical location of a host) or experimental data (for example, a WKS record gives a list of servers offering some well-known service such as HTTP or POP3 for a domain).

INTERNATIONALISED DOMAIN NAMES

While domain names in the DNS have no restrictions on the characters they use and can include non-ASCII characters, the same is not true for host names. Host names are the names most people see and use for things like e-mail and web browsing. Host names are restricted to a small subset of the ASCII character set that includes the Roman alphabet in upper and lower case, the digits 0 through 9, the dot, and the hyphen (see RFC 3696 section 2 for details). This prevented the representation of names and words of many languages natively. ICANN has approved the Punycode-based IDNA system, which maps Unicode strings into the valid DNS character set, as a workaround to this issue. Some registries have adopted IDNA.
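As an illustration of the Punycode-based IDNA mapping, Python's built-in "idna" codec converts a non-ASCII host label into the restricted DNS character set and back again (the label below is just an example):

    # Encode a non-ASCII host label into the ASCII-compatible form used in DNS.
    label = "bücher"
    encoded = label.encode("idna")
    print(encoded)                  # b'xn--bcher-kva'
    print(encoded.decode("idna"))   # 'bücher'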
SECURITY ISSUES IN DNS

DNS was not originally designed with security in mind, and thus has a number of security issues. DNS responses are traditionally not cryptographically signed, leading to many attack possibilities; DNSSEC modifies DNS to add support for cryptographically signed responses. There are also various extensions to secure zone transfer information. Some domain names can spoof other, similar-looking domain names. For example, "paypal.com" and "paypa1.com" are different names, yet users may be unable to tell the difference. This problem is much more serious in systems that support internationalized domain names, since many characters that are different (from the point of view of ISO 10646) appear identical on typical computer screens.
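A small sketch of the homograph problem: the two names below render almost identically on screen, but one contains a Cyrillic letter, so they are entirely different domain names (the names are illustrative):

    import unicodedata

    latin = "paypal.com"
    lookalike = "p\u0430ypal.com"   # Cyrillic small letter a instead of Latin "a"

    print(latin == lookalike)       # False: different names despite similar rendering
    for ch in set(lookalike) - set(latin):
        print(hex(ord(ch)), unicodedata.name(ch))   # 0x430 CYRILLIC SMALL LETTER A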
LEGAL USERS OF DOMAINS

REGISTRANT

No one in the world really "owns" a domain name except the Network Information Centre (NIC), or domain name registry. Most of the NICs in the world receive an annual fee from a legal user in order for the legal user to utilize the domain name (i.e. a sort of leasing agreement exists, subject to the registry's terms and conditions). Depending on the naming conventions of the registries, legal users are commonly known as "registrants" or "domain holders". ICANN holds a complete list of domain registries in the world. One can find the legal user of a domain name by looking in the WHOIS database held by most domain registries.

For most of the more than 240 country code top-level domains (ccTLDs), the domain registries hold the authoritative WHOIS data (registrant, name servers, expiry dates, etc). For instance, DENIC, the Germany NIC, holds the authoritative WHOIS for .DE domain names. However, some domain registries, such as those for .COM, .ORG, .INFO, etc., use a registry-registrar model. There are hundreds of domain name registrars that actually perform the domain name registration with the end user (see lists at ICANN or VeriSign). By using this method of distribution, the registry only has to manage the relationship with the registrar, and the registrar maintains the relationship with the end users, or "registrants". For .COM and .NET domain names, the domain registry, VeriSign, holds a basic WHOIS (registrar, name servers, etc); the detailed WHOIS (registrant, name servers, expiry dates, etc) can be found at the registrars. Since about 2001, most gTLD registries (.ORG, .BIZ, .INFO) have adopted a so-called "thick" registry approach, i.e. keeping the authoritative WHOIS with the registry instead of the registrars.

ADMINISTRATIVE CONTACT

A registrant usually designates an administrative contact to manage the domain name. In practice, the administrative contact usually has the most immediate power over a domain. Management functions delegated to the administrative contact may include (for example):

the obligation to conform to the requirements of the domain registry in order to retain the right to use a domain name
authorization to update the physical address, e-mail address, telephone number, etc in WHOIS

TECHNICAL CONTACT

A technical contact manages the name servers of a domain name. The functions of a technical contact include:

making sure that the configuration of the domain name conforms to the requirements of the domain registry
updating the domain zone
providing 24x7 operation of the name servers (which determines the accessibility of the domain name)

BILLING CONTACT

The billing contact is the party whom the NIC invoices.

NAME SERVERS

The authoritative name servers that host the zone of a domain name.

POLITICS

Many investigators have voiced criticism of the methods currently used to control ownership of domains. Critics commonly claim abuse by monopolies or near-monopolies, such as VeriSign, Inc. Particularly noteworthy was the VeriSign Site Finder system, which redirected all unregistered .com and .net domains to a VeriSign web page. Despite widespread criticism, VeriSign removed it only reluctantly, after ICANN threatened to revoke its contract to administer the root name servers.
There is also significant disquiet regarding United States political influence over the Internet Corporation for Assigned Names and Numbers (ICANN). This was a significant issue in the attempt to create a .xxx top-level domain, and it sparked greater interest in alternative DNS roots that would be beyond the control of any single country.

TRUTH IN DOMAIN NAMES ACT

In the United States, the "Truth in Domain Names Act", in combination with the PROTECT Act, forbids the use of a misleading domain name with the intention of attracting people into viewing a visual depiction of sexually explicit conduct on the Internet.

ELECTRONIC MAIL

Electronic mail (abbreviated "e-mail" or, often, "email") is a store-and-forward method of composing, sending, storing, and receiving messages over electronic communication systems. The term "e-mail" (as a noun or verb) applies both to the Internet e-mail system based on the Simple Mail Transfer Protocol (SMTP) and to intranet systems allowing users within one organization to e-mail each other. Often these workgroup collaboration systems use the Internet protocols for internal e-mail service.

ORIGINS OF E-MAIL

E-mail predates the Internet; existing e-mail systems were a crucial tool in creating the Internet. MIT first demonstrated the Compatible Time-Sharing System (CTSS) in 1961 [1]. It allowed multiple users to log into the IBM 7094 [2] from remote dial-up terminals, and to store files online on disk. This new ability encouraged users to share information in new ways. E-mail started in 1965 as a way for multiple users of a time-sharing mainframe computer to communicate. Although the exact history is murky, among the first systems to have such a facility were SDC's Q32 and MIT's CTSS.
E-mail was quickly extended to become network e-mail, allowing users to pass messages between different computers. The messages could be transferred between users on different computers by 1966, but it is possible that the SAGE system had something similar some time before.

The ARPANET computer network made a large contribution to the evolution of e-mail. There is one report [1] which indicates experimental inter-system e-mail transfers on it shortly after its creation, in 1969. Ray Tomlinson initiated the use of the @ sign to separate the names of the user and their machine in 1971 [2]. The ARPANET significantly increased the popularity of e-mail, and it became the killer app of the ARPANET.

MODERN INTERNET E-MAIL

HOW INTERNET E-MAIL WORKS

The following is a typical sequence of events that takes place when Alice composes a message using her mail user agent (MUA):
1. She types in, or selects from an address book, the e-mail address of her correspondent. She hits the "send" button. Her MUA formats the message in Internet e-mail format and uses the Simple Mail Transfer Protocol (SMTP) to send the message to the local mail transfer agent (MTA), in this case smtp.a.org, run by Alice's Internet Service Provider (ISP).

2. The MTA looks at the destination address provided in the SMTP protocol (not from the message header), in this case bob@b.org. An Internet e-mail address is a string of the form localpart@domain.example, which is known as a Fully Qualified Domain Address (FQDA). The part before the @ sign is the local part of the address, often the username of the recipient, and the part after the @ sign is a domain name. The MTA looks up this domain name in the Domain Name System to find the mail exchange servers accepting messages for that domain.

3. The DNS server for the b.org domain, ns.b.org, responds with an MX record listing the mail exchange servers for that domain, in this case mx.b.org, a server run by Bob's ISP.

4. smtp.a.org sends the message to mx.b.org using SMTP, which delivers it to the mailbox of the user bob.

5. Bob presses the "get mail" button in his MUA, which picks up the message using the Post Office Protocol (POP3).

This sequence of events applies to the majority of e-mail users.
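As a minimal sketch of step 1, the following Python fragment plays the role of Alice's MUA, formatting a message and handing it to her MTA over SMTP (the host names and addresses are the hypothetical ones from the walk-through):

    import smtplib
    from email.message import EmailMessage

    # Build the message in Internet e-mail format.
    msg = EmailMessage()
    msg["From"] = "alice@a.org"
    msg["To"] = "bob@b.org"
    msg["Subject"] = "Hello"
    msg.set_content("Hi Bob!")

    # Hand the message to the local MTA over SMTP (port 25 by default).
    with smtplib.SMTP("smtp.a.org") as smtp:
        smtp.send_message(msg)   # envelope addresses are taken from the headers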
However, there are many alternative possibilities and complications to the e-mail system:

Alice or Bob may use a client connected to a corporate e-mail system, such as IBM's Lotus Notes or Microsoft's Exchange. These systems often have their own internal e-mail format, and their clients typically communicate with the e-mail server using a vendor-specific, proprietary protocol. The server sends or receives e-mail via the Internet through the product's Internet mail gateway, which also does any necessary reformatting. If Alice and Bob work for the same company, the entire transaction may happen completely within a single corporate e-mail system.

Alice may not have a MUA on her computer but instead may connect to a webmail service.

Alice's computer may run its own MTA, avoiding the transfer at step 1.

Bob may pick up his e-mail in many ways, for example using the Internet Message Access Protocol (IMAP), by logging into mx.b.org and reading it directly, or by using a webmail service (a POP3 sketch follows this list).

Domains usually have several mail exchange servers so that they can continue to accept mail even when the main mail exchange server is not available.

E-mail messages are not secure if e-mail encryption is not used correctly.

It used to be the case that many MTAs would accept messages for any recipient on the Internet and do their best to deliver them. Such MTAs are called open mail relays. This was important in the early days of the Internet, when network connections were unreliable. If an MTA couldn't reach the destination, it could at least deliver the message to a relay that was closer to the destination. The relay would have a better chance of delivering the message at a later time. However, this mechanism proved to be exploitable by people sending unsolicited bulk e-mail, and as a consequence very few modern MTAs are open mail relays, and many MTAs will not accept messages from open mail relays because such messages are very likely to be spam.
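Step 5 of the walk-through, Bob collecting his mail over POP3, might look like this in Python (server name and credentials are hypothetical; a real deployment would use POP3_SSL):

    import poplib

    pop = poplib.POP3("mx.b.org")       # plain POP3 on port 110
    pop.user("bob")
    pop.pass_("secret")

    count = len(pop.list()[1])          # list() returns (response, listing, size)
    for i in range(1, count + 1):
        for line in pop.retr(i)[1]:     # retr() returns (response, lines, octets)
            print(line.decode("ascii", "replace"))
    pop.quit()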
INTERNET E-MAIL FORMAT

The format of Internet e-mail messages is defined in RFC 2822 and a series of RFCs, RFC 2045 through RFC 2049, collectively called Multipurpose Internet Mail Extensions (MIME). Although as of July 13, 2005 (see [3]) RFC 2822 is technically a proposed IETF standard and the MIME RFCs are draft IETF standards, these documents are the de facto standards for the format of Internet e-mail. Prior to the introduction of RFC 2822 in 2001, the format described by RFC 822 was the de facto standard for Internet e-mail for nearly two decades; it is still the official IETF standard. The IETF reserved the numbers 2821 and 2822 for the updated versions of RFC 821 (SMTP) and RFC 822, honoring the extreme importance of these two RFCs. RFC 822 was published in 1982 and based on the earlier RFC 733.

Internet e-mail messages consist of two major sections:

Header - structured into fields such as summary, sender, receiver, and other information about the e-mail
Body - the message itself as unstructured text, sometimes containing a signature block at the end

The header is separated from the body by a blank line.

INTERNET E-MAIL HEADER

The message header consists of fields, usually including at least the following:

From: The e-mail address, and optionally the name, of the sender of the message
To: The e-mail address(es), and optionally the name(s), of the receiver(s) of the message
Subject: A brief summary of the contents of the message
Date: The local time and date when the message was originally sent

Each header field has a name and a value. RFC 2822 specifies the precise syntax. Informally, the field name starts in the first character of a line, followed by a ":", followed by the value, which is continued on subsequent non-empty lines that have a space or tab as their first character. Field names and values are restricted to 7-bit ASCII characters. Non-ASCII values may be represented using MIME encoded words.

Note that the "To" field in the header is not necessarily related to the addresses to which the message is delivered. The actual delivery list is supplied in the SMTP protocol, not extracted from the header content. The "To" field is similar to the greeting at the top of a conventional letter, which is delivered according to the address on the outer envelope. Also note that the "From" field does not have to be the real sender of the e-mail message. It is very easy to fake the "From" field and make a message seem to be from any mail address. It is possible to digitally sign e-mail, which is much harder to fake. Some Internet service providers do not relay e-mail claiming to come from a domain not hosted by them, but very few (if any) check to make sure that the person or even e-mail address named in the "From" field is the one associated with the connection. Some Internet service providers apply e-mail authentication systems to e-mail being sent through their MTA to allow other MTAs to detect forged spam that appears to come from them.
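A short sketch of the header/body structure described above, parsed with Python's standard email package (the raw message text is made up):

    from email import message_from_string

    raw = (
        "From: Alice <alice@a.org>\r\n"
        "To: bob@b.org\r\n"
        "Subject: Meeting\r\n"
        "Date: Thu, 01 Jan 2004 10:00:00 -0000\r\n"
        "\r\n"                          # blank line separates header from body
        "See you at noon.\r\n"
    )

    msg = message_from_string(raw)
    print(msg["Subject"])               # Meeting
    print(msg.get_payload())            # See you at noon.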
Other common header fields include:

Cc: Carbon copy
Received: Tracking information generated by mail servers that have previously handled a message
Content-Type: Information about how the message has to be displayed, usually a MIME type

Many e-mail clients present "Bcc" (blind carbon copy, recipients not visible in the "To" field) as a header field. Because the entire header is visible to all recipients, "Bcc" is not included in the message header. Addresses added as "Bcc" are only added to the SMTP delivery list and do not get included in the message data. IANA maintains a list of standard header fields.

E-MAIL CONTENT ENCODING

E-mail was originally designed for 7-bit ASCII. Much e-mail software is 8-bit clean but must assume it will be communicating with 7-bit servers and mail readers. The MIME standard introduced character set specifiers and two content transfer encodings to encode 8-bit data for transmission: quoted-printable for mostly 7-bit content with a few characters outside that range, and base64 for arbitrary binary data. The 8BITMIME extension was introduced to allow transmission of mail without the need for these encodings, but many mail transport agents still don't support it fully. For international character sets, Unicode is growing in popularity.
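The two MIME content transfer encodings can be demonstrated with Python's standard quopri and base64 modules (the sample text is arbitrary):

    import base64
    import quopri

    data = "Müller".encode("utf-8")     # 8-bit data that 7-bit transports cannot carry

    print(quopri.encodestring(data))    # quoted-printable: mostly readable, e.g. M=C3=BCller
    print(base64.b64encode(data))       # base64: compact encoding for arbitrary binary data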
SAVED MESSAGE FILENAME EXTENSION

Most, but not all, e-mail clients save individual messages as separate files, or allow users to do so. Different applications save e-mail files with different filename extensions:

.eml - The default e-mail extension for Mozilla Thunderbird; also used by Microsoft Outlook Express
.emlx - Used by Apple Mail
.msg - Used by Microsoft Office Outlook

MESSAGES AND MAILBOXES

Messages are exchanged between hosts using the Simple Mail Transfer Protocol with software like Sendmail. Users can download their messages from servers with standard protocols such as POP or IMAP, or, as is more likely in a large corporate environment, with a proprietary protocol specific to Lotus Notes or Microsoft Exchange servers. Mail can be stored on the client, on the server side, or in both places. Standard formats for mailboxes include Maildir and mbox. Several prominent e-mail clients use their own proprietary format and require conversion software to transfer e-mail between them. When a message cannot be delivered, the recipient MTA must send a bounce message back to the sender, indicating the problem.
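For example, an mbox-format mailbox can be read with Python's standard mailbox module (the path is hypothetical):

    import mailbox

    # Iterate over the messages stored in a standard mbox file.
    for message in mailbox.mbox("/var/mail/bob"):
        print(message["From"], "-", message["Subject"])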
SPAMMING AND E-MAIL WORMS

The usefulness of e-mail is being threatened by three phenomena: spamming, phishing, and e-mail worms.

Spamming is unsolicited commercial e-mail. Because of the very low cost of sending e-mail, spammers can send hundreds of millions of e-mail messages each day over an inexpensive Internet connection. Hundreds of active spammers sending this volume of mail results in information overload for many computer users, who receive tens or even hundreds of junk messages each day.

E-mail worms use e-mail as a way of replicating themselves into vulnerable computers. Although the first e-mail worm affected UNIX computers, the problem is most common today on the more popular Microsoft Windows operating system. The combination of spam and worm programs results in users receiving a constant drizzle of junk e-mail, which reduces the usefulness of e-mail as a practical tool.

A number of anti-spam techniques mitigate the impact of spam. In the United States, Congress has passed a law, the CAN-SPAM Act of 2003, attempting to regulate such e-mail. Australia also has very strict spam laws restricting the sending of spam from an Australian ISP (http://www.aph.gov.au/library/pubs/bd/2003-04/04bd045.pdf), but their impact has been minimal since most spam comes from regimes that seem reluctant to regulate the sending of spam.

PRIVACY PROBLEMS REGARDING E-MAIL

Without some security precautions, e-mail privacy can be compromised because:

e-mail messages are generally not encrypted;
e-mail messages have to go through intermediate computers before reaching their destination, meaning it is relatively easy for others to intercept and read messages;
many Internet Service Providers (ISPs) store copies of e-mail messages on their mail servers before they are delivered, and backups of these can remain for up to several months on their servers, even if you delete them from your mailbox;
the Received: headers and other information in the e-mail can often identify the sender, preventing anonymous communication.

There are cryptography applications that can serve as a remedy to one or more of the above. For example, Virtual Private Networks or the Tor anonymity network can be used to encrypt traffic from the user machine to a safer network, while GPG, PGP or S/MIME can be used for end-to-end message encryption, and SMTP STARTTLS or SMTP over Transport Layer Security/Secure Sockets Layer can be used to encrypt communications for a single mail hop between the SMTP client and the SMTP server. Another risk is that e-mail passwords might be intercepted during sign-in. One may use encrypted authentication schemes such as SASL to help prevent this.
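As a sketch of the single-hop protection mentioned above, a Python SMTP client can upgrade its connection with STARTTLS before authenticating (server name and credentials are hypothetical):

    import smtplib
    import ssl

    with smtplib.SMTP("smtp.a.org", 587) as smtp:
        smtp.starttls(context=ssl.create_default_context())   # encrypt this hop
        smtp.login("alice", "secret")   # credentials now travel over TLS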
FINGER

In computer networking, the Name/Finger protocol and the Finger user information protocol are simple network protocols for the exchange of human-oriented status and user information.

NAME/FINGER PROTOCOL

The Name/Finger protocol is based on Request for Comments document RFC 742 (December 1977), an interface to the name and finger programs that provide status reports on a particular computer system or a particular person at network sites. The finger program was written in 1971 by Les Earnest, who created the program to solve the need of users who wanted information on other users of the network. Information on who was logged in was useful for checking the availability of a person to meet. Prior to the finger program, the only way to get this information was with a who program that showed IDs and terminal line numbers for logged-in users, and people used to run their fingers down the who list. Earnest named his program after this concept.

FINGER USER INFORMATION PROTOCOL

The Finger user information protocol is based on RFC 1288 (The Finger User Information Protocol, December 1991). Typically the server side of the protocol is implemented by a program fingerd (for finger daemon), while the client side is implemented by the name and finger programs, which are supposed to return a friendly, human-oriented status report on either the system at the moment or a particular person in depth. There is no required format, and the protocol consists mostly of specifying a single command line. It is most often implemented on Unix or Unix-like systems. The program supplies information such as whether a user is currently logged on, the user's e-mail address, full name, and so on. As well as standard user information, finger displays the contents of the .project and .plan files in the user's home directory. Often this file (maintained by the user) contains either useful information about the user's current activities or all manner of humor.
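Because the protocol consists of little more than one request line, a finger client fits in a few lines of Python (host and user name are hypothetical):

    import socket

    def finger(user, host, port=79):
        """Send one CRLF-terminated request line and read the reply (RFC 1288)."""
        with socket.create_connection((host, port), timeout=10) as sock:
            sock.sendall(user.encode("ascii") + b"\r\n")
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        return b"".join(chunks).decode("ascii", "replace")

    print(finger("jdoe", "finger.example.com"))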
SECURITY CONCERNS

Supplying such detailed information as e-mail addresses and full names was considered acceptable and convenient in the early days of the Internet, but later was considered questionable for privacy and security reasons. Finger information has frequently been used by crackers as a way to initiate a social engineering attack on a company's computer security system. By using a finger client to get a list of a company's employee names, e-mail addresses, phone numbers, and so on, a cracker can telephone or e-mail someone at a company requesting information while posing as another employee. The finger daemon has also had several exploitable security holes which crackers have used to break into systems. The Morris worm exploited an overflow vulnerability in fingerd (among others) to spread. For these reasons, while finger was widely used during the early days of the Internet, by the 1990s the vast majority of sites on the Internet no longer offered the service. Notable exceptions include John Carmack and Justin Frankel, who until recently still updated their status information occasionally. In late 2005, John Carmack switched to using a blog instead of his old .plan site.

FILE TRANSFER PROTOCOL (FTP)

FTP or File Transfer Protocol is used to connect two computers over the Internet so that the user of one computer can transfer files and perform file commands on the other computer. Specifically, FTP is a commonly used protocol for exchanging files over any network that supports the TCP/IP protocol (such as the Internet or an intranet). There are two computers involved in an FTP transfer: a server and a client. The FTP server, running FTP server software, listens on the network for connection requests from other computers. The client computer, running FTP client software, initiates a connection to the server.
Once connected, the client can perform a number of file manipulation operations such as uploading files to the server, downloading files from the server, and renaming or deleting files on the server. Any software company or individual programmer is able to create FTP server or client software because the protocol is an open standard. Virtually every computer platform supports the FTP protocol. This allows any computer connected to a TCP/IP based network to manipulate files on another computer on that network, regardless of which operating systems are involved (if the computers permit FTP access).

OVERVIEW

FTP runs exclusively over TCP. FTP servers by default listen on port 21 for incoming connections from FTP clients. A connection to this port from the FTP client forms the control stream, on which commands are passed to the FTP server from the FTP client and on occasion from the FTP server to the FTP client. For the actual file transfer to take place, a different connection is required, called the data stream. Depending on the transfer mode, the process of setting up the data stream is different.

In active mode, the FTP client opens a random port (greater than 1023), sends the FTP server the random port number on which it is listening over the control stream, and waits for a connection from the FTP server. When the FTP server initiates the data connection to the FTP client, it binds the source port to port 20 on the FTP server.

In passive mode, the FTP server opens a random port (greater than 1023), sends the FTP client the port on which it is listening over the control stream, and waits for a connection from the FTP client. In this case the FTP client binds the source port of the connection to a random port greater than 1023.

While data is being transferred via the data stream, the control stream sits idle. This can cause problems with large data transfers through firewalls which time out sessions after lengthy periods of idleness. While the file may well be successfully transferred, the control session can be disconnected by the firewall, causing an error to be generated.
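A minimal sketch of a passive-mode download with Python's standard ftplib module (host, credentials and file name are hypothetical):

    from ftplib import FTP

    ftp = FTP("ftp.example.com")   # control connection to port 21
    ftp.login("user", "secret")
    ftp.set_pasv(True)             # passive mode: the client opens the data connection

    with open("file.bin", "wb") as f:
        ftp.retrbinary("RETR file.bin", f.write)   # file data flows on the data stream
    ftp.quit()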
When FTP is used in a UNIX environment, there is an often-ignored but valuable command, "reget" (meaning "get again"), that will cause an interrupted "get" command to be continued, hopefully to completion, after a communications interruption. The principle is obvious: the receiving station has a record of what it got, so it can spool through the file at the sending station and restart at the right place for a seamless splice. The converse, "reput", is not available. Again, the principle is obvious: the sending station does not know how much of the file was actually received, so it would not know where to start.

The objectives of FTP, as outlined by its RFC, are:

To promote sharing of files (computer programs and/or data).
To encourage indirect or implicit use of remote computers.
To shield a user from variations in file storage systems among different hosts.
To transfer data reliably and efficiently.

CRITICISMS OF FTP

Passwords and file contents are sent in clear text, which can be intercepted by eavesdroppers. There are protocol enhancements that circumvent this.

Multiple TCP/IP connections are used: one for the control connection, and one for each download, upload, or directory listing. Firewall software needs additional logic to account for these connections.

It is hard to filter active mode FTP traffic on the client side by using a firewall, since the client must open an arbitrary port in order to receive the connection. This problem is largely resolved by using passive mode FTP.

It is possible to abuse the protocol's built-in proxy features to tell a server to send data to an arbitrary port of a third computer; see FXP.

FTP is a high-latency protocol due to the number of commands needed to initiate a transfer.

There is no integrity check on the receiver side. If a transfer is interrupted, the receiver has no way to know whether the received file is complete. It is necessary to manage this externally, for example with MD5 sums or cyclic redundancy checking.

There is no error detection. FTP relies on the underlying TCP layer for error control, which uses a weak checksum by modern standards.
There is no date/timestamp attribute transfer. Uploaded files are given a new current timestamp, unlike other file transfer protocols such as SFTP, which allow attributes to be included. There is no way in the standard FTP protocol to set the time-last-modified (or time-created) datestamp that most modern filesystems preserve. There is a draft of a proposed extension that adds new commands for this, but as yet most of the popular FTP servers do not support it.

SECURITY PROBLEMS

The original FTP specification is an inherently insecure method of transferring files because there is no method specified for transferring data in an encrypted fashion. This means that under most network configurations, user names, passwords, FTP commands, and transferred files can be "sniffed" or viewed by anyone on the same network using a packet sniffer. This is a problem common to many Internet protocol specifications written prior to the creation of SSL, such as HTTP, SMTP and Telnet. The common solution to this problem is to use either SFTP (SSH File Transfer Protocol) or FTPS (FTP over SSL), which adds SSL or TLS encryption to FTP as specified in RFC 4217.

FTP RETURN CODES

FTP server return codes indicate their status by the digits within them. A brief explanation of the various digits' meanings is given below:

1yz: Positive Preliminary reply. The action requested is being initiated, but there will be another reply before it begins.
2yz: Positive Completion reply. The action requested has been completed. The client may now issue a new command.
3yz: Positive Intermediate reply. The command was successful, but a further command is required before the server can act upon the request.
4yz: Transient Negative Completion reply. The command was not successful, but the client is free to try the command again, as the failure is only temporary.
5yz: Permanent Negative Completion reply. The command was not successful and the client should not attempt to repeat it.
x0z: The failure was due to a syntax error.
x1z: This response is a reply to a request for information.
x2z: This response is a reply relating to connection information.
x3z: This response is a reply relating to accounting and authorization.
x4z: Unspecified as yet.
x5z: These responses indicate the status of the server file system vis-a-vis the requested transfer or other file system action.

ANONYMOUS FTP

Many sites that run FTP servers enable so-called "anonymous FTP". Under this arrangement, users do not need an account on the server. The user name for anonymous access is typically 'anonymous' or 'ftp'. This account does not need a password. Although users are commonly asked to send their e-mail addresses as their passwords for authentication, usually there is trivial or no verification, depending on the FTP server and its configuration. Internet Gopher has been suggested as an alternative to anonymous FTP, as has the Trivial File Transfer Protocol.

While transferring data over the network, several data representations can be used. The two most common transfer modes are ASCII mode and binary mode. The two types differ in the way they send the data.

When a file is sent using an ASCII-type transfer, the individual letters, numbers, and characters are sent using their ASCII character codes. The receiving machine saves these in a text file in the appropriate format (for example, a Unix machine saves it in a Unix format, a Macintosh saves it in a Mac format). Hence, if an ASCII transfer is used, it can be assumed plain text is sent, which is stored by the receiving computer in its own format. Translating between text formats entails substituting the end-of-line and end-of-file characters used on the source platform with those on the destination platform; for example, a Windows machine receiving a file from a Unix machine will replace the line feeds with carriage return-line feed pairs. ASCII transfer is also marginally faster, as the highest-order bit is dropped from each byte in the file.[1]

Sending a file in binary mode is different: the sending machine sends the file bit for bit, and the recipient stores the bitstream as it receives it. Any form of data that is not plain text will be corrupted if this mode is not used.
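With Python's ftplib the two modes correspond to two retrieval calls: retrlines switches the connection to ASCII type, while retrbinary switches it to binary image type (the host and file names below are hypothetical):

    from ftplib import FTP

    ftp = FTP("ftp.example.com")
    ftp.login()   # anonymous login

    # ASCII mode: line-oriented text, translated to the local line-ending convention.
    ftp.retrlines("RETR readme.txt", print)

    # Binary mode: the exact bitstream is stored as received.
    with open("image.png", "wb") as f:
        ftp.retrbinary("RETR image.png", f.write)
    ftp.quit()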
By default, most FTP clients use ASCII mode. Some clients try to determine the required transfer mode by inspecting the file's name or contents. The FTP specifications also list two further transfer modes, EBCDIC mode and local mode. In practice these additional transfer modes are rarely used, although they are still used by some legacy mainframe systems.

FTP AND WEB BROWSERS

Most recent web browsers and file managers can connect to FTP servers, although they may lack support for protocol extensions such as FTPS. This allows manipulation of remote files over FTP through an interface similar to that used for local files. This is done via an FTP URL, which takes the form ftp(s)://<ftpserveraddress> (e.g., [2]). A password can optionally be given in the URL, e.g.: ftp(s)://<login>:<password>@<ftpserveraddress>:<port>. Most web browsers require the use of passive mode FTP, which not all FTP servers are capable of handling. Some browsers allow only the downloading of files, but offer no way to upload files to the server.
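For illustration, Python's urllib also accepts FTP URLs of this form (the URL is hypothetical):

    from urllib.request import urlopen

    # Fetch a file via an ftp:// URL with credentials embedded in the URL.
    with urlopen("ftp://anonymous:guest@ftp.example.com/pub/readme.txt") as resp:
        print(resp.read().decode("ascii", "replace"))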
FTP OVER SSH

FTP over SSH refers to the practice of tunneling a normal FTP session over an SSH connection. Because FTP uses multiple TCP connections (unusual for a TCP/IP protocol that is still in use), it is particularly difficult to tunnel over SSH. With many SSH clients, attempting to set up a tunnel for the control channel (the initial client-to-server connection on port 21) will protect only that channel; when data is transferred, the FTP software at either end will set up new TCP connections (data channels) which bypass the SSH connection, and thus have no confidentiality, integrity protection, etc. If the FTP client is configured to use passive mode and to connect to a SOCKS server interface that many SSH clients can present for tunnelling, it is possible to run all the FTP channels over the SSH connection.

Otherwise, it is necessary for the SSH client software to have specific knowledge of the FTP protocol, so that it can monitor and rewrite FTP control channel messages and autonomously open new forwardings for FTP data channels. Version 3 of SSH Communications Security's software suite and the GPL-licensed FONC are two software packages that support this mode.

FTP over SSH is sometimes referred to as secure FTP; this should not be confused with other methods of securing FTP, such as SSL/TLS (FTPS). Other methods of transferring files using SSH that are not related to FTP include SFTP and SCP; in each of these, the entire conversation (credentials and data) is always protected by the SSH protocol.

AN OVERVIEW OF THE FILE TRANSFER PROTOCOL

The File Transfer Protocol (FTP) was one of the first efforts to create a standard means of exchanging files over a TCP/IP network, so the FTP has been around since the 1970s. The FTP was designed with as much flexibility as possible, so it could be used over networks other than TCP/IP, and was also engineered to be capable of exchanging files with a broad variety of machines. The base specification is RFC 959, dated October 1985. There are some additional RFCs relating to FTP, but it should be noted that even as of this writing (December 2001) most of the new additions are not in widespread use. The purpose of this document is to provide general information about how the protocol works without getting into too many technical details. RFC 959 should be consulted for details on the protocol.

CONTROL CONNECTION -- THE CONVERSATION CHANNEL

The protocol can be thought of as interactive, because clients and servers actually have a conversation where they authenticate themselves and negotiate file transfers. In addition, the protocol specifies that the client and server do not exchange data on the conversation channel. Instead, clients and servers negotiate how to send data files on separate connections, with one connection for each data transfer. Note that a directory listing is considered a file transfer.
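To make the conversation concrete, the following sketch speaks to the control connection directly over a raw socket; the reply codes in the comments are typical examples, and the host is hypothetical:

    import socket

    with socket.create_connection(("ftp.example.com", 21), timeout=10) as sock:
        f = sock.makefile("rb")
        print(f.readline().decode().strip())   # e.g. "220 Service ready"
        sock.sendall(b"USER anonymous\r\n")
        print(f.readline().decode().strip())   # e.g. "331 Need password"
        sock.sendall(b"PASS guest\r\n")
        print(f.readline().decode().strip())   # e.g. "230 User logged in"
        sock.sendall(b"QUIT\r\n")
        print(f.readline().decode().strip())   # e.g. "221 Goodbye"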