Designing and Implementing an Effective Backup Solution
Anuj Mediratta, Director, and Neeraj Mediratta
Ace Data Devices Pvt. Ltd., India
Knowledge Sharing Winners 2009 Awards
Thank You!
(left to right): Alok Shrivastava, Sr. Director, EMC Education Services, with Ken Guest, Brian Dehn, Jacob Willig, Bruce Yellin, Faisal Choudry, Sejal Joshi, and Anuj Sharma, with Tom Clancy, Vice President, EMC Education Services, and Frank Hauck, Executive Vice President, EMC.

For the fourth consecutive year, we are pleased to recognize our EMC® Proven™ Professional Knowledge Sharing authors. This year’s Book of Abstracts demonstrates how the Knowledge Sharing program has grown into a powerful forum for sharing ideas, expertise, unique deployments, and best practices among IT infrastructure professionals. Articles from our contributing Knowledge Sharing authors have been downloaded more than 156,000 times, underscoring the power of the knowledge sharing concept. View our full library of Knowledge Sharing articles, and articles from the 2010 competition, published monthly, at http://education.EMC.com/KnowledgeSharing.
Our Knowledge Sharing authors also play a leading role in our EMC Proven
Professional community. It’s a great place to collaborate with other Proven
Professionals, ask questions about the program, or share your experiences.
Visit the community at http://education.EMC.com/ProvenCommunity.
The EMC Proven Professional program had another great year. We expanded
exclusive benefits available to EMC Proven Professionals with the introduction
of the Knowledge Sharing eSeminar—an interactive online session where
EMC Proven Professionals can connect with other subject matter experts.
In addition, we implemented enhancements to Knowledge Maintenance,
providing specialist- and expert-level EMC Proven Professionals with proactive
notification and complimentary knowledge updates to sustain the value of
their certification.
Our continuing success is built on the foundation of committed professionals
who participate, contribute, and share. We thank each of you who participated
in the 2010 Knowledge Sharing competition.
Tom Clancy, Vice President, EMC Education Services
Alok Shrivastava, Senior Director, EMC Education Services
EMC Proven Professional: Knowledge Sharing 2010 3
Table of Contents
FIRST-PLACE KNOWLEDGE SHARING ARTICLE
Above the Clouds—Best Practices in Creating a Sustainable Computing Infrastructure to Achieve Business Value and Growth
Paul Brant, EMC

SECOND-PLACE KNOWLEDGE SHARING ARTICLE
Applying Composite Data Deduplication to Simplify Hybrid Clouds
Mohammed Hashim, Wipro Technologies
Rejaneesh Sasidharan, Wipro Technologies

THIRD-PLACE KNOWLEDGE SHARING ARTICLE
SAN Performance—Getting the Most ‘Bang’ for the Buck
John Bowling, Private Financial Institution

BEST OF CLOUD
How to Trust the Cloud—“Be Careful Up There”
Paul Brant, EMC
Denis Guyadeen, EMC

BEST OF SKILLS
The Convergence Factor—When IT Service Silos Collide
Ben Dingley, EMC Consulting

BEST OF CONSOLIDATION
Growing Storage Area Network Consolidation and Migration: Is it for You?
Sejal Joshi, Large Telecom Company
Ken Guest, Large Telecom Company

APPLICATION INTEGRATION
Designing Documentum Applications, a Multi-Dimensional Challenge
Jacob Willig, Lazymen/MatchFirst

ARCHIVE
Save Costs on E-mail Infrastructure: Obtain Value-added Benefits of E-mail Retention and Archiving
Satheesh Mc, PMC-Sierra

AVAMAR
Are You Ready for ‘AVA’ Generation?
Randeep Singh, HCL Comnet

BACKUP AND RECOVERY
Backup and Recovery of Microsoft SharePoint—A Complete Solution
Douglas Collimore, EMC
EMC PowerSnap Implementation—Challenges and Examples
Hrvoje Crvelin, Orchestra Service
Enhancing Throughput with Multi-Homed Feature in NetWorker Storage Node
Gururaj Kulkarni, EMC
Anand Subramanian, EMC
Scheduling Operations in NetWorker
Aaron Kleinsmith, EMC
The Union of Snapshot-Based Backup Technology and Data Deduplication
Chris Mavromatis, EMC

BUSINESS CONTINUITY
EMC Business Continuity Offerings: Another Differentiator of Midrange and High-end Storage Arrays
Denis Serov, EMC

BUSINESS PROCESS
Revisiting Business Information Management Re-Engineering: A Never Ending Cycle
Eugene Demigillo, EMC

CLARiiON
Best Practices for Designing, Deploying, and Administering SAN using EMC CLARiiON Storage Systems
Anuj Sharma, EMC
How to Shrink a LUN or MetaLUN with CLARiiON CX4
Markus Langenberg, Fujitsu Technology Solutions
Performance—Feeling or Reality in a CLARiiON Environment
Octavio Palomino, EMC
Javier Alvarez Dasnoy, EMC

CLOUD
Managing the Cloud—Out of the Fog
Brian Dehn, EMC

CONTENT MANAGEMENT
Building a Knowledge Management System with Documentum
Kaiyin Hu, EMC
Hongfeng Sun, PATAC
Documentum Disaster Recovery Implementation
Narsingarao Miriyala, EMC
EMC Documentum Single Sign-on using Standard Tools
Sander Hendriks, Informed Consulting
Enterprise Content Management: Yesterday’s Lessons and Today’s Challenges
Himakara Pieris, Daedal Consulting
Integrating SharePoint and Documentum
Rami A Al Ghanim, Saudi Aramco
Sara AlMansour, Saudi Aramco
Media Asset Management in the Broadcasting Industry using Documentum
Derrick Lau, Contractor
Understanding In-Depth Documentum Security
Narsingarao Miriyala, EMC
WCM with DITA and Documentum
Jaimala D. Bondre
Working with Documentum Composer and Without It
Andrew Fitzpatrick, Capgemini
Zero-Cost Single Sign-On for Documentum Web Applications with Integrated Windows Authentication
Fangjian Wu, Booz Allen Hamilton

DATA CENTER MIGRATION
The Art and Science of a Data Center Migration
Michael Breshears, EMC

RECOVERPOINT
Host-Based Write Splitting in a Solaris RecoverPoint Environment
Arthur Johnson, EMC

RESOURCE MANAGEMENT
Building a ControlCenter Infrastructure in a Windows 2003 Clustered Environment
Kevin Atkin, Thomson Reuters

SECURITY
A Practical Approach to Data Loss Prevention
Sonali Bhavsar, Paragon Solutions, a subsidiary of Paragon Computer Professionals, Inc.
RAID! Digital Forensics, SANs, and Virtualization
Charles Brooks, EMC

STORAGE
Cost-Effective Methods of Remote Data Storage
Johny Jose, Don Bosco College
Mitchel Valentina D’Souza, National Institute of Technology Karnataka
Failure Points in Storage Configurations: From the Application to the Physical Disk
Charles Macdonald, TELUS
Procedures to Migrate Data between Storage Arrays: IBM and Symmetrix
Alvaro Clemente, EMC
Role of EMC Storage in the Media Industry
Preethi Vasudevan, EMC
Storage, a Cloud Storm or Not
Roy Mikes, Mondriaan Zorggroep
The Illusion of Space and the Science of Data Compression and Deduplication
Bruce Yellin, EMC

VIRTUALIZATION
Design Considerations for Block-Based Storage Virtualization Applications
Venugopal Reddy, EMC
Unified Storage Computing in the Data Center
Amrith Raj Radhakrishnan, TATA Consultancy Services Limited
FIRST-PLACE KNOWLEDGE SHARING ARTICLE
Above the Clouds—Best Practices in Creating a
Sustainable Computing Infrastructure to Achieve
Business Value and Growth
Paul Brant, EMC
The IT industry is embarking on a new service delivery paradigm. From the start, each infor-
mation technology wave—from mainframe, minicomputer, PC/microprocessor to networked
distributed computing—offered new challenges and benefits. We are embarking on a new
wave, one that offers innovative methods and technologies to achieve sustainable growth
while increasing business value for small businesses to major multi-national corporations.
Businesses can use various new technologies and approaches to enable more efficient and sustainable growth, such as cloud computing, cloud services, private clouds, and warehouse-scale data center design. Other “services-on-demand” offerings are beginning to make their mark on the corporate IT landscape.
Every business has its own requirements and challenges in creating a sustainable IT business
model that addresses the need for continued growth and scalability. Can this new wave—
fostered by burgeoning new technologies such as cloud computing and the accelerating
information growth curve—turn the IT industry into a level playing field? What are the metrics
that would enable a systematic determination as to what technology should be imple-
mented? What are the standards and best practices in the evaluation of each technology?
Will one technology fit all business, environmental, and sustainable possibilities?
For example, to sustain business growth, IT consumers require specific standards so that
data and applications are not held captive by non-interoperable cloud services providers.
Otherwise, we end up with walled gardens as we had with CompuServe, AOL, and Prodigy in
the period before the Internet and worldwide web emerged. Data and application portability
standards have to be firmly in place with solid cloud service provider backing.
Data centers are changing at a rapid and exponential pace. However, amid all the changes to data center facilities and the associated information management technologies, IT professionals face numerous challenges in unifying their peers to solve problems for their companies. Sometimes you may feel as if you are speaking different languages or living on different
planets. What do virtual computers and three-phase power have in common? Has your IT staff or department ever come to you asking for more capacity without considering the additional power and cooling required? Do you have thermal hot spots in places you never expected
or contemplated? Has virtualization changed your network architecture or your security proto-
cols? What exactly does cloud computing mean to your data center?
Is cloud computing, storage, or software as a service (SaaS) being performed in your data
center already? More importantly, how do you align the different data center disciplines to
understand how new technologies will work together to solve data center sustainability
problems? One potential best practice discussed in this article is to have a standardized
data center stack framework to address these issues enabling us to achieve a sustained
business value growth trajectory.
How do we tier data center efficiency and map it back to business value and growth?
In 2008, American data centers collectively consumed more power than all the televisions in every home and every sports bar in America. That really puts a new perspective on it, doesn’t it? This article addresses all of these questions and offers possible solutions.
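One way to make the tiering question above concrete is the power usage effectiveness (PUE) ratio, a widely used data center efficiency metric. The metric choice and the tier thresholds below are illustrative assumptions, not taken from the article:

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power usage effectiveness: total facility power (IT load plus
    cooling, lighting, and distribution losses) divided by IT load alone.
    1.0 is the theoretical ideal; typical 2008-era centers ran near 2.0."""
    if it_equipment_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_equipment_kw

def efficiency_tier(ratio: float) -> str:
    # Illustrative bands only; real programs define their own thresholds.
    if ratio < 1.2:
        return "leading"
    if ratio < 1.5:
        return "efficient"
    if ratio < 2.0:
        return "average"
    return "inefficient"

print(efficiency_tier(pue(1800, 1000)))  # PUE 1.8 -> "average"
```

Mapping each facility to such a tier gives a simple starting point for relating efficiency back to business value and growth.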
In summary, this article goes above the cloud, offering best practices to align with the most
important goal—creating a sustainable computing infrastructure to achieve business value
and growth.
SECOND-PLACE KNOWLEDGE SHARING ARTICLE
Applying Composite Data Deduplication to Simplify
Hybrid Clouds
Mohammed Hashim, Wipro Technologies
Rejaneesh Sasidharan, Wipro Technologies
Enterprises are constantly seeking new methods to tackle ever increasing data and are chal-
lenged by requirements for security, data protection, and computing capacity. Although data
growth is not new, the pace of growth has become more rapid, the location of data more dispersed, and the linkage between data sets more complex. Hence, enterprises are planning phased moves from their current physical environments to virtual, and then to private cloud, setups. This is leading to data center consolidation through the adoption of virtualization and deduplication, with the intent to optimize capacity and reduce data.
As private clouds and collaboration with public clouds gain maturity with just-in-time com-
puting power, managing data and flow across cloud boundaries is becoming more complex—
particularly when maintaining diverse storage system scales and types. The applications of
the hybrid cloud computing model are expanding rapidly with improved public-private collab-
oration, thinner boundaries, dropping connectivity costs, and efficient computing hardware.
The services that can be delivered from the cloud have expanded past web applications to
include data storage, raw computing, and access to different specialized services.
Composite data deduplication offers companies the opportunity to: dramatically reduce the
amount of storage required for backups; effectively implement security and data protection;
more efficiently centralize backups from multiple sites for assured disaster recovery; and distribute pockets of computing power across hybrid clouds. Technology efforts continue to decouple hardware from software applications so that each resource can be optimally scaled, utilized, and managed.
As with all IT investments, data deduplication must make business sense. At one level, the value is easily established. For instance, adding disk to your backup strategy can provide faster backup and restore performance, as well as RAID-level fault tolerance. With conventional storage technology, however, the number of disks needed for backup is simply too costly. Data deduplication solves that problem because it lets users put 10 to 50 times more backup data on the same amount of disk, while dramatically reducing the backup window and space required. The biggest value proposition, though, comes down to protecting data more effectively at lower cost: a total-cost-of-ownership ROI analysis reveals tremendous economic benefit once you examine data deduplication in depth.
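As a rough sketch of where 10x-50x figures can come from, fixed-size block deduplication fits in a few lines; the 4 KB chunk size and the repeated-full-backup scenario below are assumptions for illustration only:

```python
import hashlib
import os

def dedup_store(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and keep only the unique ones,
    indexed by SHA-256 digest; return the store plus a 'recipe' of
    digests from which the original stream can be rebuilt."""
    store, recipe = {}, []
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)   # store each unique block once
        recipe.append(digest)
    return store, recipe

def restore(store, recipe) -> bytes:
    return b"".join(store[d] for d in recipe)

# Ten nightly "full backups" of the same unchanged 4 MB data set:
full = os.urandom(4 * 1024 * 1024)
store, recipe = dedup_store(full * 10)
logical = len(full) * 10                        # what was "backed up"
physical = sum(len(b) for b in store.values())  # what was actually stored
print(f"dedup ratio: {logical / physical:.0f}x")  # ten identical fulls -> 10x
```

Real products add variable-length segmentation and changed-data handling, which is how ratios climb well beyond the repeat-count of identical fulls.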
This article focuses on applying composite data deduplication techniques to simplify data processing and management within hybrid cloud setups. It also describes how deduplication operates and interacts with file systems, the data flow, backup and restore, pros and cons, and a comparison with and without deduplication. The discussion includes:
• An overview of data deduplication and its features
• Understanding hybrid clouds
• Architecture and applying deduplication to hybrid clouds
• How to simplify hybrid clouds by adopting composite data deduplication
• Analyzing the pros and cons
• The future of data dedupe and its industrial outcome
THIRD-PLACE KNOWLEDGE SHARING ARTICLE
SAN Performance—Getting the Most ‘Bang’ for the Buck
John Bowling, Private Financial Institution
“The SAN is slow.” There is not a single storage administrator who has not heard these four
dreaded words. What does slow actually mean? How did ‘they’ determine the underlying
issue was storage related? Do ‘they’ have the proper tools and information to pinpoint the
root-cause?
As a storage professional, how do you combat the perception that any server or database
related performance issue is directly attributed to the SAN? Do you have the proper tools and
information required to repudiate or corroborate their findings?
This article outlines several key metrics and examples that every storage administrator
should have readily available. Having timely and accurate data is crucial to identify current
and potential bottlenecks within the storage infrastructure.
Although several examples within this document are focused on EMC platforms, many of
the concepts and methodologies can be applied across heterogeneous environments.
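As an illustration of the kind of "readily available" numbers a storage administrator might keep on hand, the headline metrics (IOPS, throughput, average latency) can be derived from two samples of cumulative counters. The counter names here are hypothetical and not tied to any specific EMC tool:

```python
def io_stats(prev: dict, curr: dict, interval_s: float) -> dict:
    """Derive headline SAN metrics from two samples of cumulative
    counters taken `interval_s` seconds apart. Treating busy time per
    I/O as latency is a simplification that ignores concurrency."""
    ios = curr["io_count"] - prev["io_count"]
    mb = (curr["bytes"] - prev["bytes"]) / 1e6
    busy_ms = curr["busy_ms"] - prev["busy_ms"]
    return {
        "iops": ios / interval_s,
        "throughput_mb_s": mb / interval_s,
        "avg_latency_ms": busy_ms / ios if ios else 0.0,
    }

prev = {"io_count": 0, "bytes": 0, "busy_ms": 0}
curr = {"io_count": 60000, "bytes": 480_000_000, "busy_ms": 300_000}
print(io_stats(prev, curr, 60.0))
# {'iops': 1000.0, 'throughput_mb_s': 8.0, 'avg_latency_ms': 5.0}
```

Having deltas like these on hand, per port and per LUN, is what turns "the SAN is slow" into a conversation about specific bottlenecks.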
BEST OF CLOUD
How to Trust the Cloud—“Be Careful Up There”
Paul Brant, EMC
Denis Guyadeen, EMC
Can we trust the cloud? As many organizations look to the cloud as a way to become more
agile and efficient, the impact to Governance, Risk, and Compliance (GRC), security, and
privacy cannot be overlooked.
Cloud computing delivers convenient, on-demand access to shared pools of data, applica-
tions and hardware over the Internet. Cloud computing erases many of the traditional physi-
cal boundaries that help define and protect an organization’s data assets. Some might find
it reassuring to know that information security in the cloud doesn’t deviate dramatically from
ensuring information security in traditional enterprise IT infrastructures. The same require-
ments—threats, policies, and controls, as well as governance and compliance issues—apply.
However, we need to adapt how time-tested security practices and tools are applied, given the inherent differences in how cloud resources can be shared, divided, controlled, and managed. Each cloud provider takes a different approach to security, and no
official security industry-standard has been ratified. Most cloud providers (including Amazon
EC2) do not allow vulnerability scanning; many cloud providers are not forthcoming about
their security architectures and policies. Compliance auditors are wary of the cloud and are
awaiting guidelines on audit testing procedures.
This article discusses how to make cloud security better, faster, cheaper, more efficient, and
less intrusive and to expand the definition of user identity. We will identify the role of trust
relationships between organizations; deciding whom to trust is a requirement for cloud computing. Private versus public clouds as they relate to security are also discussed, as well as how expectations between parties can be formalized and organized through ties of federation. This will hopefully result in not just a safer community of users, but one that interacts
more conveniently, openly and productively. In summary, this article addresses GRC security
and privacy best practices that will align with business value and growth.
BEST OF SKILLS
The Convergence Factor—When IT Service Silos Collide
Ben Dingley, EMC Consulting
With the advent of data center virtualization and cloud computing, a lot has been written
regarding the convergence of storage, server, and networking technologies. The blending of
these core data center technologies into one homogenous mixing pot means IT departments
will be able to supply computing requirements as an on-demand, highly available service.
However, this new computing utopia has put itself on a collision course with the traditional IT support team. Previously, discussions around combining SANs
with data networks using Fibre Channel over Ethernet technology have alluded to how this
may affect the traditional support model. But now with VMware®, Cisco, and EMC amongst
others, combining to virtualize the data center, the scope for IT support consolidation has
rapidly increased and changed almost overnight.
Many organizations have some form of virtual server infrastructure within their data center,
and up until now this has not fundamentally changed the way in which IT departments have
provided support. This is now changing. When we virtualize the data center converging
server, storage, and network technologies, we are also converging the traditional IT support
model. Having dedicated lines of support for each individual technology, each with its own
manager, budget, and agenda can lead to inefficiencies in overall functionality, performance,
and configuration of the newly virtualized data center.
There are examples of the existing IT support model hindering the full potential and limiting
the benefits of what the virtual data center private cloud can achieve. For example, an in-place upgrade of VMware Infrastructure 3 to vSphere™ without proper review and the creation of a strategic roadmap is destined to fall well short of the promised cloud computing benefits.
Critically, unless all of the core components are included as part of a virtual infrastructure
review, the strategic roadmap for creating a virtual data center cannot be defined, and
required changes to the IT support model will not be identified.
This article examines what considerations are needed for changing the way IT departments
restructure their support models for the virtual data center age. New ways of thinking about
how IT service support areas should interact, and key considerations when defining a strate-
gic vision for the IT support of the virtual data center are discussed in depth.
BEST OF CONSOLIDATION
Growing Storage Area Network Consolidation and Migration:
Is it for You?
Sejal Joshi, Large Telecom Company
Ken Guest, Large Telecom Company
For IT managers, consolidation and migration have become among the most routine and challenging facts of life. Technology refreshes, mergers and acquisitions, bandwidth requirements, data classification, availability, ease of access, and cost savings are all major drivers
for consolidation and migration. Consolidating resources is usually a key factor in lowering
total cost of ownership (TCO), as well as simplifying the IT environment, which is as complex
as it gets. Consolidation makes good business sense. We will explain how to simplify the IT
storage area network (SAN) infrastructure and reduce operating costs through consolidation
and migration. It can help your company dramatically reduce high maintenance costs, more
fully utilize IT assets, and improve the quality of services that IT offers to the enterprise.
IT Infrastructure relies on SAN solutions to provide accessibility to critical business data.
The increasing complexity and ever changing requirements of SAN and storage solutions
drives the need to consolidate SAN islands. However, consolidation for consolidation’s sake
13. might not be the best idea. Each IT infrastructure is different, and careful analysis is required
to compare consolidation’s advantages with total costs, including equipment and personnel.
Growth, impact on the disaster recovery process, and impact on IT budgets are other factors
that need to be considered. As the demand for bandwidth grows, so does the reach of net-
works. Users need to access data globally, from multiple locations, quickly and transparently
without breaking the budget. Cost, bandwidth, reach, latency, and ease-of-access are the
driving metrics behind today’s SANs.
Although the terms consolidation and migration are often used to refer to several different
operations, this article focuses on SAN fabric consolidation and migration as the process
of moving connectivity from one fabric (source) to another (destination), as in a technology
refresh or changing a switch vendor. This article discusses the process of consolidating SAN islands; migrating host and storage devices across multi-vendor SAN switch platforms; SAN standards; and several migrations to and from the three major SAN switch vendors (McData, Brocade, and Cisco) still prominent in the SAN arena. Sample migration plans to help simplify the migration/consolidation process are provided as well.
APPLICATION INTEGRATION
Designing Documentum Applications, a Multi-Dimensional
Challenge
Jacob Willig, Lazymen/MatchFirst
Designing one application is relatively easy. Most important is to deliver what the customer
and business requires. They need it for their daily operations and typically provide the bud-
get. Designing another application is just as easy if you target a totally separate environment
to host it in. As soon as you want to host the second application in the same environment,
things become more complex. You must consider extra dimensions of complexity while
designing. The more applications you are going to be building and hosting in the same envi-
ronment, the more complex the situation can become.
These dimensions of complexity do not apply just to IT development. There are very interest-
ing examples outside IT that have many parallels. One of those is the way IKEA is designing
their products.
Simply designing a nice product based on customers’ needs alone is not good enough.
Many benefits will be gained by taking into account more considerations. IKEA designers
have a challenging task, as they need to be responsible for many dimensions of complexity
while designing a new product:
• Retail price
• Appeal
• Durability
• Material
• Transportation
• Reuse of components
• Usefulness
• Ease of assembly by customer
• Environmental impact
Many of the IKEA products are built out of smaller (generic) components that are easier
to produce, ship, replace, and assemble. (See http://designinnovationinstitute.org/
behindikea_trans.html where IKEA designer Knut Hagberg gives a nice example of how
they designed the Mörker lamp.)
Likely, even more considerations apply when designing document management applications on an EMC Documentum® platform. New applications must be supported by a central support
team. They need to be compatible with the platform while simultaneously being maintainable and upgradable without interfering with other applications, and with the least expense and effort possible.
We can categorize most design considerations into separate, correlated dimensions. If one consideration (say, beauty and style) is maximized, it will typically have a negative effect on others (efficient shipping, or ease and cost of production). Many different dimensions can be defined that should all be maximized where possible. This is a significant challenge, as still more dimensions are discovered while improving the quality of a design.
In this article, all dimensions involved when designing new Documentum-based applications
for an existing enterprise-wide Documentum platform are discussed. Topics are focused on
Documentum applications, but will also easily apply to many other IT development contexts.
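The trade-off between correlated design dimensions can be sketched as a weighted scoring exercise; the dimensions, weights, and ratings below are invented for illustration and are not from the article:

```python
# Dimensions and weights are illustrative assumptions (each rating 0-10).
WEIGHTS = {"appeal": 0.3, "durability": 0.2, "shipping": 0.3, "assembly": 0.2}

def score(design: dict) -> float:
    """Weighted sum over design dimensions. Maximizing one dimension
    (e.g. appeal) often drags down a correlated one (e.g. shipping),
    which caps the overall score a single design can reach."""
    return sum(WEIGHTS[dim] * design[dim] for dim in WEIGHTS)

ornate = {"appeal": 10, "durability": 7, "shipping": 3, "assembly": 4}
flatpack = {"appeal": 7, "durability": 7, "shipping": 9, "assembly": 8}
print(f"{score(ornate):.1f} vs {score(flatpack):.1f}")  # 6.1 vs 7.8
```

The same structure applies whether the dimensions are IKEA's shipping and assembly costs or a Documentum application's upgradability and support burden.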
ARCHIVE
Save Costs on E-mail Infrastructure: Obtain Value-added
Benefits of E-mail Retention and Archiving
Satheesh Mc, PMC-Sierra
E-mail has become a primary channel of business communication. It provides organizations
with a fast medium to convey business correspondence such as purchase orders, quotations,
and sales transactions, in virtually any geographical area with the least physical effort possible. The resulting drastic escalation is mainly attributed to the increase in e-mail use, compounded by attachments of up to 20 MB each, and has doubled primary storage requirements in many IT environments.
Various pieces of legislation have been enacted to protect personal privacy, enforce corpo-
rate governance standards, and maintain ethical conduct due to e-mails’ predominance
in the business industry. Examples include The Sarbanes-Oxley Act (SOX), Gramm-Leach
Bliley act (GLBA), and the Freedom of Information Act (FOIA). They require retaining e-mail
for a longer duration due to compliance requirements.
These requirements present us with two problems: ever-increasing primary storage requirements, and how to retain and archive e-mail for compliance purposes. In addition,
corporate policies and regulations require us to save that data for longer periods of time. That
means storage, which translates into additional floor space, power, staff, and complexity.
This article explores a solution using EMC EmailXtender™ and EMC Centera®, enabling orga-
nizations to drastically reduce their primary storage requirements and achieve the added
benefit of e-mail archiving and retention. It explores issues such as cost-effective solutions,
decreasing primary storage requirements, deduplication, intelligent ways of storing data, site
disaster recovery model, and implementing effective e-mail retention and archive policies.
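The retention-policy side of such a solution can be sketched as a simple disposition function; the message classes, retention periods, and 90-day archive threshold below are assumptions for illustration, not legal guidance or EmailXtender behavior:

```python
from datetime import date, timedelta

# Illustrative policy table: retention periods are assumptions only.
RETENTION_YEARS = {"financial": 7, "hr": 5, "general": 2}

def disposition(msg_class: str, received: date, today: date) -> str:
    """Decide whether a message stays on primary storage, moves to the
    archive tier, or becomes eligible for deletion."""
    age = today - received
    limit = timedelta(days=365 * RETENTION_YEARS.get(msg_class, 2))
    if age > limit:
        return "expire"    # retention satisfied; eligible for deletion
    if age > timedelta(days=90):
        return "archive"   # off primary storage, into the archive tier
    return "primary"       # keep on fast primary storage

print(disposition("financial", date(2002, 1, 1), date(2010, 1, 1)))  # expire
```

Moving everything past the 90-day line off primary storage is what produces the cost savings the abstract describes, while the per-class limits satisfy compliance.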
AVAMAR
Are You Ready for ‘AVA’ Generation?
Randeep Singh, HCL Comnet
Backup windows that roll into production hours, network constraints, too much storage under
management, data protection, continuing growth of data per year, the need for extra storage,
and data residing at remote offices are some of the problems in traditional backups. What if
we could solve these issues?
EMC Avamar® software solves these challenges with its advanced deduplication concept that identifies redundant data segments at the source, before transfer across the network. It moves only new, unique, variable-length subfile data segments, reducing daily network bandwidth by up to 500x. In addition to backup to disk, it also works with existing tape and traditional backup software such as EMC NetWorker™. Avamar’s architecture
provides online scalability; the patented RAIN (redundant array of independent nodes) tech-
nology provides high availability.
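The variable-length segmentation idea can be sketched with content-defined chunking and a toy rolling hash; the parameters and code below are illustrative, not Avamar's actual algorithm:

```python
import hashlib
import random

def cdc_chunks(data: bytes, mask: int = 0x3F):
    """Content-defined chunking sketch: cut wherever the low bits of a
    shift-based rolling hash (depending only on roughly the last 32
    bytes) are zero. Because boundaries depend on local content, an
    insertion disturbs only nearby chunks; later ones re-synchronize."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + byte) & 0xFFFFFFFF  # bytes ~32 back shift out
        if (h & mask) == 0:                 # ~1-in-64 boundary probability
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks

def digests(chunks):
    return {hashlib.sha256(c).hexdigest() for c in chunks}

# Insert 7 bytes mid-stream: with fixed-size blocks every later block
# would change; with variable-length segments only chunks near the edit do.
rng = random.Random(42)
data = bytes(rng.randrange(256) for _ in range(20000))
edited = data[:10000] + b"NEWDATA" + data[10000:]
before, after = digests(cdc_chunks(data)), digests(cdc_chunks(edited))
print(f"segments unchanged: {len(before & after) / len(after):.0%}")
```

This resilience to insertions is why variable-length segments deduplicate changed files far better than fixed blocks, and why only the few new segments need to cross the network.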
This article presents the new ‘AVA’ generation over traditional backup and will discuss:
• Rise of dedupe concept
• Introduction of RAIN technology
• Active/passive replication phase
• Integration with traditional backup civilization
• Protection of data in VMware environment
• Architecting Avamar to ensure maximum system availability
• Tuning client cache to optimize backup performance
• Proactive steps to manage health and capacity
• Tape integration
BACKUP AND RECOVERY
Backup and Recovery of Microsoft SharePoint—
A Complete Solution
Douglas Collimore, EMC
Microsoft® SharePoint® has grown from a simple collaboration tool to a highly important
tool at many small and medium size businesses and large enterprises. Many important
custom applications, such as human resources and sales databases, have been migrated
to SharePoint portals, and it is now viewed as a Tier 1 application, limiting the amount of
downtime it can sustain to almost zero. In addition, the ease with which anyone can set up and configure a site makes SharePoint resemble an application virus that pervades many companies’ infrastructure.
The standard backup and recovery paradigm no longer suffices when SharePoint is prevalent
within the data center. A SharePoint farm is supported by many servers, requiring
backup of more than a single application server. Storage needs grow exponentially, extending
the backup window, possibly past the time allowed to complete the backup process. Restore
requirements become more granular as more sites and objects are created by a growing
number of business units. More stringent service-level agreements need to be authored to
maintain service to the business units. Finally, tighter requirements and restrictions must
be in place to support backup and recovery, balancing the infrastructure and administrative
cost of sustaining the application with business results.
In this article, the needs and requirements for a federated backup/recovery solution for
Microsoft SharePoint are addressed.
The requirements include the ability to back up and restore the farm infrastructure, including:
• Web front-end servers (WFE)
• Application server(s)
• SQL servers supporting the content and config databases
• Search and index servers
• Repository servers including Documentum
• Security servers including RSA®
We will also discuss best practices for building your SharePoint farm in support of these
requirements. This section includes information on disk sizing, server sizing, and other
pertinent guidelines. Finally, operating SharePoint in a virtual environment is also discussed
as virtualization adds an entirely new level of complexity and set of operations to the backup
and recovery solution.
EMC PowerSnap Implementation—Challenges and Examples
Hrvoje Crvelin, Orchestra Service
When dealing with modern technologies and data centers, we are told about multiple
approaches based on what we have and what may best suit our needs with respect to
scalability, ease of use, and availability. Nevertheless, we always have some mission-critical
data that resides within databases. Among those there is always a single crown jewel that
tends to be large and demands special attention. Such a database or databases require
protection. They require backup to be prepared for restore. The nightmare scenario of
conducting disaster recovery of your large database is sometimes what we fear the most. This
is why we prepare for such a scenario, no matter how unlikely it might be.
I started my career implementing small systems. These have grown into more complex
environments, and the peak of my experience has been working at data centers
around the globe, implementing backup solutions based on the EMC NetWorker family.
While there is no single or unique approach to this subject, from a high level point of view we
always have our large database(s) which require fast backup and no load during this opera-
tion. Having a fast restore is, of course, another requirement—business today is based on
24x7 availability and every second matters. Modern network designs are full of VLANs build-
ing isolated islands in network forests imposing rather challenging tasks for the architects
and those leading an implementation. Is there a simple answer and approach to this subject?
It may depend on your components, but overall the answer is yes!
This article describes how to implement EMC PowerSnap™ to achieve the above mentioned
goals. Our task will be to protect SAP with an Oracle database with 15 TB storage. Our net-
work infrastructure will be isolated and we will see how to address this problem. Our primary
storage will be EMC Symmetrix® while backup to tape will be achieved using EMC EDL 4406
system. Our backup and restore strategy is based on the EMC NetWorker family of products.
You will read about the details and challenges you might face as well as get a glimpse into
the future.
Enhancing Throughput with Multi-Homed Feature
in NetWorker Storage Node
Gururaj Kulkarni, EMC
Anand Subramanian, EMC
EMC NetWorker is an enterprise-class backup and recovery solution. It is three-tiered soft-
ware with components including the NetWorker client (hosts the data to be backed up), the
NetWorker server (coordinates the entire backup/recovery process, tracks the metadata) and
the NetWorker storage node (connects to diverse storage devices and writes/reads data).
The NetWorker storage node is a key player during the backup and recovery of data. It must
have tremendous power to transfer data from and to storage devices, since the rate at which
data is backed up and retrieved determines the overall speed of the backup-recovery applica-
tion. In most enterprise backup infrastructures, the storage node is a powerful server with
superior system configuration. With most high-end machines having multiple NIC ports, it is
worthwhile to use the power of more than one NIC and improve overall system throughput.
In any backup infrastructure, I/O operations are the most expensive. With a multi-homed
NetWorker storage node, the system’s bus speed, CPU speed, and disk throughput are
used to their maximum capabilities, providing improved performance throughput.
Another important outcome is the reduced backup window realized by the improved net
throughput, thereby meeting the recovery-time objective (RTO).
The following are some of the important components that play a major role during data
transfer on storage nodes:
• CPU speed
• Number of CPUs
• Disk utilization
• System bus speed
• Network bandwidth
• Physical memory on the storage node
• Fibre Channel (FC) connectivity
The response time on storage nodes is impacted if the load on any of these underlying
components exceeds its manufacturer-rated capacity. For example, if the incoming or
outgoing data exceeds the rated Gigabit Ethernet capacity, there will be contention for read
or write operations. This will result in lower device throughput. In such cases, it is necessary to have a load
sharing mechanism to improve the data transfer rate. Configuring multiple NICs is one solu-
tion to this problem.
Industry standard benchmarks have demonstrated that with a GigE network, you can achieve
up to 90 percent NIC utilization, which is approximately 90–110 MB/s. With a dual NIC setup
on the storage node, you can expect up to 180 percent of a single NIC’s effective throughput,
roughly doubling the data transfer rate. This will increase the backup speed and reduce the backup window.
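As a rough sanity check on these figures, the arithmetic can be worked out directly. The 90 percent utilization value follows the benchmark numbers above; the 4 TB backup size is a hypothetical example, and real throughput depends on the environment.

```python
# Rough single- vs. dual-NIC throughput and backup-window arithmetic.
LINE_RATE_MB_S = 125.0            # 1 Gb/s divided by 8 bits per byte
UTILIZATION = 0.90                # ~90% achievable per GigE NIC

single_nic = LINE_RATE_MB_S * UTILIZATION     # 112.5 MB/s sustained
dual_nic = 2 * single_nic                     # 225.0 MB/s, best case

def backup_hours(size_gb: float, throughput_mb_s: float) -> float:
    """Hours needed to move size_gb at the given sustained throughput."""
    return size_gb * 1024 / throughput_mb_s / 3600

# A hypothetical 4 TB nightly backup:
# about 10.4 hours over one NIC, about 5.2 hours over two.
```

The best-case doubling assumes the bus, CPU, and disks can feed both NICs; any of those can become the next bottleneck.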
This article describes the detailed steps for configuring more than one NIC port on the
storage node to transfer data. We also present recommendations and best practices when
configuring a NetWorker storage node with multiple NICs. By following these simple steps
and guidelines, you can achieve increased backup and recovery throughput to help reduce
the backup window size.
Scheduling Operations in NetWorker
Aaron Kleinsmith, EMC
As an EMC instructor, I meet many NetWorker administrators in my education classes every
week and find that many of them struggle to find the correct configuration to meet their orga-
nization’s backup requirements. There are strategies used by professional services and cus-
tomers to work around limitations related to scheduling, but I have not seen many of these
documented for general consumption.
For example, NetWorker uses a Group setting to define a start time for a backup of one or
more backup clients. A typical Group will start a backup of these backup clients once a day;
that was a typical requirement when NetWorker was first designed 20 years ago. More
recently, customers are required to back up data either more frequently (every hour) or less
frequently (once a month). New requirements such as Sarbanes-Oxley in the United States
and similar technology-driven laws in other countries are requiring reconfiguration of existing
backup environments as well. NetWorker was not originally designed to handle some of these
requirements.
The type of data and the requirements for protecting that data have changed greatly over
the years as well. The addition of new media types for storing backup data, such as disk and
virtual tape libraries, has also changed the way NetWorker administrators schedule backups.
Cloning and staging operations are used more frequently since the number of devices avail-
able for these operations has increased.
This article describes different ways to schedule operations in NetWorker such as backups,
cloning, and staging operations. It also includes alternative methods and best practices for
scheduling backups, and cloning and staging operations in NetWorker that are used to meet
specific business requirements.
Backup administrators who are required to configure and monitor backups, cloning
operations, and staging operations through NetWorker are the target audience for this article.
The following topics are discussed:
• Overview and explanation of Groups, Schedules, and Directives in NetWorker
– Different ways to schedule a Group
– Different ways of using Schedules
– How to use Directives effectively
– Using Groups, Schedules, and Directives for database backups
– Using Groups to schedule bootstrap and index backups
• Overview and explanation of cloning and staging in NetWorker
– How cloning and staging can be used with various device types
– Different ways to schedule and run cloning and staging operations
– Examples of using Groups, Schedules, and Cloning configurations to meet different
backup and recovery business requirements
The Union of Snapshot-Based Backup Technology
and Data Deduplication
Chris Mavromatis, EMC
Organizations are finding it an increasingly daunting task to meet consistently shrinking
backup windows while contending with the current explosive data growth rates requiring
daily protection.
EMC is focused on assisting customers to meet this challenge by providing solution-based
services that transform their infrastructure, delivering seamless collaboration among various
types of hardware and software components and a superior backup solution.
One such solution-based delivery is the integration of EMC technologies:
• NetWorker
• Avamar
• PowerSnap
• Symmetrix
• EMC CLARiiON®
This article illustrates the integration of EMC’s key technologies that provide customers
with the ability to create instant backups of large amounts of data by leveraging the
mature dependability of disk arrays such as Symmetrix or CLARiiON®. Simultaneously,
using an enterprise-level backup platform, we offer the ability to store only unique data
(deduplication) across the data zone.
This article outlines installation, configuration, pitfalls, and best practices for this solution.
It will empower Sales, SEs, Support, and customers to better leverage EMC’s robust product
offerings.
BUSINESS CONTINUITY
EMC Business Continuity Offerings: Another Differentiator
of Midrange and High-end Storage Arrays
Denis Serov, EMC
EMC storage array technologies develop very quickly. In particular, disk arrays’ scalability
and performance are doubling every two years. This is true for both midrange and high-end
storage systems. Array performance grows constantly year over year. Flash drives break
performance barriers for midrange arrays. Storage systems that were considered high-end
several years ago are now less than half the size of today’s midrange systems. Some
time ago, array size was a primary differentiator, but now array size is less important for
medium size companies, who themselves contribute a major part of EMC’s business.
If a CLARiiON array scales to 960 drives and Symmetrix scales to 2,400 drives, does it matter
to a company who is considering purchasing 300 drives? Isn’t it obvious that CLARiiON, as
the less expensive choice, should be selected? More and more EMC customers want to have
a clear understanding about how the two array capabilities are differentiated. There are a
number of major differences between CLARiiON and Symmetrix. Scalability, tiered storage
functionality, fault tolerance, performance, availability, security, data mobility, energy
efficiency, and product lifecycle are included in this list. There is, however, another extremely
important practical differentiator. Business continuity capability differences are usually
shadowed by other, more spectacular differentiators. While, at first glance, both storage arrays offer
similar functionality (local and remote replication), these functionalities differ in important
practical details that must be well understood during the array selection process.
This article reviews practical differences for both local site protection and disaster recovery of
two arrays, and shows how Symmetrix is the better choice for business continuity of critical
applications.
BUSINESS PROCESS
Revisiting Business Information Management Re-Engineering:
A Never Ending Cycle
Eugene Demigillo, EMC
In a past article, I discussed establishing a Business Information Management Re-Engineering
(BIMR) strategy. BIMR is a term I coined for a strategy/methodology
to manage information infrastructures. It aims to establish a formal committee that
involves the customer, technology, and application vendor, and information owners who
define the architecture, design, and management of the information infrastructure.
The establishment of BIMR is not only meant for implementing a new information storage,
protection, or security solution. It can begin with implementing a new infrastructure, but it is
a living strategy that should be regularly reviewed and re-engineered as and when the business
information ecosystem changes. As a business thrives or contracts, the requirements to
store, protect, and secure data continuously change. Thus, information management needs
to evolve to remain aligned with the business.
More often than not, it is left to the customer to establish their information management
strategies after an infrastructure or solution is implemented. Where technology vendors
ensure a successful implementation, failure sometimes occurs in the post implementation
and operation. What causes these failures? They can be attributed to hardware or application
failure, environmental failure, or human error. But how do we account for the changes in the
business: corporate restructuring, business re-alignment, and business relocation? There
are a number of business evolutions that the information infrastructure should be able to
support and where we should employ appropriate management strategies.
Topics in this article include:
• Revisiting the BIMR methodology
• Elements of the BIMR strategy
• BIMR strategy as a unique approach
CLARiiON
Best Practices for Designing, Deploying, and Administering
SAN using EMC CLARiiON Storage Systems
Anuj Sharma, EMC
If your Microsoft Windows® or UNIX networks are expanding to keep pace with a growing
business, you need more than simple, server-based information storage solutions. You need
an enterprise-capable, fault-tolerant, high-availability solution. EMC CLARiiON storage sys-
tems are the answer. They employ the industry’s most extensive set of data integrity and data
availability features, such as dual-active storage processors, mirrored write caching, data ver-
ification, and fault-recovery algorithms. CLARiiON systems support complex cluster configura-
tions, including Oracle and Microsoft software. They have been recognized as the most
robust and innovative in the industry.
Features of CLARiiON systems include:
• Flash drives
• EMC UltraFlex™ technology
• Fibre Channel/iSCSI connectivity
• Virtual provisioning
• Tiered storage
• Virtualization-aware management
• Virtual LUN technology
The performance and flexibility of CLARiiON systems are backed by superior availability, and
here is where CLARiiON really shines. Its design has no single point of failure, all the way
down to the fans and power cords.
This article includes the practices that I propose should be followed from the initial solution
design to implementation and then to administration of EMC CLARiiON. This will help you
use resources efficiently and get optimal performance out of the CLARiiON.
This article will benefit anyone who implements, manages, or administers a SAN using
EMC CLARiiON systems.
There are many important things that you need to consider before you design a SAN. You
need to know how the components fit together in order to choose a SAN design that will
work for you. Like most storage managers, you’ll want to design your SAN to fit today’s stor-
age needs as well as tomorrow’s increased storage capacity. Aside from being scalable, the
ideal SAN should also be designed for resiliency and high availability with the least amount
of latency. This article addresses most of the aspects of designing a SAN using CLARiiON
systems.
This article touches upon the following topics:
• Initial SAN solution design (i.e., calculating required storage capacity, desired IOPS,
tiering the disk types)
• SAN topologies to be considered for different SAN deployments
• Zoning best practices
• Practices while implementing the SAN (i.e., creation of RAID groups, types of RAID groups,
and LUN sizes)
• Using thin provisioning
• How thin provisioning offers utilization and capacity benefits
• Tuning the SAN according to the application that will access the storage
(i.e., OLTP applications, Exchange)
• Best practices for implementing EMC MirrorView™ and EMC SnapView™
• Disaster recovery site readiness best practices
• VMware ESX® server using EMC CLARiiON systems
How to Shrink a LUN or MetaLUN with CLARiiON CX4
Markus Langenberg, Fujitsu Technology Solutions
There are two problems with oversized LUNs at customer sites. The customer has:
1. Oversized a LUN and wants to reduce the physical space used
2. Created an oversized MetaLUN and wants to release consumed space
The formatted space cannot be easily reclaimed if the LUN is already in use and bound
inside the operating system. Normally, data has to be backed up, the LUN unbound, a new
LUN bound, and the data restored. All of this must be done offline.
With the new FLARE R29 revision, it is possible to shrink LUNs that are mapped to a host.
The first step is to shrink the host accessible volume space. The underlying storage will be
freed by reducing the capacity of a FLARE LUN. This is managed by using operating system
and storage APIs, such as the Virtual Disk Service (VDS) provider to detect the size of capacity
reduction, to ensure no host I/O is present before and during the shrink operation, and to
ensure that no used host data is deleted.
This article guides you step-by-step through the shrinking process and explains the require-
ments and restrictions.
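The ordering described above (ensure no host I/O, shrink the host-visible volume first, then free the FLARE LUN capacity) can be sketched as follows. The class and method names are invented for illustration and do not correspond to a real VDS or FLARE API.

```python
from dataclasses import dataclass, field

@dataclass
class Lun:
    """Minimal stand-in for a host-mapped LUN; purely illustrative."""
    size_gb: int
    used_gb: int
    log: list = field(default_factory=list)

    def shrink(self, new_size_gb: int) -> None:
        """Online shrink ordering: validate, quiesce host I/O,
        shrink the host volume first, then the FLARE LUN."""
        if new_size_gb >= self.size_gb:
            raise ValueError("shrink only: new size must be smaller")
        if self.used_gb > new_size_gb:
            raise ValueError("refusing to delete in-use host data")
        self.log.append("quiesce host I/O")    # ensure no I/O during shrink
        self.log.append("shrink host volume")  # step 1: host-visible space
        self.size_gb = new_size_gb             # step 2: free array capacity
        self.log.append("shrink FLARE LUN")

lun = Lun(size_gb=500, used_gb=200)
lun.shrink(300)   # succeeds: 200 GB in use fits within 300 GB
```

The two validation checks mirror the safety guarantees the abstract attributes to the VDS provider: the operation refuses to run while it would destroy in-use host data.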
Performance—Feeling or Reality in a CLARiiON Environment
Octavio Palomino, EMC
Javier Alvarez Dasnoy, EMC
Performance is a topic often discussed among professionals, and it allows a variety of
interpretations. Given this dispersion of information, it is worth summarizing what we mean
by performance and why it is important to analyze it at the implementation stage, not
only when the customer “feels” that there is a performance problem. We begin our work by
analyzing what we hear and say when we talk about performance.
What we hear when we talk about performance
At this point we try to summarize the various comments offered by the customer (the people
to whom the storage is providing services) when identifying or “feeling” a performance
problem. We must hear what the customer says and interpret the data that the storage
system provides. We then continue with the analysis, but from the point of view of the
storage administrator.
What we say when we talk about performance
Here we will develop different performance analyses from the storage administrator’s point
of view. We develop different conversations with the customer when the problem has already
occurred. The analysis will identify either a random problem (one that occurred and is being
analyzed after the fact) or a specific problem (one that is currently occurring). Instead of trying to relate the problem
to another cause, we dive into the problem as if it really existed within our environment and
was hiding. We develop a conclusion only after gathering enough information.
As noted above, we believe performance should be analyzed first at the implementation
stage. Therefore, we analyze how to implement a CLARiiON with the customer.
Implementing a CLARiiON with the customer
There was an implementation before every problem. At this point, we perform an analysis
when implementing a CLARiiON. What kind of operating environment will you receive? Will it
be a new environment? Is it possible to collect data that can be used to compare and predict
performance behavior?
By collecting this information we will present the tables we use to calculate the spindles
and project the performance behavior of the CLARiiON according to the customers’ needs.
We implement what the customer has, not necessarily the ideal for his environment. In the
implementation stage, we can identify behaviors that help us to “set” customer
expectations.
What information we request and how we read it
It is important to collect data prior to the implementation since it will help us understand the
behavior of applications and implement accordingly.
What solutions are available and what is the scope
There is no magic. The spindles of the disks support a finite number of IOPS (we will describe
them), and distributions and protections have their costs (in terms of performance and avail-
ability). At this point we will discuss these options and will evaluate the advantages and dis-
advantages of each alternative.
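To make the finite-IOPS point concrete, here is the kind of back-of-the-envelope calculation such tables encode. The per-spindle IOPS figures and RAID write penalties below are common rules of thumb, not vendor-certified numbers.

```python
# Rule-of-thumb per-spindle IOPS and RAID write penalties (illustrative).
SPINDLE_IOPS = {"15k_fc": 180, "10k_fc": 140, "7.2k_sata": 80}
WRITE_PENALTY = {"raid10": 2, "raid5": 4, "raid6": 6}

def usable_iops(disk: str, n_disks: int, raid: str, read_pct: float) -> float:
    """Host-visible IOPS once the RAID write penalty is accounted for."""
    raw = SPINDLE_IOPS[disk] * n_disks          # back-end spindle capability
    write_pct = 1.0 - read_pct
    return raw / (read_pct + write_pct * WRITE_PENALTY[raid])

# Ten 15k FC spindles in RAID 5 at a 70/30 read/write mix:
# 1,800 raw back-end IOPS support roughly 947 host IOPS.
iops = usable_iops("15k_fc", 10, "raid5", 0.70)
```

This is the "distributions and protections have their costs" trade-off in numbers: the same ten spindles in RAID 10 support more host IOPS at the price of capacity.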
Tips
We will outline some tips that we use in our implementations that you may find useful.
We believe our work will present a method of analysis that could help professionals in a sub-
ject as diverse as performance. It is inevitable that everyone has a different “feeling” about
performance, but we also believe that while the performance analysis is not an exact science,
there is information that, if collected and organized properly, enables us to better understand
the platform and enables the customer to grow with us once we achieve that knowledge.
Our profession requires that we learn something new every day. However, we believe it
is important to take the time to analyze what we have seen many times in a different way.
We hope that this article will be of value to this community of knowledge.
CLOUD
Managing the Cloud—Out of the Fog
Brian Dehn, EMC
Cloud computing, private cloud, virtualization, federation, and abstraction are today’s
buzzwords in the IT industry. IT has begun its journey toward the cloud, and the movement
is gaining momentum.
Virtualization is a major step on the journey toward the cloud and is being introduced at all
levels of the IT “stack”—application, server, network, and storage—at an increasingly fast
pace. The journey toward the cloud commonly begins with the virtualization of existing data
centers. In fact, based on EMC’s definition, the internal cloud is just another name for the
fully virtualized data center (refer to Chuck Hollis’ blog). The combination of the internal
cloud and the external cloud (i.e., services and resources outside the virtualized data center)
comprises the private cloud (also visit www.privatecloud.com).
One of the challenges with clouds is that they are a “foggy” concept—intangible, hard to
corral, difficult to track, and ever changing. These characteristics result in an environment
that is more difficult to manage than ever before. Virtualization complicates the matter
because you can no longer walk up to an IT resource, reach out and touch it, and perform
whatever critical tasks are required to keep that resource in optimal shape.
In reality, automation (a key component of the virtualized data center) probably already
relocated that virtualized resource, possibly to a completely different area of the world.
Throw abstraction and federation into the mix, and the fog becomes as thick as pea soup.
The bottom line is: as the level of virtualization grows, the cloud becomes more difficult
to visualize and the ability to manage the cloud becomes all the more vital.
Adequate management solutions are frequently an afterthought occurring far into the jour-
ney. It is not uncommon for IT departments to evaluate and implement virtualization and
automation solutions with little or no thought as to how they will manage them. The expan-
sion of virtualization in a data center usually causes the need for management solutions to
become more obvious, usually as a result of painful incidents and problems. Lack of fore-
thought regarding a management solution frequently produces a hodgepodge of tools, simi-
lar to a collection of pieces from different jigsaw puzzles, from which you attempt in vain to
create a single picture. These tools may each be sufficient for managing specific IT resources,
but do not integrate with each other, provide a line of sight through the fog of the virtualized
data center, or allow management of IT from the perspective of the business or of the ser-
vices provided to end-user customers.
This article was written to emphasize the importance of a cohesive management solution in
a cloud environment. The following topics are discussed:
• An overview of cloud computing
• The nature of cloud management
• Challenges associated with managing the cloud
• Recommendations for choosing a cloud management solution set
The goal of this article is to help you understand why cloud management is critical and to
help you get “out of the fog.”
CONTENT MANAGEMENT
Building a Knowledge Management System with Documentum
Kaiyin Hu, EMC
Hongfeng Sun, PATAC
This article describes best practices to architect and build a knowledge management system
with the Documentum platform. It is derived from a well-known automotive customer, but it
can be applied throughout other industries.
Knowledge has already become a key competitive advantage for large companies. There are
urgent requirements to build systems that manage these assets and make them easy to
create, share, identify, consolidate, and retain.
Let’s take a real example. In a knowledge management system, any document can be gener-
ated by collaboration among colleagues or different lines of business. The document goes
through a recommendation process. When reviewed and approved by knowledge experts
in specific areas, it becomes published knowledge that can be shared within the enterprise.
Comments or questions and answers can be attached to the published knowledge to assist
other workers needing this relevant information.
The most exciting point is that workers may be inspired by the knowledge in the
system, pursue it further, and turn the knowledge into a company patent or other
intellectual property. It also supports the company’s culture by encouraging teamwork.
This article discusses the four functional models needed for knowledge management
systems:
• Knowledge Center: to create, review, and publish knowledge
• Knowledge Set: to organize the knowledge from several different views
• Expert Base: to communicate with experts online and to perform knowledge evaluations,
for example, assign a grade or add comments and recommendations, reading materials, etc.
• Personal Space: to manage draft-state knowledge
Documentum Disaster Recovery Implementation
Narsingarao Miriyala, EMC
In most of my enterprise engagements, clients ask about EMC recommendations for
Documentum application disaster recovery (DR). I have not found any document which
talks in-depth about different models for implementing DR for Documentum.
This article explains the infrastructure requirements for successfully recovering a Documentum
application at a DR site. It also explains a few DR models with Symmetrix, CLARiiON, and
Centera storage and replication software such as EMC SRDF® and IP-Replicator. How clients
can have homogeneous storage systems and still implement DR solutions for Documentum
is also explained.
EMC Documentum Single Sign-on using Standard Tools
Sander Hendriks, Informed Consulting
Single sign-on is every system user’s dream. Start up your PC in the morning and never worry
about your password for the rest of the day. All the applications will be informed of your
identity securely, so they can authorize you for the data and functionality you need. Everyone
wins: the users, the applications, and the network security person.
The reason why Documentum applications in most organizations still display a login screen
when the browser is started is that the technical implementation of single sign-on can be
difficult.
Documentum applications are usually implemented as web applications, so there are three
main parts in the architecture:
• The user’s browser
• The application server
• The content server
The difficulty is in getting the user’s identity all the way from the user’s PC to the content
server in a secure way, leaving no opportunity for impersonation or other abuse.
This article describes ways of achieving single sign-on for Documentum applications using:
• A specialized product, such as RSA, or CA SiteMinder
• A third-party Kerberos implementation, offered by Solfit
• Kerberos and a trust relation between the application and the content server
• Kerberos all the way to the content server
To select which approach is best for you, we will discuss several design choices:
• When to use a specialized product
• What can be configured and customized in the application server
• Content server authentication: using trust or an authentication plug-in
Since the single sign-on offerings by RSA, CA, and Solfit are extensively documented, the
rest of the article focuses on the custom Kerberos solution. It shows that several viable
implementation options exist and offers pointers and best practices.
This article is helpful for anyone facing single sign-on implementation for a Documentum
system.
Enterprise Content Management:
Yesterday’s Lessons and Today’s Challenges
Himakara Pieris, Daedal Consulting
In less than a decade, enterprise content management (ECM) has grown from a vision to real-
ity. It is safe to say that any large organization owns and operates content management sys-
tems at some level. The industry has overcome phenomenal challenges during this period of
growth and adoption; a process primarily influenced by regulatory measures, business value,
cost savings, and business continuity efforts. Key events of the last decade have shaped the
way we live, travel, and conduct business. This is also true for enterprise content management.
The impact of events such as the Enron scandal has profoundly altered the course of
ECM, and recent and future events such as the mortgage crisis will continue to shape the
way we do business and how we manage content.
There are many past lessons: industry and our customers have grown to recognize the value
of open platforms; and they have learned that picking the right product is not all that matters.
All of us have come to appreciate the value of using extensible frameworks instead of appli-
cation silos with similar core operations. We have learned what roles constitute an ECM
implementation team and what skills to look for. Implementers have realized that an "all IT"
team would not deliver the most usable system; in other words, they have learned the importance
of cross-functional representation during all stages of an implementation. We have learned which
implementation methodologies work and which will not. We have come to appreciate the
regulatory, legal, and business implications of ECM systems.
The road ahead for content management is nevertheless challenging. The practice of ECM has
become an integral part of business. ECM systems have to be scalable, conform to standards,
and interoperate with a host of other systems—internal and external. The future signals
standardization, structured content initiatives, knowledge mining, and more distributed systems.
As content management practitioners and implementers, we must understand the core les-
sons we have learned and use that knowledge to build better systems. This article estab-
lishes key lessons from the past and describes how not applying them could result in project
failures. For example, AIIM's "State of the ECM Industry 2009" ranks "underestimated
process and organizational issues" at the top of its implementation issues list. This situation
occurs due to lack of cross-functional representation during the project cycle and not retain-
ing the correct mix of skills. The importance of a cross-functional project team is one of the
lessons discussed in this article.
While we are satisfied with how much ECM has progressed over the years, it is important to
prepare for the challenges that lie ahead. This article discusses some key challenges: scalability,
standardization, interoperability, structured content initiatives, knowledge mining, and
distributed content management systems. The industry's response to these challenges, such
as Content Management Interoperability Services (CMIS) and the ECM maturity model, is also
analyzed.
This article’s goal is to spark a conversation about the lessons and challenges faced by con-
tent management practitioners and users.
Integrating SharePoint and Documentum
Rami A Al Ghanim, Saudi Aramco
Sara AlMansour, Saudi Aramco
This article explains how Saudi Aramco integrated Microsoft Office SharePoint 2007 with
EMC Documentum using EMC Documentum Repository Services for Microsoft SharePoint
(EDRSMS). This enabled us to create a single point of access to our Knowledge Sharing
system and our Content Management system using SharePoint 2007 as the user-friendly
interface to the robust document management features of EMC Documentum. The
integration enabled us to eliminate redundancy, reduce SQL Server bloat, and capitalize
on Documentum’s Records Management features.
This article offers a brief description of Saudi Aramco, our business needs and requirements,
what EDRSMS is, and why we picked it. It also describes our EMC Documentum and
SharePoint environments before and after implementing the integration, and illustrates
the flow of documents before and after EDRSMS. This topic is relevant to the broader
community of EMC Proven Professionals because it addresses many issues and concerns
that we face in the ECM world today. Most EMC Proven Professionals work in companies that have
both SharePoint and Documentum, or have Documentum and see a need for SharePoint.
This topic will benefit them by explaining why and how our company integrated the two
products. The article is relevant to anyone who currently has or is considering SharePoint,
has both SharePoint and Documentum, or is considering or has already completed an
integration of the two.
You will learn why Saudi Aramco implemented a Knowledge Sharing system and a Content
Management system, what issues we faced by having the two systems separated, why we
needed to integrate them, and the steps we took—from creating the project team to rolling
out the integrated system to production. You will see the whole journey Saudi Aramco took
starting from the history of the Knowledge Sharing system, to our current position after inte-
grating the two systems, and finally our future vision for this integration.
Media Asset Management in the Broadcasting Industry
using Documentum
Derrick Lau, Contractor
Media asset management consists of management processes and decisions surrounding
the ingestion, annotation, cataloguing, storage, retrieval, and distribution of audio or video
assets. These can include small web-sized MP3 files or QuickTime video, as well as large
MPEG files and high-quality Grass Valley video files.
Media asset management is relevant to television broadcasters because they traditionally
store vast amounts of media data in the form of commercial spots, secondary event media,
television episodes, and feature films. These media assets need to be catalogued, reviewed,
and eventually played on air. They also need to be accessible from different systems, as
many different users within the broadcaster's operations must ingest the media and input
its metadata, as well as prepare cue sheets for each media asset to ascertain when
to play which segment.
Traditionally, Documentum has been used as an enterprise document management system,
but with its rich media product suite, it has expanded into the media and publishing indus-
tries, and can also be integrated into broadcasting environments. This opens up a whole new
market for Documentum contractors, and in particular those with EMC Proven Professional
certification, to implement Documentum and integrate it with other broadcasting systems
to provide competitive advantage to broadcasters.
This article provides a high-level methodology on how Documentum can fulfill some media
asset management requirements specific to broadcasters. The goal is to describe these pos-
sible solutions as suggested design patterns, and present them in layman’s terms for use in
potential discussions with broadcasters. These solutions will be drawn from personal
experience and not from other sources. This article does not provide a complete media asset
management solution, but rather outlines the basics so you can expand upon the solutions
to meet unique requirements.
This article discusses major solutions that will be of interest to EMC Proven Professionals
seeking to enter the broadcast market. They include:
• Designing a practical object model to catalogue the media
• Designing the repository taxonomy
• Archiving media assets
• Integrating with broadcast trafficking systems
• Integrating with broadcast automation systems
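As an illustration of the first two of these solutions, a catalogue's object model can be sketched as a small type hierarchy. The sketch below is an illustration in Python with hypothetical type names and attributes; in a real repository these would be custom Documentum types under dm_document, not the classes shown here:

```python
from dataclasses import dataclass, field

# Hypothetical base type for any catalogued media asset; in a Documentum
# repository this would be a custom type with dm_document as its supertype.
@dataclass
class MediaAsset:
    title: str
    file_format: str            # e.g. "mp3", "mov", "mpeg2"
    duration_seconds: int
    keywords: list[str] = field(default_factory=list)

# Broadcast-specific subtypes refine the base with the attributes that
# trafficking and automation systems key on.
@dataclass
class CommercialSpot(MediaAsset):
    advertiser: str = ""
    house_number: str = ""      # ID used by the trafficking system

@dataclass
class Episode(MediaAsset):
    series: str = ""
    episode_number: int = 0
    # Cue points as (offset in seconds, segment label) pairs for cue sheets.
    cue_points: list[tuple[int, str]] = field(default_factory=list)

spot = CommercialSpot("Spring Sale 30s", "mpeg2", 30,
                      advertiser="Acme", house_number="ACM-0042")
```

Carrying system identifiers such as a trafficking-system house number on the subtype is what makes the later integration steps tractable, since downstream systems can look assets up by the keys they already use.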
Finally, some user interface issues encountered in the broadcast environment are briefly
described, as well as what Documentum customizations were quickly provided to compen-
sate for them. This will essentially be a catalogue of quick wins to help gain user acceptance
of media asset management.
Although there are plenty of broadcasters using Documentum as an enterprise document
management system, this article focuses solely on its media asset management potential.
It is not assumed that the reader has Documentum technical knowledge. After reading this
article, you should have a high-level understanding of how to present Documentum to
broadcasters as a solution for their media asset management needs, and be able to list the
high-level steps required to integrate Documentum into their technical environment. This is
important to both the reader and the television broadcaster in understanding Documentum’s
full potential and ensuring that the broadcaster gets the full value of a Documentum
implementation. Finally, this article provides a launching point for EMC Proven Professionals
in the hardware space for structuring a solutions package that combines EMC hardware and
software to meet broadcast requirements effectively.
Understanding In-Depth Documentum Security
Narsingarao Miriyala, EMC
During the last nine years with EMC, I have done more than 40 Documentum repository
deployments. Some were complex and required presenting the architecture to clients’
infrastructure and application security teams.
The most common questions from client Information Security Officers are:
• How does Documentum secure the data?
• How does Documentum encrypt the content on disk?
• How does Documentum ensure classified content is not accessed by system
administrators?
• What type of encryption do you use in the repository and what is the algorithm?
• How does Documentum encrypt data communication?
Compiling the required information to answer these questions was a challenge. This article:
• Explains Documentum’s built-in security features, certificates, encryption keys, security
algorithms, and the data encryption mechanism
• Reviews how content on storage gets encrypted with Trusted Content Services
• Covers how to encrypt the transmission from end-to-end when the user tries to access
content from the repository
• Reviews how to encrypt content on the remote BOCS servers
• Discusses auditing and how a client can mitigate the risk of rogue Documentum system
administrators gaining unauthorized access to the data
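One general technique behind the last point, making an audit trail tamper-evident so that even a privileged administrator cannot silently alter it, is a keyed hash chain. The sketch below illustrates the principle only; it is not Documentum's actual audit mechanism, and the key and event fields are invented:

```python
import hashlib
import hmac
import json

# Hypothetical signing key; in practice it would be held by a party
# outside the administrators' control (e.g. an HSM or external service).
SECRET = b"key-held-outside-admin-control"

def append_event(log, event):
    """Append an audit event, chaining each entry's MAC to the previous one."""
    prev = log[-1]["mac"] if log else ""
    payload = json.dumps(event, sort_keys=True) + prev
    mac = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    log.append({"event": event, "mac": mac})

def verify(log):
    """Recompute the chain; any edited or deleted entry breaks verification."""
    prev = ""
    for entry in log:
        payload = json.dumps(entry["event"], sort_keys=True) + prev
        if hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest() != entry["mac"]:
            return False
        prev = entry["mac"]
    return True

log = []
append_event(log, {"user": "jsmith", "action": "read", "object": "contract.doc"})
append_event(log, {"user": "admin", "action": "acl_change", "object": "contract.doc"})
```

Because each entry's MAC covers the previous entry's MAC, rewriting or removing any record invalidates every record after it, so tampering cannot go unnoticed by anyone who can run the verification.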
WCM with DITA and Documentum
Jaimala D. Bondre
Most large organizations use some form of web content management (WCM), enabling them
to publish large volumes of diverse data to the web. Publishing is just one part of the WCM
spectrum, which must coexist with effective and easy content authoring. Content in large
organizations grows rapidly, creating a need for a complete and structured publishing
solution.
Darwin Information Typing Architecture (DITA) takes a very structured approach toward con-
tent authoring, enabling content to be produced in a standard way in different formats. This
article combines Documentum's content management, workflow, and publishing capabilities
with the widely adopted DITA standard for content authoring. It walks through the evolution
of unstructured data into structured XML data, retaining reusable DITA structures within
Documentum. It also reviews transforming this data, passing it through a workflow for
review, and finally publishing it to form a complete and effective end-to-end structured
publishing solution.
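For readers unfamiliar with DITA, a topic is simply structured XML with a fixed vocabulary (concept, title, conbody, and so on). The minimal sketch below builds one such topic with Python's standard library; the id and text are invented for illustration:

```python
import xml.etree.ElementTree as ET

# Build a minimal DITA concept topic in memory. The element names
# (concept, title, conbody, p) come from the DITA vocabulary; the
# id attribute and content are illustrative.
topic = ET.Element("concept", id="publishing_overview")
ET.SubElement(topic, "title").text = "Publishing Overview"
body = ET.SubElement(topic, "conbody")
ET.SubElement(body, "p").text = (
    "Authored once in DITA, this topic can be transformed and "
    "published to multiple output formats."
)

xml_text = ET.tostring(topic, encoding="unicode")
```

Because each topic is a self-contained, predictably structured XML object, it maps naturally onto repository objects that can be stored, routed through workflow, and reused across publications.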
Working with Documentum Composer and Without It
Andrew Fitzpatrick, Capgemini
Documentum Composer represents a significant step forward for developers and analysts
who want to develop for Documentum. However, there are still some issues with using it.
The first section of this article discusses the issues I faced and how I overcame them. The
second section discusses two simple tools that I created to complement Composer and
solve some of the issues discussed in the first section. The final section details an agile
development methodology, some of the supporting tools, and how Composer and my tools
can be used to achieve agility.
Composer is now the primary way to create Documentum applications. In short, it provides
a friendly interface for business analysts to create artifacts, a development environment
in which to create custom code, and a platform that is extensible through plug-ins. I have
now used Composer for over 12 months and have found it to be a vast improvement over
Documentum Application Builder (DAB).
The issues detailed in this article are not program bugs; instead, they are issues I have had
with fitting Composer into the development process. As my issues have been process-related,
I believe that many other developers working with Composer are also contending with them.
By detailing the solutions I have put in place, I present my knowledge so it may be reused
in the future. Therefore, this article is useful for all ECM professionals who are using Composer
as their development tool and can use the knowledge herein to solve similar issues on their
own projects.
The article’s contents are presented from the point of view of a developer. However, all roles
within the project lifecycle are discussed within the relevant sections. This article is also use-
ful for project teams who are working to develop ECM but don’t necessarily fit into a particu-
lar methodology. One agile approach is presented that has been used with success and the
role of Composer within that methodology is discussed.
By reading this article, you should be able to overcome the development issues that I have
faced and that I believe are common. You will also be able to determine what Composer can
be used for and how its use can be extended. Finally, you will be able to use Composer suc-
cessfully within a methodology that supports agility, high quality, and simplicity.