Support Data-Intensive Workflows With Automated,
Intelligent Migration and Archiving
Pharmaceutical companies invest heavily in research and
development to create new drugs and medicines that treat
the ailments of today’s global society. Success brings
many benefits, but research is a slow, expensive and
highly competitive proposition. Research priorities
now require accelerated analysis of lab data and fast
decision-making based on integration, assimilation and
examination of multidisciplinary data. Hitachi Data Systems
storage solutions deliver hardware and advanced storage
management features to meet the challenges of today’s
data-intensive life sciences organizations.
SOLUTION PROFILE
Both large and small organizations can benefit from
our hardware-accelerated architecture, which supports
scaling without compromising performance.
Our data management software simplifies many
day-to-day administration tasks and helps provide shared
data access to the many individuals and workgroups
who need it. It can also improve the performance of
computational workflows by virtualizing storage servers
to transparently shift workloads across various
physical servers.
To address long-term data management issues, our
solutions offer automated intelligent data migration
and integrated archive functionality. By moving less
active data to other tiers, this approach allows
organizations to optimize the use of their
higher-performance storage.
High-Performance NAS Accelerates Life Sciences Research
New Requirements Demand
New Solutions
Life sciences research is heavily compute
and data intensive, which has been the
case since the sequencing of the human
genome was completed 10 years ago.
But there are significant differences today,
including:
■■ Volumes of data have increased by orders
of magnitude over the last few years,
driven in part by the race to achieve the
$1,000 genome sequence.
■■ High-performance computing clusters
have brought supercomputing power
to even the smallest labs. Storage must
now keep cluster processors satiated to
sustain high-throughput computational
workflows.
■■ The multidisciplinary nature of life
sciences research means that there is a
much greater need to provide shared
access to lab data and various public and
private databases.
These changes have had a considerable
impact on the proliferation of data and
the resulting management requirements.
Workloads are quite variable and place
changing demands on storage systems. To
address these challenges, we provide scalable
storage and data management solutions
to ensure workflows run unimpeded, in a
cost-effective manner (see Figure 1).
New Lab Equipment Drives
Data Growth
Life sciences data volumes are exploding
due to the wider use of higher resolution
imaging and sequencing in labs today.
On the imaging front, there is growing use of
microscopy in basic research and development,
as well as in early stage clinical trials.
There is also an increased use of microarrays
for single-nucleotide polymorphism (SNP)
detection and gene expression profiling.
Expanded use of higher-resolution imaging
in high-performance research drives the
need for high-performance and cost-effective
storage solutions to complement
increasing computing power used for analysis
of imaging data.
On the sequencing front, next-generation
sequencers produce orders of magnitude
more data per run than their predecessors.
Each run also takes less time, and the cost
per run is decreasing. Essentially,
the lower cost and faster run times mean
more completed experiments in a given
period of time, which makes the sequencing
data difficult to manage.
Accelerate Life Sciences
Research
Life sciences continue to experience
explosive growth of raw experimental and
calculated data. There is greater use of
high-throughput computational workflows
and growing adoption of new data
reuse, mining and retention practices. This
environment places new demands on the
way life sciences organizations store and
manage their data.
Changing Nature of Research
Creates New Demands
To address the life sciences data explosion,
organizations in the past would simply add
raw storage capacity. However, that is not
the best solution today.
As the volume of data and capabilities
of high-performance computing (HPC)
resources grow, meeting growing storage
capacity and performance demands
becomes a challenge. It is difficult to meet
these needs while minimizing the burden of
managing the expanding volumes of data.
Life sciences research is a multidisciplinary
effort, which adds to data management
challenges. Many researchers using vastly
different systems and data analysis routines
need shared access to lab results and
subscribed databases.
As numerous blockbuster drugs come off
patent, life sciences organizations today
routinely look at past research on these
already approved drugs to find new indications
for them. Therefore, researchers often
need access to older data, which in the past
would have been archived and taken offline.
Making matters more complex, regulatory
and new patient safety requirements dictate
retention of more data for longer periods.
Now, for responsible data management
you must decide which data gets stored on
which systems at a particular point in the
data’s lifetime.
Figure 1. Genomic Sequencing Environment With Shared Storage
Additionally, life sciences research is now
typically conducted in a more operationally
efficient manner. Previously, scientists cut
and pasted data from one spreadsheet or
database into another for analysis. However,
today’s workflows can automatically use
results from one analysis or computation
as input to another. This capability places
demands on the storage systems and their
interactions with HPC resources.
Storage Solutions Designed
for Today’s Life Sciences
Research
Hitachi NAS Platform (HNAS), our
high-performance NAS technology, features
improved integration with VMware
high-availability and testing environments.
With HNAS, Hitachi Data Systems provides
improved performance, manageability and
availability of NAS workloads.
HNAS supports intelligent tiered storage and
the ability to transparently move data from
tier to tier. Its single file system presentation
for the hosts, end users and applications
eliminates the need for changes, such as
redirecting an application to a new drive or
volume when a file is moved.
Our approach to intelligent tiered storage
allows online, nearline and archival data to
reside on any combination of solid state,
Fibre Channel, SAS and SATA disks. Also,
Hitachi Data Systems networked storage
systems automatically and intelligently
migrate data across various tiers using
policies that the storage administrator
establishes. These policies can be based
on a wide range of parameters, such as
previous access date, the owner of the data
and the amount of disk space available.
We call this type of data movement, with
its single file system view, transparent
data mobility (TDM). The “Complementary
Technologies” sidebar presents the HDS
technologies that contribute to TDM.
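As a rough illustration of the policy-driven migration described above, the following Python sketch selects files to move to a lower tier using the parameters the text mentions: last access date, data owner and available disk space. It is not HNAS code or configuration. The names `FileInfo`, `Policy`, `should_migrate` and `plan_migration`, and all threshold values, are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FileInfo:
    path: str
    owner: str
    last_access: date


@dataclass
class Policy:
    # All thresholds are illustrative assumptions, not product defaults.
    max_idle_days: int = 90              # migrate files idle longer than this
    min_free_fraction: float = 0.20      # space-pressure threshold on the primary tier
    exempt_owners: frozenset = frozenset()  # owners whose data always stays put


def should_migrate(f: FileInfo, policy: Policy, today: date,
                   free_fraction: float) -> bool:
    """Return True if the file should move to a lower tier under this policy."""
    if f.owner in policy.exempt_owners:
        return False
    idle_days = (today - f.last_access).days
    if idle_days > policy.max_idle_days:
        return True
    # Under space pressure, also move files idle for more than half the limit.
    under_pressure = free_fraction < policy.min_free_fraction
    return under_pressure and idle_days > policy.max_idle_days // 2


def plan_migration(files, policy, today, free_fraction):
    """List the paths that the policy selects for migration."""
    return [f.path for f in files
            if should_migrate(f, policy, today, free_fraction)]
```

In this sketch an administrator would adjust only the `Policy` values. The selection logic stays the same, which mirrors the idea of administrator-established policies driving automatic data movement.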
Next Steps
To learn more about how Hitachi NAS
Platform and Hitachi technologies can support
your life sciences workflow, contact
your Hitachi Data Systems representative.
www.HDS.com/innovate
Innovation is the engine of change, and
information is its fuel. Innovate intelligently
to lead your market, grow your company,
and change the world. Manage your
information with Hitachi Data Systems.
COMPLEMENTARY Technology for Transparent Data Mobility
To complement Hitachi hardware, we offer a number of technologies that help reduce
storage management costs, optimize storage resource use and automatically move
data to an appropriate storage tier. These technologies include:
■■ Data migrator feature, a policy-based engine, allows administrators to implement
data movement policies.
■■ Cross volume links reside on the primary file system and point to the corresponding
file on the secondary system, which houses the migrated data file.
■■ Dynamic read caching feature reserves space on a storage tier for caching of “hot”
files. Dynamic read caching can be applied to a cluster of Hitachi networked storage
servers. It is policy-driven and automated.
■■ Enterprise virtual server migration feature relocates a virtual server within a cluster
or to a server outside of the cluster that shares access to the same storage devices.
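The cross volume link idea in the list above can be pictured with a small Python model. This is only a conceptual sketch under stated assumptions, not the product's implementation: the class names `CrossVolumeLink` and `TieredNamespace`, the `/tier2` path prefix and the in-memory dictionaries standing in for file systems are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class CrossVolumeLink:
    """Stub left on the primary file system; points at the migrated file."""
    secondary_path: str


class TieredNamespace:
    """Toy model: two dictionaries stand in for primary and secondary storage."""

    def __init__(self) -> None:
        self.primary: Dict[str, Union[bytes, CrossVolumeLink]] = {}
        self.secondary: Dict[str, bytes] = {}

    def write(self, path: str, data: bytes) -> None:
        self.primary[path] = data

    def migrate(self, path: str) -> None:
        """Move file data to the secondary tier and leave a link behind."""
        entry = self.primary[path]
        if isinstance(entry, CrossVolumeLink):
            return  # already migrated
        secondary_path = "/tier2" + path  # hypothetical naming scheme
        self.secondary[secondary_path] = entry
        self.primary[path] = CrossVolumeLink(secondary_path)

    def read(self, path: str) -> bytes:
        """Callers keep using the original path; links are followed for them."""
        entry = self.primary[path]
        if isinstance(entry, CrossVolumeLink):
            return self.secondary[entry.secondary_path]
        return entry
```

Because `read` follows the link, an application that opens the original path sees the same data before and after migration, which is the single file system view the text describes.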