Mobile and Cloud Labs is developing a big data storage appliance called Argo to address the huge demand for flash storage. Argo will provide dense flash storage in a modular system with high performance, low latency and power efficiency using NVMe over PCIe. It aims to replace existing flash storage systems by offering greater scalability, flexibility and lower cost per gigabyte compared to competitors. Mobile and Cloud Labs plans to target applications such as data analytics, caching and virtualization in large datacenters.
Reference:
http://www.datacenterknowledge.com/archives/2012/03/14/estimate-amazon-cloud-backed-by-450000-servers/
MCL presentation suggesting 42% CAGR for Storage
10M Servers Sold Annually – IDC/Gartner
http://www.datacenterknowledge.com/archives/2007/11/12/google-data-centers-3000-a-square-foot/
Datacentermap.com
http://tech.fortune.cnn.com/2011/05/23/down-on-the-server-farm/
Huge low- to mid-range market opportunity:
3,000 large datacenters have 100K+ servers
500K datacenters have 1K–10K servers
2M datacenters have >1K servers
The above figures are approximations based on publicly available information.
If we had to replace 300M servers at 10M servers a year, it would take 30 years. So if all datacenter equipment were to be replaced by 2020 to keep up with growth in storage and processing capacity, roughly 45M servers per year would need to ship, i.e. 4.5x more than today, and the same applies to storage systems.
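The replacement arithmetic above can be sanity-checked with a quick back-of-the-envelope calculation. The figures (300M installed servers, 10M sold per year) are from the notes; the roughly seven-year window to 2020 is an assumption used here to reproduce the ~45M/year, ~4.5x numbers.

```python
# Back-of-the-envelope check on the server-replacement math.
# Assumptions: 300M installed servers, 10M sold/year, ~7 years to 2020.

installed = 300e6      # installed base of servers
annual_sales = 10e6    # servers sold per year today

years_at_current_rate = installed / annual_sales   # 30 years at today's rate
required_rate = installed / 7                      # ~43M/year to finish by 2020
multiple_of_today = required_rate / annual_sales   # ~4.3x (notes round to 4.5x)

print(years_at_current_rate)
print(round(required_rate / 1e6, 1), "million servers/year")
print(round(multiple_of_today, 2), "x today's volume")
```

The notes round ~43M/year up to 45M/year; the order of magnitude is the point.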
Financial institutions want to integrate their own proprietary software.
Big data solutions are difficult to deploy. Clustered solutions on commodity hardware have a higher TCO because of cluster-management costs, and a typical cluster can take weeks to months to configure. We can take advantage of fast shared storage with built-in compute and networking, pre-installed with software, to deliver a plug-and-play big data solution. vStorm Enterprise from Veristorm is a packaged big data solution that addresses both the distributed market and the mainframe market. This big data appliance will be offered to customers so they can plug it into their network and get a complete big data solution. The solution can scale horizontally as well as vertically. IO is the bottleneck in many cases, as is compute, for this kind of processing. The appliance can bring in data from disparate data sources and make it available for processing with big data analytics tools.
Before we talk about processing data, let’s talk about the SOURCES and the VALUE of data
FIRST
Traditional enterprise data is very often processed on IBM mainframes.
60% of all transactions: top banks, top retailers, top insurers. Enterprise systems and IBM dominate this space.
BUT, it’s expensive for real-time and batch processing. If we can reduce batch processing, we can save money.
SECOND
Big Data is about more than enterprise data:
It’s SOCIAL, it’s Enterprise Resource Planning (ERP), it’s Customer Relationship Management (CRM), it’s Supply Chain Management (SCM)
It’s even email and system logs… the volume is huge. (Big, in fact) It’s unstructured.
We need new ways to perform analytics on ALL of this data
vStorm Enterprise is a data-transfer solution packaged with a big data ecosystem (Hadoop for now) on the Argo appliance. Using simple drag and drop, data can be moved from relational, non-relational, and unstructured data sources to the appliance. The idea is to provide big data clusters on demand on the appliance. The solution can scale both vertically and horizontally, for storage as well as compute.
1. Differentiation is software: everyone is driving toward density at very low cost, and customers need flexibility
2. Service providers would like to build their own software to get the best out of the system
Pure Storage has raised over $200M
Violin Memory is going IPO
Skyera, Nimbuz, and others have raised over $50M
This will be our first test before shipping to Beta customers
Ref: Cisco
http://www.ieee802.org/3/ba/public/AdHoc/40GSMF/carter_40_01_0208.pdf
Plus, Datacentermap.com suggests there are ~3,000 very large datacenters worldwide. Given that the above data is from 2006, when ~25K servers were supported, today very large datacenters support more than 100K servers.
Ref: Vaid, Flash Memory Summit 2013, Microsoft
20% of data is hot (low-latency) data; hence, if we assume SSDs take 20% of the HDD market, then 20% of $42B = $8.4B will be the SSD storage-system market.
CAGR 100% means: 99.9% of 2020 traffic is new build; 0.1% would be today's traffic
CAGR 70% means: 99.5% of 2020 traffic is new build; 0.5% would be today's traffic
CAGR 40% means: 96.5% of 2020 traffic is new build; 3.5% would be today's traffic
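The percentages above follow from compounding: today's traffic is 1/(1+CAGR)^n of the end-of-period total. They reproduce exactly under the assumption of a 10-year horizon (2010 to 2020), which is the assumption made in this sketch.

```python
# Reproduce the "new build" shares above, assuming growth
# compounds over the 10 years from 2010 to 2020.

def new_build_share(cagr, years=10):
    """Fraction of end-of-period traffic that is new build."""
    growth = (1 + cagr) ** years
    return 1 - 1 / growth   # today's traffic is 1/growth of the 2020 total

for cagr in (1.00, 0.70, 0.40):
    print(f"CAGR {cagr:.0%}: {new_build_share(cagr):.1%} new build")
# CAGR 100%: 99.9% new build
# CAGR 70%: 99.5% new build
# CAGR 40%: 96.5% new build
```

At 100% CAGR, traffic grows 2^10 = 1024x, so today's traffic is only ~0.1% of the 2020 total; the 70% and 40% rows work the same way.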
Traffic growth: taking 2010 traffic as a baseline of 1, today we have ~10 exabytes, and by 2015 ~50 exabytes (1 EB = 1,000 PB).
At 40% growth: 2011 = 1.4x; 2012 = 1.4 x 1.4 ≈ 1.96x
Rebuild 80% of the current network to support this growth
This is just an insight into what needs to be done in 5 years.