LLNL talk

These slides are from a recent talk I gave at Lawrence Livermore Labs. The talk gives an architectural outline of the MapR system and then discusses how this architecture facilitates large-scale machine learning algorithms.

MapR Architecture and Machine Learning

Outline: MapR system overview; Map-reduce review; MapR architecture; Performance results; Map-reduce on MapR; Machine learning on MapR

Map-Reduce (diagram): Input; Shuffle; Output

Bottlenecks and Issues: Read-only files; Many copies in I/O path; Shuffle based on HTTP; Can't use new technologies; Eats file descriptors; Spills go to local file space; Bad for skewed distribution of sizes

MapR Improvements: Faster file system; Fewer copies; Multiple NICs; No file descriptor or page-buf competition; Faster map-reduce; Uses distributed file system; Direct RPC to receiver; Very wide merges

MapR Innovations: Volumes; Distributed management; Data placement; Read/write random access file system; Allows distributed meta-data; Improved scaling; Enables NFS access; Application-level NIC bonding; Transactionally correct snapshots and mirrors

MapR's Containers: Files/directories are sharded into blocks, which are placed into mini NameNodes (containers) on disks; Containers hold directories & files and data blocks; Replicated on servers; No need to manage directly; Containers are 16-32 GB segments of disk, placed on nodes

Container locations and replication (diagram: CLDB entries listing replica nodes such as N1, N2; N3, N2; N1, N3): Container location database (CLDB) keeps track of nodes hosting each container
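The role of the CLDB described above can be pictured as a lookup table from container to replica nodes. The following is only a minimal in-memory sketch; the container ids, the `cldb` dictionary, and both helper functions are hypothetical names chosen for illustration and say nothing about how MapR actually implements or exposes the CLDB.

```python
# Hypothetical stand-in for the CLDB: container id -> nodes holding a replica.
# The node names mirror the labels on the slide; the ids are made up.
cldb = {
    1: ["N1", "N2"],
    2: ["N3", "N2"],
    3: ["N1", "N3"],
}


def nodes_for(container_id):
    """Return the nodes that host a replica of the given container."""
    return cldb[container_id]


def containers_on(node):
    """Return the sorted ids of all containers with a replica on the node."""
    return sorted(cid for cid, nodes in cldb.items() if node in nodes)


print(nodes_for(2))
print(containers_on("N2"))
```

Because the table is keyed by container rather than by block, its size grows with the number of containers, which is the point the scaling slide builds on.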

MapR Scaling: Containers represent 16-32 GB of data; 100M containers = ~2 exabytes (a very large cluster); 250 bytes of DRAM to cache a container, but not necessary, can page to disk; Typical large 10 PB cluster needs 2 GB; Container-reports are 100x-1000x smaller than HDFS block-reports; Increase container size to 64 GB to serve a 4 EB cluster
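Some of the scaling figures on this slide can be checked with simple arithmetic. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 EB = 10^18 bytes); the function names are illustrative only. It covers the 100M-container capacity, the DRAM needed to cache that many container locations at 250 bytes each, and the container count for a 4 EB cluster at 64 GB per container.

```python
GB = 10**9   # decimal gigabyte (an assumption; the slide does not say)
EB = 10**18  # decimal exabyte


def total_capacity_eb(num_containers, container_gb):
    """Total data represented by the containers, in exabytes."""
    return num_containers * container_gb * GB / EB


def cache_dram_gb(num_containers, bytes_per_container=250):
    """DRAM needed to cache every container location, in gigabytes."""
    return num_containers * bytes_per_container / GB


low = total_capacity_eb(100_000_000, 16)    # capacity at 16 GB per container
high = total_capacity_eb(100_000_000, 32)   # capacity at 32 GB per container
cache = cache_dram_gb(100_000_000)          # DRAM for 100M containers
containers_for_4eb = 4 * EB // (64 * GB)    # containers needed at 64 GB each

print(low, high, cache, containers_for_4eb)
```

Under these assumptions, 100M containers span roughly 1.6 to 3.2 EB, which is consistent with the slide's "~2 exabytes", and a 4 EB cluster at 64 GB per container stays below 100M containers.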

Terasort on MapR (chart): 10+1 nodes: 8 core, 24 GB DRAM, 11 x 1 TB SATA 7200 rpm; Elapsed time (mins), lower is better

MUCH faster for some operations (chart): Same 10 nodes; Create rate vs. # of files (millions); Annotation: test stopped here

NFS mounting models: (1) Export to the world: NFS gateway runs on selected gateway hosts. (2) Local server: NFS gateway runs on local host; enables local compression and checksumming. (3) Export to self: NFS gateway runs on all data nodes, mounted from localhost

Export to the world (diagram): NFS Client; several NFS Servers

Local server (diagram): Client; Application; NFS Server; Cluster Nodes

Universal export to self (diagram): Cluster Nodes; each Cluster Node runs an Application and an NFS Server; Nodes are identical

Sharded text indexing: Mapper assigns document to shard; Shard is usually hash of document id; Reducer indexes all documents for a shard; Indexes created on local disk; On success, copy index to DFS; On failure, delete local files; Must avoid directory collisions (can't use shard id!); Must manage local disk space
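The shard assignment and the directory-collision concern on this slide can be sketched as follows. This is an illustration under assumptions, not the scheme used in the talk: the MD5-based hash, the directory naming, and both function names are hypothetical. The slide does not say why the shard id alone is unsafe as a directory name; one plausible reason, assumed here, is that a retried or speculative reducer for the same shard could land on the same node.

```python
import hashlib
import uuid


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document id to a shard with a stable hash of the id."""
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def local_index_dir(shard: int) -> str:
    """Build a local directory name that is unique per reducer attempt.

    A random suffix is added so two attempts at the same shard never
    share a directory (hypothetical naming convention).
    """
    return f"index-shard-{shard:04d}-{uuid.uuid4().hex}"


shard = shard_for("doc-12345", 16)
print(shard, local_index_dir(shard))
```

A stable hash such as MD5 is used instead of Python's built-in `hash()` because the latter is randomized per process for strings, which would send the same document to different shards on different workers.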

Conventional data flows (diagram): Input documents; Map; Reducer; Local disk; Clustered index storage; Local disk; Search Engine. Failure of a reducer causes garbage to accumulate in the local disk. Failure of search engine requires another download of the index from clustered storage.

Simplified NFS data flows (diagram): Input documents; Map; Reducer; Clustered index storage; Search Engine. Failure of a reducer is cleaned up by map-reduce framework. Search engine reads mirrored index directly.

Application to machine learning: So now we have the hammer; let's see some nails!

K-means: Classic E-M based algorithm; Given cluster centroids, assign each data point to nearest centroid; Accumulate new centroids; Rinse, lather, repeat

K-means, the movie (diagram): Input; Centroids; Assign to nearest centroid; Aggregate new centroids
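The assign-then-accumulate loop on the K-means slides can be written out as a short single-machine sketch. It uses squared Euclidean distance and keeps the old centroid when a cluster receives no points; both choices, and all function names, are assumptions made for illustration rather than details given in the talk.

```python
def nearest(point, centroids):
    """Index of the centroid closest to the point (squared Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def kmeans_step(points, centroids):
    """One iteration: assign each point, then recompute every centroid."""
    dims = len(centroids[0])
    sums = [[0.0] * dims for _ in centroids]
    counts = [0] * len(centroids)
    for point in points:
        k = nearest(point, centroids)
        counts[k] += 1
        for d, value in enumerate(point):
            sums[k][d] += value
    # An empty cluster keeps its previous centroid (an assumption).
    return [
        [s / counts[k] for s in sums[k]] if counts[k] else list(centroids[k])
        for k in range(len(centroids))
    ]


def kmeans(points, centroids, iterations=10):
    """Rinse, lather, repeat for a fixed number of iterations."""
    for _ in range(iterations):
        centroids = kmeans_step(points, centroids)
    return centroids


points = [(0, 0), (0, 2), (10, 10), (10, 12)]
print(kmeans(points, [(0, 0), (10, 10)]))
```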

But …

Parallel Stochastic Gradient Descent (diagram): Input; Train sub model; Average models; Model
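The train-then-average pattern in this diagram can be sketched as below. The model here is a plain linear least-squares regressor trained with per-example SGD, which is an assumption for illustration; the slide does not name a model, a learning rate, or a number of epochs, and the partitions are processed in a simple loop rather than in parallel.

```python
def sgd_train(data, weights, rate=0.05, epochs=50):
    """Train one sub-model with per-example SGD on squared error."""
    w = list(weights)
    for _ in range(epochs):
        for features, target in data:
            prediction = sum(wi * xi for wi, xi in zip(w, features))
            error = prediction - target
            w = [wi - rate * error * xi for wi, xi in zip(w, features)]
    return w


def average_models(models):
    """Average the weight vectors of the sub-models component-wise."""
    n = len(models)
    return [sum(ws) / n for ws in zip(*models)]


def parallel_sgd(partitions, dims):
    """Train a sub-model on each input partition, then average them."""
    start = [0.0] * dims
    sub_models = [sgd_train(part, start) for part in partitions]
    return average_models(sub_models)


# Toy data following y = 2x, split into two partitions.
partitions = [
    [((1.0,), 2.0), ((2.0,), 4.0)],
    [((3.0,), 6.0)],
]
print(parallel_sgd(partitions, dims=1))
```

In a cluster setting each call to `sgd_train` would run in its own task, and only the small weight vectors would need to be exchanged for the averaging step.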

Variational Dirichlet Assignment (diagram): Input; Gather sufficient statistics; Update model; Model

Old tricks, new dogs: Mapper: assign point to cluster; emit cluster id, (1, point). Combiner and reducer: sum counts, weighted sum of points; emit cluster id, (n, sum/n). Output to HDFS. Annotations: read from HDFS to local disk by distributed cache; read from local disk from distributed cache; written by map-reduce.

Old tricks, new dogs, with MapR FS: Mapper: assign point to cluster; emit cluster id, (1, point). Combiner and reducer: sum counts, weighted sum of points; emit cluster id, (n, sum/n). Output to HDFS. Annotations: read from NFS; written by map-reduce; MapR FS.
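The mapper and combiner/reducer steps listed on these two slides can be simulated in memory as follows. This is a sketch only: it runs in a single process with no Hadoop API, the centroids are passed in as an argument rather than read from the distributed cache or an NFS path, and the function names are hypothetical. The same `combine` function serves as combiner and reducer because merging (count, mean) pairs by a count-weighted sum is associative.

```python
from collections import defaultdict


def nearest(point, centroids):
    """Index of the centroid closest to the point (squared Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def mapper(point, centroids):
    """Assign the point to a cluster and emit cluster id, (1, point)."""
    return nearest(point, centroids), (1, tuple(point))


def combine(values):
    """Merge (count, mean) pairs: sum counts, weighted sum of points."""
    total = 0
    weighted = None
    for n, mean in values:
        if weighted is None:
            weighted = [0.0] * len(mean)
        total += n
        for d, m in enumerate(mean):
            weighted[d] += n * m
    # Emit (n, sum/n) for this cluster id.
    return total, tuple(w / total for w in weighted)


def run_iteration(points, centroids):
    """Simulate map, shuffle by cluster id, and reduce in memory."""
    groups = defaultdict(list)
    for point in points:
        cluster_id, value = mapper(point, centroids)
        groups[cluster_id].append(value)
    return {cid: combine(values) for cid, values in groups.items()}


print(run_iteration([(0, 0), (0, 2), (10, 10)], [(0, 0), (10, 10)]))
```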

Click modeling architecture (diagram): Input; Feature extraction and down sampling; Data join; Side-data (now via NFS); Sequential SGD learning; Map-reduce

Poor man's Pregel: Mapper (lines in bold can use conventional I/O via NFS):

    while not done:
        read and accumulate input models
        for each input:
            accumulate model
        write model
        synchronize
        reset input format
    emit summary
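The "read and accumulate input models" and "write model" steps of this loop are the ones that ordinary file I/O can handle once the cluster file system is mounted via NFS. The sketch below shows only those two steps against a shared directory, using a temporary local directory in place of an NFS mount. The file naming scheme, the JSON format, the write-then-rename pattern, and the averaging rule are all assumptions for illustration; the synchronize step is not shown.

```python
import json
import os
import tempfile


def write_model(shared_dir, worker, step, model):
    """Write one worker's model for a step using plain file I/O."""
    path = os.path.join(shared_dir, f"model-{step}-{worker}.json")
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(model, f)
    # Rename after writing so readers do not pick up a partial file.
    os.replace(tmp, path)


def read_and_accumulate(shared_dir, step):
    """Read every model written for a step and average the weights."""
    models = []
    for name in sorted(os.listdir(shared_dir)):
        if name.startswith(f"model-{step}-") and name.endswith(".json"):
            with open(os.path.join(shared_dir, name)) as f:
                models.append(json.load(f))
    if not models:
        return None
    return [sum(ws) / len(models) for ws in zip(*models)]


# A temporary directory stands in for a shared NFS-mounted path.
with tempfile.TemporaryDirectory() as shared:
    write_model(shared, "w0", 0, [1.0, 2.0])
    write_model(shared, "w1", 0, [3.0, 4.0])
    print(read_and_accumulate(shared, 0))
```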

Trivial visualization interface: Map-reduce output is visible via NFS; Legacy visualization just works

    $ R
    > x <- read.csv("/mapr/my.cluster/home/ted/data/foo.out")
    > plot(error ~ t, x)
    > q(save='n')

Conclusions: We used to know all this; Tab completion used to work; 5 years of work-arounds have clouded our memories; We just have to remember the future
