The document discusses Hadoop and cloud computing. It provides an overview of Hadoop, including what it is ("flexible infrastructure for large scale computational and data processing on a network of commodity hardware"), how it works (using MapReduce for distributed processing), and some example applications. It also covers the Hadoop file system (HDFS) and the surrounding ecosystem, and mentions companies in the Hadoop ecosystem, such as Cloudera, as well as organizations working with large datasets.
10. Both are distributed systems
“A distributed system is one in which the failure
of a computer you didn't even know existed can
render your own computer unusable”
Leslie Lamport
11. Both are distributed systems
“A distributed system consists of multiple
autonomous computers that communicate
through a computer network.”
Wikipedia
15. Hadoop
MapReduce: Simplified Data Processing on Large Clusters
Jeffrey Dean and Sanjay Ghemawat
OSDI'04: Sixth Symposium on Operating System Design and Implementation,
San Francisco, CA, December, 2004.
19. Hadoop
“Flexible infrastructure for large scale
computational and data processing on
a network of commodity hardware”
Parand Tony Darugar
22. Map & Reduce
Map:
V = [ 1, 2, 3, 4, 5 ]
def quadrat( x ) = x * x;
map( V, quadrat ) =
    for (var v : V) {
        output quadrat(v);
    }
[1, 4, 9, 16, 25]
23. Map & Reduce
Map:
V = [ 1, 2, 3, 4, 5 ]
def quadrat( x ) = x * x;
map( V, quadrat ) =
    for (var v : V) {
        output quadrat(v);
    }
[1, 4, 9, 16, 25]

Reduce:
V = [ 1, 4, 9, 16, 25 ]
reduce( V ) =
    var acum = 0;
    for (var v : V) {
        acum = acum + v;
    }
55
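The map and reduce steps on the two slides above can be sketched in plain Java (the class and method names here are illustrative, not part of any Hadoop API): `map` squares every element, and `reduce` folds the result into a single sum.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class MapReduceSketch {
    // "Map": apply quadrat (square) to every element of V
    static List<Integer> map(List<Integer> v) {
        return v.stream().map(x -> x * x).collect(Collectors.toList());
    }

    // "Reduce": fold the list into a single accumulated sum
    static int reduce(List<Integer> v) {
        return v.stream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> v = Arrays.asList(1, 2, 3, 4, 5);
        List<Integer> squared = map(v);
        System.out.println(squared);          // [1, 4, 9, 16, 25]
        System.out.println(reduce(squared));  // 55
    }
}
```

Because `map` touches each element independently, it can run in parallel across machines; only `reduce` needs to see the combined results — which is exactly the structure Hadoop exploits.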
24. Hadoop DFS
The Google File System
Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung
19th ACM Symposium on Operating Systems Principles,
Lake George, NY, October, 2003.
● Designed for Big Data
● Write Once, Read Many
● One DataNode per machine
● One NameNode per cluster (a single point of failure)
● Tolerant to hardware failures
● Rack-aware replica placement
● 'append' supported only recently
● Cannot be mounted in the OS
● Sequential reads
● Stable and robust
27. Example
DFS
“word1” : [ 2, x, y ]
    2 from mapper 1
    x from mapper 2
    y from mapper 3
“word2” : [ x, z, w ]
    x from mapper 1
    z from mapper 2
    w from mapper 3
“word3” : [ ... ]
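The grouping shown above — collecting every count emitted for the same word into one list per key — is the shuffle phase between map and reduce. A minimal local sketch (class name, word strings, and counts are all illustrative, not from the talk):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ShuffleSketch {
    // Groups (word, count) pairs emitted by the mappers by key,
    // producing the per-word value list each reducer call receives.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> emitted) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : emitted) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Hypothetical counts emitted by three mappers
        List<Map.Entry<String, Integer>> emitted = List.of(
            Map.entry("word1", 2),  // from mapper 1
            Map.entry("word2", 1),  // from mapper 1
            Map.entry("word1", 3),  // from mapper 2
            Map.entry("word2", 4),  // from mapper 2
            Map.entry("word1", 5)   // from mapper 3
        );
        System.out.println(shuffle(emitted)); // {word1=[2, 3, 5], word2=[1, 4]}
    }
}
```

In a real cluster this grouping happens across the network: Hadoop partitions keys among reducers and guarantees that all values for one key arrive at the same reducer.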
29. Code example
public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            context.write(word, one);
        }
    }
}
30. Code example
public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
31. Code example
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "wordcount");
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.waitForCompletion(true);
}
38. Interested?
To try Hadoop:
http://www.cloudera.com ► Downloads
http://hadoop.apache.org
Spanish national user group for Hadoop and scalability:
https://groups.google.com/group/spain-scalability-users
LinkedIn groups:
Hadoop España
Hive España
39. Questions?
Marc de Palol
marc.de.palol@gmail.com
@lant