7. Scala
Created by Martin Odersky and his research group at EPFL, 2003
Open Source
Runs on the JVM, Seamless Java Interoperability
Strongly Typed
Object Oriented
Functional
8. Functional Programming
From Wikipedia:
“[…] functional programming […] treats
computation as the evaluation of mathematical
functions and avoids changing-state and
mutable data”
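A minimal sketch of what that definition means in Scala. The object and method names are illustrative only; both methods compute the same sum of squares, one by mutating a variable and one by composing functions.

```scala
object PureVsMutable {
  // Imperative style: the result is built up by mutating `total`.
  def sumOfSquaresImperative(xs: Seq[Int]): Int = {
    var total = 0
    for (x <- xs) total += x * x
    total
  }

  // Functional style: no mutable state, just evaluation of functions.
  def sumOfSquares(xs: Seq[Int]): Int =
    xs.map(x => x * x).sum
}
```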
15. Open Source
Strongly Typed
Java/JVM Friendly
Interactive
Performant
Abstracts “machinery”
Java Interoperability
class DirectParquetOutputCommitter(outputPath: Path, context: TaskAttemptContext)
extends ParquetOutputCommitter(outputPath, context) { … }
ParquetOutputCommitter: Java class from org.apache.parquet:parquet-hadoop
DirectParquetOutputCommitter: Scala class from org.apache.spark:spark-core_2.10
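The same pattern can be tried without Spark or Parquet on the classpath. The sketch below is an illustration, not code from either project: a hypothetical Scala class `Repeated` extends the JDK's Java class `java.util.AbstractList` and passes its constructor parameters straight into the overridden methods.

```scala
import java.util.AbstractList

object InteropSketch {
  // A Scala class extending a Java abstract class from the JDK.
  // `Repeated` is a hypothetical name used only for this example.
  class Repeated(value: String, times: Int) extends AbstractList[String] {
    // AbstractList requires get(int) and size() to be implemented.
    override def get(index: Int): String = {
      if (index < 0 || index >= times)
        throw new IndexOutOfBoundsException(index.toString)
      value
    }

    override def size(): Int = times
  }
}
```

Java code can use an instance of `Repeated` as an ordinary `java.util.List<String>`; inherited methods such as `contains` and `indexOf` work unchanged.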
16. Scala is Interactive
Scala has a built-in REPL, extensible by Scala-based tools
17. Performant
Benchmarking languages is hard and suffers from bias
Most benchmarks show Scala is at least on-par with Java,
e.g. Google’s benchmark:
Nonsense!
No Way!
RAGE!!11
19. Performant
Ability to scale out is more significant than per-CPU performance
http://vmturbo.com/wp-content/uploads/2015/05/ScaleUpScaleOut_sm-min.jpg
21. Think about MapReduce
With Hadoop’s Mapper and Reducer you code what to do, not:
How
Where
In what order
How to handle failures
These concerns are left for the framework to figure out
22. Think about MapReduce
Hadoop’s Java API imitates Functional Programming:
Mapper and Reducer are Functions
Executed by “Higher Order Functions”
No Side Effects / Mutability
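To make the analogy concrete, here is a word count written with plain Scala collections rather than the Hadoop API. It is only a sketch: `mapper`, `reducer` and `wordCount` are hypothetical names, and the higher-order functions `flatMap`, `groupBy` and `reduce` play the role that the framework plays in MapReduce.

```scala
object WordCountSketch {
  // "Mapper": a pure function from one input line to (word, 1) pairs.
  def mapper(line: String): Seq[(String, Int)] =
    line.toLowerCase.split("\\s+").filter(_.nonEmpty).map(w => (w, 1)).toSeq

  // "Reducer": a pure function that combines two partial counts.
  def reducer(a: Int, b: Int): Int = a + b

  // The higher-order functions decide how and in what order the
  // mapper and reducer are applied; the functions only say what to do.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(mapper)
      .groupBy(_._1)
      .map { case (word, pairs) => word -> pairs.map(_._2).reduce(reducer) }
}
```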
24. Functional makes concurrency easy
val numbers = 1 to 100000
val result = numbers.map(slowF)
25. Functional makes concurrency easy
val numbers = 1 to 100000
val result = numbers.par.map(slowF)
Parallelizes the subsequent operations across the available CPU cores
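A sketch of why purity makes this safe. It uses `Future` from the standard library instead of `.par`, because from Scala 2.13 onward parallel collections live in a separate module. `slowF` here is a hypothetical stand-in for the function on the slide: since its result depends only on its argument, evaluating the elements concurrently gives the same result as evaluating them one by one.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object PureConcurrency {
  // Hypothetical stand-in for slowF: the result depends only on the input.
  def slowF(n: Int): Long = {
    Thread.sleep(1) // simulate slow work
    n.toLong * n
  }

  // Evaluate one element after another.
  def sequential(numbers: Seq[Int]): Vector[Long] =
    numbers.map(slowF).toVector

  // Evaluate every element in its own Future on the global thread pool;
  // Future.traverse keeps the results in input order.
  def concurrent(numbers: Seq[Int]): Vector[Long] =
    Await.result(
      Future.traverse(numbers.toVector)(n => Future(slowF(n))),
      1.minute
    )
}
```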
26. Functional makes distribution easy
val numbers = 1 to 100000
val result = sparkContext.parallelize(numbers).map(slowF)
Parallelizes the subsequent operations across a scalable cluster by
creating a Spark RDD (Resilient Distributed Dataset)
27. “Spark RDDs are the ultimate Scala collections”
- Martin Odersky
photo: http://www.swissict-award.ch/fileadmin/award/Pressebilder/Martin_Odersky_Scala.jpg
28. Functional makes resiliency easy
Re-running a pure function is idempotent, so failed tasks can safely be retried
[Diagram: several parallel Map tasks, one of them re-run as “Map (retry)”]
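A small sketch of the retry idea, outside any real framework. All names are hypothetical: `retry` re-runs a task until it succeeds or the attempts run out, and `flaky` simulates a worker that crashes a fixed number of times. Because `square` is pure, re-running it after a failure cannot corrupt anything; the only mutable state is the failure counter used for the simulation.

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Try}

object RetrySketch {
  // A pure task: same input, same output, no side effects.
  def square(n: Int): Int = n * n

  // Run `task` up to `maxAttempts` times, returning the first success
  // or the last failure.
  @tailrec
  def retry[A](maxAttempts: Int)(task: () => A): Try[A] =
    Try(task()) match {
      case Failure(_) if maxAttempts > 1 => retry(maxAttempts - 1)(task)
      case result                        => result
    }

  // Simulated unreliable worker: throws on the first `failures` calls,
  // then delegates to `f`. The counter exists only for the simulation.
  def flaky[A](failures: Int)(f: () => A): () => A = {
    var remaining = failures
    () =>
      if (remaining > 0) {
        remaining -= 1
        throw new RuntimeException("simulated worker failure")
      } else f()
  }
}
```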
34. It’s really not that scary...
From Manuel Bernhardt's "Debunking Some Myths About Scala And Its Environment":
“I need to become a mathematician and know all about Monads before I can get started”
“I can throw all of my object-orientation knowledge out of the window”
“There is no good IDE support for Scala”