Scala Json Features and Performance
John Nestor, 47 Degrees
nestor@persist.com
Dragos Manolescu
dam@micro-workflow.com
https://github.com/47deg/json-perf
47deg.com

DISCLAIMER
• Best-effort attempt to measure performance and describe features.
• Corrections always appreciated.
• Also let us know about any Json parsers we missed.

SCALA JSON
• There are lots of Scala Json parsers.
• You can also use Java Json parsers in Scala.
• How to choose:
  • Performance
  • Features
  • API
  • Support (will not be abandoned)
  • License (most are Apache 2)

THE PARSERS (1 OF 4)
• Scala Library. This parser is part of the standard Scala library in package scala.util.parsing.json. It is implemented using parsing combinators.
• Twitter Json. A cleaned-up version of the JSON parser in Odersky's Scala book. It is implemented using parsing combinators. Written by Steve Jenson while at Twitter.
• Persist Json. Developed as part of OStore, a new NoSQL database written in Scala. OStore started with the Twitter parser. This turned out to be much too slow, so it was rewritten from scratch, keeping mostly the same API but with an emphasis on speed. Developed by John Nestor (with the codex based mapper by JR Dejardin).

THE PARSERS (2 OF 4)
• Play Json. A part of the Typesafe Play framework. Implemented using Jerkson, a Scala wrapper on Jackson.
• Lift Json. Developed as part of Lift, a framework for building web apps.
• Spray Json. Developed as part of Spray, a REST/HTTP network IO toolkit.

THE PARSERS (3 OF 4)
• Argonaut. Purely functional Json in Scala. Uses Scalaz.
• Rojoma. Another Scala parser that makes extensive use of Scala's functional features. Developed by Robert Macomber of Socrata.
• Jawn. Jawn was designed to parse JSON into an AST as quickly as possible.

THE PARSERS (4 OF 4)
• Jackson. Generally regarded as the best and fastest Java Json parser. Has a very rich set of features. We test using the DefaultScalaModule (by Chris Currie) that provides Scala support.
• Json Smart. A newer Json parser written in Java that is faster than Jackson.

TEST SETS FOR PERFORMANCE TESTING
• Twitter. Tweets processed by the Yap.tv Guide (http://j.mp/15WL0p3), a service providing a personalized TV guide companion experience based on social content from Twitter and Facebook. This data set contains 100 tweets in Json (http://j.mp/13lKbU6).
• Google. PlaceSearchResults returned by Google in response to place queries at 100 locations. The locations correspond to the best places to live in 2012, as compiled by CNN Money (http://j.mp/13NmVid). This data set contains 138 PlaceSearchResults in Json (http://j.mp/13NmCUC), obtained using the keyword "brewery" and a radius of 2 miles.
• Each file has one Json object per line.

PRETTY SAMPLE GOOGLE JSON
{"address_components":
[{"long_name":"622",
"short_name":"622",
"types":["street_number"]
},
{"long_name":"South Rangeline Road",
"short_name":"South Rangeline Road",
"types":["route"]
},
{"long_name":"Carmel",
"short_name":"Carmel",
"types":["locality","political"]
},
{"long_name":"Hamilton",
"short_name":"Hamilton",
"types":
["administrative_area_level_2","political"]
},
{"long_name":"Indiana",
"short_name":"Indiana",
"types":
["administrative_area_level_1","political"]
},
{"long_name":"US",
"short_name":"US",
"types":["country","political"]
},
{"long_name":"46032",
"short_name":"46032",
"types":["postal_code"]
}
],
"formatted_address":
"Suite Q, 622 South Rangeline Road, Carmel, Indiana, United States",
"formatted_phone_number":"(317) 429-6345",
"geometry":
{"location":{"lat":39.971703,"lng":-86.129099}},
"icon":
"http://maps.gstatic.com/mapfiles/place_api/icons/generic_business-71.png",
"id":"fcd83d32717980ec1fec2c7ec8719389b201a331",
"international_phone_number":"+1 317-429-6345",
"name":"Union Brewing Company",
"opening_hours":
{"open_now":true,
"periods":
[{"close":{"day":0,"time":"2000"},
"open":{"day":0,"time":"1200"}
},
{"close":{"day":2,"time":"2200"},
"open":{"day":2,"time":"1600"}
},
{"close":{"day":4,"time":"2200"},
"open":{"day":4,"time":"1600"}
},
{"close":{"day":6,"time":"0000"},
"open":{"day":5,"time":"1500"}
},
{"close":{"day":0,"time":"0000"},
"open":{"day":6,"time":"1200"}
}
]
},
"photos":
[{"height":1632,
"html_attributions":
["<a href=\"https://plus.google.com/117934275405882297051\">Greg Magnusson</a>"
],
"photo_reference":
"CnRoAAAAvN9y_gkgZIGa13kUSyyBlqwholvjtH4NKo-BzvlklcX-Tt9Ysc6HRMXPxKl3PumZtiOnomHi-Nk83y-lxf8RX8nsWulwuCBpY2okAqaU9wohOhncStFPZlKr02t3WquA6pt8mfCYYO-NAdU2HwdM1hIQYJmus4wpQBaRtP7BFdYhzRoU4XvzfAAQQwkdJZluFJ-tDoUulIo",
"width":1224
}
],
"reference":
"CoQBcgAAAF3VKrWBUmLMv5tLs1Ru47j3Tbxa6lPxlIFj5BUvpsTyPt3bpui2vOTCcaHjKYuAjSulIPHpd0YFgm5CKLQH6P_19xU1UPeu6avWeIMWA0u4hxyx4TazCfFF9ESCwHaOEcKZfRyJSD2b5p2IJvT0eVkFFExeWbqAcWrH80jIQ-VrEhAvUSpbmH3rB4LEKn-cZtsYGhQxFpeco4U1rUtwe-ncAttqLBnSgQ",
"reviews":
[{"aspects":[{"rating":3,"type":"quality"}],
"author_name":"Greg Magnusson",
"author_url":
"https://plus.google.com/117934275405882297051",
"text":
"Truly outstanding local craft brewing company. Indy's got some great local brewers, but these guys really get it right. Nice little location in Carmel, great beer and local guest taps... I'm so glad these guys moved into town. Love!",
"time":1361059887
}
],
"types":["food","establishment"],
"url":
"https://plus.google.com/102928473191458623183/about?hl=en-US",
"utc_offset":-300,
"vicinity":
"Suite Q, 622 South Rangeline Road, Carmel",
"website":"http://www.unionbrewingco.com"
}

TESTING PROCESS
• Timing is done with Java System.nanoTime().
• For each data set, each line is processed.
• This is repeated 25 times to warm up the JVM.
• This is then repeated 200 times for measurement.
• For example, the Google data set has 138 Json lines, so a total of 3450 lines are parsed during warm-up and 27600 lines during testing.
• The nanoseconds for all 27600 parse steps are summed and reported in milliseconds for each parser.
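The testing process could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the benchmark's actual code: `TimingSketch`, `timeParser`, `parseCounts`, and the `parse` function parameter are hypothetical names, and only the 25 warm-up passes, the 200 measured passes, and the use of System.nanoTime() come from the slide.

```scala
object TimingSketch {
  /** Runs `warmup` untimed passes over all lines, then sums the nanoseconds
    * spent in `parse` over `measured` passes and reports milliseconds.
    * `parse` stands in for whichever Json parser is under test (an assumption).
    */
  def timeParser(lines: Seq[String], parse: String => Any,
                 warmup: Int = 25, measured: Int = 200): Long = {
    // Warm-up passes: results and timings are discarded.
    for (_ <- 1 to warmup; line <- lines) parse(line)

    var totalNanos = 0L
    for (_ <- 1 to measured; line <- lines) {
      val start = System.nanoTime()
      parse(line)
      totalNanos += System.nanoTime() - start
    }
    totalNanos / 1000000L // nanoseconds -> milliseconds
  }

  /** Number of parse steps for a data set, as (warm-up, measured). */
  def parseCounts(lineCount: Int, warmup: Int = 25, measured: Int = 200): (Int, Int) =
    (lineCount * warmup, lineCount * measured)
}
```

For the 138-line Google data set, `TimingSketch.parseCounts(138)` gives 3450 warm-up and 27600 measured parse steps, matching the numbers on the slide.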

TIMING SCALA/JAVA CODE
• Timing is tricky! For example see
  • http://www.ibm.com/developerworks/library/j-benchmark1/
• A few of the many issues:
  • Warmup (run several times to warm JVM)
  • Repeatability (use average?, but what about P99?)
  • Interference from other processes
  • Caches
  • Garbage collection
  • Chosen data set
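To make the repeatability question concrete, here is a small sketch that computes both an average and a P99 from a list of per-pass timings. The names `LatencyStats`, `mean`, and `percentile` are hypothetical, and the nearest-rank percentile definition is one common choice picked here as an assumption; the benchmark itself reports only summed times.

```scala
object LatencyStats {
  /** Arithmetic mean of the samples. */
  def mean(samples: Seq[Long]): Double =
    samples.sum.toDouble / samples.size

  /** Nearest-rank percentile: the smallest sample such that at least
    * `p` percent of all samples are less than or equal to it.
    */
  def percentile(samples: Seq[Long], p: Double): Long = {
    require(samples.nonEmpty && p > 0 && p <= 100)
    val sorted = samples.sorted
    val rank = math.ceil(p * sorted.size / 100.0).toInt // 1-based rank
    sorted(rank - 1)
  }
}
```

With made-up timings of 98 passes at 10 ms and 2 passes at 500 ms (say, two passes hit by garbage collection), the mean is 19.8 ms and the median is 10 ms, while P99 is 500 ms. The average alone hides the slow passes.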

TESTING MACHINE
• Times obviously depend on the speed of the machine used in testing.
• Numbers here are for a MacBook Pro with
  • 2 cores at 2.9 GHz
  • 16GB of main memory
• You can run the tests on a machine of your choice!

PARSING TIMES (MS)
Parser          Twitter   Google   Ignore
Persist Json        443      712
Rojoma              540     1251
Jackson             445      842
Spray Json          603     1115
Lift Json           469     1002
Twitter Json      18179    42316   Too Slow
Scala Library    126006   329215   Way Too Slow
Play Json           442     1027
Json Smart          251      424
Argonaut            784     1448
JAWN                603      748

WHY IS THE SCALA LIBRARY EVEN SLOWER?
• Like Twitter Json, it uses parsing combinators.
• But why is it so much slower?

WHY IS PLAY SO SLOW IF IT USES JACKSON?
• It uses Jerkson (which is abandoned)?
• ???

JSON LANGUAGE EXTENSIONS
Parser          Comments   NoQuotes      Root Type   Other
Persist Json    //         field         any         raw strings
Rojoma          //,/**/    field         any         keeps field order
Jackson         //         field         object      can use '
Spray Json      //         -             any         -
Lift Json       -          -             object      keeps field order
Twitter Json    -          -             any         -
Scala Library   -          -             any         -
Play Json       -          -             object      -
Json Smart      #          field/value   object      -
Argonaut        -          -             object      keeps field order
JAWN            -          -             object      -

PARSER RESULTS (ASTS)
Parser          Object, Array               Wrapped in Object   Immutable   Collections
Persist Json    Map, List                   no                  yes         Scala
Rojoma          LinkedHashMap, Vector       yes                 no          Scala
Jackson         Map, List                   yes                 no          Java
Spray Json      Map, Vector                 yes                 yes         Scala
Lift Json       List[Field], List           yes                 yes         Scala
Twitter Json    Map, List                   no                  yes         Scala
Scala Library   Map, List                   yes                 yes         Scala
Play Json       Map, List                   yes                 yes         Scala
Json Smart      HashMap, List               yes                 no          Java
Argonaut        scalaz.InsertionMap, List   yes                 yes         Scala
JAWN            Map, Array                  yes                 no          Scala

UNPARSING
• The inverse of parsing (deserialization) is unparsing (serialization).
• Unparsing takes the AST from parsing and converts it back to a string.
• Useful for debugging and logging.
• Many parsers also include a pretty-printed unparser.
• Timing here is for the "non-pretty" simple form.
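As an illustration of what a "non-pretty" unparser does, here is a minimal sketch over an AST made of raw Scala values (Map, Seq, String, numbers, Boolean, null). `Unparse.compact` is a hypothetical name and this is not the code of any of the libraries measured; it only shows the general shape of the operation.

```scala
object Unparse {
  /** Escapes a string and wraps it in quotes for use as a Json string literal. */
  private def quote(s: String): String = {
    val sb = new StringBuilder("\"")
    s.foreach {
      case '"'  => sb.append("\\\"")
      case '\\' => sb.append("\\\\")
      case '\n' => sb.append("\\n")
      case '\r' => sb.append("\\r")
      case '\t' => sb.append("\\t")
      // Remaining control characters become \u00XX escapes.
      case c if c < ' ' => sb.append('\\').append('u').append("%04x".format(c.toInt))
      case c    => sb.append(c)
    }
    sb.append('"').toString
  }

  /** Renders the AST as a compact (non-pretty) Json string. */
  def compact(value: Any): String = value match {
    case null          => "null"
    case s: String     => quote(s)
    case b: Boolean    => b.toString
    case n: Int        => n.toString
    case n: Long       => n.toString
    case n: Double     => n.toString
    case n: BigDecimal => n.toString
    case m: Map[_, _]  =>
      m.map { case (k, v) => quote(k.toString) + ":" + compact(v) }.mkString("{", ",", "}")
    case xs: Seq[_]    => xs.map(compact).mkString("[", ",", "]")
    case other         => throw new IllegalArgumentException("not a Json value: " + other)
  }
}
```

A pretty-printing variant would add indentation and line breaks in the Map and Seq cases; the compact form is the one being timed.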

WHY IS JACKSON SO INCREDIBLY FAST?
• Uses SegmentedStringWriter (rather than StringBuilder).
• Uses a segmented internal buffer.
• Buffers are recycled.

WHY IS PERSIST SLOW?
• The AST uses raw Seq and Map values rather than wrapping them in custom classes.
• Must use pattern matching rather than dispatch to a virtual method.
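The two AST styles contrasted here can be sketched as below, using a trivial operation (counting leaf values) in place of unparsing. All names (`RawAst`, `JValue`, `JLeaf`, `JArray`, `JObject`, `leafCount`) are hypothetical and do not come from any of the libraries; the sketch shows the structural difference only and makes no claim about how large the speed difference is.

```scala
// Style 1: raw Scala values as the AST. Every operation must inspect the
// runtime type of each node with a pattern match.
object RawAst {
  def leafCount(value: Any): Int = value match {
    case m: Map[_, _] => m.values.map(leafCount).sum
    case xs: Seq[_]   => xs.map(leafCount).sum
    case _            => 1 // string, number, boolean or null
  }
}

// Style 2: values wrapped in custom classes. The operation is a method on
// the node, so the JVM picks the implementation by virtual dispatch.
sealed trait JValue { def leafCount: Int }
final case class JLeaf(value: Any) extends JValue {
  def leafCount: Int = 1
}
final case class JArray(items: Seq[JValue]) extends JValue {
  def leafCount: Int = items.map(_.leafCount).sum
}
final case class JObject(fields: Map[String, JValue]) extends JValue {
  def leafCount: Int = fields.values.map(_.leafCount).sum
}
```

The raw style needs no wrapper allocations and lets users work with ordinary collections, which is the trade-off made in exchange for the type tests.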

MAPPERS
• Parsers go from string to AST.
• Mappers go to user-specified case classes.
• Twitter, Scala Library, Json Smart, JAWN: no mapper
• Jackson, Argonaut, Rojoma: string => case classes
• Others: string => AST => case classes

DYNAMIC VERSUS STATIC TYPING
• Dynamic: AST. More flexible and agile. No additional code needed for parsing. Can be used on any valid Json data. But extra code is needed if more checking is required.
• Static: user-specified case classes. Case classes must be specified before parsing can proceed. More checking. Behavior can be attached to the case classes.
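The trade-off can be illustrated with a hand-built AST, so that no particular parser is assumed. `TypingStyles`, `latDynamic`, `Location`, and `fromAst` are hypothetical names, and `fromAst` is a hand-written stand-in for what a mapper would do.

```scala
object TypingStyles {
  // Dynamic: the AST is just nested Maps. Nothing has to be declared first,
  // but every access carries its own checks.
  def latDynamic(ast: Map[String, Any]): Option[Double] =
    ast.get("location") match {
      case Some(loc: Map[_, _]) =>
        loc.asInstanceOf[Map[String, Any]].get("lat").collect { case d: Double => d }
      case _ => None
    }

  // Static: a case class declared up front. Checking happens once, during
  // mapping, and behavior can be attached to the class.
  final case class Location(lat: Double, lng: Double) {
    def isNorthern: Boolean = lat > 0
  }

  // Hand-written stand-in for a mapper from AST to case class.
  def fromAst(ast: Map[String, Any]): Option[Location] =
    (ast.get("lat"), ast.get("lng")) match {
      case (Some(lat: Double), Some(lng: Double)) => Some(Location(lat, lng))
      case _ => None
    }
}
```

Once a `Location` exists, the rest of the program can rely on both fields being present and numeric, whereas code working on the dynamic AST has to repeat the checks wherever it reads a value.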

MAPPING TIMES (MS)
Parser         Twitter   Google
Persist Json       622     2238
Rojoma            1117     2669
Jackson            326     1150
Spray Json         557     1675
Lift Json          520     2060
Play Json         1123     3768
Argonaut           937     2550

MAPPERS
Parser         Extra Code Lines   Why
Persist Json                  0
Rojoma                      135   case classes, Array=>Seq
Jackson                       0
Spray Json                   16   case classes
Lift Json                     7   BigDecimal
Play Json                    16   case classes
Argonaut                    180   case classes, Array=>List, Seq=>List, BigDecimal=>Double

AVOIDING EXTRA CODE
• Find the types of case class parameters. Java reflection works.
• Find the names of case class parameters. Prior to Java 8 these were not available via Java reflection. Scala reflection, however, does work.
• Reflection can be quite slow. Caching can help!
• Persist: Shapeless
• Lift and Jackson: Paranamer. Gets the information by reading Java byte code symbol tables.
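The first and third points can be sketched together: looking up the constructor parameter types of a case class with plain Java reflection, and caching the result so the lookup runs once per class. `ParamTypes`, `Place`, and `of` are hypothetical names, and this is not how any of the listed libraries is implemented. Parameter names are deliberately left out, since they need one of the other techniques mentioned above.

```scala
import scala.collection.concurrent.TrieMap

object ParamTypes {
  // Example case class, declared inside an object so that its constructor
  // takes no hidden outer-instance parameter.
  final case class Place(name: String, rating: Int, open: Boolean)

  // Cache keyed by class, so the reflective lookup is not repeated.
  private val cache = TrieMap.empty[Class[_], List[Class[_]]]

  /** Parameter types of the first public constructor, via Java reflection.
    * For a case class with a single constructor this is the primary one.
    */
  def of(cls: Class[_]): List[Class[_]] =
    cache.getOrElseUpdate(cls, cls.getConstructors.head.getParameterTypes.toList)
}
```

Calling `ParamTypes.of(classOf[ParamTypes.Place])` a second time returns the cached list instead of reflecting again.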

SUMMARY
• Avoid: Scala Library, Twitter
• Fast parse and no other features: Json Smart
• Good overall choices: Jackson, Persist, Spray
• Very fast unparse: Jackson