foss.in 2012 talk (http://fossdotin2012.shdlr.com/conferences/talk/196)
Intent: There comes a time in every C/C++ programmer's life when they are looking at a smashed stack and a trashed heap, and wish that core dumps happened only when null pointers get dereferenced. This is the weak moment when people hang up their gdb boots and trade them for java.lang.NullPointerException. We shall be exploring how to use Java as a safer version of C without giving up too much control. A lot of big open source projects are starting to show up in Java for this very reason (e.g. Hadoop).
Overview: The Java programming language was considered too slow and too high level in its early days by performance junkies who believed that the only true way out was to code in C (and very reluctantly in C++). The language itself had made significant strides by the time it reached v5, and JVMs have also become quite good at what they do.
1. Thinking in C/C++, coding in Java
foss.in 2012
Arvind Jayaprakash
2. Thinking in C/C++, coding in Java
Audience
• Surely not for you if you’ve never done *nix
system programming or bare C/C++
• Maybe for you if you’ve done a reasonable
amount of the above and “hello world” Java
• Prime audience if you are being pushed
into/want to explore Java as an option for
moderately high performance applications
4. Thinking in C/C++, coding in Java
finger
Home
• anomalizer
• anomalizer
• http://anomalizer.net/
Work
• anomalizer
• http://inmobi.com/
5. Thinking in C/C++, coding in Java
history/uname
Home
• MS-DOS in 1990
• Primarily Win98 & a little bit of RH7 in 2001
• Win7 for PPT and Gentoo for everything else in 2012 (fluxbox is my window manager, xterm is my favourite terminal)
Work
• 5 years of FreeBSD & 1 year of RHEL
• Chose the OS for current employer’s servers (Ubuntu since 2008)
• Gentoo/Win7 combo on my laptop
7. Thinking in C/C++, coding in Java
Survival tips
• Java language (J2SE) != J2EE
• J2SE 5 (also known as 1.5 or JLS5) is the lowest
respectable version of the language
• Sun (now Oracle) JRE continues to remain the
most popular free JRE+JDK
• Sun-JRE 1.6.0.22 is a good min version if you have
64 bit, x86_64, NUMA hardware running Linux
• IDEs are a necessary evil; vim/emacs just don’t
cut it
12. Thinking in C/C++, coding in Java
Primitives v/s objects
• Primitive data types, structs & classes play by
the exact same set of rules in C/C++ in almost
every context
• Java fundamentally drives a wedge between
the two both at a language level and runtime
level
• This is why there is a primitive int and a class
Integer. These 2 are not interchangeable*
* Auto boxing is a deception
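A minimal sketch of why int and Integer are not interchangeable even though auto-boxing makes them look so. The class and method names (BoxingIsNotFree, which) are made up for illustration: a boxed Integer is a reference and can be null, and overload resolution treats the two types differently.

```java
// Hypothetical demo class; the names are illustrative only.
class BoxingIsNotFree {
    // A primitive always holds a value; an Integer is a reference and may be null.
    static boolean unboxingNullThrows() {
        Integer boxed = null;       // legal: Integer is a reference type
        try {
            int value = boxed;      // auto-unboxing calls boxed.intValue()
            return false;           // not reached
        } catch (NullPointerException e) {
            return true;
        }
    }

    // The compiler tells the two types apart when picking an overload.
    static String which(int v)     { return "primitive"; }
    static String which(Integer v) { return "object"; }
}
```

Calling which(5) selects the int overload without boxing; the Integer overload is chosen only when the argument is already an Integer.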
13. Thinking in C/C++, coding in Java
The approximate analogy
Primitives
• Think of primitives as values that can reside on the stack
• Lifespan always tied to source scope for local variables
Composites (Objects)
• Think of objects (classes) as values that always* reside on heap
• Now it becomes obvious that you are always dealing with pointers/references
• It also becomes obvious that their true lifespan is not tied to source scope
*escape analysis implementations in some JVMs
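The analogy can be sketched as follows, using hypothetical names (Counter, ScopeDemo). A primitive local is copied by value and dies with its frame, while an object created in a method stays alive for as long as some reference to it exists.

```java
// Hypothetical classes for illustration.
class Counter { int hits; }

class ScopeDemo {
    static Counter makeCounter() {
        int local = 1;               // primitive: a value confined to this frame
        Counter c = new Counter();   // object: heap allocated, c is only a reference
        c.hits = local;              // the int value is copied into the object
        return c;                    // the object outlives this method call
    }

    static void bump(Counter c) { c.hits++; }  // mutates the caller's object through the reference
    static void bump(int n)     { n++; }       // mutates only the callee's private copy
}
```

In C terms, bump(Counter) behaves like a function taking Counter*, and bump(int) like one taking a plain int.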
14. Thinking in C/C++, coding in Java
Nested structs/classes
Java
class Point {
    public int x;
    public int y;
}
class Rect {
    public Point top_left;
    public Point bottom_right;
}
C++
struct Point {
    int x;
    int y;
};
struct InlineRect {
    Point top_left;
    Point bottom_right;
};
struct IndirectRect {
    Point *top_left;
    Point *bottom_right;
};
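A short sketch of what the comparison implies: the Java Rect has the layout of IndirectRect, not InlineRect. RectDemo and unitRect are hypothetical names added for the example.

```java
class Point {
    public int x;
    public int y;
}

class Rect {
    public Point top_left;       // a reference, null until assigned
    public Point bottom_right;   // likewise
}

// Hypothetical helper for illustration.
class RectDemo {
    static Rect unitRect() {
        Rect r = new Rect();              // both Point references start out null
        r.top_left = new Point();         // each Point is a separate heap object
        r.bottom_right = new Point();
        r.bottom_right.x = 1;
        r.bottom_right.y = 1;
        return r;
    }
}
```

There is no way to ask Java for the InlineRect layout; the closest approximation is to flatten the four ints into Rect by hand.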
15. Thinking in C/C++, coding in Java
Null & void
• The notorious void* exists in Java; it is
commonly referred to as the class named
Object
– Any object (reference) can be directly cast to
Object
– An object (reference) of type Object can be
downcast to any type at compile time#
• null is not a type, however it is a language
defined literal (like true & false)
# but can throw an error at runtime
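A minimal sketch of Object playing the role of void*, with hypothetical names (ObjectAsVoidPtr, describe). Unlike a C cast from void*, the downcast is verified at runtime and fails with a ClassCastException instead of silently reinterpreting memory.

```java
// Hypothetical demo class for illustration.
class ObjectAsVoidPtr {
    // Accepts any reference, much like a void* parameter in C.
    static String describe(Object o) {
        if (o == null) return "null";
        if (o instanceof String) {
            String s = (String) o;   // downcast, safe after the instanceof check
            return "String of length " + s.length();
        }
        return o.getClass().getSimpleName();
    }

    // This compiles, but the cast is checked when it executes.
    static boolean badDowncastThrows() {
        Object o = Integer.valueOf(7);
        try {
            String s = (String) o;   // wrong type: throws ClassCastException
            return false;            // not reached
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```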
16. Thinking in C/C++, coding in Java
What are references in java?
Why it is like a C pointer
• Think of a reference as a C pointer
• Think of the dot operator in Java as C’s arrow operator
• null is NULL, dereferencing it is a bad idea
• Think of a final ref in Java as a const ptr (not to be confused with ptr to const)
Why it is not like a C++ reference
• j-refs are nullable (d’uh)
• C++ refs cannot be made to point to something else post declaration unlike Java refs
• == operator in Java has ptr equivalence semantics, not dereferenced object equivalence; use equals() for that
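Two of these points in code, as a sketch with hypothetical names (RefSemantics, Account): == compares the references themselves, and a final reference fixes the pointer but not the object it points to.

```java
// Hypothetical demo class for illustration.
class RefSemantics {
    static class Account { int balance; }

    // Returns { a == b, a.equals(b) } for two distinct but equal strings.
    static boolean[] compare() {
        String a = new String("gdb");
        String b = new String("gdb");
        return new boolean[] {
            a == b,        // false: two different objects, like comparing two pointers
            a.equals(b)    // true: compares the contents being pointed to
        };
    }

    static int finalIsConstPointer() {
        final Account acct = new Account();   // like `Account *const acct` in C
        acct.balance = 100;                   // allowed: the pointee is mutable
        // acct = new Account();              // would not compile: the reference is fixed
        return acct.balance;
    }
}
```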
17. Thinking in C/C++, coding in Java
vtables
• Every class inherits from Object class
• Every member function is virtual in Java; there is
no opt-out
– Hence, internally, every class has a vtable
– And every object instance has an internal pointer/ref
to the vtable of its actual type (for dynamic dispatch)
– And a fn-call is via ptr-to-fn*
• RTTI (of C++ fame) comes at no additional cost as
a side-effect & guaranteed to be available
*Unless you do some class/method finalisation
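A small sketch of dynamic dispatch and the always-available runtime type information for instance methods. Shape, Circle and DispatchDemo are hypothetical names.

```java
// Hypothetical class hierarchy for illustration.
class Shape {
    String name() { return "shape"; }         // overridable by default, no `virtual` keyword
    final String kind() { return "Shape"; }   // final: subclasses cannot override this
}

class Circle extends Shape {
    @Override
    String name() { return "circle"; }
}

class DispatchDemo {
    static String viaBase() {
        Shape s = new Circle();   // static type Shape, runtime type Circle
        return s.name();          // resolved against the runtime type
    }

    static String runtimeType() {
        Shape s = new Circle();
        return s.getClass().getSimpleName();   // type information carried by every object
    }
}
```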
18. Thinking in C/C++, coding in Java
Other deceptive similarities
19. Thinking in C/C++, coding in Java
Generics & templates aren’t the same
Java generics
• No support for primitives
• Single copy of code exists regardless of the number of type arguments a generic code is used with
• Generified code gets compiled as an entity in itself
• Bounded type parameters are possible; unbounded defaults to Object
C++ templates
• Supports all types
• One copy of object code for each template instantiation
• Glorified C style macros, compilation happens once for each expansion; some compilation errors crop up here
• No inheritance family based bounding of type parameters, only explicit specialization is possible
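The Java side of the comparison can be sketched as below (GenericsDemo and its methods are hypothetical names). A bounded type parameter restricts T to the Number family, and one compiled class serves every type argument.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class for illustration.
class GenericsDemo {
    // Bounded type parameter: T must be Number or a subclass of it.
    static <T extends Number> double sum(List<T> values) {
        double total = 0;
        for (T v : values) {
            total += v.doubleValue();   // legal only because of the bound
        }
        return total;
    }

    // A single class exists at runtime regardless of the type argument.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> ints = new ArrayList<Integer>();
        return strings.getClass() == ints.getClass();
    }
}
```

A C++ vector<std::string> and vector<int> would be two unrelated instantiations; here both lists are plain ArrayList objects.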
20. Thinking in C/C++, coding in Java
casts
• Syntactically identical to C casts
• Let us speak in C++ terms for semantic clarity
– static_cast is permitted
– No const_cast as there are no consts to begin with
– dynamic_cast permitted due to implicit RTTI
support (hence Object objects can be cast to
anything)
– reinterpret_cast disallowed; convert & copy is
the only way out
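A sketch of the three cases that matter in practice, with hypothetical names (CastDemo and its methods). The last method shows one way to do "convert & copy" where C would use a reinterpret cast; java.nio.ByteBuffer is only one option for this.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class for illustration.
class CastDemo {
    // static_cast equivalent: a numeric conversion that truncates toward zero.
    static int truncate(double d) {
        return (int) d;
    }

    // dynamic_cast equivalent: check the runtime type, then cast.
    static Integer asInteger(Object o) {
        return (o instanceof Integer) ? (Integer) o : null;
    }

    // No reinterpret_cast: copy the float's bytes out and read them back as an int.
    static int bitsOf(float f) {
        return ByteBuffer.allocate(4).putFloat(f).getInt(0);
    }
}
```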
22. Thinking in C/C++, coding in Java
Auto-boxing woes
• Java 5 made it syntactically possible to use a
primitive and its objectified version
interchangeably (e.g. Long & long)
• The costs however are very different
– Indirection (ptr de-ref) to read value
– Memory footprint is 2 ptrs (one to value, and the
vptr inside object) + that of actually storing the
primitive
23. Thinking in C/C++, coding in Java
You don’t want to see this
Integer x;
for (int i = 0; i < 100; i++) {
    x = i * i;   // boxed on every iteration, as if x = Integer.valueOf(i * i)
}
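The usual way out of the loop above is to keep the working variable primitive and box only at the boundary where an object is really required. A minimal sketch with hypothetical names (BoxOnceDemo):

```java
// Hypothetical demo class for illustration.
class BoxOnceDemo {
    // Boxed accumulator: every += unboxes, adds, and boxes the result again.
    static Long sumBoxed(int n) {
        Long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;   // roughly total = Long.valueOf(total.longValue() + i)
        }
        return total;
    }

    // Primitive accumulator: no boxing happens inside the loop.
    static long sumPrimitive(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }
}
```

Both methods return the same number; the difference is in the boxing work (and possible allocation) per iteration, which the JIT may or may not be able to eliminate.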
24. Thinking in C/C++, coding in Java
int[] v/s ArrayList<Integer>
• vector<int> & int[] have identical performance
in C++, don’t carry that assumption into Java!
• Remember, generics only work with
objects, so we can’t use an int with it
• And int is just not the same as an Integer
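A sketch of the two containers side by side (ArrayVsList is a hypothetical name). The code looks almost identical, which is exactly why the different memory layout is easy to miss.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class for illustration.
class ArrayVsList {
    static long sumArray(int[] values) {
        long total = 0;
        for (int v : values) total += v;       // ints stored contiguously, no indirection
        return total;
    }

    static long sumList(List<Integer> values) {
        long total = 0;
        for (Integer v : values) total += v;   // one reference followed and unboxed per element
        return total;
    }

    static List<Integer> box(int[] values) {
        List<Integer> out = new ArrayList<Integer>(values.length);
        for (int v : values) out.add(v);       // each add boxes the int into an Integer
        return out;
    }
}
```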
26. Thinking in C/C++, coding in Java
In words
• On an un-tuned 64 bit JVM, pay at least 400%
memory tax (it is still 200% on a tuned JVM)
• 100% apparent memory access cost
• Completely wreck your cache lines by simply
iterating through the array (real tax can
exceed 100%)
• And yes, there is copying involved when you
expand beyond a certain limit
• And more work for GC …
27. Thinking in C/C++, coding in Java
The solution
• So what about collections of primitives?
– What if you want an expandable array of ints?
– What if you want a map of short to double?
• Use primitive collection libraries
– trove4j solves the above problems
– Its packages live under gnu.trove & it comes with an LGPL license
• The larger point however is to understand the
object model & memory layout
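To make the memory layout point concrete, here is a minimal hand-rolled expandable array of ints. IntVector is a hypothetical class written for this example and is not trove4j's API; a real primitive collection library offers far more.

```java
import java.util.Arrays;

// Hypothetical minimal class; this is NOT trove4j's API.
class IntVector {
    private int[] data = new int[8];   // one contiguous block of ints, no boxing
    private int size = 0;

    void add(int value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, data.length * 2);   // grow by copying, like vector<int>
        }
        data[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return data[index];
    }

    int size() {
        return size;
    }
}
```

Usage is what one would expect: create an IntVector, call add(i * i) in a loop, then read values back with get(index).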
28. Thinking in C/C++, coding in Java
No reinterpret cast for you!
• Imagine trying to read values from byte
streams such as files & sockets
• You have 3 choices
– Bottom-up read, one primitive at a time (entire
class chain must play nice for this)
– Slurp the blob, break the blob and make a
meaningful object by copying over the primitives
in top-down fashion (a.k.a. memcpy)
– Use java serialization (disallows conditional
parsing)
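The second choice (slurp the blob, then copy primitives out) might look like the sketch below. The wire format is invented for the example: a 2-byte version, a 4-byte length and an 8-byte timestamp, all big-endian. RecordHeader is a hypothetical class.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical wire format: short version, int length, long timestamp (big-endian).
class RecordHeader {
    final short version;
    final int length;
    final long timestamp;

    RecordHeader(short version, int length, long timestamp) {
        this.version = version;
        this.length = length;
        this.timestamp = timestamp;
    }

    // Where C would cast the buffer to a struct pointer, Java copies field by field.
    static RecordHeader parse(byte[] blob) {
        ByteBuffer buf = ByteBuffer.wrap(blob).order(ByteOrder.BIG_ENDIAN);
        short version = buf.getShort();
        int length = buf.getInt();
        long timestamp = buf.getLong();
        return new RecordHeader(version, length, timestamp);
    }

    // The reverse direction: copy the fields into a fresh 14-byte array.
    byte[] toBytes() {
        return ByteBuffer.allocate(14).order(ByteOrder.BIG_ENDIAN)
                .putShort(version).putInt(length).putLong(timestamp).array();
    }
}
```

Because parsing is ordinary code, later fields can be read conditionally (for example, based on the version), which plain Java serialization does not allow.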
30. Thinking in C/C++, coding in Java
Dealing with slow parts
(of any language)
• A common reason to fall back to “native”
languages is when a large amount of I/O is
involved
• I/O is dreaded as it usually translates to *nix
syscalls
• A lot of syscalls exist specifically to optimize
userspace/kernel space transition
inefficiencies
• They also have OS idiosyncrasies
31. Thinking in C/C++, coding in Java
Java & I/O
*nix & C feature      Java equivalent                                 Available since
Allocate char*        ByteBuffer.allocate()                           1.4
sendfile()            FileChannel.transfer{To|From}                   1.4
mmap()                FileChannel.map()                               1.4
epoll()               java.nio.channels.Selector + SelectorProvider   API since 1.4, epoll as implementation since 1.6
readv()/writev()      Channel.read/write (ByteBuffer[])               1.4
chmod()/chown()/      NIO2 file api                                   1.7
inotify()/stat()/
copy()/symlink()/
readdir/…
SCTP                  -                                               1.7
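As one example from the table, a sketch of the mmap() row: write a value through a FileChannel, then read it back through a memory-mapped view of the same file. MmapDemo and the temporary file name are hypothetical, and the try-with-resources syntax assumes Java 7 or later.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical demo class for illustration.
class MmapDemo {
    static int writeThenMap(int value) {
        try {
            File tmp = File.createTempFile("mmap-demo", ".bin");
            tmp.deleteOnExit();
            try (RandomAccessFile file = new RandomAccessFile(tmp, "rw");
                 FileChannel channel = file.getChannel()) {
                // Ordinary write through the channel.
                ByteBuffer out = ByteBuffer.allocate(4);
                out.putInt(value);
                out.flip();
                while (out.hasRemaining()) {
                    channel.write(out);
                }
                // mmap() equivalent: map the first 4 bytes read-only and read them.
                MappedByteBuffer mapped = channel.map(FileChannel.MapMode.READ_ONLY, 0, 4);
                return mapped.getInt(0);
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```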
33. Thinking in C/C++, coding in Java
Not covered in the talk
• Reflection
– Runtime inspection of types & dynamic code gen
• JIT
– JRE profiles applications & recompiles code with
optimizations mid-flight!
– Discovers structural shortcuts possible in a given
app & exploits them
• JNI
– When you have to bridge your C code
34. Thinking in C/C++, coding in Java
Go read about the following
• “maven” (awesome build mgmt tool)
• “Google Guava” (as important as boost for
cpp, historically speaking)
• “Project lombok” (uses annotations to tuck
away massive boilerplate coding)
• “slf4j” (log4j is so Java 1.2, never code against
it)
• “netty” (the libevent of Java)
35. Thinking in C/C++, coding in Java
And some more
• “testng” (unit & module testing system)
• “mockito” (helps in creating test mocks)
• “javassist” (create entire classes from strings
at runtime!)
• “guice” & “Spring DI” (dependency injection)