_ _ _ ____
/ ___ ___(_) __ _ _ __ _ __ ___ ___ _ __ | |_ | ___|
/ _ / __/ __| |/ _` | '_ | '_ ` _ / _ '_ | __| |___
/ ___ __ __ | (_| | | | | | | | | | __/ | | | |_ ___) |
/_/ ____/___/_|__, |_| |_|_| |_| |_|___|_| |_|__| |____/
|___/
CS 2110 CODING COMPETITION 2009 ENTRY by Mengxiang and Chuck
-====================================================================-
Table of Contents
-----------------
1.) Philosophy
2.) Caching
3.) GraphViz
4.) Root-Finding Algorithm
5.) Multithreading (!)
6.) Fibonacci Heap
7.) Prim's Algorithm
8.) Testing
9.) Conclusion and Future Work
-====================================================================-
- PHILOSOPHY -
-====================================================================-
Our philosophy behind this project was to emphasize performance, while
still guaranteeing accurate results. We took several approaches toward
achieving this goal. We owe our vast speed-up to a few clever tricks,
which we will discuss thoroughly in this "Read Me" file. Thus, the
need for speed and our own unrelenting competitive spirit were our
motivations for developing this project.
We hope you enjoy reviewing our entry as much as we did creating it!
-====================================================================-
- CACHING -
-====================================================================-
The first and perhaps most obvious approach we took toward making the
program run faster was to add caching of the gene and animal distances.
Whenever a distance is requested, we first check whether it has already
been computed. If it has, we look it up in a hash table in O(1) expected
time. If it has not, we compute it from scratch and store it in the hash
table for future re-use. The gene and animal pairs serve as the keys of
the hash table. Java's built-in hash table functionality sufficed for this
task. We realize that this speed comes at the cost of memory, but the
performance gains made this trade-off well worth it. Before caching was
added, it took the program about 1.5 hours to generate 40 graphs.
Afterward, it took about ten seconds. Reducing each repeated distance
computation to an O(1) lookup really does pay off.
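
The check-then-compute-then-store scheme described in this section can be
sketched as follows. This is a minimal illustration, not our actual code:
the DistanceCache class, the "|" key separator, and the assumption that the
distance is symmetric (so that (a, b) and (b, a) share one entry) are all
hypothetical choices made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToIntBiFunction;

// Hypothetical memoizing wrapper around an expensive pairwise distance.
class DistanceCache {
    private final Map<String, Integer> cache = new HashMap<>();
    private final ToIntBiFunction<String, String> compute;
    int misses = 0; // how many times the expensive computation actually ran

    DistanceCache(ToIntBiFunction<String, String> compute) {
        this.compute = compute;
    }

    // Assumes a symmetric distance and names that never contain '|'.
    private static String key(String a, String b) {
        return a.compareTo(b) <= 0 ? a + "|" + b : b + "|" + a;
    }

    int distance(String a, String b) {
        String k = key(a, b);
        Integer cached = cache.get(k);    // expected O(1) lookup
        if (cached != null) {
            return cached;
        }
        misses++;
        int d = compute.applyAsInt(a, b); // compute once...
        cache.put(k, d);                  // ...and store for re-use
        return d;
    }
}
```

The misses counter is only there to make the effect visible: asking for the
same pair twice, in either order, runs the expensive function once.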
-====================================================================-
- GRAPHVIZ -
-====================================================================-
We leveraged the GraphViz software to provide a springboard that would
hopefully launch us toward success in implementing our root-finding
algorithm. We wrote a GraphViz class that would generate a GraphViz
output file, similar to the Dendroscope and TreePrinter classes. We then
generated phylogenomic graphs for all 40 animals as roots, and began to
analyze the characteristics of these graphs in order to find some sort
of metric to determine the best root animal. Example JPEG graphs and
their corresponding GraphViz source code are provided in this .ZIP file.
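
For readers unfamiliar with the GraphViz input format, the kind of output
such a class produces can be sketched like this. The DotWriter class and
its toDot method are hypothetical names for illustration and do not mirror
the GraphViz class in our entry; the sketch simply emits one DOT edge per
parent/child pair.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper: renders a rooted tree as GraphViz DOT source.
class DotWriter {
    // children maps each parent name to the names of its children.
    static String toDot(String graphName, Map<String, List<String>> children) {
        StringBuilder sb = new StringBuilder();
        sb.append("digraph ").append(graphName).append(" {\n");
        for (Map.Entry<String, List<String>> e : children.entrySet()) {
            for (String child : e.getValue()) {
                // Quote the names so underscores and hyphens are safe.
                sb.append("    \"").append(e.getKey()).append("\" -> \"")
                  .append(child).append("\";\n");
            }
        }
        sb.append("}\n");
        return sb.toString();
    }
}
```

Saving the returned string to a file such as tree.dot and running
"dot -Tjpg tree.dot -o tree.jpg" should then render the tree as a JPEG.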
-====================================================================-
- Root-Finding Algorithm -
-====================================================================-
We immediately noticed the aesthetics of the Parmesanian graph, which
according to the online assignment was the best root. The tree was much
wider than tall, so initially we figured we could simply use the width
of the tree as a way to determine the root. Unfortunately, several trees
had the same width as the Parmesanian's, resulting in ties for the best
root. Moreover, the algorithm was not deterministic because the ties
were not resolved in any defined manner, so any tree with that width
could end up being chosen as the best root.
Our next attempt was to consider the width-to-height ratio of the graphs.
The tree with the greatest width-to-height ratio appeared to be the one
with the Parmesanian root. However, this technique suffered the same fate
as the previous one: several trees tied with the same ratio, and with no
defined tie-breaking rule the algorithm was not deterministic either.
We had one last characteristic to consider, though. The tree was much
more "balanced" for the Parmesanian than for any other animal. We needed
a mathematical definition of balance, a definitive way to measure
quantitatively how balanced a given tree is. We scoured the Internet
relentlessly for such a measure to no avail, so we began to brainstorm
on our own.
The first method up for consideration was to use the fact that the number
of nodes in a full binary tree doubles with each level. Likewise, the
levels of a full n-ary tree grow by a factor of n. The closer a tree is
to being balanced, the more closely this relationship holds. The main
problem with this approach, though, is: what is n? Should n be the same
for all trees? What if one n is better for one graph and another n is
better for another graph? This method left us with even more unanswered
questions, so we looked for an alternative.
Finally, the method we chose to implement was a recursive algorithm.
We realized that balanced trees have equal numbers of children on each
side. In order to determine just how balanced a tree is, we recursively
compute a so-called "mirror index" that takes this into consideration. The
mirror index algorithm traverses each sub-tree of a given node, counts its
children, and adds the differences between this sub-tree and the other
sub-trees at that level to the mirror index. Then the algorithm recurses to
the next level and counts the "sub-sub-trees", adding the differences in
children to the index accordingly. The algorithm worked! As you can see
below, the mirror index is by far the lowest for the Parmesanian animal:
Frilly_Sea_Sprat: 70
Asian_Boxing_Lobster: 292
Policle: 330
Jelly_Belly: 198
Ballards_Hooting_Crane: 262
Pompous_Snark: 262
Fuzzy_Trible: 174
Sextopus: 356
Gilligans_Squimp: 292
Ballards_ProtoDuck: 222
Shy_Frecklepuss: 216
Bards_Star: 292
Larval_TreeNymph: 192
Globe_Floater: 216
Snuffling_Blat: 152
Big-Billed_Peacock: 262
Spotted_Ghila: 216
Gray_Floop: 216
Leaping_Lizard: 152
Sprats_Butterfly: 192
Munkles_Mouse: 330
Strats_Squirrel: 262
Biscuit: 262
Nocturnal_Mourningbird: 152
Green_Herring: 286
Nocturnal_Plexum: 262
Green_SnapDragon: 222
Common_Mudfly: 262
Striped_Salamander: 286
Paradise_Rockfish: 216
Hairy_Rock_Snot: 58
Darwins_Tortle: 292
Hallucigenia: 292
Parmesanian: 24
Swamp_Slime: 140
Pink_Ziffer: 286
Toothy_Ballonfish: 216
Elephant_Snark: 152
Translucent_Tridle: 88
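
One possible reading of the mirror index described above is sketched here.
It is an illustration only: the TreeNode and MirrorIndex classes are
hypothetical, "counting children" is interpreted as counting every node in
a sub-tree, and the differences are summed over each pair of sibling
sub-trees. We make no claim that this exact variant reproduces the scores
listed above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical n-ary tree node used only for this sketch.
class TreeNode {
    final String name;
    final List<TreeNode> children = new ArrayList<>();
    TreeNode(String name) { this.name = name; }
    TreeNode add(TreeNode child) { children.add(child); return this; }
}

class MirrorIndex {
    // Number of nodes in the sub-tree rooted at n, including n itself.
    static int size(TreeNode n) {
        int total = 1;
        for (TreeNode c : n.children) {
            total += size(c);
        }
        return total;
    }

    // Sums the size differences between sibling sub-trees at this node,
    // then recurses into each child. A lower result means better balance.
    static int mirrorIndex(TreeNode n) {
        int index = 0;
        List<TreeNode> kids = n.children;
        for (int i = 0; i < kids.size(); i++) {
            for (int j = i + 1; j < kids.size(); j++) {
                index += Math.abs(size(kids.get(i)) - size(kids.get(j)));
            }
        }
        for (TreeNode c : kids) {
            index += mirrorIndex(c);
        }
        return index;
    }
}
```

A perfectly symmetric tree scores 0 under this definition. The sketch
recomputes sub-tree sizes at every level for clarity; caching the sizes
would avoid the repeated traversals on large trees.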
-====================================================================-
- Multithreading (!) -
-====================================================================-
We decided to go out on a limb here and do multithreading. After all, now that
Moore's Law is quieting down and we are approaching the physical limits of what
good old silicon transistor CPUs can do as far as clock speed goes, the chip
manufacturers still want to release innovative products, so their idea is,
"Throw more cores on it!" Unfortunately, computer scientists haven't yet figured
out how to take full advantage of additional cores, and parallelization is still
an active research topic. PC game engines such as the Source Engine and the one
behind Crysis have only recently added support for multithreading. Now, GPUs
are being utilized computationally for the same purpose: parallelization.
We were admittedly tired of seeing the CPU usage in the Windows Task Manager
only going up to 50% on a dual-core ThinkPad laptop, especially back when
generating graphs took a really long time, before the optimizations were put in
place. We desperately wanted to double the speed of the program by using nearly
100% of the CPU the entire time, and we were inspired by Professor Birman's
lecture on multithreading.
We decided to jump on the bandwagon and implement the gene algorithms in
parallelizable form. In order to do this, we divided our program up into three
stages that must run in series: the distance computations, the animal species
graph generation, and the root-finding algorithm. We then wrote multithreaded
implementations of these algorithms. Luckily, the computations were extremely
well-suited for parallelization; the algorithms could work on different animals
at the same time since each animal's data is independent of the others'.
We wrote a ThreadManager class (not sure if this fits some kind of design
pattern) that dispatches worker threads with allocated workloads that work in
tandem to accomplish each of the three serial stages in parallel. We ran into
the problem of deciding when each thread is done. Our solution was to put the
main thread to sleep and have it "wake up" every 100 ms to poll the other
threads and see whether they had completed. We realize that Java has built-in
wait/notify functionality for threads, but alas we ran out of time, with only
three hours to go before the deadline. We needed a way to make sure that all of
the threads were completely finished before moving on to the next serial stage,
so we implemented our own Semaphore class with atomic operations for
decrementing and incrementing the semaphore count (P and V).
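
For comparison, here is a sketch of how a hand-written semaphore could
replace the 100 ms polling loop by using Java's built-in wait/notify. This
is the approach we did not have time to implement, so it is not what our
ThreadManager does. The CountingSemaphore and StageRunner names are
hypothetical.

```java
// Hypothetical counting semaphore built on Java's intrinsic locks.
class CountingSemaphore {
    private int count;

    CountingSemaphore(int initial) { count = initial; }

    // P: block until the count is positive, then decrement it.
    synchronized void p() throws InterruptedException {
        while (count == 0) {
            wait();
        }
        count--;
    }

    // V: increment the count and wake one waiting thread.
    synchronized void v() {
        count++;
        notify();
    }
}

class StageRunner {
    // Runs every task on its own thread and returns once all have finished.
    static void runStage(Runnable[] tasks) {
        CountingSemaphore done = new CountingSemaphore(0);
        for (Runnable task : tasks) {
            new Thread(() -> {
                try {
                    task.run();
                } finally {
                    done.v(); // signal completion even if the task throws
                }
            }).start();
        }
        try {
            for (int i = 0; i < tasks.length; i++) {
                done.p(); // one P per worker; no sleeping or polling
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }
}
```

Calling runStage once per stage keeps the three stages strictly in series
while the work inside each stage runs in parallel.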
Multithreaded code can be hard to debug. We discovered this ourselves the hard
way with this assignment. The programmer's mantra of "code for an hour, debug
for a week" rang quite true for us. We first ran into problems with Java's
built-in HashMap class not being thread-safe. Luckily for us, Java comes
equipped with a ConcurrentHashMap class that alleviates these issues. Switching
to the concurrent step-child of HashMap was not difficult at all. We also ran
into deadlocks and even more thread-safety issues. HashSet just wasn't
cooperating with us and caused the threads to deadlock halfway through. We took
advantage of Java's "synchronized" keyword to make the Phylogeny tree generation
code run atomically in each thread, and this resolved our deadlocking issues. At
times, the frustration became such that we almost gave up on the idea of using
threads, but we finally managed to work out all the bugs and come up with a
parallel implementation of the project.
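
As an illustration of why ConcurrentHashMap suits a shared distance cache,
here is a minimal sketch. It assumes a modern JDK (Java 8 or later), whose
computeIfAbsent method applies the function at most once per key even under
contention. The ConcurrentDistanceCache class and its string pair keys are
hypothetical and are not taken from our code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical thread-safe cache keyed by a string naming the pair.
class ConcurrentDistanceCache {
    private final ConcurrentHashMap<String, Integer> cache =
        new ConcurrentHashMap<>();

    // Returns the cached distance for pairKey, computing it if absent.
    // computeIfAbsent is atomic per key, so concurrent callers asking for
    // the same pair trigger the computation only once.
    int distance(String pairKey, Function<String, Integer> compute) {
        return cache.computeIfAbsent(pairKey, compute);
    }
}
```

One caveat of this design: the supplied function runs while the map holds
an internal lock for that key, so it should not try to update the same map.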
-====================================================================-
- Fibonacci Heap -
-====================================================================-
The online assignment web page suggested that if we were truly crazy, we could
use our own Fibonacci Heap implementation to generate the MST. Since we are, in
fact, self-professedly crazy, we thought, "Sure! Why not?" This task did not
prove to be nearly as easy as we thought it would be. The Wikipedia article was
vague in explaining the Fibonacci heap operations, so we had to sort of reverse
engineer the diagrams on there. Moreover, we ran into problems with marked root
nodes, and our alpha version of the implementation frequently violated the heap
invariant. JUnit testing came to the rescue here, and we were able to work out
all of the bugs and reap the benefits of using a Fibonacci Heap. We wrote our
own Priority Queue implementation that took advantage of this Fibonacci Heap
class and used it for our next task: Prim's algorithm.
-====================================================================-
- Prim's Algorithm -
-====================================================================-
The project web page mentioned using Prim's algorithm instead of Assignment 4's
Naive MST algorithm. We figured, what better way to put our Fibonacci heap
implementation to use?
Prim's algorithm turned out to be a bit more of a challenge than we had thought.
We ended up having to rebuild the PriorityQueue after each iteration to reflect
the new distances of the animals that were going to be added. Moreover, we had
to scrap our code three times and rewrite it because things just were not
working properly.
Implementing the lexicographic tie breaker turned out to be the hardest part.
Initially, our Fibonacci Priority Queue class was designed to be more like a
traditional Priority Queue by using numerical priorities instead of comparators,
which had struck us as a clumsy solution. Nevertheless, we had to resort to
using generics and supporting the Comparator interface in our code, albeit at
the cost of more bloated, more complex code. Once we switched to using the
Comparator interface, we merely had to write our own comparison routine to
check for and break ties.
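
A comparison routine of the kind described above might look like the
following sketch. The Candidate and TieBreak classes are hypothetical, and
Java's own PriorityQueue stands in for our Fibonacci-heap-backed queue.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical queue entry: an animal and its current distance to the tree.
class Candidate {
    final String animal;
    final int distance;

    Candidate(String animal, int distance) {
        this.animal = animal;
        this.distance = distance;
    }
}

class TieBreak {
    // Smaller distance first; equal distances fall back to name order.
    static final Comparator<Candidate> BY_DISTANCE_THEN_NAME =
        Comparator.comparingInt((Candidate c) -> c.distance)
                  .thenComparing(c -> c.animal);

    // java.util.PriorityQueue is used here as a stand-in queue.
    static PriorityQueue<Candidate> newQueue() {
        return new PriorityQueue<>(BY_DISTANCE_THEN_NAME);
    }
}
```

With this ordering, two animals at the same distance always come out of the
queue in lexicographic order, which is what makes the result deterministic.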
We thought of two ways of getting around Professor Birman's infinite-distance
"hack" for siblings. The first was to use some sort of look-up table that could
tell us instantly whether two animals were siblings, and then do some sort of
clever work-around for siblings in the PhylogenyTree Prim's algorithm
implementation. The second idea was the one we ended up using. When we build the
Phylogeny tree, we first ignore all siblings when looking for the closest
distance for the minimum spanning tree. If we cannot find a node because all of
the candidates are siblings, then we take the first sibling and use that instead
as the animal with the closest distance. The end result is that we no longer
need to set the siblings' distances to the ad hoc infinity value that was needed
before, and we still get exactly the same MST.
The resulting Prim's algorithm implementation turned out to be a lot more
streamlined than we had expected. It was clearly cleaner than the Naive
MST-building algorithm and about half the length in code.
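
For reference, the general shape of Prim's algorithm on a complete distance
graph can be sketched as below. This is a simplified stand-in rather than
our implementation: it uses java.util.PriorityQueue instead of a Fibonacci
heap, re-inserts entries instead of rebuilding the queue, breaks ties by
vertex index instead of by animal name, and leaves out the sibling
handling. The PrimSketch class is a hypothetical name.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

class PrimSketch {
    // Returns parent[v] for each vertex of the MST of a complete graph
    // given as a symmetric distance matrix; parent[root] is -1.
    static int[] mst(int[][] dist, int root) {
        int n = dist.length;
        int[] parent = new int[n];
        int[] best = new int[n];
        boolean[] inTree = new boolean[n];
        Arrays.fill(parent, -1);
        Arrays.fill(best, Integer.MAX_VALUE);
        best[root] = 0;
        // Entries are {distance, vertex}; distance ties go to the lower index.
        PriorityQueue<int[]> queue = new PriorityQueue<>(
            (a, b) -> a[0] != b[0] ? Integer.compare(a[0], b[0])
                                   : Integer.compare(a[1], b[1]));
        queue.add(new int[] {0, root});
        while (!queue.isEmpty()) {
            int u = queue.poll()[1];
            if (inTree[u]) {
                continue; // stale entry superseded by a closer one
            }
            inTree[u] = true;
            for (int v = 0; v < n; v++) {
                if (!inTree[v] && dist[u][v] < best[v]) {
                    best[v] = dist[u][v];
                    parent[v] = u;
                    queue.add(new int[] {best[v], v}); // no decrease-key
                }
            }
        }
        return parent;
    }
}
```

Re-inserting a vertex whenever its best distance improves trades a slightly
larger queue for not needing a decrease-key operation, which is the
operation a Fibonacci heap is designed to make cheap.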
-====================================================================-
- TESTING -
-====================================================================-
Our attitude toward testing was "test early and test often". Thus we devised
as many tests as we could to try to break our program. We were successful in
many instances, which helped to improve the stability of our code. Throughout
the project, we used the Subversion version control system that Chuck had
installed on his OpenBSD box at home to aid in collaboration. Eclipse
even had a plugin that allowed it to work with SVN directly. This
allowed us to write tests and run them simultaneously in hopes of discovering
bugs. We came up with some pretty cool ideas for tests!
Multithreading necessitated a unique kind of testing we called "stress
testing". The idea was to throw 20 threads in the ring and have them duke it out
and try to deadlock each other or reveal any race conditions. For the latter,
we repeated the test for multiple trials and checked to make sure the root
animal was the same each time. This spot checking turned out to be very useful
for detecting small variations in the tree. Such variations would manifest
themselves later on in the root finding algorithm, yielding completely
different results.
Furthermore, the Fibonacci heap needed thorough testing if we were going to
boldly replace Java's venerable PriorityQueue class with our own hack. Our
best idea was to try constructing multiple random heaps and perform our
own random set of operations on them, checking the heap invariant after every
one. This test proved to be quite effective. Many hidden bugs lurking within
the heap implementation were swiftly and surely brought to light by this
test. As a result, we gained some confidence that our own heap solution
was worthy enough to contend with Sun's (wishful thinking! :).
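
The randomized testing idea can be sketched as follows. Note that this
variant is a black-box check, which differs from the invariant check
described above: rather than inspecting the heap's internal structure, it
compares every poll against the minimum of a plain reference list. The
RandomHeapTest class is a hypothetical name, and any java.util.Queue can
be passed in as the heap under test.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.Random;

class RandomHeapTest {
    // Applies random adds and polls to the queue under test and checks
    // that every poll returns the minimum of a plain reference list.
    static boolean behavesLikeMinHeap(Queue<Integer> heap, long seed, int ops) {
        Random rng = new Random(seed); // fixed seed keeps failures repeatable
        List<Integer> reference = new ArrayList<>();
        for (int i = 0; i < ops; i++) {
            if (reference.isEmpty() || rng.nextInt(3) != 0) {
                int value = rng.nextInt(1000);
                heap.add(value);
                reference.add(value);
            } else {
                Integer expected = Collections.min(reference);
                reference.remove(expected); // removes one occurrence by value
                if (!expected.equals(heap.poll())) {
                    return false;
                }
            }
            if (heap.size() != reference.size()) {
                return false;
            }
        }
        return true;
    }
}
```

Passing in "new java.util.PriorityQueue<Integer>()" exercises the harness
against a known-good heap; a custom heap would need to implement Queue, or
the harness would need adapting to its interface.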
Mengxiang wrote many of the rote validity tests in the code. They test the
methods for correctness and fault tolerance.
-====================================================================-
- CONCLUSION AND FUTURE WORK -
-====================================================================-
Time is a scarce resource at Cornell. Some of our most ambitious ideas did
not make it into the final product, but that can be said about many project
life cycles in the real world. We had thought of writing a 3D OpenGL tree
visualization tool for the GUI, but we were one day short of actually
including it in our project. JOGL would have facilitated this, along with
prior experience with OpenGL in other projects.
Also, we thought Birman's cloud/distributed computing stuff was pretty neat
and were wondering if we could somehow dispatch our threads on other machines
using Java's web services functionality. Unfortunately, multithreading alone
proved to be ambitious enough, and we were not able to implement this, but
hey, it's still a pretty cool idea nonetheless to crunch out large DNA data
sets "in the cloud" much like how protein folding is being carried out
nowadays.
Overall, we thought the project was pretty successful. Our greatest triumph
was hands down the multithreading, but all in all, the rest of the project
went just as smoothly and we seemed to work quite nicely toward
accomplishing our goals here even if it meant being overly ambitious at times!
("`-''-/").___..--''"`-._
`6_ 6 ) `-. ( ).`-.__.`)
(_Y_.)' ._ ) `._ `. ``-..-'
_..`--'_..-_/ /--'_.' ,'
(il).-'' (li).' ((!.-'