INSTITUTE OF INFORMATION TECHNOLOGY (IIT), UNIVERSITY OF DHAKA
A High-Throughput Bioinformatics Distributed Computing Platform

19-09-2012
Presented by:

Md. Habibur Rahman
BIT 0216
Institute of Information Technology
University of Dhaka
Bangladesh


The contributors of the paper

Thomas M. Keane, Andrew J. Page, James O. McInerney, and Thomas J. Naughton

Bioinformatics and Pharmacogenomics Laboratory, National University of Ireland, Maynooth, Co. Kildare, Ireland

Department of Computer Science, National University of Ireland, Maynooth, Co. Kildare, Ireland

Homepage: http://www.cs.nuim.ie/distibuted

Publication

18th IEEE Symposium on Computer-Based Medical Systems (CBMS’05)




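Suitability of Bioinformatics to Distributed Computing

- Many bioinformatics problems exhibit a class of algorithmic parallelism referred to as coarse-grained parallelism: the work divides into large tasks that can run independently.
- High compute-to-data ratio: each task needs a lot of computation relative to the data that must be transferred.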




Topic and Problem Overview

- Demand for high-performance computing in bioinformatics has increased dramatically because of the rapid growth in the size of genomic databases.
- With traditional database search algorithms, a full search of a large database is not feasible in a reasonable time.
- Heuristic algorithms make the search feasible, but they reduce the sensitivity of the search.
- Evolutionary biology: phylogenetic tree reconstruction and greedy heuristic algorithms.




Proposed solution

o According to the authors of the paper:

  “We present a general-purpose programmable distributed computing platform suitable for deployment in a typical university environment where many semi-idle desktop PC’s are connected via a network”

o The system is fully cross-platform.
o Two distributed bioinformatics applications:
    i) DSEARCH
    ii) DPRml
Proposed solution (cont.)

o Java distributed computing platform
  - Client-server model.
  - The server controls the resources (database, algorithm or computer hardware).
  - The model is divided into three separate pieces of software: server, client and remote interface.
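
The slides do not show how the server hands work to the clients. As a rough illustration only, the sketch below models the client-server split inside a single Python process: a hypothetical `Server` class owns a queue of work units, and a simulated client repeatedly pulls one unit, processes it, and returns the result. Every name here (`Server`, `get_unit`, `submit`, `run_client`, `gc_count`) is an assumption made for illustration and is not taken from the platform, which is written in Java.

```python
from collections import deque


class Server:
    """Hypothetical server: owns the work units and collects results."""

    def __init__(self, units):
        self.pending = deque(enumerate(units))  # (unit_id, payload) pairs
        self.results = {}

    def get_unit(self):
        # Hand out the next unit, or None when no work is left.
        return self.pending.popleft() if self.pending else None

    def submit(self, unit_id, result):
        self.results[unit_id] = result


def run_client(server, compute):
    """Hypothetical client loop: pull a unit, compute, send the result back."""
    done = 0
    while (unit := server.get_unit()) is not None:
        unit_id, payload = unit
        server.submit(unit_id, compute(payload))
        done += 1
    return done


def gc_count(seq):
    # Stand-in for a real bioinformatics computation.
    return sum(base in "GC" for base in seq)


server = Server(["GATTACA", "GGCC", "ATAT"])
processed = run_client(server, gc_count)
ordered = [server.results[i] for i in sorted(server.results)]
```

Because the clients pull work rather than having it pushed to them, a faster or less busy machine would naturally process more units in a design like this, which suits a pool of semi-idle desktop PCs.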




Proposed solution (cont.)

Fig: Diagram of the complete system
Proposed solution (cont.)

Installation and Deployment
- The platform consists of three executable JAR files corresponding to the server, client and remote interface.
- The client runs as a low-priority background service.
- Hardware specification: at least a Pentium IV processor.
- OS compatibility: Windows, Sun Solaris, Mac OS X and Linux.




DPRml
- Distributed Phylogeny Reconstruction by maximum likelihood

Previous situation:
- Maximum likelihood (ML) is one of the most accurate techniques for reconstructing phylogenies.
- Parallel ML programs had been developed for reconstructing large and accurate phylogenetic trees.
- These programs were implemented in platform-specific languages.



DPRml (cont.)
- Distributed Phylogeny Reconstruction by maximum likelihood

After the development of the distributed computing platform:
- One of the most general and powerful likelihood-based phylogenetic tree building programs.
- Uses a proven tree building algorithm and the Phylogenetic Analysis Library.
- Multiple phylogenetic computations can run simultaneously.
- Platform-independent ML program.


DPRml (cont.)
- Distributed Phylogeny Reconstruction by maximum likelihood

Speedup testing:

Fig. Speedup achieved by running 6 simultaneous DPRml problems using between 1 and 40 semi-idle processors.
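
The slides do not define speedup. Assuming the conventional definition, where $T(1)$ is the run time on one processor and $T(p)$ is the run time on $p$ processors, the plotted quantity and the corresponding parallel efficiency would be:

```latex
S(p) = \frac{T(1)}{T(p)}, \qquad E(p) = \frac{S(p)}{p}
```

Under this definition, ideal (linear) speedup corresponds to $S(p) = p$, that is, $E(p) = 1$.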
DSEARCH

- Fully cross-platform parallel database search program.
- Operates in a master-slave environment.
- Splits the database into fixed-size units that are subsequently searched on the donor machines.
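
As a minimal sketch of the splitting step, assuming that a "unit" means a fixed number of FASTA records (the slides do not say whether units are sized by record count or by bytes), the hypothetical helpers below parse FASTA text and group the records into units. The names `parse_fasta` and `split_into_units` are illustrative and are not part of DSEARCH.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def split_into_units(records, unit_size):
    """Group records into units of `unit_size` records; the last may be smaller."""
    if unit_size < 1:
        raise ValueError("unit_size must be at least 1")
    return [records[i:i + unit_size] for i in range(0, len(records), unit_size)]


database = """>seq1
ACGT
ACGT
>seq2
GGCC
>seq3
TTAA
"""
units = split_into_units(parse_fasta(database), unit_size=2)
```

Each unit could then be sent to a donor machine and searched against the query independently of the others, which is what makes the search coarse-grained.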




DSEARCH (cont.)

Speedup testing, using:
- A FASTA database file.
- A FASTA query sequence file.
- A searching scheme.
- A configuration file.

Fig. Speedup achieved by DSEARCH running on between 1 and 80 semi-idle processors.


My criticism and future work

- The paper gives no detailed description of how the applications work on the distributed computing platform.
- If the spare clock cycles of the semi-idle PCs are not available, the system will not deliver its best performance.
- Failure of the network connecting the desktop PCs will reduce performance.
- Future work: improve and expand the range of bioinformatics applications for the system.


Conclusion

“Research work should not have a conclusion. It is a continual process, and it will continue for the betterment of human beings.”




Any questions?



