Designing Biological Databases

A report presented in my BNF 216 (Database Design and Modeling for Bioinformatics) class regarding principles and tips to follow in designing biological databases.

  1. 1. How do you solve a problem like a biological database? (BNF 216 - Database Modeling and Design for Bioinformatics) • Arjei Balandra, Software Developer, National Telehealth Center, University of the Philippines – Manila • http://bumblebest.net
  2. 2. Database • A database is a set of data that has a regular structure and that is organized in such a way that a computer can easily find the desired information. – The Linux Information Project (http://www.linfo.org/database.html)
  3. 3. Biological Database • Biological databases are libraries of life sciences information collected from scientific experiments, published literature, high-throughput experiment technology, and computational analyses. - Wikipedia (en.wikipedia.org/wiki/Biological_database)
  4. 4. NCBI - GenBank
  5. 5. European Nucleotide Archive – EMBL-EBI
  6. 6. DDBJ – DNA Data Bank Of Japan
  7. 7. Why a Database? • Data-intensive techniques such as high-throughput screening and gene expression experiments demand methods to correlate large and diverse datasets. • Databases integrate information from a variety of sources, allowing faster and more powerful searches.
  8. 8. Tip #1: DO A “GOOD” DATABASE DESIGN
  9. 9. Good Database Design • Provides easy access to previous results. • Supports both expert- and machine-guided searches for novel correlations in data.
  10. 10. Bad Database Design • Obfuscates the correlations the user is searching for. • Makes it difficult for biologists to fit their data into the database or to find previously stored data, resulting in user contempt. • Is ‘brittle’ (breaks easily when requirements change).
  11. 11. Tip #2: LEARN FROM EXISTING LITERATURE
  12. 12. • Generalizations • Incorporate existing schemas into the database design • Use existing structures for common data
  13. 13. Generalizations
  14. 14. aMAZE (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC308873/figure/gkh139f2/)
  15. 15. Tip #3: RESPECT THE UNIQUE NEEDS OF BIOLOGISTS (AND USERS)
  16. 16. Business rules • Constraints that are – based on data derived from real-world entities – specific to the needs of the organization.
  17. 17. Dealing with Business Rules • What do users need? – Use free-text comments – Create user-specific categories
  18. 18. User-Specific Categories
  19. 19. Tip #4: DESIGN THE DATABASE BEFORE BUILDING IT
  20. 20. Tip #5: USE THE DATABASE TO ENFORCE DATA INTEGRITY
  21. 21. Normalization
  22. 22. Normalization
  23. 23. Normalization
  24. 24. Tip #6: KEEP THE DATABASE SCOPE MANAGEABLE
  25. 25. Keep the database scope manageable • In biology, one size does not fit all • Focus on a subset of biology (e.g., genes, proteins) • In large subsets, do it one at a time • Inclusive
  26. 26. Tip #7: LISTEN TO THE PEOPLE WHO HAVE TO WRITE AND USE THE INTERFACE
  27. 27. • Databases are successful only when people use them • Users know what they want and need + Developers know what they can do + Designers know what must be done = Collaborative approach to developing a successful database
  28. 28. Tip #8: TEST THE DESIGN WITH REALISTIC DATA
  29. 29. Tip #9: MAKE THE DATABASE STRUCTURE UNDERSTANDABLE AND EASY TO MAINTAIN
  30. 30. THANK YOU! REPLACE(quote, "pagmamahal", "data"); quote
  31. 31. References • The Linux Information Project. http://www.linfo.org/database.html • Nelson, M. R., Reisinger, S. J., & Henry, S. (2003). Designing databases to store biological information. BIOSILICO, 1(4). • Wikipedia. Biological database. en.wikipedia.org/wiki/Biological_database • Lemer, C., Antezana, E., Couche, F., Fays, F., Santolaria, X., Janky, R., … Wodak, S. J. (2004). The aMAZE LightBench: a web interface to a relational database of cellular processes. Nucleic Acids Research, 32(Database issue), D443–D448. doi:10.1093/nar/gkh139
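
Tip #5 (use the database to enforce data integrity) and the normalization slides do not include a schema in this transcript, so the following is only a minimal sketch of the idea using Python's standard `sqlite3` module. The `organism` and `gene` tables, their columns, and the rule that a sequence may contain only the letters A, C, G, T, and N are hypothetical choices made for illustration. Organism names are stored once and referenced by key (a normalized layout), and the database itself rejects rows that break the rules.

```python
import sqlite3


def build_schema(conn: sqlite3.Connection) -> None:
    """Create a small normalized schema (all names are hypothetical)."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE organism (
            organism_id     INTEGER PRIMARY KEY,
            scientific_name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE gene (
            gene_id     INTEGER PRIMARY KEY,
            symbol      TEXT NOT NULL,
            organism_id INTEGER NOT NULL REFERENCES organism(organism_id),
            -- Assumed rule: non-empty and made only of A, C, G, T, N.
            sequence    TEXT NOT NULL
                CHECK (length(sequence) > 0
                       AND sequence NOT GLOB '*[^ACGTN]*'),
            UNIQUE (symbol, organism_id)
        );
        """
    )


def try_insert(conn: sqlite3.Connection, sql: str, params: tuple) -> bool:
    """Run one INSERT; return False if the database rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


def demo() -> dict:
    conn = sqlite3.connect(":memory:")
    build_schema(conn)
    add_organism = "INSERT INTO organism (organism_id, scientific_name) VALUES (?, ?)"
    add_gene = "INSERT INTO gene (symbol, organism_id, sequence) VALUES (?, ?, ?)"
    return {
        "organism": try_insert(conn, add_organism, (1, "Escherichia coli")),
        "valid_gene": try_insert(conn, add_gene, ("lacZ", 1, "ATGACC")),
        # Organism 99 does not exist: the foreign key rejects the row.
        "unknown_organism": try_insert(conn, add_gene, ("lacY", 99, "ATGACC")),
        # 'X', 'Y', 'Z' are outside the assumed alphabet: CHECK rejects the row.
        "bad_sequence": try_insert(conn, add_gene, ("lacY", 1, "ATGXYZ")),
        # Same symbol for the same organism: UNIQUE rejects the row.
        "duplicate_gene": try_insert(conn, add_gene, ("lacZ", 1, "ATGACC")),
    }


if __name__ == "__main__":
    print(demo())
```

Because the constraints live in the schema rather than in application code, every client that writes to the database is held to the same rules.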
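
The slides for Tip #3 name two ways of respecting users' needs, free-text comments and user-specific categories, without showing how they were modeled. One possible layout, sketched here with Python's standard `sqlite3` module, gives each sample an unconstrained `comment` column and lets each user own a private set of category names. All table names, column names, users, and sample labels below are hypothetical.

```python
import sqlite3

# Hypothetical schema: none of these names come from the slides.
SCHEMA = """
CREATE TABLE app_user (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE sample (
    sample_id INTEGER PRIMARY KEY,
    label     TEXT NOT NULL,
    comment   TEXT              -- free-text notes, deliberately unconstrained
);
CREATE TABLE category (
    category_id INTEGER PRIMARY KEY,
    user_id     INTEGER NOT NULL REFERENCES app_user(user_id),
    name        TEXT NOT NULL,
    UNIQUE (user_id, name)      -- each user has a private set of category names
);
CREATE TABLE sample_category (
    sample_id   INTEGER NOT NULL REFERENCES sample(sample_id),
    category_id INTEGER NOT NULL REFERENCES category(category_id),
    PRIMARY KEY (sample_id, category_id)
);
"""


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.executescript(SCHEMA)
    return conn


def tag_sample(conn, user_id: int, sample_id: int, category_name: str) -> None:
    """File a sample under one of the user's own categories, creating it if needed."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO category (user_id, name) VALUES (?, ?)",
            (user_id, category_name),
        )
        (category_id,) = conn.execute(
            "SELECT category_id FROM category WHERE user_id = ? AND name = ?",
            (user_id, category_name),
        ).fetchone()
        conn.execute(
            "INSERT OR IGNORE INTO sample_category (sample_id, category_id) VALUES (?, ?)",
            (sample_id, category_id),
        )


def samples_in_category(conn, user_id: int, category_name: str) -> list:
    """Return the labels a given user has filed under a given category."""
    rows = conn.execute(
        """
        SELECT s.label
        FROM sample AS s
        JOIN sample_category AS sc ON sc.sample_id = s.sample_id
        JOIN category AS c ON c.category_id = sc.category_id
        WHERE c.user_id = ? AND c.name = ?
        ORDER BY s.label
        """,
        (user_id, category_name),
    )
    return [row[0] for row in rows]


def demo():
    conn = open_db()
    with conn:
        conn.executemany(
            "INSERT INTO app_user (user_id, name) VALUES (?, ?)",
            [(1, "ana"), (2, "ben")],
        )
        conn.executemany(
            "INSERT INTO sample (sample_id, label, comment) VALUES (?, ?, ?)",
            [(1, "S1", "cloudy after 24 h, repeat?"), (2, "S2", None)],
        )
    tag_sample(conn, 1, 1, "interesting")
    tag_sample(conn, 1, 2, "interesting")
    tag_sample(conn, 2, 2, "interesting")
    return (
        samples_in_category(conn, 1, "interesting"),
        samples_in_category(conn, 2, "interesting"),
    )


if __name__ == "__main__":
    print(demo())
```

Two users can use the same category name for different sets of samples, while the shared `sample` table stays unchanged.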
