This book starts right from the basics of databases and Structured Query Language (SQL). Prior database or SQL knowledge is not necessary, as the book covers everything from database design to creating your first database and understanding how SQL is used with databases. You will need to follow its instructions for creating the book’s gene database, as it is used for all the SQL examples.
The objective of Section I, Chapters 1 to 8, is to explain the basic concepts and practices used in the design and development of biological database systems. The content places more emphasis on Relational Database Management Systems (RDBMS), which are an accepted database standard for biological systems, and presents them in greater detail. Entity-relationship modeling and schema representation are covered extensively through a case study, along with normalization to third normal form. At the end of Chapter 8, the centralized database approach is described briefly with an illustration. Most biological database systems in use today are centralized.
Chapters 9 to 14 focus on Structured Query Language, usually abbreviated as SQL. SQL is used to create the biological database and to insert and extract biological data. Going through Chapters 1 to 8 first is therefore essential to understand the theory and concepts behind database systems. The SQL examples in Chapters 10 and 11 comply with the modern SQL standards set by organizations such as the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO). The SQL queries in these chapters are supported by most modern database systems.
PL/SQL, database administration (DBA), transaction management, performance tuning, and similar topics are outside the scope of this book. However, Chapters 10 and 11 explain the trigger mechanism, the various roles in a database, and the modes of transactions in a DBMS. SQL queries involving some mathematical and string functions, and outer joins (both left and right), have not been addressed in Section I; the reader is therefore requested to refer to SQL manuals for any missing content. The database was created and tested using Oracle versions 8i, 9i and 10g. The appendix provides a cursory view of Oracle basics and PL/SQL as a reference for those who are new to these concepts.
Chapters 14 to 21 briefly explain the basic concepts of data warehousing, the approaches to building a data warehouse, and the steps involved, including dimensional modeling. The section ends with a case study in plant bioinformatics, with an example that illustrates the steps of the warehousing process for building data marts for analysis.
In the second part of the book, many important biological databases and their uses in biology are described. Since a very large number of biological databases exist in the literature, only a selected few (those that biologists refer to most often) are described in this book. Types of databases, models of databases, primary nucleic acid and protein databases, secondary protein databases, composite sequence databases, meta-databases, and genomic, proteomic and other databases are described in detail. Search engines for literature are included to help readers access the published literature in journals and books. Major genome projects (about 19) and the genomic databases of human, animals, fungi and microorganisms, as well as plant and crop genome databases, are described in brief. Finally, organellar and pathway databases are covered.
Section I : Database Principles for Biologists
2. Database Development
3. Data Models
4. The Entity-Relationship Model
5. Integrity Constraints
6. Rules In Relation Model
7. Normalization and Schema Design
8. Case Study
9. SQL Overview
10. Elements of SQL
11. Advanced Queries
13. Data Dictionary
14. Transactional Database
15. Data Warehouse
16. Biological Data Warehousing Steps
17. Data Warehousing Lifecycle
18. Data Warehouse Design
19. Dimensional Modeling Types
20. OLAP and OLTP
21. Case Study − Plant Bioinformatics
Section II : Biological Databases
22. Introduction to Biological Databases
23. Types of Databases
24. Models of Databases
25. Primary Nucleic Acid Databases
26. Primary Protein Data Banks
27. Secondary Protein Databases
28. Composite Sequence Databases
30. Genomic, Proteomic and Other Databases
31. The Search Engines for Literature
32. Genome Projects and Genomic Databases of Human, Animals, Fungi and Microorganisms
33. Plant and Crop Genome Databases
34. Organellar Databases
35. Pathway Databases