Databases Projects
From Chemical Informatics and Cyberinfrastructure Collaboratory
These are individual projects, but they have a lot in common. In particular, they all need to store and search 2D chemical structure information, and to view tables showing 2D structures and data. The standard setup we are using for this is:
- PostgreSQL database
- gNova CHORD cartridge for storing and searching chemical structures
- Visualization and End User Tools currently under development
CHORD works only with PostgreSQL, and extends its functionality for chemical structures (e.g., by adding SQL commands to perform substructure and similarity searching).
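To make the idea of similarity searching concrete, here is a minimal sketch in Python. It is not CHORD's API: in the setup described above the search runs as SQL inside PostgreSQL. The sketch assumes each structure has already been reduced to a fingerprint (represented here as a set of "on" bit positions) and assumes the Tanimoto coefficient as the similarity measure. The names `tanimoto` and `similarity_search`, the compound labels, and the fingerprints are all hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit positions."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def similarity_search(query_fp, library, threshold=0.7):
    """Return (name, score) pairs scoring at or above threshold, best first."""
    scored = ((name, tanimoto(query_fp, fp)) for name, fp in library.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical toy library: compound label -> fingerprint bit positions.
library = {
    "cmpd_a": {1, 2, 3, 4},
    "cmpd_b": {1, 2, 3, 9},
    "cmpd_c": {7, 8},
}
hits = similarity_search({1, 2, 3, 4}, library, threshold=0.5)
```

With this toy data, `cmpd_a` and `cmpd_b` pass the 0.5 threshold and `cmpd_c` does not. A database cartridge does the same kind of comparison, but inside the database and with indexing, so that the whole table does not need to be scanned in application code.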
Distributed Drug Discovery
Purpose: To build a database of 2D chemical structures, with associated reaction information, that have been made (or can be made) in the Distributed Drug Discovery project run by Bill Scott.
People: Kelsey Forsythe, Bill Scott, Malika Mahoui, Usha Cheemakurthi, Deepthi Jonnala, David Wild
Special Needs: Enumeration of libraries from reagents. Special kinds of searching.
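As a rough illustration of what enumerating a library from reagents involves, the sketch below generates every combination that takes one reagent from each variable position. It only handles the combinatorial bookkeeping and does not build product structures, which would need a chemistry toolkit. The function name `enumerate_library`, the position labels, and the reagent names are hypothetical placeholders, not the project's actual data or method.

```python
from itertools import product


def enumerate_library(reagent_sets):
    """Yield one dict per virtual product, choosing one reagent per position.

    reagent_sets maps a position label (e.g. "R1") to a list of reagent names.
    """
    positions = list(reagent_sets)
    for combo in product(*(reagent_sets[p] for p in positions)):
        yield dict(zip(positions, combo))


# Hypothetical reagent lists for a two-position library.
reagent_sets = {
    "R1": ["acid_1", "acid_2"],
    "R2": ["amine_1", "amine_2", "amine_3"],
}
virtual_library = list(enumerate_library(reagent_sets))
```

Here 2 reagents at one position and 3 at the other give 6 virtual products. The library size is the product of the reagent counts, which is why enumerated libraries grow quickly as positions or reagents are added.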
Local NIH DTP database
Purpose: To build a local database containing the NIH DTP data that can be used for data mining
People: Melanie Wu, Xiao Dong, Huijun Wang, David Wild
Special Needs: Ability to similarity search, extract biological fingerprints and gene expression data.
Progress: Similarity Search on DTP data now available
Local PubChem database
Purpose: To build a local copy of PubChem that can be used for data mining
People: Melanie Wu, Xiao Dong, Huijun Wang, David Wild, Rajarshi Guha
Special Needs: Ability to handle complex data in PubChem, and prototype new architectures.
Currently a slightly reduced version of PubChem is available locally. Web services are also provided as a front-end to the various tables. See the details for more information on schema, indices and triggers.
Docking database
Purpose: Store the results of large-scale docking. Here, large scale implies the whole of PubChem, though the project is currently working with a drug-like subset of PubChem (approximately 1M compounds) and multiple protein targets. It currently has data for 1 target, though families of proteins will be processed. See here for details.
People: Rajarshi Guha
Quantum Mechanical database (Varuna)
Purpose:
People: Melanie Wu, Mookie Baik, ... ?
Special Needs:
