Databases
These are individual projects, but they have a lot in common. They all store and search 2D chemical structure information. We use the following to do that:
- PostgreSQL database
- gNova CHORD cartridge for storing and searching chemical structures
CHORD works only with PostgreSQL, and extends its functionality for chemical structures (e.g., adding SQL commands to perform substructure and similarity searching).
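To make the idea of similarity searching concrete, here is a minimal sketch in Python of a Tanimoto-based search over bit fingerprints. It is not CHORD's implementation or its SQL interface. The function names, the representation of a fingerprint as a set of on-bit indices, and the 0.7 threshold are all assumptions chosen for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]  # assumed representation: indices of the bits that are set


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared on-bits divided by the union of on-bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similarity_search(
    query: Fingerprint,
    library: Dict[str, Fingerprint],
    threshold: float = 0.7,  # hypothetical cutoff
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs at or above the threshold, best match first."""
    scored = ((name, tanimoto(query, fp)) for name, fp in library.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Toy fingerprints, not derived from real structures.
    library = {
        "mol_a": frozenset({1, 2, 3, 4}),
        "mol_b": frozenset({1, 2, 3, 9}),
        "mol_c": frozenset({7, 8}),
    }
    print(similarity_search(frozenset({1, 2, 3, 4}), library, threshold=0.5))
```

In a cartridge-backed database the same comparison would run inside the SQL query, so that only the matching rows leave the database.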
Tables showing 2D structures alongside their data need to be displayed, so visualization and end-user tools are currently under development.
Local NIH Developmental Therapeutics Program (DTP) Database
This is a local database containing the NIH Developmental Therapeutics Program (DTP) data that can be used for data mining. The database will need to support similarity searching and the extraction of biological fingerprints and gene expression data.
Web Services Link(s):
- Similarity search on DTP data
- Cancer Cell Line Activity Predictions (currently for the first 20 cell lines only)
Local PubChem Database
This is a local copy of PubChem that can be used for data mining. The database will need to handle the complex data in PubChem, and it will be used to prototype new architectures.
Web Services Link(s):
These services are essentially wrapped queries. Naturally, there may be queries that you'd like to see but that are not present. If so, let Rajarshi Guha know.
- Structure (Usage) Provides methods to get PubChem Compound information
- Synonyms (Usage) Provides methods to get synonyms given a compound or substance CID (PubChem's Compound Identifier). (SMILES support coming)
- Derived properties (Usage) Gets calculated properties (SLogP and SMRef) given a compound CID. Can also search via exact values and ranges (but this is very slow at the moment).
- Docking results (Usage) Provides methods to get the ligand and target structures for PubChem compounds based on CID, sorted score values, or by SMARTS patterns. Ligands are returned in SDF format. (Currently only the ligand structures are accessible and the actual score values are coming soon.)
- 3D structures (Usage) Provides access to MMFF94-optimized 3D structures for PubChem compounds. Structures are returned in SDF format and can be accessed by CID or by SMARTS patterns.
Local PubChem Dock Database
The PubChem Dock database aims to store the results of large-scale docking calculations. The stored results include the PDB structures of the targets, the 3D structures of the docked ligands, and the docking scores. We currently evaluate four (arbitrarily chosen) scoring functions provided by OpenEye's FRED, namely:
- chemgauss3
- shapegauss
- oechemscore
- plp
For each scoring function we save the total score as well as the component scores. The database currently has docking results against six proteins (1YC4, 1R1P, 1YC3, 1YC1, 1XP6, 1QKT). We plan on populating it with docking results for families of proteins. One possible use is to screen ligands over families using a similarity approach.
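As a rough illustration of how a stored result with a total score and component scores might be represented and ranked, here is a short Python sketch. The record layout, field names, component names, and score values are hypothetical and do not reflect the actual database schema. The sketch also assumes that lower scores indicate better poses, so reverse the sort if the scoring function in question uses the opposite convention.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DockingResult:
    """Hypothetical record for one ligand docked against one target."""
    cid: int                      # PubChem compound identifier of the ligand
    target: str                   # PDB code of the protein target
    scoring_function: str         # e.g. "chemgauss3", "shapegauss"
    total_score: float
    component_scores: Dict[str, float] = field(default_factory=dict)


def top_ligands(
    results: List[DockingResult],
    target: str,
    scoring_function: str,
    n: int = 10,
) -> List[DockingResult]:
    """Return the n best results for one target and scoring function.

    Assumes lower total scores are better.
    """
    matching = [
        r for r in results
        if r.target == target and r.scoring_function == scoring_function
    ]
    return sorted(matching, key=lambda r: r.total_score)[:n]


if __name__ == "__main__":
    # Made-up values for illustration only.
    results = [
        DockingResult(1001, "1YC4", "chemgauss3", -52.1, {"steric": -40.0, "hbond": -12.1}),
        DockingResult(1002, "1YC4", "chemgauss3", -61.4, {"steric": -45.0, "hbond": -16.4}),
        DockingResult(1002, "1YC4", "plp", -33.0),
        DockingResult(1003, "1R1P", "chemgauss3", -70.2),
    ]
    for r in top_ligands(results, "1YC4", "chemgauss3", n=2):
        print(r.cid, r.total_score)
```

Keeping the component scores next to the total makes it possible to compare ligands on individual terms later without rerunning the docking.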
Web Services Link(s):
The database can be accessed via a web form as well as with web services. (Usage)
Distributed Drug Discovery Database
A database of 2D chemical structures and associated reaction information for compounds that have been made (or that can be made) in the Distributed Drug Discovery project run by Bill Scott. The database will require enumeration of libraries from reagents and specialized kinds of searching.
No IU web service link available yet.
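Enumerating a library from reagents amounts to forming every combination of one reagent per reaction position. The Python sketch below shows only that combinatorial step, under the assumption that each position has an independent reagent list. The position names and reagent labels are placeholders, and a real enumeration would also apply the reaction to produce product structures, which is not attempted here.

```python
from itertools import product
from typing import Dict, List


def enumerate_library(reagent_sets: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Return every combination of one reagent per position.

    Each entry maps a position name to the reagent chosen for it.
    """
    positions = list(reagent_sets)
    choices = (reagent_sets[p] for p in positions)
    return [dict(zip(positions, combo)) for combo in product(*choices)]


if __name__ == "__main__":
    # Placeholder position names and reagent labels.
    reagents = {
        "R1": ["reagent_A1", "reagent_A2"],
        "R2": ["reagent_B1", "reagent_B2", "reagent_B3"],
    }
    library = enumerate_library(reagents)
    print(len(library))  # 2 * 3 combinations
    print(library[0])
```

The library size is the product of the reagent list lengths, so it grows quickly as positions or reagents are added.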
Quantum Mechanical Database (Varuna)
An integrated system that includes a depository for computational chemistry and a modeling environment, with automated execution of calculations, computational resource management, and visualization. It currently holds about 500 compounds.
- Web client for a structure search of the database (returns the input and 3D coordinate output file data)
- Demonstration of .net workflow
- Site for web services
Build clients from these:
- The file operation utilities convert between file formats frequently used for quantum mechanics and molecular mechanics computations. These web services accept the input files and return the converted result in a string array. Some services need the support of Open Babel.
- The result analysis utilities accept the computed results from the Jaguar and ADF packages and return the frequency and geometry optimization information.
- The database query service helps the user locate project information in the Varuna database. This service also supports executing commands on the Avidd cluster of computers.
- The upload web service helps the user interact with Avidd, Varuna, and the local PC through the SFTP or SCP protocols by using the open source SharpSSH library. These web services accept the input files, submit the files to Avidd, and store the information in Varuna.


