Distributed Drug-Discovery Database

(As part of the NIH-funded Chemical Informatics and Cyberinfrastructure Collaboratory at Indiana University)

Principals: Dr. Malika Mahoui, Assistant Professor of Informatics, IUPUI

Dr. Kelsey Forsythe, Associate Director of Chemical Informatics, IUPUI

Collaborators: William Scott, Research Professor, IUPUI Chemistry

Martin O'Donnell, Professor, IUPUI Chemistry

Scope: Build web services for use with the Distributed Drug Discovery (DDD) project

OpenSource Pledge: We agree to use IU's model software license and will utilize/incorporate open-source and other freely available software for the project. All software and services produced will be available free of charge to all.

Proprietary Software: ChemDraw Ultra and Digital Chemistry Ltd. Toolkit

Summary: The DDD project will involve the synthesis of a large combinatorial set of prospective disease-inhibiting drugs. Our role will be to construct the database(s) and provide web-based data-application and database-management tools for adding to, deleting from, and searching the resulting database(s). In addition, web services that connect these tools with existing applications, such as the PubChem environment, will be built.

More specifically, the services supported include the following:

  1. Entry/retrieval of structures and sources of starting materials/reagents, with a "registry number" associated with each unique starting material. Each reagent (including the solid substrate) is assigned a number unique to the DDD database.
  2. Generation of a combinatorial product library from a generic reaction scheme with reagent attachment points (Markush structures) plus a reagent library. We intend to seek open-source software for the combinatorial enumeration, but proprietary software such as ChemDraw and Digital Chemistry Ltd's Toolkit is also available to us.
  3. Entry/retrieval of combinatorial products synthesized by the students. This will include all the information provided by the experiment (DDD registration number, CAS Registry Number (if available), PubChem registry number (if available), molecular weight, chemical formula, geographical location of the laboratory, and supporting institution information).
  4. Two databases will result:
    • A database of synthesized products and associated reagents, updated with analytical information such as percent purity, LC-MS, etc.
    • A second database of yet-to-be-synthesized products (i.e. combinatorial products not yet realized). This will be available to computational chemists and others interested in modeling a product of interest, as well as to synthetic chemists checking whether a product has already been made.
  5. Integrate the newly generated products into PubChem. This can be an aggregated service in which we first search PubChem for the new compound and, if it does not exist, send an insert request to PubChem. For the experiments, we can use the PubChem mirror being developed by other members of this grant.
  6. Package the services as a web client application so that students in different countries can access the same database.
  7. Use SIBIOS or another workflow environment (Taverna, for example) to allow users to pipeline a set of tasks as a workflow.
  8. Provide the services described above as XML web services to facilitate their use by agents, other applications, etc.
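
As an illustration of services 1, 2 and 4 above, the following Python sketch shows one way registry numbering, combinatorial enumeration, and the split into synthesized and not-yet-synthesized products could fit together. It is a minimal sketch under stated assumptions, not the project's implementation: the "DDD-R" number format, the function and class names, and the use of text placeholders such as {R1} for attachment points are all hypothetical, and structures are treated as opaque strings with no chemical validation. A real Markush enumeration would rely on a cheminformatics toolkit.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class ReagentRegistry:
    """Assigns a DDD-local registry number to each unique reagent string."""
    prefix: str = "DDD-R"  # hypothetical number format, not from the proposal
    _ids: dict = field(default_factory=dict)

    def register(self, structure: str) -> str:
        # A structure that is already known keeps its existing number.
        if structure not in self._ids:
            self._ids[structure] = f"{self.prefix}{len(self._ids) + 1:06d}"
        return self._ids[structure]


def enumerate_library(scaffold: str, reagents: dict) -> list:
    """Fill every attachment point with every allowed reagent fragment.

    scaffold: text with str.format placeholders, e.g. "NC({R1})C(=O)O{R2}".
    reagents: maps each placeholder name to its list of fragments.
    """
    names = sorted(reagents)
    library = []
    for combo in product(*(reagents[name] for name in names)):
        library.append(scaffold.format(**dict(zip(names, combo))))
    return library


def split_by_status(library: list, synthesized: set) -> tuple:
    """Separate products already made from those not yet realized."""
    made = [p for p in library if p in synthesized]
    virtual = [p for p in library if p not in synthesized]
    return made, virtual


if __name__ == "__main__":
    registry = ReagentRegistry()
    reagents = {"R1": ["C", "CC(C)C"], "R2": ["C", "CC"]}
    for fragments in reagents.values():
        for fragment in fragments:
            print(fragment, registry.register(fragment))

    library = enumerate_library("NC({R1})C(=O)O{R2}", reagents)
    made, virtual = split_by_status(library, {"NC(C)C(=O)OC"})
    print("synthesized:", made)
    print("not yet synthesized:", virtual)
```

The library size is the product of the reagent counts at each attachment point (here 2 x 2 = 4), which is why the second database of unrealized products can grow much faster than the first.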

Students employed: Two students will work with Dr. Mahoui and Dr. Forsythe during the project, paid at a rate of $10/hr.

Hardware: We intend to set up a test database server for use during the project, at an estimated cost of $3000.