Workflows

From Chemical Informatics and Cyberinfrastructure Collaboratory

We are developing computational workflows using our web service infrastructure and the open-source Taverna workflow tool. The emphasis is on developing workflows that encapsulate important processes in chemoinformatics and drug design, that use diverse kinds of information together in novel ways, and that are of demonstrated scientific merit.

Below are descriptions of some of the workflows that we have developed, along with example output.

A Taverna Tutorial

A simple, "getting started" tutorial for Taverna is available from http://communitygrids.blogspot.com/2006/01/getting-started-with-taverna.html. A movie of this is available from http://www.chembiogrid.org/presentations/Movies/TavernaStrContDemo.avi.

Workflow 1 - Finding relationships between compounds and proteins

NIH SIM SEARCH -> FILTER -> OMEGA -> FRED -> JMOL/HTML

Examples of workflow output

This workflow runs a similarity search on the NIH DTP Human Tumor data, filters the results on pharmacokinetic properties (FILTER), converts the hits to 3D (OMEGA), docks them into a pre-defined protein (FRED), and visualizes the results (JMOL). It opens up various possibilities, including:

  • Finding structures in the DTP that are similar to existing ligands for tumor-related proteins from the PDB, and correlating docking scores with cell-line assay results. This can lead to hypotheses about which proteins are involved in which tumors.
  • Testing the possible effectiveness of DTP compounds in other areas (e.g. Alzheimer's disease - see Alzheimer's Workflow) by docking structures to PDB proteins from that therapeutic area.
  • Integration of this workflow with other tools such as Sentient Desktop - see the example of using Workflow 1 with Alzheimer's disease in Sentient.
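
The first two stages of Workflow 1 can be illustrated with a minimal Python sketch. It does not call the actual web services: the Compound class, the set-based fingerprints, the Tanimoto threshold of 0.5, and the molecular-weight cutoff are all assumptions made for illustration, and the real FILTER step applies its own pharmacokinetic criteria. The 3D conversion, docking, and visualization stages would be chained on in the same way.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Compound:
    """Toy compound record (hypothetical, not the service's data model)."""
    name: str
    fingerprint: Set[int]  # indices of the bits set in a structural fingerprint
    mol_weight: float      # stand-in property for the filtering stage


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_search(query: Compound, library: List[Compound],
                      threshold: float) -> List[Compound]:
    """Keep library compounds at least `threshold` similar to the query."""
    return [c for c in library
            if tanimoto(query.fingerprint, c.fingerprint) >= threshold]


def property_filter(compounds: List[Compound],
                    max_weight: float) -> List[Compound]:
    """Assumed filter rule: drop compounds heavier than `max_weight`."""
    return [c for c in compounds if c.mol_weight <= max_weight]


def run_pipeline(query: Compound, library: List[Compound]) -> List[Compound]:
    """Chain the stages in workflow order; later stages would follow here."""
    hits = similarity_search(query, library, threshold=0.5)
    return property_filter(hits, max_weight=500.0)


library = [
    Compound("A", {1, 2, 3, 4}, 320.0),
    Compound("B", {1, 2, 3, 9}, 610.0),
    Compound("C", {7, 8}, 150.0),
]
query = Compound("Q", {1, 2, 3}, 300.0)

survivors = run_pipeline(query, library)
print([c.name for c in survivors])
```

Each stage takes and returns a list of compounds, which mirrors how the output of one Taverna processor feeds the input of the next.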

Workflow 2 - HTS data organization and flagging

NIH SCREEN RETRIEVE -> FILTER -> TOXICITY FLAG -> SERIES GENERATION (Divkm) -> VISUALIZATION (VOPlot, 2Dviewer)

Example of Workflow Output

An AVI movie of this workflow running in Taverna is available from here.

This workflow demonstrates how screening data can be flagged and organized for human analysis. The compounds and data values for a particular screen are retrieved, and then are filtered to remove compounds with reactive groups, etc. ToxTree is used to flag the potential toxicities of compounds. Divkmeans is used to add a column of cluster numbers. Finally, the results are visualized using VOPlot and the 2D viewer applet.
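
As a rough illustration of how each stage adds to or trims the screening table, the following Python sketch passes a few made-up rows through the same sequence of steps. Everything in it is an assumption: the column names, the sample rows, the boolean used in place of ToxTree's output, and the nearest-centroid assignment, which is only a stand-in for Divkmeans and not that algorithm.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def remove_reactive(rows: List[Row]) -> List[Row]:
    """Filtering stage: drop rows marked as having reactive groups."""
    return [r for r in rows if not r["reactive"]]


def flag_toxicity(rows: List[Row]) -> List[Row]:
    """Flagging stage: add a 'tox_flag' column (hypothetical rule, not ToxTree)."""
    return [{**r, "tox_flag": "alert" if r["has_alert"] else "none"}
            for r in rows]


def add_cluster_column(rows: List[Row], centroids: List[float]) -> List[Row]:
    """Series stage: add a 'cluster' column by nearest fixed centroid.

    This is a simple stand-in for the clustering service, not Divkmeans.
    """
    out = []
    for r in rows:
        value = float(r["activity"])
        nearest = min(range(len(centroids)),
                      key=lambda i: abs(value - centroids[i]))
        out.append({**r, "cluster": nearest})
    return out


# Made-up screening rows for illustration only.
screen = [
    {"id": "cmpd-1", "activity": 0.9, "reactive": False, "has_alert": False},
    {"id": "cmpd-2", "activity": 0.2, "reactive": True, "has_alert": False},
    {"id": "cmpd-3", "activity": 0.1, "reactive": False, "has_alert": True},
]

table = add_cluster_column(flag_toxicity(remove_reactive(screen)),
                           centroids=[0.0, 1.0])
for row in table:
    print(row["id"], row["tox_flag"], row["cluster"])
```

The resulting table keeps the original columns and gains one column per stage, which is the form a plotting or table viewer would then consume.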

Sample Workflows - Parts of the Above, Prototypes, or Test Workflows

An example of Workflow 2 for HTS data organization and flagging

ToxTreeBrief of ToxTreeServer

ToxTreeVerbose of ToxTreeServer