These tools serve many kinds of applications: viewing molecules, molecular modelling, homology modelling, molecular dynamics, QSAR studies, docking, activity testing and prediction, and analysing structure-related, activity-related and binding relationships between ligands and proteins or between two proteins.
Nowadays, many of these tools are available free of charge for non-commercial use, released as open source by various organizations and individuals so that the technology can be understood, applied and reused to develop new molecular entities that target various diseases.
Here, I would like to look at which of these tools are available as open source.
To begin, we need to know about CRDD, which stands for Computational Resources for Drug Discovery. CRDD is a forum to initiate and develop a vision of affordable healthcare for the developing world. The OSDD (Open Source Drug Discovery) concept aims to synergize the power of genomics and computational technologies and to facilitate the participation of young and brilliant talent from universities and industry. It seeks to provide a global platform where the best brains can collaborate and collectively endeavor to solve the complex problems associated with discovering novel therapies for neglected diseases like tuberculosis.
In this post I concentrate mainly on three topics: 1) target identification and validation, 2) virtual screening and 3) drug design, which are important aspects of drug design and discovery.
TARGET IDENTIFICATION & VALIDATION:
Drugs fail in the clinic for two main reasons: the first is that they do not work and the second is that they are not safe. As such, one of the most important steps in developing a new drug is target identification and validation. A target is a broad term which can be applied to a range of biological entities, including for example proteins, genes and RNA. A good target needs to be efficacious, safe, meet clinical and commercial needs and, above all, be ‘druggable’. A ‘druggable’ target is accessible to the putative drug molecule, be that a small molecule or a larger biological, and upon binding elicits a biological response which may be measured both in vitro and in vivo.
Target-based drug discovery starts with a thorough understanding of the disease mechanisms and the role of enzymes, receptors and proteins in the disease pathology. Target identification mainly starts from genome annotations, proteome annotations, potential targets, protein structure and siRNA/miRNA.
Gene Annotations:
Genome sequencing techniques are becoming ever more advanced, and the number of sequenced genomes is increasing exponentially. One of the major challenges in contemporary science is to annotate the available sequence data. Annotation defines the coding regions in the genome as well as their physical location. It also provides the number and spatial distribution of repeat regions and evolutionary information about whole genomes.
Several computational tools have been developed to cut down time and expense involved in the experimental procedure of annotation.
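To make the idea of "defining the coding regions and their physical location" concrete, here is a minimal sketch of the simplest possible computational annotation step: scanning the forward strand for open reading frames (ATG followed in-frame by a stop codon). It is only an illustration, not the method of any tool listed in this post; the function name `find_orfs` and the `min_codons` threshold are made up for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) tuples for ATG...stop ORFs on the forward strand.

    start is the 0-based index of the A in ATG, end is one past the stop codon,
    and min_codons counts every codon including the start and the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs

print(find_orfs("CCATGAAATTTTAGGG"))
```

Real gene finders go far beyond this (reverse strand, introns, statistical scoring), which is why the dedicated servers and programs below exist.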
Servers integrated for CADD.
Server | Description
 | A web server for locating probable protein-coding regions in a nucleotide sequence using a Fourier transform approach (Issac, B., Singh, H., Kaur, H. and Raghava, G.P.S. (2002) Bioinformatics 18:196).
 | This server predicts genes (protein-coding regions) in eukaryotic genomes, including introns and exons, using similarity-aided (double) and consensus ab initio methods (Issac B, Raghava GP. (2004) Genome Res. 14(9):1756-66).
 | A web server for predicting genes in a DNA sequence.
 | A genome-wide BLAST server. It allows users to search their sequence against sequenced genomes and annotated proteomes, and integrates various tools for analysing the BLAST search.
 | A support-vector-based approach to identify the protein-coding regions in human genomic DNA.
SRF | Spectral Repeat Finder (SRF) is a program to find repeats through an analysis of the power spectrum of a given DNA sequence. By repeat we mean the repeated occurrence of a segment of N nucleotides within a DNA sequence. SRF is an ab initio technique, as no prior assumptions need to be made regarding either the repeat length, its fidelity, or whether the repeats are in tandem or not (Sharma D, Issac B, Raghava GP, Ramaswamy R. (2004) Bioinformatics. 20(9):1405-12).
 | Genome-wide sequence similarity search using FASTA. It allows users to search their sequence against sequenced genomes and their product proteomes, and integrates various tools for analysing the FASTA search (Issac, B. and Raghava, G.P.S. (2002) Biotechniques 33:548-56).
 | A suite of datasets and tools for evaluating gene prediction methods.
MyPattern Finder | MyPattern Finder is a program for detecting a 'motif' in a DNA sequence using either an exact search method (Option A (1.0)) or an alignment technique (Option B (1.0)).
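Two of the descriptions above rely on spectral analysis: the Fourier transform approach for locating coding regions, and the power spectrum used by SRF. A common way to explain the first is the period-3 signal that codon structure tends to leave in coding DNA. The sketch below computes the spectral power at frequency 1/3 for a sequence and for sliding windows along it. It is an illustration of the general idea only, not the algorithm used by these servers; the function names, the window size and the normalization are assumptions made for this example.

```python
import cmath

def period3_power(dna):
    """Spectral power at frequency 1/3, summed over the four base indicator
    sequences and divided by the squared length (0 for an empty string)."""
    dna = dna.upper()
    n = len(dna)
    if n == 0:
        return 0.0
    total = 0.0
    for base in "ACGT":
        # DFT coefficient at frequency 1/3 of the 0/1 indicator sequence for this base.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * k / 3)
            for k, ch in enumerate(dna)
            if ch == base
        )
        total += abs(coeff) ** 2
    return total / (n * n)

def sliding_period3(dna, window=351, step=3):
    """Return (window_start, power) pairs; windows with a high value are
    candidate coding regions under the period-3 assumption."""
    return [
        (start, period3_power(dna[start:start + window]))
        for start in range(0, len(dna) - window + 1, step)
    ]

print(period3_power("ATG" * 10))   # strongly periodic toy sequence
print(period3_power("A" * 30))     # no period-3 structure
```

In practice a threshold on the windowed power (or a peak-to-background ratio) would be needed to call a region coding, and that threshold has to be tuned on real data.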
Name | Can be used for | Algorithm
 | Archaea, Metagenomes, Eukaryotes, Viruses, Phages, Plasmids, EST and cDNA | Hidden Markov Model
 | Microbial genomes | Markov model
 | Human | Hidden Markov Model
 | Vertebrate and C. elegans | Hidden Markov Model
 | Prokaryotes | Ab initio method
 | Bacteria, Viruses and Eukaryotes | HMM and similarity-based searches
 | Animal, Human, Plants, Fungus, Protists | Neural Network
 | Vertebrates, Arabidopsis, Maize | Ab initio method
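Since most of the gene finders in the table above are based on hidden Markov models, a toy example may help show what such a model does. The sketch below runs the Viterbi algorithm on a two-state HMM ("coding" vs "noncoding") to label each base of a sequence. All probabilities are invented purely for illustration (GC-rich emissions for the coding state); real gene finders use many more states and parameters trained on annotated genomes.

```python
import math

STATES = ("noncoding", "coding")

# Hypothetical parameters, chosen only to make the example readable.
START = {"noncoding": 0.5, "coding": 0.5}
TRANS = {
    "noncoding": {"noncoding": 0.9, "coding": 0.1},
    "coding": {"noncoding": 0.1, "coding": 0.9},
}
EMIT = {
    "noncoding": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},
    "coding": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15},
}

def viterbi(seq):
    """Return the most probable state path for a DNA string of A/C/G/T."""
    seq = seq.upper()
    if not seq:
        return []
    # Log-probabilities avoid numerical underflow on long sequences.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    back = []
    for ch in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for ending in state s at this position.
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = score[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][ch])
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

print(viterbi("ATATATATGCGCGCGC"))
```

With these made-up parameters the AT-rich half is labelled noncoding and the GC-rich half coding; the single state switch is the model's predicted boundary.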
Web Interface on Libraries
Standalone Software
Name | Can be used for | Algorithm
GenomeThreader | Plants | Similarity-based gene prediction program in which additional cDNA/EST and/or protein sequences are used to predict gene structures via spliced alignments.
JIGSAW (formerly "Combiner") | Eukaryotic | Uses multiple sources of evidence (output from gene finders, splice site prediction programs and sequence alignments) to predict gene models.
GlimmerHMM | Eukaryotic | GlimmerHMM is based on a Generalized Hidden Markov Model (GHMM). Although the gene finder conforms to the overall mathematical framework of a GHMM, it additionally incorporates splice site models adapted from the GeneSplicer program and a decision tree adapted from GlimmerM. It also utilizes Interpolated Markov Models for the coding and noncoding models. Currently, GlimmerHMM's GHMM structure includes introns of each phase, intergenic regions, and four types of exons (initial, internal, final, and single).
GeneZilla | Eukaryotic | GeneZilla is based on the Generalized Hidden Markov Model (GHMM). It evolved out of the ab initio eukaryotic gene finder TIGRscan, which was developed at The Institute for Genomic Research.
Twinscan/N-SCAN (Ver 4.1.2) | Twinscan: Mammals, Caenorhabditis (worm), Dicot plants, and Cryptococci. N-SCAN: human and Drosophila | TWINSCAN extends the probability model of GENSCAN, allowing it to exploit homology between two related genomes. Separate probability models are used for conservation in exons, introns, splice sites, and UTRs, reflecting the differences among their patterns of evolutionary conservation. N-SCAN (a.k.a. TWINSCAN 3.0) models the phylogenetic relationships between the aligned genome sequences, context-dependent substitution rates, and insertions and deletions. N-SCAN was created and used to generate predictions for the entire human genome and the genome of the fruit fly Drosophila melanogaster.
Manatee | Prokaryotic and eukaryotic genomes | Manatee is a web-based gene evaluation and genome annotation tool that can view, modify, and store annotation for prokaryotic and eukaryotic genomes. The Manatee interface allows biologists to quickly identify genes and make high-quality functional assignments using a multitude of genome analysis tools. These tools include, but are not limited to, GO classifications, BER and BLAST search data, paralogous families, and annotation suggestions generated from automated analysis.
 | NA | Alignment of multiple genomic sequences.
CRITICA (Coding Region Identification Tool Invoking Comparative Analysis) | Prokaryotic | CRITICA combines traditional approaches to the problem with a novel comparative analysis. If, in a nucleotide alignment, a pair of ORFs can be found in which the conceptual translated products are more conserved than would be expected from the amount of conservation at the nucleotide level, this is evolutionary evidence that the DNA sequences are protein coding. Regions found by this method are used to generate traditional dicodon frequencies for further analysis and to predict probable protein-coding regions.
Phat | Eukaryotes (Homo sapiens, Plasmodium falciparum, Plasmodium vivax) | Phat is an HMM-based gene finder, originally developed for gene finding in Plasmodium falciparum.
EuGène | Eukaryotes | EuGène exploits probabilistic models such as Markov models to discriminate coding from non-coding sequences, or effective splice sites from false splice sites (using various mathematical models).
 | Eukaryotic genomic sequences | It allows the use of protein homology information and travel in the prediction.
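The comparative idea described for CRITICA above (translated products being more conserved than the underlying nucleotides) can be illustrated with a small calculation. The sketch below takes two gap-free, in-frame aligned ORFs, translates them with the standard genetic code, and reports protein identity minus nucleotide identity. This is only a simplified illustration of the principle; CRITICA's actual scoring is more elaborate, and the function names here are invented for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(orf):
    """Translate an in-frame DNA string; trailing partial codons are ignored."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))

def identity(a, b):
    """Fraction of identical positions between two equal-length strings."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def conservation_gap(orf_a, orf_b):
    """Protein identity minus nucleotide identity for two aligned ORFs.

    A clearly positive value means the differences are mostly synonymous,
    which is the kind of evidence for protein coding described above.
    """
    protein_id = identity(translate(orf_a), translate(orf_b))
    nucleotide_id = identity(orf_a.upper(), orf_b.upper())
    return protein_id - nucleotide_id

# Two toy ORFs that differ only at synonymous third-codon positions.
print(translate("ATGGCTAAAGGT"), translate("ATGGCCAAGGGA"))
print(conservation_gap("ATGGCTAAAGGT", "ATGGCCAAGGGA"))
```

Both toy ORFs encode the same peptide although a quarter of their nucleotides differ, so the gap is positive.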
Databases
 | A database of human genes, their products and their involvement in diseases. It offers concise information about the functions of all human genes that have an approved symbol, as well as selected others. It is especially useful for those working in functional genomics and proteomics. The data is collected with knowledge discovery and data mining techniques and accessed by means of a proprietary Guidance System that makes more or less intelligent suggestions to the user about where and how the information may be retrieved.
TRANSFAC | TRANSFAC is a transcription factor database. It compiles data about gene regulatory DNA sequences and the protein factors binding to them. On this basis, programs are developed that help to identify putative promoter or enhancer structures and to suggest their features.
EpoDB | A database of genes that relate to vertebrate red blood cells. A detailed description of EpoDB can be found in Chapter 5. The database includes DNA sequences, structural features and potential transcription factor binding sites.
 | A database of plant promoters.
 | A database of plant promoters.
RegulonDB | RegulonDB provides curated information on gene organization and regulation in E. coli. Current information is provided at the gene, operon and regulon level. Future expansion will include information on regulation beyond transcription initiation.