"Bioinformatics - the development and application of computational
methods to acquire, store, organize, archive and visualize biological data
- is one of the fastest-growing technologies." [1]
[definition]
"Bioinformatics is the field of science in which biology, computer
science, and information technology merge to form a single discipline. The
ultimate goal of the field is to enable the discovery of new biological
insights as well as to create a global perspective from which unifying principles
in biology can be discerned" [2]
[definition]
"Bioinformatics is the application of computer technology to the management
and analysis of biological data. The result is that computers are being used to
gather, store, analyse and merge biological data." [8]
[definition]
"Bioinformatics is conceptualising biology in terms of molecules
and applying informatics techniques (derived from disciplines such
as applied maths, computer science and statistics) to understand and organise
the information associated with these molecules, on a large scale. In short,
bioinformatics is a management information system for molecular biology
and has many practical applications." [3]
[some typical applications]
Data management with databases
How can the large amounts of data produced by, e.g., high-throughput
experiments or the sequencing of the human genome be handled?
Sequence analysis
One of the oldest areas in which bioinformatics developed into a science
of its own is the analysis of the information encoded in our own
genes.
Data analysis
All experiments need (and should get) computational support at some
point. From a few calculations in an Excel file to trained classifiers
used in the field of data mining, there are various ways to get
a better understanding of the experimental data.
Pathway reconstruction
The analysis of various types of experiments hopefully leads to insights into the underlying
biological mechanisms. The discovery of how the individual parts (molecules or other entities)
act together in a coordinated network is one of the most exciting areas of research.
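As a small illustration of the sequence analysis topic above, the following
Perl sketch computes two basic properties of a DNA sequence: its GC content
and its reverse complement. The helper names gc_content and
reverse_complement and the example sequence are made up for this
illustration; they are not part of EnsEMBL or any other library.

```perl
use strict;
use warnings;

# Hypothetical helper: fraction of G and C bases in a DNA string.
sub gc_content {
    my ($seq) = @_;
    my $len = length $seq;
    return 0 unless $len;              # avoid division by zero
    my $gc = ( $seq =~ tr/GCgc// );    # tr/// returns the number of matches
    return $gc / $len;
}

# Hypothetical helper: reverse complement of a DNA string.
sub reverse_complement {
    my ($seq) = @_;
    my $rc = reverse $seq;             # scalar context: reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;      # swap each base for its complement
    return $rc;
}

my $dna = 'ATGGCGTACGCTTAA';           # made-up example sequence
printf "GC content: %.2f\n", gc_content($dna);
print  "Reverse complement: ", reverse_complement($dna), "\n";
```

Real analyses would normally use an established toolkit rather than
hand-written helpers; the sketch only shows the kind of question that
sequence analysis starts from.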
[further readings & sources]
[1] M. Chicurel, "Bioinformatics: Bringing it all together", Nature 419, 751-757
EnsEMBL, the leading genome annotation and browsing system (www.ensembl.org)
EnsEMBL is a joint project between the Wellcome Trust Sanger Centre and
the European Bioinformatics Institute. The main goal is the automated annotation of genomes
(eukaryotic and model organisms).
Some information is imported from other resources, but most data on
the location of genes, transcripts and many other features within the
genome is gained by analysing sequence data with standard and
novel programs.
All data and all software are made available for free.
# connect explicitly using the DBAdaptor
# (used in examples 1-8)
use Bio::EnsEMBL::DBSQL::DBAdaptor;
my $db = Bio::EnsEMBL::DBSQL::DBAdaptor->new(
    -host   => 'ensembldb.ensembl.org',
    -dbname => 'homo_sapiens_core_42_36b',
    -user   => 'anonymous',
);
# OR connect automatically using the Registry
# (used in example 9)
# please see http://www.ensembl.org/info/software/registry/index.html
use Bio::EnsEMBL::Compara::DBSQL::DBAdaptor;
use Bio::EnsEMBL::Registry;
Bio::EnsEMBL::Registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous',
);
# connect as in (1), then list all chromosomes
my $slice_adaptor = $db->get_SliceAdaptor;
my @chromosomes;
foreach my $chr ( @{ $slice_adaptor->fetch_all('chromosome') } ) {
    # print out information
    print $chr->seq_region_name . ", " . $chr->start . " - " . $chr->end . "\n";
    # store the names
    push @chromosomes, $chr->seq_region_name;
    # or work with the chromosome...
}
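# connect as in (1)
# all genes from a slice, e.g. a chromosome
# get a slice first; chromosome 9 is only an illustrative choice
my $slice = $db->get_SliceAdaptor->fetch_by_region('chromosome', '9');
my @genes = @{ $slice->get_all_Genes() };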
#specific gene, using EnsEMBL-ID
my $gene_adaptor = $db->get_GeneAdaptor;
my $gene = $gene_adaptor->fetch_by_stable_id("ENSG00000147892");
#specific gene, using gene symbol (short name)
my $gene_adaptor = $db->get_GeneAdaptor;
my @genes = @{$gene_adaptor->fetch_all_by_external_name("ADAMTSL1")};
use Bio::EnsEMBL::DBSQL::DBAdaptor;
use Bio::EnsEMBL::Compara::DBSQL::DBAdaptor;
# use the Registry for a simple connection setup,
# please see http://www.ensembl.org/info/software/registry/index.html
use Bio::EnsEMBL::Registry;
Bio::EnsEMBL::Registry->load_registry_from_db(
-host => 'ensembldb.ensembl.org',
-user => 'anonymous'
);
#get compara adaptors
my $ma = Bio::EnsEMBL::Registry->get_adaptor(
'compara', 'compara', 'Member')
or die "\n$@\ncan't get adaptor 1.\n";
my $ha = Bio::EnsEMBL::Registry->get_adaptor(
'compara', 'compara', 'Homology')
or die "\n$@\ncan't get adaptor 2.\n";
# look up the human gene as a Compara member
my $query_species = "Homo_sapiens";
my $gene_id = "ENSG00000147892";
# fetch source gene; die instead of return, since this is not inside a sub
my $member = $ma->fetch_by_source_stable_id(
    "ENSEMBLGENE", $gene_id)
    or die "\ncan't find a member for $gene_id.\n";
my $sourceGenome = $member->genome_db->dbID;
print "\nsource gene ($query_species): ".$member->stable_id;
#get all homologues from other species
my $other_species = "Mus_musculus";
my $homologies = $ha->fetch_by_Member_paired_species(
$member, $other_species);
#or from all species
#my $homologies = $ha->fetch_by_Member($member);
#display all results
foreach my $homology (@$homologies) {
    foreach my $member_attrib (@{$homology->get_all_Member_Attribute}) {
        my ($newmember, $attrib) = @$member_attrib;
        if ($newmember->genome_db->dbID != $sourceGenome) {
            print "\nhomologue: ".$newmember->stable_id.
                " / ".$newmember->taxon_id.
                ": ".$newmember->chr_name.
                " ".$newmember->chr_start.
                "-".$newmember->chr_end;
        }
    }
}
10. get GeneOntology term for a gene using EnsEMBL & GOApph