HomoloGene, a tool of the National Center for Biotechnology Information (NCBI), is a system for automated detection of homologs (genes whose similarity is attributable to descent from a common ancestor) among the annotated genes of several completely sequenced eukaryotic genomes.

HomoloGene processing starts from an analysis of the proteins of the input organisms. Sequences are compared using blastp[1], then matched up and put into groups using a taxonomic tree built from sequence similarity: more closely related organisms are matched up first, and more distant organisms are then added to the tree. The protein alignments are mapped back to their corresponding DNA sequences, after which distance metrics such as molecular distances (Jukes and Cantor, 1969) and the Ka/Ks ratio can be calculated. The Ka/Ks ratio is the ratio of the rate of non-synonymous substitutions (Ka) to the rate of synonymous substitutions (Ks).
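As an illustration of the kind of molecular distance mentioned above, the following is a minimal sketch of the Jukes-Cantor (1969) correction, d = -(3/4) ln(1 - (4/3) p), where p is the proportion of differing sites between two aligned DNA sequences. The function names and the gap handling (columns with `-` are skipped) are assumptions made for this example; they are not taken from HomoloGene itself.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped
    (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable (ungapped) sites")
    return differing / compared


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor distance: d = -(3/4) * ln(1 - (4/3) * p)."""
    p = p_distance(seq_a, seq_b)
    if p == 0:
        return 0.0
    if p >= 0.75:
        # The correction is undefined at saturation; report infinity.
        return math.inf
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)
```

For example, `jukes_cantor_distance("ACGTACGT", "ACGTACGA")` uses p = 1/8 and gives a distance slightly larger than p, because the correction accounts for multiple substitutions at the same site.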

The sequences are matched up using a heuristic algorithm that maximizes the score globally, rather than locally, in a bipartite matching (see complete bipartite graph). The statistical significance of each match is then calculated. Cutoffs are made per position, and Ks values are set to prevent false "orthologs" from being grouped together. "Paralogs" are identified by finding sequences that are closer within a species than to sequences in other species.
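The difference between maximizing the score globally and locally in a bipartite matching can be shown with a small sketch. This is not HomoloGene's actual heuristic, which is not specified here: the score matrix is hypothetical, the "local" strategy is a simple greedy pick of the best remaining pair, and the "global" strategy is an exhaustive search that is only practical for tiny inputs.

```python
from itertools import permutations


def greedy_matching(scores):
    """Local strategy: repeatedly take the highest-scoring unused pair."""
    pairs = sorted(
        ((s, i, j) for i, row in enumerate(scores) for j, s in enumerate(row)),
        reverse=True,
    )
    used_rows, used_cols, matching = set(), set(), {}
    for _, i, j in pairs:
        if i not in used_rows and j not in used_cols:
            matching[i] = j
            used_rows.add(i)
            used_cols.add(j)
    return matching


def global_matching(scores):
    """Global strategy: try every assignment of a square score matrix
    and keep the one with the highest total score."""
    n = len(scores)
    best = max(
        permutations(range(n)),
        key=lambda perm: sum(scores[i][perm[i]] for i in range(n)),
    )
    return dict(enumerate(best))


def total_score(scores, matching):
    """Sum of the scores of all matched pairs."""
    return sum(scores[i][j] for i, j in matching.items())


# Hypothetical similarity scores: rows are genes from one species,
# columns are genes from another.
scores = [
    [10, 9],
    [9, 1],
]

print(total_score(scores, greedy_matching(scores)))  # 11
print(total_score(scores, global_matching(scores)))  # 18
```

The greedy strategy takes the single best pair (score 10) and is then forced into a poor second pair, while the global strategy accepts two slightly lower scores for a higher total.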

## Input organisms

Homo sapiens, Mus musculus, Danio rerio, Rattus norvegicus, Pan troglodytes, Canis lupus familiaris, Arabidopsis thaliana, Gallus gallus, Oryza sativa, Anopheles gambiae, Drosophila melanogaster, Magnaporthe grisea, Neurospora crassa, Caenorhabditis elegans, Saccharomyces cerevisiae, Kluyveromyces lactis, Eremothecium gossypii, Schizosaccharomyces pombe and Plasmodium falciparum.

## Interface

HomoloGene is linked to all Entrez databases and is based on homology and phenotype information from the following resources:

• Mouse Genome Informatics (MGI),
• Zebrafish Information Network (ZFIN),
• Saccharomyces Genome Database (SGD),
• Clusters of Orthologous Groups (COG),
• FlyBase,
• Online Mendelian Inheritance in Man (OMIM)

As a result, HomoloGene displays information about genes, proteins, phenotypes, and conserved domains.