The human genome is the genome of Homo sapiens, which is stored on 23 chromosome pairs. Twenty-two of these are autosomal chromosome pairs, while the remaining pair is sex-determining. The haploid human genome occupies a total of just over 3 billion DNA base pairs and has a data size of approximately 750 megabytes, which is slightly larger than the capacity of a standard Compact Disc. The Human Genome Project produced a reference sequence of the euchromatic human genome, which is used worldwide in biomedical sciences.
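The data-size figure can be checked with a short calculation. The sketch below assumes each base is stored in 2 bits (enough to distinguish A, C, G and T), with no compression or metadata, and uses the decimal megabyte (10^6 bytes); the function name is illustrative only.

```python
def genome_size_megabytes(base_pairs: int, bits_per_base: int = 2) -> float:
    """Return the raw storage size in decimal megabytes (10**6 bytes).

    Assumes a fixed-width encoding: 2 bits suffice for the four bases.
    """
    total_bits = base_pairs * bits_per_base
    total_bytes = total_bits / 8
    return total_bytes / 10**6


# Roughly 3 billion base pairs in the haploid genome.
print(genome_size_megabytes(3_000_000_000))  # 750.0
```

Under the binary definition of a megabyte (2^20 bytes) the same data comes to somewhat less, about 715 megabytes.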
The haploid human genome contains an estimated 20,000–25,000 protein-coding genes, far fewer than had been expected before its sequencing. In fact, only about 1.5% of the genome codes for proteins, while the rest consists of RNA genes, regulatory sequences, introns and (controversially) "junk" DNA.
There are 24 distinct human chromosomes: 22 autosomal chromosomes, plus the sex-determining X and Y chromosomes. Chromosomes 1–22 are numbered roughly in order of decreasing size. Somatic cells usually have 23 chromosome pairs: one copy of chromosomes 1–22 from each parent, plus an X chromosome from the mother, and either an X or Y chromosome from the father, for a total of 46.
There are an estimated 20,000–25,000 human protein-coding genes.
Surprisingly, the number of human genes seems to be less than a factor of two greater than that of many much simpler organisms, such as the roundworm and the fruit fly. However, human cells make extensive use of alternative splicing to produce several different proteins from a single gene, and the human proteome is thought to be much larger than those of the aforementioned organisms.
Most human genes have multiple exons, and human introns are frequently much longer than the flanking exons.
Human genes are distributed unevenly across the chromosomes. Each chromosome contains various gene-rich and gene-poor regions, which seem to be correlated with chromosome bands and GC-content. The significance of these nonrandom patterns of gene density is not well understood. In addition to protein-coding genes, the human genome contains thousands of RNA genes, including tRNA, ribosomal RNA, microRNA, and other non-coding RNA genes.
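GC-content, mentioned above as a correlate of gene density, is the percentage of bases in a sequence that are guanine or cytosine. A minimal sketch of how it might be computed along a sequence follows; the function names, the window size and the choice of non-overlapping windows are assumptions made for illustration.

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of bases that are G or C (case-insensitive)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def windowed_gc(sequence: str, window: int) -> list[float]:
    """GC-content of consecutive non-overlapping windows.

    A trailing partial window is dropped.
    """
    return [
        gc_content(sequence[i:i + window])
        for i in range(0, len(sequence) - window + 1, window)
    ]


print(windowed_gc("GGCCAATT", 4))  # [100.0, 0.0]
```

On real chromosomes, windows of many kilobases would be used so that regional trends are not drowned out by local noise.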
The human genome has many different regulatory sequences which are crucial to controlling gene expression. These are typically short sequences that appear near or within genes. A systematic understanding of these regulatory sequences and how they together act as a gene regulatory network is only beginning to emerge from computational, high-throughput expression and comparative genomics studies.
Identification of regulatory sequences relies in part on evolutionary conservation. The evolutionary branch between the human and mouse lineages, for example, occurred 70–90 million years ago. Non-coding sequences that computer comparisons show to have been conserved over that time are therefore likely to be important in functions such as gene regulation.
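The idea of scanning for conserved sequence can be sketched as a sliding-window identity calculation. The example below assumes the two sequences have already been aligned without gaps and are of equal length; the function name, window size and identity threshold are arbitrary illustrative choices. Real comparative-genomics pipelines use dedicated alignment tools and statistical models of neutral divergence rather than a fixed cutoff.

```python
from typing import Iterator


def conserved_windows(
    human: str, mouse: str, window: int = 10, min_identity: float = 0.9
) -> Iterator[tuple[int, float]]:
    """Yield (start, identity) for each window whose identity meets the threshold.

    Both inputs are assumed to be pre-aligned, gap-free and of equal length.
    """
    if len(human) != len(mouse):
        raise ValueError("aligned sequences must have equal length")
    for start in range(0, len(human) - window + 1):
        h = human[start:start + window]
        m = mouse[start:start + window]
        matches = sum(a == b for a, b in zip(h, m))
        identity = matches / window
        if identity >= min_identity:
            yield start, identity


# Toy data: the first ten bases are identical, the rest diverge completely.
human = "ACGTACGTAC" + "TTTTTTTTTT"
mouse = "ACGTACGTAC" + "GGGGGGGGGG"
print(list(conserved_windows(human, mouse, window=10, min_identity=1.0)))
# [(0, 1.0)]
```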
Another comparative genomic approach to locating regulatory sequences in humans is sequencing the genes of the puffer fish. These vertebrates have essentially the same genes and regulatory gene sequences as humans, but with only one-eighth the "junk" DNA. The compact DNA sequence of the puffer fish makes it much easier to locate the regulatory genes.
Protein-coding sequences (specifically, coding exons) comprise less than 1.5% of the human genome. Aside from genes and known regulatory sequences, the human genome contains vast regions of DNA the function of which, if any, remains unknown. These regions in fact comprise the vast majority, by some estimates 97%, of the human genome size. Much of this is composed of repeat elements (tandem repeats such as satellite DNA, minisatellites and microsatellites, as well as interspersed repeats), transposons (including retrotransposons) and pseudogenes.
However, there is also a large amount of sequence that does not fall under any known classification.
Much of this sequence may be an evolutionary artifact that serves no present-day purpose, and these regions are sometimes collectively referred to as "junk" DNA. There are, however, a variety of emerging indications that many sequences within these regions are likely to function in ways that are not fully understood. Recent experiments using microarrays have revealed that a substantial fraction of non-genic DNA is in fact transcribed into RNA, which leads to the possibility that the resulting transcripts may have some unknown function. Also, the evolutionary conservation across the mammalian genomes of much more sequence than can be explained by protein-coding regions indicates that many, and perhaps most, functional elements in the genome remain unknown. The investigation of the vast quantity of sequence information in the human genome whose function remains unknown is currently a major avenue of scientific inquiry.
Most studies of human genetic variation have focused on single nucleotide polymorphisms (SNPs), which are substitutions in individual bases along a chromosome. Most analyses estimate that SNPs occur on average somewhere between every 1 in 100 and 1 in 1,000 base pairs in the euchromatic human genome, although they do not occur at a uniform density. Thus follows the popular statement that "we are all, regardless of race, genetically 99.9% the same", although this would be somewhat qualified by most geneticists. For example, a much larger fraction of the genome is now thought to be involved in copy number variation. A large-scale collaborative effort to catalog SNP variations in the human genome is being undertaken by the International HapMap Project.
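The "99.9% the same" figure is simple arithmetic on the SNP density quoted above. The sketch below treats single-base substitutions as the only source of difference, which is exactly the simplification that copy number variation calls into question; the function names and the round genome size of 3 billion base pairs are illustrative assumptions.

```python
def percent_identical(snp_spacing_bp: int) -> float:
    """Percent of positions that match if one SNP occurs per `snp_spacing_bp` bases."""
    return 100.0 * (1 - 1 / snp_spacing_bp)


def expected_differences(genome_bp: int, snp_spacing_bp: int) -> int:
    """Approximate number of single-base differences across the genome."""
    return genome_bp // snp_spacing_bp


for spacing in (100, 1_000):
    print(
        f"1 SNP per {spacing} bp: "
        f"{percent_identical(spacing):.1f}% identical, "
        f"about {expected_differences(3_000_000_000, spacing):,} differences"
    )
```

At the sparse end of the range (1 in 1,000) this gives the familiar 99.9%, which still corresponds to about three million differing positions; at the dense end (1 in 100) the figure drops to 99%.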
The genomic loci and length of certain types of small repetitive sequences are highly variable from person to person, which is the basis of DNA fingerprinting and DNA paternity testing technologies. The heterochromatic portions of the human genome, which total several hundred million base pairs, are also thought to be quite variable within the human population (they are so repetitive and so long that they cannot be accurately sequenced with current technology). These regions contain few genes, and it is unclear whether any significant phenotypic effect results from typical variation in repeats or heterochromatin.
Most gross genomic mutations in germ cells probably result in inviable embryos; however, a number of human diseases are related to large-scale genomic abnormalities. Down syndrome, Turner syndrome, and a number of other diseases result from nondisjunction of entire chromosomes. Cancer cells frequently have aneuploidy of chromosomes and chromosome arms, although a cause and effect relationship between aneuploidy and cancer has not been established.
Most aspects of human biology involve both genetic (inherited) and non-genetic (environmental) factors. Some inherited variation influences aspects of our biology that are not medical in nature (height, eye color, ability to taste or smell certain compounds, etc.). Moreover, some genetic disorders only cause disease in combination with the appropriate environmental factors (such as diet). With these caveats, genetic disorders may be described as clinically defined diseases caused by genomic DNA sequence variation. In the most straightforward cases, the disorder can be associated with variation in a single gene. For example, cystic fibrosis is caused by mutations in the CFTR gene and is the most common recessive disorder in Caucasian populations; over 1300 different mutations are known. Disease-causing mutations in specific genes are usually severe in terms of gene function and are fortunately rare, so genetic disorders are similarly individually rare. However, since there are many genes that can vary to cause genetic disorders, in aggregate they comprise a significant component of known medical conditions, especially in pediatric medicine. Molecularly characterized genetic disorders are those for which the underlying causal gene has been identified; currently there are approximately 2200 such disorders annotated in the OMIM database.
Studies of genetic disorders are often performed by means of family-based studies. In some instances population-based approaches are employed, particularly in the case of so-called founder populations such as those in Finland, French Canada, Utah, Sardinia, etc. Diagnosis and treatment of genetic disorders are usually performed by a geneticist-physician trained in clinical/medical genetics. The results of the Human Genome Project are likely to provide increased availability of genetic testing for gene-related disorders, and eventually improved treatment. Parents can be screened for hereditary conditions and counselled on the consequences, the probability that a condition will be inherited, and how to avoid or ameliorate it in their offspring.
As noted above, there are many different kinds of DNA sequence variation, ranging from complete extra or missing chromosomes down to single nucleotide changes. It is generally presumed that much naturally occurring genetic variation in human populations is phenotypically neutral, i.e. has little or no detectable effect on the physiology of the individual (although there may be fractional differences in fitness defined over evolutionary time frames). Genetic disorders can be caused by any or all known types of sequence variation. To molecularly characterize a new genetic disorder, it is necessary to establish a causal link between a particular genomic sequence variant and the clinical disease under investigation. Such studies constitute the realm of human molecular genetics.
With the advent of the Human Genome Project and the International HapMap Project, it has become feasible to explore subtle genetic influences on many common disease conditions such as diabetes, asthma, migraine, schizophrenia, etc. Although some causal links have been made between genomic sequence variants in particular genes and some of these diseases, often with much publicity in the general media, these are usually not considered to be genetic disorders per se, as their causes are complex, involving many different genetic and environmental factors. Thus there may be disagreement in particular cases over whether a specific medical condition should be termed a genetic disorder.
Comparative genomics studies of mammalian genomes suggest that approximately 5% of the human genome has been conserved by evolution since the divergence of those species approximately 200 million years ago, containing the vast majority of genes. Intriguingly, since genes and known regulatory sequences probably comprise less than 2% of the genome, this suggests that there may be more unknown functional sequence than known functional sequence. A smaller, yet large, fraction of human genes seems to be shared among most known vertebrates. The chimpanzee genome is 95% identical to the human genome. On average, a typical human protein-coding gene differs from its chimpanzee ortholog by only two amino acid substitutions; nearly one third of human genes have exactly the same protein translation as their chimpanzee orthologs. A major difference between the two genomes is human chromosome 2, which is equivalent to a fusion product of chimpanzee chromosomes 12 and 13.
Humans have undergone an extraordinary loss of olfactory receptor genes during our recent evolution, which explains our relatively crude sense of smell compared to most other mammals. Evolutionary evidence suggests that the emergence of color vision in humans and several other primate species has diminished the need for the sense of smell.
The human mitochondrial genome, while usually not included when referring to the "human genome", is of tremendous interest to geneticists, since it undoubtedly plays a role in mitochondrial disease. It also sheds light on human evolution; for example, analysis of variation in the human mitochondrial genome has led to the postulation of a recent common ancestor for all humans on the maternal line of descent (see Mitochondrial Eve).
Due to the lack of a system for checking for copying errors, mitochondrial DNA (mtDNA) has a more rapid rate of variation than nuclear DNA. This 20-fold increase in the mutation rate allows mtDNA to be used for more accurate tracing of maternal ancestry. Studies of mtDNA in populations have allowed ancient migration paths to be traced, such as the migration of Native Americans from Siberia or Polynesians from southeastern Asia. It has also been used to show that there is no trace of Neanderthal DNA in the European gene mixture inherited through purely maternal lineage.
A variety of features of the human genome that transcend its primary DNA sequence, such as chromatin packaging, histone modifications and DNA methylation, are important in regulating gene expression, genome replication and other cellular processes. These "epigenetic" features are thought to be involved in cancer and other abnormalities, and some may be heritable across generations.