GPATCH11

G-patch containing protein 11
Identifiers
Symbol	GPATCH11
Alt. symbols	CCDC75, CENPY
NCBI gene	253635
HGNC	26768
RefSeq	NP_777591.3
UniProt	Q8N954
Other data
Locus	Chr. 2 p22.2
Structures
Search for
Structures	Swiss-model
Domains	InterPro

GPATCH11
Identifiers
Aliases	GPATCH11, CCDC75, CENP-Y, CENPY, G-patch domain containing 11
External IDs	MGI: 1858435; HomoloGene: 44687; GeneCards: GPATCH11; OMA:GPATCH11 - orthologs
Gene location (Human)
Chr.	Chromosome 2 (human)
End	37,099,244 bp
Gene location (Mouse)
Chr.	Chromosome 17 (mouse)
End	79,155,737 bp
RNA expression pattern
Top expressed in (human)	Achilles tendon; middle temporal gyrus; pancreatic ductal cell; testicle; endothelial cell; cardiac muscle tissue of right atrium; adipose tissue; skin of arm; postcentral gyrus; Brodmann area 23
Top expressed in (mouse)	interventricular septum; zygote; genital tubercle; neural layer of retina; yolk sac; granulocyte; superior cervical ganglion; morula; muscle of thigh; tail of embryo
Orthologs
Species	Human	Mouse
Entrez	253635	53951
Ensembl	ENSG00000152133	ENSMUSG00000050668
UniProt	Q8N954	Q3UFS4
RefSeq (mRNA)	NM_174931; NM_001278505; NM_001322249	NM_181649
RefSeq (protein)	NP_001265434; NP_001309178; NP_777591; NP_001358785; NP_001358787; NP_001358788; NP_001358789; NP_001358790; NP_001358791	NP_857632; NP_001390142

GPATCH11 (G-patch domain containing protein 11) is a protein that in humans is encoded by the GPATCH11 gene. The gene has four transcript variants, two of which encode functional protein isoforms, and it is expressed in most human tissues. The protein has been found to interact with several other proteins, including two splicing factors. In addition, GPATCH11 has orthologs throughout the domain Eukarya.

Gene

G-patch domain containing protein 11 is encoded in humans by the GPATCH11 gene, which is located on chromosome 2 at 2p22.2.^[5] The gene has several aliases, including CCDC75 and CENPY.^[6] It is 14,484 bp long and contains 9 exons. Though the function of the protein is not yet known, it is predicted to take part in nucleic acid binding and protein binding.^[6]^[7]

mRNA

GPATCH11 has four predicted transcript variants, though only two are known to code for functional protein. The longest variant retains all 9 exons, whereas the second functional variant has 7 exons, with exons 3 and 4 spliced out.
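
The relationship between the two functional variants can be sketched in a few lines of Python. This is only an illustration of exon skipping using the exon numbers given above; the names `FULL_LENGTH_EXONS` and `skip_exons` are hypothetical and not taken from any annotation tool.

```python
# Exon numbering follows the text: the long variant keeps exons 1-9.
FULL_LENGTH_EXONS = list(range(1, 10))


def skip_exons(exons, skipped):
    """Return the exons retained after removing the skipped ones."""
    skipped = set(skipped)
    return [exon for exon in exons if exon not in skipped]


# Second functional variant: exons 3 and 4 are left out.
short_variant = skip_exons(FULL_LENGTH_EXONS, [3, 4])
print(short_variant)       # [1, 2, 5, 6, 7, 8, 9]
print(len(short_variant))  # 7
```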

Protein

GPATCH11 has a molecular weight of about 33.3 kDa and is 285 amino acids long.^[6]^[9] A second isoform is 156 amino acids long. The protein contains a G-patch domain and the DUF4187 domain. The G-patch domain itself is a novel domain found only in Eukarya. BLAST searches of the human sequence against bacteria, archaea, and viruses support this finding.^[6]
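
As a rough sanity check on the figures above, the mean mass per residue can be computed from the stated mass and length. The sketch below assumes the reported values (33.3 kDa, 285 residues) and compares them with the commonly quoted rule of thumb of roughly 110 Da per residue; the function name and the constant are illustrative, not part of any cited tool.

```python
def mean_residue_mass(mass_kda, length_aa):
    """Average mass per residue in daltons."""
    if length_aa <= 0:
        raise ValueError("length must be positive")
    return mass_kda * 1000.0 / length_aa


# Approximate textbook average mass of a residue (an assumption here).
RULE_OF_THUMB_DA = 110.0

observed = mean_residue_mass(33.3, 285)
print(f"{observed:.1f} Da per residue")
print(f"{observed / RULE_OF_THUMB_DA:.2f}x the rule-of-thumb average")
```

A value slightly above the rule-of-thumb average would be consistent with a sequence enriched in heavier residues such as glutamic acid, though this is only a coarse check.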

Primary structure

The following is the primary sequence of the long form of GPATCH11:

Human GPATCH11 protein sequence: The yellow region depicts the G-patch domain, while the blue region depicts the DUF domain.

The protein is rich in glutamic acid and is highly charged. It is low in valine, threonine, phenylalanine, and proline. It is a soluble protein and has a nuclear export signal and a bipartite nuclear import signal, suggesting that it is localized in the nucleus.
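
Composition statements of this kind can be checked with a short script. The sequence itself is not reproduced here, so the sketch below runs on a made-up toy peptide (`toy`, hypothetical); the choice to count only Asp, Glu, Lys, and Arg as charged, leaving out histidine, is also an assumption.

```python
from collections import Counter

# Residues counted as charged: Asp, Glu (acidic) and Lys, Arg (basic).
# Histidine is left out; that choice is an assumption of this sketch.
CHARGED = set("DEKR")


def composition(seq):
    """Fraction of each amino acid (one-letter code) in the sequence."""
    seq = seq.upper()
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}


def charged_fraction(seq):
    """Fraction of residues that belong to CHARGED."""
    seq = seq.upper()
    return sum(1 for aa in seq if aa in CHARGED) / len(seq)


# Hypothetical toy peptide, NOT the GPATCH11 sequence.
toy = "MEEKRDLEEA"
print(composition(toy)["E"])   # 0.4
print(charged_fraction(toy))   # 0.7
```

Running the same two functions on the sequence under accession NP_777591.3 would give the glutamic acid and charged-residue fractions for the long isoform.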

Secondary structure

The conserved areas of the protein have a secondary structure composed only of alpha-helices and coiled-coil regions.

Tertiary structure

The image to the right is the predicted tertiary structure of GPATCH11 based on results obtained from I-TASSER. The confidence score was very low, so the reliability of the model is uncertain. However, it agrees with the secondary structure prediction that the protein is composed primarily of alpha-helices and coiled coils.

Protein expression

Protein expression has been found in the endocrine and nervous systems, along with the eye, breast, colon, liver, ovary, and 55 other tissues. Gene expression is about 1.1 times the average. The highest expression is found in the brain and spinal cord, followed by the spleen. Six areas of the brain express GPATCH11 above average: the olfactory areas, hippocampus, midbrain, pons, medulla, and cerebellum.^[10] In addition, expression levels are higher in cancerous tissue than in normal tissue.

Predicted Post-Translational Modification

Various tools at ExPASy^[11] predict the following possible post-translational modifications for GPATCH11:

3 possible CK2 phosphorylation sites
6 possible PKC phosphorylation sites
2 possible N-myristoylation sites
6 possible glycation sites

Protein Interaction

Protein	Abbreviation	Location	Function
Brain-specific angiogenesis inhibitor 3	BAI3	x	Plays a role in the regulation of synaptogenesis and dendritic spine formation
Jun proto-oncogene	JUN	Nucleus^[12]	Highly similar to the avian viral sarcoma protein; interacts directly with specific target DNA sequences to regulate gene expression
Zinc finger (CCCH type) RNA-binding motif and serine/arginine rich 2	ZRSR2	Nucleus^[12]	Encodes an essential splicing factor, and may play a role in network interactions during spliceosome assembly.
U2 small nuclear RNA auxiliary factor 1	U2AF1	Nucleus^[12]	Plays a critical role in both constitutive and enhancer-dependent splicing

The interaction between GPATCH11 and BAI3 was found via PSICQUIC,^[13] mentha,^[13] and STRING.^[12] The confidence score given by mentha is only 0.454; however, according to STRING, the interaction has been experimentally determined by a validated two-hybrid approach. The two proteins are thought to have a direct physical interaction. BAI3 is a transmembrane protein and a p53 target gene. BAI3 may regulate the number of excitatory synapses formed on hippocampal neurons, and may be involved in angiogenesis inhibition and suppression of glioblastoma. Because GPATCH11 has higher-than-average expression in the hippocampus and the spinal cord, this interaction is plausible.

The interaction between GPATCH11 and JUN could be real as JUN is both localized in the nucleus and associated with cancers. GPATCH11 tends to have higher expression in cancerous tissue compared to normal tissue, so interaction with other proteins highly expressed in cancers seems plausible.

Finally, the interactions of GPATCH11 with ZRSR2 and with U2AF1 appear plausible because ZRSR2 and U2AF1 are known to interact with each other, and all three proteins are localized in the nucleus.

Evolutionary History

The protein is found in all taxa of the domain Eukarya, including unicellular organisms. Aligning the human sequence with sequences from the various taxa revealed high conservation in the G-patch domain and the DUF4187 domain.^[6] Alignments with closely related taxa such as birds and reptiles revealed conservation over the majority of the sequence. Alignments with more distantly related taxa such as fungi and plants showed less conservation, with identities below 40%, though the G-patch domain and the DUF domain were still highly conserved.^[14] Overall, the protein is composed mainly of charged amino acids, both acidic and basic. There were no regions of sustained non-polarity. This suggests that GPATCH11 is not a transmembrane protein, since a membrane-spanning segment requires a long non-polar region.
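
The reasoning about non-polar regions can be illustrated with a sliding-window hydropathy scan. The sketch below uses the Kyte-Doolittle scale with a 19-residue window and a cutoff of 1.6, values commonly used for flagging candidate membrane-spanning segments; the source does not say which method was applied to GPATCH11, so this is a substitute illustration, and both test sequences are made up.

```python
# Kyte-Doolittle hydropathy values (one-letter amino acid codes).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def max_window_hydropathy(seq, window=19):
    """Highest mean hydropathy over any window of the given length."""
    seq = seq.upper()
    if len(seq) < window:
        raise ValueError("sequence shorter than window")
    scores = [KD[aa] for aa in seq]
    return max(
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    )


def has_candidate_tm_segment(seq, window=19, threshold=1.6):
    """True if any window reaches the hydropathy threshold."""
    return max_window_hydropathy(seq, window) >= threshold


# Hypothetical sequences, NOT taken from GPATCH11.
toy_charged = "EK" * 20                        # charged throughout
toy_tm = "EK" * 5 + "L" * 19 + "EK" * 5        # one long non-polar stretch

print(has_candidate_tm_segment(toy_charged))   # False
print(has_candidate_tm_segment(toy_tm))        # True
```

A protein made up mainly of charged residues, as described above, would be expected to behave like the first toy sequence.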

When comparing the rate of evolution of GPATCH11 to known proteins such as fibrinogen and cytochrome c, GPATCH11 is evolving quite rapidly, similar to the rate of the fibrinogen protein. An unrooted evolutionary tree^[14] can be seen to the right including representatives of species ranging from invertebrates to mammals. This shows the hypothetical relationship of the GPATCH11 sequence among the different taxa, and is supported by divergence time of the taxa from humans as well as sequence identity/similarity.

Homology

The protein is highly conserved within the domain Eukarya. The table below lists species from a range of taxa whose GPATCH11 sequence was compared to the human GPATCH11 sequence. Protein sequence lengths, similarities, and identities are given, along with divergence time from humans in millions of years.

Genus and Species	Common Name	Divergence (MYA)^[15]	Accession number	Sequence length (amino acids)	Sequence identity (%)	Sequence similarity (%)
Homo sapiens	Human	0	NP_777591.3	285	100	100
Equus asinus	African ass	97.5	XP_014688350.1	285	94	97
Picoides pubescens	Downy woodpecker	320.5	XP_009910012.1	256	73	86
Merops nubicus	Northern carmine bee-eater	320.5	XP_008934567.1	258	73	87
Chrysemys picta bellii	Western painted turtle	320.5	XP_005296317.1	257	76	89
Alligator mississippiensis	American alligator	320.5	XP_006272937.1	260	71	85
Xenopus tropicalis	Western clawed frog	355.7	NP_001005035.1	261	63	80
Neolamprologus brichardi	Fairy (lyretail) cichlid	429.6	XP_006807714.1	260	60	78
Stegastes partitus	Bicolor damselfish	429.6	XP_008301855.1	265	58	78
Branchiostoma floridae	Florida lancelet	743	XP_002610131.1	264	45	65
Saccoglossus kowalevskii	Acorn worm	747.8	XP_002731571.2	311	48	67
Crassostrea gigas	Pacific oyster	847	XP_011417222.1	262	43	61
Bombus terrestris	Buff-tailed bumblebee	847	XP_012173875.1	246	40	63
Monomorium pharaonis	Pharaoh ant	847	XP_012521549.1	248	38	61
Halyomorpha halys	Brown marmorated stink bug	847	XP_014272647.1	258	41	61
Trichoplax adhaerens	Placozoan	936	XP_002108305.1	256	42	60
Batrachochytrium dendrobatidis	Chytrid fungus	1302.5	XP_006681792.1	277	31	55
Saccharomyces cerevisiae	Baker's yeast	1302.5	NP_013373.1	274	42	62
Musa acuminata malaccensis	Wild banana	1513.9	XP_009405687.1	248	33	51
Capsella rubella	Pink Shepherd's-Purse	1513.9	XP_006290276.1	269	33	54
Elaeis guineensis	African oil palm	1513.9	XP_010928444.1	253	34	52
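
The identity and divergence columns above can be turned into a rough measure of sequence change over time. The sketch below applies a Poisson correction, d = -ln(identity fraction), to a subset of rows copied from the table; this is one common textbook correction and is not necessarily the method used to compare GPATCH11 with fibrinogen and cytochrome c. The names `ROWS` and `poisson_distance` are illustrative.

```python
import math

# (common name, divergence in MYA, % identity to human), from the table above.
ROWS = [
    ("Human", 0.0, 100),
    ("African ass", 97.5, 94),
    ("Western painted turtle", 320.5, 76),
    ("Western clawed frog", 355.7, 63),
    ("Florida lancelet", 743.0, 45),
    ("Pacific oyster", 847.0, 43),
    ("Wild banana", 1513.9, 33),
]


def poisson_distance(identity_percent):
    """Estimated substitutions per site, d = -ln(identity fraction)."""
    if not 0 < identity_percent <= 100:
        raise ValueError("identity must be in (0, 100]")
    if identity_percent == 100:
        return 0.0
    return -math.log(identity_percent / 100.0)


for name, mya, identity in ROWS:
    d = poisson_distance(identity)
    print(f"{name:24s} {mya:7.1f} MYA  identity {identity:3d}%  d = {d:.3f}")
```

Plotting d against divergence time for GPATCH11 and for reference proteins is one way to compare their rates of change.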

Clinical significance

The clinical significance of GPATCH11 is not yet known. However, it is present at much higher levels in cancerous tissue than in normal tissue and has shown possible interactions with oncogene products, so it may be involved in cancer.

References

[1] GRCh38: Ensembl release 89: ENSG00000152133 – Ensembl, May 2017
[2] GRCm38: Ensembl release 89: ENSMUSG00000050668 – Ensembl, May 2017
[3] "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
[4] "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
[5] "GeneCards - Human Genes | Gene Database | Gene Search". genecards.org. Archived from the original on 2016-02-29. Retrieved 2016-02-29.
[6] "National Center for Biotechnology Information". ncbi.nlm.nih.gov. Retrieved 2016-02-29.
[7] "UniProt". uniprot.org. Retrieved 2016-02-29.
[8] "AceView: a comprehensive annotation of human and worm genes with mRNAs or ESTs". ncbi.nlm.nih.gov. Retrieved 2016-05-09.
[9] "Ensembl genome browser 83". ensembl.org. Retrieved 2016-02-29.
[10] "ISH Data :: Allen Brain Atlas: Mouse Brain". mouse.brain-map.org. Retrieved 2016-05-09.
[11] ExPASy Proteomics Server
[12] "STRING: functional protein association networks". string-db.org. Retrieved 2016-05-09.
[13] PSICQUIC. "PSICQUIC View". ebi.ac.uk. Retrieved 2016-05-09.
[14] "SDSC Biology Workbench". workbench.sdsc.edu. Retrieved 2016-02-29.
[15] "TimeTree :: The Timescale of Life". timetree.org. Retrieved 2016-02-29.
