Nova-1 KH3 K-Homology RNA-Binding Domain

Kenny Farabaugh '10

I. Introduction

Nova-1 is an autoantigen in the human autoimmune neurological disease paraneoplastic opsoclonus-myoclonus ataxia (POMA). POMA is characterized by loss of motor control of the limbs and eyes. Ordinarily, the Nova-1 protein is expressed only in the central nervous system (CNS), but in POMA patients it is expressed in tumors outside the CNS, triggering an autoimmune response against the Nova-1 protein, which is treated as 'non-self' by virtue of being found outside the CNS. The resulting autoimmune attack on the CNS is believed to cause the loss of motor control characteristic of this disease. [1]

Nova-1 is involved in the regulation of alternative splicing of mRNA in neural cells. This function was originally theorized based on the proximity of Nova-1 binding sites to introns near known alternatively spliced exons. The mechanism [2] involves the selective binding of the single-stranded hairpin loop of an RNA molecule, performed by the three K-Homology (KH) domains of the Nova-1 protein. When Nova-1 is present, the KH domains bind a specific glycine receptor pre-mRNA that carries a repeated UCAU tetrad, thereby inhibiting splicing at this site and causing splicing at another site. When the autoimmune response is activated, antibodies are produced that preferentially bind the KH domain of Nova-1, inhibiting RNA binding; when this alternative splicing does not occur, the result is apoptotic cell death and POMA. [2]
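To make the idea of a repeated UCAU tetrad concrete, here is a minimal Python sketch that locates every UCAU occurrence in an RNA string. The function name `find_ucau_sites` and the example sequence are hypothetical and chosen only for illustration; this is a plain text search and does not model hairpin structure or actual Nova-1 binding.

```python
import re

def find_ucau_sites(rna: str) -> list[int]:
    """Return 0-based start positions of every UCAU tetrad.

    A zero-width lookahead is used so that overlapping tetrads
    (for example UCAUCAU) are all reported.
    """
    # Accept lowercase input and DNA-style T in place of U.
    rna = rna.upper().replace("T", "U")
    return [m.start() for m in re.finditer(r"(?=UCAU)", rna)]

# Hypothetical sequence, not taken from any real transcript.
example = "GGUCAUCAUUUCAUGG"
print(find_ucau_sites(example))  # [2, 5, 10]
```

A sequence with several closely spaced hits, as in this toy example, is the kind of repeat the text describes as the preferred target.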

Unfortunately, the entire Nova-1 protein has not been crystallized, but the most important functional domains, the KH RNA-binding domains, have. Since their discovery in the heterogeneous nuclear ribonucleoprotein (hnRNP) K, a superfamily of homologous domains has been found in eukaryotes and eubacteria, including such proteins as insulin-like growth factor 2 mRNA-binding protein, Fragile X disease protein FMR-1, and the 40S ribosomal protein S3. The functions of proteins containing KH RNA-binding domains range from translation to alternative splicing to mRNA localization and possibly even RNA interference. [1]

II. General Structure [1]

The KH domain contains 70 amino acid residues on average across the superfamily; specifically, the Nova-1 KH3 domain is composed of 76 residues, with 3 antiparallel beta-ribbons on one side and 3 alpha-helices on the other in the order S1-H1-H2-S2-S3-H3.

The structure of KH domains includes two well-defined loops, the H1-H2 loop and the S2-S3 loop. The H1-H2 loop includes a highly conserved Gly-X-X-Gly section, where X is usually Arg, Lys, or Gly. In the Nova-1 KH domains, X represents Lys and Gly (residues 23-24). The S2-S3 loop is highly variable among the KH superfamily, and can include up to 44 amino acid residues. In the Nova-1 KH domain, the S2-S3 loop contains 11 amino acid residues (residues 41-52).
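The Gly-X-X-Gly pattern described above can be searched for in a one-letter protein sequence. The following is a minimal sketch, assuming a hypothetical helper `find_gxxg` and a made-up peptide; only the central GKGG motif (Gly-22, Lys-23, Gly-24, Gly-25) comes from the text, and the flanking residues are invented for illustration.

```python
import re

# Residues the text lists as the usual occupants of the X positions: Arg, Lys, Gly.
CONSERVED_X = set("RKG")

def find_gxxg(protein: str) -> list[tuple[int, str, bool]]:
    """Find every Gly-X-X-Gly match in a one-letter protein sequence.

    Returns (0-based start, the two X residues, whether both X residues
    are among Arg/Lys/Gly). Lookahead allows overlapping matches.
    """
    hits = []
    for m in re.finditer(r"(?=G(..)G)", protein.upper()):
        middle = m.group(1)
        hits.append((m.start(), middle, set(middle) <= CONSERVED_X))
    return hits

# Hypothetical peptide; only the GKGG core reflects the Nova-1 motif in the text.
print(find_gxxg("AVIGKGGATI"))  # [(3, 'KG', True)]
```

A pattern match alone does not establish that a sequence folds into a KH domain; it only flags candidate H1-H2 loop motifs.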

The KH domain is structured in such a way that aliphatic hydrophobic amino acid residues mainly face inward and polar hydrophilic residues face outward. This results in a tightly structured domain that resists both rearrangement and protease degradation in a cytoplasmic solvent.

This is how the KH1 and KH2 domains are believed to fit into the entire Nova-1 protein. [4, 5, 6] We see that the H1-H2 loops protrude from the outer face of the molecule, allowing Nova-1 to bind the RNA with no steric interference from the rest of the protein. The KH3 domain most likely protrudes from Nova-1 in the same way.

III. RNA Binding [3]

The most important function of the Nova-1 KH domain is binding the single-stranded hairpin loop of an mRNA, preferentially those with a repeated UCAU tetrad. The crystal structure was not obtained using this base repeat, but we can theorize that similar interactions occur between RNA bases U-13, C-14, A-15, and C-16 and the H1-H2 loop (Gly-22, Lys-23, Gly-24, and Gly-25), sometimes referred to as the hydrophobic binding platform. Electrostatic interactions include hydrogen bonding in the H1-H2 loop, specifically between the nitrogen on the side chain of Lys-23 and an oxygen on U-13, the nitrogen on the backbone of Gly-22 and the oxygen of the pentose sugar ring of U-13, the nitrogen on the backbone of Gly-24 and an oxygen on the phosphate backbone of C-14, and the nitrogen on the backbone of Gly-25 and the oxygen in the pentose sugar ring of A-15. These Gly residues are highly conserved because of their size - larger residues would disrupt the chain and eliminate the functioning domain of the protein. Lys-40 can form a hydrogen bond with an oxygen on C-16, the only interaction with this base at all, supporting the argument that the specificity is not as pronounced for this base as for the preferred uracil. Ser-19, Leu-41, and Arg-54 can also form hydrogen bonds with an oxygen on U-13, a nitrogen in A-15, and a nitrogen or an oxygen in C-14, respectively.
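The hydrogen bonds listed above are easier to compare when tabulated per RNA base. The sketch below simply transcribes the contacts from the preceding paragraph into a Python list and groups them. The names `CONTACTS`, `residues_by_base`, and `backbone_or_sugar_contacts` are hypothetical, and where the text does not name the protein atom it is recorded as "unspecified" rather than guessed.

```python
from collections import defaultdict

# (protein residue, protein atom, RNA base, RNA atom), transcribed from the text.
CONTACTS = [
    ("Lys-23", "side-chain N", "U-13", "O"),
    ("Gly-22", "backbone N", "U-13", "sugar-ring O"),
    ("Gly-24", "backbone N", "C-14", "phosphate O"),
    ("Gly-25", "backbone N", "A-15", "sugar-ring O"),
    ("Lys-40", "unspecified", "C-16", "O"),
    ("Ser-19", "unspecified", "U-13", "O"),
    ("Leu-41", "unspecified", "A-15", "N"),
    ("Arg-54", "unspecified", "C-14", "N or O"),
]

def residues_by_base(contacts):
    """Group the protein residues by the RNA base they contact."""
    grouped = defaultdict(list)
    for residue, _protein_atom, base, _rna_atom in contacts:
        grouped[base].append(residue)
    return dict(grouped)

def backbone_or_sugar_contacts(contacts):
    """Contacts to the sugar ring or phosphate, which do not read the base identity."""
    return [c for c in contacts if "sugar" in c[3] or "phosphate" in c[3]]

for base, residues in residues_by_base(CONTACTS).items():
    print(base, len(residues), residues)
# U-13 3 ['Lys-23', 'Gly-22', 'Ser-19']
# C-14 2 ['Gly-24', 'Arg-54']
# A-15 2 ['Gly-25', 'Leu-41']
# C-16 1 ['Lys-40']
```

Grouped this way, the single contact to C-16 and the three contacts to U-13 match the observation in the text that specificity is weaker at the fourth position than at the uracil.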

Nonspecific hydrophobic van der Waals interactions can occur between Ile-21, the aliphatic segment of Lys-43, and the ring structures of C-14 and A-15. These interactions are not as strong as the hydrogen bonds, but they may help account for the specificity of the KH domain in binding the UCAU RNA-base tetrad.

IV. Dimerization [1]

Many proteins contain multiple KH domains - Nova-1 has three, and some include as many as fifteen! The KH domains can interact with each other not only to increase RNA-binding specificity but also to bind more than one mRNA simultaneously. The junction of the KH monomers occurs between the N-terminal S1 beta ribbons of each, lining them up to continue the antiparallel ribbon pattern. The dimer is stabilized by hydrophobic interactions (more van der Waals forces), including the Tyr-6, Phe-7, Leu-8, Lys-9, Leu-11, Pro-13, Ala-103, Val-106, Ile-108, Ile-109, Val-110, and Pro-111 residues. Hydrophobic forces also hold the dimer together in the H3 helix, evident in the Ile-62, Ala-73, Ile-76, Gly-157, Pro-159, Val-166, Ile-169, Ile-173, and Pro-177 residues. In the KH domain superfamily, a number of hydrophobic amino acid residues are highly conserved, such as the Glu-6, Val-8, and Met-10 residues. Unfortunately, these residues are not conserved in the Nova-1 KH domain. This general conservation in the superfamily supports the formation of the dimer and suggests that the dimer is potentially biologically important in evolved species.

It has been theorized that the Nova-1 KH domain can even form a tetramer [1]. This tetramer would be an asymmetric unit, and therefore noncrystallographic, but the formation is feasible. This feat would help explain why certain KH domain-containing proteins have numerous copies.

V. Implications [1]

The structure of the Nova-1 KH RNA-binding domain provided some insights into other RNA-binding proteins. Many RNA-binding proteins contain a similar alpha helix/beta sheet structure. The beta sheets have been found to be involved in binding more often in single-stranded RNA-binding proteins, such as the U1 snRNP A, the MS2 phage coat protein, and several tRNA synthetases. The alpha helices have been found to be involved in binding more often in double-stranded RNA-binding proteins, such as the HIV-1 Rev binding to the major groove of RRE RNA.

Studies of KH domains in other proteins, in C. elegans, led to the early identification of several loss-of-function mutations. Mutation of the first Gly in the Gly-X-X-Gly H1-H2 loop results in loss of function because the KH domain can no longer bind the RNA. Surprisingly, some mutations in the variable region can lead to loss of function as well, as in the KH domain of the Drosophila Bicaudal C protein; it is thought that this mutation disrupts protein folding and binding with correlated KH domains.

Since the Gly-X-X-Gly H1-H2 loop is so well conserved [1] across all KH domains, it is considered the most important part of the structure. However, most of the hydrogen bonds that form between the two molecules involve the phosphate backbone or the pentose sugars, which contribute no base specificity. What, then, accounts for the Gly-X-X-Gly loop specifically targeting the UCAU RNA-base tetrad? It has been theorized that because most proteins with a KH domain contain more than one domain linked together by another loop, the domains may work in tandem to bind specific RNA molecules with a much longer UCAU repeat in their secondary structure. Future studies may indicate the nature of specific RNA binding by multiple correlated KH domains.

VI. References

1. Lewis, Hal A., Chen, Hua, Edo, Carme, Buckanovich, Ronald J., Yang, Yolanda YL, Musunuru, Kiran, Zhong, Ru, Darnell, Robert B., and Burley, Stephen K. February 1999. Crystal structures of Nova-1 and Nova-2 K-homology RNA-binding domains. Structure. 7:191-203.

2. News release. Rockefeller University researchers identify protein that regulates RNA in nerve tissue. February 24, 2004. http://runews.rockefeller.edu/index.php?page=engine&id=360&printer=1. The Rockefeller University. Copyright 2004-2005. (Accessed 9 December, 2007)

3. Sidiqi, M., Wilce, J. A., Vivian, J. P., Porter, C. J., Barker, A., Leedman, P. J., and Wilce, M. C. J. February 2005. Structure and RNA binding of the third KH domain of poly(C)-binding protein 1. Nucleic Acids Research. 33(4):1213-1221.

4. Schwede T., Kopp J., Guex N., and Peitsch MC (2003) SWISS-MODEL: an automated protein homology-modeling server. Nucleic Acids Research. 31:3381-3385.

5. Guex, N., and Peitsch, M.C. (1997) SWISS-MODEL and the Swiss-PdbViewer: An environment for comparative protein modelling. Electrophoresis. 18:2714-2723.

6. Arnold, K., Bordoli, L., Kopp, J., and Schwede, T. (2006). The SWISS-MODEL Workspace: A web-based environment for protein structure homology modelling. Bioinformatics. 22:195-201.
