Escherichia coli Hfq

Anna Tancredi '19


I. Introduction

The prokaryotic Hfq protein was named after its discovery in Escherichia coli as an essential host factor of the RNA bacteriophage Qβ. Later studies revealed that Hfq is also an RNA-binding protein involved in both translational activation and inhibition. Hfq has also been shown to regulate translation of σS (RpoS), a sigma factor that plays an important role in the stress response. hfq knockout strains exhibit pleiotropic phenotypes, including decreased growth rate, morphological changes, and altered sensitivity to oxidants and UV light.

The Hfq protein is highly conserved across prokaryotes and shares sequence similarity with Sm proteins, which are eukaryotic proteins involved in mRNA splicing and RNA stabilization. By binding to small non-coding RNAs (sRNAs), Hfq can both inhibit and activate translation, and can both protect these sRNAs from ribonuclease cleavage and promote it. Because of these wide-ranging and seemingly opposing effects on RNA stability, Hfq has attracted much attention. The exact mechanisms by which this protein exerts such a large influence on translation are still being studied.

II. General Structure

Hfq forms a doughnut-shaped ring composed of six identical protomers. The highly conserved core comprises approximately 65 residues and consists mainly of an α-helix and β-strands. β4 of one protomer and β5 of its neighbor expose hydrogen-bonding edges that allow the monomers to interact with each other. This interaction is reinforced by hydrophobic side-chain interactions with the α-helix and neighboring strands.

Each protomer is an α-β-5 structural unit, meaning it is composed of one α-helix followed by five antiparallel β-strands. Gly 29 allows the strand to twist and curve, forming a self-closing barrel. This structure, in which the first strand is hydrogen bonded to the last, is typical of membrane-crossing proteins. Together with the amino-terminal α-helix, it is also common to the Sm and Sm-like proteins shown below.

III. LSm Proteins

Sm proteins were originally discovered through their binding to the snRNAs U1, U2, U4 and U5 of the spliceosome, and the family members share the prefix Sm in their names (SmE, SmN, etc.). Sm and Sm-like (LSm) proteins are typically defined by a three-dimensional structure in which six or seven subunits form a ring, and they are involved in mRNA processing or regulation. As the image above demonstrates, the typical Sm protein structure is an α-helix at the N-terminus and, at the C-terminus, a "β51234" structure, named after the order of the β-strands.

The Hfq core resembles the structural motifs Sm1 and Sm2 found in all Sm proteins. The Sm1 motif is 32 amino acids long and encompasses the first three β-strands. The Sm2 motif consists of β4 and β5 and is 14 amino acids long in E. coli Hfq. As mentioned above, these last two β-strands are important for protomer interaction, so the Sm2 motif may influence interaction specificity and the number of subunits involved in ring formation. Interestingly, this is where Hfq diverges most from the conserved sequences: the Sm1 motif is highly conserved, whereas the Hfq Sm2 motif is much shorter than that of most other LSm proteins, in which it can be up to 28 amino acids long.

IV. RNA Binding

The exact way in which E. coli Hfq binds RNA is still not entirely known. For many Sm proteins, RNA threads through the center of the ring, but this is not the case for Hfq, as the narrowest point of the central pore is 11 Å wide. Instead, RNA binds in circular grooves on either the proximal or distal face. The proximal face, where the N-terminal α-helices are located, preferentially binds U-rich sequences, while the distal face binds ARN motifs (where R is a purine and N is any base).
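The two sequence preferences described above can be illustrated with a short pattern search. The following is only a minimal sketch: the function name `find_tracts`, the example sequence, and both thresholds (at least two consecutive ARN triplets, at least four consecutive U's) are assumptions chosen for illustration, not values taken from the binding studies.

```python
import re

# ARN triplet: A, then a purine R (A or G), then any base N.
# Assumption: two or more consecutive triplets count as an "ARN tract".
ARN_REPEATS = re.compile(r"(?:A[AG][ACGU]){2,}")

# Assumption: a run of four or more U's counts as a "U-rich tract".
U_TRACT = re.compile(r"U{4,}")


def find_tracts(rna, pattern):
    """Return (start, end, subsequence) for each non-overlapping match.

    Input is upper-cased and T is converted to U, so DNA-style
    sequences can be scanned as well. Positions are 0-based and the
    end index is exclusive.
    """
    seq = rna.upper().replace("T", "U")
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq)]


# Invented toy sequence, not a real Hfq target.
example = "GGAACAGAAAUCCUUUUUG"

print(find_tracts(example, ARN_REPEATS))  # [(2, 11, 'AACAGAAAU')]
print(find_tracts(example, U_TRACT))      # [(13, 18, 'UUUUU')]
```

A match only shows that a sequence contains a candidate motif for one face or the other; it says nothing about whether Hfq actually binds there.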

When RNA binds the entire hexamer, the structure retains its symmetry. The electron density forms a continuous circle of U bases, as in the image below, where RNA is shown as a blue wireframe.

The ring structures of other Sm/LSm proteins and of the S. aureus Hfq present a central nucleotide-binding pocket in which RNA, specifically U-rich sequence, binds. An equivalent pocket may exist in E. coli Hfq. Tyr 55, in the β4 strand, has no hydrogen-bond partner when Hfq is not bound to RNA, so it could rotate into the central cavity and offer an alternative base-stacking binding mode similar to that found in Sm proteins with a UMP base. As mentioned above, the β4 strands of each monomer are important for subunit interaction, meaning this U-specific pocket and therefore RNA binding is essentially intra-monomeric. Pi stacking is also thought to contribute to binding, and further contacts could be formed with Gln 8, Gln 41, or Lys 56.

V. sRNA Binding: RydC-Hfq Complex

Bacterial sRNAs are RNAs of 50-500 nucleotides that form stem-loop structures and can bind mRNA to influence translation. RydC is an sRNA involved in biofilm formation in E. coli and has recently been crystallized bound to the proximal face of Hfq. This sRNA has a uridine-rich 3' end and adopts a pseudoknot structure, which allows it to bind Hfq and facilitate Rho-independent transcription termination. The in vivo stability of RydC requires Hfq, because binding to Hfq inhibits RydC degradation. The 3' end of RydC binds Hfq primarily through lateral aromatic interactions between side chains of Hfq and the bases U62, C63, and U64. Other amino acids shown to affect RydC binding are Lys56, Gln8, and His57.
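As a rough illustration of what a "uridine-rich 3' end" means in sequence terms, the fraction of U in the last few nucleotides can be computed directly. This is a sketch under stated assumptions: the function name `three_prime_u_fraction`, the default window of 8 nucleotides, and the toy sequence are all hypothetical, and the sequence is not that of RydC.

```python
def three_prime_u_fraction(rna, window=8):
    """Return the fraction of U among the last `window` nucleotides.

    If the sequence is shorter than `window`, the whole sequence is used.
    T is treated as U so DNA-style input also works.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seq = rna.upper().replace("T", "U")
    tail = seq[-window:]  # 3'-terminal stretch
    if not tail:
        raise ValueError("sequence is empty")
    return tail.count("U") / len(tail)


# Invented toy sequence, NOT the RydC sequence.
toy_srna = "GGCACGUUUUCUUU"
print(three_prime_u_fraction(toy_srna))  # 0.875 (7 of the last 8 bases are U)
```

The window length is arbitrary; a different choice would shift the value without changing the idea.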

Interestingly, RydC sometimes binds two Hfq hexamers, as pictured below. The 3' end of RydC binds to the proximal face of one Hfq in the manner explained above, and the 5' end binds at an angle to the proximal face of another Hfq hexamer. F39, R16, and N13 of the Hfq α-helices can interact with the sRNA base U9, and R17 and H71 can bind the RydC base G8.
The relative concentrations of Hfq and RydC affect whether RydC binds one or two hexamers. If there are fewer Hfq hexamers than RydC sRNAs, RydC binds only one Hfq; if Hfq is in excess, the sRNA binds two hexamers.

VI. C-Terminal Tails

There is much controversy surrounding the importance of the carboxy termini of E. coli Hfq. Only the first 74 residues of the protein are resolved in the crystal structure, after which the chain is disordered, most likely because of the unusual nature and length of these tails. In many bacterial species, these tails can be over 100 amino acids long. The current hypothesis is that these positively charged tails increase the protein's affinity for the negatively charged RNA backbone and "fish-hook" the RNA to bring it towards the protein faces. However, the evidence on this point is conflicting. One study found that tail-less E. coli Hfq could still bind sRNA efficiently, while previous studies had found the opposite. As this research continues, it is important to note that these C-terminal tails have been preserved over time, suggesting that they confer some evolutionary advantage.

VII. References

Dimastrogiovanni D, Fröhlich KS, Bandyra KJ, Bruce HA, Hohensee S, Vogel J, Luisi BF. Recognition of the small regulatory RNA RydC by the bacterial Hfq protein. eLife 2014;3:e05375

Khusial P, Plaag R, Zieve GW (September 2005). "LSm proteins form heptameric rings that bind to RNA via repeating motifs". Trends Biochem. Sci. 30 (9): 522–8.

Li X, Yang M, Ke Y, Liu M, Wang Y, Liu S, Liu B, Chen Z. 2016. “Hfq Mutation Confers Increased Cephalosporin Resistance in Klebsiella pneumoniae.”

Sauter, Claude, Basquin J, Suck D. 2003. Sm-like proteins in Eubacteria: the crystal structure of the Hfq protein from Escherichia coli. Nucleic Acids Research 31(14):4091-4098

Schulz EC, Barabas O. Structure of an Escherichia coli Hfq:RNA complex at 0.97 Å resolution. Acta Crystallographica Section F, Structural Biology Communications. 2014;70(Pt 11):1492-1497.

Vogel J, Luisi BF. 22 October 2015. Hfq and its constellation of RNA. Nat Rev Microbiol. 9(8):578-589.
