Difference between revisions of "IS Families/IS3 family"

From TnPedia
 
===Original Identification===
 
IS''3'' and another member of this family, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''], were identified genetically as DNA segments causing insertional inactivation of the ''[[wikipedia:Gal_operon|gal]]'' and ''[[wikipedia:Lac_operon|lac]]'' operons and physically by [[wikipedia:Scanning_electron_microscope|electron microscopy]]<ref><nowiki><pubmed>4567156</pubmed></nowiki></ref>, and in [[wikipedia:Fertility_factor_(bacteria)|plasmid F]] as a segment called alpha-beta<ref><nowiki><pubmed>1092667</pubmed></nowiki></ref><ref><nowiki><pubmed>1092668</pubmed></nowiki></ref>. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''] was subsequently wrongly identified as the insertion sequence flanking the [[wikipedia:Tetracycline_antibiotics|tetracycline resistance]] transposon Tn''10''<ref><nowiki><pubmed>1092669</pubmed></nowiki></ref><ref><nowiki><pubmed>383689</pubmed></nowiki></ref>. It has since been found as a component of a large number of plasmids, particularly in Gram-negative enterics.
  
 
===Presence in Compound Transposons===
 
Although IS''3'' family elements do participate in compound transposons, to our knowledge no systematic survey has been undertaken and very few IS''3''-associated compound transposons have been described to date. Those that have include: [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411''], which flanks genes for [[wikipedia:Citrate_test|citrate utilization]] in transposon Tn''3411''<ref><nowiki><pubmed>6277857</pubmed></nowiki></ref><ref><nowiki><pubmed>2832386</pubmed></nowiki></ref><ref><nowiki><pubmed>6094480</pubmed></nowiki></ref>; [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS4521 IS''4521''], which flanks a heat-stable enterotoxin gene in [[wikipedia:Enterotoxigenic_Escherichia_coli|enterotoxigenic ''Escherichia coli'']]; and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1706 IS''1706''], which flanks genes of the [[wikipedia:Clp_protease_family|Clp protease]]/[[wikipedia:Chaperone|chaperone]] family.
  
 
===Distribution===
 
This is one of the most coherent, largest, most abundant and widely distributed IS families <ref>Craig NL, Lambowitz AM, Craigie R, Gellert M, editors. Mobile DNA II. American Society of Microbiology; 2002. </ref> (see <ref><nowiki><pubmed>26350305</pubmed></nowiki></ref>). Nearly 600 distinct members of this family have been identified in more than 267 bacterial species distributed over 145 genera. However, their true distribution is clearly much wider than this.
  
For example, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] (isolated from a ''[[wikipedia:Shigella_dysenteriae|Shigella dysenteriae]]'' phage λ lysogen by spontaneous insertion into the phage cI repressor gene<ref name=":0"><nowiki><pubmed>2163395</pubmed></nowiki></ref>) is present in multiple copies in the original host strain and in type strains of other ''[[wikipedia:Shigella|Shigella]]'' species. Two vestigial copies, both interrupted by a copy of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS30 IS''30''], were also detected in the chromosome of [[wikipedia:Escherichia_coli_in_molecular_biology#K-12|''E. coli'' K12]]<ref><nowiki><pubmed>9278503</pubmed></nowiki></ref> and could form transposition intermediates when supplied with [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] transposase<ref><nowiki><pubmed>9302015</pubmed></nowiki></ref>. Entire or truncated [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] copies have also been identified in several ''[[wikipedia:Escherichia_coli|E. coli]]'' virulence plasmids (e.g. <ref><nowiki><pubmed>10496929</pubmed></nowiki></ref>), in pathogenicity islands of uropathogenic ''[[wikipedia:Escherichia_coli|E. coli]]'' (e.g. <ref><nowiki><pubmed>8751923</pubmed></nowiki></ref>), in various other clinical isolates of ''[[wikipedia:Escherichia_coli|E. coli]]'' and in a large number of well-known and less well-known enterobacteria such as ''[[wikipedia:Escherichia_fergusonii|Escherichia fergusonii]]'', ''[[wikipedia:Cronobacter|Cronobacter]]'', [[wikipedia:Dickeya|''Dickeya'']], ''[[wikipedia:Erwinia|Erwinia]]'', ''[[wikipedia:Klebsiella|Klebsiella]]'', ''[[wikipedia:Pantoea|Pantoea]]'', ''[[wikipedia:Shimwellia|Shimwellia]]'', and ''[[wikipedia:Yersinia|Yersinia]]''.

Most IS''3'' family members have been identified in bacteria, although at least one example, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISMco1 IS''Mco1''], has also been identified in the archaeon [[wikipedia:Methanosaeta_concilii|''Methanosaeta concilii'']]<ref><nowiki><pubmed>17347521</pubmed></nowiki></ref>. Since this archaeon is widespread in nature<ref><nowiki><pubmed>17320399</pubmed></nowiki></ref>, it is possible that this represents a case of recent horizontal transfer. The presence of 8 copies implies that [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISMco1 IS''Mco1''] is active in its archaeal host.
  
 
===Organization===
 
The family is quite homogeneous in organization [[:File:Fig. IS3.1.png|(Fig.IS3.1)]] in spite of its wide distribution in bacteria exhibiting a large range of G+C contents (from 70% in the [[wikipedia:Mycobacterium|Mycobacterial]] examples to 25% in those isolated from ''[[wikipedia:Mycoplasma|Mycoplasma]]'') and of the presence of members in hosts such as ''[[wikipedia:Mycoplasma|Mycoplasma]]'' with a non-universal genetic code (e.g. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1138 IS''1138'']) or in bacteria which use stop codon read-through by insertion of the unusual amino acid selenocysteine (e.g. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISDvu3 IS''Dvu3''] from ''[[wikipedia:Desulfovibrio_vulgaris|Desulfovibrio vulgaris]]''). In the case of both copies of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1138 IS''1138''], which participates in high frequency rearrangements of the ''[https://microbewiki.kenyon.edu/index.php/Mycoplasma_pulmonis Mycoplasma pulmonis]'' chromosome, the Tpase orf carries 11 '''UGA''' codons which are decoded as tryptophan<ref name=":1"><nowiki><pubmed>8096321</pubmed></nowiki></ref>.
[[Image:Fig. IS3.1.png|thumb|center|720x720px|'''Fig. IS3.1'''. '''(A)''' Genetic organization of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911'']. The 1,250-bp [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] is shown as a box. The boxes at each end represent the left (IRL) and right (IRR) terminal inverted repeats. The two open reading frames, ''orfA'' (blue) and ''orfB'' (green) are positioned in relative reading phases 0 and −1, respectively, as indicated. The indigenous promoter, pIRL, is shown. The region of overlap between ''orfA'' and ''orfB'', which includes the frameshifting signals to produce OrfAB, lies between IS''911'' coordinates 300 and 400. The precise point at which the frameshift occurs, within the last heptad of the LZ, is indicated by the vertical dotted line. '''(B)''' Structure-function map of OrfAB and OrfA. HTH, a potential helix-turn-helix motif; LZ, a leucine zipper motif involved in homo- and hetero-multimerization of OrfAB and OrfA. Programmed translational frameshifting that fuses OrfA and OrfB to generate the transposase OrfAB occurs within the fourth heptad. The LZ of OrfA and OrfAB, therefore, differ in their fourth heptad. A second region, M, necessary for multimerization of OrfAB is shown, as is the catalytic core of the enzyme which carries a third multimerization domain. OrfA translation initiates at an AUG and terminates with UAA, whereas OrfAB translation terminates within the right IR. The vertical line to the right of M shows the extent of the truncated transposase, OrfAB[1–149] described in the text. '''(C)''' Frameshifting window. The mRNA sequence around the programmed translational frameshifting window is presented. The boxed sequence GGAG is the potential ribosome-binding site located upstream of ''orfB'' whose potential translation would be initiated at the boxed AUU codon. A ribosome (not to scale) is shown covering a series of “slippery” codons (AAAAAAG). A downstream secondary structure is also shown with the UAA, OrfA translation termination codon. The ribosome-binding site, slippery codons, and secondary structure all contribute to the efficiency of the programmed −1 frameshift. The box at the foot of this figure shows how the anti-codons of two tRNALys are thought to undergo re-pairing with their codons in the AAAAAAG motif.|alt=]]
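The effect of a non-universal genetic code on a transposase reading frame, as described above for IS''1138'', can be illustrated with a minimal Python sketch. It assumes only that the host code differs from the standard one by reading UGA as tryptophan; the sequence <code>toy_orf</code> and the function name <code>translate</code> are hypothetical illustrations and are not taken from IS''1138''.

```python
from itertools import product

# Standard genetic code, amino acids listed in TCAG codon order
BASES = "TCAG"
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD_AA)}

# Mycoplasma-type code: assumed identical except that UGA (TGA in DNA) encodes Trp
MYCOPLASMA_CODE = dict(STANDARD_CODE, TGA="W")

def translate(seq, code):
    """Translate from the first base and stop at the first stop codon."""
    seq = seq.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = code[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# Hypothetical toy reading frame containing two internal UGA codons
toy_orf = "AUGAAAUGAGGCUGAUAA"
print(translate(toy_orf, STANDARD_CODE))    # truncated at the first UGA
print(translate(toy_orf, MYCOPLASMA_CODE))  # UGA read through as tryptophan
```

With the standard code, translation of the toy frame stops at the first UGA, whereas with the Mycoplasma-type code both UGA codons are read as tryptophan and translation continues to the UAA codon.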
  
Members are between 1200 and 1550 bp long with relatively well conserved inverted terminal repeats in the range of 20-40 bp. One exception previously attributed to this family, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS481 IS''481''], is 1045 bp long and has now been placed in a separate family (see "[[IS Families/IS481 family|IS''481'' family]]"). Family members generate direct repeats (DR) of 3 or 4 bp on insertion.
  
 
The majority of IR terminate with 5'-TG-----CA-3' and present an internal block of G/C residues of variable length [[:File:Fig. IS3.2.png|(Fig.IS3.2)]].  
 
[[Image:Fig. IS3.2.png|thumb|center|680x680px|'''Fig. IS3.2.'''  [http://weblogo.berkeley.edu/ WebLogo] of IS''3'' family ends. The left (IRL) and right (IRR) inverted terminal repeats of the major IS''3'' family groups as defined in ISfinder are shown in [http://weblogo.berkeley.edu/ WebLogo] format. They are defined by the direction of transcription/translation of the transposase gene. IRL, by definition, is located on the 5’ side of the transposase orf. |alt=]]
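The size range, the 5'-TG...CA-3' termini and the 3 or 4 bp direct repeats described above can be combined into a simple screening heuristic. The following Python sketch is only an illustration: the function names are hypothetical, and because only the majority of IRs terminate in this way, a negative result does not exclude membership of the family.

```python
def has_typical_is3_ends(element, min_len=1200, max_len=1550):
    """True if the top strand is 1200-1550 bp long, starts with TG and ends with CA."""
    s = element.upper()
    return min_len <= len(s) <= max_len and s.startswith("TG") and s.endswith("CA")

def flanking_direct_repeat(left_flank, right_flank, sizes=(4, 3)):
    """Return the 4 or 3 bp direct repeat bordering an insertion, or None.

    left_flank is the host sequence ending at the left IS end and
    right_flank is the host sequence starting after the right IS end.
    """
    left, right = left_flank.upper(), right_flank.upper()
    for k in sizes:
        if len(left) >= k and len(right) >= k and left[-k:] == right[:k]:
            return left[-k:]
    return None

# Hypothetical toy inputs, not real IS or host sequences
toy_element = "TG" + "A" * 1246 + "CA"
print(has_typical_is3_ends(toy_element))
print(flanking_direct_repeat("GGCATAC", "ATACCTT"))
```

The longer repeat is tested first so that a 4 bp duplication is not reported as a 3 bp one.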
IS''3''-family members generally have two consecutive and partially overlapping reading frames, ''orfA'' and ''orfB'', in relative translational reading phases 0 and -1, respectively [[:File:Fig. IS3.1.png|(Fig.IS3.1 A)]], under control of a weak promoter, pIRL, partially located in IRL ([[:File:Fig. IS3.1.png|Fig.IS3.1 A]] and [[:File:Fig. IS3.3.png|Fig.IS3.3 C]]). The 5' end of ''orfB'' overlaps the 3' end of ''orfA'' and occurs in reading phase -1 relative to ''orfA'' [[:File:Fig. IS3.1.png|(Fig.IS3.1)]].

It was demonstrated in the 1990s that several family members ([https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150'']<ref name=":2"><nowiki><pubmed>1653413</pubmed></nowiki></ref>, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3'']<ref name=":3"><nowiki><pubmed>8107082</pubmed></nowiki></ref>, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911'']<ref name=":4"><nowiki><pubmed>1660923</pubmed></nowiki></ref>, and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2'']<ref name=":5"><nowiki><pubmed>9302014</pubmed></nowiki></ref>) express two major proteins [[:File:Fig. IS3.1.png|(Fig.IS3.1 B)]]: OrfA, the product of the upstream frame, and the transposase, OrfAB, a “fusion” or “transframe” protein generated from ''orfA'' and ''orfB'' by '''P'''rogrammed -1 '''R'''ibosomal '''F'''rameshifting (PRF) (see "[[General Information/Transposase expression and activity#Programmed Translational Frameshifting|Programmed translational frameshifting]]")<ref><nowiki><pubmed>8384687</pubmed></nowiki></ref>. Many other members of this family are also organized in this way<ref name=":6"><nowiki><pubmed>21673094</pubmed></nowiki></ref><ref name=":7"><nowiki><pubmed>24875478</pubmed></nowiki></ref>. The frameshifting frequency varies from element to element: it is approximately 50% in the case of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150'']<ref name=":2" /> and only 15% for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911'']<ref name=":4" />.
[[Image:Fig. IS3.3.png|thumb|center|780x780px|'''Fig. IS3.3.''' '''(A)''' Organization of the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] inverted repeat ('''IR'''). The nucleotide sequence of '''IRL''' and '''IRR''' is boxed. Grey horizontal bars above and below indicate the internal regions protected from DNaseI digestion by binding of OrfAB[1–149], a derivative of the 382-amino-acid OrfAB lacking its catalytic domain. The dotted horizontal grey bar indicates partial protection. The dashes within the sequence indicate mismatches between the left and right ends. The −35 and −10 components of the indigenous promoter pIRL (blue boxes) and of '''pjunc''' (green boxes) are shown. The conserved '''5′''' TG tips are highlighted in red. '''(B)''' Organization of '''pjunc'''. The “junction” promoter assembled on the circularization of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] is shown as green boxes. The initiating transcript nucleotide (+1 '''pjunc'''), the indigenous pIRL (blue boxes), and the initiating transcript nucleotide (+1 pIRL) are also shown. The conserved '''5′''' TG tips are highlighted in red. '''(C)''' Secondary structure at the left [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] end. The sequence of the “top” strand of '''IRL''' is shown, together with the various transcription and translation signals. The symbols below are standard “dot-bracket” notations to indicate potential secondary structures formed with transcripts, from top to bottom: from an external promoter, from '''pjunc''', or from pIRL respectively. The brackets shown in ''italic'' simply permit the reader to identify the apical stem of the secondary structure.|alt=]]

Complex internal inverted repeat sequences ([[:File:Fig. IS3.3.png|Fig.IS3.3 C]]) (for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], located between coordinates 19 and 73) include the -35 and -10 hexamers of pIRL, the transcription start site and the [[wikipedia:Ribosome-binding_site|ribosome binding site]] for OrfA. This is thought to play a role at the mRNA level in preventing excess transposase expression resulting from external transcription. The full secondary structure would be present in transcripts initiated outside the IS, thus sequestering the translation initiation signals, but only the 3’ part would be present if transcription initiates at pIRL. In this case, the translation initiation signals would be exposed. Initial studies ([https://scholar.google.com/citations?user=dDU8ukUAAAAJ&hl=en Prère] and [https://scholar.google.com/citations?user=wAxcf14AAAAJ&hl=en Fayet], personal communication) have shown that translation from the longer transcript is very low but that deletion of its 5’ end to “liberate” the ribosome binding site ([[:File:Fig. IS3.3.png|Fig.IS3.3 C]]) indeed results in a significant increase in translation. In the related [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] element, a similar sequence appears to function as a DNA binding site for the OrfA protein, which represses promoter activity, but further studies are necessary to confirm this<ref name=":8"><nowiki><pubmed>8107136</pubmed></nowiki></ref>.
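The way in which a programmed -1 frameshift on the AAAAAAG motif yields both OrfA and the OrfAB transframe protein can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the sequence starts at the ''orfA'' initiation codon, the heptamer is positioned as A_AAA_AAG relative to the ''orfA'' frame, and the ribosome always slips back by one nucleotide after decoding the AAG codon. The function name <code>orfa_and_orfab</code> and the toy mRNA are hypothetical and do not reproduce any IS''911'' sequence; the real frameshifting efficiency (which also depends on the ribosome-binding site and downstream secondary structure) is not modelled.

```python
from itertools import product

# Standard genetic code, amino acids listed in TCAG codon order
AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product("TCAG", repeat=3), AA)}

def translate(seq):
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

def orfa_and_orfab(mrna, slippery="AAAAAAG"):
    """Return (OrfA, OrfAB) for an mRNA that starts at the orfA start codon."""
    seq = mrna.upper().replace("U", "T")
    site = slippery.upper().replace("U", "T")
    # Look for the heptamer placed as X_XXY_YYZ relative to frame 0
    pos = seq.find(site)
    while pos != -1 and (pos + 1) % 3 != 0:
        pos = seq.find(site, pos + 1)
    if pos == -1:
        raise ValueError("no in-frame slippery heptamer found")
    shift_point = pos + len(site)
    orf_a = translate(seq)  # no frameshift: frame 0 up to its stop codon
    # -1 frameshift: the last base of the heptamer is read again
    orf_ab = translate(seq[:shift_point]) + translate(seq[shift_point - 1:])
    return orf_a, orf_ab

# Hypothetical toy mRNA: AUG GCA AAA AAG UAA ... with AAAAAAG in the slippery position
toy_mrna = "AUGGCAAAAAAGUAACCUGGUGA"
print(orfa_and_orfab(toy_mrna))
```

In this toy example both products share the same N-terminal part up to the slippery codons and differ only downstream of the frameshift point, which is the relationship between OrfA and OrfAB described above.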
  
 
===Formation of a strong transposase promoter===
 
In common with many IS of other families (e.g. [[IS Families/IS21 family|IS''21'']]<ref><nowiki><pubmed>2540414</pubmed></nowiki></ref>, [[IS Families/IS30 family|IS''30'']]<ref><nowiki><pubmed>3039299</pubmed></nowiki></ref>, [[IS Families/IS110 family|IS''110'']]<ref><nowiki><pubmed>10438765</pubmed></nowiki></ref><ref name=":9"><nowiki><pubmed>11598022</pubmed></nowiki></ref>), the IS''3'' family '''IRR''' carries an outward-directed -35 promoter hexamer while '''IRL''' carries an inward-directed -10 promoter component ([[:File:Fig. IS3.3.png|Fig.IS3.3 B]]). In one of the key transposition intermediates, an excised transposon circle (see "[[General Information/Major Groups are Defined by the Type of Transposase They Use#Major DDE transposition pathways|Transposition Pathway]]"), these are assembled into a strong promoter, '''pJunc''', which serves to express high levels of transposition proteins ([[:File:Fig. IS3.3.png|Fig.IS3.3 B]]; [[:File:Fig. IS3.4.png|Fig.IS3.4]]). Transcription initiated from '''pJunc''', like impinging external transcription, would also produce an RNA which could sequester the translation initiation signals, but in a shorter and less stable stem-loop structure ([[:File:Fig. IS3.3.png|Fig.IS3.3 C]]).
[[Image:Fig. IS3.4.png|thumb|center|680x680px|'''Fig. IS3.4.''' '''Left:''' Primer extension analysis of ''lac'' transcripts. Lanes 1 and 2: two independent cultures. Lanes 3 and 4: primer extension products obtained from identical quantities of total RNA isolated from two independent cultures. The major products are indicated by unfilled arrowheads (right). The scheme at the left shows the relative position of the IRR–IRL junction.  '''Middle:''' Schematic of the different plasmid forms. Note that, to obtain results for the transposon junction, a copy was cloned into a suitable vector. '''Right:''' Colonies on MacConkey lactose plates.|alt=]]
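The assembly of a junction promoter from a -35 hexamer in IRR and a -10 hexamer in IRL can be illustrated with the following Python sketch. It is a toy model: the end sequences are invented, the consensus hexamers TTGACA and TATAAT stand in for the real (non-consensus) IS promoter elements, and the optional <code>spacer</code> between the abutted ends is a hypothetical parameter. The sketch simply reports the distance between the two hexamers once the right end is joined to the left end, which can then be compared with the 16-18 bp spacing typical of bacterial σ<sup>70</sup> promoters.

```python
def junction_promoter_spacing(right_end, left_end, spacer="",
                              minus35="TTGACA", minus10="TATAAT"):
    """Spacing (bp) between a -35 box in the right end and a -10 box in the
    left end once the ends are abutted; None if no such promoter is formed."""
    right, left = right_end.upper(), left_end.upper()
    junction = right + spacer.upper() + left
    p35 = junction.find(minus35)
    if p35 == -1:
        return None
    end35 = p35 + len(minus35)
    p10 = junction.find(minus10, end35)
    if p10 == -1:
        return None
    # The -35 box must lie in the right end and the -10 box in the left end
    if end35 > len(right) or p10 < len(right) + len(spacer):
        return None
    return p10 - end35

# Hypothetical toy ends: neither carries a complete promoter on its own
toy_irr_tail = "CCTTGACAAGT"          # ends with an outward-directed -35 box
toy_irl_head = "TGAAGTGGCACTATAATGG"  # begins the IS, carries a -10 box
print(junction_promoter_spacing(toy_irr_tail, toy_irl_head, spacer="ACG"))
```

Because each end carries only one of the two hexamers, the function returns a spacing only for the joined ends, mirroring the idea that the strong promoter exists only in the circular intermediate.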
  
 
===Regulation by Methylation?===
 
===Regulation by Methylation?===
Several members carry GATC [[wikipedia:DNA_methylation|methylation]] sites within 50bp of their ends, which have been shown in one case, IS''3'', to modulate transposition activity<ref name=":10"><pubmed>1645443</pubmed></nowiki></ref>, however, this is not a general characteristic of the family nor is it restricted to any particular subgroup.  
+
Several members carry GATC [[wikipedia:DNA_methylation|methylation]] sites within 50 bp of their ends, which have been shown in one case, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''], to modulate transposition activity<ref name=":10"><pubmed>1645443</pubmed>&lt;/nowiki&gt;</ref>. However, this is not a general characteristic of the family nor is it restricted to any particular subgroup.  
  
 
===Insertion specificity===
 
===Insertion specificity===
There appears to be little sequence specificity for insertion of members of the family. IS''2'' exhibits a preference for a region of [[wikipedia:P1_phage|bacteriophage P1]] but the basis of this preference is at present unknown<ref><nowiki><pubmed>3035338</pubmed></nowiki></ref>. Both IS''911''<ref name=":11"><pubmed>8106332</pubmed></nowiki></ref> and IS''150'' <ref>Welz C. Functionelle analyse des Bakteriellen Insertionelements IS150. PhD thesis: Fakultät für Biologie der Albert-Ludwigs-Univesität Freiburg; 1993. </ref> have been found next to sequences which resemble their IRs (see “[[IS Families/IS3 family#Targeted Insertion|Targeted Insertion]]”) and IS''1397'' is invariably located within intergenic repeated sequences in ''E. coli'' ('''B'''acterial '''I'''nterspersed '''M'''osaic '''E'''lements or BIMEs<ref><nowiki><pubmed>9055066</pubmed></nowiki></ref>.
+
There appears to be little sequence specificity for insertion of members of the family. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] exhibits a preference for a region of [[wikipedia:P1_phage|bacteriophage P1]] but the basis of this preference is at present unknown<ref><nowiki><pubmed>3035338</pubmed></nowiki></ref>. Both [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911'']<ref name=":11"><pubmed>8106332</pubmed>&lt;/nowiki&gt;</ref> and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150''] <ref>Welz C. Functionelle analyse des Bakteriellen Insertionelements IS150. PhD thesis: Fakultät für Biologie der Albert-Ludwigs-Universität Freiburg; 1993. </ref> have been found next to sequences which resemble their IRs (see “[[IS Families/IS3 family#Targeted Insertion|Targeted Insertion]]”) and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1397 IS''1397''] is invariably located within intergenic repeated sequences in ''[[wikipedia:Escherichia_coli|E. coli]]'' ('''B'''acterial '''I'''nterspersed '''M'''osaic '''E'''lements or BIMEs)<ref><nowiki><pubmed>9055066</pubmed></nowiki></ref>.
  
 
===Group II intron insertions===
 
===Group II intron insertions===
Finally, an element isolated from the [https://pubmed.ncbi.nlm.nih.gov/6363394/?dopt=Abstract ECOR collection] of ''E. coli'' and closely related to IS''3411'' carries a [[wikipedia:Group_II_intron|group II intron]]<ref><nowiki><pubmed>7994604</pubmed></nowiki></ref>. The effect of this on regulation of transposition of this element has not been investigated.  
+
Finally, an element isolated from the [https://pubmed.ncbi.nlm.nih.gov/6363394/?dopt=Abstract ECOR collection] of ''[[wikipedia:Escherichia_coli|E. coli]]'' and closely related to [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411''] carries a [[wikipedia:Group_II_intron|group II intron]]<ref><nowiki><pubmed>7994604</pubmed></nowiki></ref>. The effect of this on regulation of transposition of this element has not been investigated.  
  
 
===IS3 family subgroups===
 
===IS3 family subgroups===
The IS''3'' family is divided into five subgroups ([[General Information/What Is an IS?#Characteristics of insertion sequence families|Table Characteristics of IS families]]; [[:File:1.4.2.png|Fig.4.2]]). This is supported by deep branching in the alignment of the various OrfA and OrfB sequences<ref name=":12"><pubmed>9729608</pubmed></nowiki></ref> ([[:File:Fig. IS3.5.png|Fig.IS3.5]]). These are: the IS''2'' and IS''407'' subgroups (which appear closely related), and the IS''3'', IS''51'', and IS''150'' subgroups. Additional members of the family identified subsequently also tend to follow this pattern. One feature which lends biological credence to these subgroups is that they also clearly appear clustered (with some exceptions) in the results of the alignments with the upstream OrfA protein<ref name=":12" />. Moreover, there is some correlation between the members of each group and the number of base pairs of target DNA duplicated on insertion (DR): for those elements in the IS''2'' subgroup, insertion invariably leads to a 5 bp DR; for the IS''407'' subgroup a 4 bp DR is observed; while for the other groups a 3 bp DR is generated ([[General Information/What Is an IS?#Characteristics of insertion sequence families|Table Characteristics of IS families]]). In the latter cases some of the elements, e.g. IS''911'', have been shown to occasionally generate 4 bp repeats. This clustering is also exhibited to some extent in the nucleotide sequence of the terminal IRs ([[:File:Fig. IS3.2.png|Fig.IS3.2]]) and is particularly marked in the IS''2'', IS''51'' and IS''407'' subgroups. It can also be observed in the primary sequence details of the putative leucine zipper<ref name=":13"><pubmed>10677279</pubmed></nowiki></ref>.
+
The IS''3'' family is divided into five subgroups ([[General Information/What Is an IS?#Characteristics of insertion sequence families|Table Characteristics of IS families]]; [[:File:1.4.2.png|Fig.4.2]]). This is supported by deep branching in the alignment of the various OrfA and OrfB sequences<ref name=":12"><pubmed>9729608</pubmed>
[[Image:Fig. IS3.5.png|thumb|center|620x620px|'''Fig. IS3.5.'''  '''Relationship of OrfA and OrfB in various IS''3'' family groups.''' Dendrogram based on the alignments of the amino acid sequences of predicted OrfA proteins from 40 elements (left) and 44 predicted OrfB frames (right) (adapted from Mahillon and Chandler 1998). The different colors indicate the different IS3 family groups showing that both A and B frames are largely group-specific.|alt=]]
+
 
 +
&lt;/nowiki&gt;</ref> ([[:File:Fig. IS3.5.png|Fig.IS3.5]]). These are: the '''IS''2'' and IS''407'' subgroups''' (which appear closely related), and the '''IS''3'', IS''51'', and IS''150'' subgroups'''.  
 +
 
 +
Additional members of the family identified subsequently also tend to follow this pattern. One feature which lends biological credence to these subgroups is that they also clearly appear clustered (with some exceptions) in the results of the alignments with the upstream OrfA protein<ref name=":12" />. Moreover, there is some correlation between the members of each group and the number of base pairs of target DNA duplicated on insertion (DR): for those elements in the '''IS''2'' subgroup''', insertion invariably leads to a 5 bp DR; for the '''IS''407'' subgroup''' a 4 bp DR is observed; while for the other groups a 3 bp DR is generated ([[General Information/What Is an IS?#Characteristics of insertion sequence families|Table Characteristics of IS families]]). In the latter cases some of the elements, e.g. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], have been shown to occasionally generate 4 bp repeats. This clustering is also exhibited to some extent in the nucleotide sequence of the terminal '''IRs''' ([[:File:Fig. IS3.2.png|Fig.IS3.2]]) and is particularly marked in the '''IS''2'', IS''51'' and IS''407'' subgroups'''. It can also be observed in the primary sequence details of the putative leucine zipper<ref name=":13"><pubmed>10677279</pubmed>&lt;/nowiki&gt;</ref>.
 +
[[Image:Fig. IS3.5.png|thumb|center|680x680px|'''Fig. IS3.5.'''  '''Relationship of OrfA and OrfB in various IS''3'' family groups.''' Dendrogram based on the alignments of the amino acid sequences of predicted OrfA proteins from 40 elements (left) and 44 predicted OrfB frames (right) (adapted from Mahillon and Chandler 1998). The different colors indicate the different IS3 family groups, showing that both A and B frames are largely group-specific.|alt=]]
  
 
===Family Exceptions===
 
===Family Exceptions===
Several family members exhibit an organization which does not apparently conform to the generic IS''3'' member. In IS''120'', for example, the relationship between the reading phases of the upstream and downstream orfs appears to be +1 rather than -1 while in IS''Ng1'' and IS''Ye1'' the characteristic motifs of OrfB are distributed between reading phases. Other members, such as IS''1076'', IS''1138'', IS''1221'', and IS''1141'', exhibit only one long open reading frame. Although these may be true variants, it cannot at present be ruled out that the variations are simply due to errors in sequence determination.  
+
Several family members exhibit an organization which does not apparently conform to the generic IS''3'' member. In [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS120 IS''120''], for example, the relationship between the reading phases of the upstream and downstream orfs appears to be +1 rather than -1 while in [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISNg1 IS''Ng1''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISYe1 IS''Ye1''] the characteristic motifs of OrfB are distributed between reading phases. Other members, such as [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1076 IS''1076''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1138 IS''1138''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1221 IS''1221''], and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1141 IS''1141''], exhibit only one long open reading frame. Although these may be true variants, it cannot at present be ruled out that the variations are simply due to errors in sequence determination.  
  
 
===Mycoplasma and the non-universal genetic code===
 
===Mycoplasma and the non-universal genetic code===
Family members from ''[[wikipedia:Mycoplasma|Mycoplasma]]'' merit special attention. Not only does the host use a non-universal genetic code in which the opal termination codon TGA directs the insertion of tryptophan (see <ref><nowiki><pubmed>1579111</pubmed></nowiki></ref>, but their genomes are among the smallest bacterial genomes known and extremely rich in A+T. To date, several different IS''3'' family members have been observed in [[wikipedia:Mycoplasma|''Mycoplasma'']]. Of these, only IS''1138'' (and IS''1138b'') has been demonstrated directly to undergo autonomous transposition<ref name=":1" />. All exhibit similarly high AT levels and this unusual base composition could lead to difficulties in sequence determination. It is remarkable that typical IS''3'' family characters have been maintained in such an "extreme" genetic environment. Nine individuals are closely related and form a group of iso-elements which have been called IS''1221''. As indicated above, one of these carries a single long reading frame (representing ''orfA'' + ''orfB'') instead of two consecutive overlapping frames. The others each carry insertions or deletions which destroy either the equivalent of ''orfA'', ''orfB'', or both. Expression studies in ''E. coli'' indicate that a protein, equivalent to OrfAB, is indeed produced from the long open reading frame of IS''1221''. Interestingly, it appears that a second truncated protein, equivalent to OrfA, may be generated from the single ''orfAB'' frame by [[wikipedia:Ribosomal_frameshift|translational frameshifting]], representing an "inverted" expression pattern to the majority of the family members<ref name=":14"><pubmed>7476162</pubmed></nowiki></ref>. Although this appears not to be a general rule for IS''3'' family members originating from ''Mycoplasma'' hosts, the presence of a similar single-frame arrangement in a second member, IS''1138'', indicates that it might not be rare. 
Because of the extremely high AT content of these elements, many potential frameshift windows of the A6G(/C) or A7 type are expected to occur. The only direct experiment will, therefore, be able to determine which, if any, of these sequences are used to generate the Tpase or, conversely, an OrfA-like protein.  
+
Family members from ''[[wikipedia:Mycoplasma|Mycoplasma]]'' merit special attention. Not only do the hosts use a non-universal genetic code in which the opal termination codon TGA directs the insertion of tryptophan (see <ref><nowiki><pubmed>1579111</pubmed></nowiki></ref>), but their genomes are among the smallest bacterial genomes known and extremely rich in A+T. To date, several different IS''3'' family members have been observed in [[wikipedia:Mycoplasma|''Mycoplasma'']]. Of these, only [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1138 IS''1138''] (and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1138b IS''1138b'']) has been demonstrated directly to undergo autonomous transposition<ref name=":1" />. All exhibit similarly high AT levels and this unusual base composition could lead to difficulties in sequence determination. It is remarkable that typical IS''3'' family characters have been maintained in such an "extreme" genetic environment. Nine individuals are closely related and form a group of iso-elements which have been called [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1221 IS''1221'']. As indicated above, one of these carries a single long reading frame (representing ''orfA'' + ''orfB'') instead of two consecutive overlapping frames. The others each carry insertions or deletions which destroy either the equivalent of ''orfA'', ''orfB'', or both. Expression studies in ''[[wikipedia:Escherichia_coli|E. coli]]'' indicate that a protein, equivalent to OrfAB, is indeed produced from the long open reading frame of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1221 IS''1221'']. 
Interestingly, it appears that a second truncated protein, equivalent to OrfA, may be generated from the single ''orfAB'' frame by [[wikipedia:Ribosomal_frameshift|translational frameshifting]], representing an "inverted" expression pattern to the majority of the family members<ref name=":14"><pubmed>7476162</pubmed>
 +
 
 +
&lt;/nowiki&gt;</ref>. Although this appears not to be a general rule for IS''3'' family members originating from ''Mycoplasma'' hosts, the presence of a similar single-frame arrangement in a second member, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1138 IS''1138''], indicates that it might not be rare. Because of the extremely high AT content of these elements, many potential frameshift windows of the A6G(/C) or A7 type are expected to occur. Only direct experiments will, therefore, be able to determine which, if any, of these sequences are used to generate the Tpase or, conversely, an OrfA-like protein.  
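
The effect of the ''Mycoplasma'' genetic code on reading-frame prediction can be illustrated with a minimal sketch. It assumes only what is stated above, namely that TGA is read as tryptophan rather than as a stop codon; the function names and the test sequence are hypothetical and are not taken from any IS.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def codon_table(tga_is_trp=False):
    """Build a codon -> amino acid table; optionally read TGA as Trp (W)."""
    table = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD_AA)}
    if tga_is_trp:
        table["TGA"] = "W"
    return table


def translate(dna, tga_is_trp=False):
    """Translate frame 0 of a DNA string, stopping at the first stop codon."""
    table = codon_table(tga_is_trp)
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = table[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical sequence: ATG AAA TGA GGC TAA
toy = "ATGAAATGAGGCTAA"
standard_reading = translate(toy)                      # stops at TGA
mycoplasma_reading = translate(toy, tga_is_trp=True)   # reads through TGA
```

Under the universal code such a frame appears truncated at TGA, which is one reason why open reading frames of ''Mycoplasma'' elements should be predicted with the appropriate code before concluding that a frame is interrupted.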
  
 
===A clade with non-canonical IR===
 
===A clade with non-canonical IR===
A clade carrying non-canonical ends has recently been identified. These IS include 7 supplementary base pairs on each end flanking canonical IS''3'' ends: a conserved stretch of 5 C residues is located 5’ to the left IR and a less conserved motif (CGG) is located 3’ to the right end. When these additional bases are taken into account every member of this clade exhibits a 4 bp DR characteristic of the IS''3'' family ([[General Information/What Is an IS?#Characteristics of insertion sequence families|Table Characteristics of IS families]]) (Gourbeyre, pers. comm.). This conclusion is supported by the presence of multiple IS copies (e.g. IS''Psy31'') and also by identification of “empty sites”. This clearly requires further experimental investigation.
+
A clade carrying non-canonical ends has recently been identified. These IS include 7 supplementary base pairs on each end flanking canonical IS''3'' ends: a conserved stretch of 5 C residues is located 5’ to the left '''IR''' and a less conserved motif (CGG) is located 3’ to the right end. When these additional bases are taken into account every member of this clade exhibits a 4 bp DR characteristic of the IS''3'' family ([[General Information/What Is an IS?#Characteristics of insertion sequence families|Table Characteristics of IS families]]) (Gourbeyre, pers. comm.). This conclusion is supported by the presence of multiple IS copies (e.g. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISPsy31 IS''Psy31'']) and also by identification of “empty sites”. This clearly requires further experimental investigation.
  
 
===An additional subgroup===
 
===An additional subgroup===
Recently, an additional subgroup has been proposed which includes IS''Ppy1''<ref><nowiki><pubmed>23832000</pubmed></nowiki></ref>. However, all members belong to the IS''150'' subgroup and their Tpases are not separated by our standard multiple alignments and [https://micans.org/mcl/ MCL analysis]. Although they do exhibit some variation in the sequence of their terminal dinucleotides, similar variations are found for IS''2'' and members of other IS''3'' subgroups.
+
Recently, an additional subgroup has been proposed which includes [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISPpy1 IS''Ppy1'']<ref><nowiki><pubmed>23832000</pubmed></nowiki></ref>. However, all members belong to the IS''150'' subgroup and their Tpases are not separated by our standard multiple alignments and [https://micans.org/mcl/ MCL analysis]. Although they do exhibit some variation in the sequence of their terminal dinucleotides, similar variations are found for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] and members of other IS''3'' subgroups.
  
 
===Mechanism===
 
===Mechanism===
 
====Transposition Proteins====
 
====Transposition Proteins====
Extensive alignment studies of the predicted OrfA and OrfB amino acid sequences between themselves and with those of other transposable elements<ref name=":15"><pubmed>8302872</pubmed></nowiki></ref><ref name=":16"><pubmed>1963920</pubmed></nowiki></ref><ref name=":17"><pubmed>1647013</pubmed></nowiki></ref><ref name=":18"><pubmed>1850126</pubmed></nowiki></ref><ref name=":19"><pubmed>7934941</pubmed></nowiki></ref> provided insights into structure/function relationships of the proteins ([[:File:Fig. IS3.1.png|Fig.IS3.1 B]]).  
+
Extensive alignment studies of the predicted OrfA and OrfB amino acid sequences between themselves and with those of other transposable elements<ref name=":15"><pubmed>8302872</pubmed>
 +
 
 +
&lt;/nowiki&gt;</ref><ref name=":16"><pubmed>1963920</pubmed>
 +
 
 +
&lt;/nowiki&gt;</ref><ref name=":17"><pubmed>1647013</pubmed>
 +
 
 +
&lt;/nowiki&gt;</ref><ref name=":18"><pubmed>1850126</pubmed>&lt;/nowiki&gt;</ref><ref name=":19"><pubmed>7934941</pubmed>&lt;/nowiki&gt;</ref> provided insights into structure/function relationships of the proteins ([[:File:Fig. IS3.1.png|Fig.IS3.1 B]]).  
  
 
====OrfA====
 
====OrfA====
OrfA is small. For IS''911'' it has a predicted molecular weight of 11.5 kDa. The predicted primary amino acid sequences of most IS''3'' family members exhibit a similarly placed HTH signature (see for example <ref name=":0" /><ref><nowiki><pubmed>9435062</pubmed></nowiki></ref>) which initially suggested that they might provide sequence-specific binding to the terminal IRs of their particular IS<ref><nowiki><pubmed>2841644</pubmed></nowiki></ref> involved in sequence-specific binding of the transposase to the terminal IRs OrfAB which was subsequently confirmed experimentally<ref name=":20"><pubmed>14981152</pubmed></nowiki></ref>. They also carry a C-terminal leucine zipper (LZ) motif first identified in IS''2'', IS''150'' and IS''3'' and which appears to be conserved in the majority of known members<ref name=":21"><pubmed>9761671</pubmed></nowiki></ref> and is involved in protein multimerization<ref name=":0" /><ref name=":4" /><ref name=":13" /><ref name=":21" />.
+
OrfA is small. For [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] it has a predicted molecular weight of 11.5 kDa. The predicted primary amino acid sequences of most IS''3'' family members exhibit a similarly placed HTH signature (see for example <ref name=":0" /><ref><nowiki><pubmed>9435062</pubmed></nowiki></ref>), which initially suggested that it might be involved in sequence-specific binding of the transposase, OrfAB, to the terminal '''IRs''' of its particular IS<ref name=":40"><nowiki><pubmed>2841644</pubmed></nowiki></ref>. This was subsequently confirmed experimentally<ref name=":20"><pubmed>14981152</pubmed>&lt;/nowiki&gt;</ref>. OrfA proteins also carry a C-terminal leucine zipper (LZ) motif, first identified in IS''2'', IS''150'' and IS''3'', which appears to be conserved in the majority of known members<ref name=":21"><pubmed>9761671</pubmed>&lt;/nowiki&gt;</ref> and is involved in protein multimerization<ref name=":0" /><ref name=":4" /><ref name=":13" /><ref name=":21" />.
  
 
====OrfB====
 
====OrfB====
 
The OrfB products carry a DD(35)E catalytic motif and share additional identities with [[wikipedia:Integrase|retroviral integrases]] and various other Tpases<ref name=":4" /><ref name=":15" /><ref name=":16" /><ref name=":17" /><ref name=":18" /><ref name=":19" /><ref><nowiki><pubmed>10547692</pubmed></nowiki></ref>. These include two amino acids located 4 and 7 residues downstream from the glutamate residue.  
 
The OrfB products carry a DD(35)E catalytic motif and share additional identities with [[wikipedia:Integrase|retroviral integrases]] and various other Tpases<ref name=":4" /><ref name=":15" /><ref name=":16" /><ref name=":17" /><ref name=":18" /><ref name=":19" /><ref><nowiki><pubmed>10547692</pubmed></nowiki></ref>. These include two amino acids located 4 and 7 residues downstream from the glutamate residue.  
  
IS''911'' OrfB is 299 residues long with a predicted molecular weight of 34.6kD. Its TAA termination codon lies just within IRR and may be significant in regulation. The OrfB initiation codon is AUU and consequently initiation occurs only at low levels<ref name=":4" /><ref name=":22"><pubmed>10064703</pubmed></nowiki></ref> and is modulated by the level of initiation factor IF3<ref name=":23"><pubmed>21478364</pubmed></nowiki></ref>.
+
[https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] OrfB is 299 residues long with a predicted molecular weight of 34.6 kDa. Its TAA termination codon lies just within IRR and may be significant in regulation. The OrfB initiation codon is AUU and consequently initiation occurs only at low levels<ref name=":4" /><ref name=":22"><pubmed>10064703</pubmed>&lt;/nowiki&gt;</ref> and is modulated by the level of initiation factor IF3<ref name=":23"><pubmed>21478364</pubmed>&lt;/nowiki&gt;</ref>.
  
OrfB has been observed for: IS''3<ref name=":3" />'' ([https://scholar.google.com/citations?user=dDU8ukUAAAAJ&hl=en Prère] & [https://scholar.google.com/citations?user=wAxcf14AAAAJ&hl=en Fayet], unpublished), IS''150<ref name=":2" />'', IS''911<ref name=":4" />''<ref name=":22" /><ref name=":23" /> and IS''3411''/IS''629''<ref name=":24"><pubmed>18474594</pubmed></nowiki></ref><ref><nowiki><pubmed>16731525</pubmed></nowiki></ref> but not for IS''2''<ref name=":25"><pubmed>8824609</pubmed></nowiki></ref>. It is generally present at quite low levels although for IS''3'' approximately equal amounts of OrfB and OrfAB appear to be produced<ref name=":3" />. The IS''150'' OrfB initiation codon is out of phase with the rest of the gene and expression of full length OrfB would require a -1 frameshift after initiation.
+
OrfB has been observed for: IS''3<ref name=":3" />'' ([https://scholar.google.com/citations?user=dDU8ukUAAAAJ&hl=en Prère] & [https://scholar.google.com/citations?user=wAxcf14AAAAJ&hl=en Fayet], unpublished), [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150'']''<ref name=":2" />'', [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911'']''<ref name=":4" />''<ref name=":22" /><ref name=":23" /> and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411'']/[https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS629 IS''629'']<ref name=":24"><pubmed>18474594</pubmed>
 +
 
 +
&lt;/nowiki&gt;</ref><ref><nowiki><pubmed>16731525</pubmed></nowiki></ref> but not for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2'']<ref name=":25"><pubmed>8824609</pubmed>&lt;/nowiki&gt;</ref>. It is generally present at quite low levels although for IS''3'' approximately equal amounts of OrfB and OrfAB appear to be produced<ref name=":3" />. The [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150''] OrfB initiation codon is out of phase with the rest of the gene and expression of full length OrfB would require a -1 frameshift after initiation.
  
 
Sequence analysis suggests that OrfB may in fact be synthesized by about 34% of IS''3'' family members through translational coupling: the stop codon of ''orfA'' overlaps with a potential ''orfB'' start codon (e.g. AUGA or GUGA) in 134 out of 399 ISs analyzed<ref name=":6" />.
 
Sequence analysis suggests that OrfB may in fact be synthesized by about 34% of IS''3'' family members through translational coupling: the stop codon of ''orfA'' overlaps with a potential ''orfB'' start codon (e.g. AUGA or GUGA) in 134 out of 399 ISs analyzed<ref name=":6" />.
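
The overlap criterion described above can be sketched as follows. This is an illustration only: it checks just the two example overlaps named in the text (AUGA and GUGA), the function name and the toy sequences are hypothetical, and the final line simply reproduces the quoted proportion (134 of 399).

```python
# Overlaps named in the text: the last three bases are the orfA stop codon (UGA),
# the first three are a potential orfB start codon (AUG or GUG).
COUPLING_OVERLAPS = {"AUGA", "GUGA"}


def has_coupled_start(mrna, stop_index):
    """Return True if the orfA stop codon beginning at stop_index overlaps
    a potential orfB start codon located one base upstream."""
    if stop_index < 1:
        return False
    mrna = mrna.upper().replace("T", "U")
    return mrna[stop_index - 1:stop_index + 3] in COUPLING_OVERLAPS


# Proportion quoted in the text, as a percentage rounded to the nearest integer.
coupled_percent = round(100 * 134 / 399)
```

For a hypothetical fragment such as <code>CCAUGACC</code>, calling <code>has_coupled_start("CCAUGACC", 3)</code> reports an overlap, whereas <code>CCCUGACC</code> does not qualify.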
Line 75: Line 105:
 
It is possible that the OrfB protein itself plays no direct role in transposition chemistry but that it is simply its translation signals which are important. Their recognition by the ribosome could modulate programmed translational frameshifting required to generate a single transposase protein, OrfAB, from the two reading frames ''orfA'' and ''orfB'' (see [[General Information/Transposase expression and activity#Programmed Translational Frameshifting|"Programmed translational frameshifting"]]).  
 
It is possible that the OrfB protein itself plays no direct role in transposition chemistry but that it is simply its translation signals which are important. Their recognition by the ribosome could modulate programmed translational frameshifting required to generate a single transposase protein, OrfAB, from the two reading frames ''orfA'' and ''orfB'' (see [[General Information/Transposase expression and activity#Programmed Translational Frameshifting|"Programmed translational frameshifting"]]).  
  
The OrfB amino acid sequence shares significant similarities with [[wikipedia:Integrase|retroviral integrases]], an observation which contributed to defining the highly conserved amino acid triad DDE common to all IS''3'' family members and to many of this type of phophoryltransferase enzymes<ref name=":16" /><ref><nowiki><pubmed>1314954</pubmed></nowiki></ref>. This constitutes part of the active site (for reviews see: [48,52,64–68]).
+
The OrfB amino acid sequence shares significant similarities with [[wikipedia:Integrase|retroviral integrases]], an observation which contributed to defining the highly conserved amino acid triad DDE common to all IS''3'' family members and to many phosphoryltransferase enzymes of this type<ref name=":16" /><ref><nowiki><pubmed>1314954</pubmed></nowiki></ref>. This constitutes part of the active site (for reviews see: <ref name=":40" /><ref name=":22" />).
  
 
OrfB carries neither the HTH nor the LZ motif.
 
OrfB carries neither the HTH nor the LZ motif.
  
 
====OrfAB: a product of programmed ribosomal frameshifting (PRTF)====
 
====OrfAB: a product of programmed ribosomal frameshifting (PRTF)====
OrfAB is assembled from ''orfA'' and ''orfB'' by a programmed –1 ribosomal frameshift occurring near the 3' end of orfA (see [[General Information/Transposase expression and activity#Programmed Translational Frameshifting|"Programmed translational frameshifting"]]) first demonstrated for the related IS''150<ref name=":2" />''.
+
OrfAB is assembled from ''orfA'' and ''orfB'' by a programmed –1 ribosomal frameshift occurring near the 3' end of ''orfA'' (see [[General Information/Transposase expression and activity#Programmed Translational Frameshifting|"Programmed translational frameshifting"]]) first demonstrated for the related [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150'']''<ref name=":2" />''.
  
 
The transframe protein combines the ''orfA'' HTH motif, an LZ motif and the ''orfB'' DD(35)E catalytic domain <ref name=":21" /> ([[:File:Fig. IS3.1.png|Fig.IS3.1 B]]).
 
The transframe protein combines the ''orfA'' HTH motif, an LZ motif and the ''orfB'' DD(35)E catalytic domain <ref name=":21" /> ([[:File:Fig. IS3.1.png|Fig.IS3.1 B]]).
Line 88: Line 118:
 
Ribosome rephasing to generate OrfAB occurs on a group of "slippery" lysine codons with a frequency of about 15% (measured using systems driven by two different promoters: T7p10 and ptac). OrfA is therefore normally expressed at significantly higher levels than OrfAB. Frameshifting permits the combination of different functional protein domains ([[:File:Fig. IS3.1.png|Fig.IS3.1 C]]).  
 
Ribosome rephasing to generate OrfAB occurs on a group of "slippery" lysine codons with a frequency of about 15% (measured using systems driven by two different promoters: T7p10 and ptac). OrfA is therefore normally expressed at significantly higher levels than OrfAB. Frameshifting permits the combination of different functional protein domains ([[:File:Fig. IS3.1.png|Fig.IS3.1 C]]).  
  
IS''3''-family frameshifting is similar to that used in some retroviruses to generate the [https://www.wikigenes.org/e/gene/e/155348.html pol-gag "polyprotein"]<ref><nowiki><pubmed>7636469</pubmed></nowiki></ref> and in the ''[https://www.wikigenes.org/e/gene/e/945105.html dnaX]'' gene of ''E. coli'' to synthesize γ the sub-unit of DNA polymerase III<ref name=":26"><pubmed>1547945</pubmed></nowiki></ref>.
+
IS''3''-family frameshifting is similar to that used in some retroviruses to generate the [https://www.wikigenes.org/e/gene/e/155348.html pol-gag "polyprotein"]<ref><nowiki><pubmed>7636469</pubmed></nowiki></ref> and in the ''[https://www.wikigenes.org/e/gene/e/945105.html dnaX]'' gene of ''[[wikipedia:Escherichia_coli|E. coli]]'' to synthesize the γ subunit of DNA polymerase III<ref name=":26"><pubmed>1547945</pubmed>&lt;/nowiki&gt;</ref>.
  
The relevant IS''911'' sequences involved in frameshifting are shown in ([[:File:Fig. IS3.1.png|Fig.IS3.1 C]]). Examples of frameshifting sequences from other members of the family are shown in [[:File:Fig. IS3.6.png|Fig.IS3.6]]. The group of slippery lysine codons is A AAA AAG and is directly preceded by the AUU OrfB initiation codon. Since ''E. coli'' does not encode a tRNALys with a 3’UUC5’ anti-codon for AAG, both lysine codons are decoded by the same tRNALys with a 3’UUU5’ anticodon. Its pairing is weaker with a G at the wobble position<ref><nowiki><pubmed>3860833</pubmed></nowiki></ref> probably because modifications of U34 increase the rigidity of the anticodon<ref><nowiki><pubmed>11027137</pubmed></nowiki></ref>. The presence of an upstream RBS (GGAG sequence) and a downstream secondary structure (Y-shaped stem-loop) stimulates ribosome rephasing in the -1 direction. What drives frameshifting is probably the thermodynamically favorable re-pairing of the two tRNALys from codons AAA-AAG to codons AAA-AAA<ref name=":26" /><ref><nowiki><pubmed>12970189</pubmed></nowiki></ref>. The stimulators likely have a mechanical effect bringing back in the register the ribosome and the mRNA after tRNA slippage. Different groups of codons have been observed to allow rephasing of the ribosome<ref name=":7" /> and, although the most common motif is A6G, different members of the IS''3'' family carry a variety of these (e.g. A3G for IS''3''; see [https://www.springer.com/gp/book/9780387893815 Atkins & Gesteland, Recoding: expansion of decoding rules enriches gene expression], Springer 2010).
+
The relevant [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] sequences involved in frameshifting are shown in [[:File:Fig. IS3.1.png|Fig.IS3.1 C]]. Examples of frameshifting sequences from other members of the family are shown in [[:File:Fig. IS3.6.png|Fig.IS3.6]]. The group of slippery lysine codons is A AAA AAG and is directly preceded by the AUU OrfB initiation codon. Since ''[[wikipedia:Escherichia_coli|E. coli]]'' does not encode a tRNALys with a 3’UUC5’ anticodon for AAG, both lysine codons are decoded by the same tRNALys with a 3’UUU5’ anticodon. Pairing of this tRNA is weaker with a G at the wobble position<ref><nowiki><pubmed>3860833</pubmed></nowiki></ref>, probably because modifications of U34 increase the rigidity of the anticodon<ref><nowiki><pubmed>11027137</pubmed></nowiki></ref>. The presence of an upstream RBS (GGAG sequence) and a downstream secondary structure (Y-shaped stem-loop) stimulates ribosome rephasing in the -1 direction. What drives frameshifting is probably the thermodynamically favorable re-pairing of the two tRNALys from codons AAA-AAG to codons AAA-AAA<ref name=":26" /><ref><nowiki><pubmed>12970189</pubmed></nowiki></ref>. The stimulators likely have a mechanical effect, bringing the ribosome and the mRNA back into register after tRNA slippage. Different groups of codons have been observed to allow rephasing of the ribosome<ref name=":7" /> and, although the most common motif is A6G, different members of the IS''3'' family carry a variety of these (e.g. A3G for IS''3''; see [https://www.springer.com/gp/book/9780387893815 Atkins & Gesteland, Recoding: expansion of decoding rules enriches gene expression], Springer 2010).
[[Image:Fig. IS3.6.png|thumb|center|620x620px|'''Fig. IS3.6.'''  '''Signals and predicted branched stem-loop structures in the frameshift regions of IS''911'', IS''3'', IS''3411'', and IS''1222.''''' This figure, adapted from Sharma et al., 2014 (IS''911'', IS''3''), Mazauric et al., 2008 (IS''3411'') and Mejlhede et al., 2004 (IS''1222''), illustrates several of the different potential secondary structures located downstream of the group of “slippery” codons at which a programmed -1 translational frameshift occurs. These include stem-loop structures in all cases but may also involve the formation of a pseudoknot which enhances ribosome slippage and an upstream ribosome binding site (SD sequence).|alt=]]
+
[[Image:Fig. IS3.6.png|thumb|center|780x780px|'''Fig. IS3.6.'''  '''Signals and predicted branched stem-loop structures in the frameshift regions of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411''], and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1222 IS''1222'']''.''''' This figure, adapted from Sharma et al., 2014 ([https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3'']), Mazauric et al., 2008 ([https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411'']) and Mejlhede et al., 2004 ([https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1222 IS''1222'']), illustrates several of the different potential secondary structures located downstream of the group of “slippery” codons at which a programmed -1 translational frameshift occurs. These include stem-loop structures in all cases, but may also involve the formation of a pseudoknot which enhances ribosome slippage and an upstream ribosome binding site (SD sequence).|alt=]]
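The re-pairing described above (two tRNALys moving from codons AAA-AAG to AAA-AAA) can be illustrated with a short sketch. The sequence <code>toy</code> and both function names are hypothetical and chosen only for illustration; the sketch locates an A6G heptamer and reports the two codons occupied before and after a -1 shift. It does not model the RBS or the downstream stem-loop.

```python
import re

# A6G slippery heptamer, read as A AAA AAG in the 0 (OrfA) frame.
SLIPPERY = "AAAAAAG"


def find_slippery_sites(mrna, motif=SLIPPERY):
    """Return 0-based start indices of every (possibly overlapping) motif match."""
    return [m.start() for m in re.finditer(f"(?={motif})", mrna)]


def codons_before_and_after_slip(mrna, site):
    """For a heptamer X XXY YYZ starting at `site`, return the two codons
    paired in the 0 frame and the two codons paired after a -1 shift."""
    zero_frame = (mrna[site + 1:site + 4], mrna[site + 4:site + 7])
    minus_one = (mrna[site:site + 3], mrna[site + 3:site + 6])
    return zero_frame, minus_one


# Hypothetical toy mRNA fragment (not an actual IS sequence).
toy = "GGAGCAUUAAAAAAGGCUGCC"

for site in find_slippery_sites(toy):
    zero_frame, minus_one = codons_before_and_after_slip(toy, site)
    print(site, zero_frame, minus_one)
```

In this toy case the 0-frame codons are AAA and AAG, and after the -1 shift both tRNAs face AAA, which is the re-pairing proposed to drive the frameshift.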
  
Two similarly located partially overlapping reading frames in IS''3'', IS''150'' and IS''3411<ref name=":24" />'' also produce three proteins. The transposases, OrfAB, like that of IS''911'', are fusion products of the two orfs generated by a –1 translational frameshift.  
+
Two similarly located partially overlapping reading frames in [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411'']''<ref name=":24" />'' also produce three proteins. The transposases, OrfAB, like that of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], are fusion products of the two orfs generated by a –1 translational frameshift.  
  
For IS''3'', frameshifting is also stimulated by a presumed H-type pseudoknot structure similar to those generally involved in viral recoding<ref><nowiki><pubmed>18621088</pubmed></nowiki></ref>. In IS''3411'', -1 slippage on a U UUU motif requires a more convoluted form of pseudoknot structures formed by pairing of an apical loop and an internal loop belonging to two hairpins located 65 nucleotides apart on the mRNA<ref name=":24" />. Two similarly arranged orfs occur in IS2 and have been shown to encode OrfA and OrfAB equivalents only<ref name=":8" /><ref name=":25" />. This organization is observed in most members of the IS''3'' family but, beside the cases mentioned above, frameshifting has been analyzed experimentally only in a few other, less well-characterized, elements (including IS''51'', IS''222'', IS''600'', IS''1133'', IS''1222'').  
+
For IS''3'', frameshifting is also stimulated by a presumed H-type pseudoknot structure similar to those generally involved in viral recoding<ref><nowiki><pubmed>18621088</pubmed></nowiki></ref>. In [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411''], -1 slippage on a U UUU motif requires a more complex pseudoknot structure formed by pairing of an apical loop and an internal loop belonging to two hairpins located 65 nucleotides apart on the mRNA<ref name=":24" />. Two similarly arranged orfs occur in [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] and have been shown to encode OrfA and OrfAB equivalents only<ref name=":8" /><ref name=":25" />. This organization is observed in most members of the IS''3'' family but, besides the cases mentioned above, frameshifting has been analyzed experimentally in only a few other, less well-characterized elements (including [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS51 IS''51''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS222 IS''222''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS600 IS''600''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1133 IS''1133''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1222 IS''1222'']). 
  
The frequency of frameshifting is quite variable from element to element: reported values are 15% for IS''911'', 50% for IS''150'', 6% for IS''3'' and 2% for IS''3411<ref name=":24" />''. These values may not reflect the ''in vivo'' situation since they were not established by direct measurement of the amount of the OrfA and OrfAB proteins synthesized from an intact IS, but after modification of expression signals of the IS genes or after cloning the frameshift signals in a reporter system<ref name=":2" /><ref name=":3" /><ref name=":4" />.
+
The frequency of frameshifting is quite variable from element to element: reported values are 15% for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], 50% for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150''], 6% for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''] and 2% for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411'']''<ref name=":24" />''. These values may not reflect the ''in vivo'' situation since they were not established by direct measurement of the amount of the OrfA and OrfAB proteins synthesized from an intact IS, but after modification of expression signals of the IS genes or after cloning the frameshift signals in a reporter system<ref name=":2" /><ref name=":3" /><ref name=":4" />.
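As a rough guide to what these frequencies would imply, the sketch below converts a frameshifting frequency ''f'' into an expected OrfA:OrfAB synthesis ratio of (1 - ''f'')/''f''. This rests on simplifying assumptions that are not from the source: every ribosome reaching the slippery site yields either OrfA (no shift) or OrfAB (shift), and differences in protein stability and OrfB production are ignored. The function name is hypothetical; the frequencies are the reported values quoted above and carry the same caveats.

```python
# Reported frameshifting frequencies quoted in the text (as fractions).
REPORTED = {"IS911": 0.15, "IS150": 0.50, "IS3": 0.06, "IS3411": 0.02}


def orfa_per_orfab(f):
    """Expected number of OrfA molecules synthesized per OrfAB molecule,
    assuming each translating ribosome either shifts (OrfAB) or not (OrfA)."""
    if not 0 < f <= 1:
        raise ValueError("frameshifting frequency must be in (0, 1]")
    return (1 - f) / f


for name, f in REPORTED.items():
    print(f"{name}: about {orfa_per_orfab(f):.1f} OrfA per OrfAB")
```

Under these assumptions, a small change in a low frameshifting frequency produces a large change in the OrfA:OrfAB ratio.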
  
The level of formation of a circular IS''911'' transposition intermediate IS''911'' carrying abutted left and right ends to generate an IRR-IRL junction ([[IS Families/IS3 family#The Transposition Pathway|Transposition Pathway]]) measured by PCR indeed depends on frameshifting frequency ''in vivo''<ref name=":27"><pubmed>12586397</pubmed></nowiki></ref>. IS''911'' copies from several clinical isolates contained variations in the frameshift region exhibited various reduced levels of frameshifting. When these were introduced into the model IS''911'' they resulted in comparable reductions in a circle formation.
+
The level of formation of the circular [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] transposition intermediate, in which the abutted left and right ends form an '''IRR-IRL junction''' ([[IS Families/IS3 family#The Transposition Pathway|Transposition Pathway]]), measured by PCR, does indeed depend on frameshifting frequency ''in vivo''<ref name=":27"><pubmed>12586397</pubmed></ref>. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] copies from several clinical isolates that contained variations in the frameshift region exhibited various reduced levels of frameshifting. When these variations were introduced into the model [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], they resulted in comparable reductions in circle formation.
  
 
Frameshifting is likely modulated by the physiological state of the host cells and by the environment: for example, frameshifting decreases when the temperature is raised or when ribosome density on the mRNA is increased ([https://scholar.google.com/citations?user=wAxcf14AAAAJ&hl=en O. Fayet], pers. Comm.).  
 
  
 
====Artificial ''orfA''-''orfB'' fusion====
 
For experimental purposes, production of OrfAB without necessitating a translational frameshift is obtained by introduction of a single additional base pair within the frameshift region which artificially fuses the ''orfA'' and ''orfB'' frames and eliminates OrfA production<ref name=":4" />. It was initially difficult to construct this mutant in the context of an entire IS''911'' (i.e. with the two flanking IR) but more recently this has been accomplished using a longer artificial IS and resulted in an exceptionally high transposition frequency<ref name=":28"><pubmed>22195971</pubmed></nowiki></ref>. A similar mutant in IS''3'' results in a high frequency of adjacent deletions<ref name=":3" />.
+
For experimental purposes, OrfAB can be produced without a translational frameshift by introducing a single additional base pair within the frameshift region, which artificially fuses the ''orfA'' and ''orfB'' frames and eliminates OrfA production<ref name=":4" />. It was initially difficult to construct this mutant in the context of an entire [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] (i.e. with the two flanking IRs), but more recently this has been accomplished using a longer artificial IS and resulted in an exceptionally high transposition frequency<ref name=":28"><pubmed>22195971</pubmed></ref>. A similar mutant in [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''] results in a high frequency of adjacent deletions<ref name=":3" />.
  
 
====Structural motifs====
 
 
Although no structural information is available from crystallography, the roles of the HTH and LZ motifs have been probed ''in vivo'' and ''in vitro''.
 
  
The conserved N-terminal helix-turn-helix (HTH) motif is related to the LysR family of bacterial transcription factors and has a highly conserved tryptophan residue similar to that of certain homeodomain protein HTH motifs. This domain is important in directing transposase to bind IS''911'' IR<ref name=":20" /> and is present in most IS''3'' family members ([[:File:Fig. IS3.7A.png|Fig.IS3.7 A]]). The N-terminal helices of the related IS2 transposase are also involved in IR binding<ref name=":20" />.
+
The conserved N-terminal helix-turn-helix (HTH) motif is related to the LysR family of bacterial transcription factors and has a highly conserved tryptophan residue similar to that of certain homeodomain protein HTH motifs. This domain is important in directing transposase to bind [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] '''IR'''<ref name=":20" /> and is present in most IS''3'' family members ([[:File:Fig. IS3.7A.png|Fig.IS3.7 A]]). The N-terminal helices of the related [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] transposase are also involved in '''IR''' binding<ref name=":20" />.
[[Image:Fig. IS3.7A.png|thumb|center|500x500px|'''Fig. IS3.7A.'''  Sequence alignments of the HTH motif.  '''Top.'''  Alignment of the predicted HTH motif of the transposase of the five defining members of subgroups within the IS3 family with that of IS911.  Identical or similar residues are boxed; bold lower case characters represent residues that fit the consensus. '''Bottom'''. An expanded view of the IS911 H-T-H motif with (below) mutated resides used in defining DNA binding functions.|alt=]]
+
[[Image:Fig. IS3.7A.png|thumb|center|640x640px|'''Fig. IS3.7A.'''  Sequence alignments of the HTH motif.  '''Top.'''  Alignment of the predicted HTH motif of the transposase of the five defining members of subgroups within the IS''3'' family with that of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''].  Identical or similar residues are boxed; bold lower case characters represent residues that fit the consensus. '''Bottom'''. An expanded view of the IS''911'' HTH motif with (below) mutated residues used in defining DNA binding functions.|alt=]]
  
Many members carry a putative leucine zipper located at the end of OrfA (sometimes extending into the OrfB region of the OrfAB protein) (see <ref name=":14" /> <ref><nowiki><pubmed>8520113</pubmed></nowiki></ref><ref><nowiki><pubmed>7496528</pubmed></nowiki></ref>). Studies with IS''911'' and IS''2'' indicate that this is a multimerization domain of the proteins<ref name=":13" /><ref name=":21" /><ref><nowiki><pubmed>9335268</pubmed></nowiki></ref>. The LZ motif of IS''911'' is composed of four heptameric units ([[:File:Fig. IS3.1.png|Fig.IS3.1 B]]) with a predicted coiled coil structure including a potential buried inter-subunit hydrogen bond across the dimer interface ([[:File:Fig. IS3.7B.png|Fig.IS3.7 B]]), to maintain the zipper in a dimeric state, and correctly placed residues with opposite charges potentially able to form characteristic inter-subunit salt-bridges to stabilize the dimeric structure<ref name=":21" />. Leucine zipper motifs are found in most IS''3'' family members ([[:File:Fig. IS3.7C.png|Fig.IS3.7 C]]).
+
Many members carry a putative leucine zipper located at the end of OrfA (sometimes extending into the OrfB region of the OrfAB protein) (see <ref name=":14" /> <ref><nowiki><pubmed>8520113</pubmed></nowiki></ref><ref><nowiki><pubmed>7496528</pubmed></nowiki></ref>). Studies with [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] indicate that this is a multimerization domain of the proteins<ref name=":13" /><ref name=":21" /><ref><nowiki><pubmed>9335268</pubmed></nowiki></ref>. The LZ motif of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] is composed of four heptameric units ([[:File:Fig. IS3.1.png|Fig.IS3.1 B]]) with a predicted coiled coil structure including a potential buried inter-subunit hydrogen bond across the dimer interface ([[:File:Fig. IS3.7B.png|Fig.IS3.7 B]]), to maintain the zipper in a dimeric state, and correctly placed residues with opposite charges potentially able to form characteristic inter-subunit salt-bridges to stabilize the dimeric structure<ref name=":21" />. Leucine zipper motifs are found in most IS''3'' family members ([[:File:Fig. IS3.7C.png|Fig.IS3.7 C]]).
[[Image:Fig. IS3.7B.png|thumb|center|620x620px|'''Fig. IS3.7B.''' '''A)''' OrfAB is shown at the top. The relative positions of the A and B domains are indicated together with those of the helix-turn-helix (HTH), leucine zipper (LZ), and DD(35)E motifs. M is a second region necessary for correct multimerization. The numbers below indicate the positions in amino acid residues. The single amino acid sequence below shows the LZ motif with the four-component heptad repeats indicated below and the leucine repeat highlighted. Repeating positions are indicated by the letters a to g. The changes in LZ sequence resulting from frameshifting between OrfA and OrfAB. '''B)''' A helical wheel diagram showing a head-to-head homodimer conformation to portray the predicted hydrophobic core (positions a and d) and electrostatic interactions (positions e and g). Arrows of decreasing size and intensity are directed towards the carboxy-terminal end.|alt=]]
+
[[Image:Fig. IS3.7B.png|thumb|center|780x780px|'''Fig. IS3.7B.''' '''A)''' OrfAB is shown at the top. The relative positions of the A and B domains are indicated together with those of the helix-turn-helix (HTH), leucine zipper (LZ), and DD(35)E motifs. M is a second region necessary for correct multimerization. The numbers below indicate the positions in amino acid residues. The single amino acid sequence below shows the LZ motif with the four-component heptad repeats indicated below and the leucine repeat highlighted. Repeating positions are indicated by the letters a to g. The changes in LZ sequence resulting from frameshifting between OrfA and OrfAB. '''B)''' A helical wheel diagram showing a head-to-head homodimer conformation to portray the predicted hydrophobic core (positions a and d) and electrostatic interactions (positions e and g). Arrows of decreasing size and intensity are directed towards the carboxy-terminal end.|alt=]]
[[Image:Fig. IS3.7C.png|thumb|center|500x500px|'''Fig. IS3.7C.'''  '''Conservation of the leucine zipper motif throughout the different IS''3'' family subgroups.''' Alignment of predicted coiled-coils in the OrfA proteins of members of the five IS''3'' families. Leucine residues are highlighted in red and other significant residues in blue. Adapted from Haren et al., 2000.|alt=]]
+
[[Image:Fig. IS3.7C.png|thumb|center|640x640px|'''Fig. IS3.7C.'''  '''Conservation of the leucine zipper motif throughout the different IS''3'' family subgroups.''' Alignment of predicted coiled-coils in the OrfA proteins of members of the five IS''3'' family subgroups. Leucine residues are highlighted in red and other significant residues in blue. Adapted from Haren et al., 2000.|alt=]]
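The heptad register used to describe the leucine zipper (positions a to g, with a and d forming the hydrophobic core and e and g carrying the charged residues) can be made concrete with a small sketch. The 28-residue peptide <code>toy</code> is invented for illustration and is not the IS''911'' LZ sequence; the function names and the assumption that the peptide starts at position a are likewise hypothetical.

```python
HEPTAD = "abcdefg"


def heptad_register(peptide, first_position="a"):
    """Pair each residue with its heptad position, given the position
    occupied by the first residue of the peptide."""
    start = HEPTAD.index(first_position)
    return [(res, HEPTAD[(start + i) % 7]) for i, res in enumerate(peptide)]


def residues_at(peptide, positions, first_position="a"):
    """Collect, in sequence order, the residues found at the given positions."""
    return "".join(
        res
        for res, pos in heptad_register(peptide, first_position)
        if pos in positions
    )


# Invented four-heptad peptide with leucine at every d position.
toy = "IEALKQEVAKLEKEIQSLRDKVEELKRR"

print("a positions:", residues_at(toy, "a"))
print("d positions:", residues_at(toy, "d"))
print("e/g positions:", residues_at(toy, "eg"))
```

A sequence with four leucines spaced seven residues apart at the d positions corresponds to the four heptameric units described for the IS''911'' LZ; a change of reading frame within the last heptad would alter only the residues reported for that unit.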
  
 
OrfAB and OrfA form both homomultimers and mixed OrfAB-OrfA multimers<ref name=":13" /><ref name=":21" />.  
 
  
Mutation of specific critical residues in the OrfAB LZ reduces the level of transposition intermediates ''in vivo'' and ''in vitro'' <ref><nowiki><pubmed>9761671</pubmed> </ref> ([[IS Families/IS3 family#The Transposition Pathway|Transposition Cycle]]) and reduced or prevented multimer (dimer) formation. OrfAB and OrfA share three of their four heptads ([[:File:Fig. IS3.7B.png|Fig.IS3.7 B]]). The last of each differs in sequence due to the translational frameshift which occurs within the heptad in the expression of OrfAB. This presumably results in different strengths of monomer-monomer interactions in the case of homo- and hetero-multimers and this may be involved in the regulation of transposition. A poorly defined region, M, located between residues 109 and 135 ([[:File:Fig. IS3.1.png|Fig.IS3.1 B]]) and components in the catalytic domain of OrfAB are also involved in its multimerization.  
+
Mutation of specific critical residues in the OrfAB LZ reduces the level of transposition intermediates ''in vivo'' and ''in vitro''<ref><nowiki><pubmed>9761671</pubmed></nowiki></ref> ([[IS Families/IS3 family#The Transposition Pathway|Transposition Cycle]]) and reduces or prevents multimer (dimer) formation. OrfAB and OrfA share three of their four heptads ([[:File:Fig. IS3.7B.png|Fig.IS3.7 B]]). The last heptad of each differs in sequence because the translational frameshift that produces OrfAB occurs within it. This presumably results in different strengths of monomer-monomer interactions in homo- and hetero-multimers, which may be involved in the regulation of transposition. A poorly defined region, '''M''', located between residues 109 and 135 ([[:File:Fig. IS3.1.png|Fig.IS3.1 B]]), and components in the catalytic domain of OrfAB are also involved in its multimerization. 
  
 
====Co-translational DNA binding====
 
IS''911'' OrfAB has a strong cis preference ''in vivo <ref name=":28" />''. It has about a 200 fold higher activity on the IS copy from which it is expressed (in cis) than in trans. This prevents activation of transposition of one IS copy by OrfAB expressed from a second copy in the same cell. The strength of the cis effect depends on the distance of the transposase gene from the IS ends. Also, modification of the translational frameshifting pause signal has a strong influence on cis preference presumably by delaying translation and folding of the C-ter domain increasing the chance that the folded N-ter domain will recognize and bind its target IR.  
+
[https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] OrfAB has a strong cis preference ''in vivo''<ref name=":28" />. Its activity on the IS copy from which it is expressed (in cis) is about 200-fold higher than in trans. This prevents activation of transposition of one IS copy by OrfAB expressed from a second copy in the same cell. The strength of the cis effect depends on the distance of the transposase gene from the IS ends. Modification of the translational frameshifting pause signal also has a strong influence on cis preference, presumably by delaying translation and folding of the C-terminal domain and thereby increasing the chance that the folded N-terminal domain will recognize and bind its target IR. 
  
''In vitro'' analyses using ribosome display with a coupled ''E.coli''-derived transcription-translation system coupled with size exclusion chromatography<ref name=":28" /> demonstrated that an added IR bound nascent OrfAB derivatives while they are still attached to the ribosome. Ternary complexes containing mRNA, ribosome, and a nascent peptide specifically bound added IR copies if only the N-ter 149 amino acids extended from the ribosome whereas a full-length Tpase exiting the ribosome did not.  
+
''In vitro'' analyses using ribosome display with a coupled ''[[wikipedia:Escherichia_coli|E. coli]]''-derived transcription-translation system, combined with size exclusion chromatography<ref name=":28" />, demonstrated that an added IR bound nascent OrfAB derivatives while they were still attached to the ribosome. Ternary complexes containing mRNA, ribosome, and a nascent peptide specifically bound added IR copies when only the N-terminal 149 amino acids extended from the ribosome, whereas complexes with a full-length Tpase exiting the ribosome did not. 
  
 
Direct evidence of coupled translational binding ([[:File:Fig. IS3.8.png|Fig.IS3.8]]) was obtained using a staged coupled transcription/translation reaction: nascent OrfAB bound the IR before its synthesis was complete but not after. Thus OrfAB can efficiently bind the IR only prior to its complete translation.  
 
[[Image:Fig. IS3.8.png|thumb|center|500x500px|'''Fig. IS3.8.''' This schematic, not to scale, shows the insertion sequence with its left (IRL)and right (IRR) ends in green. RNA polymerase, RNAP, is shown in pale green in the process of transcribing from the promoter pIRL. The mRNA is shown in dark green with a ribosome (blue)paused at the frameshift secondary structure. The nascent OrfAB peptide (brown) is shown binding to IRL while undergoing translation. Above is shown the full-length OrfAB in a folded configuration proposed to prevent its binding to the IR as a completed protein.|alt=]]
+
[[Image:Fig. IS3.8.png|thumb|center|640x640px|'''Fig. IS3.8.''' This schematic, not to scale, shows the insertion sequence with its left ('''IRL''') and right ('''IRR''') ends in green. RNA polymerase (RNAP) is shown in pale green in the process of transcribing from the promoter pIRL. The mRNA is shown in dark green with a ribosome (blue) paused at the frameshift secondary structure. The nascent OrfAB peptide (brown) is shown binding to IRL while undergoing translation. Above is shown the full-length OrfAB in a folded configuration, proposed to prevent its binding to the IR as a completed protein.|alt=]]
  
 
====Co-translational multimerisation====
 
An intriguing question arising directly from these results is how OrfAB multimerizes as is found in the transpososome to bind both ends of the IS. Stable formation of the important synaptic complex containing both IS ends and the transposase requires a dimeric OrfAB (see "[[IS Families/IS3 family#The IS911 transpososome|The IS''911'' transpososome]]" below). It is therefore possible that dimerization is in some way directly associated with translation. Indeed, using ''luxA'' and ''luxB'' as a model system, it been shown that ''luxA''/''B'' subunit assembly initiates cotranslationally on nascent LuxB ''in vivo''. Protein assembly appears to be directly coupled to translation and involves “spatially confined, actively chaperoned cotranslational subunit interactions”<ref><nowiki><pubmed>26405228</pubmed></nowiki></ref>.
+
An intriguing question arising directly from these results is how OrfAB multimerizes, as found in the transpososome, to bind both ends of the IS. Stable formation of the important synaptic complex containing both IS ends and the transposase requires a dimeric OrfAB (see "[[IS Families/IS3 family#The IS911 transpososome|The IS''911'' transpososome]]" below). It is therefore possible that dimerization is in some way directly associated with translation. Indeed, using ''[https://www.uniprot.org/uniprot/Q91UU4 luxA]'' and ''[https://www.uniprot.org/uniprot/Q56822 luxB]'' as a model system, it has been shown that ''luxA''/''B'' subunit assembly initiates cotranslationally on nascent [https://www.uniprot.org/uniprot/Q56822 LuxB] ''in vivo''. Protein assembly appears to be directly coupled to translation and involves “spatially confined, actively chaperoned cotranslational subunit interactions”<ref><nowiki><pubmed>26405228</pubmed></nowiki></ref>.
  
 
====The IS''911'' transpososome====
 
 
A crucial checkpoint in transposition is the assembly of the 'transpososome'. This step is a general prerequisite for initiating DNA cleavage and the subsequent chemical steps in transposition for most elements that use a DNA (rather than RNA) transposition intermediate. In this protein-DNA complex, both ends of the transposon are bridged by the transposase before it catalyzes the DNA strand cleavages and strand transfers necessary for transposon mobility<ref><nowiki><pubmed>21439812</pubmed></nowiki></ref><ref><nowiki><pubmed>23217365</pubmed></nowiki></ref><ref><nowiki><pubmed>16181782</pubmed></nowiki></ref>. The transpososome adopts very precise architectures to accomplish these steps, and undergoes defined changes throughout the transposition process.
 
  
The overall IS''911'' transposition pathway is a two-step process, involving replicative excision followed by insertion ([[:File:Fig. IS3.9A.png|Fig.IS3.9 A]] and [[:File:Fig. IS3.9B.png|9B]]). This implies consecutive assembly of two types of transpososome: one implicated in IS excision (synaptic complex A; SCA) and includes both IS ends while the other (synaptic complex B; SCB) involves the circle junction with its abutted IRs to ensure its integration into the target DNA.
+
The overall [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] transposition pathway is a two-step process, involving replicative excision followed by insertion ([[:File:Fig. IS3.9A.png|Fig.IS3.9 A]] and [[:File:Fig. IS3.9B.png|9B]]). This implies consecutive assembly of two types of transpososome: one ('''synaptic complex A'''; SCA) is implicated in IS excision and includes both IS ends, while the other ('''synaptic complex B'''; SCB) involves the circle junction with its abutted IRs and ensures its integration into the target DNA.
[[Image:Fig. IS3.9A.png|thumb|center|680x680px|'''Fig. IS3.9A.''' IS''911'' is shown in green, the flanking donor DNA in black, and the target DNA in blue. Transposon ends are shown as green filled circles. The small arrows shown in Figure 4 have been omitted for brevity. '''(A)''' Donor plasmid carrying the insertion sequence (IS). '''(B)''' Formation of the first synaptic complex SCA and cleavage of the left or right inverted repeat (IR) and attack of the other end. '''(C)''' Formation of a single-strand bridge to create a figure-eight molecule if the donor is a plasmid as shown here. '''(D)''' The products of IS-specific replication: the double-strand circular IS transposition intermediate and the regenerated transposon donor plasmid. The replicated strand is shown as a green dotted line. '''(E)''' Formation of the second synaptic complex SCB and engagement of the target DNA (blue). '''(F)''' Cleavage of the IS circle and integration. '''(G)''' The newly integrated IS.|alt=]]
+
[[Image:Fig. IS3.9A.png|thumb|center|780x780px|'''Fig. IS3.9A.''' [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] is shown in green, the flanking donor DNA in black, and the target DNA in blue. Transposon ends are shown as green filled circles. The small arrows shown in Figure 4 have been omitted for brevity. '''(A)''' Donor plasmid carrying the insertion sequence (IS). '''(B)''' Formation of the first synaptic complex SCA and cleavage of the left or right inverted repeat (IR) and attack of the other end. '''(C)''' Formation of a single-strand bridge to create a figure-eight molecule if the donor is a plasmid, as shown here. '''(D)''' The products of IS-specific replication: the double-strand circular IS transposition intermediate and the regenerated transposon donor plasmid. The replicated strand is shown as a green dotted line. '''(E)''' Formation of the second synaptic complex SCB and engagement of the target DNA (blue). '''(F)''' Cleavage of the IS circle and integration. '''(G)''' The newly integrated IS.|alt=]]
[[Image:Fig. IS3.9B.png|thumb|center|500x500px|'''Fig. IS3.9B.''' '''Top.''' cartoon of the IS''911'' figure eight (left) and IS circle (right). '''Bottom'''. Electron microscopy of figure eight (left) and IS circle (right). DNA has been coated with RecA protein to highlight double and single-stranded DNA occurs in the "crossover" region of the figure eight molecule on the left. Electron microscopy by [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4951002/ Edouard Boy de la Tour and Lucian Caro].|alt=]]
+
[[Image:Fig. IS3.9B.png|thumb|center|500x500px|'''Fig. IS3.9B.''' '''Top.''' Cartoon of the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name= IS''911''] figure eight (left) and IS circle (right). '''Bottom.''' Electron microscopy of figure eight (left) and IS circle (right). DNA has been coated with RecA protein to highlight double- and single-stranded DNA; single-stranded DNA occurs in the "crossover" region of the figure-eight molecule on the left. Electron microscopy by [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4951002/ Edouard Boy de la Tour and Lucian Caro].|alt=]]
  
 
====Excision synaptic complex SCA.====
 
 
Using a band shift assay and IR of different lengths (the so-called “long-short” experiment) it was shown that the truncated OrfAB [1-149] forms a complex with two IR copies, the paired-end complex (PEC)<ref name=":13" /> equivalent to the SCA. An intact OrfAB [1-149] LZ is necessary for correct PEC/SCA formation<ref name=":13" /><ref name=":21" />. At higher OrfAB [1-149] concentrations a probable single end complex (SEC) composed of one IR and OrfAB [1-149] appeared. Addition of OrfA disturbed both PEC/SCA and SEC and generated a fast migrating species whose composition remains to be determined but does not appear to contain OrfA itself <ref name=":13" />.
 
  
DNaseI and Copper [[wikipedia:Phenanthroline|phenanthroline]] footprinting revealed that OrfAB [1-149] protects a sub-terminal (internal) IR region including two conserved sequence blocks in the left (IRL) and right (IRR) ends ([[:File:Fig. IS3.1.png|Fig.IS3.1 A]]). DNA binding assays in vitro and measurement of in vivo recombination activity of sequential IR deletion derivatives suggested a model in which the N-terminal region of OrfAB binds the conserved boxes in a sequence-specific manner and anchors the two IRs into the SCA. The external region of the inverted repeat was proposed to contact the C-terminal transposase domain carrying the catalytic site<ref><nowiki><pubmed>11352577</pubmed></nowiki></ref>.
+
DNaseI and Copper [[wikipedia:Phenanthroline|phenanthroline]] footprinting revealed that OrfAB [1-149] protects a sub-terminal (internal) IR region including two conserved sequence blocks in the left ('''IRL''') and right ('''IRR''') ends ([[:File:Fig. IS3.1.png|Fig.IS3.1 A]]). DNA binding assays ''in vitro'' and measurement of ''in vivo'' recombination activity of sequential IR deletion derivatives suggested a model in which the N-terminal region of OrfAB binds the conserved boxes in a sequence-specific manner and anchors the two IRs into the SCA. The external region of the inverted repeat was proposed to contact the C-terminal transposase domain carrying the catalytic site<ref><nowiki><pubmed>11352577</pubmed></nowiki></ref>.
 +
 
 +
SCA is composed of a dimer of transposase bridging to two IR<ref name=":29"><pubmed>20553579</pubmed>
  
SCA is composed of a dimer of transposase bridging to two IR<ref name=":29"><pubmed>20553579</pubmed></nowiki></ref>, as judged by the use of a tagged and untagged truncated transposase derivative, OrfAB[1-149], and also of IR of different lengths. OrfAB[1-149] assembles two IRR copies in a parallel orientation ([[:File:Fig. IS3.9A.png|Fig.IS3.4]])<ref name=":29" /> as studied at the single molecule level by [[wikipedia:Atomic_force_microscopy|Atomic Force Microscopy]] (AFM) using asymmetric IRR-carrying DNA fragments.
+
&lt;/nowiki&gt;</ref>, as judged by the use of a tagged and untagged truncated transposase derivative, OrfAB[1-149], and also of IR of different lengths. OrfAB[1-149] assembles two IRR copies in a parallel orientation ([[:File:Fig. IS3.9A.png|Fig.IS3.4]])<ref name=":29" /> as studied at the single molecule level by [[wikipedia:Atomic_force_microscopy|Atomic Force Microscopy]] (AFM) using asymmetric IRR-carrying DNA fragments.
  
SCA assembly was also studied using a second single-molecule approach: [[wikipedia:Tethered_particle_motion|tethered particle motion]] (TPM) ([[:File:Fig. IS3.10.png|Fig.IS3.10]])<ref><nowiki><pubmed>15155821</pubmed></nowiki></ref> in which a DNA molecule is tethered to a glass support and its effective length is measured by observing the Brownian motion of a bead attached to its free end ([[:File:Fig. IS3.10.png|Fig.IS3.10]] left). OrfAB[1-149] binding to a single IR provoked a small shortening of the DNA, consistent with a DNA bend introduced by protein binding to the IR and was confirmed using EMSA. When two ends were present on the tethered DNA in their natural, inverted, configuration, OrfAB[149] not only provoked the short reduction in length but also generated species with greatly reduced effective length ([[:File:Fig. IS3.10.png|Fig.IS3.10]] middle and top right) consistent with DNA looping between the ends and thus SCA formation. SCA is very stable and kinetic analysis in real-time suggested that passage from the bound unlooped to the looped state could involve another unlooped species of intermediate length in which OrfAB[149] is bound to both IRs. DNA carrying directly repeated IR also gave rise to the looped species but the level of the intermediate species was significantly enhanced ([[:File:Fig. IS3.10.png|Fig.IS3.10]] middle and bottom right). Its accumulation could reflect a less favorable SCA formation with directly repeated IR copies than with inverted IR. This is compatible with a model in which OrfAB binds separately to and bends each IR and protein-protein interactions then lead to SCA formation ([[:File:Fig. IS3.11.png|Fig.IS3.11 A]])<ref><nowiki><pubmed>16923775</pubmed></nowiki></ref>. Cleavage and strand transfer would then give rise to a species in which both IS ends are joined by a single strand bridge (or figure-eight on a circular plasmid ([[:File:Fig. 
IS3.9B.png|Fig.IS3.9 C]]) (see "[[IS Families/IS3 family#The Transposition Pathway|The Transposition Pathway]]").
+
SCA assembly was also studied using a second single-molecule approach: [[wikipedia:Tethered_particle_motion|tethered particle motion]] (TPM) ([[:File:Fig. IS3.10.png|Fig.IS3.10]])<ref><nowiki><pubmed>15155821</pubmed></nowiki></ref> in which a DNA molecule is tethered to a glass support and its effective length is measured by observing the Brownian motion of a bead attached to its free end ([[:File:Fig. IS3.10.png|Fig.IS3.10]] left). OrfAB[1-149] binding to a single IR provoked a small shortening of the DNA, consistent with a DNA bend introduced by protein binding to the IR; this was confirmed using EMSA. When two ends were present on the tethered DNA in their natural, inverted, configuration, OrfAB[1-149] not only provoked the short reduction in length but also generated species with greatly reduced effective length ([[:File:Fig. IS3.10.png|Fig.IS3.10]] middle and top right) consistent with DNA looping between the ends and thus SCA formation. SCA is very stable and kinetic analysis in real-time suggested that passage from the bound unlooped to the looped state could involve another unlooped species of intermediate length in which OrfAB[1-149] is bound to both '''IRs'''. DNA carrying directly repeated IR also gave rise to the looped species but the level of the intermediate species was significantly enhanced ([[:File:Fig. IS3.10.png|Fig.IS3.10]] middle and bottom right). Its accumulation could reflect a less favorable SCA formation with directly repeated IR copies than with inverted '''IR'''. This is compatible with a model in which OrfAB binds separately to and bends each '''IR''' and protein-protein interactions then lead to SCA formation ([[:File:Fig. IS3.11.png|Fig.IS3.11 A]])<ref><nowiki><pubmed>16923775</pubmed></nowiki></ref>. Cleavage and strand transfer would then give rise to a species in which both IS ends are joined by a single strand bridge, or a figure-eight on a circular plasmid ([[:File:Fig. IS3.9B.png|Fig.IS3.9 C]]) (see "[[IS Families/IS3 family#The Transposition Pathway|The Transposition Pathway]]").
[[Image:Fig. IS3.10.png|thumb|center|620x620px|'''Fig. IS3.10.''' IR pairing by Tethered Particle Motion. he figure is adapted from Pouget et al., 2006|alt=]]
+
[[Image:Fig. IS3.10.png|thumb|center|690x690px|'''Fig. IS3.10.''' IR pairing by Tethered Particle Motion. The figure is adapted from Pouget et al., 2006|alt=]]
  
 
====Insertion synaptic complex SCB====
 
SCB has not been characterized in such a precise way as SCA. SCB is devoted to the insertion step of the transposition process. Two types of insertion, IR-targeted and non-targeted, have been observed ([[:File:Fig. IS3.11.png|Fig.IS3.11 B]]). It has been proposed that two different protein-DNA complexes are assembled during the two types of insertion reaction: SCBt and SCBnt (for targeted and non-targeted synaptic complex respectively<ref name=":30"><pubmed>17367389</pubmed></nowiki></ref>. Nothing is known about the stoichiometry and the geometry of these complexes but, based on protein and DNA requirements for protein-DNA complex formation, as judged by band shift, and for transposition products, as judged by in vitro and in vivo transposition assays, it has been proposed that SCBt is composed of a transposase dimer bridging a DNA molecule carrying an IR and a DNA molecule carrying an IRR-IRR junction (IS''911'' circle), the product of the replicative IS''911'' excision. This IR targeted insertion explains how the original isolate of IS''911'' might have occurred next to a sequence which strongly resembles an IR<ref name=":0" /> and can also explain one ended insertion<ref name=":11" />. In this regard, IRR shows a somewhat higher affinity than IRL. Note that if one of the two IR carried by the circle is omitted, SCBt resembles SCA ([[:File:Fig. IS3.11.png|Fig.IS3.11]]).  
+
SCB has not been characterized as precisely as SCA. SCB is devoted to the insertion step of the transposition process. Two types of insertion, IR-targeted and non-targeted, have been observed ([[:File:Fig. IS3.11.png|Fig.IS3.11 B]]). It has been proposed that two different protein-DNA complexes are assembled during the two types of insertion reaction: SCBt and SCBnt (for targeted and non-targeted synaptic complex respectively)<ref name=":30"><pubmed>17367389</pubmed>&lt;/nowiki&gt;</ref>. Nothing is known about the stoichiometry and the geometry of these complexes but, based on protein and DNA requirements for protein-DNA complex formation, as judged by band shift, and for transposition products, as judged by ''in vitro'' and ''in vivo'' transposition assays, it has been proposed that SCBt is composed of a transposase dimer bridging a DNA molecule carrying an IR and a DNA molecule carrying an IRR-IRR junction ([https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] circle), the product of the replicative [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] excision. This IR-targeted insertion explains how the original isolate of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] might have occurred next to a sequence which strongly resembles an IR<ref name=":0" /> and can also explain one-ended insertion<ref name=":11" />. In this regard, IRR shows a somewhat higher affinity than IRL. Note that if one of the two IR carried by the circle is omitted, SCBt resembles SCA ([[:File:Fig. IS3.11.png|Fig.IS3.11]]).  
[[Image:Fig. IS3.11.png|thumb|center|690x690px|'''Fig. IS3.11.''' Proposed configuration and composition of synaptic complexes SCA and SCB involved in different steps of the IS911 transposition cycle.  
+
[[Image:Fig. IS3.11.png|thumb|center|780x780px|'''Fig. IS3.11.''' Proposed configuration and composition of synaptic complexes SCA and SCB involved in different steps of the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] transposition cycle.  
The excision complex SCA. The tips of the insertion sequence (IS), which are not protected by the truncated transposase OrfAB[1–149] are shown as green circles containing an arrowhead. IRs are indicated by thick black lines and the IS as green lines. Full-length OrfAB, which is presumed to cover the entire IR, is shown bound as a monomer to each end and to introduce a small bend in the DNA. Dimerization creates SCA, resulting in the pairing of both IRs and in the formation of a DNA loop which includes the IS. Finally, a cleavage and strand transfer event results in the formation of a single-strand bridge between the IRs. The integration complex SCB. Symbols are as in '''(A)'''. In the left-hand column, the IS circle intermediate with its newly replicated strand (dotted line) is shown to form a complex between an IR in the circle and a second in the target to form SCBt. Cleavage and strand transfer is shown to form a single-strand bridge between the two IRs. RecG helicase is thought to intervene to drive strand migration before a second cleavage and strand transfer results in the integration of the circle. This would explain the integration of the many different ISs observed to occur next to a resident IR in the target. The right-hand column: untargeted integration involving OrfA and OrfAB. OrfA is known to interact with OrfAB. It also changes in some way OrfAB binding but it is not clear whether it remains in the complex.|alt=]]
+
'''(A)''' The excision complex SCA. The tips of the insertion sequence (IS), which are not protected by the truncated transposase OrfAB[1–149], are shown as green circles containing an arrowhead. '''IRs''' are indicated by thick black lines and the IS as green lines. Full-length OrfAB, which is presumed to cover the entire IR, is shown bound as a monomer to each end and to introduce a small bend in the DNA. Dimerization creates SCA, resulting in the pairing of both IRs and in the formation of a DNA loop which includes the IS. Finally, a cleavage and strand transfer event results in the formation of a single-strand bridge between the IRs. '''(B)''' The integration complex SCB. Symbols are as in '''(A)'''. In the left-hand column, the IS circle intermediate with its newly replicated strand (dotted line) is shown to form a complex between an IR in the circle and a second in the target to form SCBt. Cleavage and strand transfer is shown to form a single-strand bridge between the two '''IRs'''. RecG helicase is thought to intervene to drive strand migration before a second cleavage and strand transfer results in the integration of the circle. This would explain the integration of the many different ISs observed to occur next to a resident IR in the target. The right-hand column: untargeted integration involving OrfA and OrfAB. OrfA is known to interact with OrfAB. It also alters OrfAB binding in some way, but it is not clear whether it remains in the complex.|alt=]]
  
SCBnt is thought to differ from both SCA and SCBt and to include the second IS''911'' protein, OrfA. This protein, binds non-specifically to DNA and interacts with OrfAB<ref name=":13" /><ref name=":21" />, is proposed to direct an OrfAB-junction complex to a randomly chosen target-DNA to form SCBnt<ref name=":30" /><ref><nowiki><pubmed>18586933</pubmed></nowiki></ref>. This is based on the observation that integration of the transposon circle intermediate is greatly stimulated by preincubation of OrfAB and OrfA in an ''in vitro'' reaction<ref name=":31"><pubmed>9463394</pubmed></nowiki></ref>.
+
SCBnt is thought to differ from both SCA and SCBt and to include the second [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] protein, OrfA. This protein, which binds non-specifically to DNA and interacts with OrfAB<ref name=":13" /><ref name=":21" />, is proposed to direct an OrfAB-junction complex to a randomly chosen target DNA to form SCBnt<ref name=":30" /><ref><nowiki><pubmed>18586933</pubmed></nowiki></ref>. This is based on the observation that integration of the transposon circle intermediate is greatly stimulated by preincubation of OrfAB and OrfA in an ''in vitro'' reaction<ref name=":31"><pubmed>9463394</pubmed>&lt;/nowiki&gt;</ref>.
  
 
====The Transposition Pathway====
 
The IS''3'' family is one of an increasing number of IS families known to transpose using a double strand circular DNA intermediate. Closely related pathways have been demonstrated for IS''1''<ref><nowiki><pubmed>7489730</pubmed></nowiki></ref>, IS''2<ref name=":5" />'', IS''3''<ref><nowiki><pubmed>15493331</pubmed></nowiki></ref>, and IS''150''<ref name=":32"><pubmed>12374815</pubmed></nowiki></ref>. This represents a major transposition pathway which has yet to be widely recognized. As shown in [[:File:Fig. IS3.9A.png|Fig.IS3.9]], and '''the animation below''', IS''3'' family transposition proceeds through a copy-out-paste-in process..  
+
The IS''3'' family is one of an increasing number of IS families known to transpose using a double strand circular DNA intermediate. Closely related pathways have been demonstrated for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1 IS''1'']<ref><nowiki><pubmed>7489730</pubmed></nowiki></ref>, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2'']''<ref name=":5" />'', [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3'']<ref><nowiki><pubmed>15493331</pubmed></nowiki></ref>, and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150'']<ref name=":32"><pubmed>12374815</pubmed>&lt;/nowiki&gt;</ref>. This represents a major transposition pathway which has yet to be widely recognized. As shown in [[:File:Fig. IS3.9A.png|Fig.IS3.9]], and '''the animation below''', IS''3'' family transposition proceeds through a '''copy-out-paste-in process'''.  
 
<center>
 
<center>
 
{| class="wikitable"
 
{| class="wikitable"
|+'''IS''911'' transposition mechanisms'''
+
|+'''[https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] transposition mechanisms'''
![[File:IS91-fast.mp4|center|380x380px]]'''<small>IS''911''.</small> <small>copy-out-paste-in</small>'''
+
![[File:IS91-fast.mp4|center|380x380px]]'''<small>[https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''].</small> <small>copy-out-paste-in mechanism</small>'''
 
|}
 
|}
 
</center>
 
</center>
  
 
====The Figure-eight form====
 
The initial step is recognition of the IR by OrfAB (presumably during its translation) ('''IS''911'' movie above''') and assembly of SCA to correctly position the DNA ends and the transposase catalytic site for the subsequent chemical steps. Like all known DDE transposase-catalyzed reactions<ref><nowiki><pubmed>26104718</pubmed></nowiki></ref>, IS''911'' transposition proceeds by cleavage of a single strand at the transposon end generating a 3’-OH. This then attacks a target phosphodiester bond in a strand transfer reaction. The particularity of this copy-out-paste-in mechanism is that initial cleavage occurs at only one transposon end, either left or right ([[:File:Fig. IS3.9A.png|Fig.IS3.9]]). This single liberated 3’-OH directs strand transfer to the same strand 3 bases 5’ to the other end of the element. This generates a molecule in which a single transposon strand is circularized to produce a single strand bridge generating a figure-eight structure on a circular plasmid donor molecule ([[:File:Fig. IS3.12.png|Fig.IS3.12]]) which can be easily observed ''in vivo''<ref name=":33"><pubmed>7590258</pubmed></nowiki></ref>. The IRs are joined by the single-stranded bridge and separated by three bases derived from flanking DNA from either the left or right end. The three (or 4) bp direct repeats flanking the original insertion are not required for further transposition (as also shown for IS''3''<ref name=":34"><pubmed>10556026</pubmed></nowiki></ref>) and an IS''911''-based transposon engineered to have different flanks generates a mixed population of figure-eight molecules with one or other flank sequence. Prevention of cleavage of one or other transposon end resulted in a homogenous population that carries the 3nt DNA flank associated with the mutant end confirming that the IRL can attack IRR and vice versa. The reaction can be viewed as a one-ended site-specific transposition event. These initial steps can be accomplished by OrfAB alone. 
However, it should be noted that in the presence of OrfA, no figure eight or IS circles could be detected by a simple gel assay in vivo although IS circles were found using a PCR approach<ref name=":27" />. This suggests that OrfA may play a role in negatively regulating the initiation of transposition. A similar conclusion has been reached for OrfA of IS''3''<ref><nowiki><pubmed>9413996</pubmed></nowiki></ref>. Alternatively, OrfA may stimulate the disappearance of figure eight and IS circles (see below) since no effect of OrfA was observed on figure-eight formation in vitro. Together with the fact that OrfAB is normally produced at low levels from a weak promoter<ref name=":4" />, initiation of transposition to form the figure eight intermediate may be stochastic.
+
The initial step is recognition of the IR by OrfAB (presumably during its translation) ('''[https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] movie above''') and assembly of SCA to correctly position the DNA ends and the transposase catalytic site for the subsequent chemical steps. Like all known DDE transposase-catalyzed reactions<ref><nowiki><pubmed>26104718</pubmed></nowiki></ref>, IS''911'' transposition proceeds by cleavage of a single strand at the transposon end generating a 3’-OH. This then attacks a target phosphodiester bond in a strand transfer reaction. The particularity of this copy-out-paste-in mechanism is that initial cleavage occurs at only one transposon end, either left or right ([[:File:Fig. IS3.9A.png|Fig.IS3.9]]). This single liberated 3’-OH directs strand transfer to the same strand 3 bases 5’ to the other end of the element. This generates a molecule in which a single transposon strand is circularized to produce a single strand bridge generating a figure-eight structure on a circular plasmid donor molecule ([[:File:Fig. IS3.12.png|Fig.IS3.12]]) which can be easily observed ''in vivo''<ref name=":33"><pubmed>7590258</pubmed>&lt;/nowiki&gt;</ref>. The IRs are joined by the single-stranded bridge and separated by three bases derived from flanking DNA from either the left or right end. The three (or 4) bp direct repeats flanking the original insertion are not required for further transposition (as also shown for IS''3''<ref name=":34"><pubmed>10556026</pubmed>&lt;/nowiki&gt;</ref>) and an [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911'']-based transposon engineered to have different flanks generates a mixed population of figure-eight molecules with one or other flank sequence. Prevention of cleavage of one or other transposon end resulted in a homogenous population that carries the 3nt DNA flank associated with the mutant end confirming that the IRL can attack IRR and vice versa. 
The reaction can be viewed as a one-ended site-specific transposition event. These initial steps can be accomplished by OrfAB alone. However, it should be noted that in the presence of OrfA, no figure eight or IS circles could be detected by a simple gel assay ''in vivo'' although IS circles were found using a PCR approach<ref name=":27" />. This suggests that OrfA may play a role in negatively regulating the initiation of transposition. A similar conclusion has been reached for OrfA of IS''3''<ref><nowiki><pubmed>9413996</pubmed></nowiki></ref>. Alternatively, OrfA may stimulate the disappearance of figure eight and IS circles (see below) since no effect of OrfA was observed on figure-eight formation ''in vitro''. Together with the fact that OrfAB is normally produced at low levels from a weak promoter<ref name=":4" />, initiation of transposition to form the figure eight intermediate may be stochastic.
[[Image:Fig. IS3.12.png|thumb|center|500x500px|'''Fig. IS3.12.''' Agarose gel electrophoresis of DNA extracts from cells carrying a donor plasmid in the presence of high levels of transposase. '''The first panel, Left'''. cartoons of three IS911 related species. From top to bottom: the donor plasmid, the figure 8 molecule, and the IS circle. IS911 is shown in green, plasmid backbone in black and the transposon ends as red dots. '''Second panel.''' Ethidium bromide-stained Agarose gel showing various DNA species including the plasmid which was used to supply transposase. '''Third panel.''' Electron micrographs of RecA coated figure 8 and IS circles. |alt=]]
+
[[Image:Fig. IS3.12.png|thumb|center|680x680px|'''Fig. IS3.12.''' [[wikipedia:Agarose_gel_electrophoresis|Agarose gel electrophoresis]] of DNA extracts from cells carrying a donor plasmid in the presence of high levels of transposase. '''The first panel, Left'''. Cartoons of three [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] related species. From top to bottom: the donor plasmid, the figure 8 molecule, and the IS circle. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] is shown in green, plasmid backbone in black and the transposon ends as red dots. '''Second panel.''' [[wikipedia:Ethidium_bromide|Ethidium bromide]]-stained [[wikipedia:Agarose_gel_electrophoresis|Agarose gel]] showing various DNA species, including the plasmid which was used to supply transposase. '''Third panel.''' Electron micrographs of [[wikipedia:RecA|RecA]] coated figure 8 and IS circles. |alt=]]
  
 
====The circular intermediate====
 
 
Kinetic data<ref name=":28" /><ref name=":33" /> indicate that the figure-eight gives rise to the circular transposon form which can easily be detected ''in vivo'' and in which the IR are abutted and separated by three base pairs of DNA flanking the original insertion ([[:File:Fig. IS3.9B.png|Fig.IS3.9]] and [[:File:Fig. IS3.12.png|Fig.IS3.12]]). As for figure-eight molecules, a transposon engineered to have different flanks generates a mixed population of transposon circles with one or the other 3bp flank located at the junction<ref><nowiki><pubmed>1334464</pubmed></nowiki></ref>.
 
  
Studies ''in vivo'' using a labeling protocol and a temperature-sensitive plasmid as transposon donor demonstrated that conversion from the figure-eight to the transposon circle occurs by semiconservative replication where the circular intermediate is “copied out” leaving a copy in the transposon donor molecule<ref name=":35"><pubmed>15359283</pubmed></nowiki></ref> ([[:File:Fig. IS3.9B.png|Fig.IS3.9]]). This is transposon-specific, requires OrfAB (presumably to generate the figure eight and generate a 3’-OH on the IS''911'' DNA flank) and does not depend on replication from the donor plasmid origin of replication<ref name=":35" />.
+
Studies ''in vivo'' using a labeling protocol and a temperature-sensitive plasmid as transposon donor demonstrated that conversion from the figure-eight to the transposon circle occurs by semiconservative replication where the circular intermediate is “copied out” leaving a copy in the transposon donor molecule<ref name=":35"><pubmed>15359283</pubmed>&lt;/nowiki&gt;</ref> ([[:File:Fig. IS3.9B.png|Fig.IS3.9]]). This is transposon-specific, requires OrfAB (presumably to generate the figure eight and generate a 3’-OH on the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] DNA flank) and does not depend on replication from the donor plasmid origin of replication<ref name=":35" />.
  
Using donor plasmids where one or other IR was inactivated for cleavage would be expected to determine whether one or other of the 3’-OH is used in transposon replication. This was tested using the Tus/ter system<ref><nowiki><pubmed>8021197</pubmed></nowiki></ref><ref><nowiki><pubmed>2181438</pubmed></nowiki></ref><ref><nowiki><pubmed>2510933</pubmed></nowiki></ref><ref><nowiki><pubmed>16148308</pubmed></nowiki></ref> (which blocks passage of a replication fork in an orientation specific fashion) cloned into the transposon in either one or other orientation. In the presence of Tus protein, no transposon circles were observed if the orientation of the ter site was that expected to block replication from one or the other end<ref name=":35" />.
+
Using donor plasmids where one or other IR was inactivated for cleavage would be expected to determine whether one or other of the 3’-OH is used in transposon replication. This was tested using the [[wikipedia:Replication_terminator_Tus_family|Tus/ter system]]<ref><nowiki><pubmed>8021197</pubmed></nowiki></ref><ref><nowiki><pubmed>2181438</pubmed></nowiki></ref><ref><nowiki><pubmed>2510933</pubmed></nowiki></ref><ref><nowiki><pubmed>16148308</pubmed></nowiki></ref> (which blocks passage of a replication fork in an orientation specific fashion) cloned into the transposon in either one or other orientation. In the presence of [[wikipedia:Replication_terminator_Tus_family|Tus protein]], no transposon circles were observed if the orientation of the ter site was that expected to block replication from one or the other end<ref name=":35" />.
  
 
At present, it is not known how OrfAB is removed and how this replication step is initiated or terminated to generate the final circles. It is possible that these processes involve host factors and mechanisms similar to those that operate in replicative transposition of [[wikipedia:Bacteriophage_Mu|bacteriophage Mu]] (see <ref><nowiki><pubmed>26104374</pubmed></nowiki></ref><ref><nowiki><pubmed>12770828</pubmed></nowiki></ref><ref><nowiki><pubmed>11459960</pubmed></nowiki></ref>).  
 
  
RecG helicase is implicated in targeted insertion. This process involves a target IS''911'' end and strand transfer occur between one cleaved end of the IS circle and the target IS end to create an intermolecular single-strand bridge rather than the intramolecular bridge of the figure-eight intermediate ([[:File:Fig. IS3.13.png|Fig.IS3.13]]). Resolution of this structure implicates branch migration and replication from the donor plasmid<ref name=":36"><pubmed>15306008</pubmed></nowiki></ref>. This reinforces the idea that host proteins including components of the replication machinery are loaded onto figure-eight intermediates.
+
[https://www.uniprot.org/uniprot/P24230 RecG helicase] is implicated in targeted insertion. This process involves a target [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] end: strand transfer occurs between one cleaved end of the IS circle and the target IS end to create an intermolecular single-strand bridge rather than the intramolecular bridge of the figure-eight intermediate ([[:File:Fig. IS3.13.png|Fig.IS3.13]]). Resolution of this structure involves branch migration and replication from the donor plasmid<ref name=":36"><nowiki><pubmed>15306008</pubmed></nowiki></ref>. This reinforces the idea that host proteins, including components of the replication machinery, are loaded onto figure-eight intermediates.
[[Image:Fig. IS3.13.png|thumb|center|500x500px|'''Fig. IS3.13.''' ''In vitro'' reactions were performed using purified IS''911'' circles which included a chloramphenicol resistance gene and a plasmid target with a promoterless ''lacZ'' gene.
+
[[Image:Fig. IS3.13.png|thumb|center|620x620px|'''Fig. IS3.13.''' ''In vitro'' reactions were performed using purified [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] circles which included a chloramphenicol resistance gene and a plasmid target with a promoterless ''lacZ'' gene.
 
Following a standard ''in vitro'' reaction, the reaction mixture was used to transform competent ''E. coli'' with selection for chloramphenicol resistance. Lines on the interior and exterior of the plasmid circle represent different orientations of insertion.|alt=]]

Following a standard ''in vitro'' reaction, the reaction mixture was used to transform competent ''E. coli'' with selection for chloramphenicol resistance. Lines on the interior and exterior of the plasmid circle represent different orientations of insertion.|alt=]]
  
 
====Integration of the circular intermediate====
 
====Integration of the circular intermediate====
The IR junction formed by IS circularization is very unstable in the presence of OrfAB and undergoes high levels of deletion and insertion ''in vivo''<ref name=":37"><pubmed>9214651</pubmed></nowiki></ref> and ''in vitro<ref name=":31" />''. Transposon circle insertion presumably requires further transposase synthesis.  
+
The IR junction formed by IS circularization is very unstable in the presence of OrfAB and undergoes high levels of deletion and insertion ''in vivo''<ref name=":37"><nowiki><pubmed>9214651</pubmed></nowiki></ref> and ''in vitro''<ref name=":31" />. Transposon circle insertion presumably requires further transposase synthesis.
  
A remarkable consequence of transposon circle formation is the assembly of a strong promoter, pjunc, from a –35 hexamer contributed by IRR and a –10 hexamer contributed by IRL ([[:File:Fig. IS3.3.png|Fig.IS3.3 B]]). The 3 (or more rarely 4) bp which separate IRL and IRR in the circle provide an ideal spacing between the –35 and –10 elements<ref name=":37" />. The junction promoter, pjunc, is 30-50 fold stronger than the indigenous promoter, pIRL<ref name=":37" /> ([[:File:Fig. IS3.4.png|Fig.IS3.4]]), and more than two fold stronger than ''lacUV5<ref name=":9" />''. It is correctly placed to drive high levels of transposase synthesis and plays an active role in controlling IS''911'' transposition.  
+
A remarkable consequence of transposon circle formation is the assembly of a strong promoter, pjunc, from a –35 hexamer contributed by '''IRR''' and a –10 hexamer contributed by '''IRL''' ([[:File:Fig. IS3.3.png|Fig.IS3.3 B]]). The 3 (or more rarely 4) bp which separate '''IRL''' and '''IRR''' in the circle provide an ideal spacing between the –35 and –10 elements<ref name=":37" />. The junction promoter, pjunc, is 30- to 50-fold stronger than the indigenous promoter, pIRL<ref name=":37" /> ([[:File:Fig. IS3.4.png|Fig.IS3.4]]), and more than twofold stronger than ''lacUV5''<ref name=":9" />. It is correctly placed to drive high levels of transposase synthesis and plays an active role in controlling [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] transposition.
  
Inactivation of pjunc by mutagenesis strongly reduced IS''911'' transposition in vivo when transposase was expressed in its native configuration<ref name=":9" />. Moreover, the truncated OrfAB derivative, OrfAB[1-149] , which specifically binds IRR and IRL, reduced in vivo promoter activity 10 fold in a mutated junction resistant to cleavage. Full-length OrfAB, which binds the IR only weakly, and OrfA, which does not specifically bind the IR, had no effect<ref name=":9" />. Integration results in disassembly of pjunc providing a powerful feedback mechanism resulting in transient and controlled activation of integration only in the presence of the correct (circular) intermediate.
+
Inactivation of pjunc by mutagenesis strongly reduced [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] transposition ''in vivo'' when transposase was expressed in its native configuration<ref name=":9" />. Moreover, the truncated OrfAB derivative, OrfAB[1-149], which specifically binds '''IRR''' and '''IRL''', reduced ''in vivo'' promoter activity 10-fold in a mutated junction resistant to cleavage. Full-length OrfAB, which binds the IR only weakly, and OrfA, which does not specifically bind the IR, had no effect<ref name=":9" />. Integration disassembles pjunc, providing a powerful feedback mechanism that results in transient and controlled activation of integration only in the presence of the correct (circular) intermediate.
  
For the related IS''2'', this junction promoter is required for transposition<ref><nowiki><pubmed>14729714</pubmed></nowiki></ref>.
+
For the related [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''], this junction promoter is required for transposition<ref><nowiki><pubmed>14729714</pubmed></nowiki></ref>.
  
 
Circle junction formation brings both transposon ends together in an inverted orientation. This active junction must then participate in the second type of synaptic complex which includes target DNA ([[:File:Fig. IS3.9B.png|Fig.IS3.9]] and [[:File:Fig. IS3.11.png|Fig.IS3.11 B]]).

Circle junction formation brings both transposon ends together in an inverted orientation. This active junction must then participate in the second type of synaptic complex which includes target DNA ([[:File:Fig. IS3.9B.png|Fig.IS3.9]] and [[:File:Fig. IS3.11.png|Fig.IS3.11 B]]).
Line 195: Line 231:
 
The final step requires OrfAB but is greatly stimulated by OrfA and is sensitive to the ratio of OrfAB/OrfA<ref name=":31" />.
 
The final step requires OrfAB but is greatly stimulated by OrfA and is sensitive to the ratio of OrfAB/OrfA<ref name=":31" />.
  
It is not known whether target capture occurs before or after cleavage of the circle junction although it has been observed that linear copies of IS''911'' are produced from transposon circles ''in vitro'' and in the presence of high OrfAB levels in vivo and a pre-cleaved linear transposon was a robust substrate for integration ''in vitro''<ref><nowiki><pubmed>10320583</pubmed></nowiki></ref>.
+
It is not known whether target capture occurs before or after cleavage of the circle junction. However, linear copies of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] are produced from transposon circles ''in vitro'' and, in the presence of high OrfAB levels, ''in vivo'', and a pre-cleaved linear transposon was a robust substrate for integration ''in vitro''<ref><nowiki><pubmed>10320583</pubmed></nowiki></ref>.
  
 
Based on kinetics and on the formation of the strong pjunc promoter, we favor a model in which the IS circles represent a reservoir of transposition intermediates and in which linear forms are generated from the IS circles during the integration process.

Based on kinetics and on the formation of the strong pjunc promoter, we favor a model in which the IS circles represent a reservoir of transposition intermediates and in which linear forms are generated from the IS circles during the integration process.
  
This has also been proposed for IS''3<ref name=":34" />''.  
+
This has also been proposed for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3'']<ref name=":34" />.
  
 
====Targeted Insertion====
 
====Targeted Insertion====
As stated above, several IS including IS''911'' show a preference for integration next to sequences in the target similar to their IR. One way of understanding this is that the transposon circle is able to form a synaptic complex (SCBt; [[:File:Fig. IS3.11.png|Fig.IS3.11 B]] left) which is similar to SCA ([[:File:Fig. IS3.11.png|Fig.IS3.11 A]]) but which occurs “in trans” between an IR of the transposon circle and an IR in the target. In the case of IS''911'', this phenomenon occurs more frequently if OrfA is not present (Fig. IS3.13) and it was proposed that one role of OrfA is to promote dispersion of the IS<ref name=":30" /><ref name=":38"><pubmed>12145217</pubmed></nowiki></ref>.
+
As stated above, several IS including IS''911'' show a preference for integration next to sequences in the target similar to their IR. One way of understanding this is that the transposon circle is able to form a synaptic complex (SCBt; [[:File:Fig. IS3.11.png|Fig.IS3.11 B]] left) which is similar to SCA ([[:File:Fig. IS3.11.png|Fig.IS3.11 A]]) but which occurs “in trans” between an IR of the transposon circle and an IR in the target. In the case of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], this phenomenon occurs more frequently if OrfA is not present ([[:File:Fig. IS3.13.png|Fig.IS3.13]]) and it was proposed that one role of OrfA is to promote dispersion of the IS<ref name=":30" /><ref name=":38"><nowiki><pubmed>12145217</pubmed></nowiki></ref>.
  
 
This type of one-ended intermolecular recombination/integration has been analyzed in some detail<ref name=":36" /><ref name=":38" /><ref><nowiki><pubmed>14756780</pubmed></nowiki></ref>.
 
This type of one-ended intermolecular recombination/integration has been analyzed in some detail<ref name=":36" /><ref name=":38" /><ref><nowiki><pubmed>14756780</pubmed></nowiki></ref>.
Line 208: Line 244:
 
IR-targeted insertion involves the transfer of a single end of the junction to the target IR to generate a branched DNA structure. The single-end transfer (SET) intermediate, but not the final insertion product, was detected ''in vitro''. This implies that SET intermediates must be processed by the bacterial host to obtain the final insertion products. Sequence analysis of ''in vitro'' and ''in vivo'' IR-targeted insertion products revealed high levels of DNA sequence conversion in which mutations from one IR were transferred to another. These sequence changes could not be explained by the classic transposition pathway but could be understood in terms of a mechanism in which SET generates a four-way [[wikipedia:Holliday_junction|Holliday-like junction]] which is then processed by host-mediated branch migration, resolution, repair and replication. This pathway resembles those described for processing other branched DNA structures such as stalled replication forks.
 
IR-targeted insertion involves the transfer of a single end of the junction to the target IR to generate a branched DNA structure. The single-end transfer (SET) intermediate, but not the final insertion product, was detected ''in vitro''. This implies that SET intermediates must be processed by the bacterial host to obtain the final insertion products. Sequence analysis of ''in vitro'' and ''in vivo'' IR-targeted insertion products revealed high levels of DNA sequence conversion in which mutations from one IR were transferred to another. These sequence changes could not be explained by the classic transposition pathway but could be understood in terms of a mechanism in which SET generates a four-way [[wikipedia:Holliday_junction|Holliday-like junction]] which is then processed by host-mediated branch migration, resolution, repair and replication. This pathway resembles those described for processing other branched DNA structures such as stalled replication forks.
 
A version of this model is shown in [[:File:Fig. IS3.14.png|Fig.IS3.14]]. Subsequent studies showed that the RecG helicase is implicated ''in vivo'', as might be expected for strand migration<ref name=":36" />.

A version of this model is shown in [[:File:Fig. IS3.14.png|Fig.IS3.14]]. Subsequent studies showed that the RecG helicase is implicated ''in vivo'', as might be expected for strand migration<ref name=":36" />.
[[Image:Fig. IS3.14.png|thumb|center|620x620px|'''Fig. IS3.14.''' IRR and IRL in red and green respectively. A mutant terminal dinucleotide (pale red or green boxes) prevents donor activity but allows target activity. Three interstitial base pairs in the IR/IR junction are as grey and white circles to distinguish DNA strand polarity. The same convention is used for the three base pairs flanking the target mutant IRL* as diamonds. Dotted lines: donor transposon circle; full lines: target DNA. The 3’ ends of Tpase-mediated nicks are indicated by arrows. Those, which may exist transiently during second strand resolution, are indicated by a gap. '''I,''' synapsis and cleavage at one end and strand transfer; '''II,''' the formation of a SET between donor and target; '''III,''' branch migration in the sense of the arrow creating hybrid IRL or IRL/IRR copies; '''IV,''' [[wikipedia:Holliday_junction|Holliday junction]] resolution, thick dashed lines; '''V,''' resolved product subsequently subject to mismatch repair and replication. Lower case roman numerals below indicate the type of final product. The differences between '''A''', '''B''' and '''C''' depend on the IR which attacks the target. '''A.''' IRR attacks three base pairs from the target IRL*. B. IRL attacks three base pairs from the target IRL* leading to hybrid IRs in which one strand was derived from IRR and the other from IRL*. The figure shows the expected results if branch migration continued into the region of non-complementarity after the IRs. '''C.''' IRL attacks at the tip of the target IRL*.|alt=]]
+
[[Image:Fig. IS3.14.png|thumb|center|780x780px|'''Fig. IS3.14.''' '''IRR''' and '''IRL''' in red and green respectively. A mutant terminal dinucleotide (pale red or green boxes) prevents donor activity but allows target activity. Three interstitial base pairs in the IR/IR junction are shown as grey and white circles to distinguish DNA strand polarity. The same convention is used for the three base pairs flanking the target mutant IRL*, shown as diamonds. Dotted lines: donor transposon circle; full lines: target DNA. The 3’ ends of Tpase-mediated nicks are indicated by arrows. Those that may exist transiently during second strand resolution are indicated by a gap. '''I,''' synapsis and cleavage at one end and strand transfer; '''II,''' the formation of a SET between donor and target; '''III,''' branch migration in the sense of the arrow creating hybrid IRL or IRL/IRR copies; '''IV,''' [[wikipedia:Holliday_junction|Holliday junction]] resolution, thick dashed lines; '''V,''' resolved product subsequently subject to mismatch repair and replication. Lower case roman numerals below indicate the type of final product. The differences between '''A''', '''B''' and '''C''' depend on the IR which attacks the target. '''A.''' IRR attacks three base pairs from the target IRL*. '''B.''' IRL attacks three base pairs from the target IRL* leading to hybrid IRs in which one strand was derived from IRR and the other from IRL*. The figure shows the expected results if branch migration continued into the region of non-complementarity after the IRs. '''C.''' IRL attacks at the tip of the target IRL*.|alt=]]
  
 
====Mechanism in other family members====
 
====Mechanism in other family members====
Several other members of this family have also been analysed in some detail. These include IS''2'', IS''3'', and IS''150''. All three have been shown to generate circles when supplied with high levels of the fused frame Tpase<ref name=":3" /><ref name=":5" /><ref name=":32" /><ref name=":34" /><ref name=":39"><pubmed>8550559</pubmed></nowiki></ref>.
+
Several other members of this family have also been analysed in some detail. These include [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''], and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150'']. All three have been shown to generate circles when supplied with high levels of the fused frame Tpase<ref name=":3" /><ref name=":5" /><ref name=":32" /><ref name=":34" /><ref name=":39"><nowiki><pubmed>8550559</pubmed></nowiki></ref>.
  
IS''3'' also generates adjacent deletions<ref name=":3" /> but, unlike IS''911'', appears to undergo excision from the donor molecule as a linear form following a staggered double strand break at each end. These forms have a 3 base 5' overhang and may be an alternative type of transposition intermediate<ref name=":39" />. Such forms may be equivalent to the linear IS''911'' species derived from transposon circles. In addition, IS''3''-derivative transposons in which two abutted ends have been engineered undergo high levels of transposition<ref name=":10" />.  
+
IS''3'' also generates adjacent deletions<ref name=":3" /> but, unlike [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], appears to undergo excision from the donor molecule as a linear form following a staggered double strand break at each end. These forms have a 3 base 5' overhang and may be an alternative type of transposition intermediate<ref name=":39" />. Such forms may be equivalent to the linear [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] species derived from transposon circles. In addition, IS''3''-derivative transposons in which two abutted ends have been engineered undergo high levels of transposition<ref name=":10" />.  
  
Insertion of IS''3'' creates generally 3 and sometimes 4 bp direct target repeats. It is significant that plasmids in which the IRs are separated by 4 bp are more active than those separated by 8 bp. In these studies, the authors were unable to engineer derivatives with two complete tandem IS''3'' elements. This may be the result of the formation of a strong hybrid promoter which, as described for IS''911'' and other ISs (see above), drives high levels of Tpase expression. This configuration of ends is equivalent to that found at the circle junction and suggests that abutted ends of IS''3'' are also efficient substrates in transposition.  
+
Insertion of IS''3'' generally creates 3 bp and sometimes 4 bp direct target repeats. It is significant that plasmids in which the '''IRs''' are separated by 4 bp are more active than those separated by 8 bp. In these studies, the authors were unable to engineer derivatives with two complete tandem IS''3'' elements. This may be the result of the formation of a strong hybrid promoter which, as described for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] and other ISs (see above), drives high levels of Tpase expression. This configuration of ends is equivalent to that found at the circle junction and suggests that abutted ends of IS''3'' are also efficient substrates in transposition.
  
IS2 generates direct target duplications of 5 bp on insertion<ref><nowiki><pubmed>375194</pubmed></nowiki></ref> although transposon circles generated with this element carry only a single base pair separating IRL and IRR<ref name=":5" />.
+
[https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] generates direct target duplications of 5 bp on insertion<ref><nowiki><pubmed>375194</pubmed></nowiki></ref> although transposon circles generated with this element carry only a single base pair separating '''IRL''' and '''IRR'''<ref name=":5" />.
  
While IS2 carries a conserved terminal 5'-CA-3' at its right end, the left end terminates with 5'-TG-3'. This atypical IRL does not act as a strand donor but uniquely as a target in the circularization reaction.  
+
While [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] carries a conserved terminal '''5'<nowiki/>'''-CA-'''3'<nowiki/>''' at its right end, the left end terminates with '''5'<nowiki/>'''-TG-'''3'<nowiki/>'''. This atypical IRL does not act as a strand donor but only as a target in the circularization reaction.
  
 
Functional studies indicate that the product of the upstream orfA may inhibit transposition<ref name=":8" />. It has been shown to bind specifically to IRL at a sequence that overlaps the -10 hexamer of the resident Tpase promoter and represses expression of OrfA.  
 
Functional studies indicate that the product of the upstream orfA may inhibit transposition<ref name=":8" />. It has been shown to bind specifically to IRL at a sequence that overlaps the -10 hexamer of the resident Tpase promoter and represses expression of OrfA.  
  
It does not appear to bind IRR (note that in the original article the authors inverse the standard definition of IRL and IRR<ref name=":8" />.
+
It does not appear to bind '''IRR''' (note that in the original article the authors invert the standard definition of IRL and IRR<ref name=":8" />).
 
 
Several other elements also exhibit small inverted repeat sequences which flank the -10 hexamer of the putative resident Tpase promoter. IS''2''-derivative transposons in which two abutted ends have been engineered also undergo high levels of transposition<ref name=":5" /><ref><nowiki><pubmed>8676870</pubmed></nowiki></ref> and, like IS''911'', the circle junction of IS''2'' also constitutes a strong promoter capable of driving Tpase expression. Several (but not all) IS''3''-family elements may also carry similarly located potential -35 and -10 sequences within their IRs.  
 
  
<br />
+
Several other elements also exhibit small inverted repeat sequences which flank the -10 hexamer of the putative resident Tpase promoter. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2'']-derivative transposons in which two abutted ends have been engineered also undergo high levels of transposition<ref name=":5" /><ref><nowiki><pubmed>8676870</pubmed></nowiki></ref> and, like [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], the circle junction of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] also constitutes a strong promoter capable of driving Tpase expression. Several (but not all) IS''3''-family elements may also carry similarly located potential -35 and -10 sequences within their IRs.
  
 
===Structural studies===
 
===Structural studies===
Although there are at present no structural data available for any members of this family, recent results obtained with an IS from another family, IS''Cth4'' from the [[IS Families/IS256 family|IS''256'' family]], which also undergoes copy-out-paste-in transposition has provided some insights <ref><nowiki><pubmed>33006208</pubmed></nowiki></ref>. This particular transposition pathway is asymmetric in the sense that one IS end is cleaved and attacks the opposite end several nucleotides from the tip <ref><nowiki><pubmed>7590258</pubmed></nowiki></ref>. In accord with this type of mechanism, crystal structures of IS''Cth4'' transposase bound to three different substrates show a transposase dimer bound asymmetrically to a single DNA substrate: a pre-reaction substrate with '''IR'''R together with its flanking DNA, a pre-cleaved complex in which the '''IR'''R flank had been removed and a strand transfer complex including an abutted '''IR'''R and '''IR'''L separated by a gapped 6 base pair linker ([[:File:IS256.8.png|Fig. IS256.8]]). It is important to note that IS''256'' family transposases carry an alpha-helical insertion domain which separates the catalytic domain into two segments. This domain plays an important role in directing different DNA segments during the reaction. IS''3'' family transposases carry an uninterrupted catalytic domain without the alpha helical insertion domain implying that the atomic details of the process will differ. In this light, it is worth remembering that efficient insertion of IS''911'' transposon circles catalysed by OrfAB is greatly stimulated by inclusion of the upstream OrfA protein and is sensitive to the ratio of OrfAB/OrfA <ref><nowiki><pubmed>9463394</pubmed></nowiki></ref>.
+
Although there are at present no structural data available for any members of this family, recent results obtained with an IS from another family, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISCth4 IS''Cth4''] from the [[IS Families/IS256 family|IS''256'' family]], which also undergoes '''copy-out-paste-in''' transposition, have provided some insights <ref><nowiki><pubmed>33006208</pubmed></nowiki></ref>. This particular transposition pathway is asymmetric in the sense that one IS end is cleaved and attacks the opposite end several nucleotides from the tip <ref><nowiki><pubmed>7590258</pubmed></nowiki></ref>. In accord with this type of mechanism, crystal structures of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISCth4 IS''Cth4''] transposase bound to three different substrates show a transposase dimer bound asymmetrically to a single DNA substrate: a pre-reaction substrate with '''IRR''' together with its flanking DNA, a pre-cleaved complex in which the '''IRR''' flank had been removed and a strand transfer complex including an abutted '''IRR''' and '''IRL''' separated by a gapped 6 base pair linker ([[:File:IS256.8.png|Fig. IS256.8]]).
  
<br />
+
It is important to note that [[IS Families/IS256 family|IS''256'' family]] transposases carry an alpha-helical insertion domain which separates the catalytic domain into two segments. This domain plays an important role in directing different DNA segments during the reaction. IS''3'' family transposases carry an uninterrupted catalytic domain without the alpha helical insertion domain implying that the atomic details of the process will differ. In this light, it is worth remembering that efficient insertion of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] transposon circles catalysed by OrfAB is greatly stimulated by inclusion of the upstream OrfA protein and is sensitive to the ratio of OrfAB/OrfA <ref><nowiki><pubmed>9463394</pubmed></nowiki></ref>.
  
 
==Bibliography==
 
==Bibliography==
 
<references />
 
<references />

Revision as of 00:11, 11 August 2021

Original Identification

IS3 and another member of this family, IS2, were identified genetically as DNA segments causing insertional inactivation of the gal and lac operons and physically by electron microscopy[1] and in plasmid F as a segment called alpha-beta[2][3]. IS3 was subsequently wrongly identified as the insertion sequence flanking the tetracycline resistance transposon Tn10[4][5]. It has subsequently been found as a component of a large number of plasmids, particularly in Gram-negative enterics.

Presence in Compound Transposons

IS3 family elements do participate in compound transposons but, to our knowledge, no systematic survey has been undertaken and very few IS3-associated compound transposons have been described to date. Examples include: IS3411, which flanks genes for citrate utilization in transposon Tn3411[6][7][8]; IS4521, which flanks a heat-stable enterotoxin gene in enterotoxigenic Escherichia coli; and IS1706, which flanks genes of the Clp protease/chaperone family.

Distribution

This is one of the most coherent, largest, most abundant and widely distributed IS families [9] (see [10]). Nearly 600 distinct members of this family have been identified in more than 267 bacterial species distributed over 145 genera. However, their true distribution is clearly significantly greater than this.

For example, IS911 (isolated from a Shigella dysenteriae phage λ lysogen by spontaneous insertion into the phage cI repressor gene[11]) is present in multiple copies in the original host strain and in type strains of other Shigella species. Two vestigial copies, both interrupted by a copy of IS30, were also detected in the chromosome of E. coli K12[12] and could form transposition intermediates when supplied with IS911 transposase[13]. Entire or truncated IS911 copies have also been identified in several E. coli virulence plasmids (e.g. [14]), in pathogenicity islands of uropathogenic E. coli (e.g. [15]), in various other clinical isolates of E. coli and in a large number of well-known and less well-known enterobacteria such as Escherichia fergusonii, Cronobacter, Dickeya, Erwinia, Klebsiella, Pantoea, Shimwellia, and Yersinia.

Most IS3 family members have been identified in bacteria although at least one example, ISMco1, has also been identified in the archaeon Methanosaeta concilii[16]. Since this archaeon is widespread in nature[17], it is possible that this represents a case of recent horizontal transfer. The presence of 8 copies implies that ISMco1 is active in its archaeal host.

Organization

The family is quite homogeneous in its organization (Fig.IS3.1) in spite of its wide distribution in bacteria exhibiting a large range of G+C contents (from 70% in the Mycobacterial examples to 25% in those isolated from Mycoplasma) and of the presence of members in hosts such as Mycoplasma with a non-universal genetic code (e.g. IS1138) or in bacteria which use stop codon read-through by insertion of the unusual amino acid selenocysteine (e.g. ISDvu3 from Desulfovibrio vulgaris). In the case of both copies of IS1138, which participates in high frequency rearrangements of the Mycoplasma pulmonis chromosome, the Tpase orf carries 11 UGA codons which are decoded as tryptophan[18].
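The effect of a non-universal genetic code on a Tpase orf can be illustrated with a minimal Python sketch. It assumes the standard code (NCBI translation table 1) and the Mycoplasma/Spiroplasma code (NCBI translation table 4), which differ only in reading UGA (TGA in DNA) as tryptophan. The short sequence is a made-up toy and is not taken from IS1138; the function and variable names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), amino acids in TCAG codon order.
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def build_table(aas):
    """Map the 64 codons (TTT, TTC, TTA, ...) onto a 64-letter amino acid string."""
    codons = ("".join(c) for c in product(BASES, repeat=3))
    return dict(zip(codons, aas))


STANDARD = build_table(STANDARD_AAS)
# Mycoplasma/Spiroplasma code (NCBI table 4): same as standard except TGA -> W.
MYCOPLASMA = dict(STANDARD, TGA="W")


def translate(dna, table):
    """Translate from position 0 until the first stop codon (stop not included)."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = table[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Toy orf with an internal TGA codon (illustrative only, not a real IS sequence).
toy_orf = "ATGAAATGAGGTTAA"
print(translate(toy_orf, STANDARD))    # truncated at the TGA codon
print(translate(toy_orf, MYCOPLASMA))  # TGA read as tryptophan
```

Under the standard table the toy orf is cut short at the internal TGA, whereas the Mycoplasma table reads through it, which is why such an orf would appear interrupted if annotated with the wrong code.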

Fig. IS3.1. (A) Genetic organization of IS911. The 1,250-bp IS911 is shown as a box. The boxes at each end represent the left (IRL) and right (IRR) terminal inverted repeats. The two open reading frames, orfA (blue) and orfB (green) are positioned in relative reading phases 0 and −1, respectively, as indicated. The indigenous promoter, pIRL, is shown. The region of overlap between orfA and orfB, which includes the frameshifting signals to produce OrfAB, lies within IS911 coordinates 300 and 400. The precise point at which the frameshift occurs, within the last heptad of the LZ, is indicated by the vertical dotted line. (B) Structure-function map of OrfAB and OrfA. HTH, a potential helix-turn-helix motif; LZ, a leucine zipper motif involved in homo- and hetero-multimerization of OrfAB and OrfA. Programmed translational frameshifting that fuses OrfA and OrfB to generate the transposase OrfAB occurs within the fourth heptad. The LZ of OrfA and OrfAB, therefore, differ in their fourth heptad. A second region, M, necessary for multimerization of OrfAB is shown, as is the catalytic core of the enzyme which carries a third multimerization domain. OrfA translation initiates at an AUG, terminates with UAA whereas OrfAB translation terminates within the right IR. The vertical line to the right of M shows the extent of the truncated transposase, OrfAB[1–149] described in the text. (C) Frameshifting window. The mRNA sequence around the programmed translational frameshifting window is presented. The boxed sequence GGAG is the potential ribosome-binding site located upstream of orfB whose potential translation would be initiated at the boxed AUU codon. A ribosome (not to scale) is shown covering a series of “slippery” codons (AAAAAAG). A downstream secondary structure is also shown with the UAA, OrfA translation termination codon. The ribosome-binding site, slippery codons, and secondary structure all contribute to the efficiency of the programmed −1 frameshift. 
The box at the foot of this figure shows how the anti-codons of two tRNALys are thought to undergo re-pairing with their codons in the AAAAAAG motif.

Members are between 1200 and 1550 bp long, with relatively well-conserved terminal inverted repeats of 20-40 bp. One exception previously attributed to this family, IS481, is 1045 bp long and has now been placed in a separate family (see "IS481 family"). IS3 family members generate 3 or 4 bp direct repeats (DR) on insertion.

The majority of IR terminate with 5'-TG-----CA-3' and present an internal block of G/C residues of variable length (Fig.IS3.2).
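
As an illustration of these terminal features, the following minimal Python sketch checks an element for the 5'-TG...CA-3' tips and counts how many positions of the left end match the reverse complement of the right end. The function names and the toy sequence are hypothetical; they are not taken from ISfinder or from any real IS.

```python
# Toy check of IS ends: TG...CA tips and IRL/IRR agreement.
# All names and sequences here are illustrative assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_tg_ca_tips(element):
    """True if the top strand starts with TG and ends with CA."""
    element = element.upper()
    return element.startswith("TG") and element.endswith("CA")


def ir_matches(element, ir_length):
    """Count identical positions between IRL and the reverse complement of IRR."""
    left = element[:ir_length].upper()
    right = reverse_complement(element[-ir_length:])
    return sum(a == b for a, b in zip(left, right))


# Hypothetical 28-bp "element" with perfect 10-bp inverted repeats.
toy = "TGACCGCCCC" + "ATATATAT" + "GGGGCGGTCA"
print(has_tg_ca_tips(toy), ir_matches(toy, 10))
```

Real IS3 family ends are imperfect repeats, so a match count below the IR length is expected; the sketch only makes the definition of "inverted" explicit.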

Fig. IS3.2. WebLogo of IS3 family ends. The left (IRL) and right (IRR) inverted terminal repeats of the major IS3 family groups as defined in ISfinder are shown in WebLogo format. They are defined by the direction of transcription/translation of the transposase gene. IRL, by definition, is located on the 5’ side of the transposase orf.

IS3-family members generally have two consecutive and partially overlapping reading frames, orfA and orfB, under control of a weak promoter, pIRL, partially located in IRL (Fig.IS3.1 A and Fig.IS3.3 C). The 5' end of orfB overlaps the 3' end of orfA, and orfB lies in reading phase -1 relative to orfA (Fig.IS3.1 A).

It was demonstrated in the 1990s that several family members (IS150[19], IS3[20], IS911[21], and IS2[22]) express two major proteins (Fig.IS3.1 B): OrfA, the product of the upstream frame, and the transposase, OrfAB, a “fusion” or “transframe” protein generated from orfA and orfB by Programmed -1 Ribosomal Frameshifting (PRF) (see "Programmed translational frameshifting")[23]. Many other members of this family are also organized in this way[24][25]. The frameshifting frequency varies from element to element. It is approximately 50% in the case of IS150[19] and only 15% for IS911[21].

Fig. IS3.3. (A) Organization of the IS911 inverted repeat (IR). The nucleotide sequence of IRL and IRR is boxed. Grey horizontal bars above and below indicate the internal regions protected from DNaseI digestion by binding of OrfAB [1–149], a derivative of the 382-amino-acid OrfAB truncated for its catalytic domain. The dotted horizontal gray bar indicates partial protection. The dashes within the sequence indicate mismatches between the left and right ends. The −35 and −10 components of the indigenous promoter pIRL (blue boxes) and of pjunc (green boxes) are shown. The conserved 5′ TG tips are highlighted in red. (B) Organization of pjunc. The “junction” promoter assembled on the circularization of IS911 is shown as green boxes. The initiating transcript nucleotide (+1 pjunc), the indigenous pIRL (blue boxes), and the initiating transcript nucleotide (+1 pIRL) are also shown. The conserved 5′ TG tips are highlighted in red. (C) Secondary structure at the left IS911 end. The sequence of the “top” strand of IRL is shown, together with the various transcription and translation signals. The symbols below are standard “dot-bracket” notation indicating potential secondary structures formed with transcripts initiated, from top to bottom, from an external promoter, from pjunc, or from pIRL. The brackets shown in italics simply permit the reader to identify the apical stem of the secondary structure.

Complex internal inverted repeat sequences (Fig.IS3.3 C) (for IS911, located between coordinates 19 and 73) include the -35 and -10 hexamers of pIRL, the transcription start site and the ribosome binding site for OrfA. These are thought to play a role at the mRNA level in preventing excess transposase expression resulting from external transcription. The full secondary structure would be present in transcripts initiated outside the IS, thus sequestering the translation initiation signals, but only the 3’ part would be present if transcription initiates at pIRL. In this case, the translation initiation signals would be exposed. Initial studies (Prère and Fayet, pers. comm.) have shown that translation from the longer transcript is very low but that deletion of its 5’ end to “liberate” the ribosome binding site (Fig.IS3.3 C) indeed results in a significant increase in translation. In the related IS2 element, a similar sequence appears to function as a DNA binding site for the OrfA protein, which represses promoter activity, but further studies are necessary to confirm this[26].
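
The sequestration argument can be made concrete with a short Python sketch that reads a dot-bracket string and reports what fraction of a chosen window (for example a ribosome binding site) is base-paired. The two structures below are invented toy strings, not the IS911 structures of Fig.IS3.3 C, and the function names are hypothetical.

```python
# Toy dot-bracket reader; structures and window coordinates are assumptions.
def parse_dot_bracket(structure):
    """Return sorted (i, j) index pairs for matching brackets."""
    stack, pairs = [], []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.append((stack.pop(), i))
        elif symbol != ".":
            raise ValueError("unexpected symbol %r" % symbol)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return sorted(pairs)


def fraction_paired(structure, start, end):
    """Fraction of positions in [start, end) that are base-paired."""
    paired = {i for pair in parse_dot_bracket(structure) for i in pair}
    window = range(start, end)
    return sum(i in paired for i in window) / len(window)


# Long transcript (external promoter): a window in the 3' arm of the stem is paired.
long_transcript = "((((((....))))))...."
# Short transcript (from pIRL): the 5' arm is absent, so the same window is free.
short_transcript = ".........."
print(fraction_paired(long_transcript, 10, 14))
print(fraction_paired(short_transcript, 0, 4))
```

In this toy case the window is fully paired in the long transcript and fully exposed in the short one, which is the qualitative behaviour proposed in the text.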

Formation of a strong transposase promoter

In common with many IS of other families (e.g. IS21[27], IS30[28], IS110[29][30]), the IRR of IS3 family members carries an outward-directed -35 promoter hexamer while IRL carries an inward-directed -10 promoter component (Fig.IS3.3 B). When the two ends are abutted in one of the key transposition intermediates, an excised transposon circle (see "Transposition Pathway"), these components are assembled into a strong promoter, pjunc, which serves to express high levels of transposition proteins (Fig.IS3.3 B and Fig.IS3.4). Transcription initiation from pjunc, like impinging transcription from outside the IS, would also produce an RNA which could sequester the translation initiation signals, but in a shorter and less stable stem-loop structure (Fig.IS3.3 C).

Fig. IS3.4. Left: Primer extension analysis of lac transcripts. Lanes 1 and 2: two independent cultures. Lanes 3 and 4: primer extension products obtained from identical quantities of total RNA isolated from two independent cultures. The major products are indicated by unfilled arrowheads (right). The scheme at the left shows the relative position of the IRR–IRL junction. Middle: Schematic of the different plasmid forms. Note that, to obtain results for the transposon junction, a copy was cloned into a suitable vector. Right: Colonies on MacConkey lactose plates.

Regulation by Methylation?

Several members carry GATC methylation sites within 50 bp of their ends. In one case, IS3, these have been shown to modulate transposition activity[31]. However, this is not a general characteristic of the family, nor is it restricted to any particular subgroup.

Insertion specificity

There appears to be little sequence specificity for insertion of members of the family. IS2 exhibits a preference for a region of bacteriophage P1 but the basis of this preference is at present unknown[32]. Both IS911[33] and IS150[34] have been found next to sequences which resemble their IRs (see “Targeted Insertion”), and IS1397 is invariably located within intergenic repeated sequences in E. coli (Bacterial Interspersed Mosaic Elements or BIMEs)[35].

Group II intron insertions

Finally, an element isolated from the ECOR collection of E. coli and closely related to IS3411 carries a group II intron[36]. The effect of this on regulation of transposition of this element has not been investigated.

IS3 family subgroups

The IS3 family is divided into five subgroups (Table Characteristics of IS families; Fig.4.2). This is supported by deep branching in the alignment of the various OrfA and OrfB sequences[37] (Fig.IS3.5). These are: the IS2 and IS407 subgroups (which appear closely related), and the IS3, IS51, and IS150 subgroups.

Additional members of the family identified subsequently also tend to follow this pattern. One feature which lends biological credence to these subgroups is that they also clearly appear clustered (with some exceptions) in the results of the alignments with the upstream OrfA protein[37]. Moreover, there is some correlation between the members of each group and the number of base pairs of target DNA duplicated on insertion (DR): for those elements in the IS2 subgroup, insertion invariably leads to a 5 bp DR; for the IS407 subgroup a 4 bp DR is observed; while for the other groups a 3 bp DR is generated (Table Characteristics of IS families). In the latter cases some of the elements, e.g. IS911, have been shown to occasionally generate 4 bp repeats. This clustering is also exhibited to some extent in the nucleotide sequence of the terminal IRs (Fig.IS3.2) and is particularly marked in the IS2, IS51 and IS407 subgroups. It can also be observed in the primary sequence details of the putative leucine zipper[38].

Fig. IS3.5. Relationship of OrfA and OrfB in various IS3 family groups. Dendrogram based on the alignments of the amino acid sequences of predicted OrfA proteins from 40 elements (left) and 44 predicted OrfB frames (right) (adapted from Mahillon and Chandler 1998). The different colors indicate the different IS3 family groups, showing that both A and B frames are largely group-specific.

Family Exceptions

Several family members exhibit an organization which does not apparently conform to the generic IS3 member. In IS120, for example, the relationship between the reading phases of the upstream and downstream orfs appears to be +1 rather than -1 while in ISNg1 and ISYe1 the characteristic motifs of OrfB are distributed between reading phases. Other members, such as IS1076, IS1138, IS1221, and IS1141, exhibit only one long open reading frame. Although these may be true variants, it cannot at present be ruled out that the variations are simply due to errors in sequence determination.

Mycoplasma and the non-universal genetic code

Family members from Mycoplasma merit special attention. Not only do the hosts use a non-universal genetic code in which the opal termination codon TGA directs the insertion of tryptophan (see [39]), but their genomes are among the smallest bacterial genomes known and extremely rich in A+T. To date, several different IS3 family members have been observed in Mycoplasma. Of these, only IS1138 (and IS1138b) has been demonstrated directly to undergo autonomous transposition[18]. All exhibit similarly high AT levels and this unusual base composition could lead to difficulties in sequence determination. It is remarkable that typical IS3 family characters have been maintained in such an "extreme" genetic environment. Nine individuals are closely related and form a group of iso-elements which have been called IS1221. As indicated above, one of these carries a single long reading frame (representing orfA + orfB) instead of two consecutive overlapping frames. The others each carry insertions or deletions which destroy either the equivalent of orfA, orfB, or both. Expression studies in E. coli indicate that a protein, equivalent to OrfAB, is indeed produced from the long open reading frame of IS1221. Interestingly, it appears that a second truncated protein, equivalent to OrfA, may be generated from the single orfAB frame by translational frameshifting, representing an "inverted" expression pattern compared with the majority of the family members[40]. Although this appears not to be a general rule for IS3 family members originating from Mycoplasma hosts, the presence of a similar single-frame arrangement in a second member, IS1138, indicates that it might not be rare. Because of the extremely high AT content of these elements, many potential frameshift windows of the A6G(/C) or A7 type are expected to occur. Only direct experiments will, therefore, be able to determine which, if any, of these sequences are used to generate the Tpase or, conversely, an OrfA-like protein.
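
The consequence of reading TGA as tryptophan can be illustrated with a small Python sketch that translates the same toy reading frame with the standard code and with a Mycoplasma-like variant in which only TGA is reassigned. The sequence is invented for illustration and is not taken from IS1138 or IS1221; the table and function names are hypothetical.

```python
# Toy comparison of the standard code and a code in which TGA encodes Trp.
# The ORF below is an invented example, not a real IS sequence.
from itertools import product

BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
MYCOPLASMA = {**STANDARD, "TGA": "W"}  # opal codon read as tryptophan


def translate(dna, table):
    """Translate from position 0 until the first stop codon in the table."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = table[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


toy_orf = "ATGAAATGAGGTTGGTAA"
print(translate(toy_orf, STANDARD))    # truncated at the in-frame TGA
print(translate(toy_orf, MYCOPLASMA))  # read through TGA as W
```

A frame that looks interrupted under the standard code can therefore be a single open frame in the native host, which is why annotation of these elements depends on the translation table used.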

A clade with non-canonical IR

A clade carrying non-canonical ends has recently been identified. These IS include 7 supplementary base pairs on each end flanking canonical IS3 ends: a conserved stretch of 5 C residues is located 5’ to the left IR and a less conserved motif (CGG) is located 3’ to the right end. When these additional bases are taken into account every member of this clade exhibits a 4 bp DR characteristic of the IS3 family (Table Characteristics of IS families) (Gourbeyre, pers. comm.). This conclusion is supported by the presence of multiple IS copies (e.g. ISPsy31) and also by identification of “empty sites”. This clearly requires further experimental investigation.

An additional subgroup

Recently, an additional subgroup has been proposed which includes ISPpy1[41]. However, all members belong to the IS150 subgroup and their Tpases are not separated by our standard multiple alignments and MCL analysis. Although they do exhibit some variation in the sequence of their terminal dinucleotides, similar variations are found for IS2 and members of other IS3 subgroups.

Mechanism

Transposition Proteins

Extensive alignment studies of the predicted OrfA and OrfB amino acid sequences between themselves and with those of other transposable elements[42][43][44][45][46] provided insights into structure/function relationships of the proteins (Fig.IS3.1 B).

OrfA

OrfA is small: for IS911 it has a predicted molecular weight of 11.5 kDa. The predicted primary amino acid sequences of most IS3 family members exhibit a similarly placed HTH signature (see for example [11][47]), which initially suggested that it might be involved in sequence-specific binding of the transposase, OrfAB, to the terminal IRs of the particular IS[48]; this was subsequently confirmed experimentally[49]. They also carry a C-terminal leucine zipper (LZ) motif, first identified in IS2, IS150 and IS3, which appears to be conserved in the majority of known members[50] and is involved in protein multimerization[11][21][38][50].

OrfB

The OrfB products carry a DD(35)E catalytic motif and share additional identities with retroviral integrases and various other Tpases[21][42][43][44][45][46][51]. These include two amino acids located 4 and 7 residues downstream from the glutamate residue.

IS911 OrfB is 299 residues long with a predicted molecular weight of 34.6 kDa. Its TAA termination codon lies just within IRR and may be significant in regulation. The OrfB initiation codon is AUU and consequently initiation occurs only at low levels[21][52] and is modulated by the level of initiation factor IF3[53].

OrfB has been observed for: IS3[20] (Prère & Fayet, unpublished), IS150[19], IS911[21][52][53] and IS3411/IS629[54][55] but not for IS2[56]. It is generally present at quite low levels although for IS3 approximately equal amounts of OrfB and OrfAB appear to be produced[20]. The IS150 OrfB initiation codon is out of phase with the rest of the gene and expression of full length OrfB would require a -1 frameshift after initiation.

Sequence analysis suggests that OrfB may in fact be synthesized by about 34% of IS3 family members through translational coupling: the stop codon of orfA overlaps with a potential orfB start codon (e.g. AUGA or GUGA) in 134 out of 399 ISs analyzed[24].
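
A minimal Python sketch of this kind of overlap test is given below. It walks the orfA frame to its stop codon and reports a potential coupled orfB start when the stop is part of an ATGA or GTGA tetranucleotide (the DNA equivalents of AUGA and GUGA). The function names and the toy sequences are hypothetical; the published analysis[24] may have used different criteria.

```python
# Toy detection of stop/start overlap (ATGA or GTGA) at the end of orfA.
# Sequences and helper names are illustrative assumptions.
STOPS = {"TAA", "TAG", "TGA"}
COUPLING_MOTIFS = {"ATGA", "GTGA"}


def find_stop(seq, start):
    """Index of the first in-frame stop codon at or after start, or None."""
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOPS:
            return i
    return None


def coupled_orfb_start(mrna, orfa_start=0):
    """Return the index of an overlapping start codon, or None if absent."""
    seq = mrna.upper().replace("U", "T")
    stop = find_stop(seq, orfa_start)
    if stop is None or stop < 1:
        return None
    if seq[stop - 1:stop + 3] in COUPLING_MOTIFS:
        return stop - 1  # one nucleotide upstream of the stop: the -1 phase
    return None


print(coupled_orfb_start("ATGAAACGATGACCGGTT"))  # overlap present
print(coupled_orfb_start("ATGAAACGCTAACCGGTT"))  # no overlap

# Proportion quoted in the text: 134 of 399 elements.
print(round(100 * 134 / 399, 1))
```

The last line simply checks the arithmetic behind "about 34%": 134/399 is roughly 33.6%.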

It is possible that the OrfB protein itself plays no direct role in transposition chemistry but that it is simply its translation signals which are important. Their recognition by the ribosome could modulate programmed translational frameshifting required to generate a single transposase protein, OrfAB, from the two reading frames orfA and orfB (see "Programmed translational frameshifting").

The OrfB amino acid sequence shares significant similarities with retroviral integrases, an observation which contributed to defining the highly conserved amino acid triad DDE common to all IS3 family members and to many enzymes of this phosphoryltransferase type[43][57]. This constitutes part of the active site (for reviews see: [48][52]).

OrfB carries neither the HTH nor the LZ motif.

OrfAB: a product of programmed ribosomal frameshifting (PRF)

OrfAB is assembled from orfA and orfB by a programmed –1 ribosomal frameshift occurring near the 3' end of orfA (see "Programmed translational frameshifting") first demonstrated for the related IS150[19].

The transframe protein combines the orfA HTH motif, an LZ motif and the orfB DD(35)E catalytic domain [50] (Fig.IS3.1 B).

OrfAB of IS911 (382 amino acids) shares its 86 N-terminal amino acids with OrfA (100 amino acids) and its 296 C-terminal amino acids with OrfB (299 amino acids).

Ribosome rephasing to generate OrfAB occurs on a group of “slippery” lysine codons with a frequency of about 15% (measured using systems driven by two different promoters: T7p10 and ptac). OrfA is therefore normally expressed at significantly higher levels than OrfAB. Frameshifting permits the combination of different functional protein domains (Fig.IS3.1 C).

IS3-family frameshifting is similar to that used in some retroviruses to generate the gag-pol "polyprotein"[58] and in the dnaX gene of E. coli to synthesize the γ subunit of DNA polymerase III[59].

The relevant IS911 sequences involved in frameshifting are shown in Fig.IS3.1 C. Examples of frameshifting sequences from other members of the family are shown in Fig.IS3.6. The group of slippery lysine codons is A AAA AAG and is directly preceded by the AUU OrfB initiation codon. Since E. coli does not encode a tRNALys with a 3’UUC5’ anti-codon for AAG, both lysine codons are decoded by the same tRNALys with a 3’UUU5’ anticodon. Its pairing is weaker with a G at the wobble position[60], probably because modifications of U34 increase the rigidity of the anticodon[61]. The presence of an upstream RBS (GGAG sequence) and a downstream secondary structure (Y-shaped stem-loop) stimulates ribosome rephasing in the -1 direction. What drives frameshifting is probably the thermodynamically favorable re-pairing of the two tRNALys from codons AAA-AAG to codons AAA-AAA[59][62]. The stimulators likely have a mechanical effect, bringing the ribosome and the mRNA back into register after tRNA slippage. Different groups of codons have been observed to allow rephasing of the ribosome[25] and, although the most common motif is A6G, different members of the IS3 family carry a variety of these (e.g. A3G for IS3; see Atkins & Gesteland, Recoding: expansion of decoding rules enriches gene expression, Springer 2010).
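
The re-pairing model described above can be sketched in a few lines of Python. The sketch assumes a simplified simultaneous-slippage model: the two tRNALys decode AAA-AAG in the zero frame, slip back by one nucleotide onto AAA-AAA, and the next codon therefore starts with the G of the motif. The toy mRNA, the function names and the assumption that the mRNA starts at the orfA initiation codon are all illustrative; stimulatory elements (RBS, stem-loop) are not modelled.

```python
# Toy model of a programmed -1 frameshift on an A AAA AAG motif.
# The mRNA is invented; it is not the IS911 sequence.
from itertools import product

BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate from position 0 until the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def frameshift_products(mrna, slippery="AAAAAAG"):
    """Return (zero-frame product, transframe product) for a toy mRNA."""
    seq = mrna.upper().replace("U", "T")
    m = seq.find(slippery)
    # Require the AAA-AAG codons (motif positions 1-6) to lie in the zero frame.
    while m != -1 and (m + 1) % 3 != 0:
        m = seq.find(slippery, m + 1)
    if m == -1:
        raise ValueError("no in-frame slippery motif found")
    orf_a = translate(seq)
    upstream = translate(seq[:m + 7])     # zero frame, through the AAG codon
    downstream = translate(seq[m + 6:])   # -1 frame: the motif's G is read again
    return orf_a, upstream + downstream


print(frameshift_products("AUGGCAAAAAAGUGACCGAUUAA"))
```

In this toy case the zero-frame product stops immediately after the two lysines, while the transframe product continues into the -1 frame, mirroring the relationship between OrfA and OrfAB.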

Fig. IS3.6. Signals and predicted branched stem-loop structures in the frameshift regions of IS911, IS3, IS3411, and IS1222. This figure, adapted from Sharma et al., 2014 (IS911, IS3), Mazauric et al., 2008 (IS3411) and Mejlhede et al., 2004 (IS1222), illustrates several of the different potential secondary structures located downstream of the group of “slippery” codons at which a programmed -1 translational frameshift occurs. These include stem-loop structures in all cases, but may also involve the formation of a pseudoknot which enhances ribosome slippage and an upstream ribosome binding site (SD sequence).

Two similarly located partially overlapping reading frames in IS3, IS150 and IS3411[54] also produce three proteins. The transposases, OrfAB, like that of IS911, are fusion products of the two orfs generated by a –1 translational frameshift.

For IS3, frameshifting is also stimulated by a presumed H-type pseudoknot structure similar to those generally involved in viral recoding[63]. In IS3411, -1 slippage on a U UUU motif requires a more convoluted form of pseudoknot structure formed by pairing of an apical loop and an internal loop belonging to two hairpins located 65 nucleotides apart on the mRNA[54]. Two similarly arranged orfs occur in IS2 and have been shown to encode OrfA and OrfAB equivalents only[26][56]. This organization is observed in most members of the IS3 family but, besides the cases mentioned above, frameshifting has been analyzed experimentally only in a few other, less well-characterized, elements (including IS51, IS222, IS600, IS1133, IS1222).

The frequency of frameshifting is quite variable from element to element: reported values are 15% for IS911, 50% for IS150, 6% for IS3 and 2% for IS3411[54]. These values may not reflect the in vivo situation since they were not established by direct measurement of the amount of the OrfA and OrfAB proteins synthesized from an intact IS, but after modification of expression signals of the IS genes or after cloning the frameshift signals in a reporter system[19][20][21].
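
To give a feel for what these frequencies imply, the following Python sketch converts a frameshifting frequency into an expected OrfA:OrfAB synthesis ratio. It assumes, as a deliberate simplification, that every ribosome reaching the frameshift window produces either OrfA (no shift) or OrfAB (shift), and it ignores independent OrfB initiation, protein stability and the caveats on measurement noted above. The function name is hypothetical.

```python
# Back-of-the-envelope OrfA:OrfAB ratios from reported frameshift frequencies.
# Simplifying assumption: each ribosome yields OrfA (1 - f) or OrfAB (f).
REPORTED_FREQUENCY = {"IS911": 0.15, "IS150": 0.50, "IS3": 0.06, "IS3411": 0.02}


def orfa_to_orfab_ratio(frequency):
    """Expected molar synthesis ratio OrfA:OrfAB for a frameshift frequency."""
    if not 0 < frequency < 1:
        raise ValueError("frequency must lie strictly between 0 and 1")
    return (1 - frequency) / frequency


for name, frequency in REPORTED_FREQUENCY.items():
    print(name, round(orfa_to_orfab_ratio(frequency), 1))
```

Under this assumption a frequency of 15% corresponds to between five and six OrfA molecules synthesized per OrfAB molecule, whereas 50% corresponds to equal synthesis of the two.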

The level of formation of the circular IS911 transposition intermediate, which carries abutted left and right ends forming an IRR-IRL junction (see "Transposition Pathway"), measured by PCR, indeed depends on frameshifting frequency in vivo[64]. IS911 copies from several clinical isolates that contained variations in the frameshift region exhibited various reduced levels of frameshifting. When these variations were introduced into the model IS911, they resulted in comparable reductions in circle formation.

Frameshifting is likely modulated by the physiological state of the host cells and by the environment: for example, frameshifting decreases when the temperature is raised or when ribosome density on the mRNA is increased (O. Fayet, pers. comm.).

Artificial orfA-orfB fusion

For experimental purposes, production of OrfAB without necessitating a translational frameshift is obtained by introduction of a single additional base pair within the frameshift region which artificially fuses the orfA and orfB frames and eliminates OrfA production[21]. It was initially difficult to construct this mutant in the context of an entire IS911 (i.e. with the two flanking IR) but more recently this has been accomplished using a longer artificial IS and resulted in an exceptionally high transposition frequency[65]. A similar mutant in IS3 results in a high frequency of adjacent deletions[20].

Structural motifs

Although no structural information is available from crystallography, the roles of the HTH and LZ motifs have been probed in vivo and in vitro.

The conserved N-terminal helix-turn-helix (HTH) motif is related to the LysR family of bacterial transcription factors and has a highly conserved tryptophan residue similar to that of certain homeodomain protein HTH motifs. This domain is important in directing transposase to bind IS911 IR[49] and is present in most IS3 family members (Fig.IS3.7 A). The N-terminal helices of the related IS2 transposase are also involved in IR binding[49].

Fig. IS3.7A. Sequence alignments of the HTH motif. Top. Alignment of the predicted HTH motif of the transposase of the five defining members of subgroups within the IS3 family with that of IS911. Identical or similar residues are boxed; bold lower case characters represent residues that fit the consensus. Bottom. An expanded view of the IS911 HTH motif with (below) mutated residues used in defining DNA binding functions.

Many members carry a putative leucine zipper located at the end of OrfA (sometimes extending into the OrfB region of the OrfAB protein) (see [40] [66][67]). Studies with IS911 and IS2 indicate that this is a multimerization domain of the proteins[38][50][68]. The LZ motif of IS911 is composed of four heptameric units (Fig.IS3.1 B) with a predicted coiled coil structure including a potential buried inter-subunit hydrogen bond across the dimer interface (Fig.IS3.7 B), to maintain the zipper in a dimeric state, and correctly placed residues with opposite charges potentially able to form characteristic inter-subunit salt-bridges to stabilize the dimeric structure[50]. Leucine zipper motifs are found in most IS3 family members (Fig.IS3.7 C).

Fig. IS3.7B. A) OrfAB is shown at the top. The relative positions of the A and B domains are indicated together with those of the helix-turn-helix (HTH), leucine zipper (LZ), and DD(35)E motifs. M is a second region necessary for correct multimerization. The numbers below indicate the positions in amino acid residues. The single amino acid sequence below shows the LZ motif with the four-component heptad repeats indicated below and the leucine repeat highlighted. Repeating positions are indicated by the letters a to g. The changes in LZ sequence resulting from frameshifting between OrfA and OrfAB are also shown. B) A helical wheel diagram showing a head-to-head homodimer conformation to portray the predicted hydrophobic core (positions a and d) and electrostatic interactions (positions e and g). Arrows of decreasing size and intensity are directed towards the carboxy-terminal end.
Fig. IS3.7C. Conservation of the leucine zipper motif throughout the different IS3 family subgroups. Alignment of predicted coiled-coils in the OrfA proteins of members of the five IS3 family subgroups. Leucine residues are highlighted in red and other significant residues in blue. Adapted from Haren et al., 2000.

OrfAB and OrfA form both homomultimers and mixed OrfAB-OrfA multimers[38][50].

Mutation of specific critical residues in the OrfAB LZ reduces the level of transposition intermediates in vivo and in vitro[69] (see "Transposition Cycle") and reduces or prevents multimer (dimer) formation. OrfAB and OrfA share three of their four heptads (Fig.IS3.7 B). The last heptad of each differs in sequence due to the translational frameshift which occurs within this heptad in the expression of OrfAB. This presumably results in different strengths of monomer-monomer interactions in the case of homo- and hetero-multimers and this may be involved in the regulation of transposition. A poorly defined region, M, located between residues 109 and 135 (Fig.IS3.1 B) and components in the catalytic domain of OrfAB are also involved in its multimerization.

Co-translational DNA binding

IS911 OrfAB has a strong cis preference in vivo[65]. It has about a 200-fold higher activity on the IS copy from which it is expressed (in cis) than in trans. This prevents activation of transposition of one IS copy by OrfAB expressed from a second copy in the same cell. The strength of the cis effect depends on the distance of the transposase gene from the IS ends. Also, modification of the translational frameshifting pause signal has a strong influence on cis preference, presumably by delaying translation and folding of the C-ter domain, thus increasing the chance that the folded N-ter domain will recognize and bind its target IR.

In vitro analyses using ribosome display, with an E. coli-derived coupled transcription-translation system and size exclusion chromatography[65], demonstrated that an added IR bound nascent OrfAB derivatives while they were still attached to the ribosome. Ternary complexes containing mRNA, ribosome, and a nascent peptide specifically bound added IR copies when only the N-ter 149 amino acids extended from the ribosome, whereas a full-length Tpase exiting the ribosome did not.

Direct evidence of coupled translational binding (Fig.IS3.8) was obtained using a staged coupled transcription/translation reaction: nascent OrfAB bound the IR before its synthesis was complete but not after. Thus OrfAB can efficiently bind the IR only prior to its complete translation.

Fig. IS3.8. This schematic, not to scale, shows the insertion sequence with its left (IRL) and right (IRR) ends in green. RNA polymerase, RNAP, is shown in pale green in the process of transcribing from the promoter pIRL. The mRNA is shown in dark green with a ribosome (blue) paused at the frameshift secondary structure. The nascent OrfAB peptide (brown) is shown binding to IRL while undergoing translation. Above is shown the full-length OrfAB in a folded configuration, proposed to prevent its binding to the IR as a completed protein.

Co-translational multimerisation

An intriguing question arising directly from these results is how OrfAB multimerizes, as is found in the transpososome, to bind both ends of the IS. Stable formation of the important synaptic complex containing both IS ends and the transposase requires a dimeric OrfAB (see "The IS911 transpososome" below). It is therefore possible that dimerization is in some way directly associated with translation. Indeed, using luxA and luxB as a model system, it has been shown that luxA/B subunit assembly initiates cotranslationally on nascent LuxB in vivo. Protein assembly appears to be directly coupled to translation and involves “spatially confined, actively chaperoned cotranslational subunit interactions”[70].

The IS911 transpososome

A crucial checkpoint in transposition is the assembly of the 'transpososome'. This step is a general prerequisite for initiating DNA cleavage and the subsequent chemical steps in transposition for most elements that use a DNA (rather than RNA) transposition intermediate. In this protein-DNA complex, both ends of the transposon are bridged by the transposase before it catalyzes the DNA strand cleavages and strand transfers necessary for transposon mobility[71][72][73]. The transpososome adopts very precise architectures to accomplish these steps, and undergoes defined changes throughout the transposition process.

The overall IS911 transposition pathway is a two-step process, involving replicative excision followed by insertion (Fig.IS3.9 A and 9B). This implies consecutive assembly of two types of transpososome: one, implicated in IS excision (synaptic complex A; SCA), includes both IS ends, while the other (synaptic complex B; SCB) involves the circle junction with its abutted IRs and ensures its integration into the target DNA.

Fig. IS3.9A. IS911 is shown in green, the flanking donor DNA in black, and the target DNA in blue. Transposon ends are shown as green filled circles. The small arrows shown in Figure 4 have been omitted for brevity. (A) Donor plasmid carrying the insertion sequence (IS). (B) Formation of the first synaptic complex SCA and cleavage of the left or right inverted repeat (IR) and attack of the other end. (C) Formation of a single-strand bridge to create a figure-eight molecule if the donor is a plasmid, as shown here. (D) The products of IS-specific replication: the double-strand circular IS transposition intermediate and the regenerated transposon donor plasmid. The replicated strand is shown as a green dotted line. (E) Formation of the second synaptic complex SCB and engagement of the target DNA (blue). (F) Cleavage of the IS circle and integration. (G) The newly integrated IS.
Fig. IS3.9B. Top. Cartoon of the IS911 figure eight (left) and IS circle (right). Bottom. Electron microscopy of figure eight (left) and IS circle (right). DNA has been coated with RecA protein to distinguish double- and single-stranded DNA; single-stranded DNA occurs in the "crossover" region of the figure eight molecule on the left. Electron microscopy by Edouard Boy de la Tour and Lucian Caro.

Excision synaptic complex SCA.

Using a band shift assay and IR of different lengths (the so-called “long-short” experiment) it was shown that the truncated OrfAB [1-149] forms a complex with two IR copies, the paired-end complex (PEC)[38] equivalent to the SCA. An intact OrfAB [1-149] LZ is necessary for correct PEC/SCA formation[38][50]. At higher OrfAB [1-149] concentrations a probable single end complex (SEC) composed of one IR and OrfAB [1-149] appeared. Addition of OrfA disturbed both PEC/SCA and SEC and generated a fast migrating species whose composition remains to be determined but does not appear to contain OrfA itself [38].

DNaseI and copper phenanthroline footprinting revealed that OrfAB [1-149] protects a sub-terminal (internal) IR region including two conserved sequence blocks in the left (IRL) and right (IRR) ends (Fig.IS3.3 A). DNA binding assays in vitro and measurement of in vivo recombination activity of sequential IR deletion derivatives suggested a model in which the N-terminal region of OrfAB binds the conserved boxes in a sequence-specific manner and anchors the two IRs into the SCA. The external region of the inverted repeat was proposed to contact the C-terminal transposase domain carrying the catalytic site[74].

SCA is composed of a dimer of transposase bridging two IR[75], as judged by the use of a tagged and untagged truncated transposase derivative, OrfAB[1-149], and also of IR of different lengths. OrfAB[1-149] assembles two IRR copies in a parallel orientation (Fig.IS3.4)[75] as studied at the single molecule level by Atomic Force Microscopy (AFM) using asymmetric IRR-carrying DNA fragments.

SCA assembly was also studied using a second single-molecule approach: tethered particle motion (TPM) (Fig.IS3.10)[76], in which a DNA molecule is tethered to a glass support and its effective length is measured by observing the Brownian motion of a bead attached to its free end (Fig.IS3.10 left). OrfAB[1-149] binding to a single IR provoked a small shortening of the DNA, consistent with a DNA bend introduced by protein binding to the IR, which was confirmed using EMSA. When two ends were present on the tethered DNA in their natural, inverted, configuration, OrfAB[1-149] not only provoked the short reduction in length but also generated species with greatly reduced effective length (Fig.IS3.10 middle and top right), consistent with DNA looping between the ends and thus SCA formation. SCA is very stable and kinetic analysis in real-time suggested that passage from the bound unlooped to the looped state could involve another unlooped species of intermediate length in which OrfAB[1-149] is bound to both IRs. DNA carrying directly repeated IR also gave rise to the looped species but the level of the intermediate species was significantly enhanced (Fig.IS3.10 middle and bottom right). Its accumulation could reflect a less favorable SCA formation with directly repeated IR copies than with inverted IR. This is compatible with a model in which OrfAB binds separately to and bends each IR and protein-protein interactions then lead to SCA formation (Fig.IS3.11 A)[77]. Cleavage and strand transfer would then give rise to a species in which both IS ends are joined by a single strand bridge (a figure-eight on a circular plasmid; Fig.IS3.9 C) (see "The Transposition Pathway").

Fig. IS3.10. IR pairing by Tethered Particle Motion. The figure is adapted from Pouget et al., 2006

Insertion synaptic complex SCB

SCB has not been characterized as precisely as SCA. SCB is devoted to the insertion step of the transposition process. Two types of insertion, IR-targeted and non-targeted, have been observed (Fig.IS3.11 B). It has been proposed that two different protein-DNA complexes are assembled during the two types of insertion reaction: SCBt and SCBnt (for targeted and non-targeted synaptic complex, respectively)[78]. Nothing is known about the stoichiometry or the geometry of these complexes. However, based on the protein and DNA requirements for protein-DNA complex formation (as judged by band shift) and for transposition products (as judged by in vitro and in vivo transposition assays), it has been proposed that SCBt is composed of a transposase dimer bridging a DNA molecule carrying an IR and a DNA molecule carrying an IRR-IRR junction (IS911 circle), the product of the replicative IS911 excision. This IR-targeted insertion explains how the original isolate of IS911 might have occurred next to a sequence which strongly resembles an IR[11] and can also explain one-ended insertion[33]. In this regard, IRR shows a somewhat higher affinity than IRL. Note that if one of the two IRs carried by the circle is omitted, SCBt resembles SCA (Fig.IS3.11).

Fig. IS3.11. Proposed configuration and composition of synaptic complexes SCA and SCB involved in different steps of the IS911 transposition cycle. (A) The excision complex SCA. The tips of the insertion sequence (IS), which are not protected by the truncated transposase OrfAB[1–149], are shown as green circles containing an arrowhead. IRs are indicated by thick black lines and the IS as green lines. Full-length OrfAB, which is presumed to cover the entire IR, is shown bound as a monomer to each end and introducing a small bend in the DNA. Dimerization creates SCA, resulting in the pairing of both IRs and in the formation of a DNA loop which includes the IS. Finally, a cleavage and strand transfer event results in the formation of a single-strand bridge between the IRs. (B) The integration complex SCB. Symbols are as in (A). In the left-hand column, the IS circle intermediate with its newly replicated strand (dotted line) is shown forming a complex between an IR in the circle and a second IR in the target to give SCBt. Cleavage and strand transfer are shown forming a single-strand bridge between the two IRs. RecG helicase is thought to intervene to drive strand migration before a second cleavage and strand transfer results in the integration of the circle. This would explain the integration of the many different ISs observed to occur next to a resident IR in the target. The right-hand column shows untargeted integration involving OrfA and OrfAB. OrfA is known to interact with OrfAB. It also changes OrfAB binding in some way, but it is not clear whether it remains in the complex.

SCBnt is thought to differ from both SCA and SCBt and to include the second IS911 protein, OrfA. This protein, which binds non-specifically to DNA and interacts with OrfAB[38][50], is proposed to direct an OrfAB-junction complex to a randomly chosen target DNA to form SCBnt[78][79]. This is based on the observation that integration of the transposon circle intermediate is greatly stimulated by preincubation of OrfAB and OrfA in an in vitro reaction[80].

The Transposition Pathway

The IS3 family is one of an increasing number of IS families known to transpose using a double-strand circular DNA intermediate. Closely related pathways have been demonstrated for IS1[81], IS2[22], IS3[82], and IS150[83]. This represents a major transposition pathway which has yet to be widely recognized. As shown in Fig.IS3.9 and the animation below, IS3 family transposition proceeds through a copy-out-paste-in process.

IS911 transposition mechanisms
IS911. copy-out-paste-in mechanism

The Figure-eight form

The initial step is recognition of the IR by OrfAB (presumably during its translation) (IS911 movie above) and assembly of SCA to correctly position the DNA ends and the transposase catalytic site for the subsequent chemical steps. Like all known DDE transposase-catalyzed reactions[84], IS911 transposition proceeds by cleavage of a single strand at the transposon end, generating a 3’-OH. This then attacks a target phosphodiester bond in a strand transfer reaction. The distinctive feature of this copy-out-paste-in mechanism is that initial cleavage occurs at only one transposon end, either left or right (Fig.IS3.9). This single liberated 3’-OH directs strand transfer to the same strand, 3 bases 5’ to the other end of the element. This generates a molecule in which a single transposon strand is circularized to produce a single-strand bridge, giving a figure-eight structure on a circular plasmid donor molecule (Fig.IS3.12) which can be easily observed in vivo[85]. The IRs are joined by the single-strand bridge and separated by three bases derived from flanking DNA from either the left or right end. The three (or four) bp direct repeats flanking the original insertion are not required for further transposition (as also shown for IS3[86]), and an IS911-based transposon engineered to have different flanks generates a mixed population of figure-eight molecules with one or other flank sequence. Preventing cleavage of one or other transposon end resulted in a homogeneous population carrying the 3 nt DNA flank associated with the mutant end, confirming that IRL can attack IRR and vice versa. The reaction can be viewed as a one-ended site-specific transposition event. These initial steps can be accomplished by OrfAB alone. However, it should be noted that in the presence of OrfA, no figure-eight molecules or IS circles could be detected by a simple gel assay in vivo, although IS circles were found using a PCR approach[64].
This suggests that OrfA may play a role in negatively regulating the initiation of transposition. A similar conclusion has been reached for OrfA of IS3[87]. Alternatively, OrfA may stimulate the disappearance of figure-eight molecules and IS circles (see below), since no effect of OrfA was observed on figure-eight formation in vitro. Together with the fact that OrfAB is normally produced at low levels from a weak promoter[21], this suggests that initiation of transposition to form the figure-eight intermediate may be stochastic.
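The flank-selection rule described above can be illustrated with a minimal Python sketch. It assumes a simplified top-strand representation of the donor (left flank, IS, right flank) and simply returns the three flanking bases expected between the joined IRs. The function names `junction_spacer` and `figure_eight_population` and the flank sequences are hypothetical placeholders chosen for illustration; they are not taken from the cited studies.

```python
def junction_spacer(left_flank, right_flank, cleaved_end, spacer_len=3):
    """Return the flanking bases expected between the joined IRs.

    Simplified model of the rule in the text: the cleaved end attacks the
    same strand `spacer_len` bases outside the other (uncleaved) end, so the
    junction carries the flank adjacent to the uncleaved end.
    Flanks are written 5'->3' on the top strand.
    """
    if cleaved_end == "IRR":
        # IRR is the strand donor; the retained bases are those next to IRL.
        return left_flank[-spacer_len:]
    if cleaved_end == "IRL":
        # IRL is the strand donor; the retained bases are those next to IRR.
        return right_flank[:spacer_len]
    raise ValueError("cleaved_end must be 'IRL' or 'IRR'")


def figure_eight_population(left_flank, right_flank, active_ends=("IRL", "IRR")):
    """Set of junction spacers expected when the listed ends can be cleaved."""
    return {junction_spacer(left_flank, right_flank, end) for end in active_ends}


# Hypothetical, non-identical flanks (placeholder sequences, not real data).
left, right = "TTTAAC", "GGCTTT"
print(sorted(figure_eight_population(left, right)))            # both ends active
print(sorted(figure_eight_population(left, right, ("IRR",))))  # IRL cleavage blocked
```

With both ends active the sketch yields two spacer sequences, mirroring the mixed population described above. Restricting cleavage to one end leaves only the flank next to the blocked end.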

Fig. IS3.12. Agarose gel electrophoresis of DNA extracts from cells carrying a donor plasmid in the presence of high levels of transposase. First panel (left): cartoons of three IS911-related species. From top to bottom: the donor plasmid, the figure-eight molecule, and the IS circle. IS911 is shown in green, plasmid backbone in black and the transposon ends as red dots. Second panel: ethidium bromide-stained agarose gel showing various DNA species, including the plasmid which was used to supply transposase. Third panel: electron micrographs of RecA-coated figure-eight molecules and IS circles.

The circular intermediate

Kinetic data[65][85] indicate that the figure-eight gives rise to the circular transposon form, which can easily be detected in vivo and in which the IRs are abutted and separated by three base pairs of DNA flanking the original insertion (Fig.IS3.9 and Fig.IS3.12). As for figure-eight molecules, a transposon engineered to have different flanks generates a mixed population of transposon circles with one or the other 3 bp flank located at the junction[88].

Studies in vivo using a labeling protocol and a temperature-sensitive plasmid as transposon donor demonstrated that conversion from the figure-eight to the transposon circle occurs by semiconservative replication where the circular intermediate is “copied out” leaving a copy in the transposon donor molecule[89] (Fig.IS3.9). This is transposon-specific, requires OrfAB (presumably to generate the figure eight and generate a 3’-OH on the IS911 DNA flank) and does not depend on replication from the donor plasmid origin of replication[89].

Donor plasmids in which one or other IR is inactivated for cleavage would be expected to reveal whether one or other of the 3’-OH groups is used in transposon replication. This was tested using the Tus/ter system[90][91][92][93] (which blocks passage of a replication fork in an orientation-specific fashion) cloned into the transposon in one or other orientation. In the presence of Tus protein, no transposon circles were observed if the orientation of the ter site was that expected to block replication from one or the other end[89].

At present, it is not known how OrfAB is removed or how this replication step is initiated or terminated to generate the final circles. It is possible that these processes involve host factors and mechanisms similar to those which operate in replicative transposition of bacteriophage Mu (see [94][95][96]).

RecG helicase is implicated in targeted insertion. In this process, strand transfer occurs between one cleaved end of the IS circle and a target IS911 end to create an intermolecular single-strand bridge rather than the intramolecular bridge of the figure-eight intermediate (Fig.IS3.13). Resolution of this structure implicates branch migration and replication from the donor plasmid[97]. This reinforces the idea that host proteins, including components of the replication machinery, are loaded onto figure-eight intermediates.

Fig. IS3.13. In vitro reactions were performed using purified IS911 circles which included a chloramphenicol resistance gene and a plasmid target with a promoterless lacZ gene. Following a standard in vitro reaction, the reaction mixture was used to transform competent E. coli with selection for chloramphenicol resistance. Lines on the interior and exterior of the plasmid circle represent different orientations of insertion.

Integration of the circular intermediate

The IR junction formed by IS circularization is very unstable in the presence of OrfAB and undergoes high levels of deletion and insertion in vivo[98] and in vitro[80]. Transposon circle insertion presumably requires further transposase synthesis.

A remarkable consequence of transposon circle formation is the assembly of a strong promoter, pjunc, from a –35 hexamer contributed by IRR and a –10 hexamer contributed by IRL (Fig.IS3.3 B). The 3 (or more rarely 4) bp which separate IRL and IRR in the circle provide an ideal spacing between the –35 and –10 elements[98]. The junction promoter, pjunc, is 30- to 50-fold stronger than the indigenous promoter, pIRL[98] (Fig.IS3.4), and more than two-fold stronger than lacUV5[30]. It is correctly placed to drive high levels of transposase synthesis and plays an active role in controlling IS911 transposition.
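The dependence of hexamer spacing on the junction length can be sketched with a small calculation. The positions of the –35 and –10 hexamers within the IRs are not given here, so the two offsets below are placeholder values chosen only for illustration, not measured values for IS911. The 16 to 18 bp window reflects the spacing commonly described as favourable for sigma70-type promoters and is likewise an assumption of this sketch. The function names are hypothetical.

```python
def hexamer_spacing(offset_35_from_irr_tip, offset_10_from_irl_tip, junction_bp):
    """Base pairs separating the -35 and -10 hexamers at a circle junction.

    The spacer is the sum of the bases between the -35 hexamer and the IRR
    tip, the bases between the abutted IRs, and the bases between the IRL
    tip and the -10 hexamer.
    """
    return offset_35_from_irr_tip + junction_bp + offset_10_from_irl_tip


def in_typical_range(spacing, low=16, high=18):
    """True if the spacing falls in the assumed favourable window."""
    return low <= spacing <= high


# Placeholder offsets (NOT measured IS911 values), for illustration only.
OFFSET_35_IRR = 7
OFFSET_10_IRL = 7

for junction_bp in (0, 3, 4):
    spacing = hexamer_spacing(OFFSET_35_IRR, OFFSET_10_IRL, junction_bp)
    print(junction_bp, spacing, in_typical_range(spacing))
```

Under these assumed offsets, a junction of 3 or 4 bp places the hexamers within the window, whereas directly abutted ends would not. The real outcome depends on the actual hexamer positions in IRR and IRL.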

Inactivation of pjunc by mutagenesis strongly reduced IS911 transposition in vivo when transposase was expressed in its native configuration[30]. Moreover, the truncated OrfAB derivative, OrfAB[1-149], which specifically binds IRR and IRL, reduced in vivo promoter activity 10-fold in a mutated junction resistant to cleavage. Full-length OrfAB, which binds the IR only weakly, and OrfA, which does not specifically bind the IR, had no effect[30]. Integration results in disassembly of pjunc, providing a powerful feedback mechanism that results in transient and controlled activation of integration only in the presence of the correct (circular) intermediate.

For the related IS2, this junction promoter is required for transposition[99].

Circle junction formation brings both transposon ends together in an inverted orientation. This active junction must then participate in the second type of synaptic complex, which includes target DNA (Fig.IS3.9 and Fig.IS3.11 B).

Two single strand cleavages, one at each abutted IR, would linearize the transposon circle permitting the two liberated 3'-OH groups to direct coordinated strand transfer (Fig.IS3.9 and Fig.IS3.11 B). The final step requires OrfAB but is greatly stimulated by OrfA and is sensitive to the ratio of OrfAB/OrfA[80].

It is not known whether target capture occurs before or after cleavage of the circle junction. However, linear copies of IS911 are produced from transposon circles in vitro and, in the presence of high OrfAB levels, in vivo, and a pre-cleaved linear transposon was a robust substrate for integration in vitro[100].

Based on kinetics and on the formation of the strong pjunc promoter, we favor a model in which the IS circles represent a reservoir of transposition intermediates and in which linear forms are generated from the IS circles during the integration process.

This has also been proposed for IS3[86].

Targeted Insertion

As stated above, several ISs, including IS911, show a preference for integration next to sequences in the target similar to their IR. One way of understanding this is that the transposon circle is able to form a synaptic complex (SCBt; Fig.IS3.11 B left) which is similar to SCA (Fig.IS3.11 A) but which occurs “in trans” between an IR of the transposon circle and an IR in the target. In the case of IS911, this phenomenon occurs more frequently if OrfA is not present (Fig. IS3.13), and it was proposed that one role of OrfA is to promote dispersion of the IS[78][101].

This type of one-ended intermolecular recombination/integration has been analyzed in some detail[97][101][102].

IR-targeted insertion involves the transfer of a single end of the junction to the target IR to generate a branched DNA structure. The single-end transfer (SET) intermediate, but not the final insertion product, was detected in vitro. This implies that SET intermediates must be processed by the bacterial host to obtain the final insertion products. Sequence analysis of in vitro and in vivo IR-targeted insertion products revealed high levels of DNA sequence conversion in which mutations from one IR were transferred to another. These sequence changes could not be explained by the classic transposition pathway but could be understood in terms of a mechanism in which SET generates a four-way Holliday-like junction which is then processed by host-mediated branch migration, resolution, repair and replication. This pathway resembles those described for processing other branched DNA structures such as stalled replication forks. A version of this model is shown in Fig.IS3.14. Subsequent studies showed that the RecG helicase is implicated in vivo, as might be expected for strand migration[97].

Fig. IS3.14. IRR and IRL are shown in red and green respectively. A mutant terminal dinucleotide (pale red or green boxes) prevents donor activity but allows target activity. The three interstitial base pairs in the IR/IR junction are shown as grey and white circles to distinguish DNA strand polarity. The same convention is used for the three base pairs flanking the target mutant IRL*, shown as diamonds. Dotted lines: donor transposon circle; full lines: target DNA. The 3’ ends of Tpase-mediated nicks are indicated by arrows. Those which may exist transiently during second strand resolution are indicated by a gap. I, synapsis and cleavage at one end and strand transfer; II, the formation of a SET between donor and target; III, branch migration in the sense of the arrow creating hybrid IRL or IRL/IRR copies; IV, Holliday junction resolution, thick dashed lines; V, resolved product subsequently subject to mismatch repair and replication. Lower case roman numerals below indicate the type of final product. The differences between A, B and C depend on the IR which attacks the target. A. IRR attacks three base pairs from the target, IRL*. B. IRL attacks three base pairs from the target IRL*, leading to hybrid IRs in which one strand was derived from IRR and the other from IRL*. The figure shows the expected results if branch migration continued into the region of non-complementarity after the IRs. C. IRL attacks at the tip of the target IRL*.

Mechanism in other family members

Several other members of this family have also been analysed in some detail. These include IS2, IS3, and IS150. All three have been shown to generate circles when supplied with high levels of the fused frame Tpase[20][22][83][86][103].

IS3 also generates adjacent deletions[20] but, unlike IS911, appears to undergo excision from the donor molecule as a linear form following a staggered double strand break at each end. These forms have a 3 base 5' overhang and may be an alternative type of transposition intermediate[103]. Such forms may be equivalent to the linear IS911 species derived from transposon circles. In addition, IS3-derivative transposons in which two abutted ends have been engineered undergo high levels of transposition[31].

Insertion of IS3 generally creates 3 and sometimes 4 bp direct target repeats. It is significant that plasmids in which the IRs are separated by 4 bp are more active than those in which they are separated by 8 bp. In these studies, the authors were unable to engineer derivatives with two complete tandem IS3 elements. This may be the result of the formation of a strong hybrid promoter which, as described for IS911 and other ISs (see above), drives high levels of Tpase expression. This configuration of ends is equivalent to that found at the circle junction and suggests that abutted ends of IS3 are also efficient substrates in transposition.

IS2 generates direct target duplications of 5 bp on insertion[104] although transposon circles generated with this element carry only a single base pair separating IRL and IRR[22].
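How a staggered target cut leads to direct target repeats of a given length can be shown with a short sketch. This is a generic textbook-style model of integration followed by gap repair, written as string manipulation on one strand. It is not a description of the specific chemistry of any IS3 family member. The function name `integrate`, the target sequence and the `[IS]` marker are hypothetical.

```python
def integrate(target, element, position, duplication_len):
    """Insert `element` into `target` using a staggered target cut.

    The `duplication_len` bases starting at `position` end up on both sides
    of the inserted element, mimicking the direct target repeats produced
    when the single-strand gaps left by a staggered cut are filled in.
    """
    if not 0 <= position <= len(target) - duplication_len:
        raise ValueError("staggered cut does not fit inside the target")
    site = target[position:position + duplication_len]
    return target[:position] + site + element + target[position:]


# Placeholder target sequence; "[IS]" stands in for the whole element.
target = "ACGTTGCAAG"
print(integrate(target, "[IS]", 2, 3))  # 3 bp repeat, as generally seen for IS3
print(integrate(target, "[IS]", 2, 5))  # 5 bp repeat, as reported for IS2
```

In each product the duplicated target bases flank the inserted element on both sides, and the repeat length is set entirely by the stagger of the target cut.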

While IS2 carries a conserved terminal 5'-CA-3' at its right end, the left end terminates with 5'-TG-3'. This atypical IRL does not act as a strand donor but only as a target in the circularization reaction.

Functional studies indicate that the product of the upstream orfA may inhibit transposition[26]. It has been shown to bind specifically to IRL at a sequence that overlaps the -10 hexamer of the resident Tpase promoter and represses expression of OrfA.

It does not appear to bind IRR (note that in the original article the authors invert the standard definition of IRL and IRR[26]).

Several other elements also exhibit small inverted repeat sequences which flank the -10 hexamer of the putative resident Tpase promoter. IS2-derivative transposons in which two abutted ends have been engineered also undergo high levels of transposition[22][105] and, like IS911, the circle junction of IS2 also constitutes a strong promoter capable of driving Tpase expression. Several (but not all) IS3-family elements may also carry similarly located potential -35 and -10 sequences within their IRs.

Structural studies

Although there are at present no structural data available for any members of this family, recent results obtained with an IS from another family, ISCth4 from the IS256 family, which also undergoes copy-out-paste-in transposition, have provided some insights [106]. This particular transposition pathway is asymmetric in the sense that one IS end is cleaved and attacks the opposite end several nucleotides from the tip [107]. In accord with this type of mechanism, crystal structures of ISCth4 transposase bound to three different substrates show a transposase dimer bound asymmetrically to a single DNA substrate: a pre-reaction substrate with IRR together with its flanking DNA, a pre-cleaved complex in which the IRR flank had been removed, and a strand transfer complex including an abutted IRR and IRL separated by a gapped 6 base pair linker (Fig. IS256.8).

It is important to note that IS256 family transposases carry an alpha-helical insertion domain which separates the catalytic domain into two segments. This domain plays an important role in directing different DNA segments during the reaction. IS3 family transposases carry an uninterrupted catalytic domain without the alpha helical insertion domain implying that the atomic details of the process will differ. In this light, it is worth remembering that efficient insertion of IS911 transposon circles catalysed by OrfAB is greatly stimulated by inclusion of the upstream OrfA protein and is sensitive to the ratio of OrfAB/OrfA [108].

Bibliography

  1. [PubMed:4567156]
  2. [PubMed:1092667]
  3. [PubMed:1092668]
  4. [PubMed:1092669]
  5. [PubMed:383689]
  6. [PubMed:6277857]
  7. [PubMed:2832386]
  8. [PubMed:6094480]
  9. Craig NL, Lambowitz AM, Craigie R, Gellert M, editors. Mobile DNA II. American Society of Microbiology; 2002.
  10. [PubMed:26350305]
  11. Prère MF, Chandler M, Fayet O. Transposition in Shigella dysenteriae: isolation and analysis of IS911, a new member of the IS3 group of insertion sequences. - J Bacteriol: 1990 Jul, 172(7);4090-9 [PubMed:2163395]
  12. [PubMed:9278503]
  13. [PubMed:9302015]
  14. [PubMed:10496929]
  15. [PubMed:8751923]
  16. [PubMed:17347521]
  17. [PubMed:17320399]
  18. Bhugra B, Dybvig K. Identification and characterization of IS1138, a transposable element from Mycoplasma pulmonis that belongs to the IS3 family. - Mol Microbiol: 1993 Feb, 7(4);577-84 [PubMed:8096321]
  19. Vögele K, Schwartz E, Welz C, Schiltz E, Rak B. High-level ribosomal frameshifting directs the synthesis of IS150 gene products. - Nucleic Acids Res: 1991 Aug 25, 19(16);4377-85 [PubMed:1653413]
  20. Sekine Y, Eisaki N, Ohtsubo E. Translational control in production of transposase and in transposition of insertion sequence IS3. - J Mol Biol: 1994 Feb 4, 235(5);1406-20 [PubMed:8107082]
  21. Polard P, Prère MF, Chandler M, Fayet O. Programmed translational frameshifting and initiation at an AUU codon in gene expression of bacterial insertion sequence IS911. - J Mol Biol: 1991 Dec 5, 222(3);465-77 [PubMed:1660923]
  22. Lewis LA, Grindley ND. Two abundant intramolecular transposition products, resulting from reactions initiated at a single end, suggest that IS2 transposes by an unconventional pathway. - Mol Microbiol: 1997 Aug, 25(3);517-29 [PubMed:9302014]
  23. [PubMed:8384687]
  24. Sharma V, Firth AE, Antonov I, Fayet O, Atkins JF, Borodovsky M, Baranov PV. A pilot study of bacterial genes with disrupted ORFs reveals a surprising profusion of protein sequence recoding mediated by ribosomal frameshifting and transcriptional realignment. - Mol Biol Evol: 2011 Nov, 28(11);3195-211 [PubMed:21673094]
  25. Sharma V, Prère MF, Canal I, Firth AE, Atkins JF, Baranov PV, Fayet O. Analysis of tetra- and hepta-nucleotides motifs promoting -1 ribosomal frameshifting in Escherichia coli. - Nucleic Acids Res: 2014 Jun, 42(11);7210-25 [PubMed:24875478]
  26. Hu ST, Hwang JH, Lee LC, Lee CH, Li PL, Hsieh YC. Functional analysis of the 14 kDa protein of insertion sequence 2. - J Mol Biol: 1994 Feb 18, 236(2);503-13 [PubMed:8107136]
  27. [PubMed:2540414]
  28. [PubMed:3039299]
  29. [PubMed:10438765]
  30. Duval-Valentin G, Normand C, Khemici V, Marty B, Chandler M. Transient promoter formation: a new feedback mechanism for regulation of IS911 transposition. - EMBO J: 2001 Oct 15, 20(20);5802-11 [PubMed:11598022]
  31. Spielmann-Ryser J, Moser M, Kast P, Weber H. Factors determining the frequency of plasmid cointegrate formation mediated by insertion sequence IS3 from Escherichia coli. - Mol Gen Genet: 1991 May, 226(3);441-8 [PubMed:1645443]
  32. [PubMed:3035338]
  33. Polard P, Seroude L, Fayet O, Prère MF, Chandler M. One-ended insertion of IS911. - J Bacteriol: 1994 Feb, 176(4);1192-6 [PubMed:8106332]
  34. Welz C. Functionelle analyse des Bakteriellen Insertionelements IS150. PhD thesis: Fakultät für Biologie der Albert-Ludwigs-Universität Freiburg; 1993.
  35. [PubMed:9055066]
  36. [PubMed:7994604]
  37. Mahillon J, Chandler M. Insertion sequences. - Microbiol Mol Biol Rev: 1998 Sep, 62(3);725-74 [PubMed:9729608]
  38. Haren L, Normand C, Polard P, Alazard R, Chandler M. IS911 transposition is regulated by protein-protein interactions via a leucine zipper motif. - J Mol Biol: 2000 Feb 25, 296(3);757-68 [PubMed:10677279]
  39. [PubMed:1579111]
  40. Zheng J, McIntosh MA. Characterization of IS1221 from Mycoplasma hyorhinis: expression of its putative transposase in Escherichia coli incorporates a ribosomal frameshift mechanism. - Mol Microbiol: 1995 May, 16(4);669-85 [PubMed:7476162]
  41. [PubMed:23832000]
  42. Doak TG, Doerder FP, Jahn CL, Herrick G. A proposed superfamily of transposase genes: transposon-like elements in ciliated protozoa and a common "D35E" motif. - Proc Natl Acad Sci U S A: 1994 Feb 1, 91(3);942-6 [PubMed:8302872]
  43. Fayet O, Ramond P, Polard P, Prère MF, Chandler M. Functional similarities between retroviruses and the IS3 family of bacterial insertion sequences? - Mol Microbiol: 1990 Oct, 4(10);1771-7 [PubMed:1963920]
  44. Katzman M, Mack JP, Skalka AM, Leis J. A covalent complex between retroviral integrase and nicked substrate DNA. - Proc Natl Acad Sci U S A: 1991 Jun 1, 88(11);4695-9 [PubMed:1647013]
  45. Khan E, Mack JP, Katz RA, Kulkosky J, Skalka AM. Retroviral integrase domains: DNA binding and the recognition of LTR sequences. - Nucleic Acids Res: 1991 Feb 25, 19(4);851-60 [PubMed:1850126]
  46. Rezsöhazy R, Hallet B, Delcour J, Mahillon J. The IS4 family of insertion sequences: evidence for a conserved transposase motif. - Mol Microbiol: 1993 Sep, 9(6);1283-95 [PubMed:7934941]
  47. [PubMed:9435062]
  48. [PubMed:2841644]
  49. Rousseau P, Gueguen E, Duval-Valentin G, Chandler M. The helix-turn-helix motif of bacterial insertion sequence IS911 transposase is required for DNA binding. - Nucleic Acids Res: 2004, 32(4);1335-44 [PubMed:14981152]
  50. Haren L, Polard P, Ton-Hoang B, Chandler M. Multiple oligomerisation domains in the IS911 transposase: a leucine zipper motif is essential for activity. - J Mol Biol: 1998, 283(1);29-41 [PubMed:9761671]
  51. [PubMed:10547692]
  52. Rettberg CC, Prère MF, Gesteland RF, Atkins JF, Fayet O. A three-way junction and constituent stem-loops as the stimulator for programmed -1 frameshifting in bacterial insertion sequence IS911. - J Mol Biol: 1999 Mar 12, 286(5);1365-78 [PubMed:10064703]
  53. Prère MF, Canal I, Wills NM, Atkins JF, Fayet O. The interplay of mRNA stimulatory signals required for AUU-mediated initiation and programmed -1 ribosomal frameshifting in decoding of transposable element IS911. - J Bacteriol: 2011 Jun, 193(11);2735-44 [PubMed:21478364]
  54. Mazauric MH, Licznar P, Prère MF, Canal I, Fayet O. Apical loop-internal loop RNA pseudoknots: a new type of stimulator of -1 translational frameshifting in bacteria. - J Biol Chem: 2008 Jul 18, 283(29);20421-32 [PubMed:18474594]
  55. [PubMed:16731525]
  56. Hu ST, Lee LC, Lei GS. Detection of an IS2-encoded 46-kilodalton protein capable of binding terminal repeats of IS2. - J Bacteriol: 1996 Oct, 178(19);5652-9 [PubMed:8824609]
  57. [PubMed:1314954]
  58. [PubMed:7636469]
  59. Tsuchihashi Z, Brown PO. Sequence requirements for efficient translational frameshifting in the Escherichia coli dnaX gene and the role of an unstable interaction between tRNA(Lys) and an AAG lysine codon. - Genes Dev: 1992 Mar, 6(3);511-9 [PubMed:1547945]
  60. [PubMed:3860833]
  61. [PubMed:11027137]
  62. [PubMed:12970189]
  63. [PubMed:18621088]
  64. Licznar P, Bertrand C, Canal I, Prère MF, Fayet O. Genetic variability of the frameshift region in IS911 transposable elements from Escherichia coli clinical isolates. - FEMS Microbiol Lett: 2003 Jan 28, 218(2);231-7 [PubMed:12586397]
  65. Duval-Valentin G, Chandler M. Cotranslational control of DNA transposition: a window of opportunity. - Mol Cell: 2011 Dec 23, 44(6);989-96 [PubMed:22195971]
  66. [PubMed:8520113]
  67. [PubMed:7496528]
  68. [PubMed:9335268]
  69. Haren L, Polard P, Ton-Hoang B, Chandler M. Multiple oligomerisation domains in the IS911 transposase: a leucine zipper motif is essential for activity. - J Mol Biol: 1998, 283(1);29-41 [PubMed:9761671]
  70. [PubMed:26405228]
  71. [PubMed:21439812]
  72. [PubMed:23217365]
  73. [PubMed:16181782]
  74. [PubMed:11352577]
  75. Rousseau P, Tardin C, Tolou N, Salomé L, Chandler M. A model for the molecular organisation of the IS911 transpososome. - Mob DNA: 2010 Jun 16, 1(1);16 [PubMed:20553579]
  76. [PubMed:15155821]
  77. [PubMed:16923775]
  78. Rousseau P, Loot C, Guynet C, Ah-Seng Y, Ton-Hoang B, Chandler M. Control of IS911 target selection: how OrfA may ensure IS dispersion. - Mol Microbiol: 2007 Mar, 63(6);1701-9 [PubMed:17367389]
  79. [PubMed:18586933]
  80. Ton-Hoang B, Polard P, Chandler M. Efficient transposition of IS911 circles in vitro. - EMBO J: 1998 Feb 16, 17(4);1169-81 [PubMed:9463394]
  81. [PubMed:7489730]
  82. [PubMed:15493331]
  83. Haas M, Rak B. Escherichia coli insertion sequence IS150: transposition via circular and linear intermediates. - J Bacteriol: 2002 Nov, 184(21);5833-41 [PubMed:12374815]
  84. [PubMed:26104718]
  85. Polard P, Chandler M. An in vivo transposase-catalyzed single-stranded DNA circularization reaction. - Genes Dev: 1995 Nov 15, 9(22);2846-58 [PubMed:7590258]
  86. Sekine Y, Aihara K, Ohtsubo E. Linearization and transposition of circular molecules of insertion sequence IS3. - J Mol Biol: 1999 Nov 19, 294(1);21-34 [PubMed:10556026]
  87. [PubMed:9413996]
  88. [PubMed:1334464]
  89. Duval-Valentin G, Marty-Cointin B, Chandler M. Requirement of IS911 replication before integration defines a new bacterial transposition pathway. - EMBO J: 2004 Oct 1, 23(19);3897-906 [PubMed:15359283]
  90. [PubMed:8021197]
  91. [PubMed:2181438]
  92. [PubMed:2510933]
  93. [PubMed:16148308]
  94. [PubMed:26104374]
  95. [PubMed:12770828]
  96. [PubMed:11459960]
  97. Turlan C, Loot C, Chandler M. IS911 partial transposition products and their processing by the Escherichia coli RecG helicase. - Mol Microbiol: 2004 Aug, 53(4);1021-33 [PubMed:15306008]
  98. Ton-Hoang B, Bétermier M, Polard P, Chandler M. Assembly of a strong promoter following IS911 circularization and the role of circles in transposition. - EMBO J: 1997 Jun 2, 16(11);3357-71 [PubMed:9214651]
  99. [PubMed:14729714]
  100. [PubMed:10320583]
  101. Loot C, Turlan C, Rousseau P, Ton-Hoang B, Chandler M. A target specificity switch in IS911 transposition: the role of the OrfA protein. - EMBO J: 2002 Aug 1, 21(15);4172-82 [PubMed:12145217]
  102. [PubMed:14756780]
  103. Sekine Y, Eisaki N, Ohtsubo E. Identification and characterization of the linear IS3 molecules generated by staggered breaks. - J Biol Chem: 1996 Jan 5, 271(1);197-202 [PubMed:8550559]
  104. [PubMed:375194]
  105. [PubMed:8676870]
  106. [PubMed:33006208]
  107. [PubMed:7590258]
  108. [PubMed:9463394]