Difference between revisions of "IS Families/IS5 and related IS1182 families"

From TnPedia
 
(8 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
====IS''5'' family====
 
 
=====Original Identification=====
 
IS''5'' was originally isolated as an insertion into the immunity region of [[wikipedia:Lambda_phage|bacteriophage lambda]] and subsequently found as a cause of mutation in a number of ''[[wikipedia:Escherichia_coli|E. coli]]'' genes<ref><nowiki><pubmed>4432374</pubmed></nowiki></ref><ref><nowiki><pubmed>353507</pubmed></nowiki></ref><ref><nowiki><pubmed>84614</pubmed></nowiki></ref><ref><nowiki><pubmed>641012</pubmed></nowiki></ref>. Together with IS''1'', it was also identified as an activator (by insertion) of expression of the usually cryptic [[wikipedia:Beta-glucosidase|beta-glucosidase]] gene of ''[[wikipedia:Escherichia_coli|E. coli]]''<ref><nowiki><pubmed>6270569</pubmed></nowiki></ref><ref><nowiki><pubmed>3034860</pubmed></nowiki></ref><ref><nowiki><pubmed>3034860</pubmed></nowiki></ref><ref><nowiki><pubmed>1311089</pubmed></nowiki></ref><ref><nowiki><pubmed>2846278</pubmed></nowiki></ref><ref><nowiki><pubmed>1311089</pubmed></nowiki></ref><ref><nowiki><pubmed>7781607</pubmed></nowiki></ref><ref><nowiki><pubmed>8710516</pubmed></nowiki></ref>.
+
IS''5'' was originally isolated as an insertion into the immunity region of [[wikipedia:Lambda_phage|bacteriophage lambda]] and subsequently found as a cause of mutation in a number of ''[[wikipedia:Escherichia_coli|E. coli]]'' genes<ref><pubmed>4432374</pubmed></ref><ref><pubmed>353507</pubmed></ref><ref><pubmed>84614</pubmed></ref><ref><pubmed>641012</pubmed></ref>. Together with [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1R IS''1''], it was also identified as an activator (by insertion) of expression of the usually cryptic [[wikipedia:Beta-glucosidase|beta-glucosidase]] gene of ''[[wikipedia:Escherichia_coli|E. coli]]''<ref><pubmed>6270569</pubmed></ref><ref><pubmed>3034860</pubmed></ref><ref name=":0"><pubmed>1311089</pubmed></ref><ref><pubmed>2846278</pubmed></ref><ref><pubmed>7781607</pubmed></ref><ref><pubmed>8710516</pubmed></ref>.
  
 
=====Presence in Compound Transposons=====
 
Several members are associated with compound transposons. These include IS''903'' and IS''602'', which form part of the [[wikipedia:Kanamycin_A|kanamycin]] resistance transposons Tn''903''<ref><nowiki><pubmed>6261245</pubmed></nowiki></ref>, and Tn''602''<ref><nowiki><pubmed>2819910</pubmed></nowiki></ref> respectively, and IS''Va1''/IS''Va2'' which form part of a transposon carrying iron transport genes<ref><nowiki><pubmed>7568465</pubmed></nowiki></ref>.
+
Several members are associated with compound transposons. These include [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS602 IS''602''], which form part of the [[wikipedia:Kanamycin_A|kanamycin]] resistance transposons [http://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn903-V00359.1 Tn''903'']<ref><pubmed>6261245</pubmed></ref>, and Tn''602''<ref><pubmed>2819910</pubmed></ref> respectively, and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISVa1 IS''Va1''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISVa2 IS''Va2''] which form part of a transposon carrying iron transport genes<ref><pubmed>7568465</pubmed></ref>.
  
 
=====Distribution=====
 
 
The IS''5'' family, like the IS''4'' family, is also a relatively heterogeneous group which now requires reanalysis. It also includes sequences from both eubacteria and the archaea.  
 
There are now a large number of identified members of the IS''5'' family (>550 members) and of a closely related IS''1182'' family (>150 members) which have allowed a more detailed analysis and a separation into various subgroups and families. The IS''5'' family is partitioned into 6 subgroups: IS''5'', IS''903'', IS''L2'', IS''H1'', IS''1031'' and IS''427''<ref><nowiki><pubmed>9729608</pubmed></nowiki></ref> ([[General Information/What Is an IS?#Characteristics%20of%20insertion%20sequence%20families|Table Characteristics of IS families]]; [[:File:Fig. IS5.1.png|Fig.5.1]]). Some of these may prove to be emerging families. Members of the IS''5'' subgroup appear to be composed of two groups with different lengths: one of 1060-1300 bp and a second of 1460-1610 ([[:File:Fig. IS5.2.png|Fig.5.2 A]]).  
+
There are now a large number of identified members of the IS''5'' family (>550 members) and of a closely related [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1182 IS''1182''] family (>150 members) which have allowed a more detailed analysis and a separation into various subgroups and families. The IS''5'' family is partitioned into 6 subgroups: [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS5 IS''5''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISL2 IS''L2''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISH1 IS''H1''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1031 IS''1031''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS427 IS''427'']<ref name=":1"><pubmed>9729608</pubmed></ref> ([[General Information/What Is an IS?#Characteristics%20of%20insertion%20sequence%20families|Table Characteristics of IS families]]; [[:File:Fig. IS5.1.png|Fig.5.1]]). Some of these may prove to be emerging families. Members of the IS''5'' subgroup appear to be composed of two groups with different lengths: one of 1060-1300 bp and a second of 1460-1610 ([[:File:Fig. IS5.2.png|Fig.5.2 A]]).
[[Image:Fig. IS5.1.png|thumb|center|500x500px|'''Fig. IS5.1.''' Correspondence between the IS IRs and different IS''5'' family subgroups.|alt=]]
+
[[Image:Fig. IS5.1.png|thumb|center|680x680px|'''Fig. IS5.1.''' Correspondence between the IS '''IRs''' and different IS''5'' family subgroups.|alt=]]
[[Image:Fig. IS5.2.png|thumb|center|500x500px|'''Fig. IS5.2.''' IS''5'' family IS''5'' subgroups '''A)''' distribution of IS length (base pairs); '''B)''' distribution of the length of transposase (amino acid residues).|alt=]]
+
[[Image:Fig. IS5.2.png|thumb|center|720x720px|'''Fig. IS5.2.''' IS''5'' family IS''5'' subgroups '''A)''' distribution of IS length (base pairs); '''B)''' distribution of the length of transposase (amino acid residues).|alt=]]
  
 
=====Diversity=====
 
The transposases of these are also of different lengths ([[:File:Fig. IS5.2.png|Fig.5.2 B]]) and transposase length is correlated with that of the IS. The lengths of the IS''1013'' subgroup are between ~900 and ~1200 bp with the majority between 103 and 1090 bp ([[:File:Fig. IS5.3.png|Fig.5.3]]), those of the IS''427'' group are between 800 and 1070 bp in length with most having lengths in the range of 810 900 bp ([[:File:Fig. IS5.3.png|Fig.5.3]]). Members of the IS''903'' subgroup are generally about 1030-1090 bp long ([[:File:Fig. IS5.4.png|Fig.5.4 A]] ), those of the IS<u>''H1''</u> subgroup are about 850 - 1200 bp long (note that this subgroup includes a number of MITES) ([[:File:Fig. IS5.4.png|Fig.5.4 B]]) and IS''L2'' members are 820  to 1260 bp long with a majority of about 820-970 bp ([[:File:Fig. IS5.4.png|Fig.5.4 C]]). There are a large number of additional IS''5'' family members whose attribution to subgroups has yet to be established.
+
The transposases of these are also of different lengths ([[:File:Fig. IS5.2.png|Fig.5.2 B]]) and transposase length is correlated with that of the IS. The lengths of the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1031 IS''1031''] subgroup are between ~900 and ~1200 bp with the majority between 103 and 1090 bp ([[:File:Fig. IS5.3.png|Fig.5.3]]); those of the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS427 IS''427''] group are between 800 and 1070 bp in length with most having lengths in the range of 810-900 bp ([[:File:Fig. IS5.3.png|Fig.5.3]]). Members of the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903''] subgroup are generally about 1030-1090 bp long ([[:File:Fig. IS5.4.png|Fig.5.4 A]]), those of the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISH1 IS''H1''] subgroup are about 850-1200 bp long (note that this subgroup includes a number of '''M'''iniature '''I'''nverted repeat '''T'''ransposable '''E'''lements, MITEs) ([[:File:Fig. IS5.4.png|Fig.5.4 B]]) and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISL2 IS''L2''] members are 820 to 1260 bp long with a majority of about 820-970 bp ([[:File:Fig. IS5.4.png|Fig.5.4 C]]). There are a large number of additional IS''5'' family members whose attribution to subgroups has yet to be established.
[[Image:Fig. IS5.3.png|thumb|center|500x500px|'''Fig. IS5.3.'''|alt=]]
+
[[Image:Fig. IS5.3.png|thumb|center|720x720px|'''Fig. IS5.3.''' IS''5'' family [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1031 IS''1031''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS427 IS''427''] subgroups. '''Top''': distribution of IS length (base pairs) IS''1031''; '''Bottom''': distribution of IS length (base pairs) [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS427 IS''427'']. The number of examples used in the sample is shown above each column.|alt=]]
[[Image:Fig. IS5.4.png|thumb|center|500x500px|'''Fig. IS5.4.'''|alt=]]
+
[[Image:Fig. IS5.4.png|thumb|center|720x720px|'''Fig. IS5.4.''' Length (base pairs) distribution of IS''5'' family [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISH1 IS''H1''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISL2 IS''L2''] subgroups. The number of examples used in the sample is shown above each column.|alt=]]
  
There is a distant relationship, about 30% similarity, between IS''5'' and the Pif/Harbinger group of eukaryotic TE<ref><nowiki><pubmed>15020481</pubmed></nowiki></ref><ref><nowiki><pubmed>15020481</pubmed></nowiki></ref>.
+
There is a distant relationship, about 30% similarity, between [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS5 IS''5''] and the Pif/Harbinger group of eukaryotic TE<ref><pubmed>15020481</pubmed></ref>.
  
 
=====Organization=====
 
Although the majority of members have a single Tpase orf, about 20% may express Tpase by [[wikipedia:Frameshift_mutation|frameshifting]] since it is distributed between two translation phases similar to most of the IS''427'' subgroup (82/116)<ref><nowiki><pubmed>9729608</pubmed></nowiki></ref>. In these cases if frameshifting indeed occurs the frameshifting signals appear more appropriate for a [[General Information/Transposase expression and activity#Programmed Transcriptional Frameshifting|programmed transcriptional realignment frameshift]] mechanism (PTR) rather than for classical [[General Information/Transposase expression and activity#Programmed Translational Frameshifting|translation frameshifting]] (PRF) since there are no obvious downstream enhancement signals<ref><nowiki><pubmed>21673094</pubmed></nowiki></ref>.
+
Although the majority of members have a single Tpase orf, about 20% may express Tpase by [[wikipedia:Frameshift_mutation|frameshifting]] since it is distributed between two translation phases similar to most of the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS427 IS''427''] subgroup (82/116)<ref name=":1" />. In these cases if frameshifting indeed occurs the frameshifting signals appear more appropriate for a [[General Information/Transposase expression and activity#Programmed Transcriptional Frameshifting|programmed transcriptional realignment frameshift]] mechanism (PTR) rather than for classical [[General Information/Transposase expression and activity#Programmed Translational Frameshifting|translation frameshifting]] (PRF) since there are no obvious downstream enhancement signals<ref><pubmed>21673094</pubmed></ref>.
  
Similar split reading frames have also been identified in several of the other subgroups: IS''1031'' (13/65 members); IS''L2'' (7/43); and few in the IS''5'' subgroup (7/149). There is no experimental evidence that these frameshift signals are functional but many of these IS are in multiple copy suggesting that the derivatives are active. In view of their diversity compared to families such as IS''3'', the subgroups will certainly be partitioned into additional groups as more ISs are identified.  
+
Similar split reading frames have also been identified in several of the other subgroups: [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1031 IS''1031''] (13/65 members); [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISL2 IS''L2''] (7/43); and few in the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS5 IS''5''] subgroup (7/149). There is no experimental evidence that these frameshift signals are functional but many of these IS are in multiple copies suggesting that the derivatives are active. In view of their diversity compared to families such as [[IS Families/IS3 family|IS''3'']], the subgroups will certainly be partitioned into additional groups as more ISs are identified.  
  
At present, the IS''903'' and the archaeal IS''H1'' subgroups whose IR are quite similar ([[:File:Fig. IS5.5.png|Fig.5.5]]) do not contain members with potential frameshifting.  
+
At present, the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903''] and the archaeal [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISH1 IS''H1''] subgroups whose '''IR''' are quite similar ([[:File:Fig. IS5.5.png|Fig.5.5]]) do not contain members with potential frameshifting.  
[[Image:Fig. IS5.5.png|thumb|center|600x600px|'''Fig. IS5.5.''' WebLogo showing the most common ISH1 and IS903 ends.|alt=]]
+
[[Image:Fig. IS5.5.png|thumb|center|720x720px|'''Fig. IS5.5.''' '''[http://weblogo.threeplusone.com/ WebLogo] showing the most common IS''H1'' and IS''903'' ends.''' The left (IRL) and right (IRR) inverted terminal repeats are shown in [http://weblogo.threeplusone.com/ WebLogo] format. From top to bottom: [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1031 IS''1031''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS427 IS''427''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISL2 IS''L2''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISH1 IS''H1''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS5 IS''5''] subgroups.|alt=]]
  
 
In addition to their Tpases and the presence or absence of potential frameshifting, a further distinction between these elements resides in their target specificities.  
 
  
Certain IS''427'' subgroup members and IS''1182'' family members do not carry a termination codon for their Tpases but generate this on insertion into a specific target sequence, CTAG, which is duplicated on insertion. Other IS such as IS''1031'', duplicate a sequence TNA while others such IS''L2'' appear to duplicate ANT.  
+
Certain [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS427 IS''427''] subgroup members and IS''1182'' family members do not carry a termination codon for their Tpases but generate this on insertion into a specific target sequence, CTAG, which is duplicated on insertion. Other IS, such as [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1031 IS''1031''], duplicate a sequence TNA while others, such as [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISL2 IS''L2''], appear to duplicate ANT.
  
The lengths of the entire group range from 789 bp (IS''Mbu1'') to 1643 bp (IS''493''). The latter carries a second open reading frame upstream of the "Tpase" frame inessential for transposition<ref><nowiki><pubmed>1319378</pubmed></nowiki></ref>. IS''4811'' (Tn''4811''<ref><nowiki><pubmed>1332944</pubmed></nowiki></ref>, which is greater than 5kb, clearly contains a number of passenger genes including one with a consensus [[wikipedia:ATP-binding_motif|ATP/GTP-binding motif]]; an [[wikipedia:Oxidoreductase|oxidoreductase-like protein]]; and one related to bacterial transcription regulators of the [https://www.wikigenes.org/e/gene/e/1034955.html AraC family].  Another, IS''881'' from ''[[wikipedia:Streptomyces|Streptomyces]]'', is interrupted by a [[wikipedia:Group_II_intron|group II intron]].
+
The lengths of the entire group range from 789 bp (e.g., [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISMbu1 IS''Mbu1'']) to 1643 bp (e.g., [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS493 IS''493'']). The latter carries a second open reading frame upstream of the "Tpase" frame which is inessential for transposition<ref><pubmed>1319378</pubmed></ref>. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS4811 IS''4811''] (Tn''4811'')<ref><pubmed>1332944</pubmed></ref>, which is greater than 5 kb, clearly contains a number of passenger genes including one with a consensus [[wikipedia:ATP-binding_motif|ATP/GTP-binding motif]]; an [[wikipedia:Oxidoreductase|oxidoreductase-like protein]]; and one related to bacterial transcription regulators of the [https://www.wikigenes.org/e/gene/e/1034955.html AraC family]. Another, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS881 IS''881''] from ''[[wikipedia:Streptomyces|Streptomyces]]'', is interrupted by a [[wikipedia:Group_II_intron|group II intron]].
  
The major feature which defines this group is the similarities between their putative Tpases<ref><nowiki><pubmed><7934941/pubmed></nowiki></ref>. This includes the N2, N3 and C1 domains carried by the IS''4'' group<ref><nowiki><pubmed>7934941</pubmed></nowiki></ref>. However, IS''5'' family Tpases exhibit a spacing between the N3 and C1 domains of approximately 40 residues, a distance more consistent with the canonical DDE motif<ref><nowiki><pubmed>9729608</pubmed></nowiki></ref>.
+
The major feature which defines this group is the similarities between their putative Tpases<ref><pubmed>7934941</pubmed></ref>. This includes the N2, N3 and C1 domains carried by the IS''4'' group<ref><pubmed>7934941</pubmed></ref>. However, IS''5'' family Tpases exhibit a spacing between the N3 and C1 domains of approximately 40 residues, a distance more consistent with the canonical DDE motif<ref name=":1" />.
  
Analysis of the largely increased number of members generally confirms these subgroups. Members within each group also generate distinct DRs of similar lengths (IS''5'', 4 bp; IS''L2'', 2-3 bp; IS''1031'', 3-4 bp; IS''903'', 8-9 bp; and IS''427'', 2-3 bp).  
+
Analysis of the largely increased number of members generally confirms these subgroups. Members within each group also generate distinct DRs of similar lengths ([https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS5 IS''5''], 4 bp; [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISL2 IS''L2''], 2-3 bp; [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1031 IS''1031''], 3-4 bp; [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903''], 8-9 bp; and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS427 IS''427''], 2-3 bp).  
  
The IS''903'' and IS''H1'' subgroups have similar terminal IRs ([[:File:Fig. IS5.5.png|Fig.5.5]]) but appear distinct by correlation with the length of the target duplication and, to a lesser extent, by the typical length of the entire IS ([[:File:Fig. IS5.4.png|Fig.5.4]]).  
+
The [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISH1 IS''H1''] subgroups have similar terminal IRs ([[:File:Fig. IS5.5.png|Fig.5.5]]) but appear distinct by correlation with the length of the target duplication and, to a lesser extent, by the typical length of the entire IS ([[:File:Fig. IS5.4.png|Fig.5.4]]).  
  
Several members exhibit GATC sites within their terminal 50 bp. This includes all members of the IS''903'' subgroup and many members of the IS''1031'' and IS''427'' subgroups. IS''903'' transposition activity has been shown to be modulated by Dam ''in vivo'' (cited in <ref><nowiki><pubmed>3000598</pubmed></nowiki></ref>).  
+
Several members exhibit GATC sites within their terminal 50 bp. This includes all members of the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903''] subgroup and many members of the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1031 IS''1031''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS427 IS''427''] subgroups. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903''] transposition activity has been shown to be modulated by Dam ''in vivo'' (cited in <ref><pubmed>3000598</pubmed></ref>).  
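The observation about GATC (Dam methylation) sites in the terminal 50 bp can be checked on any IS sequence with a few lines of code. The following is a minimal sketch only: the function name <code>dam_sites_in_ends</code>, the default 50 bp window and the toy sequence in the comment are illustrative assumptions and do not come from ISfinder or from the cited work.

```python
def find_all(seq: str, motif: str) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) motif hit."""
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(i)
        i = seq.find(motif, i + 1)
    return hits


def dam_sites_in_ends(is_seq: str, window: int = 50, motif: str = "GATC") -> dict:
    """Locate GATC sites lying entirely within each terminal window of an IS.

    Positions are 0-based and given relative to the start of the full sequence.
    """
    seq = is_seq.upper()
    offset = max(len(seq) - window, 0)  # start coordinate of the right-hand window
    left_hits = find_all(seq[:window], motif)
    right_hits = [offset + i for i in find_all(seq[offset:], motif)]
    return {"left": left_hits, "right": right_hits}


# Toy sequence (not a real IS): one GATC near each end.
# dam_sites_in_ends("AAGATC" + "A" * 100 + "GATCTT")
```

A real analysis would use the annotated IS boundaries, since an error of a few base pairs in the ends shifts which sites fall inside the window.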
  
A preferred target sequence, YTAR (often CTAG), is observed for two subgroups, IS''5'' and IS''427'', and for two members of the IS''L2'' group (IS''112'' and IS''1373'') in which either all four base pairs or the central TA are duplicated on insertion.  
+
A preferred target sequence, YTAR (often CTAG), is observed for two subgroups, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS5 IS''5''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS427 IS''427''], and for two members of the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISL2 IS''L2''] group (e.g., [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS112 IS''112''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1373 IS''1373'']) in which either all four base pairs or the central TA are duplicated on insertion.
  
It is important to underline that, in many cases, the sequence of the original target site before insertion is not available. This can introduce ambiguities not only in estimating the number of duplicated target base pairs but also in defining the IRs. It is particularly important in several cases where the target repeat is symmetrical (e.g. CTAG) and where it is impossible to distinguish whether the element duplicates 2 or 4 bp and therefore to determine the exact ends of the element. Alignment of the ends of these elements in subgroups has permitted a number of ambiguities to be resolved. Members of the IS''L2'' group which generate 3 bp DRs exhibit a preference for ANT while those from the IS''1031'' group (which generate exclusively a 3 bp DR) exhibits a preference for insertion sites with the sequence TNA. Neither the small IS''H1'' group (8 bp DRs) nor the IS''903'' group (9 bp DRs) exhibit marked target specificity (see IS''903'' and also [[General Information/Target Choice|Target specificity]]).  
+
It is important to underline that, in many cases, the sequence of the original target site before insertion is not available. This can introduce ambiguities not only in estimating the number of duplicated target base pairs but also in defining the '''IRs'''. It is particularly important in several cases where the target repeat is symmetrical (e.g. CTAG) and where it is impossible to distinguish whether the element duplicates 2 or 4 bp and therefore to determine the exact ends of the element. Alignment of the ends of these elements in subgroups has permitted a number of ambiguities to be resolved. Members of the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISL2 IS''L2''] group which generate 3 bp DRs exhibit a preference for ANT while those from the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1031 IS''1031''] group (which generate exclusively a 3 bp '''DR''') exhibit a preference for insertion sites with the sequence TNA. Neither the small [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISH1 IS''H1''] group (8 bp DRs) nor the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903''] group (9 bp DRs) exhibits marked target specificity (see [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903''] and also [[General Information/Target Choice|Target specificity]]).
Only two of these elements, IS''5'' and IS''903'', have received significant attention.  
+
Only two of these elements, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS5 IS''5''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903''], have received significant attention.  
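The ambiguity created by a symmetrical target such as CTAG can be made concrete with a toy simulation. In the sketch below, which rests on invented sequences, the same insertion product is obtained either from an element that duplicates all four base pairs of CTAG or from an element that is one base pair longer at each end and duplicates only the central TA. The function <code>insert_with_duplication</code> and the <code>core</code> sequence are hypothetical and serve only to illustrate why the exact ends cannot be assigned without the empty target site.

```python
def insert_with_duplication(target: str, element: str, start: int, dr_len: int) -> str:
    """Simulate insertion of `element` with duplication of target[start:start+dr_len].

    The duplicated target bases end up as direct repeats flanking the element.
    """
    dr = target[start:start + dr_len]
    return target[:start] + dr + element + dr + target[start + dr_len:]


target = "GGCTAGCC"      # toy target containing the symmetrical site CTAG
core = "ACGTTTGCATTG"    # placeholder for the internal part of an IS (invented)

# Model A: the element is `core` and duplicates all 4 bp (CTAG).
product_4bp = insert_with_duplication(target, core, start=2, dr_len=4)

# Model B: the element is G + core + C and duplicates only the central 2 bp (TA).
product_2bp = insert_with_duplication(target, "G" + core + "C", start=3, dr_len=2)

# Both models give the same sequence, so the sequenced insertion alone
# cannot distinguish a 2 bp from a 4 bp duplication.
same_product = product_4bp == product_2bp
```

Comparison with the uninterrupted target sequence does not resolve this on its own, which is why the text above relies on alignment of the ends of related elements within a subgroup.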
  
 
=====IS''5'' group=====
 
In spite of the historical importance of IS5 in generating mutations, the published work concerning this element is largely directed to an understanding of its coding capacity and expression properties. IS5 carries one large orf, ins5A, spanning the entire element and shown to be essential for transposition IS5 (see <ref><nowiki><pubmed>1311089</pubmed></nowiki></ref>), and two small orfs (ins5B and 5C<ref><nowiki><pubmed>6269958</pubmed></nowiki></ref><ref><nowiki><pubmed>6269959</pubmed></nowiki></ref><ref><nowiki><pubmed>6281651</pubmed></nowiki></ref><ref><nowiki><pubmed>6327289</pubmed></nowiki></ref>, whose relevance to transposition remains to be demonstrated. Nothing is known about the transposition mechanism of this element.  
+
In spite of the historical importance of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS5 IS''5''] in generating mutations, the published work concerning this element is largely directed to an understanding of its coding capacity and expression properties. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS5 IS''5''] carries one large orf, ins5A, spanning the entire element and shown to be essential for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS5 IS''5''] transposition (see <ref name=":0" />), and two small orfs (ins5B and ins5C)<ref><pubmed>6269958</pubmed></ref><ref><pubmed>6269959</pubmed></ref><ref><pubmed>6281651</pubmed></ref><ref><pubmed>6327289</pubmed></ref>, whose relevance to transposition remains to be demonstrated. Nothing is known about the transposition mechanism of this element.
  
 
=====Mechanism IS''903''=====
 
The only IS''5'' family member which transposition mechanism has been addressed at present is IS''903''. The ends of IS''903'' carry IRs of 18 bp which exhibit the typical two-domain organization<ref><nowiki><pubmed>2825175</pubmed></nowiki></ref> . Transposase has been shown to bind specifically to the ends using a region located in the amino-terminal portion of the protein<ref><nowiki><pubmed>1324175</pubmed></nowiki></ref><ref><nowiki><pubmed>9417930</pubmed></nowiki></ref>. In addition, a region possibly involved in the formation of higher order multimers has been identified and residues probably involved in catalysis have been pinpointed among the conserved residues in the catalytic DDE domain<ref><nowiki><pubmed>9417930</pubmed></nowiki></ref>. Insertion generates a 9 bp target duplication.
+
The only [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS5 IS''5''] family member whose transposition mechanism has been addressed at present is [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903'']. The ends of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903''] carry IRs of 18 bp which exhibit the typical two-domain organization<ref><pubmed>2825175</pubmed></ref>. Transposase has been shown to bind specifically to the ends using a region located in the amino-terminal portion of the protein<ref><pubmed>1324175</pubmed></ref><ref name=":2"><pubmed>9417930</pubmed></ref>. In addition, a region possibly involved in the formation of higher order multimers has been identified and residues probably involved in catalysis have been pinpointed among the conserved residues in the catalytic DDE domain<ref name=":2" />. Insertion generates a 9 bp target duplication.
  
An elegant genetic analysis provided strong evidence that IS''903'' is not only capable of undergoing direct insertion but can also generate adjacent deletions in a duplicative manner. Moreover, point mutations in the terminal base pair of the IRs decrease overall transposition frequency but increase the frequency of [[wikipedia:Cointegrate|cointegrate formation]]<ref><nowiki><pubmed>10096085</pubmed></nowiki></ref>. Similarly, mutation of the first nucleotide flanking an IR also influences the level of [[wikipedia:Cointegrate|cointegrate formation]]<ref><nowiki><pubmed>11387225</pubmed></nowiki></ref>. The level of [[wikipedia:Cointegrate|cointegrate formation]] can also be increased by mutation of the Tpase. The molecular nature of these effects requires further investigation.  
+
An elegant genetic analysis provided strong evidence that IS''903'' is not only capable of undergoing direct insertion but can also generate adjacent deletions in a duplicative manner. Moreover, point mutations in the terminal base pair of the '''IRs''' decrease overall transposition frequency but increase the frequency of [[wikipedia:Cointegrate|cointegrate formation]]<ref><pubmed>10096085</pubmed></ref>. Similarly, mutation of the first nucleotide flanking an IR also influences the level of [[wikipedia:Cointegrate|cointegrate formation]]<ref><pubmed>11387225</pubmed></ref>. The level of [[wikipedia:Cointegrate|cointegrate formation]] can also be increased by mutation of the Tpase. The molecular nature of these effects requires further investigation.  
  
Factors affecting IS''903'' target site choice have been addressed in some detail. Initial studies<ref><nowiki><pubmed>9620951</pubmed></nowiki></ref> identified that insertion into the conjugative plasmid pOX38 showed no consensus in the 9 bp target duplication produced on insertion but alignment of the target sequences indicated a preference for sites with symmetry on either side. A cloned copy of one native symmetric site into a second conjugative plasmid, pUB307, confirmed its attractiveness for insertion. More extensive studies provided a consensus symmetric target sequence which, when cloned into a target replicon, proved highly efficient<ref><nowiki><pubmed>11178901</pubmed></nowiki></ref>. The preferred target was a 21 bp palindrome cantered on the 9 bp target duplication. It could be dissected into: the 5 bp flanking sequences, the most important for site-specific insertion; the 7 bp palindromic core within the target duplication; the dinucleotide pair at the transposon-target junction; and the local DNA context.
+
Factors affecting [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903''] target site choice have been addressed in some detail. Initial studies<ref><pubmed>9620951</pubmed></ref> found that insertions into the conjugative plasmid pOX38 showed no consensus in the 9 bp target duplication produced on insertion, but the alignment of the target sequences indicated a preference for sites with symmetry on either side. Cloning one native symmetric site into a second conjugative plasmid, pUB307, confirmed its attractiveness for insertion. More extensive studies provided a consensus symmetric target sequence which, when cloned into a target replicon, proved highly efficient<ref><pubmed>11178901</pubmed></ref>. The preferred target was a 21 bp palindrome centered on the 9 bp target duplication. It could be dissected into: the 5 bp flanking sequences, the most important for site-specific insertion; the 7 bp palindromic core within the target duplication; the dinucleotide pair at the transposon-target junction; and the local DNA context.
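The symmetry of a candidate target site can be scored by comparing it with its own reverse complement. The sketch below is a hypothetical helper, not the method used in the cited studies: it counts how many positions in the left half of a site pair with the complementary base at the mirror position. For an odd-length site such as a 21 bp window, the central base pair has no partner and is ignored.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (N is kept as N)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def symmetry_score(site: str) -> tuple[int, int]:
    """Return (symmetric positions, positions compared) for a DNA site.

    Position i counts as symmetric when site[i] is the complement of the
    base at the mirror position; the central base of an odd-length site
    is not compared.
    """
    site = site.upper()
    rc = revcomp(site)
    half = len(site) // 2
    matches = sum(1 for i in range(half) if site[i] == rc[i])
    return matches, half


# Invented 21 bp example: a 10 bp arm, a central base, and the mirrored arm.
arm = "ACGTTGCAAT"
perfect_site = arm + "A" + revcomp(arm)
```

Applied to aligned insertion sites, a score close to the maximum for many sites would be consistent with the preference for symmetric targets described above; the score by itself does not weight the flanking sequences, core and junction dinucleotides differently.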
  
Insertion into pUB307 itself showed a strong preference for a single orientation. By inverting either the vegetative (''oriV'') or transfer, ''oriT'', origins, it was concluded that orientation was determined by  the direction of conjugative transfer. This of course implies that the ends of IS''903'' are not equivalent. It also implies, as is the case for Tn''7''<ref><nowiki><pubmed>8804309</pubmed></nowiki></ref><ref><nowiki><pubmed>11274058</pubmed></nowiki></ref><ref><nowiki><pubmed>11030337</pubmed></nowiki></ref><ref><nowiki><pubmed>26104363</pubmed></nowiki></ref><ref><nowiki><pubmed>19703395</pubmed></nowiki></ref> and members of the IS''200''/IS''608'' family<ref><nowiki><pubmed>26350330</pubmed></nowiki></ref><ref><nowiki><pubmed>27466393</pubmed></nowiki></ref><ref><nowiki><pubmed>20691900</pubmed></nowiki></ref>, that transposition targets [[wikipedia:DNA_replication#Replication_fork|replication forks]].
+
Insertion into pUB307 itself showed a strong preference for a single orientation. By inverting either the vegetative (''oriV'') or transfer (''oriT'') origins, it was concluded that orientation was determined by the direction of conjugative transfer. This of course implies that the ends of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903''] are not equivalent. It also implies, as is the case for [http://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn7-NC_002525 Tn''7'']<ref><pubmed>8804309</pubmed></ref><ref><pubmed>11274058</pubmed></ref><ref><pubmed>11030337</pubmed></ref><ref><pubmed>26104363</pubmed></ref><ref><pubmed>19703395</pubmed></ref> and members of the [[IS Families/IS200-IS605 family|IS''200''/IS''608'' family]]<ref><pubmed>26350330</pubmed></ref><ref><pubmed>27466393</pubmed></ref><ref><pubmed>20691900</pubmed></ref>, that transposition targets [[wikipedia:DNA_replication#Replication_fork|replication forks]].
  
The requirement the most abundant nucleoid proteins in transposition<ref><nowiki><pubmed>15130124</pubmed></nowiki></ref>. Most notably, H-NS was required for efficient transposition. Similar results were obtained for IS''10'' and Tn''522'' suggesting a more general role for H-NS in bacterial transposition. H-NS exerts its effect on target capture: IS''903''. Targeting preferences in the E. coli chromosome were dramatically altered in the absence of H-NS.
+
The requirement for the most abundant nucleoid proteins in transposition has also been examined<ref><pubmed>15130124</pubmed></ref>. Most notably, [[wikipedia:Histone-like_nucleoid-structuring_protein|H-NS]] was required for efficient transposition. Similar results were obtained for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS10R IS''10''] and Tn''552'', suggesting a more general role for [[wikipedia:Histone-like_nucleoid-structuring_protein|H-NS]] in bacterial transposition. [[wikipedia:Histone-like_nucleoid-structuring_protein|H-NS]] exerts its effect on target capture: [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903''] targeting preferences in the ''E. coli'' chromosome were dramatically altered in the absence of [[wikipedia:Histone-like_nucleoid-structuring_protein|H-NS]].
  
Several other host mutants were identified exhibiting a unique papillation pattern<ref><nowiki><pubmed>15968071</pubmed></nowiki></ref>: a ring phenotype with predominant papillae located just inside the edge of the colony, implying a spatial triggering of transposition within the. These mutants were found to be in ''pur'' genes, whose products are involved in purine biosynthesis. Genetic evidence was consistent with a requirement for GTP in IS''903'' transposition. These observations suggest that transposition occurs in later stages of colony growth. Transposition may occur within the colony edge in response to either a gradient of exogenous purines across the colony and may also reflect the developmental stage of the cells.
+
Several other host mutants were identified exhibiting a unique papillation pattern<ref name=":3"><pubmed>15968071</pubmed></ref>: a ring phenotype with predominant papillae located just inside the edge of the colony, implying a spatial triggering of transposition within the colony. These mutants were found to be in ''pur'' genes, whose products are involved in purine biosynthesis. The genetic evidence was consistent with a requirement for GTP in [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903''] transposition. These observations suggest that transposition occurs in later stages of colony growth. Transposition may occur within the colony edge in response to a gradient of exogenous purines across the colony and may also reflect the developmental stage of the cells.
  
IS''903'' transposase like those of a variety of other IS, exhibits a strong preference for action in cis: complementation of defective transposons in trans occurs at less than 1%<ref><nowiki><pubmed>15968071</pubmed></nowiki></ref>. Transposition is extremely sensitive to the distance between the 3' end of the transposase gene and the nearest transposon IR. An insertion of 1 kb of DNA reduces transposition to 1-2%. There is a strong correlation between the stability of transposase and its ability to act in trans. wild-type transposase has a half-life of about 3 min. Fusion with [[wikipedia:Alpha-galactosidase|α-galactosidase]] stabilizes the protein and results in an increase in its capacity to act in trans. A similar effect was noted in a lon mutant strain where trans activity was increased by a factor of 10-100. Further studies identified a class of transposase mutant specifically enhanced in trans activity and reduced in cis activity without increasing the overall transposition frequency. This was correlated with an increase in transposase half-life compared to the wildtype Derbyshire<ref><nowiki><pubmed>8898394</pubmed></nowiki></ref>. A second class of mutant with enhanced cis activity resulted in increased levels of transposase expression (as for IS''10''<ref><nowiki><pubmed>8412678</pubmed></nowiki></ref>).
+
[https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS903 IS''903''] transposase, like those of a variety of other IS, exhibits a strong preference for action in cis: complementation of defective transposons in trans occurs at less than 1%<ref name=":3" />. Transposition is extremely sensitive to the distance between the 3' end of the transposase gene and the nearest transposon '''IR'''. Insertion of 1 kb of DNA reduces transposition to 1-2%. There is a strong correlation between the stability of transposase and its ability to act in trans. Wild-type transposase has a half-life of about 3 min. Fusion with [[wikipedia:Alpha-galactosidase|α-galactosidase]] stabilizes the protein and results in an increase in its capacity to act in trans. A similar effect was noted in a lon mutant strain where trans activity was increased by a factor of 10-100. Further studies identified a class of transposase mutants specifically enhanced in trans activity and reduced in '''cis''' activity without increasing the overall transposition frequency. This was correlated with an increase in transposase half-life compared to the wild type<ref><pubmed>8898394</pubmed></ref>. A second class of mutants, with enhanced cis activity, showed increased levels of transposase expression (as for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS10R IS''10'']<ref><pubmed>8412678</pubmed></ref>).
  
 
====IS''1182''====
 
====IS''1182''====
IS''1182'' family members exhibit a diverse set of target specificities. Some duplicate 4 bp. These are of two types: those specific for CTAG and those that show no apparent target sequence specificity. Yet others target palindromic sequences. These are also of different types: some insert at the 3’ foot of a stem-loop and duplicate the entire structure while others insert 3’ of the loop and simply duplicate the loop ([https://scholar.google.fr/citations?user=WHAtfqcAAAAJ&hl=fr P. Siguier], [https://www.france-bioinformatique.fr/en/users/gourbeyre-edith E. Gourbeyre] and [https://scholar.google.fr/citations?user=r8TYgVEAAAAJ&hl=fr M. Chandler], unpublished) ([[:File:Fig. IS1182.1.png|Fig.IS1182.1]]).
+
[https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1182 IS''1182''] family members exhibit a diverse set of target specificities. Some duplicate 4 bp. These are of two types: those specific for CTAG and those that show no apparent target sequence specificity. Yet others target palindromic sequences. These are also of different types: some insert at the 3’ foot of a stem-loop and duplicate the entire structure while others insert 3’ of the loop and simply duplicate the loop ([https://scholar.google.fr/citations?user=WHAtfqcAAAAJ&hl=fr P. Siguier], [https://www.france-bioinformatique.fr/en/users/gourbeyre-edith E. Gourbeyre] and [https://scholar.google.fr/citations?user=r8TYgVEAAAAJ&hl=fr M. Chandler], unpublished) ([[:File:Fig. IS1182.1.png|Fig.IS1182.1]]).
[[Image:Fig. IS1182.1.png|thumb|center|500x500px|'''Fig. IS1182.1.''' The IS''1182'' family main characteristics.  |alt=]]
+
[[Image:Fig. IS1182.1.png|thumb|center|720x720px|'''Fig. IS1182.1.''' '''The IS''1182'' family's main characteristics. Top:''' The left ('''IRL''') and right ('''IRR''') inverted terminal repeats are shown in [http://weblogo.threeplusone.com/ WebLogo] format. '''Bottom''': distribution of IS length (base pairs) among [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1182 IS''1182''] family members. The number of examples used in the sample is shown above each column.  |alt=]]
  
 
====IS''Dol1'' group (ISNCY)====
 
====IS''Dol1'' group (ISNCY)====
Another small group, IS''Dol1'', with 58 members from a large number of bacterial species has emerged from the ISNCY “orphan” group. Members have a length of between 1600-1900 bp ([[:File:Fig. ISDol1.1.png|Fig.ISDol.1]]) and generate DRs of 6-7bp.
+
Another small group, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISDol1 IS''Dol1''], with 58 members from a large number of bacterial species, has emerged from the ISNCY “orphan” group. Members are between 1600 and 1900 bp long ([[:File:Fig. ISDol1.1.png|Fig.ISDol1.1]]) and generate DRs of 6-7 bp.
[[Image:Fig. ISDol1.1.png|thumb|center|500x500px|'''Fig. ISDol1.1.''' The IS''Dol1'' family main characteristics.  |alt=]]
+
[[Image:Fig. ISDol1.1.png|thumb|center|720x720px|'''Fig. ISDol1.1.''' '''The IS''Dol1'' group's main characteristics.''' '''Top''': The left ('''IRL''') and right ('''IRR''') inverted terminal repeats are shown in [http://weblogo.threeplusone.com/ WebLogo] format. '''Bottom''': distribution of IS length (base pairs) among [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISDol1 IS''Dol1''] group members. The number of examples used in the sample is shown above each column.  |alt=]]
 
==Bibliography==
 
==Bibliography==
 
<references />
 
<references />

Latest revision as of 12:49, 11 August 2021

IS5 family

Original Identification

IS5 was originally isolated as an insertion into the immunity region of bacteriophage lambda and subsequently found as a cause of mutation in a number of E. coli genes[1][2][3][4]. Together with IS1, it was also identified as an activator (by insertion) of expression of the usually cryptic beta-glucosidase gene of E. coli[5][6][7][8][9][10].

Presence in Compound Transposons

Several members are associated with compound transposons. These include IS903 and IS602, which form part of the kanamycin resistance transposons Tn903[11], and Tn602[12] respectively, and ISVa1 and ISVa2 which form part of a transposon carrying iron transport genes[13].

Distribution

The IS5 family, like the IS4 family, is a relatively heterogeneous group which now requires reanalysis. It includes sequences from both eubacteria and archaea. There are now a large number of identified members of the IS5 family (>550 members) and of a closely related IS1182 family (>150 members), which have allowed a more detailed analysis and a separation into various subgroups and families. The IS5 family is partitioned into 6 subgroups: IS5, IS903, ISL2, ISH1, IS1031 and IS427[14] (Table Characteristics of IS families; Fig.5.1). Some of these may prove to be emerging families. Members of the IS5 subgroup appear to be composed of two groups with different lengths: one of 1060-1300 bp and a second of 1460-1610 bp (Fig.5.2 A).

Fig. IS5.1. Correspondence between the IS IRs and different IS5 family subgroups.
Fig. IS5.2. IS5 family IS5 subgroups A) distribution of IS length (base pairs); B) distribution of the length of transposase (amino acid residues).

Diversity

The transposases of these are also of different lengths (Fig.5.2 B) and transposase length is correlated with that of the IS. The lengths of the IS1031 subgroup are between ~900 and ~1200 bp with the majority between 103 and 1090 bp (Fig.5.3); those of the IS427 group are between 800 and 1070 bp in length with most having lengths in the range of 810-900 bp (Fig.5.3). Members of the IS903 subgroup are generally about 1030-1090 bp long (Fig.5.4 A), those of the ISH1 subgroup are about 850-1200 bp long (note that this subgroup includes a number of Miniature Inverted repeat Transposable Elements (MITEs)) (Fig.5.4 B) and ISL2 members are 820 to 1260 bp long with a majority of about 820-970 bp (Fig.5.4 C). There are a large number of additional IS5 family members whose attribution to subgroups has yet to be established.

Fig. IS5.3. IS5 family IS1031 and IS427 subgroups. Top: distribution of IS length (base pairs) IS1031; Bottom: distribution of IS length (base pairs) IS427. The number of examples used in the sample is shown above each column.
Fig. IS5.4. Length (base pairs) distribution of IS5 family IS903, ISH1 and ISL2 subgroups. The number of examples used in the sample is shown above each column.

There is a distant relationship, about 30% similarity, between IS5 and the Pif/Harbinger group of eukaryotic TE[15].

Organization

Although the majority of members have a single Tpase orf, about 20% may express Tpase by frameshifting since the Tpase is distributed between two reading phases, as is the case for most of the IS427 subgroup (82/116)[14]. In these cases, if frameshifting indeed occurs, the frameshifting signals appear more appropriate for a programmed transcriptional realignment (PTR) mechanism than for classical programmed ribosomal frameshifting (PRF), since there are no obvious downstream enhancement signals[16].

Similar split reading frames have also been identified in several of the other subgroups: IS1031 (13/65 members); ISL2 (7/43); and a few in the IS5 subgroup (7/149). There is no experimental evidence that these frameshift signals are functional, but many of these IS are present in multiple copies, suggesting that the derivatives are active. In view of their diversity compared to families such as IS3, the subgroups will certainly be partitioned into additional groups as more ISs are identified.

At present, the IS903 and the archaeal ISH1 subgroups, whose IRs are quite similar (Fig.5.5), do not contain members with potential frameshifting.

Fig. IS5.5. WebLogo showing the most common ISH1 and IS903 ends. The left (IRL) and right (IRR) inverted terminal repeats are shown in WebLogo format. From top to bottom: IS1031, IS427, ISL2, ISH1, IS903, IS5 subgroups.

In addition to their Tpases and the presence or absence of potential frameshifting, a further distinction between these elements resides in their target specificities.

Certain IS427 subgroup members and IS1182 family members do not carry a termination codon for their Tpases but generate this on insertion into a specific target sequence, CTAG, which is duplicated on insertion. Other IS, such as IS1031, duplicate a sequence TNA while others, such as ISL2, appear to duplicate ANT.

The lengths of the entire group range from 789 bp (e.g., ISMbu1) to 1643 bp (e.g., IS493). The latter carries a second open reading frame upstream of the "Tpase" frame that is inessential for transposition[17]. IS4811 (Tn4811[18]), which is greater than 5 kb, clearly contains a number of passenger genes including one with a consensus ATP/GTP-binding motif; an oxidoreductase-like protein; and one related to bacterial transcription regulators of the AraC family. Another, IS881 from Streptomyces, is interrupted by a group II intron.

The major feature which defines this group is the similarities between their putative Tpases[19]. This includes the N2, N3 and C1 domains carried by the IS4 group[20]. However, IS5 family Tpases exhibit a spacing between the N3 and C1 domains of approximately 40 residues, a distance more consistent with the canonical DDE motif[14].

Analysis of the largely increased number of members generally confirms these subgroups. Members within each group also generate distinct DRs of similar lengths (IS5, 4 bp; ISL2, 2-3 bp; IS1031, 3-4 bp; IS903, 8-9 bp; and IS427, 2-3 bp).

The IS903 and ISH1 subgroups have similar terminal IRs (Fig.5.5) but appear distinct by correlation with the length of the target duplication and, to a lesser extent, by the typical length of the entire IS (Fig.5.4).

Several members exhibit GATC sites within their terminal 50 bp. This includes all members of the IS903 subgroup and many members of the IS1031 and IS427 subgroups. IS903 transposition activity has been shown to be modulated by Dam in vivo (cited in [21]).

A preferred target sequence, YTAR (often CTAG), is observed for two subgroups, IS5 and IS427, and for two members of the ISL2 group (e.g., IS112 and IS1373) in which either all four base pairs or the central TA are duplicated on insertion.

It is important to underline that, in many cases, the sequence of the original target site before insertion is not available. This can introduce ambiguities not only in estimating the number of duplicated target base pairs but also in defining the IRs. It is particularly important in several cases where the target repeat is symmetrical (e.g. CTAG) and where it is impossible to distinguish whether the element duplicates 2 or 4 bp and therefore to determine the exact ends of the element. Alignment of the ends of these elements in subgroups has permitted a number of ambiguities to be resolved. Members of the ISL2 group which generate 3 bp DRs exhibit a preference for ANT while those from the IS1031 group (which generate exclusively a 3 bp DR) exhibit a preference for insertion sites with the sequence TNA. Neither the small ISH1 group (8 bp DRs) nor the IS903 group (9 bp DRs) exhibits marked target specificity (see IS903 and also Target specificity). Only two of these elements, IS5 and IS903, have received significant attention.
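The symmetry behind this ambiguity can be checked mechanically. The sketch below is illustrative only: the function names are hypothetical and come from neither ISfinder nor TnPedia, and it assumes plain A/C/G/T strings. It tests whether a candidate target is its own reverse complement (as CTAG is) and whether it fits the YTAR pattern (Y = C or T, R = A or G).

```python
# Illustrative helpers; names are hypothetical and not part of any ISfinder tool.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True when the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


def matches_ytar(seq: str) -> bool:
    """True for a 4 bp site of the form YTAR (Y = C/T, R = A/G)."""
    s = seq.upper()
    return len(s) == 4 and s[0] in "CT" and s[1:3] == "TA" and s[3] in "AG"


if __name__ == "__main__":
    for site in ("CTAG", "TTAA", "CTAA"):
        print(site, is_palindromic(site), matches_ytar(site))
```

For a site such as CTAG, both the full 4 bp and the central TA are palindromic, which is why a flanking repeat alone cannot show whether 2 or 4 bp were duplicated.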

IS5 group

In spite of the historical importance of IS5 in generating mutations, the published work concerning this element is largely directed to an understanding of its coding capacity and expression properties. IS5 carries one large orf, ins5A, spanning the entire element and shown to be essential for IS5 transposition (see [7]), and two small orfs (ins5B and ins5C)[22][23][24][25], whose relevance to transposition remains to be demonstrated. Nothing is known about the transposition mechanism of this element.

Mechanism IS903

The only IS5 family member whose transposition mechanism has been addressed at present is IS903. The ends of IS903 carry IRs of 18 bp which exhibit the typical two-domain organization[26]. Transposase has been shown to bind specifically to the ends using a region located in the amino-terminal portion of the protein[27][28]. In addition, a region possibly involved in the formation of higher order multimers has been identified and residues probably involved in catalysis have been pinpointed among the conserved residues in the catalytic DDE domain[28]. Insertion generates a 9 bp target duplication.

An elegant genetic analysis provided strong evidence that IS903 is not only capable of undergoing direct insertion but can also generate adjacent deletions in a duplicative manner. Moreover, point mutations in the terminal base pair of the IRs decrease overall transposition frequency but increase the frequency of cointegrate formation[29]. Similarly, mutation of the first nucleotide flanking an IR also influences the level of cointegrate formation[30]. The level of cointegrate formation can also be increased by mutation of the Tpase. The molecular nature of these effects requires further investigation.

Factors affecting IS903 target site choice have been addressed in some detail. Initial studies[31] found that insertion into the conjugative plasmid pOX38 showed no consensus in the 9 bp target duplication produced on insertion, but the alignment of the target sequences indicated a preference for sites with symmetry on either side. Cloning one native symmetric site into a second conjugative plasmid, pUB307, confirmed its attractiveness for insertion. More extensive studies provided a consensus symmetric target sequence which, when cloned into a target replicon, proved to be a highly efficient target[32]. The preferred target was a 21 bp palindrome centered on the 9 bp target duplication. It could be dissected into: the 5 bp flanking sequences, the most important for site-specific insertion; the 7 bp palindromic core within the target duplication; the dinucleotide pair at the transposon-target junction; and the local DNA context.

Insertion into pUB307 itself showed a strong preference for a single orientation. By inverting either the vegetative (oriV) or transfer (oriT) origins, it was concluded that orientation was determined by the direction of conjugative transfer. This of course implies that the ends of IS903 are not equivalent. It also implies, as is the case for Tn7[33][34][35][36][37] and members of the IS200/IS608 family[38][39][40], that transposition targets replication forks.

The requirement for the most abundant nucleoid proteins in transposition has also been examined[41]. Most notably, H-NS was required for efficient transposition. Similar results were obtained for IS10 and Tn552, suggesting a more general role for H-NS in bacterial transposition. H-NS exerts its effect on target capture: IS903 targeting preferences in the E. coli chromosome were dramatically altered in the absence of H-NS.

Several other host mutants were identified exhibiting a unique papillation pattern[42]: a ring phenotype with predominant papillae located just inside the edge of the colony, implying a spatial triggering of transposition within the colony. These mutants were found to be in pur genes, whose products are involved in purine biosynthesis. The genetic evidence was consistent with a requirement for GTP in IS903 transposition. These observations suggest that transposition occurs in later stages of colony growth. Transposition may occur within the colony edge in response to a gradient of exogenous purines across the colony and may also reflect the developmental stage of the cells.

IS903 transposase, like those of a variety of other IS, exhibits a strong preference for action in cis: complementation of defective transposons in trans occurs at less than 1%[42]. Transposition is extremely sensitive to the distance between the 3' end of the transposase gene and the nearest transposon IR. Insertion of 1 kb of DNA reduces transposition to 1-2%. There is a strong correlation between the stability of transposase and its ability to act in trans. Wild-type transposase has a half-life of about 3 min. Fusion with α-galactosidase stabilizes the protein and results in an increase in its capacity to act in trans. A similar effect was noted in a lon mutant strain where trans activity was increased by a factor of 10-100. Further studies identified a class of transposase mutants specifically enhanced in trans activity and reduced in cis activity without increasing the overall transposition frequency. This was correlated with an increase in transposase half-life compared to the wild type[43]. A second class of mutants, with enhanced cis activity, showed increased levels of transposase expression (as for IS10[44]).

IS1182

IS1182 family members exhibit a diverse set of target specificities. Some duplicate 4 bp. These are of two types: those specific for CTAG and those that show no apparent target sequence specificity. Yet others target palindromic sequences. These are also of different types: some insert at the 3’ foot of a stem-loop and duplicate the entire structure while others insert 3’ of the loop and simply duplicate the loop (P. Siguier, E. Gourbeyre and M. Chandler, unpublished) (Fig.IS1182.1).

Fig. IS1182.1. The IS1182 family's main characteristics. Top: The left (IRL) and right (IRR) inverted terminal repeats are shown in WebLogo format. Bottom: distribution of IS length (base pairs) among IS1182 family members. The number of examples used in the sample is shown above each column.

ISDol1 group (ISNCY)

Another small group, ISDol1, with 58 members from a large number of bacterial species, has emerged from the ISNCY “orphan” group. Members are between 1600 and 1900 bp long (Fig.ISDol1.1) and generate DRs of 6-7 bp.

Fig. ISDol1.1. The ISDol1 group's main characteristics. Top: The left (IRL) and right (IRR) inverted terminal repeats are shown in WebLogo format. Bottom: distribution of IS length (base pairs) among ISDol1 group members. The number of examples used in the sample is shown above each column.

Bibliography

  1. Blattner FR, Fiandt M, Hass KK, Twose PA, Szybalski W . Deletions and insertions in the immunity region of coliphage lambda: revised measurement of the promoter-startpoint distance. - Virology: 1974 Dec, 62(2);458-71 [PubMed:4432374] [DOI]
  2. Charlier D, Crabeel M, Palchaudhuri S, Cunin R, Boyen A, Glansdorff N . Heteroduplex analysis of regulatory mutations and of insertions (IS1, IS2, IS5) in the bipolar argECBH operon of Escherichia coli. - Mol Gen Genet: 1978 May 3, 161(2);175-84 [PubMed:353507] [DOI]
  3. Charlier D, Crabeel M, Palchaudhuri S, Cunin R, Boyen A, Glansdorff N . Bidirectional polarity of IS2 elements and the polar effect of an IS5 insertion in the argECBH gene cluster of Escherichia coli [proceedings]. - Arch Int Physiol Biochim: 1978 Oct, 86(4);909-10 [PubMed:84614]
  4. Chow LT, Broker TR . Adjacent insertion sequences IS2 and IS5 in bacteriophage Mu mutants and an IS5 in a lambda darg bacteriophage. - J Bacteriol: 1978 Mar, 133(3);1427-36 [PubMed:641012]
  5. Reynolds AE, Felton J, Wright A . Insertion of DNA activates the cryptic bgl operon in E. coli K12. - Nature: 1981 Oct 22, 293(5834);625-9 [PubMed:6270569] [DOI]
  6. Schnetz K, Toloczyki C, Rak B . Beta-glucoside (bgl) operon of Escherichia coli K-12: nucleotide sequence, genetic organization, and possible evolutionary relationship to regulatory components of two Bacillus subtilis genes. - J Bacteriol: 1987 Jun, 169(6);2579-90 [PubMed:3034860] [DOI]
  7. Schnetz K, Rak B . IS5: a mobile enhancer of transcription in Escherichia coli. - Proc Natl Acad Sci U S A: 1992 Feb 15, 89(4);1244-8 [PubMed:1311089] [DOI]
  8. Schnetz K, Rak B . Regulation of the bgl operon of Escherichia coli by transcriptional antitermination. - EMBO J: 1988 Oct, 7(10);3271-7 [PubMed:2846278]
  9. Schnetz K . Silencing of Escherichia coli bgl promoter by flanking sequence elements. - EMBO J: 1995 Jun 1, 14(11);2545-50 [PubMed:7781607]
  10. Schnetz K, Wang JC . Silencing of the Escherichia coli bgl promoter: effects of template supercoiling and cell extracts on promoter activity in vitro. - Nucleic Acids Res: 1996 Jun 15, 24(12);2422-8 [PubMed:8710516] [DOI]
  11. Grindley ND, Joyce CM . Genetic and DNA sequence analysis of the kanamycin resistance transposon Tn903. - Proc Natl Acad Sci U S A: 1980 Dec, 77(12);7176-80 [PubMed:6261245] [DOI]
  12. Stibitz S, Davies JE . Tn602: a naturally occurring relative of Tn903 with direct repeats. - Plasmid: 1987 May, 17(3);202-9 [PubMed:2819910] [DOI]
  13. Tolmasky ME, Crosa JH . Iron transport genes of the pJM1-mediated iron uptake system of Vibrio anguillarum are included in a transposonlike structure. - Plasmid: 1995 May, 33(3);180-90 [PubMed:7568465] [DOI]
  14. Mahillon J, Chandler M . Insertion sequences. - Microbiol Mol Biol Rev: 1998 Sep, 62(3);725-74 [PubMed:9729608]
  15. Zhang X, Jiang N, Feschotte C, Wessler SR . PIF- and Pong-like transposable elements: distribution, evolution and relationship with Tourist-like miniature inverted-repeat transposable elements. - Genetics: 2004 Feb, 166(2);971-86 [PubMed:15020481] [DOI]
  16. Sharma V, Firth AE, Antonov I, Fayet O, Atkins JF, Borodovsky M, Baranov PV . A pilot study of bacterial genes with disrupted ORFs reveals a surprising profusion of protein sequence recoding mediated by ribosomal frameshifting and transcriptional realignment. - Mol Biol Evol: 2011 Nov, 28(11);3195-211 [PubMed:21673094] [DOI]
  17. Baltz RH, Hahn DR, McHenney MA, Solenberg PJ . Transposition of Tn5096 and related transposons in Streptomyces species. - Gene: 1992 Jun 15, 115(1-2);61-5 [PubMed:1319378] [DOI]
  18. Chen CW, Yu TW, Chung HM, Chou CF . Discovery and characterization of a new transposable element, Tn4811, in Streptomyces lividans 66. - J Bacteriol: 1992 Dec, 174(23);7762-9 [PubMed:1332944] [DOI]
  19. Rezsöhazy R, Hallet B, Delcour J, Mahillon J . The IS4 family of insertion sequences: evidence for a conserved transposase motif. - Mol Microbiol: 1993 Sep, 9(6);1283-95 [PubMed:7934941] [DOI]
  20. Rezsöhazy R, Hallet B, Delcour J, Mahillon J . The IS4 family of insertion sequences: evidence for a conserved transposase motif. - Mol Microbiol: 1993 Sep, 9(6);1283-95 [PubMed:7934941] [DOI]
  21. Roberts D, Hoopes BC, McClure WR, Kleckner N . IS10 transposition is regulated by DNA adenine methylation. - Cell: 1985 Nov, 43(1);117-30 [PubMed:3000598] [DOI]
  22. Engler JA, van Bree MP . The nucleotide sequence and protein-coding capability of the transposable element IS5. - Gene: 1981 Aug, 14(3);155-63 [PubMed:6269958] [DOI]
  23. Schoner B, Kahn M . The nucleotide sequence of IS5 from Escherichia coli. - Gene: 1981 Aug, 14(3);165-74 [PubMed:6269959] [DOI]
  24. Rak B, Lusky M, Hable M . Expression of two proteins from overlapping and oppositely oriented genes on transposable DNA insertion element IS5. - Nature: 1982 May 13, 297(5862);124-8 [PubMed:6281651] [DOI]
  25. Rak B, von Reutern M . Insertion element IS5 contains a third gene. - EMBO J: 1984 Apr, 3(4);807-11 [PubMed:6327289]
  26. Derbyshire KM, Hwang L, Grindley ND . Genetic analysis of the interaction of the insertion sequence IS903 transposase with its terminal inverted repeats. - Proc Natl Acad Sci U S A: 1987 Nov, 84(22);8049-53 [PubMed:2825175] [DOI]
  27. Derbyshire KM, Grindley ND . Binding of the IS903 transposase to its inverted repeat in vitro. - EMBO J: 1992 Sep, 11(9);3449-55 [PubMed:1324175]
  28. Tavakoli NP, DeVost J, Derbyshire KM . Defining functional regions of the IS903 transposase. - J Mol Biol: 1997 Dec 12, 274(4);491-504 [PubMed:9417930] [DOI]
  29. Tavakoli NP, Derbyshire KM . IS903 transposase mutants that suppress defective inverted repeats. - Mol Microbiol: 1999 Feb, 31(4);1183-95 [PubMed:10096085] [DOI]
  30. Tavakoli NP, Derbyshire KM . Tipping the balance between replicative and simple transposition. - EMBO J: 2001 Jun 1, 20(11);2923-30 [PubMed:11387225] [DOI]
  31. Hu WY, Derbyshire KM . Target choice and orientation preference of the insertion sequence IS903. - J Bacteriol: 1998 Jun, 180(12);3039-48 [PubMed:9620951]
  32. Hu WY, Thompson W, Lawrence CE, Derbyshire KM . Anatomy of a preferred target site for the bacterial insertion sequence IS903. - J Mol Biol: 2001 Feb 23, 306(3);403-16 [PubMed:11178901] [DOI]
  33. Wolkow CA, DeBoy RT, Craig NL . Conjugating plasmids are preferred targets for Tn7. - Genes Dev: 1996 Sep 1, 10(17);2145-57 [PubMed:8804309] [DOI]
  34. Peters JE, Craig NL . Tn7 recognizes transposition target structures associated with DNA replication using the DNA-binding protein TnsE. - Genes Dev: 2001 Mar 15, 15(6);737-47 [PubMed:11274058] [DOI]
  35. Peters JE, Craig NL . Tn7 transposes proximal to DNA double-strand breaks and into regions where chromosomal DNA replication terminates. - Mol Cell: 2000 Sep, 6(3);573-82 [PubMed:11030337] [DOI]
  36. Peters JE . Tn7. - Microbiol Spectr: 2014 Oct, 2(5); [PubMed:26104363] [DOI]
  37. Parks AR, Li Z, Shi Q, Owens RM, Jin MM, Peters JE . Transposition into replicating DNA occurs through interaction with the processivity factor. - Cell: 2009 Aug 21, 138(4);685-95 [PubMed:19703395] [DOI]
  38. He S, Corneloup A, Guynet C, Lavatine L, Caumont-Sarcos A, Siguier P, Marty B, Dyda F, Chandler M, Ton Hoang B . The IS200/IS605 Family and "Peel and Paste" Single-strand Transposition Mechanism. - Microbiol Spectr: 2015 Aug, 3(4); [PubMed:26350330] [DOI]
  39. Lavatine L, He S, Caumont-Sarcos A, Guynet C, Marty B, Chandler M, Ton-Hoang B . Single strand transposition at the host replication fork. - Nucleic Acids Res: 2016 Sep 19, 44(16);7866-83 [PubMed:27466393] [DOI]
  40. Ton-Hoang B, Pasternak C, Siguier P, Guynet C, Hickman AB, Dyda F, Sommer S, Chandler M . Single-stranded DNA transposition is coupled to host replication. - Cell: 2010 Aug 6, 142(3);398-408 [PubMed:20691900] [DOI]
  41. Swingle B, O'Carroll M, Haniford D, Derbyshire KM . The effect of host-encoded nucleoid proteins on transposition: H-NS influences targeting of both IS903 and Tn10. - Mol Microbiol: 2004 May, 52(4);1055-67 [PubMed:15130124] [DOI]
  42. Coros AM, Twiss E, Tavakoli NP, Derbyshire KM . Genetic evidence that GTP is required for transposition of IS903 and Tn552 in Escherichia coli. - J Bacteriol: 2005 Jul, 187(13);4598-606 [PubMed:15968071] [DOI]
  43. Derbyshire KM, Grindley ND . Cis preference of the IS903 transposase is mediated by a combination of transposase instability and inefficient translation. - Mol Microbiol: 1996 Sep, 21(6);1261-72 [PubMed:8898394] [DOI]
  44. Jain C, Kleckner N . Preferential cis action of IS10 transposase depends upon its mode of synthesis. - Mol Microbiol: 1993 Jul, 9(2);249-60 [PubMed:8412678] [DOI]