Difference between revisions of "General Information/ISfinder and the Growing Number of IS"

From TnPedia
 
(11 intermediate revisions by the same user not shown)
Line 1: Line 1:
'''<big>I</big>'''S classification is needed to cope with the high numbers and diversity of ISs. It also permits identification of the many IS fragments present in numerous genomes, contributes to understanding their effects on their host genomes, and can provide insights into their regulation and transposition mechanism. This role has been assumed by ISfinder<ref><nowiki><pubmed>16381877</pubmed></nowiki></ref> following the closure of the Stanford repository<ref><nowiki><pubmed>6282704</pubmed></nowiki></ref>. Several criteria are used to classify IS. These include: genetic organization, the similarity of transposase amino acid primary sequence, length and sequence of terminal inverted repeats, target site preferences, length of target repeats and the chemistry of transposon DNA strand cleavage and transfer into the target DNA [[:Image:1.4.1.png|(Fig.1.4.1)]].[[Image:1.4.1.png|thumb|500x500px|'''Fig.1.4.1.''' Main characteristics to used to define the IS groups and families. |alt=|border|center]]Since 1998, IS have been centralized in the ISfinder database to provide a basic framework for nomenclature and IS classification into related groups or families, often divided into subgroups [[:Image:1.4.2.png|(Fig.1.4.2)]]<ref><nowiki><pubmed>16381877</pubmed></nowiki></ref>. Initially IS were each assigned a simple number<ref><nowiki><pubmed>467979</pubmed></nowiki></ref>. However, to provide information about their provenance, IS nomenclature rules were changed and now resemble those used for restriction enzymes: with the first letter of the genus followed by the first two letters of the species and a number (e.g., IS''Bce1'' for ''Bacillus cereus'').
+
'''<big>I</big>'''S classification is needed to cope with the high numbers and diversity of ISs. It also permits identification of the many IS fragments present in numerous genomes, contributes to understanding their effects on their host genomes, and can provide insights into their regulation and transposition mechanism. This role has been assumed by [https://isfinder.biotoul.fr/ ISfinder]<ref name=":0"><pubmed>16381877</pubmed>
In 1977 only 5 IS (IS''1'', IS''2'', IS''3'', IS''4'' and IS''5'') had been identified <ref><nowiki><pubmed>339095</pubmed></nowiki></ref>.
+
 
<br />[[Image:1.4.2.png|thumb|center|500x500px|Fig.1.4.2. IS groups and families abundance in ISfinder. |alt=]]At the time of publication of the first edition of Mobile DNA I [https://www.amazon.co.uk/Mobile-DNA-Douglas-Berg/dp/1555810055 (Berg & Howe, 1989)] this had risen to 50 [https://www.amazon.co.uk/Mobile-DNA-Douglas-Berg/dp/1555810055 (Galas & Chandler, 1989 pp. 109–162)]; at the time of the second, Mobile DNA II [https://www.asmscience.org/content/book/10.1128/9781555817954 (Craig, et al., 2002)], there were more than 700; and at present, ISfinder includes more than 4600 examples distributed into 29 families some of which can be conveniently divided into subgroups [[:Image:1.4.3.png|(Fig.1.4.3)]] <ref><nowiki><pubmed>26104715</pubmed></nowiki></ref><ref><nowiki><pubmed>24499397</pubmed></nowiki></ref>. This classification evolves continuously with the accumulation of additional ISs. The IS in the ISfinder repository represents only a fraction of IS present in the public databases. Not only has the number of IS identified increased dramatically with the advent of high throughput genome sequencing but the examination of the public databases has shown that genes annotated as transposases (Tpases), the enzymes which catalyze TE movement (or proteins with related functions), are by far the most abundant functional class<ref><nowiki><pubmed>PMC2910039</pubmed></nowiki></ref>. [[:Image:1.2.5.png|(Fig.1.2.5)]]
+
&lt;/nowiki&gt;</ref> following the closure of the [https://www.stanford.edu/ Stanford] repository<ref><pubmed>6282704</pubmed></ref>. Several criteria are used to classify IS. These include: genetic organization, the similarity of transposase amino acid primary sequence, length and sequence of terminal inverted repeats, target site preferences, length of target repeats and the chemistry of transposon DNA strand cleavage and transfer into the target DNA [[:Image:1.4.1.png|(Fig.4.1)]].[[Image:1.4.1.png|thumb|600x600px|'''Fig.4.1.''' Main characteristics used to define the IS groups and families. Terminal inverted repeats (IRL and IRR) are shown as two-colored boxes (a and b) with functions for transposase binding (a) and recognition for cleavage and strand transfer (a). A single (left) or double (right) open reading frame is shown underneath the IS (blue arrow). The transposase of the IS on the right is produced by programmed -1 translational frameshifting. The reading frames are indicated within the IS. The product of the upstream frame generally acts as a regulatory protein. The indigenous Tpase promoter is shown located (by convention) in IRL. XXX and YYYY represent the short direct target repeat sequence which is generally duplicated during the insertion event. |alt=|border|center]]Since 1998, IS have been centralized in the [https://isfinder.biotoul.fr/ ISfinder] database to provide a basic framework for nomenclature and IS classification into related groups or families, often divided into subgroups [[:Image:1.4.2.png|(Fig.4.2)]]<ref name=":0" />. Initially IS were each assigned a simple number<ref><pubmed>467979</pubmed></ref>. However, to provide information about their provenance, IS nomenclature rules were changed and now resemble those used for restriction enzymes: with the first letter of the genus followed by the first two letters of the species and a number <ref>Mahillon J, Chandler M. Insertion Sequence Nomenclature. ASM News. 2000;66:324. 
</ref> (e.g., [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISBce1 IS''Bce1''] for ''[[wikipedia:Bacillus_cereus|Bacillus cereus]]''). In 1977 only 5 IS ([https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1R IS''1''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS4 IS''4''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS5 IS''5'']) had been identified <ref><pubmed>339095</pubmed></ref>.[[Image:1.4.2.png|thumb|center|680x680px|'''Fig.4.2.''' IS groups and families abundance in ISfinder: Distribution of IS families in theISfinder database.
[[Image:1.4.3.png|thumb|center|500x500px|Fig.1.4.3. The growing number of IS deposited in the ISfinder Database. |alt=|border]]
+
 
 +
The histogram shows the number of IS of a given family, as defined in the text, in the ISfinder database (June 2013). The horizontal boxes indicate the number and relative size of different subgroups (see [[General Information/What Is an IS?#Characteristics of insertion sequence families|Table 1]] for the names of the subgroups) within the family. They are grouped by color to indicate the type of Tpase used: DDE, blue; undetermined, purple; DEDD, green; HUH, red; Serine, orange.|alt=]]At the time of publication of the first edition of Mobile DNA I [https://www.amazon.co.uk/Mobile-DNA-Douglas-Berg/dp/1555810055 (Berg & Howe, 1989)]<ref>Berg DE, Howe MM. Mobile DNA. Washington, D.C: American Society For Microbiology; 1989. p. 972. </ref> this had risen to 50 [https://www.amazon.co.uk/Mobile-DNA-Douglas-Berg/dp/1555810055 (Galas & Chandler, 1989 pp. 109–162)]<ref>Galas DJ, Chandler M. Bacterial insertion sequences. In: Berg DE, Howe MM, editors. Mob DNA. Washington DC: American Society for Microbiology; 1989. p. 109–162. </ref>; at the time of the second, Mobile DNA II [https://www.asmscience.org/content/book/10.1128/9781555817954 (Craig, et al., 2002)]<ref>Chandler M, Mahillon J. Insertion Sequences Revisited. In: Craig NL, Lambowitz AM, Craigie R, Gellert M, editors. Mobile DNA II. American Society of Microbiology; 2002. p. 305–366. </ref>, there were more than 700; and at present, ISfinder includes more than 4600 examples distributed into 29 families some of which can be conveniently divided into subgroups [[:Image:1.4.3.png|(Fig.4.3)]] <ref><pubmed>26104715</pubmed></ref><ref><pubmed>24499397</pubmed></ref>. This classification evolves continuously with the accumulation of additional ISs. The IS in the ISfinder repository represents only a fraction of IS present in the public databases. 
Not only has the number of IS identified increased dramatically with the advent of high throughput genome sequencing, but the examination of the public databases has shown that genes annotated as transposases (Tpases), the enzymes which catalyze TE movement (or proteins with related functions), are by far the most abundant functional class<ref><pubmed>PMC2910039</pubmed></ref> [[:Image:1.2.5.png|(Fig.2.5).]]
 +
[[Image:1.4.3.png|thumb|center|620x620px|'''Fig.4.3.''' The growing number of IS deposited in the ISfinder Database. Diagram showing the increase in the number of IS in the ISfinder database as a function of time. At present (May 2020) there are over 5000 entries. |alt=|border]]
 
==Bibliography==
 
==Bibliography==
 
<references />
 
<references />

Latest revision as of 18:15, 9 August 2021

IS classification is needed to cope with the high numbers and diversity of ISs. It also permits identification of the many IS fragments present in numerous genomes, contributes to understanding their effects on their host genomes, and can provide insights into their regulation and transposition mechanism. This role has been assumed by ISfinder[1] following the closure of the Stanford repository[2]. Several criteria are used to classify IS. These include: genetic organization, the similarity of transposase amino acid primary sequence, length and sequence of terminal inverted repeats, target site preferences, length of target repeats and the chemistry of transposon DNA strand cleavage and transfer into the target DNA (Fig.4.1).

Fig.4.1. Main characteristics used to define the IS groups and families. Terminal inverted repeats (IRL and IRR) are shown as two-colored boxes (a and b) with functions for transposase binding (a) and recognition for cleavage and strand transfer (a). A single (left) or double (right) open reading frame is shown underneath the IS (blue arrow). The transposase of the IS on the right is produced by programmed -1 translational frameshifting. The reading frames are indicated within the IS. The product of the upstream frame generally acts as a regulatory protein. The indigenous Tpase promoter is shown located (by convention) in IRL. XXX and YYYY represent the short direct target repeat sequence which is generally duplicated during the insertion event.

Since 1998, IS have been centralized in the ISfinder database to provide a basic framework for nomenclature and for IS classification into related groups or families, often divided into subgroups (Fig.4.2)[1]. Initially, each IS was assigned a simple number[3]. However, to provide information about their provenance, IS nomenclature rules were changed and now resemble those used for restriction enzymes: "IS" is followed by the first letter of the genus, the first two letters of the species, and a number[4] (e.g., ISBce1 for Bacillus cereus). In 1977, only five IS (IS1, IS2, IS3, IS4 and IS5) had been identified[5].
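The naming rule described above can be illustrated with a minimal Python sketch. The function name `is_name` and its input checks are hypothetical and are not part of ISfinder; the sketch covers only the rule as stated here (genus initial, first two letters of the species, and a number), and names actually assigned by ISfinder may depart from it.

```python
def is_name(genus: str, species: str, number: int) -> str:
    """Build an IS name: 'IS' + genus initial + first two species letters + number.

    Hypothetical helper for illustration only; it encodes just the rule
    described in the text, not ISfinder's actual assignment procedure.
    """
    if not genus or len(species) < 2:
        raise ValueError("need a genus and a species epithet of at least two letters")
    if number < 1:
        raise ValueError("number must be a positive integer")
    # Genus initial in upper case, species letters in lower case.
    prefix = genus[0].upper() + species[:2].lower()
    return f"IS{prefix}{number}"


print(is_name("Bacillus", "cereus", 1))  # ISBce1
```

Applied to the example in the text, the genus Bacillus and the species cereus with number 1 give ISBce1.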

Fig.4.2. IS groups and families abundance in ISfinder: Distribution of IS families in the ISfinder database. The histogram shows the number of IS of a given family, as defined in the text, in the ISfinder database (June 2013). The horizontal boxes indicate the number and relative size of different subgroups (see Table 1 for the names of the subgroups) within the family. They are grouped by color to indicate the type of Tpase used: DDE, blue; undetermined, purple; DEDD, green; HUH, red; Serine, orange.

At the time of publication of the first edition, Mobile DNA I (Berg & Howe, 1989)[6], this had risen to 50 (Galas & Chandler, 1989, pp. 109–162)[7]; at the time of the second, Mobile DNA II (Craig et al., 2002)[8], there were more than 700; and at present, ISfinder includes more than 4600 examples distributed into 29 families, some of which can be conveniently divided into subgroups (Fig.4.3)[9][10]. This classification evolves continuously with the accumulation of additional ISs. The IS in the ISfinder repository represent only a fraction of the IS present in the public databases. Not only has the number of IS identified increased dramatically with the advent of high-throughput genome sequencing, but examination of the public databases has also shown that genes annotated as transposases (Tpases), the enzymes which catalyze TE movement (or proteins with related functions), are by far the most abundant functional class[11] (Fig.2.5).

Fig.4.3. The growing number of IS deposited in the ISfinder Database. Diagram showing the increase in the number of IS in the ISfinder database as a function of time. At present (May 2020) there are over 5000 entries.

Bibliography

  1. Siguier P, Perochon J, Lestrade L, Mahillon J, Chandler M. ISfinder: the reference centre for bacterial insertion sequences. Nucleic Acids Res. 2006 Jan 1;34(Database issue):D32-6. [PubMed:16381877]
  2. Lederberg EM. Plasmid reference center registry of transposon (Tn) allocations through July 1981. Gene. 1981 Dec;16(1-3):59-61. [PubMed:6282704]
  3. Campbell A, Berg DE, Botstein D, Lederberg EM, Novick RP, Starlinger P, Szybalski W. Nomenclature of transposable elements in prokaryotes. Gene. 1979 Mar;5(3):197-206. [PubMed:467979]
  4. Mahillon J, Chandler M. Insertion Sequence Nomenclature. ASM News. 2000;66:324.
  5. Nevers P, Saedler H. Transposable genetic elements as agents of gene instability and chromosomal rearrangements. Nature. 1977 Jul 14;268(5616):109-15. [PubMed:339095]
  6. Berg DE, Howe MM. Mobile DNA. Washington, D.C: American Society For Microbiology; 1989. p. 972.
  7. Galas DJ, Chandler M. Bacterial insertion sequences. In: Berg DE, Howe MM, editors. Mobile DNA. Washington DC: American Society for Microbiology; 1989. p. 109–162.
  8. Chandler M, Mahillon J. Insertion Sequences Revisited. In: Craig NL, Lambowitz AM, Craigie R, Gellert M, editors. Mobile DNA II. American Society of Microbiology; 2002. p. 305–366.
  9. Siguier P, Gourbeyre E, Varani A, Ton-Hoang B, Chandler M. Everyman's Guide to Bacterial Insertion Sequences. Microbiol Spectr. 2015 Apr;3(2):MDNA3-0030-2014. [PubMed:26104715]
  10. Siguier P, Gourbeyre E, Chandler M. Bacterial insertion sequences: their genomic impact and diversity. FEMS Microbiol Rev. 2014 Sep;38(5):865-91. [PubMed:24499397]
  11. Aziz RK, Breitbart M, Edwards RA. Transposases are the most abundant, most ubiquitous genes in nature. Nucleic Acids Res. 2010 Jul;38(13):4207-17. [PubMed:20215432]