Fungal viruses, referred to as mycoviruses, are now widely regarded as important factors in fungal biology and ecology. They occur in diverse fungal lineages and are considered to be ubiquitous throughout the fungal kingdom [1-4]. Among the diverse groups of mycoviruses, double-stranded RNA (dsRNA) viruses are one of the most extensively studied. Taxonomically, dsRNA mycoviruses are classified into several families, including the previously established Partitiviridae, Chrysoviridae, Totiviridae, Quadriviridae, Megabirnaviridae, Spinareoviridae, and Alternaviridae, as well as the recently identified Polymycoviridae (International Committee on Taxonomy of Viruses, http://ictv.global). Trichoderma species are widely studied fungi because of their ecological significance and potential applications in agriculture and biotechnology [5, 6]. In recent years, dsRNA mycoviruses have been reported and characterized in different Trichoderma isolates using molecular and genomic approaches, revealing diverse genomic organizations and evolutionary relationships [7-18]. So far, at least four virus families have been identified for mycoviruses infecting Trichoderma spp.: two classified families, Hypoviridae and Partitiviridae, and two proposed families, “Fusagraviridae” and “Ambiguiviridae”. In this study, we report a novel mycovirus belonging to the family Alternaviridae from the Trichoderma harzianum NFCF419 strain. To the best of our knowledge, this is the first report of an alternavirus in the genus Trichoderma.
dsRNA preparation from T. harzianum, followed by agarose gel electrophoresis, revealed a multiple-band pattern suggesting infection by multiple viruses (Fig. 1A). Next-generation sequencing (NGS) of purified dsRNA using the Illumina HiSeq 2000 resulted in a total of 5,341 contigs with an average length of 383 nucleotides. The resulting contigs were then subjected to a BLAST search against the NCBI database to identify viral sequences. Sequence analysis of the derived contigs revealed the presence of an alternavirus in a mixed infection, represented by four contigs of 3,561, 2,529, 2,570, and 1,471 nucleotides. RT-PCR amplification using sequence-specific primer pairs based on the assembled contigs yielded amplicons of the expected sizes, and manual sequencing of the amplified fragments verified the presence of the viral genome and confirmed the NGS sequence.
Northern blot analysis using RT-PCR amplicons representing each of the four contigs as probes revealed hybridization signals at the positions of the expected dsRNA bands (Fig. 1B). These results demonstrated that the novel viral genome was successfully purified and corresponded to bands observed in the gel. In addition, 5'- and 3'-RACE identified the terminal sequences of each genome segment.
The complete nucleotide sequences of the four dsRNA segments were determined to be 3,572 bp (dsRNA1) with a GC content of 53%; 2,552 bp (dsRNA2) with a GC content of 53%; 2,593 bp (dsRNA3) with a GC content of 57%; and 1,484 bp (dsRNA4) with a GC content of 60%, consistent with the sizes of the bands observed in agarose gel electrophoresis (Fig. 1A). The genome sequences were deposited in GenBank under accession numbers PX441381, PX441382, PX441383, and PX441384, respectively. Sequence analysis revealed that each segment contains a single ORF and terminates with a poly(A) tail at the 3′ end (Fig. 1C).
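For readers who wish to recompute the GC content of the deposited segments, the calculation is simply the share of G and C bases in the sequence. The following is a minimal sketch; the function name `gc_content` and the toy sequence are illustrative assumptions and are not taken from the viral genome or from the pipeline used in this study.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    # Count G and C; any other symbol (A, T, U, N) only adds to the length.
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


if __name__ == "__main__":
    # Hypothetical toy sequence for illustration only.
    toy = "GGCCAATT"
    print(f"GC content: {gc_content(toy):.1f}%")
```

Applied to a full-length segment downloaded from GenBank, the same function should give a value comparable to the percentages reported above, subject to rounding and to whether the poly(A) tail is included.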
The ORF in dsRNA1 (ORF1) starts at nucleotide position 72 and ends at position 3,446. ORF1 is predicted to encode a protein of 1,124 amino acids with a calculated molecular weight of 127.8 kDa. BLASTp analysis indicated that the deduced amino acid sequence exhibited the highest similarity (59.9% aa identity) to the RNA-dependent RNA polymerase (RdRP) of Dactylonectria torresensis alternavirus 1 (DtAV1). Multiple alignment of RdRP sequences shows that the deduced sequence retains all conserved RdRP motifs (I-VIII), and the characteristic substitution of glycine (G) with alanine (A) in the metal ion-binding triplet (ADD) in motif VI was also observed (Fig. 2A). These features are consistent with the characteristics of other reported alternaviruses.
The ORF in dsRNA2 (ORF2) extends from nucleotide position 66 to 2,372 and encodes a protein of 768 amino acids with a calculated molecular weight of approximately 84.4 kDa. BLASTp analysis revealed that this protein shared the highest sequence similarity (36.1% aa identity) with the recently proposed putative methyltransferase (MTase) encoded by dsRNA2 of DtAV1 [19], whose counterparts in dsRNA2 of other alternaviruses have been annotated as hypothetical proteins of unknown function [20]. Based on the sequence analysis of putative MTases of alternaviruses [19], the highly conserved MTase sequence (GDXPG[T/S][L/F][G/A/S]RXL) is fully conserved as “GDHPGSLGRAL” from aa 262 to aa 272, while the following conserved sequence (V[V/T]GXDP[K/R]N) is only partially conserved as “SIGIDPLN” from aa 279 to aa 286.
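The degree of conservation of the two MTase motifs can be checked position by position. The sketch below converts the consensus notation used above (X for any residue, bracketed alternatives separated by slashes) into regular-expression tokens and compares each token with the corresponding residue of the observed peptide. The helper names `consensus_to_regex` and `position_matches` are illustrative assumptions and not part of any published tool.

```python
import re

# Consensus motifs and observed peptides exactly as given in the text.
MOTIF_1 = "GDXPG[T/S][L/F][G/A/S]RXL"
MOTIF_2 = "V[V/T]GXDP[K/R]N"
PEPTIDE_1 = "GDHPGSLGRAL"  # aa 262-272
PEPTIDE_2 = "SIGIDPLN"     # aa 279-286


def consensus_to_regex(consensus: str) -> str:
    """Turn 'X' into '.' and '[A/B]' into '[AB]'."""
    return consensus.replace("/", "").replace("X", ".")


def position_matches(consensus: str, peptide: str) -> list[bool]:
    """Return, for each position, whether the residue fits the consensus."""
    # Split the regex into one token per position: a bracket class or one char.
    tokens = re.findall(r"\[[^\]]+\]|.", consensus_to_regex(consensus))
    if len(tokens) != len(peptide):
        raise ValueError("consensus and peptide lengths differ")
    return [re.fullmatch(tok, aa) is not None for tok, aa in zip(tokens, peptide)]


if __name__ == "__main__":
    for motif, pep in ((MOTIF_1, PEPTIDE_1), (MOTIF_2, PEPTIDE_2)):
        hits = position_matches(motif, pep)
        print(f"{pep}: {sum(hits)} of {len(hits)} positions fit {motif}")
```

Under this ungapped comparison, all 11 positions of the first motif fit the consensus, whereas 5 of the 8 positions of the second motif do; the mismatches fall at the first, second, and seventh positions.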
The ORF in dsRNA3 (ORF3) extends from nucleotide position 92 to 2,386 and encodes a protein of 764 aa with a calculated molecular weight of 83.1 kDa. BLASTp analysis revealed that this protein shared the highest sequence similarity (47.2% aa identity) with the coat protein encoded by dsRNA3 of DtAV1. Interestingly, in our virus the segment encoding the coat protein (dsRNA3) is longer than the segment encoding the putative methyltransferase (dsRNA2), an arrangement that is also observed in IrAV1 [19].
The ORF in dsRNA4 (ORF4) extends from nucleotide position 218 to 1,270 and encodes a protein of 350 aa with a calculated molecular weight of 37.7 kDa. BLASTp analysis revealed that this protein shared the highest sequence similarity (29.2% aa identity) with a hypothetical protein of Alternaria alternata alternavirus-1 (AaV1) [21]. The low sequence similarity compared with the other segments reflects the fact that DtAV1, which shows the highest similarity to our virus, consists of only three genome segments.
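The reported ORF coordinates and protein lengths can be cross-checked with simple arithmetic. The sketch below assumes that each reported end position includes the stop codon, which is an assumption made here for illustration rather than something stated in the text; the dictionary `ORFS` and the function `protein_length` are likewise hypothetical names.

```python
# (start, end, reported protein length in aa), coordinates as given in the text.
ORFS = {
    "ORF1": (72, 3446, 1124),
    "ORF2": (66, 2372, 768),
    "ORF3": (92, 2386, 764),
    "ORF4": (218, 1270, 350),
}


def protein_length(start: int, end: int) -> int:
    """Number of amino acids encoded by an ORF spanning start..end (1-based, inclusive).

    Assumes the end coordinate includes the stop codon, so one codon
    is subtracted from the total codon count.
    """
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError("ORF span is not a multiple of three")
    return span // 3 - 1


if __name__ == "__main__":
    for name, (start, end, reported) in ORFS.items():
        computed = protein_length(start, end)
        status = "consistent" if computed == reported else "inconsistent"
        print(f"{name}: {computed} aa computed, {reported} aa reported ({status})")
```

Under this assumption, the computed lengths agree with the reported values for all four ORFs.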
Sequence analysis of the 5'-terminal region of each segment reveals that the four segments are highly diverse, sharing only a short common sub-terminal nucleotide sequence, 5'-GCCCGT-3' (Fig. 2B).
Phylogenetic analysis based on RdRP sequences reveals that our mycovirus clusters with other known alternaviruses (Fig. 2C). All 23 alternaviruses grouped into three clades, with most falling in Clade I, within which our mycovirus formed a subclade with DtAV1.
Based on sequence analysis, genome organization, and phylogenetic analysis, our dsRNA virus is a new member of the family Alternaviridae and is therefore designated Trichoderma harzianum alternavirus 1 (ThAV1). Since the first report in 2009, more than 20 viruses have been identified as members of the family Alternaviridae, and Fusarium is the most common fungal genus infected with alternaviruses [20]. To the best of our knowledge, this is the first report of an alternavirus in the genus Trichoderma.