Enteropathogenic Escherichia coli (EPEC) is recognized as the leading cause of infantile diarrhea in developing countries. Since 1996, EPEC has been divided into typical and atypical strains, based on the presence or absence of the plasmid encoding an EPEC adherence factor (EAF). Unlike typical EPEC (tEPEC) strains, which infect only humans, atypical EPEC (aEPEC) strains have a wider host range and are more prevalent than tEPEC in both industrialized and developing countries. The role of aEPEC as a diarrheagenic pathogen in humans is controversial: some aEPEC strains can cause acute and persistent diarrhea in children, yet aEPEC is also highly prevalent in asymptomatic children. In October 2010, the unusual death of an 8-month-old infant was reported to health authorities in Fujian province, southeast China. During the preliminary postmortem investigation, two distinct types of colonies were isolated from anal swabs inoculated on eosin-methylene blue (EMB) agar plates; the dominant colonies (designated strain 2010137Y here) were identified as aEPEC by several molecular approaches in our laboratory. We previously screened strain 2010137Y for potential virulence genes by real-time PCR and conventional PCR assays, which showed eae(+), ehxA(+), EHEC-stx1(–), EHEC-stx2(–), bfp(–) and paa(–), and the absence of other virulence genes including efa1/lifA, set/ent, nleE, lpfA1, lpfAO113, lpfAR141, ureD, yjaA, ibeA, astA, perA and bl121. In this study, whole genome sequencing (WGS) of this strain was conducted on the 454 GS-Junior sequencer; subsequent sequence annotation and in silico analyses of the WGS data provided a comprehensive evaluation of the genomic structure, putative virulence factors, antimicrobial resistance, and genetic relationship of this emerging strain to other circulating strains. A shotgun run of strain 2010137Y yielded approximately 70 Mb of raw sequence data from 156 k random reads.
Read lengths ranged from 40 to 612 bp, with a median of 487 bp and a mean of 449 bp. The 5.29 Mb draft genome (GenBank accession: MTPL00000000) comprised 343 contigs ranging from 206 bp to 166 kb in size. According to the NCBI Prokaryotic Genome Annotation Pipeline (PGAP), the genome of strain 2010137Y contained 5,746 coding sequences (CDSs). In comparison, the RAST (Rapid Annotation using Subsystem Technology) server predicted 5,434 CDSs in the draft genome, categorized into 584 subsystems (RAST genome annotation ID: 99570) (Figure S1 in Supporting Information). Each subsystem contained from several to hundreds of sub-categories. Notably, 121 CDSs were associated with the Virulence, Disease and Defense subsystem; no pathogenicity island (PAI) was identified (Table S1 in Supporting Information). The complete 2,805 bp intimin gene, which was flanked by core structures of a pathogenicity island known as the locus of enterocyte effacement (LEE),
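The sequencing figures quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the authors' analysis. It uses the approximate values reported in the abstract (about 70 Mb of raw data, about 156 k reads, a 5.29 Mb draft genome). The depth figure it derives is an assumption-laden estimate: it divides raw yield by draft assembly size, so it ignores read trimming and any difference between the draft size and the true genome size.

```python
# Approximate figures as quoted in the abstract.
RAW_BASES = 70e6          # ~70 Mb of raw sequence data
READ_COUNT = 156_000      # ~156 k random reads
DRAFT_GENOME_BP = 5.29e6  # 5.29 Mb draft genome


def mean_read_length(total_bases: float, n_reads: int) -> float:
    """Mean read length in bp: total bases divided by number of reads."""
    return total_bases / n_reads


def sequencing_depth(total_bases: float, genome_size: float) -> float:
    """Rough average depth: total bases divided by genome size."""
    return total_bases / genome_size


mean_len = mean_read_length(RAW_BASES, READ_COUNT)
depth = sequencing_depth(RAW_BASES, DRAFT_GENOME_BP)

print(f"implied mean read length: {mean_len:.0f} bp")
print(f"estimated raw depth: {depth:.1f}x")
```

The implied mean read length agrees with the reported 449 bp, and the raw yield corresponds to roughly 13-fold depth over the draft assembly under the stated assumptions.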
               