EVE 161 Winter 2018 Class 11
2. RESEARCH ARTICLE
Whole-Genome Random
Sequencing and Assembly of
Haemophilus influenzae Rd
Robert D. Fleischmann, Mark D. Adams, Owen White, Rebecca A. Clayton,
Ewen F. Kirkness, Anthony R. Kerlavage, Carol J. Bult, Jean-Francois Tomb,
Brian A. Dougherty, Joseph M. Merrick, Keith McKenney, Granger Sutton,
Will FitzHugh, Chris Fields,* Jeannine D. Gocayne, John Scott, Robert Shirley,
Li-Ing Liu, Anna Glodek, Jenny M. Kelley, Janice F. Weidman, Cheryl A. Phillips,
Tracy Spriggs, Eva Hedblom, Matthew D. Cotton, Teresa R. Utterback,
Michael C. Hanna, David T. Nguyen, Deborah M. Saudek, Rhonda C. Brandon,
Leah D. Fine, Janice L. Fritchman, Joyce L. Fuhrmann, N. S. M. Geoghagen,
Cheryl L. Gnehm, Lisa A. McDonald, Keith V. Small, Claire M. Fraser,
Hamilton O. Smith, J. Craig Venter†
An approach for genome analysis based on sequencing and assembly of unselecte
12. What is new here?
• What did they do that was new in terms of sequencing?
32. Accuracy Checks?
• Frameshifts in protein sequences vs. homologs
• Coverage > 1x
• Few ambiguities
• Comparison to H. influenzae sequences in databases
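The "Coverage > 1x" check can be made concrete with a small sketch. It assumes the Lander-Waterman (Poisson) model for randomly placed shotgun reads, which the slide itself does not name. The genome length, read length, and read counts below are hypothetical round numbers chosen for illustration, not figures from the paper.

```python
import math


def coverage(num_reads: int, mean_read_len: float, genome_len: int) -> float:
    """Average sequence coverage c = N * L / G."""
    return num_reads * mean_read_len / genome_len


def expected_unsequenced_fraction(c: float) -> float:
    """Under a Poisson model of random reads, P(base never sequenced) = e^-c."""
    return math.exp(-c)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    genome_len = 1_800_000   # bases
    mean_read_len = 450      # bases per read
    for num_reads in (4_000, 12_000, 24_000):
        c = coverage(num_reads, mean_read_len, genome_len)
        missing = expected_unsequenced_fraction(c)
        print(f"{num_reads:>6} reads: coverage {c:.1f}x, "
              f"expected unsequenced fraction {missing:.4f}")
```

Under this model, 1x coverage would still be expected to leave roughly a third of the bases unsequenced, which is one way to see why coverage well above 1x is treated as an accuracy check.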
38. [Figure: gene map of the H. influenzae Rd genome; gene labels not recoverable from the extraction. Legend of functional categories:]
• Amino acid biosynthesis
• Biosynthesis of cofactors, prosthetic groups, carriers
• Cell envelope
• Cellular processes
• Central intermediary metabolism
• Energy metabolism
• Fatty acid/Phospholipid metabolism
• Purines, pyrimidines, nucleosides and nucleotides
• Regulatory functions
• Replication
• Transport/binding proteins
• Translation
• Transcription
• Other categories
• Hypothetical
• Unknown
Downloaded from http://science.sciencemag.org/ on February 8, 2018
39. [Figure: gene map of the H. influenzae Rd genome, continued; gene labels and position ticks (in nt) not recoverable from the extraction]