US20040067554A1 - Nucleotide sequences of Moraxella catarrhalis genome

Publication number
US20040067554A1
Authority
US
United States
Prior art keywords
protein
nucleic acid
catarrhalis
vector
orf
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/672,787
Inventor
Robert Lagace
Chandra Patterson
Kim Berg
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Merck and Co Inc
Original Assignee
Lagace Robert E.
Chandra Patterson
Berg Kim L.
Application filed by Lagace Robert E., Chandra Patterson, Berg Kim L.
Priority to US10/672,787
Publication of US20040067554A1
Assigned to MERCK & CO., INC. Assignment of assignors interest (see document for details). Assignors: ELITRA PHARMACEUTICALS, INC.

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/21 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Pseudomonadaceae (F)
    • C07K14/212 Moraxellaceae, e.g. Acinetobacter, Moraxella, Oligella, Psychrobacter

Definitions

  • the present invention discloses nucleotide sequences from the genome of Moraxella catarrhalis . These sequences may be used in various assays and in the development of diagnostic and therapeutic agents.
  • M. catarrhalis (Branhamella catarrhalis) is a large, aerobic, gram-negative diplococcus normally found among the bacterial flora of human upper airways. It is nonmotile and possesses fimbriae. Colonies are regularly friable and nonadherent and grow well on blood or chocolate agar. Unlike many other pathogenic bacteria, M. catarrhalis shows a high degree of homogeneity in its outer membrane proteins. This usually harmless parasite of the mucous membranes may behave as an opportunistic pathogen when microbe-host balance is perturbed. Following infection, host antibodies directed against one or more of the microbial outer-membrane proteins are detectable in the serum.
  • M. catarrhalis is known to cause acute, localized infections such as otitis media, sinusitis, and bronchopulmonary infection and life-threatening, systemic diseases including endocarditis and meningitis.
  • the presence of bacterial endotoxin and host histamine and chemotactic factors are major indicators of M. catarrhalis pathogenicity.
  • M. catarrhalis can be isolated from the upper respiratory tract of 50% of healthy school children and 7% of healthy adults. In children with otitis media, colonization increases to 86%, and it is the third most common bacterial isolate. It causes 10-15% of otitis media and sinusitis. Infections of the maxillary sinuses, middle ears, or bronchi may occur through contiguous spread of the microbes. M. catarrhalis causes a large proportion of lower respiratory tract infections in elderly patients with chronic obstructive pulmonary diseases and is exceeded only by Haemophilus influenzae and Streptococcus pneumoniae as a causative agent of acute purulent exacerbations of chronic bronchitis.
  • Pneumonia caused by M. catarrhalis, like that caused by H. influenzae or S. pneumoniae, begins with aspiration of the bacteria. Failure or absence of appropriate host defense allows the bacteria to replicate and produce an inflammatory response in the alveoli. Because of mandatory immunosuppression, organ transplant recipients can develop moderate to severe M. catarrhalis pneumonia very rapidly. Bloodstream invasion is less characteristic of M. catarrhalis than pneumococcal infection, but nearly 50% of M. catarrhalis pneumonia patients die within 3 months of onset.
  • M. catarrhalis is treated with antibiotic agents including penicillin-clavulanic acid combinations, cephalosporins, tetracycline, erythromycin, chloramphenicol, trimethoprim-sulfamethoxazole, and quinolones.
  • C1q: first subcomponent of the complement system
  • Resistance is mediated by two closely related β-lactamases: BRO-1, present in 90% of resistant isolates, and BRO-2, present in 10%.
  • In view of the conditions or diseases associated with M. catarrhalis, it would be advantageous to provide specific methods for the diagnosis, prevention, and treatment of diseases attributed to M. catarrhalis. Relevant methods would be based on the expression of M. catarrhalis-derived nucleic acid sequences. Such traits as virulence, acquisition of resistance factors, and effects of treatment using particular therapeutic agents may be characterized by under- or over-expression of nucleic acid sequences as revealed using PCR, hybridization or microarray technologies. Treatment for diseases attributed to M. catarrhalis can then be based on expression of these identified sequences or their expressed proteins, and efficacy of any particular therapy and development of resistance monitored. The information provided herein provides the basis for understanding the pathogenicity of M. catarrhalis and treating and monitoring the treatment of diseases caused by M. catarrhalis.
  • the present invention relates to a genomic library comprising the combination of nucleic acid molecules from Moraxella catarrhalis , presented as SEQ ID NOs:1-41.
  • the library substantially provides the nucleic acid molecules comprising the genome of M. catarrhalis , and the nucleic acid molecules provide a plurality of open reading frames (ORFs).
  • the ORFs uniquely identify structural, functional, and regulatory genes of M. catarrhalis .
  • the invention encompasses oligonucleotides, fragments, and derivatives of the M. catarrhalis nucleic acid molecules, and sequences complementary to the nucleic acid molecules listed in the Sequence Listing.
  • M. catarrhalis nucleic acid molecules, fragments, derivatives, oligonucleotides, and complementary sequences thereof can be used as probes to detect, amplify, or quantify M. catarrhalis genes, ORFs, cDNAs, or RNAs in biological, solution or substrate-based, assays or as compositions in diagnostic kits.
  • the invention contemplates the use of such diagnostic probes to identify the presence of M. catarrhalis sequence in a sample or to screen for virulence factors and mutations.
  • the invention also provides for the comparison of the M. catarrhalis genomic library or the encoded proteins with genomes, individual DNA sequences, or proteins from other Moraxella species or strains, other bacteria, and other organisms to identify virulence factors, regulatory elements, drug targets, and to characterize genomic organization.
  • the present invention provides for the use of computer databases to make such comparisons.
  • the invention further provides host cells and expression vectors comprising nucleic acid molecules of the invention and methods for the production of the proteins they encode. Such methods include culturing the host cells under conditions for expression of M. catarrhalis protein and recovering the protein from cell culture.
  • the invention still further provides purified M. catarrhalis protein of which at least a portion is encoded by a nucleic acid molecule selected from the nucleic acid molecules of the Sequence Listing.
  • the subject invention provides a method of screening a library or a plurality of molecules or compounds for specific binding to a M. catarrhalis nucleic acid molecule or fragment thereof or protein or portion thereof, to identify at least one ligand which specifically binds the M. catarrhalis nucleic acid molecule or protein.
  • Such a method comprises the steps of combining the M. catarrhalis nucleic acid molecule or protein with a library or a plurality of molecules or compounds under conditions to allow specific binding and detecting M. catarrhalis nucleic acid molecule or protein bound to at least one molecule or compound, thereby identifying a ligand which specifically binds the nucleic acid molecule or protein.
  • Suitable libraries of ligands comprise aptamers, DNA molecules, RNA molecules, peptide nucleic acids, peptides, mimetics, proteins, agonists, antagonists, antibodies, inhibitors, immunoglobulins, pharmaceutical agents, and drug compounds.
  • the subject invention also provides a method of purifying a ligand from a sample.
  • a method of purifying a ligand from a sample comprises the steps of combining the M. catarrhalis nucleic acid molecule or protein with a library or a plurality of molecules or compounds under conditions to allow specific binding, detecting M. catarrhalis nucleic acid molecule or protein bound to at least one molecule or compound, recovering the bound M. catarrhalis nucleic acid molecule or protein and separating the bound M. catarrhalis nucleic acid molecule or protein from the ligand, thereby obtaining purified ligand.
  • the invention further comprises an antibody specific for a purified M. catarrhalis protein or a portion thereof which is encoded by an M. catarrhalis nucleic acid molecule selected from the Sequence Listing.
  • Antibodies produced against M. catarrhalis protein may be used diagnostically for the detection of M. catarrhalis proteins in biological, solution- or substrate-based, samples and therapeutically to neutralize the activity of an M. catarrhalis protein expressed during infections caused by M. catarrhalis.
  • Sequence Listing is a compilation of the consensus sequences of contiguous sequences (contigs) or groups of overlapping sequences, assembled from individual sequences obtained by sequencing genomic clone inserts of a randomly generated M. catarrhalis DNA library. Each assembled contig or singlet is identified by a sequence identification number (SEQ ID NO) and by the contig number which it represents.
  • Table 1 lists the assembled M. catarrhalis contiguous sequences prepared as described in the Examples.
  • the first column contains the number of the contig, which is also SEQ ID NO, listed in ascending order.
  • the second column contains the length of the nucleic acid molecule.
  • the third and fourth columns contain the start and stop nucleotides, respectively, for any open reading frames (ORFs) in the contig.
  • the fifth column contains the Locus ID.
  • the sixth column lists the GenBank identification number of the closest homolog, if any.
  • the seventh column gives the P-value for the match to the homolog.
  • the last column contains the description of the homolog. Orphans or LURs have no GenBank homologs.
  • Table 2 shows the order of the contigs or singlets comprising the M. catarrhalis genome.
  • “Biologically active” refers to a protein having structural, immunological, regulatory, or chemical functions of a naturally occurring, recombinant, or synthetic molecule.
  • “Complementary” refers to the natural hydrogen bonding by base pairing between purines and pyrimidines.
  • the sequence A-C-G-T forms hydrogen bonds with its complements T-G-C-A or U-G-C-A.
  • the degree of complementarity between nucleic acid strands affects the efficiency and strength of the hybridization and amplification reactions.
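  • The base-pairing rule in the definition above can be sketched in a few lines of Python. This is a minimal illustration rather than anything the application specifies; the function names are hypothetical, and the two strands are assumed to be equal in length and already aligned position by position.

```python
# Watson-Crick pairing table for DNA; RNA complements use U in place of T.
DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement(seq: str, rna: bool = False) -> str:
    """Return the base-by-base complement of a DNA sequence (same orientation)."""
    comp = "".join(DNA_COMPLEMENT[base] for base in seq.upper())
    return comp.replace("T", "U") if rna else comp


def reverse_complement(seq: str) -> str:
    """Return the complementary DNA strand written in the opposite direction."""
    return complement(seq)[::-1]


def fraction_complementary(strand_a: str, strand_b: str) -> float:
    """Fraction of aligned positions at which strand_b pairs with strand_a.

    Assumption: both strands are DNA, equal in length, and aligned
    position by position (no gaps).
    """
    if len(strand_a) != len(strand_b) or not strand_a:
        raise ValueError("strands must be non-empty and of equal length")
    expected = complement(strand_a)
    matches = sum(1 for x, y in zip(expected, strand_b.upper()) if x == y)
    return matches / len(strand_a)
```

  • With these helpers, `complement("ACGT")` gives the DNA complement T-G-C-A and `complement("ACGT", rna=True)` gives U-G-C-A, while `fraction_complementary` puts a number on the degree of complementarity between two aligned strands.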
  • “Derivative” refers to the chemical modification of a nucleic acid or amino acid molecule. Chemical modifications can include replacement of hydrogen by an alkyl, acyl, or amino group or glycosylation, pegylation, or any similar process which retains or enhances biological activity, stability, or lifespan of the molecule.
  • “Fragment” refers to an Incyte clone or any part of a nucleic acid molecule which retains a usable, functional characteristic.
  • Useful fragments include oligonucleotides which may be used in hybridization or amplification technologies or to regulate replication, transcription or translation.
  • Hybridization complex refers to a complex between two nucleic acid molecules by virtue of the formation of hydrogen bonds between purines and pyrimidines.
  • Ligand refers to any molecule or compound which will bind to a complementary site on a nucleic acid molecule or protein.
  • Modulates refers to a change in activity (biological, chemical, or immunological) or lifespan resulting from specific binding between a molecule or compound and either a nucleic acid molecule or a protein.
  • “Molecules” is used substantially interchangeably with the terms agents and compounds. Such molecules modulate the activity of nucleic acid molecules or proteins of the invention and may be composed of at least one of the following: inorganic and organic substances including cofactors, nucleic acids, proteins, carbohydrates, fats, and lipids.
  • Nucleic acid molecule is substantially interchangeable with the term polynucleotide and may refer to a probe, a fragment of DNA or RNA of genomic or synthetic origin. Such molecules may be double-stranded or single-stranded and may be engineered into vectors to perform a particular activity such as transcription.
  • “Oligonucleotide” is substantially equivalent to the terms “amplimer”, “primer”, “oligomer”, and “element”, and is preferably single stranded.
  • Protein refers to an amino acid sequence, oligopeptide, peptide, polypeptide or portions thereof whether naturally occurring or synthetic.
  • “Portion” refers to any part of a protein used for any purpose, but especially for the screening of a library of molecules or compounds which specifically bind to that portion or for the production of antibodies.
  • “Sample” is used in its broadest sense.
  • a sample containing nucleic acid molecules may comprise a bodily fluid; an extract from a cell, chromosome, organelle, or membrane isolated from a cell; genomic DNA, RNA, or cDNA in solution or bound to a substrate; a cell; a tissue; a tissue print; a hair, and the like.
  • substantially purified refers to nucleic acid molecules or proteins that are isolated or separated from their natural environment and are about 60% free to about 90% free from other components with which they are naturally associated.
  • Substrate refers to any rigid or semi-rigid support to which nucleic acid molecules or proteins are bound and includes membranes, filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, capillaries or other tubing, plates, polymers, and microparticles with a variety of surface forms including wells, trenches, pins, channels and pores.
  • Genomic DNA was mechanically sheared, treated with enzyme to create blunt ends, gel-purified, and cloned into modified PBLUESCRIPT vectors (Stratagene, La Jolla Calif.). The vectors were transformed into E. coli cells and grown overnight. Colonies were picked, and plasmid DNA was isolated. Templates were prepared and sequenced, sequences were assembled into contiguous sequences (contigs), and open reading frames were identified.
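  • As an illustration of the final step above, identifying open reading frames in an assembled contig, the following Python sketch scans the three forward reading frames for ATG-to-stop stretches. It is a simplified assumption and not the procedure used to annotate the Sequence Listing: the function name and the `min_codons` cutoff are hypothetical, only ATG is treated as a start codon, and the reverse strand would need a second pass over the reverse complement.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(contig: str, min_codons: int = 30) -> List[Tuple[int, int, int]]:
    """Scan the forward strand for ATG...stop open reading frames.

    Returns (start, stop, frame) tuples with 1-based inclusive coordinates,
    where stop is the last base of the stop codon and frame is 1, 2, or 3.
    """
    seq = contig.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the first ATG seen since the last stop codon
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3  # coding codons before the stop
                if n_codons >= min_codons:
                    orfs.append((start + 1, i + 3, frame + 1))
                start = None
    return orfs
```

  • The start and stop values returned here correspond in spirit to the start and stop nucleotides reported per contig in Table 1, although the coordinate convention of that table is not stated in this text.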
  • the invention relates to a Moraxella catarrhalis genomic DNA library comprising a combination of nucleic acid molecules, SEQ ID NOs:1-41, and their complements. These nucleic acid molecules comprise contiguous sequences which contain annotated and unannotated reading frames (ORFs and LURs).
  • the nucleic acid molecules or fragments and probes thereof are used in hybridization, screening, and purification assays to identify ligands and in vectors and host cells to produce the proteins which they encode.
  • the proteins or portions thereof are also used in screening and purification assays to identify useful ligands or to produce antibodies.
  • the molecules or compounds used in hybridization, screening, and purification assays include aptamers, DNA molecules, RNA molecules, peptide nucleic acids, peptides, mimetics, transcription factor, enhancers, repressors, regulatory proteins, agonists, antagonists, antibodies, inhibitors, immunoglobulins, pharmaceutical agents, drug compounds, and the like.
  • the nucleic acid molecules and proteins of M. catarrhalis are compared with those of other organisms using computer algorithms and databases to select those nucleic acid molecules and proteins of potential diagnostic and therapeutic use.
  • Methods for sequencing nucleic acid molecules are well known in the art and may be used to practice any of the embodiments of the invention. These methods employ enzymes such as the Klenow fragment of DNA polymerase I, SEQUENASE, Taq DNA polymerase, thermostable T7 DNA polymerase (Amersham Pharmacia Biotech (APB), Piscataway N.J.), or combinations of polymerases and proofreading exonucleases such as those found in the ELONGASE amplification system (Life Technologies, Rockville Md.).
  • sequence preparation is automated with machines such as the HYDRA microdispenser (Robbins Scientific, Sunnyvale Calif.), MICROLAB 2200 system (Hamilton, Reno Nev.), and the DNA ENGINE thermal cycler (MJ Research, Watertown Mass.).
  • Machines used for sequencing include the ABI 3700, 377 or 373 DNA sequencing systems (PE Biosystems, Foster City Calif.), the MEGABACE 1000 DNA sequencing system (APB), and the like.
  • sequences may be analyzed using a variety of algorithms which are well known in the art and described in Ausubel (1997 ; Short Protocols in Molecular Biology , John Wiley & Sons, New York N.Y., unit 7.7) and in Meyers (1995 ; Molecular Biology and Biotechnology , Wiley VCH, New York N.Y., pp. 856-853).
  • Shotgun sequencing methods are well known in the art and use thermostable DNA polymerases and heat-labile DNA polymerases. A detailed procedure is provided in the Examples. Prefinished sequences (incomplete assembled sequences) are cross-compared for identity using various algorithms or programs such as CONSED (Gordon (1998) Genome Res. 8:195-202), GELVIEW Fragment Assembly system (Genetics Computer Group, Madison Wis.), and PHRAP (Phil Green, University of Washington, Seattle Wash.). Contaminating sequences, including vector or chimeric sequences, can be masked, removed or restored, in the process of turning the prefinished sequences into finished sequences.
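  • The idea of assembling overlapping shotgun reads into a contig can be sketched with a toy exact-overlap merge. This is not how CONSED, GELVIEW, or PHRAP work internally (those tools handle sequencing errors, quality values, and both strands); the function names and the `min_overlap` parameter are hypothetical and the reads are assumed to be error-free and in the same orientation.

```python
from typing import Optional


def longest_overlap(left: str, right: str, min_overlap: int = 10) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def merge_reads(left: str, right: str, min_overlap: int = 10) -> Optional[str]:
    """Join two reads into one contig if they overlap exactly; otherwise None."""
    size = longest_overlap(left, right, min_overlap)
    if size == 0:
        return None
    # Keep the shared region once and append the non-overlapping tail.
    return left + right[size:]
```

  • Repeating such pairwise merges over a read set yields contigs; reads that never overlap another read remain singlets.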
  • sequences of the invention may be extended using various PCR-based methods known in the art.
  • the XL-PCR kit PE Biosystems
  • nested primers and commercially available cDNA or genomic DNA libraries (Life Technologies and Clontech (Palo Alto Calif.), respectively) may be used to extend the nucleic acid sequence.
  • primers may be designed using commercially available software, such as OLIGO 4.06 software (National Biosciences, Plymouth Minn.), to be about 22 to 30 nucleotides in length, to have a GC content of about 40% to 45%, and to anneal to a target molecule at temperatures from about 55°C to about 68°C.
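  • The primer criteria above (about 22 to 30 nucleotides, about 40-45% GC, annealing at about 55 to 68 degrees C) can be checked with a short Python sketch. The function names are hypothetical, and the melting-temperature estimate is a common composition-based approximation swapped in for illustration; it is not necessarily what OLIGO 4.06 computes, and treating it as a stand-in for annealing temperature is an assumption.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the primer."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def approx_tm_celsius(seq: str) -> float:
    """Rough melting temperature from base composition (illustrative only).

    Uses the approximation Tm = 64.9 + 41 * (G + C - 16.4) / N.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def primer_passes(
    seq: str,
    length_range=(22, 30),
    gc_range=(40.0, 45.0),
    tm_range=(55.0, 68.0),
) -> bool:
    """True if the primer falls inside the length, GC, and temperature windows."""
    if not length_range[0] <= len(seq) <= length_range[1]:
        return False
    if not gc_range[0] <= gc_percent(seq) <= gc_range[1]:
        return False
    return tm_range[0] <= approx_tm_celsius(seq) <= tm_range[1]
```

  • Candidate primers taken from a contig can be filtered with `primer_passes` before more careful checks, such as self-complementarity, are applied with dedicated software.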
  • the M. catarrhalis nucleic acid molecules and fragments thereof can be used in various hybridization technologies for various purposes.
  • Hybridization probes may be designed or derived from a highly unique region such as the 5′ untranslated sequence preceding the initiation codon or from a conserved coding region encoding a specific protein signature or motif and used in protocols to identify naturally occurring molecules encoding a particular M. catarrhalis protein, allelic variants, or related molecules.
  • the probe should preferably have at least 50% sequence identity to any naturally occurring nucleic acid sequences.
  • the probe may be a single stranded DNA or RNA molecule, produced biologically or synthetically, and labeled using oligolabeling, nick translation, end-labeling, or PCR amplification in the presence of at least one labeled nucleotide.
  • a vector containing the nucleic acid molecule or a fragment thereof may be used to produce an mRNA probe in vitro by addition of an RNA polymerase and labeled nucleotides. These procedures may be conducted using commercially available kits such as those provided by APB.
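  • The sequence-identity criterion mentioned for probes can be made concrete with a small Python sketch. It assumes the probe and target have already been aligned to equal length (with `-` marking gaps) by some external alignment program; the function names and the default 50% threshold argument are illustrative choices rather than part of the application.

```python
def percent_identity(probe: str, target: str) -> float:
    """Percent of aligned positions carrying identical bases.

    Assumption: both sequences are pre-aligned to the same length;
    gap characters ('-') never count as matches.
    """
    if len(probe) != len(target) or not probe:
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = zip(probe.upper(), target.upper())
    matches = sum(1 for a, b in pairs if a == b and a != "-")
    return 100.0 * matches / len(probe)


def meets_identity_threshold(probe: str, target: str, threshold: float = 50.0) -> bool:
    """True if the aligned probe and target share at least `threshold` percent identity."""
    return percent_identity(probe, target) >= threshold
```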
  • the stringency of hybridization is determined by G+C content of the probe, salt concentration, and temperature. In particular, stringency can be increased by reducing the concentration of salt or raising the hybridization temperature. In solutions used for some membrane based hybridizations, addition of an organic solvent such as formamide allows the reaction to occur at a lower temperature.
  • Hybridization can be performed at low stringency with buffers, such as 5×SSC with 1% sodium dodecyl sulfate (SDS) at 60°C, which permits the formation of a hybridization complex between nucleic acid sequences that contain some mismatches.
  • washes are performed at increased stringency with buffers such as 0.2×SSC with 0.1% SDS at either 45°C (medium stringency) or 68°C (high stringency).
  • high stringency hybridization complexes will remain stable only where the nucleic acid molecules are completely complementary.
  • 35-50% formamide can be added to the hybridization solution to reduce the temperature at which hybridization is performed.
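  • The dependence of stringency on G+C content, salt, temperature, and formamide can be illustrated with the widely used empirical estimate for long DNA-DNA hybrids, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 0.61(% formamide) - 500/L. The sketch below is an assumption brought in for illustration and is not taken from the application; the function name is hypothetical, and the sodium concentration of 1×SSC is taken as 0.195 M (0.15 M NaCl plus 0.015 M trisodium citrate).

```python
import math

# Assumption: 1x SSC contributes about 0.195 M sodium ions
# (0.15 M NaCl plus 0.015 M trisodium citrate).
SSC_1X_SODIUM_MOLAR = 0.195


def hybrid_tm_celsius(
    gc_percent: float,
    length: int,
    ssc_fold: float,
    formamide_percent: float = 0.0,
) -> float:
    """Estimate the melting temperature of a long DNA-DNA hybrid.

    gc_percent: G+C content of the probe in percent.
    length: hybrid length in base pairs.
    ssc_fold: buffer strength as a multiple of 1x SSC (e.g. 5 or 0.2).
    formamide_percent: formamide concentration in percent (v/v).
    """
    sodium = ssc_fold * SSC_1X_SODIUM_MOLAR
    return (
        81.5
        + 16.6 * math.log10(sodium)
        + 0.41 * gc_percent
        - 0.61 * formamide_percent
        - 500.0 / length
    )
```

  • Under this estimate, moving from 5×SSC to 0.2×SSC lowers the melting temperature, so a wash at a fixed temperature becomes more stringent, and adding formamide lowers the melting temperature further, which is why hybridization can then be run at a lower temperature.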
  • Background signals can be reduced by the use of other detergents such as Sarkosyl or TRITON X-100 (Sigma-Aldrich, St. Louis Mo.) and a blocking agent such as denatured salmon sperm DNA.
  • Microarrays may be prepared and analyzed using methods known in the art. Oligonucleotides or fragments of a nucleic acid molecule may be used as either probes or targets.
  • the microarray can be used to monitor the expression level of large numbers of genes simultaneously and to identify genetic variants, mutations, and single nucleotide polymorphisms. Such information may be used to determine gene function; to understand the genetic basis of a condition, disease, or disorder; to diagnose a condition, disease, or disorder; and to develop and monitor the activities of therapeutic agents used to treat the condition, disease, or disorder. (See, eg, Brennan et al. (1995) U.S. Pat. No. 5,474,796; Schena et al.
  • Hybridization probes are also useful in mapping the naturally occurring genomic sequence.
  • the probes may be hybridized to: 1) a particular chromosome, 2) a specific region of a chromosome, 3) artificial chromosome constructions such as human artificial chromosomes (HACs), yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), or bacterial P1 constructions, 4) single chromosomes from eukaryotic species, or 5) DNA libraries made from any of these sources.
  • a nucleic acid molecule encoding a M. catarrhalis protein may be cloned into a vector and used to express the protein or portions thereof in host cells.
  • the nucleic acid sequence can be engineered by such methods as DNA shuffling (U.S. Pat. No. 5,830,721) and site-directed mutagenesis to create new restriction sites, alter glycosylation patterns, change codon preference to increase expression in a particular host, produce splice variants, extend half-life, and the like.
  • the expression vector may contain transcriptional and translational control elements (promoters, enhancers, specific initiation signals, and polyadenylated sequence) from various sources which have been selected for their efficiency in a particular host.
  • the vector, nucleic acid molecule, and regulatory elements are combined using in vitro recombinant DNA techniques, synthetic techniques, and/or in vivo genetic recombination techniques well known in the art and described in Sambrook (supra ch. 4, 8, 16 and 17).
  • a variety of host systems may be transformed with an expression vector. These include, but are not limited to, bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems transformed with baculovirus expression vectors; plant cell systems transformed with expression vectors containing viral and/or bacterial elements, or animal cell systems (Ausubel, supra, unit 16).
  • Routine cloning, subcloning, and propagation of nucleic acid molecules can be achieved using the multifunctional PBLUESCRIPT vector (Stratagene) or PSPORT1 plasmid (Life Technologies). Introduction of a nucleic acid sequence into the multiple cloning site of these vectors disrupts the lacZ gene and allows colorimetric screening for transformed bacteria. In addition, these vectors may be useful for in vitro transcription, dideoxy sequencing, single strand rescue with helper phage, and creation of nested deletions in the cloned sequence.
  • the vector can be stably transformed into competent cells of E. coli along with a selectable or visible marker gene on the same or on a separate vector. After transformation, cells are allowed to grow in enriched media containing a selective agent. Selectable markers, antimetabolite, antibiotic, or herbicide resistance genes confer resistance to the respective selective agent and allow growth and recovery of cells which successfully express the introduced sequences. Resistant clones or colonies, identified either by survival on selective media or by the expression of visible markers, such as anthocyanins, green fluorescent protein (GFP), β-glucuronidase, luciferase and the like, may be propagated using culture techniques well known in the art. Visible markers are also used to quantify the amount of protein expressed by the introduced genes. Verification that the host cell contains the desired M. catarrhalis nucleic acid molecule is based on DNA-DNA or DNA-RNA hybridizations or PCR amplification.
  • the host cell may be chosen for its ability to modify a recombinant protein in a desired fashion. Such modifications include acetylation, carboxylation, glycosylation, phosphorylation, lipidation, acylation, and the like. Post-translational processing sequences (“prepro” forms) may also be engineered into the recombinant nucleotide sequence in order to specify protein targeting, folding, and/or activity. Different host cells available from the ATCC (Manassas Va.) which have specific cellular machinery and characteristic mechanisms for post-translational activities may be chosen to ensure the correct modification and processing of the recombinant protein.
  • Heterologous moieties engineered into a vector for ease of purification include glutathione S-transferase (GST), calmodulin binding peptide (CBP), 6×His, FLAG, MYC, and the like.
  • GST, CBP, and 6×His are purified using commercially available affinity matrices such as immobilized glutathione, calmodulin, and metal-chelate resins, respectively.
  • FLAG and MYC are purified using commercially available monoclonal and polyclonal antibodies.
  • a proteolytic cleavage site may be located between the desired protein sequence and the heterologous moiety for ease of separating the desired protein following purification. Methods for recombinant protein expression and purification are discussed in Ausubel (supra, unit 16) and are commercially available (Invitrogen, San Diego Calif.).
  • Proteins or portions thereof may be produced not only by recombinant methods, but also by using chemical methods well known in the art.
  • Solid phase peptide synthesis may be carried out in a batchwise or continuous flow process which sequentially adds α-amino and side chain-protected amino acid residues to an insoluble polymeric support via a linker group.
  • a linker group such as methylamine-derivatized polyethylene glycol is attached to poly(styrene-co-divinylbenzene) to form the support resin.
  • the amino acid residues are N-α-protected by acid-labile Boc (t-butyloxycarbonyl) or base-labile Fmoc (9-fluorenylmethoxycarbonyl).
  • the carboxyl group of the protected amino acid is coupled to the amine of the linker group to anchor the residue to the solid phase support resin.
  • Trifluoroacetic acid or piperidine are used to remove the protecting group in the case of Boc or Fmoc, respectively.
  • Each additional amino acid is added to the anchored residue using a coupling agent or pre-activated amino acid derivative, and the resin is washed.
  • the full length peptide is synthesized by sequential deprotection, coupling of derivatized amino acids, and washing with dichloromethane and/or N,N-dimethylformamide. The peptide is cleaved between the peptide carboxy terminus and the linker group to yield a peptide acid or amide.
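  • The synthesis cycle described above (anchor the first residue through its carboxyl group, then repeat deprotection, coupling, and washing, and finally cleave) can be laid out as an ordered step list. The Python sketch below is only a bookkeeping illustration under stated assumptions: the function name and step wording are hypothetical, the peptide is given in one-letter code from N-terminus to C-terminus, and side-chain protection and reagent amounts are ignored.

```python
from typing import List

# Deprotection reagents as paired in the text: trifluoroacetic acid for Boc,
# piperidine for Fmoc.
DEPROTECTION_REAGENT = {"Boc": "trifluoroacetic acid", "Fmoc": "piperidine"}


def synthesis_steps(peptide: str, protecting_group: str = "Fmoc") -> List[str]:
    """List the solid phase steps for a peptide written N-terminus to C-terminus.

    The C-terminal residue is anchored first because its carboxyl group is
    coupled to the linker; remaining residues are added toward the N-terminus.
    """
    if not peptide:
        raise ValueError("peptide must contain at least one residue")
    reagent = DEPROTECTION_REAGENT[protecting_group]
    residues = list(peptide)[::-1]  # C-terminal residue first
    steps = [f"anchor {protecting_group}-{residues[0]} to linker"]
    for residue in residues[1:]:
        steps.append(f"deprotect with {reagent}")
        steps.append(f"couple {protecting_group}-{residue}")
        steps.append("wash resin")
    steps.append("cleave peptide from linker")
    return steps
```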
  • a protein or portion thereof may be substantially purified by preparative high performance liquid chromatography and its composition confirmed by amino acid analysis or by sequencing (Creighton (1984) Proteins, Structures and Molecular Properties , W H Freeman, New York N.Y.).
  • Various hosts including goats, rabbits, rats, mice, humans, and others may be immunized by injection with M. catarrhalis protein or any portion thereof.
  • Adjuvants such as Freund's, mineral gels, and surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin (KLH), and dinitrophenol may be used to increase immunological response.
  • the oligopeptide, peptide, or portion of protein used to induce antibodies should consist of about five to fifteen amino acids which are identical to a portion of the natural protein. Oligopeptides may be fused with proteins such as KLH in order to produce antibodies to the chimeric molecule.
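  • Candidate immunizing peptides of about five to fifteen residues, each identical to a stretch of the natural protein, can be enumerated with a sliding window. The Python sketch below is a minimal illustration; the function name is hypothetical, and it does nothing to rank candidates by surface exposure or antigenicity, which would require separate analysis.

```python
from typing import Iterator, Tuple


def candidate_peptides(
    protein: str, min_len: int = 5, max_len: int = 15
) -> Iterator[Tuple[int, str]]:
    """Yield (1-based start position, peptide) for every window of allowed length."""
    for length in range(min_len, max_len + 1):
        for start in range(0, len(protein) - length + 1):
            yield start + 1, protein[start:start + length]
```

  • For a protein of n residues this produces n - k + 1 peptides of each length k, so the list is usually filtered further before any peptide is synthesized.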
  • Monoclonal antibodies may be prepared using any technique which provides for the production of antibodies by continuous cell lines in culture. These include, but are not limited to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma technique. (See, eg, Kohler et al. (1975) Nature 256:495-497; Kozbor et al. (1985) J Immunol Methods 81:31-42; Cote et al. (1983) Proc Natl Acad Sci 80:2026-2030; and Cole et al. (1984) Mol Cell Biol 62:109-120.)
  • Antibody fragments which contain specific binding sites for epitopes of the M. catarrhalis protein may also be generated.
  • Such fragments include, but are not limited to, F(ab′)2 fragments produced by pepsin digestion of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of the F(ab′)2 fragments.
  • Fab expression libraries may be constructed to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity (Huse et al. (1989) Science 246:1275-1281).
  • The M. catarrhalis protein may be used in screening assays of phagemid or B-lymphocyte immunoglobulin libraries to identify antibodies having the desired specificity.
  • Numerous protocols for competitive binding or immunoassays using either polyclonal or monoclonal antibodies with established specificities are well known in the art. Such immunoassays typically involve the measurement of complex formation between the protein and its specific antibody.
  • a two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to two non-interfering epitopes is preferred, but a competitive binding assay may also be employed (Pound (1998) Immunochemical Protocols , Humana Press, Totowa N.J.).
  • A wide variety of labels and conjugation techniques are known by those skilled in the art and may be used in various nucleic acid molecule, protein, and antibody assays. Synthesis of labeled molecules may be achieved using Promega (Madison Wis.) or APB kits for incorporation of a labeled nucleotide such as 32P-dCTP, Cy3-dCTP or Cy5-dCTP (APB) or amino acid such as 35S-methionine (APB).
  • Nucleotides and amino acids may be directly labeled with a variety of substances including fluorescent, chemiluminescent, or chromogenic agents and the like, by chemical conjugation to amines, thiols and other groups present in the molecules using reagents such as BODIPY or FITC (Molecular Probes, Eugene Oreg.).
  • the nucleic acid molecules, fragments, oligonucleotides, complementary RNA and DNA molecules, and peptide nucleic acids (PNAs) may be used to detect and quantify differential gene expression, absence/presence vs. excess, of mRNAs or to monitor mRNA levels following drug treatment.
  • Conditions, diseases or disorders associated with M. catarrhalis gene expression may include conditions and diseases such as allergies, asthma, bronchitis, chronic obstructive pulmonary disease, emphysema, endocarditis, hypereosinophilia, meningitis, otitis media, pneumonia, sinusitis, and various respiratory distress syndromes.
  • the diagnostic assay may use hybridization or amplification technology to compare gene expression in a biological sample from a patient to expression in disease and control standards in order to detect differential gene expression. Qualitative or quantitative methods for this comparison are well known in the art.
  • The nucleic acid molecule, fragment, or probe may be labeled by standard methods and added to a sample from a patient under conditions for the formation of hybridization complexes. After an incubation period, the sample is washed and the amount of label (or signal) associated with hybridization complexes is quantified and compared with a standard value. If the amount of label in the patient sample is significantly altered in comparison to the standard value, then the presence of elevated amounts of M. catarrhalis is responsible for the associated condition or disease.
  • A normal or standard expression profile is established. This may be accomplished by combining a biological sample taken from normal subjects, animal or more preferably human, with a probe under conditions for hybridization or amplification. Standard hybridization may be quantified by comparing the values obtained using normal subjects with values from an experiment in which a known amount of a substantially purified target sequence is used. Standard values obtained in this manner may be compared with values obtained from samples from patients who are symptomatic for a particular condition or disease listed above. Deviation from standard values toward those associated with a particular diagnosed condition is used to diagnose the patient.
  • Such assays may also be used to evaluate the efficacy of a particular therapeutic treatment regimen in animal studies or in a clinical trial. Once efficacy is established, these assays may be used on a regular basis to determine if the therapy is effective in an individual patient. The results obtained from successive patient assays may be used over a period ranging from several days to months.
  • Detection and quantification of a protein using either specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques include enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays, and fluorescence activated cell sorting.
  • A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to two non-interfering epitopes is preferred, but a competitive binding assay may be employed.
  • a ligand such as an antagonist, antibody, or inhibitor identified by screening a plurality of molecules with the M. catarrhalis protein is administered to the subject to decrease the activity of the M. catarrhalis or homologous protein as it is overexpressed during pathogenesis.
  • a composition comprising the substantially purified ligand and a pharmaceutical carrier may be administered to a subject to decrease the activity of the M. catarrhalis or homologous protein as it is overexpressed during pathogenesis.
  • an antibody which specifically binds the M. catarrhalis protein may be used as a targeting or delivery mechanism for bringing a pharmaceutical agent to cells or tissues which are affected by the overexpression of the M. catarrhalis protein.
  • any of the ligands may be administered in combination with other therapeutic agents. Selection of the agents for use in combination therapy may be made by one of ordinary skill in the art according to conventional pharmaceutical principles. A combination of therapeutic agents may act synergistically to effect prevention or treatment of a particular condition at a lower dosage of each agent.
  • Gene expression may be modified by designing complementary or antisense molecules (DNA, RNA, or PNA) to the 5′, 3′, or intronic regions of the M. catarrhalis nucleic acid molecule. Oligonucleotides designed with reference to the transcription initiation site are preferred. Similarly, inhibition can be achieved using triple helix base-pairing which inhibits the binding of polymerases, transcription factors, or regulatory molecules (Gee et al. In: Huber and Carr (1994) Molecular and Immunologic Approaches , Futura Publishing, Mt. Kisco N.Y., pp. 163-177). A complementary molecule may also be designed to block translation by preventing binding between ribosomes and mRNA.
  • a library of cDNA molecules may be screened to identify those which specifically bind a regulatory, untranslated M. catarrhalis sequence. Delivery of this inhibitory nucleotide sequence using a vector designed to be transferred from transformed M. catarrhalis cells to infectious M. catarrhalis via genetic recombination is contemplated.
  • Ribozymes are enzymatic RNA molecules.
  • the mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme molecule to complementary target RNA followed by endonucleolytic cleavage at sites such as GUA, GUU, and GUC. Once such sites are identified, an oligonucleotide with the same sequence may be evaluated for secondary structural features which would render the oligonucleotide inoperable. The suitability of candidate targets may also be evaluated by testing their hybridization with complementary oligonucleotides using ribonuclease protection assays.
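The site-identification step described above can be illustrated with a minimal sketch that scans an RNA sequence for the GUA, GUU, and GUC triplets named in the text. The function name and the example sequence are hypothetical, and the sketch covers only the triplet search, not the secondary-structure or ribonuclease protection evaluation.

```python
import re

# GUA, GUU and GUC (the triplets named in the text) written as one pattern.
CLEAVAGE_PATTERN = re.compile(r"GU[ACU]")

def find_cleavage_sites(rna):
    """Return (0-based position, triplet) for each candidate cleavage site."""
    rna = rna.upper().replace("T", "U")  # tolerate DNA-style input
    return [(m.start(), m.group(0)) for m in CLEAVAGE_PATTERN.finditer(rna)]

# Hypothetical target RNA, used only for illustration.
example_rna = "AUGGUACCGUUAAGUCGG"
sites = find_cleavage_sites(example_rna)
print(sites)
```

Each reported position could then serve as the center of a candidate oligonucleotide to be checked for secondary structure.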
  • RNA molecules may be modified to increase intracellular stability and half-life by addition of flanking sequences at the 5′ and/or 3′ ends of the molecule or by the use of phosphorothioate or 2′ O-methyl rather than phosphodiester linkages within the backbone of the molecule. Modification is inherent in the production of PNAs and can be extended to other derivative nucleotide molecules.
  • the M. catarrhalis nucleic acid molecule may be used to screen a plurality or a library of molecules or compounds for specific binding affinity.
  • the molecules or compounds may be selected from aptamers, DNA molecules, RNA molecules, PNAs, peptides, transcription factors, enhancers, repressors, regulatory proteins and other ligands which modulate the activity, replication, transcription, or translation of the nucleic acid molecules in the biological system.
  • the assay involves combining the M. catarrhalis nucleic acid molecule or a fragment thereof with molecules or compounds under conditions to allow specific binding, and detecting specific binding to identify at least one ligand which specifically binds the M. catarrhalis nucleic acid molecule.
  • the M. catarrhalis protein or a portion thereof may be used to screen a plurality of libraries of molecules or compounds in any of a variety of screening assays.
  • the molecules or compounds may be selected from aptamers, DNA molecules, RNA molecules, peptide nucleic acids, peptides, mimetics, proteins, agonists, antagonists, antibodies, inhibitors, immunoglobulins, pharmaceutical agents, drug compounds, and the like.
  • the protein or portion thereof employed in such screening may be free in solution, affixed to an abiotic or biotic substrate (eg, borne on a cell surface), or located intracellularly. Specific binding between the protein and molecule may be measured.
  • One method for high throughput screening using very small assay volumes and very small amounts of test compound is described in U.S. Pat. No. 5,876,946, incorporated herein by reference, which teaches how to screen large numbers of molecules for specific binding to a protein.
  • the M. catarrhalis nucleic acid molecule or a fragment thereof may be used to purify a ligand from a sample.
  • a method for using a M. catarrhalis nucleic acid molecule or a fragment thereof to purify a ligand would involve combining the nucleic acid molecule or a fragment thereof with a sample under conditions to allow specific binding, detecting specific binding, recovering the bound M. catarrhalis nucleic acid molecule, and using an appropriate agent to separate the M. catarrhalis nucleic acid molecule from the purified ligand.
  • the protein or a portion thereof may be used to purify a ligand from a sample.
  • a method for using a M. catarrhalis protein or a portion thereof to purify a ligand would involve combining the protein or a portion thereof with a sample under conditions to allow specific binding, detecting specific binding between the protein and ligand, recovering the bound protein, and using an appropriate chaotropic agent to separate the protein from the purified ligand.
  • compositions are those substances wherein the active ingredients are contained in an effective amount to achieve a desired and intended purpose.
  • the determination of an effective dose is well within the capability of those skilled in the art.
  • the therapeutically effective dose may be estimated initially either in cell culture assays or in animal models. The animal model is also used to achieve a desirable concentration range and route of administration. Such information may then be used to determine useful doses and routes for administration in humans.
  • a therapeutically effective dose refers to that amount of a pharmaceutical agent which ameliorates the symptoms or condition.
  • Therapeutic efficacy and toxicity of such agents may be determined by standard pharmaceutical procedures in cell cultures or experimental animals, eg, ED50 (the dose therapeutically effective in 50% of the population) and LD50 (the dose lethal to 50% of the population).
  • The dose ratio between toxic and therapeutic effects is the therapeutic index, and it may be expressed as the ratio LD50/ED50.
  • Pharmaceutical compositions which exhibit large therapeutic indexes are preferred. The data obtained from cell culture assays and animal studies are used in formulating a range of dosage for human use.
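As a small worked example of the therapeutic index defined above, the following sketch computes the LD50/ED50 ratio. The function name and the dose values are hypothetical and serve only to show the arithmetic.

```python
def therapeutic_index(ld50, ed50):
    """Ratio of the dose lethal to 50% of the population to the dose effective in 50%."""
    if ed50 <= 0 or ld50 <= 0:
        raise ValueError("doses must be positive")
    return ld50 / ed50

# Hypothetical doses in the same units (for example mg/kg).
index = therapeutic_index(ld50=500.0, ed50=25.0)
print(index)
```

A larger ratio corresponds to a wider margin between the effective and the lethal dose, which is why large indexes are preferred.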
  • the goal of rational drug design is to produce structural analogs of biologically active M. catarrhalis proteins of interest or of ligands with which they interact. Any of these examples can be used to fashion drugs which are more active or stable forms of the protein, or which enhance or interfere with the function of a protein in vivo (Hodgson (1991) Bio/Technology 9:19-21).
  • the three-dimensional structure of an M. catarrhalis protein, or of an M. catarrhalis protein-inhibitor complex is determined by X-ray crystallography, by computer modeling or, most typically, by a combination of the two approaches. Both the shape and charges of the protein must be ascertained to elucidate the structure and to determine active site(s). Less often, useful information regarding the structure of a protein may be gained by modeling based on the structure of homologous proteins. In both cases, relevant structural information is used to design analogous M. catarrhalis protein-like molecules or to identify efficient inhibitors.
  • Useful examples of rational drug design may include molecules which have improved activity or stability, as shown by Braxton et al. (1992, Biochem 31:7796-7801), or which act as inhibitors, agonists, or antagonists of M. catarrhalis peptides, as shown by Athauda et al. (1993, J Biochem 113:742-746).
  • It is also possible to isolate a target-specific antibody selected by functional assay, as described above, and then to solve its crystal structure.
  • This approach, in principle, yields a pharmacore upon which subsequent drug design can be based. It is possible to bypass protein crystallography altogether by generating anti-idiotypic antibodies (anti-ids) to a functional, pharmacologically-active antibody.
  • the binding site of the anti-id is an analog of the original receptor.
  • the anti-id can be used to identify and isolate peptides from banks of chemically or biologically-produced peptides. The isolated peptides act as the pharmacore.
  • L is the genome length
  • n is the number of clone insert ends sequenced
  • w is the sequencing read length
  • $m = nw/L$
  • the total gap length is $Le^{-m}$
  • the average gap size is L/n.
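The relationships listed above (m = nw/L, total gap length Le^(-m), and average gap size L/n) can be evaluated directly. The sketch below is a minimal illustration. The function name and the numeric inputs are assumptions chosen for round numbers, not values from the sequencing project.

```python
import math

def shotgun_statistics(genome_length, reads, read_length):
    """Evaluate m = nw/L, total gap length L*e^(-m) and average gap size L/n."""
    m = reads * read_length / genome_length
    total_gap = genome_length * math.exp(-m)
    average_gap = genome_length / reads
    return m, total_gap, average_gap

# Hypothetical inputs: 1 Mb genome, 10,000 insert ends sequenced, 500 bp reads.
m, total_gap, average_gap = shotgun_statistics(1_000_000, 10_000, 500)
print(f"m = {m:.2f}, total gap = {total_gap:.0f} bp, average gap = {average_gap:.0f} bp")
```

Because the total gap length falls exponentially with m, each added unit of redundancy closes a smaller absolute amount of sequence.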
  • An M. catarrhalis genomic DNA library was constructed using DNA purified from the gram negative, aerobic diplococcus, M. catarrhalis , ATCC accession number 43617. The isolate was obtained from transtracheal aspirate of a coal miner with chronic bronchitis. The G+C content is 42%.
  • β-AGARASE (New England Biolabs (NEB), Beverly Mass.) and 10x β-AGARASE (NEB) were added, and the preparation was incubated for 1-3 hours with addition of a half initial volume of β-AGARASE (NEB) after 1 hour and mixing by inversion every half hour.
  • The DNA was extracted once with phenol:chloroform:isoamyl alcohol (25:24:1) followed by extraction with chloroform:isoamyl alcohol (24:1) and precipitated by addition of 1-3 µl glycogen, 1/10 volume 3 M NaOAc, and 2.5 volumes cold 100% ethanol. The sample was stored overnight at −20 C.
  • the PBLUESCRIPT plasmid (Stratagene) was cut with SmaI endonuclease, and the ends of the strands dephosphorylated to prepare the BS.S2 vector.
  • The purified M. catarrhalis DNA (2 µg) was ligated into the BS.S2 vector (1 µg) with T4 DNA ligase (Life Technologies) for 4 hours at 14 C. Following the ligation reaction, the ligated DNA was extracted and precipitated as above.
  • The circular plasmid was transformed into DH10B competent cells (Life Technologies) by electroporation at 1.8 volts. Transformed cells were selected by growth on X-Gal+isopropyl beta-D-thiogalactopyranoside (IPTG)+2× carbenicillin (carb) LB agar plates.
  • Plasmid DNA was released from the cells and purified using the REAL PREP 96 plasmid kit (QIAGEN, Chatsworth Calif.). This kit enabled simultaneous purification of 96 samples in a 96-well block using multi-channel reagent dispensers.
  • the recommended protocol was employed except for the following changes: 1) the bacteria were cultured in 1 ml of sterile TERRIFIC BROTH (BD Biosciences, Sparks Md.) with carb at 25 mg/l and glycerol at 0.4%; 2) after inoculation and incubation for 19 hours, the cells were lysed with 0.3 ml of lysis buffer; and 3) following isopropanol precipitation, the plasmid DNA pellet was resuspended in 0.1 ml of distilled water. After this final step, samples were transferred to a 96-well block for storage at 4 C.
  • The DNA inserts were prepared for sequencing using a 96-well HYDRA microdispenser (Robbins Scientific) in combination with DNA ENGINE thermal cyclers (MJ Research). After thermal cycling, the A, C, G, and T reactions with each DNA template were combined. Then, 50 µl 100% ethanol was added, and the solution was spun at 4 C for 30 min at 4500 rpm in a centrifuge (Jouan, Winchester Va.). After the pellet was dried for 15 min under vacuum, the DNA sample was dissolved in 3 µl of formaldehyde/50 mM EDTA and loaded on wells in volumes of 1 µl per well for sequencing. Sequencing used the method of Sanger and Coulson (1975, J. Mol. Biol. 94:441f) and an ABI PRISM 377 sequencing system (PE Biosystems). After electrophoresis for four hours on 4% acrylamide gels on 36 cm plates at 2.3 kV, approximately 500-650 bp were determined per sequence.
  • Sequences were generated from either shotgun sequencing or closure sequencing. Closure sequences were obtained by directed genomic walks or PCR of specific genomic regions. In the latter case, the PCR products were sequenced.
  • Sequences were edited in a two-step process.
  • In the first step, vector sequences from both the 5′ and 3′ ends were clipped using the algorithm provided in U.S. Ser. No. 09/276,534 filed Mar. 25, 1999.
  • In the second step, possible contaminating sequence was removed by reading each raw sequence and performing a cross-match search against a contamination database containing known vector sequences and DNA marker sequences. Sequences with cross-match scores of 18 or greater were removed.
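The contamination filter in the second editing step amounts to a threshold test on each read's cross-match score. The sketch below assumes the scores have already been computed by the search and are available as a mapping from read ID to best score. The read IDs, the function name, and the data layout are hypothetical.

```python
CROSS_MATCH_CUTOFF = 18  # reads scoring 18 or greater are removed, per the text

def remove_contaminants(best_scores):
    """Keep read IDs whose best contamination-database score is below the cutoff.

    best_scores: dict mapping read ID -> best cross-match score (hypothetical layout).
    """
    return [read_id for read_id, score in best_scores.items()
            if score < CROSS_MATCH_CUTOFF]

# Hypothetical scores for four reads.
example_scores = {"read_001": 0, "read_002": 18, "read_003": 17, "read_004": 42}
kept = remove_contaminants(example_scores)
print(kept)
```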
  • Contigs were assembled using PHRAP (Green, supra) which aligns multiple, overlapping DNA sequences to form a contiguous consensus sequence. Alignments were influenced by quality scores assigned to each base in a sequence. A single sequence cannot belong to more than one contig.
  • ORF identification was carried out through combination of BLAST (Karlin, supra) and FASTA searches. These serial searches compared the consensus sequences of the assembled contigs, presented in Table 1, against sequences in public-domain databases. The searches identified similarity matches, or “hits”, that indicated an ORF within the sequence.
  • a gene clustering protocol is used to determine related ORFs within and across genomes. Gene clustering is carried out through BLAST2 pairwise comparisons of each ORF in the PATHOSEQ database (Incyte Genomics, Palo Alto Calif.) against every other ORF in the database. If two ORFs matched each other at a P-value less than or equal to 1e-15, they were placed in the same cluster. If a third ORF matched either of the first two ORFs at a P-value of less than or equal to 1e-15, the third ORF joined the cluster. Thus, clusters were formed so that any ORF in a cluster must match at least one other ORF in the cluster at less than or equal to the threshold P-value of 1e-15. The representative ORF for a cluster is the one with the best matched annotation.
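The clustering rule described above, in which an ORF joins a cluster if it matches any member at or below the P-value threshold, behaves like single-linkage clustering. The following sketch implements that rule with a union-find structure. It is an illustration under that reading, not the PATHOSEQ implementation. The ORF names and P-values are hypothetical, and the BLAST2 comparisons are assumed to have been run already.

```python
P_VALUE_CUTOFF = 1e-15  # threshold stated in the text

def cluster_orfs(orf_ids, pairwise_hits):
    """Group ORFs so that each member matches at least one other member at <= cutoff.

    pairwise_hits: iterable of (orf_a, orf_b, p_value) from pairwise comparisons.
    ORFs with no qualifying match are returned as single-member groups.
    """
    parent = {orf: orf for orf in orf_ids}

    def find(orf):
        # Follow parent links to the cluster root, compressing the path as we go.
        while parent[orf] != orf:
            parent[orf] = parent[parent[orf]]
            orf = parent[orf]
        return orf

    for orf_a, orf_b, p_value in pairwise_hits:
        if p_value <= P_VALUE_CUTOFF:
            parent[find(orf_a)] = find(orf_b)  # merge the two clusters

    clusters = {}
    for orf in orf_ids:
        clusters.setdefault(find(orf), []).append(orf)
    return sorted(sorted(members) for members in clusters.values())

# Hypothetical ORFs and P-values.
orfs = ["orfA", "orfB", "orfC", "orfD"]
hits = [("orfA", "orfB", 1e-20), ("orfB", "orfC", 1e-15), ("orfC", "orfD", 1e-10)]
clusters = cluster_orfs(orfs, hits)
print(clusters)
```

Note that orfA and orfC end up together even though they were never compared directly, which is the transitive behavior the text describes for a third ORF joining a cluster.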
  • Contig ordering based on 5′/3′ sequence pairs was done by identifying all 5′/3′ sequence pairs (5′ and 3′ sequences with the same Sequence ID) that were not in the same contig, but span a gap between two contigs with the estimated distance between them of about 1.5-3.0 kb (the insert size of the library).
  • Annotation information was used to determine contig order in two ways, either by identifying genes spanning contig gaps or by comparison with genes at the ends of contigs in related organisms with similar gene order.
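The read-pair approach to contig ordering can be sketched as a search for clones whose 5′ and 3′ reads were assembled into different contigs. The example below counts such linking clones per contig pair. The data layout, clone names, and function name are hypothetical, and the sketch does not check the 1.5-3.0 kb distance estimate mentioned in the text.

```python
from collections import Counter

def contig_links(read_placements):
    """Count clones whose 5' and 3' reads fall in different contigs.

    read_placements: dict mapping (sequence_id, end) -> contig number,
    where end is "5p" or "3p" (hypothetical layout).
    """
    links = Counter()
    sequence_ids = {sequence_id for sequence_id, _ in read_placements}
    for sequence_id in sequence_ids:
        contig_5p = read_placements.get((sequence_id, "5p"))
        contig_3p = read_placements.get((sequence_id, "3p"))
        if contig_5p is None or contig_3p is None:
            continue  # only one end was placed, so no pair information
        if contig_5p != contig_3p:
            links[tuple(sorted((contig_5p, contig_3p)))] += 1
    return links

# Hypothetical placements for four clones.
placements = {
    ("clone1", "5p"): 10, ("clone1", "3p"): 32,
    ("clone2", "5p"): 32, ("clone2", "3p"): 10,
    ("clone3", "5p"): 39, ("clone3", "3p"): 39,
    ("clone4", "5p"): 41,
}
links = contig_links(placements)
print(links)
```

Contig pairs supported by several independent clones would be the stronger candidates for adjacency.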
  • an ORF is extended using a modified XL-PCR (PE Biosystems) procedure.
  • Oligonucleotide primers, one to initiate 5′ extension and the other to initiate 3′ extension were designed using the nucleotide sequence of the known fragment and OLIGO 4.06 software (National Biosciences).
  • the initial primers were about 22 to 30 nucleotides in length, had a GC content of about 42%, and annealed to the target sequence at temperatures of about 55 C to about 68 C. Any fragment which would result in hairpin structures and primer-primer dimerizations was avoided.
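The length and GC criteria for the initial primers can be checked programmatically, as in the minimal sketch below. The tolerance around the 42% GC target is an assumption, since the text only says "about", and the sketch does not evaluate annealing temperature, hairpins, or primer-primer dimerization. The example sequences are hypothetical.

```python
def gc_content(sequence):
    """Fraction of G and C bases in a DNA sequence."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)

def primer_passes(sequence, min_len=22, max_len=30, target_gc=0.42, gc_tolerance=0.08):
    """Check primer length (22-30 nt) and GC content near 42%.

    gc_tolerance is an assumed reading of "about 42%", not a value from the text.
    """
    if not min_len <= len(sequence) <= max_len:
        return False
    return abs(gc_content(sequence) - target_gc) <= gc_tolerance

# Hypothetical candidate primers.
good_primer = "ATGCATGAGCTAATCGATCACGTA"  # 24 nt, 10 G/C bases
short_primer = "ATGCATGC"                 # 8 nt, too short
print(primer_passes(good_primer), primer_passes(short_primer))
```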
  • the genomic DNA library was used to extend the molecule. If more than one extension was needed, additional or nested sets of primers were designed.
  • The concentration of DNA in each well was determined by dispensing 100 µl PICOGREEN quantitation reagent (0.25% v/v; Molecular Probes) dissolved in 1× TE and 0.5 µl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Costar, Acton Mass.) and allowing the DNA to bind to the reagent. The plate was scanned in a Fluoroskan II (Labsystems Oy, Helsinki, Finland) to measure the fluorescence of the sample and to quantify the concentration of DNA. A 5 µl to 10 µl aliquot of the reaction mixture was analyzed by electrophoresis on a 1% agarose mini-gel to determine which reactions were successful in producing longer sequence.
  • The extended sequences were desalted, concentrated, transferred to 384-well plates, digested with CviJI chlorella virus endonuclease (Molecular Biology Research, Madison Wis.), and sonicated or sheared prior to religation into pUC18 vector (APB).
  • AGARACE enzyme (Promega)
  • Extended fragments were religated using T4 DNA ligase (NEB) into pUC18 vector (APB), treated with Pfu DNA polymerase (Stratagene) to fill-in restriction site overhangs, and transformed into competent E. coli cells. Transformed cells were selected on antibiotic-containing media, and individual colonies were picked and cultured overnight at 37 C in 384-well plates in LB/2× carb liquid media.
  • DNA was quantified by PICOGREEN reagent (Molecular Probes) as described above. Samples with low DNA recoveries were reamplified using the conditions described above.
  • Nucleic acids are isolated from a biological source and applied to a substrate for standard hybridization protocols by one of the following methods.
  • A mixture of nucleic acids, such as a restriction digest of genomic DNA, is fractionated by electrophoresis through a 0.7% agarose gel in 1× TAE running buffer and transferred to a nylon membrane by capillary transfer using 20× saline sodium citrate (SSC).
  • the nucleic acids are individually ligated to a vector and inserted into bacterial host cells to form a library.
  • Nucleic acids are arranged on a substrate by one of the following methods. In the first method, bacterial cells containing individual clones are robotically picked and arranged on a nylon membrane. The membrane is placed on bacterial growth medium, LB agar containing carb, and incubated at 37 C for 16 hours. Bacterial colonies are denatured, neutralized, and digested with proteinase K. Nylon membranes are exposed to UV irradiation in a STRATALINKER UV-crosslinker (Stratagene) to cross-link DNA to the membrane.
  • nucleic acids are amplified from bacterial vectors by thirty cycles of PCR using primers complementary to vector sequences flanking the insert.
  • Amplified nucleic acids are purified using SEPHACRYL-400 beads (APB).
  • Purified nucleic acids are robotically arrayed onto a glass microscope slide (Corning Science Products, Corning N.Y.). The slide is previously coated with 0.05% aminopropyl silane (Sigma-Aldrich, St. Louis Mo.) and cured at 110 C.
  • the arrayed glass slide (microarray) is exposed to UV irradiation in a STRATALINKER UV-crosslinker (Stratagene).
  • DNA probes are made from mRNA templates. Five micrograms of mRNA is mixed with 1 µg random primer (Life Technologies), incubated at 70 C for 10 minutes, and lyophilized. The lyophilized sample is resuspended in 50 µl of 1× first strand buffer (cDNA Synthesis systems; Life Technologies) containing a dNTP mix, [α-32P]dCTP, dithiothreitol, and MMLV reverse transcriptase (Stratagene), and incubated at 42 C for 1-2 hours. After incubation, the probe is diluted with 42 µl dH2O, heated to 95 C for 3 minutes, and cooled on ice.
  • mRNA in the probe is removed by alkaline degradation.
  • the probe is neutralized, and degraded mRNA and unincorporated nucleotides are removed using a PROBEQUANT G-50 column (APB).
  • Probes are labeled with fluorescent markers, Cy3-dCTP or Cy5-dCTP (APB), in place of the radionucleotide, [32P]dCTP.
  • Hybridization is carried out at 65 C in a hybridization buffer containing 0.5 M sodium phosphate (pH 7.2), 7% SDS, and 1 mM EDTA. After the substrate is incubated in hybridization buffer at 65 C for at least 2 hours, the buffer is replaced with 10 ml of fresh buffer containing the probes. After incubation at 65 C for 18 hours, the hybridization buffer is removed, and the substrate is washed sequentially under increasingly stringent conditions, up to 40 mM sodium phosphate, 1% SDS, 1 mM EDTA at 65 C.
  • the substrate is exposed to a PHOSPHORIMAGER cassette (APB), and the image is analyzed using IMAGEQUANT data analysis software (APB).
  • the substrate is examined by confocal laser microscopy, and images are collected and analyzed using GEMTOOLS gene expression analysis software (Incyte Genomics).
  • Molecules complementary to the nucleic acid molecule, or a fragment thereof, are used to detect, decrease, or inhibit gene expression.
  • Although use of oligonucleotides comprising from about 15 to about 30 base pairs is described, the same procedure is used with larger or smaller fragments or derivatives such as peptide nucleic acids (PNAs).
  • Oligonucleotides are designed using OLIGO 4.06 software (National Biosciences) and a nucleic acid molecule of the Sequence Listing or fragment thereof.
  • a complementary oligonucleotide is designed to bind to sequence 5′ of the ORF, most preferably about 10 nucleotides before the initiation codon of the ORF.
  • a complementary oligonucleotide is designed to prevent ribosomal binding to the mRNA encoding the M. catarrhalis protein.
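One way to picture the design of a complementary oligonucleotide that binds 5′ of the ORF is the sketch below. It takes a window of sense-strand sequence ending about 10 nucleotides before the initiation codon and returns its reverse complement. The window placement is one possible reading of the text, and the function names, oligonucleotide length, and example sequence are all hypothetical. This is not the OLIGO 4.06 procedure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]

def antisense_oligo(sense_strand, orf_start, length=15, offset=10):
    """Oligonucleotide complementary to the window ending `offset` nt upstream of the ORF.

    orf_start is the 0-based index of the first base of the initiation codon.
    """
    window_end = orf_start - offset
    window_begin = window_end - length
    if window_begin < 0:
        raise ValueError("not enough upstream sequence for this oligonucleotide")
    return reverse_complement(sense_strand[window_begin:window_end])

# Hypothetical 30 nt upstream region followed by the start of an ORF.
upstream = "ACGTACGTACGTACGAAAAACCCCCGGGGG"
sense = upstream + "ATGAAACGC"
oligo = antisense_oligo(sense, orf_start=len(upstream))
print(oligo)
```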
  • An M. catarrhalis nucleic acid molecule is subcloned into a vector containing an antibiotic resistance gene and the inducible T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory element.
  • Recombinant vectors are transformed into BL21(DE3) competent cells (Stratagene). Antibiotic resistant bacteria express the bacterial protein upon induction with IPTG.
  • the protein is synthesized as a fusion protein with FLAG which permits affinity-based purification of the recombinant fusion protein from crude cell lysates.
  • Kits for immunoaffinity purification using monoclonal and polyclonal anti-FLAG antibodies are commercially available. Following purification the heterogeneous moiety is proteolytically cleaved from the bacterial protein at specifically engineered sites. Purified protein is used directly in the production of antibodies or in activity assays.
  • An M. catarrhalis protein produced as described above or an oligopeptide designed and synthesized using an ABI 431A peptide synthesizer (PE Biosystems) is used to produce an antibody. Animals are immunized with the protein or an oligopeptide-KLH complex in complete Freund's adjuvant. Immunizations are repeated at intervals thereafter in incomplete Freund's adjuvant. After a minimum of seven weeks for mouse or twelve weeks for rabbit, antisera are drawn and tested for antipeptide activity. Testing involves binding the peptide to plastic, blocking with 1% bovine serum albumin, reacting with rabbit antisera, washing, and reacting with radio-iodinated goat anti-rabbit IgG. Methods and machinery well known in the art are used to determine antibody titer and the amount of complex formation.
  • The nucleic acid molecule, or fragments thereof, or the protein, or portions thereof, are labeled with 32P-dCTP, Cy3-dCTP, Cy5-dCTP (APB), or BODIPY or FITC (Molecular Probes), respectively.
  • Libraries of candidate molecules previously arranged on a substrate are incubated in the presence of labeled nucleic acid molecule or protein. After incubation under conditions for either a nucleic acid or amino acid sequence, the substrate is washed, and any position on the substrate retaining label, which indicates specific binding or complex formation, is assayed, and the binding molecule is identified. Data obtained using different concentrations of the nucleic acid or protein are used to calculate affinity between the labeled nucleic acid or protein and the bound molecule.
  • In vivo expression technology is used with the sequences, or ORFs, to identify M. catarrhalis genes specifically induced during infection or under pathogenic conditions (Mahan et al. (1993) Science 259:686).
  • a library of random genomic fragments of M. catarrhalis is made and ligated to a gene for a selectable marker required for survival in the host animal. Only those M. catarrhalis cells harboring a fusion sequence containing an active promoter will survive passage through the host. Fusion bearing promoters with constitutive activity are identified and discarded by examining reporter activity on laboratory medium passaged M. catarrhalis bacteria.
  • Host induced M. catarrhalis genes are identified using the M. catarrhalis sequences and ORFs disclosed herein and the method of differential fluorescence induction described by Valdivia and Falkow, (1996; Mol Microbiol 22:367).
  • The identity of genes required for survival in a host is determined using the signature-tagged transposon method described by Hensel et al. (1995; Science 269:400).
  • A library of M. catarrhalis mutants is marked with a unique oligonucleotide sequence for each disrupted gene. After passage of the library through an infected animal or other selective environment, putative survival genes are identified by absence of the mutant from the passaged library.
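The readout of the signature-tagged approach reduces to a set difference: tags present in the input library but absent after passage point to genes that may be required for survival. The sketch below illustrates that comparison. The tag names, gene names, and function name are hypothetical, and real experiments would also need to account for detection noise.

```python
def candidate_survival_genes(tag_to_gene, input_tags, recovered_tags):
    """Genes whose tagged mutants are in the input pool but missing after passage."""
    missing_tags = set(input_tags) - set(recovered_tags)
    return sorted(tag_to_gene[tag] for tag in missing_tags)

# Hypothetical tag assignments and recovery results.
tag_to_gene = {"TAG01": "geneA", "TAG02": "geneB", "TAG03": "geneC"}
input_pool = ["TAG01", "TAG02", "TAG03"]
recovered_pool = ["TAG01", "TAG03"]
candidates = candidate_survival_genes(tag_to_gene, input_pool, recovered_pool)
print(candidates)
```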
  • subtilis
    10 | 19988 | 15744 | 16295 | MCA100456 | g1805560 | 3.00E−36 | phosphoribosylglycinamide formyltransferase (EC 2.1.2.2)
    10 | 19988 | 16331 | 17356 | MCA100457 | g1788845 | e−130 | phosphoribosylaminoimidazole synthetase AIR synthetase
    10 | 19988 | 17685 | 18677 | MCA100458 | g3861171 | 2.00E−27 | putative permease homolog (perM)
    10 | 19988 | 18921 | 19685 | MCA100459 | g3212215 | 2.00E−11 | conserved hypothetical protein
    10 | 19988 | 5532 | 8192 | MCA100516 | g1800083 | 0 | Alanyl-tRNA Synthetase (EC 6.1.1.7)
    10 | 19988 | 8821 | 10335 | MCA100518 | g2632668 | 3.00E−69 | similar to di-tripeptide ABC transporter
    10 | 19988 | 3517 | 4892 | MCA100711 | g157
  • influenzae HI0735
    32 | 62909 | 6025 | 6510 | MCA100650 | g1786736 | 1.00E−52 | peptidyl-prolyl cis-trans isomerase B (rotamase B)
    32 | 62909 | 4072 | 5826 | MCA100651 | g1574816 | e−175 | glutaminyl-tRNA synthetase (glnS)
    32 | 62909 | 2634 | 3977 | MCA100652 | g3850110 | 3.00E−60 | rrm3-pif1 helicase homolog
    32 | 62909 | 1016 | 2038 | MCA100654 | g39921 | 3.00E−75 | glyceraldehyde-3-phosphate dehydrogenase (AA 1-335)
    32 | 62909 | 54353 | 54796 | MCA100831 | g1573349 | 3.00E−38 | conserved hypothetical protein
    32 | 62909 | 54874 | 56076 | MCA100832 | g1788879 | e−169 | putative aminotransferase
    32 | 62909 | 56256 | 56636 | MCA
  • tuberculosis MTCY277.09
    39 | 100848 | 36123 | 37547 | MCA100305 | g2984771 | e−101 | PhpA
    39 | 100848 | 34625 | 35815 | MCA100306 | g409800 | e−132 | tyrosine aminotransferase
    39 | 100848 | 89115 | 89381 | MCA100389 | g429056 | 1.00E−26 | ribosomal protein S15
    39 | 100848 | 89607 | 91682 | MCA100390 | g3650364 | 0 | polyribonucleotide nucleotidyltransferase
    39 | 100848 | 91827 | 92300 | MCA100391 | g2959336 | 4.00E−46 | hypothetical protein
    39 | 100848 | 92532 | 92957 | MCA100392 | g1100876 | 5.00E−19 | hypothetical OrfY
    39 | 100848 | 92969 | 93382 | MCA100393 | g1789538 | 2.00E−08 | orf, hypothetical protein
    39 | 100848 | 93467 | 94066 | MCA100394 | g178
  • influenzae U32836
    41 | 269223 | 140754 | 141998 | MCA101393 | g1787438 | e−138 | D-amino acid dehydrogenase subunit
    41 | 269223 | 142379 | 144201 | MCA101394 | g1790427 | 0 | thiamin biosynthesis, pyrimidine moiety
    41 | 269223 | 144333 | 146159 | MCA101395 | g1574084 | 0 | ABC transporter, ATP-binding protein
    41 | 269223 | 146383 | 147726 | MCA101396 | g2635428 | e−130 | argininosuccinate lyase
    41 | 269223 | 147971 | 148915 | MCA101397 | g41666 | e−100 | porphobilinogen deaminase (AA 1-313)
    41 | 269223 | 149877 | 150605 | MCA101399 | g1573875 | 4.00E−46 | conserved hypothetical protein
    41 | 269223 | 38460 | 38705 | MCA101530 | g42543 | 1.00E−13 | pspE protein
    41 | 269223 | 31815 | 327
  • subtilis ribG
    41 | 269223 | 119545 | 120186 | MCA101660 | g150707 | 3.00E−49 | riboflavin synthetase alpha subunit
    41 | 269223 | 118437 | 119363 | MCA101661 | g3328155 | 4.00E−69 | methionyl-tRNA formyltransferase
    41 | 269223 | 117032 | 118369 | MCA101662 | g1573620 | 7.00E−65 | sun protein (sun)
    41 | 269223 | 115305 | 116708 | MCA101663 | g2160269 | e−153 | threonine synthase
    41 | 269223 | 114048 | 115172 | MCA101664 | g1574014 | 2.00E−44 | DNA processing chain A (dprA)
    41 | 269223 | 113447 | 114028 | MCA101665 | g2367210 | 1.00E−19 | orf, hypothetical protein
    41 | 269223 | 110508 | 111677 | MCA101668 | g1460081 | 3.00E−85 | hypothetical protein Rv2559c
    41 | 269223 | 109304 | 1098

Abstract

The present invention provides the genomic sequences of a library of purified polynucleotides, or their complements, comprising the genome of Moraxella catarrhalis. The invention also provides the identification of open reading frames contained within the polynucleotides of the library. The present invention further provides for the use of the polynucleotides, their complements or fragments, and proteins or portions thereof for identifying ligands and useful diagnostic and therapeutic compositions. In addition, the invention provides for vectors, host cells and methods for producing M. catarrhalis proteins or portions thereof.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation of U.S. patent application Ser. No. 09/596,002, filed on Jun. 16, 2000, which claims priority under 35 U.S.C. §119(e) to U.S. Provisional Application Serial No. 60/140,121, filed Jun. 18, 1999, both of which are hereby expressly incorporated herein by reference in their entireties.[0001]
  • A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever. [0002]
  • TECHNICAL FIELD
  • [0003] The present invention discloses nucleotide sequences from the genome of Moraxella catarrhalis. These sequences may be used in various assays and in the development of diagnostic and therapeutic agents.
  • Sequence Listing
  • The present application is being filed along with duplicate copies of a CD-ROM marked “Copy 1” and “Copy 2” containing a Sequence Listing in electronic format. The duplicate copies of the CD-ROM each contain a file entitled ELITRA.025C1.txt created on Sep. 26, 2003 which is 2,330,432 bytes in size. The information on these duplicate CD-ROMs is incorporated herein by reference in its entirety. [0004]
  • BACKGROUND OF INVENTION
  • All animals coexist with an indigenous microflora. Beginning shortly after birth, the gastrointestinal tract, lungs, and other areas of the human body are colonized by different bacterial species. A large number of factors operate to maintain symbiotic, host-microbe balance. These include the physical barriers of skin and mucosal surfaces and both nonspecific and highly specific aspects of the immune system. When host-microbe balance becomes disturbed, infection may ensue. Virulence, the ability of a microbe to produce infection, is related to a variety of complex mechanisms of disease induction. Some organisms are highly virulent and cause clinical illness when they colonize most or all hosts. Alternatively, when host defenses are compromised, normally symbiotic microbes can induce serious, or even life-threatening, infections. Thus, infection is generally a consequence of the interaction between a relatively virulent microbe and a normal host or between a relatively less virulent microbe and a host with some degree of transient or permanent immunological impairment. [0005]
  • [0006] M. catarrhalis (Branhamella catarrhalis) is a large, aerobic, gram-negative diplococcus normally found among the bacterial flora of human upper airways. It is nonmotile and possesses fimbriae. Colonies are regularly friable and nonadherent and grow well on blood or chocolate agar. Unlike many other pathogenic bacteria, M. catarrhalis shows a high degree of homogeneity in its outer membrane proteins. This usually harmless parasite of the mucous membranes may behave as an opportunistic pathogen when microbe-host balance is perturbed. Following infection, host antibodies directed against one or more of the microbial outer-membrane proteins are detectable in the serum.
  • [0007] M. catarrhalis is known to cause acute, localized infections such as otitis media, sinusitis, and bronchopulmonary infection and life-threatening, systemic diseases including endocarditis and meningitis. The presence of bacterial endotoxin and host histamine and chemotactic factors are major indicators of M. catarrhalis pathogenicity.
  • [0008] M. catarrhalis can be isolated from the upper respiratory tract of 50% of healthy school children and 7% of healthy adults. In children with otitis media, colonization increases to 86%, and it is the third most common bacterial isolate. It causes 10-15% of otitis media and sinusitis. Infections of the maxillary sinuses, middle ears, or bronchi may occur through contiguous spread of the microbes. M. catarrhalis causes a large proportion of lower respiratory tract infections in elderly patients with chronic obstructive pulmonary diseases and is exceeded only by Haemophilus influenzae and Streptococcus pneumoniae as a causative agent of acute purulent exacerbations of chronic bronchitis.
  • [0009] Pneumonia due to M. catarrhalis, like that of H. influenzae or S. pneumoniae, begins with aspiration of the bacteria. Failure or absence of appropriate host defense allows the bacteria to replicate and produce an inflammatory response in the alveoli. Because of mandatory immunosuppression, organ transplant recipients can develop moderate to severe M. catarrhalis pneumonia very rapidly. Bloodstream invasion is less characteristic of M. catarrhalis than pneumococcal infection, but nearly 50% of M. catarrhalis pneumonia patients die within 3 months of onset.
  • [0010] M. catarrhalis is treated with antibiotic agents including penicillin-clavulanic acid combinations, cephalosporins, tetracycline, erythromycin, chloramphenicol, trimethoprim-sulfamethoxazole, and quinolones. Over 85% of M. catarrhalis clinical isolates have been reported to be resistant to penicillin. Moreover, the microbe protects itself by binding to the first subcomponent of the complement system (C1q) which inactivates the C1 complex or by inactivating the terminal, lytic complement complex via a protein on the outer cell wall surface. Resistance is mediated by two closely related β-lactamases, BRO-1, present in 90% of resistant isolates and BRO-2, present in 10%. These enzymes are active against penicillin, ampicillin, and amoxicillin, less active against cephalosporins, and bind avidly to clavulanic acid and sulbactam. Tetracycline resistant strains are increasing in Europe and Asia and have been documented in the United States. Ampicillin, which had been universally effective in treating M. catarrhalis pneumonia, can no longer be used.
  • [0011] M. catarrhalis physiology and pathogenicity are reviewed in: Holt et al. (1994) Bergey's Manual of Determinative Bacteriology, Williams and Wilkins, Baltimore Md.; Cullmann (1997) Med Klin 92(3):162-166; Isselbacher et al. (1994) Harrison's Principles of Internal Medicine, McGraw-Hill, New York N.Y.; Murray (1995) Manual of Clinical Microbiology, ASM Press, Washington D.C.; and Shulman et al. (1997) The Biologic and Clinical Basis of Infectious Diseases, W B Saunders, Philadelphia Pa.
  • [0012] In view of the conditions or diseases associated with M. catarrhalis, it would be advantageous to provide specific methods for the diagnosis, prevention, and treatment of diseases attributed to M. catarrhalis. Relevant methods would be based on the expression of M. catarrhalis-derived nucleic acid sequences. Such traits as virulence, acquisition of resistance factors, and effects of treatment using particular therapeutic agents may be characterized by under- or over-expression of nucleic acid sequences as revealed using PCR, hybridization or microarray technologies. Treatment for diseases attributed to M. catarrhalis can then be based on expression of these identified sequences or their expressed proteins, and efficacy of any particular therapy and development of resistance monitored. The information provided herein provides the basis for understanding the pathogenicity of M. catarrhalis and treating and monitoring the treatment of diseases caused by M. catarrhalis.
  • SUMMARY OF THE INVENTION
  • [0013] The present invention relates to a genomic library comprising the combination of nucleic acid molecules from Moraxella catarrhalis, presented as SEQ ID NOs:1-41. The library substantially provides the nucleic acid molecules comprising the genome of M. catarrhalis, and the nucleic acid molecules provide a plurality of open reading frames (ORFs). The ORFs uniquely identify structural, functional, and regulatory genes of M. catarrhalis. The invention encompasses oligonucleotides, fragments, and derivatives of the M. catarrhalis nucleic acid molecules, and sequences complementary to the nucleic acid molecules listed in the Sequence Listing.
  • [0014] M. catarrhalis nucleic acid molecules, fragments, derivatives, oligonucleotides, and complementary sequences thereof, can be used as probes to detect, amplify, or quantify M. catarrhalis genes, ORFs, cDNAs, or RNAs in biological, solution or substrate-based, assays or as compositions in diagnostic kits. The invention contemplates the use of such diagnostic probes to identify the presence of M. catarrhalis sequence in a sample or to screen for virulence factors and mutations.
  • [0015] The invention also provides for the comparison of the M. catarrhalis genomic library or the encoded proteins with genomes, individual DNA sequences, or proteins from other Moraxella species or strains, other bacteria, and other organisms to identify virulence factors, regulatory elements, drug targets, and to characterize genomic organization. In another aspect, the present invention provides for the use of computer databases to make such comparisons.
  • [0016] The invention further provides host cells and expression vectors comprising nucleic acid molecules of the invention and methods for the production of the proteins they encode. Such methods include culturing the host cells under conditions for expression of M. catarrhalis protein and recovering the protein from cell culture. The invention still further provides purified M. catarrhalis protein of which at least a portion is encoded by a nucleic acid molecule selected from the nucleic acid molecules of the Sequence Listing.
  • [0017] The subject invention provides a method of screening a library or a plurality of molecules or compounds for specific binding to a M. catarrhalis nucleic acid molecule or fragment thereof or protein or portion thereof, to identify at least one ligand which specifically binds the M. catarrhalis nucleic acid molecule or protein. Such a method comprises the steps of combining the M. catarrhalis nucleic acid molecule or protein with a library or a plurality of molecules or compounds under conditions to allow specific binding and detecting M. catarrhalis nucleic acid molecule or protein bound to at least one molecule or compound, thereby identifying a ligand which specifically binds the nucleic acid molecule or protein. Suitable libraries of ligands comprise aptamers, DNA molecules, RNA molecules, peptide nucleic acids, peptides, mimetics, proteins, agonists, antagonists, antibodies, inhibitors, immunoglobulins, pharmaceutical agents, and drug compounds.
  • [0018] The subject invention also provides a method of purifying a ligand from a sample. Such a method comprises the steps of combining the M. catarrhalis nucleic acid molecule or protein with a library or a plurality of molecules or compounds under conditions to allow specific binding, detecting M. catarrhalis nucleic acid molecule or protein bound to at least one molecule or compound, recovering the bound M. catarrhalis nucleic acid molecule or protein and separating the bound M. catarrhalis nucleic acid molecule or protein from the ligand, thereby obtaining purified ligand.
  • [0019] The invention further comprises an antibody specific for a purified M. catarrhalis protein or a portion thereof which is encoded by an M. catarrhalis nucleic acid molecule selected from the Sequence Listing. Antibodies produced against M. catarrhalis protein may be used diagnostically for the detection of M. catarrhalis proteins in biological, solution- or substrate-based, samples and therapeutically to neutralize the activity of an M. catarrhalis protein expressed during infections caused by M. catarrhalis.
  • DESCRIPTION OF THE SEQUENCE LISTING AND TABLES
  • [0020] The Sequence Listing is a compilation of the consensus sequences of contiguous sequences (contigs) or groups of overlapping sequences, assembled from individual sequences obtained by sequencing genomic clone inserts of a randomly generated M. catarrhalis DNA library. Each assembled contig or singlet is identified by a sequence identification number (SEQ ID NO) and by the contig number which it represents.
  • [0021] Table 1 lists the assembled M. catarrhalis contiguous sequences prepared as described in the Examples. The first column contains the number of the contig, which is also SEQ ID NO, listed in ascending order. The second column contains the length of the nucleic acid molecule. The third and fourth columns contain the start and stop nucleotides, respectively, for any open reading frames (ORFs) in the contig. The fifth column contains the Locus ID. The sixth column lists the GenBank identification number of the closest homolog, if any. The seventh column gives the P-value for the match to the homolog. The last column contains the description of the homolog. Orphans or LURs have no GenBank homologs.
  • [0022] Table 2 shows the order of the contigs or singlets comprising the M. catarrhalis genome.
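The eight-column layout described for Table 1 can be read into a simple record, as in the following minimal sketch. It is illustrative only and not part of the disclosure: the names `OrfRecord`, `parse_row`, and `p_value_as_float` are hypothetical, the row is assumed to be whitespace-separated with a GenBank homolog present (orphans or LURs would need separate handling), and P-values written in shorthand such as "e-130" are assumed to mean 1e-130.

```python
from dataclasses import dataclass


@dataclass
class OrfRecord:
    contig: int        # column 1: contig number, which is also the SEQ ID NO
    length: int        # column 2: length of the nucleic acid molecule
    start: int         # column 3: start nucleotide of the ORF
    stop: int          # column 4: stop nucleotide of the ORF
    locus_id: str      # column 5: Locus ID
    genbank_id: str    # column 6: GenBank identifier of the closest homolog
    p_value: str       # column 7: P-value, kept as written in the table
    description: str   # column 8: description of the homolog


def parse_row(row: str) -> OrfRecord:
    """Split a row into the first seven single-token columns plus a free-text description."""
    fields = row.split(maxsplit=7)
    if len(fields) < 8:
        raise ValueError("expected eight columns; rows without a homolog are not handled here")
    contig, length, start, stop, locus_id, genbank_id, p_value, description = fields
    return OrfRecord(int(contig), int(length), int(start), int(stop),
                     locus_id, genbank_id, p_value, description)


def p_value_as_float(p_value: str) -> float:
    """Convert a P-value string to a float, assuming 'e-130' is shorthand for 1e-130."""
    text = "1" + p_value if p_value.lower().startswith("e") else p_value
    return float(text)


record = parse_row(
    "10 19988 15744 16295 MCA100456 g1805560 3.00E-36 "
    "phosphoribosylglycinamide formyltransferase (EC 2.1.2.2)"
)
orf_length = record.stop - record.start + 1  # inclusive coordinates assumed
```

Keeping the P-value as text preserves the table's own notation; the conversion helper is only needed when rows are to be sorted or filtered by significance.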
  • DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • It is understood that this invention is not limited to the particular machines, materials and methods described. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the scope of the present invention which will be limited only by the appended claims. As used herein, the singular forms “a”, “an”, and “the” include plural reference unless the context clearly dictates otherwise. For example, a reference to “a host cell” includes a plurality of such host cells known to those skilled in the art. [0023]
  • All patents and publications cited for the purpose of describing and disclosing the cell lines, protocols, reagents and vectors which might be used in connection with the invention are expressly incorporated by reference. Citation is for the purpose of providing the best description of the invention and is not to be construed as an admission that the invention is not entitled to antedate such disclosure. [0024]
  • Definitions [0025]
  • “Biologically active” refers to a protein having structural, immunological, regulatory, or chemical functions of a naturally occurring, recombinant, or synthetic molecule. [0026]
  • “Complementary” refers to the natural hydrogen bonding by base pairing between purines and pyrimidines. For example, the sequence A-C-G-T forms hydrogen bonds with its complements T-G-C-A or U-G-C-A. The degree of complementarity between nucleic acid strands affects the efficiency and strength of the hybridization and amplification reactions. [0027]
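The base-pairing rule in this definition can be sketched in a few lines of code. The example below is illustrative only; the function names are hypothetical, sequences are assumed to contain only A, C, G, and T, and the two strands passed to `fraction_paired` are assumed to be written so that position i of one pairs with position i of the other, as in the A-C-G-T and T-G-C-A example.

```python
# Translation tables for the DNA complement and the RNA complement of a DNA strand.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_COMPLEMENT = str.maketrans("ACGT", "UGCA")


def complement(seq: str, rna: bool = False) -> str:
    """Return the base-by-base complement of a DNA sequence (as DNA, or as RNA if rna=True)."""
    table = RNA_COMPLEMENT if rna else DNA_COMPLEMENT
    return seq.upper().translate(table)


def reverse_complement(seq: str) -> str:
    """Return the complementary strand read in its own 5' to 3' direction."""
    return complement(seq)[::-1]


def fraction_paired(strand_a: str, strand_b: str) -> float:
    """Fraction of aligned positions at which the two DNA strands can base-pair."""
    if len(strand_a) != len(strand_b) or not strand_a:
        raise ValueError("strands must be non-empty and of equal length")
    expected = complement(strand_a)
    matches = sum(1 for x, y in zip(expected, strand_b.upper()) if x == y)
    return matches / len(strand_a)
```

A value of 1.0 from `fraction_paired` corresponds to complete complementarity; lower values indicate mismatches of the kind tolerated only at reduced stringency.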
  • “Derivative” refers to the chemical modification of a nucleic acid or amino acid molecule. Chemical modifications can include replacement of hydrogen by an alkyl, acyl, or amino group or glycosylation, pegylation, or any similar process which retains or enhances biological activity, stability, or lifespan of the molecule. [0028]
  • “Fragment” refers to an Incyte clone or any part of a nucleic acid molecule which retains a usable, functional characteristic. Useful fragments include oligonucleotides which may be used in hybridization or amplification technologies or to regulate replication, transcription or translation. [0029]
  • “Hybridization complex” refers to a complex between two nucleic acid molecules by virtue of the formation of hydrogen bonds between purines and pyrimidines. [0030]
  • “Ligand” refers to any molecule or compound which will bind to a complementary site on a nucleic acid molecule or protein. [0031]
  • “Modulates” refers to a change in activity (biological, chemical, or immunological) or lifespan resulting from specific binding between a molecule or compound and either a nucleic acid molecule or a protein. [0032]
  • “Molecules” is used substantially interchangeably with the terms agents and compounds. Such molecules modulate the activity of nucleic acid molecules or proteins of the invention and may be composed of at least one of the following: inorganic and organic substances including cofactors, nucleic acids, proteins, carbohydrates, fats, and lipids. [0033]
  • “Nucleic acid molecule” is substantially interchangeable with the term polynucleotide and may refer to a probe, a fragment of DNA or RNA of genomic or synthetic origin. Such molecules may be double-stranded or single-stranded and may be engineered into vectors to perform a particular activity such as transcription. [0034]
  • “Oligonucleotide” is substantially equivalent to the terms “amplimer”, “primer”, “oligomer”, and “element”, and is preferably single stranded. [0035]
  • “Protein” refers to an amino acid sequence, oligopeptide, peptide, polypeptide or portions thereof whether naturally occurring or synthetic. [0036]
  • “Portion” refers to any part of a protein used for any purpose, but especially for the screening of a library of molecules or compounds which specifically bind to that portion or for the production of antibodies. [0037]
  • “Sample” is used in its broadest sense. A sample containing nucleic acid molecules may comprise a bodily fluid; an extract from a cell, chromosome, organelle, or membrane isolated from a cell; genomic DNA, RNA, or cDNA in solution or bound to a substrate; a cell; a tissue; a tissue print; a hair, and the like. [0038]
  • “Substantially purified” refers to nucleic acid molecules or proteins that are isolated or separated from their natural environment and are about 60% free to about 90% free from other components with which they are naturally associated. [0039]
  • “Substrate” refers to any rigid or semi-rigid support to which nucleic acid molecules or proteins are bound and includes membranes, filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, capillaries or other tubing, plates, polymers, and microparticles with a variety of surface forms including wells, trenches, pins, channels and pores. [0040]
  • The Invention [0041]
  • [0042] The majority of the Moraxella catarrhalis genome was sequenced using a strategy of shotgun sequencing. Genomic DNA was mechanically sheared, treated with enzyme to create blunt ends, gel-purified, and cloned into modified PBLUESCRIPT vectors (Stratagene, La Jolla Calif.). The vectors were transformed into E. coli cells and grown overnight. Colonies were picked, and plasmid DNA was isolated. Templates were prepared and sequenced, sequences were assembled into contiguous sequences (contigs), and open reading frames were identified.
  • [0043] The invention relates to a Moraxella catarrhalis genomic DNA library comprising a combination of nucleic acid molecules, SEQ ID NOs:1-41, and their complements. These nucleic acid molecules comprise contiguous sequences which contain annotated and unannotated reading frames (ORFs and LURs). The nucleic acid molecules or fragments and probes thereof are used in hybridization, screening, and purification assays to identify ligands and in vectors and host cells to produce the proteins which they encode. The proteins or portions thereof are also used in screening and purification assays to identify useful ligands or to produce antibodies. The molecules or compounds used in hybridization, screening, and purification assays include aptamers, DNA molecules, RNA molecules, peptide nucleic acids, peptides, mimetics, transcription factors, enhancers, repressors, regulatory proteins, agonists, antagonists, antibodies, inhibitors, immunoglobulins, pharmaceutical agents, drug compounds, and the like. The nucleic acid molecules and proteins of M. catarrhalis are compared with those of other organisms using computer algorithms and databases to select those nucleic acid molecules and proteins of potential diagnostic and therapeutic use.
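The final step named above, identification of open reading frames in an assembled contig, can be illustrated with a minimal six-frame scan. This sketch is not the procedure used in the Examples: it assumes ORFs begin at ATG only (bacterial genes may also use alternative start codons), end at the first in-frame TAA, TAG, or TGA, and are reported as 1-based inclusive coordinates on the given strand. The function names and the `min_codons` threshold are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orfs_in_strand(seq: str, min_codons: int):
    """Yield 0-based half-open (start, end) spans of ATG...stop ORFs, stop codon included."""
    n = len(seq)
    for frame in range(3):
        start = None
        for i in range(frame, n - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first in-frame ATG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    yield start, i + 3
                start = None                   # look for the next ORF in this frame


def find_orfs(seq: str, min_codons: int = 30):
    """Return sorted (start, stop, strand) tuples, 1-based inclusive, in forward coordinates."""
    seq = seq.upper()
    n = len(seq)
    found = [(s + 1, e, "+") for s, e in orfs_in_strand(seq, min_codons)]
    for s, e in orfs_in_strand(revcomp(seq), min_codons):
        found.append((n - e + 1, n - s, "-"))  # map back to forward-strand positions
    return sorted(found)


toy_contig = "CC" + "ATG" + "AAA" * 3 + "TAA" + "GG"
toy_orfs = find_orfs(toy_contig, min_codons=3)
```

For the toy contig the scan finds one forward-strand ORF spanning positions 3 to 17; real annotation would additionally compare each candidate against sequence databases, as the homolog columns of Table 1 suggest.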
  • Characterization and Use of the Invention [0044]
  • Sequencing [0045]
  • [0046] Methods for sequencing nucleic acid molecules are well known in the art and may be used to practice any of the embodiments of the invention. These methods employ enzymes such as the Klenow fragment of DNA polymerase I, SEQUENASE, Taq DNA polymerase, thermostable T7 DNA polymerase (Amersham Pharmacia Biotech (APB), Piscataway N.J.), or combinations of polymerases and proofreading exonucleases such as those found in the ELONGASE amplification system (Life Technologies, Rockville Md.). Preferably, sequence preparation is automated with machines such as the HYDRA microdispenser (Robbins Scientific, Sunnyvale Calif.), MICROLAB 2200 system (Hamilton, Reno Nev.), and the DNA ENGINE thermal cycler (MJ Research, Watertown Mass.). Machines used for sequencing include the ABI 3700, 377 or 373 DNA sequencing systems (PE Biosystems, Foster City Calif.), the MEGABACE 1000 DNA sequencing system (APB), and the like. The sequences may be analyzed using a variety of algorithms which are well known in the art and described in Ausubel (1997; Short Protocols in Molecular Biology, John Wiley & Sons, New York N.Y., unit 7.7) and in Meyers (1995; Molecular Biology and Biotechnology, Wiley VCH, New York N.Y., pp. 856-853).
  • Shotgun sequencing methods are well known in the art and use thermostable DNA polymerases and heat-labile DNA polymerases. A detailed procedure is provided in the Examples. Prefinished sequences (incomplete assembled sequences) are cross-compared for identity using various algorithms or programs such as CONSED (Gordon (1998) Genome Res. 8:195-202), GELVIEW Fragment Assembly system (Genetics Computer Group, Madison Wis.), and PHRAP (Phil Green, University of Washington, Seattle Wash.). Contaminating sequences, including vector or chimeric sequences, can be masked, removed or restored, in the process of turning the prefinished sequences into finished sequences. [0047]
  • Extension of a Nucleic Acid Sequence [0048]
  • The sequences of the invention may be extended using various PCR-based methods known in the art. For example, the XL-PCR kit (PE Biosystems), nested primers, and commercially available cDNA or genomic DNA libraries (Life Technologies and Clontech (Palo Alto Calif.), respectively) may be used to extend the nucleic acid sequence. For all PCR-based methods, primers may be designed using commercially available software, such as OLIGO 4.06 software (National Biosciences, Plymouth Minn.) to be about 22 to 30 nucleotides in length, to have a GC content of about 40-45%, and to anneal to a target molecule at temperatures from about 55°C to about 68°C. When extending a sequence to recover untranslated, regulatory elements, it is preferable to use genomic, rather than cDNA libraries. [0049]
  • Use of M. catarrhalis Nucleic Acid Molecules [0050]
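The primer criteria stated above (length, GC content, annealing temperature) can be checked with a short script. The following is a sketch under stated assumptions, not a description of how OLIGO 4.06 works: the melting temperature uses a common basic approximation, Tm = 64.9 + 41 × (G + C − 16.4) / N, as a stand-in for the annealing temperature, and the function names and the example primer are hypothetical.

```python
def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    p = primer.upper()
    return 100.0 * (p.count("G") + p.count("C")) / len(p)


def approx_tm(primer: str) -> float:
    """Basic melting-temperature estimate in degrees C (an assumed approximation)."""
    p = primer.upper()
    gc = p.count("G") + p.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(p)


def check_primer(primer: str,
                 length_range=(22, 30),
                 gc_range=(40.0, 45.0),
                 tm_range=(55.0, 68.0)) -> dict:
    """Report which of the three design criteria from the text the primer satisfies."""
    return {
        "length_ok": length_range[0] <= len(primer) <= length_range[1],
        "gc_ok": gc_range[0] <= gc_percent(primer) <= gc_range[1],
        "tm_ok": tm_range[0] <= approx_tm(primer) <= tm_range[1],
    }


# Hypothetical 30-nucleotide primer with 12 G/C bases (40% GC).
example_primer = "ATGC" * 6 + "ATATAT"
example_report = check_primer(example_primer)
```

Dedicated design software typically uses more detailed thermodynamic models, so the Tm check here should be read as a rough screen only.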
  • Hybridization [0051]
  • [0052] The M. catarrhalis nucleic acid molecules and fragments thereof can be used in various hybridization technologies for various purposes. Hybridization probes may be designed or derived from a highly unique region such as the 5′ untranslated sequence preceding the initiation codon or from a conserved coding region encoding a specific protein signature or motif and used in protocols to identify naturally occurring molecules encoding a particular M. catarrhalis protein, allelic variants, or related molecules. The probe should preferably have at least 50% sequence identity to any naturally occurring nucleic acid sequences. The probe may be a single stranded DNA or RNA molecule, produced biologically or synthetically, and labeled using oligolabeling, nick translation, end-labeling, or PCR amplification in the presence of at least one labeled nucleotide. A vector containing the nucleic acid molecule or a fragment thereof may be used to produce an mRNA probe in vitro by addition of an RNA polymerase and labeled nucleotides. These procedures may be conducted using commercially available kits such as those provided by APB.
  • [0053] The stringency of hybridization is determined by G+C content of the probe, salt concentration, and temperature. In particular, stringency can be increased by reducing the concentration of salt or raising the hybridization temperature. In solutions used for some membrane based hybridizations, addition of an organic solvent such as formamide allows the reaction to occur at a lower temperature. Hybridization can be performed at low stringency with buffers, such as 5×SSC with 1% sodium dodecyl sulfate (SDS) at 60°C, which permits the formation of a hybridization complex between nucleic acid sequences that contain some mismatches. Subsequent washes are performed at increased stringency with buffers such as 0.2×SSC with 0.1% SDS at either 45°C (medium stringency) or 68°C (high stringency). At high stringency, hybridization complexes will remain stable only where the nucleic acid molecules are completely complementary. In some membrane-based hybridizations, 35-50% formamide can be added to the hybridization solution to reduce the temperature at which hybridization is performed. Background signals can be reduced by the use of other detergents such as Sarkosyl or TRITON X-100 (Sigma-Aldrich, St. Louis Mo.) and a blocking agent such as denatured salmon sperm DNA. Selection of components and conditions for hybridization are well known to those skilled in the art and are reviewed in Ausubel (supra) and in Sambrook et al. (1989; Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Plainview N.Y.).
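The dependence of stringency on G+C content, salt, and formamide can be illustrated with a widely used empirical approximation for the melting temperature of long DNA:DNA hybrids: Tm = 81.5 + 16.6·log10[Na+] + 0.41·(%GC) − 0.61·(%formamide) − 500/L. The sketch below is not part of the disclosure; coefficients vary slightly between references, the function name `duplex_tm` is hypothetical, and 1×SSC is assumed to contribute about 0.195 M sodium (0.15 M NaCl plus 0.015 M trisodium citrate).

```python
import math

# Assumed sodium ion concentration of 1x SSC, in mol/L.
SSC_1X_SODIUM_MOLAR = 0.195


def duplex_tm(gc_percent: float, length: int, ssc_fold: float,
              formamide_percent: float = 0.0) -> float:
    """Estimate the melting temperature (degrees C) of a DNA:DNA hybrid.

    gc_percent        -- G+C content of the hybridizing region, in percent
    length            -- length of the hybrid in base pairs
    ssc_fold          -- buffer strength as a multiple of 1x SSC (e.g. 5 or 0.2)
    formamide_percent -- formamide concentration in percent (v/v)
    """
    sodium = ssc_fold * SSC_1X_SODIUM_MOLAR
    return (81.5
            + 16.6 * math.log10(sodium)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length)


# Compare the hybridization and wash buffers named in the text for a
# hypothetical 500 bp probe with 50% G+C.
tm_hybridization = duplex_tm(50.0, 500, ssc_fold=5.0)
tm_wash = duplex_tm(50.0, 500, ssc_fold=0.2)
tm_with_formamide = duplex_tm(50.0, 500, ssc_fold=5.0, formamide_percent=50.0)
```

Because the estimated Tm is lower in 0.2×SSC than in 5×SSC, a wash at a given temperature sits closer to the melting point and is therefore more stringent; formamide lowers the Tm in the same way, which is why it allows hybridization at a reduced temperature.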
  • Microarrays may be prepared and analyzed using methods known in the art. Oligonucleotides or fragments of a nucleic acid molecule may be used as either probes or targets. The microarray can be used to monitor the expression level of large numbers of genes simultaneously and to identify genetic variants, mutations, and single nucleotide polymorphisms. Such information may be used to determine gene function; to understand the genetic basis of a condition, disease, or disorder; to diagnose a condition, disease, or disorder; and to develop and monitor the activities of therapeutic agents used to treat the condition, disease, or disorder. (See, eg, Brennan et al. (1995) U.S. Pat. No. 5,474,796; Schena et al. (1996) Proc Natl Acad Sci 93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/251116; Shalon et al. (1995) PCT application WO95/35505; Heller et al. (1997) Proc Natl Acad Sci 94:2150-2155; and Heller et al. (1997) U.S. Pat. No. 5,605,662.) [0054]
  • Hybridization probes are also useful in mapping the naturally occurring genomic sequence. The probes may be hybridized to: 1) a particular chromosome, 2) a specific region of a chromosome, 3) artificial chromosome constructions such as human artificial chromosomes (HACs), yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), or bacterial P1 constructions, 4) single chromosomes from eukaryotic species, or 5) DNA libraries made from any of these sources. [0055]
  • Expression [0056]
  • [0057] A nucleic acid molecule encoding a M. catarrhalis protein may be cloned into a vector and used to express the protein or portions thereof in host cells. The nucleic acid sequence can be engineered by such methods as DNA shuffling (U.S. Pat. No. 5,830,721) and site-directed mutagenesis to create new restriction sites, alter glycosylation patterns, change codon preference to increase expression in a particular host, produce splice variants, extend half-life, and the like. The expression vector may contain transcriptional and translational control elements (promoters, enhancers, specific initiation signals, and polyadenylated sequence) from various sources which have been selected for their efficiency in a particular host. The vector, nucleic acid molecule, and regulatory elements are combined using in vitro recombinant DNA techniques, synthetic techniques, and/or in vivo genetic recombination techniques well known in the art and described in Sambrook (supra ch. 4, 8, 16 and 17).
  • A variety of host systems may be transformed with an expression vector. These include, but are not limited to, bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems transformed with baculovirus expression vectors; plant cell systems transformed with expression vectors containing viral and/or bacterial elements, or animal cell systems (Ausubel, supra, unit 16). [0058]
  • Routine cloning, subcloning, and propagation of nucleic acid molecules can be achieved using the multifunctional PBLUESCRIPT vector (Stratagene) or PSPORT1 plasmid (Life Technologies). Introduction of a nucleic acid sequence into the multiple cloning site of these vectors disrupts the lacZ gene and allows colorimetric screening for transformed bacteria. In addition, these vectors may be useful for in vitro transcription, dideoxy sequencing, single strand rescue with helper phage, and creation of nested deletions in the cloned sequence. [0059]
  • [0060] For long term production of recombinant M. catarrhalis proteins, the vector can be stably transformed into competent cells of E. coli along with a selectable or visible marker gene on the same or on a separate vector. After transformation, cells are allowed to grow in enriched media containing a selective agent. Selectable markers, antimetabolite, antibiotic, or herbicide resistance genes confer resistance to the respective selective agent and allow growth and recovery of cells which successfully express the introduced sequences. Resistant clones or colonies, identified either by survival on selective media or by the expression of visible markers, such as anthocyanins, green fluorescent protein (GFP), β glucuronidase, luciferase and the like, may be propagated using culture techniques well known in the art. Visible markers are also used to quantify the amount of protein expressed by the introduced genes. Verification that the host cell contains the desired M. catarrhalis nucleic acid molecule is based on DNA-DNA or DNA-RNA hybridizations or PCR amplification.
  • The host cell may be chosen for its ability to modify a recombinant protein in a desired fashion. Such modifications include acetylation, carboxylation, glycosylation, phosphorylation, lipidation, acylation, and the like. Post-translational processing sequences (“prepro” forms) may also be engineered into the recombinant nucleotide sequence in order to specify protein targeting, folding, and/or activity. Different host cells available from the ATCC (Manassas Va.) which have specific cellular machinery and characteristic mechanisms for post-translational activities may be chosen to ensure the correct modification and processing of the recombinant protein. [0061]
  • Recovery of Proteins from Cell Culture [0062]
  • Heterologous moieties engineered into a vector for ease of purification include glutathione S-transferase (GST), calmodulin binding peptide (CBP), 6×His, FLAG, MYC, and the like. GST, CBP, and 6×His are purified using commercially available affinity matrices such as immobilized glutathione, calmodulin, and metal-chelate resins, respectively. FLAG and MYC are purified using commercially available monoclonal and polyclonal antibodies. A proteolytic cleavage site may be located between the desired protein sequence and the heterologous moiety for ease of separating the desired protein following purification. Methods for recombinant protein expression and purification are discussed in Ausubel (supra, unit 16) and are commercially available (Invitrogen, San Diego Calif.). [0063]
  • Chemical Synthesis of Peptides [0064]
  • Proteins or portions thereof may be produced not only by recombinant methods, but also by using chemical methods well known in the art. Solid phase peptide synthesis may be carried out in a batchwise or continuous flow process which sequentially adds α-amino and side chain-protected amino acid residues to an insoluble polymeric support via a linker group. A linker group such as methylamine-derivatized polyethylene glycol is attached to poly(styrene-co-divinylbenzene) to form the support resin. The amino acid residues are N-α-protected by acid labile Boc (t-butyloxycarbonyl) or base-labile Fmoc (9-fluorenylmethoxycarbonyl). The carboxyl group of the protected amino acid is coupled to the amine of the linker group to anchor the residue to the solid phase support resin. Trifluoroacetic acid or piperidine are used to remove the protecting group in the case of Boc or Fmoc, respectively. Each additional amino acid is added to the anchored residue using a coupling agent or pre-activated amino acid derivative, and the resin is washed. The full length peptide is synthesized by sequential deprotection, coupling of derivitized amino acids, and washing with dichloromethane and/or N,N-dimethylformamide. The peptide is cleaved between the peptide carboxy terminus and the linker group to yield a peptide acid or amide. (Novabiochem 1997/98 Catalog and Peptide Synthesis Handbook, San Diego Calif., pp. S1-S20). Automated synthesis may also be carried out on machines such as the ABI 431A peptide synthesizer (PE Biosystems). A protein or portion thereof may be substantially purified by preparative high performance liquid chromatography and its composition confirmed by amino acid analysis or by sequencing (Creighton (1984) [0065] Proteins, Structures and Molecular Properties, W H Freeman, New York N.Y.).
  • Preparation and Screening of Antibodies [0066]
  • Various hosts including goats, rabbits, rats, mice, humans, and others may be immunized by injection with [0067] M. catarrhalis protein or any portion thereof. Adjuvants such as Freund's, mineral gels, and surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin (KLH), and dinitrophenol may be used to increase immunological response. The oligopeptide, peptide, or portion of protein used to induce antibodies should consist of about five to fifteen amino acids which are identical to a portion of the natural protein. Oligopeptides may be fused with proteins such as KLH in order to produce antibodies to the chimeric molecule.
  • Monoclonal antibodies may be prepared using any technique which provides for the production of antibodies by continuous cell lines in culture. These include, but are not limited to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma technique. (See, eg, Kohler et al. (1975) Nature 256:495-497; Kozbor et al. (1985) J Immunol Methods 81:31-42; Cote et al. (1983) Proc Natl Acad Sci 80:2026-2030; and Cole et al. (1984) Mol Cell Biol 62:109-120.) [0068]
  • Alternatively, techniques described for the production of single chain antibodies may be adapted, using methods known in the art, to produce epitope specific single chain antibodies. Antibody fragments which contain specific binding sites for epitopes of the [0069] M. catarrhalis protein may also be generated. For example, such fragments include, but are not limited to, F(ab′)2 fragments produced by pepsin digestion of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of the F(ab′)2 fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity (Huse et al. (1989) Science 246:1275-1281).
  • The [0070] M. catarrhalis protein may be used in screening assays of phagemid or B-lymphocyte immunoglobulin libraries to identify antibodies having the desired specificity. Numerous protocols for competitive binding or immunoassays using either polyclonal or monoclonal antibodies with established specificities are well known in the art. Such immunoassays typically involve the measurement of complex formation between the protein and its specific antibody. A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to two non-interfering epitopes is preferred, but a competitive binding assay may also be employed (Pound (1998) Immunochemical Protocols, Humana Press, Totowa N.J.).
  • Labeling of Molecules for Assay [0071]
  • A wide variety of labels and conjugation techniques are known by those skilled in the art and may be used in various nucleic acid molecule, protein, and antibody assays. Synthesis of labeled molecules may be achieved using Promega (Madison Wis.) or APB kits for incorporation of a labeled nucleotide such as [0072] 32P-dCTP, Cy3-dCTP or Cy5-dCTP (APB) or amino acid such as 35S-methionine (APB). Nucleotides and amino acids may be directly labeled with a variety of substances including fluorescent, chemiluminescent, or chromogenic agents and the like, by chemical conjugation to amines, thiols and other groups present in the molecules using reagents such as BODIPY or FITC (Molecular Probes, Eugene Oreg.).
  • Diagnostics [0073]
  • The nucleic acid molecules, fragments, oligonucleotides, complementary RNA and DNA molecules, and peptide nucleic acids (PNAs) may be used to detect and quantify differential gene expression, absence/presence vs. excess, of mRNAs or to monitor mRNA levels following drug treatment. Conditions, diseases or disorders associated with [0074] M. catarrhalis gene expression may include conditions and diseases such as allergies, asthma, bronchitis, chronic obstructive pulmonary disease, emphysema, endocarditis, hypereosinophilia, meningitis, otitis media, pneumonia, sinusitis, and various respiratory distress syndromes. The diagnostic assay may use hybridization or amplification technology to compare gene expression in a biological sample from a patient to expression in disease and control standards in order to detect differential gene expression. Qualitative or quantitative methods for this comparison are well known in the art.
  • For example, the nucleic acid molecule, fragment, or probe may be labeled by standard methods and added to a sample from a patient under conditions for the formation of hybridization complexes. After an incubation period, the sample is washed and the amount of label (or signal) associated with hybridization complexes is quantified and compared with a standard value. If the amount of label in the patient sample is significantly altered in comparison to the standard value, then the presence of elevated amounts of [0075] M. catarrhalis is responsible for the associated condition or disease.
  • In order to provide a basis for the diagnosis of a condition, disease or disorder associated with gene expression, a normal or standard expression profile is established. This may be accomplished by combining a biological sample taken from normal subjects, animal or more preferably human, with a probe under conditions for hybridization or amplification. Standard hybridization may be quantified by comparing the values obtained using normal subjects with values from an experiment in which a known amount of a substantially purified target sequence is used. Standard values obtained in this manner may be compared with values obtained from samples from patients who are symptomatic for a particular condition or diseases listed above. Deviation from standard values toward those associated with a particular diagnosed condition is used to diagnose the patient. [0076]
  • Such assays may also be used to evaluate the efficacy of a particular therapeutic treatment regimen in animal studies or in a clinical trial. Once efficacy is established, these assays may be used on a regular basis to determine if the therapy is effective in an individual patient. The results obtained from successive patient assays may be used over a period ranging from several days to months. [0077]
  • Immunological Methods [0078]
  • Detection and quantification of a protein using either specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques include enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays, and fluorescence activated cell sorting. A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to two non-interfering epitopes is preferred, but a competitive binding assay may be employed. (See, eg, Coligan et al. (1997) [0079] Current Protocols in Immunology, Wiley-Interscience, New York N.Y.; Pound, supra.)
  • Therapeutics [0080]
  • Chemical and structural similarity, in the context of sequences, signatures and motifs, antigenic epitopes and the like, generally exists between regions of homologous proteins. Comparisons of [0081] M. catarrhalis nucleic acid molecules and proteins with those of other M. catarrhalis strains, other bacteria and other organisms allow preselection of therapeutic agents that affect the pathogenic organism without harming the host. Such therapeutic agents are useful in treating conditions and diseases such as allergies, asthma, bronchitis, chronic obstructive pulmonary disease, emphysema, endocarditis, hypereosinophilia, meningitis, otitis media, pneumonia, sinusitis, and various respiratory distress syndromes caused by M. catarrhalis. In conditions associated with increased expression or activity of M. catarrhalis nucleic acid molecule or protein, it is desirable to decrease expression or protein activity.
  • In one embodiment, a ligand such as an antagonist, antibody, or inhibitor identified by screening a plurality of molecules with the [0082] M. catarrhalis protein is administered to the subject to decrease the activity of the M. catarrhalis or homologous protein as it is overexpressed during pathogenesis.
  • In another embodiment, a composition comprising the substantially purified ligand and a pharmaceutical carrier may be administered to a subject to decrease the activity of the [0083] M. catarrhalis or homologous protein as it is overexpressed during pathogenesis. In one aspect, an antibody which specifically binds the M. catarrhalis protein may be used as a targeting or delivery mechanism for bringing a pharmaceutical agent to cells or tissues which are affected by the overexpression of the M. catarrhalis protein.
  • Any of the ligands may be administered in combination with other therapeutic agents. Selection of the agents for use in combination therapy may be made by one of ordinary skill in the art according to conventional pharmaceutical principles. A combination of therapeutic agents may act synergistically to effect prevention or treatment of a particular condition at a lower dosage of each agent. [0084]
  • Modification of Gene Expression Using Nucleic Acids [0085]
  • Gene expression may be modified by designing complementary or antisense molecules (DNA, RNA, or PNA) to the 5′, 3′, or intronic regions of the [0086] M. catarrhalis nucleic acid molecule. Oligonucleotides designed with reference to the transcription initiation site are preferred. Similarly, inhibition can be achieved using triple helix base-pairing which inhibits the binding of polymerases, transcription factors, or regulatory molecules (Gee et al. In: Huber and Carr (1994) Molecular and Immunologic Approaches, Futura Publishing, Mt. Kisco N.Y., pp. 163-177). A complementary molecule may also be designed to block translation by preventing binding between ribosomes and mRNA. In one alternative, a library of cDNA molecules may be screened to identify those which specifically bind a regulatory, untranslated M. catarrhalis sequence. Delivery of this inhibitory nucleotide sequence using a vector designed to be transferred from transformed M. catarrhalis cells to infectious M. catarrhalis via genetic recombination is contemplated.
  • Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific cleavage of an [0087] M. catarrhalis RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme molecule to complementary target RNA followed by endonucleolytic cleavage at sites such as GUA, GUU, and GUC. Once such sites are identified, an oligonucleotide with the same sequence may be evaluated for secondary structural features which would render the oligonucleotide inoperable. The suitability of candidate targets may also be evaluated by testing their hybridization with complementary oligonucleotides using ribonuclease protection assays.
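  • As an illustration of the first step described above, locating candidate cleavage sites, the following minimal sketch scans an RNA sequence for the triplets GUA, GUU, and GUC. The function name and the example sequence are hypothetical, and the sketch does not evaluate secondary structure or hybridization, which the text treats as separate steps.

```python
def find_cleavage_sites(rna, triplets=("GUA", "GUU", "GUC")):
    """Return (0-based position, triplet) for each candidate cleavage triplet.

    The triplet set comes from the text; everything else here is an
    illustrative assumption. DNA-style input is converted to RNA first.
    """
    rna = rna.upper().replace("T", "U")
    sites = []
    for i in range(len(rna) - 2):
        triplet = rna[i:i + 3]
        if triplet in triplets:
            sites.append((i, triplet))
    return sites


# Hypothetical target sequence used only to exercise the function.
example_rna = "AUGGUACCGUUAGGUCAA"
candidate_sites = find_cleavage_sites(example_rna)
```

Each reported position could then be used to extract a flanking oligonucleotide for the structural evaluation described in the text.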
  • Complementary nucleic acids and ribozymes of the invention may be prepared via recombinant expression, in vitro or in vivo, or using solid phase phosphoramidite chemical synthesis. In addition, RNA molecules may be modified to increase intracellular stability and half-life by addition of flanking sequences at the 5′ and/or 3′ ends of the molecule or by the use of phosphorothioate or 2′ O-methyl rather than phosphodiester linkages within the backbone of the molecule. Modification is inherent in the production of PNAs and can be extended to other derivative nucleotide molecules. Either the inclusion of nontraditional bases such as inosine, queuosine, and wybutosine, and/or the modification of adenine, cytidine, guanine, thymine, and uridine with acetyl-, methyl-, thio-groups renders the molecule less available to endogenous bacterial endonucleases. [0088]
  • Screening Assays [0089]
  • The [0090] M. catarrhalis nucleic acid molecule may be used to screen a plurality or a library of molecules or compounds for specific binding affinity. The molecules or compounds may be selected from aptamers, DNA molecules, RNA molecules, PNAs, peptides, transcription factors, enhancers, repressors, regulatory proteins and other ligands which modulate the activity, replication, transcription, or translation of the nucleic acid molecules in the biological system. The assay involves combining the M. catarrhalis nucleic acid molecule or a fragment thereof with molecules or compounds under conditions to allow specific binding, and detecting specific binding to identify at least one ligand which specifically binds the M. catarrhalis nucleic acid molecule.
  • Similarly the [0091] M. catarrhalis protein or a portion thereof may be used to screen a plurality of libraries of molecules or compounds in any of a variety of screening assays. The molecules or compounds may be selected from aptamers, DNA molecules, RNA molecules, peptide nucleic acids, peptides, mimetics, proteins, agonists, antagonists, antibodies, inhibitors, immunoglobulins, pharmaceutical agents, drug compounds, and the like. The protein or portion thereof employed in such screening may be free in solution, affixed to an abiotic or biotic substrate (eg, borne on a cell surface), or located intracellularly. Specific binding between the protein and molecule may be measured. One method for high throughput screening using very small assay volumes and very small amounts of test compound is described in U.S. Pat. No. 5,876,946, incorporated herein by reference, which teaches how to screen large numbers of molecules for specific binding to a protein.
  • Purification of Ligand [0092]
  • The [0093] M. catarrhalis nucleic acid molecule or a fragment thereof may be used to purify a ligand from a sample. A method for using a M. catarrhalis nucleic acid molecule or a fragment thereof to purify a ligand would involve combining the nucleic acid molecule or a fragment thereof with a sample under conditions to allow specific binding, detecting specific binding, recovering the bound M. catarrhalis nucleic acid molecule, and using an appropriate agent to separate the M. catarrhalis nucleic acid molecule from the purified ligand.
  • Similarly, the protein or a portion thereof may be used to purify a ligand from a sample. A method for using a [0094] M. catarrhalis protein or a portion thereof to purify a ligand would involve combining the protein or a portion thereof with a sample under conditions to allow specific binding, detecting specific binding between the protein and ligand, recovering the bound protein, and using an appropriate chaotropic agent to separate the protein from the purified ligand.
  • Pharmacology [0095]
  • Pharmaceutical compositions are those substances wherein the active ingredients are contained in an effective amount to achieve a desired and intended purpose. The determination of an effective dose is well within the capability of those skilled in the art. For any compound, the therapeutically effective dose may be estimated initially either in cell culture assays or in animal models. The animal model is also used to achieve a desirable concentration range and route of administration. Such information may then be used to determine useful doses and routes for administration in humans. [0096]
  • A therapeutically effective dose refers to that amount of a pharmaceutical agent which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity of such agents may be determined by standard pharmaceutical procedures in cell cultures or experimental animals, eg, ED[0097] 50 (the dose therapeutically effective in 50% of the population) and LD50 (the dose lethal to 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index, and it may be expressed as the ratio, LD50/ED50. Pharmaceutical compositions which exhibit large therapeutic indexes are preferred. The data obtained from cell culture assays and animal studies are used in formulating a range of dosage for human use.
  • Rational Drug Design [0098]
  • The goal of rational drug design is to produce structural analogs of biologically active [0099] M. catarrhalis proteins of interest or of ligands with which they interact. Any of these examples can be used to fashion drugs which are more active or stable forms of the protein, or which enhance or interfere with the function of a protein in vivo (Hodgson (1991) Bio/Technology 9:19-21).
  • In one approach, the three-dimensional structure of an [0100] M. catarrhalis protein, or of an M. catarrhalis protein-inhibitor complex, is determined by X-ray crystallography, by computer modeling or, most typically, by a combination of the two approaches. Both the shape and charges of the protein must be ascertained to elucidate the structure and to determine active site(s). Less often, useful information regarding the structure of a protein may be gained by modeling based on the structure of homologous proteins. In both cases, relevant structural information is used to design analogous M. catarrhalis protein-like molecules or to identify efficient inhibitors.
  • Useful examples of rational drug design may include molecules which have improved activity or stability, as shown by Braxton et al. (1992, Biochem 31:7796-7801), or which act as inhibitors, agonists, or antagonists of M. catarrhalis peptides, as shown by Athauda et al. (1993, J Biochem 113:742-746). [0101]
  • It is also possible to isolate a target-specific antibody, selected by functional assay, as described above, and then to solve its crystal structure. This approach, in principle, yields a pharmacore upon which subsequent drug design can be based. It is possible to bypass protein crystallography altogether by generating anti-idiotypic antibodies (anti-ids) to a functional, pharmacologically-active antibody. As a mirror image of a mirror image, the binding site of the anti-id is an analog of the original receptor. The anti-id can be used to identify and isolate peptides from banks of chemically or biologically-produced peptides. The isolated peptides act as the pharmacore.[0102]
  • EXAMPLES Example 1 Shotgun Sequencing Strategy
  • The strategy for sequencing the [0103] M. catarrhalis genome was a modification of the shotgun approach to whole genome sequencing described by Lander and Waterman (1988 Genomics 2:231). They applied the equation for the Poisson distribution $P_x = m^x e^{-m}/x!$, where $x$ is the number of occurrences of an event and $m$ is the mean number of occurrences, to calculate the probability that any given base is not sequenced after a certain amount of random sequence has been generated. If $L$ is the genome length, $n$ is the number of clone insert ends sequenced, and $w$ is the sequencing read length, then $m = nw/L$, and the probability that no clone originates at any of the $w$ bases preceding a given base, ie, the probability that a base is not sequenced, is $P_0 = e^{-m}$. For sequencing where $P_0 > 0$, the total gap length is $Le^{-m}$, and the average gap size is $L/n$.
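  • The Poisson estimates above can be worked through numerically. The following is a minimal sketch that evaluates m = nw/L, the probability that a base is not sequenced, the total gap length, and the average gap size; the function name and the example inputs are illustrative assumptions and are not values taken from this disclosure.

```python
import math


def shotgun_stats(genome_len, n_reads, read_len):
    """Evaluate the shotgun coverage estimates described in the text.

    genome_len = L, n_reads = n, read_len = w (all in bases or counts).
    """
    m = n_reads * read_len / genome_len   # mean coverage, m = nw/L
    p0 = math.exp(-m)                     # probability a base is not sequenced
    return {
        "coverage": m,
        "p_unsequenced": p0,
        "total_gap_length": genome_len * p0,   # L * e^(-m)
        "average_gap_size": genome_len / n_reads,  # L / n
    }


# Hypothetical inputs: a 1 Mb genome, 10,000 reads of 500 bases each.
stats = shotgun_stats(1_000_000, 10_000, 500)
```

With these assumed inputs the coverage is 5×, so under the model less than 1% of bases remain unsequenced.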
  • The shotgun approach has recently been used to sequence the genomes of [0104] H. influenzae (Fleischmann et al. (1995) Science 269:496; WO 96/33276), Mycoplasma genitalium (Fraser et al. (1995) Science 270:397), and Methanococcus jannaschii (Bult et al. (1996) Science 273:1058). All of these microbes have relatively small genomes of 1.8, 0.6, and 1.8 megabases, respectively. The size of the M. catarrhalis genome is estimated to be 1.9 megabases.
  • Example 2 Construction of the Genomic Library
  • An [0105] M. catarrhalis genomic DNA library was constructed using DNA purified from the gram negative, aerobic diplococcus, M. catarrhalis, ATCC accession number 43617. The isolate was obtained from transtracheal aspirate of a coal miner with chronic bronchitis. The G+C content is 42%.
  • Using a syringe fitted with a 0.0025 in. Ruby orifice (Stanford University, Stanford Calif.), 50 μg of [0106] M. catarrhalis DNA was sheared into 1.5-2.9 kb fragments. The shearing process was monitored by electrophoresis of a subsample of sheared DNA on a 0.8% SEAKEM GTG agarose gel (FMC Bioproducts, Rockland Me.) in 1×TAE buffer at about 950 V-h. Comparison with a DNA ladder with known size fragments was used to verify the size and quality of the sheared DNA.
  • Sheared DNA was visualized with low wavelength UV and bands of 1.5 to 2.8 kb were removed from a preparative 0.8% SEAKEM GTG agarose gel (FMC Bioproducts). The 1.5-2.9 kb fragments were electrophoresed through a preparative 0.8% SEAPLAQUE GTG low melt agarose gel (FMC Bioproducts) in 1×TAE buffer at about 850 V-h. The DNA band was removed from the low melt agarose, placed in a microcentrifuge tube, and the agarose melted at 65 C for 10-15 minutes. After 5 minutes of heating, the melted agarose was diluted with a half volume of double distilled water, and the sample was equilibrated to 42 C. β-AGARASE (New England Biolabs (NEB), Beverly Mass.) and 10xβ-AGARASE (NEB) were added, and the preparation was incubated for 1-3 hours with addition of a half initial volume of β-AGARASE (NEB) after 1 hour and mixing by inversion every half hour. The DNA was extracted once with phenol:chloroform:isoamyl alcohol (25:24:1) followed by extraction with chloroform:isoamyl alcohol (24:1) and precipitated by addition of 1-3 μl glycogen, 1/10 volume 3M NaOAc, and 2.5 volumes cold 100% ethanol. The sample was stored overnight at −20 C. [0107]
  • The purified DNA strands were treated with BAL31 (NEB) at 1 U/20 μg DNA in a final volume of 50 μl at 30 C for 10 minutes to prepare blunt ends. Then the DNA was re-extracted as above (phenol:chloroform:isoamyl alcohol followed by chloroform:isoamyl alcohol). The DNA was reprecipitated as above and stored at −20 C until ligation into the vector. [0108]
  • The PBLUESCRIPT plasmid (Stratagene) was cut with SmaI endonuclease, and the ends of the strands dephosphorylated to prepare the BS.S2 vector. The purified [0109] M. catarrhalis DNA (2 μg) was ligated into the BS.S2 vector (1 μg) with T4 DNA ligase (Life Technologies) for 4 hours at 14 C. Following the ligation reaction, the ligated DNA was extracted and precipitated as above. The ligated vector:insert DNA was then size selected (vector+insert=4.4-5.7 kb) and purified by gel electrophoresis and extracted as described above.
  • Following gel purification, the ends of the vector:insert DNA were repaired using T4 DNA polymerase (NEB) for 5 minutes at 37 C, re-extracted and precipitated as above, and self-ligated into circles with T4 DNA ligase (Life Technologies). After 10 minutes, the ligation reaction was stopped by heating at 70 C for 10 minutes. [0110]
  • The circular plasmid was transformed into DH10B competent cells (Life Technologies) by electroporation at 1.8 volts. Transformed cells were selected by growth on X-Gal+isopropyl beta-D-thiogalactopyranoside (IPTG)+2× carbenicillin (carb) LB agar plates. [0111]
  • Example 3 Isolation of Clones and Sequencing
  • Plasmid DNA was released from the cells and purified using the REAL PREP 96 plasmid kit (QIAGEN, Chatsworth Calif.). This kit enabled simultaneous purification of 96 samples in a 96-well block using multi-channel reagent dispensers. The recommended protocol was employed except for the following changes: 1) the bacteria were cultured in 1 ml of sterile TERRIFIC BROTH (BD Biosciences, Sparks Md.) with carb at 25 mg/l and glycerol at 0.4%; 2) after inoculation and incubation for 19 hours, the cells were lysed with 0.3 ml of lysis buffer; and 3) following isopropanol precipitation, the plasmid DNA pellet was resuspended in 0.1 ml of distilled water. After this final step, samples were transferred to a 96-well block for storage at 4 C. [0112]
  • The DNA inserts were prepared for sequencing using a 96 well HYDRA microdispenser (Robbins Scientific) in combination with DNA ENGINE thermal cyclers (MJ Research). After thermal cycling, the A, C, G, and T reactions with each DNA template were combined. Then, 50 μl 100% ethanol was added, and the solution was spun at 4 C for 30 min at 4500 rpm in a centrifuge (Jouan, Winchester Va.). After the pellet was dried for 15 min under vacuum, the DNA sample was dissolved in 3 μl of formaldehyde/50 mM EDTA and loaded on wells in volumes of 1 μl per well for sequencing. Sequencing used the method of Sanger and Coulson (1975, J. Mol. Biol. 94:441f) and an ABI PRISM 377 sequencing system (PE Biosystems). After electrophoresis for four hours on 4% acrylamide gels on 36 cm plates at 2.3 kV, approximately 500-650 bps were determined per sequence. [0113]
  • Example 4 Sequence Processing and Contiguous Sequence Assembly
  • Sequences were generated from either shotgun sequencing or closure sequencing. Closure sequences were obtained by directed genomic walks or PCR of specific genomic regions. In the latter case, the PCR products were sequenced. [0114]
  • Sequences were edited in a two-step process. In the first step, vector sequences from both the 5′ and 3′ ends were clipped using the algorithm provided in U.S. Ser. No. 09/276,534 filed Mar. 25, 1999. In the second step, possible contaminating sequence was removed by reading each raw sequence and performing a cross-match search against a contamination database containing known vector sequences and DNA marker sequences. Sequences with cross-match scores of 18 or greater were removed. [0115]
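  • The second editing step, removal of reads whose cross-match score against the contamination database is 18 or greater, can be sketched as a simple filter. This is a minimal illustration only: the dictionaries, read identifiers, and function name are hypothetical, the scores are assumed to have been computed beforehand, and the vector-clipping algorithm of the first step is not reproduced here.

```python
CONTAMINATION_THRESHOLD = 18  # from the text: scores of 18 or greater are removed


def filter_contaminated(reads, scores, threshold=CONTAMINATION_THRESHOLD):
    """Keep only reads whose best contamination score is below the threshold.

    reads:  dict mapping read id -> sequence (hypothetical structure)
    scores: dict mapping read id -> best cross-match score; a read with no
            entry is assumed to have had no hit in the contamination database
    """
    return {
        read_id: seq
        for read_id, seq in reads.items()
        if scores.get(read_id, 0) < threshold
    }


# Hypothetical reads and scores used only to exercise the filter.
raw_reads = {"read1": "ACGTACGT", "read2": "GGGTTTCC", "read3": "TTAACCGG"}
best_scores = {"read1": 12, "read2": 18, "read3": 40}
clean_reads = filter_contaminated(raw_reads, best_scores)
```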
  • Contigs were assembled using PHRAP (Green, supra) which aligns multiple, overlapping DNA sequences to form a contiguous consensus sequence. Alignments were influenced by quality scores assigned to each base in a sequence. A single sequence cannot belong to more than one contig. [0116]
  • The 41 contigs presented in Table 1 and the Sequence Listing were assembled from 47385 individual sequences. The contigs represent approximately 13.3× coverage or 100.7% of the [0117] M. catarrhalis genome.
  • Example 5 Gene Finding
  • ORF identification was carried out through combination of BLAST (Karlin, supra) and FASTA searches. These serial searches compared the consensus sequences of the assembled contigs, presented in Table 1, against sequences in public-domain databases. The searches identified similarity matches, or “hits”, that indicated an ORF within the sequence. [0118]
  • The consensus sequences of the contigs were analyzed against the GenBank peptide (GenPept) database. The ORF identification process assigned ORFs to loci on a contig. If a match was found at a P-value less than or equal to 1e-6, the corresponding locus on the contig was designated as an ORF. This portion of the contig was masked by Ns, and the consensus sequence underwent a second BLASTX or FASTX search against the GenPept database. Again, the match with the lowest P-value (less than or equal to 1e-6) was used to identify a second ORF. The corresponding sequences were masked, and the process continued until all BLASTX and FASTX matches with P-values less than or equal to 1e-6 had been identified for a given contig. Then, the contigs were run through GeneMark, an algorithm for identifying putative ORFs. The GeneMark algorithm is described and developed in the following references: Borodovsky and McIninch (1993) Computers & Chemistry 17:123; Blattner et al. (1993) Nucl Acid Res 21:5408; and Borodovsky et al. (1994) Trends Biochem Sci 19:309. After all possible homology and algorithm-based ORFs were identified, a process called ORF selection was applied. In this process GeneMark ORFs that overlapped homology-based ORFs were rejected, and homology-based ORFs were retained. GeneMark ORFs that did not overlap homology-based ORFs and those that overlapped other GeneMark ORFs were retained. Finally, all ORFs were annotated by performing BLAST2 comparisons against GenPept and taking annotation from the best hit with P-value less than or equal to 1e-6. [0119]
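  • The ORF selection rule described above can be sketched with simple coordinate intervals. In this minimal illustration, each ORF is a hypothetical (start, end) pair on a single contig, strand and reading frame are ignored, and any shared base counts as an overlap; these simplifications are assumptions and are not stated in the text.

```python
def overlaps(a, b):
    """True if two inclusive (start, end) intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def select_orfs(homology_orfs, genemark_orfs):
    """Apply the selection rule: homology-based ORFs are always retained;
    a GeneMark ORF is rejected only if it overlaps a homology-based ORF.
    GeneMark ORFs that overlap only each other are retained."""
    kept_genemark = [
        orf for orf in genemark_orfs
        if not any(overlaps(orf, h) for h in homology_orfs)
    ]
    return sorted(list(homology_orfs) + kept_genemark)


# Hypothetical coordinates used only to exercise the rule.
homology = [(100, 400), (900, 1500)]
genemark = [(350, 700), (450, 800), (1600, 2000)]
selected = select_orfs(homology, genemark)
```

In this example the GeneMark ORF at 350-700 is rejected because it overlaps the homology-based ORF at 100-400, while the other two GeneMark ORFs are retained.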
  • Contigs with high probability for ORFs, but no identified ORFs, were identified as “orphan” contigs (Table 1). Unannotated regions of contigs exceeding 500 bases in length were identified as “Long-Unannotated Regions” (LURs) and contain novel ORFs. The designations, orphan and LUR, were based on comparative analyses of the lengths of ORFs and unannotated regions. [0120]
  • A total of 1258 ORFs were identified by homology searches of the GenPept database with an additional 253 ORFs identified using the GeneMark algorithm. [0121]
  • Example 6 Gene Clustering
  • In the final step of analysis, a gene clustering protocol is used to determine related ORFs within and across genomes. Gene clustering is carried out through BLAST2 pairwise comparisons of each ORF in the PATHOSEQ database (Incyte Genomics, Palo Alto Calif.) against every other ORF in the database. If two ORFs matched each other at a P-value less than or equal to 1e-15, they were placed in the same cluster. If a third ORF matched either of the first two ORFs at a P-value of less than or equal to 1e-15, the third ORF joined the cluster. Thus, clusters were formed so that any ORF in a cluster must match at least one other ORF in the cluster at less than or equal to the threshold P-value of 1e-15. The representative ORF for a cluster is the one with the best matched annotation. [0122]
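  • The clustering rule above, under which an ORF joins a cluster if it matches at least one member at or below the threshold P-value, amounts to single-linkage clustering. The following minimal sketch implements it with a union-find structure; the ORF identifiers, the hit tuples, and the function name are hypothetical, and the pairwise P-values are assumed to have been computed beforehand.

```python
def cluster_orfs(orf_ids, hits, threshold=1e-15):
    """Group ORFs so that each member matches at least one other member
    of its cluster at a P-value <= threshold (single linkage).

    hits: iterable of (orf_a, orf_b, p_value) pairwise comparison results.
    """
    parent = {orf: orf for orf in orf_ids}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, p_value in hits:
        if p_value <= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for orf in orf_ids:
        clusters.setdefault(find(orf), set()).add(orf)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical ORFs and pairwise P-values used only to exercise the rule.
orfs = ["A", "B", "C", "D", "E"]
pairwise = [("A", "B", 1e-20), ("B", "C", 1e-15), ("C", "D", 1e-10), ("D", "E", 1e-30)]
clusters = cluster_orfs(orfs, pairwise)
```

Here C joins the cluster of A and B through its match to B alone, while the C-D match at 1e-10 is too weak to merge the two clusters.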
  • Example 7 Ordering of Contiguous Sequences
  • The ordering of contigs has been accomplished through three types of analyses: 1) 5′/3′ sequence pair information, 2) annotation information, and 3) BLAST2 analysis of the ends of contigs. Contig ordering based on 5′/3′ sequence pairs was done by identifying all 5′/3′ sequence pairs (5′ and 3′ sequences with the same Sequence ID) that were not in the same contig, but span a gap between two contigs with the estimated distance between them of about 1.5-3.0 kb (the insert size of the library). Annotation information was used to determine contig order in two ways, either by identifying genes spanning contig gaps or by comparison with genes at the ends of contigs in related organisms with similar gene order. [0123]
  • Genes spanning gaps were identified by observing the N-terminal portion of an ORF at the end of one contig and the C-terminal portion of an ORF at the end of another contig. Two partial ORFs are considered to be portions of the same ORF when they meet this criterion and annotate to the same top five GenPept database entries. Comparison of two related organisms with similar gene order is used to predict contig ordering when one organism contains continuous gene order information over a region that spans a gap in the second organism. BLAST analysis of the ends of contigs was used to identify those contigs which overlapped, but failed to join because the sequence overlap did not meet the length or quality score required by PHRAP (Green, supra). Table 2 shows the ordering of the M. catarrhalis contigs as supported by one or more of these analyses. [0124]
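  • The first type of analysis, contig ordering from 5′/3′ sequence pairs, can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure actually used. The link_contigs function, the layout of the placements mapping, and the clone and contig names are hypothetical, and the sketch only counts pairs that bridge two contigs without checking the 1.5-3.0 kb insert-size constraint.

```python
from collections import Counter
from typing import Dict, Optional, Tuple


def link_contigs(
    placements: Dict[str, Dict[str, Optional[str]]],
    min_pairs: int = 1,
) -> Dict[Tuple[str, str], int]:
    """Count 5'/3' sequence pairs (same Sequence ID) whose two reads were
    assembled into different contigs. Each such pair suggests that the two
    contigs are neighbours separated by a gap."""
    links: Counter = Counter()
    for seq_id, ends in placements.items():
        contig_5 = ends.get("5")
        contig_3 = ends.get("3")
        # Skip pairs with an unplaced read or with both reads in one contig.
        if contig_5 is None or contig_3 is None or contig_5 == contig_3:
            continue
        links[tuple(sorted((contig_5, contig_3)))] += 1
    return {pair: n for pair, n in links.items() if n >= min_pairs}


# Hypothetical read placements keyed by Sequence ID.
placements = {
    "clone001": {"5": "contig5", "3": "contig6"},
    "clone002": {"5": "contig6", "3": "contig5"},
    "clone003": {"5": "contig5", "3": "contig5"},  # same contig: ignored
    "clone004": {"5": "contig7"},                  # 3' read unplaced: ignored
}
links = link_contigs(placements)
# links == {("contig5", "contig6"): 2}
```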
  • Example 8 Extension of Partial ORFs to Full Length
  • Using the DNA sequences disclosed herein, an ORF is extended using a modified XL-PCR (PE Biosystems) procedure. Oligonucleotide primers, one to initiate 5′ extension and the other to initiate 3′ extension, were designed using the nucleotide sequence of the known fragment and OLIGO 4.06 software (National Biosciences). The initial primers were about 22 to 30 nucleotides in length, had a GC content of about 42%, and annealed to the target sequence at temperatures of about 55° C to about 68° C. Any fragment which would result in hairpin structures and primer-primer dimerizations was avoided. The genomic DNA library was used to extend the molecule. If more than one extension was needed, additional or nested sets of primers were designed. [0125]
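  • The length and GC-content criteria for the initial primers can be checked with a few lines of Python. The sketch below is not a substitute for the OLIGO 4.06 software: it does not estimate annealing temperature or test for hairpins or primer-primer dimers. The function names, the example primer, and the tolerance of 8 percentage points around the 42% target are assumptions made for illustration.

```python
def gc_content(primer: str) -> float:
    """Return the percentage of G and C bases in a primer sequence."""
    seq = primer.upper()
    if not seq:
        raise ValueError("primer sequence is empty")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def passes_basic_screen(
    primer: str,
    min_len: int = 22,
    max_len: int = 30,
    target_gc: float = 42.0,
    gc_tolerance: float = 8.0,  # assumed reading of "about 42%"
) -> bool:
    """Check only primer length and GC content."""
    in_length_range = min_len <= len(primer) <= max_len
    near_target_gc = abs(gc_content(primer) - target_gc) <= gc_tolerance
    return in_length_range and near_target_gc


# Hypothetical 24-nucleotide primer with 11 G/C bases (about 45.8% GC).
candidate = "ATGCTAGCTAGGATCCATTGACCA"
accepted = passes_basic_screen(candidate)  # True under the assumed tolerance
```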
  • High fidelity amplification was obtained by performing PCR in 96-well plates using the DNA ENGINE thermal cycler (MJ Research). The reaction mix contained DNA template, 200 nmol of each primer, reaction buffer containing Mg2+, (NH4)2SO4, and β-mercaptoethanol, Taq DNA polymerase (APB), ELONGASE enzyme (Life Technologies), and Pfu DNA polymerase (Stratagene), with the following parameters for primer pair selected from the plasmid: Step 1: 94° C, 3 min; Step 2: 94° C, 15 sec; Step 3: 60° C, 1 min; Step 4: 68° C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68° C, 5 min; Step 7: storage at 4° C. In the alternative, parameters for the primer pair, T7 and SK+ (Stratagene), were as follows: Step 1: 94° C, 3 min; Step 2: 94° C, 15 sec; Step 3: 57° C, 1 min; Step 4: 68° C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68° C, 5 min; Step 7: storage at 4° C. [0126]
  • The concentration of DNA in each well was determined by dispensing 100 μl PICOGREEN quantitation reagent (0.25% v/v; Molecular Probes) dissolved in 1×TE and 0.5 μl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Costar, Acton Mass.) and allowing the DNA to bind to the reagent. The plate was scanned in a Fluoroskan II (Labsystems Oy, Helsinki, Finland) to measure the fluorescence of the sample and to quantify the concentration of DNA. A 5 μl to 10 μl aliquot of the reaction mixture was analyzed by electrophoresis on a 1% agarose mini-gel to determine which reactions were successful in producing longer sequence. [0127]
  • The extended sequences were desalted, concentrated, transferred to 384-well plates, digested with CviJI chlorella virus endonuclease (Molecular Biology Research, Madison Wis.), and sonicated or sheared prior to religation into pUC18 vector (APB). For shotgun sequencing, the digested fragments were separated on about 0.6-0.8% agarose gels, fragments were excised as visualized under UV light, and agarose removed/digested with AGARACE enzyme (Promega). Extended fragments were religated using T4 DNA ligase (NEB) into pUC18 vector (APB), treated with Pfu DNA polymerase (Stratagene) to fill-in restriction site overhangs, and transformed into competent E. coli cells. Transformed cells were selected on antibiotic-containing media, and individual colonies were picked and cultured overnight at 37° C in 384-well plates in LB/2×carb liquid media. [0128]
  • The cells were lysed, and DNA was amplified using Taq DNA polymerase (APB) and Pfu DNA polymerase (Stratagene) with the following parameters: Step 1: 94° C, 3 min; Step 2: 94° C, 15 sec; Step 3: 60° C, 1 min; Step 4: 72° C, 2 min; Step 5: steps 2, 3, and 4 repeated 29 times; Step 6: 72° C, 5 min; Step 7: storage at 4° C. DNA was quantified by PICOGREEN reagent (Molecular Probes) as described above. Samples with low DNA recoveries were reamplified using the conditions described above. Samples were diluted with 20% dimethylsulphoxide (1:2, v/v), and sequenced using DYENAMIC energy transfer sequencing primers and the DYENAMIC DIRECT kit (APB) or the ABI PRISM BIGDYE terminator kit (PE Biosystems). [0129]
  • Example 9 Labeling of Probes and Hybridization Analyses
  • Substrate Preparation [0130]
  • Nucleic acids are isolated from a biological source and applied to a substrate for standard hybridization protocols by one of the following methods. A mixture of nucleic acids, a restriction digest of genomic DNA, is fractionated by electrophoresis through a 0.7% agarose gel in 1×TAE running buffer and transferred to a nylon membrane by capillary transfer using 20× saline sodium citrate (SSC). Alternatively, the nucleic acids are individually ligated to a vector and inserted into bacterial host cells to form a library. Nucleic acids are arranged on a substrate by one of the following methods. In the first method, bacterial cells containing individual clones are robotically picked and arranged on a nylon membrane. The membrane is placed on bacterial growth medium, LB agar containing carb, and incubated at 37° C for 16 hours. Bacterial colonies are denatured, neutralized, and digested with proteinase K. Nylon membranes are exposed to UV irradiation in a STRATALINKER UV-crosslinker (Stratagene) to cross-link DNA to the membrane. [0131]
  • In the second method, nucleic acids are amplified from bacterial vectors by thirty cycles of PCR using primers complementary to vector sequences flanking the insert. Amplified nucleic acids are purified using SEPHACRYL-400 beads (APB). Purified nucleic acids are robotically arrayed onto a glass microscope slide (Corning Science Products, Corning N.Y.). The slide is previously coated with 0.05% aminopropyl silane (Sigma-Aldrich, St. Louis Mo.) and cured at 110° C. The arrayed glass slide (microarray) is exposed to UV irradiation in a STRATALINKER UV-crosslinker (Stratagene). [0132]
  • Probe Preparation [0133]
  • DNA probes are made from mRNA templates. Five micrograms of mRNA is mixed with 1 μg random primer (Life Technologies), incubated at 70° C for 10 minutes, and lyophilized. The lyophilized sample is resuspended in 50 μl of 1× first strand buffer (cDNA Synthesis systems; Life Technologies) containing a dNTP mix, [α-32P]dCTP, dithiothreitol, and MMLV reverse transcriptase (Stratagene), and incubated at 42° C for 1-2 hours. After incubation, the probe is diluted with 42 μl dH2O, heated to 95° C for 3 minutes, and cooled on ice. mRNA in the probe is removed by alkaline degradation. The probe is neutralized, and degraded mRNA and unincorporated nucleotides are removed using a PROBEQUANT G-50 column (APB). Alternatively, probes are labeled with fluorescent markers, Cy3-dCTP or Cy5-dCTP (APB), in place of the radionucleotide, [32P]dCTP. [0134]
  • Hybridization [0135]
  • Hybridization is carried out at 65° C in a hybridization buffer containing 0.5 M sodium phosphate (pH 7.2), 7% SDS, and 1 mM EDTA. After the substrate is incubated in hybridization buffer at 65° C for at least 2 hours, the buffer is replaced with 10 ml of fresh buffer containing the probes. After incubation at 65° C for 18 hours, the hybridization buffer is removed, and the substrate is washed sequentially under increasingly stringent conditions, up to 40 mM sodium phosphate, 1% SDS, 1 mM EDTA at 65° C. To detect signal produced by a radiolabeled probe hybridized on a membrane, the substrate is exposed to a PHOSPHORIMAGER cassette (APB), and the image is analyzed using IMAGEQUANT data analysis software (APB). To detect signals produced by a fluorescent probe hybridized on a microarray, the substrate is examined by confocal laser microscopy, and images are collected and analyzed using GEMTOOLS gene expression analysis software (Incyte Genomics). [0136]
  • Example 10 Complementary Nucleic Acid Molecules
  • Molecules complementary to the nucleic acid molecule, or a fragment thereof, are used to detect, decrease, or inhibit gene expression. Although use of oligonucleotides comprising from about 15 to about 30 base pairs is described, the same procedure is used with larger or smaller fragments or derivatives such as peptide nucleic acids (PNAs). Oligonucleotides are designed using OLIGO 4.06 software (National Biosciences) and a nucleic acid molecule of the Sequence Listing or fragment thereof. To inhibit transcription by preventing promoter binding, a complementary oligonucleotide is designed to bind to sequence 5′ of the ORF, most preferably about 10 nucleotides before the initiation codon of the ORF. To inhibit translation, a complementary oligonucleotide is designed to prevent ribosomal binding to the mRNA encoding the M. catarrhalis protein. [0137]
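  • Deriving a complementary oligonucleotide for a window located 5′ of the initiation codon amounts to taking a reverse complement. The Python sketch below illustrates this under stated assumptions. It does not reproduce the OLIGO 4.06 software; the function names, the default window (starting 10 nucleotides before the initiation codon, 20 nucleotides long), and the example sequence are hypothetical.

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def complementary_oligo(
    sequence: str,
    start_codon_index: int,
    upstream: int = 10,
    length: int = 20,
) -> str:
    """Return an oligonucleotide complementary to a window of `length`
    bases that begins `upstream` bases 5' of the initiation codon.
    `start_codon_index` is the 0-based position of the codon's first base."""
    begin = start_codon_index - upstream
    if begin < 0 or begin + length > len(sequence):
        raise ValueError("window falls outside the supplied sequence")
    return reverse_complement(sequence[begin:begin + length])


# Hypothetical sequence: 10 A, 10 C, then an ATG initiation codon at index 20.
target = "AAAAAAAAAACCCCCCCCCCATGAAA"
oligo = complementary_oligo(target, start_codon_index=20, upstream=10, length=15)
# oligo == "TTCATGGGGGGGGGG"
```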
  • Example 11 Expression of an M. catarrhalis Protein
  • An M. catarrhalis nucleic acid molecule is subcloned into a vector containing an antibiotic resistance gene and the inducible T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory element. Recombinant vectors are transformed into BL21(DE3) competent cells (Stratagene). Antibiotic resistant bacteria express the bacterial protein upon induction with IPTG. [0138]
  • The protein is synthesized as a fusion protein with FLAG, which permits affinity-based purification of the recombinant fusion protein from crude cell lysates. Kits for immunoaffinity purification using monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak, Rochester N.Y.) are commercially available. Following purification, the heterologous moiety is proteolytically cleaved from the bacterial protein at specifically engineered sites. Purified protein is used directly in the production of antibodies or in activity assays. [0139]
  • Example 12 Production of M. catarrhalis Protein Specific Antibodies
  • An M. catarrhalis protein produced as described above or an oligopeptide designed and synthesized using an ABI 431A peptide synthesizer (PE Biosystems) is used to produce an antibody. Animals are immunized with the protein or an oligopeptide-KLH complex in complete Freund's adjuvant. Immunizations are repeated at intervals thereafter in incomplete Freund's adjuvant. After a minimum of seven weeks for mouse or twelve weeks for rabbit, antisera are drawn and tested for antipeptide activity. Testing involves binding the peptide to plastic, blocking with 1% bovine serum albumin, reacting with rabbit antisera, washing, and reacting with radio-iodinated goat anti-rabbit IgG. Methods and machinery well known in the art are used to determine antibody titer and the amount of complex formation. [0140]
  • Example 13 Screening or Purifying Molecules Using Specific Binding
  • The nucleic acid molecule, or fragments thereof, or the protein, or portions thereof, are labeled with 32P-dCTP, Cy3-dCTP, Cy5-dCTP (APB), or BODIPY or FITC (Molecular Probes), respectively. Libraries of candidate molecules previously arranged on a substrate are incubated in the presence of labeled nucleic acid molecule or protein. After incubation under conditions for either a nucleic acid or amino acid sequence, the substrate is washed, and any position on the substrate retaining label, which indicates specific binding or complex formation, is assayed, and the binding molecule is identified. Data obtained using different concentrations of the nucleic acid or protein are used to calculate affinity between the labeled nucleic acid or protein and the bound molecule. [0141]
  • Example 14 Identification of M. catarrhalis Genes Induced During Infection
  • In vivo expression technology (IVET) is used with the sequences, or ORFs, to identify M. catarrhalis genes specifically induced during infection or under pathogenic conditions (Mahan et al. (1993) Science 259:686). A library of random genomic fragments of M. catarrhalis is made and ligated to a gene for a selectable marker required for survival in the host animal. Only those M. catarrhalis cells harboring a fusion sequence containing an active promoter will survive passage through the host. Fusions bearing promoters with constitutive activity are identified and discarded by examining reporter activity on laboratory medium passaged M. catarrhalis bacteria. By harvesting M. catarrhalis cells from infection sites in the host and subtracting the identified constitutively activated genes, a list of genes turned on during infection or under pathogenic conditions is compiled. [0142]
  • Host-induced M. catarrhalis genes are identified using the M. catarrhalis sequences and ORFs disclosed herein and the method of differential fluorescence induction described by Valdivia and Falkow (1996; Mol Microbiol 22:367). [0143]
  • Example 15 Identification of M. catarrhalis Genes Required for Survival in Host
  • Using the M. catarrhalis genomic sequences and ORFs, genes required for survival in a host are determined using the signature-tagged transposon method described by Hensel et al. (1995; Science 269:400). A library of M. catarrhalis mutants is marked with a unique oligonucleotide sequence for each disrupted gene. After passage of the library through an infected animal or other selective environment, putative survival genes are identified by absence of the mutant from the passaged library. [0144]
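  • The final comparison in the signature-tagged approach, identifying mutants present in the starting library but absent after passage, reduces to a set difference. The Python sketch below illustrates only that bookkeeping step. The function name, the tag identifiers, and the gene names are hypothetical, and the sketch assumes that tag detection in the passaged library has already been performed.

```python
from typing import Dict, Iterable, List


def candidate_survival_genes(
    tag_to_gene: Dict[str, str],
    recovered_tags: Iterable[str],
) -> List[str]:
    """Return the disrupted genes whose signature tags were present in the
    input library but were not recovered after passage."""
    recovered = set(recovered_tags)
    missing = {gene for tag, gene in tag_to_gene.items() if tag not in recovered}
    return sorted(missing)


# Hypothetical library of three tagged mutants; tag "t2" is lost on passage.
library = {"t1": "geneA", "t2": "geneB", "t3": "geneC"}
candidates = candidate_survival_genes(library, recovered_tags=["t1", "t3"])
# candidates == ["geneB"]
```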
  • Various modifications of the described method and system of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the above-described modes for carrying out the invention which are obvious to those skilled in the field of molecular biology or related fields are intended to be within the scope of the following claims. [0145]
    TABLE 1
    Contig Size Start End Locus ID Identifier P-value Description
    1 429 4 264 MCA101123 g2634865 5.00E−18 methylenetetrahydrofolate
    dehydrogenase
    5 4258 4030 4257 MCA100094 g145409 4.00E−17 bacterioferritin
    5 4258 1264 2612 MCA100203 g3402236 e−127 L-serine dehydratase
    5 4258 3523 3978 MCA100205 g1673579 2.00E−51 bacterioferritin
    5 4258 2 343 MCA101132 g1001512 3.00E−24 methylenetetrahydrofolate
    dehydrogenase
    6 5009 41 1448 MCA100317 g1519052 e−134 succinyl CoA: 3-oxoacid
    CoA transferase
    precursor
    6 5009 1777 4587 MCA100318 g1574147 0 transferrin-binding
    protein, putative
    6 5009 4729 5007 MCA101039 g1786625 6.00E−13 putative
    oxidoreductase
    7 6703 2960 3466 MCA100395 g3861150 6.00E−23 probable 50S ribosomal
    protein L25 (rply)
    7 6703 965 2437 MCA100550 g2465556 e−155 OpuE
    7 6703 3687 4250 MCA100554 g1573366 6.00E−44 peptidyl-tRNA
    hydrolase (pth)
    7 6703 4491 5846 MCA100555 g1220106 e−120 hemN
    7 6703 351 563 MCA101455 g2731760 1.00E−13 30S subunit ribosomal
    protein S21
    8 7424 2423 3103 MCA100638 g286176 4.00E−33 negative regulator of
    pyocin genes
    8 7424 5081 6058 MCA101449 g48773 3.00E−97 methyltransferase
    8 7424 3218 4327 MCA101610
    8 7424 4320 5060 MCA101612
    8 7424 6504 6665 MCA101982
    8 7424 6662 6928 MCA101983
    8 7424 6925 7320 MCA101984 g1742219 1.00E−08 Exodeoxyribonuclease
    VIII (EC 3.1.11.—)
    (Exo VIII).
    9 10709 465 1976 MCA100745 g347071 e−141 4-hydroxybutyrate
    coenzyme A transferase
    9 10709 2306 3046 MCA100746 g3063885 5.00E−30 putative acyl-coA
    dehydrogenase
    9 10709 4192 5478 MCA100748 g1923241 4.00E−69 site-specific
    recombinase
    9 10709 5983 7809 MCA100749 g216913 0 principal sigma
    factor, rpoDA
    9 10709 8288 8701 MCA100750
    9 10709 8698 9393 MCA100751 g1574756 3.00E−12 conserved hypothetical
    transmembrane protein
    9 10709 3 200 MCA101334 g154276 3.00E−22 peptide chain release
    factor 2
    9 10709 9866 10330 MCA101713 g3025510 2.00E−33 putative
    transglycosylase
    10 19988 12800 12973 MCA100043 g2281030 1.00E−22 ZfiA protein
    10 19988 13066 13413 MCA100044
    10 19988 966 2060 MCA100336 g4062697 e−121 Hypothetical protein
    in purB 5′region (orf-
    15)
    10 19988 2141 3409 MCA100338 g2633742 4.00E−18 similar to
    hypothetical proteins
    from B. subtilis
    10 19988 15744 16295 MCA100456 g1805560 3.00E−36 phosphoribosylglycinamide
    formyltransferase
    (EC 2.1.2.2)
    10 19988 16331 17356 MCA100457 g1788845 e−130 phosphoribosylaminoimidazole
    synthetase = AIR
    synthetase
    10 19988 17685 18677 MCA100458 g3861171 2.00E−27 putative permease
    homolog (perM)
    10 19988 18921 19685 MCA100459 g3212215 2.00E−11 conserved hypothetical
    protein
    10 19988 5532 8192 MCA100516 g1800083 0 Alanyl-tRNA Synthetase
    (EC 6.1.1.7)
    10 19988 8821 10335 MCA100518 g2632668 3.00E−69 similar to di-
    tripeptide ABC
    transporter
    10 19988 3517 4892 MCA100711 g1573637 e−171 adenylosuccinate lyase
    (purB)
    10 19988 11303 12571 MCA100888 g2983613 e−106 aspartokinase
    10 19988 13673 13906 MCA101216 g1573976 4.00E−31 ribosomal protein L28
    (rpL28)
    10 19988 13949 14101 MCA101228 g1790067 7.00E−18 50S ribosomal subunit
    protein L33
    10 19988 14201 14950 MCA101234 g3342798 1.00E−29 glutamine
    cyclotransferase
    precursor
    10 19988 8330 8503 MCA101481
    10 19988 334 801 MCA101636 g1789103 9.00E−38 orf, hypothetical
    protein
    11 14335 4618 5967 MCA100986 g1572963 e−155 conserved hypothetical
    protein
    11 14335 7881 8108 MCA100989
    11 14335 8089 8514 MCA100990
    11 14335 8504 9154 MCA100991 g455332 2.00E−07 pilus expression
    protein
    11 14335 9281 10588 MCA100992 g459551 1.00E−73 fimbrial assembly
    protein
    11 14335 10856 11347 MCA100993 g1573166 3.00E−44 shikimic acid kinase I
    (aroK)
    11 14335 11422 12447 MCA100994 g2661441 6.00E−88 3-dehydroquinate
    synthetase
    11 14335 12538 13482 MCA100995
    11 14335 13503 14108 MCA100996 g2950411 5.00E−26 hypothetical protein
    Rv3588c
    11 14335 1110 2087 MCA101460 g4235484 e−142 malate dehydrogenase
    11 14335 2383 3599 MCA101547 g1790853 2.00E−25 soluble lytic murein
    transglycosylase
    11 14335 7292 7798 MCA101551 g455330 4.00E−15 membrane protein
    11 14335 14167 14335 MCA101992
    12 21410 15 647 MCA100476 g2462048 9.00E−50 monofunctional
    peptidoglycan
    transglycosylase
    12 21410 993 3011 MCA100477 g2462047 0 polyphosphate kinase
    12 21410 3051 3521 MCA100478 g1573243 1.00E−34 conserved hypothetical
    protein
    12 21410 3641 4690 MCA100479 g1573154 e−142 chorismate synthase
    (aroC)
    12 21410 5549 6016 MCA100481 g1786848 6.00E−38 protein of lipoate
    biosynthesis
    12 21410 6421 7621 MCA100938 g1787162 9.00E−88 nicotinate
    phosphoribosyltransferase
    12 21410 8297 9625 MCA100940 g1573601 e−123 conserved hypothetical
    protein
    12 21410 9759 10676 MCA100941 g149244 3.00E−59 Lys R member
    12 21410 10956 12413 MCA100942 g4456996 5.00E−90 permease for AmpC
    beta-lactamase
    expression AmpG
    12 21410 12579 13343 MCA100943 g1651602 3.00E−41 Protoporphyrinogen
    oxidase (EC 1.3.3.4)
    hemK
    12 21410 13406 14134 MCA100944 g1787048 1.00E−40 molybdopterin
    biosynthesis
    12 21410 14383 15528 MCA100945 g3261724 2.00E−42 hypothetical protein
    Rv0647c
    12 21410 17885 18445 MCA100947 g41336 9.00E−49 enterohemolysin 1
    12 21410 4870 5397 MCA101603 g1573079 2.00E−71 inorganic
    pyrophosphatase (ppa)
    13 31940 29883 30041 MCA100005 g3282800 2.00E−09 50S ribosomal protein
    L32
    13 31940 17948 18358 MCA100019 g42833 2.00E−46 ribosomal protein L16
    (rplP) (aa 1-136)
    13 31940 20208 20510 MCA100105 g1789703 3.00E−29 30S ribosomal subunit
    protein S14
    13 31940 22493 22663 MCA100139 g498362 1.00E−16 ribosomal protein L30
    13 31940 22675 23106 MCA100140 g1573807 8.00E−37 ribosomal protein L15
    (rpL15)
    13 31940 23182 24408 MCA100141 g606234 e−111 secY
    13 31940 18936 19301 MCA100153 g606244 1.00E−53 50S ribosomal subunit
    protein L14
    13 31940 19325 19627 MCA100154 g1573799 3.00E−24 ribosomal protein L24
    (rpL24)
    13 31940 19660 20193 MCA100155 g1573800 2.00E−71 ribosomal protein L5
    (rpL5)
    13 31940 20528 20923 MCA100157 g1573802 1.00E−41 ribosomal protein S8
    (rpS8)
    13 31940 21077 21607 MCA100158 g710620 7.00E−58 ribosomal protein L6
    13 31940 21628 21969 MCA100159 g1573804 1.00E−32 ribosomal protein L18
    (rpL18)
    13 31940 21975 22469 MCA100160 g42986 8.00E−54 S5 (rpSE) (aa 1-167)
    13 31940 14176 14808 MCA100248 g1573787 4.00E−78 ribosomal protein L3
    (rpL3)
    13 31940 14853 15425 MCA100249 g1037107 3.00E−70 L4
    13 31940 15437 15724 MCA100250 g510688 7.00E−17 ribosomal protein L23
    13 31940 15765 16586 MCA100251 g48648 e−121 ribosomal protein L2
    (AA 1-274)
    13 31940 16605 16877 MCA100252 g1841326 1.00E−37 ribosomal protein S19
    13 31940 16890 17216 MCA100253 g42831 1.00E−35 ribosomal protein L22
    (rplV) (aa 1-110)
    13 31940 17222 17926 MCA100254 g42832 2.00E−78 ribosomal protein S3
    (rpsC) (aa 1-233)
    13 31940 11780 13402 MCA100255 g48826 e−113 orfF
    13 31940 10997 11554 MCA100256 g606188 1.00E−24 ORF_f217; orfE of
    ECMRED, uses 2nd start
    13 31940 10381 10659 MCA100257 g2589194 1.00E−08 Glu-tRNAGln
    amidotransferase
    subunit C
    13 31940 8809 10284 MCA100258 g1224069 0 amidase
    13 31940 7813 8754 MCA100259 g1403365 0 BRO-2
    13 31940 3925 4569 MCA100414 g3493603 5.00E−26 outer membrane protein
    homolog
    13 31940 24691 25044 MCA100423 g581217 6.00E−46 ribosomal protein S13
    (aa 1-118)
    13 31940 25068 25457 MCA100424 g4098575 7.00E−48 ribosomal protein S11
    13 31940 25473 26111 MCA100425 g42798 4.00E−72 ribosomal protein S4
    (aa 1-206)
    13 31940 26142 27107 MCA100426 g2896137 e−112 DNA-directed RNA
    polymerase alpha chain
    13 31940 27162 27518 MCA100427 g2896138 3.00E−52 ribosomal large
    subunit protein L17
    13 31940 29100 29645 MCA100430
    13 31940 18361 18540 MCA100557 g1841330 9.00E−09 ribosomal protein L29
    13 31940 7570 7746 MCA100583 g2589196 2.00E−15 Glu-tRNAGln
    amidotransferase
    subunit B
    13 31940 6307 7563 MCA100584 g1224071 0 unknown
    13 31940 2606 3502 MCA100588 g304968 3.00E−45 ORF_f310
    13 31940 30365 31270 MCA100612 g3282803 2.00E−64 malonyl CoA-acyl
    carrier protein
    transacylase
    13 31940 1 282 MCA101350 g1651578 2.00E−26 Cell division
    inhibitor MinD.
    13 31940 488 748 MCA101742 g1651579 1.00E−14 Cell division
    inhibitor MinC.
    13 31940 18573 18818 MCA101811 g606245 9.00E−23 30S ribosomal subunit
    protein S17
    13 31940 31291 31908 MCA101812 g1173841 4.00E−62 3-ketoacyl-ACP
    reductase
    13 31940 27617 28207 MCA101856 g1742075 2.00E−29 ORF_ID: o253#4; similar
    to [P45847]
    13 31940 28272 28676 MCA101857 g1788666 7.00E−34 putative transporting
    ATPase
    13 31940 13809 14117 MCA101858 g1573786 4.00E−45 ribosomal protein S10
    (rpS10)
    13 31940 5219 5743 MCA101999 g2231996 2.00E−06 cytochrome c5
    14 19619 11690 13288 MCA100149 g1001407 2.00E−80 iron utilization
    protein
    14 19619 18587 19294 MCA100717 g2314220 4.00E−26 phosphatidylserine
    synthase (pssA)
    14 19619 17517 18404 MCA100718 g1573417 5.00E−39 orfJ protein
    14 19619 16112 16555 MCA100720 g1573816 9.00E−36 H. influenzae
    predicted coding
    region HI0787
    14 19619 14601 15785 MCA100721 g4210610 e−110 DapE
    14 19619 13561 14508 MCA100722 g1651916 8.00E−78 iron transport protein
    14 19619 759 1838 MCA100895 g1574693 5.00E−72 UDP-N-
    acetylglucosamine
    14 19619 2157 2699 MCA100896 g2632721 3.00E−18 similar to
    acetyltransferase
    14 19619 2894 4285 MCA100897 g42056 e−148 (UDP-N-acetylmuramate:
    L-alanine ligase)
    14 19619 4384 5265 MCA100898 g1574696 4.00E−78 D-alanine--D-alanine
    ligase (ddlB)
    14 19619 5654 5914 MCA100899 g2622037 9.00E−11 unknown
    14 19619 5994 6857 MCA100900 g2098748 3.00E−49 oxidative stress
    transcriptional
    regulator; OxyR
    14 19619 7087 7644 MCA100901 g1064782 2.00E−63 alkyl hydroperoxide
    reductase
    14 19619 8407 9966 MCA100903 g1786823 e−135 alkyl hydroperoxide
    reductase, F52a
    subunit
    14 19619 10365 10556 MCA100904 g1799927 5.00E−17 similar to [P37096]
    14 19619 10801 11643 MCA100905 g4514346 2.00E−67 MsmX
    14 19619 6 629 MCA101403 g882476 3.00E−57 glutathione synthetase
    15 28626 10223 10792 MCA100003
    15 28626 27408 28103 MCA100097 g403436 3.00E−27 repressor protein
    15 28626 24288 24542 MCA100178 g1001663 4.00E−16 rare lipoprotein A
    15 28626 16822 17763 MCA100385 g453969 e−103 coproporphyrinogen
    oxidase
    15 28626 17790 18383 MCA100386 g1573172 2.00E−52 GTP cyclohydrolase II
    (ribA)
    15 28626 12359 13507 MCA100396 g1684734 2.00E−44 ORF396 protein
    15 28626 10910 12217 MCA100397 g146020 2.00E−78 folypolyglutamate
    synthetase-
    dihydrofolate
    synthetase
    15 28626 1297 2204 MCA100824 g1786319 7.00E−91 putative ATP-binding
    component of a
    transport system
    15 28626 2319 3065 MCA100825 g1786320 9.00E−75 orf, hypothetical
    protein
    15 28626 3176 3997 MCA100826 g882689 2.00E−48 ORF_o282
    15 28626 6151 6777 MCA100828 g141797 6.00E−51 phosphoribosyl
    anthranilate isomerase
    15 28626 6927 8117 MCA100829 g141798 e−172 tryptophan synthase
    beta-subunit
    15 28626 8163 8981 MCA100830 g144288 6.00E−51 tryptophan synthase A
    protein (EC 4.2.1.20)
    15 28626 766 1017 MCA100987 g2865528 2.00E−10 mono-heme c-type
    cytochrome ScyA
    15 28626 9250 10096 MCA101005 g1788655 2.00E−78 acetylCoA carboxylase,
    carboxytransferase
    beta subunit
    15 28626 13890 14987 MCA101042
    15 28626 15277 15660 MCA101046
    15 28626 15667 15975 MCA101766
    15 28626 4067 5800 MCA101839 g1573733 0 prolyl-tRNA synthetase
    (proS)
    15 28626 18809 20821 MCA101840 g1574278 e−166 1-deoxyxylulose-5-
    phosphate synthase (E.
    coli)
    15 28626 20981 21787 MCA101843 g1573958 4.00E−56 extragenic suppressor
    (suhB)
    15 28626 22787 23935 MCA101845 g1657482 2.00E−13 hypothetical protein
    15 28626 28257 28442 MCA101846 g403437 2.00E−11 putative
    16 22407 21035 22123 MCA100084 g1573365 e−141 conserved hypothetical
    GTP-binding protein
    16 22407 3904 4449 MCA100337 g3091146 7.00E−25 iron-starvation
    protein PigA
    16 22407 19532 20179 MCA100398 g3402250 4.00E−25 putative
    transcriptional
    regulator
    16 22407 18427 19210 MCA100399 g1079662 1.00E−54 catabolite repression
    control protein
    16 22407 16346 18019 MCA100400 g2649804 4.00E−70 L-lactate permease
    (lctP)
    16 22407 152 415 MCA101103
    16 22407 471 1757 MCA101104 g507736 e−167 PurA
    16 22407 2286 2729 MCA101106 g2909463 2.00E−08 hypothetical protein
    Rv0274
    16 22407 2747 2950 MCA101107
    16 22407 2940 3770 MCA101108 g3261756 9.00E−14 hypothetical protein
    Rv0939
    16 22407 4923 5546 MCA101110 g1574542 5.00E−78 endonuclease III (nth)
    16 22407 5747 6997 MCA101111 g1787188 2.00E−62 putative ATP-dependent
    protease
    16 22407 8306 8893 MCA101113 g581247 2.00E−32 gidB protein
    16 22407 8949 9728 MCA101114 g45713 2.00E−49 unnamed protein
    product
    16 22407 9744 10025 MCA101115
    16 22407 10335 11093 MCA101116 g45714 4.00E−59 unnamed protein
    product
    16 22407 11190 12152 MCA101117 g1573007 3.00E−49 conserved hypothetical
    protein
    16 22407 12332 13051 MCA101118 g1651444 1.00E−53 3-deoxy-manno-
    octulosonate
    cytidylyltransferase
    16 22407 13087 13668 MCA101119
    16 22407 13707 14210 MCA101120 g972778 3.00E−23 homology to delta
    subunit of DNA
    polymerase III
    16 22407 14905 16044 MCA101122 g1381737 e−170 lactate dehydrogenase
    17 23210 18014 20569 MCA100120 g2772586 0 high molecular weight
    outer membrane protein
    17 23210 505 1527 MCA101311 g3170587 e−105 glyceraldehyde-3-
    phosphate
    dehydrogenase homolog
    17 23210 2353 3555 MCA101313 g1573894 e−102 GTP-binding protein
    (yhbZ)
    17 23210 3919 4956 MCA101314 g409791 e−104 uroporphyrinogen
    decarboxylase
    17 23210 6000 7055 MCA101316 g4154933 3.00E−71 Protease DO
    17 23210 7823 8527 MCA101318 g1573324 1.00E−40 ABC transporter,
    permease protein
    17 23210 8692 9441 MCA101319 g1431416 2.00E−12 ORF YDL244w
    17 23210 9572 10231 MCA101320 g2293296 1.00E−34 putative transporter
    17 23210 11483 12235 MCA101323
    17 23210 13108 14196 MCA101325 g47094 e−107 3-phosphoserine
    aminotransferase (AA
    1-362)
    17 23210 14309 15082 MCA101326 g1552782 5.00E−42 hypothetical protein
    17 23210 15932 17658 MCA101328 g452382 e−150 2-isopropylmalate
    synthase
    17 23210 7143 7448 MCA101647 g1652439 6.00E−08 hypothetical protein
    17 23210 15246 15692 MCA101649 g2217944 2.00E−26 Lrp-family
    transcriptional
    regulators
    17 23210 10452 10742 MCA101666 g1001663 1.00E−23 rare lipoprotein A
    17 23210 20720 21990 MCA101696 g537207 7.00E−40 ORF_f277
    17 23210 22380 22529 MCA101725 g996086 1.00E−09 ORFY; non-essential
    for pilus assembly
    17 23210 22985 23149 MCA101847
    17 23210 12265 13008 MCA101963
    18 34001 23020 23238 MCA100089
    18 34001 24445 24774 MCA100093
    18 34001 27135 28022 MCA100416 g1890655 4.00E−90 UDP-3-O-acyl-GlcNAc
    deacetylase
    18 34001 29225 29902 MCA100418
    18 34001 31130 31741 MCA100421 g746400 7.00E−53 regulatory protein
    18 34001 15193 15909 MCA100448 g496598 2.00E−69 ORF1
    18 34001 184 930 MCA100873 g1209054 3.00E−87 EtfS
    18 34001 972 1898 MCA100874 g1209055 6.00E−90 EtfL
    18 34001 4318 5247 MCA100877 g309885 e−100 ‘aspartate
    transcarbamoylase’
    18 34001 5421 6119 MCA100878 g1786864 2.00E−43 orf, hypothetical
    protein
    18 34001 6359 7432 MCA100879 g309886 3.00E−73 dihydroorotase-like
    18 34001 7488 8273 MCA100880 g2113931 9.00E−18 citE
    18 34001 23341 23862 MCA101248
    18 34001 26268 26834 MCA101720 g433670 1.00E−70 elongation factor P
    18 34001 2166 2930 MCA101753 g1653441 1.00E−20 rRNA methylase
    18 34001 3046 4006 MCA101756 g901869 2.00E−78 fructose-1,6-/
    sedoheptulose-1,7-
    bisphosphate
    phosphatase
    18 34001 9314 10354 MCA101758 g1788660 2.00E−42 erythronate-4-
    phosphate dehydrogenase
    18 34001 10507 11499 MCA101759 g2983326 3.00E−28 hypothetical protein
    18 34001 11730 12191 MCA101764 g1786586 2.00E−29 orf, hypothetical
    protein
    18 34001 25125 26090 MCA101767 g1790589 7.00E−77 orf, hypothetical
    protein
    18 34001 12249 13307 MCA101768 g1621601 7.00E−67 PurK
    18 34001 13435 13911 MCA101769 g1574461 1.00E−53 phosphoribosylaminoimidazole
    carboxylase
    18 34001 8282 9238 MCA101775 g41552 7.00E−58 genX
    18 34001 21669 22925 MCA101780
    18 34001 23957 24285 MCA101781 g2649731 6.00E−23 conserved hypothetical
    protein
    18 34001 31862 33821 MCA101782 g746401 0 ATP-binding protein
    18 34001 30667 30945 MCA101796 g1750388 2.00E−19 orf2
    18 34001 15937 16377 MCA101803 g2314656 2.00E−16 conserved hypothetical
    integral membrane
    protein
    18 34001 16523 18349 MCA101806 g2896133 3.00E−24 outer membrane
    esterase
    18 34001 18662 19597 MCA101808 g2294845 e−103 biotin synthase
    18 34001 20305 20988 MCA101813 g3417415 1.00E−44 phosphoserine
    phosphatase
    19 33778 32970 33659 MCA100015 g2459964 2.00E−36 HisX
    19 33778 20378 21868 MCA100026 g608530 e−106 L-aspartate oxidase
    19 33778 15834 16912 MCA100127 g968930 e−132 peptide chain release
    factor 1
    19 33778 17205 18047 MCA100128 g1498753 9.00E−76 nicotinate-nucleotide
    pyrophosphorylase
    19 33778 19349 20326 MCA100320 g1651337 e−116 Quinolinate synthetase
    A.
    19 33778 10305 11824 MCA100473 g2313949 1.00E−98 osmoprotection protein
    (proWX)
    19 33778 12732 14177 MCA100475 g1789015 e−165 succinate-semialdehyde
    dehydrogenase, NADP-dependent
    19 33778 2058 2579 MCA100756
    19 33778 4059 4889 MCA100758
    19 33778 31220 32257 MCA100768 g2695825 4.00E−58 corA
    19 33778 29370 31016 MCA100769 g1573928 e−119 glutathione-regulated
    potassium efflux
    system protein
    19 33778 27814 29127 MCA100770 g1573294 3.00E−98 conserved hypothetical
    protein
    19 33778 25151 27505 MCA100771 g2959335 0 Lon-protease
    19 33778 24481 25038 MCA100772 g1754527 4.00E−16 intracellular
    septation A
    19 33778 23332 23889 MCA100774 g3916254 2.00E−25 ExbB
    19 33778 23892 24287 MCA100946 g3916255 1.00E−23 ExbD
    19 33778 9106 9774 MCA101121 g927800 2.00E−20 Ydr533cp; CAI: 0.24
    19 33778 219 1652 MCA101802
    19 33778 3487 3846 MCA101805
    19 33778 4651 4911 MCA101974
    19 33778 6334 6705 MCA101975
    19 33778 2811 3494 MCA101977
    19 33778 22342 23226 MCA102006
    2 1169 157 555 MCA100759 g2633670 2.00E−17 yzzE; similar to
    general stress protein
    2 1169 795 1166 MCA101009 g3929904 5.00E−18 fumarate hydratase B,
    beta subunit
    20 31063 848 1366 MCA100998 g396321 2.00E−57 nusG
    20 31063 1476 1898 MCA100999 g2367334 7.00E−51 50S ribosomal subunit
    protein L11
    20 31063 1907 2581 MCA101000 g47257 2.00E−62 L1 protein (AA 1-234)
    20 31063 2920 3411 MCA101001 g1573638 9.00E−63 ribosomal protein L10
    (rpL10)
    20 31063 3481 3852 MCA101002 g1573639 7.00E−25 ribosomal protein
    L7/L12 (rpL7/L12)
    20 31063 4275 8360 MCA101003 g45729 0 beta-subunit of RNA
    polymerase
    20 31063 8446 12564 MCA101004 g2367335 0 RNA polymerase, beta
    prime subunit
    20 31063 12905 14122 MCA101239 g1573443 e−146 D-3-phosphoglycerate
    dehydrogenase (serA)
    20 31063 14321 15688 MCA101240 g1573119 e−171 glutathione reductase
    (gor)
    20 31063 16095 16997 MCA101241 g4062671 6.00E−73 Hypothetical protein
    HI0959
    20 31063 17242 19314 MCA101242 g1574519 6.00E−81 tail specific protease
    (prc)
    20 31063 20177 20935 MCA101244 g1573922 4.00E−28 conserved hypothetical
    protein
    20 31063 21988 22695 MCA101246 g2314002 5.00E−13 H. pylori predicted
    coding region HP0862
    20 31063 23138 23536 MCA101247 g1888564 7.00E−36 ORFX
    20 31063 24093 24545 MCA101249 g4545247 6.00E−53 invasion protein
    homolog
    20 31063 24726 26248 MCA101250 g2633966 5.00E−49 chromosome segregation
    SMC protein homolog
    20 31063 28591 29325 MCA101251 g296030 4.00E−97 ribosomal protein S2
    20 31063 29460 30314 MCA101252 g1552747 4.00E−61 elongation factor EF-
    Ts
    20 31063 30482 31063 MCA101253 g1079661 2.00E−47 orotate phosphoribosyl
    transferase
    20 31063 26531 28321 MCA101493 g1237015 4.00E−44 ORF4
    20 31063 350 823 MCA101880
    20 31063 21040 21933 MCA101950 g2983199 5.00E−07 biotin [acetyl-CoA-
    carboxylase] ligase
    21 39003 30165 31499 MCA100007 g1772845 e−130 NAD(P)H-dependent
    glutamate
    dehydrogenase
    21 39003 28829 29935 MCA100118 g1786552 e−134 glutathione-dependent
    formaldehyde
    dehydrogenase
    21 39003 25255 26679 MCA100217 g1787999 4.00E−77 orf, hypothetical
    protein
    21 39003 27082 27942 MCA100218
    21 39003 27992 28813 MCA100219 g405878 1.00E−86 probable esterase
    21 39003 20225 20965 MCA100226 g3220185 3.00E−31 pteridine reductase
    21 39003 19027 20070 MCA100227 g882578 7.00E−55 CG Site No. 933
    21 39003 21277 22656 MCA100347 g1736694 e−126 Proline transport
    protein
    21 39003 24025 24876 MCA100349 g2570906 1.00E−64 stearoyl-CoA
    desaturase
    21 39003 35864 38086 MCA100561 g1763284 e−163 penicillin-binding
    protein 1A
    21 39003 33490 35418 MCA100562 g862902 0 high temperature
    protein G
    21 39003 8041 9210 MCA101029 g1806239 1.00E−35 lipD
    21 39003 16664 18907 MCA101134 g1788806 0 putative multimodular
    enzyme
    21 39003 15338 16315 MCA101135 g1009431 e−106 porphobilinogen
    synthase
    21 39003 13425 14354 MCA101137 g42903 e−119 ruvB gene product (AA
    1-336)
    21 39003 12028 13293 MCA101138 g2909447 e−147 fadA2
    21 39003 10330 11691 MCA101140 g3063883 8.00E−92 putative 3-oxoacyl-
    [acyl-carrier protein]
    reductase
    21 39003 9377 10174 MCA101141 g2909445 3.00E−35 hypothetical protein
    Rv0241c
    21 39003 7384 7893 MCA101143 g3046326 4.00E−55 hypoxanthine
    phosphoribosyltransferase
    21 39003 4877 6769 MCA101145 g288532 0 dihydroxy acid
    21 39003 2806 4254 MCA101147 g2078066 5.00E−97 betP
    21 39003 1461 2414 MCA101149 g1001519 3.00E−23 hypothetical protein
    21 39003 559 1209 MCA101201
    21 39003 116 433 MCA101854 g2226116 2.00E−16 hypothetical protein
    21 39003 38281 38810 MCA101855 g972976 3.00E−20 1-acyl-sn-glycerol-3-
    phosphate
    acyltransferase
    21 39003 6901 7305 MCA101863
    21 39003 14701 15213 MCA101864
    22 45613 33275 34222 MCA100119 g1786405 3.00E−57 transcriptional
    regulator for nitrite
    reductase
    22 45613 31023 32033 MCA100130 g1653241 1.00E−40 hemolysin
    22 45613 13590 14525 MCA100133 g476229 e−150 isopropylmalate
    dehydrogenase
    22 45613 40430 41209 MCA100144 g1799842 7.00E−62 sulfate/thiosulfate
    transport protein cysW
    22 45613 41338 42090 MCA100171 g1799853 9.00E−60 sulfate transport
    system permease
    protein cysT
    22 45613 42522 42968 MCA100210
    22 45613 42993 44153 MCA100212 g1573911 4.00E−84 ATP-dependent RNA
    helicase (rhlB)
    22 45613 44209 45369 MCA100213 g1573441 2.00E−87 oxygen-independent
    coproporphyrinogen III
    oxidase
    22 45613 10853 13060 MCA100223 g1000692 0 LeuA
    22 45613 536 1627 MCA100312 g1790487 4.00E−49 alanine racemase 1
    22 45613 1693 3003 MCA100313 g145763 e−106 DnaB replication
    protein (dnaB)
    22 45613 3266 4333 MCA100314 g1786237 3.00E−66 pyridoxine
    biosynthesis
    22 45613 8040 9071 MCA100353 g3758880 e−153 fructose-1,6-
    bisphosphate aldolase
    22 45613 9074 9676 MCA100354 g1573280 4.00E−29 Holliday junction DNA
    helicase (ruvA)
    22 45613 10292 10609 MCA100356 g1850796 6.00E−19 CynR protein
    22 45613 30261 30536 MCA100450 g1573206 3.00E−17 conserved hypothetical
    protein
    22 45613 28267 30132 MCA100451 g3983168 e−141 SecD
    22 45613 27163 28047 MCA100452 g1573204 4.00E−55 protein-export
    membrane protein
    (secF)
    22 45613 26200 26925 MCA100453 g1518782 4.00E−38 penicillin-binding
    protein 5
    22 45613 39609 40322 MCA100541 g1799841 2.00E−67 sulfate/thiosulfate
    transport protein cysA
    22 45613 38143 39546 MCA100542 g1881700 e−143 RadA
    22 45613 36060 37833 MCA100543 g1680533 0 phosphoenolpyruvate
    carboxykinase
    22 45613 34862 35839 MCA100544 g2226145 4.00E−30 hypothetical protein
    22 45613 15396 16193 MCA100678 g1572987 2.00E−90 exodeoxyribonuclease
    III (xthA)
    22 45613 16548 18068 MCA100679 g1359473 0 lysyl-tRNA-synthase
    22 45613 18097 19173 MCA100680 g1574159 e−104 DNA polymerase III,
    subunits gamma and tau
    (dnaX)
    22 45613 20776 21252 MCA100682 g924993 8.00E−19 transcriptional
    regulator LtrA
    22 45613 21816 22710 MCA100684 g1786984 3.00E−32 putative
    transcriptional
    regulator LYSR-type
    22 45613 22855 23910 MCA100685 g2108220 1.00E−88 hemolysin
    22 45613 24272 25591 MCA100686 g2209268 3.00E−69 Na+/H+ antiporter
    22 45613 5347 6123 MCA100727 g1573537 1.00E−51 diadenosine-
    tetraphosphatase
    (apaH)
    22 45613 4478 5278 MCA100787 g1786236 7.00E−62 S-adenosylmethionine-
    6-N′,N′-adenosyl
    dimethyltransferase
    22 45613 6267 7456 MCA101090 g41422 e−121 phosphoglycerate
    kinase (AA 1-387)
    22 45613 32181 32786 MCA101784
    23 33140 647 814 MCA100041
    23 33140 2719 3444 MCA100603 g2330641 1.00E−22 htrB
    23 33140 3463 5241 MCA100604 g1788173 0 aspartate tRNA
    synthetase
    23 33140 5822 7239 MCA100606 g4062776 5.00E−83 ORF_ID: o245#1
    23 33140 7701 8581 MCA100608 g1574534 1.00E−72 protease, putative
    (sohB)
    23 33140 8907 9644 MCA100609 g1524217 3.00E−47 hypothetical protein
    Rv0945
    23 33140 9956 10741 MCA100610 g41424 3.00E−45 ORF4 (AA 1-197)
    23 33140 31971 33044 MCA100705 g1788953 8.00E−98 3-deoxy-D-
    arabinoheptulosonate-
    7-phosphate synthase
    23 33140 10882 11415 MCA101509 g1573653 8.00E−53 DNA-3-methyladenine
    glycosidase I (tagI)
    23 33140 11492 12220 MCA101510 g3046322 2.00E−69 O-acetylserine
    synthase; CysE2
    23 33140 12686 13213 MCA101511 g3046324 1.00E−24 unknown
    23 33140 13720 16956 MCA101513 g940886 0 DNA polymerase III
    holoenzyme alpha
    subunit
    23 33140 17151 18281 MCA101514 g1573367 3.00E−93 conserved hypothetical
    protein
    23 33140 18669 19625 MCA101515 g1799725 2.00E−69 similar to [SwissProt
    Accession Number
    P39199]
    23 33140 19870 20970 MCA101516 g1162959 e−123 homologous to HI0365
    in Haemophilus
    influenzae; ORF1
    23 33140 21062 21676 MCA101517
    23 33140 21735 22844 MCA101518 g1531668 e−122 AarC
    23 33140 22996 23775 MCA101519 g4155368 3.00E−53 putative
    23 33140 23844 25085 MCA101520 g1573338 e−117 histidyl-tRNA
    synthetase (hisS)
    23 33140 25203 26036 MCA101521 g1573339 1.00E−12 conserved hypothetical
    protein
    23 33140 26145 27266 MCA101522 g1805571 8.00E−33 serine/threonine
    protein kinase (EC
    2.7.1.-)
    23 33140 27407 28831 MCA101523 g1788858 e−153 putative GTP-binding
    factor
    23 33140 28941 29570 MCA101524 g2633978 1.00E−30 ribonuclease H
    23 33140 29683 30894 MCA101525 g1694783 2.00E−67 lpxB
    23 33140 31117 31638 MCA101526 g1787602 4.00E−11 orf, hypothetical
    protein
    23 33140 136 480 MCA101883
    23 33140 882 1604 MCA101889
    24 33248 31423 31823 MCA101434 g1046241 8.00E−16 orf14
    24 33248 25628 29158 MCA101438 g1651549 0 Transcription-repair
    coupling protein mfd
    24 33248 24151 25353 MCA101439 g1685080 5.00E−30 TolB
    24 33248 22836 23243 MCA101441 g1103861 1.00E−17 TolR
    24 33248 22115 22702 MCA101442 g1103860 1.00E−37 TolQ
    24 33248 17684 21622 MCA101443 g1574628 0 ATP-dependent helicase
    (hrpA)
    24 33248 15920 16918 MCA101445 g2314661 2.00E−13 lipase-like protein
    24 33248 14698 15579 MCA101446 g1840154 9.00E−36 36 kDa protein
    24 33248 13519 14589 MCA101447 g4155989 1.00E−12 putative
    24 33248 12383 13468 MCA101448 g2314658 7.00E−25 conserved hypothetical
    integral membrane
    protein
    24 33248 11331 11747 MCA101450 g1787709 2.00E−32 orf, hypothetical
    protein
    24 33248 10560 11324 MCA101451 g3192702 6.00E−28 gp19
    24 33248 32602 33087 MCA101505 g1790034 3.00E−36 orf, hypothetical
    protein
    24 33248 9940 10167 MCA101507 g1628368 1.00E−08 gepA
    24 33248 5471 6674 MCA101512 g437700 5.00E−39 traN
    24 33248 99 350 MCA102008
    24 33248 1019 1525 MCA102009
    24 33248 1526 2998 MCA102010
    24 33248 2998 4413 MCA102011
    24 33248 7022 8038 MCA102014 g2764860 9.00E−16 gene 13
    24 33248 8049 8252 MCA102016
    24 33248 8313 8672 MCA102017
    24 33248 23253 24080 MCA102018
    24 33248 8674 9030 MCA102026
    24 33248 9030 9377 MCA102028
    24 33248 31013 31210 MCA102029
    24 33248 32232 32447 MCA102030
    25 31147 830 1147 MCA100008 g3776111 6.00E−32 thioredoxin
    25 31147 3 593 MCA100009 g454841 3.00E−79
    25 31147 29786 30031 MCA100048 g1518927 1.00E−32 ferredoxin
    25 31147 29298 29753 MCA100049 g1518926 2.00E−45 protein for
    lipopolysaccharide
    core synthesis
    25 31147 12271 13725 MCA100080 g4200042 2.00E−81 exopolyphosphatase
    25 31147 4751 5011 MCA100380 g663269 2.00E−13 BolA
    25 31147 2616 4289 MCA100381 g2626753 2.00E−58 sulfate transporter
    25 31147 1432 2072 MCA100384 g1786244 1.00E−36 orf, hypothetical
    protein
    25 31147 6397 7359 MCA100487 g1052826 8.00E−97 phosphate binding
    protein
    25 31147 7554 8459 MCA100488 g1574215 1.00E−70 phosphate ABC
    transporter, permease
    protein (pstC)
    25 31147 8539 9348 MCA100489 g42397 9.00E−76 phoT (pstA) gene
    product (aa 1-296)
    25 31147 9516 10262 MCA100490 g1790162 7.00E−94 ABC transporter, high-
    affinity phosphate-
    specific
    25 31147 10496 11182 MCA100491 g1786599 6.00E−64 positive response
    regulator for pho
    regulon
    25 31147 11382 12201 MCA100492 g3282775 6.00E−53 histidine protein
    kinase PhoR
    25 31147 5110 5892 MCA100803 g1653285 6.00E−19 hypothetical protein
    25 31147 14590 15696 MCA101453
    25 31147 16710 17657 MCA101456 g2766195 3.00E−15 putative permease BhiE
    25 31147 17742 18020 MCA101457 g2415545 2.00E−19 permease protein
    25 31147 18338 19156 MCA101458 g1574806 7.00E−65 spermidine/putrescine
    ABC transporter
    25 31147 19449 20102 MCA101459 g4539576 4.00E−10 putative morphological
    differentiation-
    associated protein
    25 31147 20696 21667 MCA101461 g1881313 8.00E−80 similar to alkanal
    monooxygenase alpha
    chain
    25 31147 21810 22436 MCA101462 g1788844 6.00E−70 uracil
    phosphoribosyltransferase
    25 31147 23978 25966 MCA101464 g1574651 0 DNA ligase (lig)
    25 31147 25990 26874 MCA101465
    25 31147 27604 28056 MCA101467 g1788973 5.00E−48 small protein B
    25 31147 28358 29119 MCA101468 g478986 1.00E−47 NADPH-flavin
    oxidoreductase
    25 31147 15766 16581 MCA101993 g1360216 1.00E−06 ORF YLL031c
    26 34279 24575 24982 MCA100071 g1787709 2.00E−33 orf, hypothetical
    protein
    26 34279 23822 24559 MCA100072 g3192702 4.00E−32 gp19
    26 34279 25922 28576 MCA100506 g3192704 0 gp21
    26 34279 30501 30830 MCA100508
    26 34279 30 378 MCA100640 g1574256 2.00E−24 H. influenzae
    predicted coding
    region HI1422
    26 34279 775 1820 MCA100642 g15152 4.00E−31 alpha gene (pot.P4-
    specific DNA primase)
    (AA 1-777)
    26 34279 3747 4175 MCA100645
    26 34279 4724 5230 MCA100647
    26 34279 5715 7454 MCA100648 g3703076 5.00E−08 putative terminase
    large subunit
    26 34279 25324 25890 MCA100871 g3192703 6.00E−26 gp20
    26 34279 7772 8620 MCA101290 g1574365 5.00E−78 H. influenzae
    predicted coding
    region HI1523
    26 34279 8726 8929 MCA101291
    26 34279 8996 9613 MCA101292
    26 34279 11030 11218 MCA101295
    26 34279 11362 12360 MCA101296 g4126611 2.00E−21 ORF25
    26 34279 12828 13169 MCA101297
    26 34279 13153 13626 MCA101299
    26 34279 13623 13979 MCA101300
    26 34279 14007 14438 MCA101301
    26 34279 14521 14868 MCA101302
    26 34279 14943 15191 MCA101303
    26 34279 15247 15624 MCA101304
    26 34279 15733 19257 MCA101305 g2392838 2.00E−07 unknown
    26 34279 19350 19622 MCA101306 g2232363 2.00E−09 lambda phage M tail
    component homolog
    26 34279 22634 23014 MCA101309
    26 34279 23069 23783 MCA101409 g3192701 1.00E−44 gp18
    26 34279 4281 4589 MCA101760
    26 34279 5384 5770 MCA101762
    26 34279 30917 31486 MCA101785
    26 34279 12525 12812 MCA101793
    26 34279 10141 10902 MCA101809 g3172264 4.00E−12 major head subunit;
    gp17
    26 34279 21575 22135 MCA101932
    26 34279 22098 22577 MCA101933
    26 34279 7432 7626 MCA101935
    26 34279 5227 5397 MCA102035
    27 48328 3898 4593 MCA100056
    27 48328 23080 24003 MCA100073 g3482882 2.00E−81 unknown
    27 48328 1179 1733 MCA100106
    27 48328 1882 2790 MCA100107
    27 48328 43439 45661 MCA100173 g1786239 3.00E−52 organic solvent
    tolerance
    27 48328 18470 18898 MCA100206 g2314029 3.00E−33 conserved hypothetical
    protein
    27 48328 18957 19259 MCA100207 g3228385 1.00E−10 DsrC
    27 48328 19608 19982 MCA100208 g606279 7.00E−14 ORF_f128
    27 48328 20280 22904 MCA100209 g1789433 e−171 adenylylating enzyme
    for glutamine
    synthetase
    27 48328 39728 40198 MCA100292 g41611 3.00E−53 GreA protein
    27 48328 40220 40582 MCA100293
    27 48328 40907 41812 MCA100294 g440377 8.00E−14 297 amino acids
    peptide, unknown
    function
    27 48328 41954 43224 MCA100295 g1786238 1.00E−28 survival protein
    27 48328 13080 13841 MCA100296 g3192702 4.00E−33 gp19
    27 48328 13845 14246 MCA100297 g1046241 5.00E−30 orf14
    27 48328 15183 16646 MCA100300 g3192704 e−126 gp21
    27 48328 9361 10777 MCA100325 g3192699 8.00E−13 gp16
    27 48328 17057 18226 MCA100681 g3294478 6.00E−74 putative integrase
    27 48328 5343 5990 MCA100784 g15640 5.00E−36 antirepressor protein
    gene (aa 1-300)
    27 48328 7640 9283 MCA100788 g2764873 9.00E−27 gene 18.1
    27 48328 10904 11236 MCA100790
    27 48328 11341 11730 MCA100791
    27 48328 11814 12479 MCA100792 g3192701 4.00E−32 gp18
    27 48328 24782 25846 MCA101267 g2105065 8.00E−71 hypothetical protein
    Rv3629c
    27 48328 25926 26549 MCA101268 g3406829 5.00E−40 glutathione-s-
    transferase homolog
    27 48328 26714 28057 MCA101269 g1789768 2.00E−93 uroporphyrinogen III
    methylase; sirohaeme
    biosynthesis
    27 48328 28527 30197 MCA101270 g2565334 e−175 sulfite reductase
    27 48328 30403 31599 MCA101271 g1799660 e−141 aspartate
    aminotransferase (EC
    2.6.1.1)
    27 48328 32136 32504 MCA101273 g1788077 1.00E−27 orf, hypothetical
    protein
    27 48328 32871 34085 MCA101274 g451651 e−139 carbamoyl phosphate
    synthetase light
    subunit
    27 48328 34231 35126 MCA101275 g1781074 2.00E−41 mrr
    27 48328 35218 35517 MCA101276
    27 48328 35648 36154 MCA101277 g1573288 3.00E−39 conserved hypothetical
    protein
    27 48328 36212 39451 MCA101278 g1750387 0 carbamoylphosphate
    synthetase large
    subunit
    27 48328 1690 1878 MCA101315
    27 48328 46479 47453 MCA101401 g4545243 3.00E−43 unknown
    27 48328 14561 15130 MCA101644 g3192703 1.00E−17 gp20
    27 48328 47519 48194 MCA101706 g4545244 4.00E−34 unknown
    27 48328 6600 6881 MCA101849
    27 48328 3071 3532 MCA101900
    27 48328 3625 3816 MCA101901
    27 48328 2 349 MCA101902
    28 49617 33195 34376 MCA100162 g1573560 e−173 elongation factor Tu
    (tufA)
    28 49617 34523 35461 MCA100163 g1787114 e−103 thioredoxin reductase
    28 49617 29820 30191 MCA100230 g148985 3.00E−59 StrA
    28 49617 30315 30785 MCA100231 g1573568 6.00E−60 ribosomal protein S7
    (rpS7)
    28 49617 30948 33044 MCA100232 g41517 0 elongation factor G
    28 49617 762 1895 MCA100242 g164759 9.00E−17 alanine:glyoxylate
    aminotransferase
    28 49617 2047 3519 MCA100244 g1573675 e−137 aminoacyl-histidine
    dipeptidase (pepD)
    28 49617 3619 4347 MCA100245 g746513 2.00E−23 D1022.4
    28 49617 35607 36643 MCA100342 g3172117 5.00E−84 acyl-CoA dehydrogenase
    28 49617 36644 37420 MCA100343 g2909448 3.00E−31 fadE5
    28 49617 37843 38634 MCA100344 g1785900 6.00E−30 shikimate
    dehydrogenase
    28 49617 38747 39349 MCA100345
    28 49617 39350 40180 MCA100346 g1651539 4.00E−07 4-amino-4-
    deoxychorismate lyase.
    28 49617 14395 17115 MCA100440 g3414697 0 lactoferrin binding
    protein B; LbpB
    28 49617 22514 23227 MCA100449 g3414695 e−135 unknown
    28 49617 40373 41422 MCA100670 g1573431 3.00E−63 conserved hypothetical
    protein
    28 49617 41438 42034 MCA100671 g3328593 2.00E−29 Thymidylate Kinase
    28 49617 42254 43129 MCA100672 g1573221 4.00E−76 dihydrodipicolinate
    synthetase (dapA)
    28 49617 43531 44238 MCA100673 g1788820 1.00E−80 phosphoribosylaminoimidazolesuccinocarboxamide
    synthetase
    28 49617 44287 44583 MCA100674 g1261932 2.00E−22 hypothetical protein
    Rv2230c
    28 49617 44964 46457 MCA100675 g38754 e−161 anthranilate synthase
    28 49617 47871 48461 MCA100677 g1420585 9.00E−23 ORF YOR259c
    28 49617 4561 4887 MCA100806 g4062758 6.00E−28 Hypothetical protein
    HI1355
    28 49617 5171 5995 MCA100807 g1778577 5.00E−38 similar to H.
    influenzae
    28 49617 7002 7334 MCA100810 g536952 1.00E−32 phnA gene product
    28 49617 7401 8669 MCA100811 g557262 e−141 glutamate 1-
    semialdehyde 2,1-
    aminomutase
    28 49617 8987 11776 MCA100812 g1786287 0 preprotein
    translocase; secretion
    protein
    28 49617 11952 12248 MCA100813
    28 49617 12453 13913 MCA100961 g4033729 2.00E−92 apolipoprotein N-
    acyltransferase
    28 49617 17302 20301 MCA101127 g3414688 0 lactoferrin binding
    protein A; LbpA
    28 49617 22158 22340 MCA101129
    28 49617 23390 24286 MCA101130 g3861035 4.00E−53 unknown
    28 49617 24341 25198 MCA101131 g154231 2.00E−57 p-aminobenzoate
    synthase component I
    28 49617 25891 27114 MCA101133 g2384564 e−115 beta-ketoacyl-ACP
    synthase I
    28 49617 43166 43477 MCA101765
    28 49617 27638 28825 MCA101786 g3924824 3.00E−18 cDNA ESTs D37429,
    D34381, yk370a12.5,
    and yk370a12.3
    28 49617 20306 21928 MCA101788 g3414689 0 unknown
    28 49617 6260 6820 MCA101859 g887848 3.00E−16 ORF_o326
    28 49617 237 524 MCA101905
    29 66986 35441 38304 MCA100016 g154417 0 DNA repair enzyme
    29 66986 59667 60365 MCA100045 g1770057 3.00E−25 glutamate racemase
    29 66986 26527 27261 MCA100088 g551827 1.00E−50 phosphatidylserine
    decarboxylase
    29 66986 62551 62976 MCA100100 g2621609 3.00E−35 peptide methionine
    sulfoxide reductase
    29 66986 32810 33283 MCA100164 g1871177 1.00E−32 unknown protein
    29 66986 32188 32637 MCA100165 g1790320 4.00E−29 orf, hypothetical
    protein
    29 66986 31513 32049 MCA100166 g1574395 2.00E−41 dethiobiotin synthase
    (bioD-2)
    29 66986 30641 31438 MCA100167 g1574396 2.00E−26 biotin synthesis
    protein, putative
    29 66986 3760 4908 MCA100170 g150277 e−144 major anaerobically
    induced outer membrane
    protein
    29 66986 7578 8528 MCA100196 g1788007 e−108 phenylalanine tRNA
    synthetase, alpha-
    subunit
    29 66986 8587 10980 MCA100197 g1788006 0 phenylalanine tRNA
    synthetase, beta-
    subunit
    29 66986 376 2616 MCA100310 g2584871 0 nitric oxide reductase
    29 66986 63073 63813 MCA100362 g1573289 6.00E−48 conserved hypothetical
    protein
    29 66986 63968 64921 MCA100363 g1736517 2.00E−86 ORF_ID: o337#12;
    similar to [P44167]
    29 66986 65011 65925 MCA100364 g1788268 2.00E−60 orf, hypothetical
    protein
    29 66986 27579 27932 MCA100376 g1773150 3.00E−10 hypothetical 14.8 kd
    protein
    29 66986 28126 29346 MCA100377 g1574398 e−134 adenosylmethionine-8-
    amino-7-oxononanoate
    aminotransfer
    29 66986 29451 30593 MCA100378 g1574397 3.00E−94 8-amino-7-oxononanoate
    synthase (bioF)
    29 66986 38453 38947 MCA100569 g1573216 3.00E−41 single-stranded DNA
    binding protein (ssb)
    29 66986 41258 41935 MCA100572 g1067166 3.00E−67 inner membrane protein
    29 66986 6768 7145 MCA100655 g2983502 3.00E−12 hypothetical protein
    29 66986 56916 58574 MCA100693 g1842057 0 electron transfer
    flavoprotein-
    ubiquinone
    oxidoreductase
    29 66986 55454 56770 MCA100694 g1787461 5.00E−88 enzyme in alternate
    path of synthesis of
    5-aminolevulin
    29 66986 53509 54726 MCA100696 g557259 1.00E−18 orf3
    29 66986 5678 6376 MCA100697 g1806180 4.00E−13 hypothetical protein
    Rv0712
    29 66986 52515 52949 MCA100698 g557258 3.00E−09 hemM
    29 66986 51719 52480 MCA100699 g968927 9.00E−37 orfY gene product
    29 66986 50111 51057 MCA100701 g147379 e−122 phosphoribosylpyrophosphate
    synthetase (EC
    2.7.6.1)
    29 66986 49534 50058 MCA100957 g4062631 1.00E−11 Cytochrome b561
    29 66986 23587 25704 MCA100973 g939724 2.00E−99 putative sensor
    kinase; regulatory
    protein
    29 66986 21832 22698 MCA100974 g581757 e−110 cysteine synthase
    29 66986 21122 21790 MCA100975 g4155184 9.00E−19 putative
    29 66986 19031 20455 MCA100977 g1789148 5.00E−69 putative enzyme
    29 66986 17277 18389 MCA100979 g1573195 1.00E−82 ATP-dependent RNA
    helicase (deaD)
    29 66986 14191 16212 MCA100981 g1789147 e−144 (p)ppGpp synthetase I
    (GTP
    pyrophosphokinase)
    29 66986 13280 14149 MCA100982 g466773 2.00E−57 formamidopyrimidine-
    DNA glycosylase
    29 66986 11637 11894 MCA100984 g1657496 1.00E−21 hypothetical protein
    29 66986 61385 62110 MCA101336 g3132253 1.00E−33 ORF5
    29 66986 11131 11412 MCA101783 g1435199 3.00E−26 IhfA
    29 66986 49142 49360 MCA101787
    29 66986 60620 60838 MCA101791
    29 66986 41962 42651 MCA101800 g1174236 8.00E−30 CycJ
    29 66986 47425 48129 MCA101884 g467327 9.00E−49 unknown
    29 66986 33583 33888 MCA101885 g1196481 4.00E−10 unknown protein
    29 66986 34239 34529 MCA101888 g1778554 3.00E−20 HI0034 homolog
    29 66986 34824 35239 MCA101893 g1303791 7.00E−15 YqeJ
    29 66986 2840 3361 MCA101894 g2633273 1.00E−30 similar to
    hypothetical proteins
    29 66986 39252 40400 MCA101895 g1789416 7.00E−91 putative
    synthetase/amidase
    29 66986 42814 43641 MCA101896 g150508 e−103 lipoprotein
    29 66986 43836 44480 MCA101897 g1552774 1.00E−37 hypothetical
    29 66986 44515 45558 MCA101898 g1573615 e−121 ABC transporter, ATP-
    binding protein
    29 66986 45781 46777 MCA101899 g2072712 9.00E−14 mtrB
    29 66986 58939 59568 MCA102050
    29 66986 20802 21026 MCA102051
    29 66986 12225 13193 MCA102055
    30 58909 57032 58390 MCA100109 g4062412 e−165 Hypothet. 51.7 kd
    protein in dnaJ-rpsU
    intergenic region.
    30 58909 44550 45806 MCA100235 g1799634 2.00E−97 NADH dehydrogenase I
    chain N (EC 1.6.5.3)
    30 58909 47991 49715 MCA100331 g1574424 0 arginyl-tRNA
    synthetase (argS)
    30 58909 46973 47773 MCA100332 g290446 4.00E−31 ferredoxin NADP+
    reductase
    30 58909 1064 2329 MCA100463 g436156 e−127 GTPase required for
    high frequency
    lysogenization
    30 58909 2502 3320 MCA100464 g606115 5.00E−55 dihydropteroate
    synthase
    30 58909 3369 4094 MCA100465 g1789315 4.00E−34 orf, hypothetical
    protein
    30 58909 56014 56754 MCA100615 g1183839 8.00E−73 unknown
    30 58909 54292 55815 MCA100616 g148179 e−131 threonine deaminase
    30 58909 53064 54086 MCA100617 g44888 e−153 NgoPII restriction and
    modification
    30 58909 52624 53001 MCA100618 g606334 1.00E−30 ORF_o133
    30 58909 52190 52600 MCA100619 g1147812 1.00E−23 red cell-type low
    molecular weight acid
    phosphatase
    30 58909 51008 52030 MCA100620 g145431 4.00E−49 unidentified reading
    frame II
    30 58909 4392 5996 MCA100757 g44839 e−139 pilB gene product (AA
    1-521)
    30 58909 45970 46683 MCA100785 g1573561 5.00E−96 membrane protein
    30 58909 6 854 MCA100838 g1573723 7.00E−63 heat shock protein
    (htpX)
    30 58909 39210 39746 MCA101072 g1788617 2.00E−81 NADH dehydrogenase I
    chain I
    30 58909 39794 40300 MCA101079 g1788616 2.00E−32 NADH dehydrogenase I
    chain J
    30 58909 6340 7718 MCA101157 g2804454 e−131 C. elegans
    adenosylhomocysteinase
    (SW: P27604)
    30 58909 8333 11554 MCA101159 g3523135 0 transferrin binding
    protein A; TbpA
    30 58909 12590 14125 MCA101161 g3523128 0 unknown
    30 58909 14403 16520 MCA101164 g3523129 0 transferrin binding
    protein B; TbpB
    30 58909 17432 18442 MCA101166 g1590923 8.00E−21 conserved hypothetical
    protein
    30 58909 18722 19336 MCA101167 g3861219 9.00E−47 unknown
    30 58909 19375 20268 MCA101168 g1651962 3.00E−80 hypothetical protein
    30 58909 22343 23683 MCA101170 g1574303 e−128 mrsA protein (mrsA)
    30 58909 23858 24490 MCA101194 g1653389 9.00E−50 pyridoxamine 5-
    phosphate oxidase
    30 58909 24814 25410 MCA101195 g4063381 3.00E−27 periplasmic chaperone
    protein
    30 58909 25438 25635 MCA101196 g1573260 3.00E−08 mercuric ion scavenger
    protein (merP)
    30 58909 25824 26192 MCA101197 g3273735 2.00E−32 NADH dehydrogenase
    chain A
    30 58909 26785 27447 MCA101199 g1788624 6.00E−87 NADH dehydrogenase I
    chain B
    30 58909 27619 29301 MCA101200 g1788622 0 NADH dehydrogenase I
    chain C, D
    30 58909 30568 31590 MCA101202 g682765 3.00E−74 mccB
    30 58909 31965 32180 MCA101203 g349635 2.00E−19 NADH dehydrogenase
    subunit
    30 58909 33192 33647 MCA101205 g349636 3.00E−46 NADH dehydrogenase
    subunit
    30 58909 33770 35029 MCA101206 g1799645 e−152 NADH dehydrogenase I
    chain F (EC 1.6.5.3)
    30 58909 35070 38009 MCA101207 g409013 0 NADH dehydrogenase
    subunit
    30 58909 38202 39188 MCA101208 g1788618 e−123 NADH dehydrogenase I
    chain H
    30 58909 40440 40736 MCA101211 g1799639 4.00E−22 NADH dehydrogenase I
    chain K (EC 1.6.5.3)
    30 58909 40746 42596 MCA101212 g1788614 0 NADH dehydrogenase I
    chain L
    30 58909 42622 44157 MCA101213 g1799637 e−148 NADH dehydrogenase
    chain 4 (EC 1.6.5.3)
    30 58909 32262 33029 MCA101966
    31 65792 57101 58057 MCA100214 g1236631 2.00E−69 SfhB
    31 65792 58173 58838 MCA100215 g2104329 5.00E−19 yfiH
    31 65792 58955 59695 MCA100216 g1573058 1.00E−62 conserved hypothetical
    protein
    31 65792 31449 32228 MCA100281 g4008034 3.00E−82 enoyl-(acyl-carrier
    protein) reductase
    31 65792 32373 33071 MCA100282 g1573553 3.00E−91 ribulose-phosphate 3-
    epimerase (dod)
    31 65792 33430 33732 MCA100283
    31 65792 33788 34507 MCA100284
    31 65792 34613 35137 MCA100286 g2959334 8.00E−17 hypothetical protein
    31 65792 44547 46088 MCA100350 g1790041 e−153 2,3-
    bisphosphoglycerate-
    indpndnt
    phosphoglycerate
    mutase
    31 65792 46329 47333 MCA100351 g2983365 2.00E−42 carboxyl-terminal
    protease
    31 65792 59939 62041 MCA100406 g1573258 e−178 potassium/copper-
    transporting ATPase,
    putative
    31 65792 62189 62968 MCA100407
    31 65792 63137 63424 MCA100408 g1787108 7.00E−14 orf, hypothetical
    protein
    31 65792 63494 65749 MCA100409 g45972 0 URF 2
    31 65792 342 1250 MCA100493 g1787799 6.00E−40 orf, hypothetical
    protein
    31 65792 5366 7711 MCA100687 g42481 0 pyruvate, water
    dikinase
    31 65792 8122 8934 MCA100688 g1001627 5.00E−16 hypothetical protein
    31 65792 9194 11455 MCA100689 g4062515 e−117 Hypothetical protein
    HI0115
    31 65792 12030 12881 MCA100691 g1787606 5.00E−96 orf, hypothetical
    protein
    31 65792 35380 36765 MCA100702 g4155857 e−162 fumarase
    31 65792 37101 40302 MCA100703 g3928723 4.00E−77 putative ABC
    transporter
    31 65792 41558 41968 MCA100706 g4154631 1.00E−26 bacterioferritin
    comigratory protein
    31 65792 42310 43617 MCA100707 g1573080 0 conserved hypothetical
    protein
    31 65792 13827 14018 MCA100733 g1778825 7.00E−21 major cold shock
    protein CspA
    31 65792 33077 33430 MCA100775
    31 65792 47450 48073 MCA100793 g3142729 2.00E−62 response regulator
    31 65792 48273 48530 MCA100794 g2632000 3.00E−22 RpsT protein
    31 65792 48820 49518 MCA100795 g1203935 7.00E−08 coded for by C.
    elegans cDNA yk86b10.5
    31 65792 49766 52474 MCA100796 g525202 0 DNA topoisomerase
    (ATP-hydrolysing)
    31 65792 52499 53179 MCA100797 g557844 5.00E−19 orf, len: 234, CAI:
    0.26
    31 65792 53919 55553 MCA100799 g882589 4.00E−61 CG Site No. 847;
    alternate gene name
    dnaP, parB
    31 65792 55986 56600 MCA100800 g1573134 6.00E−31 lipoprotein, putative
    31 65792 30651 31190 MCA100907 g2981082 1.00E−51 GTP-cyclohydrolase
    31 65792 28838 30289 MCA100908 g4062623 5.00E−91 Novobiocin resistance-
    related protein Nov
    31 65792 27100 28536 MCA100909 g2894397 6.00E−25 TphA protein
    31 65792 26354 26986 MCA100911 g2708657 3.00E−57 ribose-5-phosphate
    isomerase
    31 65792 25195 26139 MCA100912 g1787100 3.00E−43 putative surface
    protein
    31 65792 23910 25004 MCA100913 g1789273 4.00E−39 orf, hypothetical
    protein
    31 65792 22262 23656 MCA100914 g142309 e−179 glutamine synthetase
    31 65792 53226 53429 MCA101798
    31 65792 21511 21816 MCA101835
    31 65792 17390 18373 MCA101836 g1653422 2.00E−06 hypothetical protein
    31 65792 20955 21458 MCA101838
    31 65792 1604 2059 MCA101861 g2688497 7.00E−13 carboxypeptidase,
    putative
    31 65792 2444 3820 MCA101862 g1907384 e−160 soluble pyridine
    nucleotide
    transhydrogenase
    31 65792 4190 4996 MCA101866 g1787995 2.00E−61 orf, hypothetical
    protein
    31 65792 14240 16021 MCA101867 g1651441 e−107 MsbA protein.
    31 65792 18490 19170 MCA101868 g561691 5.00E−40 LpsA
    31 65792 19197 19931 MCA101873 g1573652 1.00E−55 lipopolysaccharide
    biosynthesis protein
    31 65792 19998 20750 MCA101874 g1573652 4.00E−56 lipopolysaccharide
    biosynthesis protein
    31 65792 13103 13522 MCA101875 g3062 4.00E−41 3-dehydroquinate
    dehydratase
    32 62909 50745 52567 MCA100340 g2623969 2.00E−56 putative peptidyl-
    prolyl cis-trans
    isomerase
    32 62909 49000 50580 MCA100341 g42595 0 purH gene product
    32 62909 42928 48531 MCA100348 g1666683 1.00E−45 hsf gene product
    32 62909 8351 8881 MCA100498 g1574570 2.00E−61 conserved hypothetical
    protein
    32 62909 10103 11257 MCA100501 g1789311 e−157 methionine
    adenosyltransferase 1
    32 62909 11895 12551 MCA100503 g4062689 1.00E−56 heterocyst maturation
    protein (devA) homolog
    32 62909 12581 13813 MCA100504 g1787362 2.00E−62 putative kinase
    32 62909 6566 7315 MCA100649 g1773205 2.00E−22 similar to H.
    influenzae HI0735
    32 62909 6025 6510 MCA100650 g1786736 1.00E−52 peptidyl-prolyl cis-
    trans isomerase B
    (rotamase B)
    32 62909 4072 5826 MCA100651 g1574816 e−175 glutaminyl-tRNA
    synthetase (glnS)
    32 62909 2634 3977 MCA100652 g3850110 3.00E−60 rrm3-pif1 helicase
    homolog
    32 62909 1016 2038 MCA100654 g39921 3.00E−75 glyceraldehyde-3-
    phosphate
    dehydrogenase (AA 1-335)
    32 62909 54353 54796 MCA100831 g1573349 3.00E−38 conserved hypothetical
    protein
    32 62909 54874 56076 MCA100832 g1788879 e−169 putative
    aminotransferase
    32 62909 56256 56636 MCA100833 g1788878 3.00E−55 orf, hypothetical
    protein
    32 62909 56752 57066 MCA100834 g1573345 2.00E−30 conserved hypothetical
    protein
    32 62909 57767 59620 MCA100836 g1573342 e−135 heat shock protein
    (hscA)
    32 62909 59732 60067 MCA100837 g3925514 6.00E−39 ferredoxin
    32 62909 60693 62453 MCA100839 g3261657 3.00E−97 ggtB
    32 62909 57114 57557 MCA100980 g1799935 4.00E−17 similar to [P36540]
    32 62909 14126 14635 MCA101066
    32 62909 17539 17940 MCA101071 g2114470 5.00E−46 transposase homolog A
    32 62909 21605 22480 MCA101075 g1788819 2.00E−68 orf, hypothetical
    protein
    32 62909 22570 23385 MCA101076 g1001366 7.00E−39 hypothetical protein
    32 62909 26086 26817 MCA101080 g2367307 7.00E−95
    32 62909 27509 29122 MCA101082 g2367309 5.00E−89 orf, hypothetical
    protein
    32 62909 29170 29628 MCA101083 g1653085 8.00E−26 adenine
    phosphoribosyltransferase
    32 62909 53480 54157 MCA101204
    32 62909 31514 32173 MCA101329 g1110441 2.00E−27 hypothetical product
    32 62909 32281 34587 MCA101330 g290642 2.00E−80 ATPase
    32 62909 35413 37533 MCA101332 g1574581 e−127 penicillin-binding
    protein 1B (ponB)
    32 62909 40898 41815 MCA101337 g2367208 1.00E−56 methylase for 50S
    ribosomal subunit
    protein L11
    32 62909 41865 42068 MCA101338 g2773316 2.00E−12 small DNA binding
    protein Fis
    32 62909 62692 62907 MCA101380 g2407233 5.00E−23 similar to Haemophilus
    influenzae U32796
    32 62909 52735 53004 MCA101444 g535709 5.00E−26 HU protein
    32 62909 19635 20612 MCA101773
    32 62909 26826 27470 MCA101776
    32 62909 29954 30133 MCA101904 g1788076 5.00E−10 orf, hypothetical
    protein
    32 62909 30170 31093 MCA101910 g1800020 1.00E−54 similar to [P37768]
    32 62909 39861 40532 MCA101911 g48895 9.00E−10 acid phosphatase
    32 62909 15209 16036 MCA101913 g2649017 2.00E−16 conserved hypothetical
    protein
    32 62909 16414 17027 MCA101914 g1652952 5.00E−30 transposase
    32 62909 20712 21326 MCA101917 g244501 5.00E−42 esterase
    II = carboxylesterase
    {EC 3.1.1.1}
    32 62909 24945 25550 MCA101919 g2407235 3.00E−81 manganese superoxide
    dismutase
    32 62909 9114 9776 MCA102048 g1001410 1.00E−07 hypothetical protein
    32 62909 11483 11827 MCA102049
    33 63563 62405 62632 MCA101035 g2314031 5.00E−10 conserved hypothetical
    protein
    33 63563 56948 58870 MCA101040 g2623258 4.00E−45 putative secreted
    protein
    33 63563 21766 23691 MCA101136 g2765451 8.00E−61 nitrate/nitrite
    sensory protein
    33 63563 3 827 MCA101560 g2098763 7.00E−67 ThiI
    33 63563 31681 31896 MCA101587 g39312 3.00E−08 barstar (AA 1-90)
    33 63563 1409 2644 MCA101680 g1684734 3.00E−41 ORF396 protein
    33 63563 3749 4354 MCA101682 g1786318 2.00E−61 putative carbonic
    anhydrase (EC 4.2.1.1)
    33 63563 4569 8282 MCA101683 g1911243 0 alpha-subunit of
    nitrate reductase
    33 63563 8347 9879 MCA101684 g2765455 0 respiratory nitrate
    reductase beta subunit
    33 63563 9907 10644 MCA101685 g2765456 1.00E−40 putative chaperone
    33 63563 10719 11384 MCA101686 g2765457 2.00E−63 respiratory nitrate
    reductase gamma
    subunit
    33 63563 11872 12597 MCA101688 g2765458 6.00E−39 NifM protein
    33 63563 12741 13922 MCA101689 g1574287 9.00E−70 molybdopterin
    biosynthesis protein
    (moeA)
    33 63563 13931 15273 MCA101690 g1574545 4.00E−46 molybdenum ABC
    transporter, permease
    protein (modB)
    33 63563 15349 16047 MCA101691 g973214 2.00E−49 ModA
    33 63563 16157 16573 MCA101692 g899221 1.00E−26 potential molybdenum-
    pterin-binding-protein
    33 63563 16659 17036 MCA101693 g1001213 1.00E−26 molybdopterin (MPT)
    converting factor,
    subunit 2
    33 63563 17122 17355 MCA101694 g1673309 1.00E−09 hypothetical protein
    33 63563 17375 17827 MCA101695 g4185548 2.00E−27 molybdenum cofactor
    biosynthesis protein C
    33 63563 18520 19008 MCA101697 g42009 2.00E−50 moaB
    33 63563 19257 19745 MCA101698 g1790345 5.00E−20 orf, hypothetical
    protein
    33 63563 19849 20817 MCA101699 g1574526 1.00E−73 molybdenum cofactor
    biosynthesis protein A
    (moaA)
    33 63563 21099 21722 MCA101700 g2765450 1.00E−57 nitrate/nitrite
    regulatory protein
    33 63563 24027 25301 MCA101702 g2765452 e−100 nitrate extrusion
    protein
    33 63563 25322 26662 MCA101703 g2765453 e−131 nitrate extrusion
    protein
    33 63563 26767 27003 MCA101704 g43593 7.00E−25 IS1016-V6
    33 63563 27101 27838 MCA101705 g1256835 2.00E−37 moeB gene product
    33 63563 30824 31012 MCA101707 g39312 6.00E−08 barstar (AA 1-90)
    33 63563 31908 32282 MCA101708 g532528 5.00E−15 ribonuclease precursor
    33 63563 44513 44764 MCA101912
    33 63563 59342 60850 MCA101915 g1772622 3.00E−30 HecB
    33 63563 63286 63563 MCA101916
    34 89047 54807 56590 MCA100174 g2984323 4.00E−67 hypothetical protein
    34 89047 72751 73173 MCA100188 g1788522 2.00E−25 possible subunit of
    heme lyase
    34 89047 64432 65214 MCA100272 g1799711 8.00E−72 pseudouridylate
    synthase I (EC
    4.2.1.70)
    34 89047 64078 64287 MCA100273 g142459 7.00E−25 initiation factor 1
    34 89047 16260 18866 MCA100326 g1651269 0 Leucine-tRNA ligase
    (EC 6.1.1.4).
    34 89047 67834 68322 MCA100327 g1573775 6.00E−27 conserved hypothetical
    protein
    34 89047 68604 69926 MCA100329
    34 89047 70103 72067 MCA100330 g1174237 e−175 CycK
    34 89047 8218 9123 MCA100410 g1420863 e−140 oligopeptidepermease
    34 89047 9349 11319 MCA100411 g1420859 0 oligopeptidepermease
    34 89047 11462 11734 MCA100412 g1817528 7.00E−13 component protein of
    adhesin complex
    34 89047 12117 12434 MCA100413 g1817528 1.00E−14 component protein of
    adhesin complex
    34 89047 31288 32337 MCA100432 g3212213 e−120 H. influenzae
    predicted coding
    region HI1126.1
    34 89047 30886 31281 MCA100623 g3212214 8.00E−48 H. influenzae
    predicted coding
    region HI1127
    34 89047 3573 4214 MCA100666 g1573906 6.00E−96 H. influenzae
    predicted coding
    region HI0882
    34 89047 4621 6105 MCA100667 g1420860 0 oligopeptidepermease
    34 89047 6109 7032 MCA100668 g1420861 e−145 oligopeptidepermease
    34 89047 7081 8115 MCA100669 g1420862 e−163 oligopeptidepermease
    34 89047 26541 28064 MCA100734 g2984319 2.00E−95 Na(+):solute symporter
    (Ssf family)
    34 89047 24901 25710 MCA100736 g1513082 5.00E−67 ATPase
    34 89047 23328 24365 MCA100738 g1786606 8.00E−89 S-
    adenosylmethionine:tRNA
    ribosyltransferase-
    isomerase
    34 89047 22063 23202 MCA100739 g1573209 e−147 tRNA-guanine
    transglycosylase (tgt)
    34 89047 20280 21854 MCA100740 g536958 2.00E−74 yjdB gene product
    34 89047 19010 19351 MCA100742 g1573052 7.00E−15 conserved hypothetical
    protein
    34 89047 72176 72649 MCA100857 g929791 1.00E−22 periplasmic or inner
    membrane associated
    protein
    34 89047 60817 61410 MCA101043 g312708 5.00E−41 miaE
    34 89047 59356 60669 MCA101044 g1790609 8.00E−39 orf, hypothetical
    protein
    34 89047 57906 58931 MCA101045 g1573704 7.00E−40 conserved hypothetical
    protein
    34 89047 56828 57394 MCA101047 g3328430 3.00E−71 Deoxycytidine
    triphosphate deaminase
    family protein
    34 89047 52985 53889 MCA101051 g2636549 2.00E−22 similar to
    hypothetical proteins
    34 89047 51712 52935 MCA101052 g216628 4.00E−52 UbiH (VisB)
    34 89047 50505 51353 MCA101053 g1787880 7.00E−32 putative transport
    protein
    34 89047 48105 50117 MCA101054 g148182 e−177 rep helicase
    34 89047 46737 47753 MCA101056 g537005 4.00E−58 ORF_f337
    34 89047 74796 75440 MCA101231 g4520134 7.00E−73 adenylate kinase
    34 89047 78867 80283 MCA101233 g3861163 9.00E−74 2-
    acylglycerophosphoethanolamine
    acyltransferase
    34 89047 82080 83144 MCA101235 g1573700 1.00E−28 conserved hypothetical
    protein
    34 89047 85493 88297 MCA101238 g1573699 2.00E−69 conserved hypothetical
    protein
    34 89047 45297 45752 MCA101341 g1790038 3.00E−37 protein export;
    molecular chaperone
    34 89047 44704 45165 MCA101342 g41300 4.00E−46 dUTPase (dut)
    34 89047 44243 44665 MCA101343 g2984288 1.00E−33 acetylglutamate kinase
    34 89047 43444 44199 MCA101344 g2462049 1.00E−14 hypothetical protein
    34 89047 42700 43350 MCA101345 g1763619 6.00E−19 potassium channel
    alpha subunit
    34 89047 39885 40328 MCA101347 g42848 6.00E−32 ribosome protein L9
    (aa 1-149)
    34 89047 39641 39865 MCA101348 g1573530 5.00E−29 ribosomal protein S18
    (rpS18)
    34 89047 39224 39610 MCA101349 g42845 2.00E−35 ribosomal protein S6
    (aa 1-131)
    34 89047 36447 37520 MCA101351 g1789272 1.00E−96 tetrahydrofolate-
    dependent
    aminomethyltransferase
    34 89047 35751 36128 MCA101352 g1789271 8.00E−40 carrier of aminomethyl
    moiety via lipoyl
    cofactor
    34 89047 32628 35462 MCA101353 g304892 0 gcvHP
    34 89047 28777 30564 MCA101356 g3212231 e−141 TonB-dependent
    receptor, putative
    34 89047 73261 74523 MCA101532
    34 89047 45820 46071 MCA101632 g3860768 7.00E−16 glutaredoxin 3
    34 89047 62090 63166 MCA101727 g1922276 2.00E−15 porin
    34 89047 25927 26316 MCA101860 g4545096 5.00E−09 unknown
    34 89047 38043 38363 MCA101920 g4062756 3.00E−08 Hypothetical protein
    HI1446
    34 89047 66384 67498 MCA101922 g1420975 e−130 aspartate semialdehyde
    dehydrogenase
    34 89047 57510 57803 MCA102061
    34 89047 403 2859 MCA102062 g2983163 5.00E−07 outer membrane protein c
    34 89047 3164 3520 MCA102063
    34 89047 38496 38981 MCA102068
    34 89047 13061 14095 MCA102070 g4456807 4.00E−07 hypothetical protein
    34 89047 40804 41724 MCA102072
    34 89047 41911 42456 MCA102073 g1790149 3.00E−12 orf, hypothetical
    protein
    35 96109 63603 63740 MCA100010 g3603060 9.00E−11 ribosomal protein L36
    35 96109 63882 64673 MCA100011 g609333 6.00E−61 orf272
    35 96109 781 1275 MCA100095 g1789019 5.00E−25 orf, hypothetical
    protein
    35 96109 31479 31784 MCA100151 g149064 4.00E−07 insb (putative);
    putative
    35 96109 16679 17584 MCA100238 g1574277 9.00E−55 geranyltranstransferase
    (ispA)
    35 96109 15484 16293 MCA100239 g146864 5.00E−60 A/G-specific adenine
    glycosylase
    35 96109 14399 14971 MCA100241 g1314160 3.00E−20 mitochondrial nuclease
    35 96109 330 551 MCA100571 g1173842 2.00E−20 acyl carrier protein
    35 96109 91699 93600 MCA100613 g1574199 0 threonyl-tRNA
    synthetase (thrS)
    35 96109 18008 18937 MCA100723 g1574400 3.00E−61 2-hydroxyacid
    dehydrogenase
    35 96109 19173 22007 MCA100724 g1786245 0 probable ATP-dependent
    RNA helicase
    35 96109 23729 25783 MCA100726 g2695959 0 fadH
    35 96109 64879 65883 MCA100851 g2198496 2.00E−51 B1306.06c protein
    35 96109 68453 68746 MCA100854 g144052 5.00E−18 outer membrane protein A
    35 96109 69092 69673 MCA100855 g1573697 3.00E−46 conserved hypothetical
    protein
    35 96109 69937 71532 MCA100856 g790611 9.00E−63 unknown
    35 96109 72055 72594 MCA100858 g2160520 2.00E−32 ORF1; similar to E
    coli L28082
    35 96109 72778 73755 MCA100859
    35 96109 73860 74870 MCA100860 g3257505 2.00E−32 homocysteine S-
    methyltransferase
    35 96109 89648 90142 MCA100884 g290449 6.00E−45 initiation factor 3
    35 96109 86580 88901 MCA100886 g1790622 e−148 putative enzyme
    35 96109 83852 85201 MCA100889 g2558473 e−124 Na-translocating NADH-
    quinone reductase
    alpha-subunit
    35 96109 82641 83837 MCA100890 g1573123 e−138 NADH:ubiquinone
    oxidoreductase,
    subunit B (nqrB)
    35 96109 81848 82621 MCA100891 g2558475 2.00E−42 Na-translocating NADH-
    quinone reductase
    gamma-subunit
    35 96109 81207 81806 MCA100892 g1573125 2.00E−71 NADH:ubiquinone
    oxidoreductase, Na
    translocating
    35 96109 80542 81147 MCA100893 g2558477 2.00E−78 Na-translocating NADH-
    quinone reductase
    subunit 5
    35 96109 79287 80495 MCA100894 g1573127 e−164 Na-translocating NADH-
    quinone reductase
    beta-subunit
    35 96109 22117 23637 MCA100915 g1001214 e−134 hypothetical protein
    35 96109 2411 4147 MCA100916 g1786265 0 acetolactate synthase
    III, val sensitive,
    large subunit
    35 96109 4168 4656 MCA100917 g1786266 6.00E−44 acetolactate synthase
    III, val sensitive,
    small subunit
    35 96109 4835 5848 MCA100918 g2529237 e−125 acetohydroxy acid
    isomeroreductase
    35 96109 5960 6370 MCA100919
    35 96109 6718 6918 MCA100920 g4454361 4.00E−22 cold shock protein,
    CSPA
    35 96109 7163 7651 MCA100921 g1573284 2.00E−42 crossover junction
    endodeoxyribonuclease
    (ruvC)
    35 96109 7852 8388 MCA100922
    35 96109 8484 9779 MCA100923 g3298336 1.00E−65 NorM
    35 96109 10000 11088 MCA100924 g1574692 5.00E−58 cell division protein
    (ftsW)
    35 96109 11357 12736 MCA100925 g1574691 1.00E−75 UDP-N-
    acetylmuramoylalanine-
    -D-glutamate ligase
    35 96109 12938 13273 MCA100926 g2735324 7.00E−44 PII-protein
    35 96109 66095 66631 MCA100978 g3323304 7.00E−13 glpG protein, putative
    35 96109 26724 27458 MCA101006 g473823 3.00E−85 ‘methionine
    aminopeptidase’
    35 96109 27687 30377 MCA101007 g39257 e−153 uridylyl transferase
    35 96109 30510 31373 MCA101008
    35 96109 32708 33978 MCA101010 g1788783 3.00E−40 putative prophage
    integrase
    35 96109 35233 36276 MCA101012
    35 96109 36398 37465 MCA101013
    35 96109 37547 37858 MCA101014
    35 96109 37855 38175 MCA101015
    35 96109 56595 57344 MCA101109 g1573676 4.00E−56 integrase/recombinase
    (xerC)
    35 96109 39637 39939 MCA101486
    35 96109 40057 40410 MCA101487
    35 96109 45467 46231 MCA101490 g1573242 2.00E−36 ribonuclease BN (rbn)
    35 96109 46598 46957 MCA101491 g3493605 3.00E−30 Trp repressor binding
    protein
    35 96109 47185 47616 MCA101492
    35 96109 48860 49144 MCA101494 g149688 3.00E−32 htpA
    35 96109 49273 50910 MCA101495 g499206 0 GroEL
    35 96109 51130 51963 MCA101496 g1789192 1.00E−74 prolipoprotein
    diacylglyceryl
    transferase
    35 96109 51990 52829 MCA101497 g2258280 2.00E−97 thymidylate synthase
    35 96109 52856 53290 MCA101498 g665643 1.00E−28 dihydrofolate
    reductase
    35 96109 53413 54426 MCA101499 g1573128 3.00E−47 lipoprotein, putative
    35 96109 54579 55025 MCA101500
    35 96109 55115 56281 MCA101501 g216628 1.00E−35 UbiH (VisB)
    35 96109 57647 58471 MCA101503 g1790242 4.00E−80 diaminopimelate
    epimerase
    35 96109 58748 59965 MCA101504 g1929094 e−110 LysA protein
    35 96109 60612 61766 MCA101506 g1405880 5.00E−83 acetate kinase
    35 96109 62334 63320 MCA101508 g1574131 e−127 phosphate
    acetyltransferase
    (pta)
    35 96109 26139 26477 MCA101763 g2564977 4.00E−09 hypothetical protein
    35 96109 41837 43138 MCA101842 g1033120 3.00E−15 ORF_o469
    35 96109 85730 86452 MCA101876 g836646 9.00E−64 phosphoribosylformimino-
    5-aminoimidazole
    carboxamide
    35 96109 89243 89524 MCA101877 g42742 2.00E−11 rimI protein (AA 1-161)
    35 96109 75011 75493 MCA101878 g4062570 5.00E−37 4-hydroxyphenylacetate
    3-monooxygenase (EC
    1.14.13.3)
    35 96109 75733 77289 MCA101881 g1787597 7.00E−94 putative pump protein
    (transport)
    35 96109 77651 79135 MCA101882 g1573949 0 catalase (hktE)
    35 96109 38185 38586 MCA101930
    35 96109 40762 41004 MCA102021 g2313086 1.00E−08 DNA primase (dnaG)
    35 96109 43196 43354 MCA102022
    35 96109 95181 95342 MCA102078
    36 92407 91233 91847 MCA100081 g2635437 1.00E−27 similar to protease IV
    36 92407 50092 50511 MCA100085 g1574283 3.00E−53 ribosomal protein L13
    (rpL13)
    36 92407 49696 50073 MCA100086 g241867 3.00E−44 ribosomal protein S9
    homolog = rpsI
    36 92407 7088 7378 MCA100136 g2865528 1.00E−13 mono-heme c-type
    cytochrome ScyA
    36 92407 7748 8335 MCA100137 g516878 3.00E−35 cytochrome c4
    preprotein
    36 92407 14107 15696 MCA100530 g581070 e−144 acyl coenzyme A
    synthetase
    36 92407 12531 13733 MCA100531 g1573978 2.00E−83 DNA/pantothenate
    metabolism
    flavoprotein (dfp)
    36 92407 11001 12140 MCA100532 g551299 e−106 Na+/H+ antiporter
    36 92407 16025 17620 MCA100708 g581070 e−166 acyl coenzyme A
    synthetase
    36 92407 17919 18623 MCA100709 g1079663 6.00E−79 RNase PH
    36 92407 18634 19089 MCA100710
    36 92407 19908 20546 MCA100712 g436881 2.00E−34 outer membrane
    phospholipase A
    36 92407 20579 21427 MCA100713
    36 92407 21387 21977 MCA100714
    36 92407 21974 22960 MCA100715
    36 92407 22957 23763 MCA100716
    36 92407 816 1589 MCA100752 g2984360 7.00E−71 thiamine biosynthesis,
    thiazole moiety
    36 92407 1761 3098 MCA100753 g2960158 7.00E−59 hypothetical protein
    Rv3734c
    36 92407 3243 5234 MCA100754 g1574731 0 methionyl-tRNA
    synthetase (metG)
    36 92407 5571 6977 MCA100755 g41206 e−132 cysteinyl-tRNA
    synthetase
    36 92407 61788 63133 MCA100840 g1788963 e−156 GTP-binding export
    factor
    36 92407 63356 64015 MCA100842 g1788109 4.00E−20 orf, hypothetical
    protein
    36 92407 64186 64992 MCA100843 g1789437 4.00E−43 bacitracin resistance
    36 92407 65314 65850 MCA100844 g3851182 5.00E−14 unknown
    36 92407 65942 66205 MCA100845
    36 92407 66244 67065 MCA100846 g396375 5.00E−64 4-hydroxybenzoate-
    octaprenyl transferase
    36 92407 67362 68897 MCA100847 g1449339 e−137 pitB
    36 92407 69294 69974 MCA100848 g606374 9.00E−53 ORF_f231
    36 92407 70365 70850 MCA100849 g1574067 2.00E−34 conserved hypothetical
    protein
    36 92407 70982 71563 MCA100850 g497127 2.00E−55 RNase T
    36 92407 38857 39717 MCA100927 g4376782 5.00E−12 CT391 hypothetical
    protein
    36 92407 40914 41549 MCA100929 g3860928 5.00E−25 ABC transporter ATP-
    binding protein
    36 92407 42061 44601 MCA100931 g1573874 0 ATP-dependent Clp
    protease, ATPase
    subunit (clpB)
    36 92407 45517 45870 MCA100933 g1574279 2.00E−28 stringent starvation
    protein B (sspB)
    36 92407 45891 46442 MCA100934 g42998 6.00E−33 SSP (AA1-212)
    36 92407 46643 47320 MCA100935 g2642363 1.00E−39 cytochrome c1
    36 92407 47395 48567 MCA100936 g2642362 e−133 cytochrome b
    36 92407 48597 49166 MCA100937 g2642361 4.00E−48 Fe-S protein
    36 92407 88972 90090 MCA101033 g305386 6.00E−21 recombination protein
    36 92407 81971 82912 MCA101037 g1377868 2.00E−47 cbb3-type cytochrome c
    oxidase CcoP subunit
    36 92407 71602 72657 MCA101086 g3868712 e−114 dihydroorotase
    36 92407 72855 74180 MCA101087 g1574583 0 argininosuccinate
    synthetase (argG)
    36 92407 74397 74897 MCA101088
    36 92407 75049 75960 MCA101089 g3643996 2.00E−30 putative regulatory
    protein
    36 92407 76983 78173 MCA101091 g152210 4.00E−68 nitrogen fixation
    protein fixG
    36 92407 79617 80960 MCA101093 g1552601 e−179 FixNd
    36 92407 81064 81636 MCA101094 g1002879 3.00E−56 CcoO
    36 92407 83103 84722 MCA101097 g1574630 0 CTP synthetase (pyrG)
    36 92407 84893 85729 MCA101098 g4235471 e−114 2-dehydro-3-
    deoxyphosphooctonate
    aldolase
    36 92407 85823 87097 MCA101099 g1789141 e−156 enolase
    36 92407 87210 87455 MCA101100 g1789105 4.00E−08 orf, hypothetical
    protein
    36 92407 87621 88316 MCA101101 g1573673 3.00E−36 conserved hypothetical
    protein
    36 92407 39980 40804 MCA101148 g3860927 4.00E−24 unknown
    36 92407 59021 60271 MCA101153 g42913 1.00E−58 ORF 45 peptide (AA 1-400)
    36 92407 55081 58941 MCA101154 g42914 1.00E−59 SbcC (AA 1-1048)
    36 92407 51152 52987 MCA101156 g581463 0 homologous to E. coli
    gidA
    36 92407 35356 36111 MCA101172 g1651445 2.00E−42 SmtA protein.
    36 92407 33986 35242 MCA101173 g1245347 2.00E−43 AlgI
    36 92407 30688 31161 MCA101176 g2765835 2.00E−29 hypothetical protein
    36 92407 29194 30474 MCA101177 g3132889 1.00E−62 WaaA
    36 92407 26469 28985 MCA101178 g1574460 e−160 aminopeptidase N
    (pepN)
    36 92407 25542 26057 MCA101179 g663068 1.00E−26 PAL
    36 92407 8594 9688 MCA101272
    36 92407 9676 10008 MCA101294
    36 92407 24074 24832 MCA101848
    36 92407 36281 37267 MCA101850
    36 92407 37432 38508 MCA101851 g3860926 1.00E−08 unknown
    36 92407 60775 61569 MCA101909 g1788964 2.00E−15 orf, hypothetical
    protein
    36 92407 81687 81869 MCA101928
    36 92407 53341 54315 MCA101944
    36 92407 54504 54968 MCA101945
    37 99629 69767 70210 MCA100038 g1718488 6.00E−34 FabZ
    37 99629 70275 71039 MCA100039 g1786378 3.00E−77 UDP-N-
    acetylglucosamine
    acetyltransferase
    37 99629 71432 72897 MCA100082 g1573742 e−119 sodium-dependent
    transporter, putative
    37 99629 76489 78342 MCA100169 g2599340 2.00E−40 protein-disulfide
    reductase
    37 99629 51376 52041 MCA100276 g2865530 3.00E−30 cytochrome c
    maturation protein B
    37 99629 73294 74871 MCA100290 g142301 e−168 cytochrome d subunit
    Ia
    37 99629 74913 76046 MCA100291 g1786954 2.00E−99 cytochrome d terminal
    oxidase polypeptide
    subunit II
    37 99629 66172 68571 MCA100323 g1552754 e−123 hypothetical protein
    37 99629 68643 69560 MCA100324 g1573936 2.00E−56 UDP-3-O-(3-
    hydroxymyristoyl)-
    glucosamine N-
    acyltransfer
    37 99629 33622 34110 MCA100374 g1574669 1.00E−31 thioredoxin, putative
    37 99629 32014 33450 MCA100375 g1573139 e−105 amino acid carrier
    protein, putative
    37 99629 2692 5811 MCA100461 g438854 0 envD homologue; ORFB
    37 99629 5884 7308 MCA100564 g3184190 3.00E−77 OprM
    37 99629 8308 9618 MCA100566 g1061260 2.00E−68 putative protein
    37 99629 9973 11343 MCA100567 g1788397 e−165 orf, hypothetical
    protein
    37 99629 11391 12323 MCA100568 g2314272 6.00E−88 cytosine specific DNA
    methyltransferase
    (BSP6IM)
    37 99629 2 532 MCA100700 g1786393 5.00E−27 orf, hypothetical
    protein
    37 99629 56471 57733 MCA100776 g1651420 e−145 Serine-tRNA ligase (EC
    6.1.1.11)
    37 99629 57951 59921 MCA100777 g2367177 0 transketolase 1
    isozyme
    37 99629 60119 60835 MCA100778 g3417448 1.00E−67 UMP kinase
    37 99629 60950 61501 MCA100779 g3417449 1.00E−63 ribosome recycling
    factor
    37 99629 61598 62323 MCA100780 g1786371 5.00E−54 orf, hypothetical
    protein
    37 99629 62522 63199 MCA100781 g1262332 5.00E−39 CDP-diglyceride
    synthetase
    37 99629 63358 64560 MCA100782 g1786369 3.00E−85 putative ATP-binding
    component of a
    transport system
    37 99629 64584 65951 MCA100783 g1552753 8.00E−83 hypothetical
    37 99629 34923 35243 MCA100789 g142304 3.00E−52 ferredoxin I
    37 99629 1269 2564 MCA100852 g532310 1.00E−61 42 kDa protein
    37 99629 26942 30208 MCA101055 g2367096 0 isoleucine tRNA
    synthetase
    37 99629 83288 84046 MCA101084 g1789140 3.00E−18 orf, hypothetical
    protein
    37 99629 30484 31758 MCA101163 g4062560 e−147 Uracil transport
    protein
    37 99629 38692 40539 MCA101256
    37 99629 40499 41389 MCA101257
    37 99629 43223 46123 MCA101259 g1574225 0 valyl-tRNA synthetase
    (valS)
    37 99629 46207 47085 MCA101260 g303628 e−161 MboI methyltransferase A
    37 99629 47093 47932 MCA101261 g303629 e−151 MboI endonuclease
    37 99629 47937 48755 MCA101262 g303630 e−145 MboI methyltransferase C
    37 99629 50795 51373 MCA101265 g46024 2.00E−25 helA
    37 99629 26437 26910 MCA101360 g151348 3.00E−35 signal peptidase II
    37 99629 25749 26177 MCA101361 g151349 2.00E−26 ORF149
    37 99629 24426 25547 MCA101362 g1835114 1.00E−95 homoserine O-
    acetyltransferase
    37 99629 23029 23605 MCA101364 g4062259 6.00E−14 Sel-1 protein
    37 99629 20479 22755 MCA101365 g308942 0 major outer membrane
    protein
    37 99629 18600 20063 MCA101366 g38720 0 IMP dehydrogenase
    37 99629 17326 18006 MCA101368 g3135321 7.00E−33 putative
    thiol:disulfide
    interchange protein
    precursor
    37 99629 15653 16846 MCA101369 g45329 8.00E−97 homoserine
    dehydrogenase
    37 99629 14813 15373 MCA101370 g1790296 1.00E−55 orf, hypothetical
    protein
    37 99629 13917 14735 MCA101371 g606086 6.00E−72 ORF_f286
    37 99629 78730 80198 MCA101417 g141886 0 acetaldehyde
    dehydrogenase II
    37 99629 80403 81914 MCA101418 g2635246 e−118 similar to
    sodium/proton-
    dependent alanine
    carrier prot
    37 99629 82372 82926 MCA101419 g3322862 1.00E−33 Tp70 protein
    37 99629 84049 84567 MCA101421
    37 99629 98444 98752 MCA101422 g216636 3.00E−21 ribosomal protein L21
    37 99629 85377 86027 MCA101423 g4102010 2.00E−38 putative transposase
    37 99629 86093 86667 MCA101424 g4512224 2.00E−26 Similar to IS1301 of
    Neisseria meningitidis
    37 99629 86955 88568 MCA101426 g1747491 0 alxA
    37 99629 88573 89919 MCA101427 g1685099 4.00E−56 HSDS
    37 99629 91158 94300 MCA101429 g1685100 0 HSDR
    37 99629 94381 95240 MCA101430 g1786518 6.00E−66 putative
    oxidoreductase
    37 99629 95287 95940 MCA101431 g1574733 5.00E−72 NAD(P)H-flavin
    oxidoreductase
    37 99629 96051 97094 MCA101432 g1303964 2.00E−70 YqjM
    37 99629 97366 98229 MCA101433 g150233 6.00E−30 nahR protein precursor
    37 99629 98820 99074 MCA101440 g216637 2.00E−28 ribosomal protein L27
    37 99629 13079 13333 MCA101463 g1518927 6.00E−28 ferredoxin
    37 99629 13439 13879 MCA101466 g1575483 3.00E−23 LporfX
    37 99629 12334 13065 MCA101598 g4155637 9.00E−79 putative
    37 99629 53924 54736 MCA101923 g765096 2.00E−94 heat-shock sigma
    factor
    37 99629 36268 37779 MCA101924 g1787309 e−103 putative virulence
    factor
    37 99629 37994 38530 MCA101929 g4079828 8.00E−45 N-acetyl-
    anhydromuramyl-L-
    alanine amidase
    37 99629 41474 42911 MCA101936 g2633081 e−119 similar to 2-
    oxoglutarate/malate
    translocator
    37 99629 48799 49662 MCA101938 g580726 7.00E−63 Portion of
    hypothetical protein
    37 99629 52121 52933 MCA101939 g3513356 3.00E−39 hypothetical protein
    37 99629 89930 91132 MCA102002
    38 94750 82819 83559 MCA100037 g1573162 3.00E−71 tRNA (guanine-N1)-
    methyltransferase
    (trmD)
    38 94750 83736 84065 MCA100220 g1800011 8.00E−36 ribosomal protein L19
    38 94750 84195 84599 MCA100221 g145063 8.00E−31 two-subunit pilin
    precursor
    38 94750 38362 39300 MCA100287
    38 94750 39368 40069 MCA100288 g39705 3.00E−27 fimC
    38 94750 37413 38177 MCA100301 g1573311 4.00E−49 conserved hypothetical
    protein
    38 94750 36351 37259 MCA100302 g1786208 7.00E−49 putative regulator
    38 94750 43520 43906 MCA100403 g1055071 7.00E−33 C23G10.2 gene product
    38 94750 40106 42352 MCA100405 g147345 e−140 primosomal protein n'
    38 94750 601 1360 MCA100435 g2633826 1.00E−30 similar to
    hypothetical proteins
    38 94750 1401 2000 MCA100436 g1001747 1.00E−40 alkaline phosphatase-
    like
    38 94750 2433 3071 MCA100437 g1574697 4.00E−12 cell division protein
    (ftsQ)
    38 94750 3143 4201 MCA100438 g2738588 5.00E−23 cell division protein
    38 94750 77707 78381 MCA100467 g1079807 9.00E−42 RstA
    38 94750 79179 80048 MCA100469 g1742648 4.00E−37 Sensor protein RstB
    (EC 2.7.3.-).
    38 94750 81833 82078 MCA100471 g1573164 3.00E−25 ribosomal protein S16
    (rpS16)
    38 94750 82288 82782 MCA100472 g1573163 7.00E−26 conserved hypothetical
    protein
    38 94750 29640 30077 MCA100521 g4164224 3.00E−55 ferric uptake
    regulator
    38 94750 30269 31297 MCA100522 g151490 7.00E−90 twitching motility
    protein
    38 94750 31720 32301 MCA100523 g454838 7.00E−51 ORF 6; putative
    38 94750 32364 33974 MCA100524 g1653472 e−120 NH(3)-dependent NAD(+)
    synthetase
    38 94750 25258 27037 MCA100546 g2735093 0 ubiquitous surface
    protein A 2
    38 94750 27198 28070 MCA100547 g2677632 1.00E−66 methionine regulatory
    protein MetR
    38 94750 28330 28986 MCA100548 g1799710 3.00E−47 dedA protein
    38 94750 70429 71286 MCA100628 g669111 9.00E−79 alternate atpB CDS
    38 94750 71347 71586 MCA100629 g1573462 1.00E−14 ATP synthase F0,
    subunit c (atpE)
    38 94750 71683 72144 MCA100630 g581814 4.00E−30 uncF (AA 1-156)
    38 94750 72160 72699 MCA100631 g48336 9.00E−26 uncH (AA 1-177)
    38 94750 72749 74284 MCA100632 g1790172 0 membrane-bound ATP
    synthase, F1 sector,
    alpha-subunit
    38 94750 74372 75238 MCA100633 g1790171 3.00E−96 membrane-bound ATP
    synthase, F1 sector,
    gamma-subunit
    38 94750 75694 77103 MCA100635 g1573457 0 ATP synthase F1,
    subunit beta (atpD)
    38 94750 77188 77586 MCA100636 g1573456 2.00E−16 ATP synthase F1,
    subunit epsilon (atpC)
    38 94750 42399 43304 MCA100808 g1788771 1.00E−66 orf, hypothetical
    protein
    38 94750 23867 24892 MCA101243 g1573514 e−106 O-sialoglycoprotein
    endopeptidase (gcp)
    38 94750 29005 29400 MCA101264 g1033113 1.00E−11 ORF_o113
    38 94750 4673 5742 MCA101528 g216509 3.00E−82 cell division protein
    ftsZ
    38 94750 5866 6756 MCA101529 g1574235 1.00E−42 conserved hypothetical
    protein
    38 94750 7767 8792 MCA101531 g440089 e−137 RecA
    38 94750 9699 11027 MCA101533 g3876615 e−112 Similar to Yeast D-
    lactate dehydrogenase
    (SW: DLD1_YEAST)
    38 94750 11050 11592 MCA101534
    38 94750 11674 12723 MCA101535
    38 94750 12838 13641 MCA101536 g1573029 1.00E−27 conserved hypothetical
    protein
    38 94750 13667 14434 MCA101537 g1789177 1.00E−42 putative enzyme
    38 94750 14676 15545 MCA101538 g1574480 e−101 2,3,4,5-
    tetrahydropyridine-2-
    carboxylate N-
    succinyltransf
    38 94750 16830 17747 MCA101540 g1572971 3.00E−93 lipoate biosynthesis
    protein A (lipA)
    38 94750 18269 19222 MCA101542 g1786681 2.00E−89 ferrochelatase: final
    enzyme of heme
    biosynthesis
    38 94750 19956 21070 MCA101544 g1652222 9.00E−44 hypothetical protein
    38 94750 21261 23480 MCA101545 g1030696 0 isocitrate
    dehydrogenase
    38 94750 44197 46308 MCA101565 g1574600 9.00E−78 guanosine-3′,5′-
    bis(diphosphate) 3′-
    pyrophosphohydrolase
    38 94750 46693 46932 MCA101566 g1574602 1.00E−14 DNA-directed RNA
    polymerase, omega
    chain (rpoZ)
    38 94750 47038 47643 MCA101567 g290498 2.00E−50 5′ guanylate kinase
    38 94750 47816 48742 MCA101568 g216456 e−110 hypothetical 34.8 K
    protein (PIR: JE0403)
    38 94750 48853 50493 MCA101569 g1789259 e−124 ssDNA exonuclease,
    5′-> 3′ specific
    38 94750 50589 51176 MCA101570 g290496 2.00E−33 o223
    38 94750 51346 52017 MCA101572 g2984272 3.00E−19 hypothetical protein
    38 94750 52519 53892 MCA101574 g2340815 0 L-2,4-
    diaminobutyrate:2-
    ketoglutarate 4-
    aminotransferase
    38 94750 54051 55967 MCA101575 g4454667 e−134 methyltransferase
    38 94750 55995 58601 MCA101576 g4454668 0 restriction
    endonuclease
    38 94750 58652 60190 MCA101577 g893355 0 L-2,4-diaminobutyrate
    decarboxylase
    38 94750 60278 62041 MCA101578 g472402 e−128 UVR excinuclease
    subunit C
    38 94750 62223 62858 MCA101579 g1573552 2.00E−44 phosphoglycolate
    phosphatase (gph)
    38 94750 63199 63741 MCA101580
    38 94750 63889 64746 MCA101581 g1786337 1.00E−42 putative tRNA
    synthetase
    38 94750 64772 65185 MCA101582 g1786338 4.00E−43 dnaK suppressor
    protein
    38 94750 65335 66003 MCA101583 g882562 1.00E−23 icc gene product
    38 94750 66160 66916 MCA101584 g1573380 3.00E−27 conserved hypothetical
    integral membrane
    protein
    38 94750 66967 67674 MCA101585 g1736501 1.00E−47 Sulfate transport ATP-
    binding protein CysA.
    38 94750 67700 68140 MCA101586 g1790480 7.00E−20 putative regulator
    38 94750 69471 69878 MCA101588
    38 94750 75267 75602 MCA101681
    38 94750 68546 69241 MCA101853 g1788164 3.00E−16 putative adhesin
    38 94750 34301 34576 MCA101890
    38 94750 35674 36312 MCA101892
    38 94750 87827 89506 MCA101940 g409365 0 urocanase
    38 94750 89601 91106 MCA101941 g151274 e−164 histidine ammonia-lyase
    (hutH) precursor
    (gtg start codon) (E.C.
    4.3.1.3)
    38 94750 91634 92272 MCA101942 g149204 5.00E−35 histidine utilization
    repressor G
    38 94750 92575 93723 MCA101946 g4106576 e−109 ORF9, highly similar
    to imidazolone
    propionate hydrolase
    38 94750 15658 16503 MCA101947 g2285919 1.00E−13 K5L + K6L
    38 94750 6816 7307 MCA101948 g1321618 6.00E−16
    38 94750 80209 81537 MCA101953 g3402275 1.00E−51 EnvZ protein
    38 94750 85007 87612 MCA101955 g2367097 0 aconitate hydrase B
    39 100848 79190 79684 MCA100004 g1835603 1.00E−30 15 kDa protein
    39 100848 77575 78220 MCA100013 g49095 2.00E−47 triosephosphate
    isomerase
    39 100848 33560 34450 MCA100033 g1786984 3.00E−38 putative
    transcriptional
    regulator LYSR-type
    39 100848 16050 17411 MCA100152 g154205 e−139 phosphomannomutase
    39 100848 38007 39128 MCA100236 g1574558 2.00E−27 conserved hypothetical
    protein
    39 100848 39149 40258 MCA100237 g1790713 7.00E−15 orf, hypothetical
    protein
    39 100848 13324 14526 MCA100260 g1788092 4.00E−39 putative amino
    acid/amine transport
    protein
    39 100848 14586 15035 MCA100261
    39 100848 15091 15930 MCA100262 g1773171 4.00E−38 similar to M.
    tuberculosis
    MTCY277.09
    39 100848 36123 37547 MCA100305 g2984771 e−101 PhpA
    39 100848 34625 35815 MCA100306 g409800 e−132 tyrosine
    aminotransferase
    39 100848 89115 89381 MCA100389 g429056 1.00E−26 ribosomal protein S15
    39 100848 89607 91682 MCA100390 g3650364 0 polyribonucleotide
    nucleotidyltransferase
    39 100848 91827 92300 MCA100391 g2959336 4.00E−46 hypothetical protein
    39 100848 92532 92957 MCA100392 g1100876 5.00E−19 hypothetical OrfY
    39 100848 92969 93382 MCA100393 g1789538 2.00E−08 orf, hypothetical
    protein
    39 100848 93467 94066 MCA100394 g1789540 1.00E−06 putative periplasmic
    protein
    39 100848 28411 29109 MCA100525 g41638 3.00E−64 PufX protein
    39 100848 30030 30761 MCA100527 g1742082 8.00E−54 Internalin B
    39 100848 30895 32214 MCA100528 g537059 e−129 ORF_f447
    39 100848 32302 33378 MCA100529 g2916960 2.00E−46 chaA
    39 100848 94363 94614 MCA100761 g415661 4.00E−14 putative; ORF3
    39 100848 94621 95874 MCA100762 g415662 e−141 UDP-N-
    acetylglucosamine 1-
    carboxyvinyl
    transferase
    39 100848 95992 96555 MCA100763 g2636005 8.00E−43 ATP
    phosphoribosyltransferase
    39 100848 96820 98121 MCA100764 g2983343 e−101 histidinol
    dehydrogenase
    39 100848 98225 99295 MCA100765 g440346 3.00E−99 histidinol phosphate
    aminotransferase
    39 100848 99499 100359 MCA100766 g2984079 1.00E−41 fumarate hydratase
    (fumarase)
    39 100848 79796 81271 MCA100801 g1789560 e−128 transcription pausing;
    L factor
    39 100848 81439 84168 MCA100802 g3850831 0 initiation factor IF2-
    alpha
    39 100848 86548 86931 MCA100804 g606107 2.00E−17 P15B
    39 100848 86964 87845 MCA100805 g1574748 2.00E−54 tRNA pseudouridine 55
    synthase (truB)
    39 100848 67997 69420 MCA100815 g717082 e−139 glutamyl-tRNA
    synthetase
    39 100848 69744 70682 MCA100816 g42318 8.00E−73 orfB
    39 100848 70742 71092 MCA100817
    39 100848 71246 73027 MCA100818 g840842 2.00E−81 penicillin-binding
    protein 3
    39 100848 73207 74637 MCA100819 g1574688 2.00E−74 UDP-N-acetylmuramyl-
    tripeptide synthetase
    (murE)
    39 100848 74755 76140 MCA100820 g1786274 9.00E−76 D-alanine:D-alanine-
    adding enzyme
    39 100848 76209 77270 MCA100821 g1574690 e−105 phospho-N-
    acetylmuramoyl-
    pentapeptide-
    transferase E
    39 100848 18959 19780 MCA100862 g1789144 2.00E−46 orf, hypothetical
    protein
    39 100848 19920 20072 MCA100863 g973208 4.00E−09 unknown
    39 100848 20368 21621 MCA100864 g3650360 3.00E−58 polynucleotide
    adenylyltransferase
    39 100848 22089 22535 MCA100865 g1573012 4.00E−30 2-amino-4-hydroxy-6-
    hydroxymethyldihydropteridine-
    pyroph
    39 100848 22769 23563 MCA100866 g3970812 2.00E−74 3-methyl-2-
    oxobutanoate
    hydroxymethyltransferase
    39 100848 23576 24412 MCA100867 g854607 2.00E−64 putative pantoate--
    beta-alanine ligase
    39 100848 24556 25401 MCA100868 g4138364 3.00E−59 ORF284
    39 100848 25460 26035 MCA100869 g4467403 2.00E−23 hsdS protein (AA 1-410)
    39 100848 26235 26776 MCA100870 g4155604 4.00E−16 putative
    39 100848 29173 29787 MCA100902 g606319 7.00E−20 27 kD protein in
    ECDAMOPRA
    39 100848 155 772 MCA100959
    39 100848 787 1221 MCA100960
    39 100848 2287 2865 MCA100962 g1789409 3.00E−18 orf, hypothetical
    protein
    39 100848 3088 4974 MCA100963 g4176381 0 topoisomerase IV
    subunit
    39 100848 5074 5685 MCA100964 g2622643 3.00E−33 imidazoleglycerol-
    phosphate synthase
    39 100848 5692 6273 MCA100965 g38667 3.00E−57 hisB
    39 100848 6509 7017 MCA100966 g41474 2.00E−43 fms
    39 100848 7147 8805 MCA100967 g1800021 2.00E−69 DNA repair protein
    RecN
    39 100848 8859 9404 MCA100968 g1789317 1.00E−30 orf, hypothetical
    protein
    39 100848 9428 9826 MCA100969 g1789318 1.00E−23 orf, hypothetical
    protein
    39 100848 9901 10368 MCA100970
    39 100848 10483 10698 MCA100971 g1789881 1.00E−15 orf, hypothetical
    protein
    39 100848 10775 11650 MCA100972 g2645800 3.00E−62 site-specific
    recombinase
    39 100848 17947 18870 MCA100983 g1781241 1.00E−99 cysK
    39 100848 27386 27973 MCA100985 g1814074 1.00E−34 DsbA
    39 100848 40307 41437 MCA101057 g1657573 3.00E−49 hypothetical protein
    39 100848 41491 41649 MCA101058
    39 100848 41663 42544 MCA101059 g1773136 2.00E−52 acyl-coA thioesterase
    II
    39 100848 42892 45303 MCA101060 g1573755 e−124 glycerol-3-phosphate
    acyltransferase (plsB)
    39 100848 45434 46276 MCA101061 g3372537 1.00E−61 UTP-glucose-1-
    phosphate
    uridylyltransferase
    39 100848 46369 47937 MCA101062 g927386 e−163 glucose-6-phosphate
    isomerase
    39 100848 48368 48901 MCA101063 g3559950 1.00E−20 UDP-glucose 6-
    dehydrogenase
    39 100848 49598 49843 MCA101064
    39 100848 50331 50846 MCA101065
    39 100848 64882 65763 MCA101402 g2661442 4.00E−80 YafJ
    39 100848 62805 63572 MCA101404 g38674 2.00E−91 cyclase
    39 100848 62144 62566 MCA101405 g1773099 2.00E−42 probable riboflavin
    synthase beta chain
    39 100848 61547 61969 MCA101406 g1574763 4.00E−17 N utilization
    substance protein B
    (nusB)
    39 100848 60480 61445 MCA101407 g2329840 1.00E−50 thiamine-monophosphate
    kinase
    39 100848 59736 60230 MCA101408 g1574765 4.00E−19 phosphatidylglycerophosphatase
    A (pgpA)
    39 100848 58735 59224 MCA101410 g2769574 4.00E−22 methylase
    39 100848 56628 57614 MCA101412 g580766 1.00E−54 BepI modification
    methylase (AA 1-403)
    39 100848 54681 55580 MCA101414 g1573822 8.00E−37 conserved hypothetical
    protein
    39 100848 52655 54490 MCA101415 g2654003 0 glucosamine synthase
    39 100848 51555 52574 MCA101416 g1429254 e−111 UDP-glucose 4-
    epimerase
    39 100848 11886 13143 MCA101479 g1787337 e−109 3-oxoacyl-[acyl-
    carrier-protein]
    synthase II
    39 100848 88447 88902 MCA101792 g940802 1.00E−15 outer membrane protein
    39 100848 93930 94229 MCA101810
    39 100848 50855 51313 MCA101869
    39 100848 56357 56563 MCA101870
    39 100848 63863 64879 MCA101871 g3089616 4.00E−13 homoserine kinase
    homolog
    39 100848 65763 66659 MCA101872
    39 100848 78259 78561 MCA102126
    4 2642 463 783 MCA100115 g290546 1.00E−07 f135
    4 2642 954 1610 MCA100117 g2960085 3.00E−15 hypothetical protein
    Rv3661
    4 2642 1764 2642 MCA101198 g154276 8.00E−96 peptide chain release
    factor 2
    40 119211 50160 50753 MCA100057 g4062767 2.00E−34 ZK688.3 protein
    40 119211 50865 51788 MCA100058 g1359474 1.00E−81 homology to hydrolases
    40 119211 51852 52013 MCA100059 g599606 5.00E−24 rubredoxin
    40 119211 8413 8958 MCA100065 g4337446 1.00E−58 ECORLD_ORF1; encoded
    by M30388 and Z29635
    40 119211 10888 11190 MCA100146 g1573418 2.00E−24 conserved hypothetical
    protein
    40 119211 10282 10866 MCA100147 g1573419 2.00E−46 recombination protein
    (recR)
    40 119211 9069 10181 MCA100148 g1788105 6.00E−35 RNase D, processes
    tRNA precursor
    40 119211 106 690 MCA100179 g3861026 1.00E−13 unknown
    40 119211 693 1781 MCA100180 g606171 6.00E−92 ORF_f375
    40 119211 1850 2371 MCA100181 g1742876 3.00E−28 ORF_ID: o329#2; similar
    to [A40360]
    40 119211 2693 3697 MCA100182 g2634701 1.00E−61 NAD(P)H-dependent
    glycerol-3-phosphate
    dehydrogenase
    40 119211 7778 8185 MCA100367 g145892 2.00E−18 biotin carboxyl
    carrier protein
    40 119211 6422 7750 MCA100368 g405541 e−152 biotin carboxylase
    40 119211 5139 6181 MCA100369 g1786881 2.00E−94 putative ATP-binding
    protein in pho regulon
    40 119211 4544 4891 MCA100370 g1786880 4.00E−13 orf, hypothetical
    protein
    40 119211 27651 28547 MCA100431 g151405 e−111 phaseolotoxin
    sensitive octase
    40 119211 26345 26839 MCA100433 g2632225 9.00E−15 YkuD protein
    40 119211 76550 76939 MCA100482 g304913 3.00E−26 urf2
    40 119211 114141 114743 MCA100510 g286176 7.00E−28 negative regulator of
    pyocin genes
    40 119211 115659 116633 MCA100512
    40 119211 116611 117456 MCA100513
    40 119211 117460 118032 MCA100514
    40 119211 22301 24235 MCA100948 g1574757 e−143 ABC transporter, ATP-
    binding protein
    40 119211 21230 22201 MCA100949 g1872207 2.00E−35 HtrB homolog
    40 119211 20793 21170 MCA100950 g2634659 4.00E−42 aspartate 1-
    decarboxylase
    40 119211 17870 18673 MCA100952 g1052830 6.00E−63 indoleglycerol
    phosphate synthetase
    40 119211 16782 17798 MCA100953 g143784 3.00E−42 tryptophanyl tRNA
    synthetase (EC
    6.1.1.2)
    40 119211 15955 16656 MCA100954 g410131 8.00E−22 ORFX7
    40 119211 15289 15762 MCA100955 g410132 3.00E−14 ORFX8
    40 119211 14182 15102 MCA100956 g1574128 5.00E−73 conserved hypothetical
    protein
    40 119211 77032 77787 MCA101016 g1573017 1.00E−50 tRNA delta(2)-
    isopentenylpyrophosphate
    transferase
    40 119211 78161 78421 MCA101017 g1065627 3.00E−30 yersinia multiple
    regulator
    40 119211 78982 79953 MCA101019 g1789588 4.00E−68 putative isomerase
    40 119211 80020 80511 MCA101020 g2367202 6.00E−33 orf, hypothetical
    protein
    40 119211 80545 81120 MCA101021
    40 119211 81173 81667 MCA101023 g606139 6.00E−15 ORF_o185
    40 119211 81698 82408 MCA101024 g2317737 3.00E−87 putative ABC
    transporter ATP-
    binding protein
    40 119211 82528 86061 MCA101025 g2766693 0 proline dehydrogenase
    40 119211 88029 89999 MCA101028 g1161059 3.00E−57 protease
    40 119211 90522 92645 MCA101031
    40 119211 60578 62242 MCA101150 g1574163 e−112 dihydrolipoamide
    acetyltransferase
    (aceF)
    40 119211 48773 50050 MCA101214 g154288 e−142 5-
    phosphoribosylglycinamide
    synthetase
    40 119211 47317 48624 MCA101215 g3087737 9.00E−44 ABC1 protein
    40 119211 44031 44555 MCA101218 g1573090 1.00E−48 DNA polymerase III,
    epsilon subunit (dnaQ)
    40 119211 43024 43593 MCA101220 g396335 3.00E−37 No definition line
    found
    40 119211 42522 42941 MCA101221 g1742695 3.00E−34 Ferredoxin II.
    40 119211 40605 40901 MCA101223 g1787504 7.00E−22 orf, hypothetical
    protein
    40 119211 38672 40519 MCA101224 g1799717 7.00E−74 similar to [SwissProt
    Accession Number
    P44246]
    40 119211 37107 37787 MCA101226 g3861231 6.00E−49 unknown
    40 119211 114989 115282 MCA101355
    40 119211 92788 93711 MCA101469 g1573776 e−104 cell division protein
    (ftsY)
    40 119211 93897 94241 MCA101470 g2313803 2.00E−27 methylated-DNA--
    protein-cysteine
    methyltransferase
    40 119211 94362 95357 MCA101471 g47870 2.00E−94 dihydroorotate oxidase
    40 119211 95392 95904 MCA101472
    40 119211 95970 97439 MCA101473 g1788651 e−171 amidophosphoribosyltransferase = PRPP
    amidotransferase
    40 119211 97996 98835 MCA101475 g1944158 5.00E−36 lytic transglycosylase
    40 119211 99306 101294 MCA101476 g1592818 0 uvrB
    40 119211 101328 101969 MCA101477
    40 119211 102078 105977 MCA101480 g1574781 2.00E−44 exodeoxyribonuclease
    V, beta chain (recB)
    40 119211 106602 108041 MCA101482 g3142727 3.00E−49 exodeoxyribonuclease V
    subunit
    40 119211 108251 109219 MCA101483 g3885440 1.00E−86 yhdG homolog
    40 119211 109659 110585 MCA101484 g148275 5.00E−16 Exonuclease VII large
    subunit
    40 119211 111005 111736 MCA101485 g2072699 4.00E−74 pvdS
    40 119211 118395 118646 MCA101541
    40 119211 118082 118393 MCA101543
    40 119211 52375 53448 MCA101589 g151446 e−112 P-protein
    40 119211 53505 54374 MCA101590 g410055 2.00E−43 cyclohexadienyl
    dehydrogenase
    40 119211 54495 55763 MCA101591 g2634678 e−101 5-
    enolpyruvoylshikimate-
    3-phosphate synthase
    40 119211 55862 56695 MCA101592 g1906367 4.00E−64 hypothetical protein
    40 119211 56723 57088 MCA101593 g1789438 1.00E−10 putative kinase
    40 119211 57079 57510 MCA101594
    40 119211 57818 60442 MCA101595 g2564217 0 pyruvate dehydrogenase
    (lipoamide)
    40 119211 62595 63365 MCA101597 g1789363 4.00E−78 orf, hypothetical
    protein
    40 119211 67710 68651 MCA101599 g1788765 7.00E−77 thiosulfate binding
    protein
    40 119211 69040 70197 MCA101600 g3978474 e−115 MetZ homolog
    40 119211 70448 71575 MCA101601 g1574510 e−157 ribonucleoside
    diphosphate reductase,
    beta chain (nrdB)
    40 119211 71681 71902 MCA101602 g1788568 2.00E−08 orf, hypothetical
    protein
    40 119211 73244 74389 MCA101604 g498170 3.00E−87 carboxynorspermidine
    decarboxylase
    40 119211 74602 75804 MCA101605 g1001125 3.00E−74 hypothetical protein
    40 119211 75957 76511 MCA101606 g4155434 7.00E−36 putative
    40 119211 112492 112878 MCA101770
    40 119211 112942 113109 MCA101771
    40 119211 118691 119050 MCA101772
    40 119211 119052 119211 MCA101774
    40 119211 18727 20568 MCA101814 g141801 1.00E−83 anthranilate
    phosphoribosyltransferase
    (EC 2.4.2.18)
    40 119211 11382 13633 MCA101815 g1799581 0 ribonucleoside-
    diphosphate reductase
    1 alpha (EC 1.17.4.1)
    40 119211 63531 66164 MCA101886 g1573962 2.00E−39 exodeoxyribonuclease
    V, gamma chain (recC)
    40 119211 44757 45182 MCA101959 g1552784 1.00E−34 ribonuclease H
    40 119211 45397 45936 MCA101960 g3861372 2.00E−09 possible
    protoporphyrinogen
    oxidase (hemK)
    40 119211 46032 47180 MCA101961 g2293312 3.00E−21 YtfP
    40 119211 24876 26252 MCA101962 g598251 0 outer membrane protein E
    40 119211 29114 29992 MCA101964 g2983572 5.00E−19 3-oxoacyl-[acyl-
    carrier-protein]
    synthase III
    40 119211 31377 32036 MCA101965 g580875 3.00E−59 ipa-57d
    40 119211 32139 32588 MCA101967 g1788911 3.00E−35 putative deaminase
    40 119211 32677 33342 MCA101968 g1574149 2.00E−50 cytidylate kinase 1
    (cmkA)
    40 119211 33597 35186 MCA101969 g1651439 0 30S ribosomal protein
    S1.
    40 119211 35506 35781 MCA101970 g399670 2.00E−16 integration host
    factor beta subunit
    40 119211 36355 37032 MCA101971 g805068 6.00E−56 OMP decarboxylase
    40 119211 37969 38598 MCA101972 g2635898 2.00E−17 similar to
    hypothetical proteins
    40 119211 86419 87177 MCA102059
    40 119211 3811 4308 MCA102109 g1001123 6.00E−08 hypothetical protein
    40 119211 24430 24660 MCA102111
    40 119211 35812 36213 MCA102116
    40 119211 30377 31330 MCA102117
    41 269223 188318 189049 MCA100014 g2181957 5.00E−43 hypothetical protein
    Rv3300c
    41 269223 77773 79113 MCA100035 g149757 0 outer membrane protein
    CD
    41 269223 255725 256996 MCA100036 g882710 e−118 N-acetylglutamate
    synthase
    41 269223 1764 2576 MCA100054 g1573276 2.00E−46 pyrroline-5-
    carboxylate reductase
    (proC)
    41 269223 195583 196011 MCA100074 g1001829 4.00E−15 hypothetical protein
    41 269223 82057 82719 MCA100076 g987642 5.00E−49 ribonuclease III
    41 269223 79399 80121 MCA100078 g1788917 1.00E−61 pyridoxine
    biosynthesis
    41 269223 127128 128444 MCA100098 g407186 3.00E−75 DnaA protein
    41 269223 192138 192839 MCA100103 g2108342 1.00E−89 OmpR protein
    41 269223 191142 192041 MCA100104 g1788499 6.00E−42 orf, hypothetical
    protein
    41 269223 126337 126468 MCA100112 g147682 7.00E−16 ribosomal protein L34
    41 269223 125896 126168 MCA100113 g581462 2.00E−13 homologous to E. coli
    rnpA
    41 269223 125582 125788 MCA100114 g2898108 2.00E−15 9-10 kDa protein-like
    41 269223 193168 195417 MCA100121 g1098475 e−171 region E; orf;
    homologous to E. coli
    o622, U18997
    41 269223 254370 255644 MCA100131 g1574371 e−100 glutamate permease
    (gltS)
    41 269223 4189 4955 MCA100190 g147322 2.00E−77 acetyl-CoA carboxylase
    41 269223 41968 43620 MCA100198 g2367384 0 putative ATP-binding
    component of a
    transport system
    41 269223 40805 41419 MCA100200 g2231726 2.00E−41 macrophage infectivity
    potentiator
    41 269223 189796 190944 MCA100247 g1789473 e−107 putative transport
    protein
    41 269223 185949 186641 MCA100307 g1574175 3.00E−48 16S pseudouridylate
    516 synthase (rsuA)
    41 269223 184967 185572 MCA100308 g3135321 5.00E−12 putative
    thiol:disulfide
    interchange protein
    precursor
    41 269223 183536 184672 MCA100309 g1389759 2.00E−94 DnaJ
    41 269223 37916 38281 MCA100355 g3323226 2.00E−21 T. pallidum predicted
    coding region TP0895
    41 269223 227863 230013 MCA100365 g391839 0 alpha-subunit of HDT
    41 269223 230052 231215 MCA100366 g391840 e−146 beta-subunit of HDT
    41 269223 36803 37561 MCA100439 g1468939 7.00E−60 meso-2,3-butanediol
    dehydrogenase (D-
    acetoin forming)
    41 269223 34942 36237 MCA100441 g1657503 e−106 similar to S. aureus
    mercury(II) reductase
    41 269223 33813 34805 MCA100442 g1001812 4.00E−72 hypothetical protein
    41 269223 32952 33533 MCA100443 g1789819 2.00E−49 orf, hypothetical
    protein
    41 269223 164675 165019 MCA100454 g2635307 3.00E−08 ysmA
    41 269223 94670 95482 MCA100483 g1573330 e−120 iron (chelated) ABC
    transporter,
    periplasmic-binding
    prot
    41 269223 95485 96356 MCA100484 g1573329 e−115 iron (chelated) ABC
    transporter, ATP-
    binding prot (yfeB)
    41 269223 96387 97214 MCA100485 g1573328 e−100 iron (chelated) ABC
    transporter, permease
    prot (yfeC)
    41 269223 97272 98081 MCA100486 g1245467 1.00E−87 YfeD
    41 269223 231781 232396 MCA100534 g2340007 1.00E−28 YlbK protein
    41 269223 233066 233581 MCA100536 g2342534 8.00E−45 PAPS reductase
    41 269223 233689 234591 MCA100537 g1322409 9.00E−89 cysD
    41 269223 234772 236025 MCA100538 g1322410 e−100 cysN
    41 269223 236187 238250 MCA100539 g2367254 0 DNA helicase
    41 269223 66114 68632 MCA100556 g1574437 e−153 cell division protein
    FtsK-related protein
    41 269223 69114 69851 MCA100558 g2668599 2.00E−78 ATPase
    41 269223 70011 70676 MCA100559 g1787088 8.00E−34 arginine 3rd transport
    system periplasmic
    binding prot
    41 269223 70868 71533 MCA100560 g769794 2.00E−40 artJ
    41 269223 75715 77502 MCA100597 g1790302 0 putative GTP-binding
    factor
    41 269223 74090 75439 MCA100598 g1573640 e−127 UDP-N-
    acetylglucosamine
    pyrophosphorylase
    (glmU)
    41 269223 73356 74006 MCA100599 g496542 1.00E−48 OccM
    41 269223 71723 73317 MCA100600 g1787085 1.00E−36 arginine 3rd transport
    system periplasmic
    binding prot
    41 269223 2850 4010 MCA100637 g971394 6.00E−27 similar to Acc. No.
    D26185
    41 269223 176444 178372 MCA100657 g606286 e−158 ORF_o637
    41 269223 179340 180227 MCA100659 g1789752 5.00E−45 orf, hypothetical
    protein
    41 269223 180371 181150 MCA100660 g1185002 2.00E−47 dihydrodipicolinate
    reductase
    41 269223 181240 182331 MCA100661 g304266 1.00E−45 cystathionine beta-
    lyase
    41 269223 182445 183365 MCA100662 g2634328 3.00E−89 similar to sodium-
    dependent transporter
    41 269223 178416 179237 MCA100692 g2293347 2.00E−12 DnaJ
    41 269223 39931 40560 MCA100773 g451652 1.00E−45 unknown
    41 269223 244876 245628 MCA101070 g4186118 2.00E−24 type 4 prepilin
    peptidase
    41 269223 303 1001 MCA101092 g4155349 1.00E−27 phosphomethylpyrimidine
    kinase
    41 269223 129669 130736 MCA101112 g150880 2.00E−37 putative
    41 269223 82887 83588 MCA101125 g1788921 8.00E−43 leader peptidase
    (signal peptidase I)
    41 269223 111855 112940 MCA101128 g150708 1.00E−99 [ribB] gene products
    41 269223 268513 268884 MCA101181 g1224005 7.00E−40 ORF2; sim. to N-
    terminal
    phosphoribosyl c-AMP
    hydrolase
    41 269223 268096 268443 MCA101182 g1224006 6.00E−28 ORF3; sim. to C-
    terminal
    phosphoribosyl c-AMP
    hydrolase
    41 269223 267596 268026 MCA101183 g1224007 2.00E−18 ORF4
    41 269223 266565 267230 MCA101184 g1224008 3.00E−59 ORF5; mutations in
    this gene affect the
    culture pH
    41 269223 264696 266135 MCA101185 g2577963 5.00E−86 YerD protein
    41 269223 263394 264128 MCA101187 g149205 6.00E−36 histidine utilization
    repressor C (hutC)
    41 269223 260788 261690 MCA101189 g1573236 8.00E−61 conserved hypothetical
    protein
    41 269223 259547 260607 MCA101190 g413953 1.00E−87 ipa-29d
    41 269223 258434 259207 MCA101191 g413952 4.00E−45 ipa-28d
    41 269223 44402 44662 MCA101279
    41 269223 45635 47095 MCA101281 g1498192 8.00E−54 putative
    41 269223 52663 52923 MCA101283 g1652924 3.00E−10 pterin-4a-
    carbinolamine
    dehydratase
    41 269223 53084 55264 MCA101284 g4176379 0 topoisomerase IV
    subunit
    41 269223 59095 59403 MCA101288
    41 269223 59601 62384 MCA101289 g1573871 0 DNA polymerase I
    (polA)
    41 269223 196489 197751 MCA101331 g141770 0 citrate synthase
    precursor
    41 269223 250144 254073 MCA101372 g1788909 0 phosphoribosylformyl-
    glycine amide
    synthetase
    41 269223 248757 249935 MCA101373 g2632881 1.00E−41 similar to
    bicyclomycin
    resistance protein
    41 269223 246950 248584 MCA101374 g3220230 e−135 type IV pilus assembly
    protein TapB
    41 269223 245649 246836 MCA101375 g3025702 1.00E−56 pilus assembly protein
    PilC
    41 269223 244092 244709 MCA101377 g1573909 1.00E−33 conserved hypothetical
    protein
    41 269223 240255 243272 MCA101379 g1736781 e−111 Acriflavin resistance
    protein D.
    41 269223 239100 239612 MCA101381 g550460 4.00E−18 membrane fusion
    protein
    41 269223 128505 129656 MCA101382 g45691 7.00E−61 dnaN protein (AA 1-367)
    41 269223 131062 133455 MCA101384 g41646 0 gyrase B (AA 1-804)
    41 269223 133644 135200 MCA101385 g1573186 0 GMP synthase (guaA)
    41 269223 136888 137169 MCA101388 g1001663 2.00E−16 rare lipoprotein A
    41 269223 137351 137692 MCA101389 g1652134 2.00E−23 FKBP-type peptidyl-
    prolyl cis-trans
    isomerase
    41 269223 137915 139009 MCA101390 g2983314 3.00E−63 ornithine
    decarboxylase
    41 269223 139063 140330 MCA101391 g1789996 4.00E−99 alanine-alpha-
    ketoisovalerate
    transaminase C
    41 269223 140389 140727 MCA101392 g2407234 8.00E−26 similar to H.
    influenzae U32836
    41 269223 140754 141998 MCA101393 g1787438 e−138 D-amino acid
    dehydrogenase subunit
    41 269223 142379 144201 MCA101394 g1790427 0 thiamin biosynthesis,
    pyrimidine moiety
    41 269223 144333 146159 MCA101395 g1574084 0 ABC transporter, ATP-
    binding protein
    41 269223 146383 147726 MCA101396 g2635428 e−130 argininosuccinate
    lyase
    41 269223 147971 148915 MCA101397 g41666 e−100 porphobilinogen
    deaminase (AA 1-313)
    41 269223 149877 150605 MCA101399 g1573875 4.00E−46 conserved hypothetical
    protein
    41 269223 38460 38705 MCA101530 g42543 1.00E−13 pspE protein
    41 269223 31815 32798 MCA101546 g1001340 4.00E−54 hypothetical protein
    41 269223 28035 30956 MCA101548 g4377308 e−118 Zinc Metalloprotease
    (insulinase family)
    41 269223 26681 27871 MCA101549 g2367234 e−107 orf, hypothetical
    protein
    41 269223 25873 26463 MCA101550 g1573078 1.00E−36 phosphatidylglycerophosphate
    synthase (pgsA)
    41 269223 23781 24791 MCA101552 g1657863 0 NAD repressor/NMN
    transporter NadRp
    41 269223 23259 23432 MCA101553 g2636024 5.00E−09 yvlC
    41 269223 19781 22992 MCA101554 g1657862 0 glycyl-tRNA synthetase
    alpha subunit
    41 269223 18833 19485 MCA101555 g1787111 1.00E−42 leucyl, phenylalanyl-
    tRNA-protein
    transferase
    41 269223 17415 18665 MCA101556 g3284000 0 serine
    hydroxymethyltransferase
    41 269223 16824 17255 MCA101557 g43231 1.00E−10 chorismate-pyruvate
    lyase
    41 269223 14797 16386 MCA101558 g2662054 e−171 isocitrate lyase
    41 269223 12474 14624 MCA101559 g1906369 0 hypothetical protein
    41 269223 8656 11007 MCA101561 g1651530 e−160 Ribonuclease E (EC
    3.1.4.-) (RNase E).
    41 269223 6766 7716 MCA101563 g1573385 5.00E−64 conserved hypothetical
    protein
    41 269223 5116 6546 MCA101564 g4200042 e−112 exopolyphosphatase
    41 269223 91641 91808 MCA101609 g208931 1.00E−16 ORF16-lacZ fusion
    protein
    41 269223 88129 88366 MCA101611 g1334480 4.00E−14 unique orf
    41 269223 86216 86662 MCA101614 g1573906 3.00E−65 H. influenzae
    predicted coding
    region HI0882
    41 269223 83997 85778 MCA101615 g1572960 0 GTP-binding membrane
    protein (lepA)
    41 269223 80995 81894 MCA101618 g1572957 1.00E−80 GTP-binding protein
    (era)
    41 269223 175707 176225 MCA101619 g560723 5.00E−22 Mip = 24 kDa macrophage
    infectivity
    potentiator protein
    41 269223 174030 174176 MCA101621 g1894774 5.00E−16 rubredoxin
    41 269223 172917 173972 MCA101622 g1789065 1.00E−42 putative
    oxidoreductase
    41 269223 171413 172576 MCA101623 g2150108 2.00E−85 periplasmic substrate
    binding protein
    41 269223 170503 171255 MCA101624 g2150109 5.00E−61 integral membrane
    protein
    41 269223 169728 170423 MCA101625 g48972 2.00E−64 nitrate transporter
    41 269223 169168 169497 MCA101626 g1574579 3.00E−30 conserved hypothetical
    protein
    41 269223 167480 168979 MCA101627 g3005690 7.00E−91 gamma-glutamylcysteine
    synthetase
    41 269223 165388 166755 MCA101629 g1573076 e−121 conserved hypothetical
    protein
    41 269223 164248 164496 MCA101631 g1573769 9.00E−08 conserved hypothetical
    protein
    41 269223 153230 153748 MCA101633 g1573022 8.00E−20 heat shock protein
    (grpE)
    41 269223 151115 153019 MCA101634 g2522264 0 DnaK
    41 269223 198632 198931 MCA101637 g2239247 1.00E−18 SdhC protein
    41 269223 198958 199290 MCA101638 g42924 5.00E−19 succinate
    dehydrogenase
    hydrophobic subunit
    41 269223 199379 201199 MCA101639 g3273345 0 fumarate reductase
    flavoprotein subunit
    41 269223 201300 201977 MCA101640 g2239250 1.00E−96 succinate
    dehydrogenase putative
    iron sulphur subunit
    41 269223 202407 205205 MCA101641 g39232 0 2-oxoglutarate
    dehydrogenase
    41 269223 205326 206555 MCA101642 g39283 e−131 succinyltransferase
    41 269223 206648 208090 MCA101643 g151345 e−155 dihydrolipoamide
    dehydrogenase
    41 269223 212826 214043 MCA101645
    41 269223 214142 215374 MCA101646
    41 269223 216050 218155 MCA101648 g148698 3.00E−92 prolyl endopeptidase
    41 269223 218735 220828 MCA101650 g1573174 e−147 oligopeptidase A
    (prlC)
    41 269223 221075 221800 MCA101651 g1787008 8.00E−40 orf, hypothetical
    protein
    41 269223 221952 222545 MCA101652 g882483 3.00E−50 ORF_o197
    41 269223 222757 224055 MCA101653 g1773120 e−105 trigger factor
    41 269223 224295 224885 MCA101654 g1773121 1.00E−84 ATP-dependent Clp
    proteinase
    41 269223 224934 226208 MCA101655 g1573717 e−149 ATP-dependent Clp
    protease, ATP-binding
    subunit
    41 269223 123662 125293 MCA101656 g45709 e−133 homologous to E. coli
    60 K
    41 269223 122095 123465 MCA101657 g45710 e−113 homologous to E. coli
    50 K
    41 269223 121548 121988 MCA101658 g42148 1.00E−46 orf1
    41 269223 120490 121497 MCA101659 g581147 4.00E−80 orf2, homologue to
    B. subtilis ribG
    41 269223 119545 120186 MCA101660 g150707 3.00E−49 riboflavin synthetase
    alpha subunit
    41 269223 118437 119363 MCA101661 g3328155 4.00E−69 methionyl-tRNA
    formyltransferase
    41 269223 117032 118369 MCA101662 g1573620 7.00E−65 sun protein (sun)
    41 269223 115305 116708 MCA101663 g2160269 e−153 threonine synthase
    41 269223 114048 115172 MCA101664 g1574014 2.00E−44 DNA processing chain A
    (dprA)
    41 269223 113447 114028 MCA101665 g2367210 1.00E−19 orf, hypothetical
    protein
    41 269223 110508 111677 MCA101668 g1460081 3.00E−85 hypothetical protein
    Rv2559c
    41 269223 109304 109822 MCA101670 g402362 3.00E−15 hypothetical protein
    41 269223 105340 106233 MCA101673 g1354827 3.00E−67 arginase
    41 269223 104054 105262 MCA101674 g790956 e−145 ornithine
    aminotransferase
    41 269223 103248 103808 MCA101675 g1628369 2.00E−10 gepB
    41 269223 101499 102242 MCA101677 g4154851 3.00E−72 putative
    41 269223 100074 101222 MCA101678 g1573761 2.00E−75 conserved hypothetical
    protein
    41 269223 98638 99816 MCA101679 g1574452 e−120 tyrosyl tRNA
    synthetase (tyrS)
    41 269223 44008 44328 MCA101794
    41 269223 257352 257930 MCA101931
    41 269223 238243 238896 MCA101934
    41 269223 239645 239932 MCA101937
    41 269223 243516 244079 MCA101943
    41 269223 44993 45466 MCA101954
    41 269223 186833 187384 MCA101958 g42358 5.00E−21 pepQ product, proline
    dipeptidase
    41 269223 187980 188180 MCA101973 g3322357 1.00E−08 dnaK suppressor,
    putative
    41 269223 211262 211762 MCA101976 g529727 7.00E−09 heme receptor
    41 269223 55427 56215 MCA101978 g1788125 8.00E−47 putative enzyme
    41 269223 56337 57158 MCA101979 g4155762 3.00E−16 putative
    41 269223 57227 58789 MCA101980 g1574592 0 peptide chain release
    factor 3 (prfC)
    41 269223 62725 65282 MCA101981 g1574197 0 DNA topoisomerase I
    (topA)
    41 269223 106832 107182 MCA102132
    41 269223 113110 113376 MCA102133 g1788096 5.00E−11 orf, hypothetical
    protein
    41 269223 24857 25618 MCA102137 g1651338 7.00E−08 PnuC protein
    41 269223 31241 31690 MCA102138
    41 269223 135356 136573 MCA102139
    41 269223 262656 262982 MCA102143
    41 269223 148933 149691 MCA102146 g496215 5.00E−12 uroporphyrinogen-III synthase
    41 269223 155575 156525 MCA102147
    41 269223 156368 159940 MCA102148
    41 269223 160109 161479 MCA102149
    41 269223 161476 162411 MCA102150
    41 269223 162428 163453 MCA102151
    41 269223 163450 164040 MCA102152
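The rows above can be read programmatically. The following is a minimal Python sketch, not part of the application. The column header falls outside this excerpt, so the field meanings used here (contig number, contig length, ORF start, ORF stop, locus ID, database hit identifier, e-value, description) are assumptions, as are the names `OrfRow`, `parse_e_value`, and `parse_row`. The sketch also assumes that any description wrapped onto a second line has been joined back onto its row.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrfRow:
    contig: int
    contig_length: int       # assumed meaning of the second column
    start: int
    stop: int
    locus_id: str
    hit_id: Optional[str]    # None when the row lists no database hit
    e_value: Optional[float]
    description: str

    @property
    def length(self) -> int:
        """ORF span in nucleotides, counting both end coordinates."""
        return abs(self.stop - self.start) + 1


def parse_e_value(text: str) -> float:
    """Parse table e-values such as '4.00E−69', 'e−153', or '0'."""
    text = text.replace("\u2212", "-")   # the table uses a Unicode minus sign
    if text.lower().startswith("e"):     # 'e-153' is shorthand for '1e-153'
        text = "1" + text
    return float(text)


def parse_row(line: str) -> OrfRow:
    """Split one whitespace-separated row into typed fields."""
    fields = line.split()
    contig, contig_length, start, stop = (int(f) for f in fields[:4])
    locus_id = fields[4]
    if len(fields) >= 7:
        hit_id = fields[5]
        e_value = parse_e_value(fields[6])
        description = " ".join(fields[7:])
    else:
        hit_id, e_value, description = None, None, ""
    return OrfRow(contig, contig_length, start, stop,
                  locus_id, hit_id, e_value, description)


row = parse_row("41 269223 115305 116708 MCA101663 g2160269 e\u2212153 threonine synthase")
print(row.locus_id, row.length, row.e_value)
```

Rows that carry only coordinates and a locus ID, such as MCA101794, parse with `hit_id` and `e_value` set to `None`.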
  • [0146]
    TABLE 2
    Locus ID End Locus ID End
    MCA1c0001 5′ MCA1c0005 5′
    MCA1c0001 3′ ND ND
    MCA1c0002 5′ ND ND
    MCA1c0002 3′ MCA1c0039 3′
    MCA1c0003 5′ ND ND
    MCA1c0003 3′ ND ND
    MCA1c0004 5′ ND ND
    MCA1c0004 3′ MCA1c0009 5′
    MCA1c0005 5′ MCA1c0001 5′
    MCA1c0005 3′ ND ND
    MCA1c0006 5′ ND ND
    MCA1c0006 3′ MCA1c0033 5′
    MCA1c0007 5′ ND ND
    MCA1c0007 3′ ND ND
    MCA1c0008 5′ ND ND
    MCA1c0008 3′ MCA1c0012 3′
    MCA1c0009 5′ MCA1c0004 3′
    MCA1c0009 3′ MCA1c0030 5′
    MCA1c0010 5′ ND ND
    MCA1c0010 3′ ND ND
    MCA1c0011 5′ ND ND
    MCA1c0011 3′ ND ND
    MCA1c0012 5′ ND ND
    MCA1c0012 3′ MCA1c0008 3′
    MCA1c0013 5′ ND ND
    MCA1c0013 3′ ND ND
    MCA1c0014 5′ ND ND
    MCA1c0014 3′ ND ND
    MCA1c0015 5′ ND ND
    MCA1c0015 3′ MCA1c0026 5′
    MCA1c0016 5′ MCA1c0019 3′
    MCA1c0016 3′ ND ND
    MCA1c0017 5′ ND ND
    MCA1c0017 3′ ND ND
    MCA1c0018 5′ MCA1c0038 3′
    MCA1c0018 3′ MCA1c0021 3′
    MCA1c0019 5′ ND ND
    MCA1c0019 3′ MCA1c0016 5′
    MCA1c0020 5′ ND ND
    MCA1c0020 3′ ND ND
    MCA1c0021 5′ ND ND
    MCA1c0021 3′ MCA1c0018 3′
    MCA1c0022 5′ ND ND
    MCA1c0022 3′ ND ND
    MCA1c0023 5′ ND ND
    MCA1c0023 3′ ND ND
    MCA1c0024 5′ ND ND
    MCA1c0024 3′ ND ND
    MCA1c0025 5′ ND ND
    MCA1c0025 3′ ND ND
    MCA1c0026 5′ MCA1c0015 3′
    MCA1c0026 3′ ND ND
    MCA1c0027 5′ ND ND
    MCA1c0027 3′ ND ND
    MCA1c0028 5′ MCA1c0029 3′
    MCA1c0028 3′ ND ND
    MCA1c0029 5′ ND ND
    MCA1c0029 3′ MCA1c0028 5′
    MCA1c0030 5′ MCA1c0009 3′
    MCA1c0030 3′ ND ND
    MCA1c0031 5′ ND ND
    MCA1c0031 3′ ND ND
    MCA1c0032 5′ ND ND
    MCA1c0032 3′ ND ND
    MCA1c0033 5′ MCA1c0006 3′
    MCA1c0033 3′ ND ND
    MCA1c0034 5′ MCA1c0036 3′
    MCA1c0034 3′ ND ND
    MCA1c0035 5′ ND ND
    MCA1c0035 3′ ND ND
    MCA1c0036 5′ ND ND
    MCA1c0036 3′ MCA1c0034 5′
    MCA1c0037 5′ ND ND
    MCA1c0037 3′ ND ND
    MCA1c0038 5′ ND ND
    MCA1c0038 3′ MCA1c0018 5′
    MCA1c0039 5′ ND ND
    MCA1c0039 3′ MCA1c0002 3′
    MCA1c0040 5′ ND ND
    MCA1c0040 3′ ND ND
    MCA1c0041 5′ ND ND
    MCA1c0041 3′ ND ND
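One way to sanity-check Table 2 is to confirm that every listed link is reciprocal. The following Python sketch is not part of the application. It assumes each row pairs one end (5′ or 3′) of a locus with the end of another locus it is joined to, and that ND marks an end with no determined partner; the text explaining Table 2 falls outside this excerpt. The names `parse_links` and `unreciprocated` are hypothetical.

```python
from typing import Dict, List, Tuple

End = Tuple[str, str]  # (locus ID, "5′" or "3′")


def parse_links(rows: List[str]) -> Dict[End, End]:
    """Map each locus end to its partner end, skipping ND rows."""
    links: Dict[End, End] = {}
    for line in rows:
        fields = line.split()
        # ND rows appear with either one or two ND entries; skip both forms.
        if len(fields) < 4 or fields[2] == "ND":
            continue
        links[(fields[0], fields[1])] = (fields[2], fields[3])
    return links


def unreciprocated(links: Dict[End, End]) -> List[Tuple[End, End]]:
    """Return links A -> B for which the table does not also list B -> A."""
    return [(src, dst) for src, dst in links.items() if links.get(dst) != src]


rows = [
    "MCA1c0001 5′ MCA1c0005 5′",
    "MCA1c0001 3′ ND ND",
    "MCA1c0005 5′ MCA1c0001 5′",
    "MCA1c0013 5′ ND",
]
links = parse_links(rows)
print(unreciprocated(links))
```

An empty list means every link in the sample has its mirror entry; any pair returned would point to a transcription error or a missing row.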
  • [0147]
    SEQUENCE LISTING
    The patent application contains a lengthy “Sequence Listing” section. A copy of the “Sequence Listing” is available in electronic form from the USPTO
    web site (http://seqdata.uspto.gov/sequence.html?DocID=20040067554). An electronic copy of the “Sequence Listing” will also be available from the
    USPTO upon request and payment of the fee set forth in 37 CFR 1.19(b)(3).

Claims (20)

What is claimed is:
1. A purified or isolated nucleic acid consisting essentially of a nucleotide sequence that encodes the same UDP-N-acetylmuramoylalanine-D-glutamate ligase encoded by nucleotides 11357 to 12736 of SEQ ID NO:35 or a nucleotide sequence fully complementary thereto.
2. The purified or isolated nucleic acid of claim 1, wherein said nucleic acid sequence consists essentially of nucleotides 11357 to 12736 of SEQ ID NO:35 or a nucleotide sequence fully complementary thereto.
3. A purified or isolated oligonucleotide consisting essentially of a fragment of a nucleic acid having the nucleotide sequence of nucleotides 11357 to 12736 of SEQ ID NO:35 or a sequence complementary thereto, wherein said oligonucleotide is at least 22 nucleotides in length.
4. A recombinant construct comprising a nucleotide sequence that encodes the same UDP-N-acetylmuramoylalanine-D-glutamate ligase encoded by nucleotides 11357 to 12736 of SEQ ID NO:35, or a nucleotide sequence fully complementary thereto, operably linked to a promoter.
5. A method of making UDP-N-acetylmuramoylalanine-D-glutamate ligase of Moraxella catarrhalis comprising:
obtaining a nucleic acid consisting essentially of a nucleotide sequence that encodes the same UDP-N-acetylmuramoylalanine-D-glutamate ligase encoded by nucleotides 11357 to 12736 of SEQ ID NO:35;
inserting said nucleic acid in an expression vector such that said nucleic acid is operably linked to a promoter; and
introducing said expression vector into a host cell whereby said host cell produces the protein encoded by said nucleic acid.
6. The method of claim 5, further comprising isolating the protein.
7. The method of claim 5, wherein said nucleic acid sequence consists essentially of nucleotides 11357 to 12736 of SEQ ID NO:35 or a nucleotide sequence fully complementary thereto.
8. A method for constructing a host cell that expresses UDP-N-acetylmuramoylalanine-D-glutamate ligase of Moraxella catarrhalis comprising introducing a recombinant construct comprising a promoter operably linked to a nucleic acid comprising a nucleotide sequence that encodes the same UDP-N-acetylmuramoylalanine-D-glutamate ligase encoded by nucleotides 11357 to 12736 of SEQ ID NO:35 into said cell.
9. The method of claim 8, wherein said nucleic acid sequence consists essentially of nucleotides 11357 to 12736 of SEQ ID NO:35 or a nucleotide sequence fully complementary thereto.
10. A vector comprising the purified or isolated nucleic acid of claim 1.
11. The vector of claim 10, wherein the isolated nucleic acid is operably linked to a promoter.
12. The vector of claim 11, wherein the vector is an expression vector.
13. A cultured cell line comprising the vector of claim 10.
14. A vector comprising the purified or isolated nucleic acid of claim 2.
15. The vector of claim 14, wherein the isolated nucleic acid is operably linked to a promoter.
16. The vector of claim 15, wherein the vector is an expression vector.
17. A cultured cell line comprising the vector of claim 14.
18. An isolated expression construct comprising nucleotides 11357 to 12736 of SEQ ID NO:35, which encodes UDP-N-acetylmuramoylalanine-D-glutamate ligase, or a nucleotide sequence fully complementary thereto, operably linked to a promoter.
19. A purified or isolated nucleic acid consisting essentially of a nucleic acid sequence which hybridizes under high stringency to nucleotides 11357 to 12736 of SEQ ID NO:35 and which encodes UDP-N-acetylmuramoylalanine-D-glutamate ligase.
20. A purified or isolated nucleic acid which hybridizes substantially over the entire length to nucleotides 11357 to 12736 of SEQ ID NO:35 or a sequence complementary thereto under the following conditions: 5×SSC with 1% SDS at 60° C.; and washing with 0.2×SSC with 0.1% SDS at either 45° C. or 68° C. or 0.5M sodium phosphate (pH 7.2), 7% SDS, and 1 mM EDTA at 65° C.; and washing with 40 mM sodium phosphate, 1% SDS, 1 mM EDTA at 65° C.
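
The coordinate range recited throughout the claims, nucleotides 11357 to 12736 of SEQ ID NO:35, and its fully complementary sequence can be illustrated with a short Python sketch. This is not part of the application. It assumes the coordinates are 1-based and inclusive, which is the usual convention for sequence listings but is not stated in this excerpt, and the function names are hypothetical. The example runs on a toy sequence because SEQ ID NO:35 itself is not reproduced here.

```python
# Complement table for unambiguous DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def region_length(start: int, end: int) -> int:
    """Number of nucleotides in a 1-based, inclusive coordinate range."""
    return end - start + 1


def subsequence(seq: str, start: int, end: int) -> str:
    """Extract a 1-based, inclusive range from a sequence string."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("range falls outside the sequence")
    return seq[start - 1:end]


def reverse_complement(seq: str) -> str:
    """Return the fully complementary strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Length of the claimed range under the 1-based, inclusive assumption.
claimed = region_length(11357, 12736)
print(claimed, claimed % 3)

# Toy sequence standing in for the real contig.
toy = "AACCGGTTAC"
fragment = subsequence(toy, 3, 6)
print(fragment, reverse_complement(fragment))
```

Under that assumption the claimed range spans 1380 nucleotides, a whole multiple of three, as expected for an open reading frame.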
US10/672,787 1999-06-18 2003-09-26 Nucleotide sequences of moraxella catarrhalis genome Abandoned US20040067554A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/672,787 US20040067554A1 (en) 1999-06-18 2003-09-26 Nucleotide sequences of moraxella catarrhalis genome

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US14012199P 1999-06-18 1999-06-18
US09/596,002 US6632636B1 (en) 1999-06-18 2000-06-16 Nucleic acids encoding 3-ketoacyl-ACP reductase from Moraxella catarrahalis
US10/672,787 US20040067554A1 (en) 1999-06-18 2003-09-26 Nucleotide sequences of moraxella catarrhalis genome

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US09/596,002 Continuation US6632636B1 (en) 1999-06-18 2000-06-16 Nucleic acids encoding 3-ketoacyl-ACP reductase from Moraxella catarrahalis

Publications (1)

Publication Number Publication Date
US20040067554A1 (en) 2004-04-08

Family

ID=22489850

Family Applications (2)

Application Number Title Priority Date Filing Date
US09/596,002 Expired - Fee Related US6632636B1 (en) 1999-06-18 2000-06-16 Nucleic acids encoding 3-ketoacyl-ACP reductase from Moraxella catarrahalis
US10/672,787 Abandoned US20040067554A1 (en) 1999-06-18 2003-09-26 Nucleotide sequences of moraxella catarrhalis genome

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US09/596,002 Expired - Fee Related US6632636B1 (en) 1999-06-18 2000-06-16 Nucleic acids encoding 3-ketoacyl-ACP reductase from Moraxella catarrahalis

Country Status (5)

Country Link
US (2) US6632636B1 (en)
EP (1) EP1218512A2 (en)
AU (1) AU1824101A (en)
CA (1) CA2378687A1 (en)
WO (1) WO2000078968A2 (en)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB9809683D0 (en) * 1998-05-06 1998-07-01 Smithkline Beecham Biolog Novel compounds
GB9810285D0 (en) 1998-05-13 1998-07-15 Smithkline Beecham Biolog Novel compounds
GB9820002D0 (en) 1998-09-14 1998-11-04 Smithkline Beecham Biolog Novel compounds
AU1824101A (en) * 1999-06-18 2001-01-09 Elitra Pharmaceuticals, Inc. Nucleotide sequences of moraxella catarrhalis genome
GB9918003D0 (en) 1999-07-30 1999-09-29 Smithkline Beecham Biolog Novel compounds
GB9918038D0 (en) * 1999-07-30 1999-09-29 Smithkline Beecham Biolog Novel compounds
GB9918041D0 (en) * 1999-07-30 1999-09-29 Smithkline Beecham Biolog Novel compounds
AU6568300A (en) * 1999-07-30 2001-02-19 Smithkline Beecham Biologicals (Sa) Novel compounds
GB9918279D0 (en) * 1999-08-03 1999-10-06 Smithkline Beecham Biolog Novel compounds
GB9918302D0 (en) 1999-08-03 1999-10-06 Smithkline Beecham Biolog Novel compounds
GB9921691D0 (en) * 1999-09-14 1999-11-17 Smithkline Beecham Sa Novel compounds
GB9921692D0 (en) * 1999-09-14 1999-11-17 Smithkline Beecham Sa Novel compounds
AU8743001A (en) * 2000-08-28 2002-03-13 Aventis Pasteur Moraxella polypeptides and corresponding dna fragments and uses thereof
AU2002302236B2 (en) * 2001-05-15 2008-09-25 Id Biomedical Corporation Moraxella(branhamella) catarrhalis antigens
CN1582336A (en) * 2001-06-18 2005-02-16 希雷生物化学有限公司 Moraxella (branhamella) catarrhalis antigens
CN1549727A (en) * 2001-08-27 2004-11-24 希雷生物化学有限公司 Moraxella (branhamella) catarrhalis polypeptides and corresponding DNA fragments
ES2316627T3 (en) * 2001-11-16 2009-04-16 Id Biomedical Corporation MORAXELLA POLIPEPTIDES (BRANHAMELLA) CATARRHALIS.
US6929935B2 (en) 2001-12-19 2005-08-16 Bristol-Myers Squibb Company Gluconobacter oxydans 2-ketoreductase enzyme and applications thereof
WO2004014419A1 (en) 2002-08-02 2004-02-19 Glaxosmithkline Biologicals S.A. Vaccine composition comprising transferrin binding protein and hsf from gram negative bacteria
WO2006101176A1 (en) * 2005-03-24 2006-09-28 Kaneka Corporation Microorganism capable of accumulating ultra high molecular weight polyester
DK2089516T3 (en) * 2006-11-29 2012-02-20 Novozymes Inc Methods for improving the insertion of DNA into bacterial cells
US20130004530A1 (en) 2010-03-10 2013-01-03 Jan Poolman Vaccine composition

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5474796A (en) * 1991-09-04 1995-12-12 Protogene Laboratories, Inc. Method and apparatus for conducting an array of chemical reactions on a support surface
US5605662A (en) * 1993-11-01 1997-02-25 Nanogen, Inc. Active programmable electronic devices for molecular biological analysis and diagnostics
US5830721A (en) * 1994-02-17 1998-11-03 Affymax Technologies N.V. DNA mutagenesis by random fragmentation and reassembly
US5876946A (en) * 1997-06-03 1999-03-02 Pharmacopeia, Inc. High-throughput assay
US6110704A (en) * 1999-01-28 2000-08-29 Smithkline Beecham Corporation 3-ketoacyl-ACP-reductase (FabG) of Staphylococcus aureus
US6632636B1 (en) * 1999-06-18 2003-10-14 Elitra Pharmaceuticals Inc. Nucleic acids encoding 3-ketoacyl-ACP reductase from Moraxella catarrahalis
US6673910B1 (en) * 1999-04-08 2004-01-06 Genome Therapeutics Corporation Nucleic acid and amino acid sequences relating to M. catarrhalis for diagnostics and therapeutics

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6348328B1 (en) * 1997-05-14 2002-02-19 Smithkline Beecham Corporation Compounds

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012112922A1 (en) * 2011-02-18 2012-08-23 Alexion Pharmaceuticals, Inc. Methods for synthesizing molybdopterin precursor z derivatives
CN102766165A (en) * 2011-02-18 2012-11-07 阿莱克申药物国际公司 Methods for synthesizing molybdopterin precursor z derivatives
US9260462B2 (en) 2011-02-18 2016-02-16 Alexion Pharmaceuticals, Inc. Methods for synthesizing molybdopterin precursor Z derivatives

Also Published As

Publication number Publication date
WO2000078968A2 (en) 2000-12-28
EP1218512A2 (en) 2002-07-03
CA2378687A1 (en) 2000-12-28
US6632636B1 (en) 2003-10-14
WO2000078968A3 (en) 2002-05-10
AU1824101A (en) 2001-01-09

Similar Documents

Publication Publication Date Title
US6632636B1 (en) Nucleic acids encoding 3-ketoacyl-ACP reductase from Moraxella catarrahalis
US8652773B2 (en) Genes of an otitis media isolate of nontypeable Haemophilus influenza
US6503729B1 (en) Selected polynucleotide and polypeptide sequences of the methanogenic archaeon, methanococcus jannashii
US6448043B1 (en) Enterococcus faecalis EF040 and uses therefor
US6506581B1 (en) Nucleotide sequence of the Haemophilus influenzae Rd genome, fragments thereof, and uses thereof
WO2005049642A2 (en) Genome of legionella pneumophila paris and lens strain-diagnostic and epidemiological applications
JPH11501520A (en) Nucleotide sequence of Haemophilus influenzae Rd genome, fragments thereof and uses thereof
Braibant et al. A Mycobacterium tuberculosis gene cluster encoding proteins of a phosphate transporter homologous to the Escherichia coli Pst system
WO2002094867A2 (en) Sequence of the photorhabdus luminescens strain tt01 genome and uses
US6720139B1 (en) Genes identified as required for proliferation in Escherichia coli
US20030049648A1 (en) 37 staphylococcus aureus genes and polypeptides
CA2395335A1 (en) Genes identified as required for proliferation of e. coli
US20060068386A1 (en) Complete genome and protein sequence of the hyperthermophile methanopyrus kandleri av19 and monophyly of archael methanogens and methods of use thereof
US20050131222A1 (en) Nucleotide sequence of the haemophilus influenzae Rd genome, fragments thereof, and uses thereof
US6528289B1 (en) Nucleotide sequence of the Haemophilus influenzae Rd genome, fragments thereof, and uses thereof
US20040219585A1 (en) Nontypeable haemophilus influenzae virulence factors
US6589738B1 (en) Genes essential for microbial proliferation and antisense thereto
US20050202424A1 (en) Regulators of biofilm formation and uses thereof
US6632935B2 (en) Genome DNA of bacterial symbiont of aphids
US6673538B1 (en) Methods and compositions for designing vaccines
AU710880B2 (en) Nucleic acid and amino acid sequences relating to helicobacter pylori for diagnostics and therapeutics
US20050272089A1 (en) Critical genes and polypeptides of haemophilus influenzae and methods of use
Average The genome sequence of the plant pathogen Xylella fastidiosa
AU5588801A (en) Enterococcus faecalis polynucleotides and polypeptides
AU3796099A (en) Assays using nucleic acid and amino acid sequences relating to helicobacter pylori

Legal Events

Date Code Title Description
AS Assignment

Owner name: MERCK & CO., INC., NEW JERSEY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ELITRA PHARMACEUTICALS, INC.;REEL/FRAME:015232/0170

Effective date: 20040812

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION